1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344
|
\htmlhr
\chapterAndLabel{Type inference}{type-inference}
There are two different tasks that are commonly called ``type inference'':
\begin{enumerate}
\item
Type inference during type-checking:
The type-checker fills in an appropriate type where the programmer didn't
write one, but does not change the source code.
See Section~\ref{type-inference-refinement}.
\item
Type inference to annotate a program:
As a separate step before type-checking, a type inference tool
inserts type qualifiers into the source code.
See Section~\ref{type-inference-to-annotate}.
\end{enumerate}
Each variety has its own advantages, discussed below.
Advantages of \emph{all} varieties of type inference include:
\begin{itemize}
\item
Less work for the programmer.
\item
The tool chooses the most general type, whereas a programmer might
accidentally write a more-specific, less-generally-useful annotation.
\end{itemize}
\sectionAndLabel{Local type inference during type-checking}{type-inference-refinement}
During type-checking, if certain variables have no programmer-written type qualifier, the
type-checker determines whether there is some type qualifier that would
permit the program to type-check. If so, the type-checker uses that type
qualifier, but does not change the source code. Each time the
type-checker runs, it re-infers the type qualifier for that variable. If
no type qualifier exists that permits the program to type-check, the
type-checker issues a warning.
Local type inference is built into the Checker Framework.
Every checker automatically uses it. As a result, a programmer typically
does not have to write any qualifiers inside the body of a method
(except occasionally on type arguments).
However, it primarily
works within a method, not across method boundaries.
The source code must already contain annotations for method
signatures (arguments and return values) and fields.
Advantages of this variety of type inference include:
\begin{itemize}
\item
If the type qualifier is obvious to the programmer, then omitting it
can reduce annotation clutter in the program.
\item
If the code changes, then there is no old annotation that
might need to be updated.
\item
Within-method type inference occurs automatically.
The programmer doesn't have to do anything to take advantage of it.
\end{itemize}
For more details about local type inference during type-checking, also
known as ``flow-sensitive type refinement'', see
Section~\ref{type-refinement}.
\sectionAndLabel{Type inference to annotate a program}{type-inference-to-annotate}
As a separate step before type-checking, a type inference tool takes the
program as input, and outputs a set of type qualifiers that would
make the program type-check. (If no such set exists, for example because
the program is not type-correct, then the inference tool does its best but
makes no guarantees.)
These qualifiers are inserted into the source code or the
class file. They can be viewed and adjusted by the programmer, and can
be used by tools such as the type-checker.
Advantages of this variety of type inference include:
\begin{itemize}
\item
The inference may be more precise by taking account of the entire program
rather than just reasoning one method at a time.
\item
The program source code contains documentation in the form of type
qualifiers, which can aid programmer understanding and may make
type-checking warnings more comprehensible.
\end{itemize}
\subsectionAndLabel{Type inference tools}{type-inference-tools}
This section lists tools that take a program and output a set of
annotations for it.
It first lists tools that work only for a single type system (but may do a
more accurate job for that type system)
then lists general tools that work for any type system.
\subsubsectionAndLabel{Type inference for specific type systems}{type-inference-tools-specialized}
Section~\ref{nullness-inference} lists several tools that infer
annotations for the Nullness Checker.
If you run the Checker Framework with the \<-AsuggestPureMethods>
command-line option, it will suggest methods that can be marked as
\<@SideEffectFree>, \<@Deterministic>, or \<@Pure>; see
Section~\ref{type-refinement-purity}.
\subsubsectionAndLabel{Type inference for any type system}{type-inference-tools-general}
By supplying the \<-Ainfer> command-line option,
any type-checker can infer annotations. See Section~\ref{whole-program-inference}.
\href{https://github.com/reprogrammer/cascade/}{Cascade}~\cite{VakilianPEJ2014}
is an Eclipse plugin that implements interactive type qualifier inference.
Cascade is interactive rather than fully-automated: it makes it easier for
a developer to insert annotations.
Cascade starts with an unannotated program and runs a type-checker. For each
warning it suggests multiple fixes, the developer chooses a fix, and
Cascade applies it. Cascade works with any checker built on the Checker
Framework.
You can find installation instructions and a video tutorial at \url{https://github.com/reprogrammer/cascade}.
\sectionAndLabel{Whole-program inference}{whole-program-inference}
Whole-program inference is an interprocedural inference that
infers types for fields, method parameters, and method return types that do not
have a user-written qualifier (for the given type system).
The inferred type qualifiers are inserted into
your program.
The inferred type is the most specific type that is compatible with all the
uses in the program. For example, the inferred type for a field is the
least upper bound of the types of all the expressions that are assigned
into the field.
\begin{sloppypar}
To use whole-program inference, make sure that
\<insert-annotations-to-source>, from the Annotation File Utilities project,
is on your path (for example, its directory is in the \<\$PATH> environment variable).
Then, run the script \<checker-framework/checker/bin/infer-and-annotate.sh>.
Its command-line arguments are:
\end{sloppypar}
\begin{enumerate}
\item Optional: Command-line arguments to
\href{https://checkerframework.org/annotation-file-utilities/#insert-annotations-to-source}{\<insert-annotations-to-source>}.
\item Processor's name.
\item Target program's classpath. This argument is required; pass "" if it
is empty.
\item Optional: Extra processor arguments which will be passed to the checker, if any.
You may supply any number of such arguments, or none. Each such argument
must start with a hyphen.
\item Optional: Paths to \<.jaif> files used as input in the inference
process.
\item Paths to \<.java> files in the program.
\end{enumerate}
% TODO: Change the example project that is being annotated, since plume-lib is now deprecated.
For example, to add annotations to the \<plume-lib> project:
\begin{Verbatim}
git clone https://github.com/mernst/plume-lib.git
cd plume-lib
make jar
$CHECKERFRAMEWORK/checker/bin/infer-and-annotate.sh \
"LockChecker,NullnessChecker" java/plume.jar:java/lib/junit-4.12.jar:$JAVA_HOME/lib/tools.jar \
`find java/src/plume/ -name "*.java"`
# View the results
git diff
\end{Verbatim}
You may need to wait a few minutes for the command to complete.
You can ignore warnings that the command outputs while it tries different
annotations in your code.
It is recommended that you run \<infer-and-annotate.sh> on a copy of your
code, so that you can see what changes it made and so that it does not
change your only copy. One way to do this is to work in a clone of your
repository that has no uncommitted changes.
Whole-program inference differs from type refinement (Section~\ref{type-refinement})
in three ways. First, type refinement only works within a method body.
Second, type refinement always
refines the current type, regardless of whether the value already has an
annotation in the source code.
Third, whole-program inference can infer a subtype
or a supertype of the default type, by contrast with type refinement which
always refines the current type to a subtype.
\subsectionAndLabel{Whole-program inference ignores some code}{whole-program-inference-ignores-some-code}
Whole-program inference ignores code within the scope of a
\<@SuppressWarnings> annotation with an appropriate key
(Section~\ref{suppresswarnings-annotation}). In particular, uses within
the scope do not contribute to the inferred type, and declarations within
the scope are not changed. You should remove \<@SuppressWarnings> annotations
from the class declaration of any class you wish to infer types for.
As noted below, whole-program inference requires invocations of your code, or
assignments to your methods, to generalize from. If a field is set via
reflection (such as via injection), then whole-program inference would produce
an inaccurate result. There are two ways to make whole-program inference
ignore such a field.
%
(1)
You probably have an annotation such as
\javaeejavadocanno{javax/inject/Inject.html}{Inject}
or
\href{https://types.cs.washington.edu/plume-lib/api/plume/Option.html}{\<@Option>}
that indicates such fields. Meta-annotate the declaration of the \<Inject>
or \<Option> annotation with
\refqualclass{framework/qual}{IgnoreInWholeProgramInference}.
%
(2)
Annotate the field to be ignored with
\refqualclass{framework/qual}{IgnoreInWholeProgramInference}.
Whole-program inference, for a type-checker other than the Nullness Checker,
ignores (pseudo-)assignments where the right-hand-side is the \<null> literal.
\subsectionAndLabel{Manually checking whole-program inference results}{whole-program-inference-manual-checking}
As with any type inference tool, it is a good idea to manually examine the
results.
\begin{itemize}
\item
Whole-program inference can produce undesired results when your code has
non-representative or erroneous calls to a particular method or assignments to a
particular field, as explained below.
This is especially noticeable when the arguments or assignments are literals.
\item
If an annotation is inferred for a \emph{use} of type variables;
it might be more appropriate for you to move those annotations
to the corresponding upper bounds of the type variable \emph{declaration}.
\end{itemize}
\subsubsectionAndLabel{Poor whole-program inference results due to non-representative uses}{whole-program-inference-non-representative-uses}
Whole-program inference determines a method parameter's type
annotation based on what arguments are passed to the method, but not on how the
parameter is used within the method body.
\begin{itemize}
\item
If the program contains erroneous calls, the
inferred annotations may reflect those errors.
Suppose you intend method \<m2> to be called with
non-null arguments, but your program contains an error and one of the calls
to \<m2> passes \<null> as the argument. Then the tool will infer that
\<m2>'s parameter has \<@Nullable> type.
You should correct the bug and re-run inference.
\item
If the program uses (say) a method in a limited way, then the inferred
annotations will be legal for the program as
currently written but may not be as general as possible and may not
accommodate future program changes.
Here are some examples:
\begin{itemize}
\item
Suppose that your program currently calls
method \<m1> only with non-null
arguments. The tool will infer that \<m1>'s parameter has
\<@NonNull> type. If you had intended the method to be able to
take \<null> as an argument and you later add such a call, the type-checker
will issue a warning because the automatically-inserted \<@NonNull>
annotation is inconsistent with the new call.
\item
If your program (or test suite) passes only \<null> as an argument, the
inferred type will be the bottom type, such as \<@GuardedByBottom>.
\item
It is common for whole-program inference to infer
\<@Interned> and \<@Regex> annotations on String variables for which the
analyzed code only uses a constant string.
\end{itemize}
In each case, you can correct the inferred results manually, or you can
add tests that pass additional values then re-run inference.
\end{itemize}
\subsectionAndLabel{How whole-program inference works}{how-whole-program-inference-works}
This section explains how the \<infer-and-annotate.sh> script works. If you
merely want to run the script and you are not encountering trouble, you can
skip this section.
The \<infer-and-annotate.sh> script repeatedly runs the following steps:
\begin{enumerate}
\item Run the checker with the \<-Ainfer> command-line option to infer
types for fields and method signatures. The output of this step
is a \<.jaif> file that records the inferred types.
\item Insert the inferred annotations in the program. This step uses the
Annotation File Utilities
(\myurl{https://checkerframework.org/annotation-file-utilities/}).
\end{enumerate}
The process halts when there are no more changes to the inference results,
that is, the \<.jaif> file is unchanged between two runs. On each
iteration through the process, there may be new annotations in the \<.jaif>
file, and some type-checking errors may be eliminated (though others might
be introduced).
When the type-checker is run on the program with the final annotations
inserted, there might still be errors. This may be because the tool did
not infer enough annotations, or because your program cannot typecheck
(either because contains a defect, or because it contains subtle code that
is beyond the capabilities of the type system).
However, each of the inferred annotations is sound, and this reduces your
manual effort in annotating the program.
The iterative process is required because type-checking is modular: it
processes each class and each method only once, independently. Modularity
enables you to run type-checking on only part of your program, and it makes
type-checking fast. However, it has some disadvantages:
\begin{itemize}
\item
The first run of the type-checker cannot take advantage of whole-program
inference results because whole-program inference is only complete at the
end of type-checking, and modular type-checking does not revisit any
already-processed classes.
\item
Revisiting an
already-processed class may result in a better estimate.
\end{itemize}
%% LocalWords: Ainfer java jaif plugin classpath m2 m1 multi javax
%% LocalWords: AsuggestPureMethods CHECKERFRAMEWORK GuardedByBottom
%% LocalWords: IgnoreInWholeProgramInference
|