File: index-checker.tex

package info (click to toggle)
checker-framework-java 3.2.0%2Bds-3
  • links: PTS, VCS
  • area: main
  • in suites: forky, sid, trixie
  • size: 23,104 kB
  • sloc: java: 145,916; xml: 839; sh: 518; makefile: 404; perl: 26
file content (792 lines) | stat: -rw-r--r-- 34,405 bytes parent folder | download | duplicates (3)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
716
717
718
719
720
721
722
723
724
725
726
727
728
729
730
731
732
733
734
735
736
737
738
739
740
741
742
743
744
745
746
747
748
749
750
751
752
753
754
755
756
757
758
759
760
761
762
763
764
765
766
767
768
769
770
771
772
773
774
775
776
777
778
779
780
781
782
783
784
785
786
787
788
789
790
791
792
\htmlhr
% Reinstate when lists are supported:
% \chapterAndLabel{Index Checker for sequence bounds (arrays, strings, lists)}{index-checker}
\chapterAndLabel{Index Checker for sequence bounds (arrays and strings)}{index-checker}

The Index Checker warns about potentially out-of-bounds accesses to sequence
data structures, such as arrays
% , lists,
and strings.

The Index Checker prevents \<IndexOutOfBoundsException>s that result from
an index expression that might be negative or might be equal to or larger
than the sequence's length.
It also prevents \<NegativeArraySizeException>s that result from a negative
array dimension in an array creation expression.
(A caveat: the Index Checker does not check for arithmetic overflow. If
an expression overflows, the Index Checker might fail to warn about a
possible exception.  This is unlikely to be a problem in practice unless
you have an array whose length is \<Integer.MAX\_VALUE>.)

% Here's a pathological example of overflow leading to unsoundness:
%
% public class IndexOverflow {
%     public static void main(String[] args) {
%         @Positive int x = 1073741825; // 2 ^ 30 + 1
%         @Positive int x2 = x + x; // 2 ^ 31 + 2 == - 2 ^ 31 + 2
%         int[] a = new int[0];
%         if (x2 < a.length) {
%             a[x2] = 42;
%         }
%     }
% }

The programmer can write annotations that indicate which expressions are
indices for which sequences.  The Index Checker prohibits any operation that
may violate these properties, and the Index Checker takes advantage of
these properties when verifying indexing operations.
%
Typically, a programmer writes few annotations, because the Index Checker
infers properties of indexes from
the code around them. For example, it will infer that \<x> is positive
within the \<then> block of an \code{if (x > 0)} statement.
The programmer does need to write field types and method pre-conditions or post-conditions. For instance,
if a method's formal parameter is used as an index for
\<myArray>, the programmer might need to
write an \refqualclasswithparams{checker/index/qual}{IndexFor}{"myArray"}
annotation on the formal parameter's types.

The Index Checker checks fixed-size data structures, whose size is never
changed after creation.  A fixed-size data structure has no \<add> or
\<remove> operation.  Examples are strings and arrays, and you can add
support for other fixed-size data structures (see
Section~\ref{index-annotating-fixed-size}).

To run the Index Checker, run the command

\begin{alltt}
  javac -processor index \emph{MyJavaFile}.java
\end{alltt}

Recall that in Java, type annotations are written before the type;
in particular,
array annotations appear immediately before ``\<[]>''.
Here is how to declare a length-9 array of positive integers:

\begin{Verbatim}
  @Positive int @ArrayLen(9) []
\end{Verbatim}

Multi-dimensional arrays are similar.
Here is how to declare a length-2 array of length-4 arrays:

\begin{Verbatim}
  String @ArrayLen(2) [] @ArrayLen(4) []
\end{Verbatim}


\sectionAndLabel{Index Checker structure and annotations}{index-annotations}

Internally, the Index Checker computes information about integers that
might be indices:
\begin{itemize}
\item
  the lower bound on an integer, such as whether it is known to be positive
  (Section~\ref{index-lowerbound})
\item
  the upper bound on an integer, such as whether it is less than the length
  of a given sequence (Section~\ref{index-upperbound})
\item
  whether an integer came from calling the JDK's binary search routine on
  an array (Section~\ref{index-searchindex})
\item
  whether an integer came from calling a string search routine
  (Section~\ref{index-substringindex})
\end{itemize}

\noindent
and about sequence lengths:
\begin{itemize}
\item
  the minimum length of a sequence, such ``\<myArray> contains at least 3
  elements'' (Section~\ref{index-minlen})
\item
  whether two sequences have the same length (Section~\ref{index-samelen})
\end{itemize}

The Index Checker checks of all these properties at once, but
this manual discusses each type system in a different section.
There are some annotations that are shorthand for writing multiple
annotations, each from a different type system:

\begin{description}
\item[\refqualclasswithparams{checker/index/qual}{IndexFor}{String[] names}]
  The value is a valid index for the named sequences.  For example, the
  \sunjavadoc{java.base/java/lang/String.html\#charAt(int)}{String.charAt(int)}
  method is declared as

  \begin{Verbatim}
  class String {
    char charAt(@IndexFor("this") index) { ... }
  }
  \end{Verbatim}

  More generally, a variable
  declared as \<@IndexFor("someArray") int i> has type
  \<@IndexFor("someArray") int> and its run-time value is guaranteed to be
  non-negative and less than the length of \<someArray>.  You could also
  express this as
  \<\refqualclass{checker/index/qual}{NonNegative}
  \refqualclasswithparams{checker/index/qual}{LTLengthOf}{"someArray"}
  int i>,
  but \<@IndexFor("someArray") int i> is more concise.

 \item[\refqualclasswithparams{checker/index/qual}{IndexOrHigh}{String[] names}]
   The value is non-negative and is less than or equal to the length of
   each named sequence.  This type combines
  \refqualclass{checker/index/qual}{NonNegative} and
  \refqualclass{checker/index/qual}{LTEqLengthOf}.

  For example, the
  \sunjavadoc{java.base/java/util/Arrays.html\#fill(java.lang.Object\%5B\%5D,int,int,java.lang.Object)}{Arrays.fill}
   method is declared as

  \begin{mysmall}
  \begin{Verbatim}
  class Arrays {
    void fill(Object[] a, @IndexFor("#1") int fromIndex, @IndexOrHigh("#1") int toIndex, Object val)
  }
  \end{Verbatim}
  \end{mysmall}

 \item[\refqualclasswithparams{checker/index/qual}{LengthOf}{String[] names}]
   The value is exactly equal to the length of the named
   sequences. In the implementation, this type aliases
   \refqualclass{checker/index/qual}{IndexOrHigh}, so writing it
   only adds documentation (although future versions of the Index Checker
   may use it to improve precision).

 \item[\refqualclasswithparams{checker/index/qual}{IndexOrLow}{String[] names}]
   The value is -1 or is a valid index for
   each named sequence.  This type combines
  \refqualclass{checker/index/qual}{GTENegativeOne} and
  \refqualclass{checker/index/qual}{LTLengthOf}.

%  Example commented out; IndexOrLow is not sound for indexOf, because "".indexOf("") returns 0
%
%  For example, the
%  \sunjavadoc{java.base/java/lang/String.html\#indexOf(java.lang.String)}{String.indexOf(String)}
%  method is declared as
%
%  \begin{Verbatim}
%  class String {
%    @IndexOrLow("this") int indexOf(String str) { ... }
%  }
%  \end{Verbatim}

 \item[\refqualclass{checker/index/qual}{PolyIndex}]
   indicates qualifier polymorphism.  This type combines
   \refqualclass{checker/index/qual}{PolyLowerBound} and
   \refqualclass{checker/index/qual}{PolyUpperBound}.
   For a description of qualifier polymorphism, see
   Section~\ref{qualifier-polymorphism}.

 \item[\refqualclass{checker/index/qual}{PolyLength}]
   is a special polymorphic qualifier that combines
   \refqualclass{checker/index/qual}{PolySameLen} and
   \refqualclass{common/value/qual}{PolyValue} from the
   Constant Value Checker (see \chapterpageref{constant-value-checker}).
   \refqualclass{checker/index/qual}{PolyLength} exists
   as a shorthand for these two annotations, since
   they often appear together.

\end{description}

\sectionAndLabel{Lower bounds}{index-lowerbound}

The Index Checker issues an error when
a sequence is indexed by an integer that might be negative.
The Lower Bound Checker uses a type system (Figure~\ref{fig-index-int-types}) with the following
qualifiers:

\begin{description}
\item[\refqualclass{checker/index/qual}{Positive}]
  The value is 1 or greater, so it is not too low to be used as an index.
  Note that this annotation is trusted by the Constant Value Checker,
  so if the Constant Value Checker is run on code containing this annotation,
  the Lower Bound Checker must be run on the same code in order to
  guarantee soundness.
\item[\refqualclass{checker/index/qual}{NonNegative}]
  The value is 0 or greater, so it is not too low to be used as an index.
\item[\refqualclass{checker/index/qual}{GTENegativeOne}]
  The value is -1 or greater.
  It may not be used as an index for a sequence, because it might be too low.
  (``\<GTE>'' stands for ``Greater Than or Equal to''.)
\item[\refqualclass{checker/index/qual}{PolyLowerBound}]
  indicates qualifier polymorphism.
  For a description of qualifier polymorphism, see
  Section~\ref{qualifier-polymorphism}.
\item[\refqualclass{checker/index/qual}{LowerBoundUnknown}]
  There is no information about the value.
  It may not be used as an index for a sequence, because it might be too low.
\item[\refqualclass{checker/index/qual}{LowerBoundBottom}]
    The value cannot take on any integral types. The bottom type, which
    should not need to be written by the programmer.
\end{description}

\begin{figure}
\begin{center}
  \hfill
  \includeimagenocentering{lowerbound}{5cm}
  ~~~~\hfill~~~~
  \includeimagenocentering{upperbound}{7cm}
  \hfill
\end{center}
  \caption{The two type hierarchies for integer types used by the Index
    Checker.  On the left is a type system for lower bounds.  On the right
    is a type system for upper bounds.  Qualifiers written in gray should
    never be written in source code; they are used internally by the type
    system. \<"myArray"> in the Upper Bound qualifiers is an example of
    an array's name.}
  \label{fig-index-int-types}
\end{figure}


\sectionAndLabel{Upper bounds}{index-upperbound}

The Index Checker issues an error when a sequence index might be
too high. To do this, it maintains information about which expressions are
safe indices for which sequences.
The length of a sequence is \code{arr.length} for arrays and
\code{str.length()} for strings.
It uses a type system (Figure~\ref{fig-index-int-types}) with the following
qualifiers:

It issues an error when a sequence \code{arr}
is indexed by an integer that is not of type \code{@LTLengthOf("arr")}
or \code{@LTOMLengthOf("arr")}.

\begin{description}

\item[\refqualclasswithparams{checker/index/qual}{LTLengthOf}{String[] names, String[] offset}]
  An expression with this type
  has value less than the length of each sequence listed in \<names>.
  The expression may be used as an index into any of those sequences,
  if it is non-negative.
  For example, an expression of type \code{@LTLengthOf("a") int} might be
  used as an index to \<a>.
  The type \code{@LTLengthOf(\{"a", "b"\})} is a subtype of both
  \code{@LTLengthOf("a")} and \code{@LTLengthOf("b")}.
  (``\<LT>'' stands for ``Less Than''.)

  \<@LTLengthOf> takes an optional \<offset> element, meaning that the
  annotated expression plus the offset is less than the length of the given
  sequence.  For example, suppose expression \<e> has type \<@LTLengthOf(value
  = \{"a", "b"\}, offset = \{"-1", "x"\})>. Then \<e - 1> is less than
  \<a.length>, and \<e + x> is less than \<b.length>.  This helps to make
  the checker more precise.  Programmers rarely need to write the \<offset>
  element.

\item[\refqualclasswithparams{checker/index/qual}{LTEqLengthOf}{String[] names}]
  An expression with this type
  has value less than or equal to the length of each sequence listed in \<names>.
  It may not be used as an index for these sequences, because it might be too high.
  \code{@LTEqLengthOf(\{"a", "b"\})} is a subtype of both
  \code{@LTEqLengthOf("a")} and \code{@LTEqLengthOf("b")}.
  (``\<LTEq>'' stands for ``Less Than or Equal to''.)

 \item[\refqualclass{checker/index/qual}{PolyUpperBound}]
   indicates qualifier polymorphism.
   For a description of qualifier polymorphism, see
   Section~\ref{qualifier-polymorphism}.

\item[\refqualclasswithparams{checker/index/qual}{LTOMLengthOf}{String[] names}]
  An expression with this type
  has value at least 2 less than the length of each sequence listed in \<names>.
  It may always used as an index for a sequence listed in \<names>, if it is
  non-negative.

  This type exists to allow the checker to infer the safety of loops of
  the form:
\begin{Verbatim}
  for (int i = 0; i < array.length - 1; ++i) {
    arr[i] = arr[i+1];
  }
\end{Verbatim}
  This annotation should rarely (if ever) be written by the programmer; usually
  \refqualclasswithparams{checker/index/qual}{LTLengthOf}{String[] names}
  should be written instead.
  \code{@LTOMLengthOf(\{"a", "b"\})} is a subtype of both
  \code{@LTOMLengthOf("a")} and \code{@LTOMLengthOf("b")}.
  (``\<LTOM>'' stands for ``Less Than One Minus'', because another way of
  saying ``at least 2 less than \<a.length>'' is ``less than \<a.length-1>''.)

\item[\refqualclass{checker/index/qual}{UpperBoundUnknown}]
  There is no information about the upper bound on the value of an expression with this type.
  It may not be used as an index for a sequence, because it might be too high.
  This type is the top type, and should never need to be written by the
  programmer.

\item[\refqualclass{checker/index/qual}{UpperBoundBottom}]
  This is the bottom type for the upper bound type system. It should
  never need to be written by the programmer.

\end{description}

The following method annotations can be used to establish a method postcondition
that ensures that a certain expression is a valid index for a sequence:

\begin{description}
\item[\refqualclasswithparams{checker/index/qual}{EnsuresLTLengthOf}{String[] value, String[] targetValue, String[] offset}]
  When the method with this annotation returns, the expression (or all the expressions) given in the \code{value} element
  is less than the length of the given sequences with the given offsets. More precisely, the expression
  has the \code{@LTLengthOf} qualifier with the \code{value} and \code{offset} arguments
  taken from the \code{targetValue} and \code{offset} elements of this annotation.
\item[\refqualclasswithparams{checker/index/qual}{EnsuresLTLengthOfIf}{String[] expression, boolean result, String[] targetValue, String[] offset}]
  If the method with this annotation returns the given boolean value,
  then the given expression (or all the given expressions)
  is less than the length of the given sequences with the given offsets.
\end{description}

There is one declaration annotation that indicates the relationship between
two sequences:

\begin{description}
\item[\refqualclasswithparams{checker/index/qual}{HasSubsequence}{String[]
    value, String[] from, String[] to}]
  indicates that a subsequence (from \code{from} to \code{to}) of the
  annotated sequence is equal to some other sequence, named by
  \code{value}).

For example, to indicate that \<shorter> is a subsequence of \<longer>:

\begin{Verbatim}
  int startIndex;
  int endIndex;
  int[] shorter;
  @HasSubsequence(value="shorter", from="this.start", to="this.end")
  int[] longer;
\end{Verbatim}

Thus, a valid index into \<shorter> is also a valid index (between
\code{start} and \code{end-1} inclusive) into \<longer>.  More generally,
if \code{x} is \code{@IndexFor("shorter")} in the example above, then
\code{start + x} is \code{@IndexFor("longer")}. If \code{y} is
\code{@IndexFor("longer")} and \code{@LessThan("end")}, then \code{y -
  start} is \code{@IndexFor("shorter")}. Finally, \code{end - start} is
\code{@IndexOrHigh("shorter")}.

This annotation is in part checked and in part trusted.  When an array is
assigned to \code{longer}, three facts are checked: that \code{start} is
non-negative, that \code{start} is less than or equal to \code{end}, and
that \code{end} is less than or equal to the length of \code{longer}.  This
ensures that the indices are valid. The programmer must manually verify
that the value of \code{shorter} equals the subsequence that they describe.
\end{description}


\sectionAndLabel{Sequence minimum lengths}{index-minlen}

The Index Checker estimates, for each sequence expression, how long its value
might be at run time by computing a minimum length that
the sequence is guaranteed to have.  This enables the Index Checker to
verify indices that are compile-time constants.  For example, this code:

\begin{Verbatim}
  String getThirdElement(String[] arr) {
    return arr[2];
  }
\end{Verbatim}

\noindent
is legal if \<arr> has at least three elements, which can be indicated
in this way:

\begin{Verbatim}
  String getThirdElement(String @MinLen(3) [] arr) {
    return arr[2];
  }
\end{Verbatim}

When the index is not a compile-time constant, as in \<arr[i]>, then the
Index Checker depends not on a \<@MinLen> annotation but on \<i> being
annotated as
\refqualclasswithparams{checker/index/qual}{LTLengthOf}{"arr"}.

The MinLen type qualifier is implemented in practice by the Constant Value Checker,
using \<@ArrayLenRange> annotations (see \chapterpageref{constant-value-checker}).
This means that errors related to the minimum lengths of arrays must be suppressed using
the "value" argument to \<@SuppressWarnings>.
\refqualclass{common/value/qual}{ArrayLenRange} and \refqualclass{common/value/qual}{ArrayLen}
annotations can also be used to establish the minimum length of a sequence, if a
more precise estimate of length is known. For example,
if \<arr> is known to have exactly three elements:

\begin{Verbatim}
  String getThirdElement(String @ArrayLen(3) [] arr) {
    return arr[2];
  }
\end{Verbatim}

The following type qualifiers (from the \chapterpageref{constant-value-checker})
can establish the minimum length of a sequence:

\begin{description}
\item[\refqualclasswithparams{common/value/qual}{MinLen}{int value}]
  The value of an expression of this type is a sequence with at least
  \code{value} elements.  The default annotation is
  \code{@MinLen(0)}, and it may be applied to non-sequences.
  \code{@MinLen($x$)} is a subtype of \code{@MinLen($x-1$)}.
  An \code{@MinLen} annotation is treated internally as an
  \refqualclass{common/value/qual}{ArrayLenRange} with only its
  \code{from} field filled.
\item[\refqualclasswithparams{common/value/qual}{ArrayLen}{int[] value}]
  The value of an expression of this type is a sequence whose
  length is exactly one of the integers listed in its argument.
  The argument can contain at most ten integers; larger collections of
  integers are converted to \refqualclass{common/value/qual}{ArrayLenRange}
  annotations. The minimum length of a sequence with this annotation
  is the smallest element of the argument.
\item[\refqualclasswithparams{common/value/qual}{ArrayLenRange}{int from, int to}]
  The value of an expression of this type is a sequence whose
  length is bounded by its arguments, inclusive.
  The minimum length of a sequence with this annotation is its \<from> argument.
\end{description}

\begin{figure}
\begin{center}
  \hfill
  \includeimage{samelen}{5cm}
  \hfill
\end{center}
  \caption{The type hierarchy for arrays of equal length ("a" and "b" are
    assumed to be in-scope sequences).  Qualifiers
    written in gray should never be written in source code; they are used
    internally by the type system.}
  \label{fig-index-array-types}
\end{figure}

The following method annotation can be used to establish a method postcondition
that ensures that a certain sequence has a minimum length:

\begin{description}
\item[\refqualclasswithparams{common/value/qual}{EnsuresMinLenIf}{String[] expression, boolean result, int targetValue}]
  If the method with this annotation returns the given boolean value,
  then the given expression (or all the given expressions) is a sequence
  with at least \code{targetValue} elements.
\end{description}

\sectionAndLabel{Sequences of the same length}{index-samelen}

The Index Checker determines whether two or more sequences have the same length.
This enables it to verify that all the indexing operations are safe in code
like the following:

\begin{Verbatim}
  boolean lessThan(double[] arr1, double @SameLen("#1") [] arr2) {
    for (int i = 0; i < arr1.length; i++) {
      if (arr1[i] < arr2[i]) {
        return true;
      } else if (arr1[i] > arr2[i]) {
        return false;
      }
    }
    return false;
  }
\end{Verbatim}

When needed, you can specify which sequences have the same length using the following type qualifiers (Figure~\ref{fig-index-array-types}):

\begin{description}
\item[\refqualclasswithparams{checker/index/qual}{SameLen}{String[] names}]
  An expression with this type represents a sequence that has the
  same length as the other sequences named in \<names>. In general,
  \code{@SameLen} types that have non-intersecting sets of names
  are \textit{not} subtypes of each other. However, if at least one
  sequence is named by both types, the types are actually the same,
  because all the named sequences must have the same length.
\item[\refqualclass{checker/index/qual}{PolySameLen}]
  indicates qualifier polymorphism.
  For a description of qualifier polymorphism, see
  Section~\ref{qualifier-polymorphism}.
\item[\refqualclass{checker/index/qual}{SameLenUnknown}]
  No information is known about which other sequences have the same length
  as this one.
  This is the top type, and programmers should never need to write it.
\item[\refqualclass{checker/index/qual}{SameLenBottom}]
  This is the bottom type, and programmers should rarely need to write it.
  \code{null} has this type.
\end{description}


\sectionAndLabel{Binary search indices}{index-searchindex}

The JDK's
\sunjavadoc{java.base/java/util/Arrays.html\#binarySearch(java.lang.Object\%5B\%5D,java.lang.Object)}{Arrays.binarySearch}
method returns either where the value was found, or a negative value
indicating where the value could be inserted.  The Search Index Checker
represents this concept.

\begin{figure}
\begin{center}
  \hfill
  \includeimage{searchindex}{7cm}
  \hfill
\end{center}
  \caption{The type hierarchy for the Index Checker's internal type system
  that captures information about the results of calls to
  \sunjavadoc{java.base/java/util/Arrays.html\#binarySearch(java.lang.Object\%5B\%5D,java.lang.Object)}{Arrays.binarySearch}.}
  \label{fig-index-searchindex}
\end{figure}

The Search Index Checker's type hierarchy (Figure~\ref{fig-index-searchindex}) has four type qualifiers:
\begin{description}
\item[\refqualclasswithparams{checker/index/qual}{SearchIndexFor}{String[] names}]
  An expression with this type represents an integer that could have been
  produced by calling
  \sunjavadoc{java.base/java/util/Arrays.html\#binarySearch(java.lang.Object\%5B\%5D,java.lang.Object)}{Arrays.binarySearch}:
  for each array \<a> specified in the annotation, the annotated integer is
  between \<-a.length-1> and \<a.length-1>, inclusive
\item[\refqualclasswithparams{checker/index/qual}{NegativeIndexFor}{String[] names}]
  An expression with this type represents a ``negative index'' that is
  between \<a.length-1> and \<-1>, inclusive; that is, a value that is both
  a \<@SearchIndex> and is negative.  Applying the bitwise complement
  operator (\verb|~|) to an expression of this type produces an expression
  of type \refqualclass{checker/index/qual}{IndexOrHigh}.
\item[\refqualclass{checker/index/qual}{SearchIndexBottom}]
  This is the bottom type, and programmers should rarely need to write it.
\item[\refqualclass{checker/index/qual}{SearchIndexUnknown}]
  No information is known about whether this integer is a search index.
  This is the top type, and programmers should rarely need to write it.
\end{description}


\sectionAndLabel{Substring indices}{index-substringindex}

The methods
\sunjavadoc{java.base/java/lang/String.html\#indexOf(java.lang.String)}{String.indexOf}
and
\sunjavadoc{java.base/java/lang/String.html\#lastIndexOf(java.lang.String)}{String.lastIndexOf}
return an index of a given substring within a given string, or -1 if no
such substring exists.  The index \<i> returned from
\<receiver.indexOf(substring)> satisfies the following property, which is
stated here in three equivalent ways:
\begin{Verbatim}
 i == -1 || ( i >= 0       && i <= receiver.length() - substring.length()                  )
 i == -1 || ( @NonNegative && @LTLengthOf(value="receiver", offset="substring.length()-1") )
 @SubstringIndexFor(value="receiver", offset="substring.length()-1")
\end{Verbatim}

% This new annotation is similar to \<@LTLengthOf\allowbreak(value =
% "receiver", offset = "substring.length()-1")>, but explicitly allows the
% index to be -1 even if the upper bound would not allow it because of the
% offset.  The Upper Bound Checker can infer the corresponding
% \<@LTLengthOf> annotation for expressions that have a
% \<@SubstringIndexFor> annotation and at the same time are known to be
% non-negative (according to the Lower Bound Checker).

The return type of methods \sunjavadoc{java.base/java/lang/String.html\#indexOf(java.lang.String)}{String.indexOf}
and \sunjavadoc{java.base/java/lang/String.html\#lastIndexOf(java.lang.String)}{String.lastIndexOf} has the annotation
\refqualclasswithparams{checker/index/qual}{SubstringIndexFor}{value="this", offset="\#1.length()-1")}.
This allows writing code such as the following with no warnings from the
Index Checker:

\begin{Verbatim}
  public static String removeSubstring(String original, String removed) {
    int i = original.indexOf(removed);
    if (i != -1) {
      return original.substring(0, i) + original.substring(i + removed.length());
    }
    return original;
  }
\end{Verbatim}

% The code removes the first occurrence of \<removed> from
% \<original>. After checking that \code{i != -1}, the value of \<i> must
% be a valid index for \<original>. Because this index is the start of an
% occurrence of \<removed>, \code{i + removed.length()} is the index of the
% end of the occurrence.  Without the \<@SubstringIndexFor> annotation, the
% Upper Bound Checker would not be able to verify that \code{i +
% removed.length()} is a valid argument to \<substring>, which requires
% both arguments to be \<@IndexOrHigh("original")>.

\begin{figure}
\begin{center}
  \hfill
  \includeimage{substringindex}{3.5cm}
  \hfill
\end{center}
  \caption{The type hierarchy for the Substring Index Checker, which
    captures information about the results of calls to
    \sunjavadoc{java.base/java/lang/String.html\#indexOf(java.lang.String)}{String.indexOf}
    and
    \sunjavadoc{java.base/java/lang/String.html\#lastIndexOf(java.lang.String)}{String.lastIndexOf}.}
  \label{fig-index-substringindex}
\end{figure}

The \<@SubstringIndexFor> annotation is implemented in a Substring Index
Checker that runs together with the Index Checker and has its own type
hierarchy (Figure~\ref{fig-index-substringindex}) with three type
qualifiers:
\begin{description}
\item[\refqualclasswithparams{checker/index/qual}{SubstringIndexFor}{String[] value, String[] offset}]
  An expression with this type represents an integer that could have been
  produced by calling
  \sunjavadoc{java.base/java/lang/String.html\#indexOf(java.lang.String)}{String.indexOf}:
  the annotated integer is either -1, or it is non-negative and is less
  than or equal to \<receiver.length - offset> (where the sequence
  \<receiver> and the offset \<offset> are corresponding elements of the
  annotation's arguments).
\item[\refqualclass{checker/index/qual}{SubstringIndexBottom}]
  This is the bottom type, and programmers should rarely need to write it.
\item[\refqualclass{checker/index/qual}{SubstringIndexUnknown}]
  No information is known about whether this integer is a substring index.
  This is the top type, and programmers should rarely need to write it.
\end{description}


\subsectionAndLabel{The need for the \<@SubstringIndexFor> annotation}{index-substringindex-justification}

No other annotation supported by the Index Checker precisely represents the
possible return values of methods
\sunjavadoc{java.base/java/lang/String.html\#indexOf(java.lang.String)}{String.indexOf}
and
\sunjavadoc{java.base/java/lang/String.html\#lastIndexOf(java.lang.String)}{String.lastIndexOf}.
The reason is the methods' special cases for empty strings and for failed matches.

Consider the result \<i> of \<receiver.indexOf(substring)>:

\begin{itemize}
\item
  \<i> is \<@GTENegativeOne>, because \code{i >= -1}.
\item
  \<i> is \<@LTEqLengthOf("receiver")>, because \code{i <= receiver.length()}.
\item
  \<i> is not \<@IndexOrLow("receiver")>, because for
  \code{receiver = "", substring = "", i = 0}, the property
  \code{i >= -1 \&\& i < receiver.length()} does not hold.
\item
  \<i> is not \<@IndexOrHigh("receiver")>, because for
  \code{receiver = "", substring = "b", i = -1}, the property
  \code{i >= 0 \&\& i <= receiver.length()} does not hold.
\item
  \<i> is not
  \<@LTLengthOf(value = "receiver", offset = "substring.length()-1")>,
  because for \code{receiver = "", substring = "abc", i = -1}, the property
  \code{i + substring.length() - 1 < receiver.length()} does not hold.
\end{itemize}

\noindent
The last annotation in the list above,
\<@LTLengthOf(value = "receiver", offset = "substring.length()-1")>,
is the correct and precise upper bound for all values of \<i> except -1.
The offset expresses the fact that we can add \<substring.length()> to this
index and still get a valid index for \<receiver>.  That is useful for
type-checking code that adds the length of the substring to the found
index, in order to obtain the rest of the string.  However, the upper bound
applies only after the index is explicitly checked not to be -1:

\begin{Verbatim}
  int i = receiver.indexOf(substring);
  // i is @GTENegativeOne and @LTEqLengthOf("receiver")
  // i is not @LTLengthOf(value = "receiver", offset = "substring.length()-1")
  if (i != -1) {
    // i is @NonNegative and @LTLengthOf(value = "receiver", offset = "substring.length()-1")
    int j = i + substring.length();
    // j is @IndexOrHigh("receiver")
    return receiver.substring(j); // this call is safe
  }
\end{Verbatim}

The property of the result of \<indexOf> cannot be expressed by any
combination of lower-bound (Section~\ref{index-lowerbound}) and upper-bound
(Section~\ref{index-upperbound}) annotations, because the upper-bound
annotations apply independently of the lower-bound annotations, but in this
case, the upper bound \code{i <= receiver.length() - substring.length()}
holds only if \code{i >= 0}.  Therefore, to express this property and make
the example type-check without false positives, a new annotation such as
\<@SubstringIndexFor\allowbreak(value = "receiver", offset = "substring.length()-1")>
is necessary.

\sectionAndLabel{Inequalities}{index-inequalities}

The Index Checker estimates which expression's values are less than other expressions' values.

\begin{description}

\item[\refqualclasswithparams{checker/index/qual}{LessThan}{String[] values}]
  An expression with this type has a value that is less than the value of each
  expression listed in \<values>. The expressions in values must be composed of
  final or effectively final variables and constants.

\item[\refqualclass{checker/index/qual}{LessThanUnknown}]
  There is no information about the value of an expression this type relative to other expressions.
  This is the top type, and should not be written by the programmer.

 \item[\refqualclass{checker/index/qual}{LessThanBottom}]
   This is the bottom type for the less than type system. It should
   never need to be written by the programmer.

\end{description}

\sectionAndLabel{Annotating fixed-size data structures}{index-annotating-fixed-size}

The Index Checker has built-in support for Strings and arrays.
You can add support for additional fixed-size data structures by writing
annotations.
This allows the Index Checker to typecheck the data structure's
implementation and to typecheck uses of the class.

This section gives an example:  a fixed-length collection.

%% The code that follows is copied from checker/tests/index/ArrayWrapper.java.
%% If this code is updated, please update that file, too.

\begin{Verbatim}
/** ArrayWrapper is a fixed-size generic collection. */
public class ArrayWrapper<T> {
    private final Object @SameLen("this") [] delegate;

    @SuppressWarnings("index") // constructor creates object of size @SameLen(this) by definition
    ArrayWrapper(@NonNegative int size) {
        delegate = new Object[size];
    }

    public @LengthOf("this") int size() {
        return delegate.length;
    }

    public void set(@IndexFor("this") int index, T obj) {
        delegate[index] = obj;
    }

    @SuppressWarnings("unchecked") // required for normal Java compilation due to unchecked cast
    public T get(@IndexFor("this") int index) {
        return (T) delegate[index];
    }
}
\end{Verbatim}

The Index Checker treats methods annotated with \code{@LengthOf("this")}  as
the length of a sequence like \code{arr.length} for arrays and
\code{str.length()} for strings.

With these annotations, client code like the following typechecks with no
warnings:
\begin{Verbatim}
    public static void clearIndex1(ArrayWrapper<? extends Object> a, @IndexFor("#1") int i) {
        a.set(i, null);
    }

    public static void clearIndex2(ArrayWrapper<? extends Object> a, int i) {
        if (0 <= i && i < a.size()) {
            a.set(i, null);
        }
    }
\end{Verbatim}

%%  LocalWords:  NegativeArraySizeException pre myArray IndexFor someArray
%%  LocalWords:  MyJavaFile LTLengthOf LTEqLengthOf GTENegativeOne GTE str
%%  LocalWords:  LowerBoundUnknown LTOMLengthOf LTEq LTOM UpperBoundBottom
%%  LocalWords:  UpperBoundUnknown MinLen MinLenBottom SameLen indexOf abc
%%  LocalWords:  SameLenUnknown SameLenBottom lastIndexOf html lang charAt
%%  LocalWords:  LengthOf IndexOrLow PolyIndex PolyLowerBound PolyLength
%%  LocalWords:  PolyUpperBound PolySameLen PolyValue LowerBoundBottom
%%  LocalWords:  lowerbound upperbound EnsuresLTLengthOf targetValue
%%  LocalWords:  EnsuresLTLengthOfIf boolean HasSubsequence LessThan
%%  LocalWords:  minlen ArrayLenRange ArrayLen EnsuresMinLenIf samelen
%%  LocalWords:  searchindex binarySearch SearchIndexFor NegativeIndexFor
%%  LocalWords:  SearchIndex bitwise SearchIndexBottom SearchIndexUnknown
%%  LocalWords:  Substring substringindex substring SubstringIndexFor
%%  LocalWords:  SubstringIndexBottom SubstringIndexUnknown LessThanBottom
%%  LocalWords:  LessThanUnknown typecheck typechecks