1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279
|
@node Static type system
@section Static type system
@cindex type inference
@cindex static type analysis
Scheme48 supports a rudimentary static type system. It is intended
mainly to catch some classes of type and arity mismatch errors early,
at compile-time. By default, there is only @emph{extremely} basic
analysis, which is typically only good enough to catch arity errors and
the really egregious type errors. The full reconstructor, which is
still not very sophisticated, is enabled by specifying an optimizer
pass that invokes the code usage analyzer. The only optimizer pass
built-in to Scheme48, the automatic procedure integrator, named
@code{auto-integrate}, does so.
The type reconstructor attempts to assign the most specific type it can
to program terms, signalling warnings for terms that are certain to be
invalid by Scheme's dynamic semantics. Since the reconstructor is not
very sophisticated, it frequently gives up and assigns very general
types to many terms. Note, however, that it is very lenient in that it
only assigns more general types: it will @emph{never} signal a warning
because it could not reconstruct a very specific type. For example,
the following program will produce no warnings:
@lisp
(define (foo x y) (if x (+ y 1) (car y)))@end lisp
@noindent
Calls to @code{foo} that are clearly invalid, such as @code{(foo #t
'a)}, could cause the type analyzer to signal warnings, but it is not
sophisticated enough to determine that @code{foo}'s second argument
must be either a number or a pair; it simply assigns a general value
type (see below).
There are some tricky cases that depend on the order by which arguments
are evaluated in a combination, because that order is not specified in
Scheme. In these cases, the relevant types are narrowed to the most
specific ones that could not possibly cause errors at run-time for any
order. For example,
@lisp
(lambda (x) (+ (begin (set! x '(3)) 5) (car x)))@end lisp
@noindent
will be assigned the type @code{(proc (:pair) :number)}, because, if
the arguments are evaluated right-to-left, and @code{x} is not a pair,
there will be a run-time type error.
The type reconstructor presumes that all code is potentially reachable,
so it may signal warnings for code that the most trivial control flow
analyzer could decide unreachable. For example, it would signal a
warning for @code{(if #t 3 (car 7))}. Furthermore, it does not account
for continuation throws; for example, though it is a perfectly valid
Scheme program, the type analyzer might signal a warning for this code:
@lisp
(call-with-current-continuation
(lambda (k) (0 (k))))@end lisp
@cindex type lattice
The type system is based on a type lattice. There are several maximum
or `top' elements, such as @code{:values}, @code{:syntax}, and
@code{:structure}; and one minimum or `bottom' element, @code{:error}.
This description of the type system makes use of the following
notations: @code{@var{E} : @var{T}} means that the term @var{E} has the
type, or some compatible subtype of, @var{T}; and @code{@var{T@suba{a}}
@subtype{} @var{T@suba{b}}} means that @var{T@suba{a}} is a compatible
subtype of @var{T@suba{b}} --- that is, any term whose static type is
@var{T@suba{a}} is valid in any context that expects the type
@var{T@suba{b}} ---.
Note that the previous text has used the word `term,' not `expression,'
because static types are assigned to not only Scheme expressions. For
example, @code{cond} macro has the type @code{:syntax}. Structures in
the configuration language also have static types: their interfaces.
(Actually, they really have the type @code{:structure}, but this is a
deficiency in the current implementation's design.) Types, in fact,
have their own type: @code{:type}. Here are some examples of values,
first-class or otherwise, and their types:
@example
cond : :syntax
(values 1 'foo '(x . y))
: (some-values :exact-integer :symbol :pair)
:syntax : :type
3 : :exact-integer
(define-structure foo (export a b) ...)
foo : (export a b)@end example
@cindex parametric polymorphism
One notable deficiency of the type system is the absence of any sort of
parametric polymorphism.
@cindex join types
@cindex meet types
@deffn {type constructor} join type @dots{}
@deffnx {type constructor} meet type @dots{}
@code{Join} and @code{meet} construct the supremum and infimum elements
in the type lattice of the given types. That is, for any two disjoint
types @var{T@suba{a}} and @var{T@suba{b}}, let @var{T@suba{j}} be
@code{(join @var{T@suba{a}} @var{T@suba{b}})} and @var{T@suba{m}} be
@code{(meet @var{T@suba{a}} @var{T@suba{b}})}:
@itemize
@item @var{T@suba{j}} @subtype{} @var{T@suba{a}} and @var{T@suba{j}} @subtype{} @var{T@suba{b}}
@item @var{T@suba{a}} @subtype{} @var{T@suba{m}} and @var{T@suba{b}} @subtype{} @var{T@suba{m}}
@end itemize
For example, @code{(join :pair :null)} allows either pairs or nil,
@ie{} lists, and @code{(meet :integer :exact)} accepts only integers
that are also exact.
(More complete definitions of supremum, infimum, and other elements of
lattice theory, may be found elsewhere.)
@end deffn
@deftp type :error
This is the minimal, or `bottom,' element in the type lattice. It is
the type of, for example, calls to @code{error}.
@end deftp
@deftp type :values
@deftpx type :arguments
All Scheme @emph{expressions} have the type @code{:values}. They may
have more specific types as well, but all expressions' types are
compatible subtypes of @code{:values}. @code{:Values} is a maximal
element of the type lattice. @code{:Arguments} is synonymous with
@code{:values}.
@end deftp
@deftp type :value
Scheme expressions that have a single result have the type
@code{:value}, or some compatible subtype thereof; it is itself a
compatible subtype of @code{:values}.
@end deftp
@deffn {type constructor} some-values type @dots{}
@code{Some-values} is used to denote the types of expressions that have
multiple results: if @code{@var{E@sub{1}} @dots{} @var{E@sub{n}}} have
the types @code{@var{T@sub{1}} @dots{} @var{T@sub{n}}}, then the Scheme
expression @code{(values @var{E@sub{1}} @dots{} @var{E@sub{n}})} has
the type @code{(some-values @var{T@sub{1}} @dots{} @var{T@sub{n}})}.
@code{Some-values}-constructed types are compatible subtypes of
@code{:values}.
@code{Some-values} also accepts `optional' and `rest' types, similarly
to Common Lisp's `optional' and `rest' formal parameters. The sequence
of types may contain a @code{&opt} token, followed by which is any
number of further types, which are considered to be optional. For
example, @code{make-vector}'s domain is @code{(some-values
:exact-integer &opt :value)}. There may also be a @code{&rest} token,
which must follow the @code{&opt} token if there is one. Following the
@code{&rest} token is one more type, which the rest of the sequents in
a sequence after the required or optional sequents must satisfy. For
example, @code{map}'s domain is @code{(some-values :procedure (join
:pair :null) &rest (join :pair :null))}: it accepts one procedure and
at least one list (pair or null) argument.
@end deffn
@deffn {type constructor} procedure domain codomain
@deffnx {type constructor} proc (arg-type @dots{}) result-type
Procedure type constructors. Procedure types are always compatible
subtypes of @code{:value}. @code{Procedure} is a simple constructor
from a specific domain and codomain; @var{domain} and @var{codomain}
must be compatible subtypes of @code{:values}. @code{Proc} is a more
convenient constructor. It is equivalent to @code{(procedure
(some-values @var{arg-type} @dots{}) @var{result-type})}.
@end deffn
@deftp type :boolean
@deftpx type :char
@deftpx type :null
@deftpx type :unspecific
@deftpx type :pair
@deftpx type :string
@deftpx type :symbol
@deftpx type :vector
@deftpx type :procedure
@deftpx type :input-port
@deftpx type :output-port
Types that represent standard Scheme data. These are all compatible
subtypes of @code{:value}. @code{:Procedure} is the general type for
all procedures; see @code{proc} and @code{procedure} for procedure
types with specific domains and codomains.
@end deftp
@deftp type :number
@deftpx type :complex
@deftpx type :real
@deftpx type :rational
@deftpx type :integer
Types of the Scheme numeric tower. @code{:integer @subtype{} :rational
@subtype{} :real @subtype{} :complex @subtype{} :number}
@end deftp
@deftp type :exact
@deftpx type :inexact
@deftpx type :exact-integer
@deftpx type :inexact-real
@code{:Exact} and @code{:inexact} are the types of exact and inexact
numbers, respectively. They are typically met with one of the types in
the numeric tower above; @code{:exact-integer} and @code{:inexact-real}
are two conveniences for the most common meets.
@end deftp
@deftp type :other
@code{:Other} is for types that do not fall into any of the previous
value categories. (@code{:other @subtype{} :value}) All new types
introduced, for example by @embedref{Type annotations,@code{loophole}},
are compatible subtypes of @code{:other}.
@end deftp
@deffn {type constructor} variable type
This is the type of all assignable variables, where @code{@var{type}
@subtype{} :value}. Assignment to variables whose types are value
types, not assignable variable types, is invalid.
@end deffn
@deftp type :syntax
@deftpx type :structure
@code{:Syntax} and @code{:structure} are two other maximal elements of
the type lattice, along with @code{:values}. @code{:Syntax} is the
type of macros or syntax transformers. @code{:Structure} is the
general type of all structures.
@end deftp
@subsection Types in the configuration language
Scheme48's configuration language has several places in which to write
types. However, due to the definitions of certain elements of the
configuration language, notably the @code{export} syntax, the allowable
type syntax is far more limited than the above. Only the following are
provided:
@deftp type :values
@deftpx type :value
@deftpx type :arguments
@deftpx type :syntax
@deftpx type :structure
All of the built-in maximal elements of the type lattice are provided,
as well as the simple compatible subtype @code{:values}, @code{:value}.
@end deftp
@deftp type :boolean
@deftpx type :char
@deftpx type :null
@deftpx type :unspecific
@deftpx type :pair
@deftpx type :string
@deftpx type :symbol
@deftpx type :vector
@deftpx type :procedure
@deftpx type :input-port
@deftpx type :output-port
@deftpx type :number
@deftpx type :complex
@deftpx type :real
@deftpx type :rational
@deftpx type :integer
@deftpx type :exact-integer
These are the only value types provided in the configuration language.
Note the conspicuous absence of @code{:exact}, @code{:inexact}, and
@code{:inexact-real}.
@end deftp
@deffn {type constructor} procedure domain codomain
@deffnx {type constructor} proc (arg-type @dots{}) result-type
These two are the only type constructors available. Note here the
conspicuous absence of @code{some-values}, so procedure types that are
constructed by @code{procedure} can accept only one argument (or use
the overly general @code{:values} type) & return only one result (or,
again, use @code{:values} for the codomain), and procedure types that
are constructed by @code{proc} are similar in the result type.
@end deffn
|