1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437
|
<!DOCTYPE html>
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<meta name="generator" content="hevea 2.32">
<meta name="viewport" content="width=device-width, initial-scale=1.0, maximum-scale=1">
<link rel="stylesheet" type="text/css" href="manual.css">
<title>8.10 Generalized algebraic datatypes</title>
</head>
<body>
<a href="overridingopen.html"><img src="previous_motif.svg" alt="Previous"></a>
<a href="extn.html"><img src="contents_motif.svg" alt="Up"></a>
<a href="bigarray.html"><img src="next_motif.svg" alt="Next"></a>
<hr>
<h2 class="section" id="s:gadts"><a class="section-anchor" href="#s:gadts" aria-hidden="true"></a>8.10 Generalized algebraic datatypes</h2>
<p> <a id="hevea_manual.kwd225"></a>
<a id="hevea_manual.kwd226"></a>
</p><p>(Introduced in OCaml 4.00)</p><div class="syntax"><table class="display dcenter"><tr class="c019"><td class="dcell"><table class="c001 cellpading0"><tr><td class="c018">
<a class="syntax" href="typedecl.html#constr-decl"><span class="c010">constr-decl</span></a></td><td class="c015">::=</td><td class="c017">
...
</td></tr>
<tr><td class="c018"> </td><td class="c015">∣</td><td class="c017"> <a class="syntax" href="names.html#constr-name"><span class="c010">constr-name</span></a> <span class="c004">:</span> [ <a class="syntax" href="typedecl.html#constr-args"><span class="c010">constr-args</span></a> <span class="c004">-></span> ] <a class="syntax" href="types.html#typexpr"><span class="c010">typexpr</span></a>
</td></tr>
<tr><td class="c018"> </td></tr>
<tr><td class="c018">
<a class="syntax" href="typedecl.html#type-param"><span class="c010">type-param</span></a></td><td class="c015">::=</td><td class="c017">
...
</td></tr>
<tr><td class="c018"> </td><td class="c015">∣</td><td class="c017"> [<a class="syntax" href="typedecl.html#variance"><span class="c010">variance</span></a>] <span class="c004">_</span>
</td></tr>
</table></td></tr>
</table></div><p>Generalized algebraic datatypes, or GADTs, extend usual sum types in
two ways: constraints on type parameters may change depending on the
value constructor, and some type variables may be existentially
quantified.
Adding constraints is done by giving an explicit return type
(the rightmost <a class="syntax" href="types.html#typexpr"><span class="c010">typexpr</span></a> in the above syntax), where type parameters
are instantiated.
This return type must use the same type constructor as the type being
defined, and have the same number of parameters.
Variables are made existential when they appear inside a constructor’s
argument, but not in its return type.</p><p>Since the use of a return type often eliminates the need to name type
parameters in the left-hand side of a type definition, one can replace
them with anonymous types <span class="c004">_</span> in that case.</p><p>The constraints associated to each constructor can be recovered
through pattern-matching.
Namely, if the type of the scrutinee of a pattern-matching contains
a locally abstract type, this type can be refined according to the
constructor used.
These extra constraints are only valid inside the corresponding branch
of the pattern-matching.
If a constructor has some existential variables, fresh locally
abstract types are generated, and they must not escape the
scope of this branch.</p>
<h5 class="paragraph" id="p:gadts-recfun"><a class="section-anchor" href="#p:gadts-recfun" aria-hidden="true"></a>Recursive functions</h5>
<p>Here is a concrete example:
</p><div class="caml-example verbatim">
<div class="ocaml">
<div class="pre caml-input"> <span class="ocamlkeyword">type</span> _ term =
| Int : int -> int term
| Add : (int -> int -> int) term
| App : ('b -> 'a) term * 'b term -> 'a term
<span class="ocamlkeyword">let</span> <span class="ocamlkeyword">rec</span> eval : <span class="ocamlkeyword">type</span> a. a term -> a = <span class="ocamlkeyword">function</span>
| Int n -> n <span class="ocamlcomment">(* a = int *)</span>
| Add -> (<span class="ocamlkeyword">fun</span> x y -> x+y) <span class="ocamlcomment">(* a = int -> int -> int *)</span>
| App(f,x) -> (eval f) (eval x)
<span class="ocamlcomment">(* eval called at types (b->a) and b for fresh b *)</span></div></div>
</div><div class="caml-example verbatim">
<div class="ocaml">
<div class="pre caml-input"> <span class="ocamlkeyword">let</span> two = eval (App (App (Add, Int 1), Int 1))</div>
<div class="pre caml-output ok"><span class="ocamlkeyword">val</span> two : int = 2</div></div>
</div><p>
It is important to remark that the function <span class="c003">eval</span> is using the
polymorphic syntax for locally abstract types. When defining a recursive
function that manipulates a GADT, explicit polymorphic recursion should
generally be used. For instance, the following definition fails with a
type error:
</p><div class="caml-example verbatim">
<div class="ocaml">
<div class="pre caml-input"> <span class="ocamlkeyword">let</span> <span class="ocamlkeyword">rec</span> eval (<span class="ocamlkeyword">type</span> a) : a term -> a = <span class="ocamlkeyword">function</span>
| Int n -> n
| Add -> (<span class="ocamlkeyword">fun</span> x y -> x+y)
| App(f,x) -> (eval <span class="ocamlhighlight">f</span>) (eval x)</div>
<div class="pre caml-output error"><span class="ocamlerror">Error</span>: This expression has type ($App_'b -> a) term
but an expression was expected of type 'a
The type constructor $App_'b would escape its scope</div></div>
</div><p>
In absence of an explicit polymorphic annotation, a monomorphic type
is inferred for the recursive function. If a recursive call occurs
inside the function definition at a type that involves an existential
GADT type variable, this variable flows to the type of the recursive
function, and thus escapes its scope. In the above example, this happens
in the branch <span class="c003">App(f,x)</span> when <span class="c003">eval</span> is called with <span class="c003">f</span> as an argument.
In this branch, the type of <span class="c003">f</span> is <span class="c003">($App_ 'b-> a)</span>. The prefix <span class="c003">$</span> in
<span class="c003">$App_ 'b</span> denotes an existential type named by the compiler
(see <a href="#p%3Aexistential-names">8.10</a>). Since the type of <span class="c003">eval</span> is
<span class="c003">'a term -> 'a</span>, the call <span class="c003">eval f</span> makes the existential type <span class="c003">$App_'b</span>
flow to the type variable <span class="c003">'a</span> and escape its scope. This triggers the
above error.</p>
<h5 class="paragraph" id="p:gadts-type-inference"><a class="section-anchor" href="#p:gadts-type-inference" aria-hidden="true"></a>Type inference</h5>
<p>Type inference for GADTs is notoriously hard.
This is due to the fact some types may become ambiguous when escaping
from a branch.
For instance, in the <span class="c003">Int</span> case above, <span class="c003">n</span> could have either type <span class="c003">int</span>
or <span class="c003">a</span>, and they are not equivalent outside of that branch.
As a first approximation, type inference will always work if a
pattern-matching is annotated with types containing no free type
variables (both on the scrutinee and the return type).
This is the case in the above example, thanks to the type annotation
containing only locally abstract types.</p><p>In practice, type inference is a bit more clever than that: type
annotations do not need to be immediately on the pattern-matching, and
the types do not have to be always closed.
As a result, it is usually enough to only annotate functions, as in
the example above. Type annotations are
propagated in two ways: for the scrutinee, they follow the flow of
type inference, in a way similar to polymorphic methods; for the
return type, they follow the structure of the program, they are split
on functions, propagated to all branches of a pattern matching,
and go through tuples, records, and sum types.
Moreover, the notion of ambiguity used is stronger: a type is only
seen as ambiguous if it was mixed with incompatible types (equated by
constraints), without type annotations between them.
For instance, the following program types correctly.
</p><div class="caml-example verbatim">
<div class="ocaml">
<div class="pre caml-input"> <span class="ocamlkeyword">let</span> <span class="ocamlkeyword">rec</span> sum : <span class="ocamlkeyword">type</span> a. a term -> _ = <span class="ocamlkeyword">fun</span> x ->
<span class="ocamlkeyword">let</span> y =
<span class="ocamlkeyword">match</span> x <span class="ocamlkeyword">with</span>
| Int n -> n
| Add -> 0
| App(f,x) -> sum f + sum x
<span class="ocamlkeyword">in</span> y + 1</div>
<div class="pre caml-output ok"><span class="ocamlkeyword">val</span> sum : 'a term -> int = <<span class="ocamlkeyword">fun</span>></div></div>
</div><p>
Here the return type <span class="c003">int</span> is never mixed with <span class="c003">a</span>, so it is seen as
non-ambiguous, and can be inferred.
When using such partial type annotations we strongly suggest
specifying the <span class="c003">-principal</span> mode, to check that inference is
principal.</p><p>The exhaustiveness check is aware of GADT constraints, and can
automatically infer that some cases cannot happen.
For instance, the following pattern matching is correctly seen as
exhaustive (the <span class="c003">Add</span> case cannot happen).
</p><div class="caml-example verbatim">
<div class="ocaml">
<div class="pre caml-input"> <span class="ocamlkeyword">let</span> get_int : int term -> int = <span class="ocamlkeyword">function</span>
| Int n -> n
| App(_,_) -> 0</div></div>
</div>
<h5 class="paragraph" id="p:gadt-refutation-cases"><a class="section-anchor" href="#p:gadt-refutation-cases" aria-hidden="true"></a>Refutation cases</h5>
<p> (Introduced in OCaml 4.03)</p><p>Usually, the exhaustiveness check only tries to check whether the
cases omitted from the pattern matching are typable or not.
However, you can force it to try harder by adding <em>refutation cases</em>:
</p><div class="syntax"><table class="display dcenter"><tr class="c019"><td class="dcell"><table class="c001 cellpading0"><tr><td class="c018">
<a class="syntax" id="matching-case"><span class="c010">matching-case</span></a></td><td class="c015">::=</td><td class="c017">
<a class="syntax" href="patterns.html#pattern"><span class="c010">pattern</span></a> [<span class="c004">when</span> <a class="syntax" href="expr.html#expr"><span class="c010">expr</span></a>] <span class="c004">-></span> <a class="syntax" href="expr.html#expr"><span class="c010">expr</span></a>
</td></tr>
<tr><td class="c018"> </td><td class="c015">∣</td><td class="c017"> <a class="syntax" href="patterns.html#pattern"><span class="c010">pattern</span></a> <span class="c004">-></span> <span class="c004">.</span>
</td></tr>
</table></td></tr>
</table></div><p>
In presence of a refutation case, the exhaustiveness check will first
compute the intersection of the pattern with the complement of the
cases preceding it. It then checks whether the resulting patterns can
really match any concrete values by trying to type-check them.
Wild cards in the generated patterns are handled in a special way: if
their type is a variant type with only GADT constructors, then the
pattern is split into the different constructors, in order to check whether
any of them is possible (this splitting is not done for arguments of these
constructors, to avoid non-termination). We also split tuples and
variant types with only one case, since they may contain GADTs inside.
For instance, the following code is deemed exhaustive:</p><div class="caml-example verbatim">
<div class="ocaml">
<div class="pre caml-input"> <span class="ocamlkeyword">type</span> _ t =
| Int : int t
| Bool : bool t
<span class="ocamlkeyword">let</span> deep : (char t * int) option -> char = <span class="ocamlkeyword">function</span>
| None -> 'c'
| _ -> .</div></div>
</div><p>Namely, the inferred remaining case is <span class="c003">Some _</span>, which is split into
<span class="c003">Some (Int, _)</span> and <span class="c003">Some (Bool, _)</span>, which are both untypable because
<span class="c003">deep</span> expects a non-existing <span class="c003">char t</span> as the first element of the tuple.
Note that the refutation case could be omitted here, because it is
automatically added when there is only one case in the pattern
matching.</p><p>Another addition is that the redundancy check is now aware of GADTs: a
case will be detected as redundant if it could be replaced by a
refutation case using the same pattern.</p>
<h5 class="paragraph" id="p:gadts-advexamples"><a class="section-anchor" href="#p:gadts-advexamples" aria-hidden="true"></a>Advanced examples</h5>
<p>
The <span class="c003">term</span> type we have defined above is an <em>indexed</em> type, where
a type parameter reflects a property of the value contents.
Another use of GADTs is <em>singleton</em> types, where a GADT value
represents exactly one type. This value can be used as runtime
representation for this type, and a function receiving it can have a
polytypic behavior.</p><p>Here is an example of a polymorphic function that takes the
runtime representation of some type <span class="c003">t</span> and a value of the same type,
then pretty-prints the value as a string:
</p><div class="caml-example verbatim">
<div class="ocaml">
<div class="pre caml-input"> <span class="ocamlkeyword">type</span> _ typ =
| Int : int typ
| String : string typ
| Pair : 'a typ * 'b typ -> ('a * 'b) typ
<span class="ocamlkeyword">let</span> <span class="ocamlkeyword">rec</span> to_string: <span class="ocamlkeyword">type</span> t. t typ -> t -> string =
<span class="ocamlkeyword">fun</span> t x ->
<span class="ocamlkeyword">match</span> t <span class="ocamlkeyword">with</span>
| Int -> Int.to_string x
| String -> Printf.sprintf <span class="ocamlstring">"%S"</span> x
| Pair(t1,t2) ->
<span class="ocamlkeyword">let</span> (x1, x2) = x <span class="ocamlkeyword">in</span>
Printf.sprintf <span class="ocamlstring">"(%s,%s)"</span> (to_string t1 x1) (to_string t2 x2)</div></div>
</div><p>Another frequent application of GADTs is equality witnesses.
</p><div class="caml-example verbatim">
<div class="ocaml">
<div class="pre caml-input"> <span class="ocamlkeyword">type</span> (_,_) eq = Eq : ('a,'a) eq
<span class="ocamlkeyword">let</span> cast : <span class="ocamlkeyword">type</span> a b. (a,b) eq -> a -> b = <span class="ocamlkeyword">fun</span> Eq x -> x</div></div>
</div><p>
Here type <span class="c003">eq</span> has only one constructor, and by matching on it one
adds a local constraint allowing the conversion between <span class="c003">a</span> and <span class="c003">b</span>.
By building such equality witnesses, one can make equal types which
are syntactically different.</p><p>Here is an example using both singleton types and equality witnesses
to implement dynamic types.
</p><div class="caml-example verbatim">
<div class="ocaml">
<div class="pre caml-input"> <span class="ocamlkeyword">let</span> <span class="ocamlkeyword">rec</span> eq_type : <span class="ocamlkeyword">type</span> a b. a typ -> b typ -> (a,b) eq option =
<span class="ocamlkeyword">fun</span> a b ->
<span class="ocamlkeyword">match</span> a, b <span class="ocamlkeyword">with</span>
| Int, Int -> Some Eq
| String, String -> Some Eq
| Pair(a1,a2), Pair(b1,b2) ->
<span class="ocamlkeyword">begin</span> <span class="ocamlkeyword">match</span> eq_type a1 b1, eq_type a2 b2 <span class="ocamlkeyword">with</span>
| Some Eq, Some Eq -> Some Eq
| _ -> None
<span class="ocamlkeyword">end</span>
| _ -> None
<span class="ocamlkeyword">type</span> dyn = Dyn : 'a typ * 'a -> dyn
<span class="ocamlkeyword">let</span> get_dyn : <span class="ocamlkeyword">type</span> a. a typ -> dyn -> a option =
<span class="ocamlkeyword">fun</span> a (Dyn(b,x)) ->
<span class="ocamlkeyword">match</span> eq_type a b <span class="ocamlkeyword">with</span>
| None -> None
| Some Eq -> Some x</div></div>
</div>
<h5 class="paragraph" id="p:existential-names"><a class="section-anchor" href="#p:existential-names" aria-hidden="true"></a>Existential type names in error messages</h5>
<p>(Updated in OCaml 4.03.0)</p><p>The typing of pattern matching in presence of GADT can generate many
existential types. When necessary, error messages refer to these
existential types using compiler-generated names. Currently, the
compiler generates these names according to the following nomenclature:
</p><ul class="itemize"><li class="li-itemize">
First, types whose name starts with a <span class="c003">$</span> are existentials.
</li><li class="li-itemize"><span class="c003">$Constr_'a</span> denotes an existential type introduced for the type
variable <span class="c003">'a</span> of the GADT constructor <span class="c003">Constr</span>:
<div class="caml-example verbatim">
<div class="ocaml">
<div class="pre caml-input"> <span class="ocamlkeyword">type</span> any = Any : 'name -> any
<span class="ocamlkeyword">let</span> escape (Any x) = <span class="ocamlhighlight">x</span></div>
<div class="pre caml-output error"><span class="ocamlerror">Error</span>: This expression has type $Any_'name
but an expression was expected of type 'a
The type constructor $Any_'name would escape its scope</div></div>
</div>
</li><li class="li-itemize"><span class="c003">$Constr</span> denotes an existential type introduced for an anonymous type variable in the GADT constructor <span class="c003">Constr</span>:
<div class="caml-example verbatim">
<div class="ocaml">
<div class="pre caml-input"> <span class="ocamlkeyword">type</span> any = Any : _ -> any
<span class="ocamlkeyword">let</span> escape (Any x) = <span class="ocamlhighlight">x</span></div>
<div class="pre caml-output error"><span class="ocamlerror">Error</span>: This expression has type $Any but an expression was expected of type
'a
The type constructor $Any would escape its scope</div></div>
</div>
</li><li class="li-itemize"><span class="c003">$'a</span> if the existential variable was unified with the type variable <span class="c003">'a</span> during typing:
<div class="caml-example verbatim">
<div class="ocaml">
<div class="pre caml-input"> <span class="ocamlkeyword">type</span> ('arg,'result,'aux) fn =
| Fun: ('a ->'b) -> ('a,'b,unit) fn
| Mem1: ('a ->'b) * 'a * 'b -> ('a, 'b, 'a * 'b) fn
<span class="ocamlkeyword">let</span> apply: ('arg,'result, _ ) fn -> 'arg -> 'result = <span class="ocamlkeyword">fun</span> f x ->
<span class="ocamlkeyword">match</span> f <span class="ocamlkeyword">with</span>
| Fun f -> f x
| <span class="ocamlhighlight">Mem1 (f,y,fy)</span> -> <span class="ocamlkeyword">if</span> x = y <span class="ocamlkeyword">then</span> fy <span class="ocamlkeyword">else</span> f x</div>
<div class="pre caml-output error"><span class="ocamlerror">Error</span>: This pattern matches values of type
($'arg, 'result, $'arg * 'result) fn
but a pattern was expected which matches values of type
($'arg, 'result, unit) fn
The type constructor $'arg would escape its scope</div></div>
</div>
</li><li class="li-itemize"><span class="c003">$n</span> (n a number) is an internally generated existential which could not be named using one of the previous schemes.
</li></ul><p>As shown by the last item, the current behavior is imperfect
and may be improved in future versions.</p>
<h5 class="paragraph" id="p:gadt-equation-nonlocal-abstract"><a class="section-anchor" href="#p:gadt-equation-nonlocal-abstract" aria-hidden="true"></a>Equations on non-local abstract types</h5>
<p> (Introduced in OCaml
4.04)</p><p>GADT pattern-matching may also add type equations to non-local
abstract types. The behaviour is the same as with local abstract
types. Reusing the above <span class="c003">eq</span> type, one can write:
</p><div class="caml-example verbatim">
<div class="ocaml">
<div class="pre caml-input"> <span class="ocamlkeyword">module</span> M : <span class="ocamlkeyword">sig</span> <span class="ocamlkeyword">type</span> t <span class="ocamlkeyword">val</span> x : t <span class="ocamlkeyword">val</span> e : (t,int) eq <span class="ocamlkeyword">end</span> = <span class="ocamlkeyword">struct</span>
<span class="ocamlkeyword">type</span> t = int
<span class="ocamlkeyword">let</span> x = 33
<span class="ocamlkeyword">let</span> e = Eq
<span class="ocamlkeyword">end</span>
<span class="ocamlkeyword">let</span> x : int = <span class="ocamlkeyword">let</span> Eq = M.e <span class="ocamlkeyword">in</span> M.x</div></div>
</div><p>Of course, not all abstract types can be refined, as this would
contradict the exhaustiveness check. Namely, builtin types (those
defined by the compiler itself, such as <span class="c003">int</span> or <span class="c003">array</span>), and
abstract types defined by the local module, are non-instantiable, and
as such cause a type error rather than introduce an equation.</p>
<hr>
<a href="overridingopen.html"><img src="previous_motif.svg" alt="Previous"></a>
<a href="extn.html"><img src="contents_motif.svg" alt="Up"></a>
<a href="bigarray.html"><img src="next_motif.svg" alt="Next"></a>
</body>
</html>
|