1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299
|
------------------------------------------------------------------------------
A prototype implementation of extensible records for Hugs
Mark P. Jones, March 1997
This short note provides some background details for anybody wishing to
experiment with the implementation of extensible records that is
included in this distribution of Hugs. Please note that this extension
is only enabled if you use a version of Hugs that has been configured
using the --enable-TREX option or by setting the TREX flag to 1 in the
file hugs/src/options.h.
For further details, and particularly for information about its
theoretical foundations and its relationship with other proposals, the
reader should refer to the report:
A Polymorphic Type System for Extensible Records and Variants,
Benedict R. Gaster and Mark P. Jones Technical report NOTTCS-TR-96-3 ,
Department of Computer Science, University of Nottingham, November 1996.
(http://www.cs.nott.ac.uk/Department/Techreports/96-3.html)
The current implementation does not use our prefered syntax for record
operations; too many of the symbols that we would like to have used are
already used in conflicting ways in the syntax of standard Haskell.
Basic concepts:
---------------
In essence, records are just collections of values, each of which is
associated with a particular label. For example:
(a = True, b = "Hello", c = 12::Int)
is a record with three components: an "a" field, containing a boolean
value, a "b" field containing a string, and a "c" field containing the
number 12. The order in which the fields are listed is not
significant, so the same record value might also be written as:
(c = 12::Int, a = True, b = "Hello")
These examples show simple ways to construct record values. We can
also inspect the values held in a record using selector functions.
These are written using the # character, followed immediately by the
label of the field that we want to extract. For example:
? #a (a = True, b = "Hello", c = 12::Int)
True
? #b (a = True, b = "Hello", c = 12::Int)
"Hello"
? #c (a = True, b = "Hello", c = 12::Int)
12
?
This notation for selector functions should be familiar to users of
ML. Our implementation also allows programs to use pattern matching to
inspect record values. For example:
? (\(a=x, c=y, b=_) -> (y,x)) (a = True, b = "Hello", c = 12::Int)
(12,True)
?
The order of fields in a record pattern is significant because it
determines the order---from left to right---in which they are
matched. In the following example, an attempt to match the pattern
(a=[x], b=True) against the record (b=undefined, a=[]), fails because
[x] does not match the empty list, but a match against (a=[2],b=True)
succeeds, binding x to 2:
? [ x | (a=[x], b=True) <- [(b=undefined, a=[]), (a=[2],b=True)]]
[2]
?
Changing the order of the fields in the pattern to (b=True, a=[x])
forces matching to start with the "b" component. But the first element
in the list of records used above has "undefined" in its "b" component,
so now the evaluation produces a run-time error message:
? [ x | (b=True, a=[x]) <- [(b=undefined, a=[]), (a=[2],b=True)]]
Program error: {undefined}
?
Although Hugs lets you work with record values, it does not, by
default, allow you to print them. More accurately, it does not
automatically provide instances of the Show class for record values.
So a simple attempt to print a record value will result in an error
like the following:
? (a = True, b = "Hello", c = 12::Int)
ERROR: Cannot find "show" function for:
*** expression : (a=True, b="Hello", c=12)
*** of type : Rec (a::Bool, b::[Char], c::Int)
?
The problem here occurs because Hugs attempts to display the record by
applying the "show" function to it, and no version of "show" has been
defined. If you do want to be able to display record values, then you
should load or import the "Trex.hs" module---which is usually included
in the lib/hugs directory of the Hugs distribution:
? :load Trex
? (a = True, b = "Hello", c = 12::Int)
(a=True, b="Hello", c=12)
? (c = 12::Int, a = True, b = "Hello")
(a=True, b="Hello", c=12)
?
Note that the fields are always displayed with their labels in
alphabetical order. The fact that the fields appear in a specific
(but, frankly, arbitrary) order is very important---"show" is a normal
function, so its output must be uniquely determined by its input, and
not by the way in which that input value is written. (Note that the
two records used in the example above have exactly the same value.)
Of course, like all other values in Haskell, records have types,
written using expressions of the form Rec r where "Rec" is a built-in
type constructor and "r" represents a `row' that associates labels with
types. For example:
? :t (c = 12::Int, a = True, b = "Hello")
(a=True, b="Hello", c=12) :: Rec (a::Bool, b::[Char], c::Int)
?
The type here tells us, unsurprisingly, that the record (a=True,
b="Hello", c=12) has three components: an "a" field containing a
Bool, a "b" field containing a String, and a "c" field of type Int.
As with record values themselves, the order of the components in a
row is not significant:
? (a=True, b="Hello", c=12) :: Rec (b::String, c::Int, a::Bool)
(a=True, b="Hello", c=12)
?
However, the type of a record must be an accurate reflection of the
fields that appear in the corresponding value. The following example
produces an error because the specified type does not list all of the
fields in the record value:
? (a=True, b="Hello", c=12) :: Rec (b::String, c::Int)
ERROR: Type error in type signature expression
*** term : (a=True, b="Hello", c=12)
*** type : Rec (a::Bool, b::[Char], c::a)
*** does not match : Rec (b::String, c::Int)
*** because : field mismatch
?
Extensibility:
--------------
Any given record "r" can be extended with a new field labelled "l",
provided that "r" does not already include an "l" field. For example,
we can construct the record (a=True, b="Hello") by extending the record
(a = True) with a field b = "Hello":
? (b = "Hello" | (a = True))
(a=True, b="Hello")
?
Alternatively, we can construct the same result by extending a record
(b = "Hello") with a field a = True:
? (a = True | (b = "Hello"))
(a=True, b="Hello")
?
The syntax of our current implementation allows us to add several new
fields at a time (and the corresponding syntax for pattern matching is
also supported):
? (a=True, b="Hello", c=12::Int | (b1="World"))
(a=True, b="Hello", b1="World", c=12)
?
On the other hand, a record cannot be extended with a field of the same
name, even if it has a different type. The following examples illustrate
this:
? (a=True | (a=False))
ERROR: Repeated label "a" in record (a=True, a=False)
? (a=True | r) where r = (a=12::Int)
ERROR: (a::Int) already includes a "a" field
? (a=True | r) where r = (a=False)
ERROR: (a::Bool) already includes a "a" field
?
Notice how the error messages that Hugs produced here vary. In the
first case, the presence of a repeated label was detected
syntactically. In the second and third examples, the problem was
detected using information about the type of the record "r".
Much the same syntax can be used in patterns to decompose record values:
? (\(b=bval | r) -> (bval,r)) (a=True, b="Hello")
("Hello",(a=True))
?
In the previous examples, we saw how a record could be extended with
new fields. As this example shows, we can use pattern matching to do
the reverse operation, removing fields from a record.
We can also use pattern matching to understand how selector functions
like #a, #b, and so on are implemented. For example, the selector #x
is equivalent to the following function:
(\(x=value | _) -> value)
A selector function like this is `polymorphic' in the sense that it can
be used with *any* record containing an "x" field, regardless of the
type associated with that particular component, or of any other fields
that the record might contain:
? (\(x=value | _) -> value) (x=True, b="Hello")
True
? (\(x=value | _) -> value) (name="Hugs", age=2, x="None")
"None"
?
To understand how this works, it is useful to look at the type that
Hugs assigns to this particular selector function:
? :type (\(x=value | _) -> value)
\(x=value | _) -> value :: r\x => Rec (x::a | r) -> a
?
There are two important pieces of notation that have to be explained
here:
o Rec (x::a | r) is the type of a record with an "x" component of
type a. The `row variable' "r" represents the rest of the row;
that is, it represents any other fields in the record apart from "x".
This syntax---for record type extension---was chosen to mirror the
syntax that we have already seen in the examples above for record
value extension.
o The constraint r\x tells us that the type on the right of
the => symbol is only valid if "r lacks x", that is, if "r" is
a row that does not contain an "x" field. Those familiar with
Haskell may like to think of "\x" as a kind of class constraint,
written with postfix syntax, whose instances are precisely the
rows without an "x" field.
For example, when we apply our selector function to the record (x=True,
b="Hello") of type Rec (b::String, x::Bool), then we instantiate the
variables "a" and "r" in the type above to Bool and (b::String),
respectively.
In fact, the built-in selector functions have exactly the same type as
the user-defined selector shown above:
? :type #x
#x :: b\x => Rec (x::a | b) -> a
?
The row constraints that we see here can also occur in the type of any
function that operates on record values if the types of those records
are not fully determined at compile-time. For example, given the following
definition:
average r = (#x r + #y r) / 2
Hugs infers a principal type of the form:
average :: (Fractional a, b\y, b\x) => Rec (y::a, x::a | b) -> a
However, any of the following, more specific types could be specified in
a type declaration for the "average" function:
average :: (Fractional a) => Rec (x::a, y::a) -> a
average :: (r\x, r\y) => Rec (x::Double, y::Double | r) -> Double
average :: Rec (x::Double, y::Double) -> Double
average :: Rec (x::Double, y::Double, z::Bool) -> Double
Each of these types is an instance of the principal type given above.
These examples show an important difference between the system of
records described here, and the record facilities provided by SML.
In particular, SML prohibits definitions that involve records for
which the complete set of fields cannot be determined at compile-time.
So, the SML equivalent of the "average" function described above would
be rejected because there is no way to determine if the record r will
have any fields other than x or y. SML programmers usually avoid such
problems by giving a type annotation that completely specifies the
structure of the record. But, of course, if a definition is limited
in this way, then it also less flexible.
With the current implementation of our type system, there is an advantage
to knowing the full type of a record at compile-time because it allows
the compiler to generate more efficient code. However, unlike SML, our
type system also offers the extra flexibility of polymorphism and
extensibility over records if that is needed.
------------------------------------------------------------------------------
|