%**<title>A Gentle Introduction to Haskell: Functions</title>
%**~header
%%%
\section{Functions}
\label{tut-functions}
Since Haskell is a functional language, one would expect functions to
play a major role, and indeed they do. In this section, we look at
several aspects of functions in Haskell.
First, consider this definition of a function which adds its two
arguments:
\bprog
@
add :: Integer -> Integer -> Integer
add x y = x + y
@
\eprog
This is an example of a {\em curried} function.\footnote{The name {\em
curry} derives from the person who popularized the idea: Haskell
Curry. To get the effect of an {\em uncurried} function, we could use
a {\em tuple}, as in:
\bprog
@
add (x,y) = x + y
@
\eprog
But then we see that this version of @add@ is really just a function
of one argument!} An application of @add@ has the form @add @$e_1\
e_2$, and is equivalent to @(add @$e_1$@) @$e_2$, since function
application associates to the {\em left}. In other words, applying
@add@ to one argument yields a new function which is then applied to
the second argument. This is consistent with the type of @add@,
@Integer->Integer->Integer@, which is equivalent to
@Integer->(Integer->Integer)@; i.e.~@->@
associates to the {\em right}. Indeed, using @add@, we can define
@inc@ in a different way from earlier:
\bprog
@
inc = add 1
@
\eprog
This is an example of the {\em partial application} of a curried
function, and is one way that a function can be returned as a value.
Let's consider a case in which it's useful to pass a function as an
argument. The well-known @map@ function is a perfect example:
\bprog
@
map :: (a->b) -> [a] -> [b]
map f [] = []
map f (x:xs) = f x : map f xs
@
\eprog
\syn{Function application has higher precedence than any infix
operator, and thus the right-hand side of the second equation parses
as @(f x) : (map f xs)@.}\ \ \ The @map@ function is polymorphic and
its type indicates clearly that its first argument is a function; note
also that the two @a@'s must be instantiated with the same type
(likewise for the @b@'s). As an example of the use of @map@, we can
increment the elements in a list:
\[
@map (add 1) [1,2,3]@\ \ \ \ \red\ \ \ \ @[2,3,4]@
\]
These examples demonstrate the first-class nature of functions, which
when used in this way are usually called {\em higher-order} functions.
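The definitions above can be collected into a small runnable sketch. The identifier @myMap@ is a rename of @map@ chosen only to avoid clashing with the Prelude's version; the definitions themselves mirror those given above.

```haskell
-- Curried addition, as defined above.
add :: Integer -> Integer -> Integer
add x y = x + y

-- Partial application: supplying only the first argument yields a function.
inc :: Integer -> Integer
inc = add 1

-- The map function from above, renamed to avoid the Prelude's map.
myMap :: (a -> b) -> [a] -> [b]
myMap f []     = []
myMap f (x:xs) = f x : myMap f xs

main :: IO ()
main = do
  print (inc 4)                  -- 5
  print (myMap (add 1) [1,2,3])  -- [2,3,4]
```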
\subsection{Lambda Abstractions}
\label{tut-lambda}
Instead of using equations to define functions, we can also define
them ``anonymously'' via a {\em lambda abstraction}. For example, a
function equivalent to @inc@ could be written as @\x -> x+1@.
Similarly, the function @add@ is equivalent to @\x -> \y -> x+y@.
Nested lambda abstractions such as this may be written using the
equivalent shorthand notation @\x y -> x+y@. In fact, the equations:
\bprog
@
inc x = x+1
add x y = x+y
@
\eprog
are really shorthand for:
\bprog
@
inc = \x -> x+1
add = \x y -> x+y
@
\eprog
We will have more to say about such equivalences later.
In general, given that @x@ has type $t_1$ and @exp@ has type $t_2$,
then @\x->exp@ has type $t_1$@->@$t_2$.
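The equivalence between equations and lambda abstractions can be checked directly; the names @incEq@, @incLam@, and @addLam@ below are hypothetical, introduced only for this sketch.

```haskell
-- The equation form and the lambda form denote the same function.
incEq :: Integer -> Integer
incEq x = x + 1

incLam :: Integer -> Integer
incLam = \x -> x + 1

-- Nested lambdas written with the shorthand notation.
addLam :: Integer -> Integer -> Integer
addLam = \x y -> x + y

main :: IO ()
main = do
  print (incEq 4 == incLam 4)  -- True
  print (addLam 2 3)           -- 5
```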
\subsection{Infix Operators}
\label{tut-infix-ops}
Infix operators are really just functions, and can also be defined
using equations. For example, here is a definition of a
list concatenation operator:
\bprog
@
(++) :: [a] -> [a] -> [a]
[] ++ ys = ys
(x:xs) ++ ys = x : (xs++ys)
@
\eprog
\syn{Lexically, infix operators consist entirely of ``symbols,'' as
opposed to normal identifiers which are alphanumeric (\see{ids}).
Haskell has no prefix operators, with the exception of minus (@-@),
which is both infix and prefix.}
As another example, an important infix operator on functions is that
for {\em function composition}:
\bprog
@
(.) :: (b->c) -> (a->b) -> (a->c)
f . g = \ x -> f (g x)
@
\eprog
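The composition operator defined above behaves exactly like the Prelude's @(.)@; in this sketch it is renamed @compose@ so the two can be compared side by side.

```haskell
-- Function composition, as defined above (renamed to avoid the Prelude's (.)).
compose :: (b -> c) -> (a -> b) -> (a -> c)
compose f g = \x -> f (g x)

main :: IO ()
main = do
  -- compose (+1) (*2) first doubles its argument, then increments: 11
  print (compose (+1) (*2) 5)
  -- The Prelude's (.) gives the same result.
  print (((+1) . (*2)) 5)
```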
\subsubsection{Sections}
\label{tut-sections}
Since infix operators are really just functions, it makes sense to be
able to partially apply them as well. In Haskell the partial
application of an infix operator is called a {\em section}. For
example:
\[\ba{ccc}
@(x+)@\ \ \ \ &\equiv&\ \ \ \ @\y -> x+y@ \\
@(+y)@\ \ \ \ &\equiv&\ \ \ \ @\x -> x+y@ \\
@(+)@ \ \ \ \ &\equiv&\ \ \ \ @\x y -> x+y@
\ea\]
\syn{The parentheses are mandatory.}
The last form of section given above essentially coerces an infix
operator into an equivalent functional value, and is handy when
passing an infix operator as an argument to a function, as in
@map (+) [1,2,3]@ (the reader should verify that this returns a list
of functions!). It is also necessary when giving a function type
signature, as in the examples of @(++)@ and @(.)@ given earlier.
We can now see that @add@ defined earlier is just @(+)@, and @inc@ is
just @(+1)@! Indeed, these definitions would do just fine:
\bprog
@
inc = (+ 1)
add = (+)
@
\eprog
We can coerce an infix operator into a functional value, but can we go
the other way? Yes---we simply enclose an identifier bound to a
functional value in backquotes. For example, @x `add` y@ is the same
as @add x y@.\footnote{Note carefully that @add@ is enclosed in {\em
backquotes}, not {\em apostrophes} as used in the syntax of
characters; i.e. @'f'@ is a character, whereas @`f`@ is an infix
operator. Fortunately, most ASCII terminals distinguish these much
better than the font used in this manuscript.} Some functions read
better this way. An example is the predefined list membership
predicate @elem@; the expression @x `elem` xs@ can be read intuitively
as ``@x@ is an element of @xs@.''
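Sections and backquotes can be exercised together in one sketch; the names @halve@ and @subFromTen@ are hypothetical, chosen for this example.

```haskell
-- Sections: partially applied infix operators.
halve :: Double -> Double
halve = (/ 2)          -- the same as \x -> x / 2

subFromTen :: Integer -> Integer
subFromTen = (10 -)    -- the same as \y -> 10 - y

add :: Integer -> Integer -> Integer
add x y = x + y

main :: IO ()
main = do
  print (halve 9)           -- 4.5
  print (subFromTen 3)      -- 7
  -- Backquotes turn an ordinary identifier into an infix operator.
  print (3 `add` 4)         -- 7
  print (3 `elem` [1,2,3])  -- True
```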
\syn{There are some special rules regarding sections involving
the prefix/infix operator @-@; see (\see{sections},\see{operators}).}
At this point, the reader may be confused at having so many ways to
define a function! The decision to provide these mechanisms partly
reflects historical conventions, and partly reflects the desire for
consistency (for example, in the treatment of infix vs. regular
functions).
% That's all right---the designers of Haskell are too.
\subsubsection{Fixity Declarations}
A {\em fixity declaration} can be given for any infix operator or
constructor (including those made from ordinary identifiers, such as
@`elem`@).
%\footnote{Fixity declarations must only appear at the very
%beginning of a Haskell {\em module}, as will be described in Section
%\ref{tut-modules}.}
This declaration specifies a precedence level from
0 to 9 (with 9 being the strongest; normal application is assumed to
have a precedence level of 10), and left-, right-, or
non-associativity. For example, the fixity declarations for @++@ and
@.@ are:
\bprog
@
infixr 5 ++
infixr 9 .
@
\eprog
Both of these specify right-associativity, the first with a precedence
level of 5, the other 9. Left associativity is specified via
@infixl@, and non-associativity by @infix@. Also, the fixity of more
than one operator may be specified with the same fixity declaration.
If no fixity declaration is given for a particular operator, it
defaults to @infixl 9@. (See \see{fixity} for a detailed definition
of the associativity rules.)
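The effect of associativity can be observed with a hypothetical operator @<->@ given the same fixity as @-@:

```haskell
-- A hypothetical left-associative operator, with the fixity of (-).
infixl 6 <->
(<->) :: Integer -> Integer -> Integer
(<->) = (-)

main :: IO ()
main =
  -- Left-associativity: 10 <-> 3 <-> 2 parses as (10 <-> 3) <-> 2.
  print (10 <-> 3 <-> 2)   -- 5
```

Had @<->@ been declared @infixr 6@, the same expression would parse as @10 <-> (3 <-> 2)@ and yield @9@ instead.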
\subsection{Functions are Non-strict}
\label{tut-non-strict}
Suppose @bot@ is defined by:
\bprog
@
bot = bot
@
\eprog
In other words, @bot@ is a non-terminating expression. Abstractly, we
denote the {\em value} of a non-terminating expression as $\bot$ (read
``bottom''). Expressions that result in some kind of a run-time
error, such as @1/0@, also have this value. Such an error is not
recoverable: programs will not continue past these errors. Errors
encountered by the I/O system, such as an end-of-file error, are
recoverable and are handled in a different manner. (Such an I/O error
is really not an error at all but rather an exception. Much more will
be said about exceptions in Section \ref{tut-io}.)
A function @f@ is said to be {\em strict} if, when applied to a
nonterminating expression, it also fails to terminate. In other
words, @f@ is strict iff the value of @f bot@ is $\bot$. For most
programming languages, {\em all} functions are strict. But this is
not so in Haskell. As a simple example, consider @const1@, the
constant 1 function, defined by:
\bprog
@
const1 x = 1
@
\eprog
The value of @const1 bot@ in Haskell is @1@. Operationally speaking,
since @const1@ does not ``need'' the value of its argument, it never
attempts to evaluate it, and thus never gets caught in a
nonterminating computation. For this reason, non-strict functions are
also called ``lazy functions'', and are said to evaluate their
arguments ``lazily'', or ``by need''.
Since error and nonterminating values are semantically the same in
Haskell, the above argument also holds for errors. For example,
@const1 (1/0)@ also evaluates properly to @1@.
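This behavior can be demonstrated directly. Note that the sketch below uses @error "boom"@ in place of @1/0@ for the error case; semantically both denote $\bot$.

```haskell
-- A non-terminating value, as defined above.
bot :: Integer
bot = bot

-- The constant 1 function: non-strict in its argument.
const1 :: a -> Integer
const1 x = 1

main :: IO ()
main = do
  print (const1 bot)             -- 1: the argument is never evaluated
  print (const1 (error "boom"))  -- 1: likewise for an error value
```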
Non-strict functions are extremely useful in a variety of contexts.
The main advantage is that they free the programmer from many concerns
about evaluation order. Computationally expensive values may be
passed as arguments to functions without fear of them being computed
if they are not needed. An important example of this is a possibly
{\em infinite} data structure.
Another way of explaining non-strict functions is that Haskell
computes using ``definitions'' rather than the ``assignments'' found in
traditional languages. Read a declaration such as
\bprog
@
v = 1/0
@
\eprog
as `define @v@ as @1/0@' instead of `compute @1/0@ and store the
result in @v@'. Only if the value (definition) of @v@ is needed
will the division by zero error occur. By itself, this
declaration does not imply any computation. Programming using
assignments requires careful attention to the ordering of the
assignments: the meaning of the program depends on the order in which
the assignments are executed. Definitions, in contrast, are much
simpler: they can be presented in any order without affecting the
meaning of the program.
\subsection{``Infinite'' Data Structures}
\label{tut-infinite}
One advantage of the non-strict nature of Haskell is that data
constructors are non-strict, too. This should not be surprising,
since constructors are really just a special kind of function (the
distinguishing feature being that they can be used in pattern
matching). For example, the constructor for lists, @(:)@, is
non-strict.
Non-strict constructors permit the definition of (conceptually) {\em
infinite} data structures. Here is an infinite list of ones:
\bprog
@
ones = 1 : ones
@
\eprog
Perhaps more interesting is the function @numsFrom@:
\bprog
@
numsFrom n = n : numsFrom (n+1)
@
\eprog
Thus @numsFrom n@ is the infinite list of successive integers
beginning with @n@. From it we can construct an infinite list of
squares:
\bprog
@
squares = map (^2) (numsFrom 0)
@
\eprog
(Note the use of a section; @^@ is the infix exponentiation operator.)
Of course, eventually we expect to extract some finite portion of the
list for actual computation, and there are lots of predefined
functions in Haskell that do this sort of thing: @take@, @takeWhile@,
@filter@, and others. The definition of Haskell includes a large set
of built-in functions and types---this is called the ``Standard
Prelude''. The complete Standard Prelude is included in Appendix A of
the Haskell report; see the portion named @PreludeList@ for many
useful functions involving lists. For example, @take@ extracts the first @n@
elements of a list:
\[ @take 5 squares@\ \ \ \ \red\ \ \ \ @[0,1,4,9,16]@ \]
The definition of @ones@ above is an example of a {\em circular list}.
In most circumstances laziness has an important impact on efficiency,
since an implementation can be expected to implement the list as a
true circular structure, thus saving space.
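The infinite lists above are directly runnable; laziness ensures that only the demanded prefix is ever computed.

```haskell
-- An infinite (circular) list of ones.
ones :: [Integer]
ones = 1 : ones

-- The infinite list of successive integers beginning with n.
numsFrom :: Integer -> [Integer]
numsFrom n = n : numsFrom (n+1)

-- An infinite list of squares, built with a section.
squares :: [Integer]
squares = map (^2) (numsFrom 0)

main :: IO ()
main = do
  print (take 5 squares)           -- [0,1,4,9,16]
  print (take 3 ones)              -- [1,1,1]
  print (takeWhile (< 20) squares) -- [0,1,4,9,16]
```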
For another example of the use of circularity, the Fibonacci sequence
can be computed efficiently as the following infinite sequence:
\bprog
@
fib = 1 : 1 : [ a+b | (a,b) <- zip fib (tail fib) ]
@
\eprog
where @zip@ is a Standard Prelude function that returns the pairwise
interleaving of its two list arguments:
\bprog
@
zip (x:xs) (y:ys) = (x,y) : zip xs ys
zip xs ys = []
@
\eprog
Note how @fib@, an infinite list, is defined in terms of itself, as if
it were ``chasing its tail.'' Indeed, we can draw a picture of this
computation as shown in Figure \ref{tut-fib-fig}.
%**<div align=center><img src="fig1.gif" alt="Fib Example">
%**<h4>Figure 1</h4> </div>
For another application of infinite lists, see Section \ref{tut-lazy-patterns}.
%*ignore
\begin{figure}
\begin{center}
\mbox{
{\epsfxsize=2.5in \epsfbox{fig1.eps}}}
\end{center}
\caption{Circular Fibonacci Sequence}
\label{tut-fib-fig}
\end{figure}
%*endignore
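The circular Fibonacci definition can be tested by taking a finite prefix. In this sketch @zip@ is renamed @zipPair@ to avoid clashing with the Prelude's version; its definition is as given above.

```haskell
-- The pairwise interleaving of two lists, as defined above.
zipPair :: [a] -> [b] -> [(a,b)]
zipPair (x:xs) (y:ys) = (x,y) : zipPair xs ys
zipPair _      _      = []

-- The Fibonacci sequence, defined in terms of itself.
fib :: [Integer]
fib = 1 : 1 : [ a+b | (a,b) <- zipPair fib (tail fib) ]

main :: IO ()
main = print (take 8 fib)   -- [1,1,2,3,5,8,13,21]
```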
\subsection{The Error Function}
Haskell has a built-in function called @error@ whose type is
@String->a@. This is a somewhat odd function: From its type it looks
as if it is returning a value of a polymorphic type about which it
knows nothing, since it never receives a value of that type as an
argument!
In fact, there {\em is} one value ``shared'' by all types: $\bot$.
Indeed, semantically that is exactly what value is always returned by
@error@ (recall that all errors have value $\bot$). However, we can
expect that a reasonable implementation will print the string argument
to @error@ for diagnostic purposes. Thus this function is useful when
we wish to terminate a program when something has ``gone wrong.'' For
example, the actual definition of @head@ taken from the Standard
Prelude is:
\bprog
@
head (x:xs) = x
head [] = error "head{PreludeList}: head []"
@
\eprog
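A sketch of this behavior, with @head@ renamed @myHead@ to avoid the Prelude's version (and a simplified error message); in GHC the resulting exception can even be caught in @IO@ using @Control.Exception@:

```haskell
import Control.Exception (SomeException, evaluate, try)

-- head as defined above, renamed, with a simplified diagnostic string.
myHead :: [a] -> a
myHead (x:_) = x
myHead []    = error "myHead: empty list"

main :: IO ()
main = do
  print (myHead [1,2,3 :: Integer])   -- 1
  -- Forcing myHead [] raises the error; try catches it as an exception.
  r <- try (evaluate (myHead [])) :: IO (Either SomeException Integer)
  case r of
    Left _  -> putStrLn "caught: myHead of empty list"
    Right v -> print v
```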
%**~footer