<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN"
"http://www.w3.org/TR/REC-html40/loose.dtd">
<HTML>
<HEAD>
<META http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
<META name="GENERATOR" content="hevea 1.06">
<TITLE>
Introduction
</TITLE>
</HEAD>
<BODY TEXT=black BGCOLOR=white>
<A HREF="index.html"><IMG SRC ="contents_motif.gif" ALT="Up"></A>
<A HREF="tutorial002.html"><IMG SRC ="next_motif.gif" ALT="Next"></A>
<HR>
<TABLE CELLPADDING=0 CELLSPACING=0 WIDTH="100%">
<TR><TD BGCOLOR="#2de52d"><DIV ALIGN=center><TABLE>
<TR><TD><A NAME="htoc1"><B><FONT SIZE=6>Chapter 1</FONT></B></A></TD>
<TD WIDTH="100%" ALIGN=center><B><FONT SIZE=6>Introduction</FONT></B></TD>
</TR></TABLE></DIV></TD>
</TR></TABLE>
<A NAME="c:tutorial"></A>
<BR>
Camlp4 is a preprocessor for OCaml. As a preprocessor, it lets you write
<EM>syntax extensions</EM> for your OCaml programs. But Camlp4 also provides
some other features:
<UL><LI>
A system of <EM>grammars</EM>
<LI>A kind of macros called <EM>quotations</EM>
<LI>A revised syntax for OCaml
<LI>A pretty printing system of OCaml programs
<LI>and others: <EM>extensible functions</EM>, <EM>functional streams</EM>, ...
</UL>
Camlp4 is syntax, syntax, syntax. It uses its own syntax systems to do
its own syntax extensions: it is highly bootstrapped. Camlp4 stops at
the syntax level: it knows nothing about semantics, typing, or
code generation (for Camlp4, a type definition is just a syntactic construct
that starts with ``type'').<BR>
<BR>
The ``p4'' in the name ``camlp4'' stands for the four ``p''s of
``Pre-Processor-Pretty-Printer''.<BR>
<BR>
<A NAME="toc1"></A><TABLE CELLPADDING=0 CELLSPACING=0 WIDTH="100%">
<TR><TD BGCOLOR="#66ff66"><DIV ALIGN=center><TABLE>
<TR><TD><A NAME="htoc2"><B><FONT SIZE=5>1.1</FONT></B></A></TD>
<TD WIDTH="100%" ALIGN=center><B><FONT SIZE=5>Extending the syntax of OCaml</FONT></B></TD>
</TR></TABLE></DIV></TD>
</TR></TABLE><BR>
To begin at the beginning, we could try to learn how to make simple
syntax extensions to OCaml. If you know the C language, you have probably
used the <CODE>#define</CODE> construct, which is very easy to use:
<PRE>
#define FOO xyzzy
</PRE>
and all occurrences of <CODE>FOO</CODE> in the rest of the program are
replaced by <CODE>xyzzy</CODE>.<BR>
<BR>
In Camlp4, it is not so simple. A syntax extension is not just text
replacement: it extends an entry of the grammar of the
language, and you need to create syntax trees.<BR>
<BR>
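As a rough illustration of what ``creating syntax trees'' means, here is
a minimal sketch in plain OCaml. The type <CODE>expr</CODE> and the function
<CODE>repeat_until</CODE> are hypothetical names chosen for this example; they
are not the tree types that Camlp4 actually uses. The sketch builds the
tree for a Pascal-like <CODE>repeat..until</CODE> loop out of constructs that
already exist, instead of substituting text:
<PRE>
(* Toy syntax tree: an assumption made for illustration only,
   not Camlp4's own representation of OCaml programs. *)
type expr =
  | Var of string            (* an identifier *)
  | Not of expr              (* boolean negation *)
  | Seq of expr * expr       (* e1; e2 *)
  | While of expr * expr     (* while cond do body done *)

(* "repeat body until cond" is equivalent to
   "body; while not cond do body done": the extension has to build
   that tree from the two sub-trees it has parsed. *)
let repeat_until body cond = Seq (body, While (Not cond, body))

let tree = repeat_until (Var "step") (Var "finished")
</PRE>
The value <CODE>tree</CODE> is a structured term, not a string: the body
appears twice in it as a sub-tree, and the condition is wrapped in a
negation node.<BR>
<BR>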
It is therefore necessary (1) to know the grammar system
provided by Camlp4 and (2) to know how to create syntax trees. This is what
this tutorial covers. Once these two points have been described, we
have the tools to write syntax extensions of the language.<BR>
<BR>
If you are impatient, and you want to create your syntax extension in
the next quarter of an hour without learning all that
stuff, you may consider taking the text of an existing syntax
extension and adapting it to your own needs. A syntax extension is not
necessarily a long program (for example, adding the
<CODE>repeat..until</CODE> construction of Pascal takes 6 lines) and you can
guess ``how it works'' and ask the wizards...<BR>
<BR>
Examples are given in chapter <A HREF="tutorial007.html#c:tutext">7</A>.<BR>
<BR>
However, if you read this manual, you may be interested in learning
the original system of grammars that Camlp4 provides. It can be used
for purposes other than extending the OCaml language: for your own
grammars. This system of grammars is an alternative to yacc: it takes a
different approach, but you can describe your language in a
similar way.<BR>
<BR>
First, the practical matters (what do I type to experiment?).<BR>
<BR>
<A NAME="toc2"></A><TABLE CELLPADDING=0 CELLSPACING=0 WIDTH="100%">
<TR><TD BGCOLOR="#66ff66"><DIV ALIGN=center><TABLE>
<TR><TD><A NAME="htoc3"><B><FONT SIZE=5>1.2</FONT></B></A></TD>
<TD WIDTH="100%" ALIGN=center><B><FONT SIZE=5>Using Camlp4 as a command and in the toplevel</FONT></B></TD>
</TR></TABLE></DIV></TD>
</TR></TABLE><BR>
You must first know that camlp4 is a command. This chapter does not
explain all the details of the command and its options: they are covered
later (chapter <A HREF="tutorial008.html#c:tutcomm">8</A>; you can also read the man page by typing
<CODE>"man camlp4"</CODE> in your shell).<BR>
<BR>
For the moment, here is a magic incantation to compile a file named
<CODE>foo.ml</CODE>:
<PRE>
ocamlc -pp "camlp4o pa_extend.cmo" -I +camlp4 -c foo.ml
</PRE>
This command just compiles <CODE>foo.ml</CODE> as a normal OCaml file, except
that the parsing is done by camlp4. The first examples in this
documentation (grammars in Camlp4) can be compiled using this
command. The other examples are given together with the command
needed to compile them.<BR>
<BR>
Another, recommended, way is to use the OCaml toplevel. In the
toplevel, type:
<PRE>
#load "camlp4o.cma";;
#load "pa_extend.cmo";;
</PRE>
You can type the examples of this documentation in the toplevel. You
can also type them in files and use the directive <CODE>#use</CODE> to
include them.<BR>
<BR>
All the examples in this documentation are written in the normal
syntax of OCaml. If you know and prefer the revised syntax
provided by Camlp4, change <CODE>camlp4o</CODE> into <CODE>camlp4r</CODE> in the
<CODE>ocamlc</CODE> command, or load <CODE>"camlp4r.cma"</CODE> instead of
<CODE>"camlp4o.cma"</CODE> in the toplevel.<BR>
<BR>
<A NAME="toc3"></A><TABLE CELLPADDING=0 CELLSPACING=0 WIDTH="100%">
<TR><TD BGCOLOR="#66ff66"><DIV ALIGN=center><TABLE>
<TR><TD><A NAME="htoc4"><B><FONT SIZE=5>1.3</FONT></B></A></TD>
<TD WIDTH="100%" ALIGN=center><B><FONT SIZE=5>Linking applications using Camlp4 libraries</FONT></B></TD>
</TR></TABLE></DIV></TD>
</TR></TABLE><BR>
Many examples in this tutorial use some specific Camlp4 libraries. In
the toplevel, you don't need to load them because they are in the file
<CODE>camlp4o.cma</CODE>.<BR>
<BR>
To link a standalone application, you need to add the library
<CODE>gramlib.cma</CODE> from the Camlp4 library directory. The command is:
<PRE>
ocamlc -I +camlp4 gramlib.cma &lt;the_files_you_link&gt;
</PRE>
<A NAME="toc4"></A><TABLE CELLPADDING=0 CELLSPACING=0 WIDTH="100%">
<TR><TD BGCOLOR="#66ff66"><DIV ALIGN=center><TABLE>
<TR><TD><A NAME="htoc5"><B><FONT SIZE=5>1.4</FONT></B></A></TD>
<TD WIDTH="100%" ALIGN=center><B><FONT SIZE=5>Differences in parsing behavior</FONT></B></TD>
</TR></TABLE></DIV></TD>
</TR></TABLE>
<BR>
Even if you use the normal syntax, there are some small differences
in parsing behavior between the normal <TT>ocamlc</TT> parser (bottom-up,
LALR parsing) and the <TT>camlp4</TT> parser (top-down, recursive
descent parsing). These differences show up notably on
erroneous input. As a trivial example, suppose that you wanted to type
<PRE>
(* correct intended input *)
type t = Buf of Buffer.t
| Str of string
</PRE>
Instead of typing the above example, suppose you forgot the second occurrence
of the <TT><B>of</B></TT> keyword, and typed
<PRE>
(* file wrongsyntax.ml : wrong input - missing keyword *)
type t = Buf of Buffer.t
| Str (*missing "of"*) string
</PRE>
The <TT>ocamlc</TT> compiler<SUP><A NAME="text1" HREF="#note1"><FONT SIZE=2>1</FONT></A></SUP>
(invoked as <CODE><TT>ocamlc -c wrongsyntax.ml</TT></CODE>) finds a syntax
error on the <TT>string</TT> word; it parses the whole file as a single
type declaration and finds a syntax error inside it.<BR>
<BR>
The <TT>camlp4</TT> parser (with ordinary syntax), invoked as <CODE><TT>ocamlc -c -pp camlp4o wrongsyntax.ml</TT></CODE>, does not find any (shallow)
syntactic error, but parses the above input as two items:
<PRE>
type t = Buf of Buffer.t
| Str
</PRE>
which is a correct type declaration (different from the one intended
by the author), followed by a simple expression
<PRE>
string
</PRE>
which is understood as <CODE><TT>let _ = string</TT></CODE>, and produces the
error message <CODE>Unbound value string</CODE>.<BR>
<BR>
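The following toy recursive descent parser, written in plain OCaml,
gives an idea of why a top-down parser can accept this input. It is only
a sketch under simplifying assumptions: tokens are plain strings, and the
type <CODE>item</CODE> and the functions <CODE>parse_constructor</CODE>,
<CODE>parse_constructors</CODE>, <CODE>parse_item</CODE> and <CODE>parse_items</CODE>
are hypothetical names invented for this example. It does not show how
Camlp4 itself is implemented.
<PRE>
(* Toy model of a source file: a list of items. *)
type item =
  | Type_decl of string * (string * string option) list
  | Expr of string

(* constructor ::= NAME [ "of" ARG ]
   The argument is optional, so a missing "of" simply ends the
   constructor; no error is reported here. *)
let parse_constructor = function
  | name :: "of" :: arg :: rest -> (name, Some arg), rest
  | name :: rest -> (name, None), rest
  | [] -> failwith "constructor expected"

(* constructors ::= constructor { "|" constructor } *)
let rec parse_constructors tokens =
  let c, rest = parse_constructor tokens in
  match rest with
  | "|" :: rest' ->
      let cs, rest'' = parse_constructors rest' in
      c :: cs, rest''
  | _ -> [c], rest

(* item ::= "type" NAME "=" constructors | EXPR
   Anything that does not start with "type" is taken as an expression. *)
let parse_item = function
  | "type" :: name :: "=" :: rest ->
      let cs, rest' = parse_constructors rest in
      Type_decl (name, cs), rest'
  | e :: rest -> Expr e, rest
  | [] -> failwith "item expected"

(* Parse items one after the other until the input is exhausted. *)
let rec parse_items = function
  | [] -> []
  | tokens ->
      let item, rest = parse_item tokens in
      item :: parse_items rest

(* The erroneous input from wrongsyntax.ml, already split into tokens. *)
let items =
  parse_items
    ["type"; "t"; "="; "Buf"; "of"; "Buffer.t"; "|"; "Str"; "string"]
</PRE>
Here <CODE>items</CODE> contains two elements: a type declaration for
<CODE>t</CODE> in which <CODE>Str</CODE> has no argument, followed by the
expression <CODE>string</CODE>. Because the argument is optional in the rule
for constructors, the parser has no reason to report an error at that
point; the leftover token starts a new item.<BR>
<BR>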
<HR WIDTH="50%" SIZE=1><DL><DT><A NAME="note1" HREF="#text1"><FONT SIZE=5>1</FONT></A><DD>The interactive toplevel <TT>ocaml</TT> has the same behavior, unless you load a different parser.
</DL>
<I><FONT COLOR=maroon>
<br>
For remarks about Camlp4, write to:
<img src="http://cristal.inria.fr/~ddr/images/email.jpg" alt=email align=top>
</FONT></I><HR>
<A HREF="index.html"><IMG SRC ="contents_motif.gif" ALT="Up"></A>
<A HREF="tutorial002.html"><IMG SRC ="next_motif.gif" ALT="Next"></A>
</BODY>
</HTML>