<!DOCTYPE html>
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<meta name="generator" content="hevea 2.32">
<meta name="viewport" content="width=device-width, initial-scale=1.0, maximum-scale=1">
<link rel="stylesheet" type="text/css" href="manual.css">
<title>7.1 Lexical conventions</title>
</head>
<body>
<a href="language.html"><img src="contents_motif.svg" alt="Up"></a>
<a href="values.html"><img src="next_motif.svg" alt="Next"></a>
<hr>
<h2 class="section" id="s:lexical-conventions"><a class="section-anchor" href="#s:lexical-conventions" aria-hidden="true"></a>7.1 Lexical conventions</h2>
<h4 class="subsubsection" id="sss:lex:blanks"><a class="section-anchor" href="#sss:lex:blanks" aria-hidden="true"></a>Blanks</h4>
<p>The following characters are considered blanks: space,
horizontal tabulation, carriage return, line feed and form feed. Blanks are
ignored, but they separate adjacent identifiers, literals and
keywords that would otherwise be read as a single identifier,
literal or keyword.</p><h4 class="subsubsection" id="sss:lex:comments"><a class="section-anchor" href="#sss:lex:comments" aria-hidden="true"></a>Comments</h4>
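<p>As a minimal sketch of the comment rules in this subsection (the names <span class="c003">three</span> and <span class="c003">not_a_comment</span> are illustrative and do not come from the manual):</p>

```ocaml
(* An outer comment (* with a nested comment inside *) that ends here. *)

(* A comment acts as a blank, so it can sit between two tokens. *)
let three = 1(* blank *)+ 2

(* Inside a string literal the delimiters are ordinary characters. *)
let not_a_comment = "(* still part of the string *)"

let () =
  assert (three = 3);
  assert (not_a_comment.[0] = '(')
```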
<p>Comments are introduced by the two characters <span class="c004">(*</span>, with no
intervening blanks, and terminated by the characters <span class="c004">*)</span>, with
no intervening blanks. Comments are treated as blank characters.
Comment delimiters are not recognized inside string or character literals. Nested
comments are handled correctly.</p><h4 class="subsubsection" id="sss:lex:identifiers"><a class="section-anchor" href="#sss:lex:identifiers" aria-hidden="true"></a>Identifiers</h4>
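<p>The identifier classes of this subsection can be illustrated with a short sketch. All names below are hypothetical and chosen only to exercise the grammar:</p>

```ocaml
(* Lowercase identifiers start with a lowercase letter or an underscore. *)
let x' = 1                 (* a single quote is allowed after the first character *)
let _unused2 = 2           (* the underscore counts as a lowercase letter *)
let snake_case_name = x' + 1

(* Capitalized identifiers name, among other things, constructors and modules. *)
type colour = Red | Dark_blue
module Point2D = struct
  let origin = (0, 0)
end

let () =
  assert (snake_case_name = 2);
  assert (Point2D.origin = (0, 0));
  assert (Red <> Dark_blue)
```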
<div class="syntax"><table class="display dcenter"><tr class="c019"><td class="dcell"><table class="c001 cellpading0"><tr><td class="c018">
<a class="syntax" id="ident"><span class="c010">ident</span></a></td><td class="c015">::=</td><td class="c017"> ( <a class="syntax" href="#letter"><span class="c010">letter</span></a> ∣ <span class="c004">_</span> ) { <a class="syntax" href="#letter"><span class="c010">letter</span></a> ∣ <span class="c004">0</span> … <span class="c004">9</span> ∣ <span class="c004">_</span> ∣ <span class="c004">'</span> } </td></tr>
<tr><td class="c018"> </td></tr>
<tr><td class="c018">
<a class="syntax" id="capitalized-ident"><span class="c010">capitalized-ident</span></a></td><td class="c015">::=</td><td class="c017"> (<span class="c004">A</span> … <span class="c004">Z</span>) { <a class="syntax" href="#letter"><span class="c010">letter</span></a> ∣ <span class="c004">0</span> … <span class="c004">9</span> ∣ <span class="c004">_</span> ∣ <span class="c004">'</span> } </td></tr>
<tr><td class="c018"> </td></tr>
<tr><td class="c018">
<a class="syntax" id="lowercase-ident"><span class="c010">lowercase-ident</span></a></td><td class="c015">::=</td><td class="c017">
(<span class="c004">a</span> … <span class="c004">z</span> ∣ <span class="c004">_</span>) { <a class="syntax" href="#letter"><span class="c010">letter</span></a> ∣ <span class="c004">0</span> … <span class="c004">9</span> ∣ <span class="c004">_</span> ∣ <span class="c004">'</span> } </td></tr>
<tr><td class="c018"> </td></tr>
<tr><td class="c018">
<a class="syntax" id="letter"><span class="c010">letter</span></a></td><td class="c015">::=</td><td class="c017"> <span class="c004">A</span> … <span class="c004">Z</span> ∣ <span class="c004">a</span> … <span class="c004">z</span>
</td></tr>
</table></td></tr>
</table></div><p>Identifiers are sequences of letters, digits, <span class="c003">_</span> (the underscore
character), and <span class="c003">'</span> (the single quote), starting with a
letter or an underscore.
Letters include at least the 52 lowercase and uppercase
letters from the ASCII set. The current implementation
also recognizes as letters some characters from the ISO
8859-1 set (characters 192–214 and 216–222 as uppercase letters;
characters 223–246 and 248–255 as lowercase letters). This
feature is deprecated and should be avoided for future compatibility.</p><p>All characters in an identifier are
meaningful. The current implementation accepts identifiers up to
16000000 characters in length.</p><p>In many places, OCaml makes a distinction between capitalized
identifiers and identifiers that begin with a lowercase letter. The
underscore character is considered a lowercase letter for this
purpose.</p><h4 class="subsubsection" id="sss:integer-literals"><a class="section-anchor" href="#sss:integer-literals" aria-hidden="true"></a>Integer literals</h4>
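<p>A brief sketch of the integer literal forms covered in this subsection. The variable names are illustrative; the assertions only restate the radix and suffix rules:</p>

```ocaml
let million = 1_000_000          (* underscores are ignored *)
let hex = 0xFF                   (* radix 16 *)
let octal = 0o17                 (* radix 8 *)
let binary = 0b1010              (* radix 2 *)
let negative = -0x10             (* optional minus sign *)

let i32 = 42l                    (* suffix l: int32 *)
let i64 = 0x7FFF_FFFFL           (* suffix L: int64 *)
let nat = 17n                    (* suffix n: nativeint *)

let () =
  assert (million = 1000000);
  assert (hex = 255 && octal = 15 && binary = 10);
  assert (negative = -16);
  assert (i32 = Int32.of_int 42);
  assert (i64 = 2147483647L);
  assert (nat = Nativeint.of_int 17)
```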
<div class="syntax"><table class="display dcenter"><tr class="c019"><td class="dcell"><table class="c001 cellpading0"><tr><td class="c018">
<a class="syntax" id="integer-literal"><span class="c010">integer-literal</span></a></td><td class="c015">::=</td><td class="c017">
[<span class="c004">-</span>] (<span class="c004">0</span>…<span class="c004">9</span>) { <span class="c004">0</span>…<span class="c004">9</span> ∣ <span class="c004">_</span> }
</td></tr>
<tr><td class="c018"> </td><td class="c015">∣</td><td class="c017"> [<span class="c004">-</span>] (<span class="c004">0x</span>∣ <span class="c004">0X</span>) (<span class="c004">0</span>…<span class="c004">9</span>∣ <span class="c004">A</span>…<span class="c004">F</span>∣ <span class="c004">a</span>…<span class="c004">f</span>)
{ <span class="c004">0</span>…<span class="c004">9</span>∣ <span class="c004">A</span>…<span class="c004">F</span>∣ <span class="c004">a</span>…<span class="c004">f</span>∣ <span class="c004">_</span> }
</td></tr>
<tr><td class="c018"> </td><td class="c015">∣</td><td class="c017"> [<span class="c004">-</span>] (<span class="c004">0o</span>∣ <span class="c004">0O</span>) (<span class="c004">0</span>…<span class="c004">7</span>) { <span class="c004">0</span>…<span class="c004">7</span>∣ <span class="c004">_</span> }
</td></tr>
<tr><td class="c018"> </td><td class="c015">∣</td><td class="c017"> [<span class="c004">-</span>] (<span class="c004">0b</span>∣ <span class="c004">0B</span>) (<span class="c004">0</span>…<span class="c004">1</span>) { <span class="c004">0</span>…<span class="c004">1</span>∣ <span class="c004">_</span> }
</td></tr>
<tr><td class="c018"> </td></tr>
<tr><td class="c018">
<a class="syntax" id="int32-literal"><span class="c010">int32-literal</span></a></td><td class="c015">::=</td><td class="c017"> <a class="syntax" href="#integer-literal"><span class="c010">integer-literal</span></a> <span class="c004">l</span>
</td></tr>
<tr><td class="c018"> </td></tr>
<tr><td class="c018">
<a class="syntax" id="int64-literal"><span class="c010">int64-literal</span></a></td><td class="c015">::=</td><td class="c017"> <a class="syntax" href="#integer-literal"><span class="c010">integer-literal</span></a> <span class="c004">L</span>
</td></tr>
<tr><td class="c018"> </td></tr>
<tr><td class="c018">
<a class="syntax" id="nativeint-literal"><span class="c010">nativeint-literal</span></a></td><td class="c015">::=</td><td class="c017"> <a class="syntax" href="#integer-literal"><span class="c010">integer-literal</span></a> <span class="c004">n</span>
</td></tr>
</table></td></tr>
</table></div><p>An integer literal is a sequence of one or more digits, optionally
preceded by a minus sign. By default, integer literals are in decimal
(radix 10). The following prefixes select a different radix:
</p><div class="tableau">
<div class="center"><table class="c000 cellpadding1" border=1><tr><td class="c014"><span class="c013">Prefix</span></td><td class="c014"><span class="c013">Radix</span> </td></tr>
<tr><td class="c016">
<span class="c003">0x</span>, <span class="c003">0X</span></td><td class="c016">hexadecimal (radix 16) </td></tr>
<tr><td class="c016"><span class="c003">0o</span>, <span class="c003">0O</span></td><td class="c016">octal (radix 8) </td></tr>
<tr><td class="c016"><span class="c003">0b</span>, <span class="c003">0B</span></td><td class="c016">binary (radix 2) </td></tr>
</table></div></div><p>
(The initial <span class="c004">0</span> is the digit zero; the <span class="c004">O</span> for octal is the letter O.)
An integer literal can be followed by one of the letters <span class="c003">l</span>, <span class="c003">L</span> or <span class="c003">n</span>
to indicate that this integer has type <span class="c003">int32</span>, <span class="c003">int64</span> or <span class="c003">nativeint</span>
respectively, instead of the default type <span class="c003">int</span> for integer literals.
The interpretation of integer literals that fall outside the range of
representable integer values is undefined.</p><p>For convenience and readability, underscore characters (<span class="c004">_</span>) are accepted
(and ignored) within integer literals.</p><h4 class="subsubsection" id="sss:floating-point-literals"><a class="section-anchor" href="#sss:floating-point-literals" aria-hidden="true"></a>Floating-point literals</h4>
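<p>A small sketch of decimal and hexadecimal floating-point literals as described in this subsection. The names are illustrative, and the values were chosen so that they are exactly representable:</p>

```ocaml
let one = 1.                     (* fractional part with zero digits *)
let thousand = 1e3               (* exponent part only, power of 10 *)
let avogadro = 6.022_140_76e23   (* underscores are ignored *)

let half = 0x1p-1                (* hexadecimal: 1 * 2^-1 *)
let three = 0x1.8p1              (* hexadecimal: (1 + 8/16) * 2^1 *)

let () =
  assert (one = 1.0);
  assert (thousand = 1000.0);
  assert (avogadro > 6e23);
  assert (half = 0.5);
  assert (three = 3.0)
```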
<div class="syntax"><table class="display dcenter"><tr class="c019"><td class="dcell"><table class="c001 cellpading0"><tr><td class="c018">
<a class="syntax" id="float-literal"><span class="c010">float-literal</span></a></td><td class="c015">::=</td><td class="c017">
[<span class="c004">-</span>] (<span class="c004">0</span>…<span class="c004">9</span>) { <span class="c004">0</span>…<span class="c004">9</span>∣ <span class="c004">_</span> } [<span class="c004">.</span> { <span class="c004">0</span>…<span class="c004">9</span>∣ <span class="c004">_</span> }]
[(<span class="c004">e</span>∣ <span class="c004">E</span>) [<span class="c004">+</span>∣ <span class="c004">-</span>] (<span class="c004">0</span>…<span class="c004">9</span>) { <span class="c004">0</span>…<span class="c004">9</span>∣ <span class="c004">_</span> }]
</td></tr>
<tr><td class="c018"> </td><td class="c015">∣</td><td class="c017"> [<span class="c004">-</span>] (<span class="c004">0x</span>∣ <span class="c004">0X</span>)
(<span class="c004">0</span>…<span class="c004">9</span>∣ <span class="c004">A</span>…<span class="c004">F</span>∣ <span class="c004">a</span>…<span class="c004">f</span>)
{ <span class="c004">0</span>…<span class="c004">9</span>∣ <span class="c004">A</span>…<span class="c004">F</span>∣ <span class="c004">a</span>…<span class="c004">f</span>∣ <span class="c004">_</span> }
[<span class="c004">.</span> { <span class="c004">0</span>…<span class="c004">9</span>∣ <span class="c004">A</span>…<span class="c004">F</span>∣ <span class="c004">a</span>…<span class="c004">f</span>∣ <span class="c004">_</span> }]
[(<span class="c004">p</span>∣ <span class="c004">P</span>) [<span class="c004">+</span>∣ <span class="c004">-</span>] (<span class="c004">0</span>…<span class="c004">9</span>) { <span class="c004">0</span>…<span class="c004">9</span>∣ <span class="c004">_</span> }]
</td></tr>
</table></td></tr>
</table></div><p>Floating-point decimal literals consist of an integer part, a
fractional part and
an exponent part. The integer part is a sequence of one or more
digits, optionally preceded by a minus sign. The fractional part is a
decimal point followed by zero or more digits.
The exponent part is the character <span class="c004">e</span> or <span class="c004">E</span> followed by an
optional <span class="c004">+</span> or <span class="c004">-</span> sign, followed by one or more digits. It is
interpreted as a power of 10.
The fractional part or the exponent part can be omitted but not both, to
avoid ambiguity with integer literals.
The interpretation of floating-point literals that fall outside the
range of representable floating-point values is undefined.</p><p>Floating-point hexadecimal literals are denoted with the <span class="c004">0x</span> or <span class="c004">0X</span>
prefix. The syntax is similar to that of floating-point decimal
literals, with the following differences.
The integer part and the fractional part use hexadecimal
digits. The exponent part starts with the character <span class="c004">p</span> or <span class="c004">P</span>.
It is written in decimal and interpreted as a power of 2.</p><p>For convenience and readability, underscore characters (<span class="c004">_</span>) are accepted
(and ignored) within floating-point literals.</p><h4 class="subsubsection" id="sss:character-literals"><a class="section-anchor" href="#sss:character-literals" aria-hidden="true"></a>Character literals</h4>
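<p>The character literal forms of this subsection, sketched with illustrative names. The three numeric escapes all denote the character with code 65:</p>

```ocaml
let plain = 'a'
let newline = '\n'
let single_quote = '\''
let backslash = '\\'

let by_decimal = '\065'          (* three decimal digits *)
let by_hex = '\x41'              (* two hexadecimal digits *)
let by_octal = '\o101'           (* three octal digits *)

let () =
  assert (plain = Char.chr 97);
  assert (Char.code newline = 10);
  assert (Char.code single_quote = 39 && Char.code backslash = 92);
  assert (by_decimal = 'A' && by_hex = 'A' && by_octal = 'A')
```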
<p>
<a id="s:characterliteral"></a></p><div class="syntax"><table class="display dcenter"><tr class="c019"><td class="dcell"><table class="c001 cellpading0"><tr><td class="c018">
<a class="syntax" id="char-literal"><span class="c010">char-literal</span></a></td><td class="c015">::=</td><td class="c017">
<span class="c004">'</span> <span class="c010">regular-char</span> <span class="c004">'</span>
</td></tr>
<tr><td class="c018"> </td><td class="c015">∣</td><td class="c017"> <span class="c004">'</span> <a class="syntax" href="#escape-sequence"><span class="c010">escape-sequence</span></a> <span class="c004">'</span>
</td></tr>
<tr><td class="c018"> </td></tr>
<tr><td class="c018">
<a class="syntax" id="escape-sequence"><span class="c010">escape-sequence</span></a></td><td class="c015">::=</td><td class="c017">
<span class="c004">\</span> ( <span class="c004">\</span> ∣ <span class="c004">"</span> ∣ <span class="c004">'</span> ∣ <span class="c004">n</span> ∣ <span class="c004">t</span> ∣ <span class="c004">b</span> ∣ <span class="c004">r</span> ∣ <span class="c010">space</span> )
</td></tr>
<tr><td class="c018"> </td><td class="c015">∣</td><td class="c017"> <span class="c004">\</span> (<span class="c004">0</span>…<span class="c004">9</span>) (<span class="c004">0</span>…<span class="c004">9</span>) (<span class="c004">0</span>…<span class="c004">9</span>)
</td></tr>
<tr><td class="c018"> </td><td class="c015">∣</td><td class="c017"> <span class="c004">\x</span> (<span class="c004">0</span>…<span class="c004">9</span>∣ <span class="c004">A</span>…<span class="c004">F</span>∣ <span class="c004">a</span>…<span class="c004">f</span>)
(<span class="c004">0</span>…<span class="c004">9</span>∣ <span class="c004">A</span>…<span class="c004">F</span>∣ <span class="c004">a</span>…<span class="c004">f</span>)
</td></tr>
<tr><td class="c018"> </td><td class="c015">∣</td><td class="c017"> <span class="c004">\o</span> (<span class="c004">0</span>…<span class="c004">3</span>) (<span class="c004">0</span>…<span class="c004">7</span>) (<span class="c004">0</span>…<span class="c004">7</span>)
</td></tr>
</table></td></tr>
</table></div><p>Character literals are delimited by <span class="c004">'</span> (single quote) characters.
The two single quotes enclose either a single character other than
<span class="c004">'</span> and <span class="c004">\</span>, or one of the escape sequences below:
</p><div class="tableau">
<div class="center"><table class="c000 cellpadding1" border=1><tr><td class="c014"><span class="c013">Sequence</span></td><td class="c014"><span class="c013">Character denoted</span> </td></tr>
<tr><td class="c016">
<span class="c003">\\</span></td><td class="c016">backslash (<span class="c003">\</span>) </td></tr>
<tr><td class="c016"><span class="c003">\"</span></td><td class="c016">double quote (<span class="c003">"</span>) </td></tr>
<tr><td class="c016"><span class="c003">\'</span></td><td class="c016">single quote (<span class="c003">'</span>) </td></tr>
<tr><td class="c016"><span class="c003">\n</span></td><td class="c016">linefeed (LF) </td></tr>
<tr><td class="c016"><span class="c003">\r</span></td><td class="c016">carriage return (CR) </td></tr>
<tr><td class="c016"><span class="c003">\t</span></td><td class="c016">horizontal tabulation (TAB) </td></tr>
<tr><td class="c016"><span class="c003">\b</span></td><td class="c016">backspace (BS) </td></tr>
<tr><td class="c016"><span class="c003">\</span><span class="c009">space</span></td><td class="c016">space (SPC) </td></tr>
<tr><td class="c016"><span class="c003">\</span><span class="c009">ddd</span></td><td class="c016">the character with ASCII code <span class="c009">ddd</span> in decimal </td></tr>
<tr><td class="c016"><span class="c003">\x</span><span class="c009">hh</span></td><td class="c016">the character with ASCII code <span class="c009">hh</span> in hexadecimal </td></tr>
<tr><td class="c016"><span class="c003">\o</span><span class="c009">ooo</span></td><td class="c016">the character with ASCII code <span class="c009">ooo</span> in octal </td></tr>
</table></div></div><h4 class="subsubsection" id="sss:stringliterals"><a class="section-anchor" href="#sss:stringliterals" aria-hidden="true"></a>String literals</h4>
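<p>A minimal sketch of the string literal forms in this subsection: ordinary escapes, a Unicode escape, a literal split across lines, and quoted strings. The names and the <span class="c003">ext</span> identifier are arbitrary choices for illustration:</p>

```ocaml
let escaped = "tab:\t quote:\" backslash:\\"

(* U+20AC is replaced by its three-byte UTF-8 encoding. *)
let euro = "\u{20AC}"

(* Backslash, newline and the leading blanks of the next line are skipped. *)
let joined = "first part, \
              second part"

(* In a quoted string the backslash is an ordinary character. *)
let raw = {|a\nb|}

(* A non-empty identifier lets the content include the plain delimiters. *)
let nested = {ext|say {|hi|}|ext}

let () =
  assert (String.contains escaped '\t');
  assert (String.length euro = 3);
  assert (joined = "first part, second part");
  assert (String.length raw = 4);
  assert (nested = "say {|hi|}")
```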
<div class="syntax"><table class="display dcenter"><tr class="c019"><td class="dcell"><table class="c001 cellpading0"><tr><td class="c018">
<a class="syntax" id="string-literal"><span class="c010">string-literal</span></a></td><td class="c015">::=</td><td class="c017">
<span class="c004">"</span> { <a class="syntax" href="#string-character"><span class="c010">string-character</span></a> } <span class="c004">"</span>
</td></tr>
<tr><td class="c018"> </td><td class="c015">∣</td><td class="c017"> <span class="c004">{</span> <a class="syntax" href="#quoted-string-id"><span class="c010">quoted-string-id</span></a> <span class="c004">|</span> { <span class="c010">any-char</span> } <span class="c004">|</span> <a class="syntax" href="#quoted-string-id"><span class="c010">quoted-string-id</span></a> <span class="c004">}</span>
</td></tr>
<tr><td class="c018"> </td></tr>
<tr><td class="c018">
<a class="syntax" id="quoted-string-id"><span class="c010">quoted-string-id</span></a></td><td class="c015">::=</td><td class="c017">
{ <span class="c004">a</span>…<span class="c004">z</span> ∣ <span class="c004">_</span> }
</td></tr>
<tr><td class="c018"> </td></tr>
<tr><td class="c018">
</td></tr>
<tr><td class="c018"> </td></tr>
<tr><td class="c018">
<a class="syntax" id="string-character"><span class="c010">string-character</span></a></td><td class="c015">::=</td><td class="c017">
<span class="c010">regular-string-char</span>
</td></tr>
<tr><td class="c018"> </td><td class="c015">∣</td><td class="c017"> <a class="syntax" href="#escape-sequence"><span class="c010">escape-sequence</span></a>
</td></tr>
<tr><td class="c018"> </td><td class="c015">∣</td><td class="c017"> <span class="c004">\u{</span> { <span class="c004">0</span>…<span class="c004">9</span>∣ <span class="c004">A</span>…<span class="c004">F</span>∣ <span class="c004">a</span>…<span class="c004">f</span> }<sup>+</sup> <span class="c004">}</span>
</td></tr>
<tr><td class="c018"> </td><td class="c015">∣</td><td class="c017"> <span class="c004">\</span> <span class="c010">newline</span> { <span class="c010">space</span> ∣ <span class="c010">tab</span> }
</td></tr>
</table></td></tr>
</table></div><p>String literals are delimited by <span class="c004">"</span> (double quote) characters.
The two double quotes enclose a sequence of either characters
different from <span class="c004">"</span> and <span class="c004">\</span>, or escape sequences from the
table given above for character literals, or a Unicode character
escape sequence.</p><p>A Unicode character escape sequence is substituted by the UTF-8
encoding of the specified Unicode scalar value. The Unicode scalar
value, an integer in the ranges 0x0000...0xD7FF or 0xE000...0x10FFFF,
is defined using 1 to 6 hexadecimal digits; leading zeros are allowed.</p><p>To allow splitting long string literals across lines, the sequence
<span class="c003">\</span><span class="c009">newline</span> <span class="c009">spaces-or-tabs</span> (a backslash at the end of a line
followed by any number of spaces and horizontal tabulations at the
beginning of the next line) is ignored inside string literals.</p><p>Quoted string literals provide an alternative lexical syntax for
string literals. They are useful to represent strings of arbitrary content
without escaping. Quoted strings are delimited by a matching pair
of <span class="c004">{</span> <a class="syntax" href="#quoted-string-id"><span class="c010">quoted-string-id</span></a> <span class="c004">|</span> and <span class="c004">|</span> <a class="syntax" href="#quoted-string-id"><span class="c010">quoted-string-id</span></a> <span class="c004">}</span> with
the same <a class="syntax" href="#quoted-string-id"><span class="c010">quoted-string-id</span></a> on both sides. Quoted strings do not interpret
any character in a special way but require that the
sequence <span class="c004">|</span> <a class="syntax" href="#quoted-string-id"><span class="c010">quoted-string-id</span></a> <span class="c004">}</span> does not occur in the string itself.
The identifier <a class="syntax" href="#quoted-string-id"><span class="c010">quoted-string-id</span></a> is a (possibly empty) sequence of
lowercase letters and underscores that can be freely chosen to avoid
this issue (e.g. <span class="c003">{|hello|}</span>, <span class="c003">{ext|hello {|world|}|ext}</span>, ...).</p><p>The current implementation places practically no restrictions on the
length of string literals.</p><h4 class="subsubsection" id="sss:labelname"><a class="section-anchor" href="#sss:labelname" aria-hidden="true"></a>Naming labels</h4>
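<p>A short sketch of both label flavours as they appear in expressions. The function <span class="c003">make_range</span> and its parameters are hypothetical and serve only to show the <span class="c003">~</span> and <span class="c003">?</span> forms:</p>

```ocaml
(* ?step is an optional argument with a default; ~first and ~last are
   normal labelled arguments. The final unit argument marks the end of
   the application so that the optional argument can be omitted. *)
let make_range ?(step = 1) ~first ~last () =
  let rec go i acc =
    if i > last then List.rev acc else go (i + step) (i :: acc)
  in
  go first []

let default_step = make_range ~first:1 ~last:5 ()
let explicit_step = make_range ~step:2 ~first:1 ~last:5 ()

(* The ?label: form passes the option value directly. *)
let optional_form = make_range ?step:(Some 2) ~first:1 ~last:6 ()

let () =
  assert (default_step = [1; 2; 3; 4; 5]);
  assert (explicit_step = [1; 3; 5]);
  assert (optional_form = [1; 3; 5])
```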
<p>To avoid ambiguities, naming labels in expressions cannot just be defined
syntactically as the sequence of the three tokens <span class="c003">~</span>, <a class="syntax" href="#ident"><span class="c010">ident</span></a> and
<span class="c003">:</span>, and have to be defined at the lexical level.</p><div class="syntax"><table class="display dcenter"><tr class="c019"><td class="dcell"><table class="c001 cellpading0"><tr><td class="c018">
<a class="syntax" id="label-name"><span class="c010">label-name</span></a></td><td class="c015">::=</td><td class="c017"> <a class="syntax" href="#lowercase-ident"><span class="c010">lowercase-ident</span></a>
</td></tr>
<tr><td class="c018"> </td></tr>
<tr><td class="c018">
<a class="syntax" id="label"><span class="c010">label</span></a></td><td class="c015">::=</td><td class="c017"> <span class="c004">~</span> <a class="syntax" href="#label-name"><span class="c010">label-name</span></a> <span class="c004">:</span>
</td></tr>
<tr><td class="c018"> </td></tr>
<tr><td class="c018">
<a class="syntax" id="optlabel"><span class="c010">optlabel</span></a></td><td class="c015">::=</td><td class="c017"> <span class="c004">?</span> <a class="syntax" href="#label-name"><span class="c010">label-name</span></a> <span class="c004">:</span>
</td></tr>
</table></td></tr>
</table></div><p>Naming labels come in two flavours: <a class="syntax" href="#label"><span class="c010">label</span></a> for normal arguments and
<a class="syntax" href="#optlabel"><span class="c010">optlabel</span></a> for optional ones. They are simply distinguished by their
first character, either <span class="c003">~</span> or <span class="c003">?</span>.</p><p>Despite <a class="syntax" href="#label"><span class="c010">label</span></a> and <a class="syntax" href="#optlabel"><span class="c010">optlabel</span></a> being lexical entities in expressions,
their expansions <span class="c004">~</span> <a class="syntax" href="#label-name"><span class="c010">label-name</span></a> <span class="c004">:</span> and <span class="c004">?</span> <a class="syntax" href="#label-name"><span class="c010">label-name</span></a> <span class="c004">:</span> will be
used in grammars, for the sake of readability. Note also that inside
type expressions, this expansion can be taken literally, <em>i.e.</em>
there are really 3 tokens, with optional blanks between them.</p><h4 class="subsubsection" id="sss:lex-ops-symbols"><a class="section-anchor" href="#sss:lex-ops-symbols" aria-hidden="true"></a>Prefix and infix symbols</h4>
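<p>As a sketch of the symbol classes in this subsection, the following defines one infix and two prefix operators. The operators and their meanings are invented for illustration; only their spelling follows the grammar:</p>

```ocaml
(* Infix symbol: starts with < followed by operator characters. *)
let ( <=> ) a b = compare a b

(* Prefix symbol: ! followed by operator characters. *)
let ( !! ) r = !r + 1

(* Prefix symbol: ~ followed by at least one operator character. *)
let ( ~~ ) x = -x

let () =
  assert ((1 <=> 2) < 0);
  assert (!!(ref 41) = 42);
  assert (~~3 = -3);
  (* Between parentheses a symbol behaves like a normal identifier. *)
  assert (List.map ( ~~ ) [1; 2] = [-1; -2])
```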
<div class="syntax"><table class="display dcenter"><tr class="c019"><td class="dcell"><table class="c001 cellpading0"><tr><td class="c018">
<a class="syntax" id="infix-symbol"><span class="c010">infix-symbol</span></a></td><td class="c015">::=</td><td class="c017">
( <a class="syntax" href="#core-operator-char"><span class="c010">core-operator-char</span></a> ∣ <span class="c004">%</span> ∣ <span class="c004"><</span> ) { <a class="syntax" href="#operator-char"><span class="c010">operator-char</span></a> }
</td></tr>
<tr><td class="c018"> </td><td class="c015">∣</td><td class="c017"> <span class="c004">#</span> { <a class="syntax" href="#operator-char"><span class="c010">operator-char</span></a> }<sup>+</sup>
</td></tr>
<tr><td class="c018"> </td></tr>
<tr><td class="c018">
<a class="syntax" id="prefix-symbol"><span class="c010">prefix-symbol</span></a></td><td class="c015">::=</td><td class="c017">
<span class="c004">!</span> { <a class="syntax" href="#operator-char"><span class="c010">operator-char</span></a> }
</td></tr>
<tr><td class="c018"> </td><td class="c015">∣</td><td class="c017"> (<span class="c004">?</span> ∣ <span class="c004">~</span>) { <a class="syntax" href="#operator-char"><span class="c010">operator-char</span></a> }<sup>+</sup>
</td></tr>
<tr><td class="c018"> </td></tr>
<tr><td class="c018">
<a class="syntax" id="operator-char"><span class="c010">operator-char</span></a></td><td class="c015">::=</td><td class="c017">
<span class="c004">~</span> ∣ <span class="c004">!</span> ∣ <span class="c004">?</span> ∣ <a class="syntax" href="#core-operator-char"><span class="c010">core-operator-char</span></a> ∣ <span class="c004">%</span> ∣ <span class="c004"><</span> ∣ <span class="c004">:</span> ∣ <span class="c004">.</span>
</td></tr>
<tr><td class="c018"> </td></tr>
<tr><td class="c018">
<a class="syntax" id="core-operator-char"><span class="c010">core-operator-char</span></a></td><td class="c015">::=</td><td class="c017">
<span class="c004">$</span> ∣ <span class="c004">&</span> ∣ <span class="c004">*</span> ∣ <span class="c004">+</span> ∣ <span class="c004">-</span> ∣ <span class="c004">/</span> ∣ <span class="c004">=</span> ∣ <span class="c004">></span> ∣ <span class="c004">@</span> ∣ <span class="c004">^</span> ∣ <span class="c004">|</span>
</td></tr>
</table></td></tr>
</table></div><p>
See also the following language extensions:
<a href="extensionsyntax.html#s%3Aext-ops">extension operators</a>,
<a href="indexops.html#s%3Aindex-operators">extended indexing operators</a>,
and <a href="bindingops.html#s%3Abinding-operators">binding operators</a>.</p><p>Sequences of “operator characters”, such as <span class="c003"><=></span> or <span class="c003">!!</span>,
are read as a single token from the <a class="syntax" href="#infix-symbol"><span class="c010">infix-symbol</span></a> or <a class="syntax" href="#prefix-symbol"><span class="c010">prefix-symbol</span></a>
class. These symbols are parsed as prefix and infix operators inside
expressions, but otherwise behave like normal identifiers.
</p><h4 class="subsubsection" id="sss:keywords"><a class="section-anchor" href="#sss:keywords" aria-hidden="true"></a>Keywords</h4>
<p>The identifiers below are reserved as keywords, and cannot be employed
otherwise:
</p><pre> and as assert asr begin class
constraint do done downto else end
exception external false for fun function
functor if in include inherit initializer
land lazy let lor lsl lsr
lxor match method mod module mutable
new nonrec object of open or
private rec sig struct then to
true try type val virtual when
while with
</pre><p> <br>
The following character sequences are also keywords:
</p><pre>
<span class="c003"> != # & && ' ( ) * + , -</span>
<span class="c003"> -. -> . .. .~ : :: := :> ; ;;</span>
<span class="c003"> < <- = > >] >} ? [ [< [> [|</span>
<span class="c003"> ] _ ` { {< | |] || } ~</span>
</pre><p>
Note that the following identifiers are keywords of the Camlp4
extensions and should be avoided for compatibility reasons.
</p><pre> parser value $ $$ $: <: << >> ??
</pre><h4 class="subsubsection" id="sss:lex-ambiguities"><a class="section-anchor" href="#sss:lex-ambiguities" aria-hidden="true"></a>Ambiguities</h4>
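<p>The longest match rule of this subsection can be observed with a small sketch. The operator <span class="c003">--</span> below is hypothetical; it is defined only so that the two readings of the same characters produce values of different types:</p>

```ocaml
(* A user-defined infix symbol made of two minus signs. *)
let ( -- ) a b = [a; b]

(* Without blanks, the two minus signs form the single token defined above. *)
let as_operator = 5--2

(* With a blank in between, they are a subtraction and a negative operand. *)
let as_subtraction = 5 - -2

(* The keyword let is followed by the identifier letter, which is not
   split into the keyword let and a shorter identifier. *)
let letter = 1

let () =
  assert (as_operator = [5; 2]);
  assert (as_subtraction = 7);
  assert (letter = 1)
```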
<p>Lexical ambiguities are resolved according to the “longest match”
rule: when a character sequence can be decomposed into two tokens in
several different ways, the decomposition retained is the one with the
longest first token.</p><h4 class="subsubsection" id="sss:lex-linedir"><a class="section-anchor" href="#sss:lex-linedir" aria-hidden="true"></a>Line number directives</h4>
<div class="syntax"><table class="display dcenter"><tr class="c019"><td class="dcell"><table class="c001 cellpading0"><tr><td class="c018">
<a class="syntax" id="linenum-directive"><span class="c010">linenum-directive</span></a></td><td class="c015">::=</td><td class="c017">
<span class="c004">#</span> {<span class="c004">0</span> … <span class="c004">9</span>}<sup>+</sup>
</td></tr>
<tr><td class="c018"> </td><td class="c015">∣</td><td class="c017"> <span class="c004">#</span> {<span class="c004">0</span> … <span class="c004">9</span>}<sup>+</sup> <span class="c004">"</span> { <a class="syntax" href="#string-character"><span class="c010">string-character</span></a> } <span class="c004">"</span>
</td></tr>
</table></td></tr>
</table></div><p>Preprocessors that generate OCaml source code can insert line number
directives in their output so that error messages produced by the
compiler contain line numbers and file names referring to the source
file before preprocessing, instead of after preprocessing.
A line number directive is composed of a <span class="c004">#</span> (sharp sign), followed by
a positive integer (the source line number), optionally followed by a
character string (the source file name).
Line number directives are treated as blanks during lexical
analysis.</p>
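<p>The directive described above can be sketched as follows. The file name <span class="c003">original.mly</span> is a made-up example, and the sketch assumes that the built-in <span class="c003">__LINE__</span> and <span class="c003">__FILE__</span> values follow the positions set by the directive, which should be checked against the compiler version in use:</p>

```ocaml
# 100 "original.mly"
let located_here = __LINE__
let reported_file = __FILE__

(* Under the assumption stated above, located_here is the line number
   given in the directive, since the directive applies to the next line. *)
let () = Printf.printf "%s:%d\n" reported_file located_here
```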
</body>
</html>