<HTML>
<HEAD>
<!-- This HTML file has been created by texi2html 1.29
     from ../tnf/syntax.tnf on 12 February 2003 -->

<TITLE>Syntactic Analysis - Context-Free Grammars and Parsing</TITLE>
</HEAD>
<BODY TEXT="#000000" BGCOLOR="#FFFFFF" LINK="#0000EE" VLINK="#551A8B" ALINK="#FF0000" BACKGROUND="gifs/bg.gif">
<TABLE BORDER=0 CELLSPACING=0 CELLPADDING=0 VALIGN=BOTTOM>
<TR VALIGN=BOTTOM>
<TD WIDTH="160" VALIGN=BOTTOM><IMG SRC="gifs/elilogo.gif" BORDER=0>&nbsp;</TD>
<TD WIDTH="25" VALIGN=BOTTOM><img src="gifs/empty.gif" WIDTH=25 HEIGHT=25></TD>
<TD ALIGN=LEFT WIDTH="600" VALIGN=BOTTOM><IMG SRC="gifs/title.gif"></TD>
</TR>
</TABLE>

<HR size=1 noshade width=785 align=left>
<TABLE BORDER=0 CELLSPACING=2 CELLPADDING=0>
<TR>
<TD VALIGN=TOP WIDTH="160">
<h4>General Information</h4>

<table BORDER=0 CELLSPACING=0 CELLPADDING=0>
<tr valign=top><td><img src="gifs/gelbekugel.gif" WIDTH=7 HEIGHT=7 ALT=" o"> </td><td><a href="index.html">Eli: Translator Construction Made Easy</a></td></tr>
<tr valign=top><td><img src="gifs/gelbekugel.gif" WIDTH=7 HEIGHT=7 ALT=" o"> </td><td><a href="gindex_toc.html">Global Index</a></td></tr>
<tr valign=top><td><img src="gifs/gelbekugel.gif" WIDTH=7 HEIGHT=7 ALT=" o"> </td><td><a href="faq_toc.html" >Frequently Asked Questions</a> </td></tr>
</table>

<h4>Tutorials</h4>

<table BORDER=0 CELLSPACING=0 CELLPADDING=0>
<tr valign=top><td><img src="gifs/gelbekugel.gif" WIDTH=7 HEIGHT=7 ALT=" o"> </td><td><a href="EliRefCard_toc.html">Quick Reference Card</a></td></tr>
<tr valign=top><td><img src="gifs/gelbekugel.gif" WIDTH=7 HEIGHT=7 ALT=" o"> </td><td><a href="novice_toc.html">Guide For new Eli Users</a></td></tr>
<tr valign=top><td><img src="gifs/gelbekugel.gif" WIDTH=7 HEIGHT=7 ALT=" o"> </td><td><a href="news_toc.html">Release Notes of Eli</a></td></tr>
<tr valign=top><td><img src="gifs/gelbekugel.gif" WIDTH=7 HEIGHT=7 ALT=" o"> </td><td><a href="nametutorial_toc.html">Tutorial on Name Analysis</a></td></tr>
<tr valign=top><td><img src="gifs/gelbekugel.gif" WIDTH=7 HEIGHT=7 ALT=" o"> </td><td><a href="typetutorial_toc.html">Tutorial on Type Analysis</a></td></tr>
</table>

<h4>Reference Manuals</h4>

<table BORDER=0 CELLSPACING=0 CELLPADDING=0>
<tr valign=top><td><img src="gifs/gelbekugel.gif" WIDTH=7 HEIGHT=7 ALT=" o"> </td><td><a href="ui_toc.html">User Interface</a></td></tr>
<tr valign=top><td><img src="gifs/gelbekugel.gif" WIDTH=7 HEIGHT=7 ALT=" o"> </td><td><a href="pp_toc.html">Eli products and parameters</a></td></tr>
<tr valign=top><td><img src="gifs/gelbekugel.gif" WIDTH=7 HEIGHT=7 ALT=" o"> </td><td><a href="lidoref_toc.html">LIDO Reference Manual</a></td></tr>
</table>

<h4>Libraries</h4>

<table BORDER=0 CELLSPACING=0 CELLPADDING=0>
<tr valign=top><td><img src="gifs/gelbekugel.gif" WIDTH=7 HEIGHT=7 ALT=" o"> </td><td><a href="lib_toc.html">Eli library routines</a></td></tr>
<tr valign=top><td><img src="gifs/gelbekugel.gif" WIDTH=7 HEIGHT=7 ALT=" o"> </td><td><a href="modlib_toc.html">Specification Module Library</a></td></tr>
</table>

<h4>Translation Tasks</h4>

<table BORDER=0 CELLSPACING=0 CELLPADDING=0>
<tr valign=top><td><img src="gifs/gelbekugel.gif" WIDTH=7 HEIGHT=7 ALT=" o"> </td><td><a href="lex_toc.html">Lexical analysis specification</a></td></tr>
<tr valign=top><td><img src="gifs/gelbekugel.gif" WIDTH=7 HEIGHT=7 ALT=" o"> </td><td><a href="syntax_toc.html">Syntactic Analysis Manual</a></td></tr>
<tr valign=top><td><img src="gifs/gelbekugel.gif" WIDTH=7 HEIGHT=7 ALT=" o"> </td><td><a href="comptrees_toc.html">Computation in Trees</a></td></tr>
</table>

<h4>Tools</h4>

<table BORDER=0 CELLSPACING=0 CELLPADDING=0>
<tr valign=top><td><img src="gifs/gelbekugel.gif" WIDTH=7 HEIGHT=7 ALT=" o"> </td><td><a href="lcl_toc.html">LIGA Control Language</a> </td></tr>
<tr valign=top><td><img src="gifs/gelbekugel.gif" WIDTH=7 HEIGHT=7 ALT=" o"> </td><td><a href="show_toc.html">Debugging Information for LIDO</a> </td></tr>
<tr valign=top><td><img src="gifs/gelbekugel.gif" WIDTH=7 HEIGHT=7 ALT=" o"> </td><td><a href="gorto_toc.html">Graphical ORder TOol</a> </td></tr>
</table>
<p>
<table BORDER=0 CELLSPACING=0 CELLPADDING=0>
<tr valign=top><td><img src="gifs/gelbekugel.gif" WIDTH=7 HEIGHT=7 ALT=" o"> </td><td><a href="fw_toc.html">FunnelWeb User's Manual</a> </td></tr>
</table>
<p>
<table BORDER=0 CELLSPACING=0 CELLPADDING=0>
<tr valign=top><td><img src="gifs/gelbekugel.gif" WIDTH=7 HEIGHT=7 ALT=" o"> </td><td><a href="ptg_toc.html">Pattern-based Text Generator</a> </td></tr>
<tr valign=top><td><img src="gifs/gelbekugel.gif" WIDTH=7 HEIGHT=7 ALT=" o"> </td><td><a href="deftbl_toc.html">Property Definition Language</a> </td></tr>
<tr valign=top><td><img src="gifs/gelbekugel.gif" WIDTH=7 HEIGHT=7 ALT=" o"> </td><td><a href="oil_toc.html">Operator Identification Language</a> </td></tr>
<tr valign=top><td><img src="gifs/gelbekugel.gif" WIDTH=7 HEIGHT=7 ALT=" o"> </td><td><a href="tp_toc.html">Tree Grammar Specification Language</a> </td></tr>
<tr valign=top><td><img src="gifs/gelbekugel.gif" WIDTH=7 HEIGHT=7 ALT=" o"> </td><td><a href="clp_toc.html">Command Line Processing</a> </td></tr>
<tr valign=top><td><img src="gifs/gelbekugel.gif" WIDTH=7 HEIGHT=7 ALT=" o"> </td><td><a href="cola_toc.html">COLA Options Reference Manual</a> </td></tr>
</table>
<p>
<table BORDER=0 CELLSPACING=0 CELLPADDING=0>
<tr valign=top><td><img src="gifs/gelbekugel.gif" WIDTH=7 HEIGHT=7 ALT=" o"> </td><td><a href="idem_toc.html">Generating Unparsing Code</a> </td></tr>
</table>
<p>
<table BORDER=0 CELLSPACING=0 CELLPADDING=0>
<tr valign=top><td><img src="gifs/gelbekugel.gif" WIDTH=7 HEIGHT=7 ALT=" o"> </td><td><a href="mon_toc.html">Monitoring a Processor's Execution</a> </td></tr>
</table>

<h4>Administration</h4>

<table BORDER=0 CELLSPACING=0 CELLPADDING=0>
<tr valign=top><td><img src="gifs/gelbekugel.gif" WIDTH=7 HEIGHT=7 ALT=" o"> </td><td><a href="sysadmin_toc.html">System Administration Guide</a> </td></tr>
</table>

<HR WIDTH="100%">
<CENTER>&nbsp;<A HREF="mailto:elibugs@cs.colorado.edu"><IMG SRC="gifs/button_mail.gif" NOSAVE BORDER=0 HEIGHT=32 WIDTH=32></A><A HREF="mailto:elibugs@cs.colorado.edu">Questions, Comments, ....</A></CENTER>

</TD>
<TD VALIGN=TOP WIDTH="25"><img src="gifs/empty.gif" WIDTH=25 HEIGHT=25></TD>

<TD VALIGN=TOP WIDTH="600">
<H1>Syntactic Analysis</H1>
<P>
<IMG SRC="gifs/empty.gif" WIDTH=25 HEIGHT=25 ALT=""><A HREF="syntax_2.html"><IMG SRC="gifs/next.gif" ALT="Next Chapter" BORDER="0"></A>
<IMG SRC="gifs/empty.gif" WIDTH=25 HEIGHT=25 ALT=""><A HREF="syntax_toc.html"><IMG SRC="gifs/up.gif" ALT="Table of Contents" BORDER="0"></A>
<IMG SRC="gifs/empty.gif" WIDTH=25 HEIGHT=25 ALT="">
<HR size=1 noshade width=600 align=left>
<H1><A NAME="SEC1" HREF="syntax_toc.html#SEC1">Context-Free Grammars and Parsing</A></H1>
<P>
A <DFN>context-free grammar</DFN>
<A NAME="IDX10"></A>
<A NAME="IDX9"></A>
is a formal system that describes a language by
specifying how any legal text can be derived from a distinguished symbol
called the <DFN>axiom</DFN>,
<A NAME="IDX11"></A>
or <DFN>sentence symbol</DFN>.
<A NAME="IDX12"></A>
It consists of a set of <DFN>productions</DFN>,
<A NAME="IDX13"></A>
each of which states that a given symbol can be replaced by a given sequence
<A NAME="IDX14"></A>
of symbols.
To derive a legal text,
<A NAME="IDX15"></A>
the grammar is used as data for the following algorithm:
<P>
<OL>
<LI>
Let <CODE>text</CODE> be a single occurrence of the axiom.
<P>
<LI>
If no production states that a symbol currently in <CODE>text</CODE> can be replaced
by some sequence of symbols, then stop.
<P>
<LI>
Rewrite <CODE>text</CODE> by replacing one of its symbols with a sequence
according to some production.
<P>
<LI>
Go to step (2).
</OL>
<P>
When this algorithm terminates, <CODE>text</CODE> is a legal text in the language.
The <DFN>phrase structure</DFN>
<A NAME="IDX16"></A>
of that text is the hierarchy of sequences used in its derivation.
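<P>
The derivation algorithm above can be sketched in a few lines of Python.
This is only an illustration, not part of Eli: the toy grammar, the name
<CODE>derive</CODE>, and the use of a random choice to pick "some production"
are all assumptions made for the example.

```python
import random

# Hypothetical toy grammar: each nonterminal maps to its alternative
# replacement sequences. Symbols without an entry cannot be replaced.
GRAMMAR = {
    "Assignment": [["Variable", ":=", "Expression"]],
    "Variable": [["x"], ["y"]],
    "Expression": [["Variable"], ["1"]],
}

def derive(grammar, axiom, rng):
    """Run the four-step derivation; return the final text and every step."""
    text = [axiom]                       # step 1: a single occurrence of the axiom
    steps = [list(text)]
    while True:
        # step 2: find the symbols that some production can still replace
        positions = [i for i, sym in enumerate(text) if sym in grammar]
        if not positions:
            return text, steps           # nothing left to replace: stop
        # step 3: replace one symbol according to some production
        i = rng.choice(positions)
        text[i:i + 1] = rng.choice(grammar[text[i]])
        steps.append(list(text))         # step 4: go back to step 2

text, steps = derive(GRAMMAR, "Assignment", random.Random(0))
print(" ".join(text))
```

<P>
Each entry of <CODE>steps</CODE> records one rewriting of <CODE>text</CODE>, so the
list as a whole is one derivation of the final text.  The toy grammar is not
recursive, which guarantees that the loop terminates.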
<P>
Given a context-free grammar that satisfies certain conditions,
Eli can generate a <DFN>parsing routine</DFN>
<A NAME="IDX17"></A>
to determine the derivation (and hence the phrase structure) of any legal text.
This routine will also automatically detect and report any errors
<A NAME="IDX19"></A>
<A NAME="IDX20"></A>
<A NAME="IDX21"></A>
<A NAME="IDX18"></A>
in the text, and repair
<A NAME="IDX22"></A>
them to produce a correct phrase structure
(which may not be that intended by the person who wrote the erroneous text).
<P>
<H2><A NAME="SEC2" HREF="syntax_toc.html#SEC2">How to describe a context-free grammar</A></H2>
<P>
Each production of a context-free grammar consists of a symbol to be replaced
and the sequence that replaces it.
This can be represented in a type-<TT>`con'</TT> file
<A NAME="IDX24"></A>
<A NAME="IDX25"></A>
<A NAME="IDX23"></A>
by giving the symbol to be replaced, followed by a colon,
followed by the sequence that replaces it, followed by a period:
<P>
<PRE>
Assignment: Variable ':=' Expression.
StatementList: .
Statement:
   'if' Expression 'then' Statement
   'else' Statement.
</PRE>
<P>
The first production asserts that the symbol <CODE>Assignment</CODE> can be replaced
by the sequence consisting of the three symbols <CODE>Variable</CODE>, <CODE>':='</CODE>,
and <CODE>Expression</CODE>.
Any occurrence of the symbol <CODE>StatementList</CODE> can be replaced by an empty
sequence according to the second production.
In the third production, you see that new lines can be used as separators
in the description of a production.  This notation is commonly
referred to as <DFN>Backus-Naur Form</DFN>, or just <DFN>BNF</DFN>.
<A NAME="IDX27"></A>
<A NAME="IDX26"></A>
<P>
Symbols that are to be replaced are called <DFN>nonterminals</DFN>,
<A NAME="IDX28"></A>
and are always represented by <DFN>identifiers</DFN>.
<A NAME="IDX29"></A>
(An identifier is a sequence of letters and digits, the first of which is a
letter.)
Every nonterminal must appear before a colon in at least one production. 
The axiom is a nonterminal that appears before the colon in exactly one
production, and does not appear between the colon and the period in any
production.
There must be exactly one nonterminal satisfying the conditions for the axiom.
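<P>
The conditions on nonterminals, terminals, and the axiom can be checked
mechanically.  The following Python sketch assumes that the productions have
already been read into a list of pairs; the example grammar and the function
names <CODE>find_axiom</CODE> and <CODE>terminals</CODE> are hypothetical and do not
correspond to any Eli interface.

```python
# Productions as (symbol to be replaced, replacement sequence) pairs.
# This small grammar is a made-up example.
PRODUCTIONS = [
    ("Program", ["StatementList"]),
    ("StatementList", []),
    ("StatementList", ["StatementList", "Statement"]),
    ("Statement", ["Variable", "':='", "Expression"]),
    ("Variable", ["Identifier"]),
    ("Expression", ["Variable"]),
]

def find_axiom(productions):
    """Return the one nonterminal that satisfies the axiom conditions."""
    lhs_count = {}
    for lhs, _ in productions:
        lhs_count[lhs] = lhs_count.get(lhs, 0) + 1
    used_on_rhs = {sym for _, rhs in productions for sym in rhs}
    # Before the colon in exactly one production, never after a colon.
    candidates = [name for name, n in lhs_count.items()
                  if n == 1 and name not in used_on_rhs]
    if len(candidates) != 1:
        raise ValueError(f"expected exactly one axiom, found {candidates}")
    return candidates[0]

def terminals(productions):
    """Symbols that never appear before a colon are terminals."""
    nonterminals = {lhs for lhs, _ in productions}
    return {sym for _, rhs in productions for sym in rhs} - nonterminals

print(find_axiom(PRODUCTIONS))
print(sorted(terminals(PRODUCTIONS)))
```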
<P>
Symbols that cannot be replaced are called <DFN>terminals</DFN>,
<A NAME="IDX30"></A>
and may be represented by either identifiers or <DFN>literals</DFN>.
<A NAME="IDX31"></A>
(A literal is a sequence of characters bounded by apostrophes (<KBD>'</KBD>).
An apostrophe appearing within a literal is represented by two successive
apostrophes.)
No terminal may appear before a colon in any production.
Terminals represent character strings that are recognized by the lexical
analyzer (see  <A HREF="lex_toc.html">Lexical Analysis</A>).
<A NAME="IDX32"></A>
<P>
<H3><A NAME="SEC3" HREF="syntax_toc.html#SEC3">Using extended BNF to describe more complex rules</A></H3>
<P>
Extended BNF allows the use of certain operators on the right-hand side
of a production.  These operators are shorthands that simplify
the grammar description.  Rules with extended BNF operators can be
translated into rules that use only the strict BNF constructs described
so far.  While the use of extended BNF constructs is supported for the
concrete syntax description in Eli, only strict BNF constructs are allowed
in the abstract syntax.  When it comes time to deduce the correspondence
between the concrete and abstract syntax, Maptool operates on the abstract
syntax and a version of the concrete syntax in which all rules containing 
extended BNF constructs have been translated into equivalent strict
BNF rules.
<P>
The remainder of this section describes how each of the extended
BNF constructs is translated to its strict BNF equivalent.  Note that
most of the EBNF constructs require the introduction of generated symbols
for their strict BNF translation.  Users are strongly discouraged from using
these constructs in contexts that require attribution,
because changes in the grammar will change the names of the
generated symbols used.
<P>
The most appropriate use of EBNF constructs that introduce generated
symbols is when matching the LIDO
<CODE>LISTOF</CODE> construct, since the <CODE>LISTOF</CODE> construct makes no
assumptions about the phrase structure of the list.
For a description of the <CODE>LISTOF</CODE> construct, see
 <A HREF="lidoref_3.html#SEC4">Productions of LIDO - Reference Manual</A>.
<P>
When a grammar contains many productions specifying replacement of the same
nonterminal, a slash, denoting <DFN>alternation</DFN>
<A NAME="IDX33"></A>
can be used to avoid re-writing the symbol being replaced:
<P>
<PRE>
Statement:
   Variable ':=' Expression /
   'if' Expression 'then' Statement 'else' Statement /
   'while' Expression 'do' Statement .
</PRE>
<P>
This alternation specifies three productions.
The nonterminal to be replaced is <CODE>Statement</CODE> in each case.
Possible replacement sequences are separated by slashes (<KBD>/</KBD>).
The strict BNF translation for the above example is:
<P>
<PRE>
Statement: Variable ':=' Expression .
Statement: 'if' Expression 'then' Statement 'else' Statement . 
Statement: 'while' Expression 'do' Statement . 
</PRE>
<P>
Alternation does not introduce any generated symbols and has a very
straightforward translation.  As a result, it is the most heavily used
of the EBNF constructs.
<P>
Square brackets denote that the sequence of symbols
they enclose is optional.  In the following
example, <CODE>Constants</CODE> and <CODE>Variables</CODE> are optional,
but <CODE>Body</CODE> is not:
<P>
<PRE>
Program: [Constants] [Variables] Body .
</PRE>
<P>
The strict BNF translation of this construct is to generate
a rule for each possible combination of present and absent optional phrases
on the right-hand side.
In the case of the above example, the following four rules
would result:
<P>
<PRE>
Program: Body .
Program: Variables Body .
Program: Constants Body .
Program: Constants Variables Body .
</PRE>
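<P>
The expansion of optional phrases can be sketched as follows.  This is an
illustration in Python, not the code Eli uses; the function name
<CODE>expand_optionals</CODE> and the representation of a right-hand side as
(symbol, optional) pairs are assumptions, and for brevity each pair of
brackets is taken to enclose a single symbol.

```python
from itertools import product

def expand_optionals(lhs, rhs):
    """Translate a rule with optional symbols into strict BNF rules.

    rhs is a list of (symbol, is_optional) pairs.
    """
    # An optional symbol contributes either nothing or itself;
    # a required symbol always contributes itself.
    choices = [[[], [sym]] if optional else [[sym]] for sym, optional in rhs]
    return [(lhs, [s for part in combo for s in part])
            for combo in product(*choices)]

rules = expand_optionals(
    "Program", [("Constants", True), ("Variables", True), ("Body", False)])
for lhs, seq in rules:
    print(f"{lhs}: {' '.join(seq)} .")
```

<P>
A rule with <VAR>n</VAR> optional phrases expands into 2 to the power <VAR>n</VAR>
strict BNF rules, which is one reason to use the construct sparingly.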
<P>
While the translation doesn't introduce any generated symbols,
indiscriminate use of this construct may lead to less readable specifications.
<P>
An asterisk (or star) is used to denote zero or more occurrences
of the phrase to which it is applied.  In the following example,
<CODE>Program</CODE> consists of zero or more occurrences of <CODE>Variable</CODE>
followed by <CODE>Body</CODE>:
<P>
<PRE>
Program: Variable* Body .
</PRE>
<P>
The strict BNF translation of this construct requires the introduction
of a generated symbol.  A generated symbol consists of the letter <CODE>G</CODE>
followed by a unique number.  Generated symbols are chosen so that they do not
conflict with existing symbols in the concrete syntax.  No check is
performed to ensure that the generated symbols do not conflict with
symbols in the abstract syntax, so users should avoid using symbols
of this form in their abstract syntax.  The translation
for the above example is as follows:
<P>
<PRE>
Program: G1 Body .
G1: G1 Variable .
G1: .
</PRE>
<P>
A plus is used to denote one or more occurrences
of the phrase to which it is applied.  In the following example,
<CODE>Program</CODE> consists of one or more occurrences of <CODE>Variable</CODE>
followed by <CODE>Body</CODE>:
<P>
<PRE>
Program: Variable+ Body .
</PRE>
<P>
The strict BNF translation of this construct is similar to the translation
of the asterisk (see  <A HREF="syntax_1.html#SEC3">Using extended BNF to describe more complex rules</A>).  The translation
for the above example is as follows:
<P>
<PRE>
Program: G1 Body .
G1: G1 Variable .
G1: Variable .
</PRE>
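<P>
The translations of the asterisk and the plus differ only in the rule that
ends the repetition.  The Python sketch below makes that explicit.  The names
<CODE>make_namer</CODE> and <CODE>expand_repetition</CODE> are hypothetical, and the
numbering scheme (count upward, skipping names already in use) is an assumption
about how unique names might be chosen, not a description of Eli's behavior.

```python
from itertools import count

def make_namer(existing):
    """Return a function producing G1, G2, ... that avoids existing symbols."""
    numbers = count(1)
    def fresh():
        while True:
            name = f"G{next(numbers)}"
            if name not in existing:
                return name
    return fresh

def expand_repetition(symbol, operator, fresh):
    """Return (generated symbol, strict BNF rules) for symbol* or symbol+."""
    g = fresh()
    # '*' ends the repetition with an empty sequence, '+' with one occurrence.
    base = [] if operator == "*" else [symbol]
    return g, [(g, [g, symbol]), (g, base)]

fresh = make_namer({"Program", "Variable", "Body"})
g, generated = expand_repetition("Variable", "+", fresh)
rules = [("Program", [g, "Body"])] + generated
for lhs, seq in rules:
    print(f"{lhs}: {' '.join(seq)} .")
```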
<P>
A double slash is used to denote one or more occurrences of a phrase
separated by a symbol.  In the following example, <CODE>Input</CODE> is a
sequence of one or more <CODE>Declaration</CODE> phrases separated by commas:
<P>
<PRE>
Input: Declaration // ',' .
</PRE>
<P>
The strict BNF translation for the above example is as follows:
<P>
<PRE>
Input: G1 .
G1: G2 .
G1: G1 ',' G2 .
G2: Declaration .
</PRE>
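<P>
Because the rule <CODE>G1: G1 ',' G2</CODE> is left-recursive, the phrase
structure of a separated list nests to the left.  The following Python sketch
builds that structure for a list of declarations; the function name
<CODE>separated_list_tree</CODE> and the tuple representation of phrases are
assumptions made for illustration.

```python
def separated_list_tree(items, separator=","):
    """Build the phrase structure the strict BNF rules give to a list."""
    if not items:
        raise ValueError("the separator construct needs at least one item")
    tree = ("G1", ("G2", items[0]))                    # G1: G2 .
    for item in items[1:]:
        tree = ("G1", tree, separator, ("G2", item))   # G1: G1 ',' G2 .
    return ("Input", tree)                             # Input: G1 .

print(separated_list_tree(["d1", "d2", "d3"]))
```

<P>
The first declaration ends up deepest in the tree, and each further
declaration adds one more enclosing <CODE>G1</CODE> phrase.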
<P>
Note that all of the EBNF constructs, except the single slash (for alternation),
have higher precedence than the separator construct.
<P>
Parentheses are used to group EBNF constructs.  This is used primarily
to apply other EBNF operators to more than a single symbol.  For example:
<P>
<PRE>
Program: (Definition Use)+ .
</PRE>
<P>
In this example, we want to apply the Plus operator to the concatenation of
a <CODE>Definition</CODE> and a <CODE>Use</CODE>.  The result denotes one or more
occurrences of a <CODE>Definition</CODE> followed by a <CODE>Use</CODE>.  The strict
BNF translation for the above is:
<P>
<PRE>
Program: G2 .
G1: Definition Use .
G2: G1 .
G2: G2 G1 .
</PRE>
<P>
This is identical to the translation for the Plus operator operating on a
single symbol, except that another generated symbol is created to represent
the parenthetical phrase.
<P>
Note that a common error is to introduce parentheses where they are not
needed.  This will result in the introduction of unexpected generated
symbols.
<P>
<H2><A NAME="SEC4" HREF="syntax_toc.html#SEC4">Using structure to convey meaning</A></H2>
<P>
A production is a construct with two components: the symbol to be replaced
and the sequence that replaces it.
We defined the meaning of the production in terms of those components,
saying that whenever the symbol was found in <CODE>text</CODE>, it could be
replaced by the sequence.
This is the general approach that we use in defining the meaning of constructs
<A NAME="IDX34"></A>
in any language.
For example, we say that an assignment is a statement with two components,
a variable and an expression.
The meaning of the assignment is to replace the value of the variable with
the value resulting from evaluating the expression.
<P>
The context-free grammar for a language specifies a "component" relationship.
Each production says that the components of the phrase represented by the
symbol to be replaced are the elements of the sequence that replaces it.
To be useful, the context-free grammar for a language should embody exactly the
relationship that we use in defining the meanings of the constructs of that
language.
<P>
<H3><A NAME="SEC5" HREF="syntax_toc.html#SEC5">Operator precedence</A></H3>
<P>
Consider the following expressions:
<P>
<PRE>
A + B * C
(A + B) * C
</PRE>
<P>
In the first expression, the operands of the addition are the variable
<CODE>A</CODE> and the product of the variables <CODE>B</CODE> and <CODE>C</CODE>.
The reason is that in normal mathematical notation, multiplication takes
precedence over addition.
<A NAME="IDX36"></A>
<A NAME="IDX37"></A>
<A NAME="IDX35"></A>
Parentheses have been used in the second expression to indicate that the
operands of the multiplication are the sum of variables <CODE>A</CODE> and
<CODE>B</CODE>, and the variable <CODE>C</CODE>.
<P>
The general method for embodying this concept of operator precedence in a
context-free grammar for expressions is to associate a distinct nonterminal
with each precedence level, and one with operands that do not contain
"visible" operators.
For our expressions, this requires three nonterminals:
<P>
<DL COMPACT>
<DT><CODE>Sum</CODE>
<DD>An expression whose operator is <CODE>+</CODE>
<P>
<DT><CODE>Term</CODE>
<DD>An expression whose operator is <CODE>*</CODE>
<P>
<DT><CODE>Primary</CODE>
<DD>An expression not containing "visible" operators
</DL>
<P>
The productions that embody the concept of operator precedence would then
be:
<P>
<PRE>
Sum: Sum '+' Term / Term.
Term: Term '*' Primary / Primary.
Primary: '(' Sum ')' / Identifier.
</PRE>
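<P>
To see how these productions determine the operands of each operator, here
is a small hand-written recursive-descent parser in Python with one function
per nonterminal.  It is a sketch only: Eli generates its parsers from the
grammar rather than using code like this, the left-recursive rules are
handled with loops, and the input is assumed to consist of tokens separated
by blanks.

```python
def parse_sum(tokens):
    """Sum: Sum '+' Term / Term.  Returns (tree, remaining tokens)."""
    tree, rest = parse_term(tokens)
    while rest and rest[0] == "+":
        right, rest = parse_term(rest[1:])
        tree = ("+", tree, right)
    return tree, rest

def parse_term(tokens):
    """Term: Term '*' Primary / Primary."""
    tree, rest = parse_primary(tokens)
    while rest and rest[0] == "*":
        right, rest = parse_primary(rest[1:])
        tree = ("*", tree, right)
    return tree, rest

def parse_primary(tokens):
    """Primary: '(' Sum ')' / Identifier."""
    if not tokens:
        raise SyntaxError("unexpected end of input")
    if tokens[0] == "(":
        tree, rest = parse_sum(tokens[1:])
        if not rest or rest[0] != ")":
            raise SyntaxError("expected ')'")
        return tree, rest[1:]
    if tokens[0].isalnum():
        return tokens[0], tokens[1:]
    raise SyntaxError(f"unexpected token {tokens[0]!r}")

def parse(text):
    tree, rest = parse_sum(text.split())
    if rest:
        raise SyntaxError(f"unexpected token {rest[0]!r}")
    return tree

print(parse("A + B * C"))      # ('+', 'A', ('*', 'B', 'C'))
print(parse("( A + B ) * C"))  # ('*', ('+', 'A', 'B'), 'C')
```

<P>
Because a <CODE>Term</CODE> can only be an operand of <CODE>+</CODE>, never the other
way around without parentheses, the product <CODE>B * C</CODE> becomes a single
operand of the addition in the first expression.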
<P>
<H3><A NAME="SEC6" HREF="syntax_toc.html#SEC6">Operator associativity</A></H3>
<P>
Consider the following expressions:
<P>
<PRE>
A - B - C
A ** B ** C
A &#60; B &#60; C
</PRE>
<P>
Which operator has variable <CODE>B</CODE> as an operand in each case?
<P>
This question can be answered by stating an <DFN>association</DFN>
<A NAME="IDX39"></A>
<A NAME="IDX40"></A>
<A NAME="IDX38"></A>
for each operator:
If <CODE>-</CODE> is "left-associative",
<A NAME="IDX41"></A>
then the first expression is interpreted as though it had been written
<CODE>(A-B)-C</CODE>.
Saying that <CODE>**</CODE> is "right-associative"
<A NAME="IDX42"></A>
means that the second expression is interpreted as though it had been written
<CODE>A**(B**C)</CODE>.
The language designer may wish to disallow the third expression by saying
that <CODE>&#60;</CODE> is "non-associative".
<A NAME="IDX43"></A>
<P>
Association rules are embodied in a context-free grammar by selecting
appropriate nonterminals to describe the operands of an operator.
For each operator, two nonterminals must be known:
the nonterminal describing expressions that may contain that operator, and
the nonterminal describing expressions that do not contain that operator
but may be operands of that operator.
Usually these nonterminals have been established to describe operator
precedence.
Here is a typical set of nonterminals used to describe expressions:
<P>
<DL COMPACT>
<DT><CODE>Relation</CODE>
<DD>An expression whose operator is <CODE>&#60;</CODE> or <CODE>&#62;</CODE>
<P>
<DT><CODE>Sum</CODE>
<DD>An expression whose operator is <CODE>+</CODE> or <CODE>-</CODE>
<P>
<DT><CODE>Term</CODE>
<DD>An expression whose operator is <CODE>*</CODE> or <CODE>/</CODE>
<P>
<DT><CODE>Factor</CODE>
<DD>An expression whose operator is <CODE>**</CODE>
<P>
<DT><CODE>Primary</CODE>
<DD>An expression not containing "visible" operators
</DL>
<P>
The association rules discussed above would therefore be expressed by the
following productions
(these are <EM>not</EM> the only productions in the grammar):
<P>
<PRE>
Sum: Sum '-' Term.
Factor: Primary '**' Factor.
Relation: Sum '&#60;' Sum.
</PRE>
<P>
The first production says that the left operand of <CODE>-</CODE> can contain
other <CODE>-</CODE> operators, while the right operand cannot (unless the
subexpression containing them is surrounded by parentheses).
Similarly, the right operand of <CODE>**</CODE> can contain other <CODE>**</CODE>
operators but the left operand cannot.
The third rule says that neither operand of <CODE>&#60;</CODE> can contain other
<CODE>&#60;</CODE> operators.
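<P>
The effect of the three productions can be observed with another small
recursive-descent sketch in Python.  It is illustrative only and makes
several simplifying assumptions: the <CODE>Term</CODE> level is omitted, so the
operands of <CODE>-</CODE> are <CODE>Factor</CODE> phrases, parentheses are not
supported, and tokens are separated by blanks.

```python
def parse_relation(tokens):
    """Relation: Sum '<' Sum / Sum.  Non-associative: at most one '<'."""
    left, rest = parse_sum(tokens)
    if rest and rest[0] == "<":
        right, rest = parse_sum(rest[1:])
        return ("<", left, right), rest
    return left, rest

def parse_sum(tokens):
    """Sum: Sum '-' Factor / Factor.  Left-associative: fold in a loop."""
    tree, rest = parse_factor(tokens)
    while rest and rest[0] == "-":
        right, rest = parse_factor(rest[1:])
        tree = ("-", tree, right)
    return tree, rest

def parse_factor(tokens):
    """Factor: Primary '**' Factor / Primary.  Right-associative: recurse."""
    left, rest = parse_primary(tokens)
    if rest and rest[0] == "**":
        right, rest = parse_factor(rest[1:])
        return ("**", left, right), rest
    return left, rest

def parse_primary(tokens):
    if not tokens or not tokens[0].isalnum():
        raise SyntaxError("expected an identifier")
    return tokens[0], tokens[1:]

def parse(text):
    tree, rest = parse_relation(text.split())
    if rest:
        raise SyntaxError(f"unexpected token {rest[0]!r}")
    return tree

print(parse("A - B - C"))    # ('-', ('-', 'A', 'B'), 'C')
print(parse("A ** B ** C"))  # ('**', 'A', ('**', 'B', 'C'))
```

<P>
With these functions, <CODE>parse("A &#60; B &#60; C")</CODE> raises a
<CODE>SyntaxError</CODE>, because after the first relation has been recognized
there is no production that accepts a second <CODE>&#60;</CODE>.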
<P>
<H3><A NAME="SEC7" HREF="syntax_toc.html#SEC7">Scope rules for declarations</A></H3>
<P>
Identifiers
<A NAME="IDX45"></A>
<A NAME="IDX44"></A>
are normally given meaning by declarations.
The meaning given to an identifier by a particular declaration holds over
some portion of the program, called the <DFN>scope</DFN>
<A NAME="IDX46"></A>
of that declaration.
A context-free grammar for a language should define a phrase structure that
is consistent with the scope rules of that language.
<P>
For example, the declaration of a procedure <CODE>P</CODE> within the
body of procedure <CODE>Q</CODE> gives meaning to the identifier <CODE>P</CODE>, and
its scope might be the body of the procedure <CODE>Q</CODE>.
If <CODE>P</CODE> has parameters, the scope of their declarations (which are
components of the procedure declaration) is the body of procedure <CODE>P</CODE>.
<P>
Now consider the following productions describing a procedure declaration:
<A NAME="IDX47"></A>
<P>
<PRE>
procedure_declaration: 'procedure' procedure_heading procedure_body.
procedure_heading:
   ProcIdDef formal_parameter_part ';' value_part specification_part.
</PRE>
<P>
Notice that the phrase structure induced by these productions is
inconsistent with the postulated scope rules.
The declaration of <CODE>P</CODE> (<CODE>ProcIdDef</CODE>) is in the same phrase
(<CODE>procedure_heading</CODE>) as the declarations of the formal parameters.
This defect can be remedied by a slight change in the productions:
<P>
<PRE>
procedure_declaration: 'procedure' ProcIdDef ProcRange.
ProcRange:
   formal_parameter_part ';' value_part specification_part procedure_body.
</PRE>
<P>
Here the formal parameters and the body have both been made components of a
single phrase (<CODE>ProcRange</CODE>), which defines the scope of the formal
parameter declarations.
The declaration of <CODE>P</CODE> lies outside of this phrase, thus allowing its
scope to be differentiated from that of the formal parameters.
<P>
<HR size=1 noshade width=600 align=left>
<P>
<IMG SRC="gifs/empty.gif" WIDTH=25 HEIGHT=25 ALT=""><A HREF="syntax_2.html"><IMG SRC="gifs/next.gif" ALT="Next Chapter" BORDER="0"></A>
<IMG SRC="gifs/empty.gif" WIDTH=25 HEIGHT=25 ALT=""><A HREF="syntax_toc.html"><IMG SRC="gifs/up.gif" ALT="Table of Contents" BORDER="0"></A>
<IMG SRC="gifs/empty.gif" WIDTH=25 HEIGHT=25 ALT="">
<HR size=1 noshade width=600 align=left>
</TD>
</TR>
</TABLE>

</BODY></HTML>