File: Semantics.pod

package info (click to toggle)
libmarpa-r2-perl 12.000000-1
  • links: PTS, VCS
  • area: main
  • in suites: forky, sid
  • size: 6,660 kB
  • sloc: perl: 42,628; ansic: 23,387; sh: 4,363; makefile: 157
file content (756 lines) | stat: -rw-r--r-- 23,377 bytes parent folder | download
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
716
717
718
719
720
721
722
723
724
725
726
727
728
729
730
731
732
733
734
735
736
737
738
739
740
741
742
743
744
745
746
747
748
749
750
751
752
753
754
755
756
# Copyright 2022 Jeffrey Kegler
# This file is part of Marpa::R2.  Marpa::R2 is free software: you can
# redistribute it and/or modify it under the terms of the GNU Lesser
# General Public License as published by the Free Software Foundation,
# either version 3 of the License, or (at your option) any later version.
#
# Marpa::R2 is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
# Lesser General Public License for more details.
#
# You should have received a copy of the GNU Lesser
# General Public License along with Marpa::R2.  If not, see
# http://www.gnu.org/licenses/.

=head1 NAME

Marpa::R2::Deprecated::NAIF::Semantics - How the NAIF evaluates parses

=head1 THE NAIF INTERFACE IS DEPRECATED

This document describes the NAIF interface,
which is deprecated.
PLEASE DO NOT USE IT FOR NEW DEVELOPMENT.

=head1 Synopsis

=for Marpa::R2::Display
name: Engine Synopsis Unambiguous Parse
partial: 1
normalize-whitespace: 1

    my $grammar = Marpa::R2::Grammar->new(
        {   start          => 'Expression',
            actions        => 'My_Actions',
            default_action => 'first_arg',
            rules          => [
                { lhs => 'Expression', rhs => [qw/Term/] },
                { lhs => 'Term',       rhs => [qw/Factor/] },
                { lhs => 'Factor',     rhs => [qw/Number/] },
                { lhs => 'Term', rhs => [qw/Term Add Term/], action => 'do_add' },
                {   lhs    => 'Factor',
                    rhs    => [qw/Factor Multiply Factor/],
                    action => 'do_multiply'
                },
            ],
        }
    );

=for Marpa::R2::Display::End

=for Marpa::R2::Display
name: Engine Synopsis Unambiguous Parse
partial: 1
normalize-whitespace: 1

    sub My_Actions::do_add {
        my ( undef, $t1, undef, $t2 ) = @_;
        return $t1 + $t2;
    }

    sub My_Actions::do_multiply {
        my ( undef, $t1, undef, $t2 ) = @_;
        return $t1 * $t2;
    }

    sub My_Actions::first_arg { shift; return shift; }

    my $value_ref = $recce->value;
    my $value = $value_ref ? ${$value_ref} : 'No Parse';

=for Marpa::R2::Display::End

=head1 Overview

This document deals with Marpa's low-level NAIF interface.
If you are new to Marpa,
or are not sure which interface you are interested in,
or do not know what the Named Argment InterFace (NAIF) is,
you probably want to look instead at
L<the document on semantics for the SLIF
interface|Marpa::R2::Semantics>.

The NAIF's semantics
will be
familiar
to those who have used traditional
methods to evaluate parses.
A parse is seen as a parse tree.
Nodes on the tree are evaluated recursively, bottom-up.
Once the values of all its child nodes are known,
a parent node is ready to be evaluated.
The value of a parse is the value of the top node
of the parse tree.

When a Marpa grammar is created,
its semantics
is
specified indirectly, as B<action names>.
To produce the values used in the semantics,
Marpa must do three things:

=over

=item * Determine the action name.

=item * Resolve the action name to an action.

=item * If the action is a rule evaluation closure,
call it to produce the actual result.

=back

An action name is either reserved,
or resolves to a Perl object.
When it resolves to a Perl object,
that object is usually a Perl closure -- code.
If the Perl object action is code,
it is a rule evaluation closure, and will be
called to produce the result.
When a Perl object action is some other kind of object,
its treatment is
L<as described
below|/"Types of Perl actions">

An action name
and action is also used
to create the per-parse-tree variable,
as L<described
below|/"The per-parse-tree variable">.

=head1 Nodes

=head2 Token nodes

For every input token, there is an associated B<token node>.
Token nodes are leaf nodes
in the parse tree.
Tokens always have a B<token symbol>.
At lexing time,
they can be assigned a B<token value>.
If no token value is assigned at lex time,
the value of the token is a Perl C<undef>.

=head2 Rule nodes

If a node is not a token node,
then it is a B<rule node>.
Rule nodes are always
associated with a rule.
It a rule's action is a rule evaluation closure,
it is called at
L<Node Evaluation Time|Marpa::R2::Deprecated::NAIF::Semantics::Phases/"Tree Traversal Phase">.

The rule evaluation closures's
arguments will be a
per-parse-tree variable followed, if the rule is not nulled,
by the values of its child nodes in lexical order.
If the rule is nulled, the child node values will be omitted.
A rule evaluation closure action is always called in scalar context.

If the action is a constant,
it becomes the value of the rule node.
If the action is a rule evaluation closure,
its return value becomes the value of the node.
If there is no action for a
rule node,
the value of the rule node is a Perl C<undef>.

=head2 Sequence rule nodes

Some rules are L<sequence rules|Marpa::R2::Deprecated::NAIF::Grammar/"Sequence rules">.
Sequence rule nodes are also rule nodes.
Everything said above about rule nodes
applies to sequence rule nodes.
Specifically,
the arguments to the value actions for sequence rules
are the
per-parse-tree variable followed by
the values of the child nodes in lexical order.

The difference (and it is a big one)
is that in an ordinary rule, the right hand side
is fixed in length, and that length is known
when you are writing the code for the value action.
In a sequence rule,
the number of right hand side symbols is not known
until node evaluation time.
The rule evaluation closure
of a sequence rule
must be capable of
dealing with
a variable number of arguments.

Sequence semantics work best when
every child node
in the sequence has the same semantics.
When that is not the case,
writing the sequence using
ordinary non-sequence rules should be considered as
an alternative.

By default, if a sequence rule has separators,
the separators are thrown away before
the value action is called.
(Separators are described in
L<the section introducing sequence rules|Marpa::R2::Deprecated::NAIF::Grammar/"Sequence rules">.)
This means that separators do not appear in the C<@_> array
of the rule evaluation closure which is the value action.
If the value of the C<keep> rule property
is a Perl true value, separators are kept,
and do appear in the
value action's
C<@_> array.

=head2 Null nodes

A null node is a special case of a rule node,
one where the rule derives the zero-length,
or empty string.
When the rule node is a null node,
the rule evaluation closure will be called with
no child value arguments.

When a node is nulled,
it must be as a result of a nullable rule,
and the action name and action are those
associated with that rule.
An ambiguity can arise if there is more
than one nullable rule with the same LHS,
but a different action name.
In that case the action name for the null nodes
is that of the empty rule.

The remaining case is that of
a set of nullable rules with the same LHS,
where two or more of the rules have different action names,
but none of the rules in the set is an empty rule.
When this happens, Marpa throws an exception.
To fix the issue,
the user can add an empty rule.
For more details,
see L<the document on
null semantics|Marpa::R2::Deprecated::NAIF::Semantics::Null>.

=head1 Action context

=for Marpa::R2::Display
name: Action context synopsis
normalize-whitespace: 1

    sub do_S {
        my ($action_object) = @_;
        my $rule_id         = $Marpa::R2::Context::rule;
        my $grammar         = $Marpa::R2::Context::grammar;
        my ( $lhs, @rhs ) = $grammar->rule($rule_id);
        $action_object->{text} =
              "rule $rule_id: $lhs ::= "
            . ( join q{ }, @rhs ) . "\n"
            . "locations: "
            . ( join q{-}, Marpa::R2::Context::location() ) . "\n";
        return $action_object;
    } ## end sub do_S

=for Marpa::R2::Display::End

In addition to the per-parse-tree variable
and their child arguments,
rule evaluation closures also have access
to B<context variables>.

=over

=item * C<$Marpa::R2::Context::grammar> is set to
the grammar being parsed.

=item * C<$Marpa::R2::Context::rule> is the ID of the
current rule.
Given the rule ID, an application can find
its LHS and RHS symbols using
L<the grammar's C<rule()> method|Marpa::R2::Deprecated::NAIF::Grammar/"rule()">.

=item * C<Marpa::R2::Context::location()> returns the start
and end locations of the current rule.

=back

=head1 Bailing out of parse evaluation

=for Marpa::R2::Display
name: Semantics bail synopsis
normalize-whitespace: 1

    my $bail_message = "This is a bail out message!";

    sub do_bail_with_message_if_A {
        my ($action_object, $terminal) = @_;
        Marpa::R2::Context::bail($bail_message) if $terminal eq 'A';
    }

    sub do_bail_with_object_if_A {
        my ($action_object, $terminal) = @_;
        Marpa::R2::Context::bail([$bail_message]) if $terminal eq 'A';
    }

=for Marpa::R2::Display::End

The C<Marpa::R2::Context::bail()> static method is used to
"bail out" of the evaluation of a parse tree.
It will cause an exception to be thrown.
If its first and only argument is a reference,
that reference is the exception object.
Otherwise, an exception message is created
by converting the method's arguments to strings,
concatenating them,
and prepending them with a message indicating
the file and line number at which the
C<Marpa::R2::Context::bail()> method was called.

=head1 Parse trees, parse results and parse series

When the semantics are applied to a parse tree,
it produces a value called a B<parse result>.
Because Marpa allows ambiguous parsing,
each parse can produce a B<parse series> --
a series of zero or more parse trees,
each with its own parse result.
The first call to the
L<the recognizer's C<value>
method|Marpa::R2::Deprecated::NAIF::Recognizer/"value()">
after the recognizer is created is the
start of the first parse series.
The first parse series continues until there is
a call to the
L<the C<reset_evaluation>
method|Marpa::R2::Deprecated::NAIF::Recognizer/"reset_evaluation()">
or until the recognizer is destroyed.
Usually, an application is only interested in a single
parse series.

When the
L<C<reset_evaluation>|Marpa::R2::Deprecated::NAIF::Recognizer/"reset_evaluation()">
method
is called
for a recognizer, it begins a new parse series.
The new parse series continues until
there is another
call to the
L<the C<reset_evaluation>
method|Marpa::R2::Deprecated::NAIF::Recognizer/"reset_evaluation()">,
or until the recognizer is destroyed.

Most applications will find that the order in which
Marpa executes its semantics "just works".
L<A separate
document|Marpa::R2::Deprecated::NAIF::Semantics::Phases>
describes that order
in detail.
The details can matter in some applications,
for example, those which exploit side effects.

=head1 Finding the action for a rule

Marpa finds the action for each rule based on
rule and symbol properties and on the grammar named arguments.
Specifically, Marpa attempts the following,
in order:

=over

=item * Resolving an action based on the C<action> property of the rule,
if one is defined.

=item * If the rule is empty,
and the C<default_empty_action> named argument of the grammar
is defined,
resolving an action based on that named argument.

=item * Resolving an action based on
the C<default_action> named argument of the grammar,
if one is defined.

=item * Defaulting to a Perl C<undef> value.

=back

Resolution of action names is L<described
below|/"Resolving action names">.
If the C<action> property,
the C<default_action> named argument,
or the C<default_empty_action> named argument
is defined,
but does not resolve successfully, Marpa
throws an exception.
Marpa prefers to "fast fail" in these cases,
because they usually indicate a mistake
that the application's author
will want to correct.

=head1 Resolving action names

Action names come from these sources:

=over

=item * The C<default_action> named argument of Marpa's grammar.

=item * The C<default_empty_action> named argument of Marpa's grammar.

=item * The C<action> property of Marpa's rules.

=item * The C<new> constructor in the package specified by the
C<action_object> named argument of the Marpa grammar.

=back

=head2 Reserved action names

Action names that begin with a double colon ("C<::>") are reserved.
At present only the C<::undef> reserved action is documented for
use outside of the DSL-based interfaces.

=head3 ::undef

A constant whose value is a Perl C<undef>.
Perl is unable to distinguish reliably between a
non-existent value and scalars with an C<undef> value.
This makes it impossible to reliably distinguish
resolutions
to a Perl C<undef> from resolution problems.
The "C<::undef>" reserved action name should 
be preferred for
indicating a constant whose value is a Perl C<undef>.

=head2 Explicit resolution

The recognizer's C<closures> named argument
allows the user to directly control the mapping from action names
to actions.
The value of the C<closures> named argument
is a reference to a hash whose keys are
action names and whose hash values are references.
Typically (but not always) these will be C<CODE> refs.

If an action name is the key of an entry in the C<closures> hash,
it resolves to the closure referenced by the value part of that hash entry.
Resolution via the C<closures> named argument is
called B<explicit resolution>.

When explicit resolution is the only kind of resolution that is wanted,
it is best to pick a name that is very unlikely to be the name
of a Perl object.
Many of
L<Marpa::HTML>'s action names
are intended for explicit resolution only.
In L<Marpa::HTML> those action names
begin with
an exclamation mark ("!"),
and that convention is recommended.

=head2 Fully qualified action names

If explicit resolution fails,
Marpa transforms the action name into a
B<fully qualified> Perl name.
An action name that
contains a double colon ("C<::>") or a single quote ("C<'>")
is considered to be a fully qualified name.
Any other action name is considered to be a B<bare action name>.

If the action name to be resolved is already a fully qualified name,
it is not further transformed.
It will be resolved in the form it was received,
or not at all.

For bare action names,
Marpa tries to qualify them by adding a package name.
If the C<actions> grammar named argument is defined,
Marpa uses it as the package name.
Otherwise,
if the
C<action_object> grammar named argument is defined,
Marpa uses it as the package name.
Once Marpa has fully qualified the action name,
Marpa looks for a Perl object with that name.

Marpa will not attempt to resolve an action name
that it cannot fully qualify.
This implies that,
for an action name to resolve successfully,
one of these five things must be the case:

=over

=item * The action name is one of the reserved action names.

=item * The action name resolves explicitly.

=item * The action name is fully qualified to begin with.

=item * The C<actions> named argument is defined.

=item * The C<action_object> named argument is defined.

=back

Marpa's philosophy
is to require that the programmer be specific about action names.
This can be an inconvenience, but Marpa prefers this to
silently executing unintended code.

If the user wants to leave the
rule evaluation closures in the C<main> namespace,
she can specify
C<"main">
as the value of the C<actions> named argument.
But
it can be good practice to keep
the rule evaluation closures
in their own namespace,
particularly if the application is not small.

=head2 Types of Perl actions

Actions resolve in three ways:
to reserved actions, to Perl rule evaluation closures and to Perl variable actions.
The following are tried, in order.

=over 4

=item * 
If an action name begins with a double colon ("C<::>"),
it will resolve to a reserved action, or not at all.

=item *
If the fully qualified form of an action name is the name of a Perl
subroutine in the symbol table,
the action name resolves to the Perl subroutine.
That subroutine then becomes a Perl rule evaluation closure.

=item *
If the fully qualified form of an action name is the name of a Perl
scalar variable in the symbol table,
the action name resolves to the Perl variable.
Note that, for this purpose, a Perl reference variable is considered as one type of Perl scalar.

=item *
Other kinds of Perl objects in the symbol table that match the fully qualified name,
such as arrays and hashes, are ignored.
Note that, while resolution to arrays and hashes is not allowed,
resolution to references to arrays and hashes is permitted.

=back

Resolution to a
Perl rule evaluation closure
or to a Perl variable
may came from explicit resolution.
Explicit resolution always takes place via a reference,
and requires an extra level of indirection.
For resolution to a rule evaluation closure,
the closure must be provided in the form of a reference to the closure.
For resolution to a Perl variable,
the variable has to be provided in the form of a reference to the variable.
If the Perl variable is a reference, that means adding another level of
indirection.

=head2 Modifying Perl variable actions

When resolution is to a
Perl variable,
it is possible to modify the value of the variable.
In practice, this will usually be a bad idea.
The Perl variable reference actions should be treated
as read-only constants, and never modified.

This is because multiple resolutions
to a Perl variable will
always point to the same contents.
Any modification to those contents will be
seen by other users of that Perl variable.
In other words, the modification will have global
effect.
For this reason modifying the referents of
reference actions is almost always bad practice
at the least, and is often an error.

For example,
assume that actions are in a package named C<My_Nodes>,
which contains a hash reference named C<empty_hash>, 

=for Marpa::R2::Display
ignore: 1

        package My_Nodes;
        our $empty_hash = {};

=for Marpa::R2::Display::End

It can be tempting, in building objects which are hashes,
to start with a leaf node whose action is C<empty_hash>
and to add contents to it as the object is passed up the evaluation
tree.
But C<$empty_hash> points to a single hash object.
This single hash object will shared by all nodes,
with all nodes seeing each other's changes.
Worse, all Marpa parsers which use the same C<My_Nodes>
namespace will share the same hash object.
An application which needs an action which produces
an empty hash should have the action resolve to a Perl rule
evaluation closure
that returns C<{}>.

=head2 Visibility and resolution

When Perl closures and variables are used for the semantics,
they must be visible in the scope where the semantics are
B<resolved>.
The action names are usually B<specified> with the grammar,
but action B<resolution> takes place
in the recognizer's
L<C<value>|Marpa::R2::Deprecated::NAIF::Recognizer/"value()"> method.
This can sometimes be a source of confusion.
For example, if a Perl closure is visible when the
action is specified,
but goes out of scope before the action name is resolved,
resolution will fail.

=head1 The per-parse-tree variable

In the Tree Setup Phase,
Marpa creates a per-parse-tree variable.
This becomes the first argument of the rule evaluation closures for
the rule nodes.
If the grammar's C<action_object> named argument is not defined,
the per-parse-tree variable is initialized to an empty hash ref.

Most data for
the value actions of the rules
will be passed up the parse tree.
The actions will see the values of the rule node's child nodes
as arguments,
and will return their own value to be seen as an argument
by their parent node.
The per-parse-tree variable can be used for data which does not
conveniently fit this model.

The lifetime of the per-parse-tree variable
extends into the Tree Traversal Phase.
During the Tree Traversal Phase,
Marpa's internals never alter the per-parse-tree variable --
it is reserved for use by the application.

=head2 Action object constructor

If the grammar's C<action_object> named argument has a defined value,
that value is treated as the name of a class.
The B<action object constructor> is
the C<new> method
in the C<action_object> class.

The action object constructor is called
in the Tree Setup Phase.
The return value of the
action object constructor becomes the per-parse-tree variable.
It is a fatal error if the
grammar's C<action_object> named argument is defined,
but does not name a class with a C<new> method.

Resolution of the action object constructor is
resolution of the B<action object constructor name>.
The action object constructor name is
formed by concatenating
the literal string "C<::new>" to
the value of the C<action_object> named argument.

All standard rules apply when resolving the action
object constructor name.
In particular, bypass via
explicit resolution applies to
the action object constructor.
If the action object constructor name is
a hash key in the
evaluator's C<closures> named argument,
then
the value referred to by
that hash entry becomes the
action object constructor.

If a grammar has both the C<actions> and the
C<action_object> named arguments defined,
all action names B<except>
for the action object constructor will be
resolved in the C<actions> package or not at all.
The action object constructor is always in
the C<action_object> class, if it is anywhere.

=head1 Parse order

If a parse is ambiguous, all parses are returned,
with no duplication.
By default, the order is arbitrary, but
it is also possible to control the order.
Details are in L<the document
on parse order|Marpa::R2::Deprecated::NAIF::Semantics::Order>.

=head1 Infinite loops

Grammars with infinite loops (cycles)
are generally regarded as useless in practical applications,
but Marpa allows them.
Marpa can accurately
claim to support B<every grammar> that can be written in BNF.

If a grammar with cycles is ambiguous,
it can produce cycle-free parses
and parses with finite-length cycles,
as well as parses with infinite length cycles.
Marpa will parse with grammars that contain cycles,
and Marpa's evaluator will iterate through
the values from the grammar's
cycle-free parses.
For those who want to know more,
the details are in a L<separate
document|Marpa::R2::Deprecated::NAIF::Semantics::Infinite>.

=head1 Copyright and License

=for Marpa::R2::Display
ignore: 1

  Copyright 2022 Jeffrey Kegler
  This file is part of Marpa::R2.  Marpa::R2 is free software: you can
  redistribute it and/or modify it under the terms of the GNU Lesser
  General Public License as published by the Free Software Foundation,
  either version 3 of the License, or (at your option) any later version.

  Marpa::R2 is distributed in the hope that it will be useful,
  but WITHOUT ANY WARRANTY; without even the implied warranty of
  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
  Lesser General Public License for more details.

  You should have received a copy of the GNU Lesser
  General Public License along with Marpa::R2.  If not, see
  http://www.gnu.org/licenses/.

=for Marpa::R2::Display::End

=cut

# Local Variables:
#   mode: cperl
#   cperl-indent-level: 4
#   fill-column: 100
# End:
# vim: expandtab shiftwidth=4: