File: Order.pod

package info (click to toggle)
libmarpa-r2-perl 12.000000-1
  • links: PTS, VCS
  • area: main
  • in suites: forky, sid
  • size: 6,660 kB
  • sloc: perl: 42,628; ansic: 23,387; sh: 4,363; makefile: 157
file content (301 lines) | stat: -rw-r--r-- 9,652 bytes parent folder | download
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
# Copyright 2022 Jeffrey Kegler
# This file is part of Marpa::R2.  Marpa::R2 is free software: you can
# redistribute it and/or modify it under the terms of the GNU Lesser
# General Public License as published by the Free Software Foundation,
# either version 3 of the License, or (at your option) any later version.
#
# Marpa::R2 is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
# Lesser General Public License for more details.
#
# You should have received a copy of the GNU Lesser
# General Public License along with Marpa::R2.  If not, see
# http://www.gnu.org/licenses/.

=head1 NAME

Marpa::R2::Semantics::Order - How the SLIF ranks ambiguous parses

=head1 Description

Marpa allows ambiguous parses.
While an unambiguous parse can produce at most one parse tree
and one parse result,
an ambiguous parse will produce a parse series.
A parse series is a sequence of parse trees,
each of which will have its own parse result.

This document describes ways of controlling
the order in which
the L<SLIF recognizer's C<value()> method|Marpa::R2::Scanless::R/"value()">
evaluates the parse
trees of an ambiguous parse.
It also describes ways to exclude selected parse trees
from the parse series.

=head2 Default parse order

By calling
the recognizer's
L<C<value()>|Marpa::R2::Scanless::R/"value()">
method
repeatedly,
Marpa can produce all the parse results
in the current parse series.
The default is for the parse results to be returned
in an B<arbitrary parse order>.
This corresponds to the "C<none>" value of
L<the recognizer's C<ranking_method>|Marpa::R2::Scanless::R/"ranking_method">
named argument.

Traversal of the parse trees in
arbitrary parse order
will be always be well-behaved
in the sense
that no two parse trees will be semantic duplicates,
and no unique (semantic non-duplicate)
parse tree will be omitted in it.
No other property of arbitrary parse order is guaranteed.
For example, the order may
change each time
the parse series is traversed.

=head2 Choicepoints

When ranking, the logic traverses each node
of the parse bocage.
In this context, the nodes are also called "choicepoints".
From the point of view of the individual parse trees,
the traversal will be top-down
and left-to-right.

Each choicepoint has one or more "choices".
Often a choicepoint has only a single choice,
in which case the choicepoint is called "trivial".
For two rule instances to be choices of the same
choicepoint,
they must end at the same location,
and their rules must have the same LHS.

The terms "choicepoint" and "choice"
are defined carefully
in L<a separate
document|Marpa::R2::Semantics::Rank>.
That document also gives several examples
of ranking,
which are explained in detail.

=head2 Ranking methods

SLIF recognizer objects have a L<C<ranking_method> named
argument|Marpa::R2::Scanless::R/"ranking_method">,
whose value can be the name of a ranking method,
or "C<none>", indicating that the default ranking method is to
be used.

=head2 The C<rule> ranking method

The rule method ranks alternative parses according to their rule alternatives.
Every rule alternative has a B<numeric rank>.
A rule's rank can be specified using the
the C<rank> adverb
argument for that RHS alternative.
Rule ranks must be integers.
They may be negative.
If no numeric rank is specified, the numeric rank is 0.

=head2 The C<high_rule_only> ranking method

The C<high_rule_only> ranking method is similar to the
C<rule> ranking method, except that, at every choicepoint,
it discards all of the choices which
have a rank lower than that of the highest ranked choice.

The C<high_rule_only> ranking method
can reduce the ambiguity of a parse,
but it does not necessarily do so.
This is because, at each choicepoint among the parse trees,
it is possible that several of the choices,
or all of them, will have the same rank
as the highest ranked choice.

=head2 Rule ranking

At each choicepoint,
the choices
are ranked as follows:

=over

=item * B<Different numeric ranks>:

If the two parse choices have different numeric ranks,
they must also have different rule alternatives.
The parse choice whose rule alternative has the higher numeric rank
will rank high.

=item * B<Same rule alternative>:

If the two parse choices have the same rule alternative,
they rank as described
under L<"Null variant ranking">.

=item * B<Same numeric rank, different rule alternatives>:

Two different rule alternatives can have the same numeric rank.
If the two parse choices are for
rule alternatives that are different,
but that have the same numeric rank,
the relative order of the two parse choices is
arbitrary.

=back

Rule alternatives may be part of a single rule in the DSL --
for example, a
L<prioritized rule|Marpa::R2::Scanless::DSL/"Prioritized rule">.
Lexical order within a DSL rule
makes no difference when ranking rule alternatives.
For example, it makes no difference if two rule alternatives
come from the same prioritized rule;
or from two different prioritized rules.

=head2 Null variant ranking

Some rules have a RHS which contains
B<proper nullables>:
symbols
which may be nulled, but which are not nulling
symbols.
(Nulling symbols are symbols which are B<always> nulled.)

When a rule alternative contains proper nullables,
each instance
of that rule creates a B<nulling variant>.
A B<nulling variant> is
a specific pattern of
null and non-null symbols in a rule instance's RHS.
In many cases, this creates an ambiguity -- different
nulling variants can match the same substring in the input.
In ambiguous parsings of this kind,
some applications may want to rank nulling variants that start
with non-null symbols higher.
Other applications may want to do the opposite --
to rank nulling variants that start
with null symbols higher.

The
L<C<null-ranking> adverb
for RHS alternatives|Marpa::R2::Scanless::DSL/"null-ranking">
specifies which nulling variants are ranked high or low.
If the C<null-ranking> is "C<low>",
then the closer a nulling variant
places its B<visible> (non-null) symbols to the start of the rule instance,
the higher it ranks.
A null ranking of C<low> is the default.
If the C<null-ranking> is "C<high>",
then the closer a nulling variant
places its B<null> symbols to the start of the rule instance,
the higher it ranks.
In ranking nulling variants with more than one proper nullable,
major-to-minor is left-to-right.

=head2 A general approach to sorting parses

The most general way to sort Marpa parses is for the application
to take control.
The application can set up the Marpa semantic actions
so that the parse result of every parse tree is a
C<< <rank, true_value> >> duple.
The duples can then be sorted by C<rank>.
Once the results are sorted,
the C<rank> element of the duple can be discarded.
(Those familiar with the Schwartzian transform
may note a resemblance.
In Perl,
duples can be implemented as references to arrays of 2 elements.)

The user needs to be careful.
In theory, ambiguity can cause an exponential explosion in the number of results.
In practice, ambiguity tends to get out of hand very easily.
Producing and sorting all the parses can take a very
long time.

=head1 Details

This section contains additional explanations, not essential to understanding
the rest of this document.
Often they are formal or mathematical.
While some people find these helpful, others find them distracting,
which is why
they are segregated here.

=head2 Duplicate parses

When evaluating the parse trees in a parse series,
Marpa never evaluates the same parse tree twice.
What this means probably matches the programmer's
intuition of what it should mean.
Marpa considers two parse trees to be the same if they are
B<semantic equivalents>.

Two parse trees are semantic equivalents if
and only if
a recursive, top-down evaluation of each
applies
the same rules
in the same order
at the same G1 locations.
If the semantics are deterministic,
and if two parse trees are
semantic equivalents according to this definition,
the two parse trees will always produce the same parse result.

The two parse trees are called semantic equivalents,
because from the point
of view of a deterministic semantics they are indistinguishable.
When the Marpa documentation refers to duplicate
parses,
unless otherwise stated,
it means that the two
are semantic equivalents.

Formally,
B<semantic equivalence> is defined as follows:
Call the set of parse trees, C<T>.
B<Semantic equivalence> is an equivalence relation
on C<T>.
Call this relation C<~>.
Call C<E>, the quotient set of C<T> by C<~>.
In this document, the term
B<arbitrary parse order>
is used to mean an
arbitrary choice among the relations
which are strict total orders of C<E>.

=head1 Copyright and License

=for Marpa::R2::Display
ignore: 1

  Copyright 2022 Jeffrey Kegler
  This file is part of Marpa::R2.  Marpa::R2 is free software: you can
  redistribute it and/or modify it under the terms of the GNU Lesser
  General Public License as published by the Free Software Foundation,
  either version 3 of the License, or (at your option) any later version.

  Marpa::R2 is distributed in the hope that it will be useful,
  but WITHOUT ANY WARRANTY; without even the implied warranty of
  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
  Lesser General Public License for more details.

  You should have received a copy of the GNU Lesser
  General Public License along with Marpa::R2.  If not, see
  http://www.gnu.org/licenses/.

=for Marpa::R2::Display::End

=cut

# vim: expandtab shiftwidth=4: