File: Glade.pod

package info (click to toggle)
libmarpa-r2-perl 2.086000~dfsg-6
  • links: PTS, VCS
  • area: main
  • in suites: buster, stretch
  • size: 7,944 kB
  • ctags: 4,367
  • sloc: perl: 38,955; ansic: 19,252; sh: 11,611; makefile: 428
file content (1047 lines) | stat: -rw-r--r-- 35,890 bytes parent folder | download | duplicates (6)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
716
717
718
719
720
721
722
723
724
725
726
727
728
729
730
731
732
733
734
735
736
737
738
739
740
741
742
743
744
745
746
747
748
749
750
751
752
753
754
755
756
757
758
759
760
761
762
763
764
765
766
767
768
769
770
771
772
773
774
775
776
777
778
779
780
781
782
783
784
785
786
787
788
789
790
791
792
793
794
795
796
797
798
799
800
801
802
803
804
805
806
807
808
809
810
811
812
813
814
815
816
817
818
819
820
821
822
823
824
825
826
827
828
829
830
831
832
833
834
835
836
837
838
839
840
841
842
843
844
845
846
847
848
849
850
851
852
853
854
855
856
857
858
859
860
861
862
863
864
865
866
867
868
869
870
871
872
873
874
875
876
877
878
879
880
881
882
883
884
885
886
887
888
889
890
891
892
893
894
895
896
897
898
899
900
901
902
903
904
905
906
907
908
909
910
911
912
913
914
915
916
917
918
919
920
921
922
923
924
925
926
927
928
929
930
931
932
933
934
935
936
937
938
939
940
941
942
943
944
945
946
947
948
949
950
951
952
953
954
955
956
957
958
959
960
961
962
963
964
965
966
967
968
969
970
971
972
973
974
975
976
977
978
979
980
981
982
983
984
985
986
987
988
989
990
991
992
993
994
995
996
997
998
999
1000
1001
1002
1003
1004
1005
1006
1007
1008
1009
1010
1011
1012
1013
1014
1015
1016
1017
1018
1019
1020
1021
1022
1023
1024
1025
1026
1027
1028
1029
1030
1031
1032
1033
1034
1035
1036
1037
1038
1039
1040
1041
1042
1043
1044
1045
1046
1047
# Copyright 2014 Jeffrey Kegler
# This file is part of Marpa::R2.  Marpa::R2 is free software: you can
# redistribute it and/or modify it under the terms of the GNU Lesser
# General Public License as published by the Free Software Foundation,
# either version 3 of the License, or (at your option) any later version.
#
# Marpa::R2 is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
# Lesser General Public License for more details.
#
# You should have received a copy of the GNU Lesser
# General Public License along with Marpa::R2.  If not, see
# http://www.gnu.org/licenses/.

=head1 Name

Marpa::R2::Glade - Low-level interface to Marpa's Abstract Syntax Forests (ASF's)

=head1 Synopsis

=for Marpa::R2::Display
name: ASF low-level calls synopsis, code part 1
normalize-whitespace: 1

  my $grammar = Marpa::R2::Scanless::G->new(
      {   source => \(<<'END_OF_SOURCE'),
  :start ::= pair
  pair ::= duple | item item
  duple ::= item item
  item ::= Hesperus | Phosphorus
  Hesperus ::= 'a'
  Phosphorus ::= 'a'
  END_OF_SOURCE
      }
  );

  my $slr = Marpa::R2::Scanless::R->new( { grammar => $grammar } );
  $slr->read( \'aa' );
  my $asf = Marpa::R2::ASF->new( { slr => $slr } );
  die 'No ASF' if not defined $asf;
  my $output_as_array = asf_to_basic_tree($asf);
  my $actual_output   = array_display($output_as_array);

=for Marpa::R2::Display::End

The code for C<asf_to_basic_tree()> represents a user-supplied call
using the interface described below.  An full example of C<ast_to_basic_tree()>,
which constructs a Perl array "tree",
is given
L<below|/"The code for ast_to_basic_tree()">.
C<array_display()> displays the tree in a compact form.
The code for it is also given L<below|/"The code for array_display()">.
The return value of C<array_display()> is as follows:

=for Marpa::R2::Display
name: ASF low-level calls synopsis, output
remove-display-indent: 1
remove-blank-last-line: 1

    Glade 2 has 2 symches
      Glade 2, Symch 0, pair ::= duple
          Glade 6, duple ::= item item
              Glade 8 has 2 symches
                Glade 8, Symch 0, item ::= Hesperus
                    Glade 13, Hesperus ::= 'a'
                        Glade 15, Symbol 'a': "a"
                Glade 8, Symch 1, item ::= Phosphorus
                    Glade 1, Phosphorus ::= 'a'
                        Glade 17, Symbol 'a': "a"
              Glade 7 has 2 symches
                Glade 7, Symch 0, item ::= Hesperus
                    Glade 22, Hesperus ::= 'a'
                        Glade 24, Symbol 'a': "a"
                Glade 7, Symch 1, item ::= Phosphorus
                    Glade 9, Phosphorus ::= 'a'
                        Glade 26, Symbol 'a': "a"
      Glade 2, Symch 1, pair ::= item item
          Glade 8 revisited
          Glade 7 revisited

=for Marpa::R2::Display::End

=head1 This INTERFACE is ALPHA and EXPERIMENTAL

The interface described in this document is very much a work in progress.
It is alpha and experimental.
The bad side of this is that it is subject to radical change without notice.
The good side is that field is 100% open for users
to have feedback into the final interface.

=head1 About this document

This document describes the low-level interface to Marpa's abstract syntax forests (ASF's).
It assumes that you are already familiar with L<the high-level interface|Marpa::R2::ASF>.
This low-level interface allows the maximum flexiblity in building the forest,
but requires the application to do much of the work.

=head1 Ambiguity: factoring versus symches

An abstract syntax forest (ASF) is similar to an abstract syntax tree (AST), but it
has an additional ability -- it can represent an ambiguous parse.
Ambiguity in a parse can come in two forms, and Marpa's ASF's treat the
distinction as important.  An ambiguity can be a symbolic choice
(a symch), or a factoring.  Symbolic choices are the kind of ambiguity
that springs first to mind -- a choice between rules, or a choice
between a rule and token.  Factorings involve only one rule,
but the RHS symbols of that rule divide the input up ("factor it") in different ways.
I'll give examples below.

Symches and factorings are treated separately,
because they behave very differently:

=over 4

=item * Symches are less common than factorings.

=item * Factorings are frequently not of interest; symches are almost
always of major interest.

=item * Symches usually have just a few alternatives; the
possible number of factorings easily grows into the thousands.

=item * In the worst case, the number of symches is a constant
that depends on size of the grammar.
In the worst case,
the number of factorings
grows exponentially with the length of the string being factored.

=item * The constant limiting the number of symches will
almost always be of manageable size.
The number of factorings can grow without limit.

=back

=head2 An example of a symch

Hesperus is Venus's traditional name as an evening star,
and Phosphorus (aka Lucifer) is its traditional name as a morning star.
For the grammar,

=for Marpa::R2::Display
name: ASF symch dump example grammar
remove-display-indent: 1
remove-blank-last-line: 1

    :start ::= planet
    planet ::= hesperus
    planet ::= phosphorus
    hesperus ::= venus
    phosphorus ::= venus
    venus ~ 'venus'

=for Marpa::R2::Display::End

and the input string 'C<venus>', the forest would look like

=for Marpa::R2::Display
name: ASF symch dump example output
remove-blank-last-line: 1
remove-display-indent: 1

    Symbol #0 planet has 2 symches
      Symch #0.0
      GL2 Rule 0: planet ::= hesperus
        GL3 Rule 2: hesperus ::= venus
          GL4 Symbol venus: "venus"
      Symch #0.1
      GL2 Rule 1: planet ::= phosphorus
        GL5 Rule 3: phosphorus ::= venus
          GL6 Symbol venus: "venus"

=for Marpa::R2::Display::End

Notice the tags of the form "C<GLn>", where I<n> is an integer.
These identify the glade.
Glades will be described in detail below.

The rules allow the string 'C<venus>' to be parsed
as either one of two planets: 'C<hesperus>'
or 'C<phosphorus>',
depending on whether rule 0 or rule 1 is used.
The choice, at glade 2, between rules 0 and 1,
is a symch.

=head2 An example of a factoring

For the grammar,

=for Marpa::R2::Display
name: ASF factoring dump example grammar
normalize-whitespace: 1
partial: 1

    :start ::= top
    top ::= b b
    b ::= a a
    b ::= a
    a ~ 'a'

=for Marpa::R2::Display::End

and the input 'C<aaa>',
a successful parse will always have two C<b>'s.
Of these two C<b>'s one will always be short, deriving
a string of length 1:
'C<a>'.
The other will always be long, deriving a string of length 2:
'C<aa>'.
But they can be in either order,
which means that the two C<b>'s can divide up the input stream in
two different ways: long string first; or short string first.

These two different ways of dividing the input stream using the rule

=for Marpa::R2::Display
name: ASF factoring dump example grammar
normalize-whitespace: 1
partial: 1

    top ::= b b

=for Marpa::R2::Display::End

are called a B<factoring>.  Here's Marpa's dump of the forest:

=for Marpa::R2::Display
name: ASF factoring dump example output
remove-blank-last-line: 1
remove-display-indent: 1

    GL2 Rule 0: top ::= b b
      Factoring #0
        GL3 Rule 2: b ::= a
          GL4 Symbol a: "a"
        GL5 Rule 1: b ::= a a
          GL6 Symbol a: "a"
          GL7 Symbol a: "a"
      Factoring #1
        GL8 Rule 1: b ::= a a
          GL9 Symbol a: "a"
          GL10 Symbol a: "a"
        GL11 Rule 2: b ::= a
          GL12 Symbol a: "a"

=for Marpa::R2::Display::End

=head1 The structure of a forest

An ASF can be pictured as a forest on a mountain.
This mountain forest has glades,
and there are paths between the glades.
The term "glade" comes from the idea of a glade as a distinct place in a forest that is
open to light.

The paths between glades have a direction -- they are always thought of
as running one-way: downhill.
If a path connects two glades, the one uphill is called an upglade and the
one downhill is called a downglade.

There is a glade at the top of mountain called the "peak".
The peak has no upglades.

=head1 The glade hierarchy

Every glade has the same internal structure, which is this hierarchy:

=over 4

=item * Glades contain symches.  A symch is either for a rule or for a token.

=item * Rule symches contain factorings.

=item * Factorings contain factors.

=item * A factor is the uphill end of a path which leads to a downglade.
That downglade will contain a glade hierarchy of its own.

=back

=head2 Glades

Each glade node represents an instance of a symbol in one of the possible parse trees.
This means that each glade has a symbol (called the "glade symbol"),
and an "input span".
An input span is an input start location, and a length in characters.
Because it has a start location and a length,
a span also specifies an end location in the input.

=head2 Symches

Every glade contains one or more symches.
If a glade has only one symch, that symch is said to be B<trivial>.
A symch is either a token symch or a rule symch.
For a token symch, the glade symbol is the token symbol.
For a rule symch, the glade symbol is the LHS of the rule.

At most one of the symches in a glade can be a token symch.
There can, however, be many rule symches in a glade --
one for every rule with the glade symbol on its LHS.

=head2 Factorings

Each rule symch contains one or more factorings.
A factoring is a way of dividing up the input span of the glade among its RHS symbols,
which in this context are called B<factors>.
If a rule symch has only one factoring, that factoring is said to be B<trivial>.
A token symch contains no factorings, which means that
token symches are the B<terminals> of an ASF.

Because the number of factorings can get out of hand,
factorings may be omitted.
A symch which omits factorings is said to be B<truncated>.
By default, every symch is truncated down to its first 42 factorings.

=head2 Factors

Every factoring has one or more factors.
Each "factor" corresponds to a symbol instance on the RHS of the rule.
Each such RHS factor is also a downglade, one which contains its own
symches.

=head1 The glade ID

Each glade has a glade ID.
This can be relied on to be a non-negative integer.
A glade ID may be zero.
Glade ID's are obtained from the L</"peak()">
and L</"factoring_downglades()"> methods.

=head1 Techniques for traversing ASF's

=head2 Memoization

When traversing a forest, you should take steps to avoid
traversing the same glades twice.
You can do this by memoizing the result of each glade, perhaps
using its glade ID to index an array.
When a glade is visited, the array can be checked to see if its result has
been memoized.
If so, the memoized result should be used.

This memoization eliminates the need to revisit the downglades
of an already visited glade.
It does not eliminate multiple visits to a glade, but it does
eliminate retraversal of the glades downhill from it.
In practice, the improvement in speed can be stunning.
It will often be the difference between 
an program which is unuseably slow even for very small inputs,
and one which is extremely fast even for large inputs.

Repeated subtraversals happen when two glades share the same downglades,
something that occurs frequently in ASF's.
Additionally, some day the SLIF may allow cycles.
Memoization will prevent a cycle form causing an infinite loop.

The example in this POD includes a memoization scheme which is very simple,
but adequate for most purposes.
The main logic of its memoization is shown here.

=for Marpa::R2::Display
name: ASF low-level calls synopsis, code part 2
normalize-whitespace: 1
partial: 1

        my ( $asf, $glade, $seen ) = @_;
        return bless ["Glade $glade revisited"], 'My_Revisit'
            if $seen->[$glade];
        $seen->[$glade] = 1;

=for Marpa::R2::Display::End

Putting memoization in one of the very first drafts
of your code
will save you time and trouble.

=head1 Forest method

=head2 peak()

=for Marpa::R2::Display
name: ASF low-level calls synopsis, code part 2
normalize-whitespace: 1
partial: 1

    my $peak = $asf->peak();

=for Marpa::R2::Display::End

Returns the glade ID of the peak.
This may be zero.
All failures are thrown as exceptions.

=head1 Glade methods

=head2 glade_literal()

=for Marpa::R2::Display
name: ASF low-level calls synopsis, code part 2
normalize-whitespace: 1
partial: 1

        my $literal = $asf->glade_literal($glade);

=for Marpa::R2::Display::End

Returns the literal substring of the input associated
with the glade.
Every glade is associated with a span -- a start location in the input,
and a length.
On failure, throws an exception.

The literal is determined by the range.
This works as expected if your application reads the input characters 
one-by-one in order.
(We will call applications which read in this fashion, B<monotonic>.)
Most applications are monotonic,
and yours is, unless you've taken special pains to make it otherwise.
Computation of literal substrings for non-monotonic applications
is addressed in L<Marpa::R2::Scanless::R/"Literals and G1 spans">.

=head2 glade_span()

=for Marpa::R2::Display
name: glade_span() example
normalize-whitespace: 1

    my ( $glade_start, $glade_length ) = $asf->glade_span($glade_id);

=for Marpa::R2::Display::End

Returns the span of the input associated
with the glade.
Every glade is associated with a span -- a start location in the input,
and a length.
On failure, throws an exception.

The span will be as expected if your application reads the input characters 
one-by-one in order.
(We will call applications which read in this fashion, B<monotonic>.)
Most applications are monotonic,
and yours is, unless you've taken special pains to make it otherwise.
Computation of literal substrings for non-monotonic applications
is addressed in L<Marpa::R2::Scanless::R/"Literals and G1 spans">.

=head2 glade_symch_count()

=for Marpa::R2::Display
name: ASF low-level calls synopsis, code part 2
normalize-whitespace: 1
partial: 1

    my $symch_count = $asf->glade_symch_count($glade);

=for Marpa::R2::Display::End

Requires a glade ID as its only argument.
Returns the number of symches contained in the glade specified by the argument.
On failure, throws an exception.

=head2 glade_symbol_id()

=for Marpa::R2::Display
name: ASF low-level calls synopsis, code part 2
normalize-whitespace: 1
partial: 1

    my $symbol_id    = $asf->glade_symbol_id($glade);
    my $display_form = $grammar->symbol_display_form($symbol_id);

=for Marpa::R2::Display::End

Requires a glade ID as its only argument.
Returns the symbol ID of the "glade symbol" for the glade specified by the argument.
On failure, throws an exception.

=head1 Symch methods

=head2 symch_rule_id()

=for Marpa::R2::Display
name: ASF low-level calls synopsis, code part 2
normalize-whitespace: 1
partial: 1

    my $rule_id = $asf->symch_rule_id( $glade, $symch_ix );

=for Marpa::R2::Display::End

Requires two arguments: a glade ID and a zero-based symch index.
These specify a symch.
If the symch specified is a rule symch, returns the rule ID.
If it is a token symch, returns -1.

Returns a Perl undef, if the glade exists, but the symch index is too high.
On other failure, throws an exception.

=head2 symch_is_truncated()

[ To be written. ]

=head2 symch_factoring_count()

=for Marpa::R2::Display
name: ASF low-level calls synopsis, code part 2
normalize-whitespace: 1
partial: 1

    my $factoring_count =
        $asf->symch_factoring_count( $glade, $symch_ix );

=for Marpa::R2::Display::End

Requires two arguments: a glade ID and a zero-based symch index.
These specify a symch.
Returns the count of factorings if the specified symch is a rule symch.
This count will always be one or greater.
Returns zero if the specified symch is a token symch.

Returns a Perl undef, if the glade exists, but the symch index is too high.
On other failure, throws an exception.

=head1 Factoring methods

=head2 factoring_downglades()

=for Marpa::R2::Display
name: ASF low-level calls synopsis, code part 2
normalize-whitespace: 1
partial: 1

    my $downglades =
        $asf->factoring_downglades( $glade, $symch_ix,
        $factoring_ix );

=for Marpa::R2::Display::End

Requires three arguments: a glade ID,
the zero-based index of a symch
and the zero-based index of a factoring.
These specify a factoring.
On success, returns a reference to an array.
The array contains the glade IDs of the the downglades in
the factoring specified.

Returns a Perl undef, if the glade and symch exist, but the factoring index is too high.
On other failure, throws an exception.
In particular, exceptions are thrown if the symch is for a token;
and if the glade exists, but the symch index is too high.

=head1 Methods for reporting ambiguity

=for Marpa::R2::Display
name: ASF ambiguity reporting
normalize-whitespace: 1
partial: 1

    if ( $recce->ambiguity_metric() > 1 ) {
        my $asf = Marpa::R2::ASF->new( { slr => $recce } );
        die 'No ASF' if not defined $asf;
        my $ambiguities = Marpa::R2::Internal::ASF::ambiguities($asf);

        # Only report the first two
        my @ambiguities = grep {defined} @{$ambiguities}[ 0 .. 1 ];

        $actual_value = 'Application grammar is ambiguous';
        $actual_result =
            Marpa::R2::Internal::ASF::ambiguities_show( $asf, \@ambiguities );
        last PROCESSING;
    } ## end if ( $recce->ambiguity_metric() > 1 )

=for Marpa::R2::Display::End

=head2 ambiguities()

=for Marpa::R2::Display
name: ASF ambiguity reporting
normalize-whitespace: 1
partial: 1

    my $ambiguities = Marpa::R2::Internal::ASF::ambiguities($asf);

=for Marpa::R2::Display::End

Returns a reference to an array of ambiguity reports in the ASF.
The first and only argument must be an ASF object.
The array returned will be be zero length if the parse was not ambiguous.
Ambiguity reports are as L<described below|"Ambiguity reports">.

While the C<ambiguities()> method can be called to determine whether
or not ambiguities exist, it is the more expensive way to do it.
The L<$slr-E<gt>ambiguity_metric() method|Marpa::R2::Scanless::R/"ambiguity_metric()">
tests an already-existing boolean and is therefore extremely fast.
If you are simply testing for ambiguity,
or if you can save time when you know that a parse is unambiguous,
you will usually want to test for ambiguity with the C<ambiguity_metric()> method
before calling the C<ambiguities()> method.

=head2 ambiguities_show()

=for Marpa::R2::Display
name: ASF ambiguity reporting
normalize-whitespace: 1
partial: 1

  $actual_result =
    Marpa::R2::Internal::ASF::ambiguities_show( $asf, \@ambiguities );

=for Marpa::R2::Display::End

Returns a string which contains a description of the ambiguities in its arguments.
Takes two arguments, both required.
The first is an ASF, and the second is a reference to an array of ambiguities,
in the format returned by L<the ambiguities() method|/"ambiguities()">.

Major applications will often have their own 
customized ambiguity formatting routine, one which can formulate
error messages based, not just on the names of the rules and symbols,
but on knowledge of the role that
the rules and symbols play in
the application.
This method is intended for applications which do not have
their own customized ambiguity handling.
For those which do, it can be used
as a fallback for handling those reports that the customized method does not recognize
or that do not need special handling.
The format of the returned string is subject to change.

=head1 Ambiguity reports

The ambiguity reports returned by the L<C<ambiguities()> method|/"ambiguities()">
are of two kinds: symch reports and factoring reports.

=head2 Symch reports

A symch report is issued whenever, in a top-down traversal of the ASF,
an non-trivial symch is encountered.
A symch report takes the form

=for Marpa::R2::Display
ignore: 1

   [ 'symch', $glade ]

=for Marpa::R2::Display::End

where C<$glade> is the ID of the glade with the symch ambiguity.
With this and the accessor methods in this document, an application can report full
details of the symch ambiguity.

Typically, when there is more than one kind of ambiguity in an input span, only
one is of real interest.
Symch ambiguities are usually of more interest than factorings.
And if one ambiguity is uphill from another, the downhill ambiguity is usually
a side effect of the uphill one and
of little interest.

Accordingly, if a glade has both a symch ambiguity and a factoring ambiguity,
only the symch ambiguity is reported.
And if two ambiguities in the ASF overlap, only the one closest to the peak is reported.

=head2 Factoring reports

A symch report is issued whenever, in a top-down traversal of the ASF,
an sequence of symbols is found which has more than one factoring.
Factoring reports are specific -- they identify not just rules,
but the specific sequences within the RHS
which are differently factored -- B<multifactored stretches>.
Sequence rules especially have long stretches where the symbols are in sync
with each other, broken by other stretches where they are out of sync.
Marpa reports each of the ambiguous stretches.
(A detailed definition of multifactored stretches is L<below|/"Multifactored stretches">.)

A factoring report takes the form

=for Marpa::R2::Display
ignore: 1

    [ 'factoring', $glade, $symch_ix, $factor_ix1, $factoring_ix2, $factor_ix2 ];

=for Marpa::R2::Display::End

where C<$glade> is the ID of the glade with the factoring ambiguity,
and C<$symch_ix> is the index of the symch involved.
The multifactored stretch is described by
two "identifying factors".
Both factors are at the beginning of the stretch,
and therefore have the same input start location.
They differ in length.

The first of the two identifying factors has factoring index of 0,
and its factor index is C<$factor_ix1>.
The second identifying factor has a factoring index of C<$factoring_ix2>,
and its factor index is C<$factor_ix2>.

The identifying factors will usually be enough
for error reporting, which is the usual application of these reports.
Full details of the stretch are not given because they can be extremely large;
are usually not of interest;
and can be determined by following up on the information in the factoring report
using the accessor methods described in this document.

Ambiguities in rules and symbols downhill from an ambiguously factored stretch
are not reported.
If a glade has both a symch ambiguity and a factoring ambiguity,
only the symch ambiguity is reported.

=head1 The code for the synopsis

=head2 The asf_to_basic_tree() code

=for Marpa::R2::Display
name: ASF low-level calls synopsis, code part 2
normalize-whitespace: 1

  sub asf_to_basic_tree {
      my ( $asf, $glade ) = @_;
      my $peak = $asf->peak();
      return glade_to_basic_tree( $asf, $peak, [] );
  } ## end sub asf_to_basic_tree

  sub glade_to_basic_tree {
      my ( $asf, $glade, $seen ) = @_;
      return bless ["Glade $glade revisited"], 'My_Revisit'
          if $seen->[$glade];
      $seen->[$glade] = 1;
      my $grammar     = $asf->grammar();
      my @symches     = ();
      my $symch_count = $asf->glade_symch_count($glade);
      SYMCH: for ( my $symch_ix = 0; $symch_ix < $symch_count; $symch_ix++ ) {
          my $rule_id = $asf->symch_rule_id( $glade, $symch_ix );
          if ( $rule_id < 0 ) {
              my $literal      = $asf->glade_literal($glade);
              my $symbol_id    = $asf->glade_symbol_id($glade);
              my $display_form = $grammar->symbol_display_form($symbol_id);
              push @symches,
                  bless [qq{Glade $glade, Symbol $display_form: "$literal"}],
                  'My_Token';
              next SYMCH;
          } ## end if ( $rule_id < 0 )

          # ignore any truncation of the factorings
          my $factoring_count =
              $asf->symch_factoring_count( $glade, $symch_ix );
          my @symch_description = ("Glade $glade");
          push @symch_description, "Symch $symch_ix" if $symch_count > 1;
          push @symch_description, $grammar->rule_show($rule_id);
          my $symch_description = join q{, }, @symch_description;

          my @factorings = ($symch_description);
          for (
              my $factoring_ix = 0;
              $factoring_ix < $factoring_count;
              $factoring_ix++
              )
          {
              my $downglades =
                  $asf->factoring_downglades( $glade, $symch_ix,
                  $factoring_ix );
              push @factorings,
                  bless [ map { glade_to_basic_tree( $asf, $_, $seen ) }
                      @{$downglades} ], 'My_Rule';
          } ## end for ( my $factoring_ix = 0; $factoring_ix < $factoring_count...)
          if ( $factoring_count > 1 ) {
              push @symches,
                  bless [
                  "Glade $glade, symch $symch_ix has $factoring_count factorings",
                  @factorings
                  ],
                  'My_Factorings';
              next SYMCH;
          } ## end if ( $factoring_count > 1 )
          push @symches, bless [ @factorings[ 0, 1 ] ], 'My_Factorings';
      } ## end SYMCH: for ( my $symch_ix = 0; $symch_ix < $symch_count; ...)
      return bless [ "Glade $glade has $symch_count symches", @symches ],
          'My_Symches'
          if $symch_count > 1;
      return $symches[0];
  } ## end sub glade_to_basic_tree

=for Marpa::R2::Display::End

=head2 The array_display() code

Because of the blessings in this example,
a standard dump of the output array is
too cluttered for comfortable reading.
The following code displays the output from
C<asf_to_basic_tree()> in a more compact form.
Note that this code makes no use of Marpa, 
and works for all Perl arrays.
It is included for completeness,
and as a simple example of array traversal.

=for Marpa::R2::Display
name: ASF low-level calls synopsis, code part 3
normalize-whitespace: 1

    sub array_display {
        my ($array) = @_;
        my ( undef, @lines ) = @{ array_lines_display($array) };
        my $text = q{};
        for my $line (@lines) {
            my ( $indent, $body ) = @{$line};
            $indent -= 6;
            $text .= ( q{ } x $indent ) . $body . "\n";
        }
        return $text;
    } ## end sub array_display

    sub array_lines_display {
        my ($array) = @_;
        my $reftype = Scalar::Util::reftype($array) // '!undef!';
        return [ [ 0, $array ] ] if $reftype ne 'ARRAY';
        my @lines = ();
        ELEMENT: for my $element ( @{$array} ) {
            for my $line ( @{ array_lines_display($element) } ) {
                my ( $indent, $body ) = @{$line};
                push @lines, [ $indent + 2, $body ];
            }
        } ## end ELEMENT: for my $element ( @{$array} )
        return \@lines;
    } ## end sub array_lines_display

=for Marpa::R2::Display::End

=head1 Details

This section contains some elaborations of the above,
some of them in mathematical terms.
These details are segregated because they are not essential to using this interface,
and while some readers find them more helpful than distracting,
for many others it is the reverse.

=head2 An alternative way of defining glade terminology

Here's a way of defining some of the above terms which is less intuitive,
but more precise.
First, define the B<glade length> from glades A to glade B in an ASF as 
the number of glades on the shortest path from A to B, not including glade A.
(Recall that paths are directional.)
If there is no path between glades A and B, the glade length is undefined.
Glade B is a B<downglade> of glade A,
and glade A is an B<upglade> of glade B,
if and only if the glade length from A to B is 1.

A glade A is B<uphill> with respect to glade B,
and a glade B is B<downhill> with respect to glade A,
if and only if the glade length from A to B is defined.

A B<peak> of an ASF is a node without upglades.
By construction of the ASF, there is only one peak.
A glade with a token symch is B<trivial> if it has no rule symches.
A glade without a token symch is B<trivial> if it has exactly one downglade.

The B<distance-to-peak> of a glade C<A> is the glade length from the peak to glade C<A>.
Glade C<A> is said to have a higher B<altitude> than glade C<B> if the distance-to-peak of glade C<A>
is less than that of glade C<B>.
Glade C<A> has a lower B<altitude> than glade C<B> if the distance-to-peak of glade C<A>
is greater than that of glade C<B>.
Glade C<A> has the same B<altitude> as glade C<B> if the distance-to-peak of glade C<A>
is equal to that of glade C<B>.

=head2 Cycles

In the current SLIF implementation, a forest is a directed acyclic graph (DAG).
(In the mathematical literature a DAG is also called a "tree", but that use is confusing
in the present context.)
The underlying Marpa algorithm allows parse trees with cycles,
and someday the SLIF probably will as well.
When that happens, ASF's will no longer be "acyclic" and therefore will no
longer be DAG's.
This document talks about ASF's as if that day had already come --
it assumes that the ASF's might
contain cycles.

In an ASF that contains one or more cycles,
the concepts of uphill and downhill become much less useful
for describing the relative positions of glades.
For example, if glade A cycles back to itself through glade B, then

=over 4

=item * Glade A will be uphill from glade B, and

=item * Glade B will be uphill from glade A; so that

=item * Glade B will be downhill from glade A, and

=item * Glade A will be downhill from glade B; and

=item * Glade A will be both downhill and uphill from itself; and

=item * Glade B will be both downhill and uphill from itself.

=back

ASF's will always be constructed so that the peak has no upglades.
Because of this, the peak can never be part of a cycle.
This means that altitude will always be well defined in the sense that,
for any two glades C<A> and C<B>, one and only one of the following statements will be true:

=over 4

=item * Glade C<A> is lower in altitude than glade C<B>.

=item * Glade C<A> is higher in altitude than glade C<B>.

=item * Glade C<A> is equal in altitude to glade C<b>.

=back

=head2 Token symches

In the current SLIF implementation, a symbol is always either a token
or the LHS of a rule.
This means that any glade that contains a token symch
cannot contain any rule symches.
It also means that any glade that contains a rule symch will not
contain a token symch.

However, the underlying Marpa algorithm allows LHS terminals,
and someday the SLIF probably will as well.
This document is written as if that day has already come,
and describes
glades as if they could contain both rule symches and a token symch.

=head2 Maximum symches per glade

Above, the point is made that the number of symches in a glade,
even in the worst case, is a very manageable number.
For a particular case, it is not hard to work out the exact maximum.
Here are the details.

There can be at most one token symch.
There can be only rule symch for every rule.
In addition, all rules in a glade must have the glade symbol as their LHS.
Let the number of rules with the glade symbol on their LHS be C<r>.
The maximum number of symches in a glade is C<r+1>.

=head2 Multifactored stretches

Marpa locates factoring ambiguities, not just by rule, but by RHS symbol.
It finds B<multifactored stretches>, input spans where a sequence of symbols
within the RHS of a rule have multiple factorings.
A multifactored stretch will sometimes encompass the entire RHS of a rule.
In other cases, the RHS of a single rule might contain many multifactored stretches.
This is often the case with sequence rules.
Sequence rules can have a very long RHS,
and in those situations narrowing down factoring ambiguities to specific input spans is 
necessary for precise error reporting.

The main body of this document worked with an intuitive "know one when I see one"
idea of multifactored stretches.
The exact definition follows.
First we will need a series of preliminary definitions.


Consider the case of a arbitrary rule symch.
Intuitively, a B<factoring position> is a location within the factors of one of the factorings
of that symch.
It can be seen as a duple C<E<lt>factoring_ix, factor_ixE<gt>> where 
C<E<lt>factoring_ixE<gt>> is the index of a factoring within the symch,
and 
C<E<lt>factor_ixE<gt>> is the index of one of the factors of the factoring.

Let C<SP> be a function that maps the symch's set of factoring indexes to the
non-negative integers, such that for a factoring index C<i>
and factor index C<j>, C<SP(i)=j>, C<j> is a valid factor index within the factoring C<i>.
The function C<SP> can be called a B<symch position>.

Every B<symch position> is equivalent to a set of factoring positions.
The B<initial symch position> is the symch position all of whose factoring positions
have a factor index of 0.
Equivalently, it is the constant function C<ISP>, where C<ISP(i)=0> for all factoring
indexes C<i>.

The factor with index C<factor_ix> in the factoring with index C<factoring_ix> is
said to be the factor at factoring position C<E<lt>factoring_ix, factor_ixE<gt>>.
A factor is one of the factors of a symch position if and only if it is a factor
at one of its factoring positions.

An B<aligned symch position> is a factoring position all of whose factors have
the same start location.
The location of an aligned symch position is that start location.
The B<initial symch position> is always an aligned factoring position.
A B<synced symch position> is an aligned symch position all of whose factors have
the same length and symbol ID.
A B<unsynced symch position> is an aligned symch position that is not a synced symch position.

We are now in a position to define a B<multifactored stretch>.
Intuitively, a multifactored stretch is a longest possible input span that contains at least one unsynced symch position,
but no synced symch positions.
More formally,
a multifactored stretch of a symch is a span of start locations within that symch,
such that:

=over 

=item * Its first location is the location of unsynced symch position.

=item * Its first location is the initial symch position, or the first symch positiion after a
synched symch position.

=item * Its end location is the end location of the symch, or a synced symch position, whichever
occurs first.

=back

Note that multifactored stretch are aligned in terms of input locations,
but they do not have to be aligned in terms of factor indexes.
The factoring positions of
a multifactored stretch can have many different factor indexes.
This is true of all rules, but it is particularly likely for
a sequence rule,
where the RHS consists of repetitions of a single symbol.

=head1 Copyright and License

=for Marpa::R2::Display
ignore: 1

  Copyright 2014 Jeffrey Kegler
  This file is part of Marpa::R2.  Marpa::R2 is free software: you can
  redistribute it and/or modify it under the terms of the GNU Lesser
  General Public License as published by the Free Software Foundation,
  either version 3 of the License, or (at your option) any later version.

  Marpa::R2 is distributed in the hope that it will be useful,
  but WITHOUT ANY WARRANTY; without even the implied warranty of
  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
  Lesser General Public License for more details.

  You should have received a copy of the GNU Lesser
  General Public License along with Marpa::R2.  If not, see
  http://www.gnu.org/licenses/.

=for Marpa::R2::Display::End

=cut

# Local Variables:
#   mode: cperl
#   cperl-indent-level: 4
#   fill-column: 100
# End:
# vim: expandtab shiftwidth=4: