
\input texinfo @c -*-texinfo-*-
@c %**start of header
@setfilename api.info
@include version.texi
@settitle Libmarpa @value{VERSION}
@finalout
@c %**end of header
@copying
This manual (@value{UPDATED})
is for Libmarpa @value{VERSION}.
Copyright @copyright{} 2014 Jeffrey Kegler.
@quotation
Permission is granted to copy, distribute and/or modify this document
under the terms of the @acronym{GNU} Free Documentation License,
Version 1.3 or any later version published by the Free Software
Foundation.
A copy of the license is included in the section entitled
``@acronym{GNU} Free Documentation License.''
@end quotation
@end copying
@dircategory Development
@direntry
* Marpa R2: (marpa-r2). Marpa R2 parser library
@end direntry
@titlepage
@title Libmarpa
@subtitle Version @value{VERSION}
@subtitle @value{UPDATED}
@author Jeffrey Kegler
@c The following two commands
@c start the copyright page.
@page
@vskip 0pt plus 1filll
@insertcopying
Published @value{UPDATED} by Jeffrey Kegler
@end titlepage
@c So the toc is printed at the start.
@contents
@ifnottex
@node Top, Copying, (dir), (dir)
@top Libmarpa: The Marpa low-level library
@insertcopying
@end ifnottex
@menu
* Copying::
* About this document::
* About Libmarpa::
* Architecture::
* Input::
* Semantics::
* Threads::
* Fatal Errors::
* Introduction to the external interface::
* Static methods::
* Configuration methods::
* Grammar methods::
* Recognizer methods::
* Progress reports::
* Bocage methods::
* Ordering methods::
* Tree methods::
* Value methods::
* Events::
* Error methods macros and codes::
* Design notes::
* Work in Progress::
* Deprecated techniques and methods::
* GNU Free Documentation License::
@detailmenu
--- The Detailed Node Listing ---
About this document
* How to read this document::
* Prerequisites::
* Parsing theory::
Architecture
* Major objects::
* Time objects::
* Reference counting::
* Numbered objects::
Input
* Earlemes::
* Terminals::
Earlemes
* The traditional model::
* The earleme variables::
* The significances of the earleme variables::
* The initial earleme settings::
* The standard model of input::
* Ambiguous input::
* Variable length tokens::
* The generalized model::
* General rules for the earleme variables::
Introduction to the external interface
* About the overviews::
* Return values::
* Naming conventions::
Grammar methods
* Grammar overview::
* Grammar constructor::
* Grammar reference counting::
* Symbols::
* Rules::
* Sequences::
* Ranks::
* Grammar precomputation::
Recognizer methods
* Recognizer overview::
* Recognizer constructor::
* Recognizer reference counting::
* Recognizer life cycle mutators::
* Location accessors::
* Other parse status methods::
* Untested recognizer methods::
Bocage methods
* Bocage overview::
* Bocage constructor::
* Bocage reference counting::
* Bocage accessor::
Ordering methods
* Ordering overview::
* Ordering constructor::
* Ordering reference counting::
* Order accessor::
* Non-default ordering::
Tree methods
* Tree overview::
* Tree constructor::
* Tree reference counting::
* Tree iteration::
Value methods
* Value overview::
* How to use the valuator::
* Advantages of step-driven valuation::
* Maintaining the stack::
* Valuator constructor::
* Valuator reference counting::
* Stepping through the valuator::
* Valuator steps by type::
* Basic step accessors::
* Other step accessors::
Maintaining the stack
* Sizing the stack::
* Initializing locations in the stack::
Events
* Events overview::
* Event methods::
* Event codes::
Error methods, macros and codes
* Error methods::
* Error Macros::
* External error codes::
* Internal error codes::
Design notes
* Why so many time objects::
* Design of numbered objects::
* LHS Terminals::
Work in Progress
* Untested methods::
Untested methods
* Ranking methods::
* Zero-width assertion methods::
* Methods for revising parses::
Deprecated techniques and methods
* Valued and unvalued symbols::
Valued and unvalued symbols
* What unvalued symbols were::
* Grammar methods dealing with unvalued symbols::
* Registering semantics in the valuator::
@end detailmenu
@end menu
@node Copying, About this document, Top, Top
@unnumbered GNU LESSER GENERAL PUBLIC LICENSE
@include lgpl-3.0.texi
@node About this document, About Libmarpa, Copying, Top
@chapter About this document
@menu
* How to read this document::
* Prerequisites::
* Parsing theory::
@end menu
@node How to read this document, Prerequisites, About this document, About this document
@section How to read this document
This is essentially a reference document,
but its early chapters lay out concepts
essential to the others.
Readers will usually want to read the
chapters up to and including
@ref{Introduction to the external interface}
in order.
Otherwise, they should follow their interests.
@node Prerequisites, Parsing theory, How to read this document, About this document
@section Prerequisites
This document is very far from self-contained.
It assumes the following:
@itemize
@item
The reader knows the C programming language
at least well
enough to understand function prototypes and return values.
@item
The reader
has read the documents for one of Libmarpa's upper layers.
As of this writing, the only such layers are @code{Marpa::R2}
and @code{Marpa::R3},
both in Perl.
@item
The reader knows some parsing theory.
@xref{Parsing theory}.
@end itemize
@node Parsing theory, , Prerequisites, About this document
@section Parsing theory
This document assumes some acquaintance
with parsing theory.
The reader's
level of knowledge is probably adequate
if he can
answer the following questions,
either immediately or after a little reflection.
@itemize @bullet
@item
What is a BNF rule?
@item
What is a Marpa sequence rule?
@item
As a reminder,
Marpa's sequence rules are implemented
as left recursions.
What does that mean?
@item
Take a Marpa sequence rule at random.
What does it look like when rewritten in BNF?
@item
What does the sequence look like when rewritten
in BNF as a right-recursion?
@end itemize
@node About Libmarpa, Architecture, About this document, Top
@chapter About Libmarpa
Libmarpa implements the Marpa parsing algorithm.
Marpa is named
after the legendary 11th century Tibetan translator,
Marpa Lotsawa.
In creating Marpa,
I depended heavily on previous work by Jay Earley,
Joop Leo,
John Aycock and Nigel Horspool.
Libmarpa implements the entire Marpa algorithm.
This library does
the necessary grammar preprocessing, recognizes the input,
and produces parse trees.
It also supports the ordering, iteration
and evaluation of the parse
trees.
Libmarpa is very low-level.
For example, it has no strings.
Rules, symbols, and token values are all represented
by integers.
This, of course, will not suffice for many applications.
Users will very often want
names for the symbols, non-integer values for
tokens, or both.
Typically, applications will use arrays to
translate Libmarpa's integer IDs to strings or other
values as required.
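As a minimal sketch of that technique,
the following C fragment keeps symbol names in an application-side array
indexed by integer ID.
The names @code{symbol_table}, @code{symbol_table_init()},
@code{symbol_table_set()} and @code{symbol_table_name()},
as well as the fixed size @code{MAX_SYMBOLS},
are hypothetical and are not part of Libmarpa.
The sketch assumes only that IDs are small non-negative integers.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical application-side table; not part of Libmarpa. */
#define MAX_SYMBOLS 64

typedef struct {
    const char *names[MAX_SYMBOLS]; /* names[id] is the name for that ID */
} symbol_table;

/* Start with no names recorded. */
static void symbol_table_init(symbol_table *t) {
    int i;
    for (i = 0; i < MAX_SYMBOLS; i++) t->names[i] = NULL;
}

/* Record a name for an integer ID.
   Returns 0 on success, -1 if the ID does not fit in the table. */
static int symbol_table_set(symbol_table *t, int id, const char *name) {
    if (id < 0 || id >= MAX_SYMBOLS) return -1;
    t->names[id] = name;
    return 0;
}

/* Translate an integer ID back to its name.
   Unknown IDs yield a placeholder instead of NULL. */
static const char *symbol_table_name(const symbol_table *t, int id) {
    if (id < 0 || id >= MAX_SYMBOLS || t->names[id] == NULL)
        return "[unknown]";
    return t->names[id];
}
```

An application would call @code{symbol_table_set()} each time it obtains
a new symbol ID, and @code{symbol_table_name()} whenever it needs to
report that symbol to a user.
The same pattern applies to rule IDs and to token values.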
Libmarpa also does @strong{not} implement most of the semantics.
Libmarpa does have an evaluator (called a ``valuator''),
but it does @strong{not}
manipulate the stack directly.
Instead, Libmarpa,
based on its traversal of the parse tree,
passes optimized step-by-step stack manipulation
instructions to the upper layer.
These instructions indicate the token or rule involved,
and the proper location for the true token value or
the result of the rule evaluation.
For rule evaluations, the instructions include the stack location
of the arguments.
Marpa requires most semantics to be
implemented in the application.
This allows the application total flexibility.
It also puts
the application in a much better position to prevent errors,
to catch errors at runtime or,
failing all else,
to successfully debug the logic.
@node Architecture, Input, About Libmarpa, Top
@chapter Architecture
@menu
* Major objects::
* Time objects::
* Reference counting::
* Numbered objects::
@end menu
@node Major objects, Time objects, Architecture, Architecture
@section Major objects
The classes of
Libmarpa's object system fall into two types:
major and numbered.
These are Libmarpa's major classes,
in sequence.
@itemize
@item
Configuration:
A configuration object is
a thread-safe way to hold configuration variables,
as well as the return code from failed attempts
to create grammar objects.
@item
Grammar:
A grammar object contains rules and symbols,
with their properties.
@item
Recognizer:
A recognizer object reads input.
@item
Bocage:
A bocage object is a collection of
parse trees, as found by a recognizer.
Bocages are similar to parse forests.
@item
Ordering:
An ordering object
is an ordering of the trees
in a bocage.
@item
Tree:
A tree object is a bocage iterator.
@item
Value:
A value object is a tree iterator.
Iteration of a tree using a value object
produces ``steps''.
These ``steps'' are
instructions to
the application on how
to evaluate the semantics,
and how to manipulate the stack.
@end itemize
The major objects have one-letter abbreviations,
which are used frequently.
These are, in the standard sequence,
@itemize
@item
Configuration: C
@item
Grammar: G
@item
Recognizer: R
@item
Bocage: B
@item
Ordering: O
@item
Tree: T
@item
Value: V
@end itemize
@node Time objects, Reference counting, Major objects, Architecture
@section Time objects
All of Libmarpa's major classes,
except the configuration class,
are ``time'' classes.
Except for objects in the grammar class,
all time objects are created from another time
object.
Each time object is created from a time object
of the class before it in the sequence.
A recognizer cannot be created without a precomputed grammar;
a bocage cannot be created without a recognizer;
and so on.
When one time object is used to create a second
time object,
the first time object is the @dfn{parent object}
and the second time object is the @dfn{child object}.
For example, when a bocage is created from a
recognizer,
the recognizer is the parent object,
and the bocage is the child object.
Grammars have no parent object.
Every other time object has exactly one parent object.
Value objects have no child objects.
All other time objects can have any number of children,
from zero up to a number determined by memory or
some other machine-determined limit.
Every time object has a @dfn{base grammar}.
A grammar object is its own base grammar.
The base grammar of a recognizer is the grammar
that it was created with.
The base grammar of any other time object is the base
grammar of its parent object.
For example,
the base grammar of a bocage is the base
grammar of the recognizer that it was created
with.
@node Reference counting, Numbered objects, Time objects, Architecture
@section Reference counting
Every object in a ``time'' class
has its own, distinct, lifetime,
which is controlled by the object's reference count.
Reference counting follows the usual practice.
Contexts which take a share of the
``ownership'' of an object
increase the reference count by 1.
When a context relinquishes its share of
the ownership of an object, it decreases the reference
count by 1.
Each class of time object has a ``ref'' and an ``unref''
method, to be used by those contexts which need to
explicitly increment and decrement the reference count.
For example, the ``ref'' method for the grammar class is
@code{marpa_g_ref()}
and the ``unref'' method for the grammar class is
@code{marpa_g_unref()}.
Time objects do not have explicit destructors.
When the reference count of a time object reaches
0, that time object is destroyed.
Much of the necessary reference counting
is performed automatically.
The context calling the constructor of a time object
does not need to explicitly increase the reference
count, because
Libmarpa time objects are
always created with a reference count of 1.
Child objects ``own'' their parents,
and when a child object is successfully created,
the reference count of its parent object is
automatically incremented to reflect this.
When a child object is destroyed, it
automatically decrements the reference count of its parent.
In a typical application, a calling context needs only
to remember
to ``unref'' each time object that it creates,
once it is finished with that time object.
All other reference decrements and increments are taken
care of automatically.
The typical application never needs to explicitly
call one of the ``ref'' methods.
More complex applications may find it convenient
to have one or more contexts share ownership of objects
created in another context.
These more complex situations
are the only cases in which the ``ref'' methods
will be needed.
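The ownership rules above can be illustrated with a toy model.
This is a sketch only:
the names @code{toy_obj}, @code{toy_new()}, @code{toy_ref()},
@code{toy_unref()} and @code{live_objects} are hypothetical,
and the code does not show how Libmarpa implements its reference counts.
It models three of the rules just described:
objects are created with a reference count of 1,
a child takes a share of its parent on creation,
and a child releases that share when it is destroyed.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy model of a "time" object; hypothetical, not Libmarpa's layout. */
typedef struct toy_obj {
    int ref_count;
    struct toy_obj *parent; /* NULL for an object with no parent */
} toy_obj;

/* For demonstration: number of objects not yet destroyed. */
static int live_objects = 0;

/* Take a share of the ownership of an object. */
static toy_obj *toy_ref(toy_obj *o) {
    o->ref_count++;
    return o;
}

/* Give up a share. At a count of 0 the object is destroyed,
   and its own share of the parent is released in turn. */
static void toy_unref(toy_obj *o) {
    toy_obj *parent = o->parent;
    if (--o->ref_count > 0) return;
    free(o);
    live_objects--;
    if (parent != NULL) toy_unref(parent);
}

/* Create an object with a reference count of 1.
   A child automatically takes a share of its parent. */
static toy_obj *toy_new(toy_obj *parent) {
    toy_obj *o = malloc(sizeof *o);
    if (o == NULL) return NULL;
    o->ref_count = 1;
    o->parent = (parent != NULL) ? toy_ref(parent) : NULL;
    live_objects++;
    return o;
}
```

In this model a calling context that creates a parent and then a child
may ``unref'' the parent immediately.
The parent survives until the child is also released,
which is why the typical application needs only the ``unref'' calls.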
@node Numbered objects, , Reference counting, Architecture
@section Numbered objects
In addition to its major, ``time'' objects, Libmarpa also has
numbered objects.
Numbered objects do not have lifetimes of their own.
Every numbered object belongs to a time object,
and is destroyed with it.
Rules and symbols are numbered objects.
Token values are another class of numbered
objects.
@node Input, Semantics, Architecture, Top
@chapter Input
@menu
* Earlemes::
* Terminals::
@end menu
@node Earlemes, Terminals, Input, Input
@section Earlemes
@menu
* The traditional model::
* The earleme variables::
* The significances of the earleme variables::
* The initial earleme settings::
* The standard model of input::
* Ambiguous input::
* Variable length tokens::
* The generalized model::
* General rules for the earleme variables::
@end menu
@node The traditional model, The earleme variables, Earlemes, Earlemes
@subsection The traditional model
In traditional Earley parsers, the concept of location is very simple.
Locations are numbered from 0 to @var{n}, where @var{n} is the length of
the input.
Every location has an Earley set, and vice versa.
Location 0 is the start location.
Every location after the start location has exactly one input token
associated with it.
Some applications
do not fit this traditional input model ---
natural language processing requires ambiguous tokens,
for example.
Libmarpa allows a wide variety of alternative input models.
This document assumes that the reader knows the concepts
behind Libmarpa's
alternative input models, either from the documentation
of a higher level interface, such as
@code{Marpa::XS},
@code{Marpa::R2},
or @code{Marpa::R3},
or from Marpa's
@uref{https://docs.google.com/file/d/0B9_mR_M2zOc4Ni1zSW5IYzk3TGc/edit?usp=sharing, theory document}.
As a reminder,
in Libmarpa a location is called an @dfn{earleme}.
The number of an Earley set is the @dfn{ID of the Earley set},
or its @dfn{ordinal}.
In the traditional model, the ordinal of an Earley set and
its earleme are always exactly the same, but in Libmarpa
they may differ.
@node The earleme variables, The significances of the earleme variables, The traditional model, Earlemes
@subsection The earleme variables
The important earleme variables are the current earleme, the furthest earleme
and the latest earleme.
The @dfn{current earleme} is the earleme that Libmarpa is currently working on.
More specifically, it is the one at which new tokens will @strong{start}.
Since tokens are never zero length, a new token will always end after the
current earleme.
The current earleme is initially earleme 0.
Every call to @code{marpa_r_earleme_complete()} advances the
current earleme by 1.
The @dfn{furthest earleme} is the highest numbered (and therefore ``furthest'')
earleme at which a token ends.
The furthest earleme is initially earleme 0.
With every call to @code{marpa_r_alternative()}, the end of the token
it adds is calculated.
A token ends at the earleme location @var{current}+@var{length},
where @var{current} is the current earleme,
and @var{length} is the length of the newly added token.
After a call to @code{marpa_r_alternative()},
the furthest earleme is its value before the call,
or @var{current}+@var{length},
whichever is greater.
The @dfn{latest earleme} is the earleme of the latest
Earley set.
The @dfn{latest Earley set} is the last Earley set completed.
This is always the highest numbered Earley set.
If there is an Earley set at the current earleme,
it is the latest Earley set and the latest earleme
is equal to the current earleme.
There is never an Earley set after the current earleme.
After every call to the @code{marpa_r_earleme_complete()} method
that adds a token,
the value of the latest earleme is
the same as the value of the current earleme.
After every call to the @code{marpa_r_earleme_complete()} method
that does @strong{not} add a token,
the value of the latest earleme is unchanged
from its value before the call.
@node The significances of the earleme variables, The initial earleme settings, The earleme variables, Earlemes
@subsection The significances of the earleme variables
The current earleme tracks the advance of the recognizer through the input.
Input tokens always start at the current earleme.
An application can advance past the current earleme,
by calling @code{marpa_r_earleme_complete()}, which
increments the current earleme by 1.
After initialization,
@code{marpa_r_earleme_complete()} is
the only way to manipulate the value of the current earleme.
The furthest earleme tracks how ``far out'' tokens can be found.
In the standard input model, calling
@code{marpa_r_earleme_complete()} after each
@code{marpa_r_alternative()} call is sufficient to process
all inputs,
and the furthest earleme's value
can typically be ignored.
In alternative input models, if tokens have lengths greater than
1, calling
@code{marpa_r_earleme_complete()} once after the last token
is read may not be enough to ensure that all tokens have been processed.
To ensure that all tokens have been processed,
an application must advance the current earleme
by calling @code{marpa_r_earleme_complete()},
until the current earleme is equal to the furthest earleme.
The latest earleme is the earleme of the last Earley set.
The latest earleme is different from the current earleme if and only if
there is no Earley set at the current earleme.
A different end of parsing can be specified,
but by default, parsing is of the input
in the range
from earleme 0 to the latest earleme.
@node The initial earleme settings, The standard model of input, The significances of the earleme variables, Earlemes
@subsection The initial earleme settings
All input models have the same initial values.
Initially the current, latest and furthest earleme
are always earleme 0.
Understanding the
settings of current, latest and furthest earleme is
crucial to working with advanced input models,
and for this reason the next sections will go
through the possibilities carefully.
The presentation will start with the most traditional
and restrictive models.
It will proceed to less restrictive models.
@node The standard model of input, Ambiguous input, The initial earleme settings, Earlemes
@subsection The standard model of input
In the standard model of input,
calls to @code{marpa_r_alternative()}
and @code{marpa_r_earleme_complete()} are
made in pairs.
There is, first, exactly one call
to @code{marpa_r_alternative()},
and it is for a token with length 1.
Following it must be a call
to @code{marpa_r_earleme_complete()}.
For an input of length @var{n}, there will be
exactly @var{n} such paired calls.
Suppose that, in the standard model,
for a call
to @code{marpa_r_alternative()},
the following is true:
@itemize
@item The current earleme before the call was @var{c}.
@end itemize
Because this is the standard model,
this means that we also know that
@itemize
@item The latest earleme before the call was @var{c}.
@item The furthest earleme before the call was @var{c}.
@end itemize
In that case,
after the call to
@code{marpa_r_alternative()},
the following will be true:
@itemize
@item The current earleme will still be @var{c}.
@item The latest earleme will still be @var{c}.
@item The furthest earleme will be @var{c}+1.
@end itemize
Suppose that, in the standard model,
for a call
to @code{marpa_r_earleme_complete()},
the following is true:
@itemize
@item The current earleme before the call is @var{c}.
@end itemize
Because this is the standard model,
this means that we also know that
@itemize
@item The latest earleme before the call was @var{c}.
@item The furthest earleme before the call was @var{c+1}.
@end itemize
In that case,
after the call
to @code{marpa_r_earleme_complete()},
the current, latest and furthest earlemes will
be the same.
Specifically,
the following will be true:
@itemize
@item The current earleme will be advanced to @var{c+1}.
@item The latest earleme will be advanced to @var{c+1}.
@item The furthest earleme will remain at @var{c+1}.
@end itemize
@node Ambiguous input, Variable length tokens, The standard model of input, Earlemes
@subsection Ambiguous input
As a first loosening of the standard model,
we no longer require calls to @code{marpa_r_alternative()}
to be paired with calls to
@code{marpa_r_earleme_complete()}.
Instead,
we allow a series of one or more calls
to @code{marpa_r_alternative()}
before each call to
@code{marpa_r_earleme_complete()}.
We still require that there be at least one call
to @code{marpa_r_alternative()}
before each call to
@code{marpa_r_earleme_complete()},
and we still require that all tokens have
a length of 1.
In the ambiguous input model, the behavior of the current,
latest and furthest earlemes is exactly
as described for the standard model, with one minor
exception:
Where the current earleme is @var{c},
and a call to @code{marpa_r_alternative()} is not the first
of a series,
the value of the furthest earleme before the call will be
@var{c}+1.
@node Variable length tokens, The generalized model, Ambiguous input, Earlemes
@subsection Variable length tokens
Our next loosening of the restrictions is to allow
variable length tokens.
That is, instead of requiring that all tokens
be of length 1,
we allow tokens to be of length 1 or longer.
This does change the behavior of the earleme variables.
Suppose that, in the variable length token model,
for a call
to @code{marpa_r_alternative()},
the following is true:
@itemize
@item The current earleme before the call was @var{c}.
In this model, this means that
the latest earleme before the call is also @var{c}.
@item The furthest earleme before the call was @var{old_f}.
@item The length of the token is @var{length}.
@end itemize
In that case,
after the call to
@code{marpa_r_alternative()},
the following will be true:
@itemize
@item The current and latest earlemes will still be @var{c}.
The current and latest earlemes are never changed by a call
to @code{marpa_r_alternative()}.
@item The furthest earleme will be either @var{old_f} or
@var{c}+@var{length}, whichever is greater.
@end itemize
Suppose that, in the variable length token model,
for a call
to @code{marpa_r_earleme_complete()},
the following is true:
@itemize
@item The current earleme before the call is @var{c}.
In this model, this means that
the latest earleme before the call is also @var{c}.
@item The furthest earleme before the call is @var{old_f}.
@end itemize
In that case,
after the call
to @code{marpa_r_earleme_complete()},
the following will be true:
@itemize
@item The current earleme will be advanced to @var{c+1}.
@item The latest earleme will also be @var{c+1}.
@item The furthest earleme will still be @var{old_f}.
The furthest earleme is never changed by a call
to @code{marpa_r_earleme_complete()}.
@end itemize
@node The generalized model, General rules for the earleme variables, Variable length tokens, Earlemes
@subsection The generalized model
To fully generalize the input model,
we now need to remove only one restriction.
We now allow empty earlemes --- earlemes with
no tokens and no Earley set.
For the generalized input model,
the effect on the earleme variables of
a call
to @code{marpa_r_alternative()} is exactly
the same as it is for the variable length input model.
The effect on the earleme variables of a call to
@code{marpa_r_earleme_complete()} depends on
whether or not that call creates an empty earleme.
A call
to @code{marpa_r_earleme_complete()}
creates an empty earleme if and only if
it falls into one of these two cases:
@itemize
@item
There has been no call
to @code{marpa_r_alternative()} since
recognizer initialization.
@item
There has been no call
to @code{marpa_r_alternative()} since
the previous call
to @code{marpa_r_earleme_complete()}.
@end itemize
Suppose that, in the generalized input model,
for a call
to @code{marpa_r_earleme_complete()}
that creates an empty earleme,
the following is true:
@itemize
@item The current earleme before the call is @var{c}.
@item The latest earleme before the call is @var{old_l}.
@item The furthest earleme before the call is @var{old_f}.
@end itemize
In that case,
after the call
to @code{marpa_r_earleme_complete()},
the following will be true:
@itemize
@item The current earleme will be advanced to @var{c+1}.
@item The latest earleme will remain at @var{old_l}.
This means that the latest earleme will be less than
the current earleme.
@item The furthest earleme will remain at @var{old_f}.
The furthest earleme is never changed by a call
to @code{marpa_r_earleme_complete()}.
@end itemize
Suppose that, in the generalized input model,
for a call
to @code{marpa_r_earleme_complete()}
that does not create an empty earleme,
the following is true:
@itemize
@item The current earleme before the call is @var{c}.
@item The latest earleme before the call is @var{old_l}.
@item The furthest earleme before the call is @var{old_f}.
@end itemize
In that case,
after the call
to @code{marpa_r_earleme_complete()},
the following will be true:
@itemize
@item The current earleme will be advanced to @var{c+1}.
@item The latest earleme will be advanced to @var{c+1}.
@item The furthest earleme will remain at @var{old_f}.
The furthest earleme is never changed by a call
to @code{marpa_r_earleme_complete()}.
@end itemize
@node General rules for the earleme variables, , The generalized model, Earlemes
@subsection General rules for the earleme variables
At this point, the most generalized input model has been
introduced.
Here are some useful facts about the earleme variables that will always be the case,
no matter which of the input models is in use.
@itemize
@item The current earleme is always greater than
or equal to the latest earleme.
@item The furthest earleme is always greater than
or equal to the latest earleme.
@item If the furthest earleme is greater than the current earleme,
the parser is not exhausted.
@item If the furthest earleme is less than the current earleme,
the parser is exhausted.
@end itemize
If the parser is not exhausted,
the following will always be true,
no matter which of the input models is in use.
@itemize
@item
The furthest earleme is greater than
or equal to the current earleme.
@end itemize
In an exhausted parser, the following will always be true,
no matter which of the input models is in use.
@itemize
@item
The furthest earleme is less than
or equal to the current earleme.
@end itemize
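The rules for the earleme variables,
as stated in the preceding sections,
can be sketched as a small self-contained model.
The model does not call Libmarpa.
The names @code{earleme_model}, @code{model_alternative()},
@code{model_earleme_complete()}, @code{model_drain()}
and @code{model_invariants_hold()} are hypothetical.
The two update functions mimic only the documented effect of
@code{marpa_r_alternative()} and @code{marpa_r_earleme_complete()}
on the current, latest and furthest earlemes.

```c
#include <assert.h>

/* Toy model of the three earleme variables; hypothetical names. */
typedef struct {
    int current;
    int latest;
    int furthest;
    int tokens_since_complete; /* tokens added since the last
                                  completion, or since initialization */
} earleme_model;

/* All three variables start at earleme 0. */
static void model_init(earleme_model *m) {
    m->current = m->latest = m->furthest = 0;
    m->tokens_since_complete = 0;
}

/* Effect of adding a token: only the furthest earleme can change. */
static void model_alternative(earleme_model *m, int length) {
    int end = m->current + length;
    assert(length >= 1); /* tokens are never zero length */
    if (end > m->furthest) m->furthest = end;
    m->tokens_since_complete++;
}

/* Effect of completing an earleme: the current earleme advances by 1.
   The latest earleme follows it unless the call creates an
   empty earleme. The furthest earleme never changes here. */
static void model_earleme_complete(earleme_model *m) {
    m->current++;
    if (m->tokens_since_complete > 0) m->latest = m->current;
    m->tokens_since_complete = 0;
}

/* Advance until the current earleme reaches the furthest earleme.
   Returns the number of completion calls that were needed. */
static int model_drain(earleme_model *m) {
    int calls = 0;
    while (m->current < m->furthest) {
        model_earleme_complete(m);
        calls++;
    }
    return calls;
}

/* The two rules that hold in every input model. */
static int model_invariants_hold(const earleme_model *m) {
    return m->current >= m->latest && m->furthest >= m->latest;
}
```

For example, adding tokens of length 1 and 3 at earleme 0
leaves the current and latest earlemes at 0 and moves the furthest earleme to 3.
One completion call then sets the current and latest earlemes to 1.
Two further calls create empty earlemes,
so that the current earleme ends at 3 while the latest earleme stays at 1.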
@node Terminals, , Earlemes, Input
@section Terminals
A terminal symbol is a symbol which
may appear in the input.
Traditionally,
all LHS symbols, as well as
the start symbol, must be non-terminals.
This is Marpa's behavior, by default.
Marpa allows the user to eliminate the distinction
between terminals and non-terminals.
In this, it
differs from traditional parsers.
A Libmarpa application
can arrange for a terminal
to appear on the LHS of one or more rules,
or for a terminal to be the start symbol.
However,
since terminals can never be zero length,
it is a logical contradiction for a nulling
symbol to also be a terminal
and Marpa does not allow it.
Token values are of type @code{int}.
Libmarpa does nothing with token values except accept
them from the application and return them during
parse evaluation.
@node Semantics, Threads, Input, Top
@chapter Semantics
Libmarpa's handling of semantics is unusual.
Most semantics are left up to the application,
but Libmarpa guides them.
Specifically, the application is expected to maintain the evaluation
stack.
Libmarpa's valuator provides instructions on how to handle the stack.
Libmarpa's stack handling instructions
are called ``steps''.
For example, a Libmarpa step might tell the application that the value
of a token needs to go into a certain stack position.
Or a Libmarpa step might tell the application that a rule is to be evaluated.
For rule evaluation, Libmarpa will tell the application where the operands
are to be found,
and where the result must go.
The detailed discussion of
Libmarpa's handling of semantics is in the reference chapters
of this document,
under the appropriate methods and classes.
The most extensive discussion of the semantics
is in the section that deals with the methods of the value time class.
@xref{Value methods}.
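The division of labor described in this chapter can be sketched
with a toy evaluator.
Everything in the sketch is hypothetical:
the @code{toy_step} structure and its fields stand in for the information
that a step carries,
and they are not Libmarpa's types or accessors.
The semantics are also an assumption made for illustration:
every rule sums its operands,
and the final result is taken from stack position 0.

```c
#include <assert.h>

#define TOY_STACK_SIZE 16

/* Hypothetical step kinds: place a token value, or evaluate a rule. */
typedef enum { TOY_STEP_TOKEN, TOY_STEP_RULE } toy_step_type;

typedef struct {
    toy_step_type type;
    int token_value; /* token step: value to place on the stack */
    int arg_0;       /* rule step: stack position of first operand */
    int arg_n;       /* rule step: stack position of last operand */
    int result;      /* stack position that receives the value */
} toy_step;

static int toy_in_stack(int position) {
    return position >= 0 && position < TOY_STACK_SIZE;
}

/* The application, not the library, owns and updates the stack. */
static void toy_apply_step(int *stack, const toy_step *step) {
    int i;
    int sum = 0;
    assert(toy_in_stack(step->result));
    switch (step->type) {
    case TOY_STEP_TOKEN:
        stack[step->result] = step->token_value;
        break;
    case TOY_STEP_RULE:
        assert(toy_in_stack(step->arg_0) && toy_in_stack(step->arg_n));
        /* Assumed semantics: a rule's value is the sum of its operands. */
        for (i = step->arg_0; i <= step->arg_n; i++) sum += stack[i];
        stack[step->result] = sum;
        break;
    }
}

/* Apply the steps in order; the result is read from position 0. */
static int toy_evaluate(const toy_step *steps, int count) {
    int stack[TOY_STACK_SIZE] = {0};
    int i;
    for (i = 0; i < count; i++) toy_apply_step(stack, &steps[i]);
    return stack[0];
}
```

Given two token steps that place the values 2 and 3 at positions 0 and 1,
followed by a rule step with operands at positions 0 through 1
and its result at position 0,
@code{toy_evaluate()} returns 5.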
@node Threads, Fatal Errors, Semantics, Top
@chapter Threads
Libmarpa is thread-safe,
given circumstances as described below.
The Libmarpa methods are not reentrant.
Libmarpa is C89-compliant.
It uses no global data,
and calls only the routines
that are defined in the C89 standard
and that can be made thread-safe.
In most modern implementations,
the default C89 implementation is thread-safe
to the extent possible.
But the C89 standard does not require thread-safety,
and even most modern environments allow the user
to turn thread safety off.
To be thread-safe, Libmarpa must be compiled
and linked in an environment that provides
thread-safety.
While Libmarpa can be used safely across
multiple threads,
a Libmarpa grammar cannot be.
Further, a Libmarpa time object can
only be used safely in the same thread
as its base grammar.
This is because all
time objects with the same base grammar share data
from that base grammar.
To work around this limitation,
the same grammar definition can be
used to create a new
Libmarpa grammar
time object in each thread.
If there is sufficient interest, future versions of
Libmarpa could allow thread-safe
cloning of grammars and other
time objects.
@node Fatal Errors, Introduction to the external interface, Threads, Top
@chapter Fatal Errors
Libmarpa avoids fatal errors,
leaving that decision up to the application,
with one exception.
Currently, if @code{malloc} fails to allocate memory,
Libmarpa terminates the program with a fatal error.
While this is in keeping with current practice,
future versions of Libmarpa are likely to both allow
an alternative memory allocator to be specified,
and to allow the user to specify a handler to
be called when an out-of-memory condition occurs.
@node Introduction to the external interface, Static methods, Fatal Errors, Top
@chapter Introduction to the external interface
The following chapters describe Libmarpa's external
interface in detail.
@menu
* About the overviews::
* Return values::
* Naming conventions::
@end menu
@node About the overviews, Return values, Introduction to the external interface, Introduction to the external interface
@section About the overviews
The reference sections for each major class begin with an overview.
These sections name the
most important Libmarpa methods
in the order in which they are typically used,
and very briefly describe their purpose.
The overviews are intended as ``cheat sheets''.
The overview sections often speak of
an ``archetypal'' application.
The archetypal Libmarpa application
implements a complete logic flow,
starting with the creation of a grammar,
and proceeding all the way
to the return of the final result from a value object.
In the archetypal Libmarpa application,
the grammar, input and semantics are
all small but non-trivial.
@node Return values, Naming conventions, About the overviews, Introduction to the external interface
@section Return values
Return values are discussed method by method,
but some general practices are worth
mentioning.
For methods that return an integer,
Libmarpa usually reserves @minus{}1 for special purposes,
such as indicating loop termination in an iterator.
In Libmarpa, methods usually indicate failure
by returning @minus{}2.
Any result greater than @minus{}2 usually indicates success.
If a method returns a pointer value,
@code{NULL} usually indicates failure.
Any other result usually indicates success.
On failure, a Libmarpa method always sets
the error code.
On success, a Libmarpa method may take
one of three actions:
@itemize
@item Set
an informational error code.
@item Leave
the error code as it is.
@item Reset the error code to @code{MARPA_ERR_NONE}.
@end itemize
Which of the three actions the method
takes on success
should be regarded as unspecified,
unless stated in the description of the method
in this document.
If a method sets an informational error code on
success, that will always be stated in
its description, and the description will
specify,
for each informational error code,
the specific error code value,
the circumstances under which it is set,
and what return values will accompany it.
Informational error codes are set
because some applications may find them convenient.
Typically, informational error codes are redundant
and may be ignored.
A method can have many reasons for
failing, and many of the reasons
for failure are common to
a large number of methods.
Full descriptions of the error codes
that are returned by the external methods
are given in their own section.
@xref{External error codes}.
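As an illustration of these conventions,
the following sketch wraps the usual integer check in a small helper.
The helper name @code{report_if_failed} is hypothetical
and is not part of Libmarpa.
The sketch assumes that @code{marpa_g_error()} takes
a grammar and a pointer to a string pointer,
as described with the other error methods.
Because the conventions above are only the usual ones,
the description of each individual method remains authoritative.
In particular, this helper is not suitable for methods,
such as the rank methods, where @minus{}2 can be a valid result.
@example
#include <stdio.h>
#include "marpa.h"

/* Hypothetical helper, not part of Libmarpa.
   Returns 1 if rv signals failure under the usual
   integer convention, 0 otherwise. */
static int
report_if_failed (Marpa_Grammar g, int rv, const char *what)
@{
  const char *error_string = NULL;
  Marpa_Error_Code code;
  if (rv > -2)
    return 0; /* -1 is reserved for special purposes */
  code = marpa_g_error (g, &error_string);
  fprintf (stderr, "%s failed: error code %d (%s)\n", what,
           (int) code, error_string ? error_string : "no details");
  return 1;
@}
@end example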
@node Naming conventions, , Return values, Introduction to the external interface
@section Naming conventions
Methods in Libmarpa follow a strict naming convention.
All methods have a name beginning with @code{marpa_},
if they are part of the
external interface.
If an external method is not a static method,
its name is prefixed with one of
@code{marpa_c_},
@code{marpa_g_},
@code{marpa_r_},
@code{marpa_b_},
@code{marpa_o_},
@code{marpa_t_} or
@code{marpa_v_},
where the single letter between underscores
is one of the Libmarpa major class abbreviations.
The letter indicates which class
the method belongs to.
Methods that are exported,
but that are part of
the internal interface,
begin with @code{_marpa_}.
Methods that are part of the internal interface
(often called ``internal methods'')
are subject to change and are intended for use
only by Libmarpa's developers.
Libmarpa reserves the @code{marpa_}
and @code{_marpa_} prefixes for itself,
with all their capitalization variants.
All Libmarpa names visible outside the package
will begin with a capitalization variant
of one of these two prefixes.
@node Static methods, Configuration methods, Introduction to the external interface, Top
@chapter Static methods
@deftypefun Marpa_Error_Code marpa_check_version @
(int @var{required_major}, @
int @var{required_minor}, @
int @var{required_micro} @
)
Checks that the Marpa library in use is compatible with the
given version. Generally you will pass in the constants
@code{MARPA_MAJOR_VERSION},
@code{MARPA_MINOR_VERSION},
@code{MARPA_MICRO_VERSION}
as the three arguments to this function; that produces
a check that the library in use is compatible with
the version of Libmarpa the application or module was compiled
against.
Compatibility is defined by two things:
first, the version
of the running library is newer than the version
@var{required_major}.@var{required_minor}.@var{required_micro}.
Second,
the running library must be binary compatible with the
version
@var{required_major}.@var{required_minor}.@var{required_micro}
(same major version).
Return value: @code{MARPA_ERR_NONE} if the Marpa library is compatible with the
requested version. If the library is not compatible,
one of the following is returned, indicating the nature of the mismatch:
@itemize
@item @code{MARPA_ERR_MAJOR_VERSION_MISMATCH}
@item @code{MARPA_ERR_MINOR_VERSION_MISMATCH}
@item @code{MARPA_ERR_MICRO_VERSION_MISMATCH}
@end itemize
@end deftypefun
@deftypefun Marpa_Error_Code marpa_version (int* @var{version})
Returns the version number in @var{version},
which must have room for three @code{int}'s.
Return value: A non-negative number on success,
@minus{}2 on failure.
@end deftypefun
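A minimal sketch of how an application might use the two static methods
at startup is shown here.
It assumes that the Libmarpa header is installed as @file{marpa.h};
the messages and exit codes are illustrative only.
@example
#include <stdio.h>
#include "marpa.h"

int
main (void)
@{
  int version[3]; /* room for major, minor and micro */
  Marpa_Error_Code rc =
    marpa_check_version (MARPA_MAJOR_VERSION,
                         MARPA_MINOR_VERSION,
                         MARPA_MICRO_VERSION);
  if (rc != MARPA_ERR_NONE)
    @{
      fprintf (stderr, "Libmarpa version mismatch: error code %d\n",
               (int) rc);
      return 1;
    @}
  if (marpa_version (version) < 0)
    return 1;
  printf ("Running Libmarpa %d.%d.%d\n",
          version[0], version[1], version[2]);
  return 0;
@}
@end example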
@node Configuration methods, Grammar methods, Static methods, Top
@chapter Configuration methods
The configuration object is intended for future extensions.
These may
allow the application to override Libmarpa's memory allocation
and fatal error handling without resorting to global
variables, and therefore in a thread-safe way.
Currently, the only function of the @code{Marpa_Config}
class is to give @code{marpa_g_new()}
a place to put its error code.
@code{Marpa_Config} is Libmarpa's only ``major''
class which is not a time class.
There is no constructor or destructor, although
@code{Marpa_Config} objects @strong{do} need to be initialized
before use.
Aside from its own accessor,
@code{Marpa_Config} objects are only used by @code{marpa_g_new}
and no reference to their location is kept
in any of Libmarpa's time objects.
The intent is that it be convenient
to have them in memory that might be deallocated
soon after @code{marpa_g_new} returns.
For example, they could be put on the stack.
@deftypefun int marpa_c_init ( @
Marpa_Config* @var{config})
Initialize the @var{config} information to ``safe'' default
values.
Unspecified behavior will result
if an uninitialized
configuration is used to create a grammar.
Return value: A non-negative value. Always succeeds.
@end deftypefun
@deftypefun Marpa_Error_Code marpa_c_error ( @
Marpa_Config* @var{config}, const char** @var{p_error_string} )
Error codes are usually kept in the base grammar,
which leaves @code{marpa_g_new()} no place to put
its error code on failure.
Objects of
the @code{Marpa_Config} class provide such a place.
Return value:
The error code in @var{config}.
Always succeeds.
@end deftypefun
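The following sketch shows one way to keep the configuration object
on the stack, as suggested above.
The function name @code{create_grammar} is hypothetical,
and the error reporting is illustrative only.
@example
#include <stdio.h>
#include "marpa.h"

/* Hypothetical helper: the configuration lives on the stack
   and goes out of scope soon after the grammar is created. */
static Marpa_Grammar
create_grammar (void)
@{
  Marpa_Config config;
  Marpa_Grammar g;
  marpa_c_init (&config); /* must precede marpa_g_new() */
  g = marpa_g_new (&config);
  if (!g)
    @{
      const char *error_string = NULL;
      Marpa_Error_Code code = marpa_c_error (&config, &error_string);
      fprintf (stderr, "marpa_g_new() failed: error code %d\n",
               (int) code);
    @}
  return g;
@}
@end example
The error code is read before @code{create_grammar} returns,
because @code{config} no longer exists afterward.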
@node Grammar methods, Recognizer methods, Configuration methods, Top
@chapter Grammar methods
@cindex grammars
@menu
* Grammar overview::
* Grammar constructor::
* Grammar reference counting::
* Symbols::
* Rules::
* Sequences::
* Ranks::
* Grammar precomputation::
@end menu
@node Grammar overview, Grammar constructor, Grammar methods, Grammar methods
@section Overview
An archetypal application has a grammar.
To create a grammar, use the @code{marpa_g_new()} method.
When a grammar is no longer in use, its memory can be freed
using the
@code{marpa_g_unref()} method.
To be precomputed,
a grammar must have one or more symbols.
To create symbols, use the
@code{marpa_g_symbol_new()} method.
To be precomputed,
a grammar must have one or more rules.
To create rules, use the
@code{marpa_g_rule_new()} and
@code{marpa_g_sequence_new()} methods.
For non-trivial parsing,
one or more of the symbols must be terminals.
To mark a symbol as a terminal,
use the
@code{marpa_g_symbol_is_terminal_set()} method.
To be precomputed,
a grammar must have exactly one start symbol.
To mark a symbol as the start symbol,
use the
@code{marpa_g_start_symbol_set()} method.
Before parsing with a grammar, it must be precomputed.
To precompute a grammar,
use the
@code{marpa_g_precompute()} method.
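The steps above can be sketched as a complete program.
This is a minimal illustration rather than a recommended structure:
the grammar has the single rule @samp{S ::= A B},
the variable names are arbitrary,
and error handling is reduced to a non-zero exit status.
The explicit calls to @code{marpa_g_symbol_is_terminal_set()}
are shown for completeness.
@example
#include <stdio.h>
#include "marpa.h"

int
main (void)
@{
  Marpa_Config config;
  Marpa_Grammar g;
  Marpa_Symbol_ID s, a, b;
  Marpa_Symbol_ID rhs[2];
  int status = 1;

  marpa_c_init (&config);
  g = marpa_g_new (&config);
  if (!g)
    return 1;
  marpa_g_force_valued (g);

  /* Three symbols: S is the start symbol, A and B are terminals. */
  s = marpa_g_symbol_new (g);
  a = marpa_g_symbol_new (g);
  b = marpa_g_symbol_new (g);
  if (s < 0 || a < 0 || b < 0)
    goto done;

  /* One rule: S ::= A B */
  rhs[0] = a;
  rhs[1] = b;
  if (marpa_g_rule_new (g, s, rhs, 2) < 0)
    goto done;

  if (marpa_g_symbol_is_terminal_set (g, a, 1) < 0
      || marpa_g_symbol_is_terminal_set (g, b, 1) < 0)
    goto done;
  if (marpa_g_start_symbol_set (g, s) < 0)
    goto done;
  if (marpa_g_precompute (g) < 0)
    goto done;

  puts ("grammar precomputed");
  status = 0;

done:
  marpa_g_unref (g); /* frees the grammar: this is the only reference */
  return status;
@}
@end example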
@node Grammar constructor, Grammar reference counting, Grammar overview, Grammar methods
@section Creating a new grammar
@cindex grammar constructor
@deftypefun Marpa_Grammar marpa_g_new ( @
Marpa_Config* @var{configuration} )
Creates a new grammar time object.
The returned grammar object is not yet precomputed,
and will have no symbols and rules.
Its reference count will be 1.
Unless the application calls @code{marpa_c_error},
Libmarpa will not reference the location
pointed to by the @var{configuration}
argument after @code{marpa_g_new} returns.
The @var{configuration} argument may be @code{NULL},
but if it is,
there will be no way to determine
the error code on failure.
Return value: On success, the grammar object.
On failure, @code{NULL},
and the error code is set in @var{configuration}.
@end deftypefun
@deftypefun int marpa_g_force_valued ( @
Marpa_Grammar @var{g} )
It is recommended that
this call be made immediately after the
grammar constructor.
It turns off a deprecated feature.
The @code{marpa_g_force_valued} method forces all the
symbols in a grammar to be ``valued''.
The opposite of a valued symbol is one about whose value
you do not care.
This distinction has been made in the past in hope
of gaining efficiencies at evaluation time.
Current thinking is that the gains do not repay the extra
complexity.
Return value: On success, a non-negative integer.
On failure, a negative integer.
@end deftypefun
@node Grammar reference counting, Symbols, Grammar constructor, Grammar methods
@section Tracking the reference count of the grammar
@cindex grammar destructor
@cindex grammar reference
@cindex grammar reference count
@deftypefun Marpa_Grammar marpa_g_ref (Marpa_Grammar @var{g})
Increases the reference count by 1.
Not needed by most applications.
Return value:
On success, the grammar object it was called with;
@code{NULL} on failure.
@end deftypefun
@deftypefun void marpa_g_unref (Marpa_Grammar @var{g})
Decreases the reference count by 1,
destroying @var{g} once the reference count reaches
zero.
@end deftypefun
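Most applications hold a single reference and never call
@code{marpa_g_ref()}.
For the cases where a second owner is needed,
the following sketch shows one possible pattern.
The structure @code{grammar_cache} and its two functions are
hypothetical application code, not part of Libmarpa.
@example
#include <stddef.h>
#include "marpa.h"

/* Hypothetical application structure holding its own reference. */
struct grammar_cache
@{
  Marpa_Grammar g;
@};

/* Returns 0 on success, -1 on failure. */
static int
cache_store (struct grammar_cache *cache, Marpa_Grammar g)
@{
  /* marpa_g_ref() returns its argument on success, NULL on failure. */
  cache->g = marpa_g_ref (g);
  return cache->g ? 0 : -1;
@}

static void
cache_clear (struct grammar_cache *cache)
@{
  if (cache->g)
    marpa_g_unref (cache->g); /* drop the cache's own reference */
  cache->g = NULL;
@}
@end example
With this pattern the grammar is destroyed only after both the cache
and the original owner have called @code{marpa_g_unref()}.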
@node Symbols, Rules, Grammar reference counting, Grammar methods
@section Symbols
@deftypefun Marpa_Symbol_ID marpa_g_start_symbol (Marpa_Grammar @var{g})
Returns the current value of the start symbol of grammar @var{g}.
The value is that
specified in the @code{marpa_g_start_symbol_set()} call,
if there has been one.
Return value:
On failure, @minus{}2;
@minus{}1 if there is no start symbol yet;
otherwise the ID of the start symbol.
@end deftypefun
@deftypefun Marpa_Symbol_ID marpa_g_start_symbol_set ( @
Marpa_Grammar @var{g}, @
Marpa_Symbol_ID @var{sym_id})
Sets the start symbol of grammar @var{g} to symbol @var{sym_id}.
Return value: On success, the ID of the new start symbol.
If @var{sym_id} is well-formed, but there is no
such symbol, @minus{}1.
On failure, @minus{}2.
@end deftypefun
@deftypefun int marpa_g_highest_symbol_id (Marpa_Grammar @var{g})
Return value: On success, the numerically largest symbol ID
of @var{g}.
On failure, @minus{}2.
@end deftypefun
@deftypefun int marpa_g_symbol_is_accessible (Marpa_Grammar @var{g}, @
Marpa_Symbol_ID @var{sym_id})
A symbol is @dfn{accessible} if it can be reached from the start symbol.
Return value: On success, 1 if symbol @var{sym_id} is accessible, 0 if not.
If @var{sym_id} is well-formed, but there is no
such symbol, -1.
If the grammar is not precomputed, or on other failure, @minus{}2.
@end deftypefun
@deftypefun int marpa_g_symbol_is_completion_event ( @
Marpa_Grammar @var{g}, @
Marpa_Symbol_ID @var{sym_id})
@deftypefunx int marpa_g_symbol_is_completion_event_set ( @
Marpa_Grammar @var{g}, @
Marpa_Symbol_ID @var{sym_id}, @
int @var{value})
Libmarpa can be set up to generate a
@code{MARPA_EVENT_SYMBOL_COMPLETED}
event whenever the symbol is completed.
A symbol is said to be @strong{completed}
when a non-nulling rule with
that symbol on its LHS is completed.
For completion events to occur, the symbol must be marked
as a completion event symbol.
The @code{marpa_g_symbol_is_completion_event_set()} function
marks symbol @var{sym_id} as a completion event symbol
if @var{value} is 1,
and unmarks
it as a completion event symbol if @var{value} is 0.
The @code{marpa_g_symbol_is_completion_event()} method
returns the current value of the completion event marking
for symbol @var{sym_id}.
Nulled rules and symbols will never cause completion events.
Nullable symbols
may be marked as completion event symbols,
but this will have an effect only when the symbol
is not nulled.
Nulling symbols
may be marked as completion event symbols,
but no completion events will ever be generated
for a nulling symbol.
Note that this implies that no completion event will
ever be generated at earleme 0,
the start of parsing.
The completion event marking cannot be changed once
the grammar is precomputed.
However, if a symbol is marked as a completion event symbol,
it can be deactivated
and reactivated in the recognizer.
Success: On success, 1 if symbol @var{sym_id}
is a completion event symbol after the
call, 0 otherwise.
Failures:
If @var{sym_id} is well-formed, but there is no
such symbol, @minus{}1.
If the grammar @var{g} is precomputed,
or on other failure, @minus{}2.
@end deftypefun
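As a sketch of how the completion event marking might be set,
the following hypothetical helper marks one symbol
and checks the result against the return values documented above.
The name @code{watch_completion} is an assumption made for illustration.
@example
#include "marpa.h"

/* Hypothetical helper: returns 0 on success, -1 on failure.
   It must be called before marpa_g_precompute(). */
static int
watch_completion (Marpa_Grammar g, Marpa_Symbol_ID sym_id)
@{
  int rv = marpa_g_symbol_is_completion_event_set (g, sym_id, 1);
  /* On success the call returns the marking after the call,
     which is 1 here; -1 and -2 are the documented failures. */
  return rv == 1 ? 0 : -1;
@}
@end example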
@deftypefun int marpa_g_symbol_is_nulled_event ( @
Marpa_Grammar @var{g}, @
Marpa_Symbol_ID @var{sym_id})
@deftypefunx int marpa_g_symbol_is_nulled_event_set ( @
Marpa_Grammar @var{g}, @
Marpa_Symbol_ID @var{sym_id}, @
int @var{value})
Libmarpa can be set up to generate
a @code{MARPA_EVENT_SYMBOL_NULLED}
event whenever the symbol is nulled.
A symbol is said to be @strong{nulled}
when a zero length instance of that symbol
is recognized.
For nulled events to occur, the symbol must be marked
as a nulled event symbol.
The @code{marpa_g_symbol_is_nulled_event_set()} function
marks symbol @var{sym_id} as a nulled event symbol
if @var{value} is 1,
and unmarks
it as a nulled event symbol if @var{value} is 0.
The @code{marpa_g_symbol_is_nulled_event()} method
returns the current value of the nulled event marking
for symbol @var{sym_id}.
Note that earleme 0 is an exception.
It is possible for a symbol to be nulled at
earleme 0, the start of parsing,
but no nulled symbol event will be generated.
When a symbol is recognized, it can generate
a nulled event or a completion event,
but recognition of a single instance of a symbol will
generate at most one or the other event -- never
both.
Symbols recognized
as zero length can only generate nulled events.
Symbols recognized
as non-zero in length can only generate completed events.
Zero length derivations can be ambiguous.
When a zero length symbol is recognized,
all of its zero-length derivations are also considered to be
recognized.
The nulled event marking cannot be changed once
the grammar is precomputed.
However, if a symbol is marked as a nulled event symbol,
it can be deactivated
and reactivated in the recognizer.
Success: On success, 1 if symbol @var{sym_id}
is a nulled event symbol after the
call, 0 otherwise.
Failures:
If @var{sym_id} is well-formed, but there is no
such symbol, @minus{}1.
If the grammar @var{g} is precomputed,
or on other failure, @minus{}2.
@end deftypefun
@deftypefun int marpa_g_symbol_is_nullable ( @
Marpa_Grammar @var{g}, Marpa_Symbol_ID @var{sym_id})
A symbol is @dfn{nullable} if it sometimes produces the empty string.
A @strong{nulling} symbol is always a @strong{nullable} symbol,
but not all @strong{nullable} symbols are @strong{nulling} symbols.
Return value: On success, 1 if symbol @var{sym_id} is nullable, 0 if not.
If @var{sym_id} is well-formed, but there is no
such symbol, -1.
If the grammar is not precomputed, or on other failure, @minus{}2.
@end deftypefun
@deftypefun int marpa_g_symbol_is_nulling (Marpa_Grammar @var{g}, @
Marpa_Symbol_ID @var{sym_id})
A symbol is @dfn{nulling} if it always produces the empty string.
Return value: On success, 1 if symbol @var{sym_id} is nulling, 0 if not.
If @var{sym_id} is well-formed, but there is no
such symbol, -1.
If the grammar is not precomputed, or on other failure, @minus{}2.
@end deftypefun
@deftypefun int marpa_g_symbol_is_productive (Marpa_Grammar @var{g}, @
Marpa_Symbol_ID @var{sym_id})
A symbol is @dfn{productive} if it can produce a string of terminals.
All nullable symbols are considered productive.
Return value: On success, 1 if symbol @var{sym_id} is productive, 0 if not.
If @var{sym_id} is well-formed, but there is no
such symbol, @minus{}1.
If the grammar
is not precomputed, or on other failure, @minus{}2.
@end deftypefun
@deftypefun int marpa_g_symbol_is_prediction_event ( @
Marpa_Grammar @var{g}, @
Marpa_Symbol_ID @var{sym_id})
@deftypefunx int marpa_g_symbol_is_prediction_event_set ( @
Marpa_Grammar @var{g}, @
Marpa_Symbol_ID @var{sym_id}, @
int @var{value})
Libmarpa can be set up
to generate a
@code{MARPA_EVENT_SYMBOL_PREDICTED}
event when a symbol is predicted.
A symbol is said to be @strong{predicted}
when it is acceptable at the current
earleme according to the grammar.
Note that earleme 0 is an exception.
Predicted symbol events are not generated
for symbols predicted at the start of parsing.
For predicted events to occur, the symbol must be marked
as a predicted event symbol.
The @code{marpa_g_symbol_is_prediction_event_set()} function
marks symbol @var{sym_id} as a predicted event symbol
if @var{value} is 1,
and unmarks
it as a predicted event symbol if @var{value} is 0.
The @code{marpa_g_symbol_is_prediction_event()} method
returns the current value of the predicted event marking
for symbol @var{sym_id}.
The predicted event marking cannot be changed once
the grammar is precomputed.
However, if a symbol is marked as a predicted event symbol,
it can be deactivated
and reactivated in the recognizer.
Success: On success, 1 if symbol @var{sym_id}
is a predicted event symbol after the
call, 0 otherwise.
Failures:
If @var{sym_id} is well-formed, but there is no
such symbol, @minus{}1.
If the grammar @var{g} is precomputed,
or on other failure, @minus{}2.
@end deftypefun
@deftypefun int marpa_g_symbol_is_start ( Marpa_Grammar @var{g}, @
Marpa_Symbol_ID @var{sym_id})
The return value of this call indicates whether @var{sym_id}
is the start symbol.
Return value:
On success, 1 if @var{sym_id} is the start symbol;
0 otherwise.
If @var{sym_id} is well-formed, but there is no
such symbol, -1.
On other failure, @minus{}2.
@end deftypefun
@deftypefun int marpa_g_symbol_is_terminal ( @
Marpa_Grammar @var{g}, @
Marpa_Symbol_ID @var{sym_id})
@deftypefunx int marpa_g_symbol_is_terminal_set ( @
Marpa_Grammar @var{g}, @
Marpa_Symbol_ID @var{sym_id}, @
int @var{value})
These methods, respectively, query
and set the ``terminal status'' of a symbol.
To be used as an input symbol
in the @code{marpa_r_alternative()} method,
a symbol must be a terminal.
The @code{marpa_g_symbol_is_terminal_set()} method flags symbol @var{sym_id} as a terminal if
@var{value} is 1,
or flags it as a non-terminal if @var{value} is 0.
Once set to a value with the
@code{marpa_g_symbol_is_terminal_set()} method,
the terminal status of a symbol is ``locked'' at that value.
A subsequent call to
@code{marpa_g_symbol_is_terminal_set()} that attempts
to change the terminal status
of @var{sym_id} to a value different from its current one
will fail.
The error code will be @code{MARPA_ERR_TERMINAL_IS_LOCKED}.
By default, a symbol is a terminal if and only if it
does not appear on the LHS of any rule.
An attempt to flag a nulling symbol
as a terminal will cause a failure,
but this is not necessarily detected before precomputation.
Return value: On success, 1 if symbol @var{sym_id}
is a terminal symbol after the
call, 0 otherwise.
If @var{sym_id} is well-formed, but there is no
such symbol, -1.
If @var{value} is not 0 or 1;
if the terminal status is locked and @var{value}
is different from its current value;
if the grammar @var{g} is precomputed;
or on other failure, @minus{}2.
@end deftypefun
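The ``locking'' behavior described above can be sketched
with a short program.
This is an illustration only;
it assumes that @code{marpa_g_error()} takes a grammar
and a pointer to a string pointer,
as described with the other error methods.
@example
#include <stdio.h>
#include "marpa.h"

int
main (void)
@{
  Marpa_Config config;
  Marpa_Grammar g;
  Marpa_Symbol_ID sym_id;
  const char *error_string = NULL;
  int status = 1;

  marpa_c_init (&config);
  g = marpa_g_new (&config);
  if (!g)
    return 1;
  sym_id = marpa_g_symbol_new (g);
  if (sym_id < 0)
    goto done;

  /* First call: flags the symbol as a terminal and locks the status. */
  if (marpa_g_symbol_is_terminal_set (g, sym_id, 1) != 1)
    goto done;

  /* Second call, with a different value: expected to fail. */
  if (marpa_g_symbol_is_terminal_set (g, sym_id, 0) == -2
      && marpa_g_error (g, &error_string)
         == MARPA_ERR_TERMINAL_IS_LOCKED)
    @{
      puts ("terminal status is locked");
      status = 0;
    @}

done:
  marpa_g_unref (g);
  return status;
@}
@end example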
@deftypefun Marpa_Symbol_ID marpa_g_symbol_new (Marpa_Grammar @var{g})
Creates a new symbol.
The symbol ID will be a non-negative integer.
Return value: On success, the ID of a new symbol.
On failure, @minus{}2.
@end deftypefun
@node Rules, Sequences, Symbols, Grammar methods
@section Rules
@deftypefun int marpa_g_highest_rule_id (Marpa_Grammar @var{g})
Return value: On success, the numerically largest rule ID
of @var{g}.
On failure, @minus{}2.
@end deftypefun
@deftypefun int marpa_g_rule_is_accessible (Marpa_Grammar @var{g}, @
Marpa_Rule_ID @var{rule_id})
A rule is @dfn{accessible} if it can be reached from the start symbol.
A rule is accessible if and only if its LHS symbol is accessible.
The start rule is always an accessible rule.
Return value: On success, 1 if rule @var{rule_id}
is accessible, 0 if not.
If @var{rule_id} is well-formed, but there is no
such rule, -1.
If the grammar
is not precomputed, or on other failure, @minus{}2.
@end deftypefun
@deftypefun int marpa_g_rule_is_nullable ( @
Marpa_Grammar @var{g}, Marpa_Rule_ID @var{rule_id})
A rule is @dfn{nullable} if it sometimes produces the empty string.
A @strong{nulling} rule is always a @strong{nullable} rule,
but not all @strong{nullable} rules are @strong{nulling} rules.
Return value: On success,
1 if rule @var{rule_id} is nullable, 0 if not.
If @var{rule_id} is well-formed, but there is no
such rule, @minus{}1.
If the grammar is not precomputed, or on other failure, @minus{}2.
@end deftypefun
@deftypefun int marpa_g_rule_is_nulling (Marpa_Grammar @var{g}, @
Marpa_Rule_ID @var{rule_id})
A rule is @dfn{nulling} if it always produces the empty string.
Return value: On success,
1 if rule @var{rule_id} is nulling, 0 if not.
If @var{rule_id} is well-formed, but there is no
such rule, @minus{}1.
If the grammar is not precomputed, or on other failure, @minus{}2.
@end deftypefun
@deftypefun int marpa_g_rule_is_loop (Marpa_Grammar @var{g}, @
Marpa_Rule_ID @var{rule_id})
A rule is a loop rule if it non-trivially
produces the string of length one
which consists only of its LHS symbol.
Such a derivation takes the parse back to where
it started, hence the term ``loop''.
``Non-trivially'' means the zero-step derivation does not count --- the
derivation must have at least one step.
The presence of a loop rule makes a grammar infinitely ambiguous,
and applications will typically want to treat them as fatal errors.
But nothing forces an application to do this,
and Marpa will successfully parse and evaluate grammars with
loop rules.
Return value: On success,
1 if rule @var{rule_id} is a loop rule, 0 if not.
If @var{rule_id} is well-formed, but there is no
such rule, -1.
If the grammar
is not precomputed, or on other failure, @minus{}2.
@end deftypefun
@deftypefun int marpa_g_rule_is_productive (Marpa_Grammar @var{g}, @
Marpa_Rule_ID @var{rule_id})
A rule is @dfn{productive} if it can produce a string of terminals.
A rule is productive if and only if all the symbols on
its RHS are productive.
The empty string counts as a string of terminals,
so that a nullable rule is always a productive rule.
For that same reason,
an empty rule is considered productive.
Return value: On success,
1 if rule @var{rule_id} is productive, 0 if not.
If @var{rule_id} is well-formed, but there is no
such rule, -1.
If the grammar is not precomputed, or on other failure, @minus{}2.
@end deftypefun
@deftypefun int marpa_g_rule_length ( @
Marpa_Grammar @var{g}, @
Marpa_Rule_ID @var{rule_id})
The length of a rule is the number of symbols on its RHS.
Return value: On success, the length of rule @var{rule_id}.
On failure, @minus{}2.
@end deftypefun
@deftypefun Marpa_Symbol_ID marpa_g_rule_lhs ( @
Marpa_Grammar @var{g}, @
Marpa_Rule_ID @var{rule_id})
Return value: On success, the LHS symbol of rule @var{rule_id}.
If @var{rule_id} is well-formed, but there is no
such rule, -1.
On other failure, @minus{}2.
@end deftypefun
@deftypefun Marpa_Rule_ID marpa_g_rule_new (Marpa_Grammar @var{g}, @
Marpa_Symbol_ID @var{lhs_id}, @
Marpa_Symbol_ID *@var{rhs_ids}, @
int @var{length})
Creates a new external BNF rule in grammar @var{g}.
The ID of the new rule will be a non-negative integer,
which will be unique to that rule.
In addition to BNF rules, Marpa also allows sequence rules,
which are created by another method:
@code{marpa_g_sequence_new()}.
Sequence rules and BNF rules are numbered in the same series,
so that a BNF rule will never have the same rule ID as a sequence
rule, and vice versa.
The LHS symbol is @var{lhs_id},
and there are @var{length} symbols on the RHS.
The RHS symbols are in an array
pointed to by @var{rhs_ids}.
Possible failures, with their error codes, include:
@itemize
@item @code{MARPA_ERR_SEQUENCE_LHS_NOT_UNIQUE}: The LHS symbol is the same
as that of a sequence rule.
@item @code{MARPA_ERR_DUPLICATE_RULE}: The new rule would duplicate another BNF
rule.
Another BNF rule is considered the duplicate of the new one,
if its LHS symbol is the same as symbol @var{lhs_id},
if its length is the same as @var{length},
and if its RHS symbols match one for one those
in the array of symbols @var{rhs_ids}.
@end itemize
Return value: On success, the ID of the new external rule.
On failure, @minus{}2.
@end deftypefun
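An application that builds its grammar from several sources
may want to tolerate exact duplicates.
The following sketch shows one way to do that.
The helper name @code{add_rule_once} and its return convention
are hypothetical,
and @code{marpa_g_error()} is assumed to take a grammar
and a pointer to a string pointer.
@example
#include <stddef.h>
#include "marpa.h"

/* Hypothetical helper: adds the rule lhs ::= rhs[0..length-1],
   treating an exact duplicate of an existing BNF rule as harmless.
   Returns the new rule ID, -1 for a duplicate,
   and -2 on any other failure. */
static int
add_rule_once (Marpa_Grammar g, Marpa_Symbol_ID lhs,
               Marpa_Symbol_ID *rhs, int length)
@{
  const char *error_string = NULL;
  Marpa_Rule_ID rule_id = marpa_g_rule_new (g, lhs, rhs, length);
  if (rule_id >= 0)
    return rule_id;
  if (marpa_g_error (g, &error_string) == MARPA_ERR_DUPLICATE_RULE)
    return -1;
  return -2;
@}
@end example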
@deftypefun Marpa_Symbol_ID marpa_g_rule_rhs ( @
Marpa_Grammar @var{g}, @
Marpa_Rule_ID @var{rule_id}, @
int @var{ix})
Returns the ID of the symbol in position @var{ix}
in the RHS of rule @var{rule_id}.
The RHS position, @var{ix}, is zero-based.
Return value: On success,
the ID of the symbol in position @var{ix}
on the rule's RHS.
If @var{rule_id} is well-formed, but there is no
such rule, @minus{}1.
If @var{ix} is greater than or equal to the length of
the rule,
or on other failure, @minus{}2.
@end deftypefun
@node Sequences, Ranks, Rules, Grammar methods
@section Sequences
@deftypefun int marpa_g_rule_is_proper_separation ( @
Marpa_Grammar @var{g}, @
Marpa_Rule_ID @var{rule_id})
Return value:
On success, if rule @var{rule_id} is not a sequence rule, 0.
On success, if rule @var{rule_id}
is a sequence rule where the proper separation flag
is set, 1.
On success, if rule @var{rule_id}
is a sequence rule where the proper separation flag
is @strong{not} set, 0.
If @var{rule_id} is well-formed, but there is no
such rule, -1.
On failure, @minus{}2.
@end deftypefun
@deftypefun int marpa_g_sequence_min ( @
Marpa_Grammar @var{g}, @
Marpa_Rule_ID @var{rule_id})
Returns the minimum length of a sequence rule.
This accessor can also be used to test whether
or not a rule is a sequence rule.
@minus{}1 is returned if and only if the rule is valid
but not a sequence rule.
Return value:
If rule @var{rule_id} is a sequence rule, its minimum length.
If rule @var{rule_id} is valid, but is
not the rule ID of a sequence rule, @minus{}1.
If @var{rule_id} is well-formed, but there is no
such rule,
or on other failure, @minus{}2.
@end deftypefun
@deftypefun Marpa_Rule_ID marpa_g_sequence_new (Marpa_Grammar @var{g}, @
Marpa_Symbol_ID @var{lhs_id}, @
Marpa_Symbol_ID @var{rhs_id}, @
Marpa_Symbol_ID @var{separator_id}, @
int @var{min}, @
int @var{flags} )
Adds a new sequence rule to grammar @var{g}.
The ID of the new sequence rule will be a non-negative integer,
which is unique to that rule.
Sequence rules and BNF rules are numbered in the same series,
so that a BNF rule will never have the same rule ID as a sequence
rule, and vice versa.
The LHS symbol of the sequence rule is @var{lhs_id},
and the item to be repeated in the sequence is @var{rhs_id}.
The sequence must be repeated at least @var{min} times,
where @var{min} is 0 or 1.
If @var{separator_id} is non-negative,
it is a separator symbol.
If @code{flags & MARPA_PROPER_SEPARATION} is non-zero,
separation is ``proper'', that is,
a trailing separator is not allowed.
The term @dfn{proper} is based on the idea that
properly-speaking, separators should actually separate items.
Some higher-level Marpa interfaces offer the ability to
discard separators in the semantics,
and in fact will do this by default.
At the Libmarpa level, sequences always ``keep
separators''.
It is up to the programmer to arrange
to discard separators,
if that is what is desired.
The sequence RHS, or item,
is restricted to a single symbol,
and that symbol cannot be nullable.
If @var{separator_id} is a symbol, it also cannot
be a nullable symbol.
Nullables on the RHS of sequences are restricted
because they lead to highly ambiguous grammars.
Grammars of this kind are allowed by Libmarpa, but
they must be expressed using BNF rules, not sequence rules.
This is for two reasons:
First, sequence optimizations would not work
in the presence of nullables.
Second, since it is not completely clear what
an application intends
when it asks for a sequence of identical items,
some of which are nullable,
the user's intent can be more clearly expressed
directly in BNF.
The LHS symbol cannot be the LHS of any other rule,
whether a BNF rule or a sequence rule.
On an attempt to create a sequence rule with a duplicate
LHS,
@code{marpa_g_sequence_new()} fails,
setting the error code to
@code{MARPA_ERR_SEQUENCE_LHS_NOT_UNIQUE}.
Sequence rules do not add to the classes of grammar parsed
by Libmarpa ---
a sequence can always be written as BNF rules.
When a rule is created
with the @code{marpa_g_sequence_new()} method,
Libmarpa will understand that it is a sequence,
and will optimize accordingly.
The speedup is often considerable.
Return value: On success, the ID of the external rule.
On failure, @minus{}2.
@end deftypefun
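As a sketch of a typical use,
the following hypothetical helper adds a rule for
one or more items with a separator between them,
for example a comma-separated list.
The helper name and parameter names are assumptions made
for illustration.
@example
#include "marpa.h"

/* Hypothetical helper: adds a sequence rule for one or more
   items, with a separator between items and no trailing
   separator allowed.
   Returns the rule ID, or -2 on failure. */
static Marpa_Rule_ID
add_separated_list (Marpa_Grammar g, Marpa_Symbol_ID list,
                    Marpa_Symbol_ID item, Marpa_Symbol_ID separator)
@{
  return marpa_g_sequence_new (g, list, item, separator,
                               1, /* min: at least one item */
                               MARPA_PROPER_SEPARATION);
@}
@end example
Following the description above,
a negative @var{separator_id} would request a sequence without separators,
and a @var{flags} value of 0 would allow a trailing separator.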
@deftypefun int marpa_g_sequence_separator ( @
Marpa_Grammar @var{g}, @
Marpa_Rule_ID @var{rule_id})
Return value:
If rule @var{rule_id} is a sequence rule, its separator.
If rule @var{rule_id} is a sequence rule, but there is no
separator, @minus{}1.
If @var{rule_id} is
not a sequence rule,
does not exist
or is not well-formed;
or on other failure, @minus{}2.
@end deftypefun
@deftypefun int marpa_g_symbol_is_counted (Marpa_Grammar @var{g}, @
Marpa_Symbol_ID @var{sym_id})
A symbol is @dfn{counted}
if it appears on the RHS of a sequence rule,
or if it is used as
the separator symbol of a sequence rule.
Return value: On success,
1 if symbol @var{sym_id} is counted, 0 if not.
On failure, @minus{}2.
@end deftypefun
@node Ranks, Grammar precomputation, Sequences, Grammar methods
@section Ranks
@deftypefun Marpa_Rank marpa_g_rule_rank ( @
Marpa_Grammar @var{g}, @
Marpa_Rule_ID @var{rule_id})
@deftypefunx Marpa_Rank marpa_g_rule_rank_set ( @
Marpa_Grammar @var{g}, @
Marpa_Rule_ID @var{rule_id}, @
Marpa_Rank @var{rank})
These methods, respectively, query
and set the rank of rule @var{rule_id}.
When @var{rule_id} is created, its rank is
initialized to the default rank of the grammar.
Initially, the default rank of the grammar is 0.
Methods to reset the grammar's default rank
are currently in the experimental stage.
Return value: On success, returns
the rank @strong{after}
the call,
and sets the error code to
@code{MARPA_ERR_NONE}.
On failure, returns @minus{}2,
and sets the error code to an appropriate
value, which will never be
@code{MARPA_ERR_NONE}.
Note that when the rank is @minus{}2,
the error code is the only way to distinguish
success from failure.
The error code can be determined by using the
@code{marpa_g_error()} call.
@end deftypefun
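The need to consult the error code when the result is @minus{}2
can be sketched as follows.
The helper name @code{set_rank_checked} is hypothetical,
and @code{marpa_g_error()} is assumed to take a grammar
and a pointer to a string pointer.
@example
#include <stddef.h>
#include "marpa.h"

/* Hypothetical helper: returns 0 if the rank was set, -1 if not. */
static int
set_rank_checked (Marpa_Grammar g, Marpa_Rule_ID rule_id,
                  Marpa_Rank rank)
@{
  const char *error_string = NULL;
  Marpa_Rank result = marpa_g_rule_rank_set (g, rule_id, rank);
  if (result != -2)
    return 0;
  /* -2 is either the new rank or the failure value;
     only the error code tells them apart. */
  return marpa_g_error (g, &error_string) == MARPA_ERR_NONE ? 0 : -1;
@}
@end example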
@deftypefun int marpa_g_rule_null_high ( @
Marpa_Grammar @var{g}, @
Marpa_Rule_ID @var{rule_id})
@deftypefunx int marpa_g_rule_null_high_set ( @
Marpa_Grammar @var{g}, @
Marpa_Rule_ID @var{rule_id}, @
int @var{flag})
These methods, respectively, query
and set the ``null ranks high'' setting of the rule @var{rule_id}.
The ``null ranks high'' setting is either 0 or 1.
When @var{rule_id} is created, its ``null ranks high''
setting is initialized to 0.
The ``null ranks high'' setting affects the ranking of rules
with properly nullable symbols on their right hand side.
If a rule has properly nullable symbols on its RHS,
each instance in which it appears in a parse will have a pattern
of nulled and non-nulled symbols.
Such a pattern is called a ``null variant''.
If the ``null ranks high'' setting is 0 (the default),
nulled symbols rank low.
If the ``null ranks high'' setting is 1,
nulled symbols rank high.
Ranking of null variants is done from left to right.
Return value: On success, the setting
of the ``null ranks high'' flag @strong{after}
the call.
If the grammar has been precomputed,
or on other failure, @minus{}2.
@end deftypefun
@node Grammar precomputation, , Ranks, Grammar methods
@section Precomputing the Grammar
@deftypefun int marpa_g_precompute (Marpa_Grammar @var{g})
@anchor{marpa_g_precompute}
Precomputation is necessary for a recognizer to be generated
from a grammar.
On success, @code{marpa_g_precompute} returns a non-negative
number to indicate that it precomputed the grammar without
issues.
On failure, @code{marpa_g_precompute} returns @minus{}2.
Precomputation may return one or more events,
which may be queried using the
@code{marpa_g_event()} method.
At this point events only occur when failure is reported,
and events always report issues.
But application writers should expect future versions
to have events which are reported on success,
as well as events which do not represent issues.
A @code{MARPA_EVENT_LOOP_RULES} event occurs
when there are infinite loop rules (cycles)
in the grammar.
The presence of one or more of these will cause failure
to be reported,
but will not prevent the grammar from being precomputed.
Each @code{MARPA_EVENT_COUNTED_NULLABLE} event reports a nullable symbol
that appears on the right hand side of a sequence
rule, that is, a nullable ``counted'' symbol.
The presence of one or more of these will cause failure
to be reported,
and will prevent the grammar from being precomputed.
So that the programmer can fix several at once,
these failures are delayed until events are created
for all of the counted nullables.
Each @code{MARPA_EVENT_NULLING_TERMINAL} event is a nulling
symbol which is also flagged as a terminal.
Since terminals cannot be of zero length, this is a logical
impossibility.
The presence of one or more of these will cause failure
to be reported,
and will prevent the grammar from being precomputed.
So that the programmer can fix several at once,
the failure is delayed until events are created
for all of the nulling terminals.
Precomputation involves freezing
and then thoroughly checking the grammar.
Among the reasons for precomputation to fail
are the following:
@itemize
@item @code{MARPA_ERR_NO_RULES}: The grammar has no rules.
@item @code{MARPA_ERR_NO_START_SYMBOL}: No start symbol was specified.
@item @code{MARPA_ERR_INVALID_START_SYMBOL}: A start symbol ID was specified, but it
is not the ID of a valid symbol.
@item @code{MARPA_ERR_START_NOT_LHS}: The start symbol is not on the LHS of any rule.
@item @code{MARPA_ERR_UNPRODUCTIVE_START}: The start symbol is not productive.
@item @code{MARPA_ERR_COUNTED_NULLABLE}: A symbol on the RHS of a sequence rule is
nullable.
Libmarpa does not allow this.
@item @code{MARPA_ERR_NULLING_TERMINAL}: A terminal is also a nulling symbol.
Libmarpa does not allow this.
@end itemize
More details of these can be found under the
description of the appropriate code.
@xref{External error codes}.
@code{marpa_g_precompute()} is unusual in that it
is possible to treat one of its failures as ``advisory'',
and to proceed with parsing.
If
@code{marpa_g_precompute()} fails with an error code
of @code{MARPA_ERR_GRAMMAR_HAS_CYCLE},
parsing can proceed, just as it typically would for
success.
The grammar will have been precomputed, as
calling the @code{marpa_g_is_precomputed()} method
will confirm.
Most applications, however,
will want to treat failure
with @code{MARPA_ERR_GRAMMAR_HAS_CYCLE}
as just another failure,
and fix the cycles before parsing.
Cycles make a grammar infinitely ambiguous,
and are considered useless in current
practice.
Cycles make processing the grammar less
efficient, sometimes considerably so.
Detection of cycles is returned as failure
because that is by far the most convenient thing to do
for the vast majority of applications.
Return value: On success, a non-negative number.
On failure, @minus{}2.
@end deftypefun
@deftypefun int marpa_g_is_precomputed (Marpa_Grammar @var{g})
Return value: On success, 1
if grammar @var{g} is already precomputed,
0 otherwise.
On failure, @minus{}2.
@end deftypefun
@deftypefun int marpa_g_has_cycle (Marpa_Grammar @var{g})
This function allows the application to determine if grammar
@var{g} has a cycle.
As mentioned, most applications will want to treat these
as fatal errors.
To determine which rules are in the cycle,
@code{marpa_g_rule_is_loop()} can be used.
Return value: On success, 1 if the grammar has a cycle,
0 otherwise.
On failure, @minus{}2.
@end deftypefun
@node Recognizer methods, Progress reports, Grammar methods, Top
@chapter Recognizer methods
@menu
* Recognizer overview::
* Recognizer constructor::
* Recognizer reference counting::
* Recognizer life cycle mutators::
* Location accessors::
* Other parse status methods::
* Untested recognizer methods::
@end menu
@node Recognizer overview, Recognizer constructor, Recognizer methods, Recognizer methods
@section Overview
An archetypal application uses a recognizer to read input.
To create a recognizer, use the @code{marpa_r_new()} method.
When a recognizer is no longer in use, its memory can be freed
using the
@code{marpa_r_unref()} method.
To make a recognizer ready for input,
use the @code{marpa_r_start_input()} method.
The recognizer starts with its current earleme
at location 0.
To read a token at the current earleme,
use the @code{marpa_r_alternative()} call.
To complete the processing of the current earleme,
and move forward to a new one,
use the @code{marpa_r_earleme_complete()} call.
@node Recognizer constructor, Recognizer reference counting, Recognizer overview, Recognizer methods
@section Creating a new recognizer
@deftypefun Marpa_Recognizer marpa_r_new ( Marpa_Grammar @var{g} )
Creates a new recognizer.
The reference count of the recognizer will be 1.
The reference count of @var{g},
the base grammar,
will be incremented by one.
Return value: On success, the newly created recognizer.
If @var{g} is not precomputed, or on other failure, @code{NULL}.
@end deftypefun
@node Recognizer reference counting, Recognizer life cycle mutators, Recognizer constructor, Recognizer methods
@section Keeping the reference count of a recognizer
@deftypefun Marpa_Recognizer marpa_r_ref (Marpa_Recognizer @var{r})
Increases the reference count by 1.
Not needed by most applications.
Return value:
On success, the recognizer object, @var{r}.
On failure, @code{NULL}.
@end deftypefun
@deftypefun void marpa_r_unref (Marpa_Recognizer @var{r})
Decreases the reference count by 1,
destroying @var{r} once the reference count reaches
zero.
When @var{r} is destroyed, the reference count
of its base grammar is decreased by one.
If this takes the reference count of the base grammar
to zero, it too is destroyed.
@end deftypefun
@node Recognizer life cycle mutators, Location accessors, Recognizer reference counting, Recognizer methods
@section Life cycle mutators
@deftypefun int marpa_r_start_input (Marpa_Recognizer @var{r})
Makes @var{r} ready to accept input.
Return value: On success, a non-negative value.
On failure, @minus{}2.
@end deftypefun
@deftypefun int marpa_r_alternative (Marpa_Recognizer @var{r}, @
Marpa_Symbol_ID @var{token_id}, @
int @var{value}, @
int @var{length})
Reads a token into @var{r}.
The token will start at the current earleme.
Libmarpa allows tokens to be ambiguous, to be of
variable length and to overlap.
@var{token_id} is the symbol ID of the token,
which must be a terminal.
@var{length} is the length of the token.
@var{value} is an
integer that represents the value of the
token.
In applications where the token's actual value is not an integer, it is
expected that the application will use this value as
a ``virtual'' value,
perhaps finding the actual value by using @var{value}
to index an array.
@var{value} is not used inside Libmarpa --- it is simply
stored to be returned by the valuator
as a convenience for the application.
Some applications may prefer to track token values on
their own, perhaps based on
the earleme location and @var{token_id},
instead of using Libmarpa's token values.
A @var{value} of 0 is reserved for a now-deprecated feature.
Do not use it.
For more details on that feature, see the
section @ref{Valued and unvalued symbols}.
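The ``virtual'' value approach described above can be sketched as follows.
This is a minimal illustration, not part of Libmarpa:
the @code{token_table} structure, its capacity and its functions are
hypothetical application-side names.
The table hands out handles starting at 1, so that the reserved
@var{value} of 0 is never passed to @code{marpa_r_alternative()}.

@verbatim
```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical application-side table; not part of Libmarpa. */
#define TOKEN_TABLE_CAPACITY 64

struct token_table {
  const char *text[TOKEN_TABLE_CAPACITY + 1]; /* slot 0 is never used */
  int count;                                  /* number of stored values */
};

void token_table_init(struct token_table *t)
{
  t->count = 0;
  t->text[0] = NULL;
}

/* Store an actual token value.  Returns the handle to pass as the
   `value` argument (always >= 1), or -1 if the table is full. */
int token_table_add(struct token_table *t, const char *text)
{
  assert(t != NULL);
  if (t->count >= TOKEN_TABLE_CAPACITY)
    return -1;
  t->count++;
  t->text[t->count] = text;
  return t->count;
}

/* Map a handle back to the actual value.  Returns NULL for any
   handle that was never issued, including the reserved 0. */
const char *token_table_get(const struct token_table *t, int handle)
{
  if (handle < 1 || handle > t->count)
    return NULL;
  return t->text[handle];
}
```
@end verbatim

In this sketch the application would pass the handle returned by
@code{token_table_add()} as @var{value},
and call @code{token_table_get()} on the value that the valuator
later returns.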
When @code{marpa_r_alternative()}
is successful,
the value of the furthest earleme is set to
the greater of its value before the call,
and @var{current}+@var{length},
where @var{current} is the value of the current earleme.
The values of the current and latest earlemes
are unchanged by
calls to @code{marpa_r_alternative()}.
Several error codes leave the recognizer in a fully
recoverable state, allowing the application to
retry the @code{marpa_r_alternative()} method.
Retry is efficient, and quite usable as a parsing
technique.
The error code
of primary interest from this point of view
is @code{MARPA_ERR_UNEXPECTED_TOKEN_ID},
which indicates that the token was not accepted
because of its token ID.
Retry after this condition is used in several
applications,
and is called ``the Ruby Slippers technique''.
The error codes
@code{MARPA_ERR_DUPLICATE_TOKEN},
@code{MARPA_ERR_NO_TOKEN_EXPECTED_HERE}
and @code{MARPA_ERR_INACCESSIBLE_TOKEN}
also leave the recognizer in a fully recoverable
state, and may also be usable for the
Ruby Slippers or similar techniques.
At this writing,
the author knows of no applications which
attempt to recover from these errors.
Return value: On success, @code{MARPA_ERR_NONE}.
On failure, some other error code.
@end deftypefun
@deftypefun Marpa_Earleme marpa_r_earleme_complete (Marpa_Recognizer @var{r})
This method does the final processing for the current earleme.
It then advances the current earleme by one.
Note that @code{marpa_r_earleme_complete()} may be called
even when no tokens have been read at the current earleme ---
in the character-per-earleme input model, for example, tokens
can span many characters and, if the input is unambiguous over that
span, there will be no other tokens that start inside it.
As mentioned,
@code{marpa_r_earleme_complete()} always advances the current earleme,
incrementing its value by 1.
This means that the value of the current earleme after the call
will be one plus the value of the earleme processed by the call
to @code{marpa_r_earleme_complete()}.
If any token was accepted at the earleme being processed,
@code{marpa_r_earleme_complete()} creates a new Earley set
which will be the latest Earley set
and, after the call, the latest
earleme will be equal to the new current earleme.
If no token was accepted at the
earleme being processed,
no Earley set is created,
and the value of the latest earleme remains unchanged.
The value of the furthest earleme is never changed by
a call to @code{marpa_r_earleme_complete()}.
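The effect of successful calls on the current and furthest earlemes
can be summarized in a small model.
This is a sketch only: it does not call Libmarpa,
the @code{earleme_model} names are hypothetical,
and it assumes that the furthest earleme starts at 0.
The latest earleme is left out,
because it depends on the creation of Earley sets.

@verbatim
```c
#include <assert.h>

/* Hypothetical model of two earleme variables; not Libmarpa code. */
struct earleme_model {
  int current;   /* the current earleme */
  int furthest;  /* the furthest earleme reached by an accepted token */
};

/* Models the state after input starts.  A furthest earleme of 0
   at this point is an assumption of the sketch. */
void model_start_input(struct earleme_model *m)
{
  m->current = 0;
  m->furthest = 0;
}

/* Models a successful marpa_r_alternative(): the furthest earleme
   becomes the greater of its old value and current + length.
   The current earleme is unchanged. */
void model_alternative(struct earleme_model *m, int length)
{
  int end;
  assert(length >= 1);
  end = m->current + length;
  if (end > m->furthest)
    m->furthest = end;
}

/* Models a successful marpa_r_earleme_complete(): the current
   earleme advances by one.  The furthest earleme is unchanged. */
void model_earleme_complete(struct earleme_model *m)
{
  m->current += 1;
}
```
@end verbatim

For example, a token of length 3 read at earleme 0 moves the furthest
earleme to 3, and the following earleme completion moves only the current
earleme, to 1.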
During this method, one or more events may occur.
On success, this function returns the number of events
generated,
but it is important to note that events may be
created whether earleme completion fails or succeeds.
When this method fails,
the application must call @code{marpa_g_event()}
if it wants to determine if any events occurred.
Since the reason for failure to complete an earleme is often
detailed in the events, applications that fail will often
be at least as interested in the events as those
that succeed.
The @code{MARPA_EVENT_EARLEY_ITEM_THRESHOLD} event
indicates that an application-settable threshold
on the number of Earley items has been reached or exceeded.
What this means depends on the application,
but when the default threshold is exceeded,
it means that it is very likely
that the time and space resources consumed by
the parse will prove excessive.
A parse is ``exhausted'' when it can accept no more
input.
This can happen both on success and on failure.
Note that the failure due to parse exhaustion
only means failure
at the current earleme.
There may be successful parses
at earlier earlemes.
If a parse is exhausted, but successful,
an event with the event code
@code{MARPA_EVENT_EXHAUSTED} occurs.
Because the parse is exhausted,
no input will be accepted at later earlemes.
It is quite common for a parse to become exhausted
when it succeeds.
Many practical grammars are designed so that a
successful parse cannot be extended.
An exhausted parse may cause a failure,
in which case
@code{marpa_r_earleme_complete()} returns an
error
whose error code is @code{MARPA_ERR_PARSE_EXHAUSTED}.
For a parse to fail at an earleme due to exhaustion,
it must be the case that no alternatives were
accepted at that earleme.
In fact,
in the standard input model,
a failure due to parse exhaustion
occurs if and only if
no alternatives
were accepted at the current earleme.
The circumstances under which
failure due to parse exhaustion occurs
are slightly more complicated
when variable length tokens are in use.
Informally,
a parse will never fail due to exhaustion as long
as it is possible that a token
ending at some future
earleme will continue the parse.
More precisely,
a call to
@code{marpa_r_earleme_complete()} fails
due to parse exhaustion
if and only if, first,
no alternatives were added at the current earleme
and, second,
that call left the current
earleme equal to the furthest earleme.
Return value: On success, the number of events generated.
On failure, @minus{}2.
@end deftypefun
@node Location accessors, Other parse status methods, Recognizer life cycle mutators, Recognizer methods
@section Location accessors
@deftypefun @code{unsigned int} marpa_r_current_earleme (Marpa_Recognizer @var{r})
Return value: If input has started, the current earleme.
If input has not started, @minus{}1.
Always succeeds.
@end deftypefun
@deftypefun Marpa_Earleme marpa_r_earleme ( @
Marpa_Recognizer @var{r}, @
Marpa_Earley_Set_ID @var{set_id})
In the default, token-stream model, Earley set ID and earleme
are always equal, but this is not the case in other input
models.
(The ID of an Earley set is also called its ordinal.)
If there is no Earley set whose ID is
@var{set_id},
@code{marpa_r_earleme()} fails.
If @var{set_id} is negative,
the error code is set to
@code{MARPA_ERR_INVALID_LOCATION}.
If @var{set_id} is greater than the ordinal
of the latest Earley set,
the error code is set to
@code{MARPA_ERR_NO_EARLEY_SET_AT_LOCATION}.
At this writing, there is no method for
the inverse operation (conversion of an earleme to an Earley set
ID).
One consideration in writing
such a method is that not all earlemes correspond to Earley sets.
Applications that want to map earlemes
to Earley sets will have no trouble if they
are using the standard input model ---
the Earley set
ID is always exactly equal to the earleme in that model.
For other applications
that want an earleme-to-ID mapping,
the most general method is to create an ID-to-earleme
array using the @code{marpa_r_earleme()} method
and invert it.
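The inversion step can be sketched as follows.
The sketch assumes that the application has already filled
an ID-to-earleme array,
for example by calling @code{marpa_r_earleme()} once for each Earley set ID.
The function name @code{invert_earleme_map} is hypothetical,
and plain @code{int} stands in for the Libmarpa ID and earleme types.
Earlemes that have no Earley set are marked with @minus{}1.

@verbatim
```c
#include <assert.h>
#include <stddef.h>

/* Build an earleme-to-ID array from an ID-to-earleme array.
   id_to_earleme[i] is the earleme of the Earley set whose ID is i.
   earleme_to_id must have room for earleme_count entries.
   Earlemes without an Earley set are set to -1.
   Returns 0 on success, -1 if an earleme is out of range. */
int invert_earleme_map(const int *id_to_earleme, size_t set_count,
                       int *earleme_to_id, size_t earleme_count)
{
  size_t i;
  assert(id_to_earleme != NULL && earleme_to_id != NULL);
  for (i = 0; i < earleme_count; i++)
    earleme_to_id[i] = -1;
  for (i = 0; i < set_count; i++) {
    int earleme = id_to_earleme[i];
    if (earleme < 0 || (size_t)earleme >= earleme_count)
      return -1;
    earleme_to_id[earleme] = (int)i;
  }
  return 0;
}
```
@end verbatim

A suitable size for the result is one more than the earleme of the
latest Earley set, since that is the highest earleme that can appear
in the ID-to-earleme array.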
Return value:
On success,
the earleme corresponding to Earley
set @var{set_id}.
On failure, @minus{}2.
@end deftypefun
@deftypefun int marpa_r_earley_set_value (@
Marpa_Recognizer @var{r}, @
Marpa_Earley_Set_ID @var{earley_set})
Returns the integer value of @var{earley_set}.
For more details, see
the description of @code{marpa_r_earley_set_values()}.
Return value: On success, the value of @var{earley_set}.
On failure, @minus{}2.
@end deftypefun
@deftypefun int marpa_r_earley_set_values (@
Marpa_Recognizer @var{r}, @
Marpa_Earley_Set_ID @var{earley_set}, @
int* @var{p_value}, @
void** @var{p_pvalue} @
)
If @var{p_value} is not @code{NULL},
sets the location pointed to by
@var{p_value} to the integer value of the Earley set.
Similarly, if @var{p_pvalue} is not @code{NULL},
sets the location pointed to by
@var{p_pvalue} to the pointer value of the Earley set.
The ``value'' and ``pointer'' of an Earley set are
an arbitrary integer
and an arbitrary pointer
that the application
can use for its own purposes.
In character-per-earleme input models, for example,
the integer can be the codepoint of the current character.
In a traditional token-per-earleme input model,
they could be used to indicate the string value of the token:
the pointer could point to the start of the string,
and the integer could indicate its length.
The Earley set value and pointer can be set using
the
@code{marpa_r_latest_earley_set_values_set()} method.
The Earley set integer value defaults to @minus{}1,
and the pointer value defaults to @code{NULL}.
Return value: On success, returns a non-negative integer.
On failure, returns @minus{}2.
@end deftypefun
@deftypefun @code{unsigned int} marpa_r_furthest_earleme (Marpa_Recognizer @var{r})
Return value: On success, the furthest earleme.
Always succeeds.
@end deftypefun
@deftypefun Marpa_Earley_Set_ID marpa_r_latest_earley_set (Marpa_Recognizer @var{r})
This method returns the Earley set ID (ordinal) of the latest Earley set.
Applications that want the
value of the latest earleme can convert
this value using
the @code{marpa_r_earleme()} method.
Return value: On success, the ID of the latest Earley set.
Always succeeds.
@end deftypefun
@deftypefun int marpa_r_latest_earley_set_value_set (@
Marpa_Recognizer @var{r}, @
int @var{value})
Sets the integer value of the latest Earley set.
For more details, see
the description of
@code{marpa_r_latest_earley_set_values_set()}.
Return value: On success, the new integer value of the latest Earley set.
On failure, @minus{}2.
@end deftypefun
@deftypefun int marpa_r_latest_earley_set_values_set (@
Marpa_Recognizer @var{r}, @
int @var{value}, @
void* @var{pvalue})
Sets the integer and pointer value of the latest Earley set.
For more about the ``integer value'' and ``pointer value''
of an Earley set,
see the description of the
@code{marpa_r_earley_set_values()} method.
Return value: On success, returns a non-negative integer.
On failure, returns @minus{}2.
@end deftypefun
@node Other parse status methods, Untested recognizer methods, Location accessors, Recognizer methods
@section Other parse status methods
@deftypefun int marpa_r_earley_item_warning_threshold (Marpa_Recognizer @var{r})
@deftypefunx int marpa_r_earley_item_warning_threshold_set (Marpa_Recognizer @var{r}, @
int @var{threshold})
These methods, respectively, report and set the Earley item warning threshold.
The @dfn{Earley item warning threshold}
is a number that is compared with
the count of Earley items in each Earley set.
When it is matched or exceeded,
a @code{MARPA_EVENT_EARLEY_ITEM_THRESHOLD} event is created.
If @var{threshold} is zero or less,
an unlimited number of Earley items
will be allowed without warning.
This will rarely be what the user wants.
By default, Libmarpa calculates a value based on the grammar.
The formula Libmarpa uses is the result of some experience,
and most applications will
be happy with it.
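The comparison described above amounts to the following check.
This is a sketch of the documented rule only, not Libmarpa's internal code,
and the function name @code{threshold_event_fires} is hypothetical.

@verbatim
```c
#include <assert.h>

/* Returns 1 if an Earley set holding item_count Earley items would
   trigger the warning event for the given threshold, 0 otherwise.
   A threshold of zero or less means "never warn". */
int threshold_event_fires(int item_count, int threshold)
{
  assert(item_count >= 0);
  if (threshold <= 0)
    return 0;
  return item_count >= threshold;
}
```
@end verbatim

Note that the event fires when the count matches the threshold,
not only when the count exceeds it.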
Return value:
The value that the Earley item warning threshold has
after the method call is finished.
Always succeeds.
@end deftypefun
@deftypefun int marpa_r_expected_symbol_event_set ( @
Marpa_Recognizer @var{r}, @
Marpa_Symbol_ID @var{symbol_id}, @
int @var{value})
Sets the ``expected symbol event bit'' for @var{symbol_id} to @var{value}.
A recognizer event is created whenever
symbol @var{symbol_id} is expected at the current earleme,
if and only if the expected symbol event bit
for @var{symbol_id} is 1.
The ``expected symbol event bit'' must be 1 or 0.
In this context, ``expected'' means ``expected as a terminal''.
Even if a symbol is predicted at the current earleme,
if it is not acceptable as a terminal,
it does not trigger an
``expected symbol event''.
By default, the ``expected symbol event bit'' is 0.
It is an error to attempt to set the
``expected symbol event bit'' to 1 for a nulling symbol,
an inaccessible symbol,
or an unproductive symbol.
Return value:
The value of the event bit after the method call is finished.
@minus{}2 if @var{symbol_id} is not the ID of a valid symbol;
if it is the ID of a nulling, inaccessible or unproductive symbol;
or on other failure.
@end deftypefun
@deftypefun int marpa_r_is_exhausted (Marpa_Recognizer @var{r})
A parser is ``exhausted'' if it cannot accept any more input.
Both successful and failed parses can be exhausted.
In many grammars,
the parse is always exhausted as soon as it succeeds.
Good parses may also exist at earlemes prior to the
current one.
Return value:
1 if the parser is exhausted, 0 otherwise.
Always succeeds.
@end deftypefun
@deftypefun int marpa_r_terminals_expected ( @
Marpa_Recognizer @var{r}, @
Marpa_Symbol_ID* @var{buffer})
Returns a list of the ID's of the symbols
that are acceptable as tokens
at the current earleme.
@var{buffer} is expected to be large enough to hold
the result.
This is guaranteed to be the case if the buffer
is large enough to hold a number of
@code{Marpa_Symbol_ID}'s that
is greater than or equal to the number of symbols
in the grammar.
Return value: On success, the number of @code{Marpa_Symbol_ID}'s
in @var{buffer}.
On failure, @minus{}2.
@end deftypefun
@deftypefun int marpa_r_terminal_is_expected ( @
Marpa_Recognizer @var{r}, @
Marpa_Symbol_ID @var{symbol_id})
Return values on success:
If @var{symbol_id} is the
ID of a valid terminal symbol that is expected at
the current earleme,
a number greater than zero.
If @var{symbol_id} is the
ID of a valid terminal symbol that is
@strong{not} expected at
the current earleme, or
if @var{symbol_id} is the ID of a valid symbol
that is not a terminal, zero.
Failure cases:
Returns @minus{}2 on failure.
It is a failure
if @var{symbol_id} is not the ID of a valid
symbol.
@end deftypefun
@node Untested recognizer methods, , Other parse status methods, Recognizer methods
@section Untested recognizer methods
@deftypefun int marpa_r_completion_symbol_activate ( @
Marpa_Recognizer @var{r}, @
Marpa_Symbol_ID @var{sym_id}, @
int @var{reactivate} )
Allows the user to deactivate and reactivate symbol completion
events in the recognizer.
If @var{reactivate} is zero, the event is deactivated.
If @var{reactivate} is one, the event is activated.
Symbol completion events are active by default if
the symbol was set up for completion events in the
grammar.
If a symbol was not set up for completion events in
the grammar, symbol completion events are inactive
by default and any attempt to change that is a fatal error.
Success cases:
On success,
the method returns the value of @var{reactivate}.
The method succeeds trivially
if the symbol is already set as indicated by
@var{reactivate}.
Failure cases:
If the active status
of the
completion event for @var{sym_id}
cannot be set as indicated by @var{reactivate},
the method fails.
On failure, @minus{}2 is returned.
@end deftypefun
@deftypefun int marpa_r_nulled_symbol_activate ( @
Marpa_Recognizer @var{r}, @
Marpa_Symbol_ID @var{sym_id}, @
int @var{boolean} )
Allows the user to deactivate and reactivate symbol nulled
events in the recognizer.
If @var{boolean} is zero, the event is deactivated.
If @var{boolean} is one, the event is activated.
Symbol nulled events are active by default if
the symbol was set up for nulled events in the
grammar.
If a symbol was not set up for nulled events in
the grammar, symbol nulled events are inactive
by default and any attempt to change that is a fatal error.
Success cases:
On success,
the method returns the value of @var{boolean}.
The method succeeds trivially
if the symbol is already set as indicated by
@var{boolean}.
Failure cases:
If the active status
of the
nulled event for @var{sym_id}
cannot be set as indicated by @var{boolean},
the method fails.
On failure, @minus{}2 is returned.
@end deftypefun
@deftypefun int marpa_r_prediction_symbol_activate ( @
Marpa_Recognizer @var{r}, @
Marpa_Symbol_ID @var{sym_id}, @
int @var{boolean} )
Allows the user to deactivate and reactivate symbol prediction
events in the recognizer.
If @var{boolean} is zero, the event is deactivated.
If @var{boolean} is one, the event is activated.
Symbol prediction events are active by default if
the symbol was set up for prediction events in the
grammar.
If a symbol was not set up for prediction events in
the grammar, symbol prediction events are inactive
by default and any attempt to change that is a fatal error.
Success cases:
On success,
the method returns the value of @var{boolean}.
The method succeeds trivially
if the symbol is already set as indicated by
@var{boolean}.
Failure cases:
If the active status
of the
prediction event for @var{sym_id}
cannot be set as indicated by @var{boolean},
the method fails.
On failure, @minus{}2 is returned.
@end deftypefun
@node Progress reports, Bocage methods, Recognizer methods, Top
@chapter Progress reports
An important advantage of the Marpa algorithm is the ability
to easily get full information about the state of the parse.
To start a progress report,
use the @code{marpa_r_progress_report_start()} method.
Only one progress report can be in use at any one time.
To get the information in a progress report,
it is necessary to step through the progress report
items.
To get the data for the current progress report item,
and advance to the next one,
use the @code{marpa_r_progress_item()} method.
To destroy a progress report,
freeing the memory it uses,
call the @code{marpa_r_progress_report_finish()} method.
@deftypefun int marpa_r_progress_report_reset ( @
Marpa_Recognizer @var{r})
Resets the progress report.
Assumes a report of the progress has already been initialized
at some Earley set
for recognizer @var{r},
with
@code{marpa_r_progress_report_start()}.
The reset progress report will
be positioned before its first item.
Return value: On success, a non-negative value.
On failure, @minus{}2.
@end deftypefun
@deftypefun int marpa_r_progress_report_start ( @
Marpa_Recognizer @var{r}, @
Marpa_Earley_Set_ID @var{set_id})
Initializes a report of the progress at Earley set @var{set_id}
for recognizer @var{r}.
If a progress report already exists, it is destroyed and its
memory is freed.
Initially,
the progress report is positioned before its first item.
If no Earley set with ID
@var{set_id} exists,
@code{marpa_r_progress_report_start()} fails.
The error code is @code{MARPA_ERR_INVALID_LOCATION} if @var{set_id}
is negative.
The error code is @code{MARPA_ERR_NO_EARLEY_SET_AT_LOCATION}
if @var{set_id} is greater than the ID of
the latest Earley set.
Return value: On success, the number of report items available.
If the recognizer has not been started;
if @var{set_id} does not exist;
or on other failure, @minus{}2.
@end deftypefun
@deftypefun int marpa_r_progress_report_finish ( @
Marpa_Recognizer @var{r} )
Destroys the current progress report
for recognizer @var{r},
freeing the memory and other resources.
It is often not necessary to call this method.
Any previously existing progress report
is destroyed automatically
whenever a new progress report is started,
and when the recognizer is destroyed.
Return value: @minus{}2 if no progress report has been started,
or on other failure.
On success, a non-negative value.
@end deftypefun
@deftypefun Marpa_Rule_ID marpa_r_progress_item ( @
Marpa_Recognizer @var{r}, @
int* @var{position}, @
Marpa_Earley_Set_ID* @var{origin} )
This method allows access to the data
for the next item of a
progress report.
If there are no more progress report items,
it returns @minus{}1 as a termination indicator
and sets the error code to @code{MARPA_ERR_PROGRESS_REPORT_EXHAUSTED}.
Either the termination indicator,
or the item count returned by
@code{marpa_r_progress_report_start()},
can be used to determine when the last
item has been seen.
On success,
the dot position is returned in the location
pointed to by the @var{position} argument,
and the origin is returned in the location
pointed to by the @var{origin} argument.
On failure, the locations pointed to by
the @var{position} and @var{origin}
arguments are unchanged.
Return value: On success, the rule ID of
the next progress report item.
If there are no more progress report items, @minus{}1.
If either the @var{position} or the @var{origin}
argument is @code{NULL},
or on other failure, @minus{}2.
@end deftypefun
@node Bocage methods, Ordering methods, Progress reports, Top
@chapter Bocage methods
@menu
* Bocage overview::
* Bocage constructor::
* Bocage reference counting::
* Bocage accessor::
@end menu
@node Bocage overview, Bocage constructor, Bocage methods, Bocage methods
@section Overview
A bocage is a structure containing the full set of parses found
by processing the input according to the grammar.
The bocage structure is new with Libmarpa, but is very similar
in purpose to the more familiar parse forests.
To create a bocage, use the @code{marpa_b_new()} method.
When a bocage is no longer in use, its memory can be freed
using the
@code{marpa_b_unref()} method.
@node Bocage constructor, Bocage reference counting, Bocage overview, Bocage methods
@section Creating a new bocage
@deftypefun Marpa_Bocage marpa_b_new (Marpa_Recognizer @var{r}, @
Marpa_Earley_Set_ID @var{earley_set_ID})
Creates a new bocage object, with a reference count of 1.
The reference count of its parent recognizer object, @var{r},
is increased by 1.
If there is no parse ending at Earley set @var{earley_set_ID},
@code{marpa_b_new} fails
and the error code is set to
@code{MARPA_ERR_NO_PARSE}.
Return value: On success, the new bocage object.
On failure, @code{NULL}.
@end deftypefun
@node Bocage reference counting, Bocage accessor, Bocage constructor, Bocage methods
@section Reference counting
@deftypefun Marpa_Bocage marpa_b_ref (Marpa_Bocage @var{b})
Increases the reference count by 1.
Not needed by most applications.
Return value:
On success, @var{b}.
On failure, @code{NULL}.
@end deftypefun
@deftypefun void marpa_b_unref (Marpa_Bocage @var{b})
Decreases the reference count by 1,
destroying @var{b} once the reference count reaches
zero.
When @var{b} is destroyed, the reference count
of its parent recognizer is decreased by 1.
If this takes the reference count of the parent recognizer
to zero, it too is destroyed.
If the parent recognizer is destroyed, the reference count
of its base grammar is decreased by 1.
If this takes the reference count of the base grammar
to zero, it too is destroyed.
@end deftypefun
@node Bocage accessor, , Bocage reference counting, Bocage methods
@section Accessors
@deftypefun int marpa_b_ambiguity_metric (Marpa_Bocage @var{b})
Returns an ambiguity metric.
For the time being, it is vaguely defined, but it does differentiate
between an ambiguous and an unambiguous bocage.
Return value on success:
1 if the bocage is not for an ambiguous parse;
a number greater than 1 if the bocage is for an ambiguous parse.
Failures: On failure, @minus{}2.
@end deftypefun
@deftypefun int marpa_b_is_null (Marpa_Bocage @var{b})
Return value on success:
A number greater than or equal to 1 if the bocage is for a null parse;
otherwise, 0.
Failures: On failure, @minus{}2.
@end deftypefun
@node Ordering methods, Tree methods, Bocage methods, Top
@chapter Ordering methods
@menu
* Ordering overview::
* Ordering constructor::
* Ordering reference counting::
* Order accessor::
* Non-default ordering::
@end menu
@node Ordering overview, Ordering constructor, Ordering methods, Ordering methods
@section Overview
Before the parses in the bocage can be iterated,
they must be ordered.
To create an ordering, use the @code{marpa_o_new()} method.
When an ordering is no longer in use, its memory can be freed
using the
@code{marpa_o_unref()} method.
An ordering is @dfn{frozen} once the first
tree iterator is created
using it.
A frozen ordering cannot be changed.
As of this writing, the only methods to order parses
are internal and undocumented.
This is expected to change.
@node Ordering constructor, Ordering reference counting, Ordering overview, Ordering methods
@section Creating an ordering
@deftypefun Marpa_Order marpa_o_new ( @
Marpa_Bocage @var{b})
Creates a new ordering object, with a reference count of 1.
The reference count of its parent bocage object, @var{b},
is increased by 1.
Return value: On success, the new ordering object.
On failure, @code{NULL}.
@end deftypefun
@node Ordering reference counting, Order accessor, Ordering constructor, Ordering methods
@section Reference counting
@deftypefun Marpa_Order marpa_o_ref ( @
Marpa_Order @var{o})
Increases the reference count by 1.
Not needed by most applications.
Return value:
On success, @var{o}.
On failure, @code{NULL}.
@end deftypefun
@deftypefun void marpa_o_unref ( @
Marpa_Order @var{o})
Decreases the reference count by 1,
destroying @var{o} once the reference count reaches
zero.
Beginning with @var{o}'s parent bocage,
Libmarpa then proceeds up the chain of parent objects.
Every time a child is destroyed, the
reference count of its parent is decreased by 1.
Every time the reference count of an object
is decreased by 1,
if that reference count is now zero,
that object is destroyed.
Libmarpa follows this chain of decrements
and destructions as required,
all the way back to the
base grammar, if necessary.
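The chain of decrements and destructions can be illustrated with a
small stand-alone model.
This is a sketch of the reference counting scheme as described here,
not Libmarpa's implementation:
@code{struct node}, @code{node_new}, @code{node_unref} and the
@code{live_nodes} counter are hypothetical names,
and a single node type stands in for grammars, recognizers,
bocages and orderings.

@verbatim
```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for any Libmarpa time object. */
struct node {
  int refcount;
  struct node *parent;  /* NULL for the base grammar */
};

int live_nodes = 0;     /* demonstration only: objects not yet destroyed */

/* Create a child of `parent` with a reference count of 1,
   and increase the reference count of the parent by 1. */
struct node *node_new(struct node *parent)
{
  struct node *n = malloc(sizeof *n);
  if (n == NULL)
    return NULL;
  n->refcount = 1;
  n->parent = parent;
  if (parent != NULL)
    parent->refcount++;
  live_nodes++;
  return n;
}

/* Decrease the reference count by 1.  If it reaches zero, destroy
   the object and repeat the process on its parent. */
void node_unref(struct node *n)
{
  while (n != NULL) {
    struct node *parent;
    if (--n->refcount > 0)
      return;
    parent = n->parent;
    free(n);
    live_nodes--;
    n = parent;
  }
}
```
@end verbatim

In this model, if the application has already released its own references
to the grammar, recognizer and bocage,
releasing the ordering destroys all four objects.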
@end deftypefun
@node Order accessor, Non-default ordering, Ordering reference counting, Ordering methods
@section Accessors
@deftypefun int marpa_o_ambiguity_metric (Marpa_Order @var{o})
Returns an ambiguity metric.
For the time being, it is vaguely defined, but it does differentiate
between an ambiguous and an unambiguous ordering.
Return value on success:
1 if the ordering is not for an ambiguous parse;
a number greater than 1 if the ordering is for an ambiguous parse.
Failures: On failure, @minus{}2.
@end deftypefun
@deftypefun int marpa_o_is_null (Marpa_Order @var{o})
Return value on success:
A number greater than or equal to 1 if the ordering is for a null parse;
otherwise, 0.
Failures: On failure, @minus{}2.
@end deftypefun
@node Non-default ordering, , Order accessor, Ordering methods
@section Non-default ordering
@deftypefun int marpa_o_high_rank_only_set ( @
Marpa_Order @var{o}, @
int @var{flag})
@deftypefunx int marpa_o_high_rank_only ( @
Marpa_Order @var{o})
These methods, respectively, set and discover
the ``high rank only'' flag of ordering @var{o}.
A @var{flag} of 1 indicates that, when ranking,
all choices should be discarded except those of the
highest rank.
A @var{flag} of 0 indicates that
no choices should be discarded on the
basis of their rank.
A value of 1 is the default.
The value of the ``high rank only'' flag has no effect
unless ranking has been turned on using the
@code{marpa_o_rank()} method.
Return value: On success, the value of the
``high rank only'' flag @strong{after}
the call.
On failure, @minus{}2.
@end deftypefun
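As a rough illustration of what the ``high rank only'' flag means,
the sketch below filters an array of ranks.
It is an assumption-laden model of the description above,
not Libmarpa's ranking code:
the function @code{filter_by_rank()} and its array-of-integers
representation of the choices are hypothetical.

@verbatim
#include <stddef.h>

/* Keep only the choices whose rank equals the highest rank present.
   Survivors are compacted to the front of ranks[], and the number
   remaining is returned.  With high_rank_only == 0 nothing is
   discarded. */
size_t filter_by_rank(int *ranks, size_t count, int high_rank_only)
{
  size_t i, kept = 0;
  int max;
  if (!high_rank_only || count == 0) return count;
  max = ranks[0];
  for (i = 1; i < count; i++)
    if (ranks[i] > max) max = ranks[i];
  for (i = 0; i < count; i++)
    if (ranks[i] == max) ranks[kept++] = ranks[i];
  return kept;
}
@end verbatim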
@deftypefun int marpa_o_rank ( Marpa_Order @var{o} )
By default, the ordering of parse trees is arbitrary.
This method causes the ordering to be ranked
according to the ranks of symbols and rules,
the ``null ranks high'' flags of the rules,
and the ``high rank only'' flag of the ordering.
Once this method returns, the ordering is frozen.
Return value: On success, a non-negative value.
On failure, @minus{}2.
@end deftypefun
@node Tree methods, Value methods, Ordering methods, Top
@chapter Tree methods
@menu
* Tree overview::
* Tree constructor::
* Tree reference counting::
* Tree iteration::
@end menu
@node Tree overview, Tree constructor, Tree methods, Tree methods
@section Overview
Once the bocage has an ordering, the parse trees can be iterated.
Marpa's @dfn{parse tree iterators} iterate the parse trees contained
in a bocage object.
In Libmarpa,
``parse tree iterators'' are usually just called @dfn{trees}.
To create a tree, use the @code{marpa_t_new()} method.
A newly created tree iterator is positioned before the first parse tree.
When a tree iterator is no longer in use, its memory can be freed
using the
@code{marpa_t_unref()} method.
To position a newly created tree iterator at the first parse tree,
use the @code{marpa_t_next()} method.
Once the tree iterator is positioned at a parse tree,
the same @code{marpa_t_next()} method is used
to position it to the next parse tree.
@node Tree constructor, Tree reference counting, Tree overview, Tree methods
@section Creating a new tree iterator
@deftypefun Marpa_Tree marpa_t_new (Marpa_Order @var{o})
Creates a new tree iterator, with a reference count of 1.
The reference count of its parent ordering object, @var{o},
is increased by 1.
When initialized, a tree iterator is positioned
before the first parse tree.
To position the tree iterator to the first parse,
the application must call @code{marpa_t_next()}.
Return value: On success, a newly created tree.
On failure, @code{NULL}.
@end deftypefun
@node Tree reference counting, Tree iteration, Tree constructor, Tree methods
@section Reference counting
@deftypefun Marpa_Tree marpa_t_ref (Marpa_Tree @var{t})
Increases the reference count by 1.
Not needed by most applications.
Return value:
On success, @var{t}.
On failure, @code{NULL}.
@end deftypefun
@deftypefun void marpa_t_unref (Marpa_Tree @var{t})
Decreases the reference count by 1,
destroying @var{t} once the reference count reaches
zero.
Beginning with @var{t}'s parent ordering,
Libmarpa then proceeds up the chain of parent objects.
Every time a child is destroyed, the
reference count of its parent is decreased by 1.
Every time the reference count of an object
is decreased by 1,
if that reference count is now zero,
that object is destroyed.
Libmarpa follows this chain of decrements
and destructions as required,
all the way back to the
base grammar, if necessary.
@end deftypefun
@node Tree iteration, , Tree reference counting, Tree methods
@section Iterating through the trees
@deftypefun int marpa_t_next ( @
Marpa_Tree @var{t})
Positions @var{t} at the next parse tree
in the iteration.
Tree iterators are initialized to the position
before the first parse tree,
so this method must be called before creating a valuator
from a tree.
If a tree iterator is positioned after the last parse,
the tree is said to be ``exhausted''.
A tree iterator for a bocage with no parse trees
is considered to be ``exhausted'' when initialized.
If the tree iterator is exhausted, @code{marpa_t_next()}
returns @minus{}1 as a termination indicator,
and sets the error code to
@code{MARPA_ERR_TREE_EXHAUSTED}.
Return value: On success, a non-negative value.
If the tree iterator is exhausted, @minus{}1.
On failure, @minus{}2.
@end deftypefun
@deftypefun int marpa_t_parse_count ( @
Marpa_Tree @var{t})
The parse counter counts the number of parse trees
traversed so far.
The count includes the current iteration of the
tree, so that a value of 0 indicates that the tree iterator
is at its initialized position,
before the first parse tree.
Return value: The number of parses traversed so far.
Always succeeds.
@end deftypefun
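The return-value convention of @code{marpa_t_next()}
suggests a simple loop shape.
The sketch below is written against a function pointer
so that it is self-contained.
The names @code{next_fn}, @code{count_trees()} and @code{toy_next()}
are hypothetical;
in a real application the @code{next} argument would be a thin wrapper
around @code{marpa_t_next()},
and a valuator would typically be created at the point
where the counter is incremented.

@verbatim
/* Hypothetical stand-in for marpa_t_next(): returns a non-negative
   value on success, -1 when the iterator is exhausted,
   and -2 on failure. */
typedef int (*next_fn)(void *iterator);

/* Visit every parse tree.  Returns the number of trees visited,
   or -2 if the iterator reported a failure. */
int count_trees(next_fn next, void *iterator)
{
  int visited = 0;
  for (;;) {
    int rc = next(iterator);
    if (rc == -1) return visited;  /* exhausted: normal termination */
    if (rc < 0) return -2;         /* hard failure */
    visited++;                     /* positioned at a parse tree */
  }
}

/* Toy iterator for demonstration: *state is the number of trees left. */
int toy_next(void *state)
{
  int *remaining = state;
  if (*remaining <= 0) return -1;
  (*remaining)--;
  return 0;
}
@end verbatim

Note that an iterator over a bocage with no parse trees is exhausted
from the start, so the loop body may never run.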
@node Value methods, Events, Tree methods, Top
@chapter Value methods
@menu
* Value overview::
* How to use the valuator::
* Advantages of step-driven valuation::
* Maintaining the stack::
* Valuator constructor::
* Valuator reference counting::
* Stepping through the valuator::
* Valuator steps by type::
* Basic step accessors::
* Other step accessors::
@end menu
@node Value overview, How to use the valuator, Value methods, Value methods
@section Overview
The archetypal application needs
a value object (or @dfn{valuator}) to produce
the value of the parse.
To create a valuator, use the @code{marpa_v_new()} method.
When a valuator is no longer in use, its memory can be freed
using the
@code{marpa_v_unref()} method.
The application is required to maintain the stack,
and the application is also required to implement
most of the semantics, including the evaluation
of rules.
Libmarpa's valuator provides instructions to
the application on how to manipulate the stack.
To iterate through this series of instructions,
use the @code{marpa_v_step()} method.
@code{marpa_v_step()} returns the type
of step.
Most step types have values associated with them.
To access these values use the methods
described in the section @ref{Basic step accessors}.
How to perform the steps is described in
the sections
@ref{How to use the valuator}
and @ref{Stepping through the valuator}.
@node How to use the valuator, Advantages of step-driven valuation, Value overview, Value methods
@section How to use the valuator
Libmarpa's valuator provides the application with
``steps'', which are
instructions for stack manipulation.
Libmarpa itself does not maintain a stack.
This leaves the upper layer in total control of the
stack and the values which are placed on it.
An example may make this clearer.
Suppose the evaluation is at a place in the parse tree
where an addition is being performed.
Libmarpa does not know that the operation
is an addition.
It will tell the application that rule number @var{R}
is to be applied to the arguments at stack locations
@var{N} and @var{N}+1, and that the result is to be placed in
stack location @var{N}.
In this system
the application keeps track of the semantics for all
rules, so it looks up rule @var{R} and determines that it
is an addition.
The application can do this by using @var{R} as an index
into an array of callbacks, or by any other method
it chooses.
Let's assume a callback implements the semantics
for rule @var{R}.
Libmarpa has told the application that two arguments
are available for this operation, and that they are
at locations @var{N} and @var{N}+1 in the stack.
They might be the numbers 42 and 711.
So the callback is called with its two arguments,
and produces a return value, let's say, 753.
Libmarpa has told the application that the result
belongs at location @var{N} in the stack,
so the application writes 753 to location @var{N}.
Since Libmarpa knows nothing about the semantics,
the operation for rule @var{R} could be string concatenation
instead of addition.
Or, if it is addition, it could allow for its arguments
to be floating point or complex numbers.
Since the application maintains the stack, it is up
to the application whether the stack contains integers,
strings, complex numbers, or polymorphic objects which are
capable of being any of these things and more.
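The addition example above can be sketched in C as follows.
This is a minimal illustration under stated assumptions:
the stack holds @code{long} values,
and the names @code{rule_semantics}, @code{semantics_by_rule}
and @code{apply_rule()} are hypothetical.
The rule ID and the stack locations are passed in as plain integers;
in a real application they would come from the step accessors.

@verbatim
/* Hypothetical semantic callback: receives the child values and
   their count, and returns the value of the rule. */
typedef long (*rule_semantics)(const long *args, int arg_count);

static long add(const long *args, int arg_count)
{
  long sum = 0;
  int i;
  for (i = 0; i < arg_count; i++) sum += args[i];
  return sum;
}

static long multiply(const long *args, int arg_count)
{
  long product = 1;
  int i;
  for (i = 0; i < arg_count; i++) product *= args[i];
  return product;
}

/* Application-owned table: the rule ID is the index. */
static const rule_semantics semantics_by_rule[] = { add, multiply };

/* Apply rule rule_id to stack[arg_0 .. arg_n] and store the
   value it produces at stack[result]. */
void apply_rule(long *stack, int rule_id, int arg_0, int arg_n, int result)
{
  stack[result] =
    semantics_by_rule[rule_id](stack + arg_0, arg_n - arg_0 + 1);
}
@end verbatim

With 42 and 711 at locations 1 and 2,
calling @code{apply_rule(stack, 0, 1, 2, 1)} leaves 753 at location 1.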
@node Advantages of step-driven valuation, Maintaining the stack, How to use the valuator, Value methods
@section Advantages of step-driven valuation
Step-driven valuation
hides Libmarpa's grammar rewrites from the application,
and is quite efficient.
Libmarpa knows which rules are sequences.
Libmarpa optimizes stack manipulations based on this knowledge.
Long sequences
are very common in practical grammars.
For these,
the stack manipulations suggested by Libmarpa's
step-driven valuator
will be significantly faster than the
traditional stack evaluation algorithm.
Step-driven valuation has another advantage.
To illustrate this,
consider what is a very common case:
The semantics are implemented in a higher-level language,
using callbacks.
If Libmarpa did not use step-driven valuation,
it would need to provide for this case.
But for generality,
Libmarpa would have to deal in C callbacks.
Therefore, a middle layer would have to create C language wrappers
for the callbacks in the higher level language.
The implementation that results is this:
The higher level language would need to wrap each callback in C.
When calling Libmarpa, it would pass the wrappered callback.
Libmarpa would then need to call the C language ``wrappered'' callback.
Next, the wrapper would call the higher-level language callback.
The return value,
which would be data native to the higher-level language,
would need to be passed to the C language wrapper,
which will need to make arrangements for it to be passed
back to the higher-level language when appropriate.
A setup like this is not terribly efficient.
And exception handling across language boundaries would be
very tricky.
But neither of these is the worst problem.
Callbacks are hard to debug.
Wrappered callbacks are even worse.
Calls made across language boundaries
are harder yet to debug.
In the system described above,
by the time a return value is finally consumed,
a language boundary will have been crossed four times.
How do
programmers deal with difficulties like this?
Usually, by
doing the absolute minimum possible in the callbacks.
A horrific debugging environment can become a manageable
one if there is next to no code to be debugged.
And this can be accomplished by
doing as much as possible in pre- and post-processing.
In essence, callbacks force applications to do most
of the programming
via side effects.
One need not be a functional programming purist to find
this a very undesirable style of design to force on
an application.
But the ability to debug can make the difference between
code that does work and code that does not.
Unfairly or not,
code is rarely considered well-designed when it does
not work.
So, while step-driven valuation seems
a roundabout approach,
it is simpler and more direct than
the likely alternatives.
And there is something to be said for pushing
semantics up to the higher levels ---
they can be expected to know more about it.
These advantages of step-driven valuation
are strictly in
the context of a low-level interface.
The author is under no illusion
that direct use of Libmarpa's valuator will be found
satisfactory by most application programmers,
even those using the C language.
The author certainly avoids using step-driven valuation directly.
Libmarpa's valuator is intended
to be used via an upper layer,
one which @strong{does} know about semantics.
@node Maintaining the stack, Valuator constructor, Advantages of step-driven valuation, Value methods
@section Maintaining the stack
This section discusses in detail the requirements
for maintaining the stack.
In some cases,
such as implementation using a Perl array,
fulfilling these requirements is trivial.
Perl auto-extends its arrays,
and initializes the element values,
on every read or write.
For the C programmer,
things are not quite so easy.
In this section,
we will assume a C90 or C99 standard-conformant
C application.
This assumption is convenient on two grounds.
First, this will be the intended use
for many readers.
Second, standard-conformant C is
a ``worst case''.
Any issue faced by a programmer of another environment
is likely to also be one that must be solved
by the C programmer.
Libmarpa often
optimizes away unnecessary writes
to stack locations.
When it does so, it will not
necessarily optimize away all reads
of that stack location.
This means that a location's first access,
as suggested by the Libmarpa step instructions,
may be a read.
This possibility
requires a special awareness from the C
programmer,
as discussed in
the sections
@ref{Sizing the stack} and
@ref{Initializing locations in the stack}.
In the discussions in this document,
stack locations are non-negative integers.
The bottom of the stack is location 0.
In moving from the bottom of the stack to the top,
the numbers increase.
Stack location @var{Y} is said to be ``greater''
than stack location @var{X} if stack location
@var{Y} is closer to the top of stack than location @var{X},
and therefore stack locations are considered greater or
lesser if the integers that represent them are
greater or lesser.
Another way to state that
a stack location @var{Y} is greater (lesser)
than stack location @var{X} is to say that
a stack location @var{Y} is later (earlier)
than stack location @var{X}.
@menu
* Sizing the stack::
* Initializing locations in the stack::
@end menu
@node Sizing the stack, Initializing locations in the stack, Maintaining the stack, Maintaining the stack
@subsection Sizing the stack
If an implementation applies Libmarpa's step
instructions literally, using a physical stack,
it must make sure the stack is large enough.
Specifically, the application must do the following:
@itemize
@item Ensure location 0 exists --- in other
words that the stack is at least length 1.
@item For @code{MARPA_STEP_TOKEN} steps,
ensure that location @code{marpa_v_result(v)}
exists.
@item For @code{MARPA_STEP_NULLING_SYMBOL} steps,
ensure that location @code{marpa_v_result(v)}
exists.
@item For @code{MARPA_STEP_RULE} steps,
ensure that stack locations from @code{marpa_v_arg_0(v)}
to @code{marpa_v_arg_n(v)} exist.
@end itemize
Three aspects of these requirements deserve special mention.
First,
note that the requirement for a
@code{MARPA_STEP_RULE} is that the application
size the stack to include the arguments to be
read.
Because stack writes may be optimized away,
an application,
when reading,
cannot assume
that the stack was
sized appropriately by a prior write.
The first access to a new stack location may be
a read.
Second,
note that there is no explicit requirement that
the application size the stack to include the
location for the result of the
@code{MARPA_STEP_RULE} step.
An application is allowed to assume that
the result will go into one of the locations
that were read.
Third, special note should be made of the requirement
that location 0 exist.
By convention, the parse result resides
in location 0 of the stack.
Because of potential optimizations,
an application cannot assume that it
will receive a Libmarpa step instruction that
either reads from or writes to location 0.
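The sizing rules above can be condensed into a single helper
that reports the minimum stack length a step requires.
This is a sketch only.
The @code{enum step_kind} constants are hypothetical stand-ins
for the @code{MARPA_STEP_*} macros,
and @code{result} and @code{arg_n} stand for the values that
@code{marpa_v_result(v)} and @code{marpa_v_arg_n(v)} would report.

@verbatim
/* Hypothetical stand-ins for the Libmarpa step types;
   real code would use the MARPA_STEP_* macros from marpa.h. */
enum step_kind { STEP_RULE, STEP_TOKEN, STEP_NULLING_SYMBOL, STEP_OTHER };

/* Smallest stack length that satisfies the sizing rules for one step. */
int required_stack_length(enum step_kind kind, int result, int arg_n)
{
  int highest = 0;               /* location 0 must always exist */
  switch (kind) {
  case STEP_RULE:
    /* Cover the arguments through arg_n.  The result is expected to
       lie among them, but taking the maximum costs nothing. */
    highest = arg_n > result ? arg_n : result;
    break;
  case STEP_TOKEN:
  case STEP_NULLING_SYMBOL:
    highest = result;
    break;
  default:
    break;
  }
  return highest + 1;
}
@end verbatim

An application would call such a helper before acting on each step,
and grow its stack whenever the returned length exceeds the current one.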
@node Initializing locations in the stack, , Sizing the stack, Maintaining the stack
@subsection Initializing locations in the stack
Write optimizations also create issues for implementations
which require data to be initialized before reading.
Every fully standard-conforming C application is such an
implementation.
Both C90 and C99 allow ``trap values'',
and therefore conforming applications must be
prepared for
an uninitialized location to contain one of those.
Reading a trap value may cause an abend.
(It is safe, in standard-conforming C, to write to a location
containing a trap value.)
The requirement that locations be initialized before
reading also occurs in other implementations.
Any implementation that has a ``universe'' of ``safe'' values
may require special precautions.
The required precautions may amount to a need to initialize
``uninitialized'' values.
A practical example might be an implementation that expects
all locations to contain a pointer which it can safely indirect
from.
In such implementations,
just as in standard-conformant C,
every stack location
needs to be initialized before being read.
Due to write optimizations, an application
cannot rely on Libmarpa's step instructions to
initialize every stack location before its first read.
One way to deal safely with the
initialization of stack locations
is to do all of the following:
@itemize
@item When starting evaluation, ensure that the stack contains at least location 0.
@item Also, when starting evaluation, initialize every location in the stack.
@item Whenever the stack is extended,
initialize every stack location added.
@end itemize
Applications which try to optimize out some of
these initializations
need to be aware that
an application can never assume that activity in
the stack is safely ``beyond'' an uninitialized location.
Libmarpa steps often revisit earlier sections of the stack,
and these revisits may include reads of previously
unvisited stack locations.
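The three precautions listed above fit naturally into one
``ensure'' function for a growable stack.
The following sketch assumes a stack of @code{long} values
for which zero is a safe initial value.
The @code{struct value_stack} type and the @code{stack_ensure()}
function are hypothetical names, not part of Libmarpa.

@verbatim
#include <stdlib.h>

struct value_stack {
  long *slots;
  size_t length;   /* number of existing, initialized locations */
};

/* Grow the stack so that `location' exists, initializing every
   location that is added.  Returns 0 on success, -1 if memory
   could not be obtained (the stack is then left unchanged). */
int stack_ensure(struct value_stack *s, size_t location)
{
  size_t i, new_length;
  long *grown;
  if (location < s->length) return 0;
  new_length = location + 1;
  grown = realloc(s->slots, new_length * sizeof *grown);
  if (!grown) return -1;
  for (i = s->length; i < new_length; i++) grown[i] = 0;
  s->slots = grown;
  s->length = new_length;
  return 0;
}
@end verbatim

Starting from an empty stack (a null @code{slots} pointer and a
@code{length} of 0),
a call of @code{stack_ensure(&s, 0)} before the first step
guarantees that location 0 exists and is initialized.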
@node Valuator constructor, Valuator reference counting, Maintaining the stack, Value methods
@section Creating a new valuator
@deftypefun Marpa_Value marpa_v_new ( @
Marpa_Tree @var{t} @
)
Creates a new valuator.
The parent object of the new valuator
will be the tree iterator @var{t},
and the reference count of the new valuator will be 1.
The reference count of @var{t} is increased by 1.
The parent tree iterator is ``paused'',
so that the tree iterator
cannot move on to a new parse tree
until the valuator is destroyed.
Many valuators of the same parse tree
can exist at once.
A tree iterator is ``unpaused'' when
all of the valuators of that tree iterator are destroyed.
Return value: On success, the newly created valuator.
On failure, @code{NULL}.
@end deftypefun
@node Valuator reference counting, Stepping through the valuator, Valuator constructor, Value methods
@section Reference counting
@deftypefun Marpa_Value marpa_v_ref (Marpa_Value @var{v})
Increases the reference count by 1.
Not needed by most applications.
Return value:
On success, @var{v}.
On failure, @code{NULL}.
@end deftypefun
@deftypefun void marpa_v_unref ( @
Marpa_Value @var{v})
Decreases the reference count by 1,
destroying @var{v} once the reference count reaches
zero.
Beginning with @var{v}'s parent tree,
Libmarpa then proceeds up the chain of parent objects.
Every time a child is destroyed, the
reference count of its parent is decreased by 1.
Every time the reference count of an object
is decreased by 1,
if that reference count is now zero,
that object is destroyed.
Libmarpa follows this chain of decrements
and destructions as required,
all the way back to the
base grammar, if necessary.
@end deftypefun
@node Stepping through the valuator, Valuator steps by type, Valuator reference counting, Value methods
@section Stepping through the valuator
@deftypefun Marpa_Step_Type marpa_v_step ( @
Marpa_Value @var{v})
This method ``steps through'' the valuator.
The return value is a @code{Marpa_Step_Type},
an integer which indicates the type of step.
How the application is expected to act on
each step is described below.
@xref{Valuator steps by type}.
When the iteration through the steps is finished,
@code{marpa_v_step} returns @code{MARPA_STEP_INACTIVE}.
Return value: On success, the type of the step
to be performed.
This will always be a non-negative number.
On failure, @minus{}2.
@end deftypefun
@node Valuator steps by type, Basic step accessors, Stepping through the valuator, Value methods
@section Valuator steps by type
@deftypevr Macro Marpa_Step_Type MARPA_STEP_RULE
The semantics of a rule should be performed.
The application can find the value of the rule's
children in the stack locations from
@code{marpa_v_arg_0(v)}
to @code{marpa_v_arg_n(v)}.
The semantics for the rule whose ID is
@code{marpa_v_rule(v)} should be executed
on these child values,
and the result placed in
@code{marpa_v_result(v)}.
In the case of a @code{MARPA_STEP_RULE} step,
the stack location of
@code{marpa_v_result(v)} is guaranteed to
be equal to @code{marpa_v_arg_0(v)}.
@end deftypevr
@deftypevr Macro Marpa_Step_Type MARPA_STEP_TOKEN
The semantics of a non-null token should be performed.
The application's value for the token whose ID is
@code{marpa_v_token(v)} should be
placed in
stack location @code{marpa_v_result(v)}.
Its value according to Libmarpa will be in
@code{marpa_v_token_value(v)}.
@end deftypevr
@deftypevr Macro Marpa_Step_Type MARPA_STEP_NULLING_SYMBOL
The semantics for a nulling symbol should be performed.
The ID of the symbol is
@code{marpa_v_symbol(v)} and its value should
be placed in
stack location @code{marpa_v_result(v)}.
@end deftypevr
@deftypevr Macro Marpa_Step_Type MARPA_STEP_INACTIVE
The valuator has gone through all of its steps
and is now inactive.
The value of the parse will be in stack location 0.
Because of optimizations,
it is possible for a valuator to immediately
become inactive --- @code{MARPA_STEP_INACTIVE} could
be both the first and last step.
@end deftypevr
@deftypevr Macro Marpa_Step_Type MARPA_STEP_INITIAL
The valuator is new and has
yet to go through any steps.
@end deftypevr
@deftypevr Macro Marpa_Step_Type MARPA_STEP_INTERNAL1
@deftypevrx Macro Marpa_Step_Type MARPA_STEP_INTERNAL2
@deftypevrx Macro Marpa_Step_Type MARPA_STEP_TRACE
These step types are reserved for internal purposes.
@end deftypevr
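To show how the step types fit together,
the sketch below replays a recorded sequence of steps against an
application-owned stack.
Everything in it is an assumption made for illustration:
the @code{trace_*} names are hypothetical,
the recorded fields stand in for the values that the step accessors
would report,
and every rule is arbitrarily given the semantics of addition.
A real application would obtain each step from @code{marpa_v_step()}
rather than from an array.

@verbatim
/* Hypothetical stand-ins for the MARPA_STEP_* step types. */
enum trace_kind {
  TRACE_RULE, TRACE_TOKEN, TRACE_NULLING_SYMBOL, TRACE_INACTIVE
};

struct trace_step {
  enum trace_kind kind;
  int result;        /* as marpa_v_result() would report */
  int arg_n;         /* as marpa_v_arg_n() would report (rule steps) */
  long token_value;  /* as marpa_v_token_value() would report */
};

/* Run the steps against `stack', which the caller has already sized
   and initialized.  Returns the parse value from location 0. */
long replay(const struct trace_step *steps, long *stack)
{
  int i;
  for (; steps->kind != TRACE_INACTIVE; steps++) {
    switch (steps->kind) {
    case TRACE_TOKEN:
      stack[steps->result] = steps->token_value;
      break;
    case TRACE_NULLING_SYMBOL:
      stack[steps->result] = 0;    /* this sketch's value for "nothing" */
      break;
    case TRACE_RULE: {
      /* For rule steps the first argument is at the result location. */
      long sum = 0;
      for (i = steps->result; i <= steps->arg_n; i++) sum += stack[i];
      stack[steps->result] = sum;
      break;
    }
    default:
      break;
    }
  }
  return stack[0];
}
@end verbatim

A trace consisting of two token steps with values 42 and 711,
written to locations 0 and 1,
followed by a rule step over locations 0 to 1,
leaves 753 in location 0.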
@node Basic step accessors, Other step accessors, Valuator steps by type, Value methods
@section Basic step accessors
The basic step accessors are so called because their information
is basic to the stack manipulation.
The basic step accessors are implemented as macros.
They always succeed.
@deftypefn {Macro} int marpa_v_arg_0 (Marpa_Value @var{v})
For a @code{MARPA_STEP_RULE} step,
returns the stack location where the value of the first child
can be found.
@end deftypefn
@deftypefn {Macro} int marpa_v_arg_n (Marpa_Value @var{v})
For a @code{MARPA_STEP_RULE} step,
returns the stack location where the value of the last child
can be found.
@end deftypefn
@deftypefn {Macro} int marpa_v_result (Marpa_Value @var{v})
For @code{MARPA_STEP_RULE},
@code{MARPA_STEP_TOKEN},
and @code{MARPA_STEP_NULLING_SYMBOL} steps,
returns the stack location where the result of the semantics
should be placed.
@end deftypefn
@deftypefn {Macro} Marpa_Rule_ID marpa_v_rule (Marpa_Value @var{v})
For the
@code{MARPA_STEP_RULE} step,
returns the ID of the rule.
@end deftypefn
@deftypefn {Macro} Marpa_Step_Type marpa_v_step_type (Marpa_Value @var{v})
Returns the current step type:
@code{MARPA_STEP_TOKEN},
@code{MARPA_STEP_RULE},
etc.
Usually not needed since this is also the return value of
@code{marpa_v_step()}.
@end deftypefn
@deftypefn {Macro} Marpa_Symbol_ID marpa_v_symbol (Marpa_Value @var{v})
For the @code{MARPA_STEP_NULLING_SYMBOL} step,
returns the ID of the symbol.
The value returned is the same as that
returned by the @code{marpa_v_token()}
macro.
@end deftypefn
@deftypefn {Macro} Marpa_Symbol_ID marpa_v_token (Marpa_Value @var{v})
For the @code{MARPA_STEP_TOKEN} step,
returns the ID of the token.
The value returned is the same as that
returned by the @code{marpa_v_symbol()}
macro.
@end deftypefn
@deftypefn {Macro} int marpa_v_token_value (Marpa_Value @var{v})
For the @code{MARPA_STEP_TOKEN} step,
returns the integer which is
(or which represents)
the value of the token.
@end deftypefn
@node Other step accessors, , Basic step accessors, Value methods
@section Other step accessors
This section contains the step accessors that
are not basic to stack manipulation, but which provide
other useful information about the parse.
These step accessors are implemented as macros.
All of these accessors always succeed, but
if called when they are irrelevant
they return an unspecified value.
In this context, an ``unspecified value'' is a value
that is either @minus{}1 or the ID of a valid Earley set,
but which is otherwise unpredictable.
@deftypefn {Macro} Marpa_Earley_Set_ID marpa_v_es_id (Marpa_Value @var{v})
Return value:
If the current step type is @code{MARPA_STEP_RULE},
the Earley Set ordinal where the rule ends.
If the current step type is
@code{MARPA_STEP_TOKEN}
or @code{MARPA_STEP_NULLING_SYMBOL},
the Earley Set ordinal where the symbol ends.
If the current step type is anything else, an unspecified value.
@end deftypefn
@deftypefn {Macro} Marpa_Earley_Set_ID marpa_v_rule_start_es_id (Marpa_Value @var{v})
Return value:
If the current step type is @code{MARPA_STEP_RULE},
the Earley Set ordinal where the rule begins.
If the current step type is anything else, an unspecified value.
@end deftypefn
@deftypefn {Macro} Marpa_Earley_Set_ID marpa_v_token_start_es_id (Marpa_Value @var{v})
Return value:
If the current step type is @code{MARPA_STEP_TOKEN}
or @code{MARPA_STEP_NULLING_SYMBOL},
the Earley Set ordinal where the token begins.
If the current step type is anything else, an unspecified value.
@end deftypefn
@node Events, Error methods macros and codes, Value methods, Top
@chapter Events
@menu
* Events overview::
* Event methods::
* Event codes::
@end menu
@node Events overview, Event methods, Events, Events
@section Overview
Events are generated by the
@code{marpa_g_precompute()},
@code{marpa_r_earleme_complete()},
and
@code{marpa_r_start_input()} methods.
These methods are called event-active.
Event-active methods always clear all previous events,
so that after an event-active method the only events
available
will be those generated by that method.
Events are volatile,
and it is expected that events will be queried
immediately after the method that generated them.
Note especially
that multiple recognizers using the same base grammar
overwrite each other's events.
To find out how many events were generated by the last
event-active method,
use the @code{marpa_g_event_count} method.
To query a specific event,
use the @code{marpa_g_event} and
@code{marpa_g_event_value} methods.
@node Event methods, Event codes, Events overview, Events
@section Methods
@deftypefun Marpa_Event_Type marpa_g_event (Marpa_Grammar @var{g}, @
Marpa_Event* @var{event}, @
int @var{ix})
On success,
the type of the @var{ix}'th event is returned
and the data for the @var{ix}'th event is placed
in the location pointed to by @var{event}.
Event indexes are in sequence.
Valid event indexes are in the range from 0 to @var{n},
where @var{n} is one less than the event count.
The event count
can be queried using the @code{marpa_g_event_count()}
method.
Return value: On success, the type of event @var{ix}.
If there is no @var{ix}'th event,
if @var{ix} is negative,
or on other failure, @minus{}2.
On failure,
the locations pointed to by @var{event}
are not changed.
@end deftypefun
@deftypefun int marpa_g_event_count ( Marpa_Grammar @var{g} )
Return value: On success, the number of events.
On failure, @minus{}2.
@end deftypefun
@deftypefn {Macro} int marpa_g_event_value (Marpa_Event* @var{event})
This macro provides access to the ``value'' of the event.
The semantics of the value varies according to the type
of the event, and is described in the section on event
codes.
@xref{Event codes}.
@end deftypefn
@node Event codes, , Event methods, Events
@section Event codes
@deftypevr Macro int MARPA_EVENT_NONE
Applications should never see this event.
Event value: Undefined.
Suggested message: "No event".
@end deftypevr
@deftypevr Macro int MARPA_EVENT_COUNTED_NULLABLE
A nullable symbol is either the separator
for, or the right hand side of, a sequence.
Event value: The ID of the symbol.
Suggested message: "This symbol is a counted nullable".
@end deftypevr
@deftypevr Macro int MARPA_EVENT_EARLEY_ITEM_THRESHOLD
This event indicates the current Earley item count
exceeded a
user-settable threshold.
Event value: The current Earley item count.
Suggested message: "Too many Earley items".
@end deftypevr
@deftypevr Macro int MARPA_EVENT_EXHAUSTED
The parse is exhausted.
Event value: Undefined.
Suggested message: "Recognizer is exhausted".
@end deftypevr
@deftypevr Macro int MARPA_EVENT_LOOP_RULES
One or more rules are loop rules --- rules
that are part of a cycle.
Cycles are pathological cases of recursion,
in which the same symbol string derives itself
a potentially infinite number of times.
Nonetheless, Marpa parses in the presence of these,
and it is up to the application to treat these
as fatal errors,
something they almost always will wish to do.
Event value: The count of loop rules.
Suggested message: "Grammar contains a infinite loop".
@end deftypevr
@deftypevr Macro int MARPA_EVENT_NULLING_TERMINAL
A nulling symbol is also a terminal.
Event value: The ID of the symbol.
Suggested message: "This symbol is a nulling terminal".
@end deftypevr
@deftypevr Macro int MARPA_EVENT_SYMBOL_COMPLETED
The recognizer can be set to generate an event
when a symbol is completed,
using the @code{marpa_g_symbol_is_completion_event_set()}
method.
(A symbol is ``completed'' if and only if any rule with that symbol
as its LHS is completed.)
This event code indicates that one of those events
occurred.
Event value: The ID of the completed symbol.
Suggested message: "Completed symbol".
@end deftypevr
@deftypevr Macro int MARPA_EVENT_SYMBOL_EXPECTED
The recognizer can be set to generate an event when a
symbol is expected as a terminal,
using its @code{marpa_r_expected_symbol_event_set()}
method.
Note that this event only triggers if the symbol is
expected as a terminal.
Predicted symbols which are not expected as terminals
do not trigger this event.
This event code indicates that one of those events
occurred.
Event value: The ID of the expected symbol.
Suggested message: "Expecting symbol".
@end deftypevr
@deftypevr Macro int MARPA_EVENT_SYMBOL_NULLED
The recognizer can be set to generate an event when a
symbol is nulled -- that is, recognized as a
zero-length symbol.
To set a nulled symbol event,
use the recognizer's @code{marpa_r_nulled_symbol_event_set()}
method.
This event code indicates that a nulled symbol event
occurred.
Event value: The ID of the nulled symbol.
Suggested message: "Symbol was nulled".
@end deftypevr
@deftypevr Macro int MARPA_EVENT_SYMBOL_PREDICTED
The recognizer can be set to generate an event when a
symbol is predicted.
To set a predicted symbol event,
use the recognizer's @code{marpa_r_predicted_symbol_event_set()}
method.
Unlike the
@code{MARPA_EVENT_SYMBOL_EXPECTED} event,
the @code{MARPA_EVENT_SYMBOL_PREDICTED} event
triggers for predictions of both
non-terminals and terminals.
This event code indicates that a predicted symbol event
occurred.
Event value: The ID of the predicted symbol.
Suggested message: "Symbol was predicted".
@end deftypevr
@node Error methods macros and codes, Design notes, Events, Top
@chapter Error methods, macros and codes
@menu
* Error methods::
* Error Macros::
* External error codes::
* Internal error codes::
@end menu
@node Error methods, Error Macros, Error methods macros and codes, Error methods macros and codes
@section Error methods
@deftypefun Marpa_Error_Code marpa_g_error @
( Marpa_Grammar @var{g}, @
const char** @var{p_error_string})
When a method fails,
this method allows the application to read
the error code.
@var{p_error_string} is reserved for use by
the internals.
Applications should set it to @code{NULL}.
Return value: The last error code from a Libmarpa method.
Always succeeds.
@end deftypefun
@deftypefun Marpa_Error_Code marpa_g_error_clear @
( Marpa_Grammar @var{g} )
Sets the error code
to @code{MARPA_ERR_NONE}.
Not often used,
but now and then it can be useful
to force the error code to a known state.
Return value: @code{MARPA_ERR_NONE}.
Always succeeds.
@end deftypefun
@node Error Macros, External error codes, Error methods, Error methods macros and codes
@section Error Macros
@deftypevr Macro int MARPA_ERRCODE_COUNT
The number of error codes.
All error codes, whether internal or external,
will be integers, non-negative but
strictly less than @code{MARPA_ERRCODE_COUNT}.
@end deftypevr
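The range guarantee can be used to validate a code before it is
used, for example as a table index.
The following is a minimal sketch, not part of Libmarpa.
The function name @code{describe_error} is hypothetical,
the count is passed as a parameter so that the fragment compiles
without @file{marpa.h},
and only three of the suggested messages from this chapter are filled in.
@verbatim
#include <stddef.h>

/* Hypothetical helper, not a Libmarpa function.
   Returns a message for CODE, or NULL if CODE lies outside
   the range 0 <= code < errcode_count. */
const char *
describe_error (int code, int errcode_count)
{
  if (code < 0 || code >= errcode_count)
    return NULL;
  switch (code)
    {
    case 0:                     /* MARPA_ERR_NONE */
      return "No error";
    case 11:                    /* MARPA_ERR_DUPLICATE_RULE */
      return "Duplicate rule";
    case 41:                    /* MARPA_ERR_NO_PARSE */
      return "No parse";
    default:                    /* in range, message not listed here */
      return "Unlisted error code";
    }
}
@end verbatim
An application would normally pass @code{MARPA_ERRCODE_COUNT}
as the second argument,
and the value returned by @code{marpa_g_error()} as the first.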
@node External error codes, Internal error codes, Error Macros, Error methods macros and codes
@section External error codes
This section lists the external error codes.
These are the only error codes that users
of the Libmarpa external interface should ever see.
Internal error codes are in their own section.
@xref{Internal error codes}.
@deftypevr Macro int MARPA_ERR_NONE
No error condition.
The error code is initialized to this value.
Methods which do not result in failure
sometimes reset the error code to @code{MARPA_ERR_NONE}.
Numeric value: 0.
Suggested message: "No error".
@end deftypevr
@deftypevr Macro int MARPA_ERR_BAD_SEPARATOR
A separator was specified for a sequence rule,
but its ID was not that
of a valid symbol.
Numeric value: 6.
Suggested message: "Separator has invalid symbol ID".
@end deftypevr
@deftypevr Macro int MARPA_ERR_BEFORE_FIRST_TREE
A tree iterator is positioned before the first tree,
and it was specified in a context where that is not
allowed.
A newly created tree iterator is positioned before the first
tree.
To position a newly created tree iterator to the first tree
use the @code{marpa_t_next()} method.
Numeric value: 91.
Suggested message: "Tree iterator is before first tree".
@end deftypevr
@deftypevr Macro int MARPA_ERR_COUNTED_NULLABLE
A ``counted'' symbol was found
that is also a nullable symbol.
A ``counted'' symbol is one that appears on the RHS
of a sequence rule.
If a symbol is nullable,
counting its occurrences becomes difficult.
Questions of definition and
problems of implementation arise.
At a minimum, a sequence with counted nullables
would be wildly
ambiguous.
Sequence rules are simply an optimized shorthand
for rules that can also be written in ordinary BNF.
If the equivalent of a sequence of nullables is
really what your application needs,
nothing in Libmarpa prevents you from specifying
that sequence
with ordinary BNF rules.
Numeric value: 8.
Suggested message: "Nullable symbol on RHS of a sequence rule".
@end deftypevr
@deftypevr Macro int MARPA_ERR_DUPLICATE_RULE
This error indicates an attempt to add a BNF rule which
is a duplicate of a BNF rule already in the grammar.
Two BNF rules are considered duplicates if
@itemize @bullet
@item
Both rules have the same left hand symbol, and
@item
Both rules have the same right hand symbols in the same order.
@end itemize
Duplication of sequence rules,
and duplication between BNF rules and sequence rules,
is dealt with by requiring that the LHS of a sequence rule
not be the LHS of any other rule.
Numeric value: 11.
Suggested message: "Duplicate rule".
@end deftypevr
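The duplicate test described for @code{MARPA_ERR_DUPLICATE_RULE}
can be sketched as a comparison of two BNF rules.
This is an illustration of the stated definition only,
not Libmarpa's implementation.
The function name is hypothetical,
and plain @code{int} values stand in for @code{Marpa_Symbol_ID}.
@verbatim
#include <stddef.h>

/* Hypothetical helper, not a Libmarpa function.
   Symbol IDs are plain ints here, standing in for Marpa_Symbol_ID.
   Returns 1 if the two BNF rules are duplicates, 0 otherwise. */
int
bnf_rules_are_duplicates (int lhs_a, const int *rhs_a, size_t length_a,
                          int lhs_b, const int *rhs_b, size_t length_b)
{
  size_t i;
  if (lhs_a != lhs_b)
    return 0;                   /* different left hand symbols */
  if (length_a != length_b)
    return 0;                   /* right hand sides differ in length */
  for (i = 0; i < length_a; i++)
    if (rhs_a[i] != rhs_b[i])
      return 0;                 /* same symbols required, in the same order */
  return 1;
}
@end verbatim
Two rules with the same right hand symbols in a different order
are not duplicates under this test.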
@deftypevr Macro int MARPA_ERR_DUPLICATE_TOKEN
This error indicates an attempt to add a duplicate token.
A token is a duplicate if one already read at the same
earleme has the same symbol ID and the same length.
Numeric value: 12.
Suggested message: "Duplicate token".
@end deftypevr
@deftypevr Macro int MARPA_ERR_YIM_COUNT
This error code indicates that
an implementation-defined limit on the
number of Earley items per Earley set
was exceeded.
This limit is different from
the Earley item warning threshold,
an optional limit on the number
of Earley items in an Earley set,
which can be set by the application.
The implementation-defined limit is very large,
at least 500,000,000 Earley items.
An application is unlikely ever to see this
error.
Libmarpa's use of memory
would almost certainly exceed the implementation's
limits before it occurred.
Numeric value: 13.
Suggested message: "Maximum number of Earley items exceeded".
@end deftypevr
@deftypevr Macro int MARPA_ERR_EVENT_IX_NEGATIVE
A negative event index was specified.
That is not allowed.
Numeric value: 15.
Suggested message: "Negative event index".
@end deftypevr
@deftypevr Macro int MARPA_ERR_EVENT_IX_OOB
A non-negative event index was specified,
but there is no event at that index.
Since the events are in sequence, this means it
was too large.
Numeric value: 16.
Suggested message: "No event at that index".
@end deftypevr
@deftypevr Macro int MARPA_ERR_GRAMMAR_HAS_CYCLE
The grammar has a cycle --- one or more loop
rules.
This is a recoverable error,
although most applications will want to treat
it as fatal.
For more see the description of @ref{marpa_g_precompute}.
Numeric value: 17.
Suggested message: "Grammar has cycle".
@end deftypevr
@deftypevr Macro int MARPA_ERR_HEADERS_DO_NOT_MATCH
This is an internal error, and indicates that
Libmarpa was wrongly built.
Libmarpa was compiled with headers which do not
match the rest of the code.
The solution is to find a correctly built
Libmarpa.
Numeric value: 98.
Suggested message: "Internal error: Libmarpa was built incorrectly".
@end deftypevr
@deftypevr Macro int MARPA_ERR_I_AM_NOT_OK
The Libmarpa base grammar is in a ``not ok''
state.
Currently, the only way this can happen
is if Libmarpa memory is being overwritten.
Numeric value: 29.
Suggested message: "Marpa is in a not OK state".
@end deftypevr
@deftypevr Macro int MARPA_ERR_INACCESSIBLE_TOKEN
This error code indicates that
the token symbol is an inaccessible symbol --- one which
cannot be reached from the start symbol.
Since the inaccessibility of a symbol is a property of the grammar,
this error code typically indicates an application error.
Nevertheless, a retry at this location, using another token ID,
may succeed.
At this writing,
the author knows of no uses of this technique.
Numeric value: 18.
Suggested message: "Token symbol is inaccessible".
@end deftypevr
@deftypevr Macro int MARPA_ERR_INVALID_BOOLEAN
A function was called that takes a boolean argument,
but the value of that argument was not either 0 or 1.
Numeric value: 22.
Suggested message: "Argument is not boolean".
@end deftypevr
@deftypevr Macro int MARPA_ERR_INVALID_LOCATION
The location (Earley set ID) is not valid.
It may be invalid for one of two reasons:
@itemize
@item It is negative,
and it is being used as the argument to a method
for which that negative value does not have a special meaning.
@item It is after the latest Earley set.
@end itemize
For users of input models other than the standard one,
the term ``location'', as used in association
with this error code,
means Earley set ID or Earley set ordinal.
In the standard input model, this will always
be identical with Libmarpa's other idea of
location, the earleme.
Numeric value: 25.
Suggested message: "Location is not valid".
@end deftypevr
@deftypevr Macro int MARPA_ERR_INVALID_START_SYMBOL
A start symbol was specified,
but its symbol ID is not that of a valid symbol.
Numeric value: 27.
Suggested message: "Specified start symbol is not valid".
@end deftypevr
@deftypevr Macro int MARPA_ERR_INVALID_ASSERTION_ID
A method was called with an invalid assertion ID.
This is an assertion ID which not only does
not exist, but cannot exist.
Currently that means its value is less than zero.
Numeric value: 96.
Suggested message: "Assertion ID is malformed".
@end deftypevr
@deftypevr Macro int MARPA_ERR_INVALID_RULE_ID
A method was called with an invalid rule ID.
This is a rule ID which not only does
not exist, but cannot exist.
Currently that means its value is less than zero.
Numeric value: 26.
Suggested message: "Rule ID is malformed".
@end deftypevr
@deftypevr Macro int MARPA_ERR_INVALID_SYMBOL_ID
A method was called with an invalid symbol ID.
This is a symbol ID which not only does
not exist, but cannot exist.
Currently that means its value is less than zero.
Numeric value: 28.
Suggested message: "Symbol ID is malformed".
@end deftypevr
@deftypevr Macro int MARPA_ERR_MAJOR_VERSION_MISMATCH
There was a mismatch in the major version number
between the requested version
of Libmarpa, and the actual one.
Numeric value: 30.
Suggested message: "Libmarpa major version number is a mismatch".
@end deftypevr
@deftypevr Macro int MARPA_ERR_MICRO_VERSION_MISMATCH
There was a mismatch in the micro version number
between the requested version
of Libmarpa, and the actual one.
Numeric value: 31.
Suggested message: "Libmarpa micro version number is a mismatch".
@end deftypevr
@deftypevr Macro int MARPA_ERR_MINOR_VERSION_MISMATCH
There was a mismatch in the minor version number
between the requested version
of Libmarpa, and the actual one.
Numeric value: 32.
Suggested message: "Libmarpa minor version number is a mismatch".
@end deftypevr
@deftypevr Macro int MARPA_ERR_NO_EARLEY_SET_AT_LOCATION
A non-negative Earley set ID (also called an Earley set ordinal)
was specified,
but there is no corresponding Earley set.
Since the Earley set ordinals are in sequence,
this means that the specified ID is greater
than that of the latest Earley set.
Numeric value: 39.
Suggested message: "Earley set ID is after latest Earley set".
@end deftypevr
@deftypevr Macro int MARPA_ERR_NOT_PRECOMPUTED
The grammar is not precomputed,
and an attempt was made to do something with it
that is not allowed for unprecomputed
grammars.
For example, a recognizer cannot be
created from a grammar until it is precomputed.
Numeric value: 34.
Suggested message: "This grammar is not precomputed".
@end deftypevr
@deftypevr Macro int MARPA_ERR_NO_PARSE
The application attempted to create a bocage
from a recognizer without a parse.
Applications will often want to treat this as
a soft error.
Numeric value: 41.
Suggested message: "No parse".
@end deftypevr
@deftypevr Macro int MARPA_ERR_NO_RULES
A grammar which has no rules is being used
in a way that is not allowed.
Usually the problem is that the user is
trying to precompute the grammar.
Numeric value: 42.
Suggested message: "This grammar does not have any rules".
@end deftypevr
@deftypevr Macro int MARPA_ERR_NO_START_SYMBOL
The grammar has no start symbol,
and an attempt was made to perform an
operation which requires one.
Usually the problem is that the user is
trying to precompute the grammar.
Numeric value: 43.
Suggested message: "This grammar has no start symbol".
@end deftypevr
@deftypevr Macro int MARPA_ERR_NO_SUCH_ASSERTION_ID
A method was called with an assertion ID which is well-formed,
but the assertion does not exist.
Numeric value: 97.
Suggested message: "No assertion with this ID exists".
@end deftypevr
@deftypevr Macro int MARPA_ERR_NO_SUCH_RULE_ID
A method was called with a rule ID which is well-formed,
but the rule does not exist.
Numeric value: 89.
Suggested message: "No rule with this ID exists".
@end deftypevr
@deftypevr Macro int MARPA_ERR_NO_SUCH_SYMBOL_ID
A method was called with a symbol ID which is well-formed,
but the symbol does not exist.
Numeric value: 90.
Suggested message: "No symbol with this ID exists".
@end deftypevr
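The difference between a malformed ID and a well-formed ID
that names no object can be sketched as follows.
This is an illustration only, not Libmarpa's implementation.
The function name is hypothetical,
and the sketch assumes that symbol IDs are assigned in sequence
starting at 0,
so that a well-formed ID fails to exist exactly when it is
greater than or equal to the number of symbols.
The returned numbers are the numeric values listed in this section.
@verbatim
/* Hypothetical helper, not a Libmarpa function.
   Assumes symbol IDs run from 0 to symbol_count - 1. */
int
classify_symbol_id (int sym_id, int symbol_count)
{
  if (sym_id < 0)
    return 28;                  /* MARPA_ERR_INVALID_SYMBOL_ID: malformed */
  if (sym_id >= symbol_count)
    return 90;                  /* MARPA_ERR_NO_SUCH_SYMBOL_ID: well-formed,
                                   but no such symbol */
  return 0;                     /* MARPA_ERR_NONE */
}
@end verbatim
The same two-step classification applies to rule IDs
and assertion IDs, with their own error codes.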
@deftypevr Macro int MARPA_ERR_NO_TOKEN_EXPECTED_HERE
This error code indicates that
no tokens at all were expected at this earleme
location.
This can only happen in alternative input models.
Typically, this indicates an application programming
error.
Retrying input at this location will always fail.
But if the application is able to leave this
earleme empty, a retry at a later location,
using this or another token,
may succeed.
At this writing,
the author knows of no uses of this technique.
Numeric value: 44.
Suggested message: "No token is expected at this earleme location".
@end deftypevr
@deftypevr Macro int MARPA_ERR_NULLING_TERMINAL
Marpa does not allow a symbol to be both nulling
and a terminal.
Numeric value: 49.
Suggested message: "A symbol is both terminal and nulling".
@end deftypevr
@deftypevr Macro int MARPA_ERR_ORDER_FROZEN
The Marpa order object has been frozen.
If a Marpa order object is frozen, it cannot be
changed.
Multiple tree iterators can share a Marpa order object,
but that order object is frozen after the first tree
iterator is created from it.
Applications can order a bocage in many ways,
but they must do so by creating multiple order objects.
Numeric value: 50.
Suggested message: "The ordering is frozen".
@end deftypevr
@deftypevr Macro int MARPA_ERR_PARSE_EXHAUSTED
The parse is exhausted.
Numeric value: 53.
Suggested message: "The parse is exhausted".
@end deftypevr
@deftypevr Macro int MARPA_ERR_PARSE_TOO_LONG
The parse is too long.
The limit on the length of a parse is implementation
dependent, but it is very large,
at least 500,000,000 earlemes.
This error code is unlikely in the standard input model.
Almost certainly memory would be exceeded
before it could occur.
If an application sees this error,
it is almost certainly using one of the non-standard
input models.
Most often this message will occur because
of a request to add a single extremely long token,
perhaps as a result of an application error.
But it is also possible this error condition will
occur after the input of a large number
of long tokens.
Numeric value: 54.
Suggested message: "This input would make the parse too long".
@end deftypevr
@deftypevr Macro int MARPA_ERR_POINTER_ARG_NULL
In a method which takes pointers as arguments,
one of the pointer arguments is @code{NULL},
in a case where that is not allowed.
One such method is@ @code{marpa_r_progress_item()}.
Numeric value: 56.
Suggested message: "An argument is null when it should not be".
@end deftypevr
@deftypevr Macro int MARPA_ERR_PRECOMPUTED
An attempt was made to use a precomputed grammar
in a way that is not allowed.
Often this is an attempt to change the grammar.
Nearly every change to a grammar after
precomputation invalidates the precomputation,
and is therefore not allowed.
Numeric value: 57.
Suggested message: "This grammar is precomputed".
@end deftypevr
@deftypevr Macro int MARPA_ERR_PROGRESS_REPORT_NOT_STARTED
No recognizer progress report is currently active,
and an action has been attempted which
is inconsistent with that.
One such action would be a
@code{marpa_r_progress_item()} call.
Numeric value: 59.
Suggested message: "No progress report has been started".
@end deftypevr
@deftypevr Macro int MARPA_ERR_PROGRESS_REPORT_EXHAUSTED
The progress report is ``exhausted'' --- all its
items have been iterated through.
Numeric value: 58.
Suggested message: "The progress report is exhausted".
@end deftypevr
@deftypevr Macro int MARPA_ERR_RANK_TOO_LOW
A symbol or rule rank was specified which
was less than an implementation-defined minimum.
Implementations will always allow at least those
ranks in the range between
@minus{}134,217,727 and 134,217,727.
Numeric value: 85.
Suggested message: "Rule or symbol rank too low".
@end deftypevr
@deftypevr Macro int MARPA_ERR_RANK_TOO_HIGH
A symbol or rule rank was specified which
was greater than an implementation-defined maximum.
Implementations will always allow at least those
ranks in the range between
@minus{}134,217,727 and 134,217,727.
Numeric value: 86.
Suggested message: "Rule or symbol rank too high".
@end deftypevr
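An application that wants to stay within the guaranteed range
of ranks can check a rank before setting it.
The following is a minimal sketch.
The macro and function names are hypothetical,
and the sketch assumes that the documented range is inclusive
at both ends.
@verbatim
/* The range of ranks that every implementation is documented to allow.
   The macro and function names are hypothetical, not Libmarpa names. */
#define PORTABLE_RANK_MIN (-134217727)
#define PORTABLE_RANK_MAX 134217727

/* Returns 1 if RANK is within the guaranteed range, 0 otherwise.
   A rank outside this range may still be accepted by a
   particular implementation. */
int
rank_is_portable (int rank)
{
  return rank >= PORTABLE_RANK_MIN && rank <= PORTABLE_RANK_MAX;
}
@end verbatim
A rank that fails this check is not necessarily an error.
It is only outside the range that all implementations must accept.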
@deftypevr Macro int MARPA_ERR_RECCE_IS_INCONSISTENT
The recognizer is ``inconsistent'',
usually because the user has rejected one or
more rules or terminals,
and has not yet called
the @code{marpa_r_consistent()} method.
Numeric value: 95.
Suggested message: "The recognizer is inconsistent".
@end deftypevr
@deftypevr Macro int MARPA_ERR_RECCE_NOT_ACCEPTING_INPUT
The recognizer is not accepting input,
and the application has attempted something that
is inconsistent with that fact.
Numeric value: 60.
Suggested message: "The recognizer is not accepting input".
@end deftypevr
@deftypevr Macro int MARPA_ERR_RECCE_NOT_STARTED
The recognizer has not been started,
and the application has attempted something that
is inconsistent with that fact.
Numeric value: 61.
Suggested message: "The recognizer has not been started".
@end deftypevr
@deftypevr Macro int MARPA_ERR_RECCE_STARTED
The recognizer has been started,
and the application has attempted something that
is inconsistent with that fact.
Numeric value: 62.
Suggested message: "The recognizer has been started".
@end deftypevr
@deftypevr Macro int MARPA_ERR_RHS_IX_NEGATIVE
The index of a RHS symbol was specified,
and it was negative.
That is not allowed.
Numeric value: 63.
Suggested message: "RHS index cannot be negative".
@end deftypevr
@deftypevr Macro int MARPA_ERR_RHS_IX_OOB
A non-negative index of a RHS symbol was specified,
but there is no symbol at that index.
Since the indexes are in sequence, this means the
index was greater than or equal to the rule length.
Numeric value: 64.
Suggested message: "RHS index must be less than rule length".
@end deftypevr
@deftypevr Macro int MARPA_ERR_RHS_TOO_LONG
An attempt was made to add a rule with too many
right hand side symbols.
The limit on the RHS symbol count is implementation
dependent, but it is very large,
at least 500,000,000 symbols.
This is
far beyond what is required in any current practical grammar.
An application with rules of this length is almost
certain to run into memory and other limits.
Numeric value: 65.
Suggested message: "The RHS is too long".
@end deftypevr
@deftypevr Macro int MARPA_ERR_SEQUENCE_LHS_NOT_UNIQUE
The LHS of a
sequence rule cannot be the LHS of any other rule,
whether a sequence rule or a BNF rule.
An attempt was made to violate this restriction.
Numeric value: 66.
Suggested message: "LHS of sequence rule would not be unique".
@end deftypevr
@deftypevr Macro int MARPA_ERR_START_NOT_LHS
The start symbol is not on the LHS of
any rule.
That means it could never match any possible input,
not even the null string.
Presumably, this is an error in writing the grammar.
Numeric value: 73.
Suggested message: "Start symbol not on LHS of any rule".
@end deftypevr
@deftypevr Macro int MARPA_ERR_SYMBOL_IS_NOT_COMPLETION_EVENT
An attempt was made to use a symbol in
a way that requires it to be
set up for completion events,
but the symbol was not
set up for completion events.
The archetypal case is an attempt to activate a completion
event in the recognizer for
a symbol that is not set up as a completion event.
Numeric value: 92.
Suggested message: "Symbol is not set up for completion events".
@end deftypevr
@deftypevr Macro int MARPA_ERR_SYMBOL_IS_NOT_NULLED_EVENT
An attempt was made to use a symbol in
a way that requires it to be
set up for nulled events,
but the symbol was not
set up for nulled events.
The archetypal case is an attempt to activate a nulled
event in the recognizer for
a symbol that is not set up as a nulled event.
Numeric value: 93.
Suggested message: "Symbol is not set up for nulled events".
@end deftypevr
@deftypevr Macro int MARPA_ERR_SYMBOL_IS_NOT_PREDICTION_EVENT
An attempt was made to use a symbol in
a way that requires it to be
set up for prediction events,
but the symbol was not
set up for prediction events.
The archetypal case is an attempt to activate a prediction
event in the recognizer for
a symbol that is not set up as a prediction event.
Numeric value: 94.
Suggested message: "Symbol is not set up for prediction events".
@end deftypevr
@deftypevr Macro int MARPA_ERR_SYMBOL_VALUED_CONFLICT
Unvalued symbols are a deprecated Marpa feature,
which may be avoided with
the @code{marpa_g_force_valued} method.
An unvalued symbol may take on any value,
and therefore a symbol which is unvalued at some points
cannot safely be used to contain a value at
others.
This error indicates that such an unsafe use is
being attempted.
Numeric value: 74.
Suggested message: "Symbol is treated both as valued and unvalued".
@end deftypevr
@deftypevr Macro int MARPA_ERR_TERMINAL_IS_LOCKED
An attempt was made to change the terminal status
of a symbol to a different value
after it was locked.
Numeric value: 75.
Suggested message: "The terminal status of the symbol is locked".
@end deftypevr
@deftypevr Macro int MARPA_ERR_TOKEN_IS_NOT_TERMINAL
A token was specified whose symbol ID is not
a terminal.
Numeric value: 76.
Suggested message: "Token symbol must be a terminal".
@end deftypevr
@deftypevr Macro int MARPA_ERR_TOKEN_LENGTH_LE_ZERO
A token length was specified which is less than
or equal to zero.
Zero-length tokens are not allowed in Libmarpa.
Numeric value: 77.
Suggested message: "Token length must greater than zero".
@end deftypevr
@deftypevr Macro int MARPA_ERR_TOKEN_TOO_LONG
The token length is too long.
The limit on the length of a token
is implementation dependent, but it
is at least 500,000,000 earlemes.
An application using a token that long
is almost certain to run into some other
limit.
Numeric value: 78.
Suggested message: "Token is too long".
@end deftypevr
@deftypevr Macro int MARPA_ERR_TREE_EXHAUSTED
A Libmarpa parse tree iterator
is ``exhausted'', that is,
it has no more parses.
Numeric value: 79.
Suggested message: "Tree iterator is exhausted".
@end deftypevr
@deftypevr Macro int MARPA_ERR_TREE_PAUSED
A Libmarpa tree is ``paused''
and an operation was attempted which
is inconsistent with that fact.
Typically, this operation will be
a call of the @code{marpa_t_next()} method.
Numeric value: 80.
Suggested message: "Tree iterator is paused".
@end deftypevr
@deftypevr Macro int MARPA_ERR_UNEXPECTED_TOKEN_ID
An attempt was made to read a token
where a token with that symbol ID is not
expected.
This message can also occur when an
attempt is made to read a token
at a location where no token is expected.
Numeric value: 81.
Suggested message: "Unexpected token".
@end deftypevr
@deftypevr Macro int MARPA_ERR_UNPRODUCTIVE_START
The start symbol is unproductive.
That means it could never match any possible input,
not even the null string.
Presumably, this is an error in writing the grammar.
Numeric value: 82.
Suggested message: "Unproductive start symbol".
@end deftypevr
@deftypevr Macro int MARPA_ERR_VALUATOR_INACTIVE
The valuator is inactive in a context where that
should not be the case.
Numeric value: 83.
Suggested message: "Valuator inactive".
@end deftypevr
@deftypevr Macro int MARPA_ERR_VALUED_IS_LOCKED
Unvalued symbols are a deprecated Marpa feature,
which may be avoided with
the @code{marpa_g_force_valued} method.
This error code
indicates that the valued status of a symbol is locked,
and an attempt was made
to change it to a status different from the
current one.
Numeric value: 84.
Suggested message: "The valued status of the symbol is locked".
@end deftypevr
@deftypevr Macro int MARPA_ERR_SYMBOL_IS_NULLING
An attempt was made to do something with a nulling
symbol that is not allowed.
For example,
the ID of a nulling symbol cannot be an argument
to @code{marpa_r_expected_symbol_event_set} ---
because it is not possible to create an ``expected symbol'' event
for a nulling symbol.
Numeric value: 87.
Suggested message: "Symbol is nulling".
@end deftypevr
@deftypevr Macro int MARPA_ERR_SYMBOL_IS_UNUSED
An attempt was made to do something with an unused
symbol that is not allowed.
An ``unused'' symbol is an inaccessible or unproductive symbol.
For example,
the ID of an unused symbol cannot be an argument
to @code{marpa_r_expected_symbol_event_set},
because it is not possible to create an ``expected symbol'' event
for an unused symbol.
Numeric value: 88.
Suggested message: "Symbol is not used".
@end deftypevr
@node Internal error codes, , External error codes, Error methods macros and codes
@section Internal error codes
An internal error code may be one of two things:
First,
it can be an error code which
arises from an internal Libmarpa programming issue
(in other words, something happening in the code
that was not supposed to be able to happen).
Second, it can be an error code which only occurs
when a method from Libmarpa's internal interface
is used.
Both kinds of internal error message share one common
trait: users of Libmarpa's external interface
should never see them.
Internal error messages
require someone with knowledge of the Libmarpa internals
to follow up on them.
They usually do not have descriptions or suggested messages.
@deftypevr Macro int MARPA_ERR_AHFA_IX_NEGATIVE
Numeric value: 1.
@end deftypevr
@deftypevr Macro int MARPA_ERR_AHFA_IX_OOB
Numeric value: 2.
@end deftypevr
@deftypevr Macro int MARPA_ERR_ANDID_NEGATIVE
Numeric value: 3.
@end deftypevr
@deftypevr Macro int MARPA_ERR_ANDID_NOT_IN_OR
Numeric value: 4.
@end deftypevr
@deftypevr Macro int MARPA_ERR_ANDIX_NEGATIVE
Numeric value: 5.
@end deftypevr
@deftypevr Macro int MARPA_ERR_BOCAGE_ITERATION_EXHAUSTED
Numeric value: 7.
@end deftypevr
@deftypevr Macro int MARPA_ERR_DEVELOPMENT
``Development'' errors were used heavily during
Libmarpa's development,
when it was not yet clear how precisely
to classify every error condition.
Unless they are using a developer's version,
users of the external interface
should never
see development errors.
Development errors have an error string
associated with them.
The error string is a
short 7-bit ASCII error string
which describes the error.
Numeric value: 9.
Suggested message: "Development error, see string".
@end deftypevr
@deftypevr Macro int MARPA_ERR_DUPLICATE_AND_NODE
Numeric value: 10.
@end deftypevr
@deftypevr Macro int MARPA_ERR_YIM_ID_INVALID
Numeric value: 14.
@end deftypevr
@deftypevr Macro int MARPA_ERR_INTERNAL
A ``catchall'' internal error.
Numeric value: 19.
@end deftypevr
@deftypevr Macro int MARPA_ERR_INVALID_AHFA_ID
The AHFA ID was invalid. There are no AHFAs
any more, so this message should not occur.
Numeric value: 20.
@end deftypevr
@deftypevr Macro int MARPA_ERR_INVALID_AIMID
The AHM ID was invalid. The term ``AIMID''
is a legacy of earlier implementations and must
be kept for backward compatibility.
Numeric value: 21.
@end deftypevr
@deftypevr Macro int MARPA_ERR_INVALID_IRLID
Numeric value: 23.
@end deftypevr
@deftypevr Macro int MARPA_ERR_INVALID_NSYID
Numeric value: 24.
@end deftypevr
@deftypevr Macro int MARPA_ERR_NOOKID_NEGATIVE
Numeric value: 33.
@end deftypevr
@deftypevr Macro int MARPA_ERR_NOT_TRACING_COMPLETION_LINKS
Numeric value: 35.
@end deftypevr
@deftypevr Macro int MARPA_ERR_NOT_TRACING_LEO_LINKS
Numeric value: 36.
@end deftypevr
@deftypevr Macro int MARPA_ERR_NOT_TRACING_TOKEN_LINKS
Numeric value: 37.
@end deftypevr
@deftypevr Macro int MARPA_ERR_NO_AND_NODES
Numeric value: 38.
@end deftypevr
@deftypevr Macro int MARPA_ERR_NO_OR_NODES
Numeric value: 40.
@end deftypevr
@deftypevr Macro int MARPA_ERR_NO_TRACE_YS
Numeric value: 46.
@end deftypevr
@deftypevr Macro int MARPA_ERR_NO_TRACE_PIM
Numeric value: 47.
@end deftypevr
@deftypevr Macro int MARPA_ERR_NO_TRACE_YIM
Numeric value: 45.
@end deftypevr
@deftypevr Macro int MARPA_ERR_NO_TRACE_SRCL
Numeric value: 48.
@end deftypevr
@deftypevr Macro int MARPA_ERR_ORID_NEGATIVE
Numeric value: 51.
@end deftypevr
@deftypevr Macro int MARPA_ERR_OR_ALREADY_ORDERED
Numeric value: 52.
@end deftypevr
@deftypevr Macro int MARPA_ERR_PIM_IS_NOT_LIM
Numeric value: 55.
@end deftypevr
@deftypevr Macro int MARPA_ERR_SOURCE_TYPE_IS_NONE
Numeric value: 70.
@end deftypevr
@deftypevr Macro int MARPA_ERR_SOURCE_TYPE_IS_TOKEN
Numeric value: 71.
@end deftypevr
@deftypevr Macro int MARPA_ERR_SOURCE_TYPE_IS_COMPLETION
Numeric value: 68.
@end deftypevr
@deftypevr Macro int MARPA_ERR_SOURCE_TYPE_IS_LEO
Numeric value: 69.
@end deftypevr
@deftypevr Macro int MARPA_ERR_SOURCE_TYPE_IS_AMBIGUOUS
Numeric value: 67.
@end deftypevr
@deftypevr Macro int MARPA_ERR_SOURCE_TYPE_IS_UNKNOWN
Numeric value: 72.
@end deftypevr
@node Design notes, Work in Progress, Error methods macros and codes, Top
@chapter Design notes
Design apologetics are
throughout this document,
dispersed among sections where it is
hoped that they motivate the description or discussion.
The notes in this section
did not find a home elsewhere.
@menu
* Why so many time objects::
* Design of numbered objects::
* LHS Terminals::
@end menu
@node Why so many time objects, Design of numbered objects, Design notes, Design notes
@section Why so many time objects?
Marpa is an aggressively multi-pass algorithm.
Marpa achieves its efficiency,
not in spite of making multiple
passes over the data, but because of it.
Marpa regularly substitutes
two fast O(@var{n}) passes for a single
O(@var{n} log @var{n}) pass.
Marpa's proliferation of time objects is in
keeping with its multi-pass approach.
Bocage objects come at no cost,
even for unambiguous parses,
because the same pass which creates the bocage
also deals with other issues which are of major
significance for unambiguous parses.
It is the post-processing of the bocage pass
that enables Marpa to do both left-
and right-recursion in linear time.
Of the various objects, the best
case for elimination is of the
ordering object.
In many cases, the ordering is trivial.
Either the parse is unambiguous, or the
application does not care about the order in
which parses are returned.
But while it would be easy to add an option
to bypass creation of an ordering object,
there is little to be gained from it.
When the ordering is trivial,
its overhead is very small ---
essentially a handful of subroutine calls.
Many orderings accomplish nothing,
but these cost next to nothing.
Tree objects come at minimal cost to unambiguous grammars,
because the same pass that allows iteration through multiple
parse trees does the tree traversal.
This eliminates much of the work that otherwise would
need to be done in
the valuation time object.
In the current implementation, the valuation time object
needs only to step through a sequence already determined
in the tree iterator.
@node Design of numbered objects, LHS Terminals, Why so many time objects, Design notes
@section Numbered objects
As the name suggests,
the choice was made to implement
numbered objects as integers,
and not as
pointers.
In standard-conformant C,
integers can be safely checked for validity,
while pointers cannot.
There are efficiency tradeoffs between pointers and
integers but they are complicated,
and they go both ways.
Pointers can be faster, but integers can be used
as indexes into more than one data structure.
Which is actually faster depends on the design.
Integers allow for a more flexible design,
so that once the choice is settled on,
careful programming can make them a win,
possibly a very big one.
The approach taken in Libmarpa was to settle,
from the outset,
on integers as the implementation for numbered
objects,
and to optimize on that basis.
The author concedes that it is
possible that
others redoing Libmarpa from scratch
might find that pointers are faster.
But the author is confident
that they will also discover,
on modern architectures,
that the lack of safe validity checking
is far too high a price to pay
for the difference in speed.
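
Two of the points above, that an integer ID can be checked for validity
and that one ID can index more than one data structure,
can be illustrated with a small sketch.
The structure and function names are hypothetical
and are not taken from Libmarpa's internals.
@verbatim
#define MAX_SYMBOLS 16

/* Hypothetical symbol table, for illustration only.
   One integer ID indexes both arrays. */
struct symbol_table
{
  int count;                    /* IDs 0 .. count-1 are in use */
  int rank[MAX_SYMBOLS];
  int is_terminal[MAX_SYMBOLS];
};

/* An integer ID can be validated with two comparisons.
   No comparable portable test exists for an arbitrary pointer. */
int
symbol_id_is_valid (const struct symbol_table *t, int id)
{
  return id >= 0 && id < t->count;
}

/* Returns the rank of symbol ID, or FALLBACK if ID is not valid. */
int
symbol_rank_or (const struct symbol_table *t, int id, int fallback)
{
  return symbol_id_is_valid (t, id) ? t->rank[id] : fallback;
}
@end verbatim
With pointers in place of IDs,
each additional per-symbol array would need its own lookup scheme,
and a stale or corrupted pointer could not be detected
before it was dereferenced.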
@node LHS Terminals, , Design of numbered objects, Design notes
@section LHS terminals
Marpa's idea
in losing the sharp division between terminals
and non-terminals is that the distinction,
while helpful for proving theorems,
is not essential in practice.
LHS symbols
in the input might be useful for
``short circuiting'' the rules in which they occur.
This may
prove helpful in debugging, or have other applications.
However,
it also can be useful,
for checking input validity as well as for efficiency,
to follow tradition and distinguish
non-terminals from terminals.
For this reason,
the traditional behavior is the default
in Libmarpa.
@node Work in Progress, Deprecated techniques and methods, Design notes, Top
@chapter Work in Progress
@menu
* Untested methods::
@end menu
@node Untested methods, , Work in Progress, Work in Progress
@section Untested methods
The methods of this section are not in the external interface,
because they have not been adequately tested.
Their fate is uncertain.
Users should regard these methods
as unsupported.
@menu
* Ranking methods::
* Zero-width assertion methods::
* Methods for revising parses::
@end menu
@node Ranking methods, Zero-width assertion methods, Untested methods, Untested methods
@subsection Ranking methods
@deftypefun Marpa_Rank marpa_g_default_rank ( @
Marpa_Grammar @var{g})
@deftypefunx Marpa_Rank marpa_g_default_rank_set ( @
Marpa_Grammar @var{g}, @
Marpa_Rank @var{rank})
These methods, respectively, query
and set the default rank of the grammar.
When a grammar is created, the default rank is 0.
When rules and symbols are created, their rank is
the default rank of the grammar.
Changing the grammar's default rank does not affect
those
rules and symbols already created,
only those that will be created.
This means that
the grammar's default rank can be used to,
in effect, assign ranks to groups of rules and symbols.
Applications may find this behavior useful.
Return value: On success, returns
the rank @strong{after}
the call,
and sets the error code to
@code{MARPA_ERR_NONE}.
On failure, returns @minus{}2,
and sets the error code to an appropriate
value, which will never be
@code{MARPA_ERR_NONE}.
Note that when the rank is @minus{}2,
the error code is the only way to distinguish
success from failure.
The error code can be determined by using the
@code{marpa_g_error()} call.
@end deftypefun
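The following self-contained sketch models two behaviors described above:
a rule keeps the default rank that was in force when it was created,
and a return value of @minus{}2 must be disambiguated with the error code.
It does not call Libmarpa.
The type @code{toy_grammar}, the functions
@code{toy_default_rank_set()}, @code{toy_rule_new()} and
@code{toy_default_rank_set_ok()},
and the error names are all hypothetical,
and the ``precomputed'' failure condition is an assumption made
only so that the sketch has a way to fail.

```c
/* Hypothetical stand-ins for illustration; not Libmarpa declarations. */
enum toy_rank_error { TOY_RANK_ERR_NONE = 0, TOY_RANK_ERR_PRECOMPUTED };

#define TOY_MAX_RULES 8

struct toy_grammar {
    int default_rank;              /* 0 when the grammar is created */
    int rule_rank[TOY_MAX_RULES];  /* rank recorded per rule at creation */
    int rule_count;
    int is_precomputed;            /* assumed failure condition */
    enum toy_rank_error error;     /* error code of the most recent call */
};

/* Set the default rank.
   Returns the rank after the call, or -2 on failure. */
static int toy_default_rank_set(struct toy_grammar *g, int rank)
{
    if (g->is_precomputed) {
        g->error = TOY_RANK_ERR_PRECOMPUTED;
        return -2;
    }
    g->default_rank = rank;
    g->error = TOY_RANK_ERR_NONE;
    return g->default_rank;
}

/* Create a rule; it takes the default rank in force right now.
   Returns the new rule ID, or -2 if the table is full. */
static int toy_rule_new(struct toy_grammar *g)
{
    if (g->rule_count >= TOY_MAX_RULES) return -2;
    g->rule_rank[g->rule_count] = g->default_rank;
    return g->rule_count++;
}

/* Returns 1 if the set succeeded and 0 if it failed.
   A result of -2 counts as failure only if an error code was set,
   because -2 is also a legal rank. */
static int toy_default_rank_set_ok(struct toy_grammar *g, int rank)
{
    int result = toy_default_rank_set(g, rank);
    if (result == -2 && g->error != TOY_RANK_ERR_NONE) return 0;
    return 1;
}
```

With a real grammar, the error code would be obtained with
@code{marpa_g_error()} in place of the @code{error} field used here.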
@deftypefun Marpa_Rank marpa_g_symbol_rank ( @
Marpa_Grammar @var{g}, @
Marpa_Symbol_ID @var{sym_id})
@deftypefunx Marpa_Rank marpa_g_symbol_rank_set ( @
Marpa_Grammar @var{g}, @
Marpa_Symbol_ID @var{sym_id}, @
Marpa_Rank @var{rank})
These methods, respectively, query
and set the rank of a symbol @var{sym_id}.
When @var{sym_id} is created, its rank is
initialized to the default rank of the grammar.
Return value: On success, returns
the rank @strong{after}
the call,
and sets the error code to
@code{MARPA_ERR_NONE}.
On failure, returns @minus{}2,
and sets the error code to an appropriate
value, which will never be
@code{MARPA_ERR_NONE}.
Note that when the rank is @minus{}2,
the error code is the only way to distinguish
success from failure.
The error code can be determined by using the
@code{marpa_g_error()} call.
@end deftypefun
@node Zero-width assertion methods, Methods for revising parses, Ranking methods, Untested methods
@subsection Zero-width assertion methods
@deftypefun Marpa_Assertion_ID marpa_g_zwa_new ( @
Marpa_Grammar @var{g}, @
int @var{default_value})
@end deftypefun
@deftypefun int marpa_g_zwa_place ( @
Marpa_Grammar @var{g}, @
Marpa_Assertion_ID @var{zwaid}, @
Marpa_Rule_ID @var{xrl_id}, @
int @var{rhs_ix})
@end deftypefun
@deftypefun int marpa_r_zwa_default ( @
Marpa_Recognizer @var{r}, @
Marpa_Assertion_ID @var{zwaid})
On success, returns the previous default value of the assertion.
@end deftypefun
@deftypefun int marpa_r_zwa_default_set ( @
Marpa_Recognizer @var{r}, @
Marpa_Assertion_ID @var{zwaid}, @
int @var{default_value})
Changes the default value to @var{default_value}.
On success, returns the previous default value of the assertion.
@end deftypefun
@deftypefun Marpa_Assertion_ID marpa_g_highest_zwa_id ( @
Marpa_Grammar @var{g} )
@end deftypefun
@node Methods for revising parses, , Zero-width assertion methods, Untested methods
@subsection Methods for revising parses
Marpa allows an application to ``change its mind'' about a parse,
rejecting rules previously recognized or predicted,
and terminals previously scanned.
The methods in this section provide that capability.
@deftypefun Marpa_Earleme marpa_r_clean ( @
Marpa_Recognizer @var{r})
@end deftypefun
@node Deprecated techniques and methods, GNU Free Documentation License, Work in Progress, Top
@chapter Deprecated techniques and methods
@menu
* Valued and unvalued symbols::
@end menu
@node Valued and unvalued symbols, , Deprecated techniques and methods, Deprecated techniques and methods
@section Valued and unvalued symbols
@menu
* What unvalued symbols were::
* Grammar methods dealing with unvalued symbols::
* Registering semantics in the valuator::
@end menu
@node What unvalued symbols were, Grammar methods dealing with unvalued symbols, Valued and unvalued symbols, Valued and unvalued symbols
@subsection What unvalued symbols were
Libmarpa symbols can have values,
which is the traditional way of doing semantics.
Libmarpa also allows symbols to be unvalued.
An @dfn{unvalued} symbol is one whose value
is unpredictable from instance to instance.
If a symbol is unvalued, we sometimes say that it
has ``whatever'' semantics.
Situations where the semantics can tolerate unvalued symbols
are surprisingly frequent.
For example, the top-level of many languages is a series
of major units, all of whose semantics are typically accomplished
via side effects.
The compiler is typically indifferent to the actual value produced
by these major units, and tracking them is a waste of time.
Similarly, the value of the separators in a list is typically
ignored.
Rules are unvalued if and only if their LHS symbols
are unvalued.
When rules and symbols are unvalued,
Libmarpa optimizes their evaluation.
It is in principle unsafe to check the value
of a symbol if it can be unvalued.
For this reason,
once a symbol has been treated as valued,
Libmarpa marks it as valued.
Similarly,
once a symbol has been treated as unvalued,
Libmarpa marks it as unvalued.
Once marked, a symbol's valued status is
@dfn{locked} and cannot be changed later.
The valued status of terminals is marked the first
time they are read.
The valued status of LHS symbols must be explicitly
marked by the application when initializing the
valuator --- this is Libmarpa's equivalent of
registering a callback.
LHS terminals are disabled by default.
If LHS terminals are allowed, the user should be aware that the valued
status of an LHS terminal
will be locked in the recognizer
if it is used as a terminal,
and the symbol's use as a rule LHS
in the valuator must be
consistent with the recognizer's marking.
Marpa reports an error when a symbol's use
conflicts with its locked valued status.
Doing so usually saves the programmer
some tricky debugging further down the road.
@node Grammar methods dealing with unvalued symbols, Registering semantics in the valuator, What unvalued symbols were, Valued and unvalued symbols
@subsection Grammar methods dealing with unvalued symbols
@deftypefun int marpa_g_symbol_is_valued ( @
Marpa_Grammar @var{g}, @
Marpa_Symbol_ID @var{symbol_id})
@deftypefunx int marpa_g_symbol_is_valued_set ( @
Marpa_Grammar @var{g}, @
Marpa_Symbol_ID @var{symbol_id}, @
int @var{value})
These methods, respectively, query
and set the ``valued status'' of a symbol.
Once set to a value with the
@code{marpa_g_symbol_is_valued_set()} method,
the valued status of a symbol is ``locked'' at that value.
It cannot thereafter be changed.
Subsequent calls to
@code{marpa_g_symbol_is_valued_set()}
for the same @var{symbol_id} will fail,
leaving @var{symbol_id}'s valued status unchanged,
unless @var{value} is the same as the locked-in value.
Return value: On success, 1 if the symbol @var{symbol_id}
is valued after the call, 0 if not.
Returns @minus{}2 if the valued status is locked and @var{value}
is different from the current status,
if @var{value} is not 0 or 1,
or on other failure.
@end deftypefun
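The locking rule described above can be modeled in a few lines.
This is a sketch only, and it does not call Libmarpa:
@code{toy_symbol}, @code{toy_is_valued_set()} and the error names
are hypothetical,
and reporting the error through an output parameter is a choice
made to keep the sketch self-contained.

```c
/* Hypothetical stand-ins for illustration; not Libmarpa declarations. */
enum toy_valued_error {
    TOY_VALUED_ERR_NONE = 0,
    TOY_VALUED_ERR_BAD_VALUE,
    TOY_VALUED_ERR_LOCKED
};

struct toy_symbol {
    int is_valued;  /* current valued status, 0 or 1 */
    int is_locked;  /* nonzero once the status has been set */
};

/* Set the valued status of a symbol and lock it.
   Returns the status after the call (0 or 1), or -2 on failure.
   On failure the symbol is left unchanged. */
static int toy_is_valued_set(struct toy_symbol *s, int value,
                             enum toy_valued_error *err)
{
    if (value != 0 && value != 1) {
        *err = TOY_VALUED_ERR_BAD_VALUE;
        return -2;
    }
    if (s->is_locked && s->is_valued != value) {
        *err = TOY_VALUED_ERR_LOCKED;
        return -2;
    }
    /* Either the first set, or a repeat of the locked-in value. */
    s->is_valued = value;
    s->is_locked = 1;
    *err = TOY_VALUED_ERR_NONE;
    return s->is_valued;
}
```

Repeating a call with the locked-in value succeeds,
while a call with the opposite value fails and changes nothing.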
@node Registering semantics in the valuator, , Grammar methods dealing with unvalued symbols, Valued and unvalued symbols
@subsection Registering semantics in the valuator
By default, Libmarpa's valuator objects
assume that
non-terminal symbols have
no semantics.
The archetypal application will need to register
symbols that contain semantics.
The primary method for doing this is
@code{marpa_v_symbol_is_valued()}.
Applications will typically register semantics by rule,
and these applications will find
the @code{marpa_v_rule_is_valued()} method more convenient.
@deftypefun int marpa_v_symbol_is_valued ( @
Marpa_Value @var{v}, @
Marpa_Symbol_ID @var{sym_id} )
@deftypefunx int marpa_v_symbol_is_valued_set ( @
Marpa_Value @var{v}, @
Marpa_Symbol_ID @var{sym_id}, @
int @var{value} )
These methods, respectively,
discover and set to @var{value},
the valued status for symbol @var{sym_id}.
A valued status of 1 indicates that the symbol is valued.
A valued status of 0 indicates that the symbol is unvalued.
If the valued status is locked,
an attempt to change to a status different from the
current one will fail
(error code @code{MARPA_ERR_VALUED_IS_LOCKED}).
Return value: On success, the valued status @strong{after}
the call.
If @var{value} is not either 0 or 1,
or on other failure, @minus{}2.
@end deftypefun
@deftypefun int marpa_v_rule_is_valued ( @
Marpa_Value @var{v}, @
Marpa_Rule_ID @var{rule_id} )
@deftypefunx int marpa_v_rule_is_valued_set ( @
Marpa_Value @var{v}, @
Marpa_Rule_ID @var{rule_id}, @
int @var{value} )
These methods, respectively,
discover and set to @var{value},
the valued status
for the LHS symbol of rule @var{rule_id}.
A valued status of 1 indicates that the symbol is valued.
A valued status of 0 indicates that the symbol is unvalued.
If the valued status is locked,
an attempt to change to a status different from the
current one will fail
(error code @code{MARPA_ERR_VALUED_IS_LOCKED}).
Rules have no valued status of their own.
The valued status of a rule
is always that of its LHS symbol.
These methods are conveniences --- they
save the application the trouble of looking
up the rule's LHS.
Return value: On success, the valued status of the
rule @var{rule_id}'s LHS symbol @strong{after}
the call.
If @var{value} is not either 0 or 1,
or on other failure, @minus{}2.
@end deftypefun
@deftypefun int marpa_v_valued_force ( @
Marpa_Value @var{v})
This method locks the valued status of all symbols
to 1, indicating that the symbols are valued.
If this is not possible, for example because one of
the grammar's symbols already is locked at a valued
status of 0,
failure is returned.
Return value: On success, a non-negative number.
On failure, returns @minus{}2,
and sets the error code to an appropriate
value, which will never be
@code{MARPA_ERR_NONE}.
@end deftypefun
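As a rough model of the behavior just described,
the sketch below forces every symbol in a small array to a valued
status of 1, and fails if any symbol is already locked at 0.
It does not call Libmarpa.
The names @code{toy_vf_symbol} and @code{toy_valued_force()}
are hypothetical.
Checking all symbols before changing any of them is a choice of
this sketch, not a documented property of
@code{marpa_v_valued_force()}.

```c
/* Hypothetical stand-in for illustration; not a Libmarpa declaration. */
struct toy_vf_symbol {
    int is_valued;  /* current valued status, 0 or 1 */
    int is_locked;  /* nonzero once the status is locked */
};

/* Lock every symbol's valued status to 1.
   Returns a non-negative number on success, -2 on failure. */
static int toy_valued_force(struct toy_vf_symbol *syms, int count)
{
    int i;
    /* First pass: refuse if any symbol is locked as unvalued. */
    for (i = 0; i < count; i++) {
        if (syms[i].is_locked && !syms[i].is_valued) return -2;
    }
    /* Second pass: mark and lock every symbol as valued. */
    for (i = 0; i < count; i++) {
        syms[i].is_valued = 1;
        syms[i].is_locked = 1;
    }
    return count;
}
```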
@node GNU Free Documentation License, , Deprecated techniques and methods, Top
@appendix GNU Free Documentation License
@include fdl-1.3.texi
@bye
@c vim: expandtab shiftwidth=4: