File: parser_engine.html

<?xml version="1.0" encoding="UTF-8"?><!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"><html lang="en" xmlns="http://www.w3.org/1999/xhtml" xml:lang="en"><head><title>pyPEG – the Parser Engine</title><meta content="text/html;charset=UTF-8" http-equiv="Content-Type"/><link href="format.css" type="text/css" rel="stylesheet"/></head><body style="counter-reset: chapter 2;"><a name="top"/><div id="headline"><p>pyPEG – a PEG Parser-Interpreter in Python</p><div class="small">pyPEG 2.15.0 of Fr Jan 10 2014 – Copyleft 2009-2014, <a href="http://fdik.org">Volker Birk</a></div><div id="python1"><p>Requires Python 3.x or 2.7<br/>
Older versions: <a href="http://fdik.org/pyPEG1">pyPEG 1.x</a>
</p></div></div><div id="navigation"><p class="head"><a href="index.html">How to use pyPEG</a></p><div class="contents"><menu><li><em><a href="index.html#installation">Installation</a></em></li><li><em><a href="index.html#parsing">Parsing text with pyPEG</a></em></li><li><em><a href="index.html#composing">Composing text</a></em></li><li><a href="index.html#indenting">Indenting text</a></li><li><a href="index.html#usercallbacks">User defined Callback Functions</a></li><li><em><a href="index.html#xmlout">XML output</a></em></li></menu></div><p class="head"><a href="grammar_elements.html">Grammar Elements</a></p><div class="contents"><menu><li><em><a href="grammar_elements.html#basic">Basic Grammar Elements</a></em></li><li><a href="grammar_elements.html#literals">str instances and Literal</a></li><li><a href="grammar_elements.html#regex">Regular Expressions</a></li><li><a href="grammar_elements.html#tuple">tuple instances and Concat</a></li><li><a href="grammar_elements.html#lists">list instances</a></li><li><a href="grammar_elements.html#none">Constant None</a></li><li><em><a href="grammar_elements.html#goclasses">Grammar Element Classes</a></em></li><li><a href="grammar_elements.html#symbol">Class Symbol</a></li><li><a href="grammar_elements.html#keyword">Class Keyword</a></li><li><a href="grammar_elements.html#list">Class List</a></li><li><a href="grammar_elements.html#namespace">Class Namespace</a></li><li><a href="grammar_elements.html#enum">Class Enum</a></li><li><em><a href="grammar_elements.html#ggfunc">Grammar generator functions</a></em></li><li><a href="grammar_elements.html#some">Function some()</a></li><li><a href="grammar_elements.html#maybesome">Function maybe_some()</a></li><li><a href="grammar_elements.html#optional">Function optional()</a></li><li><a href="grammar_elements.html#csl">Function csl()</a></li><li><a href="grammar_elements.html#attr">Function attr()</a></li><li><a href="grammar_elements.html#flag">Function flag()</a></li><li><a 
href="grammar_elements.html#name">Function name()</a></li><li><a href="grammar_elements.html#ignore">Function ignore()</a></li><li><a href="grammar_elements.html#indent">Function indent()</a></li><li><a href="grammar_elements.html#contiguous">Function contiguous()</a></li><li><a href="grammar_elements.html#separated">Function separated()</a></li><li><a href="grammar_elements.html#omit">Function omit()</a></li><li><em><a href="grammar_elements.html#callbacks">Callback functions</a></em></li><li><a href="grammar_elements.html#blank">Callback function blank()</a></li><li><a href="grammar_elements.html#endl">Callback function endl()</a></li><li><a href="grammar_elements.html#udcf">User defined callback functions</a></li><li><em><a href="grammar_elements.html#common">Common class methods for grammar elements</a></em></li><li><a href="grammar_elements.html#override_parse">parse() class method of a grammar element</a></li><li><a href="grammar_elements.html#override_compose">compose() method of a grammar element</a></li></menu></div><p class="head"><a href="parser_engine.html">Parser Engine</a></p><div class="contents"><menu><li><em><a href="parser_engine.html#parser">Class Parser</a></em></li><li><a href="parser_engine.html#parser_vars">Instance variables</a></li><li><a href="parser_engine.html#parser_init">Method __init__()</a></li><li><a href="parser_engine.html#parser_clear_memory">Method clear_memory()</a></li><li><a href="parser_engine.html#parser_parse">Method parse()</a></li><li><a href="parser_engine.html#parser_compose">Method compose()</a></li><li><a href="parser_engine.html#gen_syntax_error">Method generate_syntax_error()</a></li><li><em><a href="parser_engine.html#convenience">Convenience functions</a></em></li><li><a href="parser_engine.html#parse">Function parse()</a></li><li><a href="parser_engine.html#compose">Function compose()</a></li><li><a href="parser_engine.html#attributes">Function attributes()</a></li><li><a 
href="parser_engine.html#howmany">Function how_many()</a></li><li><em><a href="parser_engine.html#errors">Exceptions</a></em></li><li><a href="parser_engine.html#gerror">GrammarError</a></li><li><a href="parser_engine.html#getype">GrammarTypeError</a></li><li><a href="parser_engine.html#gevalue">GrammarValueError</a></li></menu></div><p class="head"><a href="xml_backend.html">XML Backend</a></p><div class="contents"><menu><li><em><a href="xml_backend.html#workhorses">etree functions</a></em></li><li><a href="xml_backend.html#create_tree">Function create_tree()</a></li><li><a href="xml_backend.html#create_thing">Function create_thing()</a></li><li><em><a href="xml_backend.html#xmlconvenience">XML convenience functions</a></em></li><li><a href="xml_backend.html#thing2xml">Function thing2xml()</a></li><li><a href="xml_backend.html#xml2thing">Function xml2thing()</a></li></menu></div><p class="head">I want this!</p><menu><li><a href="http://fdik.org/pyPEG2/pyPEG2.tar.gz"><strong>Download pyPEG 2</strong></a></li><li><a href="LICENSE.txt">License</a></li><li><a href="https://bitbucket.org/fdik/pypeg/">Bitbucket Repository</a></li><li><a href="http://fdik.org/yml">YML is using pyPEG</a></li><li><a href="http://fdik.org/iec2xml/">The IEC 61131-3 Structured Text to XML Compiler is using pyPEG</a></li><li><a href="http://fdik.org/pyPEG1">pyPEG version 1.x</a></li></menu></div><div id="entries"><h1 id="pengine">Parser Engine</h1><h2 id="parser">Class Parser</h2><p>Offers parsing and composing capabilities. Implements an intrinsic
<a href="https://en.wikipedia.org/wiki/Packrat_parser">Packrat parser</a>.
</p><p><em>pyPEG</em> uses memoization to speed up parsing. A newly created
<a href="#parser"><code>Parser</code></a> instance starts with an empty cache memory.
Creating a new instance is usually recommended when you parse another text: the old
cache memory will not produce wrong results, but starting with an empty one reduces
memory consumption. If you alter the grammar, you must clear the cache
memory for the affected things to get correct parsing
results. Use the
<a href="#parser_clear_memory"><code>clear_memory()</code></a> method in that
case.
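</p><p>The caching behaviour described above can be illustrated with a minimal sketch. It is not <em>pyPEG</em>'s implementation: the class <code>TinyPackrat</code>, its <code>memory</code> layout and the example class <code>Word</code> are hypothetical stand-ins, and the assumption is only that results are cached per thing and per text.</p>

```python
import re


class TinyPackrat:
    """Toy memoizing matcher; a stand-in, not pyPEG's Parser class."""

    def __init__(self):
        self.memory = {}  # maps each thing to {text: (rest, matched)}

    def clear_memory(self, thing=None):
        """Forget cached results for one thing, or for all things."""
        if thing is None:
            self.memory.clear()
        else:
            self.memory.pop(thing, None)

    def parse(self, text, thing):
        """Match thing.grammar at the start of text, consulting the cache."""
        cache = self.memory.setdefault(thing, {})
        if text not in cache:
            m = thing.grammar.match(text)
            cache[text] = (text[m.end():], m.group(0)) if m else (text, None)
        return cache[text]


class Word:
    grammar = re.compile(r"[a-z]+")  # hypothetical example grammar


p = TinyPackrat()
first = p.parse("abc123", Word)          # computed, then cached

Word.grammar = re.compile(r"[a-z0-9]+")  # the grammar is altered
stale = p.parse("abc123", Word)          # still served from the cache

p.clear_memory(Word)                     # drop the entries for Word only
fresh = p.parse("abc123", Word)          # recomputed with the new grammar
```

<p>In this sketch <code>stale</code> equals <code>first</code> even though the grammar has changed, and only <code>fresh</code> reflects the new grammar, which is why clearing the cache for an altered thing matters.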
</p><h3 id="parser_vars">Instance variables</h3><p>The instance variables represent the parser's state.
</p><table class="glossary"><tr><td class="glossary"><p><code>whitespace</code></p></td><td class="glossary"><p>Regular expression to scan whitespace; default: <code>re.compile(r"(?m)\s+")</code>.
Set to <code>None</code> to disable automatic whitespace removal.
</p></td></tr><tr><td class="glossary"><p><code>comment</code></p></td><td class="glossary"><p><code>grammar</code> to parse comments; default: <code>None</code>.
If a <code>grammar</code> is set here, comments will be removed from the
source text automatically.
</p></td></tr><tr><td class="glossary"><p><code>last_error</code></p></td><td class="glossary"><p>after parsing, the <code>SyntaxError</code> which ended parsing</p></td></tr><tr><td class="glossary"><p><code>indent</code></p></td><td class="glossary"><p>string to use to indent while composing; default: four spaces</p></td></tr><tr><td class="glossary"><p><code>indention_level</code></p></td><td class="glossary"><p>level to indent to; default: <code>0</code></p></td></tr><tr><td class="glossary"><p><code>text</code></p></td><td class="glossary"><p>original text to parse; set for decorated syntax errors</p></td></tr><tr><td class="glossary"><p><code>filename</code></p></td><td class="glossary"><p>name of the file the text originates from</p></td></tr><tr><td class="glossary"><p><code>autoblank</code></p></td><td class="glossary"><p>add blanks while composing if the grammar would possibly be violated otherwise; default: <code>True</code></p></td></tr><tr><td class="glossary"><p><code>keep_feeble_things</code></p></td><td class="glossary"><p>keep otherwise cropped things like comments and whitespace; these
things are put into the <code>feeble_things</code> attribute
</p></td></tr></table><h3 id="parser_init">Method __init__()</h3><h4>Synopsis</h4><p><code>__init__(self)</code></p><p>Initialize instance variables to their defaults.</p><h3 id="parser_clear_memory">Method clear_memory()</h3><h4>Synopsis</h4><p><code>clear_memory(self, thing=None)</code></p><p>Clear cache memory for packrat parsing.</p><p>This method clears the cache memory for <code>thing</code>. If <code>None</code> is given
as <code>thing</code>, it clears the cache completely.
</p><h4>Arguments</h4><table class="glossary"><tr><td class="glossary"><p><code>thing</code></p></td><td class="glossary"><p>thing for which cache memory is cleared; default: <code>None</code></p></td></tr></table><h3 id="parser_parse">Method parse()</h3><h4>Synopsis</h4><p><code>parse(self, text, thing, filename=None)</code></p><p>(Partially) parse <code>text</code> following <code>thing</code> as grammar and return the
resulting things.
</p><p>This method parses as far as possible. It does not raise a
<code>SyntaxError</code> if the source <code>text</code> does not parse completely. Instead, it
returns a <code>SyntaxError</code> object as the <code>result</code> part of the return value if
the beginning of the source <code>text</code> does not comply with the grammar
<code>thing</code>.
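</p><p>Because this method reports problems through its return value, a caller usually has to inspect both parts of the returned tuple. The following is a minimal sketch of such a check. The helper name <code>require_complete</code> is hypothetical, and treating trailing whitespace as acceptable is an assumption of this sketch. The tuples are hard-coded in the documented <code>(text, result)</code> shape so that the sketch runs without <em>pyPEG</em> installed.</p>

```python
def require_complete(rest, result):
    """Turn a partial-parse outcome into a value or a raised error."""
    if isinstance(result, SyntaxError):
        raise result  # the beginning of the text did not match
    if rest.strip():
        raise SyntaxError("unparsed text left over: " + repr(rest))
    return result


# A fully consumed text yields the parsed things unchanged.
ok = require_complete("", ["hello", "world"])

# Leftover text, as in ('!', ['hello', 'world']), becomes an error.
try:
    require_complete("!", ["hello", "world"])
except SyntaxError as error:
    leftover_message = str(error)
```

<p>The convenience function <a href="#parse"><code>parse()</code></a> is the ready-made alternative when an exception is preferred over inspecting the return value.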
</p><h4>Arguments</h4><table class="glossary"><tr><td class="glossary"><p><code>text</code></p></td><td class="glossary"><p>text to parse</p></td></tr><tr><td class="glossary"><p><code>thing</code></p></td><td class="glossary"><p>grammar for things to parse</p></td></tr><tr><td class="glossary"><p><code>filename</code></p></td><td class="glossary"><p>name of the file the text originates from</p></td></tr></table><h4>Returns</h4><p>Returns <code>(text, result)</code> with:</p><table class="glossary"><tr><td class="glossary"><p><code>text</code></p></td><td class="glossary"><p>unparsed text</p></td></tr><tr><td class="glossary"><p><code>result</code></p></td><td class="glossary"><p>generated objects</p></td></tr></table><h4>Raises</h4><table class="glossary"><tr><td class="glossary"><p><code>ValueError</code></p></td><td class="glossary"><p>if input does not match types</p></td></tr><tr><td class="glossary"><p><code>TypeError</code></p></td><td class="glossary"><p>if output classes have wrong syntax for their respective <code>__init__(self, ...)</code></p></td></tr><tr><td class="glossary"><p><code>GrammarTypeError</code></p></td><td class="glossary"><p>if grammar contains an object of unknown type</p></td></tr><tr><td class="glossary"><p><code>GrammarValueError</code></p></td><td class="glossary"><p>if grammar contains an illegal cardinality value</p></td></tr></table><p>Example:</p><pre><code>&gt;&gt;&gt; from pypeg2 import Parser, csl, word
&gt;&gt;&gt; <span class="mark">p = Parser()</span>
&gt;&gt;&gt; <span class="mark">p.parse("hello, world!", csl(word))</span>
('!', ['hello', 'world'])
</code></pre><h3 id="parser_compose">Method compose()</h3><h4>Synopsis</h4><p><code>compose(self, thing, grammar=None)</code></p><p>Compose text using <code>thing</code> with <code>grammar</code>. If <code>thing.compose()</code>
exists, execute it, otherwise use <code>grammar</code> to compose.
</p><h4>Arguments</h4><table class="glossary"><tr><td class="glossary"><p><code>thing</code></p></td><td class="glossary"><p><code>thing</code> containing other things with <code>grammar</code></p></td></tr><tr><td class="glossary"><p><code>grammar</code></p></td><td class="glossary"><p><code>grammar</code> to use for composing <code>thing</code>; default: <code>type(thing).grammar</code></p></td></tr></table><h4>Returns</h4><p>Composed text</p><h4>Raises</h4><table class="glossary"><tr><td class="glossary"><p><code>ValueError</code></p></td><td class="glossary"><p>if <code>thing</code> does not match <code>grammar</code></p></td></tr><tr><td class="glossary"><p><code>GrammarTypeError</code></p></td><td class="glossary"><p>if <code>grammar</code> contains an object of unknown type</p></td></tr><tr><td class="glossary"><p><code>GrammarValueError</code></p></td><td class="glossary"><p>if <code>grammar</code> contains an illegal cardinality value</p></td></tr></table><p>Example:</p><pre><code>&gt;&gt;&gt; from pypeg2 import Parser, csl, word
&gt;&gt;&gt; <span class="mark">p = Parser()</span>
&gt;&gt;&gt; <span class="mark">p.compose(['hello', 'world'], csl(word))</span>
'hello, world'
</code></pre><h3 id="gen_syntax_error">Method generate_syntax_error()</h3><h4>Synopsis</h4><p><code>generate_syntax_error(self, msg, pos)</code></p><p>Generate a syntax error construct.</p><table class="glossary"><tr><td class="glossary"><p><code>msg</code></p></td><td class="glossary"><p>string with error message</p></td></tr><tr><td class="glossary"><p><code>pos</code></p></td><td class="glossary"><p><code>(lineNo, charInText)</code> with positioning information</p></td></tr></table><h4>Returns</h4><p>Instance of <code>SyntaxError</code> with error text</p><h2 id="convenience">Convenience functions</h2><h3 id="parse">Function parse()</h3><h4>Synopsis</h4><pre>parse(text, thing, filename=None, whitespace=whitespace,
        comment=None, keep_feeble_things=False)
</pre><p>Parse text following <code>thing</code> as grammar and return the resulting things or
raise an error.
</p><h4>Arguments</h4><table class="glossary"><tr><td class="glossary"><p><code>text</code></p></td><td class="glossary"><p><code>text</code> to parse</p></td></tr><tr><td class="glossary"><p><code>thing</code></p></td><td class="glossary"><p><code>grammar</code> for things to parse</p></td></tr><tr><td class="glossary"><p><code>filename</code></p></td><td class="glossary"><p>name of the file <code>text</code> originates from</p></td></tr><tr><td class="glossary"><p><code>whitespace</code></p></td><td class="glossary"><p>regular expression to skip <code>whitespace</code>; default: <code>re.compile(r"(?m)\s+")</code></p></td></tr><tr><td class="glossary"><p><code>comment</code></p></td><td class="glossary"><p><code>grammar</code> to parse comments; default: <code>None</code></p></td></tr><tr><td class="glossary"><p><code>keep_feeble_things</code></p></td><td class="glossary"><p>keep otherwise cropped things like comments and whitespace; these
things are put into the <code>feeble_things</code> attribute; default:
<code>False</code>
</p></td></tr></table><h4>Returns</h4><p>generated things</p><h4>Raises</h4><table class="glossary"><tr><td class="glossary"><p><code>SyntaxError</code></p></td><td class="glossary"><p>if <code>text</code> does not match the <code>grammar</code> in <code>thing</code></p></td></tr><tr><td class="glossary"><p><code>ValueError</code></p></td><td class="glossary"><p>if input does not match types</p></td></tr><tr><td class="glossary"><p><code>TypeError</code></p></td><td class="glossary"><p>if output classes have wrong syntax for <code>__init__()</code></p></td></tr><tr><td class="glossary"><p><code>GrammarTypeError</code></p></td><td class="glossary"><p>if <code>grammar</code> contains an object of unknown type</p></td></tr><tr><td class="glossary"><p><code>GrammarValueError</code></p></td><td class="glossary"><p>if <code>grammar</code> contains an illegal cardinality value</p></td></tr></table><p>Example:</p><pre><code>&gt;&gt;&gt; from pypeg2 import parse, csl, word
&gt;&gt;&gt; <span class="mark">parse("hello, world", csl(word))</span>
['hello', 'world']
</code></pre><h3 id="compose">Function compose()</h3><h4>Synopsis</h4><p><code>compose(thing, grammar=None, indent="    ", autoblank=True)</code></p><p>Compose text using <code>thing</code> with <code>grammar</code>.</p><h4>Arguments</h4><table class="glossary"><tr><td class="glossary"><p><code>thing</code></p></td><td class="glossary"><p><code>thing</code> containing other things with <code>grammar</code></p></td></tr><tr><td class="glossary"><p><code>grammar</code></p></td><td class="glossary"><p><code>grammar</code> to use to compose thing; default: <code>thing.grammar</code></p></td></tr><tr><td class="glossary"><p><code>indent</code></p></td><td class="glossary"><p>string to use to indent while composing; default: four spaces</p></td></tr><tr><td class="glossary"><p><code>autoblank</code></p></td><td class="glossary"><p>add blanks if grammar would possibly be violated otherwise; default: True</p></td></tr></table><h4>Returns</h4><p>composed text</p><h4>Raises</h4><table class="glossary"><tr><td class="glossary"><p><code>ValueError</code></p></td><td class="glossary"><p>if input does not match <code>grammar</code></p></td></tr><tr><td class="glossary"><p><code>GrammarTypeError</code></p></td><td class="glossary"><p>if <code>grammar</code> contains an object of unknown type</p></td></tr><tr><td class="glossary"><p><code>GrammarValueError</code></p></td><td class="glossary"><p>if <code>grammar</code> contains an illegal cardinality value</p></td></tr></table><p>Example:</p><pre><code>&gt;&gt;&gt; from pypeg2 import compose, csl, word
&gt;&gt;&gt; <span class="mark">compose(['hello', 'world'], csl(word))</span>
'hello, world'
</code></pre><h3 id="attributes">Function attributes()</h3><h4>Synopsis</h4><p><code>attributes(grammar, invisible=False)</code></p><p>Iterates over all attributes of a <code>grammar</code>.</p><p>This function can be used to iterate over all attributes which
will be generated for the top level object of the <code>grammar</code>. If
<code>invisible</code> is <code>False</code>, attributes whose names start with
an underscore <code>_</code> are omitted.
</p><p>Example:</p><pre><code>&gt;&gt;&gt; from pypeg2 import attr, name, attributes, word, restline
&gt;&gt;&gt; class Me:
...     grammar = name(), attr("typing", word), restline
... 
&gt;&gt;&gt; for a in <span class="mark">attributes(Me.grammar)</span>: print(a.name)
... 
name
typing
&gt;&gt;&gt; 
</code></pre><h3 id="howmany">Function how_many()</h3><h4>Synopsis</h4><p><code>how_many(grammar)</code></p><p>Determines how many objects can possibly be parsed with <code>grammar</code>.</p><p>This function is meant to check whether the results of a grammar
can be stored in a single object or whether a collection is needed.
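</p><p>As a small illustration of how the three result codes listed under Returns might drive such a storage decision, consider the sketch below. The helper <code>storage_for</code> and its labels are hypothetical and not part of <em>pyPEG</em>. The codes are hard-coded from the documented values so that the sketch runs without the library.</p>

```python
def storage_for(count):
    """Map a how_many()-style result code to a kind of storage."""
    kinds = {0: "nothing", 1: "single object", 2: "collection"}
    if count not in kinds:
        raise ValueError("unexpected cardinality code: " + repr(count))
    return kinds[count]


# Codes as documented: 0 for no objects, 1 for at most one object,
# 2 for possibly more than one object.
plan = [storage_for(code) for code in (0, 1, 2)]
```

<p>In real use the argument would be the value returned by <code>how_many(grammar)</code> for the grammar in question.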
</p><h4>Returns</h4><table class="glossary"><tr><td class="glossary"><p><code>0</code></p></td><td class="glossary"><p>if there will be no objects</p></td></tr><tr><td class="glossary"><p><code>1</code></p></td><td class="glossary"><p>if there will be a maximum of one object</p></td></tr><tr><td class="glossary"><p><code>2</code></p></td><td class="glossary"><p>if there can be more than one object</p></td></tr></table><h4>Raises</h4><table class="glossary"><tr><td class="glossary"><p><code>GrammarTypeError</code></p></td><td class="glossary"><p>if <code>grammar</code> contains an object of unknown type</p></td></tr><tr><td class="glossary"><p><code>GrammarValueError</code></p></td><td class="glossary"><p>if <code>grammar</code> contains an illegal cardinality value</p></td></tr></table><p>Example:</p><pre><code>&gt;&gt;&gt; from pypeg2 import how_many, word, csl
&gt;&gt;&gt; <span class="mark">how_many("some")</span>
0
&gt;&gt;&gt; <span class="mark">how_many(word)</span>
1
&gt;&gt;&gt; <span class="mark">how_many(csl(word))</span>
2
</code></pre><h2 id="errors">Exceptions</h2><h3 id="gerror">GrammarError</h3><p>Base class for all errors <em>pyPEG</em> raises.
</p><h3 id="getype">GrammarTypeError</h3><p>A grammar contains an object of a type which cannot be parsed,
for example an instance of an unknown class or of a basic type
like <code>float</code>. It can be caused by an <code>int</code> at the wrong place, too.
</p><h3 id="gevalue">GrammarValueError</h3><p>A grammar contains an object with an illegal value, for example
an undefined cardinality.
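</p><p>Since <code>GrammarError</code> is the common base class, one <code>except</code> clause can handle both kinds of grammar problems. The sketch below uses stand-in classes that only mirror the hierarchy described here; they are not imported from <em>pyPEG</em>, and <code>check_grammar</code> and <code>describe</code> are hypothetical helpers.</p>

```python
class GrammarError(Exception):
    """Stand-in for the documented base class."""


class GrammarTypeError(GrammarError):
    """Stand-in: the grammar contains an object of an unparsable type."""


class GrammarValueError(GrammarError):
    """Stand-in: the grammar contains an object with an illegal value."""


def check_grammar(grammar):
    """Hypothetical validator that rejects float, as in the text above."""
    if isinstance(grammar, float):
        raise GrammarTypeError("cannot parse type float")
    return grammar


def describe(grammar):
    """Catch the base class to handle every grammar problem alike."""
    try:
        check_grammar(grammar)
    except GrammarError as error:
        return type(error).__name__ + ": " + str(error)
    return "ok"


good = describe("word")
bad = describe(1.5)
```

<p>Catching the specific subclass instead is the better choice when type problems and value problems need different handling.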
</p><div id="bottom">Want to download? Go to the <a href="#top">^Top^</a> and look to the right ;-)</div></div></body></html>