File: tricks.html.textile

package info (click to toggle)
ruby-parslet 2.0.0-1
  • links: PTS, VCS
  • area: main
  • in suites: forky, sid, trixie
  • size: 1,260 kB
  • sloc: ruby: 6,157; sh: 8; javascript: 3; makefile: 3
file content (227 lines) | stat: -rw-r--r-- 6,608 bytes parent folder | download
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
---
title: Tricks for common situations
---

Here's a topic overview: <div id="toc"></div>

h2. Matching EOF (End Of File)

Ahh Sir, you'll be needin what us parsers call _epsilon_: 

<pre class="sh_ruby"><code>
  rule(:eof) { any.absent? }
</code></pre>

Of course, most of us don't use this at all, since any parser has EOF as
implicit last input.

h2. Matching Strings Case Insensitive

Parslet is fully hackable: You can use code to create parsers easily. Here's
how I would match a string in case insensitive manner: 

<pre class="sh_ruby"><code title="case insensitive match">
  def stri(str)
    key_chars = str.split(//)
    key_chars.
      collect! { |char| match["#{char.upcase}#{char.downcase}"] }.
      reduce(:>>)
  end

  # Constructs a parser using a Parser Expression Grammar 
  stri('keyword').parse "kEyWoRd"     # => "kEyWoRd"@0
</code></pre>

h2. Testing

Parslet helps you to create parsers that are in turn created out of many small
parsers. It is really turtles all the way down. Imagine you have a complex 
parser: 

<pre class="sh_ruby"><code>
  class ComplexParser < Parslet::Parser
    root :lots_of_stuff
  
    rule(:lots_of_stuff) { ... }
  
    # and many lines later: 
    rule(:simple_rule) { str('a') }
  end
</code></pre>

Also imagine that the parser (as a whole) fails to consume the 'a' that 
<code>simple_rule</code> is talking about. 

This kind of problem can very often be fixed by bisecting it into two possible
problems. Either: 

# the <code>lots_of_stuff</code> rule somehow doesn't place <code>simple_rule</code>
  in the right context or
# the <code>simple_rule</code> simply (hah!) fails to match its input. 

I find it very useful in this situation to eliminate 2. from our options: 

<pre class="sh_ruby"><code title="rspec">
  require 'rspec'
  require 'parslet/rig/rspec'
  
  class ComplexParser < Parslet::Parser
    rule(:simple_rule) { str('a') }
  end

  RSpec.describe ComplexParser  do
    let(:parser) { ComplexParser.new }
    context "simple_rule" do
      it "should consume 'a'" do
        expect(parser.simple_rule).to parse('a')
      end 
    end
  end
  
  RSpec::Core::Runner.run(['--format', 'documentation'])
</code></pre>

Output is: 
<pre class="output">

Example::ComplexParser
  simple_rule
    should consume 'a'

Finished in 0.00094 seconds (files took 0.29367 seconds to load)
1 example, 0 failures
</pre>

Parslet parsers have one method per rule. These methods return valid parsers
for a subset of your grammar. 

h2. Error reports

If your grammar fails and you're aching to know why, here's a bit of exception
handling code that will help you out: 

<pre class="sh_ruby"><code title="exception handling">
  parser = str('foo')
  begin
    parser.parse('bar')
  rescue Parslet::ParseFailed => error
    puts error.parse_failure_cause.ascii_tree
  end
</code></pre>

This should print something akin to: 

<pre class="output"> 
Expected "foo", but got "bar" at line 1 char 1.
</pre>

These error reports are probably the fastest way to know exactly where you
went wrong (or where your input is wrong, which is aequivalent).

And since this is such a common idiom, we provide you with a shortcut: to
get the above, just: 

<pre class="sh_ruby"><code>
require 'parslet/convenience'
parser.parse_with_debug(input)
</code></pre>


h3. Reporter engines

Note that there is currently not one, but two error reporting engines! The 
default engine will report errors in a structure that looks exactly like the
grammar structure: 

<pre class="sh_ruby"><code title="error reporter 1">
  class P < Parslet::Parser
    root(:body)
    rule(:body) { elements }
    rule(:elements) { (call | element).repeat(2) }
    rule(:element) { str('bar') }
    rule(:call) { str('baz') >> str('()') }
  end
  
  begin
    P.new.parse('barbaz')
  rescue Parslet::ParseFailed => error
    puts error.parse_failure_cause.ascii_tree
  end
</code></pre>

Outputs: 

<pre class="output"> 
Expected at least 2 of CALL / ELEMENT at line 1 char 1.
`- Expected one of [CALL, ELEMENT] at line 1 char 4.
   |- Failed to match sequence ('baz' '()') at line 1 char 7.
   |  `- Premature end of input at line 1 char 7.
   `- Expected "bar", but got "baz" at line 1 char 4.
</pre>

Let's switch out the 'grammar structure' engine (called '<code>Tree</code>')
with the 'deepest error position' engine:

<pre class="sh_ruby"><code title="error reporter 2">
  class P < Parslet::Parser
    root(:body)
    rule(:body) { elements }
    rule(:elements) { (call | element).repeat(2) }
    rule(:element) { str('bar') }
    rule(:call) { str('baz') >> str('()') }
  end
  
  begin
    P.new.parse('barbaz', reporter: Parslet::ErrorReporter::Deepest.new)
  rescue Parslet::ParseFailed => error
    puts error.parse_failure_cause.ascii_tree
  end
</code></pre>

Outputs: 

<pre class="output"> 
Expected at least 2 of CALL / ELEMENT at line 1 char 1.
`- Expected one of [CALL, ELEMENT] at line 1 char 4.
   |- Failed to match sequence ('baz' '()') at line 1 char 7.
   |  `- Premature end of input at line 1 char 7.
   `- Premature end of input at line 1 char 7.
</pre>

The <code>'Deepest'</code> position engine will store errors that are the
farthest into the input. In some examples, this produces more readable output
for the end user. 

h2. Line numbers from parser output

A traditional parser would parse and then perform several checking phases,
like for example verifying all type constraints are respected in the input.
During this checking phase, you will most likely want to report screens full
of type errors back to the user ('cause that's what types are for, right?).
Now where did that 'int' come from?

Parslet gives you slices (Parslet::Slice) of input as part of your tree. These
are essentially strings with line numbers. Here's how to print that error
message:

<pre class="sh_ruby"><code>
  # assume that type == "int"@0 - a piece from your parser output
  line, col = type.line_and_column
  puts "Sorry. Can't have #{type} at #{line}:#{col}!"
</code></pre>

h2. Precedence climber

You might want to implement a parser for simple arithmetic infix expressions
such as `1 + 2`. The quickest way to do this with parslet is to use the 
infix expression parser atom: 

<pre class="sh_ruby"><code>
  infix_expression(
    match('[0-9]').repeat,
    [str('*'), 2],
    [str('+'), 1]) # matches both "1+2*3" and "1*2+3"
</code></pre>

Please also see the "example":https://github.com/kschiess/parslet/blob/master/example/prec_calc.rb and
the "inline documentation":https://github.com/kschiess/parslet/blob/master/lib/parslet.rb#L203-L229 for this feature.