1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210
|
# Copyright 2022 Jeffrey Kegler
# This file is part of Marpa::R2. Marpa::R2 is free software: you can
# redistribute it and/or modify it under the terms of the GNU Lesser
# General Public License as published by the Free Software Foundation,
# either version 3 of the License, or (at your option) any later version.
#
# Marpa::R2 is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
# Lesser General Public License for more details.
#
# You should have received a copy of the GNU Lesser
# General Public License along with Marpa::R2. If not, see
# http://www.gnu.org/licenses/.
:default ::= action => [start,length,values] bless => ::lhs
lexeme default =
action => [start,length,value]
bless => ::name
forgiving => 1
:start ::= statements
statements ::= statement+
statement ::= <start rule> | <empty rule>
| <null statement> | <statement group>
| <priority rule> | <quantified rule>
| <discard rule> | <default rule>
| <lexeme default statement>
| <discard default statement>
| <lexeme rule>
| <completion event declaration>
| <nulled event declaration>
| <prediction event declaration>
| <current lexer statement>
| <inaccessible statement>
<null statement> ::= ';'
<statement group> ::= ('{') statements '}'
<start rule> ::= (':start' <op declare bnf>) symbol
<start rule> ::= ('start' 'symbol' 'is') symbol
<default rule> ::= ':default' <op declare bnf> <adverb list>
<lexeme default statement> ::= ('lexeme' 'default' '=') <adverb list>
<discard default statement> ::= ('discard' 'default' '=') <adverb list>
<priority rule> ::= lhs <op declare> priorities
<empty rule> ::= lhs <op declare> <adverb list>
<quantified rule> ::= lhs <op declare> <single symbol> quantifier <adverb list>
<discard rule> ::= (':discard' <op declare match>) <single symbol> <adverb list>
<lexeme rule> ::= (':lexeme' <op declare match>) symbol <adverb list>
<completion event declaration> ::= ('event') <event initialization> ('=' 'completed') <symbol name>
<nulled event declaration> ::= ('event') <event initialization> ('=' 'nulled') <symbol name>
<prediction event declaration> ::= ('event') <event initialization> ('=' 'predicted') <symbol name>
<current lexer statement> ::= ('current' 'lexer' 'is') <lexer name>
<inaccessible statement> ::= ('inaccessible' 'is') <inaccessible treatment> ('by' 'default')
<inaccessible treatment> ::= 'warn' | 'ok' | 'fatal'
<op declare> ::= <op declare bnf> | <op declare match>
priorities ::= alternatives+
separator => <op loosen> proper => 1
alternatives ::= alternative+
separator => <op equal priority> proper => 1
alternative ::= rhs <adverb list>
<adverb list> ::= <adverb list items>
<adverb list items> ::= <adverb item>*
<adverb item> ::=
action
| <left association> | <right association> | <group association>
| <separator specification> | <proper specification>
| <rank specification> | <null ranking specification>
| <priority specification> | <pause specification> | <event specification>
| <latm specification> | blessing | naming | <null adverb>
<null adverb> ::= ','
action ::= ('action' '=>') <action name>
<left association> ::= ('assoc' '=>' 'left')
<right association> ::= ('assoc' '=>' 'right')
<group association> ::= ('assoc' '=>' 'group')
<separator specification> ::= ('separator' '=>') <single symbol>
<proper specification> ::= ('proper' '=>') boolean
<rank specification> ::= ('rank' '=>') <signed integer>
<null ranking specification> ::= ('null-ranking' '=>') <null ranking constant>
<null ranking specification> ::= ('null' 'rank' '=>') <null ranking constant>
<null ranking constant> ::= 'low' | 'high'
<priority specification> ::= ('priority' '=>') <signed integer>
<pause specification> ::= ('pause' '=>') <before or after>
<event specification> ::= ('event' '=>') <event initialization>
<event initialization> ::= <event name> <event initializer>
<event initializer> ::= ('=') <on or off>
<on or off> ::= 'on' | 'off'
<event initializer> ::= # empty
<latm specification> ::= ('forgiving' '=>') boolean
<latm specification> ::= ('latm' '=>') boolean
<blessing> ::= ('bless' '=>') <blessing name>
<naming> ::= ('name' '=>') <alternative name>
<alternative name> ::= <standard name> | <single quoted name>
<lexer name> ::= <standard name> | <single quoted name>
<event name> ::= <standard name>
| <single quoted name>
| <reserved event name>
<reserved event name> ~ ':symbol'
<blessing name> ::= <standard name>
<blessing name> ::= <reserved blessing name>
lhs ::= <symbol name>
rhs ::= <rhs primary>+
<rhs primary> ::= <single symbol>
<rhs primary> ::= <single quoted string>
<rhs primary> ::= <parenthesized rhs primary list>
<parenthesized rhs primary list> ::= ('(') <rhs primary list> (')')
<rhs primary list> ::= <rhs primary>+
<single symbol> ::=
symbol
| <character class>
symbol ::= <symbol name>
<symbol name> ::= <bare name>
<symbol name> ::= <bracketed name>
<action name> ::= <Perl name>
<action name> ::= <reserved action name>
<action name> ::= <array descriptor>
:discard ~ whitespace
whitespace ~ [\s]+
# allow comments
:discard ~ <hash comment>
<hash comment> ~ <terminated hash comment> | <unterminated
final hash comment>
<terminated hash comment> ~ '#' <hash comment body> <vertical space char>
<unterminated final hash comment> ~ '#' <hash comment body>
<hash comment body> ~ <hash comment char>*
<vertical space char> ~ [\x{A}\x{B}\x{C}\x{D}\x{2028}\x{2029}]
<hash comment char> ~ [^\x{A}\x{B}\x{C}\x{D}\x{2028}\x{2029}]
<op declare bnf> ~ '::='
<op declare match> ~ '~'
<op loosen> ~ '||'
<op equal priority> ~ '|'
quantifier ::= '*' | '+'
<before or after> ~ 'before' | 'after'
<signed integer> ~ <integer> | <sign> <integer>
<sign> ~ [+-]
<integer> ~ [\d]+
boolean ~ [01]
<reserved action name> ~ '::' <one or more word characters>
<reserved blessing name> ~ '::' <one or more word characters>
<one or more word characters> ~ [\w]+
<zero or more word characters> ~ [\w]*
# Perl identifiers allow an initial digit, which makes them slightly more liberal than Perl bare names
# but equivalent to Perl names with sigils.
<Perl identifier> ~ [\w]+
<double colon> ~ '::'
<Perl name> ~ <Perl identifier>+ separator => <double colon> proper => 1
<bare name> ~ [\w]+
<standard name> ~ [a-zA-Z] <zero or more word characters>
<bracketed name> ~ '<' <bracketed name string> '>'
<bracketed name string> ~ [\s\w]+
<array descriptor> ~ <array descriptor left bracket> <result item descriptor list> <array descriptor right bracket>
<array descriptor left bracket> ~ '['
<array descriptor left bracket> ~ '[' whitespace
<array descriptor right bracket> ~ ']'
<array descriptor right bracket> ~ whitespace ']'
<result item descriptor list> ~ <result item descriptor>* separator => <result item descriptor separator>
<result item descriptor separator> ~ [,]
<result item descriptor separator> ~ [,] whitespace
<result item descriptor> ~ 'start' | 'length'
| 'g1start' | 'g1length'
| 'name' | 'lhs' | 'symbol' | 'rule'
| 'value' | 'values'
# In single quoted strings and character classes
# no escaping or internal newlines, and disallow empty string
<single quoted string> ~ ['] <string without single quote or vertical space> ['] <character class modifiers>
<single quoted name> ~ ['] <string without single quote or vertical space> [']
<string without single quote or vertical space> ~ [^'\x{0A}\x{0B}\x{0C}\x{0D}\x{0085}\x{2028}\x{2029}]+
<character class> ~ '[' <cc elements> ']' <character class modifiers>
<cc elements> ~ <cc element>+
<cc element> ~ <safe cc character>
# hex 5d is right square bracket
<safe cc character> ~ [^\x{5d}\x{0A}\x{0B}\x{0C}\x{0D}\x{0085}\x{2028}\x{2029}]
<cc element> ~ <escaped cc character>
<escaped cc character> ~ '\' <horizontal character>
<cc element> ~ <posix char class>
<cc element> ~ <negated posix char class>
<character class modifiers> ~ <character class modifier>*
<character class modifier> ~ ':ic'
<character class modifier> ~ ':i'
# [=xyz=] and [.xyz.] are parsed by Perl, but then currently cause an exception.
# Catching Perl exceptions is inconvenient for Marpa,
# so we reject them syntactically instead.
<posix char class> ~ '[:' <posix char class name> ':]'
<negated posix char class> ~ '[:^' <posix char class name> ':]'
<posix char class name> ~ [[:alnum:]]+
# a horizontal character is any character that is not vertical space
<horizontal character> ~ [^\x{A}\x{B}\x{C}\x{D}\x{2028}\x{2029}]
|