= REX: Ruby Lex for Racc
== About
REX is a lexical scanner generator for Ruby, designed to be used with Racc.
== Usage
  rex  [options]  grammarfile
  -o  --output-file  filename   designate output filename
  -s  --stub                    append stub main for debugging
  -i  --ignorecase              ignore character case
  -C  --check-only              only check syntax
      --independent             independent mode
  -d  --debug                   print debug information
  -h  --help                    print usage
      --version                 print version
      --copyright               print copyright
== Default Output Filename
By default, the output file for foo.rex is foo.rex.rb.
To use it, add the following to your Ruby source file:
  require 'foo.rex'
== Grammar File Format
A definition consists of a header section, a rule section,
and a footer section. The rule section includes one or more clauses.
Each clause starts with a keyword.
Summary:
  [Header section]
  "class" Foo
  ["option"
    [options] ]
  ["inner"
    [methods] ]
  ["macro"
    [macro-name  regular-expression] ]
  "rule"
    [start-state]  pattern  [actions]
  "end"
  [Footer section]
=== Grammar File Description Example
  class Foo
  macro
    BLANK         \s+
    DIGIT         \d+
  rule
    {BLANK}
    {DIGIT}       { [:NUMBER, text.to_i] }
    .             { [text, text] }
  end
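To make the behaviour of this example grammar concrete, here is a hand-written sketch built on Ruby's standard StringScanner. It is not the code that rex generates, and the method name sketch_tokens is hypothetical; it only illustrates which token each of the three rules would produce.

```ruby
require 'strscan'

# Hypothetical helper (not part of rex): mimics the three rules of the
# example grammar with a plain StringScanner.
def sketch_tokens(str)
  ss = StringScanner.new(str)
  tokens = []
  until ss.eos?
    if ss.scan(/\s+/)                # {BLANK}: no action, text is discarded
      next
    elsif (text = ss.scan(/\d+/))    # {DIGIT}
      tokens << [:NUMBER, text.to_i]
    elsif (text = ss.scan(/./))      # .
      tokens << [text, text]
    end
  end
  tokens
end

p sketch_tokens("1 + 23")
# prints [[:NUMBER, 1], ["+", "+"], [:NUMBER, 23]]
```

Each token is a two-element array of type and value, which is the shape the actions in the grammar return.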
== Header Section ( Optional )
All of the contents described before the definitions in the rule section are
copied to the beginning of the output file.
== Footer Section ( Optional )
All the contents described after the definitions in the rule section are
copied to the end of the output file.
== Rule Section
The rule section starts at the line beginning with the "class" keyword
and ends at the line beginning with the "end" keyword.
The class name is specified after the "class" keyword.
If a module name is specified, the class is defined inside that module.
The generated class inherits from Racc::Parser, unless independent mode is selected.
=== Example of Rule Section Definition
  class Foo
  class Bar::Foo
== Option Section ( Optional )
This section begins with the "option" keyword.
  "ignorecase"   ignore character case when matching patterns
  "stub"         append stub main for debugging
  "independent"  independent mode; the class does not inherit from Racc::Parser
== Inner Section for User Code ( Optional )
This section begins with the "inner" keyword.
The code written here is inserted into the body of the generated
scanner class.
== Macro Section ( Optional )
This section begins with the "macro" keyword.
Each macro assigns a name to one regular expression.
To include a space character (0x20) in the expression, escape it with a backslash (\ ).
=== Example of Macro Definition
  DIGIT         \d+
  IDENT         [a-zA-Z_][a-zA-Z0-9_]*
  BLANK         [\ \t]+
  REMIN         \/\*
  REMOUT        \*\/
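As a rough illustration of how a macro reference such as {DIGIT} could be resolved, the sketch below substitutes macro text into a pattern string. This assumes plain textual substitution; the constant MACROS and the method expand_macros are hypothetical names, not part of rex.

```ruby
# Hypothetical macro table, copied from the macro examples above.
MACROS = {
  'DIGIT'  => '\d+',
  'IDENT'  => '[a-zA-Z_][a-zA-Z0-9_]*',
  'BLANK'  => '[\ \t]+',
  'REMIN'  => '\/\*',
  'REMOUT' => '\*\/'
}

# Replace every {NAME} reference with the text of the macro it names.
# Names that are not in the table (for example a quantifier like {3})
# are left untouched.
def expand_macros(pattern, macros = MACROS)
  pattern.gsub(/\{(\w+)\}/) { macros.fetch($1, $&) }
end

expanded = expand_macros('-?{DIGIT}')
p expanded                             # prints "-?\\d+"
p Regexp.new(expanded).match?('-42')   # prints true
```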
== Rule Section
This section begins with the "rule" keyword.
  [state]  pattern  [actions]
=== state: Start State ( Optional )
A start state is written as a Ruby symbol, that is, an identifier beginning with ":".
If uppercase letters follow the ":", the state is an exclusive start state.
If lowercase letters follow the ":", the state is an inclusive start state.
The initial and default value of the start state is nil.
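The sketch below shows one way the two kinds of start state could be told apart and used to select rules. It rests on two assumptions that the text above does not spell out: that only the first letter after ":" is inspected, and that the terms carry their usual lex meaning (in an exclusive state only rules tagged with that state apply; in an inclusive state untagged rules apply as well). Both helper names are hypothetical.

```ruby
# Hypothetical helper: classify a start state by the case of its first letter.
def start_state_kind(state)
  return :default if state.nil?
  state.to_s.match?(/\A[A-Z]/) ? :exclusive : :inclusive
end

# Hypothetical helper: pick the rules that are active in the current state.
# Each rule is [state, pattern]; a nil state marks a rule with no start state.
def active_rules(rules, current)
  case start_state_kind(current)
  when :default   then rules.select { |st, _| st.nil? }
  when :exclusive then rules.select { |st, _| st == current }
  else                 rules.select { |st, _| st == current || st.nil? }
  end
end

rules = [[nil, /\d+/], [:REM, /\*\//], [:str, /"/]]
p active_rules(rules, nil).size    # prints 1
p active_rules(rules, :REM).size   # prints 1
p active_rules(rules, :str).size   # prints 2
```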
=== pattern: String Pattern
The pattern is a regular expression that matches a character string.
It may reference a macro by enclosing the macro name in curly
braces { }.
Use a macro when the regular expression contains whitespace.
=== actions: Processing Actions ( Optional )
An action is executed when the pattern is matched.
The action defines the process for creating the appropriate token.
A token is a two-element array containing a type and a value, or is nil.
The following variables are available inside an action:
  lineno   Line number     ( Read Only )
  text     Matched string  ( Read Only )
  state    Start state     ( Read/Write )
The action is a block of Ruby code enclosed by { }.
Do not use statements that exit the block or change the control flow
( return, exit, next, break, ... ).
If the action is omitted, the matched character string is discarded,
and the process advances to the next scan.
=== Example of Rule Section Definition
                {REMIN}               { state = :REM ; [:REM_IN, text] }
  :REM          {REMOUT}              { state = nil ; [:REM_OUT, text] }
  :REM          (.+)(?={REMOUT})      { [:COMMENT, text] }
                {BLANK}
                -?{DIGIT}             { [:NUMBER, text.to_i] }
                {WORD}                { [:word, text] }
                .                     { [text, text] }
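The interaction between the state variable and the :REM rules can be traced with a hand-written sketch. It uses Ruby's standard StringScanner with the macros expanded by hand, and is not the code rex generates; the method name sketch_comment_tokens is hypothetical. The {WORD} rule is left out because no WORD macro appears in the macro examples.

```ruby
require 'strscan'

# Hypothetical hand-written equivalent of the comment rules above.
def sketch_comment_tokens(str)
  ss = StringScanner.new(str)
  state = nil                                  # initial start state
  tokens = []
  until ss.eos?
    case state
    when nil
      if (text = ss.scan(/\/\*/))              # {REMIN}
        state = :REM
        tokens << [:REM_IN, text]
      elsif ss.scan(/[ \t]+/)                  # {BLANK}: no action, discarded
        next
      elsif (text = ss.scan(/-?\d+/))          # -?{DIGIT}
        tokens << [:NUMBER, text.to_i]
      elsif (text = ss.scan(/./))              # .
        tokens << [text, text]
      else
        raise "no rule matches #{ss.rest.inspect}"
      end
    when :REM
      if (text = ss.scan(/\*\//))              # :REM {REMOUT}
        state = nil
        tokens << [:REM_OUT, text]
      elsif (text = ss.scan(/(.+)(?=\*\/)/))   # :REM (.+)(?={REMOUT})
        tokens << [:COMMENT, text]
      else
        raise "unterminated comment at offset #{ss.pos}"
      end
    end
  end
  tokens
end

p sketch_comment_tokens("12 /* note */ -3")
# prints [[:NUMBER, 12], [:REM_IN, "/*"], [:COMMENT, " note "], [:REM_OUT, "*/"], [:NUMBER, -3]]
```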
== Comments ( Optional )
Any text following a "#" to the end of the line becomes a comment.
== Using the Generated Class
=== scan_setup( str )
Initializes the scanner with the string str.
Redefine this method if the scanner needs custom initialization.
=== scan_str( str )
Scans the string str, which is written in the defined grammar.
The tokens are stored internally.
=== scan_file( filename )
Reads the file filename, which is written in the defined grammar, and scans its contents.
The tokens are stored internally.
=== next_token
Returns the internally stored tokens one at a time.
When there are no more tokens, nil is returned.
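A caller typically drains the tokens with a loop that stops at nil. Because the real class is produced by rex, the sketch below uses a stand-in named StandInScanner, which is hypothetical: it only imitates the documented behaviour (tokens stored internally, then handed out one by one) by splitting on whitespace.

```ruby
# Stand-in for a generated scanner, used only so the loop below can run.
# It is hypothetical and does not reflect how rex implements these methods.
class StandInScanner
  def scan_setup(str)
    @source = str
    @tokens = []
  end

  def scan_str(str)
    scan_setup(str)
    @source.split.each { |word| @tokens << [:WORD, word] }
  end

  def next_token
    @tokens.shift   # nil once the internal list is empty
  end
end

scanner = StandInScanner.new
scanner.scan_str("foo bar")
while (token = scanner.next_token)
  p token
end
# prints [:WORD, "foo"] and then [:WORD, "bar"]
```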
== Notice
This specification is provisional and may be changed without prior notice.