File: DOCUMENTATION.en.rdoc

package info (click to toggle)
rexical 1.0.7-2
links: PTS, VCS
area: main
in suites: bookworm, forky, sid, trixie
size: 284 kB
sloc: ruby: 1,123; makefile: 8; ansic: 5
file content (211 lines) | stat: -rw-r--r-- 5,756 bytes
parent folder | download | duplicates (2)
= REX: Ruby Lex for Racc


== About

  Lexical Scanner Generator used with Racc for Ruby


== Usage

  rex [options] grammarfile

  -o  --output-file  filename   designate output filename
  -s  --stub                    append stub main for debugging
  -i  --ignorecase              ignore character case
  -C  --check-only              only check syntax
      --independent             independent mode
  -d  --debug                   print debug information
  -h  --help                    print usage
      --version                 print version
      --copyright               print copyright


== Default Output Filename

  The destination file for foo.rex is foo.rex.rb.
  To use, include the following in the Ruby source code file.

    require 'foo.rex' 


== Grammar File Format

  A definition consists of a header section, a rule section, 
  and a footer section.  The rule section includes one or more clauses.
  Each clause starts with a keyword.

  Summary:

    [Header section]
    "class" Foo
    ["option"
      [options] ]
    ["inner"
      [methods] ]
    ["macro"
      [macro-name  regular-expression] ]
    "rule"
      [start-state]  pattern  [actions]
    "end"
    [Footer section]


=== Grammar File Description Example

    class Foo
    macro
      BLANK         \s+
      DIGIT         \d+
    rule
      {BLANK}
      {DIGIT}       { [:NUMBER, text.to_i] }
      .             { [text, text] }
    end


== Header Section ( Optional )

  All of the contents described before the definitions in the rule section are 
  copied to the beginning of the output file.


== Footer Section ( Optional )

  All the contents described after the definitions in the rule section are 
  copied to the end of the output file.


== Rule Section

  The rule section starts at the line beginning with the "class" keyword 
  and ends at the line beginning with the "end" keyword.
  The class name is specified after the "class" keyword.
  If a module name is specified, the class will be included in the module.
  A class that inherits Racc::Parser is generated.


=== Example of Rule Section Definition

    class Foo
    class Bar::Foo


== Option Section ( Optional )

  This section begins with the "option" keyword.

    "ignorecase"  ignore the character case when pattern matching
    "stub"        append stub main for debugging
    "independent" independent mode, do not inherit Racc.


== Inner Section for User Code ( Optional )

  This section begins with the "inner" keyword.
  The contents defined here are defined by the contents of the class 
  of the generated scanner.


== Macro Section ( Optional )

  This section begins with the "macro" keyword.
  A name is assigned to one regular expression.
  A space character (0x20) can be included by using a backslash \ to escape.


=== Example of Macro Definition

    DIGIT         \d+
    IDENT         [a-zA-Z_][a-zA-Z0-9_]*
    BLANK         [\ \t]+
    REMIN         \/\*
    REMOUT        \*\/


== Rule Section

  This section begins with the "rule" keyword.

  [state]  pattern  [actions]


=== state: Start State ( Optional )

    A start state is indicated by an identifier beginning with ":", a Ruby symbol.
    If uppercase letters follow the ":", the state becomes an exclusive start state.
    If lowercase letters follow the ":", the state becomes an inclusive start state.
    The initial value and the default value of a start state are nil.


=== pattern: String Pattern

    A regular expression specifies a character string.
    A regular expression description may include a macro definition enclosed
    by curly braces { }.
    A macro definition is used when the regular expression includes whitespace.


=== actions: Processing Actions ( Optional )

    An action is executed when the pattern is matched.
    The action defines the process for creating the appropriate token.
    A token is a two-element array containing a type and a value, or is nil.
    The following elements can be used to create a token.

      lineno    Line number     ( Read Only )
      text      Matched string  ( Read Only )
      state     Start state     ( Read/Write )

    The action is a block of Ruby code enclosed by { }.
    Do not use functions that exit the block and change the control flow.
    ( return, exit, next, break, ... )
    If the action is omitted, the matched character string is discarded, 
    and the process advances to the next scan.


=== Example of Rule Section Definition

        {REMIN}                 { state = :REM ; [:REM_IN, text] }
  :REM  {REMOUT}                { state = nil ; [:REM_OUT, text] }
  :REM  (.+)(?={REMOUT})        { [:COMMENT, text] }
        {BLANK}
        -?{DIGIT}               { [:NUMBER, text.to_i] }
        {WORD}                  { [:word, text] }
        .                       { [text, text] }

== Comments ( Optional )

  Any text following a "#" to the end of the line becomes a comment.


== Using the Generated Class

=== scan_setup( str )

    Initializes the scanner with the str string argument.
    This is redefined and used.


=== scan_str( str )

    Parses the string described in the defined grammar.
    The tokens are stored internally.


=== scan_file( filename )

    Reads in a file described in the defined grammar.
    The tokens are stored internally.


=== next_token

    The tokens stored internally are extracted one by one.
    When there are no more tokens, nil is returned.


== Notice

  This specification is provisional and may be changed without prior notice.