File: reg.hlp

package info (click to toggle)
arb 6.0.2-1%2Bdeb8u1
links: PTS, VCS
area: non-free
in suites: jessie
size: 65,916 kB
ctags: 53,258
sloc: ansic: 394,903; cpp: 250,252; makefile: 19,620; sh: 15,878; perl: 10,461; fortran: 6,019; ruby: 683; xml: 503; python: 53; awk: 32
file content (138 lines) | stat: -rw-r--r-- 6,541 bytes
parent folder | download | duplicates (6)
#Please insert up references in the next lines (line starts with keyword UP)
UP	arb.hlp
UP	glossary.hlp

#Please insert subtopic references  (line starts with keyword SUB)
SUB	srt.hlp
SUB	aci.hlp

# Hypertext links in helptext can be added like this: LINK{ref.hlp|http://add|bla@domain}

#************* Title of helpfile !! and start of real helpfile ********
TITLE		Regular Expressions (REG)

OCCURRENCE	Many places

SECTION		Ways to use regular expressions

		There are two ways to use regular expressions:

		[1]	/Search Regexpr/Replace String/
		[2]	/Search Regexpr/

		[1] searches the input for occurrences of 'Search Regexpr' and
                replaces every occurrence with 'Replace String'.

		[2] searches the input for the FIRST occurrence of 'Search
                Regexpr' and returns the found match.
                If nothing matches, it returns an empty string.

                Notes:

                * You can use regular expressions everywhere where you can use
                  ACI and SRT expressions.
                * At some places only [2] is available (e.g. in Search&Query).
                * Normally regular expressions work case sensitive. To make them
                  work case insensitive, simply append an 'i' to the
                  expression (i.e. '/expr/i' or '/expr/repl/i')

SECTION         Syntax of POSIX extended regular expressions as used in ARB

                A regular expression specifies a set of character strings,
                e.g. the expression '/pseu/i' specifies all strings containing
                "pseu", "Pseu" or "pSeu" and so on. We say the expression "matches" 
                (a part of) these strings.

                Several characters have special meanings in regular expressions.
                All other characters just match against themselves.

                Special characters:

                  '.'          matches any character (e.g. '/h.s/' matches "has" and "his")
                  '[xyz]'      matches 'x', 'y' or 'z'
                  '[a-z]'      matches all lower case letters
                  '^'          matches the beginning of the string
                               (e.g. '/^pseu/i' matches all strings starting with "pseu")
                  '$'          matches the end of the string
                               (e.g. '/cens$/i' matches all strings ending in "cens")

                  '*'          matches the preceding element zero or more times
                               (e.g. '/th*is/' matches "tis", "this", "thhhhhhiss", ..)
                  '?'          matches the preceding element zero or one time
                               (e.g. '/th?is/' matches "tis" or "this", but not "thhis")
                  '+'          matches the preceding element one or more times
                               (e.g. '/th+is/' matches "this" or "thhhis", but not "tis")
                  '{mi,ma}'    matches the preceding element 3 to 5 times
                               (e.g. '/th{2,4}is/' matches "thhis", "thhhis" or "thhhhis")

                  '|'          marks an alternative
                               (e.g. '/bacter|spiri/i' matches all strings containing "bacter" or "spiri")

                  '()'         marks a subexpression. Subexpressions can be used to separate alternatives
                               or to mark parts for use in the replace expression (see below).

                               (e.g.    '/bact|spiri.*cens/'   match '/bact/'       or '/spiri.*cens/',
                                whereas '/(bact|spiri).*cens/' match '/bact.*cens/' or '/spiri.*cens/')

                  To match against special characters themselves, escape them
                  using a '\' (e.g. '/\*/' matches the character "*", '/\\/' matches "\")


                Character classes:

                  [...]      is called a character class. It matches against any of the characters
                             listed in between the brackets.
                  [^...]     If the character class starts with '^' it matches against any character
                             NOT listed (e.g. '[^78]' matches all but '7' or '8')
                  [5-9]      If the character class contains a '-' it is interpreted as "range of characters".
                             Here '5-9' is equivalent to '56789'.
                             You may mix ranges and single characters, e.g. '14-79' is same as '145679',
                             '7-91-3' is same as '789123'.

                  To add special characters to a character class, escape them using '\'.

                  There are several special predefined character classes like '[:word:]' or
                  '[:punct:]'. See link below for details.


                Links:

                * A more in-depth explanation of POSIX extended regular expressions can be
                  found at LINK{http://en.wikipedia.org/wiki/Regular_expression#POSIX}.
                * Many examples are given in this guide: LINK{http://www.digitalamit.com/article/regular_expression.phtml}

                Notes:

                * if an expression matches one string multiple times, the longest leftmost
                  match is used (e.g: '/a*e*/' matches 'aaeee' at position 3 of the
                  string 'bbaaeeeffaegg', not 'ae' at position 10). 


SECTION         Special syntax for search and replace

                Syntax: '/regexp/replace/'

                       The part of the input string matched by 'regexp' gets replaced by 'replace'.

                Simple example:

                       Input string:    'The quick brown fox jumps over the lazy dog'
                       Search&replace:  '/fox|dog/cat/'
                       Result:          'The quick brown cat jumps over the lazy cat'

                Additionally the match (or parts of it) can be referenced in the replace string:

                             \0        refers to the whole match
                             \1        refers to the first subexpression
                             \2        refers to the second subexpression
                             ...
                             \9        refers to the ninth subexpression

                Example using refs:

                       Input string:    'The quick brown fox jumps over the lazy dog'
                       Search&replace:  '/(brown|lazy)\s+(fox|dog)/\2 \1/'
                       Result:          'The quick fox brown jumps over the dog lazy'

BUGS		No bugs known