1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30
|
Inside a character class all regular expression operators lose their special
meanings, except for the special atoms tt(\s, \S, \d, \D, \w,) and tt(\W); the
character range operator tt(-); the end of character class operator tt(]);
and, at the beginning of the character class, tt(^). Except in combination
with the special atoms the escape character is interpreted as a literal
backslash character (to define a character class containing a backslash and a
tt(d) simply use tt([d\])).
To add a closing bracket to a character class use tt([]) immediately following
the initial open-bracket, or start with tt([^]) for a negated character class
not containing the closing bracket. Minus characters are used to define
character ranges (e.g., tt([a-d]), defining tt([abcd])) (be advised that the
actual range may depend on the locale being used). To add a literal minus
character to a character class put it at the very beginning (tt([-), or
tt([^-)) or at the very end (tt(-])) of a character class.
Once a character class has started, all subsequent characters are added to the
class's set of characters, until the final closing bracket (tt(])) has been
reached.
In addition to characters and ranges of characters, character classes may also
contain em(predefined sets of character). They are:
verb( [:alnum:] [:alpha:] [:blank:]
[:cntrl:] [:digit:] [:graph:]
[:lower:] [:print:] [:punct:]
[:space:] [:upper:] [:xdigit:])
These predefined sets designate sets of characters equivalent to the
corresponding standard bf(C) tt(isXXX) function. For example, tt([:alnum:])
defines all characters for which bf(isalnum)(3) returns true.
|