File: cvsregexp.html

package info (click to toggle)
bonsai 1.3%2Bcvs20060111-2
  • links: PTS
  • area: main
  • in suites: etch, etch-m68k
  • size: 1,096 kB
  • ctags: 445
  • sloc: perl: 11,380; sh: 271; makefile: 223; sql: 217
file content (257 lines) | stat: -rw-r--r-- 8,932 bytes parent folder | download | duplicates (3)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
<HTML>
<HEAD>
   <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=iso-8859-1">
   <META NAME="Author" CONTENT="lloyd tabb">
   <META NAME="GENERATOR" CONTENT="Mozilla/4.0 [en] (WinNT; I) [Netscape]">
   <TITLE>Regular expressions in the cvs query tool</TITLE>
</HEAD>
<BODY>

<H1>
Description of MySQL regular expression syntax.</H1>
Regular expressions are a powerful way of specifying complex searches.

<P><B>MySQL</B> uses Henry Spencer's implementation of regular expressions.
And that is aimed to conform to POSIX 1003.2. <B>MySQL</B> uses the extended
version.

<P>To get more exact information see Henry Spencer's regex.7 manual.

<P>This is a simplistic reference that skips the details. From here on
a regular expression is called a regexp.

<P>A regular expression describes a set of strings. The simplest case is
one that has no special characters in it. For example the regexp <TT>hello</TT>
matches <TT>hello</TT> and nothing else.

<P>Nontrivial regular expressions use certain special constructs so that
they can match more than one string. For example, the regexp <TT>hello|word</TT>
matches either the string <TT>hello</TT> or the string <TT>word</TT>.

<P>And a more complex example regexp <TT>B[an]*s</TT> matches any of the
strings <TT>Bananas</TT>, <TT>Baaaaas</TT>, <TT>Bs</TT> and all other string
starting with a <TT>B</TT> and continuing with any number of <TT>a</TT>
<TT>n</TT> and ending with a <TT>s</TT>.

<P>The following special characters/constructs are known.
<DL COMPACT>
<DT>
<TT>^</TT></DT>

<DD>
Start of whole string.</DD>

<PRE>mysql> select "fo\nfo" regexp "^fo$";&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; -> 0
mysql> select "fofo" regexp "^fo";&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; -> 1</PRE>

<DT>
<TT>$</TT></DT>

<DD>
End of whole string.</DD>

<PRE>mysql> select "fo\no" regexp "^fo\no$";&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; -> 1
mysql> select "fo\no" regexp "^fo$";&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; -> 0</PRE>

<DT>
<TT>.</TT></DT>

<DD>
Any character (including newline).</DD>

<PRE>mysql> select "fofo" regexp "^f.*";&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; -> 1
mysql> select "fo\nfo" regexp "^f.*";&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; -> 1</PRE>

<DT>
<TT>a*</TT></DT>

<DD>
Any sequence of zero or more a's.</DD>

<PRE>mysql> select "Ban" regexp "^Ba*n";&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; -> 1
mysql> select "Baaan" regexp "^Ba*n";&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; -> 1
mysql> select "Bn" regexp "^Ba*n";&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; -> 1</PRE>

<DT>
<TT>a+</TT></DT>

<DD>
Any sequence of one or more a's.</DD>

<PRE>mysql> select "Ban" regexp "^Ba+n";&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; -> 1
mysql> select "Bn" regexp "^Ba+n";&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; -> 0</PRE>

<DT>
<TT>a?</TT></DT>

<DD>
Either zero or one a.</DD>

<PRE>mysql> select "Bn" regexp "^Ba?n";&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; -> 1
mysql> select "Ban" regexp "^Ba?n";&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; -> 1
mysql> select "Baan" regexp "^Ba?n";&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; -> 0</PRE>

<DT>
<TT>de|abc</TT></DT>

<DD>
Either the sequence <TT>de</TT> or <TT>abc</TT>.</DD>

<PRE>mysql> select "pi" regexp "pi|apa";&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; -> 1
mysql> select "axe" regexp "pi|apa";&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; -> 0
mysql> select "apa" regexp "pi|apa";&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; -> 1
mysql> select "apa" regexp "^(pi|apa)$";&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; -> 1
mysql> select "pi" regexp "^(pi|apa)$";&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; -> 1
mysql> select "pix" regexp "^(pi|apa)$";&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; -> 0</PRE>

<DT>
<TT>(abc)*</TT></DT>

<DD>
Zero or more times the sequence <TT>abc</TT>.</DD>

<PRE>mysql> select "pi" regexp "^(pi)+$";&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; -> 1
mysql> select "pip" regexp "^(pi)+$";&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; -> 0
mysql> select "pipi" regexp "^(pi)+$";&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; -> 1</PRE>

<DT>
<TT>{1}</TT></DT>

<DT>
<TT>{2,3}</TT></DT>

<DD>
There is a more general way of writing regexps that match many occurrences.</DD>

<DL COMPACT>
<DT>
<TT>a*</TT></DT>

<DD>
Can be written as <TT>a{0,}</TT>.</DD>

<DT>
<TT>+</TT></DT>

<DD>
Can be written as <TT>a{1,}</TT>.</DD>

<DT>
<TT>?</TT></DT>

<DD>
Can be written as <TT>a{0,1}</TT>.</DD>
</DL>
To be more precise, an atom followed by a bound containing one integer <TT>i</TT>
and no comma matches a sequence of exactly <TT>i</TT> matches of the atom.
An atom followed by a bound containing one integer <TT>i</TT> and a comma
matches a sequence of <TT>i</TT> or more matches of the atom. An atom followed
by a bound containing two integers <TT>i</TT> and <TT>j</TT> matches a
sequence of <TT>i</TT> through <TT>j</TT> (inclusive) matches of the atom.
Both arguments must <TT>0 >= value &lt;= RE_DUP_MAX (default 255)</TT>,
and if there are two of them, the second must be bigger or equal to the
first.
<DT>
<TT>[a-dX]</TT></DT>

<DT>
<TT>[^a-dX]</TT></DT>

<DD>
Any character which is (not if ^ is used) either <TT>a</TT>, <TT>b</TT>,
<TT>c</TT>, <TT>d</TT> or <TT>X</TT>. To include <TT>]</TT> it has to be
written first. To include <TT>-</TT> it has to be written first or last.
So <TT>[0-9]</TT> matches any decimal digit. All character that does not
have a defined meaning inside a <TT>[]</TT> pair has no special meaning
and matches only itself.</DD>

<PRE>mysql> select "aXbc" regexp "[a-dXYZ]";&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; -> 1
mysql> select "aXbc" regexp "^[a-dXYZ]$";&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; -> 0
mysql> select "aXbc" regexp "^[a-dXYZ]+$";&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; -> 1
mysql> select "aXbc" regexp "^[^a-dXYZ]+$";&nbsp;&nbsp;&nbsp;&nbsp; -> 0
mysql> select "gheis" regexp "^[^a-dXYZ]+$";&nbsp;&nbsp;&nbsp; -> 1
mysql> select "gheisa" regexp "^[^a-dXYZ]+$";&nbsp;&nbsp; -> 0</PRE>

<DT>
<TT>[[.characters.]]</TT></DT>

<DD>
The sequence of characters of that collating element. The sequence is a
single element of the bracket expression's list. A bracket expression containing
a multi-character collating element can thus match more than one character,
e.g. if the collating sequence includes a <TT>ch</TT> collating element,
then the RE <TT>[[.ch.]]*c</TT> matches the first five characters of <TT>chchcc</TT>.</DD>

<DT>
<TT>[=character-class=]</TT></DT>

<DD>
An equivalence class, standing for the sequences of characters of all collating
elements equivalent to that one, including itself. For example, if <TT>o</TT>
and <TT>(+)</TT> are the members of an equivalence class, then <TT>[[=o=]]</TT>,
<TT>[[=(+)=]]</TT>, and <TT>[o(+)]</TT> are all synonymous. An equivalence
class may not be an endpoint of a range.</DD>

<DT>
<TT>[:character_class:]</TT></DT>

<DD>
Within a bracket expression, the name of a character class enclosed in
<TT>[:</TT> and <TT>:]</TT> stands for the list of all characters belonging
to that class. Standard character class names are:</DD>

<TABLE BORDER WIDTH="100%" NOSAVE >
<TR>
<TD>alnum&nbsp;</TD>

<TD>digit&nbsp;</TD>

<TD>punct&nbsp;</TD>
</TR>

<TR>
<TD>alpha&nbsp;</TD>

<TD>graph&nbsp;</TD>

<TD>space&nbsp;</TD>
</TR>

<TR>
<TD>blank&nbsp;</TD>

<TD>lower&nbsp;</TD>

<TD>upper&nbsp;</TD>
</TR>

<TR>
<TD>cntrl&nbsp;</TD>

<TD>print&nbsp;</TD>

<TD>xdigit&nbsp;</TD>
</TR>
</TABLE>
These stand for the character classes defined in ctype(3). A locale may
provide others. A character class may not be used as an endpoint of a range.
<PRE>mysql> select "justalnums" regexp "[[:alnum:]]+";&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; -> 1
mysql> select "!!" regexp "[[:alnum:]]+";&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; -> 0</PRE>

<LI>
[[:&lt;:]]</LI>

<LI>
[[:>:]] These match the null string at the beginning and end of a word
respectively. A word is defined as a sequence of word characters which
is neither preceded nor followed by word characters. A word character is
an alnum character (as defined by ctype(3)) or an underscore.</LI>

<PRE>mysql> select "a word a" regexp "[[:&lt;:]]word[[:>:]]";&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; -> 1
mysql> select "a xword a" regexp "[[:&lt;:]]word[[:>:]]";&nbsp;&nbsp;&nbsp;&nbsp; -> 0</PRE>
</DL>

<PRE>mysql> select "weeknights" regexp "^(wee|week)(knights|nights)$"; -> 1</PRE>
&nbsp;
</BODY>
</HTML>