<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<html>
   <head>
      <title>Boost.Regex: Algorithm regex_grep (deprecated)</title>
      <meta name="generator" content="HTML Tidy, see www.w3.org">
      <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
      <link rel="stylesheet" type="text/css" href="../../../boost.css">
   </head>
   <body>
      <p></p>
      <table id="Table1" cellspacing="1" cellpadding="1" width="100%" border="0">
         <tr>
            <td valign="top" width="300">
               <h3><a href="../../../index.htm"><img height="86" width="277" alt="C++ Boost" src="../../../boost.png" border="0"></a></h3>
            </td>
            <td width="353">
               <h1 align="center">Boost.Regex</h1>
               <h2 align="center">Algorithm regex_grep (deprecated)</h2>
            </td>
            <td width="50">
               <h3><a href="index.html"><img height="45" width="43" alt="Boost.Regex Index" src="uarrow.gif" border="0"></a></h3>
            </td>
         </tr>
      </table>
      <br>
      <br>
      <hr>
      <p>The algorithm regex_grep is deprecated in favor of <a href="regex_iterator.html">regex_iterator</a>,
         which provides a more convenient and standard-library-friendly interface.</p>
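      <p>As a rough illustration of the iterator style, the sketch below walks every 
         match with a regex iterator and records where each one starts. It is written 
         against std::regex and std::sregex_iterator so that it builds with the standard 
         library alone; the assumption is that boost::sregex_iterator can be used the 
         same way for a loop this simple. The name match_offsets is made up for the 
         example.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <regex>
#include <string>
#include <vector>

// Collect the start offset of every non-overlapping match of `re` in `text`.
// std::sregex_iterator is used as a stand-in for boost::sregex_iterator
// (assumption: both are used the same way for this simple loop).
std::vector<std::size_t> match_offsets(const std::string& text, const std::regex& re)
{
   std::vector<std::size_t> offsets;
   const std::sregex_iterator end;   // a default-constructed iterator marks the end
   for (std::sregex_iterator it(text.begin(), text.end(), re); it != end; ++it)
   {
      // (*it)[0] is the whole match; its .first iterator points at the match start
      offsets.push_back(static_cast<std::size_t>((*it)[0].first - text.begin()));
   }
   return offsets;
}
```

      <p>For instance, match_offsets("class a; class b;", std::regex("class")) yields 
         the offsets 0 and 9.</p>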
      <p>The following documentation is taken unchanged from the previous boost release, 
         and will not be updated in future.</p>
      <hr>
      <pre>
#include &lt;<a href="../../../boost/regex.hpp">boost/regex.hpp</a>&gt;
</pre>
      <p>regex_grep allows you to search through a bidirectional-iterator range and 
         locate all the (non-overlapping) matches with a given regular expression. The 
         function is declared as:</p>
      <pre>
<b>template</b> &lt;<b>class</b> Predicate, <b>class</b> iterator, <b>class</b> charT, <b>class</b> traits, <b>class</b> Allocator&gt;
<b>unsigned</b> <b>int</b> regex_grep(Predicate foo,
                         iterator first,
                         iterator last,
                         <b>const</b> basic_regex&lt;charT, traits, Allocator&gt;&amp; e,
                         boost::match_flag_type flags = match_default);
</pre>
      <p>The library also defines the following convenience versions, which take either 
         a const charT*, or a const std::basic_string&lt;&gt;&amp; in place of a pair of 
         iterators [note - these versions may not be available, or may be available in a 
         more limited form, depending upon your compiler's capabilities]:</p>
      <pre>
<b>template</b> &lt;<b>class</b> Predicate, <b>class</b> charT, <b>class</b> Allocator, <b>class</b> traits&gt;
<b>unsigned</b> <b>int</b> regex_grep(Predicate foo, 
              <b>const</b> charT* str, 
              <b>const</b> basic_regex&lt;charT, traits, Allocator&gt;&amp; e, 
              boost::match_flag_type flags = match_default);

<b>template</b> &lt;<b>class</b> Predicate, <b>class</b> ST, <b>class</b> SA, <b>class</b> Allocator, <b>class</b> charT, <b>class</b> traits&gt;
<b>unsigned</b> <b>int</b> regex_grep(Predicate foo, 
              <b>const</b> std::basic_string&lt;charT, ST, SA&gt;&amp; s, 
              <b>const</b> basic_regex&lt;charT, traits, Allocator&gt;&amp; e, 
              boost::match_flag_type flags = match_default);
</pre>
      <p>The parameters for the primary version of regex_grep have the following 
         meanings:&nbsp;</p>
      <p></p>
      <table id="Table2" cellspacing="0" cellpadding="7" width="624" border="0">
         <tr>
            <td width="5%">&nbsp;</td>
            <td valign="top" width="50%">foo</td>
            <td valign="top" width="50%">A predicate function object or function pointer, see 
               below for more information.</td>
            <td width="5%">&nbsp;</td>
         </tr>
         <tr>
            <td>&nbsp;</td>
            <td valign="top" width="50%">first</td>
            <td valign="top" width="50%">The start of the range to search.</td>
            <td>&nbsp;</td>
         </tr>
         <tr>
            <td>&nbsp;</td>
            <td valign="top" width="50%">last</td>
            <td valign="top" width="50%">The end of the range to search.</td>
            <td>&nbsp;</td>
         </tr>
         <tr>
            <td>&nbsp;</td>
            <td valign="top" width="50%">e</td>
            <td valign="top" width="50%">The regular expression to search for.</td>
            <td>&nbsp;</td>
         </tr>
         <tr>
            <td>&nbsp;</td>
            <td valign="top" width="50%">flags</td>
            <td valign="top" width="50%">The flags that determine how matching is carried out: 
               one or more of the <a href="#match_type">match_flag_type</a> enumerators.</td>
            <td>&nbsp;</td>
         </tr>
      </table>
      <br>
      <br>
      <p>The algorithm finds all of the non-overlapping matches of the expression e. For 
         each match it fills a <a href="#reg_match">match_results</a>&lt;iterator, 
         Allocator&gt; structure, which contains information on what matched, and calls 
         the predicate foo, passing the match_results&lt;iterator, Allocator&gt; as a 
         single argument. If the predicate returns true, then the grep operation 
         continues; otherwise it terminates without searching for further matches. The 
         function returns the number of matches found.</p>
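      <p>The calling convention just described can be sketched as a short loop. This is 
         not the library's implementation: grep_sketch and stop_after are hypothetical 
         names, the code uses std::regex as a stand-in for Boost.Regex, and it assumes 
         that the match on which the predicate returns false is still included in the 
         returned count.</p>

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical sketch of the regex_grep calling convention, built on std::regex.
template <class Predicate>
unsigned int grep_sketch(Predicate pred,
                         std::string::const_iterator first,
                         std::string::const_iterator last,
                         const std::regex& e)
{
   unsigned int count = 0;
   const std::sregex_iterator end;
   for (std::sregex_iterator it(first, last, e); it != end; ++it)
   {
      ++count;            // assumption: a match is counted before the predicate runs
      if (!pred(*it))     // a false return stops the search
         break;
   }
   return count;
}

// Example predicate: accept matches until `limit` of them have been seen.
struct stop_after
{
   unsigned int limit;
   unsigned int seen;
   explicit stop_after(unsigned int n) : limit(n), seen(0) {}
   bool operator()(const std::smatch&) { return ++seen < limit; }
};
```

      <p>With the text "one two three four" and the expression "[a-z]+", 
         grep_sketch(stop_after(2), ...) stops after the second word, whereas a 
         predicate that always returns true visits all four.</p>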
      <p>The general form of the predicate is:</p>
      <pre>
<b>struct</b> grep_predicate
{
 <b>  bool</b> <b>operator</b>()(<b>const</b> match_results&lt;iterator_type, typename expression_type::alloc_type::template rebind&lt;sub_match&lt;BidirectionalIterator&gt; &gt;::other&gt;&amp; m);
};
</pre>
      <p>Note that in almost every case the allocator parameter can be omitted when 
         specifying the <a href="match_results.html">match_results</a> type; 
         alternatively, one of the typedefs cmatch, wcmatch, smatch or wsmatch can be 
         used.</p>
      <p>For example, the regular expression "a*b" would find one match in the string 
         "aaaaab" and two in the string "aaabb" (namely "aaab" and "b").</p>
      <p>Remember that this algorithm can be used for a lot more than implementing a 
         version of grep: the predicate can be and do anything that you want. A grep 
         utility would output the results to the screen, another program could index a 
         file based on a regular expression and store a set of bookmarks in a list, and 
         a text file conversion utility would output to a file. The results of one 
         regex_grep can even be chained into another regex_grep to create recursive 
         parsers.</p>
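      <p>One way to picture the chaining idea is a two-level search in which the 
         sub-match found by an outer expression becomes the range searched by an inner 
         one. The sketch below is only an illustration: it uses std::regex iterators as 
         a stand-in for nested regex_grep calls, and the function name and both 
         expressions are invented for the example.</p>

```cpp
#include <cassert>
#include <regex>
#include <string>
#include <vector>

// Outer pass: find each parenthesised group.
// Inner pass: search only inside the text captured by the outer match.
std::vector<int> numbers_in_parens(const std::string& text)
{
   const std::regex group("\\(([^)]*)\\)");   // [1] = text between the parentheses
   const std::regex number("[0-9]+");
   std::vector<int> result;
   const std::sregex_iterator end;
   for (std::sregex_iterator outer(text.begin(), text.end(), group); outer != end; ++outer)
   {
      // The sub-match iterators delimit the range for the nested search.
      const std::ssub_match& inside = (*outer)[1];
      for (std::sregex_iterator inner(inside.first, inside.second, number); inner != end; ++inner)
         result.push_back(std::stoi(inner->str()));
   }
   return result;
}
```

      <p>Given "skip 1 (2, 3) and 4 (50)", only the numbers inside parentheses are 
         collected: 2, 3 and 50.</p>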
      <P>The algorithm may throw&nbsp;<CODE>std::runtime_error</CODE> if the complexity 
         of matching the expression against an N character string begins to exceed O(N<SUP>2</SUP>), 
         or if the program runs out of stack space while matching the expression (if 
         Boost.regex is <A href="configuration.html">configured</A> in recursive mode), 
         or if the matcher exhausts its permitted memory allocation (if Boost.regex is <A href="configuration.html">configured</A> 
         in non-recursive mode).</P>
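      <p>A caller that prefers to recover rather than propagate the exception can wrap 
         the search in a try block. The sketch below is written against std::regex, 
         whose std::regex_error derives from std::runtime_error; it assumes the same 
         catch clause would apply to the Boost.Regex behaviour described above. The 
         helper name try_count_matches is hypothetical.</p>

```cpp
#include <cassert>
#include <iterator>
#include <regex>
#include <stdexcept>
#include <string>

// Count the matches of `e` in `text`. Returns false, leaving `count`
// untouched, if the matcher throws std::runtime_error or a type derived from it.
bool try_count_matches(const std::string& text, const std::regex& e, unsigned int& count)
{
   try
   {
      const std::sregex_iterator first(text.begin(), text.end(), e);
      const std::sregex_iterator last;
      count = static_cast<unsigned int>(std::distance(first, last));
      return true;
   }
   catch (const std::runtime_error&)
   {
      // The matcher gave up, for example on excessive complexity or memory use.
      return false;
   }
}
```

      <p>Reporting failure through the return value lets the caller decide whether to 
         retry with a simpler expression or to report the problem.</p>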
      <p><a href="../example/snippets/regex_grep_example_1.cpp"> Example</a>: convert 
         the example from <i>regex_search</i> to use <i>regex_grep</i> instead:</p>
      <pre>
<font color="#008000">#include &lt;string&gt; 
#include &lt;map&gt; 
#include &lt;boost/regex.hpp&gt; 

</font><font color="#000080"><i>// IndexClasses: 
// takes the contents of a file in the form of a string 
// and searches for all the C++ class definitions, storing 
// their locations in a map of strings/int's 
</i></font><b>typedef</b> std::map&lt;std::string, <b>int</b>, std::less&lt;std::string&gt; &gt; map_type; 

const char* re = 
   // possibly leading whitespace:   
   "^[[:space:]]*" 
   // possible template declaration:
   "(template[[:space:]]*&lt;[^;:{]+&gt;[[:space:]]*)?"
   // class or struct:
   "(class|struct)[[:space:]]*" 
   // leading declspec macros etc:
   "("
      "\\&lt;\\w+\\&gt;"
      "("
         "[[:blank:]]*\\([^)]*\\)"
      ")?"
      "[[:space:]]*"
   ")*" 
   // the class name
   "(\\&lt;\\w*\\&gt;)[[:space:]]*" 
   // template specialisation parameters
   "(&lt;[^;:{]+&gt;)?[[:space:]]*"
   // terminate in { or :
   "(\\{|:[^;\\{()]*\\{)";

boost::regex expression(re); 
<b>class</b> IndexClassesPred 
{ 
   map_type&amp; m; 
   std::string::const_iterator base; 
<b>public</b>: 
   IndexClassesPred(map_type&amp; a, std::string::const_iterator b) : m(a), base(b) {} 
   <b>bool</b> <b>operator</b>()(<b>const</b> boost::smatch&amp; what) 
   { 
 <font color=
#000080>     <i>// what[0] contains the whole string 
</i>      <i>// what[5] contains the class name. 
</i>      <i>// what[6] contains the template specialisation if any. 
</i>      <i>// add class name and position to map: 
</i></font>      m[std::string(what[5].first, what[5].second) + std::string(what[6].first, what[6].second)] = 
                what[5].first - base; 
      <b>return</b> <b>true</b>; 
   } 
}; 
<b>void</b> IndexClasses(map_type&amp; m, <b>const</b> std::string&amp; file) 
{ 
   std::string::const_iterator start, end; 
   start = file.begin(); 
   end = file.end(); 
   regex_grep(IndexClassesPred(m, start), start, end, expression); 
}
</pre>
      <p><a href="../example/snippets/regex_grep_example_2.cpp"> Example</a>: Use 
         regex_grep to call a global callback function:</p>
      <pre>
<font color="#008000">#include &lt;string&gt; 
#include &lt;map&gt; 
#include &lt;boost/regex.hpp&gt; 

</font><font color="#000080"><i>// purpose: 
// takes the contents of a file in the form of a string 
// and searches for all the C++ class definitions, storing 
// their locations in a map of strings/int's 
</i></font><b>typedef</b> std::map&lt;std::string, <b>int</b>, std::less&lt;std::string&gt; &gt; map_type; 

const char* re = 
   // possibly leading whitespace:   
   "^[[:space:]]*" 
   // possible template declaration:
   "(template[[:space:]]*&lt;[^;:{]+&gt;[[:space:]]*)?"
   // class or struct:
   "(class|struct)[[:space:]]*" 
   // leading declspec macros etc:
   "("
      "\\&lt;\\w+\\&gt;"
      "("
         "[[:blank:]]*\\([^)]*\\)"
      ")?"
      "[[:space:]]*"
   ")*" 
   // the class name
   "(\\&lt;\\w*\\&gt;)[[:space:]]*" 
   // template specialisation parameters
   "(&lt;[^;:{]+&gt;)?[[:space:]]*"
   // terminate in { or :
   "(\\{|:[^;\\{()]*\\{)";

boost::regex expression(re);
map_type class_index; 
std::string::const_iterator base; 

<b>bool</b> grep_callback(<b>const</b>  boost::smatch&amp; what) 
{ 
 <font color="#000080">  <i>// what[0] contains the whole string 
</i>   <i>// what[5] contains the class name. 
</i>   <i>// what[6] contains the template specialisation if any. 
</i>   <i>// add class name and position to map: 
</i></font>   class_index[std::string(what[5].first, what[5].second) + std::string(what[6].first, what[6].second)] = 
                what[5].first - base; 
   <b>return</b> <b>true</b>; 
} 
<b>void</b> IndexClasses(<b>const</b> std::string&amp; file) 
{ 
   std::string::const_iterator start, end; 
   start = file.begin(); 
   end = file.end(); 
   base = start; 
   regex_grep(grep_callback, start, end, expression, boost::match_default); 
}
 
</pre>
      <p><a href="../example/snippets/regex_grep_example_3.cpp"> Example</a>: use 
         regex_grep to call a class member function, using the standard library adapters <i>std::mem_fun</i>
         and <i>std::bind1st</i> to convert the member function into a predicate:</p>
      <pre>
<font color="#008000">#include &lt;string&gt; 
#include &lt;map&gt; 
#include &lt;boost/regex.hpp&gt; 
#include &lt;functional&gt; 
</font><font color="#000080"><i>// purpose: 
// takes the contents of a file in the form of a string 
// and searches for all the C++ class definitions, storing 
// their locations in a map of strings/int's 

</i></font><b>typedef</b> std::map&lt;std::string, <b>int</b>, std::less&lt;std::string&gt; &gt; map_type; 
<b>class</b> class_index 
{ 
   boost::regex expression; 
   map_type index; 
   std::string::const_iterator base; 
   <b>bool</b>  grep_callback(boost::smatch what); 
<b>public</b>: 
 <b>  void</b> IndexClasses(<b>const</b> std::string&amp; file); 
   class_index() 
      : index(), 
        expression(<font color=
#000080>"^(template[[:space:]]*&lt;[^;:{]+&gt;[[:space:]]*)?" 
                   "(class|struct)[[:space:]]*(\\&lt;\\w+\\&gt;([[:blank:]]*\\([^)]*\\))?" 
                   "[[:space:]]*)*(\\&lt;\\w*\\&gt;)[[:space:]]*(&lt;[^;:{]+&gt;[[:space:]]*)?" 
                   "(\\{|:[^;\\{()]*\\{)" 
</font>                   ){} 
}; 
<b>bool</b>  class_index::grep_callback(boost::smatch what) 
{ 
 <font color="#000080">  <i>// what[0] contains the whole string 
</i>   <i>// what[5] contains the class name. 
</i>   <i>// what[6] contains the template specialisation if any. 
</i>   <i>// add class name and position to map: 
</i></font>   index[std::string(what[5].first, what[5].second) + std::string(what[6].first, what[6].second)] = 
               what[5].first - base; 
   <b>return</b> <b>true</b>; 
} 

<b>void</b> class_index::IndexClasses(<b>const</b> std::string&amp; file) 
{ 
   std::string::const_iterator start, end; 
   start = file.begin(); 
   end = file.end(); 
   base = start; 
   regex_grep(std::bind1st(std::mem_fun(&amp;class_index::grep_callback), <b>this</b>), 
              start, 
              end, 
              expression); 
} 
 
</pre>
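      <p>std::bind1st and std::mem_fun were deprecated in C++11 and removed in C++17. 
         With a C++11 or later compiler, a lambda that captures this can play the same 
         role. The sketch below shows that idea only; it is built on std::regex, and 
         for_each_match, name_index and the simplified expression are hypothetical 
         stand-ins for regex_grep, class_index and the full expression above.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <regex>
#include <string>

// Hypothetical stand-in for regex_grep, built on std::regex.
template <class Predicate>
unsigned int for_each_match(Predicate pred, const std::string& text, const std::regex& e)
{
   unsigned int count = 0;
   const std::sregex_iterator end;
   for (std::sregex_iterator it(text.begin(), text.end(), e); it != end; ++it)
   {
      ++count;
      if (!pred(*it))
         break;
   }
   return count;
}

class name_index
{
   std::regex expression;
   std::map<std::string, std::size_t> index;
   std::string::const_iterator base;
   bool callback(const std::smatch& what)
   {
      // what[2] is the class name; store its offset from the start of the file
      index[what[2].str()] = static_cast<std::size_t>(what[2].first - base);
      return true;
   }
public:
   // Simplified expression (an assumption for brevity): keyword, whitespace, name.
   name_index() : expression("(class|struct)\\s+(\\w+)") {}
   void index_classes(const std::string& file)
   {
      base = file.begin();
      // The lambda captures this and forwards each match to the member function.
      for_each_match([this](const std::smatch& m) { return callback(m); }, file, expression);
   }
   const std::map<std::string, std::size_t>& result() const { return index; }
};
```

      <p>Unlike the adapter version, the lambda lets the member function take its 
         argument by const reference.</p>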
      <p><a href="../example/snippets/regex_grep_example_4.cpp"> Finally</a>, C++ 
         Builder users can use C++ Builder's closure type as a callback argument:</p>
      <pre>
<font color="#008000">#include &lt;string&gt; 
#include &lt;map&gt; 
#include &lt;boost/regex.hpp&gt; 
#include &lt;functional&gt; 
</font><font color="#000080"><i>// purpose: 
// takes the contents of a file in the form of a string 
// and searches for all the C++ class definitions, storing 
// their locations in a map of strings/int's 

</i></font><b>typedef</b> std::map&lt;std::string, <b>int</b>, std::less&lt;std::string&gt; &gt; map_type; 
<b>class</b> class_index 
{ 
   boost::regex expression; 
   map_type index; 
   std::string::const_iterator base; 
   <b>typedef</b>  boost::smatch arg_type; 
   <b>bool</b> grep_callback(<b>const</b> arg_type&amp; what); 
<b>public</b>: 
   <b>typedef</b> <b>bool</b> (<b>__closure</b>* grep_callback_type)(<b>const</b> arg_type&amp;); 
   <b>void</b> IndexClasses(<b>const</b> std::string&amp; file); 
   class_index() 
      : index(), 
        expression(<font color=
#000080>"^(template[[:space:]]*&lt;[^;:{]+&gt;[[:space:]]*)?" 
                   "(class|struct)[[:space:]]*(\\&lt;\\w+\\&gt;([[:blank:]]*\\([^)]*\\))?" 
                   "[[:space:]]*)*(\\&lt;\\w*\\&gt;)[[:space:]]*(&lt;[^;:{]+&gt;[[:space:]]*)?" 
                   "(\\{|:[^;\\{()]*\\{)" 
</font>                   ){} 
}; 

<b>bool</b> class_index::grep_callback(<b>const</b> arg_type&amp; what) 
{ 
   <font color="#000080"><i>// what[0] contains the whole string</i>
   <i>// what[5] contains the class name.</i>
   <i>// what[6] contains the template specialisation if any.</i>
   <i>// add class name and position to map:</i></font>
   index[std::string(what[5].first, what[5].second) + std::string(what[6].first, what[6].second)] = 
               what[5].first - base; 
   <b>return</b> <b>true</b>; 
} 

<b>void</b> class_index::IndexClasses(<b>const</b> std::string&amp; file) 
{ 
   std::string::const_iterator start, end; 
   start = file.begin(); 
   end = file.end(); 
   base = start; 
   class_index::grep_callback_type cl = &amp;(<b>this</b>-&gt;grep_callback); 
   regex_grep(cl, 
            start, 
            end, 
            expression); 
}
</pre>
      <p></p>
      <hr>
      <p>Revised 
         <!--webbot bot="Timestamp" S-Type="EDITED" S-Format="%d %B, %Y" startspan --> 
         04 Feb 2004 
         <!--webbot bot="Timestamp" endspan i-checksum="39359" --></p>
      <p><i> Copyright John Maddock&nbsp;1998- 
            <!--webbot bot="Timestamp" S-Type="EDITED" S-Format="%Y" startspan -->  2004<!--webbot bot="Timestamp" endspan i-checksum="39359" --></i></p>
      <P><I>Use, modification and distribution are subject to the Boost Software License, 
            Version 1.0. (See accompanying file <A href="../../../LICENSE_1_0.txt">LICENSE_1_0.txt</A>
            or copy at <A href="http://www.boost.org/LICENSE_1_0.txt">http://www.boost.org/LICENSE_1_0.txt</A>)</I></P>
   </body>
</html>