<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/xhtml;charset=UTF-8"/>
<meta http-equiv="X-UA-Compatible" content="IE=9"/>
<title>wxWidgets: wxString Overview</title>
<link href="tabs.css" rel="stylesheet" type="text/css"/>
<script type="text/javascript" src="jquery.js"></script>
<script type="text/javascript" src="dynsections.js"></script>
<link href="doxygen.css" rel="stylesheet" type="text/css" />
<link href="extra_stylesheet.css" rel="stylesheet" type="text/css"/>
</head>
<body>
<div id="page_container">
<div id="top"><!-- do not remove this div, it is closed by doxygen! -->
<div id="titlearea">
<table cellspacing="0" cellpadding="0" style="width: 100%;">
 <tbody>
 <tr>
  <td id="projectlogo">
    <a href="http://www.wxwidgets.org/" target="_new">
      <img alt="wxWidgets" src="logo.png"/>
    </a>
  </td>
  <td style="padding-left: 0.5em; text-align: right;">
   <span id="projectnumber">Version: 3.0.2</span>
  </td>
 </tr>
 </tbody>
</table>
</div>
<!-- Generated by Doxygen 1.8.2 -->
  <div id="navrow1" class="tabs">
    <ul class="tablist">
      <li><a href="index.html"><span>Main&#160;Page</span></a></li>
      <li class="current"><a href="pages.html"><span>Related&#160;Pages</span></a></li>
      <li><a href="modules.html"><span>Categories</span></a></li>
      <li><a href="annotated.html"><span>Classes</span></a></li>
      <li><a href="files.html"><span>Files</span></a></li>
    </ul>
  </div>
<div id="nav-path" class="navpath">
  <ul>
<li class="navelem"><a class="el" href="index.html">Documentation</a></li><li class="navelem"><a class="el" href="page_topics.html">Programming Guides</a></li>  </ul>
</div>
</div><!-- top -->
<div class="header">
  <div class="headertitle">
<div class="title"><a class="el" href="classwx_string.html" title="String class for passing textual data to or receiving it from wxWidgets.">wxString</a> Overview </div>  </div>
</div><!--header-->
<div class="contents">
<div class="toc"><h3>Table of Contents</h3>
<ul><li class="level1"><a href="#overview_string_internal">Internal wxString Encoding</a></li>
<li class="level1"><a href="#overview_string_binary">Using wxString to store binary data</a></li>
<li class="level1"><a href="#overview_string_comparison">Comparison to Other String Classes</a></li>
<li class="level1"><a href="#overview_string_advice">Advice About Using wxString</a><ul><li class="level2"><a href="#overview_string_implicitconv">Implicit conversions</a></li>
<li class="level2"><a href="#overview_string_iterating">Iterating wxString Characters</a></li>
</ul>
</li>
<li class="level1"><a href="#overview_string_related">String Related Functions and Classes</a></li>
<li class="level1"><a href="#overview_string_tuning">Tuning wxString for Your Application</a></li>
<li class="level1"><a href="#overview_string_settings">wxString Related Compilation Settings</a></li>
</ul>
</div>
<div class="textblock"><p><a class="el" href="classwx_string.html" title="String class for passing textual data to or receiving it from wxWidgets.">wxString</a> is a class which represents a Unicode string of arbitrary length and containing arbitrary Unicode characters.</p>
<p>This class has all the standard operations you can expect to find in a string class: dynamic memory management (string extends to accommodate new characters), construction from other strings, compatibility with C strings and wide character C strings, assignment operators, access to individual characters, string concatenation and comparison, substring extraction, case conversion, trimming and padding (with spaces), searching and replacing and both C-like <code>printf</code> (<a class="el" href="classwx_string.html#a9588b7f2684b9a6a924dc3746a2b2f8d" title="Similar to the standard function sprintf().">wxString::Printf</a>) and stream-like insertion functions as well as much more - see <a class="el" href="classwx_string.html" title="String class for passing textual data to or receiving it from wxWidgets.">wxString</a> for a list of all functions.</p>
<p>The <a class="el" href="classwx_string.html" title="String class for passing textual data to or receiving it from wxWidgets.">wxString</a> class has been completely rewritten for wxWidgets 3.0 but much work has been done to make existing code using ANSI string literals work as it did in previous versions.</p>
<h1><a class="anchor" id="overview_string_internal"></a>
Internal wxString Encoding</h1>
<p>Since wxWidgets 3.0 <a class="el" href="classwx_string.html" title="String class for passing textual data to or receiving it from wxWidgets.">wxString</a> may use any of <code>UTF-16</code> (under Windows, using the native 16 bit <code>wchar_t</code>), <code>UTF-32</code> (under Unix, using the native 32 bit <code>wchar_t</code>) or <code>UTF-8</code> (under both Windows and Unix) to store its content. By default, <code>wchar_t</code> is used under all platforms, but wxWidgets can be compiled with <code>wxUSE_UNICODE_UTF8=1</code> to use UTF-8.</p>
<p>For simplicity of implementation, <a class="el" href="classwx_string.html" title="String class for passing textual data to or receiving it from wxWidgets.">wxString</a> uses <em>per code unit indexing</em> instead of <em>per code point indexing</em> when using UTF-16, i.e. in the default <code>wxUSE_UNICODE_WCHAR==1</code> build under Windows, and doesn't know anything about surrogate pairs. In other words, it always considers a code point to consist of a single code unit, although this is true only for characters in the <em>BMP</em> (Basic Multilingual Plane), as explained in more detail in the <a class="el" href="overview_unicode.html#overview_unicode_encodings">Unicode Representations and Terminology</a> section. Thus when iterating over a UTF-16 string stored in a <a class="el" href="classwx_string.html" title="String class for passing textual data to or receiving it from wxWidgets.">wxString</a> under Windows, the user code has to take care of <em>surrogate pairs</em> itself. (Note however that Windows itself has built-in support for surrogate pairs in UTF-16, such as for drawing strings on screen.)</p>
<dl class="section remark"><dt>Remarks</dt><dd>Note that while the behaviour of <a class="el" href="classwx_string.html" title="String class for passing textual data to or receiving it from wxWidgets.">wxString</a> when <code>wxUSE_UNICODE_WCHAR==1</code> resembles UCS-2 encoding, it's not completely correct to refer to <a class="el" href="classwx_string.html" title="String class for passing textual data to or receiving it from wxWidgets.">wxString</a> as UCS-2 encoded since you can encode code points outside the <em>BMP</em> in a <a class="el" href="classwx_string.html" title="String class for passing textual data to or receiving it from wxWidgets.">wxString</a> as two code units (i.e. as a surrogate pair; as already mentioned however <a class="el" href="classwx_string.html" title="String class for passing textual data to or receiving it from wxWidgets.">wxString</a> will "see" them as two different code points)</dd></dl>
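<p>As a rough illustration of what taking care of surrogate pairs in user code can involve, the following sketch counts code points in a sequence of UTF-16 code units using only standard C++. The helper name <code>CountCodePointsUtf16</code> is hypothetical and not part of wxWidgets, and the sketch assumes that an unpaired surrogate should simply count as one code point.</p>

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not a wxWidgets function): counts Unicode code points
// in a sequence of UTF-16 code units, treating a high surrogate followed by
// a low surrogate as a single code point.
std::size_t CountCodePointsUtf16(const std::u16string& units)
{
    std::size_t count = 0;
    for (std::size_t i = 0; i < units.size(); ++i)
    {
        const char16_t unit = units[i];
        const bool isHighSurrogate = unit >= 0xD800 && unit <= 0xDBFF;
        if (isHighSurrogate && i + 1 < units.size())
        {
            const char16_t next = units[i + 1];
            if (next >= 0xDC00 && next <= 0xDFFF)
                ++i; // the low surrogate belongs to the same code point
        }
        ++count; // assumption: an unpaired surrogate counts as one code point
    }
    return count;
}
```

<p>For U+10300, which UTF-16 stores as two code units, this helper reports one code point, whereas a per code unit length would report two.</p>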
<p>In the <code>wxUSE_UNICODE_UTF8==1</code> case, <a class="el" href="classwx_string.html" title="String class for passing textual data to or receiving it from wxWidgets.">wxString</a> handles UTF-8 multi-byte sequences correctly, including for characters outside the BMP (it implements <em>per code point indexing</em>), so that you can use UTF-8 in a completely transparent way:</p>
<p>Example: </p>
<div class="fragment"><div class="line"><span class="comment">// first test, using exotic characters outside of the Unicode BMP:</span></div>
<div class="line"></div>
<div class="line"><a class="code" href="classwx_string.html" title="String class for passing textual data to or receiving it from wxWidgets.">wxString</a> test = <a class="code" href="classwx_string.html#a2ddc1b7c8e1eb9adbf5874dead5b180b" title="Converts C string encoded in UTF-8 to wxString.">wxString::FromUTF8</a>(<span class="stringliteral">&quot;\xF0\x90\x8C\x80&quot;</span>);</div>
<div class="line">    <span class="comment">// U+10300 is &quot;OLD ITALIC LETTER A&quot; and is part of Unicode Plane 1</span></div>
<div class="line">    <span class="comment">// in UTF8 it&#39;s encoded as 0xF0 0x90 0x8C 0x80</span></div>
<div class="line"></div>
<div class="line"><span class="comment">// it&#39;s a single Unicode code-point encoded as:</span></div>
<div class="line"><span class="comment">// - a UTF16 surrogate pair under Windows</span></div>
<div class="line"><span class="comment">// - a UTF8 multiple-bytes sequence under Linux</span></div>
<div class="line"><span class="comment">// (without considering the final NULL)</span></div>
<div class="line"></div>
<div class="line">wxPrintf(<span class="stringliteral">&quot;wxString reports a length of %d character(s)&quot;</span>, test.<a class="code" href="classwx_string.html#af63f200410b56436a830550905e20539">length</a>());</div>
<div class="line">    <span class="comment">// prints &quot;wxString reports a length of 1 character(s)&quot; on Linux</span></div>
<div class="line">    <span class="comment">// prints &quot;wxString reports a length of 2 character(s)&quot; on Windows</span></div>
<div class="line">    <span class="comment">// since wxString on Windows doesn&#39;t have surrogate pairs support!</span></div>
<div class="line"></div>
<div class="line"></div>
<div class="line"><span class="comment">// second test, this time using characters part of the Unicode BMP:</span></div>
<div class="line"></div>
<div class="line"><a class="code" href="classwx_string.html" title="String class for passing textual data to or receiving it from wxWidgets.">wxString</a> test2 = <a class="code" href="classwx_string.html#a2ddc1b7c8e1eb9adbf5874dead5b180b" title="Converts C string encoded in UTF-8 to wxString.">wxString::FromUTF8</a>(<span class="stringliteral">&quot;\x41\xC3\xA0\xE2\x82\xAC&quot;</span>);</div>
<div class="line">    <span class="comment">// this is the UTF8 encoding of capital letter A followed by</span></div>
<div class="line">    <span class="comment">// &#39;small case letter a with grave&#39; followed by the &#39;euro sign&#39;</span></div>
<div class="line"></div>
<div class="line"><span class="comment">// they are 3 Unicode code-points encoded as:</span></div>
<div class="line"><span class="comment">// - 3 UTF16 code units under Windows</span></div>
<div class="line"><span class="comment">// - 6 UTF8 code units under Linux</span></div>
<div class="line"><span class="comment">// (without considering the final NULL)</span></div>
<div class="line"></div>
<div class="line">wxPrintf(<span class="stringliteral">&quot;wxString reports a length of %d character(s)&quot;</span>, test2.length());</div>
<div class="line">    <span class="comment">// prints &quot;wxString reports a length of 3 character(s)&quot; on Linux</span></div>
<div class="line">    <span class="comment">// prints &quot;wxString reports a length of 3 character(s)&quot; on Windows</span></div>
</div><!-- fragment --><p>To illustrate the above, consider the second string of the example; it is composed of 3 characters and the final <code>NUL</code>:</p>
<div class="image">
<img src="overview_wxstring_encoding.png" alt="overview_wxstring_encoding.png"/>
</div>
<p>As you can see, UTF16 encoding is straightforward (for characters in the <em>BMP</em>) and in this example the UTF16-encoded <a class="el" href="classwx_string.html" title="String class for passing textual data to or receiving it from wxWidgets.">wxString</a> takes 8 bytes. UTF8 encoding is more elaborate and in this example takes 7 bytes.</p>
<p>In general, for strings containing many Latin characters UTF8 has a much smaller memory footprint than UTF16, but requires some more processing for common operations such as length calculation.</p>
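<p>To make the extra processing concrete, here is a minimal sketch in standard C++ of how a length in code points can be derived from UTF-8 bytes. The helper name <code>CountCodePointsUtf8</code> is hypothetical, the input is assumed to be well-formed UTF-8, and this is not a description of how wxString computes its length internally.</p>

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not a wxWidgets function): counts code points in
// well-formed UTF-8 by counting every byte that is not a continuation byte.
// Continuation bytes have the bit pattern 10xxxxxx.
std::size_t CountCodePointsUtf8(const std::string& bytes)
{
    std::size_t count = 0;
    for (unsigned char byte : bytes)
    {
        if ((byte & 0xC0) != 0x80)
            ++count; // lead byte or single-byte (ASCII) character
    }
    return count;
}
```

<p>Unlike a fixed-width encoding, where the length follows directly from the buffer size, this requires a pass over every byte of the string.</p>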
<p>Finally, note that the type used by <a class="el" href="classwx_string.html" title="String class for passing textual data to or receiving it from wxWidgets.">wxString</a> to store Unicode code units (<code>wchar_t</code> or <code>char</code>) is always available under the typedef name <a class="el" href="group__group__funcmacro__string.html#gaf558f1d34fbf3cf5e3258e42a40875fd" title="wxStringCharType is defined to be:char when wxUSE_UNICODE==0char when wxUSE_UNICODE_WCHAR==0 and wxUS...">wxStringCharType</a>.</p>
<h1><a class="anchor" id="overview_string_binary"></a>
Using wxString to store binary data</h1>
<p><a class="el" href="classwx_string.html" title="String class for passing textual data to or receiving it from wxWidgets.">wxString</a> can be used to store binary data (even if it contains <code>NULs</code>) using the functions <a class="el" href="classwx_string.html#afa91a632574bcbba1bf35b54f2c5562a" title="Converts the string to an 8-bit string in ISO-8859-1 encoding in the form of a wxCharBuffer (Unicode ...">wxString::To8BitData</a> and <a class="el" href="classwx_string.html#a5aedc23e9cc2774237d99148d0622661" title="Converts given buffer of binary data from 8-bit string to wxString.">wxString::From8BitData</a>.</p>
<p>Beware that even though <code>NUL</code> characters are allowed, some methods in the current string implementation might not work correctly with them.</p>
<p>Note however that other classes like <a class="el" href="classwx_memory_buffer.html" title="A wxMemoryBuffer is a useful data structure for storing arbitrary sized blocks of memory...">wxMemoryBuffer</a> are more suited to this task. For handling binary data you may also want to look at the <a class="el" href="classwx_stream_buffer.html" title="wxStreamBuffer is a cache manager for wxStreamBase: it manages a stream buffer linked to a stream...">wxStreamBuffer</a>, <a class="el" href="classwx_memory_output_stream.html" title="This class allows to use all methods taking a wxOutputStream reference to write to in-memory data...">wxMemoryOutputStream</a>, <a class="el" href="classwx_memory_input_stream.html" title="This class allows to use all methods taking a wxInputStream reference to read in-memory data...">wxMemoryInputStream</a> classes.</p>
<h1><a class="anchor" id="overview_string_comparison"></a>
Comparison to Other String Classes</h1>
<p>The advantages of using a special string class instead of working directly with C strings are so obvious that there is a huge number of such classes available. The most important advantage is that you no longer need to remember to allocate and free memory for C strings yourself; working with fixed size buffers almost inevitably leads to buffer overflows. Moreover, C++ now has a standard string class (<code>std::string</code>). So why the need for <a class="el" href="classwx_string.html" title="String class for passing textual data to or receiving it from wxWidgets.">wxString</a>? There are several advantages:</p>
<ul>
<li><b>Efficiency:</b> Since wxWidgets 3.0 <a class="el" href="classwx_string.html" title="String class for passing textual data to or receiving it from wxWidgets.">wxString</a> uses <code>std::string</code> (in UTF8 mode under Linux, Unix and OS X) or <code>std::wstring</code> (in UTF16 mode under Windows) internally by default to store its contents. <a class="el" href="classwx_string.html" title="String class for passing textual data to or receiving it from wxWidgets.">wxString</a> will therefore inherit the performance characteristics from <code>std::string</code>. </li>
<li><b>Compatibility:</b> This class tries to combine almost full compatibility with the old wxWidgets 1.xx <a class="el" href="classwx_string.html" title="String class for passing textual data to or receiving it from wxWidgets.">wxString</a> class, some resemblance to MFC's CString class and 90% of the functionality of the <code>std::string</code> class. </li>
<li><b>Rich set of functions:</b> Some of the functions present in <a class="el" href="classwx_string.html" title="String class for passing textual data to or receiving it from wxWidgets.">wxString</a> are very useful but don't exist in most of other string classes: for example, <a class="el" href="classwx_string.html#a1605126b7bbf5f60a6fca7f393a58f1d" title="Gets all the characters after the first occurrence of ch.">wxString::AfterFirst</a>, <a class="el" href="classwx_string.html#a9b6f088a6ef2faadf922a521df0fae3a" title="Gets all characters before the last occurrence of ch.">wxString::BeforeLast</a>, <a class="el" href="classwx_string.html#a9588b7f2684b9a6a924dc3746a2b2f8d" title="Similar to the standard function sprintf().">wxString::Printf</a>. Of course, all the standard string operations are supported as well. </li>
<li><b><a class="el" href="classwx_string.html" title="String class for passing textual data to or receiving it from wxWidgets.">wxString</a> is Unicode friendly:</b> it allows easy conversion to and from ANSI and Unicode strings (see <a class="el" href="overview_unicode.html">Unicode Support in wxWidgets</a> for more details) and maps to <code>std::wstring</code> transparently. </li>
<li><b>Used by wxWidgets:</b> And, of course, this class is used everywhere inside wxWidgets so there is no performance loss which would result from conversions of objects of any other string class (including <code>std::string</code>) to <a class="el" href="classwx_string.html" title="String class for passing textual data to or receiving it from wxWidgets.">wxString</a> internally by wxWidgets.</li>
</ul>
<p>However, there are several problems as well. The most important one is probably that there are often several functions to do exactly the same thing: for example, to get the length of the string either one of <a class="el" href="classwx_string.html#af63f200410b56436a830550905e20539">wxString::length()</a>, <a class="el" href="classwx_string.html#ab20a87ca731a52c36ec674dae2213ad8" title="Returns the length of the string.">wxString::Len()</a> or <a class="el" href="classwx_string.html#a8895cca03120099236c002c0577b4d1c" title="Returns the length of the string (same as Len).">wxString::Length()</a> may be used. The first function, as almost all the other functions in lowercase, is <code>std::string</code> compatible. The second one is the "native" <a class="el" href="classwx_string.html" title="String class for passing textual data to or receiving it from wxWidgets.">wxString</a> version and the last one is the wxWidgets 1.xx way.</p>
<p>So which is better to use? The usage of the <code>std::string</code> compatible functions is strongly advised! It will make your code more familiar to other C++ programmers (who are supposed to have knowledge of <code>std::string</code> but not of <a class="el" href="classwx_string.html" title="String class for passing textual data to or receiving it from wxWidgets.">wxString</a>), let you reuse the same code in both wxWidgets and other programs (by just typedefing <a class="el" href="classwx_string.html" title="String class for passing textual data to or receiving it from wxWidgets.">wxString</a> as <code>std::string</code> when used outside wxWidgets) and keep it compatible with future versions of wxWidgets, which will probably start using <code>std::string</code> sooner or later too.</p>
<p>In the situations where there is no corresponding <code>std::string</code> function, please try to use the new <a class="el" href="classwx_string.html" title="String class for passing textual data to or receiving it from wxWidgets.">wxString</a> methods and not the old wxWidgets 1.xx variants which are deprecated and may disappear in future versions.</p>
<h1><a class="anchor" id="overview_string_advice"></a>
Advice About Using wxString</h1>
<h2><a class="anchor" id="overview_string_implicitconv"></a>
Implicit conversions</h2>
<p>Probably the main trap with using this class is the implicit conversion operator to <code>const char*</code>. It is advised that you use <a class="el" href="classwx_string.html#a6418ec90c6d4ffe0b05702be1b35df4f" title="Returns a lightweight intermediate class which is in turn implicitly convertible to both const char* ...">wxString::c_str()</a> instead to clearly indicate when the conversion is done. Specifically, the danger of this implicit conversion may be seen in the following code fragment:</p>
<div class="fragment"><div class="line"><span class="comment">// this function converts the input string to uppercase,</span></div>
<div class="line"><span class="comment">// outputs it to the screen and returns the result</span></div>
<div class="line"><span class="keyword">const</span> <span class="keywordtype">char</span> *SayHELLO(<span class="keyword">const</span> <a class="code" href="classwx_string.html" title="String class for passing textual data to or receiving it from wxWidgets.">wxString</a>&amp; input)</div>
<div class="line">{</div>
<div class="line">    <a class="code" href="classwx_string.html" title="String class for passing textual data to or receiving it from wxWidgets.">wxString</a> output = input.<a class="code" href="classwx_string.html#ab84d4b6e9f38ba939d61f3382d2a009b" title="Returns this string converted to upper case.">Upper</a>();</div>
<div class="line">    printf(<span class="stringliteral">&quot;Hello, %s!\n&quot;</span>, output);</div>
<div class="line">    <span class="keywordflow">return</span> output;</div>
<div class="line">}</div>
</div><!-- fragment --><p>There are two nasty bugs in these three lines. The first is in the call to the <code>printf()</code> function. Although the implicit conversion to C strings is applied automatically by the compiler in the case of</p>
<div class="fragment"><div class="line">puts(output);</div>
</div><!-- fragment --><p>because the argument of <code>puts()</code> is known to be of the type <code>const char*</code>, this is <b>not</b> done for <code>printf()</code> which is a function with variable number of arguments (and whose arguments are of unknown types). So this call may do any number of things (including displaying the correct string on screen), although the most likely result is a program crash. The solution is to use <a class="el" href="classwx_string.html#a6418ec90c6d4ffe0b05702be1b35df4f" title="Returns a lightweight intermediate class which is in turn implicitly convertible to both const char* ...">wxString::c_str()</a>. Just replace this line with this:</p>
<div class="fragment"><div class="line">printf(<span class="stringliteral">&quot;Hello, %s!\n&quot;</span>, output.<a class="code" href="classwx_string.html#a6418ec90c6d4ffe0b05702be1b35df4f" title="Returns a lightweight intermediate class which is in turn implicitly convertible to both const char* ...">c_str</a>());</div>
</div><!-- fragment --><p>The second bug is that returning <code>output</code> doesn't work. The implicit cast is used again, so the code compiles, but because it returns a pointer to a buffer belonging to a local variable which is destroyed as soon as the function exits, its contents are completely arbitrary. The solution to this problem is also easy: just make the function return <a class="el" href="classwx_string.html" title="String class for passing textual data to or receiving it from wxWidgets.">wxString</a> instead of a C string.</p>
<p>This leads us to the following general advice: all functions taking string arguments should take <code>const <a class="el" href="classwx_string.html" title="String class for passing textual data to or receiving it from wxWidgets.">wxString</a>&amp;</code> (this makes assignment to the strings inside the function faster) and all functions returning strings should return <a class="el" href="classwx_string.html" title="String class for passing textual data to or receiving it from wxWidgets.">wxString</a> - this makes it safe to return local variables.</p>
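<p>The shape of the corrected function can be sketched with <code>std::string</code> standing in for wxString, so that the example compiles without wxWidgets. This is an analogy under that assumption, not the wxWidgets code itself: <code>SayHelloUpper</code> is a hypothetical name, and <code>std::string</code> has no implicit conversion to <code>const char*</code>, so only the two fixes (explicit <code>c_str()</code> and returning by value) carry over.</p>

```cpp
#include <cctype>
#include <cstdio>
#include <string>

// Hypothetical std::string analogue of the corrected SayHELLO(): the argument
// is taken by const reference and the result is returned by value, so no
// pointer into a destroyed local variable ever escapes the function.
std::string SayHelloUpper(const std::string& input)
{
    std::string output = input;
    for (char& ch : output)
        ch = static_cast<char>(std::toupper(static_cast<unsigned char>(ch)));

    // c_str() makes the conversion explicit for the variadic printf()
    std::printf("Hello, %s!\n", output.c_str());
    return output;
}
```

<p>Because the caller receives its own string object, the result stays valid after the function's local variables have been destroyed.</p>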
<p>Finally note that <a class="el" href="classwx_string.html" title="String class for passing textual data to or receiving it from wxWidgets.">wxString</a> uses the current locale encoding to convert any C string literal to Unicode. The same is done for converting to and from <code>std::string</code> and for the return value of c_str(). For this conversion, the <em>wxConvLibc</em> class instance is used. See <a class="el" href="classwx_c_s_conv.html" title="This class converts between any character set supported by the system and Unicode.">wxCSConv</a> and <a class="el" href="classwx_m_b_conv.html" title="This class is the base class of a hierarchy of classes capable of converting text strings between mul...">wxMBConv</a>.</p>
<h2><a class="anchor" id="overview_string_iterating"></a>
Iterating wxString Characters</h2>
<p>As previously described, when <code>wxUSE_UNICODE_UTF8==1</code>, <a class="el" href="classwx_string.html" title="String class for passing textual data to or receiving it from wxWidgets.">wxString</a> internally uses the variable-length UTF-8 encoding. Accessing a UTF-8 string by index can be very <b>inefficient</b> because a single character is represented by a variable number of bytes, so that the string has to be parsed from its start in order to find the character. Since iterating over a string by index is a common programming technique and was also possible and encouraged by <a class="el" href="classwx_string.html" title="String class for passing textual data to or receiving it from wxWidgets.">wxString</a> using the access operator[](), <a class="el" href="classwx_string.html" title="String class for passing textual data to or receiving it from wxWidgets.">wxString</a> implements caching of the last used index so that iterating over a string is a linear operation even in UTF-8 mode.</p>
<p>It is nonetheless recommended to use <b>iterators</b> (instead of index based access) like this:</p>
<div class="fragment"><div class="line"><a class="code" href="classwx_string.html" title="String class for passing textual data to or receiving it from wxWidgets.">wxString</a> s = <span class="stringliteral">&quot;hello&quot;</span>;</div>
<div class="line">wxString::const_iterator i;</div>
<div class="line"><span class="keywordflow">for</span> (i = s.<a class="code" href="classwx_string.html#ad59ca2dd208720b3cce07d90bcb90093">begin</a>(); i != s.<a class="code" href="classwx_string.html#a6a0f235fff88df5e6b16b5f0e1e719cc">end</a>(); ++i)</div>
<div class="line">{</div>
<div class="line">    <a class="code" href="classwx_uni_char.html" title="This class represents a single Unicode character.">wxUniChar</a> uni_ch = *i;</div>
<div class="line">    <span class="comment">// do something with it</span></div>
<div class="line">}</div>
</div><!-- fragment --><h1><a class="anchor" id="overview_string_related"></a>
String Related Functions and Classes</h1>
<p>As most programs use character strings, the standard C library provides quite a few functions to work with them. Unfortunately, some of them have rather counter-intuitive behaviour (like <code>strncpy()</code> which doesn't always terminate the resulting string with a <span class="literal">NULL</span>) and are in general not very safe (passing <span class="literal">NULL</span> to them will probably lead to program crash). Moreover, some very useful functions are not standard at all. This is why in addition to all <a class="el" href="classwx_string.html" title="String class for passing textual data to or receiving it from wxWidgets.">wxString</a> functions, there are also a few global string functions which try to correct these problems: <a class="el" href="group__group__funcmacro__crt.html#ga4d749baaa27c4c97d579733b0ac6a495">wxIsEmpty()</a> verifies whether the string is empty (returning <span class="literal">true</span> for <span class="literal">NULL</span> pointers), <a class="el" href="group__group__funcmacro__crt.html#ga8ee0fe62cfc16ac60a217e825dcf4ba5">wxStrlen()</a> also handles <span class="literal">NULL</span> correctly and returns 0 for them and wxStricmp() is just a platform-independent version of case-insensitive string comparison function known either as <code>stricmp()</code> or <code>strcasecmp()</code> on different platforms.</p>
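<p>The NULL-tolerant behaviour described above can be sketched in a few lines of standard C++. The names <code>IsEmptyCStr</code> and <code>SafeStrlen</code> are hypothetical and the bodies are not the wxWidgets implementations; they only illustrate the contract stated in the text.</p>

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helpers illustrating the described contract; these are not
// the wxWidgets implementations of wxIsEmpty() and wxStrlen().

// True for a null pointer as well as for an empty string.
inline bool IsEmptyCStr(const char* s)
{
    return s == nullptr || *s == '\0';
}

// Returns 0 for a null pointer instead of crashing like strlen() may.
inline std::size_t SafeStrlen(const char* s)
{
    return s ? std::strlen(s) : 0;
}
```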
<p>The <code>&lt;<a class="el" href="interface_2wx_2string_8h.html">wx/string.h</a>&gt;</code> header also defines the wxSnprintf() and wxVsnprintf() functions, which should be used instead of the inherently dangerous standard <code>sprintf()</code>; they use <code>snprintf()</code>, which does buffer size checks, whenever possible. Of course, you may also use <a class="el" href="classwx_string.html#a9588b7f2684b9a6a924dc3746a2b2f8d" title="Similar to the standard function sprintf().">wxString::Printf</a> which is also safe.</p>
<p>There is another class which might be useful when working with <a class="el" href="classwx_string.html" title="String class for passing textual data to or receiving it from wxWidgets.">wxString</a>: <a class="el" href="classwx_string_tokenizer.html" title="wxStringTokenizer helps you to break a string up into a number of tokens.">wxStringTokenizer</a>. It is helpful when a string must be broken into tokens and replaces the standard C library <code>strtok()</code> function.</p>
<p>And the very last string-related class is <a class="el" href="classwx_array_string.html" title="wxArrayString is an efficient container for storing wxString objects.">wxArrayString</a>: it is just a version of the "template" dynamic array class which is specialized to work with strings. Please note that this class is specially optimized (using its knowledge of the internal structure of <a class="el" href="classwx_string.html" title="String class for passing textual data to or receiving it from wxWidgets.">wxString</a>) for storing strings and so it is vastly better from a performance point of view than a wxObjectArray of wxStrings.</p>
<h1><a class="anchor" id="overview_string_tuning"></a>
Tuning wxString for Your Application</h1>
<dl class="section note"><dt>Note</dt><dd>This section is strictly about performance issues and is absolutely not necessary to read for using the <a class="el" href="classwx_string.html" title="String class for passing textual data to or receiving it from wxWidgets.">wxString</a> class. Please skip it unless you feel familiar with profilers and related tools.</dd></dl>
<p>For performance reasons <a class="el" href="classwx_string.html" title="String class for passing textual data to or receiving it from wxWidgets.">wxString</a> doesn't allocate exactly the amount of memory needed for each string. Instead, it adds a small amount of space to each allocated block, which allows it to avoid reallocating memory (a relatively expensive operation) too often, for example when a string is constructed by appending one character at a time, as in:</p>
<div class="fragment"><div class="line"><span class="comment">// delete all vowels from the string</span></div>
<div class="line"><a class="code" href="classwx_string.html" title="String class for passing textual data to or receiving it from wxWidgets.">wxString</a> DeleteAllVowels(<span class="keyword">const</span> <a class="code" href="classwx_string.html" title="String class for passing textual data to or receiving it from wxWidgets.">wxString</a>&amp; original)</div>
<div class="line">{</div>
<div class="line">    <a class="code" href="classwx_string.html" title="String class for passing textual data to or receiving it from wxWidgets.">wxString</a> vowels( <span class="stringliteral">&quot;aeuioAEIOU&quot;</span> );</div>
<div class="line">    <a class="code" href="classwx_string.html" title="String class for passing textual data to or receiving it from wxWidgets.">wxString</a> result;</div>
<div class="line">    wxString::const_iterator i;</div>
<div class="line">    <span class="keywordflow">for</span> ( i = original.<a class="code" href="classwx_string.html#ad59ca2dd208720b3cce07d90bcb90093">begin</a>(); i != original.<a class="code" href="classwx_string.html#a6a0f235fff88df5e6b16b5f0e1e719cc">end</a>(); ++i )</div>
<div class="line">    {</div>
<div class="line">        <span class="keywordflow">if</span> (vowels.Find( *i ) == <a class="code" href="defs_8h.html#a89de5e6353fc7812991b085e12263e98">wxNOT_FOUND</a>)</div>
<div class="line">            result += *i;</div>
<div class="line">    }</div>
<div class="line"></div>
<div class="line">    <span class="keywordflow">return</span> result;</div>
<div class="line">}</div>
</div><!-- fragment --><p>This is quite a common situation, and not allocating any extra memory at all would lead to very bad performance in this case because there would be as many memory (re)allocations as there are characters other than vowels in the original string. Allocating a lot of extra memory would improve the speed in this situation, but because a program typically uses a great number of <a class="el" href="classwx_string.html" title="String class for passing textual data to or receiving it from wxWidgets.">wxString</a> objects it would also increase memory consumption too much.</p>
<p>The very best solution in precisely this case would be to use the <a class="el" href="classwx_string.html#a87e614d9924a1b5524334aac3fc96d38" title="Preallocate enough space for wxString to store nLen characters.">wxString::Alloc()</a> function to preallocate, before the loop starts, space for as many characters as the original string contains. This leads to exactly one memory allocation being performed (because the result is at most as long as the original string).</p>
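<p>The preallocation idea can be sketched as follows. To keep the sketch compilable without wxWidgets, it uses <code>std::string</code> and <code>std::string::reserve()</code> as stand-ins for <a class="el" href="classwx_string.html" title="String class for passing textual data to or receiving it from wxWidgets.">wxString</a> and <code>wxString::Alloc()</code>; this substitution and the function name <code>DeleteAllVowelsPrealloc</code> are assumptions made for illustration only.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Stand-in for the wxString example above: std::string::reserve() plays the
// role of wxString::Alloc() here (an assumption made so that the sketch
// compiles without wxWidgets).
std::string DeleteAllVowelsPrealloc(const std::string& original)
{
    const std::string vowels("aeiouAEIOU");

    std::string result;
    // The result can never be longer than the input, so reserving room for
    // original.length() characters up front is enough for the whole loop.
    result.reserve(original.length());
    const std::size_t reserved = result.capacity();

    for (std::string::const_iterator i = original.begin(); i != original.end(); ++i)
    {
        if (vowels.find(*i) == std::string::npos)
            result += *i;
    }

    // Appending within the reserved capacity never regrows the buffer.
    assert(result.capacity() == reserved);
    return result;
}
```

<p>In the wxString version of the function, the corresponding call would be <code>result.Alloc(original.length())</code> placed before the loop.</p>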
<p>However, using <a class="el" href="classwx_string.html#a87e614d9924a1b5524334aac3fc96d38" title="Preallocate enough space for wxString to store nLen characters.">wxString::Alloc()</a> is tedious and so <a class="el" href="classwx_string.html" title="String class for passing textual data to or receiving it from wxWidgets.">wxString</a> tries to do its best by default. The default algorithm assumes that memory is allocated with a granularity of at least 16 bytes (which is the case on almost all widespread platforms), so nothing is lost if the amount of memory to allocate is rounded up to the next multiple of 16. This way no memory is wasted, and 15 iterations out of 16 in the example above don't allocate memory but use the already allocated space.</p>
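<p>The arithmetic behind the "15 out of 16" figure can be checked with a small model. This is only a sketch of the rounding rule described above, not the code in string.cpp: the helper names <code>RoundUpTo16</code> and <code>CountAllocations</code> are hypothetical, and the model ignores details such as the terminating NUL character and any per-string bookkeeping data.</p>

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: round a requested size up to the next multiple of 16.
std::size_t RoundUpTo16(std::size_t n)
{
    return (n + 15) / 16 * 16;
}

// Simplified model: count how many (re)allocations happen while nChars
// characters are appended one at a time to an initially empty buffer whose
// capacity is always rounded up to a multiple of 16.
std::size_t CountAllocations(std::size_t nChars)
{
    std::size_t capacity = 0;
    std::size_t allocations = 0;
    for (std::size_t len = 1; len <= nChars; ++len)
    {
        if (len > capacity)
        {
            // The new length does not fit: grow to the next multiple of 16.
            capacity = RoundUpTo16(len);
            ++allocations;
        }
    }
    return allocations;
}
```

<p>In this model, appending 160 characters one by one costs 10 allocations instead of 160, which is the one-in-sixteen ratio mentioned above.</p>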
<p>The default approach is quite conservative. Allocating more memory may bring significant performance benefits for programs using (relatively) few very long strings. The amount of memory allocated is configured by the value of <code>EXTRA_ALLOC</code> in the file string.cpp at compile time (be sure to understand why its default value is what it is before modifying it!). You may try setting it to a greater amount (say twice nLen) or to 0 (to see the performance degradation that follows) and analyse the impact on your program.</p>
<p>If you do so, you will probably find it helpful to also define the <code>WXSTRING_STATISTICS</code> symbol, which tells the <a class="el" href="classwx_string.html" title="String class for passing textual data to or receiving it from wxWidgets.">wxString</a> class to collect performance statistics and to print them on stderr at program termination. These show the average length of the strings your program manipulates, their average initial length and the percentage of string concatenations that used the already preallocated memory instead of reallocating. This value should be about 98% for the default allocation policy; if it is less than 90% you should really consider fine-tuning <a class="el" href="classwx_string.html" title="String class for passing textual data to or receiving it from wxWidgets.">wxString</a> for your application.</p>
<p>It goes without saying that a profiler should be used to measure the precise difference the change to <code>EXTRA_ALLOC</code> makes to your program.</p>
<h1><a class="anchor" id="overview_string_settings"></a>
wxString Related Compilation Settings</h1>
<p>The main option affecting <a class="el" href="classwx_string.html" title="String class for passing textual data to or receiving it from wxWidgets.">wxString</a> is <code>wxUSE_UNICODE</code>, which is now defined as <code>1</code> by default to indicate Unicode support. You may set it to 0 to disable Unicode support in <a class="el" href="classwx_string.html" title="String class for passing textual data to or receiving it from wxWidgets.">wxString</a> and elsewhere in wxWidgets, but this is <em>strongly</em> discouraged.</p>
<p>Another option affecting wxWidgets is <code>wxUSE_UNICODE_WCHAR</code> which is also 1 by default. You may want to set it to 0 and set <code>wxUSE_UNICODE_UTF8</code> to 1 instead to use UTF-8 internally. <a class="el" href="classwx_string.html" title="String class for passing textual data to or receiving it from wxWidgets.">wxString</a> still provides the same API in this case, but using UTF-8 has performance implications as explained in <a class="el" href="overview_unicode.html#overview_unicode_performance">Performance Implications of Using UTF-8</a>, so it probably shouldn't be enabled for legacy code which might contain a lot of index-using loops.</p>
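<p>To see why index-using loops are a concern with UTF-8, consider what locating the n-th character involves when characters occupy a variable number of bytes. The following is a generic illustration on a <code>std::string</code> holding valid UTF-8, not a description of how wxString implements indexing; the helper name <code>Utf8OffsetOfCodePoint</code> is hypothetical.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper, not a wxString function: return the byte offset of
// the n-th code point (0-based) in a valid UTF-8 string, or
// std::string::npos if the string contains fewer code points.
std::size_t Utf8OffsetOfCodePoint(const std::string& utf8, std::size_t n)
{
    std::size_t seen = 0;
    for (std::size_t pos = 0; pos < utf8.size(); ++pos)
    {
        // Bytes of the form 10xxxxxx continue the previous code point.
        const bool isContinuation =
            (static_cast<unsigned char>(utf8[pos]) & 0xC0) == 0x80;
        if (isContinuation)
            continue;

        if (seen == n)
            return pos;
        ++seen;
    }
    return std::string::npos;
}
```

<p>Each lookup in this sketch scans from the start of the string, so a loop calling it for every index takes time quadratic in the string length, whereas walking the string once from beginning to end visits each byte only once.</p>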
<p>See also <a class="el" href="page_wxusedef.html#page_wxusedef_important">Most Important Symbols</a> for a few other options affecting <a class="el" href="classwx_string.html" title="String class for passing textual data to or receiving it from wxWidgets.">wxString</a>. </p>
</div></div><!-- contents -->

<address class="footer">
	<small>
		Generated on Thu Nov 27 2014 13:46:42 for wxWidgets by <a href="http://www.doxygen.org/index.html" target="_new">Doxygen</a> 1.8.2
	</small>
</address>
<script src="wxwidgets.js" type="text/javascript"></script>
</div><!-- #page_container -->
</body>
</html>