<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<link rel="stylesheet" href="highlight.min.css">
<script src="highlight.min.js"></script><script>
      hljs.configure({languages: ['cpp']});
      hljs.highlightAll();
    </script><title>Expecting UTF8</title>
<link rel="stylesheet" type="text/css" href="style.css">
<meta name="generator" content="DocBook XSL Stylesheets Vsnapshot">
<link rel="home" href="index.html" title="Programming with gtkmm 4">
<link rel="up" href="chapter-internationalization.html" title="Chapter 27. Internationalization and Localization">
<link rel="prev" href="sec-i18n-marking-strings.html" title="Marking strings for translation">
<link rel="next" href="sec-i18n-pitfalls.html" title="Pitfalls">
</head>
<body bgcolor="white" text="black" link="#0000FF" vlink="#840084" alink="#0000FF">
<div class="navheader">
<table width="100%" summary="Navigation header">
<tr><th colspan="3" align="center">Expecting UTF8</th></tr>
<tr>
<td width="20%" align="left">
<a accesskey="p" href="sec-i18n-marking-strings.html"><img src="icons/prev.png" alt="Prev"></a> </td>
<th width="60%" align="center">Chapter 27. Internationalization and Localization</th>
<td width="20%" align="right"> <a accesskey="n" href="sec-i18n-pitfalls.html"><img src="icons/next.png" alt="Next"></a>
</td>
</tr>
</table>
<hr>
</div>
<div class="section">
<div class="titlepage"><div><div><h2 class="title" style="clear: both">
<a name="sec-i18n-expecting-utf8"></a>Expecting UTF8</h2></div></div></div>


<p>
A properly internationalized application does not make assumptions about the
number of bytes in a character. That means that you shouldn't use pointer
arithmetic to step through the characters in a string, and it means you
shouldn't use <code class="classname">std::string</code> or standard C functions such
as <code class="function">strlen()</code> to count or index characters, because they
treat every byte as one character.
</p>
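<p>
As a minimal illustration of why byte counts and character counts differ, the
following sketch counts the characters in a UTF-8 string by skipping
continuation bytes. The helper <code class="function">utf8_length()</code> is
hypothetical and is not part of <span class="application">gtkmm</span> or glibmm,
and it assumes that its input is valid UTF-8.
</p>
<pre class="programlisting"><code class="code">#include &lt;cstddef&gt;
#include &lt;cstring&gt;
#include &lt;string&gt;

// Hypothetical helper, for illustration only.
// Counts UTF-8 encoded characters by counting every byte that is not a
// continuation byte (continuation bytes have the bit pattern 10xxxxxx).
// Assumes that the input is valid UTF-8.
std::size_t utf8_length(const std::string&amp; text)
{
  std::size_t count = 0;
  for (unsigned char byte : text)
  {
    if ((byte &amp; 0xC0) != 0x80)
      ++count;
  }
  return count;
}</code></pre>
<p>
For the UTF-8 encoded word "Grüße", written as
<code class="literal">"Gr\xC3\xBC\xC3\x9F" "e"</code> in source code,
<code class="function">strlen()</code> reports 7 bytes, whereas
<code class="function">utf8_length()</code> reports 5 characters.
</p>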
<p>
However, you probably already avoid bare char* arrays and pointer arithmetic by
using <code class="classname">std::string</code>, so you just need to start using
<code class="classname">Glib::ustring</code> instead. See the <a class="link" href="sec-basics-ustring.html" title="Glib::ustring">Basics</a> chapter about
<code class="classname">Glib::ustring</code>.
</p>

<div class="section">
<div class="titlepage"><div><div><h3 class="title">
<a name="i18n-ustring-iostreams"></a>Glib::ustring and std::iostreams</h3></div></div></div>



<p>
Unfortunately, the integration with the standard iostreams is not completely
foolproof. <span class="application">gtkmm</span> converts <code class="classname">Glib::ustring</code>s to the
encoding of the current locale (which is not necessarily UTF-8) when you output them to an
<code class="classname">ostream</code> with <code class="function">operator&lt;&lt;</code>.
Likewise, retrieving <code class="classname">Glib::ustring</code>s from an
<code class="classname">istream</code> with <code class="function">operator&gt;&gt;</code>
causes a conversion in the opposite direction. But this scheme breaks down if
you go through a <code class="classname">std::string</code>, e.g. by reading text
from a stream into a <code class="classname">std::string</code> and then implicitly
converting it to a <code class="classname">Glib::ustring</code>. The implicit
conversion performs no re-encoding, so if the string
contains non-ASCII characters and the current locale is not UTF-8 encoded, the
result is a corrupted <code class="classname">Glib::ustring</code>. You can work around
this with a manual conversion. For instance, to convert the
<code class="classname">std::string</code> held by an <code class="classname">ostringstream</code>
to UTF-8 before passing it to a widget:
</p>
<pre class="programlisting"><code class="code">std::locale::global(std::locale("")); // Set the global locale to the user's preferred locale.
                                      // Usually unnecessary here, because Glib::init()
                                      // or Gtk::Application::create() does it for you.
std::ostringstream output;
output &lt;&lt; percentage &lt;&lt; " % done";
label-&gt;set_text(Glib::locale_to_utf8(output.str()));</code></pre>
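<p>
The input direction can be handled in the same way. The following sketch
assumes a hypothetical helper named
<code class="function">read_line_as_utf8()</code>. It reads one line into a
<code class="classname">std::string</code> and converts it explicitly with
<code class="function">Glib::locale_to_utf8()</code> instead of relying on the
implicit conversion.
</p>
<pre class="programlisting"><code class="code">#include &lt;glibmm.h&gt;
#include &lt;istream&gt;
#include &lt;string&gt;

// Hypothetical helper, for illustration only.
// Reads one line in the encoding of the current locale and returns it as UTF-8.
Glib::ustring read_line_as_utf8(std::istream&amp; input)
{
  std::string raw_line;
  std::getline(input, raw_line);          // Bytes in the locale's encoding.
  return Glib::locale_to_utf8(raw_line);  // Explicit conversion to UTF-8.
}</code></pre>
<p>
<code class="function">Glib::locale_to_utf8()</code> throws a
<code class="classname">Glib::ConvertError</code> if the bytes are not valid in
the locale's encoding, so callers that read untrusted input may want to catch it.
</p>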
</div>

</div>
<div class="navfooter">
<hr>
<table width="100%" summary="Navigation footer">
<tr>
<td width="40%" align="left">
<a accesskey="p" href="sec-i18n-marking-strings.html"><img src="icons/prev.png" alt="Prev"></a> </td>
<td width="20%" align="center"><a accesskey="u" href="chapter-internationalization.html"><img src="icons/up.png" alt="Up"></a></td>
<td width="40%" align="right"> <a accesskey="n" href="sec-i18n-pitfalls.html"><img src="icons/next.png" alt="Next"></a>
</td>
</tr>
<tr>
<td width="40%" align="left" valign="top">Marking strings for translation </td>
<td width="20%" align="center"><a accesskey="h" href="index.html"><img src="icons/home.png" alt="Home"></a></td>
<td width="40%" align="right" valign="top"> Pitfalls</td>
</tr>
</table>
</div>
</body>
</html>