<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE html
          PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
          "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">

<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
   <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />
   <meta name="AUTHOR" content="pme@gcc.gnu.org (Phil Edwards)" />
   <meta name="KEYWORDS" content="HOWTO, libstdc++, GCC, g++, libg++, STL" />
   <meta name="DESCRIPTION" content="HOWTO for the libstdc++ chapter 22." />
   <meta name="GENERATOR" content="vi and eight fingers" />
   <title>libstdc++-v3 HOWTO: Chapter 22: Localization</title>
<link rel="StyleSheet" href="../lib3styles.css" type="text/css" />
<link rel="Start" href="../documentation.html" type="text/html"
  title="GNU C++ Standard Library" />
<link rel="Prev" href="../21_strings/howto.html" type="text/html"
  title="Strings" />
<link rel="Next" href="../23_containers/howto.html" type="text/html"
  title="Containers" />
<link rel="Bookmark" href="locale.html" type="text/html" title="class locale" />
<link rel="Bookmark" href="codecvt.html" type="text/html" title="class codecvt" />
<link rel="Bookmark" href="ctype.html" type="text/html" title="class ctype" />
<link rel="Bookmark" href="messages.html" type="text/html" title="class messages" />
<link rel="Bookmark" href="http://www.research.att.com/~bs/3rd_loc0.html" type="text/html" title="Bjarne Stroustrup on Locales" />
<link rel="Bookmark" href="http://www.cantrip.org/locale.html" type="text/html" title="Nathan Myers on Locales" />
<link rel="Copyright" href="../17_intro/license.html" type="text/html" />
<link rel="Help" href="../faq/index.html" type="text/html" title="F.A.Q." />
</head>
<body>

<h1 class="centered"><a name="top">Chapter 22: Localization</a></h1>

<p>Chapter 22 deals with the C++ localization facilities.
</p>
<!-- I wanted to write that sentence in something requiring an exotic font,
     like Cyrillic or Kanji.  Probably more work than such cuteness is worth,
     but I still think it'd be funny.
 -->


<!-- ####################################################### -->
<hr />
<h1>Contents</h1>
<ul>
   <li><a href="#1">class locale</a></li>
   <li><a href="#2">class codecvt</a></li>
   <li><a href="#3">class ctype</a></li>
   <li><a href="#4">class messages</a></li>
   <li><a href="#5">Bjarne Stroustrup on Locales</a></li>
   <li><a href="#6">Nathan Myers on Locales</a></li>
   <li><a href="#7">Correct Transformations</a></li>
</ul>

<!-- ####################################################### -->

<hr />
<h2><a name="1">class locale</a></h2>
   <p>Notes made during the implementation of locales can be found
      <a href="locale.html">here</a>.
   </p>

<hr />
<h2><a name="2">class codecvt</a></h2>
   <p>Notes made during the implementation of codecvt can be found
      <a href="codecvt.html">here</a>.
   </p>

   <p>The following is the abstract from the implementation notes:
   </p>
   <blockquote>
   The standard class codecvt attempts to address conversions between
   different character encoding schemes. In particular, the standard
   attempts to detail conversions between the implementation-defined
   wide characters (hereafter referred to as wchar_t) and the standard
   type char that is so beloved in classic "C" (which can
   now be referred to as narrow characters.)  This document attempts
   to describe how the GNU libstdc++-v3 implementation deals with the
   conversion between wide and narrow characters, and also presents a
   framework for dealing with the huge number of other encodings that
   iconv can convert, including Unicode and UTF8. Design issues and
   requirements are addressed, and examples of correct usage for both
   the required specializations for wide and narrow characters and the
   implementation-provided extended functionality are given.
   </blockquote>
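   <p>As a rough illustration of the wide-to-narrow conversion the abstract
      refers to, the sketch below calls the standard
      <code>std::codecvt&lt;wchar_t, char, std::mbstate_t&gt;</code> facet
      directly. The helper name <code>narrow_with_codecvt</code> is
      hypothetical and is not taken from the implementation notes; the
      sketch assumes the classic locale and plain ASCII input, and it
      simply returns an empty string if the facet reports anything other
      than <code>ok</code>.
   </p>

```cpp
#include <cwchar>   // std::mbstate_t
#include <locale>
#include <string>

// Hypothetical helper (an assumption for illustration, not part of libstdc++):
// convert a wide string to a narrow one with the codecvt facet of loc.
// Returns an empty string if the conversion does not complete successfully.
std::string narrow_with_codecvt(const std::wstring& in,
                                const std::locale& loc = std::locale::classic())
{
  typedef std::codecvt<wchar_t, char, std::mbstate_t> facet_type;

  if (in.empty())
    return std::string();

  const facet_type& cvt = std::use_facet<facet_type>(loc);
  std::mbstate_t state = std::mbstate_t();   // initial conversion state

  // Reserve the worst case: max_length() narrow chars per wide char.
  std::string out(in.size() * cvt.max_length(), '\0');

  const wchar_t* from_next = 0;
  char* to_next = 0;
  const std::codecvt_base::result r =
    cvt.out(state,
            in.data(), in.data() + in.size(), from_next,
            &out[0], &out[0] + out.size(), to_next);

  if (r != std::codecvt_base::ok)
    return std::string();

  // Trim the buffer to the number of narrow chars actually written.
  out.resize(to_next - &out[0]);
  return out;
}
```

   <p>Calling <code>narrow_with_codecvt(L"hello")</code> should yield the
      narrow string <code>"hello"</code>. Input outside the basic character
      set depends on the locale passed in and is not covered by this sketch.
   </p>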

<hr />
<h2><a name="3">class ctype</a></h2>
   <p>Notes made during the implementation of ctype can be found
      <a href="ctype.html">here</a>.
   </p>

<hr />
<h2><a name="4">class messages</a></h2>
   <p>Notes made during the implementation of messages can be found
      <a href="messages.html">here</a>.
   </p>

<hr />
<h2><a name="5">Bjarne Stroustrup on Locales</a></h2>
   <p>Dr. Bjarne Stroustrup has released a
      <a href="http://www.research.att.com/~bs/3rd_loc0.html">pointer</a>
      to Appendix D of his book,
      <a href="http://www.research.att.com/~bs/3rd.html">The C++
      Programming Language (3rd Edition)</a>.  It is a detailed
      description of locales and how to use them.
   </p>
   <p>He also writes:
   </p>
   <blockquote><em>
      Please note that I still consider this detailed description of
      locales beyond the needs of most C++ programmers.  It is written
      with experienced programmers in mind and novices will do best to
      avoid it.
   </em></blockquote>

<hr />
<h2><a name="6">Nathan Myers on Locales</a></h2>
   <p>An article entitled "The Standard C++ Locale" was
      published in Dr. Dobb's Journal and can be found
      <a href="http://www.cantrip.org/locale.html">here</a>.
   </p>

<hr />
<h2><a name="7">Correct Transformations</a></h2>
   <!-- Jumping directly to here from chapter 21. -->
   <p>A very common question on newsgroups and mailing lists is, "How
      do I do &lt;foo&gt; to a character string?" where &lt;foo&gt; is
      a task such as changing all the letters to uppercase or to lowercase,
      testing for digits, etc.  A skilled and conscientious programmer
      will follow the question with another, "And how do I make the
      code portable?"
   </p>
   <p>(Poor innocent programmer, you have no idea the depths of trouble
      you are getting yourself into.  'Twould be best for your sanity if
      you dropped the whole idea and took up basket weaving instead.  No?
      Fine, you asked for it...)
   </p>
   <p>The task of changing the case of a letter or classifying a character
      as numeric, graphical, etc., depends on the cultural context of the
      program at runtime.  So, first you must take the portability question
      into account.  Only once you have localized the program to a particular
      natural language can you perform the specific task.
      Unfortunately, specializing a function for a human language is not
      as simple as declaring
      <code> extern "Danish" int tolower (int); </code>.
   </p>
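   <p>To make the locale dependence of classification concrete, here is a
      minimal sketch that counts the characters a given locale considers
      digits, using the two-argument <code>std::isdigit</code> from
      <code>&lt;locale&gt;</code>. The helper name
      <code>count_digits</code> is hypothetical and was chosen only for
      this illustration.
   </p>

```cpp
#include <cstddef>  // std::size_t
#include <locale>   // two-argument std::isdigit
#include <string>

// Hypothetical helper (an assumption for illustration): count the
// characters in s that the locale loc classifies as digits.
std::size_t count_digits(const std::string& s, const std::locale& loc)
{
  std::size_t n = 0;
  for (std::string::const_iterator i = s.begin(); i != s.end(); ++i)
    if (std::isdigit(*i, loc))   // consults the ctype<char> facet of loc
      ++n;
  return n;
}
```

   <p>With the classic locale, <code>count_digits("a1b22",
      std::locale::classic())</code> counts the three characters
      <code>1</code>, <code>2</code> and <code>2</code>. The locale is an
      explicit argument, so the same call can be repeated with a different
      locale object.
   </p>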
   <p>The C++ code to do all this proceeds in the same way.  First, a locale
      is created.  Then functions that consult that locale are called to
      perform the individual tasks.  Continuing the example from Chapter 21,
      we wish to use the following convenience functions:
   </p>
   <pre>
   namespace std {
     template &lt;class charT&gt;
       charT
       toupper (charT c, const locale&amp; loc);
     template &lt;class charT&gt;
       charT
       tolower (charT c, const locale&amp; loc);
   }</pre>
   <p>
      Each of these functions extracts the appropriate "facet" (here,
      <code>std::ctype&lt;charT&gt;</code>) from the
      locale <em>loc</em> and calls the appropriate member function of that
      facet, passing <em>c</em> as its argument.  The resulting character
      is returned.
   </p>
   <p>For the C/POSIX locale, the results are the same as calling the
      classic C <code>toupper/tolower</code> functions that were used in
      previous examples.  For other locales, the code should Do The Right
      Thing.
   </p>
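   <p>The facet lookup described above can be written out by hand. The
      following sketch is meant to behave like
      <code>std::toupper(c, loc)</code>; the name
      <code>upper_via_facet</code> is hypothetical and exists only to show
      the two steps (fetch the <code>std::ctype</code> facet, then call its
      <code>toupper</code> member).
   </p>

```cpp
#include <locale>

// Hypothetical helper (an assumption for illustration): the two steps
// that std::toupper(c, loc) performs, spelled out explicitly.
template <class charT>
charT upper_via_facet(charT c, const std::locale& loc)
{
  // Step 1: extract the character-classification facet from the locale.
  const std::ctype<charT>& facet = std::use_facet<std::ctype<charT> >(loc);
  // Step 2: ask the facet to do the conversion.
  return facet.toupper(c);
}
```

   <p>For the classic locale, <code>upper_via_facet('a',
      std::locale::classic())</code> and <code>std::toupper('a',
      std::locale::classic())</code> both give <code>'A'</code>. The same
      template works for <code>wchar_t</code>.
   </p>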
   <p>Of course, these functions take a second argument, while the
      operation passed to the <code>transform</code> algorithm is called
      with only a single parameter.  So we write simple wrapper structs
      that remember the locale and supply it on each call.
   </p>
   <p>The next-to-final version of the code started in Chapter 21 looks like:
   </p>
   <pre>
   #include &lt;iterator&gt;    // for back_inserter
   #include &lt;locale&gt;
   #include &lt;string&gt;
   #include &lt;algorithm&gt;
   #include &lt;cctype&gt;      // old &lt;ctype.h&gt;

   // Wraps the two-argument std::toupper so that it can be called
   // with a single character.
   struct ToUpper
   {
     ToUpper(std::locale const&amp; l) : loc(l) {}
     char operator() (char c) const  { return std::toupper(c,loc); }
   private:
     std::locale loc;   // held by value so a temporary locale cannot dangle
   };

   // The same wrapper for std::tolower.
   struct ToLower
   {
     ToLower(std::locale const&amp; l) : loc(l) {}
     char operator() (char c) const  { return std::tolower(c,loc); }
   private:
     std::locale loc;   // held by value so a temporary locale cannot dangle
   };

   int main ()
   {
      std::string  s("Some Kind Of Initial Input Goes Here");
      ToUpper      up(std::locale::classic());
      ToLower      down(std::locale::classic());

      // Change everything into upper case.
      std::transform(s.begin(), s.end(), s.begin(), up);

      // Change everything into lower case.
      std::transform(s.begin(), s.end(), s.begin(), down);

      // Change everything back into upper case, but store the
      // result in a different string.
      std::string  capital_s;
      std::transform(s.begin(), s.end(), std::back_inserter(capital_s), up);
   }</pre>
   <p>The <code>ToUpper</code> and <code>ToLower</code> structs can be
      generalized for other character types by making <code>operator()</code>
      a member function template.
   </p>
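   <p>One way that generalization might look is sketched below. The names
      <code>ToUpperAny</code> and <code>upper_copy</code> are hypothetical
      and are used only here; the wrapper keeps its own copy of the locale
      and lets the compiler deduce the character type on each call.
   </p>

```cpp
#include <algorithm>
#include <locale>
#include <string>

// Hypothetical generalized wrapper (an assumption for illustration):
// operator() is a member template, so one object serves char, wchar_t, etc.
struct ToUpperAny
{
  explicit ToUpperAny(std::locale const& l) : loc(l) {}

  template <class charT>
  charT operator() (charT c) const { return std::toupper(c, loc); }

private:
  std::locale loc;   // a copy; copying a locale is cheap
};

// Hypothetical convenience function: return an upper-case copy of any
// std::basic_string-like container using the given locale.
template <class String>
String upper_copy(String s, const std::locale& loc)
{
  std::transform(s.begin(), s.end(), s.begin(), ToUpperAny(loc));
  return s;
}
```

   <p>The same call then works for narrow and wide strings, for example
      <code>upper_copy(std::string("abc"), std::locale::classic())</code>
      and <code>upper_copy(std::wstring(L"abc"),
      std::locale::classic())</code>.
   </p>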
   <p>The final version of the code uses <code>bind2nd</code> to eliminate
      the wrapper structs, but the resulting code is tricky.  I have not
      shown it here because no compilers currently available to me will
      handle it.
   </p>
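   <p>The following is not the <code>bind2nd</code> version mentioned above
      (<code>bind2nd</code> was deprecated in C++11 and removed in C++17).
      It is an alternative sketch that assumes a C++11 or later compiler and
      uses a lambda to capture the locale, which also removes the need for
      the wrapper structs. The function name <code>to_upper_copy</code> is
      hypothetical.
   </p>

```cpp
#include <algorithm>
#include <locale>
#include <string>

// Hypothetical helper (an assumption for illustration; requires C++11):
// upper-case a string with a lambda instead of a hand-written functor.
std::string to_upper_copy(std::string s,
                          const std::locale& loc = std::locale::classic())
{
  // The lambda captures loc by reference; loc outlives the transform call.
  std::transform(s.begin(), s.end(), s.begin(),
                 [&loc](char c) { return std::toupper(c, loc); });
  return s;
}
```

   <p>For the classic locale,
      <code>to_upper_copy("Some Input")</code> should return
      <code>"SOME INPUT"</code>.
   </p>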


<!-- ####################################################### -->

<hr />
<p class="fineprint"><em>
See <a href="../17_intro/license.html">license.html</a> for copying conditions.
Comments and suggestions are welcome, and may be sent to
<a href="mailto:libstdc++@gcc.gnu.org">the libstdc++ mailing list</a>.
</em></p>


</body>
</html>