Contributing
The GNU C++ Library follows an open development model. Active
|
26 |
|
|
contributors are assigned maintainership responsibility and given
|
27 |
|
|
write access to the source repository. First-time contributors
|
28 |
|
|
should follow this procedure:
|
29 |
|
|
|
30 |
|
|
|
31 |
|
|
|
32 |
|
|
|
33 |
|
|
|
34 |
|
|
|
35 |
|
|
|
36 |
|
|
|
37 |
|
|
|
38 |
|
|
|
39 |
|
|
|
40 |
|
|
Get and read the relevant sections of the C++ language
|
41 |
|
|
specification. Copies of the full ISO 14882 standard are
|
42 |
|
|
available online via the ISO mirror site for committee
|
43 |
|
|
members. Non-members, or those who have not paid for the
|
44 |
|
|
privilege of sitting on the committee and sustained their
|
45 |
|
|
two-meeting commitment for voting rights, may get a copy of
|
46 |
|
|
the standard from their respective national standards
|
47 |
|
|
organization. In the USA, this national standards
|
48 |
|
|
organization is
|
49 |
|
|
ANSI.
|
50 |
|
|
(And if you've already registered with them you can
|
51 |
|
|
buy the standard online.)
|
52 |
|
|
|
53 |
|
|
|
54 |
|
|
|
55 |
|
|
|
56 |
|
|
|
57 |
|
|
The library working group's open issues and known defects can
|
58 |
|
|
be obtained here:
|
59 |
|
|
http://www.open-std.org/jtc1/sc22/wg21
|
60 |
|
|
|
61 |
|
|
|
62 |
|
|
|
63 |
|
|
|
64 |
|
|
|
65 |
|
|
The newsgroup dedicated to standardization issues is
|
66 |
|
|
comp.std.c++: the
|
67 |
|
|
FAQ
|
68 |
|
|
for this group is quite useful.
|
69 |
|
|
|
70 |
|
|
|
71 |
|
|
|
72 |
|
|
|
73 |
|
|
|
74 |
|
|
Peruse
|
75 |
|
|
the GNU
|
76 |
|
|
Coding Standards, and chuckle when you hit the part
|
77 |
|
|
about Using Languages Other Than C .
|
78 |
|
|
|
79 |
|
|
|
80 |
|
|
|
81 |
|
|
|
82 |
|
|
|
83 |
|
|
Be familiar with the extensions that preceded these
|
84 |
|
|
general GNU rules. These style issues for libstdc++ can be
|
85 |
|
|
found in Coding Style.
|
86 |
|
|
|
87 |
|
|
|
88 |
|
|
|
89 |
|
|
|
90 |
|
|
|
91 |
|
|
And last but certainly not least, read the
|
92 |
|
|
library-specific information found in
|
93 |
|
|
Porting and Maintenance.
|
94 |
|
|
|
95 |
|
|
|
96 |
|
|
|
97 |
|
|
|
98 |
|
|
|
99 |
|
|
|
100 |
|
|
|
101 |
|
|
|
102 |
|
|
Small changes can be accepted without a copyright assignment form on
|
103 |
|
|
file. New code and additions to the library need a completed copyright
|
104 |
|
|
assignment form on file at the FSF. Note: your employer may be required
|
105 |
|
|
to fill out appropriate disclaimer forms as well.
|
106 |
|
|
|
107 |
|
|
|
108 |
|
|
|
109 |
|
|
Historically, the libstdc++ assignment form added the following
|
110 |
|
|
question:
|
111 |
|
|
|
112 |
|
|
|
113 |
|
|
|
114 |
|
|
|
115 |
|
|
Which Belgian comic book character is better, Tintin or Asterix, and
|
116 |
|
|
why?
|
117 |
|
|
|
118 |
|
|
|
119 |
|
|
|
120 |
|
|
|
121 |
|
|
While not strictly necessary, humoring the maintainers and answering
|
122 |
|
|
this question would be appreciated.
|
123 |
|
|
|
124 |
|
|
|
125 |
|
|
|
126 |
|
|
For more information about getting a copyright assignment, please see
|
127 |
|
|
Legal
|
128 |
|
|
Matters.
|
129 |
|
|
|
130 |
|
|
|
131 |
|
|
|
132 |
|
|
Please contact Benjamin Kosnik at
|
133 |
|
|
bkoz+assign@redhat.com if you are confused
|
134 |
|
|
about the assignment or have general licensing questions. When
|
135 |
|
|
requesting an assignment form from
|
136 |
|
|
mailto:assign@gnu.org, please cc the libstdc++
|
137 |
|
|
maintainer above so that progress can be monitored.
|
138 |
|
|
|
139 |
|
|
|
140 |
|
|
|
141 |
|
|
|
142 |
|
|
|
143 |
|
|
|
144 |
|
|
Getting write access
|
145 |
|
|
(look for "Write after approval")
|
146 |
|
|
|
147 |
|
|
|
148 |
|
|
|
149 |
|
|
|
150 |
|
|
|
151 |
|
|
|
152 |
|
|
|
153 |
|
|
Every patch must have several pieces of information before it can be
|
154 |
|
|
properly evaluated. Ideally (and to ensure the fastest possible
|
155 |
|
|
response from the maintainers) it would have all of these pieces:
|
156 |
|
|
|
157 |
|
|
|
158 |
|
|
|
159 |
|
|
|
160 |
|
|
|
161 |
|
|
A description of the bug and how your patch fixes this
|
162 |
|
|
bug. For new features a description of the feature and your
|
163 |
|
|
implementation.
|
164 |
|
|
|
165 |
|
|
|
166 |
|
|
|
167 |
|
|
|
168 |
|
|
|
169 |
|
|
A ChangeLog entry as plain text; see the various
|
170 |
|
|
ChangeLog files for format and content. If you are
|
171 |
|
|
using emacs as your editor, simply position the insertion
|
172 |
|
|
point at the beginning of your change and hit C-x 4 a to bring
|
173 |
|
|
up the appropriate ChangeLog entry. See--magic! Similar
|
174 |
|
|
functionality also exists for vi.
|
175 |
|
|
|
176 |
|
|
|
177 |
|
|
|
178 |
|
|
|
179 |
|
|
|
180 |
|
|
A testsuite submission or sample program that will
|
181 |
|
|
easily and simply show the existing error or test new
|
182 |
|
|
functionality.
|
183 |
|
|
|
184 |
|
|
|
185 |
|
|
|
186 |
|
|
|
187 |
|
|
|
188 |
|
|
The patch itself. If you are accessing the SVN
|
189 |
|
|
repository use svn update; svn diff NEW;
|
190 |
|
|
else, use diff -cp OLD NEW ... If your
|
191 |
|
|
version of diff does not support these options, then get the
|
192 |
|
|
latest version of GNU
|
193 |
|
|
diff. The SVN
|
194 |
|
|
Tricks wiki page has information on customising the
|
195 |
|
|
output of svn diff .
|
196 |
|
|
|
197 |
|
|
|
198 |
|
|
|
199 |
|
|
|
200 |
|
|
|
201 |
|
|
When you have all these pieces, bundle them up in a
|
202 |
|
|
mail message and send it to libstdc++@gcc.gnu.org. All
|
203 |
|
|
patches and related discussion should be sent to the
|
204 |
|
|
libstdc++ mailing list.
|
205 |
|
|
|
206 |
|
|
|
207 |
|
|
|
208 |
|
|
|
209 |
|
|
|
210 |
|
|
|
211 |
|
|
|
212 |
|
|
|
213 |
|
|
Directory Layout and Source Conventions
|
214 |
|
|
|
215 |
|
|
|
216 |
|
|
|
217 |
|
|
|
218 |
|
|
The unpacked source directory of libstdc++ contains the files
|
219 |
|
|
needed to create the GNU C++ Library.
|
220 |
|
|
|
221 |
|
|
|
222 |
|
|
|
223 |
|
|
It has subdirectories:
|
224 |
|
|
|
225 |
|
|
doc
|
226 |
|
|
Files in HTML and text format that document usage, quirks of the
|
227 |
|
|
implementation, and contributor checklists.
|
228 |
|
|
|
229 |
|
|
include
|
230 |
|
|
All header files for the C++ library are within this directory,
|
231 |
|
|
modulo specific runtime-related files that are in the libsupc++
|
232 |
|
|
directory.
|
233 |
|
|
|
234 |
|
|
include/std
|
235 |
|
|
Files meant to be found by #include <name> directives in
|
236 |
|
|
standard-conforming user programs.
|
237 |
|
|
|
238 |
|
|
include/c
|
239 |
|
|
Headers intended to directly include standard C headers.
|
240 |
|
|
[NB: this can be enabled via --enable-cheaders=c]
|
241 |
|
|
|
242 |
|
|
include/c_global
|
243 |
|
|
Headers intended to include standard C headers in
|
244 |
|
|
the global namespace, and put select names into the std::
|
245 |
|
|
namespace. [NB: this is the default, and is the same as
|
246 |
|
|
--enable-cheaders=c_global]
|
247 |
|
|
|
248 |
|
|
include/c_std
|
249 |
|
|
Headers intended to include standard C headers
|
250 |
|
|
already in namespace std, and put select names into the std::
|
251 |
|
|
namespace. [NB: this is the same as --enable-cheaders=c_std]
|
252 |
|
|
|
253 |
|
|
include/bits
|
254 |
|
|
Files included by standard headers and by other files in
|
255 |
|
|
the bits directory.
|
256 |
|
|
|
257 |
|
|
include/backward
|
258 |
|
|
Headers provided for backward compatibility, such as <iostream.h>.
|
259 |
|
|
They are not used in this library.
|
260 |
|
|
|
261 |
|
|
include/ext
|
262 |
|
|
Headers that define extensions to the standard library. No
|
263 |
|
|
standard header refers to any of them.
|
264 |
|
|
|
265 |
|
|
scripts
|
266 |
|
|
Scripts that are used during the configure, build, make, or test
|
267 |
|
|
process.
|
268 |
|
|
|
269 |
|
|
src
|
270 |
|
|
Files that are used in constructing the library, but are not
|
271 |
|
|
installed.
|
272 |
|
|
|
273 |
|
|
testsuites/[backward, demangle, ext, performance, thread, 17_* to 30_*]
|
274 |
|
|
Test programs are here, and may be used to begin to exercise the
|
275 |
|
|
library. Support for "make check" and "make check-install" is
|
276 |
|
|
complete, and runs through all the subdirectories here when this
|
277 |
|
|
command is issued from the build directory. Please note that
|
278 |
|
|
"make check" requires DejaGNU 1.4 or later to be installed. Please
|
279 |
|
|
note that "make check-script" calls the script mkcheck, which
|
280 |
|
|
requires bash, and which may need the paths to bash adjusted to
|
281 |
|
|
work properly, as /bin/bash is assumed.
|
282 |
|
|
|
283 |
|
|
Other subdirectories contain variant versions of certain files
|
284 |
|
|
that are meant to be copied or linked by the configure script.
|
285 |
|
|
Currently these are:
|
286 |
|
|
|
287 |
|
|
config/abi
|
288 |
|
|
config/cpu
|
289 |
|
|
config/io
|
290 |
|
|
config/locale
|
291 |
|
|
config/os
|
292 |
|
|
|
293 |
|
|
In addition, a subdirectory holds the convenience library libsupc++.
|
294 |
|
|
|
295 |
|
|
libsupc++
|
296 |
|
|
Contains the runtime library for C++, including exception
|
297 |
|
|
handling and memory allocation and deallocation, RTTI, terminate
|
298 |
|
|
handlers, etc.
|
299 |
|
|
|
300 |
|
|
Note that glibc also has a bits/ subdirectory. We will either
|
301 |
|
|
need to be careful not to collide with names in its bits/
|
302 |
|
|
directory; or rename bits to (e.g.) cppbits/.
|
303 |
|
|
|
304 |
|
|
In files throughout the system, lines marked with an "XXX" indicate
|
305 |
|
|
a bug or incompletely-implemented feature. Lines marked "XXX MT"
|
306 |
|
|
indicate a place that may require attention for multi-thread safety.
|
307 |
|
|
|
308 |
|
|
|
309 |
|
|
|
310 |
|
|
|
311 |
|
|
|
312 |
|
|
|
313 |
|
|
|
314 |
|
|
|
315 |
|
|
|
316 |
|
|
|
317 |
|
|
|
318 |
|
|
|
319 |
|
|
Identifiers that conflict and should be avoided.
|
320 |
|
|
|
321 |
|
|
|
322 |
|
|
|
323 |
|
|
This is the list of names reserved to the
|
324 |
|
|
implementation that have been claimed by certain
|
325 |
|
|
compilers and system headers of interest, and should not be used
|
326 |
|
|
in the library. It will grow, of course. We generally are
|
327 |
|
|
interested in names that are not all-caps, except for those like
|
328 |
|
|
"_T"
|
329 |
|
|
|
330 |
|
|
For Solaris:
|
331 |
|
|
_B
|
332 |
|
|
_C
|
333 |
|
|
_L
|
334 |
|
|
_N
|
335 |
|
|
_P
|
336 |
|
|
_S
|
337 |
|
|
_U
|
338 |
|
|
_X
|
339 |
|
|
_E1
|
340 |
|
|
..
|
341 |
|
|
_E24
|
342 |
|
|
|
343 |
|
|
Irix adds:
|
344 |
|
|
_A
|
345 |
|
|
_G
|
346 |
|
|
|
347 |
|
|
MS adds:
|
348 |
|
|
_T
|
349 |
|
|
|
350 |
|
|
BSD adds:
|
351 |
|
|
__used
|
352 |
|
|
__unused
|
353 |
|
|
__inline
|
354 |
|
|
_Complex
|
355 |
|
|
__istype
|
356 |
|
|
__maskrune
|
357 |
|
|
__tolower
|
358 |
|
|
__toupper
|
359 |
|
|
__wchar_t
|
360 |
|
|
__wint_t
|
361 |
|
|
_res
|
362 |
|
|
_res_ext
|
363 |
|
|
__tg_*
|
364 |
|
|
|
365 |
|
|
SPU adds:
|
366 |
|
|
__ea
|
367 |
|
|
|
368 |
|
|
For GCC:
|
369 |
|
|
|
370 |
|
|
[Note that this list is out of date. It applies to the old
|
371 |
|
|
name-mangling; in G++ 3.0 and higher a different name-mangling is
|
372 |
|
|
used. In addition, many of the bugs relating to G++ interpreting
|
373 |
|
|
these names as operators have been fixed.]
|
374 |
|
|
|
375 |
|
|
The full set of __* identifiers (combined from gcc/cp/lex.c and
|
376 |
|
|
gcc/cplus-dem.c) that are either old or new, but are definitely
|
377 |
|
|
recognized by the demangler, is:
|
378 |
|
|
|
379 |
|
|
__aa
|
380 |
|
|
__aad
|
381 |
|
|
__ad
|
382 |
|
|
__addr
|
383 |
|
|
__adv
|
384 |
|
|
__aer
|
385 |
|
|
__als
|
386 |
|
|
__alshift
|
387 |
|
|
__amd
|
388 |
|
|
__ami
|
389 |
|
|
__aml
|
390 |
|
|
__amu
|
391 |
|
|
__aor
|
392 |
|
|
__apl
|
393 |
|
|
__array
|
394 |
|
|
__ars
|
395 |
|
|
__arshift
|
396 |
|
|
__as
|
397 |
|
|
__bit_and
|
398 |
|
|
__bit_ior
|
399 |
|
|
__bit_not
|
400 |
|
|
__bit_xor
|
401 |
|
|
__call
|
402 |
|
|
__cl
|
403 |
|
|
__cm
|
404 |
|
|
__cn
|
405 |
|
|
__co
|
406 |
|
|
__component
|
407 |
|
|
__compound
|
408 |
|
|
__cond
|
409 |
|
|
__convert
|
410 |
|
|
__delete
|
411 |
|
|
__dl
|
412 |
|
|
__dv
|
413 |
|
|
__eq
|
414 |
|
|
__er
|
415 |
|
|
__ge
|
416 |
|
|
__gt
|
417 |
|
|
__indirect
|
418 |
|
|
__le
|
419 |
|
|
__ls
|
420 |
|
|
__lt
|
421 |
|
|
__max
|
422 |
|
|
__md
|
423 |
|
|
__method_call
|
424 |
|
|
__mi
|
425 |
|
|
__min
|
426 |
|
|
__minus
|
427 |
|
|
__ml
|
428 |
|
|
__mm
|
429 |
|
|
__mn
|
430 |
|
|
__mult
|
431 |
|
|
__mx
|
432 |
|
|
__ne
|
433 |
|
|
__negate
|
434 |
|
|
__new
|
435 |
|
|
__nop
|
436 |
|
|
__nt
|
437 |
|
|
__nw
|
438 |
|
|
__oo
|
439 |
|
|
__op
|
440 |
|
|
__or
|
441 |
|
|
__pl
|
442 |
|
|
__plus
|
443 |
|
|
__postdecrement
|
444 |
|
|
__postincrement
|
445 |
|
|
__pp
|
446 |
|
|
__pt
|
447 |
|
|
__rf
|
448 |
|
|
__rm
|
449 |
|
|
__rs
|
450 |
|
|
__sz
|
451 |
|
|
__trunc_div
|
452 |
|
|
__trunc_mod
|
453 |
|
|
__truth_andif
|
454 |
|
|
__truth_not
|
455 |
|
|
__truth_orif
|
456 |
|
|
__vc
|
457 |
|
|
__vd
|
458 |
|
|
__vn
|
459 |
|
|
|
460 |
|
|
SGI badnames:
|
461 |
|
|
__builtin_alloca
|
462 |
|
|
__builtin_fsqrt
|
463 |
|
|
__builtin_sqrt
|
464 |
|
|
__builtin_fabs
|
465 |
|
|
__builtin_dabs
|
466 |
|
|
__builtin_cast_f2i
|
467 |
|
|
__builtin_cast_i2f
|
468 |
|
|
__builtin_cast_d2ll
|
469 |
|
|
__builtin_cast_ll2d
|
470 |
|
|
__builtin_copy_dhi2i
|
471 |
|
|
__builtin_copy_i2dhi
|
472 |
|
|
__builtin_copy_dlo2i
|
473 |
|
|
__builtin_copy_i2dlo
|
474 |
|
|
__add_and_fetch
|
475 |
|
|
__sub_and_fetch
|
476 |
|
|
__or_and_fetch
|
477 |
|
|
__xor_and_fetch
|
478 |
|
|
__and_and_fetch
|
479 |
|
|
__nand_and_fetch
|
480 |
|
|
__mpy_and_fetch
|
481 |
|
|
__min_and_fetch
|
482 |
|
|
__max_and_fetch
|
483 |
|
|
__fetch_and_add
|
484 |
|
|
__fetch_and_sub
|
485 |
|
|
__fetch_and_or
|
486 |
|
|
__fetch_and_xor
|
487 |
|
|
__fetch_and_and
|
488 |
|
|
__fetch_and_nand
|
489 |
|
|
__fetch_and_mpy
|
490 |
|
|
__fetch_and_min
|
491 |
|
|
__fetch_and_max
|
492 |
|
|
__lock_test_and_set
|
493 |
|
|
__lock_release
|
494 |
|
|
__lock_acquire
|
495 |
|
|
__compare_and_swap
|
496 |
|
|
__synchronize
|
497 |
|
|
__high_multiply
|
498 |
|
|
__unix
|
499 |
|
|
__sgi
|
500 |
|
|
__linux__
|
501 |
|
|
__i386__
|
502 |
|
|
__i486__
|
503 |
|
|
__cplusplus
|
504 |
|
|
__embedded_cplusplus
|
505 |
|
|
// long double conversion members mangled as __opr
|
506 |
|
|
// http://gcc.gnu.org/ml/libstdc++/1999-q4/msg00060.html
|
507 |
|
|
__opr
|
508 |
|
|
|
509 |
|
|
|
510 |
|
|
|
511 |
|
|
|
512 |
|
|
|
513 |
|
|
|
514 |
|
|
This library is written to appropriate C++ coding standards. As such,
|
515 |
|
|
it is intended to take precedence over the recommendations of the GNU Coding
|
516 |
|
|
Standard, which can be referenced in full here:
|
517 |
|
|
|
518 |
|
|
http://www.gnu.org/prep/standards/standards.html#Formatting
|
519 |
|
|
|
520 |
|
|
The rest of this is also interesting reading, but skip the "Design
|
521 |
|
|
Advice" part.
|
522 |
|
|
|
523 |
|
|
The GCC coding conventions are here, and are also useful:
|
524 |
|
|
http://gcc.gnu.org/codingconventions.html
|
525 |
|
|
|
526 |
|
|
In addition, because it doesn't seem to be stated explicitly anywhere
|
527 |
|
|
else, there is an 80 column source limit.
|
528 |
|
|
|
529 |
|
|
ChangeLog entries for member functions should use the
|
530 |
|
|
classname::member function name syntax as follows:
|
531 |
|
|
|
532 |
|
|
|
533 |
|
|
1999-04-15 Dennis Ritchie <dr@att.com>
|
534 |
|
|
|
535 |
|
|
* src/basic_file.cc (__basic_file::open): Fix thinko in
|
536 |
|
|
_G_HAVE_IO_FILE_OPEN bits.
|
537 |
|
|
|
538 |
|
|
|
539 |
|
|
Notable areas of divergence from what may be previous local practice
|
540 |
|
|
(particularly for GNU C) include:
|
541 |
|
|
|
542 |
|
|
01. Pointers and references

  const char* p = "flop";
  const char& c = *p;
     -NOT-
  const char *p = "flop";  // wrong
  const char &c = *p;      // wrong

  Reason: In C++, definitions are mixed with executable code. Here,
  p is being initialized, not *p. This is near-universal
  practice among C++ programmers; it is normal for C hackers
  to switch spontaneously as they gain experience.
|
555 |
|
|
|
556 |
|
|
02. Operator names and parentheses

  operator==(type)
     -NOT-
  operator == (type)  // wrong

  Reason: The == is part of the function name. Separating
  it makes the declaration look like an expression.
|
565 |
|
|
|
566 |
|
|
03. Function names and parentheses

  void mangle()
     -NOT-
  void mangle ()  // wrong

  Reason: no space before parentheses (except after a control-flow
  keyword) is near-universal practice for C++. It identifies the
  parentheses as the function-call operator or declarator, as
  opposed to an expression or other overloaded use of parentheses.
|
577 |
|
|
|
578 |
|
|
04. Template function indentation

  template<typename T>
    void
    template_function(args)
    { }
     -NOT-
  template<class T>
  void template_function(args) {};

  Reason: In class definitions, without indentation whitespace is
  needed both above and below the declaration to distinguish
  it visually from other members. (Also, re: "typename"
  rather than "class".) T often could be int, which is
  not a class. ("class", here, is an anachronism.)
|
594 |
|
|
|
595 |
|
|
05. Template class indentation

  template<typename _CharT, typename _Traits>
    class basic_ios : public ios_base
    {
    public:
      // Types:
    };
     -NOT-
  template<class _CharT, class _Traits>
  class basic_ios : public ios_base
    {
    public:
      // Types:
    };
     -NOT-
  template<class _CharT, class _Traits>
    class basic_ios : public ios_base
  {
  public:
    // Types:
  };
|
617 |
|
|
|
618 |
|
|
|
619 |
|
|
06. Enumerators

  enum
  {
    space = _ISspace,
    print = _ISprint,
    cntrl = _IScntrl
  };
     -NOT-
  enum { space = _ISspace, print = _ISprint, cntrl = _IScntrl };
|
629 |
|
|
|
630 |
|
|
|
631 |
|
|
07. Member initialization lists
  All one line, separate from class name.

  gribble::gribble()
  : _M_private_data(0), _M_more_stuff(0), _M_helper(0)
  { }
     -NOT-
  gribble::gribble() : _M_private_data(0), _M_more_stuff(0), _M_helper(0)
  { }
|
641 |
|
|
|
642 |
|
|
|
643 |
|
|
08. Try/Catch blocks

  try
    {
      //
    }
  catch (...)
    {
      //
    }
     -NOT-
  try {
    //
  } catch(...) {
    //
  }
|
659 |
|
|
|
660 |
|
|
|
661 |
|
|
09. Member function declarations and definitions
  Keywords such as extern, static, export, explicit, inline, etc.
  go on the line above the function name. Thus

  virtual int
  foo()
     -NOT-
  virtual int foo()

  Reason: GNU coding conventions dictate that, for definitions, the
  return type goes on a separate line from the function name and
  parameter list. For C++, where we have member functions that can
  be either inline definitions or declarations, keeping to this
  standard allows all member function names for a given class to be
  aligned to the same margin, increasing readability.
|
678 |
|
|
|
679 |
|
|
|
680 |
|
|
10. Invocation of member functions with "this->"
  For non-uglified names, use this->name to call the function.

  this->sync()
     -NOT-
  sync()

  Reason: Koenig lookup.
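
The reason given for rule 10 is terse. The following is a minimal sketch of one situation where the explicit this-> changes the outcome, assuming the caller is a class template with a dependent base. An unqualified call is not looked up in a dependent base, so it can bind to a namespace-scope function (found by ordinary or argument-dependent lookup) instead of the inherited member. The names sketch::buf, sketch::filebuf and the free function sketch::sync are hypothetical and are not library types.

```cpp
#include <cassert>

namespace sketch
{
  // Hypothetical base template, standing in for something like a
  // stream buffer.
  template<typename CharT>
    struct buf
    {
      int
      sync()
      { return 0; }
    };

  // An unrelated free function that happens to share the name.
  inline int
  sync()
  { return -1; }

  template<typename CharT>
    struct filebuf : public buf<CharT>
    {
      int
      flush()
      {
        // Looked up in the dependent base when the template is
        // instantiated. A plain "sync()" would not search buf<CharT>
        // and would bind to the free function above instead.
        return this->sync();
      }
    };

  // Usage: the inherited member function is the one that runs.
  inline void
  check_flush()
  {
    filebuf<char> fb;
    assert(fb.flush() == 0);
  }
}
```

Writing this->sync() makes the call a dependent name, so the choice of function no longer depends on what else happens to be visible at namespace scope.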
|
690 |
|
|
|
691 |
|
|
11. Namespaces

  namespace std
  {
    blah blah blah;
  } // namespace std

     -NOT-

  namespace std {
    blah blah blah;
  } // namespace std
|
703 |
|
|
|
704 |
|
|
|
705 |
|
|
12. Spacing under protected and private in class declarations:
  space above, none below
  i.e.

  public:
    int foo;

     -NOT-
  public:

    int foo;
|
717 |
|
|
|
718 |
|
|
|
719 |
|
|
13. Spacing WRT return statements.
  no extra spacing before returns, no parentheses
  i.e.

    }
  return __ret;

     -NOT-
    }

  return __ret;

     -NOT-

    }
  return (__ret);
|
736 |
|
|
|
737 |
|
|
|
738 |
|
|
|
739 |
|
|
14. Location of global variables.
|
740 |
|
|
All global variables of class type, whether in the "user visible"
|
741 |
|
|
space (e.g., cin ) or the implementation namespace, must be defined
|
742 |
|
|
as a character array with the appropriate alignment and then later
|
743 |
|
|
re-initialized to the correct value.
|
744 |
|
|
|
745 |
|
|
This is due to startup issues on certain platforms, such as AIX.
|
746 |
|
|
For more explanation and examples, see src/globals.cc. All such
|
747 |
|
|
variables should be contained in that file, for simplicity.
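
The technique in rule 14 can be sketched as follows. This is an illustration under stated assumptions, not the contents of src/globals.cc: the class sketch::console, the buffer console_storage and the function init_console are hypothetical names, and C++11 alignas is used here in place of whatever alignment mechanism the library itself relies on.

```cpp
#include <cassert>
#include <new>

namespace sketch
{
  // Hypothetical class type standing in for a stream object such as cin.
  class console
  {
  public:
    explicit
    console(int fd) : _M_fd(fd) { }

    int
    fd() const
    { return _M_fd; }

  private:
    int _M_fd;
  };

  // Raw, suitably aligned storage. No constructor runs during static
  // initialization, so nothing depends on startup ordering.
  alignas(console) char console_storage[sizeof(console)];

  // Construct the real object in place once the library decides to.
  inline console&
  init_console(int fd)
  {
    assert(fd >= 0);
    return *new (console_storage) console(fd);
  }
}
```

The object lives at the address of the character array, so code that refers to the global by address sees a fully constructed object once init_console has run, regardless of the order in which the platform runs static constructors.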
|
748 |
|
|
|
749 |
|
|
15. Exception abstractions
|
750 |
|
|
Use the library's exception abstractions, which allow
|
751 |
|
|
C++ programmers to use this library with -fno-exceptions. (Even if
|
752 |
|
|
that is rarely advisable, it's a necessary evil for backwards
|
753 |
|
|
compatibility.)
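
A minimal sketch of what such an abstraction can look like, assuming a compiler that defines the __cpp_exceptions feature-test macro when exceptions are enabled (recent GCC and Clang do). The helper sketch::throw_logic_error and the function sketch::checked_index are hypothetical illustrations, not the library's own helpers.

```cpp
#include <cassert>
#include <cstdlib>
#include <stdexcept>

namespace sketch
{
  // Hypothetical helper: the only place that mentions "throw".
  // With exceptions disabled it terminates instead.
  [[noreturn]] inline void
  throw_logic_error(const char* what)
  {
#ifdef __cpp_exceptions
    throw std::logic_error(what);
#else
    (void) what;
    std::abort();
#endif
  }

  // Library-style code calls the helper rather than writing "throw".
  inline int
  checked_index(int i, int size)
  {
    assert(size >= 0);
    if (i < 0 || i >= size)
      throw_logic_error("sketch::checked_index index out of range");
    return i;
  }
}
```

Because only the helper contains a throw expression, the calling code compiles unchanged whether or not exceptions are enabled.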
|
754 |
|
|
|
755 |
|
|
16. Exception error messages
|
756 |
|
|
All start with the name of the function where the exception is
|
757 |
|
|
thrown, and then (optional) descriptive text is added. Example:
|
758 |
|
|
|
759 |
|
|
|
760 |
|
|
__throw_logic_error(__N("basic_string::_S_construct NULL not valid"));
|
761 |
|
|
|
762 |
|
|
|
763 |
|
|
Reason: The verbose terminate handler prints out exception::what() ,
|
764 |
|
|
as well as the typeinfo for the thrown exception. As this is the
|
765 |
|
|
default terminate handler, by putting location info into the
|
766 |
|
|
exception string, a very useful error message is printed out for
|
767 |
|
|
uncaught exceptions. So useful, in fact, that non-programmers can
|
768 |
|
|
give useful error messages, and programmers can intelligently
|
769 |
|
|
speculate what went wrong without even using a debugger.
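
As an illustration of the message convention in rule 16, the sketch below throws with a message that starts with the name of the throwing function and then checks that prefix. The type sketch::small_vector and the helper names_function are hypothetical. The direct throw is used only for brevity; library code would go through the exception abstractions of rule 15.

```cpp
#include <cassert>
#include <cstring>
#include <stdexcept>

namespace sketch
{
  // Hypothetical container-like type; only the error path matters here.
  struct small_vector
  {
    int
    at(int i) const
    {
      if (i < 0 || i >= 4)
        // The message starts with the throwing function's name.
        throw std::out_of_range("small_vector::at index out of range");
      return _M_data[i];
    }

    int _M_data[4];
  };

  // True if an error message names the given function first.
  inline bool
  names_function(const char* what, const char* function)
  {
    assert(what != 0 && function != 0);
    return std::strncmp(what, function, std::strlen(function)) == 0;
  }
}
```

When such an exception goes uncaught, the text printed from what() already says where the failure originated.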
|
770 |
|
|
|
771 |
|
|
17. The doxygen style guide to comments is a separate document,
|
772 |
|
|
see index.
|
773 |
|
|
|
774 |
|
|
The library currently has a mixture of GNU-C and modern C++ coding
|
775 |
|
|
styles. The GNU C usages will be combed out gradually.
|
776 |
|
|
|
777 |
|
|
Name patterns:
|
778 |
|
|
|
779 |
|
|
For nonstandard names appearing in Standard headers, we are constrained
|
780 |
|
|
to use names that begin with underscores. This is called "uglification".
|
781 |
|
|
The convention is:
|
782 |
|
|
|
783 |
|
|
Local and argument names: __[a-z].*
|
784 |
|
|
|
785 |
|
|
Examples: __count __ix __s1
|
786 |
|
|
|
787 |
|
|
Type names and template formal-argument names: _[A-Z][^_].*
|
788 |
|
|
|
789 |
|
|
Examples: _Helper _CharT _N
|
790 |
|
|
|
791 |
|
|
Member data and function names: _M_.*
|
792 |
|
|
|
793 |
|
|
Examples: _M_num_elements _M_initialize()
|
794 |
|
|
|
795 |
|
|
Static data members, constants, and enumerations: _S_.*
|
796 |
|
|
|
797 |
|
|
Examples: _S_max_elements _S_default_value
|
798 |
|
|
|
799 |
|
|
Don't use names in the same scope that differ only in the prefix,
|
800 |
|
|
e.g. _S_top and _M_top. See BADNAMES for a list of forbidden names.
|
801 |
|
|
(The most tempting of these seem to be "_T" and "__sz".)
|
802 |
|
|
|
803 |
|
|
Names must never have "__" internally; it would confuse name
|
804 |
|
|
unmanglers on some targets. Also, never use "__[0-9]", same reason.
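
Why the constraint exists can be shown with a small sketch. A user program may define macros with ordinary names before including a standard header, so header code that used those names would be rewritten by the preprocessor. The macros and the _Fixed_block template below are hypothetical. The reserved-style names are borrowed purely for illustration, since only the implementation may legitimately declare them.

```cpp
#include <cassert>
#include <cstddef>

// A user is entitled to define macros with ordinary names like these.
#define length 7
#define elems 0

namespace sketch
{
  // Header-style code written with uglified names only, so neither
  // macro above can rewrite it.
  template<typename _Tp, std::size_t _Nm>
    struct _Fixed_block
    {
      std::size_t
      _M_length() const
      { return _S_length; }

      _Tp&
      _M_at(std::size_t __n)
      {
        assert(__n < _S_length);
        return _M_elems[__n];
      }

      static const std::size_t _S_length = _Nm;
      _Tp _M_elems[_Nm];
    };
}

#undef elems
#undef length
```

Had the members been called length and elems, the two macros would have turned the declarations into syntax errors. Each spelling here follows one of the patterns listed above: __n for an argument, _Tp and _Nm for template parameters, _M_ for members and _S_ for the static constant.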
|
805 |
|
|
|
806 |
|
|
--------------------------
|
807 |
|
|
|
808 |
|
|
[BY EXAMPLE]
|
809 |
|
|
|
810 |
|
|
|
811 |
|
|
#ifndef _HEADER_
#define _HEADER_ 1

#include <cstring>

namespace std
{
  class gribble
  {
  public:
    gribble() throw();

    gribble(const gribble&);

    explicit
    gribble(int __howmany);

    gribble&
    operator=(const gribble&);

    virtual
    ~gribble() throw();

    // Start with a capital letter, end with a period.
    inline int
    public_member() const;

    // In-class function definitions should be restricted to one-liners.
    int
    one_line() { return 0; }

    int
    two_lines(const char* __arg)
    { return strchr(__arg, 'a') != 0; }

    inline int
    three_lines();  // inline, but defined below.

    // Note indentation.
    template<typename _Formal_argument>
      void
      public_template() const throw();

    template<typename _Iterator>
      void
      other_template();

  private:
    class _Helper;

    int _M_private_data;
    int _M_more_stuff;
    _Helper* _M_helper;
    int _M_private_function();

    enum _Enum
      {
        _S_one,
        _S_two
      };

    static void
    _S_initialize_library();
  };

  // More-or-less-standard language features described by lack, not presence.
# ifndef _G_NO_LONGLONG
  extern long long _G_global_with_a_good_long_name;  // avoid globals!
# endif

  // Avoid in-class inline definitions, define separately;
  // likewise for member class definitions:
  inline int
  gribble::public_member() const
  { int __local = 0; return __local; }

  class gribble::_Helper
  {
    int _M_stuff;

    friend class gribble;
  };
}

// Names beginning with "__": only for arguments and
//   local variables; never use "__" in a type name, or
//   within any name; never use "__[0-9]".

#endif /* _HEADER_ */
|
898 |
|
|
|
899 |
|
|
|
900 |
|
|
namespace std
{
  template<typename T>  // notice: "typename", not "class", no space
    long_return_value_type<with_many, args>
    function_name(char* pointer,               // "char *pointer" is wrong.
                  char* argument,
                  const Reference& ref)
    {
      // int a_local;  /* wrong; see below. */
      if (test)
      {
          nested code
      }

      int a_local = 0;  // declare variable at first use.

      // char a, b, *p;  /* wrong */
      char a = 'a';
      char b = a + 1;
      const char* c = "abc";  // each variable goes on its own line, always.

      // except maybe here...
      for (unsigned i = 0, mask = 1; mask; ++i, mask <<= 1) {
          // ...
      }
    }

  gribble::gribble()
  : _M_private_data(0), _M_more_stuff(0), _M_helper(0)
  { }

  int
  gribble::three_lines()
  {
    // doesn't fit in one line.
  }
} // namespace std
|
937 |
|
|
|
938 |
|
|
|
939 |
|
|
|
940 |
|
|
|
941 |
|
|
|
942 |
|
|
|
943 |
|
|
|
944 |
|
|
|
945 |
|
|
|
946 |
|
|
|
947 |
|
|
|
948 |
|
|
|
949 |
|
|
|
950 |
|
|
The Library
|
951 |
|
|
-----------
|
952 |
|
|
|
953 |
|
|
This paper covers two major areas:
|
954 |
|
|
|
955 |
|
|
- Features and policies not mentioned in the standard that
|
956 |
|
|
the quality of the library implementation depends on, including
|
957 |
|
|
extensions and "implementation-defined" features;
|
958 |
|
|
|
959 |
|
|
- Plans for required but unimplemented library features and
|
960 |
|
|
optimizations to them.
|
961 |
|
|
|
962 |
|
|
Overhead
|
963 |
|
|
--------
|
964 |
|
|
|
965 |
|
|
The standard defines a large library, much larger than the standard
|
966 |
|
|
C library. A naive implementation would suffer substantial overhead
|
967 |
|
|
in compile time, executable size, and speed, rendering it unusable
|
968 |
|
|
in many (particularly embedded) applications. The alternative demands
|
969 |
|
|
care in construction, and some compiler support, but there is no
|
970 |
|
|
need for library subsets.
|
971 |
|
|
|
972 |
|
|
What are the sources of this overhead? There are four main causes:
|
973 |
|
|
|
974 |
|
|
- The library is specified almost entirely as templates, which
|
975 |
|
|
with current compilers must be included in-line, resulting in
|
976 |
|
|
very slow builds as tens or hundreds of thousands of lines
|
977 |
|
|
of function definitions are read for each user source file.
|
978 |
|
|
Indeed, the entire SGI STL, as well as the dos Reis valarray,
|
979 |
|
|
are provided purely as header files, largely for simplicity in
|
980 |
|
|
porting. Iostream/locale is (or will be) as large again.
|
981 |
|
|
|
982 |
|
|
- The library is very flexible, specifying a multitude of hooks
|
983 |
|
|
where users can insert their own code in place of defaults.
|
984 |
|
|
When these hooks are not used, any time and code expended to
|
985 |
|
|
support that flexibility is wasted.
|
986 |
|
|
|
987 |
|
|
- Templates are often described as causing "code bloat". In
|
988 |
|
|
practice, this refers (when it refers to anything real) to several
|
989 |
|
|
independent processes. First, when a class template is manually
|
990 |
|
|
instantiated in its entirety, current compilers place the definitions
|
991 |
|
|
for all members in a single object file, so that a program linking
|
992 |
|
|
to one member gets definitions of all. Second, template functions
|
993 |
|
|
which do not actually depend on the template argument are, under
|
994 |
|
|
current compilers, generated anew for each instantiation, rather
|
995 |
|
|
than being shared with other instantiations. Third, some of the
|
996 |
|
|
flexibility mentioned above comes from virtual functions (both in
|
997 |
|
|
regular classes and template classes) which current linkers add
|
998 |
|
|
to the executable file even when they manifestly cannot be called.
|
999 |
|
|
|
1000 |
|
|
- The library is specified to use a language feature, exceptions,
|
1001 |
|
|
which in the current gcc compiler ABI imposes a run time and
|
1002 |
|
|
code space cost to handle the possibility of exceptions even when
|
1003 |
|
|
they are not used. Under the new ABI (accessed with -fnew-abi),
|
1004 |
|
|
there is a space overhead and a small reduction in code efficiency
|
1005 |
|
|
resulting from lost optimization opportunities caused by the
non-local branches that exceptions introduce.
|
1007 |
|
|
|
1008 |
|
|
What can be done to eliminate this overhead? A variety of coding
|
1009 |
|
|
techniques, and compiler, linker and library improvements and
|
1010 |
|
|
extensions may be used, as covered below. Most are not difficult,
|
1011 |
|
|
and some are already implemented in varying degrees.
|
1012 |
|
|
|
1013 |
|
|
Overhead: Compilation Time
|
1014 |
|
|
--------------------------
|
1015 |
|
|
|
1016 |
|
|
Providing "ready-instantiated" template code in object code archives
|
1017 |
|
|
allows us to avoid generating and optimizing template instantiations
|
1018 |
|
|
in each compilation unit which uses them. However, the number of such
|
1019 |
|
|
instantiations that are useful to provide is limited, and anyway this
|
1020 |
|
|
is not enough, by itself, to minimize compilation time. In particular,
|
1021 |
|
|
it does not reduce time spent parsing conforming headers.
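As a minimal sketch of what "ready-instantiated" means here, the fragment
below uses a hypothetical class template, Accumulator (not part of the
library), and explicit instantiation definitions. In a real library the
member definitions would sit in a separate file compiled once into the
archive; they are shown in one piece only to keep the sketch self-contained.

```cpp
// Hypothetical class template standing in for a library component.
template <class T>
class Accumulator {
public:
  Accumulator() : total_() {}
  void add(const T& value);
  T total() const;
private:
  T total_;
};

// Member definitions. A library would keep these out of the public
// header (for example in a ".tcc" file) and compile them once.
template <class T>
void Accumulator<T>::add(const T& value) { total_ += value; }

template <class T>
T Accumulator<T>::total() const { return total_; }

// Explicit instantiation definitions: the compiler emits code for every
// member of these two specializations in this translation unit, so user
// code can link against them instead of instantiating them again.
template class Accumulator<int>;
template class Accumulator<double>;
```

Users of Accumulator<int> or Accumulator<double> then need only the
declarations; any other argument type still needs the full definitions.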
|
1022 |
|
|
|
1023 |
|
|
Quicker header parsing will depend on library extensions and compiler
|
1024 |
|
|
improvements. One approach is some variation on the techniques
|
1025 |
|
|
previously marketed as "pre-compiled headers", now standardized as
|
1026 |
|
|
support for the "export" keyword. "Exported" template definitions
|
1027 |
|
|
can be placed (once) in a "repository" -- really just a library, but
|
1028 |
|
|
of template definitions rather than object code -- to be drawn upon
|
1029 |
|
|
at link time when an instantiation is needed, rather than placed in
|
1030 |
|
|
header files to be parsed along with every compilation unit.
|
1031 |
|
|
|
1032 |
|
|
Until "export" is implemented we can put some of the lengthy template
|
1033 |
|
|
definitions in #if guards or alternative headers so that users can skip
|
1034 |
|
|
over the full definitions when they need only the ready-instantiated
|
1035 |
|
|
specializations.
|
1036 |
|
|
|
1037 |
|
|
To be precise, this means that certain headers which define
|
1038 |
|
|
templates which users normally use only for certain arguments
|
1039 |
|
|
can be instrumented to avoid exposing the template definitions
|
1040 |
|
|
to the compiler unless a macro is defined. For example, in
|
1041 |
|
|
<string>, we might have:
|
1042 |
|
|
|
1043 |
|
|
    template <class _CharT, ... > class basic_string {
      ... // member declarations
    };
    ... // operator declarations

    #ifdef _STRICT_ISO_
    # if _G_NO_TEMPLATE_EXPORT
    #   include <bits/std_locale.h> // headers needed by definitions
    #   ...
    #   include <bits/string.tcc>   // member and global template definitions.
    # endif
    #endif
|
1055 |
|
|
|
1056 |
|
|
Users who compile without specifying a strict-ISO-conforming flag
|
1057 |
|
|
would not see many of the template definitions they now see, and would rely
|
1058 |
|
|
instead on ready-instantiated specializations in the library. This
|
1059 |
|
|
technique would be useful for the following substantial components:
|
1060 |
|
|
string, locale/iostreams, valarray. It would *not* be useful or
|
1061 |
|
|
usable with the following: containers, algorithms, iterators,
|
1062 |
|
|
allocator. Since these constitute a large (though decreasing)
|
1063 |
|
|
fraction of the library, the benefit the technique offers is
|
1064 |
|
|
limited.
|
1065 |
|
|
|
1066 |
|
|
The language specifies the semantics of the "export" keyword, but
|
1067 |
|
|
the gcc compiler does not yet support it. When it does, problems
|
1068 |
|
|
with large template inclusions can largely disappear, given some
|
1069 |
|
|
minor library reorganization, along with the need for the apparatus
|
1070 |
|
|
described above.
|
1071 |
|
|
|
1072 |
|
|
Overhead: Flexibility Cost
|
1073 |
|
|
--------------------------
|
1074 |
|
|
|
1075 |
|
|
The library offers many places where users can specify operations
|
1076 |
|
|
to be performed by the library in place of defaults. Sometimes
|
1077 |
|
|
this seems to require that the library use a more-roundabout, and
|
1078 |
|
|
possibly slower, way to accomplish the default requirements than
|
1079 |
|
|
would be used otherwise.
|
1080 |
|
|
|
1081 |
|
|
The primary protection against this overhead is thorough compiler
|
1082 |
|
|
optimization, to crush out layers of inline function interfaces.
|
1083 |
|
|
Kuck & Associates has demonstrated the practicality of this kind
|
1084 |
|
|
of optimization.
|
1085 |
|
|
|
1086 |
|
|
The second line of defense against this overhead is explicit
|
1087 |
|
|
specialization. By defining helper function templates, and writing
|
1088 |
|
|
specialized code for the default case, overhead can be eliminated
|
1089 |
|
|
for that case without sacrificing flexibility. This takes full
|
1090 |
|
|
advantage of any ability of the optimizer to crush out degenerate
|
1091 |
|
|
code.
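A sketch of this second line of defense follows. It uses a helper class
template with a partial specialization rather than a function template,
and the names MinHelper, min_with and min_of are hypothetical, chosen only
for illustration. The default case (std::less) is routed to code that uses
operator< directly, while a user-supplied comparison still works.

```cpp
#include <functional>

// General case: call through the user-supplied comparison object.
template <class T, class Compare>
struct MinHelper {
  static const T& pick(const T& a, const T& b, Compare comp) {
    return comp(b, a) ? b : a;
  }
};

// Default case: std::less<T> is known to mean operator<, so use it
// directly and never touch the comparison object.
template <class T>
struct MinHelper<T, std::less<T> > {
  static const T& pick(const T& a, const T& b, std::less<T>) {
    return b < a ? b : a;
  }
};

// Flexible interface: any comparison may be supplied.
template <class T, class Compare>
const T& min_with(const T& a, const T& b, Compare comp) {
  return MinHelper<T, Compare>::pick(a, b, comp);
}

// Default interface: selects the specialized helper.
template <class T>
const T& min_of(const T& a, const T& b) {
  return min_with(a, b, std::less<T>());
}
```

Callers see one interface; which helper is used is decided entirely at
compile time from the comparison type.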
|
1092 |
|
|
|
1093 |
|
|
The library specifies many virtual functions which current linkers
|
1094 |
|
|
load even when they cannot be called. Some minor improvements to the
|
1095 |
|
|
compiler and to ld would eliminate any such overhead by simply
|
1096 |
|
|
omitting virtual functions that the complete program does not call.
|
1097 |
|
|
A prototype of this work has already been done. For targets where
|
1098 |
|
|
GNU ld is not used, a "pre-linker" could do the same job.
|
1099 |
|
|
|
1100 |
|
|
The main areas in the standard interface where user flexibility
|
1101 |
|
|
can result in overhead are:
|
1102 |
|
|
|
1103 |
|
|
- Allocators: Containers are specified to use user-definable
|
1104 |
|
|
allocator types and objects, making tuning for the container
|
1105 |
|
|
characteristics tricky.
|
1106 |
|
|
|
1107 |
|
|
- Locales: the standard specifies locale objects used to implement
|
1108 |
|
|
iostream operations, involving many virtual functions which use
|
1109 |
|
|
streambuf iterators.
|
1110 |
|
|
|
1111 |
|
|
- Algorithms and containers: these may be instantiated on any type,
|
1112 |
|
|
frequently duplicating code for identical operations.
|
1113 |
|
|
|
1114 |
|
|
- Iostreams and strings: users are permitted to use these on their
|
1115 |
|
|
own types, and specify the operations the stream must use on these
|
1116 |
|
|
types.
|
1117 |
|
|
|
1118 |
|
|
Note that these sources of overhead are _avoidable_. The techniques
|
1119 |
|
|
to avoid them are covered below.
|
1120 |
|
|
|
1121 |
|
|
Code Bloat
|
1122 |
|
|
----------
|
1123 |
|
|
|
1124 |
|
|
In the SGI STL, and in some other headers, many of the templates
|
1125 |
|
|
are defined "inline" -- either explicitly or by their placement
|
1126 |
|
|
in class definitions -- which should not be inline. This is a
|
1127 |
|
|
source of code bloat. Matt had remarked that he was relying on
|
1128 |
|
|
the compiler to recognize what was too big to benefit from inlining,
|
1129 |
|
|
and generate it out-of-line automatically. However, this also can
|
1130 |
|
|
result in code bloat except where the linker can eliminate the extra
|
1131 |
|
|
copies.
|
1132 |
|
|
|
1133 |
|
|
Fixing these cases will require an audit of all inline functions
|
1134 |
|
|
defined in the library to determine which merit inlining, and moving
|
1135 |
|
|
the rest out of line. This is an issue mainly in chapters 23, 25, and
|
1136 |
|
|
27. Of course it can be done incrementally, and we should generally
|
1137 |
|
|
accept patches that move large functions out of line and into ".tcc"
|
1138 |
|
|
files, which can later be pulled into a repository. Compiler/linker
|
1139 |
|
|
improvements to recognize very large inline functions and move them
|
1140 |
|
|
out-of-line, but shared among compilation units, could make this
|
1141 |
|
|
work unnecessary.
|
1142 |
|
|
|
1143 |
|
|
Pre-instantiating template specializations currently produces large
|
1144 |
|
|
amounts of dead code which bloats statically linked programs. The
|
1145 |
|
|
current state of the static library, libstdc++.a, is intolerable on
|
1146 |
|
|
this account, and will fuel further confused speculation about a need
|
1147 |
|
|
for a library "subset". A compiler improvement that treats each
|
1148 |
|
|
instantiated function as a separate object file, for linking purposes,
|
1149 |
|
|
would be one solution to this problem. An alternative would be to
|
1150 |
|
|
split up the manual instantiation files into dozens upon dozens of
|
1151 |
|
|
little files, each compiled separately, but an abortive attempt at
|
1152 |
|
|
this was done for <string> and, though it is far from complete, it
|
1153 |
|
|
is already a nuisance. A better interim solution (just until we have
|
1154 |
|
|
"export") is badly needed.
|
1155 |
|
|
|
1156 |
|
|
When building a shared library, the current compiler/linker cannot
|
1157 |
|
|
automatically generate the instantiations needed. This creates a
|
1158 |
|
|
miserable situation; it means any time something is changed in the
|
1159 |
|
|
library, before a shared library can be built someone must manually
|
1160 |
|
|
copy the declarations of all templates that are needed by other parts
|
1161 |
|
|
of the library to an "instantiation" file, and add it to the build
|
1162 |
|
|
system to be compiled and linked to the library. This process is
|
1163 |
|
|
readily automated, and should be automated as soon as possible.
|
1164 |
|
|
Users building their own shared libraries experience identical
|
1165 |
|
|
frustrations.
|
1166 |
|
|
|
1167 |
|
|
Sharing common aspects of template definitions among instantiations
|
1168 |
|
|
can radically reduce code bloat. The compiler could help a great
|
1169 |
|
|
deal here by recognizing when a function depends on nothing about
|
1170 |
|
|
a template parameter, or only on its size, and giving the resulting
|
1171 |
|
|
function a link-name "equate" that allows it to be shared with other
|
1172 |
|
|
instantiations. Implementation code could take advantage of the
|
1173 |
|
|
capability by factoring out code that does not depend on the template
|
1174 |
|
|
argument into separate functions to be merged by the compiler.
|
1175 |
|
|
|
1176 |
|
|
Until such a compiler optimization is implemented, much can be done
|
1177 |
|
|
manually (if tediously) in this direction. One such optimization is
|
1178 |
|
|
to derive class templates from non-template classes, and move as much
|
1179 |
|
|
implementation as possible into the base class. Another is to
|
1180 |
|
|
partially specialize certain common instantiations, such as vector<T*>, to share
|
1181 |
|
|
code for instantiations on all types T. While these techniques work,
|
1182 |
|
|
they are far from the complete solution that a compiler improvement
|
1183 |
|
|
would afford.
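Both manual techniques can be sketched together. The Stack and
PtrStackBase names below are hypothetical; the point is only that every
Stack<T*> reuses the one copy of code in the non-template base, adding
nothing but inline casts. The sketch assumes T is not const-qualified,
since a pointer to const does not convert implicitly to void*.

```cpp
#include <cstddef>
#include <vector>

// Non-template base: this code exists once, whatever T is.
class PtrStackBase {
public:
  std::size_t size() const { return items_.size(); }
  bool empty() const { return items_.empty(); }
protected:
  void push_raw(void* p) { items_.push_back(p); }
  void* top_raw() const { return items_.back(); }
  void pop_raw() { items_.pop_back(); }
private:
  std::vector<void*> items_;
};

// Primary template: general case, instantiated afresh for each T.
template <class T>
class Stack {
public:
  void push(const T& v) { items_.push_back(v); }
  const T& top() const { return items_.back(); }
  void pop() { items_.pop_back(); }
  std::size_t size() const { return items_.size(); }
private:
  std::vector<T> items_;
};

// Partial specialization for pointers: thin typed wrappers over the
// shared base, so Stack<int*> and Stack<Widget*> share their real code.
template <class T>
class Stack<T*> : private PtrStackBase {
public:
  void push(T* p) { push_raw(p); }
  T* top() const { return static_cast<T*>(top_raw()); }
  void pop() { pop_raw(); }
  using PtrStackBase::size;
  using PtrStackBase::empty;
};
```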
|
1184 |
|
|
|
1185 |
|
|
Overhead: Expensive Language Features
|
1186 |
|
|
-------------------------------------
|
1187 |
|
|
|
1188 |
|
|
The main "expensive" language feature used in the standard library
|
1189 |
|
|
is exception support, which requires compiling in cleanup code with
|
1190 |
|
|
static table data to locate it, and linking in library code to use
|
1191 |
|
|
the table. For small embedded programs the amount of such library
|
1192 |
|
|
code and table data is assumed by some to be excessive. Under the
|
1193 |
|
|
"new" ABI this perception is generally exaggerated, although in some
|
1194 |
|
|
cases it may actually be excessive.
|
1195 |
|
|
|
1196 |
|
|
To implement a library which does not use exceptions directly is
|
1197 |
|
|
not difficult given minor compiler support (to "turn off" exceptions
|
1198 |
|
|
and ignore exception constructs), and results in no great library
|
1199 |
|
|
maintenance difficulties. To be precise, given "-fno-exceptions",
|
1200 |
|
|
the compiler should treat "try" blocks as ordinary blocks, and
|
1201 |
|
|
"catch" blocks as dead code to ignore or eliminate. Compiler
|
1202 |
|
|
support is not strictly necessary, except in the case of "function
|
1203 |
|
|
try blocks"; otherwise the following macros almost suffice:
|
1204 |
|
|
|
1205 |
|
|
    #define throw(X)
    #define try      if (true)
    #define catch(X) else if (false)
|
1208 |
|
|
|
1209 |
|
|
However, there may be a need to use function try blocks in the
|
1210 |
|
|
library implementation, and use of macros in this way can make
|
1211 |
|
|
correct diagnostics impossible. Furthermore, use of this scheme
|
1212 |
|
|
would require the library to call a function to re-throw exceptions
|
1213 |
|
|
from a try block. Implementing the above semantics in the compiler
|
1214 |
|
|
is preferable.
|
1215 |
|
|
|
1216 |
|
|
Given the support above (however implemented) it only remains to
|
1217 |
|
|
replace code that "throws" with a call to a well-documented "handler"
|
1218 |
|
|
function in a separate compilation unit which may be replaced by
|
1219 |
|
|
the user. The main source of exceptions that would be difficult
|
1220 |
|
|
for users to avoid is memory allocation failures, but users can
|
1221 |
|
|
define their own memory allocation primitives that never throw.
|
1222 |
|
|
Otherwise, the complete list of such handlers, and which library
|
1223 |
|
|
functions may call them, would be needed for users to be able to
|
1224 |
|
|
implement the necessary substitutes. (Fortunately, they have the
|
1225 |
|
|
source code.)
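The handler arrangement might look roughly like the sketch below. All of
the names (length_error_handler, set_length_error_handler,
checked_reserve, counting_handler) are hypothetical and are not existing
library interfaces; a real library would also keep the handler in its own
compilation unit so that users could replace it at link time.

```cpp
#include <cstddef>
#include <cstdlib>

// Hypothetical handler type: called where the library would have thrown.
typedef void (*length_error_handler)(const char* what);

// Default behavior when exceptions are unavailable: give up.
void default_length_error(const char*) { std::abort(); }

// The currently installed handler.
static length_error_handler current_handler = default_length_error;

// Install a new handler and return the previous one. A null argument
// restores the default.
length_error_handler set_length_error_handler(length_error_handler h) {
  length_error_handler old = current_handler;
  current_handler = h ? h : default_length_error;
  return old;
}

// Stand-in for a library routine that would otherwise throw
// std::length_error.
bool checked_reserve(std::size_t requested, std::size_t max_size) {
  if (requested > max_size) {
    current_handler("checked_reserve: request exceeds max_size");
    return false;  // reached only if the handler returns
  }
  return true;
}

// Example user handler: record the failure and carry on.
static int error_count = 0;
void counting_handler(const char*) { ++error_count; }
```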
|
1226 |
|
|
|
1227 |
|
|
Opportunities
|
1228 |
|
|
-------------
|
1229 |
|
|
|
1230 |
|
|
The template capabilities of C++ offer enormous opportunities for
|
1231 |
|
|
optimizing common library operations, well beyond what would be
|
1232 |
|
|
considered "eliminating overhead". In particular, many operations
|
1233 |
|
|
done in Glibc with macros that depend on proprietary language
|
1234 |
|
|
extensions can be implemented in pristine Standard C++. For example,
|
1235 |
|
|
the chapter 25 algorithms, and even C library functions such as strchr,
|
1236 |
|
|
can be specialized for the case of static arrays of known (small) size.
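A sketch of the static-array case, using a hypothetical find_char (not a
library function): the array overload lets the compiler deduce the length
N at compile time, which may allow it to unroll or otherwise simplify the
search for small arrays. Note that N counts the terminating NUL of a
string literal.

```cpp
#include <cstddef>

// General form: pointer plus run-time length.
inline const char* find_char(const char* s, std::size_t n, char c) {
  for (std::size_t i = 0; i < n; ++i)
    if (s[i] == c)
      return s + i;
  return 0;
}

// Overload for arrays whose size is known statically: N is a compile-time
// constant, so the loop bound in the general form becomes a constant.
template <std::size_t N>
inline const char* find_char(const char (&a)[N], char c) {
  return find_char(a, N, c);
}
```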
|
1237 |
|
|
|
1238 |
|
|
Detailed optimization opportunities are identified below where
|
1239 |
|
|
the relevant component is discussed. Of course new
|
1240 |
|
|
opportunities will be identified during implementation.
|
1241 |
|
|
|
1242 |
|
|
Unimplemented Required Library Features
|
1243 |
|
|
---------------------------------------
|
1244 |
|
|
|
1245 |
|
|
The standard specifies hundreds of components, grouped broadly by
|
1246 |
|
|
chapter. These are listed in excruciating detail in the CHECKLIST
|
1247 |
|
|
file.
|
1248 |
|
|
|
1249 |
|
|
    17       general
    18       support
    19       diagnostics
    20       utilities
    21       string
    22       locale
    23       containers
    24       iterators
    25       algorithms
    26       numerics
    27       iostreams
    Annex D  backward compatibility
|
1261 |
|
|
|
1262 |
|
|
Anyone participating in implementation of the library should obtain
|
1263 |
|
|
a copy of the standard, ISO 14882. People in the U.S. can obtain an
|
1264 |
|
|
electronic copy for US$18 from ANSI's web site. Those from other
|
1265 |
|
|
countries should visit http://www.iso.org/ to find out the location
|
1266 |
|
|
of their country's representation in ISO, in order to know who can
|
1267 |
|
|
sell them a copy.
|
1268 |
|
|
|
1269 |
|
|
The emphasis in the following sections is on unimplemented features
|
1270 |
|
|
and optimization opportunities.
|
1271 |
|
|
|
1272 |
|
|
Chapter 17 General
|
1273 |
|
|
-------------------
|
1274 |
|
|
|
1275 |
|
|
Chapter 17 concerns overall library requirements.
|
1276 |
|
|
|
1277 |
|
|
The standard doesn't mention threads. A multi-thread (MT) extension
|
1278 |
|
|
primarily affects operators new and delete (18), allocator (20),
|
1279 |
|
|
string (21), locale (22), and iostreams (27). The common underlying
|
1280 |
|
|
support needed for this is discussed under chapter 20.
|
1281 |
|
|
|
1282 |
|
|
The standard requirements on names from the C headers create a
|
1283 |
|
|
lot of work, mostly done. Names in the C headers must be visible
|
1284 |
|
|
in the std:: and sometimes the global namespace; the names in the
|
1285 |
|
|
two scopes must refer to the same object. More stringent is that
|
1286 |
|
|
Koenig lookup implies that any types specified as defined in std::
|
1287 |
|
|
really are defined in std::. Names optionally implemented as
|
1288 |
|
|
macros in C cannot be macros in C++. (An overview may be read at
|
1289 |
|
|
<http://www.cantrip.org/cheaders.html>). The scripts "inclosure"
|
1290 |
|
|
and "mkcshadow", and the directories shadow/ and cshadow/, are the
|
1291 |
|
|
beginning of an effort to conform in this area.
|
1292 |
|
|
|
1293 |
|
|
A correct conforming definition of C header names based on underlying
|
1294 |
|
|
C library headers, and practical linking of conforming namespaced
|
1295 |
|
|
customer code with third-party C libraries depend ultimately on
|
1296 |
|
|
an ABI change, allowing namespaced C type names to be mangled into
|
1297 |
|
|
type names as if they were global, somewhat as C function names in a
|
1298 |
|
|
namespace, or C++ global variable names, are left unmangled. Perhaps
|
1299 |
|
|
another "extern" mode, such as 'extern "C-global"' would be an
|
1300 |
|
|
appropriate place for such type definitions. Such a type would
|
1301 |
|
|
affect mangling as follows:
|
1302 |
|
|
|
1303 |
|
|
    namespace A {
      struct X {};
      extern "C-global" {  // or maybe just 'extern "C"'
        struct Y {};
      };
    }
    void f(A::X*);  // mangles to f__FPQ21A1X
    void f(A::Y*);  // mangles to f__FP1Y
|
1311 |
|
|
|
1312 |
|
|
(It may be that this is really the appropriate semantics for regular
|
1313 |
|
|
'extern "C"', and 'extern "C-global"', as an extension, would not be
|
1314 |
|
|
necessary.) This would allow functions declared in non-standard C headers
|
1315 |
|
|
(and thus fixable by neither us nor users) to link properly with functions
|
1316 |
|
|
declared using C types defined in properly-namespaced headers. The
|
1317 |
|
|
problem this solves is that C headers (which C++ programmers do persist
|
1318 |
|
|
in using) frequently forward-declare C struct tags without including
|
1319 |
|
|
the header where the type is defined, as in
|
1320 |
|
|
|
1321 |
|
|
    struct tm;
    void munge(tm*);
|
1323 |
|
|
|
1324 |
|
|
Without some compiler accommodation, munge cannot be called by correct
|
1325 |
|
|
C++ code using a correctly-scoped tm* value.
|
1326 |
|
|
|
1327 |
|
|
The current C headers use the preprocessor extension "#include_next",
|
1328 |
|
|
which the compiler complains about when run "-pedantic".
|
1329 |
|
|
(Incidentally, it appears that "-fpedantic" is currently ignored,
|
1330 |
|
|
probably a bug.) The solution in the C compiler is to use
|
1331 |
|
|
"-isystem" rather than "-I", but unfortunately in g++ this seems
|
1332 |
|
|
also to wrap the whole header in an 'extern "C"' block, so it's
|
1333 |
|
|
unusable for C++ headers. The correct solution appears to be to
|
1334 |
|
|
allow the various special include-directory options, if not given
|
1335 |
|
|
an argument, to affect subsequent include-directory options additively,
|
1336 |
|
|
so that if one said
|
1337 |
|
|
|
1338 |
|
|
    -pedantic -iprefix $(prefix) \
    -idirafter -ino-pedantic -ino-extern-c -iwithprefix -I g++-v3 \
    -iwithprefix -I g++-v3/ext
|
1341 |
|
|
|
1342 |
|
|
the compiler would search $(prefix)/g++-v3 and not report
|
1343 |
|
|
pedantic warnings for files found there, but treat files in
|
1344 |
|
|
$(prefix)/g++-v3/ext pedantically. (The undocumented semantics
|
1345 |
|
|
of "-isystem" in g++ stink. Can they be rescinded? If not it
|
1346 |
|
|
must be replaced with something more rationally behaved.)
|
1347 |
|
|
|
1348 |
|
|
All the C headers need the treatment above; in the standard these
|
1349 |
|
|
headers are mentioned in various chapters. Below, I have only
|
1350 |
|
|
mentioned those that present interesting implementation issues.
|
1351 |
|
|
|
1352 |
|
|
The components identified as "mostly complete", below, have not been
|
1353 |
|
|
audited for conformance. In many cases where the library passes
|
1354 |
|
|
conformance tests we have non-conforming extensions that must be
|
1355 |
|
|
wrapped in #if guards for "pedantic" use, and in some cases renamed
|
1356 |
|
|
in a conforming way for continued use in the implementation regardless
|
1357 |
|
|
of conformance flags.
|
1358 |
|
|
|
1359 |
|
|
The STL portion of the library still depends on a header
|
1360 |
|
|
stl/bits/stl_config.h full of #ifdef clauses. This apparatus
|
1361 |
|
|
should be replaced with autoconf/automake machinery.
|
1362 |
|
|
|
1363 |
|
|
The SGI STL defines a type_traits<> template, specialized for
|
1364 |
|
|
many types in their code including the built-in numeric and
|
1365 |
|
|
pointer types and some library types, to direct optimizations of
|
1366 |
|
|
standard functions. The SGI compiler has been extended to generate
|
1367 |
|
|
specializations of this template automatically for user types,
|
1368 |
|
|
so that use of STL templates on user types can take advantage of
|
1369 |
|
|
these optimizations. Specializations for other, non-STL, types
|
1370 |
|
|
would make more optimizations possible, but extending the gcc
|
1371 |
|
|
compiler in the same way would be much better. Probably the next
|
1372 |
|
|
round of standardization will ratify this, though likely with
|
1373 |
|
|
changes, so it probably should be renamed to place it in the
|
1374 |
|
|
implementation namespace.
|
1375 |
|
|
|
1376 |
|
|
The SGI STL also defines a large number of extensions visible in
|
1377 |
|
|
standard headers. (Other extensions that appear in separate headers
|
1378 |
|
|
have been sequestered in subdirectories ext/ and backward/.) All
|
1379 |
|
|
these extensions should be moved to other headers where possible,
|
1380 |
|
|
and in any case wrapped in a namespace (not std!), and (where kept
|
1381 |
|
|
in a standard header) girded about with macro guards. Some cannot be
|
1382 |
|
|
moved out of standard headers because they are used to implement
|
1383 |
|
|
standard features. The canonical method for accommodating these
|
1384 |
|
|
is to use a protected name, aliased in macro guards to a user-space
|
1385 |
|
|
name. Unfortunately C++ offers no satisfactory template typedef
|
1386 |
|
|
mechanism, so very ad-hoc and unsatisfactory aliasing must be used
|
1387 |
|
|
instead.
|
1388 |
|
|
|
1389 |
|
|
Implementation of a template typedef mechanism should have the highest
|
1390 |
|
|
priority among possible extensions, on the same level as implementation
|
1391 |
|
|
of the template "export" feature.
|
1392 |
|
|
|
1393 |
|
|
Chapter 18 Language support
|
1394 |
|
|
----------------------------
|
1395 |
|
|
|
1396 |
|
|
Headers: <limits> <new> <typeinfo> <exception>
|
1397 |
|
|
C headers: <cstddef> <climits> <cfloat> <cstdarg> <csetjmp>
|
1398 |
|
|
<ctime> <csignal> <cstdlib> (also 21, 25, 26)
|
1399 |
|
|
|
1400 |
|
|
This defines the built-in exceptions, rtti, numeric_limits<>,
|
1401 |
|
|
operator new and delete. Much of this is provided by the
|
1402 |
|
|
compiler in its static runtime library.
|
1403 |
|
|
|
1404 |
|
|
Work to do includes defining numeric_limits<> specializations in
|
1405 |
|
|
separate files for all target architectures. Values for integer types
|
1406 |
|
|
except for bool and wchar_t are readily obtained from the C header
|
1407 |
|
|
<limits.h>, but values for the remaining numeric types (bool, wchar_t,
|
1408 |
|
|
float, double, long double) must be entered manually. This is
|
1409 |
|
|
largely dog work except for those members whose values are not
|
1410 |
|
|
easily deduced from available documentation. Also, this involves
|
1411 |
|
|
some work in target configuration to identify the correct choice of
|
1412 |
|
|
file to build against and to install.
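For the integer types the relationship to the C header can be checked
directly. The helper below, int_limits_match_c_header, is a hypothetical
name used only for this sketch; it compares a few numeric_limits<> members
with the corresponding <climits> macros.

```cpp
#include <climits>
#include <limits>

// True when the numeric_limits<> values agree with the C macros for the
// few integer types checked here.
inline bool int_limits_match_c_header() {
  return std::numeric_limits<int>::max() == INT_MAX
      && std::numeric_limits<int>::min() == INT_MIN
      && std::numeric_limits<long>::max() == LONG_MAX
      && std::numeric_limits<unsigned long>::max() == ULONG_MAX;
}
```

No such macros exist for bool, so its values (for example, digits equal
to 1) must be written out by hand in the specialization.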
|
1413 |
|
|
|
1414 |
|
|
The definitions of the various operators new and delete must be
|
1415 |
|
|
made thread-safe, which depends on a portable exclusion mechanism,
|
1416 |
|
|
discussed under chapter 20. Of course there is always plenty of
|
1417 |
|
|
room for improvements to the speed of operators new and delete.
|
1418 |
|
|
|
1419 |
|
|
<cstdarg>, in Glibc, defines some macros that gcc does not allow to
|
1420 |
|
|
be wrapped into an inline function. Probably this header will demand
|
1421 |
|
|
attention whenever a new target is chosen. The functions atexit(),
|
1422 |
|
|
exit(), and abort() in cstdlib have different semantics in C++, so
|
1423 |
|
|
must be re-implemented for C++.
|
1424 |
|
|
|
1425 |
|
|
Chapter 19 Diagnostics
|
1426 |
|
|
-----------------------
|
1427 |
|
|
|
1428 |
|
|
Headers: <stdexcept>
|
1429 |
|
|
C headers: <cassert> <cerrno>
|
1430 |
|
|
|
1431 |
|
|
This defines the standard exception objects, which are "mostly complete".
|
1432 |
|
|
Cygnus has a version, and now SGI provides a slightly different one.
|
1433 |
|
|
It makes little difference which we use.
|
1434 |
|
|
|
1435 |
|
|
The C global name "errno", which C allows to be a variable or a macro,
|
1436 |
|
|
is required in C++ to be a macro. For MT it must typically result in
|
1437 |
|
|
a function call.
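Whether a given implementation meets the macro requirement can be probed
with the preprocessor. In this sketch errno_is_macro and
set_and_read_errno are hypothetical names; the code uses errno only as a
modifiable int lvalue, which works whether or not the macro expands to a
function call.

```cpp
#include <cerrno>

// Records, at compile time, whether <cerrno> defines errno as a macro.
#ifdef errno
const bool errno_is_macro = true;
#else
const bool errno_is_macro = false;
#endif

// Store a value in errno, read it back, then clear errno again.
inline int set_and_read_errno(int value) {
  errno = value;
  int seen = errno;
  errno = 0;
  return seen;
}
```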
|
1438 |
|
|
|
1439 |
|
|
Chapter 20 Utilities
|
1440 |
|
|
---------------------
|
1441 |
|
|
Headers: <utility> <functional> <memory>
|
1442 |
|
|
C header: <ctime> (also in 18)
|
1443 |
|
|
|
1444 |
|
|
SGI STL provides "mostly complete" versions of all the components
|
1445 |
|
|
defined in this chapter. However, the auto_ptr<> implementation
|
1446 |
|
|
is known to be wrong. Furthermore, the standard definition of it
|
1447 |
|
|
is known to be unimplementable as written. A minor change to the
|
1448 |
|
|
standard would fix it, and auto_ptr<> should be adjusted to match.
|
1449 |
|
|
|
1450 |
|
|
Multi-threading affects the allocator implementation, and there must
|
1451 |
|
|
be configuration/installation choices for different users' MT
|
1452 |
|
|
requirements. Anyway, users will want to tune allocator options
|
1453 |
|
|
to support different target conditions, MT or no.
|
1454 |
|
|
|
1455 |
|
|
The primitives used for MT implementation should be exposed, as an
|
1456 |
|
|
extension, for users' own work. We need cross-CPU "mutex" support,
|
1457 |
|
|
multi-processor shared-memory atomic integer operations, and single-
|
1458 |
|
|
processor uninterruptible integer operations, and all three configurable
|
1459 |
|
|
to be stubbed out for non-MT use, or to use an appropriately-loaded
|
1460 |
|
|
dynamic library for the actual runtime environment, or statically
|
1461 |
|
|
compiled in for cases where the target architecture is known.
|
1462 |
|
|
|
1463 |
|
|
Chapter 21 String
|
1464 |
|
|
------------------
|
1465 |
|
|
Headers: <string>
|
1466 |
|
|
C headers: <cctype> <cwctype> <cstring> <cwchar> (also in 27)
|
1467 |
|
|
<cstdlib> (also in 18, 25, 26)
|
1468 |
|
|
|
1469 |
|
|
We have "mostly-complete" char_traits<> implementations. Many of the
|
1470 |
|
|
char_traits<char> operations might be optimized further using existing
|
1471 |
|
|
proprietary language extensions.
|
1472 |
|
|
|
1473 |
|
|
We have a "mostly-complete" basic_string<> implementation. The work
|
1474 |
|
|
to manually instantiate char and wchar_t specializations in object
|
1475 |
|
|
files to improve link-time behavior is extremely unsatisfactory,
|
1476 |
|
|
literally tripling library-build time with no commensurate improvement
|
1477 |
|
|
in static program link sizes. It must be redone. (Similar work is
|
1478 |
|
|
needed for some components in chapters 22 and 27.)
|
1479 |
|
|
|
1480 |
|
|
Other work needed for strings is MT-safety, as discussed under the
|
1481 |
|
|
chapter 20 heading.
|
1482 |
|
|
|
1483 |
|
|
The standard C type mbstate_t from <cwchar> and used in char_traits<>
|
1484 |
|
|
must be different in C++ than in C, because in C++ the default constructor
|
1485 |
|
|
value mbstate_t() must be the "base" or "ground" sequence state.
|
1486 |
|
|
(According to the likely resolution of a recently raised Core issue,
|
1487 |
|
|
this may become unnecessary. However, there are other reasons to
|
1488 |
|
|
use a state type not as limited as whatever the C library provides.)
|
1489 |
|
|
If we might want to provide conversions from (e.g.) internally-
|
1490 |
|
|
represented EUC-wide to externally-represented Unicode, or vice-
|
1491 |
|
|
versa, the mbstate_t we choose will need to be more accommodating
|
1492 |
|
|
than what might be provided by an underlying C library.
|
1493 |
|
|
|
1494 |
|
|
There remain some basic_string template-member functions which do
|
1495 |
|
|
not overload properly with their non-template brethren. The infamous
|
1496 |
|
|
hack akin to what was done in vector<> is needed, to conform to
|
1497 |
|
|
23.1.1 para 10. The CHECKLIST items for basic_string marked 'X',
|
1498 |
|
|
or incomplete, are so marked for this reason.
|
1499 |
|
|
|
1500 |
|
|
Replacing the string iterators, which currently are simple character
|
1501 |
|
|
pointers, with class objects would greatly increase the safety of the
|
1502 |
|
|
client interface, and also permit a "debug" mode in which range,
|
1503 |
|
|
ownership, and validity are rigorously checked. The current use of
|
1504 |
|
|
raw pointers as string iterators is evil. vector<> iterators need the
|
1505 |
|
|
same treatment. Note that the current implementation freely mixes
|
1506 |
|
|
pointers and iterators, and that must be fixed before safer iterators
|
1507 |
|
|
can be introduced.
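To give a flavor of what a class-type iterator permits, here is a minimal
sketch of a checked iterator over a character range. The name
CheckedCharIterator is hypothetical, and the sketch covers only range and
ownership checks; detecting iterators invalidated by a later modification
of the string would need cooperation from the string itself.

```cpp
#include <stdexcept>

class CheckedCharIterator {
public:
  CheckedCharIterator(const char* first, const char* last, const char* pos)
    : first_(first), last_(last), pos_(pos) {}

  // Range check: only positions inside [first, last) may be read.
  char operator*() const {
    if (pos_ < first_ || pos_ >= last_)
      throw std::out_of_range("dereference outside the string");
    return *pos_;
  }

  // Range check: never step beyond the one-past-the-end position.
  CheckedCharIterator& operator++() {
    if (pos_ >= last_)
      throw std::out_of_range("increment past the end");
    ++pos_;
    return *this;
  }

  // Ownership check: comparing iterators into different strings is an
  // error, not merely "false".
  bool operator==(const CheckedCharIterator& other) const {
    if (first_ != other.first_ || last_ != other.last_)
      throw std::logic_error("iterators belong to different strings");
    return pos_ == other.pos_;
  }
  bool operator!=(const CheckedCharIterator& other) const {
    return !(*this == other);
  }

private:
  const char* first_;  // start of the owning range
  const char* last_;   // one past the end of the owning range
  const char* pos_;    // current position
};
```

A raw pointer can do none of this, which is why mixing pointers and
iterators in the implementation has to be cleaned up first.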
|
1508 |
|
|
|
1509 |
|
|
Some of the functions in <cstring> are different from the C versions,
|
1510 |
|
|
generally overloaded on const and non-const argument pointers. For
|
1511 |
|
|
example, in <cstring> strchr is overloaded. The functions isupper
|
1512 |
|
|
etc. in <cctype>, typically implemented as macros in C, are functions
|
1513 |
|
|
in C++, because they are overloaded with others of the same name
|
1514 |
|
|
defined in <locale>.
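The effect of the const overloads can be seen in a short sketch. The
helpers offset_of and replace_first are hypothetical names; the code is
written so that it also compiles against a C-style strchr, but with the
C++ overloads the const-ness of the argument carries through to the
result.

```cpp
#include <cstddef>
#include <cstring>

// Read-only search: const pointer in, const pointer out.
// Returns the length of the string when the character is absent.
inline std::size_t offset_of(const char* s, char c) {
  const char* hit = std::strchr(s, c);
  return hit ? static_cast<std::size_t>(hit - s) : std::strlen(s);
}

// Modifying search: non-const pointer in, non-const pointer out, so the
// result may be written through without a cast.
inline bool replace_first(char* s, char from, char to) {
  char* hit = std::strchr(s, from);
  if (!hit)
    return false;
  *hit = to;
  return true;
}
```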
|
1515 |
|
|
|
1516 |
|
|
Many of the functions required in <cwctype> and <cwchar> cannot be
|
1517 |
|
|
implemented using underlying C facilities on intended targets because
|
1518 |
|
|
such facilities only partly exist.
|
1519 |
|
|
|
1520 |
|
|
Chapter 22 Locale
|
1521 |
|
|
------------------
|
1522 |
|
|
Headers: <locale>
|
1523 |
|
|
C headers: <clocale>
|
1524 |
|
|
|
1525 |
|
|
We have a "mostly complete" class locale, with the exception of
|
1526 |
|
|
code for constructing, and handling the names of, named locales.
|
1527 |
|
|
The ways that locales are named (particularly when categories
|
1528 |
|
|
(e.g. LC_TIME, LC_COLLATE) are different) vary among target
|
1529 |
|
|
environments. This code must be written in various versions and
|
1530 |
|
|
chosen by configuration parameters.
|
1531 |
|
|
|
1532 |
|
|
Members of many of the facets defined in <locale> are stubs. Generally,
there are two sets of facets: the base class facets (which are supposed
to implement the "C" locale) and the "byname" facets, which are supposed
to read files to determine their behavior. The base ctype<>, collate<>,
and numpunct<> facets are "mostly complete", except that the table of
bitmask values used for "is" operations, and corresponding mask values,
are still defined in libio and just included/linked. (We will need to
implement these tables independently, soon, but should take advantage
of libio where possible.) The num_put<>::put members for integer types
are "mostly complete".

A complete list of what has and has not been implemented may be
found in CHECKLIST. However, note that the current definition of
codecvt<wchar_t,char,mbstate_t> is wrong. It should simply write
out the raw bytes representing the wide characters, rather than
trying to convert each to a corresponding single "char" value.

Some of the facets are more important than others. Specifically,
the members of ctype<>, numpunct<>, num_put<>, and num_get<> facets
are used by other library facilities defined in <string>, <istream>,
and <ostream>, and the codecvt<> facet is used by basic_filebuf<>
in <fstream>, so a conforming iostream implementation depends on
these.

The "long long" type eventually must be supported, but code mentioning
it should be wrapped in #if guards to allow pedantic-mode compiling.

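As a minimal sketch of such a guard, assuming a configuration macro
is available: SKETCH_USE_LONG_LONG, SKETCH_HAS_LONG_LONG, and
sketch_type_name() are hypothetical names chosen for illustration and
are not names used by the library.

```cpp
#include <cassert>
#include <string>

// SKETCH_USE_LONG_LONG is a hypothetical configuration macro (an
// assumption, not a real libstdc++ name). A build that must compile in
// pedantic C++98 mode would leave it undefined.
#if defined(SKETCH_USE_LONG_LONG) || __cplusplus >= 201103L
#  define SKETCH_HAS_LONG_LONG 1
#else
#  define SKETCH_HAS_LONG_LONG 0
#endif

// Always available.
inline std::string sketch_type_name(long) { return "long"; }

#if SKETCH_HAS_LONG_LONG
// Only declared when the guard is on, so a strict C++98 build never
// sees the "long long" type at all.
inline std::string sketch_type_name(long long) { return "long long"; }
#endif
```

With the guard off, the second overload simply does not exist, so a
pedantic compile has nothing to reject.
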
Performance of num_put<> and num_get<> depends critically on
caching computed values in ios_base objects, and on extensions
to the interface with streambufs.

Specifically: retrieving a copy of the locale object, extracting
the needed facets, and gathering data from them, for each call to
(e.g.) operator<< would be prohibitively slow. To cache format
data for use by num_put<> and num_get<> we have a _Format_cache<>
object stored in the ios_base::pword() array. This is constructed
and initialized lazily, and is organized purely for utility. It
is discarded when a new locale with different facets is imbued.

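The caching scheme described above might look roughly like the
following sketch. It uses only the standard ios_base::xalloc(),
pword(), and register_callback() interfaces. FormatCache,
format_cache_index(), format_cache_callback(), and format_cache() are
hypothetical names standing in for the _Format_cache<> machinery, and
the sketch marks the cache stale on every imbue instead of discarding
it only when the facets differ.

```cpp
#include <cassert>
#include <ios>
#include <locale>
#include <sstream>   // only needed by callers that use string streams

// Hypothetical stand-in for _Format_cache<>: a few values copied out
// of numpunct<char> so they need not be looked up on every operation.
struct FormatCache {
    char decimal_point;
    char thousands_sep;
    bool valid;
    FormatCache() : decimal_point('.'), thousands_sep(','), valid(false) {}
};

// One pword() slot index, shared by all streams.
inline int format_cache_index() {
    static const int index = std::ios_base::xalloc();
    return index;
}

// Keeps the cached object consistent with the stream's lifetime.
inline void format_cache_callback(std::ios_base::event ev,
                                  std::ios_base& ios, int index) {
    void*& slot = ios.pword(index);
    FormatCache* cache = static_cast<FormatCache*>(slot);
    if (ev == std::ios_base::imbue_event) {
        if (cache) cache->valid = false;   // refill lazily on next use
    } else if (ev == std::ios_base::erase_event) {
        delete cache;                      // stream destroyed or overwritten
        slot = 0;
    } else if (ev == std::ios_base::copyfmt_event) {
        slot = 0;                          // never share the source's cache
    }
}

// Lazily creates and fills the cache for one stream.
inline const FormatCache& format_cache(std::ios_base& ios) {
    const int index = format_cache_index();
    void*& slot = ios.pword(index);
    if (!slot) {
        slot = new FormatCache;
        ios.register_callback(&format_cache_callback, index);
    }
    FormatCache* cache = static_cast<FormatCache*>(slot);
    if (!cache->valid) {
        const std::numpunct<char>& np =
            std::use_facet<std::numpunct<char> >(ios.getloc());
        cache->decimal_point = np.decimal_point();
        cache->thousands_sep = np.thousands_sep();
        cache->valid = true;
    }
    return *cache;
}
```

A formatting routine would call format_cache(stream) once per operation
and read the punctuation characters from the returned object, paying
for the facet lookup only after an imbue.
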
Using only the public interfaces of the iterator arguments to the
facet functions would limit performance by forbidding "vector-style"
character operations. The streambuf iterator optimizations are
described under chapter 24, but facets can also bypass the streambuf
iterators via explicit specializations and operate directly on the
streambufs, and use extended interfaces to get direct access to the
streambuf internal buffer arrays. These extensions are mentioned
under chapter 27. These optimizations are particularly important
for input parsing.

Unused virtual members of locale facets can be omitted, as mentioned
above, by a smart linker.

Chapter 23 Containers
----------------------
Headers: <deque> <list> <queue> <stack> <vector> <map> <set> <bitset>

All the components in chapter 23 are implemented in the SGI STL.
They are "mostly complete"; they include a large number of
nonconforming extensions which must be wrapped. Some of these
are used internally and must be renamed or duplicated.

The SGI components are optimized for large-memory environments. For
embedded targets, different criteria might be more appropriate. Users
will want to be able to tune this behavior. We should provide
ways for users to compile the library with different memory usage
characteristics.

A lot more work is needed on factoring out common code from different
specializations to reduce code size here and in chapter 25. The
easiest fix for this would be a compiler/ABI improvement that allows
the compiler to recognize when a specialization depends only on the
size (or other gross quality) of a template argument, and allow the
linker to share the code with similar specializations. In its
absence, many of the algorithms and containers can be partially
specialized, at least for the case of pointers, but this only solves
a small part of the problem. Use of a type_traits-style template
allows a few more optimization opportunities, more if the compiler
can generate the specializations automatically.

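The pointer case mentioned above can be sketched as follows. This is
an illustration only: SketchStack and SketchVoidPtrStack are
hypothetical names, not library components. Every SketchStack<T*>
forwards to one shared, non-template implementation that stores
void*, so the container code is generated once for all object
pointer types.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Shared implementation: compiled once, whatever T is.
class SketchVoidPtrStack {
public:
    void push(void* p) { items_.push_back(p); }
    void* pop() {
        assert(!items_.empty());
        void* p = items_.back();
        items_.pop_back();
        return p;
    }
    std::size_t size() const { return items_.size(); }
private:
    std::vector<void*> items_;
};

// Primary template: a separate copy of the code for each T.
template <typename T>
class SketchStack {
public:
    void push(const T& value) { items_.push_back(value); }
    T pop() {
        assert(!items_.empty());
        T value = items_.back();
        items_.pop_back();
        return value;
    }
    std::size_t size() const { return items_.size(); }
private:
    std::vector<T> items_;
};

// Partial specialization for pointers to objects: thin inline
// wrappers that only add casts around the shared implementation.
// (Pointers to functions are outside the scope of this sketch.)
template <typename T>
class SketchStack<T*> {
public:
    void push(T* value) {
        impl_.push(const_cast<void*>(
            static_cast<const volatile void*>(value)));
    }
    T* pop() { return static_cast<T*>(impl_.pop()); }
    std::size_t size() const { return impl_.size(); }
private:
    SketchVoidPtrStack impl_;
};
```

Only the casts are instantiated per type, and those normally compile
to nothing. As the text says, this covers pointers only and leaves the
general problem unsolved.
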
As an optimization, containers can specialize on the default allocator
and bypass it, or take advantage of details of its implementation
after it has been improved upon.

Replacing the vector iterators, which currently are simple element
pointers, with class objects would greatly increase the safety of the
client interface, and also permit a "debug" mode in which range,
ownership, and validity are rigorously checked. The current use of
pointers for iterators is evil.

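A minimal sketch of what such a class-type iterator could check,
assuming nothing about the eventual library design: SketchCheckedIter
is a hypothetical name, and only range and ownership are checked here
(tracking invalidation after reallocation would need cooperation from
the container).

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical checked iterator over a contiguous range [first, last).
template <typename T>
class SketchCheckedIter {
public:
    SketchCheckedIter(T* pos, T* first, T* last)
        : pos_(pos), first_(first), last_(last) {
        assert(first_ <= pos_ && pos_ <= last_);   // range check
    }
    T& operator*() const {
        assert(pos_ != last_);                     // no dereferencing end
        return *pos_;
    }
    SketchCheckedIter& operator++() {
        assert(pos_ != last_);                     // no stepping past end
        ++pos_;
        return *this;
    }
    SketchCheckedIter& operator--() {
        assert(pos_ != first_);                    // no stepping before begin
        --pos_;
        return *this;
    }
    friend bool operator==(const SketchCheckedIter& a,
                           const SketchCheckedIter& b) {
        // Ownership check: both must refer to the same sequence.
        assert(a.first_ == b.first_ && a.last_ == b.last_);
        return a.pos_ == b.pos_;
    }
    friend bool operator!=(const SketchCheckedIter& a,
                           const SketchCheckedIter& b) {
        return !(a == b);
    }
private:
    T* pos_;
    T* first_;
    T* last_;
};
```

Because the checks are plain assert() calls, compiling with NDEBUG
reduces the class to a wrapper around a pointer, which is one way a
"debug" mode could be switched on and off.
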
As mentioned for chapter 24, the deque iterator is a good example of
an opportunity to implement a "staged" iterator that would benefit
from specializations of some algorithms.

Chapter 24 Iterators
---------------------
Headers: <iterator>

Standard iterators are "mostly complete", with the exception of
the stream iterators, which are not yet templatized on the
stream type. Also, the base class template iterator<> appears
to be wrong, so everything derived from it must also be wrong,
currently.

The streambuf iterators (currently located in stl/bits/std_iterator.h,
but should be under bits/) can be rewritten to take advantage of
friendship with the streambuf implementation.

Matt Austern has identified opportunities where certain iterator
types, particularly including streambuf iterators and deque
iterators, have a "two-stage" quality, such that an intermediate
limit can be checked much more quickly than the true limit on
range operations. If identified with a member of iterator_traits,
algorithms may be specialized for this case. Of course the
iterators that have this quality can be identified by specializing
a traits class.

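The two-stage idea can be sketched with a toy chunked sequence. All
names here (SketchChunks, SketchChunkIter, sketch_staged_traits,
sketch_bool, sketch_sum) are hypothetical, the traits class is a
separate template instead of a member of iterator_traits, and every
chunk is assumed to be non-empty.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

typedef std::vector<std::vector<int> > SketchChunks;

// Iterator over all elements of a chunked sequence. Every increment
// must test for the end of the current chunk.
struct SketchChunkIter {
    const SketchChunks* chunks;
    std::size_t chunk;    // index of the current chunk
    std::size_t offset;   // index within the current chunk

    int operator*() const { return (*chunks)[chunk][offset]; }
    SketchChunkIter& operator++() {
        if (++offset == (*chunks)[chunk].size()) { ++chunk; offset = 0; }
        return *this;
    }
    bool operator!=(const SketchChunkIter& other) const {
        return chunk != other.chunk || offset != other.offset;
    }
};

// Traits class marking which iterator types are "staged".
template <typename Iter>
struct sketch_staged_traits { static const bool is_staged = false; };
template <>
struct sketch_staged_traits<SketchChunkIter> {
    static const bool is_staged = true;
};

template <bool> struct sketch_bool {};

// Generic version: one full comparison and increment per element.
template <typename Iter>
long sketch_sum_aux(Iter first, Iter last, sketch_bool<false>) {
    long total = 0;
    for (; first != last; ++first) total += *first;
    return total;
}

// Staged version: the inner loop checks only the cheap chunk limit.
template <typename Iter>
long sketch_sum_aux(Iter first, Iter last, sketch_bool<true>) {
    long total = 0;
    while (first.chunk != last.chunk) {
        const std::vector<int>& block = (*first.chunks)[first.chunk];
        for (std::size_t i = first.offset; i < block.size(); ++i)
            total += block[i];
        ++first.chunk;
        first.offset = 0;
    }
    if (first.offset != last.offset) {   // trailing partial chunk
        const std::vector<int>& block = (*first.chunks)[first.chunk];
        for (std::size_t i = first.offset; i < last.offset; ++i)
            total += block[i];
    }
    return total;
}

// Public entry point: picks the version from the traits class.
template <typename Iter>
long sketch_sum(Iter first, Iter last) {
    return sketch_sum_aux(first, last,
        sketch_bool<sketch_staged_traits<Iter>::is_staged>());
}
```

Ordinary iterators such as plain pointers take the generic path
unchanged; only types that opt in through the traits class get the
chunk-at-a-time loop.
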
Many of the algorithms must be specialized for the streambuf
iterators, to take advantage of block-mode operations, so that
the performance of iostream/locale operations does not suffer.
It may be that they could be treated as staged iterators and
take advantage of those optimizations.

Chapter 25 Algorithms
----------------------
Headers: <algorithm>
C headers: <cstdlib> (also in 18, 21, 26)

The algorithms are "mostly complete". As mentioned above, they
are optimized for speed at the expense of code and data size.

Specializations of many of the algorithms for non-STL types would
give performance improvements, but we must use great care not to
interfere with fragile template overloading semantics for the
standard interfaces. Conventionally the standard function template
interface is an inline which delegates to a non-standard function
which is then overloaded (this is already done in many places in
the library). Particularly appealing opportunities for the sake of
iostream performance are for copy and find applied to streambuf
iterators or (as noted elsewhere) for staged iterators, of which
the streambuf iterators are a good example.

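The delegation convention described above can be sketched like this.
The names sketch_copy and sketch_copy_aux are hypothetical and do not
correspond to the library's internal helpers; the point is only that
the public template stays untouched while the helper is overloaded.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// General helper: element-by-element copy for any iterator pair.
template <typename In, typename Out>
inline Out sketch_copy_aux(In first, In last, Out out) {
    for (; first != last; ++first, ++out) *out = *first;
    return out;
}

// Overloaded helper for plain char ranges: one block move.
inline char* sketch_copy_aux(const char* first, const char* last,
                             char* out) {
    const std::size_t n = static_cast<std::size_t>(last - first);
    if (n != 0) std::memmove(out, first, n);
    return out + n;
}

// The "standard" interface: a single inline template that only
// delegates, so adding helper overloads never changes how user calls
// to this function are resolved.
template <typename In, typename Out>
inline Out sketch_copy(In first, In last, Out out) {
    return sketch_copy_aux(first, last, out);
}
```

The fragility the text warns about shows up even here: a call with
non-const char* sources deduces the general helper as the better
match, so a real implementation would need further overloads or a
traits-based dispatch to catch that case.
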
The bsearch and qsort functions cannot be overloaded properly as
required by the standard because gcc does not yet allow overloading
on the extern-"C"-ness of a function pointer.

Chapter 26 Numerics
--------------------
Headers: <complex> <valarray> <numeric>
C headers: <cmath>, <cstdlib> (also 18, 21, 25)

Numeric components: Gabriel dos Reis's valarray, Drepper's complex,
and the few algorithms from the STL are "mostly done". Of course
optimization opportunities abound for the numerically literate. It
is not clear whether the valarray implementation really conforms
fully, in the assumptions it makes about aliasing (and lack thereof)
in its arguments.

The C div() and ldiv() functions are interesting, because they are the
only case where a C library function returns a class object by value.
Since the C++ type div_t must be different from the underlying C type
(which is in the wrong namespace) the underlying functions div() and
ldiv() cannot be re-used efficiently. Fortunately they are trivial to
re-implement.

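A re-implementation along these lines could be as small as the
following sketch. The namespace sketch is a placeholder for wherever
the library would actually put these definitions, and the code relies
on integer division truncating toward zero, which C99 and C++11
guarantee but C++98 left implementation-defined for negative operands.

```cpp
#include <cassert>

namespace sketch {

// Distinct types, unrelated to the C library's ::div_t and ::ldiv_t.
struct div_t  { int quot;  int rem;  };
struct ldiv_t { long quot; long rem; };

inline div_t div(int numer, int denom) {
    assert(denom != 0);
    div_t result;
    result.quot = numer / denom;
    result.rem  = numer % denom;
    return result;
}

inline ldiv_t ldiv(long numer, long denom) {
    assert(denom != 0);
    ldiv_t result;
    result.quot = numer / denom;
    result.rem  = numer % denom;
    return result;
}

}  // namespace sketch
```

Because both functions are inline and return small aggregates, no
conversion from the C structure is needed.
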
Chapter 27 Iostreams
---------------------
Headers: <iosfwd> <streambuf> <ios> <ostream> <istream> <iostream>
         <iomanip> <sstream> <fstream>
C headers: <cstdio> <cwchar> (also in 21)

Iostream is currently in a very incomplete state. <iosfwd>, <iomanip>,
ios_base, and basic_ios<> are "mostly complete". basic_streambuf<> and
basic_ostream<> are well along, but basic_istream<> has had little work
done. The standard stream objects, <sstream>, and <fstream> have been
started; basic_filebuf<> "write" functions have been implemented just
enough to do "hello, world".

Most of the istream and ostream operators << and >> (with the exception
of the op<<(integer) ones) have not been changed to use locale primitives,
sentry objects, or char_traits members.

All these templates should be manually instantiated for char and
wchar_t in a way that links only used members into user programs.

Streambuf is fertile ground for optimization extensions. An extended
interface giving iterator access to its internal buffer would be very
useful for other library components.

Iostream operations (primarily operators << and >>) can take advantage
of the case where user code has not specified a locale, and bypass locale
operations entirely. The current implementation of op<</num_put<>::put,
for the integer types, demonstrates how they can cache encoding details
from the locale on each operation. There is lots more room for
optimization in this area.

The definition of the relationship between the standard streams
cout et al. and stdout et al. requires something like a "stdiobuf".
The SGI solution of using double-indirection to actually use a
stdio FILE object for buffering is unsatisfactory, because it
interferes with peephole loop optimizations.

The <sstream> header work has begun. stringbuf can benefit from
friendship with basic_string<> and basic_string<>::_Rep to use
those objects directly as buffers, and avoid allocating and making
copies.

The basic_filebuf<> template is a complex beast. It is specified to
use the locale facet codecvt<> to translate characters between native
files and the locale character encoding. In general this involves
two buffers, one of "char" representing the file and another of
"char_type", for the stream, with codecvt<> translating. The process
is complicated by the variable-length nature of the translation, and
the need to seek to corresponding places in the two representations.
For the case of basic_filebuf<char>, when no translation is needed,
a single buffer suffices. A specialized filebuf can be used to reduce
code space overhead when no locale has been imbued. Matt Austern's
work at SGI will be useful, perhaps directly as a source of code, or
at least as an example to draw on.

Filebuf, almost uniquely (cf. operator new), depends heavily on
underlying environmental facilities. In current releases iostream
depends fairly heavily on libio constant definitions, but it should
be made independent. It also depends on operating system primitives
for file operations. There is immense room for optimizations using
(e.g.) mmap for reading. The shadow/ directory wraps, besides the
standard C headers, the libio.h and unistd.h headers, for use mainly
by filebuf. These wrappings have not been completed, though there
is scaffolding in place.

The encapsulation of certain C header <cstdio> names presents an
interesting problem. It is possible to define an inline std::fprintf()
implemented in terms of the 'extern "C"' vfprintf(), but there is no
standard vfscanf() to use to implement std::fscanf(). It appears that
vfscanf must be re-implemented in C++ for targets where no vfscanf
extension has been defined. This is interesting in that it seems
to be the only significant case in the C library where this kind of
rewriting is necessary. (Of course Glibc provides the vfscanf()
extension.) (The functions related to exit() must be rewritten
for other reasons.)

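The fprintf() half of this can be sketched directly. The namespace
sketch stands in for std here, since user code may not add such a
definition to std itself; the rest uses only the standard <cstdarg>
macros and vfprintf().

```cpp
#include <cassert>
#include <cstdarg>
#include <cstdio>

namespace sketch {

// fprintf() expressed in terms of the C library's vfprintf(): collect
// the variable arguments and pass them on unchanged.
inline int fprintf(std::FILE* stream, const char* format, ...) {
    std::va_list args;
    va_start(args, format);
    const int written = std::vfprintf(stream, format, args);
    va_end(args);
    return written;
}

}  // namespace sketch
```

The same pattern would work for fscanf() wherever a vfscanf() is
available; the difficulty described above is only that the standard
of the time did not guarantee one.
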
Annex D
-------
Headers: <strstream>

Annex D defines many non-library features, and many minor
modifications to various headers, and a complete header.
It is "mostly done", except that the libstdc++-2 <strstream>
header has not been adopted into the library, or checked to
verify that it matches the draft in those details that were
clarified by the committee. Certainly it must at least be
moved into the std namespace.

We still need to wrap all the deprecated features in #if guards
so that pedantic compile modes can detect their use.

Nonstandard Extensions
----------------------
Headers: <iostream.h> <strstream.h> <hash> <rbtree>
         <pthread_alloc> <stdiobuf> (etc.)

User code has come to depend on a variety of nonstandard components
that we must not omit. Much of this code can be adopted from
libstdc++-v2 or from the SGI STL. This particularly includes
<iostream.h>, <strstream.h>, and various SGI extensions such
as <hash_map.h>. Many of these are already placed in the
subdirectories ext/ and backward/. (Note that it is better to
include them via "<backward/hash_map.h>" or "<ext/hash_map>" than
to search the subdirectory itself via a "-I" directive.)