OpenCores
URL https://opencores.org/ocsvn/openrisc/openrisc/trunk

Subversion Repositories openrisc

[/] [openrisc/] [trunk/] [gnu-dev/] [or1k-gcc/] [libstdc++-v3/] [doc/] [xml/] [manual/] [appendix_contributing.xml] - Blame information for rev 750

Contributing

Keywords: ISO C++, library

  The GNU C++ Library follows an open development model. Active
  contributors are assigned maintainership responsibility and given
  write access to the source repository. First-time contributors
  should follow this procedure:

Contributor Checklist

Reading

    - Get and read the relevant sections of the C++ language
      specification. Copies of the full ISO 14882 standard are
      available online via the ISO mirror site for committee
      members. Non-members, or those who have not paid for the
      privilege of sitting on the committee and sustained their
      two-meeting commitment for voting rights, may get a copy of
      the standard from their respective national standards
      organization. In the USA, this national standards
      organization is ANSI. (And if you've already registered with
      them you can buy the standard online.)

    - The library working group bugs, and known defects, can
      be obtained here:
      http://www.open-std.org/jtc1/sc22/wg21

    - The newsgroup dedicated to standardization issues is
      comp.std.c++; the FAQ for this group is quite useful.

    - Peruse the GNU Coding Standards, and chuckle when you hit
      the part about Using Languages Other Than C.

    - Be familiar with the extensions that preceded these
      general GNU rules. These style issues for libstdc++ can be
      found in Coding Style.

    - And last but certainly not least, read the
      library-specific information found in
      Porting and Maintenance.

Assignment

      Small changes can be accepted without a copyright assignment form on
      file. New code and additions to the library need a completed copyright
      assignment form on file at the FSF. Note: your employer may be required
      to fill out appropriate disclaimer forms as well.

      Historically, the libstdc++ assignment form added the following
      question:

        Which Belgian comic book character is better, Tintin or Asterix, and
        why?

      While not strictly necessary, humoring the maintainers and answering
      this question would be appreciated.

      For more information about getting a copyright assignment, please see
      Legal Matters.

      Please contact Benjamin Kosnik at bkoz+assign@redhat.com if you are
      confused about the assignment or have general licensing questions. When
      requesting an assignment form from assign@gnu.org, please cc the
      libstdc++ maintainer above so that progress can be monitored.

Getting Sources

      Getting write access
        (look for "Write after approval")

Submitting Patches

      Every patch must have several pieces of information before it can be
      properly evaluated. Ideally (and to ensure the fastest possible
      response from the maintainers) it would have all of these pieces:

    - A description of the bug and how your patch fixes this
      bug. For new features, a description of the feature and your
      implementation.

    - A ChangeLog entry as plain text; see the various
      ChangeLog files for format and content. If you are
      using emacs as your editor, simply position the insertion
      point at the beginning of your change and hit C-x 4 a to bring
      up the appropriate ChangeLog entry. See--magic! Similar
      functionality also exists for vi.

    - A testsuite submission or sample program that will
      easily and simply show the existing error or test new
      functionality.

    - The patch itself. If you are accessing the SVN
      repository use svn update; svn diff NEW;
      else, use diff -cp OLD NEW ... If your
      version of diff does not support these options, then get the
      latest version of GNU diff. The SVN Tricks wiki page has
      information on customising the output of svn diff.

    - When you have all these pieces, bundle them up in a
      mail message and send it to libstdc++@gcc.gnu.org. All
      patches and related discussion should be sent to the
      libstdc++ mailing list.

Directory Layout and Source Conventions

    The unpacked source directory of libstdc++ contains the files
    needed to create the GNU C++ Library.

It has subdirectories:

  doc
    Files in HTML and text format that document usage, quirks of the
    implementation, and contributor checklists.

  include
    All header files for the C++ library are within this directory,
    modulo specific runtime-related files that are in the libsupc++
    directory.

    include/std
      Files meant to be found by #include <name> directives in
      standard-conforming user programs.

    include/c
      Headers intended to directly include standard C headers.
      [NB: this can be enabled via --enable-cheaders=c]

    include/c_global
      Headers intended to include standard C headers in
      the global namespace, and put select names into the std::
      namespace.  [NB: this is the default, and is the same as
      --enable-cheaders=c_global]

    include/c_std
      Headers intended to include standard C headers
      already in namespace std, and put select names into the std::
      namespace.  [NB: this is the same as --enable-cheaders=c_std]

    include/bits
      Files included by standard headers and by other files in
      the bits directory.

    include/backward
      Headers provided for backward compatibility, such as <iostream.h>.
      They are not used in this library.

    include/ext
      Headers that define extensions to the standard library.  No
      standard header refers to any of them.

  scripts
    Scripts that are used during the configure, build, make, or test
    process.

  src
    Files that are used in constructing the library, but are not
    installed.

  testsuites/[backward, demangle, ext, performance, thread, 17_* to 30_*]
    Test programs are here, and may be used to begin to exercise the
    library.  Support for "make check" and "make check-install" is
    complete, and runs through all the subdirectories here when this
    command is issued from the build directory.  Please note that
    "make check" requires DejaGNU 1.4 or later to be installed.  Please
    note that "make check-script" calls the script mkcheck, which
    requires bash, and which may need the paths to bash adjusted to
    work properly, as /bin/bash is assumed.
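
From the user's side, the choice among the include/c, include/c_global and
include/c_std header sets is mostly invisible. The following is a minimal
sketch, assuming only a conforming <cstring>; the helper names
length_via_std and check_length are hypothetical and exist only for
illustration.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical helper: the std::-qualified name is the one that
// <cstring> is required to provide, whichever header set was configured.
inline std::size_t
length_via_std(const char* s)
{ return std::strlen(s); }

// Small self-check for the helper above.
inline void
check_length()
{ assert(length_via_std("abc") == 3); }
```

With the default c_global layout the same function is typically also
visible as ::strlen, because the C header declares it in the global
namespace first, but portable user code should not rely on that.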

Other subdirectories contain variant versions of certain files
that are meant to be copied or linked by the configure script.
Currently these are:

  config/abi
  config/cpu
  config/io
  config/locale
  config/os

In addition, a subdirectory holds the convenience library libsupc++.

  libsupc++
    Contains the runtime library for C++, including exception
    handling and memory allocation and deallocation, RTTI, terminate
    handlers, etc.

Note that glibc also has a bits/ subdirectory.  We will either
need to be careful not to collide with names in its bits/
directory; or rename bits to (e.g.) cppbits/.

In files throughout the system, lines marked with an "XXX" indicate
a bug or incompletely-implemented feature.  Lines marked "XXX MT"
indicate a place that may require attention for multi-thread safety.

Coding Style

Bad Identifiers

      Identifiers that conflict and should be avoided.

      This is the list of names reserved to the
      implementation that have been claimed by certain
      compilers and system headers of interest, and should not be used
      in the library. It will grow, of course.  We generally are
      interested in names that are not all-caps, except for those like
      "_T"

      For Solaris:
      _B
      _C
      _L
      _N
      _P
      _S
      _U
      _X
      _E1
      ..
      _E24

      Irix adds:
      _A
      _G

      MS adds:
      _T

      BSD adds:
      __used
      __unused
      __inline
      _Complex
      __istype
      __maskrune
      __tolower
      __toupper
      __wchar_t
      __wint_t
      _res
      _res_ext
      __tg_*

      SPU adds:
      __ea

      For GCC:

      [Note that this list is out of date. It applies to the old
      name-mangling; in G++ 3.0 and higher a different name-mangling is
      used. In addition, many of the bugs relating to G++ interpreting
      these names as operators have been fixed.]

      The full set of __* identifiers (combined from gcc/cp/lex.c and
      gcc/cplus-dem.c) that are either old or new, but are definitely
      recognized by the demangler, is:

      __aa
      __aad
      __ad
      __addr
      __adv
      __aer
      __als
      __alshift
      __amd
      __ami
      __aml
      __amu
      __aor
      __apl
      __array
      __ars
      __arshift
      __as
      __bit_and
      __bit_ior
      __bit_not
      __bit_xor
      __call
      __cl
      __cm
      __cn
      __co
      __component
      __compound
      __cond
      __convert
      __delete
      __dl
      __dv
      __eq
      __er
      __ge
      __gt
      __indirect
      __le
      __ls
      __lt
      __max
      __md
      __method_call
      __mi
      __min
      __minus
      __ml
      __mm
      __mn
      __mult
      __mx
      __ne
      __negate
      __new
      __nop
      __nt
      __nw
      __oo
      __op
      __or
      __pl
      __plus
      __postdecrement
      __postincrement
      __pp
      __pt
      __rf
      __rm
      __rs
      __sz
      __trunc_div
      __trunc_mod
      __truth_andif
      __truth_not
      __truth_orif
      __vc
      __vd
      __vn

      SGI badnames:
      __builtin_alloca
      __builtin_fsqrt
      __builtin_sqrt
      __builtin_fabs
      __builtin_dabs
      __builtin_cast_f2i
      __builtin_cast_i2f
      __builtin_cast_d2ll
      __builtin_cast_ll2d
      __builtin_copy_dhi2i
      __builtin_copy_i2dhi
      __builtin_copy_dlo2i
      __builtin_copy_i2dlo
      __add_and_fetch
      __sub_and_fetch
      __or_and_fetch
      __xor_and_fetch
      __and_and_fetch
      __nand_and_fetch
      __mpy_and_fetch
      __min_and_fetch
      __max_and_fetch
      __fetch_and_add
      __fetch_and_sub
      __fetch_and_or
      __fetch_and_xor
      __fetch_and_and
      __fetch_and_nand
      __fetch_and_mpy
      __fetch_and_min
      __fetch_and_max
      __lock_test_and_set
      __lock_release
      __lock_acquire
      __compare_and_swap
      __synchronize
      __high_multiply
      __unix
      __sgi
      __linux__
      __i386__
      __i486__
      __cplusplus
      __embedded_cplusplus
      // long double conversion members mangled as __opr
      // http://gcc.gnu.org/ml/libstdc++/1999-q4/msg00060.html
      __opr
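
To see why the short names above matter, consider what happens when a
platform header claims one of them as an object-like macro. The sketch
below imitates that situation with a locally defined _U; the macro value,
the wrapper template, and the unwrap and check_wrapper helpers are all
assumptions made for illustration, not code from any real system header
or from the library.

```cpp
#include <cassert>

// Hypothetical stand-in for a platform header that claims a short
// reserved name as an object-like macro.
#define _U 01

// A library template written as
//   template<typename _U> struct wrapper { _U _M_value; };
// would be preprocessed into "template<typename 01> ...", which does
// not compile.  Choosing a name that is absent from the bad-names list
// avoids the collision.
template<typename _Up>
  struct wrapper
  {
    _Up _M_value;
  };

inline int
unwrap(const wrapper<int>& __w)
{ return __w._M_value; }

// Small self-check showing that the safely named template is usable
// even while the _U macro is defined.
inline void
check_wrapper()
{
  wrapper<int> __w = { 42 };
  assert(unwrap(__w) == 42);
}
```

The failure mode is entirely in the preprocessor, which is why it cannot
be fixed with namespaces or scoping and the names simply have to be
avoided.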

By Example

      This library is written to appropriate C++ coding standards. As such,
      it is intended to precede the recommendations of the GNU Coding
      Standard, which can be referenced in full here:

      http://www.gnu.org/prep/standards/standards.html#Formatting

      The rest of this is also interesting reading, but skip the "Design
      Advice" part.

      The GCC coding conventions are here, and are also useful:
      http://gcc.gnu.org/codingconventions.html

      In addition, because it doesn't seem to be stated explicitly anywhere
      else, there is an 80 column source limit.

      ChangeLog entries for member functions should use the
      classname::member function name syntax as follows:

1999-04-15  Dennis Ritchie  <dr@att.com>

      * src/basic_file.cc (__basic_file::open): Fix thinko in
      _G_HAVE_IO_FILE_OPEN bits.

      Notable areas of divergence from what may be previous local practice
      (particularly for GNU C) include:

      01. Pointers and references

        char* p = "flop";
        char& c = *p;
          -NOT-
        char *p = "flop";  // wrong
        char &c = *p;      // wrong

      Reason: In C++, definitions are mixed with executable code. Here,
      p is being initialized, not *p.  This is near-universal
      practice among C++ programmers; it is normal for C hackers
      to switch spontaneously as they gain experience.

      02. Operator names and parentheses

        operator==(type)
          -NOT-
        operator == (type)  // wrong

      Reason: The == is part of the function name. Separating
      it makes the declaration look like an expression.

      03. Function names and parentheses

        void mangle()
          -NOT-
        void mangle ()  // wrong

      Reason: no space before parentheses (except after a control-flow
      keyword) is near-universal practice for C++. It identifies the
      parentheses as the function-call operator or declarator, as
      opposed to an expression or other overloaded use of parentheses.

      04. Template function indentation

        template<typename T>
          void
          template_function(args)
          { }
          -NOT-
        template<class T>
        void template_function(args) {};

      Reason: In class definitions, without indentation whitespace is
      needed both above and below the declaration to distinguish
      it visually from other members. (Also, re: "typename"
      rather than "class".)  T often could be int, which is
      not a class. ("class", here, is an anachronism.)

      05. Template class indentation

        template<typename _CharT, typename _Traits>
          class basic_ios : public ios_base
          {
          public:
            // Types:
          };
          -NOT-
        template<class _CharT, class _Traits>
        class basic_ios : public ios_base
          {
          public:
            // Types:
          };
          -NOT-
        template<class _CharT, class _Traits>
          class basic_ios : public ios_base
        {
          public:
            // Types:
        };

      06. Enumerators

        enum
        {
          space = _ISspace,
          print = _ISprint,
          cntrl = _IScntrl
        };
          -NOT-
        enum { space = _ISspace, print = _ISprint, cntrl = _IScntrl };

      07. Member initialization lists
      All one line, separate from class name.

        gribble::gribble()
        : _M_private_data(0), _M_more_stuff(0), _M_helper(0)
        { }
          -NOT-
        gribble::gribble() : _M_private_data(0), _M_more_stuff(0), _M_helper(0)
        { }

      08. Try/Catch blocks

        try
          {
            //
          }
        catch (...)
          {
            //
          }
          -NOT-
        try {
          //
        } catch(...) {
          //
        }

      09. Member functions declarations and definitions
      Keywords such as extern, static, export, explicit, inline, etc.
      go on the line above the function name. Thus

      virtual int
      foo()
      -NOT-
      virtual int foo()

      Reason: GNU coding conventions dictate that return types for
      functions go on a separate line from the function name and
      parameter list for definitions. For C++, where we have member
      functions that can be either inline definitions or declarations,
      keeping to this standard allows all member function names for a
      given class to be aligned to the same margin, increasing
      readability.

      10. Invocation of member functions with "this->"
      For non-uglified names, use this->name to call the function.

      this->sync()
      -NOT-
      sync()

      Reason: Koenig lookup.
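
      The one-word reason above is terse. One concrete case where the
      qualification matters, offered here as an illustration rather than
      as the maintainers' stated rationale, is a member inherited from a
      base class that depends on a template parameter. The buffer_base
      and buffer names below are hypothetical.

```cpp
#include <cassert>

template<typename T>
  struct buffer_base
  {
    int
    sync()
    { return 1; }
  };

template<typename T>
  struct buffer : public buffer_base<T>
  {
    int
    flush()
    {
      // An unqualified "sync()" here is not looked up in the dependent
      // base class buffer_base<T>, so it would either fail to compile
      // or bind to some unrelated non-member function named sync.
      // "this->sync()" defers lookup until instantiation and finds the
      // inherited member.
      return this->sync();
    }
  };
```

      Writing this-> consistently means the call resolves to the member
      regardless of which non-member functions happen to be visible.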

      11. Namespaces

      namespace std
      {
        blah blah blah;
      } // namespace std

      -NOT-

      namespace std {
        blah blah blah;
      } // namespace std

      12. Spacing under protected and private in class declarations:
      space above, none below
      i.e.

      public:
        int foo;

      -NOT-
      public:

        int foo;

      13. Spacing WRT return statements.
      no extra spacing before returns, no parentheses
      i.e.

      }
      return __ret;

      -NOT-
      }

      return __ret;

      -NOT-

      }
      return (__ret);

      14. Location of global variables.
      All global variables of class type, whether in the "user visible"
      space (e.g., cin) or the implementation namespace, must be defined
      as a character array with the appropriate alignment and then later
      re-initialized to the correct value.

      This is due to startup issues on certain platforms, such as AIX.
      For more explanation and examples, see src/globals.cc. All such
      variables should be contained in that file, for simplicity.
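
      The technique in item 14 can be sketched as follows. This is not
      the code from src/globals.cc; it assumes a C++11 compiler for
      alignas, and the logger class, the global_logger_storage buffer,
      and the init_global_logger function are hypothetical names chosen
      for illustration.

```cpp
#include <cassert>
#include <new>

// Hypothetical class type with a nontrivial constructor.
struct logger
{
  explicit
  logger(int level) : level_(level) { }

  int level_;
};

// Raw, suitably aligned storage: a char array has no constructor, so
// nothing needs to run for it during static initialization.
alignas(logger) char global_logger_storage[sizeof(logger)];

// Construct the real object in place once the library is ready.
inline logger&
init_global_logger(int level)
{ return *new (global_logger_storage) logger(level); }
```

      The object's lifetime then starts when the initialization function
      runs, not at a point chosen by the platform's static constructor
      ordering.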

      15. Exception abstractions
      Use the exception abstractions found in functexcept.h, which allow
      C++ programmers to use this library with -fno-exceptions.  (Even if
      that is rarely advisable, it's a necessary evil for backwards
      compatibility.)

      16. Exception error messages
      All start with the name of the function where the exception is
      thrown, and then (optional) descriptive text is added. Example:

      __throw_logic_error(__N("basic_string::_S_construct NULL not valid"));

      Reason: The verbose terminate handler prints out exception::what(),
      as well as the typeinfo for the thrown exception. As this is the
      default terminate handler, by putting location info into the
      exception string, a very useful error message is printed out for
      uncaught exceptions. So useful, in fact, that non-programmers can
      give useful error messages, and programmers can intelligently
      speculate what went wrong without even using a debugger.
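
      Items 15 and 16 can be illustrated together. The sketch below does
      not use the library's own functexcept.h; it defines a hypothetical
      helper, throw_logic_error_sketch, in the same spirit, and assumes
      that the compiler defines __cpp_exceptions or __EXCEPTIONS when
      exceptions are enabled. The checked_length function is likewise
      invented for the example.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>
#include <stdexcept>

// Hypothetical helper in the spirit of the functexcept.h abstractions:
// throw when exceptions are enabled, abort otherwise.
inline void
throw_logic_error_sketch(const char* what)
{
#if defined(__cpp_exceptions) || defined(__EXCEPTIONS)
  throw std::logic_error(what);
#else
  (void) what;
  std::abort();
#endif
}

// The message starts with the name of the throwing function, followed
// by descriptive text.
inline int
checked_length(const char* s)
{
  if (!s)
    throw_logic_error_sketch("checked_length NULL not valid");
  return static_cast<int>(std::strlen(s));
}
```

      Because no throw expression appears outside the helper, code
      written this way still compiles when exceptions are disabled.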

      17. The doxygen style guide to comments is a separate document,
      see index.

      The library currently has a mixture of GNU-C and modern C++ coding
      styles. The GNU C usages will be combed out gradually.

      Name patterns:

      For nonstandard names appearing in Standard headers, we are constrained
      to use names that begin with underscores. This is called "uglification".
      The convention is:

      Local and argument names:  __[a-z].*

      Examples:  __count  __ix  __s1

      Type names and template formal-argument names: _[A-Z][^_].*

      Examples:  _Helper  _CharT  _N

      Member data and function names: _M_.*

      Examples:  _M_num_elements  _M_initialize()

      Static data members, constants, and enumerations: _S_.*

      Examples: _S_max_elements  _S_default_value

      Don't use names in the same scope that differ only in the prefix,
      e.g. _S_top and _M_top. See BADNAMES for a list of forbidden names.
      (The most tempting of these seem to be "_T" and "__sz".)

      Names must never have "__" internally; it would confuse name
      unmanglers on some targets. Also, never use "__[0-9]", same reason.

      --------------------------

      [BY EXAMPLE]

      #ifndef  _HEADER_
      #define  _HEADER_ 1

      namespace std
      {
        class gribble
        {
        public:
          gribble() throw();

          gribble(const gribble&);

          explicit
          gribble(int __howmany);

          gribble&
          operator=(const gribble&);

          virtual
          ~gribble() throw();

          // Start with a capital letter, end with a period.
          inline int
          public_member() const;

          // In-class function definitions should be restricted to one-liners.
          int
          one_line() { return 0; }

          int
          two_lines(const char* __arg)
          { return strchr(__arg, 'a') != 0; }

          inline int
          three_lines();  // inline, but defined below.

          // Note indentation.
          template<typename _Formal_argument>
            void
            public_template() const throw();

          template<typename _Iterator>
            void
            other_template();

        private:
          class _Helper;

          int _M_private_data;
          int _M_more_stuff;
          _Helper* _M_helper;
          int _M_private_function();

          enum _Enum
            {
              _S_one,
              _S_two
            };

          static void
          _S_initialize_library();
        };

        // More-or-less-standard language features described by lack, not presence.
      # ifndef _G_NO_LONGLONG
        extern long long _G_global_with_a_good_long_name;  // avoid globals!
      # endif

        // Avoid in-class inline definitions, define separately;
        // likewise for member class definitions:
        inline int
        gribble::public_member() const
        { int __local = 0; return __local; }

        class gribble::_Helper
        {
          int _M_stuff;

          friend class gribble;
        };
      }

      // Names beginning with "__": only for arguments and
      //   local variables; never use "__" in a type name, or
      //   within any name; never use "__[0-9]".

      #endif /* _HEADER_ */


      namespace std
      {
        template<typename T>  // notice: "typename", not "class", no space
          long_return_value_type<with_many, args>
          function_name(char* pointer,               // "char *pointer" is wrong.
                        char* argument,
                        const Reference& ref)
          {
            // int a_local;  /* wrong; see below. */
            if (test)
            {
              nested code
            }

            int a_local = 0;  // declare variable at first use.

            //  char a, b, *p;   /* wrong */
            char a = 'a';
            char b = a + 1;
            char* c = "abc";  // each variable goes on its own line, always.

            // except maybe here...
            for (unsigned i = 0, mask = 1; mask; ++i, mask <<= 1)
            {
              // ...
            }
          }

        gribble::gribble()
        : _M_private_data(0), _M_more_stuff(0), _M_helper(0)
        { }

        int
        gribble::three_lines()
        {
          // doesn't fit in one line.
        }
      } // namespace std


Design Notes

    The Library
    -----------

    This paper covers two major areas:

    - Features and policies not mentioned in the standard that
    the quality of the library implementation depends on, including
    extensions and "implementation-defined" features;

    - Plans for required but unimplemented library features and
    optimizations to them.

    Overhead
    --------

    The standard defines a large library, much larger than the standard
    C library. A naive implementation would suffer substantial overhead
    in compile time, executable size, and speed, rendering it unusable
    in many (particularly embedded) applications. The alternative demands
    care in construction, and some compiler support, but there is no
    need for library subsets.

    What are the sources of this overhead?  There are four main causes:

    - The library is specified almost entirely as templates, which
    with current compilers must be included in-line, resulting in
    very slow builds as tens or hundreds of thousands of lines
    of function definitions are read for each user source file.
    Indeed, the entire SGI STL, as well as the dos Reis valarray,
    are provided purely as header files, largely for simplicity in
    porting. Iostream/locale is (or will be) as large again.

    - The library is very flexible, specifying a multitude of hooks
    where users can insert their own code in place of defaults.
    When these hooks are not used, any time and code expended to
    support that flexibility is wasted.

    - Templates are often described as causing "code bloat". In
    practice, this refers (when it refers to anything real) to several
    independent processes. First, when a class template is manually
    instantiated in its entirety, current compilers place the definitions
    for all members in a single object file, so that a program linking
    to one member gets definitions of all. Second, template functions
    which do not actually depend on the template argument are, under
    current compilers, generated anew for each instantiation, rather
    than being shared with other instantiations. Third, some of the
    flexibility mentioned above comes from virtual functions (both in
    regular classes and template classes) which current linkers add
    to the executable file even when they manifestly cannot be called.

    - The library is specified to use a language feature, exceptions,
    which in the current gcc compiler ABI imposes a run time and
    code space cost to handle the possibility of exceptions even when
    they are not used. Under the new ABI (accessed with -fnew-abi),
    there is a space overhead and a small reduction in code efficiency
    resulting from lost optimization opportunities associated with
    non-local branches associated with exceptions.
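
    The second "code bloat" process in the list above, code that does
    not depend on the template argument being generated anew for each
    instantiation, is commonly countered by moving such code into a
    non-template base class. The following is a minimal sketch of that
    coding technique, not an excerpt from the library; counted_base and
    tiny_stack are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>

// Non-template base: code that does not depend on the element type is
// compiled once and shared by every instantiation.
class counted_base
{
public:
  counted_base() : count_(0) { }

  std::size_t
  size() const
  { return count_; }

protected:
  void
  note_insert()
  { ++count_; }

private:
  std::size_t count_;
};

// Only the type-dependent part remains in the template.
template<typename T>
  class tiny_stack : public counted_base
  {
  public:
    // Stores value if there is room; silently ignores it otherwise.
    void
    push(const T& value)
    {
      if (this->size() < capacity)
        {
          data_[this->size()] = value;
          this->note_insert();
        }
    }

    // Precondition: at least one element has been pushed.
    const T&
    top() const
    { return data_[this->size() - 1]; }

  private:
    static const std::size_t capacity = 8;
    T data_[capacity];
  };
```

    Here tiny_stack<int> and tiny_stack<char> share one copy of size()
    and note_insert(), while push() and top() are still generated per
    element type.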

    What can be done to eliminate this overhead?  A variety of coding
    techniques, and compiler, linker and library improvements and
    extensions may be used, as covered below. Most are not difficult,
    and some are already implemented in varying degrees.

    Overhead: Compilation Time
    --------------------------

    Providing "ready-instantiated" template code in object code archives
    allows us to avoid generating and optimizing template instantiations
    in each compilation unit which uses them. However, the number of such
    instantiations that are useful to provide is limited, and anyway this
    is not enough, by itself, to minimize compilation time. In particular,
    it does not reduce time spent parsing conforming headers.
1022
 
1023
    Quicker header parsing will depend on library extensions and compiler
1024
    improvements.  One approach is some variation on the techniques
1025
    previously marketed as "pre-compiled headers", now standardized as
1026
    support for the "export" keyword. "Exported" template definitions
1027
    can be placed (once) in a "repository" -- really just a library, but
1028
    of template definitions rather than object code -- to be drawn upon
1029
    at link time when an instantiation is needed, rather than placed in
1030
    header files to be parsed along with every compilation unit.
1031
 
1032
    Until "export" is implemented we can put some of the lengthy template
1033
    definitions in #if guards or alternative headers so that users can skip
1034
    over the full definitions when they need only the ready-instantiated
1035
    specializations.
1036
 
1037
    To be precise, this means that certain headers which define
1038
    templates which users normally use only for certain arguments
1039
    can be instrumented to avoid exposing the template definitions
1040
    to the compiler unless a macro is defined. For example, in
1041
    <string>, we might have:
1042
 
1043
    template <class _CharT, ... > class basic_string {
    ... // member declarations
    };
    ... // operator declarations
 
    #ifdef _STRICT_ISO_
    # if _G_NO_TEMPLATE_EXPORT
    #   include <bits/std_locale.h>  // headers needed by definitions
    #   ...
    #   include <bits/string.tcc>  // member and global template definitions.
    # endif
    #endif
1055
 
1056
    Users who compile without specifying a strict-ISO-conforming flag
1057
    would not see many of the template definitions they now see, and rely
1058
    instead on ready-instantiated specializations in the library. This
1059
    technique would be useful for the following substantial components:
1060
    string, locale/iostreams, valarray. It would *not* be useful or
1061
    usable with the following: containers, algorithms, iterators,
1062
    allocator. Since these constitute a large (though decreasing)
1063
    fraction of the library, the benefit the technique offers is
1064
    limited.
1065
 
1066
    The language specifies the semantics of the "export" keyword, but
1067
    the gcc compiler does not yet support it. When it does, problems
1068
    with large template inclusions can largely disappear, given some
1069
    minor library reorganization, along with the need for the apparatus
1070
    described above.
1071
 
1072
    Overhead: Flexibility Cost
1073
    --------------------------
1074
 
1075
    The library offers many places where users can specify operations
1076
    to be performed by the library in place of defaults. Sometimes
1077
    this seems to require that the library use a more-roundabout, and
1078
    possibly slower, way to accomplish the default requirements than
1079
    would be used otherwise.
1080
 
1081
    The primary protection against this overhead is thorough compiler
1082
    optimization, to crush out layers of inline function interfaces.
1083
    Kuck & Associates has demonstrated the practicality of this kind
1084
    of optimization.
1085
 
1086
    The second line of defense against this overhead is explicit
1087
    specialization. By defining helper function templates, and writing
1088
    specialized code for the default case, overhead can be eliminated
1089
    for that case without sacrificing flexibility. This takes full
1090
    advantage of any ability of the optimizer to crush out degenerate
1091
    code.
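 
    As a sketch of this technique (min_helper and find_min are
    hypothetical names, not part of the library), a helper class
    template can be partially specialized for the default comparison
    std::less<T>. A class template is used because function templates
    cannot be partially specialized. The default case then compares
    with operator< directly, while user-supplied comparisons still work:

```cpp
#include <cassert>
#include <functional>

// Generic helper: calls whatever comparison object the user supplied.
template <class T, class Compare>
struct min_helper {
    static const bool is_default = false;
    static const T* find(const T* first, const T* last, Compare comp)
    {
        const T* best = first;
        for (const T* p = first; p != last; ++p)
            if (comp(*p, *best))
                best = p;
        return best;
    }
};

// Default case: std::less<T> is assumed to mean operator<, so use it directly.
template <class T>
struct min_helper<T, std::less<T> > {
    static const bool is_default = true;
    static const T* find(const T* first, const T* last, std::less<T>)
    {
        const T* best = first;
        for (const T* p = first; p != last; ++p)
            if (*p < *best)
                best = p;
        return best;
    }
};

// Public interface: unchanged whichever helper is selected.
template <class T, class Compare>
const T* find_min(const T* first, const T* last, Compare comp)
{
    return min_helper<T, Compare>::find(first, last, comp);
}

template <class T>
const T* find_min(const T* first, const T* last)
{
    return find_min(first, last, std::less<T>());
}

void example()
{
    int a[] = {3, 1, 2};
    assert(*find_min(a, a + 3) == 1);                       // default path
    assert(*find_min(a, a + 3, std::greater<int>()) == 3);  // generic path
}
```

    Callers see one interface; only the helper chosen behind it differs.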
1092
 
1093
    The library specifies many virtual functions which current linkers
1094
    load even when they cannot be called. Some minor improvements to the
1095
    compiler and to ld would eliminate any such overhead by simply
1096
    omitting virtual functions that the complete program does not call.
1097
    A prototype of this work has already been done. For targets where
1098
    GNU ld is not used, a "pre-linker" could do the same job.
1099
 
1100
    The main areas in the standard interface where user flexibility
1101
    can result in overhead are:
1102
 
1103
    - Allocators:  Containers are specified to use user-definable
1104
    allocator types and objects, making tuning for the container
1105
    characteristics tricky.
1106
 
1107
    - Locales: the standard specifies locale objects used to implement
1108
    iostream operations, involving many virtual functions which use
1109
    streambuf iterators.
1110
 
1111
    - Algorithms and containers: these may be instantiated on any type,
1112
    frequently duplicating code for identical operations.
1113
 
1114
    - Iostreams and strings: users are permitted to use these on their
1115
    own types, and specify the operations the stream must use on these
1116
    types.
1117
 
1118
    Note that these sources of overhead are _avoidable_. The techniques
1119
    to avoid them are covered below.
1120
 
1121
    Code Bloat
1122
    ----------
1123
 
1124
    In the SGI STL, and in some other headers, many of the templates
1125
    are defined "inline" -- either explicitly or by their placement
1126
    in class definitions -- which should not be inline. This is a
1127
    source of code bloat. Matt had remarked that he was relying on
1128
    the compiler to recognize what was too big to benefit from inlining,
1129
    and generate it out-of-line automatically. However, this also can
1130
    result in code bloat except where the linker can eliminate the extra
1131
    copies.
1132
 
1133
    Fixing these cases will require an audit of all inline functions
1134
    defined in the library to determine which merit inlining, and moving
1135
    the rest out of line. This is an issue mainly in chapters 23, 25, and
1136
    27. Of course it can be done incrementally, and we should generally
1137
    accept patches that move large functions out of line and into ".tcc"
1138
    files, which can later be pulled into a repository. Compiler/linker
1139
    improvements to recognize very large inline functions and move them
1140
    out-of-line, but shared among compilation units, could make this
1141
    work unnecessary.
1142
 
1143
    Pre-instantiating template specializations currently produces large
1144
    amounts of dead code which bloats statically linked programs. The
1145
    current state of the static library, libstdc++.a, is intolerable on
1146
    this account, and will fuel further confused speculation about a need
1147
    for a library "subset". A compiler improvement that treats each
1148
    instantiated function as a separate object file, for linking purposes,
1149
    would be one solution to this problem. An alternative would be to
1150
    split up the manual instantiation files into dozens upon dozens of
1151
    little files, each compiled separately, but an abortive attempt at
1152
    this was done for <string> and, though it is far from complete, it
1153
    is already a nuisance. A better interim solution (just until we have
1154
    "export") is badly needed.
1155
 
1156
    When building a shared library, the current compiler/linker cannot
1157
    automatically generate the instantiations needed. This creates a
1158
    miserable situation; it means any time something is changed in the
1159
    library, before a shared library can be built someone must manually
1160
    copy the declarations of all templates that are needed by other parts
1161
    of the library to an "instantiation" file, and add it to the build
1162
    system to be compiled and linked to the library. This process is
1163
    readily automated, and should be automated as soon as possible.
1164
    Users building their own shared libraries experience identical
1165
    frustrations.
1166
 
1167
    Sharing common aspects of template definitions among instantiations
1168
    can radically reduce code bloat. The compiler could help a great
1169
    deal here by recognizing when a function depends on nothing about
1170
    a template parameter, or only on its size, and giving the resulting
1171
    function a link-name "equate" that allows it to be shared with other
1172
    instantiations. Implementation code could take advantage of the
1173
    capability by factoring out code that does not depend on the template
1174
    argument into separate functions to be merged by the compiler.
1175
 
1176
    Until such a compiler optimization is implemented, much can be done
1177
    manually (if tediously) in this direction. One such optimization is
1178
    to derive class templates from non-template classes, and move as much
1179
    implementation as possible into the base class. Another is to partial-
1180
    specialize certain common instantiations, such as vector<T*>, to share
1181
    code for instantiations on all types T. While these techniques work,
1182
    they are far from the complete solution that a compiler improvement
1183
    would afford.
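 
    A minimal sketch of the first technique, assuming a small
    fixed-capacity stack of pointers (ptr_stack_base and ptr_stack are
    hypothetical names). All bookkeeping lives in a non-template base
    that is compiled once; the template layer only adds type-safe casts:

```cpp
#include <cassert>
#include <cstddef>

// Non-template base: the storage logic is generated exactly once.
class ptr_stack_base {
public:
    std::size_t size() const { return size_; }

protected:
    ptr_stack_base() : size_(0) {}

    bool push_raw(void* p)
    {
        if (size_ == capacity)
            return false;          // full
        slots_[size_++] = p;
        return true;
    }

    void* pop_raw() { return size_ ? slots_[--size_] : 0; }

private:
    static const std::size_t capacity = 16;
    void* slots_[capacity];
    std::size_t size_;
};

// Thin template wrapper: each instantiation adds only trivial inline casts.
template <class T>
class ptr_stack : public ptr_stack_base {
public:
    bool push(T* p) { return push_raw(p); }
    T* pop() { return static_cast<T*>(pop_raw()); }
};

void example()
{
    int i = 1;
    double d = 2.0;
    ptr_stack<int> si;
    ptr_stack<double> sd;
    assert(si.push(&i) && sd.push(&d));
    assert(si.pop() == &i);
    assert(sd.pop() == &d);
    assert(si.pop() == 0);     // empty stack yields a null pointer
}
```

    However many element types are used, push_raw and pop_raw appear
    in the program only once.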
1184
 
1185
    Overhead: Expensive Language Features
1186
    -------------------------------------
1187
 
1188
    The main "expensive" language feature used in the standard library
1189
    is exception support, which requires compiling in cleanup code with
1190
    static table data to locate it, and linking in library code to use
1191
    the table. For small embedded programs the amount of such library
1192
    code and table data is assumed by some to be excessive. Under the
1193
    "new" ABI this perception is generally exaggerated, although in some
1194
    cases it may actually be excessive.
1195
 
1196
    To implement a library which does not use exceptions directly is
1197
    not difficult given minor compiler support (to "turn off" exceptions
1198
    and ignore exception constructs), and results in no great library
1199
    maintenance difficulties. To be precise, given "-fno-exceptions",
1200
    the compiler should treat "try" blocks as ordinary blocks, and
1201
    "catch" blocks as dead code to ignore or eliminate. Compiler
1202
    support is not strictly necessary, except in the case of "function
1203
    try blocks"; otherwise the following macros almost suffice:
1204
 
1205
    #define throw(X)
    #define try      if (true)
    #define catch(X) else if (false)
1208
 
1209
    However, there may be a need to use function try blocks in the
1210
    library implementation, and use of macros in this way can make
1211
    correct diagnostics impossible. Furthermore, use of this scheme
1212
    would require the library to call a function to re-throw exceptions
1213
    from a try block. Implementing the above semantics in the compiler
1214
    is preferable.
1215
 
1216
    Given the support above (however implemented) it only remains to
1217
    replace code that "throws" with a call to a well-documented "handler"
1218
    function in a separate compilation unit which may be replaced by
1219
    the user. The main source of exceptions that would be difficult
1220
    for users to avoid is memory allocation failures, but users can
1221
    define their own memory allocation primitives that never throw.
1222
    Otherwise, the complete list of such handlers, and which library
1223
    functions may call them, would be needed for users to be able to
1224
    implement the necessary substitutes. (Fortunately, they have the
1225
    source code.)
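 
    The handler idea might look roughly like the sketch below. Note
    one substitution: the text describes a handler replaced at link
    time from a separate compilation unit, whereas this sketch uses a
    function pointer replaced at run time, which is easier to show in
    one file. All names here (length_error_handler, checked_reserve,
    and so on) are hypothetical:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

typedef void (*length_error_handler)(const char* what);

// Default behavior when nothing may be thrown: stop the program.
inline void default_length_error_handler(const char*) { std::abort(); }

inline length_error_handler& current_length_error_handler()
{
    static length_error_handler handler = default_length_error_handler;
    return handler;
}

// Installs a user handler and returns the previous one.
inline length_error_handler set_length_error_handler(length_error_handler h)
{
    length_error_handler old = current_length_error_handler();
    current_length_error_handler() = h ? h : default_length_error_handler;
    return old;
}

// Library-style function: where it would "throw", it calls the handler.
inline bool checked_reserve(std::size_t wanted, std::size_t max_size)
{
    if (wanted > max_size) {
        current_length_error_handler()("checked_reserve: request exceeds max_size");
        return false;          // reached only if the handler returns
    }
    return true;
}

// Example user handler that records the failure instead of aborting.
static int handler_calls = 0;
static void counting_handler(const char*) { ++handler_calls; }

void example()
{
    set_length_error_handler(counting_handler);
    assert(checked_reserve(10, 100));
    assert(!checked_reserve(1000, 100));
    assert(handler_calls == 1);
}
```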
1226
 
1227
    Opportunities
1228
    -------------
1229
 
1230
    The template capabilities of C++ offer enormous opportunities for
1231
    optimizing common library operations, well beyond what would be
1232
    considered "eliminating overhead". In particular, many operations
1233
    done in Glibc with macros that depend on proprietary language
1234
    extensions can be implemented in pristine Standard C++. For example,
1235
    the chapter 25 algorithms, and even C library functions such as strchr,
1236
    can be specialized for the case of static arrays of known (small) size.
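 
    For instance, a search over a static array can take the array by
    reference so that its size is a compile-time constant the
    optimizer may use to unroll the loop. This is only a sketch;
    find_in_array is a hypothetical name:

```cpp
#include <cassert>
#include <cstddef>

// N is deduced from the array type, so the trip count is known at compile time.
template <class T, std::size_t N>
inline const T* find_in_array(const T (&a)[N], const T& value)
{
    for (std::size_t i = 0; i != N; ++i)
        if (a[i] == value)
            return a + i;
    return 0;                  // not found
}

void example()
{
    static const int primes[] = {2, 3, 5, 7};
    assert(find_in_array(primes, 5) == primes + 2);
    assert(find_in_array(primes, 4) == 0);
}
```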
1237
 
1238
    Detailed optimization opportunities are identified below where
1239
    the component where they would appear is discussed. Of course new
1240
    opportunities will be identified during implementation.
1241
 
1242
    Unimplemented Required Library Features
1243
    ---------------------------------------
1244
 
1245
    The standard specifies hundreds of components, grouped broadly by
1246
    chapter. These are listed in excruciating detail in the CHECKLIST
1247
    file.
1248
 
1249
    17 general
1250
    18 support
1251
    19 diagnostics
1252
    20 utilities
1253
    21 string
1254
    22 locale
1255
    23 containers
1256
    24 iterators
1257
    25 algorithms
1258
    26 numerics
1259
    27 iostreams
1260
    Annex D  backward compatibility
1261
 
1262
    Anyone participating in implementation of the library should obtain
1263
    a copy of the standard, ISO 14882.  People in the U.S. can obtain an
1264
    electronic copy for US$18 from ANSI's web site. Those from other
1265
    countries should visit http://www.iso.org/ to find out the location
1266
    of their country's representation in ISO, in order to know who can
1267
    sell them a copy.
1268
 
1269
    The emphasis in the following sections is on unimplemented features
1270
    and optimization opportunities.
1271
 
1272
    Chapter 17  General
1273
    -------------------
1274
 
1275
    Chapter 17 concerns overall library requirements.
1276
 
1277
    The standard doesn't mention threads. A multi-thread (MT) extension
1278
    primarily affects operators new and delete (18), allocator (20),
1279
    string (21), locale (22), and iostreams (27). The common underlying
1280
    support needed for this is discussed under chapter 20.
1281
 
1282
    The standard requirements on names from the C headers create a
1283
    lot of work, mostly done. Names in the C headers must be visible
1284
    in the std:: and sometimes the global namespace; the names in the
1285
    two scopes must refer to the same object. More stringent is that
1286
    Koenig lookup implies that any types specified as defined in std::
1287
    really are defined in std::. Names optionally implemented as
1288
    macros in C cannot be macros in C++. (An overview may be read at
1289
    <http://www.cantrip.org/cheaders.html>). The scripts "inclosure"
1290
    and "mkcshadow", and the directories shadow/ and cshadow/, are the
1291
    beginning of an effort to conform in this area.
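 
    The "same object" requirement can be checked mechanically for type
    names. The sketch below assumes a C++11 or later compiler (for
    static_assert and <type_traits>, which postdate these notes);
    c_names_agree is a hypothetical name:

```cpp
#include <cstddef>      // std::size_t
#include <ctime>        // std::tm
#include <stddef.h>     // ::size_t
#include <time.h>       // ::tm
#include <type_traits>  // std::is_same

// True when each C name denotes the same type in std:: and in the global scope.
constexpr bool c_names_agree()
{
    return std::is_same<std::size_t, ::size_t>::value
        && std::is_same<std::tm, ::tm>::value;
}

static_assert(c_names_agree(),
              "std:: and global C names must denote the same types");
```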
1292
 
1293
    A correct conforming definition of C header names based on underlying
1294
    C library headers, and practical linking of conforming namespaced
1295
    customer code with third-party C libraries depend ultimately on
1296
    an ABI change, allowing namespaced C type names to be mangled into
1297
    type names as if they were global, somewhat as C function names in a
1298
    namespace, or C++ global variable names, are left unmangled. Perhaps
1299
    another "extern" mode, such as 'extern "C-global"' would be an
1300
    appropriate place for such type definitions. Such a type would
1301
    affect mangling as follows:
1302
 
1303
    namespace A {
      struct X {};
      extern "C-global" {  // or maybe just 'extern "C"'
        struct Y {};
      };
    }
    void f(A::X*);  // mangles to f__FPQ21A1X
    void f(A::Y*);  // mangles to f__FP1Y
1311
 
1312
    (It may be that this is really the appropriate semantics for regular
1313
    'extern "C"', and 'extern "C-global"', as an extension, would not be
1314
    necessary.) This would allow functions declared in non-standard C headers
1315
    (and thus fixable by neither us nor users) to link properly with functions
1316
    declared using C types defined in properly-namespaced headers. The
1317
    problem this solves is that C headers (which C++ programmers do persist
1318
    in using) frequently forward-declare C struct tags without including
1319
    the header where the type is defined, as in
1320
 
1321
    struct tm;
    void munge(tm*);
1323
 
1324
    Without some compiler accommodation, munge cannot be called by correct
1325
    C++ code using a pointer to a correctly-scoped tm* value.
1326
 
1327
    The current C headers use the preprocessor extension "#include_next",
1328
    which the compiler complains about when run "-pedantic".
1329
    (Incidentally, it appears that "-fpedantic" is currently ignored,
1330
    probably a bug.)  The solution in the C compiler is to use
1331
    "-isystem" rather than "-I", but unfortunately in g++ this seems
1332
    also to wrap the whole header in an 'extern "C"' block, so it's
1333
    unusable for C++ headers. The correct solution appears to be to
1334
    allow the various special include-directory options, if not given
1335
    an argument, to affect subsequent include-directory options additively,
1336
    so that if one said
1337
 
1338
    -pedantic -iprefix $(prefix) \
    -idirafter -ino-pedantic -ino-extern-c -iwithprefix -I g++-v3 \
    -iwithprefix -I g++-v3/ext
1341
 
1342
    the compiler would search $(prefix)/g++-v3 and not report
1343
    pedantic warnings for files found there, but treat files in
1344
    $(prefix)/g++-v3/ext pedantically. (The undocumented semantics
1345
    of "-isystem" in g++ stink. Can they be rescinded?  If not it
1346
    must be replaced with something more rationally behaved.)
1347
 
1348
    All the C headers need the treatment above; in the standard these
1349
    headers are mentioned in various chapters. Below, I have only
1350
    mentioned those that present interesting implementation issues.
1351
 
1352
    The components identified as "mostly complete", below, have not been
1353
    audited for conformance. In many cases where the library passes
1354
    conformance tests we have non-conforming extensions that must be
1355
    wrapped in #if guards for "pedantic" use, and in some cases renamed
1356
    in a conforming way for continued use in the implementation regardless
1357
    of conformance flags.
1358
 
1359
    The STL portion of the library still depends on a header
1360
    stl/bits/stl_config.h full of #ifdef clauses. This apparatus
1361
    should be replaced with autoconf/automake machinery.
1362
 
1363
    The SGI STL defines a type_traits<> template, specialized for
1364
    many types in their code including the built-in numeric and
1365
    pointer types and some library types, to direct optimizations of
1366
    standard functions. The SGI compiler has been extended to generate
1367
    specializations of this template automatically for user types,
1368
    so that use of STL templates on user types can take advantage of
1369
    these optimizations. Specializations for other, non-STL, types
1370
    would make more optimizations possible, but extending the gcc
1371
    compiler in the same way would be much better. The next round of
    standardization will probably ratify this, though with changes,
    so it should be renamed to place it in the implementation namespace.
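 
    A traits-directed optimization of this kind might be sketched as
    follows. The names copy_traits, true_tag, false_tag and
    copy_elements are hypothetical and are not the SGI interface; the
    point is only that a per-type trait selects a bulk memmove where
    that is safe and an element-by-element loop elsewhere:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

struct true_tag {};
struct false_tag {};

// Conservative default: assume copying needs the assignment operator.
template <class T>
struct copy_traits { typedef false_tag has_trivial_copy; };

// Hand-written specializations for types known to be bitwise copyable.
template <> struct copy_traits<int>  { typedef true_tag has_trivial_copy; };
template <> struct copy_traits<char> { typedef true_tag has_trivial_copy; };
template <class T> struct copy_traits<T*> { typedef true_tag has_trivial_copy; };

// Fast path: one bulk copy.
template <class T>
inline void copy_impl(const T* src, std::size_t n, T* dst, true_tag)
{
    if (n)
        std::memmove(dst, src, n * sizeof(T));
}

// General path: assign each element in turn.
template <class T>
inline void copy_impl(const T* src, std::size_t n, T* dst, false_tag)
{
    for (std::size_t i = 0; i != n; ++i)
        dst[i] = src[i];
}

// The tag object picks the overload at compile time.
template <class T>
inline void copy_elements(const T* src, std::size_t n, T* dst)
{
    copy_impl(src, n, dst, typename copy_traits<T>::has_trivial_copy());
}

void example()
{
    int a[3] = {1, 2, 3}, b[3] = {0, 0, 0};
    copy_elements(a, 3, b);            // memmove path
    assert(b[0] == 1 && b[2] == 3);

    double x[2] = {1.5, 2.5}, y[2] = {0.0, 0.0};
    copy_elements(x, 2, y);            // no specialization: element loop
    assert(y[1] == 2.5);
}
```

    Compiler-generated specializations would remove the need to list
    types by hand, which is the extension the paragraph above argues for.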
1375
 
1376
    The SGI STL also defines a large number of extensions visible in
1377
    standard headers. (Other extensions that appear in separate headers
1378
    have been sequestered in subdirectories ext/ and backward/.)  All
1379
    these extensions should be moved to other headers where possible,
1380
    and in any case wrapped in a namespace (not std!), and (where kept
1381
    in a standard header) girded about with macro guards. Some cannot be
1382
    moved out of standard headers because they are used to implement
1383
    standard features.  The canonical method for accommodating these
1384
    is to use a protected name, aliased in macro guards to a user-space
1385
    name. Unfortunately C++ offers no satisfactory template typedef
1386
    mechanism, so very ad-hoc and unsatisfactory aliasing must be used
1387
    instead.
1388
 
1389
    Implementation of a template typedef mechanism should have the highest
1390
    priority among possible extensions, on the same level as implementation
1391
    of the template "export" feature.
1392
 
1393
    Chapter 18  Language support
1394
    ----------------------------
1395
 
1396
    Headers: <limits> <new> <typeinfo> <exception>
1397
    C headers: <cstddef> <climits> <cfloat>  <cstdarg> <csetjmp>
1398
    <ctime>   <csignal> <cstdlib> (also 21, 25, 26)
1399
 
1400
    This defines the built-in exceptions, rtti, numeric_limits<>,
1401
    operator new and delete. Much of this is provided by the
1402
    compiler in its static runtime library.
1403
 
1404
    Work to do includes defining numeric_limits<> specializations in
1405
    separate files for all target architectures. Values for integer types
1406
    except for bool and wchar_t are readily obtained from the C header
1407
    <limits.h>, but values for the remaining numeric types (bool, wchar_t,
1408
    float, double, long double) must be entered manually. This is
1409
    largely dog work except for those members whose values are not
1410
    easily deduced from available documentation. Also, this involves
1411
    some work in target configuration to identify the correct choice of
1412
    file to build against and to install.
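 
    For the integer types the work is mostly transcription from
    <limits.h>, as this sketch suggests. To avoid redefining a standard
    specialization it uses a hypothetical stand-in template,
    sketch_limits, and checks it against the library's own values:

```cpp
#include <cassert>
#include <climits>
#include <limits>

// Hypothetical stand-in for numeric_limits<>, kept out of namespace std.
template <class T>
struct sketch_limits {
    static const bool is_specialized = false;
};

// Specialization for int: every value comes from <climits> (<limits.h>).
template <>
struct sketch_limits<int> {
    static const bool is_specialized = true;
    static const bool is_signed = true;
    // Assumes int has no padding bits, so all bits but the sign bit hold value.
    static const int digits = static_cast<int>(sizeof(int) * CHAR_BIT) - 1;
    static int min() { return INT_MIN; }
    static int max() { return INT_MAX; }
};

void example()
{
    assert(sketch_limits<int>::max() == std::numeric_limits<int>::max());
    assert(sketch_limits<int>::min() == std::numeric_limits<int>::min());
    assert(sketch_limits<int>::digits == std::numeric_limits<int>::digits);
}
```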
1413
 
1414
    The definitions of the various operators new and delete must be
1415
    made thread-safe, which depends on a portable exclusion mechanism,
1416
    discussed under chapter 20.  Of course there is always plenty of
1417
    room for improvements to the speed of operators new and delete.
1418
 
1419
    <cstdarg>, in Glibc, defines some macros that gcc does not allow to
1420
    be wrapped into an inline function. Probably this header will demand
1421
    attention whenever a new target is chosen. The functions atexit(),
1422
    exit(), and abort() in cstdlib have different semantics in C++, so
1423
    must be re-implemented for C++.
1424
 
1425
    Chapter 19  Diagnostics
1426
    -----------------------
1427
 
1428
    Headers: <stdexcept>
1429
    C headers: <cassert> <cerrno>
1430
 
1431
    This defines the standard exception objects, which are "mostly complete".
1432
    Cygnus has a version, and now SGI provides a slightly different one.
1433
    It makes little difference which we use.
1434
 
1435
    The C global name "errno", which C allows to be a variable or a macro,
1436
    is required in C++ to be a macro. For MT it must typically result in
1437
    a function call.
1438
 
1439
    Chapter 20  Utilities
1440
    ---------------------
1441
    Headers: <utility> <functional> <memory>
1442
    C header: <ctime> (also in 18)
1443
 
1444
    SGI STL provides "mostly complete" versions of all the components
1445
    defined in this chapter. However, the auto_ptr<> implementation
1446
    is known to be wrong. Furthermore, the standard definition of it
1447
    is known to be unimplementable as written. A minor change to the
1448
    standard would fix it, and auto_ptr<> should be adjusted to match.
1449
 
1450
    Multi-threading affects the allocator implementation, and there must
1451
    be configuration/installation choices for different users' MT
1452
    requirements. Anyway, users will want to tune allocator options
1453
    to support different target conditions, MT or no.
1454
 
1455
    The primitives used for MT implementation should be exposed, as an
1456
    extension, for users' own work. We need cross-CPU "mutex" support,
1457
    multi-processor shared-memory atomic integer operations, and single-
1458
    processor uninterruptible integer operations, and all three configurable
1459
    to be stubbed out for non-MT use, or to use an appropriately-loaded
1460
    dynamic library for the actual runtime environment, or statically
1461
    compiled in for cases where the target architecture is known.
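 
    One way to make such primitives configurable is a policy
    parameter, sketched below. The sketch assumes std::atomic from
    C++11, which did not exist when these notes were written, and the
    names single_thread_policy, multi_thread_policy and ref_count are
    hypothetical:

```cpp
#include <atomic>
#include <cassert>

// Stubbed-out policy for non-MT builds: plain integer arithmetic.
struct single_thread_policy {
    typedef long counter_type;
    static long increment(counter_type& c) { return ++c; }
    static long decrement(counter_type& c) { return --c; }
};

// MT policy: atomic read-modify-write operations.
struct multi_thread_policy {
    typedef std::atomic<long> counter_type;
    static long increment(counter_type& c) { return c.fetch_add(1) + 1; }
    static long decrement(counter_type& c) { return c.fetch_sub(1) - 1; }
};

// A reference count whose synchronization cost is chosen at compile time.
template <class Policy>
class ref_count {
public:
    ref_count() : count_(1) {}
    long acquire() { return Policy::increment(count_); }
    long release() { return Policy::decrement(count_); }

private:
    typename Policy::counter_type count_;
};

void example()
{
    ref_count<single_thread_policy> a;
    ref_count<multi_thread_policy> b;
    assert(a.acquire() == 2 && a.release() == 1);
    assert(b.acquire() == 2 && b.release() == 1);
}
```

    The dynamically loaded variant mentioned above would need a
    further policy that dispatches through function pointers; it is
    omitted here.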
1462
 
1463
    Chapter 21  String
1464
    ------------------
1465
    Headers: <string>
1466
    C headers: <cctype> <cwctype> <cstring> <cwchar> (also in 27)
1467
    <cstdlib> (also in 18, 25, 26)
1468
 
1469
    We have "mostly-complete" char_traits<> implementations. Many of the
1470
    char_traits<char> operations might be optimized further using existing
1471
    proprietary language extensions.
1472
 
1473
    We have a "mostly-complete" basic_string<> implementation. The work
1474
    to manually instantiate char and wchar_t specializations in object
1475
    files to improve link-time behavior is extremely unsatisfactory,
1476
    literally tripling library-build time with no commensurate improvement
1477
    in static program link sizes. It must be redone. (Similar work is
1478
    needed for some components in chapters 22 and 27.)
1479
 
1480
    Other work needed for strings is MT-safety, as discussed under the
1481
    chapter 20 heading.
1482
 
1483
    The standard C type mbstate_t from <cwchar> and used in char_traits<>
1484
    must be different in C++ than in C, because in C++ the default constructor
1485
    value mbstate_t() must be the "base" or "ground" sequence state.
1486
    (According to the likely resolution of a recently raised Core issue,
1487
    this may become unnecessary. However, there are other reasons to
1488
    use a state type not as limited as whatever the C library provides.)
1489
    If we might want to provide conversions from (e.g.) internally-
1490
    represented EUC-wide to externally-represented Unicode, or vice-
1491
    versa, the mbstate_t we choose will need to be more accommodating
1492
    than what might be provided by an underlying C library.
1493
 
1494
    There remain some basic_string template-member functions which do
1495
    not overload properly with their non-template brethren. The infamous
1496
    hack akin to what was done in vector<> is needed, to conform to
1497
    23.1.1 para 10. The CHECKLIST items for basic_string marked 'X',
1498
    or incomplete, are so marked for this reason.
1499
 
1500
    Replacing the string iterators, which currently are simple character
1501
    pointers, with class objects would greatly increase the safety of the
1502
    client interface, and also permit a "debug" mode in which range,
1503
    ownership, and validity are rigorously checked. The current use of
1504
    raw pointers as string iterators is evil. vector<> iterators need the
1505
    same treatment. Note that the current implementation freely mixes
1506
    pointers and iterators, and that must be fixed before safer iterators
1507
    can be introduced.
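 
    A class-type iterator with "debug" checks could look something
    like this sketch. It is read-only and forward-only for brevity,
    reports misuse by throwing (an assumption; a real debug mode might
    abort instead), and checked_char_iterator is a hypothetical name:

```cpp
#include <cassert>
#include <stdexcept>

class checked_char_iterator {
public:
    checked_char_iterator(const char* first, const char* last, const char* pos)
        : first_(first), last_(last), pos_(pos) {}

    // Range check: only positions inside [first, last) may be read.
    char operator*() const
    {
        if (pos_ < first_ || pos_ >= last_)
            throw std::out_of_range("dereference outside the string");
        return *pos_;
    }

    checked_char_iterator& operator++()
    {
        if (pos_ >= last_)
            throw std::out_of_range("increment past the end");
        ++pos_;
        return *this;
    }

    // Ownership check: comparing iterators into different strings is an error.
    friend bool operator==(const checked_char_iterator& a,
                           const checked_char_iterator& b)
    {
        if (a.first_ != b.first_ || a.last_ != b.last_)
            throw std::logic_error("iterators refer to different strings");
        return a.pos_ == b.pos_;
    }

    friend bool operator!=(const checked_char_iterator& a,
                           const checked_char_iterator& b)
    {
        return !(a == b);
    }

private:
    const char* first_;
    const char* last_;
    const char* pos_;
};

void example()
{
    const char text[] = "ab";
    checked_char_iterator it(text, text + 2, text);
    checked_char_iterator end(text, text + 2, text + 2);
    assert(*it == 'a');
    ++it;
    assert(*it == 'b');
    ++it;
    assert(it == end);
}
```

    A raw pointer offers none of these checks, which is why mixing
    pointers and iterators has to be removed first.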
1508
 
1509
    Some of the functions in <cstring> are different from the C version,
1510
    generally overloaded on const and non-const argument pointers. For
1511
    example, in <cstring> strchr is overloaded. The functions isupper
1512
    etc. in <cctype> typically implemented as macros in C are functions
1513
    in C++, because they are overloaded with others of the same name
1514
    defined in <locale>.
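 
    The const/non-const overloading can be illustrated with a pair of
    wrappers. This is a sketch of the pattern only; find_ch is a
    hypothetical name, used so as not to redeclare strchr itself:

```cpp
#include <cassert>
#include <cstring>

// Const input gives a const result: the caller cannot write through it.
inline const char* find_ch(const char* s, int c)
{
    return std::strchr(s, c);
}

// Non-const input gives a non-const result, so constness is preserved
// rather than silently dropped as in the single C declaration.
inline char* find_ch(char* s, int c)
{
    return const_cast<char*>(find_ch(static_cast<const char*>(s), c));
}

void example()
{
    const char* ro = "hello";
    char rw[] = "hello";

    const char* p = find_ch(ro, 'l');   // selects the const overload
    char* q = find_ch(rw, 'l');         // selects the non-const overload

    assert(p == ro + 2);
    assert(q == rw + 2);
    *q = 'L';                           // allowed only through the non-const result
    assert(rw[2] == 'L');
}
```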
1515
 
1516
    Many of the functions required in <cwctype> and <cwchar> cannot be
1517
    implemented using underlying C facilities on intended targets because
1518
    such facilities only partly exist.
1519
 
1520
    Chapter 22  Locale
1521
    ------------------
1522
    Headers: <locale>
1523
    C headers: <clocale>
1524
 
1525
    We have a "mostly complete" class locale, with the exception of
1526
    code for constructing, and handling the names of, named locales.
1527
    The ways that locales are named (particularly when categories
1528
    (e.g. LC_TIME, LC_COLLATE) are different) vary among all target
1529
    environments. This code must be written in various versions and
1530
    chosen by configuration parameters.
1531
 
1532
    Members of many of the facets defined in <locale> are stubs. Generally,
1533
    there are two sets of facets: the base class facets (which are supposed
1534
    to implement the "C" locale) and the "byname" facets, which are supposed
1535
    to read files to determine their behavior. The base ctype<>, collate<>,
1536
    and numpunct<> facets are "mostly complete", except that the table of
1537
    bitmask values used for "is" operations, and corresponding mask values,
1538
    are still defined in libio and just included/linked. (We will need to
1539
    implement these tables independently, soon, but should take advantage
1540
    of libio where possible.)  The num_put<>::put members for integer types
1541
    are "mostly complete".
1542
 
1543
    A complete list of what has and has not been implemented may be
1544
    found in CHECKLIST. However, note that the current definition of
1545
    codecvt<wchar_t,char,mbstate_t> is wrong. It should simply write
1546
    out the raw bytes representing the wide characters, rather than
1547
    trying to convert each to a corresponding single "char" value.
1548
 
1549
    Some of the facets are more important than others. Specifically,
1550
    the members of ctype<>, numpunct<>, num_put<>, and num_get<> facets
1551
    are used by other library facilities defined in <string>, <istream>,
1552
    and <ostream>, and the codecvt<> facet is used by basic_filebuf<>
1553
    in <fstream>, so a conforming iostream implementation depends on
1554
    these.
1555
 
1556
    The "long long" type eventually must be supported, but code mentioning
1557
    it should be wrapped in #if guards to allow pedantic-mode compiling.
1558
 
1559
    Performance of num_put<> and num_get<> depends critically on
1560
    caching computed values in ios_base objects, and on extensions
1561
    to the interface with streambufs.
1562
 
1563
    Specifically: retrieving a copy of the locale object, extracting
1564
    the needed facets, and gathering data from them, for each call to
1565
    (e.g.) operator<< would be prohibitively slow.  To cache format
1566
    data for use by num_put<> and num_get<> we have a _Format_cache<>
1567
    object stored in the ios_base::pword() array. This is constructed
1568
    and initialized lazily, and is organized purely for utility. It
1569
    is discarded when a new locale with different facets is imbued.
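 
    The caching scheme can be sketched with the standard
    ios_base::xalloc(), pword(), iword() and register_callback()
    interfaces. The struct format_cache below is a hypothetical
    stand-in, not the library's _Format_cache<>, and for simplicity
    the sketch discards the cache on every imbue rather than only when
    the facets differ:

```cpp
#include <cassert>
#include <ios>
#include <locale>
#include <sstream>

// Hypothetical cache of values that would otherwise be fetched on every call.
struct format_cache {
    char decimal_point;
    char thousands_sep;
};

// One pword()/iword() slot index shared by all streams.
inline int cache_index()
{
    static const int index = std::ios_base::xalloc();
    return index;
}

// Stream callback: drop the cache when the locale changes or the stream dies.
inline void cache_callback(std::ios_base::event ev, std::ios_base& ios, int index)
{
    void*& slot = ios.pword(index);
    if (ev != std::ios_base::copyfmt_event)   // imbue_event or erase_event
        delete static_cast<format_cache*>(slot);
    // After copyfmt() the slot holds another stream's pointer: forget it, never free it.
    slot = 0;
}

// Returns the stream's cache, building it on first use.
inline const format_cache& get_format_cache(std::ios_base& ios)
{
    const int index = cache_index();
    if (ios.pword(index) == 0) {
        if (ios.iword(index) == 0) {          // register the callback once per stream
            ios.register_callback(cache_callback, index);
            ios.iword(index) = 1;
        }
        const std::locale loc = ios.getloc();
        const std::numpunct<char>& np = std::use_facet<std::numpunct<char> >(loc);
        format_cache* cache = new format_cache;
        cache->decimal_point = np.decimal_point();
        cache->thousands_sep = np.thousands_sep();
        ios.pword(index) = cache;
    }
    return *static_cast<format_cache*>(ios.pword(index));
}

void example()
{
    std::ostringstream out;
    const format_cache* first = &get_format_cache(out);   // built lazily
    assert(&get_format_cache(out) == first);               // reused on the next call
    out.imbue(std::locale::classic());                     // callback discards it
    assert(out.pword(cache_index()) == 0);
}
```

    The references returned by pword() and iword() are not held across
    other calls on the stream, since such calls may invalidate them.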
1570
 
1571
    Using only the public interfaces of the iterator arguments to the
1572
    facet functions would limit performance by forbidding "vector-style"
1573
    character operations. The streambuf iterator optimizations are
1574
    described under chapter 24, but facets can also bypass the streambuf
1575
    iterators via explicit specializations and operate directly on the
1576
    streambufs, and use extended interfaces to get direct access to the
1577
    streambuf internal buffer arrays. These extensions are mentioned
1578
    under chapter 27. These optimizations are particularly important
1579
    for input parsing.
1580
 
1581
    Unused virtual members of locale facets can be omitted, as mentioned
1582
    above, by a smart linker.
1583
 
1584
    Chapter 23  Containers
1585
    ----------------------
1586
    Headers: <deque> <list> <queue> <stack> <vector> <map> <set> <bitset>
1587
 
1588
    All the components in chapter 23 are implemented in the SGI STL.
1589
    They are "mostly complete"; they include a large number of
1590
    nonconforming extensions which must be wrapped. Some of these
1591
    are used internally and must be renamed or duplicated.
1592
 
1593
    The SGI components are optimized for large-memory environments. For
1594
    embedded targets, different criteria might be more appropriate. Users
1595
    will want to be able to tune this behavior. We should provide
1596
    ways for users to compile the library with different memory usage
1597
    characteristics.
1598
 
    A lot more work is needed on factoring out common code from different
    specializations to reduce code size here and in chapter 25. The
    easiest fix for this would be a compiler/ABI improvement that allows
    the compiler to recognize when a specialization depends only on the
    size (or other gross quality) of a template argument, and allow the
    linker to share the code with similar specializations. In its
    absence, many of the algorithms and containers can be partially
    specialized, at least for the case of pointers, but this only solves
    a small part of the problem. Use of a type_traits-style template
    allows a few more optimization opportunities, more if the compiler
    can generate the specializations automatically.
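
    The pointer case mentioned above can be sketched as a thin typed
    wrapper over a single void*-based implementation, so that only the
    inline casts are instantiated per pointee type. VoidPtrVec and
    PtrVec are hypothetical names, std::vector stands in for the real
    storage, and const pointee types are left out for brevity.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Untyped implementation: its code is generated once for all pointer types.
class VoidPtrVec {
public:
    void push_back(void* p) { data_.push_back(p); }
    void* at(std::size_t i) const {
        assert(i < data_.size());
        return data_[i];
    }
    std::size_t size() const { return data_.size(); }

private:
    std::vector<void*> data_;
};

// Primary template intentionally left undefined in this sketch.
template <typename T>
class PtrVec;

// Partial specialization for pointers: only trivial inline casts per type.
template <typename T>
class PtrVec<T*> {
public:
    void push_back(T* p) { impl_.push_back(p); }
    T* at(std::size_t i) const { return static_cast<T*>(impl_.at(i)); }
    std::size_t size() const { return impl_.size(); }

private:
    VoidPtrVec impl_;
};
```

    PtrVec<int*> and PtrVec<double*> then share every out-of-line
    function of VoidPtrVec, which is the code-size saving the paragraph
    above is after.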
 
    As an optimization, containers can specialize on the default allocator
    and bypass it, or take advantage of details of its implementation
    after it has been improved upon.
 
    Replacing the vector iterators, which currently are simple element
    pointers, with class objects would greatly increase the safety of the
    client interface, and also permit a "debug" mode in which range,
    ownership, and validity are rigorously checked. The current use of
    pointers for iterators is unsafe.
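
    A minimal sketch of such a class-object iterator follows. It is
    only an illustration: CheckedIter is a hypothetical name, the checks
    use assert(), and only range and ownership are verified. Tracking
    validity after reallocation would need cooperation from the
    container and is not shown.

```cpp
#include <cassert>

template <typename T>
class CheckedIter {
public:
    // cur must lie within [first, last]; first/last identify the owning range.
    CheckedIter(T* cur, T* first, T* last)
        : cur_(cur), first_(first), last_(last) {
        assert(first_ <= cur_ && cur_ <= last_);
    }

    T& operator*() const {
        assert(cur_ != last_);  // range check: never dereference end()
        return *cur_;
    }
    CheckedIter& operator++() {
        assert(cur_ != last_);  // cannot advance past end()
        ++cur_;
        return *this;
    }
    CheckedIter& operator--() {
        assert(cur_ != first_);  // cannot step before begin()
        --cur_;
        return *this;
    }
    bool operator==(const CheckedIter& other) const {
        // Ownership check: comparing iterators into different ranges is a bug.
        assert(first_ == other.first_ && last_ == other.last_);
        return cur_ == other.cur_;
    }
    bool operator!=(const CheckedIter& other) const { return !(*this == other); }

private:
    T* cur_;
    T* first_;
    T* last_;
};
```

    Compiling with NDEBUG defined removes the checks, leaving code
    equivalent to the plain pointer version.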
 
    As mentioned for chapter 24, the deque iterator is a good example of
    an opportunity to implement a "staged" iterator that would benefit
    from specializations of some algorithms.
 
    Chapter 24  Iterators
    ---------------------
    Headers: <iterator>
 
    Standard iterators are "mostly complete", with the exception of
    the stream iterators, which are not yet templatized on the
    stream type. Also, the base class template iterator<> appears
    to be wrong, so everything currently derived from it must also
    be wrong.
 
    The streambuf iterators (currently located in stl/bits/std_iterator.h,
    but should be under bits/) can be rewritten to take advantage of
    friendship with the streambuf implementation.
 
    Matt Austern has identified opportunities where certain iterator
    types, particularly including streambuf iterators and deque
    iterators, have a "two-stage" quality, such that an intermediate
    limit can be checked much more quickly than the true limit on
    range operations. If identified with a member of iterator_traits,
    algorithms may be specialized for this case. Of course the
    iterators that have this quality can be identified by specializing
    a traits class.
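
    The following sketch shows one way the traits idea could look. All
    names (Blocks, BlockIter, is_staged, staged_find) are hypothetical,
    and a vector of vectors stands in for deque-like block storage. The
    staged overload tests a cheap per-block limit in its inner loop and
    consults the true limit only when it crosses a block boundary.

```cpp
#include <cstddef>
#include <type_traits>
#include <vector>

typedef std::vector<std::vector<int> > Blocks;  // hypothetical segmented storage

// Invariant: either block == blocks->size() and pos == 0 (the end position),
// or pos < (*blocks)[block].size().
struct BlockIter {
    const Blocks* blocks;
    std::size_t block;
    std::size_t pos;

    void skip_empty() {
        while (block < blocks->size() && pos == (*blocks)[block].size()) {
            ++block;
            pos = 0;
        }
    }
    int operator*() const { return (*blocks)[block][pos]; }
    BlockIter& operator++() { ++pos; skip_empty(); return *this; }
    bool operator==(const BlockIter& o) const { return block == o.block && pos == o.pos; }
    bool operator!=(const BlockIter& o) const { return !(*this == o); }
};

inline BlockIter blocks_begin(const Blocks& b) {
    BlockIter it = {&b, 0, 0};
    it.skip_empty();
    return it;
}
inline BlockIter blocks_end(const Blocks& b) {
    BlockIter it = {&b, b.size(), 0};
    return it;
}

// Traits class: iterators opt in to the staged algorithms by specializing it.
template <typename Iter> struct is_staged : std::false_type {};
template <> struct is_staged<BlockIter> : std::true_type {};

// Generic version: one full iterator comparison per element.
template <typename Iter, typename T>
Iter find_impl(Iter first, Iter last, const T& value, std::false_type) {
    for (; first != last; ++first)
        if (*first == value) break;
    return first;
}

// Staged version: the inner loop only compares an index against a block limit.
inline BlockIter find_impl(BlockIter first, BlockIter last, int value, std::true_type) {
    while (first != last) {
        const std::vector<int>& cur = (*first.blocks)[first.block];
        const bool final_block = (first.block == last.block);
        const std::size_t limit = final_block ? last.pos : cur.size();
        for (std::size_t i = first.pos; i != limit; ++i) {
            if (cur[i] == value) {
                first.pos = i;
                return first;
            }
        }
        if (final_block) return last;
        ++first.block;
        first.pos = 0;
        first.skip_empty();
    }
    return last;
}

template <typename Iter, typename T>
Iter staged_find(Iter first, Iter last, const T& value) {
    return find_impl(first, last, value, is_staged<Iter>());
}
```

    Ordinary iterators such as std::vector<int>::iterator fall through
    to the generic loop, so callers use staged_find() uniformly.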
 
    Many of the algorithms must be specialized for the streambuf
    iterators, to take advantage of block-mode operations, so that
    the performance of iostream/locale operations does not suffer.
    It may be that they could be treated as staged iterators and
    take advantage of those optimizations.
 
    Chapter 25  Algorithms
    ----------------------
    Headers: <algorithm>
    C headers: <cstdlib> (also in 18, 21, 26)
 
    The algorithms are "mostly complete". As mentioned above, they
    are optimized for speed at the expense of code and data size.
 
    Specializations of many of the algorithms for non-STL types would
    give performance improvements, but we must use great care not to
    interfere with fragile template overloading semantics for the
    standard interfaces. Conventionally the standard function template
    interface is an inline which delegates to a non-standard function
    which is then overloaded (this is already done in many places in
    the library). Particularly appealing opportunities for the sake of
    iostream performance are for copy and find applied to streambuf
    iterators or (as noted elsewhere) for staged iterators, of which
    the streambuf iterators are a good example.
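
    The delegation convention described above can be sketched as
    follows. The namespace sketch and the helper copy_aux are
    hypothetical names; the fast path shown is for plain char buffers
    rather than streambuf iterators, whose internals are not reachable
    through their public interface.

```cpp
#include <cstddef>
#include <cstring>

namespace sketch {

// Generic helper: element-by-element copy for arbitrary iterators.
template <typename In, typename Out>
Out copy_aux(In first, In last, Out out) {
    for (; first != last; ++first, ++out) *out = *first;
    return out;
}

// Overload for raw char buffers: a single block operation.
inline char* copy_aux(const char* first, const char* last, char* out) {
    const std::size_t n = static_cast<std::size_t>(last - first);
    if (n != 0) std::memmove(out, first, n);
    return out + n;
}

// Non-const sources would otherwise pick the template, so forward them.
inline char* copy_aux(char* first, char* last, char* out) {
    return copy_aux(static_cast<const char*>(first),
                    static_cast<const char*>(last), out);
}

// The public interface is never overloaded; it only delegates.
template <typename In, typename Out>
inline Out copy(In first, In last, Out out) {
    return copy_aux(first, last, out);
}

}  // namespace sketch
```

    Because only copy_aux is overloaded, user code that names
    sketch::copy sees a single function template, and adding further
    fast paths cannot change overload resolution for the public name.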
 
    The bsearch and qsort functions cannot be overloaded properly as
    required by the standard because gcc does not yet allow overloading
    on the extern-"C"-ness of a function pointer.
 
    Chapter 26  Numerics
    --------------------
    Headers: <complex> <valarray> <numeric>
    C headers: <cmath>, <cstdlib> (also 18, 21, 25)
 
    Numeric components: Gabriel dos Reis's valarray, Drepper's complex,
    and the few algorithms from the STL are "mostly done". Of course
    optimization opportunities abound for the numerically literate. It
    is not clear whether the valarray implementation really conforms
    fully, in the assumptions it makes about aliasing (and lack thereof)
    in its arguments.
 
    The C div() and ldiv() functions are interesting, because they are the
    only case where a C library function returns a class object by value.
    Since the C++ type div_t must be different from the underlying C type
    (which is in the wrong namespace) the underlying functions div() and
    ldiv() cannot be re-used efficiently. Fortunately they are trivial to
    re-implement.
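
    How trivial the re-implementation is can be seen in a short sketch.
    The namespace sketch is a stand-in for the library's own namespace,
    and the code assumes integer division that truncates toward zero,
    which C++11 guarantees.

```cpp
#include <cassert>

namespace sketch {

// A distinct C++ type, not the C library's ::div_t.
struct div_t {
    int quot;
    int rem;
};

// Same contract as C div(): quot * denom + rem == numer.
inline div_t div(int numer, int denom) {
    assert(denom != 0);  // division by zero is undefined, as in C
    div_t result;
    result.quot = numer / denom;
    result.rem = numer % denom;
    return result;
}

}  // namespace sketch
```

    For example, sketch::div(-7, 2) yields quot == -3 and rem == -1.
    An ldiv() for long follows the same pattern.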
 
    Chapter 27  Iostreams
    ---------------------
    Headers: <iosfwd> <streambuf> <ios> <ostream> <istream> <iostream>
    <iomanip> <sstream> <fstream>
    C headers: <cstdio> <cwchar> (also in 21)
 
    Iostream is currently in a very incomplete state. <iosfwd>, <iomanip>,
    ios_base, and basic_ios<> are "mostly complete". basic_streambuf<> and
    basic_ostream<> are well along, but basic_istream<> has had little work
    done. The standard stream objects, <sstream>, and <fstream> have been
    started; basic_filebuf<> "write" functions have been implemented just
    enough to do "hello, world".
 
    Most of the istream and ostream operators << and >> (with the exception
    of the op<<(integer) ones) have not been changed to use locale primitives,
    sentry objects, or char_traits members.
 
    All these templates should be manually instantiated for char and
    wchar_t in a way that links only used members into user programs.
 
    Streambuf is fertile ground for optimization extensions. An extended
    interface giving iterator access to its internal buffer would be very
    useful for other library components.
 
    Iostream operations (primarily operators << and >>) can take advantage
    of the case where user code has not specified a locale, and bypass locale
    operations entirely. The current implementation of op<</num_put<>::put,
    for the integer types, demonstrates how they can cache encoding details
    from the locale on each operation. There is lots more room for
    optimization in this area.
 
    The definition of the relationship between the standard streams
    cout et al. and stdout et al. requires something like a "stdiobuf".
    The SGI solution of using double-indirection to actually use a
    stdio FILE object for buffering is unsatisfactory, because it
    interferes with peephole loop optimizations.
 
    The <sstream> header work has begun. stringbuf can benefit from
    friendship with basic_string<> and basic_string<>::_Rep to use
    those objects directly as buffers, and avoid allocating and making
    copies.
 
    The basic_filebuf<> template is a complex beast. It is specified to
    use the locale facet codecvt<> to translate characters between native
    files and the locale character encoding. In general this involves
    two buffers, one of "char" representing the file and another of
    "char_type", for the stream, with codecvt<> translating. The process
    is complicated by the variable-length nature of the translation, and
    the need to seek to corresponding places in the two representations.
    For the case of basic_filebuf<char>, when no translation is needed,
    a single buffer suffices. A specialized filebuf can be used to reduce
    code space overhead when no locale has been imbued. Matt Austern's
    work at SGI will be useful, perhaps directly as a source of code, or
    at least as an example to draw on.
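
    The two-buffer translation step can be illustrated with the
    standard codecvt<> facet on its own. This is a sketch of the
    conversion only, not of basic_filebuf<>: to_internal is a
    hypothetical helper, the whole input is converted in one call, and
    seeking is ignored.

```cpp
#include <cstddef>
#include <cwchar>
#include <locale>
#include <stdexcept>
#include <string>

// Converts an external "char" buffer into an internal wchar_t buffer.
inline std::wstring to_internal(const char* first, const char* last,
                                const std::locale& loc = std::locale::classic()) {
    typedef std::codecvt<wchar_t, char, std::mbstate_t> Codecvt;
    const Codecvt& cvt = std::use_facet<Codecvt>(loc);

    std::mbstate_t state = std::mbstate_t();
    // Each wide character consumes at least one byte, so this size suffices.
    std::wstring internal(static_cast<std::size_t>(last - first), L'\0');
    const char* from_next = first;
    wchar_t* const to_begin = &internal[0];
    wchar_t* to_next = to_begin;

    const std::codecvt_base::result r = cvt.in(
        state, first, last, from_next, to_begin, to_begin + internal.size(), to_next);

    // "partial" means the input ended inside a multibyte sequence; from_next
    // then marks where a real filebuf would resume after refilling its buffer.
    if (r != std::codecvt_base::ok && r != std::codecvt_base::partial)
        throw std::runtime_error("external-to-internal conversion failed");

    internal.resize(static_cast<std::size_t>(to_next - to_begin));
    return internal;
}
```

    The distance from_next - first need not equal to_next - to_begin,
    which is why a file position cannot be derived from a buffer offset
    by simple arithmetic.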
 
    Filebuf, almost uniquely (cf. operator new), depends heavily on
    underlying environmental facilities. In current releases iostream
    depends fairly heavily on libio constant definitions, but it should
    be made independent. It also depends on operating system primitives
    for file operations. There is immense room for optimizations using
    (e.g.) mmap for reading. The shadow/ directory wraps, besides the
    standard C headers, the libio.h and unistd.h headers, for use mainly
    by filebuf. These wrappings have not been completed, though there
    is scaffolding in place.
 
    The encapsulation of certain C header <cstdio> names presents an
    interesting problem. It is possible to define an inline std::fprintf()
    implemented in terms of the 'extern "C"' vfprintf(), but there is no
    standard vfscanf() to use to implement std::fscanf(). It appears that
    vfscanf must be re-implemented in C++ for targets where no vfscanf
    extension has been defined. This is interesting in that it seems
    to be the only significant case in the C library where this kind of
    rewriting is necessary. (Of course Glibc provides the vfscanf()
    extension.) (The functions related to exit() must be rewritten
    for other reasons.)
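
    The fprintf() half of this is short enough to show. The sketch
    below places the wrapper in a hypothetical namespace sketch rather
    than std. The same shape would work for fscanf() wherever vfscanf()
    is available; C99 has since standardized it, so the gap described
    above concerns older targets.

```cpp
#include <cstdarg>
#include <cstdio>

namespace sketch {

// Inline wrapper: forwards its variable arguments to the C vfprintf().
inline int fprintf(std::FILE* stream, const char* format, ...) {
    std::va_list args;
    va_start(args, format);
    const int written = std::vfprintf(stream, format, args);
    va_end(args);
    return written;
}

}  // namespace sketch
```

    A call such as sketch::fprintf(stdout, "%d-%s\n", 42, "ok") returns
    the number of characters written, exactly as the C function does.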
 
    Annex D
    -------
    Headers: <strstream>
 
    Annex D defines many non-library features, and many minor
    modifications to various headers, and a complete header.
    It is "mostly done", except that the libstdc++-2 <strstream>
    header has not been adopted into the library, or checked to
    verify that it matches the draft in those details that were
    clarified by the committee. Certainly it must at least be
    moved into the std namespace.
 
    We still need to wrap all the deprecated features in #if guards
    so that pedantic compile modes can detect their use.
 
    Nonstandard Extensions
    ----------------------
    Headers: <iostream.h> <strstream.h> <hash> <rbtree>
    <pthread_alloc> <stdiobuf> (etc.)
 
    User code has come to depend on a variety of nonstandard components
    that we must not omit. Much of this code can be adopted from
    libstdc++-v2 or from the SGI STL. This particularly includes
    <iostream.h>, <strstream.h>, and various SGI extensions such
    as <hash_map.h>. Many of these are already placed in the
    subdirectories ext/ and backward/. (Note that it is better to
    include them via "<backward/hash_map.h>" or "<ext/hash_map>" than
    to search the subdirectory itself via a "-I" directive.)
  