1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449 450 451 452 453 454 455 456 457 458 459 460 461 462 463 464 465 466 467 468 469 470 471 472 473 474 475 476 477 478 479 480 481 482 483 484 485 486 487 488 489 490 491 492 493 494 495 496 497 498 499 500 501 502 503 504 505 506 507 508 509 510 511 512 513 514 515 516 517 518 519 520 521 522 523 524 525 526 527 528 529 530 531 532 533 534 535 536 537 538 539 540 541 542 543 544 545 546 547 548 549 550 551 552 553 554 555 556 557 558 559 560 561 562 563 564 565 566 567 568 569 570 571 572 573 574 575 576 577 578 579 580 581 582 583 584 585 586 587 588 589 590 591 592 593 594 595 596 597 598 599 600 601 602 603 604 605 606 607 608 609 610 611 612 613 614 615 616 617 618 619 620 621 622 623 624 625 626 627 628 629 630 631 632 633 634 635 636 637 638 639 640 641 642 643 644 645 646 647 648 649 650 651 652 653 654 655 656 657 658 659 660 661 662 663 664 665 666 667 668 669 670 671 672 673 674 675 676 677 678 679 680 681 682 683 684 685 686 687 688 689 690 691 692 693 694 695 696 697 698 699 700 701 702 703 704 705 706 707 708 709 710 711 712 713 714 715 716 717 718 719 720 721 722 723 724 725 726 727 728 729 730 731 732 733 734 735 736 737 738 739 740 741 742 743 744 745 746 747 748 749 750 751 752 753 754 755 756 757 758 759 760 761 762 763 764 765 766 767 768 769 770 771 772 773 774 775 776 777 778 779 780 781 782 783 784 785 786 787 788 789 790 791 792 793 794 795 796 797 798 799 800 801 802 803 804 805 806 807 808 809 810 811 812 813 814 815 816 817 818 819 820 821 822 823 824 825 826 827 828 829 830 831 832 833 834 835 836 837 838 839 840 841 842 843 844 845 846 847 848 849 850 851 852 853 854 855 856 857 858 859 860 861 862 863 864 865 866 867 868 869 870 871 872 873 874 875 876 877 878 879 880 881 882 883 884 885 886 887 888 889 890 891 892 893 894 895 896 897 898 899 900 901 902 903 904 905 906 907 908 909 910 911 912 913 914 915 916 917 918 919 920 921 922 923 924 925 926 927 928 929 930 931 932 933 934 935 936 937 938 939 940 941 942 943 944 945 946 947 948 949 950 951 952 953 954 955 956 957 958 959 960 961 962 963 964 965 966 967 968 969 970 971 972 973 974 975 976 977 978 979 980 981 982 983 984 985 986 987 988 989 990 991 992 993 994 995 996 997 998 999 1000 1001 1002 1003 1004 1005 1006 1007 1008 1009 1010 1011 1012 1013 1014 1015 1016 1017 1018 1019 1020 1021 1022 1023 1024 1025 1026 1027 1028 1029 1030 1031 1032 1033 1034 1035 1036 1037 1038 1039 1040 1041 1042 1043 1044 1045 1046 1047 1048 1049 1050 1051 1052 1053 1054 1055 1056 1057 1058 1059 1060 1061 1062 1063 1064 1065 1066 1067 1068 1069 1070 1071 1072 1073 1074 1075 1076 1077 1078 1079 1080 1081 1082 1083 1084 1085 1086 1087 1088 1089 1090 1091 1092 1093 1094 1095 1096 1097 1098 1099 1100 1101 1102 1103 1104 1105 1106 1107 1108 1109 1110 1111 1112 1113 1114 1115 1116 1117 1118 1119 1120 1121 1122 1123 1124 1125 1126 1127 1128 1129 1130 1131 1132 1133 1134 1135 1136 1137 1138 1139 1140 1141 1142 1143 1144 1145 1146 1147 1148 1149 1150 1151 1152 1153 1154 1155 1156 1157 1158 1159 1160 1161 1162 1163 1164 1165 1166 1167 1168 1169 1170 1171 1172 1173 1174 1175 1176 1177 1178 1179 1180 1181 1182 1183 1184 1185 1186 1187 1188 1189 1190 1191 1192 1193 1194 1195 1196 1197 1198 1199 1200 1201 1202 1203 1204 1205 1206 1207 1208 1209 1210 1211 1212 1213 1214 1215 1216 1217 1218 1219 1220 1221 1222 1223 1224 1225 1226 1227 1228 1229 1230 1231 1232 1233 1234 1235 1236 1237 1238 1239 1240 1241 1242 1243 1244 1245 1246 1247 1248 1249 1250 1251 1252 1253 1254 1255 1256 1257 1258 1259 1260 1261 1262 1263 1264 1265 1266 1267 1268 1269 1270 1271 1272 1273 1274 1275 1276 1277 1278 1279 1280 1281 1282 1283 1284 1285 1286 1287 1288 1289 1290 1291 1292 1293 1294 1295 1296 1297 1298 1299 1300 1301 1302 1303 1304 1305 1306 1307 1308 1309 1310 1311 1312 1313 1314 1315 1316 1317 1318 1319 1320 1321 1322 1323 1324 1325 1326 1327 1328 1329 1330 1331 1332 1333 1334 1335 1336 1337 1338 1339 1340 1341 1342 1343 1344 1345 1346 1347 1348 1349 1350 1351 1352 1353 1354 1355 1356 1357 1358 1359 1360 1361 1362 1363 1364 1365 1366 1367 1368 1369 1370 1371 1372 1373 1374 1375 1376 1377 1378 1379 1380 1381 1382 1383 1384 1385 1386 1387 1388 1389 1390 1391 1392 1393 1394 1395 1396 1397 1398 1399 1400 1401 1402 1403 1404 1405 1406 1407 1408 1409 1410 1411 1412 1413 1414 1415 1416 1417 1418 1419 1420 1421 1422 1423 1424 1425 1426 1427 1428 1429 1430 1431 1432 1433 1434 1435 1436 1437 1438 1439 1440 1441 1442 1443 1444 1445 1446 1447 1448 1449 1450 1451 1452 1453 1454 1455 1456 1457 1458 1459 1460 1461 1462 1463 1464 1465 1466 1467 1468 1469 1470 1471 1472 1473 1474 1475 1476 1477 1478 1479 1480 1481 1482 1483 1484 1485 1486 1487 1488 1489 1490 1491 1492 1493 1494 1495 1496 1497 1498 1499 1500 1501 1502 1503 1504 1505 1506 1507 1508 1509 1510 1511 1512 1513 1514 1515 1516 1517 1518 1519 1520 1521 1522 1523 1524 1525 1526 1527 1528 1529 1530 1531 1532 1533 1534 1535 1536 1537 1538 1539 1540 1541 1542 1543 1544 1545 1546 1547 1548 1549 1550 1551 1552 1553 1554 1555 1556 1557 1558 1559 1560 1561 1562 1563 1564 1565 1566 1567 1568 1569 1570 1571 1572 1573 1574 1575 1576 1577 1578 1579 1580 1581 1582 1583 1584 1585 1586 1587 1588 1589 1590 1591 1592 1593 1594 1595 1596 1597 1598 1599 1600 1601 1602 1603 1604 1605 1606 1607 1608 1609 1610 1611 1612 1613 1614 1615 1616 1617 1618 1619 1620 1621 1622 1623 1624 1625 1626 1627 1628 1629 1630 1631 1632 1633 1634 1635 1636 1637 1638 1639 1640 1641 1642 1643 1644 1645 1646 1647 1648 1649 1650 1651 1652 1653 1654 1655 1656 1657 1658 1659 1660 1661 1662 1663 1664 1665 1666 1667 1668 1669 1670 1671 1672 1673 1674 1675 1676 1677 1678 1679 1680 1681 1682 1683 1684 1685 1686 1687 1688 1689 1690 1691 1692 1693 1694 1695 1696 1697 1698 1699 1700 1701 1702 1703 1704 1705 1706 1707 1708 1709 1710 1711 1712 1713 1714 1715 1716 1717 1718 1719 1720 1721 1722 1723 1724 1725 1726 1727 1728 1729 1730 1731 1732 1733 1734 1735 1736 1737 1738 1739 1740 1741 1742 1743 1744 1745 1746 1747 1748 1749 1750 1751 1752 1753 1754 1755 1756 1757 1758 1759 1760 1761 1762 1763 1764 1765 1766 1767 1768 1769 1770 1771 1772 1773 1774 1775 1776 1777 1778 1779 1780 1781 1782 1783 1784 1785 1786 1787
|
<appendix xmlns="http://docbook.org/ns/docbook" version="5.0"
xml:id="appendix.contrib" xreflabel="Contributing">
<?dbhtml filename="appendix_contributing.html"?>
<info><title>
Contributing
<indexterm>
<primary>Appendix</primary>
<secondary>Contributing</secondary>
</indexterm>
</title>
<keywordset>
<keyword>ISO C++</keyword>
<keyword>library</keyword>
</keywordset>
</info>
<para>
The GNU C++ Library is part of GCC and follows the same development model,
so the general rules for
<link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="http://gcc.gnu.org/contribute.html">contributing
to GCC</link> apply. Active
contributors are assigned maintainership responsibility, and given
write access to the source repository. First-time contributors
should follow this procedure:
</para>
<section xml:id="contrib.list" xreflabel="Contributor Checklist"><info><title>Contributor Checklist</title></info>
<section xml:id="list.reading"><info><title>Reading</title></info>
<itemizedlist>
<listitem>
<para>
Get and read the relevant sections of the C++ language
specification. Copies of the full ISO 14882 standard are
available on line via the ISO mirror site for committee
members. Non-members, or those who have not paid for the
privilege of sitting on the committee and sustained their
two meeting commitment for voting rights, may get a copy of
the standard from their respective national standards
organization. In the USA, this national standards
organization is
<link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="http://www.ansi.org">ANSI</link>.
(And if you've already registered with them you can
<link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="http://webstore.ansi.org/RecordDetail.aspx?sku=INCITS%2fISO%2fIEC+14882-2012">buy the standard on-line</link>.)
</para>
</listitem>
<listitem>
<para>
The library working group bugs, and known defects, can
be obtained here:
<link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="http://www.open-std.org/jtc1/sc22/wg21/">http://www.open-std.org/jtc1/sc22/wg21</link>
</para>
</listitem>
<listitem>
<para>
Peruse
the <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="http://www.gnu.org/prep/standards/">GNU
Coding Standards</link>, and chuckle when you hit the part
about <quote>Using Languages Other Than C</quote>.
</para>
</listitem>
<listitem>
<para>
Be familiar with the extensions that preceded these
general GNU rules. These style issues for libstdc++ can be
found in <link linkend="contrib.coding_style">Coding Style</link>.
</para>
</listitem>
<listitem>
<para>
And last but certainly not least, read the
library-specific information found in
<link linkend="appendix.porting">Porting and Maintenance</link>.
</para>
</listitem>
</itemizedlist>
</section>
<section xml:id="list.copyright"><info><title>Assignment</title></info>
<para>
See the <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="http://gcc.gnu.org/contribute.html#legal">legal prerequisites</link> for all GCC contributions.
</para>
<para>
Historically, the libstdc++ assignment form added the following
question:
</para>
<para>
<quote>
Which Belgian comic book character is better, Tintin or Asterix, and
why?
</quote>
</para>
<para>
While not strictly necessary, humoring the maintainers and answering
this question would be appreciated.
</para>
<para>
Please contact
Paolo Carlini at <email>paolo.carlini@oracle.com</email>
or
Jonathan Wakely at <email>jwakely+assign@redhat.com</email>
if you are confused about the assignment or have general licensing
questions. When requesting an assignment form from
<email>assign@gnu.org</email>, please CC the libstdc++
maintainers above so that progress can be monitored.
</para>
</section>
<section xml:id="list.getting"><info><title>Getting Sources</title></info>
<para>
<link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="http://gcc.gnu.org/svnwrite.html">Getting write access
(look for "Write after approval")</link>
</para>
</section>
<section xml:id="list.patches"><info><title>Submitting Patches</title></info>
<para>
Every patch must have several pieces of information before it can be
properly evaluated. Ideally (and to ensure the fastest possible
response from the maintainers) it would have all of these pieces:
</para>
<itemizedlist>
<listitem>
<para>
A description of the bug and how your patch fixes this
bug. For new features a description of the feature and your
implementation.
</para>
</listitem>
<listitem>
<para>
A ChangeLog entry as plain text; see the various
ChangeLog files for format and content. If you are
using emacs as your editor, simply position the insertion
point at the beginning of your change and hit CX-4a to bring
up the appropriate ChangeLog entry. See--magic! Similar
functionality also exists for vi.
</para>
</listitem>
<listitem>
<para>
A testsuite submission or sample program that will
easily and simply show the existing error or test new
functionality.
</para>
</listitem>
<listitem>
<para>
The patch itself. If you are accessing the SVN
repository use <command>svn update; svn diff NEW</command>;
else, use <command>diff -cp OLD NEW</command> ... If your
version of diff does not support these options, then get the
latest version of GNU
diff. The <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="http://gcc.gnu.org/wiki/SvnTricks">SVN
Tricks</link> wiki page has information on customising the
output of <code>svn diff</code>.
</para>
</listitem>
<listitem>
<para>
When you have all these pieces, bundle them up in a
mail message and send it to libstdc++@gcc.gnu.org. All
patches and related discussion should be sent to the
libstdc++ mailing list.
</para>
</listitem>
</itemizedlist>
</section>
</section>
<section xml:id="contrib.organization" xreflabel="Source Organization"><info><title>Directory Layout and Source Conventions</title></info>
<?dbhtml filename="source_organization.html"?>
<para>
The unpacked source directory of libstdc++ contains the files
needed to create the GNU C++ Library.
</para>
<literallayout class="normal">
It has subdirectories:
doc
Files in HTML and text format that document usage, quirks of the
implementation, and contributor checklists.
include
All header files for the C++ library are within this directory,
modulo specific runtime-related files that are in the libsupc++
directory.
include/std
Files meant to be found by #include <name> directives in
standard-conforming user programs.
include/c
Headers intended to directly include standard C headers.
[NB: this can be enabled via --enable-cheaders=c]
include/c_global
Headers intended to include standard C headers in
the global namespace, and put select names into the std::
namespace. [NB: this is the default, and is the same as
--enable-cheaders=c_global]
include/c_std
Headers intended to include standard C headers
already in namespace std, and put select names into the std::
namespace. [NB: this is the same as --enable-cheaders=c_std]
include/bits
Files included by standard headers and by other files in
the bits directory.
include/backward
Headers provided for backward compatibility, such as <iostream.h>.
They are not used in this library.
include/ext
Headers that define extensions to the standard library. No
standard header refers to any of them.
scripts
Scripts that are used during the configure, build, make, or test
process.
src
Files that are used in constructing the library, but are not
installed.
testsuites/[backward, demangle, ext, performance, thread, 17_* to 30_*]
Test programs are here, and may be used to begin to exercise the
library. Support for "make check" and "make check-install" is
complete, and runs through all the subdirectories here when this
command is issued from the build directory. Please note that
"make check" requires DejaGNU 1.4 or later to be installed. Please
note that "make check-script" calls the script mkcheck, which
requires bash, and which may need the paths to bash adjusted to
work properly, as /bin/bash is assumed.
Other subdirectories contain variant versions of certain files
that are meant to be copied or linked by the configure script.
Currently these are:
config/abi
config/cpu
config/io
config/locale
config/os
In addition, a subdirectory holds the convenience library libsupc++.
libsupc++
Contains the runtime library for C++, including exception
handling and memory allocation and deallocation, RTTI, terminate
handlers, etc.
Note that glibc also has a bits/ subdirectory. We will either
need to be careful not to collide with names in its bits/
directory; or rename bits to (e.g.) cppbits/.
In files throughout the system, lines marked with an "XXX" indicate
a bug or incompletely-implemented feature. Lines marked "XXX MT"
indicate a place that may require attention for multi-thread safety.
</literallayout>
</section>
<section xml:id="contrib.coding_style" xreflabel="Coding Style"><info><title>Coding Style</title></info>
<?dbhtml filename="source_code_style.html"?>
<para>
</para>
<section xml:id="coding_style.bad_identifiers"><info><title>Bad Identifiers</title></info>
<para>
Identifiers that conflict and should be avoided.
</para>
<literallayout class="normal">
This is the list of names <quote>reserved to the
implementation</quote> that have been claimed by certain
compilers and system headers of interest, and should not be used
in the library. It will grow, of course. We generally are
interested in names that are not all-caps, except for those like
"_T"
For Solaris:
_B
_C
_L
_N
_P
_S
_U
_X
_E1
..
_E24
Irix adds:
_A
_G
MS adds:
_T
BSD adds:
__used
__unused
__inline
_Complex
__istype
__maskrune
__tolower
__toupper
__wchar_t
__wint_t
_res
_res_ext
__tg_*
SPU adds:
__ea
For GCC:
[Note that this list is out of date. It applies to the old
name-mangling; in G++ 3.0 and higher a different name-mangling is
used. In addition, many of the bugs relating to G++ interpreting
these names as operators have been fixed.]
The full set of __* identifiers (combined from gcc/cp/lex.c and
gcc/cplus-dem.c) that are either old or new, but are definitely
recognized by the demangler, is:
__aa
__aad
__ad
__addr
__adv
__aer
__als
__alshift
__amd
__ami
__aml
__amu
__aor
__apl
__array
__ars
__arshift
__as
__bit_and
__bit_ior
__bit_not
__bit_xor
__call
__cl
__cm
__cn
__co
__component
__compound
__cond
__convert
__delete
__dl
__dv
__eq
__er
__ge
__gt
__indirect
__le
__ls
__lt
__max
__md
__method_call
__mi
__min
__minus
__ml
__mm
__mn
__mult
__mx
__ne
__negate
__new
__nop
__nt
__nw
__oo
__op
__or
__pl
__plus
__postdecrement
__postincrement
__pp
__pt
__rf
__rm
__rs
__sz
__trunc_div
__trunc_mod
__truth_andif
__truth_not
__truth_orif
__vc
__vd
__vn
SGI badnames:
__builtin_alloca
__builtin_fsqrt
__builtin_sqrt
__builtin_fabs
__builtin_dabs
__builtin_cast_f2i
__builtin_cast_i2f
__builtin_cast_d2ll
__builtin_cast_ll2d
__builtin_copy_dhi2i
__builtin_copy_i2dhi
__builtin_copy_dlo2i
__builtin_copy_i2dlo
__add_and_fetch
__sub_and_fetch
__or_and_fetch
__xor_and_fetch
__and_and_fetch
__nand_and_fetch
__mpy_and_fetch
__min_and_fetch
__max_and_fetch
__fetch_and_add
__fetch_and_sub
__fetch_and_or
__fetch_and_xor
__fetch_and_and
__fetch_and_nand
__fetch_and_mpy
__fetch_and_min
__fetch_and_max
__lock_test_and_set
__lock_release
__lock_acquire
__compare_and_swap
__synchronize
__high_multiply
__unix
__sgi
__linux__
__i386__
__i486__
__cplusplus
__embedded_cplusplus
// long double conversion members mangled as __opr
// http://gcc.gnu.org/ml/libstdc++/1999-q4/msg00060.html
__opr
</literallayout>
</section>
<section xml:id="coding_style.example"><info><title>By Example</title></info>
<literallayout class="normal">
This library is written to appropriate C++ coding standards. As such,
it is intended to precede the recommendations of the GNU Coding
Standard, which can be referenced in full here:
<link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="http://www.gnu.org/prep/standards/standards.html#Formatting">http://www.gnu.org/prep/standards/standards.html#Formatting</link>
The rest of this is also interesting reading, but skip the "Design
Advice" part.
The GCC coding conventions are here, and are also useful:
<link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="http://gcc.gnu.org/codingconventions.html">http://gcc.gnu.org/codingconventions.html</link>
In addition, because it doesn't seem to be stated explicitly anywhere
else, there is an 80 column source limit.
<filename>ChangeLog</filename> entries for member functions should use the
classname::member function name syntax as follows:
<code>
1999-04-15 Dennis Ritchie <dr@att.com>
* src/basic_file.cc (__basic_file::open): Fix thinko in
_G_HAVE_IO_FILE_OPEN bits.
</code>
Notable areas of divergence from what may be previous local practice
(particularly for GNU C) include:
01. Pointers and references
<code>
char* p = "flop";
char& c = *p;
-NOT-
char *p = "flop"; // wrong
char &c = *p; // wrong
</code>
Reason: In C++, definitions are mixed with executable code. Here,
<code>p</code> is being initialized, not <code>*p</code>. This is near-universal
practice among C++ programmers; it is normal for C hackers
to switch spontaneously as they gain experience.
02. Operator names and parentheses
<code>
operator==(type)
-NOT-
operator == (type) // wrong
</code>
Reason: The <code>==</code> is part of the function name. Separating
it makes the declaration look like an expression.
03. Function names and parentheses
<code>
void mangle()
-NOT-
void mangle () // wrong
</code>
Reason: no space before parentheses (except after a control-flow
keyword) is near-universal practice for C++. It identifies the
parentheses as the function-call operator or declarator, as
opposed to an expression or other overloaded use of parentheses.
04. Template function indentation
<code>
template<typename T>
void
template_function(args)
{ }
-NOT-
template<class T>
void template_function(args) {};
</code>
Reason: In class definitions, without indentation whitespace is
needed both above and below the declaration to distinguish
it visually from other members. (Also, re: "typename"
rather than "class".) <code>T</code> often could be <code>int</code>, which is
not a class. ("class", here, is an anachronism.)
05. Template class indentation
<code>
template<typename _CharT, typename _Traits>
class basic_ios : public ios_base
{
public:
// Types:
};
-NOT-
template<class _CharT, class _Traits>
class basic_ios : public ios_base
{
public:
// Types:
};
-NOT-
template<class _CharT, class _Traits>
class basic_ios : public ios_base
{
public:
// Types:
};
</code>
06. Enumerators
<code>
enum
{
space = _ISspace,
print = _ISprint,
cntrl = _IScntrl
};
-NOT-
enum { space = _ISspace, print = _ISprint, cntrl = _IScntrl };
</code>
07. Member initialization lists
All one line, separate from class name.
<code>
gribble::gribble()
: _M_private_data(0), _M_more_stuff(0), _M_helper(0)
{ }
-NOT-
gribble::gribble() : _M_private_data(0), _M_more_stuff(0), _M_helper(0)
{ }
</code>
08. Try/Catch blocks
<code>
try
{
//
}
catch (...)
{
//
}
-NOT-
try {
//
} catch(...) {
//
}
</code>
09. Member functions declarations and definitions
Keywords such as extern, static, export, explicit, inline, etc
go on the line above the function name. Thus
<code>
virtual int
foo()
-NOT-
virtual int foo()
</code>
Reason: GNU coding conventions dictate return types for functions
are on a separate line than the function name and parameter list
for definitions. For C++, where we have member functions that can
be either inline definitions or declarations, keeping to this
standard allows all member function names for a given class to be
aligned to the same margin, increasing readability.
10. Invocation of member functions with "this->"
For non-uglified names, use <code>this->name</code> to call the function.
<code>
this->sync()
-NOT-
sync()
</code>
Reason: Koenig lookup.
11. Namespaces
<code>
namespace std
{
blah blah blah;
} // namespace std
-NOT-
namespace std {
blah blah blah;
} // namespace std
</code>
12. Spacing under protected and private in class declarations:
space above, none below
i.e.
<code>
public:
int foo;
-NOT-
public:
int foo;
</code>
13. Spacing WRT return statements.
no extra spacing before returns, no parenthesis
i.e.
<code>
}
return __ret;
-NOT-
}
return __ret;
-NOT-
}
return (__ret);
</code>
14. Location of global variables.
All global variables of class type, whether in the "user visible"
space (e.g., <code>cin</code>) or the implementation namespace, must be defined
as a character array with the appropriate alignment and then later
re-initialized to the correct value.
This is due to startup issues on certain platforms, such as AIX.
For more explanation and examples, see <filename>src/globals.cc</filename>. All such
variables should be contained in that file, for simplicity.
15. Exception abstractions
Use the exception abstractions found in <filename class="headerfile">functexcept.h</filename>, which allow
C++ programmers to use this library with <literal>-fno-exceptions</literal>. (Even if
that is rarely advisable, it's a necessary evil for backwards
compatibility.)
16. Exception error messages
All start with the name of the function where the exception is
thrown, and then (optional) descriptive text is added. Example:
<code>
__throw_logic_error(__N("basic_string::_S_construct NULL not valid"));
</code>
Reason: The verbose terminate handler prints out <code>exception::what()</code>,
as well as the typeinfo for the thrown exception. As this is the
default terminate handler, by putting location info into the
exception string, a very useful error message is printed out for
uncaught exceptions. So useful, in fact, that non-programmers can
give useful error messages, and programmers can intelligently
speculate what went wrong without even using a debugger.
17. The doxygen style guide to comments is a separate document,
see index.
The library currently has a mixture of GNU-C and modern C++ coding
styles. The GNU C usages will be combed out gradually.
Name patterns:
For nonstandard names appearing in Standard headers, we are constrained
to use names that begin with underscores. This is called "uglification".
The convention is:
Local and argument names: <literal>__[a-z].*</literal>
Examples: <code>__count __ix __s1</code>
Type names and template formal-argument names: <literal>_[A-Z][^_].*</literal>
Examples: <code>_Helper _CharT _N</code>
Member data and function names: <literal>_M_.*</literal>
Examples: <code>_M_num_elements _M_initialize ()</code>
Static data members, constants, and enumerations: <literal>_S_.*</literal>
Examples: <code>_S_max_elements _S_default_value</code>
Don't use names in the same scope that differ only in the prefix,
e.g. _S_top and _M_top. See BADNAMES for a list of forbidden names.
(The most tempting of these seem to be and "_T" and "__sz".)
Names must never have "__" internally; it would confuse name
unmanglers on some targets. Also, never use "__[0-9]", same reason.
--------------------------
[BY EXAMPLE]
<code>
#ifndef _HEADER_
#define _HEADER_ 1
namespace std
{
class gribble
{
public:
gribble() throw();
gribble(const gribble&);
explicit
gribble(int __howmany);
gribble&
operator=(const gribble&);
virtual
~gribble() throw ();
// Start with a capital letter, end with a period.
inline void
public_member(const char* __arg) const;
// In-class function definitions should be restricted to one-liners.
int
one_line() { return 0 }
int
two_lines(const char* arg)
{ return strchr(arg, 'a'); }
inline int
three_lines(); // inline, but defined below.
// Note indentation.
template<typename _Formal_argument>
void
public_template() const throw();
template<typename _Iterator>
void
other_template();
private:
class _Helper;
int _M_private_data;
int _M_more_stuff;
_Helper* _M_helper;
int _M_private_function();
enum _Enum
{
_S_one,
_S_two
};
static void
_S_initialize_library();
};
// More-or-less-standard language features described by lack, not presence.
# ifndef _G_NO_LONGLONG
extern long long _G_global_with_a_good_long_name; // avoid globals!
# endif
// Avoid in-class inline definitions, define separately;
// likewise for member class definitions:
inline int
gribble::public_member() const
{ int __local = 0; return __local; }
class gribble::_Helper
{
int _M_stuff;
friend class gribble;
};
}
// Names beginning with "__": only for arguments and
// local variables; never use "__" in a type name, or
// within any name; never use "__[0-9]".
#endif /* _HEADER_ */
namespace std
{
template<typename T> // notice: "typename", not "class", no space
long_return_value_type<with_many, args>
function_name(char* pointer, // "char *pointer" is wrong.
char* argument,
const Reference& ref)
{
// int a_local; /* wrong; see below. */
if (test)
{
nested code
}
int a_local = 0; // declare variable at first use.
// char a, b, *p; /* wrong */
char a = 'a';
char b = a + 1;
char* c = "abc"; // each variable goes on its own line, always.
// except maybe here...
for (unsigned i = 0, mask = 1; mask; ++i, mask <<= 1) {
// ...
}
}
gribble::gribble()
: _M_private_data(0), _M_more_stuff(0), _M_helper(0)
{ }
int
gribble::three_lines()
{
// doesn't fit in one line.
}
} // namespace std
</code>
</literallayout>
</section>
</section>
<section xml:id="contrib.design_notes" xreflabel="Design Notes"><info><title>Design Notes</title></info>
<?dbhtml filename="source_design_notes.html"?>
<para>
</para>
<literallayout class="normal">
The Library
-----------
This paper is covers two major areas:
- Features and policies not mentioned in the standard that
the quality of the library implementation depends on, including
extensions and "implementation-defined" features;
- Plans for required but unimplemented library features and
optimizations to them.
Overhead
--------
The standard defines a large library, much larger than the standard
C library. A naive implementation would suffer substantial overhead
in compile time, executable size, and speed, rendering it unusable
in many (particularly embedded) applications. The alternative demands
care in construction, and some compiler support, but there is no
need for library subsets.
What are the sources of this overhead? There are four main causes:
- The library is specified almost entirely as templates, which
with current compilers must be included in-line, resulting in
very slow builds as tens or hundreds of thousands of lines
of function definitions are read for each user source file.
Indeed, the entire SGI STL, as well as the dos Reis valarray,
are provided purely as header files, largely for simplicity in
porting. Iostream/locale is (or will be) as large again.
- The library is very flexible, specifying a multitude of hooks
where users can insert their own code in place of defaults.
When these hooks are not used, any time and code expended to
support that flexibility is wasted.
- Templates are often described as causing to "code bloat". In
practice, this refers (when it refers to anything real) to several
independent processes. First, when a class template is manually
instantiated in its entirely, current compilers place the definitions
for all members in a single object file, so that a program linking
to one member gets definitions of all. Second, template functions
which do not actually depend on the template argument are, under
current compilers, generated anew for each instantiation, rather
than being shared with other instantiations. Third, some of the
flexibility mentioned above comes from virtual functions (both in
regular classes and template classes) which current linkers add
to the executable file even when they manifestly cannot be called.
- The library is specified to use a language feature, exceptions,
which in the current gcc compiler ABI imposes a run time and
code space cost to handle the possibility of exceptions even when
they are not used. Under the new ABI (accessed with -fnew-abi),
there is a space overhead and a small reduction in code efficiency
resulting from lost optimization opportunities associated with
non-local branches associated with exceptions.
What can be done to eliminate this overhead? A variety of coding
techniques, and compiler, linker and library improvements and
extensions may be used, as covered below. Most are not difficult,
and some are already implemented in varying degrees.
Overhead: Compilation Time
--------------------------
Providing "ready-instantiated" template code in object code archives
allows us to avoid generating and optimizing template instantiations
in each compilation unit which uses them. However, the number of such
instantiations that are useful to provide is limited, and anyway this
is not enough, by itself, to minimize compilation time. In particular,
it does not reduce time spent parsing conforming headers.
Quicker header parsing will depend on library extensions and compiler
improvements. One approach is some variation on the techniques
previously marketed as "pre-compiled headers", now standardized as
support for the "export" keyword. "Exported" template definitions
can be placed (once) in a "repository" -- really just a library, but
of template definitions rather than object code -- to be drawn upon
at link time when an instantiation is needed, rather than placed in
header files to be parsed along with every compilation unit.
Until "export" is implemented we can put some of the lengthy template
definitions in #if guards or alternative headers so that users can skip
over the full definitions when they need only the ready-instantiated
specializations.
To be precise, this means that certain headers which define
templates which users normally use only for certain arguments
can be instrumented to avoid exposing the template definitions
to the compiler unless a macro is defined. For example, in
<string>, we might have:
template <class _CharT, ... > class basic_string {
... // member declarations
};
... // operator declarations
#ifdef _STRICT_ISO_
# if _G_NO_TEMPLATE_EXPORT
# include <bits/std_locale.h> // headers needed by definitions
# ...
# include <bits/string.tcc> // member and global template definitions.
# endif
#endif
Users who compile without specifying a strict-ISO-conforming flag
would not see many of the template definitions they now see, and rely
instead on ready-instantiated specializations in the library. This
technique would be useful for the following substantial components:
string, locale/iostreams, valarray. It would *not* be useful or
usable with the following: containers, algorithms, iterators,
allocator. Since these constitute a large (though decreasing)
fraction of the library, the benefit the technique offers is
limited.
The language specifies the semantics of the "export" keyword, but
the gcc compiler does not yet support it. When it does, problems
with large template inclusions can largely disappear, given some
minor library reorganization, along with the need for the apparatus
described above.
Overhead: Flexibility Cost
--------------------------
The library offers many places where users can specify operations
to be performed by the library in place of defaults. Sometimes
this seems to require that the library use a more-roundabout, and
possibly slower, way to accomplish the default requirements than
would be used otherwise.
The primary protection against this overhead is thorough compiler
optimization, to crush out layers of inline function interfaces.
Kuck & Associates has demonstrated the practicality of this kind
of optimization.
The second line of defense against this overhead is explicit
specialization. By defining helper function templates, and writing
specialized code for the default case, overhead can be eliminated
for that case without sacrificing flexibility. This takes full
advantage of any ability of the optimizer to crush out degenerate
code.
The library specifies many virtual functions which current linkers
load even when they cannot be called. Some minor improvements to the
compiler and to ld would eliminate any such overhead by simply
omitting virtual functions that the complete program does not call.
A prototype of this work has already been done. For targets where
GNU ld is not used, a "pre-linker" could do the same job.
The main areas in the standard interface where user flexibility
can result in overhead are:
- Allocators: Containers are specified to use user-definable
allocator types and objects, making tuning for the container
characteristics tricky.
- Locales: the standard specifies locale objects used to implement
iostream operations, involving many virtual functions which use
streambuf iterators.
- Algorithms and containers: these may be instantiated on any type,
frequently duplicating code for identical operations.
- Iostreams and strings: users are permitted to use these on their
own types, and specify the operations the stream must use on these
types.
Note that these sources of overhead are _avoidable_. The techniques
to avoid them are covered below.
Code Bloat
----------
In the SGI STL, and in some other headers, many of the templates
are defined "inline" -- either explicitly or by their placement
in class definitions -- which should not be inline. This is a
source of code bloat. Matt had remarked that he was relying on
the compiler to recognize what was too big to benefit from inlining,
and generate it out-of-line automatically. However, this also can
result in code bloat except where the linker can eliminate the extra
copies.
Fixing these cases will require an audit of all inline functions
defined in the library to determine which merit inlining, and moving
the rest out of line. This is an issue mainly in clauses 23, 25, and
27. Of course it can be done incrementally, and we should generally
accept patches that move large functions out of line and into ".tcc"
files, which can later be pulled into a repository. Compiler/linker
improvements to recognize very large inline functions and move them
out-of-line, but shared among compilation units, could make this
work unnecessary.
Pre-instantiating template specializations currently produces large
amounts of dead code which bloats statically linked programs. The
current state of the static library, libstdc++.a, is intolerable on
this account, and will fuel further confused speculation about a need
for a library "subset". A compiler improvement that treats each
instantiated function as a separate object file, for linking purposes,
would be one solution to this problem. An alternative would be to
split up the manual instantiation files into dozens upon dozens of
little files, each compiled separately, but an abortive attempt at
this was done for <string> and, though it is far from complete, it
is already a nuisance. A better interim solution (just until we have
"export") is badly needed.
When building a shared library, the current compiler/linker cannot
automatically generate the instantiations needed. This creates a
miserable situation; it means any time something is changed in the
library, before a shared library can be built someone must manually
copy the declarations of all templates that are needed by other parts
of the library to an "instantiation" file, and add it to the build
system to be compiled and linked to the library. This process is
readily automated, and should be automated as soon as possible.
Users building their own shared libraries experience identical
frustrations.
Sharing common aspects of template definitions among instantiations
can radically reduce code bloat. The compiler could help a great
deal here by recognizing when a function depends on nothing about
a template parameter, or only on its size, and giving the resulting
function a link-name "equate" that allows it to be shared with other
instantiations. Implementation code could take advantage of the
capability by factoring out code that does not depend on the template
argument into separate functions to be merged by the compiler.
Until such a compiler optimization is implemented, much can be done
manually (if tediously) in this direction. One such optimization is
to derive class templates from non-template classes, and move as much
implementation as possible into the base class. Another is to partial-
specialize certain common instantiations, such as vector<T*>, to share
code for instantiations on all types T. While these techniques work,
they are far from the complete solution that a compiler improvement
would afford.
Overhead: Expensive Language Features
-------------------------------------
The main "expensive" language feature used in the standard library
is exception support, which requires compiling in cleanup code with
static table data to locate it, and linking in library code to use
the table. For small embedded programs the amount of such library
code and table data is assumed by some to be excessive. Under the
"new" ABI this perception is generally exaggerated, although in some
cases it may actually be excessive.
To implement a library which does not use exceptions directly is
not difficult given minor compiler support (to "turn off" exceptions
and ignore exception constructs), and results in no great library
maintenance difficulties. To be precise, given "-fno-exceptions",
the compiler should treat "try" blocks as ordinary blocks, and
"catch" blocks as dead code to ignore or eliminate. Compiler
support is not strictly necessary, except in the case of "function
try blocks"; otherwise the following macros almost suffice:
#define throw(X)
#define try if (true)
#define catch(X) else if (false)
However, there may be a need to use function try blocks in the
library implementation, and use of macros in this way can make
correct diagnostics impossible. Furthermore, use of this scheme
would require the library to call a function to re-throw exceptions
from a try block. Implementing the above semantics in the compiler
is preferable.
Given the support above (however implemented) it only remains to
replace code that "throws" with a call to a well-documented "handler"
function in a separate compilation unit which may be replaced by
the user. The main source of exceptions that would be difficult
for users to avoid is memory allocation failures, but users can
define their own memory allocation primitives that never throw.
Otherwise, the complete list of such handlers, and which library
functions may call them, would be needed for users to be able to
implement the necessary substitutes. (Fortunately, they have the
source code.)
Opportunities
-------------
The template capabilities of C++ offer enormous opportunities for
optimizing common library operations, well beyond what would be
considered "eliminating overhead". In particular, many operations
done in Glibc with macros that depend on proprietary language
extensions can be implemented in pristine Standard C++. For example,
the chapter 25 algorithms, and even C library functions such as strchr,
can be specialized for the case of static arrays of known (small) size.
Detailed optimization opportunities are identified below where
the component where they would appear is discussed. Of course new
opportunities will be identified during implementation.
Unimplemented Required Library Features
---------------------------------------
The standard specifies hundreds of components, grouped broadly by
chapter. These are listed in excruciating detail in the CHECKLIST
file.
17 general
18 support
19 diagnostics
20 utilities
21 string
22 locale
23 containers
24 iterators
25 algorithms
26 numerics
27 iostreams
Annex D backward compatibility
Anyone participating in implementation of the library should obtain
a copy of the standard, ISO 14882. People in the U.S. can obtain an
electronic copy for US$18 from ANSI's web site. Those from other
countries should visit http://www.iso.org/ to find out the location
of their country's representation in ISO, in order to know who can
sell them a copy.
The emphasis in the following sections is on unimplemented features
and optimization opportunities.
Chapter 17 General
-------------------
Chapter 17 concerns overall library requirements.
The standard doesn't mention threads. A multi-thread (MT) extension
primarily affects operators new and delete (18), allocator (20),
string (21), locale (22), and iostreams (27). The common underlying
support needed for this is discussed under chapter 20.
The standard requirements on names from the C headers create a
lot of work, mostly done. Names in the C headers must be visible
in the std:: and sometimes the global namespace; the names in the
two scopes must refer to the same object. More stringent is that
Koenig lookup implies that any types specified as defined in std::
really are defined in std::. Names optionally implemented as
macros in C cannot be macros in C++. (An overview may be read at
<http://www.cantrip.org/cheaders.html>). The scripts "inclosure"
and "mkcshadow", and the directories shadow/ and cshadow/, are the
beginning of an effort to conform in this area.
A correct conforming definition of C header names based on underlying
C library headers, and practical linking of conforming namespaced
customer code with third-party C libraries depends ultimately on
an ABI change, allowing namespaced C type names to be mangled into
type names as if they were global, somewhat as C function names in a
namespace, or C++ global variable names, are left unmangled. Perhaps
another "extern" mode, such as 'extern "C-global"' would be an
appropriate place for such type definitions. Such a type would
affect mangling as follows:
namespace A {
struct X {};
extern "C-global" { // or maybe just 'extern "C"'
struct Y {};
};
}
void f(A::X*); // mangles to f__FPQ21A1X
void f(A::Y*); // mangles to f__FP1Y
(It may be that this is really the appropriate semantics for regular
'extern "C"', and 'extern "C-global"', as an extension, would not be
necessary.) This would allow functions declared in non-standard C headers
(and thus fixable by neither us nor users) to link properly with functions
declared using C types defined in properly-namespaced headers. The
problem this solves is that C headers (which C++ programmers do persist
in using) frequently forward-declare C struct tags without including
the header where the type is defined, as in
struct tm;
void munge(tm*);
Without some compiler accommodation, munge cannot be called by correct
C++ code using a pointer to a correctly-scoped tm* value.
The current C headers use the preprocessor extension "#include_next",
which the compiler complains about when run "-pedantic".
(Incidentally, it appears that "-fpedantic" is currently ignored,
probably a bug.) The solution in the C compiler is to use
"-isystem" rather than "-I", but unfortunately in g++ this seems
also to wrap the whole header in an 'extern "C"' block, so it's
unusable for C++ headers. The correct solution appears to be to
allow the various special include-directory options, if not given
an argument, to affect subsequent include-directory options additively,
so that if one said
-pedantic -iprefix $(prefix) \
-idirafter -ino-pedantic -ino-extern-c -iwithprefix -I g++-v3 \
-iwithprefix -I g++-v3/ext
the compiler would search $(prefix)/g++-v3 and not report
pedantic warnings for files found there, but treat files in
$(prefix)/g++-v3/ext pedantically. (The undocumented semantics
of "-isystem" in g++ stink. Can they be rescinded? If not it
must be replaced with something more rationally behaved.)
All the C headers need the treatment above; in the standard these
headers are mentioned in various clauses. Below, I have only
mentioned those that present interesting implementation issues.
The components identified as "mostly complete", below, have not been
audited for conformance. In many cases where the library passes
conformance tests we have non-conforming extensions that must be
wrapped in #if guards for "pedantic" use, and in some cases renamed
in a conforming way for continued use in the implementation regardless
of conformance flags.
The STL portion of the library still depends on a header
stl/bits/stl_config.h full of #ifdef clauses. This apparatus
should be replaced with autoconf/automake machinery.
The SGI STL defines a type_traits<> template, specialized for
many types in their code including the built-in numeric and
pointer types and some library types, to direct optimizations of
standard functions. The SGI compiler has been extended to generate
specializations of this template automatically for user types,
so that use of STL templates on user types can take advantage of
these optimizations. Specializations for other, non-STL, types
would make more optimizations possible, but extending the gcc
compiler in the same way would be much better. Probably the next
round of standardization will ratify this, but probably with
changes, so it probably should be renamed to place it in the
implementation namespace.
The SGI STL also defines a large number of extensions visible in
standard headers. (Other extensions that appear in separate headers
have been sequestered in subdirectories ext/ and backward/.) All
these extensions should be moved to other headers where possible,
and in any case wrapped in a namespace (not std!), and (where kept
in a standard header) girded about with macro guards. Some cannot be
moved out of standard headers because they are used to implement
standard features. The canonical method for accommodating these
is to use a protected name, aliased in macro guards to a user-space
name. Unfortunately C++ offers no satisfactory template typedef
mechanism, so very ad-hoc and unsatisfactory aliasing must be used
instead.
Implementation of a template typedef mechanism should have the highest
priority among possible extensions, on the same level as implementation
of the template "export" feature.
Chapter 18 Language support
----------------------------
Headers: <limits> <new> <typeinfo> <exception>
C headers: <cstddef> <climits> <cfloat> <cstdarg> <csetjmp>
<ctime> <csignal> <cstdlib> (also 21, 25, 26)
This defines the built-in exceptions, rtti, numeric_limits<>,
operator new and delete. Much of this is provided by the
compiler in its static runtime library.
Work to do includes defining numeric_limits<> specializations in
separate files for all target architectures. Values for integer types
except for bool and wchar_t are readily obtained from the C header
<limits.h>, but values for the remaining numeric types (bool, wchar_t,
float, double, long double) must be entered manually. This is
largely dog work except for those members whose values are not
easily deduced from available documentation. Also, this involves
some work in target configuration to identify the correct choice of
file to build against and to install.
The definitions of the various operators new and delete must be
made thread-safe, which depends on a portable exclusion mechanism,
discussed under chapter 20. Of course there is always plenty of
room for improvements to the speed of operators new and delete.
<cstdarg>, in Glibc, defines some macros that gcc does not allow to
be wrapped into an inline function. Probably this header will demand
attention whenever a new target is chosen. The functions atexit(),
exit(), and abort() in cstdlib have different semantics in C++, so
must be re-implemented for C++.
Chapter 19 Diagnostics
-----------------------
Headers: <stdexcept>
C headers: <cassert> <cerrno>
This defines the standard exception objects, which are "mostly complete".
Cygnus has a version, and now SGI provides a slightly different one.
It makes little difference which we use.
The C global name "errno", which C allows to be a variable or a macro,
is required in C++ to be a macro. For MT it must typically result in
a function call.
Chapter 20 Utilities
---------------------
Headers: <utility> <functional> <memory>
C header: <ctime> (also in 18)
SGI STL provides "mostly complete" versions of all the components
defined in this chapter. However, the auto_ptr<> implementation
is known to be wrong. Furthermore, the standard definition of it
is known to be unimplementable as written. A minor change to the
standard would fix it, and auto_ptr<> should be adjusted to match.
Multi-threading affects the allocator implementation, and there must
be configuration/installation choices for different users' MT
requirements. Anyway, users will want to tune allocator options
to support different target conditions, MT or no.
The primitives used for MT implementation should be exposed, as an
extension, for users' own work. We need cross-CPU "mutex" support,
multi-processor shared-memory atomic integer operations, and single-
processor uninterruptible integer operations, and all three configurable
to be stubbed out for non-MT use, or to use an appropriately-loaded
dynamic library for the actual runtime environment, or statically
compiled in for cases where the target architecture is known.
Chapter 21 String
------------------
Headers: <string>
C headers: <cctype> <cwctype> <cstring> <cwchar> (also in 27)
<cstdlib> (also in 18, 25, 26)
We have "mostly-complete" char_traits<> implementations. Many of the
char_traits<char> operations might be optimized further using existing
proprietary language extensions.
We have a "mostly-complete" basic_string<> implementation. The work
to manually instantiate char and wchar_t specializations in object
files to improve link-time behavior is extremely unsatisfactory,
literally tripling library-build time with no commensurate improvement
in static program link sizes. It must be redone. (Similar work is
needed for some components in clauses 22 and 27.)
Other work needed for strings is MT-safety, as discussed under the
chapter 20 heading.
The standard C type mbstate_t from <cwchar> and used in char_traits<>
must be different in C++ than in C, because in C++ the default constructor
value mbstate_t() must be the "base" or "ground" sequence state.
(According to the likely resolution of a recently raised Core issue,
this may become unnecessary. However, there are other reasons to
use a state type not as limited as whatever the C library provides.)
If we might want to provide conversions from (e.g.) internally-
represented EUC-wide to externally-represented Unicode, or vice-
versa, the mbstate_t we choose will need to be more accommodating
than what might be provided by an underlying C library.
There remain some basic_string template-member functions which do
not overload properly with their non-template brethren. The infamous
hack akin to what was done in vector<> is needed, to conform to
23.1.1 para 10. The CHECKLIST items for basic_string marked 'X',
or incomplete, are so marked for this reason.
Replacing the string iterators, which currently are simple character
pointers, with class objects would greatly increase the safety of the
client interface, and also permit a "debug" mode in which range,
ownership, and validity are rigorously checked. The current use of
raw pointers as string iterators is evil. vector<> iterators need the
same treatment. Note that the current implementation freely mixes
pointers and iterators, and that must be fixed before safer iterators
can be introduced.
Some of the functions in <cstring> are different from the C version.
generally overloaded on const and non-const argument pointers. For
example, in <cstring> strchr is overloaded. The functions isupper
etc. in <cctype> typically implemented as macros in C are functions
in C++, because they are overloaded with others of the same name
defined in <locale>.
Many of the functions required in <cwctype> and <cwchar> cannot be
implemented using underlying C facilities on intended targets because
such facilities only partly exist.
Chapter 22 Locale
------------------
Headers: <locale>
C headers: <clocale>
We have a "mostly complete" class locale, with the exception of
code for constructing, and handling the names of, named locales.
The ways that locales are named (particularly when categories
(e.g. LC_TIME, LC_COLLATE) are different) varies among all target
environments. This code must be written in various versions and
chosen by configuration parameters.
Members of many of the facets defined in <locale> are stubs. Generally,
there are two sets of facets: the base class facets (which are supposed
to implement the "C" locale) and the "byname" facets, which are supposed
to read files to determine their behavior. The base ctype<>, collate<>,
and numpunct<> facets are "mostly complete", except that the table of
bitmask values used for "is" operations, and corresponding mask values,
are still defined in libio and just included/linked. (We will need to
implement these tables independently, soon, but should take advantage
of libio where possible.) The num_put<>::put members for integer types
are "mostly complete".
A complete list of what has and has not been implemented may be
found in CHECKLIST. However, note that the current definition of
codecvt<wchar_t,char,mbstate_t> is wrong. It should simply write
out the raw bytes representing the wide characters, rather than
trying to convert each to a corresponding single "char" value.
Some of the facets are more important than others. Specifically,
the members of ctype<>, numpunct<>, num_put<>, and num_get<> facets
are used by other library facilities defined in <string>, <istream>,
and <ostream>, and the codecvt<> facet is used by basic_filebuf<>
in <fstream>, so a conforming iostream implementation depends on
these.
The "long long" type eventually must be supported, but code mentioning
it should be wrapped in #if guards to allow pedantic-mode compiling.
Performance of num_put<> and num_get<> depend critically on
caching computed values in ios_base objects, and on extensions
to the interface with streambufs.
Specifically: retrieving a copy of the locale object, extracting
the needed facets, and gathering data from them, for each call to
(e.g.) operator<< would be prohibitively slow. To cache format
data for use by num_put<> and num_get<> we have a _Format_cache<>
object stored in the ios_base::pword() array. This is constructed
and initialized lazily, and is organized purely for utility. It
is discarded when a new locale with different facets is imbued.
Using only the public interfaces of the iterator arguments to the
facet functions would limit performance by forbidding "vector-style"
character operations. The streambuf iterator optimizations are
described under chapter 24, but facets can also bypass the streambuf
iterators via explicit specializations and operate directly on the
streambufs, and use extended interfaces to get direct access to the
streambuf internal buffer arrays. These extensions are mentioned
under chapter 27. These optimizations are particularly important
for input parsing.
Unused virtual members of locale facets can be omitted, as mentioned
above, by a smart linker.
Chapter 23 Containers
----------------------
Headers: <deque> <list> <queue> <stack> <vector> <map> <set> <bitset>
All the components in chapter 23 are implemented in the SGI STL.
They are "mostly complete"; they include a large number of
nonconforming extensions which must be wrapped. Some of these
are used internally and must be renamed or duplicated.
The SGI components are optimized for large-memory environments. For
embedded targets, different criteria might be more appropriate. Users
will want to be able to tune this behavior. We should provide
ways for users to compile the library with different memory usage
characteristics.
A lot more work is needed on factoring out common code from different
specializations to reduce code size here and in chapter 25. The
easiest fix for this would be a compiler/ABI improvement that allows
the compiler to recognize when a specialization depends only on the
size (or other gross quality) of a template argument, and allow the
linker to share the code with similar specializations. In its
absence, many of the algorithms and containers can be partial-
specialized, at least for the case of pointers, but this only solves
a small part of the problem. Use of a type_traits-style template
allows a few more optimization opportunities, more if the compiler
can generate the specializations automatically.
As an optimization, containers can specialize on the default allocator
and bypass it, or take advantage of details of its implementation
after it has been improved upon.
Replacing the vector iterators, which currently are simple element
pointers, with class objects would greatly increase the safety of the
client interface, and also permit a "debug" mode in which range,
ownership, and validity are rigorously checked. The current use of
pointers for iterators is evil.
As mentioned for chapter 24, the deque iterator is a good example of
an opportunity to implement a "staged" iterator that would benefit
from specializations of some algorithms.
Chapter 24 Iterators
---------------------
Headers: <iterator>
Standard iterators are "mostly complete", with the exception of
the stream iterators, which are not yet templatized on the
stream type. Also, the base class template iterator<> appears
to be wrong, so everything derived from it must also be wrong,
currently.
The streambuf iterators (currently located in stl/bits/std_iterator.h,
but should be under bits/) can be rewritten to take advantage of
friendship with the streambuf implementation.
Matt Austern has identified opportunities where certain iterator
types, particularly including streambuf iterators and deque
iterators, have a "two-stage" quality, such that an intermediate
limit can be checked much more quickly than the true limit on
range operations. If identified with a member of iterator_traits,
algorithms may be specialized for this case. Of course the
iterators that have this quality can be identified by specializing
a traits class.
Many of the algorithms must be specialized for the streambuf
iterators, to take advantage of block-mode operations, in order
to allow iostream/locale operations' performance not to suffer.
It may be that they could be treated as staged iterators and
take advantage of those optimizations.
Chapter 25 Algorithms
----------------------
Headers: <algorithm>
C headers: <cstdlib> (also in 18, 21, 26))
The algorithms are "mostly complete". As mentioned above, they
are optimized for speed at the expense of code and data size.
Specializations of many of the algorithms for non-STL types would
give performance improvements, but we must use great care not to
interfere with fragile template overloading semantics for the
standard interfaces. Conventionally the standard function template
interface is an inline which delegates to a non-standard function
which is then overloaded (this is already done in many places in
the library). Particularly appealing opportunities for the sake of
iostream performance are for copy and find applied to streambuf
iterators or (as noted elsewhere) for staged iterators, of which
the streambuf iterators are a good example.
The bsearch and qsort functions cannot be overloaded properly as
required by the standard because gcc does not yet allow overloading
on the extern-"C"-ness of a function pointer.
Chapter 26 Numerics
--------------------
Headers: <complex> <valarray> <numeric>
C headers: <cmath>, <cstdlib> (also 18, 21, 25)
Numeric components: Gabriel dos Reis's valarray, Drepper's complex,
and the few algorithms from the STL are "mostly done". Of course
optimization opportunities abound for the numerically literate. It
is not clear whether the valarray implementation really conforms
fully, in the assumptions it makes about aliasing (and lack thereof)
in its arguments.
The C div() and ldiv() functions are interesting, because they are the
only case where a C library function returns a class object by value.
Since the C++ type div_t must be different from the underlying C type
(which is in the wrong namespace) the underlying functions div() and
ldiv() cannot be re-used efficiently. Fortunately they are trivial to
re-implement.
Chapter 27 Iostreams
---------------------
Headers: <iosfwd> <streambuf> <ios> <ostream> <istream> <iostream>
<iomanip> <sstream> <fstream>
C headers: <cstdio> <cwchar> (also in 21)
Iostream is currently in a very incomplete state. <iosfwd>, <iomanip>,
ios_base, and basic_ios<> are "mostly complete". basic_streambuf<> and
basic_ostream<> are well along, but basic_istream<> has had little work
done. The standard stream objects, <sstream> and <fstream> have been
started; basic_filebuf<> "write" functions have been implemented just
enough to do "hello, world".
Most of the istream and ostream operators << and >> (with the exception
of the op<<(integer) ones) have not been changed to use locale primitives,
sentry objects, or char_traits members.
All these templates should be manually instantiated for char and
wchar_t in a way that links only used members into user programs.
Streambuf is fertile ground for optimization extensions. An extended
interface giving iterator access to its internal buffer would be very
useful for other library components.
Iostream operations (primarily operators << and >>) can take advantage
of the case where user code has not specified a locale, and bypass locale
operations entirely. The current implementation of op<</num_put<>::put,
for the integer types, demonstrates how they can cache encoding details
from the locale on each operation. There is lots more room for
optimization in this area.
The definition of the relationship between the standard streams
cout et al. and stdout et al. requires something like a "stdiobuf".
The SGI solution of using double-indirection to actually use a
stdio FILE object for buffering is unsatisfactory, because it
interferes with peephole loop optimizations.
The <sstream> header work has begun. stringbuf can benefit from
friendship with basic_string<> and basic_string<>::_Rep to use
those objects directly as buffers, and avoid allocating and making
copies.
The basic_filebuf<> template is a complex beast. It is specified to
use the locale facet codecvt<> to translate characters between native
files and the locale character encoding. In general this involves
two buffers, one of "char" representing the file and another of
"char_type", for the stream, with codecvt<> translating. The process
is complicated by the variable-length nature of the translation, and
the need to seek to corresponding places in the two representations.
For the case of basic_filebuf<char>, when no translation is needed,
a single buffer suffices. A specialized filebuf can be used to reduce
code space overhead when no locale has been imbued. Matt Austern's
work at SGI will be useful, perhaps directly as a source of code, or
at least as an example to draw on.
Filebuf, almost uniquely (cf. operator new), depends heavily on
underlying environmental facilities. In current releases iostream
depends fairly heavily on libio constant definitions, but it should
be made independent. It also depends on operating system primitives
for file operations. There is immense room for optimizations using
(e.g.) mmap for reading. The shadow/ directory wraps, besides the
standard C headers, the libio.h and unistd.h headers, for use mainly
by filebuf. These wrappings have not been completed, though there
is scaffolding in place.
The encapsulation of certain C header <cstdio> names presents an
interesting problem. It is possible to define an inline std::fprintf()
implemented in terms of the 'extern "C"' vfprintf(), but there is no
standard vfscanf() to use to implement std::fscanf(). It appears that
vfscanf but be re-implemented in C++ for targets where no vfscanf
extension has been defined. This is interesting in that it seems
to be the only significant case in the C library where this kind of
rewriting is necessary. (Of course Glibc provides the vfscanf()
extension.) (The functions related to exit() must be rewritten
for other reasons.)
Annex D
-------
Headers: <strstream>
Annex D defines many non-library features, and many minor
modifications to various headers, and a complete header.
It is "mostly done", except that the libstdc++-2 <strstream>
header has not been adopted into the library, or checked to
verify that it matches the draft in those details that were
clarified by the committee. Certainly it must at least be
moved into the std namespace.
We still need to wrap all the deprecated features in #if guards
so that pedantic compile modes can detect their use.
Nonstandard Extensions
----------------------
Headers: <iostream.h> <strstream.h> <hash> <rbtree>
<pthread_alloc> <stdiobuf> (etc.)
User code has come to depend on a variety of nonstandard components
that we must not omit. Much of this code can be adopted from
libstdc++-v2 or from the SGI STL. This particularly includes
<iostream.h>, <strstream.h>, and various SGI extensions such
as <hash_map.h>. Many of these are already placed in the
subdirectories ext/ and backward/. (Note that it is better to
include them via "<backward/hash_map.h>" or "<ext/hash_map>" than
to search the subdirectory itself via a "-I" directive.
</literallayout>
</section>
</appendix>
|