%{
/* Copyright (C) 1989-1991 James A. Roskind, All rights reserved.
This grammar was developed and written by James A. Roskind.
Copying of this grammar description, as a whole, is permitted
providing this notice is intact and applicable in all complete
copies. Translations as a whole to other parser generator input
languages (or grammar description languages) is permitted
provided that this notice is intact and applicable in all such
copies, along with a disclaimer that the contents are a
translation. The reproduction of derived text, such as modified
versions of this grammar, or the output of parser generators, is
permitted, provided the resulting work includes the copyright
notice "Portions Copyright (c) 1989, 1990 James A. Roskind".
Derived products, such as compilers, translators, browsers, etc.,
that use this grammar, must also provide the notice "Portions
Copyright (c) 1989, 1990 James A. Roskind" in a manner
appropriate to the utility, and in keeping with copyright law
(e.g.: EITHER displayed when first invoked/executed; OR displayed
continuously on display terminal; OR via placement in the object
code in form readable in a printout, with or near the title of
the work, or at the end of the file). No royalties, licenses or
commissions of any kind are required to copy this grammar, its
translations, or derivative products, when the copies are made in
compliance with this notice. Persons or corporations that do make
copies in compliance with this notice may charge whatever price
is agreeable to a buyer, for such copies or derivative works.
THIS GRAMMAR IS PROVIDED ``AS IS'' AND WITHOUT ANY EXPRESS OR
IMPLIED WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED
WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
PURPOSE.
James A. Roskind
Independent Consultant
516 Latania Palm Drive
Indialantic FL, 32903
(407)729-4348
jar@hq.ileaf.com
---end of copyright notice---
MOTIVATION-
My goal is to see software developers adopt this grammar as a
standard until such time as a better standard is accessible. The
only way to get it to become a standard, is to be sure that people
know that derivations are based on a specific work. The intent of
releasing this grammar is to provide a publicly accessible standard
grammar for C++. The intent of the copyright notice is to allow
arbitrary commercial and non-commercial use of the grammar, as long
as reference is given to the original standard. Without reference to
a specific standard, many alternative grammars would develop. By
referring to the standard, this grammar is given publicity, which
should lead to further use in compatible products and systems. The
benefits of such a standard to commercial products (browsers,
beautifiers, translators, compilers, ...) should be obvious to the
developers, in that other compatible products will emerge, and the
value of all conforming products will rise. Most developers are
aware of the value of acquiring a fairly complete grammar for a
language, and the copyright notice (and the resulting affiliation
with my work) should not be too high a price to pay. By copyrighting
this grammar, I have some minor control over what this standard is,
and I can (hopefully) keep it from degrading without my approval. I
will consistently attempt to provide upgraded grammars that are
compliant with the current art, and the ANSI C++ Committee
recommendation in particular. A developer is never prevented from
modifying the grammar to improve it in whatever way is seen fit.
There is also no restriction on the sale of copies, or derivative
works, providing the requests in the copyright notice are satisfied.
If you are not "copying" my work, but are rather only abstracting
some of the standard, an acknowledgment with references to such a
standard would be appreciated. Specifically, agreements with this
standard as to the resolution of otherwise ambiguous constructs,
should be noted.
Simply put: "make whatever use you would like of the grammar, but
include the ``portions Copyright ...'' as a reference to this
standard."
*/
/* Last modified 7/4/91, Version 2.0 */
/* File CPP5.Y is translated by YACC to Y.TAB.C */
/* ACKNOWLEDGMENT: Without Bjarne Stroustrup and his many co-workers
at Bell Labs, there would be no C++ Language for which to provide a
syntax description. Bjarne has also been especially helpful and open
in discussions, and by permitting me to review his texts prior to
their publication, allowed me a wonderful vantage point of clarity.
Without the effort expended by the ANSI C standardizing committee, I
would have been lost. Although the ANSI C standard does not include
a fully disambiguated syntax description, the committee has at least
provided most of the disambiguating rules in narratives. This C++
grammar is intended to be a superset of an ANSI C compatible grammar
that is provided in a related file.
Several reviewers have also recently critiqued this grammar, the
related C grammar, and/or assisted in discussions during its
preparation. These reviewers are certainly not responsible for the
errors I have committed here, but they are responsible for allowing
me to provide fewer errors. These colleagues include: Bruce
Blodgett, Mark Langley, Joe Fialli, Greg Perkins, Ron Guilmette, and
Eric Krohn. */
/* Required fixes from last release :
done: 0) Allow direct call to destructors
done: 1) Allow placement of declarations in labeled statements. The
easiest fix involves using a larger variance from the C grammar, and
simply making "statement" include declarations. Note that it should
also be legal for declarations to be in the branches of if
statements, as long as there is no other code in the block (I think).
Consider:
...
{
if (0 == a)
int b=5;
else
int c=4;
}
1) template support: Not done: pending syntax specification from
ANSI. (This looks like a major effort, as ANSI has decided to extend
the "LE_TYPEDEFname"-feedback-to-the-lexer-hack to support template
names as a new kind of terminal token.)
2) exception handling: Not done: pending syntax specification from
ANSI (but it doesn't look hard)
done: 3) Support nested types, including identifier::name, where we
realize that identifier was a hidden type. Force the lexer to keep
pace in this situation. This will require an extension of the
yacc-lex feedback loop.
done: 4) Support nested types even when derivations are used in class
definitions.
5) Provide advanced tutorial on YACC conflicts: almost done in
documentation about machine generated documentation.
done: 6) Allow declaration specifiers to be left out of declarations
at file and structure scope so that operator conversion functions can
be declared and/or defined. Note that checking to see that it was a
function type that does not require declaration_specifiers is now a
constraint check, and not a syntax issue. Within function body
scopes, declaration specifiers are required, and this is critical to
distinguishing expressions.
*/
%}
/*
Interesting ambiguity:
Usually
typename ( typename2 ) ...
or
typename ( typename2 [4] ) ...
etc.
is a redeclaration of typename2.
Inside a structure elaboration, it is sometimes the declaration of a
constructor! Note, this only counts if typename IS the current
containing class name. (Note this can't conflict with ANSI C because
ANSI C would call it a redefinition, but claim it is semantically
illegal because you can't have a member declared the same type as the
containing struct!) Since the ambiguity is only reached when a ';' is
found, there is no problem with the fact that the semantic
interpretation is providing the true resolution. As currently
implemented, the constructor semantic actions must be able to process
an ordinary declaration. I may reverse this in the future, to ease
semantic implementation.
*/
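/*
A minimal C++ sketch of the two readings described above. The names T,
U, and the helper functions are hypothetical and are not part of this
grammar; the intent is only to show how the same token sequence "T (U);"
is a constructor declaration inside T and an ordinary declaration
elsewhere.

```cpp
#include <cassert>

typedef int U;

struct T {
    int arg;
    T() : arg(0) {}
    T (U);              // inside T: declares a constructor taking a U
};

T::T(U u) : arg(u) {}

int block_scope_reading() {
    T (U);              // at block scope: declares a local object named U of type T
    return U.arg;       // the local U was default-constructed, so arg is 0
}

int constructor_reading() {
    T t(5);             // uses the constructor declared inside T
    assert(t.arg == 5);
    return t.arg;
}
```

Under these assumptions block_scope_reading() yields 0 and
constructor_reading() yields 5.
*/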
/*
INTRO TO ANSI C GRAMMAR (provided in a separate file):
The refined grammar resolves several typedef ambiguities in the draft
proposed ANSI C standard syntax down to 1 shift/reduce conflict, as
reported by a YACC process. Note that the one shift reduce conflicts
is the traditional if-if-else conflict that is not resolved by the
grammar. This ambiguity can be removed using the method described in
the Dragon Book (2nd edition), but this does not appear worth the
effort.
There was quite a bit of effort made to reduce the conflicts to this
level, and an additional effort was made to make the grammar quite
similar to the C++ grammar being developed in parallel. Note that
this grammar resolves the following ANSI C ambiguities:
ANSI C section 3.5.6, "If the [typedef name] is redeclared at an
inner scope, the type specifiers shall not be omitted in the inner
declaration". Supplying type specifiers prevents consideration of T
as a typedef name in this grammar. Failure to supply type specifiers
forced the use of the LE_TYPEDEFname as a type specifier. This is taken
to an (unnecessary) extreme by this implementation. The ambiguity is
only a problem with the first declarator in a declaration, but we
restrict ALL declarators whenever the user fails to use a
type_specifier.
ANSI C section 3.5.4.3, "In a parameter declaration, a single typedef
name in parentheses is taken to be an abstract declarator that
specifies a function with a single parameter, not as redundant
parentheses around the identifier". This is extended to cover the
following cases:
typedef float T;
int noo(const (T[5]));
int moo(const (T(int)));
...
Where again the '(' immediately to the left of 'T' is interpreted as
being the start of a parameter type list, and not as a redundant
paren around a redeclaration of T. Hence an equivalent code fragment
is:
typedef float T;
int noo(const int identifier1 (T identifier2 [5]));
int moo(const int identifier1 (T identifier2 (int identifier3)));
...
*/
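/*
The parameter rule quoted above also applies in C++, and can be sketched
as follows. The names apply, half, and apply_demo are hypothetical; the
static_assert (C++11) only checks that the parenthesized typedef name
was read as a parameter list.

```cpp
#include <cassert>
#include <type_traits>

typedef float T;

int apply(int (T));     // parameter: function taking a T and returning int

// If "(T)" had been read as redundant parentheses around a parameter
// named T, the type of apply would be int (*)(int) instead.
static_assert(std::is_same<decltype(&apply), int (*)(int (*)(float))>::value,
              "int (T) declares a function-typed parameter");

int half(T x) { return static_cast<int>(x / 2); }

int apply(int (*fn)(T)) { return fn(9.0f); }   // defines the function declared above

int apply_demo() {
    int r = apply(half);
    assert(r == 4);      // 9.0f / 2 is 4.5, truncated to 4
    return r;
}
```
*/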
%{
/*************** Includes and Defines *****************************/
#define YYDEBUG_LEXER_TEXT (yylval) /* our lexer loads this up each time.
We are telling the graphical debugger
where to find the spelling of the
tokens.*/
#define YYDEBUG 0 /* get the pretty debugging code to compile*/
#include "stdio.h"
int yyparse();
void yyerror(char *string);
extern int yylex();
extern char* yytext;
void setLexerInput(const char *line);
/*************** Standard ytab.c continues here *********************/
%}
%union {
char value[128];
}
/*************************************************************************/
/* This group is used by the C/C++ language parser */
%token <value> LE_AUTO LE_DOUBLE LE_INT LE_STRUCT
%token <value> LE_BREAK LE_ELSE LE_LONG LE_SWITCH
%token <value> LE_CASE LE_ENUM LE_REGISTER LE_TYPEDEF
%token <value> LE_CHAR LE_EXTERN LE_RETURN LE_UNION
%token <value> LE_CONST LE_FLOAT LE_SHORT LE_UNSIGNED
%token <value> LE_CONTINUE LE_FOR LE_SIGNED LE_VOID
%token <value> LE_DEFAULT LE_GOTO LE_SIZEOF LE_VOLATILE
%token <value> LE_DO LE_IF LE_STATIC LE_WHILE
/* The following are used in C++ only. ANSI C would call these IDENTIFIERs */
%token <value> LE_NEW LE_DELETE
%token <value> LE_THIS
%token <value> LE_OPERATOR
%token <value> LE_CLASS
%token <value> LE_PUBLIC LE_PROTECTED LE_PRIVATE
%token <value> LE_VIRTUAL LE_FRIEND
%token <value> LE_INLINE LE_OVERLOAD
/* ANSI C Grammar suggestions */
%token <value> LE_IDENTIFIER LE_STRINGliteral
%token <value> LE_FLOATINGconstant LE_INTEGERconstant LE_CHARACTERconstant
%token <value> LE_OCTALconstant LE_HEXconstant
%token <value> LE_POUNDPOUND LE_CComment LE_CPPComment LE_NAMESPACE
/* New Lexical element, whereas ANSI C suggested non-terminal */
%token LE_TYPEDEFname
/* Multi-Character operators */
%token <value> LE_ARROW /* -> */
%token <value> LE_ICR LE_DECR /* ++ -- */
%token <value> LE_LS LE_RS /* << >> */
%token <value> LE_LE LE_GE LE_EQ LE_NE /* <= >= == != */
%token <value> LE_ANDAND LE_OROR /* && || */
%token <value> LE_ELLIPSIS /* ... */
/* Following are used in C++, not ANSI C */
%token <value> LE_CLCL /* :: */
%token <value> LE_DOTstar LE_ARROWstar/* .* ->* */
/* modifying assignment operators */
%token <value> LE_MULTassign LE_DIVassign LE_MODassign /* *= /= %= */
%token <value> LE_PLUSassign LE_MINUSassign /* += -= */
%token <value> LE_LSassign LE_RSassign /* <<= >>= */
%token <value> LE_ANDassign LE_ERassign LE_ORassign /* &= ^= |= */
%token <value> LE_TEMPLATE
%token <value> LE_TYPENAME
/*************************************************************************/
%start translation_unit
/*************************************************************************/
%%
/*********************** CONSTANTS *********************************/
constant:
LE_INTEGERconstant
| LE_FLOATINGconstant
/* We are not including ENUMERATIONconstant here because we
are treating it like a variable with a type of "enumeration
constant". */
| LE_OCTALconstant
| LE_HEXconstant
| LE_CHARACTERconstant
;
string_literal_list:
LE_STRINGliteral
| string_literal_list LE_STRINGliteral
;
/************************* EXPRESSIONS ********************************/
/* Note that I provide a "scope_opt_identifier" that *cannot*
begin with ::. This guarantees we have a viable declarator, and
helps to disambiguate :: based uses in the grammar. For example:
...
{
int (* ::b()); // must be an expression
int (T::b); // officially a declaration, which fails on constraint grounds
This *syntax* restriction reflects the current syntax in the ANSI
C++ Working Papers. This means that it is *incorrect* for
parsers to misparse the example:
int (* ::b()); // must be an expression
as a declaration, and then report a constraint error.
In contrast, declarations such as:
class T;
class A;
class B;
main(){
T( F()); // constraint error: cannot declare local function
T (A::B::a); // constraint error: cannot declare member as a local value
are *parsed* as declarations, and *then* given semantic error
reports. It is incorrect for a parser to "change its mind" based
on constraints. If your C++ compiler claims that the above 2
lines are expressions, then *I* claim that they are wrong. */
paren_identifier_declarator:
scope_opt_identifier
| scope_opt_complex_name
| '(' paren_identifier_declarator ')'
;
/* Note that LE_CLCL LE_IDENTIFIER is NOT part of scope_opt_identifier,
but it is part of global_opt_scope_opt_identifier. It is ONLY
valid for referring to an identifier, and NOT valid for declaring
(or importing an external declaration of) an identifier. This
disambiguates the following code, which would otherwise be
syntactically and semantically ambiguous:
class base {
static int i; // element i;
float member_function(void);
};
base i; // global i
float base::member_function(void) {
i; // refers to static int element "i" of base
::i; // refers to global "i", with type "base"
{
base :: i; // import of global "i", like "base (::i);"?
// OR reference to global??
}
}
*/
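/*
A compilable variant of the fragment above, as a sketch only. The global
i is given type int here (rather than base) so the two objects can be
combined arithmetically; the values 1 and 2 are arbitrary.

```cpp
#include <cassert>

struct base {
    static int i;               // element i
    int member_function();
};

int base::i = 1;
int i = 2;                      // global i

int base::member_function() {
    int member = i;             // unqualified: the static member base::i
    int global = ::i;           // LE_CLCL LE_IDENTIFIER: the global i
    assert(member == 1 && global == 2);
    return member * 10 + global;
}
```

With these values, base().member_function() returns 12.
*/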
primary_expression:
global_opt_scope_opt_identifier
| global_opt_scope_opt_complex_name
| LE_THIS /* C++, not ANSI C */
| constant
| string_literal_list
| '(' comma_expression ')'
;
/* I had to disallow struct, union, or enum elaborations during
operator_function_name. The ANSI C++ Working paper is vague
about whether this should be part of the syntax, or a constraint.
The ambiguities that resulted were more than LALR could handle,
so the easiest fix was to be more specific. This means that I
had to in-line expand type_specifier_or_name far enough that I
would be able to exclude elaborations. This need is what drove
me to distinguish a whole series of tokens based on whether they
include elaborations:
struct A { ... }
or simply a reference to an aggregate or enumeration:
enum A
The latter, as well as non-aggregate types, are what make up
non_elaborating_type_specifier */
/* Note that the following does not include type_qualifier_list.
Hence, whenever non_elaborating_type_specifier is used, an
adjacent rule is supplied containing type_qualifier_list. It is
not generally possible to know immediately (i.e., reduce) a
type_qualifier_list, as a LE_TYPEDEFname that follows might not be
part of a type specifier, but might instead be "LE_TYPEDEFname ::*".
*/
non_elaborating_type_specifier:
sue_type_specifier
| basic_type_specifier
| typedef_type_specifier
| basic_type_name
| LE_TYPEDEFname
| global_or_scoped_typedefname
;
/* The following introduces MANY conflicts. Requiring and
allowing '(' ')' around the `type' when the type is complex would
help a lot. */
operator_function_name:
LE_OPERATOR any_operator
| LE_OPERATOR type_qualifier_list operator_function_ptr_opt
| LE_OPERATOR non_elaborating_type_specifier operator_function_ptr_opt
;
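/*
For orientation, a small C++ sketch of conversion function names that
use a type without an elaboration, optionally followed by a pointer
operator. Handle and its members are hypothetical names used only for
illustration.

```cpp
#include <cassert>

struct Handle {
    int value;
    operator int() const { return value; }          // operator followed by a basic type name
    operator const int*() const { return &value; }  // same, followed by a pointer operator
};

int conversion_demo() {
    Handle h = {7};
    int copy = h;               // uses operator int
    const int* ptr = h;         // uses the pointer conversion
    assert(copy == 7 && *ptr == 7);
    return copy + *ptr;
}
```
*/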
/* The following causes several ambiguities on * and &. These
conflicts would also be removed if parens around the `type' were
required in the derivations for operator_function_name */
/* Interesting aside: The use of right recursion in the
production for operator_function_ptr_opt gives both the correct
parsing, AND removes a conflict! Right recursion permits the
parser to defer reductions (a.k.a.: delay resolution), and
effectively make a second pass! */
operator_function_ptr_opt:
/* nothing */
| unary_modifier operator_function_ptr_opt
| asterisk_or_ampersand operator_function_ptr_opt
;
/* List of operators we can overload */
any_operator:
'+'
| '-'
| '*'
| '/'
| '%'
| '^'
| '&'
| '|'
| '~'
| '!'
| '<'
| '>'
| LE_LS
| LE_RS
| LE_ANDAND
| LE_OROR
| LE_ARROW
| LE_ARROWstar
| '.'
| LE_DOTstar
| LE_ICR
| LE_DECR
| LE_LE
| LE_GE
| LE_EQ
| LE_NE
| assignment_operator
| '(' ')'
| '[' ']'
| LE_NEW
| LE_DELETE
| ','
;
/* The following production for type_qualifier_list was specially
placed BEFORE the definition of postfix_expression to resolve a
reduce-reduce conflict set correctly. Note that a
type_qualifier_list is only used in a declaration, whereas a
postfix_expression is clearly an example of an expression. Hence
we are helping with the "if it can be a declaration, then it is"
rule. The reduce conflicts are on ')', ',' and '='. Do not move
the following productions */
type_qualifier_list_opt:
/* Nothing */
| type_qualifier_list
;
/* Note that the next set of productions in this grammar gives
post-increment a higher precedence than pre-increment. This is
not clearly stated in the C++ Reference manual, and is only
implied by the grammar in the ANSI C Standard. */
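/*
A small C++ illustration of that grouping, with hypothetical names. In
"++*p++" the postfix ++ binds to p first, so the expression groups as
++(*(p++)).

```cpp
#include <cassert>

int increment_demo() {
    int a[2] = {10, 20};
    int* p = a;
    int v = ++*p++;             // increments a[0]; p ends up pointing at a[1]
    assert(v == 11);
    assert(a[0] == 11 && a[1] == 20);
    assert(p == a + 1);
    return v;
}
```
*/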
/* I *DON'T* use argument_expression_list_opt to simplify the
grammar shown below. I am deliberately deferring any decision
until *after* the closing paren, and using
"argument_expression_list_opt" would commit prematurely. This is
critical to proper conflict resolution. */
/* The {} in the following rules allow the parser to tell the
lexer to search for the member name in the appropriate scope,
much the way the LE_CLCL operator works.*/
postfix_expression:
primary_expression
| postfix_expression '[' comma_expression ']'
| postfix_expression '(' ')'
| postfix_expression '(' argument_expression_list ')'
| postfix_expression {} '.' member_name
| postfix_expression {} LE_ARROW member_name
| postfix_expression LE_ICR
| postfix_expression LE_DECR
/* The next 4 rules are the source of cast ambiguity */
| LE_TYPEDEFname '(' ')'
| global_or_scoped_typedefname '(' ')'
| LE_TYPEDEFname '(' argument_expression_list ')'
| global_or_scoped_typedefname '(' argument_expression_list ')'
| basic_type_name '(' assignment_expression ')'
/* If the following rule is added to the grammar, there
will be 3 additional reduce-reduce conflicts. They will
all be resolved in favor of NOT using the following rule,
so no harm will be done. However, since the rule is
semantically illegal we will omit it until we are
enhancing the grammar for error recovery */
/* | basic_type_name '(' ')' /* Illegal: no such constructor*/
;
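/*
The cast-like productions above correspond to source such as the
following C++ sketch. T, Point, and cast_demo are hypothetical, and the
sketch assumes the lexer reports class names such as Point as
LE_TYPEDEFname.

```cpp
#include <cassert>

typedef long T;

struct Point {
    int x, y;
    Point() : x(0), y(0) {}
    Point(int a, int b) : x(a), y(b) {}
};

int cast_demo() {
    int  from_basic   = int(3.9);        // basic_type_name '(' assignment_expression ')'
    long from_typedef = T(2.5);          // LE_TYPEDEFname '(' argument_expression_list ')'
    int  from_empty   = Point().x;       // LE_TYPEDEFname '(' ')'
    int  from_pair    = Point(4, 5).y;   // LE_TYPEDEFname '(' argument_expression_list ')'
    assert(from_basic == 3 && from_typedef == 2);
    assert(from_empty == 0 && from_pair == 5);
    return from_basic + static_cast<int>(from_typedef) + from_empty + from_pair;
}
```
*/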
/* The last two productions in the next set are questionable, but
do not induce any conflicts. I need to ask X3J16 : Having them
means that we have complex member function deletes like:
const unsigned int :: ~ const unsigned int
*/
member_name:
scope_opt_identifier
| scope_opt_complex_name
| basic_type_name LE_CLCL '~' basic_type_name /* C++, not ANSI C */
| declaration_qualifier_list LE_CLCL '~' declaration_qualifier_list
| type_qualifier_list LE_CLCL '~' type_qualifier_list
;
argument_expression_list:
assignment_expression
| argument_expression_list ',' assignment_expression
;
unary_expression:
postfix_expression
| LE_ICR unary_expression
| LE_DECR unary_expression
| asterisk_or_ampersand cast_expression
| '-' cast_expression
| '+' cast_expression
| '~' cast_expression
| '!' cast_expression
| LE_SIZEOF unary_expression
| LE_SIZEOF '(' type_name ')'
| allocation_expression
;
/* Note that I could have moved the newstore productions to a
lower precedence level than multiplication (binary '*'), and
lower than bitwise AND (binary '&'). These moves are the nice
way to disambiguate a trailing unary '*' or '&' at the end of a
freestore expression. Since the freestore expression (with such
a grammar and hence precedence given) can never be the left
operand of a binary '*' or '&', the ambiguity would be removed.
These problems really surface when the binary operators '*' or
'&' are overloaded, but this must be syntactically disambiguated
before the semantic checking is performed... Unfortunately, I am
not creating the language, only writing a grammar that reflects
its specification, and hence I cannot change its precedence
assignments. If I had my druthers, I would probably prefer
surrounding the type with parens all the time, and avoiding the
dangling * and & problem altogether.*/
/* Following are C++, not ANSI C */
allocation_expression:
global_opt_scope_opt_operator_new '(' type_name ')'
operator_new_initializer_opt
| global_opt_scope_opt_operator_new '(' argument_expression_list ')' '(' type_name ')'
operator_new_initializer_opt
/* next two rules are the source of * and & ambiguities */
| global_opt_scope_opt_operator_new operator_new_type
| global_opt_scope_opt_operator_new '(' argument_expression_list ')' operator_new_type
;
/* Following are C++, not ANSI C */
global_opt_scope_opt_operator_new:
LE_NEW
| global_or_scope LE_NEW
;
operator_new_type:
type_qualifier_list operator_new_declarator_opt
operator_new_initializer_opt
| non_elaborating_type_specifier operator_new_declarator_opt
operator_new_initializer_opt
;
/* Right recursion is critical in the following productions to
avoid a conflict on LE_TYPEDEFname */
operator_new_declarator_opt:
/* Nothing */
| operator_new_array_declarator
| asterisk_or_ampersand operator_new_declarator_opt
| unary_modifier operator_new_declarator_opt
;
operator_new_array_declarator:
'[' ']'
| '[' comma_expression ']'
| operator_new_array_declarator '[' comma_expression ']'
;
operator_new_initializer_opt:
/* Nothing */
| '(' ')'
| '(' argument_expression_list ')'
;
cast_expression:
unary_expression
| '(' type_name ')' cast_expression
;
/* Following are C++, not ANSI C */
deallocation_expression:
cast_expression
| global_opt_scope_opt_delete deallocation_expression
| global_opt_scope_opt_delete '[' comma_expression ']' deallocation_expression /* archaic C++, what a concept */
| global_opt_scope_opt_delete '[' ']' deallocation_expression
;
/* Following are C++, not ANSI C */
global_opt_scope_opt_delete:
LE_DELETE
| global_or_scope LE_DELETE
;
/* Following are C++, not ANSI C */
point_member_expression:
deallocation_expression
| point_member_expression LE_DOTstar deallocation_expression
| point_member_expression LE_ARROWstar deallocation_expression
;
multiplicative_expression:
point_member_expression
| multiplicative_expression '*' point_member_expression
| multiplicative_expression '/' point_member_expression
| multiplicative_expression '%' point_member_expression
;
additive_expression:
multiplicative_expression
| additive_expression '+' multiplicative_expression
| additive_expression '-' multiplicative_expression
;
shift_expression:
additive_expression
| shift_expression LE_LS additive_expression
| shift_expression LE_RS additive_expression
;
relational_expression:
shift_expression
| relational_expression '<' shift_expression
| relational_expression '>' shift_expression
| relational_expression LE_LE shift_expression
| relational_expression LE_GE shift_expression
;
equality_expression:
relational_expression
| equality_expression LE_EQ relational_expression
| equality_expression LE_NE relational_expression
;
AND_expression:
equality_expression
| AND_expression '&' equality_expression
;
exclusive_OR_expression:
AND_expression
| exclusive_OR_expression '^' AND_expression
;
inclusive_OR_expression:
exclusive_OR_expression
| inclusive_OR_expression '|' exclusive_OR_expression
;
logical_AND_expression:
inclusive_OR_expression
| logical_AND_expression LE_ANDAND inclusive_OR_expression
;
logical_OR_expression:
logical_AND_expression
| logical_OR_expression LE_OROR logical_AND_expression
;
conditional_expression:
logical_OR_expression
| logical_OR_expression '?' comma_expression ':'
conditional_expression
;
assignment_expression:
conditional_expression
| unary_expression assignment_operator assignment_expression
;
assignment_operator:
'='
| LE_MULTassign
| LE_DIVassign
| LE_MODassign
| LE_PLUSassign
| LE_MINUSassign
| LE_LSassign
| LE_RSassign
| LE_ANDassign
| LE_ERassign
| LE_ORassign
;
comma_expression:
assignment_expression
| comma_expression ',' assignment_expression
;
constant_expression:
conditional_expression
;
/* The following was used for clarity */
comma_expression_opt:
/* Nothing */
| comma_expression
;
/******************************* DECLARATIONS *********************************/
/* The following are notably different from the ANSI C Standard
specified grammar, but are present in my ANSI C compatible
grammar. The changes were made to disambiguate typedef's presence
in declaration_specifiers (vs. in the declarator for
redefinition); to allow struct/union/enum/class tag declarations
without declarators, and to better reflect the parsing of
declarations (declarators must be combined with
declaration_specifiers ASAP, so that they can immediately become
visible in the current scope). */
declaration:
declaring_list ';' {printf("declaring_list\n");}
| default_declaring_list ';' {printf("default_declaring_list\n");}
| sue_declaration_specifier ';' {printf("sue_declaration_specifier\n");}
| sue_type_specifier ';' {printf("sue_type_specifier\n");}
| sue_type_specifier_elaboration ';' {printf("sue_type_specifier_elaboration\n");}
;
/* Note that if a typedef were redeclared, then a declaration
specifier must be supplied (re: ANSI C spec). The following are
declarations wherein no declaration_specifier is supplied, and
hence the 'default' must be used. An example of this is
const a;
which by default, is the same as:
const int a;
`a' must NOT be a typedef in the above example. */
/* The presence of `{}' in the following rules indicates points
at which the symbol table MUST be updated so that the tokenizer
can IMMEDIATELY continue to maintain the proper distinction
between a LE_TYPEDEFname and an LE_IDENTIFIER. */
default_declaring_list: /* Can't redeclare typedef names */
declaration_qualifier_list identifier_declarator {} initializer_opt
| type_qualifier_list identifier_declarator {} initializer_opt
| default_declaring_list ',' identifier_declarator {} initializer_opt
| declaration_qualifier_list constructed_identifier_declarator
| type_qualifier_list constructed_identifier_declarator
| default_declaring_list ',' constructed_identifier_declarator
;
/* Note how type_qualifier_list is NOT used in the following
productions. Qualifiers are NOT sufficient to redefine
typedef-names (as prescribed by the ANSI C standard).*/
declaring_list:
declaration_specifier declarator {} initializer_opt {printf("1\n");}
| type_specifier declarator {} initializer_opt {printf("2\n");}
| basic_type_name declarator {} initializer_opt {printf("3\n");}
| LE_TYPEDEFname declarator {} initializer_opt {printf("4\n");}
| global_or_scoped_typedefname declarator {} initializer_opt {printf("5\n");}
| declaring_list ',' declarator {} initializer_opt {printf("6\n");}
| declaration_specifier constructed_declarator {printf("7\n");}
| type_specifier constructed_declarator {printf("8\n");}
| basic_type_name constructed_declarator {printf("9\n");}
| LE_TYPEDEFname constructed_declarator {printf("10\n");}
| global_or_scoped_typedefname constructed_declarator {printf("11\n");}
| declaring_list ',' constructed_declarator {printf("12\n");}
;
/* Declarators with parenthesized initializers present a big
problem. Typically a declarator that looks like: "*a(...)" is
supposed to bind FIRST to the "(...)", and then to the "*". This
binding presumes that the "(...)" stuff is a prototype. With
constructed declarators, we must (officially) finish the binding
to the "*" (finishing forming a good declarator) and THEN connect
with the argument list. Unfortunately, by the time we realize it
is an argument list (and not a prototype) we have pushed the
separate declarator tokens "*" and "a" onto the yacc stack
WITHOUT combining them. The solution is to use odd productions to
carry the incomplete declarator along with the "argument
expression list" back up the yacc stack. We would then actually
instantiate the symbol table after we have fully decorated the
symbol with all the leading "*" stuff. Actually, since we don't
have all the type information in one spot till we reduce to a
declaring_list, this delay is not a problem. Note that ordinary
initializers REQUIRE (ANSI C Standard) that the symbol be placed
into the symbol table BEFORE its initializer is read, but in the
case of parenthesized initializers, this is not possible (we
don't even know we have an initializer till we have passed the
opening "("). */
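/* Illustration only (not part of the original grammar): a small C++
sketch of the two readings of "*a(...)" discussed above, assuming a
C++11 compiler. All names below are hypothetical.

```cpp
#include <cassert>
#include <type_traits>

int *returns_pointer(int);  // "(int)" is a prototype: a function returning int*
int *null_pointer(0);       // "(0)" is an initializer: a pointer variable

static_assert(std::is_function<decltype(returns_pointer)>::value,
              "prototype form declares a function");
static_assert(std::is_pointer<decltype(null_pointer)>::value,
              "parenthesized initializer declares a pointer object");

bool pointer_starts_null() { return null_pointer == 0; }
```

Both declarations start with the same tokens, so the parser cannot tell
them apart until it has looked inside the parentheses.
*/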
constructed_declarator:
nonunary_constructed_identifier_declarator
| constructed_paren_typedef_declarator
| simple_paren_typedef_declarator '(' argument_expression_list ')'
| simple_paren_typedef_declarator postfixing_abstract_declarator
'(' argument_expression_list ')' /* constraint error */
| constructed_parameter_typedef_declarator
| asterisk_or_ampersand constructed_declarator
| unary_modifier constructed_declarator
;
constructed_paren_typedef_declarator:
'(' paren_typedef_declarator ')'
'(' argument_expression_list ')'
| '(' paren_typedef_declarator ')' postfixing_abstract_declarator
'(' argument_expression_list ')'
| '(' simple_paren_typedef_declarator postfixing_abstract_declarator ')'
'(' argument_expression_list ')'
| '(' LE_TYPEDEFname postfixing_abstract_declarator ')'
'(' argument_expression_list ')'
;
constructed_parameter_typedef_declarator:
LE_TYPEDEFname '(' argument_expression_list ')'
| LE_TYPEDEFname postfixing_abstract_declarator
'(' argument_expression_list ')' /* constraint error */
| '(' clean_typedef_declarator ')'
'(' argument_expression_list ')'
| '(' clean_typedef_declarator ')' postfixing_abstract_declarator
'(' argument_expression_list ')'
;
constructed_identifier_declarator:
nonunary_constructed_identifier_declarator
| asterisk_or_ampersand constructed_identifier_declarator
| unary_modifier constructed_identifier_declarator
;
/* The following are restricted to NOT begin with any pointer
operators. This includes both "*" and "T::*" modifiers. Aside
from this restriction, the following would have been:
identifier_declarator '(' argument_expression_list ')' */
nonunary_constructed_identifier_declarator:
paren_identifier_declarator '(' argument_expression_list ')'
| paren_identifier_declarator postfixing_abstract_declarator
'(' argument_expression_list ')' /* constraint error*/
| '(' unary_identifier_declarator ')'
'(' argument_expression_list ')'
| '(' unary_identifier_declarator ')' postfixing_abstract_declarator
'(' argument_expression_list ')'
;
declaration_specifier:
basic_declaration_specifier /* Arithmetic or void */
| sue_declaration_specifier /* struct/union/enum/class */
| typedef_declaration_specifier /* typedef*/
;
type_specifier:
basic_type_specifier /* Arithmetic or void */
| sue_type_specifier /* Struct/Union/Enum/Class */
| sue_type_specifier_elaboration /* elaborated Struct/Union/Enum/Class */
| typedef_type_specifier /* Typedef */
;
declaration_qualifier_list: /* storage class and optional const/volatile */
storage_class
| type_qualifier_list storage_class
| declaration_qualifier_list declaration_qualifier
;
type_qualifier_list:
type_qualifier
| type_qualifier_list type_qualifier
;
declaration_qualifier:
storage_class
| type_qualifier /* const or volatile */
;
type_qualifier:
LE_CONST
| LE_VOLATILE
;
basic_declaration_specifier: /*Storage Class+Arithmetic or void*/
declaration_qualifier_list basic_type_name
| basic_type_specifier storage_class
| basic_type_name storage_class
| basic_declaration_specifier declaration_qualifier
| basic_declaration_specifier basic_type_name
;
basic_type_specifier:
type_qualifier_list basic_type_name /* Arithmetic or void */
| basic_type_name basic_type_name
| basic_type_name type_qualifier
| basic_type_specifier type_qualifier
| basic_type_specifier basic_type_name
;
sue_declaration_specifier: /* Storage Class + struct/union/enum/class */
declaration_qualifier_list elaborated_type_name
| declaration_qualifier_list elaborated_type_name_elaboration
| sue_type_specifier storage_class
| sue_type_specifier_elaboration storage_class
| sue_declaration_specifier declaration_qualifier
;
sue_type_specifier_elaboration:
elaborated_type_name_elaboration /* elaborated struct/union/enum/class */
| type_qualifier_list elaborated_type_name_elaboration
| sue_type_specifier_elaboration type_qualifier
;
sue_type_specifier:
elaborated_type_name /* struct/union/enum/class */
| type_qualifier_list elaborated_type_name
| sue_type_specifier type_qualifier
;
typedef_declaration_specifier: /*Storage Class + typedef types */
declaration_qualifier_list LE_TYPEDEFname
| declaration_qualifier_list global_or_scoped_typedefname
| typedef_type_specifier storage_class
| LE_TYPEDEFname storage_class
| global_or_scoped_typedefname storage_class
| typedef_declaration_specifier declaration_qualifier
;
typedef_type_specifier: /* typedef types */
type_qualifier_list LE_TYPEDEFname
| type_qualifier_list global_or_scoped_typedefname
| LE_TYPEDEFname type_qualifier
| global_or_scoped_typedefname type_qualifier
| typedef_type_specifier type_qualifier
;
/* There are really several distinct sets of storage_classes. The
sets vary depending on whether the declaration is at file scope, is a
declaration within a struct/class, is within a function body, or in a
function declaration/definition (prototype parameter declarations).
They are grouped here to simplify the grammar, and can be
semantically checked. Note that this approach tends to ease the
syntactic restrictions in the grammar slightly, but allows for future
language development, and tends to provide superior diagnostics and
error recovery (i.e., a syntax error does not disrupt the parse).
                File  File   Member Member Local Local  Formal
                Var   Funct  Var    Funct  Var   Funct  Params
LE_TYPEDEF      x     x      x      x      x     x
LE_EXTERN       x     x                    x     x
LE_STATIC       x     x      x      x      x
LE_AUTO                                    x            x
LE_REGISTER                                x            x
LE_FRIEND                           x
LE_OVERLOAD           x             x            x
LE_INLINE             x             x            x
LE_VIRTUAL                          x            x
*/
storage_class:
LE_EXTERN
| LE_TYPEDEF
| LE_STATIC
| LE_AUTO
| LE_REGISTER
| LE_FRIEND /* C++, not ANSI C */
| LE_OVERLOAD /* C++, not ANSI C */
| LE_INLINE /* C++, not ANSI C */
| LE_VIRTUAL /* C++, not ANSI C */
;
basic_type_name:
LE_INT
| LE_CHAR
| LE_SHORT
| LE_LONG
| LE_FLOAT
| LE_DOUBLE
| LE_SIGNED
| LE_UNSIGNED
| LE_VOID
;
elaborated_type_name_elaboration:
aggregate_name_elaboration
| enum_name_elaboration
;
elaborated_type_name:
aggregate_name
| enum_name
;
/* Since the expression "new type_name" MIGHT use an elaborated
type and a derivation, it MIGHT have a ':'. This fact conflicts
with the requirement that a new expression can be placed between
a '?' and a ':' in a conditional expression (at least it confuses
LR(1) parsers). Hence the aggregate_name_elaboration is
responsible for a series of SR conflicts on ':'.*/
/* The intermediate actions {} represent points at which the
database of typedef names must be updated in C++. This is
critical to the lexer, which must begin to tokenize based on this
new information. */
aggregate_name_elaboration:
aggregate_name derivation_opt '{' member_declaration_list_opt '}'
| aggregate_key derivation_opt '{' member_declaration_list_opt '}'
;
/* We distinguish between the above, which support elaboration,
and this set of productions so that we can provide special
declaration specifiers for operator_new_type, and for conversion
functions. Note that without this restriction a large variety of
conflicts appear when processing operator_new and conversions
operators (which can be followed by a ':' in a ternary ?:
expression) */
/* Note that at the end of each of the following rules we should
be sure that the tag name is in, or placed in the indicated
scope. If no scope is specified, then we must add it to our
current scope IFF it cannot be found in an external lexical
scope. */
aggregate_name:
aggregate_key tag_name
| global_scope scope aggregate_key tag_name
| global_scope aggregate_key tag_name
| scope aggregate_key tag_name
;
derivation_opt:
/* nothing */
| ':' derivation_list
;
derivation_list:
parent_class
| derivation_list ',' parent_class
;
parent_class:
global_opt_scope_opt_typedefname
| LE_VIRTUAL access_specifier_opt global_opt_scope_opt_typedefname
| access_specifier virtual_opt global_opt_scope_opt_typedefname
;
virtual_opt:
/* nothing */
| LE_VIRTUAL
;
access_specifier_opt:
/* nothing */
| access_specifier
;
access_specifier:
LE_PUBLIC
| LE_PRIVATE
| LE_PROTECTED
;
aggregate_key:
LE_STRUCT
| LE_UNION
| LE_CLASS /* C++, not ANSI C */
;
/* Note that an empty list is ONLY allowed under C++. The grammar
can be modified so that this stands out. The trick is to define
member_declaration_list, and have that referenced for non-trivial
lists. */
member_declaration_list_opt:
/* nothing */
| member_declaration_list_opt member_declaration
;
member_declaration:
member_declaring_list ';'
| member_default_declaring_list ';'
| access_specifier ':' /* C++, not ANSI C */
| new_function_definition /* C++, not ANSI C */
| constructor_function_in_class /* C++, not ANSI C */
| sue_type_specifier ';' /* C++, not ANSI C */
| sue_type_specifier_elaboration ';' /* C++, not ANSI C */
| identifier_declarator ';' /* C++, not ANSI C
access modification
conversion functions,
unscoped destructors */
| typedef_declaration_specifier ';' /* friend T */ /* C++, not ANSI C */
| sue_declaration_specifier ';' /* friend class C*/ /* C++, not ANSI C */
;
member_default_declaring_list: /* doesn't redeclare typedef*/
type_qualifier_list
identifier_declarator member_pure_opt
| declaration_qualifier_list
identifier_declarator member_pure_opt /* C++, not ANSI C */
| member_default_declaring_list ','
identifier_declarator member_pure_opt
| type_qualifier_list bit_field_identifier_declarator
| declaration_qualifier_list bit_field_identifier_declarator /* C++, not ANSI C */
| member_default_declaring_list ',' bit_field_identifier_declarator
;
/* There is a conflict when "struct A" is used as a declaration
specifier, and there is a chance that a bit field name will be
provided. To fix this syntactically would require distinguishing
non_elaborating_declaration_specifiers the way I handled
non_elaborating_type_specifiers. I think this should be a
constraint error anyway :-). */
member_declaring_list: /* Can possibly redeclare typedefs */
type_specifier declarator member_pure_opt
| basic_type_name declarator member_pure_opt
| global_or_scoped_typedefname declarator member_pure_opt
| member_conflict_declaring_item
| member_declaring_list ',' declarator member_pure_opt
| type_specifier bit_field_declarator
| basic_type_name bit_field_declarator
| LE_TYPEDEFname bit_field_declarator
| global_or_scoped_typedefname bit_field_declarator
| declaration_specifier bit_field_declarator /* constraint violation: storage class used */
| member_declaring_list ',' bit_field_declarator
;
/* The following conflict with constructors-
member_conflict_declaring_item:
LE_TYPEDEFname declarator member_pure_opt
| declaration_specifier declarator member_pure_opt /* C++, not ANSI C * /
;
so we inline expand declarator to get the following productions...
*/
member_conflict_declaring_item:
LE_TYPEDEFname identifier_declarator member_pure_opt
| LE_TYPEDEFname parameter_typedef_declarator member_pure_opt
| LE_TYPEDEFname simple_paren_typedef_declarator member_pure_opt
| declaration_specifier identifier_declarator member_pure_opt
| declaration_specifier parameter_typedef_declarator member_pure_opt
| declaration_specifier simple_paren_typedef_declarator member_pure_opt
| member_conflict_paren_declaring_item
;
/* The following still conflicts with constructors-
member_conflict_paren_declaring_item:
LE_TYPEDEFname paren_typedef_declarator member_pure_opt
| declaration_specifier paren_typedef_declarator member_pure_opt
;
so paren_typedef_declarator is expanded inline to get...*/
member_conflict_paren_declaring_item:
LE_TYPEDEFname asterisk_or_ampersand
'(' simple_paren_typedef_declarator ')' member_pure_opt
| LE_TYPEDEFname unary_modifier
'(' simple_paren_typedef_declarator ')' member_pure_opt
| LE_TYPEDEFname asterisk_or_ampersand
'(' LE_TYPEDEFname ')' member_pure_opt
| LE_TYPEDEFname unary_modifier
'(' LE_TYPEDEFname ')' member_pure_opt
| LE_TYPEDEFname asterisk_or_ampersand
paren_typedef_declarator member_pure_opt
| LE_TYPEDEFname unary_modifier
paren_typedef_declarator member_pure_opt
| declaration_specifier asterisk_or_ampersand
'(' simple_paren_typedef_declarator ')' member_pure_opt
| declaration_specifier unary_modifier
'(' simple_paren_typedef_declarator ')' member_pure_opt
| declaration_specifier asterisk_or_ampersand
'(' LE_TYPEDEFname ')' member_pure_opt
| declaration_specifier unary_modifier
'(' LE_TYPEDEFname ')' member_pure_opt
| declaration_specifier asterisk_or_ampersand
paren_typedef_declarator member_pure_opt
| declaration_specifier unary_modifier
paren_typedef_declarator member_pure_opt
| member_conflict_paren_postfix_declaring_item
;
/* but we still have the following conflicts with constructors-
member_conflict_paren_postfix_declaring_item:
LE_TYPEDEFname postfix_paren_typedef_declarator member_pure_opt
| declaration_specifier postfix_paren_typedef_declarator member_pure_opt
;
so we expand postfix_paren_typedef_declarator inline and get...*/
member_conflict_paren_postfix_declaring_item:
LE_TYPEDEFname '(' paren_typedef_declarator ')'
member_pure_opt
| LE_TYPEDEFname '(' simple_paren_typedef_declarator
postfixing_abstract_declarator ')' member_pure_opt
| LE_TYPEDEFname '(' LE_TYPEDEFname
postfixing_abstract_declarator ')' member_pure_opt
| LE_TYPEDEFname '(' paren_typedef_declarator ')'
postfixing_abstract_declarator member_pure_opt
| declaration_specifier '(' paren_typedef_declarator ')'
member_pure_opt
| declaration_specifier '(' simple_paren_typedef_declarator
postfixing_abstract_declarator ')' member_pure_opt
| declaration_specifier '(' LE_TYPEDEFname
postfixing_abstract_declarator ')' member_pure_opt
| declaration_specifier '(' paren_typedef_declarator ')'
postfixing_abstract_declarator member_pure_opt
;
/* ...and we are done. Now all the conflicts appear on ';',
which can be semantically evaluated/disambiguated */
member_pure_opt:
/* nothing */
| '=' LE_OCTALconstant /* C++, not ANSI C */ /* Pure function*/
;
/* Note that bit field names, where redefining TYPEDEFnames,
cannot be parenthesized in C++ (due to ambiguities), and hence
this part of the grammar is simpler than ANSI C. :-) The problem
occurs because:
LE_TYPEDEFname ( LE_TYPEDEFname) : .....
doesn't look like a bit field, rather it looks like a constructor
definition! */
bit_field_declarator:
bit_field_identifier_declarator
| LE_TYPEDEFname {} ':' constant_expression
;
/* The actions taken in the "{}" above and below are intended to
allow the symbol table to be updated when the declarator is
complete. It is critical for code like:
foo : sizeof(foo + 1);
*/
bit_field_identifier_declarator:
':' constant_expression
| identifier_declarator {} ':' constant_expression
;
enum_name_elaboration:
global_opt_scope_opt_enum_key '{' enumerator_list '}'
| enum_name '{' enumerator_list '}'
;
/* As with structures, the distinction between "elaborating" and
"non-elaborating" enum types is maintained. In actuality, it
probably does not cause much in the way of conflicts, since a ':'
is not allowed. For symmetry, we maintain the distinction. The
{} actions are intended to allow the symbol table to be updated.
These updates are significant to code such as:
enum A { first=sizeof(A)};
*/
enum_name:
global_opt_scope_opt_enum_key tag_name
;
global_opt_scope_opt_enum_key:
LE_ENUM
| global_or_scope LE_ENUM
;
enumerator_list:
enumerator_list_no_trailing_comma
| enumerator_list_no_trailing_comma ',' /* C++, not ANSI C */
;
/* Note that we do not need to rush to add an enumerator to the
symbol table until *AFTER* the enumerator_value_opt is parsed.
The enumerated value is only in scope AFTER its definition is
complete. Hence the following is legal: "enum {a, b=a+10};" but
the following (assuming no external matching of names) is not
legal: "enum {c, d=sizeof(d)};" ("d" not defined when sizeof was
applied.) This is notably contrasted with declarators, which
enter scope as soon as the declarator is complete. */
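/* Illustration only (not part of the original grammar): a short C++
sketch of the scope contrast described above. The names Offsets,
self_sized and the two accessor functions are hypothetical.

```cpp
#include <cassert>

// An earlier enumerator is in scope for a later one, so this is legal.
enum Offsets { first, second = first + 10 };

// A declarator enters scope as soon as it is complete, so it may be
// named inside its own initializer (sizeof does not evaluate it).
int self_sized = sizeof(self_sized);

int second_value() { return second; }
int self_sized_value() { return self_sized; }
```

The illegal case from the comment, "enum {c, d=sizeof(d)};", is left out
of the sketch because it would not compile.
*/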
enumerator_list_no_trailing_comma:
enumerator_name enumerator_value_opt
| enumerator_list_no_trailing_comma ',' enumerator_name enumerator_value_opt
;
enumerator_name:
LE_IDENTIFIER
| LE_TYPEDEFname
;
enumerator_value_opt:
/* Nothing */
| '=' constant_expression
;
/* We special case the lone type_name which has no storage class
(even though it should be an example of a parameter_type_list).
This helped to disambiguate type-names in parenthetical casts.*/
parameter_type_list:
'(' ')' type_qualifier_list_opt
| '(' type_name ')' type_qualifier_list_opt
| '(' type_name initializer ')' type_qualifier_list_opt /* C++, not ANSI C */
| '(' named_parameter_type_list ')' type_qualifier_list_opt
;
/* The following are used in old style function definitions, when
a complex return type includes the "function returning" modifier.
Note the subtle distinction from parameter_type_list. These
parameters are NOT the parameters for the function being defined,
but are simply part of the type definition. An example would be:
int(*f( a ))(float) long a; {...}
which is equivalent to the full new style definition:
int(*f(long a))(float) {...}
The type list `(float)' is an example of an
old_parameter_type_list. The bizarre point here is that an old
function definition declarator can be followed by a type list,
which can start with a qualifier `const'. This conflicts with
the new syntactic construct for const member functions!?! As a
result, an old style function definition cannot be used in all
cases for a member function. */
old_parameter_type_list:
'(' ')'
| '(' type_name ')'
| '(' type_name initializer ')' /* C++, not ANSI C */
| '(' named_parameter_type_list ')'
;
named_parameter_type_list: /* WARNING: excludes lone type_name*/
parameter_list
| parameter_list comma_opt_ellipsis
| type_name comma_opt_ellipsis
| type_name initializer comma_opt_ellipsis /* C++, not ANSI C */
| LE_ELLIPSIS /* C++, not ANSI C */
;
comma_opt_ellipsis:
LE_ELLIPSIS /* C++, not ANSI C */
| ',' LE_ELLIPSIS
;
parameter_list:
non_casting_parameter_declaration
| non_casting_parameter_declaration initializer /* C++, not ANSI C */
| type_name ',' parameter_declaration
| type_name initializer ',' parameter_declaration /* C++, not ANSI C */
| parameter_list ',' parameter_declaration
;
/* There is some very subtle disambiguation going on here. Do
not be tempted to make further use of the following production in
parameter_list, or else the conflict count will grow noticeably.
Specifically, the next set of rules has already been inline
expanded for the first parameter in a parameter_list to support a
deferred disambiguation. The subtle disambiguation has to do with
contexts where parameter type lists look like old-style-casts. */
parameter_declaration:
type_name
| type_name initializer /* C++, not ANSI C */
| non_casting_parameter_declaration
| non_casting_parameter_declaration initializer /* C++, not ANSI C */
;
/* There is an LR ambiguity between old-style parenthesized casts
and parameter-type-lists. This tends to happen in contexts where
either an expression or a parameter-type-list is possible. For
example, assume that T is an externally declared type in the
code:
int (T ((int
it might continue:
int (T ((int)0));
which would make it:
(int) (T) (int)0 ;
which is an expression, consisting of a series of casts.
Alternatively, it could be:
int (T ((int a)));
which would make it the redeclaration of T, equivalent to:
int T (dummy_name (int a));
if we see a type that either has a named variable (in the above
case "a"), or a storage class like:
int (T ((int register
then we know it can't be a cast, and it is "forced" to be a
parameter_list.
It is not yet clear that the ANSI C++ committee would decide to
place this disambiguation into the syntax, rather than leaving it
as a constraint check (i.e., a valid parser would have to parse
everything as though it were a parameter list (in these odd
contexts), and then give an error if there is a following context
(like "0" above) that invalidates this syntactic evaluation). */
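/* Illustration only (not part of the original grammar): a simplified
C++ sketch of the cast versus parameter-list decision, assuming a C++11
compiler. It is deliberately simpler than the "int (T ((int" example
above, and every name in it is hypothetical.

```cpp
#include <cassert>
#include <type_traits>

typedef int T;

int takes_named_param(T(a));  // "a" is a named parameter: a function declaration
int made_from_cast((T)0);     // "(T)0" can only be a cast: a variable set to 0

static_assert(std::is_function<decltype(takes_named_param)>::value,
              "a named parameter forces the parameter-list reading");
static_assert(std::is_same<decltype(made_from_cast), int>::value,
              "a cast expression forces the initializer reading");
```

In both lines the parser sees "int name ( ... T" before it learns which
reading applies.
*/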
/* One big thing implemented here is that a LE_TYPEDEFname CANNOT be
redeclared when we don't have declaration_specifiers! Notice that
when we do use a LE_TYPEDEFname based declarator, only the "special"
(non-ambiguous in this context) typedef_declarator is used.
Everything else that is "missing" shows up as a type_name. */
non_casting_parameter_declaration: /*have names or storage classes */
declaration_specifier
| declaration_specifier abstract_declarator
| declaration_specifier identifier_declarator
| declaration_specifier parameter_typedef_declarator
| declaration_qualifier_list
| declaration_qualifier_list abstract_declarator
| declaration_qualifier_list identifier_declarator
| type_specifier identifier_declarator
| type_specifier parameter_typedef_declarator
| basic_type_name identifier_declarator
| basic_type_name parameter_typedef_declarator
| LE_TYPEDEFname identifier_declarator
| LE_TYPEDEFname parameter_typedef_declarator
| global_or_scoped_typedefname identifier_declarator
| global_or_scoped_typedefname parameter_typedef_declarator
| type_qualifier_list identifier_declarator
;
type_name:
type_specifier
| basic_type_name
| LE_TYPEDEFname
| global_or_scoped_typedefname
| type_qualifier_list
| type_specifier abstract_declarator
| basic_type_name abstract_declarator
| LE_TYPEDEFname abstract_declarator
| global_or_scoped_typedefname abstract_declarator
| type_qualifier_list abstract_declarator
;
initializer_opt:
/* nothing */
| initializer
;
initializer:
'=' initializer_group
;
initializer_group:
'{' initializer_list '}'
| '{' initializer_list ',' '}'
| assignment_expression
;
initializer_list:
initializer_group
| initializer_list ',' initializer_group
;
/*************************** STATEMENTS *******************************/
statement:
labeled_statement
| compound_statement
| expression_statement
| selection_statement
| iteration_statement
| jump_statement
| declaration /* C++, not ANSI C */
;
labeled_statement:
label ':' statement
| LE_CASE constant_expression ':' statement
| LE_DEFAULT ':' statement
;
/* I sneak declarations into statement_list to support C++. The
grammar is a little clumsy this way, but the violation of C
syntax is heavily localized */
compound_statement:
'{' statement_list_opt '}'
;
declaration_list:
declaration
| declaration_list declaration
;
statement_list_opt:
/* nothing */
| statement_list_opt statement
;
expression_statement:
comma_expression_opt ';'
;
selection_statement:
LE_IF '(' comma_expression ')' statement
| LE_IF '(' comma_expression ')' statement LE_ELSE statement
| LE_SWITCH '(' comma_expression ')' statement
;
iteration_statement:
LE_WHILE '(' comma_expression_opt ')' statement
| LE_DO statement LE_WHILE '(' comma_expression ')' ';'
| LE_FOR '(' comma_expression_opt ';' comma_expression_opt ';'
comma_expression_opt ')' statement
| LE_FOR '(' declaration comma_expression_opt ';'
comma_expression_opt ')' statement /* C++, not ANSI C */
;
jump_statement:
LE_GOTO label ';'
| LE_CONTINUE ';'
| LE_BREAK ';'
| LE_RETURN comma_expression_opt ';'
;
/* The following actions should update the symbol table in the
"label" name space */
label:
LE_IDENTIFIER
| LE_TYPEDEFname
;
/***************************** EXTERNAL DEFINITIONS *****************************/
translation_unit:
/* nothing */
| translation_unit external_definition
;
external_definition:
function_declaration {printf("Found function (decl)\n");}
| function_definition {printf("Found function (impl)\n");}
| declaration {printf("Found declaration %s\n", yytext);}
| linkage_specifier function_declaration /* C++, not ANSI C*/
| linkage_specifier function_definition /* C++, not ANSI C*/
| linkage_specifier declaration /* C++, not ANSI C*/
| linkage_specifier '{' translation_unit '}' /* C++, not ANSI C*/
;
linkage_specifier:
LE_EXTERN LE_STRINGliteral
;
/* Note that declaration_specifiers are left out of the following
function declarations. Such omission is illegal in ANSI C. It is
sometimes necessary in C++, in instances where no return type
should be specified (e.g., a conversion operator).*/
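/* Illustration only (not part of the original grammar): a minimal C++
sketch of external definitions that carry no declaration specifiers.
The class Meters and the helper meters_as_double are hypothetical.

```cpp
#include <cassert>

struct Meters {
    double value;
    explicit Meters(double v);
    operator double() const;
    ~Meters();
};

// None of these three definitions begins with a return type.
Meters::Meters(double v) : value(v) {}
Meters::operator double() const { return value; }
Meters::~Meters() {}

double meters_as_double(double v) {
    Meters m(v);
    return m;  // converts through Meters::operator double()
}
```

Each out-of-class definition starts directly with a declarator, which is
the shape the productions below accept.
*/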
function_declaration:
identifier_declarator ';' /* semantically verify it is a
function, and (if ANSI says it's
the law for C++ also...) that it
is something that can't have a
return type (like a conversion
function, or a destructor) */
| constructor_function_declaration ';'
;
function_definition:
new_function_definition
| old_function_definition
| constructor_function_definition
;
/* Note that in ANSI C, function definitions *ONLY* are presented
at file scope. Hence, if there is a typedefname active, it is
illegal to redeclare it (there is no enclosing scope at file
scope).
In contrast, C++ allows function definitions at struct
elaboration scope, and allows tags that are defined at file scope
(and hence look like typedefnames) to be redeclared to function
calls. Hence several of the rules are "partially C++ only". I
could actually build separate rules for typedef_declarators and
identifier_declarators, and mention that the typedef_declarator
rules represent the C++ only features.
In some sense, this is haggling, as I could/should have left
these as constraints in the ANSI C grammar, rather than as syntax
requirements. */
new_function_definition:
identifier_declarator compound_statement
| declaration_specifier declarator compound_statement /* partially C++ only */
| type_specifier declarator compound_statement /* partially C++ only */
| basic_type_name declarator compound_statement /* partially C++ only */
| LE_TYPEDEFname declarator compound_statement /* partially C++ only */
| global_or_scoped_typedefname declarator compound_statement /* partially C++ only */
| declaration_qualifier_list identifier_declarator compound_statement
| type_qualifier_list identifier_declarator compound_statement
;
/* Note that I do not support redeclaration of TYPEDEFnames into
function names as I did in new_function_definitions (see note).
Perhaps I should do it, but for now, ignore the issue. Note that
this is a non-problem with ANSI C, as tag names are not
considered TYPEDEFnames. */
old_function_definition:
old_function_declarator {} old_function_body
| declaration_specifier old_function_declarator {} old_function_body
| type_specifier old_function_declarator {} old_function_body
| basic_type_name old_function_declarator {} old_function_body
| LE_TYPEDEFname old_function_declarator {} old_function_body
| global_or_scoped_typedefname old_function_declarator {} old_function_body
| declaration_qualifier_list old_function_declarator {} old_function_body
| type_qualifier_list old_function_declarator {} old_function_body
;
old_function_body:
declaration_list compound_statement
| compound_statement
;
/* Verify via constraints that the following
declaration_specifier is really a
typedef_declaration_specifier, consisting of:
... LE_TYPEDEFname :: LE_TYPEDEFname
optionally *preceded* by an "inline" keyword. Use care not to
support "inline" as a postfix!
Similarly, the global_or_scoped_typedefname must be:
... LE_TYPEDEFname :: LE_TYPEDEFname
with matching names at the end of the list.
We use the more general form to prevent a syntax conflict with a
typical function definition (which won't have a
constructor_init_list) */
constructor_function_definition:
global_or_scoped_typedefname parameter_type_list
constructor_init_list_opt compound_statement
| declaration_specifier parameter_type_list
constructor_init_list_opt compound_statement
;
/* Same comments as seen for constructor_function_definition
apply here */
constructor_function_declaration:
global_or_scoped_typedefname parameter_type_list /* wasteful redeclaration; used for friend decls. */
| declaration_specifier parameter_type_list /* request to inline, no definition */
;
/* The following use of declaration_specifiers is made to allow
for a LE_TYPEDEFname preceded by an LE_INLINE modifier. This fact must
be verified semantically. It should also be verified that the
LE_TYPEDEFname is ACTUALLY the class name being elaborated. Note
that we could break out typedef_declaration_specifier from within
declaration_specifier, and we might narrow down the conflict
region a bit. A second alternative (to what is done) for cleaning
up this stuff is to let the tokenizer specially identify the
current class being elaborated as a special token, and not just a
typedefname. Unfortunately, things would get very confusing for
the lexer, as we may pop into enclosed tag elaboration scopes;
into function definitions; or into both recursively! */
/* I should make the following rules easier to annotate with
scope entry and exit actions. Note how hard it is to establish
the scope when you don't even know what the decl_spec is!! It can
be done with $-1 hacking, but I should not encourage users to do
this directly. */
constructor_function_in_class:
declaration_specifier constructor_parameter_list_and_body
| LE_TYPEDEFname constructor_parameter_list_and_body
;
/* The following conflicts with member declarations-
constructor_parameter_list_and_body:
parameter_type_list ';'
| parameter_type_list constructor_init_list_opt compound_statement
;
so parameter_type_list was expanded inline to get */
/* C++, not ANSI C */
constructor_parameter_list_and_body:
'(' ')' type_qualifier_list_opt ';'
| '(' type_name initializer ')' type_qualifier_list_opt ';'
| '(' named_parameter_type_list ')' type_qualifier_list_opt ';'
| '(' ')' type_qualifier_list_opt
constructor_init_list_opt compound_statement
| '(' type_name initializer ')' type_qualifier_list_opt
constructor_init_list_opt compound_statement
| '(' named_parameter_type_list ')' type_qualifier_list_opt
constructor_init_list_opt compound_statement
| constructor_conflicting_parameter_list_and_body
;
/* The following conflicted with member declaration-
constructor_conflicting_parameter_list_and_body:
'(' type_name ')' type_qualifier_list_opt ';'
| '(' type_name ')' type_qualifier_list_opt
constructor_init_list_opt compound_statement
;
so type_name was inline expanded to get the following... */
/* Note that by inline expanding type_qualifier_list_opt in a few of
the following rules I can transform 3 RR conflicts into 3 SR
conflicts. Since all the conflicts have a look ahead of ';', it
doesn't really matter (also, there are no bad LALR-only
components in the conflicts) */
constructor_conflicting_parameter_list_and_body:
'(' type_specifier ')' type_qualifier_list_opt
';'
| '(' basic_type_name ')' type_qualifier_list_opt
';'
| '(' LE_TYPEDEFname ')' type_qualifier_list_opt
';'
| '(' global_or_scoped_typedefname ')' type_qualifier_list_opt
';'
| '(' type_qualifier_list ')' type_qualifier_list_opt
';'
| '(' type_specifier abstract_declarator ')' type_qualifier_list_opt
';'
| '(' basic_type_name abstract_declarator ')' type_qualifier_list_opt
';'
/* missing entry posted below */
| '(' global_or_scoped_typedefname abstract_declarator ')' type_qualifier_list_opt
';'
| '(' type_qualifier_list abstract_declarator ')' type_qualifier_list_opt
';'
| '(' type_specifier ')' type_qualifier_list_opt
constructor_init_list_opt compound_statement
| '(' basic_type_name ')' type_qualifier_list_opt
constructor_init_list_opt compound_statement
| '(' LE_TYPEDEFname ')' type_qualifier_list_opt
constructor_init_list_opt compound_statement
| '(' global_or_scoped_typedefname ')' type_qualifier_list_opt
constructor_init_list_opt compound_statement
| '(' type_qualifier_list ')' type_qualifier_list_opt
constructor_init_list_opt compound_statement
| '(' type_specifier abstract_declarator ')' type_qualifier_list_opt
constructor_init_list_opt compound_statement
| '(' basic_type_name abstract_declarator ')' type_qualifier_list_opt
constructor_init_list_opt compound_statement
/* missing entry posted below */
| '(' global_or_scoped_typedefname abstract_declarator ')' type_qualifier_list_opt
constructor_init_list_opt compound_statement
| '(' type_qualifier_list abstract_declarator ')' type_qualifier_list_opt
constructor_init_list_opt compound_statement
| constructor_conflicting_typedef_declarator
;
/* The following have ambiguities with member declarations-
constructor_conflicting_typedef_declarator:
'(' LE_TYPEDEFname abstract_declarator ')' type_qualifier_list_opt
';'
| '(' LE_TYPEDEFname abstract_declarator ')' type_qualifier_list_opt
constructor_init_list_opt compound_statement
;
which can be deferred by expanding abstract_declarator, and in two
cases parameter_qualifier_list, resulting in ...*/
constructor_conflicting_typedef_declarator:
'(' LE_TYPEDEFname unary_abstract_declarator ')' type_qualifier_list_opt
';'
| '(' LE_TYPEDEFname unary_abstract_declarator ')' type_qualifier_list_opt
constructor_init_list_opt compound_statement
| '(' LE_TYPEDEFname postfix_abstract_declarator ')' type_qualifier_list_opt
';'
| '(' LE_TYPEDEFname postfix_abstract_declarator ')' type_qualifier_list_opt
constructor_init_list_opt compound_statement
| '(' LE_TYPEDEFname postfixing_abstract_declarator ')' type_qualifier_list_opt
';'
| '(' LE_TYPEDEFname postfixing_abstract_declarator ')' type_qualifier_list_opt
constructor_init_list_opt compound_statement
;
constructor_init_list_opt:
/* nothing */
| constructor_init_list
;
constructor_init_list:
':' constructor_init
| constructor_init_list ',' constructor_init
;
constructor_init:
LE_IDENTIFIER '(' argument_expression_list ')'
| LE_IDENTIFIER '(' ')'
| LE_TYPEDEFname '(' argument_expression_list ')'
| LE_TYPEDEFname '(' ')'
| global_or_scoped_typedefname '(' argument_expression_list ')'
| global_or_scoped_typedefname '(' ')'
| '(' argument_expression_list ')' /* Single inheritance ONLY*/
| '(' ')' /* Is this legal? It might be default! */
;
declarator:
identifier_declarator
| typedef_declarator
;
typedef_declarator:
paren_typedef_declarator /* would be ambiguous as parameter*/
| simple_paren_typedef_declarator /* also ambiguous */
| parameter_typedef_declarator /* not ambiguous as parameter*/
;
parameter_typedef_declarator:
LE_TYPEDEFname
| LE_TYPEDEFname postfixing_abstract_declarator
| clean_typedef_declarator
;
/* The following have at least one '*' or '&'. There is no
(redundant) '(' between the '*'/'&' and the LE_TYPEDEFname. This
definition is critical in that a redundant paren that is too
close to the LE_TYPEDEFname (i.e., nothing between them at all)
would make the LE_TYPEDEFname into a parameter list, rather than a
declarator.*/
clean_typedef_declarator:
clean_postfix_typedef_declarator
| asterisk_or_ampersand parameter_typedef_declarator
| unary_modifier parameter_typedef_declarator
;
clean_postfix_typedef_declarator:
'(' clean_typedef_declarator ')'
| '(' clean_typedef_declarator ')' postfixing_abstract_declarator
;
/* The following have a redundant '(' placed immediately to the
left of the LE_TYPEDEFname. This opens up the possibility that the
LE_TYPEDEFname is really the start of a parameter list, and *not* a
declarator*/
paren_typedef_declarator:
postfix_paren_typedef_declarator
| asterisk_or_ampersand '(' simple_paren_typedef_declarator ')'
| unary_modifier '(' simple_paren_typedef_declarator ')'
| asterisk_or_ampersand '(' LE_TYPEDEFname ')' /* redundant paren */
| unary_modifier '(' LE_TYPEDEFname ')' /* redundant paren */
| asterisk_or_ampersand paren_typedef_declarator
| unary_modifier paren_typedef_declarator
;
postfix_paren_typedef_declarator:
'(' paren_typedef_declarator ')'
| '(' simple_paren_typedef_declarator postfixing_abstract_declarator ')'
| '(' LE_TYPEDEFname postfixing_abstract_declarator ')' /* redundant paren */
| '(' paren_typedef_declarator ')' postfixing_abstract_declarator
;
/* The following excludes lone LE_TYPEDEFname to help in a conflict
resolution. We have special-cased lone LE_TYPEDEFname alongside
all uses of simple_paren_typedef_declarator */
simple_paren_typedef_declarator:
'(' LE_TYPEDEFname ')'
| '(' simple_paren_typedef_declarator ')'
;
identifier_declarator:
unary_identifier_declarator
| paren_identifier_declarator
;
/* The following allows "function returning array of" as well as
"array of function returning". It COULD be cleaned up the way
abstract declarators have been. This change might make it hard
to recover from user's syntax errors, whereas now they appear as
simple constraint errors. */
unary_identifier_declarator:
postfix_identifier_declarator
| asterisk_or_ampersand identifier_declarator
| unary_modifier identifier_declarator
;
postfix_identifier_declarator:
paren_identifier_declarator postfixing_abstract_declarator
| '(' unary_identifier_declarator ')'
| '(' unary_identifier_declarator ')' postfixing_abstract_declarator
;
old_function_declarator:
postfix_old_function_declarator
| asterisk_or_ampersand old_function_declarator
| unary_modifier old_function_declarator
;
/* ANSI C section 3.7.1 states "An identifier declared as a
typedef name shall not be redeclared as a parameter". Hence the
following is based only on IDENTIFIERs.
Instead of identifier_lists, an argument_expression_list is used
in old style function definitions. The ambiguity with
constructors required the use of argument lists, with a
constraint verification of the list (e.g., check to see that the
"expressions" consisted of lone identifiers).
An interesting ambiguity appeared:
const constant=5;
int foo(constant) ...
Is this an old function definition or constructor? The decision
is made later by this grammar based on trailing context :-). This
ambiguity is probably what caused many parsers to give up on old
style function definitions. */
postfix_old_function_declarator:
paren_identifier_declarator '(' argument_expression_list ')'
| '(' old_function_declarator ')'
| '(' old_function_declarator ')' old_postfixing_abstract_declarator
;
old_postfixing_abstract_declarator:
array_abstract_declarator /* array modifiers */
| old_parameter_type_list /* function returning modifiers */
;
abstract_declarator:
unary_abstract_declarator
| postfix_abstract_declarator
| postfixing_abstract_declarator
;
postfixing_abstract_declarator:
array_abstract_declarator
| parameter_type_list
;
array_abstract_declarator:
'[' ']'
| '[' constant_expression ']'
| array_abstract_declarator '[' constant_expression ']'
;
unary_abstract_declarator:
asterisk_or_ampersand
| unary_modifier
| asterisk_or_ampersand abstract_declarator
| unary_modifier abstract_declarator
;
postfix_abstract_declarator:
'(' unary_abstract_declarator ')'
| '(' postfix_abstract_declarator ')'
| '(' postfixing_abstract_declarator ')'
| '(' unary_abstract_declarator ')' postfixing_abstract_declarator
;
asterisk_or_ampersand:
'*'
| '&'
;
unary_modifier:
scope '*' type_qualifier_list_opt
| asterisk_or_ampersand type_qualifier_list
;
/************************* NESTED SCOPE SUPPORT ******************************/
/* The actions taken in the rules that follow involve notifying
the lexer that it should use the scope specified to determine if
the next LE_IDENTIFIER token is really a LE_TYPEDEFname token. Note
that the actions must be taken before the parser has a chance to
"look-ahead" at the token that follows the "::", and hence should
be done during a reduction to "scoping_name" (which is always
followed by LE_CLCL). Since we are defining an LR(1) grammar, we
are assured that an action specified *before* the :: will take
place before the :: is shifted, and hence before the token that
follows the LE_CLCL is scanned/lexed. */
/* Note that at the end of each of the following rules we should
be sure that the tag name is in, or placed in the indicated
scope. If no scope is specified, then we must add it to our
current scope IFF it cannot be found in an external lexical
scope. */
scoping_name:
tag_name
| aggregate_key tag_name /* also update symbol table here by notifying it about a (possibly) new tag*/
;
scope:
scoping_name LE_CLCL
| scope scoping_name LE_CLCL
;
/* Don't try to simplify the count of non-terminals by using one
of the other definitions of "LE_IDENTIFIER or LE_TYPEDEFname" (like
"label"). If you reuse such a non-terminal, 2 RR conflicts will
appear. The conflicts are LALR-only. The underlying cause of the
LALR-only conflict is that labels are followed by ':'.
Similarly, structure elaborations which provide a derivation
have ':' just after tag_name. This reuse, with common right
context, is too much for an LALR parser. */
tag_name:
LE_IDENTIFIER
| LE_TYPEDEFname
;
global_scope:
{ /*scan for upcoming name in file scope */ } LE_CLCL
;
global_or_scope:
global_scope
| scope
| global_scope scope
;
/* The following can be used in an identifier based declarator.
(Declarators that redefine an existing LE_TYPEDEFname require
special handling, and are not included here). In addition, the
following are valid "identifiers" in an expression, whereas a
LE_TYPEDEFname is NOT.*/
scope_opt_identifier:
LE_IDENTIFIER
| scope LE_IDENTIFIER /* C++ not ANSI C */
;
scope_opt_complex_name:
complex_name
| scope complex_name
;
complex_name:
'~' LE_TYPEDEFname
| operator_function_name
;
/* Note that the derivations for global_opt_scope_opt_identifier
and global_opt_scope_opt_complex_name must be placed after the
derivation:
paren_identifier_declarator : scope_opt_identifier
There are several states with RR conflicts on "(", ")", and "[".
In these states we give up and assume a declaration, which means
resolving in favor of paren_identifier_declarator. This is
basically the "If it can be a declaration rule...", with our
finite cut off. */
global_opt_scope_opt_identifier:
global_scope scope_opt_identifier
| scope_opt_identifier
;
global_opt_scope_opt_complex_name:
global_scope scope_opt_complex_name
| scope_opt_complex_name
;
/* Note that we exclude a lone LE_TYPEDEFname. When all alone, it
gets involved in a lot of ambiguities (re: function like cast vs
declaration), and hence must be special cased in several
contexts. Note that generally every use of scoped_typedefname is
accompanied by a parallel production using lone LE_TYPEDEFname */
scoped_typedefname:
scope LE_TYPEDEFname
;
global_or_scoped_typedefname:
scoped_typedefname
| global_scope scoped_typedefname
| global_scope LE_TYPEDEFname
;
global_opt_scope_opt_typedefname:
LE_TYPEDEFname
| global_or_scoped_typedefname
;
%%
void yyerror(char* string)
{
printf("CodeLite: parser error: %s\n", string);
}
/* Smoke test: feed a small fixed snippet to the lexer and parse it. */
int main()
{
setLexerInput("class A{}; int foo(){ for(i=0; i<100; i++){}; return 3;}");
yyparse();
return 0;
}