\documentclass{article}
\input{Preamble}
\PdfDocument{Wild Magic 5 Overview}
\begin{document}
\PdfTitle{Wild Magic 5 Overview}{May 1, 2010}
This document provides a high-level overview of Wild Magic 5 and its
similarities and differences compared to Wild Magic 4. This is not intended
to be a comprehensive description; consider it a brain dump of what I was
thinking for the various files and subsystems. Your best bet for
understanding how to use Wild Magic 5 is to browse the sample applications
and see the engine in action. If you have used Wild Magic 4, you can compare
those samples with their rewrites in Wild Magic 5.
\section{Introduction}
\subsection{Licensing}
The versions of Wild Magic prior to 4.10 used the LGPL Open Source license.
The license was changed to the Boost License for Wild Magic 4.10. Wild Magic
5 also uses the Boost License.
\subsection{Naming Conventions}
Based on user feedback, the Microsoft-like Hungarian notation was removed.
The notation is now simpler, choosing instead to use the prefixes \Code{m} for
nonstatic class data members, \Code{ms} for static class data members,
\Code{g} for nonstatic global data, and \Code{gs} for static global data.
Modern compilers and tools are quite good at allowing you to determine the
type of identifiers, usually via tool tips with a mouse-over of the
identifiers, so there is no reason to embed the type information in the
name. Local identifiers within functions also no longer have embedded type
information. The source code is easier to read.
\subsection{Source Code Organization}
The code has been factored and reorganized.
The WM4 LibFoundation library was factored into two WM5 libraries: LibCore
and LibMathematics. LibCore has basic system support, including assertion
handling, data types for tuples (1D arrays) and tables (2D arrays), file
and buffer input-output, memory management and smart pointers, object-oriented
support (base class \Code{Object}, file and buffer input-output, run-time type
information, streaming, and initialization-termination semantics), mutexes
and threads (the threading is not yet implemented), and time measurement.
LibMathematics contains just about everything else that lived in
LibFoundation. Most of that code remains the same as in WM4 (except for
the naming conventions).
The WM4 LibGraphics library contained a platform-independent engine for
graphics. An abstract class \Code{Renderer} lived in this library. The WM4
LibRenderers folder contained projects with \Code{Renderer}-derived classes
for each graphics API of interest: \Code{Dx9Renderer} (DirectX 9 for Microsoft
Windows); \Code{OpenGLRenderer} with flavors \Code{WglRenderer} (Microsoft
Windows), \Code{AglRenderer} (Macintosh OS X), and \Code{GlxRenderer} (Linux
using X Windows); and \Code{SoftRenderer} with flavors \Code{WinSoftRenderer}
(Microsoft Windows), \Code{MacSoftRenderer} (Macintosh OS X), and
\Code{XSoftRenderer} (Linux using X Windows). The main drawback to this
approach is that \Code{Renderer} contained a large number of virtual
functions. In an application with a large number of calls to the virtual
functions, there is a performance hit due to those calls. Specifically, there
are many data cache misses due to the lookup of the function pointers in the
virtual function table (the tables are global data). WM5 has a concrete class
\Code{Renderer} that does not have virtual functions. The class is
implemented for each graphics API. The code for these APIs is also part of
WM5 LibGraphics. The selection of the API is controlled via build
configurations.
The WM4 LibApplications library that provides a platform-independent
application layer did not change much in WM5. The design of the application
layer is such that each platform (Microsoft Windows, Macintosh OS X, Linux)
implements an entry point that is called by code in class \Code{Application}.
The entry point implementation and any event handling is, of course, specific
to the platform. The application library is mainly for the convenience of
supporting Wild Magic sample applications. Although it can be used in shipping
applications, it was never intended for use this way. I expected that users
would roll their own layer.
\subsection{LIB Header Files}
Each of the libraries LibCore, LibMathematics, LibGraphics, LibPhysics, and
LibImagics has a corresponding header file: \Code{Wm5CoreLIB.h},
\Code{Wm5MathematicsLIB.h}, \Code{Wm5GraphicsLIB.h}, \Code{Wm5PhysicsLIB.h},
and \Code{Wm5ImagicsLIB.h}. These header files contain preprocessor
commands that control the compilation of the libraries. Users are encouraged
to modify these files to suit their own needs.
\subsubsection{Wm5CoreLIB.h}
\label{subsubsec.corelib}
The file \Code{Wm5CoreLIB.h} contains preprocessor commands to expose various
features that are dependent on the development platform (Microsoft Windows,
Macintosh OS X, Linux). For example, one of the flags
\Code{WM5\_LITTLE\_ENDIAN} or \Code{WM5\_BIG\_ENDIAN} is exposed depending
on the byte order required by the CPU. The only tested platform that has
a big-endian ordering is the Macintosh PowerPC G4/G5. The other tested
platforms have little-endian ordering, including the Intel Macintosh.
The header file contains declarations of some standard integer types when
compiling using Microsoft Visual Studio 2008. I am patiently waiting for
consistent cross-platform support for \Code{stdint.h}.
Various headers from the C standard library and from the C$++$ standard library
are included for convenience. Although generally you want to structure the
header inclusions to obtain minimal time for compilation, nearly all modern
compilers provide support for precompiled headers. Having a large number of
includes in \Code{Wm5CoreLIB.h}, a file that is indirectly included in all
source files, will lead to a slow compile without precompiled headers. However,
the precompiled header builds are quite fast.
The symbols \Code{public\_internal}, \Code{protected\_internal}, and
\Code{private\_internal} are defined to be the keywords \Code{public},
\Code{protected}, and \Code{private}, respectively. This allows me to
use the \Code{*\_internal} symbols to designate sections within class
declarations that are intended for my internal use. For example, sometimes a
class needs a subsystem to support the engine design, and that subsystem must
have public functions that are called within the engine. Such functions are
tagged as \Code{public\_internal} to let the users know that I do not
intend for these to be called explicitly by applications.
Within Microsoft Visual Studio 2008, the newly defined symbols may be assigned
colors for syntax highlighting. To change the color, edit the following file
\begin{verbatim}
C:/Program Files/Microsoft Visual Studio 9.0/Common7/IDE/usertype.dat
\end{verbatim}
Add each identifier you want highlighted on a line by itself. My file
contains
\begin{verbatim}
public_internal
protected_internal
private_internal
new0
new1
new2
new3
new4
delete0
delete1
delete2
delete3
delete4
assertion
\end{verbatim}
The additional symbols in this file for syntax highlighting are described
later in this document. In Visual Studio, select the menu item
\begin{verbatim}
Tools | Options ...
\end{verbatim}
In the Options dialog that appears, expand the Environment item and
select Fonts and Colors. On the right there is a control named
``Display items''; in the drop-down list, select ``User Keywords''.
You can change the color using the controls named ``Item foreground''
and ``Item background''. I selected purple for the foreground color,
as shown in the next figure.
\begin{center}
\includegraphics[width=5in]{OptionsDialog.png}
\end{center}
The macro \Code{WM5\_UNUSED(variable)} is used to avoid compiler warnings
about unused variables when compiling Release configurations. For example,
\begin{verbatim}
bool successful = DoSomeOperation();
assert(successful);
\end{verbatim}
will compile without warnings in Debug configurations. However, the
compiler generates a warning in Release configurations that \Code{successful}
is not used. The reason, of course, is that the \Code{assert} statement has
no generated code in Release configurations, so \Code{successful} is not used.
To avoid the warning, use
\begin{verbatim}
bool successful = DoSomeOperation();
assert(successful);
WM5_UNUSED(successful);
\end{verbatim}
The header file contains three additional blocks, all enabled in Debug
configurations. The first is related to run-time assertions, the second is
related to the WM5 memory management system that supports testing for memory
leaks, and the third is related to file and buffer input-output. The various
preprocessor commands in these blocks are described later in this document.
\subsubsection{Wm5MathematicsLIB.h}
Currently, the only preprocessor control in \Code{Wm5MathematicsLIB.h} is
related to handling of exact rational arithmetic. I added a patch to
WM4.10 so that subnormal (denormal) floating-point numbers are handled
correctly by the class \Code{Rational} constructors and converters between
floating-point and \Code{Rational}. The WM5 code supports conversion of
subnormal numbers. You can enable the engine to assert when an attempt is
made to convert a NaN (Not a Number) to a \Code{Rational}.
\subsubsection{Wm5GraphicsLIB.h}
A few controls are allowed in \Code{Wm5GraphicsLIB.h}. When reorienting the
camera by a call to \Code{Camera::SetAxes}, either explicitly or indirectly
with a call to \Code{Camera::SetFrame}, the input axis vectors might be
computed by the application in such a manner that, over time, numerical
round-off errors cause the vectors not to be a right-handed orthonormal set.
The \Code{SetAxes} function uses Gram-Schmidt orthonormalization to ensure
that the vectors do form a right-handed orthonormal set. You can enable
\Code{WM5\_VALIDATE\_CAMERA\_FRAME\_ONCE} to trap the first time the
vectors appear to fail the test for right-handed orthonormality. I have
found this to be a useful feature for trapping when the initial settings
for the application camera, \Code{mCamera}, are applied. In most cases,
the user has incorrectly specified the vectors.
The shader system supports only a few shader models (profiles). To be
specific, currently only four profiles are supported, plus a {\em none}
value used to flag invalid profiles; the total number, including
the {\em none} profile, is five. For vertex shaders, the supported profiles
are \Code{vs\_1\_1}, \Code{vs\_2\_0}, and \Code{vs\_3\_0} for DirectX 9 and
\Code{arbvp1} for OpenGL. For pixel shaders, the supported profiles are
\Code{ps\_1\_1}, \Code{ps\_2\_0}, and \Code{ps\_3\_0} for DirectX 9 and
\Code{arbfp1} for OpenGL. Sometimes you might need advanced OpenGL support
for an effect, but the Cg compiler still includes the ARB versions of the
profile names in the compiled code. For example, the sample graphics
application \Code{VertexTextures} requires a Cg command-line parameter
\verb|-profile vp40|, but the Cg compiler still displays the first line
of the compiled file as \verb|!!ARBVP1.0|. The WM5 shader system bundles
together the shader programs for the profiles into a single object of class
\Code{Shader}. This class has arrays whose number of elements is 5, which
is stored as \Code{Shader::MAX\_PROFILES}.
{\em You can modify WM5 to include more profiles.} However, if you use the
WM5 streaming system, the streamed output implicitly depends on
\Code{Shader::MAX\_PROFILES}. If you were to increase the maximum number
of profiles, and then load a file streamed with the previous maximum number,
there is a mismatch and the file load fails ungracefully (all data loaded
thereafter is misaligned). To trap this problem when loading files, you can
enable \Code{WM5\_ASSERT\_ON\_CHANGED\_MAX\_PROFILES}.
In the \Code{Renderer::Draw(const Visual*, const VisualEffectInstance*)}
function, the global render state is reset to the defaults after each pass
of the effect. Given that every draw function is required to set all
the global state, it is not necessary to reset the state. Thus, the
reset code is not compiled by default. During development and testing, I
had some problems when not resetting the state, so I added a preprocessor
symbol to allow me to toggle the reset code:
\Code{WM5\_RESET\_STATE\_AFTER\_DRAW}. Just in case problems show up later,
I kept the preprocessor symbol. You can enable this if you prefer by
uncommenting the define in \Code{Wm5GraphicsLIB.h}.
Sometimes during application development, you might not see a rendered
object when you were expecting one. A simple test to determine whether
any pixels were actually drawn involves queries supported by the graphics
APIs. The \Code{Renderer::DrawPrimitive} calls in \Code{Wm5Dx9Renderer.cpp}
and \Code{Wm5OpenGLRenderer.cpp} have conditionally compiled blocks of
code that, when enabled, perform the queries. To enable these, uncomment
the \Code{WM5\_QUERY\_PIXEL\_COUNT} symbol in \Code{Wm5GraphicsLIB.h}.
Recompile the graphics library and your application, and then set a
breakpoint in \Code{DrawPrimitive} on the lines with \Code{WM5\_END\_QUERY}.
When you reach the breakpoint, step over the line of code and look at the
value of \Code{numPixelsDrawn}. If it is zero, no pixels were drawn for
the current primitive.
When using the OpenGL renderer, I have code to draw text either using
display lists or using precomputed bitmap fonts (see
\Code{Wm5GLVerdanaS16B0I0.cpp}). The default is to use display lists,
but you can change this by commenting out \Code{WM5\_USE\_TEXT\_DISPLAY\_LIST}
in \Code{Wm5GraphicsLIB.h}.
\Code{Wm5GraphicsLIB.h} contains the symbol
\Code{WM5\_USE\_OPENGL2\_NORMAL\_ATTRIBUTES} that is defined for Microsoft
Windows and Linux. It is not defined for Macintosh OS X. I had problems
with incorrect renderings on the Macintosh when the effects use lighting
and normals, so I had to fall back to using the conventional
\Code{glNormalPointer} for setting the vertex data source for normals.
As it turns out, the problem is that I have been using OpenGL extensions
for shader support, and those extensions were created before OpenGL 2.0
was released. The assembly for the compiled shaders contains
\Code{vertex.normal}, which is for the conventional way of accessing
the vertex normals. When I use \Code{glEnableVertexAttribArrayARB}
and \Code{glVertexAttribPointerARB} to set the data source for vertex
normals, the NVIDIA drivers for Microsoft Windows and for Fedora Linux
hook up the normals so that \Code{vertex.normal} refers to those normals.
However, the NVIDIA drivers on the Macintosh do not hook these up, so
the vertex shader is unable to access the normals.
I added the aforementioned preprocessor symbol as a hack to make the shaders
work on all platforms. Alternatively, on the Macintosh you can edit the
assembly code and replace \Code{vertex.normal} by the corresponding
generic attribute accessor (not my first choice). I am in the process
of updating the OpenGL renderer so that it uses the core OpenGL 2.0 (and
later) shader system. However, this means that the shaders must be written
in GLSL, not in Cg. The end result of the update, {\em EmeraldGL},
will be an OpenGL-only graphics system. I might consider implementing
a DirectX-only system ({\em EmeraldDX}) that uses DirectX 11.
The last preprocessor symbol in \Code{Wm5GraphicsLIB.h} is
\Code{WM5\_PDR\_DEBUG}, which is enabled by default for the DirectX 9
renderer. This exposes assertions that are triggered whenever the DirectX
calls fail.
\subsubsection{Wm5PhysicsLIB.h}
The only preprocessor symbols in \Code{Wm5PhysicsLIB.h} are used for debugging
the LCP code. There is no reason to enable these except if you want to
determine whether the LCP code is working correctly. The LCP code was
part of {\em Game Physics, 1st edition}, but it was intended to be pedagogic
and illustrate the Lemke algorithm (which looks a lot like a basic linear
programming solver and is similar to linear system solving). This code is not
what people use in physics engines. (Someday I will get around to implementing
a velocity-based iterative algorithm \ldots)
\subsubsection{Wm5ImagicsLIB.h}
No preprocessor symbols are defined in \Code{Wm5ImagicsLIB.h}. This library
has not been worked on for many years, but remains useful (to me) for rapid
prototyping of image analysis projects. It needs some major updating and
expansion.
\subsection{No DLL Configurations}
For years I have provided build configurations for both static and dynamic
libraries. The Microsoft Windows annoyance of having to use
\Code{\_\_declspec(dllexport)} and \Code{\_\_declspec(dllimport)} so that
classes are properly exported or imported has been a pain. The WM4
libraries had LIB files containing preprocessor symbols as shown next:
\begin{verbatim}
#ifdef WM4_FOUNDATION_DLL_EXPORT
// For the DLL library.
#define WM4_FOUNDATION_ITEM __declspec(dllexport)
#else
#ifdef WM4_FOUNDATION_DLL_IMPORT
// For a client of the DLL library.
#define WM4_FOUNDATION_ITEM __declspec(dllimport)
#else
// For the static library.
#define WM4_FOUNDATION_ITEM
#endif
#endif
\end{verbatim}
Each class is structured as
\begin{verbatim}
class WM4_FOUNDATION_ITEM MyClass { ... };
\end{verbatim}
However, template classes with no explicit instantiation in the library
could not use the \Code{WM4\_FOUNDATION\_ITEM} macro. And various
static class data members needed the macro per member. The separation
between the abstract \Code{Renderer} class and its derived classes per
graphics API required the virtual function members so that the DLL configurations
would link successfully.
Given the abundance of disk space, the usage I had in mind for Wild Magic
libraries, the problems with linking when attempting to remove virtual
functions from the \Code{Renderer} class, and the annoyance of the
aforementioned macro handling, I decided to stop supporting DLLs. WM5 has
only static debug and static release configurations.
\subsection{The WM4 Shader Programming and FX System}
WM4 had a somewhat complicated approach to shader programming and effects,
which made it sometimes difficult to extend to shaders not already part of
the engine (or part of the sample applications). The problems with this
approach are described next.
The abstraction of the drawing pass in WM4 is
\scriptsize
\begin{verbatim}
renderer.Draw(geometry)
{
renderer.SetGlobalState(...); // alpha, cull, depth, ...
renderer.SetWorldTransformation(); // sets model-to-world (W), others computed later (WV, WVP)
renderer.EnableIBuffer(geometry); // enable the index buffer of geometry
for each effect of geometry do // multieffect drawing loop
{
renderer.ApplyEffect(effect);
{
for each pass of effect do // multipass drawing loop
{
pass.SetGlobalState();
pass.ConnectVShaderConstants(); // set sources for constants
pass.ConnectPShaderConstants(); // set sources for constants
pass.GetVProgram(); // loaded first time, cached in catalog for later times
pass.EnableVProgram();
pass.GetPProgram(); // loaded first time, cached in catalog for later times
pass.EnablePProgram();
for each vertex texture of pass do
{
pass.GetVTexture(); // loaded first time, cached in catalog for later times
pass.EnableVTexture();
}
for each pixel texture of pass do
{
pass.GetPTexture(); // loaded first time, cached in catalog for later times
pass.EnablePTexture();
}
pass.EnableVBuffer();
renderer.DrawPrimitive(geometry);
pass.DisableVBuffer();
pass.DisablePTextures();
pass.DisableVTextures();
pass.DisablePProgram();
pass.DisableVProgram();
pass.RestoreGlobalState();
}
}
}
DisableIBuffer();
RestoreWorldTransformation();
RestoreGlobalState();
}
\end{verbatim}
\normalsize
The drawing supports multiple effects per geometric primitive and multiple
passes per effect; it is not necessary to have a double-loop system. WM5
has a single-loop system, iterating over the passes of a single effect
attached to the geometric primitive.
The renderer sets and restores global states (alpha, face culling, depth
buffering, stencil buffering, polygon offset, wireframe), but so does each
pass. Given that each pass restores state, there is no need for the renderer
object itself to manage global state.
The index buffer is invariant across all effects and passes, so it is enabled
and disabled once only. However, the vertex buffer is enabled and disabled
per pass, which is not necessary. What WM4 does is create a VRAM vertex
buffer for the geometric primitive. It then maintains vertex buffers
that match what the vertex program requires for the passes, as determined
during the first call to \Code{GetVProgram} (when the vertex program is
loaded from disk and parsed). If the effect has multiple passes, a second
(or later) pass involves finding an already existing vertex buffer that has
the required attributes. If none exists, a new VRAM vertex buffer is created
that has the required attributes. Thus, it is possible that multiple vertex
buffers exist in VRAM with data copied from the primary vertex buffer of the
geometric primitive, which is a waste of memory. An effect with multiple
passes should be applied to a geometric primitive whose vertex buffer has
{\em all} the attributes necessary for {\em all} the passes (WM5 does this).
In effect, WM4 tried to assume responsibility for ensuring that the vertex
buffers match what the vertex program needs. If there is a mismatch between
primary vertex buffer and what the vertex program needs, WM4 creates a
matching vertex buffer; however, the attributes generated by a mismatch have
no chance of being initialized by the application programmer. In the
WM4 sample applications, there are no mismatches, so there is no penalty in
wasted memory. But there is a penalty in having a vertex buffer management
system that is irrelevant. In the end, it is the application programmer's
responsibility to ensure that the vertex buffer has all that it needs to
support an effect and that the outputs of a vertex program match the inputs
of a pixel program.
In WM4, class \Code{Shader} represents a shader program and its associated
storage for shader constants and for textures. However, it was convenient
to allow applications to specify their own data sources for the shader
constants (for ease of access). WM4 has shader constant classes that
provide such storage; for example, the class \Code{UserConstant}. In the
drawing pass, the functions \Code{Renderer::ConnectVShaderConstants} and
\Code{Renderer::ConnectPShaderConstants} set the data sources for the
shaders. This allows an application to change the data source for each
drawing pass, an event that is highly unlikely (and never happens in WM4
sample applications). The redesign of the shader system for WM5 avoids
this.
The function \Code{Renderer::GetVProgram} is called during drawing to
get access to the vertex program of the effect pass. The first time a
vertex program is requested, it is loaded from disk. The shaders were
written using NVIDIA's Cg, and they were all compiled for Shader Model 2.
The compiled assembly is still textual, and is stored in files with
extension \Code{wmsp}. The WM4 engine contains a class \Code{Program}
and derived classes \Code{VertexProgram} (loads \Code{wmsp} files with
prefix \Code{v\_}) and \Code{PixelProgram} (loads \Code{wmsp} files with
prefix \Code{p\_}). The comments in the \Code{wmsp} files are parsed
to obtain information about the shader program, which effectively is
WM4's attempt to have an FX run-time system.
A problem with this system is that the shader programs are constrained
to contain special names for some of the shader constants to support
automatic updating of those constants during drawing. A class
\Code{RendererConstant} provides a set of enumerations and corresponding
names for common quantities that change frequently, such as
world-view-projection matrices, camera parameters, and light and
material parameters. Class \Code{Renderer} contains an array of functions
corresponding to the enumerations in \Code{RendererConstant}. The
function \Code{Renderer::SetRendererConstant} determines which shader
constants need to be updated (in system memory). After such a call,
\Code{Renderer::SetVProgramConstant} or \Code{Renderer::SetPProgramConstant}
are called so that the graphics API can update the constants (by copying
to constant registers). These \Code{Renderer} calls are part of the
\Code{Renderer::EnableVProgram} and \Code{Renderer::EnablePProgram} calls
in the drawing pass. WM5 provides a different mechanism for automatic
constant updating that does not have constraints on the shader constant
names.
Another problem with the \Code{Program} loading and parsing is that it
is not general. Often I would want to support a new effect but the
Cg programs used features not supported by the parser of
\Code{Program}. That meant modifying \Code{Program} as needed. WM5
avoids this system and allows you to compile shaders to a binary format
that contains the textual program string but also contains information
about the shader. That is, the loading and parsing is now part of a
tool. The output files of the tool are ready to load by WM5, so there
is no error checking that needs to be performed at application run time.
In WM4, when \Code{Renderer::GetVProgram} is called the first time for a
vertex program, and the program loads correctly, it is stored in a cache
implemented in the \Code{Catalog} class. This caching system is overly
complicated. In WM5, caching is the responsibility of the application
programmer, because the programmer knows best how the objects will be
used and shared.
When effects use vertex or pixel textures, they are loaded the first time
they are encountered by calls to \Code{ShaderEffect::GetVTexture} and
\Code{ShaderEffect::GetPTexture}. The mechanism is similar to that of
\Code{GetVProgram} and \Code{GetPProgram}--the first time a texture is
encountered, it is loaded from disk and cached in a catalog. Later
requests look in the catalog first to find the textures and, if found,
use them instead of loading a new copy from disk.
Although manageable, the drawing system of WM4 turned out to be more
complicated than is necessary, and it was not general enough to support
many advanced special effects without having to modify the engine.
\subsection{The WM5 Shader Programming and FX System}
\label{subsec.shaderfx}
The abstraction of the drawing pass in WM5 is described next. What used
to be the \Code{Geometry} class is now \Code{Visual}, which I thought
was a better name that allows me to add \Code{Audial} (for 3D sound) at
a later date.
Some other major design changes were made. DirectX 9 has the concept of
a {\em vertex format} that describes a vertex stored in a vertex buffer.
OpenGL does not encapsulate this in a simple manner. WM5 has a new class
called \Code{VertexFormat} that implements the idea. The class
\Code{VertexBuffer} still represents a vertex buffer but, of course, with
changes. Reading and writing vertex buffer information requires knowing
a vertex buffer and a vertex format. The read/write is supported by the
class \Code{VertexBufferAccessor}.
The WM5 class \Code{VisualEffect} is the natural successor to WM4's
\Code{ShaderEffect}, except that \Code{VisualEffect} represents a
vertex shader and pixel shader pair {\em but without specific data for
the shader constants and textures}. A single \Code{VisualEffect} object
can have multiple instances, each instance having data. These instances
are represented by class \Code{VisualEffectInstance}. For example, you
can create a texture visual effect with user-specified sampler parameters.
If you want this effect for each of two different texture images, you
create two visual effect instances.
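This split between a shared effect and per-instance data can be sketched
with stand-in classes. The names \Code{SimpleEffect}, \Code{EffectInstance},
and \Code{CreateInstance} are hypothetical, not the actual WM5 interface;
the point is only that two instances share one shader pair while each owns
its own texture binding.

```cpp
#include <memory>
#include <string>
#include <utility>

// Stand-in for a compiled effect: a shader pair shared by all instances.
struct SimpleEffect {
    std::string vertexShader;  // compiled program text, shared
    std::string pixelShader;
};

// Stand-in for per-instance data: each instance binds its own texture.
struct EffectInstance {
    std::shared_ptr<SimpleEffect> effect;  // shared, not copied
    std::string textureName;               // per-instance datum
};

EffectInstance CreateInstance(std::shared_ptr<SimpleEffect> effect,
                              std::string texture) {
    return EffectInstance{std::move(effect), std::move(texture)};
}
```

Creating two instances for two different texture images then amounts to
calling \Code{CreateInstance} twice with the same effect object.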
A \Code{Visual} object has a single attached pair of \Code{VisualEffect}
and \Code{VisualEffectInstance}. Each object of type \Code{VisualEffectInstance}
manages multiple passes for the drawing, each pass of class \Code{VisualPass}.
The \Code{VisualPass} class contains global render state objects (alpha,
face culling, depth buffering, polygon offset, stencil buffering, and
wireframe), a vertex shader, and a pixel shader.
A class \Code{ShaderParameters} represents the shader constants and
textures used by a \Code{VisualEffectInstance}, one such object for the
vertex shader and one such object for the pixel shader. The shader
constants are encapsulated by a system whose base class is \Code{ShaderFloat}.
Many derived classes are provided for common shader constants, such as
world-view-projection matrices, camera parameters, and light and material
parameters. This system replaces WM4's \Code{RendererConstant} system for
automatic updating of shader constants.
The drawing pass is abstractly
\scriptsize
\begin{verbatim}
renderer.Draw(visual, visualEffectInstance)
{
renderer.Enable(visual.vertexBuffer);
renderer.Enable(visual.vertexFormat);
renderer.Enable(visual.indexBuffer); // if it has such a buffer
for each visualPass of visualEffectInstance do
{
visualPass.vertexShaderParameters.UpdateConstants(visual, renderer.camera);
visualPass.pixelShaderParameters.UpdateConstants(visual, renderer.camera);
visualPass.SetGlobalState(); // alpha, cull, depth, ...
renderer.Enable(visualPass.vertexShader, visualPass.vertexShaderParameters);
renderer.Enable(visualPass.pixelShader, visualPass.pixelShaderParameters);
renderer.DrawPrimitive(visual);
renderer.Disable(visualPass.pixelShader, visualPass.pixelShaderParameters);
renderer.Disable(visualPass.vertexShader, visualPass.vertexShaderParameters);
visualPass.RestoreGlobalState();
}
renderer.Disable(visual.indexBuffer);
renderer.Disable(visual.vertexFormat);
renderer.Disable(visual.vertexBuffer);
}
\end{verbatim}
\normalsize
At a high level, the drawing is similar to that of WM4. But as mentioned in the
section describing the WM4 drawing, per-pass enabling is unnecessary, so the
vertex buffer is now enabled and disabled only once, outside the loop over
passes. The WM4 setting of sources for shader constants was
eliminated. Instead, the \Code{ShaderFloat} objects provide storage, and the
\Code{UpdateConstants} calls perform the automatic updates of the constants.
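A minimal sketch of the \Code{ShaderFloat} idea follows. The types
\Code{Visual}, \Code{Camera}, and \Code{WVPScaleConstant} are placeholders,
not the WM5 classes: a derived class overrides an update function that
refreshes its storage from the visual and the camera before the draw call.

```cpp
#include <array>

// Hypothetical stand-ins for the quantities a constant is computed from.
struct Visual { float worldScale; };
struct Camera { float viewProjScale; };

// Base class for a block of float registers updated before each draw.
struct ShaderFloat {
    std::array<float, 4> data{};  // one register in this sketch
    virtual void Update(const Visual&, const Camera&) {}  // default: no-op
    virtual ~ShaderFloat() = default;
};

// Hypothetical derived class in the spirit of a WVP-matrix constant:
// it recomputes its value from the visual and camera on each pass.
struct WVPScaleConstant : ShaderFloat {
    void Update(const Visual& v, const Camera& c) override {
        data[0] = v.worldScale * c.viewProjScale;
    }
};
```

The renderer need only iterate the \Code{ShaderFloat} objects of a pass
and call \Code{Update} on each; no special constant names are required.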
All caching of effects, textures, vertex buffers, vertex formats, and index buffers
is the responsibility of the application programmer. It is simple enough to use
the smart-pointer system for the management rather than a complicated cataloging
system.
As mentioned in the previous section, WM5 has a tool for compiling Cg shaders
to a binary format that can be loaded directly by the engine. This tool is
named \Code{WmfxCompiler} (in the \Code{WildMagic5/Tools} subfolder).
{\em Local effects} are those applied to a single geometric primitive; for
example, basic texturing and lighting. {\em Global effects} are typically more
complicated and are applied to scene graphs; for example, planar shadows and
planar reflections. WM5 has implementations of quite a few local effects, but
has only planar shadows and planar reflections as examples of global effects.
The sample applications have additional global effects that are implemented at
the application level rather than as classes.
\subsection{Design Change Regarding Lights and Materials}
\label{subsec.designchangelights}
WM4 had classes \Code{Light} and \Code{Material} that provided the ability to
attach \Code{Light} objects to a scene graph node. Each light attached to a
node was assumed to illuminate any objects in the subtree rooted at the node.
To support this automatically, WM4 internally generated a shader effect
(class \Code{LightingEffect}) that was used for lighting. If an application
attached a \Code{ShaderEffect} to a leaf node of that subtree, a multieffect
drawing occurred. The \Code{LightingEffect} was executed first for
the geometry, and the \Code{ShaderEffect} was executed second with a default
alpha blend applied to combine it with the lighting. This approach still
has the flavor of the fixed-function pipeline. Moreover, it was not a good
idea (based on technical support requests from users having problems working
with the lighting). It is possible to roll your own lighting effects without
attaching lights to the scene, but then you have to make \Code{Renderer} calls
so that the renderer knows about the lights. Very cumbersome and nonintuitive.
WM5 eliminates this system. The \Code{Light} class still exists, but it is
only a container for light properties (light type, colors, attenuation, and
so on). You cannot attach a \Code{Light} to a scene directly. Instead, you
can create lighting-related shader constants via classes derived from
\Code{ShaderFloat} and include them in the visual effect instances. See,
for example, files of the form \Code{Wm5Light*Constant.\{h,cpp\}} and
\Code{Wm5Material*Constant.\{h,cpp\}} and local effects files of the form
\Code{Wm5Light*Effect.\{h,cpp\}}.
\section{LibCore}
The \Code{LibCore} library contains some basic support that applications need.
Some of this support is for convenience during development. A summary of
the files in this library is provided in this section. The subsection titles
are the names of the subfolders of the \Code{WildMagic5/LibCore} folder.
\subsection{Assert}
C$++$ run-time libraries typically implement a macro called \Code{assert}
that has a single input which is a Boolean expression. In debug
configurations, the macro is expanded to platform-specific code that triggers
the assertion when the Boolean expression is false. Moreover, typically
a breakpoint is generated so that the debugger stops on that line of code
for the programmer to diagnose the problem. For example,
\begin{verbatim}
float numerator = <some integer>;
float denominator = <some integer>;
assert(denominator != 0.0f);
float ratio = numerator/denominator;
\end{verbatim}
This bare-bones approach is suitable most of the time, but other times it
is useful to perform more actions when an unexpected condition occurs.
Moreover, it might be useful to have an assertion triggered when running
in release configurations.
The files \Code{Wm5Assert.*} provide an alternate implementation for
assertions, which at the moment is utilized only on Microsoft Windows
and Microsoft Visual Studio. The class \Code{Assert} has a constructor whose
first input is the Boolean expression to be tested. The name of the file and
line number within that file where the assertion is triggered are also
parameters. These support writing assertion failures to a log file,
identifying the file and line number, without triggering an interrupt on
the assertion. They also support writing the information to a Microsoft
Windows message box.
Yet another parameter of the constructor is a format string. Values to be
printed via the format statement may be provided to the constructor (note
the use of the ellipsis in the constructor). This allows you to specify
more than just that the assertion failed. You can print as much information
as you believe necessary to help with debugging. A variadic macro named
\Code{assertion} is used to wrap the construction of \Code{Assert} objects;
such a macro supports a variable number of arguments.
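The mechanism can be sketched as follows. \Code{SketchAssert} and
\Code{sketch\_assertion} are illustrative names, not the WM5 implementation;
this sketch records the formatted message in a string instead of logging to
a file or triggering a breakpoint.

```cpp
#include <cstdarg>
#include <cstdio>
#include <string>

// The constructor receives the test result, the call site, and a
// printf-style format with a variable number of arguments.
struct SketchAssert {
    static std::string lastMessage;
    SketchAssert(bool condition, const char* file, int line,
                 const char* format, ...) {
        if (condition) { return; }
        char text[256];
        va_list args;
        va_start(args, format);
        std::vsnprintf(text, sizeof(text), format, args);
        va_end(args);
        char full[512];
        std::snprintf(full, sizeof(full), "%s(%d): %s", file, line, text);
        lastMessage = full;  // a real system would log or break here
    }
};
std::string SketchAssert::lastMessage;

// The variadic macro hides the __FILE__/__LINE__ plumbing from the caller.
#define sketch_assertion(condition, ...) \
    SketchAssert(condition, __FILE__, __LINE__, __VA_ARGS__)
```

A call such as \Code{sketch\_assertion(n > 0, "n = \%d", n)} then captures
the file, the line, and as much context as the caller chooses to format.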
By default, the alternative assertion system is enabled for Microsoft
Windows and Visual Studio when in a debug configuration. The preprocessor
flag controlling this is in \Code{Wm5CoreLIB.h}. The system is enabled
when \Code{WM5\_USE\_ASSERT} is defined. If you want, you can expose the
macros even in a release configuration. Notice that there are three
additional preprocessor symbols you can define. These control whether
the assertion information is written to a log file, to the output window
of Visual Studio, and/or to a message box.
In my environment, I have \Code{assertion} specified as a user keyword
with syntax highlighting that shows the keyword in purple. For details
on highlighting user keywords, see Section \ref{subsubsec.corelib}.
\subsection{DataTypes}
I implemented only two basic data types in the core library: \Code{Tuple}
and \Code{Table}. These are templated classes with two template
parameters: one is the number of components of the tuple and one is the
type of the component. Only basic services are provided: construction,
destruction, access to the array pointer, access to components, assignment,
and comparison (support for standard C$++$ library containers). The main
use of \Code{Tuple} in the engine is as a base class for floating-point
vectors \Code{Float1}, \Code{Float2}, \Code{Float3}, and \Code{Float4}.
The derived classes provide specialized constructors and assignment
operators.
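A minimal sketch of the \Code{Tuple} idea is shown next, under the
assumption of lexicographic comparison for ordered-container support; this
is not the exact WM5 interface.

```cpp
// Fixed-size tuple: component access and comparison, nothing more.
template <int N, typename T>
class Tuple {
public:
    T& operator[](int i) { return mTuple[i]; }
    const T& operator[](int i) const { return mTuple[i]; }
    bool operator==(const Tuple& other) const {
        for (int i = 0; i < N; ++i) {
            if (mTuple[i] != other.mTuple[i]) { return false; }
        }
        return true;
    }
    bool operator<(const Tuple& other) const {  // lexicographic order
        for (int i = 0; i < N; ++i) {
            if (mTuple[i] < other.mTuple[i]) { return true; }
            if (other.mTuple[i] < mTuple[i]) { return false; }
        }
        return false;
    }
protected:
    T mTuple[N];
};

// A derived class adds the convenience constructor, as Float2 does.
class Float2 : public Tuple<2, float> {
public:
    Float2(float x, float y) { mTuple[0] = x; mTuple[1] = y; }
};
```

\Code{Table} is the 2-dimensional analogue, adding row and column access.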
Class \Code{Table} represents a 2-dimensional array of components and has
three template parameters: one is the number of rows of the table, one is
the number of columns of the table, and one is the type of the component.
Only basic services are provided: construction, destruction, access to
the array pointer, access to components, access to rows and columns (as
tuples), assignment, and comparison (support for standard C$++$ library
containers). The main use of \Code{Table} in the engine is as a base
class for floating-point matrices \Code{Matrix2}, \Code{Matrix3}, and
\Code{Matrix4}.
I have tried to rely on the standard C$++$ library containers as much as
possible, but I find my own minimum-heap template class to be useful (for
priority queue support with fast updates when neighbors change). I have
kept this template class, files \Code{Wm5MinHeap.*}.
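To illustrate why a custom heap is useful, here is a minimal indexed
min-heap that supports updating a key after insertion, something
\Code{std::priority\_queue} does not offer. The class and its interface are
hypothetical, not the actual \Code{Wm5MinHeap} interface.

```cpp
#include <utility>
#include <vector>

// Each record has a fixed handle, so its key can be changed in place.
class IndexedMinHeap {
public:
    int Insert(float key) {
        int handle = (int)mKeys.size();
        mKeys.push_back(key);
        mHeap.push_back(handle);
        mPos.push_back((int)mHeap.size() - 1);
        SiftUp(mPos[handle]);
        return handle;
    }
    void Update(int handle, float key) {  // key may increase or decrease
        mKeys[handle] = key;
        SiftUp(mPos[handle]);
        SiftDown(mPos[handle]);
    }
    int Minimum() const { return mHeap[0]; }
private:
    void Swap(int i, int j) {
        std::swap(mHeap[i], mHeap[j]);
        mPos[mHeap[i]] = i;
        mPos[mHeap[j]] = j;
    }
    void SiftUp(int i) {
        while (i > 0 && mKeys[mHeap[i]] < mKeys[mHeap[(i - 1) / 2]]) {
            Swap(i, (i - 1) / 2);
            i = (i - 1) / 2;
        }
    }
    void SiftDown(int i) {
        int n = (int)mHeap.size();
        for (;;) {
            int s = i, l = 2 * i + 1, r = 2 * i + 2;
            if (l < n && mKeys[mHeap[l]] < mKeys[mHeap[s]]) { s = l; }
            if (r < n && mKeys[mHeap[r]] < mKeys[mHeap[s]]) { s = r; }
            if (s == i) { return; }
            Swap(i, s);
            i = s;
        }
    }
    std::vector<float> mKeys;  // key per handle
    std::vector<int> mHeap;    // heap of handles
    std::vector<int> mPos;     // handle -> heap position
};
```

When a neighbor changes, a single \Code{Update} call repositions the
record in logarithmic time instead of rebuilding the queue.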
\subsection{InputOutput}
This folder contains implementation for handling of byte-order (endianness)
and for file and buffer input-output. It also contains a path system for
locating files.
\subsubsection{Endianness}
Class \Code{Endian} has code to test whether a processor is little endian
or big endian. The class also has functions for swapping data types with
2, 4, or 8 bytes per element. I used byte-swapping in WM4 extensively to
allow data files that could be loaded either on a little-endian or a
big-endian machine. The data itself was always stored in little-endian
format, which meant that the PowerPC Macintosh had extra computational
work to do when loading.
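The two services can be sketched as follows, with illustrative names
(\Code{IsLittleEndian}, \Code{Swap2}); the 4- and 8-byte swaps are
analogous.

```cpp
#include <cstdint>
#include <cstring>

// Detect the processor's byte order at run time by inspecting which
// byte of a known 16-bit value is stored first.
bool IsLittleEndian() {
    std::uint16_t value = 0x0102;
    unsigned char first;
    std::memcpy(&first, &value, 1);
    return first == 0x02;  // low-order byte stored first
}

// Swap a 2-byte element in place.
void Swap2(void* data) {
    unsigned char* bytes = static_cast<unsigned char*>(data);
    unsigned char save = bytes[0];
    bytes[0] = bytes[1];
    bytes[1] = save;
}
```

A loader reading little-endian data on a big-endian machine applies the
swap to each multibyte element after reading it.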
\subsubsection{File and Buffer Input-Output}
My goal in WM5 was to provide file and buffer input-output that can be
configured for the platforms in such a manner as to avoid byte swapping.
Classes \Code{BufferIO} and \Code{FileIO} are the implementations. The
constructors for these classes have a \Code{mode} parameter that allows
you to specify whether the object is for reading data or for writing
data. Moreover, the \Code{mode} flags specify whether to read as is,
to write as is, to read and swap bytes, or to write and swap bytes.
Additionally, I have \Code{mode} flags for the default read/write modes.
In the engine, any time I use \Code{BufferIO} or \Code{FileIO} objects,
I arrange for the \Code{mode} parameter to be defaulted itself to the
default read/write modes. In this manner, if you want a global change
in the engine, say, to switch from read to read-and-swap, you need only
edit \Code{Wm5BufferIO.h} and \Code{Wm5FileIO.h} and change what the
default flags are (they currently are set to read/write without swaps).
This sounds fine in theory, but I encountered one big problem after
writing most of the graphics library. The vertex buffers and textures
were streamed to disk as arrays of bytes, ignoring the actual structure
of a vertex and the actual format of a texture. This is a problem when
you want to write-and-swap, because byte arrays are never byte-swapped.
Instead, it is necessary to write vertices one at a time and swap native
fields as they are encountered. Similarly, texels must be written one at
a time to ensure that the color channels are swapped correctly; for
example, if you have an RGBA 16-bits-per-channel texel, you must swap
two bytes per channel for each of four channels. The source code was
due soon for the {\em Game Physics, 2nd edition} CD-ROM, so it was too
late to modify the code. Instead, I created WMOF (Wild Magic Object
File) versions for little endian and big endian. Only two such files
are shipped anyway (\Code{FacePN.wmof} and \Code{SkinnedBipedPN.wmof}),
so not a big deal. My goal for future development is to avoid the
streaming system and just rely on raw formats for vertex buffers,
index buffers, and textures, and each platform can generate its own
byte-ordered versions.
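The per-field requirement can be illustrated with the RGBA
16-bits-per-channel example; \Code{TexelRGBA16} and the function names are
hypothetical. Each 2-byte channel is swapped on its own, because swapping
the 8 bytes as one block would scramble the channels.

```cpp
#include <cstdint>

struct TexelRGBA16 { std::uint16_t r, g, b, a; };

// Swap the two bytes of one channel.
std::uint16_t SwapBytes16(std::uint16_t value) {
    return static_cast<std::uint16_t>((value >> 8) | (value << 8));
}

// Swap each of the four channels independently.
TexelRGBA16 SwapTexel(TexelRGBA16 texel) {
    texel.r = SwapBytes16(texel.r);
    texel.g = SwapBytes16(texel.g);
    texel.b = SwapBytes16(texel.b);
    texel.a = SwapBytes16(texel.a);
    return texel;
}
```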
\subsubsection{Path Handling}
In WM4, the files \Code{Wm4System.*} contained the ability to specify
a filename and create the fully qualified hard-disk path for the file.
The function of interest was \Code{System::GetPath}. Someone who had
experience with the Macintosh implemented the Apple version of this
function, which involves some low-level operating system calls. I
had to hack this function, because it depended on how Xcode was
configured (and the configuration varied between Xcode versions).
Not having enough experience with low-level Macintosh programming,
I ignored some complaints from users about how \Code{GetPath} was
slow and annoying.
In WM4, I also required users to set an environment variable that
stored the path to the \Code{WildMagic4} folder of the installation.
I missed a simple opportunity to bootstrap off this environment
variable and avoid the low-level programming.
WM5 does take advantage of the environment variable, now called
\Code{WM5\_PATH} in the WM5 distribution. Class \Code{Environment}
encapsulates computing the fully qualified path for a specified file.
Just as class \Code{System} allowed in WM4, \Code{Environment}
allows you to insert and remove directory strings (paths to the
folders) for an array of strings. The most common function in this
class that the sample applications use is
\begin{verbatim}
std::string Environment::GetPathR (const std::string& name);
\end{verbatim}
You specify the name of a file to be read (the suffix \Code{R}
stands for ``read'') and the function returns the fully qualified
path for that file, if it can find it using the array of directory
strings it manages. If it cannot find the file, the empty string is
returned.
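The search can be sketched as follows, assuming the directory strings
carry trailing separators; \Code{GetPathR} here is a free-function
stand-in for the member function, not the WM5 code.

```cpp
#include <fstream>
#include <string>
#include <vector>

// Try each registered directory in order and return the first path
// under which the file opens for reading; empty string means not found.
std::string GetPathR(const std::vector<std::string>& directories,
                     const std::string& name) {
    for (const std::string& directory : directories) {
        std::string path = directory + name;
        std::ifstream input(path.c_str());
        if (input) {
            return path;  // found and readable
        }
    }
    return std::string();
}
```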
The main entry point in the application code inserts the path to
the \Code{WildMagic5} folder. It also inserts paths to various
\Code{WildMagic5/Data} subfolders: \Code{Wmfx}, \Code{Wmof},
\Code{Wmtf}, \Code{Wmvf}, and \Code{Im}. More importantly, the
path to the application's project folder is inserted in the
main function. The application initialization mechanism sets the
path, which is a static member \Code{Application::ThePath}. In
order for this to work, it is necessary that the application
set the console title (for \Code{ConsoleApplication}-derived
classes) or the window title (for \Code{WindowApplication}-derived
classes). For example, the application \Code{BillboardNodes}
has a class with constructor defined as
\begin{verbatim}
BillboardNodes::BillboardNodes ()
:
WindowApplication3("SampleGraphics/BillboardNodes", 0, 0, 640, 480,
Float4(0.9f, 0.9f, 0.9f, 1.0f)),
mTextColor(1.0f, 1.0f, 1.0f, 1.0f)
{
}
\end{verbatim}
The window title is the quoted string. This string is appended to
the fully qualified string for the \Code{WildMagic5} folder. The
resulting string is the fully qualified path for the folder of the
\Code{BillboardNodes} project.
\subsection{Memory}
\subsubsection{WM4 Memory Tracking}
WM4 has a memory system that supports finding memory leaks. The macros
\Code{WM4\_NEW} and \Code{WM4\_DELETE} are simple macros that wrap
\Code{new} and \Code{delete} when the memory system is disabled and
that wrap \Code{new(\_\_FILE\_\_,\_\_LINE\_\_)} and \Code{delete} when the
memory system is enabled. All engine memory allocations and
deallocations use these macros so that without code changes, you can
toggle on/off the memory tracking.
The heart of the system is class \Code{Memory} whose interface is used
to override the C$++$ operators
\begin{verbatim}
void* operator new (size_t size, char* file, unsigned int line);
void* operator new[] (size_t size, char* file, unsigned int line);
\end{verbatim}
Although a simple system, the override affects all allocations in the
application; indirectly, any other code linked to the application is
forced to use the overridden operator.
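The interception technique itself is easy to demonstrate. The sketch below
records only the most recent call site (WM4 keeps a full record for leak
reporting at shutdown), and the macro name \Code{TRACKED\_NEW} is
illustrative rather than the actual \Code{WM4\_NEW}.

```cpp
#include <cstdlib>
#include <new>
#include <string>

// Book-keeping for the last tracked allocation (a full system keeps a map).
static std::string gLastFile;
static unsigned int gLastLine = 0;
static int gLiveAllocations = 0;

// Placement-new overload that records the call site.
void* operator new(std::size_t size, const char* file, unsigned int line) {
    gLastFile = file;
    gLastLine = line;
    ++gLiveAllocations;
    return std::malloc(size);
}

// Matching overload; C++ calls it automatically only if the constructor
// throws, which is exactly the limitation discussed below. Since this
// sketch allocates with malloc, deallocation must go through this
// overload (called explicitly) rather than a plain delete.
void operator delete(void* address, const char*, unsigned int) {
    std::free(address);
}

// The macro hides the __FILE__/__LINE__ plumbing from the caller.
#define TRACKED_NEW new(__FILE__, __LINE__)
```

Usage is \Code{int* value = TRACKED\_NEW int(7);} with the address later
released via \Code{operator delete(value, "", 0u)} in this sketch.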
I was not satisfied with this approach, wanting instead to provide the
ability for users to substitute in their own memory management/tracking
system that affects only Wild Magic code. For example, a user might want
to patch in a system that gives Wild Magic a {\em memory budget}--a
fixed-size heap that the engine must use for all its memory needs.
I also was not satisfied with the C$++$ memory management itself. In
the memory tracking, the calls to \Code{new(\_\_FILE\_\_,\_\_LINE\_\_)}
allow you to intercept the allocation request and save it for writing
to a log file at the end of an application run. If there is a memory
leak, the log file can list information about the allocations, including
the name of the source file and the line of that file where the leaked
allocation occurred. Unfortunately, C$++$ does not allow you to override
\Code{delete} in a way that uses the \Code{\_\_FILE\_\_} and
\Code{\_\_LINE\_\_} macros. At first glance you might override with
\begin{verbatim}
void operator delete (void* address, char* file, unsigned int line);
void operator delete[] (void* address, char* file, unsigned int line);
#define WM4_DELETE delete(__FILE__,__LINE__)
\end{verbatim}
This does not do what you think it does. These versions of \Code{delete}
are called only when exceptions occur, and you cannot force them to be
called otherwise. It would really be helpful to be able to log the files
and lines on which deallocations occur, especially when you want to monitor
{\em memory usage patterns} rather than memory leaks.
The \Code{operator new} function is for dynamically allocating a single object,
a 0-dimensional array so to speak. The \Code{operator new[]} function is for
dynamically allocating a 1-dimensional array of objects. The general rule is
that if you allocate with \Code{operator new}, you must deallocate with
\Code{operator delete}. If you allocate with \Code{operator new[]}, you must
deallocate with \Code{operator delete[]}. If you mix these, consider that
an error in memory management, even if the application does not abnormally
terminate. For example,
\begin{verbatim}
MyObject* objects = new MyObjects[10];
delete[] objects; // matches the new[] call
delete objects; // error - a mismatch
\end{verbatim}
It is the programmer's responsibility to ensure the new and delete calls
are matched.
C$++$ does not have new/delete operators for higher dimensional arrays. It is
not clear how to provide language support for this in a robust manner. For
example,
\begin{verbatim}
MyObject** objects0 = new MyObject*[N];
for (i = 0; i < N; ++i)
{
objects0[i] = new MyObject[M];
}
<code using objects0>;
for (i = 0; i < N; ++i)
{
delete[] objects0[i];
}
delete[] objects0;
MyObject someObjects[N]; // objects live on the stack, not in the heap
MyObject** objects1 = new MyObject*[N];
for (i = 0; i < N; ++i)
{
objects1[i] = &someObjects[i];
}
<code using objects1>;
delete[] objects1;
\end{verbatim}
In the first block of code, the user has dynamically allocated a 2-dimensional
array of \Code{MyObject} objects, manipulated the objects, and then dynamically
deallocated the array one row at a time. In the second block of code, the
user has created a 1-dimensional array of \Code{MyObject*} pointers that
point to a 1-dimensional array of \Code{MyObject} objects that live on the
stack. It is an error to attempt to dynamically deallocate these objects.
Clearly, the semantics of \Code{objects0} and \Code{objects1} are different,
despite both being of type \Code{MyObject**}. Without knowledge of the
semantics, it would be difficult for C$++$ to provide a new/delete pair
for \Code{Type**} pointers.
In the case when you do want a 2-dimensional array of the form
that \Code{objects0} illustrates, you can provide your own allocation
and deallocation. WM4 had several template functions in class \Code{System}
for allocating and deallocating 2-dimensional and 3-dimensional arrays.
The idea of these is to encapsulate the work required, hiding the details
from the user, {\em and to minimize the number of new/delete calls}.
Returning to the first code block of the example, an alternative scheme
that minimizes new/delete calls is
\begin{verbatim}
MyObject** objects2 = new MyObject*[N];
objects2[0] = new MyObject[N*M];
for (i = 1; i < N; ++i)
{
objects2[i] = &objects2[0][M*i];
}
<code using objects2>;
delete[] objects2[0];
delete[] objects2;
\end{verbatim}
Allocation of \Code{objects0} requires $N+1$ calls to \Code{new} and
deallocation requires $N+1$ calls to \Code{delete}. Allocation of
\Code{objects2} requires $2$ calls to \Code{new} and $2$ calls to
\Code{delete}. Calls to new/delete can be relatively expensive because
of the work that the memory manager must do to manage the free list of
blocks, so minimizing the calls is a desirable goal. Moreover, you are
guaranteed that the \Code{N*M} \Code{MyObject} objects are contiguous,
which can be friendly to a memory cache, and also allows you to iterate
over the 2-dimensional array as a 1-dimensional array in an efficient
manner.
\begin{verbatim}
// Iteration as a 2-dimensional array.
for (row = 0; row < N; ++row)
{
for (col = 0; col < M; ++col)
{
MyObject& object = objects2[row][col];
<do something with object>;
}
}
// Iteration as a 1-dimensional array.
for (i = 0; i < N*M; ++i)
{
MyObject& object = objects2[0][i];
<do something with object>;
}
\end{verbatim}
In the memory allocation scheme for \Code{objects0}, you are not
guaranteed that the rows occur in contiguous memory, so there is
the potential for memory cache misses when iterating over the
2-dimensional array, and it is not possible to iterate over the
objects as a 1-dimensional array.
Allocation and deallocation of 3-dimensional arrays with a minimum
of new/delete calls is similar.
\begin{verbatim}
MyObject*** objects3 = new MyObject**[P];
objects3[0] = new MyObject*[P*N];
objects3[0][0] = new MyObject[P*N*M];
for (int j = 0; j < P; ++j)
{
objects3[j] = &objects3[0][N*j];
for (int i = 0; i < N; ++i)
{
objects3[j][i] = &objects3[0][0][M*(i + N*j)];
}
}
<code using objects3>;
delete[] objects3[0][0];
delete[] objects3[0];
delete[] objects3;
\end{verbatim}
In WM4, the allocation and deallocation are wrapped with template functions
named \Code{System::Allocate} and \Code{System::Deallocate}. However, I find
it displeasing to have inconsistent readability by calling \Code{WM4\_NEW}
for single objects (0-dimensional) and 1-dimensional arrays but having to
call \Code{System::Allocate} for 2-dimensional and 3-dimensional arrays.
\subsubsection{WM5 Memory Tracking}
A review of the ideas in the previous section led me to the following
requirements for the WM5 memory management system. Several additional
requirements were added as I discovered problems while developing the
memory manager. The first item in the list applies when memory tracking
is disabled; all other items apply when it is enabled.
\begin{enumerate}
\item When memory tracking is disabled, the allocation and deallocation
fall back to the standard \Code{new} and \Code{delete} calls.
\item Support semantics for arrays of dimension two or larger.
\item Interception of \Code{new} and \Code{delete} calls must affect
only the Wild Magic source code; that is, a side effect should not
be that other systems (C$++$ run-time libraries or third-party
software) are forced to use the interception system.
\item Provide hooks to the users for the low-level allocation and
deallocation so that Wild Magic transparently accesses a user-specified
heap (to enforce a memory budget).
\item File names and line numbers must be tracked both for allocations
and deallocations.
\item The inclusion of \Code{\_\_FILE\_\_} and \Code{\_\_LINE\_\_}
macros must be hidden from the user (for readability).
\item The tracking system must be reentrant; that is, if the system
manages containers that store tracking information and those
containers must be dynamically allocated, they must not do so by
using the tracking system (infinite recursion problem).
\item The system must allow for smart pointers (reference-counted
objects).
\item The tracking system must be thread safe.
\end{enumerate}
I struggled with designing a system that satisfied all the requirements,
finally settling on the one that is implemented in class \Code{Memory}.
I was burned only a couple of times along the way \ldots
\vspace*{0.1in}
{\bf Usage}
Before discussing the issues in designing \Code{Memory}, let us look at
the final result and how it is used. A set of macros is defined to
make allocation and deallocation calls simple and readable and to hide
the file-line information. The allocation macros are named \Code{new0},
\Code{new1}, \Code{new2}, \Code{new3}, and \Code{new4}. The numeric
suffix denotes the dimension of the allocation. Effectively, \Code{new0}
corresponds to \Code{new} for a single object, \Code{new1} corresponds
to \Code{new[]} for a 1-dimensional array of objects, and the remaining
macros correspond to higher dimensional arrays, as described in the
previous section (minimizing the number of calls to \Code{new}). The
corresponding deallocation macros are \Code{delete0}, \Code{delete1},
\Code{delete2}, \Code{delete3}, and \Code{delete4}. Although it is
still the user's responsibility to pair the correct new/delete macro
calls, if there is a mismatch (on a delete call), the memory tracking
system will report this. In my development environment, these macros
were added as user keywords, which I highlight in purple.
For allocation, all but the \Code{new0} call are templated. Typical usage is
\begin{verbatim}
MyObject* object = new0 MyObject(parameters);
delete0(object);
MyObject* objects1 = new1<MyObject>(numElements);
objects1[elementIndex] = <do something>;
delete1(objects1);
MyObject** objects2 = new2<MyObject>(numRows, numColumns);
objects2[rowIndex][columnIndex] = <do something>;
delete2(objects2);
MyObject*** objects3 = new3<MyObject>(numSlices, numRows, numColumns);
objects3[sliceIndex][rowIndex][columnIndex] = <do something>;
delete3(objects3);
\end{verbatim}
\vspace*{0.1in}
{\bf Design Issues}
Now for design issues. One of the main problems I had was trying to wrap
the allocation and deallocation with macros for readability and ease of
use, yet satisfying all the requirements I mentioned previously. It
appeared to be practically impossible to use macros, hide an overload
of \Code{operator new} specific to Wild Magic, interact properly with
\Code{new} for single objects, hide the \Code{\_\_FILE\_\_} and
\Code{\_\_LINE\_\_} macros, and fall back to standard \Code{new} and
\Code{delete} when the tracking is disabled. Moreover, Requirement 7
is problematic, because it effectively forces you to have a container
external to the WM5 memory management system, which means a memory budget
cannot be fully enforced. I decided that having such a container was
something I (and users) could live with--you can always estimate how large
a container will be for your application, and then factor that into your
memory budgets.
In WM4, I had a macro to wrap overloaded \Code{operator new},
\begin{verbatim}
#define WM4_NEW new(__FILE__,__LINE__)
void* operator new (size_t size, char* file, unsigned int line);
void* operator new[] (size_t size, char* file, unsigned int line);
\end{verbatim}
This macro hides the \Code{\_\_FILE\_\_} and \Code{\_\_LINE\_\_} macros,
satisfying Requirement 6. However, the overloaded allocators violate
Requirement 3--the compiler would generate code for non-Wild-Magic code
that uses the overloads. Regardless, such a simple macro cannot simultaneously
hide the file-line macros, encode the dimension of the array to be allocated,
and wrap the overloaded \Code{new}.
I was able to accomplish some of the hiding, but suffered the consequence of
needing lines of code such as
\begin{verbatim}
MyObject* object = WM5_NEW(MyObject, constructorParameters WM5_FILE_LINE);
\end{verbatim}
where \Code{WM5\_FILE\_LINE} expanded to nothing when memory tracking was
disabled, but expanded to
\begin{verbatim}
#define WM5_FILE_LINE , __FILE__, __LINE__
\end{verbatim}
when memory tracking was enabled. I was able to circumvent this problem by
designing \Code{Memory} so that objects of this class were only temporary
(for one line of code) but stored the file-line information. This also
addressed Requirement 5 (tracking delete calls). Specifically, class
\Code{Memory} has members \Code{mFile} and \Code{mLine} and a constructor
\begin{verbatim}
Memory::Memory (const char* file, int line) : mFile(file), mLine(line) { }
\end{verbatim}
The file-line information persists only while the temporary object exists, so
it is temporarily accessible to the memory tracking system.
The overloaded allocation operator has signature
\begin{verbatim}
void* operator new (size_t numBytes, const Wm5::Memory& memory);
\end{verbatim}
This satisfies Requirement 3 in that it is not possible for the compiler to
match this against allocation calls outside the Wild Magic 5 engine. There
was no need to overload \Code{operator new[]}.
Some of the macros for allocation and deallocation are
\begin{verbatim}
#define new0 new(Wm5::Memory(__FILE__,__LINE__))
#define new1 new Wm5::Memory(__FILE__,__LINE__).New1
#define delete0 Wm5::Memory(__FILE__,__LINE__).Delete0
#define delete1 Wm5::Memory(__FILE__,__LINE__).Delete1
\end{verbatim}
Notice that \Code{new0} uses the overloaded \Code{new} operator, where the
input \Code{memory} is a reference to the temporary \Code{Memory} object.
In the implementation of the overloaded \Code{new} operator, the memory
tracking system has access to file-line information because the temporary
object stores that information.
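In isolation, the mechanism looks like the following minimal sketch. This is
not the Wm5 source; the globals that record the call site stand in for the
tracking map, but the shape of the temporary object, the overload, and the
macro is the same.
\begin{verbatim}
#include <cstddef>
#include <cstdlib>

struct Memory
{
    const char* mFile;
    int mLine;
    Memory (const char* file, int line) : mFile(file), mLine(line) {}
};

// Globals recording the most recent call site; a real tracker would
// insert an entry into its map instead.
static const char* gLastFile = 0;
static int gLastLine = 0;

void* operator new (std::size_t numBytes, const Memory& memory)
{
    gLastFile = memory.mFile;
    gLastLine = memory.mLine;
    return std::malloc(numBytes);
}

// Matching placement delete, called only if a constructor throws.
void operator delete (void* ptr, const Memory&)
{
    std::free(ptr);
}

#define new0 new(Memory(__FILE__,__LINE__))
\end{verbatim}
Only code that uses the \Code{new0} macro can match this overload, which is
the essence of how Requirement 3 is satisfied.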
\vspace*{0.1in}
{\bf Template and Macro Interaction}
Notice that \Code{new1} raises some additional questions. The intent is for
this macro to support allocation of 1-dimensional arrays of {\em any type}.
The type information is not part of the macro. One could explore the
possibility of including the type as a macro parameter, and I did explore
this. You quickly run into the problem that the types might be template
types with multiple parameters separated by commas. These commas interfere
with the preprocessor's parsing of the macro. For example, you might try
\begin{verbatim}
#define new1(type) new(Wm5::Memory(__FILE__,__LINE__)) type
float* anArray = new1(float)[10]; // okay
MyTemplate<int,float>* anotherArray = new1(MyTemplate<int,float>)[10]; // error
\end{verbatim}
The last line is a problem because the preprocessor thinks that
\Code{MyTemplate<int} is the macro parameter. To convince the preprocessor otherwise
would require an extra pair of parentheses
\begin{verbatim}
new1((MyTemplate<int,float>))[10]; // still an error
\end{verbatim}
but this does not work because the extra parentheses now cause a syntax
error when the compiler tries to determine the type of the allocation. A fix is
to use
\begin{verbatim}
typedef MyTemplate<int,float> MyTemplateIF;
MyTemplateIF* anotherArray = new1(MyTemplateIF)[10]; // okay
\end{verbatim}
but then the user has to make excessive use of \Code{typedef}. There were
other situations in the engine where I wanted to pass template types through
macro parameters, but the template-comma/macro-comma problem prevented those,
too. It would have been nice had C$++$ provided a separator other than a
comma for multiple template parameters.
At any rate, the \Code{Memory} class was then designed to have functions
\Code{New1}, \Code{New2}, and so on, that are templated. This avoids having
to pass template types through macro parameters, but runs the risk of
generating excessive code. These templated member functions are why the previous example
had code such as
\begin{verbatim}
MyObject* objects1 = new1<MyObject>(numElements);
// The macro expanded code.
MyObject* objects1 = Wm5::Memory(__FILE__,__LINE__).New1<MyObject>(numElements);
\end{verbatim}
\vspace*{0.1in}
{\bf Lack of Specialized New0}
Observe that there is no templated function \Code{Memory::New0}. I had hoped to
have consistent coding style for all allocations, wanting
\begin{verbatim}
MyObject* object = new0<MyObject>(parameters);
\end{verbatim}
My first pass on the design and implementation used this approach, and the
\Code{Memory} class had a large number of \Code{New0} functions, one for a
default (0-parameter) constructor, one for a 1-parameter constructor, and so on.
The implementation was along the lines of the following abstraction for a
2-parameter constructor,
\begin{verbatim}
template <typename T, typename Param0, typename Param1>
T* Memory::New0 (Param0 p0, Param1 p1)
{
    // Memory tracking code not shown...
    return new T(p0, p1);
}
\end{verbatim}
During testing, I was burned by this approach. A class had a constructor with
a constant reference, say, \Code{MyClass::MyClass (int i, const SomeClass\& object)},
and an allocation was made such as
\begin{verbatim}
SomeClass object = <some object>;
MyClass* something = new0<MyClass>(i, object);
\end{verbatim}
The compiler determined \Code{Param0} was \Code{int} and \Code{Param1} was
\Code{SomeClass}, {\em not} \Code{const SomeClass\&}. The generated code included
creating a temporary copy of \Code{object} and passing the copy to the \Code{MyClass}
constructor, which had some difficult-to-diagnose side effects. Realizing that the
root cause was template code generation rather than macro textual substitution, I
removed the support in \Code{Memory} for templated allocations of single objects.
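The deduction problem can be reproduced in a few lines. In this hypothetical
sketch (the class names follow the text, but the bodies are illustrations),
\Code{Param1} is deduced as \Code{SomeClass} by value, so the forwarding
function makes an extra copy that a direct \Code{new} call does not.
\begin{verbatim}
static int gCopyCount = 0;

struct SomeClass
{
    SomeClass () {}
    SomeClass (const SomeClass&) { ++gCopyCount; }
};

struct MyClass
{
    MyClass (int, const SomeClass&) {}
};

template <typename T, typename Param0, typename Param1>
T* New0 (Param0 p0, Param1 p1)
{
    // Param1 deduces to SomeClass, not const SomeClass&, so p1 is a copy.
    return new T(p0, p1);
}
\end{verbatim}
A direct \Code{new MyClass(1, object)} binds \Code{object} to the constant
reference without copying; \Code{New0<MyClass>(1, object)} copies it once.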
\vspace*{0.1in}
{\bf Hooks for User-Specified Allocations and Deallocations}
There is a static function \Code{Memory::Initialize} that allows the user to specify
low-level memory allocators and deallocators. Defaults are provided, namely,
\Code{Memory::DefaultAllocator}, which wraps \Code{malloc}, and
\Code{Memory::DefaultDeallocator}, which wraps \Code{free}. The functions provided
by the user must have parameters for the file name and line number, even if the user
is not interested in this information. The hooks for allocation and deallocation
allow you to provide a fixed-size heap when you want to insist on memory budgets
for the components of your application.
\vspace*{0.1in}
{\bf Memory Tracking}
The \Code{Memory} class maintains a map of the memory that is currently allocated
by Wild Magic; see static member \Code{msMap}. This map uses memory from the
global heap, so it is not part of any user-specified heap implied by the hooks to
low-level allocators and deallocators. To avoid pre-main allocation, \Code{msMap}
is a pointer to a map and must be allocated during initialization of the application.
This is performed in \Code{Memory::Initialization}, which is called in \Code{main}
in \Code{Wm5Application.cpp}. There is a matching \Code{Memory::Termination} function
that is also called in \Code{main}. Note that \Code{msMap} is shared data, so it must
be protected from concurrent accesses when running in a multithreaded environment.
\Code{Memory} provides a mutex for the critical sections that access \Code{msMap};
see static member \Code{msMutex}.
When a call is made to \Code{new0}, the overloaded \Code{operator new} is called.
The implementation is in \Code{Wm5Memory.h}. A trap is supplied to ensure that
\Code{msMap} was actually allocated; if the trap is activated, an assertion is
triggered to let you know that the map does not exist. The most likely event is
that you are trying to allocate memory before \Code{main} has been called (such
as global objects within file scope that require dynamic allocation of members).
In this event, the allocation does not fail (in release builds); rather, it just
uses \Code{malloc} and does not track the memory.
When the map does exist, the static member function \Code{Memory::CreateBlock} is
called. Its parameters are the number of bytes to be allocated and the dimension
of the request, which is zero for \Code{new0}. \Code{CreateBlock} has a critical
section that calls \Code{msAllocator}, which is either \Code{Memory::DefaultAllocator}
or an allocator supplied by the user via \Code{Memory::Initialize}. The address
of the allocated block is the key for the map entry and a \Code{Memory::Information}
object is created to be the value for the map entry. The information object stores
the number of bytes requested, the number of dimensions, the file name, and the line
number for which the request was made.
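A stripped-down version of this bookkeeping, with hypothetical names and none
of the locking or hook indirection, looks like:
\begin{verbatim}
#include <cstddef>
#include <cstdlib>
#include <map>

struct Information
{
    std::size_t mNumBytes;
    int mNumDimensions;
    const char* mFile;
    int mLine;
};

typedef std::map<void*, Information> MemoryMap;
static MemoryMap gMap;

void* CreateBlock (std::size_t numBytes, int numDimensions,
    const char* file, int line)
{
    void* block = std::malloc(numBytes);  // stands in for msAllocator
    Information info = { numBytes, numDimensions, file, line };
    gMap[block] = info;  // the address keys the request information
    return block;
}
\end{verbatim}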
When a call is made to \Code{new1}, more work must occur than for \Code{new0}.
A trap also occurs in \Code{Memory::New1} for an allocation request that is made
before \Code{msMap} exists. If the request is made pre-\Code{main}, then the standard
C$++$ \Code{new[]} function is called and the memory is not tracked. I recommend
that you not allocate pre-\Code{main}, because it makes for more predictable debugging
(in a single-threaded environment) when all allocations occur within the scope of
\Code{main} (including any of the functions it calls).
If the allocation request is made when the map exists, the allocation in
\Code{Memory::CreateBlock} uses low-level C-style memory allocation (\Code{malloc}
by default). However, the call to \Code{new1} is for an array of objects that must
then be default constructed. This is accomplished by calling the placement-new
operator.
\scriptsize
\begin{verbatim}
template <typename T>
T* Memory::New1 (const int bound0)
{
    T* data;
    if (msMap)
    {
        // Insert T[] into memory map.
        data = (T*)CreateBlock(bound0*sizeof(T), 1);

        // Call the default constructors for T.
        T* object = data;
        for (int i = 0; i < bound0; ++i, ++object)
        {
            ::new(object) T;  // THE PLACEMENT-NEW CALL
        }
    }
    else
    {
#ifdef WM5_USE_MEMORY_ASSERT_ON_PREMAIN_POSTMAIN_OPERATIONS
        assertion(false, "Pre-main allocations are not tracked.\n");
#endif
        data = new T[bound0];
    }
    return data;
}
\end{verbatim}
\normalsize
The implementations for \Code{New2}, \Code{New3}, and \Code{New4} are similar.
The implementations of \Code{Delete0} through \Code{Delete4} have a similar flavor.
If the map does not exist when a deletion is requested, most likely the problem
is post-main deallocation. A trap is set for this and, if encountered, the memory
is deleted using the standard C$++$ \Code{delete} operator. If the map does exist,
then a critical section is entered and \Code{msMap} is searched for the
address-information pair that should be in the map--the memory was allocated at some
previous time. It is possible that the pair is not in the map, perhaps because of
a double deletion, in which case an assertion is triggered. In release configurations, the deletion is
actually made using the standard C$++$ \Code{delete} operator. (It is possible that
\Code{new} was used to allocate but \Code{delete0} was used to deallocate.)
When the pair exists in the map, a comparison is made between the \Code{Information}
member for number of dimensions and the dimension implied by the \Code{deleteN} call
(\Code{N} is 0, 1, 2, 3, or 4). If there is a mismatch, an assertion is triggered.
The goal is to provide debugging support to let the user know that there is a mismatch
in allocation and deallocation calls.
Assuming the pair exists and the dimensions match, the object must be destroyed.
Because this is not an implicit generation by the compiler of a destructor call,
an explicit destruction call must be made. For example,
\scriptsize
\begin{verbatim}
template <typename T>
void Memory::Delete0 (T*& data)
{
    if (data)
    {
        if (!msMap)
        {
#ifdef WM5_USE_MEMORY_ASSERT_ON_PREMAIN_POSTMAIN_OPERATIONS
            assertion(false, "Post-main deallocations are not tracked.\n");
#endif
            delete data;
            data = 0;
            return;
        }

        msMutex.Enter();
        MemoryMap::iterator iter = msMap->find(data);
        if (iter != msMap->end())
        {
            if (iter->second.mNumDimensions == 0)
            {
                // Call destructor for T. If T is a pointer type, the
                // compiler will not generate any code for the destructor
                // call.
                data->~T();  // EXPLICIT CALL TO THE DESTRUCTOR

                // Remove T from memory map.
                msMap->erase(data);
                msDeallocator(data, mFile, mLine);
            }
            else
            {
                assertion(false, "Mismatch in dimensions.\n");
            }
        }
        else
        {
#ifdef WM5_USE_MEMORY_ALLOW_DELETE_ON_FAILED_MAP_LOOKUP
            delete data;
#else
            assertion(false, "Memory block not in map.\n");
#endif
        }
        data = 0;
        msMutex.Leave();
    }
}
\end{verbatim}
\normalsize
After the object or objects are destroyed, the address-information pair is removed from the map.
Finally, the memory is deallocated by a call to \Code{msDeallocator}, which is either
\Code{Memory::DefaultDeallocator} or a function provided by the user in the call to
\Code{Memory::Initialize}.
\vspace*{0.1in}
{\bf Fallback to Standard C$++$ Calls}
Enabling or disabling the WM5 memory tracking system is accomplished by
symbols in \Code{Wm5CoreLIB.h}. The default is that it is enabled in debug
configurations, whereby \Code{WM5\_USE\_MEMORY} is defined. When the memory system
is disabled, the macros \Code{new0} through \Code{new4} and \Code{delete0} through
\Code{delete4} are expanded to inline function calls. The signatures are provided
in \Code{Wm5Memory.h} and the implementations are in \Code{Wm5Memory.inl}. These
functions only use C$++$ \Code{new} and \Code{delete} calls; in fact, the class
\Code{Memory} is not even defined when the memory system is disabled.
\vspace*{0.1in}
{\bf Smart Pointers}
WM4 has a reference-counting system that is implemented in class \Code{SmartPointer}.
This system is tied to the base class \Code{Object}. In particular, each \Code{Object}
manages its own reference count. Firstly, this is not thread safe: you can have a
race condition when two threads attempt to modify the reference count of an
object they both access. Secondly, this ties the reference
counting to the Wild Magic graphics library. Thirdly, the smart pointers work only
for single objects. Arrays of objects must be handled differently; for example, see
the \Code{Wm4TSharedArray} class.
In WM5, the smart pointers are thread safe, the reference counting is external (not
part of some base class for object-oriented support), and there are various smart
pointer classes to support sharing of arrays as well as single objects. The
implementation is in files \Code{Wm5SmartPointer.*}.
The base class for smart pointers is \Code{PointerBase}. This is similar to the
\Code{Memory} class in that a map is used to keep track of objects that are
currently reference counted (the references that were managed internally by the WM4
\Code{Object}s are now managed by an external system). One difference, though,
is that the \Code{msMap} member is an object, not a pointer. You may not create
reference-counted objects pre-\Code{main}, and they may not be destroyed
post-\Code{main}--I could modify the system to allow this, but for ease of
debugging it is better not to allocate or deallocate outside the scope of \Code{main}.
The derived class \Code{Pointer0} of WM5 is equivalent to the WM4 class \Code{Pointer}.
The suffix of $0$ denotes that this class is for sharing of single objects (0-dimensional).
The derived class \Code{Pointer1} is used to share 1-dimensional arrays. There is no
need for a separate class such as \Code{Wm4TSharedArray}. Other smart pointer classes
exist for sharing 2-, 3-, and 4-dimensional arrays.
The semantics are the same as they were in WM4. When an object is shared by someone new,
the (external) reference count is incremented. When a shared object goes out of scope,
its (external) reference count is decremented. When the reference count becomes zero,
the object is deleted/deallocated. The code has traps for various unexpected conditions,
and asserts are triggered accordingly.
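The external counting just described can be sketched as follows, assuming a
simplified interface; the map and function names are hypothetical and
\Code{std::mutex} stands in for the engine's \Code{Mutex}.
\begin{verbatim}
#include <map>
#include <mutex>

// Reference counts live outside the shared objects, keyed by address.
static std::map<void*, int> gRefMap;
static std::mutex gRefMutex;

void IncrementRef (void* object)
{
    std::lock_guard<std::mutex> lock(gRefMutex);
    ++gRefMap[object];  // inserts with count 0, then increments
}

// Returns true when the count reaches zero; the caller should delete.
bool DecrementRef (void* object)
{
    std::lock_guard<std::mutex> lock(gRefMutex);
    if (--gRefMap[object] == 0)
    {
        gRefMap.erase(object);
        return true;
    }
    return false;
}
\end{verbatim}
Because the counts are external, no base class is required of the shared
objects, which is what decouples the smart pointers from the graphics library.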
\subsection{ObjectSystems}
\subsubsection{Initialization and Termination}
WM4 provides the ability for each class to have static initialization and termination
functions. These are registered pre-\Code{main}. The initializers are executed after
\Code{main} begins but before the application starts (before \Code{Application::Run} is
executed). The terminators are executed after the application finishes but before
\Code{main} ends. This allows you to have better predictability of what your application
is doing--you have no control over the order of pre-main initialization calls and
post-main termination calls that are generated by the compiler. WM5 uses the same
system for initialization and termination.
\subsubsection{The Object Base Class}
Just like WM4, WM5 has a base class called \Code{Object} that provides various
services for large libraries. The class supports run-time type information (RTTI),
naming of objects, and streaming. The WM4 base class also had the foundation for
smart pointers, but in WM5 the smart pointer system is external (not part of
\Code{Object}).
RTTI and naming remain unchanged from WM4 to WM5. However, the streaming system
was significantly revamped. From a high-level perspective, the interface functions
for streaming are the same (although I skipped porting the \Code{StringTree} code).
However, the streaming is now factored into input streaming and output streaming.
The linker pass has had a major overhaul (described later).
The WM5 streaming system has a new feature that turned out to be necessary
when I painted myself into a corner. The loading system used the default
constructor for \Code{Object}-derived classes to generate an object via a
factory. This object was then assigned data that was loaded from disk. There
are times where the default constructor performs significant work, such as
memory allocation. The loading system really needed a ``clean object'' created.
In the case of default construction that contains memory allocation, some
hard-to-track memory leaks were occurring. The load-data-and-assign-to-object
paradigm itself was allocating memory for various members and overwriting the
pointers to memory allocated by the default constructor. To circumvent this
subtlety, \Code{Object} has an enumeration \Code{LoadConstructor} with a single
member \Code{LC\_LOADER}. There is a constructor \Code{Object(LoadConstructor)}
and each derived class must have such a constructor. These are now what the
loading system uses, so you do not have to worry about loading interfering
with the default constructor semantics.
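The idiom can be illustrated with a hypothetical derived class (the class
bodies here are illustrations, not engine code). With the default constructor,
the loader's assignment would leak the array allocated during construction;
the \Code{LC\_LOADER} constructor produces a clean object.
\begin{verbatim}
struct Object
{
    enum LoadConstructor { LC_LOADER };
    Object () {}
    Object (LoadConstructor) {}
};

struct Node : public Object
{
    int* mChildren;

    // Default construction does real work (allocation).
    Node () : mChildren(new int[8]) {}

    // The loader's "clean object" constructor performs no allocation.
    Node (LoadConstructor lc) : Object(lc), mChildren(0) {}

    ~Node () { delete[] mChildren; }
};

Node* LoadNode (int* loadedChildren)
{
    Node* node = new Node(Object::LC_LOADER);
    node->mChildren = loadedChildren;  // nothing is overwritten and leaked
    return node;
}
\end{verbatim}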
\subsubsection{Run-Time Type Information}
Support for run-time type information has not changed from that of WM4. The
template functions \Code{StaticCast} and \Code{DynamicCast} still exist. The
\Code{Object} members \Code{IsExactly}, \Code{IsDerived}, \Code{IsExactlyTypeOf},
and \Code{IsDerivedTypeOf} still exist.
\subsubsection{Object Names}
Support for object names has not changed from that of WM4. The \Code{Object}
members \Code{GetObjectByName} and \Code{GetAllObjectsByName} still exist.
\subsubsection{Streaming}
The streaming system was factored into support for input streams (reading from
disk or from buffer) and for output streams (writing to disk or to buffer). The
public interfaces are reduced to the bare essentials.
The input streaming is implemented in class \Code{InStream}. You can create and
destroy such objects. You can either load objects from a buffer (in memory) or from
a file (on disk). Once objects are loaded, you can access them via the member
functions \Code{GetNumObjects} and \Code{GetObjectAt}. The low-level reading
functions are templatized. Specializations of some of these are provided by
other classes (in the graphics library), specifically those that are aggregates
of native types.
The output streaming is implemented in class \Code{OutStream}. You can create and
destroy such objects. You can either save objects to a buffer (in memory) or to
a file (on disk). Once an output stream is created, you can insert objects to
be streamed via the member function \Code{Insert}. The low-level writing
functions are templatized. Specializations of some of these are provided by
other classes (in the graphics library), specifically those that are aggregates
of native types.
The linker system was overhauled. In WM4, \Code{Object*} pointers were written
to disk for output streaming. The written data was simply the memory addresses.
When a file was loaded for input streaming, the memory addresses in the file
were of course no longer valid, but they were used as unique identifiers
for the objects. For each unique identifier, an \Code{Object} is created and
paired with the identifier. After all \Code{Object}s are created (the loading
phase), any \Code{Object*} data members contain the unique identifiers. The
linker phase then kicks in and the unique identifiers are replaced by the
actual memory addresses for the corresponding objects.
Two problems occur with this system. Firstly, I had to account for the fact that
some computers have 32-bit addresses and others have 64-bit addresses. Each
memory address was packed into 64 bits on writing and the unique identifiers
were extracted from 64 bits on reading. Secondly, the same scene graph saved twice
can lead to two files on disk that a byte-by-byte difference program
reports as different. For example, if you run an application and save
the scene, then re-run the application and save the scene again, the streamed
files can have differences because memory addresses of the \Code{Object}s are
different {\em even though the scenes are the same at a high level}.
All that is necessary is that a unique identifier be assigned to an \Code{Object*}
during a save operation and that this identifier be written to disk. Moreover,
the generation of the unique identifier must not depend on application state
(such as memory addresses). The WM5 linker system does this. Now when you
stream the same scene graph to disk multiple times, those files are the same
byte-by-byte. (This assumes the saves are to the same endian-order platform.)
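A minimal sketch of the identifier assignment (the names are hypothetical; the
real system lives in the output-streaming code): identifiers are issued
sequentially per save, so they are independent of memory addresses and
reproducible across runs.
\begin{verbatim}
#include <map>

typedef std::map<const void*, unsigned int> SaveIDMap;
static SaveIDMap gSaveIDs;

unsigned int GetUniqueID (const void* object)
{
    SaveIDMap::iterator iter = gSaveIDs.find(object);
    if (iter != gSaveIDs.end())
    {
        return iter->second;  // already assigned during this save
    }
    unsigned int id = (unsigned int)gSaveIDs.size() + 1;  // 0 for null
    gSaveIDs[object] = id;
    return id;
}
\end{verbatim}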
\subsection{Threading}
I added support for mutexes and the {\em hooks} for threads. Class \Code{Mutex}
is provided for a standard mutex; see files \Code{Wm5Mutex.*}. The mutex details
depend on platform, which are encapsulated in \Code{Wm5MutexType.h}. On Windows,
the mutex type is made opaque by using \Code{void*}, but in the implementation it
is of type \Code{HANDLE}. On Macintosh and Linux, the pthread support is used for
POSIX threads and mutexes. If you want a scoped critical section (the mutex is
released when the scoped object goes out of scope), see \Code{Wm5ScopedCS.*}.
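A scoped critical section can be sketched as below, assuming the
\Code{Enter}/\Code{Leave} interface described above. This is an illustration,
not the \Code{Wm5ScopedCS} source; \Code{std::mutex} stands in for the
platform-specific type, and the counters exist only to make the behavior
observable.
\begin{verbatim}
#include <mutex>

class Mutex
{
public:
    Mutex () : mEnterCount(0), mLeaveCount(0) {}
    void Enter () { mImpl.lock(); ++mEnterCount; }
    void Leave () { ++mLeaveCount; mImpl.unlock(); }
    int mEnterCount, mLeaveCount;  // exposed only for demonstration
private:
    std::mutex mImpl;
};

class ScopedCS
{
public:
    ScopedCS (Mutex& mutex) : mMutex(mutex) { mMutex.Enter(); }
    ~ScopedCS () { mMutex.Leave(); }  // released on scope exit
private:
    Mutex& mMutex;
};
\end{verbatim}
The benefit is that every return path out of the scope, including exceptions,
releases the mutex.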
Thread types are also platform dependent; see \Code{Wm5ThreadType.h}. On Windows,
the thread type is made opaque by using \Code{void*}. In the implementation it is
a \Code{HANDLE}. On Macintosh and Linux, the type is \Code{pthread\_t}. I have
the Windows implementation started, but I have not yet provided examples that use
it. Over time, I will start the process of threading the engine code.
\subsection{Time}
I have only simple support for time measurements, in \Code{Wm5Time.*}. The function
\Code{GetTimeInMicroseconds} is a wrapper for basic time measurements, but it does
not use a high-resolution timer. There is also \Code{GetTimeInSeconds}. Eventually,
I will add platform-dependent support for high-resolution timers. The current
functions suffice for simple frame-rate monitoring.
\section{LibMathematics}
The mathematics code was factored out of the WM4 LibFoundation library into its
own library. The folder organization has changed. The \Code{WildMagic4/Mathematics}
folder was split into \Code{WildMagic5/Base}, \Code{WildMagic5/Algebra},
\Code{WildMagic5/Objects2D}, and \Code{WildMagic5/Objects3D}.
\subsection{Base}
The \Code{Base} folder contains the \Code{Math} class in files \Code{Wm5Math.*}.
The bit hack functions are in \Code{Wm5BitHacks.*}. Classes \Code{Float1}, \Code{Float2},
\Code{Float3}, and \Code{Float4} were added to support the graphics library.
These are simple classes derived from the \Code{Tuple} template class in
\Code{LibCore} and provide specialized constructors and assignment.
\subsection{Objects2D}
The old \Code{Mathematics} folder contained classes for 2D objects. These
classes and files were moved to the new \Code{Objects2D} folder.
\subsection{Objects3D}
The old \Code{Mathematics} folder contained classes for 3D objects. These
classes and files were moved to the new \Code{Objects3D} folder.
\subsection{Algebra}
\subsubsection{Vector and Matrix Classes}
The algebra classes used most by WM4 were moved to \Code{Algebra}. These include
\Code{Vector2}, \Code{Vector3}, \Code{Vector4}, \Code{Matrix2}, \Code{Matrix3},
\Code{Matrix4}, and \Code{Quaternion}.
\subsubsection{Classes to Support Numerical Computations}
Classes supporting numerical computations were moved to the \Code{Algebra}
folder. These include \Code{Polynomial1}, \Code{GVector}, \Code{GMatrix},
and \Code{BandedMatrix}.
\subsubsection{New Classes for Affine and Homogeneous Algebra}
The \Code{Algebra} folder contains new files for new classes. The main idea
is that the data of the classes are 4-tuples or $4 \times 4$ matrices, all
components of type \Code{float}, which will eventually be set up for SIMD
computations. (At the moment they are not set up for SIMD.) The
\Code{Vector} and \Code{Matrix} classes remain templates that
can support 32-bit \Code{float} and 64-bit \Code{double}.
\Code{AVector} represents 3D vectors but stored as 4-tuples of the form
$(x,y,z,0)$. \Code{APoint} represents 3D points but stored as 4-tuples of
the form $(x,y,z,1)$. \Code{HPoint} represents homogeneous 4-tuples of the
form $(x,y,z,w)$. \Code{HMatrix} represents homogeneous $4 \times 4$
matrices. \Code{HQuaternion} is not much different from \Code{Quaternion},
but the idea was to encapsulate the planned SIMD code computations in
\Code{HQuaternion}. \Code{HPlane} represents a plane as a 4-tuple.
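The payoff of these conventions is that the affine identities fall out of
ordinary componentwise arithmetic on the fourth component. The following
hypothetical stand-ins (not the Wm5 classes) illustrate \Code{APoint} minus
\Code{APoint} yielding an \Code{AVector}:
\begin{verbatim}
struct Tuple4 { float x, y, z, w; };

// Points carry w = 1; vectors carry w = 0.
Tuple4 MakePoint (float x, float y, float z)
{
    Tuple4 t = { x, y, z, 1.0f };  return t;
}
Tuple4 MakeVector (float x, float y, float z)
{
    Tuple4 t = { x, y, z, 0.0f };  return t;
}

Tuple4 Sub (const Tuple4& a, const Tuple4& b)
{
    Tuple4 t = { a.x - b.x, a.y - b.y, a.z - b.z, a.w - b.w };  return t;
}
Tuple4 Add (const Tuple4& a, const Tuple4& b)
{
    Tuple4 t = { a.x + b.x, a.y + b.y, a.z + b.z, a.w + b.w };  return t;
}
\end{verbatim}
Point minus point produces $w = 0$ (a vector) and point plus vector produces
$w = 1$ (a point), with no special-case logic.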
I originally used the Curiously Recurring Template Pattern (CRTP) for the
\Code{Vector} and \Code{Matrix} classes, but in my opinion the problems with
getting this to work properly on all the supported platforms were not worth
the effort. I ran into problems with the C$++$ requirement for template
classes derived from other template classes, which forces you either to scope
the base class with \Code{this->mSomeMember} or to add a \Code{using}
statement in the derived class to avoid the explicit scoping. I am still
of the opinion that having to scope base class members but not scope global
variables is backwards. The \Code{using} paradigm has its own problems,
because it can affect the public/protected/private mechanism. With the
vector and matrix classes, the Microsoft compiler had problems with
\Code{using}, complaining about certain base-class members not being
visible that were visible without \Code{using}. Having had enough
of this, I ripped out the CRTP and simply derived the
\Code{Vector} classes from \Code{Tuple} and the \Code{Matrix} classes
from \Code{Table}.
I added the \Code{struct Information} nested structures to \Code{Vector2}
and \Code{Vector3}. This information used to be in the WM4 \Code{Mapper2}
and \Code{Mapper3} classes and used by the computational geometry code.
I eliminated the mapper classes.
One of the annoyances with representing 3-tuples as 4-tuples is that there
are several situations in the graphics engine where you have to convert
from one to the other, especially with reading and writing vertex buffers.
The new classes have some constructors and implicit conversion operators to
support this, but I consider them an eyesore.
\subsection{CurvesSurfacesVolumes}
WM4 had separate folders, \Code{Curves} and \Code{Surfaces}, and some other
code for B-spline volumes. I consolidated all these files into a single
folder in WM5, \Code{CurvesSurfacesVolumes}.
\subsection{Distance}
Nothing has changed regarding functions for distance calculations. The
number of files and combinations are too numerous to summarize them here
in an effective manner.
\subsection{Intersection}
Nothing has changed regarding functions for intersection calculations. The
number of files and combinations are too numerous to summarize them here
in an effective manner.
\subsection{Approximation}
Nothing has changed regarding functions for approximations and fitting. The
number of files and combinations are too numerous to summarize them here
in an effective manner.
\subsection{Containment}
Nothing has changed regarding functions for containment. The
number of files and combinations are too numerous to summarize them here
in an effective manner.
\subsection{Interpolation}
Nothing has changed regarding functions for interpolation. The
number of files and combinations are too numerous to summarize them here
in an effective manner.
\subsection{NumericalAnalysis}
Nothing has changed regarding the numerical analysis code except that
I renamed the class \Code{Eigen} to \Code{EigenDecomposition}.
\subsection{Meshes}
Nothing has changed regarding the graph data structures for meshes.
\subsection{Rational}
The integer and rational arithmetic code was moved from the WM4
\Code{ComputationalGeometry} folder to its own folder. The reason is
that many other algorithms can use exact rational arithmetic, so no
reason to isolate it to the computational geometry folder.
Class \Code{Rational} has constructors and converters for \Code{float}
and \Code{double} to \Code{Rational}. These had not handled subnormal
(denormal) numbers, and in fact the conversions were quite slow.
In WM4 and WM5, \Code{Rational} now handles subnormal numbers and the
conversion code is a lot faster.
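To illustrate the idea behind such a conversion (this is a sketch, not the
WM5 \Code{Rational} interface), the following extracts an exact numerator
and denominator from a \Code{double} via \Code{std::frexp}. Because
\Code{frexp} normalizes subnormal inputs, no special case is needed for
them; the 64-bit denominator does limit the demonstration to exponents of
modest size, whereas a real \Code{Rational} uses arbitrary-precision
integers.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <utility>

// Hypothetical sketch (not the WM5 Rational API): convert a finite double
// to an exact numerator/denominator pair.  std::frexp writes d = m * 2^e
// with 0.5 <= |m| < 1; scaling m by 2^53 makes it an exact integer.  This
// path also handles subnormal inputs, because frexp normalizes them.
std::pair<std::int64_t, std::int64_t> ToRational (double d)
{
    int e;
    double m = std::frexp(d, &e);  // d = m * 2^e
    std::int64_t num = static_cast<std::int64_t>(std::ldexp(m, 53));
    e -= 53;
    // Strip trailing factors of two from the numerator.
    while (num != 0 && num % 2 == 0) { num /= 2; ++e; }
    // Fold the remaining power of two into numerator or denominator.
    // (Caveat: overflows for very small exponents; a real implementation
    // uses arbitrary-precision integers.)
    std::int64_t den = 1;
    for (; e > 0; --e) num *= 2;
    for (; e < 0; ++e) den *= 2;
    return {num, den};
}
```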
\subsection{Query}
The queries involve floating-point arithmetic, but some also involve
exact integer and rational arithmetic. I moved these to a separate
folder for the same reasons as the \Code{Rational} folder. The
computational geometry code is not the only code in the engine that
can benefit from exact arithmetic, so no reason to isolate the queries
to the computational geometry folder.
\subsection{ComputationalGeometry}
Other than moving the exact integer and rational arithmetic to a new
folder and the queries to a new folder, nothing has changed in this
folder.
\subsection{Miscellaneous}
Nothing has changed in this folder.
\section{LibGraphics}
The graphics library has the most significant changes of any library from
Wild Magic 4. LibGraphics of Wild Magic 5 is effectively a rewrite of its
predecessor.
\subsection{DataTypes}
The \Code{Bound} class has nearly the same interface as in WM4, except that
the sphere center is an \Code{APoint} rather than a \Code{Vector3f}. The
\Code{Bound::ComputeFromData} function now takes a generic pointer and a
stride to allow you to compute a bounding sphere from data that lives in
a vertex buffer. In WM4, the data was a contiguous array of 3-tuple
positions.
The \Code{Transform} class has much the same interface as in WM4. However,
the class stores a homogeneous matrix that is used by the graphics system.
This matrix is a composition of the translation, scale, and rotation (or
general matrix) components of the transformation. The class also stores the
inverse of the homogeneous matrix. This matrix is computed only when it is
needed. Once I add SIMD support, this class will have an option to use it
instead of the standard CPU computations.
The files \Code{Wm5HalfFloat.*} contain converters between 32-bit
floating-point numbers and 16-bit floating-point numbers. The latter are
stored as unsigned short integers. The conversion is useful for vertex
buffers and textures that want to use half floats.
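A minimal sketch of the float-to-half direction (this is not the actual
\Code{Wm5HalfFloat} code) packs the sign bit, a rebiased 5-bit exponent,
and the top 10 mantissa bits. It truncates rather than rounds, maps
overflow to infinity, and flushes values below the normal half range to
zero; a production converter also produces half subnormals and preserves
NaNs.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical sketch (not the Wm5HalfFloat API): pack a 32-bit float into
// the 16-bit half layout (1 sign bit, 5 exponent bits with bias 15, 10
// mantissa bits), stored as an unsigned short.
std::uint16_t FloatToHalf (float value)
{
    std::uint32_t bits;
    std::memcpy(&bits, &value, sizeof(bits));
    std::uint16_t sign = static_cast<std::uint16_t>((bits >> 16) & 0x8000u);
    std::int32_t exponent =
        static_cast<std::int32_t>((bits >> 23) & 0xFF) - 127 + 15;
    std::uint32_t mantissa = bits & 0x007FFFFFu;
    if (exponent >= 31) return sign | 0x7C00u;  // overflow -> infinity
    if (exponent <= 0)  return sign;            // underflow -> signed zero
    return sign | static_cast<std::uint16_t>(exponent << 10)
                | static_cast<std::uint16_t>(mantissa >> 13);
}
```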
The files \Code{Wm5Color.*} contain the implementation of a class
\Code{Color} that has all static members. This is used to convert between
various color formats for use by the WM5 texture system. Specifically, the
conversion is used for generating mipmaps on the CPU.
The streaming code in LibCore has classes \Code{InStream} and \Code{OutStream}
that contain some template member functions to support streaming of
{\em aggregate data}. For example, \Code{Bound} has an \Code{APoint}
member and a \Code{float} member. \Code{Transform} has several native
members. To stream these, the template member functions of \Code{InStream}
and \Code{OutStream} must be specialized; see the functions of the form
\Code{ReadAggregate*} and \Code{WriteAggregate*}. Specializations are in
the files \Code{Wm5SpecializedIO.*}.
The files \Code{Wm5Utility.*} contain only two functions that are used by
the \Code{SampleGraphics/CubeMaps} application.
\subsection{Resources}
The renderer has various resources that it manages. These include vertex
buffers, vertex formats, index buffers, render targets, and textures. The
\Code{Resources} folder stores the source files for these objects.
A vertex format describes the layout of a vertex in a vertex buffer. DirectX 9
has an interface for this, \Code{IDirect3DVertexDeclaration9}, and each
item of interest in the vertex format is a vertex element (position, normal,
color, texture coordinate, and so on), \Code{D3DVERTEXELEMENT9}. OpenGL does
not encapsulate this concept, so the WM5 OpenGL renderer creates its own
representation. The term render target is DirectX 9 terminology. OpenGL
uses the term framebuffer object. I flipped a coin to decide which term to
use--render target won.
The classes \Code{VertexBuffer}, \Code{IndexBuffer}, \Code{VertexFormat},
\Code{Texture1D}, \Code{Texture2D}, \Code{Texture3D}, \Code{TextureCube},
and \Code{RenderTarget} are all platform independent. The \Code{Renderer}
class is an abstract interface that has several member functions that
allow you to bind the platform-independent objects to platform-dependent
objects, the latter objects not visible to the application writer. The
platform-dependent objects are managed by the back-end renderers for
DirectX and OpenGL.
When working with vertex buffers, the vertex formats tell you how the
vertices are structured. The class \Code{VertexBufferAccessor} takes a
format-buffer pair and allows you to set/get the vertex buffer data.
This class has template member functions that allow you to access the
buffer data in whatever form is convenient to you. The sample applications
make heavy use of this class, so look at those applications for usage.
\subsection{Renderers}
The \Code{LibGraphics/Renderers} folder has files \Code{Wm5Renderer.*} that
contain the platform-independent abstract interface for rendering. Any
member functions that do not depend on the underlying graphics API are
implemented in \Code{Wm5Renderer.cpp}. Platform-dependent implementations
occur in several subfolders.
The \Code{Dx9Renderer} subfolder has a DirectX 9 implementation. There are
no implementations for DirectX 10 or DirectX 11.
The \Code{OpenGLRenderer} subfolder has an OpenGL implementation. Please be
aware that the shader system of Wild Magic 5 (and previous) uses OpenGL
extensions that were available before OpenGL 2.0 shipped. These extensions
are friendly to having an FX system that uses NVIDIA's Cg programs, and
the back-end DirectX and OpenGL renderers have very similar organization.
I have plans to move to OpenGL 2.0 and later, using GLSL instead of Cg.
See the last section of this document on
the future of Wild Magic.
OpenGL renderer creation and a few operations (swap buffers, for example) are
specific to the platform. The Microsoft Windows OpenGL portions (WGL) are
in the subfolder \Code{WglRenderer}. Macintosh OS X OpenGL portions (AGL)
are in the subfolder \Code{AglRenderer}. Linux OpenGL portions (for
X Windows) are in the subfolder \Code{GlxRenderer}.
The resource management member functions of \Code{Renderer} have names such
as \Code{Bind}, \Code{Unbind}, \Code{Enable}, \Code{Disable}, \Code{Lock},
and \Code{Unlock}. The \Code{Bind} call creates a platform-dependent
object that corresponds to the platform-independent resource. For example,
\Code{Bind} applied to a \Code{VertexBuffer} will create a corresponding
platform-dependent object \Code{PdrVertexBuffer}. Other calls support
{\em lazy creation}; for example, if you call \Code{Enable} for a
\Code{VertexBuffer} and the platform-dependent companion \Code{PdrVertexBuffer}
does not yet exist, one will be created automatically.
For most applications, you do not even need to worry about explicit calls
to the resource management functions. The rendering system will handle this
for you. One exception, though, is related to render targets. Sometimes it
is necessary to bind a render target explicitly so that its underlying
texture object is bound for use as a render target. If that texture object
is attached to an effect, and you draw an object using the effect {\em before}
the render target is created, the texture object is bound as a regular texture,
not as a render target. See the image processing samples for examples.
The \Code{Lock} and \Code{Unlock} calls were designed to allow you to access
vertex buffers, index buffers, and textures directly when they are in video
memory. However, each resource is backed by system memory, which you can
also access. If you modify the system memory for buffers and textures, the
\Code{Renderer} interface has \Code{Update} calls that cause the corresponding
video memory to be refreshed with the contents from system memory. If you
modify the video memory directly, the system memory and video memory are out
of sync. This may be of no concern in your application.
Originally, I planned not to back the resources with system memory, but then
I remembered that users reported that the DirectX 9 renderer of Wild Magic 4
does not handle lost devices. For example, if you have a DX9 application
running and then use CTRL-ALT-DELETE to launch the Windows Task Manager, the
application device is lost. When your application window is given the focus
again, DX9 requires you to recreate many (but not all) of the resources.
I found it quite annoying that the operating system would not manage the
video memory itself, forcing application writers to take on that responsibility.
Regardless, I added the system memory backing and now the DX9 renderer will
restore the resources. This is a serious waste of system memory. I did not
see this problem with OpenGL during WM5 development, but I was running only
on Windows Vista and Windows 7. Later, I read that Windows Vista and Windows
7 (via DirectX 10 or 11) do properly manage the video memory, but apparently
DirectX 9 still makes you manage the memory yourself. When I ship EmeraldGL
(see the last section of this document), you can select whether or not to
have a system memory backing.
WM4 had the ability to attach global render state (alpha, face culling,
depth buffering, polygon offset, stencil buffering, wire state) to \Code{Node}
objects. This state was propagated to the leaf geometry and to the attached
effect via a call to \Code{UpdateRS}. This system does not exist in WM5.
You can, however, specify a global render state for the \Code{Renderer}
that overrides that type of global render state when applying the shader
effects to the geometric objects that are being drawn. See the
\Code{Renderer} functions such as \Code{SetOverrideAlphaState}. The reason
for removing this is that it seemed unnatural to allow WM4 \Code{Spatial} and
\Code{Node} to contain render state when their primary purpose was to
manage hierarchical transformations and culling. After using WM5 for quite
some time now, I actually like the WM4 approach better and will restore it
(in EmeraldGL). A node hierarchy can very well manage multiple scopes
(transformation/culling, render state, global effects).
Related to this is the ability in WM4 to attach an effect to a \Code{Node}
object. This ability is also removed in WM5, but there is a new
renderer \Code{Draw} function that allows you to specify a global effect
that overrides any local effect in the geometry objects provided by the
visible set. The sample graphics applications for planar reflections and
planar shadows show how to use global effects. In fact, these samples use
multiple scene graphs (some folks seem to think that an application must
have only one scene graph, which has never been required in Wild Magic).
\subsection{Shaders}
The WM5 shader system and FX system have had significant rewriting from what
WM4 provided.
The \Code{LibGraphics/Shaders} folder contains global render state that is
nearly identical to that of WM4. The classes for this state are
\Code{AlphaState} (alpha blending), \Code{CullState} (face culling),
\Code{DepthState} (depth buffering), \Code{OffsetState} (polygon offset
for depth biasing), \Code{StencilState} (stencil buffering), and
\Code{WireState} (wireframe/solid mode for drawing).
The special FX system is encapsulated by classes \Code{VisualEffect} and
\Code{VisualEffectInstance}. Section \ref{subsec.shaderfx} already provided
some description of these. The FX system is similar to Cg FX and to HLSL
support in DirectX 9. An effect can have multiple techniques. A technique
is encapsulated by \Code{VisualTechnique} and can have multiple passes.
A pass is encapsulated by \Code{VisualPass}. Each such pass has a set of
global render state, a vertex shader, and a pixel shader. The vertex shader
is encapsulated by class \Code{VertexShader} and the pixel shader is
encapsulated by class \Code{PixelShader}. Both classes are derived from
class \Code{Shader}.
A \Code{Shader} object contains an array of names for its inputs, whether
vertex attributes such as position, normal, and so on, or pixel
inputs (the outputs of the vertex shader). The object contains an array
of output names, also. Information about the shader constants and samplers
used by the shader programs is also maintained by \Code{Shader}. This
information is encapsulated in class \Code{ShaderParameters}, which allows
you to set/get the constants and textures. The shader constants live in
a system with base class \Code{ShaderFloat} (see the next section).
I still use Cg programs. The compiled shaders have text output that stores
information used by Cg Runtime. A tool called \Code{WmfxCompiler} ships with
WM5 that uses the Cg output to generate binary files for local effects to be
loaded by WM5. The files contain ASM code for {\em all} the profiles WM5
supports for both OpenGL and DirectX 9. Thus, one binary file (with
extension \Code{wmfx}) may be used regardless of graphics API. The \Code{Shader}
class stores all the program strings, registers, and texture units, and
selects an appropriate program based on which graphics API you are using
and the best profile your graphics card supports.
The program strings need not be generated and stored in a \Code{wmfx} file.
Many of the local effects in WM5 have these strings and other information
stored as class-static data. For basic applications, this means not having
to ship shader files as data for those applications. In WM4, to ship without
the \Code{wmsp} files, you would have to embed them as character strings in
the application/engine and then roll your own program loader/parser to
read from those strings rather than from disk.
\subsection{ShaderFloats}
The \Code{ShaderFloat} class was designed to encapsulate shader constants
and allow them to be streamed, just as other graphics resources can be
streamed. Most of the class interface is straightforward, allowing you
to set/get data in the various registers.
The class has four additional member functions that support updating of
constants during run time,
\begin{verbatim}
inline void EnableUpdater ();
inline void DisableUpdater ();
inline bool AllowUpdater () const;
virtual void Update (const Visual* visual, const Camera* camera);
\end{verbatim}
In the \Code{Renderer::Draw} function for a single \Code{Visual} object,
there is a loop over the passes of an effect. In that loop you will see
\begin{verbatim}
// Update any shader constants that vary during runtime.
vparams->UpdateConstants(visual, mCamera);
pparams->UpdateConstants(visual, mCamera);
\end{verbatim}
The function \Code{ShaderParameters::UpdateConstants} is the following:
\begin{verbatim}
void ShaderParameters::UpdateConstants (const Visual* visual,
    const Camera* camera)
{
    ShaderFloatPtr* constants = mConstants;
    for (int i = 0; i < mNumConstants; ++i, ++constants)
    {
        ShaderFloat* constant = *constants;
        if (constant->AllowUpdater())
        {
            constant->Update(visual, camera);
        }
    }
}
\end{verbatim}
The function iterates over the shader constants, querying each whether it
allows (needs) updating. If it does, then the \Code{ShaderFloat::Update}
function is called.
By default, the creation of a \Code{ShaderFloat} object does not allow
updating. For example, if you have a \Code{ShaderFloat} that manages a
specific color for a vertex shader, and that color never changes during
application execution, then there is no need to update the color. However,
some shader constants do vary at run time, most notably those associated
with the model-to-world matrix (map model coordinates into world coordinates)
and world-to-view matrix (map world coordinates to camera/view coordinates).
The \Code{ShaderFloat}-derived classes that encapsulate runtime-varying
constants should allow updates, either by setting \Code{mAllowUpdater} in the
constructors or by calling \Code{EnableUpdater}. Moreover, the derived
classes must override the virtual function \Code{ShaderFloat::Update} to
perform the appropriate calculations.
The matrices tend to vary at a rate of once per draw call,
so allowing the \Code{ShaderFloat::Update} call to occur always is the
right thing to do. Some shader constants, though, might vary less
frequently, in which case the \Code{Update} call needlessly consumes cycles.
For these constants, you can call \Code{DisableUpdater} so that the update
function is not called. When you change the value of the shader constant,
call \Code{EnableUpdater}, allow the draw to occur, and then call
\Code{DisableUpdater}. At the lowest level, you can call the update
function yourself when needed and leave the updater permanently
disabled--you, not the renderer, manage the shader constant.
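The pattern reduces to a flag check before each update. Here is a
standalone sketch (the class and the counter are hypothetical, not the
\Code{ShaderFloat} interface) showing why a disabled constant costs
nothing per draw call.

```cpp
#include <cassert>

// A standalone sketch of the updater-flag pattern (not the actual
// Wm5ShaderFloat interface).  The renderer calls Update only on constants
// whose flag is enabled, so rarely-changing constants are skipped.
class ConstantSketch
{
public:
    virtual ~ConstantSketch () = default;
    void EnableUpdater ()  { mAllowUpdater = true; }
    void DisableUpdater () { mAllowUpdater = false; }
    bool AllowUpdater () const { return mAllowUpdater; }
    virtual void Update () { ++mUpdateCount; }  // recompute the constant
    int GetUpdateCount () const { return mUpdateCount; }
private:
    bool mAllowUpdater = false;  // default: no per-draw update
    int mUpdateCount = 0;
};

// What the per-pass loop in UpdateConstants amounts to for one constant.
void UpdateIfAllowed (ConstantSketch& constant)
{
    if (constant.AllowUpdater())
    {
        constant.Update();
    }
}
```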
The \Code{ShaderFloat} folder contains a large number of derived classes.
The one you will use most often is \Code{PVWMatrixConstant}, which handles
the world-view-projection matrix. This matrix is the one used by a
typical vertex shader for mapping the model-space vertex position
to clip-space coordinates.
See the examples in the \Code{LocalEffects} folder for how to create a
\Code{VisualEffect}-derived class. In particular, you can see how to
create the vertex and pixel shaders and how to create the shader constants.
Creation of a shader constant requires you to provide a string name, the
same one used in the Cg program. Unlike WM4, which required you to name
your shader constants with specific names so that the FX system functions
correctly, WM5 allows you to name the shader constants anything you like.
Hooking them up with the engine becomes the responsibility of the constructor
for the effect.
\subsection{LocalEffects}
The \Code{LibGraphics/LocalEffects} folder contains several examples of
classes derived from \Code{VisualEffect}. These include basic vertex
coloring, texturing, and lighting. The lighting effects include per-vertex
effects and per-pixel effects. All these classes have hard-coded program
strings, registers, and texture units (as class-static data).
\subsection{GlobalEffects}
I use the term {\em global effect} to refer to drawing that involves
multiple geometric objects and requires direct access to the renderer
to manage the drawing. The examples implemented are for planar reflection
and planar shadows; sample graphics applications are provided for both.
The abstract base class \Code{GlobalEffect} has a pure virtual function
\begin{verbatim}
virtual void Draw (Renderer*, const VisibleSet&);
\end{verbatim}
that is implemented in each derived class. This function is called
by \Code{Renderer::Draw(const VisibleSet\&, GlobalEffect*)} when you pass
a non-null pointer via the \Code{GlobalEffect} parameter.
For example, the classes \Code{PlanarReflectionEffect} and
\Code{PlanarShadowEffect} implement the \Code{Draw} function. Much of that
code involves managing global render state for alpha blending, depth
buffering, and stencil buffering. It also makes high-level draw calls
to set camera matrices and to render the current visible set.
\subsection{ImageProcessing}
This code is new to Wild Magic and is more along the lines of using the
GPU for general-purpose programming. Some image processing, whether 2D or 3D,
can be done on the GPU using render targets. The prototypical case is to
apply Gaussian blurring to a 2D image. Two render targets are used. The
first target is loaded with the image. The image is Gaussian blurred using
a shader program and the output is drawn to the second target. This target
becomes the source for the next blurring pass, and the other target becomes
the destination. The targets alternate between being the source and
destination targets.
There is a significant amount of overhead in the setup for doing this. The
classes in the \Code{ImageProcessing} subfolder encapsulate the overhead
so that the application itself can focus on the specific details of the
filters it wants to use to process the image.
The base class \Code{ImageProcessing} contains the setup code common to both
2D and 3D image processing. The class \Code{ImageProcessing2} builds on top
of this by allowing you to select the type of boundary conditions for the
image filtering, currently Dirichlet or Neumann boundary conditions. The
class also has a drawing function that is called for the image processing.
A sample application that illustrates this code is for 2D Gaussian blurring,
\Code{SampleImagics/GpuGaussianBlur2}.
The class \Code{ImageProcessing3} is also derived from \Code{ImageProcessing}.
Image processing of 3D images has a few more technical details to consider
compared to 2D processing. On the CPU, a 3D image is typically stored in
{\em lexicographical order}. If the image has $b_0$ columns (index named
$x$), $b_1$ rows (index named $y$), and $b_2$ slices (index named $z$), then
the mapping from the three-dimensional image coordinate $(x,y,z)$ to linear
memory with index $i$ for lexicographical ordering is
$i = x + b_0 (y + b_1 z)$. The $z = 0$ slice is stored first in memory in
row-major order. The voxel ordering is such that $x$ varies the fastest,
$y$ next fastest:
\[
(0,0,0), (1,0,0), \ldots, (b_0-1,0,0), (0,1,0), \ldots,
(b_0-1, b_1-1, 0)
\]
The $z = 1$ slice follows this one, and so on. This mapping is not useful for
GPU computations on 3D images.
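The lexicographical mapping is a one-liner; a sketch (the function name is
chosen here for illustration):

```cpp
// Lexicographical mapping from voxel coordinate (x,y,z) to linear memory
// for an image with b0 columns and b1 rows: x varies fastest, then y,
// then z.  (Illustrative helper, not engine code.)
int LexicoIndex (int x, int y, int z, int b0, int b1)
{
    return x + b0 * (y + b1 * z);
}
```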
The 2D image processing naturally maps to render targets. Standard filtering,
such as for Gaussian blurring, uses finite differences to estimate derivatives.
For example, centered differences to estimate first-order partial derivatives
are
\[
\frac{\partial f(x,y)}{\partial x} \doteq \frac{f(x+h,y) - f(x-h,y)}{2h}, \;\;
\frac{\partial f(x,y)}{\partial y} \doteq \frac{f(x,y+h) - f(x,y-h)}{2h}
\]
for small $h$. Estimates for second-order partial derivatives are
\[
\begin{array}{lcl}
\frac{\partial^2 f(x,y)}{\partial x^2} & \doteq & \frac{f(x+h,y) - 2f(x,y) + f(x-h,y)}{h^2} \\
\frac{\partial^2 f(x,y)}{\partial y^2} & \doteq & \frac{f(x,y+h) - 2f(x,y) + f(x,y-h)}{h^2} \\
\frac{\partial^2 f(x,y)}{\partial x \partial y} & \doteq & \frac{f(x+h,y+h) + f(x-h,y-h) - f(x+h,y-h) - f(x-h,y+h)}{4h^2} \\
\end{array}
\]
Gaussian blurring is modeled by the linear heat equation,
\[
\frac{\partial f}{\partial t} = \frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2}
\]
for some time scale $t \geq 0$. The solution is a function $f(x,y,t)$ and the
initial condition is $f(x,y,0) = I(x,y)$, where $I(x,y)$ is your image. There
are boundary conditions to deal with, but for the sake of illustration, ignore
these for now. Using a forward difference in time and centered differences in
space, the heat equation is approximated by
\small
\[
\frac{f(x,y,t+k) - f(x,y,t)}{k} =
\frac{f(x+h,y,t) - 2f(x,y,t) + f(x-h,y,t)}{h^2} +
\frac{f(x,y+h,t) - 2f(x,y,t) + f(x,y-h,t)}{h^2}
\]
\normalsize
Solving for $f$ at time $t + k$,
\[
f(x,y,t+k) = \left(1 - \frac{4k}{h^2} \right)f(x,y,t) + \frac{k}{h^2} \left(
f(x+h,y,t) + f(x-h,y,t) + f(x,y+h,t) + f(x,y-h,t) \right)
\]
The left-hand side represents a slightly blurred version of the image
$f(x,y,t)$. If $f$ is stored as a texture in a render target, the right-hand
side becomes part of a pixel shader. The various $f$ terms are evaluated as
samples of the texture (5 such evaluations).
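As a CPU sketch of what one blurring pass computes per texel (the function
is illustrative, not engine code), the update above for interior samples
with $\mu = k/h^2$ is:

```cpp
#include <vector>

// One pass of the discretized heat equation on a w-by-h image in
// lexicographical order -- the same update a pixel shader performs per
// texel on the render target.  Interior samples only; boundary handling
// is a separate concern.  mu = k/h^2 must satisfy mu <= 1/4 for stability.
// (Illustrative sketch, not engine code.)
std::vector<float> BlurStep (const std::vector<float>& f, int w, int h,
    float mu)
{
    std::vector<float> g = f;
    for (int y = 1; y < h - 1; ++y)
    {
        for (int x = 1; x < w - 1; ++x)
        {
            g[x + w * y] = (1.0f - 4.0f * mu) * f[x + w * y]
                + mu * (f[(x + 1) + w * y] + f[(x - 1) + w * y]
                      + f[x + w * (y + 1)] + f[x + w * (y - 1)]);
        }
    }
    return g;
}
```

A constant image is a steady state of the heat equation, so a pass leaves
it unchanged; a single-texel spike spreads to its four neighbors.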
The graphics APIs do not have the concept of a 3D render target where the
underlying texture is a volume texture. However, the 3D image can be represented
as a tiled texture that is an array of 2D image slices. For example, consider
a $4 \times 4 \times 4$ image. The tiled texture is a $2 \times 2$ array of
$4 \times 4$ image slices. The tiles are ordered as
\begin{center}
\begin{tabular}{|c|c|} \hline
$z = 0$ & $z = 1$ \\ \hline
$z = 2$ & $z = 3$ \\ \hline
\end{tabular}
\end{center}
As an $8 \times 8$ texture with origin in the upper-left corner, the layout is
the following where the triples are $(x,y,z)$,
\begin{center}
\begin{tabular}{|c|c|c|c|c|c|c|c|} \hline
(0,0,0) & (1,0,0) & (2,0,0) & (3,0,0) & (0,0,1) & (1,0,1) & (2,0,1) & (3,0,1) \\ \hline
(0,1,0) & (1,1,0) & (2,1,0) & (3,1,0) & (0,1,1) & (1,1,1) & (2,1,1) & (3,1,1) \\ \hline
(0,2,0) & (1,2,0) & (2,2,0) & (3,2,0) & (0,2,1) & (1,2,1) & (2,2,1) & (3,2,1) \\ \hline
(0,3,0) & (1,3,0) & (2,3,0) & (3,3,0) & (0,3,1) & (1,3,1) & (2,3,1) & (3,3,1) \\ \hline
(0,0,2) & (1,0,2) & (2,0,2) & (3,0,2) & (0,0,3) & (1,0,3) & (2,0,3) & (3,0,3) \\ \hline
(0,1,2) & (1,1,2) & (2,1,2) & (3,1,2) & (0,1,3) & (1,1,3) & (2,1,3) & (3,1,3) \\ \hline
(0,2,2) & (1,2,2) & (2,2,2) & (3,2,2) & (0,2,3) & (1,2,3) & (2,2,3) & (3,2,3) \\ \hline
(0,3,2) & (1,3,2) & (2,3,2) & (3,3,2) & (0,3,3) & (1,3,3) & (2,3,3) & (3,3,3) \\ \hline
\end{tabular}
\end{center}
The top row has texture coordinates $(u,v)$ from left to right of $(0,0),
(1,0), \ldots, (7,0)$. The next row has texture coordinates from left to
right of $(0,1), (1,1), \ldots, (7,1)$. The other rows have similar mappings
to texture coordinates. The lexicographical mapping of the 3D image to 1D
memory is $i = x + 4(y + 4z)$. The memory locations are
\[
0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, \ldots
\]
In the tiled mapping, the texture is also stored in 1D memory. The ordering
is
\[
0, 1, 2, 3, 16, 17, 18, 19, 4, 5, 6, 7, 20, 21, 22, 23, \ldots
\]
The \Code{ImageProcessing3} class has several member functions for mapping
between $(x,y,z)$, $(u,v)$, and $i$. In particular, the function
\Code{CreateTiledImage} takes as input a 3D image in lexicographical order
and generates a 2D tiled texture (as in the previous example).
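A sketch of the underlying tile mapping (the function names are
hypothetical, not the actual \Code{ImageProcessing3} interface): slice $z$
lands in tile $(z \bmod t_0, \lfloor z/t_0 \rfloor)$ of a $t_0 \times t_1$
tile array, and the linear texel index follows from the tiled texture
width $t_0 b_0$.

```cpp
// Map voxel (x,y,z) of a b0-by-b1-by-b2 image to texel (u,v) of the tiled
// texture, which is a t0-by-t1 array of slices.  (Illustrative helpers,
// not the actual ImageProcessing3 interface.)
void VoxelToTexel (int x, int y, int z, int b0, int b1, int t0,
    int& u, int& v)
{
    u = x + b0 * (z % t0);
    v = y + b1 * (z / t0);
}

// Linear memory index of texel (u,v); the tiled texture width is t0*b0.
int TiledIndex (int u, int v, int b0, int t0)
{
    return u + (t0 * b0) * v;
}
```

For the $4 \times 4 \times 4$ example ($t_0 = 2$), voxel $(0,0,1)$ maps to
texel $(4,0)$ and tiled memory index $4$, matching the ordering listed
above.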
Given a tiled texture, the next problem is to compute the finite differences
for the filtering. For example, 3D Gaussian blurring is modeled by the linear
heat equation,
\[
\frac{\partial f}{\partial t} =
\frac{\partial^2 f}{\partial x^2} +
\frac{\partial^2 f}{\partial y^2} +
\frac{\partial^2 f}{\partial z^2}
\]
where the solution is $f(x,y,z,t)$ and the initial value is $f(x,y,z,0) =
I(x,y,z)$ with $I$ being the 3D image to blur. Finite difference estimates
are used, as in 2D, to obtain the numerical method
\[
\begin{array}{l}
f(x,y,z,t+k) =
\left(1 - \frac{6k}{h^2} \right)f(x,y,z,t) +
\frac{k}{h^2} \left( f(x+h,y,z,t) + f(x-h,y,z,t) \right. \\
\;\;\; \left. + f(x,y+h,z,t) + f(x,y-h,z,t) + f(x,y,z+h,t) + f(x,y,z-h,t) \right) \\
\end{array}
\]
To code this in a pixel shader, the right-hand side must be evaluated. Each $f$
term requires sampling the 2D tiled texture. Let the texture function be
$T(u,v)$. For example, evaluation of $f(1,1,0)$ requires sampling the
texture, $T(1,1)$. Evaluation of $f(0,1,1)$ requires sampling the texture,
$T(4,1)$. In the following discussion, the voxel spacing is $h = 1$.
There are two problems with the sampling. Firstly, consider the pixel shader
when the input is $(x,y,z) = (1,1,1)$. The evaluations of the function values
for $z = 1$ are texture samples,
$f(x+h,y,z) = f(2,1,1) = T(6,1)$,
$f(x-h,y,z) = f(0,1,1) = T(4,1)$,
$f(x,y+h,z) = f(1,2,1) = T(5,2)$, and
$f(x,y-h,z) = f(1,0,1) = T(5,0)$.
The texture samples are all at spatially close neighbors of $(x,y,1)$.
The numerical method also requires evaluating
$f(x,y,z+h) = f(1,1,2) = T(1,5)$ and
$f(x,y,z-h) = f(1,1,0) = T(1,1)$. These texture samples are not spatially
close to $(x,y,1)$. In order to shade the pixel at $(x,y,z)$, it is necessary
to have a dependent texture lookup. The image filtering is accomplished by
drawing to a render target using a square as the geometry, with the square
having texture coordinates $(0,0)$, $(1,0)$, $(0,1)$, and $(1,1)$. The
texture coordinates from the interpolation and that are passed to the
pixel shader are used to look up the $(u,v)$ values for sampling $T(u,v)$
that corresponds to $f(x,y,z)$. The lookup is into what I call an
{\em offset texture}.
Secondly, the boundary conditions come into play. Consider when the input
to the pixel shader is $(x,y,z) = (3,1,0)$. This is a {\em boundary voxel}
in the original 3D image. Evaluation of $f(x,y,z) = f(3,1,0) = T(3,1)$ is
just a sample of the tiled texture. However, $(x+h,y,z) = (4,1,0)$ is
{\em outside the 3D image}. You must decide how to handle boundary voxels
in the blurring. The two standard choices are Dirichlet boundary conditions
and Neumann boundary conditions.
Dirichlet boundary conditions involve specifying the $f$-values on the
boundary of the image to be a constant. If an $(x \pm h, y \pm h, z \pm h)$
input is outside the image domain, the $f$-evaluation just uses the
specified constant. We need to know, however, when an input to the pixel
shader is a boundary pixel. This involves creating another dependent texture
lookup. I call this texture a {\em mask texture}. The texture value is $1$
when the corresponding $(x,y,z)$ is an interior voxel and is $0$ when it is
a boundary voxel.
Neumann boundary conditions amount to clamping to the image boundary.
The evaluation of $f(x+h,y,z) = f(4,1,0)$ becomes an evaluation of
$f(3,1,0)$; that is, the $x$-value is clamped to $3$. This would be
equivalent to clamp mode for a volume texture, but because we are using
a tiled texture, the clamping has to be part of the offset texture
lookup described in the previous paragraph. Observe that any inputs
$(x,y,z)$ with $z = 0$ or $z = 3$ are boundary voxels.
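Both policies reduce to small per-coordinate computations. A CPU sketch
(names hypothetical, not the \Code{ImageProcessing3} interface) of Neumann
clamping and of the value the mask texture stores:

```cpp
#include <algorithm>

// Neumann boundary conditions: clamp each coordinate to the image domain,
// so f(4,1,0) in a 4x4x4 image reads f(3,1,0).  (Illustrative helper.)
int ClampCoord (int c, int bound)
{
    return std::min(std::max(c, 0), bound - 1);
}

// Dirichlet mask value for a b0-by-b1-by-b2 image: 1 for interior voxels,
// 0 for boundary voxels -- the same values the mask texture stores.
int MaskValue (int x, int y, int z, int b0, int b1, int b2)
{
    bool interior = (0 < x && x < b0 - 1)
        && (0 < y && y < b1 - 1)
        && (0 < z && z < b2 - 1);
    return interior ? 1 : 0;
}
```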
The class \Code{ImageProcessing3} has member functions to compute the
offset and mask textures based on which type of boundary conditions
you choose. An illustration of \Code{ImageProcessing3} is 3D
Gaussian blurring; see \Code{SampleImagics/GpuGaussianBlur3}.
The 2D and 3D Gaussian blurring samples do not use the mask texture. However,
the GPU-based fluid solver for 2D Navier-Stokes equations does (for what
is called {\em mixed boundary conditions}). See the sample
\Code{SamplePhysics/GpuFluids2D}. A class project recommended in
{\em Game Physics, 2nd edition} involves implementing \Code{GpuFluids3D}.
This also will use the offset and mask textures. Much of the foundation
needed to implement the 3D fluid solver is already built into
\Code{ImageProcessing3}.
\subsection{SceneGraph}
In Section \ref{subsec.shaderfx}, I already mentioned some key differences
between scene graph classes of WM4 and WM5. Most notable is the replacement
of the WM4 class \Code{Geometry} by the WM5 class \Code{Visual}. The
latter class removes the support for per-node global render state and global
effects, making it mainly a supporting class for hierarchical transformations
and culling. The \Code{Node} class and special \Code{Node}-derived classes
are as they were in WM4 (other than the removal of support for global
render state and global effects).
As mentioned in Section \ref{subsec.designchangelights}, the lighting
system has changed with the elimination of the ability to attach lights
to a scene. Class \Code{Light} is now just a container for the light
information, and the \Code{ShaderFloat}-derived classes for shader
constants include a variety of constants involving lights and materials.
The \Code{Camera} class has not changed much, but it does use \Code{APoint},
\Code{AVector}, and \Code{HMatrix} for affine and homogeneous entities. The
class now has support for specifying pre-view and post-projection matrices.
The standard matrix used to map from model space to clip space is $H = PVW$,
where $W$ maps model space to world space (the world matrix), $V$ maps
world space to camera/view space (the view matrix), and $P$ maps view
space to homogeneous clip space (the projection matrix). The product is
written with the convention that it is applied to column vectors on its
right, $PVW \Vector{x}$. Sometimes it is convenient to apply another
transformation to world space before the conversion to view space. The
prototypical example is a reflection matrix that is used for planar
reflections (see \Code{PlanarReflectionEffect}). Such a matrix $R$ is
referred to as a pre-view matrix because it is applied {\em before} the
view matrix is applied, $H = PVRW$. Sometimes it is convenient to apply
a transformation after the projection but before the perspective divide.
The prototypical example is a reflection matrix that is used for mirror
effects (replace $x$ by $-x$ for example). Such a matrix $R$ is referred
to as a post-projection matrix because it is applied {\em after} the
projection, $H = RPVW$.
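The composition just described can be sketched with a minimal row-major
$4 \times 4$ matrix type. \Code{Mat4} and \Code{ModelToClip} are
illustrative stand-ins, not WM5's \Code{HMatrix} or the actual
\Code{Camera} interface; passing identity for the pre-view and
post-projection inputs recovers the standard $H = PVW$.

```cpp
#include <array>

// Minimal row-major 4x4 matrix, purely for illustration.
using Mat4 = std::array<std::array<double,4>,4>;

Mat4 Mul(const Mat4& a, const Mat4& b)
{
    Mat4 c{};  // zero-initialized
    for (int r = 0; r < 4; ++r)
        for (int col = 0; col < 4; ++col)
            for (int k = 0; k < 4; ++k)
                c[r][col] += a[r][k] * b[k][col];
    return c;
}

Mat4 Identity()
{
    Mat4 m{};
    for (int i = 0; i < 4; ++i) m[i][i] = 1.0;
    return m;
}

// H = postProjection * P * V * preView * W, applied to column vectors
// on the right. With both extra matrices identity, H = P*V*W.
Mat4 ModelToClip(const Mat4& P, const Mat4& V, const Mat4& W,
                 const Mat4& preView, const Mat4& postProjection)
{
    return Mul(postProjection, Mul(P, Mul(V, Mul(preView, W))));
}
```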
The \Code{CameraNode} and \Code{LightNode} classes are the same as in WM4.
They allow you to attach a camera/light to a scene graph. For example,
you might have headlights on an automobile. The headlights have geometry
so you can draw them on the vehicle, and they have lights associated with
them that are used during rendering to illuminate anything they shine on.
The \Code{LightNode} is given a light and can have the headlight geometry
attached as a child. Another example is a security camera in a corner of
a room. The \Code{CameraNode} manages the \Code{Camera} position and
orientation, and the geometry that represents the physical box of the
camera is attached as a child.
The culling system has not changed. Classes \Code{Culler} and
\Code{VisibleSet} are as in WM4. The picking system also has not changed.
Classes \Code{PickRecord} and \Code{Picker} are as in WM4.
The geometric primitive classes are the same, although I changed the
name \Code{Polyline} to \Code{Polysegment}. Polylines are really multiple
segments, so why not call them polysegments? Regardless, the code
reorganization exposes the Microsoft Windows headers (when building on a
Windows PC), and the Windows name \Code{Polyline} clashed with my class
name. Rather than provide explicit scoping with the \Code{Wm5} namespace,
I just changed the name.
Two new classes were added. Class \Code{Projector} is derived from
\Code{Camera} and allows the projector to use a frustum with normalized
depth different from what the underlying graphics API requires. If using
OpenGL, the underlying normalized depths are in $[-1,1]$. But you can
have a projector object with depths of $[0,1]$.
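The remapping between the two depth conventions is a simple affine map;
the function names here are hypothetical, not part of the \Code{Projector}
interface.

```cpp
// Illustrative sketch: convert between OpenGL's normalized depth range
// [-1,1] and a projector's [0,1] range. These helpers are hypothetical,
// not the actual Projector API.
double RemapDepthGLToZeroOne(double zgl)
{
    return 0.5 * (zgl + 1.0);   // [-1,1] -> [0,1]
}

double RemapDepthZeroOneToGL(double z01)
{
    return 2.0 * z01 - 1.0;     // [0,1] -> [-1,1]
}
```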
The class \Code{ScreenTarget} provides support for creating standard
objects needed for drawing to a render target. This includes a screen-space
camera, the rectangle geometry for the quad to which the render target
is associated, and texture coordinates for that quad. This hides some
annoying differences between DirectX and OpenGL texture coordinate and
pixel coordinate handling.
\subsection{Controllers}
The controller system has the same design as in WM4, but I added two new
classes.
The \Code{TransformController} class is new and is designed to be a base
class for any controller that modifies \Code{Transform} objects. The
\Code{KeyframeController} class is now derived from the new class. This
fixed a subtle problem when a keyframe controller attached to a node
did not have keys to manage {\em all} of translation, rotation, and
scale. This never showed up in my Wild Magic 4 samples, but it did
when adding support for blended animations.
The other new class is \Code{BlendedTransformController}. This controller
allows you to manage two transform controllers and blend together the
keys. An illustration for using the class is in the new sample
application, \Code{SampleGraphics/BlendedAnimations}. This sample has
a skinned biped with two skin controllers (two triangle meshes) and with
keyframe controllers at a majority of the nodes of the biped. The biped
has an idle cycle, a walk cycle, and a run cycle. The sample shows how
to blend these for transitions between idle and walk and between walk
and run.
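As a sketch of the blending, assume keys that carry a translation and a
uniform scale; the real controller also blends rotations, typically with
quaternion slerp, but this hypothetical fragment shows only the linear
interpolation of the remaining channels.

```cpp
#include <array>

// Illustrative transform key: translation plus uniform scale. The names
// are hypothetical, not the actual BlendedTransformController interface.
struct Key
{
    std::array<double,3> translate;
    double uniformScale;
};

// Blend two keys with weight w in [0,1]; w = 0 yields k0, w = 1 yields k1.
Key BlendKeys(const Key& k0, const Key& k1, double w)
{
    Key out;
    for (int i = 0; i < 3; ++i)
        out.translate[i] = (1.0 - w) * k0.translate[i] + w * k1.translate[i];
    out.uniformScale = (1.0 - w) * k0.uniformScale + w * k1.uniformScale;
    return out;
}
```

Ramping $w$ from $0$ to $1$ over a short time interval produces the
transition effect, say, from the walk cycle to the run cycle.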
\subsection{Detail}
The level-of-detail classes have not changed. However, I rewrote the
\Code{CreateClodMesh} classes to account for the design changes for
vertex buffers. I also thought hard about the abstract problem of
the edge collapses, and I believe this rewrite produced more
readable source code. In particular, the WM4 version had a lot of
hand-rolled code for graph handling. I removed this and used as much
standard C$++$ library support (STL) as I could.
\subsection{Sorting}
The sorting code has not changed from that of WM4.
\subsection{CurvesSurfaces}
The code for supporting dynamically tessellated curves and surfaces was
mainly rewritten because of the design changes for vertex buffers. This
required some tedious changes to the internal workings, but from a
user's perspective, nothing has changed conceptually.
\subsection{Terrain}
I retired the \Code{ClodTerrain*} classes. That continuous level of
detail algorithm is quite old and not needed given the power and
memory of current generation graphics cards.
\section{LibPhysics}
I have made some changes to the physics library involving collision detection
and fluids.
\subsection{CollisionDetection}
The collision detection code used to live in the graphics library. I wanted
to move it to the physics library without causing a compiler dependency
between the two. To do this, the collision detection code has been converted
to use templates. The two template parameters are \Code{Mesh} and
\Code{Bound}. These classes must be instantiated with classes that include
the following interfaces.
\begin{verbatim}
// Class Mesh must have the following functions in its interface.
int GetNumVertices () const;
Float3 GetPosition (int i) const;
int GetNumTriangles () const;
bool GetTriangle (int triangle, int& i0, int& i1, int& i2) const;
bool GetModelTriangle (int triangle, APoint* modelTriangle) const;
bool GetWorldTriangle (int triangle, APoint* worldTriangle) const;
const Transform& GetWorldTransform () const;

// Class Bound must have the following functions in its interface.
Bound ();  // default constructor
void ComputeFromData (int numElements, int stride, const char* data);
void TransformBy (const Transform& transform, Bound& bound) const;
bool TestIntersection (const Bound& bound) const;
bool TestIntersection (const Bound& bound, float tmax,
    const AVector& velocity0, const AVector& velocity1) const;
\end{verbatim}
Of course, in Wild Magic you instantiate with \Code{TriMesh} and \Code{Bound}.
However, it is relatively easy to use other mesh and bound classes and add to
them the few interface functions required.
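As a sketch of how the templates consume this interface, the following
hypothetical function compiles against any class supplying
\Code{GetNumTriangles} and \Code{GetTriangle}, including the minimal
stand-in mesh shown with it; neither is part of the actual collision code.

```cpp
// Hypothetical template consumer of the Mesh interface: count the
// triangles the mesh reports as valid.
template <class Mesh>
int CountValidTriangles(const Mesh& mesh)
{
    int count = 0;
    int i0, i1, i2;
    for (int t = 0; t < mesh.GetNumTriangles(); ++t)
        if (mesh.GetTriangle(t, i0, i1, i2))
            ++count;
    return count;
}

// A minimal stand-in mesh, purely for illustration.
struct MockMesh
{
    int GetNumTriangles() const { return 2; }
    bool GetTriangle(int t, int& i0, int& i1, int& i2) const
    {
        i0 = t; i1 = t + 1; i2 = t + 2;
        return true;
    }
};
```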
WM4 had an \Code{Object}-derived class \Code{BoundingVolume} which is now
a non-\Code{Object}-derived class \Code{Bound}. The WM4 class
\Code{BoundingVolumeTree} is replaced by the template class \Code{BoundTree}.
The template class avoids having explicit derived classes such as
\Code{BoxBVTree} and \Code{SphereBVTree}. The \Code{Bound} template
parameter can represent any bounding volume container you choose to
implement.
Because \Code{BoundTree} is templated, the \Code{CollisionRecord} and
\Code{CollisionGroup} classes need to have the same template parameters.
Moreover, these classes have some requirements for the \Code{Mesh} template
parameter. Specifically, the mesh class needs to provide access to its
triangles.
\subsection{Fluid}
This is a new folder for the physics library. It contains a CPU-based
implementation for solving the Navier-Stokes equations in 2D and in 3D
on regular grids. Sample physics applications that use the solvers
are \Code{Fluids2D} and \Code{Fluids3D}. The description of the classes
and the sample applications are in {\em Game Physics, 2nd edition}.
\subsection{Intersection}
The code is essentially the same, but some class names changed. We now
have classes \Code{IntervalManager}, \Code{RectangleManager}, and \Code{BoxManager}
for the sort-and-sweep space-time coherent collision culling. {\em Game
Physics, 2nd edition} uses the new class names. The book also talks
about how \Code{BoxManager} can be implemented using multithreading,
using multiple cores (Xbox 360), and on specialized processors (SPUs on
PS3).
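The sort-and-sweep idea behind \Code{IntervalManager} can be sketched as
a one-shot version: sort the intervals by minimum endpoint, then sweep,
pairing each interval with every still-active interval whose maximum has
not yet been passed. The actual class maintains its sorted endpoint lists
incrementally to exploit temporal coherence; this fragment only
illustrates the overlap reporting.

```cpp
#include <algorithm>
#include <utility>
#include <vector>

// Illustrative one-shot sort-and-sweep over 1D intervals; reports all
// overlapping index pairs. Not the actual IntervalManager implementation.
struct Interval { double min, max; };

std::vector<std::pair<int,int>> OverlappingPairs(const std::vector<Interval>& v)
{
    std::vector<int> order(v.size());
    for (size_t i = 0; i < v.size(); ++i) order[i] = (int)i;
    std::sort(order.begin(), order.end(),
              [&](int a, int b){ return v[a].min < v[b].min; });

    std::vector<std::pair<int,int>> pairs;
    std::vector<int> active;
    for (int i : order)
    {
        // Retire intervals that end before the current one begins.
        active.erase(std::remove_if(active.begin(), active.end(),
            [&](int a){ return v[a].max < v[i].min; }), active.end());
        for (int a : active)
            pairs.emplace_back(a, i);
        active.push_back(i);
    }
    return pairs;
}
```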
\subsection{LCPSolver}
The LCP solver has not changed. Eventually, I hope to replace this with
an implementation of the velocity-based dynamics described in {\em Game
Physics, 2nd edition}.
\subsection{ParticleSystem}
The particle system code has not changed.
\subsection{RigidBody}
The rigid body code has not changed.
\section{LibImagics}
Nothing has changed in the LibImagics library. The WM4 and WM5 services are
exactly the same except that I recently fixed the performance problems with
the 2D and 3D connected component labelers. The WM4 fixes have been posted,
but the WM5 version will appear with the posting of the Wild Magic 5.2 patch.
\section{LibApplications}
The application layer has not changed much. I added a static member
\Code{Application::ThePath}. This stores the
path to the project folder of the application; you must support this by
providing a console/window title (string) that is a path to the project
folder relative to the path stored in the \Code{WM5\_PATH} environment
variable.
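As a sketch of the mechanism, the project folder is the \Code{WM5\_PATH}
environment variable joined with the relative path supplied as the
console/window title. The helper below is hypothetical, not the WM5 API,
and assumes forward-slash separators.

```cpp
#include <string>

// Illustrative sketch of joining WM5_PATH with the relative project path
// taken from the console/window title. Hypothetical, not the WM5 API.
std::string ComposeProjectPath(const std::string& wm5Path,
                               const std::string& relativeTitle)
{
    if (!wm5Path.empty() && wm5Path.back() != '/')
        return wm5Path + "/" + relativeTitle;
    return wm5Path + relativeTitle;
}
```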
A change that I have not yet posted for either WM4 or WM5 is the replacement
of the \Code{const char*} console/window title with \Code{std::string}.
If you need the console/window title to store other
information, such as an input file your application is processing, you can
safely change the string during an \Code{OnPrecreate} call without destroying
the environment-path mechanism that relies on knowing the project folder
location.
The \Code{main} function has been restructured based on the changes for
path finding. It also has specific calls to \Code{Initialize} and
\Code{Terminate} for the memory management system of WM5.
The Microsoft Windows stub is \Code{Wm5WinApplication.cpp} and serves
as the place where \Code{WindowApplication::Main} lives, whether you use DirectX or
OpenGL. This consolidates the Windows code into one source file (rather
than maintaining separate files for DirectX and OpenGL).
The \Code{Main} function has some new code. The \Code{Camera} class needs
its normalized depth model specified based on the graphics API. The redesign
of the \Code{Renderer} class and how a renderer is created affects the
initialization.
\section{Tools}
Only a few tools are provided right now.
\subsection{GenerateProjects}
This is similar to the same-named project I provided in WM4. You can use
this tool to generate the Microsoft Visual Studio 2008 \Code{vcproj} file
and the Xcode subfolder and project file for an application. These
project files have all the compiler settings and library link information
that are present in the sample applications. The usage is
\begin{verbatim}
GenerateProjects MyProjectName
\end{verbatim}
The output is \Code{MyProjectName\_VC90.vcproj} and a subfolder named
\Code{MyProjectName.xcodeproj}. The subfolder contains a file
\Code{project.pbxproj}. The subfolder can be copied to a Macintosh
(by network or by sneaker net).
\subsection{BmpToWmtf}
This is a simple tool that runs on Microsoft Windows. It loads a
24-bit BMP file and stores it as a Wild Magic 5 WMTF file, the
raw texture format for loading in WM5. The usage is
\begin{verbatim}
BmpToWmtf MyBmpFile
\end{verbatim}
The output format is \Code{Texture::TF\_A8R8G8B8} and the alpha channel
is filled with 255. If you want a constant alpha channel of your
own choosing, say, of value 128, use
\begin{verbatim}
BmpToWmtf -a 128 MyBmpFile
\end{verbatim}
The specified file must be given without the BMP extension (I need to fix
this and allow the extension or not).
\subsection{WmfxCompiler}
This tool generates Wild Magic 5 WMFX files that encapsulate the shader
programs for all the supported profiles. The tool calls the Cg compiler
for an FX file specified on the command line. It does so for the
profiles: \Code{vs\_1\_0}, \Code{vs\_2\_0}, \Code{vs\_3\_0},
\Code{arbvp1}, \Code{ps\_1\_0}, \Code{ps\_2\_0}, \Code{ps\_3\_0},
and \Code{arbfp1}. Whether all compiles succeed depends on the shader
model and what your shader programs are trying to do. Failure to compile
a profile does not cause the tool to abort. The output WMFX file contains
support for those profiles that were compiled successfully. I write
log files to indicate what has (or has not) happened. Of course, you
can still see the Cg warnings and errors when you run this tool.
Sometimes the profiles \Code{arbvp1} and \Code{arbfp1} are not enough
to compile a shader. For example, vertex texturing requires a profile
of \Code{vp40}. You can compile such shaders manually and either
hard-code them in the application code or manually generate a WMFX file.
\subsection{ObjMtlImporter}
This is a simple and not fully featured importer for the Wavefront
OBJ and MTL file formats. It has sufficed for me for basic geometry
and materials. The folder has only source code that you include in
your application. Later I will provide some sort of stand-alone tool.
Within your source code, you can query the loader class to obtain
relevant information about your vertices, triangles and materials.
\subsection{WmtfViewer}
This is a simple viewer for \Code{Texture2D} images. Eventually I can
add support to view cube maps and mipmap levels. For now, this tool is
useful for debugging render targets. You can save the texture of the
render target to disk and view it with this tool to see what is (or is
not) working correctly.
One warning: the code maps color channels to a normalized color range,
so the textures might have different hues than the original images that
were used to generate the WMTF files. I fixed this in a local copy
of the viewer and need to post it (in the Wild Magic 5.2 patch).
\subsection{BmpColorToGray}
This is probably not useful for graphics, but I use this to convert 24-bit
color BMP files to gray scale images for screen captures in my books.
\section{The Future of Wild Magic}
After years of maintaining an abstract rendering API that hides DirectX,
OpenGL, and software rendering, the conclusion is that each underlying API
suffers to some extent from the abstraction. Given my desire to provide
a cross-platform graphics engine, it makes sense to focus on OpenGL. As of
the time of writing this document, I have no plans to ship something called
Wild Magic 6.
This is not a judgment of whether OpenGL or DirectX is the better graphics
API. Supporting multiple platform-dependent renderers slows down the
evolution of the platform-independent engine code, so focusing on only
one graphics API should speed up new development. Given the requests for
graphics support on cell phones and given the abundance of OpenGL support for
desktops and embedded devices, it makes sense to abandon DirectX for now.
The Wild Magic source code will be split and evolve along two paths.
The mathematics portion of the source code will become part of a product
called the Malleable Mathematics Library. Most of this code is not graphics
related, and the emphasis will be on robustness, speed, and accuracy of the
implementations. This includes developing implementations that use any
SIMD support on the CPUs, that run on multiple cores, and that can use the
GPU as a general-purpose processor. When robustness and accuracy are of the
utmost importance and speed is not an issue, some of the algorithms will
have implementations that use exact rational arithmetic and/or arbitrary
precision floating-point arithmetic.
The graphics portion of the source code will become part of a product called
EmeraldGL. The renderer layer will still hide any explicit dependence on
OpenGL, but the hiding is relatively shallow and the architecture of the
renderer and graphics engine will be driven by the OpenGL/GLSL view of graphics.
This product will run on desktop computers (OpenGL 2.0 or later) and on
embedded devices (via OpenGL ES 2.0), using GLSL (or whatever variant
is necessary for embedded devices). Naturally, not everything that runs
on a desktop will run on an embedded device, but the engine will allow you
to work with either. EmeraldGL will have the minimal amount of code for
basic mathematics that graphics requires (points, vectors, matrices, planes,
quaternions) and will use SIMD and/or GPU when it makes sense.
Perhaps in the future I will return to supporting DirectX, maybe
creating EmeraldDX, but that remains to be seen.
\end{document}
|