\documentclass{article}
\input{Preamble}

\PdfDocument{Wild Magic 5 Overview}

\begin{document}

\PdfTitle{Wild Magic 5 Overview}{May 1, 2010}

This document provides a high-level overview of Wild Magic 5 and its
similarities and differences compared to Wild Magic 4.  This is not intended
to be a comprehensive description; consider it a brain dump of what I was
thinking for the various files and subsystems.  Your best bet for
understanding how to use Wild Magic 5 is to browse the sample applications
and see the engine in action.  If you have used Wild Magic 4, you can compare
those samples with their rewrites in Wild Magic 5.

\section{Introduction}

\subsection{Licensing}

The versions of Wild Magic prior to 4.10 used the LGPL Open Source license.
The license was changed to the Boost License for Wild Magic 4.10.  Wild Magic
5 also uses the Boost License.

\subsection{Naming Conventions}

Based on user feedback, the Microsoft-like Hungarian notation was removed.
The notation is now simpler, using only the prefixes \Code{m} for
nonstatic class data members, \Code{ms} for static class data members,
\Code{g} for nonstatic global data, and \Code{gs} for static global data.
Modern compilers and tools are quite good at allowing you to determine the
type of identifiers, usually via tool tips with a mouse-over of the
identifiers, so there is no reason to embed the type information in the
name.  Local identifiers within functions also no longer have embedded type
information.  The source code is easier to read.
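
As a small illustration of the four prefixes, consider the following sketch.
The class \Code{Counter} and every identifier in it are hypothetical and are
not part of the engine; only the prefix convention is taken from the text
above.

```cpp
// Hypothetical example of the naming prefixes; none of these identifiers
// exist in the engine.
int gNumFrames = 0;             // g:  nonstatic global data
static int gsNumInstances = 0;  // gs: static global data

class Counter
{
public:
    Counter (int start)
        :
        mValue(start)
    {
        ++msNumCreated;
        ++gsNumInstances;
    }

    int Increment () { return ++mValue; }
    int GetValue () const { return mValue; }
    static int GetNumCreated () { return msNumCreated; }

private:
    int mValue;               // m:  nonstatic class data member
    static int msNumCreated;  // ms: static class data member
};

int Counter::msNumCreated = 0;
```

Local variables and function parameters, such as \Code{start}, carry no
prefix at all.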

\subsection{Source Code Organization}

The code has been factored and reorganized.

The WM4 LibFoundation library was factored into two WM5 libraries: LibCore
and LibMathematics.  LibCore has basic system support, including assertion
handling, data types for tuples (1D arrays) and tables (2D arrays), file
and buffer input-output, memory management and smart pointers, object-oriented
support (base class \Code{Object}, file and buffer input-output, run-time type
information, streaming, and initialization-termination semantics), mutexes
and threads (the threading is not yet implemented), and time measurement.

LibMathematics contains just about everything else that lived in
LibFoundation.  Most of that code remains the same as in WM4 (except for
the naming conventions).

The WM4 LibGraphics library contained a platform-independent engine for
graphics.  An abstract class \Code{Renderer} lived in this library.  The WM4
LibRenderers folder contained projects with \Code{Renderer}-derived classes
for each graphics API of interest: \Code{Dx9Renderer} (DirectX 9 for Microsoft
Windows); \Code{OpenGLRenderer} with flavors \Code{WglRenderer} (Microsoft
Windows), \Code{AglRenderer} (Macintosh OS X), and \Code{GlxRenderer} (Linux
using X Windows); and \Code{SoftRenderer} with flavors \Code{WinSoftRenderer}
(Microsoft Windows), \Code{MacSoftRenderer} (Macintosh OS X), and
\Code{XSoftRenderer} (Linux using X Windows).  The main drawback to this
approach is that \Code{Renderer} contained a large number of virtual
functions.  In an application with a large number of calls to the virtual
functions, there is a performance hit due to those calls.  Specifically, there
are many data cache misses due to the lookup of the function pointers in the
virtual function table (the tables are global data).  WM5 has a concrete class
\Code{Renderer} that does not have virtual functions.  The class is
implemented for each graphics API.  The code for these APIs is also part of
WM5 LibGraphics.  The selection of the API is controlled via build
configurations.
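
The idea of one concrete class whose implementation is chosen by the build
configuration can be sketched as follows.  This is a minimal illustration, not
the engine's code: the symbols \Code{SKETCH\_USE\_DX9} and
\Code{SKETCH\_USE\_OPENGL}, the member functions, and the counter are all
assumptions made for the example.

```cpp
#include <string>

// Hypothetical build-configuration symbols.  A build system would define
// exactly one of them; the sketch defaults to the OpenGL path.
#if !defined(SKETCH_USE_DX9) && !defined(SKETCH_USE_OPENGL)
#define SKETCH_USE_OPENGL
#endif

// A single concrete class with no virtual functions.  The declaration is
// shared by all graphics APIs.
class Renderer
{
public:
    Renderer () : mNumTrianglesDrawn(0) {}

    std::string GetApiName () const;
    int DrawPrimitive (int numTriangles);

private:
    int mNumTrianglesDrawn;
};

#if defined(SKETCH_USE_DX9)
// This part would live in a DirectX-specific source file.
std::string Renderer::GetApiName () const { return "DirectX 9"; }

int Renderer::DrawPrimitive (int numTriangles)
{
    // DirectX draw calls would be issued here.
    mNumTrianglesDrawn += numTriangles;
    return mNumTrianglesDrawn;
}
#else
// This part would live in an OpenGL-specific source file.
std::string Renderer::GetApiName () const { return "OpenGL"; }

int Renderer::DrawPrimitive (int numTriangles)
{
    // OpenGL draw calls would be issued here.
    mNumTrianglesDrawn += numTriangles;
    return mNumTrianglesDrawn;
}
#endif
```

Because only one set of definitions is compiled into a given build, each call
is an ordinary (nonvirtual) member function call and no virtual function
table lookup is involved.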

The WM4 LibApplications library that provides a platform-independent
application layer did not change much in WM5.  The design of the application
layer is such that each platform (Microsoft Windows, Macintosh OS X, Linux)
implements an entry point that is called by code in class \Code{Application}.
The entry point implementation and any event handling are, of course, specific
to the platform.  The application library is mainly for the convenience of
supporting Wild Magic sample applications.  Although it can be used in shipping
applications, it was never intended for use this way.  I expected that users
would roll their own layer.

\subsection{LIB Header Files}

Each of the libraries LibCore, LibMathematics, LibGraphics, LibPhysics, and
LibImagics has a corresponding header file:  \Code{Wm5CoreLIB.h},
\Code{Wm5MathematicsLIB.h}, \Code{Wm5GraphicsLIB.h}, \Code{Wm5PhysicsLIB.h},
and \Code{Wm5ImagicsLIB.h}.  These header files contain preprocessor
commands that control the compilation of the libraries.  Users are encouraged
to modify these files to suit their own needs.

\subsubsection{Wm5CoreLIB.h}
\label{subsubsec.corelib}

The file \Code{Wm5CoreLIB.h} contains preprocessor commands to expose various
features that are dependent on the development platform (Microsoft Windows,
Macintosh OS X, Linux).  For example, one of the flags
\Code{WM5\_LITTLE\_ENDIAN} or \Code{WM5\_BIG\_ENDIAN} is exposed depending
on the byte order required by the CPU.  The only tested platform that has
a big-endian ordering is the Macintosh PowerPC G4/G5.  The other tested
platforms have little-endian ordering, including the Intel Macintosh.
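
To show why the byte order matters, the following sketch converts a 32-bit
value to little-endian storage order.  It is an illustration under stated
assumptions: the function names are hypothetical, and the sketch detects the
byte order at run time so that it is self-contained, whereas the engine
selects the behavior at compile time through the flags mentioned above.

```cpp
#include <cstdint>
#include <cstring>

// Run-time check of the host byte order (a stand-in for the compile-time
// flags described in the text).
inline bool IsLittleEndian ()
{
    const std::uint32_t probe = 1;
    unsigned char firstByte;
    std::memcpy(&firstByte, &probe, 1);
    return firstByte == 1;
}

// Reverse the four bytes of a 32-bit value.
inline std::uint32_t SwapBytes (std::uint32_t value)
{
    return (value >> 24)
        | ((value >> 8) & 0x0000FF00u)
        | ((value << 8) & 0x00FF0000u)
        | (value << 24);
}

// Produce a value whose in-memory bytes are in little-endian order,
// regardless of the host byte order.
inline std::uint32_t ToLittleEndian (std::uint32_t value)
{
    return IsLittleEndian() ? value : SwapBytes(value);
}
```

A file written through a conversion of this kind can be read on both
little-endian and big-endian machines, provided the reader applies the
inverse conversion.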

The header file contains declarations of some standard integer types when
compiling using Microsoft Visual Studio 2008.  I am patiently waiting for
consistent cross-platform support for \Code{stdint.h}.

Various headers from the C standard library and from the C$++$ standard library
are included for convenience.  Although generally you want to structure the
header inclusions to obtain minimal time for compilation, nearly all modern
compilers provide support for precompiled headers.  Having a large number of
includes in \Code{Wm5CoreLIB.h}, a file that is indirectly included in all
source files, will lead to a slow compile without precompiled headers.  However,
the precompiled header builds are quite fast.

The symbols \Code{public\_internal}, \Code{protected\_internal}, and
\Code{private\_internal} are defined to be the keywords \Code{public},
\Code{protected}, and \Code{private}, respectively.  This allows me to
use the \Code{*\_internal} symbols to designate sections within class
declarations that are intended for my internal use.  For example, sometimes a
class needs a subsystem to support the engine design, and that subsystem must
have public functions that are called within the engine.  Such functions are
tagged as \Code{public\_internal} to let the users know that I do not
intend for these to be called explicitly by applications.
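
A minimal sketch of the mechanism follows.  The three defines mirror the
description above; the class \Code{Texture} and its members are hypothetical
and serve only to show how a section might be tagged.

```cpp
// The symbols are plain aliases for the access specifiers, so they do not
// change what the compiler enforces.  They only document intent.
#define public_internal public
#define protected_internal protected
#define private_internal private

// Hypothetical class, not part of the engine.
class Texture
{
public:
    // Interface intended for applications.
    Texture (int width) : mWidth(width), mBound(false) {}
    int GetWidth () const { return mWidth; }

public_internal:
    // Public so that other engine subsystems can call these, but not
    // intended to be called explicitly by applications.
    void SetBound (bool bound) { mBound = bound; }
    bool IsBound () const { return mBound; }

private:
    int mWidth;
    bool mBound;
};
```

Nothing prevents an application from calling \Code{SetBound}; the tag is a
convention for readers, not an access restriction.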

Within Microsoft Visual Studio 2008, the newly defined symbols may be assigned
colors for syntax highlighting.  To change the color, edit the following file
\begin{verbatim}
  C:/Program Files/Microsoft Visual Studio 9.0/Common7/IDE/usertype.dat
\end{verbatim}
Add each identifier you want highlighted on a line by itself.  My file
contains
\begin{verbatim}
    public_internal
    protected_internal
    private_internal
    new0
    new1
    new2
    new3
    new4
    delete0
    delete1
    delete2
    delete3
    delete4
    assertion
\end{verbatim}
The additional symbols in this file for syntax highlighting are described
later in this document.  In Visual Studio, select the menu item
\begin{verbatim}
  Tools | Options ...
\end{verbatim}
In the Options dialog that appears, expand the Environment item and
select Fonts and Colors.  On the right there is a control named
``Display items''; in the drop-down list, select ``User Keywords''.
You can change the color using the controls named ``Item foreground''
and ``Item background''. I selected purple for the foreground color,
as shown in the next figure.
\begin{center}
  \includegraphics[width=5in]{OptionsDialog.png}
\end{center}

The macro \Code{WM5\_UNUSED(variable)} is used to avoid compiler warnings
about unused variables when compiling Release configurations.  For example,
\begin{verbatim}
    bool successful = DoSomeOperation();
    assert(successful);
\end{verbatim}
will compile without warnings in Debug configurations.  However, the
compiler generates a warning in Release configurations that \Code{successful}
is not used.  The reason, of course, is that the \Code{assert} statement has
no generated code in Release configurations, so \Code{successful} is not used.
To avoid the warning, use
\begin{verbatim}
    bool successful = DoSomeOperation();
    assert(successful);
    WM5_UNUSED(successful);
\end{verbatim}
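
One common way to write such a macro is a cast to \Code{void}, sketched next.
The name \Code{SKETCH\_UNUSED} and the helper functions are hypothetical; the
actual definition in \Code{Wm5CoreLIB.h} may differ.

```cpp
#include <cassert>

// Assumed definition for illustration: evaluating the variable and
// discarding the result counts as a use, which silences the warning.
#define SKETCH_UNUSED(variable) (void)(variable)

// Hypothetical operation that reports success through its return value.
inline bool DoSomeOperation (int input, int& output)
{
    output = 2*input;
    return true;
}

inline int DoubleValue (int input)
{
    int output = 0;
    bool successful = DoSomeOperation(input, output);
    assert(successful);         // compiled out when NDEBUG is defined
    SKETCH_UNUSED(successful);  // keeps 'successful' used in all builds
    return output;
}
```

The cast generates no code, so the macro has no run-time cost in either
configuration.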

The header file contains three additional blocks, all enabled in Debug
configurations.  The first is related to run-time assertions, the second is
related to the WM5 memory management system that supports testing for memory
leaks, and the third is related to file and buffer input-output.  The various
preprocessor commands in these blocks are described later in this document.

\subsubsection{Wm5MathematicsLIB.h}

Currently, the only preprocessor control in \Code{Wm5MathematicsLIB.h} is
related to handling of exact rational arithmetic.  I added a patch to
WM4.10 so that subnormal (denormal) floating-point numbers are handled
correctly by the class \Code{Rational} constructors and converters between
floating-point and \Code{Rational}.  The WM5 code supports conversion of
subnormal numbers.  You can enable the engine to assert when an attempt is
made to convert a NaN (Not a Number) to a \Code{Rational}.

\subsubsection{Wm5GraphicsLIB.h}

A few controls are allowed in \Code{Wm5GraphicsLIB.h}.  When reorienting the
camera by a call to \Code{Camera::SetAxes}, either explicitly or indirectly
with a call to \Code{Camera::SetFrame}, the input axis vectors might be
computed by the application in such a manner that, over time, numerical
round-off errors cause the vectors not to be a right-handed orthonormal set.
The \Code{SetAxes} function uses Gram-Schmidt orthonormalization to ensure
that the vectors do form a right-handed orthonormal set.  You can enable
\Code{WM5\_VALIDATE\_CAMERA\_FRAME\_ONCE} to trap the first time the
vectors appear to fail the test for right-handed orthonormality.  I have
found this to be a useful feature for trapping when the initial settings
for the application camera, \Code{mCamera}, are applied.  In most cases,
the user has incorrectly specified the vectors.
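
The orthonormalization and the validation test can be sketched as follows.
This is an illustration, not the engine's implementation: the type
\Code{Vec3}, the function names, and the convention that the frame is ordered
as direction, up, and right with right equal to direction cross up are
assumptions made for the example.

```cpp
#include <cmath>

struct Vec3 { double x, y, z; };

inline double Dot (const Vec3& a, const Vec3& b)
{
    return a.x*b.x + a.y*b.y + a.z*b.z;
}

inline Vec3 Cross (const Vec3& a, const Vec3& b)
{
    Vec3 c = { a.y*b.z - a.z*b.y, a.z*b.x - a.x*b.z, a.x*b.y - a.y*b.x };
    return c;
}

inline Vec3 Normalized (const Vec3& v)
{
    double length = std::sqrt(Dot(v, v));
    Vec3 n = { v.x/length, v.y/length, v.z/length };
    return n;
}

// Subtract from v its projection onto the unit-length vector u.
inline Vec3 RemoveProjection (const Vec3& v, const Vec3& u)
{
    double d = Dot(v, u);
    Vec3 r = { v.x - d*u.x, v.y - d*u.y, v.z - d*u.z };
    return r;
}

// Gram-Schmidt: keep the direction of d, then make u and r unit length
// and perpendicular to the vectors processed before them.
inline void Orthonormalize (Vec3& d, Vec3& u, Vec3& r)
{
    d = Normalized(d);
    u = Normalized(RemoveProjection(u, d));
    r = Normalized(RemoveProjection(RemoveProjection(r, d), u));
}

// Unit lengths, pairwise perpendicular, and determinant Dot(d,Cross(u,r))
// equal to 1, all within the tolerance epsilon.
inline bool IsRightHandedOrthonormal (const Vec3& d, const Vec3& u,
    const Vec3& r, double epsilon)
{
    return std::fabs(Dot(d, d) - 1.0) < epsilon
        && std::fabs(Dot(u, u) - 1.0) < epsilon
        && std::fabs(Dot(r, r) - 1.0) < epsilon
        && std::fabs(Dot(d, u)) < epsilon
        && std::fabs(Dot(d, r)) < epsilon
        && std::fabs(Dot(u, r)) < epsilon
        && std::fabs(Dot(d, Cross(u, r)) - 1.0) < epsilon;
}
```

Gram-Schmidt preserves the handedness of its input, so a frame that is only
slightly perturbed by round-off is repaired, whereas a frame specified with
the wrong orientation still fails the determinant test.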

The shader system supports only a few shader models (profiles).  To be
specific, currently only four profiles are supported, and a {\em none}
value is used to flag invalid profiles.  The total number, including
the {\em none} profile, is five.  For vertex shaders, the supported profiles
are \Code{vs\_1\_1}, \Code{vs\_2\_0}, and \Code{vs\_3\_0} for DirectX 9 and
\Code{arbvp1} for OpenGL.  For pixel shaders, the supported profiles are
\Code{ps\_1\_1}, \Code{ps\_2\_0}, and \Code{ps\_3\_0} for DirectX 9 and
\Code{arbfp1} for OpenGL.  Sometimes you might need advanced OpenGL support
for an effect, but the Cg compiler still includes the ARB versions of the
profile names in the compiled code.  For example, the sample graphics
application \Code{VertexTextures} requires a Cg command-line parameter
\verb|-profile vp40|, but the Cg compiler still displays the first line
of the compiled file as \verb|!!ARBVP1.0|.  The WM5 shader system bundles
together the shader programs for the profiles into a single object of class
\Code{Shader}.  This class has arrays whose number of elements is 5, which
is stored as \Code{Shader::MAX\_PROFILES}.

{\em You can modify WM5 to include more profiles.}  However, if you use the
WM5 streaming system, the streamed output implicitly depends on
\Code{Shader::MAX\_PROFILES}.  If you were to increase the maximum number
of profiles, and then load a file streamed with the previous maximum number,
there is a mismatch and the file load will ungracefully fail (all data loaded
thereafter is misaligned).  To trap this problem when loading files, you can
enable \Code{WM5\_ASSERT\_ON\_CHANGED\_MAX\_PROFILES}.
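
The kind of check involved can be sketched as follows.  The sketch assumes
that the number of profiles is written at the start of the streamed data and
compared on load; the functions, the use of a \Code{std::vector<int>} as the
stream, and the data layout are hypothetical and do not reproduce the WM5
streaming system.

```cpp
#include <cstddef>
#include <vector>

enum { MAX_PROFILES = 5 };

// Save: write the profile count first, then one value per profile.
inline std::vector<int> SaveProfiles (const int (&values)[MAX_PROFILES])
{
    std::vector<int> stream;
    stream.push_back(MAX_PROFILES);
    for (int i = 0; i < MAX_PROFILES; ++i)
    {
        stream.push_back(values[i]);
    }
    return stream;
}

// Load: compare the stored count with the current one before reading any
// per-profile data.  On a mismatch, report failure instead of reading
// misaligned data.
inline bool LoadProfiles (const std::vector<int>& stream,
    int (&values)[MAX_PROFILES])
{
    if (stream.empty() || stream[0] != MAX_PROFILES)
    {
        return false;
    }
    if (stream.size() != static_cast<std::size_t>(MAX_PROFILES) + 1)
    {
        return false;
    }
    for (int i = 0; i < MAX_PROFILES; ++i)
    {
        values[i] = stream[i + 1];
    }
    return true;
}
```

In a debug build, the failure branch is where an assertion would be placed so
that the mismatch is reported at the point where it occurs.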

The \Code{Renderer::Draw(const Visual*, const VisualEffectInstance*)}
function contains code to reset the global render state to the defaults
after each pass of the effect.  Given that every draw function is required
to set all the global state, it is not necessary to reset the state.  Thus,
the reset code is not compiled by default.  During development and testing, I
had some problems when not resetting the state, so I added a preprocessor
symbol to allow me to toggle the reset code:
\Code{WM5\_RESET\_STATE\_AFTER\_DRAW}.  Just in case problems show up later,
I kept the preprocessor symbol.  You can enable this if you prefer by
uncommenting the define in \Code{Wm5GraphicsLIB.h}.

Sometimes during application development, you might not see a rendered
object when you were expecting one.  A simple test to determine whether
any pixels were actually drawn involves queries supported by the graphics
APIs.  The \Code{Renderer::DrawPrimitive} calls in \Code{Wm5Dx9Renderer.cpp}
and \Code{Wm5OpenGLRenderer.cpp} have conditionally compiled blocks of
code that, when enabled, perform the queries.  To enable these, uncomment
the \Code{WM5\_QUERY\_PIXEL\_COUNT} symbol in \Code{Wm5GraphicsLIB.h}.
Recompile the graphics library and your application, and then set a
breakpoint in \Code{DrawPrimitive} on the lines with \Code{WM5\_END\_QUERY}.
When you reach the breakpoint, step over the line of code and look at the
value of \Code{numPixelsDrawn}.  If it is zero, no pixels were drawn for
the current primitive.

When using the OpenGL renderer, I have code to draw text either using
display lists or using precomputed bitmap fonts (see
\Code{Wm5GLVerdanaS16B0I0.cpp}).  The default is to use display lists,
but you can change this by commenting out \Code{WM5\_USE\_TEXT\_DISPLAY\_LIST}
in \Code{Wm5GraphicsLIB.h}.

\Code{Wm5GraphicsLIB.h} contains the symbol
\Code{WM5\_USE\_OPENGL2\_NORMAL\_ATTRIBUTES} that is defined for Microsoft
Windows and Linux.  It is not defined for Macintosh OS X.  I had problems
with incorrect renderings on the Macintosh when the effects use lighting
and normals, so I had to fall back to using the conventional
\Code{glNormalPointer} for setting the vertex data source for normals.
As it turns out, the problem is that I have been using OpenGL extensions
for shader support, and those extensions were created before OpenGL 2.0
was released.  The assembly for the compiled shaders contains
\Code{vertex.normal}, which is for the conventional way of accessing
the vertex normals.  When I use \Code{glEnableVertexAttribArrayARB}
and \Code{glVertexAttribPointerARB} to set the data source for vertex
normals, the NVIDIA drivers for Microsoft Windows and for Fedora Linux
hook up the normals so that \Code{vertex.normal} refers to those normals.
However, the NVIDIA drivers on the Macintosh do not hook these up, so
the vertex shader is unable to access the normals.

I added the aforementioned preprocessor symbol as a hack to make the shaders
work on all platforms.  Alternatively, on the Macintosh you can edit the
assembly code and replace \Code{vertex.normal} by the corresponding
generic attribute accessor (not my first choice).  I am in the process
of updating the OpenGL renderer so that it uses the core OpenGL 2.0 (and
later) shader system.  However, this means that the shaders must be written
in GLSL, not in Cg.  The end result of the update is {\em EmeraldGL}, which
will be an OpenGL-only graphics system.  I might consider implementing
a DirectX-only system ({\em EmeraldDX}) that uses DirectX 11.

The last preprocessor symbol in \Code{Wm5GraphicsLIB.h} is
\Code{WM5\_PDR\_DEBUG}, which is enabled by default for the DirectX 9
renderer.  This exposes assertions that are triggered whenever the DirectX
calls fail.

\subsubsection{Wm5PhysicsLIB.h}

The only preprocessor symbols in \Code{Wm5PhysicsLIB.h} are used for debugging
the LCP code.  There is no reason to enable these except if you want to
determine whether the LCP code is working correctly.  The LCP code was
part of {\em Game Physics, 1st edition}, but it was intended to be pedagogic
and illustrate the Lemke algorithm (which looks a lot like a basic linear
programming solver and similar to linear system solving).  This code is not
what people use in physics engines.  (Someday I will get around to implementing
a velocity-based iterative algorithm \ldots)

\subsubsection{Wm5ImagicsLIB.h}

No preprocessor symbols are defined in \Code{Wm5ImagicsLIB.h}.  This library
has not been worked on for many years, but remains useful (to me) for rapid
prototyping of image analysis projects.  It needs some major updating and
expansion.


\subsection{No DLL Configurations}

For years I have provided build configurations for both static and dynamic
libraries.  The Microsoft Windows annoyance of having to use
\Code{\_\_declspec(dllexport)} and \Code{\_\_declspec(dllimport)} so that
classes are properly exported or imported has been a pain.  The WM4
libraries had LIB files containing preprocessor symbols as shown next:
\begin{verbatim}
    #ifdef WM4_FOUNDATION_DLL_EXPORT
        // For the DLL library.
        #define WM4_FOUNDATION_ITEM __declspec(dllexport)
    #else
        #ifdef WM4_FOUNDATION_DLL_IMPORT
            // For a client of the DLL library.
            #define WM4_FOUNDATION_ITEM __declspec(dllimport)
        #else
            // For the static library.
            #define WM4_FOUNDATION_ITEM
        #endif
    #endif
\end{verbatim}
Each class is structured as
\begin{verbatim}
    class WM4_FOUNDATION_ITEM MyClass { ... }
\end{verbatim}
However, template classes with no explicit instantiation in the library
could not use the \Code{WM4\_FOUNDATION\_ITEM} macro.  And various
static class data members needed the macro per member.  The separation
between the abstract \Code{Renderer} class and its derived classes per
graphics API required the virtual function members so that the DLL configurations
would link successfully.

Given the abundance of disk space, the usage I had in mind for Wild Magic
libraries, the problems with linking when attempting to remove virtual
functions from the \Code{Renderer} class, and the annoyance of the
aforementioned macro handling, I decided to stop supporting DLLs.  WM5 has
only static debug and static release configurations.

\subsection{The WM4 Shader Programming and FX System}

WM4 had a somewhat complicated approach to shader programming and effects,
which made it sometimes difficult to extend to shaders not already part of
the engine (or part of the sample applications).  The problems with this
approach are described next.

The abstraction of the drawing pass in WM4 is

\scriptsize
\begin{verbatim}
    renderer.Draw(geometry)
    {
        renderer.SetGlobalState(...);  // alpha, cull, depth, ...
        renderer.SetWorldTransformation();  // sets model-to-world (W), others computed later (WV, WVP)
        renderer.EnableIBuffer(geometry);  // enable the index buffer of geometry
        for each effect of geometry do // multieffect drawing loop
        {
            renderer.ApplyEffect(effect);
            {
                for each pass of effect do  // multipass drawing loop
                {
                    pass.SetGlobalState();
                    pass.ConnectVShaderConstants();  // set sources for constants
                    pass.ConnectPShaderConstants();  // set sources for constants
                    pass.GetVProgram();  // loaded first time, cached in catalog for later times
                    pass.EnableVProgram();
                    pass.GetPProgram();  // loaded first time, cached in catalog for later times
                    pass.EnablePProgram();
                    for each vertex texture of pass do
                    {
                        pass.GetVTexture();  // loaded first time, cached in catalog for later times
                        pass.EnableVTexture();
                    }
                    for each pixel texture of pass do
                    {
                        pass.GetPTexture();  // loaded first time, cached in catalog for later times
                        pass.EnablePTexture();
                    }
                    pass.EnableVBuffer();
                    renderer.DrawPrimitive(geometry);
                    pass.DisableVBuffer();
                    pass.DisablePTextures();
                    pass.DisableVTextures();
                    pass.DisablePProgram();
                    pass.DisableVProgram();
                    pass.RestoreGlobalState();
                }
            }
        }
        DisableIBuffer();
        RestoreWorldTransformation();
        RestoreGlobalState();
    }
\end{verbatim}
\normalsize

The drawing supports multiple effects per geometric primitive and multiple
passes per effect, but it is not necessary to have a double-loop system.  WM5
has a single-loop system, iterating over the passes of a single effect
attached to the geometric primitive.

The renderer sets and restores global states (alpha, face culling, depth
buffering, stencil buffering, polygon offset, wireframe), but so does each
pass.  Given that each pass restores state, there is no need for the renderer
object itself to manage global state.

The index buffer is invariant across all effects and passes, so it is enabled
and disabled once only.  However, the vertex buffer is enabled and disabled
per pass, which is not necessary.  What WM4 does is create a VRAM vertex
buffer for the geometric primitive.  It then maintains vertex buffers
that match what the vertex program requires for the passes, as determined
during the first call to \Code{GetVProgram} (when the vertex program is
loaded from disk and parsed).  If the effect has multiple passes, a second
(or later) pass involves finding an already existing vertex buffer that has
the required attributes.  If none exists, a new VRAM vertex buffer is created
that has the required attributes.  Thus, it is possible that multiple vertex
buffers exist in VRAM with data copied from the primary vertex buffer of the
geometric primitive, which is a waste of memory.  An effect with multiple
passes should be applied to a geometric primitive whose vertex buffer has
{\em all} the attributes necessary for {\em all} the passes (WM5 does this).

In effect, WM4 tried to assume responsibility for ensuring that the vertex
buffers match what the vertex program needs.  If there is a mismatch between
the primary vertex buffer and what the vertex program needs, WM4 creates a
matching vertex buffer; however, the attributes generated by a mismatch
have no chance of being initialized by the application programmer.  In the
WM4 sample applications, there are no mismatches, so there is no penalty in
wasted memory.  But there is a penalty in having a vertex buffer management
system that is irrelevant.  In the end, it is the application programmer's
responsibility to ensure that the vertex buffer has all that it needs to
support an effect and that the outputs of a vertex program match the inputs
of a pixel program.

In WM4, class \Code{Shader} represents a shader program and its associated
storage for shader constants and for textures.  However, it was convenient
to allow applications to specify their own data sources for the shader
constants (for ease of access).  WM4 has shader constant classes that
provide such storage; for example, the class \Code{UserConstant}.  In the
drawing pass, the functions \Code{Renderer::ConnectVShaderConstants} and
\Code{Renderer::ConnectPShaderConstants} set the data sources for the
shaders.  This allows an application to change the data source for each
drawing pass, an event that is highly unlikely (and never happens in WM4
sample applications).  The redesign of the shader system for WM5 avoids
this.

The function \Code{Renderer::GetVProgram} is called during drawing to
get access to the vertex program of the effect pass.  The first time a
vertex program is requested, it is loaded from disk.  The shaders were
written using NVIDIA's Cg, and they were all compiled for Shader Model 2.
The compiled assembly is still textual, and is stored in files with
extension \Code{wmsp}.  The WM4 engine contains a class \Code{Program}
and derived classes \Code{VertexProgram} (loads \Code{wmsp} files with
prefix \Code{v\_}) and \Code{PixelProgram} (loads \Code{wmsp} files with
prefix \Code{p\_}).  The comments in the \Code{wmsp} files are parsed
to obtain information about the shader program, which effectively is
WM4's attempt to have an FX run-time system.

A problem with this system is that the shader programs are constrained
to contain special names for some of the shader constants to support
automatic updating of those constants during drawing.  A class
\Code{RendererConstant} provides a set of enumerations and corresponding
names for common quantities that change frequently, such as
world-view-projection matrices, camera parameters, and light and
material parameters.  Class \Code{Renderer} contains an array of functions
corresponding to the enumerations in \Code{RendererConstant}.  The
function \Code{Renderer::SetRendererConstant} determines which shader
constants need to be updated (in system memory).  After such a call,
\Code{Renderer::SetVProgramConstant} or \Code{Renderer::SetPProgramConstant}
is called so that the graphics API can update the constants (by copying
to constant registers).  These \Code{Renderer} calls are part of the
\Code{Renderer::EnableVProgram} and \Code{Renderer::EnablePProgram} calls
in the drawing pass.  WM5 provides a different mechanism for automatic
constant updating that does not have constraints on the shader constant
names.

Another problem with the \Code{Program} loading and parsing is that it
is not general.  Often I would want to support a new effect but the
Cg programs used features not supported by the parser of
\Code{Program}.  That meant modifying \Code{Program} as needed.  WM5
avoids this system and allows you to compile shaders to a binary format
that contains the textual program string but also contains information
about the shader.  That is, the loading and parsing is now part of a
tool.  The output files of the tool are ready to load by WM5, so there
is no error checking that needs to be performed at application run time.

In WM4, when \Code{Renderer::GetVProgram} is called the first time for a
vertex program, and the program loads correctly, it is stored in a cache
implemented in the \Code{Catalog} class.  This caching system is overly
complicated.  In WM5, caching is the responsibility of the application
programmer, because the programmer knows best how the objects will be
used and shared.

When effects use vertex or pixel textures, they are loaded the first time
they are encountered by calls to \Code{ShaderEffect::GetVTexture} and
\Code{ShaderEffect::GetPTexture}.  The mechanism is similar to that of
\Code{GetVProgram} and \Code{GetPProgram}--the first time a texture is
encountered, it is loaded from disk and cached in a catalog.  Later
requests look in the catalog first to find the textures and, if found,
use them instead of loading a new copy from disk.

Although manageable, the drawing system of WM4 turned out to be more
complicated than is necessary, and it was not general enough to support
many advanced special effects without having to modify the engine.

\subsection{The WM5 Shader Programming and FX System}
\label{subsec.shaderfx}

The abstraction of the drawing pass in WM5 is described next.  What used
to be the \Code{Geometry} class is now \Code{Visual}, which I thought
was a better name that allows me to add \Code{Audial} (for 3D sound) at
a later date.

Some other major design changes were made.  DirectX 9 has the concept of
a {\em vertex format} that describes a vertex stored in a vertex buffer.
OpenGL does not encapsulate this in a simple manner.  WM5 has a new class
called \Code{VertexFormat} that implements the idea.  The class
\Code{VertexBuffer} still represents a vertex buffer but, of course, with
changes.  Reading and writing vertex buffer information requires knowing
a vertex buffer and a vertex format.  The read/write is supported by the
class \Code{VertexBufferAccessor}.

The WM5 class \Code{VisualEffect} is the natural successor to WM4's
\Code{ShaderEffect}, except that \Code{VisualEffect} represents a
vertex shader and pixel shader pair {\em but without specific data for
the shader constants and textures}.  A single \Code{VisualEffect} object
can have multiple instances, each instance having data.  These instances
are represented by class \Code{VisualEffectInstance}.  For example, you
can create a texture visual effect with user-specified sampler parameters.
If you want this effect for each of two different texture images, you
create two visual effect instances.

A \Code{Visual} object has attached a single pair of \Code{VisualEffect}
and \Code{VisualEffectInstance}.  Each object of type \Code{VisualEffectInstance}
manages multiple passes for the drawing, each pass of class \Code{VisualPass}.
The \Code{VisualPass} class contains global render state objects (alpha,
face culling, depth buffering, polygon offset, stencil buffering, and
wireframe), a vertex shader, and a pixel shader.

A class \Code{ShaderParameters} represents the shader constants and
textures used by a \Code{VisualEffectInstance}, one such object for the
vertex shader and one such object for the pixel shader.  The shader
constants are encapsulated by a system whose base class is \Code{ShaderFloat}.
Many derived classes are provided for common shader constants, such as
world-view-projection matrices, camera parameters, and light and material
parameters.  This system replaces WM4's \Code{RendererConstant} system for
automatic updating of shader constants.

The drawing pass is abstractly

\scriptsize
\begin{verbatim}
    renderer.Draw(visual, visualEffectInstance)
    {
        renderer.Enable(visual.vertexBuffer);
        renderer.Enable(visual.vertexFormat);
        renderer.Enable(visual.indexBuffer);  // if it has such a buffer
        for each visualPass of visualEffectInstance do
        {
            visualPass.vertexShaderParameters.UpdateConstants(visual, renderer.camera);
            visualPass.pixelShaderParameters.UpdateConstants(visual, renderer.camera);
            visualPass.SetGlobalState();  // alpha, cull, depth, ...
            renderer.Enable(visualPass.vertexShader, visualPass.vertexShaderParameters);
            renderer.Enable(visualPass.pixelShader, visualPass.pixelShaderParameters);
            renderer.DrawPrimitive(visual);
            renderer.Disable(visualPass.pixelShader, visualPass.pixelShaderParameters);
            renderer.Disable(visualPass.vertexShader, visualPass.vertexShaderParameters);
            visualPass.RestoreGlobalState();
        }
        renderer.Disable(visual.indexBuffer);
        renderer.Disable(visual.vertexFormat);
        renderer.Disable(visual.vertexBuffer);
    }
\end{verbatim}
\normalsize

At a high level, the drawing is similar to that of WM4.  However, as suggested in
the section describing the WM4 drawing, the vertex buffer is now enabled and disabled
once, outside the loop over passes.  The WM4 setting of sources for shader constants
was eliminated.  Instead, the \Code{ShaderFloat} objects provide storage and the
\Code{UpdateConstants} function performs the automatic updates of the constants.
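
The automatic updating can be pictured with a small C$++$ sketch.  It is not
the WM5 source: the class names (\Code{SketchShaderFloat},
\Code{SketchPVWMatrixConstant}, \Code{SketchShaderParameters}), the stand-in
structures \Code{SketchVisual} and \Code{SketchCamera}, and the row-major
matrix layout are all assumptions made for illustration.  The idea shown is
that each constant owns its storage and, when it is of a derived type,
recomputes that storage from the visual and the camera whenever the
parameters object is asked to update.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-ins for the engine's visual and camera objects.
// Matrices are 4x4 and stored row-major (an assumption of this sketch).
struct SketchVisual { std::array<float, 16> world; };
struct SketchCamera { std::array<float, 16> projectionView; };

// Base class: owns the storage for one shader constant.
class SketchShaderFloat
{
public:
    explicit SketchShaderFloat(std::size_t numFloats) : mData(numFloats, 0.0f) {}
    virtual ~SketchShaderFloat() = default;

    // By default a constant is static; derived classes recompute their data.
    virtual void Update(const SketchVisual&, const SketchCamera&) {}

    const std::vector<float>& GetData() const { return mData; }

protected:
    std::vector<float> mData;
};

// Derived class: recomputes projectionView * world on every update.
class SketchPVWMatrixConstant : public SketchShaderFloat
{
public:
    SketchPVWMatrixConstant() : SketchShaderFloat(16) {}

    void Update(const SketchVisual& visual, const SketchCamera& camera) override
    {
        for (std::size_t r = 0; r < 4; ++r)
        {
            for (std::size_t c = 0; c < 4; ++c)
            {
                float sum = 0.0f;
                for (std::size_t k = 0; k < 4; ++k)
                {
                    sum += camera.projectionView[4 * r + k] * visual.world[4 * k + c];
                }
                mData[4 * r + c] = sum;
            }
        }
    }
};

// The constants used by one shader; updated once per pass before drawing.
class SketchShaderParameters
{
public:
    void Add(std::shared_ptr<SketchShaderFloat> constant)
    {
        assert(constant != nullptr);
        mConstants.push_back(std::move(constant));
    }

    void UpdateConstants(const SketchVisual& visual, const SketchCamera& camera)
    {
        for (auto& constant : mConstants)
        {
            constant->Update(visual, camera);
        }
    }

private:
    std::vector<std::shared_ptr<SketchShaderFloat>> mConstants;
};
```

Because the update is a virtual call on the constant itself, nothing in this
arrangement depends on how the constant is named inside the shader program.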

All caching of effects, textures, vertex buffers, vertex formats, and index buffers
is the responsibility of the application programmer.  It is simple enough to use
the smart-pointer system for the management rather than a complicated cataloging
system.
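
As an illustration of application-level caching, here is a minimal sketch.
It substitutes \Code{std::shared\_ptr} and \Code{std::weak\_ptr} for the
engine's own smart-pointer system, and the names \Code{SketchTexture} and
\Code{SketchTextureCache} are hypothetical.  Loading from disk is replaced
by constructing a small object so that the sketch stays self-contained.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>

// Hypothetical resource; a real application would hold texel data here.
struct SketchTexture
{
    std::string name;
};

// A cache owned by the application.  It stores weak references, so a
// texture is destroyed once the last user releases its shared pointer.
class SketchTextureCache
{
public:
    std::shared_ptr<SketchTexture> Get(const std::string& name)
    {
        assert(!name.empty());
        auto iter = mTextures.find(name);
        if (iter != mTextures.end())
        {
            if (std::shared_ptr<SketchTexture> existing = iter->second.lock())
            {
                return existing;  // still alive, share it
            }
        }

        // Not cached (or already released): "load" it and remember it.
        ++mNumLoads;
        auto texture = std::make_shared<SketchTexture>(SketchTexture{ name });
        mTextures[name] = texture;
        return texture;
    }

    int GetNumLoads() const { return mNumLoads; }

private:
    std::map<std::string, std::weak_ptr<SketchTexture>> mTextures;
    int mNumLoads = 0;
};
```

Whether to cache weakly, strongly, or not at all is exactly the kind of
decision that depends on how the application uses and shares its objects.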

As mentioned in the previous section, WM5 has a tool for compiling Cg shaders
to a binary format that can be loaded directly by the engine.  This tool is
named \Code{WmfxCompiler} (in the \Code{WildMagic5/Tools} subfolder).

{\em Local effects} are those applied to a single geometric primitive; for
example, basic texturing and lighting.  {\em Global effects} are typically more
complicated and are applied to scene graphs; for example, planar shadows and
planar reflections.  WM5 has implementations of quite a few local effects, but
has only planar shadows and planar reflections as examples of global effects.
The sample applications have additional global effects that are implemented at
the application level rather than as classes.

\subsection{Design Change Regarding Lights and Materials}
\label{subsec.designchangelights}

WM4 had classes \Code{Light} and \Code{Material} that provided the ability to
attach \Code{Light} objects to a scene graph node.  Each light attached to a
node was assumed to illuminate any objects in the subtree rooted at the node.
To support this automatically, WM4 internally generated a shader effect
(class \Code{LightingEffect}) that was used for lighting.  If an application
attached a \Code{ShaderEffect} to a leaf node of that subtree, a multieffect
drawing occurred.  The \Code{LightingEffect} was executed first for
the geometry, and the \Code{ShaderEffect} was executed second with a default
alpha blend applied to combine it with the lighting.  This approach still
has the flavor of the fixed-function pipeline.  Moreover, it was not a good
idea (based on technical support requests from users having problems working
with the lighting).  It is possible to roll your own lighting effects without
attaching lights to the scene, but then you have to make \Code{Renderer} calls
so that the renderer knows about the lights.  Very cumbersome and nonintuitive.

WM5 eliminates this system.  The \Code{Light} class still exists, but it is
only a container for light properties (light type, colors, attenuation, and
so on).  You cannot attach a \Code{Light} to a scene directly.  Instead, you
can create lighting-related shader constants via classes derived from
\Code{ShaderFloat} and include them in the visual effect instances.  See,
for example, files of the form \Code{Wm5Light*Constant.\{h,cpp\}} and
\Code{Wm5Material*Constant.\{h,cpp\}} and local effects files of the form
\Code{Wm5Light*Effect.\{h,cpp\}}.

\section{LibCore}

The \Code{LibCore} library contains some basic support that applications need.
Some of this support is for convenience during development.  A summary of
the files in this library is provided in this section.  The subsection titles
are the names of the subfolders of the \Code{WildMagic5/LibCore} folder.

\subsection{Assert}

C$++$ run-time libraries typically implement a macro called \Code{assert}
that has a single input which is a Boolean expression.  In debug
configurations, the macro is expanded to platform-specific code that triggers
the assertion when the Boolean expression is false.  Moreover, typically
a breakpoint is generated so that the debugger stops on that line of code
for the programmer to diagnose the problem.  For example,
\begin{verbatim}
    float numerator = <some integer>;
    float denominator = <some integer>;
    assert(denominator != 0.0f);
    float ratio = numerator/denominator;
\end{verbatim}
This bare-bones approach is suitable most of the time, but other times it
is useful to perform more actions when an unexpected condition occurs.
Moreover, it might be useful to have an assertion triggered when running
in release configurations.

The files \Code{Wm5Assert.*} provide an alternate implementation for
assertions, which at the moment is utilized only on Microsoft Windows
and Microsoft Visual Studio.  The class \Code{Assert} has a constructor whose
first input is the Boolean expression to be tested.  The name of the file and
line number within that file where the assertion is triggered are also
parameters.  These support writing assertions to a logfile, identifying
the file and line number, but not triggering an interrupt on the assertion.
These also support writing information to a Microsoft Windows message box.

Yet another parameter of the constructor is a format string.  Values to be
printed via the format statement may be provided to the constructor (note
the use of the ellipsis in the constructor).  This allows you to specify
more than just that the assertion failed.  You can print as much information
as you believe necessary to help with debugging.  A variadic macro named
\Code{assertion} is used to wrap the construction of \Code{Assert} objects;
such a macro supports a variable number of arguments.
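
The general shape of such a system can be sketched as follows.  This is not
the contents of \Code{Wm5Assert.*}; the names \Code{SketchAssert} and
\Code{SKETCH\_ASSERTION}, the fixed buffer sizes, and the choice to report
to \Code{stderr} are assumptions.  The sketch shows the file name and line
number being captured by the macro and the format string with its values
being forwarded through the ellipsis.

```cpp
#include <cassert>   // the standard macro that this sketch is contrasted with
#include <cstdarg>
#include <cstdio>
#include <string>

class SketchAssert
{
public:
    // Reports only when the condition is false; never interrupts the program.
    SketchAssert(bool condition, const char* file, int line, const char* format, ...)
    {
        if (condition)
        {
            return;
        }

        char message[512];
        va_list arguments;
        va_start(arguments, format);
        std::vsnprintf(message, sizeof(message), format, arguments);
        va_end(arguments);

        char report[1024];
        std::snprintf(report, sizeof(report), "%s(%d): %s", file, line, message);
        LastReport() = report;
        ++NumFailures();
        std::fprintf(stderr, "%s\n", report);
    }

    // Bookkeeping so that callers can inspect what was reported.
    static std::string& LastReport() { static std::string report; return report; }
    static int& NumFailures() { static int count = 0; return count; }
};

// The variadic macro supplies the file and line; the caller supplies the
// condition, a format string, and any values to be printed.
#define SKETCH_ASSERTION(condition, ...) \
    SketchAssert((condition), __FILE__, __LINE__, __VA_ARGS__)
```

A call such as
\Code{SKETCH\_ASSERTION(denominator != 0, "denominator = \%d", denominator);}
then records where the failure occurred together with the offending value.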

By default, the alternative assertion system is enabled for Microsoft
Windows and Visual Studio when in a debug configuration.  The preprocessor
flag controlling this is in \Code{Wm5CoreLIB.h}.  The system is enabled
when \Code{WM5\_USE\_ASSERT} is defined.  If you want, you can expose the
macros even in a release configuration.  Notice that there are three
additional preprocessor symbols you can define.  These control whether
the assertion information is written to a log file, to the output window
of Visual Studio, and/or to a message box.

In my environment, I have \Code{assertion} specified as a user keyword
with syntax highlighting that shows the keyword in purple.  For details
on highlighting user keywords, see Section \ref{subsubsec.corelib}.

\subsection{DataTypes}

I implemented only two basic data types in the core library:  \Code{Tuple}
and \Code{Table}.  Class \Code{Tuple} is a templated class with two template
parameters: one is the number of components of the tuple and one is the
type of the component.  Only basic services are provided:  construction,
destruction, access to the array pointer, access to components, assignment,
and comparison (support for standard C$++$ library containers).  The main
use of \Code{Tuple} in the engine is as a base class for floating-point
vectors \Code{Float1}, \Code{Float2}, \Code{Float3}, and \Code{Float4}.
The derived classes provide specialized constructors and assignment
operators.
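
A minimal sketch of the idea follows.  The names \Code{SketchTuple} and
\Code{SketchFloat3} and the exact set of members are assumptions, not the
WM5 interface.  The sketch shows the two template parameters, the basic
services, and why the comparison operators matter: they are what allow the
type to be stored in an ordered standard container.

```cpp
#include <algorithm>
#include <cassert>
#include <set>

template <int DIMENSION, typename TYPE>
class SketchTuple
{
public:
    SketchTuple()
    {
        for (int i = 0; i < DIMENSION; ++i)
        {
            mTuple[i] = TYPE();
        }
    }

    // Access to the array pointer and to individual components.
    const TYPE* GetData() const { return mTuple; }

    TYPE& operator[](int i)
    {
        assert(0 <= i && i < DIMENSION);
        return mTuple[i];
    }

    const TYPE& operator[](int i) const
    {
        assert(0 <= i && i < DIMENSION);
        return mTuple[i];
    }

    // Comparisons, so that tuples can be keys in std::set or std::map.
    bool operator==(const SketchTuple& other) const
    {
        return std::equal(mTuple, mTuple + DIMENSION, other.mTuple);
    }

    bool operator<(const SketchTuple& other) const
    {
        return std::lexicographical_compare(mTuple, mTuple + DIMENSION,
            other.mTuple, other.mTuple + DIMENSION);
    }

protected:
    TYPE mTuple[DIMENSION];
};

// A derived class adds only a specialized constructor.
class SketchFloat3 : public SketchTuple<3, float>
{
public:
    SketchFloat3(float x, float y, float z)
    {
        mTuple[0] = x;
        mTuple[1] = y;
        mTuple[2] = z;
    }
};
```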

Class \Code{Table} represents a 2-dimensional array of components and has
three template parameters: one is the number of rows of the table, one is
the number of columns of the table, and one is the type of the component.
Only basic services are provided:  construction, destruction, access to
the array pointer, access to components, access to rows and columns (as
tuples), assignment, and comparison (support for standard C$++$ library
containers).  The main use of \Code{Table} in the engine is as a base
class for floating-point matrices \Code{Matrix2}, \Code{Matrix3}, and
\Code{Matrix4}.

I have tried to rely on the standard C$++$ library containers as much as
possible, but I find my own minimum-heap template class to be useful (for
priority queue support with fast updates when neighbors change).  I have
kept this template class, files \Code{Wm5MinHeap.*}.

\subsection{InputOutput}

This folder contains implementation for handling of byte-order (endianness)
and for file and buffer input-output.  It also contains a path system for
locating files.

\subsubsection{Endianness}

Class \Code{Endian} has code to test whether a processor is little endian
or big endian.  The class also has functions for swapping data types with
2, 4, or 8 bytes per element.  I used byte-swapping in WM4 extensively to
allow data files that could be loaded either on a little-endian or a
big-endian machine.  The data itself was always stored in little-endian
format, which meant that the PowerPC Macintosh had extra computational
work to do when loading.
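
The two services described here, testing the byte order and swapping the
bytes of an element, can be sketched in a few lines of portable C$++$.  The
function names \Code{SketchIsLittleEndian} and \Code{SketchSwapBytes} are
hypothetical and the code is not taken from class \Code{Endian}.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <utility>

// Returns true when the least significant byte is stored first.
inline bool SketchIsLittleEndian()
{
    const std::uint32_t probe = 1;
    unsigned char firstByte = 0;
    std::memcpy(&firstByte, &probe, 1);
    return firstByte == 1;
}

// Reverses the byte order of a single element of 2, 4, or 8 bytes.
template <typename T>
T SketchSwapBytes(T value)
{
    static_assert(sizeof(T) == 2 || sizeof(T) == 4 || sizeof(T) == 8,
        "only 2-, 4-, and 8-byte elements are supported");

    unsigned char bytes[sizeof(T)];
    std::memcpy(bytes, &value, sizeof(T));
    for (std::size_t i = 0, j = sizeof(T) - 1; i < j; ++i, --j)
    {
        std::swap(bytes[i], bytes[j]);
    }
    std::memcpy(&value, bytes, sizeof(T));
    return value;
}
```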

\subsubsection{File and Buffer Input-Output}

My goal in WM5 was to provide file and buffer input-output that can be
configured for the platforms in such a manner as to avoid byte swapping.
Classes \Code{BufferIO} and \Code{FileIO} are the implementations.  The
constructors for these classes have a \Code{mode} parameter that allows
you to specify whether the object is for reading data or for writing
data.  Moreover, the \Code{mode} flags specify whether to read as is,
to write as is, to read and swap bytes, or to write and swap bytes.
Additionally, I have \Code{mode} flags for the default read/write modes.
In the engine, any time I use \Code{BufferIO} or \Code{FileIO} objects,
I arrange for the \Code{mode} parameter to be defaulted itself to the
default read/write modes.  In this manner, if you want a global change
in the engine, say, to switch from read to read-and-swap, you need only
edit \Code{Wm5BufferIO.h} and \Code{Wm5FileIO.h} and change what the
default flags are (they currently are set to read/write without swaps).
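
The defaulting scheme can be illustrated with a small sketch.  The
enumerators and the class \Code{SketchBufferReader} are hypothetical and
much reduced compared with \Code{BufferIO}; only 4-byte reads are shown.
The point is that callers who omit the \Code{mode} argument all follow a
single default, which is defined in one place.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

enum SketchMode
{
    SKM_READ,            // read bytes as they are stored
    SKM_READ_AND_SWAP,   // read and reverse the bytes of each element
    // Editing this one line changes every reader that relies on the default.
    SKM_DEFAULT_READ = SKM_READ
};

class SketchBufferReader
{
public:
    SketchBufferReader(const unsigned char* buffer, std::size_t numBytes,
        SketchMode mode = SKM_DEFAULT_READ)
        : mBuffer(buffer), mNumBytes(numBytes), mOffset(0), mMode(mode)
    {
        assert(buffer != nullptr);
    }

    // Reads one 4-byte element; returns false when too few bytes remain.
    bool Read(std::uint32_t& value)
    {
        if (mNumBytes - mOffset < sizeof(value))
        {
            return false;
        }

        unsigned char bytes[sizeof(value)];
        std::memcpy(bytes, mBuffer + mOffset, sizeof(value));
        if (mMode == SKM_READ_AND_SWAP)
        {
            std::reverse(bytes, bytes + sizeof(value));
        }
        std::memcpy(&value, bytes, sizeof(value));
        mOffset += sizeof(value);
        return true;
    }

private:
    const unsigned char* mBuffer;
    std::size_t mNumBytes;
    std::size_t mOffset;
    SketchMode mMode;
};
```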

This sounds fine in theory, but I encountered one big problem after
writing most of the graphics library.  The vertex buffers and textures
were streamed to disk as arrays of bytes, ignoring the actual structure
of a vertex and the actual format of a texture.  This is a problem when
you want to write-and-swap, because byte arrays are never byte-swapped.
Instead, it is necessary to write vertices one at a time and swap native
fields as they are encountered.  Similarly, texels must be written one at
a time to ensure that the color channels are swapped correctly; for
example, if you have an RGBA 16-bits-per-channel texel, you must swap
two bytes per channel for each of four channels.  The source code was
due soon for the {\em Game Physics, 2nd edition} CD-ROM, so it was too
late to modify the code.  Instead, I created WMOF (Wild Magic Object
File) versions for little endian and big endian.  Only two such files
are shipped anyway (\Code{FacePN.wmof} and \Code{SkinnedBipedPN.wmof}),
so not a big deal.  My goal for future development is to avoid the
streaming system and just rely on raw formats for vertex buffers,
index buffers, and textures, and each platform can generate its own
byte-ordered versions.
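
The RGBA example can be made concrete with a short sketch.  The structure
\Code{SketchTexelRGBA16} and the two functions are hypothetical names.  The
swap is applied to each 16-bit channel separately; reversing all eight
bytes of the texel at once would also reverse the order of the channels,
which is not what is wanted.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// One texel with four 16-bit channels, in the order R, G, B, A.
struct SketchTexelRGBA16
{
    std::uint16_t channel[4];
};

// Swaps the two bytes of a single 16-bit channel.
inline std::uint16_t SketchSwap16(std::uint16_t value)
{
    return static_cast<std::uint16_t>((value >> 8) | (value << 8));
}

// Swaps texels one at a time, channel by channel.  The channel order is
// preserved; only the byte order within each channel changes.
inline void SketchSwapTexels(std::vector<SketchTexelRGBA16>& texels)
{
    for (SketchTexelRGBA16& texel : texels)
    {
        for (std::uint16_t& channel : texel.channel)
        {
            channel = SketchSwap16(channel);
        }
    }
}
```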

\subsubsection{Path Handling}

In WM4, the files \Code{Wm4System.*} contained the ability to specify
a filename and create the fully qualified hard-disk path for the file.
The function of interest was \Code{System::GetPath}.  Someone who had
experience with the Macintosh implemented the Apple version of this
function, which involves some low-level operating system calls.  I
had to hack this function, because it depended on how Xcode was
configured (and the configuration varied between Xcode versions).
Not having enough experience with low-level Macintosh programming,
I ignored some complaints from users about how \Code{GetPath} was
slow and annoying.

In WM4, I also required users to set an environment variable that
stored the path to the \Code{WildMagic4} folder of the installation.
I missed a simple opportunity to bootstrap off this environment
variable and avoid the low-level programming.

WM5 does take advantage of the environment variable, now called
\Code{WM5\_PATH} in the WM5 distribution.  Class \Code{Environment}
encapsulates computing the fully qualified path for a specified file.
Just as class \Code{System} allowed in WM4, \Code{Environment}
allows you to insert and remove directory strings (paths to the
folders) for an array of strings.  The most common function in this
class that the sample applications use is
\begin{verbatim}
    std::string Environment::GetPathR (const std::string& name);
\end{verbatim}
You specify the name of a file to be read (the suffix \Code{R}
stands for ``read'') and the function returns the fully qualified
path for that file, if it can find it using the array of directory
strings it manages.  If it cannot find the file, the empty
string is returned.
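
The lookup can be sketched as a search over the directory strings.  The
class \Code{SketchEnvironment} below is a hypothetical, non-static
simplification and is not the \Code{Environment} implementation; in
particular, testing for existence by opening the file with
\Code{std::ifstream} is an assumption of the sketch.

```cpp
#include <cassert>
#include <fstream>
#include <string>
#include <vector>

class SketchEnvironment
{
public:
    // Stores the directory with a trailing slash so that names can be appended.
    void InsertDirectory(const std::string& directory)
    {
        assert(!directory.empty());
        std::string stored = directory;
        if (stored.back() != '/')
        {
            stored += '/';
        }
        mDirectories.push_back(stored);
    }

    // Returns the first directory-plus-name combination that can be opened
    // for reading, or the empty string when no such file is found.
    std::string GetPathR(const std::string& name) const
    {
        for (const std::string& directory : mDirectories)
        {
            const std::string candidate = directory + name;
            std::ifstream input(candidate.c_str(), std::ios::in | std::ios::binary);
            if (input)
            {
                return candidate;
            }
        }
        return std::string();
    }

private:
    std::vector<std::string> mDirectories;
};
```

The directories are searched in insertion order, so a file that exists in
more than one folder resolves to the folder that was inserted first.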

The main entry point in the application code inserts the path to
the \Code{WildMagic5} folder.  It also inserts paths to various
\Code{WildMagic5/Data} subfolders: \Code{Wmfx}, \Code{Wmof},
\Code{Wmtf}, \Code{Wmvf}, and \Code{Im}.  More importantly, the
path to the application's project folder is inserted in the
main function.  The application initialization mechanism sets the
path, which is a static member \Code{Application::ThePath}.  In
order for this to work, it is necessary that the application
set the console title (for \Code{ConsoleApplication}-derived
classes) or the window title (for \Code{WindowApplication}-derived
classes).  For example, the application \Code{BillboardNodes}
has a class with constructor defined as
\begin{verbatim}
    BillboardNodes::BillboardNodes ()
        :
        WindowApplication3("SampleGraphics/BillboardNodes", 0, 0, 640, 480,
            Float4(0.9f, 0.9f, 0.9f, 1.0f)),
        mTextColor(1.0f, 1.0f, 1.0f, 1.0f)
    {
    }
\end{verbatim}
The window title is the quoted string.  This string is appended to
the fully qualified string for the \Code{WildMagic5} folder.  The
resulting string is the fully qualified path for the folder of the
\Code{BillboardNodes} project.

\subsection{Memory}

\subsubsection{WM4 Memory Tracking}

WM4 had a memory system that supported finding memory leaks.  The macros
\Code{WM4\_NEW} and \Code{WM4\_DELETE} are simple macros that wrap
\Code{new} and \Code{delete} when the memory system is disabled and
that wrap \Code{new(\_\_FILE\_\_,\_\_LINE\_\_)} and \Code{delete} when the
memory system is enabled.  All engine memory allocations and
deallocations use these macros so that without code changes, you can
toggle on/off the memory tracking.

The heart of the system is class \Code{Memory} whose interface is used
to override the C$++$ operators
\begin{verbatim}
    void* operator new (size_t size, char* file, unsigned int line);
    void* operator new[] (size_t size, char* file, unsigned int line);
\end{verbatim}
Although a simple system, the override affects all allocations in the
application; indirectly, any other code linked to the application is
forced to use the overridden operator.

I was not satisfied with this approach, wanting instead to provide the
ability for users to substitute in their own memory management/tracking
system that affects only Wild Magic code.  For example, a user might want
to patch in a system that gives Wild Magic a {\em memory budget}--a
fixed-size heap that the engine must use for all its memory needs.

I also was not satisfied with the C$++$ memory management itself.  In
the memory tracking, the calls to \Code{new(\_\_FILE\_\_,\_\_LINE\_\_)}
allow you to intercept the allocation request and save it for writing
to a log file at the end of an application run.  If there is a memory
leak, the log file can list information about the allocations, including
the name of the source file and the line of that file where the leaked
allocation occurred.  Unfortunately, C$++$ does not allow you to override
\Code{delete} in a way that uses the \Code{\_\_FILE\_\_} and
\Code{\_\_LINE\_\_} macros.  At first glance you might override with
\begin{verbatim}
    void operator delete (void* address, char* file, unsigned int line);
    void operator delete[] (void* address, char* file, unsigned int line);
    #define WM4_DELETE delete(__FILE__,__LINE__)
\end{verbatim}
This does not do what you think it does.  These versions of \Code{delete}
are called only when exceptions occur, and you cannot force them to be
called otherwise.  It would really be helpful to be able to log the files
and lines on which deallocations occur, especially when you want to monitor
{\em memory usage patterns} rather than memory leaks.
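
The behavior is easy to observe with a small experiment.  The sketch below
uses class-specific operators on a hypothetical class \Code{SketchObject}
rather than global overrides, and the counters are there only to record
which function ran.  An ordinary \Code{delete} expression reaches the usual
deallocation function; the version with the extra file and line parameters
runs only when the constructor throws during the matching \Code{new}.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>
#include <stdexcept>

static int gTrackedNew = 0;       // calls to new(file, line)
static int gPlacementDelete = 0;  // calls to delete(address, file, line)
static int gUsualDelete = 0;      // calls to the usual delete(address)

struct SketchObject
{
    explicit SketchObject(bool fail)
    {
        if (fail)
        {
            throw std::runtime_error("constructor failed");
        }
    }

    static void* operator new(std::size_t size, const char* /*file*/, int /*line*/)
    {
        ++gTrackedNew;
        void* address = std::malloc(size);
        if (!address)
        {
            throw std::bad_alloc();
        }
        return address;
    }

    // Called by the compiler only when the constructor throws after the
    // matching operator new above has succeeded.
    static void operator delete(void* address, const char* /*file*/, int /*line*/)
    {
        ++gPlacementDelete;
        std::free(address);
    }

    // Called for every ordinary 'delete pointer' expression.
    static void operator delete(void* address)
    {
        ++gUsualDelete;
        std::free(address);
    }
};
```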

The \Code{operator new} function is for dynamically allocating a single object,
a 0-dimensional array so to speak.  The \Code{operator new[]} function is for
dynamically allocating a 1-dimensional array of objects.  The general rule is
that if you allocate with \Code{operator new}, you must deallocate with
\Code{operator delete}.  If you allocate with \Code{operator new[]}, you must
deallocate with \Code{operator delete[]}.  If you mix these, consider that
an error in memory management, even if the application does not abnormally
terminate.  For example,
\begin{verbatim}
    MyObject* objects = new MyObject[10];
    delete[] objects;  // matches the new[] call
    delete objects;  // error - a mismatch
\end{verbatim}
It is the programmer's responsibility to ensure the new and delete calls are
matched.

C$++$ does not have new/delete operators for higher dimensional arrays.  It is
not clear how to provide language support for this in a robust manner.  For
example,
\begin{verbatim}
    MyObject** objects0 = new MyObject*[N];
    for (i = 0; i < N; ++i)
    {
        objects0[i] = new MyObject[M];
    }
    <code using objects0>;
    for (i = 0; i < N; ++i)
    {
        delete[] objects0[i];
    }
    delete[] objects0;
    
    MyObject someObjects[N];  // objects live on the stack, not in the heap
    MyObject** objects1 = new MyObject*[N];
    for (i = 0; i < N; ++i)
    {
        objects1[i] = &someObjects[i];
    }
    <code using objects1>;
    delete[] objects1;
\end{verbatim}
In the first block of code, the user has dynamically allocated a 2-dimensional
array of \Code{MyObject} objects, manipulated the objects, and then dynamically
deallocated the array one row at a time.  In the second block of code, the
user has created a 1-dimensional array of \Code{MyObject*} pointers that
point to a 1-dimensional array of \Code{MyObject} objects that live on the
stack.  It is an error to attempt to dynamically deallocate these objects.
Clearly, the semantics of \Code{objects0} and \Code{objects1} are different,
despite both being of type \Code{MyObject**}.  Without knowledge of the
semantics, it would be difficult for C$++$ to provide a new/delete pair
for \Code{Type**} pointers.

In the case when the user does want a 2-dimensional array of the form
that \Code{objects0} illustrates, you can provide your own allocation
and deallocation.  WM4 had several template functions in class \Code{System}
for allocating and deallocating 2-dimensional and 3-dimensional arrays.
The idea of these is to encapsulate the work required, hiding the details
from the user, {\em and to minimize the number of new/delete calls}.
Returning to the first code block of the example, an alternative scheme
that minimizes new/delete calls is
\begin{verbatim}
    MyObject** objects2 = new MyObject*[N];
    objects2[0] = new MyObject[N*M];
    for (i = 1; i < N; ++i)
    {
        objects2[i] = &objects2[0][M*i];
    }
    <code using objects2>;
    delete[] objects2[0];
    delete[] objects2;
\end{verbatim}
Allocation of \Code{objects0} requires $N+1$ calls to \Code{new} and
deallocation requires $N+1$ calls to \Code{delete}.  Allocation of
\Code{objects2} requires $2$ calls to \Code{new} and $2$ calls to
\Code{delete}.  Calls to new/delete can be relatively expensive because
of the work that the memory manager must do to manage the free list of
blocks, so minimizing the calls is a desirable goal.  Moreover, you are
guaranteed that the \Code{N*M} \Code{MyObject} objects are contiguous,
which can be friendly to a memory cache, and also allows you to iterate
over the 2-dimensional array as a 1-dimensional array in an efficient
manner.
\begin{verbatim}
    // Iteration as a 2-dimensional array.
    for (row = 0; row < N; ++row)
    {
        for (col = 0; col < M; ++col)
        {
            MyObject& object = objects2[row][col];
            <do something with object>;
        }
    }
    
    // Iteration as a 1-dimensional array.
    for (i = 0; i < N*M; ++i)
    {
        MyObject& object = objects2[0][i];
        <do something with object>;
    }
\end{verbatim}
In the memory allocation scheme for \Code{objects0}, you are not
guaranteed that the rows occur in contiguous memory, so there is
the potential for memory cache misses when iterating over the
2-dimensional array, and it is not possible to iterate over the
objects as a 1-dimensional array.

Allocation and deallocation of 3-dimensional arrays with a minimum
of new/delete calls is similar.
\begin{verbatim}
    MyObject*** objects3 = new MyObject**[P];
    objects3[0] = new MyObject*[P*N];
    objects3[0][0] = new MyObject[P*N*M];
    for (int j = 0; j < P; j++)
    {
        objects3[j] = &objects3[0][N*j];
        for (int i = 0; i < N; i++)
        {
            objects3[j][i] = &objects3[0][0][M*(i + N*j)];
        }
    }
    <code using objects3>;
    delete[] objects3[0][0];
    delete[] objects3[0];
    delete[] objects3;
\end{verbatim}
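
The index arithmetic in the 3-dimensional block may be easier to follow when
it is separated from the allocation.  In the sketch below, which uses
hypothetical names (\Code{FlatIndex3D}, \Code{Allocate3D},
\Code{Deallocate3D}), element \Code{[slice][row][col]} lives at offset
\Code{col + M*(row + N*slice)} from the start of the single element block.

```cpp
#include <cassert>
#include <cstddef>

// Offset of element [slice][row][col] within the contiguous element block,
// for N = numRows and M = numCols.
inline std::size_t FlatIndex3D(int slice, int row, int col,
    int numRows, int numCols)
{
    const std::size_t N = numRows, M = numCols;
    return col + M * (row + N * slice);
}

// Hypothetical helper: three calls to new, regardless of the bounds.
template <typename T>
T*** Allocate3D(int numSlices, int numRows, int numCols)
{
    assert(numSlices > 0 && numRows > 0 && numCols > 0);
    const std::size_t P = numSlices, N = numRows, M = numCols;

    T*** slices = new T**[P];
    slices[0] = new T*[P * N];
    slices[0][0] = new T[P * N * M];
    for (std::size_t j = 0; j < P; ++j)
    {
        slices[j] = slices[0] + N * j;
        for (std::size_t i = 0; i < N; ++i)
        {
            slices[j][i] = slices[0][0] + M * (i + N * j);
        }
    }
    return slices;
}

// Hypothetical helper: three calls to delete[], innermost block first.
template <typename T>
void Deallocate3D(T***& slices)
{
    if (slices)
    {
        delete[] slices[0][0];
        delete[] slices[0];
        delete[] slices;
        slices = 0;
    }
}
```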

In WM4, the allocation and deallocation are wrapped with template functions
named \Code{System::Allocate} and \Code{System::Deallocate}.  However, I find
it displeasing to have inconsistent readability by calling \Code{WM4\_NEW}
for single objects (0-dimensional) and 1-dimensional arrays but having to
call \Code{System::Allocate} for 2-dimensional and 3-dimensional arrays.

\subsubsection{WM5 Memory Tracking}

A review of the ideas in the previous section led me to the following
requirements for the WM5 memory management system.  Several additional
requirements were added as I discovered problems while developing the
memory manager.  The first item in the list applies when memory tracking is
disabled.  All other items apply when memory tracking is enabled.
\begin{enumerate}
  \item When memory tracking is disabled, the allocation and deallocation
    fall back to the standard \Code{new} and \Code{delete} calls.
  \item Support semantics for arrays of dimension two or larger.
  \item Interception of \Code{new} and \Code{delete} calls must affect
    only the Wild Magic source code; that is, a side effect should not
    be that other systems (C$++$ run-time libraries or third-party
    software) are forced to use the interception system.
  \item Provide hooks to the users for the low-level allocation and
    deallocation so that Wild Magic transparently accesses a user-specified
    heap (to enforce a memory budget).
  \item File names and line numbers must be tracked both for allocations
    and deallocations.
  \item The inclusion of \Code{\_\_FILE\_\_} and \Code{\_\_LINE\_\_}
    macros must be hidden from the user (for readability).
  \item The tracking system must be reentrant; that is, if the system
    manages containers that store tracking information and those
    containers must be dynamically allocated, they must not do so by
    using the tracking system (infinite recursion problem).
  \item The system must allow for smart pointers (reference-counted
    objects).
  \item The tracking system must be thread safe.
\end{enumerate}
I struggled with designing a system that satisfied all the requirements,
finally settling on the one that is implemented in class \Code{Memory}.
I was burned only a couple of times along the way \ldots

\vspace*{0.1in}
{\bf Usage}

Before discussing the issues in designing \Code{Memory}, let us look at
the final result and how it is used.  A set of macros is defined to
make allocation and deallocation calls simple and readable and to hide
the file-line information.  The allocation macros are named \Code{new0},
\Code{new1}, \Code{new2}, \Code{new3}, and \Code{new4}.  The numeric
suffix denotes the dimension of the allocation.  Effectively, \Code{new0}
corresponds to \Code{new} for a single object, \Code{new1} corresponds
to \Code{new[]} for a 1-dimensional array of objects, and the remaining
macros correspond to higher dimensional arrays, as described in the
previous section (minimizing the number of calls to \Code{new}).  The
corresponding deallocation macros are \Code{delete0}, \Code{delete1},
\Code{delete2}, \Code{delete3}, and \Code{delete4}.  Although it is
still the user's responsibility to pair the correct new/delete macro
calls, if there is a mismatch (on a delete call), the memory tracking
system will report this.  In my development environment, these macros
were added as user keywords, which I highlight in purple.

For allocation, all but the \Code{new0} call are templated.  Typical usage is
\begin{verbatim}
    MyObject* object = new0 MyObject(parameters);
    delete0(object);
    
    MyObject* objects1 = new1<MyObject>(numElements);
    objects1[elementIndex] = <do something>;
    delete1(objects1);
    
    MyObject** objects2 = new2<MyObject>(numRows, numColumns);
    objects2[rowIndex][columnIndex] = <do something>;
    delete2(objects2);
    
    MyObject*** objects3 = new3<MyObject>(numSlices, numRows, numColumns);
    objects3[sliceIndex][rowIndex][columnIndex] = <do something>;
    delete3(objects3);
\end{verbatim}

\vspace*{0.1in}
{\bf Design Issues}

Now for design issues.  One of the main problems I had was trying to wrap
the allocation and deallocation with macros for readability and ease of
use, yet satisfying all the requirements I mentioned previously.  It
appeared to be practically impossible to use macros, hide an overload
of \Code{operator new} specific to Wild Magic, interact properly with
\Code{new} for single objects, hide the \Code{\_\_FILE\_\_} and
\Code{\_\_LINE\_\_} macros, and fall back to standard \Code{new} and
\Code{delete} when the tracking is disabled.  Moreover, Requirement 7
is problematic, because it effectively forces you to have a container
external to the WM5 memory management system, which means a memory budget
cannot be fully enforced.  I decided that having such a container was
something I (and users) can live with--you can always estimate how large
a container will be for your application, and then factor that into your
memory budgets.

In WM4, I had a macro to wrap overloaded \Code{operator new},
\begin{verbatim}
    #define WM4_NEW new(__FILE__,__LINE__)
    void* operator new (size_t size, char* file, unsigned int line);
    void* operator new[] (size_t size, char* file, unsigned int line);
\end{verbatim}
This macro hides the \Code{\_\_FILE\_\_} and \Code{\_\_LINE\_\_} macros,
satisfying Requirement 6.  However, the overloaded allocators violate
Requirement 3--the compiler would generate code for non-Wild-Magic code
that uses the overloads.  Regardless, such a simple macro cannot simultaneously
hide the file-line macros, the dimension of the array to be allocated, and
wrap the overloaded \Code{new}.

I was able to accomplish some of the hiding, but suffered the consequence of
needing lines of code such as
\begin{verbatim}
    MyObject* object = WM5_NEW(MyObject, constructorParameters WM5_FILE_LINE);
\end{verbatim}
where \Code{WM5\_FILE\_LINE} expanded to nothing when memory tracking was
disabled, but expanded to
\begin{verbatim}
    #define WM5_FILE_LINE , __FILE__, __LINE__
\end{verbatim}
when memory tracking was enabled.  I was able to circumvent this problem by
designing \Code{Memory} so that objects of this class were only temporary
(for one line of code) but stored the file-line information.  This also
addressed Requirement 5 (tracking delete calls).  Specifically, class
\Code{Memory} has members \Code{mFile} and \Code{mLine} and a constructor
\begin{verbatim}
    Memory::Memory (const char* file, int line) : mFile(file), mLine(line) { }
\end{verbatim}
The file-line information persists only while the temporary object exists, so
it is temporarily accessible to the memory tracking system.

The overloaded allocation operator has signature
\begin{verbatim}
    void* operator new (size_t numBytes, const Wm5::Memory& memory);
\end{verbatim}
This satisfies Requirement 3 in that it is not possible for the compiler to
match this against allocation calls outside the Wild Magic 5 engine.  There
was no need to overload \Code{operator new[]}.
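
The interaction between the temporary object and the overloaded operator can
be tried in isolation.  The sketch below is not the WM5 implementation: the
namespace \Code{demo}, the macro \Code{DEMO\_NEW0}, and the
\Code{LastRequest} record (a stand-in for the tracking map) are hypothetical
names chosen for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

namespace demo
{
    // Temporary object that carries the call site to the allocator.
    class Memory
    {
    public:
        Memory(const char* file, int line) : mFile(file), mLine(line) {}
        const char* mFile;
        int mLine;
    };

    // Stand-in for the tracking map: remembers only the latest request.
    struct Request
    {
        const char* file;
        int line;
        std::size_t numBytes;
    };

    inline Request& LastRequest()
    {
        static Request request = { 0, 0, 0 };
        return request;
    }

    // Destroys the object and frees the block obtained from malloc.
    template <typename T>
    void Delete0(T*& data)
    {
        if (data)
        {
            data->~T();
            std::free(data);
            data = 0;
        }
    }
}

// The extra parameter type lives in namespace demo, so code outside the
// sketch cannot select this overload by accident.
void* operator new(std::size_t numBytes, const demo::Memory& memory)
{
    assert(memory.mFile != 0);
    demo::Request& request = demo::LastRequest();
    request.file = memory.mFile;
    request.line = memory.mLine;
    request.numBytes = numBytes;

    void* block = std::malloc(numBytes);
    if (!block)
    {
        throw std::bad_alloc();
    }
    return block;
}

// Matching form; the compiler calls it only if a constructor throws.
void operator delete(void* block, const demo::Memory&)
{
    std::free(block);
}

#define DEMO_NEW0 new(demo::Memory(__FILE__, __LINE__))
```

With these definitions, \Code{int* value = DEMO\_NEW0 int(7);} records the
file, line, and byte count of the request before the \Code{int} is
constructed, and \Code{demo::Delete0(value)} releases it.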

Some of the macros for allocation and deallocation are
\begin{verbatim}
    #define new0 new(Wm5::Memory(__FILE__,__LINE__))
    #define new1 Wm5::Memory(__FILE__,__LINE__).New1
    #define delete0 Wm5::Memory(__FILE__,__LINE__).Delete0
    #define delete1 Wm5::Memory(__FILE__,__LINE__).Delete1
\end{verbatim}
Notice that \Code{new0} uses the overloaded \Code{new} operator, where the
input \Code{memory} is a reference to the temporary \Code{Memory} object.
In the implementation of the overloaded \Code{new} operator, the memory
tracking system has access to file-line information because the temporary
object stores that information.

\vspace*{0.1in}
{\bf Template and Macro Interaction}

Notice that \Code{new1} raises some additional questions.  The intent is for
this macro to support allocation of 1-dimensional arrays of {\em any type}.
The type information is not part of the macro.  One could explore the
possibility for including the type as a macro parameter, and I did explore
this.  You quickly run into the problem that the types might be template
types with multiple parameters separated by commas.  These commas interfere
with the preprocessor's parsing of the macro.  For example, you might try
\begin{verbatim}
    #define new1(type) new(Wm5::Memory(__FILE__,__LINE__)) type
    float* anArray = new1(float)[10];  // okay
    MyTemplate<int,float>* anotherArray = new1(MyTemplate<int,float>)[10];  // error
\end{verbatim}
The last line is a problem because the preprocessor thinks that
\Code{MyTemplate<int} is the macro parameter.  To convince the preprocessor otherwise
would require an extra pair of parentheses
\begin{verbatim}
    new1((MyTemplate<int,float>))[10];  // still an error
\end{verbatim}
but this does not work because the extra parentheses now cause a syntax
error when the compiler tries to determine the type of the allocation.  A fix is
to use
\begin{verbatim}
    typedef MyTemplate<int,float> MyTemplateIF;
    MyTemplateIF* anotherArray = new1(MyTemplateIF)[10];  // okay
\end{verbatim}
but then the user has to make excessive use of \Code{typedef}.  There were
other situations in the engine where I wanted to pass template types through
macro parameters, but the template-comma/macro-comma problem prevented those,
too.  It would have been nice had C$++$ provided a separator other than a
comma for multiple template parameters.

At any rate, the \Code{Memory} class was then designed to have functions
\Code{New1}, \Code{New2}, and so on, that are templated.  This avoids having
to pass template types through macro parameters, but runs the risk of generating
excessive code.  These templated member functions are why the previous example
had code such as
\begin{verbatim}
    MyObject* objects1 = new1<MyObject>(numElements);
    
    // The macro expanded code.
    MyObject* objects1 = Wm5::Memory(__FILE__,__LINE__).New1<MyObject>(numElements);
\end{verbatim}
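
The reason this form sidesteps the comma problem is that the type is written
after the macro name, as ordinary C$++$ text, rather than inside a macro
argument list.  The sketch below illustrates this with a two-parameter
template type; \Code{demo::Memory}, \Code{DEMO\_NEW1}, and
\Code{DEMO\_DELETE1} are hypothetical stand-ins, and the tracking itself is
omitted.

```cpp
#include <cassert>
#include <utility>

namespace demo
{
    class Memory
    {
    public:
        Memory(const char* file, int line) : mFile(file), mLine(line) {}

        template <typename T>
        T* New1(int bound0) const
        {
            assert(bound0 > 0);
            // A real tracker would record mFile and mLine here.
            return new T[bound0]();
        }

        template <typename T>
        void Delete1(T*& data) const
        {
            delete[] data;
            data = 0;
        }

        const char* mFile;
        int mLine;
    };
}

// The macros take no parameters, so the preprocessor never sees the
// template argument list.
#define DEMO_NEW1 demo::Memory(__FILE__, __LINE__).New1
#define DEMO_DELETE1 demo::Memory(__FILE__, __LINE__).Delete1

// Allocates an array of a type with a comma in its name and counts the
// value-initialized (zero) elements.
inline int CountZeroPairs(int count)
{
    std::pair<int, float>* pairs = DEMO_NEW1<std::pair<int, float> >(count);
    int numZero = 0;
    for (int i = 0; i < count; ++i)
    {
        if (pairs[i].first == 0 && pairs[i].second == 0.0f)
        {
            ++numZero;
        }
    }
    DEMO_DELETE1(pairs);
    return numZero;
}
```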

\vspace*{0.1in}
{\bf Lack of Specialized New0}

Observe that there is no templated function \Code{Memory::New0}.  I had hoped to
have consistent coding style for all allocations, wanting
\begin{verbatim}
    MyObject* object = new0<MyObject>(parameters);
\end{verbatim}
My first pass on the design and implementation used this approach, and the
\Code{Memory} class had a large number of \Code{New0} functions, one for a
default (0-parameter) constructor, one for a 1-parameter constructor, and so on.
The implementation was along the lines of the following abstraction for a
2-parameter constructor,
\begin{verbatim}
    template <typename T, typename Param0, typename Param1>
    T* Memory::New0 (Param0 p0, Param1 p1)
    {
        // Memory tracking code not shown...
        return new T(p0, p1);
    }
\end{verbatim}
During testing, I was burned by this approach.  A class had a constructor with
a constant reference, say, \Code{MyClass::MyClass (int i, const SomeClass\& object)}.
\begin{verbatim}
    SomeClass object = <some object>;
    MyClass* something = new0<MyClass>(i, object);
\end{verbatim}
The compiler determined \Code{Param0} was \Code{int} and \Code{Param1} was
\Code{SomeClass}, {\em not} \Code{const SomeClass\&}.  The generated code included
creating a temporary copy of \Code{object} and passing the copy to the \Code{MyClass}
constructor, which had some difficult-to-diagnose side effects.  Realizing that the
difference was template code generation instead of macro textual substitution, I
removed the support in \Code{Memory} for templated allocations of single objects.
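
The hidden copy is easy to reproduce.  In the sketch below,
\Code{New0ByValue} mirrors the by-value parameters of the abstraction shown
earlier, and a counter in the copy constructor reveals the temporary.  For
comparison only, \Code{New0Forward} uses C$++$11 forwarding references, a
language feature that is an assumption of this sketch and is not used by the
\Code{Memory} class described here.  All names are hypothetical.

```cpp
#include <cassert>
#include <utility>

namespace demo
{
    struct SomeClass
    {
        SomeClass() {}
        SomeClass(const SomeClass&) { ++msCopies; }
        static int msCopies;
    };
    int SomeClass::msCopies = 0;

    struct MyClass
    {
        // If the constructor stored the address of 'object', a copy made
        // by the allocation wrapper would leave that address dangling.
        MyClass(int i, const SomeClass&) : mI(i) {}
        int mI;
    };

    // Param1 is deduced as SomeClass, so the argument is copied.
    template <typename T, typename Param0, typename Param1>
    T* New0ByValue(Param0 p0, Param1 p1)
    {
        return new T(p0, p1);
    }

    // Param1 is deduced as SomeClass&, so no copy is made (C++11).
    template <typename T, typename Param0, typename Param1>
    T* New0Forward(Param0&& p0, Param1&& p1)
    {
        return new T(std::forward<Param0>(p0), std::forward<Param1>(p1));
    }

    // Returns the number of SomeClass copies made by one allocation.
    inline int CopiesMade(bool forward)
    {
        SomeClass object;
        SomeClass::msCopies = 0;
        MyClass* something = forward
            ? New0Forward<MyClass>(1, object)
            : New0ByValue<MyClass>(1, object);
        assert(something->mI == 1);
        delete something;
        return SomeClass::msCopies;
    }
}
```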

\vspace*{0.1in}
{\bf Hooks for User-Specified Allocations and Deallocations}

There is a static function \Code{Memory::Initialize} that allows the user to specify
low-level memory allocators and deallocators.  Defaults are provided, namely,
\Code{Memory::DefaultAllocator}, which wraps \Code{malloc}, and
\Code{Memory::DefaultDeallocator}, which wraps \Code{free}.  The functions provided
by the user must have parameters for the file name and line number, even if the user
is not interested in this information.  The hooks for allocation and deallocation
allow you to provide a fixed-size heap when you want to insist on memory budgets
for the components of your application.
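
As an illustration of what such hooks might look like, the sketch below
enforces a byte budget on top of \Code{malloc}.  The hook signatures are an
assumption (the text states only that the file name and line number are
passed), the names \Code{BudgetAllocator} and \Code{BudgetDeallocator} are
hypothetical, and the sketch is not thread safe.  A true fixed-size heap
would carve blocks from a preallocated buffer instead.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

namespace demo
{
    // Assumed hook signatures; the actual WM5 typedefs may differ.
    typedef void* (*Allocator)(std::size_t numBytes, const char* file,
        int line);
    typedef void (*Deallocator)(void* block, const char* file, int line);

    const std::size_t BUDGET_IN_BYTES = 1024;
    std::size_t gBytesInUse = 0;

    // The header records the request size so that the deallocator can
    // return the bytes to the budget.  The alignment keeps the block that
    // follows the header suitably aligned for any type.
    struct alignas(std::max_align_t) Header
    {
        std::size_t numBytes;
    };

    void* BudgetAllocator(std::size_t numBytes, const char*, int)
    {
        if (numBytes > BUDGET_IN_BYTES - gBytesInUse)
        {
            return 0;  // the request exceeds the remaining budget
        }

        Header* header = static_cast<Header*>(
            std::malloc(sizeof(Header) + numBytes));
        if (!header)
        {
            return 0;
        }
        header->numBytes = numBytes;
        gBytesInUse += numBytes;
        return header + 1;
    }

    void BudgetDeallocator(void* block, const char*, int)
    {
        if (block)
        {
            Header* header = static_cast<Header*>(block) - 1;
            assert(header->numBytes <= gBytesInUse);
            gBytesInUse -= header->numBytes;
            std::free(header);
        }
    }

    // The functions match the assumed hook types.
    Allocator gAllocator = &BudgetAllocator;
    Deallocator gDeallocator = &BudgetDeallocator;
}
```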

\vspace*{0.1in}
{\bf Memory Tracking}

The \Code{Memory} class maintains a map of the memory that is currently allocated
by Wild Magic; see static member \Code{msMap}.  This map uses memory from the
global heap, so is not part of any user-specified heap implied by the hooks to
low-level allocators and deallocators.  To avoid pre-main allocation, \Code{msMap}
is a pointer to a map and must be allocated during initialization of the application.
This is performed in \Code{Memory::Initialize}, which is called in \Code{main}
in \Code{Wm5Application.cpp}.  There is a matching \Code{Memory::Termination} function
that is also called in \Code{main}.  Note that \Code{msMap} is shared data, so it must
be protected from concurrent accesses when running in a multithreaded environment.
\Code{Memory} provides a mutex for the critical sections that access \Code{msMap};
see static member \Code{msMutex}.

When a call is made to \Code{new0}, the overloaded \Code{operator new} is called.
The implementation is in \Code{Wm5Memory.h}.  A trap is supplied to ensure that
\Code{msMap} was actually allocated; if the trap is activated, an assertion is
triggered to let you know that the map does not exist.  The most likely event is
that you are trying to allocate memory before \Code{main} has been called (such
as global objects within file scope that require dynamic allocation of members).
In this event, the allocation does not fail (in release builds); rather, it just
uses \Code{malloc} and does not track the memory.

When the map does exist, the static member function \Code{Memory::CreateBlock} is
called.  Its parameters are the number of bytes to be allocated and the dimension
of the request, which is zero for \Code{new0}.  \Code{CreateBlock} has a critical
section that calls \Code{msAllocator}, which is either \Code{Memory::DefaultAllocator}
or an allocator supplied by the user via \Code{Memory::Initialize}.  The address
of the allocated block is the key for the map entry and a \Code{Memory::Information}
object is created to be the value for the map entry.  The information object stores
the number of bytes requested, the number of dimensions, the file name, and the line
number for which the request was made.
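
The bookkeeping just described can be sketched in a few lines.  This is an
illustration under stated assumptions and not the WM5 source: the map and
mutex are function-local statics rather than the \Code{msMap} pointer and
\Code{msMutex} member, \Code{std::mutex} (C$++$11) stands in for the engine's
mutex class, the file and line are passed as parameters, and
\Code{ReleaseBlock} is a hypothetical helper that shows how the stored
dimension allows a mismatch to be detected later.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <map>
#include <mutex>

namespace demo
{
    struct Information
    {
        std::size_t numBytes;
        int numDimensions;
        const char* file;
        int line;
    };

    // The map uses the default allocator (global heap), so inserting into
    // it never re-enters the tracking code.
    typedef std::map<void*, Information> MemoryMap;

    inline MemoryMap& Map()
    {
        static MemoryMap map;
        return map;
    }

    inline std::mutex& Mutex()
    {
        static std::mutex mutex;
        return mutex;
    }

    // Allocates a block and records it, keyed by its address.
    void* CreateBlock(std::size_t numBytes, int numDimensions,
        const char* file, int line)
    {
        assert(numBytes > 0);
        std::lock_guard<std::mutex> lock(Mutex());
        void* block = std::malloc(numBytes);
        if (block)
        {
            const Information info = { numBytes, numDimensions, file, line };
            Map().insert(MemoryMap::value_type(block, info));
        }
        return block;
    }

    // Frees the block only when it is tracked and its stored dimension
    // matches; otherwise reports failure and leaves the block alone.
    bool ReleaseBlock(void* block, int expectedDimensions)
    {
        std::lock_guard<std::mutex> lock(Mutex());
        MemoryMap::iterator iter = Map().find(block);
        if (iter == Map().end()
        ||  iter->second.numDimensions != expectedDimensions)
        {
            return false;
        }
        Map().erase(iter);
        std::free(block);
        return true;
    }
}
```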

When a call is made to \Code{new1}, more work must occur than that for \Code{new0}.
A trap also occurs in \Code{Memory::New1} for an allocation request that is made
before \Code{msMap} exists.  If the request is made pre-\Code{main}, then the standard
C$++$ \Code{new[]} function is called and the memory is not tracked.  I recommend
that you not allocate pre-\Code{main}, because it makes for more predictable debugging
(in a single-threaded environment) when all allocations occur within the scope of
\Code{main} (including any of the functions it calls).

If the allocation request is made when the map exists, the allocation in
\Code{Memory::CreateBlock} uses low-level C-style memory allocation (\Code{malloc}
by default).  However, the call to \Code{new1} is for an array of objects that must
then be default constructed.  This is accomplished by calling the placement-new
operator.

\scriptsize
\begin{verbatim}
    template <typename T>
    T* Memory::New1 (const int bound0)
    {
        T* data;
        if (msMap)
        {
            // Insert T[] into memory map.
            data = (T*)CreateBlock(bound0*sizeof(T), 1);

            // Call the default constructors for T.
            T* object = data;
            for (int i = 0; i < bound0; ++i, ++object)
            {
                ::new(object) T;  // THE PLACEMENT-NEW CALL
            }
        }
        else
        {
    #ifdef WM5_USE_MEMORY_ASSERT_ON_PREMAIN_POSTMAIN_OPERATIONS
            assertion(false, "Pre-main allocations are not tracked.\n");
    #endif
            data = new T[bound0];
        }
        return data;
    }
\end{verbatim}
\normalsize

The implementations for \Code{New2}, \Code{New3}, and \Code{New4} are similar.

The implementations of \Code{Delete0} through \Code{Delete4} have a similar flavor.
If the map does not exist when a deletion is requested, most likely the problem
is post-main deallocation.  A trap is set for this and, if encountered, the memory
is deleted using the standard C$++$ \Code{delete} operator.  If the map does exist,
then a critical section is entered and \Code{msMap} is searched for the
address-information pair that should be in the map--the memory was allocated at some
previous time.  It is possible that the pair is not in the map, perhaps because of a double
deletion, so an assertion is triggered.  In release configurations, the deletion is
actually made using the standard C$++$ \Code{delete} operator.  (It is possible that
\Code{new} was used to allocate but \Code{delete0} was used to deallocate.)

When the pair exists in the map, a comparison is made between the \Code{Information}
member for number of dimensions and the dimension implied by the \Code{deleteN} call
(\Code{N} is 0, 1, 2, 3, or 4).  If there is a mismatch, an assertion is triggered.
The goal is to provide debugging support to let the user know that there is a mismatch
in allocation and deallocation calls.

Assuming the pair exists and the dimensions match, the object must be destroyed.
Because this is not an implicit generation by the compiler of a destructor call,
an explicit destruction call must be made.  For example,

\scriptsize
\begin{verbatim}
    template <typename T>
    void Memory::Delete0 (T*& data)
    {
        if (data)
        {
            if (!msMap)
            {
    #ifdef WM5_USE_MEMORY_ASSERT_ON_PREMAIN_POSTMAIN_OPERATIONS
                assertion(false, "Post-main deallocations are not tracked.\n");
    #endif
                delete data;
                data = 0;
                return;
            }

            msMutex.Enter();

            MemoryMap::iterator iter = msMap->find(data);
            if (iter != msMap->end())
            {
                if (iter->second.mNumDimensions == 0)
                {
                    // Call destructor for T.  If T is a pointer type, the
                    // compiler will not generate any code for the destructor
                    // call.
                    data->~T();  // EXPLICIT CALL TO THE DESTRUCTOR

                    // Remove T from memory map.
                    msMap->erase(data);
                    msDeallocator(data, mFile, mLine);
                }
                else
                {
                    assertion(false, "Mismatch in dimensions.\n");
                }
            }
            else
            {
    #ifdef WM5_USE_MEMORY_ALLOW_DELETE_ON_FAILED_MAP_LOOKUP
                delete data;
    #else
                assertion(false, "Memory block not in map.\n");
    #endif
            }

            data = 0;

            msMutex.Leave();
        }
    }
\end{verbatim}
\normalsize

After the object(s) is destroyed, the address-information pair is removed from the map.
Finally, the memory is deallocated by a call to \Code{msDeallocator}, which is either
\Code{Memory::DefaultDeallocator} or a function provided by the user in the call to
\Code{Memory::Initialize}.
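
For arrays, the same steps apply to every element.  The sketch below pairs a
placement-new loop with a matching loop of explicit destructor calls.  It
assumes that the element count can be recovered from the stored byte count;
the names are hypothetical, the map is a stand-in for the tracking map, and
locking and dimension checks are omitted for brevity.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <map>
#include <new>

namespace demo
{
    // Address -> number of bytes requested.
    inline std::map<void*, std::size_t>& Map()
    {
        static std::map<void*, std::size_t> map;
        return map;
    }

    template <typename T>
    T* New1(int bound0)
    {
        assert(bound0 > 0);
        const std::size_t numBytes = bound0 * sizeof(T);
        void* block = std::malloc(numBytes);
        if (!block)
        {
            throw std::bad_alloc();
        }
        Map()[block] = numBytes;

        // Raw memory holds no objects yet; construct each one in place.
        T* data = static_cast<T*>(block);
        for (int i = 0; i < bound0; ++i)
        {
            ::new(static_cast<void*>(data + i)) T;
        }
        return data;
    }

    // Returns false when the block was not in the map.
    template <typename T>
    bool Delete1(T*& data)
    {
        std::map<void*, std::size_t>::iterator iter = Map().find(data);
        const bool found = (data != 0 && iter != Map().end());
        if (found)
        {
            // free() knows nothing about T, so destroy each element first.
            const std::size_t numElements = iter->second / sizeof(T);
            for (std::size_t i = 0; i < numElements; ++i)
            {
                data[i].~T();
            }
            Map().erase(iter);
            std::free(data);
        }
        data = 0;
        return found;
    }

    // Test type that counts live instances.
    struct Counted
    {
        Counted() { ++msAlive; }
        ~Counted() { --msAlive; }
        static int msAlive;
    };
    int Counted::msAlive = 0;
}
```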

\vspace*{0.1in}
{\bf Fallback to Standard C$++$ Calls}

Enabling or disabling the WM5 memory tracking system is accomplished by
symbols in \Code{Wm5CoreLIB.h}.  The default is that it is enabled in debug
configurations, whereby \Code{WM5\_USE\_MEMORY} is defined.  When the memory system
is disabled, the macros \Code{new0} through \Code{new4} and \Code{delete0} through
\Code{delete4} are expanded to inline function calls.  The signatures are provided
in \Code{Wm5Memory.h} and the implementations are in \Code{Wm5Memory.inl}.  These
functions only use C$++$ \Code{new} and \Code{delete} calls; in fact, the class
\Code{Memory} is not even defined when the memory system is disabled.

\vspace*{0.1in}
{\bf Smart Pointers}

WM4 has a reference-counting system that is implemented in class \Code{SmartPointer}.
This system is tied to the base class \Code{Object}.  In particular, each \Code{Object}
manages its own reference count.  Firstly, this is not thread safe.  You can have a
race condition when two threads are attempting to modify the reference counter when
the object is being accessed by both threads.  Secondly, this ties the reference
counting to the Wild Magic graphics library.  Thirdly, the smart pointers work only
for single objects.  Arrays of objects must be handled differently; for example, see
the \Code{Wm4TSharedArray} class.

In WM5, the smart pointers are thread safe, the reference counting is external (not
part of some base class for object-oriented support), and there are various smart
pointer classes to support sharing of arrays as well as single objects.  The
implementation is in files \Code{Wm5SmartPointer.*}.

The base class for smart pointers is \Code{PointerBase}.  This is similar to the
\Code{Memory} class in that a map is used to keep track of objects that are
currently reference counted (the references that were managed by the WM4
\Code{Object}s are now managed by an external system).  One difference, though,
is that the \Code{msMap} member is an object, not a pointer.  You may not create
reference counted objects pre-\Code{main} and they may not be destroyed
post-\Code{main}--I can modify this to be allowed, but it is better for
ease of debugging not to allocate/deallocate before/after \Code{main}.

The derived class \Code{Pointer0} of WM5 is equivalent to the WM4 class \Code{Pointer}.
The suffix of $0$ denotes that this class is for sharing of single objects (0-dimensional).
The derived class \Code{Pointer1} is used to share 1-dimensional arrays.  There is no
need for a separate class such as \Code{Wm4TSharedArray}.  Other smart pointer classes
exist for sharing 2-, 3-, and 4-dimensional arrays.

The semantics are the same as they were in WM4.  When an object is shared by someone new,
the (external) reference count is incremented.  When a shared object goes out of scope,
its (external) reference count is decremented.  When the reference count becomes zero,
the object is deleted/deallocated.  The code has traps for various unexpected conditions,
and asserts are triggered accordingly.
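
A minimal sketch of external reference counting follows.  It is not the WM5
implementation: only the class names \Code{PointerBase} and \Code{Pointer0}
come from the text, while the member functions, the function-local static
map, and the use of \Code{std::mutex} (C$++$11) are assumptions made for
illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <mutex>

namespace demo
{
    class PointerBase
    {
    public:
        static std::size_t GetNumTracked()
        {
            std::lock_guard<std::mutex> lock(Mutex());
            return Counts().size();
        }

        static int GetCount(void* object)
        {
            std::lock_guard<std::mutex> lock(Mutex());
            std::map<void*, int>::const_iterator iter = Counts().find(object);
            return iter != Counts().end() ? iter->second : 0;
        }

    protected:
        static void AddReference(void* object)
        {
            if (object)
            {
                std::lock_guard<std::mutex> lock(Mutex());
                ++Counts()[object];
            }
        }

        // Returns true when the last reference was removed.
        static bool RemoveReference(void* object)
        {
            if (!object) { return false; }
            std::lock_guard<std::mutex> lock(Mutex());
            std::map<void*, int>::iterator iter = Counts().find(object);
            if (iter == Counts().end() || --iter->second > 0)
            {
                return false;
            }
            Counts().erase(iter);
            return true;
        }

    private:
        // The counts live outside the shared objects.
        static std::map<void*, int>& Counts()
        {
            static std::map<void*, int> counts;
            return counts;
        }
        static std::mutex& Mutex()
        {
            static std::mutex mutex;
            return mutex;
        }
    };

    template <typename T>
    class Pointer0 : public PointerBase
    {
    public:
        explicit Pointer0(T* data = 0) : mData(data) { AddReference(mData); }
        Pointer0(const Pointer0& p) : mData(p.mData) { AddReference(mData); }
        ~Pointer0() { Release(); }

        Pointer0& operator=(const Pointer0& p)
        {
            if (mData != p.mData)
            {
                AddReference(p.mData);
                Release();
                mData = p.mData;
            }
            return *this;
        }

        T& operator*() const { assert(mData != 0); return *mData; }

    private:
        void Release()
        {
            if (RemoveReference(mData)) { delete mData; }
            mData = 0;
        }

        T* mData;
    };
}
```

Because the count is keyed by address rather than stored in the object, the
same scheme works for types that do not derive from a common base class.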

\subsection{ObjectSystems}

\subsubsection{Initialization and Termination}

WM4 provides the ability for each class to have static initialization and termination
functions.  These are registered pre-\Code{main}.  The initializers are executed after
\Code{main} begins but before the application starts (before \Code{Application::Run} is
executed).  The terminators are executed after the application finishes but before
\Code{main} ends.  This allows you to have better predictability of what your application
is doing--you have no control over the order of pre-main initialization calls and
post-main termination calls that are generated by the compiler.  WM5 uses the same
system for initialization and termination.
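
The register-then-execute idea can be sketched as follows.  The names
(\Code{InitTerm}, \Code{Catalog}, and their members) are hypothetical and
the table size is arbitrary.  The point of the sketch is that the pre-main
step only stores function pointers in a fixed-size table, so no real work
and no dynamic allocation happens before \Code{main}.

```cpp
#include <cassert>

namespace demo
{
    typedef void (*Function)();

    class InitTerm
    {
    public:
        static void AddInitializer(Function f) { Add(Initializers(), f); }
        static void AddTerminator(Function f) { Add(Terminators(), f); }
        static void ExecuteInitializers() { Execute(Initializers()); }
        static void ExecuteTerminators() { Execute(Terminators()); }

    private:
        enum { MAX_ELEMENTS = 64 };

        struct Table
        {
            int numElements;
            Function elements[MAX_ELEMENTS];
        };

        // Function-local statics are ready whenever registration runs.
        static Table& Initializers()
        {
            static Table table = { 0, { 0 } };
            return table;
        }
        static Table& Terminators()
        {
            static Table table = { 0, { 0 } };
            return table;
        }

        static void Add(Table& table, Function f)
        {
            assert(table.numElements < MAX_ELEMENTS);
            table.elements[table.numElements++] = f;
        }

        static void Execute(Table& table)
        {
            for (int i = 0; i < table.numElements; ++i)
            {
                table.elements[i]();
            }
        }
    };

    // Hypothetical class whose static data needs setup and teardown.
    class Catalog
    {
    public:
        static bool IsReady() { return msReady; }

        static bool Register()
        {
            InitTerm::AddInitializer(&Catalog::Initialize);
            InitTerm::AddTerminator(&Catalog::Terminate);
            return true;
        }

    private:
        static void Initialize() { msReady = true; }
        static void Terminate() { msReady = false; }
        static bool msReady;
    };
    bool Catalog::msReady = false;

    // Runs pre-main, but only records the two function pointers.
    bool gCatalogRegistered = Catalog::Register();
}
```

In an application, \Code{main} would call
\Code{demo::InitTerm::ExecuteInitializers()} before running and
\Code{demo::InitTerm::ExecuteTerminators()} after finishing.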

\subsubsection{The Object Base Class}

Just like WM4, WM5 has a base class called \Code{Object} that provides various
services for large libraries.  The class supports run-time type information (RTTI),
naming of objects, and streaming.  The WM4 base class also had the foundation for
smart pointers, but in WM5 the smart pointer system is external (not part of
\Code{Object}).

RTTI and naming remain unchanged from WM4 to WM5.  However, the streaming system
was significantly revamped.  From a high-level perspective, the interface functions
for streaming are the same (although I skipped porting the \Code{StringTree} code).
However, the streaming is now factored into input streaming and output streaming.
The linker pass has had a major overhaul (described later).

The WM5 streaming system has a new feature that turned out to be necessary
when I painted myself into a corner.  The loading system used the default
constructor for \Code{Object}-derived classes to generate an object via a
factory.  This object was then assigned data that was loaded from disk.  There
are times where the default constructor performs significant work, such as
memory allocation.  The loading system really needed a ``clean object'' created.
In the case of default construction that contains memory allocation, some
hard-to-track memory leaks were occurring.  The load-data-and-assign-to-object
paradigm itself was allocating memory for various members and overwriting the
pointers that were allocated by the default constructor.  To circumvent this
subtlety, \Code{Object} has an enumeration \Code{LoadConstructor} with a single
member \Code{LC\_LOADER}.  There is a constructor \Code{Object(LoadConstructor)}
and each derived class must have such a constructor.  These are now what the
loading system uses, so you do not have to worry about loading interfering
with the default constructor semantics.
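
The following sketch shows the pattern on a small scale.  Only
\Code{Object}, \Code{LoadConstructor}, and \Code{LC\_LOADER} are taken from
the text; the derived class \Code{Samples} and its members are hypothetical.
The default constructor allocates, whereas the load constructor leaves the
object clean so that \Code{Load} can allocate without leaking.

```cpp
#include <cassert>
#include <cstring>

namespace demo
{
    class Object
    {
    public:
        enum LoadConstructor { LC_LOADER };

        Object() {}
        explicit Object(LoadConstructor) {}
        virtual ~Object() {}
    };

    // Hypothetical derived class that owns an array.
    class Samples : public Object
    {
    public:
        enum { DEFAULT_QUANTITY = 16 };

        // Normal construction performs significant work (an allocation).
        Samples()
            : mQuantity(DEFAULT_QUANTITY),
              mValues(new float[DEFAULT_QUANTITY]())
        {
        }

        // Construction for the loader allocates nothing.
        explicit Samples(LoadConstructor value)
            : Object(value), mQuantity(0), mValues(0)
        {
        }

        virtual ~Samples() { delete[] mValues; }

        // Stand-in for the loader assigning data read from disk.  It relies
        // on the object being clean; otherwise the old array would leak.
        void Load(const float* source, int quantity)
        {
            assert(mValues == 0 && quantity > 0);
            mQuantity = quantity;
            mValues = new float[quantity];
            std::memcpy(mValues, source, quantity * sizeof(float));
        }

        int GetQuantity() const { return mQuantity; }
        const float* GetValues() const { return mValues; }

    private:
        Samples(const Samples&);             // not copyable in this sketch
        Samples& operator=(const Samples&);

        int mQuantity;
        float* mValues;
    };
}
```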

\subsubsection{Run-Time Type Information}

Support for run-time type information has not changed from that of WM4.  The
template functions \Code{StaticCast} and \Code{DynamicCast} still exist.  The
\Code{Object} members \Code{IsExactly}, \Code{IsDerived}, \Code{IsExactlyTypeOf},
and \Code{IsDerivedTypeOf} still exist.

\subsubsection{Object Names}

Support for object names has not changed from that of WM4.  The \Code{Object}
members \Code{GetObjectByName} and \Code{GetAllObjectsByName} still exist.

\subsubsection{Streaming}

The streaming system was factored into support for input streams (reading from
disk or from buffer) and for output streams (writing to disk or to buffer).  The
public interfaces are reduced to the bare essentials.

The input streaming is implemented in class \Code{InStream}.  You can create and
destroy such objects.  You can either load objects from a buffer (in memory) or from
a file (on disk).  Once objects are loaded, you can access them via the member
functions \Code{GetNumObjects} and \Code{GetObjectAt}.  The low-level reading
functions are templatized.  Specializations of some of these are provided by
other classes (in the graphics library), specifically those that are aggregates
of native types.

The output streaming is implemented in class \Code{OutStream}.  You can create and
destroy such objects.  You can either save objects to a buffer (in memory) or to
a file (on disk).  Once an output stream is created, you can insert objects to
be streamed via the member function \Code{Insert}.  The low-level writing
functions are templatized.  Specializations of some of these are provided by
other classes (in the graphics library), specifically those that are aggregates
of native types.

The linker system was overhauled.  In WM4, \Code{Object*} pointers were written
to disk for output streaming.  The written data was simply the memory addresses.
When a file was loaded for input streaming, the memory addresses in the file are
of course no longer valid addresses, but they were used as unique identifiers
for the objects.  For each unique identifier, an \Code{Object} is created and
paired with the identifier.  After all \Code{Object}s are created (the loading
phase), any \Code{Object*} data members contain the unique identifiers.  The
linker phase then kicks in and the unique identifiers are replaced by the
actual memory addresses for the corresponding objects.

Two problems occur with this system.  Firstly, I had to account for the fact that
some computers have 32-bit addresses and others have 64-bit addresses.  Each
memory address was packed into 64-bits on writing and the unique identifiers
were extracted from 64-bits on reading.  Secondly, the same scene graph saved twice
can lead to two scene graphs on disk that a byte-by-byte difference program
will report as different.  For example, if you run an application and save
the scene, then re-run the application and save the scene again, the streamed
files can have differences because memory addresses of the \Code{Object}s are
different {\em even though the scenes are the same at a high level}.

All that is necessary is that a unique identifier be assigned to an \Code{Object*}
during a save operation, and that unique identifier is written to disk.  And
the generation of the unique identifier must not depend on application state
(such as memory addresses).  The WM5 linker system does this.  Now when you
stream the same scene graph to disk multiple times, those files are the same
byte-by-byte.  (This assumes the saves are to the same endian-order platform.)
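
One way to generate identifiers that do not depend on memory addresses is to
number the objects in the order in which a traversal first visits them.  The
sketch below takes that approach; it is an assumption made for illustration,
not a description of the actual WM5 linker, and \Code{Node},
\Code{Register}, and \Code{Save} are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

namespace demo
{
    struct Node
    {
        explicit Node(int value) : payload(value) {}
        int payload;
        std::vector<Node*> children;
    };

    // Assigns 1, 2, 3, ... in depth-first visitation order.  Addresses are
    // used only as in-memory lookup keys and are never written out.
    void Register(const Node* node,
        std::map<const Node*, unsigned int>& ids,
        std::vector<const Node*>& ordered)
    {
        if (!node || ids.count(node) > 0)
        {
            return;  // null link, or a shared object already registered
        }
        ordered.push_back(node);
        ids[node] = static_cast<unsigned int>(ordered.size());
        for (std::size_t i = 0; i < node->children.size(); ++i)
        {
            Register(node->children[i], ids, ordered);
        }
    }

    // Writes each object as: id, payload, child count, child ids.  The
    // identifier 0 is reserved for a null link.
    std::vector<unsigned int> Save(const Node* root)
    {
        assert(root != 0);
        std::map<const Node*, unsigned int> ids;
        std::vector<const Node*> ordered;
        Register(root, ids, ordered);

        std::vector<unsigned int> stream;
        for (std::size_t i = 0; i < ordered.size(); ++i)
        {
            const Node* node = ordered[i];
            stream.push_back(ids[node]);
            stream.push_back(static_cast<unsigned int>(node->payload));
            stream.push_back(
                static_cast<unsigned int>(node->children.size()));
            for (std::size_t c = 0; c < node->children.size(); ++c)
            {
                const Node* child = node->children[c];
                stream.push_back(child ? ids[child] : 0u);
            }
        }
        return stream;
    }
}
```

Two graphs with the same structure and payloads then produce identical
output even though their objects live at different addresses.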

\subsection{Threading}

I added support for mutexes and the {\em hooks} for threads.  Class \Code{Mutex}
is provided for a standard mutex; see files \Code{Wm5Mutex.*}.  The mutex details
depend on the platform and are encapsulated in \Code{Wm5MutexType.h}.  On Windows,
the mutex type is made opaque by using \Code{void*}, but in the implementation it
is of type \Code{HANDLE}.  On Macintosh and Linux, the pthread support is used for
POSIX threads and mutexes.  If you want a scoped critical section (the mutex is
released when the scoped object goes out of scope), see \Code{Wm5ScopedCS.*}.
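
The scoped idiom can be sketched as follows, assuming that the scoped object
enters the mutex in its constructor and leaves it in its destructor.  The
\Code{Enter} and \Code{Leave} names follow the calls shown in the
\Code{Delete0} listing; everything else, including the use of
\Code{std::mutex} (C$++$11) in place of the platform-specific types and the
\Code{IsEntered} flag, is hypothetical.

```cpp
#include <cassert>
#include <mutex>

namespace demo
{
    class Mutex
    {
    public:
        Mutex() : mEntered(false) {}

        void Enter()
        {
            mMutex.lock();
            mEntered = true;
        }

        void Leave()
        {
            assert(mEntered);
            mEntered = false;
            mMutex.unlock();
        }

        // For demonstration in single-threaded code only.
        bool IsEntered() const { return mEntered; }

    private:
        std::mutex mMutex;
        bool mEntered;
    };

    class ScopedCS
    {
    public:
        explicit ScopedCS(Mutex* mutex) : mMutex(mutex) { mMutex->Enter(); }
        ~ScopedCS() { mMutex->Leave(); }

    private:
        ScopedCS(const ScopedCS&);             // not copyable
        ScopedCS& operator=(const ScopedCS&);

        Mutex* mMutex;
    };

    // Every return path leaves the critical section, including the early
    // return, because the destructor of 'scope' always runs.
    inline bool Push(Mutex& mutex, int& numItems, int capacity)
    {
        ScopedCS scope(&mutex);
        if (numItems >= capacity)
        {
            return false;
        }
        ++numItems;
        return true;
    }
}
```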

Thread types are also platform dependent; see \Code{Wm5ThreadType.h}.  On Windows,
the thread type is made opaque by using \Code{void*}.  In the implementation it is
a \Code{HANDLE}.  On Macintosh and Linux, the type is \Code{pthread\_t}.  I have
the Windows implementation started, but I have not yet provided examples that use
it.  Over time, I will start the process of threading the engine code.

\subsection{Time}

I have only simple support for time measurements, in \Code{Wm5Time.*}.  The function
\Code{GetTimeInMicroseconds} is a wrapper for basic time measurements, but it does
not use a high-resolution timer.  There is also \Code{GetTimeInSeconds}.  Eventually,
I will add platform-dependent support for high-resolution timers.  The current
functions suffice for simple frame-rate monitoring.

\section{LibMathematics}

The mathematics code was factored out of the WM4 LibFoundation library into its
own library.  The folder organization has changed.  The \Code{WildMagic4/Mathematics}
folder was split into \Code{WildMagic5/Base}, \Code{WildMagic5/Algebra},
\Code{WildMagic5/Object2D}, and \Code{WildMagic5/Object3D}.

\subsection{Base}

The \Code{Base} folder contains the \Code{Math} class in files \Code{Wm5Math.*}.
The bit hack functions are in \Code{Wm5BitHacks.*}.   Classes \Code{Float1}, \Code{Float2},
\Code{Float3}, and \Code{Float4} were added to support the graphics library.
These are simple classes derived from the \Code{Tuple} template class in
\Code{LibCore} and provide specialized constructors and assignment.

\subsection{Objects2D}

The old \Code{Mathematics} folder contained classes for 2D objects.  These
classes and files were moved to the new \Code{Object2D} folder.

\subsection{Objects3D}

The old \Code{Mathematics} folder contained classes for 3D objects.  These
classes and files were moved to the new \Code{Object3D} folder.

\subsection{Algebra}

\subsubsection{Vector and Matrix Classes}

The algebra classes used most by WM4 were moved to \Code{Algebra}.  These include
\Code{Vector2}, \Code{Vector3}, \Code{Vector4}, \Code{Matrix2}, \Code{Matrix3},
\Code{Matrix4}, and \Code{Quaternion}.

\subsubsection{Classes to Support Numerical Computations}

Classes supporting numerical computations were moved to the \Code{Algebra}
folder.  These include \Code{Polynomial1}, \Code{GVector}, \Code{GMatrix},
and \Code{BandedMatrix}.

\subsubsection{New Classes for Affine and Homogeneous Algebra}

The \Code{Algebra} folder contains new files for new classes.  The main idea
is that the data of the classes are 4-tuples or $4 \times 4$ matrices, with all
components of type \Code{float}, which will eventually be set up for SIMD
computations.  (At the moment they are not set up for SIMD.)  The
\Code{Vector} and \Code{Matrix} classes still remain template classes that
can support 32-bit \Code{float} and 64-bit \Code{double}.

\Code{AVector} represents 3D vectors but stored as 4-tuples of the form
$(x,y,z,0)$.  \Code{APoint} represents 3D points but stored as 4-tuples of
the form $(x,y,z,1)$.  \Code{HPoint} represents homogeneous 4-tuples of the
form $(x,y,z,w)$.  \Code{HMatrix} represents homogeneous $4 \times 4$
matrices.  \Code{HQuaternion} is not much different from \Code{Quaternion},
but the idea was to encapsulate the planned SIMD code computations in
\Code{HQuaternion}.  \Code{HPlane} represents a plane as a 4-tuple.
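
The value of the last component is what makes the affine algebra work: the
difference of two points is a vector, a point plus a vector is a point, and a
homogeneous matrix with a translation moves points but not vectors.  The
following is a minimal sketch of this idea.  The types \Code{AVec}, \Code{APnt},
and \Code{HMat} are hypothetical stand-ins, not the WM5 classes, and the sketch
assumes a row-major matrix applied to a column 4-tuple.

```cpp
#include <cassert>

// Hypothetical stand-ins for the 4-tuple classes.
struct AVec { float x, y, z, w; };   // vectors: w is always 0
struct APnt { float x, y, z, w; };   // points:  w is always 1

inline AVec MakeVector (float x, float y, float z)
{
    return AVec{x, y, z, 0.0f};
}

inline APnt MakePoint (float x, float y, float z)
{
    return APnt{x, y, z, 1.0f};
}

// point - point = vector (the w-components cancel: 1 - 1 = 0)
inline AVec operator- (const APnt& p, const APnt& q)
{
    return AVec{p.x - q.x, p.y - q.y, p.z - q.z, p.w - q.w};
}

// point + vector = point (the w-components add: 1 + 0 = 1)
inline APnt operator+ (const APnt& p, const AVec& v)
{
    return APnt{p.x + v.x, p.y + v.y, p.z + v.z, p.w + v.w};
}

// A 4x4 homogeneous matrix stored in row-major order (an assumption of
// this sketch), applied to a column 4-tuple.
struct HMat { float m[4][4]; };

template <typename Tuple>
inline Tuple Apply (const HMat& M, const Tuple& t)
{
    const float in[4] = { t.x, t.y, t.z, t.w };
    float out[4];
    for (int r = 0; r < 4; ++r)
    {
        out[r] = 0.0f;
        for (int c = 0; c < 4; ++c)
        {
            out[r] += M.m[r][c] * in[c];
        }
    }
    return Tuple{out[0], out[1], out[2], out[3]};
}
```

Because the translation column is multiplied by the last component, it
contributes to points ($w = 1$) and is ignored for vectors ($w = 0$).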

I originally used the Curiously Recurring Template paradigm for the
\Code{Vector} and \Code{Matrix} classes, but in my opinion the problems with
getting this to work properly on all the supported platforms were not worth
the effort.  I ran into problems with the C$++$ requirement for template
classes derived from other template classes that force you either to scope
the base class with \Code{this->mSomeMember} or to add a \Code{using}
statement in the derived class to avoid the explicit scoping.  I am still
of the opinion that having to scope base class members but not scope global
variables is backwards.  The \Code{using} paradigm has its own problems,
because it can affect the public/protected/private mechanism.  With the
vector and matrix classes, the Microsoft compiler had problems with
\Code{using} and started complaining about certain base-class members
not being visible when without \Code{using} they were.  Having enough
of this, I ripped out the CRT paradigm and just derived the
\Code{Vector} classes from \Code{Tuple} and the \Code{Matrix} classes
from \Code{Table}.
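
The scoping requirement mentioned above can be reproduced with a small
example.  This is a generic illustration, not code from the engine; the class
names are hypothetical.  Inside a class template, an unqualified name is not
looked up in a base class that depends on the template parameter, so the
member must be reached through \Code{this->} or be introduced with a
\Code{using} declaration.

```cpp
#include <cassert>

template <typename Real>
class Base
{
protected:
    Real mValue = Real(0);
};

// Option 1: scope each access with this->.
template <typename Real>
class DerivedThis : public Base<Real>
{
public:
    // Writing 'mValue = v;' here does not compile with a conforming
    // compiler, because Base<Real> is a dependent base class.
    void Set (Real v) { this->mValue = v; }
    Real Get () const { return this->mValue; }
};

// Option 2: a using-declaration.  Note that the access level of the
// section containing the declaration (here protected) is the one that
// applies to the name in the derived class.
template <typename Real>
class DerivedUsing : public Base<Real>
{
protected:
    using Base<Real>::mValue;

public:
    void Set (Real v) { mValue = v; }
    Real Get () const { return mValue; }
};
```

Placing the \Code{using} declaration in a different access section changes the
visibility of the member in the derived class, which is one way the
public/protected/private mechanism can be affected.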

I added the \Code{struct Information} nested structures to \Code{Vector2}
and \Code{Vector3}.  This information used to be in the WM4 \Code{Mapper2}
and \Code{Mapper3} classes and used by the computational geometry code.
I eliminated the mapper classes.

One of the annoyances with representing 3-tuples as 4-tuples is that there
are several situations in the graphics engine where you have to convert
from one to the other, especially with reading and writing vertex buffers.
The new classes have some constructors and implicit conversion operators to
support this, but I consider them an eyesore.

\subsection{CurvesSurfacesVolumes}

WM4 had separate folders, \Code{Curves} and \Code{Surfaces}, and some other
code for B-spline volumes.  I consolidated all these files into a single
folder in WM5, \Code{CurvesSurfacesVolumes}.

\subsection{Distance}

Nothing has changed regarding functions for distance calculations.  The
number of files and combinations are too numerous to summarize them here
in an effective manner.

\subsection{Intersection}

Nothing has changed regarding functions for intersection calculations.  The
number of files and combinations are too numerous to summarize them here
in an effective manner.

\subsection{Approximation}

Nothing has changed regarding functions for approximations and fitting.  The
number of files and combinations are too numerous to summarize them here
in an effective manner.

\subsection{Containment}

Nothing has changed regarding functions for containment.  The
number of files and combinations are too numerous to summarize them here
in an effective manner.

\subsection{Interpolation}

Nothing has changed regarding functions for interpolation.  The
number of files and combinations are too numerous to summarize them here
in an effective manner.

\subsection{NumericalAnalysis}

Nothing has changed regarding the numerical analysis code except that
I renamed the class \Code{Eigen} to \Code{EigenDecomposition}.

\subsection{Meshes}

Nothing has changed regarding the graph data structures for meshes.

\subsection{Rational}

The integer and rational arithmetic code was moved from the WM4
\Code{ComputationalGeometry} folder to its own folder.  The reason is
that many other algorithms can use exact rational arithmetic, so there is no
reason to isolate it to the computational geometry folder.

Class \Code{Rational} has constructors and converters for \Code{float}
and \Code{double} to \Code{Rational}.  These had not handled subnormal
(denormal) numbers, and in fact the conversions were significantly slow.
In WM4 and WM5, \Code{Rational} now handles subnormal numbers and the
conversion code is a lot faster.

\subsection{Query}

The queries involve floating-point arithmetic, but some also involve
exact integer and rational arithmetic.  I moved these to a separate
folder for the same reasons as the \Code{Rational} folder.  The
computational geometry code is not the only code in the engine that
can benefit from exact arithmetic, so there is no reason to isolate the queries
to the computational geometry folder.

\subsection{ComputationalGeometry}

Other than moving the exact integer and rational arithmetic to a new
folder and the queries to a new folder, nothing has changed in this
folder.

\subsection{Miscellaneous}

Nothing has changed in this folder.

\section{LibGraphics}

The graphics library has the most significant changes of anything from Wild
Magic 4.  LibGraphics of Wild Magic 5 is a significant rewrite of its
predecessor.

\subsection{DataTypes}

The \Code{Bound} class has nearly the same interface as in WM4, except that
the sphere center is an \Code{APoint} rather than a \Code{Vector3f}.  The
\Code{Bound::ComputeFromData} function now takes a generic pointer and a
stride to allow you to compute a bounding sphere from data that lives in
a vertex buffer.  In WM4, the data was a contiguous array of 3-tuple
positions.

The \Code{Transform} class has much the same interface as that in WM4.  However,
the class stores a homogeneous matrix that is used by the graphics system.
This matrix is a composition of the translation, scale, and rotation (or
general matrix) components of the transformation.  The class also stores the
inverse of the homogeneous matrix.  This matrix is computed only when it is
needed.  Once I add SIMD support, this class will have an option to use it
instead of the standard CPU computations.

The files \Code{Wm5HalfFloat.*} contain converters between 32-bit
floating-point numbers and 16-bit floating-point numbers.  The latter are
stored as unsigned short integers.  The conversion is useful for vertex
buffers and textures that want to use half floats.
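
To show what such a conversion involves, here is a simplified sketch based on
the IEEE 754 bit layouts (1 sign bit, 8 exponent bits, and 23 mantissa bits for
32-bit floats; 1, 5, and 10 bits for 16-bit floats).  The function names
\Code{ToHalf} and \Code{ToFloat} are hypothetical and this is not the code in
\Code{Wm5HalfFloat.*}.  As stated assumptions, the sketch truncates the
mantissa instead of rounding and flushes subnormal values to zero.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Convert a 32-bit float to a 16-bit float stored in an unsigned 16-bit
// integer.  Simplifications: the mantissa is truncated (no rounding) and
// values too small for a normalized half are flushed to signed zero.
inline std::uint16_t ToHalf (float value)
{
    std::uint32_t bits;
    std::memcpy(&bits, &value, sizeof(bits));

    std::uint32_t sign = (bits >> 16) & 0x8000u;
    std::uint32_t rawExponent = (bits >> 23) & 0xFFu;
    std::uint32_t mantissa = bits & 0x007FFFFFu;

    if (rawExponent == 0xFFu)
    {
        // Infinity (mantissa zero) or NaN (mantissa nonzero).
        return std::uint16_t(sign | 0x7C00u | (mantissa != 0 ? 0x0200u : 0u));
    }

    // Rebias the exponent from 127 (float) to 15 (half).
    std::int32_t exponent = std::int32_t(rawExponent) - 127 + 15;
    if (exponent >= 31)
    {
        return std::uint16_t(sign | 0x7C00u);  // overflow to infinity
    }
    if (exponent <= 0)
    {
        return std::uint16_t(sign);            // underflow to zero
    }
    return std::uint16_t(sign | (std::uint32_t(exponent) << 10)
        | (mantissa >> 13));
}

// Convert a 16-bit float back to a 32-bit float.  Half subnormals are
// treated as zero in this sketch.
inline float ToFloat (std::uint16_t half)
{
    std::uint32_t sign = (std::uint32_t(half) & 0x8000u) << 16;
    std::uint32_t exponent = (half >> 10) & 0x1Fu;
    std::uint32_t mantissa = half & 0x03FFu;

    std::uint32_t bits;
    if (exponent == 0)
    {
        bits = sign;
    }
    else if (exponent == 31)
    {
        bits = sign | 0x7F800000u | (mantissa << 13);
    }
    else
    {
        // Rebias the exponent from 15 to 127: add 112.
        bits = sign | ((exponent + 112u) << 23) | (mantissa << 13);
    }

    float value;
    std::memcpy(&value, &bits, sizeof(value));
    return value;
}
```

A production converter would also round to nearest and handle subnormal
numbers in both directions.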

The files \Code{Wm5Color.*} contain the implementation of a class
\Code{Color} that has all static members.  This is used to convert between
various color formats for use by the WM5 texture system.  Specifically, the
conversion is used for generating mipmaps on the CPU.

The streaming code in LibCore has classes \Code{InStream} and \Code{OutStream}
that contain some template member functions to support streaming of
{\em aggregate data}.  For example, \Code{Bound} has an \Code{APoint}
member and a \Code{float} member.  \Code{Transform} has several native
members.  To stream these, the template member functions of \Code{InStream}
and \Code{OutStream} must be specialized; see the functions of the form
\Code{ReadAggregate*} and \Code{WriteAggregate*}.  Specializations are in
the files \Code{Wm5SpecializedIO.*}.

The files \Code{Wm5Utility.*} contain only two functions that are used by
the \Code{SampleGraphics/CubeMaps} application.

\subsection{Resources}

The renderer has various resources that it manages.  These include vertex
buffers, vertex formats, index buffers, render targets, and textures.  The
\Code{Resources} folder stores the source files for these objects.

A vertex format describes the layout of a vertex in a vertex buffer.  DirectX 9
has an interface for this, \Code{IDirect3DVertexDeclaration9}, and each
item of interest in the vertex format is a vertex element (position, normal,
color, texture coordinate, and so on), \Code{D3DVERTEXELEMENT9}.  OpenGL does
not encapsulate this concept, so the WM5 OpenGL renderer creates its own
representation.  The term render target is DirectX 9 terminology.  OpenGL
uses the term framebuffer object.  I flipped a coin to decide which term to
use--render target won.

The classes \Code{VertexBuffer}, \Code{IndexBuffer}, \Code{VertexFormat},
\Code{Texture1D}, \Code{Texture2D}, \Code{Texture3D}, \Code{TextureCube},
and \Code{RenderTarget} are all platform independent.  The \Code{Renderer}
class is an abstract interface that has several member functions that
allow you to bind the platform-independent objects to platform-dependent
objects, the latter objects not visible to the application writer.  The
platform-dependent objects are managed by the back-end renderers for
DirectX and OpenGL.

When working with vertex buffers, the vertex formats tell you how the
vertices are structured.  The class \Code{VertexBufferAccessor} takes a
format-buffer pair and allows you to set/get the vertex buffer data.
This class has template member functions that allow you to access the
buffer data in whatever form is convenient to you.  The sample applications
make heavy use of this class, so look at those applications for usage.
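
The idea behind such an accessor can be sketched with a stride and a byte
offset.  The class \Code{StridedAccessor} and the tuple types below are
hypothetical and much simpler than \Code{VertexBufferAccessor}; in particular,
this sketch takes the offset as an argument rather than obtaining it from a
vertex format, and it copies values instead of returning references.

```cpp
#include <cassert>
#include <cstring>
#include <vector>

// Hypothetical tuple types used only by this sketch.
struct Tuple3 { float v[3]; };
struct Tuple2 { float v[2]; };

// Hypothetical accessor: vertex i starts at byte i*stride, and an
// attribute lives at a fixed byte offset within each vertex.
class StridedAccessor
{
public:
    StridedAccessor (char* data, int stride, int numVertices)
        : mData(data), mStride(stride), mNumVertices(numVertices)
    {
        assert(data != nullptr && stride > 0 && numVertices > 0);
    }

    // Read the attribute of type T at the given byte offset of vertex i.
    template <typename T>
    T Get (int i, int offset) const
    {
        assert(0 <= i && i < mNumVertices);
        assert(0 <= offset && offset + int(sizeof(T)) <= mStride);
        T value;
        std::memcpy(&value, mData + i * mStride + offset, sizeof(T));
        return value;
    }

    // Write the attribute of type T at the given byte offset of vertex i.
    template <typename T>
    void Set (int i, int offset, const T& value)
    {
        assert(0 <= i && i < mNumVertices);
        assert(0 <= offset && offset + int(sizeof(T)) <= mStride);
        std::memcpy(mData + i * mStride + offset, &value, sizeof(T));
    }

private:
    char* mData;
    int mStride;
    int mNumVertices;
};
```

For a vertex that has a 3-tuple position followed by a 2-tuple texture
coordinate, the stride is 20 bytes, the position offset is 0, and the texture
coordinate offset is 12.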

\subsection{Renderers}

The \Code{LibGraphics/Renderers} folder has files \Code{Wm5Renderer.*} that
contain the abstract interface for rendering that is platform independent.  Any
member functions that do not depend on the underlying graphics API are
implemented in \Code{Wm5Renderer.cpp}.  Platform-dependent implementations
occur in several subfolders.

The \Code{Dx9Renderer} subfolder has a DirectX 9 implementation.  There are
no implementations for DirectX 10 or DirectX 11.

The \Code{OpenGLRenderer} subfolder has an OpenGL implementation.  Please be
aware that the shader system of Wild Magic 5 (and previous) uses OpenGL
extensions that were available before OpenGL 2.0 shipped.  These extensions
are friendly to having an FX system that uses NVIDIA's Cg programs, and
the back-end DirectX and OpenGL renderers have very similar organization.
I have plans to move to OpenGL 2.0 and later, using GLSL instead of Cg,
and to abandon Cg programming.  See the last section of this document on
the future of Wild Magic.

OpenGL renderer creation and a few operations (swap buffers, for example) are
specific to the platform.  The Microsoft Windows OpenGL portions (WGL) are
in the subfolder \Code{WglRenderer}.  Macintosh OS X OpenGL portions (AGL)
are in the subfolder \Code{AglRenderer}.  Linux OpenGL portions (for
X Windows) are in the subfolder \Code{GlxRenderer}.

The resource management member functions of \Code{Renderer} have names such
as \Code{Bind}, \Code{Unbind}, \Code{Enable}, \Code{Disable}, \Code{Lock},
and \Code{Unlock}.  The \Code{Bind} call creates a platform-dependent
object that corresponds to the platform-independent resource.  For example,
\Code{Bind} applied to a \Code{VertexBuffer} will create a corresponding
platform-dependent object \Code{PdrVertexBuffer}.  Other calls support
{\em lazy creation}; for example, if you call \Code{Enable} for a
\Code{VertexBuffer} and the platform-dependent companion \Code{PdrVertexBuffer}
does not yet exist, one will be created automatically.
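
The bind and lazy-creation pattern can be sketched with a map from the
platform-independent object to its platform-dependent companion.  All names
ending in \Code{Stub} are hypothetical; the sketch only mimics the behavior
described above and does not reproduce the \Code{Renderer} implementation.

```cpp
#include <cassert>
#include <map>
#include <memory>

// Hypothetical platform-independent resource.
struct VertexBufferStub { int numBytes; };

// Hypothetical platform-dependent companion.
struct PdrVertexBufferStub
{
    explicit PdrVertexBufferStub (const VertexBufferStub* vb)
        : source(vb), enabled(false)
    {
    }

    const VertexBufferStub* source;
    bool enabled;
};

class RendererStub
{
public:
    // Create the platform-dependent companion if it does not exist.
    void Bind (const VertexBufferStub* vb)
    {
        assert(vb != nullptr);
        if (mVertexBuffers.find(vb) == mVertexBuffers.end())
        {
            mVertexBuffers[vb].reset(new PdrVertexBufferStub(vb));
        }
    }

    // Destroy the platform-dependent companion.
    void Unbind (const VertexBufferStub* vb) { mVertexBuffers.erase(vb); }

    // Lazy creation: Enable binds first when necessary.
    void Enable (const VertexBufferStub* vb)
    {
        Bind(vb);
        mVertexBuffers[vb]->enabled = true;
    }

    void Disable (const VertexBufferStub* vb)
    {
        auto iter = mVertexBuffers.find(vb);
        if (iter != mVertexBuffers.end())
        {
            iter->second->enabled = false;
        }
    }

    bool IsBound (const VertexBufferStub* vb) const
    {
        return mVertexBuffers.find(vb) != mVertexBuffers.end();
    }

    bool IsEnabled (const VertexBufferStub* vb) const
    {
        auto iter = mVertexBuffers.find(vb);
        return iter != mVertexBuffers.end() && iter->second->enabled;
    }

private:
    std::map<const VertexBufferStub*,
        std::unique_ptr<PdrVertexBufferStub> > mVertexBuffers;
};
```

The application only ever holds the platform-independent object; the map is
private to the renderer, which matches the statement that the
platform-dependent objects are not visible to the application writer.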

For most applications, you do not even need to worry about explicit calls
to the resource management functions.  The rendering system will handle this
for you.  One exception, though, is related to render targets.  Sometimes it
is necessary to bind a render target explicitly so that its underlying
texture object is bound for use as a render target.  If that texture object
is attached to an effect, and you draw an object using the effect {\em before}
the render target is created, the texture object is bound as a regular texture,
not as a render target.  See the image processing samples for examples.

The \Code{Lock} and \Code{Unlock} calls were designed to allow you to access
vertex buffers, index buffers, and textures directly when they are in video
memory.  However, each resource is backed by system memory, which you can
also access.  If you modify the system memory for buffers and textures, the
\Code{Renderer} interface has \Code{Update} calls that cause the corresponding
video memory to be refreshed with the contents from system memory.  If you
modify the video memory directly, the system memory and video memory are out
of sync.  This may be of no concern in your application.

Originally, I planned not to back the resources with system memory, but then
I remembered that users reported that the DirectX 9 renderer of Wild Magic 4
does not handle lost devices.  For example, if you have a DX9 application
running and then use CTRL-ALT-DELETE to launch the Windows Task Manager, the
application device is lost.  When your application window is given the focus
again, DX9 requires you to recreate many (but not all) of the resources.
I found it quite annoying that the operating system would not manage the
video memory itself, forcing the application writers to have responsibility.
Regardless, I added the system memory backing and now the DX9 renderer will
restore the resources.  This is a serious waste of system memory.  I did not
see this problem with OpenGL during WM5 development, but I was running only
on Windows Vista and Windows 7.  Later, I read that Windows Vista and Windows
7 (via DirectX 10 or 11) do properly manage the video memory, but apparently
DirectX 9 still makes you manage the memory yourself.  When I ship EmeraldGL
(see the last section of this document), you can select whether or not to
have a system memory backing.

WM4 had the ability to attach global render state (alpha, face culling,
depth buffering, polygon offset, stencil buffering, wire state) to \Code{Node}
objects.  This state was propagated to the leaf geometry and to the attached
effect via a call to \Code{UpdateRS}.  This system does not exist in WM5.
You can, however, specify a global render state for the \Code{Renderer}
that overrides that type of global render state when applying the shader
effects to the geometric objects that are being drawn.  See the
\Code{Renderer} functions such as \Code{SetOverrideAlphaState}.  The reason
for removing this is that it seemed unnatural to allow WM4 \Code{Spatial} and
\Code{Node} to contain render state when their primary purpose instead was to
manage hierarchical transformations and culling.  After using WM5 for quite
some time now, I actually like the WM4 approach better and will restore it
(in EmeraldGL).  A node hierarchy can very well manage multiple scopes
(transformation/culling, render state, global effects).

Related to this is the ability in WM4 to attach an effect to a \Code{Node}
object.  This ability is also removed in WM5, but there is a new
renderer \Code{Draw} function that allows you to specify a global effect
that overrides any local effect in the geometry objects provided by the
visible set.  The sample graphics applications for planar reflections and
planar shadows show how to use global effects.  In fact, these samples use
multiple scene graphs (some folks seem to think that an application must
have only one scene graph, which has never been required in Wild Magic).

\subsection{Shaders}

The WM5 shader system and FX system have had significant rewriting from what
WM4 provided.

The \Code{LibGraphics/Shaders} folder contains global render state that is
nearly identical to that of WM4.  The classes for this state are
\Code{AlphaState} (alpha blending), \Code{CullState} (face culling),
\Code{DepthState} (depth buffering), \Code{OffsetState} (polygon offset
for depth biasing), \Code{StencilState} (stencil buffering), and
\Code{WireState} (wireframe/solid mode for drawing).

The special FX system is encapsulated by classes \Code{VisualEffect} and
\Code{VisualEffectInstance}.  Section \ref{subsec.shaderfx} already provided
some description of these.  The FX system is similar to Cg FX and to HLSL
support in DirectX 9.  An effect can have multiple techniques.  A technique
is encapsulated by \Code{VisualTechnique} and can have multiple passes.
A pass is encapsulated by \Code{VisualPass}.  Each such pass has a set of
global render state, a vertex shader, and a pixel shader.  The vertex shader
is encapsulated by class \Code{VertexShader} and the pixel shader is
encapsulated by class \Code{PixelShader}.  Both classes are derived from
class \Code{Shader}.

A \Code{Shader} object contains an array of input names for the inputs,
whether vertex attributes such as position, normal, and so on, or pixel
inputs (the outputs of the vertex shader).  The object contains an array
of output names, also.  Information about the shader constants and samplers
used by the shader programs is also maintained by \Code{Shader}.  This
information is encapsulated in class \Code{ShaderParameters}, which allows
you to set/get the constants and textures.  The shader constants live in
a system with base class \Code{ShaderFloat} (see the next section).

I still use Cg programs.  The compiled shaders have text output that stores
information used by Cg Runtime.  A tool called \Code{WmfxCompiler} ships with
WM5 that uses the Cg output to generate binary files for local effects to be
loaded by WM5.  The files contain ASM code for {\em all} the profiles WM5
supports for both OpenGL and DirectX 9.  Thus, one binary file (with
extension \Code{wmfx}) may be used regardless of graphics API.  The \Code{Shader}
class stores all the program strings, registers, and texture units, and
selects an appropriate program based on which graphics API you are using
and on the best profile your graphics card supports.

The program strings need not be generated and stored in a \Code{wmfx} file.
Many of the local effects in WM5 have these strings and other information
stored as class-static data.  For basic applications, this means not having
to ship shader files as data for those applications.  In WM4, to ship without
the \Code{wmsp} files, you would have to embed them as character strings in
the application/engine and then roll your own program loader/parser to
read from those strings rather than from disk.

\subsection{ShaderFloats}

The \Code{ShaderFloat} class was designed to encapsulate shader constants
and allow them to be streamed, just as other graphics resources can be
streamed.  Most of the class interface is straightforward, allowing you
to set/get data in the various registers.

The class has four additional member functions that support updating of
constants during run time,
\begin{verbatim}
    inline void EnableUpdater ();
    inline void DisableUpdater ();
    inline bool AllowUpdater () const;
    virtual void Update (const Visual* visual, const Camera* camera);
\end{verbatim}
In the \Code{Renderer::Draw} function for a single \Code{Visual} object,
there is a loop over the passes of an effect.  In that loop you will see
\begin{verbatim}
    // Update any shader constants that vary during runtime.
    vparams->UpdateConstants(visual, mCamera);
    pparams->UpdateConstants(visual, mCamera);
\end{verbatim}
The \Code{ShaderParameters::UpdateConstants} function is the following
\begin{verbatim}
    void ShaderParameters::UpdateConstants (const Visual* visual,
        const Camera* camera)
    {
        ShaderFloatPtr* constants = mConstants;
        for (int i = 0; i < mNumConstants; ++i, ++constants)
        {
            ShaderFloat* constant = *constants;
            if (constant->AllowUpdater())
            {
                constant->Update(visual, camera);
            }
        }
    }
\end{verbatim}
The function iterates over the shader constants, querying each whether it
allows (needs) updating.  If it does, then the \Code{ShaderFloat::Update}
function is called.

By default, the creation of a \Code{ShaderFloat} object does not allow
updating.  For example, if you have a \Code{ShaderFloat} that manages a
specific color for a vertex shader, and that color never changes during
application execution, then there is no need to update the color.  However,
some shader constants do vary at run time, most notably those associated
with the model-to-world matrix (map model coordinates into world coordinates)
and world-to-view matrix (map world coordinates to camera/view coordinates).
The \Code{ShaderFloat}-derived classes that encapsulate runtime-varying
constants should allow updates, either by setting \Code{mAllowUpdater} in the
constructors or by calling \Code{EnableUpdater}.  Moreover, the derived
classes must override the virtual function \Code{ShaderFloat::Update} to
perform the appropriate calculations.

The matrices tend to vary at a rate of once per draw call,
so allowing the \Code{ShaderFloat::Update} call to occur always is the
right thing to do.  Some shader constants, though, might vary less
frequently, in which case the \Code{Update} call needlessly consumes cycles.
For these constants, you can call \Code{DisableUpdater} so that the update
function is not called.  When you change the value of the shader constant,
call \Code{EnableUpdater}, allow the draw to occur, and then call
\Code{DisableUpdater}.  At its lowest level, you can call the update
function yourself when needed, and always disable the update call--you
always manage the shader constant status, not the renderer.
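
The enable/draw/disable pattern can be sketched as follows.  The classes
\Code{ConstantStub} and \Code{TimeConstantStub} are hypothetical, and as a
simplifying assumption the \Code{Update} function takes a single time value
rather than the \Code{Visual} and \Code{Camera} pointers of the real interface.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for a shader constant with an updater flag.
class ConstantStub
{
public:
    explicit ConstantStub (int numFloats)
        : mData(numFloats, 0.0f), mAllowUpdater(false)
    {
        assert(numFloats > 0);
    }

    virtual ~ConstantStub () {}

    void EnableUpdater () { mAllowUpdater = true; }
    void DisableUpdater () { mAllowUpdater = false; }
    bool AllowUpdater () const { return mAllowUpdater; }

    // The default does nothing; derived classes override this.
    virtual void Update (float time) { (void)time; }

    float operator[] (int i) const { return mData[i]; }

protected:
    std::vector<float> mData;
    bool mAllowUpdater;
};

// A constant that varies at run time.  It enables updates in its
// constructor and overrides Update.
class TimeConstantStub : public ConstantStub
{
public:
    TimeConstantStub () : ConstantStub(1), numUpdates(0) { EnableUpdater(); }

    virtual void Update (float time)
    {
        mData[0] = time;
        ++numUpdates;
    }

    int numUpdates;
};

// The per-draw loop: update only those constants that allow it.
inline void UpdateConstants (std::vector<ConstantStub*>& constants,
    float time)
{
    for (ConstantStub* constant : constants)
    {
        if (constant->AllowUpdater())
        {
            constant->Update(time);
        }
    }
}
```

A constant whose updater is disabled keeps its last value across draw calls
and costs only the \Code{AllowUpdater} test in the loop.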

The \Code{ShaderFloat} folder contains a large number of derived classes.
The one you will use most often is \Code{PVWMatrixConstant} that handles
the world-view-projection matrix.  This matrix is the one used by a
typical vertex shader for mapping the model-space vertex position
to clip-space coordinates.

See the examples in the \Code{LocalEffects} folder for how to create a
\Code{VisualEffect}-derived class.  In particular, you can see how to
create the vertex and pixel shaders and how to create the shader constants.
Creation of a shader constant requires you to provide a string name, the
same one used in the Cg program.  Unlike WM4 which required you to name
your shader constants with specific names so that the FX system functions
correctly, WM5 allows you to name the shader constants anything you like.
Hooking them up with the engine becomes the responsibility of the constructor
for the effect.

\subsection{LocalEffects}

The \Code{LibGraphics/LocalEffects} folder contains several examples of
classes derived from \Code{VisualEffect}.  These include basic vertex
coloring, texturing, and lighting.  The lighting effects include per-vertex
effects and per-pixel effects.  All these classes have hard-coded program
strings, registers, and texture units (as class-static data).

\subsection{GlobalEffects}

I use the term {\em global effect} to refer to drawing that involves
multiple geometric objects and requires direct access to the renderer
to manage the drawing.  The examples implemented are for planar reflection
and planar shadows; sample graphics applications are provided for both.

The abstract base class \Code{GlobalEffect} has a pure virtual function
\begin{verbatim}
    virtual void Draw (Renderer*, const VisibleSet&) = 0;
\end{verbatim}
that is implemented in each derived class.  This function is called
by \Code{Renderer::Draw(const VisibleSet\&, GlobalEffect*)} when you pass
a non-null pointer via the \Code{GlobalEffect} parameter.

For example, the classes \Code{PlanarReflectionEffect} and
\Code{PlanarShadowEffect} implement the \Code{Draw} function.  Much of that
code involves managing global render state for alpha blending, depth
buffering, and stencil buffering.  It also makes high-level draw calls
to set camera matrices and to render the current visible set.

\subsection{ImageProcessing}

This code is new to Wild Magic and is more along the lines of using the
GPU for general-purpose programming.  Some image processing, whether 2D or 3D,
can be done on the GPU using render targets.  The prototypical case is to
apply Gaussian blurring to a 2D image.  Two render targets are used.  The
first target is loaded with the image.  The image is Gaussian blurred using
a shader program and the output is drawn to the second target.  This target
becomes the source for the next blurring pass, and the other target becomes
the destination.  The targets alternate between being the source and
destination targets.
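
The alternation of source and destination can be sketched on the CPU with two
buffers that play the roles of the two render targets.  The names
\Code{Image}, \Code{FilterPass}, and \Code{RunPingPong} are hypothetical; in
the GPU version each pass is a draw call with a shader program rather than a
function call.

```cpp
#include <cassert>
#include <functional>
#include <vector>

typedef std::vector<float> Image;

// One filtering pass reads from 'source' and writes to 'destination'.
typedef std::function<void(const Image& source, Image& destination)>
    FilterPass;

// Apply 'pass' numPasses times, alternating the roles of two buffers.
inline Image RunPingPong (const Image& input, int numPasses,
    const FilterPass& pass)
{
    assert(numPasses >= 0);

    // targets[0] is loaded with the image; targets[1] receives the
    // output of the first pass.
    Image targets[2] = { input, Image(input.size()) };
    int source = 0;
    for (int i = 0; i < numPasses; ++i)
    {
        pass(targets[source], targets[1 - source]);
        source = 1 - source;  // the destination becomes the next source
    }
    return targets[source];
}
```

Only two buffers are needed regardless of the number of passes, because a pass
never reads from the buffer it writes to.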

There is a significant amount of overhead in the setup for doing this.  The
classes in the \Code{ImageProcessing} subfolder encapsulate the overhead
so that the application itself can focus on the specific details of the
filters it wants to use to process the image.

The base class \Code{ImageProcessing} contains the setup code common to both
2D and 3D image processing.  The class \Code{ImageProcessing2} builds on top
of this by allowing you to select the type of boundary conditions for the
image filtering, currently Dirichlet or Neumann boundary conditions.  The
class also has a drawing function that is called for the image processing.
A sample application that illustrates this code is for 2D Gaussian blurring,
\Code{SampleImagics/GpuGaussianBlur2}.

The class \Code{ImageProcessing3} is also derived from \Code{ImageProcessing}.
Image processing of 3D images has a few more technical details to consider
compared to 2D processing.  On the CPU, a 3D image is typically stored in
{\em lexicographical order}.  If the image has $b_0$ columns (index named
$x$), $b_1$ rows (index named $y$), and $b_2$ slices (index named $z$), then
the mapping from the three-dimensional image coordinate $(x,y,z)$ to linear
memory with index $i$ for lexicographical ordering is
$i = x + b_0 (y + b_1 z)$.  The $z = 0$ slice is stored first in memory in
row-major order.  The voxel ordering is such that $x$ varies the fastest,
$y$ next fastest:
\[
  (0,0,0), (1,0,0), \ldots, (b_0-1,0,0), (0,1,0), \ldots,
    (b_0-1, b_1-1, 0)
\]
The $z = 1$ slice follows this one, and so on.  This mapping is not useful for
GPU computations on 3D images.
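
For reference, lexicographical ordering with $b_0$ columns and $b_1$ rows per
slice, $i = x + b_0 (y + b_1 z)$, and its inverse can be written as a pair of
small functions.  The function names are hypothetical.

```cpp
#include <cassert>

// Map (x,y,z) to the linear index i = x + b0*(y + b1*z), where b0 is the
// number of columns and b1 is the number of rows of each slice.
inline int GetIndex (int x, int y, int z, int b0, int b1)
{
    assert(0 <= x && x < b0 && 0 <= y && y < b1 && 0 <= z);
    return x + b0 * (y + b1 * z);
}

// Invert the mapping: recover (x,y,z) from the linear index i.
inline void GetCoordinates (int i, int b0, int b1, int& x, int& y, int& z)
{
    assert(i >= 0 && b0 > 0 && b1 > 0);
    x = i % b0;   // x varies the fastest
    i /= b0;
    y = i % b1;   // y varies next fastest
    z = i / b1;
}
```

Note that the number of slices $b_2$ does not appear in either mapping; it
only bounds the range of $z$.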

The 2D image processing naturally maps to render targets.  Standard filtering,
such as for Gaussian blurring, uses finite differences to estimate derivatives.
For example, centered differences to estimate first-order partial derivatives
are
\[
  \frac{\partial f(x,y)}{\partial x} \doteq \frac{f(x+h,y) - f(x-h,y)}{2h}, \;\;
  \frac{\partial f(x,y)}{\partial y} \doteq \frac{f(x,y+h) - f(x,y-h)}{2h}
\]
for small $h$.  Estimates for second-order partial derivatives are
\[
\begin{array}{lcl}
  \frac{\partial^2 f(x,y)}{\partial x^2} & \doteq & \frac{f(x+h,y) - 2f(x,y) + f(x-h,y)}{h^2} \\
  \frac{\partial^2 f(x,y)}{\partial y^2} & \doteq & \frac{f(x,y+h) - 2f(x,y) + f(x,y-h)}{h^2} \\
  \frac{\partial^2 f(x,y)}{\partial x \partial y} & \doteq & \frac{f(x+h,y+h) + f(x-h,y-h) - f(x+h,y-h) - f(x-h,y+h)}{4h^2} \\
\end{array}
\]
Gaussian blurring is modeled by the linear heat equation,
\[
  \frac{\partial f}{\partial t} = \frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2}
\]
for some time scale $t \geq 0$.  The solution is a function $f(x,y,t)$ and the
initial condition is $f(x,y,0) = I(x,y)$, where $I(x,y)$ is your image.  There
are boundary conditions to deal with, but for the sake of illustration, ignore
these for now.  Using a forward difference in time and centered differences in
space, the heat equation is approximated by

\small
\[
  \frac{f(x,y,t+k) - f(x,y,t)}{k} =
    \frac{f(x+h,y,t) - 2f(x,y,t) + f(x-h,y,t)}{h^2} +
    \frac{f(x,y+h,t) - 2f(x,y,t) + f(x,y-h,t)}{h^2}
\]
\normalsize

Solving for $f$ at time $t + k$,
\[
  f(x,y,t+k) = \left(1 - \frac{4k}{h^2} \right)f(x,y,t) + \frac{k}{h^2} \left(
    f(x+h,y,t) + f(x-h,y,t) + f(x,y+h,t) + f(x,y-h,t) \right)
\]
The left-hand side represents a slightly blurred version of the image
$f(x,y,t)$.  If $f$ is stored as a texture in a render target, the right-hand
side becomes part of a pixel shader.  The various $f$ terms are evaluated as
samples of the texture (5 such evaluations).
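
A CPU version of one update step makes the five evaluations explicit.  This is
a sketch under stated assumptions: the function name \Code{BlurStep} is
hypothetical, the parameter \Code{kOverH2} is the ratio $k/h^2$, and boundary
pixels are simply copied because boundary conditions are ignored at this
point.  The assertion on \Code{kOverH2} reflects the standard stability
condition $k/h^2 \leq 1/4$ for this explicit scheme in 2D.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One explicit time step of the 2D heat equation on a width-by-height
// image stored in row-major order.  'f' is the image at time t and 'g'
// receives the image at time t + k.  Only interior pixels are updated.
inline void BlurStep (int width, int height, float kOverH2,
    const std::vector<float>& f, std::vector<float>& g)
{
    assert(width > 0 && height > 0);
    assert(f.size() == std::size_t(width) * std::size_t(height));
    assert(kOverH2 > 0.0f && kOverH2 <= 0.25f);  // stability condition

    g = f;  // boundary pixels are copied unchanged in this sketch
    for (int y = 1; y + 1 < height; ++y)
    {
        for (int x = 1; x + 1 < width; ++x)
        {
            int i = x + width * y;
            // Five samples: the center and its four neighbors.
            g[i] = (1.0f - 4.0f * kOverH2) * f[i] + kOverH2 *
                (f[i + 1] + f[i - 1] + f[i + width] + f[i - width]);
        }
    }
}
```

In the pixel shader the same expression is evaluated, with the array accesses
replaced by texture samples at the corresponding offsets.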

The graphics APIs do not have the concept of a 3D render target where the
underlying texture is a volume texture.  However, the 3D image can be represented
as a tiled texture that is an array of 2D image slices.  For example, consider
a $4 \times 4 \times 4$ image.  The tiled texture is a $2 \times 2$ array of
$4 \times 4$ image slices.  The tiles are ordered as
\begin{center}
  \begin{tabular}{|c|c|} \hline
    $z = 0$ & $z = 1$ \\ \hline
    $z = 2$ & $z = 3$ \\ \hline
  \end{tabular}
\end{center}
As an $8 \times 8$ texture with origin in the upper-left corner, the layout is
the following where the triples are $(x,y,z)$,
\begin{center}
  \begin{tabular}{|c|c|c|c|c|c|c|c|} \hline
    (0,0,0) & (1,0,0) & (2,0,0) & (3,0,0) & (0,0,1) & (1,0,1) & (2,0,1) & (3,0,1) \\ \hline
    (0,1,0) & (1,1,0) & (2,1,0) & (3,1,0) & (0,1,1) & (1,1,1) & (2,1,1) & (3,1,1) \\ \hline
    (0,2,0) & (1,2,0) & (2,2,0) & (3,2,0) & (0,2,1) & (1,2,1) & (2,2,1) & (3,2,1) \\ \hline
    (0,3,0) & (1,3,0) & (2,3,0) & (3,3,0) & (0,3,1) & (1,3,1) & (2,3,1) & (3,3,1) \\ \hline
    (0,0,2) & (1,0,2) & (2,0,2) & (3,0,2) & (0,0,3) & (1,0,3) & (2,0,3) & (3,0,3) \\ \hline
    (0,1,2) & (1,1,2) & (2,1,2) & (3,1,2) & (0,1,3) & (1,1,3) & (2,1,3) & (3,1,3) \\ \hline
    (0,2,2) & (1,2,2) & (2,2,2) & (3,2,2) & (0,2,3) & (1,2,3) & (2,2,3) & (3,2,3) \\ \hline
    (0,3,2) & (1,3,2) & (2,3,2) & (3,3,2) & (0,3,3) & (1,3,3) & (2,3,3) & (3,3,3) \\ \hline
  \end{tabular}
\end{center}
The top row has texture coordinates $(u,v)$ from left to right of $(0,0),
(1,0), \ldots, (7,0)$.  The next row has texture coordinates from left to
right of $(0,1), (1,1), \ldots, (7,1)$.  The other rows have similar mappings
to texture coordinates.  The lexicographical mapping of the 3D image to 1D
memory is $i = x + 4(y + 4z)$.  The memory locations are
\[
  0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, \ldots
\]
In the tiled mapping, the texture is also stored in 1D memory.  The ordering
is
\[
  0, 1, 2, 3, 16, 17, 18, 19, 4, 5, 6, 7, 20, 21, 22, 23, \ldots
\]
The \Code{ImageProcessing3} class has several member functions for mapping
between $(x,y,z)$, $(u,v)$, and $i$.  In particular, the function
\Code{CreateTiledImage} takes as input a 3D image in lexicographical order
and generates a 2D tiled texture (as in the previous example).
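
The tiled mapping of the example can be sketched as follows.  The structure
\Code{TiledLayout} and the function \Code{MakeTiledImage} are hypothetical
names and do not reproduce the interface of \Code{ImageProcessing3}; the
sketch assumes the tiles are placed left to right, top to bottom, with a fixed
number of tiles per row.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical description of a tiled layout for a b0-by-b1-by-b2 image.
struct TiledLayout
{
    int b0, b1, b2;     // columns, rows, slices of the 3D image
    int tilesPerRow;    // number of slices placed side by side

    int Width () const { return b0 * tilesPerRow; }
    int Height () const
    {
        return b1 * ((b2 + tilesPerRow - 1) / tilesPerRow);
    }

    // Map the voxel (x,y,z) to the texel (u,v) of the tiled texture.
    void Map3Dto2D (int x, int y, int z, int& u, int& v) const
    {
        u = x + b0 * (z % tilesPerRow);
        v = y + b1 * (z / tilesPerRow);
    }
};

// Rearrange a 3D image in lexicographical order into a tiled 2D image
// stored in row-major order.
inline std::vector<float> MakeTiledImage (const TiledLayout& layout,
    const std::vector<float>& volume)
{
    assert(layout.b0 > 0 && layout.b1 > 0 && layout.b2 > 0);
    assert(layout.tilesPerRow > 0);
    assert(volume.size() == std::size_t(layout.b0) *
        std::size_t(layout.b1) * std::size_t(layout.b2));

    const int width = layout.Width();
    std::vector<float> tiled(
        std::size_t(width) * std::size_t(layout.Height()), 0.0f);
    for (int z = 0; z < layout.b2; ++z)
    {
        for (int y = 0; y < layout.b1; ++y)
        {
            for (int x = 0; x < layout.b0; ++x)
            {
                int u, v;
                layout.Map3Dto2D(x, y, z, u, v);
                tiled[u + width * v] =
                    volume[x + layout.b0 * (y + layout.b1 * z)];
            }
        }
    }
    return tiled;
}
```

For the $4 \times 4 \times 4$ example with two tiles per row, filling the
volume with its own linear indices reproduces the tiled ordering listed above.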

Given a tiled texture, the next problem is to compute the finite differences
for the filtering.  For example, 3D Gaussian blurring is modeled by the linear
heat equation,
\[
  \frac{\partial f}{\partial t} =
    \frac{\partial^2 f}{\partial x^2} +
    \frac{\partial^2 f}{\partial y^2} +
    \frac{\partial^2 f}{\partial z^2}
\]
where the solution is $f(x,y,z,t)$ and the initial value is $f(x,y,z,0) =
I(x,y,z)$ with $I$ being the 3D image to blur.  Finite difference estimates
are used, as in 2D, to obtain the numerical method
\[
\begin{array}{l}
  f(x,y,z,t+k) =
    \left(1 - \frac{6k}{h^2} \right)f(x,y,z,t) +
    \frac{k}{h^2} \left( f(x+h,y,z,t) + f(x-h,y,z,t) \right. \\
    \;\;\; \left. + f(x,y+h,z,t) + f(x,y-h,z,t) + f(x,y,z+h,t) + f(x,y,z-h,t) \right) \\
\end{array}
\]
Coding this in a pixel shader, the right-hand side must be evaluated.  Each $f$
term requires sampling the 2D tiled texture.  Let the texture function be
$T(u,v)$.  For example, evaluation of $f(1,1,0)$ requires sampling the
texture, $T(1,1)$.  Evaluation of $f(0,1,1)$ requires sampling the texture,
$T(4,1)$.  In the following discussion, the voxel spacing is $h = 1$.

There are two problems with the sampling.  Firstly, consider the pixel shader
when the input is $(x,y,z) = (1,1,1)$.  The evaluations of the function values
for $z = 1$ are texture samples,
$f(x+h,y,z) = f(2,1,1) = T(6,1)$,
$f(x-h,y,z) = f(0,1,1) = T(4,1)$,
$f(x,y+h,z) = f(1,2,1) = T(5,2)$, and
$f(x,y-h,z) = f(1,0,1) = T(5,0)$.
The texture samples are all at spatially close neighbors of $(x,y,1)$.
The numerical method also requires evaluating
$f(x,y,z+h) = f(1,1,2) = T(1,5)$ and
$f(x,y,z-h) = f(1,1,0) = T(1,1)$.  These texture samples are not spatially
close to $(x,y,1)$.  In order to shade the pixel at $(x,y,z)$, it is necessary
to have a dependent texture lookup.  The image filtering is accomplished by
drawing to a render target using a square as the geometry, with the square
having texture coordinates $(0,0)$, $(1,0)$, $(0,1)$, and $(1,1)$.  The
interpolated texture coordinates that are passed to the pixel shader are
used to look up the $(u,v)$ value for sampling the $T(u,v)$ that
corresponds to $f(x,y,z)$.  The lookup is into what I call an
{\em offset texture}.

Secondly, the boundary conditions come into play.  Consider when the input
to the pixel shader is $(x,y,z) = (3,1,0)$.  This is a {\em boundary voxel}
in the original 3D image.  Evaluation of $f(x,y,z) = f(3,1,0) = T(3,1)$ is
just a sample of the tiled texture.  However, $(x+h,y,z) = (4,1,0)$ is
{\em outside the 3D image}.  You must decide how to handle boundary voxels
in the blurring.  The two standard choices are Dirichlet boundary conditions
and Neumann boundary conditions.

Dirichlet boundary conditions involve specifying the $f$-values on the
boundary of the image to be a constant.  If an $(x \pm h, y \pm h, z \pm h)$
input is outside the image domain, the $f$-evaluation just uses the
specified constant.  We need to know, however, when an input to the pixel
shader is a boundary pixel.  This involves creating another dependent texture
lookup.  I call this texture a {\em mask texture}.  The texture value is $1$
when the corresponding $(x,y,z)$ is an interior voxel and is $0$ when it is
a boundary voxel.
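
A minimal CPU-side sketch of such a mask follows.  It is not the
\Code{ImageProcessing3} interface; the function names are hypothetical, the
mask is stored in lexicographical order rather than as a tiled texture, and
the way the mask is combined with the blurred value is one plausible use,
stated here as an assumption.
\begin{verbatim}
#include <vector>

// Hypothetical sketch.  Returns mask values in lexicographical
// order:  1 for interior voxels, 0 for boundary voxels.
std::vector<float> CreateMask (int xBound, int yBound, int zBound)
{
    std::vector<float> mask(xBound*yBound*zBound, 0.0f);
    for (int z = 1; z + 1 < zBound; ++z)
    {
        for (int y = 1; y + 1 < yBound; ++y)
        {
            for (int x = 1; x + 1 < xBound; ++x)
            {
                mask[x + xBound*(y + yBound*z)] = 1.0f;
            }
        }
    }
    return mask;
}

// Assumed use of the mask:  interior voxels keep the blurred value
// and boundary voxels are assigned the Dirichlet constant.
float ApplyDirichlet (float mask, float blurred, float boundaryValue)
{
    return mask*blurred + (1.0f - mask)*boundaryValue;
}
\end{verbatim}
For a $4 \times 4 \times 4$ image, only the $2 \times 2 \times 2$ block of
voxels with all coordinates in $\{1,2\}$ has mask value $1$.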

Neumann boundary conditions amount to clamping to the image boundary.
The evaluation of $f(x+h,y,z) = f(4,1,0)$ becomes an evaluation of
$f(3,1,0)$; that is, the $x$-value is clamped to $3$.  This would be
equivalent to clamp mode for a volume texture, but because we are using
a tiled texture, the clamping has to be part of the offset texture
lookup described in the previous paragraph.  Observe that any inputs
$(x,y,z)$ with $z = 0$ or $z = 3$ are boundary voxels.
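
The clamping can be folded into the computation of the texel that an offset
lookup returns.  The following sketch shows the idea on the CPU.  The function
name and parameter list are hypothetical, and the tile layout (slice $z$ stored
as tile $z$, tiles ordered left to right and then top to bottom) is an
assumption that matches the $4 \times 4 \times 4$ example.
\begin{verbatim}
#include <algorithm>

// Hypothetical sketch.  Computes the texel (u,v) of the tiled
// texture that stores f(x+dx,y+dy,z+dz), where the neighbor is
// clamped to the image bounds (Neumann boundary conditions).
void GetClampedNeighborTexel (int xBound, int yBound, int zBound,
    int tilesPerRow, int x, int y, int z, int dx, int dy, int dz,
    int& u, int& v)
{
    // Clamp the neighbor voxel to the 3D image domain.
    int const xn = std::min(std::max(x + dx, 0), xBound - 1);
    int const yn = std::min(std::max(y + dy, 0), yBound - 1);
    int const zn = std::min(std::max(z + dz, 0), zBound - 1);

    // Map the clamped voxel to its location in the tiled texture.
    u = xn + xBound*(zn % tilesPerRow);
    v = yn + yBound*(zn / tilesPerRow);
}
\end{verbatim}
For the boundary voxel $(3,1,0)$, the neighbor in the $+x$ direction is clamped
and the sketch returns the texel $(3,1)$.  For the interior voxel $(1,1,1)$,
the neighbors in the $+z$ and $-z$ directions are the texels $(1,5)$ and
$(1,1)$.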

The class \Code{ImageProcessing3} has member functions to compute the
offset and mask textures based on which type of boundary conditions
you choose.  The 3D Gaussian blurring sample illustrates the use of
\Code{ImageProcessing3}; see \Code{SampleImagics/GpuGaussianBlur3}.

The 2D and 3D Gaussian blurring samples do not use the mask texture.  However,
the GPU-based fluid solver for 2D Navier-Stokes equations does (for what
is called {\em mixed boundary conditions}).  See the sample
\Code{SamplePhysics/GpuFluids2D}.  A class project recommended in
{\em Game Physics, 2nd edition} involves implementing \Code{GpuFluids3D}.
This also will use the offset and mask textures.  Much of the foundation
needed to implement the 3D fluid solver is already built into
\Code{ImageProcessing3}.

\subsection{SceneGraph}

In Section \ref{subsec.shaderfx}, I already mentioned some key differences
between scene graph classes of WM4 and WM5.  Most notable is the replacement
of the WM4 class \Code{Geometry} by the WM5 class \Code{Visual}.  The
latter class removes the support for per-node global render state and global
effects, making it mainly a supporting class for hierarchical transformations
and culling.  The \Code{Node} class and special \Code{Node}-derived classes
are as they were in WM4 (other than the removal of support for global
render state and global effects).

As mentioned in Section \ref{subsec.designchangelights}, the lighting
system has changed with the elimination of the ability to attach lights
to a scene.  Class \Code{Light} is now just a container for the light
information, and the \Code{ShaderFloat}-derived classes for shader
constants include a variety of constants involving lights and materials.

The \Code{Camera} class has not changed much, but it does use \Code{APoint},
\Code{AVector}, and \Code{HMatrix} for affine and homogeneous entities.  The
class now has support for specifying pre-view and post-projection matrices.
The standard matrix used to map from model space to clip space is $H = PVW$,
where $W$ maps model space to world space (the world matrix), $V$ maps
world space to camera/view space (the view matrix), and $P$ maps view
space to homogeneous clip space (the projection matrix).  The product is
written with the convention that it is applied to column vectors on its
right, $PVW \Vector{x}$.  Sometimes it is convenient to apply another
transformation to world space before the conversion to view space.  The
prototypical example is a reflection matrix that is used for planar
reflections (see \Code{PlanarReflectionEffect}).  Such a matrix $R$ is
referred to as a pre-view matrix because it is applied {\em before} the
view matrix is applied, $H = PVRW$.  Sometimes it is convenient to apply
a transformation after the projection but before the perspective divide.
The prototypical example is a reflection matrix that is used for mirror
effects (replace $x$ by $-x$ for example).  Such a matrix $R$ is referred
to as a post-projection matrix because it is applied {\em after} the
projection, $H = RPVW$.
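
As a concrete instance of the mirror example, replacing $x$ by $-x$ after the
projection uses the matrix
\[
  R = \left[ \begin{array}{rrrr}
    -1 & 0 & 0 & 0 \\
     0 & 1 & 0 & 0 \\
     0 & 0 & 1 & 0 \\
     0 & 0 & 0 & 1
  \end{array} \right]
\]
If the clip-space point is $PVW \Vector{x} = (x_c, y_c, z_c, w_c)$, then
$RPVW \Vector{x} = (-x_c, y_c, z_c, w_c)$, and the perspective divide produces
$(-x_c/w_c, y_c/w_c, z_c/w_c)$.  The image is mirrored horizontally, but the
depth and the $w$-component used for the divide are unchanged.  Matrix products
do not commute in general, so the placement of $R$ in the product matters;
$RPVW$ and $PVRW$ are different transformations.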

The \Code{CameraNode} and \Code{LightNode} classes are the same as in WM4.
They allow you to attach a camera/light to a scene graph.  For example,
you might have headlights on an automobile.  The headlights have geometry
so you can draw them on the vehicle, and they have a light associated with
them that is used in rendering to illuminate anything they shine on.
The \Code{LightNode} is given a light and can have the headlight geometry
attached as a child.  Another example is a security camera in a corner of
a room.  The \Code{CameraNode} manages the \Code{Camera} position and
orientation and the geometry to represent the physical box of the camera
is attached as a child.

The culling system has not changed.  Classes \Code{Culler} and
\Code{VisibleSet} are as in WM4.  The picking system also has not changed.
Classes \Code{PickRecord} and \Code{Picker} are as in WM4.

The geometric primitive classes are the same, although I changed the
name \Code{Polyline} to \Code{Polysegment}.  Polylines are really multiple
segments, so why not call them polysegments?  Regardless, the code
reorganization exposes Microsoft Windows (when on a Windows PC) and the
Windows name \Code{Polyline} clashed with my class name.  Rather than
provide explicit scope with the \Code{Wm5} namespace, I just changed
the name.

Two new classes were added.  Class \Code{Projector} is derived from
\Code{Camera} and allows the projector to use a frustum with normalized
depth different from what the underlying graphics API requires.  If using
OpenGL, the underlying normalized depths are in $[-1,1]$.  But you can
have a projector object with depths of $[0,1]$.
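
The two depth conventions are related by an affine map.  If $d \in [-1,1]$ is
a normalized depth in the OpenGL convention, the corresponding depth
$d' \in [0,1]$ is
\[
  d' = \frac{d + 1}{2}, \;\;\; d = 2d' - 1
\]
This is only the relationship between the two ranges; it is not a description
of how \Code{Projector} is implemented.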

The class \Code{ScreenTarget} provides support for creating standard
objects needed for drawing to a render target.  This includes a screen-space
camera, the rectangle geometry for the quad with which the render target
is associated, and texture coordinates for that quad.  This hides some
annoying differences between DirectX and OpenGL texture coordinate and
pixel coordinate handling.

\subsection{Controllers}

The controller system has the same design as in WM4, but I added two new
classes.

The \Code{TransformController} class is new and is designed to be a base
class for any controller that modifies \Code{Transform} objects.  The
\Code{KeyframeController} class is now derived from the new class.  This
fixed a subtle problem when a keyframe controller attached to a node
did not have keys to manage {\em all} of translation, rotation, and
scale.  This never showed up in my Wild Magic 4 samples, but it did
when adding support for blended animations.

The other new class is \Code{BlendedTransformController}.  This controller
allows you to manage two transform controllers and blend together the
keys.  An illustration for using the class is in the new sample
application, \Code{SampleGraphics/BlendedAnimations}.  This sample has
a skinned biped with two skin controllers (two triangle meshes) and with
keyframe controllers at a majority of the nodes of the biped.  The biped
has an idle cycle, a walk cycle, and a run cycle.  The sample shows how
to blend these for transitions between idle and walk and between walk
and run.
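
To make the idea of blending keys concrete, here is a minimal sketch of one
common way to blend two transforms with a weight in $[0,1]$:  linear
interpolation of translations and spherical linear interpolation of unit
quaternions for rotations.  The types \Code{Vec3} and \Code{Quat} and the
function names are stand-ins for the sketch.  This is not the
\Code{BlendedTransformController} interface and is not necessarily how that
class performs its blending.
\begin{verbatim}
#include <cmath>

// Stand-in types for the sketch; these are not the WM5 classes.
struct Vec3 { float x, y, z; };
struct Quat { float w, x, y, z; };

// Linear interpolation of translations, weight in [0,1].
Vec3 Lerp (float weight, Vec3 const& t0, Vec3 const& t1)
{
    float const w0 = 1.0f - weight;
    return Vec3{ w0*t0.x + weight*t1.x, w0*t0.y + weight*t1.y,
        w0*t0.z + weight*t1.z };
}

// Spherical linear interpolation of unit quaternions, weight in
// [0,1].
Quat Slerp (float weight, Quat const& q0, Quat q1)
{
    float cs = q0.w*q1.w + q0.x*q1.x + q0.y*q1.y + q0.z*q1.z;
    if (cs < 0.0f)
    {
        // q1 and -q1 represent the same rotation; use the closer.
        q1 = Quat{ -q1.w, -q1.x, -q1.y, -q1.z };
        cs = -cs;
    }

    // Default to lerp coefficients, which suffice when the
    // quaternions are nearly equal.
    float c0 = 1.0f - weight, c1 = weight;
    if (cs < 0.9999f)
    {
        float const angle = std::acos(cs);
        float const invSin = 1.0f/std::sin(angle);
        c0 = std::sin((1.0f - weight)*angle)*invSin;
        c1 = std::sin(weight*angle)*invSin;
    }

    Quat const q{ c0*q0.w + c1*q1.w, c0*q0.x + c1*q1.x,
        c0*q0.y + c1*q1.y, c0*q0.z + c1*q1.z };
    float const invLen =
        1.0f/std::sqrt(q.w*q.w + q.x*q.x + q.y*q.y + q.z*q.z);
    return Quat{ q.w*invLen, q.x*invLen, q.y*invLen, q.z*invLen };
}
\end{verbatim}
In a transition such as idle-to-walk, the weight would increase from $0$ to
$1$ over the duration of the transition, with the first transform coming from
one animation cycle and the second from the other.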

\subsection{Detail}

The level-of-detail classes have not changed.  However, I rewrote the
\Code{CreateClodMesh} classes to account for the design changes for
vertex buffers.  I also thought hard about the abstract problem of
the edge collapses, and I believe this rewrite produced more
readable source code.  In particular, the WM4 version had a lot of
hand-rolled code for graph handling.  I removed this and used as much
standard C$++$ library support (STL) as I could.

\subsection{Sorting}

The sorting code has not changed from that of WM4.

\subsection{CurvesSurfaces}

The code for supporting dynamically tessellated curves and surfaces was
mainly rewritten because of the design changes for vertex buffers.  This
required some tedious changes to the internal workings, but from a
user's perspective, nothing has changed conceptually.

\subsection{Terrain}

I retired the \Code{ClodTerrain*} classes.  That continuous level of
detail algorithm is quite old and not needed given the power and
memory of current generation graphics cards.

\section{LibPhysics}

I have made some changes to the physics library involving collision detection
and fluids.

\subsection{CollisionDetection}

The collision detection code used to live in the graphics library.  I wanted
to move it to the physics library without causing a compiler dependency
between the two.  To do this, the collision detection code has been converted
to use templates.  The two template parameters are \Code{Mesh} and
\Code{Bound}.  These parameters must be instantiated with classes that provide
the following interfaces.
\begin{verbatim}
    Class Mesh must have the following functions in its interface.
        int GetNumVertices () const;
        Float3 GetPosition (int i) const;
        int GetNumTriangles () const;
        bool GetTriangle (int triangle, int& i0, int& i1, int& i2) const;
        bool GetModelTriangle (int triangle, APoint* modelTriangle) const;
        bool GetWorldTriangle (int triangle, APoint* worldTriangle) const;
        const Transform& GetWorldTransform () const;

    Class Bound must have the following functions in its interface.
        Bound ();  // default constructor
        void ComputeFromData (int numElements, int stride, const char* data);
        void TransformBy (const Transform& transform, Bound& bound) const;
        bool TestIntersection (const Bound& bound) const;
        bool TestIntersection (const Bound& bound, float tmax,
            const AVector& velocity0, const AVector& velocity1) const;
\end{verbatim}
Of course, in Wild Magic you instantiate with \Code{TriMesh} and \Code{Bound}.
However, it is relatively easy to use other mesh and bound classes and add to
them the few interface functions required.
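
The following sketch illustrates why templates remove the library dependency:
generic code is written only against the member functions it calls, so any
class that provides them can be used.  The class \Code{MyMesh} and the function
\Code{ComputeMaxDistanceFromOrigin} are hypothetical, \Code{Float3} is a
stand-in for the WM5 type of the same name, and only two of the required
interface functions are shown.
\begin{verbatim}
#include <cmath>
#include <vector>

// Stand-in for the WM5 tuple type; only what the sketch needs.
struct Float3 { float x, y, z; };

// A hypothetical user mesh that provides two of the required
// interface functions.
class MyMesh
{
public:
    explicit MyMesh (std::vector<Float3> const& positions)
        : mPositions(positions) {}

    int GetNumVertices () const
    {
        return static_cast<int>(mPositions.size());
    }

    Float3 GetPosition (int i) const { return mPositions[i]; }

private:
    std::vector<Float3> mPositions;
};

// Generic code written only against the interface.  Any class with
// GetNumVertices and GetPosition can be the template argument.
template <typename Mesh>
float ComputeMaxDistanceFromOrigin (Mesh const& mesh)
{
    float maxSqr = 0.0f;
    for (int i = 0; i < mesh.GetNumVertices(); ++i)
    {
        Float3 const p = mesh.GetPosition(i);
        float const sqr = p.x*p.x + p.y*p.y + p.z*p.z;
        if (sqr > maxSqr)
        {
            maxSqr = sqr;
        }
    }
    return std::sqrt(maxSqr);
}
\end{verbatim}
The function compiles for \Code{MyMesh} without including any graphics library
header, which is the property the templated collision classes rely on.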

WM4 had an \Code{Object}-derived class \Code{BoundingVolume} which is now
a non-\Code{Object}-derived class \Code{Bound}.  The WM4 class
\Code{BoundingVolumeTree} is replaced by the template class \Code{BoundTree}.
The template class avoids having explicit derived classes such as
\Code{BoxBVTree} and \Code{SphereBVTree}.  The \Code{Bound} template
parameter can represent any bounding volume container you choose to
implement.

Because \Code{BoundTree} is templated, the \Code{CollisionRecord} and
\Code{CollisionGroup} classes need to have the same template parameters.
Moreover, these classes have some requirements for the \Code{Mesh} template
parameter.  Specifically, the mesh class needs to provide access to its
triangles.

\subsection{Fluid}

This is a new folder for the physics library.  It contains a CPU-based
implementation for solving the Navier-Stokes equations in 2D and in 3D
on regular grids.  Sample physics applications that use the solvers
are \Code{Fluids2D} and \Code{Fluids3D}.  The classes and the sample
applications are described in {\em Game Physics, 2nd edition}.

\subsection{Intersection}

The code is essentially the same, but some class names changed.  We now
have classes \Code{IntervalManager}, \Code{RectangleManager}, and \Code{BoxManager}
for the sort-and-sweep space-time coherent collision culling.  {\em Game
Physics, 2nd edition} uses the new class names.  The book also talks
about how \Code{BoxManager} can be implemented using multithreading,
using multiple cores (Xbox 360), and on specialized processors (SPUs on
PS3).
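
For readers unfamiliar with sort-and-sweep, here is a minimal one-dimensional
sketch.  It shows only the initial sort and sweep over interval endpoints; the
incremental update that exploits space-time coherence is not shown.  The names
\Code{Interval} and \Code{FindOverlaps} are hypothetical, and this is not the
\Code{IntervalManager} interface.
\begin{verbatim}
#include <algorithm>
#include <set>
#include <utility>
#include <vector>

struct Interval { float min, max; };

// Returns the pairs (i,j) with i < j whose intervals overlap.
// Intervals that only touch at an endpoint count as overlapping.
std::vector<std::pair<int,int> > FindOverlaps (
    std::vector<Interval> const& intervals)
{
    // type is 0 for the begin of an interval and 1 for the end.
    struct Endpoint { float value; int type; int index; };

    std::vector<Endpoint> endpoints;
    for (int i = 0; i < static_cast<int>(intervals.size()); ++i)
    {
        endpoints.push_back(Endpoint{ intervals[i].min, 0, i });
        endpoints.push_back(Endpoint{ intervals[i].max, 1, i });
    }

    // Sort by value.  At equal values, begin points come first.
    std::sort(endpoints.begin(), endpoints.end(),
        [](Endpoint const& e0, Endpoint const& e1)
        {
            if (e0.value != e1.value)
            {
                return e0.value < e1.value;
            }
            return e0.type < e1.type;
        });

    // Sweep.  The active set holds the intervals that have begun
    // but not yet ended.
    std::set<int> active;
    std::vector<std::pair<int,int> > overlaps;
    for (Endpoint const& e : endpoints)
    {
        if (e.type == 0)
        {
            for (int j : active)
            {
                overlaps.emplace_back(std::min(j, e.index),
                    std::max(j, e.index));
            }
            active.insert(e.index);
        }
        else
        {
            active.erase(e.index);
        }
    }

    std::sort(overlaps.begin(), overlaps.end());
    return overlaps;
}
\end{verbatim}
The rectangle and box cases apply the same idea per axis and report a pair
only when the intervals overlap on every axis.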

\subsection{LCPSolver}

The LCP solver has not changed.  Eventually, I hope to replace this with
an implementation of the velocity-based dynamics described in {\em Game
Physics, 2nd edition}.

\subsection{ParticleSystem}

The particle system code has not changed.

\subsection{RigidBody}

The rigid body code has not changed.

\section{LibImagics}

Nothing has changed in the LibImagics library.  The WM4 and WM5 services are
exactly the same except that I recently fixed the performance problems with
the 2D and 3D connected component labelers.  The WM4 fixes have been posted,
but the WM5 fixes will be posted with the Wild Magic 5.2 patch.

\section{LibApplications}

The application layer has not changed much.  I added a static member
\Code{Application::ThePath}.  This stores the
path to the project folder of the application; you must support this by
providing a console/window title (string) that is a path to the project
folder relative to the path stored in the \Code{WM5\_PATH} environment
variable.

A change that I have not yet posted for either WM4 or WM5 is changing the
type of the console/window title from \Code{const char*} to
\Code{std::string}.  If you need the console/window title to store other
information, such as an input file your application is processing, you can
safely change the string during an \Code{OnPrecreate} call without destroying
the environment-path mechanism that relies on knowing the project folder
location.

The \Code{main} function has been restructured based on the changes for
path finding.  It also has specific calls to \Code{Initialize} and
\Code{Terminate} for the memory management system of WM5.

The Microsoft Windows stub is \Code{Wm5WinApplication.cpp}, which is where
\Code{WindowApplication::Main} lives, whether DirectX or
OpenGL.  This consolidates the Windows code into one source file (rather
than maintaining separate files for DirectX and OpenGL).

The \Code{Main} function has some new code.  The \Code{Camera} class needs
its normalized depth model specified based on graphics API.  The redesign
of the \Code{Renderer} class and how a renderer is created affects the
initialization.

\section{Tools}

Only a few tools are provided right now.

\subsection{GenerateProjects}

This is similar to the same-named project I provided in WM4.  You can use
this tool to generate the Microsoft Visual Studio 2008 \Code{vcproj} file
and the Xcode subfolder and project file for an application.  These
project files have all the compiler settings and library link information
that are present in the sample applications.  The usage is
\begin{verbatim}
  GenerateProjects MyProjectName
\end{verbatim}
The output is \Code{MyProjectName\_VC90.vcproj} and a subfolder named
\Code{MyProjectName.xcodeproj}.  The subfolder contains a file
\Code{project.pbxproj}.  The subfolder can be copied to a Macintosh
(by network or by sneaker net).

\subsection{BmpToWmtf}

This is a simple tool that runs on Microsoft Windows.  It loads a
24-bit BMP file and stores it as a Wild Magic 5 WMTF file, the
raw texture format for loading in WM5.  The usage is
\begin{verbatim}
  BmpToWmtf MyBmpFile
\end{verbatim}
The output format is \Code{Texture::TF\_A8R8G8B8} and the alpha channel
is filled with 255.  If you want a constant alpha channel of your
own choosing, say, of value 128, use
\begin{verbatim}
  BmpToWmtf -a 128 MyBmpFile
\end{verbatim}
The specified file name must be given without the BMP extension (I need to fix
this so that the name works with or without the extension).

\subsection{WmfxCompiler}

This tool generates Wild Magic 5 WMFX files that encapsulate the shader
programs for all the supported profiles.  The tool calls the Cg compiler
for an FX file specified on the command line.  It does so for the
profiles:  \Code{vs\_1\_1}, \Code{vs\_2\_0}, \Code{vs\_3\_0},
\Code{arbvp1}, \Code{ps\_1\_1}, \Code{ps\_2\_0}, \Code{ps\_3\_0},
and \Code{arbfp1}.  Whether all compiles succeed depends on the shader
model and what your shader programs are trying to do.  Failure to compile
a profile does not cause the tool to abort.  The output WMFX file contains
support for those profiles that were compiled successfully.  I write
log files to indicate what has (or has not) happened.  Of course, you
can still see the Cg warnings and errors when you run this tool.

Sometimes the profiles \Code{arbvp1} and \Code{arbfp1} are not enough
to compile a shader.  For example, vertex texturing requires a profile
of \Code{vp40}.  You can compile such shaders manually and either
hard-code them in the application code or manually generate a WMFX file.

\subsection{ObjMtlImporter}

This is a simple and not fully featured importer for the Wavefront
OBJ and MTL file formats.  It has sufficed for me for basic geometry
and materials.  The folder has only source code that you include in
your application.  Later I will provide some sort of stand-alone tool.
Within your source code, you can query the loader class to obtain
relevant information about your vertices, triangles and materials.

\subsection{WmtfViewer}

This is a simple viewer for \Code{Texture2D} images.  Eventually I may
add support to view cube maps and mipmap levels.  For now, this tool is
useful for debugging render targets.  You can save the texture of the
render target to disk and view it with this tool to see what is (or is
not) working correctly.

One warning.  The code maps color channels to a normalized color range.
The textures might have different hues than the original images that
were used to generate the WMTF files.  I fixed this in a local copy
of the viewer and need to post the fix (in the Wild Magic 5.2 patch).

\subsection{BmpColorToGray}

This is probably not useful for graphics, but I use this to convert 24-bit
color BMP files to gray scale images for screen captures in my books.

\section{The Future of Wild Magic}

After years of maintaining an abstract rendering API that hides DirectX,
OpenGL, and software rendering, the conclusion is that each underlying API
suffers to some extent from the abstraction.  Given my desire to provide
a cross-platform graphics engine, it makes sense to focus on OpenGL.  As of
the time of writing this document, I have no plans to ship something called
Wild Magic 6.

This is not a judgment of whether OpenGL or DirectX is the better graphics
API.  Supporting multiple platform-dependent renderers slows down the
evolution of the platform-independent engine code, so focusing on only
one graphics API should speed up new development.  Given the requests for
graphics support on cell phones and given the abundance of OpenGL support for
desktops and embedded devices, it makes sense to abandon DirectX for now.

The Wild Magic source code will be split and evolve along two paths.

The mathematics portion of the source code will become part of a product
called the Malleable Mathematics Library.  Most of this code is not graphics
related, and the emphasis will be on robustness, speed, and accuracy of the
implementations.  This includes developing implementations that use any
SIMD support on the CPUs, that run on multiple cores, and that can use the
GPU as a general-purpose processor.  When robustness and accuracy are of the
utmost importance and speed is not an issue, some of the algorithms will
have implementations that use exact rational arithmetic and/or arbitrary
precision floating-point arithmetic.

The graphics portion of the source code will become part of a product called
EmeraldGL.  The renderer layer will still hide any explicit dependence on
OpenGL, but the hiding is relatively shallow and the architecture of the
renderer and graphics engine will be driven by the OpenGL/GLSL view of graphics.
This product will run on desktop computers (OpenGL 2.0 or later) and on
embedded devices (via OpenGL ES 2.0), using GLSL (or whatever variant
is necessary for embedded devices).  Naturally, not everything that runs
on a desktop will run on an embedded device, but the engine will allow you
to work with either.  EmeraldGL will have the minimal amount of code for
basic mathematics that graphics requires (points, vectors, matrices, planes,
quaternions) and will use SIMD and/or GPU when it makes sense.

Perhaps in the future I will return to supporting DirectX, maybe
creating EmeraldDX, but that remains to be seen.

\end{document}