1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449 450 451 452 453 454 455 456 457 458 459 460 461 462 463 464 465 466 467 468 469 470 471 472 473 474 475 476 477 478 479 480 481 482 483 484 485 486 487 488 489 490 491 492 493 494 495 496 497 498 499 500 501 502 503 504 505 506 507 508 509 510 511 512 513 514 515 516 517 518 519 520 521 522 523 524 525 526 527 528 529 530 531 532 533 534 535 536 537 538 539 540 541 542 543 544 545 546 547 548 549 550 551 552 553 554 555 556 557 558 559 560 561 562 563 564 565 566 567 568 569 570 571 572 573 574 575 576 577 578 579 580 581 582 583 584 585 586 587 588 589 590 591 592 593 594 595 596 597 598 599 600 601 602 603 604 605 606 607 608 609 610 611 612 613 614 615 616 617 618 619 620 621 622 623 624 625 626 627 628 629 630 631 632 633 634 635 636 637 638 639 640 641 642 643 644 645 646 647 648 649 650 651 652 653 654 655 656 657 658 659 660 661 662 663 664 665 666 667 668 669 670 671 672 673 674 675 676 677 678 679 680 681 682 683 684 685 686 687 688 689 690 691 692 693 694 695 696 697 698 699 700 701 702 703 704 705 706 707 708 709 710 711 712 713 714 715 716 717 718 719 720 721 722 723 724 725 726 727 728 729 730 731 732 733 734 735 736 737 738 739 740 741 742 743 744 745 746 747 748 749 750 751 752 753 754 755 756 757 758 759 760 761 762 763 764 765 766 767 768 769 770 771 772 773 774 775 776 777 778 779 780 781 782 783 784 785 786 787 788 789 790 791 792 793 794 795 796 797 798 799 800 801 802 803 804 805 806 807 808 809 810 811 812 813 814 815 816 817 818 819 820 821 822 823 824 825 826 827 828 829 830 831 832 833 834 835 836 837 838 839 840 841 842 843 844 845 846 847 848 849 850 851 852 853 854 855 856 857 858 859 860 861 862 863 864 865 866 867 868 869 870 871 872 873 874 875 876 877 878 879 880 881 882 883 884 885 886 887 888 889 890 891 892 893 894 895 896 897 898 899 900 901 902 903 904 905 906 907 908 909 910 911 912 913 914 915 916 917 918 919 920 921 922 923 924 925 926 927 928 929 930 931 932 933 934 935 936 937 938 939 940 941 942 943 944 945 946 947 948 949 950 951 952 953 954 955 956 957 958 959 960 961 962 963 964 965 966 967 968 969 970 971 972 973 974 975 976 977 978 979 980 981 982 983 984 985 986 987 988 989 990 991 992 993 994 995 996 997 998 999 1000 1001 1002 1003 1004 1005 1006 1007 1008 1009 1010 1011 1012 1013 1014 1015 1016 1017 1018 1019 1020 1021 1022 1023 1024 1025 1026 1027 1028 1029 1030 1031 1032 1033 1034 1035 1036 1037 1038 1039 1040 1041 1042 1043 1044 1045 1046 1047 1048 1049 1050 1051 1052 1053 1054 1055 1056 1057 1058 1059 1060 1061 1062 1063 1064 1065 1066 1067 1068 1069 1070 1071 1072 1073 1074 1075 1076 1077 1078 1079 1080 1081 1082 1083 1084 1085 1086 1087 1088 1089 1090 1091 1092 1093 1094 1095 1096 1097 1098 1099 1100 1101 1102 1103 1104 1105 1106 1107 1108 1109 1110 1111 1112 1113 1114 1115 1116 1117 1118 1119 1120 1121 1122 1123 1124 1125 1126 1127 1128 1129 1130 1131 1132 1133 1134 1135 1136 1137 1138 1139 1140 1141 1142 1143 1144 1145 1146 1147 1148 1149 1150 1151 1152 1153 1154 1155 1156 1157 1158 1159 1160 1161 1162 1163 1164 1165 1166 1167 1168 1169 1170 1171 1172 1173 1174 1175 1176 1177 1178 1179 1180 1181 1182 1183 1184 1185 1186 1187 1188 1189 1190 1191 1192 1193 1194 1195 1196 1197 1198 1199 1200 1201 1202 1203 1204 1205 1206 1207 1208 1209 1210 1211 1212 1213 1214 1215 1216 1217 1218 1219 1220 1221 1222 1223 1224 1225 1226 1227 1228 1229 1230 1231 1232 1233 1234 1235 1236 1237 1238 1239 1240 1241 1242 1243 1244 1245 1246 1247 1248 1249 1250 1251 1252 1253 1254 1255 1256 1257 1258 1259 1260 1261 1262 1263 1264 1265 1266 1267 1268 1269 1270 1271 1272 1273 1274 1275 1276 1277 1278 1279 1280 1281 1282 1283 1284 1285 1286 1287 1288 1289 1290 1291 1292 1293 1294 1295 1296 1297 1298 1299 1300 1301 1302 1303 1304 1305 1306 1307 1308 1309 1310 1311 1312 1313 1314 1315 1316 1317 1318 1319 1320 1321 1322 1323 1324 1325 1326 1327 1328 1329 1330 1331 1332 1333 1334 1335 1336 1337 1338 1339 1340 1341 1342 1343 1344 1345 1346 1347 1348 1349 1350 1351 1352 1353 1354 1355 1356 1357 1358 1359 1360 1361 1362 1363 1364 1365 1366 1367 1368 1369 1370 1371 1372 1373 1374 1375 1376 1377 1378 1379 1380 1381 1382 1383 1384 1385 1386 1387 1388 1389 1390 1391 1392 1393 1394 1395 1396 1397 1398 1399 1400 1401 1402 1403 1404 1405 1406 1407 1408 1409 1410 1411 1412 1413 1414 1415 1416 1417 1418 1419 1420 1421 1422 1423 1424 1425 1426 1427 1428 1429 1430 1431 1432 1433 1434 1435 1436 1437 1438 1439 1440 1441 1442 1443 1444 1445 1446 1447 1448 1449 1450 1451 1452 1453 1454 1455 1456 1457 1458 1459 1460 1461 1462 1463 1464 1465 1466 1467 1468 1469 1470 1471 1472 1473 1474 1475 1476 1477 1478 1479 1480 1481 1482 1483 1484 1485 1486 1487 1488 1489 1490 1491 1492 1493 1494 1495 1496 1497 1498 1499 1500 1501 1502 1503 1504 1505 1506 1507 1508 1509 1510 1511 1512 1513 1514 1515 1516 1517 1518 1519 1520 1521 1522 1523 1524 1525 1526 1527 1528 1529 1530 1531 1532 1533 1534 1535 1536 1537 1538 1539 1540 1541 1542 1543 1544 1545 1546 1547 1548 1549 1550 1551 1552 1553 1554 1555 1556 1557 1558 1559 1560 1561 1562 1563 1564 1565 1566 1567 1568 1569 1570 1571 1572 1573 1574 1575 1576 1577 1578 1579 1580 1581 1582 1583 1584 1585 1586 1587 1588 1589 1590 1591 1592 1593 1594 1595 1596 1597 1598 1599 1600 1601 1602 1603 1604 1605 1606 1607 1608 1609 1610 1611 1612 1613 1614 1615 1616 1617 1618 1619 1620 1621 1622 1623 1624 1625 1626 1627 1628 1629 1630 1631 1632 1633 1634 1635 1636 1637 1638 1639 1640 1641 1642 1643 1644 1645 1646 1647 1648 1649 1650 1651 1652 1653 1654 1655 1656 1657 1658 1659 1660 1661 1662 1663 1664 1665 1666 1667 1668 1669 1670 1671 1672 1673 1674 1675 1676 1677 1678 1679 1680 1681 1682 1683 1684 1685 1686 1687 1688 1689 1690 1691 1692 1693 1694 1695 1696 1697 1698 1699 1700 1701 1702 1703 1704 1705 1706 1707 1708 1709 1710 1711 1712 1713 1714 1715 1716 1717 1718 1719 1720 1721 1722 1723 1724 1725 1726 1727 1728 1729 1730 1731 1732 1733 1734 1735 1736 1737 1738 1739 1740 1741 1742 1743 1744 1745 1746 1747 1748 1749 1750 1751 1752 1753 1754 1755 1756 1757 1758 1759 1760 1761 1762 1763 1764 1765 1766 1767 1768 1769 1770 1771 1772 1773 1774 1775 1776 1777 1778 1779 1780 1781 1782 1783 1784 1785 1786 1787 1788 1789 1790 1791 1792 1793 1794 1795 1796 1797 1798 1799 1800 1801 1802 1803 1804 1805 1806 1807 1808 1809 1810 1811 1812 1813 1814 1815 1816 1817 1818 1819 1820 1821 1822 1823 1824 1825 1826 1827 1828 1829 1830 1831 1832 1833 1834 1835 1836 1837 1838 1839 1840 1841 1842 1843 1844 1845 1846 1847 1848 1849 1850 1851 1852 1853 1854 1855 1856 1857 1858 1859 1860 1861 1862 1863 1864 1865 1866 1867 1868 1869 1870 1871 1872 1873 1874 1875 1876 1877 1878 1879 1880 1881 1882 1883 1884 1885 1886 1887 1888 1889 1890 1891 1892 1893 1894 1895 1896 1897 1898 1899 1900 1901 1902 1903 1904 1905 1906 1907 1908 1909 1910 1911 1912 1913 1914 1915 1916 1917 1918 1919 1920 1921 1922 1923 1924 1925 1926 1927 1928 1929 1930 1931 1932 1933 1934 1935 1936 1937 1938 1939 1940 1941 1942 1943 1944 1945 1946 1947 1948 1949 1950 1951 1952 1953 1954 1955 1956 1957 1958 1959 1960 1961 1962 1963 1964 1965 1966 1967 1968 1969 1970 1971 1972 1973 1974 1975 1976 1977 1978 1979 1980 1981 1982 1983 1984 1985 1986 1987 1988 1989 1990 1991 1992 1993 1994 1995 1996 1997 1998 1999 2000 2001 2002 2003 2004 2005 2006 2007 2008 2009 2010 2011 2012 2013 2014 2015 2016 2017 2018 2019 2020 2021 2022 2023 2024 2025 2026 2027 2028 2029 2030 2031 2032 2033 2034 2035 2036 2037 2038 2039 2040 2041 2042 2043 2044 2045 2046 2047 2048 2049 2050 2051 2052 2053 2054 2055 2056 2057 2058 2059 2060 2061 2062 2063 2064 2065 2066 2067 2068 2069 2070 2071 2072 2073 2074 2075 2076 2077 2078 2079 2080 2081 2082 2083 2084 2085 2086 2087 2088 2089 2090 2091 2092 2093 2094 2095 2096 2097 2098 2099 2100 2101 2102 2103 2104 2105 2106 2107 2108 2109 2110 2111 2112 2113 2114 2115 2116 2117 2118 2119 2120 2121 2122 2123 2124 2125 2126 2127 2128 2129 2130 2131 2132 2133 2134 2135 2136 2137 2138 2139 2140 2141 2142 2143 2144 2145 2146 2147 2148 2149 2150 2151 2152 2153 2154 2155 2156 2157 2158 2159 2160 2161 2162 2163 2164 2165 2166 2167 2168 2169 2170 2171 2172 2173 2174 2175 2176 2177 2178 2179 2180 2181 2182 2183 2184 2185 2186 2187 2188 2189 2190 2191 2192 2193 2194 2195 2196 2197 2198 2199 2200 2201 2202 2203 2204 2205 2206 2207 2208 2209 2210 2211 2212 2213 2214 2215 2216 2217 2218 2219 2220 2221 2222 2223 2224 2225 2226 2227 2228 2229 2230 2231 2232 2233 2234 2235 2236 2237 2238 2239 2240 2241 2242 2243 2244 2245 2246 2247 2248 2249 2250 2251 2252 2253 2254 2255 2256 2257 2258 2259 2260 2261 2262 2263 2264 2265 2266 2267 2268 2269 2270 2271 2272 2273 2274 2275 2276 2277 2278 2279 2280 2281 2282 2283 2284 2285 2286 2287 2288 2289 2290 2291 2292 2293 2294 2295 2296 2297 2298 2299 2300 2301 2302 2303 2304 2305 2306 2307 2308 2309 2310 2311 2312 2313 2314 2315 2316 2317 2318 2319 2320 2321 2322 2323 2324 2325 2326 2327 2328 2329 2330 2331 2332 2333 2334 2335 2336 2337 2338 2339 2340 2341 2342 2343 2344 2345 2346 2347 2348 2349 2350 2351 2352 2353 2354 2355 2356 2357 2358 2359 2360 2361 2362 2363 2364 2365 2366 2367 2368 2369 2370 2371 2372 2373 2374 2375 2376 2377 2378 2379 2380 2381 2382 2383 2384 2385 2386 2387 2388 2389 2390 2391 2392 2393 2394 2395 2396 2397 2398 2399 2400 2401 2402 2403 2404 2405 2406 2407 2408 2409 2410 2411 2412 2413 2414 2415 2416 2417 2418 2419 2420 2421 2422 2423 2424 2425 2426 2427 2428 2429 2430 2431 2432 2433 2434 2435 2436 2437 2438 2439 2440 2441 2442 2443 2444 2445 2446 2447 2448 2449 2450 2451 2452 2453 2454 2455 2456 2457 2458 2459 2460 2461 2462 2463 2464 2465 2466 2467 2468 2469 2470 2471 2472 2473 2474 2475 2476 2477 2478 2479 2480 2481 2482 2483 2484 2485 2486 2487 2488 2489 2490 2491 2492 2493 2494 2495 2496 2497 2498 2499 2500 2501 2502 2503 2504 2505 2506 2507 2508 2509 2510 2511 2512 2513 2514 2515 2516 2517 2518 2519 2520 2521 2522 2523 2524 2525 2526 2527 2528 2529 2530 2531 2532 2533 2534 2535 2536 2537 2538 2539 2540 2541 2542 2543 2544 2545 2546 2547 2548 2549 2550 2551 2552 2553 2554 2555 2556 2557 2558 2559 2560 2561 2562 2563 2564 2565 2566 2567 2568 2569 2570 2571 2572 2573 2574 2575 2576 2577 2578 2579 2580 2581 2582 2583 2584 2585 2586 2587 2588 2589 2590 2591 2592 2593 2594 2595 2596 2597 2598 2599 2600 2601 2602 2603 2604 2605 2606 2607 2608 2609 2610 2611 2612 2613 2614 2615 2616 2617 2618 2619 2620 2621 2622 2623 2624 2625 2626 2627 2628 2629 2630 2631 2632 2633 2634 2635 2636 2637 2638 2639 2640 2641 2642 2643 2644 2645 2646 2647 2648 2649 2650 2651 2652 2653 2654 2655 2656 2657 2658 2659 2660 2661 2662 2663 2664 2665 2666 2667 2668 2669 2670 2671 2672 2673 2674 2675 2676 2677 2678 2679 2680 2681 2682 2683 2684 2685 2686 2687 2688 2689 2690 2691 2692 2693 2694 2695 2696 2697 2698 2699 2700 2701 2702 2703 2704 2705 2706 2707 2708 2709 2710 2711 2712 2713 2714 2715 2716 2717 2718 2719 2720 2721 2722 2723 2724 2725 2726 2727 2728 2729 2730 2731 2732 2733 2734 2735 2736 2737 2738 2739 2740 2741 2742 2743 2744 2745 2746 2747 2748 2749 2750 2751 2752 2753 2754 2755 2756 2757 2758 2759 2760 2761 2762 2763 2764 2765 2766 2767 2768 2769 2770 2771 2772 2773 2774 2775 2776 2777 2778 2779 2780 2781 2782 2783 2784 2785 2786 2787 2788 2789 2790 2791 2792 2793 2794 2795 2796 2797 2798 2799 2800 2801 2802 2803 2804 2805 2806 2807 2808 2809 2810 2811 2812 2813 2814 2815 2816 2817 2818 2819 2820 2821 2822 2823 2824 2825 2826 2827 2828 2829 2830 2831 2832 2833 2834 2835 2836 2837 2838 2839 2840 2841 2842 2843 2844 2845 2846 2847 2848 2849 2850 2851 2852 2853 2854 2855 2856 2857 2858 2859 2860 2861 2862 2863 2864 2865 2866 2867 2868 2869 2870 2871 2872 2873 2874 2875 2876 2877 2878 2879 2880 2881 2882 2883 2884 2885 2886 2887 2888 2889 2890 2891 2892 2893 2894 2895 2896 2897 2898 2899 2900 2901 2902 2903 2904 2905 2906 2907 2908 2909 2910 2911 2912 2913 2914 2915 2916 2917 2918 2919 2920 2921 2922 2923 2924 2925 2926 2927 2928 2929 2930 2931 2932 2933 2934 2935 2936 2937 2938 2939 2940 2941 2942 2943 2944 2945 2946 2947 2948 2949 2950 2951 2952 2953 2954 2955 2956 2957 2958 2959 2960 2961 2962 2963 2964 2965 2966 2967 2968 2969 2970 2971 2972 2973 2974 2975 2976 2977 2978 2979 2980 2981 2982 2983 2984 2985 2986 2987 2988 2989 2990 2991 2992 2993 2994 2995 2996 2997 2998 2999 3000 3001 3002 3003 3004 3005 3006 3007 3008 3009 3010 3011 3012 3013 3014 3015 3016 3017 3018 3019 3020 3021 3022 3023 3024 3025 3026 3027 3028 3029 3030 3031 3032 3033 3034 3035 3036 3037 3038 3039 3040 3041 3042 3043 3044 3045 3046 3047 3048 3049 3050 3051 3052 3053 3054 3055 3056 3057 3058 3059 3060 3061 3062 3063 3064 3065 3066 3067 3068 3069 3070 3071 3072 3073 3074 3075 3076 3077 3078 3079 3080 3081 3082 3083 3084 3085 3086 3087 3088 3089 3090 3091 3092 3093 3094 3095 3096 3097 3098 3099 3100 3101 3102 3103 3104 3105 3106 3107 3108 3109 3110 3111 3112 3113 3114 3115 3116 3117 3118 3119 3120 3121 3122 3123 3124 3125 3126 3127 3128 3129 3130 3131 3132 3133 3134 3135 3136 3137 3138 3139 3140 3141 3142 3143 3144 3145 3146 3147 3148 3149 3150 3151 3152 3153 3154 3155 3156 3157 3158 3159 3160 3161 3162 3163 3164 3165 3166 3167 3168 3169 3170 3171 3172 3173 3174 3175 3176 3177 3178 3179 3180 3181 3182 3183 3184 3185 3186 3187 3188 3189 3190 3191 3192 3193 3194 3195 3196 3197 3198 3199 3200 3201 3202 3203 3204 3205 3206 3207 3208 3209 3210 3211 3212 3213 3214 3215 3216 3217 3218 3219 3220 3221 3222 3223 3224 3225 3226 3227 3228 3229 3230 3231 3232 3233 3234 3235 3236 3237 3238 3239 3240 3241 3242 3243 3244 3245 3246 3247 3248 3249 3250 3251 3252 3253 3254 3255 3256 3257 3258 3259 3260 3261 3262 3263 3264 3265 3266 3267 3268 3269 3270 3271 3272 3273 3274 3275 3276 3277 3278 3279 3280 3281 3282 3283 3284 3285 3286 3287 3288 3289 3290 3291 3292 3293 3294 3295 3296 3297 3298 3299 3300 3301 3302 3303 3304 3305 3306 3307 3308 3309 3310 3311 3312 3313 3314 3315 3316 3317 3318 3319 3320 3321 3322 3323 3324 3325 3326 3327 3328 3329 3330 3331 3332 3333 3334 3335 3336 3337 3338 3339 3340 3341 3342 3343 3344 3345 3346 3347 3348 3349 3350 3351 3352 3353 3354 3355 3356 3357 3358 3359 3360 3361 3362 3363 3364 3365 3366 3367 3368 3369 3370 3371 3372 3373 3374 3375 3376 3377 3378 3379 3380 3381 3382 3383 3384 3385 3386 3387 3388 3389 3390 3391 3392 3393 3394 3395 3396 3397 3398 3399 3400 3401 3402 3403 3404 3405 3406 3407 3408 3409 3410 3411 3412 3413 3414 3415 3416 3417 3418 3419 3420 3421 3422 3423 3424 3425 3426 3427 3428 3429 3430 3431 3432 3433 3434 3435 3436 3437 3438 3439 3440 3441 3442 3443 3444 3445 3446 3447 3448 3449 3450 3451 3452 3453 3454 3455 3456 3457 3458 3459 3460 3461 3462 3463 3464 3465 3466 3467 3468 3469 3470 3471 3472 3473 3474 3475 3476 3477 3478 3479 3480 3481 3482 3483 3484 3485 3486 3487 3488 3489 3490 3491 3492 3493 3494 3495 3496 3497 3498 3499 3500 3501 3502 3503 3504 3505 3506 3507 3508 3509 3510 3511 3512 3513 3514 3515 3516 3517 3518 3519 3520 3521 3522 3523 3524 3525 3526 3527 3528 3529 3530 3531 3532 3533 3534 3535 3536 3537 3538 3539 3540 3541 3542 3543 3544 3545 3546 3547 3548 3549 3550 3551 3552 3553 3554 3555 3556 3557 3558 3559 3560 3561 3562 3563 3564 3565 3566 3567 3568 3569 3570 3571 3572 3573 3574 3575 3576 3577 3578 3579 3580 3581 3582 3583 3584 3585 3586 3587 3588 3589 3590 3591 3592 3593 3594 3595 3596 3597 3598 3599 3600 3601 3602 3603 3604 3605 3606 3607 3608 3609 3610 3611 3612 3613 3614 3615 3616 3617 3618 3619 3620 3621 3622 3623 3624 3625 3626 3627 3628 3629 3630 3631 3632 3633 3634 3635 3636 3637 3638 3639 3640 3641 3642 3643 3644 3645 3646 3647 3648 3649 3650 3651 3652 3653 3654 3655 3656 3657 3658 3659 3660 3661 3662 3663 3664 3665 3666 3667 3668 3669 3670 3671 3672 3673 3674 3675 3676 3677 3678 3679 3680 3681 3682 3683 3684 3685 3686 3687 3688 3689 3690 3691 3692 3693 3694 3695 3696 3697 3698 3699 3700 3701 3702 3703 3704 3705 3706 3707 3708 3709 3710 3711 3712 3713 3714 3715 3716 3717 3718 3719 3720 3721 3722 3723 3724 3725 3726 3727 3728 3729 3730 3731 3732 3733 3734 3735 3736 3737 3738 3739 3740 3741 3742 3743 3744 3745 3746 3747 3748 3749 3750 3751 3752 3753 3754 3755 3756 3757 3758 3759 3760 3761 3762 3763 3764 3765 3766 3767 3768 3769 3770 3771 3772 3773 3774 3775 3776 3777 3778 3779 3780 3781 3782 3783 3784 3785 3786 3787 3788 3789 3790 3791 3792 3793 3794 3795 3796 3797 3798 3799 3800 3801 3802 3803 3804 3805 3806 3807 3808 3809 3810 3811 3812 3813 3814 3815 3816 3817 3818 3819 3820 3821 3822 3823 3824 3825 3826 3827 3828 3829 3830 3831 3832 3833 3834 3835 3836 3837 3838 3839 3840 3841 3842 3843 3844 3845 3846 3847 3848 3849 3850 3851 3852 3853 3854 3855 3856 3857 3858 3859 3860 3861 3862 3863 3864 3865 3866 3867 3868 3869 3870 3871 3872 3873 3874 3875 3876 3877 3878 3879 3880 3881 3882 3883 3884 3885 3886 3887 3888 3889 3890 3891 3892 3893 3894 3895 3896 3897 3898 3899 3900 3901 3902 3903 3904 3905 3906 3907 3908 3909 3910 3911 3912 3913 3914 3915 3916 3917 3918 3919 3920 3921 3922 3923 3924 3925 3926 3927 3928 3929 3930 3931 3932 3933 3934 3935 3936 3937 3938 3939 3940 3941 3942 3943 3944 3945 3946 3947 3948 3949 3950 3951 3952 3953 3954 3955 3956 3957 3958 3959 3960 3961 3962 3963 3964 3965 3966 3967 3968 3969 3970 3971 3972 3973 3974 3975 3976 3977 3978 3979 3980 3981 3982 3983 3984 3985 3986 3987 3988 3989 3990 3991 3992 3993 3994 3995 3996 3997 3998 3999 4000 4001 4002 4003 4004 4005 4006 4007 4008 4009 4010 4011 4012 4013 4014 4015 4016 4017 4018 4019 4020 4021 4022 4023 4024 4025 4026 4027 4028 4029 4030 4031 4032 4033 4034 4035 4036 4037 4038 4039 4040 4041 4042 4043 4044 4045 4046 4047 4048 4049 4050 4051 4052 4053 4054 4055 4056 4057 4058 4059 4060 4061 4062 4063 4064 4065 4066 4067 4068 4069 4070 4071 4072 4073 4074 4075 4076 4077 4078 4079 4080 4081 4082 4083 4084 4085 4086 4087 4088 4089 4090 4091 4092 4093 4094 4095 4096 4097 4098 4099 4100 4101 4102 4103 4104 4105 4106 4107 4108 4109 4110 4111 4112 4113 4114 4115 4116 4117 4118 4119 4120 4121 4122 4123 4124 4125 4126 4127 4128 4129 4130 4131 4132 4133 4134 4135 4136 4137 4138 4139 4140 4141 4142 4143 4144 4145 4146 4147 4148 4149 4150 4151 4152 4153 4154 4155 4156 4157 4158 4159 4160 4161 4162 4163 4164 4165 4166 4167 4168 4169 4170 4171 4172 4173 4174 4175 4176 4177 4178 4179 4180 4181 4182 4183 4184 4185 4186 4187 4188 4189 4190 4191 4192 4193 4194 4195 4196 4197 4198 4199 4200 4201 4202 4203 4204 4205 4206 4207 4208 4209 4210 4211 4212 4213 4214 4215 4216 4217 4218 4219 4220 4221 4222 4223 4224 4225 4226 4227 4228 4229 4230 4231 4232 4233 4234 4235 4236 4237 4238 4239 4240 4241 4242 4243 4244 4245 4246 4247 4248 4249 4250 4251 4252 4253 4254 4255 4256 4257 4258 4259 4260 4261 4262 4263 4264 4265 4266 4267 4268 4269 4270 4271 4272 4273 4274 4275 4276 4277 4278 4279 4280 4281 4282 4283 4284 4285 4286 4287 4288 4289 4290 4291 4292 4293 4294 4295 4296 4297 4298 4299 4300 4301 4302 4303 4304 4305 4306 4307 4308 4309 4310 4311 4312 4313 4314 4315 4316 4317 4318 4319 4320 4321 4322 4323 4324 4325 4326 4327 4328 4329 4330 4331 4332 4333 4334 4335 4336 4337 4338 4339 4340 4341 4342 4343 4344 4345 4346 4347 4348 4349 4350 4351 4352 4353 4354 4355 4356 4357 4358 4359 4360 4361 4362 4363 4364 4365 4366 4367 4368 4369 4370 4371 4372 4373 4374 4375 4376 4377 4378 4379 4380 4381 4382 4383 4384 4385 4386 4387 4388 4389 4390 4391 4392 4393 4394 4395 4396 4397 4398 4399 4400 4401 4402 4403 4404 4405 4406 4407 4408 4409 4410 4411 4412 4413 4414 4415 4416 4417 4418 4419 4420 4421 4422 4423 4424 4425 4426 4427 4428 4429 4430 4431 4432 4433 4434 4435 4436 4437 4438 4439 4440 4441 4442 4443 4444 4445 4446 4447 4448 4449 4450 4451 4452 4453 4454 4455 4456 4457 4458 4459 4460 4461 4462 4463 4464 4465 4466 4467 4468 4469 4470 4471 4472 4473 4474 4475 4476 4477 4478 4479 4480 4481 4482 4483 4484 4485 4486 4487 4488 4489 4490 4491 4492 4493 4494 4495 4496 4497 4498 4499 4500 4501 4502 4503 4504 4505 4506 4507 4508 4509 4510 4511 4512 4513 4514 4515 4516 4517 4518 4519 4520 4521 4522 4523 4524 4525 4526 4527 4528 4529 4530 4531 4532 4533 4534 4535 4536 4537 4538 4539 4540 4541 4542 4543 4544 4545 4546 4547 4548 4549 4550 4551 4552 4553 4554 4555 4556 4557 4558 4559 4560 4561 4562 4563 4564 4565 4566 4567 4568 4569 4570 4571 4572 4573 4574 4575 4576 4577 4578 4579 4580 4581 4582 4583 4584 4585 4586 4587 4588 4589 4590 4591 4592 4593 4594 4595 4596 4597 4598 4599 4600 4601 4602 4603 4604 4605 4606 4607 4608 4609 4610 4611 4612 4613 4614 4615 4616 4617 4618 4619 4620 4621 4622 4623 4624 4625 4626 4627 4628 4629 4630 4631 4632 4633 4634 4635 4636 4637 4638 4639 4640 4641 4642 4643 4644 4645 4646 4647 4648 4649 4650 4651 4652 4653 4654 4655 4656 4657 4658 4659 4660 4661 4662 4663 4664 4665 4666 4667 4668 4669 4670 4671 4672 4673 4674 4675 4676 4677 4678 4679 4680 4681 4682 4683 4684 4685 4686 4687 4688 4689 4690 4691 4692 4693 4694 4695 4696 4697 4698 4699 4700 4701 4702 4703 4704 4705 4706 4707 4708 4709 4710 4711 4712 4713 4714 4715 4716 4717 4718 4719 4720 4721 4722 4723 4724 4725 4726 4727 4728 4729 4730 4731 4732 4733 4734 4735 4736 4737 4738 4739 4740 4741 4742 4743 4744 4745 4746 4747 4748 4749 4750 4751 4752 4753 4754 4755 4756 4757 4758 4759 4760 4761 4762 4763 4764 4765 4766 4767 4768 4769 4770 4771 4772 4773 4774 4775 4776 4777 4778 4779 4780 4781 4782 4783 4784 4785 4786 4787 4788 4789 4790 4791 4792 4793 4794 4795 4796 4797 4798 4799 4800 4801 4802 4803 4804 4805 4806 4807 4808 4809 4810 4811 4812 4813 4814 4815 4816 4817 4818 4819 4820 4821 4822 4823 4824 4825 4826 4827 4828 4829 4830 4831 4832 4833 4834 4835 4836 4837 4838 4839 4840 4841 4842 4843 4844 4845 4846 4847 4848 4849 4850 4851 4852 4853 4854 4855 4856 4857 4858 4859 4860 4861 4862 4863 4864 4865 4866 4867 4868 4869 4870 4871 4872 4873 4874 4875 4876 4877 4878 4879 4880 4881 4882 4883 4884 4885 4886 4887 4888 4889 4890 4891 4892 4893 4894 4895 4896 4897 4898 4899 4900 4901 4902 4903 4904 4905 4906 4907 4908 4909 4910 4911 4912 4913 4914 4915 4916 4917 4918 4919 4920 4921 4922 4923 4924 4925 4926 4927 4928 4929 4930 4931 4932 4933 4934 4935 4936 4937 4938 4939 4940 4941 4942 4943 4944 4945 4946 4947 4948 4949 4950 4951 4952 4953 4954 4955 4956 4957 4958 4959 4960 4961 4962 4963 4964 4965 4966 4967 4968 4969 4970 4971 4972 4973 4974 4975 4976 4977 4978 4979 4980 4981 4982 4983 4984 4985 4986 4987 4988 4989 4990 4991 4992 4993 4994 4995 4996 4997 4998 4999 5000 5001 5002 5003 5004 5005 5006 5007 5008 5009 5010 5011 5012 5013 5014 5015 5016 5017 5018 5019 5020 5021 5022 5023 5024 5025 5026 5027 5028 5029 5030 5031 5032 5033 5034 5035 5036 5037 5038 5039 5040 5041 5042 5043 5044 5045 5046 5047 5048 5049 5050 5051 5052 5053 5054 5055 5056 5057 5058 5059 5060 5061 5062 5063 5064 5065 5066 5067 5068 5069 5070 5071 5072 5073 5074 5075 5076 5077 5078 5079 5080 5081 5082 5083 5084 5085 5086 5087 5088 5089 5090 5091 5092 5093 5094 5095 5096 5097 5098 5099 5100 5101 5102 5103 5104 5105 5106 5107 5108 5109 5110 5111 5112 5113 5114 5115 5116 5117 5118 5119 5120 5121 5122 5123 5124 5125 5126 5127 5128 5129 5130 5131 5132 5133 5134 5135 5136 5137 5138 5139 5140 5141 5142 5143 5144 5145 5146 5147 5148 5149 5150 5151 5152 5153 5154 5155 5156 5157 5158 5159 5160 5161 5162 5163 5164 5165 5166 5167 5168 5169 5170 5171 5172 5173 5174 5175 5176 5177 5178 5179 5180 5181 5182 5183 5184 5185 5186 5187 5188 5189 5190 5191 5192 5193 5194 5195 5196 5197 5198 5199 5200 5201 5202 5203 5204 5205 5206 5207 5208 5209 5210 5211 5212 5213 5214 5215 5216 5217 5218 5219 5220 5221 5222 5223 5224 5225 5226 5227 5228 5229 5230 5231 5232 5233 5234 5235 5236 5237 5238 5239 5240 5241 5242 5243 5244 5245 5246 5247 5248 5249 5250 5251 5252 5253 5254 5255 5256 5257 5258 5259 5260 5261 5262 5263 5264 5265 5266 5267 5268 5269 5270 5271 5272 5273 5274 5275 5276 5277 5278 5279 5280 5281 5282 5283 5284 5285 5286 5287 5288 5289 5290 5291 5292 5293 5294 5295 5296 5297 5298 5299 5300 5301 5302 5303 5304 5305 5306 5307 5308 5309 5310 5311 5312 5313 5314 5315 5316 5317 5318 5319 5320 5321 5322 5323 5324 5325 5326 5327 5328 5329 5330 5331 5332 5333 5334 5335 5336 5337 5338 5339 5340 5341 5342 5343 5344 5345 5346 5347 5348 5349 5350 5351 5352 5353 5354 5355 5356 5357 5358 5359 5360 5361 5362 5363 5364 5365 5366 5367 5368 5369 5370 5371 5372 5373 5374 5375 5376 5377 5378 5379 5380 5381 5382 5383 5384 5385 5386 5387 5388 5389 5390 5391 5392 5393 5394 5395 5396 5397 5398 5399 5400 5401 5402 5403 5404 5405 5406 5407 5408 5409 5410 5411 5412 5413 5414 5415 5416 5417 5418 5419 5420 5421 5422 5423 5424 5425 5426 5427 5428 5429 5430 5431 5432 5433 5434 5435 5436 5437 5438 5439 5440 5441 5442 5443 5444 5445 5446 5447 5448 5449 5450 5451 5452 5453 5454 5455 5456 5457 5458 5459 5460 5461 5462 5463 5464 5465 5466 5467 5468 5469 5470 5471 5472 5473 5474 5475 5476 5477 5478 5479 5480 5481 5482 5483 5484 5485 5486 5487 5488 5489 5490 5491 5492 5493 5494 5495 5496 5497 5498 5499 5500 5501 5502 5503 5504 5505 5506 5507 5508 5509 5510 5511 5512 5513 5514 5515 5516 5517 5518 5519 5520 5521 5522 5523 5524 5525 5526 5527 5528 5529 5530 5531 5532 5533 5534 5535 5536 5537 5538 5539 5540 5541 5542 5543 5544 5545 5546 5547 5548 5549 5550 5551 5552 5553 5554 5555 5556 5557 5558 5559 5560 5561 5562 5563 5564 5565 5566 5567 5568 5569 5570 5571 5572 5573 5574 5575 5576 5577 5578 5579 5580 5581 5582 5583 5584 5585 5586 5587 5588 5589 5590 5591 5592 5593 5594 5595 5596 5597 5598 5599 5600 5601 5602 5603 5604 5605 5606 5607 5608 5609 5610 5611 5612 5613 5614 5615 5616 5617 5618 5619 5620 5621 5622 5623 5624 5625 5626 5627 5628 5629 5630 5631 5632 5633 5634 5635 5636 5637 5638 5639 5640 5641 5642 5643 5644 5645 5646 5647 5648 5649 5650 5651 5652 5653 5654 5655 5656 5657 5658 5659 5660 5661 5662 5663 5664 5665 5666 5667 5668 5669 5670 5671 5672 5673 5674 5675 5676 5677 5678 5679 5680 5681 5682 5683 5684 5685 5686 5687 5688 5689 5690 5691 5692 5693 5694 5695 5696 5697 5698 5699 5700 5701 5702 5703 5704 5705 5706 5707 5708 5709 5710 5711 5712 5713 5714 5715 5716 5717 5718 5719 5720 5721 5722 5723 5724 5725 5726 5727 5728 5729 5730 5731 5732 5733 5734 5735 5736 5737 5738 5739 5740 5741 5742 5743 5744 5745 5746 5747 5748 5749 5750 5751 5752 5753 5754 5755 5756 5757 5758 5759 5760 5761 5762 5763 5764 5765 5766 5767 5768 5769 5770 5771 5772 5773 5774 5775 5776 5777 5778 5779 5780 5781 5782 5783 5784 5785 5786 5787 5788 5789 5790 5791 5792 5793 5794 5795 5796 5797 5798 5799 5800 5801 5802 5803 5804 5805 5806 5807 5808 5809 5810 5811 5812 5813 5814 5815 5816 5817 5818 5819 5820 5821 5822 5823 5824 5825 5826 5827 5828 5829 5830 5831 5832 5833 5834 5835 5836 5837 5838 5839 5840 5841 5842 5843 5844 5845 5846 5847 5848 5849 5850 5851 5852 5853 5854 5855 5856 5857 5858 5859 5860 5861 5862 5863 5864 5865 5866 5867 5868 5869 5870 5871 5872 5873 5874 5875 5876 5877 5878 5879 5880 5881 5882 5883 5884 5885 5886 5887 5888 5889 5890 5891 5892 5893 5894 5895 5896 5897 5898 5899 5900 5901 5902 5903 5904 5905 5906 5907 5908 5909 5910 5911 5912 5913 5914 5915 5916 5917 5918 5919 5920 5921 5922 5923 5924 5925 5926 5927 5928 5929 5930 5931 5932 5933 5934 5935 5936 5937 5938 5939 5940 5941 5942 5943 5944 5945 5946 5947 5948 5949 5950 5951 5952 5953 5954 5955 5956 5957 5958 5959 5960 5961 5962 5963 5964 5965 5966 5967 5968 5969 5970 5971 5972 5973 5974 5975 5976 5977 5978 5979 5980 5981 5982 5983 5984 5985 5986 5987 5988 5989 5990 5991 5992 5993 5994 5995 5996 5997 5998 5999 6000 6001 6002 6003 6004 6005 6006 6007 6008 6009 6010 6011 6012 6013 6014 6015 6016 6017 6018 6019 6020 6021 6022 6023 6024 6025 6026 6027 6028 6029 6030 6031 6032 6033 6034 6035 6036 6037 6038 6039 6040 6041 6042 6043 6044 6045 6046 6047 6048 6049 6050 6051 6052 6053 6054 6055 6056 6057 6058 6059 6060 6061 6062 6063 6064 6065 6066 6067 6068 6069 6070 6071 6072 6073 6074 6075 6076 6077 6078 6079 6080 6081 6082 6083 6084 6085 6086 6087 6088 6089 6090 6091 6092 6093 6094 6095 6096 6097 6098 6099 6100 6101 6102 6103 6104 6105 6106 6107 6108 6109 6110 6111 6112 6113 6114 6115 6116 6117 6118 6119 6120 6121 6122 6123 6124 6125 6126 6127 6128 6129 6130 6131 6132 6133 6134 6135 6136 6137 6138 6139 6140 6141 6142 6143 6144 6145 6146 6147 6148 6149 6150 6151 6152 6153 6154 6155 6156 6157 6158 6159 6160 6161 6162 6163 6164 6165 6166 6167 6168 6169 6170 6171 6172 6173 6174 6175 6176 6177 6178 6179 6180 6181 6182 6183 6184 6185 6186 6187 6188 6189 6190 6191 6192 6193 6194 6195 6196 6197 6198 6199 6200 6201 6202 6203 6204 6205 6206 6207 6208 6209 6210 6211 6212 6213 6214 6215 6216 6217 6218 6219 6220 6221 6222 6223 6224 6225 6226 6227 6228 6229 6230 6231 6232 6233 6234 6235 6236 6237 6238 6239 6240 6241 6242 6243 6244 6245 6246 6247 6248 6249 6250 6251 6252 6253 6254 6255 6256 6257 6258 6259 6260 6261 6262 6263 6264 6265 6266 6267 6268 6269 6270 6271 6272 6273 6274 6275 6276 6277 6278 6279 6280 6281 6282 6283 6284 6285 6286 6287 6288 6289 6290 6291 6292 6293 6294 6295 6296 6297 6298 6299 6300 6301 6302 6303 6304 6305 6306 6307 6308 6309 6310 6311 6312 6313 6314 6315 6316 6317 6318 6319 6320 6321 6322 6323 6324 6325 6326 6327 6328 6329 6330 6331 6332 6333 6334 6335 6336 6337 6338 6339 6340 6341 6342 6343 6344 6345 6346 6347 6348 6349 6350 6351 6352 6353 6354 6355 6356 6357 6358 6359 6360 6361 6362 6363 6364 6365 6366 6367 6368 6369 6370 6371 6372 6373 6374 6375 6376 6377 6378 6379 6380 6381 6382 6383 6384 6385 6386 6387 6388 6389 6390 6391 6392 6393 6394 6395 6396 6397 6398 6399 6400 6401 6402 6403 6404 6405 6406 6407 6408 6409 6410 6411 6412 6413 6414 6415 6416 6417 6418 6419 6420 6421 6422 6423 6424 6425 6426 6427 6428 6429 6430 6431 6432 6433 6434 6435 6436 6437 6438 6439 6440 6441 6442 6443 6444 6445 6446 6447 6448 6449 6450 6451 6452 6453 6454 6455 6456 6457 6458 6459 6460 6461 6462 6463 6464 6465 6466 6467 6468 6469 6470 6471 6472 6473 6474 6475 6476 6477 6478 6479 6480 6481 6482 6483 6484 6485 6486 6487 6488 6489 6490 6491 6492 6493 6494 6495 6496 6497 6498 6499 6500 6501 6502 6503 6504 6505 6506 6507 6508 6509 6510 6511 6512 6513 6514 6515 6516 6517 6518 6519 6520 6521 6522 6523 6524 6525 6526 6527 6528 6529 6530 6531 6532 6533 6534 6535 6536 6537 6538 6539 6540 6541 6542 6543 6544 6545 6546 6547 6548 6549 6550 6551 6552 6553 6554 6555 6556 6557 6558 6559 6560 6561 6562 6563 6564 6565 6566 6567 6568 6569 6570 6571 6572 6573 6574 6575 6576 6577 6578 6579 6580 6581 6582 6583 6584 6585 6586 6587 6588 6589 6590 6591 6592 6593 6594 6595 6596 6597 6598 6599 6600 6601 6602 6603 6604 6605 6606 6607 6608 6609 6610 6611 6612 6613 6614 6615 6616 6617 6618 6619 6620 6621 6622 6623 6624 6625 6626 6627 6628 6629 6630 6631 6632 6633 6634 6635 6636 6637 6638 6639 6640 6641 6642 6643 6644 6645 6646 6647 6648 6649 6650 6651 6652 6653 6654 6655 6656 6657 6658 6659 6660 6661 6662 6663 6664 6665 6666 6667 6668 6669 6670 6671 6672 6673 6674 6675 6676 6677 6678 6679 6680 6681 6682 6683 6684 6685 6686 6687 6688 6689 6690 6691 6692 6693 6694 6695 6696 6697 6698 6699 6700 6701 6702 6703 6704 6705 6706 6707 6708 6709 6710 6711 6712 6713 6714 6715 6716 6717 6718 6719 6720 6721 6722 6723 6724 6725 6726 6727 6728 6729 6730 6731 6732 6733 6734 6735 6736 6737 6738 6739 6740 6741 6742 6743 6744 6745 6746 6747 6748 6749 6750 6751 6752 6753 6754 6755 6756 6757 6758 6759 6760 6761 6762 6763 6764 6765 6766 6767 6768 6769 6770 6771 6772 6773 6774 6775 6776 6777 6778 6779 6780 6781 6782 6783 6784 6785 6786 6787 6788 6789 6790 6791 6792 6793 6794 6795 6796 6797 6798 6799 6800 6801 6802 6803 6804 6805 6806 6807 6808 6809 6810 6811 6812 6813 6814 6815 6816 6817 6818 6819 6820 6821 6822 6823 6824 6825 6826 6827 6828 6829 6830 6831 6832 6833 6834 6835 6836 6837 6838 6839 6840 6841 6842 6843 6844 6845 6846 6847 6848 6849 6850 6851 6852 6853 6854 6855 6856 6857 6858 6859 6860 6861 6862 6863 6864 6865 6866 6867 6868 6869 6870 6871 6872 6873 6874 6875 6876 6877 6878 6879 6880 6881 6882 6883 6884 6885 6886 6887 6888 6889 6890 6891 6892 6893 6894 6895 6896 6897 6898 6899 6900 6901 6902 6903 6904 6905 6906 6907 6908 6909 6910 6911 6912 6913 6914 6915 6916 6917 6918 6919 6920 6921 6922 6923 6924 6925 6926 6927 6928 6929 6930 6931 6932 6933 6934 6935 6936 6937 6938 6939 6940 6941 6942 6943 6944 6945 6946 6947 6948 6949 6950 6951 6952 6953 6954 6955 6956 6957 6958 6959 6960 6961 6962 6963 6964 6965 6966 6967 6968 6969 6970 6971 6972 6973 6974 6975 6976 6977 6978 6979 6980 6981 6982 6983 6984 6985 6986 6987 6988 6989 6990 6991 6992 6993 6994 6995 6996 6997 6998 6999 7000 7001 7002 7003 7004 7005 7006 7007 7008 7009 7010 7011 7012 7013 7014 7015 7016 7017 7018 7019 7020 7021 7022 7023 7024 7025 7026 7027 7028 7029 7030 7031 7032 7033 7034 7035 7036 7037 7038 7039 7040 7041 7042 7043 7044 7045 7046 7047 7048 7049 7050 7051 7052 7053 7054 7055 7056 7057 7058 7059 7060 7061 7062 7063 7064 7065 7066 7067 7068 7069 7070 7071 7072 7073 7074 7075 7076 7077 7078 7079 7080 7081 7082 7083 7084 7085 7086 7087 7088 7089 7090 7091 7092 7093 7094 7095 7096 7097 7098 7099 7100 7101 7102 7103 7104 7105 7106 7107 7108 7109 7110 7111 7112 7113 7114 7115 7116 7117 7118 7119 7120 7121 7122 7123 7124 7125 7126 7127 7128 7129 7130 7131 7132 7133 7134 7135 7136 7137 7138 7139 7140 7141 7142 7143 7144 7145 7146 7147 7148 7149 7150 7151 7152 7153 7154 7155 7156 7157 7158 7159 7160 7161 7162 7163 7164 7165 7166 7167 7168 7169 7170 7171 7172 7173 7174 7175 7176 7177 7178 7179 7180 7181 7182 7183 7184 7185 7186 7187 7188 7189 7190 7191 7192 7193 7194 7195 7196 7197 7198 7199 7200 7201 7202 7203 7204 7205 7206 7207 7208 7209 7210 7211 7212 7213 7214 7215 7216 7217 7218 7219 7220 7221 7222 7223 7224 7225 7226 7227 7228 7229 7230 7231 7232 7233 7234 7235 7236 7237 7238 7239 7240 7241 7242 7243 7244 7245 7246 7247 7248 7249 7250 7251 7252 7253 7254 7255 7256 7257 7258 7259 7260 7261 7262 7263 7264 7265 7266 7267 7268 7269 7270 7271 7272 7273 7274 7275 7276 7277 7278 7279 7280 7281 7282 7283 7284 7285 7286 7287 7288 7289 7290 7291 7292 7293 7294 7295 7296 7297 7298 7299 7300 7301 7302 7303 7304 7305 7306 7307 7308 7309 7310 7311 7312 7313 7314 7315 7316 7317 7318 7319 7320 7321 7322 7323 7324 7325 7326 7327 7328 7329 7330 7331 7332 7333 7334 7335 7336 7337 7338 7339 7340 7341 7342 7343 7344 7345 7346 7347 7348 7349 7350 7351 7352 7353 7354 7355 7356 7357 7358 7359 7360 7361 7362 7363 7364 7365 7366 7367 7368 7369 7370 7371 7372 7373 7374 7375 7376 7377 7378 7379 7380 7381 7382 7383 7384 7385 7386 7387 7388 7389 7390 7391 7392 7393 7394 7395 7396 7397 7398 7399 7400 7401 7402 7403 7404 7405 7406 7407 7408 7409 7410 7411 7412 7413 7414 7415 7416 7417 7418 7419 7420 7421 7422 7423 7424 7425 7426 7427 7428 7429 7430 7431 7432 7433 7434 7435 7436 7437 7438 7439 7440 7441 7442 7443 7444 7445 7446 7447 7448 7449 7450 7451 7452 7453 7454 7455 7456 7457 7458 7459 7460 7461 7462 7463 7464 7465 7466 7467 7468 7469 7470 7471 7472 7473 7474 7475 7476 7477 7478 7479 7480 7481 7482 7483 7484 7485 7486 7487 7488 7489 7490 7491 7492 7493 7494 7495 7496 7497 7498 7499 7500 7501 7502 7503 7504 7505 7506 7507 7508 7509 7510 7511 7512 7513 7514 7515 7516 7517 7518 7519 7520 7521 7522 7523 7524 7525 7526 7527 7528 7529 7530 7531 7532 7533 7534 7535 7536 7537 7538 7539 7540 7541 7542 7543 7544 7545 7546 7547 7548 7549 7550 7551 7552 7553 7554 7555 7556 7557 7558 7559 7560 7561 7562 7563 7564 7565 7566 7567 7568 7569 7570 7571 7572 7573 7574 7575 7576 7577 7578 7579 7580 7581 7582 7583 7584 7585 7586 7587 7588 7589 7590 7591 7592 7593 7594 7595 7596 7597 7598 7599 7600 7601 7602 7603 7604 7605 7606 7607 7608 7609 7610 7611 7612 7613 7614 7615 7616 7617 7618 7619 7620 7621 7622 7623 7624 7625 7626 7627 7628 7629 7630 7631 7632 7633 7634 7635 7636 7637 7638 7639 7640 7641 7642 7643 7644 7645 7646 7647 7648 7649 7650 7651 7652 7653 7654 7655 7656 7657 7658 7659 7660 7661 7662 7663 7664 7665 7666 7667 7668 7669 7670 7671 7672 7673 7674 7675 7676 7677 7678 7679 7680 7681 7682 7683 7684 7685 7686 7687 7688 7689 7690 7691 7692 7693 7694 7695 7696 7697 7698 7699 7700 7701 7702 7703 7704 7705 7706 7707 7708 7709 7710 7711 7712 7713 7714 7715 7716 7717 7718 7719 7720 7721 7722 7723 7724 7725 7726 7727 7728 7729 7730 7731 7732 7733 7734 7735 7736 7737 7738 7739 7740 7741 7742 7743 7744 7745 7746 7747 7748 7749 7750 7751 7752 7753 7754 7755 7756 7757 7758 7759 7760 7761 7762 7763 7764 7765 7766 7767 7768 7769 7770 7771 7772 7773 7774 7775 7776 7777 7778 7779 7780 7781 7782 7783 7784 7785 7786 7787 7788 7789 7790 7791 7792 7793 7794 7795 7796 7797 7798 7799 7800 7801 7802 7803 7804 7805 7806 7807 7808 7809 7810 7811 7812 7813 7814 7815 7816 7817 7818 7819 7820 7821 7822 7823 7824 7825 7826 7827 7828 7829 7830 7831 7832 7833 7834 7835 7836 7837 7838 7839 7840 7841 7842 7843 7844 7845 7846 7847 7848 7849 7850 7851 7852 7853 7854 7855 7856 7857 7858 7859 7860 7861 7862 7863 7864 7865 7866 7867 7868 7869 7870 7871 7872 7873 7874 7875 7876 7877 7878 7879 7880 7881 7882 7883 7884 7885 7886 7887 7888 7889 7890 7891 7892 7893 7894 7895 7896 7897 7898 7899 7900 7901 7902 7903 7904 7905 7906 7907 7908 7909 7910 7911 7912 7913 7914 7915 7916 7917 7918 7919 7920 7921 7922 7923 7924 7925 7926 7927 7928 7929 7930 7931 7932 7933 7934 7935 7936 7937 7938 7939 7940 7941 7942 7943 7944 7945 7946 7947 7948 7949 7950 7951 7952 7953 7954 7955 7956 7957 7958 7959 7960 7961 7962 7963 7964 7965 7966 7967 7968 7969 7970 7971 7972 7973 7974 7975 7976 7977 7978 7979 7980 7981 7982 7983 7984 7985 7986 7987 7988 7989 7990 7991 7992 7993 7994 7995 7996 7997 7998 7999 8000 8001 8002 8003 8004 8005 8006 8007 8008 8009 8010 8011 8012 8013 8014 8015 8016 8017 8018 8019 8020 8021 8022 8023 8024 8025 8026 8027 8028 8029 8030 8031 8032 8033 8034 8035 8036 8037 8038 8039 8040 8041 8042 8043 8044 8045 8046 8047 8048 8049 8050 8051 8052 8053 8054 8055 8056 8057 8058 8059 8060 8061 8062 8063 8064 8065 8066 8067 8068 8069 8070 8071 8072 8073 8074 8075 8076 8077 8078 8079 8080 8081 8082 8083 8084 8085 8086 8087 8088 8089 8090 8091 8092 8093 8094 8095 8096 8097 8098 8099 8100 8101 8102 8103 8104 8105 8106 8107 8108 8109 8110 8111 8112 8113 8114 8115 8116 8117 8118 8119 8120 8121 8122 8123 8124 8125 8126 8127 8128 8129 8130 8131 8132 8133 8134 8135 8136 8137 8138 8139 8140 8141 8142 8143 8144 8145 8146 8147 8148 8149 8150 8151 8152 8153 8154 8155 8156 8157 8158 8159 8160 8161 8162 8163 8164 8165 8166 8167 8168 8169 8170 8171 8172 8173 8174 8175 8176 8177 8178 8179 8180 8181 8182 8183 8184 8185 8186 8187 8188 8189 8190 8191 8192 8193 8194 8195 8196 8197 8198 8199 8200 8201 8202 8203 8204 8205 8206 8207 8208 8209 8210 8211 8212 8213 8214 8215 8216 8217 8218 8219 8220 8221 8222 8223 8224 8225 8226 8227 8228 8229 8230 8231 8232 8233 8234 8235 8236 8237 8238 8239 8240 8241 8242 8243 8244 8245 8246 8247 8248 8249 8250 8251 8252 8253 8254 8255 8256 8257 8258 8259 8260 8261 8262 8263 8264 8265 8266 8267 8268 8269 8270 8271 8272 8273 8274 8275 8276 8277 8278 8279 8280 8281 8282 8283 8284 8285 8286 8287 8288 8289 8290 8291 8292 8293 8294 8295 8296 8297 8298 8299 8300 8301 8302 8303 8304 8305 8306 8307 8308 8309 8310 8311 8312 8313 8314 8315 8316 8317 8318 8319 8320 8321 8322 8323 8324 8325 8326 8327 8328 8329 8330 8331 8332 8333 8334 8335 8336 8337 8338 8339 8340 8341 8342 8343 8344 8345 8346 8347 8348 8349 8350 8351 8352 8353 8354 8355 8356 8357 8358 8359 8360 8361 8362 8363 8364 8365 8366 8367 8368 8369 8370 8371 8372 8373 8374 8375 8376 8377 8378 8379 8380 8381 8382 8383 8384 8385 8386 8387 8388 8389 8390 8391 8392 8393 8394 8395 8396 8397 8398 8399 8400 8401 8402 8403 8404 8405 8406 8407 8408 8409 8410 8411 8412 8413 8414 8415 8416 8417 8418 8419 8420 8421 8422 8423 8424 8425 8426 8427 8428 8429 8430 8431 8432 8433 8434 8435 8436 8437 8438 8439 8440 8441 8442 8443 8444 8445 8446 8447 8448 8449 8450 8451 8452 8453 8454 8455 8456 8457 8458 8459 8460 8461 8462 8463 8464 8465 8466 8467 8468 8469 8470 8471 8472 8473 8474 8475 8476 8477 8478 8479 8480 8481 8482 8483 8484 8485 8486 8487 8488 8489 8490 8491 8492 8493 8494 8495 8496 8497 8498 8499 8500 8501 8502 8503 8504 8505 8506 8507 8508 8509 8510 8511 8512 8513 8514 8515 8516 8517 8518 8519 8520 8521 8522 8523 8524 8525 8526 8527 8528 8529 8530 8531 8532 8533 8534 8535 8536 8537 8538 8539 8540 8541 8542 8543 8544 8545 8546 8547 8548 8549 8550 8551 8552 8553 8554 8555 8556 8557 8558 8559 8560 8561 8562 8563 8564 8565 8566 8567 8568 8569 8570 8571 8572 8573 8574 8575 8576 8577 8578 8579 8580 8581 8582 8583 8584 8585 8586 8587 8588 8589 8590 8591 8592 8593 8594 8595 8596 8597 8598 8599 8600 8601 8602 8603 8604 8605 8606 8607 8608 8609 8610 8611 8612 8613 8614 8615 8616 8617 8618 8619 8620 8621 8622 8623 8624 8625 8626 8627 8628 8629 8630 8631 8632 8633 8634 8635 8636 8637 8638 8639 8640 8641 8642 8643 8644 8645 8646 8647 8648 8649 8650 8651 8652 8653 8654 8655 8656 8657 8658 8659 8660 8661 8662 8663 8664 8665 8666 8667 8668 8669 8670 8671 8672 8673 8674 8675 8676 8677 8678 8679 8680 8681 8682 8683 8684 8685 8686 8687 8688 8689 8690 8691 8692 8693 8694 8695 8696 8697 8698 8699 8700 8701 8702 8703 8704 8705 8706 8707 8708 8709 8710 8711 8712 8713 8714 8715 8716 8717 8718 8719 8720 8721 8722 8723 8724 8725 8726 8727 8728 8729 8730 8731 8732 8733 8734 8735 8736 8737 8738 8739 8740 8741 8742 8743 8744 8745 8746 8747 8748 8749 8750 8751 8752 8753 8754 8755 8756 8757 8758 8759 8760 8761 8762 8763 8764 8765 8766 8767 8768 8769 8770 8771 8772 8773 8774 8775 8776 8777 8778 8779 8780 8781 8782 8783 8784 8785 8786 8787 8788 8789 8790 8791 8792 8793 8794 8795 8796 8797 8798 8799 8800 8801 8802 8803 8804 8805 8806 8807 8808 8809 8810 8811 8812 8813 8814 8815 8816 8817 8818 8819 8820 8821 8822 8823 8824 8825 8826 8827 8828 8829 8830 8831 8832 8833 8834 8835 8836 8837 8838 8839 8840 8841 8842 8843 8844 8845 8846 8847 8848 8849 8850 8851 8852 8853 8854 8855 8856 8857 8858 8859 8860 8861 8862 8863 8864 8865 8866 8867 8868 8869 8870 8871 8872 8873 8874 8875 8876 8877 8878 8879 8880 8881 8882 8883 8884 8885 8886 8887 8888 8889 8890 8891 8892 8893 8894 8895 8896 8897 8898 8899 8900 8901 8902 8903 8904 8905 8906 8907 8908 8909 8910 8911 8912 8913 8914 8915 8916 8917 8918 8919 8920 8921 8922 8923 8924 8925 8926 8927 8928 8929 8930 8931 8932 8933 8934 8935 8936 8937 8938 8939 8940 8941 8942 8943 8944 8945 8946 8947 8948 8949 8950 8951 8952 8953 8954 8955 8956 8957 8958 8959 8960 8961 8962 8963 8964 8965 8966 8967 8968 8969 8970 8971 8972 8973 8974 8975 8976 8977 8978 8979 8980 8981 8982 8983 8984 8985 8986 8987 8988 8989 8990 8991 8992 8993 8994 8995 8996 8997 8998 8999 9000 9001 9002 9003 9004 9005 9006 9007 9008 9009 9010 9011 9012 9013 9014 9015 9016 9017 9018 9019 9020 9021 9022 9023 9024 9025 9026 9027 9028 9029 9030 9031 9032 9033 9034 9035 9036 9037 9038 9039 9040 9041 9042 9043 9044 9045 9046 9047 9048 9049 9050 9051 9052 9053 9054 9055 9056 9057 9058 9059 9060 9061 9062 9063 9064 9065 9066 9067 9068 9069 9070 9071 9072 9073 9074 9075 9076 9077 9078 9079 9080 9081 9082 9083 9084 9085 9086 9087 9088 9089 9090 9091 9092 9093 9094 9095 9096 9097 9098 9099 9100 9101 9102 9103 9104 9105 9106 9107 9108 9109 9110 9111 9112 9113 9114 9115 9116 9117 9118 9119 9120 9121 9122 9123 9124 9125 9126 9127 9128 9129 9130 9131 9132 9133 9134 9135 9136 9137 9138 9139 9140 9141 9142 9143 9144 9145 9146 9147 9148 9149 9150 9151 9152 9153 9154 9155 9156 9157 9158 9159 9160 9161 9162 9163 9164 9165 9166 9167 9168 9169 9170 9171 9172 9173 9174 9175 9176 9177 9178 9179 9180 9181 9182 9183 9184 9185 9186 9187 9188 9189 9190 9191 9192 9193 9194 9195 9196 9197 9198 9199 9200 9201 9202 9203 9204 9205 9206 9207 9208 9209 9210 9211 9212 9213 9214 9215 9216 9217 9218 9219 9220 9221 9222 9223 9224 9225 9226 9227 9228 9229 9230 9231 9232 9233 9234 9235 9236 9237 9238 9239 9240 9241 9242 9243 9244 9245 9246 9247 9248 9249 9250 9251 9252 9253 9254 9255 9256 9257 9258 9259 9260 9261 9262 9263 9264 9265 9266 9267 9268 9269 9270 9271 9272 9273 9274 9275 9276 9277 9278 9279 9280 9281 9282 9283 9284 9285 9286 9287 9288 9289 9290 9291 9292 9293 9294 9295 9296 9297 9298 9299 9300 9301 9302 9303 9304 9305 9306 9307 9308 9309 9310 9311 9312 9313 9314 9315 9316 9317 9318 9319 9320 9321 9322 9323 9324 9325 9326 9327 9328 9329 9330 9331 9332 9333 9334 9335 9336 9337 9338 9339 9340 9341 9342 9343 9344 9345 9346 9347 9348 9349 9350 9351 9352 9353 9354 9355 9356 9357 9358 9359 9360 9361 9362 9363 9364 9365 9366 9367 9368 9369 9370 9371 9372 9373 9374 9375 9376 9377 9378 9379 9380 9381 9382 9383 9384 9385 9386 9387 9388 9389 9390 9391 9392 9393 9394 9395 9396 9397 9398 9399 9400 9401 9402 9403 9404 9405 9406 9407 9408 9409 9410 9411 9412 9413 9414 9415 9416 9417 9418 9419 9420 9421 9422 9423 9424 9425 9426 9427 9428 9429 9430 9431 9432 9433 9434 9435 9436 9437 9438 9439 9440 9441 9442 9443 9444 9445 9446 9447 9448 9449 9450 9451 9452 9453 9454 9455 9456 9457 9458 9459 9460 9461 9462 9463 9464 9465 9466 9467 9468 9469 9470 9471 9472 9473 9474 9475 9476 9477 9478 9479 9480 9481 9482 9483 9484 9485 9486 9487 9488 9489 9490 9491 9492 9493 9494 9495 9496 9497 9498 9499 9500 9501 9502 9503 9504 9505 9506 9507 9508 9509 9510 9511 9512 9513 9514 9515 9516 9517 9518 9519 9520 9521 9522 9523 9524 9525 9526 9527 9528 9529 9530 9531 9532 9533 9534 9535 9536 9537 9538 9539 9540 9541 9542 9543 9544 9545 9546 9547 9548 9549 9550 9551 9552 9553 9554 9555 9556 9557 9558 9559 9560 9561 9562 9563 9564 9565 9566 9567 9568 9569 9570 9571 9572 9573 9574 9575 9576 9577 9578 9579 9580 9581 9582 9583 9584 9585 9586 9587 9588 9589 9590 9591 9592 9593 9594 9595 9596 9597 9598 9599 9600 9601 9602 9603 9604 9605 9606 9607 9608 9609 9610 9611 9612 9613 9614 9615 9616 9617 9618 9619 9620 9621 9622 9623 9624 9625 9626 9627 9628 9629 9630 9631 9632 9633 9634 9635 9636 9637 9638 9639 9640 9641 9642 9643 9644 9645 9646 9647 9648 9649 9650 9651 9652 9653 9654 9655 9656 9657 9658 9659 9660 9661 9662 9663 9664 9665 9666 9667 9668 9669 9670 9671 9672 9673 9674 9675 9676 9677 9678 9679 9680 9681 9682 9683 9684 9685 9686 9687 9688 9689 9690 9691 9692 9693 9694 9695 9696 9697 9698 9699 9700 9701 9702 9703 9704 9705 9706 9707 9708 9709 9710 9711 9712 9713 9714 9715 9716 9717 9718 9719 9720 9721 9722 9723 9724 9725 9726 9727 9728 9729 9730 9731 9732 9733 9734 9735 9736 9737 9738 9739 9740 9741 9742 9743 9744 9745 9746 9747 9748 9749 9750 9751 9752 9753 9754 9755 9756 9757 9758 9759 9760 9761 9762 9763 9764 9765 9766 9767 9768 9769 9770 9771 9772 9773 9774 9775 9776 9777 9778 9779 9780 9781 9782 9783 9784 9785 9786 9787 9788 9789 9790 9791 9792 9793 9794 9795 9796 9797 9798 9799 9800 9801 9802 9803 9804 9805 9806 9807 9808 9809 9810 9811 9812 9813 9814 9815 9816 9817 9818 9819 9820 9821 9822 9823 9824 9825 9826 9827 9828 9829 9830 9831 9832
|
\input texinfo @c -*- texinfo -*-
@c %**start of header
@setfilename pspp.info
@settitle PSPP
@set TIMESTAMP Time-stamp: <2000-01-02 22:32:14 blp>
@set EDITION 0.2
@set VERSION 0.2
@c For double-sided printing, uncomment:
@c @setchapternewpage odd
@c %**end of header
@iftex
@finalout
@end iftex
@ifinfo
@format
START-INFO-DIR-ENTRY
* PSPP: (pspp). Statistical analysis package.
END-INFO-DIR-ENTRY
@end format
PSPP, for statistical analysis of sampled data, by Ben Pfaff.
This file documents PSPP, a statistical package for analysis of
sampled data that uses a command language compatible with SPSS.
Copyright (C) 1996-9, 2000 Free Software Foundation, Inc.
This version of the PSPP documentation is consistent with version 2 of
``texinfo.tex''.
Permission is granted to make and distribute verbatim copies of this
manual provided the copyright notice and this permission notice are
preserved on all copies.
@ignore
Permission is granted to process this file through TeX and print the
results, provided the printed document carries copying permission notice
identical to this one except for the removal of this paragraph (this
paragraph not being relevant to the printed manual).
@end ignore
Permission is granted to copy and distribute modified versions of this
manual under the conditions for verbatim copying, provided that the
entire resulting derived work is distributed under the terms of a
permission notice identical to this one.
Permission is granted to copy and distribute translations of this
manual into another language, under the above condition for modified
versions, except that this permission notice may be stated in a
translation approved by the Free Software Foundation.
@end ifinfo
@titlepage
@title PSPP
@subtitle A System for Statistical Analysis
@subtitle Edition @value{EDITION}, for PSPP version @value{VERSION}
@author by Ben Pfaff
@page
@vskip 0pt plus 1filll
PSPP Copyright @copyright{} 1997, 1998 Free Software Foundation, Inc.
Permission is granted to make and distribute verbatim copies of this
manual provided the copyright notice and this permission notice are
preserved on all copies.
Permission is granted to copy and distribute modified versions of this
manual under the conditions for verbatim copying, provided that the
entire derived work is distributed under the terms of a permission
notice identical to this one.
Permission is granted to copy and distribute translations of this manual
into another language, under the above conditions for modified versions,
except that this permission notice may be stated in a translation
approved by the Foundation.
@end titlepage
@node Top, Introduction, (dir), (dir)
@ifinfo
@top PSPP
This file documents the PSPP package for statistical analysis of sampled
data. This is edition @value{EDITION}, for PSPP version
@value{VERSION}, last modified at @value{TIMESTAMP}.
@end ifinfo
@menu
* Introduction:: Description of the package.
* License:: Your rights and obligations.
* Credits:: Acknowledgement of authors.
* Installation:: How to compile and install PSPP.
* Configuration:: Configuring PSPP.
* Invocation:: Starting and running PSPP.
* Language:: Basics of the PSPP command language.
* Expressions:: Numeric and string expression syntax.
* Data Input and Output:: Reading data from user files.
* System and Portable Files:: Dealing with system & portable files.
* Variable Attributes:: Adjusting and examining variables.
* Data Manipulation:: Simple operations on data.
* Data Selection:: Select certain cases for analysis.
* Conditionals and Looping:: Doing things many times or not at all.
* Statistics:: Basic statistical procedures.
* Utilities:: Other commands.
* Not Implemented:: What's not here yet
* Data File Format:: Format of PSPP system files.
* Portable File Format:: Format of PSPP portable files.
* q2c Input Format:: Format of syntax accepted by q2c.
* Bugs:: Known problems; submitting bug reports.
* Function Index:: Index of PSPP functions for expressions.
* Concept Index:: Index of concepts.
* Command Index:: Index of PSPP procedures.
@end menu
@node Introduction, License, Top, Top
@chapter Introduction
@cindex introduction
@cindex PSPP language
@cindex language, PSPP
PSPP is a tool for statistical analysis of sampled data. It reads a
syntax file and a data file, analyzes the data, and writes the results
to a listing file or to standard output.
The language accepted by PSPP is similar to those accepted by SPSS
statistical products. The details of PSPP's language are given
later in this manual.
@cindex files, PSPP
@cindex output, PSPP
@cindex PostScript
@cindex graphics
@cindex Ghostscript
@cindex Free Software Foundation
PSPP produces output in two forms: tables and charts. Both of these can
be written in several formats; currently, ASCII, PostScript, and HTML
are supported. In the future, more drivers, such as PCL and X Window
System drivers, may be developed. For now, Ghostscript, available from
the Free Software Foundation, may be used to convert PostScript chart
output to other formats.
The current version of PSPP, @value{VERSION}, is woefully incomplete in
terms of its statistical procedure support. PSPP is a work in progress.
The author hopes to support fully support all features in the products
that PSPP replaces, eventually. The author welcomes questions,
comments, donations, and code submissions. @xref{Bugs,,Submitting Bug
Reports}, for instructions on contacting the author.
@node License, Credits, Introduction, Top
@chapter Your rights and obligations
@cindex license
@cindex your rights and obligations
@cindex rights, your
@cindex obligations, your
@cindex Free Software Foundation
@cindex GNU General Public License
@cindex General Public License
@cindex GPL
@cindex distribution
@cindex redistribution
Most of PSPP is distributed under the GNU General Public
License. The General Public License says, in effect, that you may
modify and distribute PSPP as you like, as long as you grant the
same rights to others. It also states that you must provide source code
when you distribute PSPP, or, if you obtained PSPP
source code from an anonymous ftp site, give out the name of that site.
The General Public License is given in full in the source distribution
as file @file{COPYING}. In Debian GNU/Linux, this file is also
available as file @file{/usr/doc/copyright/GPL}.
To quote the GPL itself:
@quotation
This program is free software; you can redistribute it and/or modify it
under the terms of the GNU General Public License as published by the
Free Software Foundation; either version 2 of the License, or (at your
option) any later version.
This program is distributed in the hope that it will be useful, but
WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
General Public License for more details.
You should have received a copy of the GNU General Public License along
with this program; if not, write to the Free Software Foundation, Inc.,
675 Mass Ave, Cambridge, MA 02139, USA.
@end quotation
@node Credits, Installation, License, Top
@chapter Credits
@cindex credits
@cindex authors
@cindex Minton, Claire
@cindex @cite{Cat's Cradle}
@cindex Vonnegut, Kurt, Jr.
@cindex quotations
@quotation
I'm always embarrassed when I see an index an author has made of his own
work. It's a shameless exhibition---to the @i{trained} eye. Never
index your own book.
---Claire Minton, @cite{Cat's Cradle}, Kurt Vonnegut, Jr.
@end quotation
@cindex Pfaff, Ben
Most of PSPP, as well as this manual (including the indices),
was written by Ben Pfaff. @xref{Contacting the Author}, for
instructions on contacting the author.
@cindex Covington, Michael A.
@cindex Van Zandt, James
@cindex @file{ftp.cdrom.com}
@cindex @file{/pub/algorithms/c/julcal10}
@cindex @file{julcal.c}
@cindex @file{julcal.h}
The PSPP source code incorporates @code{julcal10} originally
written by Michael A. Covington and translated into C by Jim Van Zandt.
The original package can be found in directory
@file{ftp://ftp.cdrom.com/pub/algorithms/c/julcal10}. The entire
contents of that directory constitute the package. The files actually
used in PSPP are @code{julcal.c} and @code{julcal.h}.
@node Installation, Configuration, Credits, Top
@chapter Installing PSPP
@cindex installation
@cindex PSPP, installing
@cindex GNU C compiler
@cindex gcc
@cindex compiler, recommended
@cindex compiler, gcc
PSPP conforms to the GNU Coding Standards. PSPP is written in, and
requires for proper operation, ANSI/ISO C. You might want to
additionally note the following points:
@itemize @bullet
@item
The compiler and linker must allow for significance of several
characters in external identifiers. The exact number is unknown but at
least 31 is recommended.
@item
The @code{int} type must be 32 bits or wider.
@item
The recommended compiler is gcc 2.7.2.1 or later, but any ANSI compiler
will do if it fits the above criteria.
@end itemize
Many UNIX variants should work out-of-the-box, as PSPP uses GNU
autoconf to detect differences between environments. Please report any
problems with compilation of PSPP under UNIX and UNIX-like operating
systems---portability is a major concern of the author.
The pages below give specific instructions for installing PSPP
on each type of system mentioned above.
@menu
* UNIX installation:: Installing on UNIX-like environments.
@end menu
@node UNIX installation, , Installation, Installation
@section UNIX installation
@cindex UNIX, installing PSPP under
@cindex installation, under UNIX
@noindent
To install PSPP under a UNIX-like operating system, follow the steps
below in order. Some of the text below was taken directly from various
Free Software Foundation sources.
@enumerate
@item
@code{cd} to the directory containing the PSPP source.
@cindex configure, GNU
@cindex GNU configure
@item
Type @samp{./configure} to configure for your particular operating
system and compiler. Running @code{configure} takes a while. While
running, it displays some messages telling which features it is checking
for.
You can optionally supply some options to @code{configure} in order to
give it hints about how to do its job. Type @code{./configure --help}
to see a list of options. One of the most useful options is
@samp{--with-checker}, which enables the use of the Checker memory
debugger under supported operating systems. Checker must already be
installed to use this option. Do not use @samp{--with-checker} if you
are not debugging PSPP itself.
@cindex @file{Makefile}
@cindex @file{config.h}
@cindex @file{pref.h}
@cindex makefile
@item
(optional) Edit @file{Makefile}, @file{config.h}, and @file{pref.h}.
These files are produced by @code{configure}. Note that most PSPP
settings can be changed at runtime.
@file{pref.h} is only generated by @code{configure} if it does not
already exist. (It's copied from @file{prefh.orig}.)
@cindex compiling
@item
Type @samp{make} to compile the package. If there are any errors during
compilation, try to fix them. If modifications are necessary to compile
correctly under your configuration, contact the author.
@xref{Bugs,,Submitting Bug Reports}, for details.
@cindex self-tests, running
@item
Type @samp{make check} to run self-tests on the compiled PSPP package.
@cindex installation
@cindex PSPP, installing
@cindex @file{/usr/local/share/pspp/}
@cindex @file{/usr/local/bin/}
@cindex @file{/usr/local/info/}
@cindex documentation, installing
@item
Become the superuser and type @samp{make install} to install the
PSPP binaries, by default in @file{/usr/local/bin/}. The
directory @file{/usr/local/share/pspp/} is created and populated with
files needed by PSPP at runtime. This step will also cause the
PSPP documentation to be installed in @file{/usr/local/info/},
but only if that directory already exists.
@item
(optional) Type @samp{make clean} to delete the PSPP binaries
from the source tree.
@end enumerate
@node Configuration, Invocation, Installation, Top
@chapter Configuring PSPP
@cindex configuration
@cindex PSPP, configuring
PSPP has dozens of configuration possibilities and hundreds of
settings. This is both a bane and a blessing. On one hand, it's
possible to easily accommodate diverse ranges of setups. But, on the
other, the multitude of possibilities can overwhelm the casual user.
Fortunately, the configuration mechanisms are profusely described in the
sections below@enddots{}
@menu
* File locations:: How PSPP finds config files.
* Configuration techniques:: Many different methods of configuration@enddots{}
* Configuration files:: How configuration files are read.
* Environment variables:: All about environment variables.
* Output devices:: Describing your terminal(s) and printer(s).
* PostScript driver class:: Configuration of PostScript devices.
* ASCII driver class:: Configuration of character-code devices.
* HTML driver class:: Configuration for HTML output.
* Miscellaneous configuring:: Even more configuration variables.
* Improving output quality:: Hints for producing ever-more-lovely output.
@end menu
@node File locations, Configuration techniques, Configuration, Configuration
@section Locating configuration files
PSPP uses the same method to find most of its configuration files:
@enumerate
@item
The @dfn{base name} of the file being sought is determined.
@item
The path to search is determined.
@item
Each directory in the search path, from left to right, is searched for a
file with the name of the base name. The first occurrence is read
as the configuration file.
@end enumerate
The first two steps are elaborated below for the sake of our pedantic
friends.
@enumerate
@item
A @dfn{base name} is a file name lacking an absolute directory
reference. Some examples of base names are: @file{ps-encodings},
@file{devices}, @file{devps/DESC} (under UNIX), @file{devps\DESC} (under
M$ environments).
Determining the base name is a two-step process:
@enumerate a
@item
If the appropriate environment variable is defined, the value of that
variable is used (@pxref{Environment variables}). For instance, when
searching for the output driver initialization file, the variable
examined is @code{STAT_OUTPUT_INIT_FILE}.
@item
Otherwise, the compiled-in default is used. For example, when searching
for the output driver initialization file, the default base name is
@file{devices}.
@end enumerate
@strong{Please note:} If a user-specified base name does contain an
absolute directory reference, as in a file name like
@file{/home/pfaff/fonts/TR}, no path is searched---the file name is used
exactly as given---and the algorithm terminates.
@item
The path is the first of the following that is defined:
@itemize @bullet
@item
A variable definition for the path given in the user environment. This
is a PSPP-specific environment variable name; for instance,
@code{STAT_OUTPUT_INIT_PATH}.
@item
In some cases, another, less-specific environment variable is checked.
For instance, when searching for font files, the PostScript driver first
checks for a variable with name @code{STAT_GROFF_FONT_PATH}, then for
one with name @code{GROFF_FONT_PATH}. (However, font searching has its
own list of esoteric search rules.)
@item
The configuration file path, which is itself determined by the
following rules:
@enumerate a
@item
If the command line contains an option of the form @samp{-B @var{path}}
or @samp{--config-dir=@var{path}}, then the value given on the
rightmost occurrence of such an option is used.
@item
Otherwise, if the environment variable @code{STAT_CONFIG_PATH} is
defined, the value of that variable is used.
@item
Otherwise, the compiled-in fallback default is used. On UNIX machines,
the default fallback path is
@enumerate 1
@item
@file{~/.pspp}
@item
@file{/usr/local/lib/pspp}
@item
@file{/usr/lib/pspp}
@end enumerate
On DOS machines, the default fallback path is:
@enumerate 1
@item
All the paths from the DOS search path in the @samp{PATH} environment
variable, in left-to-right order.
@item
@file{C:\PSPP}, as a last resort.
@end enumerate
Note that the installer of PSPP can easily change this default
fallback path; thus the above should not be taken as gospel.
@end enumerate
@end itemize
@end enumerate
As a final note: Under DOS, directories given in paths are delimited by
semicolons (@samp{;}); under UNIX, directories are delimited by colons
(@samp{:}). This corresponds with the standard path delimiter under
these OSes.
@node Configuration techniques, Configuration files, File locations, Configuration
@section Configuration techniques
There are many ways that PSPP can be configured. These are
described in the list below. Values given by earlier items take
precedence over those given by later items.
@enumerate
@item
Syntax commands that modify settings, such as @code{SET}.
@item
Command-line options. @xref{Invocation}.
@item
PSPP-specific environment variable contents. @xref{Environment
variables}.
@item
General environment variable contents. @xref{Environment variables}.
@item
Configuration file contents. @xref{Configuration files}.
@item
Fallback defaults.
@end enumerate
Some of the above may not apply to a particular setting. For instance,
the current pager (such as @samp{more}, @samp{most}, or @samp{less})
cannot be determined by configuration file contents because there is no
appropriate configuration file.
@node Configuration files, Environment variables, Configuration techniques, Configuration
@section Configuration files
Most configuration files have a common form:
@itemize @bullet
@item
Each line forms a separate command or directive. This means that lines
cannot be broken up, unless they are spliced together with a trailing
backslash, as described below.
@item
Before anything else is done, trailing whitespace is removed.
@item
When a line ends in a backslash (@samp{\}), the backslash is removed,
and the next line is read and appended to the current line.
@itemize @minus
@item
Whitespace preceding the backslash is retained.
@item
This rule continues to be applied until the line read does not end in a
backslash.
@item
It is an error if the last line in the file ends in a backslash.
@end itemize
@item
Comments are introduced by an octothorpe (#), and continue until the
end of the line.
@itemize @minus
@item
An octothorpe inside balanced pairs of double quotation marks (@samp{"})
or single quotation marks (@samp{'}) does not introduce a comment.
@item
The backslash character can be used inside balanced quotes of either
type to escape the following character as a literal character.
(This is distinct from the use of a backslash as a line-splicing
character.)
@item
Line splicing takes place before comment removal.
@end itemize
@item
Blank lines, and lines that contain only whitespace, are ignored.
@end itemize
@node Environment variables, Output devices, Configuration files, Configuration
@section Environment variables
You may think the concept of environment variables is a fairly simple
one. However, the author of PSPP has found a way to complicate
even something so simple. Environment variables are further described
in the sections below:
@menu
* Variable values:: Values of variables are determined this way.
* Environment substitutions:: How environment substitutions are made.
* Predefined variables:: A few variables are automatically defined.
@end menu
@node Variable values, Environment substitutions, Environment variables, Environment variables
@subsection Values of environment variables
Values for environment variables are obtained by the following means,
which are arranged in order of decreasing precedence:
@enumerate
@item
Command-line options. @xref{Invocation}.
@item
The @file{environment} configuration file---more on this below.
@item
Actual environment variables (defined in the shell or other parent
process).
@end enumerate
The @file{environment} configuration file is located through application
of the usual algorithm for configuration files (@pxref{File locations}),
except that its contents do not affect the search path used to find
@file{environment} itself. Use of @file{environment} is discouraged on
systems that allow an arbitrarily large environment; it is supported for
use on systems like MS-DOS that limit environment size.
@file{environment} is composed of lines having the form
@samp{@var{key}=@var{value}}, where @var{key} and the equals sign
(@samp{=}) are required, and @var{value} is optional. If @var{value} is
given, variable @var{key} is given that value; if @var{value} is absent,
variable @var{key} is undefined (deleted). Variables may not be defined
with a null value.
Environment substitutions are performed on each line in the file
(@pxref{Environment substitutions}).
See @ref{Configuration files}, for more details on formatting of the
environment configuration file.
@quotation
@strong{Please note:} Support for @file{environment} is not yet
implemented.
@end quotation
@node Environment substitutions, Predefined variables, Variable values, Environment variables
@subsection Environment substitutions
Much of the power of environment variables lies in the way that they may
be substituted into configuration files. Variable substitutions are
described below.
The line is scanned from left to right. In this scan, all characters
other than dollar signs (@samp{$}) are retained unmolested. Dollar
signs, however, introduce an environment variable reference. References
take three forms:
@table @code
@item $@var{var}
Replaced by the value of environment variable @var{var}, determined as
specified in @ref{Variable values}. @var{var} must be one of the
following:
@itemize @bullet
@item
One or more letters.
@item
Exactly one nonalphabetic character. This may not be a left brace
(@samp{@{}).
@end itemize
@item $@{@var{var}@}
Same as above, but @var{var} may contain any character (except
@samp{@}}).
@item $$
Replaced by a single dollar sign.
@end table
Undefined variables expand to a empty value.
@node Predefined variables, , Environment substitutions, Environment variables
@subsection Predefined environment variables
There are two environment variables predefined for use in environment
substitutions:
@table @samp
@item VER
Defined as the version number of PSPP, as a string, in a format
something like @samp{0.9.4}.
@item ARCH
Defined as the host architecture of PSPP, as a string, in standard
cpu-manufacturer-OS format. For instance, Debian GNU/Linux 1.1 on an
Intel machine defines this as @samp{i586-unknown-linux}. This is
somewhat dependent on the system used to compile PSPP.
@end table
Nothing prevents these values from being overridden, although it's a
good idea not to do so.
@node Output devices, PostScript driver class, Environment variables, Configuration
@section Output devices
Configuring output devices is the most complicated aspect of configuring
PSPP. The output device configuration file is named
@file{devices}. It is searched for using the usual algorithm for
finding configuration files (@pxref{File locations}). Each line in the
file is read in the usual manner for configuration files
(@pxref{Configuration files}).
Lines in @file{devices} are divided into three categories, described
briefly in the table below:
@table @i
@item driver category definitions
Define a driver in terms of other drivers.
@item macro definitions
Define environment variables local to the the output driver
configuration file.
@item device definitions
Describe the configuration of an output device.
@end table
The following sections further elaborate the contents of the
@file{devices} file.
@menu
* Driver categories:: How to organize the driver namespace.
* Macro definitions:: Environment variables local to @file{devices}.
* Device definitions:: Output device descriptions.
* Dimensions:: Lengths, widths, sizes, @enddots{}
* papersize:: Letter, legal, A4, envelope, @enddots{}
* Distinguishing line types:: Details on @file{devices} parsing.
* Tokenizing lines:: Dividing @file{devices} lines into tokens.
@end menu
@node Driver categories, Macro definitions, Output devices, Output devices
@subsection Driver categories
Drivers can be divided into categories. Drivers are specified by their
names, or by the names of the categories that they are contained in.
Only certain drivers are enabled each time PSPP is run; by
default, these are the drivers in the category `default'. To enable a
different set of drivers, use the @samp{-o @var{device}} command-line
option (@pxref{Invocation}).
Categories are specified with a line of the form
@samp{@var{category}=@var{driver1} @var{driver2} @var{driver3} @var{@dots{}}
@var{driver@var{n}}}. This line specifies that the category
@var{category} is composed of drivers named @var{driver1},
@var{driver2}, and so on. There may be any number of drivers in the
category, from zero on up.
Categories may also be specified on the command line
(@pxref{Invocation}).
This is all you need to know about categories. If you're still curious,
read on.
First of all, the term `categories' is a bit of a misnomer. In fact,
the internal representation is nothing like the hierarchy that the term
seems to imply: a linear list is used to keep track of the enabled
drivers.
When PSPP first begins reading @file{devices}, this list contains
the name of any drivers or categories specified on the command line, or
the single item `default' if none were specified.
Each time a category definition is specified, the list is searched for
an item with the value of @var{category}. If a matching item is found,
it is deleted. If there was a match, the list of drivers (@var{driver1}
through @var{driver@var{n}}) is then appended to the list.
Each time a driver definition line is encountered, the list is searched.
If the list contains an item with that driver's name, the driver is
enabled and the item is deleted from the list. Otherwise, the driver
is not enabled.
It is an error if the list is not empty when the end of @file{devices}
is reached.
@node Macro definitions, Device definitions, Driver categories, Output devices
@subsection Macro definitions
Macro definitions take the form @samp{define @var{macroname}
@var{definition}}. In such a macro definition, the environment variable
@var{macroname} is defined to expand to the value @var{definition}.
Before the definition is made, however, any macros used in
@var{definition} are expanded.
Please note the following nuances of macro usage:
@itemize @bullet
@item
For the purposes of this section, @dfn{macro} and @dfn{environment
variable} are synonyms.
@item
Macros may not take arguments.
@item
Macros may not recurse.
@item
Macros are just environment variable definitions like other environment
variable definitions, with the exception that they are limited in scope
to the @file{devices} configuration file.
@item
Macros override other all environment variables of the same name (within
the scope of @file{devices}).
@item
Earlier macro definitions for a particular @var{key} override later
ones. In particular, macro definitions on the command line override
those in the device definition file. @xref{Non-option Arguments}.
@item
There are two predefined macros, whose values are determined at runtime:
@table @samp
@item viewwidth
Defined as the width of the console screen, in columns of text.
@item viewlength
Defined as the length of the console screen, in lines of text.
@end table
@end itemize
@node Device definitions, Dimensions, Macro definitions, Output devices
@subsection Driver definitions
Driver definitions are the ultimate purpose of the @file{devices}
configuration file. These are where the real action is. Driver
definitions tell PSPP where it should send its output.
Each driver definition line is divided into four fields. These fields
are delimited by colons (@samp{:}). Each line is subjected to
environment variable interpolation before it is processed further
(@pxref{Environment substitutions}). From left to right, the four
fields are, in brief:
@table @i
@item driver name
A unique identifier, used to determine whether to enable the driver.
@item class name
One of the predefined driver classes supported by PSPP. The
currently supported driver classes include `postscript' and `ascii'.
@item device type(s)
Zero or more of the following keywords, delimited by spaces:
@table @code
@item screen
Indicates that the device is a screen display. This may reduce the
amount of buffering done by the driver, to make interactive use more
convenient.
@item printer
Indicates that the device is a printer.
@item listing
Indicates that the device is a listing file.
@end table
These options are just hints to PSPP and do not cause the output to be
directed to the screen, or to the printer, or to a listing file---those
must be set elsewhere in the options. They are used primarily to decide
which devices should be enabled at any given time. @xref{SET}, for more
information.
@item options
An optional set of options to pass to the driver itself. The exact
format for the options varies among drivers.
@end table
The driver is enabled if:
@enumerate
@item
Its driver name is specified on the command line, or
@item
It's in a category specified on the command line, or
@item
If no categories or driver names are specified on the command line, it
is in category @code{default}.
@end enumerate
For more information on driver names, see @ref{Driver categories}.
The class name must be one of those supported by PSPP. The
classes supported depend on the options with which PSPP was
compiled. See later sections in this chapter for descriptions of the
available driver classes.
Options are dependent on the driver. See the driver descriptions for
details.
@node Dimensions, papersize, Device definitions, Output devices
@subsection Dimensions
Quite often in configuration it is necessary to specify a length or a
size. PSPP uses a common syntax for all such, calling them
collectively by the name @dfn{dimensions}.
@itemize @bullet
@item
You can specify dimensions in decimal form (@samp{12.5}) or as
fractions, either as mixed numbers (@samp{12-1/2}) or raw fractions
(@samp{25/2}).
@item
A number of different units are available. These are suffixed to the
numeric part of the dimension. There must be no spaces between the
number and the unit. The available units are identical to those offered
by the popular typesetting system @TeX{}:
@table @code
@item in
inch (1 @code{in} = 2.54 @code{cm})
@item "
inch (1 @code{in} = 2.54 @code{cm})
@item pt
printer's point (1 @code{in} = 72.27 @code{pt})
@item pc
pica (12 @code{pt} = 1 @code{pc})
@item bp
PostScript point (1 @code{in} = 72 @code{bp})
@item cm
centimeter
@item mm
millimeter (10 @code{mm} = 1 @code{cm})
@item dd
didot point (1157 @code{dd} = 1238 @code{pt})
@item cc
cicero (1 @code{cc} = 12 @code{dd})
@item sp
scaled point (65536 @code{sp} = 1 @code{pt})
@end table
@item
If no explicit unit is given, a DWIM@footnote{Do What I Mean}
``feature'' attempts to guess the best unit:
@itemize @minus
@item
Numbers less than 50 are assumed to be in inches.
@item
Numbers 50 or greater are assumed to be in millimeters.
@end itemize
@end itemize
@node papersize, Distinguishing line types, Dimensions, Output devices
@subsection Paper sizes
Output drivers usually deal with some sort of hardcopy media. This
media is called @dfn{paper} by the drivers, though in reality it could
be a transparency or film or thinly veiled sarcasm. To make it easier
for you to deal with paper, PSPP allows you to have (of course!) a
configuration file that gives symbolic names, like ``letter'' or
``legal'' or ``a4'', to paper sizes, rather than forcing you to use
cryptic numbers like ``8-1/2 x 11'' or ``210 by 297''. Surprisingly
enough, this configuration file is named @file{papersize}.
@xref{Configuration files}.
When PSPP tries to connect a symbolic paper name to a paper size, it
reads and parses each non-comment line in the file, in order. The first
field on each line must be a symbolic paper name in double quotes.
Paper names may not contain double quotes. Paper names are not
case-sensitive: @samp{legal} and @samp{Legal} are equivalent.
If a match is found for the paper name, the rest of the line is parsed.
If it is found to be a pair of dimensions (@pxref{Dimensions}) separated
by either @samp{x} or @samp{by}, then those are taken to be the paper
size, in order of width followed by length. There @emph{must} be at
least one space on each side of @samp{x} or @samp{by}.
Otherwise the line must be of the form
@samp{"@var{paper-1}"="@var{paper-2}"}. In this case the target of the
search becomes paper name @var{paper-2} and the search through the file
continues.
@node Distinguishing line types, Tokenizing lines, papersize, Output devices
@subsection How lines are divided into types
The lines in @file{devices} are distinguished in the following manner:
@enumerate
@item
Leading whitespace is removed.
@item
If the resulting line begins with the exact string @code{define},
followed by one or more whitespace characters, the line is processed as
a macro definition.
@item
Otherwise, the line is scanned for the first instance of a colon
(@samp{:}) or an equals sign (@samp{=}).
@item
If a colon is encountered first, the line is processed as a driver
definition.
@item
Otherwise, if an equals sign is encountered, the line is processed as a
macro definition.
@item
Otherwise, the line is ill-formed.
@end enumerate
@node Tokenizing lines, , Distinguishing line types, Output devices
@subsection How lines are divided into tokens
Each driver definition line is run through a simple tokenizer. This
tokenizer recognizes two basic types of tokens.
The first type is an equals sign (@samp{=}). Equals signs are both
delimiters between tokens and tokens in themselves.
The second type is an identifier or string token. Identifiers and
strings are equivalent after tokenization, though they are written
differently. An identifier is any string of characters other than
whitespace or equals sign.
A string is introduced by a single- or double-quote character (@samp{'}
or @samp{"}) and, in general, continues until the next occurrence of
that same character. The following standard C escapes can also be
embedded within strings:
@table @code
@item \'
A single-quote (@samp{'}).
@item \"
A double-quote (@samp{"}).
@item \?
A question mark (@samp{?}). Included for hysterical raisins.
@item \\
A backslash (@samp{\}).
@item \a
Audio bell (ASCII 7).
@item \b
Backspace (ASCII 8).
@item \f
Formfeed (ASCII 12).
@item \n
Newline (ASCII 10)
@item \r
Carriage return (ASCII 13).
@item \t
Tab (ASCII 9).
@item \v
Vertical tab (ASCII 11).
@item \@var{o}@var{o}@var{o}
Each @samp{o} must be an octal digit. The character is the one having
the octal value specified. Any number of octal digits is read and
interpreted; only the lower 8 bits are used.
@item \x@var{h}@var{h}
Each @samp{h} must be a hex digit. The character is the one having the
hexadecimal value specified. Any number of hex digits is read and
interpreted; only the lower 8 bits are used.
@end table
Tokens, outside of quoted strings, are delimited by whitespace or equals
signs.
@node PostScript driver class, ASCII driver class, Output devices, Configuration
@section The PostScript driver class
The @code{postscript} driver class is used to produce output that is
acceptable to PostScript printers and to PC-based PostScript
interpreters such as Ghostscript. Continuing a long tradition,
PSPP's PostScript driver is configurable to the point of
absurdity.
There are actually two PostScript drivers. The first one,
@samp{postscript}, produces ordinary DSC-compliant PostScript output.
The second one @samp{epsf}, produces an Encapsulated PostScript file.
The two drivers are otherwise identical in configuration and in
operation.
The PostScript driver is described in further detail below.
@menu
* PS output options:: Output file options.
* PS page options:: Paper, margins, scaling & rotation, more!
* PS file options:: Configuration files.
* PS font options:: Default fonts, font options.
* PS line options:: Line widths, options.
* Prologue:: Details on the PostScript prologue.
* Encodings:: Details on PostScript font encodings.
@end menu
@node PS output options, PS page options, PostScript driver class, PostScript driver class
@subsection PostScript output options
These options deal with the form of the output and the output file
itself:
@table @code
@item output-file=@var{filename}
File to which output should be sent. This can be an ordinary filename
(i.e., @code{"pspp.ps"}), a pipe filename (i.e., @code{"|lpr"}), or
stdout (@code{"-"}). Default: @code{"pspp.ps"}.
@item color=@var{boolean}
Most of the time black-and-white PostScript devices are smart enough to
map colors to shades themselves. However, you can cause the PSPP
output driver to do an ugly simulation of this in its own driver by
turning @code{color} off. Default: @code{on}.
This is a boolean setting, as are many settings in the PostScript
driver. Valid positive boolean values are @samp{on}, @samp{true},
@samp{yes}, and nonzero integers. Negative boolean values are
@samp{off}, @samp{false}, @samp{no}, and zero.
@item data=@var{data-type}
One of @code{clean7bit}, @code{clean8bit}, or @code{binary}. This
controls what characters will be written to the output file. PostScript
produced with @code{clean7bit} can be transmitted over 7-bit
transmission channels that use ASCII control characters for line
control. @code{clean8bit} is similar but allows characters above 127 to
be written to the output file. @code{binary} allows any character in
the output file. Default: @code{clean7bit}.
@item line-ends=@var{line-end-type}
One of @code{cr}, @code{lf}, or @code{crlf}. This controls what is used
for newline in the output file. Default: @code{cr}.
@item optimize-line-size=@var{level}
Either @code{0} or @code{1}. If @var{level} is @code{1}, then short
line segments will be collected and merged into longer ones. This
reduces output file size but requires more time and memory. A
@var{level} of @code{0} has the advantage of being better for
interactive environments. @code{1} is the default unless the
@code{screen} flag is set; in that case, the default is @code{0}.
@item optimize-text-size=@var{level}
One of @code{0}, @code{1}, or @code{2}, each higher level representing
correspondingly more aggressive space savings for text in the output
file and requiring correspondingly more time and memory. Unfortunately
the levels presently are all the same. @code{1} is the default unless
the @code{screen} flag is set; in that case, the default is @code{0}.
@end table
@node PS page options, PS file options, PS output options, PostScript driver class
@subsection PostScript page options
These options affect page setup:
@table @code
@item headers=@var{boolean}
Controls whether the standard headers showing the time and date and
title and subtitle are printed at the top of each page. Default:
@code{on}.
@item paper-size=@var{paper-size}
Paper size, either as a symbolic name (i.e., @code{letter} or @code{a4})
or specific measurements (i.e., @code{8-1/2x11} or @code{"210 x 297"}.
@xref{papersize, , Paper sizes}. Default: @code{letter}.
@item orientation=@var{orientation}
Either @code{portrait} or @code{landscape}. Default: @code{portrait}.
@item left-margin=@var{dimension}
@itemx right-margin=@var{dimension}
@itemx top-margin=@var{dimension}
@itemx bottom-margin=@var{dimension}
Sets the margins around the page. The headers, if enabled, are not
included in the margins; they are in addition to the margins. For a
description of dimensions, see @ref{Dimensions}. Default: @code{0.5in}.
@end table
@node PS file options, PS font options, PS page options, PostScript driver class
@subsection PostScript file options
Oh, my. You don't really want to know about the way that the PostScript
driver deals with files, do you? Well I suppose you're entitled, but I
warn you right now: it's not pretty. Here goes@enddots{}
First let's look at the options that are available:
@table @code
@item font-dir=@var{font-directory}
Sets the font directory. Default: @code{devps}.
@item prologue-file=@var{prologue-file-name}
Sets the name of the PostScript prologue file. You can write your own
prologue, though I have no idea why you'd want to: see @ref{Prologue}.
Default: @code{ps-prologue}.
@item device-file=@var{device-file-name}
Sets the name of the Groff-format device description file. The
PostScript driver reads this in order to know about the scaling of fonts
and so on. The format of such files is described in groff_font(5),
included with Groff. Default: @code{DESC}.
@item encoding-file=@var{encoding-file-name}
Sets the name of the encoding file. This file contains a list of all
font encodings that will be needed so that the driver can put all of
them at the top of the prologue. @xref{Encodings}. Default:
@code{ps-encodings}.
If the specified encoding file cannot be found, this error will be
silently ignored, since most people do not need any encodings besides
the ones that can be found using @code{auto-encodings}, described below.
@item auto-encode=@var{boolean}
When enabled, the font encodings needed by the default proportional- and
fixed-pitch fonts will automatically be dumped to the PostScript
output. Otherwise, it is assumed that the user has an encoding file
and knows how to use it (@pxref{Encodings}). There is probably no good
reason to turn off this convenient feature. Default: @code{on}.
@end table
Next I suppose it's time to describe the search algorithm. When the
PostScript driver needs a file, whether that file be a font, a
PostScript prologue, or what you will, it searches in this manner:
@enumerate
@item
Constructs a path by taking the first of the following that is defined:
@enumerate a
@item
Environment variable @code{STAT_GROFF_FONT_PATH}. @xref{Environment
variables}.
@item
Environment variable @code{GROFF_FONT_PATH}.
@item
The compiled-in fallback default.
@end enumerate
@item
Constructs a base name from concatenating, in order, the font directory,
a path separator (@samp{/} or @samp{\}), and the file to be found. A
typical base name would be something like @code{devps/ps-encodings}.
@item
Searches for the base name in the path constructed above. If the file
is found, the algorithm terminates.
@item
Searches for the base name in the standard configuration path. See
@ref{File locations}, for more details. If the file is found, the
algorithm terminates.
@item
At this point we remove the font directory and path separator from the
base name. Now the base name is simply the file to be found, i.e.,
@code{ps-encodings}.
@item
Searches for the base name in the path constructed in the first step.
If the file is found, the algorithm terminates.
@item
Searches for the base name in the standard configuration path. If the
file is found, the algorithm terminates.
@item
The algorithm terminates unsuccessfully.
@end enumerate
So, as you see, there are several ways to configure the PostScript
drivers. Careful selection of techniques can make the configuration
very flexible indeed.
@node PS font options, PS line options, PS file options, PostScript driver class
@subsection PostScript font options
The list of available font options is short and sweet:
@table @code
@item prop-font=@var{font-name}
Sets the default proportional font. The name should be that of a
PostScript font. Default: @code{"Helvetica"}.
@item fixed-font=@var{font-name}
Sets the default fixed-pitch font. The name should be that of a
PostScript font. Default: @code{"Courier"}.
@item font-size=@var{font-size}
Sets the size of the default fonts, in thousandths of a point. Default:
@code{10000}.
@end table
@node PS line options, Prologue, PS font options, PostScript driver class
@subsection PostScript line options
Most tables contain lines, or rules, between cells. Some features of
the way that lines are drawn in PostScript tables are user-definable:
@table @code
@item line-style=@var{style}
Sets the style used for lines used to divide tables into sections.
@var{style} must be either @code{thick}, in which case thick lines are
used, or @var{double}, in which case double lines are used. Default:
@code{thick}.
@item line-gutter=@var{dimension}
Sets the line gutter, which is the amount of whitespace on either side
of lines that border text or graphics objects. @xref{Dimensions}.
Default: @code{0.5pt}.
@item line-spacing=@var{dimension}
Sets the line spacing, which is the amount of whitespace that separates
lines that are side by side, as in a double line. Default:
@code{0.5pt}.
@item line-width=@var{dimension}
Sets the width of a typical line used in tables. Default: @code{0.5pt}.
@item line-width-thick=@var{dimension}
Sets the width of a thick line used in tables. Not used if
@code{line-style} is set to @code{thick}. Default: @code{1.5pt}.
@end table
@node Prologue, Encodings, PS line options, PostScript driver class
@subsection The PostScript prologue
Most PostScript files that are generated mechanically by programs
consist of two parts: a prologue and a body. The prologue is generally
a collection of boilerplate. Only the body differs greatly between
two outputs from the same program.
This is also the strategy used in the PSPP PostScript driver. In
general, the prologue supplied with PSPP will be more than sufficient.
In this case, you will not need to read the rest of this section.
However, hackers might want to know more. Read on, if you fall into
this category.
The prologue is dumped into the output stream essentially unmodified.
However, two actions are performed on its lines. First, certain lines
may be omitted as specified in the prologue file itself. Second,
variables are substituted.
The following lines are omitted:
@enumerate
@item
All lines that contain three bangs in a row (@code{!!!}).
@item
Lines that contain @code{!eps}, if the PostScript driver is producing
ordinary PostScript output. Otherwise an EPS file is being produced,
and the line is included in the output, although everything following
@code{!eps} is deleted.
@item
Lines that contain @code{!ps}, if the PostScript driver is producing EPS
output. Otherwise, ordinary PostScript is being produced, and the line
is included in the output, although everything following @code{!ps} is
deleted.
@end enumerate
The following are the variables that are substituted. Only the
variables listed are substituted; environment variables are not.
@xref{Environment substitutions}.
@table @code
@item bounding-box
The page bounding box, in points, as four space-separated numbers. For
U.S. letter size paper, this is @samp{0 0 612 792}.
@item creator
PSPP version as a string: @samp{GNU PSPP 0.1b}, for example.
@item date
Date the file was created. Example: @samp{Tue May 21 13:46:22 1991}.
@item data
Value of the @code{data} PostScript driver option, as one of the strings
@samp{Clean7Bit}, @samp{Clean8Bit}, or @samp{Binary}.
@item orientation
Page orientation, as one of the strings @code{Portrait} or
@code{Landscape}.
@item user
Under multiuser OSes, the user's login name, taken either from the
environment variable @code{LOGNAME} or, if that fails, the result of the
C library function @code{getlogin()}. Defaults to @samp{nobody}.
@item host
System hostname as reported by @code{gethostname()}. Defaults to
@samp{nowhere}.
@item prop-font
Name of the default proportional font, prefixed by the word
@samp{font} and a space. Example: @samp{font Times-Roman}.
@item fixed-font
Name of the default fixed-pitch font, prefixed by the word @samp{font}
and a space.
@item scale-factor
The page scaling factor as a floating-point number. Example:
@code{1.0}. Note that this is also passed as an argument to the BP
macro.
@item paper-length
@item paper-width
The paper length and paper width, respectively, in thousandths of a
point. Note that these are also passed as arguments to the BP macro.
@item left-margin
@item top-margin
The left margin and top margin, respectively, in thousandths of a
point. Note that these are also passed as arguments to the BP macro.
@item title
Document title as a string. This is not the title specified in the
PSPP syntax file. A typical title is the word @samp{PSPP} followed
by the syntax file name in parentheses. Example: @samp{PSPP
(<stdin>)}.
@item source-file
PSPP syntax file name. Example: @samp{mary96/first.stat}.
@end table
Any other questions about the PostScript prologue can best be answered
by examining the default prologue or the PSPP source.
@node Encodings, , Prologue, PostScript driver class
@subsection PostScript encodings
PostScript fonts often contain many more than 256 characters, in order
to accommodate foreign language characters and special symbols.
PostScript uses @dfn{encodings} to map these onto single-byte symbol
sets. Each font can have many different encodings applied to it.
PSPP's PostScript driver needs to know which encoding to apply to each
font. It can determine this from the information encapsulated in the
Groff font description that it reads. However, there is an additional
problem---for efficiency, the PostScript driver needs to have a complete
list of all encodings that will be used in the entire session @emph{when
it opens the output file}. For this reason, it can't use the
information built into the fonts because it doesn't know which fonts
will be used.
As a stopgap solution, there are two mechanisms for specifying which
encodings will be used. The first mechanism is automatic and it is the
only one that most PSPP users will ever need. The second mechanism is
manual, but it is more flexible. Either mechanism or both may be used
at one time.
The first mechanism is activated by the @samp{auto-encode} driver option
(@pxref{PS file options}). When enabled, @samp{auto-encode} causes the
PostScript driver to include the encodings used by the default
proportional and fixed-pitch fonts (@pxref{PS font options}). Many
PSPP output files will only need these encodings.
The second mechanism is the file specified by the @samp{encoding-file}
option (@pxref{PS file options}). If it exists, this file must consist
of lines in PSPP configuration-file format (@pxref{Configuration
files}). Each line that is not a comment should name a PostScript
encoding to include in the output.
It is not an error if an encoding is included more than once, by either
mechanism. It will appear only once in the output. It is also not an
error if an encoding is included in the output but never used. It
@emph{is} an error if an encoding is used but not included by one of
these mechanisms. In this case, the built-in PostScript encoding
@samp{ISOLatin1Encoding} is substituted.
@node ASCII driver class, HTML driver class, PostScript driver class, Configuration
@section The ASCII driver class
The ASCII driver class produces output that can be displayed on a
terminal or output to printers. All of its options are highly
configurable. The ASCII driver has class name @samp{ascii}.
The ASCII driver is described in further detail below.
@menu
* ASCII output options:: Output file options.
* ASCII page options:: Page size, margins, more.
* ASCII font options:: Box character, bold & italics.
@end menu
@node ASCII output options, ASCII page options, ASCII driver class, ASCII driver class
@subsection ASCII output options
@table @code
@item output-file=@var{filename}
File to which output should be sent. This can be an ordinary filename
(i.e., @code{"pspp.ps"}), a pipe filename (i.e., @code{"|lpr"}), or
stdout (@code{"-"}). Default: @code{"pspp.list"}.
@item char-set=@var{char-set-type}
One of @samp{ascii} or @samp{latin1}. This has no effect on output at
the present time. Default: @code{ascii}.
@item form-feed-string=@var{form-feed-value}
The string written to the output to cause a formfeed. See also
@code{paginate}, described below, for a related setting. Default:
@code{"\f"}.
@item newline-string=@var{newline-value}
The string written to the output to cause a newline (carriage return
plus linefeed). The default, which can be specified explicitly with
@code{newline-string=default}, is to use the system-dependent newline
sequence by opening the output file in text mode. This is usually the
right choice.
However, @code{newline-string} can be set to any string. When this is
done, the output file is opened in binary mode.
@item paginate=@var{boolean}
If set, a formfeed (as set in @code{form-feed-string}, described above)
will be written to the device after every page. Default: @code{on}.
@item tab-width=@var{tab-width-value}
The distance between tab stops for this device. If set to 0, tabs will
not be used in the output. Default: @code{8}.
@item init=@var{initialization-string}.
String written to the device before anything else, at the beginning of
the output. Default: @code{""} (the empty string).
@item done=@var{finalization-string}.
String written to the device after everything else, at the end of the
output. Default: @code{""} (the empty string).
@end table
@node ASCII page options, ASCII font options, ASCII output options, ASCII driver class
@subsection ASCII page options
These options affect page setup:
@table @code
@item headers=@var{boolean}
If enabled, two lines of header information giving title and subtitle,
page number, date and time, and PSPP version are printed at the top of
every page. These two lines are in addition to any top margin
requested. Default: @code{on}.
@item length=@var{line-count}
Physical length of a page, in lines. Headers and margins are subtracted
from this value. Default: @code{66}.
@item width=@var{character-count}
Physical width of a page, in characters. Margins are subtracted from
this value. Default: @code{130}.
@item lpi=@var{lines-per-inch}
Number of lines per vertical inch. Not currently used. Default: @code{6}.
@item cpi=@var{characters-per-inch}
Number of characters per horizontal inch. Not currently used. Default:
@code{10}.
@item left-margin=@var{left-margin-width}
Width of the left margin, in characters. PSPP subtracts this value
from the page width. Default: @code{0}.
@item right-margin=@var{right-margin-width}
Width of the right margin, in characters. PSPP subtracts this value
from the page width. Default: @code{0}.
@item top-margin=@var{top-margin-lines}
Length of the top margin, in lines. PSPP subtracts this value from
the page length. Default: @code{2}.
@item bottom-margin=@var{bottom-margin-lines}
Length of the bottom margin, in lines. PSPP subtracts this value from
the page length. Default: @code{2}.
@end table
@node ASCII font options, , ASCII page options, ASCII driver class
@subsection ASCII font options
These are the ASCII font options:
@table @code
@item box[@var{line-type}]=@var{box-chars}
The characters used for lines in tables produced by the ASCII driver can
be changed using this option. @var{line-type} is used to indicate which
type of line to change; @var{box-chars} is the character or string of
characters to use for this type of line.
@var{line-type} must be a 4-digit number in base 4. The digits are in
the order `right', `bottom', `left', `top'. The four possibilities for
each digit are:
@table @asis
@item 0
No line.
@item 1
Single line.
@item 2
Double line.
@item 3
Special device-defined line, if one is available; otherwise, a double
line.
@end table
Examples:
@table @code
@item box[0101]="|"
Sets @samp{|} as the character to use for a single-width line with
bottom and top components.
@item box[2222]="#"
Sets @samp{#} as the character to use for the intersection of four
double-width lines, one each from the top, bottom, left and right.
@item box[1100]="\xda"
Sets @samp{"\xda"}, which under MS-DOG is a box character suitable for
the top-left corner of a box, as the character for the intersection of
two single-width lines, one each from the right and bottom.
@end table
Defaults:
@itemize @bullet
@item
@code{box[0000]=" "}
@item
@code{box[1000]="-"}
@*@code{box[0010]="-"}
@*@code{box[1010]="-"}
@item
@code{box[0100]="|"}
@*@code{box[0001]="|"}
@*@code{box[0101]="|"}
@item
@code{box[2000]="="}
@*@code{box[0020]="="}
@*@code{box[2020]="="}
@item
@code{box[0200]="#"}
@*@code{box[0002]="#"}
@*@code{box[0202]="#"}
@item
@code{box[3000]="="}
@*@code{box[0030]="="}
@*@code{box[3030]="="}
@item
@code{box[0300]="#"}
@*@code{box[0003]="#"}
@*@code{box[0303]="#"}
@item
For all others, @samp{+} is used unless there are double lines or
special lines, in which case @samp{#} is used.
@end itemize
@item italic-on=@var{italic-on-string}
Character sequence written to turn on italics or underline printing. If
this is set to @code{overstrike}, then the driver will simulate
underlining by overstriking with underscore characters (@samp{_}) in the
manner described by @code{overstrike-style} and
@code{carriage-return-style}. Default: @code{overstrike}.
@item italic-off=@var{italic-off-string}
Character sequence to turn off italics or underline printing. Default:
@code{""} (the empty string).
@item bold-on=@var{bold-on-string}
Character sequence written to turn on bold or emphasized printing. If
set to @code{overstrike}, then the driver will simulated bold printing
by overstriking characters in the manner described by
@code{overstrike-style} and @code{carriage-return-style}. Default:
@code{overstrike}.
@item bold-off=@var{bold-off-string}
Character sequence to turn off bold or emphasized printing. Default:
@code{""} (the empty string).
@item bold-italic-on=@var{bold-italic-on-string}
Character sequence written to turn on bold-italic printing. If set to
@code{overstrike}, then the driver will simulate bold-italics by
overstriking twice, once with the character, a second time with an
underscore (@samp{_}) character, in the manner described by
@code{overstrike-style} and @code{carriage-return-style}. Default:
@code{overstrike}.
@item bold-italic-off=@var{bold-italic-off-string}
Character sequence to turn off bold-italic printing. Default: @code{""}
(the empty string).
@item overstrike-style=@var{overstrike-option}
Either @code{single} or @code{line}:
@itemize @bullet
@item
If @code{single} is selected, then, to overstrike a line of text, the
output driver will output a character, backspace, overstrike, output a
character, backspace, overstrike, and so on along a line.
@item
If @code{line} is selected then the output driver will output an entire
line, then backspace or emit a carriage return (as indicated by
@code{carriage-return-style}), then overstrike the entire line at once.
@end itemize
@code{single} is recommended for use with ttys and programs that
understand overstriking in text files, such as the pager @code{less}.
@code{single} will also work with printer devices but results in rapid
back-and-forth motions of the printhead that can cause the printer to
physically overheat!
@code{line} is recommended for use with printer devices. Most programs
that understand overstriking in text files will not properly deal with
@code{line} mode.
Default: @code{single}.
@item carriage-return-style=@var{carriage-return-type}
Either @code{bs} or @code{cr}. This option applies only when one or
more of the font commands is set to @code{overstrike} and, at the same
time, @code{overstrike-style} is set to @code{line}.
@itemize @bullet
@item
If @code{bs} is selected then the driver will return to the beginning of
a line by emitting a sequence of backspace characters (ASCII 8).
@item
If @code{cr} is selected then the driver will return to the beginning of
a line by emitting a single carriage-return character (ASCII 13).
@end itemize
Although @code{cr} is preferred as being more compact, @code{bs} is more
general since some devices do not interpret carriage returns in the
desired manner. Default: @code{bs}.
@end table
@node HTML driver class, Miscellaneous configuring, ASCII driver class, Configuration
@section The HTML driver class
The @code{html} driver class is used to produce output for viewing in
tables-capable web browsers such as Emacs' w3-mode. Its configuration
is very simple. Currently, the output has a very plain format. In the
future, further work may be done on improving the output appearance.
There are few options for use with the @code{html} driver class:
@table @code
@item output-file=@var{filename}
File to which output should be sent. This can be an ordinary filename
(i.e., @code{"pspp.ps"}), a pipe filename (i.e., @code{"|lpr"}), or
stdout (@code{"-"}). Default: @code{"pspp.html"}.
@item prologue-file=@var{prologue-file-name}
Sets the name of the PostScript prologue file. You can write your own
prologue if you want to customize colors or other settings: see
@ref{HTML Prologue}. Default: @code{html-prologue}.
@end table
@menu
* HTML Prologue:: Format of the HTML prologue file.
@end menu
@node HTML Prologue, , HTML driver class, HTML driver class
@subsection The HTML prologue
HTML files that are generated by PSPP consist of two parts: a prologue
and a body. The prologue is a collection of boilerplate. Only the body
differs greatly between two outputs. You can tune the colors and other
attributes of the output by editing the prologue.
The prologue is dumped into the output stream essentially unmodified.
However, two actions are performed on its lines. First, certain lines
may be omitted as specified in the prologue file itself. Second,
variables are substituted.
The following lines are omitted:
@enumerate
@item
All lines that contain three bangs in a row (@code{!!!}).
@item
Lines that contain @code{!title}, if no title is set for the output. If
a title is set, then the characters @code{!title} are removed before the
line is output.
@item
Lines that contain @code{!subtitle}, if no subtitle is set for the
output. If a subtitle is set, then the characters @code{!subtitle} are
removed before the line is output.
@end enumerate
The following are the variables that are substituted. Only the
variables listed are substituted; environment variables are not.
@xref{Environment substitutions}.
@table @code
@item generator
PSPP version as a string: @samp{GNU PSPP 0.1b}, for example.
@item date
Date the file was created. Example: @samp{Tue May 21 13:46:22 1991}.
@item user
Under multiuser OSes, the user's login name, taken either from the
environment variable @code{LOGNAME} or, if that fails, the result of the
C library function @code{getlogin()}. Defaults to @samp{nobody}.
@item host
System hostname as reported by @code{gethostname()}. Defaults to
@samp{nowhere}.
@item title
Document title as a string. This is the title specified in the PSPP
syntax file.
@item subtitle
Document subtitle as a string.
@item source-file
PSPP syntax file name. Example: @samp{mary96/first.stat}.
@end table
@node Miscellaneous configuring, Improving output quality, HTML driver class, Configuration
@section Miscellaneous configuration
The following environment variables can be used to further configure
PSPP:
@table @code
@item HOME
Used to determine the user's home directory. No default value.
@item STAT_INCLUDE_PATH
Path used to find include files in PSPP syntax files. Defaults vary
across operating systems:
@table @asis
@item UNIX
@itemize @bullet
@item
@file{.}
@item
@file{~/.pspp/include}
@item
@file{/usr/local/lib/pspp/include}
@item
@file{/usr/lib/pspp/include}
@item
@file{/usr/local/share/pspp/include}
@item
@file{/usr/share/pspp/include}
@end itemize
@item MS-DOS
@itemize @bullet
@item
@file{.}
@item
@file{C:\PSPP\INCLUDE}
@item
@file{$PATH}
@end itemize
@item Other OSes
No default path.
@end table
@item STAT_PAGER
@itemx PAGER
When PSPP invokes an external pager, it uses the first of these that
is defined. There is a default pager only if the person who compiled
PSPP defined one.
@item TERM
The terminal type @code{termcap} or @code{ncurses} will use, if such
support was compiled into PSPP.
@item STAT_OUTPUT_INIT_FILE
The basename used to search for the driver definition file.
@xref{Output devices}. @xref{File locations}. Default: @code{devices}.
@item STAT_OUTPUT_PAPERSIZE_FILE
The basename used to search for the papersize file. @xref{papersize}.
@xref{File locations}. Default: @code{papersize}.
@item STAT_OUTPUT_INIT_PATH
The path used to search for the driver definition file and the papersize
file. @xref{File locations}. Default: the standard configuration path.
@item TMPDIR
The @code{sort} procedure stores its temporary files in this directory.
Default: (UNIX) @file{/tmp}, (MS-DOS) @file{\}, (other OSes) empty string.
@item TEMP
@item TMP
Under MS-DOS only, these variables are consulted after TMPDIR, in this
order.
@end table
@node Improving output quality, , Miscellaneous configuring, Configuration
@section Improving output quality
When its drivers are set up properly, PSPP can produce output that
looks very good indeed. The PostScript driver, suitably configured, can
produce presentation-quality output. Here are a few guidelines for
producing better-looking output, regardless of output driver. Your
mileage may vary, of course, and everyone has different esthetic
preferences.
@itemize @bullet
@item
Width is important in PSPP output. Greater output width leads to more
readable output, to a point. Try the following to increase the output
width:
@itemize @minus
@item
If you're using the ASCII driver with a dot-matrix printer, figure out
what you need to do to put the printer into compressed mode. Put that
string into the @code{init-string} setting. Try to get 132 columns; 160
might be better, but you might find that print that tiny is difficult to
read.
@item
With the PostScript driver, try these ideas:
@itemize +
@item
Landscape mode.
@item
Legal-size (8.5" x 14") paper in landscape mode.
@item
Reducing font sizes. If you're using 12-point fonts, try 10 point; if
you're using 10-point fonts, try 8 point. Some fonts are more readable
than others at small sizes.
@end itemize
@end itemize
Try to strike a balance between character size and page width.
@item
Use high-quality fonts. Many public domain fonts are poor in quality.
Recently, URW made some high-quality fonts available under the GPL.
These are probably suitable.
@item
Be sure you're using the proper font metrics. The font metrics provided
with PSPP may not correspond to the fonts actually being printed.
This can cause bizarre-looking output.
@item
Make sure that you're using good ink/ribbon/toner. Darker print is
easier to read.
@item
Use plain fonts with serifs, such as Times-Roman or Palatino. Avoid
choosing italic or bold fonts as document base fonts.
@end itemize
@node Invocation, Language, Configuration, Top
@chapter Invoking PSPP
@cindex invocation
@cindex PSPP, invoking
@cindex command line, options
@cindex options, command-line
@example
pspp [ -B @var{dir} | --config-dir=@var{dir} ] [ -o @var{device} | --device=@var{device} ]
[ -d @var{var}[=@var{value}] | --define=@var{var}[=@var{value}] ] [-u @var{var} | --undef=@var{var} ]
[ -f @var{file} | --out-file=@var{file} ] [ -p | --pipe ] [ -I- | --no-include ]
[ -I @var{dir} | --include=@var{dir} ] [ -i | --interactive ]
[ -n | --edit | --dry-run | --just-print | --recon ]
[ -r | --no-statrc ] [ -h | --help ] [ -l | --list ]
[ -c @var{command} | --command @var{command} ] [ -s | --safer ]
[ --testing-mode ] [ -V | --version ] [ -v | --verbose ]
[ @var{key}=@var{value} ] @var{file}@enddots{}
@end example
@menu
* Non-option Arguments:: Specifying syntax files and output devices.
* Configuration Options:: Change the configuration for the current run.
* Input and output options:: Controlling input and output files.
* Language control options:: Language variants.
* Informational options:: Helpful information about PSPP.
@end menu
@node Non-option Arguments, Configuration Options, Invocation, Invocation
@section Non-option Arguments
Syntax files and output device substitutions can be specified on
PSPP's command line:
@table @code
@item @var{file}
A file by itself on the command line will be executed as a syntax file.
PSPP terminates after the syntax file runs, unless the @code{-i} or
@code{--interactive} option is given (@pxref{Language control options}).
@item @var{file1} @var{file2}
When two or more filenames are given on the command line, the first
syntax file is executed, then PSPP's dictionary is cleared, then the second
syntax file is executed.
@item @var{file1} + @var{file2}
If syntax files' names are delimited by a plus sign (@samp{+}), then the
dictionary is not cleared between their executions, as if they were
concatenated together into a single file.
@item @var{key}=@var{value}
Defines an output device macro @var{key} to expand to @var{value},
overriding any macro having the same @var{key} defined in the device
configuration file. @xref{Macro definitions}.
@end table
There is one other way to specify a syntax file, if your operating
system supports it. If you have a syntax file @file{foobar.stat}, put
the notation
@example
#! /usr/local/bin/pspp
@end example
at the top, and mark the file as executable with @code{chmod +x
foobar.stat}. (If PSPP is not installed in @file{/usr/local/bin},
then insert its actual installation directory into the syntax file
instead.) Now you should be able to invoke the syntax file just by
typing its name. You can include any options on the command line as
usual. PSPP entirely ignores any lines beginning with @samp{#!}.
@node Configuration Options, Input and output options, Non-option Arguments, Invocation
@section Configuration Options
Configuration options are used to change PSPP's configuration for the
current run. The configuration options are:
@table @code
@item -B @var{dir}
@itemx --config-dir=@var{dir}
Sets the configuration directory to @var{dir}. @xref{File locations}.
@item -o @var{device}
@itemx --device=@var{device}
Selects the output device with name @var{device}. If this option is
given more than once, then all devices mentioned are selected. This
option disables all devices besides those mentioned on the command line.
@item -d @var{var}[=@var{value}]
@itemx --define=@var{var}[=@var{value}]
Defines an `environment variable' named @var{var} having the optional
value @var{value} specified. @xref{Variable values}.
@item -u @var{var}
@itemx --undef=@var{var}
Undefines the `environment variable' named @var{var}. @xref{Variable
values}.
@end table
@node Input and output options, Language control options, Configuration Options, Invocation
@section Input and output options
Input and output options affect how PSPP reads input and writes
output. These are the input and output options:
@table @code
@item -f @var{file}
@itemx --out-file=@var{file}
This overrides the output file name for devices designated as listing
devices. If a file named @var{file} already exists, it is overwritten.
@item -p
@itemx --pipe
Allows PSPP to be used as a filter by causing the syntax file to be
read from stdin and output to be written to stdout. Conflicts with the
@code{-f @var{file}} and @code{--file=@var{file}} options.
@item -I-
@itemx --no-include
Clears all directories from the include path. This includes all
directories put in the include path by default. @xref{Miscellaneous
configuring}.
@item -I @var{dir}
@itemx --include=@var{dir}
Appends directory @var{dir} to the path that is searched for include
files in PSPP syntax files.
@item -c @var{command}
@itemx --command=@var{command}
Execute literal command @var{command}. The command is executed before
startup syntax files, if any.
@item --testing-mode
Invoke heuristics to assist with testing PSPP. For use by @code{make
check} and similar scripts.
@end table
@node Language control options, Informational options, Input and output options, Invocation
@section Language control options
Language control options control how PSPP syntax files are parsed and
interpreted. The available language control options are:
@table @code
@item -i
@itemx --interactive
When a syntax file is specified on the command line, PSPP normally
terminates after processing it. Giving this option will cause PSPP to
bring up a command prompt after processing the syntax file.
In addition, this forces syntax files to be interpreted in interactive
mode, rather than the default batch mode. @xref{Tokenizing lines}, for
information on the differences between batch mode and interactive mode
command interpretation.
@item -n
@itemx --edit
@itemx --dry-run
@itemx --just-print
@itemx --recon
Only the syntax of any syntax file specified or of commands entered at
the command line is checked. Transformations are not performed and
procedures are not executed. Not yet implemented.
@item -r
@itemx --no-statrc
Prevents the execution of the PSPP startup syntax file. Not yet
implemented, as startup syntax files aren't, either.
@item -s
@itemx --safer
Disables certain unsafe operations. This includes the @code{ERASE} and
@code{HOST} commands, as well as use of pipes as input and output files.
@end table
@node Informational options, , Language control options, Invocation
@section Informational options
Informational options cause information about PSPP to be written to
the terminal. Here are the available options:
@table @code
@item -h
@item --help
Prints a message describing PSPP command-line syntax and the available
device driver classes, then terminates.
@item -l
@item --list
Lists the available device driver classes, then terminates.
@item -V
@item --version
Prints a brief message listing PSPP's version, warranties you don't
have, copying conditions and copyright, and e-mail address for bug
reports, then terminates.
@item -v
@item --verbose
Increments PSPP's verbosity level. Higher verbosity levels cause
PSPP to display greater amounts of information about what it is
doing. Often useful for debugging PSPP's configuration.
This option can be given multiple times to set the verbosity level to
that value. The default verbosity level is 0, in which no informational
messages will be displayed.
Higher verbosity levels cause messages to be displayed when the
corresponding events take place.
@table @asis
@item 1
Driver and subsystem initializations.
@item 2
Completion of driver initializations. Beginning of driver closings.
@item 3
Completion of driver closings.
@item 4
Files searched for; success of searches.
@item 5
Individual directories included in file searches.
@end table
Each verbosity level also includes messages from lower verbosity levels.
@end table
@node Language, Expressions, Invocation, Top
@chapter The PSPP language
@cindex language, PSPP
@cindex PSPP, language
@quotation
@strong{Please note:} PSPP is not even close to completion.
Only a few actual statistical procedures are implemented. PSPP
is a work in progress.
@end quotation
This chapter discusses elements common to many PSPP commands.
Later chapters will describe individual commands in detail.
@menu
* Tokens:: Characters combine to form tokens.
* Commands:: Tokens combine to form commands.
* Types of Commands:: Commands come in several flavors.
* Order of Commands:: Commands combine to form syntax files.
* Missing Observations:: Handling missing observations.
* Variables:: The unit of data storage.
* Files:: Files used by PSPP.
* BNF:: How command syntax is described.
@end menu
@node Tokens, Commands, Language, Language
@section Tokens
@cindex language, lexical analysis
@cindex language, tokens
@cindex tokens
@cindex lexical analysis
@cindex lexemes
PSPP divides most syntax file lines into series of short chunks
called @dfn{tokens}, @dfn{lexical elements}, or @dfn{lexemes}. These
tokens are then grouped to form commands, each of which tells
PSPP to take some action---read in data, write out data, perform
a statistical procedure, etc. The process of dividing input into tokens
is @dfn{tokenization}, or @dfn{lexical analysis}. Each type of token is
described below.
@cindex delimiters
@cindex whitespace
Tokens must be separated from each other by @dfn{delimiters}.
Delimiters include whitespace (spaces, tabs, carriage returns, line
feeds, vertical tabs), punctuation (commas, forward slashes, etc.), and
operators (plus, minus, times, divide, etc.) Note that while whitespace
only separates tokens, other delimiters are tokens in themselves.
@table @strong
@cindex identifiers
@item Identifiers
Identifiers are names that specify variable names, commands, or command
details.
@itemize @bullet
@item
The first character in an identifier must be a letter, @samp{#}, or
@samp{@@}. Some system identifiers begin with @samp{$}, but
user-defined variables' names may not begin with @samp{$}.
@item
The remaining characters in the identifier must be letters, digits, or
one of the following special characters:
@example
. _ $ # @@
@end example
@item
@cindex variable names
@cindex names, variable
Variable names may be any length, but only the first 8 characters are
significant.
@item
@cindex case-sensitivity
Identifiers are not case-sensitive: @code{foobar}, @code{Foobar},
@code{FooBar}, @code{FOOBAR}, and @code{FoObaR} are different
representations of the same identifier.
@item
@cindex keywords
Identifiers other than variable names may be abbreviated to their first
3 characters if this abbreviation is unambiguous. These identifiers are
often called @dfn{keywords}. (Unique abbreviations of more than 3
characters are also accepted: @samp{FRE}, @samp{FREQ}, and
@samp{FREQUENCIES} are equivalent when the last is a keyword.)
@item
Whether an identifier is a keyword depends on the context.
@item
@cindex keywords, reserved
@cindex reserved keywords
Some keywords are reserved. These keywords may not be used in any
context besides those explicitly described in this manual. The reserved
keywords are:
@example
ALL AND BY EQ GE GT LE LT NE NOT OR TO WITH
@end example
@item
Since keywords are identifiers, all the rules for identifiers apply.
Specifically, they must be delimited as are other identifiers:
@code{WITH} is a reserved keyword, but @code{WITHOUT} is a valid
variable name.
@end itemize
@cindex @samp{.}
@cindex period
@cindex variable names, ending with period
@strong{Caution:} It is legal to end a variable name with a period, but
@emph{don't do it!} The variable name will be misinterpreted when it is
the final token on a line: @code{FOO.} will be divided into two separate
tokens, @samp{FOO} and @samp{.}, the @dfn{terminal dot}.
@xref{Commands, , Forming commands of tokens}.
@item Numbers
@cindex numbers
@cindex integers
@cindex reals
Numbers may be specified as integers or reals. Integers are internally
converted into reals. Scientific notation is not supported. Here are
some examples of valid numbers:
@example
1234 3.14159265359 .707106781185 8945.
@end example
@strong{Caution:} The last example will be interpreted as two tokens,
@samp{8945} and @samp{.}, if it is the last token on a line.
@item Strings
@cindex strings
@cindex @samp{'}
@cindex @samp{"}
@cindex case-sensitivity
Strings are literal sequences of characters enclosed in pairs of single
quotes (@samp{'}) or double quotes (@samp{"}).
@itemize @bullet
@item
Whitespace and case of letters @emph{are} significant inside strings.
@item
Whitespace characters inside a string are not delimiters.
@item
To include single-quote characters in a string, enclose the string in
double quotes.
@item
To include double-quote characters in a string, enclose the string in
single quotes.
@item
It is not possible to put both single- and double-quote characters
inside one string.
@end itemize
@item Hexstrings
@cindex hexstrings
Hexstrings are string variants that use hex digits to specify
characters.
@itemize @bullet
@item
A hexstring may be used anywhere that an ordinary string is allowed.
@item
@cindex @samp{X'}
@cindex @samp{'}
A hexstring begins with @samp{X'} or @samp{x'}, and ends with @samp{'}.
@cindex whitespace
@item
No whitespace is allowed between the initial @samp{X} and @samp{'}.
@item
Double quotes @samp{"} may be used in place of single quotes @samp{'} if
done in both places.
@item
Each pair of hex digits is internally changed into a single character
with the given value.
@item
If there is an odd number of hex digits, the missing last digit is
assumed to be @samp{0}.
@item
@cindex portability
@strong{Please note:} Use of hexstrings is nonportable because the same
numeric values are associated with different glyphs by different
operating systems. Therefore, their use should be confined to syntax
files that will not be widely distributed.
@item
@cindex characters, reserved
@cindex 0
@cindex whitespace
@strong{Please note also:} The character with value 00 is reserved for
internal use by PSPP. Its use in strings causes an error and
replacement with a blank space (in ASCII, hex 20, decimal 32).
@end itemize
@item Punctuation
@cindex punctuation
Punctuation separates tokens; punctuators are delimiters. These are the
punctuation characters:
@example
, / = ( )
@end example
@item Operators
@cindex operators
Operators describe mathematical operations. Some operators are delimiters:
@example
( ) + - * / **
@end example
Many of the above operators are also punctuators. Punctuators are
distinguished from operators by context.
The other operators are all reserved keywords. None of these are
delimiters:
@example
AND EQ GE GT LE LT NE OR
@end example
@item Terminal Dot
@cindex terminal dot
@cindex dot, terminal
@cindex period
@cindex @samp{.}
A period (@samp{.}) at the end of a line (except for whitespace) is one
type of a @dfn{terminal dot}, although not every terminal dot is a
period at the end of a line. @xref{Commands, , Forming commands of
tokens}. A period is a terminal dot @emph{only}
when it is at the end of a line; otherwise it is part of a
floating-point number. (A period outside a number in the middle of a
line is an error.)
@quotation
@cindex terminal dot, changing
@cindex dot, terminal, changing
@strong{Please note:} The character used for the @dfn{terminal dot} can
be changed with the SET command. This is strongly discouraged, and
throughout all the remainder of this manual it will be assumed that the
default setting is in effect.
@end quotation
@end table
@node Commands, Types of Commands, Tokens, Language
@section Forming commands of tokens
@cindex PSPP, command structure
@cindex language, command structure
@cindex commands, structure
Most PSPP commands share a common structure, diagrammed below:
@example
@var{cmd}@dots{} [@var{sbc}[=][@var{spec} [[,]@var{spec}]@dots{}]] [[/[=][@var{spec} [[,]@var{spec}]@dots{}]]@dots{}].
@end example
@cindex @samp{[ ]}
In the above, rather daunting, expression, pairs of square brackets
(@samp{[ ]}) indicate optional elements, and names such as @var{cmd}
indicate parts of the syntax that vary from command to command.
Ellipses (@samp{...}) indicate that the preceding part may be repeated
an arbitrary number of times. Let's pick apart what it says above:
@itemize @bullet
@cindex commands, names
@item
A command begins with a command name of one or more keywords, such as
@code{FREQUENCIES}, @code{DATA LIST}, or @code{N OF CASES}. @var{cmd}
may be abbreviated to its first word if that is unambiguous; each word
in @var{cmd} may be abbreviated to a unique prefix of three or more
characters as described above.
@cindex subcommands
@item
The command name may be followed by one or more @dfn{subcommands}:
@itemize @minus
@item
Each subcommand begins with a unique keyword, indicated by @var{sbc}
above. This is analogous to the command name.
@item
The subcommand name is optionally followed by an equals sign (@samp{=}).
@item
Some subcommands accept a series of one or more specifications
(@var{spec}), optionally separated by commas.
@item
Each subcommand must be separated from the next (if any) by a forward
slash (@samp{/}).
@end itemize
@cindex dot, terminal
@cindex terminal dot
@item
Each command must be terminated with a @dfn{terminal dot}.
The terminal dot may be given one of three ways:
@itemize @minus
@item
(most commonly) A period character at the very end of a line, as
described above.
@item
(only if NULLINE is on: @xref{SET, , Setting user preferences}, for more
details.) A completely blank line.
@item
(in batch mode only) Any line that is not indented from the left side of
the page causes a terminal dot to be inserted before that line.
Therefore, each command begins with a line that is flush left, followed
by zero or more lines that are indented one or more characters from the
left margin.
In batch mode, PSPP will ignore a plus sign, minus sign, or period
(@samp{+}, @samp{@minus{}}, or @samp{.}) as the first character in a
line. Any of these characters as the first character on a line will
begin a new command. This allows for visual indentation of a command
without that command being considered part of the previous command.
PSPP is in batch mode when it is reading input from a file, rather
than from an interactive user. Note that the other forms of the
terminal dot may also be used in batch mode.
Sometimes, one encounters syntax files that are intended to be
interpreted in interactive mode rather than batch mode (for instance,
this can happen if a session log file is used directly as a syntax
file). When this occurs, use the @samp{-i} command line option to force
interpretation in interactive mode (@pxref{Language control options}).
@end itemize
@end itemize
PSPP ignores empty commands when they are generated by the above
rules. Note that, as a consequence of these rules, each command must
begin on a new line.
@node Types of Commands, Order of Commands, Commands, Language
@section Types of Commands
Commands in PSPP are divided roughly into six categories:
@table @strong
@item Utility commands
Set or display various global options that affect PSPP operations.
May appear anywhere in a syntax file. @xref{Utilities, , Utility
commands}.
@item File definition commands
Give instructions for reading data from text files or from special
binary ``system files''. Most of these commands discard any previous
data or variables in order to replace it with the new data and
variables. At least one must appear before the first command in any of
the categories below. @xref{Data Input and Output}.
@item Input program commands
Though rarely used, these provide powerful tools for reading data files
in arbitrary textual or binary formats. @xref{INPUT PROGRAM}.
@item Transformations
Perform operations on data and write data to output files. Transformations
are not carried out until a procedure is executed.
@item Restricted transformations
Same as transformations for most purposes. @xref{Order of Commands}, for a
detailed description of the differences.
@item Procedures
Analyze data, writing results of analyses to the listing file. Cause
transformations specified earlier in the file to be performed. In a
more general sense, a @dfn{procedure} is any command that causes the
active file (the data) to be read.
@end table
@node Order of Commands, Missing Observations, Types of Commands, Language
@section Order of Commands
@cindex commands, ordering
@cindex order of commands
PSPP does not place many restrictions on ordering of commands.
The main restriction is that variables must be defined with one of the
file-definition commands before they are otherwise referred to.
Of course, there are specific rules, for those who are interested.
PSPP possesses five internal states, called initial, INPUT
PROGRAM, FILE TYPE, transformation, and procedure states. (Please note
the distinction between the INPUT PROGRAM and FILE TYPE @emph{commands}
and the INPUT PROGRAM and FILE TYPE @emph{states}.)
PSPP starts up in the initial state. Each successful completion
of a command may cause a state transition. Each type of command has its
own rules for state transitions:
@table @strong
@item Utility commands
@itemize @bullet
@item
Legal in all states, except Pennsylvania.
@item
Do not cause state transitions. Exception: when the N OF CASES command
is executed in the procedure state, it causes a transition to the
transformation state.
@end itemize
@item DATA LIST
@itemize @bullet
@item
Legal in all states.
@item
When executed in the initial or procedure state, causes a transition to
the transformation state.
@item
Clears the active file if executed in the procedure or transformation
state.
@end itemize
@item INPUT PROGRAM
@itemize @bullet
@item
Invalid in INPUT PROGRAM and FILE TYPE states.
@item
Causes a transition to the INPUT PROGRAM state.
@item
Clears the active file.
@end itemize
@item FILE TYPE
@itemize @bullet
@item
Invalid in INPUT PROGRAM and FILE TYPE states.
@item
Causes a transition to the FILE TYPE state.
@item
Clears the active file.
@end itemize
@item Other file definition commands
@itemize @bullet
@item
Invalid in INPUT PROGRAM and FILE TYPE states.
@item
Cause a transition to the transformation state.
@item
Clear the active file, except for ADD FILES, MATCH FILES, and UPDATE.
@end itemize
@item Transformations
@itemize @bullet
@item
Invalid in initial and FILE TYPE states.
@item
Cause a transition to the transformation state.
@end itemize
@item Restricted transformations
@itemize @bullet
@item
Invalid in initial, INPUT PROGRAM, and FILE TYPE states.
@item
Cause a transition to the transformation state.
@end itemize
@item Procedures
@itemize @bullet
@item
Invalid in initial, INPUT PROGRAM, and FILE TYPE states.
@item
Cause a transition to the procedure state.
@end itemize
@end table
@node Missing Observations, Variables, Order of Commands, Language
@section Handling missing observations
@cindex missing values
@cindex values, missing
PSPP includes special support for unknown numeric data values.
Missing observations are assigned a special value, called the
@dfn{system-missing value}. This ``value'' actually indicates the
absence of value; it means that the actual value is unknown. Procedures
automatically exclude from analyses those observations or cases that
have missing values. Whether single observations or entire cases are
excluded depends on the procedure.
The system-missing value exists only for numeric variables. String
variables always have a defined value, even if it is only a string of
spaces.
Variables, whether numeric or string, can have designated
@dfn{user-missing values}. Every user-missing value is an actual value
for that variable. However, most of the time user-missing values are
treated in the same way as the system-missing value. String variables
that are wider than a certain width, usually 8 characters (depending on
computer architecture), cannot have user-missing values.
For more information on missing values, see the following sections:
@ref{Variables}, @ref{MISSING VALUES}, @ref{Expressions}. See also the
documentation on individual procedures for information on how they
handle missing values.
@node Variables, Files, Missing Observations, Language
@section Variables
@cindex variables
Variables are the basic unit of data storage in PSPP. All the
variables in a file taken together, apart from any associated data, are
said to form a @dfn{dictionary}. Each case contain a value for each
variable. Some details of variables are described in the sections
below.
@menu
* Attributes:: Attributes of variables.
* System Variables:: Variables automatically defined by PSPP.
* Sets of Variables:: Lists of variable names.
* Input/Output Formats:: Input and output formats.
* Scratch Variables:: Variables deleted by procedures.
@end menu
@node Attributes, System Variables, Variables, Variables
@subsection Attributes of Variables
@cindex variables, attributes of
@cindex attributes of variables
Each variable has a number of attributes, including:
@table @strong
@item Name
This is an identifier. Each variable must have a different name.
@xref{Tokens}.
@cindex variables, type
@cindex type of variables
@item Type
Numeric or string.
@cindex variables, width
@cindex width of variables
@item Width
(string variables only) String variables with a width of 8 characters or
fewer are called @dfn{short string variables}. Short string variables
can be used in many procedures where @dfn{long string variables} (those
with widths greater than 8) are not allowed.
@quotation
@strong{Please note:} Certain systems may consider strings longer than 8
characters to be short strings. Eight characters represents a minimum
figure for the maximum length of a short string.
@end quotation
@item Position
Variables in the dictionary are arranged in a specific order. The
DISPLAY command can be used to show this order: see @ref{DISPLAY}.
@item Orientation
Dexter or sinister. @xref{LEAVE}.
@cindex missing values
@cindex values, missing
@item Missing values
Optionally, up to three values, or a range of values, or a specific
value plus a range, can be specified as @dfn{user-missing values}.
There is also a @dfn{system-missing value} that is assigned to an
observation when there is no other obvious value for that observation.
Observations with missing values are automatically excluded from
analyses. User-missing values are actual data values, while the
system-missing value is not a value at all. @xref{Missing Observations}.
@cindex variable labels
@cindex labels, variable
@item Variable label
A string that describes the variable. @xref{VARIABLE LABELS}.
@cindex value labels
@cindex labels, value
@item Value label
Optionally, these associate each possible value of the variable with a
string. @xref{VALUE LABELS}.
@cindex print format
@item Print format
Display width, format, and (for numeric variables) number of decimal
places. This attribute does not affect how data are stored, just how
they are displayed. Example: a width of 8, with 2 decimal places.
@xref{PRINT FORMATS}.
@cindex write format
@item Write format
Similar to print format, but used by certain commands that are
designed to write to binary files. @xref{WRITE FORMATS}.
@end table
@node System Variables, Sets of Variables, Attributes, Variables
@subsection Variables Automatically Defined by PSPP
@cindex system variables
@cindex variables, system
There are seven system variables. These are not like ordinary
variables, as they are not stored in each case. They can only be used
in expressions. These system variables, whose values and output formats
cannot be modified, are described below.
@table @code
@cindex @code{$CASENUM}
@item $CASENUM
Case number of the case at the moment. This changes as cases are
shuffled around.
@cindex @code{$DATE}
@item $DATE
Date the PSPP process was started, in format A9, following the
pattern @code{DD MMM YY}.
@cindex @code{$JDATE}
@item $JDATE
Number of days between 15 Oct 1582 and the time the PSPP process
was started.
@cindex @code{$LENGTH}
@item $LENGTH
Page length, in lines, in format F11.
@cindex @code{$SYSMIS}
@item $SYSMIS
System missing value, in format F1.
@cindex @code{$TIME}
@item $TIME
Number of seconds between midnight 14 Oct 1582 and the time the active file
was read, in format F20.
@cindex @code{$WIDTH}
@item $WIDTH
Page width, in characters, in format F3.
@end table
@node Sets of Variables, Input/Output Formats, System Variables, Variables
@subsection Lists of variable names
@cindex TO convention
@cindex convention, TO
There are several ways to specify a set of variables:
@enumerate
@item
(Most commonly.) List the variable names one after another, optionally
separating them by commas.
@cindex @code{TO}
@item
(This method cannot be used on commands that define the dictionary, such
as @code{DATA LIST}.) The syntax is the names of two existed variables,
separated by the reserved keyword @code{TO}. The meaning is to include
every variable in the dictionary between and including the variables
specified. For instance, if the dictionary contains six variables with
the names @code{ID}, @code{X1}, @code{X2}, @code{GOAL}, @code{MET}, and
@code{NEXTGOAL}, in that order, then @code{X2 TO MET} would include
variables @code{X2}, @code{GOAL}, and @code{MET}.
@item
(This method can be used only on commands that define the dictionary,
such as @code{DATA LIST}.) It is used to define sequences of variables
that end in consecutive integers. The syntax is two identifiers that
end in numbers. This method is best illustrated with examples:
@itemize @bullet
@item
The syntax @code{X1 TO X5} defines 5 variables:
@itemize @minus
@item
X1
@item
X2
@item
X3
@item
X4
@item
X5
@end itemize
@item
The syntax @code{ITEM0008 TO ITEM0013} defines 6 variables:
@itemize @minus
@item
ITEM0008
@item
ITEM0009
@item
ITEM0010
@item
ITEM0011
@item
ITEM0012
@item
ITEM0013
@end itemize
@item
Each of the syntaxes @code{QUES001 TO QUES9} and @code{QUES6 TO QUES3}
are invalid, although for different reasons, which should be evident.
@end itemize
Note that after a set of variables has been defined on @code{DATA LIST}
or another command with this method, the same set can be referenced on
later commands using the same syntax.
@item
The above methods can be combined, either one after another or delimited
by commas. For instance, the combined syntax @code{A Q5 TO Q8 X TO Z}
is legal as long as each part @code{A}, @code{Q5 TO Q8}, @code{X TO Z}
is individually legal.
@end enumerate
@node Input/Output Formats, Scratch Variables, Sets of Variables, Variables
@subsection Input and Output Formats
Data that PSPP inputs and outputs must have one of a number of formats.
These formats are described, in general, by a format specification of
the form @code{NAMEw.d}, where @var{name} is the
format name and @var{w} is a field width. @var{d} is the optional
desired number of decimal places, if appropriate. If @var{d} is not
included then it is assumed to be 0. Some formats do not allow @var{d}
to be specified.
When an input format is specified on DATA LIST or another command, then
it is converted to an output format for the purposes of PRINT and other
data output commands. For most purposes, input and output formats are
the same; the salient differences are described below.
Below are listed the input and output formats supported by PSPP. If an
input format is mapped to a different output format by default, then
that mapping is indicated with @result{}. Each format has the listed
bounds on input width (iw) and output width (ow).
The standard numeric input and output formats are given in the following
table:
@table @asis
@item Fw.d: 1 <= iw,ow <= 40
Standard decimal format with @var{d} decimal places. If the number is
too large to fit within the field width, it is expressed in scientific
notation (@code{1.2+34}) if w >= 6, with always at least two digits in
the exponent. When used as an input format, scientific notation is
allowed but an E or an F must be used to introduce the exponent.
The default output format is the same as the input format, except if
@var{d} > 1. In that case the output @var{w} is always made to be at
least 2 + @var{d}.
@item Ew.d: 1 <= iw <= 40; 6 <= ow <= 40
For input this is equivalent to F format except that no E or F is
require to introduce the exponent. For output, produces scientific
notation in the form @code{1.2+34}. There are always at least two
digits given in the exponent.
The default output @var{w} is the largest of the input @var{w}, the
input @var{d} + 7, and 10. The default output @var{d} is the input
@var{d}, but at least 3.
@item COMMAw.d: 1 <= iw,ow <= 40
Equivalent to F format, except that groups of three digits are
comma-separated on output. If the number is too large to express in the
field width, then first commas are eliminated, then if there is still
not enough space the number is expressed in scientific notation given
that w >= 6. Commas are allowed and ignored when this is used as an
input format.
@item DOTw.d: 1 <= iw,ow <= 40
Equivalent to COMMA format except that the roles of comma and decimal
point are interchanged. However: If SET /DECIMAL=DOT is in effect, then
COMMA uses @samp{,} for a decimal point and DOT uses @samp{.} for a
decimal point.
@item DOLLARw.d: 1 <= iw <= 40; 2 <= ow <= 40
Equivalent to COMMA format, except that the number is prefixed by a
dollar sign (@samp{$}) if there is room. On input the value is allowed
to be prefixed by a dollar sign, which is ignored.
The default output @var{w} is the input @var{w}, but at least 2.
@item PCTw.d: 2 <= iw,ow <= 40
Equivalent to F format, except that the number is suffixed by a percent
sign (@samp{%}) if there is room. On input the value is allowed to be
suffixed by a percent sign, which is ignored.
The default output @var{w} is the input @var{w}, but at least 2.
@item Nw.d: 1 <= iw,ow <= 40
Only digits are allowed within the field width. The decimal point is
assumed to be @var{d} digits from the right margin.
The default output format is F with the same @var{w} and @var{d}, except
if @var{d} > 1. In that case the output @var{w} is always made to be at
least 2 + @var{d}.
@item Zw.d @result{} F: 1 <= iw,ow <= 40
Zoned decimal input. If you need to use this then you know how.
@item IBw.d @result{} F: 1 <= iw,ow <= 8
Integer binary format. The field is interpreted as a fixed-point
positive or negative binary number in two's-complement notation. The
location of the decimal point is implied. Endianness is the same as the
host machine.
The default output format is F8.2 if @var{d} is 0. Otherwise it is F,
with output @var{w} as 9 + input @var{d} and output @var{d} as input
@var{d}.
@item PIB @result{} F: 1 <= iw,ow <= 8
Positive integer binary format. The field is interpreted as a
fixed-point positive binary number. The location of the decimal point
is implied. Endianness is teh same as the host machine.
The default output format follows the rules for IB format.
@item Pw.d @result{} F: 1 <= iw,ow <= 16
Binary coded decimal format. Each byte from left to right, except the
rightmost, represents two digits. The upper nibble of each byte is more
significant. The upper nibble of the final byte is the least
significant digit. The lower nibble of the final byte is the sign; a
value of D represents a negative sign and all other values are
considered positive. The decimal point is implied.
The default output format follows the rules for IB format.
@item PKw.d @result{} F: 1 <= iw,ow <= 16
Positive binary code decimal format. Same as P but the last byte is the
same as the others.
The default output format follows the rules for IB format.
@item RBw @result{} F: 2 <= iw,ow <= 8
Binary C architecture-dependent ``double'' format. For a standard
IEEE754 implementation @var{w} should be 8.
The default output format follows the rules for IB format.
@item PIBHEXw.d @result{} F: 2 <= iw,ow <= 16
PIB format encoded as textual hex digit pairs. @var{w} must be even.
The input width is mapped to a default output width as follows:
2@result{}4, 4@result{}6, 6@result{}9, 8@result{}11, 10@result{}14,
12@result{}16, 14@result{}18, 16@result{}21. No allowances are made for
decimal places.
@item RBHEXw @result{} F: 4 <= iw,ow <= 16
RB format encoded as textual hex digits pairs. @var{w} must be even.
The default output format is F8.2.
@item CCAw.d: 1 <= ow <= 40
@itemx CCBw.d: 1 <= ow <= 40
@itemx CCCw.d: 1 <= ow <= 40
@itemx CCDw.d: 1 <= ow <= 40
@itemx CCEw.d: 1 <= ow <= 40
User-defined custom currency formats. May not be used as an input
format. @xref{SET}, for more details.
@end table
The date and time numeric input and output formats accept a number of
possible formats. Before describing the formats themselves, some
definitions of the elements that make up their formats will be helpful:
@table @dfn
@item leader
All formats accept an optional whitespace leader.
@item day
An integer between 1 and 31 representing the day of month.
@item day-count
An integer representing a number of days.
@item date-delimiter
One or more characters of whitespace or the following characters:
@code{- / . ,}
@item month
A month name in one of the following forms:
@itemize @bullet
@item
An integer between 1 and 12.
@item
Roman numerals representing an integer between 1 and 12.
@item
At least the first three characters of an English month name (January,
February, @dots{}).
@end itemize
@item year
An integer year number between 1582 and 19999, or between 1 and 199.
Years between 1 and 199 will have 1900 added.
@item julian
A single number with a year number in the first 2, 3, or 4 digits (as
above) and the day number within the year in the last 3 digits.
@item quarter
An integer between 1 and 4 representing a quarter.
@item q-delimiter
The letter @samp{Q} or @samp{q}.
@item week
An integer between 1 and 53 representing a week within a year.
@item wk-delimiter
The letters @samp{wk} in any case.
@item time-delimiter
At least one characters of whitespace or @samp{:} or @samp{.}.
@item hour
An integer greater than 0 representing an hour.
@item minute
An integer between 0 and 59 representing a minute within an hour.
@item opt-second
Optionally, a time-delimiter followed by a real number representing a
number of seconds.
@item hour24
An integer between 0 and 23 representing an hour within a day.
@item weekday
At least the first two characters of an English day word.
@item spaces
Any amount or no amount of whitespace.
@item sign
An optional positive or negative sign.
@item trailer
All formats accept an optional whitespace trailer.
@end table
The date input formats are strung together from the above pieces. On
output, the date formats are always printed in a single canonical
manner, based on field width. The date input and output formats are
described below:
@table @asis
@item DATEw: 9 <= iw,ow <= 40
Date format. Input format: leader + day + date-delimiter +
month + date-delimiter + year + trailer. Output format: DD-MMM-YY for
@var{w} < 11, DD-MMM-YYYY otherwise.
@item EDATEw: 8 <= iw,ow <= 40
European date format. Input format same as DATE. Output format:
DD.MM.YY for @var{w} < 10, DD.MM.YYYY otherwise.
@item SDATEw: 8 <= iw,ow <= 40
Standard date format. Input format: leader + year + date-delimiter +
month + date-delimiter + day + trailer. Output format: YY/MM/DD for
@var{w} < 10, YYYY/MM/DD otherwise.
@item ADATEw: 8 <= iw,ow <= 40
American date format. Input format: leader + month + date-delimiter +
day + date-delimiter + year + trailer. Output format: MM/DD/YY for
@var{w} < 10, MM/DD/YYYY otherwise.
@item JDATEw: 5 <= iw,ow <= 40
Julian date format. Input format: leader + julian + trailer. Output
format: YYDDD for @var{w} < 7, YYYYDDD otherwise.
@item QYRw: 4 <= iw <= 40, 6 <= ow <= 40
Quarter/year format. Input format: leader + quarter + q-delimiter +
year + trailer. Output format: @samp{Q Q YY}, where the first
@samp{Q} is one of the digits 1, 2, 3, 4, if @var{w} < 8, @code{Q Q
YYYY} otherwise.
@item MOYRw: 6 <= iw,ow <= 40
Month/year format. Input format: leader + month + date-delimiter + year
+ trailer. Output format: @samp{MMM YY} for @var{w} < 8, @samp{MMM
YYYY} otherwise.
@item WKYRw: 6 <= iw <= 40, 8 <= ow <= 40
Week/year format. Input format: leader + week + wk-delimiter + year +
trailer. Output format: @samp{WW WK YY} for @var{w} < 10, @samp{WW WK
YYYY} otherwise.
@item DATETIMEw.d: 17 <= iw,ow <= 40
Date and time format. Input format: leader + day + date-delimiter +
month + date-delimiter + yaer + time-delimiter + hour24 + time-delimiter
+ minute + opt-second. Output format: @samp{DD-MMM-YYYY HH:MM}. If
@var{w} > 19 then seconds @samp{:SS} is added. If @var{w} > 22 and
@var{d} > 0 then fractional seconds @samp{.SS} are added.
@item TIMEw.d: 5 <= iw,ow <= 40
Time format. Input format: leader + sign + spaces + hour +
time-delimiter + minute + opt-second. Output format: @samp{HH:MM}.
Seconds and fractional seconds are available with @var{w} of at least 8
and 10, respectively.
@item DTIMEw.d: 1 <= iw <= 40, 8 <= ow <= 40
Time format with day count. Input format: leader + sign + spaces +
day-count + time-delimiter + hour + time-delimiter + minute +
opt-second. Output format: @samp{DD HH:MM}. Seconds and fractional
seconds are available with @var{w} of at least 8 and 10, respectively.
@item WKDAYw: 2 <= iw,ow <= 40
A weekday as a number between 1 and 7, where 1 is Sunday. Input format:
leader + weekday + trailer. Output format: as many characters, in all
capital letters, of the English name of the weekday as will fit in the
field width.
@item MONTHw: 3 <= iw,ow <= 40
A month as a number between 1 and 12, where 1 is January. Input format:
leader + month + trailer. Output format: as many character, in all
capital letters, of the English name of the month as will fit in the
field width.
@end table
There are only two formats that may be used with string variables:
@table @asis
@item Aw: 1 <= iw <= 255, 1 <= ow <= 254
The entire field is treated as a string value.
@item AHEXw @result{} A: 2 <= iw <= 254; 2 <= ow <= 510
The field is composed of characters in a string encoded as textual hex
digit pairs.
The default output @var{w} is half the input @var{w}.
@end table
@node Scratch Variables, , Input/Output Formats, Variables
@subsection Scratch Variables
Most of the time, variables don't retain their values between cases.
Instead, either they're being read from a data file or the active file,
in which case they assume the value read, or, if created with COMPUTE or
another transformation, they're initialized to the system-missing value
or to blanks, depending on type.
However, sometimes it's useful to have a variable that keeps its value
between cases. You can do this with LEAVE (@pxref{LEAVE}), or you can
use a @dfn{scratch variable}. Scratch variables are variables whose
names begin with an octothorpe (@samp{#}).
Scratch variables have the same properties as variables left with LEAVE:
they retain their values between cases, and for the first case they are
initialized to 0 or blanks. They have the additional property that they
are deleted before the execution of any procedure. For this reason,
scratch variables can't be used for analysis. To obtain the same
effect, use COMPUTE (@pxref{COMPUTE}) to copy the scratch variable's
value into an ordinary variable, then analysis that variable.
@node Files, BNF, Variables, Language
@section Files Used by PSPP
PSPP makes use of many files each time it runs. Some of these it
reads, some it writes, some it creates. Here is a table listing the
most important of these files:
@table @strong
@cindex file, command
@cindex file, syntax file
@cindex command file
@cindex syntax file
@item command file
@itemx syntax file
These names (synonyms) refer to the file that contains instructions to
PSPP that tell it what to do. The syntax file's name is specified on
the PSPP command line. Syntax files can also be pulled in with the
@code{INCLUDE} command.
@cindex file, data
@cindex data file
@item data file
Data files contain raw data in ASCII format suitable for being read in
by the @code{DATA LIST} command. Data can be embedded in the syntax
file with @code{BEGIN DATA} and @code{END DATA} commands: this makes the
syntax file a data file too.
@cindex file, output
@cindex output file
@item listing file
One or more output files are created by PSPP each time it is
run. The output files receive the tables and charts produced by
statistical procedures. The output files may be in any number of formats,
depending on how PSPP is configured.
@cindex active file
@cindex file, active
@item active file
The active file is the ``file'' on which all PSPP procedures
are performed. The active file contains variable definitions and
cases. The active file is not necessarily a disk file: it is stored
in memory if there is room.
@end table
@node BNF, , Files, Language
@section Backus-Naur Form
@cindex BNF
@cindex Backus-Naur Form
@cindex command syntax, description of
@cindex description of command syntax
The syntax of some parts of the PSPP language is presented in this
manual using the formalism known as @dfn{Backus-Naur Form}, or BNF. The
following table describes BNF:
@itemize @bullet
@cindex keywords
@cindex terminals
@item
Words in all-uppercase are PSPP keyword tokens. In BNF, these are
often called @dfn{terminals}. There are some special terminals, which
are actually written in lowercase for clarity:
@table @asis
@cindex @code{number}
@item @code{number}
A real number.
@cindex @code{integer}
@item @code{integer}
An integer number.
@cindex @code{string}
@item @code{string}
A string.
@cindex @code{var-name}
@item @code{var-name}
A single variable name.
@cindex operators
@cindex punctuators
@item @code{=}, @code{/}, @code{+}, @code{-}, etc.
Operators and punctuators.
@cindex @code{.}
@cindex terminal dot
@cindex dot, terminal
@item @code{.}
The terminal dot. This is not necessarily an actual dot in the syntax
file: @xref{Commands}, for more details.
@end table
@item
@cindex productions
@cindex nonterminals
Other words in all lowercase refer to BNF definitions, called
@dfn{productions}. These productions are also known as
@dfn{nonterminals}. Some nonterminals are very common, so they are
defined here in English for clarity:
@table @code
@cindex @code{var-list}
@item var-list
A list of one or more variable names or the keyword @code{ALL}.
@cindex @code{expression}
@item expression
An expression. @xref{Expressions}, for details.
@end table
@item
@cindex @code{::=}
@cindex ``is defined as''
@cindex productions
@samp{::=} means ``is defined as''. The left side of @samp{::=} gives
the name of the nonterminal being defined. The right side of @samp{::=}
gives the definition of that nonterminal. If the right side is empty,
then one possible expansion of that nonterminal is nothing. A BNF
definition is called a @dfn{production}.
@item
@cindex terminals and nonterminals, differences
So, the key difference between a terminal and a nonterminal is that a
terminal cannot be broken into smaller parts---in fact, every terminal
is a single token (@pxref{Tokens}). On the other hand, nonterminals are
composed of a (possibly empty) sequence of terminals and nonterminals.
Thus, terminals indicate the deepest level of syntax description. (In
parsing theory, terminals are the leaves of the parse tree; nonterminals
form the branches.)
@item
@cindex start symbol
@cindex symbol, start
The first nonterminal defined in a set of productions is called the
@dfn{start symbol}. The start symbol defines the entire syntax for
that command.
@end itemize
@node Expressions, Data Input and Output, Language, Top
@chapter Mathematical Expressions
@cindex expressions, mathematical
@cindex mathematical expressions
Some PSPP commands use expressions, which share a common syntax
among all PSPP commands. Expressions are made up of
@dfn{operands}, which can be numbers, strings, or variable names,
separated by @dfn{operators}. There are five types of operators:
grouping, arithmetic, logical, relational, and functions.
Every operator takes one or more @dfn{arguments} as input and produces
or @dfn{returns} exactly one result as output. Both strings and numeric
values can be used as arguments and are produced as results, but each
operator accepts only specific combinations of numeric and string values
as arguments. With few exceptions, operator arguments may be
full-fledged expressions in themselves.
@menu
* Booleans:: Boolean values.
* Missing Values in Expressions:: Using missing values in expressions.
* Grouping Operators:: ( )
* Arithmetic Operators:: + - * / **
* Logical Operators:: AND NOT OR
* Relational Operators:: EQ GE GT LE LT NE
* Functions:: More-sophisticated operators.
* Order of Operations:: Operator precedence.
@end menu
@node Booleans, Missing Values in Expressions, Expressions, Expressions
@section Boolean values
@cindex Boolean
@cindex values, Boolean
There is a third type for arguments and results, the @dfn{Boolean} type,
which is used to represent true/false conditions. Booleans have only
three possible values: 0 (false), 1 (true), and system-missing.
System-missing is neither true or false.
@itemize @bullet
@item
A numeric expression that has value 0, 1, or system-missing may be used
in place of a Boolean. Thus, the expression @code{0 AND 1} is valid
(although it is always true).
@item
A numeric expression with any other value will cause an error if it is
used as a Boolean. So, @code{2 OR 3} is invalid.
@item
A Boolean expression may not be used in place of a numeric expression.
Thus, @code{(1>2) + (3<4)} is invalid.
@item
Strings and Booleans are not compatible, and neither may be used in
place of the other.
@end itemize
@node Missing Values in Expressions, Grouping Operators, Booleans, Expressions
@section Missing Values in Expressions
String missing values are not treated specially in expressions. Most
numeric operators return system-missing when given system-missing
arguments. Exceptions are listed under particular operator
descriptions.
User-missing values for numeric variables are always transformed into
the system-missing value, except inside the arguments to the
@code{VALUE}, @code{SYSMIS}, and @code{MISSING} functions.
The missing-value functions can be used to precisely control how missing
values are treated in expressions. @xref{Missing Value Functions}, for
more details.
@node Grouping Operators, Arithmetic Operators, Missing Values in Expressions, Expressions
@section Grouping Operators
@cindex parentheses
@cindex @samp{( )}
@cindex grouping operators
@cindex operators, grouping
Parentheses (@samp{()}) are the grouping operators. Surround an
expression with parentheses to force early evaluation.
Parentheses also surround the arguments to functions, but in that
situation they act as punctuators, not as operators.
@node Arithmetic Operators, Logical Operators, Grouping Operators, Expressions
@section Arithmetic Operators
@cindex operators, arithmetic
@cindex arithmetic operators
The arithmetic operators take numeric arguments and produce numeric
results.
@table @code
@cindex @samp{+}
@cindex addition
@item @var{a} + @var{b}
Adds @var{a} and @var{b}, returning the sum.
@cindex @samp{-}
@cindex subtraction
@item @var{a} - @var{b}
Subtracts @var{b} from @var{a}, returning the difference.
@cindex @samp{*}
@cindex multiplication
@item @var{a} * @var{b}
Multiplies @var{a} and @var{b}, returning the product.
@cindex @samp{/}
@cindex division
@item @var{a} / @var{b}
Divides @var{a} by @var{b}, returning the quotient. If @var{b} is
zero, the result is system-missing.
@cindex @samp{**}
@cindex exponentiation
@item @var{a} ** @var{b}
Returns the result of raising @var{a} to the power @var{b}. If
@var{a} is negative and @var{b} is not an integer, the result is
system-missing. The result of @code{0**0} is system-missing as well.
@cindex @samp{-}
@cindex negation
@item - @var{a}
Reverses the sign of @var{a}.
@end table
@node Logical Operators, Relational Operators, Arithmetic Operators, Expressions
@section Logical Operators
@cindex logical operators
@cindex operators, logical
@cindex true
@cindex false
@cindex Boolean
@cindex values, system-missing
@cindex system-missing
The logical operators take logical arguments and produce logical
results, meaning ``true or false''. PSPP logical operators are
not true Boolean operators because they may also result in a
system-missing value.
@table @code
@cindex @code{AND}
@cindex @samp{&}
@cindex intersection, logical
@cindex logical intersection
@item @var{a} AND @var{b}
@itemx @var{a} & @var{b}
True if both @var{a} and @var{b} are true. However, if one argument is
false and the other is missing, the result is false, not missing. If
both arguments are missing, the result is missing.
@cindex @code{OR}
@cindex @samp{|}
@cindex union, logical
@cindex logical union
@item @var{a} OR @var{b}
@itemx @var{a} | @var{b}
True if at least one of @var{a} and @var{b} is true. If one argument is
true and the other is missing, the result is true, not missing. If both
arguments are missing, the result is missing.
@cindex @code{NOT}
@cindex @samp{~}
@cindex inversion, logical
@cindex logical inversion
@item NOT @var{a}
@itemx ~ @var{a}
True if @var{a} is false.
@end table
@node Relational Operators, Functions, Logical Operators, Expressions
@section Relational Operators
The relational operators take numeric or string arguments and produce Boolean
results.
Note that, with numeric arguments, PSPP does not make exact
relational tests. Instead, two numbers are considered to be equal even
if they differ by a small amount. This amount, @dfn{epsilon}, is
dependent on the PSPP configuration and determined at compile
time. (The default value is 0.000000001, or
@ifinfo
@code{10**(-9)}.)
@end ifinfo
@tex
$10 ^ -9$.)
@end tex
Use of epsilon allows for round-off errors. Use of epsilon is also
idiotic, but the author is not a numeric analyst.
Strings cannot be compared to numbers. When strings of different
lengths are compared, the shorter string is right-padded with spaces
to match the length of the longer string.
The results of string comparisons, other than tests for equality or
inequality, are dependent on the character set in use. String
comparisons are case-sensitive.
@table @code
@cindex equality, testing
@cindex testing for equality
@cindex @code{EQ}
@cindex @samp{=}
@item @var{a} EQ @var{b}
@itemx @var{a} = @var{b}
True if @var{a} is equal to @var{b}.
@cindex less than or equal to
@cindex @code{LE}
@cindex @code{<=}
@item @var{a} LE @var{b}
@itemx @var{a} <= @var{b}
True if @var{a} is less than or equal to @var{b}.
@cindex less than
@cindex @code{LT}
@cindex @code{<}
@item @var{a} LT @var{b}
@itemx @var{a} < @var{b}
True if @var{a} is less than @var{b}.
@cindex greater than or equal to
@cindex @code{GE}
@cindex @code{>=}
@item @var{a} GE @var{b}
@itemx @var{a} >= @var{b}
True if @var{a} is greater than or equal to @var{b}.
@cindex greater than
@cindex @code{GT}
@cindex @samp{>}
@item @var{a} GT @var{b}
@itemx @var{a} > @var{b}
True if @var{a} is greater than @var{b}.
@cindex inequality, testing
@cindex testing for inequality
@cindex @code{NE}
@cindex @code{~=}
@cindex @code{<>}
@item @var{a} NE @var{b}
@itemx @var{a} ~= @var{b}
@itemx @var{a} <> @var{b}
True is @var{a} is not equal to @var{b}.
@end table
@node Functions, Order of Operations, Relational Operators, Expressions
@section Functions
@cindex functions
@cindex mathematics
@cindex operators
@cindex parentheses
@cindex @code{(}
@cindex @code{)}
@cindex names, of functions
PSPP functions provide mathematical abilities above and beyond
those possible using simple operators. Functions have a common
syntax: each is composed of a function name followed by a left
parenthesis, one or more arguments, and a right parenthesis. Function
names are @strong{not} reserved; their names are specially treated
only when followed by a left parenthesis: @code{EXP(10)} refers to the
constant value @code{e} raised to the 10th power, but @code{EXP} by
itself refers to the value of variable EXP.
The sections below describe each function in detail.
@menu
* Advanced Mathematics:: EXP LG10 LN SQRT
* Miscellaneous Mathematics:: ABS MOD MOD10 RND TRUNC
* Trigonometry:: ACOS ARCOS ARSIN ARTAN ASIN ATAN COS SIN TAN
* Missing Value Functions:: MISSING NMISS NVALID SYSMIS VALUE
* Pseudo-Random Numbers:: NORMAL UNIFORM
* Set Membership:: ANY RANGE
* Statistical Functions:: CFVAR MAX MEAN MIN SD SUM VARIANCE
* String Functions:: CONCAT INDEX LENGTH LOWER LPAD LTRIM NUMBER
RINDEX RPAD RTRIM STRING SUBSTR UPCASE
* Time & Date:: CTIME.xxx DATE.xxx TIME.xxx XDATE.xxx
* Miscellaneous Functions:: LAG YRMODA
* Functions Not Implemented:: CDF.xxx CDFNORM IDF.xxx NCDF.xxx PROBIT RV.xxx
@end menu
@node Advanced Mathematics, Miscellaneous Mathematics, Functions, Functions
@subsection Advanced Mathematical Functions
@cindex mathematics, advanced
Advanced mathematical functions take numeric arguments and produce
numeric results.
@deftypefn {Function} {} EXP (@var{exponent})
Returns @i{e} (approximately 2.71828) raised to power @var{exponent}.
@end deftypefn
@cindex logarithms
@deftypefn {Function} {} LG10 (@var{number})
Takes the base-10 logarithm of @var{number}. If @var{number} is
not positive, the result is system-missing.
@end deftypefn
@deftypefn {Function} {} LN (@var{number})
Takes the base-@samp{e} logarithm of @var{number}. If @var{number} is
not positive, the result is system-missing.
@end deftypefn
@cindex square roots
@deftypefn {Function} {} SQRT (@var{number})
Takes the square root of @var{number}. If @var{number} is negative,
the result is system-missing.
@end deftypefn
@node Miscellaneous Mathematics, Trigonometry, Advanced Mathematics, Functions
@subsection Miscellaneous Mathematical Functions
@cindex mathematics, miscellaneous
Miscellaneous mathematical functions take numeric arguments and produce
numeric results.
@cindex absolute value
@deftypefn {Function} {} ABS (@var{number})
Results in the absolute value of @var{number}.
@end deftypefn
@cindex modulus
@deftypefn {Function} {} MOD (@var{numerator}, @var{denominator})
Returns the remainder (modulus) of @var{numerator} divided by
@var{denominator}. If @var{denominator} is 0, the result is
system-missing. However, if @var{numerator} is 0 and
@var{denominator} is system-missing, the result is 0.
@end deftypefn
@cindex modulus, by 10
@deftypefn {Function} {} MOD10 (@var{number})
Returns the remainder when @var{number} is divided by 10. If
@var{number} is negative, MOD10(@var{number}) is negative or zero.
@end deftypefn
@cindex rounding
@deftypefn {Function} {} RND (@var{number})
Takes the absolute value of @var{number} and rounds it to an integer.
Then, if @var{number} was negative originally, negates the result.
@end deftypefn
@cindex truncation
@deftypefn {Function} {} TRUNC (@var{number})
Discards the fractional part of @var{number}; that is, rounds
@var{number} towards zero.
@end deftypefn
@node Trigonometry, Missing Value Functions, Miscellaneous Mathematics, Functions
@subsection Trigonometric Functions
@cindex trigonometry
Trigonometric functions take numeric arguments and produce numeric
results.
@cindex arccosine
@cindex inverse cosine
@deftypefn {Function} {} ACOS (@var{number})
@deftypefnx {Function} {} ARCOS (@var{number})
Takes the arccosine, in radians, of @var{number}. Results in
system-missing if @var{number} is not between -1 and 1. Portability:
none.
@end deftypefn
@cindex arcsine
@cindex inverse sine
@deftypefn {Function} {} ARSIN (@var{number})
Takes the arcsine, in radians, of @var{number}. Results in
system-missing if @var{number} is not between -1 and 1 inclusive.
@end deftypefn
@cindex arctangent
@cindex inverse tangent
@deftypefn {Function} {} ARTAN (@var{number})
Takes the arctangent, in radians, of @var{number}.
@end deftypefn
@cindex arcsine
@cindex inverse sine
@deftypefn {Function} {} ASIN (@var{number})
Takes the arcsine, in radians, of @var{number}. Results in
system-missing if @var{number} is not between -1 and 1 inclusive.
Portability: none.
@end deftypefn
@cindex arctangent
@cindex inverse tangent
@deftypefn {Function} {} ATAN (@var{number})
Takes the arctangent, in radians, of @var{number}.
@end deftypefn
@quotation
@strong{Please note:} Use of the AR* group of inverse trigonometric
functions is recommended over the A* group because they are more
portable.
@end quotation
@cindex cosine
@deftypefn {Function} {} COS (@var{radians})
Takes the cosine of @var{radians}.
@end deftypefn
@cindex sine
@deftypefn {Function} {} SIN (@var{angle})
Takes the sine of @var{radians}.
@end deftypefn
@cindex tangent
@deftypefn {Function} {} TAN (@var{angle})
Takes the tangent of @var{radians}. Results in system-missing at values
of @var{angle} that are too close to odd multiples of pi/2.
Portability: none.
@end deftypefn
@node Missing Value Functions, Pseudo-Random Numbers, Trigonometry, Functions
@subsection Missing-Value Functions
@cindex missing values
@cindex values, missing
@cindex functions, missing-value
Missing-value functions take various types as arguments, returning
various types of results.
@deftypefn {Function} {} MISSING (@var{variable or expression})
@var{num} may be a single variable name or an expression. If it is a
variable name, results in 1 if the variable has a user-missing or
system-missing value for the current case, 0 otherwise. If it is an
expression, results in 1 if the expression has the system-missing value,
0 otherwise.
@quotation
@strong{Please note:} If the argument is a string expression other than
a variable name, MISSING is guaranteed to return 0, because strings do
not have a system-missing value. Also, when using a numeric expression
argument, remember that user-missing values are converted to the
system-missing value in most contexts. Thus, the expressions
@code{MISSING(VAR1 @var{op} VAR2)} and @code{MISSING(VAR1) OR
MISSING(VAR2)} are often equivalent, depending on the specific operator
@var{op} used.
@end quotation
@end deftypefn
@deftypefn {Function} {} NMISS (@var{expr} [, @var{expr}]@dots{})
Each argument must be a numeric expression. Returns the number of
user- or system-missing values in the list. As a special extension,
the syntax @code{@var{var1} TO @var{var2}} may be used to refer to a
range of variables; see @ref{Sets of Variables}, for more details.
@end deftypefn
@deftypefn {Function} {} NVALID (@var{expr} [, @var{expr}]@dots{})
Each argument must be a numeric expression. Returns the number of
values in the list that are not user- or system-missing. As a special extension,
the syntax @code{@var{var1} TO @var{var2}} may be used to refer to a
range of variables; see @ref{Sets of Variables}, for more details.
@end deftypefn
@deftypefn {Function} {} SYSMIS (@var{variable or expression})
When given the name of a numeric variable, returns 1 if the value of
that variable is system-missing. Otherwise, if the value is not
missing or if it is user-missing, returns 0. If given the name of a
string variable, always returns 1. If given an expression other than
a single variable name, results in 1 if the value is system- or
user-missing, 0 otherwise.
@end deftypefn
@deftypefn {Function} {} VALUE (@var{variable})
Prevents the user-missing values of @var{variable} from being
transformed into system-missing values: If @var{variable} is not
system- or user-missing, results in the value of @var{variable}. If
@var{variable} is user-missing, results in the value of @var{variable}
anyway. If @var{variable} is system-missing, results in system-missing.
@end deftypefn
@node Pseudo-Random Numbers, Set Membership, Missing Value Functions, Functions
@subsection Pseudo-Random Number Generation Functions
@cindex random numbers
@cindex pseudo-random numbers (see random numbers)
Pseudo-random number generation functions take numeric arguments and
produce numeric results.
@cindex Knuth
The system's C library random generator is used as a basis for
generating random numbers, since random number generation is a
system-dependent task. However, Knuth's Algorithm B is used to
shuffle the resultant values, which is enough to make even a stream of
consecutive integers random enough for most applications.
(If you're worried about the quality of the random number generator,
well, you're using a statistical processing package---analyze it!)
@cindex random numbers, normally-distributed
@deftypefn {Function} {} NORMAL (@var{number})
Results in a random number. Results from @code{NORMAL} are normally
distributed with a mean of 0 and a standard deviation of @var{number}.
@end deftypefn
@cindex random numbers, uniformly-distributed
@deftypefn {Function} {} UNIFORM (@var{number})
Results in a random number between 0 and @var{number}. Results from
@code{UNIFORM} are evenly distributed across its entire range. There
may be a maximum on the largest random number ever generated---this is
often 2**31-1 (2,147,483,647), but it may be orders of magnitude
higher or lower.
@end deftypefn
@node Set Membership, Statistical Functions, Pseudo-Random Numbers, Functions
@subsection Set-Membership Functions
@cindex set membership
@cindex membership, of set
Set membership functions determine whether a value is a member of a set.
They take a set of numeric arguments or a set of string arguments, and
produce Boolean results.
String comparisons are performed according to the rules given in
@ref{Relational Operators}.
@deftypefn {Function} {} ANY (@var{value}, @var{set} [, @var{set}]@dots{})
Results in true if @var{value} is equal to any of the @var{set}
values. Otherwise, results in false. If @var{value} is
system-missing, returns system-missing. System-missing values in
@var{set} do not cause ANY to return system-missing.
@end deftypefn
@deftypefn {Function} {} RANGE (@var{value}, @var{low}, @var{high} [, @var{low}, @var{high}]@dots{})
Results in true if @var{value} is in any of the intervals bounded by
@var{low} and @var{high} inclusive. Otherwise, results in false.
Each @var{low} must be less than or equal to its corresponding
@var{high} value. @var{low} and @var{high} must be given in pairs.
If @var{value} is system-missing, returns system-missing.
System-missing values in @var{set} do not cause RANGE to return
system-missing.
@end deftypefn
@node Statistical Functions, String Functions, Set Membership, Functions
@subsection Statistical Functions
@cindex functions, statistical
@cindex statistics
Statistical functions compute descriptive statistics on a list of
values. Some statistics can be computed on numeric or string values;
other can only be computed on numeric values. They result in the same
type as their arguments.
@cindex arguments, minimum valid
@cindex minimum valid number of arguments
With statistical functions it is possible to specify a minimum number of
non-missing arguments for the function to be evaluated. To do so,
append a dot and the number to the function name. For instance, to
specify a minimum of three valid arguments to the MEAN function, use the
name @code{MEAN.3}.
@cindex coefficient of variation
@cindex variation, coefficient of
@deftypefn {Function} {} CFVAR (@var{number}, @var{number}[, @dots{}])
Results in the coefficient of variation of the values of @var{number}.
This function requires at least two valid arguments to give a
non-missing result. (The coefficient of variation is the standard
deviation divided by the mean.)
@end deftypefn
@cindex maximum
@deftypefn {Function} {} MAX (@var{value}, @var{value}[, @dots{}])
Results in the value of the greatest @var{value}. The @var{value}s may
be numeric or string. Although at least two arguments must be given,
only one need be valid for MAX to give a non-missing result.
@end deftypefn
@cindex mean
@deftypefn {Function} {} MEAN (@var{number}, @var{number}[, @dots{}])
Results in the mean of the values of @var{number}. Although at least
two arguments must be given, only one need be valid for MEAN to give a
non-missing result.
@end deftypefn
@cindex minimum
@deftypefn {Function} {} MIN (@var{number}, @var{number}[, @dots{}])
Results in the value of the least @var{value}. The @var{value}s may
be numeric or string. Although at least two arguments must be given,
only one need be valid for MAX to give a non-missing result.
@end deftypefn
@cindex standard deviation
@cindex deviation, standard
@deftypefn {Function} {} SD (@var{number}, @var{number}[, @dots{}])
Results in the standard deviation of the values of @var{number}.
This function requires at least two valid arguments to give a
non-missing result.
@end deftypefn
@cindex sum
@deftypefn {Function} {} SUM (@var{number}, @var{number}[, @dots{}])
Results in the sum of the values of @var{number}. Although at least two
arguments must be given, only one need by valid for SUM to give a
non-missing result.
@end deftypefn
@cindex variance
@deftypefn {Function} {} VAR (@var{number}, @var{number}[, @dots{}])
Results in the variance of the values of @var{number}. This function
requires at least two valid arguments to give a non-missing result.
@end deftypefn
@deftypefn {Function} {} VARIANCE (@var{number}, @var{number}[, @dots{}])
Results in the variance of the values of @var{number}. This function
requires at least two valid arguments to give a non-missing result.
(Use VAR in preference to VARIANCE for reasons of portability.)
@end deftypefn
@node String Functions, Time & Date, Statistical Functions, Functions
@subsection String Functions
@cindex functions, string
@cindex string functions
String functions take various arguments and return various results.
@cindex concatenation
@cindex strings, concatenation of
@deftypefn {Function} {} CONCAT (@var{string}, @var{string}[, @dots{}])
Returns a string consisting of each @var{string} in sequence.
@code{CONCAT("abc", "def", "ghi")} has a value of @code{"abcdefghi"}.
The resultant string is truncated to a maximum of 255 characters.
@end deftypefn
@cindex searching strings
@deftypefn {Function} {} INDEX (@var{haystack}, @var{needle})
Returns a positive integer indicating the position of the first
occurrence @var{needle} in @var{haystack}. Returns 0 if @var{haystack}
does not contain @var{needle}. Returns system-missing if @var{needle}
is an empty string.
@end deftypefn
@deftypefn {Function} {} INDEX (@var{haystack}, @var{needle}, @var{divisor})
Divides @var{needle} into parts, each with length @var{divisor}.
Searches @var{haystack} for the first occurrence of each part, and
returns the smallest value. Returns 0 if @var{haystack} does not
contain any part in @var{needle}. It is an error if @var{divisor}
cannot be evenly divided into the length of @var{needle}. Returns
system-missing if @var{needle} is an empty string.
@end deftypefn
@cindex strings, finding length of
@deftypefn {Function} {} LENGTH (@var{string})
Returns the number of characters in @var{string}.
@end deftypefn
@cindex strings, case of
@deftypefn {Function} {} LOWER (@var{string})
Returns a string identical to @var{string} except that all uppercase
letters are changed to lowercase letters. The definitions of
``uppercase'' and ``lowercase'' are system-dependent.
@end deftypefn
@cindex strings, padding
@deftypefn {Function} {} LPAD (@var{string}, @var{length})
If @var{string} is at least @var{length} characters in length, returns
@var{string} unchanged. Otherwise, returns @var{string} padded with
spaces on the left side to length @var{length}. Returns an empty string
if @var{length} is system-missing, negative, or greater than 255.
@end deftypefn
@deftypefn {Function} {} LPAD (@var{string}, @var{length}, @var{padding})
If @var{string} is at least @var{length} characters in length, returns
@var{string} unchanged. Otherwise, returns @var{string} padded with
@var{padding} on the left side to length @var{length}. Returns an empty
string if @var{length} is system-missing, negative, or greater than 255, or
if @var{padding} does not contain exactly one character.
@end deftypefn
@cindex strings, trimming
@cindex whitespace, trimming
@deftypefn {Function} {} LTRIM (@var{string})
Returns @var{string}, after removing leading spaces. Other whitespace,
such as tabs, carriage returns, line feeds, and vertical tabs, is not
removed.
@end deftypefn
@deftypefn {Function} {} LTRIM (@var{string}, @var{padding})
Returns @var{string}, after removing leading @var{padding} characters.
If @var{padding} does not contain exactly one character, returns an
empty string.
@end deftypefn
@cindex numbers, converting from strings
@cindex strings, converting to numbers
@deftypefn {Function} {} NUMBER (@var{string})
Returns the number produced when @var{string} is interpreted according
to format F@var{x}.0, where @var{x} is the number of characters in
@var{string}. If @var{string} does not form a proper number,
system-missing is returned without an error message. Portability: none.
@end deftypefn
@deftypefn {Function} {} NUMBER (@var{string}, @var{format})
Returns the number produced when @var{string} is interpreted according
to format specifier @var{format}. Only the number of characters in
@var{string} specified by @var{format} are examined. For example,
@code{NUMBER("123", F3.0)} and @code{NUMBER("1234", F3.0)} both have
value 123. If @var{string} does not form a proper number,
system-missing is returned without an error message.
@end deftypefn
@cindex strings, searching backwards
@deftypefn {Function} {} RINDEX (@var{string}, @var{format})
Returns a positive integer indicating the position of the last
occurrence of @var{needle} in @var{haystack}. Returns 0 if
@var{haystack} does not contain @var{needle}. Returns system-missing if
@var{needle} is an empty string.
@end deftypefn
@deftypefn {Function} {} RINDEX (@var{haystack}, @var{needle}, @var{divisor})
Divides @var{needle} into parts, each with length @var{divisor}.
Searches @var{haystack} for the last occurrence of each part, and
returns the largest value. Returns 0 if @var{haystack} does not contain
any part in @var{needle}. It is an error if @var{divisor} cannot be
evenly divided into the length of @var{needle}. Returns system-missing
if @var{needle} is an empty string.
@end deftypefn
@cindex padding strings
@cindex strings, padding
@deftypefn {Function} {} RPAD (@var{string}, @var{length})
If @var{string} is at least @var{length} characters in length, returns
@var{string} unchanged. Otherwise, returns @var{string} padded with
spaces on the right to length @var{length}. Returns an empty string if
@var{length} is system-missing, negative, or greater than 255.
@end deftypefn
@deftypefn {Function} {} RPAD (@var{string}, @var{length}, @var{padding})
If @var{string} is at least @var{length} characters in length, returns
@var{string} unchanged. Otherwise, returns @var{string} padded with
@var{padding} on the right to length @var{length}. Returns an empty
string if @var{length} is system-missing, negative, or greater than 255,
or if @var{padding} does not contain exactly one character.
@end deftypefn
@cindex strings, trimming
@cindex whitespace, trimming
@deftypefn {Function} {} RTRIM (@var{string})
Returns @var{string}, after removing trailing spaces. Other types of
whitespace are not removed.
@end deftypefn
@deftypefn {Function} {} RTRIM (@var{string}, @var{padding})
Returns @var{string}, after removing trailing @var{padding} characters.
If @var{padding} does not contain exactly one character, returns an
empty string.
@end deftypefn
@cindex strings, converting from numbers
@cindex numbers, converting to strings
@deftypefn {Function} {} STRING (@var{number}, @var{format})
Returns a string corresponding to @var{number} in the format given by
format specifier @var{format}. For example, @code{STRING(123.56, F5.1)}
has the value @code{"123.6"}.
@end deftypefn
@cindex substrings
@cindex strings, taking substrings of
@deftypefn {Function} {} SUBSTR (@var{string}, @var{start})
Returns a string consisting of the value of @var{string} from position
@var{start} onward. Returns an empty string if @var{start} is system-missing
or has a value less than 1 or greater than the number of characters in
@var{string}.
@end deftypefn
@deftypefn {Function} {} SUBSTR (@var{string}, @var{start}, @var{count})
Returns a string consisting of the first @var{count} characters from
@var{string} beginning at position @var{start}. Returns an empty string
if @var{start} or @var{count} is system-missing, if @var{start} is less
than 1 or greater than the number of characters in @var{string}, or if
@var{count} is less than 1. Returns a string shorter than @var{count}
characters if @var{start} + @var{count} - 1 is greater than the number
of characters in @var{string}. Examples: @code{SUBSTR("abcdefg", 3, 2)}
has value @code{"cd"}; @code{SUBSTR("Ben Pfaff", 5, 10)} has the value
@code{"Pfaff"}.
@end deftypefn
@cindex case conversion
@cindex strings, case of
@deftypefn {Function} {} UPCASE (@var{string})
Returns @var{string}, changing lowercase letters to uppercase letters.
@end deftypefn
@node Time & Date, Miscellaneous Functions, String Functions, Functions
@subsection Time & Date Functions
@cindex functions, time & date
@cindex times
@cindex dates
@cindex dates, legal range of
The legal range of dates for use in PSPP is 15 Oct 1582
through 31 Dec 19999.
@cindex arguments, invalid
@cindex invalid arguments
@quotation
@strong{Please note:} Most time & date extraction functions will accept
invalid arguments:
@itemize @bullet
@item
Negative numbers in PSPP time format.
@item
Numbers less than 86,400 in PSPP date format.
@end itemize
However, sensible results are not guaranteed for these invalid values.
The given equivalents for these functions are definitely not guaranteed
for invalid values.
@end quotation
@quotation
@strong{Please note also:} The time & date construction
functions @strong{do} produce reasonable and useful results for
out-of-range values; these are not considered invalid.
@end quotation
@menu
* Time & Date Concepts:: How times & dates are defined and represented
* Time Construction:: TIME.@{DAYS HMS@}
* Time Extraction:: CTIME.@{DAYS HOURS MINUTES SECONDS@}
* Date Construction:: DATE.@{DMY MDY MOYR QYR WKYR YRDAY@}
* Date Extraction:: XDATE.@{DATE HOUR JDAY MDAY MINUTE MONTH
QUARTER SECOND TDAY TIME WEEK
WKDAY YEAR@}
@end menu
@node Time & Date Concepts, Time Construction, Time & Date, Time & Date
@subsubsection How times & dates are defined and represented
@cindex time, concepts
@cindex time, intervals
Times and dates are handled by PSPP as single numbers. A
@dfn{time} is an interval. PSPP measures times in seconds.
Thus, the following intervals correspond with the numeric values given:
@example
10 minutes 600
1 hour 3,600
1 day, 3 hours, 10 seconds 97,210
40 days 3,456,000
10010 d, 14 min, 24 s 864,864,864
@end example
@cindex dates, concepts
@cindex time, instants of
A @dfn{date}, on the other hand, is a particular instant in the past or
the future. PSPP represents a date as a number of seconds after the
midnight that separated 8 Oct 1582 and 9 Oct 1582. (Please note that 15
Oct 1582 immediately followed 9 Oct 1582.) Thus, the midnights before
the dates given below correspond with the numeric PSPP dates given:
@example
15 Oct 1582 86,400
4 Jul 1776 6,113,318,400
1 Jan 1900 10,010,390,400
1 Oct 1978 12,495,427,200
24 Aug 1995 13,028,601,600
@end example
@cindex time, mathematical properties of
@cindex mathematics, applied to times & dates
@cindex dates, mathematical properties of
@noindent
Please note:
@itemize @bullet
@item
A time may be added to, or subtracted from, a date, resulting in a date.
@item
The difference of two dates may be taken, resulting in a time.
@item
Two times may be added to, or subtracted from, each other, resulting in
a time.
@end itemize
(Adding two dates does not produce a useful result.)
Since times and dates are merely numbers, the ordinary addition and
subtraction operators are employed for these purposes.
@quotation
@strong{Please note:} Many dates and times have extremely large
values---just look at the values above. Thus, it is not a good idea to
take powers of these values; also, the accuracy of some procedures may
be affected. If necessary, convert times or dates in seconds to some
other unit, like days or years, before performing analysis.
@end quotation
@node Time Construction, Time Extraction, Time & Date Concepts, Time & Date
@subsubsection Functions that Produce Times
@cindex times, constructing
@cindex constructing times
These functions take numeric arguments and produce numeric results in
PSPP time format.
@cindex days
@cindex time, in days
@deftypefn {Function} {} TIME.DAYS (@var{ndays})
Results in a time value corresponding to @var{ndays} days.
(@code{TIME.DAYS(@var{x})} is equivalent to @code{@var{x} * 60 * 60 *
24}.)
@end deftypefn
@cindex hours-minutes-seconds
@cindex time, in hours-minutes-seconds
@deftypefn {Function} {} TIME.HMS (@var{nhours}, @var{nmins}, @var{nsecs})
Results in a time value corresponding to @var{nhours} hours, @var{nmins}
minutes, and @var{nsecs} seconds. (@code{TIME.HMS(@var{h}, @var{m},
@var{s})} is equivalent to @code{@var{h}*60*60 + @var{m}*60 +
@var{s}}.)
@end deftypefn
@node Time Extraction, Date Construction, Time Construction, Time & Date
@subsubsection Functions that Examine Times
@cindex extraction, of time
@cindex time examination
@cindex examination, of times
@cindex time, lengths of
These functions take numeric arguments in PSPP time format and
give numeric results.
@cindex days
@cindex time, in days
@deftypefn {Function} {} CTIME.DAYS (@var{time})
Results in the number of days and fractional days in @var{time}.
(@code{CTIME.DAYS(@var{x})} is equivalent to @code{@var{x}/60/60/24}.)
@end deftypefn
@cindex hours
@cindex time, in hours
@deftypefn {Function} {} CTIME.HOURS (@var{time})
Results in the number of hours and fractional hours in @var{time}.
(@code{CTIME.HOURS(@var{x})} is equivalent to @code{@var{x}/60/60}.)
@end deftypefn
@cindex minutes
@cindex time, in minutes
@deftypefn {Function} {} CTIME.MINUTES (@var{time})
Results in the number of minutes and fractional minutes in @var{time}.
(@code{CTIME.MINUTES(@var{x})} is equivalent to @code{@var{x}/60}.)
@end deftypefn
@cindex seconds
@cindex time, in seconds
@deftypefn {Function} {} CTIME.SECONDS (@var{time})
Results in the number of seconds and fractional seconds in @var{time}.
(@code{CTIME.SECONDS} does nothing; @code{CTIME.SECONDS(@var{x})} is
equivalent to @code{@var{x}}.)
@end deftypefn
@node Date Construction, Date Extraction, Time Extraction, Time & Date
@subsubsection Functions that Produce Dates
@cindex dates, constructing
@cindex constructing dates
@cindex arguments, of date construction functions
These functions take numeric arguments and give numeric results in the
PSPP date format. Arguments taken by these functions are:
@table @var
@item day
Refers to a day of the month between 1 and 31.
@item month
Refers to a month of the year between 1 and 12.
@item quarter
Refers to a quarter of the year between 1 and 4. The quarters of the
year begin on the first days of months 1, 4, 7, and 10.
@item week
Refers to a week of the year between 1 and 53.
@item yday
Refers to a day of the year between 1 and 366.
@item year
Refers to a year between 1582 and 19999.
@end table
@cindex arguments, invalid
If these functions' arguments are out-of-range, they are correctly
normalized before conversion to date format. Non-integers are rounded
toward zero.
@cindex day-month-year
@cindex dates, day-month-year
@deftypefn {Function} {} DATE.DMY (@var{day}, @var{month}, @var{year})
@deftypefnx {Function} {} DATE.MDY (@var{month}, @var{day}, @var{year})
Results in a date value corresponding to the midnight before day
@var{day} of month @var{month} of year @var{year}.
@end deftypefn
@cindex month-year
@cindex dates, month-year
@deftypefn {Function} {} DATE.MOYR (@var{month}, @var{year})
Results in a date value corresponding to the midnight before the first
day of month @var{month} of year @var{year}.
@end deftypefn
@cindex quarter-year
@cindex dates, quarter-year
@deftypefn {Function} {} DATE.QYR (@var{quarter}, @var{year})
Results in a date value corresponding to the midnight before the first
day of quarter @var{quarter} of year @var{year}.
@end deftypefn
@cindex week-year
@cindex dates, week-year
@deftypefn {Function} {} DATE.WKYR (@var{week}, @var{year})
Results in a date value corresponding to the midnight before the first
day of week @var{week} of year @var{year}.
@end deftypefn
@cindex year-day
@cindex dates, year-day
@deftypefn {Function} {} DATE.YRDAY (@var{year}, @var{yday})
Results in a date value corresponding to the midnight before day
@var{yday} of year @var{year}.
@end deftypefn
@node Date Extraction, , Date Construction, Time & Date
@subsubsection Functions that Examine Dates
@cindex extraction, of dates
@cindex date examination
@cindex arguments, of date extraction functions
These functions take numeric arguments in PSPP date or time
format and give numeric results. These names are used for arguments:
@table @var
@item date
A numeric value in PSPP date format.
@item time
A numeric value in PSPP time format.
@item time-or-date
A numeric value in PSPP time or date format.
@end table
@cindex days
@cindex dates, in days
@cindex time, in days
@deftypefn {Function} {} XDATE.DATE (@var{time-or-date})
For a time, results in the time corresponding to the number of whole
days @var{date-or-time} includes. For a date, results in the date
corresponding to the latest midnight at or before @var{date-or-time};
that is, gives the date that @var{date-or-time} is in.
(XDATE.DATE(@var{x}) is equivalent to TRUNC(@var{x}/86400)*86400.)
Applying this function to a time is a Portability: none feature.
@end deftypefn
@cindex hours
@cindex dates, in hours
@cindex time, in hours
@deftypefn {Function} {} XDATE.HOUR (@var{time-or-date})
For a time, results in the number of whole hours beyond the number of
whole days represented by @var{date-or-time}. For a date, results in
the hour (as an integer between 0 and 23) corresponding to
@var{date-or-time}. (XDATE.HOUR(@var{x}) is equivalent to
MOD(TRUNC(@var{x}/3600),24)) Applying this function to a time is a
Portability: none feature.
@end deftypefn
@cindex day of the year
@cindex dates, day of the year
@deftypefn {Function} {} XDATE.JDAY(@var{date})
Results in the day of the year (as an integer between 1 and 366)
corresponding to @var{date}.
@end deftypefn
@cindex day of the month
@cindex dates, day of the month
@deftypefn {Function} {} XDATE.MDAY(@var{date})
Results in the day of the month (as an integer between 1 and 31)
corresponding to @var{date}.
@end deftypefn
@cindex minutes
@cindex dates, in minutes
@cindex time, in minutes
@deftypefn {Function} {} XDATE.MINUTE(@var{time-or-date})
Results in the number of minutes (as an integer between 0 and 59) after
the last hour in @var{time-or-date}. (XDATE.MINUTE(@var{x}) is
equivalent to MOD(TRUNC(@var{x}/60),60)) Applying this function to a
time is a Portability: none feature.
@end deftypefn
@cindex months
@cindex dates, in months
@deftypefn {Function} {} XDATE.MONTH(@var{date})
Results in the month of the year (as an integer between 1 and 12)
corresponding to @var{date}.
@end deftypefn
@cindex quarters
@cindex dates, in quarters
@deftypefn {Function} {} XDATE.QUARTER(@var{date})
Results in the quarter of the year (as an integer between 1 and 4)
corresponding to @var{date}.
@end deftypefn
@cindex seconds
@cindex dates, in seconds
@cindex time, in seconds
@deftypefn {Function} {} XDATE.SECOND(@var{time-or-date})
Results in the number of whole seconds after the last whole minute (as
an integer between 0 and 59) in @var{time-or-date}.
(XDATE.SECOND(@var{x}) is equivalent to MOD(@var{x}, 60).) Applying
this function to a time is a Portability: none feature.
@end deftypefn
@cindex days
@cindex times, in days
@deftypefn {Function} {} XDATE.TDAY(@var{time})
Results in the number of whole days (as an integer) in @var{time}.
(XDATE.TDAY(@var{x}) is equivalent to TRUNC(@var{x}/86400).)
@end deftypefn
@cindex time
@cindex dates, time of day
@deftypefn {Function} {} XDATE.TIME(@var{date})
Results in the time of day at the instant corresponding to @var{date},
in PSPP time format. This is the number of seconds since
midnight on the day corresponding to @var{date}. (XDATE.TIME(@var{x}) is
equivalent to TRUNC(@var{x}/86400)*86400.)
@end deftypefn
@cindex week
@cindex dates, in weeks
@deftypefn {Function} {} XDATE.WEEK(@var{date})
Results in the week of the year (as an integer between 1 and 53)
corresponding to @var{date}.
@end deftypefn
@cindex day of the week
@cindex weekday
@cindex dates, day of the week
@cindex dates, in weekdays
@deftypefn {Function} {} XDATE.WKDAY(@var{date})
Results in the day of week (as an integer between 1 and 7) corresponding
to @var{date}. The days of the week are:
@table @asis
@item 1
Sunday
@item 2
Monday
@item 3
Tuesday
@item 4
Wednesday
@item 5
Thursday
@item 6
Friday
@item 7
Saturday
@end table
@end deftypefn
@cindex years
@cindex dates, in years
@deftypefn {Function} {} XDATE.YEAR (@var{date})
Returns the year (as an integer between 1582 and 19999) corresponding to
@var{date}.
@end deftypefn
@node Miscellaneous Functions, Functions Not Implemented, Time & Date, Functions
@subsection Miscellaneous Functions
@cindex functions, miscellaneous
Miscellaneous functions take various arguments and produce various
results.
@cindex cross-case function
@cindex function, cross-case
@deftypefn {Function} {} LAG (@var{variable})
@var{variable} must be a numeric or string variable name. @code{LAG}
results in the value of that variable for the case before the current
one. In case-selection procedures, @code{LAG} results in the value of
the variable for the last case selected. Results in system-missing (for
numeric variables) or blanks (for string variables) for the first case
or before any cases are selected.
@end deftypefn
@deftypefn {Function} {} LAG (@var{variable}, @var{ncases})
@var{variable} must be a numeric or string variable name. @var{ncases}
must be a small positive constant integer, although there is no explicit
limit. (Use of a large value for @var{ncases} will increase memory
consumption, since PSPP must keep @var{ncases} cases in memory.)
@code{LAG (@var{variable}, @var{ncases}} results in the value of
@var{variable} that is @var{ncases} before the case currently being
processed. See @code{LAG (@var{variable})} above for more details.
@end deftypefn
@cindex date, Julian
@cindex Julian date
@deftypefn {Function} {} YRMODA (@var{year}, @var{month}, @var{day})
@var{year} is a year between 0 and 199 or 1582 and 19999. @var{month} is
a month between 1 and 12. @var{day} is a day between 1 and 31. If
@var{month} or @var{day} is out-of-range, it changes the next higher
unit. For instance, a @var{day} of 0 refers to the last day of the
previous month, and a @var{month} of 13 refers to the first month of the
next year. @var{year} must be in range. If @var{year} is between 0 and
199, 1900 is added. @var{year}, @var{month}, and @var{day} must all be
integers.
@code{YRMODA} results in the number of days between 15 Oct 1582 and
the date specified, plus one. The date passed to @code{YRMODA} must be
on or after 15 Oct 1582. 15 Oct 1582 has a value of 1.
@end deftypefn
@node Functions Not Implemented, , Miscellaneous Functions, Functions
@subsection Functions Not Implemented
@cindex functions, not implemented
@cindex not implemented
@cindex features, not implemented
These functions are not yet implemented and thus not yet documented,
since it's a hassle.
@findex CDF.xxx
@findex CDFNORM
@findex IDF.xxx
@findex NCDF.xxx
@findex PROBIT
@findex RV.xxx
@itemize @bullet
@item
@code{CDF.xxx}
@item
@code{CDFNORM}
@item
@code{IDF.xxx}
@item
@code{NCDF.xxx}
@item
@code{PROBIT}
@item
@code{RV.xxx}
@end itemize
@node Order of Operations, , Functions, Expressions
@section Operator Precedence
@cindex operator precedence
@cindex precedence, operator
@cindex order of operations
@cindex operations, order of
The following table describes operator precedence. Smaller-numbered
levels in the table have higher precedence. Within a level, operations
are performed from left to right, except for level 2 (exponentiation),
where operations are performed from right to left. If an operator
appears in the table in two places (@code{-}), the first occurrence is
unary, the second is binary.
@enumerate
@item
@code{( )}
@item
@code{**}
@item
@code{-}
@item
@code{* /}
@item
@code{+ -}
@item
@code{EQ GE GT LE LT NE}
@item
@code{AND NOT OR}
@end enumerate
@node Data Input and Output, System and Portable Files, Expressions, Top
@chapter Data Input and Output
@cindex input
@cindex output
@cindex data
Data is the focus of the PSPP language. This chapter examines
the PSPP commands for defining variables and reading and writing data.
@quotation
@strong{Please note:} Data is not actually read until a procedure is
executed. These commands tell PSPP how to read data, but they
do not @emph{cause} PSPP to read data.
@end quotation
@menu
* BEGIN DATA:: Embed data within a syntax file.
* CLEAR TRANSFORMATIONS:: Clear pending transformations.
* DATA LIST:: Fundamental data reading command.
* END CASE:: Output the current case.
* END FILE:: Terminate the current input program.
* FILE HANDLE:: Support for fixed-length records.
* INPUT PROGRAM:: Support for complex input programs.
* LIST:: List cases in the active file.
* MATRIX DATA:: Read matrices in text format.
* NEW FILE:: Clear the active file and dictionary.
* PRINT:: Display values in print formats.
* PRINT EJECT:: Eject the current page then print.
* PRINT SPACE:: Print blank lines.
* REREAD:: Take another look at the previous input line.
* REPEATING DATA:: Multiple cases on a single line.
* WRITE:: Display values in write formats.
@end menu
@node BEGIN DATA, CLEAR TRANSFORMATIONS, Data Input and Output, Data Input and Output
@section BEGIN DATA
@vindex BEGIN DATA
@vindex END DATA
@cindex Embedding data in syntax files
@cindex Data, embedding in syntax files
@display
BEGIN DATA.
@dots{}
END DATA.
@end display
BEGIN DATA and END DATA can be used to embed raw ASCII data in a PSPP
syntax file. DATA LIST or another input procedure must be used before
BEGIN DATA (@pxref{DATA LIST}). BEGIN DATA and END DATA must be used
together. The END DATA command must appear by itself on a single line,
with no leading whitespace and exactly one space between the words
@code{END} and @code{DATA}, followed immediately by the terminal dot,
like this:
@example
END DATA.
@end example
@node CLEAR TRANSFORMATIONS, DATA LIST, BEGIN DATA, Data Input and Output
@section CLEAR TRANSFORMATIONS
@vindex CLEAR TRANSFORMATIONS
@display
CLEAR TRANSFORMATIONS.
@end display
The CLEAR TRANSFORMATIONS command clears out all pending
transformations. It does not cancel the current input program. It is
valid only when PSPP is interactive, not in syntax files.
@node DATA LIST, END CASE, CLEAR TRANSFORMATIONS, Data Input and Output
@section DATA LIST
@vindex DATA LIST
@cindex reading data from a file
@cindex data, reading from a file
@cindex data, embedding in syntax files
@cindex embedding data in syntax files
Used to read text or binary data, DATA LIST is the most
fundamental data-reading command. Even the more sophisticated input
methods use DATA LIST commands as a building block.
Understanding DATA LIST is important to understanding how to use
PSPP to read your data files.
There are two major variants of DATA LIST, which are fixed
format and free format. In addition, free format has a minor variant,
list format, which is discussed in terms of its differences from vanilla
free format.
Each form of DATA LIST is described in detail below.
@menu
* DATA LIST FIXED:: Fixed columnar locations for data.
* DATA LIST FREE:: Any spacing you like.
* DATA LIST LIST:: Each case must be on a single line.
@end menu
@node DATA LIST FIXED, DATA LIST FREE, DATA LIST, DATA LIST
@subsection DATA LIST FIXED
@vindex DATA LIST FIXED
@cindex reading fixed-format data
@cindex fixed-format data, reading
@cindex data, fixed-format, reading
@cindex embedding fixed-format data
@display
DATA LIST [FIXED]
@{TABLE,NOTABLE@}
FILE='filename'
RECORDS=record_count
END=end_var
/[line_no] var_spec@dots{}
where each var_spec takes one of the forms
var_list start-end [type_spec]
var_list (fortran_spec)
@end display
DATA LIST FIXED is used to read data files that have values at fixed
positions on each line of single-line or multiline records. The
keyword FIXED is optional.
The FILE subcommand must be used if input is to be taken from an
external file. It may be used to specify a filename as a string or a
file handle (@pxref{FILE HANDLE}). If the FILE subcommand is not used,
then input is assumed to be specified within the command file using
BEGIN DATA@dots{}END DATA (@pxref{BEGIN DATA}).
The optional RECORDS subcommand, which takes a single integer as an
argument, is used to specify the number of lines per record. If RECORDS
is not specified, then the number of lines per record is calculated from
the list of variable specifications later in the DATA LIST command.
The END subcommand is only useful in conjunction with the INPUT PROGRAM
input procedure, and for that reason it is not discussed here
(@pxref{INPUT PROGRAM}).
DATA LIST can optionally output a table describing how the data file
will be read. The TABLE subcommand enables this output, and NOTABLE
disables it. The default is to output the table.
The list of variables to be read from the data list must come last in
the DATA LIST command. Each line in the data record is introduced by a
slash (@samp{/}). Optionally, a line number may follow the slash.
Following, any number of variable specifications may be present.
Each variable specification consists of a list of variable names
followed by a description of their location on the input line. Sets of
variables may specified using DATA LIST's TO convention (@pxref{Sets of
Variables}). There are two ways to specify the location of the variable
on the line: SPSS style and FORTRAN style.
With SPSS style, the starting column and ending column for the field
are specified after the variable name, separated by a dash (@samp{-}).
For instance, the third through fifth columns on a line would be
specified @samp{3-5}. By default, variables are considered to be in
@samp{F} format (@pxref{Input/Output Formats}). (This default can be
changed; see @ref{SET} for more information.)
When using SPSS style, to use a variable format other than the default,
specify the format type in parentheses after the column numbers. For
instance, for alphanumeric @samp{A} format, use @samp{(A)}.
In addition, implied decimal places can be specified in parentheses
after the column numbers. As an example, suppose that a data file has a
field in which the characters @samp{1234} should be interpreted as
having the value 12.34. Then this field has two implied decimal places,
and the corresponding specification would be @samp{(2)}. If a field
that has implied decimal places contains a decimal point, then the
implied decimal places are not applied.
Changing the variable format and adding implied decimal places can be
done together; for instance, @samp{(N,5)}.
When using SPSS style, the input and output width of each variable is
computed from the field width. The field width must be evenly divisible
into the number of variables specified.
FORTRAN style is an altogether different approach to specifying field
locations. With this approach, a list of variable input format
specifications, separated by commas, are placed after the variable names
inside parentheses. Each format specifier advances as many characters
into the input line as it uses.
In addition to the standard format specifiers (@pxref{Input/Output
Formats}), FORTRAN style defines some extensions:
@table @asis
@item @code{X}
Advance the current column on this line by one character position.
@item @code{T}@var{x}
Set the current column on this line to column @var{x}, with column
numbers considered to begin with 1 at the left margin.
@item @code{NEWREC}@var{x}
Skip forward @var{x} lines in the current record, resetting the active
column to the left margin.
@item Repeat count
Any format specifier may be preceded by a number. This causes the
action of that format specifier to be repeated the specified number of
times.
@item (@var{spec1}, @dots{}, @var{specN})
Group the given specifiers together. This is most useful when preceded
by a repeat count. Groups may be nested arbitrarily.
@end table
FORTRAN and SPSS styles may be freely intermixed. SPSS style leaves the
active column immediately after the ending column specified. Record
motion using @code{NEWREC} in FORTRAN style also applies to later
FORTRAN and SPSS specifiers.
@menu
* DATA LIST FIXED Examples:: Examples of DATA LIST FIXED.
@end menu
@node DATA LIST FIXED Examples, , DATA LIST FIXED, DATA LIST FIXED
@unnumberedsubsubsec Examples
@enumerate
@item
@example
DATA LIST TABLE /NAME 1-10 (A) INFO1 TO INFO3 12-17 (1).
BEGIN DATA.
John Smith 102311
Bob Arnold 122015
Bill Yates 918 6
END DATA.
@end example
Defines the following variables:
@itemize @bullet
@item
@code{NAME}, a 10-character-wide long string variable, in columns 1
through 10.
@item
@code{INFO1}, a numeric variable, in columns 12 through 13.
@item
@code{INFO2}, a numeric variable, in columns 14 through 15.
@item
@code{INFO3}, a numeric variable, in columns 16 through 17.
@end itemize
The @code{BEGIN DATA}/@code{END DATA} commands cause three cases to be
defined:
@example
Case NAME INFO1 INFO2 INFO3
1 John Smith 10 23 11
2 Bob Arnold 12 20 15
3 Bill Yates 9 18 6
@end example
The @code{TABLE} keyword causes PSPP to print out a table
describing the four variables defined.
@item
@example
DAT LIS FIL="survey.dat"
/ID 1-5 NAME 7-36 (A) SURNAME 38-67 (A) MINITIAL 69 (A)
/Q01 TO Q50 7-56
/.
@end example
Defines the following variables:
@itemize @bullet
@item
@code{ID}, a numeric variable, in columns 1-5 of the first record.
@item
@code{NAME}, a 30-character long string variable, in columns 7-36 of the
first record.
@item
@code{SURNAME}, a 30-character long string variable, in columns 38-67 of
the first record.
@item
@code{MINITIAL}, a 1-character short string variable, in column 69 of
the first record.
@item
Fifty variables @code{Q01}, @code{Q02}, @code{Q03}, @dots{}, @code{Q49},
@code{Q50}, all numeric, @code{Q01} in column 7, @code{Q02} in column 8,
@dots{}, @code{Q49} in column 55, @code{Q50} in column 56, all in the second
record.
@end itemize
Cases are separated by a blank record.
Data is read from file @file{survey.dat} in the current directory.
This example shows keywords abbreviated to their first 3 letters.
@end enumerate
@node DATA LIST FREE, DATA LIST LIST, DATA LIST FIXED, DATA LIST
@subsection DATA LIST FREE
@vindex DATA LIST FREE
@display
DATA LIST FREE
[@{NOTABLE,TABLE@}]
FILE='filename'
END=end_var
/var_spec@dots{}
where each var_spec takes one of the forms
var_list [(type_spec)]
var_list *
@end display
In free format, the input data is structured as a series of comma- or
whitespace-delimited fields (end of line is one form of whitespace; it
is not treated specially). Field contents may be surrounded by matched
pairs of apostrophes (@samp{'}) or quotes (@samp{"}), or they may be
unenclosed. For any type of field leading white space (up to the
apostrophe or quote, if any) is not included in the field.
Multiple consecutive delimiters are equivalent to a single delimiter.
To specify an empty field, write an empty set of single or double
quotes; for instance, @samp{""}.
The NOTABLE and TABLE subcommands are as in DATA LIST FIXED above.
NOTABLE is the default.
The FILE and END subcommands are as in DATA LIST FIXED above.
The variables to be parsed are given as a single list of variable names.
This list must be introduced by a single slash (@samp{/}). The set of
variable names may contain format specifications in parentheses
(@pxref{Input/Output Formats}). Format specifications apply to all
variables back to the previous parenthesized format specification.
In addition, an asterisk may be used to indicate that all variables
preceding it are to have input/output format @samp{F8.0}.
Specified field widths are ignored on input, although all normal limits
on field width apply, but they are honored on output.
@node DATA LIST LIST, , DATA LIST FREE, DATA LIST
@subsection DATA LIST LIST
@vindex DATA LIST LIST
@display
DATA LIST LIST
[@{NOTABLE,TABLE@}]
FILE='filename'
END=end_var
/var_spec@dots{}
where each var_spec takes one of the forms
var_list [(type_spec)]
var_list *
@end display
Syntactically and semantically, DATA LIST LIST is equivalent to DATA
LIST FREE, with one exception: each input line is expected to correspond
to exactly one input record. If more or fewer fields are found on an
input line than expected, an appropriate diagnostic is issued.
@node END CASE, END FILE, DATA LIST, Data Input and Output
@section END CASE
@vindex END CASE
@display
END CASE.
@end display
END CASE is used within INPUT PROGRAM to output the current case.
@xref{INPUT PROGRAM}.
@node END FILE, FILE HANDLE, END CASE, Data Input and Output
@section END FILE
@vindex END FILE
@display
END FILE.
@end display
END FILE is used within INPUT PROGRAM to terminate the current input
program. @xref{INPUT PROGRAM}.
@node FILE HANDLE, INPUT PROGRAM, END FILE, Data Input and Output
@section FILE HANDLE
@vindex FILE HANDLE
@display
FILE HANDLE handle_name
/NAME='filename'
/RECFORM=@{VARIABLE,FIXED,SPANNED@}
/LRECL=rec_len
/MODE=@{CHARACTER,IMAGE,BINARY,MULTIPUNCH,360@}
@end display
Use the FILE HANDLE command to define the attributes of a file that does
not use conventional variable-length records terminated by newline
characters.
Specify the file handle name as an identifier. Any given identifier may
only appear once in a PSPP run. File handles may not be reassigned to a
different file. The file handle name must immediately follow the FILE
HANDLE command name.
The NAME subcommand specifies the name of the file associated with the
handle. It is the only required subcommand.
The RECFORM subcommand specifies how the file is laid out. VARIABLE
specifies variable-length lines terminated with newlines, and it is the
default. FIXED specifies fixed-length records. SPANNED is not
supported.
LRECL specifies the length of fixed-length records. It is required if
@code{/RECFORM FIXED} is specified.
MODE specifies a file mode. CHARACTER, the default, causes the data
file to be opened in ANSI C text mode. BINARY causes the data file to
be opened in ANSI C binary mode. The other possibilities are not
supported.
@node INPUT PROGRAM, LIST, FILE HANDLE, Data Input and Output
@section INPUT PROGRAM
@vindex INPUT PROGRAM
@display
INPUT PROGRAM.
@dots{} input commands @dots{}
END INPUT PROGRAM.
@end display
The INPUT PROGRAM@dots{}END INPUT PROGRAM construct is used to specify a
complex input program. By placing data input commands within INPUT
PROGRAM, PSPP programs can take advantage of more complex file
structures than available by using DATA LIST by itself.
The first sort of extended input program is to simply put multiple DATA
LIST commands within the INPUT PROGRAM. This will cause all of the data
files to be read in parallel. Input will stop when end of file is
reached on any of the data files.
Transformations, such as conditional and looping constructs, can also be
included within an INPUT PROGRAM. These can be used to combine input
from several data files in more complex ways. However, input will still
stop when end of file is reached on any of the data files.
To prevent INPUT PROGRAM from terminating at the first end of file, use
the END subcommand on DATA LIST. This subcommand takes a variable name,
which should be a numeric scratch variable (@pxref{Scratch Variables}).
(It need not be a scratch variable but otherwise the results can be
surprising.) The value of this variable is set to 0 when reading the
data file, or 1 when end of file is encountered.
Some additional commands are useful in conjunction with INPUT PROGRAM.
END CASE is the first one. Normally each loop through the INPUT PROGRAM
structure produces one case. But with END CASE you can control exactly
when cases are output. When END CASE is used, looping from the end of
INPUT PROGRAM to the beginning does not cause a case to be output.
END FILE is the other command. When the END subcommand is used on DATA
LIST, there is no way for the INPUT PROGRAM construct to stop looping,
so an infinite loop results. The END FILE command, when executed,
stops the flow of input data and passes out of the INPUT PROGRAM
structure.
All this is very confusing. A few examples should help to clarify.
@example
INPUT PROGRAM.
DATA LIST NOTABLE FILE='a.data'/X 1-10.
DATA LIST NOTABLE FILE='b.data'/Y 1-10.
END INPUT PROGRAM.
LIST.
@end example
The example above reads variable X from file @file{a.data} and variable
Y from file @file{b.data}. If one file is shorter than the other then
the extra data in the longer file is ignored.
@example
INPUT PROGRAM.
NUMERIC #A #B.
DO IF NOT #A.
DATA LIST NOTABLE END=#A FILE='a.data'/X 1-10.
END IF.
DO IF NOT #B.
DATA LIST NOTABLE END=#B FILE='b.data'/Y 1-10.
END IF.
DO IF #A AND #B.
END FILE.
END IF.
END CASE.
END INPUT PROGRAM.
LIST.
@end example
This example reads variable X from @file{a.data} and variable Y from
@file{b.data}. If one file is shorter than the other then the missing
field is set to the system-missing value alongside the present value for
the remaining length of the longer file.
@example
INPUT PROGRAM.
NUMERIC #A #B.
DO IF #A.
DATA LIST NOTABLE END=#B FILE='b.data'/X 1-10.
DO IF #B.
END FILE.
ELSE.
END CASE.
END IF.
ELSE.
DATA LIST NOTABLE END=#A FILE='a.data'/X 1-10.
DO IF NOT #A.
END CASE.
END IF.
END IF.
END INPUT PROGRAM.
LIST.
@end example
The above example reads data from file @file{a.data}, then from
@file{b.data}, and concatenates them into a single active file.
@example
INPUT PROGRAM.
NUMERIC #EOF.
LOOP IF NOT #EOF.
DATA LIST NOTABLE END=#EOF FILE='a.data'/X 1-10.
DO IF NOT #EOF.
END CASE.
END IF.
END LOOP.
COMPUTE #EOF = 0.
LOOP IF NOT #EOF.
DATA LIST NOTABLE END=#EOF FILE='b.data'/X 1-10.
DO IF NOT #EOF.
END CASE.
END IF.
END LOOP.
END FILE.
END INPUT PROGRAM.
LIST.
@end example
The above example does the same thing as the previous example, in a
different way.
@example
INPUT PROGRAM.
LOOP #I=1 TO 50.
COMPUTE X=UNIFORM(10).
END CASE.
END LOOP.
END FILE.
END INPUT PROGRAM.
LIST/FORMAT=NUMBERED.
@end example
The above example causes an active file to be created consisting of 50
random variates between 0 and 10.
@node LIST, MATRIX DATA, INPUT PROGRAM, Data Input and Output
@section LIST
@vindex LIST
@display
LIST
/VARIABLES=var_list
/CASES=FROM start_index TO end_index BY incr_index
/FORMAT=@{UNNUMBERED,NUMBERED@} @{WRAP,SINGLE@}
@{NOWEIGHT,WEIGHT@}
@end display
The LIST procedure prints the values of specified variables to the
listing file.
The VARIABLES subcommand specifies the variables whose values are to be
printed. Keyword VARIABLES is optional. If VARIABLES subcommand is not
specified then all variables in the active file are printed.
The CASES subcommand can be used to specify a subset of cases to be
printed. Specify FROM and the case number of the first case to print,
TO and the case number of the last case to print, and BY and the number
of cases to advance between printing cases, or any subset of those
settings. If CASES is not specified then all cases are printed.
The FORMAT subcommand can be used to change the output format. NUMBERED
will print case numbers along with each case; UNNUMBERED, the default,
causes the case numbers to be omitted. The WRAP and SINGLE settings are
currently not used. WEIGHT will cause case weights to be printed along
with variable values; NOWEIGHT, the default, causes case weights to be
omitted from the output.
Case numbers start from 1. They are counted after all transformations
have been considered.
LIST will attempt to fit all the values on a single line. If necessary,
variable names will be display vertically in order to fit. If values
cannot fit on a single line, then a multi-line format will be used.
LIST is a procedure. It causes the data to be read.
@node MATRIX DATA, NEW FILE, LIST, Data Input and Output
@section MATRIX DATA
@vindex MATRIX DATA
@display
MATRIX DATA
/VARIABLES=var_list
/FILE='filename'
/FORMAT=@{LIST,FREE@} @{LOWER,UPPER,FULL@} @{DIAGONAL,NODIAGONAL@}
/SPLIT=@{new_var,var_list@}
/FACTORS=var_list
/CELLS=n_cells
/N=n
/CONTENTS=@{N_VECTOR,N_SCALAR,N_MATRIX,MEAN,STDDEV,COUNT,MSE,
DFE,MAT,COV,CORR,PROX@}
@end display
The MATRIX DATA command reads square matrices in one of several textual
formats. MATRIX DATA clears the dictionary and replaces it and reads a
data file.
Use VARIABLES to specify the variables that form the rows and columns of
the matrices. You may not specify a variable named VARNAME_. You
should specify VARIABLES first.
Specify the file to read on FILE, either as a file name string or a file
handle (@pxref{FILE HANDLE}). If FILE is not specified then matrix data
must immediately follow MATRIX DATA with a BEGIN DATA@dots{}END DATA
construct (@pxref{BEGIN DATA}).
The FORMAT subcommand specifies how the matrices are formatted. LIST,
the default, indicates that there is one line per row of matrix data;
FREE allows single matrix rows to be broken across multiple lines. This
is analogous to the difference between DATA LIST FREE and DATA LIST LIST
(@pxref{DATA LIST}). LOWER, the default, indicates that the lower
triangle of the matrix is given; UPPER indicates the upper triangle; and
FULL indicates that the entire matrix is given. DIAGONAL, the default,
indicates that the diagonal is part of the data; NODIAGONAL indicates
that it is omitted. DIAGONAL/NODIAGONAL have no effect when FULL is
specified.
The SPLIT subcommand is used to specify SPLIT FILE variables for the
input matrices (@pxref{SPLIT FILE}). Specify either a single variable
not specified on VARIABLES, or one or more variables that are specified
on VARIABLES. In the former case, the SPLIT values are not present in
the data and ROWTYPE_ may not be specified on VARIABLES. In the latter
case, the SPLIT values are present in the data.
Specify a list of factor variables on FACTORS. Factor variables must
also be listed on VARIABLES. Factor variables are used when there are
some variables where, for each possible combination of their values,
statistics on the matrix variables are included in the data.
If FACTORS is specified and ROWTYPE_ is not specified on VARIABLES, the
CELLS subcommand is required. Specify the number of factor variable
combinations that are given. For instance, if factor variable A has 2
values and factor variable B has 3 values, specify 6.
The N subcommand specifies a population number of observations. When N
is specified, one N record is output for each SPLIT FILE.
Use CONTENTS to specify what sort of information the matrices include.
Each possible option is described in more detail below. When ROWTYPE_
is specified on VARIABLES, CONTENTS is optional; otherwise, if CONTENTS
is not specified then /CONTENTS=CORR is assumed.
@table @asis
@item N
@item N_VECTOR
Number of observations as a vector, one value for each variable.
@item N_SCALAR
Number of observations as a single value.
@item N_MATRIX
Matrix of counts.
@item MEAN
Vector of means.
@item STDDEV
Vector of standard deviations.
@item COUNT
Vector of counts.
@item MSE
Vector of mean squared errors.
@item DFE
Vector of degrees of freedom.
@item MAT
Generic matrix.
@item COV
Covariance matrix.
@item CORR
Correlation matrix.
@item PROX
Proximities matrix.
@end table
The exact semantics of the matrices read by MATRIX DATA are complex.
Right now MATRIX DATA isn't too useful due to a lack of procedures
accepting or producing related data, so these semantics aren't
documented. Later, they'll be described here in detail.
@node NEW FILE, PRINT, MATRIX DATA, Data Input and Output
@section NEW FILE
@vindex NEW FILE
@display
NEW FILE.
@end display
The NEW FILE command clears the current active file.
@node PRINT, PRINT EJECT, NEW FILE, Data Input and Output
@section PRINT
@vindex PRINT
@display
PRINT
OUTFILE='filename'
RECORDS=n_lines
@{NOTABLE,TABLE@}
/[line_no] arg@dots{}
arg takes one of the following forms:
'string' [start-end]
var_list start-end [type_spec]
var_list (fortran_spec)
var_list *
@end display
The PRINT transformation writes variable data to an output file. PRINT
is executed when a procedure causes the data to be read. In order to
execute the PRINT transformation without invoking a procedure, use the
EXECUTE command (@pxref{EXECUTE}).
All PRINT subcommands are optional.
The OUTFILE subcommand specifies the file to receive the output. The
file may be a file name as a string or a file handle (@pxref{FILE
HANDLE}). If OUTFILE is not present then output will be sent to PSPP's
output listing file.
The RECORDS subcommand specifies the number of lines to be output. The
number of lines may optionally be surrounded by parentheses.
TABLE will cause the PRINT command to output a table to the listing file
that describes what it will print to the output file. NOTABLE, the
default, suppresses this output table.
Introduce the strings and variables to be printed with a slash
(@samp{/}). Optionally, the slash may be followed by a number
indicating which output line will be specified. In the absence of this
line number, the next line number will be specified. Multiple lines may
be specified using multiple slashes with the intended output for a line
following its respective slash.
Literal strings may be printed. Specify the string itself. Optionally
the string may be followed by a column number or range of column
numbers, specifying the location on the line for the string to be
printed. Otherwise, the string will be printed at the current position
on the line.
Variables to be printed can be specified in the same ways as available
for DATA LIST FIXED (@pxref{DATA LIST FIXED}). In addition, a variable
list may be followed by an asterisk (@samp{*}), which indicates that the
variables should be printed in their dictionary print formats, separated
by spaces. A variable list followed by a slash or the end of command
will be interpreted the same way.
If a FORTRAN type specification is used to move backwards on the current
line, then text is written at that point on the line, the line will be
truncated to that length, although additional text being added will
again extend the line to that length.
@node PRINT EJECT, PRINT SPACE, PRINT, Data Input and Output
@section PRINT EJECT
@vindex PRINT EJECT
@display
PRINT EJECT
OUTFILE='filename'
RECORDS=n_lines
@{NOTABLE,TABLE@}
/[line_no] arg@dots{}
arg takes one of the following forms:
'string' [start-end]
var_list start-end [type_spec]
var_list (fortran_spec)
var_list *
@end display
PRINT EJECT is used to write data to an output file. Before the data is
written, the current page in the listing file is ejected.
@xref{PRINT}, for more information on syntax and usage.
@node PRINT SPACE, REREAD, PRINT EJECT, Data Input and Output
@section PRINT SPACE
@vindex PRINT SPACE
@display
PRINT SPACE OUTFILE='filename' n_lines.
@end display
The PRINT SPACE prints one or more blank lines to an output file.
The OUTFILE subcommand is optional. It may be used to direct output to
a file specified by file name as a string or file handle (@pxref{FILE
HANDLE}). If OUTFILE is not specified then output will be directed to
the listing file.
n_lines is also optional. If present, it is an expression
(@pxref{Expressions}) specifying the number of blank lines to be
printed. The expression must evaluate to a nonnegative value.
@node REREAD, REPEATING DATA, PRINT SPACE, Data Input and Output
@section REREAD
@vindex REREAD
@display
REREAD FILE=handle COLUMN=column.
@end display
The REREAD transformation allows the previous input line in a data file
already processed by DATA LIST or another input command to be re-read
for further processing.
The FILE subcommand, which is optional, is used to specify the file to
have its line re-read. The file must be specified in the form of a file
handle (@pxref{FILE HANDLE}). If FILE is not specified then the last
file specified on DATA LIST will be assumed (last file specified
lexically, not in terms of flow-of-control).
By default, the line re-read is re-read in its entirety. With the
COLUMN subcommand, a prefix of the line can be exempted from
re-reading. Specify an expression (@pxref{Expressions}) evaluating to
the first column that should be included in the re-read line. Columns
are numbered from 1 at the left margin.
Multiple REREAD commands will not back up in the data file. Instead,
they will re-read the same line multiple times.
@node REPEATING DATA, WRITE, REREAD, Data Input and Output
@section REPEATING DATA
@vindex REPEATING DATA
@display
REPEATING DATA
/STARTS=start-end
/OCCURS=n_occurs
/FILE='filename'
/LENGTH=length
/CONTINUED[=cont_start-cont_end]
/ID=id_start-id_end=id_var
/@{TABLE,NOTABLE@}
/DATA=var_spec@dots{}
where each var_spec takes one of the forms
var_list start-end [type_spec]
var_list (fortran_spec)
@end display
The REPEATING DATA command is used to parse groups of data repeating in
a uniform format, possibly with several groups on a single line. Each
group of data corresponds with one case. REPEATING DATA may only be
used within an INPUT PROGRAM structure. When used with DATA LIST, it
can be used to parse groups of cases that share a subset of variables
but differ in their other data.
The STARTS subcommand is required. Specify a range of columns, using
literal numbers or numeric variable names. This range specifies the
columns on the first line that are used to contain groups of data. The
ending column is optional. If it is not specified, then the record
width of the input file is used. For the inline file (@pxref{BEGIN
DATA}) this is 80 columns; for a file with fixed record widths it is the
record width; for other files it is 1024 characters by default.
The OCCURS subcommand is required. It must be a number or the name of a
numeric variable. Its value is the number of groups present in the
current record.
The DATA subcommand is required. It must be the last subcommand
specified. It is used to specify the data present within each repeating
group. Column numbers are specified relative to the beginning of a
group at column 1. Data is specified in the same way as with DATA LIST
FIXED (@pxref{DATA LIST FIXED}).
All other subcommands are optional.
FILE specifies the file to read, either a file name as a string or a
file handle (@pxref{FILE HANDLE}). If FILE is not present then the
default is the last file handle used on DATA LIST (lexically, not in
terms of flow of control).
By default REPEATING DATA will output a table describing how it will
parse the input data. Specifying NOTABLE will disable this behavior;
specifying TABLE will explicitly enable it.
The LENGTH subcommand specifies the length in characters of each group.
If it is not present then length is inferred from the DATA subcommand.
LENGTH can be a number or a variable name.
Normally all the data groups are expected to be present on a single
line. Use the CONTINUED command to indicate that data can be continued
onto additional lines. If data on continuation lines starts at the left
margin and continues through the entire field width, no column
specifications are necessary on CONTINUED. Otherwise, specify the
possible range of columns in the same way as on STARTS.
When data groups are continued from line to line, it's easily possible
for cases to get out of sync if hand editing is not done carefully. The
ID subcommand allows a case identifier to be present on each line of
repeating data groups. REPEATING DATA will check for the same
identifier on each line and report mismatches. Specify the range of
columns that the identifier will occupy, followed by an equals sign
(@samp{=}) and the identifier variable name. The variable must already
have been declared with NUMERIC or another command.
@node WRITE, , REPEATING DATA, Data Input and Output
@section WRITE
@vindex WRITE
@display
WRITE
OUTFILE='filename'
RECORDS=n_lines
@{NOTABLE,TABLE@}
/[line_no] arg@dots{}
arg takes one of the following forms:
'string' [start-end]
var_list start-end [type_spec]
var_list (fortran_spec)
var_list *
@end display
WRITE is used to write text or binary data to an output file.
@xref{PRINT}, for more information on syntax and usage. The main
difference between PRINT and WRITE is that whereas by default PRINT uses
variables' print formats, WRITE uses write formats.
The sole additional difference is that if WRITE is used to send output
to a binary file, carriage control characters will not be output.
@xref{FILE HANDLE}, for information on how to declare a file as binary.
@node System and Portable Files, Variable Attributes, Data Input and Output, Top
@chapter System Files and Portable Files
The commands in this chapter read, write, and examine system files and
portable files.
@menu
* APPLY DICTIONARY:: Apply system file dictionary to active file.
* EXPORT:: Write to a portable file.
* GET:: Read from a system file.
* IMPORT:: Read from a portable file.
* MATCH FILES:: Merge system files.
* SAVE:: Write to a system file.
* SYSFILE INFO:: Display system file dictionary.
* XSAVE:: Write to a system file, as a transform.
@end menu
@node APPLY DICTIONARY, EXPORT, System and Portable Files, System and Portable Files
@section APPLY DICTIONARY
@vindex APPLY DICTIONARY
@display
APPLY DICTIONARY FROM='filename'.
@end display
The APPLY DICTIONARY command applies the variable labels, value labels,
and missing values from variables in a system file to corresponding
variables in the active file. In some cases it also updates the
weighting variable.
Specify a system file with a file name string or as a file handle
(@pxref{FILE HANDLE}). The dictionary in the system file will be read,
but it will not replace the active file dictionary. The system file's
data will not be read.
Only variables with names that exist in both the active file and the
system file are considered. Variables with the same name but different
types (numeric, string) will cause an error message. Otherwise, the
system file variables' attributes will replace those in their matching
active file variables, as described below.
If a system file variable has a variable label, then it will replace the
active file variable's variable label. If the system file variable does
not have a variable label, then the active file variable's variable
label, if any, will be retained.
If the active file variable is numeric or short string, then value
labels and missing values, if any, will be copied to the active file
variable. If the system file variable does not have value labels or
missing values, then those in the active file variable, if any, will not
be disturbed.
Finally, weighting of the active file is updated (@pxref{WEIGHT}). If
the active file has a weighting variable, and the system file does not,
or if the weighting variable in the system file does not exist in the
active file, then the active file weighting variable, if any, is
retained. Otherwise, the weighting variable in the system file becomes
the active file weighting variable.
APPLY DICTIONARY takes effect immediately. It does not read the active
file. The system file is not modified.
@node EXPORT, GET, APPLY DICTIONARY, System and Portable Files
@section EXPORT
@vindex EXPORT
@display
EXPORT
/OUTFILE='filename'
/DROP=var_list
/KEEP=var_list
/RENAME=(src_names=target_names)@dots{}
@end display
The EXPORT procedure writes the active file dictionary and data to a
specified portable file.
The OUTFILE subcommand, which is the only required subcommand, specifies
the portable file to be written as a file name string or a file handle
(@pxref{FILE HANDLE}).
DROP, KEEP, and RENAME follow the same format as the SAVE procedure
(@pxref{SAVE}).
EXPORT is a procedure. It causes the active file to be read.
@node GET, IMPORT, EXPORT, System and Portable Files
@section GET
@vindex GET
@display
GET
/FILE='filename'
/DROP=var_list
/KEEP=var_list
/RENAME=(src_names=target_names)@dots{}
@end display
The GET transformation clears the current dictionary and active file and
replaces them with the dictionary and data from a specified system file.
The FILE subcommand is the only required subcommand. Specify the system
file to be read as a string file name or a file handle (@pxref{FILE
HANDLE}).
By default, all the variables in a system file are read. The DROP
subcommand can be used to specify a list of variables that are not to be
read. By contrast, the KEEP subcommand can be used to specify variable
that are to be read, with all other variables not read.
Normally variables in a system file retain the names that they were
saved under. Use the RENAME subcommand to change these names. Specify,
within parentheses, a list of variable names followed by an equals sign
(@samp{=}) and the names that they should be renamed to. Multiple
parenthesized groups of variable names can be included on a single
RENAME subcommand. Variables' names may be swapped using a RENAME
subcommand of the form @samp{/RENAME=(A B=B A)}.
Alternate syntax for the RENAME subcommand allows the parentheses to be
eliminated. When this is done, only a single variable may be renamed at
once. For instance, @samp{/RENAME=A=B}. This alternate syntax is
deprecated.
DROP, KEEP, and RENAME are performed in left-to-right order. They each
may be present any number of times.
Please note that DROP, KEEP, and RENAME do not cause the system file on
disk to be modified. Only the active file read from the system file is
changed.
GET does not cause the data to be read, only the dictionary. The data
is read later, when a procedure is executed.
@node IMPORT, MATCH FILES, GET, System and Portable Files
@section IMPORT
@vindex IMPORT
@display
IMPORT
/FILE='filename'
/TYPE=@{COMM,TAPE@}
/DROP=var_list
/KEEP=var_list
/RENAME=(src_names=target_names)@dots{}
@end display
The IMPORT transformation clears the active file dictionary and data and
replaces them with a dictionary and data from a portable file on disk.
The FILE subcommand, which is the only required subcommand, specifies
the portable file to be read as a file name string or a file handle
(@pxref{FILE HANDLE}).
The TYPE subcommand is currently not used.
DROP, KEEP, and RENAME follow the syntax used by GET (@pxref{GET}).
IMPORT does not cause the data to be read, only the dictionary. The
data is read later, when a procedure is executed.
@node MATCH FILES, SAVE, IMPORT, System and Portable Files
@section MATCH FILES
@vindex MATCH FILES
@display
MATCH FILES
/BY var_list
/@{FILE,TABLE@}=@{*,'filename'@}
/DROP=var_list
/KEEP=var_list
/RENAME=(src_names=target_names)@dots{}
/IN=var_name
/FIRST=var_name
/LAST=var_name
/MAP
@end display
The MATCH FILES command merges one or more system files, optionally
including the active file. Records with the same values for BY
variables are combined into a single record. Records with different
values are output in order. Thus, multiple sorted system files are
combined into a single sorted system file based on the value of the BY
variables.
The BY subcommand specifies a list of variables that are used to match
records from each of the system files. Variables specified must exist
in all the files specified on FILE and TABLE. BY should usually be
specified. If TABLE is used then BY is required.
Specify FILE with a system file as a file name string or file handle
(@pxref{FILE HANDLE}). An asterisk (@samp{*}) may also be specified to
indicate the current active file. The files specified on FILE are
merged together based on the BY variables, or combined case-by-case if
BY is not specified. Normally at least two FILE subcommands should be
specified.
Specify TABLE with a system file in order to use it as a @dfn{table
lookup file}. Records in table lookup files are not used up after
they've been used once. This means that data in table lookup files can
correspond to any number of records in FILE files. Table lookup files
correspond to lookup tables in traditional relational database systems.
It is incorrect to have records with duplicate BY values in table lookup
files.
Any number of FILE and TABLE subcommands may be specified. Each
instance of FILE or TABLE can be followed by DROP, KEEP, and/or RENAME
subcommands. These take the same form as the corresponding subcommands
of GET (@pxref{GET}), and perform the same functions.
Variables belonging to files that are not present for the current case
are set to the system-missing value for numeric variables or spaces for
string variables.
IN, FIRST, LAST, and MAP are currently not used.
@node SAVE, SYSFILE INFO, MATCH FILES, System and Portable Files
@section SAVE
@vindex SAVE
@display
SAVE
/OUTFILE='filename'
/@{COMPRESSED,UNCOMPRESSED@}
/DROP=var_list
/KEEP=var_list
/RENAME=(src_names=target_names)@dots{}
@end display
The SAVE procedure causes the dictionary and data in the active file to
be written to a system file.
The FILE subcommand is the only required subcommand. Specify the system
file to be written as a string file name or a file handle (@pxref{FILE
HANDLE}).
The COMPRESS and UNCOMPRESS subcommand determine whether the saved
system file is compressed. By default, system files are compressed.
This default can be changed with the SET command (@pxref{SET}).
By default, all the variables in the active file dictionary are written
to the system file. The DROP subcommand can be used to specify a list
of variables not to be written. In contrast, KEEP specifies variables
to be written, with all variables not specified not written.
Normally variables are saved to a system file under the same names they
have in the active file. Use the RENAME command to change these names.
Specify, within parentheses, a list of variable names followed by an
equals sign (@samp{=}) and the names that they should be renamed to.
Multiple parenthesized groups of variable names can be included on a
single RENAME subcommand. Variables' names may be swapped using a
RENAME subcommand of the form @samp{/RENAME=(A B=B A)}.
Alternate syntax for the RENAME subcommand allows the parentheses to be
eliminated. When this is done, only a single variable may be renamed at
once. For instance, @samp{/RENAME=A=B}. This alternate syntax is
deprecated.
DROP, KEEP, and RENAME are performed in left-to-right order. They each
may be present any number of times.
Please note that DROP, KEEP, and RENAME do not cause the active file to
be modified. Only the system file written to disk is changed.
SAVE causes the data to be read. It is a procedure.
@node SYSFILE INFO, XSAVE, SAVE, System and Portable Files
@section SYSFILE INFO
@vindex SYSFILE INFO
@display
SYSFILE INFO FILE='filename'.
@end display
The SYSFILE INFO command reads the dictionary in a system file and
displays the information in its dictionary.
Specify a file name or file handle. SYSFILE INFO will read that file as
a system file and display information on its dictionary.
The file does not replace the current active file.
@node XSAVE, , SYSFILE INFO, System and Portable Files
@section XSAVE
@vindex XSAVE
@display
XSAVE
/FILE='filename'
/@{COMPRESSED,UNCOMPRESSED@}
/DROP=var_list
/KEEP=var_list
/RENAME=(src_names=target_names)@dots{}
@end display
The XSAVE transformation writes the active file dictionary and data to a
system file stored on disk.
XSAVE is a transformation, not a procedure. It is executed when the
data is read by a procedure or procedure-like command. In all other
respects, XSAVE is identical to SAVE. @xref{SAVE}, for more information
on syntax and usage.
@node Variable Attributes, Data Manipulation, System and Portable Files, Top
@chapter Manipulating variables
The variables in the active file dictionary are important. There are
several utility functions for examining and adjusting them.
@menu
* ADD VALUE LABELS:: Add value labels to variables.
* DISPLAY:: Display variable names & descriptions.
* DISPLAY VECTORS:: Display a list of vectors.
* FORMATS:: Set print and write formats.
* LEAVE:: Don't clear variables between cases.
* MISSING VALUES:: Set missing values for variables.
* MODIFY VARS:: Rename, reorder, and drop variables.
* NUMERIC:: Create new numeric variables.
* PRINT FORMATS:: Set variable print formats.
* RENAME VARIABLES:: Rename variables.
* VALUE LABELS:: Set value labels for variables.
* STRING:: Create new string variables.
* VARIABLE LABELS:: Set variable labels for variables.
* VECTOR:: Declare an array of variables.
* WRITE FORMATS:: Set variable write formats.
@end menu
@node ADD VALUE LABELS, DISPLAY, Variable Attributes, Variable Attributes
@section ADD VALUE LABELS
@vindex ADD VALUE LABELS
@display
ADD VALUE LABELS
/var_list value 'label' [value 'label']@dots{}
@end display
ADD VALUE LABELS has the same syntax and purpose as VALUE LABELS (see
above), but it does not clear away value labels from the variables
before adding the ones specified.
@node DISPLAY, DISPLAY VECTORS, ADD VALUE LABELS, Variable Attributes
@section DISPLAY
@vindex DISPLAY
@display
DISPLAY @{NAMES,INDEX,LABELS,VARIABLES,DICTIONARY,SCRATCH@}
[SORTED] [var_list]
@end display
DISPLAY displays requested information on variables. Variables can
optionally be sorted alphabetically. The entire dictionary or just
specified variables can be described.
One of the following keywords can be present:
@table @asis
@item NAMES
The variables' names are displayed.
@item INDEX
The variables' names are displayed along with a value describing their
position within the active file dictionary.
@item LABELS
Variable names, positions, and variable labels are displayed.
@item VARIABLES
Variable names, positions, print and write formats, and missing values
are displayed.
@item DICTIONARY
Variable names, positions, print and write formats, missing values,
variable labels, and value labels are displayed.
@item SCRATCH
Varible names are displayed, for scratch variables only (@pxref{Scratch
Variables}).
@end table
If SORTED is specified, then the variables are displayed in ascending
order based on their names; otherwise, they are displayed in the order
that they occur in the active file dictionary.
@node DISPLAY VECTORS, FORMATS, DISPLAY, Variable Attributes
@section DISPLAY VECTORS
@vindex DISPLAY VECTORS
@display
DISPLAY VECTORS.
@end display
The DISPLAY VECTORS command causes a list of the currently declared
vectors to be displayed.
@node FORMATS, LEAVE, DISPLAY VECTORS, Variable Attributes
@section FORMATS
@vindex FORMATS
@display
FORMATS var_list (fmt_spec).
@end display
The FORMATS command set the print and write formats for the specified
variables to the specified format specification. @xref{Input/Output
Formats}.
Specify a list of variables followed by a format specification in
parentheses. The print and write formats of the specified variables
will be changed.
Additional lists of variables and formats may be included if they are
delimited by a slash (@samp{/}).
The FORMATS command takes effect immediately. It is not affected by
conditional and looping structures such as DO IF or LOOP.
@node LEAVE, MISSING VALUES, FORMATS, Variable Attributes
@section LEAVE
@vindex LEAVE
@display
LEAVE var_list.
@end display
The LEAVE command prevents the specified variables from being
reinitialized whenever a new case is processed.
Normally, when a data file is processed, every variable in the active
file is initialized to the system-missing value or spaces at the
beginning of processing for each case. When a variable has been
specified on LEAVE, this is not the case. Instead, that variable is
initialized to 0 (not system-missing) or spaces for the first case.
After that, it retains its value between cases.
This becomes useful for counters. For instance, in the example below
the variable SUM maintains a running total of the values in the ITEM
variable.
@example
DATA LIST /ITEM 1-3.
COMPUTE SUM=SUM+ITEM.
PRINT /ITEM SUM.
LEAVE SUM
BEGIN DATA.
123
404
555
999
END DATA.
@end example
@noindent Partial output from this example:
@example
123 123.00
404 527.00
555 1082.00
999 2081.00
@end example
It is best to use the LEAVE command immediately before invoking a
procedure command, because it is reset by certain transformations---for
instance, COMPUTE and IF. LEAVE is also reset by all procedure
invocations.
@node MISSING VALUES, MODIFY VARS, LEAVE, Variable Attributes
@section MISSING VALUES
@vindex MISSING VALUES
@display
MISSING VALUES var_list (missing_values).
missing_values takes one of the following forms:
num1
num1, num2
num1, num2, num3
num1 THRU num2
num1 THRU num2, num3
string1
string1, string2
string1, string2, string3
As part of a range, LO or LOWEST may take the place of num1;
HI or HIGHEST may take the place of num2.
@end display
The MISSING VALUES command sets user-missing values for numeric and
short string variables. Long string variables may not have missing
values.
Specify a list of variables, followed by a list of their user-missing
values in parentheses. Up to three discrete values may be given, or,
for numeric variables only, a range of values optionally accompanied by
a single discrete value. Ranges may be open-ended on one end, indicated
through the use of the keyword LO or LOWEST or HI or HIGHEST.
The MISSING VALUES command takes effect immediately. It is not affected
by conditional and looping constructs such as DO IF or LOOP.
@node MODIFY VARS, NUMERIC, MISSING VALUES, Variable Attributes
@section MODIFY VARS
@vindex MODIFY VARS
@display
MODIFY VARS
/REORDER=@{FORWARD,BACKWARD@} @{POSITIONAL,ALPHA@} (var_list)@dots{}
/RENAME=(old_names=new_names)@dots{}
/@{DROP,KEEP@}=var_list
/MAP
@end display
The MODIFY VARS commands allows variables in the active file to be
reordered, renamed, or deleted from the active file.
At least one subcommand must be specified, and no subcommand may be
specified more than once. DROP and KEEP may not both be specified.
The REORDER subcommand changes the order of variables in the active
file. Specify one or more lists of variable names in parentheses. By
default, each list of variables is rearranged into the specified order.
To put the variables into the reverse of the specified order, put
keyword BACKWARD before the parentheses. To put them into alphabetical
order in the dictionary, specify keyword ALPHA before the parentheses.
BACKWARD and ALPHA may also be combined.
To rename variables in the active file, specify RENAME, an equals sign
(@samp{=}), and lists of the old variable names and new variable names
separated by another equals sign within parentheses. There must be the
same number of old and new variable names. Each old variable is renamed to
the corresponding new variable name. Multiple parenthesized groups of
variables may be specified.
The DROP subcommand deletes a specified list of variables from the
active file.
The KEEP subcommand keeps the specified list of variables in the active
file. Any unlisted variables are delete from the active file.
MAP is currently ignored.
MODIFY VARS takes effect immediately. It does not cause the data to be
read.
@node NUMERIC, PRINT FORMATS, MODIFY VARS, Variable Attributes
@section NUMERIC
@vindex NUMERIC
@display
NUMERIC /var_list [(fmt_spec)].
@end display
The NUMERIC command explicitly declares new numeric variables,
optionally setting their output formats.
Specify a slash (@samp{/}), followed by the names of the new numeric
variables. If you wish to set their output formats, follow their names
by an output format specification in parentheses (@pxref{Input/Output
Formats}). If no output format specification is given then the
variables will default to F8.2.
Variables created with NUMERIC will be initialized to the system-missing
value.
@node PRINT FORMATS, RENAME VARIABLES, NUMERIC, Variable Attributes
@section PRINT FORMATS
@vindex PRINT FORMATS
@display
PRINT FORMATS var_list (fmt_spec).
@end display
The PRINT FORMATS command sets the print formats for the specified
variables to the specified format specification.
Syntax is identical to that of FORMATS (@pxref{FORMATS}), but the PRINT
FORMATS command sets only print formats, not write formats.
@node RENAME VARIABLES, VALUE LABELS, PRINT FORMATS, Variable Attributes
@section RENAME VARIABLES
@vindex RENAME VARIABLES
@display
RENAME VARIABLES (old_names=new_names)@dots{} .
@end display
The RENAME VARIABLES command allows the names of variables in the active
file to be changed.
To rename variables, specify lists of the old variable names and new
variable names, separated by an equals sign (@samp{=}), within
parentheses. There must be the same number of old and new variable
names. Each old variable is renamed to the corresponding new variable
name. Multiple parenthesized groups of variables may be specified.
RENAME VARIABLES takes effect immediately. It does not cause the data
to be read.
@node VALUE LABELS, STRING, RENAME VARIABLES, Variable Attributes
@section VALUE LABELS
@vindex VALUE LABELS
@display
VALUE LABELS
/var_list value 'label' [value 'label']@dots{}
@end display
The VALUE LABELS command allows values of numeric and short string
variables to be associated with labels. In this way, a short value can
stand for a long value.
In order to set up value labels for a set of variables, specify the
variable names after a slash (@samp{/}), followed by a list of values
and their associated labels, separated by spaces.
Before the VALUE LABELS command is executed, any existing value labels
are cleared from the variables specified.
@node STRING, VARIABLE LABELS, VALUE LABELS, Variable Attributes
@section STRING
@vindex STRING
@display
STRING /var_list (fmt_spec).
@end display
The STRING command creates new string variables for use in
transformations.
Specify a slash (@samp{/}), followed by the names of the string
variables to create and the desired output format specification in
parentheses (@pxref{Input/Output Formats}). Variable widths are
implicitly derived from the specified output formats.
Created variables are initialized to spaces.
@node VARIABLE LABELS, VECTOR, STRING, Variable Attributes
@section VARIABLE LABELS
@vindex VARIABLE LABELS
@display
VARIABLE LABELS
/var_list 'var_label'.
@end display
The VARIABLE LABELS command is used to associate an explanatory name
with a group of variables. This name (a variable label) is displayed by
statistical procedures.
To assign a variable label to a group of variables, specify a slash
(@samp{/}), followed by the list of variable names and the variable
label as a string.
@node VECTOR, WRITE FORMATS, VARIABLE LABELS, Variable Attributes
@section VECTOR
@vindex VECTOR
@display
Two possible syntaxes:
VECTOR vec_name=var_list.
VECTOR vec_name_list(count).
@end display
The VECTOR command allows a group of variables to be accessed as if they
were consecutive members of an array with a vector(index) notation.
To make a vector out of a set of existing variables, specify a name for
the vector followed by an equals sign (@samp{=}) and the variables that
belong in the vector.
To make a vector and create variables at the same time, specify one or
more vector names followed by a count in parentheses. This will cause
variables named @code{@var{vec}1} through @code{@var{vec}@var{count}} to
be created as numeric variables. Variable names including numeric
suffixes may not exceed 8 characters in length, and none of the
variables may exist prior to the VECTOR command.
All the variables in a vector must be the same type.
Vectors created with VECTOR disappear after any procedure or
procedure-like command is executed. The variables contained in the
vectors remain, unless they are scratch variables (@pxref{Scratch
Variables}).
Variables within a vector may be references in expressions using
vector(index) syntax.
@node WRITE FORMATS, , VECTOR, Variable Attributes
@section WRITE FORMATS
@vindex WRITE FORMATS
@display
WRITE FORMATS var_list (fmt_spec).
@end display
The WRITE FORMATS command sets the write formats for the specified
variables to the specified format specification.
Syntax is identical to that of FORMATS (@pxref{FORMATS}), but the WRITE
FORMATS command sets only write formats, not print formats.
@node Data Manipulation, Data Selection, Variable Attributes, Top
@chapter Data transformations
The PSPP procedures examined in this chapter manipulate data and
prepare the active file for later analyses. They do not produce output,
as a rule.
@menu
* AGGREGATE:: Summarize multiple cases into a single case.
* AUTORECODE:: Automatic recoding of variables.
* COMPUTE:: Assigning a variable a calculated value.
* COUNT:: Counting variables with particular values.
* FLIP:: Exchange variables with cases.
* IF:: Conditionally assigning a calculated value.
* RECODE:: Mapping values from one set to another.
* SORT CASES:: Sort the active file.
@end menu
@node AGGREGATE, AUTORECODE, Data Manipulation, Data Manipulation
@section AGGREGATE
@vindex AGGREGATE
@display
AGGREGATE
/BREAK=var_list
/PRESORTED
/OUTFILE=@{*,'filename'@}
/DOCUMENT
/MISSING=COLUMNWISE
/dest_vars=agr_func(src_vars, args@dots{})@dots{}
@end display
The AGGREGATE command summarizes groups of cases into single cases.
Cases are divided into groups that have the same values for one or more
variables called @dfn{break variables}. Several functions are available
for summarizing case contents.
BREAK is the only required subcommand (in addition, at least one
aggregation variable must be specified). Specify a list of variable
names. The values of these variables are used to divide the active file
into groups to be summarized.
By default, the active file is sorted based on the break variables
before aggregation takes place. If the active file is already sorted,
specify PRESORTED to save time.
The OUTFILE subcommand specifies a system file by file name string or
file handle (@pxref{FILE HANDLE}). The aggregated cases are sent to
this file. If OUTFILE is not specified, or if @samp{*} is specified,
then the aggregated cases replace the active file.
Normally the aggregate file does not receive the documents from the
active file, even if the aggregate file replaces the active file.
Specify DOCUMENT to have the documents from the active file copied to
the aggregate file.
At least one aggregation variable must be specified. Specify a list of
aggregation variables, an equals sign (@samp{=}), an aggregation
function name (see the list below), and a list of source variables in
parentheses. In addition, some aggregation functions expect additional
arguments in the parentheses following the source variable names.
There must be exactly as many source variables as aggregation variables.
Each aggregation variable receives the results of applying the specified
aggregation function to the corresponding source variable. Most
aggregation functions may be applied to numeric and short and long
string variables. Others are restricted to numeric values; these are
marked as such in this list below.
Any number of sets of aggregation variables may be specified.
The available aggregation functions are as follows:
@table @asis
@item SUM(var_name)
Sum. Limited to numeric values.
@item MEAN(var_name)
Arithmetic mean. Limited to numeric values.
@item SD(var_name)
Standard deviation of the mean. Limited to numeric values.
@item MAX(var_name)
Maximum value.
@item MIN(var_name)
Minimum value.
@item FGT(var_name, value)
@itemx PGT(var_name, value)
Fraction between 0 and 1, or percentage between 0 and 100, respectively,
of values greater than the specified constant.
@item FLT(var_name, value)
@itemx PLT(var_name, value)
Fraction or percentage, respectively, of values less than the specified
constant.
@item FIN(var_name, low, high)
@itemx PIN(var_name, low, high)
Fraction or percentage, respectively, of values within the specified
inclusive range of constants.
@item FOUT(var_name, low, high)
@itemx POUT(var_name, low, high)
Fraction or percentage, respectively, of values strictly outside the
specified range of constants.
@item N(var_name)
Number of non-missing values.
@item N
Number of cases aggregated to form this group. Don't supply a source
variable for this aggregation function.
@item NU(var_name)
Number of non-missing values. Each case is considered to have a weight
of 1, regardless of the current weighting variable (@pxref{WEIGHT}).
@item NU
Number of cases aggregated to form this group. Each case is considered
to have a weight of 1, regardless of the current weighting variable.
@item NMISS(var_name)
Number of missing values.
@item NUMISS(var_name)
Number of missing values. Each case is considered to have a weight of
1, regardless of the current weighting variable.
@item FIRST(var_name)
First value in this group.
@item LAST(var_name)
Last value in this group.
@end table
When string values are compared by aggregation functions, they are done
in terms of internal character codes. On most modern computers, this is
a form of ASCII.
In addition, there is a parallel set of aggregation functions having the
same names as those above, but with a dot after the last character (for
instance, @samp{SUM.}). These functions are the same as the above,
except that they cause user-missing values, which are normally excluded
from calculations, to be included.
Normally, only a single case (2 for SD and SD.) need be non-missing in
each group in order for the aggregate variable to be non-missing. If
/MISSING=COLUMNWISE is specified, the behavior reverses: that is, a
single missing value is enough to make the aggregate variable become a
missing value.
AGGREGATE ignores the current SPLIT FILE settings and causes them to be
canceled (@pxref{SPLIT FILE}).
@node AUTORECODE, COMPUTE, AGGREGATE, Data Manipulation
@section AUTORECODE
@vindex AUTORECODE
@display
AUTORECODE VARIABLES=src_vars INTO dest_vars
/DESCENDING
/PRINT
@end display
The AUTORECODE procedure considers the @var{n} values that a variable
takes on and maps them onto values 1@dots{}@var{n} on a new numeric
variable.
Subcommand VARIABLES is the only required subcommand and must come
first. Specify VARIABLES, an equals sign (@samp{=}), a list of source
variables, INTO, and a list of target variables. There must the same
number of source and target variables. The target variables must not
already exist.
By default, increasing values of a source variable (for a string, this
is based on character code comparisons) are recoded to increasing values
of its target variable. To cause increasing values of a source variable
to be recoded to decreasing values of its target variable (@var{n} down
to 1), specify DESCENDING.
PRINT is currently ignored.
AUTORECODE is a procedure. It causes the data to be read.
@node COMPUTE, COUNT, AUTORECODE, Data Manipulation
@section COMPUTE
@display
COMPUTE var_name = expression.
@end display
@code{COMPUTE} creates a variable with the name specified (if
necessary), then evaluates the given expression for every case and
assigns the result to the variable. @xref{Expressions}.
Numeric variables created or computed by @code{COMPUTE} are assigned an
output width of 8 character with two decimal places (@code{F8.2}).
String variables created or computed by @code{COMPUTE} have the same
width as the existing variable or constant.
COMPUTE is a transformation. It does not cause the active file to be
read.
@node COUNT, FLIP, COMPUTE, Data Manipulation
@section COUNT
@display
COUNT var_name = var@dots{} (value@dots{}).
Each value takes one of the following forms:
number
string
num1 THRU num2
MISSING
SYSMIS
In addition, num1 and num2 can be LO or LOWEST, or HI or HIGHEST,
respectively.
@end display
@code{COUNT} creates or replaces a numeric @dfn{target} variable that
counts the occurrence of a @dfn{criterion} value or set of values over
one or more @dfn{test} variables for each case.
The target variable values are always nonnegative integers. They are
never missing. The target variable is assigned an F8.2 output format.
@xref{Input/Output Formats}. Any variables, including long and short
string variables, may be test variables.
User-missing values of test variables are treated just like any other
values. They are @strong{not} treated as system-missing values.
User-missing values that are criterion values or inside ranges of
criterion values are counted as any other values. However (for numeric
variables), keyword @code{MISSING} may be used to refer to all system-
and user-missing values.
@code{COUNT} target variables are assigned values in the order
specified. In the command @code{COUNT A=A B(1) /B=A B(2).}, the
following actions occur:
@itemize @minus
@item
The number of occurrences of 1 between @code{A} and @code{B} is counted.
@item
@code{A} is assigned this value.
@item
The number of occurrences of 1 between @code{B} and the @strong{new}
value of @code{A} is counted.
@item
@code{B} is assigned this value.
@end itemize
Despite this ordering, all @code{COUNT} criterion variables must exist
before the procedure is executed---they may not be created as target
variables earlier in the command! Break such a command into two
separate commands.
The examples below may help to clarify.
@enumerate A
@item
Assuming @code{Q0}, @code{Q2}, @dots{}, @code{Q9} are numeric variables,
the following commands:
@enumerate
@item
Count the number of times the value 1 occurs through these variables
for each case and assigns the count to variable @code{QCOUNT}.
@item
Print out the total number of times the value 1 occurs throughout
@emph{all} cases using @code{DESCRIPTIVES}. @xref{DESCRIPTIVES}, for
details.
@end enumerate
@example
COUNT QCOUNT=Q0 TO Q9(1).
DESCRIPTIVES QCOUNT /STATISTICS=SUM.
@end example
@item
Given these same variables, the following commands:
@enumerate
@item
Count the number of valid values of these variables for each case and
assigns the count to variable @code{QVALID}.
@item
Multiplies each value of @code{QVALID} by 10 to obtain a percentage of
valid values, using @code{COMPUTE}. @xref{COMPUTE}, for details.
@item
Print out the percentage of valid values across all cases, using
@code{DESCRIPTIVES}. @xref{DESCRIPTIVES}, for details.
@end enumerate
@example
COUNT QVALID=Q0 TO Q9 (LO THRU HI).
COMPUTE QVALID=QVALID*10.
DESCRIPTIVES QVALID /STATISTICS=MEAN.
@end example
@end enumerate
@node FLIP, IF, COUNT, Data Manipulation
@section FLIP
@vindex FLIP
@display
FLIP /VARIABLES=var_list /NEWNAMES=var_name.
@end display
The FLIP command transposes rows and columns in the active file. It
causes cases to be swapped with variables, and vice versa.
There are no required subcommands. The VARIABLES subcommand specifies
variables that will be transformed into cases. Variables not specified
are discarded. By default, all variables are selected for
transposition.
The variables specified by NEWNAMES, which must be a string variable, is
used to give names to the variables created by FLIP. If NEWNAMES is not
specified then the default is a variable named CASE_LBL, if it exists.
If it does not then the variables created by FLIP are named VAR000
through VAR999, then VAR1000, VAR1001, and so on.
When a NEWNAMES variable is available, the names must be canonicalized
before becoming variable names. Invalid characters are replaced by
letter @samp{V} in the first position, or by @samp{_} in subsequent
positions. If the name thus generated is not unique, then numeric
extensions are added, starting with 1, until a unique name is found or
there are no remaining possibilities. If the latter occurs then the
FLIP operation aborts.
The resultant dictionary contains a CASE_LBL variable, which stores the
names of the variables in the dictionary before the transposition. If
the active file is subsequently transposed using FLIP, this variable can
be used to recreate the original variable names.
@node IF, RECODE, FLIP, Data Manipulation
@section IF
@display
Two possible syntaxes:
IF test_expr target_var=target_expr.
IF test_expr target_vec(target_index)=target_expr.
@end display
The IF transformation conditionally assigns the value of a target
expression to a target variable, based on the truth of a test
expression.
Specify a boolean-valued expression (@pxref{Expressions}) to be tested
following the IF keyword. This expression is calculated for each case.
If the value is true, then the value of target_expr is computed and
assigned to target_var. If the value is false or missing, nothing is
done. Numeric and short and long string variables may be used. The
type of target_expr must match the type of target_var.
For numeric variables only, target_var need not exist before the IF
transformation is executed. In this case, target_var is assigned the
system-missing value if the IF condition is not true. String variables
must be declared before they can be used as targets for IF.
In addition to ordinary variables, the target variable may be an element
of a vector. In this case, the vector index must be specified in
parentheses following the vector name.
@node RECODE, SORT CASES, IF, Data Manipulation
@section RECODE
@display
RECODE var_list (src_value@dots{}=dest_value)@dots{} [INTO var_list].
src_value may take the following forms:
number
string
num1 THRU num2
MISSING
SYSMIS
ELSE
Open-ended ranges may be specified using LO or LOWEST for num1
or HI or HIGHEST for num2.
dest_value may take the following forms:
num
string
SYSMIS
COPY
@end display
The RECODE command is used to translate data from one range of values to
another, using flexible user-specified mappings. Data may be remapped
in-place or copied to new variables. Numeric, short string, and long
string data can be recoded.
Specify the list of source variables, followed by one or more mapping
specifications each enclosed in parentheses. If the data is to be
copied to new variables, specify INTO, then the list of target
variables. String target variables must already have been declared
using STRING or another transformation, but numeric target variables can
be created on the fly. There must be exactly as many target variables
as source variables. Each source variable is remapped into its
corresponding target variable.
When INTO is not used, the input and output variables must be of the
same type. Otherwise, string values can be recoded into numeric values,
and vice versa. When this is done and there is no mapping for a
particular value, either a value consisting of all spaces or the
system-missing value is assigned, depending on variable type.
Mappings are considered from left to right. The first src_value that
matches the value of the source variable causes the target variable to
receive the value indicated by the dest_value. Literal number, string,
and range src_value's should be self-explanatory. MISSING as a
src_value matches any user- or system-missing value. SYSMIS matches the
system missing value only. ELSE is a catch-all that matches anything.
It should be the last src_value specified.
Numeric and string dest_value's should also be self-explanatory. COPY
causes the input values to be copied to the output. This is only value
if the source and target variables are of the same type. SYSMIS
indicates the system-missing value.
If the source variables are strings and the target variables are
numeric, then there is one additional mapping available: (CONVERT),
which must be the last specified mapping. CONVERT causes a number
specified as a string to be converted to a numeric value. If the string
cannot be parsed as a number, then the system-missing value is assigned.
Multiple recodings can be specified on the same RECODE command.
Introduce additional recodings with a slash (@samp{/}) in order to
separate them from the previous recodings.
@node SORT CASES, , RECODE, Data Manipulation
@section SORT CASES
@vindex SORT CASES
@display
SORT CASES BY var_list.
@end display
SORT CASES sorts the active file by the values of one or more
variables.
Specify BY and a list of variables to sort by. By default, variables
are sorted in ascending order. To override sort order, specify (D) or
(DOWN) after a list of variables to get descending order, or (A) or (UP)
for ascending order. These apply to the entire list of variables
preceding them.
SORT CASES is a procedure. It causes the data to be read.
SORT CASES will attempt to sort the entire active file in main memory.
If main memory is exhausted then it will use a merge sort algorithm that
involves writing and reading numerous temporary files. Environment
variables determine the temporary files' location. The first of
SPSSTMPDIR, SPSSXTMPDIR, or TMPDIR that is set determines the location.
Otherwise, if the compiler environment defined P_tmpdir, that is used.
Otherwise, under Unix-like OSes /tmp is used; under MS-DOS, the first of
TEMP, TMP, or root on the current drive is used; under other OSes, the
current directory.
@node Data Selection, Conditionals and Looping, Data Manipulation, Top
@chapter Selecting data for analysis
This chapter documents PSPP commands that temporarily or permanently
select data records from the active file for analysis.
@menu
* FILTER:: Exclude cases based on a variable.
* N OF CASES:: Limit the size of the active file.
* PROCESS IF:: Temporarily excluding cases.
* SAMPLE:: Select a specified proportion of cases.
* SELECT IF:: Permanently delete selected cases.
* SPLIT FILE:: Do multiple analyses with one command.
* TEMPORARY:: Make transformations' effects temporary.
* WEIGHT:: Weight cases by a variable.
@end menu
@node FILTER, N OF CASES, Data Selection, Data Selection
@section FILTER
@vindex FILTER
@display
FILTER BY var_name.
FILTER OFF.
@end display
The FILTER command allows a boolean-valued variable to be used to select
cases from the data stream for processing.
In order to set up filtering, specify BY and a variable name. Keyword
BY is optional but recommended. Cases which have a zero or system- or
user-missing value are excluded from analysis, but not deleted from the
data stream. Cases with other values are analyzed.
Use FILTER OFF to turn off case filtering.
Filtering takes place immediately before cases pass to a procedure for
analysis. Only one filter variable may be active at once. Normally,
case filtering continues until it is explicitly turned off with FILTER
OFF. However, if FILTER is placed after TEMPORARY, then filtering stops
after execution of the next procedure or procedure-like command.
@node N OF CASES, PROCESS IF, FILTER, Data Selection
@section N OF CASES
@vindex N OF CASES
@display
N [OF CASES] num_of_cases [ESTIMATED].
@end display
Sometimes you may want to disregard cases of your input. The @code{N}
command can be used to do this. @code{N 100} tells PSPP to
disregard all cases after the first 100.
If the value specified for @code{N} is greater than the number of cases
read in, the value is ignored.
@code{N} does not discard cases or cause them not to be read in. It
just causes cases beyond the last one specified to be ignored by data
analysis commands.
A later @code{N} command can increase or decrease the number of cases
selected. (To select all the cases without knowing how many there are,
specify a very high number: 100000 or whatever you think is large enough.)
Transformation procedures performed after @code{N} is executed
@emph{do} cause cases to be discarded.
The @code{SAMPLE}, @code{PROCESS IF}, and @code{SELECT IF} commands have
precedence over @code{N}---the same results are obtained by both of the
following fragments, given the same random number seeds:
@example
@i{@dots{}set up, read in data@dots{}}
N 100.
SAMPLE .5.
@i{@dots{}analyze data@dots{}}
@i{@dots{}set up, read in data@dots{}}
SAMPLE .5.
N 100.
@i{@dots{}analyze data@dots{}}
@end example
Both fragments above first randomly sample approximately half of the
cases, then select the first 100 of those sampled.
@code{N} with the @code{ESTIMATED} keyword can be used to give an
estimated number of cases before DATA LIST or another command to
read in data. (@code{ESTIMATED} never limits the number of cases
processed by procedures.)
@node PROCESS IF, SAMPLE, N OF CASES, Data Selection
@section PROCESS IF
@vindex PROCESS IF
@example
PROCESS IF expression.
@end example
The PROCESS IF command is used to temporarily eliminate cases from the
data stream. Its effects are active only through the execution of the
next procedure or procedure-like command.
Specify a boolean expression (@pxref{Expressions}). If the value of the
expression is true for a particular case, the case will be analyzed. If
the expression has a false or missing value, then the case will be
deleted from the data stream for this procedure only.
Regardless of its placement relative to other commands, PROCESS IF
always takes effect immediately before data passes to the procedure.
Only one PROCESS IF command may be in effect at any given time.
The effects of PROCESS IF are similar not identical to the effects of
executing TEMPORARY then SELECT IF (@pxref{SELECT IF}).
Use of PROCESS IF is deprecated. It is included for compatibility with
old command files. New syntax files should use SELECT IF or FILTER
instead.
@node SAMPLE, SELECT IF, PROCESS IF, Data Selection
@section SAMPLE
@vindex SAMPLE
@display
SAMPLE num1 [FROM num2].
@end display
@code{SAMPLE} is used to randomly sample a proportion of the cases in
the active file. @code{SAMPLE} is temporary, affecting only the next
procedure, unless that is a data transformation, such as @code{SELECT IF}
or @code{RECODE}.
The proportion to sample can be expressed as a single number between 0
and 1. If @code{k} is the number specified, and @code{N} is the number
of currently-selected cases in the active file, then after
@code{SAMPLE @var{k}.}, there will be @code{k*N}, plus or minus one, cases
selected.
The proportion to sample can also be specified in the style @code{SAMPLE
@var{m} FROM @var{N}}. With this style, cases are selected as follows:
@enumerate
@item
If @var{N} is equal to the number of currently-selected cases in the
active file, exactly @var{m} cases will be selected.
@item
If @var{N} is greater than the number of currently-selected cases in the
active file, an equivalent proportion of cases will be selected.
@item
If @var{N} is less than the number of currently-selected cases in the
active, exactly @var{m} cases will be selected @emph{from the first
@var{N} cases in the active file.}
@end enumerate
@code{SAMPLE}, @code{SELECT IF}, and @code{PROCESS IF} are performed in
the order specified by the syntax file.
@code{SAMPLE} is ignored before @code{SORT CASES}.
@code{SAMPLE} is always performed before @code{N OF CASES}, regardless
of ordering in the syntax file. @xref{N OF CASES}.
The same values for @code{SAMPLE} may result in different samples. To
obtain the same sample, use the @code{SET} command to set the random
number seed to the same value before each @code{SAMPLE}. By default,
the random number seed is based on the system time.
@node SELECT IF, SPLIT FILE, SAMPLE, Data Selection
@section SELECT IF
@vindex SELECT IF
@display
SELECT IF expression.
@end display
The SELECT IF command is used to select particular cases for analysis
based on the value of a boolean expression. Cases not selected are
permanently eliminated, unless TEMPORARY is in effect
(@pxref{TEMPORARY}).
Specify a boolean expression (@pxref{Expressions}). If the value of the
expression is true for a particular case, the case will be analyzed. If
the expression has a false or missing value, then the case will be
deleted from the data stream.
Always place SELECT IF commands as early in the command file as
possible. Cases that are deleted early can be processed more
efficiently in time and space.
@node SPLIT FILE, TEMPORARY, SELECT IF, Data Selection
@section SPLIT FILE
@vindex SPLIT FILE
@display
Two possible syntaxes:
SPLIT FILE BY var_list.
SPLIT FILE OFF.
@end display
The SPLIT FILE command allows multiple sets of data present in one data
file to be analyzed separately using single statistical procedure
commands.
Specify a list of variable names in order to analyze multiple sets of
data separately. Groups of cases having the same values for these
variables are analyzed by statistical procedure commands as one group.
An independent analysis is carried out for each group of cases, and the
variable values for the group are printed along with the analysis.
Specify OFF in order to disable SPLIT FILE and resume analysis of the
entire active file as a single group of data.
@node TEMPORARY, WEIGHT, SPLIT FILE, Data Selection
@section TEMPORARY
@vindex TEMPORARY
@display
TEMPORARY.
@end display
The TEMPORARY command is used to make the effects of transformations
following its execution temporary. These transformations will
affect only the execution of the next procedure or procedure-like
command. Their effects will not be saved to the active file.
The only specification is the command name.
TEMPORARY may not appear within a DO IF or LOOP construct. It may
appear only once between procedures and procedure-like commands.
An example may help to clarify:
@example
DATA LIST /X 1-2.
BEGIN DATA.
2
4
10
15
20
24
END DATA.
COMPUTE X=X/2.
TEMPORARY.
COMPUTE X=X+3.
DESCRIPTIVES X.
DESCRIPTIVES X.
@end example
The data read by the first DESCRIPTIVES command are 4, 5, 8,
10.5, 13, 15. The data read by the first DESCRIPTIVES command are 1, 2,
5, 7.5, 10, 12.
@node WEIGHT, , TEMPORARY, Data Selection
@section WEIGHT
@vindex WEIGHT
@display
WEIGHT BY var_name.
WEIGHT OFF.
@end display
WEIGHT can be used to assign cases varying weights in order to
change the frequency distribution of the active file. Execution of
WEIGHT is delayed until data have been read in.
If a variable name is specified, WEIGHT causes the values of that
variable to be used as weighting factors for subsequent statistical
procedures. Use of keyword BY is optional but recommended. Weighting
variables must be numeric. Scratch variables may not be used for
weighting (@pxref{Scratch Variables}).
When OFF is specified, subsequent statistical procedures will weight all
cases equally.
Weighting values do not need to be integers. However, negative and
system- and user-missing values for the weighting variable are
interpreted as weighting factors of 0.
WEIGHT does not cause cases in the active file to be replicated in
memory.
@node Conditionals and Looping, Statistics, Data Selection, Top
@chapter Conditional and Looping Constructs
@cindex conditionals
@cindex loops
@cindex flow of control
@cindex control flow
This chapter documents PSPP commands used for conditional execution,
looping, and flow of control.
@menu
* BREAK:: Exit a loop.
* DO IF:: Conditionally execute a block of code.
* DO REPEAT:: Textually repeat a code block.
* LOOP:: Repeat a block of code.
@end menu
@node BREAK, DO IF, Conditionals and Looping, Conditionals and Looping
@section BREAK
@vindex BREAK
@display
BREAK.
@end display
BREAK terminates execution of the innermost currently executing LOOP
construct.
BREAK is allowed only inside a LOOP construct. @xref{LOOP}, for more
details.
@node DO IF, DO REPEAT, BREAK, Conditionals and Looping
@section DO IF
@vindex DO IF
@display
DO IF condition.
@dots{}
[ELSE IF condition.
@dots{}
]@dots{}
[ELSE.
@dots{}]
END IF.
@end display
The DO IF command allows one of several sets of transformations to be
executed, depending on user-specified conditions.
Specify a boolean expression. If the condition is true, then the block
of code following DO IF is executed. If the condition is missing, then
none of the code blocks is executed. If the condition is false, then
the boolean expressions on the first ELSE IF, if present, is tested in
turn, with the same rules applied. If all expressions evaluate to
false, then the ELSE code block is executed, if it is present.
@node DO REPEAT, LOOP, DO IF, Conditionals and Looping
@section DO REPEAT
@vindex DO REPEAT
@display
DO REPEAT repvar_name=expansion@dots{}.
@dots{}
END REPEAT [PRINT].
expansion takes one of the following forms:
var_list
num_or_range@dots{}
'string'@dots{}
num_or_range takes one of the following forms:
number
num1 TO num2
@end display
The DO REPEAT command causes a block of code to be repeated a number of
times with different variables, numbers, or strings textually
substituted into the block with each repetition.
Specify a repeat variable name followed by an equals sign (@samp{=}) and
the list of replacements. Replacements can be a list of variables
(which may be existing variables or new variables or a combination
thereof), of numbers, or of strings. When new variable names are
specified, DO REPEAT creates them as numeric variables. When numbers
are specified, runs of integers may be indicated with TO notation, for
instance @samp{1 TO 5} and @samp{1 2 3 4 5} would be equivalent. There
is no equivalent notation for string values.
Multiple repeat variables can be specified. When this is done, each
variable must have the same number of replacements.
The code within DO REPEAT is repeated as many times as there are
replacements for each variable. The first time, the first value for
each repeat variable is substituted; the second time, the second value
for each repeat variable is substituted; and so on.
Repeat variable substitutions work like macros. They take place
anywhere in a line that the repeat variable name occurs as a token,
including command and subcommand names. For this reason it is not a
good idea to select words commonly used in command and subcommand names
as repeat variable identifiers.
If PRINT is specified on END REPEAT, the commands after substitutions
are made are printed to the listing file, prefixed by a plus sign
(@samp{+}).
@node LOOP, , DO REPEAT, Conditionals and Looping
@section LOOP
@vindex LOOP
@display
LOOP [index_var=start TO end [BY incr]] [IF condition].
@dots{}
END LOOP [IF condition].
@end display
The LOOP command allows a group of commands to be iterated. A number of
termination options are offered.
Specify index_var in order to make that variable count from one value to
another by a particular increment. index_var must be a pre-existing
numeric variable. start, end, and incr are numeric expressions
(@pxref{Expressions}.)
During the first iteration, index_var is set to the value of start.
During each successive iteration, index_var is increased by the value of
incr. If end > start, then the loop terminates when index_var > end;
otherwise it terminates when index_var < end. If incr is not specified
then it defaults to +1 or -1 as appropriate.
If end > start and incr < 0, or if end < start and incr > 0, then the
loop is never executed. index_var is nevertheless set to the value of
start.
Modifying index_var within the loop is allowed, but it has no effect on
the value of index_var in the next iteration.
Specify a boolean expression for the condition on the LOOP command to
cause the loop to be executed only if the condition is true. If the
condition is false or missing before the loop contents are executed the
first time, the loop contents are not executed at all.
If index and condition clauses are both present on LOOP, the index
clause is always evaluated first.
Specify a boolean expression for the condition on the END LOOP to cause
the loop to terminate if the condition is not true after the enclosed
code block is executed. The condition is evaluated at the end of the
loop, not at the beginning.
If the index clause and both condition clauses are not present, then the
loop is executed MXLOOPS (@pxref{SET}) times or until BREAK
(@pxref{BREAK}) is executed.
The BREAK command provides another way to terminate execution of a LOOP
construct.
@node Statistics, Utilities, Conditionals and Looping, Top
@chapter Statistics
This chapter documents the statistical procedures that PSPP supports so
far.
@menu
* DESCRIPTIVES:: Descriptive statistics.
* FREQUENCIES:: Frequency tables.
* CROSSTABS:: Crosstabulation tables.
@end menu
@node DESCRIPTIVES, FREQUENCIES, Statistics, Statistics
@section DESCRIPTIVES
@display
DESCRIPTIVES
/VARIABLES=var_list
/MISSING=@{VARIABLE,LISTWISE@} @{INCLUDE,NOINCLUDE@}
/FORMAT=@{LABELS,NOLABELS@} @{NOINDEX,INDEX@} @{LINE,SERIAL@}
/SAVE
/STATISTICS=@{ALL,MEAN,SEMEAN,STDDEV,VARIANCE,KURTOSIS,
SKEWNESS,RANGE,MINIMUM,MAXIMUM,SUM,DEFAULT,
SESKEWNESS,SEKURTOSIS@}
/SORT=@{NONE,MEAN,SEMEAN,STDDEV,VARIANCE,KURTOSIS,SKEWNESS,
RANGE,MINIMUM,MAXIMUM,SUM,SESKEWNESS,SEKURTOSIS,NAME@}
@{A,D@}
@end display
The DESCRIPTIVES procedure reads the active file and outputs descriptive
statistics requested by the user. In addition, it can optionally
compute Z-scores.
The VARIABLES subcommand, which is required, specifies the list of
variables to be analyzed. Keyword VARIABLES is optional.
All other subcommands are optional:
The MISSING subcommand determines the handling of missing variables. If
INCLUDE is set, then user-missing values are included in the
calculations. If NOINCLUDE is set, which is the default, user-missing
values are excluded. If VARIABLE is set, then missing values are
excluded on a variable by variable basis; if LISTWISE is set, then
the entire case is excluded whenever any value in that case has a
system-missing or, if INCLUDE is set, user-missing value.
The FORMAT subcommand affects the output format. Currently the
LABELS/NOLABELS and NOINDEX/INDEX settings is not used. When SERIAL is
set, both valid and missing number of cases are listed in the output;
when NOSERIAL is set, only valid cases are listed.
The SAVE subcommand causes DESCRIPTIVES to calculate Z scores for all
the specified variables. The Z scores are saved to new variables.
Variable names are generated by trying first the original variable name
with Z prepended and truncated to a maximum of 8 characters, then the
names ZSC000 through ZSC999, STDZ00 through STDZ09, ZZZZ00 through
ZZZZ09, ZQZQ00 through ZQZQ09, in that sequence. In addition, Z score
variable names can be specified explicitly on VARIABLES in the variable
list by enclosing them in parentheses after each variable.
The STATISTICS subcommand specifies the statistics to be displayed:
@table @code
@item ALL
All of the statistics below.
@item MEAN
Arithmetic mean.
@item SEMEAN
Standard error of the mean.
@item STDDEV
Standard deviation.
@item VARIANCE
Variance.
@item KURTOSIS
Kurtosis and standard error of the kurtosis.
@item SKEWNESS
Skewness and standard error of the skewness.
@item RANGE
Range.
@item MINIMUM
Minimum value.
@item MAXIMUM
Maximum value.
@item SUM
Sum.
@item DEFAULT
Mean, standard deviation of the mean, minimum, maximum.
@item SEKURTOSIS
Standard error of the kurtosis.
@item SESKEWNESS
Standard error of the skewness.
@end table
The SORT subcommand specifies how the statistics should be sorted. Most
of the possible values should be self-explanatory. NAME causes the
statistics to be sorted by name. By default, the statistics are listed
in the order that they are specified on the VARIABLES subcommand. The A
and D settings request an ascending or descending sort order,
respectively.
@node FREQUENCIES, CROSSTABS, DESCRIPTIVES, Statistics
@section FREQUENCIES
@display
FREQUENCIES
/VARIABLES=var_list
/FORMAT=@{TABLE,NOTABLE,LIMIT(limit)@}
@{STANDARD,CONDENSE,ONEPAGE[(onepage_limit)]@}
@{LABELS,NOLABELS@}
@{AVALUE,DVALUE,AFREQ,DFREQ@}
@{SINGLE,DOUBLE@}
@{OLDPAGE,NEWPAGE@}
/MISSING=@{EXCLUDE,INCLUDE@}
/STATISTICS=@{DEFAULT,MEAN,SEMEAN,MEDIAN,MODE,STDDEV,VARIANCE,
KURTOSIS,SKEWNESS,RANGE,MINIMUM,MAXIMUM,SUM,
SESKEWNESS,SEKURTOSIS,ALL,NONE@}
/NTILES=ntiles
/PERCENTILES=percent@dots{}
(These options are not currently implemented.)
/BARCHART=@dots{}
/HISTOGRAM=@dots{}
/HBAR=@dots{}
/GROUPED=@dots{}
(Integer mode.)
/VARIABLES=var_list (low,high)@dots{}
@end display
FREQUENCIES causes the data to be read and frequency tables to be built
and output for specified variables. FREQUENCIES can also calculate and
display descriptive statistics (including median and mode) and
percentiles.
In the future, FREQUENCIES will also support graphical output in the
form of bar charts and histograms. In addition, it will be able to
support percentiles for grouped data. (As a historical note, these
options were supported in a version of PSPP written years ago, but the
code has not survived.)
The VARIABLES subcommand is the only required subcommand. Specify the
variables to be analyzed. In most cases, this is all that is required.
This is known as @dfn{general mode}.
Occasionally, one may want to invoke a special mode called @dfn{integer
mode}. Normally, in general mode, PSPP will automatically determine
what values occur in the data. In integer mode, the user specifies the
range of values that the data assumes. To invoke this mode, specify a
range of data values in parentheses, separated by a comma. Data values
inside the range are truncated to the nearest integer, then assigned to
that value. If values occur outside this range, they are discarded.
The FORMAT subcommand controls the output format. It has several
possible settings:
@itemize @bullet
@item
TABLE, the default, causes a frequency table to be output for every
variable specified. NOTABLE prevents them from being output. LIMIT
with a numeric argument causes them to be output except when there are
more than the specified number of values in the table.
@item
STANDARD frequency tables contain more complete information, but also to
take up more space on the printed page. CONDENSE frequency tables are
less informative but take up less space. ONEPAGE with a numeric
argument will output standard frequency tables if there are the
specified number of values or less, condensed tables otherwise. ONEPAGE
without an argument defaults to a threshold of 50 values.
@item
LABELS causes value labels to be displayed in STANDARD frequency
tables. NOLABLES prevents this.
@item
Normally frequency tables are sorted in ascending order by value. This
is AVALUE. DVALUE tables are sorted in descending order by value.
AFREQ and DFREQ tables are sorted in ascending and descending order,
respectively, by frequency count.
@item
SINGLE spaced frequency tables are closely spaced. DOUBLE spaced
frequency tables have wider spacing.
@item
OLDPAGE and NEWPAGE are not currently used.
@end itemize
The MISSING subcommand controls the handling of user-missing values.
When EXCLUDE, the default, is set, user-missing values are not included
in frequency tables or statistics. When INCLUDE is set, user-missing
are included. System-missing values are never included in statistics,
but are listed in frequency tables.
The available STATISTICS are the same as available in DESCRIPTIVES
(@pxref{DESCRIPTIVES}), with the addition of MEDIAN, the data's median
value, and MODE, the mode. (If there are multiple modes, the smallest
value is reported.) By default, the mean, standard deviation of the
mean, minimum, and maximum are reported for each variable.
NTILES causes the specified quartiles to be reported. For instance,
@code{/NTILES=4} would cause quartiles to be reported. In addition,
particular percentiles can be requested with the PERCENTILES subcommand.
@node CROSSTABS, , FREQUENCIES, Statistics
@section CROSSTABS
@display
CROSSTABS
/TABLES=var_list BY var_list [BY var_list]@dots{}
/MISSING=@{TABLE,INCLUDE,REPORT@}
/WRITE=@{NONE,CELLS,ALL@}
/FORMAT=@{TABLES,NOTABLES@}
@{LABELS,NOLABELS,NOVALLABS@}
@{PIVOT,NOPIVOT@}
@{AVALUE,DVALUE@}
@{NOINDEX,INDEX@}
@{BOX,NOBOX@}
/CELLS=@{COUNT,ROW,COLUMN,TOTAL,EXPECTED,RESIDUAL,SRESIDUAL,
ASRESIDUAL,ALL,NONE@}
/STATISTICS=@{CHISQ,PHI,CC,LAMBDA,UC,BTAU,CTAU,RISK,GAMMA,D,
KAPPA,ETA,CORR,ALL,NONE@}
(Integer mode.)
/VARIABLES=var_list (low,high)@dots{}
@end display
CROSSTABS reads the active file and builds and displays crosstabulation
tables requested by the user. It can calculate several statistics for
each cell in the crosstabulation tables. In addition, a number of
statistics can be calculated for each table itself.
The TABLES subcommand is used to specify the tables to be reported. Any
number of dimensions is permitted, and any number of variables per
dimension is allowed. The TABLES subcommand may be repeated as many
times as needed. This is the only required subcommand in @dfn{general
mode}.
Occasionally, one may want to invoke a special mode called @dfn{integer
mode}. Normally, in general mode, PSPP will automatically determine
what values occur in the data. In integer mode, the user specifies the
range of values that the data assumes. To invoke this mode, specify the
VARIABLES subcommand, giving a range of data values in parentheses for
each variable to be used on the TABLES subcommand. Data values inside
the range are truncated to the nearest integer, then assigned to that
value. If values occur outside this range, they are discarded. When it
is present, the VARIABLES subcommand must precede the TABLES subcommand.
The MISSING subcommand determines the handling of user-missing values.
When set to TABLE, the default, missing values are dropped on a table by
table basis. When set to INCLUDE, user-missing values are included in
tables and statistics. When set to REPORT, which is allowed only in
integer mode, user-missing values are included in tables but marked with
an @samp{M} (for ``missing'') and excluded from statistical
calculations.
Currently the WRITE subcommand is not used.
The FORMAT subcommand controls the characteristics of the
crosstabulation tables to be displayed. It has a number of possible
settings:
@itemize @bullet
@item
TABLES, the default, causes crosstabulation tables to be output.
NOTABLES suppresses them.
@item
LABELS, the default, allows variable labels and value labels to appear
in the output. NOLABELS suppresses them. NOVALLABS displays variable
labels but suppresses value labels.
@item
PIVOT, the default, causes each TABLES subcommand to be displayed in a
pivot table format. NOPIVOT causes the old-style crosstabulation format
to be used.
@item
AVALUE, the default, causes values to be sorted in ascending order.
DVALUE asserts a descending sort order.
@item
INDEX/NOINDEX is currently ignored.
@item
BOX/NOBOX is currently ignored.
@end itemize
The CELLS subcommand controls the contents of each cell in the displayed
crosstabulation table. The possible settings are:
@table @asis
@item COUNT
Frequency count.
@item ROW
Row percent.
@item COLUMN
Column percent.
@item TOTAL
Table percent.
@item EXPECTED
Expected value.
@item RESIDUAL
Residual.
@item SRESIDUAL
Standardized residual.
@item ASRESIDUAL
Adjusted standardized residual.
@item ALL
All of the above.
@item NONE
Suppress cells entirely.
@end table
@samp{/CELLS} without any settings specified requests COUNT, ROW,
COLUMN, and TOTAL. If CELLS is not specified at all then only COUNT
will be selected.
The STATISTICS subcommand selects statistics for computation:
@table @asis
@item CHISQ
Pearson chi-square, likelihood ratio, Fisher's exact test, continuity
correction, linear-by-linear association.
@item PHI
Phi.
@item CC
Contingency coefficient.
@item LAMBDA
Lambda.
@item UC
Uncertainty coefficient.
@item BTAU
Tau-b.
@item CTAU
Tau-c.
@item RISK
Risk estimate.
@item GAMMA
Gamma.
@item D
Somers' D.
@item KAPPA
Cohen's Kappa.
@item ETA
Eta.
@item CORR
Spearman correlation, Pearson's r.
@item ALL
All of the above.
@item NONE
No statistics.
@end table
Selected statistics are only calculated when appropriate for the
statistic. Certain statistics require tables of a particular size, and
some statistics are calculated only in integer mode.
@samp{/STATISTICS} without any settings selects CHISQ. If the
STATISTICS subcommand is not given, no statistics are calculated.
@strong{Please note:} Currently the implementation of CROSSTABS has the
followings bugs:
@itemize @bullet
@item
Pearson's R (but not Spearman!) is off a little.
@item
T values for Spearman's R and Pearson's R are wrong.
@item
How to calculate significance of symmetric and directional measures?
@item
Asymmetric ASEs and T values for lambda are wrong.
@item
ASE of Goodman and Kruskal's tau is not calculated.
@item
ASE of symmetric somers' d is wrong.
@item
Approx. T of uncertainty coefficient is wrong.
@end itemize
Fix for any of these deficiencies would be welcomed.
@node Utilities, Not Implemented, Statistics, Top
@chapter Utilities
Commands that don't fit any other category are placed here.
Most of these commands are not affected by commands like IF and LOOP:
they take effect only once, unconditionally, at the time that they are
encountered in the input.
@menu
* COMMENT:: Document your syntax file.
* DOCUMENT:: Document the active file.
* DISPLAY DOCUMENTS:: Display active file documents.
* DISPLAY FILE LABEL:: Display the active file label.
* DROP DOCUMENTS:: Remove documents from the active file.
* EXECUTE:: Execute pending transformations.
* FILE LABEL:: Set the active file's label.
* INCLUDE:: Include a file within the current one.
* QUIT:: Terminate the PSPP session.
* SET:: Adjust PSPP runtime parameters.
* SUBTITLE:: Provide a document subtitle.
* SYSFILE INFO:: Display the dictionary in a system file.
* TITLE:: Provide a document title.
@end menu
@node COMMENT, DOCUMENT, Utilities, Utilities
@section COMMENT
@vindex COMMENT
@vindex *
@display
Two possibles syntaxes:
COMMENT comment text @dots{} .
*comment text @dots{} .
@end display
The COMMENT command is ignored. It is used to provide information to
the author and other readers of the PSPP syntax file.
A COMMENT command can extend over any number of lines. Don't forget to
terminate it with a dot or a blank line!
@node DOCUMENT, DISPLAY DOCUMENTS, COMMENT, Utilities
@section DOCUMENT
@vindex DOCUMENT
@display
DOCUMENT documentary_text.
@end display
The DOCUMENT command adds one or more lines of descriptive commentary to
the active file. Documents added in this way are saved to system files.
They can be viewed using SYSFILE INFO or DISPLAY DOCUMENTS. They can be
removed from the active file with DROP DOCUMENTS.
Specify the documentary text following the DOCUMENT keyword. You can
extend the documentary text over as many lines as necessary. Lines are
truncated at 80 characters width. Don't forget to terminate the
DOCUMENT command with a dot or a blank line.
@node DISPLAY DOCUMENTS, DISPLAY FILE LABEL, DOCUMENT, Utilities
@section DISPLAY DOCUMENTS
@vindex DISPLAY DOCUMENTS
@display
DISPLAY DOCUMENTS.
@end display
DISPLAY DOCUMENTS displays the documents in the active file. Each
document is preceded by a line giving the time and date that it was
added. @xref{DOCUMENT}.
@node DISPLAY FILE LABEL, DROP DOCUMENTS, DISPLAY DOCUMENTS, Utilities
@section DISPLAY FILE LABEL
@vindex DISPLAY FILE LABEL
@display
DISPLAY FILE LABEL.
@end display
DISPLAY FILE LABEL displays the file label contained in the active file,
if any. @xref{FILE LABEL}.
@node DROP DOCUMENTS, EXECUTE, DISPLAY FILE LABEL, Utilities
@section DROP DOCUMENTS
@vindex DROP DOCUMENTS
@display
DROP DOCUMENTS.
@end display
The DROP DOCUMENTS command removes all documents from the active file.
New documents can be added with the DOCUMENT utility (@pxref{DOCUMENT}).
DROP DOCUMENTS only changes the active file. It does not modify any
system files stored on disk.
@node EXECUTE, FILE LABEL, DROP DOCUMENTS, Utilities
@section EXECUTE
@vindex EXECUTE
@display
EXECUTE.
@end display
The EXECUTE utility causes the active file to be read and all pending
transformations to be executed.
@node FILE LABEL, INCLUDE, EXECUTE, Utilities
@section FILE LABEL
@vindex FILE LABEL
@display
FILE LABEL file_label.
@end display
Use the FILE LABEL command to provide a title for the active file. This
title will be saved into system files and portable files that are
created during this PSPP run.
It is not necessary to include quotes around file_label. If they are
included then they become part of the file label.
@node INCLUDE, QUIT, FILE LABEL, Utilities
@section INCLUDE
@vindex INCLUDE
@vindex @@
@display
Two possible syntaxes:
INCLUDE 'filename'.
@@filename.
@end display
The INCLUDE command causes the PSPP command processor to read an
additional command file as if it were included bodily in the current
command file.
INCLUDE files may be nested to any depth, up to the limit of available
memory.
@node QUIT, SET, INCLUDE, Utilities
@section QUIT
@vindex QUIT
@display
Two possible syntaxes:
QUIT.
EXIT.
@end display
The QUIT command terminates the current PSPP session and returns control
to the operating system.
This command is not valid within a command file.
@node SET, SUBTITLE, QUIT, Utilities
@section SET
@vindex SET
@display
SET
(data input)
/BLANKS=@{SYSMIS,'.',number@}
/DECIMAL=@{DOT,COMMA@}
/FORMAT=fmt_spec
(program input)
/ENDCMD='.'
/NULLINE=@{ON,OFF@}
(interaction)
/CPROMPT='cprompt_string'
/DPROMPT='dprompt_string'
/ERRORBREAK=@{OFF,ON@}
/MXERRS=max_errs
/MXWARNS=max_warnings
/PROMPT='prompt'
/VIEWLENGTH=@{MINIMUM,MEDIAN,MAXIMUM,n_lines@}
/VIEWWIDTH=n_characters
(program execution)
/MEXPAND=@{ON,OFF@}
/MITERATE=max_iterations
/MNEST=max_nest
/MPRINT=@{ON,OFF@}
/MXLOOPS=max_loops
/SEED=@{RANDOM,seed_value@}
/UNDEFINED=@{WARN,NOWARN@}
(data output)
/CC@{A,B,C,D,E@}=@{'npre,pre,suf,nsuf','npre.pre.suf.nsuf'@}
/DECIMAL=@{DOT,COMMA@}
/FORMAT=fmt_spec
(output routing)
/ECHO=@{ON,OFF@}
/ERRORS=@{ON,OFF,TERMINAL,LISTING,BOTH,NONE@}
/INCLUDE=@{ON,OFF@}
/MESSAGES=@{ON,OFF,TERMINAL,LISTING,BOTH,NONE@}
/PRINTBACK=@{ON,OFF@}
/RESULTS=@{ON,OFF,TERMINAL,LISTING,BOTH,NONE@}
(output activation)
/LISTING=@{ON,OFF@}
/PRINTER=@{ON,OFF@}
/SCREEN=@{ON,OFF@}
(output driver options)
/HEADERS=@{NO,YES,BLANK@}
/LENGTH=@{NONE,length_in_lines@}
/LISTING=filename
/MORE=@{ON,OFF@}
/PAGER=@{OFF,"pager_name"@}
/WIDTH=@{NARROW,WIDTH,n_characters@}
(logging)
/JOURNAL=@{ON,OFF@} [filename]
/LOG=@{ON,OFF@} [filename]
(system files)
/COMPRESSION=@{ON,OFF@}
/SCOMPRESSION=@{ON,OFF@}
(security)
/SAFER=ON
(obsolete settings accepted for compatibility, but ignored)
/AUTOMENU=@{ON,OFF@}
/BEEP=@{ON,OFF@}
/BLOCK='c'
/BOXSTRING=@{'xxx','xxxxxxxxxxx'@}
/CASE=@{UPPER,UPLOW@}
/COLOR=@dots{}
/CPI=cpi_value
/DISK=@{ON,OFF@}
/EJECT=@{ON,OFF@}
/HELPWINDOWS=@{ON,OFF@}
/HIGHRES=@{ON,OFF@}
/HISTOGRAM='c'
/LOWRES=@{AUTO,ON,OFF@}
/LPI=lpi_value
/MENUS=@{STANDARD,EXTENDED@}
/MXMEMORY=max_memory
/PTRANSLATE=@{ON,OFF@}
/RCOLORS=@dots{}
/RUNREVIEW=@{AUTO,MANUAL@}
/SCRIPTTAB='c'
/TB1=@{'xxx','xxxxxxxxxxx'@}
/TBFONTS='string'
/WORKDEV=drive_letter
/WORKSPACE=workspace_size
/XSORT=@{YES,NO@}
@end display
The SET command allows the user to adjust several parameters relating to
PSPP's execution. Since there are many subcommands to this command, its
subcommands will be examined in groups.
As a general comment, ON and YES are considered synonymous, and
so are OFF and NO, when used as subcommand values.
The data input subcommands affect the way that data is read from data
files. The data input subcommands are
@table @asis
@item BLANKS
This is the value assigned to an item data item that is empty or
contains only whitespace. An argument of SYSMIS or '.' will cause the
system-missing value to be assigned to null items. This is the
default. Any real value may be assigned.
@item DECIMAL
The default DOT setting causes the decimal point character to be
@samp{.}. A setting of COMMA causes the decimal point character to be
@samp{,}.
@item FORMAT
Allows the default numeric input/output format to be specified. The
default is F8.2. @xref{Input/Output Formats}.
@end table
Program input subcommands affect the way that programs are parsed when
they are typed interactively or run from a script. They are
@table @asis
@item ENDCMD
This is a single character indicating the end of a command. The default
is @samp{.}. Don't change this.
@item NULLINE
Whether a blank line is interpreted as ending the current command. The
default is ON.
@end table
Interaction subcommands affect the way that PSPP interacts with an
online user. The interaction subcommands are
@table @asis
@item CPROMPT
The command continuation prompt. The default is @samp{ > }.
@item DPROMPT
Prompt used when expecting data input within BEGIN DATA (@pxref{BEGIN
DATA}). The default is @samp{data> }.
@item ERRORBREAK
Whether an error causes PSPP to stop processing the current command
file after finishing the current command. The default is OFF.
@item MXERRS
The maximum number of errors before PSPP halts processing of the current
command file. The default is 50.
@item MXWARNS
The maximum number of warnings + errors before PSPP halts processing the
current command file. The default is 100.
@item PROMPT
The command prompt. The default is @samp{PSPP> }.
@item VIEWLENGTH
The length of the screen in lines. MINIMUM means 25 lines, MEDIAN and
MAXIMUM mean 43 lines. Otherwise specify the number of lines. Normally
PSPP should auto-detect your screen size so this shouldn't have to be
used.
@item VIEWWIDTH
The width of the screen in characters. Normally 80 or 132.
@end table
Program execution subcommands control the way that PSPP commands
execute. The program execution subcommands are
@table @asis
@item MEXPAND
@itemx MITERATE
@itemx MNEST
@itemx MPRINT
Currently not used.
@item MXLOOPS
The maximum number of iterations for an uncontrolled loop.
@item SEED
The initial pseudo-random number seed. Set to a real number or to
RANDOM, which will obtain an initial seed from the current time of day.
@item UNDEFINED
Currently not used.
@end table
Data output subcommands affect the format of output data. These
subcommands are
@table @asis
@item CCA
@itemx CCB
@itemx CCC
@itemx CCD
@itemx CCE
Set up custom currency formats. The argument is a string which must
contain exactly three commas or exactly three periods. If commas, then
the grouping character for the currency format is @samp{,}, and the
decimal point character is @samp{.}; if periods, then the situation is
reversed.
The commas or periods divide the string into four fields, which are, in
order, the negative prefix, prefix, suffix, and negative suffix. When a
value is formatted using the custom currency format, the prefix precedes
the value formatted and the suffix follows it. In addition, if the
value is negative, the negative prefix precedes the prefix and the
negative suffix follows the suffix.
@item DECIMAL
The default DOT setting causes the decimal point character to be
@samp{.}. A setting of COMMA causes the decimal point character to be
@samp{,}.
@item FORMAT
Allows the default numeric input/output format to be specified. The
default is F8.2. @xref{Input/Output Formats}.
@end table
Output routing subcommands affect where the output of transformations
and procedures is sent. These subcommands are
@table @asis
@item ECHO
If turned on, commands are written to the listing file as they are read
from command files. The default is OFF.
@itemx ERRORS
@itemx INCLUDE
@itemx MESSAGES
@item PRINTBACK
@item RESULTS
Currently not used.
@end table
Output activation subcommands affect whether output devices of
particular types are enabled. These subcommands are
@table @asis
@item LISTING
Enable or disable listing devices.
@item PRINTER
Enable or disable printer devices.
@item SCREEN
Enable or disable screen devices.
@end table
Output driver option subcommands affect output drivers' settings. These
subcommands are
@table @asis
@item HEADERS
@itemx LENGTH
@itemx LISTING
@itemx MORE
@itemx PAGER
@itemx WIDTH
Currently not used.
@end table
Logging subcommands affect logging of commands executed to external
files. These subcommands are
@table @asis
@item JOURNAL
@item LOG
Not currently used.
@end table
System file subcommands affect the default format of system files
produced by PSPP. These subcommands are
@table @asis
@item COMPRESSION
Not currently used.
@item SCOMPRESSION
Whether system files created by SAVE or XSAVE are compressed by default.
The default is ON.
@end table
Security subcommands affect the operations that commands are allowed to
perform. The security subcommands are
@table @asis
@item SAFER
When set, this setting cannot ever be reset, for obvious security
reasons. Setting this option disables the following operations:
@itemize @bullet
@item
The ERASE command.
@item
The HOST command.
@item
Pipe filenames (filenames beginning or ending with @samp{|}).
@item
@end itemize
Be aware that this setting does not guarantee safety (commands can still
overwrite files, for instance) but it is an improvement.
@end table
@node SUBTITLE, TITLE, SET, Utilities
@section SUBTITLE
@vindex SUBTITLE
@display
Two possible syntaxes:
SUBTITLE 'subtitle_string'.
SUBTITLE subtitle_string.
@end display
The SUBTITLE command is used to provide a subtitle to a particular PSPP
run. This subtitle appears at the top of each output page below the
title, if titles are enabled on the output device.
Specify a subtitle as a string in quotes. The alternate syntax that did
not require quotes is now obsolete. If it is used then the subtitle is
converted to all uppercase.
@node TITLE, , SUBTITLE, Utilities
@section TITLE
@vindex TITLE
@display
Two possible syntaxes:
TITLE 'title_string'.
TITLE title_string.
@end display
The TITLE command is used to provide a title to a particular PSPP run.
This title appears at the top of each output page, if titles are enabled
on the output device.
Specify a title as a string in quotes. The alternate syntax that did
not require quotes is now obsolete. If it is used then the title is
converted to all uppercase.
@node Not Implemented, Data File Format, Utilities, Top
@chapter Not Implemented
This chapter lists parts of the PSPP language that are not yet
implemented.
The following transformations and utilities are not yet implemented, but
they will be supported in a later release.
@itemize @bullet
@item
ADD FILES
@item
DEFINE
@item
FILE TYPE
@item
GET SAS
@item
GET TRANSLATE
@item
MCONVERT
@item
PRESERVE
@item
PROCEDURE OUTPUT
@item
RESTORE
@item
SAVE TRANSLATE
@item
SHOW
@item
UPDATE
@end itemize
The following transformations and utilities are not implemented. There
are no plans to support them in future releases. Contributions to
implement them will still be accepted.
@itemize @bullet
@item
EDIT
@item
GET DATABASE
@item
GET OSIRIS
@item
GET SCSS
@item
GSET
@item
HELP
@item
INFO
@item
INPUT MATRIX
@item
KEYED DATA LIST
@item
NUMBERED and UNNUMBERED
@item
OPTIONS
@item
REVIEW
@item
SAVE SCSS
@item
SPSS MANAGER
@item
STATISTICS
@end itemize
@node Data File Format, Portable File Format, Not Implemented, Top
@chapter Data File Format
PSPP necessarily uses the same format for system files as do the
products with which it is compatible. This chapter is a description of
that format.
There are three data types used in system files: 32-bit integers, 64-bit
floating points, and 1-byte characters. In this document these will
simply be referred to as @code{int32}, @code{flt64}, and @code{char},
the names that are used in the PSPP source code. Every field of type
@code{int32} or @code{flt64} is aligned on a 32-bit boundary.
The endianness of data in PSPP system files is not specified. System
files output on a computer of a particular endianness will have the
endianness of that computer. However, PSPP can read files of either
endianness, regardless of its host computer's endianness. PSPP
translates endianness for both integer and floating point numbers.
Floating point formats are also not specified. PSPP does not
translate between floating point formats. This is unlikely to be a
problem as all modern computer architectures use IEEE 754 format for
floating point representation.
The PSPP system-missing value is represented by the largest possible
negative number in the floating point format; in C, this is most likely
@code{-DBL_MAX}. There are two other important values used in missing
values: @code{HIGHEST} and @code{LOWEST}. These are represented by the
largest possible positive number (probably @code{DBL_MAX}) and the
second-largest negative number. The latter must be determined in a
system-dependent manner; in IEEE 754 format it is represented by value
@code{0xffeffffffffffffe}.
System files are divided into records. Each record begins with an
@code{int32} giving a numeric record type. Individual record types are
described below:
@menu
* File Header Record::
* Variable Record::
* Value Label Record::
* Value Label Variable Record::
* Document Record::
* Machine int32 Info Record::
* Machine flt64 Info Record::
* Miscellaneous Informational Records::
* Dictionary Termination Record::
* Data Record::
@end menu
@node File Header Record, Variable Record, Data File Format, Data File Format
@section File Header Record
The file header is always the first record in the file.
@example
struct sysfile_header
@{
char rec_type[4];
char prod_name[60];
int32 layout_code;
int32 case_size;
int32 compressed;
int32 weight_index;
int32 ncases;
flt64 bias;
char creation_date[9];
char creation_time[8];
char file_label[64];
char padding[3];
@};
@end example
@table @code
@item char rec_type[4];
Record type code. Always set to @samp{$FL2}. This is the only record
for which the record type is not of type @code{int32}.
@item char prod_name[60];
Product identification string. This always begins with the characters
@samp{@@(#) SPSS DATA FILE}. PSPP uses the remaining characters to
give its version and the operating system name; for example, @samp{GNU
pspp 0.1.4 - sparc-sun-solaris2.5.2}. The string is truncated if it
would be longer than 60 characters; otherwise it is padded on the right
with spaces.
@item int32 layout_code;
Always set to 2. PSPP reads this value in order to determine the
file's endianness.
@item int32 case_size;
Number of data elements per case. This is the number of variables,
except that long string variables add extra data elements (one for every
8 characters after the first 8).
@item int32 compressed;
Set to 1 if the data in the file is compressed, 0 otherwise.
@item int32 weight_index;
If one of the variables in the data set is used as a weighting variable,
set to the index of that variable. Otherwise, set to 0.
@item int32 ncases;
Set to the number of cases in the file if it is known, or -1 otherwise.
In the general case it is not possible to determine the number of cases
that will be output to a system file at the time that the header is
written. The way that this is dealt with is by writing the entire
system file, including the header, then seeking back to the beginning of
the file and writing just the @code{ncases} field. For `files' in which
this is not valid, the seek operation fails. In this case,
@code{ncases} remains -1.
@item flt64 bias;
Compression bias. Always set to 100. The significance of this value is
that only numbers between @code{(1 - bias)} and @code{(251 - bias)} can
be compressed.
@item char creation_date[9];
Set to the date of creation of the system file, in @samp{dd mmm yy}
format, with the month as standard English abbreviations, using an
initial capital letter and following with lowercase. If the date is not
available then this field is arbitrarily set to @samp{01 Jan 70}.
@item char creation_time[8];
Set to the time of creation of the system file, in @samp{hh:mm:ss}
format and using 24-hour time. If the time is not available then this
field is arbitrarily set to @samp{00:00:00}.
@item char file_label[64];
Set the the file label declared by the user, if any. Padded on the
right with spaces.
@item char padding[3];
Ignored padding bytes to make the structure a multiple of 32 bits in
length. Set to zeros.
@end table
@node Variable Record, Value Label Record, File Header Record, Data File Format
@section Variable Record
Immediately following the header must come the variable records. There
must be one variable record for every variable and every 8 characters in
a long string beyond the first 8; i.e., there must be exactly as many
variable records as the value specified for @code{case_size} in the file
header record.
@example
struct sysfile_variable
@{
int32 rec_type;
int32 type;
int32 has_var_label;
int32 n_missing_values;
int32 print;
int32 write;
char name[8];
/* The following two fields are present
only if has_var_label is 1. */
int32 label_len;
char label[/* variable length */];
/* The following field is present only
if n_missing_values is not 0. */
flt64 missing_values[/* variable length*/];
@};
@end example
@table @code
@item int32 rec_type;
Record type code. Always set to 2.
@item int32 type;
Variable type code. Set to 0 for a numeric variable. For a short
string variable or the first part of a long string variable, this is set
to the width of the string. For the second and subsequent parts of a
long string variable, set to -1, and the remaining fields in the
structure are ignored.
@item int32 has_var_label;
If this variable has a variable label, set to 1; otherwise, set to 0.
@item int32 n_missing_values;
If the variable has no missing values, set to 0. If the variable has
one, two, or three discrete missing values, set to 1, 2, or 3,
respectively. If the variable has a range for missing variables, set to
-2; if the variable has a range for missing variables plus a single
discrete value, set to -3.
@item int32 print;
Print format for this variable. See below.
@item int32 write;
Write format for this variable. See below.
@item char name[8];
Variable name. The variable name must begin with a capital letter or
the at-sign (@samp{@@}). Subsequent characters may also be octothorpes
(@samp{#}), dollar signs (@samp{$}), underscores (@samp{_}), or full
stops (@samp{.}). The variable name is padded on the right with spaces.
@item int32 label_len;
This field is present only if @code{has_var_label} is set to 1. It is
set to the length, in characters, of the variable label, which must be a
number between 0 and 120.
@item char label[/* variable length */];
This field is present only if @code{has_var_label} is set to 1. It has
length @code{label_len}, rounded up to the nearest multiple of 32 bits.
The first @code{label_len} characters are the variable's variable label.
@item flt64 missing_values[/* variable length */];
This field is present only if @code{n_missing_values} is not 0. It has
the same number of elements as the absolute value of
@code{n_missing_values}. For discrete missing values, each element
represents one missing value. When a range is present, the first
element denotes the minimum value in the range, and the second element
denotes the maximum value in the range. When a range plus a value are
present, the third element denotes the additional discrete missing
value. HIGHEST and LOWEST are indicated as described in the chapter
introduction.
@end table
The @code{print} and @code{write} members of sysfile_variable are output
formats coded into @code{int32} types. The LSB (least-significant byte)
of the @code{int32} represents the number of decimal places, and the
next two bytes in order of increasing significance represent field width
and format type, respectively. The MSB (most-significant byte) is not
used and should be set to zero.
Format types are defined as follows:
@table @asis
@item 0
Not used.
@item 1
@code{A}
@item 2
@code{AHEX}
@item 3
@code{COMMA}
@item 4
@code{DOLLAR}
@item 5
@code{F}
@item 6
@code{IB}
@item 7
@code{PIBHEX}
@item 8
@code{P}
@item 9
@code{PIB}
@item 10
@code{PK}
@item 11
@code{RB}
@item 12
@code{RBHEX}
@item 13
Not used.
@item 14
Not used.
@item 15
@code{Z}
@item 16
@code{N}
@item 17
@code{E}
@item 18
Not used.
@item 19
Not used.
@item 20
@code{DATE}
@item 21
@code{TIME}
@item 22
@code{DATETIME}
@item 23
@code{ADATE}
@item 24
@code{JDATE}
@item 25
@code{DTIME}
@item 26
@code{WKDAY}
@item 27
@code{MONTH}
@item 28
@code{MOYR}
@item 29
@code{QYR}
@item 30
@code{WKYR}
@item 31
@code{PCT}
@item 32
@code{DOT}
@item 33
@code{CCA}
@item 34
@code{CCB}
@item 35
@code{CCC}
@item 36
@code{CCD}
@item 37
@code{CCE}
@item 38
@code{EDATE}
@item 39
@code{SDATE}
@end table
@node Value Label Record, Value Label Variable Record, Variable Record, Data File Format
@section Value Label Record
Value label records must follow the variable records and must precede
the header termination record. Other than this, they may appear
anywhere in the system file. Every value label record must be
immediately followed by a label variable record, described below.
Value label records begin with @code{rec_type}, an @code{int32} value
set to the record type of 3. This is followed by @code{count}, an
@code{int32} value set to the number of value labels present in this
record.
These two fields are followed by a series of @code{count} tuples. Each
tuple is divided into two fields, the value and the label. The first of
these, the value, is composed of a 64-bit value, which is either a
@code{flt64} value or up to 8 characters (padded on the right to 8
bytes) denoting a short string value. Whether the value is a
@code{flt64} or a character string is not defined inside the value label
record.
The second field in the tuple, the label, has variable length. The
first @code{char} is a count of the number of characters in the value
label. The remainder of the field is the label itself. The field is
padded on the right to a multiple of 64 bits in length.
@node Value Label Variable Record, Document Record, Value Label Record, Data File Format
@section Value Label Variable Record
Every value label variable record must be immediately preceded by a
value label record, described above.
@example
struct sysfile_value_label_variable
@{
int32 rec_type;
int32 count;
int32 vars[/* variable length */];
@};
@end example
@table @code
@item int32 rec_type;
Record type. Always set to 4.
@item int32 count;
Number of variables that the associated value labels from the value
label record are to be applied.
@item int32 vars[/* variable length];
A list of variables to which to apply the value labels. There are
@code{count} elements.
@end table
@node Document Record, Machine int32 Info Record, Value Label Variable Record, Data File Format
@section Document Record
There must be no more than one document record per system file.
Document records must follow the variable records and precede the
dictionary termination record.
@example
struct sysfile_document
@{
int32 rec_type;
int32 n_lines;
char lines[/* variable length */][80];
@};
@end example
@table @code
@item int32 rec_type;
Record type. Always set to 6.
@item int32 n_lines;
Number of lines of documents present.
@item char lines[/* variable length */][80];
Document lines. The number of elements is defined by @code{n_lines}.
Lines shorter than 80 characters are padded on the right with spaces.
@end table
@node Machine int32 Info Record, Machine flt64 Info Record, Document Record, Data File Format
@section Machine @code{int32} Info Record
There must be no more than one machine @code{int32} info record per
system file. Machine @code{int32} info records must follow the variable
records and precede the dictionary termination record.
@example
struct sysfile_machine_int32_info
@{
/* Header. */
int32 rec_type;
int32 subtype;
int32 size;
int32 count;
/* Data. */
int32 version_major;
int32 version_minor;
int32 version_revision;
int32 machine_code;
int32 floating_point_rep;
int32 compression_code;
int32 endianness;
int32 character_code;
@};
@end example
@table @code
@item int32 rec_type;
Record type. Always set to 7.
@item int32 subtype;
Record subtype. Always set to 3.
@item int32 size;
Size of each piece of data in the data part, in bytes. Always set to 4.
@item int32 count;
Number of pieces of data in the data part. Always set to 8.
@item int32 version_major;
PSPP major version number. In version @var{x}.@var{y}.@var{z}, this
is @var{x}.
@item int32 version_minor;
PSPP minor version number. In version @var{x}.@var{y}.@var{z}, this
is @var{y}.
@item int32 version_revision;
PSPP version revision number. In version @var{x}.@var{y}.@var{z},
this is @var{z}.
@item int32 machine_code;
Machine code. PSPP always set this field to value to -1, but other
values may appear.
@item int32 floating_point_rep;
Floating point representation code. For IEEE 754 systems this is 1.
IBM 370 sets this to 2, and DEC VAX E to 3.
@item int32 compression_code;
Compression code. Always set to 1.
@item int32 endianness;
Machine endianness. 1 indicates big-endian, 2 indicates little-endian.
@item int32 character_code;
Character code. 1 indicates EBCDIC, 2 indicates 7-bit ASCII, 3
indicates 8-bit ASCII, 4 indicates DEC Kanji.
@end table
@node Machine flt64 Info Record, Miscellaneous Informational Records, Machine int32 Info Record, Data File Format
@section Machine @code{flt64} Info Record
There must be no more than one machine @code{flt64} info record per
system file. Machine @code{flt64} info records must follow the variable
records and precede the dictionary termination record.
@example
struct sysfile_machine_flt64_info
@{
/* Header. */
int32 rec_type;
int32 subtype;
int32 size;
int32 count;
/* Data. */
flt64 sysmis;
flt64 highest;
flt64 lowest;
@};
@end example
@table @code
@item int32 rec_type;
Record type. Always set to 3.
@item int32 subtype;
Record subtype. Always set to 4.
@item int32 size;
Size of each piece of data in the data part, in bytes. Always set to 4.
@item int32 count;
Number of pieces of data in the data part. Always set to 3.
@item flt64 sysmis;
The system missing value.
@item flt64 highest;
The value used for HIGHEST in missing values.
@item flt64 lowest;
The value used for LOWEST in missing values.
@end table
@node Miscellaneous Informational Records, Dictionary Termination Record, Machine flt64 Info Record, Data File Format
@section Miscellaneous Informational Records
Miscellaneous informational records must follow the variable records and
precede the dictionary termination record.
Miscellaneous informational records are ignored by PSPP when reading
system files. They are not written by PSPP when writing system files.
@example
struct sysfile_misc_info
@{
/* Header. */
int32 rec_type;
int32 subtype;
int32 size;
int32 count;
/* Data. */
char data[/* variable length */];
@};
@end example
@table @code
@item int32 rec_type;
Record type. Always set to 3.
@item int32 subtype;
Record subtype. May take any value.
@item int32 size;
Size of each piece of data in the data part. Should have the value 4 or
8, for @code{int32} and @code{flt64}, respectively.
@item int32 count;
Number of pieces of data in the data part.
@item char data[/* variable length */];
Arbitrary data. There must be @code{size} times @code{count} bytes of
data.
@end table
@node Dictionary Termination Record, Data Record, Miscellaneous Informational Records, Data File Format
@section Dictionary Termination Record
The dictionary termination record must follow all other records, except
for the actual cases, which it must precede. There must be exactly one
dictionary termination record in every system file.
@example
struct sysfile_dict_term
@{
int32 rec_type;
int32 filler;
@};
@end example
@table @code
@item int32 rec_type;
Record type. Always set to 999.
@item int32 filler;
Ignored padding. Should be set to 0.
@end table
@node Data Record, , Dictionary Termination Record, Data File Format
@section Data Record
Data records must follow all other records in the data file. There must
be at least one data record in every system file.
The format of data records varies depending on whether the data is
compressed. Regardless, the data is arranged in a series of 8-byte
elements.
When data is not compressed, Every case is composed of @code{case_size}
of these 8-byte elements, where @code{case_size} comes from the file
header record (@pxref{File Header Record}). Each element corresponds to
the variable declared in the respective variable record (@pxref{Variable
Record}). Numeric values are given in @code{flt64} format; string
values are literal characters string, padded on the right when
necessary.
Compressed data is arranged in the following manner: the first 8-byte
element in the data section is divided into a series of 1-byte command
codes. These codes have meanings as described below:
@table @asis
@item 0
Ignored. If the program writing the system file accumulates compressed
data in blocks of fixed length, 0 bytes can be used to pad out extra
bytes remaining at the end of a fixed-size block.
@item 1 through 251
These values indicate that the corresponding numeric variable has the
value @code{(@var{code} - @var{bias})} for the case being read, where
@var{code} is the value of the compression code and @var{bias} is the
variable @code{compression_bias} from the file header. For example,
code 105 with bias 100.0 (the normal value) indicates a numeric variable
of value 5.
@item 252
End of file. This code may or may not appear at the end of the data
stream. PSPP always outputs this code but its use is not required.
@item 253
This value indicates that the numeric or string value is not
compressible. The value is stored in the 8-byte element following the
current block of command bytes. If this value appears twice in a block
of command bytes, then it indicates the second element following the
command bytes, and so on.
@item 254
Used to indicate a string value that is all spaces.
@item 255
Used to indicate the system-missing value.
@end table
When the end of the first 8-byte element of command bytes is reached,
any blocks of non-compressible values are skipped, and the next element
of command bytes is read and interpreted, until the end of the file is
reached.
@node Portable File Format, q2c Input Format, Data File Format, Top
@chapter Portable File Format
These days, most computers use the same internal data formats for
integer and floating-point data, if one ignores little differences like
big- versus little-endian byte ordering. However, occasionally it is
necessary to exchange data between systems with incompatible data
formats. This is what portable files are designed to do.
@strong{Please note:} Although all of the following information is
correct, as far as the author has been able to ascertain, it is gleaned
from examination of ASCII-formatted portable files only, so some of it
may be incorrect in the general case.
@menu
* Portable File Characters::
* Portable File Structure::
* Portable File Header::
* Version and Date Info Record::
* Identification Records::
* Variable Count Record::
* Variable Records::
* Value Label Records::
* Portable File Data::
@end menu
@node Portable File Characters, Portable File Structure, Portable File Format, Portable File Format
@section Portable File Characters
Portable files are arranged as a series of lines of exactly 80
characters each. Each line is terminated by a carriage-return,
line-feed sequence (henceforth, ``newline''). Newlines are not
delimiters: they are only used to avoid line-length limitations existing
on some operating systems.
The file must be terminated with a @samp{Z} character. In addition, if
the final line in the file does not have exactly 80 characters, then it
is padded on the right with @samp{Z} characters. (The file contents may
be in any character set; the file contains a description of its own
character set, as explained in the next section. Therefore, the
@samp{Z} character is not necessarily an ASCII @samp{Z}.)
For the rest of the description of the portable file format, newlines
and the trailing @samp{Z}s will be ignored, as if they did not exist,
because they are not an important part of understanding the file
contents.
@node Portable File Structure, Portable File Header, Portable File Characters, Portable File Format
@section Portable File Structure
Every portable file consists of the following records, in sequence:
@itemize @bullet
@item
File header.
@item
Version and date info.
@item
Product identification.
@item
Subproduct identification (optional).
@item
Variable count.
@item
Variables. Each variable record may optionally be followed by a
missing value record and a variable label record.
@item
Value labels (optional).
@item
Data.
@end itemize
Most records are identified by a single-character tag code. The file
header and version info record do not have a tag.
Other than these single-character codes, there are three types of fields
in a portable file: floating-point, integer, and string. Floating-point
fields have the following format:
@itemize @bullet
@item
Zero or more leading spaces.
@item
Optional asterisk (@samp{*}), which indicates a missing value. The
asterisk must be followed by a single character, generally a period
(@samp{.}), but it appears that other characters may also be possible.
This completes the specification of a missing value.
@item
Optional minus sign (@samp{-}) to indicate a negative number.
@item
A whole number, consisting of one or more base-30 digits: @samp{0}
through @samp{9} plus capital letters @samp{A} through @samp{T}.
@item
A fraction, consisting of a radix point (@samp{.}) followed by one or
more base-30 digits (optional).
@item
An exponent, consisting of a plus or minus sign (@samp{+} or @samp{-})
followed by one or more base-30 digits (optional).
@item
A forward slash (@samp{/}).
@end itemize
Integer fields take form identical to floating-point fields, but they
may not contain a fraction.
String fields take the form of a integer field having value @var{n},
followed by exactly @var{n} characters, which are the string content.
@node Portable File Header, Version and Date Info Record, Portable File Structure, Portable File Format
@section Portable File Header
Every portable file begins with a 464-byte header, consisting of a
200-byte collection of vanity splash strings, followed by a 256-byte
character set translation table, followed by an 8-byte tag string.
The 200-byte segment is divided into five 40-byte sections, each of
which represents the string @code{ASCII SPSS PORT FILE} in a different
character set encoding. (If the file is encoded in EBCDIC then the
string is actually @code{EBCDIC SPSS PORT FILE}, and so on.) These
strings are padded on the right with spaces in their own character set.
It appears that these strings exist only to inform those who might view
the file on a screen, and that they are not parsed by SPSS products.
Thus, they can be safely ignored. For those interested, the strings are
supposed to be in the following character sets, in the specified order:
EBCDIC, 7-bit ASCII, CDC 6-bit ASCII, 6-bit ASCII, Honeywell 6-bit
ASCII.
The 256-byte segment describes a mapping from the character set used in
the portable file to an arbitrary character set having characters at the
following positions:
@table @asis
@item 0--60
Control characters. Not important enough to describe in full here.
@item 61--63
Reserved.
@item 64--73
Digits @samp{0} through @samp{9}.
@item 74--99
Capital letters @samp{A} through @samp{Z}.
@item 100--125
Lowercase letters @samp{a} through @samp{z}.
@item 126
Space.
@item 127--130
Symbols @code{.<(+}
@item 131
Solid vertical pipe.
@item 132--142
Symbols @code{&[]!$*);^-/}
@item 143
Broken vertical pipe.
@item 144--150
Symbols @code{,%_>}?@code{`:} @c @code{?} is an inverted question mark
@item 151
British pound symbol.
@item 152--155
Symbols @code{@@'="}.
@item 156
Less than or equal symbol.
@item 157
Empty box.
@item 158
Plus or minus.
@item 159
Filled box.
@item 160
Degree symbol.
@item 161
Dagger.
@item 162
Symbol @samp{~}.
@item 163
En dash.
@item 164
Lower left corner box draw.
@item 165
Upper left corner box draw.
@item 166
Greater than or equal symbol.
@item 167--176
Superscript @samp{0} through @samp{9}.
@item 177
Lower right corner box draw.
@item 178
Upper right corner box draw.
@item 179
Not equal symbol.
@item 180
Em dash.
@item 181
Superscript @samp{(}.
@item 182
Superscript @samp{)}.
@item 183
Horizontal dagger (?).
@item 184--186
Symbols @samp{@{@}\}.
@item 187
Cents symbol.
@item 188
Centered dot, or bullet.
@item 189--255
Reserved.
@end table
Symbols that are not defined in a particular character set are set to
the same value as symbol 64; i.e., to @samp{0}.
The 8-byte tag string consists of the exact characters @code{SPSSPORT}
in the portable file's character set, which can be used to verify that
the file is indeed a portable file.
@node Version and Date Info Record, Identification Records, Portable File Header, Portable File Format
@section Version and Date Info Record
This record does not have a tag code. It has the following structure:
@itemize @bullet
@item
A single character identifying the file format version. The letter A
represents version 0, and so on.
@item
An 8-character string field giving the file creation date in the format
YYYYMMDD.
@item
A 6-character string field giving the file creation time in the format
HHMMSS.
@end itemize
@node Identification Records, Variable Count Record, Version and Date Info Record, Portable File Format
@section Identification Records
The product identification record has tag code @samp{1}. It consists of
a single string field giving the name of the product that wrote the
portable file.
The subproduct identification record has tag code @samp{3}. It
consists of a single string field giving additional information on the
product that wrote the portable file.
@node Variable Count Record, Variable Records, Identification Records, Portable File Format
@section Variable Count Record
The variable count record has tag code @samp{4}. It consists of two
integer fields. The first contains the number of variables in the file
dictionary. The purpose of the second is unknown; it contains the value
161 in all portable files examined so far.
@node Variable Records, Value Label Records, Variable Count Record, Portable File Format
@section Variable Records
Each variable record represents a single variable. Variable records
have tag code @samp{7}. They have the following structure:
@itemize @bullet
@item
Width (integer). This is 0 for a numeric variable, and a number between 1
and 255 for a string variable.
@item
Name (string). 1--8 characters long. Must be in all capitals.
@item
Print format. This is a set of three integer fields:
@itemize @minus
@item
Format type (@pxref{Variable Record}).
@item
Format width. 1--40.
@item
Number of decimal places. 1--40.
@end itemize
@item
Write format. Same structure as the print format described above.
@end itemize
Each variable record can optionally be followed by a missing value
record, which has tag code @samp{8}. A missing value record has one
field, the missing value itself (a floating-point or string, as
appropriate). Up to three of these missing value records can be used.
There is also a record for missing value ranges, which has tag code
@samp{B}. It is followed by two fields representing the range, which
are floating-point or string as appropriate. If a missing value range
is present, it may be followed by a single missing value record.
Tag codes @samp{9} and @samp{A} represent @code{LO THRU @var{x}} and
@code{@var{x} THRU HI} ranges, respectively. Each is followed by a
single field representing @var{x}. If one of the ranges is present, it
may be followed by a single missing value record.
In addition, each variable record can optionally be followed by a
variable label record, which has tag code @samp{C}. A variable label
record has one field, the variable label itself (string).
@node Value Label Records, Portable File Data, Variable Records, Portable File Format
@section Value Label Records
Value label records have tag code @samp{D}. They have the following
format:
@itemize @bullet
@item
Variable count (integer).
@item
List of variables (strings). The variable count specifies the number in
the list. Variables are specified by their names. All variables must
be of the same type (numeric or string).
@item
Label count (integer).
@item
List of (value, label) tuples. The label count specifies the number of
tuples. Each tuple consists of a value, which is numeric or string as
appropriate to the variables, followed by a label (string).
@end itemize
@node Portable File Data, , Value Label Records, Portable File Format
@section Portable File Data
The data record has tag code @samp{F}. There is only one tag for all
the data; thus, all the data must follow the dictionary. The data is
terminated by the end-of-file marker @samp{Z}, which is not valid as the
beginning of a data element.
Data elements are output in the same order as the variable records
describing them. String variables are output as string fields, and
numeric variables are output as floating-point fields.
@node q2c Input Format, Bugs, Portable File Format, Top
@chapter @code{q2c} Input Format
PSPP statistical procedures have a bizarre and somewhat irregular
syntax. Despite this, a parser generator has been written that
adequately addresses many of the possibilities and tries to provide
hooks for the exceptional cases. This parser generator is named
@code{q2c}.
@menu
* Invoking q2c:: q2c command-line syntax.
* q2c Input Structure:: High-level layout of the input file.
* Grammar Rules:: Syntax of the grammar rules.
@end menu
@node Invoking q2c, q2c Input Structure, q2c Input Format, q2c Input Format
@section Invoking q2c
@example
q2c @var{input.q} @var{output.c}
@end example
@code{q2c} translates a @samp{.q} file into a @samp{.c} file. It takes
exactly two command-line arguments, which are the input file name and
output file name, respectively. @code{q2c} does not accept any
command-line options.
@node q2c Input Structure, Grammar Rules, Invoking q2c, q2c Input Format
@section @code{q2c} Input Structure
@code{q2c} input files are divided into two sections: the grammar rules
and the supporting code. The @dfn{grammar rules}, which make up the
first part of the input, are used to define the syntax of the
statistical procedure to be parsed. The @dfn{supporting code},
following the grammar rules, are copied largely unchanged to the output
file, except for certain escapes.
The most important lines in the grammar rules are used for defining
procedure syntax. These lines can be prefixed with a dollar sign
(@samp{$}), which prevents Emacs' CC-mode from munging them. Besides
this, a bang (@samp{!}) at the beginning of a line causes the line,
minus the bang, to be written verbatim to the output file (useful for
comments). As a third special case, any line that begins with the exact
characters @code{/* *INDENT} is ignored and not written to the output.
This allows @code{.q} files to be processed through @code{indent}
without being munged.
The syntax of the grammar rules themselves is given in the following
sections.
The supporting code is passed into the output file largely unchanged.
However, the following escapes are supported. Each escape must appear
on a line by itself.
@table @code
@item /* (header) */
Expands to a series of C @code{#include} directives which include the
headers that are required for the parser generated by @code{q2c}.
@item /* (decls @var{scope}) */
Expands to C variable and data type declarations for the variables and
@code{enum}s input and output by the @code{q2c} parser. @var{scope}
must be either @code{local} or @code{global}. @code{local} causes the
declarations to be output as function locals. @code{global} causes them
to be declared as @code{static} module variables; thus, @code{global} is
a bit of a misnomer.
@item /* (parser) */
Expands to the entire parser. Must be enclosed within a C function.
@item /* (free) */
Expands to a set of calls to the @code{free} function for variables
declared by the parser. Only needs to be invoked if subcommands of type
@code{string} are used in the grammar rules.
@end table
@node Grammar Rules, , q2c Input Structure, q2c Input Format
@section Grammar Rules
The grammar rules describe the format of the syntax that the parser
generated by @code{q2c} will understand. The way that the grammar rules
are included in @code{q2c} input file are described above.
The grammar rules are divided into tokens of the following types:
@table @asis
@item Identifier (@code{ID})
An identifier token is a sequence of letters, digits, and underscores
(@samp{_}). Identifiers are @emph{not} case-sensitive.
@item String (@code{STRING})
String tokens are initiated by a double-quote character (@samp{"}) and
consist of all the characters between that double quote and the next
double quote, which must be on the same line as the first. Within a
string, a backslash can be used as a ``literal escape''. The only
reasons to use a literal escape are to include a double quote or a
backslash within a string.
@item Special character
Other characters, other than whitespace, constitute tokens in
themselves.
@end table
The syntax of the grammar rules is as follows:
@example
grammar-rules ::= ID : subcommands .
subcommands ::= subcommand
::= subcommands ; subcommand
@end example
The syntax begins with an ID or STRING token that gives the name of the
procedure to be parsed. The rest of the syntax consists of subcommands
separated by semicolons (@samp{;}) and terminated with a full stop
(@samp{.}).
@example
subcommand ::= sbc-options ID sbc-defn
sbc-options ::=
::= sbc-option
::= sbc-options sbc-options
sbc-option ::= *
::= +
sbc-defn ::= opt-prefix = specifiers
::= [ ID ] = array-sbc
::= opt-prefix = sbc-special-form
opt-prefix ::=
::= ( ID )
@end example
Each subcommand can be prefixed with one or more option characters. An
asterisk (@samp{*}) is used to indicate the default subcommand; the
keyword used for the default subcommand can be omitted in the PSPP
syntax file. A plus sign (@samp{+}) is used to indicate that a
subcommand can appear more than once; if it is not present then that
subcommand can appear no more than once.
The subcommand name appears after the option characters.
There are three forms of subcommands. The first and most common form
simply gives an equals sign (@samp{=}) and a list of specifiers, which
can each be set to a single setting. The second form declares an array,
which is a set of flags that can be individually turned on by the user.
There are also several special forms that do not take a list of
specifiers.
Arrays require an additional @code{ID} argument. This is used as a
prefix, prepended to the variable names constructed from the
specifiers. The other forms also allow an optional prefix to be
specified.
@example
array-sbc ::= alternatives
::= array-sbc , alternatives
alternatives ::= ID
::= alternatives | ID
@end example
An array subcommand is a set of Boolean values that can independently be
turned on by the user, listed separated by commas (@samp{,}). If an value has more
than one name then these names are separated by pipes (@samp{|}).
@example
specifiers ::= specifier
::= specifiers , specifier
specifier ::= opt-id : settings
opt-id ::=
::= ID
@end example
Ordinary subcommands (other than arrays and special forms) require a
list of specifiers. Each specifier has an optional name and a list of
settings. If the name is given then a correspondingly named variable
will be used to store the user's choice of setting. If no name is given
then there is no way to tell which setting the user picked; in this case
the settings should probably have values attached.
@example
settings ::= setting
::= settings / setting
setting ::= setting-options ID setting-value
setting-options ::=
::= *
::= !
::= * !
@end example
Individual settings are separated by forward slashes (@samp{/}). Each
setting can be as little as an @code{ID} token, but options and values
can optionally be included. The @samp{*} option means that, for this
setting, the @code{ID} can be omitted. The @samp{!} option means that
this option is the default for its specifier.
@example
setting-value ::=
::= ( setting-value-2 )
::= setting-value-2
setting-value-2 ::= setting-value-options setting-value-type : ID
setting-value-restriction
setting-value-options ::=
::= *
setting-value-type ::= N
::= D
setting-value-restriction ::=
::= , STRING
@end example
Settings may have values. If the value must be enclosed in parentheses,
then enclose the value declaration in parentheses. Declare the setting
type as @samp{n} or @samp{d} for integer or floating point type,
respectively. The given @code{ID} is used to construct a variable name.
If option @samp{*} is given, then the value is optional; otherwise it
must be specified whenever the corresponding setting is specified. A
``restriction'' can also be specified which is a string giving a C
expression limiting the valid range of the value. The special escape
@code{%s} should be used within the restriction to refer to the
setting's value variable.
@example
sbc-special-form ::= VAR
::= VARLIST varlist-options
::= INTEGER opt-list
::= DOUBLE opt-list
::= PINT
::= STRING @r{(the literal word STRING)} string-options
::= CUSTOM
varlist-options ::=
::= ( STRING )
opt-list ::=
::= LIST
string-options ::=
::= ( STRING STRING )
@end example
The special forms are of the following types:
@table @code
@item VAR
A single variable name.
@item VARLIST
A list of variables. If given, the string can be used to provide
@code{PV_@var{*}} options to the call to @code{parse_variables}.
@item INTEGER
A single integer value.
@item INTEGER LIST
A list of integers separated by spaces or commas.
@item DOUBLE
A single floating-point value.
@item DOUBLE LIST
A list of floating-point values.
@item PINT
A single positive integer value.
@item STRING
A string value. If the options are given then the first string is an
expression giving a restriction on the value of the string; the second
string is an error message to display when the restriction is violated.
@item CUSTOM
A custom function is used to parse this subcommand. The function must
have prototype @code{int custom_@var{name} (void)}. It should return 0
on failure (when it has already issued an appropriate diagnostic), 1 on
success, or 2 if it fails and the calling function should issue a syntax
error on behalf of the custom handler.
@end table
@node Bugs, Function Index, q2c Input Format, Top
@chapter Bugs
@quotation
As of fvwm 0.99 there were exactly 39.342 unidentified bugs. Identified
bugs have mostly been fixed, though. Since then 9.34 bugs have been
fixed. Assuming that there are at least 10 unidentified bugs for every
identified one, that leaves us with 39.342 - 9.34 + 10 * 9.34 = 123.422
unidentified bugs. If we follow this to its logical conclusion we
will have an infinite number of unidentified bugs before the number of
bugs can start to diminish, at which point the program will be
bug-free. Since this is a computer program infinity = 3.4028e+38 if you
don't insist on double-precision. At the current rate of bug discovery
we should expect to achieve this point in 3.37e+27 years. I guess I
better plan on passing this thing on to my children@enddots{}
---Robert Nation, @cite{fvwm manpage}.
@end quotation
@menu
* Known bugs:: Pointers to other files.
* Contacting the Author:: Where to send the bug reports.
@end menu
@node Known bugs, Contacting the Author, Bugs, Bugs
@section Known bugs
This is the list of known bugs in PSPP. In addition, @xref{Not
Implemented}, and @xref{Functions Not Implemented}, for lists of bugs
due to features not implemented. For known bugs in individual language
features, see the documentation for that feature.
@itemize @bullet
@item
Nothing has yet been tested exhaustively. Be cautious using PSPP to
make important decisions.
@item
@code{make check} fails on some systems that don't like the syntax. I'm
not sure why. If someone could make an attempt to track this down, it
would be appreciated.
@item
PostScript driver bugs:
@itemize @minus
@item
Does not support driver arguments `max-fonts-simult' or
`optimize-text-size'.
@item
Minor problems with font-encodings.
@item
Fails to align fonts along their baselines.
@item
Does not support certain bizarre line intersections--should
never crop up in practice.
@item
Does not gracefully substitute for existing fonts whose
encodings are missing.
@item
Does not perform italic correction or left italic correction
on font changes.
@item
Encapsulated PostScript is unimplemented.
@end itemize
@item
ASCII driver bugs:
@itemize @minus
Does not support `infinite length' or `infinite width' paper.
@end itemize
@end itemize
See below for information on reporting bugs not listed here.
@node Contacting the Author, , Known bugs, Bugs
@section Contacting the Author
The author can be contacted at e-mail address
@ifinfo
<blp@@gnu.org>.
@end ifinfo
@iftex
@code{<blp@@gnu.org>}.
@end iftex
PSPP bug reports should be sent to
@ifinfo
<bug-gnu-pspp@@gnu.org>.
@end ifinfo
@iftex
@code{<bug-gnu-pspp@@gnu.org>}.
@end iftex
@node Function Index, Concept Index, Bugs, Top
@chapter Function Index
@printindex fn
@node Concept Index, Command Index, Function Index, Top
@chapter Concept Index
@printindex cp
@node Command Index, , Concept Index, Top
@chapter Command Index
@printindex vr
@contents
@bye
@c Local Variables:
@c compile-command: "makeinfo pspp.texi"
@c End:
|