Info file zmog, produced by texinfo-format-buffer -*-Text-*-
from file zmog.tex
Distribution
************
Copyright (C) 1988 Rayan S. Zachariassen.
If you received this manual directly from the author, you may make and
distribute verbatim copies of it within your organization. Except by
explicit permission from the author, all other redistribution is prohibited
prior to final release.
^_
File: zmog Node: top, Next: introduction
* Menu:
* introduction:: An introduction to ZMailer.
* overview:: General overview of how ZMailer works.
* router:: All about the Router process.
* scheduler:: All about the Scheduler process.
* transports:: All about the Transport Agents.
* miscellaneous:: Miscellaneous topics (compatibility, etc.)
* how-to:: A How-To guide.
^_
File: zmog Node: introduction, Prev: top, Up: top, Next: overview
Introduction
************
ZMailer is a mailer subsystem for the UNIX operating system. A mailer is in
charge of handling all mail messages that are created on a system
(typically a single host), from their creation until final disposition
locally or by transfer to another system. As such, the mailer subsystem
(the Message Transfer Agent) must interface to local mail reading and
composing programs (User Agents), to the various transport methods that can
be used to reach other mailers, and to a variety of databases describing
the mailer's environment. ZMailer provides this functionality in a package
and with a philosophy that has benefited from experiences with earlier
mailers. ZMailer provides a capable, robust, efficient subsystem to do the
job, which will excel in demanding environments, but is simple enough to
fit easily everywhere.
Motivation and Heritage
=======================
Many of my reasons for trying to improve the state of the art in message
handling systems are based on the fact that the capabilities of available
software haven't changed for a long time, whereas the demands being placed
on the software have steadily risen. A few years ago, people were still
dreaming of a world with consistent standards for addressing electronic
mail and moving it around. Even though the consistency is constantly
improving, it now seems apparent that there will always be needs that
conflict with the ideal situation. This is very obvious to sites that
interact with two or more types of networks. For those sites that are
directly attached to just one network, any degree of transparency in
communicating with sites on other networks must be provided by the
software on a mail gateway. Most such software was not designed to perform
the task of gatewaying messages between networks with heterogeneous
addressing and message standards.
The best available system to accomplish this has previously been Sendmail,
which was carried along on its relative strength for address manipulation.
Unfortunately, Sendmail has many design flaws that lessen its usefulness
even in typical environments of our time. Sendmail's major contribution to
the art was the use of production rules to manipulate addresses and to
guide the operations performed on a message. It also popularized, and in
certain environments pioneered, other functionality that turned out to be
quite useful; for example the use of system-wide and personal aliases
files, a widely available SMTP implementation, external and consistently
treated delivery programs, etc. ZMailer was primarily a reaction to
Sendmail's disadvantages (which I shall mention), but also to the bad
points of several other mailers. The design of ZMailer was often guided by
my view of the poor choices or provisions of other mailers, as opposed to
things they had done well. This allows the design to draw from experience
without limiting its creativity and use of new solutions.
To clarify my opinions somewhat, a commentary on the various mailers I know
of should prove helpful:
Sendmail, in the right hands, can be quite a flexible tool to translate
between the different conventions of various networks. Unfortunately this
is accomplished by programming in an unfamiliar production language
containing many magic features. The learning time for doing this is very
long; the effort involved is that of learning a completely new language and
environment. Moreover, Sendmail has all major components built into a
single large program. Both of these design decisions have been acknowledged
as mistakes by the author of Sendmail. Its major shortcoming in comparison
to the MMDF mailer is its primitive database facility and lack of caching.
MMDF is a comprehensive mail environment, including its own mail
composition program and of course a mailer. There are too many parts to it
(as a friend would say, it is a system, not a subsystem), and the address
manipulation is only sufficient for a relatively homogeneous environment. It
does have reasonable database facilities and caching, as opposed to
Sendmail, and the concept of Channels. However, knowledge about address
semantics is distributed in several programs instead of being centralized.
PMDF is a smaller version of MMDF with correspondingly reduced features and
flexibility.
Upas is a curious approach to the problem. It lets the user do half the
work of message routing, in a manner similar to PMDF on VMS systems. It is
entirely concerned with the message envelope, and leaves all message header
munging to auxiliary programs if appropriate. In fairness one should note
this mailer was developed in an environment where most message headers were
scorned, thus making this a reasonable approach ("optimize the normal
case"). The Eighth Edition Upas had no database capability at all, but it
did exhibit one useful characteristic: the routing decisions are made by
passing the recipient envelope address through a set of regular
expressions. This production rule approach is similar to what Sendmail
does, but uses a more familiar mechanism and environment.
The final, and most recently developed, mailer worth mentioning here is
Smail3.0. It is intended as a program capable of replacing Sendmail in many
situations. To a large extent it succeeds at this, and there are some nice
ideas involved as well. Its two major drawbacks are that it is not as easy
to adapt to local needs as Sendmail is (compiled instead of interpreted
rules and algorithms), and that it retains Sendmail's single-program
design. It
addresses database and caching issues, and seems generally like a nicer
design in many respects, a bit like PMDF's configuration options in a
Sendmail package.
Until the recent increase in the demand for inter-network mail gatewaying,
Sendmail's flexibility had quite adequately served to implement a gateway
function between selected networks. With increased variety of the normal
address syntax and mail capabilities of connected networks, and more complex
kinds of routing decisions becoming necessary, the existing mailers have
been showing their age and their limits. ZMailer is intended to give the
mail administrator a software tool that fits the times.
Goals
=====
Apart from the generic goals of robustness and efficiency, the following is
a list of the specific goals of ZMailer:
* Fully RFC822/RFC976 compatible syntax and semantics.
* Prepared to cater to future MHS standards.
* At a minimum provide Sendmail functionality from the point of view of
users and the system administrator.
* Make routing decisions based on original sender and path of the message.
* Not have hardcoded address rewriting and routing algorithms.
* A better user interface for the mail administrator than Sendmail.
* Interact properly with Internet Nameservers.
* Easily extensible to make use of new sources of data.
* Efficient enough to handle a large message volume, and to not significantly
degrade its own or system performance when many messages are queued.
* Schedule delivery based on destination channel, destination host, or a
combination of both.
* Able to do a better job than Sendmail does in our environment at the
University of Toronto.
For a while it has been apparent that Sendmail's approach to its task is
not well-suited from several perspectives. In particular, having a single
program embody several conceptually independent functions is recognized as
poor design. In practice, merging queueing and delivery in one program
causes a bottleneck for all messages when a particular delivery mechanism
is slow.
Design Summary
==============
ZMailer is a multi-process mailer, using two daemon processes to manipulate
messages. One of these processes is a router, and makes all decisions
about what should happen to a message. The other daemon is a message queue
manager, used to schedule delivery of messages. The Router uses a
configuration file that closely follows Bourne shell script syntax and
semantics, with minimal magic. Message files are moved around in a series
of directories, and the Scheduler and its Transport Agents run off of
control files created by the Router.
The Router will process messages one at a time, as it finds them in a
directory where User Agents submit their outgoing messages. Envelope and
Message Header information is all kept in the same message file along with
the message body, and this file is never modified by any ZMailer program.
After parsing the envelope and RFC822 header information, the Router
validates the information extracted, and calls functions defined in the
configuration file to decide exactly how to deliver the message and how to
transform the embedded addresses. The algorithms that do this are easily
reconfigurable, since the control flow and address manipulation are
specified by familiar shell script statements. When the Router is
finished, it will produce a message control file for use by the delivery
processing stage of ZMailer, and move the original message file to another
location.
Once the Router has decided what to do with each of the addresses in a
message, the Scheduler builds a summary of this information by reading the
control file created by the Router. This knowledge is merged with a data
structure it maintains that stores which messages are supposed to be sent
where, and how. According to a pre-arranged agenda, the Scheduler will
execute delivery programs to properly move the message envelope, header,
and body, to the immediate destination. These delivery programs are called
Transport Agents, and communicate with the Scheduler using a simple
protocol that tells them which messages to process and returns status
reports to the Scheduler. The Scheduler also manages status reports,
taking appropriate action on delivery errors and when all delivery
instructions for a message have been processed.
There are several standard Transport Agents included with the ZMailer
distribution. The collection currently includes a local delivery program,
an SMTP client implementation, and a Transport Agent that can run
Sendmail-compatible delivery programs.
A separate utility allows querying the Scheduler for the state of its mail
queues. For existing Sendmail installations, a replacement program is
included that simulates most of the Sendmail functionality in the ZMailer
environment. This allows ZMailer to replace a Sendmail installation
without requiring changes in standard User Agents.
^_
File: zmog Node: overview, Prev: introduction, Up: top, Next: router
Overview
********
This chapter deals with the life of a message, and what will happen to a
message in the course of being processed. The processing activity is
divided into four major phases that will be dealt with here. These phases
are: injection of a message into the mailer subsystem, message routing,
message transport/delivery queueing and scheduling, and actual delivery of
a message. The phases communicate through the filesystem, by moving files
from one directory to another. All directories taking part in this
communication are clustered under the `POSTOFFICE' directory
(`/usr/spool/postoffice'), which is intended to also hold other maintenance
information or directories for use by the system Postmaster. For this
reason, we shall refer to files within this hierarchy using the
tilde-abbreviation (e.g. `~/file' referring to the Postmaster's
`$HOME/file' which is normally `/usr/spool/postoffice/file').
Message Submission
==================
A mail message is submitted to the mailer subsystem by depositing a
"message file" in a particular `ROUTER' directory (`~/router'). There is a
Sendmail replacement program which submits messages this way. The messages
are picked up by a daemon process scanning this directory, and processing
all message files it finds in it. To avoid problems with the daemon
processing an incomplete or inconsistent message file, the message files
are created in a separate `PUBLIC' directory (`~/public'), and then linked
into the `ROUTER' directory.
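The create-then-link step can be sketched in a few lines of Python. This
is only an illustration of the technique, not the code of the actual
submission program: the function name `submit_message' and the use of a
temporary-file name for the message file are assumptions made for the
example, and the real file naming scheme is not described here.

```python
import os
import tempfile

def submit_message(postoffice, data):
    """Write `data` (bytes) under public/, then hard-link it into router/.

    Hypothetical helper: the name and the naming of the message file are
    assumptions for illustration only.
    """
    public_dir = os.path.join(postoffice, "public")
    router_dir = os.path.join(postoffice, "router")
    # Create the file under a unique name in the PUBLIC directory.
    fd, public_path = tempfile.mkstemp(dir=public_dir)
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())
        # The link appears in ROUTER only once the file is complete, so the
        # daemon scanning that directory never sees a partial message.
        router_path = os.path.join(router_dir, os.path.basename(public_path))
        os.link(public_path, router_path)
    finally:
        # Drop the PUBLIC name; the ROUTER link (if made) keeps the data.
        os.unlink(public_path)
    return router_path
```

Because a hard link is created in a single step, the Router either sees
the whole message file or does not see it at all. Both directories must
be on the same filesystem for the link to succeed.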
A message file has three parts. The first part contains the envelope
information for the message (if any). It consists of all the lines from the
start of the file to the start of part 2 (exclusive) which are in the
format of RFC822 Message Header lines *except* that there is no colon after
the header field name. An RFC822 Message Header is easily converted to an
envelope header line by simply deleting the colon after the field name. As
with message headers, various field names have specific semantics, and
these will be discussed in detail later.
The second part of a message file is its RFC822 Message Header. The third
part is the message body. Either the envelope portion (part 1) or the
message header portion (part 2) may be null. They are separated from the
message body by an empty line, according to RFC822. The standard UNIX
conventions for files are obeyed (i.e. lines are terminated by a Newline
character (LF)). The message body is never examined by the mailer itself,
although transport/delivery programs must of course filter the message body
appropriately for the destination. In other words, the message body may
contain arbitrary binary data. The only restrictions are that the envelope
and message header must obey the RFC822 lexical/syntax rules.
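The three-part layout described above can be illustrated with a minimal
Python sketch. It is not the Router's parser. It assumes that a line
belongs to the envelope when its field name is not immediately followed
by a colon, that the message header starts at the first line whose field
name is, and that folded continuation lines begin with a space or tab.
The function name `split_message' is hypothetical.

```python
import re

# A header-style field name: printable ASCII except space and colon,
# optionally followed directly by a colon.
_FIELD = re.compile(rb"^([!-9;-~]+)(:?)")

def split_message(raw):
    """Split raw message-file bytes into (envelope, header, body).

    Sketch only: classifies lines purely by the presence of a colon
    directly after the field name.
    """
    head, _sep, body = raw.partition(b"\n\n")
    if raw.startswith(b"\n"):
        # Envelope and header are both null; the body follows the empty line.
        head, body = b"", raw[1:]
    envelope, header = [], []
    current = None
    for line in head.split(b"\n"):
        if not line:
            continue
        if line[:1] in b" \t" and current is not None:
            # Folded continuation line: attach it to the previous field.
            current[-1] += b"\n" + line
            continue
        m = _FIELD.match(line)
        if m is None:
            raise ValueError("not a header-style line: %r" % line)
        # Once a colon-terminated field name has been seen, part 2 has begun.
        current = header if (m.group(2) or header) else envelope
        current.append(line)
    return envelope, header, body
```

The sketch works on bytes rather than text because the message body may
contain arbitrary binary data, and it returns the body untouched.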
Once the message file has been written into the `ROUTER' directory, its
content will never change until the system removes it after successful
delivery. The only manipulation consists of relinking the file into various
directories.
A subroutine interface exists, which should be used by application programs
or User Agents to submit messages. The subroutine interface is properly
part of the system C library (`/lib/libc.a'), and will be documented as
such. It is quite possible to submit messages by using the standard
utilities to copy or move a file into the `ROUTER' directory. Indeed, some
maintenance functions the Postmaster should perform, and automatic
resubmission of deferred messages, are most easily accomplished in this
manner.
Note that the format of a message file allows a user to simply create a
file that obeys RFC822 conventions, in order to submit a message. It also
allows simple resubmission of a message which includes the UNIX standard
`From ' envelope header line (found in `Mail' format mailbox files), as
this syntax will indeed be interpreted to represent envelope information.
Router
======
The daemon mentioned above, the one that processes message files appearing
in the `ROUTER' directory, is called the Router process. It is the
responsibility of this process to decide what to do with the message file,
and pass this information on to the next stage (message transport/delivery
queueing and scheduling). It does this by creating a Control File attached
to the message file, which contains all the necessary information for the
next stages to accomplish their work without detailed reference to the
message file (i.e. without having to reparse it). When the Router process
is finished processing a message file, it will relink it into a `QUEUE'
(`~/queue') directory [XX: which is flat -- is this bad?], and deposit the
control file in a `SCHEDULER' directory (`~/scheduler') for processing by
the next stage.
The Router will parse the message file contents, determine the boundaries
between the various parts of the file, extract addresses and other
information from the RFC822 format fields, and manipulate this information to
determine the proper action for each destination address. The lexical and
syntax analysis is carried out as a basic function within this process, but
the semantics are determined by the contents of a configuration file for the
Router. This configuration file is required to properly initialize the Router
when it starts up, and furthermore defines functions that analyse an address,
determine how to route it (given context information), and that can rewrite
a message header address appropriately based on various context information.
The configuration file looks like a Bourne Shell script at first glance.
There are minor syntax changes from standard `sh', but the aim is to be as
close to the Bourne Shell language as is practical. The contents of the
file are compiled into a parse tree, which can then be interpreted by the
Router. The configuration file is usually self-contained, although an easy
mechanism exists to make use of external UNIX programs when so desired.
Together with a very flexible database lookup mechanism, functions, and
address manipulation based on token-matching regular expressions, the
configuration file language is an extremely flexible substrate to
accomplish its purpose. When the language is inadequate, or if speed
becomes an issue, it is possible to call built in (C coded) functions. The
interface to these functions is mostly identical to what a standalone
program would expect (modulo symbol name clashes and return values), to
ease migration of external programs to inclusion in the Router process.
The Router makes use of environmental information to augment the
information that may be contained in the message file. For example, the
owner of the message file is the local user who submitted the message
(which fact is used to check believability of some of the message header
information), the message file modification time is used for local
submission time, and the message file name is part of the synthesized
message identification. Other envelope information, apart from the
standard sender and recipient addresses, may be specified to augment the
behavior of the mailer. For example, the standard library routines used to
submit messages may include code to pass along information from the
submitting user's environment variables.
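As a minimal sketch of the environmental information described above
(written in Python for brevity; this is not ZMailer code, and the function
and key names are hypothetical), all three facts can be read from the
filesystem without opening the message file:

```python
import os


def environment_info(path):
    """Collect facts about a message file that are not stored inside it."""
    st = os.stat(path)
    return {
        # The file owner is the local user who submitted the message.
        "owner_uid": st.st_uid,
        # The modification time doubles as the local submission time.
        "submitted": st.st_mtime,
        # The file name becomes part of the synthesized message id.
        "name": os.path.basename(path),
    }
```

How the Router actually combines these values (for instance, how the
message identification is synthesized) is not shown here.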
If something goes wrong when the Router processes a message file, its
action depends on the severity and type of error. If for example there is a
protocol violation of some kind, the Router may generate a rejection
message sent to the originator of the offending message. The Router
supplies the addressee and specific diagnostic messages corresponding to
the error, and uses one of the canned files in the `FORMS' directory
(`~/forms') for *everything else*. In particular this means such headers
as `From:', `Subject:' and `Cc:' lines, and a generic comment on the class
of error, are all taken from a standard form. This easily allows certain
kinds of errors to be brought to the attention of the Postmaster or other
maintenance person (by judicious use of the Carbon Copy field), and indeed
different errors may be directed to different people.
For more serious problems, the message file is filed away in yet another
`POSTMAN' directory (`~/postman'). This directory is where the Router will
put any files that need manual attention by the Postmaster. The Postmaster
may take corrective action (usually editing the message file), and resubmit
the message file by simply moving it (using `mv') to the `ROUTER'
directory.
If something went wrong that may correct itself at a later time (for
example if a database access indicates a temporary failure), the message
file will be relinked into a `DEFERRED' directory (`~/deferred'). At some
later time, these deferred message files may be resubmitted by moving them
back to the `ROUTER' directory. This may be accomplished by a simple cron
job. As indicated, the only problems that would cause this would be the
lack of a resource needed by the Router. This may include out-of-space
conditions on the disk, a database access timing out or returning a server
failure reply, etc.
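The cron job mentioned above might look roughly like the following Python
sketch. It is an illustration only: the function name, the one-hour
threshold, and the use of the file modification time to measure age are
all assumptions, and both directories are assumed to live on the same
filesystem so that a rename suffices.

```python
import os
import time


def resubmit_deferred(deferred_dir, router_dir, min_age_seconds=3600, now=None):
    """Move message files that have waited long enough back to the Router."""
    now = time.time() if now is None else now
    moved = []
    for name in sorted(os.listdir(deferred_dir)):
        src = os.path.join(deferred_dir, name)
        if not os.path.isfile(src):
            continue
        if now - os.stat(src).st_mtime < min_age_seconds:
            continue  # too recent; leave it parked for now
        # A rename within one filesystem is atomic, so the Router never
        # sees a partially moved file.
        os.rename(src, os.path.join(router_dir, name))
        moved.append(name)
    return moved
```

Waiting a while before resubmitting gives a failed resource (a nameserver,
a full disk) some time to recover.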
When everything does work properly, once a control file has been
created in the `SCHEDULER' directory, and the message file moved
to the `QUEUE' directory, the job of the Router is done for that
message and it continues scanning its `ROUTER' directory for more work.
Queue Manager and Scheduler
===========================
The process that picks up control files from the `SCHEDULER' directory is
called the Scheduler. It is a daemon that orchestrates the flow of messages
out from the mail subsystem. To do this, it maintains an internal model of
which messages need to go where, and how, and passes the relevant
information to the transport/delivery programs that it starts up. Various
parameters associated with each transport/delivery program are controlled
by a configuration file for the Scheduler. This configuration file is much
simpler than the one for the Router; indeed, it is a simple table format. A
set of messages is selected using a channel/host specification pattern, and
associated with each pattern one must specify a startup interval, command,
and some related information, that will be used to deliver a selected
message to the appropriate addresses. Specifying startup intervals for
programs is the function which gives the Scheduler its name.
When the Scheduler picks up a control file, it extracts destination
information and groups it by the outgoing channel, and by next host. The
internally maintained model of the messages pending delivery is mapped into
a directory tree that is used to store the control files. This allows
quicker reference by other programs that more frequently need to refer to
control files, e.g. during a delivery phase. This directory tree is
maintained under the `SCHEDULER' directory, and is where control files are
relinked to after the Scheduler has parsed their contents. The filesystem
image of the model is completely maintained by the Scheduler; directories
are created when needed, and removed when empty.
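The grouping step can be pictured with a short Python sketch. The control
file format is not described here, so the input is assumed to be a list of
hypothetical (channel, host, address) triples already extracted from it;
the channel names in the usage example are likewise made up.

```python
from collections import defaultdict


def group_destinations(destinations):
    """Group (channel, host, address) triples by channel, then by next host."""
    tree = defaultdict(lambda: defaultdict(list))
    for channel, host, address in destinations:
        tree[channel][host].append(address)
    # Convert to plain dicts so the result is easy to inspect and compare.
    return {channel: dict(hosts) for channel, hosts in tree.items()}
```

For example, `group_destinations([("smtp", "sis.mod.uk", "007@sis.mod.uk"),
("local", "-", "bond")])` yields one entry per channel, each holding one
entry per next host. A two-level mapping of this shape corresponds
naturally to a channel/host directory tree.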
The filesystem image of the internal Scheduler model is specific to each
Scheduler instance. It may therefore be destroyed between invocations of
the Scheduler. If a Scheduler process is aborted while there are messages
pending delivery, the next Scheduler process needs to be reinitialized from
the control files of the pending messages. To ease this, and other,
maintenance chores, each control file is also linked into the `TRANSPORT'
directory (`~/transport'), where it remains until all associated delivery
has been completed. This allows reinitializing a Scheduler with the
previous state, by simply removing all directories under the `SCHEDULER'
directory (since they mirror the internal state of a dead process), and
moving all the files from the `TRANSPORT' directory to the `SCHEDULER'
directory.
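The reinitialization procedure just described can be sketched in Python as
follows. This is an illustration under assumptions, not the startup script
shipped with ZMailer: the function name is hypothetical, and the sketch
must only be run while no Scheduler process is active.

```python
import os
import shutil


def reset_scheduler_state(scheduler_dir, transport_dir):
    """Discard a dead Scheduler's directory tree and requeue control files."""
    # The subdirectories mirror the internal state of the dead process.
    for name in os.listdir(scheduler_dir):
        path = os.path.join(scheduler_dir, name)
        if os.path.isdir(path):
            shutil.rmtree(path)
    # Every pending control file still has a link in the transport
    # directory; hand each one back to the Scheduler.
    moved = []
    for name in sorted(os.listdir(transport_dir)):
        src = os.path.join(transport_dir, name)
        if os.path.isfile(src):
            os.rename(src, os.path.join(scheduler_dir, name))
            moved.append(name)
    return moved
```

Removing the subdirectories loses nothing, because the control files they
contain are only additional hard links to files that remain in the
`TRANSPORT' directory.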
[XX: to do: rendezvous with Scheduler from unrelated program, e.g. uucico ]
Transport Agents
================
A Transport Agent is responsible for doing the actual transport/delivery of
a given message to a selected set of addresses. The selection of addresses
is determined by the Transport Agent itself, perhaps using information
about the name of the delivery channel or next host, passed on the command
line. The messages each Transport Agent is asked to examine are determined
by the Scheduler. A very simple protocol is run on the standard input and
standard output of the Transport Agent, with a supervisor program (the
Scheduler) choosing which control files the Transport Agent should know
about. In turn, the Transport Agent returns status information about each
of the addresses it processes in each control file, so the Scheduler can
update its internal model of the collection of queued messages. As well,
the Transport Agent is in charge of enforcing locking of a destination
address while it is being processed, and the subsequent status update
(success, deferral, error) in the message control file. These operations
are performed in-place (and synchronously) on the contents of a control
file.
All the actions and decisions made by a Transport Agent are driven entirely
by the contents of the control file. The control file holds enough
information about the original message file that most Transport Agents will
not need to reparse it. Standard Transport Agents exist for local mail delivery,
SMTP/TCP, an error processing function, and interfacing with standard
Sendmail "mailer"s.
The System Environment
======================
Simplicity is an important thread in the design and implementation of
ZMailer. This is with the hope that a simple (not simple-minded) design
will encourage flexibility, elegance, and efficiency in the end product.
If the design is done right, the result should fit in naturally with the
UNIX environment, and will have some desirable side-effects: code
portability (to UNIX variants and to other operating systems), and a
smaller conceptual load for the person(s) maintaining the mail subsystem.
This section is about the external interfaces to ZMailer; how it depends on
the underlying system, how it interacts with it, and how the maintainers
(the System Administrator and the Postmaster) communicate and interact with
ZMailer.
All ZMailer activity is (largely by convention) confined to two directory
hierarchies. One is used to keep program binaries and various databases,
the other is a work area that is used when ZMailer does its job. For
various reasons, this latter hierarchy is set up to mimic the various
sections of a real postoffice, and indeed this analogy will reappear in a
few user interface situations. The program/database locations may be spread
out arbitrarily on your system. Unless there are good local reasons not to
collect these files in one place, the following few conventions should be
kept in mind:
The program/database directory is kept in `/usr/lib/mail' (first choice),
or `/usr/lib/zmail' (in case the first choice is already taken). The
program binaries of the Router and Scheduler portions of ZMailer are kept
here, along with all program configuration files, and utility scripts. The
databases, including the system aliases database, are kept in a `db'
subdirectory. The program binaries of all the Transport Agents are kept in
a `ta' subdirectory. If you like longer names, use `databases' and
`transports' respectively.
In a mail/file server environment, mail clients only need a view of the
`POSTOFFICE' directory hierarchy, so that User Agents can submit messages
for processing and, perhaps, so that the mail queue querying program can
read control files. The programs, configuration files, and databases
stored under `/usr/lib' are only used by the mail server machine.
[XX: did I miss anything? Is this logical? should config files and programs
be separate?]
An upcoming section (*Note The Postoffice: postoffice.) will deal with the
`POSTOFFICE' hierarchy in some detail. To motivate the issues dealt with
there, we will first deal with the mechanics of sending a message.
User Agent support
==================
To ease the task of interfacing directly to the ZMailer MTA, C library
routines are provided with ZMailer. The most important of these routines
are used when submitting a message. They implement the message file name
collision avoidance protocol, outlined in the previous subsection. An
independent routine is provided to encourage proper quoting of the full
names of users, as used in RFC822 message headers. This is an attempt at
removing any excuse for "poetic license" on the part of User Agents, MTAs,
and other systems (e.g. USENET News), where violation of the RFC822
specification in this regard is a frequent irritation. *Note User Agent
support: uasupport, for details.
As mentioned earlier, information may be passed to ZMailer using envelope
header lines in the message file. This includes a method for overriding
the full name of the originating user, as found in the GECOS field of the
password file entry for the user. It also includes a method for requesting
an alternate login name (or rather local-part, in RFC822 terminology), which
is of course subject to approval by security mechanisms in ZMailer. In
order to promote a standard way of specifying these optional values, the
message submission interface routines will seed the file with the
appropriate envelope information to be interpreted by ZMailer. The user
interface consists of the environment variables *FULLNAME* and
*PRETTYLOGIN*, which are accessed through the standard `getenv()' routine.
These environment variables need only be set by a user for all mail
submitted through these interface routines to make use of the features.
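As a rough illustration of how a submission routine might seed a message
file from these variables, consider the Python sketch below. The real
routines are C library functions; the function name here is hypothetical,
the envelope field names `fullname' and `loginname' are taken from the
list of envelope fields recognized by the Router, and the quoting of the
full name that a real routine would perform is omitted.

```python
import os


def seed_envelope(environ=None):
    """Build envelope lines from the submitting user's environment."""
    environ = os.environ if environ is None else environ
    lines = []
    fullname = environ.get("FULLNAME")
    if fullname:
        # Overrides the full name found in the password file entry.
        lines.append("fullname " + fullname)
    login = environ.get("PRETTYLOGIN")
    if login:
        # Only a request: ZMailer decides whether to honour it.
        lines.append("loginname " + login)
    return lines
```

A user who sets neither variable gets an empty list, so the message file
is left without these optional envelope lines.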
Compatibility
=============
Because ZMailer will often not be the first mailer installed on a computer,
utility programs are provided to ease the transition between the different
mail subsystems. The programs allow the change of MTA to be largely
transparent to the User Agents, and other programs that interacted with the
previously installed MTA. This allows conversion of such programs to be
deferred to a more convenient time. If a slight performance penalty is
acceptable, conversion may not be necessary at all.
In a Sendmail environment, there is just one critical program that needs to
be replaced, namely the Sendmail binary itself. The only program other than
a User Agent that executes Sendmail directly is the Rmail program
(usually `/bin/rmail') which is conventionally used to transfer mail using
UUCP. To avoid certain limitations of the standard Rmail programs, and at
the same time gain performance by interacting directly with ZMailer, a new
version of Rmail comes with the ZMailer distribution.
The major incompatibility between the Sendmail replacement program provided
with ZMailer and Sendmail itself, is that the Verbose mode is not
simulated. [XX: it may be possible to do so in the future, if absolutely
necessary]. If the use of Verbose mode is intended for debugging a
problem, there are other ways in ZMailer to obtain the same information
(*Note Address Testing: addresstest., for a way of seeing information
analogous to that produced by Sendmail's address test mode). Routine use
of such mechanisms by users is not practical. The reason for the
difficulty is that each message is processed by several ZMailer programs
that are completely divorced from the submitting user's environment.
^_
File: zmog Node: postoffice, Up: overview
The Postoffice
==============
All of the message manipulation activity of ZMailer is confined to a
directory hierarchy conventionally placed under `/usr/spool/postoffice'.
This name reflects the kinds of activity carried out by ZMailer under the
postoffice directory. The subdirectories under `~' are:
`~/deferred'
a parking area for message files that cannot be processed due to
temporary absence of resources needed by the Router. Such a situation
would typically be due to a nameserver failure, or in case of
unexpected I/O errors. Such message files can be resubmitted by
simply relinking them (use `mv') to the directory scanned by the
Router. This might be done periodically by a `find' command. Since
the time granularity of `find' is rather coarse, a utility called
`resubmit' is included with ZMailer to carry out exactly this task.
`~/forms'
contains canned error and warning messages used by the ZMailer
programs. By convention, the file names in this directory have two
components, the first refers to the class of condition that would use
the message in the file (e.g. "error" or "warning"), and the second
component describes the actual problem (for example `err.delivery').
[XX: is this a good convention?] Each file contains a prototype
message, the only missing information is a destination address, and
perhaps specific information that will be supplied by whichever
programs make use of the form. In particular, specifying carbon-copy
headers in these forms, allows the postmaster to get copies of mail
automatically sent to users. Different types of messages may be
carbon-copied to different people, if there are specialized
postmasters on the system.
`~/postman'
is where ZMailer puts message files that should be examined by the
postmaster. Usually the postmaster is expected to take some
corrective action, and resubmit the message. The actual reason
ZMailer took this action will be found in the Router logs. As always,
message files can be resubmitted simply by relinking them into the
directory scanned by the Router.
`~/public'
is the publicly writable directory used by the standard message
submission routines to create a new message file. When a message file
has been properly created, it is relinked into the directory scanned
by the Router. Empty files in this directory are often caused by
improper handling of interrupts in a User Agent.
`~/queue'
is the final resting place of message files, after the Router has
processed them. This is where Transport Agents find the message
files when necessary. These files are eventually unlinked by the
Scheduler.
`~/router'
is the directory scanned by the Router for new message files. From
here, the message file goes to one of the `~/queue' (nominally),
`~/deferred', or `~/postman' directories.
`~/scheduler'
is the directory scanned by the Scheduler for new control files. Each
control file is relinked into the `~/transport' directory, and into
one or more appropriate locations in a subdirectory of `~/scheduler'
that corresponds to a destination of the message. When restarting the
Scheduler, all subdirectories and their contents should be removed.
This is taken care of by the startup shell script included in the
ZMailer distribution.
`~/transport'
is the collection of pending control files. The files here are
unlinked by the Scheduler when all destinations have been processed.
When restarting the Scheduler process, the files in this directory
should first be linked into the `~/scheduler' directory to initialize
the Scheduler.
In a server-client machine environment, only the server machine needs to
have this directory hierarchy. All the clients just need a view of the
postoffice, for benefit of the User Agents. The message file name
collision avoidance protocol should work properly across any remote
filesystem. With this setup, the various ZMailer processes should only be
run on the server machine.
^_
File: zmog Node: router, Prev: overview, Up: top, Next: scheduler
Router
******
The Router is the smart half of ZMailer. All the other parts of ZMailer
essentially just carry out instructions, as determined by the Router.
Therefore, the Router is by far the most complex part of ZMailer. It must
understand, in great detail, the structure of messages. It has to contain
logic to manipulate portions of this structure, and, since many sites have
different requirements and are in different network and mail environments,
the logic used must be easily customizable. At the same time it should be
efficient, since it is a bottleneck for message processing, and should
cater to various services expected by System Administrators, Postmasters,
and the users of a machine.
This description of the Router will begin by explaining how a message is
submitted by User Agents and why a standard submission interface is a good
idea. The structure of message files, and how that structure is analysed
and used, is treated next. Then follows exposure of the mechanisms used to
manipulate this information, and especially the tools available for the
person configuring ZMailer to customize its behaviour. A final review of
the details of the control logic will explain the reasons for various
embedded behaviours of the ZMailer Router.
Message Submission
==================
In the parlance of mail and message systems, ZMailer is an MTA, a Message
Transfer Agent. It exists to process mail, similar to the function
performed by a post office. As in the real world analogy, an MTA does not
participate in the process of composing messages and getting them into the
system. To do so, would correspond to your neighborhood postman taking
dictation of your letters, and taking them along when leaving your house.
In reality of course, people use a variety of tools to compose letters
(quill pens, word processors, etc.), and to send them off. This
functionality is embodied in a front end to the MTA, called a User Agent
(UA for short). The choice of UA is a very personal one, and is usually not
critical to the basic process of composing a message, sending it off, and
getting it delivered properly.
Even though there may be many UAs in use on a computer system, there is
usually only one MTA. The exceptions to this rule usually have to do with
limitations in an MTA's capabilities. For example, a computer that can
transfer mail using both the X.400 protocols, and the Internet protocols,
may need two different MTAs to cover both protocol suites. Oftentimes, one
of the two MTAs is a primary mailer, and takes care of all decision-making.
The other mailers would then be treated by the primary MTA as a means for
delivering messages to particular destinations, and these secondary mailers
would be configured to punt any non-trivial traffic to the primary MTA.
Both in the case of a User Agent, and in the cases of alternate MTAs, there
must be a way to inject messages into the mail subsystem.
When you want to mail a letter, what do you do? Well, you drop it in a
mailbox somewhere. ZMailer accepts messages the same way: you drop a file,
containing your message, into a special submission directory. Like a
postman (although rather more frequently), ZMailer scans this directory to
pick up the new messages, and processes them. What happens to a message
from then on is interesting in its own right of course, but presently we
shall focus on how the message gets from a user into the mail subsystem.
The simplest way to "drop a file into a directory" is of course to actually
edit a file in that directory. However, a program scanning such a
directory would not be able to tell when you had finished writing your
message and stopped editing the file. To do so, would require cooperation
between the scanning program and all the programs that could conceivably
create a file a portion at a time. The next most obvious method is to edit
the message file in another location, and then simply copy or link it into
the special message submission directory. Indeed, this is almost exactly
the mechanism used. Actually, the copying program happens to be among
those programs that may construct a new file piece by piece (a disk block
at a time). For large files there is a vulnerable window between when the
copy starts and when it finishes. If the submitter is unlucky, the message
may be processed by the scanning program (the ZMailer Router) before it is
completely written out. In fact this is not a problem due to the
implementation policy of relinking message files. The real problem is if a
partial message is completely processed and delivered and removed, with a
corrupt message body.
To avoid problems, the only acceptable method is to make the complete
message available to the Router at once. This is done by linking the
message file into the directory being scanned. Due to the semantics of
hard links and the UNIX filesystem, doing this requires that the message
file is created on the same filesystem as the submission directory.
ZMailer provides a publicly writable directory, specifically for the
purpose of creating message files before they are relinked into the
submission directory. In fact, the submission directory itself is
publicly writable, so users can do this relinking themselves (e.g. with
the `mv' command).
There are some security concerns with this approach. Because both these
directories are writable, it is conceivable that a malicious user can cause
problems for other users, for example remove their message files, or read
or alter them if the file permissions allow. The solution to the latter
problem is obviously to ensure that file permissions do not allow people
other than the originating user to access a message file. There are two
solutions to the former problem; one is to maintain ignorance of the
contents of the various directories, the other, better, method is to use a
feature introduced in 4.3BSD -- setting the sticky bit on the directories.
The semantics of the sticky bit on a directory is to only allow the owner
of a file to unlink it from a generally writable directory. If this
function is not available, read permissions to the submission directory can
safely be removed if only the standard message submission routines are used
by the User Agents. This leaves us having to find a way to secure the
directory used for creating files. Read permission to it cannot be
withdrawn if the aforementioned standard routines are used (they are
described later). Other things can be done, but obviously "ignorance" is
not a reliable way of enforcing security. Perhaps if you notice the
analogy with the common usage of `/tmp' for various intermediate files, you
will not consider security any more of a problem in this case. My
recommendation would be to treat these directories the same way you treat
`/tmp'. That is, if the directory sticky-bit semantics are available, use
that feature. If not, try trusting your users enough to not cover up the
directories. You should note that the possible danger is confined to
removing message files. There is no way for a user to forge the origin of
a mail message, since all validation of message origin is based on the
ownership of the message file. Trusted user id's may of course supply a
different origin address.
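Whether a directory is set up in the recommended way can be checked with a
few lines of Python. This is only a sketch: the function name is
hypothetical, and it tests nothing beyond the two permission bits
discussed above.

```python
import os
import stat


def directory_is_protected(path):
    """True if the directory is world-writable and has the sticky bit set."""
    mode = os.stat(path).st_mode
    world_writable = bool(mode & stat.S_IWOTH)
    sticky = bool(mode & stat.S_ISVTX)
    return world_writable and sticky
```

A directory with mode 1777, the usual setting for `/tmp', passes this
check; one with mode 777 does not.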
There is only one other significant problem with this approach, which is
the potential for name clashes of files in each of the two directories.
This problem can only be solved if all User Agents cooperate in using
the exact same collision avoidance or collision resolution technique. What
is really needed, by all the various ZMailer components operating on a
message, is a way to get and hold a lock on the message. There are various
ways to achieve this, and a kernel-based locking mechanism may seem
appropriate in certain situations. However, given the realization afforded
by an overview of the structure of ZMailer, it becomes clear that each
message goes through states corresponding to the current processing stage.
Instead of storing the state in the message file (or some other location
associated with a particular message file), the state is encoded in the
current location of a message file. For example, if the message file is in
the submission directory, it means it is waiting for the ZMailer Router to
process it. With this solution in hand, there still remains the original
matter of avoiding file name clashes.
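The idea of encoding state in location can be made concrete with a small
Python sketch. The directory names are those of the postoffice hierarchy;
the function name and the wording of the state descriptions are
illustrative only.

```python
import os

# Processing state implied by the directory a message file lives in.
STATES = {
    "public": "being created by a submitter",
    "router": "waiting for the Router",
    "deferred": "parked until a missing resource returns",
    "postman": "waiting for attention by the Postmaster",
    "queue": "routed, pending delivery",
}


def message_state(path):
    """Derive the processing state of a message file from its location."""
    directory = os.path.basename(os.path.dirname(path))
    return STATES.get(directory, "unknown")
```

Because a rename within one filesystem is atomic, moving a file from one
directory to another changes its state in a single step, and no separate
lock or state record is needed.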
The best way of avoiding name clashes is to generate names that cannot
clash. The only obvious unique property of a file is its inode number.
Therefore, if one uses the inode number in the file name itself, name
clashes will be completely avoided. This is a truism if the files are
created on the same filesystem, and of course breaks down if that
assumption is invalid. There is just one minor problem: the inode number
of a file is not known until the file is actually created.
To resolve that Catch-22 situation, the message file must be created under
one name, and then immediately renamed to its guaranteed unique name, using
its inode number. The first name for a message file can safely be chosen
by the same method used to find names for temporary files in `/tmp'. Now
all the problems are solved with only two important assumptions: all
message files are created by this same mechanism, and all message files are
created on the same filesystem.
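The whole submission procedure can be sketched in Python as follows. This
illustrates the technique only and is not the C library routine supplied
with ZMailer: the function name is hypothetical, and both directories are
assumed to be on the same filesystem.

```python
import os
import tempfile


def submit_message(text, public_dir, router_dir):
    """Create a message file and hand it to the Router in one atomic step."""
    # Step 1: create the file under a temporary name, as one would in /tmp.
    fd, tmp_path = tempfile.mkstemp(dir=public_dir)
    with os.fdopen(fd, "w") as f:
        f.write(text)
        f.flush()
        inode = os.fstat(f.fileno()).st_ino
    # Step 2: rename it to its inode number, which no other file on this
    # filesystem can share while the file exists.
    name = str(inode)
    staged = os.path.join(public_dir, name)
    os.rename(tmp_path, staged)
    # Step 3: relink the complete file into the directory the Router scans.
    target = os.path.join(router_dir, name)
    os.rename(staged, target)
    return target
```

The message is fully written before step 3, so the Router can never pick
up a partial file. A rename does not change the inode number, so the name
remains valid after each move.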
ZMailer will work with files that have been manually created and moved
around, although this should only be done routinely by mail system
maintainers.
Message File Format
===================
As mentioned, the Router picks up message files from a specific directory.
Normally, message file names can be arbitrary valid file names, and indeed
this is convenient when debugging. However, because the Router daemon
scans its own current directory, miscellaneous output from the Router
process may show up in this directory (e.g. profiling data, or core dumps
(unthinkable as that is)). Furthermore, it is useful to be able to hide
files from the Router scanning (indeed the Router may wish to do so
itself).
When the Router process scans for message files, then, it only considers
file names that have a certain format. Specifically, the
message file name must start with a digit. This method was chosen to
accommodate the message file names, as generated by the standard submission
interface library routines, which will be strings of digits representing
the message file's inode number.
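The file name filter amounts to very little code. A minimal Python sketch,
with a hypothetical function name:

```python
import os
import string


def candidate_message_files(router_dir):
    """List the files the Router would consider: names starting with a digit."""
    return sorted(
        name
        for name in os.listdir(router_dir)
        if name[0] in string.digits
        and os.path.isfile(os.path.join(router_dir, name))
    )
```

Files such as `core' or `gmon.out' are skipped, and a message file can be
hidden from the scan simply by giving it a name that starts with a
non-digit.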
A message file contains three sections: the message envelope, the message
header, and the message body (in that order). The message body is
separated from the previous sections by a blank line. The message body may
be empty, and either of the message envelope or message header may be
empty. The restriction on the latter situation is that one of those
sections must contain destination information for the message.
The message envelope and the message header have very similar syntax. The
only difference is that while the message header must adhere to RFC822, the
message envelope header fields are terminated by whitespace (` ') instead
of a colon (`:'). The semantics of the two message file sections are quite
different, and will be covered later.
The header fields recognized by ZMailer in the message envelope are:
`Channel word'
sets the channel corresponding to the message origin(*)
`From address'
a source address(*)
`Fullname phrase'
sets the full name of the local sender
`LoginName local-part'
requests using this mail id for the local sender
`RcvdFrom domain'
sets the host the message was received from(*)
`To address-list'
a destination address list
`User local-part'
sets the user the message was received from(*)
`Via word'
for RFC822 Received: header to be generated
`With word'
for RFC822 Received: header to be generated
The (*)'s beside the descriptions indicate this is a privileged field.
That is, the action will only happen if ZMailer trusts the owner of the
message file (*Note Security: security.). As with a normal RFC822 header,
other fields are allowed (though they will be ignored), and case is not
significant in the field name. The Router will do appropriate checks for
the fields that require it.
With this knowledge, we can now appreciate the minimal message file:
--------------------
to bond
--------------------
This will cause an empty message to be sent to `bond'. A slightly more
sophisticated version is:
--------------------
from m
to bond
via courier
From: M
To: Bond
Subject: do get a receipt, 007!

You are working for the Government, remember?
--------------------
Notice that there is no delimiter between the message envelope and the
message header. A more sophisticated example in the same vein:
--------------------
from ps/d-ops
to <007@sis.mod.uk>
From: M <d-ops@sis.mod.uk>
Sender: Moneypenny <ps/d-ops@sis.mod.uk>
To: James Bond <007@sis.mod.uk>
Subject: where are you???!
Classification: Top Secret
Priority: Flash

We have another madman on the loose. Contact "Q" for usual routine.
--------------------
If the `Classification' header is paid attention to in ZMailer, this
requires that the Router recognize it in the message header, and take
appropriate action. In general the Router can extract most of the
information in the message header, and make use of it if the information is
lacking in the envelope. The envelope headers in the above message are
superfluous, since the same information is contained in the message header.
Using the following envelope headers would be exactly equivalent to using
the ones shown above (assuming the local host is `sis.mod.uk'):
--------------------
From Moneypenny <ps/d-ops@sis.mod.uk>
To James Bond <007@sis.mod.uk>
...
--------------------
ZMailer will extract the appropriate address information from whatever the
field values are, as long as they obey the defined syntax (indicated in the
list of recognized envelope fields above). ZMailer will complain in case
of unexpected errors in the envelope headers.
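To make the layout concrete, here is a Python sketch that splits a message
file into its three sections. It is a simplification, not the Router's
parser: the function name is hypothetical, field values are kept as raw
strings, and the only rules applied are those stated above (envelope field
names end at whitespace, header field names end at a colon, the body
follows the first blank line, and field names are case-insensitive).

```python
import re

ENVELOPE_FIELD = re.compile(r"^([^\s:]+)[ \t]+(.*)$")
HEADER_FIELD = re.compile(r"^([^\s:]+):[ \t]*(.*)$")


def split_message(text):
    """Return (envelope, header, body) for the text of a message file."""
    top, _, body = text.partition("\n\n")
    envelope, header = [], []
    current = envelope
    for line in top.splitlines():
        if line[:1] in (" ", "\t") and current:
            # A continuation line extends the previous field.
            name, value = current[-1]
            current[-1] = (name, value + " " + line.strip())
            continue
        match = HEADER_FIELD.match(line)
        if match:
            current = header  # the first header line ends the envelope
        elif current is envelope:
            match = ENVELOPE_FIELD.match(line)
        if match is None:
            raise ValueError("malformed line: %r" % line)
        current.append((match.group(1).lower(), match.group(2)))
    return envelope, header, body
```

Applied to the minimal message file `to bond', this yields one envelope
field, an empty header, and an empty body.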
The message body is not interpreted by ZMailer itself. As far as the
Router is concerned, it can be arbitrary data. However, certain Transport
Agents may require limitations on the message body data. For example, SMTP
deals only with ASCII data with a small guaranteed line length.
Header Scanning and Parsing
===========================
The message header and envelope are scanned according to the lexical rules of
RFC822 (and RFC976), and parsed according to the grammar rules of RFC822.
RFC976 compatibility requires that the `!' and `%' characters be treated as
specials (just like `.' and `@'). This behavior is enabled at compile time
(by defining the `RFC976' preprocessor symbol), and is indeed enabled by
default.
The only divergences from RFC822/RFC976 syntax are that comments are not
allowed in certain locations within addresses, and that comments and quoted
strings may not span line boundaries. Neither of these is a design
limitation; both will disappear before final release. All other RFC822
constructs are properly recognized and supported.
The mentioned RFC documents serve to describe ZMailer behavior with respect
to lexical scanning, tokenization, and parsing. In summary, based on the
class of each character, a token stream is synthesized for each header.
Various headers have defined semantics (e.g. the `To' header contains
an address list), which drive the parse of the token stream for that header.
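As a rough illustration of the character-class scanning described above,
the following Python sketch splits a header value into atoms, specials,
quoted strings, and comments. It is not ZMailer's scanner: the name
`tokenize' and the token kind labels are invented here, domain literals are
not grouped, and `!' and `%' are simply added to the RFC822 specials in the
way RFC976 compatibility requires.

```python
# Illustrative sketch only, not ZMailer code.
# RFC822 specials, plus '!' and '%' for RFC976 compatibility.
SPECIALS = set('()<>@,;:\\".[]') | set("!%")


def tokenize(value):
    """Return a list of (kind, text) tokens for one header value."""
    tokens = []
    i = 0
    while i < len(value):
        ch = value[i]
        if ch in " \t":
            i += 1                          # white space only separates tokens
        elif ch == '"':
            j = i + 1
            while j < len(value) and value[j] != '"':
                j += 2 if value[j] == "\\" else 1   # step over quoted-pairs
            tokens.append(("quoted-string", value[i:j + 1]))
            i = j + 1
        elif ch == "(":
            depth, j = 1, i + 1             # comments may nest
            while j < len(value) and depth > 0:
                if value[j] == "\\":
                    j += 1                  # skip the quoted character
                elif value[j] == "(":
                    depth += 1
                elif value[j] == ")":
                    depth -= 1
                j += 1
            tokens.append(("comment", value[i:j]))
            i = j
        elif ch in SPECIALS:
            tokens.append(("special", ch))
            i += 1
        else:
            j = i
            while (j < len(value) and value[j] not in SPECIALS
                   and value[j] not in " \t"):
                j += 1
            tokens.append(("atom", value[i:j]))
            i = j
    return tokens
```

With `!' and `%' treated as specials, a mixed address such as
`host!user%relay' comes out as five tokens rather than one atom, which is
what lets later parsing treat the separators uniformly.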
The headers that have specific semantics to the Router are:
Field name          RFC822 Syntax description   Class
-------------------------------------------------------------------
channel             word                        Envelope
fullname            phrase                      Envelope
loginname           addr-spec                   Envelope
rcvdfrom            domain                      Envelope
user                mailbox                     Envelope
via                 word                        Envelope
with                word                        Envelope
bcc                 #address                    Recipient
cc                  1#address                   Recipient
date                date-time
encrypted           1#2word
errors-to           1#address                   Sender
from                1#mailbox                   Sender/Envelope
in-reply-to         *(phrase | msg-id)
keywords            1#phrase
message-id          msg-id
received            received
references          *(phrase | msg-id)
reply-to            1#address                   Sender
return-path         route-addr                  Sender
return-receipt-to   1#address                   Sender
sender              mailbox                     Sender
to                  1#address                   Recipient/Envelope
All the fields mentioned above are parsed by the Router. Some (e.g.
`in-reply-to') are just parsed and not interpreted. Since the Router will
complain about format violations, this is a way of enlightening people
about what a particular field is not supposed to contain.
The `date' and `received' fields are interesting in that it is rather
unusual for an RFC822 mailer to parse these fields. Indeed, whether or not
they are parsed depends on the definition of compile-time preprocessor
symbols (`CANON_DATE' and `CANON_RECEIVED' respectively). If a `date'
header is parsed successfully, it will be printed using proper RFC822
date-time syntax when the message is delivered. For this to be useful, the
date string parse in the Router must be rather flexible to recognize the
endless variety of formats that exist, and it is.
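The effect of canonicalizing a `date' header can be approximated with the
Python standard library, as sketched below. This only illustrates the
parse-then-reprint idea; it says nothing about how the Router's own date
parser works. The helper name `canonical_date' is invented here, and the
library prints a four-digit year where RFC822 itself specifies two digits.

```python
from email.utils import format_datetime, parsedate_to_datetime


def canonical_date(text):
    """Parse a loosely formatted date string and reprint it canonically."""
    return format_datetime(parsedate_to_datetime(text))


# The weekday is missing from the input and is supplied on output.
print(canonical_date("20 Nov 1995 19:12:08 -0500"))
```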
The intention with parsing `received' headers is to prepare for the
possibility of using the information in the trace headers to aid the
routing algorithm. For now, parsed trace headers are output in a canonical
format that follows RFC822, similar to what happens with parsed `date'
headers. As long as the information is not used, there is no common reason
to enable this feature. [XX: It should perhaps be possible to select these
features on a per-message basis. any thoughts on this?]
Some of the header fields are tagged with the class of addresses that the
field contains. This information is used when searching for
destination addresses when there are none specified in the envelope, and to
know which headers contain addresses that must be sent through the address
manipulation mechanisms of the Router.
Router Activities
=================
The ZMailer Router has three basic functions that it must carry out on each
message:
* Determining how to deliver a message to its destinations given in the
message envelope.
* Rewriting message header and envelope addresses to accommodate the standards
imposed by the method of delivery and the destination.
* Ensuring only properly formatted and standard-conforming (RFC822) messages
leave the local system.
For everything but the syntax and semantics of addresses, the last goal is
achieved by mechanisms internal to the Router. This is a reasonable
approach since a standard is not something that adapts to local conditions.
However, when pursuing the first two goals, many sites have found it
invaluable to be able to modify the behavior of the routing function, and
of the address rewriting function, to take local idiosyncrasies into
account. The importance of this ability is very apparent to sites in
complicated environments. Since ZMailer was partially motivated by the
inadequacies of other mailers in such an environment, much effort has gone
into the design of the configurable parts of the Router behavior. The
wired logic of the Router is treated in a later subsection (*Note Router
Control Flow: sequencer). Presently, we shall examine how routing and
address manipulation are carried out, and the Router facilities that
support these activities.
Routing Model
=============
For routing purposes, one wants to derive three pieces of information from
an address: where to send the message, how to send the message, and what to
tell the immediate destination of the message about it. This is the
information needed to properly transmit or deliver a message to its next
destination.
The mechanism used to transmit a message may be regarded as a conduit
(pipe, channel, circuit, etc.) between the local MTA and a remote MTA. In
ZMailer terminology, such a conduit is called a Channel. A Channel is just
a tag associated with a destination address for the message, and is used by
the Scheduler to manage delivery of the message. Thus, a Channel is a
concept (i.e. not associated with any particular program), and may be
serviced by one or more Transport Agents. As far as the Scheduler is
concerned, it is an uninterpreted classification of the message. For
example, if there are different physical links to a remote MTA, different
Transport Agent programs may serve the same Channel.
The Channel, or rather the Transport Agents serving a Channel, may need to
know which remote MTA to deliver the message to. This is most often a
hostname of a neighbouring host on a common network. If the Channel can
only have one destination host (for example the local delivery Channel),
a destination is superfluous. By convention, the Router will translate
null destinations into the symbol `-' in a message control file.
The remote MTA will need to know what to do with the message, in the form
of some envelope information. In RFC822, this information is embodied in
an address for further delivery with respect to the remote host.
The Router must determine this triple (channel, next-host, next-address)
for every address in the envelope, including the (single) origin address,
so that the origin can be verified. If not for security, this at least
ensures that a proper RFC822 address was specified for the sender, and that
a bogus address form is not passed on. To do this, the Router will call a
function
that takes an address as its argument and returns a triple. This function
may be completely specified in a configuration file read by the Router, and
its task is termed address resolution or routing.
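To make the shape of this interface concrete, here is a minimal Python
sketch of a function that maps an address to a (channel, next-host,
next-address) triple. It is not a ZMailer configuration: the channel names
`local' and `smtp', the constant `LOCAL_HOST', and the resolution policy
are assumptions chosen for illustration, and `-' is used for a null
destination only to mirror the control file convention mentioned above.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    channel: str
    next_host: str
    next_address: str


LOCAL_HOST = "sis.mod.uk"   # assumed local host name, borrowed from the example


def router(address):
    """Resolve one address into a (channel, next-host, next-address) triple."""
    if "@" not in address:
        return Triple("local", "-", address)       # bare local name
    local_part, _, domain = address.rpartition("@")
    if domain.lower() == LOCAL_HOST:
        return Triple("local", "-", local_part)    # delivered on this host
    return Triple("smtp", domain, address)         # hand to the remote MTA
```

The same function can be applied to the origin address, which is how a
malformed sender address would be caught before the message is passed on.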
While the `router' function rewrites envelope addresses as appropriate, there
must also be a way to rewrite message header addresses. In Sendmail, this
was done based entirely on which "mailer" (similar to a ZMailer Channel)
the message was sent through. To do more sophisticated rewriting was not
possible due to a complete lack of other information. If one wished to do
different manipulations depending on the final destination of a message for
example, it was almost impossible to do so (no variables or control flow in
Sendmail rulesets). It was also impossible to do address manipulation or
validity checking based on the origin of the message, since no such
information was available.
The ZMailer Router remedies these and other shortcomings in several ways:
the configuration language has control flow and variables, and the decision
of how to rewrite each address is carried out with access to all the needed
sender and recipient information. The word "decision" is used on purpose
to indicate that the choice of rewriting method is divorced from the actual
message header address rewriting process. What happens is that for each
recipient, the Router calls a function passing the triples derived from the
sender and the recipient address as arguments. The return value from this
`crossbar' function (so named because a crossbar switch was the closest
image that came to mind for what it does) includes the name of a function
that is to be used for rewriting the message header addresses. This
returned function is then called separately with all the addresses in the
message header, and the results will be incorporated in the message header
for the destination corresponding to the recipient triple. At the same
time, while the routing function does generic resolution of an address into
its corresponding triple, the crossbar function may modify the sender and
recipient triples if necessary, and so serves as a cleanup or filtering
function for the routing information. The crossbar function can also be
completely specified in a configuration file read by the Router.
The names (determined at compile-time) and interface specifications for the
routing and crossbar functions are the only crucial "magical" things one
needs to contend with in a proper Router configuration. The syntax and
semantics of the configuration file's contents are dealt with in the
following subsection. The details of the two functions introduced here are
specified after that, once the necessary background information has been
given.
Configuration File Programming Language
=======================================
Whenever the Router process starts up, its first action is to read its
configuration file. The configuration file is a text file which contains
statements interpreted immediately when the file is read. Some statements
are functions, in which case the function is defined at that point in
reading the configuration file. The purpose of the configuration file is
to provide a simple way to customize the behavior of the mailer, and this
is primarily achieved by defining the `router' and `crossbar' functions.
For these to work properly, some initialization code and auxiliary
functions will usually be needed.
At first sight, a configuration file looks like a Bourne shell script.
Indeed, the ideal is to duplicate the functionality, syntax, and to a large
degree the semantics, of a shell script. Therefore, the configuration file
programming language is defined in terms of its deviation from standard
Bourne shell syntax and semantics. The present differences are:
* No `for', `while', and `repeat' statements, no pipes, or I/O
redirection.
* Case statement labels have no following `)', i.e. use
     case foo in
         pattern action ;;
     esac
instead of
     case foo in
         pattern) action ;;
     esac
* Case label patterns use V8 (Eighth Edition UNIX) regular expression
syntax (`egrep'-like).
* Functions are allowed, parameter lists are allowed. If not enough
arguments are present in a function call to exhaust the parameter
list, the so-far unbound parameter variables are bound to `' (the
empty string) as local variables. For example, this is the identity
address rewriting function:
     null (address) {
         return $address # surprise!
     }
* Multiple-value returns are allowed. The `return' statement can be
used to return a non-`' value from a function. The following are all
legal `return' statements:
     return
     return $address
     return $channel ${next_host} ${next_address}
* Variables are dynamically scoped; the only local variables are the
ones in a function's parameter list. Only the first value of a
multiple-value return may be assigned to a variable. All values are
strings, so no type information, checking, or declaration, is
necessary.
* Quoting is a bit stilted. All quotes (double, single, and back quotes)
must appear in matching pairs at the beginning and end of a word. Single
quotes are not stripped, double quotes cause the enclosed character
sequence to be collected into a quoted-string RFC822 token. For
example, the statement:
foo `bar "`baz`"`
is evaluated as `(apply 'foo (apply 'bar (baz)))'.
* The forms `${variable:=value}', `${variable:-value}', and
`${variable:+value}' are supported. The special form
`${string:relation}' returns the value of `relation(string)',
implementing a database lookup function.
* Patterns (in case labels) are evaluated once, the first time they are
encountered.
* At the end of a case label, the sequentially next case labels of the
same case statement will be tried for successful pattern matching (and
the corresponding case label body executed). The only exceptions
(apart from encountering a return statement) are:
     `again'
          a function which retries the current case label for a match.
     `break'
          continues execution after the current case statement.
* Various standard Bourne shell functions do not exist built in.
* The function `import' must be used to declare a UNIX program to be
accessible to the config file code. This allows development using an
existing utility, and integration into the Router of the same
functionality can be delayed until the need is proven. For example, use
the statement:
     import hostname /bin/hostname
to do the obvious. Programs defined in this manner will have the
message file on their standard input when they are executed.
There are currently only two entry-points (i.e. magic names known to the
Router code) in the configuration file, namely the `router' and the
`crossbar' functions.
The `router' function is called with an address as argument, and returns a
triple of (channel, host, user) as three separate values: the channel the
message should be sent out on, the host or node name for that channel (null
if local delivery), and the address the receiving agent should transmit to.
The router function can also be called with a sender address, to check on
who sent a message.
The `crossbar' function is in charge of rewriting envelope addresses,
selecting message header address munging type (a function to be called with
each message header address), and possibly doing per-message logging or
enforcing restrictions deemed necessary. It takes a sender-triple and a
receiver-triple as arguments (six parameters altogether). It returns the
new values for each element of the two triples, and in addition a function
name corresponding to the function to be used to rewrite header addresses
for the specific destination. If the destination is to be ignored,
returning a null function name will accomplish this.
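The calling convention just described can be sketched in Python as follows.
This is an illustration, not ZMailer code: the order of the returned
values, the `REWRITERS' table, the helper `rewrite_headers', and the policy
of dropping an `error' channel are all assumptions made for the example.
Only the six-parameter input, the returned rewriting function name, and the
meaning of a null name come from the text.

```python
def null(address):
    """Identity rewriting function, like the `null' example shown earlier."""
    return address


REWRITERS = {"null": null}      # hypothetical table of rewriting functions


def crossbar(s_channel, s_host, s_user, r_channel, r_host, r_user):
    """Return both triples (possibly modified) plus a rewriter name.

    An empty rewriter name means the destination is to be ignored.
    """
    rewriter = "null"
    if r_channel == "error":    # hypothetical local policy
        rewriter = ""
    return (s_channel, s_host, s_user, r_channel, r_host, r_user, rewriter)


def rewrite_headers(addresses, sender, recipient):
    """Apply the rewriter selected by crossbar to each header address."""
    *_, name = crossbar(*sender, *recipient)
    if not name:
        return None             # destination ignored
    return [REWRITERS[name](address) for address in addresses]
```

Keeping the choice of rewriter (made once per recipient) apart from the
rewriting itself (done once per header address) is the separation the text
refers to as the "decision".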
There is one more magic symbol the Router knows about, which is
(optionally) defined by the configuration file. That is the name of the
definition of the alias database, a protocol which will be dealt with in
the subsection explaining the database lookup mechanism hinted at above.
The Router has several built-in (C-coded) functions. Their calling
sequence and interface specification are exactly the same as for the
functions defined in the configuration file [XX: except that they can't yet
return multiple values; it'll be fixed]. Some of these functions have
special semantics, and they fall into three classes, as follows:
Functions that are critical to the proper functioning of the configuration
file interpreter:
     `return'
          returns its argument(s) as the value of a function call
     `again'
          repeats the current case label
     `break'
          exits a case statement
Functions that are necessary to complete the capabilities of the
interpreter:
     `import'
          defines a function name that refers to an external program
     `relation'
          defines a database to the database lookup mechanism
     `sh'
          an internal function which runs its arguments as `/bin/sh' would
Non-critical but recommended functions:
     `getzenv'
          retrieves global ZMailer configuration values
     `echo'
          emulates `/bin/echo'
     `exit'
          aborts the Router with the specified status code
     `hostname'
          internal function to get and set the system name
     `trace'
          turns on selected debugging output
     `untrace'
          turns off selected debugging output
     `['
          emulates a subset of `/bin/test' (a.k.a. `/bin/[') functionality
The `relation' function is described in a later section (*Note Database
Interface: databases), and the `trace' and `untrace' functions are
described in connection with debugging (*Note Logging: routerlog).
The `hostname' function requires some further explanation. It is intended
to emulate the BSD UNIX `/bin/hostname' functionality, except that setting
the hostname will only set the Router's idea of the hostname, not the
system's. Doing so will enable generation of `Message-Id' and `Received'
"trace" headers on all messages processed by the Router. It is done this
way, since the Router needs to know the official domain name of the local
host in order to properly generate these headers, and this method is
cleaner than reserving a magic variable for the purpose. The Router cannot
assume the hostname reported by the system is a properly qualified domain
name, so the configuration file may generate it using whichever method it
chooses. If the hostname indeed is a fully qualified domain name, then:
hostname `hostname`
will enable generation of trace headers.
Finally, note that a symbol can have both a function-value and a
string-value. The string value is of course accessed using the $-prefix
convention of the Bourne shell language.
Address Manipulation
====================
Most of the flexibility of Sendmail derives from its production-rule model
for address rewriting. Very loosely, the concepts of rulesets in Sendmail
correspond to the functions of the ZMailer Router configuration file
programming language, and the rules themselves correspond to the case label
bodies of the case statements in our language.
Addresses are represented as string values in this language, no different
from any other strings. Therefore, addresses can be assigned as the value
of a variable, or passed as an argument to a function. The way to do
address rewriting is to modify the value of a variable chosen to contain
the *current* address (in the sense of the Sendmail rewriting process). In
keeping with the production rule model, much of the address rewriting is
typically done within case statements, whose semantics have been tailored
for this activity.
A `case' statement in the configuration file language has almost the same
syntax as the Bourne shell case statement. However, its semantics are
different, in that it is similar to the philosophy of (Sendmail) rulesets.
That is, the normal action is for an address to "enter" at the top, and for
each case label (rule) that matches, the case label body (action) is
executed. This is carried on sequentially for each case label (rule-action
pair) in the case statement (rule set), unless the normal continuation
action is modified by a control statement.
As opposed to Sendmail, where a rule is retested until it fails, a case
label pattern is only retested if the case label body calls the special
function `again'. This change was made because it is frequently a waste of
time to retest a pattern match when one has just modified the string to be
matched against. Sendmail does provide a way to continue to the next
rule-action pair, but since it is not the default behavior, it is often not
used in places where it should be. As a way of reducing the
consequent waste of time, the default behavior has been changed.
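The control flow described in the last few paragraphs can be modelled in a
few lines of Python. This is a sketch under stated simplifications, not the
Router's interpreter: `run_case', `AGAIN', and `BREAK' are names invented
here, actions signal `again' and `break' through their return value, and
matching is character-based for brevity.

```python
import re

AGAIN, BREAK = "again", "break"


def run_case(get_subject, labels):
    """Run (pattern, action) pairs in order against get_subject().

    An action receives the match object and returns AGAIN to retest the
    same label, BREAK to leave the statement, or None to fall through to
    the next label.
    """
    i = 0
    while i < len(labels):
        pattern, action = labels[i]
        match = re.fullmatch(pattern, get_subject())
        if match is None:
            i += 1                  # label does not apply, try the next one
            continue
        outcome = action(match)
        if outcome == BREAK:
            break
        if outcome != AGAIN:
            i += 1                  # default: continue with the next label


# Usage: strip trailing dots (retesting with AGAIN), then lowercase once.
state = {"address": "User@Host.."}


def strip_dot(match):
    state["address"] = match.group(1)
    return AGAIN


def lowercase(match):
    state["address"] = state["address"].lower()


run_case(lambda: state["address"],
         [(r"(.*)\.", strip_dot), (r".*[A-Z].*", lowercase)])
```

Only the first label asks to be retested; the second runs once and falls
through, which is the default the text argues for.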
The other special function that is specific to case statements is `break'.
It is used with the same semantics as if within a C language looping
construct or switch statement, i.e. to exit the case statement and continue
with the statement after it. Of course, a `return' statement will
completely return from its enclosing function at any time.
Case conditions usually are not just a simple constant string; they will
usually contain a variable expansion and perhaps a function call. The
value of such a condition changes as the variable(s) it depends on change.
When doing repeated case label pattern matching with the condition string
value, it would be rather unsavory to reevaluate the condition expression
every time. If no antecedent variable has changed value, obviously the
expression will not change its value either. To avoid this unnecessary
effort, the case condition is only reevaluated when any variable it depends
on has been assigned to, and then of course only when the current
expression value is actually needed.
Some final, and very important points: Even though the case label patterns
look like normal regular expressions that one can find in editors and other
system utilities, the pattern matching in the Router is token-based, rather
than character-based. The tokens are of course the RFC822 tokens scanned
from the value of the condition expression. This is done to avoid
surprises from simplistic patterns, and to cut down on unnecessary
verbosity in describing an address when using the normal regular expression
semantics. Another thing that helps is that all case label
patterns are anchored at the beginning and end of the string. An anchored
pattern easily simulates an unanchored pattern, but not vice versa. In
patterns, parentheses are used to group a number of alternates, and are
also used to bracket portions of the pattern, so the corresponding tokens
in the matched string can later be referred to. To avoid introducing
another special character (backslash, conventionally used to refer to
selected portions of the matched string), the semantics of the $-prefix
notation are extended to handle this need. If a `$' is followed by a digit
N, this is expanded as the value of the portion of the matched string
selected by the N'th group of parentheses in the pattern.
To give an idea of how a case statement looks, here is a code fragment:
     case $hostname in
     .+\.(edu|gov|mil|oth|org|net|ca|dk|uk)   # add toplevels as you please
             break ;;                         # do nothing
     .*      hostname = $hostname.$orgdomain ;;   # default domain
     esac
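For readers more at home with conventional regular expression libraries,
the fragment above behaves roughly like the Python function below. The
correspondence is only approximate: Python matches characters where the
Router matches RFC822 tokens, and the function name `qualify' is invented
for the comparison. `re.fullmatch' stands in for the implicit anchoring at
both ends of a case label pattern.

```python
import re

TOPLEVELS = r".+\.(edu|gov|mil|oth|org|net|ca|dk|uk)"


def qualify(hostname, orgdomain):
    """Append the default domain unless a listed top-level is present."""
    if re.fullmatch(TOPLEVELS, hostname):   # anchored at both ends
        return hostname                     # first label: `break', do nothing
    return hostname + "." + orgdomain       # `.*' label: add default domain
```

Because the pattern is anchored, a name such as `edu.example' does not
count as being in a listed top-level domain, even though it contains `edu'.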
Distribution
************
Copyright (C) 1988 Rayan S. Zachariassen.
If you received this manual directly from the author, you may make and
distribute verbatim copies of it within your organization. Except by
explicit permission from the author, all other redistribution is prohibited
prior to final release.
File: zmog Node: databases, Up: router, Next: security
Database Interface
==================
Many of the decisions and actions taken by configuration file code depend
on the specifics of the environment the MTA finds itself in. So, not just
the facts that the local host is attached to (say) the UUCP network and a
Local Area Net are important, but it is also essential to know the specific
hosts that are reachable by this method. Hardcoding large amounts of such
information into the configuration file is not practical. It is also
undesirable to change what is really a program (the configuration file),
when the information (the data) changes.
The desirable solution to this data abstraction problem is to provide a way
for the configuration file programmer to manage such information externally
to ZMailer, and access it from within the Router. The logical way to do
this is to have an interface to externally maintained databases. These
databases need not be terribly complicated; after all the simplest kind of
information needed is that a string is a member of some collection. This
could simply correspond to finding that string as a word in a list of
words.
However, there are many ways to organize databases, and the necessary
interfaces cannot be known in advance. The Router therefore implements a
framework that allows flexible interfacing to databases, and easy extension
to cover new types of databases.
To use a database, two things are needed: the name of the database, and a
way of retrieving the data associated with a particular key from that
database. In addition to this knowledge, the needs of an MTA do include
some special processing pertinent to its activities and the kind of keys to
be looked up.
Specifically, the result of the data lookup can take different forms: one
may be interested only in the existence of a datum, not its value, or one
may be looking up paths in a pathalias database and need to substitute the
proper thing in place of `%s' in the string returned from the database
lookup. It should be possible to specify that this kind of postprocessing
should be carried out in association with a specific data access.
Similarly, there may be a need for search routines that depend on the
semantics of keys or the retrieved data. These possibilities have all been
taken into consideration in the definition of a relation. A relation maps
a key to a value obtained by applying the appropriate lookup and search
routines, and perhaps a postprocessing step, applied to a specified
database that has a specified access method.
The various attributes that define a relation are largely independent.
There will of course be dependencies due to the contents or other semantics
of a database. In addition to the features mentioned, each relation may
optionally have associated with it a subtype, which is a string value used
to communicate to the lookup routine which table of several in a database
one is interested in.
There are no predefined relations in the Router. They must all be
specified in the configuration file, before first use. This is done by
calling the special function `relation' with various options, as indicated
by the usage string printed by the relation function when called the wrong
way:
Usage: relation -t dbtype [-f file -s# -b|n -l/u -d driver] name
The `t' option specifies one of several predefined database types, each
with their specific lookup routine. It determines a template for the set
of attributes associated with a particular relation. The predefined
database types are:
     `hostsfile'
          `/etc/hosts' lookup using `gethostbyname()'.
     `unordered'
          the database is a text file with key-datum pairs on each line, keys
          are looked up using a sequential search.
     `ordered'
          the database is a text file with key-datum pairs on each line, keys
          are looked up using a binary search in the sorted file.
     `dbm'
          the database is in DBM format (strongly discouraged).
     `ndbm'
          the database is in NDBM (new DBM) format.
     `bind'
          the database is the BIND nameserver, accessed through the standard
          resolver routines.
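The difference between the `unordered' and `ordered' types is only the
search strategy. The Python sketch below contrasts the two on an in-memory
list of lines. It is an illustration, not the Router's lookup code: the
function names are invented, white space is assumed to separate key from
datum, and a real `ordered' lookup would presumably seek within the file
rather than load it.

```python
from bisect import bisect_left


def parse(line):
    """Split a line into (key, datum); the datum may be empty."""
    parts = line.split(None, 1)
    key = parts[0] if parts else ""
    datum = parts[1].strip() if len(parts) > 1 else ""
    return key, datum


def lookup_unordered(lines, key):
    """Sequential search; returns the datum, or None if the key is absent."""
    for line in lines:
        k, datum = parse(line)
        if k == key:
            return datum
    return None


def lookup_ordered(sorted_lines, key):
    """Binary search over lines that are already sorted by key."""
    keys = [parse(line)[0] for line in sorted_lines]
    i = bisect_left(keys, key)
    if i < len(keys) and keys[i] == key:
        return parse(sorted_lines[i])[1]
    return None
```

Both functions return an empty string for a key that is present without a
datum and None for a missing key, so callers can tell the two cases apart.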
A subtype is specified by appending it to the database type name separated
by a slash. For example, specifying `bind/mx' as the argument to the `t'
option will store away `mx' for reference by the access routines whenever a
query to that relation is processed. The subtypes must therefore be
recognized by either the database-specific access routines (for translation
into some other form), or by the database interface itself.
For `unordered' and `ordered' database types, the datum corresponding to a
particular key may be null. This situation arises if the database is a
simple list, with one key per line and nothing else. In this situation,
the use of an appropriate post-processor option (e.g. `b') is recommended
to be able to detect whether or not the lookup succeeded.
The `f' option specifies the name of the database. This is typically a
path that either names the actual (and single) database file, or gives the
root path for a number of files comprising the database (e.g. `foo' may
refer to the NDBM files `foo.pag' and `foo.dir'). For the `hostsfile' type
of database, the `/etc/hosts' file is the one used (and since the normal
hosts file access routines do not allow specifying a different file, this
cannot be overridden). For the `bind' database, this filename specifies
the `resolv.conf' file read by the resolver routines the first time they
are called. [XX: what if they are called by some library routine
innocuously used by ZMailer before the relation is defined? What if there
are several relations specifying different resolv.conf files?]. The use of
the `dbm' format is strongly discouraged, since a portable program can only
have a single DBM database associated with it.
The `s' option specifies the size of the cache. If this value is non-zero
(by default it is 10), then an LRU cache of this size is maintained for
previous queries to this relation, including both positive and negative
results.
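A cache with these properties can be sketched in Python as below. The class
name `CachedRelation' and the `misses' counter are invented for
illustration, and nothing here reflects the Router's internal data
structures; the default size of 10 and the caching of negative results
follow the description above.

```python
from collections import OrderedDict


class CachedRelation:
    """Wrap a lookup function with a small LRU cache of query results."""

    def __init__(self, lookup, size=10):
        self.lookup = lookup        # function: key -> datum, or None on failure
        self.size = size            # 0 disables caching
        self.cache = OrderedDict()
        self.misses = 0             # number of real lookups performed

    def query(self, key):
        if self.size and key in self.cache:
            self.cache.move_to_end(key)          # now most recently used
            return self.cache[key]
        self.misses += 1
        result = self.lookup(key)                # None is cached too
        if self.size:
            self.cache[key] = result
            if len(self.cache) > self.size:
                self.cache.popitem(last=False)   # evict least recently used
        return result
```

Caching failures as well as successes matters for relations backed by a
nameserver, where repeated queries for a nonexistent name would otherwise
each cost a network round trip.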
The `b' option asks that a postprocessor be applied to the database lookup
result, so the empty string is returned from the relation query if the
database search failed, and the key itself is returned if the search
succeeded. In the latter case, any retrieved data is discarded. The
option letter is short for Boolean.
The `n' option asks that a postprocessor be applied to the database lookup
result, so the key string is returned from the relation query if the
database search failed, and the retrieved datum string is returned if the
search succeeded. The option letter is short for Non-Null.
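The two postprocessors amount to the following pair of Python functions. In
this sketch a failed search is represented by None, which is an assumption
of the example rather than a statement about the Router's internals; the
function names are likewise invented.

```python
def boolean(key, datum):
    """`b' option: the key if the search succeeded, else the empty string."""
    return key if datum is not None else ""


def non_null(key, datum):
    """`n' option: the datum if the search succeeded, else the key itself."""
    return datum if datum is not None else key
```

The `b' form is what makes a plain list of keys usable as a membership
test, since a successful lookup of a key with no datum still yields a
non-empty result.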
The `l' option asks that all keys be converted to lowercase before lookup
in the database. This is mutually exclusive with the `u' option.
The `u' option asks that all keys be converted to uppercase before lookup
in the database. This is mutually exclusive with the `l' option.
The `d' option specifies a search routine. Currently the only legal
argument to this option is `pathalias', specifying a driver that searches
for the key using domain name lookup rules.
The final argument is not preceded by an option letter. It specifies the
name the relation is known under. Note that it is quite possible for
different relations to use the same database.
Some sample relation definitions follow:
     if [ -f /etc/named.boot ]; then
         relation -nt bind/cname -s 100 canon   # T_CNAME canonicalize hostname
         relation -nt bind/uname uname          # T_UNAME UUCP name
         relation -bt bind/mx neighbour         # T_MX/T_WKS/T_A reachability
         relation -t bind/mp pathalias          # T_MP pathalias lookup
     else
         relation -nt hostsfile -s 100 canon    # canonicalize hostname
         relation -t unordered -f $MAILBIN/db/hosts.uucp uname
         relation -bt hostsfile neighbour
         relation -t unordered -f /dev/null pathalias
     fi
The above fragment defines a set of relations that can be accessed in the
same way, using the same names, independent of their actual definition.
     # We maintain an aliases database in the following format. Note: the
     # 'aliases' db name is magic to the internal alias expansion routines.
     if [ -f $MAILBIN/db/aliases.dat ]; then
         relation -t ndbm -f $MAILBIN/db/aliases aliases
     else
         relation -t ordered -f $MAILBIN/db/aliases.idx aliases
     fi
As the comment says, the relation name `aliases' has special significance
to the Router. Although the relation is not special in any other way (i.e.
it can be used in the normal fashion), the semantics of the data retrieved
are bound by assumptions in the aliasing mechanism. These assumptions are
that key strings are local-name's, and the corresponding datum gives a byte
offset into another file (the root name of the aliases file, with a `.dat'
extension), which contains the actual addresses associated with that alias.
The reason for this indirection is that the number of addresses associated
with a particular alias can be very large, and this makes the traditional
simple database formats inadequate. For example, quick lookup in a text
file is only practical if it is sorted and has a regular structure. A
large number of addresses associated with an alias makes the structuring a
problem. DBM files and their variants have problems too, due
to the intrinsic limits of the storage method. The chosen indirection
scheme avoids such problems without loss of efficiency.
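The indirection can be pictured with the Python sketch below, which keeps a
small index from alias name to byte offset and a separate data area holding
the address lists. The layout of the data area (one comma-separated line
per alias) and the function names `build' and `expand' are assumptions made
for the example; only the use of a decimal byte offset as the datum comes
from the description above.

```python
import io


def build(aliases):
    """Return (index, data); index maps each alias to a decimal byte offset."""
    data = io.BytesIO()
    index = {}
    for name, addresses in sorted(aliases.items()):
        index[name] = str(data.tell())       # offset of this alias's entry
        data.write((", ".join(addresses) + "\n").encode("ascii"))
    return index, data.getvalue()


def expand(index, data, name):
    """Follow the offset stored for `name' into the data area."""
    offset = index.get(name)
    if offset is None:
        return None                           # no such alias
    stream = io.BytesIO(data)
    stream.seek(int(offset))
    line = stream.readline().decode("ascii")
    return [address.strip() for address in line.split(",")]
```

The index entries stay short and uniform however long an address list
grows, which is what keeps the keyed lookup practical.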
Finally, some miscellaneous definitions that illustrate various
possibilities:
     relation -t unordered -f /usr/lib/news/active -b newsgroup
     relation -t unordered -f /usr/lib/uucp/L.sys -b ldotsys
     relation -t ordered -f $MAILBIN/db/hosts.transport -d pathalias transport
Here, the first two illustrate convenient coincidences of format, and the
last definition shows what might be used if outgoing channel information is
maintained in a pathalias-format database (e.g. `bar smtp!bar' means to
send mail to `bar' via the SMTP channel).
Using a Pathalias Database
--------------------------
Accessing route databases is a rather essential capability for a mailer.
At the University of Toronto, all hosts access a centrally stored database
through a slightly modified nameserver program. If such a setup is not
practical at your site, other methods are available. The most widespread
kind of route database is produced by the `pathalias' program. It
generates key-value pairs of the forms:
uunet ai.toronto.edu!uunet!%s
.css.gov ai.toronto.edu!uunet!seismo!%s
which when queried about `uunet' and `beno.css.gov' correspond to the
routes:
ai.toronto.edu!uunet
ai.toronto.edu!uunet!seismo!beno.css.gov
Notice that there are two basic forms of routes listed: routes to UUCP node
names and routes to subdomain gateways. Depending on the type of route
query, the value returned from a pathalias database lookup needs to be
treated differently. For now, this may be accomplished by a configuration
file relation definition and interface function as shown:
relation -t ndbm -f $MAILBIN/uuDB -d pathalias padb
# pathalias database lookup function
padblookup (name, path) { # path is a local variable
path = ${$name:padb}
case "$path" in
((.+)!)?([^!]+)!%s
if [ $3 == $name ]; then
path = $2!$3
else
path = $2!$3!$name
fi
;;
.*%s.* echo illegal route in pathalias db: $path
;;
esac
return $path
}
This is actually a simplistic algorithm, but it does illustrate the method.
The lookup algorithm used when the `-d' flag is specified in the
relation definition command is rather simple; it doesn't test various case
combinations for the keys it tries. Therefore, the keys in the pathalias
output data should probably be converted to a single case, and the `-l'
or `-u' flag given in the relation definition.
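For readers more comfortable outside the configuration language, the same
lookup can be sketched in Python. This is an illustration, not the Router's
code: the order in which domain suffixes are tried is an assumption about
what a `-d pathalias' lookup does, keys are assumed to have been folded to
lower case, and `candidate_keys' and `padb_lookup' are hypothetical names.

```python
import re

# Same shape as the case label above: optional prefix, last hop, then "!%s".
ROUTE_RE = re.compile(r"^(?:(.+)!)?([^!]+)!%s$")

def candidate_keys(name):
    """Yield the name itself, then each enclosing domain with a leading dot."""
    yield name
    labels = name.split(".")
    for i in range(1, len(labels)):
        yield "." + ".".join(labels[i:])

def padb_lookup(db, name):
    """Return a bang path for name, or None if no key matches."""
    name = name.lower()
    for key in candidate_keys(name):
        template = db.get(key)
        if template is None:
            continue
        match = ROUTE_RE.match(template)
        if match is None:
            raise ValueError("illegal route in pathalias db: " + template)
        prefix, last_hop = match.groups()
        hops = [prefix] if prefix else []
        hops.append(last_hop)
        if last_hop != name:          # gateway route: append the real target
            hops.append(name)
        return "!".join(hops)
    return None

db = {
    "uunet": "ai.toronto.edu!uunet!%s",
    ".css.gov": "ai.toronto.edu!uunet!seismo!%s",
}
```

With this table, a query for `uunet' yields `ai.toronto.edu!uunet', and a
query for `beno.css.gov' yields `ai.toronto.edu!uunet!seismo!beno.css.gov',
matching the routes listed at the start of this section.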
Mail Forwarding
===============
Although more interesting and useful models exist, the mail forwarding
functionality of ZMailer has been designed to generally emulate the
interface and behaviour of Sendmail. The mechanisms that accomplish this
are likely to be generalized in a future version.
If a relation named `aliases' is defined by the configuration file, then
the data returned by a lookup in that database is assumed to be a printed
decimal representation of the byte offset of the definition of the alias in
a separate file. In other words, the `aliases' relation associates a
particular local-part with an index into another file that contains the
actual alias definition. The name of this other data file is constructed
from the name of the file associated with the `aliases' relation; typically
it will be `aliases.dat'.
The file containing the actual aliasing data is automatically created by
the Router when asked to reconstruct the aliases database. It does this
based on a text file containing the alias definitions. This text file,
which corresponds to the Sendmail aliases file, consists of individual
alias definitions, possibly separated by blank lines or commentary.
Comments are introduced by a sharp sign (octothorp: `#') at any point where
a token might start (for example the beginning of a line, but not in the
middle of an address), and extend to the end of the line. Each alias
definition has the exact syntax of an RFC822 message header, containing an
address-list, except for comments. The header field name is the local-part
being aliased to the address-list that is the header value.
The fact that an alias definition follows the syntax for an RFC822 message
header introduces an incompatibility with Sendmail. The string
`:include:' at the start of a local-part (a legacy of RFC733) has special
semantics. Sendmail would strip this prefix, and regard the rest of the
local-part as a path to a file containing a list of addresses to be
included in the alias expansion. Indeed, the Router behaves in the same
manner, but because some of the characters in the prefix are RFC822
specials, the entire local-part must be quoted. Thus, whereas Sendmail
allowed:
people: :include:/usr/lib/mail/lists/people
the proper syntax with ZMailer is:
people: ":include:/usr/lib/mail/lists/people"
Like Sendmail, if a local-part is not found in the aliases database, the
Router also checks `~local-part/.forward' (if such exists) for any address
expansion. The `.forward' file format is also an RFC822 address-list,
similar to what Sendmail expects.
There are presently no special features to deal properly with mailing lists
(apart from what has been described above about the aliases database).
Such features are necessary, and will be designed after consultation [XX:
got any ideas? tell me! TODO: mailing lists, message header manipulation].
As special cases, a local-part starting with a pipe character (`|') is
treated as mail destined for a program (the rest of the local-part is any
valid argument to a `sh -c' command), and a local-part starting with a
slash character (`/') is treated as mail destined for the file named by the
local-part.
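The special forms of local-part described in this section can be summarized
by a small Python sketch. The function name and the returned labels are
hypothetical, and the local-part is assumed to have had any RFC822 quoting
removed already.

```python
INCLUDE_PREFIX = ":include:"

def classify_local_part(local_part):
    """Return (kind, argument) for an unquoted local-part."""
    if local_part.startswith("|"):
        # The remainder is handed to `sh -c'.
        return ("program", local_part[1:])
    if local_part.startswith("/"):
        # The whole local-part names the destination file.
        return ("file", local_part)
    if local_part.startswith(INCLUDE_PREFIX):
        # The remainder is the path of a file holding an address list.
        return ("include", local_part[len(INCLUDE_PREFIX):])
    return ("user", local_part)
```

For example, `classify_local_part(":include:/usr/lib/mail/lists/people")'
returns the kind `include' together with the path of the list file.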
^_
File: zmog Node: security, Prev: databases, Up: router, Next: sequencer
Security
========
Having local-parts that allow delivery to arbitrary files, or can trigger
execution of arbitrary programs, can clearly lead to a huge security
problem. Sendmail does address this problem, but in a restrictive and
unintuitive manner. This aspect of ZMailer security has been designed to
allow the privileges expected by common sense.
The responsibility for implementing this kind of security is split between
the Router and the Transport Agent that delivers a message to an address.
Since it is the Transport Agent that must enforce the security, it needs
some information to guide it. Specifically, for each address it delivers
to, some information about the "trustworthiness" of that address is
necessary so the Transport Agent can determine which privileges it can
assume when delivering for that destination. This information is
determined by the Router, and passed to the Transport Agent in the message
control file. The specific measure of trustworthiness chosen by [XX: the
present incarnation of] ZMailer, is simply a user id (uid) value
representing the source of the address.
When a message comes in from a non-local host, the destination addresses
should obviously have no privileges on the local host (when mailing to a
file or a program). Similarly, common sense would indicate that locally
originated mail should have the same privileges as the originator. Based
on an initial user id assigned from such considerations, the privilege
attached to each address is modified by the attributes of the various alias
files that contain expansions of it. The algorithm to determine the
appropriate privilege is to use the user id of the owner of the alias file
if and only if that file is not group or world writable, and the directory
containing the file is owned by the same user and is likewise neither group
nor world writable. If any of these conditions do not hold, an
unprivileged user id will be assigned as the privilege level of the
address.
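The ownership and permission test can be sketched in Python as follows.
This is an illustration of the rule as stated, not the Router's
implementation; the value of `NOBODY' and the names `alias_file_privilege'
and `demo' are assumptions.

```python
import os
import stat
import tempfile

NOBODY = -2  # hypothetical unprivileged uid; the real value is site-specific

def alias_file_privilege(path, nobody=NOBODY):
    """Return the uid to attach to addresses expanded from the file at path."""
    writable_by_others = stat.S_IWGRP | stat.S_IWOTH
    try:
        file_st = os.stat(path)
        dir_st = os.stat(os.path.dirname(os.path.abspath(path)))
    except OSError:
        return nobody
    if file_st.st_mode & writable_by_others:
        return nobody          # file is group or world writable
    if dir_st.st_uid != file_st.st_uid:
        return nobody          # directory is owned by someone else
    if dir_st.st_mode & writable_by_others:
        return nobody          # directory is group or world writable
    return file_st.st_uid

def demo():
    """Check a private file, then the same file made world writable."""
    with tempfile.TemporaryDirectory() as directory:
        os.chmod(directory, 0o755)
        path = os.path.join(directory, ".forward")
        with open(path, "w") as f:
            f.write("rayan@ephemeral.ai.toronto.edu\n")
        os.chmod(path, 0o644)
        owner = os.stat(path).st_uid
        private = alias_file_privilege(path)
        os.chmod(path, 0o666)
        exposed = alias_file_privilege(path)
    return owner, private, exposed
```

The sketch inspects only the file and its immediate parent directory, so it
shares the limitations of the algorithm it illustrates.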
It is entirely up to the Transport Agent whether it will honour the
privilege assignment of an address, and indeed in many cases it might not
make sense (for example for outbound mail). However, it is strongly
recommended that appropriate measures are taken when a Transport Agent has
no control over some action that may affect local files, security, or
resources.
The described algorithm is far from perfect. The obvious dangers are:
* The grandparent directories, to the Nth degree, are ignored, and may
not be secure. In that case all security is lost anyway.
* There is a window of vulnerability between when the permissions are
checked, and the delivery is actually made. This is the best argument
I have heard so far for embedding the local delivery program
(currently a separate Transport Agent) in the Router.
There is also another kind of security that must be addressed. That is the
mechanism by which the Router is told about the origin of a message. This
is something that must be possible for the message receiving programs
(`/bin/rmail' and the SMTP server are examples of these) to specify to
ZMailer. The Router knows of a list of trusted accounts on the system. If
a message file is owned by one of these user id's, any sender specification
within the message file will be believed by ZMailer. If the message file
is not owned by such a trusted account, the Router will cross-check the
message file owner with any stated `From:' or `Sender:' address in the
message header, or any origin specified in the envelope. If a discrepancy
is discovered, appropriate action will be taken. This means that there is
no way to forge the origin of a message without access to a trusted
account.
^_
File: zmog Node: sequencer, Prev: security, Up: router, Next: routerlog
Router Control Flow
===================
The following few pages use pseudo-code to describe the algorithm that
produces a control file (containing delivery instructions and the new
message headers) from a message file. This algorithm is implemented in a C
function called `sequencer()', an apt description of how it orchestrates
the various parts of the ZMailer Router to implement the semantics of
RFC822 message processing.
sequencer(message file name)
{
Parse envelope and message header from the message file
if (hostname has been set)
Stamp the message with a trace header
Determine if message contains Resent-* headers or not
(from here on, only pay attention to the appropriate group of headers)
Determine if the owner of the message file is a trusted user
if (there is no sender specified in the envelope) {
if (there is a Sender or From field in the message header
&& the owner of the message file is trusted)
Use the header value from it as the message sender
else
Generate a sender based on the owner of the file
if (there is still no sender)
Generate a sender referring to the local Postmaster
} else {
if (the owner of the message file is not trusted) {
Save message file for Postmaster to see
Generate a sender based on the owner of the file
}
}
if (an error occurred during parsing of the message envelope) {
Save the message file for the Postmaster to see and correct
return;
}
if (an error occurred during parsing of the message header) {
Save the message file for the amusement of the Postmaster
header_error = TRUE;
} else
header_error = FALSE;
default address delivery uid = nobody;
if (the owner of the message file is trusted) {
if (an incoming channel is specified in the envelope)
set the trusted channel origin accordingly
if (an incoming host is specified in the envelope)
set the trusted host origin accordingly
if (an incoming user is specified in the envelope)
set the trusted user origin accordingly
if (any element of the trusted origin triple is null) {
set the resolved origin triple by
routing the sender address
}
if (the message origin is a local user)
default address delivery uid = uid of that user;
} else {
/* We know sender is local */
default address delivery uid = uid of owner of message file;
if (message header contains a Sender, but no From field) {
Rename the "Sender" field into a "From" field
}
if (the message header specifies a Sender) {
if (the specified Sender address does not correspond
to the resolved origin of the message) {
Rename the "Sender" field into a "Fake-Sender" field
Set a flag to generate a Sender header
}
} else if (the message header only specifies a From field) {
if (the specified From address does not correspond
to the resolved origin of the message) {
Set a flag to generate a Sender header
}
}
if (flag is set that we need to generate a Sender header)
Do so based on the owner of the message file
}
if (default address delivery uid != nobody
&& there is no From message header) {
Generate one based on the envelope origin address information
}
/* Recipient determination */
if (there are no recipients specified in the envelope) {
if (header_error) {
Reject the message with a "bad header" error
return;
}
Add all the message header recipient addresses (from To, Cc,
and Bcc headers) to the message envelope recipient list
if (there are still no recipients specified in the envelope) {
Reject the message with a "no recipients" error
return;
}
}
if (header_error) {
Return the message with a "bad header" warning
Add Illegal-Object warning headers to the message header
}
if (there is no To message header) {
/* Insert the To: header lines */
Add the list of message recipients from the message envelope
in the message header in To headers
}
#ifdef notdef
rewrite all addresses in the message according to the incoming-rewriting
rules for the originating channel.
#endif notdef
if (hostname has been set) {
/* Make sure Message-Id exists, for loop control */
if (there is no Message-Id message header)
Generate a message id and add it to the message header
else
extract a message id from the existing header
Log the message id
} else
there is no message id
/* Route recipient addresses */
for (every recipient address in the message envelope)
delivery privilege of address = default address delivery uid
for (every recipient address in the message envelope) {
router() /* Route the address */
if (the returned triple is null)
continue; /* ignore this recipient */
/* Rewrite this envelope address */
crossbar(source triple, recipient triple)
if (the return value from `crossbar()' is null)
continue; /* ignore this recipient */
else if (the message header rewriting function name is null) {
Save the message for the Postmaster to see
return;
}
/* Don't send message to the same address twice */
if (we have already seen this destination triple) {
if (the message is going to be sent to that destination)
continue; /* suppress duplicates */
}
if (this destination triple has not been alias expanded
&& it represents a local destination
&& it has an alias expansion) {
Add the list of expanded addresses to the list of
addresses processed by this loop, each with
a delivery privilege determined by the source
of the alias expansion
continue; /* ignore this address */
}
Flag that this destination triple will be sent out
Add the address destination triple to a list for each channel
if (the message header address rewriting function name
is new for this message) {
Add the name to a collection of the kinds of message
header address rewritings that need be done
}
}
for (every kind of message header address rewriting we need to do) {
Call the indicated function with every address in the header
Store the transformed headers for later use
}
if (there is no Date message header)
Generate one based on the modification time of the message file
/* Emit specification to the transport system */
for (every recipient address) {
if (control file has not been created) {
Create message control file
Write a standard preamble consisting of
the corresponding message file name
the offset of the start of the message body
the message id (if any) for log identification
if (this message did not come from the error channel)
Write an error return address
}
if (the envelope sender address form for this recipient
is different from the previous sender address form)
Write the sender address origin triple
Write the recipient address destination triple, and the
corresponding address delivery privilege
if (message header for this address is different than
the message header for the next recipient address) {
Write the complete message header for this destination,
as reconstructed from the original message
header and the stored headers transformed by
message header address rewriting
}
}
if (we created a message control file) {
relink the message file itself to the `QUEUE' directory
relink the control file to the `SCHEDULER' directory
}
return;
}
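The recipient routing loop is the subtlest part of the pseudo-code, because
alias expansion feeds new addresses back into the loop that is processing
them. The following Python sketch models only that loop. The `router' and
`expand_alias' arguments stand in for the real Router functions, the
`crossbar' step is omitted, and all names here are hypothetical.

```python
from collections import deque

def route_recipients(recipients, default_uid, router, expand_alias):
    """Return {channel: [(triple, uid), ...]} for the envelope recipients."""
    work = deque((address, default_uid) for address in recipients)
    scheduled = set()    # destination triples that will be sent out
    expanded = set()     # triples whose alias expansion was already used
    per_channel = {}
    while work:
        address, uid = work.popleft()
        triple = router(address)
        if triple is None:
            continue                       # ignore this recipient
        if triple in scheduled:
            continue                       # suppress duplicates
        channel, host, user = triple
        if triple not in expanded and channel == "local":
            expansion = expand_alias(user)
            if expansion is not None:
                addresses, alias_uid = expansion
                expanded.add(triple)
                # New addresses carry the privilege of the alias source.
                work.extend((a, alias_uid) for a in addresses)
                continue                   # the alias itself is not delivered
        scheduled.add(triple)
        per_channel.setdefault(channel, []).append((triple, uid))
    return per_channel

def toy_router(address):
    """Stand-in router: anything with an `@' goes out via smtp."""
    if "@" in address:
        return ("smtp", address.split("@", 1)[1], address)
    return ("local", "-", address)

TOY_ALIASES = {"staff": (["rayan", "bond@sis.mod.uk", "rayan"], 0)}

def toy_expand(user):
    return TOY_ALIASES.get(user)

result = route_recipients(["staff", "rayan"], 2003, toy_router, toy_expand)
```

In this run the alias `staff' is replaced by its members, and `rayan' is
delivered to once even though the address is reached three times.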
^_
File: zmog Node: routerlog, Prev: sequencer, Up: router, Next: addresstest
Logging
=======
When the Router starts up as a daemon, it will attach its standard output
and standard error streams to a log file. All messages from the Router
will appear on one of these streams, and will therefore show up in a
central location for perusal by the Postmaster or other interested parties.
Usually only abnormal occurrences will be logged in this manner, but any
messages printed will show up here. In particular, many of the components
of the Router contain trace print statements that can be enabled at
run-time. In fact, interactive debugging of the configuration file is
performed this way, since when the Router is run in the foreground, the
standard output and error streams are attached to the terminal in the
normal fashion. Thus, all messages will appear in front of the person
testing the configuration.
The tracing functionality is controlled either on the command line, or by
calling the `trace' and `untrace' functions from within the configuration
file, or interactively. The interactive behaviour of the Router is to
read and execute its configuration file (as normal), and then sit in an
infinite loop reading commands from its standard input stream. This allows
a person executing the Router interactively to execute arbitrary
statements in the configuration file programming language. The statements
typed in are buffered until an End Of File indication, and then executed by
the configuration file interpreter. This cycle is repeated until a syntax
error occurs, or the process is interrupted. [XX: yes, this is a very
rough mode of interaction. do you have any suggestions for improving it? ].
The `trace' and `untrace' functions take one or more words as
arguments, and turn on (off) flags that enable tracing in a component of
the Router corresponding to each word. The current list of words, and the
corresponding actions traced, are:
`alias'
alias expansion
`all'
turns all trace flags on
`assign'
variable assignment
`bind'
the BIND nameserver responses
`compare'
case label pattern matching
`db'
database lookups
`final'
print message information after sequencer returns
`functions'
function calls and returns
`matched'
successful case label matches
`memory'
memory allocation statistics
`off'
turns all trace flags off
`on'
same as `functions'
`parsetree'
the configuration file parse tree
`regexp'
regular expression execution
`resolv'
the BIND resolver library `RES_DEBUG' option
`rewrite'
message header rewriting
`router'
envelope recipient address routing
`sequencer'
control flow in the sequencer function
In addition to this, each message processed is logged via the standard
system logging facility (syslog) if it is available.
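The way the trace words combine can be modelled with a short Python sketch.
It treats the flags as a set of component names. The handling of `all' by
`untrace' is an assumption, since the list above defines the meta-words
only for turning flags on, and the function bodies are illustrative rather
than taken from the Router.

```python
COMPONENTS = frozenset([
    "alias", "assign", "bind", "compare", "db", "final", "functions",
    "matched", "memory", "parsetree", "regexp", "resolv", "rewrite",
    "router", "sequencer",
])
SYNONYMS = {"on": "functions"}

def trace(flags, *words):
    """Return a new flag set with the named components enabled."""
    flags = set(flags)
    for word in words:
        word = SYNONYMS.get(word, word)
        if word == "all":
            flags |= COMPONENTS
        elif word == "off":
            flags.clear()
        elif word in COMPONENTS:
            flags.add(word)
        else:
            raise ValueError("unknown trace word: " + word)
    return flags

def untrace(flags, *words):
    """Return a new flag set with the named components disabled."""
    flags = set(flags)
    for word in words:
        word = SYNONYMS.get(word, word)
        if word == "all":                  # assumption: clears every flag
            flags.clear()
        elif word in COMPONENTS:
            flags.discard(word)
        else:
            raise ValueError("unknown trace word: " + word)
    return flags
```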
^_
File: zmog Node: addresstest, Prev: routerlog, Up: router
Address Testing
---------------
For example, if you wish to see how an address is routed, you can run the
command:
echo "trace on ; router $address" | router -I
which, with `$address' bound to `bond@sis.mod.uk' might produce something
like:
GNU Mailer router (Zmailer alpha.1 #0: Sun Jan 31 17:38:53 EST 1988)
rayan@ephemeral.ai:/usr/src/zmailer/router
Copyright 1988 Rayan S. Zachariassen
router: parameters: 'bond@sis.mod.uk'
echo: parameters: 'router:' 'bond@sis.mod.uk'
router: bond@sis.mod.uk
echo: returns: ''
canonicalize: parameters: 'bond@sis.mod.uk'
focus: parameters: 'bond<@sis.mod.uk>'
[: parameters: 'sis.mod.uk' ']'
[: returns: 'true'
focus: returns: 'bond<@sis.mod.uk>'
canonicalize: returns: 'bond<@sis.mod.uk>'
[: parameters: '' ']'
[: returns: ''
[: parameters: 'sis.mod.uk' '==' 'ephemeral.ai.toronto.edu' ']'
[: returns: ''
[: parameters: 'sis.mod.uk' ']'
[: returns: 'true'
router: returns: 'smtp' 'sis.mod.uk' 'bond@sis.mod.uk'
^_
File: zmog Node: scheduler, Prev: router, Up: top, Next: transports
Scheduler
*********
The Scheduler complements the Router as the other major process in ZMailer.
The decisions it makes involve how to manage and time delivery of messages
to their destination, and its name arises from this scheduling function.
While the Router interprets message files, the Scheduler interprets only
the control files corresponding to the message files.
The control files are usually produced by the Router, and appear in a
directory scanned by the Scheduler daemon. Whenever a new control file
appears in that directory, its contents are used to update a data
structure, maintained by the Scheduler, that describes which addresses in
which messages are destined for which hosts and channels. The information
stored along with each channel/host combination is a set of byte offsets
into the control file, giving the location of address specifications
corresponding to that combination. This information can later be passed to
a transport/delivery program, and is updated based on feedback from these
programs.
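A minimal Python sketch of such a data structure follows. It keys a table
by channel/host pair and records, for each pending recipient field, the
control file name and the offset of that field. The function name is
hypothetical, "pending" is taken to mean a blank tag byte, and offsets are
counted in characters, which equal bytes for ASCII control files.

```python
from collections import defaultdict

def index_control_file(name, text, table=None):
    """Record, per (channel, host), the offsets of pending recipient fields."""
    table = defaultdict(list) if table is None else table
    offset = 0
    in_headers = False
    for line in text.split("\n"):
        if in_headers:
            if line == "":
                in_headers = False         # empty line ends the header block
        elif line == "m":
            in_headers = True
        elif line.startswith("r") and line[1:2] == " ":
            channel, host = line[2:].split(None, 2)[:2]
            table[(channel, host)].append((name, offset))
        offset += len(line) + 1            # account for the newline
    return table

sample = (
    "i 1\n"
    "s local - a\n"
    "r local - a 0\n"
    "r+smtp h a@h 0\n"
    "m\n"
    "To: a\n"
    "\n"
)
table = index_control_file("ctl", sample)
```

Only the `local' recipient is recorded here, because the `smtp' recipient
already carries a tag showing that it has been processed.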
This data structure is internal to the Scheduler, but is also mapped onto
the filesystem as a directory hierarchy that is fully maintained by the
Scheduler. The image is a two- or three-level hierarchy, with the leaf
nodes always being a link to the control file. Each leaf directory in this
hierarchy corresponds to a channel or channel/host combination, and each
leaf file is a link to a control file containing undelivered addresses for
the specific channel or channel/host. Each control file may be linked into
several places in the hierarchy, if it is to be delivered using several
corresponding mechanisms. The top level of the hierarchy consists of a
directory for each delivery channel. An optional middle-level inserts a
subdirectory for each next host, under which the control files may then be
linked.
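The mapping from destinations to link locations can be sketched in Python.
The root directory name, the use of the control file name as the link name,
and the function `link_paths' are assumptions for illustration; only the
two- or three-level shape comes from the description above.

```python
import os

def link_paths(root, control_file, destinations, per_host=False):
    """Return the sorted paths under which control_file should be linked."""
    paths = set()
    for channel, host in destinations:
        parts = [root, channel]
        if per_host:
            parts.append(host)             # optional middle level
        parts.append(control_file)
        paths.add(os.path.join(*parts))
    return sorted(paths)

destinations = [
    ("local", "-"),
    ("smtp", "ephemeral.ai.toronto.edu"),
    ("smtp", "bay.csri.toronto.edu"),
]
two_level = link_paths("sched", "24700", destinations)
three_level = link_paths("sched", "24700", destinations, per_host=True)
```

With the two-level layout both `smtp' destinations share a single link,
while the three-level layout produces one link per next host.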
The reason for maintaining the external structure is to segregate related
control files in a way that will allow more efficient access by Transport
Agents. Maintaining a directory for each host will be a poor decision for
low message volume hosts, or if a small part of the message traffic
consists of remailings to huge mailing lists (many different destinations),
but may be a win for hosts handling a very large volume of traffic.
The only concern involved is the time for the operating system to search
down a path to open a specific control file. The Scheduler always
discriminates by next host, and functionality will be the same in any case.
A Transport Agent is a program that the Scheduler executes to deliver
messages. The Scheduler determines the correspondence between channels,
hosts, or channel/host destinations, and a specific Transport Agent, by
interpreting a simple table from a configuration file. The Transport Agent
process is told which control files it should inspect for work, and tells
the Scheduler the status of the destination addresses it tried to process.
The Scheduler then updates the model it maintains of work that needs to be
done, and will eventually remove the last link to a control file and its
corresponding message file. At that point, ZMailer has done its job with
regard to each message.
Note that communication between the Scheduler and other programs is
mostly via the message control file rather than by direct interaction; the
exception is when the Scheduler converses with Transport Agents.
Message Control File
====================
A message control file is a file created by the Router to contain all the
information necessary for delivery of a message submitted in a
corresponding message file. It is interpreted by the Scheduler, which
needs to know at all times which messages are pending to go where, and how.
It is also interpreted by one or more Transport Agents, possibly
concurrently, that extract the delivery information relevant to their
purpose.
The concurrency aspect means that the Transport Agents must cooperate on a
locking protocol to ensure that delivery to a particular destination is
attempted by only one Transport Agent at a time, and a status protocol to
ensure unique success or failure of delivery for each destination. There
are potentially many ways to implement such protocols, but, in the spirit
of simplicity, ZMailer uses a control file as a form of shared memory.
Specific locations within each control file are reserved for flags that
indicate a specific state for their associated destination address. The
rest is taken care of by the I/O semantics when multiple processes update
the same file.
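The idea of using a reserved byte as a flag can be sketched in Python. The
tag characters are the ones listed for `mail.h' later in this chapter; the
function names are hypothetical. Note that the read-then-write sequence
below is not atomic by itself, so this sketch shows only where the flag
lives and how it is updated, not a complete locking protocol.

```python
import os
import tempfile

TAG_NORMAL, TAG_LOCK, TAG_OK, TAG_NOTOK = b" ", b"~", b"+", b"-"

def try_lock(path, field_offset):
    """Mark the field at field_offset as locked; return True on success."""
    with open(path, "r+b") as f:
        f.seek(field_offset + 1)           # tag byte follows the id character
        if f.read(1) != TAG_NORMAL:
            return False                   # locked or already processed
        f.seek(field_offset + 1)
        f.write(TAG_LOCK)
        return True

def set_status(path, field_offset, tag):
    """Overwrite the tag byte of the field at field_offset."""
    with open(path, "r+b") as f:
        f.seek(field_offset + 1)
        f.write(tag)

def demo():
    """Lock a recipient field twice, then record a successful delivery."""
    content = b"s local - rayan\nr local - rayan 2003\n"
    offset = len(b"s local - rayan\n")
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(content)
        first = try_lock(path, offset)
        second = try_lock(path, offset)
        set_status(path, offset, TAG_OK)
        with open(path, "rb") as f:
            final = f.read()
    finally:
        os.remove(path)
    return first, second, final
```

Because each update rewrites a single byte in place, the length of the file
and the offsets of all other fields are unaffected.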
Apart from necessary envelope and control information, a control file also
contains the new message header for the message, which contains the header
addresses as rewritten by the Router. Since a message may have several
destinations with incompatible address format requirements, there may be
several corresponding groups of message headers. This will be illustrated
by the sample control file shown in the following subsection.
Format
------
A control file consists of a sequence of fields. Each field starts at the
beginning of a line (i.e. at byte 0 or after a Newline), and is identified
by the appearance of a specific character in that location. This id
character is normally followed by a byte containing a tag value (semaphore
flag), followed by the field value.
Here is a simple control file produced by a test message, just before it
was removed by the Scheduler:
--------------------
i 24700
o 72
l <88Jan10.003129est.24700@bay.csri.toronto.edu>
e Rayan Zachariassen <rayan>
s local - rayan
r+local - rayan 2003
m
Received: by bay.csri.toronto.edu id 24700; Sun, 10 Jan 88 00:31:29 EST
From: Rayan Zachariassen <rayan>
To: rayan, rayan@ephemeral
Subject: a test
Message-Id: <88Jan10.003129est.24700@bay.csri.toronto.edu>
Date: Sun, 10 Jan 88 00:31:24 EST
s local - rayan@bay.csri.toronto.edu
r+smtp ephemeral.ai.toronto.edu rayan@ephemeral.ai.toronto.edu 2003
m
Received: by bay.csri.toronto.edu id 24700; Sun, 10 Jan 88 00:31:29 EST
From: Rayan Zachariassen <rayan@csri.toronto.edu>
To: rayan@csri.toronto.edu, rayan@ephemeral.ai.toronto.edu
Subject: a test
Message-Id: <88Jan10.003129est.24700@bay.csri.toronto.edu>
Date: Sun, 10 Jan 88 00:31:24 EST
--------------------
The id character values are defined in the `mail.h' system header file,
which currently contains:
#define _CF_MESSAGEID 'i' /* inode number of file containing message */
#define _CF_BODYOFFSET 'o' /* byte offset into message file of body */
#define _CF_SENDER 's' /* sender triple (channel, host, user) */
#define _CF_RECIPIENT 'r' /* recipient n-tuple, n >= 3 */
#define _CF_ERRORADDR 'e' /* return address for error messages */
#define _CF_DIAGNOSTIC 'd' /* diagnostic message for ctlfile offset */
#define _CF_MSGHEADERS 'm' /* message header for preceding recipients */
#define _CF_LOGIDENT 'l' /* identification string for log entries */
There is one field per line, except for `_CF_MSGHEADERS' which has some
special semantics described below. The following describes the fields in
detail:
`i'
This field identifies the message file corresponding to this control
file. It is the name of the message file in the `QUEUE' directory
(`~/queue'). This is typically the same as the inode number for that
file, but need not be. It is used by Transport Agents when copying
the message body, and by the Scheduler when unlinking the file after
all the destination addresses have been processed. For example:
i 21456
`o'
Specifies the byte offset of the message body in the message file. It
is used by Transport Agents in order to copy the message body quickly,
without parsing the message file. For example:
o 466
`e'
Gives an address to which delivery errors should be sent. The address
must be an RFC822 mailbox. For example:
e "Operations Directorate" <d-ops@sis.mod.uk>
`l'
The field value is an uninterpreted string which should prefix all log
messages and accounting records associated with this message. This
value is typically the message id string. For example:
l <88Jan6.103158gmt.24694@sis.mod.uk>
`s'
This field specifies an originator (sender) address triple, in the
sequence: previous channel, previous host, return address. It remains
the current sender address until the next instance of this field.
Since there can only be one sender of a message, multiple instances of
the field will correspond to different return address formats as
produced by the `crossbar' algorithm in the Router. For example:
s smtp sis.mod.uk @lab.sis.mod.uk:q@deadly-sun.lab.sis.mod.uk
s uucp sisops lab.sis.mod.uk!deadly-sun.lab.sis.mod.uk!q
`r'
This field specifies a destination (recipient) address triple, in the
sequence: next channel, next host, address for next host. Optional
information to be passed to the Transport Agent may be placed after the
mandatory fields; this currently refers to the delivery privilege of the
destination address. Since the optional values of this field are only
interpreted by the Transport Agent, changes in what the Router writes
must be coordinated with the code of the Transport Agents that might
interpret this field. For example:
r local - bond 0
r uucp uunet sisops!bond -2
`m'
Apart from a message body, a Transport Agent needs the message headers
to construct the message it delivers. These message headers are
stored as the value of this field. Since message headers obviously
can span lines, the syntax for this field is somewhat different than
for the others. The field id is immediately followed by a newline,
which is followed by a complete set of message headers. These are
terminated (in the usual fashion) by an empty line, which also
terminates this field. In the following example, the last line of
text is followed by an empty line, after which another field may
start:
m
From: M
To: Bond
Subject: do get a receipt, 007!
`d'
This field is *not* written by the Router. It is written by the
Scheduler to remember errors associated with specific addresses. The
field value has two parts, the first being the byte offset in the
control file of the destination (recipient) address causing the error,
and the rest of the line being an error message. The Transport Agents
discover these errors and report them to the Scheduler. The Scheduler
will collect them and report them to the error return address (if any)
after all the destinations have been processed [XX:or at other times].
For example:
d 878 No such local user: 'bond'.
Note that in sender and recipient fields the first two field
values (channel and host) cannot contain embedded spaces, but the third
field value (the address) may. Therefore, in the presence of extra fields,
parsing within Transport Agents must be cautious and not assume that an
address does not contain spaces.
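A reader of this format might be sketched in Python as follows. It is an
illustration only: the function names are hypothetical, offsets are counted
in characters (equal to bytes for ASCII files), and treating a trailing
integer in a recipient field as the delivery privilege is an assumption
based on the examples above.

```python
import re

def parse_control_file(text):
    """Return a list of (offset, id_char, tag, value) tuples."""
    lines = text.split("\n")
    offsets, position = [], 0
    for line in lines:
        offsets.append(position)
        position += len(line) + 1          # account for the newline
    fields, i = [], 0
    while i < len(lines):
        line, start = lines[i], offsets[i]
        i += 1
        if not line:
            continue
        if line[0] == "m":
            headers = []
            while i < len(lines) and lines[i]:
                headers.append(lines[i])   # header lines up to an empty line
                i += 1
            fields.append((start, "m", " ", "\n".join(headers)))
        else:
            fields.append((start, line[0], line[1:2] or " ", line[2:]))
    return fields

def split_recipient(value):
    """Split an `r' field value into (channel, host, address, privilege)."""
    channel, host, rest = value.split(None, 2)
    address, _, last = rest.rpartition(" ")
    if address and re.fullmatch(r"-?\d+", last):
        return channel, host, address.strip(), int(last)
    return channel, host, rest, None       # no optional privilege present

SAMPLE = (
    "i 24700\n"
    "o 72\n"
    "s local - rayan\n"
    "r+local - rayan 2003\n"
    "m\n"
    "From: Rayan Zachariassen <rayan>\n"
    "To: rayan\n"
    "\n"
)
fields = parse_control_file(SAMPLE)
```

Splitting off only the first two words, and then looking at the end of the
remainder, keeps an address that contains spaces intact.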
As mentioned, the second byte of most fields is used for concurrency
control and status indication. This tag byte can contain several values
that indicate current or previous activity. The fields where this is
relevant are the destination (recipient) address and diagnostic fields.
The tag values are defined in the `mail.h' file mentioned previously, as
follows:
#define _CFTAG_NORMAL ' ' /* what the router sets it to be */
#define _CFTAG_LOCK '~' /* that line is being processed, lock it */
#define _CFTAG_OK '+' /* positive outcome of processing */
#define _CFTAG_NOTOK '-' /* something went wrong */
#define _CFTAG_DEFER _CFTAG_NORMAL /* try again later */
The extract above is self-explanatory.
A message control file will normally contain a preamble that specifies
information about the associated message file, the message body offset, an
error return address, and a log entry tag. After this comes a repeated
sequence of: sender address field, recipient address fields, and the
message header corresponding to these recipients. After as many of these
groups as are necessary, any diagnostic fields will be appended to the end
of the control file. The restrictions on the sequence of addresses and
message headers are that a sender address field must precede any recipient
address field, and a recipient address field must (immediately) precede any
message header field, and no sender or recipient addresses may follow the
last message header field.
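These ordering restrictions can be checked mechanically. The Python sketch
below takes the sequence of field id characters found in a control file and
reports violations; the function name and the wording of the reports are
hypothetical.

```python
def check_field_order(ids):
    """Return a list of problems found in a sequence of field id characters."""
    problems = []
    seen_sender = False
    previous = None
    for position, ident in enumerate(ids):
        if ident == "s":
            seen_sender = True
        elif ident == "r" and not seen_sender:
            problems.append("recipient at %d has no preceding sender" % position)
        elif ident == "m" and previous != "r":
            problems.append("header at %d does not follow a recipient" % position)
        previous = ident
    # No sender or recipient may follow the last message header field.
    last_header = max((i for i, x in enumerate(ids) if x == "m"), default=-1)
    for position in range(last_header + 1, len(ids)):
        if ids[position] in ("s", "r"):
            problems.append("address at %d has no following header" % position)
    return problems
```

The sample control file shown earlier has the id sequence `iolesrmsrm',
for which the function returns an empty list.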
Scheduler Configuration File
============================
The major action of the Scheduler is to periodically start up Transport
Agents and tell them what to do. This is controlled by a table in a
configuration file that is read by the Scheduler when it starts. A typical
configuration file would look something like:
# pattern  intvl  ch/ho/*   uid   gid     command
local/*    10s    2  0  0   root  daemon  mailbox local
smtp/*     1m     10 2  0   root  daemon  smtp -l /tmp/smtp.log $host
error/*    5m     10 0  0   root  daemon  errormail
uucp/*     10m    10 0  0   root  daemon  sm -c $channel uucp
Any line starting with a `#' character is assumed to be a comment line, and
is ignored, as are empty lines. All other lines must follow a rigid
format. Each line consists of eight white-space separated fields. The
fields, in sequence, are:
A pattern, that selects which channel/host combinations are relevant to the
current line. The pattern has the form: channel/host, with the slash being
mandatory. The subpatterns (i.e. each side of the slash) may contain a
`glob' (or `sh') style pattern. These patterns are tested in the order
they appear, with the channel and host values for destination addresses in
a message. When both patterns match, the line with the matching pattern
describes the Transport Agent that should be used to deliver the message to
that destination. It is important that the Transport Agent recognizes at
least the set of addresses in the message control file that the Scheduler
configuration table assumes it does. Otherwise, some addresses may never
be delivered to, and the message will stay in the Scheduler indefinitely.
An interval specification that says how often the Scheduler should check
for work pending for the Transport Agent described by that line. The time
specification must use an appropriate suffix: `s' for seconds, `m' for
minutes, `h' for hours, or in combinations, e.g. `1h30m'. The minimum
value specified in the configuration file will be the directory scanning
interval used by the Scheduler.
A maximum number of Transport Agents simultaneously active for the channel
matched by the pattern for that entry. If 0, no upper limit is enforced.
A maximum number of Transport Agents simultaneously active for the host
matched by the pattern for that entry. If 0, no upper limit is enforced.
A total maximum number of Transport Agents simultaneously active due to
that entry. If 0, no upper limit is enforced.
A user id to set as the real and effective user id when executing the
command associated with that entry. Either a symbolic (login name) or
numeric value may be specified.
A group id to set as the real and effective group id when executing the
command associated with that entry. Either a symbolic (group name) or
numeric value may be specified.
Finally, the Transport Agent invocation command itself, as it would appear
on a normal command line. Note however that the Scheduler executes the
command directly without all the command line interpretation afforded by a
shell. The only special action is to replace instances of the word
`$channel' with the name of the channel matched by the pattern, and
instances of `$host' with the name of the host matched by the pattern.
Note that the command must have enough privileges specified to write into
the control file, in addition to whatever is necessary to perform its
delivery duties and logging.
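To make three of the field descriptions above concrete, here is a hedged C
sketch of how the pattern, interval, and command fields might be handled. It
is not the Scheduler's code. The function names are hypothetical, POSIX
`fnmatch()' is used as a stand-in for whatever `glob' matcher the Scheduler
really uses, and the command is assumed to be split on blanks and tabs only,
with `$channel' and `$host' replaced when they appear as whole words.

```c
#include <assert.h>
#include <ctype.h>
#include <fnmatch.h>
#include <stdbool.h>
#include <string.h>

/* Pattern field: does "chanpat/hostpat" match this channel and host? */
bool pattern_matches(const char *pattern, const char *channel,
                     const char *host)
{
    char chanpat[256];
    const char *slash;
    size_t len;

    assert(pattern != NULL && channel != NULL && host != NULL);
    slash = strchr(pattern, '/');
    if (slash == NULL)
        return false;                   /* the slash is mandatory */
    len = (size_t)(slash - pattern);
    if (len >= sizeof chanpat)
        return false;
    memcpy(chanpat, pattern, len);
    chanpat[len] = '\0';
    return fnmatch(chanpat, channel, 0) == 0 &&
           fnmatch(slash + 1, host, 0) == 0;
}

/* Interval field: convert "10s", "1m", "1h30m" to seconds; -1 if malformed. */
long parse_interval(const char *spec)
{
    long total = 0;

    assert(spec != NULL);
    if (*spec == '\0')
        return -1;
    while (*spec != '\0') {
        long n = 0;

        if (!isdigit((unsigned char)*spec))
            return -1;
        while (isdigit((unsigned char)*spec))
            n = n * 10 + (*spec++ - '0');
        switch (*spec++) {
        case 's': total += n; break;
        case 'm': total += n * 60; break;
        case 'h': total += n * 3600; break;
        default:  return -1;            /* a suffix is required */
        }
    }
    return total;
}

/*
 * Command field: split cmd in place into a NULL-terminated argv with no
 * shell interpretation, substituting the words "$channel" and "$host".
 * Returns the word count, or -1 if more than max-1 words are present.
 */
int build_argv(char *cmd, char **argv, int max, char *channel, char *host)
{
    int argc = 0;

    assert(cmd != NULL && argv != NULL && max > 0);
    while (*cmd != '\0') {
        char *word;

        cmd += strspn(cmd, " \t");      /* skip leading white space */
        if (*cmd == '\0')
            break;
        if (argc >= max - 1)
            return -1;
        word = cmd;
        cmd += strcspn(cmd, " \t");     /* find the end of the word */
        if (*cmd != '\0')
            *cmd++ = '\0';
        if (strcmp(word, "$channel") == 0)
            word = channel;
        else if (strcmp(word, "$host") == 0)
            word = host;
        argv[argc++] = word;
    }
    argv[argc] = NULL;
    return argc;
}
```

The resulting vector is in the form expected by `execv()', which matches the
statement that the command is executed directly rather than through a shell.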
Transport Agent protocol
========================
Once the Scheduler starts up a Transport Agent by executing one of the
commands specified in the configuration file, it needs to pass information
to the Transport Agent about which messages and addresses it should
process. The Transport Agent in return needs to report to the Scheduler
about the success or failure of its delivery activity, so that issues
related to file management and error reporting can all be centralized in
the Scheduler process.
To accomplish this, the Scheduler engages in a simple exchange with each
Transport Agent it has started. For this reason, the Scheduler creates two
pipes attached to the standard input and standard output of the Transport
Agent processes it executes. The standard error descriptor is shared with
the Scheduler process, and usually refers to the Scheduler log file.
Just before the Scheduler starts up a Transport Agent, it scans through its
model of the pending message control files, and determines which are
relevant to the impending invocation of the Transport Agent. Once the
subprocess is running, the Scheduler will write to the standard input of
its child, the names of the control files it should process, one to each
line. This list is terminated by an empty line, to indicate to the
Transport Agent that the Scheduler finished its business normally.
In turn, the delivery process will open each named control file, scan it
for destination addresses relevant to its specific invocation, and attempt
delivery to those addresses. For each destination address, it will print
to its standard output a line that describes the address, and that contains
a status indication. The syntax is:
id/offset/status comment
The id is the message file id contained in the `i' field of the control
file. The offset is the byte offset into the pertinent control file of the
destination address field. The status is one of the following list of
keywords understood by the Scheduler:
`ok'
Delivery was successful.
`error'
Delivery was unsuccessful.
`deferred'
Delivery was attempted, but is deferred.
The optional comment is an arbitrary string that clarifies the status code.
It is separated from the status code by a single space. For example, the
following is a possible sequence of reports:
18453/3527/ok
18453/3565/deferred Unable to contact sis.mod.uk!
18453/4211/error No such local user: 'bond'.
After each message file has been processed, an empty line is output to
indicate this. The Transport Agent will continue to the next message
control file (if any) that has been written to it by the Scheduler.
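A status report line is simple enough to take apart with `sscanf()'. The
sketch below is illustrative rather than the Scheduler's actual parser: the
`ta_report' structure and `parse_report()' function are hypothetical names,
and the message file id is assumed to be a decimal number, as it is in the
examples above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

enum ta_status { TA_OK, TA_ERROR, TA_DEFERRED, TA_BAD };

struct ta_report {
    long id;                /* message file id */
    long offset;            /* byte offset of the address field */
    enum ta_status status;
    const char *comment;    /* points into the line; "" if absent */
};

/* Parse "id/offset/status comment"; returns TA_BAD on a malformed line. */
enum ta_status parse_report(const char *line, struct ta_report *r)
{
    char word[16];
    int used = 0;

    assert(line != NULL && r != NULL);
    r->status = TA_BAD;
    r->comment = "";
    if (sscanf(line, "%ld/%ld/%15[a-z]%n",
               &r->id, &r->offset, word, &used) != 3)
        return TA_BAD;
    if (line[used] == ' ')
        r->comment = line + used + 1;   /* single space, then the comment */
    else if (line[used] != '\0' && line[used] != '\n')
        return TA_BAD;
    if (strcmp(word, "ok") == 0)
        r->status = TA_OK;
    else if (strcmp(word, "error") == 0)
        r->status = TA_ERROR;
    else if (strcmp(word, "deferred") == 0)
        r->status = TA_DEFERRED;
    return r->status;
}
```

Note that the comment pointer still includes any trailing newline present in
the input line; a caller that reads lines with `fgets()' may want to strip it
first.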
A complete exchange between a Scheduler and a Transport Agent might proceed
as shown:
Scheduler               Transport Agent
---------------------------------------------------------------------
smtp/21456
                        18453/878/ok
                        18453/1013/error Illegal hostname: 'spectre'
                        _
local/21456
                        18453/945/deferred Cannot lock mailbox: 'bond'
                        _
_
The underscores indicate empty lines emitted by either side in the
synchronous protocol. After such a conversation, the Transport Agent
process will exit gracefully. Whenever the status of a destination is
updated, the Scheduler will check its internal data on whether or not a
link to the control file should be removed, or if indeed delivery has been
completed and both the message file and the last links to its control file
should be removed.
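From the Transport Agent's side, the input half of this conversation amounts
to reading names until an empty line arrives. The following C sketch shows one
way to do that; `next_control_file()' is a hypothetical helper, not a routine
from the ZMailer sources, and it assumes control file names fit in the
caller's buffer.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Read the next control file name from the Scheduler.
 * Returns 1 with the name stored in `name', 0 on the terminating empty
 * line, and -1 if the input ends without that empty line.
 */
int next_control_file(FILE *in, char *name, size_t size)
{
    assert(in != NULL && name != NULL && size > 1);
    if (fgets(name, (int)size, in) == NULL)
        return -1;                      /* Scheduler went away abnormally */
    name[strcspn(name, "\n")] = '\0';   /* drop the newline, if any */
    return name[0] != '\0';
}
```

A Transport Agent built on this would loop while the function returns 1,
print its status lines for each file, follow them with an empty line, and
flush its standard output so the Scheduler sees the reports promptly.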
Mail Queue printing
===================
It is possible to read through the directory hierarchy under the
`SCHEDULER' directory to synthesize a model of which messages are queued to
go where. However, this method does not guarantee an accurate image of the
model within the Scheduler process, nor can it provide any status
information (as given by the status commentary of Transport Agents) other
than success or failure. The ideal solution would be a way of
interrogating the Scheduler itself about the current state, and then
perhaps use this as a basis for verbose embellishments.
Such a facility is incorporated in the Scheduler. The exact interrogation
mechanism depends on the facilities of the host operating system: a system
with TCP/IP would use a socket rendezvous, a system with named pipes would
use a prearranged special file and signalling mechanism, a system without
either would rely on normal files.
In all cases, the result of the interrogation is a terse list of
messages, their destinations (channel/host combination), and the offsets of
the addresses corresponding to each destination. For example, a sample
state dump from the Scheduler is:
29198:  smtp/csri.toronto.edu, 2 addresses [196,247]
        smtp/ephemeral.ai.toronto.edu, 1 address [128]
This shows one message (id 29198) queued for transmission via SMTP to two
different destination hosts. One destination has two associated addresses,
referred to by byte offsets (196 and 247) into the control file for the
message (`~/transport/29198'). The other destination has only one address
associated with it.
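A querying program that wants the actual addresses needs the byte offsets from
each destination line. As a rough illustration, the bracketed list can be
extracted with `strtol()'. The function name `dump_offsets()' is hypothetical,
and the sketch assumes the list always has the `[n,n,...]' shape shown in the
sample dump.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Copy the offsets in the "[...]" list of a dump line into `offsets'.
 * Returns the number of offsets found, or -1 if the line has no
 * well-formed list or the list holds more than `max' entries.
 */
int dump_offsets(const char *line, long *offsets, int max)
{
    const char *p;
    int n = 0;

    assert(line != NULL && offsets != NULL && max >= 0);
    p = strchr(line, '[');
    if (p == NULL)
        return -1;
    p++;
    while (n < max) {
        char *end;
        long v = strtol(p, &end, 10);

        if (end == p)
            return -1;                  /* no digits where expected */
        offsets[n++] = v;
        if (*end == ']')
            return n;                   /* end of the list */
        if (*end != ',')
            return -1;
        p = end + 1;
    }
    return -1;
}
```

Each offset can then be used with `fseek()' on the message control file to
locate the corresponding address field, provided the control file is readable.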
The above dump corresponds to the state just after the Scheduler has parsed
the control files. After the Transport Agent corresponding to the
`smtp/ephemeral.ai.toronto.edu' destination has exited, the state might
become:
29198:  smtp/csri.toronto.edu, 2 addresses [196,247]
        smtp/ephemeral.ai.toronto.edu, 1 address [128]
          connect: Connection refused (will retry)
In this case, the host `ephemeral.ai.toronto.edu' is alive, but no SMTP
server is running. After the Transport Agent for the other destination has
exited, the state might be:
29198:  smtp/ephemeral.ai.toronto.edu, 1 address [128]
          connect: Connection refused (will retry)
Finally, once this remaining destination is processed successfully, the
Scheduler reports:
Mail queue is empty
These reports from the Scheduler succinctly express the state of the queues
in a format that is human-readable, and that is also easy to parse
automatically. The only information not provided is the actual addresses
referred to by the state dump. The program that queries the Scheduler for
this information is capable of finding these addresses if it needs to
(assuming the control files are readable), and presenting them in a different
format. The Scheduler does not remember the actual address information,
and so cannot easily include it in the dump. Since the Scheduler must
spend a minimum of time servicing requests from Transport Agents and mail
queue queries, it leaves nontrivial work to the querying program.
Some advantages arise from this mechanism: in environments with
host-to-host interprocess communication (e.g. TCP/IP) it becomes possible
to query Schedulers on remote hosts about their state, and such remote
queries can only get verbose information if the querying process has access
to the control files of the remote ZMailer installation. This makes it
possible for an environment making use of distributed filesystems to have a
single ZMailer installation on a mail server host, and for all the other
local machines to access its services transparently. At the same time, no
private information can be divulged without direct access to the
`POSTOFFICE' directory.
The ZMailer distribution contains a utility program `mailq' that is used to
query Schedulers. It supports the transparency paradigm in an NFS
environment, by arranging to query the Scheduler running on the NFS server
host for the `POSTOFFICE' directory visible on the local host.
Logging
=======
As with the Router, the Scheduler daemon will attach its standard output
and standard error streams to a log file. The standard error stream of
each Transport Agent invocation is inherited from the Scheduler, and so is
attached to the same log file. The Scheduler does have an option to
produce a debugging log, but otherwise only extraordinary occurrences are
logged (for example, a delivery failure, missing Transport Agents, etc.).
^_Info file zmog, produced by texinfo-format-buffer -*-Text-*-
from file zmog.tex
Distribution
************
Copyright (C) 1988 Rayan S. Zachariassen.
If you received this manual directly from the author, you may make and
distribute verbatim copies of it within your organization. Except by
explicit permission from the author, all other redistribution is prohibited
prior to final release.
^_
File: zmog Node: transports, Prev: scheduler, Up: top, Next: miscellaneous
Transport Agents
****************
A Transport Agent is a program that delivers mail to a particular
destination. The destination paradigm in ZMailer involves the concept of a
channel, a next-host, and a next-address. The first two are used by the
Scheduler to select a Transport Agent, and by a Transport Agent to identify
which destinations it should process. Any information necessary to
accomplish this selection is either contained within the Transport Agent,
or supplied by the Scheduler on the command line when invoking a Transport
Agent. The message control files examined by a Transport Agent instance
are passed by the Scheduler in a simple protocol designed for the purpose,
and status reports on the actions of the Transport Agent are returned by
the same protocol.
When a Transport Agent starts up, it expects to read message control file
path names on its standard input stream, and will print status reports on
its standard output stream. Unexpected errors are sent to the standard
error stream for logging. A Transport Agent can be invoked interactively
for test purposes, but usually it is started as a child of the Scheduler
daemon, with its input and output streams attached to the Scheduler using
pipes, and sharing the error stream with the Scheduler itself (and other
concurrent Transport Agent processes).
The following sections describe the Transport Agent programs that come with
the ZMailer distribution.
Local delivery (mailbox)
========================
The delivery of local mail is of paramount importance in a mailer. Of all
the things that might go wrong during mail processing, a mistake by the
local delivery process can be the most critical. Since it is also a very
frequent operation, this Transport Agent must be both robust and efficient.
Perfection is elusive, but the local delivery program included with ZMailer
has proven itself in the original version used with Sendmail.
This program will look for destinations with a channel of `local', and will
ignore the next-host specification. The next-address specification is
either a local account id, a full path to a file, or a pipe (`|') followed
by a valid argument to `sh -c'. The following are examples of legal
values:
bond
/usr/arch/lists/info-widget
/etc/passwd
|sed -e '1,/^$/d' >> /etc/passwd
|/bin/mail badhost!badguy </etc/passwd
Here are some illegal variations:
<bond>                            (angle brackets invalid in local-part)
"bond"                            (double quotes unlikely in login id)
bond@sis.mod.uk                   (local-parts do not contain `@')
james bond                        (whitespace unlikely in login id)
lists/info-widget                 (not an absolute pathname)
sed -e '1,/^$/d' >> /etc/passwd   (does not start with a `|')
Note that the effect of these addresses depends on whether the local
delivery program actually honours the request, and if it does which
privileges are used while executing the indicated action.
Specifically, if the next-address does not start with either `|' or `/', it
is assumed to be a user name. This is checked by lookup in the system
account database (`/etc/passwd'), to determine which user id should own the
mailbox file. If the indicated account exists, mail is delivered to its
corresponding mail spool file, in the standard format (return address and
delivery date in a `From ' line preceding the actual message, etc.). The
local delivery program does *no* aliasing on its own.
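The dispatch on the first character of the next-address can be summarized in a
few lines of C. This is a sketch of the rule as described here, not an excerpt
from the `mailbox' program; the enumeration and function names are
hypothetical.

```c
#include <assert.h>
#include <stddef.h>

enum local_dest { DEST_USER, DEST_FILE, DEST_PIPE };

/* Classify a next-address for the `local' channel by its first character. */
enum local_dest classify_next_address(const char *addr)
{
    assert(addr != NULL);
    if (addr[0] == '|')
        return DEST_PIPE;       /* rest is an argument to `sh -c' */
    if (addr[0] == '/')
        return DEST_FILE;       /* full path to a file */
    return DEST_USER;           /* must still be found in /etc/passwd */
}
```

Under this rule an address such as `lists/info-widget' is treated as a user
name, and is rejected only when the account lookup fails.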
If delivery to a file or command is indicated, the actual delivery is done
using the user id listed as the destination address privilege in the
control file. What this privilege actually allows is up to the security
mechanism in the Router. Since addresses specified from a remote host
start out with minimal privileges, they will usually not cause any harm on
the local system.
Programs executed by this Transport Agent will be given an environment
containing the `$PATH', `$SHELL', `$HOME', `$USER', `$UID', and `$SENDER'
environment variables. The first two are constant, the next three depend
on the delivery privilege of the address, and the value of the last
environment variable is set to be the return address of the message being
delivered. The current directory is set to be `$HOME' when possible.
The local delivery program contains code that may be enabled at compile time
to honour the `comsat' protocol. There are separate symbols to enable
this for local users (`BIFF') and for remote users (`RBIFF').
In the former case, users would enable the feature individually by executing
`biff y', while in the latter case a `.rbiff' file in the user's
home directory triggers the remote notification of new mail.
Error mail delivery (errormail)
===============================
The error messages of ZMailer are stored as message file forms in a
specific `FORMS' directory. The various ZMailer programs will access the
appropriate forms directly, but errors detected in the Router configuration
file must be handled in a different way. By convention, any problem found
by configuration file code is handled by changing the message destination
to be a triple of the form:
(error, form, address)
The `error' channel is serviced by this Transport Agent, which expects the
form listed to be the name of a file in the `FORMS' directory. This should
be a prototype message file, containing all generic information associated
with the error (i.e. the message header lines and an appropriate
explanation to the user). The address is the address rejected by the
configuration file code, for whatever reason is given in the form file.
By convention, the names of the form files indicate the class of error that
occurred. The following describes the standard forms that come with
ZMailer:
`err.badheader'
Syntax error in the message header.
`err.delivery'
Delivery problem, used by the Scheduler on behalf of Transport Agents.
`err.nonewsgroup'
A non-existent USENET Newsgroup was addressed.
`err.norecipients'
The message has no recipients listed.
`err.unresolvable'
The routing code in the Router configuration file cannot determine a
destination for the message.
`warn.badheader'
Used to chastise a user who sends improperly formatted mail.
Of these, only `err.nonewsgroup' and `err.unresolvable' are referred to by
the Router configuration file; the rest are used internally by the Router
or Scheduler. Therefore these forms *must* be available for proper
operation of ZMailer.
To illustrate, here is the default `err.badheader' form:
--------------------
From: The Post Office <postmaster>
Subject: Invalid message header
Cc: The Postmaster <postmaster>
The following message arrived with an illegal header according to the
RFC822/976 protocol specification. If you do not recognize the source
of the bad header, perhaps you should ask a postmaster at your site.
The following annotated headers illustrate where the error(s) occurred:
--------------------
SMTP client (smtp)
==================
The SMTP Transport Agent implements this message transfer protocol
according to RFC821. It scans message control files for a channel called
`smtp', and a next-host as specified on the command line. Only a single
virtual circuit (VC) is established to the remote SMTP server, and all
transactions are carried out in sequence across this VC. By contrast,
Sendmail opens a new VC for every mail message.
This program does not enforce the line length limits of the SMTP protocol,
nor does it check that the message file data is 7 bit ASCII. However, the
CRLF line termination rule is followed, as are all other aspects of the SMTP
protocol. When connected to a ZMailer SMTP server program, message bodies
containing arbitrary binary data may be transferred (since the SMTP DATA
encoding is reversible, and there are no line length limits on either end).
A log file may be specified for recording the SMTP transaction.
Sendmail compatible delivery programs (sm)
==========================================
Because Sendmail already has many "mailer"s written for it, and to ease the
transition from Sendmail to ZMailer, this Transport Agent was written to
interface with such programs from the ZMailer environment. The basic
characteristic of a Sendmail "mailer" is that its command line specifies
what must be done with the message available on its standard input stream.
Because of the generic interface, this Transport Agent requires a small
configuration file which it reads on startup. The configuration file declares
which programs are available, how to invoke them, and what channel each
program corresponds to. Here is a sample configuration file:
# M     F =   P =                    A =
local   mS    /usr/lib/mail/localm   localm -r $g $u
prog    -     /bin/sh                sh -c $u
tty     rs    /usr/local/to          to $u
uucp    U     /usr/bin/uux           uux - -r -a$g -gC $h!rmail ($u)
news    m     /usr/lib/mail/pnews    post.news $h $u
The configuration file is a table with each line containing four fields:
a channel name, Sendmail "mailer" flags, the full path name of the program
to execute, and the command line that program should see.
The flags field contains the flags that are appropriate to the ZMailer
environment, for example the presently recognized flags are:
`f'
Include a `-f sender' in the command line.
`r'
Include a `-r sender' in the command line.
`S'
Do not reset the uid to the real uid of the Transport Agent process.
`n'
Do *not* prepend a `From ' line to the message.
`s'
Strip quotes on addresses [XX:todo].
`m'
Many recipients may be handled by a single instance of the command.
`P'
Add a "Return-Path" message header.
`U'
Prepend a `From ... remote from ...' line to the message.
`X'
Use the SMTP hidden dot algorithm (i.e. escape periods on a line by
themselves).
`E'
Replace occurrences of `From ' at the start of a line in the message
body with `>From '.
`7'
Pass 7-bit ASCII by stripping 8th bit of bytes in the message
[XX:todo].
`-'
No-op flag.
Mailer flags that are not mentioned in the above table have been excluded due
to their lack of semantics in this situation. Typically their functionality
should be accomplished in the Router instead [XX: if it isn't, and it is
needed, please let me know].
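The `X' and `E' flags both amount to inserting one character in front of
certain message lines. The C sketch below illustrates the idea with a
hypothetical `line_prefix()' function; it is not taken from the `sm' sources.
For `X' it follows the transparency rule of RFC821, which adds a period to
every line that begins with a period (this includes a period on a line by
itself).

```c
#include <assert.h>
#include <string.h>

/*
 * Return the string to emit in front of a message line:
 * "." for the hidden dot algorithm (`X'), ">" for `From ' escaping (`E'),
 * or "" when the line passes through unchanged.
 */
const char *line_prefix(const char *line, int hidden_dot, int escape_from)
{
    assert(line != NULL);
    if (hidden_dot && line[0] == '.')
        return ".";
    if (escape_from && strncmp(line, "From ", 5) == 0)
        return ">";
    return "";
}
```

A delivery loop would write the prefix followed by the unmodified line, so the
receiving end can undo the transformation by removing the first character.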
The command line specification may contain anything valid in the same field
in a Sendmail "mailer" definition. In particular, any argument containing
`$u' is expanded as many times as there are recipients that can be dealt
with at once by that command. The `$g' macro expands to the return
address of the message, and `$h' to the next-host in the destination.
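As an illustration of the macro expansion, the following C sketch expands
`$g', `$h', and `$u' within a single argument for a single recipient. The
function `expand_arg()' is a hypothetical name, and the handling of other `$'
sequences (copied through unchanged) is an assumption, not documented
behaviour.

```c
#include <assert.h>
#include <string.h>

/*
 * Expand $g (return address), $h (next-host) and $u (recipient) in tmpl.
 * Returns the length of the result, or -1 if it does not fit in `out'.
 */
int expand_arg(const char *tmpl, const char *g, const char *h,
               const char *u, char *out, size_t size)
{
    size_t len = 0;

    assert(tmpl != NULL && g != NULL && h != NULL && u != NULL);
    assert(out != NULL && size > 0);
    while (*tmpl != '\0') {
        char one[2] = { *tmpl, '\0' };  /* the current character, literally */
        const char *sub = NULL;
        size_t n;

        if (tmpl[0] == '$') {
            if (tmpl[1] == 'g')
                sub = g;
            else if (tmpl[1] == 'h')
                sub = h;
            else if (tmpl[1] == 'u')
                sub = u;
        }
        if (sub != NULL) {
            tmpl += 2;                  /* consume the two-character macro */
        } else {
            sub = one;
            tmpl += 1;
        }
        n = strlen(sub);
        if (len + n >= size)
            return -1;
        memcpy(out + len, sub, n);
        len += n;
    }
    out[len] = '\0';
    return (int)len;
}
```

For an argument containing `$u', a caller would invoke this once per recipient
and append each result to the argument vector, which gives the repetition
described above.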
At present, no special environment is set up for programs executed by this
Transport Agent. The standard output and standard error of such processes
are caught by the Transport Agent, and the first line read (if any) is
passed on to the Scheduler using the normal status reporting mechanism.
^_
File: zmog Node: miscellaneous, Prev: transports, Up: top, Next: how-to
Miscellaneous
*************
Sendmail compatibility
======================
After installing the Sendmail compatible ZMailer interface programs, the
present user-visible incompatibilities with Sendmail proper are:
* Verbose mode (`-v' flag to Sendmail) is not implemented.
* Occurrences of `:include:' specifications in the aliases database must
be quoted.
* The "Return-Receipt-To" message header is not yet honoured.
* The mailing-list management features of Sendmail are not implemented,
awaiting consultation.
SMTP server
===========
The ZMailer distribution contains an SMTP server program for the BSD socket
implementation of TCP/IP. It is an asynchronous implementation, in that
address semantics are not checked in real time, nor are other (optional in
the SMTP standard) functions that require Router functionality. The server
simply says "Yes yes, sure!" to everything, and passes the information to
the Router for verification. The program may also be used in non-daemon
mode to unpack BSMTP format messages on the standard input stream. For
compatibility with the Sendmail variation on the SMTP protocol, it accepts
the `VERB' and `ONEX' commands as No-Ops. The `VRFY', `EXPN', `HELP', and
`TURN' commands are presently unimplemented, as is the case for the
interactive `SEND', `SAML', and `SOML' commands.
^_
File: zmog Node: how-to, Prev: miscellaneous, Up: top, Next: uasupport
How-To Guide
************
This chapter is intended to give practical tips on topics related to the
maintenance and customization of ZMailer. If you want to see something
covered here, let me know.
How to install ZMailer
======================
The `README' file in the distribution contains specific instructions for
installing ZMailer. The following goes into slightly more depth than the
`README' file does:
The documentation for ZMailer (part of which you are reading right now) is
maintained in *texinfo* format. To format this for a high-quality output
device requires that you already have TeX running, and that you have the
Texinfo macro package installed in the TeX macro library. If not, these
macros are part of every GNU Emacs distribution (and included with the
current ZMailer distribution). Generating a line printer or screen version
of the documentation requires the aid of GNU Emacs (see `doc/Makefile').
If you have neither TeX nor GNU Emacs, ask me for a preformatted version of
the documentation.
There is very little hardcoded configuration information in the ZMailer
programs. The `conf.c' files in the `router' and `scheduler'
subdirectories of the distribution are the primary locations of static
configuration information. You should check these files, but there is no
need to change them unless you know what you are doing, and insist.
The only other static global information is kept in the `mail.h' header
file in the `include' subdirectory. In an operating system environment
that integrates ZMailer, this file is intended to go in `/usr/include'.
There is some dynamic global information, and other compile time
information, that needs to be specified somewhere. With ZMailer, you (the
installer) edit a global configuration file
(`Config'), which contains variable definitions that will propagate to all
the makefiles in the distribution. These definitions will also appear in a
file `/etc/zmailer.conf' that ZMailer programs refer to for global
information. This information includes for example the location of the
`POSTOFFICE' directory hierarchy, so this facility allows easy dynamic
reconfiguration of some installation parameters. The file is in `/etc' to
increase reconfiguration flexibility for diskless mail clients.
Canned error and warning messages are kept in the `proto/forms' directory
of the distribution. They should be modified to suit local preferences.
By default, all errors will be carbon-copied to the postmaster, a local
address that is hopefully defined in the aliases database. Until you are
comfortable with the ZMailer system, you should probably use the default
forms.
Once the above preliminaries have been taken care of, the time has come for
your computer to earn its keep. If you run the command:
make it so
the following will happen:
* A recursive `make clean' is run to scrub the distribution hierarchy.
* The global `make' file (`Makefile') is edited to update it with rules
for updating all the `make' files in the distribution when the
`Config' is modified.
* The `Config' file is processed into a `sed' script, which is then
applied to all the `make' files in the directory tree.
* All programs are compiled.
* Another `sed' script constructed from the `Config' file is applied to
update the `proto/zmailer' shell script.
* The `POSTOFFICE' directory hierarchy is created, and the canned error
messages from the `proto/forms' directory are copied to `~/forms'.
* The ZMailer directory hierarchy under `/usr/lib' is created, and the
standard configuration and control files and shell scripts from the
`proto' directory are copied to that location (referred to as
`MAILBIN' in the `Config' file).
* All the program binaries are installed under the `$MAILBIN' directory.
* Finally, another `sed' script is applied to the `Config' file, to
produce `/etc/zmailer.conf'.
Then it is time to get your aliases database working. If you don't already
have a central aliases database, you should create one. The minimum
requirement is that the `postmaster' address expands to a real account id.
If you already have a central aliases database, this is typically because
you are currently running Sendmail. In that case, start by copying the
Sendmail aliases file to `$MAILBIN/db/aliases'. The ZMailer Router is used
to build the aliases database from the aliases file. To do this correctly,
the Router must know what kind of aliases database to access and, in this
situation, create. The distribution Router configuration file will check
for the existence of a `$MAILBIN/db/aliases.dir' file to indicate that NDBM
or DBM is being used. If this file is absent, the Router will access the
database using a binary search algorithm on an index file.
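The existence test itself is done in the Router configuration file, not in C.
Purely to illustrate the decision, the same check could be written as follows.
The function name `aliases_uses_dbm()' and the idea of passing the `MAILBIN'
directory as a parameter are assumptions made for this sketch.

```c
#include <assert.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Returns 1 if <mailbin>/db/aliases.dir exists (NDBM or DBM is in use),
 * 0 if it is absent (binary search on an index file), and -1 if the
 * path name does not fit in the local buffer.
 */
int aliases_uses_dbm(const char *mailbin)
{
    char path[1024];
    int n;

    assert(mailbin != NULL);
    n = snprintf(path, sizeof path, "%s/db/aliases.dir", mailbin);
    if (n < 0 || (size_t)n >= sizeof path)
        return -1;
    return access(path, F_OK) == 0;
}
```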
If you are using NDBM, prepare the way for the Router by creating a null
`aliases.dir' file in the `db' subdirectory. Then run the Router to
initialize the aliases database (`router -i'). If you get syntax errors,
correct them in the `aliases' file. Eventually the Router will report some
simple counts (a la Sendmail) of defined aliases, indicating it was
successful in initializing the aliases database.
You should now arrange for host-specific information to be made available
to ZMailer. This is obviously a very site-specific customization.
Although the method of access and location of such information is defined
in the Router configuration file (which incidentally is
`$MAILBIN/router.cf'), certain Transport Agents need to know the host's
UUCP node name. This is read from the file `/etc/uucpname' if it exists,
and secondarily obtained from the `uname' system call in certain
environments. The convention of using `/etc/uucpname' is due to 4.3BSD
UUCP which allows this as a configuration option. I recommend this method,
since it greatly increases portability of the UUCP binaries between your
machines.
The sample Router configuration file in the distribution assumes that the
host names it should deliver mail locally for are listed in the file
`$MAILBIN/db/localdelivery'. For example, in an environment with a mail
server and clients, all hostnames should be listed in this file. This is
suggested as a convention for how to discover this information, and where.
The sample configuration file should be studied for other guidance of this
sort.
You can finally try running the Router in interactive mode, as illustrated
in the `README' file. This is the stage at which you should start playing
with the configuration file and with the various ZMailer programs. This is
also an opportune moment (or day) for you to customize or write a Router
configuration file for your host/site.
If you have an `/etc/services' file, it should be updated with the
definition of a TCP port used for mail queue querying. The Scheduler acts
as a server listening on this port, and the `mailq' program included with
the distribution will connect with the Scheduler and obtain a dump of the
mail queue by this mechanism. If your system does not have TCP, a
rendezvous mechanism using named pipes will be used instead. For systems
without either of these facilities, a prearranged file is used along with a
release protocol when the mail queue dump is completed.
When you are comfortable with the new environment and want to start
ZMailer, there is a shell script provided (`$MAILBIN/zmailer') to carry out
the normal startup functions. If invoked without arguments, it will start
the Router and Scheduler daemon processes, and the SMTP server process.
The latter may clash with any running Sendmail daemon if it is also acting
as an SMTP server. This script may also be invoked with individual
arguments like `router' or `scheduler' to start up just the specified
process(es). It may be run from the `/etc/rc.local' file to start ZMailer
on reboot.
How to write a Router configuration file
========================================
Sorry, I don't know what the problem areas will be at this point, so this
section is incomplete. The following is a quick summary:
The configuration file is read and all statements executed sequentially.
Like any other statement, a function definition is also executed, with the
side effect of defining a function. All functions must be defined before
use. Normal statements appearing at the top level in the configuration
file (i.e. not within a function definition), usually have the purpose of
setting up an environment for the rest of the configuration file. An
"environment" encompasses global variables (e.g. "what is my name")
initialized in assignment statements, and database definitions by the
`relation' statement.
There are four instances of magic semantics assumed by the Router:
* Setting the hostname by calling the `hostname' function, will enable
generation of trace headers (i.e. `Received' and `Message-Id').
* The aliases database is defined by the `aliases' relation. The value
of a database lookup must be a byte offset into another file that
contains the actual alias definitions.
* A `router' function must exist. It takes an address as its one
argument, and returns three values representing the channel,
next-host, and next-address.
* A `crossbar' function must exist. It takes two triples (six
arguments) and returns those triples and the name of a rewriting
function to be applied to all the header addresses (seven values).
The argument triples represent the origin and recipient envelope
addresses, and this function is in charge of rewriting them as
appropriate.
Naturally, all function names returned by the `crossbar' function must
correspond to a defined function.
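The byte-offset convention of the `aliases' relation can be illustrated
with a minimal C sketch. It assumes that the value returned by the database
lookup has already been converted to a `long', and that an alias definition
occupies a single line of the definitions file; neither detail is specified
above, and `read_alias_at()' is a hypothetical name, not a ZMailer routine.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: read the alias definition that starts at byte
 * `offset' of the already opened definitions file `fp'.  Assumes one
 * definition per line.  Returns buf, or NULL on seek or read failure.
 */
char *read_alias_at(FILE *fp, long offset, char *buf, int buflen)
{
    if (fseek(fp, offset, SEEK_SET) != 0)
        return NULL;
    if (fgets(buf, buflen, fp) == NULL)
        return NULL;
    buf[strcspn(buf, "\n")] = '\0';     /* drop the trailing newline */
    return buf;
}
```

The point of the indirection is that the database itself stores only small
fixed-size values, while the definitions of arbitrary length stay in an
ordinary file.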
Tell me what is missing from this description. Would a play-by-play of the
sample configuration file be very useful?
^_
File: zmog Node: uasupport, Prev: how-to, Up: top
User Agent support
******************
Submission Interface
====================
Three C library routines are provided to open (create), abort (remove), and
close (submit) a message file. Internally, they make use of the stdio
package, and their interface is modelled after it. The interface
definition is:
#include <mail.h>
FILE *mail_open()
int mail_abort(mfp)
FILE *mfp;
int mail_close(mfp)
FILE *mfp;
The parameter passed in a `mail_abort()' or `mail_close()' call is the
value returned by a call to the `mail_open()' function. The routines take
care of all the necessary housekeeping. They are properly used as follows:
...
FILE *mfp;
on exit or interrupt, arrange to call mail_abort(mfp);
if ((mfp = mail_open()) == NULL) {
    ... error handling when message submission is not possible ...
} else {
    ... output the mail message to mfp ...
    if (oops) {
        if (mail_abort(mfp) == EOF)
            ... print a message that the abort failed ...
    } else if (mail_close(mfp) == EOF)
        ... error handling when message submission fails ...
}
reset behaviour on exit or interrupt
...
char *tmalloc(n) unsigned int n; { return n bytes of memory }
Notice the definition of `tmalloc()'. This routine should allocate memory
that will remain usable within the lifetime of the message submission (i.e.
until a `mail_abort()' or `mail_close()' call). This allows a User Agent
or other application program that makes many calls to these routines during
its lifetime to provide an alternate byte allocator that will not cause
them to run out of data space.
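One way an application might provide such an allocator is a fixed arena
that is rewound after each submission. The following is only a sketch: the
arena size, the 16-byte rounding, and the `tmalloc_reset()' routine are
assumptions made for illustration and are not part of the ZMailer library.

```c
#include <stddef.h>

#define ARENA_SIZE 65536        /* hypothetical budget for one message */
#define ARENA_ALIGN 16          /* every block starts on a 16-byte step */

/* The union forces the arena to be suitably aligned for common types. */
static union {
    char bytes[ARENA_SIZE];
    long double ld;
    void *p;
} arena;
static size_t arena_used = 0;

/* Hand out n bytes from the arena; NULL when the arena is exhausted. */
char *tmalloc(unsigned int n)
{
    size_t need;
    char *p;

    if (n > ARENA_SIZE)
        return NULL;
    need = ((size_t)n + ARENA_ALIGN - 1) / ARENA_ALIGN * ARENA_ALIGN;
    if (need > ARENA_SIZE - arena_used)
        return NULL;
    p = arena.bytes + arena_used;
    arena_used += need;
    return p;
}

/* Call only after mail_abort() or mail_close(): rewinds the arena. */
void tmalloc_reset(void)
{
    arena_used = 0;
}
```

Because the whole arena is reused for the next message, a long-running
program that submits many messages does not grow its data space.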
Another point to be made is that these routines, and all other code in
ZMailer that relinks files, use `link()'/`unlink()' combinations and never
the `rename()' system call, even if it is available. Unfortunately,
`rename()' does not retain the inode number of the file being renamed.
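The `link()'/`unlink()' combination can be sketched as follows. This is an
illustration, not ZMailer's actual code: `relink()' and `relink_demo()' are
hypothetical names, and the two file names used by the demonstration are
placeholders created in the current directory.

```c
#include <errno.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper: give the file `from' the new name `to' by
 * making a second link and then removing the first.
 * Returns 0 on success, -1 on failure with errno set.
 */
int relink(const char *from, const char *to)
{
    int saved;

    if (link(from, to) != 0)
        return -1;              /* nothing changed, e.g. `to' exists */
    if (unlink(from) != 0) {
        saved = errno;
        (void)unlink(to);       /* roll back: keep only the old name */
        errno = saved;
        return -1;
    }
    return 0;
}

/*
 * Demonstration: returns 1 if the inode number is the same before and
 * after the relink, 0 if it differs, -1 if the setup failed.
 */
int relink_demo(void)
{
    const char *oldname = "relink-demo.old";
    const char *newname = "relink-demo.new";
    struct stat before, after;
    FILE *fp;
    int same;

    (void)unlink(oldname);
    (void)unlink(newname);
    if ((fp = fopen(oldname, "w")) == NULL)
        return -1;
    fclose(fp);
    if (stat(oldname, &before) != 0 || relink(oldname, newname) != 0
        || stat(newname, &after) != 0) {
        (void)unlink(oldname);
        (void)unlink(newname);
        return -1;
    }
    same = (before.st_ino == after.st_ino);
    (void)unlink(newname);
    return same;
}
```

Note that `link()' fails if the new name already exists, so this sequence
never silently replaces an existing file.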
Finally, although this interface will honour the *FULLNAME* and
*PRETTYLOGIN* environment variables mentioned earlier, a User Agent can
override this mechanism by seeking to byte 0 of the message file and
writing its message data from there.
The system standard header file `mail.h' declares these routines
appropriately. It contains all the common definitions used in passing
information between the components of ZMailer. This includes the names of
various directories, the postmaster, and symbolic names for various keys
used in the control file protocol.
Fullname quoting
================
The library routine that constructs a full user name does so purely based
on information passed to it. This means it can be used with the contents
of a GECOS field (everything after a `,' or a `;' is ignored), or some
other arbitrary string, without incurring the cost of a password database
lookup. The interface specification is as follows:
char *
fullname(gecos, buf, buflen, login)
char *gecos; /* the name we wish to quotify */
char buf[]; /* place to put the result */
int buflen; /* how much space we have */
char *login; /* what to use for a login name */
The return value from `fullname()' is always the value of the second
parameter. A sample usage would be:
struct passwd *pw;
char buffer[BUFSIZ], *name;
extern char *fullname();
name = fullname(pw->pw_gecos, buffer, sizeof buffer, pw->pw_name);
If the fourth parameter is `(char *)NULL', the `fullname()' routine will
look for the *USER* and *LOGNAME* environment variables, in that order, if
it needs a login name due to the expansion of a `&' in the GECOS field.
For example:
fullname("& Kirk", ..., "jim") returns "Jim Kirk".
fullname("James T. &", ..., "kirk") returns "\"James T. Kirk\"".
The routine will truncate the text of its return value to fit in the space
available in the buffer. If there is a leading double-quote, there will
also be a trailing double-quote. The decision to quote is made according
to the specifications in RFC822 for a phrase. In other words, when scanned
according to the lexical rules of RFC822, the return value from
`fullname()' will constitute a valid RFC822 phrase.
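The behaviour described above can be approximated by the following sketch.
It is not the library's implementation: `sketch_fullname()' is a
hypothetical stand-in, the 256-byte working buffer is an arbitrary limit,
and the choice to copy a `&' literally when no login name can be found is
an assumption. Quoting is applied whenever the text contains an RFC822
special character or a control character.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* True if s holds an RFC822 special or a control character. */
static int needs_quoting(const char *s)
{
    for (; *s != '\0'; s++)
        if (strchr("()<>@,;:\\\".[]", *s) != NULL
            || iscntrl((unsigned char)*s))
            return 1;
    return 0;
}

/* Hypothetical stand-in for fullname(); always returns buf. */
char *sketch_fullname(const char *gecos, char buf[], int buflen,
                      const char *login)
{
    char tmp[256];
    size_t n = 0, i, o = 0, cap;
    const char *p, *q;

    if (buflen <= 0)
        return buf;
    if (login == NULL && (login = getenv("USER")) == NULL)
        login = getenv("LOGNAME");

    /* Copy up to the first ',' or ';', expanding '&' to the login
     * name with its first letter capitalized. */
    for (p = gecos; *p != '\0' && *p != ',' && *p != ';'; p++) {
        if (*p == '&' && login != NULL) {
            for (q = login; *q != '\0' && n < sizeof tmp - 1; q++)
                tmp[n++] = (q == login)
                    ? (char)toupper((unsigned char)*q) : *q;
        } else if (n < sizeof tmp - 1) {
            tmp[n++] = *p;
        }
    }
    tmp[n] = '\0';

    cap = (size_t)buflen - 1;           /* room left besides the NUL */
    if (needs_quoting(tmp)) {
        if (cap < 2) {                  /* no room for a quote pair */
            buf[0] = '\0';
            return buf;
        }
        buf[o++] = '"';
        for (i = 0; i < n; i++) {
            size_t k = (tmp[i] == '"' || tmp[i] == '\\') ? 2 : 1;
            if (o + k > cap - 1)        /* reserve the closing quote */
                break;
            if (k == 2)
                buf[o++] = '\\';
            buf[o++] = tmp[i];
        }
        buf[o++] = '"';
    } else {
        for (i = 0; i < n && o < cap; i++)
            buf[o++] = tmp[i];
    }
    buf[o] = '\0';
    return buf;
}
```

With a six-byte buffer, the name James T. Kirk is cut down to the quoted
string "Jam", which still begins and ends with a double-quote as required.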
^_