Thu Jan 31 17:32:33 2002 Geoff Hutchison <ghutchis@wso.williams.edu>
* Release of 3.1.6.
* htdoc/confindex.html, htdoc/htsearch.html, htdoc/index.html,
htdoc/mailarchive.html: Remove CSS link, not needed in these
frameset pages.
* htdoc/howto-mirror.html: Update with Jesse's latest version.
Thu Jan 31 15:13:07 2002 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* Makefile.in: Fixed install-strip target to properly handle relative
paths in INSTALL_PROGRAM when passing it to subdirectories.
Thu Jan 31 11:41:39 2002 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htdoc/FAQ.html: Updated questions 4.8 & 4.9 to emphasize use of
doc2html over parse_doc.pl. Further clarified question 2.1.
Thu Jan 31 10:14:23 2002 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* contrib/parse_doc.pl: Added comments explaining why you should
not be using this script.
Wed Jan 30 17:20:51 2002 Geoff Hutchison <ghutchis@wso.williams.edu>
* htdoc/FAQ.html: Updated to mention 3.1.6 as the newest version
and --with-rx as a fix for regex problems on BSDI.
Wed Jan 30 17:15:49 2002 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* installdir/synonyms: Updated with the version contributed by
David Adams, with minor changes. Kept old one as synonyms.original.
* installdir/english.0: Changed lots more dubious uses of suffixes to
get more appropriate and correct fuzzy endings expansions.
Wed Jan 30 12:30:16 2002 Geoff Hutchison <ghutchis@wso.williams.edu>
* htlib/Connection.cc (connect): Fixed bug with allow_EINTR and
added support for looping when the connection returns EAGAIN (no
more free local ports). Thanks to Ahmon Dancy <dancy@franz.com>
for pointing out the EAGAIN issue.
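The retry behaviour described in this entry can be sketched roughly as follows. This is a minimal illustration, not the code in htlib/Connection.cc: the helper name connect_with_retry, the max_retries parameter and the callable standing in for connect(2) are all assumptions made for the example, and a real loop would probably pause between attempts.

```cpp
#include <cassert>
#include <cerrno>
#include <functional>

// Hypothetical helper (not the ht://Dig Connection API).
// attempt() mimics connect(2): it returns 0 on success, or -1 with errno set.
// EAGAIN is always treated as transient; EINTR only when allow_eintr is true.
int connect_with_retry(const std::function<int()>& attempt,
                       int max_retries, bool allow_eintr)
{
    assert(max_retries >= 0);
    for (int tries = 0; ; ++tries) {
        if (attempt() == 0)
            return 0;                       // connected
        const bool transient =
            (errno == EAGAIN) || (allow_eintr && errno == EINTR);
        if (!transient || tries >= max_retries)
            return -1;                      // hard error, or out of retries
    }
}
```

Passing the connect call as a callable keeps the sketch testable without opening a real socket; any other errno value ends the loop immediately.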
Tue Jan 29 09:59:58 2002 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htdoc/FAQ.html: Updated with today's changes to maindocs FAQ.
Mon Jan 28 16:54:15 2002 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* contrib/README: Added mentions of examples & xmlsearch, fixed typo.
Sun Jan 27 23:13:11 2002 Geoff Hutchison <ghutchis@wso.williams.edu>
* htdoc/*.html: Final batch of documentation updates.
Sat Jan 26 23:28:25 2002 Geoff Hutchison <ghutchis@wso.williams.edu>
* htdoc/*: More documentation updates from merging with the
current maindocs CVS.
Fri Jan 25 21:36:21 2002 Geoff Hutchison <ghutchis@wso.williams.edu>
* acconfig.h, include/htconfig.h.in: Add USE_RX to potential
configure #include macros.
* htlib/gregex.h: Rename regex.h to prevent conflicts with system
version.
* htlib/regex.c, htlib/HtRegex.h: Ditto.
* htfuzzy/EndingsDB.cc: Use same tests as HtRegex.h for rxposix.h,
gregex.h or regex.h depending on configure results.
* configure.in: Implement more flexible test for rx/regex, which
will check for rxposix.h if --with-rx is supplied, will "fall
back" to regex test if rxposix.h isn't available and will only use
the htlib/ code and header for regex compile.
* configure: Update using autoconf.
Fri Jan 25 12:14:26 2002 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* contrib/whatsnew/README, contrib/whatsnew/whatsnew.html: Added
an example of how to get a what's new listing from the new features
in htsearch.
Thu Jan 24 22:43:28 2002 Geoff Hutchison <ghutchis@wso.williams.edu>
* htcommon/defaults.cc: Add ignore_dead_servers attribute to
control whether indexing will continue to try to contact a dead
server.
* htdig/Retriever.cc: Only mark a server as dead if the
ignore_dead_servers attribute is set.
* htdoc/cf_byname.html, htdoc/cf_byprog.html, htdoc/attrs.html:
Documentation updates.
Thu Jan 24 15:32:59 2002 Geoff Hutchison <ghutchis@wso.williams.edu>
* configure, configure.in: Add --with-rx option to switch to
system rx code (e.g. on BSDI). Needs some touchups still,
including checking that rxposix.h exists and if --without-rx was
supplied for some reason.
* htlib/HtRegex.h: Add conditional <rxposix.h> header for systems
where rx is better than regex.
* htlib/Makefile.in: Make sure regex.o is only compiled if it
works on a given system via LIBOBJS as supplied by the configure
script.
Mon Jan 21 22:33:30 2002 Geoff Hutchison <ghutchis@wso.williams.edu>
* htdoc/RELEASE.html: Add first shot at the release notes for
3.1.6. Still need to finish some of the htdoc/ merges, including
the SF icons and such.
* htdoc/*.html: First stab at many of the htdoc/ merges including
the new Copyright line. (It is 2002, after all.)
Fri Jan 18 18:17:34 2002 Geoff Hutchison <ghutchis@wso.williams.edu>
* htmerge/docs.cc: Add a test if the DB database has no URLs
before proceeding.
* htmerge/words.cc: Add a slightly more user-friendly error
message if the word list file doesn't exist. Remove exit()
statements since reportError does this for us.
Fri Jan 18 16:47:50 2002 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htdoc/attrs.html: Rewrote description of prefix_match_character
to make it more clear, with crosslinks to related attributes, and
described new wildcard matching feature. Added more explanations
for relative days & months in startday et al. to make it clearer.
Added more notes about to-strings in the url_part_aliases description
and explained the example even more, as well as adding crosslinks
to the new *_rewrite_rules.
Fri Jan 18 15:56:11 2002 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htsearch/htsearch.cc (setupWords), htsearch/parser.cc (perform_push):
Added support for a wildcard word of "*" (or prefix_match_character
if set and not empty) which returns all documents.
Wed Jan 16 17:21:26 2002 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htdoc/attrs.html, htdoc/hts_form.html: Described how to use
relative dates for startyear et al.
Wed Jan 16 16:58:05 2002 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htsearch/Display.cc (buildMatchList): Fixed startday et al. to
allow relative days, months & years if values are negative.
Fri Jan 11 20:57:51 2002 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htdoc/attrs.html: Updated descriptions for translate_* attributes
to match the new default behavior.
Fri Jan 11 17:48:54 2002 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htdig/SGMLEntities.cc (translateAndUpdate): Added support for
translate_latin1 attribute, to turn off ISO-8859-1-specific entities.
* htcommon/defaults.cc: Added translate_latin1 attribute.
* htdoc/attrs.html, htdoc/cf_by{name,prog}.html: Documented it.
Fri Jan 11 17:14:54 2002 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* contrib/xmlsearch.{README,tar.gz}: Removed older xmlsearch package.
Fri Jan 11 17:06:09 2002 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* contrib/xmlsearch/*: Added files contributed by Nathan Hand and
me to implement XML output from htsearch, including DTD, templates
and config file.
Wed Jan 9 22:08:21 2002 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* CONFIG.in: Fixed to allow setting BIN_DIR by configure option.
* contrib/htdig-3.1.6.spec: Fixed to make use of new ./configure
options for pathnames, do away with patch file. Used variables for
many pathnames to allow easy changes.
Wed Jan 9 16:22:32 2002 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htdig/ExternalParser.cc (parse): Added support for max_keywords
attribute.
Wed Jan 9 16:10:44 2002 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htdig/HTML.cc (HTML, do_tag), htdig/ExternalParser.cc (parse):
Added support for description_meta_tag_names attribute.
Ensure external parser interface accepts META descriptions even if
'description' is added to the keyword list.
* htcommon/defaults.cc: Added description_meta_tag_names attribute.
* htdoc/attrs.html, htdoc/cf_by{name,prog}.html: Documented it.
Tue Jan 8 17:39:24 2002 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htdig/ExternalParser.cc (parse): Added support for use_doc_date
attribute.
Thu Jan 3 17:10:50 2002 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htlib/Makefile.in, htlib/lib.h: Removed references to timegm,
mytimegm and strptime functions. Removed C source for these.
Thu Jan 3 16:43:31 2002 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htdoc/htmerge.html: Added extra description for -m option to clear
up common points of confusion, added note about LC_COLLATE environment
variable.
Fri Dec 21 18:52:32 2001 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htdig/Retriever.cc: Added parsedcdate function, used by got_time,
to parse DC date meta tags without requiring strptime or timegm.
Thu Dec 20 12:25:47 2001 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htdig/Document.cc: Added parsedate function, used by getdate, to
parse date headers without requiring strptime or timegm, which have
caused problems on some systems.
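For illustration, converting already-parsed UTC date fields to seconds since the epoch without timegm can be done with plain integer arithmetic. The sketch below is an assumption about the general technique, not the parsedate code in htdig/Document.cc; the function names are hypothetical and it only handles years from 1970 onward.

```cpp
#include <cassert>

// True for Gregorian leap years.
static bool is_leap(int year)
{
    return (year % 4 == 0 && year % 100 != 0) || year % 400 == 0;
}

// Hypothetical helper: seconds since 1970-01-01 00:00:00 UTC for a UTC
// calendar date and time, computed without timegm() or mktime().
// mon is 1..12, day is 1..31; years before 1970 are not supported.
long long utc_to_epoch(int year, int mon, int day, int hour, int min, int sec)
{
    // Days before the first of each month in a non-leap year.
    static const int cum[12] =
        { 0, 31, 59, 90, 120, 151, 181, 212, 243, 273, 304, 334 };
    assert(year >= 1970 && mon >= 1 && mon <= 12 && day >= 1 && day <= 31);

    long long days = 0;
    for (int y = 1970; y < year; ++y)
        days += is_leap(y) ? 366 : 365;     // whole years since the epoch
    days += cum[mon - 1];                   // whole months this year
    if (mon > 2 && is_leap(year))
        days += 1;                          // Feb 29 already passed
    days += day - 1;                        // whole days this month

    return ((days * 24 + hour) * 60 + min) * 60 + sec;
}
```

Because nothing here consults the local time zone, the result does not depend on TZ or on platform differences in the C library, which is the portability benefit the entry refers to.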
Thu Dec 20 11:51:26 CET 2001 Gabriele Bartolini <angusgb@users.sourceforge.net>
* configure.in: reviewed directory settings
* Makefile.in: ditto (for 'make install' of htdig.conf and rundig)
Wed Dec 19 23:05:09 2001 Geoff Hutchison <ghutchis@wso.williams.edu>
* configure.in: Add tests for ostream.h and iostream.h.
* htlib/htString.h: Add HAVE_OSTREAM_H and HAVE_IOSTREAM_H
preprocessor statements to deal with portability issues around the
C++ header files.
Wed Dec 19 13:33:55 2001 Gabriele Bartolini <angusgb@users.sourceforge.net>
* configure.in: fixed bug in customisation of configure parameters
* CONFIG.in: ditto
* configure: re-generated with autoconf
Tue Dec 18 16:12:17 2001 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htsearch/Display.cc (displayMatch): Fixed to clear out old values
of ANCHOR template variable for each result.
Thu Dec 6 13:14:22 2001 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* contrib/examples/rundig.sh: Fixed to make use of DBDIR variable.
Wed Nov 21 12:54:42 2001 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htdoc/rundig.html: Added note about effect of changing database_base.
* htmerge/docs.cc (convertDocs): Changed confusing message about
total doc db size in stats.
Wed Nov 21 11:37:52 2001 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htsearch/TemplateList.cc (createFromString), htdoc/attrs.html:
Treat template_map as a _quoted_ string list. Change <i> tags to
the HTML-4.0 compliant <em> tags in builtin-long template.
Tue Nov 20 17:13:27 2001 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htlib/String.cc (String, append, sub): Added checks for negative
lengths or start position to make code more fault-tolerant.
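A defensive substring routine of the kind this entry describes might look like the sketch below. It uses std::string rather than the ht://Dig String class, and the name safe_sub and its choice to return an empty string on bad arguments are assumptions made for the example.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: returns up to len characters of s starting at start.
// A negative start, a non-positive len, or a start past the end yields an
// empty string instead of undefined behaviour or an exception.
std::string safe_sub(const std::string& s, int start, int len)
{
    if (start < 0 || len <= 0 ||
        static_cast<std::size_t>(start) >= s.size())
        return std::string();
    // substr() itself clamps len to the characters actually available.
    return s.substr(static_cast<std::size_t>(start),
                    static_cast<std::size_t>(len));
}
```

Checking the signed arguments before converting them to an unsigned size avoids a negative value silently turning into a huge length.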
Tue Nov 20 16:37:26 2001 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htfuzzy/Synonym.cc (createDB): Check for lines with fewer than
2 words, to avoid segfault caused by calling Database::Put() with
negative length for data field.
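The guard can be pictured with the sketch below, which assumes a synonyms file where each line is a whitespace-separated group of words and every word maps to the others on its line. The SynonymMap type and load_synonyms function are hypothetical stand-ins for illustration, not the createDB code or the ht://Dig Database class.

```cpp
#include <cstddef>
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for the synonym database: word -> its synonyms.
typedef std::map<std::string, std::vector<std::string> > SynonymMap;

SynonymMap load_synonyms(std::istream& in)
{
    SynonymMap db;
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream fields(line);
        std::vector<std::string> words;
        std::string w;
        while (fields >> w)
            words.push_back(w);

        // A blank line or a single word has no synonyms to store, so skip
        // it rather than writing an empty (or worse, negative-length) record.
        if (words.size() < 2)
            continue;

        for (std::size_t i = 0; i < words.size(); ++i)
            for (std::size_t j = 0; j < words.size(); ++j)
                if (j != i)
                    db[words[i]].push_back(words[j]);
    }
    return db;
}
```

Skipping short lines before building any record means a stray one-word line in the input is ignored instead of reaching the storage layer.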
Sat Nov 3 23:55:00 2001 Geoff Hutchison <ghutchis@wso.williams.edu>
* htlib/htString.h: Add #include for ostream.h to solve compile
problems with gcc3.
* htlib/Connection.h, htlib/Connection.cc: Backport Connection
class from 3.2 code--installs alarm() call to timeout connections
and will retry connections a few times before giving up.
Fri Nov 2 12:28:35 2001 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htdig/HTML.cc, htdoc/attrs.html: Added support for dc.date,
dc.date.created and dc.date.modified to use_doc_date handling.
Fri Nov 2 12:12:59 2001 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* contrib/xmlsearch.README, contrib/xmlsearch.tar.gz: Added files
contributed by Nathan Hand and me to implement XML output from
htsearch, including DTD, templates and config file.
Fri Nov 2 12:05:49 2001 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htdig/HTML.cc (do_tag), htcommon/defaults.cc: Added ignore_alt_text
attribute to avoid indexing alt text in img tags.
* htdoc/attrs.html, htdoc/cf_by{name,prog}.html: Documented it.
Thu Nov 1 14:43:13 2001 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htsearch/htsearch.cc (main): Fixed to only show file names in
error messages when REQUEST_METHOD not set and -v option given,
for security.
Thu Nov 1 10:19:27 2001 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htsearch/Display.cc, htsearch/Display.h: Added a localized
method for outputting HTTP headers, added support for a new
search_results_contenttype attribute to control that header.
* htcommon/defaults.cc: Added default for it.
* htdoc/attrs.html, htdoc/cf_by{name,prog}.html: Documented it.
Wed Oct 31 13:31:18 2001 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* installdir/english.0: Changed lots of dubious uses of suffixes to
get more appropriate and correct fuzzy endings expansions.
Tue Oct 23 14:06:37 2001 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htdig/Retriever.cc (RetrievedDocument): Fixed handling of null
return from getParsable(), to avoid segfault problem introduced
by text/css conditional added Jul 25.
Fri Oct 19 17:24:19 2001 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htsearch/Display.cc (hilight): Added Stefan Nehlsen's idea for
anchor_target attribute.
* htcommon/defaults.cc: Added default for it.
* htdoc/attrs.html, htdoc/cf_by{name,prog}.html: Documented it.
Sun Oct 14 22:05:30 2001 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htdoc/attrs.html (external_parsers): Documented external converter
chaining to same content-type, e.g. text/html->text/html-internal.
Sun Oct 14 21:54:24 2001 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htdoc/attrs.html, htdoc/cf_byprog.html, htdoc/cf_byname.html,
htcommon/defaults.cc: Documented and declared startyear, etc.
attributes used by htsearch.
Sun Oct 14 21:16:19 2001 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htdoc/htdump.html, htdoc/htload.html, htdoc/attrs.html,
htdoc/cf_byprog.html, htdoc/contents.html: Documented htdump and
htload, indicating which attributes are used by them.
Fri Oct 12 14:58:15 2001 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htlib/URL.cc (removeIndex): Fixed to make sure the matched file
name is at the end of the URL.
Tue Oct 2 09:34:43 2001 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htdoc/attrs.html (start_url): Added a reference and link to
limit_urls_to, explaining how the two are tied together.
Fri Sep 28 17:19:45 2001 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* contrib/htdig-3.1.6.spec: Fixed %install to make symlinks for
htdump & htload, added these to %files list.
Fri Sep 28 15:38:00 2001 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htsearch/Display.cc (displayMatch): Save rewritten URL in DocumentRef
so it'll be used for star_patterns and template_patterns matching.
Fri Sep 28 14:25:29 2001 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htsearch/Display.cc (buildMatchList, displayMatch),
htsearch/htsearch.cc (main): Added calls to pass search_rewrite_rules
to HtURLRewriter class and use it to rewrite URLs in results.
* htdoc/attrs.html, htdoc/cf_byname.html, htdoc/cf_byprog.html,
htcommon/defaults.cc: Added search_rewrite_rules attribute.
Thu Sep 27 16:34:51 2001 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htlib/Makefile.in, htlib/HtRegex.cc, htlib/HtRegex.h,
htlib/HtRegexReplace.cc, htlib/HtRegexReplace.h,
htlib/HtRegexReplaceList.cc, htlib/HtRegexReplaceList.h,
htlib/HtURLRewriter.cc, htlib/HtURLRewriter.h: Added new classes to
support regular expressions and implement url_rewrite_rules attribute,
using Geoff's variation of Andy Armstrong's implementation of this.
* htlib/URL.h, htlib/URL.cc: Added URL::rewrite() method.
* htlib/htString.h: Added Nth() method for HtRegex class.
* htdig/Retriever.cc (got_href, got_redirect): Added calls to
url.rewrite(), and debugging output for this.
* htdig/htdig.cc (main): Added calls to make instance of
HtURLRewriter class.
* htdoc/attrs.html, htdoc/cf_byname.html, htdoc/cf_byprog.html,
htcommon/defaults.cc: Added url_rewrite_rules attribute.
Mon Sep 17 16:52:07 2001 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htdoc/running.html: New documentation on how to run after configuring.
* htdoc/rundig.html: New manual page for rundig script.
* htdoc/install.html: Added link to running.html.
* htdoc/contents.html: Added link to running.html, rundig.html, related
projects. Updated links to contrib and developer site. Got rid of
link to web site stats.
Fri Sep 14 09:18:38 2001 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htdig/Document.cc (RetrieveHTTP): Add port to Host: header when
port is not default, as per RFC2616(14.23). Fixes bug #459969.
Sat Sep 8 22:04:47 2001 Geoff Hutchison <ghutchis@wso.williams.edu>
* acconfig.h, include/htconfig.h.in: Add undef for
ALLOW_INSECURE_CGI_CONFIG, which if defined does about what you'd
expect. (This is for any wrapper authors who don't want to rewrite
but are willing to run insecure.)
* htsearch/htsearch.cc: Only allow the -c flag to work when
REQUEST_METHOD is undefined. Fixes PR#458013.
Fri Aug 31 16:00:37 2001 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htlib/URL.cc (URL): Fixed to call normalizePath() even if URL
is relative but with absolute path. Should fix bug #408586.
Fri Aug 31 15:21:49 2001 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htdig/HTML.h, htdig/HTML.cc (HTML, parse, do_tag): Fixed buggy
handling of nested tags that independently turn off indexing, so
</script> doesn't cancel <meta name=robots ...> tag. Add handling
of <noindex follow> tag.
Fri Aug 31 14:33:41 2001 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
[ Backport some 3.2.0b4 HTML parser changes. ]
* htdig/HTML.cc (do_tag): Rewrite using Configuration class to
separate tag attributes. Parse <object> tags properly, looking
for data= attribute rather than src=. Add support for TITLE
attributes in anchor and related tags. Treat <script></script>
tags as noindex tags, much like <style></style> as suggested
by Torsten.
* htdig/HTML.cc(parse): Fix to prevent closing ">" from being passed
to do_tag().
Wed Aug 29 10:20:55 2001 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htdoc/attrs.html (allow_in_form, build_select_lists,
limit_normalized, server_aliases, server_max_docs, server_wait_time,
url_part_aliases): Added clarifications to allow_in_form,
server_aliases and url_part_aliases descriptions. Changed word
"directive" to "attribute" where appropriate. Added cross-link to
server_aliases from limit_normalized, and to allow_in_form from
build_select_lists.
Mon Aug 27 17:22:56 2001 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htdig/HTML.cc (do_tag): Improve handling of whitespace in META
refresh handling. Fixes bug #406244.
Mon Aug 27 16:38:43 2001 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htdig/HTML.cc (parse): Fixed delete [] text (was missing []), added
simple optimizations for comment & noindex_start skipping, handle
decoded &lt; entity correctly.
Mon Aug 27 15:31:01 2001 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
[ Backport 3.2.0b4 config files. ]
* installdir/htdig.conf: Added .css to bad_extensions default,
added missing closing ">", added mentions of accents & substring,
fixed a couple typos in comments.
* installdir/search.html: Add DTD tag for HTML 4 compliance.
* installdir/{long, syntax, header, footer, wrapper, nomatch}.html:
Add DTD tags, ALT attributes and remove bogus </select> tags to
fix invalid HTML pointed out in PR#901. Change all <b> and <i> tags
to the HTML-4.0 compliant <strong> and <em> tags.
* htdoc/config.html: Updated with sample of latest htdig.conf and
installdir/*.html, added blurb on wrapper.html.
Thu Jul 26 15:05:29 2001 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htcommon/defaults.cc, htsearch/parser.cc (perform_or),
htdoc/attrs.html, htdoc/cf_by{name,prog}.html: Added new attribute
multimatch_method and used it to boost score on 'or' method with
multiple matches.
Thu Jul 26 14:25:01 2001 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htcommon/defaults.cc, htsearch/parser.cc, htdoc/attrs.html,
htdoc/cf_by{name,prog}.html: Added new attribute boolean_syntax_errors
and used it to generate syntax error messages for boolean method.
Wed Jul 25 23:39:00 2001 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htnotify/htnotify.cc: Changed calls to EmailNotification class
to avoid compiler warnings.
Wed Jul 25 23:15:24 2001 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htcommon/defaults.cc, htsearch/htsearch.cc, htdoc/attrs.html,
htdoc/cf_by{name,prog}.html: Added new attribute boolean_keywords
and used it to make LOGICAL_WORDS and parse "words" using boolean
method.
Wed Jul 25 22:31:19 2001 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htlib/Dictionary.cc (Remove): Fixed so it doesn't clobber rest of
chain when removing an entry, as suggested by Yariv Tal.
Wed Jul 25 22:06:08 2001 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htcommon/defaults.cc: Add new attributes htnotify_replyto,
htnotify_webmaster, htnotify_prefix_file, htnotify_suffix_file.
* htdoc/attrs.html, htdoc/cf_by{name,prog}.html: Document them.
* htnotify/htnotify.cc, htnotify/EmailNotification.{h,cc},
htnotify/Makefile.in: Added in code from Richard Beton
<richard.beton@roke.co.uk> to collect multiple URLs per e-mail
address and allow customization of notification messages by
reading in header/footer text as designated by the new attributes
above.
* htdoc/THANKS.html: Credit where due.
Wed Jul 25 21:38:21 2001 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htcommon/defaults.cc: Added .css to bad_extensions, for consistency
with 3.2.
* htdoc/attrs.html: Ditto for default value. Also set examples for
translate_* and modification_time_is_now to false so the example is
different than default.
Wed Jul 25 17:26:07 2001 Geoff Hutchison <ghutchis@wso.williams.edu>
* htdig/Document.cc (getParsable): Add conditional to catch
text/css files to prevent these from being parsed as Plaintext.
* htdig/htdig.cc: Quick fix to make the logging -l flag the
default behavior. (Set to Retriever_logUrl from the start.)
* htcommon/defaults.cc: Set modification_time_is_now to default to
true (now that it works correctly). Also set translate_*
attributes to true.
* htdoc/htdig.html: Remove documentation for -l flag--now no
longer used.
* htdoc/attrs.html: Correct new default values for
modification_time_is_now and translate_* attributes.
Tue Jul 24 16:12:45 2001 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htdoc/attrs.html: Added reference to maximum_page_buttons in the
section on maximum_pages.
Tue Jul 24 15:38:39 2001 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htsearch/Display.cc (generateStars): Add NSTARS variable for
template output as suggested by Caleb Crome
<ccrome@users.sourceforge.net> (except here precision is 0). Fixes
feature request #405787.
* htdoc/hts_templates.html: Add description of NSTARS variable
above. (Actually copied hts_templates.html from 3.2.0b4.)
Tue Jul 24 14:21:53 2001 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htsearch/Display.cc (expandVariables, outputVariable),
htdoc/hts_templates.html: Add support for $=(var) template variable
references, as suggested by Quim Sanmarti.
Tue Jul 24 14:12:06 2001 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htsearch/Display.cc (readFile): Added missing fclose() call, and
debugging message for when file can't be opened.
* htsearch/Display.cc (displayParsedFile): Added debugging message
for when file can't be opened.
Tue Jul 24 14:03:12 2001 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htsearch/Display.cc (setVariables), htcommon/defaults.cc: Added
maximum_page_buttons attribute, to limit buttons to less than
maximum_pages. Fixes PR#731 & PR#781.
* htdoc/attrs.html, htdoc/cf_by{name,prog}.html: Documented it.
Tue Jul 24 13:42:56 2001 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htdoc/hts_templates.html, htsearch/Display.cc (displayMatch):
Add METADESCRIPTION variable.
Tue Jul 24 13:20:24 2001 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htcommon/DocumentDB.{h,cc}: Added FindCoded() method to lookup
docdb record with URL that's still encoded.
* htsearch/Display.cc (display, displayMatch, buildMatchList):
Use new method to avoid problems with URLs that are decoded and
reencoded with another, more ambiguous url_part_aliases setting.
Also fixed a problem with date range checking looking at ref before
checking if it's null.
Thu Jul 12 11:45:05 2001 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* contrib/conv_doc.pl, contrib/parse_doc.pl: Fixed EOF handling in
dehyphenation, fixed to handle %xx codes in title made from URL.
* contrib/doc2html/doc2html.pl, contrib/doc2html/pdf2html.pl,
contrib/doc2html/swf2html.pl: Fixed to handle %xx codes in URL title.
Thu Jul 5 11:23:40 2001 Geoff Hutchison <ghutchis@wso.williams.edu>
* db/dist/config.guess: Update with more recent GNU version that
recognizes various flavors of Mac OS X automatically.
* htlib/DB2_db.cc: Only #include <malloc.h> if we have it. Fixes
compilation problems on Mac OS X.
* htlib/String.cc: Include <iostream.h> instead of deprecated
<stream.h>. Fixes compilation problems with Mac OS X.
* htlib/Configuration.cc: Make sure we never try to operate on
strings of no length--accessing string[-1] is a bug--exposed on
Mac OS X.
Fri Jun 29 11:56:25 2001 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htdig/Retriever.cc (got_redirect): Allow the redirect to accept
relative redirects instead of just full URLs.
Fri Jun 22 16:25:21 2001 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htdoc/THANKS.html: Credit Marc Pohl and Robert Marchand.
* htsearch/Display.cc (buildMatchList): Fix date_factor calculation
to avoid 32-bit int overflow after multiplication by 1000, and avoid
repetitive time(0) call, as contributed by Marc Pohl. Also move the
localtime() call up before gmtime() call, to avoid clobbering gmtime's
returned static structure (my thinko).
Tue Jun 19 17:07:01 2001 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htsearch/Display.cc (setVariables): Fixed handling of
build_select_lists attribute, to deal with new restrict & exclude
attributes.
Fri Jun 15 17:45:40 2001 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htdoc/require.html: Added mentions of accents, prefix & substring,
taken from 3.2.0b4.
* htdoc/htfuzzy: Added blurb on accents algorithm, taken from 3.2.0b4.
* htdoc/attrs.html, htdoc/cf_by{name,prog}.html: Added entry for
accents_db attribute for htfuzzy and htsearch. Mentioned accents
algorithm in description of search_algorithm. Noted effect of
locale setting on floating point numbers in search_algorithm
and locale descriptions.
Fri Jun 15 16:47:09 2001 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htfuzzy/Accents.{h,cc}, htfuzzy/Fuzzy.c (getFuzzyByName),
htfuzzy/htfuzzy.cc (main, usage), htfuzzy/Makefile.in: Added
latest version of Robert Marchand's accents fuzzy match algorithm.
* htcommon/defaults.cc: Added accents_db attribute for this.
* htsearch/htsearch.cc: Fixed parsing of search_algorithm not to
use comma as separator, because it may be needed as decimal point
in some locales.
Fri Jun 15 16:30:19 2001 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htfuzzy/Endings.cc (getWords): Undid change introduced in 3.1.3,
in part. It now gets permutations of word whether or not it has
a root, but it also gets permutations of one or more roots that
the word has, based on a suggestion by Alexander Lebedev.
* htfuzzy/EndingsDB.cc (createRoot): Fixed to handle words that have
more than one root.
* installdir/english.0: Removed P flag from wit, like and high, so
they're not treated as roots of witness, likeness and highness, which
are already in the dictionary.
Thu Jun 7 17:09:46 2001 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htcommon/defaults.cc: Add new attribute use_doc_date to use
document meta information for the DocTime() field.
* htdoc/attrs.html, htdoc/cf_byname.html, htdoc/cf_byprog.html:
Document it.
* htdig/HTML.cc(do_tag): Call Retriever::got_time if use_doc_date
is set and we run across a META date tag.
* htdig/Retriever.h, htdig/Retriever.cc: Add new got_date
function. When called, sets the DocTime field of the DocumentRef
after parsing is completed. Currently assumes ISO 8601 format for
the date tag.
Thu Jun 7 16:48:13 2001 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htcommon/defaults.cc: Add new attribute any_keywords to allow
ORing of keywords input parameter.
* htsearch/htsearch.cc (addRequiredWords): Use it. Fix handling
of empty search word list.
* htsearch/Display.cc (excerpt, highlight): Fix handling of case
where "words" is empty but "keywords" isn't.
* htdoc/attrs.html, htdoc/cf_byname.html, htdoc/cf_byprog.html:
Document any_keywords.
Thu Jun 7 16:34:41 2001 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htcommon/defaults.cc: Add new attribute plural_suffix to set the
language-dependent suffix for PLURAL_MATCHES contributed by Jesse.
* htsearch/Display.cc (setVariables): Use it.
* htdoc/attrs.html, htdoc/cf_byname.html, htdoc/cf_byprog.html:
Document it.
Thu Jun 7 16:03:17 2001 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htsearch/Display.{h,cc}, htcommon/defaults.cc: Added multi-excerpt
feature and max_excerpts attribute, as contributed by Jim Cole.
* htdoc/THANKS.html, htdoc/attrs.html, htdoc/cf_byname.html,
htdoc/cf_byprog.html: Credit where due, and document attribute.
Thu Jun 7 15:27:33 2001 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htdig/ExternalParser.cc: Backported from 3.2.0b3, fixing these
problems: no longer confused by "; charset=..." in Content-Type,
avoids security problems with popen() and shell parsing untrusted URL
(PR#542, PR#951), avoids predictable temporary file name if mkstemp()
exists, binary output from external converter no longer mangled,
less ambiguous error messages, opens temp. file in binary mode on
non-Unix systems.
Thu Jun 7 15:10:14 2001 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htcommon/DocumentDB.{h,cc}: Replace CreateSearchDB() with DumpDB(),
add LoadDB(), both backported from 3.2.0b3.
* htdig/htdig.cc (main, usage), htdig/Makefile.in, htdoc/htdig.html:
Add handling of -m (minimal) option, file input for URLs, and arg 0
handling for htdump & htload.
* htdig/HTML.cc (do_tag): Change all white space to blanks in meta
description tag, for proper ASCII record dumps by htdump, and to fix
bug #405771.
* htlib/String.cc (= operator), htlib/htString.cc: change handling
of 0 length strings. Add readLine() for htload support.
Thu Jun 7 14:41:42 2001 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htdig/Retriever.cc (got_href): Fix hop count mishandling.
Thu Jun 7 14:23:47 2001 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htmerge/db.cc (mergeDB), htmerge/words.cc (mergeWords),
installdir/rundig: Fix various htmerge bugs. Quotes the temp.
directory name and word_list name (PR#872). Correctly handles
words beginning with +, - and ! when in extra_word_characters
(PR#952). Corrects problems with bad wordlists generated by
htmerge -m causing it to lose entries in words.db and problems
with the sort program using non-ASCII collating having a similar
effect.
Thu Jun 7 14:13:56 2001 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htsearch/htsearch.cc (main), htsearch/Display.cc (setVariables,
createURL, buildMatchList), htdoc/THANKS.html, htdoc/hts_form.html,
htdoc/hts_templates.html: Add Mike Grommet's date range search
feature.
Thu Jun 7 13:57:06 2001 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htdig/Retriever.cc (GetLocal, GetLocalUser): Fix to allow compiling
on AIX & other non-GNU compilers.
Thu Jun 7 13:52:20 2001 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htsearch/Display.cc (setVariables): Extend the handling of
build_select_lists to handle select multiple, radio buttons and
checkboxes.
* htdoc/attrs.html, htdoc/hts_selectors.html: Describe this.
Thu Jun 7 13:40:13 2001 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htfuzzy/Exact.cc (Exact), htfuzzy/Prefix.cc (Prefix): Set the
name field to the class name, as suggested by Jesse.
Thu Jun 7 13:27:35 2001 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* contrib/htdig-3.1.6.spec, contrib/htdig-3.1.6-conf.patch,
htdoc/where.html, .version, README: Bump to version 3.1.6.
Thu Jun 7 11:58:28 2001 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* contrib/multidig/*: Backport from 3.2.0b3, including fixes below.
* contrib/multidig/Makefile, gen-collect, db.conf, multidig.conf:
Add missing trailing newlines as pointed out by Doug Moran
<dmoran@dougmoran.com>.
* contrib/multidig/Makefile (install): Make sure scripts have a+x
permissions. Pointed out by Doug Moran.
* contrib/multidig/new-collect: Fix typo to ensure MULTIDIG_CONF
is set correctly.
Thu Jun 7 11:37:52 2001 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* contrib/README: Add in descriptions for web site contrib directory,
acroconv.pl & conv_doc.pl.
* contrib/examples/rundig.sh: Update to most recent version for 3.1.x.
* contrib/htparsedoc/htparsedoc: Add in contributed bug fixes from
Andrew Bishop to work on SunOS 4.x machines.
* contrib/acroconv.pl: Added external converter script to convert
PDFs with acroread.
Thu Jun 7 10:41:05 2001 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htlib/ParsedString.cc (get), htsearch/Display.cc (expandVariables):
Use isalnum() instead of isalpha() to allow digits in attribute and
variable names, allow '-' in variable names too for consistency.
Wed Jun 6 17:13:49 2001 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htdig/HTML.cc (do_tag): Make parsing of meta robots tag case
insensitive.
Wed Jun 6 15:31:00 2001 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* contrib/doc2html/DETAILS, contrib/doc2html/README,
contrib/doc2html/doc2html.cfg, contrib/doc2html/doc2html.sty,
contrib/doc2html/doc2html.pl, contrib/doc2html/pdf2html.pl,
contrib/doc2html/swf2html.pl: Added version 3.0 of doc2html,
contributed by David Adams <D.J.Adams@soton.ac.uk>.
Mon Jun 4 10:31:45 CEST 2001 Gabriele Bartolini <angusgb@users.sourceforge.net>
* htdoc/cf_byname.html: I forgot to insert the 'restrict' attribute.
Wed May 30 11:30:43 2001 Gabriele Bartolini <angusgb@users.sourceforge.net>
* htsearch/htsearch.cc: Added two new attributes used by htsearch:
restrict and exclude. They give more control over template
customisation through configuration files, making it possible
to restrict or exclude URLs from a search without passing
any CGI variables (although a CGI specification, when given,
overrides the configuration one).
* htcommon/defaults.cc: ditto
* htdoc/attrs.html: ditto
* htdoc/cf_byname.html: ditto
* htdoc/cf_byprog.html: ditto
* htdoc/hts_form.html: ditto
Sat May 5 21:43:32 2001 Geoff Hutchison <ghutchis@wso.williams.edu>
* configure.in, configure: Add tests for wait.h, sys/wait.h,
mkstemp() and malloc.h.
* acconfig.h, include/htconfig.h.in: Update with autoheader for
new tests.
* htlib/regex.[h,c]: Update with backports from 3.2.0b4 development.
Tue Feb 29 23:04:04 2000 Geoff Hutchison <ghutchis@wso.williams.edu>
* htlib/DB2_db.cc (Error): Simply fprint the error message on
stderr. This is not a method since the db.h interface expects a C
function.
(db_init): Don't set db_errfile, instead set errcall to point to
the new Error function.
Fri Feb 25 10:11:50 2000 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htdoc/attrs.html (maximum_pages): Describe new behaviour (as of
3.1.4), where this limits total matches shown.
Thu Feb 24 20:24:24 2000 Geoff Hutchison <ghutchis@wso.williams.edu>
* htdoc/FAQ.html: Update to refer to 3.1.5 and edit comments about 3.2.
Thu Feb 24 15:20:08 2000 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htdoc/RELEASE.html, htdoc/main.html: Updated notes for 3.1.5 release.
Thu Feb 24 10:37:45 2000 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htdoc/attrs.html (external_parsers): Add references to FAQ 4.8 & 4.9.
(local_default_doc): Give an expanded example.
(logging): Explain log entry format.
(star_blank): Fix some old typos (incorrect references to other attrs.)
Wed Feb 23 13:58:24 2000 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htcommon/cgi.cc(init): Fixed bug: array must be freed by
delete [] buf, not just delete buf; (from Vadim).
* installdir/syntax.html: Fixed a $(WORDS) I'd missed earlier.
Tue Feb 22 12:40:22 2000 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htdoc/RELEASE.html, htdoc/main.html: Updated notes for 3.1.5 release.
* htlib/URL.cc (URL, normalizePath): Fix PR#779, to handle relative
URLs correctly when there's a trailing ".." or leading "//".
Thu Feb 17 15:58:53 2000 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htdoc/RELEASE.html, htdoc/main.html: Add notes for 3.1.5 release.
* htdoc/TODO.html, htdoc/author.html, htdoc/bugs.html,
htdoc/cf_general.html, htdoc/cf_types.html, htdoc/cf_variables.html,
htdoc/config.html, htdoc/howitworks.html, htdoc/htdig.html,
htdoc/htfuzzy.html, htdoc/htmerge.html, htdoc/htnotify.html,
htdoc/hts_form.html, htdoc/hts_general.html, htdoc/hts_method.html,
htdoc/install.html, htdoc/isp.html, htdoc/mailing.html,
htdoc/meta.html, htdoc/notification.html, htdoc/require.html,
htdoc/uses.html, htdoc/where.html: Update copyright date and fix
last modified date for automatic CVS update.
Thu Feb 17 14:37:18 2000 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* installdir/htdig.conf: quote all HTML tag parameters.
* htsearch/TemplateList.cc (createFromString), installdir/long.html,
installdir/short.html: Use $&(URL) in templates.
Thu Feb 17 14:01:34 2000 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* contrib/htdig-3.1.5.spec: Fix silly typos in %post script,
make cron script a %config file.
Thu Feb 17 10:34:05 2000 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
[ Improve htsearch's HTML 4.0 compliance ]
* htsearch/TemplateList.cc (createFromString): Use file name rather
than internal name to select builtin-* templates, use $&(TITLE) in
templates and quote HTML tag parameters.
* installdir/long.html, installdir/short.html: Use $&(TITLE) in
templates and quote HTML tag parameters.
* htsearch/Display.cc (setVariables): quote all HTML tag parameters
in generated select lists.
* installdir/footer.html, installdir/header.html,
installdir/nomatch.html, installdir/search.html,
installdir/syntax.html, installdir/wrapper.html:
Use $&(var) where appropriate, and quote HTML tag parameters.
Thu Feb 17 10:00:26 2000 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* contrib/htdig-3.1.5.spec: Fix %post script to add more descriptive
htdig.conf entries.
Wed Feb 16 16:26:05 2000 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* contrib/htdig-3.1.5.spec, contrib/htdig-3.1.5-conf.patch,
htdoc/where.html, .version, README: Bump to version 3.1.5.
* htdoc/THANKS.html: Added new contributors.
* htdoc/FAQ.html, htdoc/main.html: Updated to versions from web site.
Wed Feb 16 15:49:28 2000 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htlib/Configuration.h, htlib/Configuration.cc: split Add() method
into Add() and AddParsed(), so that only config attributes get parsed.
Use AddParsed() only in Read() and Defaults().
Wed Feb 16 15:02:47 2000 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htlib/URL.h (encodeURL): Change list of valid characters to
include only unreserved ones.
* htlib/cgi.cc (init): Allow "&" and ";" as input parameter separators.
* htsearch/Display.cc (createURL): Encode each parameter separately,
using new unreserved list, before piecing together query string, to
allow characters like "?=&" within parameters to be encoded.
Wed Feb 16 14:42:02 2000 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htsearch/Display.cc (encodeSGML, excerpt): Add encoding for
characters that could pose problems in HTML output.
* htsearch/Display.cc (expandVariables, outputVariables): Add support
for $&(var) and $%(var) template variable references. This should
fix PR#750, once we use this in common/*.html.
Tue Feb 15 17:21:08 2000 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
[ Applied a whole collection of patches and fixes from the archives ]
* htdig/Server.cc (robotstxt): apply more rigorous parsing of
multiple user-agent fields, and use only the first one.
* htdig/Retriever.cc(GetLocal, GetLocalUser): Add URL-decoding
enhancements to local_urls, local_user_urls & local_default_doc,
to allow hex encoding of special characters.
* htdoc/attrs.html: Document these.
* htdig/Retriever.cc (IsValidURL): Fix problem with
valid_extensions when an "extension" would include part of a
directory path or server name, as contributed by Warren Jones.
Also fix problem with valid_extensions matching failure when URL
parameters follow extension, as reported by fxbois@cybercable.fr.
* htdig/Document.cc (RetrieveLocal), htdig/Document.h,
htdig/Retriever.cc(Initial, parse_url, GetLocal, GetLocalUser,
IsLocalURL, got_href, got_redirect), htdig/Retriever.h,
htdig/Server.cc(Server), htdig/Server.h: Apply Paul B. Henson's
enhancements to local_urls, local_user_urls & local_default_doc.
* htdoc/attrs.html: Document these.
* htsearch/htsearch.cc (setupWords): Fix problem reported by
D.J. Adams, in which bad_words removal failed on upper-case
search words.
* htsearch/Display.cc(setVariables), htcommon/defaults.cc: Added
build_select_lists attribute, to generate selector menus in forms.
* htdoc/hts_selectors.html: Added this page to explain this new
feature, plus other details on select lists in general.
* htdoc/hts_templates.html: Added relevant links to related attributes
and selectors documentation.
* htdoc/attrs.html, htdoc/cf_by{name,prog}.html: Added relevant
explanations and links to selectors documentation.
* htlib/QuotedStringList.cc (Create): fix PR#743, where quoted string
lists didn't allow embedded quotes of opposite sort in strings
(e.g. "'" or '"'), and fix to avoid overrunning end of string
if it ends with backslash.
* htcommon/WordList.cc (valid_word): Applied Marc Pohl's fix to make
this 8-bit clean on Solaris.
* contrib/conv_doc.pl, contrib/parse_doc.pl: Applied Warren Jones's
changes to these scripts.
* htdig/PDF.cc (parseNonTextLine): Fix bogus escape sequences
around Title parsing. (Fixes PR#740)
* htsearch/Display.cc (display, displaySyntaxError),
htdoc/attrs.html, htdoc/cf_byname.html, htdoc/cf_byprog.html,
htcommon/defaults.cc: Add new attribute "nph" to send out
non-parsed headers for servers that do not supply HTTP headers on
CGI output (e.g. IIS). If nph is set, send out HTTP OK header,
as suggested by Matthew Daniel <mdaniel@scdi.com> (PR#727)
* htdig/Document.cc (getdate): avoid strftime() altogether on
filled-in tm structure, to avoid recurring segfault problems. (PR#734)
* htlib/strptime.cc (mystrptime): Use Warren Jones's fix to deal
with a web server that returns dates with a two digit year field.
(Fixes PR#770)
* htdig/HTML.cc (HTML, parse, do_tag), htcommon/defaults.cc,
htdoc/attrs.html, htdoc/cf_byname.html, htdoc/cf_byprog.html:
Add max_keywords attribute to limit meta keyword spamming.
Wed Dec 8 18:19:32 1999 Geoff Hutchison <ghutchis@wso.williams.edu>
* htdoc/FAQ.html, htdoc/bugs.html: Update to refer to latest versions.
(Update for 3.1.4 release.)
Wed Dec 8 18:10:27 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htlib/QuotedStringList.cc (Create): Make sure that an empty
token isn't ignored.
Tue Dec 7 10:26:58 1999 Geoff Hutchison <ghutchis@wso.williams.edu>
* htsearch/Display.cc (setVariables): Fix a compilation error by
making a statement with '?' an explicit if-else statement.
* htdoc/RELEASE.html: Change case_sensitive fix to a bug-fix,
update release date for 12/9/99. (We certainly didn't release yesterday!)
Mon Dec 6 22:17:21 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htsearch/Display.cc(Display): Add missing call to setupTemplates(),
for handling template_patterns. Oops!
* htdoc/attrs.html: Fixed a couple typos in new attributes.
* htdoc/ChangeLog: Update to latest version.
Mon Dec 6 16:41:04 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htdoc/main.html: Update news with latest version.
* htdig/htdig.cc(main), htdig/Document.cc(Document),
htcommon/defaults.cc, htdoc/attrs.html, htdoc/cf_byname.html,
htdoc/cf_byprog.html: Add authorization attribute, settable by
htdig -u. Also fixes PR#490, by setting authentication before
robots.txt fetched.
* htdoc/RELEASE.html: Update with latest fix.
Fri Dec 3 17:31:47 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htcommon/DocumentRef.cc(Clear): Set docHopCount & docSig to 0,
and clear docEmail, docNotification & docSubject strings to have
a clean slate for Deserialize(), which assumes 0/empty for these.
Fixes problem with hop counts getting clobbered.
* htdoc/RELEASE.html: Update with latest fix.
* htdoc/ChangeLog: Update to latest version.
Fri Dec 3 12:12:19 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htdig/Document.cc: removed vestiges of internal Postscript
support that never worked, and removed test for application/msword,
which is handled only by external parser.
* htdig/Makefile.in: removed Postscript.o from list.
* htdig/Retriever.cc(parse_url): Fix compilation error;
(Initial, got_href, got_redirect): Try to get the local filename
for a server's robots.txt file and pass it along to the newly
generated server.
* htdig/Server.cc(Server): Retrieve the robots.txt file from the
filesystem when possible; fix compilation error.
* htdig/Server.h(Server): Add local_robots_file parameter to Server().
* htlib/HtWordType.h, htlib/HtWordType.cc: fix compilation errors.
Fri Dec 3 10:52:57 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htdig/HTML.cc(parse, do_tag): Add handling of <img alt=...> text,
fix parsing of words in meta tags, disable indexing of meta tags
when "noindex" state in effect, fix calculations of word positions
to more accurately reflect relative positions.
* htlib/HtWordType.h, htlib/HtWordType.cc: Add HtWordToken() function,
to replace strtok() in HTML parser.
* htdoc/RELEASE.html: Update with latest fixes.
Fri Dec 3 09:02:55 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htlib/Configuration.cc(Add): handle strings in single quotes, as in
parm='value'.
Thu Dec 2 16:14:28 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htdoc/attrs.html: Add Tom Metro's suggested revisions for pdf_parser
and external_parsers.
Thu Dec 2 15:15:03 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htdoc/mailing.html: Updated to version from htdig.org web site.
* htcommon/defaults.cc: Add missing no_page_number_text and
page_number_text attribute definitions.
* htdoc/attrs.html(modification_time_is_now): Make the description
a bit clearer as to how it may cut down on reindexing.
Thu Dec 2 13:46:11 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htdig/Retriever.cc(parse_url), htdig/Server.cc(Server),
htcommon/defaults.cc, htdoc/attrs.html, htdoc/cf_byname.html,
htdoc/cf_byprog.html: Add support for local_urls_only attribute.
* htdoc/RELEASE.html: Update with latest feature.
Thu Dec 2 11:02:07 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htlib/URL.cc(ServerAlias): Fix server_aliases processing to prevent
infinite loop (as for local_urls in PR#688).
Wed Dec 1 17:23:24 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htdig/Retriever.cc(parse_url), htdig/Server.h: add IsDead() methods
to query and set server status, use them in Retriever to avoid repeated
HTTP requests to a dead server. (Needed for persistent local stuff.)
Wed Dec 1 16:56:28 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htdig/Retriever.cc(GetLocal): Fix error in GetLocalUser() return
value check, as suggested by Vadim.
* contrib/conv_doc.pl: Added a sample external converter script.
* htdoc/THANKS.html: A couple more additions.
Tue Nov 30 15:02:25 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htdig/Retriever.cc(IsValidURL): Fix compilation error in
valid_extensions list handling.
* contrib/htdig-3.1.4.spec, contrib/htdig-3.1.4-conf.patch:
Added sample RPM spec file and config patch for it.
Tue Nov 30 14:01:51 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htdoc/where.html: Bump to version 3.1.4.
* htdoc/THANKS.html: Added new contributors.
* htdoc/isp.html, htdoc/uses.html, htdoc/main.html, htdoc/mailing.html:
Updated to versions from htdig.org web site.
Tue Nov 30 13:01:20 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htdoc/RELEASE.html: Add release notes for 3.1.4 release.
* .version, README: Bump for 3.1.4.
Tue Nov 30 11:03:34 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htdoc/attrs.html(backlink_factor): Added Geoff's clarification of
what this attribute does.
Tue Nov 30 09:47:05 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htdig/Document.cc(RetrieveLocal): Handle common extensions for
text/plain, application/pdf & application/postscript.
* htdig/Retriever.cc(IsValidURL): Add valid_extensions list handling,
make it and bad_extensions case insensitive.
* htcommon/defaults.cc: Add config attribute valid_extensions,
with default as empty.
* htdoc/attrs.html, htdoc/cf_by{name,prog}.html: Document it.
Tue Nov 30 09:02:02 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htdig/Retriever.cc(got_href & got_redirect): remove all of Patrick's
case insensitive server code, to replace it with Geoff's fix to URL.cc
* htlib/URL.cc(normalizePath, path): If not case_sensitive,
lowercase the URL. Should ensure that all URLs are appropriately
lowercased, regardless of where they're generated.
Mon Nov 29 20:25:01 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htdig/Retriever.cc, htdig/Retriever.h, htdig/Server.cc(push),
htdig/Server.h: added Alexis's patch for persistent local digging
even if HTTP server is down. Also made new GetLocal() method
call GetLocalUser() itself, to simplify its use, and made it
non-private, for eventual use by Server code.
Mon Nov 29 19:18:20 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htdig/Retriever.cc(got_href & got_redirect): corrections to case
insensitive server fix, to handle redirects, to make more thorough
use of mapped URL, and to update it after normalization.
Fri Nov 26 17:14:46 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htdig/Document.cc(RetrieveHTTP): always c.close() the connection
when returning.
* htdig/HTML.cc(HTML & do_tag): add code to turn off indexing between
<style> and </style> tags.
Fri Nov 26 16:31:06 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htlib/Configuration.cc(Read): fixed to allow final line without
terminating newline character, rather than ignoring it.
* htlib/String.cc(write): added Alexis Mikhailov's fix to bump up
pointer after writing a block.
* htsearch/Display.cc(setVariables): added Alexis Mikhailov's fix
to check the number of pages against maximum_pages at the right time.
(Put it even earlier, to make sure nPages is at least 1.)
* htsearch/Display.cc(generateStars): Remove extra newline after
STARSRIGHT and STARSLEFT variables, noted by Torsten Neuer
<tneuer@inwise.de>.
Wed Nov 24 20:33:13 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* installdir/htdig.conf: Add bad_extensions to make it
more obvious to users how to exclude certain document types.
Fix the comments for search_algorithm to refer to all the current
possibilities. Add example of no_excerpt_show_top attribute in
line with most users' expectations. (Geoff's changes)
Wed Nov 24 20:02:32 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* installdir/search.html (Match): Add Boolean to default search
form, as suggested by PR#561.
Tue Nov 23 23:03:45 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htsearch/Display.cc(setupTemplates), htsearch/Display.h: fixed a
couple of compilation errors in template_patterns code.
Tue Nov 23 22:16:31 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htdig/Retriever.cc(got_href): Applied Patrick's case insensitive
server fix, to lowercase all URLs if case_sensitive is false.
Tue Nov 23 22:08:22 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htlib/StringList.cc(Join): Applied Loic's patch to fix memory leak.
Tue Nov 23 21:52:18 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
[Applied patch from Hanno Mueller <kontakt@hanno.de>, which includes...]
* contrib/README: Add scriptname directory.
* contrib/scriptname/*: An example of using htsearch within
dynamic SSI pages
* htcommon/defaults.cc: Add script_name attribute to override
SCRIPT_NAME CGI environment variable.
* htdoc/FAQ.html: Update question 4.7 based on including htsearch
as a CGI in SSI markup.
* htdoc/attrs.html, htdoc/cf_byname.html, htdoc/cf_byprog.html,
htdoc/hts_templates.html: Update based on behavior of script_name
attribute.
* htsearch/Display.cc: Set SCRIPT_NAME variable to the script_name
attribute if set, and to the CGI environment variable otherwise.
Tue Nov 23 21:29:03 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htdoc/FAQ.html: Added the past few months' updates to the FAQ.
Tue Nov 23 21:20:35 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htcommon/defaults.cc, htsearch/Display.h, htsearch/Display.cc,
htdoc/attrs.html, htdoc/cf_byname.html, htdoc/cf_byprog.html,
htdoc/hts_templates.html: add template_patterns attribute, to select
result templates based on URL patterns.
Tue Nov 23 20:52:38 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htlib/cgi.h, htlib/cgi.cc(cgi & init), htsearch/htsearch.cc
(main & usage): allow a query string to be passed as an argument.
Tue Nov 23 20:35:05 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htsearch/Display.cc(setVariables & createURL),
htsearch/htsearch.cc(main), htdoc/hts_templates.html: handle keywords
input parameter like others, and make it propagate to followups.
Tue Nov 23 20:25:45 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htdoc/attrs.html: removed vestigial references to MAX_MATCHES
template variables in search_results_{header,footer}.
* htdoc/hts_form.html: add disclaimer about keywords parameter not
being limited to meta keywords.
* htdoc/meta.html: add description of "keywords" meta tag property.
add links to keywords_factor & meta_description_factor attributes.
Tue Nov 23 20:07:20 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htsearch/Display.cc(setVariables & hilight): added Sergey's idea
for start_highlight, end_highlight & page_number_separator attributes.
* htcommon/defaults.cc: added defaults for these.
* htdoc/attrs.html, htdoc/cf_by{name,prog}.html: documented them.
Tue Nov 23 19:58:28 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htdig/ExternalParser.cc: added support for external converters
as extension to external_parsers attribute.
* htdoc/attrs.html: Updated external_parsers with new description
and examples of external converters.
Tue Nov 23 19:52:27 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htdig/HTML.cc(transSGML), htdig/SGMLEntities.cc(translateAndUpdate):
Fix the infamous problem in htdig 3.1.3 of mangling URL parameters that
contain bare ampersands (&), and not converting &amp; entities in URLs.
* htdig/Retriever.cc(IsLocal & IsLocalUser): Fix PR#688, where
htdig goes into an infinite loop if an entry in local_urls
(or local_user_urls) is missing a '=' (or a ',').
* htcommon/cgi.cc(cgi): Fix bug in reading long queries via POST
method (PR#668).
* htnotify/htnotify.cc(send_notification): apply Jason Haar's fix
to quote the sender name "ht://Dig Notification Service".
Wed Sep 22 11:12:38 1999 Geoff Hutchison <ghutchis@wso.williams.edu>
* htdoc/ChangeLog, htdoc/isp.html, htdoc/FAQ.html,
htdoc/RELEASE.html, htdoc/THANKS.html, htdoc/attrs.html,
htdoc/bugs.html, htdoc/contents.html, htdoc/main.html,
htdoc/require.html, htdoc/uses.html, htdoc/where.html: Update for
3.1.3 release and synch with latest versions from the website.
Wed Sep 15 17:54:31 1999 Alexander Bergolth <leo@leo.wu-wien.ac.at>
A few changes to satisfy the AIX xlC compiler:
* htdig/htdig.cc: Moved variable declaration out of case block.
* configure.in, htconfig.in: Add check for sys/select.h.
Add "long unsigned int" to the possible getpeername_length types.
* htlib/Connection.cc: Include sys/select.h.
Sun Sep 12 15:02:19 1999 Geoff Hutchison <ghutchis@wso.williams.edu>
* .version: Bump for 3.1.3.
* README: Bump first line for 3.1.3 release, remove mention of rx
directory.
* htdoc/ChangeLog: Update with latest version.
* htdoc/RELEASE.html: Add release notes for 3.1.3 release.
Thu Sep 9 14:52:19 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* contrib/parse_doc.pl: fix bug in pdf title extraction.
Wed Sep 1 15:58:14 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htdig/Retriever.cc(got_word): add code to check for compound words
and add their component parts to the word database.
* htdig/PDF.cc(parseString), htdig/Plaintext.cc(parse): Don't strip
punctuation or lowercase the word before calling got_word. That
should be left up to got_word & Word methods.
* htlib/StringMatch.h, htlib/StringMatch.cc(Pattern, IgnoreCase):
Add an IgnorePunct() method, which allows matches to skip over valid
punctuation, change Pattern() and IgnoreCase() to accommodate this.
* htsearch/htsearch.cc(main, createLogicalWords): use IgnorePunct()
to highlight matching words in excerpts regardless of punctuation,
toss out old origPattern, and don't add short or bad words to
logicalPattern.
* htlib/HtWordType.h, htlib/HtWordType.cc(Initialize): set up and
use a lookup table to speed up HtIsWordChar() and HtIsStrictWordChar().
Wed Sep 1 15:48:13 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htdig/PDF.cc(parse), htcommon/defaults.cc, htdoc/attrs.html:
Fix PDF.cc to handle acroread in Acrobat 4, which has a bug with
the -pairs option. It turns out that even without the -pairs
option, acroread 4 is still prone to segmentation violations when
generating PostScript, so acroread 3 is a better choice anyway.
* htdoc/FAQ.html: Added the past few months' updates to the FAQ.
* contrib/parse_doc.pl: Updated to latest version, adapted for
xpdf 0.90.
Wed Sep 1 15:39:41 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
Applied "bugfixes" patch collection, which I had posted to
htdig@htdig.org mailing list in August. Changes include...
* htsearch/Display.cc(expandVariables): Fix problem with $(VAR)
at end of template string not being expanded.
* htlib/URL.cc(URL): Fix PR#566 by setting the correct length of the
string being matched. 'http://' is 7 characters. Submitted by
<wolfgang.pichler@creditanstalt.co.at>.
* htdig/HTML.h, htdig/HTML.cc(do_tag, transSGML): Fix the HTML parser
to decode SGML entities within tag attributes.
* htlib/URL.cc(ServerAlias): Fix server_aliases entries so port
defaults to 80 if omitted.
* htlib/URL.cc(removeIndex): Fix the infamous problem with files
like left_index.html not getting indexed. PR#543 & PR#585.
* htdig/PDF.cc(parseNonTextLine): Fixed a bug in the PDF parser:
when the Title header was just the temporary file name, it
wouldn't be used, but it also wouldn't be cleared from the
_parsedString variable, so it ended up polluting the document
excerpt.
* htdig/Document.cc(RetrieveHTTP): Added error messages for unknown
hosts.
* htlib/cgi.cc(cgi): Fix PR#572, where htsearch crashed if
CONTENT_LENGTH was not set but REQUEST_METHOD was.
* htdig/HTML.cc(do_tag): Fix <meta> robots parsing to allow
multiple directives to work correctly. Fixes PR#578, as provided
by Chris Liddiard <c.h.liddiard@qmw.ac.uk>.
* htsearch/htsearch.cc(main): Allow multiple keywords input
parameters in search forms.
* htdig/Document.cc(Reset, readHeader): Fix the bug in the handling
of modification_time_is_now.
* htfuzzy/Fuzzy.cc(getWords), htfuzzy/Metaphone.cc(vscode,generateKey):
Should fix PR#514 in the bug database. It's Geoff's first attempt,
with a minor correction, plus an added test in the vscode macro,
which is where the problem seemed to be happening. This won't
map accented vowels to their unaccented counterparts, but
it should hopefully put an end to the segmentation faults.
* include/htconfig.h.in, htcommon/WordReference.h,
htcommon/WordList.cc(Word, Flush, BadWordFile),
htcommon/DocumentRef.cc(AddDescription), htcommon/defaults.cc,
htsearch/parser.cc(perform_push), htdoc/attrs.html,
htdoc/cf_byname.html, htdoc/cf_byprog.html: Change the maximum word
length into a run-time option, rather than compile-time.
* htsearch/Display.cc(displayMatch): Applied Torsten Neuer's
<tneuer@inwise.de> fix for PR#554.
* htdig/HTML.cc(HTML, do_tag): Added support for <embed>, <object>
and <link> tags.
* htdig/htdig.cc(main): Applied Geoff's patch to hide the
username/password in the command line arguments.
* htdig/Document.cc(readHeader): Fixed a few problems with header
parsing, including PR#535 & PR#557.
* htdig/Document.cc(getdate): This should help with PR#81 & PR#472,
where strftime() would crash on some systems. Idea submitted
by benoit.sibaud@cnet.francetelecom.fr.
* COPYING, htdoc/COPYING, Makefile.in: Updated the FSF address
in COPYING & Makefile.in. PR#595.
* htdig/Retriever.cc(IsValidURL): Fix PR#493, to avoid rejecting
a valid URL with ".." in it.
* htlib/URL.cc(parse): Fix PR#348, to make sure a missing
or invalid port number will get set correctly.
* htsearch/Display.h, htsearch/Display.cc(excerpt): Fix declaration
to refer to "first" as reference--ensures ANCHOR is properly set.
Fixes PR#541 as suggested by <pmb1@york.ac.uk>.
* htdig/ExternalParser.cc(parse): Quote the filename before passing
it to the command-line to prevent shell escapes. Fixes PR#542.
Also make error messages more useful.
* htfuzzy/Endings.cc(getWords): Suffix-handling improvement (PR#560),
to prevent inappropriate suffix stripping in endings fuzzy matches.
* htlib/URLTrans.cc(encodeURL): Fix encoding so all non-ascii
characters get hex-encoded. I think this is what PR#339 was all about.
* htdoc/attrs.html, htdoc/cf_byname.html, htdoc/cf_byprog.html:
Added descriptions for attributes that were missing, added
a few clarifications, and corrected a few defaults and typos.
Covers PR#558, PR#626, and then some.
* configure.in, configure, include/htconfig.h.in, htlib/regex.c:
Fix PR#545, to test for presence of alloca.h
Wed Apr 21 22:45:16 1999 Geoff Hutchison <ghutchis@wso.williams.edu>
* .version: Bump for final 3.1.2 release.
* htdoc/where.html, htdoc/FAQ.html: Update to mention the new release.
Tue Apr 20 13:34:22 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htdoc/RELEASE.html: Fixed a few typos, updated modification date.
Tue Apr 20 10:54:59 1999 Geoff Hutchison <ghutchis@wso.williams.edu>
* htdoc/RELEASE.html: Add notes on changes in the 3.1.2 release.
* htdoc/contents.html, htdoc/mailarchive.html, htdoc/where.html,
htdoc/uses.html: Update with versions from maindocs.
* installdir/htdig.conf: Add example max_doc_size attribute to cut
down on FAQ, also add comment on including a file for start_url.
Mon Apr 19 15:40:24 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htcommon/WordList.cc(valid_word): fixed to avoid having the new
HtIsStrictWordChar() test circumvent the allow_numbers option by
allowing numbers all the time. Also fixed to allow HtIsStrictWordChar()
to override iscntrl(), so extra_word_characters can define characters
that a broken locale would define as control characters.
Mon Apr 19 15:17:12 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htcommon/WordList.cc(valid_word): fixed bug introduced Jan 9,
where it stopped scanning for control characters prematurely.
Now also use iscntrl() to detect all control characters.
Fri Apr 16 10:30:42 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htdoc/FAQ.html: fixed typo - use_meta_description was plural.
Wed Apr 14 20:22:31 1999 Alexander Bergolth <leo@leo.wu-wien.ac.at>
* htlib/regex.h: fixed compile problem with AIX xlc compiler
Tue Apr 13 13:01:04 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htsearch/Display.cc(generateStars): Set status to -1 if
URLimage.hasPattern() fails, to avoid empty URLimageList.
(Fix to Mar 31 change.)
Tue Apr 13 11:27:45 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htsearch/Display.h(class Display): move enum SortType up to public
section, to avoid problem compiling on IBM AIX C++ compiler.
Mon Apr 12 17:36:20 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htdoc/FAQ.html: added sections on indexing docs in other languages,
practical & theoretical limits of ht://Dig.
Fri Apr 9 16:47:34 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htdoc/FAQ.html: Fixed a few typos.
Fri Apr 9 16:24:21 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htdig/Document.cc(RetrieveHTTP): Show "Unable to build connection"
message at lower debug level.
Fri Apr 9 15:17:53 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htdoc/FAQ.html: Added changes in maindocs from Mar 18, a few
clarifications, and four new questions.
Wed Apr 7 19:41:12 1999 Geoff Hutchison <ghutchis@wso.williams.edu>
* htsearch/htsearch.cc (usage): Remove bogus -w flag.
Thu Apr 1 11:58:20 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htsearch/htsearch.cc(main): Apply Gabriele's patch to avoid using an
invalid matchesperpage CGI input variable.
* htsearch/Display.cc(display) & (setVariables): Correct any invalid
values for matches_per_page attribute to avoid div. by 0 error.
Wed Mar 31 18:21:21 1999 Geoff Hutchison <ghutchis@wso.williams.edu>
* htdig/htdig.cc: Undo March 30 change.
* htdig/Retriever.cc: Use excludes.hasPattern before using the
exclude list. (More elegant solution to problem, as pointed out by
Gilles.)
* htsearch/Display.cc: Remove code setting URLimage to a bogus
pattern. Instead, check that URLimage.hasPattern() before using
it.
Wed Mar 31 15:16:36 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htfuzzy/Synonym.cc: Fix previous fix of minor memory leak.
(db pointer wasn't properly set)
Tue Mar 30 20:08:18 1999 Geoff Hutchison <ghutchis@wso.williams.edu>
* htdig/htdig.cc: If exclude_urls attribute is set to empty, set
it to something that will never match a URL to ensure nothing is
excluded.
* Makefile.config.in: Fix typo leading to HTLIBS referring to itself.
Mon Mar 29 16:47:48 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htsearch/Display.cc(excerpt): Added patch from Gabriele to
improve display of excerpts--show top of description always,
otherwise try to find the excerpt.
Mon Mar 29 15:57:06 1999 Geoff Hutchison <ghutchis@wso.williams.edu>
* htdig/htdig.cc: Renamed from main.cc for consistency with other
directories.
* htdig/Makefile.in: Use it.
Mon Mar 29 12:53:17 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htlib/HtWordType.h (HtIsWordChar): Avoid matching 0 when using
strchr.
(HtIsStrictWordChar): Ditto. (Patch from Hans-Peter Nilsson)
Mon Mar 29 10:51:54 1999 Geoff Hutchison <ghutchis@wso.williams.edu>
* htlib/regex.h, htlib/regex.c: Include glibc versions of the
regex functions to override possibly buggy system versions.
* htlib/Makefile.in: Use them.
* htfuzzy/EndingsDB.cc: Use glibc regex functions instead of rx
for massive speedups on non-English affix files.
* configure, configure.in: Use the system timegm function if present.
Don't configure rx since we don't use it any more. Don't worry
about tsort since that was only needed for rx.
* Makefile.in, Makefile.config.in: Ignore the rx directory if present.
Thu Mar 25 12:24:18 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* installdir/long.html, installdir/short.html: Remove backslashes
before quotes in HTML versions of the builtin templates.
* Makefile.in: Add long.html & short.html to COMMONHTML list, so
they get installed in common_dir.
Thu Mar 25 11:45:59 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htsearch/Display.cc(displayMatch), htcommon/defaults.cc,
htdoc/attrs.html, htdoc/cf_byname.html, htdoc/cf_byprog.html:
Add date_format attribute suggested by Marc Pohl.
Thu Mar 25 09:49:33 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htsearch/Display.cc(displayMatch): Avoid segfault when DocAnchors
list has too few entries for current anchor number.
Wed Mar 24 12:20:02 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htdig/main.cc (main): Call HtWordType::Initialize. (Missed this
one yesterday. Oops!)
Tue Mar 23 17:11:46 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* backport Hans-Peter Nilsson's suite of changes for HtWordType
and extra_word_characters support, to 3.1.2...
* htlib/HtWordType.h (class HtWordType): New.
* htlib/HtWordType.cc: New.
* htlib/Makefile.in (OBJS): Add HtWordType.o
* htdoc/attrs.html: Document attribute extra_word_characters.
* htdoc/cf_byprog.html: Ditto.
* htdoc/cf_byname.html: Ditto.
* htcommon/defaults.cc (defaults): Add extra_word_characters.
* htsearch/htsearch.h: Lose spurious extern declaration of unused
variable valid_punctuation.
* htsearch/htsearch.cc (main): Call HtWordType::Initialize.
(setupWords): Use HtIsWordChar, HtIsStrictWordChar and
HtStripPunctuation. Do not read valid_punctuation.
* htsearch/Display.cc (excerpt): Use HtIsStrictWordChar.
* htlib/StringMatch.cc (FindFirstWord): Ditto.
(CompareWord): Ditto.
* htdig/Retriever.h (class Retriever): Lose member
valid_punctuation.
* htdig/Retriever.cc (Retriever): Lose its initialization.
* htdig/Postscript.h (class Postscript): Lose member
valid_punctuation.
* htdig/Postscript.cc (Postscript): Lose its initialization.
(flush_word): Use HtStripPunctuation.
(parse_string): Use HtIsWordChar,
HtIsStrictWordChar and HtStripPunctuation.
* htdig/Parsable.h (class Parsable): Lose member
valid_punctuation.
* htdig/Parsable.cc (Parsable): Lose its initialization.
* htcommon/WordList.cc (valid_word): Use HtIsStrictWordChar.
(BadWordFile): Use HtStripPunctuation. Do not read
valid_punctuation.
* htcommon/DocumentRef.cc (AddDescription): Use HtIsWordChar,
HtIsStrictWordChar and HtStripPunctuation. Do not read
valid_punctuation.
* htdig/PDF.cc (parseString): Similar.
* htdig/HTML.cc (parse): Similar.
* htdig/Plaintext.cc (parse): Similar.
Tue Mar 23 15:52:33 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* .version: Bump to 3.1.2-dev.
Tue Mar 23 14:50:37 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htlib/String.cc: Fix up code to be cleaner with memory
allocation, inline next_power_of_2, fix some memory leaks.
(Geoff's changes of Feb 22-25)
Tue Mar 23 14:35:37 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htlib/HtWordCodec.cc(HtWordCodec): Fix bug with constructing from
uninitialized variables!
* htlib/HtURLCodec.cc (~HtURLCodec): Add missing deletion of
myWordCodec.
Tue Mar 23 14:18:16 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htdig/PDF.cc(parseString): Use minimum_word_length instead of
hardcoded constant.
Tue Mar 23 12:02:00 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htsearch/Display.cc(generateStars): Add in support for use_star_image
which was lost when template support was put in way back when.
Tue Mar 23 11:47:52 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* Makefile.in: add missing ';' in for loops, between fi & done
Mon Mar 22 19:26:56 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htcommon/DocumentRef.cc(AddDescription): Check to see that
description isn't a null string and doesn't contain only whitespace
before doing anything.
Mon Mar 22 19:21:16 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htcommon/DocumentRef.h, htcommon/DocumentRef.cc: Fix #ifdef
problems with zlib.
Mon Mar 22 19:14:40 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htdoc/attrs.html (template_name): Typo; used by htsearch, not htdig.
Mon Mar 22 19:10:56 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htdig/Retriever.cc (got_href): Check if the ref is for the
current document before adding it to the db. (From H-P Nilsson, Mar 8)
Mon Mar 22 19:03:23 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htdoc/attrs.html: Rephrase and clarify entry for url_part_aliases.
(From Hans-Peter Nilsson, Mar 2)
Mon Mar 22 18:48:10 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htfuzzy/Synonym.cc: Fix minor memory leak.
* htlib/Dictionary.h, htlib/Dictionary.cc(hashCode): Check if key
can be converted to an integer using strtol. If so, use the
integer as the hash code. (Geoff's patch)
Mon Mar 22 18:23:11 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htlib/List.cc(Nth): Check for out-of-bounds requests before
doing anything.
Mon Mar 22 17:50:47 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htsearch/Display.cc(display): Free DocumentRef memory after
displaying them.
(displayMatch): Fix memory leak when documents did not have anchors,
fix problems when documents did not have descriptions.
Mon Mar 22 17:32:14 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htmerge/docs.cc(convertDocs): Replace previous verbose patch
with H-P Nilsson's.
Mon Mar 22 17:13:35 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htdig/Plaintext.cc, htmerge/words.cc: removed Log lines.
Mon Mar 22 16:11:31 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htsearch/htsearch.cc: Add patch from Jerome Alet <alet@unice.fr>
to allow '.' in config field but NOT './' for security reasons.
Mon Mar 22 15:56:55 1999 Geoff Hutchison <ghutchis@wso.williams.edu>
* installdir/long.html, installdir/short.html: Write out HTML
versions of the builtin templates. (committed to 3.1.2 by Gilles)
* installdir/htdig.conf: Add commented-out template_map and
template_name attributes to use the on-disk versions.
Mon Mar 22 15:13:33 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htcommon/defaults.cc, htdoc/attrs.html: Change default locale
to "C", as H-P Nilsson recommended.
* htlib/Configuration.cc(Add): Fix small memory leak in locale code,
as Geoff discovered.
Mon Mar 22 15:03:10 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* contrib/parse_doc.pl: uses pdftotext to handle PDF files,
generates a head record with punctuation intact, extra checks
for file "wrappers" & check for MS Word signature (no longer
defaults to catdoc), strip extra punct. from start & end of words,
rehyphenate text from PDFs, fix handling of minimum word length.
Mon Mar 22 14:38:01 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htdig/Plaintext.cc(parse): Use minimum_word_length instead of
hardcoded constant.
Mon Mar 22 14:33:45 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htlib/Configuration.cc(Add): Fix function to avoid infinite loop
on some systems, which don't allow all the letters in isalnum() that
isalpha() does, e.g. accented ones.
* htdig/HTML.cc: Fix three reported bugs about inconsistent
handling of space and punctuation in title, href description & head.
Now makes a distinction between tags that cause word breaks and those
that don't, and which of the latter add space.
Mon Mar 22 14:25:34 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htmerge/docs.cc: Make htmerge -vv report reasons for deleting docs.
* htmerge/words.cc(mergeWords): Fix to prevent description text
words from clobbering anchor number of merged anchor text words.
Fri Mar 19 17:09:21 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htdig/HTML.cc: Fix bug where noindex_start was empty, allow case
insensitive matching of noindex_start & noindex_end.
* htdoc/attrs.html, htdoc/cf_byname.html, htdoc/cf_byprog.html:
Fix inconsistencies in documentation for noindex_start & noindex_end.
Fri Mar 19 17:05:16 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htdig/HTML.cc: Add check for <a href=...> tag that is missing a
closing </a> tag, terminating it at next href.
Fri Mar 19 17:00:18 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htdig/Document.cc: Fix check of Content-type header in readHeader(),
correcting bug introduced Jan 10 (for PR#91), and check against
allowed external parsers.
* htdig/HTML.cc: More lenient comment parsing, allows extra dashes.
Fri Mar 19 16:52:51 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htdig/HTML.cc: Check for presence of more than one <title> tag.
* htlib/mytimegm.cc: Fix Y2K problems.
Fri Mar 19 16:43:28 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htdig/HTML.cc: Add patch from Gabriele to ensure META
descriptions are parsed, even if 'description' is added to the
keyword list.
Fri Mar 19 16:37:08 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htsearch/parser.h, htsearch/parser.cc: Clean up patch made for
error messages, made on Feb 16.
Tue Feb 16 23:48:09 1999 Geoff Hutchison <ghutchis@wso.williams.edu>
* configure.in, configure: Default to 'int' when we cannot
establish type used by getpeername.
* htdoc/RELEASE.html: Additional notes on everything fixed in 3.1.1.
Tue Feb 16 23:45:26 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* contrib/parse_doc.pl: Add replacement for less-capable (and
buggy) parse_word_doc.pl script. Handles Word, PS, RTF, and
WordPerfect files, with appropriate file->text converters.
* htsearch/parser.cc, htsearch/parser.h: Add more error messages
when the boolean expression is invalid.
Mon Feb 15 21:02:24 1999 Geoff Hutchison <ghutchis@wso.williams.edu>
* htdig/Document.cc(RetrieveLocal): Fix to ensure we report
reading only max_doc_size bytes, even when the document is larger.
* configure.in, configure: Add 'socklen_t' to getpeername check to
prevent problems configuring on Solaris 7.
* htdoc/RELEASE.html: Minor changes for 3.1.1 release.
Sun Feb 14 16:29:48 1999 Geoff Hutchison <ghutchis@wso.williams.edu>
* htdig/Document.cc(retrieveHTTP, retrieveLocal): Fix document
size when the document is larger than max_doc_size. Size should be
that sent by the server or as given by stat().
* htdoc/*.html: More cleanups from Marjolein.
Sat Feb 13 20:53:34 1999 Geoff Hutchison <ghutchis@wso.williams.edu>
* htdig/Retriever.cc(got_word): Ensure heading is in a normal range.
* htdoc/RELEASE.html: Added information on the bugs fixed in 3.1.1.
* htdoc/attrs.html: Added info on the changed syntax of the pdf_parser
attribute in 3.1.0 and later.
Sat Feb 13 20:29:26 1999 Marjolein Katsma <webmaster@javawoman.com>
* htdoc/*.html: Cleaned up HTML, fixed typos, added appropriate
HTML 4.0 syntax, added DTDs to files, other minor fixes.
Fri Feb 12 19:58:28 1999 Geoff Hutchison <ghutchis@wso.williams.edu>
* .version: Bump for version 3.1.1.
* configure.in, configure: Fix problems determining getpeername
syntax under IRIX.
* db/os/os_map.c: Fixed problems on AlphaLinux pointed out by Paul
J. Meyer.
Fri Feb 12 12:00:25 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htdig/ExternalParser.cc: Fix crashes noted by Frank Richter.
* contrib/htparsedoc/parse_word_doc.pl: Use updated version (with
fixed line breaks).
* htnotify/htnotify.cc: Add patch mentioned in Feb 8 documentation
change.
Thu Feb 11 00:29:42 1999 Hans-Peter Nilsson <hp@axis.se>
* htcommon/DocumentRef.cc (NUM_ASSIGN): Expand from unsigned types.
(getnum): Use temporary for "unsigned short", and memcpy data into
it instead of assignment.
Tue Feb 9 19:21:55 1999 Geoff Hutchison <ghutchis@wso.williams.edu>
* htdoc/FAQ.html, htdoc/where.html: Update for 3.1.0 release.
* htdoc/uses.html: Added remaining backlog.
* htdoc/RELEASE.html: Finish up release notes for 3.1.0.
Tue Feb 9 19:19:13 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htdig/ExternalParser.cc: Ensure we remove the temporary file.
Mon Feb 8 20:28:07 1999 Geoff Hutchison <ghutchis@wso.williams.edu>
* htdoc/ma_menu: Change relative URLs to absolute URLs to
www.htdig.org to reflect the changing mail archive.
* htdoc/install.html: Add notes on new configure flags to set
CONFIG variables.
* htdoc/*.html: Ensure Last Modified date stamps are up-to-date.
Mon Feb 8 20:26:40 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htdoc/meta.html, htdoc/notification.html: Add info on date
formats for the htnotify-date tag, esp. in relation to ISO 8601.
Sat Feb 6 23:24:19 1999 Geoff Hutchison <ghutchis@wso.williams.edu>
* htcommon/DocumentRef.cc: Fixed compile problem when zlib is disabled.
* htdoc/cf_byname.html, htdoc/cf_byprog.html, htdoc/attrs.html: Added
entries for url_log, compression_level, noindex_start, noindex_end,
allow_in_form, bad_querystr, no_title_text.
* htdoc/THANKS.html: Added Gabriele Bartolini.
* htdoc/uses.html, htdoc/FAQ.html, htdoc/bugs.html: Synch with the
latest versions from the website tree.
Fri Feb 5 19:57:39 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htnotify/htnotify.cc: Add function parse_date() to parse date
strings from htnotify-date tags. It tries to be as flexible as
possible about formatting and will report invalid dates. Based in
part on code contributed by Gabriele Bartolini.
Fri Feb 5 19:28:24 1999 Geoff Hutchison <ghutchis@wso.williams.edu>
* configure, configure.in: Add a test to ensure the zlib.h header
file exists.
* include/htconfig.h.in: Added definition for HAVE_ZLIB_H.
* htcommon/DocumentRef.h, htcommon/DocumentRef.cc: Add checks for
HAVE_ZLIB_H in addition to HAVE_LIBZ. Ensures the library is
actually accessible, not just present.
* htfuzzy/Soundex.cc: Fix typo.
Thu Feb 4 22:51:37 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* Makefile.in: Clean up previous patch and tidy up HTML and
dictionary installation.
Thu Feb 4 22:31:35 1999 Ric Klaren <klaren@telin.nl>
* Makefile.in, */Makefile.in: Add support for
$INSTALL_ROOT, making it easier to build packages (e.g. RPMs) into
directories for later processing.
* htsearch/Display.cc: Tiny patch to silence a compiler warning.
Thu Feb 4 13:03:44 1999 Geoff Hutchison <ghutchis@wso.williams.edu>
* htfuzzy/Soundex.cc(generateKey): Skip initial non-alphabetic
characters and explicitly skip characters without values.
* htfuzzy/Metaphone.cc(generateKey): General bug-fixing, fixing a
bug that corrupted the string to be processed, fixing typos, and
ensuring keys generated fit the metaphone algorithm.
* htfuzzy/Fuzzy.cc(getWords): Add debugging output of the fuzzy
key used.
* contrib/doclist/doclist.pl, contrib/doclist/listafter.pl,
contrib/whatsnew/whatsnew.pl, contrib/urlindex.pl: Change to
support additions to ht://Dig database format.
Thu Feb 4 02:09:22 1999 Geoff Hutchison <ghutchis@wso.williams.edu>
* htsearch/htsearch.cc: Add debugging information on words
returned from fuzzy matching.
* htfuzzy/Metaphone.cc(addWord): Fix bug where only one word would be
stored per key in the database.
* htfuzzy/Soundex.cc(addWord): Ditto.
(generateKey): Rewrite to generate keys correctly.
Wed Feb 3 19:24:36 1999 Geoff Hutchison <ghutchis@wso.williams.edu>
* htdoc/htdig.html: Added documentation on the -l log and restart
feature.
* htdoc/htmerge.html: Added documentation on the -m merge database
feature.
* htdig/main.cc: Added documentation on the -l flag to the usage
message.
* .version: Bump to 3.1.0.
Wed Feb 3 19:09:31 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htsearch/Display.cc: Add check for URLs with no / in the
no_title code.
* htdig/Document.cc: Fix problems with dates returned from servers
with incorrect formats. Dates simply missing the day of week are
parsed correctly; for other malformed dates, output an error, use
the current date, and keep going.
Wed Feb 3 09:57:14 1999 Geoff Hutchison <ghutchis@wso.williams.edu>
* installdir/nomatch.html: Fix small typo.
* htdoc/RELEASE.html: Finish up 3.1.0 release notes.
* htdoc/TODO.html: Update with status and new directions.
Wed Feb 3 14:22:11 1999 Alexander Bergolth <leo@leo.wu-wien.ac.at>
* htsearch/Display.cc(setVariables): Removed some of yesterday's
changes. Thanks to Gilles!
Tue Feb 2 17:26:06 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htdig/PDF.h, htdig/PDF.cc: Fix problems with PDFs generated by
CorelDraw.
* htdoc/attrs.html: Fixed small typo.
Tue Feb 2 21:02:25 1999 Alexander Bergolth <leo@leo.wu-wien.ac.at>
* htsearch/Display.cc(setVariables,createURL): As pointed out by
Gilles, append allow_in_form variables to the query strings only
if they are given as input parameters.
Tue Feb 2 10:29:09 1999 Geoff Hutchison <ghutchis@wso.williams.edu>
* configure, configure.in: Rewrite getpeername_length_t detection
to use prototypes to eliminate type conversion.
* htsearch/Display.cc(buildMatchList): Ensure scores are always
positive or zero.
Mon Feb 1 22:54:02 1999 Hans-Peter Nilsson <hp@axis.se>
* htdoc/attrs.html: Correct "default" for "nothing_found_file".
Mon Feb 1 14:44:32 1999 Geoff Hutchison <ghutchis@wso.williams.edu>
* htsearch/Display.cc(displayMatch): Remove compiler warnings.
* */Makefile.in: Define INSTALL_PROGRAM from configure script.
Mon Feb 1 14:04:18 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htdig/ExternalParser.cc: Add checks to prevent wayward parsers
from bringing down the dig.
Sun Jan 31 23:15:36 1999 Geoff Hutchison <ghutchis@wso.williams.edu>
* htsearch/WeightWord.cc(set): Ensure word is lowercased for
accurate fuzzy comparisons.
* htfuzzy/Fuzzy.cc(openIndex): Destroy the database reference if
we cannot open the database. Fixes a coredump in classes that
inherit this method.
* Makefile.config.in: Remove bogus definitions of INSTALL.
* Makefile.in: Define INSTALL, INSTALL_PROGRAM, INSTALL_SCRIPT,
and INSTALL_DATA as defined by configure. Use them.
* htdoc/RELEASE.html: Started release notes for version 3.1.0.
Mon Feb 1 04:36:29 1999 Hans-Peter Nilsson <hp@axis.se>
* htsearch/Display.cc (displayMatch): Fix leaking use of
String(String *).
* htfuzzy/Prefix.cc (getWords): Ditto.
* htlib/htString.h, htlib/String.cc (String(const String &)): New.
* htlib/htString.h, htlib/String.cc (String(const String &, int)):
No default argument.
* htlib/htString.h, htlib/String.cc (String(String *)): Removed.
Sun Jan 31 21:46:52 1999 Alexander Bergolth <leo@leo.wu-wien.ac.at>
* htlib/Connection.cc: Include sys/time.h needed by select, fixes
PR #322.
Sun Jan 31 20:50:38 1999 Hans-Peter Nilsson <hp@axis.se>
* htdig/Retriever.cc (Initial, GetRef, Need2Get, IsValidURL,
got_href, got_redirect): Do not lowercase URLs.
* htlib/HtURLCodec.h (class HtURLCodec): Fake a friend function.
Sat Jan 30 22:29:50 1999 Geoff Hutchison <ghutchis@wso.williams.edu>
* configure, configure.in: Add support for program name
transformations.
* */Makefile.in: Do it.
Sat Jan 30 21:16:50 1999 Geoff Hutchison <ghutchis@wso.williams.edu>
* htmerge/docs.cc: Added translation of Dutch comment for us ignorant
Americans. ;-)
* installdir/rundig: As mentioned by Gilles, use sed with ls -t
test. Add more comments for FAQs.
* configure.in, configure: Add --disable-zlib to turn off compiling
compression entirely. Add --with-cgi-bin-dir,
--with-image-dir and --with-search-dir flags to set CONFIG
variables.
* CONFIG.in: Use them.
Sat Jan 30 21:05:35 1999 Randy Winch <gumby@cafes.net>
* htcommon/DocumentRef.h: If using compressed document databases,
declare compress and decompress functions and the current state of
the head (excerpt).
* htcommon/DocumentRef.cc: Change document compression to only
compress the DocHead field and only decompress when necessary.
Sat Jan 30 03:49:21 1999 Hans-Peter Nilsson <hp@axis.se>
* htcommon/DocumentRef.h: Add #ifdef around declaration of
c_buffer.
* htcommon/DocumentRef.cc: Remove spurious extra "static" from
c_buffer definition. Add #ifdef HAVE_LIBZ around it.
Fri Jan 29 13:30:11 1999 Geoff Hutchison <ghutchis@wso.williams.edu>
* htsearch/htsearch.cc: Construct the StringMatch used for finding
excerpts in two pieces--user input and post-fuzzy matching. Fixes
problems with matching searches with punctuation.
* htlib/StringMatch.cc(IgnoreCase): Fix small memory leak pointed
out by Gilles.
Thu Jan 28 21:36:03 1999 Geoff Hutchison <ghutchis@wso.williams.edu>
* htdoc/*.html: Changed copyright information to mention the
ht://Dig group, removing Andrew's name.
* README, configure.in, Makefile.in: Ditto.
* configure: Change mention of libg++ -> libstdc++.
Thu Jan 28 12:53:40 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htdoc/attrs.html, htdoc/cf_byname.html, htdoc/cf_byprog.html:
Document new remove_default_doc attribute.
* Makefile.in: Make sure we put the wrapper file in the right place.
Make sure dictionaries are installed with the correct permissions.
* installdir/rundig: Use a portable test for testing the endings
and synonym databases. Also enhanced support for flags (-a, -s,
-vvv, -c config).
* htsearch/Display.cc: Fix bug when sorting results would cause a
coredump.
Wed Jan 27 20:00:40 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htdig/HTML.cc, htdig/SGMLEntities.cc, htdig/ExternalParser.cc,
htcommon/WordList.cc, htcommon/DocumentRef.cc: Speedup by
converting many config lookups into static variables.
* htdoc/attrs.html, htdoc/hts_templates.html, htdoc/cf_byname.html,
htdoc/cf_byprog.html: Various minor fixes.
* htsearch/Display.cc: Fix problems with star_patterns attribute.
Wed Jan 27 13:02:39 1999 Geoff Hutchison <ghutchis@wso.williams.edu>
* htdig/SGMLEntities.cc: Use StringMatch class for matching
&quot;, &amp;, &lt; and &gt; as defined by config options. Should
speed up translation.
* htdoc/THANKS.html: Minor updates for contributions towards 3.1.0.
Tue Jan 26 19:29:08 1999 Geoff Hutchison <ghutchis@wso.williams.edu>
* include/htconfig.h.in: Define TRUE and FALSE if not
defined. Change default of NO_WORD_COUNT (now undefined) for
compatibility.
* htdig/htdig.h: Remove definition of TRUE and FALSE (for consistency).
* htcommon/DocumentDB.cc(Add, Delete, Exists, []): Do not
lowercase the URL before storing it. URLs can be case-sensitive.
Tue Jan 26 19:07:03 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htcommon/defaults.cc: Define remove_default_doc as option of
default document to strip off URLs (e.g. /index.html -> /).
* htlib/URL.cc(removeIndex): Use it.
(normalizePath): Fix bug with stripping double slashes and the
like from a query string.
* htdig/Document.h, htdig/Document.cc: Add new variable
contentLength and consider content-length headers when reading in
documents.
* htdig/PDF.cc: Fix broken code calling acroread.
* htsearch/Display.cc: Allow braces in wrapper file.
* htdoc/hts_general.html, htdoc/hts_templates.html: Add info on
the wrapper alternative to separate header and footer files.
* htdoc/config.html, installdir/header.html,
installdir/nomatch.html, installdir/wrapper.html,
installdir/search.html: Change sort option to be more grammatically
correct.
Tue Jan 26 21:19:02 1999 Hans-Peter Nilsson <hp@axis.se>
* htmerge/docs.cc (convertDocs): Use HtURLCodec to encode URLs
going into the doc_index database.
* htsearch/Display.cc (buildMatchList): Use HtURLCodec to decode
URLs from docIndex.
* htcommon/defaults.cc (defaults): Fix typo with "case_sensitive".
Tue Jan 26 18:08:19 1999 Alexander Bergolth <leo@leo.wu-wien.ac.at>
* include/htconfig.h.in: Added HAVE_STRINGS_H. (I forgot that when
I added the configure check.)
* htdig/Retriever.h: Fix small compiler error. Removed Log-lines.
Tue Jan 26 02:22:45 1999 Hans-Peter Nilsson <hp@axis.se>
* htdig/main.cc (main): Fix typo "uncoded_db_compatbile".
Mon Jan 25 19:38:31 1999 Geoff Hutchison <ghutchis@wso.williams.edu>
* htlib/Configuration.cc(Find): Make error message for missing
entries conditional on the DEBUG symbol. Removes odd error messages
under normal use.
Sun Jan 24 23:55:57 1999 Geoff Hutchison <ghutchis@wso.williams.edu>
* htmerge/db.cc, htmerge/docs.cc: Fix compiler errors.
* htnotify/htnotify.cc: Similar.
Sun Jan 24 14:13:37 1999 Hans-Peter Nilsson <hp@axis.se>
* htcommon/WordRecord.h (struct WordRecord): Remove member count
if NO_WORD_COUNT defined.
* htmerge/db.cc (mergeDB): Remove handling.
* htmerge/words.cc (mergeWords): Similar.
* include/htconfig.h.in: Define NO_WORD_COUNT by default.
Sun Jan 24 14:13:37 1999 Geoff Hutchison <ghutchis@wso.williams.edu>
* htsearch/Display.cc(logSearch): Added fix from Gilles in case
REMOTE_ADDR is NULL as well.
* htnotify/htnotify.cc: Fix compiler warnings.
* htlib/String.cc(indexOf): Use autoconf check for strstr, fix
compiler warnings.
* htlib/Configuration.cc(Find): Complain when option is not in the
list.
* htdig/HTML.cc(parse): Move declarations out of the loop.
(parse): Don't add non-word characters to the excerpt if they're
in the title. Fixes PR #80.
Mon Jan 25 02:17:58 1999 Hans-Peter Nilsson <hp@axis.se>
* htcommon/defaults.cc (defaults): New option
"uncoded_db_compatible", default true.
* htcommon/DocumentDB.h (DocumentDB::SetCompatibility): New
function.
(DocumentDB::myTryUncoded): New member.
* htcommon/DocumentDB.cc (Constructor, Add(), operator[],
Exists(), Delete()): Handle uncoded URL in database if
myTryUncoded.
* htdig/main.cc (main): Call (DocumentDB::)SetCompatibility() with
option "uncoded_db_compatible".
* htsearch/Display.cc (Display): Likewise.
* htnotify/htnotify.cc (main): Likewise.
* htmerge/docs.cc (convertDocs): Likewise.
* htmerge/db.cc (mergeDB): Likewise.
* htdoc/attrs.html, htdoc/cf_byname.html, htdoc/cf_byprog.html:
Document option "uncoded_db_compatible".
Sun Jan 24 15:21:02 1999 Hans-Peter Nilsson <hp@axis.se>
* htlib/HtWordCodec.cc (HtWordCodec(StringList &, etc)): Check
limits separately for "to" and "from". Do not calculate
string-lengths separately for limit-checking; use methods Count()
and length() on data near the final result.
* htlib/HtWordCodec.cc (HtWordCodec constructors): Do not
explicitly add '\0' to the pattern strings.
* htlib/HtWordCodec.cc (code): Check for zero-length replacement
list.
Sat Jan 23 22:18:18 1999 Geoff Hutchison <ghutchis@wso.williams.edu>
* htdig/Retriever.cc(parse_url): If a server ignores the
If-Modified-Since request, still compare the retrieved date to the
stored date to see if it has been modified.
Sat Jan 23 13:09:03 1999 Geoff Hutchison <ghutchis@wso.williams.edu>
* htmerge/htmerge.cc: Unlink the db.docs.index file before we
build it again. This ensures we have a clean copy and don't
duplicate URLs.
Fri Jan 22 23:12:12 1999 Geoff Hutchison <ghutchis@wso.williams.edu>
* include/htconfig.h.in: Cleaned up preprocessor definitions.
* configure.in, configure: Fix NEED_PROTO_GETHOSTNAME check and
make check for GETPEERNAME_LENGTH_T more flexible.
* htlib/Connection.cc: Change __sun__ to NEED_PROTO_GETHOSTNAME
since we prefer feature tests.
Sat Jan 23 02:38:08 1999 Hans-Peter Nilsson <hp@axis.se>
* htsearch/Display.cc (logSearch): Fix simple typo in last change.
Sat Jan 23 01:18:05 1999 Hans-Peter Nilsson <hp@axis.se>
* htlib/String.cc (operator =): Add const modifier: const String &.
* htlib/htString.h (String::operator=(const String &)): Ditto.
* htlib/DB2_db.h (class DB2_db): Make Put(), Get(), Exists() and
Delete() use const modifiers on appropriate parameters.
* htlib/DB2_db.cc: Ditto.
* htlib/GDBM_db.h (class GDBM_db): Ditto.
* htlib/GDBM_db.cc: Ditto.
* htlib/Database.h (class Database): Ditto.
* htlib/Database.cc (Put): Similar.
* htlib/BTree.h (class BTree): Make Put(), Get() and Exists() use
const modifiers on appropriate parameters.
* htlib/BTree.cc: Ditto.
* htcommon/DocumentDB.cc (Add, operator[], Exists, Delete): Remove
needless temporary String.
* htcommon/DocumentRef.cc (Deserialize): Ditto.
Fri Jan 22 21:10:12 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htlib/Configuration.cc: Add support for keyword "include" to
include other config files.
* htdoc/cf_general.html: Document it.
Thu Jan 21 23:25:37 1999 Geoff Hutchison <ghutchis@wso.williams.edu>
* htsearch/Display.cc(logSearch): Check if HTTP_REFERER is NULL,
if so, use a dash. (Otherwise we'll kill some syslog() services).
Thu Jan 21 05:30:40 1999 Hans-Peter Nilsson <hp@axis.se>
* htlib/HtURLCodec.h, htlib/HtURLCodec.cc, htlib/HtWordCodec.cc,
htlib/HtWordCodec.h, htlib/HtCodec.cc, htlib/HtCodec.h: New files.
* htlib/Makefile.in (OBJS): Add the corresponding *.o files.
* htcommon/DocumentDB.cc (Open, Read, Add, operator[], Exists,
Delete, CreateSearchDB, URLs): Use HtURLCodec; ::encode() and
::decode() the URL used as a key.
* htcommon/DocumentRef.cc (Serialize): Encode the URL using
HtURLCodec.
(Deserialize): Decode it.
* htmerge/htmerge.h: #include <HtURLCodec.h>
* htmerge/htmerge.cc (main): Check HtURLCodec for errors.
* htnotify/htnotify.cc (main): Ditto.
* htsearch/htsearch.cc (main): Ditto.
* htdig/main.cc (main): Ditto.
* htcommon/defaults.cc (defaults): Add common_url_parts and
url_part_aliases.
* htdoc/cf_byprog.html, htdoc/cf_byname.html,
htdoc/attrs.html: Document url_part_aliases and
common_url_parts.
* htlib/StringMatch.h (StringMatch::Pattern): Add default
parameter sep = '|'.
* htlib/StringMatch.cc (Pattern): Similar.
Wed Jan 20 20:20:35 1999 Geoff Hutchison <ghutchis@wso.williams.edu>
* htsearch/Display.cc(logSearch): Use REMOTE_ADDR when REMOTE_HOST
is unavailable (otherwise we silently dump core). Fixes PR #138.
* htcommon/WordList.cc(valid_word): Words cannot be valid if
they're shorter than minimum_word_length! Fixes PR #139.
* htsearch/Display.cc(expandVariables): Allow variables of the
form ${VAR}, fixes PR #121.
Wed Jan 20 17:21:33 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htmerge/docs.cc: Fix logic to remove documents--missing else
statements allow some "deleted" documents to not be removed.
Wed Jan 20 11:52:18 1999 Geoff Hutchison <ghutchis@wso.williams.edu>
* htlib/good_strtok.h, htlib/good_strtok.cc: Added fixes and speed
improvements contributed by Andrew Bishop.
* htdig/ExternalParser.cc, htdig/Server.cc, htlib/cgi.cc,
htmerge/db.cc, htmerge/words.cc: Call good_strtok with appropriate
parameters (explicitly include NULL first parameter, second param
is char, not char *).
* htcommon/WordList.cc(Word): Added check for adding words with
weight zero.
* htsearch/Display.h, htsearch/Display.cc: Revised setting ANCHOR
variable: it will be empty if there is no excerpt which matches
the search formula. Fixes problems with META descriptions. Based
on a patch contributed by Marjolein.
Wed Jan 20 00:30:12 1999 Geoff Hutchison <ghutchis@wso.williams.edu>
* htdig/SGMLEntities.cc: Declare extern config, since we now use
config options.
* htsearch/Display.cc: Fix typo causing compile problems.
Tue Jan 19 23:51:38 1999 Geoff Hutchison <ghutchis@wso.williams.edu>
* htcommon/defaults.cc: Added options translate_amp,
translate_lt_gt, and translate_quot as suggested by Marjolein to
control SGML translation of these entities.
* htdig/SGMLEntities.cc: Use them as contributed by Marjolein.
Tue Jan 19 12:55:36 1999 Hans-Peter Nilsson <hp@axis.se>
* htlib/StringMatch.cc (Pattern): Always set PreviousState before
checking PreviousValue.
* htlib/StringMatch.cc (FindFirst): Be "greedy"; match longest.
(Compare): Ditto.
* htcommon/DocumentRef.cc (MEMCPY_ASSIGN, NUM_ASSIGN): New macros
for assigning portably to some possibly-enum numeric type.
(getnum): Use them.
* htlib/StringMatch.cc (FINAL): Remove.
(MATCH_INDEX_MASK): Include highest bit.
(Pattern, FindFirst, Compare, FindFirstWord, CompareWord): Do not
use FINAL.
(FindFirst, Compare, FindFirstWord, CompareWord): When shifting by
INDEX_SHIFT, cast to unsigned.
Mon Jan 18 17:43:29 1999 Geoff Hutchison <ghutchis@wso.williams.edu>
* htcommon/defaults.cc: Added no_title_text option to allow
configuration of the text when no title is available. Default is
the filename.
* htsearch/Display.cc: Use no_title_text to set the title
appropriately, as contributed by Marjolein.
* htsearch/Display.cc: Ensure PERCENT variable has a minimum of 1.
Mon Jan 18 17:41:44 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htdig/Server.cc: Use max_doc_size when retrieving robots.txt
files instead of a hard-coded 10k limit.
* htdig/Document.cc: When reading chunks of document, if a chunk
puts us over the max_doc_size limit, take everything up to that
limit (rather than discarding the entire chunk).
* htcommon/DocumentRef.cc: Fix thinko with compression_level.
Sun Jan 17 21:48:05 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htdoc/(attrs.html, cf_byname.html, cf_byprog.html, config.html,
hts_form.html, hts_templates.html): Add documentation for "sort"
config and form input.
* htcommon/defaults.cc: Added options "sort" and "sort_names" to
pick result sorting order and text names for sort options.
* htsearch/Display.cc: Added variable SORT to render a form menu
for sort options, based on "sort" and "sort_names" options.
* installdir/(wrapper.html, header.html, nomatch.html,
footer.html, search.html, syntax.html): Add in sort option to form.
Sun Jan 17 14:03:54 1999 Geoff Hutchison <ghutchis@wso.williams.edu>
* htsearch/TemplateList.h,
htsearch/TemplateList.cc(createFromString): Ensure
template_map config has three members for each template we add,
contributed by Gabriele Bartolini <tlm@mbox.comune.prato.it>.
* htsearch/Display.cc(Display): Take advantage of createFromString
returning an error value to bail out of poorly-constructed
template_maps, based on code contributed by <tlm@mbox.comune.prato.it>.
* htdig/PDF.cc: Add debugging output of URLs causing
problems. Also, switch system call to make it easier to call xpdf
instead of acroread.
* htcommon/defaults.cc: Change default pdf_parser attribute to
include acrobat-specific flags. Fix mismatched naming of
compression_level (was compression_factor).
* htdig/Retriever.cc: Fix compiler warnings.
* contrib/examples/updatedig: Added contributed rundig-type script
from David Robley <webmaster@www.nisu.flinders.edu.au>.
Sun Jan 17 13:42:43 1999 didier Gautheron <dgautheron@magic.fr>
* htcommon/defaults.cc: Add url_log parameter for save and restart
function.
* htdig/Retriever.cc, htdig/Retriever.h: Add save and restart
function.
* htdig/main.cc: Add option -l for save and restart
function.
* htdig/PDF.cc: Check to see if we have acroread before copying
the pdf into TMPDIR!
Fri Jan 15 07:23:30 1999 Hans-Peter Nilsson <hp@axis.se>
* htcommon/DocumentRef.cc(Serialize): Save
space when lengths can fit in an unsigned char or unsigned short.
* htcommon/DocumentRef.cc(Deserialize): Handle expansion.
Thu Jan 14 23:37:29 1999 Geoff Hutchison <ghutchis@wso.williams.edu>
* htcommon/defaults.cc: Added options noindex_start and
noindex_end to enable NOT indexing some sections of
HTML. Contributed by Marjolein.
* htdig/HTML.cc: Use them.
* contrib/examples/rundig.sh: Add rundig example from Colin
Viebrock with a few modifications for using less disk space.
Thu Jan 14 23:27:24 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htlib/URL.cc: Fix parent path logic to ignore slashes in query
string. Noted by Adam Coyne <adam@criticalmass.com>.
Thu Jan 14 00:04:03 1999 Geoff Hutchison <ghutchis@wso.williams.edu>
* README: Fix for upcoming 3.1.0 release.
* htcommon/defaults.cc: Set compression_factor to 0 for default
(no compression).
Thu Jan 14 03:16:15 1999 Hans-Peter Nilsson <hp@axis.se>
* htdig/ExternalParser.cc (parse): Added support for 'm': meta element.
* htdoc/attrs.html: Document it.
Wed Jan 13 21:31:38 1999 Geoff Hutchison <ghutchis@wso.williams.edu>
* Makefile.in(install): Add wrapper.html to the common directory
when installing.
* contrib/examples: Added directory for example common files
(e.g. badwords, dictionaries, templates).
* contrib/examples/badwords: Added example bad_words file by Marjolein.
* .version: Bump to 3.1.0dev.
* htdig/HTML.cc(parse): Added slight fixes to the comment parsing
code, contributed by Marjolein.
Wed Jan 13 20:11:26 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htdoc/attrs.html: Fix typo with META example.
* htdig/Document.cc: Use new StringList::Join function for
http_proxy_exclude.
* htnotify/htnotify.cc: Bring latest security patch from 3.1.0b4
onto the mainline source.
* installdir/wrapper.html: New file to merge header and footer files.
* htcommon/defaults.cc: Added search_results_wrapper for the
location of the wrapper file, if used. (The default is empty,
which uses header.html and footer.html.)
* htsearch/Display.cc: Added support for using the wrapper instead
of header and footer if search_results_wrapper is set.
* htsearch/htsearch.cc: Added check for sort config.
* htsearch/Display.cc, htsearch/Display.h: Added support for
sorting and reverse sorting by date, time, and score.
Wed Jan 13 18:45:17 1999 Geoff Hutchison <ghutchis@wso.williams.edu>
* htcommon/defaults.cc: Removed use_document_compression
(redundant) and fixed problem with missing comma. Setting
compression_factor to 0 is the equivalent of turning off
use_document_compression.
* htcommon/DocumentRef.cc(Serialize, Deserialize): Update from
Randy Winch to eliminate use_document_compression and fix
compilation problems noted by Hans-Peter.
* htmerge/db.cc: Fixed problem with db.NextDocID() being set
incorrectly, reported by Roman Dimov <roman@mark-itt.ru>.
* htcommon/DocumentDB.h: Added IncNextDocID to allow big changes
in db.NextDocID(), such as those above.
* htdoc/THANKS.html: Added Akos Domotor.
Wed Jan 13 07:07:35 1999 Hans-Peter Nilsson <hp@axis.se>
* htsearch/htsearch.cc (setupWords): Remove parsedWords parameter
with associated processing of original words - deletion of
bad_words, spacing and on-the-fly modifiers.
(main): Create originalWords from input, not via setupWords().
Tue Jan 12 09:16:49 1999 didier Gautheron <dgautheron@magic.fr>
* htcommon/WordList.cc, htmerge/words.cc: Changed field order
in db.wordlist. With the old order, words from HTML body and words
from links to that url weren't merged sometimes.
* htdig/Document.cc, htmerge/words.cc: Small speed improvements.
* htdig/HTML.cc: Fixed small memory leak with bogus HTML and small
speedups.
* htdig/Retriever.cc(got_href): If ref exists, we have to call
AddDescription even if max_hop_count is reached. It's important
for wwwoffle (URLs in the cache are restricted by max_hop_count).
* htcommon/DocumentDB.cc, htcommon/DocumentDB.h, htdig/Retriever.cc,
htlib/Dictionary.cc, htlib/Dictionary.h, htlib/Object.cc,
htlib/Object.h, htlib/String.cc, htlib/htString.h,
htcommon/WordList.cc: Speedups after gprof data.
Tue Jan 12 07:23:35 1999 didier Gautheron <dgautheron@magic.fr>
* htlib/Configuration.cc: Fixed time format to the standard one to
avoid sending If-Modified-Since HTTP headers in a locale-specific
format (which would be incorrect behavior). Use C locale.
* htlib/Dictionary.h, htlib/Dictionary.cc: Add new method
GetNextElement to directly return next object when iterating.
Tue Jan 12 12:56:26 1999 Geoff Hutchison <ghutchis@wso.williams.edu>
* htcommon/DocumentRef.h, htcommon/DocumentRef.cc(serialize,
deserialize): Added support for compressing data using zlib if
available, contributed by Randy Winch <gumby@cafes.net>.
* htcommon/defaults.cc: Added config options
use_document_compression and compression_factor for zlib support.
* configure.in, include/htconfig.h.in: Added autoconf check for
libz and deflate function.
* configure: Generated from above change.
Mon Jan 11 22:48:17 1999 Geoff Hutchison <ghutchis@wso.williams.edu>
* htmerge/db.cc: Fixed thinko with setting the docIDs of new words
in the destination wordlist.
* htdoc/FAQ.html, htdoc/THANKS.html, htdoc/contents.html: Minor
cleanups.
* htdoc/RELEASE.html: Added release info from 3.1.0b4.
* htdoc/uses.html: Alphabetized, added a form for requests, and
added in lots of new sites.
Mon Jan 11 02:42:51 1999 Hans-Peter Nilsson <hp@axis.se>
* htsearch/htsearch.cc (setupWords): Do not skip words if
"boolean" search.
Mon Jan 11 00:42:51 1999 Hans-Peter Nilsson <hp@axis.se>
* htdoc/hts_method.html: Add explanation of operator "not".
* installdir/syntax.html: Added examples of correct logical
expressions.
Mon Jan 11 00:23:58 1999 Geoff Hutchison <ghutchis@wso.williams.edu>
* htdoc/attrs.html(search_algorithm): Added prefix and substring
matching--somehow slipped through the cracks!
* htdoc/THANKS.html: Update to be more accurate as far as recent
contributions.
Sun Jan 10 00:06:59 1999 Geoff Hutchison <ghutchis@wso.williams.edu>
* htdig/Document.cc(readHeader): Added check for header status
when considering content-types. Fixed PR #91.
Sat Jan 9 20:52:49 1999 didier Gautheron <dgautheron@magic.fr>
* htcommon/WordList.cc(valid_word): Break out of looping once
we're sure the word is invalid.
* htlib/Dictionary.cc(Remove, Exists): Remember special case of an
empty dictionary.
Sat Jan 9 20:16:25 1999 Geoff Hutchison <ghutchis@wso.williams.edu>
* htdig/HTML.cc(parse): Don't capitalize headers--this creates
problems with non-ASCII values, since String::uppercase doesn't
know how to capitalize them. Fixes PR #100.
Sat Jan 9 14:47:17 1999 Geoff Hutchison <ghutchis@wso.williams.edu>
* htdig/Document.cc(getdate): Strip off weekday before calling
strptime since some servers return invalid weekdays. Fixes PR #79.
* htmerge/htmerge.h: Declare new mergeDB code.
* htmerge/htmerge.cc: Set up merge_config file and add options for
mergeDB code.
* htmerge/db.cc: New file. Implements merging of two database sets
specified by the merge_config and config variables.
* htmerge/Makefile.in: Add db.o as an object to be compiled.
Fri Jan 8 20:11:56 1999 Alexander Bergolth <bergolth@ariel.wu-wien.ac.at>
* htdig/Plaintext.cc: fixed bug that inhibited compressing of
whitespace
* htlib/URL.cc: fixed problem in stripping anchors from URLs
Thu Jan 7 23:29:32 1999 Geoff Hutchison <ghutchis@wso.williams.edu>
* htdig/HTML.cc(parse): Corrected problems with parsing comments,
as contributed by Marjolein Katsma <webmaster@javawoman.com> and
Gilles.
* htsearch/Display.cc, htsearch/Display.h: Implement
add_anchors_to_excerpt option and new variable ANCHOR as
contributed by Marjolein.
* htdoc/THANKS.html: Added new contributors.
* README: Update for 1999 copyright, version, etc.
Thu Jan 7 17:29:52 1999 Geoff Hutchison <ghutchis@wso.williams.edu>
* htdoc/(attrs.html, cf_byname.html, cf_byprog.html): Fix typo
noted by Joe Jah: keyword_factor -> keywords_factor.
Thu Jan 7 14:32:34 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htsearch/Display.cc (display): The start template, if provided,
should come out after the header, not before.
* htcommon/defaults.cc, installdir/footer.html: Use the
no_page_list_header stuff.
Thu Jan 7 11:09:08 1999 Geoff Hutchison <ghutchis@wso.williams.edu>
* installdir/*.png: Add PNG versions of the default GIF graphics.
Wed Jan 6 22:03:54 1999 Geoff Hutchison <ghutchis@wso.williams.edu>
* htfuzzy/Synonym.cc, htfuzzy/htfuzzy.cc, htmerge/docs.cc,
htmerge/words.cc, htdig/SGMLEntities.cc: Fix minor memory leaks.
* htcommon/defaults.cc: Add .bin, .tgz, .rpm, .mov, .mpg, .avi to
bad_extensions.
* htdoc/attrs.html: Update documentation on default.
* installdir/rundig: Removed check for age of synonym and endings
DB. Nice feature, but it broke under too many shells.
* htlib/DB2_db.cc: Change allocation of database cursors to match
API in new version.
* htdig/Retriever.cc(got_word): Skip changing to lowercase, we do
it in WordList::Word.
Wed Jan 6 14:49:47 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htdoc/attrs.html: Added four new attributes, fixed defaults & typos.
* htdoc/cf_byname.html: Added four new attributes.
* htdoc/cf_byprog.html: Added four new attributes.
Wed Jan 6 14:37:06 1999 Geoff Hutchison <ghutchis@wso.williams.edu>
* configure.in: Changed to require Autoconf 2.13 to eliminate bugs
observed by users with older autoconf versions.
* configure: Regenerated using Autoconf 2.13.
Wed Jan 6 13:08:26 1999 Geoff Hutchison <ghutchis@wso.williams.edu>
* htcommon/DocumentRef.cc: Applied fix from Dave Alden
<alden@math.ohio-state.edu> to compile under SunPRO compilers
by eliminating trailing comma in enum.
Wed Jan 6 17:50:55 1999 Alexander Bergolth <bergolth@ariel.wu-wien.ac.at>
* {.,htcommon,htdig,htfuzzy,htlib,htmerge,htnotify,htsearch}/
Makefile.in, Makefile.config.in: fixed relative path problem if
install-sh is used.
Wed Jan 6 17:12:04 1999 Alexander Bergolth <bergolth@ariel.wu-wien.ac.at>
* htlib/StringList.cc: fixed bug in StringList::Join (oops!)
Wed Jan 6 10:34:45 1999 Geoff Hutchison <ghutchis@wso.williams.edu>
* htcommon/DocumentRef.cc(AddDescription): Remove delete
instruction that fouls up everything (it was removing descriptions
as we add them!).
Wed Jan 6 14:52:11 1999 Hans-Peter Nilsson <hp@axis.se>
* htlib/String.cc (allocate_space): Add missing [] to delete.
Wed Jan 6 05:53:02 1999 Hans-Peter Nilsson <hp@axis.se>
* htcommon/DocumentRef.cc(AddDescription): Do not add non-word
characters to the wordlist.
Wed Jan 6 00:28:19 1999 Hans-Peter Nilsson <hp@axis.se>
* htdoc/cf_byname.html: Fixed html syntax "<br" and "/a>".
Tue Jan 5 22:40:58 1999 Geoff Hutchison <ghutchis@wso.williams.edu>
* htsearch/Display.cc: Check if we need to do backlink and date
factoring (e.g. we don't if they're zero!), from a patch by Gilles.
Tue Jan 5 20:57:02 1999 Alexander Bergolth <bergolth@ariel.wu-wien.ac.at>
* configure.in, htlib/Connection.cc: Check for strings.h for those
platforms that don't have it.
Tue Jan 5 14:24:52 1999 Geoff Hutchison <ghutchis@wso.williams.edu>
* htcommon/DocumentRef.h: Added comments on the members (fields)
of DocumentRef objects.
* htcommon/defaults.cc: Added new option max_descriptions for
limit on the number of descriptions to store (default 5, matches
behavior pre 3.1.0b3).
* htcommon/DocumentRef.cc: Support restriction of max_descriptions.
* .version: Bump to 3.1.0b5dev.
Tue Jan 5 20:07:05 1999 Alexander Bergolth <bergolth@ariel.wu-wien.ac.at>
* htdig/Retriever.cc: fixed bug in bad_querystring detection
Sat Jan 2 16:39:34 1999 Alexander Bergolth <leo@strike.wu-wien.ac.at>
* htdig/main.cc, htlib/Configuration.cc: Added warning message if
the locale selection was not successful. (e.g. because the locale
definition is not installed) config["locale"] is now set to the
return string of setlocale.
* {.,htcommon,htdig,htfuzzy,htlib,htmerge,htnotify,htsearch}/
Makefile.in, Makefile.config.in, configure.in: Changed to allow
compiling in separate build directories.
Fri Jan 1 05:49:19 1999 Hans-Peter Nilsson <hp@axis.se>
* htdoc/attrs.html: Describe more thoroughly how "pdf_parser"
is used.
* htdoc/attrs.html: Fix typo for anchor/attribute
"allow_virtual_hosts".
* htdoc/attrs.html: Correct and add more verbose description of
external parser program parameters and fields.
Sun Dec 27 14:52:45 1998 Alexander Bergolth <leo@strike.wu-wien.ac.at>
* htlib/URL.cc: Small change in URL::removeIndex so that URLs are not
stripped if a query string ends with /index.html
* htsearch/Display.cc, htnotify/htnotify.cc: Added patches from
Gilles Detillieux <grdetil@scrc.umanitoba.ca> to fix memory leaks.
Sat Dec 19 17:53:44 1998 Alexander Bergolth <leo@strike.wu-wien.ac.at>
* htdig/main.cc, htdig/htdig.h, htdig/Retriever.cc: Added new option
bad_querystr. Allows exclusion when digging CGI-Scripts.
* htsearch/htsearch.cc, htsearch/Display.cc: Added new option
allow_in_form. Currently does not work with some special variable
names!
* htcommon/defaults.cc: Added the two new options.
Sat Dec 19 11:21:38 1998 Geoff Hutchison <ghutchis@wso.williams.edu>
* contrib/htparsedoc/parse_word_doc.pl: Update from Jesse.
* .version: Bump for 3.1.0b4.
* README: Ditto.
* Makefile.in: Remove references to version number.
* htnotify/htnotify.cc: Fix nasty security hole found by Werner
Hett <hett@isbiel.ch>.
Sat Dec 19 15:22:38 1998 Alexander Bergolth <leo@strike.wu-wien.ac.at>
* htlib/StringList.cc, htlib/StringList.h: Added StringList::Join
to simplify the creation of patterns for StringMatch.
* htlib/String.cc: lastIndexOf(char ch) added
* htlib/URL.cc: Changed URL::removeIndex to use local_default_doc.
(index.html was hardcoded) local_default_doc can be a list.
* htdig/main.cc, htlib/URL.cc: Use StringList::Join.
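A rough sketch of the idea in this entry: join the local_default_doc
list into one alternation pattern, then use it to strip a trailing
default document from a URL. This is Python, not the project's C++;
the names join_pattern and remove_index are hypothetical, and the
query-string guard is modelled on the Dec 27 entry rather than taken
from the real URL::removeIndex.

```python
import re


def join_pattern(names, sep="|"):
    """Join literal file names into a single alternation pattern."""
    return sep.join(re.escape(name) for name in names)


def remove_index(url, default_docs):
    """Strip a trailing default document such as /index.html."""
    # Assumption: a URL carrying a query string is left untouched, so a
    # query that merely ends in "/index.html" is not stripped.
    if "?" in url:
        return url
    pattern = "/(" + join_pattern(default_docs) + ")$"
    return re.sub(pattern, "/", url)


if __name__ == "__main__":
    docs = ["index.html", "default.htm"]
    print(remove_index("http://www.htdig.org/index.html", docs))
```

Building the pattern from a list is what lets local_default_doc hold
several names instead of a hardcoded index.html.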
Sun Dec 13 23:06:35 1998 Geoff Hutchison <ghutchis@wso.williams.edu>
* htsearch/Display.cc: Fix potential coredump when calculating
date_factor and backlink_factor on docs that aren't in the
database.
Sat Dec 12 23:17:56 1998 Geoff Hutchison <ghutchis@wso.williams.edu>
* htdoc/cf_byname.html, htdoc/cf_byprog.html, htdoc/attrs.html:
Added docs for new options since version 3.1.0b2.
* htdoc/RELEASE.html: Added notes on changes since 3.1.0b2 (we
should keep this up rather than all-at-once).
* htdoc/hts_templates: Include documentation on using CGI
environment variables in templates with this version.
* htdig/Retriever.cc(got_href): Added check to prevent
currenthopcount from becoming -1.
* htcommon/WordList.cc: Change undefined minimumWordLength to
config("minimum_word_length").
Sat Dec 12 12:01:55 1998 Geoff Hutchison <ghutchis@wso.williams.edu>
* Makefile.in, Makefile.config.in, */Makefile.in: Added target
mostlyclean to clean up, but leave compile-intensive targets
(e.g. db, rx code). General cleanup too.
* htdoc/where.html: Updated for eventual 3.1.0b3 release.
* htcommon/WordList.cc: Added additional cleanups for the words in
the bad word file, in case they have invalid punctuation, etc.
Sat Dec 12 18:41:29 1998 Alexander Bergolth <leo@strike.wu-wien.ac.at>
* htmerge/words.cc: Fix last update so that it compiles on AIX.
Fri Dec 11 10:40:48 1998 Geoff Hutchison <ghutchis@wso.williams.edu>
* htdig/Retriever.cc: Added additional debugging info on the
reason for excluding a URL, based on a patch by Benoit Majeau
<Benoit.Majeau@nrc.ca>.
* htmerge/words.cc: Fixed a bug where pointers, rather than strings,
were assigned. Silly references...
* htsearch/Display.cc, htsearch/Display.h: Added patch from Gilles
to allow CGI environment variables in templates.
* htdig/HTML.cc: Fix core dump when META refresh tags don't have
content portions.
Thu Dec 10 22:28:44 1998 Geoff Hutchison <ghutchis@wso.williams.edu>
* htdig/Retriever.cc, htdig/Server.cc, htdig/Server.h:
Changed support for server_wait_time to use delay() method in
Server. Delay is from beginning of last connection to this
one. Currently this also delays local digging, which may not be ideal.
* htcommon/defaults.cc: Added option for server_max_docs as a
limit on the number of docs returned from a server.
* contrib/htparsedoc/parse_word_doc.pl: New version from
Jesse. New code speedups and better matching of punctuation.
* htdig/Document.cc: Check http_proxy_exclude to see if it's
empty. If so, use the proxy.
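The server_wait_time behavior described in this entry (the delay is
counted from the beginning of the last connection, not from its end)
might look roughly like the sketch below. It is Python rather than
C++, and the class layout and the injectable clock and sleep
parameters are assumptions for illustration, not the Server class in
htdig/Server.cc.

```python
import time


class Server:
    """Tracks when the last connection to one server began."""

    def __init__(self, wait_time, clock=time.monotonic, sleep=time.sleep):
        self.wait_time = wait_time  # seconds, as in server_wait_time
        self._clock = clock
        self._sleep = sleep
        self._last_start = None     # start time of the previous connection

    def delay(self):
        """Sleep only for the part of wait_time that has not yet passed."""
        if self._last_start is not None:
            elapsed = self._clock() - self._last_start
            remaining = self.wait_time - elapsed
            if remaining > 0:
                self._sleep(remaining)
        # The next delay is measured from the start of this connection.
        self._last_start = self._clock()
```

Measuring from the start of the previous connection means a slow
fetch already counts toward the wait, so the crawler does not pause
longer than the configured interval.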
Mon Dec 7 21:46:34 1998 Geoff Hutchison <ghutchis@wso.williams.edu>
* htsearch/htsearch.cc: Fix thinko with multiple excludes and
restricts. Pointed out by Gilles.
* htcommon/defaults.cc: Add new option server_wait_time for the
number of seconds to wait between requests.
* htdig/Retriever.cc: Use server_wait_time to call sleep() before
requests. Should help prevent server abuse. :-)
* htcommon/WordList.cc(valid_word): Remove unnecessary code.
* htcommon/DocumentRef.cc: Fix typo that added description text
that contained punctuation or was too short.
Sun Dec 6 13:12:55 1998 Geoff Hutchison <ghutchis@ethel.williams.edu>
* htsearch/parser.cc: Check for empty boolean searches and report
an error. Fixes bug reported by Chuck O'Donnell <cao@bus.net>.
* install-sh, mkinstalldirs: Import latest version from autoconf.
* htcommon/DocumentRef.cc: Add the text of descriptions to the
word database with weight description_factor.
* htcommon/WordList.cc: Ensure duplicate words have minimum
location and anchor attributes.
* htcommon/WordRecord.h: Ensure blank WordRecords have a default
count of 1 since a word has to exist to have a WordRecord!
* htdig/ExternalParser.cc, htdig/PDF.cc, htfuzzy/EndingsDB.cc:
Ensure temporary files are placed in TMPDIR if it's set.
* htdig/Retriever.cc: Don't add the text of descriptions to the
word db here, it's better to do it in the DocumentRef itself.
* htmerge/words.cc: Check for word entries that are essentially
duplicates and compact them.
Sat Dec 5 01:10:46 1998 Geoff Hutchison <ghutchis@wso.williams.edu>
* htdoc/THANKS.html: Updated for recent submissions.
* htdoc/FAQ.html: Cleaned up title.
* htdoc/uses.html: Added more sites and cleaned up the HTML.
Fri Dec 4 20:15:41 1998 Geoff Hutchison <ghutchis@wso.williams.edu>
* db/os/os_fsync.c, db/mutex/mutex.c: Patch from Klaus Mueller
<K.Mueller@intershop.de> to compile under CygWinB20.
* htdig/HTML.cc: Fix mistake in last update--file was included
twice.
* htdig/Retriever.cc: Do a check for blank URLs before adding them
to the list to be retrieved.
Fri Dec 4 19:21:17 1998 Didier Gautheron <dgautheron@magic.fr>
* htdig/HTML.cc: Fix parser bug with < becoming a tag.
* htlib/Dictionary.cc: Added check for empty dictionaries.
* htlib/URL.cc: Allow server_aliases to work under virtual hosts.
* htmerge/htmerge.cc: Remove previous db.words.db file before
doing a word merging. Fixes bug with deleted documents keeping
entries.
* htdig/main.cc, htdig/Retriever.h, htdig/Retriever.cc: Added
parameter to Initial function to prevent URLs from being checked
twice during an update dig.
* htcommon/WordList.cc, htmerge/words.cc: Don't store c:1 and a:0
entries in db.wordlist to save space.
Fri Dec 4 19:08:28 1998 Geoff Hutchison <ghutchis@wso.williams.edu>
* configure.in, Makefile.in, Makefile.config.in: Remove DB_DIR and
RX_DIR.
* configure: Regenerated for configure.in changes.
* htsearch/htsearch.cc: Added usage message for the command line.
Fri Dec 4 18:52:55 1998 Geoff Hutchison <ghutchis@wso.williams.edu>
* htdoc/FAQ.html: Added question about phrase matching.
Fri Dec 4 21:21:00 1998 Alexander Bergolth <leo@leo.wu-wien.ac.at>
* configure.in: Check if the third argument of getpeername is a
size_t* or an unsigned int*.
* include/htconfig.h.in: Define GETPEERNAME_LENGTH_T.
* htlib/Connection.cc: Use GETPEERNAME_LENGTH_T as the type of the
third getpeername argument. Included strings.h which is needed for
FD_ZERO on AIX.
Thu Dec 3 23:03:15 1998 Geoff Hutchison <ghutchis@wso.williams.edu>
* configure.in: Check for getopt.h for those platforms that don't
have it. Fix checks for db and rx dirs since these names won't
change.
* include/htconfig.h.in: Define HAVE_GETOPT_H.
* configure: Generate from configure.in with latest autoconf
(2.12.2).
* htdig/Plaintext.cc: Removed compiler warnings.
* htdig/main.cc, htfuzzy/htfuzzy.cc, htmerge/htmerge.cc,
htnotify/htnotify.cc, htsearch/htsearch.cc: Use configure check to
only include getopt.h when it exists.
* htcommon/defaults.cc: Add new option http_proxy_exclude for
servers that shouldn't use the proxy, from a patch by Gilles
Detillieux.
* htdig/Document.h, htdig/Document.cc: Use it, from a patch by Gilles.
Tue Dec 1 21:36:37 1998 Geoff Hutchison <ghutchis@wso.williams.edu>
* Makefile.in: Fixed bug with "make depend," noted by Morgan Davis
<mdavis@cts.com>.
* htdig/main.cc, htfuzzy/htfuzzy.cc, htmerge/htmerge.cc,
htnotify/htnotify.cc, htsearch/htsearch.cc: Add include <getopt.h>
to help compiling under Win32 with CygWinB20.
* htdig/Retriever.cc: Update hopcount correctly by taking the
shortest paths to documents.
* htlib/DB2_db.cc: Added fix from Alexander Bergolth for Berkeley
DB under AIX.
* htlib/StringMatch.cc: Added fix from Christian Schneider
<cschneid@relog.ch>, discovered from behavior with limit_urls_to.
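The hopcount fix in this entry (keep the shortest path to each
document) amounts to taking a minimum whenever a document is reached
again by another link. A minimal Python sketch follows; record_link
and the dictionary of hopcounts are hypothetical names, not the data
structures in htdig/Retriever.cc.

```python
def record_link(hopcounts, parent, child):
    """Record a link and keep the smallest hopcount seen for the child."""
    candidate = hopcounts[parent] + 1
    # Only overwrite when the new path is shorter than the known one.
    if child not in hopcounts or candidate < hopcounts[child]:
        hopcounts[child] = candidate
    return hopcounts[child]


if __name__ == "__main__":
    hops = {"/": 0}
    record_link(hops, "/", "/a")
    record_link(hops, "/a", "/b")
    record_link(hops, "/", "/b")  # a shorter path to /b is found later
    print(hops)
```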
Tue Dec 1 18:06:33 1998 Geoff Hutchison <ghutchis@wso.williams.edu>
* htdoc/hts_form.html: Explained why config fields reject periods.
* htdoc/FAQ.html: Added information about Internal Server Errors.
* htdoc/uses.html: Updated with more sites, change e-mail to Geoff.
Sun Nov 29 21:26:56 1998 Geoff Hutchison <ghutchis@wso.williams.edu>
* htsearch/htsearch.cc: Fix last update so it compiles (oops!).
* htdig/Document.cc: As above!
Sun Nov 29 20:06:58 1998 Geoff Hutchison <ghutchis@wso.williams.edu>
* htsearch/htsearch.cc: Improved support for multiple restrict and
exclude patterns, based on code from Gilles Detillieux
and William Rhee <willrhee@umich.edu>.
* htdig/Document.cc, htdig/PDF.cc: Fixed problems under FreeBSD
where <sys/types.h> needed to be before <sys/stat.h>, noted by
Gilles.
* htdig/Server.cc: Fixed bug with robots.txt files containing
tabs, based on patch from Christian Schneider <cschneid@relog.ch>.
* htdig/Document.cc: Fixed core dumps caused by mystrptime
returning NULL. Instead, we'll use the current timestamp. Noted by
Michael Hauber <mhauber@datacore.ch> and
<MARK_ALLEYNE@Non-HP-UnitedKingdom-om8.om.hp.com>.
Fri Nov 27 19:09:33 1998 Geoff Hutchison <ghutchis@wso.williams.edu>
* db/*: Import of Sleepycat's Berkeley DB 2.5.9
* rx/*: Import of FSF rx 1.5
* configure, configure.in: Updated to deal with changes in db, rx
directories.
* Attic/db-2.4.14.tar.gz: Removed old db package for update.
* htsearch/parser.cc: Removed bogus code with "%01" -> "|"
* htlib/URL.cc: Considers URLs with "%7E" to be equivalent to "~"
* htlib/String.cc: Changed MinimumAllocationSize to cut down on
memory usage on small strings.
* htdig/Retriever.h, htdig/Retriever.cc, htdig/HTML.cc: Changed
Retriever::got_word to check for small words, valid_punctuation to
remove bugs in HTML.cc.
* htcommon/defaults.cc: Changed backlink_factor to 1000,
description_factor to 150, match_method to and, and
meta_description_factor to 50. Should produce more accurate search
results.
* htcommon/WordList.cc: Fixed bug with bad_words and
MAX_WORD_LENGTH, noted by Jeff Breidenbach <jeff@alum.mit.edu>.
* README: Updated to reflect bug-tracking system.
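The URL change in this entry treats "%7E" as equivalent to "~", so
both spellings of a user directory map to the same document. A small
Python sketch of that normalization is given below; normalize_tilde
is a hypothetical name, and accepting the lowercase "%7e" as well is
an assumption not stated in the entry.

```python
def normalize_tilde(url):
    """Rewrite the percent-encoded tilde so equivalent URLs compare equal."""
    # "%7e" (lowercase hex) is handled too; that part is an assumption.
    return url.replace("%7E", "~").replace("%7e", "~")


if __name__ == "__main__":
    print(normalize_tilde("http://www.example.com/%7Euser/index.html"))
```

Without this step the same page could be indexed twice, once under
each spelling.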
Tue Nov 24 15:57:28 1998 Geoff Hutchison <ghutchis@wso.williams.edu>
* htdig/Retriever.cc: Added patch to use local_default_doc with
local_user_urls from Gilles Detillieux
<grdetil@scrc.umanitoba.ca>.
Mon Nov 23 18:57:16 1998 Geoff Hutchison <ghutchis@wso.williams.edu>
* htdoc/RELEASE.html, htdoc/bugs.html, htdoc/contents.html,
htdoc/where.html: Updated for new bug reporting system.
* htdoc/TODO.html: Updated To Do w/ current status.
Sun Nov 22 14:03:06 1998 Geoff Hutchison <ghutchis@wso.williams.edu>
* installdir/rundig: Added checks for synonym databases older than
the synonym files.
* htcommon/defaults.cc: New config options "description_factor"
for weighting words added as link descriptions, and
"no_excerpt_show_top" to show the top of an excerpt instead of the
"no_excerpt_text".
* htdig/Retriever.cc: Use "description_factor" to weight link
descriptions with the documents at the end of the link.
* htsearch/Display.cc: Adjust date_factor and backlink_factor
rankings to produce better results.
* htsearch/Display.cc: Use "no_excerpt_show_top."
* htsearch/htsearch.cc: Don't remove boolean operators from
boolean search strings!
Thu Nov 19 01:31:37 1998 Geoff Hutchison <ghutchis@wso.williams.edu>
* htdoc/FAQ.html: Update for -ldb problem on Digital UNIX.
Wed Nov 18 05:14:53 1998 Geoff Hutchison <ghutchis@wso.williams.edu>
* htdoc/FAQ.html: Update FAQ w/ new questions, better responses.
* htdoc/mailing.html: Mention additional archive at
www.mail-archive.com.
* htdoc/require.html: Update requirements (libstdc++ instead of libg++).
Tue Nov 17 23:13:04 1998 Geoff Hutchison <ghutchis@wso.williams.edu>
* contrib/wordfreq/wordfreq.pl: Added changes by Isoif.
* htsearch/Display.cc: Added HTTP_REFERER to htsearch logging
* htdig/Document.cc: Fixed memory leak as a result of thinko.
* htcommon/DocumentRef.cc: Removed limit on number of link
descriptions.
Mon Nov 16 22:30:07 1998 Geoff Hutchison <ghutchis@wso.williams.edu>
* htcommon/defaults.cc: Declare new config options backlink_factor
and date_factor for counting document backlink counts and modified
dates in rankings.
* htsearch/Display.cc: Use above factors.
* htsearch/ResultMatch.cc: Clarify getScore() comments.
* htlib/mktime.c: Import new version.
* installdir/htdig.conf: Add max_doc_size example (to help w/FAQ).
Mon Nov 16 10:46:15 1998 Geoff Hutchison <ghutchis@wso.williams.edu>
* htdig/ExternalParser.cc: Add checks for null tokens, adapted
from patch by Vadim Chekan.
* htdig/Retriever.cc: Count docBackLinks accurately (previously
all docs had count of 2!).
Sun Nov 15 17:04:34 1998 Geoff Hutchison <ghutchis@wso.williams.edu>
* htdig/HTML.cc(do_tag): Fix for refresh tags w/o URLs.
* htmerge/docs.cc, htmerge/words.cc: Change \r to \n, as mentioned
by Andrew Bishop.
* htcommon/DocumentRef.h, htcommon/DocumentRef.cc: Define new fields
docBackLinks (backlink count) and docSig (document signature).
* htdig/Retriever.cc: Keep track of docBackLinks.
* htsearch/Display.cc: Add variable BACKLINKS to display the count.
Sat Nov 14 20:30:18 1998 Geoff Hutchison <ghutchis@wso.williams.edu>
* htdig/HTML.cc(parse, do_tag): Ensure links respect META robot
settings. Patch contributed by Michael Spann
<mikes@mail.sv.dialogic.com>.
* htdig/HTML.cc(do_tag): Eliminate bug that ignores "?" in URLs
* htdig/HTML.cc(do_tag): Add support for META refresh tags as
"redirects", submitted by Aidas Kasparas
<kaspar@dobilas.infosistema.lt>.
Thu Nov 12 04:13:26 1998 Geoff Hutchison <ghutchis@wso.williams.edu>
* htdoc/contents.html: Added link to jitterbug bug db.
Sun Nov 8 21:10:19 1998 Geoff Hutchison <ghutchis@wso.williams.edu>
* htdoc/ChangeLog, htdoc/RELEASE.html, htdoc/THANKS.html:
Correct spelling error with Rene' Seindal's name.
* htdoc/hts_templates.html: Update to improve clarity.
Sun Nov 8 20:33:22 1998 Geoff Hutchison <ghutchis@wso.williams.edu>
* htdig/Document.cc: Changed reset to keep proxy settings--fixes
bug noted by Didier Gautheron <dgautheron@magic.fr>
Fri Nov 6 17:07:00 1998 Geoff Hutchison <ghutchis@wso.williams.edu>
* contrib/wordfreq/wordfreq.pl: Updated with patch from Isoif
Fettich <ifettich@netsoft.ro> to use Berkeley DB.
* contrib/whatsnew/whatsnew.pl: Fixed mistake from Oct 26 change.
* contrib/htparsedoc/parse_word_doc.pl: Added file contributed by
Jesse.
* contrib/README: Updated to include short descriptions of the scripts.
* contrib/multidig/*: New scripts to make working with multiple DB
a little easier.
* configure, configure.in: Added changes to support snapshots.
* .version: Resurrected to automate snapshot versions.
Wed Nov 4 20:13:10 1998 Geoff Hutchison <ghutchis@wso.williams.edu>
* htdoc/contents.html: Added "Contributors" for THANKS.html
* htdoc/THANKS.html: Added acknowledgement to contributors.
Wed Nov 4 15:02:43 1998 Geoff Hutchison <ghutchis@wso.williams.edu>
* htnotify/htnotify.cc: Fixed buglet with -F flag to sendmail.
* htdig/Plaintext.cc: Added patch from Vadim Chekan to change char
to unsigned char to fix reading Cyrillic plaintext files.
Mon Nov 2 15:34:53 1998 Geoff Hutchison <ghutchis@wso.williams.edu>
* htnotify/htnotify.cc, Makefile.config.in, README:
Changed "HTDig" to "ht://Dig."
Sun Nov 1 20:34:14 1998 Geoff Hutchison <ghutchis@wso.williams.edu>
* Makefile.in: Fixed buglet with dist target.
* htdig/Makefile.in: Fixed buglet with distclean target.
* htdoc/FAQ.html, htdoc/RELEASE.html, htdoc/attrs.html,
htdoc/cf_byname.html, htdoc/cf_byprog.html, htdoc/htdig.html,
htdoc/hts_templates.html: Updated documentation for new features,
bug-fixes in ht://Dig 3.1.0b2.
* htlib/Makefile.in, htlib/lib.h: Call mytimegm.cc instead of timegm.c.
* Attic/makedp: Remove file generated by configure
* htdig/Document.cc: Remove const from *ext to fix compiler warning.
Sun Nov 1 00:17:08 1998 Geoff Hutchison <ghutchis@wso.williams.edu>
* htsearch/Display.cc: Added template var DESCRIPTION as first
item in DESCRIPTIONS, as requested by Ryan Scott
<test@netcreations.com>.
* htlib/mytimegm.cc: Resurrected mytimegm() until problems with
glibc version can be solved.
* htdig/Document.cc, htdig/Retriever.cc, htfuzzy/Prefix.cc,
htsearch/WeightWord.cc, htsearch/htsearch.cc: Replaced system
calls with htlib/my* functions.
Sat Oct 31 23:58:22 1998 Geoff Hutchison <ghutchis@wso.williams.edu>
* htlib/URL.cc: Fixed compiler warning.
* rx-1.5/Attic/Makefile, rx-1.5/Attic/config.log:
Removed useless Makefile and config.log file.
Tue Oct 27 22:53:03 1998 Andrew Scherpbier <andrew@contigo.com>
* */Makefile.in (depend): Fixed so that 'make depend' works
again. (Not sure exactly how long it was broken!)
Tue Oct 27 20:00:16 1998 Geoff Hutchison <ghutchis@wso.williams.edu>
* Makefile.in: Fix buglet with distclean target
* configure, configure.in: Added check for LOCALTIME_R, removed
test for timegm replacement, changed compiler for most tests to
$CC.
* include/htconfig.in: Added option for LOCALTIME_R.
* htlib/timegm.c, htlib/mktime.c: Fixed some compilation problems.
* htlib/Makefile.in: Remove mktime.o since source is included in
timegm.o.
Tue Oct 27 13:31:25 1998 Geoff Hutchison <ghutchis@wso.williams.edu>
* htlib/mktime.c: Imported new version from glibc-2.0.99.
* htcommon/DocumentDB.cc: Fixed bug noted by Vadim Chekan with
CreateSearchDB.
Mon Oct 26 15:27:28 1998 Geoff Hutchison <ghutchis@wso.williams.edu>
* Makefile.config.in, configure.in, configure: Fixed problem with
-ldb, -lrx, etc. not being declared in $LIBS
* htdoc/install.html: Added remarks about using ./configure
--prefix=
* README: Cleaned up for new URLs, version numbers, etc.
* htsearch/htsearch.cc: Added patch by Esa Ahola fixing bug with
not ignoring bad_words properly.
* contrib/whatsnew/whatsnew.pl: Added fix from Jacques Reynes
<Jacques.Reynes@cict.fr> to get whatsnew to work with Berkeley DB.
* htdig/Retriever.cc, htdig/Document.cc: Fixed bug introduced by
Oct 18 change. Authorization will not be cleared.
* htlib/URL.cc: Fixed new -Wall warnings.
Wed Oct 21 13:30:05 1998 Geoff Hutchison <ghutchis@wso.williams.edu>
* htlib/timegm.c: Corrected Oct 17 change. Should now work. :-)
* htcommon/defaults.cc: Added defaults for new directives
server_aliases and limit_normalized.
* htdig/HTML.cc: Cleaned up HTML parsing based on patch by Rene'
Seindal.
Wed Oct 21 18:31:00 1998 Alexander Bergolth <leo@leo.wu-wien.ac.at>
* htlib/URL.cc, htlib/URL.h: Added patch to support translation of
server names. (Configuration directive: server_aliases)
* htdig/Retriever.cc, htdig/htdig.h, htdig/main.cc:
Additional limiting after normalization of the URL.
(Configuration directive: limit_normalized)
Sun Oct 18 17:19:51 1998 Geoff Hutchison <ghutchis@wso.williams.edu>
* htlib/Connection.h, htlib/Connection.cc: Define new function
timeout() as adapted from a patch by Rene' Seindal.
* htdig/Document.cc: Use it as adapted from a patch by Rene' Seindal.
Sun Oct 18 16:33:58 1998 Geoff Hutchison <ghutchis@wso.williams.edu>
* htcommon/DocumentDB.cc: Changed deserialize function to
explicitly delete DocumentRef.
* htcommon/DocumentRef.cc: Added trap for DOC_STRING value.
* htdig/Retriever.cc: Delete and reallocate Document variable
before retrieving. (Fixes database corruption bug) Removed code to
add a "/" to every URL with a 404--servers should send a redirect
in this case.
Sat Oct 17 20:15:44 1998 Geoff Hutchison <ghutchis@wso.williams.edu>
* htlib/timegm.c: Declare __gmtime_r if not defined
Sat Oct 17 10:15:57 1998 Geoff Hutchison <ghutchis@wso.williams.edu>
* configure.in: Fixed problem with configuring DB_DIR introduced
by Oct 11 change.
* configure: Regenerated by autoconf for above fix.
* htlib/Connection.h, htlib/Connection.cc: Included fixes sent by
Paul J. Meyer <pmeyer@rimeice.msfc.nasa.gov> to fix connections on
DEC Alpha environments.
* htsearch/Display.cc, htsearch/Display.h,
htdoc/hts_templates.html: Added variable CURRENT as the number of
the current match, adapted from a patch by Rene' Seindal
<seindal@webadm.kb.dk>
* htcommon/defaults.cc: Changed htdig.sdsu.edu to www.htdig.org in
start_urls
Wed Oct 14 03:43:22 1998 turtle <turtle@kiwi>
* installdir/htdig.conf: fixed broken link pointed out by
chris@impulsedata.net, moved maintainer stuff up in the file
Sun Oct 11 22:16:27 1998 Geoff Hutchison <ghutchis@wso.williams.edu>
* htlib/DB2_db.cc: Added fix suggested by Domotor Akos
<dome@impulzus.sch.bme.hu> with (char *)NULL cast.
* htlib/Attic/mytimegm.cc: Removed old mytimegm function.
* installdir/syntax.html: Improved boolean method error
message. It now gives examples of boolean expressions.
* htcommon/defaults.cc, htsearch/Display.cc, htsearch/Display.h,
htsearch/parser.cc: Added htsearch logging patch from Alexander
Bergolth.
* */Makefile.in, include/htconfig.h.in, htdig/Document.cc,
htdig/Images.cc, Attic/.version, Makefile.config.in, Makefile.in,
configure, configure.in, mkinstalldirs: Updated Makefiles and
configure variables.
* htfuzzy/Endings.cc, htfuzzy/Fuzzy.cc, htfuzzy/Prefix.cc,
htfuzzy/htfuzzy.cc, htlib/DB2_db.cc, htcommon/DocumentDB.cc:
Removed more -Wall warnings.
Fri Oct 9 00:29:18 1998 Geoff Hutchison <ghutchis@wso.williams.edu>
* htdig/Retriever.cc: Fixed typo with "meta_desription_factor".
* htdig/Images.cc: Use user_agent config in GET request.
Thu Oct 8 09:05:41 1998 Geoff Hutchison <ghutchis@wso.williams.edu>
* installdir/syntax.html: Improved Boolean search description.
Mon Oct 5 11:30:16 1998 Geoff Hutchison <ghutchis@wso.williams.edu>
* contrib/ewswrap/ewswrap.cgi, contrib/ewswrap/htwrap.cgi,
contrib/ewswrap/README: New scripts, contributed by John Grohol
PsyD <johngr@cmhcsys.com>.
Fri Oct 2 13:11:24 1998 Geoff Hutchison <ghutchis@wso.williams.edu>
* htdig/Retriever.cc: Added check for docs removed with
noindex. Now words in these docs should be ignored for the word
db.
Fri Oct 2 13:09:04 1998 Geoff Hutchison <ghutchis@wso.williams.edu>
* CONFIG, Makefile.config.in, Makefile.in, */Makefile.in,
htcommon/defaults.cc, htdig/main.cc, htfuzzy/htfuzzy.cc,
htmerge/htmerge.cc, htnotify/htnotify.cc, include/htconfig.h.in:
More configure improvements--use top_srcdir instead of
HTDIG_TOP, use PACKAGE, VERSION, etc.
Fri Oct 2 11:32:59 1998 Geoff Hutchison <ghutchis@wso.williams.edu>
* htlib/StringList.cc: Added patch by Alexander Bergolth for bug
with multiple delimiter characters
Fri Oct 2 15:22:06 1998 Geoff Hutchison <ghutchis@wso.williams.edu>
* installdir/rundig, configure.in, CONFIG, CONFIG.in, aclocal.m4,
configure: Improvements in configure.in, notably using --prefix=
and --exec-prefix=
Tue Sep 29 19:26:11 1998 Geoff Hutchison <ghutchis@wso.williams.edu>
* htdig/HTML.cc: Added patch from Tim Frost <tim@nz.eds.com> for
single quotes around URLs.
* htfuzzy/Prefix.cc: Added patch from Esa to fix Prefix matching
for capitalization.
* htcommon/defaults.cc: Added modification_time_is_now config
* htdig/Document.cc, htdig/Retriever.cc: Added patch from Andrew
Bishop <amb@gedanken.demon.co.uk> for above to use modification
times when servers do not supply them.
* htsearch/htsearch.cc: Added patch from Andrew Bishop for -c switch.
Wed Sep 23 14:46:34 1998 Geoff Hutchison <ghutchis@wso.williams.edu>
* htcommon/defaults.cc, htdig/Server.cc: Added case_sensitive
attribute to work on case insensitive servers.
Wed Sep 23 11:58:22 1998 Geoff Hutchison <ghutchis@wso.williams.edu>
* htsearch/Display.cc: re-fixed bug noted by Alexander Bergolth
* htlib/Attic/timegm.cc, htlib/Makefile.in, htlib/mktime.c,
htlib/mytimegm.cc, htlib/timegm.c: Switched to using glibc timegm
replacement.
* configure, configure.in, Makefile.config.in: Add configure
searches for acroread and sendmail programs.
* htnotify/Makefile.in, htnotify/htnotify.cc,
htcommon/Makefile.in, htcommon/defaults.cc: Use them.
* htdig/HTML.cc: Fix thinko in META robots tag.
* htcommon/defaults.cc: Define iso_8601 date formatting option
* htsearch/Display.cc, htnotify/htnotify.cc: Use it as suggested
by Knut A. Syed <Knut.Syed@nhh.no>
Fri Sep 18 14:35:02 1998 Geoff Hutchison <ghutchis@wso.williams.edu>
* htsearch/Display.cc: Fixed bug noted by Alexander Bergolth
<leo@strike.wu-wien.ac.at> in exclude logic
* htdig/HTML.cc: Fixed bug in comma-separated keywords noted by
<C.H.Liddiard@qmw.ac.uk>
* installdir/synonyms: New version contributed by John Banbury
<lijab@flinders.edu.au>
Fri Sep 18 00:38:09 1998 Geoff Hutchison <ghutchis@wso.williams.edu>
* .version: Bump to 3.1.0b2
* htsearch/Makefile.in, htdig/Makefile.in, htfuzzy/Makefile.in,
htlib/Makefile.in, htmerge/Makefile.in,
htnotify/Makefile.in, htcommon/Makefile.in: Remove include
.sniffdir directive.
* htdig/HTML.cc: Fix horrible META description coding.
* htfuzzy/EndingsDB.cc, htfuzzy/Fuzzy.cc, htfuzzy/Synonym.cc,
htfuzzy/htfuzzy.cc: Change "\r" to "\n" in statistics on
suggestion of Andrew M. Bishop <amb@gedanken.demon.co.uk>
* Makefile.config.in: Remove -ggdb from LDFLAGS.
Tue Sep 15 22:31:48 1998 turtle <turtle@kiwi>
* Makefile.in: add substitution for @DATABASE_DIR@
Thu Sep 10 00:06:58 1998 Geoff Hutchison <ghutchis@wso.williams.edu>
* htdig/HTML.cc: Change debug level of META tags.
* htsearch/TemplateList.cc, htsearch/htsearch.cc, htsearch/Display.cc,
htsearch/Display.h: Backed out builtin-long default from Monday, now
use error handler
Mon Sep 7 23:19:12 1998 Geoff Hutchison <ghutchis@wso.williams.edu>
* contrib/htparsedoc: Added contributed external parser for MS
Word documents by Richard Jones <rjones@imcl.com>.
* htdig/Document.cc: Added fix to use htparsedoc.
* htdoc/*.html: Merged in new documentation for htdig-3.1.0b1.
* htdig/HTML.cc: Extended "noindex" behavior in previous patch.
* htcommon/defaults.cc: Added user_agent config option.
* htdig/Document.cc: Use it.
Mon Sep 7 00:34:19 1998 Geoff Hutchison <ghutchis@wso.williams.edu>
* htcommon/DocumentRef.h: Added DocState for documents marked as
"noindex".
* htdig/HTML.cc, htdig/Retriever.h, htdig/Retriever.cc,
htmerge/docs.cc: Use it to remove them.
* htsearch/TemplateList.cc: Add default template of builtin-long
to slot 0 in case of an error.
* htsearch/Display.cc: Use it.
Sun Sep 6 21:36:16 1998 Geoff Hutchison <ghutchis@wso.williams.edu>
* htcommon/defaults.cc: Sorted the current list of defaults, added
"pdf_parser" for the program to use in PDF.cc.
* htdig/PDF.cc: Use it, checking for the file before calling
system to fail gracefully.
* htlib/URL.cc: Bug fix for http:/ v. http://
Sat Sep 5 23:11:48 1998 Geoff Hutchison <ghutchis@wso.williams.edu>
* htlib/String.cc: Added patch by Zvi Har'El
<rl@math.technion.ac.il> to indexOf function to prevent "false
positive" matches.
* installdir/nomatch.html, installdir/syntax.html: Fixed reference
to ht://Dig 3.0.
* htdig/Document.cc: Use robotstxt_name as user-agent as a more
consistent approach.
* htsearch/parser.cc: Convert "%01" to "|" to support <SELECT
... MULTIPLE> tags.
Thu Sep 3 20:53:51 1998 Geoff Hutchison <ghutchis@wso.williams.edu>
* htsearch/Makefile.in: Remove reference to -lgdbm
* htsearch/Display.cc: Send Content-type header after all variable
expansion is completed.
* htcommon/WordList.cc: Removed warning under egcs-1.1
Tue Aug 11 08:58:34 1998 Geoff Hutchison <ghutchis@wso.williams.edu>
* htsearch/Display.cc, htdig/Retriever.h,
htdig/Retriever.cc, htdig/Parsable.h, htdig/Parsable.cc,
htdig/HTML.h, htdig/HTML.cc, htcommon/defaults.cc,
htcommon/DocumentRef.h, htcommon/DocumentRef.cc,
htcommon/DocumentDB.cc:
Second patch for META description tags. New field in DocDB for the
desc., space in word DB w/ proper factor.
* htmerge/docs.cc: Added statistic for total size of docs in DB.
Thu Aug 6 10:15:22 1998 Geoff Hutchison <ghutchis@wso.williams.edu>
* htdig/Retriever.cc: Added "local_dir_doc" config option,
the default filename in a directory.
* htcommon/defaults.cc: Fixed "elipses" spelling mistake,
local_dir_doc as above
Tue Aug 4 11:34:46 1998 Geoff Hutchison <ghutchis@wso.williams.edu>
* htlib/Configuration.cc: Added fix by Philippe Rochat
<prochat@lbdsun.epfl.ch> to remove whitespace after config
options.
* htdig/HTML.cc, htdig/HTML.h: Added support for META robots tags.
Mon Aug 3 16:50:46 1998 Geoff Hutchison <ghutchis@wso.williams.edu>
* htsearch/ResultList.cc, htnotify/htnotify.cc,
htmerge/htmerge.cc, htmerge/docs.cc, htlib/String.cc,
htlib/ParsedString.cc, htfuzzy/Substring.cc,
htfuzzy/Prefix.cc, htfuzzy/Exact.cc,
htdig/SGMLEntities.cc, htdig/Retriever.cc, htdig/PDF.cc,
htdig/HTML.cc, htdig/Document.cc:
Fixed compiler warnings under -Wall
Mon Aug 3 05:56:23 1998 Geoff Hutchison <ghutchis@wso.williams.edu>
* htsearch/Display.cc: Spelling correction for "ellipses"
Thu Jul 23 12:14:34 1998 Geoff Hutchison <ghutchis@wso.williams.edu>
* htdig/PDF.cc, htdig/PDF.h, htdig/Document.cc: Added files (and
patch) from Sylvain Wallez for PDF parsing. Incorporates fix for
non-Adobe PDFs.
* htcommon/defaults.cc: Removed .pdf extension from bad_extensions.
Wed Jul 22 10:04:31 1998 Geoff Hutchison <ghutchis@wso.williams.edu>
* htsearch/Display.cc: Added patch from Sylvain Wallez
<s.wallez.alcatel@e-mail.com> to use the filename if no title is found.
* htnotify/htnotify.cc: Added patch from Chris Jason
Richards <richards@cs.tamu.edu> to fix problems with sendmail.
Tue Jul 21 09:56:58 1998 Geoff Hutchison <ghutchis@wso.williams.edu>
* htsearch/Display.cc: Added patch by Rob Stone
<rob@psych.york.ac.uk> to create new environment variables to
htsearch: SELECTED_FORMAT and SELECTED_METHOD.
Sun Jul 19 09:51:47 1998 Andrew Scherpbier <andrew@contigo.com>
* configure.in (berkeley db stuff): Added the berkeley db .tar.gz
to the distribution and modified configure.in to extract it if it
needs to.
Thu Jul 9 09:39:01 1998 Geoff Hutchison <ghutchis@wso.williams.edu>
* htdig/Server.cc, htdig/Retriever.h, htdig/Retriever.cc,
htdig/Document.h, htdig/Document.cc, htcommon/defaults.cc: Added
support for local file digging using patches by Pasi Eronen
<pe@iki.fi>. Patches include support for local user (~username)
digging.
* htdig/HTML.h, htdig/HTML.cc, htcommon/defaults.cc:
Added support for META name=description tags. Uses new config-file
option "use_meta_description" which is off by default.
Mon Jun 22 05:02:01 1998 turtle <turtle@kiwi>
* configure.in:
Added test to make sure that the berkeley db library is present
* .cvsignore: Ignore the berkeley db library
* configure: changed
* Makefile.config.in: Removed GDBM references
* Makefile.in: Removed GDBM references
* .version: updated version to 3.1.0b1
* README: Updated version # and website location
* htdig/HTML.cc: Applied patch that prevented SGML entities that
translate to valid_punctuation characters from becoming part of
words
* configure.in: Removed references to GDBM
* htcommon/defaults.cc: Got rid of my email address as the default
maintainer
* htdig/htdig.conf: simple config file for development
* htlib/String.cc, htlib/Attic/SDSU.h, htlib/Attic/SDSU.cc,
htlib/DB2_db.cc, htlib/Connection.cc, htlib/Configuration.cc,
htlib/BTree.cc: New Berkeley database stuff
* htlib/.sniffdir/ofiles.incl: removed SDSU.*
* installdir/syntax.html, installdir/search.html,
installdir/rundig, installdir/nomatch.html, installdir/htdig.conf,
installdir/footer.html: Changed to use the new
https://htdig.sourceforge.net/ instead of the sdsu site
Sun Jun 21 23:20:14 1998 turtle <turtle@kiwi>
* rx-1.5/rx/Attic/config.log, htsearch/htsearch.cc,
htsearch/Attic/display.cc, htsearch/Display.cc, htmerge/docs.cc,
htlib/.sniffdir/ofiles.incl, htlib/Database.h, htlib/DB2_db.cc,
htlib/DB2_db.h, htlib/Database.cc, htfuzzy/.sniffdir/ofiles.incl,
htfuzzy/Prefix.cc, htfuzzy/Prefix.h, htfuzzy/Makefile.in,
htfuzzy/Fuzzy.cc, htcommon/defaults.cc, configure.in, Makefile.in,
Makefile.config.in: patches by Esa and Jesse to add BerkeleyDB and
Prefix searching
Mon Jun 15 18:15:50 1998 turtle <turtle@kiwi>
* htdig/HTML.cc: Added suggestion by Chris Liddiard to add ',' to
the list of separator characters for meta keyword parsing
Tue May 26 03:58:14 1998 turtle <turtle@kiwi>
* rx-1.5/rx/Attic/config.log, htlib/htString.h, htlib/cgi.cc,
htlib/URL.cc, htlib/String.cc, htlib/ParsedString.cc,
htlib/Database.cc, htlib/Connection.cc: Got rid of compiler
warnings.
* rx-1.5/rx/.cvsignore: added config.log
Fri Apr 3 17:10:44 1998 turtle <turtle@kiwi>
* htsearch/Display.cc: Patch to make excludes work
Tue Mar 10 16:02:32 1998 turtle <turtle@kiwi>
* htlib/strcasecmp.cc: Applied patch by Bernhard Griener to add
arguments checks in the mystrncasecmp() function
Sun Feb 22 17:43:49 1998 turtle <turtle@kiwi>
* htdoc/mailing.html: New mailing list archive location
Tue Feb 17 18:05:40 1998 turtle <turtle@kiwi>
* htdoc/uses.html: added new one
Thu Feb 12 22:22:15 1998 turtle <turtle@kiwi>
* htdoc/uses.html: Added more sites
Mon Jan 5 06:14:11 1998 turtle <turtle@kiwi>
* configure, configure.in: Added check for fstream.h to get rid of
the annoying emails about ht://Dig not compiling...
* Makefile.config.in: Added include of the GDBM library back
* .version: Now at version 3.0.9
* include/htconfig.h.in: Changed refs to time related stuff
* htmerge/htmerge.cc, htmerge/docs.cc: format changes
* htdig/Document.cc: Changed tm from pointer to real structure
* htlib/.sniffdir/ofiles.incl, htlib/timegm.cc: Our own timegm
function
* rx-1.5/rx/.cvsignore, rx-1.5/rx/Attic/Makefile: cvs cleanup
* htmerge/docs.cc: Fixed memory leak
* htlib/lib.h: Added own replacement of timegm()
* htlib/Dictionary.cc: Fixed memory leaks
* htlib/Connection.cc: Fix by Pontus Borg for AIX. Changed
'size_t' to 'unsigned long' for the length parameter for
getpeername()
* htfuzzy/Metaphone.cc: formatting changes
* htdig/Retriever.cc: fixed memory leak
* htdig/Document.cc: Alarm was not cancelled if readHeader
returned anything but OK. Use our own timegm() replacement if
necessary.
* htcommon/DocumentRef.h, htcommon/DocumentRef.cc: format changes
* htcommon/DocumentDB.h: reformatting
* htcommon/DocumentDB.cc: Fixed major memory leak
* include/.cvsignore, include/Attic/htconfig.h, rx-1.5/.cvsignore,
rx-1.5/Attic/config.cache, rx-1.5/Attic/config.status,
rx-1.5/rx/.cvsignore, rx-1.5/rx/Attic/config.status,
htlib/Attic/htlib.proj, htmerge/.cvsignore,
htmerge/Attic/htmerge.proj, htnotify/.cvsignore,
htnotify/Attic/htnotify.proj, htsearch/.cvsignore,
htsearch/Attic/htsearch.proj, Attic/config.cache,
htcommon/Attic/htcommon.proj, htfuzzy/.cvsignore,
htfuzzy/Attic/htfuzzy.proj, lookfor: General cleanup of archived
stuff
* .cvsignore: config.cache added
* htdig/.cvsignore: Added htdig
Tue Dec 16 15:57:22 1997 turtle <turtle@kiwi>
* htdig/Document.cc: Added little patch by Tobias Oetiker
<oetiker@ee.ethz.ch> that should fix problems with timeouts.
Thu Dec 11 00:28:59 1997 turtle <turtle@kiwi>
* htlib/URL.h, htlib/URL.cc: Added double slash removal code.
These were causing loops.
Thu Oct 23 18:01:10 1997 turtle <turtle@kiwi>
* htlib/Connection.cc: Fix by Pontus Borg for AIX. Changed
'size_t' to 'unsigned long' for the length parameter for
getpeername()
Mon Oct 13 02:13:52 1997 turtle <turtle@kiwi>
* htdig/Attic/Makefile, htdig/Attic/htdig.proj: remove files that
shouldn't be in the repository
* htdig/.cvsignore: Ignore Makefile
* htdoc/cf_byname.html, htdoc/cf_byprog.html, htdoc/attrs.html,
htdoc/ChangeLog: Added documentation for the external_parsers
attribute.
Mon Jul 14 15:32:22 1997 turtle <turtle@kiwi>
* htdoc/uses.html: added cambridge
Wed Jul 9 15:57:30 1997 turtle <turtle@kiwi>
* htdoc/uses.html: added the rhodos project
Mon Jul 7 22:15:45 1997 turtle <turtle@kiwi>
* htdig/Document.cc: Removed old getdate() code that replaced '-'
with ' '.
* htlib/URL.cc: Sequences of "/./" are now replaced with "/" to
reduce the chance of infinite loops
* htdig/Document.cc: Added better date parsing. Now also supports
the old RFC 850 format
Thu Jul 3 17:44:39 1997 turtle <turtle@kiwi>
* htdoc/cf_byname.html, htdoc/cf_byprog.html,
htcommon/defaults.cc, htdig/htdig.h, htdoc/attrs.html,
htlib/Configuration.h, htlib/URL.cc, htdig/Attic/Makefile,
htdig/Document.cc: Added support for virtual hosts
Mon Jun 30 17:07:49 1997 turtle <turtle@kiwi>
* htdoc/uses.html: Added DePaul University
Tue Jun 24 14:59:45 1997 turtle <turtle@kiwi>
* Makefile.in: Fixed syntax error in the installation target.
Mon Jun 23 17:33:14 1997 turtle <turtle@kiwi>
* htdig/Attic/teamball.conf, htdig/Attic/tsdsu.conf,
htdig/Attic/rohan.conf, htdig/Attic/sdsu.conf, htdig/Attic/t.conf,
htdig/Attic/nsdsu.conf, htdig/Attic/daztec.conf,
htdig/Attic/max.conf, htdig/htdig.conf, htdig/Attic/Makefile,
htdig/Attic/catalog.conf: Removed old config files
* htdoc/FAQ.html: FAQ initial
* htdoc/contents.html: Added link to the new FAQ
* htdoc/FAQ.html: *** empty log message ***
* htnotify/htnotify.cc: Added version info to the usage output
* htfuzzy/htfuzzy.cc: Added version info to the usage output
* htmerge/htmerge.cc: Added version info to usage message
* htdig/main.cc: Added version info to the usage message
Mon Jun 16 15:35:56 1997 turtle <turtle@kiwi>
* installdir/footer.html: Changed the hardcoded version number to
the new VERSION variable
* htdoc/hts_templates.html: Added docs for the VERSION and PERCENT
variables
* htsearch/Display.cc: Added PERCENT and VERSION variables for the
output templates
Sat Jun 14 18:52:42 1997 turtle <turtle@kiwi>
* htdig/Document.cc: Made redirect detection code more general
Fri Jun 13 05:31:17 1997 turtle <turtle@kiwi>
* htdoc/cf_general.html: Fixed typo
Thu Jun 5 15:00:53 1997 turtle <turtle@kiwi>
* htdoc/uses.html: added VG Gas Analysis Systems
Tue Jun 3 17:49:05 1997 turtle <turtle@kiwi>
* installdir/english.0.original, installdir/english.0: Added new
english dictionary for the endings algorithm
Thu May 29 14:56:40 1997 turtle <turtle@kiwi>
* htdoc/uses.html: Added Indiana University Computer Security
Office
Wed May 28 14:47:25 1997 turtle <turtle@kiwi>
* htdoc/main.html: Fixed typo
Mon May 19 15:23:18 1997 turtle <turtle@kiwi>
* htdoc/uses.html: Added daily californian online
Tue May 13 19:28:32 1997 turtle <turtle@kiwi>
* htdoc/uses.html: Added The Reohr Group
* htdoc/uses.html: Added the Linux Documentation Project
Sun May 11 17:52:05 1997 turtle <turtle@kiwi>
* htdoc/index.html: Made the contents frame a little wider so that
text doesn't wrap
* htdoc/uses.html: Added NOVA and Gajo & Associati
Fri May 2 23:35:56 1997 turtle <turtle@kiwi>
* htdoc/uses.html: added www.bajan.org
Wed Apr 30 22:28:28 1997 turtle <turtle@kiwi>
* htdoc/uses.html: Added Caldera, Inc.
Sun Apr 27 14:43:31 1997 turtle <turtle@kiwi>
* htsearch/parser.cc, htsearch/parser.h, include/Attic/htconfig.h,
htdoc/RELEASE.html, htdoc/uses.html, htdoc/where.html,
htlib/URL.cc, htlib/strcasecmp.cc, htsearch/htsearch.cc, .version,
README, htdig/Attic/Makefile, htdoc/ChangeLog: changes
Mon Apr 21 15:44:39 1997 turtle <turtle@kiwi>
* htsearch/htsearch.cc: Added code to check the search words
against the minimum_word_length attribute
Sun Apr 20 15:27:37 1997 turtle <turtle@kiwi>
* CONFIG: Made paths more generic
* htdig/Document.cc: Added include for ctype.h
* htdig/Plaintext.cc: Fixed bug
Tue Apr 1 17:56:57 1997 turtle <turtle@kiwi>
* htdoc/uses.html: added ukc
Sun Mar 30 01:18:16 1997 turtle <turtle@kiwi>
* htdig/Attic/Makefile, htdoc/uses.html, Attic/Makefile.config,
Attic/config.log, Attic/config.status, .cvsignore, Attic/Makefile,
htsearch/Attic/Makefile, htsearch/.cvsignore,
htnotify/Attic/Makefile, htnotify/.cvsignore, htmerge/.cvsignore,
htmerge/Attic/Makefile, htlib/.cvsignore, htlib/Attic/Makefile,
htfuzzy/.cvsignore, htfuzzy/Attic/Makefile, htcommon/.cvsignore,
htcommon/Attic/Makefile: update
Thu Mar 27 00:06:05 1997 turtle <turtle@kiwi>
* htdig/Plaintext.cc: Applied patch supplied by Peter Enderborg
<pme@ufh.se> to fix a problem with a pointer running off the end
of a string.
Mon Mar 24 04:33:26 1997 turtle <turtle@kiwi>
* rx-1.5/rx/Attic/config.log, rx-1.5/rx/Attic/config.status,
htsearch/htsearch.h, htsearch/parser.h, include/Attic/htconfig.h,
rx-1.5/Attic/config.status, htsearch/Attic/Makefile,
htsearch/ResultList.cc, htsearch/ResultMatch.h,
htsearch/Template.h, htsearch/WeightWord.h, htlib/cgi.cc,
htlib/htString.h, htlib/io.cc, htmerge/Attic/Makefile,
htmerge/htmerge.h, htnotify/Attic/Makefile, htlib/StringList.cc,
htlib/StringList.h, htlib/String_fmt.cc, htlib/URL.h,
htlib/URLTrans.cc, htlib/Attic/SDSU.cc, htlib/Attic/String.h,
htlib/ParsedString.h, htlib/String.cc, htfuzzy/htfuzzy.cc,
htlib/Attic/Makefile, htlib/Configuration.cc, htlib/Connection.cc,
htlib/Database.h, htdig/URLRef.h, htfuzzy/Attic/Makefile,
htfuzzy/Exact.cc, htfuzzy/Fuzzy.h, htfuzzy/Substring.cc,
htfuzzy/SuffixEntry.h, htdig/Plaintext.cc, htdig/Postscript.cc,
htdig/SGMLEntities.cc, htdig/Server.cc, htdig/Server.h,
htdig/Attic/Makefile, htdig/ExternalParser.cc,
htdig/ExternalParser.h, htdig/Parsable.h, htcommon/Attic/Makefile,
htcommon/DocumentRef.h, htcommon/WordList.cc, htcommon/WordList.h,
htcommon/WordReference.h, htdig/Document.h, Attic/config.status,
configure, configure.in, Attic/Makefile, Attic/Makefile.config,
Attic/config.cache, Attic/config.log, Makefile.config.in: Renamed
the String.h file to htString.h to help compiling under win32
* Makefile.in: Updated "make dist" to remove CVS stuff
Fri Mar 14 17:15:32 1997 turtle <turtle@kiwi>
* htcommon/defaults.cc: Changed default value for remove_bad_urls
to true
Thu Mar 13 18:37:50 1997 turtle <turtle@kiwi>
* htnotify/htnotify.cc, Attic/Makefile.config,
htdig/SGMLEntities.cc, htdoc/uses.html: Changes
Thu Feb 27 00:52:52 1997 turtle <turtle@kiwi>
* htdoc/uses.html: new uses
Mon Feb 24 17:52:55 1997 turtle <turtle@kiwi>
* htsearch/htsearch.cc, htnotify/Attic/Makefile,
htsearch/Attic/Makefile, htlib/strcasecmp.cc,
htmerge/Attic/Makefile, htlib/Attic/Makefile, htlib/String.cc,
htlib/StringMatch.cc, htdig/SGMLEntities.cc,
htfuzzy/Attic/Makefile, htdig/Attic/Makefile,
htcommon/Attic/Makefile, htcommon/WordList.cc: Applied patches
supplied by "Jan P. Sorensen" <japs@garm.adm.ku.dk> to make
ht://Dig run on 8-bit text without the global unsigned-char option
to gcc.
Sun Feb 23 17:29:38 1997 turtle <turtle@kiwi>
* htdoc/uses.html: *** empty log message ***
Tue Feb 18 15:03:03 1997 turtle <turtle@kiwi>
* htdoc/uses.html: New uses of ht://Dig
Tue Feb 11 00:38:48 1997 turtle <turtle@kiwi>
* htsearch/htsearch.cc: Renamed the very bad wordlist variable to
badWords
Mon Feb 10 17:32:47 1997 turtle <turtle@kiwi>
* htlib/Connection.cc, htdig/Document.h, htdig/Document.cc,
htcommon/DocumentRef.cc, htcommon/DocumentRef.h: Applied AIX
specific patches supplied by Lars-Owe Ivarsson
<lars-owe.ivarsson@its.uu.se>
Fri Feb 7 18:04:13 1997 turtle <turtle@kiwi>
* htlib/URL.cc: Fixed problem with anchors without a URL
Mon Feb 3 17:37:59 1997 turtle <turtle@kiwi>
* .version, README: updated stuff to 3.0.8
* Many files: Initial CVS
Local Variables:
add-log-time-format: current-time-string
End: