This ChangeLog file is no longer maintained - see the git repo history for
more recent changes: https://xapian.org/bleeding
Wed Sep 30 19:40:15 GMT 2015 Olly Betts <olly@survex.com>
* query.cc: Avoid creating temporary string objects when appending a
substring of another string.
Wed Sep 30 19:36:07 GMT 2015 Olly Betts <olly@survex.com>
* query.cc: Use += to build up strings (which should be O(n)), rather
than str = str + str2 (which is likely to be O(n*n)).
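As an illustration of the two query.cc entries above, here is a minimal sketch of the idioms they describe. The helper names join_pieces() and append_range() are hypothetical and are not taken from query.cc; only std::string's own operator+= and append(str, pos, len) are real API.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not from query.cc): join pieces with a separator.
std::string join_pieces(const std::vector<std::string>& pieces, char sep) {
    std::string out;
    for (std::size_t i = 0; i < pieces.size(); ++i) {
        if (i != 0) out += sep;
        // In-place append: the amortised cost is proportional to the size
        // of the piece being added.  Writing `out = out + pieces[i]` would
        // instead copy everything accumulated so far on every iteration.
        out += pieces[i];
    }
    return out;
}

// Hypothetical helper: append part of another string.  append(str, pos, len)
// copies the characters directly, whereas `out += src.substr(pos, len)`
// would first construct a temporary std::string.
std::string append_range(std::string out, const std::string& src,
                         std::string::size_type pos,
                         std::string::size_type len) {
    assert(pos <= src.size());
    out.append(src, pos, len);
    return out;
}
```

Both helpers produce the same result as the slower spellings; the difference is only in how many intermediate copies are made.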
Thu Sep 24 03:50:58 GMT 2015 Olly Betts <olly@survex.com>
* query.cc: Fix $jsonarray not to prepend ']' to the first array
element.
Thu Sep 17 00:28:30 GMT 2015 Olly Betts <olly@survex.com>
* Makefile.am: Fix "make check" compilation failure on platforms
without timegm().
Thu Sep 17 00:26:16 GMT 2015 Olly Betts <olly@survex.com>
* Makefile.am: atomparsetest and htmlparsetest need datetime.cc.
Wed Sep 16 23:43:16 GMT 2015 Olly Betts <olly@survex.com>
* myhtmlparse.cc: Remove unused header.
Wed Sep 16 23:42:29 GMT 2015 Olly Betts <olly@survex.com>
* Makefile.am,datetime.cc,datetime.h,metaxmlparse.cc,myhtmlparse.cc:
Factor out parse_datetime() function.
Wed Sep 16 20:22:03 GMT 2015 Olly Betts <olly@survex.com>
* diritor.cc: Avoid magic_descriptor() in libmagic < 5.15 as it
closes the fd passed to it. The magic_descriptor() code path isn't
actually used currently, so this issue doesn't cause omindex to
misbehave.
Tue Sep 15 22:52:24 GMT 2015 Olly Betts <olly@survex.com>
* index_file.cc,index_file.h,omindex.cc: Pass in Document object and
string for record so extra data can be added before the file is
indexed.
Mon Sep 14 03:15:50 GMT 2015 Olly Betts <olly@survex.com>
* index_file.cc,index_file.h,omindex.cc: Move setting of default
command filters into index_file.cc.
Mon Sep 14 00:54:36 GMT 2015 Olly Betts <olly@survex.com>
* Makefile.am,mime.cc,mime.h,omindex.cc: Factor out the code to find
a MIME type for a file.
Sat Sep 12 05:22:16 GMT 2015 Olly Betts <olly@survex.com>
* docs/Makefile.am: Remove bogus '.' from rm command.
Fri Sep 11 04:42:53 GMT 2015 Olly Betts <olly@survex.com>
* omindex.cc: Tweak order of constant definitions.
Fri Sep 11 04:42:27 GMT 2015 Olly Betts <olly@survex.com>
* index_file.cc,index_file.h: Factor out index_add_document() function.
Thu Sep 10 20:43:47 GMT 2015 Olly Betts <olly@survex.com>
* Makefile.am,index_file.cc,index_file.h,omindex.cc: Refactor to start
to split out the code to index a file.
Thu Sep 10 07:30:25 GMT 2015 Olly Betts <olly@survex.com>
* configure.ac,docs/Makefile.am: Fix generation of overview.rst when
srcdir != builddir.
Thu Sep 10 05:56:12 GMT 2015 Olly Betts <olly@survex.com>
* omindex.cc: Use SAMPLE_SIZE in help text rather than literal 512.
Add '--title-size' option.
Wed Sep 09 04:28:43 GMT 2015 Olly Betts <olly@survex.com>
* omindex.cc: Fix comment typo.
Tue Sep 08 04:31:51 GMT 2015 Olly Betts <olly@survex.com>
* docs/omegascript.rst,query.cc: Fix documentation of $last to say
it's the MSet index *one beyond* the end of the current page.
Reported by Andrew Chilton.
Wed Sep 02 07:10:31 GMT 2015 Olly Betts <olly@survex.com>
* docs/overview.rst: SVG extraction is built-in too.
Wed Sep 02 02:28:45 GMT 2015 Olly Betts <olly@survex.com>
* docs/.gitignore,docs/Makefile.am,docs/overview.rst,gen-mimemap:
Generate the list of recognised mime types and of ignored extensions.
Tue Sep 01 07:22:38 GMT 2015 Olly Betts <olly@survex.com>
* .gitignore: Update.
Tue Sep 01 07:14:46 GMT 2015 Olly Betts <olly@survex.com>
* .gitignore,Makefile.am,gen-mimemap,mimemap.tokens,omindex.cc: Factor
out the default extension to MIME content-type mapping into a file,
and generate a static lookup table for use with keyword().
Sat Aug 15 17:24:07 GMT 2015 Olly Betts <olly@survex.com>
* docs/cgiparams.rst: Document behaviour if xDB is not set.
Sat Aug 15 17:19:51 GMT 2015 Olly Betts <olly@survex.com>
* docs/cgiparams.rst,query.cc: If xFILTERS is not set, don't force the
first page as that's unhelpful if someone fails to set it in their
template.
Mon Jul 06 09:54:50 GMT 2015 Olly Betts <olly@survex.com>
* configure.ac: Don't provide our own implementation of sleep() under
__WIN32__ if there's already one - mingw provides one, and in some
situations it seems to clash with ours. Reported to xapian-discuss
by John Alveris.
Thu Jun 11 05:03:52 GMT 2015 Olly Betts <olly@survex.com>
* Makefile.am,docs/overview.rst,failed.h,omindex.cc: Track files which
couldn't be indexed in the user metadata and skip them by default on
subsequent runs to avoid the costs of repeatedly running a filter on
a file it can't handle.
Mon Jun 01 13:13:26 GMT 2015 Olly Betts <olly@survex.com>
* NEWS,configure.ac: Update for 1.3.3.
Wed May 27 02:13:24 GMT 2015 Olly Betts <olly@survex.com>
* NEWS: Update.
Fri May 22 13:05:39 GMT 2015 Olly Betts <olly@survex.com>
* docs/termprefixes.rst,omindex.cc,templates/query: Index the filename
terms with an 'F' prefix, rather than treating them as more body
text. (Fixes #633, reported by Emmanuel Garette)
Fri May 22 03:43:15 GMT 2015 Olly Betts <olly@survex.com>
* NEWS: Update.
Fri May 15 03:08:15 GMT 2015 Olly Betts <olly@survex.com>
* Makefile.am,configure.ac: Use -no-install or -no-fast-install when
linking test programs which never get installed, which means libtool
can often avoid creating a shell script wrapper.
Wed May 13 15:02:36 GMT 2015 Olly Betts <olly@survex.com>
* omindex.cc: Message tweak.
Wed May 13 15:00:49 GMT 2015 Olly Betts <olly@survex.com>
* omindex.cc: Handle text/x-perl and application/x-dvi via the
commands map instead of hardcoded cases.
Wed May 13 14:49:52 GMT 2015 Olly Betts <olly@survex.com>
* docs/overview.rst,omindex.cc: Allow --filter to specify the
character set of the output the filter produces.
Wed May 13 01:34:09 GMT 2015 Olly Betts <olly@survex.com>
* outlookmsg2html.in: Fix handling of message/rfc822 subparts.
Tue May 12 14:07:27 GMT 2015 Olly Betts <olly@survex.com>
* docs/overview.rst,omindex.cc: Allow --filter to handle commands
which produce output in a temporary file rather than on stdout.
Tue May 12 13:37:01 GMT 2015 Olly Betts <olly@survex.com>
* omindex.cc: Handle application/vnd.ms-excel via commands map instead
of a hardcoded case.
Tue May 12 12:56:40 GMT 2015 Olly Betts <olly@survex.com>
* docs/overview.rst,omindex.cc: Add support for %f in command passed
to --filter to allow specifying commands where the input file is
not the final argument. Fixed #570, reported by "catkin".
Tue May 12 11:17:20 GMT 2015 Olly Betts <olly@survex.com>
* diritor.h,omindex.cc,values.h: Add --track-ctime option to allow
omindex to pick up changes to file ownership and permissions.
Tue May 12 07:38:45 GMT 2015 Olly Betts <olly@survex.com>
* configure.ac: Fix typo.
Mon May 11 07:05:24 GMT 2015 Olly Betts <olly@survex.com>
* cdb_hash.cc,md5.cc: Remove 'register' as it's deprecated, and
likely to just be ignored by any modern compiler anyway.
Tue May 05 12:25:20 GMT 2015 Olly Betts <olly@survex.com>
* docs/encodings.rst: $prettyurl undoes %-encoding of UTF-8 in 1.2.21
and later too.
Tue May 05 01:40:17 GMT 2015 Olly Betts <olly@survex.com>
* omega.cc: Drop compilation date and time from output - they prevent
reproducible builds and the version number is sufficient
information.
Fri May 01 09:36:27 GMT 2015 Olly Betts <olly@survex.com>
* configure.ac,md5.h,values.h: Now we require C++11, just include
<cstdint> for uint32_t.
Fri May 01 09:02:34 GMT 2015 Olly Betts <olly@survex.com>
* configure.ac: For Sun's C++ compiler, -std=c++11 enables C++11
support, and is incompatible with -library=stlport, so remove code
to enable that latter option.
Fri May 01 08:35:47 GMT 2015 Olly Betts <olly@survex.com>
* commonhelp.cc,omega.cc,omindex-list.cc,omindex.cc,scriptindex.cc:
Add spaces between literal strings and macros which expand to
literal strings for C++11 compatibility.
Fri May 01 06:27:13 GMT 2015 Olly Betts <olly@survex.com>
* configure.ac,m4/ax_cxx_compile_stdcxx_11.m4: Sync with xapian-core,
enabling C++11.
Fri May 01 01:38:08 GMT 2015 Olly Betts <olly@survex.com>
* INSTALL: IRIX is past EOL so drop information about IRIX make.
Thu Apr 30 05:08:13 GMT 2015 Olly Betts <olly@survex.com>
* Makefile.am: Add common/stringutils.cc to urlenctest_SOURCES, needed
now urldecode.h uses C_isxdigit().
Thu Apr 30 02:56:08 GMT 2015 Olly Betts <olly@survex.com>
* configfile.cc,htmlparse.cc,myhtmlparse.cc,omega.cc,omindex.cc,
query.cc,scriptindex.cc,urldecode.h: Consistently use C_isupper(),
C_toupper(), etc as these versions aren't affected by the locale
setting, and also allow signed char values (so we don't need to
cast the argument to unsigned char).
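A rough sketch of what locale-independent character helpers of this kind can look like. The names ascii_isupper() and ascii_toupper() are hypothetical stand-ins, not the actual C_isupper()/C_toupper() implementation, and the code assumes an ASCII-compatible execution character set.

```cpp
// Assumption: letters are contiguous, as in ASCII.
static_assert('z' - 'a' == 25 && 'Z' - 'A' == 25,
              "sketch assumes contiguous letter ranges");

// Hypothetical stand-in: true only for 'A'..'Z', whatever the locale.
// Takes plain char, so callers need no cast to unsigned char even when
// char is signed and the value is negative.
inline bool ascii_isupper(char ch) {
    return ch >= 'A' && ch <= 'Z';
}

// Hypothetical stand-in: maps 'a'..'z' to 'A'..'Z' and leaves every
// other value (including bytes >= 0x80) unchanged.
inline char ascii_toupper(char ch) {
    return (ch >= 'a' && ch <= 'z') ? static_cast<char>(ch - 'a' + 'A') : ch;
}
```

By contrast, passing a negative char value straight to the standard toupper() is undefined behaviour, and its result depends on the current locale.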
Wed Apr 15 01:12:33 GMT 2015 Olly Betts <olly@survex.com>
* docs/cgiparams.rst,docs/omegascript.rst,omega.cc,query.cc: Fix
handling of multiple P.<prefix> fields - previously only the first
seen was used. These fields are also now taken into account when
deciding if the query has changed. $query now returns an
OmegaScript list with one entry for each CGI parameter passed.
Wed Apr 15 01:11:08 GMT 2015 Olly Betts <olly@survex.com>
* templates/query: Fix setting of prefix map for P - in 1.3.2,
this failed to also search in the subject. Now it also
searches in the subject and topic.
Wed Apr 15 01:09:29 GMT 2015 Olly Betts <olly@survex.com>
* templates/query: When listing matching terms, don't make the commas
italic.
Wed Mar 11 12:18:31 GMT 2015 Olly Betts <olly@survex.com>
* docs/encodings.rst: Note that one should ensure that Omega gets sent
form submissions encoded in UTF-8.
Wed Mar 11 11:28:31 GMT 2015 Olly Betts <olly@survex.com>
* docs/encodings.rst: Discuss encodings of filenames (see #550).
Wed Mar 11 11:07:29 GMT 2015 Olly Betts <olly@survex.com>
* urldecode.h,urlenctest.cc: $prettyurl now decodes valid UTF-8
sequences. Fixes #550 and #644, reported by catkin and terencz.
Mon Mar 09 12:31:44 GMT 2015 Olly Betts <olly@survex.com>
* docs/: Add a document about character encoding, as suggested by
James Aylett in #550.
Mon Mar 09 11:32:32 GMT 2015 Olly Betts <olly@survex.com>
* docs/overview.rst: Document 'E' prefixed boolean terms for filtering
by extension (see #668, reported by bramvdh).
Mon Mar 09 11:29:15 GMT 2015 Olly Betts <olly@survex.com>
* docs/overview.rst: Whitespace cleanup.
Mon Mar 09 10:16:05 GMT 2015 Olly Betts <olly@survex.com>
* templates/xml: Add XML declaration.
Mon Mar 09 10:14:51 GMT 2015 Olly Betts <olly@survex.com>
* templates/query: Eliminate blank line before <html>.
Mon Mar 09 10:14:03 GMT 2015 Olly Betts <olly@survex.com>
* templates/godmode: Return charset utf-8 in the content-type.
Mon Mar 09 06:58:13 GMT 2015 Olly Betts <olly@survex.com>
* urldecode.h,urlenctest.cc: Improve decoding done by $prettyurl - we
now leave the query and fragment parts of the URL alone and don't
decode an escaped "/" (omindex doesn't create URLs with any of
these, so we only risk breaking other URLs which have them), and we
decode some additional ASCII characters in the path part:
[]@!$&'()*+.;= (addresses #550 in part)
Sun Feb 15 11:04:05 GMT 2015 Olly Betts <olly@tartarus.org>
* expand.cc: Suppress bogus uninitialised variable warning with -Os
under GCC 4.7.2.
Tue Jan 27 04:37:12 GMT 2015 Olly Betts <olly@survex.com>
* docs/overview.rst,omindex.cc: Interpret a command of "false" in
"--filter" as meaning to ignore files with that MIME type.
Tue Jan 27 04:13:26 GMT 2015 Olly Betts <olly@survex.com>
* docs/overview.rst,omindex.cc: Add support for specifying a MIME
subtype of '*' in --filter arguments.
Thu Jan 22 01:44:01 GMT 2015 Olly Betts <olly@survex.com>
* omindex.cc: Ignore extensions .msi and .msp, which are Microsoft
installer files, but which libmagic sometimes incorrectly identifies
as application/msword.
Tue Jan 06 21:15:14 GMT 2015 Olly Betts <olly@survex.com>
* configure.ac: Prefer pkg-config for determining the flags needed
to compile and link with PCRE, as this will just work when
cross-compiling (at least under MXE).
Sun Dec 21 21:54:48 GMT 2014 Olly Betts <olly@survex.com>
* query.cc: Handle [=0 as [=1.
Fri Dec 19 13:09:11 GMT 2014 Olly Betts <olly@survex.com>
* configure.ac,diritor.cc: Avoid doing link tests with libmagic in
configure as they fail on mingw due to not automatically picking up
libraries which libmagic itself depends on.
Fri Dec 19 03:21:13 GMT 2014 Olly Betts <olly@survex.com>
* docs/cgiparams.rst: Improve wording of docs for SORT parameter.
Tue Dec 16 04:06:06 GMT 2014 Olly Betts <olly@survex.com>
* configure.ac: Fix typo: 'libmagic-devl' -> 'libmagic-devel'
Tue Dec 16 03:53:25 GMT 2014 Olly Betts <olly@survex.com>
* configure.ac: Define MINGW_HAS_SECURE_API under mingw to get
_putenv_s() declared in stdlib.h.
Thu Dec 11 11:33:41 GMT 2014 Olly Betts <olly@survex.com>
* Makefile.am: Add timegm.cc to scriptindex_SOURCES to fix build on
platforms which don't provide timegm().
Wed Dec 03 04:17:18 GMT 2014 Olly Betts <olly@survex.com>
* templates/xml: Update handling of DATE1, DATE2 and DAYSMINUS which
were renamed in 0.6.x and the compatibility aliases removed in
1.0.0.
Wed Dec 03 04:15:51 GMT 2014 Olly Betts <olly@survex.com>
* docs/omegascript.rst: Update documentation references to DATE1,
DATE2, and DAYSMINUS which were renamed in 0.6.x and the
compatibility aliases removed in 1.0.0.
Wed Dec 03 02:40:50 GMT 2014 Olly Betts <olly@survex.com>
* omindex.cc: Make sample_size a global variable rather than passing
it around everywhere.
Wed Dec 03 02:29:37 GMT 2014 Olly Betts <olly@survex.com>
* omindex.cc: Remove unused '#include <fstream>'.
Wed Dec 03 02:18:51 GMT 2014 Olly Betts <olly@survex.com>
* diritor.h: Fix get_mtime() to return time_t not off_t. In practice,
this probably wouldn't have caused issues until at least 2038.
Fri Nov 28 17:12:37 GMT 2014 James Aylett <james@tartarus.org>
* Makefile.am: link omindex-list with our (GNU) getopt.
Fri Nov 28 11:38:56 GMT 2014 Olly Betts <olly@survex.com>
* configure.ac: Move AC_CANONICAL_HOST before first use of $host_os.
In practice this wasn't a problem, as LT_INIT implicitly calls
AC_CANONICAL_HOST before this point anyway.
Wed Nov 26 03:55:13 GMT 2014 Olly Betts <olly@survex.com>
* configure.ac: Enable automake option 'subdir-objects' to avoid
warning from newer automake.
Mon Nov 24 20:00:52 GMT 2014 Olly Betts <olly@survex.com>
* NEWS,configure.ac: Update for 1.3.2.
Sun Nov 23 23:57:34 GMT 2014 Olly Betts <olly@survex.com>
* docs/termprefixes.rst: Update for renaming of 'brass' backend to
'glass'.
Sun Nov 09 22:48:43 GMT 2014 Olly Betts <olly@survex.com>
* NEWS: Update.
Tue Oct 28 02:34:34 GMT 2014 Olly Betts <olly@survex.com>
* docs/overview.rst: Document built-in list of stopwords.
Fri Oct 24 23:07:24 GMT 2014 Gaurav Arora <gauravarora.daiict@gmail.com>
* docs/omegascript.rst,weight.cc: Add support for $set{weighting,lm}.
Mon Oct 20 05:09:04 GMT 2014 Olly Betts <olly@survex.com>
* docs/overview.rst: Note that pdftotext is part of poppler as well as
xpdf. (Noted by Paul Wise)
Mon Oct 20 00:55:51 GMT 2014 Olly Betts <olly@survex.com>
* .gitignore: Update to ignore new generated files.
Thu Jul 24 21:27:26 GMT 2014 Olly Betts <olly@survex.com>
* NEWS: Update.
Sun Jun 22 13:32:10 GMT 2014 Olly Betts <olly@survex.com>
* NEWS: Update.
Fri Jun 20 14:37:04 GMT 2014 Olly Betts <olly@survex.com>
* Makefile.am: Don't compile in unixperm.cc - it isn't currently used,
and it fails to build with mingw. (fixes #635)
Mon Jun 09 01:23:48 GMT 2014 Olly Betts <olly@survex.com>
* myhtmlparse.cc: LibreOffice can export a timestamp of "0;0" - treat
this as invalid rather than as "year 0".
Fri Jun 06 05:50:21 GMT 2014 Olly Betts <olly@survex.com>
* myhtmlparse.cc: Add handling for longer form of timestamps in LO
HTML export.
Tue Jun 03 02:29:51 GMT 2014 Olly Betts <olly@survex.com>
* diritor.cc: Fix "applications/msword" to "application/msword" in the
fallback code for CDF files.
Tue Jun 03 01:49:10 GMT 2014 Olly Betts <olly@survex.com>
* diritor.cc: In fallback for CDF files, compare the extension
*without* leading dot.
Fri May 30 05:38:01 GMT 2014 Olly Betts <olly@survex.com>
* omindex.cc,urlencode.cc,urlencode.h: URL encode starting URL
properly.
Thu May 29 03:38:26 GMT 2014 Olly Betts <olly@survex.com>
* docs/omegascript.rst: Put ``...`` around Xapian C++ class names.
Thu May 29 03:36:55 GMT 2014 Olly Betts <olly@survex.com>
* docs/omegascript.rst,query.cc: Add optional LENGTH parameter to
$snippet.
Thu May 29 02:41:08 GMT 2014 Olly Betts <olly@survex.com>
* diritor.cc: libmagic can return a second string starting "Composite
Document File V2 Document" for the mime-type, so just look for that
prefix. And newer libmagic returns "application/CDFV2-corrupt" in
these cases, so handle that too.
Wed May 28 05:22:12 GMT 2014 Olly Betts <olly@survex.com>
* omindex-list.cc: Remove debug output.
Wed May 28 05:15:50 GMT 2014 Olly Betts <olly@survex.com>
* Makefile.am,omindex-list.cc: New tool to list URLs of all the
documents in a database (or list of databases) indexed by omindex.
Tue May 27 04:12:07 GMT 2014 Olly Betts <olly@survex.com>
* docs/omegascript.rst: Document $snippet.
Tue May 27 04:07:43 GMT 2014 Mihai Bivol <mm.bivol@gmail.com>
* query.cc,templates/query: Add Omega Snipper integration.
Fri May 23 12:05:52 GMT 2014 Olly Betts <olly@survex.com>
* date.cc,scriptindex.cc: Pass std::string by const reference.
Fri May 23 09:05:25 GMT 2014 Olly Betts <olly@survex.com>
* query.cc: Remove unused inline function.
Tue May 20 23:27:37 GMT 2014 Olly Betts <olly@survex.com>
* omindex.cc: Update comment about unrtf --nopict to note the unrtf
version where this was fixed to work again.
Tue May 20 23:26:19 GMT 2014 Olly Betts <olly@survex.com>
* omindex.cc: Report the size limit in the message when we skip a file
which exceeds it.
Mon Apr 14 10:42:06 GMT 2014 Olly Betts <olly@survex.com>
* omindex.cc: Filtering via text/html now handles HTML documents which
specify a charset. "application/vnd.ms-outlook" is now handled by
filtering via text/html rather than as a hard-coded special case.
Thu Apr 10 03:47:18 GMT 2014 Olly Betts <olly@survex.com>
* docs/overview.rst,omindex.cc: Add support for indexing Microsoft
Publisher files using pub2xhtml.
Thu Apr 10 03:29:50 GMT 2014 Olly Betts <olly@survex.com>
* diritor.cc: Work around libmagic returning a MIME content-type of
"Composite Document File V2 Document, No summary info".
Tue Mar 25 08:56:13 GMT 2014 Olly Betts <olly@survex.com>
* expand.cc: Fix mis-indentation of two lines.
Tue Mar 25 08:54:28 GMT 2014 Olly Betts <olly@survex.com>
* expand.cc: Fix warning when built with GCC 4.7.2 using -Os.
Thu Mar 13 01:41:49 GMT 2014 Olly Betts <olly@survex.com>
* omindex.cc: Restrict the length of what we consider to be an
extension, currently to 7 characters or whatever the longest
extension in the mime_map is if it is longer.
Mon Mar 10 06:23:22 GMT 2014 Olly Betts <olly@survex.com>
* expand.cc,expand.h,omega.cc,query.cc: Fix $set{expand,trad <K>} to
work when built against an older xapian-core.
Mon Mar 10 04:22:05 GMT 2014 Olly Betts <olly@survex.com>
* expand.cc: Only use new query expansion API if built against
xapian-core >= 1.3.2.
Thu Mar 06 10:25:03 GMT 2014 Olly Betts <olly@survex.com>
* expand.cc: Throw an error if $opt{expansion} is invalid.
Thu Mar 06 09:58:24 GMT 2014 Aarsh Shah <aarshkshah1992@gmail.com>
* Makefile.am,docs/omegascript.rst,expand.cc,expand.h,omega.cc,
query.cc: Add support for setting the query expansion scheme to use.
Tue Feb 18 05:04:48 GMT 2014 Olly Betts <olly@survex.com>
* omindex.cc,tmpdir.h: Add get_tmpfile() helper function.
Tue Feb 18 02:09:13 GMT 2014 Olly Betts <olly@survex.com>
* omindex.cc: Avoid '//' in temporary filename (cosmetic only).
Sat Feb 15 00:58:53 GMT 2014 Olly Betts <olly@survex.com>
* NEWS: Update.
Wed Feb 12 05:24:23 GMT 2014 Olly Betts <olly@survex.com>
* docs/overview.rst,omindex.cc: Don't assume .doc is
application/msword but let libmagic decide, as .doc may actually be
RTF, and also it's sometimes used for plain-text files.
Tue Dec 31 20:20:05 GMT 2013 Olly Betts <olly@survex.com>
* timegm.cc: Fix typo.
Sun Dec 29 21:26:03 GMT 2013 Olly Betts <olly@survex.com>
* Makefile.am: Ship common/safewinsock2.h, needed under mingw.
Sun Dec 29 06:22:50 GMT 2013 Olly Betts <olly@survex.com>
* Makefile.am,configure.ac,metaxmlparse.cc,myhtmlparse.cc,timegm.cc,
timegm.h: Factor out our portable timegm() replacement. Fixes
incorrect timezone handling for created timestamps in OpenDocument
documents on platforms without timegm(). Should also fix a build
failure on mingw.
Thu Dec 26 01:15:27 GMT 2013 Olly Betts <olly@survex.com>
* Makefile.am,portability/mkdtemp.cc,portability/mkdtemp.h,tmpdir.cc:
Add header with prototype of mkdtemp() to avoid "no previous
declaration" warning on platforms which don't have mkdtemp() as
standard.
Mon Dec 23 02:25:13 GMT 2013 Olly Betts <olly@survex.com>
* NEWS: Update.
Fri Dec 20 07:07:36 GMT 2013 Olly Betts <olly@survex.com>
* docs/overview.rst: Fix minor typo.
Fri Dec 20 04:58:44 GMT 2013 Olly Betts <olly@survex.com>
* docs/overview.rst: Add Abiword as an example use of --filter, based
on patch from Frank J Bruzzaniti (fixes #383). Update unoconv
example to talk about LibreOffice instead of OpenOffice.
Thu Dec 19 10:21:28 GMT 2013 Olly Betts <olly@survex.com>
* NEWS: Update from 1.2.16 and ChangeLog.
Mon Dec 02 22:56:18 GMT 2013 Olly Betts <olly@survex.com>
* configure.ac: Define __MSVCRT_VERSION__ to 0x0601 on mingw so we get
__ftime64() defined in the headers.
Fri Nov 29 03:51:24 GMT 2013 Olly Betts <olly@survex.com>
* configure.ac: Sync GCC checks with xapian-core.
Sun Oct 13 23:20:58 GMT 2013 Olly Betts <olly@survex.com>
* omindex.cc: Group-readable files which are owner-readable but not
world-readable should still get a "readable by owner" term added.
Reported by Emmanuel Garette.
Fri Sep 27 00:27:47 GMT 2013 Olly Betts <olly@survex.com>
* diritor.cc: Handle ENOENT, ENOTDIR and EACCES from readdir().
Thu Sep 26 05:41:10 GMT 2013 Olly Betts <olly@survex.com>
* diritor.cc: If we've already opened the file (as we often will have
if using a modern libmagic with magic_descriptor() available), then
use fstat() on that fd rather than stat()/lstat() on the pathname.
Thu Sep 26 05:32:46 GMT 2013 Olly Betts <olly@survex.com>
* diritor.cc: If we get EACCES trying to read a directory or stat
a file, don't handle it by committing changes and exiting - instead
skip the file (like we used to before r17461).
Tue Sep 24 10:01:17 GMT 2013 Olly Betts <olly@survex.com>
* .gitignore,configure.ac,xapian-omega.spec.in: Compress source
tarballs with xz instead of gzip.
Mon Sep 23 06:26:03 GMT 2013 Olly Betts <olly@survex.com>
* diritor.cc,diritor.h,omindex.cc: If we get ENOTDIR trying to index a
file, skip it quietly (unless in verbose mode) as we already do if
we get ENOENT, since ENOTDIR is what we get if the file and the
directory it was in got removed between us getting the filename and
trying to open it.
Mon Sep 16 11:21:50 GMT 2013 Olly Betts <olly@survex.com>
* runfilter.h: Remove trailing space added in recent commit.
Mon Sep 16 02:04:06 GMT 2013 Olly Betts <olly@survex.com>
* diritor.h,runfilter.cc,runfilter.h: Pass error message string and
errno value in ReadError exceptions.
Thu Sep 12 22:02:59 GMT 2013 Olly Betts <olly@survex.com>
* weight.cc: Use "" not <> to include local header weight.h.
Thu Sep 12 01:18:33 GMT 2013 Olly Betts <olly@survex.com>
* diritor.cc,diritor.h,omindex.cc: Commit changes and exit, rather
than skipping the current file on most unexpected errors reading
directories or initialising libmagic - otherwise we can end up
deleting a lot of database entries on errors like EHOSTDOWN.
Tue Sep 03 01:03:33 GMT 2013 Olly Betts <olly@survex.com>
* myhtmlparse.cc,myhtmlparse.h,omindex.cc: Add support for indexing
'created' time from HTML documents.
Tue Sep 03 00:23:34 GMT 2013 Olly Betts <olly@survex.com>
* omindex.cc: Factor out mimetype_from_ext() function.
Mon Sep 02 23:32:01 GMT 2013 Olly Betts <olly@survex.com>
* omindex.cc: Fix to sleep for sleep_before_opendir from now, not
from 1970.
Mon Sep 02 22:45:44 GMT 2013 Olly Betts <olly@survex.com>
* omindex.cc: Make OPT_OPENDIR_SLEEP 256 not -1, as getopt() returns
-1 when there are no more options.
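To illustrate why -1 is unusable as an option code, a minimal sketch using the system getopt_long() follows. parse_opendir_sleep() is a hypothetical helper, not omindex code; only the option name and the value 256 come from this ChangeLog.

```cpp
#include <cassert>
#include <cstdlib>
#include <getopt.h>

// Codes for long-only options should lie outside the range of short option
// characters (0..255) and must differ from -1, which getopt_long() returns
// once there are no more options.
enum { OPT_OPENDIR_SLEEP = 256 };

// Hypothetical helper: return the value given to --opendir-sleep, or -1 if
// the option was not supplied.
int parse_opendir_sleep(int argc, char** argv) {
    assert(argc >= 1);
    static const struct option long_opts[] = {
        { "opendir-sleep", required_argument, nullptr, OPT_OPENDIR_SLEEP },
        { nullptr, 0, nullptr, 0 }
    };
    int seconds = -1;
    optind = 1;  // Restart scanning from the first argument.
    int c;
    while ((c = getopt_long(argc, argv, "v", long_opts, nullptr)) != -1) {
        if (c == OPT_OPENDIR_SLEEP) seconds = std::atoi(optarg);
    }
    return seconds;
}
```

Had the code been -1, the loop condition would have treated the first --opendir-sleep as the end of the options and stopped parsing.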
Mon Sep 02 05:00:16 GMT 2013 Olly Betts <olly@survex.com>
* omindex.cc,docs/overview.rst: Ignore 'adm', 'cur', and 'ico' by
default.
Mon Sep 02 04:50:59 GMT 2013 Olly Betts <olly@survex.com>
* datematchdecider.h: Fix filename in comment at top of file.
Thu Aug 29 23:39:14 GMT 2013 Olly Betts <olly@survex.com>
* diritor.h: Mark DirectoryIterator ctor as 'explicit'.
Thu Aug 29 23:37:29 GMT 2013 Olly Betts <olly@survex.com>
* Makefile.am,omindex.cc: Add omindex --opendir-sleep=SECS option to
allow working around problems with indexing files on Microsoft DFS
shares.
Thu Aug 22 22:04:11 GMT 2013 Olly Betts <olly@survex.com>
* xlsxparse.cc: Handle pre-defined numfmtid codes for dates.
Wed Jul 17 06:15:27 GMT 2013 Olly Betts <olly@survex.com>
* omindex.cc,xlsxparse.cc,xlsxparse.h: Fix detection of cells with a
date format to work with xlsx files other than my first example.
Wed Jul 17 03:48:41 GMT 2013 Olly Betts <olly@survex.com>
* omindex.cc,xlsxparse.cc,xlsxparse.h: Decode dates for xlsx files.
Mon Jul 15 12:03:32 GMT 2013 Aarsh Shah <aarshkshah1992@gmail.com>
* docs/omegascript.rst,weight.cc: Add support for $set{weighting,dph}.
Sun Jul 14 07:06:58 GMT 2013 Aarsh Shah <aarshkshah1992@gmail.com>
* docs/omegascript.rst,weight.cc: Add support for $set{weighting,pl2}.
Thu Jul 11 06:12:50 GMT 2013 Olly Betts <olly@survex.com>
* Makefile.am,jsonescape.cc,jsonesctest.cc: Make the JSON escape code
force the text to be valid UTF-8.
Wed Jul 10 13:01:36 GMT 2013 Aarsh Shah <aarshkshah1992@gmail.com>
* docs/omegascript.rst,weight.cc: Add support for $set{weighting,dlh}.
Tue Jul 09 06:29:18 GMT 2013 Olly Betts <olly@survex.com>
* omindex.cc,xmlparse.h: Quick fix for infinite recursion from the
HtmlParser refactoring work.
Sun Jul 07 11:57:11 GMT 2013 Aarsh Shah <aarshkshah1992@gmail.com>
* weight.cc: Add support for $set{weighting,bb2}.
* docs/omegascript.rst: Update list of weighting schemes understood by
$set{weighting,...}.
Fri Jul 05 03:15:12 GMT 2013 Olly Betts <olly@survex.com>
* weight.cc: Add conditional test so we can build against older
xapian-core without the new weighting schemes.
Wed Jul 03 14:02:17 GMT 2013 Aarsh Shah <aarshkshah1992@gmail.com>
* weight.cc: Add support for $set{weighting,ineb2}.
Wed Jul 03 13:37:48 GMT 2013 Aarsh Shah <aarshkshah1992@gmail.com>
* weight.cc: Add support for $set{weighting,ifb2}.
Wed Jul 03 11:58:22 GMT 2013 Aarsh Shah <aarshkshah1992@gmail.com>
* weight.cc: Add support for $set{weighting,inl2}.
Wed Jun 26 13:03:24 GMT 2013 Olly Betts <olly@survex.com>
* configure.ac: Enable -Woverloaded-virtual warning.
Wed Jun 26 13:00:33 GMT 2013 Olly Betts <olly@survex.com>
* atomparsetest.cc,htmlparse.cc,htmlparse.h,myhtmlparse.cc,omindex.cc,
xmlparse.h: Rename virtual parse_html() method to just parse(), as
it's not HTML that's being parsed in most cases.
Wed Jun 26 07:15:30 GMT 2013 Olly Betts <olly@survex.com>
* Makefile.am,msxmlparse.cc,msxmlparse.h,omindex.cc,xmlparse.cc: Split
out the code specific to handling MS XML out of XmlParser into an
MSXmlParser subclass.
Wed Jun 26 04:53:39 GMT 2013 Olly Betts <olly@survex.com>
* configure.ac: Sync compiler warning flag machinery against
xapian-core. The changes are special handling for clang, passing
-fshow-column where supported, and handling for new warning flags
in GCC 4.6 and 4.7.
Tue Jun 18 03:09:06 GMT 2013 Olly Betts <olly@survex.com>
* omindex.cc: Fix off-by-one when finding documents to delete which
would sometimes cause omindex to fail to delete documents from the
database when they weren't refound during an index update.
Mon Jun 17 02:13:21 GMT 2013 Olly Betts <olly@survex.com>
* omindex.cc: Report strerror(errno) if we can't read a file.
Mon Jun 17 02:12:12 GMT 2013 Olly Betts <olly@survex.com>
* omindex.cc: Factor out code to mark a document as seen into a new
mark_as_seen() function.
Mon Jun 17 00:43:28 GMT 2013 Olly Betts <olly@survex.com>
* configure.ac,metaxmlparse.cc,metaxmlparse.h,omindex.cc: Add support
for indexing 'topic' and 'created date' meta-data for OpenDocument
format.
Sun Jun 16 11:48:28 GMT 2013 Olly Betts <olly@survex.com>
* weight.cc: Rewrite the xapian-core version test to use a macro so
it's clearer.
Sat Jun 15 00:43:16 GMT 2013 Olly Betts <olly@survex.com>
* docs/termprefixes.rst,myhtmlparse.cc,myhtmlparse.h,omindex.cc,
templates/query: Index "topic" for HTML and PDF documents.
Fri May 24 09:22:43 GMT 2013 Olly Betts <olly@survex.com>
* weight.cc: Check parameters to $set{weighting,bm25 ...} and
$set{weighting,trad ...} converted OK. Based on patch from
Aarsh Shah.
Wed May 08 11:10:20 GMT 2013 Olly Betts <olly@survex.com>
* Makefile.am,README,docs/Makefile.am: SVN -> git.
Thu May 02 12:21:36 GMT 2013 Olly Betts <olly@survex.com>
* NEWS,configure.ac: Update for 1.3.1.
Wed Apr 17 03:10:40 GMT 2013 Olly Betts <olly@survex.com>
* NEWS: Update from 1.2 branch and ChangeLog.
Tue Apr 16 05:24:54 GMT 2013 Olly Betts <olly@survex.com>
* weight.cc: If built against older xapian-core, don't try to use
TfIdfWeight.
Mon Apr 15 06:21:21 GMT 2013 Aarsh Shah <aarshkshah1992@gmail.com>
* docs/omegascript.rst,weight.cc: Add support for
$set{weighting,tfidf}.
Thu Apr 04 06:42:01 GMT 2013 Olly Betts <olly@survex.com>
* Makefile.am,configure.ac: Remove support for 'configure
--enable-quiet', 'make QUIET=' and 'make QUIET=y' - automake now
supports 'configure --enable-silent-rules', 'make V=1' and 'make
V=0' which are broadly equivalent and more standard.
Wed Mar 27 09:20:28 GMT 2013 Olly Betts <olly@survex.com>
* Makefile.am: Don't link utf8convert.cc code into omega CGI.
Sun Mar 24 23:33:56 GMT 2013 Olly Betts <olly@survex.com>
* htmlparsetest.cc: Fix table parsing test to have a <table> tag in!
Sun Mar 17 22:00:58 GMT 2013 Olly Betts <olly@survex.com>
* docs/overview.rst,omindex.cc: Use rst2html to index .rst and
.rest files.
Sun Mar 17 21:34:06 GMT 2013 Olly Betts <olly@survex.com>
* htmlparsetest.cc: Update testcase to match previous change.
Sun Mar 17 21:13:49 GMT 2013 Olly Betts <olly@survex.com>
* myhtmlparse.cc: Make paragraph break in sample \r not \n, as \n
is used by Omega to indicate the end of a field in the document
data.
Fri Mar 15 03:48:50 GMT 2013 Olly Betts <olly@survex.com>
* md5wrap.cc,md5wrap.h: Add md5_block() to checksum a block of memory.
Thu Mar 07 00:40:20 GMT 2013 Olly Betts <olly@survex.com>
* jsonescape.cc: Fix C++11 compatibility issue highlighted by GCC
warning.
Wed Mar 06 03:35:43 GMT 2013 Olly Betts <olly@survex.com>
* gen-myhtmltags,myhtmlparse.cc: Distinguish page breaks from other
whitespace in samples.
Tue Mar 05 19:37:55 GMT 2013 Olly Betts <olly@survex.com>
* INSTALL,configure.ac: Provide hints as to what package to install
for magic.h.
Mon Mar 04 03:56:22 GMT 2013 Olly Betts <olly@survex.com>
* configure.ac,omindex.cc,runfilter.cc,runfilter.h: If omindex
receives a SIGHUP, SIGINT, SIGQUIT or SIGTERM, then kill any active
external filter child process before handling the signal as we
otherwise would.
Sun Mar 03 23:52:37 GMT 2013 Olly Betts <olly@survex.com>
* configure.ac,runfilter.cc: If setpgid() is available, put each
external filter in its own process group so we can easily kill it
along with any processes which it starts.
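Note: the process-group approach in the entry above can be illustrated roughly as follows. This is a minimal sketch assuming POSIX fork(), setpgid(), kill() and waitpid(); it is not the code in runfilter.cc, and spawn_in_own_group() and kill_group_and_wait() are hypothetical names.

```cpp
#include <cassert>
#include <csignal>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: run argv[0] (looked up on PATH) in its own process
// group.  Returns the child's pid, which is also the process group id,
// or -1 if fork() fails.
pid_t spawn_in_own_group(const char* const argv[]) {
    pid_t pid = fork();
    if (pid < 0) return -1;
    if (pid == 0) {
        // Child: become leader of a new process group, then exec.
        setpgid(0, 0);
        execvp(argv[0], const_cast<char* const*>(argv));
        _exit(127);
    }
    // Parent: set the group too, to close the race with the child.
    // Failure here (e.g. EACCES once the child has exec'd) is harmless,
    // because the child has already called setpgid() itself by then.
    setpgid(pid, pid);
    return pid;
}

// Hypothetical helper: signal every process in the group led by pgid
// (a negative pid passed to kill() addresses the whole group), then reap
// the leader.  Returns the leader's wait status, or -1 on error.
int kill_group_and_wait(pid_t pgid, int sig) {
    assert(pgid > 0);
    if (kill(-pgid, sig) < 0) return -1;
    int status = 0;
    if (waitpid(pgid, &status, 0) < 0) return -1;
    return status;
}
```

Because any helpers the filter starts inherit its process group, one kill() on the negated group id reaches them as well, unless they move themselves into another group.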
Mon Feb 25 19:03:06 GMT 2013 Olly Betts <olly@survex.com>
* docs/overview.rst: Update to add com to the list of ignored
extensions.
Thu Feb 21 22:09:44 GMT 2013 Olly Betts <olly@survex.com>
* omindex.cc: Ignore .com files by default.
Tue Feb 19 04:19:50 GMT 2013 Olly Betts <olly@survex.com>
* NEWS: Update from ChangeLog.
Tue Feb 19 03:13:08 GMT 2013 Olly Betts <olly@survex.com>
* htmlparsetest.cc,myhtmlparse.cc,myhtmlparse.h: Sample from HTML now
contains \n where a line or paragraph break would appear, and \t
between table cells.
Tue Feb 19 03:02:58 GMT 2013 Olly Betts <olly@survex.com>
* gen-myhtmltags,htmlparsetest.cc,myhtmlparse.cc,myhtmlparse.tokens:
Generate a lookup table for where we should insert a space in place
of an HTML tag rather than using a switch statement for that.
Tue Feb 12 20:43:14 GMT 2013 Olly Betts <olly@survex.com>
* Makefile.am: Remove my-html-tok.h in "make clean" not "make
distclean", since it's built by "make" and that's what the automake
manual recommends for files built by "make".
Tue Feb 12 20:41:53 GMT 2013 Olly Betts <olly@survex.com>
* Makefile.am: Clean up my-html-tok.h in "make distclean".
Tue Feb 12 20:03:51 GMT 2013 Olly Betts <olly@survex.com>
* Makefile.am: Ship common/keyword.h.
Tue Feb 12 19:27:26 GMT 2013 Olly Betts <olly@survex.com>
* Makefile.am: Ship common/Tokeniseise.pm.
Tue Feb 12 00:44:23 GMT 2013 Olly Betts <olly@survex.com>
* Makefile.am: Fix to work in VPATH build.
Mon Feb 11 22:38:16 GMT 2013 Olly Betts <olly@survex.com>
* xlsxparse.cc: Correct "max" -> "min" when reserving space for shared
strings. This only means we now reserve a more appropriate amount
of space to start with.
Fri Feb 01 04:11:50 GMT 2013 Olly Betts <olly@survex.com>
* Makefile.am: Ship new file myhtmlparse.tokens.
Thu Jan 31 23:42:23 GMT 2013 Olly Betts <olly@survex.com>
* myhtmlparse.cc,myhtmlparse.tokens: Add <APPLET>, <OBJECT>, and
<TR> to the tags handled.
Thu Jan 31 06:24:13 GMT 2013 Olly Betts <olly@survex.com>
* Makefile.am,gen-myhtmltags,myhtmlparse.cc,myhtmlparse.tokens: Use
a generated compact and efficient table to convert HTML tag names
to enum codes, which we can then use a C switch statement to
dispatch. The table first checks the token length, and then does a
binary chop on tokens of the same length. This is both faster and
smaller than the approach we were using, with the benefit that the
table is auto-generated.
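Note: the "check the length, then binary chop" lookup in the entry above might look roughly like this. It is a hand-written sketch with an illustrative handful of tags, not the table produced by gen-myhtmltags; the enum values, the table layout and lookup_tag() are assumptions, and tag names are assumed to be lower-cased by the caller.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>

enum Tag { TAG_UNKNOWN, TAG_P, TAG_TD, TAG_TR, TAG_DIV, TAG_PRE,
           TAG_BODY, TAG_HEAD, TAG_TABLE };

struct Entry { const char* name; Tag tag; };

// One array per token length, each sorted so it can be binary searched.
static const Entry len1[] = { {"p", TAG_P} };
static const Entry len2[] = { {"td", TAG_TD}, {"tr", TAG_TR} };
static const Entry len3[] = { {"div", TAG_DIV}, {"pre", TAG_PRE} };
static const Entry len4[] = { {"body", TAG_BODY}, {"head", TAG_HEAD} };
static const Entry len5[] = { {"table", TAG_TABLE} };

struct Bucket { const Entry* first; std::size_t count; };

// Indexed by token length; index 0 is unused.
static const Bucket buckets[] = {
    { nullptr, 0 },
    { len1, sizeof(len1) / sizeof(len1[0]) },
    { len2, sizeof(len2) / sizeof(len2[0]) },
    { len3, sizeof(len3) / sizeof(len3[0]) },
    { len4, sizeof(len4) / sizeof(len4[0]) },
    { len5, sizeof(len5) / sizeof(len5[0]) },
};

// Map a tag name to its enum code, or TAG_UNKNOWN if it isn't in the table.
Tag lookup_tag(const std::string& name) {
    const std::size_t max_len = sizeof(buckets) / sizeof(buckets[0]) - 1;
    // Step 1: the length picks the bucket (and rejects most unknown names).
    if (name.empty() || name.size() > max_len) return TAG_UNKNOWN;
    const Bucket& b = buckets[name.size()];
    assert(b.count == 0 || b.first != nullptr);
    const Entry* end = b.first + b.count;
    // Step 2: binary search among the names of that length.
    const Entry* it = std::lower_bound(
        b.first, end, name,
        [](const Entry& e, const std::string& key) {
            return key.compare(e.name) > 0;  // true when e.name < key
        });
    if (it != end && name == it->name) return it->tag;
    return TAG_UNKNOWN;
}
```

The returned code can then be dispatched with an ordinary switch statement, as the entry describes.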
Thu Jan 31 06:21:22 GMT 2013 Olly Betts <olly@survex.com>
* NEWS: Update from ChangeLog.
Mon Jan 28 23:41:21 GMT 2013 Olly Betts <olly@survex.com>
* utf8convert.cc,utf8converttest.cc: Always use our built-in
conversion code for the character sets it can handle, and only use
iconv as a fall-back. This gives us more consistent results, and
in particular means we now handle BOMs better (at least with GNU
iconv).
Mon Jan 28 23:16:08 GMT 2013 Olly Betts <olly@survex.com>
* utf8convert.cc,utf8converttest.cc: A lot of data labelled as
"iso-8859-1" is actually "windows-1252". The two only differ
in characters which are control characters in iso-8859-1, so
assume the latter when we see the former.
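Note: the charset substitution in the entry above amounts to an alias applied before conversion. A minimal sketch follows; effective_charset() is a hypothetical name and not a function from utf8convert.cc.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical helper: choose the charset to actually decode with.
// Names are compared case-insensitively by lower-casing them first.
std::string effective_charset(const std::string& declared) {
    std::string name;
    for (unsigned char ch : declared) {
        name += static_cast<char>(std::tolower(ch));
    }
    // Data labelled iso-8859-1 is decoded as windows-1252: the two differ
    // only in 0x80-0x9F, which are control characters in iso-8859-1.
    if (name == "iso-8859-1") return "windows-1252";
    return name;
}
```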
Mon Jan 28 23:13:24 GMT 2013 Olly Betts <olly@survex.com>
* omindex.cc: Use charset name "iso-8859-1" in lower case
consistently.
Thu Jan 24 08:44:24 GMT 2013 Olly Betts <olly@survex.com>
* Makefile.am: Ship jsonescape.h.
Thu Jan 24 07:21:17 GMT 2013 Olly Betts <olly@survex.com>
* jsonescape.cc: Add missing header includes.
Thu Jan 24 03:08:39 GMT 2013 Olly Betts <olly@survex.com>
* docs/omegascript.rst: Document $json and $jsonarray.
Thu Jan 24 02:55:53 GMT 2013 Olly Betts <olly@survex.com>
* Makefile.am,jsonescape.cc,jsonescape.h,jsonesctest.cc,query.cc:
Add new $json and $jsonarray OmegaScript commands.
Wed Jan 09 11:53:19 GMT 2013 Olly Betts <olly@survex.com>
* NEWS: Update from ChangeLog and 1.2 branch.
Fri Jan 04 02:21:05 GMT 2013 Olly Betts <olly@survex.com>
* Makefile.am,docs/omegascript.rst,omindex.cc,query.cc,sample.cc,
sample.h: Add $truncate command to break a string after a word.
Thu Jan 03 04:44:22 GMT 2013 Olly Betts <olly@survex.com>
* commonhelp.cc: Tweak wording about default to match other options
better.
Thu Jan 03 04:07:57 GMT 2013 Olly Betts <olly@survex.com>
* omindex.cc: Note default size limit on files to index is unlimited.
Update --help to reflect that --sample-size now accepts the same
formats as --max-size.
Thu Jan 03 03:52:34 GMT 2013 Olly Betts <olly@survex.com>
* omindex.cc: When generating a sample for a CSV file, limit the
reserved size to the CSV file size as sample_size could be set
really high by the user.
Thu Jan 03 03:46:46 GMT 2013 Olly Betts <olly@survex.com>
* NEWS: Update from ChangeLog.
Tue Dec 18 04:49:41 GMT 2012 Olly Betts <olly@survex.com>
* omindex.cc: Fix typo in previous change (2>dev/null should be
2>/dev/null).
Tue Dec 18 03:48:36 GMT 2012 Olly Betts <olly@survex.com>
* docs/overview.rst,omindex.cc: Extend --filter to allow commands which
produce HTML on stdout to be specified with it.
Sun Dec 16 21:25:28 GMT 2012 Olly Betts <olly@survex.com>
* diritor.cc: If libmagic hits ENOENT trying to classify a file, throw
FileNotFound so we quietly skip the file in non-verbose mode.
Sun Dec 16 21:23:25 GMT 2012 Olly Betts <olly@survex.com>
* diritor.cc: MAGIC_MIME_TYPE was added in 4.22, so note that in the
comment about its conditional use.
Fri Dec 14 21:22:12 GMT 2012 Olly Betts <olly@survex.com>
* Makefile.am: In automake, INCLUDES is now deprecated in favour of
AM_CPPFLAGS so update to use the latter.
Fri Dec 14 04:45:12 GMT 2012 Olly Betts <olly@survex.com>
* omindex.cc: If md5_file() fails with ENOENT, assume the file was
removed during indexing and only report this with --verbose as we
do in other such cases.
Fri Dec 14 04:42:34 GMT 2012 Olly Betts <olly@survex.com>
* md5wrap.cc: If we get a read error while calculating the md5 checksum
of a file, fail rather than returning the checksum of the file up to
that point.
Fri Dec 14 04:35:52 GMT 2012 Olly Betts <olly@survex.com>
* omindex.cc: Calculate the md5 from the loaded file contents when
indexing SVG and Atom files. Use a const ref to avoid a string
copy of the file contents for HTML and uncompressed AbiWord.
Fri Dec 14 04:24:17 GMT 2012 Olly Betts <olly@survex.com>
* configure.ac,diritor.cc,diritor.h,loadfile.cc,loadfile.h,omindex.cc:
If we open a file to index it, keep the fd around and use it with
libmagic, provided magic_descriptor() is available.
Tue Dec 11 03:35:29 GMT 2012 Olly Betts <olly@survex.com>
* diritor.cc,diritor.h,omindex.cc: If we get ENOENT for a file or
directory we're trying to index, assume it has been removed between
us reading the directory entry for it and trying to open it, and
only report the failure to index under --verbose.
Wed Nov 21 04:06:54 GMT 2012 Olly Betts <olly@survex.com>
* omindex.cc: Fix omindex not to segfault when -F option without a ':'
is passed.
Tue Sep 25 23:57:12 GMT 2012 Olly Betts <olly@survex.com>
* Makefile.am,omindex.cc: Replace shell_protect() with
append_filename_argument() from common/append_filename_arg.h.
Extracting text using external filters now works for filenames
containing a newline character.
Thu Aug 09 20:07:59 GMT 2012 Dan Colish <dcolish@gmail.com>
* Makefile.am,configure.ac: Allow users to configure with
MAGIC_PREFIX for non-standard installs of libmagic.
Wed Jul 18 10:51:39 GMT 2012 Olly Betts <olly@survex.com>
* urldecode.h: Fix to decode escaped character at the end of the
string.
* urlenctest.cc: Add regression testcase.
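Note: the end-of-string case fixed above is easy to get wrong in the bounds check. The sketch below shows one way to decode so that an escape in the last three characters is still handled; it is not the code in urldecode.h. url_decode() is a hypothetical name, and both the treatment of '+' as a space and the pass-through of malformed escapes are assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Value of a hex digit, or -1 if ch isn't one.
static int hex_value(unsigned char ch) {
    if (ch >= '0' && ch <= '9') return ch - '0';
    if (ch >= 'a' && ch <= 'f') return ch - 'a' + 10;
    if (ch >= 'A' && ch <= 'F') return ch - 'A' + 10;
    return -1;
}

// Hypothetical helper: decode %XX escapes and '+' (form encoding).
std::string url_decode(const std::string& in) {
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        unsigned char ch = in[i];
        if (ch == '+') {
            out += ' ';
            continue;
        }
        // An escape needs the '%' plus two more characters, so it is valid
        // whenever at least 3 characters remain, including at the very end.
        if (ch == '%' && in.size() - i >= 3) {
            int hi = hex_value(in[i + 1]);
            int lo = hex_value(in[i + 2]);
            if (hi >= 0 && lo >= 0) {
                out += static_cast<char>(hi * 16 + lo);
                i += 2;
                continue;
            }
        }
        // Anything else, including a malformed escape, is copied as-is.
        out += static_cast<char>(ch);
    }
    return out;
}
```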
Sun Jul 01 10:53:35 GMT 2012 Olly Betts <olly@survex.com>
* NEWS: Update from ChangeLog and 1.2 branch.
Tue Jun 26 11:59:17 GMT 2012 Olly Betts <olly@survex.com>
* configure.ac: Set link_all_deplibs_CXX=no on solaris, like we
already do for xapian-core.
Thu Jun 21 13:44:06 GMT 2012 Olly Betts <olly@survex.com>
* xlsxparse.cc: Check for "uniquecount" parameter, not "uniqueCount" as
we normalise parameter names to lower case.
Thu Jun 21 13:42:35 GMT 2012 Olly Betts <olly@survex.com>
* omindex.cc: unzip extracts files in the order they are in the
archive, not the order they are on the command line, so call unzip
twice when the order of extraction matters.
Tue Jun 19 00:58:04 GMT 2012 Olly Betts <olly@survex.com>
* Makefile.am,omindex.cc,opendocparse.cc,opendocparse.h,xmlparse.cc:
Improve handling of headers and footers on OpenDocument documents.
Mon Jun 18 07:02:20 GMT 2012 Olly Betts <olly@survex.com>
* omindex.cc: Properly fix the "trim trailing formfeeds" code.
Mon Jun 18 06:16:39 GMT 2012 Olly Betts <olly@survex.com>
* omindex.cc: Tweak previous change.
Mon Jun 18 05:49:07 GMT 2012 Olly Betts <olly@survex.com>
* omindex.cc: Fix the "trim trailing formfeeds" code not to remove one
character too many.
Mon Jun 18 05:47:33 GMT 2012 Olly Betts <olly@survex.com>
* omindex.cc,xlsxparse.cc,xlsxparse.h: Rework .xlsx parsing to
substitute the shared strings into the positions they are used
in, so that the sample actually matches what appears in the
spreadsheet.
Mon Jun 18 04:49:12 GMT 2012 Olly Betts <olly@survex.com>
* xlsxparse.cc,xlsxparse.h: Subclass XlsxParser directly from
HtmlParser.
Mon Jun 18 03:16:43 GMT 2012 Olly Betts <olly@survex.com>
* Makefile.am,omindex.cc,xlsxparse.cc,xlsxparse.h: Index calculated
numbers from .xlsx files.
Wed Jun 13 04:53:02 GMT 2012 Olly Betts <olly@survex.com>
* omindex.cc: pdftotext outputs a formfeed between each page, which
messes up our "empty body" check, so trim any trailing formfeeds
before the check.
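Note: trimming trailing formfeeds as in the entry above can be done without an explicit loop. A minimal sketch, assuming the extracted text is held in a std::string; trim_trailing_formfeeds() is a hypothetical name and this is not the omindex.cc code.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: remove any trailing formfeed characters in place.
void trim_trailing_formfeeds(std::string& text) {
    std::string::size_type last = text.find_last_not_of('\f');
    // find_last_not_of() returns npos if the string is empty or consists
    // only of formfeeds; npos + 1 wraps round to 0, which empties the string.
    // Otherwise last + 1 keeps everything up to and including text[last].
    text.resize(last + 1);
    assert(text.empty() || text.back() != '\f');
}
```

Keeping last + 1 characters rather than last is the off-by-one that is easy to get wrong here.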
Sat Jun 09 06:04:44 GMT 2012 Olly Betts <olly@survex.com>
* Cherry pick changes from Mihai Bivol's GSoC snippets branch:
* omindex.cc: Add option for the document sample size.
* omindex.cc: Add short option for sample-size.
* omindex.cc: Make sample-size consistent with max-size.
Sat Jun 02 12:23:21 GMT 2012 Olly Betts <olly@survex.com>
* INSTALL,Makefile.am,cgiparam.cc,configfile.cc,configure.ac,
htmlparse.cc,omindex.cc,query.cc: Change `...' quoting in prose to
'...'.
Thu May 17 12:53:07 GMT 2012 Olly Betts <olly@survex.com>
* htmlparsetest.cc,myhtmlparse.cc,myhtmlparse.h: Change parsing of
multiple <body> tags and text outside of <body> to match the
behaviour of modern web browsers. (ticket#599)
Tue May 15 12:46:15 GMT 2012 Olly Betts <olly@survex.com>
* configure.ac: Set link_all_deplibs_CXX=no on freebsd and openbsd,
like we already do for xapian-core.
Tue May 15 11:29:53 GMT 2012 Olly Betts <olly@survex.com>
* NEWS: Update from ChangeLog and 1.2.10.
Tue May 08 11:39:28 GMT 2012 Olly Betts <olly@survex.com>
* runfilter.cc: Add cast to rlim_t, required for C++11 compatibility
according to new error from GCC 4.7 (reported by Gaurav Arora).
Tue May 08 11:32:48 GMT 2012 Olly Betts <olly@survex.com>
* tmpdir.cc: Add safeunistd.h for rmdir, required by GCC 4.7 (reported
by Gaurav Arora).
Sat Apr 14 00:14:58 GMT 2012 Olly Betts <olly@survex.com>
* atomparse.cc: For type="html", use the charset of the XML rather
than utf-8.
Fri Apr 13 23:36:48 GMT 2012 Olly Betts <olly@survex.com>
* Makefile.am,atomparse.cc,atomparse.h,overview.rst,omindex.cc: Add
support for atom feed files, patch from Mihai Bivol in ticket#595.
* Makefile.am,atomparsetest.cc: Add tests for AtomParser.
Thu Apr 05 14:09:28 GMT 2012 Olly Betts <olly@survex.com>
* htmlparse.cc,htmlparsetest.cc: Add support for CDATA to HTML parser.
Fri Mar 30 22:35:08 GMT 2012 Olly Betts <olly@survex.com>
* NEWS: Fix "an warning" to "a warning" in old entry.
Mon Mar 26 08:44:51 GMT 2012 Olly Betts <olly@survex.com>
* omindex.cc: Add --max-size option, based on patch from ndaley in
ticket#587.
Wed Mar 14 02:27:59 GMT 2012 Olly Betts <olly@survex.com>
* NEWS: Update for 1.3.0.
Tue Mar 13 10:44:11 GMT 2012 Olly Betts <olly@survex.com>
* NEWS: Update from 1.2.9 and ChangeLog.
Mon Mar 12 10:55:57 GMT 2012 Olly Betts <olly@survex.com>
* omindex.cc: If the document with the highest existing docid was
updated, we'd previously report it as "added", but now we correctly
report it as "updated".
Mon Mar 12 10:50:55 GMT 2012 Olly Betts <olly@survex.com>
* omindex.cc: Catch and report std::exception.
Mon Feb 20 02:45:12 GMT 2012 Olly Betts <olly@survex.com>
* docs/overview.rst,omindex.cc: More extensions to ignore by default:
fon pyd ttf
Sun Feb 19 22:20:49 GMT 2012 Olly Betts <olly@survex.com>
* docs/overview.rst: Wrap over-long line.
Thu Feb 16 06:52:24 GMT 2012 Olly Betts <olly@survex.com>
* docs/overview.rst,omindex.cc: Add more extensions to the default
ignore list: bin dat db jar lnk pyc pyo sqlite sqlite3 sqlite-journal
tmp
Fri Jan 27 03:36:10 GMT 2012 Olly Betts <olly@survex.com>
* docs/overview.rst,htmlparse.cc,htmlparsetest.cc: Add support for
ignoring sections bracketed by <!--UdmComment--> and
<!--/UdmComment--> like we already do for <!--htdig_noindex-->.
Patch from Raphael Geissert.
Fri Dec 23 05:44:08 GMT 2011 Olly Betts <olly@survex.com>
* docs/overview.rst: Document that libmagic is used to determine
the MIME type if the extension isn't known. Partly addresses
ticket#569.
Fri Dec 23 01:29:17 GMT 2011 Olly Betts <olly@survex.com>
* docs/overview.rst: We now limit time as well as CPU and memory for
external filters.
Thu Dec 22 10:55:44 GMT 2011 Olly Betts <olly@survex.com>
* query.cc: Drop special handling for R-prefixed terms in $prettyterm
- we stopped generating these in Xapian 1.0.
Thu Dec 22 03:50:30 GMT 2011 Olly Betts <olly@survex.com>
* INSTALL,configure.ac,diritor.cc,diritor.h: Make libmagic a required
dependency.
Wed Dec 21 10:02:03 GMT 2011 Olly Betts <olly@survex.com>
* query.cc: Change Xapian::weight to double.
Wed Dec 21 05:25:40 GMT 2011 Olly Betts <olly@survex.com>
* docs/cgiparams.rst,omega.cc,query.cc: Make DEFAULTOP default to AND
rather than OR, since that matches what pretty much every search
engine does these days. Closes ticket#512.
Tue Dec 13 11:21:54 GMT 2011 Olly Betts <olly@survex.com>
* NEWS: Update from 1.2.8 and ChangeLog.
Fri Dec 09 14:08:04 GMT 2011 Olly Betts <olly@survex.com>
* docs/omegascript.rst,query.cc,templates/emptydocs,templates/godmode,
templates/query,urldecode.h,urlenctest.cc: Add new $prettyurl{}
command which undoes RFC3986 URL escaping which doesn't affect
semantics in practice. Partly addresses ticket#550.
Thu Dec 08 08:19:26 GMT 2011 Olly Betts <olly@survex.com>
* omindex.cc: Improve --help output (and man page which is generated
from it). Closes bug#572.
Thu Dec 08 04:51:12 GMT 2011 Olly Betts <olly@survex.com>
* Makefile.am: Ship new header urldecode.h.
Thu Dec 08 03:34:02 GMT 2011 Olly Betts <olly@survex.com>
* Makefile.am,cgiparam.cc,urldecode.h,urlenctest.cc: Add new
implementation of URL decoding - the old one didn't handle
various corner cases well, and had two cut and pasted variants
for handling input from a C string (GET) or from stdin (POST).
Also add a new unit test program to test URL encoding and decoding.
Fixes bug#578.
Tue Dec 06 13:30:45 GMT 2011 Olly Betts <olly@survex.com>
* NEWS: Update from ChangeLog and to reflect backporting activity.
Mon Dec 05 03:19:21 GMT 2011 Olly Betts <olly@survex.com>
* scriptindex.cc: If no rules are found in the index script, report an
error and give up - this is inevitably the result of a mistake, and
adding empty documents to the database isn't helpful.
Sat Oct 29 14:49:40 GMT 2011 Olly Betts <olly@survex.com>
* docs/omegascript.rst: Add note to discourage use of percentage
scores.
* templates/query: Don't show the percentage score in the default
template.
Fri Oct 14 12:36:43 GMT 2011 Olly Betts <olly@survex.com>
* configure.ac,runfilter.cc: If we don't get any data from a filter
for 5 minutes, give up - it has probably ended up blocked
indefinitely.
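Note: giving up when a filter produces no data for a set period can be sketched as a read with an inactivity timeout. The example below uses POSIX poll() as a stand-in (the entry does not say which call runfilter.cc uses); read_with_timeout() and the TIMED_OUT value are hypothetical.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <poll.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical return value meaning "no data arrived in time".
const ssize_t TIMED_OUT = -2;

// Hypothetical helper: read from fd, waiting at most timeout_ms for data.
// Returns the number of bytes read (0 at end of file), -1 on error, or
// TIMED_OUT if nothing arrived within the timeout.
ssize_t read_with_timeout(int fd, char* buf, std::size_t len, int timeout_ms) {
    assert(buf != nullptr && len > 0);
    struct pollfd pfd;
    pfd.fd = fd;
    pfd.events = POLLIN;
    pfd.revents = 0;
    int r;
    do {
        // Simplification: a signal restarts the full timeout.
        r = poll(&pfd, 1, timeout_ms);
    } while (r < 0 && errno == EINTR);
    if (r < 0) return -1;
    if (r == 0) return TIMED_OUT;
    return read(fd, buf, len);
}
```

A caller reading filter output in a loop would pass 5 * 60 * 1000 as timeout_ms and treat TIMED_OUT as a reason to kill the filter.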
Mon Sep 26 01:22:08 GMT 2011 Olly Betts <olly@survex.com>
* templates/query: HTML escape topterms.
Mon Sep 26 00:52:42 GMT 2011 Olly Betts <olly@survex.com>
* templates/godmode: HTML escape the contents of document values.
Fri Sep 23 04:09:12 GMT 2011 Olly Betts <olly@survex.com>
* Makefile.am,omindex.cc,tmpdir.cc,tmpdir.h: Factor out tmpdir handling
into a separate source file.
Fri Sep 23 01:49:38 GMT 2011 Olly Betts <olly@survex.com>
* omindex.cc: Factor out index_mimetype() function as a step towards
allowing indexing files within other files (like zip files and email
attachments).
Fri Sep 23 00:54:40 GMT 2011 Olly Betts <olly@survex.com>
* omindex.cc: Use string::const_iterator where we don't modify the
string.
Thu Sep 01 12:28:36 GMT 2011 Olly Betts <olly@survex.com>
* xapian-omega.spec.in: Package outlookmsg2html helper.
Fri Aug 12 23:25:45 GMT 2011 Olly Betts <olly@survex.com>
* NEWS: Update from 1.2.7 and ChangeLog.
Fri Aug 12 23:17:09 GMT 2011 Olly Betts <olly@survex.com>
* scriptindex.cc: MyHtmlParser::parse_html() no longer throws bool to
stop parsing early, so we no longer need to catch it.
Wed Aug 03 23:25:18 GMT 2011 Olly Betts <olly@survex.com>
* configure.ac: Sync changes from xapian-core: Don't pass -Wshadow for
GCC < 4.1; don't pass -Wstrict-null-sentinel for GCC 4.0.x; only
enable symbol visibility on platforms where it is supported; remove
now superfluous check for GCC >= 3. Also, add FIXME for enabling
-Woverloaded-virtual.
Wed Aug 03 06:27:06 GMT 2011 Olly Betts <olly@survex.com>
* omindex.cc: Index title with an 'S' prefix rather than no prefix.
* templates/query: Set up prefixes for 'author', 'title', and map
no prefix so that terms from the title are still matched by default.
Wed Aug 03 06:11:30 GMT 2011 Olly Betts <olly@survex.com>
* docs/omegascript.rst,query.cc: Allow mapping a query string prefix to
more than one term prefix (which xapian-core has supported since
1.0.4).
Fri Jul 29 01:47:44 GMT 2011 Olly Betts <olly@survex.com>
* docs/omegascript.rst,query.cc: Add support for per-prefix stemmers.
Thu Jul 28 13:23:26 GMT 2011 Olly Betts <olly@survex.com>
* docs/omegascript.rst,omega.cc,omega.h,query.cc,query.h: Add support
for search inputs for multiple probabilistic prefixes.
Wed Jul 27 02:35:39 GMT 2011 Olly Betts <olly@survex.com>
* scriptindex.cc: Add link to
http://xapian.org/docs/omega/scriptindex.html to --help output (and
so also to the man page which is generated from this).
Tue Jul 26 05:54:52 GMT 2011 Olly Betts <olly@survex.com>
* query.cc: Rearrange logic for discarding the RSet and forcing the
first page.
Tue Jul 26 05:27:08 GMT 2011 Olly Betts <olly@survex.com>
* query.cc: Remove support for OLDP CGI parameter which was superseded
by xP approximately a decade ago, and isn't even documented.
Mon Jul 04 06:20:03 GMT 2011 Olly Betts <olly@survex.com>
* omega.cc,utils.cc,utils.h: Factor out trim() function.
Mon Jul 04 06:14:05 GMT 2011 Olly Betts <olly@survex.com>
* omega.cc: Avoid creating a temporary string object just to trim
leading and/or trailing whitespace.
Mon Jul 04 06:08:47 GMT 2011 Olly Betts <olly@survex.com>
* omega.cc: If P had trailing spaces, we would remove all but one -
fixed to remove all of them!
Wed Jun 22 15:32:12 GMT 2011 Olly Betts <olly@survex.com>
* INSTALL: Pull in a few updates from the latest version of the
automake document which this file was originally based on.
Add in the missing copyright and licensing information.
Thu Jun 16 15:42:31 GMT 2011 Olly Betts <olly@survex.com>
* query.cc: Drop legacy support for handling '.' separated terms in
OLDP - that changed in Omega 0.9.7, which is approaching 5 years
ago now.
Thu Jun 16 15:38:40 GMT 2011 Olly Betts <olly@survex.com>
* query.cc: Improve $version output from "Xapian - xapian-omega 1.2.6"
to "xapian-omega 1.2.6".
* docs/omegascript.rst: Update example to match (and use less ancient
version!)
Thu Jun 16 15:36:12 GMT 2011 Olly Betts <olly@survex.com>
* dbi2omega: Remove uninteresting reference to 0.9.4.
Mon Jun 13 14:25:45 GMT 2011 Olly Betts <olly@survex.com>
* hashterm.cc: Avoid unnecessary temporary string object.
Mon Jun 13 14:01:20 GMT 2011 Olly Betts <olly@survex.com>
* hashterm.cc: Fix comment typo.
Mon Jun 13 13:49:14 GMT 2011 Olly Betts <olly@survex.com>
* xapian-omega.spec.in: We're ABI compatible within a release series
so make dependency on xapian-core-libs >= rather than =.
Mon Jun 13 12:30:29 GMT 2011 Olly Betts <olly@survex.com>
* scriptindex.cc: Avoid unnecessary temporary string object.
Mon Jun 13 12:24:32 GMT 2011 Olly Betts <olly@survex.com>
* scriptindex.cc: Remove error warning that index=nopos was replaced
with indexnopos - this was removed in 1.1.0 so there's been enough
time to upgrade.
Mon Jun 13 09:56:29 GMT 2011 Olly Betts <olly@survex.com>
* configure.ac: Update version to 1.3.0.
Mon Jun 13 09:42:50 GMT 2011 Olly Betts <olly@survex.com>
* docs/termprefixes.rst: Update reference to flint.
Mon Jun 13 08:00:16 GMT 2011 Olly Betts <olly@survex.com>
* docs/termprefixes.rst: Expand to document mapping a user prefix to
multiple term prefixes.
Mon Jun 13 03:23:47 GMT 2011 Olly Betts <olly@survex.com>
* docs/overview.rst: Improve documentation of htdig_noindex.
Sun Jun 12 11:52:29 GMT 2011 Olly Betts <olly@survex.com>
* NEWS: Final update for 1.2.6.
Fri Jun 10 12:02:32 GMT 2011 Olly Betts <olly@survex.com>
* NEWS,configure.ac: Update in preparation for 1.2.6.
Fri Jun 10 03:28:33 GMT 2011 Olly Betts <olly@survex.com>
* templates/inc/anyallexactradio: Remove unused duplicate of
anyallradio.
Fri Jun 10 03:21:25 GMT 2011 Olly Betts <olly@survex.com>
* configure.ac,omindex-config.cc,omindex-config.html: Strip out partly
written and long untouched omindex-config utility.
Thu Jun 09 14:20:46 GMT 2011 Olly Betts <olly@survex.com>
* weight.cc: Fix a compiler warning (I failed to note the compiler
unfortunately).
Sun May 29 13:00:26 GMT 2011 Olly Betts <olly@survex.com>
* templates/query: Make search query input type=search.
Sun May 29 12:24:43 GMT 2011 Olly Betts <olly@survex.com>
* templates/query: Autofocus the search query input (using HTML
autofocus attribute with Javascript fallback for older browsers).
(ticket#544)
Wed May 25 14:33:18 GMT 2011 Olly Betts <olly@survex.com>
* docs/omegascript.rst: Correct the documentation of the colours used by
$highlight{}.
Fri May 13 05:50:35 GMT 2011 Olly Betts <olly@survex.com>
* docs/overview.rst: Add using unoconv as a more complex example of
using --filter (ticket#324).
Wed Apr 20 07:00:56 GMT 2011 Olly Betts <olly@survex.com>
* NEWS: Fix typo; clarify wording.
Mon Apr 04 13:58:06 GMT 2011 Olly Betts <olly@survex.com>
* NEWS: Update release date.
Mon Apr 04 13:53:34 GMT 2011 Olly Betts <olly@survex.com>
* templates/xml: Fix syntax error from recent edit.
Sun Apr 03 10:54:04 GMT 2011 Olly Betts <olly@survex.com>
* NEWS,configure.ac: Update for 1.2.5.
Sat Apr 02 14:15:32 GMT 2011 Olly Betts <olly@survex.com>
* templates/query: Use $add{$field{modtime}} to ensure it is numeric.
Sat Apr 02 14:14:06 GMT 2011 Olly Betts <olly@survex.com>
* templates/godmode: More missing escaping.
Sat Apr 02 14:07:45 GMT 2011 Olly Betts <olly@survex.com>
* templates/xml: Remove double escaping.
Sat Apr 02 13:58:44 GMT 2011 Olly Betts <olly@survex.com>
* templates/query: More escaping fixes.
Sat Apr 02 13:55:03 GMT 2011 Olly Betts <olly@survex.com>
* templates/emptydocs,templates/opensearch,templates/xml: More missing
escaping.
Sat Apr 02 12:34:42 GMT 2011 Olly Betts <olly@survex.com>
* templates/query: Add missing escaping.
Sat Apr 02 11:48:43 GMT 2011 Olly Betts <olly@survex.com>
* templates/godmode: Add missing escaping.
Sat Apr 02 10:34:58 GMT 2011 Olly Betts <olly@survex.com>
* templates/xml: Remove support for undocumented HILITECLASS CGI
variable. There's no evidence I can find using Google code search
or web search that this has been used anywhere, and it's problematic
to escape properly.
Sat Mar 26 14:51:36 GMT 2011 Olly Betts <olly@survex.com>
* INSTALL: Copy new Multi-Arch section from xapian-core/INSTALL.
Replace VPATH section with better equivalent from
xapian-core/INSTALL.
Wed Mar 23 15:21:41 GMT 2011 Olly Betts <olly@survex.com>
* htmlparse.cc,htmlparse.h,htmlparsetest.cc,metaxmlparse.cc,
metaxmlparse.h,myhtmlparse.cc,myhtmlparse.h,omindex.cc,svgparse.cc,
svgparse.h,xmlparse.cc,xmlparse.h,xpsxmlparse.cc,xpsxmlparse.h:
Instead of throwing a bool to abandon parsing, change methods to
return bool to signify if they want to continue parsing or not.
This is a bit faster (~0.23% for indexing a lot of HTML files).
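Note: the change above swaps exception-based early exit for a bool return value. A toy sketch of the pattern follows; Handler, StopAtBodyEnd and parse() are hypothetical names and not the HtmlParser interface.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical callback interface: each handler returns true to continue
// parsing, or false to ask the parser to stop.
struct Handler {
    virtual ~Handler() {}
    virtual bool on_token(const std::string& token) = 0;
};

// Example handler which stops once it has seen the closing body tag.
struct StopAtBodyEnd : Handler {
    bool on_token(const std::string& token) override {
        return token != "</body>";
    }
};

// Toy "parser": feeds tokens to the handler until it declines to continue.
// Returns how many tokens were delivered.
std::size_t parse(const std::vector<std::string>& tokens, Handler& handler) {
    std::size_t delivered = 0;
    for (const std::string& token : tokens) {
        ++delivered;
        // Previously a handler would throw to get out of this loop; now the
        // return value is checked, avoiding the cost of stack unwinding.
        if (!handler.on_token(token)) break;
    }
    return delivered;
}
```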
Mon Mar 21 05:48:08 GMT 2011 Olly Betts <olly@survex.com>
* myhtmlparse.cc,myhtmlparse.h,omindex.cc: Add --ignore-exclusions
option, which will index HTML files despite meta robots tags, etc -
omindex is often used in environments where such exclusions aren't
relevant.
Fri Mar 18 10:24:58 GMT 2011 Olly Betts <olly@survex.com>
* omindex.cc: Just report the mimetype as unknown instead of saying
"unknown Office 2007 MIME subtype".
Fri Mar 18 05:53:21 GMT 2011 Olly Betts <olly@survex.com>
* diritor.h: Avoid using S_IRUSR, etc under __WIN32__.
Fri Mar 18 03:00:16 GMT 2011 Olly Betts <olly@survex.com>
* docs/overview.rst,omindex.cc: Ignore *.css and *.js by default too.
Thu Mar 17 23:34:07 GMT 2011 Olly Betts <olly@survex.com>
* omindex.cc: For skip messages which are only to be shown in verbose
mode, call skip with new SKIP_VERBOSE_ONLY flag. Pass new
SKIP_SHOW_FILENAME flag for skip messages shown before we say what
file we are indexing so we know to show the filename even in verbose
mode.
Thu Mar 17 03:47:54 GMT 2011 Olly Betts <olly@survex.com>
* omindex.cc: Restore handling of exceptions from
DirectoryIterator::get_type(), and handle exceptions from
DirectoryIterator::next() which ended up at the top level
before (though they probably never happen, at least on Linux).
Wed Mar 16 06:19:01 GMT 2011 Olly Betts <olly@survex.com>
* omindex.cc: Push all the code associated with indexing a file into
index_file().
Wed Mar 16 02:55:53 GMT 2011 Olly Betts <olly@survex.com>
* omindex.cc: Push try block around index_file() call into the
function.
Wed Mar 16 02:51:52 GMT 2011 Olly Betts <olly@survex.com>
* omindex.cc: Factor out handling for skipping files, and improve
these messages by consistently reporting the filename.
Tue Mar 15 12:47:12 GMT 2011 Olly Betts <olly@survex.com>
* docs/Makefile.am,docs/index.rst: Add index page which links to all
the other documentation pages.
Tue Mar 15 12:20:30 GMT 2011 Olly Betts <olly@survex.com>
* omindex.cc: Add --empty-docs option to allow documents we extract
no body text from to be indexed (existing behaviour), skipped, or
reported and then indexed.
Fri Mar 04 14:13:47 GMT 2011 Olly Betts <olly@survex.com>
* docs/omegascript.rst: Minor improvements.
Wed Mar 02 11:17:42 GMT 2011 Olly Betts <olly@survex.com>
* NEWS: Update.
Wed Mar 02 06:14:41 GMT 2011 Olly Betts <olly@survex.com>
* docs/termprefixes.rst: New standard prefix E for filename extension.
* omindex.cc: Index file extension as E-prefixed term.
Mon Feb 28 13:45:32 GMT 2011 Olly Betts <olly@survex.com>
* omindex.cc: Tell xls2csv not to quote fields and to put spaces
not commas between them. Fixes indexing of numeric fields, and
means we don't need to use our CSV parser to get a sample.
Mon Feb 28 12:10:53 GMT 2011 Olly Betts <olly@survex.com>
* xmlparse.cc: Add whitespace between chunks of text extracted from
Microsoft Office 2007 formats.
Wed Feb 23 12:34:28 GMT 2011 Olly Betts <olly@survex.com>
* templates/xml: Try $field{caption} (which is what omindex sets)
before $field{title} when getting a value for the hit tag's title
attribute - this is consistent with how the query template gets the
title. Add new type attribute which gives $field{type}.
Thu Feb 17 05:19:28 GMT 2011 Olly Betts <olly@survex.com>
* templates/xml: Add DBSize attribute to <result> element.
Wed Feb 16 03:19:57 GMT 2011 Olly Betts <olly@survex.com>
* Makefile.am,omindex.cc,query.cc,urlencode.cc,urlencode.h: Update
URL encoding to follow RFC3986.
Tue Feb 15 03:20:40 GMT 2011 Olly Betts <olly@survex.com>
* omindex.cc: Encode reserved characters in URLs - now links to
files with names containing '#' and '?' will work.
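Note: percent-encoding along the lines of the two entries above can be sketched as follows. Only the RFC 3986 "unreserved" set is left unescaped; keeping '/' unescaped so that path separators survive is an assumption made here, and url_encode_path() is a hypothetical name, not the function in urlencode.cc.

```cpp
#include <cassert>
#include <string>

// RFC 3986 section 2.3 unreserved characters:
// ALPHA / DIGIT / "-" / "." / "_" / "~"
static bool is_unreserved(unsigned char ch) {
    return (ch >= 'A' && ch <= 'Z') || (ch >= 'a' && ch <= 'z') ||
           (ch >= '0' && ch <= '9') ||
           ch == '-' || ch == '.' || ch == '_' || ch == '~';
}

// Hypothetical helper: percent-encode a file path for use in a URL.
std::string url_encode_path(const std::string& path) {
    static const char hex[] = "0123456789ABCDEF";
    std::string out;
    for (unsigned char ch : path) {
        if (is_unreserved(ch) || ch == '/') {
            out += static_cast<char>(ch);
        } else {
            // Everything else, including '#' and '?', becomes %XX.
            out += '%';
            out += hex[ch >> 4];
            out += hex[ch & 0x0f];
        }
    }
    return out;
}
```

With '#' and '?' escaped, a browser no longer treats the rest of a filename as a fragment or query string.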
Sun Jan 23 13:27:48 GMT 2011 Olly Betts <olly@survex.com>
* docs/overview.rst,omindex.cc: Later Microsoft Works versions produce
.xlr spreadsheet files, which are apparently XL files with a
different extension, so handle them as XL files.
Thu Jan 20 11:07:46 GMT 2011 Olly Betts <olly@survex.com>
* docs/omegascript.rst,omega.cc,query.cc,templates/query: Allow
QueryParser flags to be set from OmegaScript (ticket#418).
Sat Jan 15 11:14:32 GMT 2011 Olly Betts <olly@survex.com>
* NEWS: Update from ChangeLog, 1.0.22 and 1.0.23.
Wed Jan 12 02:21:59 GMT 2011 Olly Betts <olly@survex.com>
* query.cc: Fix double Content-Type header in some error reporting
situations (regression introduced in 1.2.4).
Mon Jan 10 10:00:00 GMT 2011 Olly Betts <olly@survex.com>
* omindex.cc,pkglibbindir.cc,pkglibbindir.h: Fix typo in function name
(get_pkglibdindir() -> get_pkglibbindir()).
Mon Jan 10 09:50:38 GMT 2011 Olly Betts <olly@survex.com>
* diritor.cc,diritor.h: Don't define or try to set euid member of
DirectoryIterator on platforms where we aren't going to use it.
Mon Jan 10 09:15:24 GMT 2011 Olly Betts <olly@survex.com>
* diritor.h: Stub out get_owner() and get_group() for __WIN32__.
Fri Dec 24 10:35:29 GMT 2010 Olly Betts <olly@survex.com>
* NEWS: Update from ChangeLog.
Thu Dec 23 01:53:06 GMT 2010 Olly Betts <olly@survex.com>
* diritor.cc: Fix to work with older libmagic which doesn't have
MAGIC_MIME_TYPE (e.g. on Ubuntu hardy).
Sun Dec 19 12:39:23 GMT 2010 Olly Betts <olly@survex.com>
* NEWS,configure.ac: 1.2.4.
Sun Dec 19 12:37:58 GMT 2010 Olly Betts <olly@survex.com>
* query.cc: Disable permission filtering based on $REMOTE_USER as that
will break some existing installations if users upgrade, which we
don't want. Probably this should be specifiable from OmegaScript
but it's not worth delaying 1.2.4 while we sort this out.
Sun Dec 19 02:46:17 GMT 2010 Olly Betts <olly@survex.com>
* docs/overview.rst,omindex.cc: Change the new name for
"--preserve-unupdated" from "--preserve-removed" to "--no-delete".
Sun Dec 19 02:32:29 GMT 2010 Olly Betts <olly@survex.com>
* query.cc: Fix comment typo.
Fri Dec 17 12:45:47 GMT 2010 Olly Betts <olly@survex.com>
* commonhelp.cc,commonhelp.h,omindex.cc,scriptindex.cc: Swap the
meanings of -v and -V in omindex for consistency with scriptindex
and typical short options for --verbose and --version in other
packages. For backward compatibility, "omindex -v" is handled
specially and still reports the version.
Fri Dec 17 08:31:29 GMT 2010 Olly Betts <olly@survex.com>
* utf8convert.cc: Fix built-in converter to handle space in charset
names, which fixes failing utf8converttest when iconv isn't
available.
Fri Dec 17 05:36:36 GMT 2010 Olly Betts <olly@survex.com>
* utf8convert.cc: Rework the fixing up of charset names which iconv()
doesn't understand a little.
Thu Dec 16 06:35:46 GMT 2010 Olly Betts <olly@survex.com>
* loadfile.cc: If fstat() fails, preserve the errno value rather than
letting close() clobber it.
Thu Dec 16 06:31:30 GMT 2010 Olly Betts <olly@survex.com>
* loadfile.cc: Fix file descriptor leak if load_file() is called on
something which isn't a file (found by cppcheck run on the Debian
archive). This case probably couldn't occur in omindex, but could if
you used the LOADFILE action in scriptindex.
Thu Dec 09 10:58:48 GMT 2010 Olly Betts <olly@survex.com>
* docs/omegascript.rst: Replace $simplecommand with $query - a concrete
example is more useful. Improve mark-up.
* docs/termprefixes.rst: Remove mention of pre-0.9.7 use of W prefix.
Thu Nov 18 12:25:50 GMT 2010 Olly Betts <olly@survex.com>
* omega.cc: Fix reversed condition in recent exception reporting fix.
Wed Nov 17 03:46:24 GMT 2010 Olly Betts <olly@survex.com>
* diritor.cc: Add missing magic_cookie argument to calls to
magic_error().
Sat Nov 13 12:17:51 GMT 2010 Olly Betts <olly@survex.com>
* omindex.cc: Build up document data with += for efficiency.
Sat Nov 13 12:08:09 GMT 2010 Olly Betts <olly@survex.com>
* omindex.cc: Index author with A prefix.
Sat Nov 13 12:00:50 GMT 2010 Olly Betts <olly@survex.com>
* omindex.cc: A file extension can't contain a '/'.
Sat Nov 13 11:50:31 GMT 2010 Olly Betts <olly@survex.com>
* omindex.cc: Index the leafname of the file (without any extension) as
if it contained additional keywords.
Sat Nov 13 11:32:09 GMT 2010 Olly Betts <olly@survex.com>
* omindex.cc: If a filter command isn't installed, flag this in the
commands map so we don't try running this command again for any
file with the same mimetype (previously we'd rerun it for a different
extension which gave the same mimetype).
Fri Nov 12 09:11:35 GMT 2010 Olly Betts <olly@survex.com>
* Makefile.am,configure.ac: Add -no-undefined to AM_LDFLAGS on
platforms which need it to dynamically link, such as Cygwin (the need
to do this was taken from ticket#282).
Fri Nov 12 03:35:56 GMT 2010 Olly Betts <olly@survex.com>
* omindex.cc: Report MIME type if it's unknown to us. Remove debug
output line. Update comments.
Fri Nov 12 03:32:27 GMT 2010 Olly Betts <olly@survex.com>
* diritor.cc: Report errors from libmagic.
Fri Nov 12 02:58:20 GMT 2010 Olly Betts <olly@survex.com>
* diritor.cc,diritor.h: Fix to compile when libmagic is detected.
Fri Nov 12 01:40:24 GMT 2010 Olly Betts <olly@survex.com>
* diritor.cc: Add missing class qualifier to method definition.
Fri Nov 12 01:25:11 GMT 2010 Olly Betts <olly@survex.com>
* INSTALL: Mention libmagic in install instructions.
Fri Nov 12 01:16:21 GMT 2010 Olly Betts <olly@survex.com>
* Makefile.am,configure.ac,diritor.cc,diritor.h,omindex.cc: Optionally
use libmagic to detect MIME types for files for which we have no
extension mapping, which allows us to handle files with a misleading
extension, and files with no extension. (ticket#114)
Thu Nov 11 23:23:07 GMT 2010 Olly Betts <olly@survex.com>
* omindex.cc: Refactor slightly to handle the unknown extension case
up front, so we lose an indentation level for the known extension
case.
Thu Nov 11 12:25:03 GMT 2010 Olly Betts <olly@survex.com>
* omindex.cc: Add new --filter option to allow the user to specify
new filters without patching omindex.cc.
* docs/overview.rst: Document --filter.
Thu Nov 11 02:51:55 GMT 2010 Olly Betts <olly@survex.com>
* omindex.cc: Factor out handling for external filter programs which
simply return UTF-8 text on stdout.
Mon Nov 08 10:58:46 GMT 2010 Olly Betts <olly@survex.com>
* omindex.cc,svgparse.cc,svgparse.h: Extract author for SVG files.
Mon Nov 08 10:40:09 GMT 2010 Olly Betts <olly@survex.com>
* omindex.cc: Extract metadata from Microsoft Office 2007 file formats.
Mon Nov 08 10:21:13 GMT 2010 Olly Betts <olly@survex.com>
* myhtmlparse.cc,myhtmlparse.h,omindex.cc: Extract author from HTML
documents.
Mon Nov 08 09:46:03 GMT 2010 Olly Betts <olly@survex.com>
* omindex.cc: Escape wildcard patterns being passed to unzip - in the
unlikely event that one of these matched files in or under the
current directory, we might fail to extract all the files we wanted
to.
Mon Nov 08 05:03:41 GMT 2010 Olly Betts <olly@survex.com>
* metaxmlparse.cc,metaxmlparse.h,omindex.cc: Extract author from
OpenDocument documents.
Mon Nov 08 03:18:26 GMT 2010 Olly Betts <olly@survex.com>
* omindex.cc: Extract author from PDF metadata.
Mon Nov 08 03:15:17 GMT 2010 Olly Betts <olly@survex.com>
* metaxmlparse.h: Initialise field member variable.
Mon Nov 08 00:28:07 GMT 2010 Olly Betts <olly@survex.com>
* omindex.cc: Index text in headers and footers for .odt and .docx
files.
Thu Nov 04 11:55:58 GMT 2010 Olly Betts <olly@survex.com>
* omega.cc,omega.h,query.cc: If we catch an error early on, make sure
that if it's appropriate, we write out a "Content-Type:" HTTP header
and end the headers.
Thu Nov 04 11:39:10 GMT 2010 Olly Betts <olly@survex.com>
* utf8converttest.cc: Add back in testcases for charset names with
hyphens in.
Thu Nov 04 09:01:43 GMT 2010 Olly Betts <olly@survex.com>
* utils.cc: Fix misuse of BUFSIZE which should be sizeof(buf) (issue
reported by compilation with CPPFLAGS=-D_GLIBCXX_DEBUG).
Thu Nov 04 09:01:08 GMT 2010 Richard Boulton <richard@tartarus.org>
* utf8convert.cc,utf8converttest.cc: If iconv can't handle a
charset, check if it's of the form (UTF|UCS)[_ ]?.* and if so,
convert to the official hyphenated form. Should fix failure of
utf8converttest on OSX, where it fails due to iconv not
supporting "UTF16".
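A rough sketch of the charset name fix-up described in the entry above follows. The function name hyphenate_charset() is hypothetical, and matching only upper-case "UTF"/"UCS" prefixes is an assumption made to keep the example short; the real code in utf8convert.cc may differ.

```cpp
#include <cstddef>
#include <string>

// Hypothetical sketch (not the utf8convert.cc code): rewrite names of the
// form (UTF|UCS)[_ ]?.* to the hyphenated form, e.g. "UTF16" -> "UTF-16".
// Assumption: only upper-case prefixes are recognised here.
std::string hyphenate_charset(const std::string& name) {
    if (name.size() < 4) return name;            // nothing after the prefix
    const std::string prefix = name.substr(0, 3);
    if (prefix != "UTF" && prefix != "UCS") return name;
    std::size_t rest = 3;
    if (name[rest] == '-') return name;          // already hyphenated
    if (name[rest] == '_' || name[rest] == ' ') ++rest;  // drop the separator
    return prefix + "-" + name.substr(rest);
}
```

A caller would only apply this after iconv has rejected the original name, then retry the conversion with the rewritten name.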
Tue Nov 02 09:48:19 GMT 2010 Olly Betts <olly@survex.com>
* diritor.cc,diritor.h,loadfile.cc,loadfile.h,md5wrap.cc,md5wrap.h,
omindex.cc,scriptindex.cc: Use O_NOATIME if available and either the
file is owned by the current euid, or the current euid is 0 (i.e.
we're running as root). Fixes ticket#222.
Fri Oct 29 14:26:25 GMT 2010 Olly Betts <olly@survex.com>
* omindex.cc: Use the CSV parser to generate a nicer sample for files
of type application/vnd.ms-excel.
Fri Oct 29 09:26:52 GMT 2010 Olly Betts <olly@survex.com>
* Makefile.am: Put $(PCRE_LIBS) in libtransform_la_LIBADD rather than
omega_LDADD (more correct, but probably doesn't actually make any
difference).
Thu Oct 28 14:46:11 GMT 2010 Olly Betts <olly@survex.com>
* omindex.cc: Disable more output unless --verbose is specified. Don't
flush the "Indexing" partial message until we get to the potentially
time-consuming actions.
Thu Oct 28 13:54:44 GMT 2010 Olly Betts <olly@survex.com>
* docs/overview.rst: Improve mark-up, and tweak wording in a few
places.
Thu Oct 28 13:46:36 GMT 2010 Olly Betts <olly@survex.com>
* docs/overview.rst: Update docs for --duplicates and
--preserve-removed.
Thu Oct 28 13:27:01 GMT 2010 Olly Betts <olly@survex.com>
* omindex.cc: Deprecate "--preserve-nonduplicates" in favour of new
long option "--preserve-removed" which does the same thing, but has
a (hopefully) clearer name. Rename the variable it controls from
preserve_unupdated to delete_removed_documents (with the opposite
sense).
Thu Oct 28 12:08:59 GMT 2010 Olly Betts <olly@survex.com>
* configfile.cc: Only append '/' to directory values if they don't
already have a trailing '/'.
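The trailing-slash rule in the configfile.cc entry above can be sketched as follows. The function name with_trailing_slash() is hypothetical, and leaving an empty value unchanged is an assumption rather than something the entry states.

```cpp
#include <string>

// Hypothetical sketch (not the configfile.cc code): append '/' to a
// directory value only if it doesn't already end with one.
// Assumption: an empty value is returned unchanged.
std::string with_trailing_slash(std::string dir) {
    if (!dir.empty() && dir[dir.size() - 1] != '/') {
        dir += '/';
    }
    return dir;
}
```

With this rule, a configured directory gives the same result whether or not it was written with a trailing slash, so later path concatenation never produces "//".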
Thu Oct 28 11:49:54 GMT 2010 Olly Betts <olly@survex.com>
* runfilter.cc: Make the memory limit for filter processes the size
of physical memory, not 7/8 of this value, which is a little less
arbitrary (ticket#424).
Thu Oct 28 11:47:38 GMT 2010 Olly Betts <olly@survex.com>
* omindex.cc: Under --duplicates=ignore, fix so that old documents which
aren't seen get deleted, which wasn't implemented before (to suppress
this deletion, pass -p as well).
Thu Oct 28 10:38:21 GMT 2010 Olly Betts <olly@survex.com>
* omindex.cc: Track how many documents in the index we haven't seen
in this index run - if this is 0, we don't need to check for docs
to delete at all; otherwise we can at least use it to know when we
have found them all. Use a PostingIterator over all documents to
avoid having to catch exceptions from delete_document() for gaps
in the used docids.
Thu Oct 28 04:52:36 GMT 2010 Olly Betts <olly@survex.com>
* omindex.cc: Add quotes around directory name in "Entering directory"
message. Add directory name to "skipping directory" error message.
Thu Oct 28 04:50:37 GMT 2010 Olly Betts <olly@survex.com>
* omindex.cc: Document --verbose in --help. Actually recognise -V.
Thu Oct 28 04:01:31 GMT 2010 Olly Betts <olly@survex.com>
* omindex.cc: Move the directory iteration loop out of the try/catch
block for starting the iteration, which means it's indented by a
whole level less.
Thu Oct 28 03:47:30 GMT 2010 Olly Betts <olly@survex.com>
* omindex.cc: Add --verbose option, and disable the less interesting
output unless it is specified.
Thu Oct 28 03:34:44 GMT 2010 Olly Betts <olly@survex.com>
* omindex.cc: Eliminate the message "Caught unknown exception in
index_directory, rethrowing" as it isn't actually informative.
Thu Oct 28 01:43:44 GMT 2010 Olly Betts <olly@survex.com>
* omindex.cc: Variable dbpath doesn't need to be global.
Thu Oct 28 01:28:10 GMT 2010 Olly Betts <olly@survex.com>
* omindex.cc: The Host and Path terms are the same for every document
in a single invocation of omindex, so calculate them just once up
front.
Thu Oct 28 01:13:36 GMT 2010 Olly Betts <olly@survex.com>
* omindex.cc: Eliminate the leading slash on filenames in output, so
they are now relative filenames on the system. This also simplifies
path building internally.
Wed Oct 27 09:51:51 GMT 2010 Olly Betts <olly@survex.com>
* omindex.cc: Use rpm's --qf option to produce output which is simpler
to parse.
Wed Oct 27 09:32:22 GMT 2010 Olly Betts <olly@survex.com>
* docs/overview.rst,omindex.cc: Add support for indexing RPM packages
(ticket#493).
Wed Oct 27 06:07:59 GMT 2010 Olly Betts <olly@survex.com>
* docs/overview.rst,omindex.cc: Add support for indexing Debian package
files (ticket#493).
Wed Oct 27 05:37:02 GMT 2010 Olly Betts <olly@survex.com>
* docs/overview.rst,omindex.cc: Quietly ignore files with mimetype set
to "ignore". The initial list of extensions set to ignore is:
.a .dll .dylib .exe .lib .o .obj .so
Wed Oct 27 02:25:01 GMT 2010 Olly Betts <olly@survex.com>
* omindex.cc: Report get_description() for Xapian exceptions, which
provides additional information beyond get_msg().
Wed Oct 27 01:56:08 GMT 2010 Olly Betts <olly@survex.com>
* omindex.cc,query.cc,values.h: Add file size as a value, and set up a
NumberValueRangeProcessor so size: works in the query (has to be in
bytes currently).
Wed Oct 27 01:31:25 GMT 2010 Olly Betts <olly@survex.com>
* scriptindex.cc: Report get_description() for Xapian exceptions, which
provides additional information beyond get_msg().
Tue Oct 26 12:00:58 GMT 2010 Olly Betts <olly@survex.com>
* docs/overview.rst: Document the new emptydocs template.
Tue Oct 26 11:51:31 GMT 2010 Olly Betts <olly@survex.com>
* docs/omegascript.rst,query.cc: Add new $emptydocs command which
returns a list of documents with doclength zero.
* query.cc: Extend $field to take an optional DOCID argument, rather
than always using the context from $hitlist.
* templates/emptydocs: New template which lists documents with
doclength zero.
Thu Oct 21 12:05:23 GMT 2010 Olly Betts <olly@survex.com>
* configure.ac,unixperm.cc: Fix to build on platforms where
getgrouplist() exists but takes int* not gid_t* (e.g. Mac OS X).
Wed Oct 20 10:30:13 GMT 2010 Olly Betts <olly@survex.com>
* omindex.cc,scriptindex.cc: Add boolean terms with add_boolean_term()
so they get wdf of 0 and don't contribute to document length.
Sat Oct 16 06:13:23 GMT 2010 Olly Betts <olly@survex.com>
* configure.ac: Probe for any options needed to enable large file
support. Handling files >= 2GB isn't especially useful, but more
importantly this is needed to allow omindex to index files on filing
systems with 64 bit inodes on some platforms (e.g. 32-bit Linux).
Mon Oct 11 11:11:07 GMT 2010 Olly Betts <olly@survex.com>
* Makefile.am: Drop special case to remove man pages on "make clean"
in maintainer-mode.
Wed Sep 29 04:14:21 GMT 2010 Olly Betts <olly@survex.com>
* Makefile.am,configure.ac,query.cc,unixperm.cc,unixperm.h: Pull out
permission checks into a separate file and check Unix user and group
permissions based on environment variable REMOTE_USER, if set.
Tue Sep 28 08:06:00 GMT 2010 Olly Betts <olly@survex.com>
* Makefile.am: Ship common/realtime.h.
Tue Sep 28 06:32:10 GMT 2010 Olly Betts <olly@survex.com>
* query.cc: Apply permission filters if USER and/or GROUP are set.
Tue Sep 28 06:14:50 GMT 2010 Olly Betts <olly@survex.com>
* ./: Update svn:externals to latest common from xapian-core.
* query.cc: Use RealTime::now() to time running the query. Include
more enquire set-up in the time.
Tue Sep 28 05:26:07 GMT 2010 Olly Betts <olly@survex.com>
* omindex.cc: Index file owner and read permissions, to allow finding
documents with a particular owner, and so searches can be restricted
to documents a user is able to read.
* docs/termprefixes.rst: Document term prefixes used by the above.
Tue Sep 28 05:20:01 GMT 2010 Olly Betts <olly@survex.com>
* diritor.h: Rename get_other_read() to is_other_readable() for
consistency.
Tue Sep 28 04:16:55 GMT 2010 Olly Betts <olly@survex.com>
* diritor.cc,diritor.h: Rearrange so that the setting of statbuf_valid
gets inlined so the compiler should be able to optimise out
subsequent calls to call_stat().
Tue Sep 28 04:10:28 GMT 2010 Olly Betts <olly@survex.com>
* diritor.h: Add methods to read the owner and group, and to check
who can read the file.
Tue Sep 28 01:39:15 GMT 2010 Olly Betts <olly@survex.com>
* NEWS: Fix typo.
Tue Sep 28 01:33:44 GMT 2010 Olly Betts <olly@survex.com>
* NEWS: Fix whitespace oddities.
Tue Sep 28 01:31:46 GMT 2010 Olly Betts <olly@survex.com>
* NEWS: Update from ChangeLog.
Tue Sep 28 01:27:41 GMT 2010 Olly Betts <olly@survex.com>
* omindex.cc: Improve --help for --mime-type option.
Mon Sep 20 06:50:45 GMT 2010 Olly Betts <olly@survex.com>
* omindex.cc,svgparse.cc,svgparse.h: Extract any document title and
keywords from SVG files.
Mon Sep 20 06:49:44 GMT 2010 Olly Betts <olly@survex.com>
* htmlparse.cc: Call closing_tag() for XML empty tag syntax (like
"<tag foo=bar />").
Mon Sep 20 05:30:54 GMT 2010 Olly Betts <olly@survex.com>
* Makefile.am,docs/overview.rst,omindex.cc,svgparse.cc,svgparse.h: Add
support for indexing SVG files.
Tue Sep 07 04:39:59 GMT 2010 Olly Betts <olly@survex.com>
* outlookmsg2html.in: If the required perl modules aren't available,
exit with status 127 which omindex interprets as "filter not
installed" and won't try further .msg files.
Tue Sep 07 02:24:36 GMT 2010 Olly Betts <olly@survex.com>
* Makefile.am,configure.ac,docs/overview.rst,omindex.cc,
outlookmsg2html.in,pkglibbindir.cc,pkglibbindir.h: Add support for
indexing .msg files from Microsoft Outlook. (ticket#334)
Tue Aug 31 06:32:15 GMT 2010 Olly Betts <olly@survex.com>
* omindex.cc: Fix handling of quoting in CSV files to match what's
most common.
Tue Aug 31 05:41:13 GMT 2010 Olly Betts <olly@survex.com>
* docs/overview.rst,omindex.cc: The V in CSV is Values not Variable.
Mon Aug 30 14:56:36 GMT 2010 Olly Betts <olly@survex.com>
* docs/overview.rst,omindex.cc: Add support for indexing .csv files.
Sat Aug 28 11:46:22 GMT 2010 Olly Betts <olly@survex.com>
* cdb_find.cc,cdb_init.cc,cgiparam.cc,date.cc,md5.cc,query.cc,utils.cc,
values.h: Fix to compile with Sun C++.
Sat Aug 28 11:36:25 GMT 2010 Olly Betts <olly@survex.com>
* omega.cc: An ESet can't contain empty terms, so there's no need to
check for them.
Tue Aug 24 05:58:28 GMT 2010 Olly Betts <olly@survex.com>
* NEWS,configure.ac: Update for 1.2.3.
Mon Aug 23 15:08:11 GMT 2010 Olly Betts <olly@survex.com>
* xapian-omega.spec.in: Don't run autoreconf - it's no longer required.
Tue Aug 03 14:11:35 GMT 2010 Olly Betts <olly@survex.com>
* docs/termprefixes.rst: Update "flint and quartz" to "flint and chert"
as quartz is no longer supported. Give exact term length limit for
flint and chert.
Sun Jun 27 05:00:39 GMT 2010 Olly Betts <olly@survex.com>
* NEWS,configure.ac: Update for 1.2.2.
Sat Jun 26 15:59:59 GMT 2010 Olly Betts <olly@survex.com>
* NEWS.SKELETON: Add blank line to the end.
Sat Jun 26 15:59:05 GMT 2010 Olly Betts <olly@survex.com>
* NEWS.SKELETON: Add template NEWS entry.
Tue Jun 22 13:55:11 GMT 2010 Olly Betts <olly@survex.com>
* NEWS: Sync with 1.0.21.
* NEWS,configure.ac: Update for 1.2.1.
Sun Jun 13 11:55:40 GMT 2010 Olly Betts <olly@survex.com>
* freemem.cc: Merge in __WIN32__ implementation from perftest in
xapian-core.
Fri May 14 01:39:43 GMT 2010 Olly Betts <olly@survex.com>
* freemem.cc: Use "safeunistd.h" instead of <unistd.h>.
Wed Apr 28 13:38:33 GMT 2010 Olly Betts <olly@survex.com>
* NEWS: Sync with 1.0.20.
Wed Apr 28 06:44:56 GMT 2010 Olly Betts <olly@survex.com>
* configure.ac: Tell libtool not to link in deplibs on platforms where
we know they aren't needed.
* configure.ac: On Linux, extract the library search path from ldconfig
which gives us the default entries reliably.
* NEWS,configure.ac: 1.2.0.
Thu Apr 15 04:32:06 GMT 2010 Olly Betts <olly@survex.com>
* NEWS,configure.ac: Update for 1.1.5.
Mon Feb 15 14:00:26 GMT 2010 Olly Betts <olly@survex.com>
* configure.ac: Update for 1.1.4.
Mon Feb 15 13:51:44 GMT 2010 Olly Betts <olly@survex.com>
* NEWS: Add missing notes for 1.1.2 and 1.1.1 including changes from
1.0.14 and 1.0.13 respectively.
Mon Feb 15 13:28:12 GMT 2010 Olly Betts <olly@survex.com>
* NEWS: Update from ChangeLog and 1.0.18.
Mon Feb 08 00:48:44 GMT 2010 Olly Betts <olly@survex.com>
* Makefile.am: Need to ship common/omassert.h.
Sun Feb 07 23:03:45 GMT 2010 Olly Betts <olly@survex.com>
* Makefile.am: Need to ship common/str.h.
Sun Feb 07 21:40:03 GMT 2010 Olly Betts <olly@survex.com>
* Makefile.am,omega.cc,omindex.cc,query.cc,utils.cc,utils.h: Use the
optimised str() routine instead of int_to_string() and
long_to_string().
Fri Feb 05 23:29:12 GMT 2010 Olly Betts <olly@survex.com>
* omindex.cc: Increase the wdf boost for the document title from 2 to
5, since 2 isn't really enough.
Thu Feb 04 03:20:02 GMT 2010 Olly Betts <olly@survex.com>
* Makefile.am,configure.ac,runfilter.cc: Use safesyswait.h.
* runfilter.cc: Reformat header to @file doxygen comment. Put
'#include "runfilter.h"' right after <config.h>.
Wed Dec 10 00:15:10 GMT 2009 Olly Betts <olly@survex.com>
* NEWS: Update from ChangeLog.
Wed Dec 09 00:26:19 GMT 2009 Olly Betts <olly@survex.com>
* myhtmlparse.cc: Add missing "using namespace std;".
Wed Dec 09 00:20:38 GMT 2009 Olly Betts <olly@survex.com>
* htmlparse.cc: Make the default charset "utf-8" not "UTF-8" as we
lower case explicitly specified character sets to compare to see
if we need to reparse, so this avoids a reparse when UTF-8 is
explicitly specified as well as the default.
Tue Dec 08 23:56:46 GMT 2009 Olly Betts <olly@survex.com>
* scriptindex.cc: Don't bomb out if indexing is disallowed or we hit
</body> for a document which had an overridden character set.
Fixes ticket#410.
Wed Nov 18 10:48:47 GMT 2009 Olly Betts <olly@survex.com>
* NEWS,configure.ac: Update for 1.1.3.
Wed Nov 18 02:37:34 GMT 2009 Olly Betts <olly@survex.com>
* NEWS: Update from 1.0.17 and ChangeLog.
Mon Nov 16 09:08:12 GMT 2009 Olly Betts <olly@survex.com>
* utf8converttest.cc: Charset "8859_1" isn't understood by Solaris
libiconv, and isn't likely to be specified on a page, so just
test it for our built-in converter and GNU libc.
Wed Nov 11 04:52:25 GMT 2009 Olly Betts <olly@survex.com>
* configure.ac: Also check for socketpair with -lxnet if it isn't found
without, which enables resource limits on Solaris, and possibly some
other platforms. Fixes ticket#412.
Wed Nov 04 01:51:41 GMT 2009 Olly Betts <olly@survex.com>
* freemem.cc: On Linux, _SC_AVPHYS_PAGES excludes pages used by the OS
VM cache, so will often return a really low value, so instead use
_SC_PHYS_PAGES. Reported by Rune Kock in Debian bug#548987. Also
explains ticket#358.
Wed Nov 04 00:54:38 GMT 2009 Olly Betts <olly@survex.com>
* common/: Sync with latest version from xapian-core to pick up getopt
fix for Mac OS X 10.6.
Mon Nov 02 09:32:22 GMT 2009 Olly Betts <olly@survex.com>
* omindex.cc: Use delete[] (not delete) for array allocated by new[].
Mon Nov 02 07:08:13 GMT 2009 Olly Betts <olly@survex.com>
* runfilter.cc: Fix likely crash if read() is interrupted by a signal.
Identified by Coverity's Scan.
Mon Nov 02 06:47:01 GMT 2009 Olly Betts <olly@survex.com>
* scriptindex.cc: Extend exception handling to the whole of main.
Xapian::Stem("english") can't actually throw, but that's not obvious
to static analysis tools, and it is more robust to wrap the whole of
main, and reduces indentation.
Mon Nov 02 06:32:41 GMT 2009 Olly Betts <olly@survex.com>
* omindex.cc,scriptindex.cc: Tighten up the type of the error we catch
to detect an unknown stemming language.
Thu Sep 17 12:13:10 GMT 2009 Olly Betts <olly@survex.com>
* NEWS: Update from ChangeLog.
Thu Sep 10 13:33:06 GMT 2009 Olly Betts <olly@survex.com>
* configure.ac: Default to looking for xapian-config-1.1.
Thu Sep 10 06:46:55 GMT 2009 Olly Betts <olly@survex.com>
* NEWS: Sync changes from 1.0.15 and 1.0.16.
Wed Sep 09 13:32:25 GMT 2009 Olly Betts <olly@survex.com>
* omega.cc,query.cc,query.h: Fix cross-site scripting vulnerability in
reporting of exceptions (CVE-2009-2947).
Fri Aug 28 15:30:07 GMT 2009 Richard Boulton <richard@lemurconsulting.com>
* configure.ac: Check for PERL if in maintainer mode, not just when
building documentation, because making the omegascript vim syntax
mode requires it.
Wed Aug 26 14:17:06 GMT 2009 Olly Betts <olly@survex.com>
* templates/query: www.xapian.org -> xapian.org.
Tue Aug 25 11:15:38 GMT 2009 Olly Betts <olly@survex.com>
* gen-omegascript-vim: Fix swapped arguments to perl mkdir function.
Tue Aug 25 10:39:29 GMT 2009 Olly Betts <olly@survex.com>
* gen-omegascript-vim: Add GPL licence boilerplate.
Tue Aug 25 10:29:07 GMT 2009 Olly Betts <olly@survex.com>
* gen-omegascript-vim: Need to create "extra" for a VPATH build.
Tue Aug 25 08:39:00 GMT 2009 Olly Betts <olly@survex.com>
* Makefile.am: Fix for VPATH build.
Tue Aug 25 06:38:08 GMT 2009 Olly Betts <olly@survex.com>
* Makefile.am,extra/omegascript.vim,extra/omegascript.vim.in,
gen-omegascript-vim: The list of OmegaScript commands in the vim
mode was rather out of date, and a few commands were misclassified.
Fix both problems and avoid future recurrences by automatically
generating those lists from the command list in query.cc.
Sat Aug 15 11:31:56 GMT 2009 Olly Betts <olly@survex.com>
* NEWS: Update from ChangeLog.
Wed Aug 05 03:50:54 GMT 2009 Olly Betts <olly@survex.com>
* omindex.cc: Implement correct handling of paths when calling
external filter programs on Microsoft Windows.
Thu Jul 23 12:07:24 GMT 2009 Olly Betts <olly@survex.com>
* omindex.cc: Remove pointless fallback code.
Thu Jul 23 12:06:37 GMT 2009 Olly Betts <olly@survex.com>
* templates/inc/toptermsjs: Use double-quotes rather than single quotes
for parameter values on the <script> tag.
Thu Jul 23 11:29:43 GMT 2009 Olly Betts <olly@survex.com>
* docs/omegascript.rst: Document that $date uses UTC. (ticket#314)
Thu Jul 23 11:26:15 GMT 2009 Olly Betts <olly@survex.com>
* templates/query: If JavaScript is available, convert the
$field{modtime} to a string on the client-side so that the timezone
is correct. If JavaScript isn't available, fall back to the existing
behaviour of using UTC. (ticket#314)
Thu Jul 23 04:12:02 GMT 2009 Olly Betts <olly@survex.com>
* NEWS,configure.ac: Update for 1.1.2.
Wed Jul 22 04:33:29 GMT 2009 Olly Betts <olly@survex.com>
* NEWS: Update from ChangeLog and sync with 1.0.13 and 1.0.14.
Tue Jul 07 15:05:09 GMT 2009 Olly Betts <olly@survex.com>
* omindex.cc: Consistently use endl not "\n" at the end of messages so
that output is flushed.
Tue Jul 07 07:29:21 GMT 2009 Olly Betts <olly@survex.com>
* cdb_init.cc,cdb_int.h,cgiparam.cc,configfile.cc,date.cc,
datematchdecider.cc,datematchdecider.h,freemem.cc,htmlparse.cc,
htmlparsetest.cc,md5.cc,md5test.cc,myhtmlparse.cc,omega.cc,
omindex.cc,query.cc,runfilter.cc,scriptindex.cc,strcasecmp.h,
utf8converttest.cc,utils.cc: Update to use C++ forms for ISO C
standard headers (ticket#330).
Mon Jul 06 01:54:35 GMT 2009 Olly Betts <olly@survex.com>
* loadfile.cc: Avoid infinite loop if the file has been truncated
since we read the length, or on Cygwin with the automatic end of
line translation turned on.
Sun Jul 05 13:00:57 GMT 2009 Olly Betts <olly@survex.com>
* htmlparse.cc,htmlparse.h: Make HtmlParser::get_parameter() const
(ticket#139).
Sun Jul 05 12:59:45 GMT 2009 Olly Betts <olly@survex.com>
* cdb_init.cc: Prefer static_cast<> to C-style cast.
Sat Jun 20 03:31:22 GMT 2009 Olly Betts <olly@survex.com>
* docs/overview.rst: www.xapian.org -> xapian.org.
Thu Jun 11 09:45:45 GMT 2009 Olly Betts <olly@survex.com>
* omindex.cc: Extract pptx notesSlides and comments, if present. If
they aren't, unzip returns exit code 11, which we must ignore
(ticket#290).
Thu Jun 11 07:38:57 GMT 2009 Olly Betts <olly@survex.com>
* docs/overview.rst,omindex.cc: Handle the "macroenabled" versions of
MS Office 2007 files too (ticket#290).
Wed Jun 10 01:13:14 GMT 2009 Olly Betts <olly@survex.com>
* configure.ac: Update for 1.1.1.
Tue Jun 09 14:35:40 GMT 2009 Olly Betts <olly@survex.com>
* NEWS: Update for 1.1.1.
Mon May 25 13:38:46 GMT 2009 Olly Betts <olly@survex.com>
* query.cc: If SERVER_PROTOCOL in the environment is set to INCLUDED,
then our output is being included in another page (e.g. using SSI)
so suppress the output of any HTTP headers.
Mon May 25 13:02:22 GMT 2009 Olly Betts <olly@survex.com>
* templates/query: Remove extra "}" introduced when adding spelling
support.
Mon May 25 12:57:45 GMT 2009 Olly Betts <olly@survex.com>
* cgiparam.cc,commonhelp.cc: Include the corresponding header.
Mon May 25 12:56:55 GMT 2009 Olly Betts <olly@survex.com>
* cgiparam.h: Add explicit inclusions of <map> and <string> and qualify
multimap and string with std::.
Sat May 23 12:21:33 GMT 2009 Olly Betts <olly@survex.com>
* configure.ac: Sync warning flags used with GCC with xapian-core
apart from -Woverloaded-virtual which fires for
MyHtmlParser::parse_html(). That probably should be tidied up at
some point, but not right now.
Wed May 20 11:24:46 GMT 2009 Olly Betts <olly@survex.com>
* omindex.cc: The MD5 checksum of a text file with a BOM was being
incorrectly calculated from the contents converted to UTF-8
since 1.0.7. Noticed by Srijon Biswas.
Tue May 05 12:13:17 GMT 2009 Olly Betts <olly@survex.com>
* omindex.cc: We can now use numeric_limits<> since we no longer
support GCC 2.95, so use it and fix a warning on platforms with
32 bit long.
Thu Apr 30 14:09:50 GMT 2009 Olly Betts <olly@survex.com>
* Makefile.am,docs/omegascript.rst,query.cc,weight.cc,weight.h: Add
$opt{weighting} to allow the weighting scheme and parameters to be
specified (ticket#298).
Tue Apr 28 07:38:54 GMT 2009 Olly Betts <olly@survex.com>
* omindex.cc: Check the last modification time of files before
reindexing (ticket#342).
Tue Apr 28 05:17:04 GMT 2009 Olly Betts <olly@survex.com>
* omindex.cc: Drop the copyright info from the output of --version as
it's perennially out of date and we don't report it for any other
Xapian programs.
Tue Apr 28 05:03:29 GMT 2009 Olly Betts <olly@survex.com>
* omindex.cc: If the filter for a filetype isn't installed, don't erase
the entry from the mime_map, but instead set it to the empty string
and then use this to report why we subsequently skip files with the
same extension, rather than slightly misleadingly reporting "Unknown
extension".
Mon Apr 27 16:34:29 GMT 2009 Olly Betts <olly@survex.com>
* templates/query: Offer any spelling correction QueryParser gives.
Mon Apr 27 13:36:19 GMT 2009 Olly Betts <olly@survex.com>
* omindex.cc: Add "--spelling" option to index spelling correction
data.
Sun Apr 26 16:28:36 GMT 2009 Olly Betts <olly@survex.com>
* omindex.cc: Make -s work as a short-form for --stemmer (as
documented by "omindex --help" and "man omindex").
Sun Apr 26 15:33:32 GMT 2009 Olly Betts <olly@survex.com>
* docs/omegascript.rst,query.cc: Add $suggestion and $opt{spelling} to
provide access to spelling correction (ticket#296).
Sun Apr 26 15:08:40 GMT 2009 Olly Betts <olly@survex.com>
* docs/scriptindex.rst,scriptindex.cc: Add new "spell" action for
scriptindex (ticket#296).
Thu Apr 23 07:40:41 GMT 2009 Olly Betts <olly@survex.com>
* docs/scriptindex.rst,scriptindex.cc: Add new "valuenumeric" action
to index a value using Xapian::sortable_serialise() to allow numeric
sorting (ticket#260).
Thu Apr 23 07:09:18 GMT 2009 Olly Betts <olly@survex.com>
* Makefile.am,configure.ac,docs/Makefile.am: Fix things up so that in
a bootstrapped SVN tree, automatic regeneration of
autotools-generated files uses the in-tree versions of the autotools.
Wed Apr 22 13:52:28 GMT 2009 Olly Betts <olly@survex.com>
* NEWS: Update for 1.1.0.
Mon Apr 20 14:20:51 GMT 2009 Olly Betts <olly@survex.com>
* NEWS: Sync changes from 1.0.12.
Mon Apr 20 14:15:41 GMT 2009 Olly Betts <olly@survex.com>
* NEWS: Update from ChangeLog and clean up for release.
Thu Apr 16 10:02:44 GMT 2009 Olly Betts <olly@survex.com>
* transform.cc: Fix off-by-one error - the return value of pcre_exec()
is one more than the number of groupings.
Thu Apr 16 09:23:29 GMT 2009 Olly Betts <olly@survex.com>
* Makefile.am: Need to ship new file transform.h.
Thu Apr 16 08:20:01 GMT 2009 Olly Betts <olly@survex.com>
* Makefile.am,docs/omegascript.rst,query.cc,transform.cc,transform.h:
Factor out the implementation of $transform into a separate source
file and compile only that file with $(PCRE_CFLAGS) to avoid
problems reported by James Aylett with Mac OS X on #xapian-devel.
Fix expansion of \1 to \9 to work correctly and document these
and \\. Fix handling of unescaped \ at the end of the pattern, and
leave unrecognised \<x> sequences unchanged.
Thu Apr 16 04:38:20 GMT 2009 Olly Betts <olly@survex.com>
* configure.ac: Remove duplicate "AC_SUBST(AM_CXXFLAGS)".
Thu Apr 16 04:29:28 GMT 2009 Olly Betts <olly@survex.com>
* configure.ac: Avoid implicitly casting a string literal to char* in
the test for iconv by adding the same explicit cast we use in the
code in utf8convert.cc. Currently the implicit cast is "only" a
warning under GCC, but the user could pass -Werror explicitly in
CXXFLAGS, and this could be promoted to an error in future GCC
versions, and may already be so for some other compilers.
Thu Apr 16 03:56:16 GMT 2009 Olly Betts <olly@survex.com>
* configure.ac: Back out previous fix - -Werror has nothing to do with
the issue James reported.
Tue Apr 14 15:34:36 GMT 2009 Richard Boulton <richard@lemurconsulting.com>
* configure.ac: Test for compiler flags before checking for
libraries, and use the compiler flags found when checking for
things. In particular, this should fix the test for the type
used by iconv() on MacOS (where it was previously returning "char
*", and the test was giving a warning about converting this to
"const char *", but not failing). Requires a change to the iconv
test to avoid it failing on Linux with GCC due to an unrelated
warning in the test code.
Sat Apr 04 15:15:18 GMT 2009 Olly Betts <olly@survex.com>
* NEWS: Update from ChangeLog.
Wed Mar 25 12:35:42 GMT 2009 Olly Betts <olly@survex.com>
* Makefile.am,configure.ac: Actually use all those warning flags we
carefully determine!
Wed Mar 25 12:03:37 GMT 2009 Olly Betts <olly@survex.com>
* Makefile.am,configure.ac: Only put XAPIAN_CXXFLAGS in CXXFLAGS for
the duration of configure (we need it as it may include options to
put the compiler into ISO C++ mode). Set AM_CXXFLAGS to
XAPIAN_CXXFLAGS in Makefile.am. This means that the user can safely
override CXXFLAGS at make-time: "make CXXFLAGS=-Os"
Wed Mar 25 10:56:29 GMT 2009 Olly Betts <olly@survex.com>
* query.cc: Cope with write() not writing all the data or being
interrupted by a signal when writing log entries.
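The general technique behind the query.cc entry above, coping with short writes and EINTR, can be sketched as follows. The helper name write_all() is hypothetical and this is not the code from query.cc.

```cpp
#include <cerrno>
#include <cstddef>
#include <unistd.h>

// Hypothetical helper (not from query.cc): keep calling write() until all
// `len` bytes are written, retrying if interrupted by a signal.
// Returns true on success, false on any other error (errno is set).
bool write_all(int fd, const char* data, std::size_t len) {
    while (len > 0) {
        ssize_t n = write(fd, data, len);
        if (n < 0) {
            if (errno == EINTR) continue;  // interrupted: just retry
            return false;                  // genuine error
        }
        data += n;                         // skip what was written
        len -= static_cast<std::size_t>(n);
    }
    return true;
}
```

A log writer would format the whole entry into a buffer first and then pass it to such a helper, so that a partial write doesn't leave a truncated line in the log.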
Wed Mar 25 10:48:14 GMT 2009 Olly Betts <olly@survex.com>
* configure.ac: Move AC_PROG_CXX and AC_LANG_CPLUSPLUS earlier so that
CXXFLAGS is set before we add XAPIAN_CXXFLAGS to it. With libtool
1.5.x this wasn't an issue, as AC_PROG_CXX was implicitly run early
on. With libtool 2.2.x it is, as AC_PROG_CXX doesn't touch CXXFLAGS
if it is already set, so we don't get "-O2 -g" set for GCC.
Wed Mar 18 06:13:16 GMT 2009 Olly Betts <olly@survex.com>
* scriptindex.cc: Mark "index=nopos" error for removal in 1.3.0
not 1.2.0. Tweak code that produces it to use more literal strings.
Wed Mar 18 06:12:06 GMT 2009 Olly Betts <olly@survex.com>
* docs/scriptindex.rst: The deprecated "index=nopos" is now removed
and gives an error explaining what to use instead, so remove the
documentation saying it is deprecated and what to do.
Mon Mar 16 14:07:58 GMT 2009 Olly Betts <olly@survex.com>
* NEWS: Sync with 1.0.11.
Sat Feb 28 08:31:15 GMT 2009 Olly Betts <olly@survex.com>
* omindex.cc,scriptindex.cc: Use commit() rather than flush().
Sat Feb 28 08:28:26 GMT 2009 Olly Betts <olly@survex.com>
* scriptindex.cc: Don't call reopen() on a WritableDatabase - it
doesn't do anything!
Thu Feb 26 06:38:05 GMT 2009 Olly Betts <olly@survex.com>
* NEWS: Update from ChangeLog.
Thu Feb 26 06:18:05 GMT 2009 Olly Betts <olly@survex.com>
* omindex.cc: Mark "-l" as requiring an argument so that it actually
works - previously it would always result in a segmentation fault.
Thu Feb 26 00:17:56 GMT 2009 Olly Betts <olly@survex.com>
* docs/cgiparams.rst: Note the technique of using a stub database file
to allow a default of searching over multiple databases.
Wed Feb 25 12:39:08 GMT 2009 Olly Betts <olly@survex.com>
* configure.ac: Update g++ version check to match recent change to
xapian-core. Also turn on _FORTIFY_SOURCE and make the rare()
and usual() branch prediction hint macros available.
Mon Feb 23 06:05:25 GMT 2009 Olly Betts <olly@survex.com>
* Makefile.am,docs/overview.rst,omindex.cc,xpsxmlparse.cc,
xpsxmlparse.h: Add support for XPS files (bug#290).
Fri Feb 20 03:25:14 GMT 2009 Olly Betts <olly@survex.com>
* query.cc: Wrap a long comment.
Thu Feb 19 10:34:36 GMT 2009 Olly Betts <olly@survex.com>
* omega.cc,query.cc: Prefer str.resize(0) to str = "".
Thu Feb 19 06:23:34 GMT 2009 Olly Betts <olly@survex.com>
* docs/overview.rst,omindex.cc: Add support for MS Office 2007
formats (bug#290).
Thu Feb 19 04:46:26 GMT 2009 Olly Betts <olly@survex.com>
* metaxmlparse.cc,metaxmlparse.h,xmlparse.cc,xmlparse.h: XmlParser and
MetaXmlParser were overriding opening_tag with the wrong signature so
their implementations weren't ever being used.
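The pitfall behind this fix can be illustrated as follows. The classes and signatures below are hypothetical stand-ins, chosen only to show how a mismatched signature declares a new function instead of overriding the virtual one. The `override` keyword requires C++11 or later, which postdates this entry.

```cpp
#include <string>

// Hypothetical stand-ins for the real parser classes; the signatures are
// assumptions chosen only to illustrate the pitfall.
class BaseParser {
  public:
    virtual ~BaseParser() {}
    // Hook which the parser calls for each opening tag.
    virtual std::string opening_tag(const std::string& tag) {
        return "base:" + tag;
    }
    std::string parse(const std::string& tag) { return opening_tag(tag); }
};

// Wrong: the parameter type differs (non-const reference), so this declares
// a new function which hides the base one rather than overriding it.
class BrokenParser : public BaseParser {
  public:
    virtual std::string opening_tag(std::string& tag) {
        return "broken:" + tag;
    }
};

// Right: identical signature. With C++11 or later, `override` makes the
// compiler reject the BrokenParser mistake.
class FixedParser : public BaseParser {
  public:
    std::string opening_tag(const std::string& tag) override {
        return "fixed:" + tag;
    }
};
```

Calling parse() on a BrokenParser runs the base implementation, which matches the symptom described: the subclass code was never used.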
Fri Jan 09 04:19:32 GMT 2009 Olly Betts <olly@survex.com>
* runfilter.cc: Fix to compile when RLIMIT_AS isn't available (as on
NetBSD and OpenBSD). In this situation, instead use RLIMIT_VMEM or
RLIMIT_DATA if either is available.
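A fallback of this kind is typically selected with the preprocessor. The sketch below assumes the limit being set is a memory cap; the macro MEMORY_LIMIT_RESOURCE and the helper limit_memory() are hypothetical names, not those used in runfilter.cc.

```cpp
#include <sys/types.h>
#include <sys/resource.h>

// Pick the closest available resource for capping memory use.
#if defined RLIMIT_AS
# define MEMORY_LIMIT_RESOURCE RLIMIT_AS
#elif defined RLIMIT_VMEM
# define MEMORY_LIMIT_RESOURCE RLIMIT_VMEM
#elif defined RLIMIT_DATA
# define MEMORY_LIMIT_RESOURCE RLIMIT_DATA
#endif

// Hypothetical helper: set the soft limit to `bytes` (clamped to the hard
// limit). Returns true if a limit was applied, false if setting it failed
// or no suitable resource exists on this platform.
bool limit_memory(rlim_t bytes) {
#ifdef MEMORY_LIMIT_RESOURCE
    struct rlimit lim;
    if (getrlimit(MEMORY_LIMIT_RESOURCE, &lim) != 0) return false;
    // An unprivileged process can't raise the soft limit above the hard one.
    if (lim.rlim_max != RLIM_INFINITY && bytes > lim.rlim_max)
        bytes = lim.rlim_max;
    lim.rlim_cur = bytes;
    return setrlimit(MEMORY_LIMIT_RESOURCE, &lim) == 0;
#else
    (void)bytes;
    return false;
#endif
}
```

In a filter runner this would normally be called in the child process between fork() and exec, so that the limit applies only to the filter program.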
Wed Dec 10 01:06:03 GMT 2008 Olly Betts <olly@survex.com>
* query.cc: Fix poor grammar in comment.
Sat Nov 01 01:49:07 GMT 2008 Olly Betts <olly@survex.com>
* NEWS: Sync with 1.0.9.
Fri Oct 31 18:34:49 GMT 2008 Olly Betts <olly@survex.com>
* configure.ac: Sync warning flag handling changes from xapian-core.
Thu Oct 23 17:08:22 GMT 2008 Olly Betts <olly@survex.com>
* docs/overview.rst: Document HTML parsing a bit, including robots
meta and htdig_noindex.
Sat Oct 18 08:00:24 GMT 2008 Olly Betts <olly@survex.com>
* omega.cc: Catch std::exception and report what its what() method
returns.
Thu Oct 09 10:16:05 GMT 2008 Olly Betts <olly@survex.com>
* configure.ac: Update autoconf requirement to 2.63, libtool to 2.2.6.
Wed Oct 01 04:48:37 GMT 2008 Olly Betts <olly@survex.com>
* scriptindex.cc: Separate Action constructor cases to avoid
pointlessly calling atoi() on an empty string.
Wed Oct 01 03:15:29 GMT 2008 Olly Betts <olly@survex.com>
* omega.cc,omega.h: Remove undocumented and non-functional support for
numeric sorting via: SORT=#<slot>
Thu Sep 04 04:26:22 GMT 2008 Olly Betts <olly@survex.com>
* configure.ac: Set version to 1.1.0.
Thu Sep 04 04:21:12 GMT 2008 Olly Betts <olly@survex.com>
* NEWS: Sync with 1.0.8 and update from ChangeLog.
Wed Sep 03 12:26:58 GMT 2008 Olly Betts <olly@survex.com>
* htmlparse.cc,htmlparse.h,htmlparsetest.cc,myhtmlparse.cc,
myhtmlparse.h,omindex.cc,scriptindex.cc,xmlparse.h: If the character
encoding is specified using <meta http-equiv=...> in an HTML
document then reparse the document if it isn't the encoding we're
already using so that any preceding <title> is converted correctly
(bug#292).
Convert text from meta tag parameters to UTF-8 (bug#293).
Handle <meta charset="..."> (new in HTML 5).
Fix bug in parameter parsing which was probably just a small
performance penalty in real world cases, but could perhaps result in
parsing bogus extra parameters in carefully contrived situations.
Tue Aug 05 09:24:33 GMT 2008 Olly Betts <olly@survex.com>
* docs/: Fix a few typos and improve wording in a few places.
Tue Aug 05 09:19:56 GMT 2008 Olly Betts <olly@survex.com>
* omindex.cc: Tweak to use string::assign() instead of assigning the
result of string::substr().
Tue Jul 29 23:48:31 GMT 2008 Olly Betts <olly@survex.com>
* runfilter.cc: Add missing <signal.h>, noted on FreeBSD by Henrik
Brix Andersen.
Mon Jul 21 12:27:48 GMT 2008 Olly Betts <olly@survex.com>
* commonhelp.cc: Use PACKAGE_BUGREPORT instead of hardcoding the bug
report URL. Remove reference to "bugzilla" as we now use trac
instead.
Mon Jul 21 11:58:25 GMT 2008 Olly Betts <olly@survex.com>
* configure.ac: Put the bug report URL as the third parameter to
AC_INIT. Add proper m4 quoting in a few places (nowhere that
should actually change behaviour). Add hard autotools version
requirements to match xapian-core, and remove the version
justification since HACKING now covers that. Drop docdir workaround
for autoconf < 2.60.
Wed Jul 09 10:44:37 GMT 2008 Olly Betts <olly@survex.com>
* configure.ac: The workaround to avoid probe code for F77, GCJ, and
RC being added to configure is no longer required now that we're
using libtool 2.2 so remove it.
Wed Jul 09 10:13:18 GMT 2008 Olly Betts <olly@survex.com>
* Makefile.am,configure.ac: Use AC_CONFIG_MACRO_DIR and
ACLOCAL_AMFLAGS as libtoolize 2.2.4 recommends.
Fri Jul 04 08:29:47 GMT 2008 Olly Betts <olly@survex.com>
* NEWS: Synchronise with 1.0 branch.
Fri Jul 04 08:15:03 GMT 2008 Olly Betts <olly@survex.com>
* utf8convert.cc,utf8converttest.cc: UTF-16 with no BOM is meant to be
assumed to be big-endian. GNU libiconv doesn't handle some examples
as expected, so disable them when using iconv() for now.
Fri Jul 04 06:39:20 GMT 2008 Olly Betts <olly@survex.com>
* omindex.cc: Handle UCS-2 and UTF-16 text files with a byte-order
mark (BOM). Ignore any UTF-8 "byte-order" mark.
* utf8convert.cc: Handle UCS-2/UTF-16 and explicit BE and LE forms in
the non-iconv code.
* Makefile.am,utf8converttest.cc: Add unit tests of convert_to_utf8().
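The BOM handling described in the entries above might be sketched like this. The enum and function names are hypothetical and this is not the code from omindex.cc or utf8convert.cc; it only shows the byte patterns involved and the big-endian default for UTF-16 data without a BOM.

```cpp
#include <cstddef>
#include <string>

enum TextEncoding { ENC_UNKNOWN, ENC_UTF8, ENC_UTF16BE, ENC_UTF16LE };

// Hypothetical helper: look for a byte-order mark at the start of `text`.
// On return, `bom_len` is the number of leading bytes to skip.
TextEncoding sniff_bom(const std::string& text, std::size_t& bom_len) {
    bom_len = 0;
    if (text.size() >= 3 && text.compare(0, 3, "\xEF\xBB\xBF") == 0) {
        bom_len = 3;  // UTF-8 "BOM": carries no byte order, just skip it
        return ENC_UTF8;
    }
    if (text.size() >= 2) {
        if (text.compare(0, 2, "\xFE\xFF") == 0) {
            bom_len = 2;
            return ENC_UTF16BE;
        }
        if (text.compare(0, 2, "\xFF\xFE") == 0) {
            bom_len = 2;
            return ENC_UTF16LE;
        }
    }
    return ENC_UNKNOWN;
}

// For data already labelled as UTF-16: honour a BOM if there is one,
// otherwise assume big-endian.
TextEncoding utf16_byte_order(const std::string& text, std::size_t& bom_len) {
    TextEncoding enc = sniff_bom(text, bom_len);
    if (enc == ENC_UTF16BE || enc == ENC_UTF16LE) return enc;
    bom_len = 0;
    return ENC_UTF16BE;
}
```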
Fri Jun 27 04:43:18 GMT 2008 Olly Betts <olly@survex.com>
* query.cc: Overhaul the $highlight colour combinations since some
were rather unreadable. Reported by Joey Hess in Debian bug
#484456.
Sun Jun 01 15:12:02 GMT 2008 Olly Betts <olly@survex.com>
* configure.ac: Update version to 1.0.7 to match 1.0 branch.
Sun May 25 14:56:41 GMT 2008 Olly Betts <olly@survex.com>
* NEWS: Synchronise with 1.0 branch, and update from ChangeLog.
Sat May 17 11:42:26 GMT 2008 Olly Betts <olly@survex.com>
* docs/omegascript.rst,docs/scriptindex.rst: Tweak mark-up so
generated HTML gets a non-empty title.
Sat May 10 11:14:20 GMT 2008 Olly Betts <olly@survex.com>
* Makefile.am: omega_CPPFLAGS overrides AM_CPPFLAGS, so we need to
explicitly include AM_CPPFLAGS in omega_CPPFLAGS to get
CONFIGFILE_SYSTEM defined when building omega.
Fri May 09 19:27:21 GMT 2008 Olly Betts <olly@survex.com>
* Makefile.am: Fix handling of any -I options needed for PCRE.
Sun May 04 19:12:08 GMT 2008 Olly Betts <olly@survex.com>
* omindex.cc: Fix comment error regarding catdvi options.
Sat May 03 14:02:02 GMT 2008 Olly Betts <olly@survex.com>
* xapian-omega.spec.in: Remove "www." from xapian.org and
oligarchy.co.uk URLs.
Sat May 03 13:55:35 GMT 2008 Olly Betts <olly@survex.com>
* cgiparam.cc,htdig2omega,mbox2omega,omindex-config.cc: Update FSF
address.
Sat May 03 13:54:25 GMT 2008 Olly Betts <olly@survex.com>
* gnu_getopt.h: Remove old copy of file which is no longer used - we
now share a copy with xapian-core via common/.
Sat May 03 10:42:27 GMT 2008 Olly Betts <olly@survex.com>
* configure.ac: Fix header checks to pre-include <sys/types.h> which
Mac OS X needs for some other headers to work.
Sat May 03 10:41:18 GMT 2008 Olly Betts <olly@survex.com>
* configure.ac: Improve code which prevents probing for f77, etc.
Fri May 02 17:52:44 GMT 2008 Olly Betts <olly@survex.com>
* configure.ac: Fix to fail if --with-iconv is specified and libiconv
isn't, and we aren't using fink on Mac OS X.
Fri May 02 15:55:24 GMT 2008 Richard Boulton <richard@lemurconsulting.com>
* configure.ac: If iconv isn't found, set with_iconv to "no", to
prevent USE_ICONV being set. Was previously only doing this if
fink on OS X was found.
Fri May 02 14:14:07 GMT 2008 Richard Boulton <richard@lemurconsulting.com>
* query.cc: Cast size to unsigned before division to avoid a
warning about signed overflow.
Fri May 02 14:08:39 GMT 2008 Richard Boulton <richard@lemurconsulting.com>
* configure.ac: Synchronise code for working out warning flags used
for builds with that used for xapian-core. Copes with different
formats of version number output by "gcc --version" which should
help to improve output.
Tue Apr 15 23:44:10 GMT 2008 Richard Boulton <richard@lemurconsulting.com>
* query.cc: Catch only the specific error which indicates a need to
repeat a get_termfreq() call on the database instead of the mset.
Sun Apr 13 11:19:49 GMT 2008 Richard Boulton <richard@lemurconsulting.com>
* freemem.h: Specify units of get_free_physical_memory().
Sun Apr 06 09:05:58 GMT 2008 Olly Betts <olly@survex.com>
* freemem.cc: Fix latent compilation error on FreeBSD, pointed out by
Richard Boulton.
Mon Mar 31 02:00:48 GMT 2008 Olly Betts <olly@survex.com>
* configure.ac: Update version to 1.0.6 to match latest release.
Wed Mar 12 07:04:56 GMT 2008 Olly Betts <olly@survex.com>
* scriptindex.cc: Make deprecated "index=nopos" an error.
Mon Mar 10 03:37:30 GMT 2008 Olly Betts <olly@survex.com>
* Makefile.am,diritor.cc,diritor.h,omindex.cc: Check for readdir()
failing.
Thu Mar 06 23:43:11 GMT 2008 Olly Betts <olly@survex.com>
* common/: Update to latest revisions.
* Makefile.am,diritor.h: Use safedirent.h not dirent.h and build
msvc_dirent.cc as part of omindex.
Wed Mar 05 23:16:23 GMT 2008 Olly Betts <olly@survex.com>
* NEWS: Update to HEAD with un-backported changes kept separate.
Wed Mar 05 19:05:12 GMT 2008 Olly Betts <olly@survex.com>
* NEWS: Update to 1.0 branch point.
Sat Feb 02 22:46:40 GMT 2008 Olly Betts <olly@survex.com>
* query.cc: Add (C) notice for Thomas Viehmann.
Sat Feb 02 22:46:14 GMT 2008 Olly Betts <olly@survex.com>
* omindex.cc: Back out random change committed by accident.
Sat Feb 02 21:23:07 GMT 2008 Olly Betts <olly@survex.com>
* omindex.cc,query.cc: New OmegaScript commands $addfilter, $lower,
$upper.
* docs/omegascript.rst: Document. Improve formatting.
Fri Feb 01 01:45:26 GMT 2008 Olly Betts <olly@survex.com>
* INSTALL: PCRE required.
* docs/omegascript.rst: $transform{} now enabled. Fixes bug#231.
Fri Feb 01 01:35:58 GMT 2008 Olly Betts <olly@survex.com>
* Makefile.am,configure.ac,query.cc: Add PCRE as a requirement and
add $transform{} command (which has been in the code for ages but
disabled).
Sat Jan 19 02:01:02 GMT 2008 Olly Betts <olly@survex.com>
* omindex.cc: Add support for DjVu files.
* docs/overview.rst: Document.
Sat Jan 12 03:37:28 GMT 2008 Olly Betts <olly@survex.com>
* freemem.cc: Check "defined HAVE_SYSMP" rather than just "HAVE_SYSMP".
This doesn't change behaviour, but fixes a compile warning on
platforms other than Linux and IRIX.
Fri Dec 21 02:13:49 GMT 2007 Olly Betts <olly@survex.com>
* NEWS: Bump release date.
Thu Dec 20 21:40:34 GMT 2007 Olly Betts <olly@survex.com>
* NEWS: Another update for 1.0.5.
Thu Dec 20 20:08:58 GMT 2007 Olly Betts <olly@survex.com>
* Makefile.am,scriptindex.cc: Fix scriptindex to insert a ':' between
prefix and term using the same criteria which the QueryParser does.
* scriptindex.cc,docs/scriptindex.rst: Action BOOLEAN now ignores an
empty input rather than adding the prefix as a term. Action UNIQUE
now issues a warning for empty input but otherwise ignores it.
Thu Dec 20 17:44:57 GMT 2007 Olly Betts <olly@survex.com>
* common/: Update to r9894 to pick up stringutils.cc.
Wed Dec 19 03:44:50 GMT 2007 Olly Betts <olly@survex.com>
* NEWS,configure.ac: Update for 1.0.5.
Tue Dec 18 00:58:07 GMT 2007 Olly Betts <olly@survex.com>
* NEWS: Update.
Thu Dec 13 01:38:43 GMT 2007 Olly Betts <olly@survex.com>
* omindex.cc: Avoid rereading uncompressed AbiWord documents in order
to calculate their MD5 checksums.
Thu Dec 13 01:34:53 GMT 2007 Olly Betts <olly@survex.com>
* omindex.cc: Improve comment wording.
Thu Dec 13 00:59:35 GMT 2007 Olly Betts <olly@survex.com>
* docs/overview.rst: Document that omindex limits resources that
filter programs can use. Also add a note welcoming suggestions
for additional reliable filter programs.
Wed Dec 12 23:49:27 GMT 2007 Olly Betts <olly@survex.com>
* Makefile.am,freemem.cc,freemem.h,runfilter.cc: Limit filter programs
to 7/8 of free physical memory on platforms where we know how to
determine this (currently at least Linux, FreeBSD, IRIX, HP-UX;
probably Solaris and a few others too). Fixes bug#111.
Wed Dec 12 18:20:34 GMT 2007 Olly Betts <olly@survex.com>
* docs/termprefixes.rst: Note the version where we stopped generating
terms with a 'W' prefix (0.9.7).
Wed Dec 12 18:17:28 GMT 2007 Olly Betts <olly@survex.com>
* docs/overview.rst: omindex hasn't generated "W"-prefix terms since
0.9.7, so remove the documentation saying it does!
Wed Dec 12 18:16:52 GMT 2007 Olly Betts <olly@survex.com>
* docs/overview.rst: Update to mention how upper case in extensions is
handled.
Wed Dec 12 17:49:12 GMT 2007 Olly Betts <olly@survex.com>
* omindex.cc: If an extension isn't found in the mime_map and contains
uppercase ASCII characters, see if the lower cased extension is in
the mime_map.
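A minimal sketch of this two-step lookup, assuming mime_map is a std::map from extension to MIME type (the real container and the helper name lookup_mime_type() are assumptions):

```cpp
#include <map>
#include <string>

// Hypothetical lookup helper; omindex's real mime_map and surrounding code
// may differ. Returns the MIME type for `ext`, or "" if it isn't known.
std::string lookup_mime_type(const std::map<std::string, std::string>& mime_map,
                             const std::string& ext) {
    // Try the extension exactly as given first.
    std::map<std::string, std::string>::const_iterator it = mime_map.find(ext);
    if (it != mime_map.end()) return it->second;

    // Lower case any uppercase ASCII letters, noting if anything changed.
    std::string lower(ext);
    bool changed = false;
    for (std::string::size_type i = 0; i < lower.size(); ++i) {
        char ch = lower[i];
        if (ch >= 'A' && ch <= 'Z') {
            lower[i] = static_cast<char>(ch - 'A' + 'a');
            changed = true;
        }
    }
    // Only retry if the lower cased form is actually different.
    if (changed) {
        it = mime_map.find(lower);
        if (it != mime_map.end()) return it->second;
    }
    return std::string();
}
```

Trying the exact form first means that an entry registered with uppercase letters still takes priority over its lower cased counterpart.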
Wed Dec 12 02:09:02 GMT 2007 Olly Betts <olly@survex.com>
* NEWS: Updated from ChangeLog in preparation for 1.0.5.
Mon Dec 10 23:27:40 GMT 2007 Olly Betts <olly@survex.com>
* omindex.cc: '-f' is documented by --help as a short option for
'--follow', but wasn't previously actually recognised.
Tue Nov 20 13:08:19 GMT 2007 Olly Betts <olly@survex.com>
* htmlparse.cc: Add "using namespace std;" to ensure that
std::strchr(), etc are imported into the global namespace.
Tue Nov 20 01:01:13 GMT 2007 Richard Boulton <richard@lemurconsulting.com>
* commonhelp.cc,diritor.cc,htmlparse.cc,omega.cc,scriptindex.cc:
Add #include of cstring, to fix errors from gcc-4.3 snapshot.
Tidy include ordering in htmlparse.cc.
Tue Nov 06 12:17:10 GMT 2007 Olly Betts <olly@survex.com>
* docs/Makefile.am: No need to set SUFFIXES manually for suffixes used
in implicit rules.
Mon Nov 05 19:32:41 GMT 2007 Olly Betts <olly@survex.com>
* configure.ac: Probe for rst2html.
Mon Nov 05 07:24:31 GMT 2007 Olly Betts <olly@survex.com>
* Makefile.am,README,configure.ac,docs/,query.cc: Replace .txt docs
with Jenny's RST-ified versions.
Tue Oct 30 04:54:58 GMT 2007 Olly Betts <olly@survex.com>
* NEWS,configure.ac: Update for 1.0.4.
Sat Oct 27 05:32:06 BST 2007 Olly Betts <olly@survex.com>
* NEWS: Update.
Sat Oct 27 05:30:28 BST 2007 Olly Betts <olly@survex.com>
* query.cc: On balance, it's more helpful to users to moan about a
template which tries to set the same user prefix as both boolean
and probabilistic, even if previous releases didn't.
Thu Oct 25 20:38:15 BST 2007 Olly Betts <olly@survex.com>
* common/: Update to latest version.
* query.cc: Remove STRINGIZE macro definition as this is now
defined by stringutils.h.
Fri Oct 19 16:17:47 BST 2007 Olly Betts <olly@survex.com>
* query.cc: Fix for reverted add_prefix() API.
Sun Sep 30 22:12:46 BST 2007 Richard Boulton <richard@lemurconsulting.com>
* query.cc: Use the new form of add_prefix() to avoid deprecation
warnings at compile time. Carefully avoid calling
add_prefix(f,p,PREFIX_FILTER) for a prefix which has already been
set with add_prefix(f,p,PREFIX_INLINE), because this would cause
an error (and we wish to avoid changing semantics of omegascript
to avoid breaking existing scripts).
* NEWS: Update.
Fri Sep 28 15:48:50 BST 2007 Olly Betts <olly@survex.com>
* NEWS: Final (?) update for 1.0.3.
Fri Sep 28 15:46:11 BST 2007 Olly Betts <olly@survex.com>
* mbox2omega: Expand --help output.
* docs/scriptindex.txt: Refer to mbox2omega as an example of how to
use scriptindex.
Fri Sep 28 03:18:25 BST 2007 Olly Betts <olly@survex.com>
* NEWS: Update.
Fri Sep 28 03:15:11 BST 2007 Olly Betts <olly@survex.com>
* configure.ac: Update for 1.0.3. Use ustar format for tarball since
we have to for xapian-core anyway.
Fri Sep 28 02:42:28 BST 2007 Olly Betts <olly@survex.com>
* ./: Update common SVN rev in svn:externals so the files are in
sync with xapian-core.
Wed Sep 19 16:09:36 BST 2007 Olly Betts <olly@survex.com>
* NEWS: Update from ChangeLog entries since 1.0.2.
Sat Sep 08 19:24:48 BST 2007 Olly Betts <olly@survex.com>
* configure.ac,runfilter.cc: Impose a 5 minute CPU time limit on
filter programs to prevent problems if a filter program goes into
an infinite loop on a malformed input. Partly addresses bug#111.
Fri Sep 07 21:22:43 BST 2007 Olly Betts <olly@survex.com>
* omindex.cc: Fix comment typos.
Fri Sep 07 20:56:50 BST 2007 Olly Betts <olly@survex.com>
* docs/overview.txt,omindex.cc: Add support for indexing TeX DVI
files.
Thu Sep 06 20:59:57 BST 2007 Olly Betts <olly@survex.com>
* query.cc: Fix bug in decimal fraction in $size for files >= 1M in
size.
Thu Sep 06 20:13:44 BST 2007 Olly Betts <olly@survex.com>
* templates/query: Set HTML charset to utf-8 since that's what
databases now are by default. Tidy up some HTML gremlins.
Restyle to use CSS to draw a "score bar" instead of using
images. Rework the layout of each hit. Add popup hints on
mouse-over for various items.
Thu Sep 06 18:12:07 BST 2007 Olly Betts <olly@survex.com>
* scriptindex.cc: Fix line number tracking in dump files.
Thu Sep 06 18:06:28 BST 2007 Olly Betts <olly@survex.com>
* docs/omegascript.txt,query.cc: Add $muldiv{A,B,C} which calculates
int(A*B/C).
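As a rough C++ equivalent of the calculation (not the OmegaScript implementation): the sketch below assumes a 64-bit intermediate so that A*B can't overflow a 32-bit int, and reports C == 0 by throwing. How the real command handles either case is not stated here.

```cpp
#include <stdexcept>

// Hypothetical C++ equivalent of $muldiv{A,B,C}: int(A*B/C).
// Assumptions: the product is formed in 64 bits to avoid overflow, and a
// zero divisor throws; the real command may behave differently.
int muldiv(int a, int b, int c) {
    if (c == 0) throw std::invalid_argument("muldiv: division by zero");
    long long product = static_cast<long long>(a) * b;
    return static_cast<int>(product / c);  // truncates towards zero
}
```

For example, muldiv(7, 100, 8) scales 7/8 to a percentage and truncates 87.5 to 87.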
Thu Sep 06 03:36:36 BST 2007 Olly Betts <olly@survex.com>
* runfilter.cc: Fix file description.
Thu Sep 06 00:54:58 BST 2007 Olly Betts <olly@survex.com>
* Makefile.am,omindex.cc,runfilter.cc,runfilter.h: Factor out the
stdout_to_string() function into its own source file.
Thu Sep 06 00:45:14 BST 2007 Olly Betts <olly@survex.com>
* cgiparam.h,commonhelp.h,date.h,hashterm.h,htmlparse.h,loadfile.h,
md5wrap.h,metaxmlparse.h,myhtmlparse.h,namedentities.h,omega.h,
sample.h,utf8convert.h,utf8truncate.h,xmlparse.h: Add missing header
guards and standardise existing header guards to use the form
OMEGA_INCLUDED_FOO_H.
Thu Sep 06 00:24:54 BST 2007 Olly Betts <olly@survex.com>
* myhtmlparse.cc: Add '#include <config.h>'.
* omega.h: Don't '#include <config.h>'.
Mon Sep 03 19:16:37 BST 2007 Olly Betts <olly@survex.com>
* docs/overview.txt,omindex.cc: Add support for indexing AbiWord
documents.
Thu Jul 05 00:37:35 BST 2007 Olly Betts <olly@survex.com>
* NEWS: Final (?) update for 1.0.2.
Thu Jul 05 00:33:14 BST 2007 Olly Betts <olly@survex.com>
* omindex.cc: Report files we aren't indexing because their extensions
aren't recognised.
Wed Jul 04 21:22:02 BST 2007 Richard Boulton <richard@lemurconsulting.com>
* NEWS: Update with release date for release 1.0.2.
Wed Jul 04 20:43:22 BST 2007 Richard Boulton <richard@lemurconsulting.com>
* configure.ac: Bump version to 1.0.2.
Wed Jul 04 17:34:15 BST 2007 Olly Betts <olly@survex.com>
* NEWS: Update.
Wed Jul 04 17:31:38 BST 2007 Olly Betts <olly@survex.com>
* Makefile.am,omindex.cc,query.cc: Use stringutils.h from common.
* ./: Update common SVN rev in svn:externals to get the latest
stringutils.h.
* cgiparam.cc: Use string::resize() rather than assigning from a
substring of the string.
Mon Jul 02 16:42:01 BST 2007 Richard Boulton <richard@lemurconsulting.com>
* htmlparsetest.cc,md5test.cc: Add #include <stdlib.h>, to get a
definition for exit(). Fixes compilation with gcc-snapshot.
Thu Jun 28 18:05:18 BST 2007 Olly Betts <olly@survex.com>
* omindex.cc: If --url isn't passed, default to "/", but print a
warning noting that this default has been used (at least for now).
Thu Jun 28 18:04:53 BST 2007 Olly Betts <olly@survex.com>
* docs/scriptindex.txt: Fix typo.
Wed Jun 27 15:44:30 BST 2007 Richard Boulton <richard@lemurconsulting.com>
* NEWS: Remove the items which aren't really interesting to users.
Wed Jun 27 14:26:26 BST 2007 Richard Boulton <richard@lemurconsulting.com>
* common/: Update svn:externals property to use latest version.
* NEWS: Updated.
Sat Jun 23 13:11:15 BST 2007 Olly Betts <olly@survex.com>
* diritor.h: Delete random extra blank line.
Sat Jun 23 13:08:35 BST 2007 Olly Betts <olly@survex.com>
* omega.cc,query.cc: Use Xapian::BAD_VALUENO.
Sat Jun 16 11:06:08 BST 2007 Richard Boulton <richard@lemurconsulting.com>
* Makefile.am: Pass value of XAPIAN_CONFIG to distcheck, to ensure
that it works with uninstalled copies of Xapian.
Mon Jun 11 03:34:53 BST 2007 Olly Betts <olly@survex.com>
* NEWS: Minor wording improvement.
Mon Jun 11 03:33:37 BST 2007 Olly Betts <olly@survex.com>
* NEWS: Probably the final update for 1.0.1.
Sun Jun 10 22:00:23 BST 2007 Olly Betts <olly@survex.com>
* configure.ac: Drop automake requirement to 1.8.3 to allow RPM spec
file to work on SLES 9.
Sun Jun 10 21:49:45 BST 2007 Olly Betts <olly@survex.com>
* configure.ac: Bump version to 1.0.1.
Sun Jun 10 02:16:54 BST 2007 Olly Betts <olly@survex.com>
* NEWS: Updated.
Sat Jun 09 15:20:25 BST 2007 Olly Betts <olly@survex.com>
* Makefile.am,diritor.cc,diritor.h,omindex.cc: Under Linux (at least)
struct dirent can tell us the type of a directory entry for some
filing systems, so make use of this to avoid calling stat() (or
lstat()) unnecessarily - when indexing /usr/share/doc on my Linux
box, this saves about 14000 explicit calls to stat (leaving about
7000).
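The technique can be sketched as below. This is not the diritor.cc code: the enum and helper names are hypothetical, and the sketch never follows symbolic links. The d_type field is not available on every platform, so its use is guarded and lstat() remains the fallback.

```cpp
#include <sys/types.h>
#include <sys/stat.h>
#include <dirent.h>
#include <string>

enum EntryType { ENTRY_FILE, ENTRY_DIRECTORY, ENTRY_OTHER };

// Hypothetical helper: classify an entry which readdir() returned for the
// directory `dir`, only calling lstat() if the filing system didn't report
// the type.
EntryType entry_type(const std::string& dir, const struct dirent* entry) {
#ifdef DT_UNKNOWN
    switch (entry->d_type) {
        case DT_REG: return ENTRY_FILE;
        case DT_DIR: return ENTRY_DIRECTORY;
        case DT_UNKNOWN: break;  // type not reported: fall back to lstat()
        default: return ENTRY_OTHER;
    }
#endif
    struct stat st;
    std::string path = dir + "/" + entry->d_name;
    if (lstat(path.c_str(), &st) != 0) return ENTRY_OTHER;
    if (S_ISREG(st.st_mode)) return ENTRY_FILE;
    if (S_ISDIR(st.st_mode)) return ENTRY_DIRECTORY;
    return ENTRY_OTHER;
}
```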
Thu Jun 07 01:40:43 BST 2007 Olly Betts <olly@survex.com>
* NEWS: Update.
Wed Jun 06 15:45:33 BST 2007 Olly Betts <olly@survex.com>
* docs/scriptindex.txt: Document that you can delete a document by
giving a new document which only contains the unique term.
Mon Jun 04 16:40:18 BST 2007 Richard Boulton <richard@lemurconsulting.com>
* Makefile.am: Only add manpages to dist_man_MANS if we're not in
maintainer mode with documentation generation turned off.
Thu May 31 20:02:16 BST 2007 Olly Betts <olly@survex.com>
* NEWS: Update.
Thu May 31 19:16:37 BST 2007 Olly Betts <olly@survex.com>
* configure.ac: Relax automake requirement to 1.9.2 to allow RPM
building on RHEL 4.
Wed May 30 14:42:40 BST 2007 Olly Betts <olly@survex.com>
* NEWS: Update for changes since 1.0.0. Removed unused subheading
in 1.0.0 changes.
Wed May 30 10:24:57 BST 2007 Olly Betts <olly@survex.com>
* query.cc: Fix handling of query parsing errors (broken by changes in
1.0.0).
Tue May 29 01:19:21 BST 2007 Olly Betts <olly@survex.com>
* docs/overview.txt: We no longer use pstotext for PostScript, but
instead use ps2pdf followed by pdftotext, so update the docs to
reflect this.
Fri May 18 03:36:28 BST 2007 Olly Betts <olly@survex.com>
* htmlparsetest.cc,myhtmlparse.cc: Fix bug in HTML parser - if the
text between tags consisted entirely of whitespace it would just be
ignored which could run words together. Add regression test, plus
another test for other whitespace handling.
Thu May 17 22:27:47 BST 2007 Olly Betts <olly@survex.com>
* NEWS: Final update before release.
Thu May 17 20:48:25 BST 2007 Olly Betts <olly@survex.com>
* NEWS: Update.
Thu May 17 20:46:43 BST 2007 Olly Betts <olly@survex.com>
* docs/termprefixes.txt: Update to include 'Z' prefix and mention
that 'R' and 'W' aren't used by Xapian now.
Thu May 17 19:11:04 BST 2007 Olly Betts <olly@survex.com>
* configure.ac: Bump version to 1.0.0.
Thu May 17 18:11:19 BST 2007 Olly Betts <olly@survex.com>
* common/: Update to latest xapian-core revision to pull in 2 argument
mkdir() wrapper for Mingw.
Thu May 17 03:29:44 BST 2007 Olly Betts <olly@survex.com>
* Makefile.am,configure.ac: Add support for --disable-documentation
like xapian-core now has.
* configure.ac: Only enable -Werror on --enable-maintainer-mode for
GCC 4 or newer, in line with change in xapian-core.
Thu May 17 03:22:10 BST 2007 Olly Betts <olly@survex.com>
* NEWS: Update for 1.0.0.
Wed May 16 03:09:44 BST 2007 Olly Betts <olly@survex.com>
* TODO: Update.
Tue May 15 18:50:47 BST 2007 Olly Betts <olly@survex.com>
* configure.ac: Add AC_TYPE_PID_T.
Tue May 15 04:22:40 BST 2007 Olly Betts <olly@survex.com>
* omindex.cc: Remove FIXME comment which has already been addressed.
Mon May 14 04:38:49 BST 2007 Olly Betts <olly@survex.com>
* docs/omegascript.txt: Update docs for $prettyterm{TERM}.
Mon May 14 04:31:01 BST 2007 Olly Betts <olly@survex.com>
* omega.cc,omega.h,query.cc,query.h: Rejig how $topterms and other
cases handle terms to fit with the new term generation scheme.
Add 'you' and 'your' as stopwords.
Thu May 10 04:48:43 BST 2007 Olly Betts <olly@survex.com>
* ./: Update svn:externals to pull in r8538 of xapian-core's common
subdirectory.
* Makefile.am: Add common/safe.cc to scriptindex_SOURCES.
Thu May 10 01:09:14 BST 2007 Olly Betts <olly@survex.com>
* templates/,Makefile.am: The 'query' template no longer uses
$topterms by default - to get them, use the new 'topterms' template.
Also the template fragments which aren't intended for direct use
have been moved to templates/inc/.
* docs/overview.txt: Document what each of the OmegaScript templates
does.
* docs/quickstart.txt: Assorted minor improvements.
* xapian-omega.spec.in: Update to install templates/inc too.
Wed May 09 23:43:57 BST 2007 Olly Betts <olly@survex.com>
* docs/omegascript.txt,query.cc: Instead of appending a dot to
indicate a stemmed term, wrap the term in double quotes.
Sun May 06 21:41:21 BST 2007 Olly Betts <olly@survex.com>
* omindex.cc,scriptindex.cc: Removed commented out code for generating
"W" prefix terms for date searching. We've never made use of them
in Omega, and we'll be moving to using DateMatchDecider by default
eventually anyway.
Sun May 06 16:00:47 BST 2007 Olly Betts <olly@survex.com>
* configure.ac: Set version to mythical 0.9.99.
Sun May 06 15:52:08 BST 2007 Olly Betts <olly@survex.com>
* Makefile.am,configure.ac,omega.spec.in,xapian-omega.spec.in:
Update RPM spec file to reflect tarball name change from omega
to xapian-omega (patch from Fabrice Colin). Also rename omega.spec
to xapian-omega.spec (rpmbuild looks for any .spec file, but it's
more consistent to keep the names in step).
Fri May 04 19:52:44 BST 2007 Olly Betts <olly@survex.com>
* omindex.cc,scriptindex.cc: Use new TermGenerator convenience methods
which take std::string instead of Utf8Iterator.
Fri May 04 13:32:11 BST 2007 Olly Betts <olly@survex.com>
* Makefile.am,configure.ac,makemanpage.in: Use makemanpage to generate
manpages.
Fri May 04 13:30:36 BST 2007 Olly Betts <olly@survex.com>
* commonhelp.cc: Add missing full stop in description of --stemmer.
Fri May 04 04:10:23 BST 2007 Olly Betts <olly@survex.com>
* query.cc: Explicitly include stdlib.h since we use atoi().
Thu May 03 15:16:31 BST 2007 Olly Betts <olly@survex.com>
* Makefile.am,indextext.cc,indextext.h,omindex.cc,scriptindex.cc:
Update to use new TermGenerator class.
Thu May 03 04:03:35 BST 2007 Olly Betts <olly@survex.com>
* ./: Update svn:externals to pull rev8430 of xapian-core's common
subdirectory.
* scriptindex.cc: Remove sleep() wrapper.
Wed May 02 03:26:38 BST 2007 Olly Betts <olly@survex.com>
* docs/omegascript.txt,query.cc: Removed $freqs as it has been
deprecated for ages.
Wed May 02 03:19:18 BST 2007 Olly Betts <olly@survex.com>
* docs/scriptindex.txt: Explicitly note that index=nopos is deprecated
(scriptindex already emits a warning).
Wed May 02 03:17:03 BST 2007 Olly Betts <olly@survex.com>
* docs/cgiparams.txt: FMT isn't limited to just `a-z' - the
actual restriction is that it may not contain `..'.
Wed May 02 03:02:53 BST 2007 Olly Betts <olly@survex.com>
* scriptindex.cc: Remove -q and -u options - they no longer do
anything and are only accepted for compatibility with really old
versions (0.6.1 and earlier and 0.7.5 and earlier respectively).
Wed Apr 25 21:47:48 BST 2007 Olly Betts <olly@survex.com>
* Makefile.am: omega doesn't need indextext.cc.
Wed Apr 25 21:46:25 BST 2007 Olly Betts <olly@survex.com>
* query.cc: Remove unused `#include "indextext.h"'.
Wed Apr 25 02:37:15 BST 2007 Olly Betts <olly@survex.com>
* Makefile.am,configure.ac: Add support like xapian-core has for
`configure --enable-quiet', `make QUIET=' and `make QUIET=y'.
Mon Apr 23 15:42:24 BST 2007 Olly Betts <olly@survex.com>
* date.cc,datematchdecider.cc,utils.cc: Fix compilation with GCC 4.3
snapshot.
Mon Apr 23 15:38:00 BST 2007 Olly Betts <olly@survex.com>
* portability/mkdtemp.cc: config.h should always be included first and
with angle brackets. Use safeerrno.h not errno.h. No special
headers are required here for __CYGWIN__, and safesysstat.h provides
a two argument wrapper for mkdir, so we don't need any
__WIN32__-specific magic either.
Mon Apr 23 12:14:01 BST 2007 Richard Boulton <richard@lemurconsulting.com>
* portability/mkdtemp.cc: Patch from Charlie Hull to fix Windows
compilation.
* scriptindex.cc: #include <time.h> in scriptindex.cc for
localtime().
Sat Apr 21 23:31:02 BST 2007 Olly Betts <olly@survex.com>
* strcasecmp.h: New header containing magic to provide strcasecmp()
and strncasecmp().
* query.cc,utf8convert.cc: Use strcasecmp.h.
* Makefile.am,cdb_init.cc,cdb_int.h,configfile.cc,getopt.cc,
loadfile.cc,md5wrap.cc,omega.cc,omindex-config.cc,omindex.cc,
query.cc,scriptindex.cc,utf8convert.cc: Add xapian-core's common/
subdirectory as an svn:external so we can (a) share copies of
gnu_getopt.h and getopt.cc and (b) make use of the "safeunistd.h"
and friends.
Sat Apr 21 23:06:49 BST 2007 Olly Betts <olly@survex.com>
* metaxmlparse.cc,metaxmlparse.h: Fix summary comments at the top of
these two files.
Sat Apr 21 20:42:03 BST 2007 Olly Betts <olly@survex.com>
* omindex.cc: xapian.h no longer pulls in time.h, which exposes that
we weren't explicitly including it here!
Sat Apr 21 20:27:43 BST 2007 Olly Betts <olly@survex.com>
* configure.ac: We require automake 1.9.5 for xapian-core, so require
it here too for consistency. Turn on automake -Wportability option.
Sat Apr 21 20:24:17 BST 2007 Olly Betts <olly@survex.com>
* configure.ac: Probe for ssize_t and mode_t and define replacements
if we don't find them.
Fri Apr 20 14:38:57 BST 2007 Olly Betts <olly@survex.com>
* datematchdecider.h,omega.h,datematchdecider.cc: Update return
types of MatchDecider and ExpandDecider subclasses.
Wed Apr 18 23:44:36 BST 2007 Olly Betts <olly@survex.com>
* utf8convert.cc: Fix to compile when USE_ICONV isn't defined (to_utf8
is now in the Xapian::Unicode namespace).
Wed Apr 18 23:15:26 BST 2007 Olly Betts <olly@survex.com>
* docs/cgiparams.txt,query.cc: Remove "bias_weight" and
"bias_halflife" CGI parameters since they rely on
Enquire::set_bias() which has been removed.
Tue Apr 17 21:45:40 BST 2007 Richard Boulton <richard@lemurconsulting.com>
* Makefile.am: Link htmlparsetest with Xapian library to get access
to nonascii_to_utf8.
Tue Apr 17 02:22:42 BST 2007 Olly Betts <olly@survex.com>
* htmlparse.cc: nonascii_to_utf8 is now in the public API.
Tue Apr 17 00:55:17 BST 2007 Olly Betts <olly@survex.com>
* Makefile.am,htmlparse.cc,indextext.cc,indextext.h,query.cc,sample.cc,
scriptindex.cc,tclUniData.cc,tclUniData.h,utf8convert.cc,utf8itor.cc,
utf8itor.h,utf8test.cc: Use the new Unicode API routines in the core
Xapian library instead of local copies.
Thu Apr 12 17:04:07 BST 2007 Olly Betts <olly@survex.com>
* Makefile.am: omega and scriptindex both need tclUniData.cc.
Sat Mar 31 19:58:29 BST 2007 Olly Betts <olly@survex.com>
* query.cc: $filesize{0} is now "0 bytes", $filesize{1} is now "1
byte", $filesize{SIZE} where SIZE is negative is now "". Fix
"comparison of signed and unsigned" warning. Use "%c" to generate
the fractional part.
* docs/omegascript.txt: Document that $filesize{SIZE} is "" when SIZE
is negative.
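The documented behaviour of $filesize can be approximated in C++ as follows. This is a sketch, not the query.cc code: the 1024-based units, the cut-off points between units, and truncating (rather than rounding) the fractional digit are all assumptions.

```cpp
#include <string>

// Hypothetical C++ approximation of $filesize{SIZE}.
// Assumptions: units are powers of 1024 and the single fractional digit is
// truncated, not rounded.
std::string filesize(long long size) {
    if (size < 0) return std::string();  // negative SIZE gives ""
    if (size == 0) return "0 bytes";
    if (size == 1) return "1 byte";
    if (size < 1024) return std::to_string(size) + " bytes";

    const char units[] = { 'K', 'M', 'G' };
    int u = 0;
    long long unit = 1024;
    // Move up to the largest unit which doesn't exceed size (at most G).
    while (u < 2 && size >= unit * 1024) {
        unit *= 1024;
        ++u;
    }
    long long whole = size / unit;
    long long tenth = (size % unit) * 10 / unit;
    return std::to_string(whole) + "." + std::to_string(tenth) + units[u];
}
```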
Sat Mar 31 18:25:55 BST 2007 Olly Betts <olly@survex.com>
* query.cc: Ensure that the result of snprintf is zero terminated
since MSVC's snprintf is broken (by design it seems).
* query.cc,docs/omegascript.txt: $filesize enhanced to return a
decimal point for K, M, and G (e.g. "2.1K" and "4.0M" rather than
"2K" and "4M").
Fri Mar 30 19:57:00 BST 2007 Olly Betts <olly@survex.com>
* portability/mkdtemp.cc: Fixes for mingw.
Fri Mar 30 02:22:59 BST 2007 Olly Betts <olly@survex.com>
* Makefile.am,scriptindex.cc,utf8truncate.cc,utf8truncate.h: The
"truncate" action now knows not to chop off a multibyte utf-8
character.
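As an illustration of the truncation rule in the entry above, here is a minimal sketch. It is not the code in utf8truncate.cc; the function name utf8_truncate_sketch is hypothetical and the input is assumed to be valid UTF-8. The idea is to back up from the cut point while it lands on a continuation byte (10xxxxxx).

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not the code in utf8truncate.cc): cut s to at most
// max_bytes bytes without splitting a multibyte UTF-8 sequence.
// Assumes s is valid UTF-8.
std::string utf8_truncate_sketch(const std::string& s, std::size_t max_bytes) {
    if (s.size() <= max_bytes) return s;
    std::size_t end = max_bytes;
    // Continuation bytes have the form 10xxxxxx. If the first byte we would
    // drop is a continuation byte, the cut is inside a character, so back up
    // until the cut falls on a character boundary.
    while (end > 0 && (static_cast<unsigned char>(s[end]) & 0xC0) == 0x80) {
        --end;
    }
    return s.substr(0, end);
}
```

With this sketch, truncating "h\xC3\xA9llo" to 2 bytes keeps only "h", because byte 2 is the second half of the two-byte character.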
Fri Mar 30 02:19:05 BST 2007 Olly Betts <olly@survex.com>
* Makefile.am,omindex.cc,sample.cc,sample.h: New sample generating
function which normalises all runs of whitespace to a single space,
and fixes invalid utf-8 in the sample. This means we can now index
an iso-8859-1 text file and mostly get the same results as if it
were utf-8!
Thu Mar 29 23:12:20 BST 2007 Olly Betts <olly@survex.com>
* scriptindex.cc: Fix optimisation of "load truncate=N" to actually
work!
Thu Mar 29 18:54:11 BST 2007 Olly Betts <olly@survex.com>
* configure.ac: Probe for mkdtemp.
* Makefile.am: Add portability/mkdtemp.cc to omindex_SOURCES if
configure didn't detect it.
* omindex.cc: Prototype mkdtemp if configure didn't detect it.
Thu Mar 29 18:47:50 BST 2007 Olly Betts <olly@survex.com>
* portability/mkdtemp.cc: Fix to compile as C++. Replace isdigit()
with a simple range test to avoid locale related quirks.
Thu Mar 29 18:28:25 BST 2007 Olly Betts <olly@survex.com>
* portability/mkdtemp.cc: Add portable implementation of mkdtemp for
use on platforms which don't supply it.
Thu Mar 29 17:22:18 BST 2007 Olly Betts <olly@survex.com>
* omindex.cc: Index PostScript by converting to PDF with ps2pdf and
then indexing that. This allows us to index PostScript files
containing Unicode characters outside of iso-8859-1, and also
means we now get metadata from PostScript files.
Thu Mar 29 03:14:55 BST 2007 Olly Betts <olly@survex.com>
* omega.spec.in: Update to handle documentation being installed in
$prefix/share/doc/xapian-omega.
Tue Mar 27 21:42:19 BST 2007 Olly Betts <olly@survex.com>
* configure.ac: datarootdir is new in 2.60 too, so use datadir when
setting docdir for 2.59.
Mon Mar 26 15:47:53 BST 2007 Olly Betts <olly@survex.com>
* configure.ac: Add code to ensure that docdir is set for autoconf
2.59 (starting from 2.60, it is defined as standard).
* Makefile.am: Use docdir for installing docs. This means that the
documentation now goes in $prefix/share/doc/xapian-omega rather
than $prefix/share/doc/omega, which is better really.
Sat Mar 24 17:21:32 GMT 2007 Olly Betts <olly@survex.com>
* query.cc: Prefer static char[] to static char * (gives better
generated code).
Sat Mar 24 17:19:18 GMT 2007 Olly Betts <olly@survex.com>
* omega.cc: Prefer static char[] to static char * (gives better
generated code).
Sat Mar 24 17:16:49 GMT 2007 Olly Betts <olly@survex.com>
* configfile.cc: Prefer static char[] to static char * (gives better
generated code).
Thu Mar 22 01:11:52 GMT 2007 Olly Betts <olly@survex.com>
* configure.ac: Eliminate libtool probe code for f77, gcj, and rc
which speeds up configure and knocks 29% off its size.
Tue Mar 06 01:56:00 GMT 2007 Olly Betts <olly@survex.com>
* configure.ac: Bump version number to 0.9.10 so that snapshots don't
look older than releases.
Sun Mar 04 14:42:18 GMT 2007 Olly Betts <olly@survex.com>
* TODO: Remove entries which have already been done!
Sat Mar 03 02:24:42 GMT 2007 Olly Betts <olly@survex.com>
* utf8test.cc: Add single utf-8 sequence decoding tests.
Fri Mar 02 00:18:09 GMT 2007 Olly Betts <olly@survex.com>
* configure.ac: Perform a link test for posix_fadvise to fix
misdetection on HP-UX.
Thu Mar 01 21:48:57 GMT 2007 Olly Betts <olly@survex.com>
* utf8itor.h: Add cast to suppress warning from aCC.
Thu Mar 01 21:00:56 GMT 2007 Olly Betts <olly@survex.com>
* configure.ac: Check we can link with libiconv, not just compile.
Some of the HP-UX hosts in the HP testdrive seem to have headers
but no matching library.
Thu Mar 01 18:02:37 GMT 2007 Olly Betts <olly@survex.com>
* myhtmlparse.cc: Remove unused function. Move "#include <string.h>"
before any code.
Thu Feb 22 15:45:25 GMT 2007 Olly Betts <olly@survex.com>
* configure.ac: xapian-config --cxxflags now includes -ptused for
SGI's C++ compiler, so we don't need to probe for it here.
Wed Feb 21 15:17:07 GMT 2007 Olly Betts <olly@survex.com>
* docs/termprefixes.txt: Expand section on boolean prefixes, showing
how to generate them using scriptindex, and how to allow them to be
selected in an HTML form.
Mon Feb 19 12:51:24 GMT 2007 Olly Betts <olly@survex.com>
* configure.ac: Previous fix doesn't work. Just drop -O2 instead -
users of SGI's CC can specify "./configure CXXFLAGS=-O2" if they
want optimisation.
Sun Feb 18 21:44:09 GMT 2007 Olly Betts <olly@survex.com>
* configure.ac: For SGI's CC, -g overrides -g3 if it comes afterwards,
so we need to modify CXXFLAGS rather than just setting AM_CXXFLAGS.
Sat Feb 17 19:25:04 GMT 2007 Olly Betts <olly@survex.com>
* docs/overview.txt,omindex.cc: Add support for indexing MS Works
documents using wps2text (part of libwps).
Sat Feb 17 19:06:03 GMT 2007 Olly Betts <olly@survex.com>
* omindex.cc: Don't index empty files.
Fri Feb 16 21:14:35 GMT 2007 Olly Betts <olly@survex.com>
* NEWS: Add note that Omega < 0.8.0 NEWS entries are in the
xapian-core NEWS file.
Fri Feb 16 20:34:10 GMT 2007 Olly Betts <olly@survex.com>
* indextext.cc: Now I've fixed the bug in UTF-8 decoding, the check
for zero length terms is no longer required.
Fri Feb 16 19:34:48 GMT 2007 Olly Betts <olly@survex.com>
* tclUniData.h,utf8itor.h: The tcl unicode routines only have tables
for characters in the BMP. For other characters, assume they're
word characters, but can't be forced to lowercase.
Fri Feb 16 19:19:11 GMT 2007 Olly Betts <olly@survex.com>
* utf8itor.cc: Fix bug in decoding of 4 byte utf-8 sequences
- the returned value was 0x400000 too large! Fixes bug#106.
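For reference, a minimal sketch of decoding one 4 byte UTF-8 sequence. This is not the code in utf8itor.cc; the name decode_utf8_4byte_sketch and the 0xFFFFFFFF error value are assumptions made for illustration. Only the low 3 bits of the lead byte (11110xxx) carry payload. The entry above does not say how the old code went wrong, but masking the lead byte with 0x1f instead of 0x07 is one way to get a result that is 0x400000 too large.

```cpp
#include <cstddef>
#include <string>

// Hypothetical decoder (not the code in utf8itor.cc): decode the 4 byte
// UTF-8 sequence starting at s[pos]. Returns the code point, or 0xFFFFFFFF
// if the bytes are not a well-formed 4 byte sequence.
unsigned decode_utf8_4byte_sketch(const std::string& s, std::size_t pos = 0) {
    const unsigned bad = 0xFFFFFFFFu;
    if (pos + 4 > s.size()) return bad;
    const unsigned char b0 = static_cast<unsigned char>(s[pos]);
    const unsigned char b1 = static_cast<unsigned char>(s[pos + 1]);
    const unsigned char b2 = static_cast<unsigned char>(s[pos + 2]);
    const unsigned char b3 = static_cast<unsigned char>(s[pos + 3]);
    // Lead byte must be 11110xxx; the rest must be 10xxxxxx.
    if ((b0 & 0xF8) != 0xF0) return bad;
    if ((b1 & 0xC0) != 0x80 || (b2 & 0xC0) != 0x80 || (b3 & 0xC0) != 0x80)
        return bad;
    // 3 payload bits from the lead byte, 6 from each continuation byte.
    return ((b0 & 0x07u) << 18) | ((b1 & 0x3Fu) << 12) |
           ((b2 & 0x3Fu) << 6) | (b3 & 0x3Fu);
}
```

The bytes F0 A8 A8 8F decode to 0x28a0f, the character cited in the indextext.cc entry for bug #106.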
Thu Feb 15 19:42:36 GMT 2007 Olly Betts <olly@survex.com>
* indextext.cc,query.cc: Keep embedded apostrophes in terms rather
than relying on generating a phrase search for them.
Thu Feb 15 05:38:12 GMT 2007 Olly Betts <olly@survex.com>
* Makefile.am,datematchdecider.cc,datematchdecider.h,
docs/cgiparams.txt,query.cc: Add an alternative implementation
of date range filtering which uses a MatchDecider. This allows
everything that the existing implementation does, plus you can
support sorting on a choice of dates (e.g. first published or
last updated), and filtering works to a resolution of a minute
rather than a day. Since omindex now adds the last modified
date as value 0, this will work with omindex.
Thu Feb 15 04:38:32 GMT 2007 Olly Betts <olly@survex.com>
* configure.ac: SGI's CC needs -g3 instead of -g if we want to use
any -O option.
Sat Feb 10 20:53:14 GMT 2007 Olly Betts <olly@survex.com>
* md5.cc: Fix reversed preprocessor conditional so that we generate
correct MD5 checksums on big endian platforms.
Sat Feb 10 20:19:23 GMT 2007 Olly Betts <olly@survex.com>
* md5.cc: No need to byte swap when we've just zero filled!
Sat Feb 10 18:54:33 GMT 2007 Olly Betts <olly@survex.com>
* indextext.cc,query.cc: Prefer Xapian::Stem::operator() to
Xapian::Stem::stem_word().
Fri Feb 09 05:53:29 GMT 2007 Olly Betts <olly@survex.com>
* docs/omegascript.txt: Rewrite introductory paragraph. Note that
whitespace is significant, and add explicit warning to $setmap.
Mon Jan 1 01:56:56 GMT 2007 Richard Boulton <richard@lemurconsulting.com>
* indextext.cc: Fix parsing of text containing certain unicode
characters. Such text could have resulted in zero length terms
being added to documents. (The minimal example I found causing
this problem was a document containing only the unicode character
0x28a0f, which is a CJK Unified Ideograph).
Addresses bug #106, though may not be a complete fix - see the
bug for details.
Sun Dec 31 17:22:56 GMT 2006 Richard Boulton <richard@lemurconsulting.com>
* scriptindex.cc: Update short option list for scriptindex to match
documented usage (-h, -V and -s were not working).
Thu Dec 21 14:57:28 GMT 2006 Olly Betts <olly@survex.com>
* query.cc: Remove support for xB, xDATE1, xDATE2, xDAYSMINUS,
and xDEFAULTOP which were deprecated in favour of xFILTER in
0.7.5 (over 3 years ago).
Thu Dec 21 14:52:38 GMT 2006 Olly Betts <olly@survex.com>
* docs/cgiparams.txt: Remove documentation of the removed deprecated
aliases.
Thu Dec 21 14:39:04 GMT 2006 Olly Betts <olly@survex.com>
* omega.cc,query.cc: Remove deprecated aliases for CGI parameters
(deprecated in 0.6.3 or 0.6.5, more than 3.5 years ago):
RAW_SEARCH (now RAWSEARCH), DATE1 (now START), DATE2 (now END),
DAYSMINUS (now SPAN but with slightly different semantics),
and MIN_HITS (now MINHITS).
Thu Dec 21 01:04:00 GMT 2006 Olly Betts <olly@survex.com>
* utf8convert.cc: Fix headers included for iconv and not-iconv.
Wed Dec 20 23:53:41 GMT 2006 Olly Betts <olly@survex.com>
* configure.ac,utf8convert.cc: If iconv isn't found by configure, fall
back on simple conversion routines which handle iso-8859-1.
Configuring --without-iconv forces these routines to be used.
Configuring --with-iconv forces configure to fail if it can't find
iconv.
Tue Dec 19 20:35:04 GMT 2006 Olly Betts <olly@survex.com>
* utf8itor.h: Need <string.h> for strlen.
Tue Dec 19 19:53:52 GMT 2006 Olly Betts <olly@survex.com>
* Makefile.am,configure.ac: Add "-liconv" if it's needed. If we're on
OS X, also check for libiconv installed with fink.
Fri Dec 15 05:43:40 GMT 2006 Olly Betts <olly@survex.com>
* values.h: Add include guard.
Sun Dec 10 04:33:26 GMT 2006 Olly Betts <olly@survex.com>
* query.cc: Fix $substr{} with negative start to actually work. Fix
$substr{} to never cause a C++ exception.
* docs/omegascript.txt,query.cc: Enhance $substr{} to accept a
negative length (meaning to count back from the end of the string).
Sun Dec 10 03:05:09 GMT 2006 Olly Betts <olly@survex.com>
* commonhelp.cc: "--help" now says that the default stemming language
is "english".
Thu Nov 16 23:06:25 GMT 2006 Olly Betts <olly@survex.com>
* docs/omegascript.txt,query.cc,utils.cc,utils.h: Add $weight command
to OmegaScript which returns the raw document weight - mostly useful
for debugging purposes.
Thu Nov 16 04:02:10 GMT 2006 Olly Betts <olly@survex.com>
* omega.spec.in: Remove "." from the end of the Summary.
Thu Nov 16 03:03:25 GMT 2006 Olly Betts <olly@survex.com>
* configure.ac: As of xapian-core 0.8.0, XO_LIB_XAPIAN doesn't need to
be called with arguments if you want a hard requirement on xapian,
so remove the arguments.
Thu Nov 16 02:07:31 GMT 2006 Olly Betts <olly@survex.com>
* configure.ac: Change the project name to "xapian-omega" since that's
what the RPMs and Debian packages call it (there's a Rogue-like game
called Omega).
Thu Nov 16 02:01:55 GMT 2006 Olly Betts <olly@survex.com>
* omega.cc: Fix backwards setting of sort_after. Fix generation of
sort setup flags for filters.
Thu Nov 16 01:21:32 GMT 2006 Olly Betts <olly@survex.com>
* docs/cgiparams.txt,omega.cc,omega.h,query.cc: Implement new CGI
parameters for finer control of sorting and ranking - SORTAFTER
and DOCIDORDER.
* omega.cc: Set up the filters variable so we know to revert to
page 1 if the sorting options are changed.
Tue Nov 14 15:27:09 GMT 2006 Olly Betts <olly@survex.com>
* md5test.cc: Need <stdio.h> for sprintf.
Tue Nov 14 03:19:13 GMT 2006 Olly Betts <olly@survex.com>
* configure.ac: Note a couple of platforms which take the different
iconv input types.
Tue Nov 14 03:16:37 GMT 2006 Olly Betts <olly@survex.com>
* configure.ac,utf8convert.cc: The input pointer to iconv can be
either "char **" or "const char **" so probe at configure time.
Mon Nov 13 20:22:50 GMT 2006 Olly Betts <olly@survex.com>
* utf8convert.cc: Need <algorithm> for swap().
Mon Nov 13 02:27:51 GMT 2006 Olly Betts <olly@survex.com>
* Makefile.am,md5test.cc: Add tests for md5 code.
Mon Nov 13 02:06:51 GMT 2006 Olly Betts <olly@survex.com>
* Merge in utf8 branch:
Fri Sep 15 06:03:50 BST 2006 Olly Betts <olly@survex.com>
* utf8convert.cc: Compilation fix for Sun C++.
Thu Sep 14 23:55:20 BST 2006 Olly Betts <olly@survex.com>
* Makefile.am,htmlparse.cc,htmlparse.h,indextext.cc,
indextext.h,makesymboltabh.pl,myhtmlparse.cc,myhtmlparse.h,
namedentities.h,omindex.cc,query.cc,scriptindex.cc,
symboltab.h,tclUniData.cc,tclUniData.h,utf8convert.cc,
utf8convert.h,utf8itor.cc,utf8itor.h,utf8test.cc: Convert
to work in UTF-8.
Thu Nov 09 00:20:19 GMT 2006 Olly Betts <olly@survex.com>
* NEWS,configure.ac: Update for 0.9.9.
Wed Nov 08 22:45:10 GMT 2006 Olly Betts <olly@survex.com>
* omega.spec.in: Run "autoreconf --force" to avoid rpath on x86_64
FC6.
Sun Nov 05 17:08:48 GMT 2006 Olly Betts <olly@survex.com>
* scriptindex.cc: The "date" action was modifying the value it
operated on, which it isn't meant to do - fixed.
Sun Nov 05 02:25:48 GMT 2006 Olly Betts <olly@survex.com>
* query.cc: Report an error if $setmap is called with an even number
of parameters.
Thu Nov 02 16:08:27 GMT 2006 Olly Betts <olly@survex.com>
* NEWS,configure.ac: Update for 0.9.8.
Thu Nov 02 15:43:31 GMT 2006 Olly Betts <olly@survex.com>
* configure.ac: Update comment about "-ptused".
Wed Nov 01 16:23:13 GMT 2006 Olly Betts <olly@survex.com>
* cdb_init.cc: Fix warning in mingw build.
Wed Nov 01 13:43:54 GMT 2006 Olly Betts <olly@survex.com>
* cdb_init.cc,query.cc: Fix warnings.
Wed Nov 01 04:00:20 GMT 2006 Olly Betts <olly@survex.com>
* md5.cc,md5.h: Fix warnings about changing alignment requirements
when casting pointers.
Tue Oct 31 02:47:23 GMT 2006 Olly Betts <olly@survex.com>
* cdb_init.cc,configure.ac,getopt.cc,omega.cc,query.cc,scriptindex.cc:
Enable more warnings for GCC (and fix them in the code). Enable
appropriate warnings for Intel's C++ compiler.
Tue Oct 31 00:02:19 GMT 2006 Olly Betts <olly@survex.com>
* htmlparsetest.cc,omindex.cc: Fix GCC warnings.
Mon Oct 30 23:57:09 GMT 2006 Olly Betts <olly@survex.com>
* query.cc: $substr where the start is negative and longer than the
string (e.g. $substr{abcd,-5,1}) should now work as intended.
Mon Oct 30 21:02:18 GMT 2006 Olly Betts <olly@survex.com>
* scriptindex.cc: Fix GCC warnings uncovered by actually substituting
AM_CXXFLAGS.
Mon Oct 30 21:01:26 GMT 2006 Olly Betts <olly@survex.com>
* configure.ac: Actually substitute AM_CXXFLAGS in the Makefile.
* configure.ac: Fix AM_CXXFLAGS for IRIX.
Sat Oct 28 12:31:31 BST 2006 Olly Betts <olly@survex.com>
* myhtmlparse.cc: Add missing "#include <ctype.h>".
Sat Oct 28 02:23:09 BST 2006 Olly Betts <olly@survex.com>
* htmlparse.cc,indextext.cc,indextext.h,myhtmlparse.cc,omega.cc,
omega.h,omindex.cc,query.cc,scriptindex.cc: Ensure that we always
pass an unsigned char value to isupper(), toupper(), etc as they
are undefined on other values (glibc makes them work for signed
char values too, but this is an extension).
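A minimal sketch of the pattern described in the entry above. The helper name to_upper_sketch is hypothetical and is not taken from the Omega sources. The cast to unsigned char happens before the call, so a char with the top bit set is never passed to toupper() as a negative int.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper (not from the Omega sources): uppercase a string
// byte by byte using the <cctype> functions safely.
std::string to_upper_sketch(std::string s) {
    for (char& ch : s) {
        // toupper() is only defined for EOF and values representable as
        // unsigned char, so convert before calling it.
        ch = static_cast<char>(std::toupper(static_cast<unsigned char>(ch)));
    }
    return s;
}
```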
Fri Oct 27 00:36:34 BST 2006 Olly Betts <olly@survex.com>
* configure.ac,md5.h,values.h: HAVE_STDINT_H is already defined
by autoconf based on trying the C compiler with AC_CHECK_HEADERS
so define HAVE_WORKING_STDINT_H instead.
Wed Oct 25 01:36:43 BST 2006 Olly Betts <olly@survex.com>
* configure.ac: Need a more sophisticated test for the stdint.h
problem on IRIX.
Tue Oct 24 02:12:13 BST 2006 Olly Betts <olly@survex.com>
* metaxmlparse.cc,omega.h: Fix warnings from SGI's C++ compiler.
Tue Oct 24 02:11:11 BST 2006 Olly Betts <olly@survex.com>
* htmlparse.cc,query.cc,scriptindex.cc: Remove unused static
functions.
Tue Oct 24 01:51:05 BST 2006 Olly Betts <olly@survex.com>
* configure.ac: Pass magic options to SGI's C++ compiler to allow
linking of templates to work.
Tue Oct 24 00:46:06 BST 2006 Olly Betts <olly@survex.com>
* configure.ac: IRIX doesn't allow stdint.h to be included from C++
code, so we need a smarter configure test than AC_CHECK_HEADERS.
Sun Oct 22 03:30:11 BST 2006 Olly Betts <olly@survex.com>
* configure.ac: Tell AC_CHECK_HEADERS to suppress its backward
compatibility mode, so it only checks headers with the compiler.
This speeds up configure a little, and is what we do elsewhere.
Tue Oct 10 17:21:13 BST 2006 Olly Betts <olly@survex.com>
* NEWS: Update for actual 0.9.7 release.
Mon Oct 09 18:26:14 BST 2006 Olly Betts <olly@survex.com>
* docs/termprefixes.txt: "$setmap{title,S}" should be
"$setmap{prefix,title,S}".
Sun Oct 08 21:43:16 BST 2006 Olly Betts <olly@survex.com>
* NEWS,configure.ac: Update for 0.9.7.
Fri Sep 15 16:56:49 BST 2006 Olly Betts <olly@survex.com>
* cgiparam.cc: Compilation fix for Sun C++.
Fri Sep 15 06:00:50 BST 2006 Olly Betts <olly@survex.com>
* configure.ac,query.cc: Compilation fix for Sun C++.
Thu Sep 14 15:41:33 BST 2006 Olly Betts <olly@survex.com>
* htmlparse.cc: Include <stdlib.h> so atoi() is prototyped.
Wed Sep 13 16:37:32 BST 2006 Olly Betts <olly@survex.com>
* configure.ac,md5.h,values.h: Use stdint.h if we have it.
Tue Sep 12 11:57:16 BST 2006 Olly Betts <olly@survex.com>
* myhtmlparse.cc: Need "#include <string.h>" for strchr.
Mon Sep 11 20:24:27 BST 2006 Olly Betts <olly@survex.com>
* values.h: Only want our own ntohl for MS Windows.
Mon Sep 11 16:36:54 BST 2006 Olly Betts <olly@survex.com>
* omega.cc,query.cc: Now xapian-config will switch Sun's C++ compiler
into ANSI C++ compliant mode, so clean out all our special cased
bits of code.
Mon Sep 11 14:23:44 BST 2006 Olly Betts <olly@survex.com>
* md5.h,values.h: Apply previous fix for DJGPP too.
Sun Sep 10 19:04:17 BST 2006 Olly Betts <olly@survex.com>
* md5.h,values.h: Using htonl from winsock.h requires us to link
with the winsock DLL, which is overkill so just add a simple
implementation for htonl - we know MS Windows is little-endian.
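A sketch of what such a simple replacement can look like on a host known to be little-endian. The name htonl_le_sketch is hypothetical; the actual code in md5.h and values.h may differ. Converting to network (big endian) order on a little-endian host is a plain byte swap.

```cpp
#include <cstdint>

// Hypothetical stand-in for htonl on a host known to be little-endian
// (not the code in md5.h or values.h): reverse the four bytes.
inline std::uint32_t htonl_le_sketch(std::uint32_t v) {
    return (v >> 24) |
           ((v >> 8) & 0x0000FF00u) |
           ((v << 8) & 0x00FF0000u) |
           (v << 24);
}
```

Because a byte swap is its own inverse, the same function would also serve as ntohl on such a host.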
Sat Sep 09 21:48:22 BST 2006 Olly Betts <olly@survex.com>
* md5.h,values.h: Sigh, winsock.h uses u_long instead of uint32_t
in the htonl prototype.
Sat Sep 09 19:19:15 BST 2006 Olly Betts <olly@survex.com>
* omindex.cc: Fix typo in previous commit.
Sat Sep 09 17:11:40 BST 2006 Olly Betts <olly@survex.com>
* configure.ac,omindex.cc: Mingw doesn't have sys/wait.h or
WEXITSTATUS.
Sat Sep 09 16:44:29 BST 2006 Olly Betts <olly@survex.com>
* md5.h,values.h: On MS Windows, we need to #include <winsock.h>.
Fri Sep 08 08:01:15 BST 2006 Olly Betts <olly@survex.com>
* query.cc: Sun C++'s std::count() isn't very "std" -- it has the
wrong prototype!
Fri Sep 08 03:39:14 BST 2006 Olly Betts <olly@survex.com>
* md5.h,values.h: openbsd needs arpa/inet.h to be included before
netinet/in.h.
Wed Sep 06 21:31:33 BST 2006 Olly Betts <olly@survex.com>
* md5wrap.cc: #include <unistd.h>
Wed Sep 06 18:03:23 BST 2006 Olly Betts <olly@survex.com>
* Makefile.am: Ship values.h.
Wed Sep 06 03:52:27 BST 2006 Olly Betts <olly@survex.com>
* configfile.cc: Changed my mind - don't allow comments on the end of
lines.
* docs/overview.txt: Document that omega.conf can have comments and
blank lines in.
Wed Sep 06 03:46:16 BST 2006 Olly Betts <olly@survex.com>
* configfile.cc,omega.conf: Fix code which reads omega.conf to be line
based as documented rather than the wacky whitespace based scheme
that was actually implemented. Allow "#" comments and blank lines
in omega.conf.
Wed Sep 06 01:26:17 BST 2006 Olly Betts <olly@survex.com>
* omindex.cc: If popen() fails, treat it as a read error.
Wed Sep 06 00:49:47 BST 2006 Olly Betts <olly@survex.com>
* omindex.cc: Fix escaping of filenames to cast characters to
"unsigned char" so that isalnum() works correctly everywhere.
Not a security hole as dangerous characters were still being
escaped.
Tue Sep 05 06:49:30 BST 2006 Olly Betts <olly@survex.com>
* Makefile.am: Run htmlparsetest on "make check".
Tue Sep 05 06:46:18 BST 2006 Olly Betts <olly@survex.com>
* Makefile.am,htmlparse.cc,htmlparse.h,metaxmlparse.cc,metaxmlparse.h,
myhtmlparse.h,omindex.cc,xmlparse.cc,xmlparse.h: Parse the XML from
OpenDocument and OpenOffice using new subclasses of HtmlParser.
Only extract meta.xml once.
Tue Sep 05 06:45:02 BST 2006 Olly Betts <olly@survex.com>
* Makefile.am,htmlparsetest.cc: Add htmlparsetest which tests the
MyHtmlParser class.
Tue Sep 05 04:36:46 BST 2006 Olly Betts <olly@survex.com>
* omindex.cc: Note UTF-8 runes for pdfinfo and pdftotext.
Tue Sep 05 04:29:21 BST 2006 Olly Betts <olly@survex.com>
* omindex.cc: Only run pdfinfo once and pull out the
fields we want using string operations, instead of
running it twice filtered through sed.
Tue Sep 05 03:53:00 BST 2006 Olly Betts <olly@survex.com>
* htmlparse.cc,htmlparse.h: Don't get confused by "a<b" in
Javascript in a <script> tag. Fixes bug#91.
Sat Sep 02 04:29:12 BST 2006 Olly Betts <olly@survex.com>
* omindex.cc: Call pclose() not fclose() on a FILE* obtained from
popen(). If a filter program isn't installed, then don't try it
again for the same extension (not perfect but an improvement -
previously we indexed an empty document!)
Sat Sep 02 02:07:30 BST 2006 Olly Betts <olly@survex.com>
* Makefile.am,configure.ac,docs/omegascript.txt,md5.cc,md5.h,
md5wrap.cc,md5wrap.h,omindex.cc,query.cc,values.h: Generate
an MD5 checksum of each file indexed and store it in value #1
to allow duplicates to be collapsed. Add $pack and $unpack
OmegaScript commands to allow big endian binary values to
be encoded and decoded. Add the file last modified time
as value #0.
Fri Sep 01 04:37:09 BST 2006 Olly Betts <olly@survex.com>
* omindex.cc: Tweak comment and whitespace.
Fri Sep 01 04:19:39 BST 2006 Olly Betts <olly@survex.com>
* README: Update reference to "CVS" to say "SVN".
Thu Aug 31 20:22:33 BST 2006 Olly Betts <olly@survex.com>
* loadfile.cc: #include <algorithm> for std::min().
Thu Aug 31 02:35:36 BST 2006 Olly Betts <olly@survex.com>
* loadfile.cc: More missing #include-s.
Thu Aug 31 01:53:31 BST 2006 Olly Betts <olly@survex.com>
* loadfile.cc: Add #include <unistd.h>.
Wed Aug 30 23:21:49 BST 2006 Olly Betts <olly@survex.com>
* Makefile.am: Include loadfile.h in the tarball.
Mon Aug 28 18:09:28 BST 2006 Olly Betts <olly@survex.com>
* omindex.cc: Don't generate 'W' terms since omega doesn't use them.
Mon Aug 28 03:06:46 BST 2006 Olly Betts <olly@survex.com>
* query.cc,templates/query: Use '\t' to separate terms in xP since
filter terms might contain '.'. Fixes bug#87.
Sun Aug 27 01:36:40 BST 2006 Olly Betts <olly@survex.com>
* indextext.cc: Don't generate terms with more than 3 trailing
symbols ('-', '+', or '#').
Sun Aug 27 01:11:45 BST 2006 Olly Betts <olly@survex.com>
* omindex.cc: Added "size" field to document data; don't add "modtime"
field if the timestamp is (time_t)-1.
Sun Aug 27 00:36:12 BST 2006 Olly Betts <olly@survex.com>
* omindex.cc,templates/query,utils.cc,utils.h: Store the file's last
modified time in the document data as "modtime" so it shows up in
search results (and tweak the query template so the display of this
information looks nicer).
Fri Aug 25 22:55:23 BST 2006 Olly Betts <olly@survex.com>
* docs/overview.txt,omindex.cc: Run xls2csv on MS Excel files; run
catppt on MS Powerpoint files; also index MS Word templates (.dot).
Thu Aug 24 21:40:10 BST 2006 Olly Betts <olly@survex.com>
* htmlparse.cc: Support htdig's "ignore this bit" comments.
Thu Aug 24 12:55:26 BST 2006 Olly Betts <olly@survex.com>
* query.cc: Fix $highlight{} to work with capitalised words (it used
to work but regressed in 0.8.2).
Thu Aug 24 12:38:50 BST 2006 Olly Betts <olly@survex.com>
* Makefile.am,omindex.cc,query.cc: Use the new routines in loadfile.cc
to replace code to do the same thing in omindex and omega.
Thu Aug 24 12:37:16 BST 2006 Olly Betts <olly@survex.com>
* scriptindex.cc: Fix handling of check whether a record has content
in the case where the same field is processed more than once.
Thu Aug 24 12:35:32 BST 2006 Olly Betts <olly@survex.com>
* Makefile.am,docs/scriptindex.txt,loadfile.cc,loadfile.h,
scriptindex.cc: Add new "load" action to allow the contents of an
external file to be loaded.
Thu Aug 24 12:05:23 BST 2006 Olly Betts <olly@survex.com>
* configure.ac: Check for strftime.
Sun Jul 09 01:40:09 BST 2006 Olly Betts <olly@survex.com>
* docs/omegascript.txt: Note that (by design) an omegascript template
can't contain an infinite loop.
Sun May 21 11:42:54 BST 2006 Olly Betts <olly@survex.com>
* Makefile.am: Make use of the dist_ prefix to avoid having to list
files in EXTRA_DIST as well as in *_SCRIPTS, *_DATA, and man_MANS.
* Makefile.am: Prefer $(sysconfdir) to @sysconfdir@ since the former
can be overridden on the "make" command line.
Sat May 20 06:16:27 BST 2006 Olly Betts <olly@survex.com>
* Makefile.am,configure.ac: Specify required automake version in
the call to AM_INIT_AUTOMAKE in configure.ac.
Thu May 18 14:12:13 BST 2006 Olly Betts <olly@survex.com>
* docs/overview.txt,docs/quickstart.txt: Use the default path to the
database directories in examples. Tweak the formatting in a few
places. Give a path to the omega CGI binary in the example showing
how to run it from the command line.
Wed May 17 15:28:01 BST 2006 Olly Betts <olly@survex.com>
* omega.spec.in: Fix so that the documentation gets packaged.
Tue May 16 06:56:26 BST 2006 Olly Betts <olly@survex.com>
* configure.ac: Remove unused variable from snprintf testing code.
Mon May 15 02:18:01 BST 2006 Olly Betts <olly@survex.com>
* NEWS,configure.ac: Updated for 0.9.6.
Sat May 13 20:43:08 BST 2006 Olly Betts <olly@survex.com>
* configure.ac: Update snprintf detection to match xapian-core.
Fri May 12 20:12:40 BST 2006 Olly Betts <olly@survex.com>
* docs/omegascript.txt: Clarified description of $now.
Thu Apr 27 23:45:26 BST 2006 Olly Betts <olly@survex.com>
* docs/omegascript.txt,query.cc: Added new OmegaScript commands
$filterterms and $substr.
Thu Apr 27 18:37:50 BST 2006 Olly Betts <olly@survex.com>
* scriptindex.cc: Use const reference instead of just a reference.
Sun Apr 23 18:32:20 BST 2006 Olly Betts <olly@survex.com>
* scriptindex.cc: Fix "index" and "indexnopos" without a prefix to
set the weight correctly (bug introduced in 0.9.5).
Wed Apr 19 13:37:15 BST 2006 Fabrice Colin
* omega.spec.in: Create and package /var/lib/omega/cdb and
/var/log/omega.
Tue Apr 11 19:29:34 BST 2006 Olly Betts <olly@survex.com>
* configure.ac,htmlparse.cc,query.cc,scriptindex.cc: Disable MSVC
warning 4800 (on int to bool conversions) in config.h and then we
can remove the "fixes" elsewhere.
Mon Apr 10 16:26:08 BST 2006 Olly Betts <olly@survex.com>
* date.cc,hashterm.cc,htmlparse.cc,omega.cc,omindex.cc,query.cc,
scriptindex.cc: Fix MSVC7 warnings.
Sat Apr 08 20:04:33 BST 2006 Olly Betts <olly@survex.com>
* NEWS,configure.ac: Updated for 0.9.5.
Fri Apr 07 16:45:36 BST 2006 Olly Betts <olly@survex.com>
* omindex.cc,query.cc: Tweak for MSVC compilation.
Fri Apr 07 03:23:22 BST 2006 Olly Betts <olly@survex.com>
* omega.spec.in: Man pages may be gzipped.
Thu Apr 06 14:28:08 BST 2006 Olly Betts <olly@survex.com>
* README: Add pointer to documentation.
Thu Apr 06 03:32:21 BST 2006 Olly Betts <olly@survex.com>
* omega.spec.in: Include man pages in RPM.
Thu Apr 06 03:06:56 BST 2006 Olly Betts <olly@survex.com>
* Makefile.am,commonhelp.cc,commonhelp.h,configure.ac,omindex.cc,
scriptindex.cc: Add man pages for omindex and scriptindex.
Thu Apr 06 02:56:09 BST 2006 Olly Betts <olly@survex.com>
* mbox2omega.script: Use new "hash" command.
Wed Apr 05 19:29:14 BST 2006 Olly Betts <olly@survex.com>
* Makefile.am,docs/scriptindex.txt,hashterm.cc,hashterm.h,
omindex.cc,scriptindex.cc: Add new "hash" command to allow hashed
terms to be generated from long URLs like omindex does.
* htdig2omega.script: Use new "hash" command.
* scriptindex.cc: Fix "useless weight" warning to not incorrectly
fire when "index" or "indexnopos" has no parameter.
Wed Apr 05 15:03:28 BST 2006 Olly Betts <olly@survex.com>
* scriptindex.cc: Check if we successfully opened the index script
and give an error if not.
Fri Mar 10 05:21:13 GMT 2006 Olly Betts <olly@survex.com>
* dbi2omega: Check DBIDRIVER environment variable to allow a driver
other than mysql to be specified without modifying the script.
Wed Mar 01 02:28:57 GMT 2006 Olly Betts <olly@survex.com>
* scriptindex.cc: Don't repeat the "note" part of warnings; Warn if
"unique=<prefix>" is used without a corresponding "boolean=<prefix>";
Warn that "index=nopos" is deprecated and should be replaced by
"indexnopos".
Tue Feb 28 23:46:57 GMT 2006 Olly Betts <olly@survex.com>
* scriptindex.cc: Report a useless weight action, even if it's
followed by another non-useless action (e.g. field); convert weight
actions into a numeric parameter on index and indexnopos Action
objects; add explanatory text "(note that actions are executed from
left to right)" when reporting useless actions.
Sun Feb 26 00:25:10 GMT 2006 Olly Betts <olly@survex.com>
* query.cc: Fix $opt[fieldnames] handling. Previously it would try
to kick in if you didn't set fieldnames but set any alphabetically
later option!
Tue Feb 21 00:18:25 GMT 2006 Olly Betts <olly@survex.com>
* configure.ac,NEWS: Updated for 0.9.4.
Sun Feb 19 23:20:49 GMT 2006 Olly Betts <olly@survex.com>
* COPYING: Updated FSF address.
Thu Feb 16 00:10:22 GMT 2006 Olly Betts <olly@survex.com>
* NEWS,configure.ac: Updated for 0.9.3.
Wed Feb 08 13:01:15 GMT 2006 Olly Betts <olly@survex.com>
* templates/query: Make the page title shorter so there's more chance
it will fit on icon bars, etc.
Wed Feb 08 10:08:24 GMT 2006 Olly Betts <olly@survex.com>
* docs/overview.txt: Add pointer to documentation of the supported
query syntax.
Mon Feb 06 15:19:17 GMT 2006 Olly Betts <olly@survex.com>
* docs/termprefixes.txt: Fix typo.
Sat Jan 14 22:40:43 GMT 2006 Olly Betts <olly@survex.com>
* configure.ac: Copy over fixed snprintf checks from xapian-core.
Fri Jan 13 03:21:15 GMT 2006 Olly Betts <olly@survex.com>
* configure.ac: The configure test for snprintf uses memcmp, so
we need to "#include <string.h>" for it to work reliably.
Mon Jan 09 04:23:54 GMT 2006 Olly Betts <olly@survex.com>
* date.cc,query.cc: Add "#include <stdarg.h>" where we use
va_list, etc.
Mon Jan 09 04:17:54 GMT 2006 Olly Betts <olly@survex.com>
* cdb_init.cc: Fix more compilation issues with cdb no-mmap code.
Mon Jan 09 03:42:18 GMT 2006 Olly Betts <olly@survex.com>
* omega.cc,utils.cc,utils.h: Replace remaining use of split with
a direct walk of the string.
Mon Jan 09 03:19:49 GMT 2006 Olly Betts <olly@survex.com>
* query.cc: Don't split strings of docids in R parameters into a
vector<string> - just walk the string directly. The code is
as simple, and much more efficient if a lot of documents are
marked relevant.
Mon Jan 09 02:46:34 GMT 2006 Olly Betts <olly@survex.com>
* Makefile.am,date.cc,omindex.cc,query.cc,scriptindex.cc,utils.cc,
utils.h: Use snprintf where available.
Sun Jan 08 22:41:47 GMT 2006 Olly Betts <olly@survex.com>
* cdb_init.cc: Fixed malloc-based version to compile.
Sun Jan 08 21:05:46 GMT 2006 Olly Betts <olly@survex.com>
* cdb_find.cc,cdb_hash.cc,cdb_unpack.cc: #include <config.h>.
* configure.ac: Test for mmap.
* cdb_init.cc: If mmap isn't found, and this isn't WIN32 fall back on
the very crude approach of loading the whole file into a malloc-ed
block. For a small cdb file, that'll give acceptable performance
at least.
Fri Jan 06 21:29:37 GMT 2006 Olly Betts <olly@survex.com>
* symboltab.h: Fix A after \xbf being interpreted as an overlong
escape sequence.
Fri Jan 06 21:26:57 GMT 2006 Olly Betts <olly@survex.com>
* query.cc: Fix printf type mismatch on 64 bit platforms.
Fri Jan 06 21:00:34 GMT 2006 Olly Betts <olly@survex.com>
* docs/omegascript.txt,query.cc: Added $find{LIST,STRING}.
Fri Jan 06 20:52:31 GMT 2006 Olly Betts <olly@survex.com>
* symboltab.h: Write top-bit set characters using \xXX notation to
avoid warnings from Intel's C++ compiler.
Fri Jan 06 18:15:42 GMT 2006 Olly Betts <olly@survex.com>
* query.cc: Removed unused variable.
Fri Jan 06 18:14:33 GMT 2006 Olly Betts <olly@survex.com>
* query.cc: Cast time_t to unsigned long to avoid problems on 64bit
platforms.
Fri Jan 06 18:12:38 GMT 2006 Olly Betts <olly@survex.com>
* docs/omegascript.txt: Note in the $cgi description that it returns
an arbitrary value if there's more than one, and pointing to
$cgilist.
Thu Jan 05 05:54:58 GMT 2006 Olly Betts <olly@survex.com>
* cdb_init.cc: Fix mingw compilation.
Thu Jan 05 03:24:07 GMT 2006 Olly Betts <olly@survex.com>
* cdb_init.cc: Fix to hopefully compile on Solaris which has a broken
sys/mman.h when used from C++.
Wed Jan 04 20:44:44 GMT 2006 Olly Betts <olly@survex.com>
* query.cc: Fixed to compile with GCC 3.0.
Wed Jan 04 04:33:15 GMT 2006 Olly Betts <olly@survex.com>
* Makefile.am,cdb.h,cdb_find.cc,cdb_hash.cc,cdb_init.cc,cdb_int.h,
cdb_unpack.cc,configfile.cc,configfile.h,docs/omegascript.txt,
omega.conf,query.cc: Add $lookup{CDBFILE,KEY} command to perform
a lookup in a CDB file.
Wed Jan 04 03:06:31 GMT 2006 Olly Betts <olly@survex.com>
* docs/omegascript.txt,docs/overview.txt,query.cc: Added new feature
which allows you to avoid storing fieldnames in every document
(which can save a lot of disk space for a large database). Instead
you just store the field values, one per line, and add something
like "$set{fieldnames,$split{caption sample url}}" to the
OmegaScript template to specify the fieldnames to use.
* docs/omegascript.txt,query.cc: Add new "$split{}" command which
splits a string to give an OmegaScript list.
* query.cc: Fix $url{} to escape "+" to "%2b".
* query.cc: Speed up $highlight{} - only compare terms which are the
same length.
Tue Jan 03 22:38:01 GMT 2006 Olly Betts <olly@survex.com>
* configfile.cc: Rename file_readable() to file_exists() to better
reflect what the function actually does!
Tue Jan 03 17:43:40 GMT 2006 Olly Betts <olly@survex.com>
* templates/opensearch: Add missing escaping.
Mon Dec 19 10:27:30 GMT 2005 Olly Betts <olly@survex.com>
* Makefile.am,commonhelp.cc,commonhelp.h,docs/overview.txt,omindex.cc,
scriptindex.cc: Add "--stemmer" option to omindex and scriptindex
to allow the stemming language to be set.
* omindex.cc,scriptindex.cc: More consistent --help and --version
output. Update FSF address.
Mon Dec 19 06:03:31 GMT 2005 Olly Betts <olly@survex.com>
* query.cc: Explicitly use "unsigned char" when %-encoding in $url
so that top bit set characters are correctly handled on platforms
where char is signed by default.
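A minimal sketch of why the "unsigned char" matters when %-encoding. The function name url_encode() and the set of characters left unescaped (the RFC 3986 unreserved set) are assumptions for illustration; the actual $url implementation in query.cc may differ.

```cpp
#include <string>

// Hypothetical helper (NOT the actual query.cc code): %-encode every byte
// which isn't an ASCII letter, digit, or one of "-._~".
std::string url_encode(const std::string& in) {
    static const char hex[] = "0123456789abcdef";
    std::string out;
    for (std::string::size_type i = 0; i < in.size(); ++i) {
        // Convert to unsigned char first: where char is signed, a byte such
        // as 0xe9 would otherwise be negative, and shifting or indexing with
        // it would give the wrong hex digits.
        unsigned char ch = static_cast<unsigned char>(in[i]);
        bool unreserved = (ch >= 'a' && ch <= 'z') ||
                          (ch >= 'A' && ch <= 'Z') ||
                          (ch >= '0' && ch <= '9') ||
                          ch == '-' || ch == '.' || ch == '_' || ch == '~';
        if (unreserved) {
            out += static_cast<char>(ch);
        } else {
            out += '%';
            out += hex[ch >> 4];
            out += hex[ch & 0x0f];
        }
    }
    return out;
}
```

With this sketch, "+" is escaped too, so it cannot be mistaken for an encoded space when the URL is decoded.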
Sun Dec 11 09:30:44 GMT 2005 Olly Betts <olly@survex.com>
* templates/godmode: If a non-existent docid is specified, report the
error and prompt the user to enter another docid. Fixes bug#60.
Sun Dec 11 09:27:18 GMT 2005 Olly Betts <olly@survex.com>
* docs/cgiparams.txt,omega.cc,omega.h,query.cc: Add "SORTREVERSE"
CGI parameter which allows the sort order to be reversed when
sorting on a value. Remove "SORTBANDS" CGI parameter since it
no longer does anything.
Sun Dec 11 09:26:14 GMT 2005 Olly Betts <olly@survex.com>
* omindex.cc: Improve wording of comment.
Sun Dec 11 09:22:58 GMT 2005 Olly Betts <olly@survex.com>
* docs/overview.txt,omindex.cc: Add support for OpenDocument format
mimetypes and extensions out of the box.
Sun Dec 11 09:16:57 GMT 2005 Olly Betts <olly@survex.com>
* docs/omegascript.txt,query.cc: If executing an OmegaScript command
causes a Xapian exception to be thrown, catch it and copy the error
message into error_msg (which is read by the $error command).
Sun Dec 11 09:12:12 GMT 2005 Olly Betts <olly@survex.com>
* htmlparse.cc: Tweak a few comments; "while (1)" -> "while (true)".
Sun Dec 11 09:09:40 GMT 2005 Olly Betts <olly@survex.com>
* docs/overview.txt: The U prefix (URL term) was grouped with the date
searching prefixes, but it makes more sense to group it with the
prefixes relating to parts of the URL (H for hostname, P for path,
etc).
Sun Oct 02 16:28:59 BST 2005 Olly Betts <olly@survex.com>
* scriptindex.cc: Use "int database_mode" (set to the value to pass to
WritableDatabase's ctor) instead of "bool overwrite" to implement
--overwrite.
* scriptindex.cc: Remove code to handle "-q" as it no longer actually
controls anything. Just ignore it for backwards compatibility.
* scriptindex.cc: Tweak --help output to not wrap on a default
install.
Sat Sep 10 14:57:19 BST 2005 Olly Betts <olly@survex.com>
* docs/omegascript.txt: Improve descriptions of $collapsed, $value,
$version.
Fri Jul 29 10:05:21 BST 2005 James Aylett <james@tartarus.org>
* omindex.cc: add --preserve-nonduplicates / -p option to not
delete any documents that aren't updated, in replace duplicates
mode (so that multiple runs of omindex on different subsites
don't stomp on each other).
* docs/overview.txt: update to match the above.
Fri Jul 15 11:12:28 BST 2005 Olly Betts <olly@survex.com>
* configure.ac: Updated for 0.9.2.
Fri Jul 15 02:18:40 BST 2005 Olly Betts <olly@survex.com>
* NEWS: Updated for 0.9.2.
Sat Jul 02 14:56:35 BST 2005 Olly Betts <olly@survex.com>
* query.cc: Workaround further Sun C++ crapness.
Wed Jun 29 03:19:22 BST 2005 Olly Betts <olly@survex.com>
* docs/omegascript.txt,query.cc: Changed $highlight so
if OPEN and CLOSE aren't specified, they default to
highlighting each word from the query with a different
background colour like gmane does (previous default was to use
'<strong>' and '</strong>').
* query.cc: Removed surplus whitespace.
Fri Jun 24 02:51:38 BST 2005 Olly Betts <olly@survex.com>
* query.cc: Call QueryParser::set_database() as this is now used to
decide what to do for terms like "C#".
* docs/omegascript.txt,docs/termprefixes.txt,query.cc: Add the
ability to set boolean prefixes for the QueryParser by setting
a "boolprefix" map in the omegascript template.
Fri Jun 24 02:40:10 BST 2005 Olly Betts <olly@survex.com>
* scriptindex.cc: Fix infinite loop if there's no newline at the end
of a dumpfile.
Thu Jun 23 16:42:41 BST 2005 Olly Betts <olly@survex.com>
* docs/termprefixes.txt: Explain how to use termprefixes with
scriptindex and omega, since that's what most people will want to
know.
Thu Jun 23 16:41:15 BST 2005 Olly Betts <olly@survex.com>
* query.cc,docs/omegascript.txt: Added $length{} and $stoplist{}
commands to OmegaScript.
* docs/omegascript.txt: Use standard "S" prefix for title in example
for $setmap, rather than "XT".
Mon Jun 06 17:59:10 BST 2005 Olly Betts <olly@survex.com>
* NEWS: Another 0.9.1 update.
Mon Jun 06 17:52:44 BST 2005 Olly Betts <olly@survex.com>
* NEWS: Updated for 0.9.1.
Mon Jun 06 17:51:58 BST 2005 Olly Betts <olly@survex.com>
* configure.ac: Updated for 0.9.1.
Mon May 23 23:36:48 BST 2005 Fabrice Colin <fabrice.colin@gmail.com>
* omega.spec.in: Updated for 0.9.0.
Fri May 13 23:21:02 BST 2005 Olly Betts <olly@survex.com>
* NEWS: Updated for 0.9.0.
Fri May 13 00:39:44 BST 2005 Olly Betts <olly@survex.com>
* configure.ac: Updated for 0.9.0.
Fri May 13 00:35:21 BST 2005 Olly Betts <olly@survex.com>
* scriptindex.cc: Improved handling of extra blank lines in dump file;
Strip multiple \r characters from end of line; Complain if a dump
file doesn't appear to have been = escaped correctly; Flush
database after each input file to ensure all changes from a file
make it in.
* docs/omegascript.txt: Whitespace tweak.
Wed May 11 02:28:41 BST 2005 Olly Betts <olly@survex.com>
* NEWS: Started to update for 0.9.0.
Sun May 08 02:16:07 BST 2005 Olly Betts <olly@survex.com>
* query.cc: Use Query::get_terms_begin() not
QueryParser::termlist_begin().
Sun May 08 02:11:49 BST 2005 Olly Betts <olly@survex.com>
* Makefile.am: Use AM_CPPFLAGS not CPPFLAGS (CPPFLAGS is for the
user).
Wed May 4 11:32:18 BST 2005 Richard Boulton <richard@tartarus.org>
* configfile.cc: Configuration file is now looked for in various
locations: the first location in which a file is found is used.
Firstly, if the OMEGA_CONFIG_FILE environment variable is set,
the location given in it is checked. Secondly, the file
"omega.conf" in the same directory as the executable is checked.
Finally, the file "${sysconfdir}/omega.conf" (eg, /etc/omega.conf
on Linux) is checked. If none of these locations contain a file,
default values are used.
* docs/overview.txt: Update to describe new configuration file
locations.
* Makefile.am: Install omega.conf to ${sysconfdir} by default.
Define CONFIGFILE_SYSTEM with an appropriate value to find the
system configuration file.
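The lookup order described in this entry can be sketched as below. The names find_config_file() and can_open() are hypothetical, and the executable directory and sysconfdir are passed in as parameters here purely for illustration; configfile.cc obtains them its own way.

```cpp
#include <cstdio>
#include <cstdlib>
#include <string>

// Hypothetical helper: treat a path as "found" if it can be opened for
// reading. This is an approximation chosen for portability.
static bool can_open(const std::string& path) {
    std::FILE* f = std::fopen(path.c_str(), "r");
    if (!f) return false;
    std::fclose(f);
    return true;
}

// Sketch of the search order (NOT the actual configfile.cc code). Returns
// the first location at which a file is found, or an empty string to signal
// that default values should be used.
std::string find_config_file(const std::string& exe_dir,
                             const std::string& sysconfdir) {
    // 1. Location given by the OMEGA_CONFIG_FILE environment variable.
    const char* env = std::getenv("OMEGA_CONFIG_FILE");
    if (env && can_open(env)) return env;

    // 2. "omega.conf" in the same directory as the executable.
    std::string local_path = exe_dir + "/omega.conf";
    if (can_open(local_path)) return local_path;

    // 3. "${sysconfdir}/omega.conf", e.g. /etc/omega.conf on Linux.
    std::string system_path = sysconfdir + "/omega.conf";
    if (can_open(system_path)) return system_path;

    return std::string();
}
```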
Wed May 4 11:20:26 BST 2005 Richard Boulton <richard@tartarus.org>
* query.cc: Use new set_stemming_strategy() API method, rather than
old set_stemming_options() method. The old method didn't compile
because it's being passed a stemming_strategy value, which there
isn't a prototype for.
Fri Apr 29 10:27:05 BST 2005 Olly Betts <olly@survex.com>
* scriptindex.cc: Improved comments.
Fri Apr 15 03:12:02 BST 2005 Olly Betts <olly@survex.com>
* docs/termprefixes.txt: Updated QueryParser prefix documentation to
remove references to CVS HEAD.
* docs/termprefixes.txt: Capitalise "Month" to indicate why it has
prefix "M" (in line with all the other entries in the list).
Fri Apr 15 02:55:06 BST 2005 Olly Betts <olly@survex.com>
* indextext.cc: Generate terms like "c#".
* query.cc: Highlight words like "C#".
Fri Apr 15 02:53:22 BST 2005 Olly Betts <olly@survex.com>
* query.cc: Clearer code for adding boolean filters to the
query.
Wed Apr 06 02:47:14 BST 2005 Olly Betts <olly@survex.com>
* omindex.cc: Tweak the hashing of URLs so that it works the same
way on all platforms (previously it would depend on sizeof(long)).
This means an incompatibility with any existing database built on
a platform where sizeof(long) > 4 where URLs were hashed (i.e.
URLs were > 228 bytes if sizeof(long) == 8), but we really want
databases to be portable between platforms.
Wed Apr 06 02:44:58 BST 2005 Olly Betts <olly@survex.com>
* omindex.cc,docs/overview.txt: Removed useless "DUPE_duplicate"
option.
Wed Apr 06 00:48:08 BST 2005 Olly Betts <olly@survex.com>
* omindex.cc,docs/overview.txt: Added support for using pod2text for
indexing Perl documentation.
Wed Apr 06 00:25:47 BST 2005 Olly Betts <olly@survex.com>
* omindex.cc,docs/overview.txt: Replace -l/--no-recurse with
-l/--depth-limit which takes an argument allowing recursion
to be restricted to any depth, not just 0 or infinite!
Tue Apr 05 23:45:39 BST 2005 Olly Betts <olly@survex.com>
* mbox2omega,mbox2omega.script,Makefile.am: Added mbox2omega which
allows a mail folder to be indexed. Mostly it's an example as
there's no mechanism included to show the full original message.
Tue Apr 05 23:41:44 BST 2005 Olly Betts <olly@survex.com>
* scriptindex.cc: Tidy up STL header includes.
Tue Apr 05 23:34:36 BST 2005 Olly Betts <olly@survex.com>
* docs/omegascript.txt: Clarify $field description slightly.
Tue Apr 05 23:33:33 BST 2005 Olly Betts <olly@survex.com>
* indextext.h: Add typedefs to allow AccentNormalisingItor to be used
as an STL iterator.
Tue Apr 05 00:47:52 BST 2005 Olly Betts <olly@survex.com>
* docs/cgiparams.txt,docs/omegascript.txt: Fixed 3 references to
OmXxxx classes.
Tue Apr 05 00:41:45 BST 2005 Olly Betts <olly@survex.com>
* debian/.cvsignore,.cvsignore: Remove .cvsignore files, as they're
not used by SVN.
Mon Mar 21 16:43:07 GMT 2005 Richard Boulton <richard@tartarus.org>
* templates/opensearch: Add new template to implement basic
opensearch feeds of search results.
* Makefile.am: Include opensearch template in distribution.
Thu Mar 03 02:20:26 GMT 2005 Olly Betts <olly@survex.com>
* templates/query2: Remove Sam's unfinished rewrite of the query
template. It's not been worked on for nearly two years, and we
don't ship it.
Wed Mar 02 03:09:52 GMT 2005 Olly Betts <olly@survex.com>
* COPYING: Put in CVS.
Tue Mar 01 02:09:35 GMT 2005 Olly Betts <olly@survex.com>
* omindex.cc,docs/overview.txt: Extend -M/--mime-type to allow an
existing mapping to be removed by omitting the type.
Thu Feb 24 17:42:35 GMT 2005 Olly Betts <olly@survex.com>
* Makefile.am: Actually ship docs/termprefixes.txt (and make it harder
to fail to ship new docs in future).
Thu Feb 24 02:10:09 GMT 2005 Olly Betts <olly@survex.com>
* Makefile.am,docs/termprefixes.txt: Added a single document covering
all aspects of term prefixes.
Wed Feb 23 14:59:46 GMT 2005 Olly Betts <olly@survex.com>
* docs/omegascript.txt: Moved $collapsed into correct place
alphabetically!
Wed Feb 16 03:46:51 GMT 2005 Olly Betts <olly@survex.com>
* docs/cgiparams.txt,docs/overview.txt: Improved description of how
B filters are handled when building the query.
Wed Feb 16 03:44:24 GMT 2005 Olly Betts <olly@survex.com>
* omindex.cc: Fixed so that we get lstat() prototype on Linux systems
where we have posix_fadvise().
Mon Jan 17 03:35:35 GMT 2005 Olly Betts <olly@survex.com>
* query.cc: Corrected a comment.
Mon Jan 17 03:32:25 GMT 2005 Olly Betts <olly@survex.com>
* query.cc: Updated to use the new QueryParser API.
Wed Jan 05 03:15:43 GMT 2005 Olly Betts <olly@survex.com>
* docs/scriptindex.txt: Note that actions are applied in the specified
order.
Thu Dec 23 19:12:57 GMT 2004 Olly Betts <olly@survex.com>
* INSTALL: "xapian-examples" -> "omega".
Thu Dec 23 19:10:04 GMT 2004 Olly Betts <olly@survex.com>
* configure.ac,NEWS: Version 0.8.5.
Thu Dec 23 19:09:01 GMT 2004 Olly Betts <olly@survex.com>
* INSTALL,README: Added better installation instructions.
Mon Dec 20 17:26:26 GMT 2004 Olly Betts <olly@survex.com>
* configure.ac,omindex.cc: Fixed "ignore symlinks" code to compile on
systems without lstat (e.g. mingw).
Mon Dec 20 12:18:18 GMT 2004 Olly Betts <olly@survex.com>
* omindex.cc: Fix the "ignore symlinks" code to actually compile on
certain Linux boxes.
Mon Dec 20 11:33:59 GMT 2004 Olly Betts <olly@survex.com>
* query.cc: If an exception is thrown, make sure that the HTTP headers
get written so that we don't cause "500 Internal Server Error".
This problem was introduced by the change to allow a user specified
Content-Type in 0.8.0. Partly addresses bug#60.
Fri Dec 17 22:50:01 GMT 2004 Olly Betts <olly@survex.com>
* omindex.cc: Only try to delete removed documents in DUPE_replace
mode.
Thu Dec 16 11:43:28 GMT 2004 Olly Betts <olly@survex.com>
* scriptindex.cc: Fixed "Unknown Exception" when trying to "unhtml"
text which contains "</body>" (bug#61). This bug was introduced in
0.8.4.
Thu Dec 16 11:28:25 GMT 2004 Olly Betts <olly@survex.com>
* myhtmlparse.cc: <h1> - <h6> and </h1> - </h6> should leave a
space in the dumped HTML.
Wed Dec 15 15:53:55 GMT 2004 Richard Boulton <richard@tartarus.org>
* dbi2omega: Add a comment to the start of the file detailing what
dbi2omega does.
Wed Dec 15 15:08:41 GMT 2004 Richard Boulton <richard@tartarus.org>
* omindex.cc: Change behaviour of crawler such that it doesn't
follow symbolic links any more. Add "--follow" command
line option to turn following of symlinks back on.
Wed Dec 08 16:31:46 GMT 2004 Olly Betts <olly@survex.com>
* NEWS: Final update for 0.8.4.
Tue Dec 07 18:16:32 GMT 2004 Olly Betts <olly@survex.com>
* indextext.h: Fixed to compile with GCC 3.x.
Tue Dec 07 18:15:39 GMT 2004 Olly Betts <olly@survex.com>
* omega.cc,omindex.cc,scriptindex.cc: Use the new
Database/WritableDatabase constructors.
Tue Nov 30 22:02:33 GMT 2004 Olly Betts <olly@survex.com>
* NEWS,configure.ac: Updated for 0.8.4 release.
Wed Nov 24 04:50:52 GMT 2004 Olly Betts <olly@survex.com>
* templates/godmode: Finished off godmode template.
Wed Nov 24 04:12:09 GMT 2004 Olly Betts <olly@survex.com>
* query.cc: If there's only a boolean query (so we promote it to be
the query), switch to boolean weights.
Wed Nov 24 03:29:36 GMT 2004 Olly Betts <olly@survex.com>
* Makefile.am,myhtmlparse.cc,myhtmlparse.h,omindex.cc,scriptindex.cc:
Factored out MyHtmlParser into a separate file so it can be used
in scriptindex too to give scriptindex the same improved HTML
parsing which omindex just got.
Wed Nov 24 02:22:49 GMT 2004 Olly Betts <olly@survex.com>
* omindex.cc: Removed bogus extra line from code which was meant to
truncate at a word boundary, but has never actually worked!
Wed Nov 24 02:20:36 GMT 2004 Olly Betts <olly@survex.com>
* omindex.cc: Improved HTML to text conversion - the parser now knows
that some tags should be regarded as word breaks and some shouldn't
(previously all tags were treated as word breaks).
Wed Nov 24 00:22:39 GMT 2004 Olly Betts <olly@survex.com>
* omindex.cc: Removed debug output; don't include \xa0 in the list of
whitespace characters for now, as that's a bit character set
specific...
Wed Nov 24 00:04:42 GMT 2004 Olly Betts <olly@survex.com>
* omindex.cc: HTML extraction now strips leading and trailing
whitespace and converts all other consecutive groups of whitespace
to a single space.
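The whitespace handling described above might look something like this sketch. The function name normalise_whitespace() and the exact set of characters treated as whitespace are assumptions, not taken from omindex.cc.

```cpp
#include <string>

// Hypothetical helper (NOT the actual omindex.cc code): strip leading and
// trailing whitespace and collapse every other run of whitespace to a
// single space.
std::string normalise_whitespace(const std::string& in) {
    std::string out;
    bool pending_space = false;
    for (std::string::size_type i = 0; i < in.size(); ++i) {
        char ch = in[i];
        // Assumed whitespace set: the ASCII whitespace characters only.
        bool ws = (ch == ' ' || ch == '\t' || ch == '\n' ||
                   ch == '\r' || ch == '\f' || ch == '\v');
        if (ws) {
            // Only remember a space once some text has been emitted, which
            // drops leading whitespace.
            pending_space = !out.empty();
        } else {
            // Emit the remembered space just before the next word, so
            // trailing whitespace is never written.
            if (pending_space) {
                out += ' ';
                pending_space = false;
            }
            out += ch;
        }
    }
    return out;
}
```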
Tue Nov 23 20:29:14 GMT 2004 Olly Betts <olly@survex.com>
* Makefile.am: XAPIAN_FLAGS already links with xapianqueryparser
so remove -lxapianqueryparser from omega_LDADD as it was causing
problems on cygwin.
Wed Nov 17 18:51:28 GMT 2004 Olly Betts <olly@survex.com>
* omindex.cc: Index RTF documents with unrtf, if available.
* docs/overview.txt: Document this.
Wed Nov 17 16:31:01 GMT 2004 Olly Betts <olly@survex.com>
* omindex.cc: If a filename to be passed to a filter program has a
leading "-", protect it from possible interpretation as an option
by prepending "./".
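A minimal sketch of the protection described above; the function name protect_filename() is hypothetical and not necessarily what omindex.cc uses.

```cpp
#include <string>

// Hypothetical helper (NOT the actual omindex.cc code): make sure a filename
// can't be mistaken for a command line option by a filter program.
std::string protect_filename(const std::string& name) {
    // "./-foo" names the same file as "-foo" relative to the current
    // directory, but no longer starts with "-".
    if (!name.empty() && name[0] == '-') return "./" + name;
    return name;
}
```

For example, protect_filename("-rf.doc") gives "./-rf.doc", while names which don't start with "-" are returned unchanged.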
Wed Nov 17 16:29:55 GMT 2004 Olly Betts <olly@survex.com>
* omindex.cc: Index Wordperfect documents with wpd2text, if available.
* docs/overview.txt: Document this.
Wed Nov 17 15:12:08 GMT 2004 Olly Betts <olly@survex.com>
* omindex.cc: Index MS Word documents with antiword, if available.
* docs/overview.txt: Document this.
Wed Nov 17 04:29:15 GMT 2004 Olly Betts <olly@survex.com>
* omindex.cc: Add simple code to index OpenOffice documents.
* docs/overview.txt: Update documentation to mention this.
Tue Nov 09 03:04:44 GMT 2004 Olly Betts <olly@survex.com>
* configure.ac,Makefile.am: We now get -AA or -std strict_ansi from
xapian-config, so we don't need to probe for them ourselves.
Sun Nov 07 16:36:42 GMT 2004 Olly Betts <olly@survex.com>
* utils.cc: Fixed to work with updated snprintf configure test.
Sun Nov 07 04:55:26 GMT 2004 Olly Betts <olly@survex.com>
* configure.ac: rearrange so that libtool is active when we test if
the c++ compiler can link a program so it can pull in libstdc++
through a .la file; updated snprintf test to the new one from
xapian-core.
Fri Nov 05 17:20:13 GMT 2004 Olly Betts <olly@survex.com>
* configure.ac: AM_CONFIG_HEADER -> AC_CONFIG_HEADERS; Run tests using
the C++ compiler; select ANSI mode for aCC and cxx; Check GXX not
GCC when choosing warning flags.
Wed Nov 03 20:15:34 GMT 2004 Olly Betts <olly@survex.com>
* query.cc: Updated to use Query::empty() instead of
Query::is_empty().
Wed Nov 03 20:12:37 GMT 2004 Olly Betts <olly@survex.com>
* Makefile.am,getopt.cc,getopt.h,getopt1.cc,gnu_getopt.h,omindex.cc,
scriptindex.cc: Updated to reworked getopt from xapian-core.
Wed Nov 03 04:11:03 GMT 2004 Olly Betts <olly@survex.com>
* getopt.cc: Defining _NO_PROTO is a really bad idea for C++ code!
Tue Nov 02 18:54:12 GMT 2004 Olly Betts <olly@survex.com>
* getopt.cc: Protect getopt definition for possible getopt macro
declared in getopt.h.
Tue Nov 02 17:56:08 GMT 2004 Olly Betts <olly@survex.com>
* indextext.h: Fixed 2 warnings.
Tue Nov 02 06:54:17 GMT 2004 Olly Betts <olly@survex.com>
* getopt.cc,getopt1.cc: Fixed function declarations to not use K&R C
syntax.
Tue Nov 02 05:40:06 GMT 2004 Olly Betts <olly@survex.com>
* Makefile.am,configure.ac,getopt.c,getopt1.c,getopt.cc,getopt1.cc:
Compile everything as C++.
Mon Sep 20 14:52:24 BST 2004 Olly Betts <olly@survex.com>
* NEWS,configure.ac: Version 0.8.3.
Mon Sep 20 14:49:26 BST 2004 Olly Betts <olly@survex.com>
* Makefile.am,configure.ac: Require same versions of autoconf and
automake that xapian-core does.
Mon Sep 20 14:45:53 BST 2004 Olly Betts <olly@survex.com>
* omega.spec.in: Update from Fabrice Colin. The most notable change
is that the RPM is now called xapian-omega because there's already
an omega RPM (in Fedora Core at least) which is some game.
Thu Sep 16 00:57:13 BST 2004 Olly Betts <olly@survex.com>
* cgiparam.cc,configfile.cc,configfile.h,htmlparse.cc,indextext.cc,
omega.cc,omindex-config.cc: All C++ sources should #include
<config.h> as the first header; no header files should #include
<config.h>.
Thu Sep 16 00:54:31 BST 2004 Olly Betts <olly@survex.com>
* scriptindex.cc: --version now actually reports the version. --help
now exits with status 0 rather than status 1.
Tue Sep 14 03:00:32 BST 2004 Olly Betts <olly@survex.com>
* omega.spec.in: Updated URL for sources; include htdig2omega and
htdig2omega.script in the RPM.
Tue Sep 14 02:56:52 BST 2004 Olly Betts <olly@survex.com>
* Makefile.am: Install htdig2omega.script in ${prefix}/share/omega/
rather than ${prefix}/share/.
Mon Sep 13 03:22:55 BST 2004 Olly Betts <olly@survex.com>
* NEWS,configure.ac: Version 0.8.2.
Thu Sep 09 15:11:45 BST 2004 Olly Betts <olly@survex.com>
* NEWS: Updated.
Thu Sep 09 14:41:41 BST 2004 Olly Betts <olly@survex.com>
* query.cc: Use new checkatleast parameter to Enquire::get_mset to
implement MINHITS.
Thu Sep 02 01:45:46 BST 2004 Olly Betts <olly@survex.com>
* templates/query: Always report database not found - previously we
only did so if there was a query. Also fixed missing </center>
tag which happened in certain cases.
Wed Aug 25 23:19:47 BST 2004 Olly Betts <olly@survex.com>
* omindex.cc: When running with "replace duplicates" mode (the
default), detect documents removed since the last indexing
run and delete them from the database (bug #34).
Tue Aug 24 19:23:55 BST 2004 Olly Betts <olly@survex.com>
* omega.cc: Added FIXME comment noting that SORT and SORTBANDS should
be tracked and the results reset to the first page if they change.
Tue Aug 24 19:23:07 BST 2004 Olly Betts <olly@survex.com>
* Makefile.am: Install htdig2omega and htdig2omega.script.
Mon Aug 23 22:29:53 BST 2004 Olly Betts <olly@survex.com>
* scriptindex.cc: Report index file name and line number when
reporting errors in it. Added warning for redundant actions,
such as "truncate" as the last action in a rule.
Mon Aug 23 22:03:25 BST 2004 Olly Betts <olly@survex.com>
* omindex.cc: Use the new replace_document(term, doc) method.
Sun Aug 22 13:11:23 BST 2004 Olly Betts <olly@survex.com>
* configure.in,configure.ac: Renamed configure.in to configure.ac.
Sat Aug 21 12:41:43 BST 2004 Olly Betts <olly@survex.com>
* docs/omegascript.txt: Added note that $add{$hit,1} gives
the "hit number".
Fri Aug 20 20:28:16 BST 2004 Olly Betts <olly@survex.com>
* Makefile.am: Link with -lxapianqueryparser, not -lomqueryparser.
Thu Aug 19 19:13:34 BST 2004 Olly Betts <olly@survex.com>
* Makefile.am: And actually ship htdig2omega and htdig2omega.script!
Thu Aug 19 19:02:40 BST 2004 Olly Betts <olly@survex.com>
* htdig2omega,htdig2omega.script: Added perl script and corresponding
scriptindex index script which allow an ht://dig database to be
imported into Xapian. This provides an easy way to provide a search
of remote websites using omega (by spidering them with ht://dig).
Sun Aug 15 01:48:58 BST 2004 Olly Betts <olly@survex.com>
* indextext.cc,indextext.h,omindex.cc,query.cc,scriptindex.cc,
symboltab.h: Fixed $highlight to understand accented characters
(bug#9).
Wed Jun 30 14:58:12 BST 2004 Olly Betts <olly@survex.com>
* NEWS,configure.in: Version 0.8.1.
Tue Jun 29 17:26:41 BST 2004 Richard Boulton <richard@tartarus.org>
* Makefile.am: Remove Debian files from distribution tarballs,
since there will often be multiple patch releases for each
release. Debian files will be available from an apt repository
in future.
Tue Jun 29 01:45:06 BST 2004 Olly Betts <olly@survex.com>
* omindex.cc: Renamed hash() to hash_string() to avoid colliding
with something on IRIX; Removed explicit initialisation of
mime_types - perhaps that's spooking the SGI CC prelinker.
Sun Jun 27 23:47:35 BST 2004 Olly Betts <olly@survex.com>
* omega.cc: Change MORELIKE to pick up to 40 terms, rather than up to
6 (feedback on the mailing list suggests this gives much better
results).
Fri Jun 11 02:22:38 BST 2004 Olly Betts <olly@survex.com>
* scriptindex.cc: Added catch for std::bad_alloc.
Mon Apr 19 14:43:17 BST 2004 Olly Betts <olly@survex.com>
* NEWS: Final update for 0.8.0.
Sun Apr 18 22:31:24 BST 2004 Olly Betts <olly@survex.com>
* omindex.cc: Only need _POSIX_C_SOURCE on Linux, and it seems to
cause problems with Sun's C++ compiler.
Sun Apr 18 17:50:35 BST 2004 Olly Betts <olly@survex.com>
* omindex.cc: _POSIX_C_SOURCE works better than _POSIX_SOURCE for
making posix_fadvise prototype visible on Linux.
Thu Apr 15 02:05:49 BST 2004 Olly Betts <olly@survex.com>
* omindex.cc: And another _POSIX_SOURCE attempt!
Thu Apr 15 01:43:51 BST 2004 Olly Betts <olly@survex.com>
* omindex.cc: Another stab at _POSIX_SOURCE...
Thu Apr 15 01:25:29 BST 2004 Olly Betts <olly@survex.com>
* omindex.cc: Added a missing underscore (_POSIX_SOURCE not
POSIX_SOURCE!)
Thu Apr 15 00:48:12 BST 2004 Olly Betts <olly@survex.com>
* omindex.cc: Defined POSIX_SOURCE to a suitable value to get
posix_fadvise on some versions of redhat.
Mon Apr 12 01:06:58 BST 2004 Olly Betts <olly@survex.com>
* NEWS,configure.in: Version 0.8.0.
Mon Apr 12 00:03:57 BST 2004 Olly Betts <olly@survex.com>
* indextext.cc,query.cc: Don't create R terms for terms which start
with a digit.
Sun Apr 11 23:47:33 BST 2004 Olly Betts <olly@survex.com>
* omindex.cc: Fixed inconsistent indenting.
Sun Apr 11 23:11:51 BST 2004 Olly Betts <olly@survex.com>
* omindex.cc: Call posix_fadvise with POSIX_FADV_DONTNEED just before
closing an input file. Again should help improve indexing
throughput.
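The idea in this entry can be sketched as follows for POSIX systems. The helper name read_file_dontneed() is hypothetical, and the sketch reads via plain read() calls; it is not the omindex.cc code. The advice call is guarded so the sketch still compiles where POSIX_FADV_DONTNEED isn't defined.

```cpp
#include <fcntl.h>
#include <unistd.h>
#include <string>

// Hypothetical helper (NOT the actual omindex.cc code): read a whole file
// into "out", then tell the kernel its cached pages won't be needed again.
// Returns true if the file was read to the end without error.
bool read_file_dontneed(const char* path, std::string& out) {
    int fd = open(path, O_RDONLY);
    if (fd < 0) return false;

    char buf[4096];
    ssize_t n;
    while ((n = read(fd, buf, sizeof buf)) > 0) {
        out.append(buf, static_cast<std::string::size_type>(n));
    }

#ifdef POSIX_FADV_DONTNEED
    // Purely advisory, so the return value is ignored. An offset of 0 with a
    // length of 0 covers the whole file.
    posix_fadvise(fd, 0, 0, POSIX_FADV_DONTNEED);
#endif

    close(fd);
    return n == 0;
}
```

The point of the hint is that input files are read once and then not needed, so dropping them from the cache leaves more room for the database being written.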
Fri Apr 02 16:09:03 BST 2004 Olly Betts <olly@survex.com>
* configure.in,omindex.cc: Use O_STREAMING and/or posix_fadvise()
when reading files to be indexed (if available). This helps to
keep the Xapian database in cache, and greatly improve indexing
throughput.
Tue Mar 30 00:06:15 BST 2004 Olly Betts <olly@survex.com>
* NEWS: We're now putting omega news here rather than in xapian-core,
so composed a draft version for the forthcoming 0.8.0 release.
Tue Mar 29 23:56:27 BST 2004 Olly Betts <olly@survex.com>
* templates/xml: Remove unused OmegaScript code:
`$set{topterms,$or{$ne{$msize,0},$query}}'.
Tue Mar 29 23:55:40 BST 2004 Olly Betts <olly@survex.com>
* Makefile.am: scriptindex needs to link to getopt.c and getopt1.c.
Tue Mar 23 19:20:19 GMT 2004 Olly Betts <olly@survex.com>
* templates/xml: Correct spelling of `relavence' to `relevance'.
NB: if you're parsing the XML output, you'll need to fix this
spelling in your parser!
Sun Mar 21 14:23:23 GMT 2004 Olly Betts <olly@survex.com>
* scriptindex.cc: Use getopt for option parsing. Change default to
*not* overwriting the database (use --overwrite if you really want
to do this); -u is now accepted but ignored.
Fri Mar 12 02:11:28 GMT 2004 Olly Betts <olly@survex.com>
* templates/xml: "Content-Type: application/xml" is more appropriate
than text/xml.
Fri Mar 12 02:09:33 GMT 2004 Olly Betts <olly@survex.com>
* omindex.cc: Added --overwrite option which forces an existing
database to be deleted before indexing begins.
Wed Mar 10 14:39:13 GMT 2004 Olly Betts <olly@survex.com>
* templates/xml: "Content-Type: text/xml".
Wed Mar 10 00:08:40 GMT 2004 Olly Betts <olly@survex.com>
* docs/scriptindex.txt: Make more explicit that boolean produces a
*single* boolean term.
Tue Mar 09 19:08:19 GMT 2004 Olly Betts <olly@survex.com>
* indextext.cc,omindex.cc,scriptindex.cc: Updated to use add_term()
instead of add_term_nopos().
Wed Mar 03 14:55:50 GMT 2004 Olly Betts <olly@survex.com>
* scriptindex.cc: Use true/false for assigning to booleans, not 1/0.
Sat Feb 21 18:33:15 GMT 2004 Olly Betts <olly@survex.com>
* omega.cc,query.cc,docs/omegascript.txt: Added $httpheader
OmegaScript command to allow arbitrary HTTP headers and alternative
Content-Type headers to be specified.
Sat Feb 14 00:32:06 GMT 2004 Olly Betts <olly@survex.com>
* query.cc: If the probabilistic query was bad, don't try to run the
match.
Sat Feb 14 00:11:52 GMT 2004 Olly Betts <olly@survex.com>
* docs/cgiparams.txt: Note that START and END should be in the format
YYYYMMDD.
Sat Feb 14 00:07:41 GMT 2004 Olly Betts <olly@survex.com>
* query.cc: Don't crash if there's a date filter but no probabilistic
query.
Wed Nov 26 22:44:49 GMT 2003 Olly Betts <olly@survex.com>
* indextext.cc: Raw terms with a multicharacter prefix are now indexed
with a : inserted (e.g. as XFOO:Rterm). This matches what the query
parser does.
Wed Nov 26 16:25:16 GMT 2003 Olly Betts <olly@survex.com>
* configure.in: Version 0.7.5.
Sun Nov 23 03:28:21 GMT 2003 Olly Betts <olly@survex.com>
* query.cc,docs/omegascript.txt: Added note that $setmap{prefix,...}
needs to be used before any commands which require the query to be
parsed.
Thu Nov 20 02:44:55 GMT 2003 Olly Betts <olly@survex.com>
* docs/omegascript.txt: Expanded documentation of $set and $setmap to
include values which Omega itself makes use of.
Thu Nov 20 02:43:03 GMT 2003 Olly Betts <olly@survex.com>
* omega.cc,query.cc: Set default value for $opt{stemmer} to "english"
rather than taking "" to mean English.
Tue Oct 21 21:29:18 BST 2003 Olly Betts <olly@survex.com>
* query.cc: Fixed $setmap{} to not add bogus entries.
Tue Oct 21 21:20:31 BST 2003 Olly Betts <olly@survex.com>
* query.cc: Allow the QueryParser prefix map to be set up using
$setmap{prefix,...} (e.g. $setmap{prefix,subject,XT,abstract,XA}).
Tue Oct 21 21:13:59 BST 2003 Olly Betts <olly@survex.com>
* query.cc: Only parse probabilistic query once!
Tue Oct 21 20:03:27 BST 2003 Olly Betts <olly@survex.com>
* omega.cc,omega.h,query.cc,query.h: Reworked so that the
probabilistic query isn't parsed until we need some
information from it. This means that we can now use options
set by the omegascript template to control the behaviour of the
query parser.
Thu Oct 16 21:17:01 BST 2003 Olly Betts <olly@survex.com>
* omega.cc: Renamed `big_buf' to `query_string' and eliminated `more'
flag and use of goto; tidied up order of reading CGI variables; use
const refs to value strings in cgi_params map rather than copying
the strings out.
Sat Oct 11 20:43:04 BST 2003 Olly Betts <olly@survex.com>
* omega.cc,omega.h,query.cc: Make rset an object rather than a pointer
to an object.
Fri Oct 10 18:06:10 BST 2003 Olly Betts <olly@survex.com>
* query.cc: Removed the unfinished code for caching omegascript
command expansions. Added code to cache $dbsize. The only other
value correctly marked for caching is already being cached!
Thu Oct 02 15:18:19 BST 2003 Olly Betts <olly@survex.com>
* configure.in: Version 0.7.4.
Thu Oct 02 15:16:41 BST 2003 Olly Betts <olly@survex.com>
* query.cc: $date doesn't require the match to be run to work, but
$topdoc does!
Tue Sep 30 18:32:25 BST 2003 Olly Betts <olly@survex.com>
* query.cc: Cleaner version of T macro.
Tue Sep 30 18:09:30 BST 2003 Olly Betts <olly@survex.com>
* query.cc: Hopefully the final piece in the Sun C++ puzzle.
Tue Sep 30 00:59:50 BST 2003 Olly Betts <olly@survex.com>
* query.cc: Cleaned up a recent fix by using clean generic code which
works on Sun's C++ too.
Mon Sep 29 17:12:10 BST 2003 Olly Betts <olly@survex.com>
* cgiparam.cc: Portability fixes for Sun's C++ compiler.
Mon Sep 29 13:26:22 BST 2003 Olly Betts <olly@survex.com>
* query.cc: Another Sun C++ fix.
Mon Sep 29 11:49:30 BST 2003 Olly Betts <olly@survex.com>
* query.cc,omega.cc: More fixes for Sun's really rather rubbish
C++ compiler.
Mon Sep 29 01:39:56 BST 2003 Olly Betts <olly@survex.com>
* query.cc: Fixes for compiling with Sun's C++ compiler.
Mon Sep 29 01:17:39 BST 2003 Olly Betts <olly@survex.com>
* omega.cc: Added workaround for compilation problem with Sun's C++.
Fri Aug 08 01:39:51 BST 2003 Olly Betts <olly@survex.com>
* configure.in: Version 0.7.3.
Sat Aug 02 01:52:38 BST 2003 Olly Betts <olly@survex.com>
* configure.in,omindex.cc,query.cc: Fixed to compile on mingw
where ftime() returns void.
Fri Aug 01 20:59:57 BST 2003 Olly Betts <olly@survex.com>
* scriptindex.cc: Added #define for sleep() on __WIN32__.
Wed Jul 30 19:05:17 BST 2003 Olly Betts <olly@survex.com>
* getopt.h: Copied over latest getopt.h from xapian-core.
Sun Jul 27 16:34:19 BST 2003 Olly Betts <olly@survex.com>
* Makefile.am,getopt.c,getopt.h,getopt1.c: Copied our version of GNU
getopt here from xapian-core so we can build omindex on non-glibc
platforms (modifications are for better C++ compatibility).
Mon Jul 21 01:16:59 BST 2003 Olly Betts <olly@survex.com>
* configure.in: Use libtool; OM_PATH_XAPIAN -> XO_LIB_XAPIAN.
Sat Jul 19 19:26:03 BST 2003 Olly Betts <olly@survex.com>
* omindex.cc: Added missing `#include <errno.h>'.
Sat Jul 19 19:24:50 BST 2003 Olly Betts <olly@survex.com>
* indextext.cc: Fixed signed character issue.
Thu Jul 17 00:51:42 BST 2003 Olly Betts <olly@survex.com>
* bootstrap: Removed bootstrap in favour of top-level bootstrap.
Tue Jul 15 16:27:52 BST 2003 Olly Betts <olly@survex.com>
* omindex.cc: file_to_string() and stdout_to_string() now throw an
exception on a read error, avoiding the " "-for-empty-file bodge.
Tue Jul 15 15:18:32 BST 2003 James Aylett <james@tartarus.org>
* omindex.cc: fix file_to_string() to return the file on
success, and not leak memory on empty files. Fix callers
to give up on unreadable files, not vice versa. Fix
logging messages to distinguish re-indexed/added.
Fri Jul 11 15:09:55 BST 2003 Olly Betts <olly@survex.com>
* configure.in: Version 0.7.2.
Fri Jul 11 12:08:57 BST 2003 Olly Betts <olly@survex.com>
* omega.cc: If the same database is listed more than once, only search
the first occurrence.
Fri Jul 11 11:57:24 BST 2003 Olly Betts <olly@survex.com>
* configure.in,utils.cc: Use snprintf.
Tue Jul 08 17:56:39 BST 2003 Olly Betts <olly@survex.com>
* configure.in: Version 0.7.1.
Tue Jul 08 17:34:01 BST 2003 Olly Betts <olly@survex.com>
* omindex.cc: Fixed compilation problem.
Fri Jul 04 22:12:32 BST 2003 Olly Betts <olly@survex.com>
* bootstrap: add missing ';;' as case pattern delimiter
Thu Jul 03 23:34:50 BST 2003 Olly Betts <olly@survex.com>
* configure.in: Version 0.7.0.
Thu Jul 03 23:33:05 BST 2003 Olly Betts <olly@survex.com>
* omindex.cc: Abort parsing of document if it's excluded from
indexing; ignore anything outside of the first <body>...</body>,
if present.
Tue Jun 24 00:45:28 BST 2003 Olly Betts <olly@survex.com>
* docs/overview.txt: Added note about hashing of long URL terms and
reworked structure a little.
Mon Jun 23 21:11:41 BST 2003 Olly Betts <olly@survex.com>
* bootstrap: Check for Bison 1.875 which doesn't work with Xapian.
Mon Jun 23 16:52:47 BST 2003 Olly Betts <olly@survex.com>
* omega.cc,omindex.cc,scriptindex.cc: Xapian::PostListIterator ->
Xapian::PostingIterator.
Thu Jun 19 20:02:00 BST 2003 Olly Betts <olly@survex.com>
* symboltab.h: Convert hardspace to space.
Wed Jun 18 16:32:34 BST 2003 Olly Betts <olly@survex.com>
* scriptindex.cc: Removed already disabled unique id hashing to docid
code. Xapian doesn't support setting arbitrary docids - if it ever
does we can retrieve this code from CVS.
Wed Jun 18 16:28:33 BST 2003 Olly Betts <olly@survex.com>
* Makefile.am,indextext.cc,indextext.h,omindex.cc,scriptindex.cc:
Normalise accents in probabilistic terms.
Tue Jun 17 17:54:32 BST 2003 Olly Betts <olly@survex.com>
* omindex.cc: Read output from pstotext and pdftotext via pipes rather
than temporary files to side-step the whole problem of secure
temporary file creation; Use pdfinfo to get the title and keywords
when indexing a PDF; Safe filename escaping tweaked to not
escape common safe punctuation.
Tue Jun 17 17:50:00 BST 2003 Olly Betts <olly@survex.com>
* htmlparse.cc,htmlparse.h: Moved initialisation of named_ents out of
header - it's not a sensible candidate for inlining.
Wed Jun 11 02:32:25 BST 2003 Olly Betts <olly@survex.com>
* date.cc,date.h,omega.cc,omega.h,omindex.cc,query.cc,query.h,
scriptindex.cc: Om -> Xapian::, etc.
Fri Jun 6 01:04:12 BST 2003 Richard Boulton <richard@tartarus.org>
* omindex.cc: Implement an upper limit on the length of URL
terms. Currently, this is set at 240 characters - it can
probably be increased slightly, but I'm not sure exactly
how long a term can safely be. If the URL term would be
longer than this, its last few bytes are replaced by a
hash of the tail of the URL. This means that (apart from
hopefully very rare collisions) urlterms should still be
unique ids for documents.
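
The truncation scheme described in the entry above can be sketched roughly as follows. This is only an illustration: the function name limit_url_term, the 8-hex-digit suffix, and the choice of 32-bit FNV-1a as the hash are all assumptions made for the example, not what omindex.cc actually uses.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical stand-in hash (32-bit FNV-1a); the real hash used by
// omindex is not stated in the entry above.
static std::uint32_t fnv1a(const std::string& s) {
    std::uint32_t h = 2166136261u;
    for (unsigned char ch : s) {
        h ^= ch;
        h *= 16777619u;
    }
    return h;
}

// Hypothetical helper: keep terms at or below max_len bytes. If the term is
// too long, its tail is replaced by a fixed-width hash of that tail, so the
// result is exactly max_len bytes and still (almost always) unique.
std::string limit_url_term(const std::string& term, std::size_t max_len = 240) {
    const std::size_t hash_len = 8;  // 32 bits printed as 8 hex digits
    assert(max_len > hash_len);
    if (term.size() <= max_len) return term;
    const std::size_t keep = max_len - hash_len;
    char buf[9];
    std::snprintf(buf, sizeof buf, "%08x",
                  static_cast<unsigned>(fnv1a(term.substr(keep))));
    return term.substr(0, keep) + buf;
}
```

Hashing everything past the kept prefix (rather than only the bytes that overflow the limit) means two long URLs sharing a prefix still get different terms unless the hash collides.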
Fri Jun 06 00:14:13 BST 2003 Richard Boulton <richard@tartarus.org>
* omindex.cc: Clean up processing of HTML documents:
- Ignore the contents of <script> and <style> tags in HTML.
- Strip initial whitespace in each tag in an HTML document.
- Try not to split words in half when truncating title and
summary.
Tue Jun 03 11:15:28 BST 2003 Olly Betts <olly@survex.com>
* templates/query: Create log entry in query.log.
Thu May 29 18:03:54 BST 2003 Olly Betts <olly@survex.com>
* query.cc: Fixed bug in DEFAULT_LOG_ENTRY's Omegascript.
Thu May 29 00:22:28 BST 2003 Olly Betts <olly@survex.com>
* query.cc: Set STEM_LANGUAGE near the start of the file so it's easy
for users to change until we get better configurability.
Thu May 29 00:00:28 BST 2003 Olly Betts <olly@survex.com>
* Makefile.am,date.cc,date.h,query.cc: Split code to build a
date range filter into a separate file.
Wed May 28 23:38:02 BST 2003 Olly Betts <olly@survex.com>
* configfile.cc,configfile.h,omega.cc,omega.conf,query.cc,query.h,
docs/omegascript.txt,docs/overview.txt,docs/quickstart.txt:
Replaced half-hearted logging support with flexible
OmegaScript-based approach with new $log command. Also added
$now to allow the current date/time to be logged.
Tue May 27 17:55:24 BST 2003 Olly Betts <olly@survex.com>
* query.cc: Added missing "#include <assert.h>".
Mon May 26 22:41:26 BST 2003 Olly Betts <olly@survex.com>
* configure.in: Don't use libtool; Use AC_CONFIG_FILES - it's the new
autoconf way!
Mon May 26 12:12:22 BST 2003 Olly Betts <olly@survex.com>
* omega.spec.in: Removed %changelog - it hasn't been reliably updated
and only really makes sense when the packaging is done by a third
party anyway.
Mon May 26 12:01:55 BST 2003 Olly Betts <olly@survex.com>
* query.cc: If the query is empty, don't bother running it through
enquire.
Wed Apr 30 01:18:47 BST 2003 Olly Betts <olly@survex.com>
* docs/cgiparams.txt,docs/omegascript.txt: Minor improvements.
Wed Apr 30 01:14:46 BST 2003 Olly Betts <olly@survex.com>
* query.cc: Use correct types for docid and value_no in $value.
Wed Apr 23 16:15:07 BST 2003 Sam Liddicott <sam@liddicott.com>
* templates/xml: add collapse info to xml template.
Wed Apr 23 14:00:37 BST 2003 Olly Betts <olly@survex.com>
* omega.spec.in: Merged changes from Fabrice Colin.
Thu Apr 10 03:14:51 BST 2003 Olly Betts <olly@survex.com>
* configure.in: Updated for 0.6.5 release.
Wed Apr 09 13:56:14 BST 2003 Olly Betts <olly@survex.com>
* omega.cc,query.cc,omega.h,docs/cgiparams.txt: Renamed DATE1, DATE2,
and DAYSMINUS to the more meaningful START, END, and SPAN (NB SPAN
is days before END, or after START, or before today - whereas
DAYSMINUS was before *DATE1* or before today). The old parameter names
are supported (with the original semantics) for now.
Wed Apr 09 13:44:28 BST 2003 Olly Betts <olly@survex.com>
* Makefile.am: Install docs in /usr/share/doc/omega to be FHS
compliant.
* omega.spec.in: Consistently use %{contentdir} instead of /var/lib;
removed redundant second setting of %docdir.
Wed Apr 09 01:21:57 BST 2003 Olly Betts <olly@survex.com>
* Makefile.am: Removed bogus extra "\".
Mon Mar 31 19:42:24 BST 2003 Olly Betts <olly@survex.com>
* Makefile.am: Install documentation!
* omega.spec.in: Merged in changes to RPM packaging from Fabrice Colin
and reworked further.
Fri Mar 28 17:47:45 GMT 2003 Olly Betts <olly@survex.com>
* templates/query,templates/query2: Removed bogus setting of defunct
xB parameter; correctly propagate multiple B parameters.
Fri Mar 28 17:45:41 GMT 2003 Olly Betts <olly@survex.com>
* omindex.cc: Report correct version number (was hard-wired to 1.0!)
Tue Mar 25 14:46:10 GMT 2003 Olly Betts <olly@survex.com>
* query.cc: If xP and P are both empty, classify as SAME_QUERY not
NEW_QUERY as there may be a boolean query too.
* query.cc: Fixed off-by-one error in rounding down topdoc - it was
possible to get to an empty page of hits if there were exactly a
multiple of HITSPERPAGE matches and the matcher over-estimated the
number of matches and Omega displayed page links.
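
A minimal sketch of the boundary case fixed above, assuming a hypothetical helper clip_topdoc (the real code in query.cc is structured differently): the requested first hit is clamped to the last valid hit before being rounded down to a page boundary, so an exact multiple of HITSPERPAGE never yields an empty page.

```cpp
#include <cassert>

// Hypothetical helper: snap topdoc to the start of a page that contains
// at least one hit. Hits are numbered from 0.
unsigned clip_topdoc(unsigned topdoc, unsigned matches, unsigned hits_per_page) {
    assert(hits_per_page > 0);
    if (matches == 0) return 0;
    // Clamp to the last valid hit (matches - 1), not to matches itself:
    // with exactly 20 matches and 10 per page, 20 would round to an empty page.
    if (topdoc >= matches) topdoc = matches - 1;
    return (topdoc / hits_per_page) * hits_per_page;
}
```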
Mon Mar 24 09:40:04 GMT 2003 Sam Liddicott <sam.liddicott@orange.co.uk>
* templates/query: Added propagation of B boolean filter
* templates/query2: factored about a bit more, query2 is
a more modular version of query which should ultimately
be easier for the uninitiated to customise.
Tue Mar 04 01:02:12 GMT 2003 Olly Betts <olly@survex.com>
* omega.cc: Fixed handling of multiple DB parameters to be as
documented.
Fri Feb 28 09:52:03 GMT 2003 Sam Liddicott <sam.liddicott@orange.co.uk>
* Added $collapsed to omegascript to give the number of hits
collapsed into the current hit, eg:
$if{$ne{$collapsed,0},$collapsed hidden results
($value{$cgi{COLLAPSE}})}
* templates/godmode: removed euro ferret icon reference
* templates/godmode: added value dumping, for values from 0-255
Thu Feb 27 11:58:13 GMT 2003 Olly Betts <olly@survex.com>
* Makefile.am,query.cc,docs/omegascript.txt,templates/query:
Added $transform{} which does regexp manipulation (currently
disabled); Added $uniq{} to eliminate duplicates from a sorted
list; Fixed a query with repeated terms to be identified as
SAME_QUERY not EXTENDED_QUERY; remove duplicates from terms
listed in term frequencies.
Wed Feb 26 17:50:26 GMT 2003 Olly Betts <olly@survex.com>
* scriptindex.cc: Allow '_' in fieldnames. Diagnose bad characters
in fieldnames better.
Wed Feb 26 15:13:02 GMT 2003 Sam Liddicott <sam.liddicott@orange.co.uk>
* dbi2omega: Add DBUSER and DBPASSWD env var support so that
password-protected DBs can easily be used
* add cgi parameter COLLAPSE to collapse on key values
* Add $value{key[,docid]} support to omegascript
Wed Feb 26 09:58:01 GMT 2003 Sam Liddicott <sam.liddicott@orange.co.uk>
* bootstrap: Fix success message when building in non-src dir
as configure is written to the src dir.
Mon Jan 6 12:47:55 GMT 2003 James Aylett <james@tartarus.org>
* scriptindex.cc: build fix
Tue Dec 24 20:12:23 GMT 2002 Olly Betts <olly@survex.com>
* configure.in: Version 0.6.4.
Tue Dec 24 20:06:47 GMT 2002 Olly Betts <olly@survex.com>
* scriptindex.cc: Minor tweak.
Tue Dec 24 19:58:57 GMT 2002 Olly Betts <olly@survex.com>
* omega.cc,docs/cgiparams.txt: Prefer MINHITS to MIN_HITS and
RAWSEARCH to RAW_SEARCH since none of the other CGI parameter
names have _ separating words. Also support old names for now.
Mon Dec 23 03:23:33 GMT 2002 Olly Betts <olly@survex.com>
* query.cc,docs/omegascript.txt,templates/query: Added $unstem to map
a stemmed term to the form(s) used in the query; $queryterms now
only includes the first occurrence of each stemmed form; $prettyterm
uses the unstem map.
Sat Dec 21 17:47:33 GMT 2002 Olly Betts <olly@survex.com>
* scriptindex.cc,docs/scriptindex.txt: Replaced index=nopos with
indexnopos action; index and indexnopos now take an optional
prefix argument; index=nopos is handled specially for backwards
compatibility.
Sat Dec 21 17:18:02 GMT 2002 Olly Betts <olly@survex.com>
* scriptindex.cc,docs/scriptindex.txt: Added new scriptindex action
date=FORMAT to generate terms for date range searching.
Sat Dec 21 01:51:32 GMT 2002 Olly Betts <olly@survex.com>
* templates/query: Stop topterms sticking out of green box with
gecko based browsers.
Sat Dec 21 01:44:53 GMT 2002 Olly Betts <olly@survex.com>
* Makefile.am: Distribute docs/scriptindex.txt.
* docs/omegascript.txt: It's $setrelevant not $set_relevant.
Sat Dec 14 13:54:10 GMT 2002 Olly Betts <olly@survex.com>
* configure.in: Version 0.6.3; removed -Wno-long-long as we don't use
long long here.
* query.cc: Compilation fixes.
* templates/query: Don't call $topterms twice!
Sat Dec 14 01:10:48 GMT 2002 Olly Betts <olly@survex.com>
* query.cc: Updated in line with removal of OmSettings.
Wed Dec 11 00:58:49 GMT 2002 Olly Betts <olly@survex.com>
* configure.in,query.cc,docs/omegascript.txt,templates/query:
Added $time which reports how long the match took - when searching
on a remote website, it's hard to gauge how much time is taken by
the search, and how much by the web server and browser; renamed
and_vec to or_vec which better describes its purpose.
Mon Dec 09 17:11:26 GMT 2002 Olly Betts <olly@survex.com>
* query.cc,docs/omegascript.txt,templates/query: Added $dbsize
to return the number of documents in the database being searched.
Use this in the default query template on the "front page" shown
when there's no search.
Mon Dec 09 02:55:46 GMT 2002 Olly Betts <olly@survex.com>
* query.cc,docs/omegascript.txt,templates/query: Added $msizeexact
which returns "true" if $msize is exact (or "" if it is estimated).
This means that you'll see "... of about N matches" less often -
notably it's gone when searching for a single term, which is a
pretty common case.
Sun Dec 08 08:42:47 GMT 2002 Olly Betts <olly@survex.com>
* scriptindex.cc: Replaced icky unportable code which set the filename
to "/dev/fd/0" in order to read from stdin.
Sun Dec 08 06:39:30 GMT 2002 Olly Betts <olly@survex.com>
* query.cc,docs/omegascript.txt: Fixed $hitlist to complain if more
than one parameter is passed; $topterms now defaults to 16 terms
rather than 20; $topterms now weeds out terms which stem to the
same as those in the query, or those already in $topterms.
Sun Dec 08 06:36:04 GMT 2002 Olly Betts <olly@survex.com>
* templates/query: Make background white - the very light grey just
looks dirty; fixed exclusion of TopTerms Javascript when there
are no TopTerms; sample now <small>; language and size now
appear when the corresponding fields are present; fixed
unmatched </small>; fixed missing list of terms matching
each document.
Sat Dec 07 21:20:31 GMT 2002 Olly Betts <olly@survex.com>
* configure.in: Version 0.6.2.
Sat Dec 07 21:04:31 GMT 2002 Olly Betts <olly@survex.com>
* query.cc: Prefer "while (true)" to "while (1)".
Fri Dec 06 04:41:05 GMT 2002 Olly Betts <olly@survex.com>
* omindex.cc: Index .php files by default; non-zero return code if
an exception is caught.
Fri Dec 06 04:30:17 GMT 2002 Olly Betts <olly@survex.com>
* htmlparse.cc: Ignore PHP tags and their contents; fixed tag
scanning code to never read one character past the end of
the document.
Wed Dec 04 18:42:51 GMT 2002 Olly Betts <olly@survex.com>
* omega.cc,omega.h,omindex.cc,query.cc,scriptindex.cc:
Updated in line with OmSettings related changes to the API.
Wed Dec 04 17:13:43 GMT 2002 Olly Betts <olly@survex.com>
* query.cc: Fixed $dbname to return "default" for the default
database, rather than "" - this fixes paging in searches of the
default database.
* templates/query: Removed xDEFAULTOP hidden field which is no longer
used.
Wed Dec 04 11:57:13 GMT 2002 Olly Betts <olly@survex.com>
* templates/query: Removed bogus unmatched '}'.
Thu Nov 28 20:24:08 GMT 2002 Olly Betts <olly@survex.com>
* omega.cc,query.cc: Updated in line with OmEnquire::get_eset() no
longer taking an OmSettings object.
Wed Nov 27 19:02:12 GMT 2002 Olly Betts <olly@survex.com>
* dbi2omega: Return fields in table order; more efficient;
report any error reading a row; if we get a NULL field,
don't output it, and suppress perl warning about use of
an undefined program.
Wed Nov 27 05:22:04 GMT 2002 Olly Betts <olly@survex.com>
* configure.in: Set version to 0.6.0.
Wed Nov 27 05:21:00 GMT 2002 Olly Betts <olly@survex.com>
* configure.in,htmlparse.h,omindex.cc,scriptindex.cc:
Use "-Wall -W" rather than "-Wall -Wunused", and fixed the
warnings this reveals.
Wed Nov 27 04:20:13 GMT 2002 Olly Betts <olly@survex.com>
* Makefile.am,dbi2omega: Added perl script to dump any database
which perl DBI can access into the dump format expected by
scriptindex.
Wed Oct 30 02:02:32 GMT 2002 Olly Betts <olly@survex.com>
* omega.spec.in: Use bootstrap instead of buildall; don't use "-j4"
with make - most people don't have quad processor boxes!
Wed Oct 30 01:56:31 GMT 2002 Olly Betts <olly@survex.com>
* buildall: Removed in favour of bootstrap script.
Tue Oct 29 02:01:58 GMT 2002 Olly Betts <olly@survex.com>
* omindex.cc,scriptindex.cc: Added MAX_PROB_TERM_LENGTH (set to
64) to limit size of probabilistic terms.
Sat Oct 12 17:09:55 BST 2002 Olly Betts <olly@survex.com>
* bootstrap: Copied bootstrap script from xapian-core.
Sat Oct 12 17:05:37 BST 2002 Olly Betts <olly@survex.com>
* configure.in: Version 0.5.3.
Wed Oct 09 16:55:56 BST 2002 Olly Betts <olly@survex.com>
* omega.cc,omega.h,query.cc,docs/{cgiparams.txt,omegascript.txt},
templates/query: revamped the "reset first page when filter changes"
scheme - all filtery things are now serialised and put into the
xFILTER CGI parameter, which copes with multiple B values. Support
for the old way (xB, xDATE1, xDATE2, xDAYSMINUS, xDEFAULTOP) is
included for now (but only copes with a single B value). Added (and
documented) $filters Omegascript command to implement this.
* query.cc: fixed handling of case when topdoc is non-zero, but
no matches were found. This was causing topdoc to be set to -6!
* query.cc: fixed handling of prefixes starting with an X.
Wed Oct 09 15:35:54 BST 2002 Olly Betts <olly@survex.com>
* .cvsignore: Added scriptindex and omega-*.tar.gz; removed libtool.
Sun Oct 06 18:56:40 BST 2002 Olly Betts <olly@survex.com>
* configure.in: Version 0.5.2.
Thu Oct 03 16:42:06 BST 2002 Olly Betts <olly@survex.com>
* query.cc: Added CMD_hit to enumeration.
Wed Oct 02 17:02:25 BST 2002 Olly Betts <olly@survex.com>
* configure.in: Version 0.5.1.
* Makefile.am,configure.in: require automake 1.6.3 and autoconf 2.54
since xapian-core does anyway, and it neatens configure.in slightly.
Wed Oct 02 16:58:39 BST 2002 Olly Betts <olly@survex.com>
* query.cc,docs/omegascript.txt: Added $hit which gives the m-set
number of the current hit.
Sun Sep 22 15:47:33 BST 2002 Olly Betts <olly@survex.com>
* configfile.cc: Corrected use of string.data() to string.c_str().
Sun Sep 22 03:53:35 BST 2002 Olly Betts <olly@survex.com>
* templates/query: Updated xapian url to http://www.xapian.org/
Fri Sep 20 15:36:35 BST 2002 Olly Betts <olly@survex.com>
* configure.in: Version 0.5.0.
Sun Sep 15 03:07:31 BST 2002 Richard Boulton <richard.boulton@omsee.com>
* buildall: Update to latest version, to fix bug with VPATH version
checking for autoconf.
Thu Sep 12 15:11:16 BST 2002 Olly Betts <olly@survex.com>
* htmlparse.cc: Add comment about string::replace() invalidating
iterators.
Thu Sep 12 13:38:05 BST 2002 Olly Betts <olly@survex.com>
* omegascript.vim,omegascript.txt,query.cc: cosmetic tweaks.
Thu Sep 5 14:47:54 BST 2002 Richard Boulton <richard@tartarus.org>
* configure.in: Don't use libtool. I don't know why I ever thought
it was needed.
Thu Sep 5 14:11:51 BST 2002 Richard Boulton <richard@tartarus.org>
* query.cc: Change $and to return true iff all its arguments are
not false, rather than if one or more of the arguments is false.
* docs/omegascript.txt: Update documentation of $and{}
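
The corrected $and semantics can be sketched like this, assuming (as elsewhere in OmegaScript, where commands return "true" or "") that an empty string is the only false value. The function name omegascript_and is hypothetical, and the sketch takes already-evaluated arguments.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical helper: true iff every argument is non-empty (i.e. not false).
// The earlier, buggy behaviour returned true when some argument was false.
bool omegascript_and(const std::vector<std::string>& args) {
    return std::all_of(args.begin(), args.end(),
                       [](const std::string& a) { return !a.empty(); });
}
```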
Fri Aug 23 13:27:02 BST 2002 James Aylett <tartarus@users.sourceforge.net>
* docs/quickstart.txt: encourage people to call their first
database 'default' since this will work straight off.
Wed Aug 21 17:52:36 BST 2002 Richard Boulton <richard@tartarus.org>
* query.cc: Add $slice{} command, to slice a list at a set of
positions (given by a second list).
Also, bugfix: require $hitlist{} to take at least one parameter:
it currently segfaults if given none.
* docs/omegascript.txt: Document $slice{}.
* extra/omegascript.vim: Update syntax highlighting.
Wed Aug 21 18:03:43 BST 2002 James Aylett <tartarus@users.sourceforge.net>
* omindex.cc: tidy up output so it doesn't wrap so much
Wed Aug 21 18:01:38 BST 2002 James Aylett <tartarus@users.sourceforge.net>
* htmlparse.cc: fixed bug in entity reference handling
Wed Aug 21 13:21:12 BST 2002 James Aylett <tartarus@users.sourceforge.net>
* omindex.cc: Bugfix to metaterm generation when operating on an
absolute URL that is also at the root of its web server.
Wed Aug 21 10:48:06 BST 2002 Richard Boulton <richard@tartarus.org>
* scriptindex.cc: If a field has multiple instances, keep all of
them (previously only kept the final occurrence).
* docs/scriptindex.txt: Mention that multiple instances of fields
are permitted.
Tue Aug 20 18:02:45 BST 2002 James Aylett <tartarus@users.sourceforge.net>
* docs/quickstart.txt: correct for new(ish) omindex behaviour
Sat Aug 17 13:38:57 BST 2002 Richard Boulton <richard@tartarus.org>
* extra/omegascript.vim: Quick attempt at a vim syntax highlighting
file for omegascript. Recognises files only if they're in a
directory called "templates": perhaps we should adopt a suffix to
make recognition easier.
Read the file for installation instructions.
Thu Aug 15 11:21:20 BST 2002 Richard Boulton <richard@tartarus.org>
* scriptindex.cc: Allow updating of databases by a command line
switch, and also turn off verbose output (can be turned back
on with a switch).
* docs/scriptindex.txt: Document the "unique" tag.
Thu Aug 15 11:18:21 BST 2002 Richard Boulton <richard@tartarus.org>
* buildall: Copy buildall from xapian-core - the old one breaks
for me (due to odd aclocal paths) but the new one is fine.
We should make a common module to hold build stuff to be shared
between modules, though.
Mon Aug 12 01:34:42 BST 2002 Richard Boulton <richard@tartarus.org>
* scriptindex.cc: Bug fix - index without positional information
if "nopos" is specified, rather than the other way around.
Bug fix - don't completely eradicate newlines in multiline values,
until they have a chance to be converted to spaces.
Delete documents if no fields other than unique fields are
specified.
Add some simple debugging, and write messages to a log file in
the database directory.
* configure.in: Use libtool.
Fri Aug 9 13:57:32 BST 2002 Richard Boulton <richard@tartarus.org>
* scriptindex.cc: Fix compile errors, by changing string
constructors to take begin and end iterators, instead of a begin
and a length.
Fri Jul 05 19:33:55 BST 2002 Olly Betts <olly@survex.com>
* omega.spec.in: Fixed wrt /usr/lib/omega/bin/omega.
Fri Jul 05 19:20:05 BST 2002 Olly Betts <olly@survex.com>
* Makefile.am, docs/quickstart.txt: Install omega as
${prefix}/lib/omega/bin/omega.
Thu Jul 04 02:11:46 BST 2002 Olly Betts <olly@survex.com>
* scriptindex.cc, docs/scriptindex.txt: new indexer - indexing
behaviour is controlled by a simple but powerful script.
* Makefile.am: tidied up.
* configfile.cc, docs/quickstart.txt: database and templates default to
being in /var/lib/omega rather than /home/omega.
* docs/quickstart.txt: describe the new test mode (command line) rather
than the old one (stdin).
* omega.cc, docs/cgiparams.txt: If xP isn't set, honour paging and
R-set. So RAW_SEARCH now only disables snapping TOPDOC to a multiple
of HITSPERPAGE.
* query.cc: "using namespace std;"
Fri Jun 14 00:07:20 BST 2002 Olly Betts <olly@survex.com>
* $prettyterm{} no longer adds a trailing '.' if the term also exists
with an R prefix and stems to itself.
Fri Jun 14 00:02:16 BST 2002 Olly Betts <olly@survex.com>
* MORELIKE can now take a termname - this allows MORELIKE to be used
with a unique id from an external database if it has been indexed
as a boolean term.
Thu Jun 13 00:01:11 BST 2002 Olly Betts <olly@survex.com>
* omega.conf: removed trailing slashes from directory names.
* query.cc: removed extra slash added to template_dir; improved
reporting of errors opening template file.
Wed Jun 12 23:51:11 BST 2002 Olly Betts <olly@survex.com>
* Added an alternative test mode - you can now pass parameters as
command line arguments, which is more convenient for repeating
the same test query, and for automated testing, e.g.:
omega 'P=information retrieval' DB=papers
If the first parameter starts with a "-" and doesn't contain an
"=", omega now outputs the version string and stops (to gracefully
handle "omega --version" and "omega --help").
Wed Jun 12 23:39:20 BST 2002 Olly Betts <olly@survex.com>
* omindex.cc: removed OLD_PREFIXES code - shout if you were using it.
Fri May 17 14:09:25 BST 2002 Olly Betts <olly@survex.com>
* Pass the database to the query parser (not used there at present,
but will allow wildcarded searches, etc to be implemented).
Thu May 16 17:57:34 BST 2002 Olly Betts <olly@survex.com>
* <algo.h> -> <algorithm>.
Thu May 16 15:41:14 BST 2002 Sam Liddicott <sam@ananova.com>
* Removed extra package again!
* Moved images to /var/www/icons/omega till we think of something
better. Should be the most harmless solution that still works
without requiring too much brains on the part of the installer
Thu May 16 14:53:54 BST 2002 Sam Liddicott <sam@ananova.com>
* Moved images to a separate optional package to stop touching
user's web tree until we work out what to do. sysadmin can
still install images if he wants and on a redhat box they will
end up in the right place. This will no doubt get revisited later,
that's fine by me.
Thu May 16 13:31:27 BST 2002 Sam Liddicott <sam@ananova.com>
* Added loads more missing files like images and templates to the
package
* Also fixed the templates to use the new images dir (if they used
images, which they actually don't)
Thu May 16 12:56:55 BST 2002 Sam Liddicott <sam@ananova.com>
* Fixes to spec file to add various missing files
Wed May 15 12:59:37 BST 2002 Olly Betts <olly@survex.com>
* omindex now understands acronyms (N.A.T.O. E.T ...).
* $highlight{} now understands "&" (AT&T M&S ...) and acronyms.
Tue May 14 13:08:41 BST 2002 Olly Betts <olly@survex.com>
* Index <word>&<word> as a single term (e.g. AT&T, M&S, A&P).
Tue May 14 12:37:49 BST 2002 Olly Betts <olly@survex.com>
* omindex.cc: cleaned up a little.
Tue May 14 11:24:42 BST 2002 Olly Betts <olly@survex.com>
* Fixed config.h inclusion; using std::*.
Tue May 14 11:18:37 BST 2002 Olly Betts <olly@survex.com>
* Updated.
Tue May 14 11:16:03 BST 2002 Olly Betts <olly@survex.com>
* Added SORT and SORTBANDS.
Mon May 13 12:52:29 BST 2002 Olly Betts <olly@survex.com>
* Autoconf 2.50.
* Commented out omindex-config (since it's unfinished) and XML support
(since only omindex-config uses it).
Thu May 02 16:06:02 BST 2002 Olly Betts <olly@survex.com>
* Updated to reflect removal of OmData.
Wed May 01 11:26:59 BST 2002 Olly Betts <olly@survex.com>
* Changed to use queryparser in libomqueryparser.
Tue Apr 23 15:10:42 BST 2002 Olly Betts <olly@survex.com>
* Make buildall smart enough to generate aclocal.m4 properly and
remove acinclude.m4. It now also extracts the package name from
configure.in so we can use the same buildall everywhere; fixed
problem with double use of AM_CXXFLAGS in Makefile.am.
Tue Apr 23 14:27:29 BST 2002 Olly Betts <olly@survex.com>
* Updated for xapian-config and xapian.m4 changes.
Thu Apr 18 14:37:05 BST 2002 Olly Betts <olly@survex.com>
* Updated buildall; minor tweaks to configure.in.
Wed Apr 17 12:31:18 BST 2002 Olly Betts <olly@survex.com>
* Removed references to xapian-config uninst options.
Fri Apr 12 15:48:33 BST 2002 Olly Betts <olly@survex.com>
* Remove parsequery.cc on "make maintainer-clean".
Fri Apr 12 16:19:19 BST 2002 Olly Betts <olly@survex.com>
* Require automake 1.5.
Fri Apr 12 12:47:04 BST 2002 Olly Betts <olly@survex.com>
* Tweaked what gets interpreted as a phrase.
Fri Apr 12 12:44:00 BST 2002 Olly Betts <olly@survex.com>
* Fixed to use AM_CFLAGS and AM_CXXFLAGS.
Mon Apr 01 23:34:09 BST 2002 Olly Betts <olly@survex.com>
* Fixed support for decimal numeric entities (e.g. "&#246;")
* Added support for all iso-8859-1 named entities (e.g. "&ouml;")
Mon Apr 01 15:07:31 BST 2002 Olly Betts <olly@survex.com>
* Applied patch from "orion orion" to fix problem in HTML parsing.
Mon Mar 25 13:11:14 GMT 2002 Olly Betts <olly@survex.com>
* More tolerant treatment of random punctuation in query.
Mon Feb 4 14:57:36 GMT 2002 Sam Liddicott <sam@ananova.com>
* Added support for repeated fields in document data.
$field{fieldname} may now return multiple tab separated values if
more than one instance of a field exists in the document data
Tue Jan 15 16:29:39 GMT 2002 Sam Liddicott <sam@ananova.com>
* Fixed date_range_filter for the case where DATE1 and DATE2 don't
share the same MONTH and YEAR and M## terms for intermediate months
need calculating between the years.
Thu Jan 10 15:39:43 GMT 2002 Sam Liddicott <sam@ananova.com>
* Added $htmlstrip{} to strip out html tags
Thu Jan 10 14:34:35 GMT 2002 James Aylett <tartarus@users.sourceforge.net>
* toptermsjs snippet now included inside the HEAD, so it's
actually legal HTML. Snippet now sets the required 'type'
attribute as well. (It keeps the technically illegal
'language' attribute because I have a sneaking suspicion it
won't work otherwise.)
Thu Jan 10 14:30:19 GMT 2002 James Aylett <tartarus@users.sourceforge.net>
* $opt with two arguments now acts as a lookup for a $setmap
map. This was previously documented in a misleading fashion.
The new system is backwards compatible with the old.
Wed Jan 9 Sam Liddicott <sam@ananova.com>
* Added RAW_SEARCH as cgi param which when set stops change-search
detection being performed and processes rset, topdoc and page-change
parameters ( [ ] < > 1 2 etc etc ) anyway
* Added MIN_HITS cgi param to request many more hits than can
fit on the page so we can be confident that the next few
consecutive pages will really be needed
* Added xml template which when combined with RAW_SEARCH=1
can be very useful when searching is done from another
script
Fri Dec 21 17:56:02 GMT 2001 Olly Betts <olly@survex.com>
* Namespace fixes to allow use of find and find_if on Redhat's
"GCC 2.96".
Fri Dec 21 17:53:59 GMT 2001 Olly Betts <olly@survex.com>
* Added quick'n'dirty interface to allow experimentation with
OmBiasFunctor.
Thu Dec 20 14:46:33 GMT 2001 Olly Betts <olly@survex.com>
* Document xDB, xDAYSMINUS, xDATE1, xDATE2, xB.
Thu Dec 20 12:55:29 GMT 2001 Olly Betts <olly@survex.com>
* Use double quotes on parameters to <BODY>.
Mon Dec 17 15:01:43 GMT 2001 Olly Betts <olly@survex.com>
* Get rid of whitespace between hundreds and tens image in page
links.
Fri Dec 14 17:26:48 GMT 2001 Olly Betts <olly@survex.com>
* Force first page of hits if DB, DEFAULTOP, B, DAYSMINUS, DATE1,
or DATE2 changes; also clear relevance judgements if DB changes.
Fri Dec 14 16:21:07 GMT 2001 Olly Betts <olly@survex.com>
* Removed restriction on minimum page size (was 10) - for a shopping
type application with images next to each hit, 5 or fewer per page
might be reasonable; even one result per page makes sense for some
applications.
Fri Dec 14 15:37:20 GMT 2001 Olly Betts <olly@survex.com>
* Added $error to make nicer error reporting possible.
Fri Dec 14 14:49:18 GMT 2001 Olly Betts <olly@survex.com>
* Give more helpful messages for query syntax errors in cases where
we can without elaborate YACC hackery.
Thu Dec 13 15:10:24 GMT 2001 Olly Betts <olly@survex.com>
* For image page buttons, display pages 10-999 by using 2 or 3 images.
Thu Dec 13 15:02:16 GMT 2001 Olly Betts <olly@survex.com>
* New operators: $div{}, $mod{}, $mul{}, $sub{}, $ge{}, $gt{}, $le{},
$lt{}.
Wed Dec 12 16:37:47 GMT 2001 Olly Betts <olly@survex.com>
* Updated omegascript documentation.
Wed Dec 12 15:43:19 GMT 2001 Olly Betts <olly@survex.com>
* Fixed TOPDOC clipping.
Wed Dec 12 15:36:20 GMT 2001 Olly Betts <olly@survex.com>
* templates/query: Fixed typo which caused "..." to appear after
page buttons when it wasn't appropriate.
Wed Dec 12 15:11:23 GMT 2001 Olly Betts <olly@survex.com>
* omega: Added stopword list (still hardcoded at present though).
Wed Dec 12 12:46:57 GMT 2001 Olly Betts <olly@survex.com>
* omindex: index unstemmed terms with prefix 'R' (mnemonic: Raw).
* omega: $topterms will now return terms with prefix 'R'.
* parsequery.yy: fixed handling of DEFAULT_OP; "+first second" and
"-first second" now work; stopwording queries working (currently
stopword list is hardwired to just "the") - stopwords are ignored
when used as normal terms, but not in phrases, or with + and -.
* templates/query: make use of $prettyterm{}.
Wed Dec 12 11:11:30 GMT 2001 Olly Betts <olly@survex.com>
* $highlight{} now uses find_if not find_first_of (faster).
* Fixed detection of new/old/extended query when a term occurs
in the query more than once.
* Added $prettyterm{TERM} to convert a probabilistic term for
display to the user.
* $map would allow more than two arguments, but ignore them. Fixed
to take exactly two.
Fri Dec 07 15:59:21 GMT 2001 Olly Betts <olly@survex.com>
* Added macros to OmegaScript.
* template/query: updated to use macros.
* Removed special case to allow no-argument commands to accept an empty
argument list (e.g. "$thispage{}" rather than "$thispage"). The only
reason this was useful was to allow "$thispage{}s" which can just as
well be written using a comment to force the parser to do what you want,
e.g. "$thispage${}s".
Thu Dec 06 18:59:34 GMT 2001 Olly Betts <olly@survex.com>
* If a stemmer is set, and all_stem isn't, only stem terms starting
with a lowercase letter.
Thu Dec 06 18:49:40 GMT 2001 Olly Betts <olly@survex.com>
* parsequery.yy: changed to use find_if() (faster than find_first_of()).
Thu Dec 06 17:46:37 GMT 2001 Olly Betts <olly@survex.com>
* Base page links on estimated number of matches, not minimum.
Wed Dec 05 17:07:33 GMT 2001 Olly Betts <olly@survex.com>
* omindex: minor speed tweaks.
Wed Dec 05 16:52:21 GMT 2001 Olly Betts <olly@survex.com>
* omindex: further HTML parser speed-ups.
Wed Dec 05 16:31:33 GMT 2001 Olly Betts <olly@survex.com>
* omindex: sped up HTML parsing.
Wed Dec 05 14:52:53 GMT 2001 Olly Betts <olly@survex.com>
* omindex: parsing terms from text is now twice as fast.
Thu Nov 29 16:53:45 GMT 2001 Olly Betts <olly@survex.com>
* NEAR phrases (e.g. "a NEAR b NEAR c") now work; removed "{a b c}"
syntax for NEAR phrases.
Thu Nov 29 15:25:54 GMT 2001 Olly Betts <olly@survex.com>
* $highlight{} now allows you to specify the tags to use for the
highlighting.
Thu Nov 29 15:24:53 GMT 2001 Olly Betts <olly@survex.com>
* topdoc is unsigned so subtracting and then checking if it's < 0
doesn't work...
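The pitfall noted in this entry can be illustrated as follows. Python integers do not wrap, so this sketch emulates an unsigned type with a mask; the 32-bit width, the "previous page" scenario and all function names are assumptions for the example, not Omega's actual code.

```python
UINT_MASK = 0xFFFFFFFF  # assumption: a 32-bit unsigned type


def unsigned_sub(a, b):
    """Subtract the way a 32-bit unsigned integer would (wraps modulo 2**32)."""
    return (a - b) & UINT_MASK


def previous_page_buggy(topdoc, hits_per_page):
    """Subtract first, then test for < 0: the test can never succeed."""
    new_top = unsigned_sub(topdoc, hits_per_page)
    if new_top < 0:  # never true for an unsigned value
        new_top = 0
    return new_top


def previous_page_fixed(topdoc, hits_per_page):
    """Compare before subtracting, so no wrap-around can occur."""
    if topdoc < hits_per_page:
        return 0
    return unsigned_sub(topdoc, hits_per_page)
```

With topdoc = 5 and 10 hits per page, the buggy version yields a huge value instead of 0, while the fixed version clamps to 0.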
Wed Nov 28 15:45:39 GMT 2001 Olly Betts <olly@survex.com>
* Fixed clipping of hit page in the case where the number of matches
is an exact multiple of HITSPERPAGE.
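A minimal sketch of the boundary case, in Python for illustration only. The function name `clip_topdoc` and its parameters are hypothetical, and Omega's own arithmetic may differ.

```python
def clip_topdoc(topdoc, total_matches, hits_per_page):
    """Clip the index of the first displayed hit to the last non-empty page.

    Using (total_matches - 1) rather than total_matches matters when the
    number of matches is an exact multiple of hits_per_page: with 20
    matches and 10 per page, the last page starts at 10, not at 20.
    """
    if total_matches == 0:
        return 0
    last_page_start = ((total_matches - 1) // hits_per_page) * hits_per_page
    return min(topdoc, last_page_start)
```

Computing the last page start as `(total_matches // hits_per_page) * hits_per_page` would point at an empty page whenever the match count divides evenly.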
Wed Nov 28 14:03:48 GMT 2001 Olly Betts <olly@survex.com>
* Added $hostname{URL}; $version output now says "Xapian - omega
<version>".
Wed Nov 28 13:04:46 GMT 2001 Olly Betts <olly@survex.com>
* docs/cgiparams.txt: Minor corrections and updates.
Wed Nov 28 13:03:40 GMT 2001 Olly Betts <olly@survex.com>
* If we're asked for a page of hits beyond the end of the matches, clip
to the last page of matches rather than the first.
Wed Nov 28 13:02:31 GMT 2001 Olly Betts <olly@survex.com>
* For an EXTENDED_QUERY, force the first page of hits.
Wed Nov 28 12:56:56 2001 James Aylett <tartarus@users.sourceforge.net>
* Lower case terms when constructing the query (otherwise why
do we store them in the database that way? :-)
Wed Nov 28 12:36:49 GMT 2001 Olly Betts <olly@survex.com>
* Fettled default query template.
Wed Nov 28 12:33:52 GMT 2001 Olly Betts <olly@survex.com>
* Request one more match than the last we want to display so we can
tell if the next page of hits is empty or not - otherwise we risk
offering a "next page" link when there are no more hits.
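The "request one more" technique from this entry can be sketched like this. It is an illustrative Python sketch: the list `matches` stands in for the search engine's ranked results, and the name `fetch_page` is an assumption.

```python
def fetch_page(matches, first, hits_per_page):
    """Return (hits_to_display, has_next_page) for one page of results.

    `matches` stands in for the ranked match list; in a real search the
    slice below would be a request to the search engine.
    """
    # Ask for one more hit than we intend to display.
    window = matches[first:first + hits_per_page + 1]
    # If the extra hit came back, the next page is not empty.
    has_next = len(window) > hits_per_page
    return window[:hits_per_page], has_next
```

The extra hit is never displayed; it only decides whether a "next page" link should be offered.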
Mon Nov 26 16:28:00 2001 James Aylett <tartarus@users.sourceforge.net>
* --no-recurse / -l option added; useful if your sites are
nested in their disc storage (particularly things like
http://example.com/ being a distinct site, with
http://example.com/product being within it)
* --mime-type now really works (it was --mime-map in the code)
* documentation updated further
Mon Nov 26 14:39:00 2001 James Aylett <tartarus@users.sourceforge.net>
* options parsing fixed so minimised/unrecognised long options
don't segfault
Mon Nov 26 14:00:13 2001 James Aylett <tartarus@users.sourceforge.net>
* omindex can now index part of a site (previously 'subsite')
by having an index base within the site's disc storage
Mon Nov 26 13:57:10 2001 James Aylett <tartarus@users.sourceforge.net>
* Documentation updated for recent changes
Thu Nov 22 13:24:45 GMT 2001 Olly Betts <olly@survex.com>
* Use $nice{} in query template, but don't use $freqs. Use numbers as
page image button tooltips on Netscape 4.
Thu Nov 22 13:02:17 GMT 2001 Olly Betts <olly@survex.com>
* Herded escaped CGI parameter mangling code back into cgiparam.cc;
added special handling for numeric image button names.
Thu Nov 22 12:55:00 GMT 2001 Olly Betts <olly@survex.com>
* Fixed $nice to put the comma (or dot) in the right place.
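For reference, digit grouping of the kind $nice performs can be sketched as below. This is a Python illustration for non-negative integers only; the function name `nice` and the `sep` parameter are assumptions, not Omega's implementation.

```python
def nice(number, sep=","):
    """Insert `sep` between groups of three digits, counting from the right.

    Assumes `number` is a non-negative integer.
    """
    digits = str(number)
    groups = []
    while len(digits) > 3:
        groups.insert(0, digits[-3:])  # peel off the last three digits
        digits = digits[:-3]
    groups.insert(0, digits)  # whatever remains is the leading group
    return sep.join(groups)
```

Grouping from the right is what keeps the separator in the correct place for numbers whose length is not a multiple of three.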
Tue Nov 20 17:30:19 GMT 2001 Olly Betts <olly@survex.com>
* $lastpage now returns 0 when there are no matches (previously
gave a very large answer).
Tue Nov 20 12:30:47 GMT 2001 Olly Betts <olly@survex.com>
* $terms now only returns terms which were in the parsed query
(boolean filter terms are excluded).
Tue Nov 20 12:07:54 GMT 2001 Olly Betts <olly@survex.com>
* Fixed bug in date range filtering (got it wrong when start and end
date were in the same month).
* DAYSMINUS now counts back from DATE1 (if specified) rather than
always counting back from the present.
Mon Nov 19 17:13:24 GMT 2001 Olly Betts <olly@survex.com>
* Added date-range filtering (not fully tested yet).
Mon Nov 19 15:21:31 GMT 2001 Olly Betts <olly@survex.com>
* Fixed (c) message displayed by -v (BrightStation "PLC" not "Inc.",
first (c) 1999).
Fri Nov 16 11:49:20 GMT 2001 Olly Betts <olly@survex.com>
* New OmegaScript commands: $allterms{<docid>}, $freq{<term>},
$nice{<number>}, $set_relevant{<docid>}.
* $map{} now returns a list (shouldn't affect most users - if
the extra tabs are a problem, change `$map{...}' to
`$list{$map{...},}' ).
* Template `query' now preserves value of THRESHOLD.
* Template `godmode' fixed to actually work.
Wed Nov 14 15:04:13 GMT 2001 Olly Betts <olly@survex.com>
* Fixed to compile with GCC 3.0.
Wed Nov 14 14:54:53 GMT 2001 Olly Betts <olly@survex.com>
* Updated for changes to OmQuery
Tue Nov 06 13:10:15 GMT 2001 Olly Betts <olly@survex.com>
* Updated .cvsignore.
Tue Nov 06 13:02:04 GMT 2001 Olly Betts <olly@survex.com>
* Fixed lookup of CGI parameter THRESHOLD.
Tue Nov 6 12:38:37 GMT 2001 Richard Boulton <richard@tartarus.org>
* Moved configure.ac to configure.in: depending on autoconf 2.13 is
not needed yet.
Tue Nov 06 12:23:55 GMT 2001 Olly Betts <olly@survex.com>
* Added support for percentage threshold cutoff (CGI var THRESHOLD).
The code for calculating better percentages has been pushed into
Xapian, so it has been removed from here.
Mon Nov 5 12:42:26 GMT 2001 Richard Boulton <richard@tartarus.org>
* Omega moved to new home, from om-examples/omega.
Standalone build system added.