/*
search.c - WordNet library of search code
*/
#ifdef _WINDOWS
#include <windows.h>
#include <windowsx.h>
#endif
#include <stdio.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>
#include <assert.h>
#include <limits.h>
#include "wn.h"
static char *Id = "$Id: search.c,v 1.166 2006/11/14 20:52:45 wn Exp $";
/* For adjectives, indicates synset type */
#define DONT_KNOW 0
#define DIRECT_ANT 1 /* direct antonyms (cluster head) */
#define INDIRECT_ANT 2 /* indirect antonyms (similar) */
#define PERTAINYM 3 /* no antonyms or similars (pertainyms) */
/* Flags for printsynset() */
#define ALLWORDS 0 /* print all words */
#define SKIP_ANTS 0 /* skip printing antonyms in printsynset() */
#define PRINT_ANTS 1 /* print antonyms in printsynset() */
#define SKIP_MARKER 0 /* skip printing adjective marker */
#define PRINT_MARKER 1 /* print adjective marker */
/* Trace types used by printspaces() to determine print style */
#define TRACEP 1 /* traceptrs */
#define TRACEC 2 /* tracecoords() */
#define TRACEI 3 /* traceinherit() */
#define DEFON 1
#define DEFOFF 0
/* Forward function declarations */
static void WNOverview(char *, int);
static void findverbgroups(IndexPtr);
static void add_relatives(int, IndexPtr, int, int);
static void free_rellist(void);
static void printsynset(char *, SynsetPtr, char *, int, int, int, int);
static void printantsynset(SynsetPtr, char *, int, int);
static char *printant(int, SynsetPtr, int, char *, char *);
static void printbuffer(char *);
static void printsns(SynsetPtr, int);
static void printsense(SynsetPtr, int);
static void catword(char *, SynsetPtr, int, int, int);
static void printspaces(int, int);
static void printrelatives(IndexPtr, int);
static int HasHoloMero(IndexPtr, int);
static int HasPtr(SynsetPtr, int);
static int getsearchsense(SynsetPtr, int);
static int depthcheck(int, SynsetPtr);
static void interface_doevents();
static void getexample(char *, char *);
static int findexample(SynsetPtr);
/* Static variables */
static int prflag, sense, prlexid;
static int overflag = 0; /* set when output buffer overflows */
static char searchbuffer[SEARCHBUF];
static int lastholomero; /* keep track of last holo/meronym printed */
#define TMPBUFSIZE (1024*10)
static char tmpbuf[TMPBUFSIZE]; /* general purpose printing buffer */
static char wdbuf[WORDBUF]; /* general purpose word buffer */
static char msgbuf[256]; /* buffer for constructing error messages */
static int adj_marker;
extern long last_bin_search_offset;
/* Find word in index file and return parsed entry in data structure.
Input word must be exact match of string in database. */
IndexPtr index_lookup(char *word, int dbase)
{
IndexPtr idx = NULL;
FILE *fp;
char *line;
if ((fp = indexfps[dbase]) == NULL) {
sprintf(msgbuf, "WordNet library error: %s indexfile not open\n",
partnames[dbase]);
display_message(msgbuf);
return(NULL);
}
if ((line = bin_search(word, fp)) != NULL) {
idx = parse_index( last_bin_search_offset, dbase, line);
}
return (idx);
}
/* This function parses an entry from an index file into an Index data
* structure. It takes the byte offset and file number, and optionally the
* line. If the line is NULL, parse_index will get the line from the file.
* If the line is non-NULL, parse_index won't look at the file, but it still
* needs the dbase and offset parameters to be set, so it can store them in
* the Index struct.
*/
IndexPtr parse_index(long offset, int dbase, char *line) {
IndexPtr idx = NULL;
char *ptrtok;
int j;
if ( !line )
line = read_index( offset, indexfps[dbase] );
idx = (IndexPtr)calloc(1, sizeof(Index));
assert(idx);
/* set offset of entry in index file */
idx->idxoffset = offset;
/* get the word */
ptrtok=strtok(line," \n");
idx->wd = strdup(ptrtok);
assert(idx->wd);
/* get the part of speech */
ptrtok=strtok(NULL," \n");
idx->pos = strdup(ptrtok);
assert(idx->pos);
/* get the sense (collins) count */
ptrtok=strtok(NULL," \n");
idx->sense_cnt = atoi(ptrtok);
/* get the number of pointer types */
ptrtok=strtok(NULL," \n");
idx->ptruse_cnt = atoi(ptrtok);
if (idx->ptruse_cnt < 0 || (unsigned int)idx->ptruse_cnt > UINT_MAX/sizeof(int)) {
free_index(idx);
return(NULL);
}
if (idx->ptruse_cnt) {
idx->ptruse = (int *) malloc(idx->ptruse_cnt * (sizeof(int)));
assert(idx->ptruse);
/* get the pointer types */
for(j=0;j < idx->ptruse_cnt; j++) {
ptrtok=strtok(NULL," \n");
idx->ptruse[j] = getptrtype(ptrtok);
}
}
/* get the number of offsets */
ptrtok=strtok(NULL," \n");
idx->off_cnt = atoi(ptrtok);
/* get the number of senses that are tagged */
ptrtok=strtok(NULL," \n");
idx->tagged_cnt = atoi(ptrtok);
if (idx->off_cnt < 0 || (unsigned long)idx->off_cnt > ULONG_MAX/sizeof(long)) {
free_index(idx);
return(NULL);
}
/* make space for the offsets */
idx->offset = (unsigned long *) malloc(idx->off_cnt * sizeof(long));
assert(idx->offset);
/* get the offsets */
for(j=0;j<idx->off_cnt;j++) {
ptrtok=strtok(NULL," \n");
idx->offset[j] = atol(ptrtok);
}
return(idx);
}
/* 'smart' search of index file. Find word in index file, trying different
techniques - replace hyphens with underscores, replace underscores with
hyphens, strip hyphens and underscores, strip periods. */
IndexPtr getindex(char *searchstr, int dbase)
{
int i, j, k;
char c;
char strings[MAX_FORMS][WORDBUF]; /* vector of search strings */
static IndexPtr offsets[MAX_FORMS];
static int offset;
/* This works like strtok(): if passed with a non-null string,
prepare vector of search strings and offsets. If string
is null, look at current list of offsets and return next
one, or NULL if no more alternatives for this word. */
if (searchstr != NULL) {
/* Bail out if the input is too long for us to handle */
if (strlen(searchstr) > (WORDBUF - 1)) {
strcpy(msgbuf, "WordNet library error: search term is too long\n");
display_message(msgbuf);
return(NULL);
}
offset = 0;
strtolower(searchstr);
for (i = 0; i < MAX_FORMS; i++) {
strcpy(strings[i], searchstr);
offsets[i] = 0;
}
strsubst(strings[1], '_', '-');
strsubst(strings[2], '-', '_');
/* strings[3]: strip all underscores and hyphens;
strings[4]: strip all periods */
for (i = j = k = 0; (c = searchstr[i]) != '\0'; i++) {
if (c != '_' && c != '-')
strings[3][j++] = c;
if (c != '.')
strings[4][k++] = c;
}
strings[3][j] = '\0';
strings[4][k] = '\0';
/* Get offset of first entry. Then eliminate duplicates
and get offsets of unique strings. */
if (strings[0] != NULL)
offsets[0] = index_lookup(strings[0], dbase);
for (i = 1; i < MAX_FORMS; i++)
if (strings[i] != NULL && (strcmp(strings[0], strings[i])))
offsets[i] = index_lookup(strings[i], dbase);
}
for (i = offset; i < MAX_FORMS; i++)
if (offsets[i]) {
offset = i + 1;
return(offsets[i]);
}
return(NULL);
}
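/* Usage sketch, not part of the original library: getindex() follows the
   strtok() calling convention described above. The first call passes the
   search string (which is lowercased in place, so it must be writable);
   later calls pass NULL to fetch the next alternative form. The guard
   macro WN_GETINDEX_SKETCH and the helper name count_index_forms() are
   hypothetical and exist only for illustration. */
#ifdef WN_GETINDEX_SKETCH
static int count_index_forms(char *word, int dbase)
{
    int n = 0;
    IndexPtr idx = getindex(word, dbase);  /* first call primes the vector */

    while (idx != NULL) {
        n++;                               /* one index entry per matching form */
        free_index(idx);                   /* caller owns each returned entry */
        idx = getindex(NULL, dbase);       /* NULL asks for the next alternative */
    }
    return n;
}
#endif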
/* Read synset from data file at byte offset passed and return parsed
entry in data structure. */
SynsetPtr read_synset(int dbase, long boffset, char *word)
{
FILE *fp;
if((fp = datafps[dbase]) == NULL) {
sprintf(msgbuf, "WordNet library error: %s datafile not open\n",
partnames[dbase]);
display_message(msgbuf);
return(NULL);
}
fseek(fp, boffset, SEEK_SET); /* position file to byte offset requested */
return(parse_synset(fp, dbase, word)); /* parse synset and return */
}
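/* Usage sketch, not part of the original library: read the synset for the
   first sense of an index entry and print its words. It assumes idx came
   from index_lookup() or getindex() on the same dbase. The guard macro
   WN_READ_SYNSET_SKETCH and the helper name print_first_sense_words() are
   hypothetical and exist only for illustration. */
#ifdef WN_READ_SYNSET_SKETCH
static void print_first_sense_words(IndexPtr idx, int dbase)
{
    SynsetPtr syn;
    int i;

    if (idx == NULL || idx->off_cnt == 0)
        return;                            /* nothing to read */
    syn = read_synset(dbase, (long)idx->offset[0], idx->wd);
    if (syn == NULL)
        return;                            /* data file not open or bad offset */
    for (i = 0; i < syn->wcount; i++)
        printf("%s\n", syn->words[i]);
    free_synset(syn);                      /* releases the words, pointers and gloss */
}
#endif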
/* Read synset at current byte offset in file and return parsed entry
in data structure. */
SynsetPtr parse_synset(FILE *fp, int dbase, char *word)
{
static char line[LINEBUF];
char tbuf[SMLINEBUF] = "";
char *ptrtok;
char *tmpptr;
int foundpert = 0;
char wdnum[3];
int i;
SynsetPtr synptr;
long loc; /* sanity check on file location */
loc = ftell(fp);
if ((tmpptr = fgets(line, LINEBUF, fp)) == NULL)
return(NULL);
synptr = (SynsetPtr)calloc(1, sizeof(Synset));
assert(synptr);
synptr->sstype = DONT_KNOW;
synptr->searchtype = -1;
ptrtok = line;
/* looking at offset */
ptrtok = strtok(line," \n");
synptr->hereiam = atol(ptrtok);
/* sanity check - make sure starting file offset matches first field */
if (synptr->hereiam != loc) {
sprintf(msgbuf, "WordNet library error: no synset at location %ld\n",
loc);
display_message(msgbuf);
free(synptr);
return(NULL);
}
/* looking at FNUM */
ptrtok = strtok(NULL," \n");
synptr->fnum = atoi(ptrtok);
/* looking at POS */
ptrtok = strtok(NULL, " \n");
synptr->pos = strdup(ptrtok);
assert(synptr->pos);
if (getsstype(synptr->pos) == SATELLITE)
synptr->sstype = INDIRECT_ANT;
/* looking at numwords */
ptrtok = strtok(NULL, " \n");
synptr->wcount = strtol(ptrtok, NULL, 16);
if (synptr->wcount < 0 || (unsigned int)synptr->wcount > UINT_MAX/sizeof(char *)) {
free_syns(synptr);
return(NULL);
}
synptr->words = (char **)malloc(synptr->wcount * sizeof(char *));
assert(synptr->words);
synptr->wnsns = (int *)malloc(synptr->wcount * sizeof(int));
assert(synptr->wnsns);
synptr->lexid = (int *)malloc(synptr->wcount * sizeof(int));
assert(synptr->lexid);
for (i = 0; i < synptr->wcount; i++) {
ptrtok = strtok(NULL, " \n");
synptr->words[i] = strdup(ptrtok);
assert(synptr->words[i]);
/* is this the word we're looking for? */
if (word && !strcmp(word,strtolower(ptrtok)))
synptr->whichword = i+1;
ptrtok = strtok(NULL, " \n");
sscanf(ptrtok, "%x", &synptr->lexid[i]);
}
/* get the pointer count */
ptrtok = strtok(NULL," \n");
synptr->ptrcount = atoi(ptrtok);
/* Should we check for long here as well? */
if (synptr->ptrcount < 0 || (unsigned int)synptr->ptrcount > UINT_MAX/sizeof(int)) {
free_syns(synptr);
return(NULL);
}
if (synptr->ptrcount) {
/* alloc storage for the pointers */
synptr->ptrtyp = (int *)malloc(synptr->ptrcount * sizeof(int));
assert(synptr->ptrtyp);
synptr->ptroff = (long *)malloc(synptr->ptrcount * sizeof(long));
assert(synptr->ptroff);
synptr->ppos = (int *)malloc(synptr->ptrcount * sizeof(int));
assert(synptr->ppos);
synptr->pto = (int *)malloc(synptr->ptrcount * sizeof(int));
assert(synptr->pto);
synptr->pfrm = (int *)malloc(synptr->ptrcount * sizeof(int));
assert(synptr->pfrm);
for(i = 0; i < synptr->ptrcount; i++) {
/* get the pointer type */
ptrtok = strtok(NULL," \n");
synptr->ptrtyp[i] = getptrtype(ptrtok);
/* For adjectives, set the synset type if it has a direct
antonym */
if (dbase == ADJ && synptr->sstype == DONT_KNOW) {
if (synptr->ptrtyp[i] == ANTPTR)
synptr->sstype = DIRECT_ANT;
else if (synptr->ptrtyp[i] == PERTPTR)
foundpert = 1;
}
/* get the pointer offset */
ptrtok = strtok(NULL," \n");
synptr->ptroff[i] = atol(ptrtok);
/* get the pointer part of speech */
ptrtok = strtok(NULL, " \n");
synptr->ppos[i] = getpos(ptrtok);
/* get the lexp to/from restrictions */
ptrtok = strtok(NULL," \n");
tmpptr = ptrtok;
strncpy(wdnum, tmpptr, 2);
wdnum[2] = '\0';
synptr->pfrm[i] = strtol(wdnum, (char **)NULL, 16);
tmpptr += 2;
strncpy(wdnum, tmpptr, 2);
wdnum[2] = '\0';
synptr->pto[i] = strtol(wdnum, (char **)NULL, 16);
}
}
/* If synset type is still not set, see if it's a pertainym */
if (dbase == ADJ && synptr->sstype == DONT_KNOW && foundpert == 1)
synptr->sstype = PERTAINYM;
/* retrieve optional information from verb synset */
if(dbase == VERB) {
ptrtok = strtok(NULL," \n");
synptr->fcount = atoi(ptrtok);
/* allocate frame storage */
synptr->frmid = (int *)malloc(synptr->fcount * sizeof(int));
assert(synptr->frmid);
synptr->frmto = (int *)malloc(synptr->fcount * sizeof(int));
assert(synptr->frmto);
for(i=0;i<synptr->fcount;i++) {
/* skip the frame pointer (+) */
ptrtok = strtok(NULL," \n");
ptrtok = strtok(NULL," \n");
synptr->frmid[i] = atoi(ptrtok);
ptrtok = strtok(NULL," \n");
synptr->frmto[i] = strtol(ptrtok, NULL, 16);
}
}
/* get the optional definition */
ptrtok = strtok(NULL," \n");
if (ptrtok) {
ptrtok = strtok(NULL," \n");
while (ptrtok != NULL) {
if (strlen(ptrtok) + strlen(tbuf) + 1 + 1 > sizeof(tbuf)) {
free_syns(synptr);
return(NULL);
}
strcat(tbuf,ptrtok);
ptrtok = strtok(NULL, " \n");
if(ptrtok)
strcat(tbuf," ");
}
synptr->defn = malloc(strlen(tbuf) + 3);
assert(synptr->defn);
sprintf(synptr->defn,"(%s)",tbuf);
}
if (keyindexfp) { /* we have unique keys */
sprintf(tmpbuf, "%c:%8.8ld", partchars[dbase], synptr->hereiam);
synptr->key = GetKeyForOffset(tmpbuf);
}
/* Can't do earlier - calls index_lookup() which messes up strtok calls */
for (i = 0; i < synptr->wcount; i++)
synptr->wnsns[i] = getsearchsense(synptr, i + 1);
return(synptr);
}
/* Free a synset linked list allocated by findtheinfo_ds() */
void free_syns(SynsetPtr synptr)
{
SynsetPtr cursyn, nextsyn;
if (synptr) {
cursyn = synptr;
while(cursyn) {
if (cursyn->nextform)
free_syns(cursyn->nextform);
nextsyn = cursyn->nextss;
free_synset(cursyn);
cursyn = nextsyn;
}
}
}
/* Free a synset */
void free_synset(SynsetPtr synptr)
{
int i;
free(synptr->pos);
for (i = 0; i < synptr->wcount; i++){
free(synptr->words[i]);
}
free(synptr->words);
free(synptr->wnsns);
free(synptr->lexid);
if (synptr->ptrcount) {
free(synptr->ptrtyp);
free(synptr->ptroff);
free(synptr->ppos);
free(synptr->pto);
free(synptr->pfrm);
}
if (synptr->fcount) {
free(synptr->frmid);
free(synptr->frmto);
}
if (synptr->defn)
free(synptr->defn);
if (synptr->headword)
free(synptr->headword);
if (synptr->ptrlist)
free_syns(synptr->ptrlist); /* changed from free_synset() */
free(synptr);
}
/* Free an index structure */
void free_index(IndexPtr idx)
{
free(idx->wd);
free(idx->pos);
if (idx->ptruse)
free(idx->ptruse);
free(idx->offset);
free(idx);
}
/* Recursive search algorithm to trace a pointer tree */
static void traceptrs(SynsetPtr synptr, int ptrtyp, int dbase, int depth)
{
int i;
int extraindent = 0;
SynsetPtr cursyn;
char prefix[40] = "", tbuf[20]; /* prefix stays empty if no case below sets it */
int realptr;
interface_doevents();
if (abortsearch)
return;
if (ptrtyp < 0) {
ptrtyp = -ptrtyp;
extraindent = 2;
}
for (i = 0; i < synptr->ptrcount; i++) {
if ((ptrtyp == HYPERPTR && (synptr->ptrtyp[i] == HYPERPTR ||
synptr->ptrtyp[i] == INSTANCE)) ||
(ptrtyp == HYPOPTR && (synptr->ptrtyp[i] == HYPOPTR ||
synptr->ptrtyp[i] == INSTANCES)) ||
((synptr->ptrtyp[i] == ptrtyp) &&
((synptr->pfrm[i] == 0) ||
(synptr->pfrm[i] == synptr->whichword)))) {
realptr = synptr->ptrtyp[i]; /* deal with INSTANCE */
if(!prflag) { /* print sense number and synset */
printsns(synptr, sense + 1);
prflag = 1;
}
printspaces(TRACEP, depth + extraindent);
switch(realptr) {
case PERTPTR:
if (dbase == ADV)
sprintf(prefix, "Derived from %s ",
partnames[synptr->ppos[i]]);
else
sprintf(prefix, "Pertains to %s ",
partnames[synptr->ppos[i]]);
break;
case ANTPTR:
if (dbase != ADJ)
sprintf(prefix, "Antonym of ");
break;
case PPLPTR:
sprintf(prefix, "Participle of verb ");
break;
case INSTANCE:
sprintf(prefix, "INSTANCE OF=> ");
break;
case INSTANCES:
sprintf(prefix, "HAS INSTANCE=> ");
break;
case HASMEMBERPTR:
sprintf(prefix, " HAS MEMBER: ");
break;
case HASSTUFFPTR:
sprintf(prefix, " HAS SUBSTANCE: ");
break;
case HASPARTPTR:
sprintf(prefix, " HAS PART: ");
break;
case ISMEMBERPTR:
sprintf(prefix, " MEMBER OF: ");
break;
case ISSTUFFPTR:
sprintf(prefix, " SUBSTANCE OF: ");
break;
case ISPARTPTR:
sprintf(prefix, " PART OF: ");
break;
default:
sprintf(prefix, "=> ");
break;
}
/* Read synset pointed to */
cursyn=read_synset(synptr->ppos[i], synptr->ptroff[i], "");
/* For Pertainyms and Participles pointing to a specific
sense, indicate the sense then retrieve the synset
pointed to and other info as determined by type.
Otherwise, just print the synset pointed to. */
if ((ptrtyp == PERTPTR || ptrtyp == PPLPTR) &&
synptr->pto[i] != 0) {
snprintf(tbuf, sizeof(tbuf), " (Sense %d)\n",
cursyn->wnsns[synptr->pto[i] - 1]);
printsynset(prefix, cursyn, tbuf, DEFOFF, synptr->pto[i],
SKIP_ANTS, PRINT_MARKER);
if (ptrtyp == PPLPTR) { /* adjective pointing to verb */
printsynset(" =>", cursyn, "\n",
DEFON, ALLWORDS, PRINT_ANTS, PRINT_MARKER);
traceptrs(cursyn, HYPERPTR, getpos(cursyn->pos), 0);
} else if (dbase == ADV) { /* adverb pointing to adjective */
printsynset(" =>", cursyn, "\n",DEFON, ALLWORDS,
((getsstype(cursyn->pos) == SATELLITE)
? SKIP_ANTS : PRINT_ANTS), PRINT_MARKER);
#ifdef FOOP
traceptrs(cursyn, HYPERPTR, getpos(cursyn->pos), 0);
#endif
} else { /* adjective pointing to noun */
printsynset(" =>", cursyn, "\n",
DEFON, ALLWORDS, PRINT_ANTS, PRINT_MARKER);
traceptrs(cursyn, HYPERPTR, getpos(cursyn->pos), 0);
}
} else if (ptrtyp == ANTPTR && dbase != ADJ && synptr->pto[i] != 0) {
snprintf(tbuf, sizeof(tbuf), " (Sense %d)\n",
cursyn->wnsns[synptr->pto[i] - 1]);
printsynset(prefix, cursyn, tbuf, DEFOFF, synptr->pto[i],
SKIP_ANTS, PRINT_MARKER);
printsynset(" =>", cursyn, "\n", DEFON, ALLWORDS,
PRINT_ANTS, PRINT_MARKER);
} else
printsynset(prefix, cursyn, "\n", DEFON, ALLWORDS,
PRINT_ANTS, PRINT_MARKER);
/* For HOLONYMS and MERONYMS, keep track of last one
printed in buffer so results can be truncated later. */
if (ptrtyp >= ISMEMBERPTR && ptrtyp <= HASPARTPTR)
lastholomero = strlen(searchbuffer);
if(depth) {
depth = depthcheck(depth, cursyn);
traceptrs(cursyn, ptrtyp, getpos(cursyn->pos), (depth+1));
free_synset(cursyn);
} else
free_synset(cursyn);
}
}
}
static void tracecoords(SynsetPtr synptr, int ptrtyp, int dbase, int depth)
{
int i;
SynsetPtr cursyn;
interface_doevents();
if (abortsearch)
return;
for(i = 0; i < synptr->ptrcount; i++) {
if((synptr->ptrtyp[i] == HYPERPTR || synptr->ptrtyp[i] == INSTANCE) &&
((synptr->pfrm[i] == 0) ||
(synptr->pfrm[i] == synptr->whichword))) {
if(!prflag) {
printsns(synptr, sense + 1);
prflag = 1;
}
printspaces(TRACEC, depth);
cursyn = read_synset(synptr->ppos[i], synptr->ptroff[i], "");
printsynset("-> ", cursyn, "\n", DEFON, ALLWORDS,
SKIP_ANTS, PRINT_MARKER);
traceptrs(cursyn, ptrtyp, getpos(cursyn->pos), depth);
if(depth) {
depth = depthcheck(depth, cursyn);
tracecoords(cursyn, ptrtyp, getpos(cursyn->pos), (depth+1));
free_synset(cursyn);
} else
free_synset(cursyn);
}
}
}
static void traceclassif(SynsetPtr synptr, int dbase, int search)
{
int i, j, idx;
SynsetPtr cursyn;
long int prlist[1024];
char head[60];
int svwnsnsflag;
interface_doevents();
if (abortsearch)
return;
idx = 0;
for (i = 0; i < synptr->ptrcount; i++) {
if (((synptr->ptrtyp[i] >= CLASSIF_START) &&
(synptr->ptrtyp[i] <= CLASSIF_END) && search == CLASSIFICATION) ||
((synptr->ptrtyp[i] >= CLASS_START) &&
(synptr->ptrtyp[i] <= CLASS_END) && search == CLASS) ) {
if (!prflag) {
printsns(synptr, sense + 1);
prflag = 1;
}
cursyn = read_synset(synptr->ppos[i], synptr->ptroff[i], "");
for (j = 0; j < idx; j++) {
if (synptr->ptroff[i] == prlist[j]) {
break;
}
}
if (j == idx) {
prlist[idx++] = synptr->ptroff[i];
printspaces(TRACEP, 0);
if (synptr->ptrtyp[i] == CLASSIF_CATEGORY)
strcpy(head, "TOPIC->(");
else if (synptr->ptrtyp[i] == CLASSIF_USAGE)
strcpy(head, "USAGE->(");
else if (synptr->ptrtyp[i] == CLASSIF_REGIONAL)
strcpy(head, "REGION->(");
else if (synptr->ptrtyp[i] == CLASS_CATEGORY)
strcpy(head, "TOPIC_TERM->(");
else if (synptr->ptrtyp[i] == CLASS_USAGE)
strcpy(head, "USAGE_TERM->(");
else if (synptr->ptrtyp[i] == CLASS_REGIONAL)
strcpy(head, "REGION_TERM->(");
strcat(head, partnames[synptr->ppos[i]]);
strcat(head, ") ");
svwnsnsflag = wnsnsflag;
wnsnsflag = 1;
printsynset(head, cursyn, "\n", DEFOFF, ALLWORDS,
SKIP_ANTS, SKIP_MARKER);
wnsnsflag = svwnsnsflag;
}
free_synset(cursyn);
}
}
}
static void tracenomins(SynsetPtr synptr, int dbase)
{
int i, j, idx;
SynsetPtr cursyn;
long int prlist[1024];
char prefix[40], tbuf[20];
interface_doevents();
if (abortsearch)
return;
idx = 0;
for (i = 0; i < synptr->ptrcount; i++) {
if ((synptr->ptrtyp[i] == DERIVATION) &&
(synptr->pfrm[i] == synptr->whichword)) {
if (!prflag) {
printsns(synptr, sense + 1);
prflag = 1;
}
printspaces(TRACEP, 0);
sprintf(prefix, "RELATED TO->(%s) ",
partnames[synptr->ppos[i]]);
cursyn = read_synset(synptr->ppos[i], synptr->ptroff[i], "");
snprintf(tbuf, sizeof(tbuf), "#%d\n",
cursyn->wnsns[synptr->pto[i] - 1]);
printsynset(prefix, cursyn, tbuf, DEFOFF, synptr->pto[i],
SKIP_ANTS, SKIP_MARKER);
/* only print synset once, even if more than one link */
for (j = 0; j < idx; j++) {
#ifdef FOOP
if (synptr->ptroff[i] == prlist[j]) {
break;
}
#endif
}
if (j == idx) {
prlist[idx++] = synptr->ptroff[i];
printspaces(TRACEP, 2);
printsynset("=> ", cursyn, "\n", DEFON, ALLWORDS,
SKIP_ANTS, PRINT_MARKER);
}
free_synset(cursyn);
}
}
}
/* Trace through the hypernym tree and print all MEMBER, STUFF
and PART info. */
static void traceinherit(SynsetPtr synptr, int ptrbase, int dbase, int depth)
{
int i;
SynsetPtr cursyn;
interface_doevents();
if (abortsearch)
return;
for(i=0;i<synptr->ptrcount;i++) {
if((synptr->ptrtyp[i] == HYPERPTR) &&
((synptr->pfrm[i] == 0) ||
(synptr->pfrm[i] == synptr->whichword))) {
if(!prflag) {
printsns(synptr, sense + 1);
prflag = 1;
}
printspaces(TRACEI, depth);
cursyn = read_synset(synptr->ppos[i], synptr->ptroff[i], "");
printsynset("=> ", cursyn, "\n", DEFON, ALLWORDS,
SKIP_ANTS, PRINT_MARKER);
traceptrs(cursyn, ptrbase, NOUN, depth);
traceptrs(cursyn, ptrbase + 1, NOUN, depth);
traceptrs(cursyn, ptrbase + 2, NOUN, depth);
if(depth) {
depth = depthcheck(depth, cursyn);
traceinherit(cursyn, ptrbase, getpos(cursyn->pos), (depth+1));
free_synset(cursyn);
} else
free_synset(cursyn);
}
}
/* Truncate search buffer after last holo/meronym printed */
searchbuffer[lastholomero] = '\0';
}
static void partsall(SynsetPtr synptr, int ptrtyp)
{
int ptrbase;
int i, hasptr = 0;
ptrbase = (ptrtyp == HMERONYM) ? HASMEMBERPTR : ISMEMBERPTR;
/* First, print out the MEMBER, STUFF, PART info for this synset */
for (i = 0; i < 3; i++) {
if (HasPtr(synptr, ptrbase + i)) {
traceptrs(synptr, ptrbase + i, NOUN, 1);
hasptr++;
}
interface_doevents();
if (abortsearch)
return;
}
/* Print out MEMBER, STUFF, PART info for hypernyms on
HMERONYM search only */
/* if (hasptr && ptrtyp == HMERONYM) { */
if (ptrtyp == HMERONYM) {
lastholomero = strlen(searchbuffer);
traceinherit(synptr, ptrbase, NOUN, 1);
}
}
static void traceadjant(SynsetPtr synptr)
{
SynsetPtr newsynptr;
int i, j;
int anttype = DIRECT_ANT;
SynsetPtr simptr, antptr;
static char similar[] = " => ";
/* This search is only applicable for ADJ synsets which have
either direct or indirect antonyms (not valid for pertainyms). */
if (synptr->sstype == DIRECT_ANT || synptr->sstype == INDIRECT_ANT) {
printsns(synptr, sense + 1);
printbuffer("\n");
/* if indirect, get cluster head */
if(synptr->sstype == INDIRECT_ANT) {
anttype = INDIRECT_ANT;
i = 0;
while (synptr->ptrtyp[i] != SIMPTR) i++;
newsynptr = read_synset(ADJ, synptr->ptroff[i], "");
} else
newsynptr = synptr;
/* find antonyms - if direct, make sure that the antonym
ptr we're looking at is from this word */
for (i = 0; i < newsynptr->ptrcount; i++) {
if (newsynptr->ptrtyp[i] == ANTPTR &&
((anttype == DIRECT_ANT &&
newsynptr->pfrm[i] == newsynptr->whichword) ||
(anttype == INDIRECT_ANT))) {
/* read the antonym's synset and print it. if a
direct antonym, print its satellites. */
antptr = read_synset(ADJ, newsynptr->ptroff[i], "");
if (anttype == DIRECT_ANT) {
printsynset("", antptr, "\n", DEFON, ALLWORDS,
PRINT_ANTS, PRINT_MARKER);
for(j = 0; j < antptr->ptrcount; j++) {
if(antptr->ptrtyp[j] == SIMPTR) {
simptr = read_synset(ADJ, antptr->ptroff[j], "");
printsynset(similar, simptr, "\n", DEFON,
ALLWORDS, SKIP_ANTS, PRINT_MARKER);
free_synset(simptr);
}
}
} else
printantsynset(antptr, "\n", anttype, DEFON);
free_synset(antptr);
}
}
if (newsynptr != synptr)
free_synset(newsynptr);
}
}
/* Fetch the given example sentence from the example file and print it out */
void getexample(char *offset, char *wd)
{
char *line;
char sentbuf[512];
if (vsentfilefp != NULL) {
if ((line = bin_search(offset, vsentfilefp)) != NULL) {
while(*line != ' ')
line++;
printbuffer(" EX: ");
snprintf(sentbuf, sizeof(sentbuf), line, wd);
printbuffer(sentbuf);
}
}
}
/* Find the example sentence references in the example sentence index file */
int findexample(SynsetPtr synptr)
{
char tbuf[256], *temp, *offset;
int wdnum;
int found = 0;
if (vidxfilefp != NULL) {
wdnum = synptr->whichword - 1;
snprintf(tbuf, sizeof(tbuf), "%s%%%-1.1d:%-2.2d:%-2.2d::",
synptr->words[wdnum],
getpos(synptr->pos),
synptr->fnum,
synptr->lexid[wdnum]);
if ((temp = bin_search(tbuf, vidxfilefp)) != NULL) {
/* skip over sense key and get sentence numbers */
temp += strlen(synptr->words[wdnum]) + 11;
strcpy(tbuf, temp);
offset = strtok(tbuf, " ,\n");
while (offset) {
getexample(offset, synptr->words[wdnum]);
offset = strtok(NULL, ",\n");
}
found = 1;
}
}
return(found);
}
static void printframe(SynsetPtr synptr, int prsynset)
{
int i;
if (prsynset)
printsns(synptr, sense + 1);
if (!findexample(synptr)) {
for(i = 0; i < synptr->fcount; i++) {
if ((synptr->frmto[i] == synptr->whichword) ||
(synptr->frmto[i] == 0)) {
if (synptr->frmto[i] == synptr->whichword)
printbuffer(" => ");
else
printbuffer(" *> ");
printbuffer(frametext[synptr->frmid[i]]);
printbuffer("\n");
}
}
}
}
static void printseealso(SynsetPtr synptr)
{
SynsetPtr cursyn;
int i, first = 1;
int svwnsnsflag;
char firstline_v[] = " Phrasal Verb-> ";
char firstline_nar[] = " Also See-> ";
char otherlines[] = "; ";
char *prefix;
if ( getpos( synptr->pos ) == VERB )
prefix = firstline_v;
else
prefix = firstline_nar;
/* Find all SEEALSO pointers from the searchword and print the
word or synset pointed to. */
for(i = 0; i < synptr->ptrcount; i++) {
if ((synptr->ptrtyp[i] == SEEALSOPTR) &&
((synptr->pfrm[i] == 0) ||
(synptr->pfrm[i] == synptr->whichword))) {
cursyn = read_synset(synptr->ppos[i], synptr->ptroff[i], "");
svwnsnsflag = wnsnsflag;
wnsnsflag = 1;
printsynset(prefix, cursyn, "", DEFOFF,
synptr->pto[i] == 0 ? ALLWORDS : synptr->pto[i],
SKIP_ANTS, SKIP_MARKER);
wnsnsflag = svwnsnsflag;
free_synset(cursyn);
if (first) {
prefix = otherlines;
first = 0;
}
}
}
if (!first)
printbuffer("\n");
}
static void freq_word(IndexPtr index)
{
int familiar=0;
int cnt;
static char *a_an[] = {
"", "a noun", "a verb", "an adjective", "an adverb" };
static char *freqcats[] = {
"extremely rare","very rare","rare","uncommon","common",
"familiar","very familiar","extremely familiar"
};
if(index) {
cnt = index->sense_cnt;
if (cnt == 0) familiar = 0;
if (cnt == 1) familiar = 1;
if (cnt == 2) familiar = 2;
if (cnt >= 3 && cnt <= 4) familiar = 3;
if (cnt >= 5 && cnt <= 8) familiar = 4;
if (cnt >= 9 && cnt <= 16) familiar = 5;
if (cnt >= 17 && cnt <= 32) familiar = 6;
if (cnt > 32 ) familiar = 7;
snprintf(tmpbuf, sizeof(tmpbuf),
"\n%s used as %s is %s (polysemy count = %d)\n",
index->wd, a_an[getpos(index->pos)], freqcats[familiar], cnt);
printbuffer(tmpbuf);
}
}
void wngrep (char *word_passed, int pos) {
FILE *inputfile;
char word[256];
int wordlen, linelen, loc;
char line[1024];
int count = 0;
inputfile = indexfps[pos];
if (inputfile == NULL) {
sprintf (msgbuf, "WordNet library error: Can't perform compounds "
"search because %s index file is not open\n", partnames[pos]);
display_message (msgbuf);
return;
}
rewind(inputfile);
if (strlen(word_passed) + 1 > sizeof(word))
return;
strcpy (word, word_passed);
ToLowerCase(word); /* map to lower case for index file search */
strsubst (word, ' ', '_'); /* replace spaces with underscores */
wordlen = strlen (word);
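while (fgets (line, sizeof(line), inputfile) != NULL) {
/* stop at the terminator too, in case a line contains no space */
for (linelen = 0; line[linelen] != ' ' && line[linelen] != '\0'; linelen++) {}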
if (linelen < wordlen)
continue;
line[linelen] = '\0';
strstr_init (line, word);
while ((loc = strstr_getnext ()) != -1) {
if (
/* at the start of the line */
(loc == 0) ||
/* at the end of the line */
((linelen - wordlen) == loc) ||
/* as a word in the middle of the line */
(((line[loc - 1] == '-') || (line[loc - 1] == '_')) &&
((line[loc + wordlen] == '-') || (line[loc + wordlen] == '_')))
) {
strsubst (line, '_', ' ');
snprintf (tmpbuf, sizeof(tmpbuf), "%s\n", line);
printbuffer (tmpbuf);
break;
}
}
if (count++ % 2000 == 0) {
interface_doevents ();
if (abortsearch) break;
}
}
}
/* Structure to keep track of 'relative groups'. All senses in a relative
group are displayed together at end of search. Transitivity is
supported, so if either of a new set of related senses is already
in a 'relative group', the other sense is added to that group as well. */
struct relgrp {
int senses[MAXSENSE];
struct relgrp *next;
};
static struct relgrp *rellist;
static struct relgrp *mkrellist(void);
/* Simple hash function */
#define HASHTABSIZE 1223 /* Prime number. Must be > 2*MAXTOPS */
#define hash(n) ((n) % HASHTABSIZE)
/* Find relative groups for all senses of target word in given part
of speech. */
static void relatives(IndexPtr idx, int dbase)
{
rellist = NULL;
switch(dbase) {
case VERB:
findverbgroups(idx);
interface_doevents();
if (abortsearch)
break;
printrelatives(idx, VERB);
break;
default:
break;
}
free_rellist();
}
static void findverbgroups(IndexPtr idx)
{
int i, j, k;
SynsetPtr synset;
assert(idx);
/* Read all senses */
for (i = 0; i < idx->off_cnt; i++) {
synset = read_synset(VERB, idx->offset[i], idx->wd);
/* Look for VERBGROUP ptr(s) for this sense. If found,
create group for senses, or add to existing group. */
for (j = 0; j < synset->ptrcount; j++) {
if (synset->ptrtyp[j] == VERBGROUP) {
/* Need to find sense number for ptr offset */
for (k = 0; k < idx->off_cnt; k++) {
if (synset->ptroff[j] == idx->offset[k]) {
add_relatives(VERB, idx, i, k);
break;
}
}
}
}
free_synset(synset);
}
}
static void add_relatives(int pos, IndexPtr idx, int rel1, int rel2)
{
int i;
struct relgrp *rel, *last, *r;
/* If either of the new relatives are already in a relative group,
then add the other to the existing group (transitivity).
Otherwise create a new group and add these 2 senses to it. */
for (rel = rellist; rel; rel = rel->next) {
if (rel->senses[rel1] == 1 || rel->senses[rel2] == 1) {
rel->senses[rel1] = rel->senses[rel2] = 1;
/* If part of another relative group, merge the groups */
for (r = rellist; r; r = r->next) {
if (r != rel &&
(r->senses[rel1] == 1 || r->senses[rel2] == 1)) {
for (i = 0; i < MAXSENSE; i++)
rel->senses[i] |= r->senses[i];
}
}
return;
}
last = rel;
}
rel = mkrellist();
rel->senses[rel1] = rel->senses[rel2] = 1;
if (rellist == NULL)
rellist = rel;
else
last->next = rel;
}
static struct relgrp *mkrellist(void)
{
struct relgrp *rel;
int i;
rel = (struct relgrp *) malloc(sizeof(struct relgrp));
assert(rel);
for (i = 0; i < MAXSENSE; i++)
rel->senses[i] = 0;
rel->next = NULL;
return(rel);
}
static void free_rellist(void)
{
struct relgrp *rel, *next;
rel = rellist;
while(rel) {
next = rel->next;
free(rel);
rel = next;
}
}
static void printrelatives(IndexPtr idx, int dbase)
{
SynsetPtr synptr;
struct relgrp *rel;
int i, flag;
int outsenses[MAXSENSE];
for (i = 0; i < idx->off_cnt; i++)
outsenses[i] = 0;
prflag = 1;
for (rel = rellist; rel; rel = rel->next) {
flag = 0;
for (i = 0; i < idx->off_cnt; i++) {
if (rel->senses[i] && !outsenses[i]) {
flag = 1;
synptr = read_synset(dbase, idx->offset[i], "");
printsns(synptr, i + 1);
traceptrs(synptr, HYPERPTR, dbase, 0);
outsenses[i] = 1;
free_synset(synptr);
}
}
if (flag)
printbuffer("--------------\n");
}
for (i = 0; i < idx->off_cnt; i++) {
if (!outsenses[i]) {
synptr = read_synset(dbase, idx->offset[i], "");
printsns(synptr, i + 1);
traceptrs(synptr, HYPERPTR, dbase, 0);
printbuffer("--------------\n");
free_synset(synptr);
}
}
}
/*
Search code interfaces to WordNet database
findtheinfo() - print search results and return ptr to output buffer
findtheinfo_ds() - return search results in linked list data structure
*/
char *findtheinfo(char *searchstr, int dbase, int ptrtyp, int whichsense)
{
SynsetPtr cursyn;
IndexPtr idx = NULL;
int depth = 0;
int i, offsetcnt;
char *bufstart;
unsigned long offsets[MAXSENSE];
int skipit;
/* Initializations -
clear output buffer, search results structure, flags */
searchbuffer[0] = '\0';
wnresults.numforms = wnresults.printcnt = 0;
wnresults.searchbuf = searchbuffer;
wnresults.searchds = NULL;
abortsearch = overflag = 0;
for (i = 0; i < MAXSENSE; i++)
offsets[i] = 0;
switch (ptrtyp) {
case OVERVIEW:
WNOverview(searchstr, dbase);
break;
case FREQ:
while ((idx = getindex(searchstr, dbase)) != NULL) {
searchstr = NULL;
wnresults.SenseCount[wnresults.numforms] = idx->off_cnt;
freq_word(idx);
free_index(idx);
wnresults.numforms++;
}
break;
case WNGREP:
wngrep(searchstr, dbase);
break;
case RELATIVES:
case VERBGROUP:
while ((idx = getindex(searchstr, dbase)) != NULL) {
searchstr = NULL;
wnresults.SenseCount[wnresults.numforms] = idx->off_cnt;
relatives(idx, dbase);
free_index(idx);
wnresults.numforms++;
}
break;
default:
/* If negative search type, set flag for recursive search */
if (ptrtyp < 0) {
ptrtyp = -ptrtyp;
depth = 1;
}
bufstart = searchbuffer;
offsetcnt = 0;
/* look at all spellings of word */
while ((idx = getindex(searchstr, dbase)) != NULL) {
searchstr = NULL; /* clear out for next call to getindex() */
wnresults.SenseCount[wnresults.numforms] = idx->off_cnt;
wnresults.OutSenseCount[wnresults.numforms] = 0;
/* Print extra sense msgs if looking at all senses */
if (whichsense == ALLSENSES)
printbuffer(
" \n");
/* Go through all of the searchword's senses in the
database and perform the search requested. */
for (sense = 0; sense < idx->off_cnt; sense++) {
if (whichsense == ALLSENSES || whichsense == sense + 1) {
prflag = 0;
/* Determine if this synset has already been done
with a different spelling. If so, skip it. */
for (i = 0, skipit = 0; i < offsetcnt && !skipit; i++) {
if (offsets[i] == idx->offset[sense])
skipit = 1;
}
if (skipit != 1) {
offsets[offsetcnt++] = idx->offset[sense];
cursyn = read_synset(dbase, idx->offset[sense], idx->wd);
switch(ptrtyp) {
case ANTPTR:
if(dbase == ADJ)
traceadjant(cursyn);
else
traceptrs(cursyn, ANTPTR, dbase, depth);
break;
case COORDS:
tracecoords(cursyn, HYPOPTR, dbase, depth);
break;
case FRAMES:
printframe(cursyn, 1);
break;
case MERONYM:
traceptrs(cursyn, HASMEMBERPTR, dbase, depth);
traceptrs(cursyn, HASSTUFFPTR, dbase, depth);
traceptrs(cursyn, HASPARTPTR, dbase, depth);
break;
case HOLONYM:
traceptrs(cursyn, ISMEMBERPTR, dbase, depth);
traceptrs(cursyn, ISSTUFFPTR, dbase, depth);
traceptrs(cursyn, ISPARTPTR, dbase, depth);
break;
case HMERONYM:
partsall(cursyn, HMERONYM);
break;
case HHOLONYM:
partsall(cursyn, HHOLONYM);
break;
case SEEALSOPTR:
printseealso(cursyn);
break;
#ifdef FOOP
case PPLPTR:
traceptrs(cursyn, ptrtyp, dbase, depth);
traceptrs(cursyn, PPLPTR, dbase, depth);
break;
#endif
case SIMPTR:
case SYNS:
case HYPERPTR:
printsns(cursyn, sense + 1);
prflag = 1;
traceptrs(cursyn, ptrtyp, dbase, depth);
if (dbase == ADJ) {
/* traceptrs(cursyn, PERTPTR, dbase, depth); */
traceptrs(cursyn, PPLPTR, dbase, depth);
} else if (dbase == ADV) {
/* traceptrs(cursyn, PERTPTR, dbase, depth);*/
}
if (saflag) /* print SEE ALSO pointers */
printseealso(cursyn);
if (dbase == VERB && frflag)
printframe(cursyn, 0);
break;
case PERTPTR:
printsns(cursyn, sense + 1);
prflag = 1;
traceptrs(cursyn, PERTPTR, dbase, depth);
break;
case DERIVATION:
tracenomins(cursyn, dbase);
break;
case CLASSIFICATION:
case CLASS:
traceclassif(cursyn, dbase, ptrtyp);
break;
default:
traceptrs(cursyn, ptrtyp, dbase, depth);
break;
} /* end switch */
free_synset(cursyn);
} /* end if (skipit) */
} /* end if (whichsense) */
if (skipit != 1) {
interface_doevents();
if ((whichsense == sense + 1) || abortsearch || overflag)
break; /* break out of loop - we're done */
}
} /* end for (sense) */
/* Done with an index entry - patch in number of senses output */
if (whichsense == ALLSENSES) {
i = wnresults.OutSenseCount[wnresults.numforms];
if (i == idx->off_cnt && i == 1)
sprintf(tmpbuf, "\n1 sense of %s", idx->wd);
else if (i == idx->off_cnt)
sprintf(tmpbuf, "\n%d senses of %s", i, idx->wd);
else if (i > 0) /* printed some senses */
sprintf(tmpbuf, "\n%d of %d senses of %s",
i, idx->off_cnt, idx->wd);
/* Find starting offset in searchbuffer for this index
entry and patch string in. Then update bufstart
to end of searchbuffer for start of next index entry. */
if (i > 0) {
if (wnresults.numforms > 0) {
bufstart[0] = '\n';
bufstart++;
}
/* Avoid writing a trailing \0 after the string */
memcpy(bufstart, tmpbuf, strlen(tmpbuf));
bufstart = searchbuffer + strlen(searchbuffer);
}
}
free_index(idx);
interface_doevents();
if (overflag || abortsearch)
break; /* break out of while (idx) loop */
wnresults.numforms++;
} /* end while (idx) */
} /* end switch */
interface_doevents();
if (abortsearch)
printbuffer("\nSearch Interrupted...\n");
else if (overflag)
sprintf(searchbuffer,
"Search too large. Narrow search and try again...\n");
/* replace underscores with spaces before returning */
return(strsubst(searchbuffer, '_', ' '));
}
SynsetPtr findtheinfo_ds(char *searchstr, int dbase, int ptrtyp, int whichsense)
{
IndexPtr idx;
SynsetPtr cursyn;
SynsetPtr synlist = NULL, lastsyn = NULL;
int depth = 0;
int newsense = 0;
wnresults.numforms = 0;
wnresults.printcnt = 0;
while ((idx = getindex(searchstr, dbase)) != NULL) {
searchstr = NULL; /* clear out for next call */
newsense = 1;
if(ptrtyp < 0) {
ptrtyp = -ptrtyp;
depth = 1;
}
wnresults.SenseCount[wnresults.numforms] = idx->off_cnt;
wnresults.OutSenseCount[wnresults.numforms] = 0;
wnresults.searchbuf = NULL;
wnresults.searchds = NULL;
/* Go through all of the searchword's senses in the
database and perform the search requested. */
for(sense = 0; sense < idx->off_cnt; sense++) {
if (whichsense == ALLSENSES || whichsense == sense + 1) {
cursyn = read_synset(dbase, idx->offset[sense], idx->wd);
if (lastsyn) {
if (newsense)
lastsyn->nextform = cursyn;
else
lastsyn->nextss = cursyn;
}
if (!synlist)
synlist = cursyn;
newsense = 0;
cursyn->searchtype = ptrtyp;
cursyn->ptrlist = traceptrs_ds(cursyn, ptrtyp,
getpos(cursyn->pos),
depth);
lastsyn = cursyn;
if (whichsense == sense + 1)
break;
}
}
free_index(idx);
wnresults.numforms++;
if (ptrtyp == COORDS) { /* clean up by removing hypernym */
lastsyn = synlist->ptrlist;
synlist->ptrlist = lastsyn->ptrlist;
free_synset(lastsyn);
}
}
wnresults.searchds = synlist;
return(synlist);
}
/* Recursive search algorithm to trace a pointer tree and return results
in linked list of data structures. */
SynsetPtr traceptrs_ds(SynsetPtr synptr, int ptrtyp, int dbase, int depth)
{
int i;
SynsetPtr cursyn, synlist = NULL, lastsyn = NULL;
int tstptrtyp, docoords;
/* If synset is a satellite, find the head word of its
head synset and the head word's sense number. */
if (getsstype(synptr->pos) == SATELLITE) {
for (i = 0; i < synptr->ptrcount; i++)
if (synptr->ptrtyp[i] == SIMPTR) {
cursyn = read_synset(synptr->ppos[i],
synptr->ptroff[i],
"");
synptr->headword = strdup(cursyn->words[0]);
assert(synptr->headword);
synptr->headsense = cursyn->lexid[0];
free_synset(cursyn);
break;
}
}
if (ptrtyp == COORDS) {
tstptrtyp = HYPERPTR;
docoords = 1;
} else {
tstptrtyp = ptrtyp;
docoords = 0;
}
for (i = 0; i < synptr->ptrcount; i++) {
if((synptr->ptrtyp[i] == tstptrtyp) &&
((synptr->pfrm[i] == 0) ||
(synptr->pfrm[i] == synptr->whichword))) {
cursyn=read_synset(synptr->ppos[i], synptr->ptroff[i], "");
cursyn->searchtype = ptrtyp;
if (lastsyn)
lastsyn->nextss = cursyn;
if (!synlist)
synlist = cursyn;
lastsyn = cursyn;
if(depth) {
depth = depthcheck(depth, cursyn);
cursyn->ptrlist = traceptrs_ds(cursyn, ptrtyp,
getpos(cursyn->pos),
(depth+1));
} else if (docoords) {
cursyn->ptrlist = traceptrs_ds(cursyn, HYPOPTR, NOUN, 0);
}
}
}
return(synlist);
}
static void WNOverview(char *searchstr, int pos)
{
SynsetPtr cursyn;
IndexPtr idx = NULL;
char *cpstring = searchstr, *bufstart;
int sense, i, offsetcnt;
int svdflag, skipit;
unsigned long offsets[MAXSENSE];
cpstring = searchstr;
bufstart = searchbuffer;
for (i = 0; i < MAXSENSE; i++)
offsets[i] = 0;
offsetcnt = 0;
while ((idx = getindex(cpstring, pos)) != NULL) {
cpstring = NULL; /* clear for next call to getindex() */
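/* numforms is incremented once, at the end of this loop, so that
SenseCount and OutSenseCount use the same index for this form */
wnresults.SenseCount[wnresults.numforms] = idx->off_cnt;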
wnresults.OutSenseCount[wnresults.numforms] = 0;
printbuffer(
" \n");
/* Print synset for each sense. If requested, precede
synset with synset offset and/or lexical file information.*/
for (sense = 0; sense < idx->off_cnt; sense++) {
for (i = 0, skipit = 0; i < offsetcnt && !skipit; i++)
if (offsets[i] == idx->offset[sense])
skipit = 1;
if (!skipit) {
offsets[offsetcnt++] = idx->offset[sense];
cursyn = read_synset(pos, idx->offset[sense], idx->wd);
if (idx->tagged_cnt != -1 &&
((sense + 1) <= idx->tagged_cnt)) {
sprintf(tmpbuf, "%d. (%d) ",
sense + 1, GetTagcnt(idx, sense + 1));
} else {
sprintf(tmpbuf, "%d. ", sense + 1);
}
svdflag = dflag;
dflag = 1;
printsynset(tmpbuf, cursyn, "\n", DEFON, ALLWORDS,
SKIP_ANTS, SKIP_MARKER);
dflag = svdflag;
wnresults.OutSenseCount[wnresults.numforms]++;
wnresults.printcnt++;
free_synset(cursyn);
}
}
/* Print sense summary message */
i = wnresults.OutSenseCount[wnresults.numforms];
if (i > 0) {
if (i == 1)
sprintf(tmpbuf, "\nThe %s %s has 1 sense",
partnames[pos], idx->wd);
else
sprintf(tmpbuf, "\nThe %s %s has %d senses",
partnames[pos], idx->wd, i);
if (idx->tagged_cnt > 0)
sprintf(tmpbuf + strlen(tmpbuf),
" (first %d from tagged texts)\n", idx->tagged_cnt);
else if (idx->tagged_cnt == 0)
sprintf(tmpbuf + strlen(tmpbuf),
" (no senses from tagged texts)\n");
/* Avoid writing a trailing \0 after the string */
memcpy(bufstart, tmpbuf, strlen(tmpbuf));
bufstart = searchbuffer + strlen(searchbuffer);
} else
bufstart[0] = '\0';
wnresults.numforms++;
free_index(idx);
}
}
/* Do requested search on synset passed, returning output in buffer. */
char *do_trace(SynsetPtr synptr, int ptrtyp, int dbase, int depth)
{
searchbuffer[0] = '\0'; /* clear output buffer */
traceptrs(synptr, ptrtyp, dbase, depth);
return(searchbuffer);
}
/* Set bit for each search type that is valid for the search word
passed and return bit mask. */
unsigned int is_defined(char *searchstr, int dbase)
{
IndexPtr index;
int i;
unsigned long retval = 0;
wnresults.numforms = wnresults.printcnt = 0;
wnresults.searchbuf = NULL;
wnresults.searchds = NULL;
while ((index = getindex(searchstr, dbase)) != NULL) {
searchstr = NULL; /* clear out for next getindex() call */
wnresults.SenseCount[wnresults.numforms] = index->off_cnt;
/* set bits that must be true for all words */
retval |= bit(SIMPTR) | bit(FREQ) | bit(SYNS)|
bit(WNGREP) | bit(OVERVIEW);
/* go through list of pointer characters and set appropriate bits */
for(i = 0; i < index->ptruse_cnt; i++) {
if (index->ptruse[i] <= LASTTYPE) {
retval |= bit(index->ptruse[i]);
} else if (index->ptruse[i] == INSTANCE) {
retval |= bit(HYPERPTR);
} else if (index->ptruse[i] == INSTANCES) {
retval |= bit(HYPOPTR);
}
if (index->ptruse[i] == SIMPTR) {
retval |= bit(ANTPTR);
}
#ifdef FOOP
if (index->ptruse[i] >= CLASSIF_START &&
index->ptruse[i] <= CLASSIF_END) {
retval |= bit(CLASSIFICATION);
}
if (index->ptruse[i] >= CLASS_START &&
index->ptruse[i] <= CLASS_END) {
retval |= bit(CLASS);
}
#endif
if (index->ptruse[i] >= ISMEMBERPTR &&
index->ptruse[i] <= ISPARTPTR)
retval |= bit(HOLONYM);
else if (index->ptruse[i] >= HASMEMBERPTR &&
index->ptruse[i] <= HASPARTPTR)
retval |= bit(MERONYM);
}
if (dbase == NOUN) {
/* check for inherited holonyms and meronyms */
if (HasHoloMero(index, HMERONYM))
retval |= bit(HMERONYM);
if (HasHoloMero(index, HHOLONYM))
retval |= bit(HHOLONYM);
/* if synset has hypernyms, enable coordinate search */
if (retval & bit(HYPERPTR))
retval |= bit(COORDS);
} else if (dbase == VERB) {
/* if synset has hypernyms, enable coordinate search */
if (retval & bit(HYPERPTR))
retval |= bit(COORDS);
/* enable grouping of related synsets and verb frames */
retval |= bit(RELATIVES) | bit(FRAMES);
}
free_index(index);
wnresults.numforms++;
}
return(retval);
}
/* Determine if any of the synsets that this word is in have inherited
meronyms or holonyms. */
static int HasHoloMero(IndexPtr index, int ptrtyp)
{
int i, j;
SynsetPtr synset, psynset;
int found=0;
int ptrbase;
ptrbase = (ptrtyp == HMERONYM) ? HASMEMBERPTR : ISMEMBERPTR;
for(i = 0; i < index->off_cnt; i++) {
synset = read_synset(NOUN, index->offset[i], "");
for (j = 0; j < synset->ptrcount; j++) {
if (synset->ptrtyp[j] == HYPERPTR) {
psynset = read_synset(NOUN, synset->ptroff[j], "");
found += HasPtr(psynset, ptrbase);
found += HasPtr(psynset, ptrbase + 1);
found += HasPtr(psynset, ptrbase + 2);
free_synset(psynset);
}
}
free_synset(synset);
}
return(found);
}
static int HasPtr(SynsetPtr synptr, int ptrtyp)
{
int i;
for(i = 0; i < synptr->ptrcount; i++) {
if(synptr->ptrtyp[i] == ptrtyp) {
return(1);
}
}
return(0);
}
/* Set bit for each POS that search word is in. 0 returned if
word is not in WordNet. */
unsigned int in_wn(char *word, int pos)
{
int i;
unsigned int retval = 0;
if (pos == ALL_POS) {
for (i = 1; i < NUMPARTS + 1; i++)
if (indexfps[i] != NULL && bin_search(word, indexfps[i]) != NULL)
retval |= bit(i);
} else if (indexfps[pos] != NULL && bin_search(word,indexfps[pos]) != NULL)
retval |= bit(pos);
return(retval);
}
static int depthcheck(int depth, SynsetPtr synptr)
{
if(depth >= MAXDEPTH) {
sprintf(msgbuf,
"WordNet library error: Error Cycle detected\n %s\n",
synptr->words[0]);
display_message(msgbuf);
depth = -1; /* reset to get one more trace then quit */
}
return(depth);
}
/* Strip off () enclosed comments from a word */
static char *deadjify(char *word)
{
char *y;
adj_marker = UNKNOWN_MARKER; /* default if not adj or unknown */
y=word;
while(*y) {
if(*y == '(') {
if (!strncmp(y, "(a)", 3))
adj_marker = ATTRIBUTIVE;
else if (!strncmp(y, "(ip)", 4))
adj_marker = IMMED_POSTNOMINAL;
else if (!strncmp(y, "(p)", 3))
adj_marker = PREDICATIVE;
*y='\0';
} else
y++;
}
return(word);
}
static int getsearchsense(SynsetPtr synptr, int whichword)
{
IndexPtr idx;
int i;
strsubst(strcpy(wdbuf, synptr->words[whichword - 1]), ' ', '_');
strtolower(wdbuf);
if ((idx = index_lookup(wdbuf, getpos(synptr->pos))) != NULL) {
for (i = 0; i < idx->off_cnt; i++)
if (idx->offset[i] == synptr->hereiam) {
free_index(idx);
return(i + 1);
}
free_index(idx);
}
return(0);
}
static void printsynset(char *head, SynsetPtr synptr, char *tail, int definition, int wdnum, int antflag, int markerflag)
{
int i, wdcnt;
char tbuf[SMLINEBUF];
tbuf[0] = '\0'; /* clear working buffer */
strcat(tbuf, head); /* print head */
/* Precede synset with additional information as indicated
by flags */
if (offsetflag) /* print synset offset */
sprintf(tbuf + strlen(tbuf),"{%8.8ld} ", synptr->hereiam);
if (fileinfoflag) { /* print lexicographer file information */
sprintf(tbuf + strlen(tbuf), "<%s> ", lexfiles[synptr->fnum]);
prlexid = 1; /* print lexicographer id after word */
} else
prlexid = 0;
if (wdnum) /* print only specific word asked for */
catword(tbuf, synptr, wdnum - 1, markerflag, antflag);
else /* print all words in synset */
for(i = 0, wdcnt = synptr->wcount; i < wdcnt; i++) {
catword(tbuf, synptr, i, markerflag, antflag);
if (i < wdcnt - 1)
strcat(tbuf, ", ");
}
if(definition && dflag && synptr->defn) {
strcat(tbuf," -- ");
strcat(tbuf,synptr->defn);
}
strcat(tbuf,tail);
printbuffer(tbuf);
}
static void printantsynset(SynsetPtr synptr, char *tail, int anttype, int definition)
{
int i, wdcnt;
char tbuf[SMLINEBUF];
char *str;
int first = 1;
tbuf[0] = '\0';
if (offsetflag)
sprintf(tbuf,"{%8.8ld} ", synptr->hereiam);
if (fileinfoflag) {
sprintf(tbuf + strlen(tbuf),"<%s> ", lexfiles[synptr->fnum]);
prlexid = 1;
} else
prlexid = 0;
/* print antonyms from cluster head (of indirect ant) */
strcat(tbuf, "INDIRECT (VIA ");
for(i = 0, wdcnt = synptr->wcount; i < wdcnt; i++) {
if (first) {
str = printant(ADJ, synptr, i + 1, "%s", ", ");
first = 0;
} else
str = printant(ADJ, synptr, i + 1, ", %s", ", ");
if (*str)
strcat(tbuf, str);
}
strcat(tbuf, ") -> ");
/* now print synonyms from cluster head (of indirect ant) */
for (i = 0, wdcnt = synptr->wcount; i < wdcnt; i++) {
catword(tbuf, synptr, i, SKIP_MARKER, SKIP_ANTS);
if (i < wdcnt - 1)
strcat(tbuf, ", ");
}
if(dflag && synptr->defn && definition) {
strcat(tbuf," -- ");
strcat(tbuf,synptr->defn);
}
strcat(tbuf,tail);
printbuffer(tbuf);
}
static void catword(char *buf, SynsetPtr synptr, int wdnum, int adjmarker, int antflag)
{
static char vs[] = " (vs. %s)";
static char *markers[] = {
"", /* UNKNOWN_MARKER */
"(predicate)", /* PREDICATIVE */
"(prenominal)", /* ATTRIBUTIVE */
"(postnominal)", /* IMMED_POSTNOMINAL */
};
/* Copy the word (since deadjify() changes original string),
deadjify() the copy and append to buffer */
strcpy(wdbuf, synptr->words[wdnum]);
strcat(buf, deadjify(wdbuf));
/* Print additional lexicographer information and WordNet sense
number as indicated by flags */
if (prlexid && (synptr->lexid[wdnum] != 0))
sprintf(buf + strlen(buf), "%d", synptr->lexid[wdnum]);
if (wnsnsflag)
sprintf(buf + strlen(buf), "#%d", synptr->wnsns[wdnum]);
/* For adjectives, append adjective marker if present, and
print antonym if flag is passed */
if (getpos(synptr->pos) == ADJ) {
if (adjmarker == PRINT_MARKER)
strcat(buf, markers[adj_marker]);
if (antflag == PRINT_ANTS)
strcat(buf, printant(ADJ, synptr, wdnum + 1, vs, ""));
}
}
static char *printant(int dbase, SynsetPtr synptr, int wdnum, char *template, char *tail)
{
int i, j, wdoff;
SynsetPtr psynptr;
char tbuf[WORDBUF];
static char retbuf[SMLINEBUF];
int first = 1;
retbuf[0] = '\0';
/* Go through all the pointers looking for antonyms from the word
indicated by wdnum. When found, print all the antonym's
antonym pointers which point back to wdnum. */
for (i = 0; i < synptr->ptrcount; i++) {
if (synptr->ptrtyp[i] == ANTPTR && synptr->pfrm[i] == wdnum) {
psynptr = read_synset(dbase, synptr->ptroff[i], "");
for (j = 0; j < psynptr->ptrcount; j++) {
if (psynptr->ptrtyp[j] == ANTPTR &&
psynptr->pto[j] == wdnum &&
psynptr->ptroff[j] == synptr->hereiam) {
wdoff = (psynptr->pfrm[j] ? (psynptr->pfrm[j] - 1) : 0);
/* Construct buffer containing formatted antonym,
then add it onto end of return buffer */
strcpy(wdbuf, psynptr->words[wdoff]);
strcpy(tbuf, deadjify(wdbuf));
/* Print additional lexicographer information and
WordNet sense number as indicated by flags */
if (prlexid && (psynptr->lexid[wdoff] != 0))
sprintf(tbuf + strlen(tbuf), "%d",
psynptr->lexid[wdoff]);
if (wnsnsflag)
sprintf(tbuf + strlen(tbuf), "#%d",
psynptr->wnsns[wdoff]);
if (!first)
strcat(retbuf, tail);
else
first = 0;
sprintf(retbuf + strlen(retbuf), template, tbuf);
}
}
free_synset(psynptr);
}
}
return(retbuf);
}
static void printbuffer(char *string)
{
if (overflag)
return;
if (strlen(searchbuffer) + strlen(string) >= SEARCHBUF)
overflag = 1;
else
strcat(searchbuffer, string);
}
static void printsns(SynsetPtr synptr, int sense)
{
printsense(synptr, sense);
printsynset("", synptr, "\n", DEFON, ALLWORDS, PRINT_ANTS, PRINT_MARKER);
}
static void printsense(SynsetPtr synptr, int sense)
{
char tbuf[256];
/* Append lexicographer filename after Sense # if flag is set. */
if (fnflag)
sprintf(tbuf,"\nSense %d in file \"%s\"\n",
sense, lexfiles[synptr->fnum]);
else
sprintf(tbuf,"\nSense %d\n", sense);
printbuffer(tbuf);
/* update counters */
wnresults.OutSenseCount[wnresults.numforms]++;
wnresults.printcnt++;
}
static void printspaces(int trace, int depth)
{
int j;
for (j = 0; j < depth; j++)
printbuffer(" ");
switch(trace) {
case TRACEP: /* traceptrs(), tracenomins() */
if (depth)
printbuffer(" ");
else
printbuffer(" ");
break;
case TRACEC: /* tracecoords() */
if (!depth)
printbuffer(" ");
break;
case TRACEI: /* traceinherit() */
if (!depth)
printbuffer("\n ");
break;
}
}
/* Dummy function to force Tcl/Tk to look at event queue to see if
the user wants to stop the search. */
static void interface_doevents (void) {
if (interface_doevents_func != NULL) interface_doevents_func ();
}
/*
Revision log: (since version 1.5)
$Log: search.c,v $
Revision 1.166 2006/11/14 20:52:45 wn
for 2.1
Revision 1.165 2005/02/24 15:36:00 wn
fixed bug - coordinate search was missing INSTANCE pointers
Revision 1.164 2005/01/27 16:32:32 wn
removed 1.6 stuff and cleaned up #ifdefs
Revision 1.163 2004/10/25 15:25:18 wn
added instances code
Revision 1.162 2004/01/12 16:32:52 wn
changed "CATEGORY" to "TOPIC"
Revision 1.161 2003/06/23 15:52:27 wn
cleaned up format of nomin output
Revision 1.160 2003/06/05 15:29:45 wn
added pos and sense number for domains
Revision 1.159 2003/04/15 13:54:16 wn
*** empty log message ***
Revision 1.158 2003/03/20 19:31:36 wn
removed NOMIN_START/NOMIN_END range and replaced with DERIVATION
Revision 1.157 2003/02/06 19:01:36 wn
added code to print out word pointed to in derivational links.
Revision 1.156 2003/02/06 18:03:30 wn
work on classifications
Revision 1.155 2002/10/29 15:46:27 wn
added CLASSIFICATION code
Revision 1.154 2002/09/16 15:43:01 wn
allow "grep" string to be in upper case
Revision 1.153 2002/09/16 15:39:16 wn
*** empty log message ***
Revision 1.152 2002/03/22 19:39:15 wn
fill in key field in SynsetPtr if key file found
Revision 1.151 2002/03/07 18:47:52 wn
updates for 1.7.1
Revision 1.150 2001/12/04 17:48:21 wn
added test to tracenomins to only print nominalizations of search
word and not all words in synset
Revision 1.149 2001/11/27 19:53:24 wn
removed check for version on verb example sentence stuff. only
needed for 1.5
Revision 1.148 2001/11/06 18:51:04 wn
fixed bug in getindex when passed "."
added code to skip classification
Revision 1.147 2001/10/11 18:00:56 wn
fixed bug in free_syns - wasn't freeing synset pointed to by nextform
Revision 1.146 2001/07/27 14:32:41 wn
fixed order of adjective markers
Revision 1.145 2001/06/19 15:01:22 wn
commented out include for setutil.h
Revision 1.144 2001/05/30 16:24:17 wn
changed is_defined to return unsigned int
Revision 1.143 2001/03/30 17:13:00 wn
fixed is_defined - wasn't setting coords for verbs
Revision 1.142 2001/03/29 16:18:03 wn
added newline before output from FREQ search
Revision 1.141 2001/03/29 16:11:39 wn
added code to traceptrs to print direct antonyms nicer
Revision 1.140 2001/03/27 18:47:41 wn
removed tcflag
Revision 1.139 2001/03/27 16:47:44 wn
updated is_defined for holonyms and meronyms
Revision 1.138 2000/08/14 16:04:24 wn
changed 'get_index' to call sub to do work
added code for nominalizations
Revision 1.137 1998/08/11 18:07:11 wn
minor fixes: free synptr space before returning if error; remove
useless statement in free_syns
* Revision 1.136 1998/08/07 17:51:32 wn
* added COORDS to traceptrs_ds and findtheinfo_ds
* fixed getsearchsense code to only happen in parse_synset
*
* Revision 1.135 1998/08/07 13:04:24 wn
* *** empty log message ***
*
* Revision 1.134 1997/11/07 16:27:36 wn
* cleanup calls to traceptrs
*
* Revision 1.133 1997/10/16 17:13:08 wn
* fixed bug in add_topnode when index == 0
*
* Revision 1.132 1997/09/05 15:33:18 wn
* change printframes to only print generic frames if specific example not found
*
* Revision 1.131 1997/09/02 16:31:18 wn
* changed includes
*
* Revision 1.130 1997/09/02 14:43:23 wn
* added code to test wnrelease in parse_synset and WNOverview
*
* Revision 1.129 1997/08/29 20:45:25 wn
* added location sanity check on parse_synset
*
* Revision 1.128 1997/08/29 18:35:03 wn
* a bunch of additional cleanups; added code to traceptrs_ds to
* store wordnet sense number for each word; added wnresults structure;
* terminate holo/mero search at highest level having holo/mero
*
* Revision 1.127 1997/08/28 17:26:46 wn
* Changed "n senses from tagged data" to "n senses from tagged texts"
* in the overview.
*
* Revision 1.126 1997/08/27 13:26:07 wn
* trivial change in wngrep (initialized count to zero)
*
* Revision 1.125 1997/08/26 21:13:14 wn
* Grep now runs quickly because it doesn't call the doevents callback
* after each line of the search.
*
* Revision 1.124 1997/08/26 20:11:23 wn
* massive cleanups to print functions
*
* Revision 1.123 1997/08/26 15:04:18 wn
* I think I got it this time; replaced goto skipit with int skipit flag
* to make compiling easier on the Mac.
*
* Revision 1.122 1997/08/26 14:43:40 wn
* In an effort to avoid compilation errors on the
* Mac caused by the use of a "goto", I had tried to replace it with
* an if block, but had done so improperly. This is the restored version
* from before. Next check-in will have it properly replaced with flags.
*
* Revision 1.121 1997/08/25 15:54:21 wn
* *** empty log message ***
*
* Revision 1.120 1997/08/22 21:06:02 wn
* added code to use wnsnsflag to print wn sense number after each word
*
* Revision 1.119 1997/08/22 20:52:09 wn
* cleaned up findtheinfo and other fns a bit
*
* Revision 1.118 1997/08/21 20:59:20 wn
* grep now uses strstr instead of regexp searches. the old version is
* still there but commented out.
*
* Revision 1.117 1997/08/21 18:41:30 wn
* now eliminates duplicates on search returns, but not yet in overview
*
Revision 1.116 1997/08/13 17:23:45 wn
fixed mac defines
* Revision 1.115 1997/08/08 20:56:33 wn
* now uses built-in grep
*
* Revision 1.114 1997/08/08 19:15:41 wn
* added code to read attest_cnt field in index file.
* made searchbuffer fixed size
* added WNOverview (OVERVIEW) search
* added offsetflag to print synset offset before synset
*
* Revision 1.113 1997/08/05 14:20:29 wn
* changed printbuffer to not realloc space, removed calls to stopsearch()
*
* Revision 1.112 1997/07/25 17:30:03 wn
* various cleanups for release 1.6
*
Revision 1.111 1997/07/11 20:20:04 wn
Added interface_doevents code for making searches interruptible in single-threaded environments.
* Revision 1.110 1997/07/10 19:01:57 wn
* changed evca stuff
*
Revision 1.109 1997/04/22 19:59:08 wn
allow pertainyms to have antonyms
* Revision 1.108 1996/09/17 20:05:01 wn
* cleaned up EVCA code
*
* Revision 1.107 1996/08/16 18:34:13 wn
* fixed minor bug in findcousins
*
* Revision 1.106 1996/07/17 14:02:17 wn
* Added Kohl's verb example sentences. See getexample() and findExample().
*
* Revision 1.105 1996/06/14 18:49:49 wn
* upped size of tmpbuf
*
* Revision 1.104 1996/02/08 16:42:30 wn
* added some newlines to separate output and clear out tmpbuf
* so invalid searches return empty string
*
* Revision 1.103 1995/11/30 14:54:53 wn
* added grouped search for verbs
*
* Revision 1.102 1995/07/19 13:17:38 bagyenda
* *** empty log message ***
*
* Revision 1.101 1995/07/18 19:15:30 wn
* *** empty log message ***
*
* Revision 1.100 1995/07/18 18:56:24 bagyenda
* New implementation of grouped searches --Paul.
*
* Revision 1.99 1995/06/30 19:21:23 wn
* added code to findtheinfo_ds to link additional word forms
* onto synset chain
*
* Revision 1.98 1995/06/12 18:33:51 wn
* Minor change to getindex() -- Paul.
*
* Revision 1.97 1995/06/09 14:46:42 wn
* *** empty log message ***
*
* Revision 1.96 1995/06/09 14:32:49 wn
* changed code for PPLPTR and PERTPTR to print synsets pointed to
*
* Revision 1.95 1995/06/01 15:50:34 wn
* cleanup of code dealing with various hyphenations
*
*/