------------------------------------------------------------------------------
A license is hereby granted to reproduce this software source code and
to create executable versions from this source code for personal,
non-commercial use. The copyright notice included with the software
must be maintained in all copies produced.
THIS PROGRAM IS PROVIDED "AS IS". THE AUTHOR PROVIDES NO WARRANTIES
WHATSOEVER, EXPRESSED OR IMPLIED, INCLUDING WARRANTIES OF
MERCHANTABILITY, TITLE, OR FITNESS FOR ANY PARTICULAR PURPOSE. THE
AUTHOR DOES NOT WARRANT THAT USE OF THIS PROGRAM DOES NOT INFRINGE THE
INTELLECTUAL PROPERTY RIGHTS OF ANY THIRD PARTY IN ANY COUNTRY.
Copyright (c) 1995, John Conover, All Rights Reserved.
Comments and/or bug reports should be addressed to:
john@johncon.com (John Conover)
------------------------------------------------------------------------------
To install rel:
1) cd to this directory
a) If necessary, alter the comments in the Makefile to choose
an appropriate compiler. The System V, BSD compiler option
should work, but the GNU gcc compiler option is preferable
if gcc is the development system.
b) The Makefile define, FSE_CONTINUE, controls the program's
behavior on file system exceptions (an error opening a file, or
an error opening a directory). If it is defined, the program
continues past these exceptions; if it is not defined, the
program shuts down on them. If the program is running as a
server in a client/server architecture, it is advisable to
leave FSE_CONTINUE undefined; otherwise, it is advisable to
define it.
c) Note: some compilers (NextStep, for example) may require
-D_POSIX_SOURCE to be added to CFLAGS in the Makefile.
d) Some older ANSI C compilers (the older GNU gcc compilers,
for example) do not have a type for ssize_t. If this is the
case, it should be typedef'ed as an int in rel.h by removing
the #ifndef __STDC__ compiler directive/construct around the
ssize_t typedef.
e) If there are any problems, start by browsing through the
"Porting issues:" section, below.
2) make
3) if all goes well, test rel by:
./rel stdio /usr/include
which should list all of the files in the /usr/include
directory that contain "stdio", ordered by the number of
matches they contain, most matches first; presumably, stdio.h
has the most instances
4) install the rel executable someplace in the executable path
5) install either rel.1 or rel.catman in one of the man
directories
(rel.catman was produced by nroff -man rel.1 > rel.catman)
6) the sources were maintained with rcs/cvs; the rcs ids in
the sources are valid and can be re-established for formal
maintenance procedures, if desired; see "man rcs" for
details
7) the file "QA.METRICS" explains the remedial quality process
the sources were subjected to (primarily static analysis,
with McCabe and Halstead metrics provided). The process is
inadequate for commercial or mercantile applications, but
could serve as a starting reference for a more robust
process, if desired, that should begin with a "code walk
through."
8) the directory, example.app, contains an example application
using rel, in conjunction with procmail/smartlist, to construct
an enterprise-wide information retrieval system (an email
repository) that uses the Unix MTA as a delivery and query
agent.
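The effect of FSE_CONTINUE described in step 1b can be sketched in
C roughly as follows. This is only an illustration, not the code in
the rel sources: the helper name open_for_search() is hypothetical,
and the macro is defined inside the sketch only so that it builds
on its own (normally the Makefile would supply it).

```c
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Assumption: the Makefile normally supplies this define; it is set
   here only so the sketch builds standalone in "continue" mode. */
#ifndef FSE_CONTINUE
#define FSE_CONTINUE
#endif

/* Hypothetical helper (not from the rel sources): open a file for
   reading, returning the descriptor, or -1 if the open failed and the
   build is configured to continue past file system exceptions. */
int open_for_search(const char *filename)
{
    int infile = open(filename, O_RDONLY);

    if (infile < 0) {
        perror(filename);       /* report the exception */
#ifdef FSE_CONTINUE
        return -1;              /* caller skips this file and goes on */
#else
        exit(EXIT_FAILURE);     /* treat the exception as fatal */
#endif
    }
    return infile;
}
```

With the define in place, a caller can test for -1 and move on to the
next file; without it, the first unreadable file ends the run, which
may be the safer behavior for a server.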
The source was written with extensibility in mind. To alter
character transliterations, see uppercase.c for details. For
enhancements to phrase searching and hyphenation suggestions, see
translit.c.
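As a rough illustration of what a character transliteration might
look like, here is a table-driven sketch in C. It is an assumption
made for the example, not the scheme used in uppercase.c: the names
translit, build_translit() and transliterate() are hypothetical, and
the mapping (letters to upper case, anything not alphanumeric to a
space) is only one plausible choice.

```c
#include <ctype.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical lookup table: one output character per input byte. */
static unsigned char translit[UCHAR_MAX + 1];

/* Fill the table: alphanumerics map to their upper case form,
   every other byte maps to a space (an assumed policy). */
void build_translit(void)
{
    int c;

    for (c = 0; c <= UCHAR_MAX; c++)
        translit[c] = (unsigned char) (isalnum(c) ? toupper(c) : ' ');
}

/* Transliterate a buffer in place through the table. */
void transliterate(unsigned char *text, size_t len)
{
    size_t i;

    for (i = 0; i < len; i++)
        text[i] = translit[text[i]];
}
```

Changing the transliteration in a design like this means changing
only how the table is filled; the loop over the text stays the same.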
The program is capable of running as a wide area, distributed, full
text information retrieval system. A possible scenario would be to
distribute a large database in many systems that are internetworked
together, presumably via the Unix inet facility, with each system
running a copy of the program. Queries would be submitted to the
systems, and the systems would return individual records containing
the count of matches to the query, and the file name containing the
matches, perhaps with the machine name, in such a manner that the
records could be sorted on the "count field," and a network wide
"browser" could be used to view the documents, or a script could be
made to use the "r suite" to transfer the documents into the local
machine. Obviously, the queries would be run in parallel on the
machines in the network for enhanced performance; concurrency would
not be an issue. See the function, main(), in rel.c for suggestions.
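The merge step of that scenario, sorting the returned records on the
"count field", could be sketched in C as follows. The record layout
and the names match_record, sort_records() and print_records() are
assumptions made for the example; rel itself does not define them.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical record returned by one machine for one file. */
struct match_record {
    unsigned long count;   /* count of matches to the query */
    const char *host;      /* machine name that produced the record */
    const char *file;      /* file name containing the matches */
};

/* qsort comparison: larger counts sort first. */
static int by_count_desc(const void *a, const void *b)
{
    const struct match_record *x = a;
    const struct match_record *y = b;

    return (x->count < y->count) - (x->count > y->count);
}

/* Sort the collected records on the count field, most matches first. */
void sort_records(struct match_record *r, size_t n)
{
    qsort(r, n, sizeof *r, by_count_desc);
}

/* Print one record per line as "count host:file". */
void print_records(const struct match_record *r, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++)
        printf("%lu %s:%s\n", r[i].count, r[i].host, r[i].file);
}
```

Because each record carries its own machine name, records from any
number of systems can be concatenated and sorted in one pass before a
browser or transfer script consumes them.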
The source documentation begins in rel.c, and is quite verbose; my
apologies. See rel.c for the particulars, and some extensibility
suggestions.
Porting issues:
1) the program uses POSIX compliant file/directory handling
functions. The structures, types, and functions used for
file/directory handling are:
in searchfile.c:
ssize_t count; /* count of bytes read from the file */
struct flock lockfile; /* file locking structure */
struct stat buf; /* structure to obtain file size */
open (filename, O_RDONLY, S_IREAD)
fcntl (infile, F_SETLKW, &lockfile)
fstat (infile, &buf)
read (infile, (char *) page, count)
(close (infile) != 0)
in searchpath.c:
DIR *dirp; /* reference to the directory for a recursion */
struct dirent *dire; /* reference to directory path */
opendir (name)
readdir (directory->dirp)
closedir (directory->dirp)
2) error messages for system level interrupts are included in
message.c and message.h. These files allow a significant level of
system environment customization. See message.c for details.
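When checking a port, it may help to exercise the POSIX calls from
item 1 in isolation. The sketch below is not taken from searchfile.c
or searchpath.c; load_file() and count_files() are hypothetical
helpers that use the same calls in the same order (open, fcntl with
F_SETLKW, fstat, read, close; opendir, readdir, closedir). The lock
type, the fixed path buffer and the use of lstat() and snprintf() are
assumptions of the sketch.

```c
#include <dirent.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Read a whole regular file under a shared read lock. Returns a
   malloc'ed, NUL-terminated buffer (length in *size), or NULL. */
char *load_file(const char *filename, size_t *size)
{
    struct flock lockfile;   /* file locking structure */
    struct stat buf;         /* structure to obtain file size */
    char *page = NULL;
    ssize_t count;           /* count of bytes read from the file */
    size_t total = 0;
    int infile = open(filename, O_RDONLY);

    if (infile < 0)
        return NULL;
    memset(&lockfile, 0, sizeof lockfile);
    lockfile.l_type = F_RDLCK;      /* shared lock (assumed) */
    lockfile.l_whence = SEEK_SET;
    lockfile.l_start = 0;
    lockfile.l_len = 0;             /* 0 means "through end of file" */
    if (fcntl(infile, F_SETLKW, &lockfile) == 0 &&
        fstat(infile, &buf) == 0 && S_ISREG(buf.st_mode) &&
        (page = malloc((size_t) buf.st_size + 1)) != NULL) {
        /* short reads are retried; an error or EOF ends the loop */
        while (total < (size_t) buf.st_size &&
               (count = read(infile, page + total,
                             (size_t) buf.st_size - total)) > 0)
            total += (size_t) count;
        page[total] = '\0';
        *size = total;
    }
    if (close(infile) != 0) {       /* close also releases the lock */
        free(page);
        return NULL;
    }
    return page;
}

/* Count regular files under name, recursing into subdirectories.
   Returns -1 if the directory itself cannot be opened. */
long count_files(const char *name)
{
    DIR *dirp = opendir(name);
    struct dirent *dire;
    struct stat buf;
    char path[4096];
    long n = 0, sub;

    if (dirp == NULL)
        return -1;
    while ((dire = readdir(dirp)) != NULL) {
        if (strcmp(dire->d_name, ".") == 0 ||
            strcmp(dire->d_name, "..") == 0)
            continue;
        if (snprintf(path, sizeof path, "%s/%s", name, dire->d_name)
            >= (int) sizeof path)
            continue;               /* too long for this sketch; skip */
        if (lstat(path, &buf) != 0)
            continue;
        if (S_ISDIR(buf.st_mode)) {
            if ((sub = count_files(path)) > 0)
                n += sub;
        } else if (S_ISREG(buf.st_mode)) {
            n++;
        }
    }
    closedir(dirp);
    return n;
}
```

If this fragment compiles and behaves as expected on the target
system, the file and directory handling in rel should port without
much trouble; if it does not, the failing call points at the header or
feature-test macro that needs attention.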