1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112
|
######################################################################
SWISH::API::Common 0.04
######################################################################
NAME
SWISH::API::Common - SWISH Document Indexing Made Easy
SYNOPSIS
use SWISH::API::Common;
my $swish = SWISH::API::Common->new();
# Index all files in a directory and its subdirectories
$swish->index("/usr/local/share/doc");
# After indexing once (it's persistent), fire up as many
# queries as you like:
# Search documents containing both "swish" and "install"
for my $hit ($swish->search("swish AND install")) {
print $hit->path(), "\n";
}
DESCRIPTION
"SWISH::API::Common" offers an easy interface to the Swish index engine.
While SWISH::API offers a complete API, "SWISH::API::Common" focusses on
ease of use.
THIS MODULE IS CURRENTLY UNDER DEVELOPMENT. THE API MIGHT CHANGE AT ANY
TIME.
Currently, "SWISH::API::Common" just allows for indexing documents in a
single directory and any of its subdirectories. Also, don't run index()
and search() in parallel yet.
INSTALLATION
"SWISH::API::Common" requires "SWISH::API" and the swish engine to be
installed. Please download the latest release from
http://swish-e.org/distribution/swish-e-2.4.3.tar.gz
and untar it, type
./configure
make
make install
and then install SWISH::API which is contained in the distribution:
cd perl
perl Makefile.PL
make
make install
METHODS
$sw = SWISH::API::Common->new()
Constructor. Takes many options, but the defaults are usually fine.
Available options and their defaults:
# Where SWISH::API::Common stores index files etc.
swish_adm_dir "$ENV{HOME}/.swish-common"
# The path to swish-e, relative is OK
swish_exe "swish-e"
# Swish index file
swish_idx_file "$self->{swish_adm_dir}/default.idx"
# Swish configuration file
swish_cnf_file "$self->{swish_adm_dir}/default.cnf"
# SWISH Stemming
swish_fuzzy_indexing_mode => "Stemming_en"
# Maximum amount of data (in bytes) extracted
# from a single file
file_len_max 100_000
# Preserve every indexed file's atime
atime_preserve
$sw->index($dir, ...)
Generate a new index of all text documents under directory $dir. One
or more directories can be specified.
$sw->search("foo AND bar");
Searches the index, using the given search expression. Returns a
list hits, which can be asked for their path:
# Search documents containing
# both "foo" and "bar"
for my $hit ($swish->search("foo AND bar")) {
print $hit->path(), "\n";
}
index_remove
Permanently delete the current index.
TODO List
* More than one index directory
* Remove documents from index
* Iterator for search hits
LEGALESE
Copyright 2005 by Mike Schilli, all rights reserved. This program is
free software, you can redistribute it and/or modify it under the same
terms as Perl itself.
AUTHOR
2005, Mike Schilli <cpan@perlmeister.com>
|