1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105
|
=head1 NAME
SWISH-BUGS - List of bugs known in Swish-e
=head1 DESCRIPTION
This file contains a list of bugs reported or known in Swish-e. If
you find a bug listed here you do not need to report it as a bug. But feel
free to bug the developers about it on the Swish-e discussion list.
=head1 Bugs in Swish-e version 2.4
=over 4
=item * Stopwords not removed from query with Soundex
In dev version 2.5.2 noticed that stopwords are not removed from the query
when using Soundex. The plan is to rewrite the parser soon... (July 2004)
=item * Wild card searching can be very slow
Wild card searching needs to be optimized.
Here's a three letter search:
$ swish-e -w 'tra*' -m1
# Number of hits: 99952
# Search time: 5.424 seconds
Two letters:
$ swish-e -w 'tr*' -m1
# Number of hits: 100000
# Search time: 10.563 seconds
Single letter search:
$ swish-e -w 't*' -m1
# Number of hits: 100000
# Search time: 510.939 seconds
and used about 280MB or RAM.
This is a potential for a DoS attack. If you have a large index you may wish to filter
out single character wild cards.
=item * Character Encodings
The XML parser (Expat) returns UTF-8 data to swish-e. Therefore, the XML
parser should only be used for parsing US-ASCII encoded text.
The XML2 & HTML2 parsers (Libxml2) converts characters from UTF-8 to 8859-1 encodings before indexing
and writing properties. Indexing non-8859-1 data may result in invalid character mappings.
These issues will be resolved soon.
=item *
Phrase search failes with DoubleMetaphone
DoubleMetaphone searching can produce two search words for a single query
word. The words are expanded to (word1 OR word2), but that fails in a
phrase query: "some phrase (word1 or word2) here"
swish-e query parser is due for a rewrite, and this could be resolved then.
Reported: August 20, 2002 - moseley
=item *
Merging
merge.c does not check for matching stopwords or buzzwords in each index.
History:
Reported: September 3, 2002 - moseley
=item *
ResultSortOrder
ResultSort order is not used (and is not documented). The problem is that
the data passed to Compare_Properties() does not have access to the
ResultSortOrder table.
=back
History:
Reported: September 3, 2002 - moseley
=head1 Document Info
$Id: SWISH-BUGS.pod,v 1.9 2004/10/04 22:49:34 whmoseley Exp $
.
|