0.5.11
- Fix a bug that can cause a crash on an executable zip file.
- Fix parsing of empty headers when CRLFCRLF is followed by a space. In other words, fix parsing of emails that have a space as the first character in the body.
- Fix two broken (by design) throughanalyzers by replacing them with one eventanalyzer.
- Updated xesam ontology to include proper ranges. This is necessary for the Nepomuk backend but does not change anything for clucene (where everything is a string anyway).
- Make sure the app can handle environments where HOME is not defined.
- Make the zip analyzer check more often if it should stop analyzing.
- Fix wrong comparison when checking if we are finished yet.
- Make the analyzer respect a configuration that only wants part of the stream to be analyzed.
- Add an analyzer for Windows self-extracting zip archives.
- Ask the analyzerconfiguration if we should continue and put a cap on the maximum length of stream we read
- Log parse errors in the analysisresult.
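
The HOME fix above can be illustrated with a minimal sketch. This is not Strigi's actual code: the helper name homeDirOr and the choice of fallback are assumptions made for illustration only.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical helper (not part of Strigi): return the value of HOME,
// or the supplied fallback when HOME is unset or empty.
std::string homeDirOr(const std::string& fallback) {
    const char* home = std::getenv("HOME");
    // getenv() returns a null pointer when the variable is not defined,
    // so it must be checked before constructing a std::string from it.
    return (home != nullptr && *home != '\0') ? std::string(home) : fallback;
}
```

A caller could then build its configuration path from homeDirOr("/tmp") instead of dereferencing the result of getenv() directly.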
0.5.10
- Improved Xesam support. strigidaemon can now be queried with the client from
the Xesam test suite.
- Fix a bug in subinputstream.
Under certain circumstances the function read() of the internal stream could
be called with max < min. read() specifies that in such cases, there is no
limit on the number of bytes that may be read. This would cause
SubInputStream to malfunction because it would allow too much of the internal
stream to be read.
- Reenable a number of endanalyzers.
By accident, the analyzers for .tar and .gz files were disabled in the
previous release. Now they are re-enabled.
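
The SubInputStream fix above comes down to never forwarding max < min to the internal stream. The following is a minimal sketch of that idea, not the actual patch: the names ReadBounds and clampToSubstream are hypothetical, and min is assumed to be non-negative.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical types and names, for illustration only.
struct ReadBounds {
    int32_t min;
    int32_t max;
};

// Clamp a read request so that at most `remaining` bytes of the internal
// stream can be consumed. By the read() convention described above,
// max < min means "no upper limit", so that case is capped explicitly.
ReadBounds clampToSubstream(int32_t min, int32_t max, int64_t remaining) {
    assert(remaining >= 0);
    const int32_t limit =
        static_cast<int32_t>(std::min<int64_t>(remaining, INT32_MAX));
    if (max < min || max > limit) {
        max = limit;  // unlimited or oversized request: cap at what is left
    }
    if (min > max) {
        min = max;    // keep min <= max so the inner read() sees a real limit
    }
    return ReadBounds{min, max};
}
```

With this clamping, the request passed on always satisfies min <= max <= remaining, so the internal stream is never asked for an unbounded read.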
0.5.9
- Fix bug that would severely bloat the strigi index.
- Improve latency when calling strigi to stop.
- Better (but not yet complete) Xesam support
0.5.8
- Improve quitting latency of the most important analyzers. Now Strigi reacts more quickly when you tell it to stop indexing.
- Add a tool to analyze the analyzer latency profile and find analyzers that have a high latency.
- Bring field names in line with the Xesam ontology.
- New analyzers for avi, wav, dds, rgb, sid and ico file types.
- Fix deepgrep (finally working again since 0.5.2) and extend the number of fields deepgrep searches in. Now it also searches in fields that are passed as "unsigned char*" to the IndexWriter, but only if they are not registered as being binary fields.
- Install two headers that provide metadata information about field types. Basically, these classes publish the ontology that strigi uses.
- Fix a problem with CLucene throwing CLuceneError. Because of -fvisibility=hidden, the code did not recognize CLuceneError and caused it to fall through, thus crashing programs using libstreamanalyzer. A unit test has been added to prevent the problem from reappearing.
- Fix for systems where setenv() is not available (for instance Windows). Hopefully those systems have putenv() :)
- Remove support for starting strigidaemon with an arbitrary index type and index dir, but add an option to use a different configuration file. This effectively gives the user the same possibilities.
- Fixes to the build system that allow strigi to be built and tested as part of a larger project (e.g. kdesupport).
- 'strigicmd listFiles' now can be used to retrieve all files/dir indexed under a certain path
- Added support for Gentoo-way compilation flags. Implemented more consistent and pretty optional dependency handling.
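
The setenv()/putenv() fallback mentioned above might look roughly like the sketch below. It is an illustration under assumptions: the wrapper name setEnvVar and the HAVE_SETENV macro are hypothetical and not taken from the Strigi sources, and the fallback branch uses the POSIX putenv() signature.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical portable wrapper. HAVE_SETENV is assumed to be defined by
// the build system on platforms that provide setenv().
bool setEnvVar(const std::string& name, const std::string& value) {
#if defined(HAVE_SETENV)
    return setenv(name.c_str(), value.c_str(), 1) == 0;
#else
    // putenv() stores the pointer it is given rather than copying the
    // string, so the buffer must outlive the call. It is leaked on purpose.
    std::string* entry = new std::string(name + "=" + value);
    return putenv(&(*entry)[0]) == 0;
#endif
}
```

The deliberate leak in the putenv() branch is the usual price of that interface: freeing or reusing the buffer would corrupt the environment.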
0.5.7
- use plugins instead of shared libraries for the indexer backends
- lots of bugfixes and cleanups
- allow backends to be used in RAM by using ':memory:' as the index name
0.5.6
- Added Xesam User Language parser. Now it will be possible to handle Xesam UserLanguage queries (http://wiki.freedesktop.org/wiki/XesamUserSearchLanguage).
- Replaced .ini-based ontology parser with RDF/XML one.
- Updated strigicmd: now it's possible to perform searches formulated
following xesam userlanguage specifications.
- Improved ontology introspection API: properties and classes now have child lists and applicable classes/properties lists.
- change IndexReader::getFiles to IndexReader::getChildren.
- removed IndexReader::documentId and IndexReader::mTime.
- loads of build issues fixed
- added a script that helps you to find the patch that broke a unit test
- add fieldname for document content per the Xesam standard.
- lots more
0.5.5
- GUI now uses a .ui file making future improvements much easier
- install detection script for ease of use in other cmake projects
- modifying the signature of endAnalysis to endAnalysis(bool complete)
for StreamLineAnalyzer, StreamEventAnalyzer, and StreamSaxAnalyzer
- add a function to AnalyzerConfiguration that tells how many bytes can
be read at most from a stream
- add an SAX analyzer plugin that extracts the namespaces used in XML
documents. With this it is possible to get all XML documents that contain e.g.
Chemical Markup Language or Dublin Core.
- add a stream for changing the encoding of an incoming stream on the fly
- use the new encoding stream to do better email parsing
- add m3u stream analyzer.
- add simple test program for strigi xesam query builder. It loads a file
containing the xesam query. It converts the xesam query into a Strigi::Query
object. It serializes the Strigi::Query object to xml for e.g. quality
control.
- add xesamquery option to strigicmd: now it's possible to make queries
using Xesam language.
- add XesamQueryLanguage queries support. Now it is possible to translate
xesam queries formulated using XesamQueryLanguage into Strigi::Query objects.
- add a cgi executable that takes multipart/form-data and outputs an analysis
of the data as xml
- give xmlindexer the ability to read from stdin
- big improvement in parsing ms word files
- better input sanity checking. thanks to zzuf for reporting the errors
- cleanup of private variables in classes by introducing a d-pointer
0.5.4
- simplify PollingListener by letting it reuse code from DirAnalyzer
- improve parsing speed by reading incrementally large blocks and only if no throughanalyzer is ready yet
- extract more data from ogg and ID3 files
- new registerField(fieldname) function that gets additional data from the
ontology
- support of indexwriter calls: addValue(index, field, data, size),
addValue(index, field, double_value) to CLucene backend.
- enable passing of "Tokenized" flag parameter to CLucene backend
- support for the Keyword Terms which are not tokenized during queries
- handling of optional indexing flags, which are loaded from the ontology
- handling of cardinality constraint when indexing
- add keyword query type which allows for using keywords that are not split
up, e.g. chemistry.molecular_formula#"C 4 H 10". The "#" sign means: do not tokenize.
- parse the userlanguage wrapped in xesam query language xml
- add serialization to xml for Strigi::Query and Strigi::Term, useful for
debugging purposes
- add types from the xesam dbus interface to strigitypes.h
- add support for gif files
- add support for analyzing jpeg files.
- add prioritized, multithreaded queue for incoming requests
- add option --lastfiletoskip to diranalyzer and xmlindexer
- add support for Cc: Bcc: Message-ID: In-Reply-To: References: From: and To:
- add exclude and include filters to strigicmd create and update commands
- add deindex option, it can be used for removing dirs or files from an index
created by strigi
0.3.11
- SunOS, BSD, 64 bit and Coverity compatibility fixes
- Search in a set of default fields and not just in the text content of a file, if no specific field is specified.
- Add histogram widget to simple search client
- Add support for Ogg Vorbis
- Better decoding of email headers
- Expand Query object to handle nested queries
- Fix highlighting and display of title in search results.
- Fix path for the child indexables
- Fix memory problems in archivereader
- Check for too short file names and omit the RPM trailer from the results.
- Add an additional unit test for the RPM stream provider.
- Revert raise() to kill(getpid()) because raise hangs the thread.
- Install qtdbus library for strigi.
0.3.10
- Convenience classes for using Strigi over Qt 4.2 DBus
- Change buildsystem to allow building of deepfind, deepgrep and xmlindexer
separately
- Speedup of deepfind by selectively using only the analyzers deepfind needs
- Many portability fixes (GCC 3, Forte, MSVC)
- New, more efficient plugin loading
- Add IFilter plugin for the Windows version
- Remove the big Strigi lock (faster indexing)
- Switch strigiclient to communicate over DBus instead of over a unix socket
- Reorganization of the indexer with a new IndexerConfiguration
- Improvements of file name filters
- New Qt widget for configuring file name filters
- Add file name setting to the DBus interface
- Move verbose unit tests
- Bugfixes in some streams
0.3.9
- Added deepfind and deepgrep, programs that are enhanced versions of find
and grep.
- Added a new way of storing the configuration in an xml file.
- Added a way to search in multiple indexes.
- Added xmlindexer, a program that outputs the file parsing results as xml.
This is convenient for debugging and can also be used by other programs that
do not want to write their own indexer. It makes the superior Strigi
indexer available to other software in a convenient way.
- More versatile filters that determine which files to index. (Flavio
Castelli)
- Add possibility to index files from the client by feeding the file into the
daemon. This opens the way to indexing email from remote servers and web
pages.