File: ChangeLog

package info (click to toggle)
strigi 0.7.2-1
links: PTS, VCS
area: main
in suites: squeeze
size: 6,808 kB
ctags: 7,071
sloc: cpp: 44,468; ansic: 9,764; perl: 515; java: 367; python: 345; xml: 316; yacc: 176; sh: 153; makefile: 38
file content (218 lines) | stat: -rw-r--r-- 14,363 bytes
0.7.2
 - Improve cpp analyzer speed and output
 - Fix crash due to deep nesting of calls in pdf analyzer
 - Fix iconv use on Mac OS X
0.7.1
 - Support more fields from ODF documents
 - Improved skipping behavior on streams for large files.
 - Added album art support.
 - Added support for ID3v1 tags.
 - Added MP3 stream metadata extraction, UTF-16 support in tags.
 - Extended the range of metadata extracted by ID3 analyzer.
 - Added a FLAC audio file analyzer.
 - Significantly unbreak the PDF analyzer.
 - Fix scanning trees where permissions are insufficient to read some parts
 - Check for multithreaded version of libxml2
 - Require newer CLucene version (0.9.21)
0.7.0
 - Change to Nepomuk ontologies (Evgeny Egorochkin)
 - Set file property for embedded ar streams. This fixes the opening of these streams in archivereader.
 - Instead of reading each .rdf file at once in memory and then parse it, use the libxml2 I/O API to read chunks of the file when requested.
 - The attribute value is not '\0' terminated but has a pointer to the end of the string. In addition, string comparison was sped up by first comparing the string length.
0.6.5
 - Fix KDE bug 185551: Strigi now allows paths that start with protocol:/* like 'file:///' or 'remote:/'
 - Add a new function AnalysisResult::child(). This function allows an AnalysisResult instance to access the last child it has had indexed. This is needed for cases when a parent knows something about a child which the child does not know. In such cases the parent can call child()->addValue(...).
 - Adjust to the new library naming scheme in iconv-1.12
 - Implemented missing addTriplet method
 - Rewrite the implementation of ArchiveReader. The new implementation is more
 efficient in listing contents of directories. Now single directory entries can be returned without the need for reading the entire archive of which the directory is a part.
0.6.4
 - Path fixes to the build system the benefit of windows users (sengels)
 - Clean up of class ArchiveReader
 - Support for LZMA compressed streams in archives, notably .deb and .rpm
 - Remove preceding ./ from file path in tar archives.
 - Make parsing ar and deb files easier to abort: useful in e.g. Dolphin
 - Better method of removing deleted file from the CLucene
 - Do not tokenize the URL in the index to improve polling speed
 - Fix the bz2 header check: more bz2 archives are recognized (pino)
 - Fix infinite loop on parsing SGI image files
 - Fix reading of zip files without central directory.
0.6.3
 - Move Strigi::DirLister in archivereader.h to ArchiveReader::DirLister. Two class with this name were present in the code. The one in archivereader.h was not used in any code outside of Strigi, so we are changing it. Note that this changes means that one should not use Strigi 0.6.2.
 - Change type of EntryInfo.mtime from 'unsigned' to time_t.
 - The spec of SDF files was found and used to implement a more precise syntax check for the header of SDF files.
 - Fix memory corruption bug in ArchiveReader.
 - Change type of ontology entry 'exposureTime' to string. In theory something like duration would make sense but in practice xsd:string is the used one.
 - Add a default rule to find mail box directories with pattern '.*.directory'. Since these directory names start with a dot, they are normally not found.
 - Add '$HOME/.kde4' to the directories that are indexed by default.
 - Simplify matching of file paths in the rules for including or excluding directories from the index. The code is now more readable and easier to maintain.
 - Fix a big performance problem:  Whenever a directory mtime changed, all files inside the directory were re-indexed.
 - Fix bug where a gz archive that contains a file that is identical to the
 original archive is indexed over and over. The depth of nested files that are indexed is now limited to 127.
0.6.2
 - Better support for nice IO priorities on Linux (Sebastian Trueg)
 - Compile with development version of CLucene (Ben van Klinken)
 - Explicitly use 'unsigned char' or 'signed char' instead of 'char' since 'char' can be either signed or unsigned on different processors. E.g. on ARM 'char' means 'unsigned char' and on i386 'char' means 'signed char'. This changes makes libstreamanalyzer 0.6.2 binary incompatible with versions < 0.6.0. (Jos van den OOever)
 - Many CMake cleanups (Alexander Neundorf)
 - 6.5x speedup of C++ comment analyzer (Jakub Stachowski)
 - Various stability fixes (Jos van den Oever, Sebastian Trueg)
 - Support for ePub format (Jakub Stachowski)
 - Handle RIFF file with unspecified size for the RIFF packet. (Jos van den Oever)
0.5.11
 - Fix a bug that can cause a crash on an executable zip file.
 - Fix parsing of empty headers when CRLFCRLF is followed by a space. In other words, fix parsing of emails that have a space as the first character in the body.
 - Fix two broken (by design) throughanalyzers by replacing the with one eventanalyzer.
 - Updated xesam ontology to include proper ranges. This is necessary for the Nepomuk backend but does not change anything for clucene (were all is string anyway)
 - Make sure the app can handle environments where HOME is not defined.
 - Make the zip analyzer check more often if it should stop analyzing.
 - Fix wrong comparison when checking if we are finished yet.
 - Make the analyzer respect a configuration that only wants part of the stream to be analyzed.
 - Add an analyzer for Windows self-extracting zip archives.
 - Ask the analyzerconfiguration if we should continue and put a cap on the maximum length of stream we read
 - Log parse errors in the analysisresult.
0.5.10
 - Improved Xesam support. strigidaemon can now be queried with the client from
   the Xesam test suite.
 - Fix a bug in subinputstream.
   Under certain circumstances the function read() of the internal stream could
   be called with max < min. read() specifies that in such cases, there is no
   limit on the number of bytes that may be read. This would cause
   SubInputStream to malfunction because it would allow too much of the internal
   stream to be read.
 - Reenable a number of endanalyzers.
   By accident, the analyzers for .tar and .gz files were disabled in the
   previous release. Now they are re-enabled.
0.5.9
 - Fix bug that would severely bloat the strigi index.
 - Improve latency when calling strigi to stop.
 - Better (but not yet complete) Xesam support
0.5.8
 - Improve quiting latency of the most important analyzers. Now Strigi reacts more quickly when you tell it to stop indexing.
 - Add a tool to analyze the analyzer latency profile and find analyzers that have a high latency.
 - Bring field names in line with the Xesam ontology.
 - New analyzers for avi, wav, dds, rgb, sid and ico file types.
 - Fix deepgrep (finally working again since 0.5.2) and extend the number of fields deepgrep searches in. Now it also searches in fields that are passed as "unsigned char*" to the IndexWriter, but only if they are not registered as being binary fields.
 - Install two headers that provide metadata information about field types. Basically, these classes publish the ontology that strigi uses.
 - Fix a problem with CLucene throwing CLuceneError. Because of -fvisibility=hidden, the code did not recognize CLuceneError and caused it to fall through, thus crashing programs using libstreamanalyzer. A unit test to avoid the problem from reappearing has been added.
 - Fix for system where setenv() is not available (for instance windows). Hopefully those systems have putenv() :)
 - Remove support for starting strigidaemon with an arbiratry index type and index dir, but add an option to use a different configuration file. This effectively gives the use the same possiblities.
 - Fixes to the build system that allow strigi to be built and tested as part of a larger project (e.g. kdesupport).
 - 'strigicmd listFiles' now can be used to retrieve all files/dir indexed under a certain path
 - Added for support for Gentoo-way compilation flags. Implemented more consistent and pretty optional dependency handling.
0.5.7 
 - use plugins instead of shared libraries for the indexer backends
 - lots of bugfixes and cleanups
 - allow backends to be used in RAM by using ':memory:' as the index name
0.5.6
 - Added Xesam User Language parser. Now it will be possible to handle Xesam UserLanguage queries (http://wiki.freedesktop.org/wiki/XesamUserSearchLanguage).
 - Replaced .ini-based ontology parser with RDF/XML one.
 - Updated strigicmd: now it's possible to perform searches formulated
 following xesam userlanguage specifications.
 - Improved ontology introspection API: properties and classes now have child lists and applicable classes/properties lists.
 - change IndexReader::getFiles to IndexReader::getChildren.
 - removed IndexReader::documentId and IndexReader::mTime.
 - loads of build issues fixed
 - added a script that helps you to find the patch that broke a unit test
 - add fieldname for document content per the Xesam standard.
 - lots more
0.5.5
 - GUI now uses a .ui file making future improvements much easier
 - install detection script for ease of use in other cmake projects
 - modifying the signature of endAnalysis to endAnalysis(bool complete) 
   for StreamLineAnalyzer, StreamEventAnalyzer, and StreamSaxAnalyzer
 - add a function to AnalyzerConfiguration that tell how many bytes can 
   be read at most from a stream
 - add an SAX analyzer plugin that extracts the namespaces used in XML
   documents. With this it possible to get all XML documents that contain e.g.
   Chemical Markup Language or Dublin Core.
 - add a stream for changing the encoding of an incoming stream on the fly
 - use the new encoding stream to do better email parsing
 - add m3u stream analyzer.
 - add simple test program for strigi xesam query builder. It loads a file
   containing the xesam query. It converts the xesam query into a Strigi::Query
   object. It serializes the Strigi::Query object to xml for e.g. quality
   control.
 - add xesamquery option to strigicmd: now it's possibile to make queries
   using Xesam language.
 - add XesamQueryLanguage queries support. Now is possibile to translate
   xesam queries formulated using XesamQueryLanguage into Strigi::Query objects.
 - add a cgi executable that takes multipart/form-data and outputs an analysis
   of the data as xml
 - give xmlindexer the ability to read from stdin
 - big improvement in parsing ms word files
 - better input sanity checking. thanks to zzuf for reporting the errors
 - cleanup of private variables in classes by introducing a d-pointer

0.5.4
 - simplify PollingListener by letting it reuse code from DirAnalyzer
 - improve parsing speed by reading incrementally large blocks and only if no throughanalyzer is ready yet
 - extract more data from ogg and ID3 files
 - new registerField(fieldname) function that gets additional data from the
   ontology
 - support of indexwriter calls: addValue(index, field, data, size),
   addValue(index, field, double_value) to CLucene backend.
 - enable passing of "Tokenized" flag parameter to CLucene backend
 - support for the Keyword Terms which are not tokenized during queries
 - handling of optional indexing flags, which are loaded from the ontology
 - handling of cardinality constraint when indexing
 - add keyword query type which allows for using keywords that are not split
   up. e.g. chemistry.molecular_formula#"C 4 H 10". basically "#" sign tells --    do not tokenize
 - parse the userlanguage wrapped in xesam query language xml
 - add searialization to xml for Strigi::Query and Strigi::Term, useful for
   debugging purposes
 - add types from the xesam dbus interface to strigitypes.h
 - add support for gif files
 - add support for analyzing jpeg files.
 - add prioritized, multithreaded queue for incoming requests
 - add option --lastfiletoskip to diranalyzer and xmlindexer
 - add support for Cc: Bcc: Message-ID: In-Reply-To: References: From: and To:
 - add exclude and include filters to strigicmd create and update commands
 - add deindex option, it can be used for removing dirs or files from an index
   created by strigi

0.3.11
 - SunOS, BSD, 64 bit and Coverity compatibility fixes
 - Search in a set of default fields and not just in the text content of a file, if no specific field is specified.
 - Add histogram widget to simple search client
 - Add support for Ogg Vorbis
 - Better decoding of email headers
 - Expand Query object to handle nested queryies
 - Fix highlighting and display of title in search results.
 - Fix path for the child indexables
 - Fix memory problems in archivereader
 - Check for too short file names and omit the RPM trailer from the results.
 - Add an additional unit test for the RPM stream provider.
 - Revert raise() to kill(getpid()) because raise hangs the thread.
 - Install qtdbus library for strigi.

0.3.10
 - Convienience classes for using Strigi over Qt 4.2 DBus
 - Change buildsystem to allow building of deepfind, deepgrep and xmlindexer
   separately
 - Speedup of deepfind by selectively using only the analyzers deepfind needs
 - Many portability fixes (GCC 3, Forte, MSVC)
 - New, more efficien plugin loading
 - Add IFilter plugin for the Windows version
 - Remove the big Strigi lock (faster indexing)
 - Switch strigiclient to communicate of DBus instead of over a unix socket
 - Reorganization of the indexer with a new IndexerConfiguration
 - Improvements of file name filters
 - New Qt widget for configuring file name filters
 - Add file name setting to the DBus interface
 - Move verbose unit tests
 - Bugfixes in some streams

0.3.9
 - Added deepfind and deepgrep, programs that are enhanced versions of find
   and grep.
 - Added a new way of storing the configuration in an xml file.
 - Added a way to search in multiple indexes.
 - Added xmlindexer, a program that outputs the file parsing results as xml.
   This is convenient for debugging and can also used by other programs that
   do not want to write their own indexer. It makes the superior Strigi
   indexer available to other software in a convenient way.
 - More versatile filters that determine which files to index. (Flavio
   Castelli)
 - Add possibility to index files from the client by feeding the file into the
   daemon. This opens the way to indexing email from remote servers and web
   pages.