serpento (for lack of a better name) is a DICT (RFC 2229) server
written in Python.
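
For orientation, a DICT client and server exchange plain text lines.
The following is only a sketch in modern Python 3 (not the Python 2.0
that serpento itself targets). The helper names dict_command and
parse_status are hypothetical and are not part of serpento; the
DEFINE and MATCH command forms and the three-digit status codes are
taken from RFC 2229.

```python
# Illustrative only: these helpers are hypothetical, not serpento's API.

def dict_command(verb, *args):
    """Build one DICT command line, quoting arguments that contain spaces."""
    parts = [verb.upper()]
    for arg in args:
        if " " in arg:
            arg = '"' + arg.replace('"', '\\"') + '"'
        parts.append(arg)
    return " ".join(parts) + "\r\n"


def parse_status(line):
    """Split a response line like '150 1 definitions retrieved' into (code, text)."""
    code, _, text = line.rstrip("\r\n").partition(" ")
    return int(code), text


# "*" as the database name asks the server to search all databases.
define_line = dict_command("define", "*", "python")
match_line = dict_command("match", "*", "prefix", "serp")
```

Sending these lines over a TCP connection (2628 is the port assigned
to DICT) and reading responses until a final status line arrives would
complete the exchange; that part is left out here.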

License: GPL, with one addition: it can be linked with
whatever you want, without any restrictions.
Welcome to the world of "your open-source license is better
than mine" :-)


Requirements:
- Python 2.0 (does NOT work with Python 1.5; could probably
  be forced to work with Python 1.6 with some effort)
- Unix-like operating system (so far tested only on Linux)
- some programs in the tools/ directory rely on konwert to
  convert between different encodings
  (http://www.kki.net.pl/qrczak/programy/linux/konwert/)

Features:
- full UNICODE support (well, not full yet :-))
- can use a raw dict file (the one with %h %d) and automatically
  format the output as plain text
- dictionaries can be compressed with dictzip(1)
- uses the same index file as dictd
- supports the following strategies:
    exact      Match words exactly
    prefix     Match prefixes
    suffix     Match suffixes
    substring  Match substring occurring anywhere in word
    re         POSIX 1003.2 regular expressions (-)
    fnmatch    fnmatch-like (* ? as wildcards) (-) 
    soundex    Match using SOUNDEX algorithm (--)
    metaphone  metaphone algorithm (--)
    lev        Match words within Levenshtein distance one (-)
    (-) : does not work correctly with UNICODE characters outside
          the ASCII range (yet)
    (--): cannot in principle work correctly with UNICODE characters
          outside the ASCII range, because it is designed for English
          words only
- easily extensible with new types of databases
- now tries to be case-insensitive for most characters that have a
  sanely defined upper/lowercase conversion (U+0131 LATIN SMALL LETTER
  DOTLESS I being a prime counterexample)
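
The simpler matching strategies listed above can be sketched roughly
as follows. This is an illustration in modern Python 3, not serpento's
actual code: the names STRATEGIES, match and within_one_edit are
hypothetical, case folding is approximated with str.lower(), and "lev"
is read here as "at most one edit" (an assumption). The re, soundex
and metaphone strategies are left out.

```python
import fnmatch


def within_one_edit(a, b):
    """True if a and b differ by at most one insertion, deletion or substitution."""
    if a == b:
        return True
    if abs(len(a) - len(b)) > 1:
        return False
    if len(a) > len(b):
        a, b = b, a                      # make a the shorter (or equal) string
    i = 0
    while i < len(a) and a[i] == b[i]:   # skip the common prefix
        i += 1
    if len(a) == len(b):
        return a[i + 1:] == b[i + 1:]    # exactly one substitution
    return a[i:] == b[i + 1:]            # exactly one extra character in b


# Each test receives the lowercased query and the lowercased headword.
STRATEGIES = {
    "exact": lambda query, word: word == query,
    "prefix": lambda query, word: word.startswith(query),
    "suffix": lambda query, word: word.endswith(query),
    "substring": lambda query, word: query in word,
    "fnmatch": lambda query, word: fnmatch.fnmatchcase(word, query),
    "lev": within_one_edit,
}


def match(strategy, query, headwords):
    """Return the headwords that match query under the named strategy."""
    test = STRATEGIES[strategy]
    query = query.lower()
    return [w for w in headwords if test(query, w.lower())]


words = ["serpent", "Serpento", "present", "serpents"]
suffix_hits = match("suffix", "ent", words)   # ['serpent', 'present']
```

The dotless i shows why lowercasing is only an approximation: in
Python 3, "\u0131".upper() is "I", but "I".lower() is "i", so the
round trip does not return the original character.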

Drawbacks:
- early version
- no documentation (see comments in source :-))
- the first startup takes significant time