Sigma: Simple greedy multiple alignment
Version 1.1.1, (C) 2007, Rahul Siddharthan <rsidd@imsc.res.in>

This is a port to C of Sigma, which was originally (version 1.0)
written in OCaml.  A paper describing the algorithm was published in
BMC Bioinformatics 7:143 (2006).  A couple of important bugs have
been fixed, and it is about 3 to 4 times faster than the OCaml
version.  It does everything that the OCaml version did, but more
correctly.

1.1.1 is a minor update (see ChangeLog) with no new features.

Changes since version 1.0:

1. Mismatches in previously-aligned sequence fragments are treated
   more intelligently.  Earlier, "N" was always used, which led to
   problems in aligning large numbers of sequences, where mismatches
   became more and more common. Now the majority base, if one exists,
   is used.  There is scope for improvement, which will be dealt with
   in a future version, but the current treatment seems adequate for
   most common situations.

2. The dynamic programming algorithm for finding local alignments
   tends to become slow for large input sequences: it is O(NM), in
   time and memory usage, for sequence lengths N and M.  The current
   version includes a workaround where sequences are pre-fragmented
   into pieces of average size smaller than L (4000 by default,
   changeable with the new -l option).  This seems to work well in
   practice, on both synthetic and real DNA sequences.  As a result,
   this version of Sigma scales well to much larger datasets than
   before (e.g., 10 sequences of 10000 bp each), and runs much faster
   than many existing programs (including ClustalW and Dialign).
   Sigma 1.0 was almost unusable on such datasets.

   Nevertheless, this fragmentation is admittedly an inelegant hack;
   in future versions, we will try to implement a more efficient
   local-alignment algorithm that will avoid this "fragmenting" or, at
   least, allow a larger average fragment size.

3. Some bugs in the enforcement of the consistency condition have
   been fixed.  These had the effect of occasionally disallowing
   legitimate alignments (but, I believe, did not cause incorrect
   alignments).
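
Changes 1 and 2 above can be illustrated with a small sketch in C.  It
is not taken from the Sigma source: the function names are made up for
this example, "majority" is assumed to mean a strict majority of the
entries in a column, and pieces are assumed to be cut evenly (Sigma may
choose its cut points differently).

```c
#include <assert.h>
#include <stddef.h>

/* Illustration only; none of these names come from the Sigma source. */

/* Change 1: consensus for one column of previously aligned fragments.
 * Returns the base held by more than half of the n entries, or 'N'
 * when no such base exists.  Assumption: anything other than A/C/G/T
 * (in either case) casts no vote. */
char majority_base(const char *column, size_t n)
{
    static const char bases[4] = {'A', 'C', 'G', 'T'};
    size_t counts[4] = {0, 0, 0, 0};
    size_t i, b;

    for (i = 0; i < n; i++) {
        switch (column[i]) {
        case 'A': case 'a': counts[0]++; break;
        case 'C': case 'c': counts[1]++; break;
        case 'G': case 'g': counts[2]++; break;
        case 'T': case 't': counts[3]++; break;
        default: break;
        }
    }
    for (b = 0; b < 4; b++)
        if (2 * counts[b] > n)
            return bases[b];
    return 'N';
}

/* Change 2: number of pieces k such that the average piece length
 * n / k is strictly smaller than limit (the -l value).  Since
 * k = n / limit + 1 exceeds n / limit, it follows that n / k < limit. */
size_t fragment_count(size_t n, size_t limit)
{
    assert(limit > 0);
    return n / limit + 1;
}

/* Cells in one O(NM) dynamic-programming table for lengths n and m. */
unsigned long long dp_cells(size_t n, size_t m)
{
    return (unsigned long long)n * (unsigned long long)m;
}
```

With these definitions, a 10000 bp sequence and the default limit of
4000 give fragment_count(10000, 4000) == 3, i.e. pieces of about 3333 bp
on average.  One table for two unfragmented 10000 bp sequences has
dp_cells(10000, 10000) == 100000000 cells, whereas a table for two such
pieces has about 1.1e7.  This only shows the size of a single table; how
many tables Sigma builds in total is not covered here.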

For help on compiling, see the file COMPILING.  For command-line options, see
the file NOTES or simply run the program without options.  A Unix manual page,
contributed by Charles Plessy, is in "sigma.1" and will be installed with "make
install".  The DocBook source is in "sigma.1.xml".  For discussion of
background models and a sample file, see the Background directory.  For a
detailed description of the algorithm, see BMC Bioinformatics 7:143 (2006).

The program is distributed under the GNU General Public License, version 2.
For copyright and licensing information, see COPYING.

The program's website is 
            http://www.imsc.res.in/~rsidd/sigma/
and source code, as well as pre-compiled binaries for some
platforms, are available there.  To be informed of updates, email
the author, Rahul Siddharthan <rsidd@imsc.res.in>.