SOAPaligner/soap2(1)          Bioinformatics tool         SOAPaligner/soap2(1)



NAME
       SOAPaligner/soap2 - Short Oligonucleotide Analysis Package aligner

SYNOPSIS
       soap reference.index short_reads.fast[a|q] alignment.out [options]

DESCRIPTION
       SOAPaligner/soap2 is a member of SOAP (Short Oligonucleotide Analysis
       Package). It is an updated version of the SOAP software for short
       oligonucleotide alignment. The new program provides very fast and
       accurate alignment of the huge numbers of short reads generated by the
       Illumina/Solexa Genome Analyzer. It is an order of magnitude faster
       than soap v1: aligning one million single-end reads to the human
       reference genome takes only 2 minutes. Another notable improvement is
       that SOAPaligner now supports a wide range of read lengths.

       SOAPaligner owes its time and space efficiency to a redesign of the
       basic data structures and algorithms. The core algorithms and the
       indexing data structure (2way-BWT) were developed by the algorithms
       research group of the Department of Computer Science, the University of
       Hong Kong (T.W. Lam, Alan Tam, Simon Wong, Edward Wu and S.M. Yiu).

COMMAND AND OPTIONS
       soap -D <in.fasta.index> -a <query.file.a> [-b <query.file.b>] -o
       <alignment.output> [-2 <unpaired.output>] [options]

       OPTIONS:

              -D STR Prefix name of the reference index [*.index]. See APPENDIX
                     for how to build the reference index.

              -a STR Query file: single-end (SE) reads, or one end of
                     paired-end (PE) reads

              -b STR Query b file: the other end of PE reads

              -o STR Output file for alignment results

              -2 STR Output file for reads that are mapped but unpaired in PE
                     alignment

              -u STR Output file for unmapped reads, [none]

              -m INT Minimal insert size allowed for PE, [400]

              -x INT Maximal insert size allowed for PE, [600]

              -n INT Filter out low-quality reads containing more than INT bp
                     of Ns, [5]

              -t     Output read IDs instead of read names, [none]

              -r INT How to report repeat hits: 0=none; 1=random one; 2=all,
                     [1]

              -R     RF alignment for PE data with long insert size (>= 2k
                     bp), [none] (FR alignment)

              -l INT For long reads with a high error rate at the 3'-end that
                     cannot be aligned over their whole length: first align
                     the 5' INT bp subsequence as a seed, [256] (use the whole
                     length of the read)

              -s INT Minimal alignment length (for soft clipping)

              -v INT Total mismatches allowed in one read when a subsequence
                     is used as a seed, [5]

              -g INT Gap size allowed in one read, [0]

              -M INT Match mode for each read, or for the seed part of a read,
                     which should not contain more than 2 mismatches, [4]

                     0: exact match only

                     1: 1 mismatch match only

                     2: 2 mismatch match only

                     4: find the best hits

              -p INT Number of threads to use, [1]
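
       As an illustration of how the options above combine, the following
       Python sketch assembles an argument list for a PE or SE run. It is
       not part of SOAPaligner: the function name build_soap_command and
       all file names are hypothetical, and the check that -2 is only given
       together with -b is an assumption drawn from the option descriptions
       above.

```python
from typing import List, Optional


def build_soap_command(
    index_prefix: str,
    reads_a: str,
    output: str,
    reads_b: Optional[str] = None,
    unpaired_output: Optional[str] = None,
    min_insert: int = 400,
    max_insert: int = 600,
    threads: int = 1,
) -> List[str]:
    """Assemble an argv list for soap using only options from this page."""
    if unpaired_output is not None and reads_b is None:
        # Assumption: -2 is only meaningful for PE alignment (needs -b).
        raise ValueError("-2 requires a second query file (-b)")
    if min_insert > max_insert:
        raise ValueError("minimal insert size exceeds maximal insert size")

    argv = ["soap", "-D", index_prefix, "-a", reads_a, "-o", output]
    if reads_b is not None:
        # PE alignment: pass the second query file and the insert size range.
        argv += ["-b", reads_b, "-m", str(min_insert), "-x", str(max_insert)]
        if unpaired_output is not None:
            argv += ["-2", unpaired_output]
    argv += ["-p", str(threads)]
    return argv
```

       The resulting list can be handed to subprocess.run() unchanged;
       building a list rather than a shell string avoids quoting problems
       with unusual file names.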

OUTPUT FORMAT
       SOAP2 output contains the following columns:

       1. read name / read ID (if -t is given)

       2. read sequence (if the read aligns to the reverse strand, this is the
       reverse sequence of the original read)

       3. quality sequence (if the input is FASTA reads, this column is all
       'h'; the sequence is reversed if the read maps to the reverse strand)

       4.
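
       A minimal Python sketch of reading the three columns documented
       above. It assumes that columns are tab-separated, which this page
       does not state, and it ignores any further columns. The names
       SoapHitPrefix, parse_leading_columns and quality_is_placeholder are
       hypothetical.

```python
from typing import NamedTuple


class SoapHitPrefix(NamedTuple):
    """The first three columns of one output line."""
    read_id: str
    sequence: str
    quality: str


def parse_leading_columns(line: str, sep: str = "\t") -> SoapHitPrefix:
    """Split one output line and keep columns 1-3.

    Assumption: fields are separated by `sep` (a tab by default).
    """
    fields = line.rstrip("\n").split(sep)
    if len(fields) < 3:
        raise ValueError("expected at least 3 columns")
    return SoapHitPrefix(fields[0], fields[1], fields[2])


def quality_is_placeholder(hit: SoapHitPrefix) -> bool:
    """True if the quality column is all 'h', as written for FASTA input."""
    return set(hit.quality) == {"h"}
```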

APPENDIX
       Before using soap2 for alignment, the reference index must be generated
       with 2bwt-builder.

              2bwt-builder <reference.fasta>

              NOTE: 1. the reference input must be in FASTA format; 2. the
              program automatically generates the index files in the directory
              where the FASTA file is located, so confirm that you have write
              permission there first.
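
       The permission check in the note can be scripted before the builder
       is started. The Python sketch below is an illustration only; the
       function name build_index_command is hypothetical, and it prepares
       the command line without running 2bwt-builder itself.

```python
import os
from typing import List


def build_index_command(fasta_path: str) -> List[str]:
    """Return the 2bwt-builder argv after checking the FASTA's directory.

    The index files are written next to the FASTA file, so that directory
    must be writable by the current user.
    """
    if not os.path.isfile(fasta_path):
        raise FileNotFoundError(fasta_path)
    directory = os.path.dirname(os.path.abspath(fasta_path))
    if not os.access(directory, os.W_OK):
        raise PermissionError(f"cannot write index files to {directory}")
    return ["2bwt-builder", fasta_path]
```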

ENVIRONMENT
       The data structure is incompatible with 32-bit systems, so the program
       cannot be ported to any 32-bit platform. Because parts of the code are
       optimized with MMX instructions, the current version runs only on the
       x86_64 platform. We will provide a universal version for most 64-bit
       platforms later.

       HARDWARE REQUIREMENT
              1. 8Gb RAM (for a genome as large as human's)

              2. at least 8Gb hard disk to store index (for a genome as large
              as human's)

       SYSTEM REQUIREMENT
              Linux x86_64
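
       A wrapper script can test the system requirement before calling soap.
       The Python sketch below is an illustration only: the function names
       are hypothetical, and it assumes the platform module reports "Linux"
       and "x86_64" on a supported machine.

```python
import platform


def is_supported_platform(system: str, machine: str) -> bool:
    """True for the single platform listed above: Linux on x86_64."""
    return system == "Linux" and machine == "x86_64"


def current_platform_supported() -> bool:
    """Apply the check to the machine running this script."""
    return is_supported_platform(platform.system(), platform.machine())
```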

SEE ALSO
       Website for SOAP <http://soap.genomics.org.cn>,

       Google Group for SOAP <http://groups.google.com/group/bgi-soap>

       Publication:
              "SOAP: short oligonucleotide alignment program" (2008)
              Bioinformatics, Vol. 24 no. 5, pages 713-714

AUTHOR
       BGI Shenzhen SOAP team. The core algorithm Bidirect-BWT was written by
       Prof. T.W. Lam and his team at the University of Hong Kong.

REPORT BUGS
       Report bugs to <soap@genomics.org.cn>

ACKNOWLEDGEMENTS
       We thank Prof. T.W. Lam, Alan Tam, Simon Wong, Edward Wu and S.M. Yiu
       for their outstanding work on Bidirect-BWT.



SOAPaligner-2.1X                  25 May 2009             SOAPaligner/soap2(1)