.\" DO NOT MODIFY THIS FILE!  It was generated by help2man 1.47.16.
.TH SUBREAD-ALIGN "1" "March 2021" "subread-align 2.0.1" "User Commands"
.SH NAME
subread-align \- toolkit for processing next-gen sequencing data
.SH DESCRIPTION
Version 2.0.1
.PP
Usage:
.PP
\&./subread\-align [options] \fB\-i\fR <index_name> \fB\-r\fR <input> \fB\-t\fR <type> \fB\-o\fR <output>
.PP
## Mandatory arguments:
.TP
\fB\-i\fR <string>
Base name of the index.
.TP
\fB\-r\fR <string>
Name of an input read file. If paired\-end, this should be
the first read file (typically containing "R1" in the file
name) and the second should be provided via "\-R".
Acceptable formats include gzipped FASTQ, FASTQ, gzipped
FASTA and FASTA.
These formats are identified automatically.
.TP
\fB\-t\fR <int>
Type of input sequencing data. Its values include
.br
0: RNA\-seq data
.br
1: genomic DNA\-seq data.
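.PP
As an illustration only, the usage line above can be assembled from a script.
The sketch below is a minimal Python helper that builds the argument list from
the \fB\-i\fR, \fB\-r\fR, \fB\-t\fR, \fB\-o\fR and \fB\-R\fR options documented
here. The function name and all file names are hypothetical, and the sketch
assumes the program is installed as "subread-align". Nothing is executed.
.PP
.nf
```python
def build_subread_align_command(index_base, read_file, data_type,
                                output_file, second_read_file=None):
    """Return an argument list for subread-align (hypothetical helper)."""
    # -t accepts 0 (RNA-seq) or 1 (genomic DNA-seq) according to this page.
    if data_type not in (0, 1):
        raise ValueError("data_type must be 0 (RNA-seq) or 1 (DNA-seq)")
    command = [
        "subread-align",
        "-i", index_base,
        "-r", read_file,
        "-t", str(data_type),
        "-o", output_file,
    ]
    # For paired-end data the second read file is passed with -R.
    if second_read_file is not None:
        command += ["-R", second_read_file]
    return command


# Hypothetical paired-end DNA-seq example; file names are placeholders.
example = build_subread_align_command(
    "my_index", "sample_R1.fastq.gz", 1, "sample.bam",
    second_read_file="sample_R2.fastq.gz",
)
print(" ".join(example))
```
.fi
.PP
The resulting list could be handed to a process launcher such as Python's
subprocess.run once the index and read files exist.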
.PP
## Optional arguments:
# input reads and output
.TP
\fB\-o\fR <string>
Name of an output file. By default, the output is in BAM
format. If this option is omitted, the output is written to
STDOUT.
.TP
\fB\-R\fR <string>
Name of the second read file in paired\-end data (typically
containing "R2" in the file name).
.TP
\fB\-\-SAMinput\fR
Input reads are in SAM format.
.TP
\fB\-\-BAMinput\fR
Input reads are in BAM format.
.TP
\fB\-\-SAMoutput\fR
Save mapping results in SAM format.
.PP
# Phred offset
.TP
\fB\-P\fR <3:6>
Offset value added to the Phred quality score of each read
base. '3' for phred+33 and '6' for phred+64. '3' by default.
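.PP
To make the two offsets concrete, the sketch below decodes a FASTQ quality
string under each setting. It assumes the usual FASTQ convention in which each
quality character is the Phred score plus the offset, stored as an ASCII code.
The function and variable names are hypothetical and are not part of
subread\-align.
.PP
.nf
```python
# Maps the value given to -P onto the ASCII offset named on this page.
PHRED_OFFSETS = {3: 33, 6: 64}


def decode_qualities(quality_string, p_option=3):
    """Convert quality characters to Phred scores (hypothetical helper)."""
    offset = PHRED_OFFSETS[p_option]
    return [ord(character) - offset for character in quality_string]


# "I" has ASCII code 73, so it is 40 under phred+33.
print(decode_qualities("II5!", p_option=3))
# "h" has ASCII code 104, so it is 40 under phred+64.
print(decode_qualities("hhT@", p_option=6))
```
.fi
.PP
Decoding a phred+64 file with the phred+33 offset would inflate every score
by 31, which is why the offset has to match the input data.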
.PP
# thresholds for mapping
.TP
\fB\-n\fR <int>
Number of selected subreads, 10 by default.
.TP
\fB\-m\fR <int>
Consensus threshold for reporting a hit (minimum number of
subreads that map in consensus). If paired\-end, this gives
the consensus threshold for the anchor read (the anchor read
receives more votes than the other read in the same pair).
3 by default.
.TP
\fB\-p\fR <int>
Consensus threshold for the non\-anchor read in a pair. 1 by
default.
.TP
\fB\-M\fR <int>
Maximum number of mis\-matched bases allowed in each reported
alignment. 3 by default. Mis\-matched bases found in
soft\-clipped bases are not counted.
.PP
# unique mapping and multi\-mapping
.TP
\fB\-\-multiMapping\fR
Report multi\-mapping reads in addition to uniquely mapped
reads. Use "\-B" to set the maximum number of equally\-best
alignments to be reported.
.TP
\fB\-B\fR <int>
Maximum number of equally\-best alignments to be reported for
a multi\-mapping read. Equally\-best alignments have the same
number of mis\-matched bases. 1 by default.
.PP
# indel detection
.TP
\fB\-I\fR <int>
Maximum length (in bp) of indels that can be detected. 5 by
default. Indels up to 200bp long can be detected.
.TP
\fB\-\-complexIndels\fR
Detect multiple short indels that are in close proximity
(they can be as close as 1bp apart from each other).
.PP
# read trimming
.TP
\fB\-\-trim5\fR <int>
Trim off <int> number of bases from 5' end of each read. 0
by default.
.TP
\fB\-\-trim3\fR <int>
Trim off <int> number of bases from 3' end of each read. 0
by default.
.PP
# distance and orientation of paired end reads
.TP
\fB\-d\fR <int>
Minimum fragment/insert length, 50bp by default.
.TP
\fB\-D\fR <int>
Maximum fragment/insert length, 600bp by default.
.TP
\fB\-S\fR <ff:fr:rf>
Orientation of first and second reads, 'fr' (forward/reverse)
by default.
.PP
# number of CPU threads
.TP
\fB\-T\fR <int>
Number of CPU threads used, 1 by default.
.PP
# read group
.TP
\fB\-\-rg\-id\fR <string>
Add read group ID to the output.
.TP
\fB\-\-rg\fR <string>
Add <tag:value> to the read group (RG) header in the output.
.PP
# read order
.TP
\fB\-\-keepReadOrder\fR
Keep the order of reads in BAM output the same as that in the
input file. Reads from the same pair are always placed next
to each other whether or not this option is specified.
.TP
\fB\-\-sortReadsByCoordinates\fR
Output location\-sorted reads. This option is
applicable for BAM output only. A BAI index file is also
generated for each BAM file so the BAM files can be directly
loaded into a genome browser.
.PP
# color space reads
.TP
\fB\-b\fR
Convert color\-space read bases to base\-space read bases in
the mapping output. Note that read mapping is performed in
color space.
.PP
# dynamic programming
.TP
\fB\-\-DPGapOpen\fR <int>
Penalty for gap opening in short indel detection. \-1 by
default.
.TP
\fB\-\-DPGapExt\fR <int>
Penalty for gap extension in short indel detection. 0 by
default.
.TP
\fB\-\-DPMismatch\fR <int>
Penalty for mismatches in short indel detection. 0 by
default.
.TP
\fB\-\-DPMatch\fR <int>
Score for matched bases in short indel detection. 2 by
default.
.PP
# detect structural variants
.TP
\fB\-\-sv\fR
Detect structural variants (e.g. long indel, inversion,
duplication and translocation) and report breakpoints. Refer
to the Users Guide for breakpoint reporting.
.PP
# gene annotation
.TP
\fB\-a\fR
Name of an annotation file (gzipped files are accepted).
GTF/GFF format by default. See the \fB\-F\fR option for more format
information.
.TP
\fB\-F\fR
Specify format of the provided annotation file. Acceptable
formats include 'GTF' (or compatible GFF format) and
\&'SAF'. 'GTF' by default. For SAF format, please refer to
the Users Guide.
.TP
\fB\-A\fR
Provide a chromosome name alias file to match chr names in
annotation with those in the reads. This should be a
two\-column comma\-delimited text file. Its first column should
include chr names in the annotation and its second column
should include chr names in the index. Chr names are case
sensitive. No column header should be included in the
file.
.TP
\fB\-\-gtfFeature\fR <string>
Specify feature type in GTF annotation. 'exon'
by default. Features used for read counting will be
extracted from annotation using the provided value.
.TP
\fB\-\-gtfAttr\fR <string>
Specify attribute type in GTF annotation. 'gene_id'
by default. Meta\-features used for read counting will be
extracted from annotation using the provided value.
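.PP
The alias file accepted by \fB\-A\fR can be produced with a few lines of
script. The sketch below formats (annotation name, index name) pairs as
comma\-delimited lines without a header, following the layout described for
\fB\-A\fR. The chromosome names and the function name are hypothetical
examples.
.PP
.nf
```python
def format_alias_lines(pairs):
    """Format (annotation_name, index_name) pairs as alias-file lines."""
    lines = []
    for annotation_name, index_name in pairs:
        # A comma inside a name would break the two-column layout.
        if "," in annotation_name or "," in index_name:
            raise ValueError("chromosome names must not contain commas")
        # First column: name in the annotation; second: name in the index.
        lines.append(annotation_name + "," + index_name)
    return lines


# Hypothetical mapping; names are case sensitive and no header is written.
example_pairs = [("1", "chr1"), ("2", "chr2"), ("MT", "chrM")]
for line in format_alias_lines(example_pairs):
    print(line)
```
.fi
.PP
Redirecting the printed lines to a text file gives something that could be
passed to \fB\-A\fR.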
.PP
# others
.TP
\fB\-v\fR
Output version of the program.
.PP
Refer to the Users Manual for a detailed description of the arguments.
.SH AUTHOR
 This manpage was written by Nilesh Patra for the Debian distribution and
 can be used for any other usage of the program.