.\" generated with Ronn/v0.7.3
.\" http://github.com/rtomayko/ronn/tree/0.7.3
.
.TH "SAMBAMBA\-MARKDUP" "1" "February 2015" "" ""
.
.SH "NAME"
\fBsambamba\-markdup\fR \- find duplicate reads in a BAM file
.
.SH "SYNOPSIS"
\fBsambamba markdup\fR \fIOPTIONS\fR <input\.bam> <output\.bam>
.
.SH "DESCRIPTION"
Marks (by default) or removes duplicate reads\. To determine whether a read is a duplicate, the same `sum of base qualities\' method is used as in Picard \fIhttps://broadinstitute\.github\.io/picard/picard\-metric\-definitions\.html\fR\.
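A typical invocation might look like the following (file names and thread count are illustrative):

```shell
# Mark duplicates (the default), using 8 threads and a progress bar:
sambamba markdup -t 8 -p input.bam marked.bam

# Remove duplicates instead of just marking them:
sambamba markdup -r -t 8 input.bam dedup.bam
```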
.
.SH "OPTIONS"
.
.TP
\fB\-r\fR, \fB\-\-remove\-duplicates\fR
remove duplicates instead of just marking them
.
.TP
\fB\-t\fR, \fB\-\-nthreads\fR=\fINTHREADS\fR
number of threads to use
.
.TP
\fB\-l\fR, \fB\-\-compression\-level\fR=\fIN\fR
specify compression level of the resulting file (from 0 to 9)
.
.TP
\fB\-p\fR, \fB\-\-show\-progress\fR
show progress bar on STDERR
.
.TP
\fB\-\-tmpdir\fR=\fITMPDIR\fR
specify directory for temporary files; default is \fB/tmp\fR
.
.TP
\fB\-\-hash\-table\-size\fR=\fIHASHTABLESIZE\fR
size of hash table for finding read pairs (default is 262144 reads); will be rounded down to the nearest power of two; should be \fB> (average coverage) * (insert size)\fR for good performance
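As a rough illustration of that guideline (the coverage and insert\-size figures below are made up), a power of two satisfying \fB> (average coverage) * (insert size)\fR can be computed like this:

```shell
# Hypothetical inputs: average coverage 30x, insert size 400 bp.
coverage=30
insert_size=400
needed=$((coverage * insert_size))   # 12000 reads

# Round up to the next power of two, so that the effective table size
# (which sambamba rounds down to a power of two) still exceeds $needed:
size=1
while [ "$size" -lt "$needed" ]; do size=$((size * 2)); done
echo "$size"   # 16384
```

The resulting value would then be passed as \fB\-\-hash\-table\-size=16384\fR\.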
.
.TP
\fB\-\-overflow\-list\-size\fR=\fIOVERFLOWLISTSIZE\fR
size of the overflow list where reads, thrown away from the hash table, get a second chance to meet their pairs (default is 200000 reads); increasing the size reduces the number of temporary files created
.
.TP
\fB\-\-io\-buffer\-size\fR=\fIBUFFERSIZE\fR
controls sizes of two buffers of BUFFERSIZE \fImegabytes\fR each, used for reading and writing BAM during the second pass (default is 128)
.
.SH "SEE ALSO"
Picard \fIhttps://broadinstitute\.github\.io/picard/picard\-metric\-definitions\.html\fR metric definitions for removing duplicates\.
.
.SH "BUGS"
External sort is not implemented\. Thus, memory consumption grows by about 2 GB per 100M reads\. Make sure you have enough RAM before running the tool\.
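For example, applying that figure to a hypothetical input of 500M reads, the expected memory footprint can be estimated as:

```shell
# Rough RAM estimate from the 2 GB per 100M reads figure above.
reads=500000000                      # hypothetical: 500M reads in the input BAM
ram_gb=$((reads / 100000000 * 2))    # integer arithmetic: 10 GB
echo "${ram_gb} GB"                  # prints "10 GB"
```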