.\" generated with Ronn/v0.7.3
.\" http://github.com/rtomayko/ronn/tree/0.7.3
.
.TH "SAMBAMBA\-MARKDUP" "1" "February 2015" "" ""
.
.SH "NAME"
\fBsambamba\-markdup\fR \- find duplicate reads in a BAM file
.
.SH "SYNOPSIS"
\fBsambamba markdup\fR \fIOPTIONS\fR <input\.bam> <output\.bam>
.
.SH "DESCRIPTION"
Marks (by default) or removes duplicate reads\. To determine whether a read is a duplicate, the same `sum of base qualities\' method is used as in Picard \fIhttps://broadinstitute\.github\.io/picard/picard\-metric\-definitions\.html\fR\.
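A typical invocation might look like the following (file names and thread count are illustrative):

```shell
# Mark duplicates (the default), using 8 threads and a progress bar:
sambamba markdup -t 8 -p input.bam marked.bam

# Remove duplicates instead of just marking them:
sambamba markdup -r -t 8 input.bam dedup.bam
```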
.
.SH "OPTIONS"
.
.TP
\fB\-r\fR, \fB\-\-remove\-duplicates\fR
remove duplicates instead of just marking them
.
.TP
\fB\-t\fR, \fB\-\-nthreads\fR=\fINTHREADS\fR
number of threads to use
.
.TP
\fB\-l\fR, \fB\-\-compression\-level\fR=\fIN\fR
specify compression level of the resulting file (from 0 to 9)
.
.TP
\fB\-p\fR, \fB\-\-show\-progress\fR
show progress bar on STDERR
.
.TP
\fB\-\-tmpdir\fR=\fITMPDIR\fR
specify directory for temporary files; default is \fB/tmp\fR
.
.TP
\fB\-\-hash\-table\-size\fR=\fIHASHTABLESIZE\fR
size of hash table for finding read pairs (default is 262144 reads); will be rounded down to the nearest power of two; should be \fB> (average coverage) * (insert size)\fR for good performance
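As a rough illustration of that guideline (the coverage and insert\-size figures below are made up), a power of two satisfying \fB> (average coverage) * (insert size)\fR can be computed like this:

```shell
# Hypothetical inputs: average coverage 30x, insert size 400 bp.
coverage=30
insert_size=400
needed=$((coverage * insert_size))   # 12000 reads

# Round up to the next power of two, so that the effective table size
# (which sambamba rounds down to a power of two) still exceeds $needed:
size=1
while [ "$size" -lt "$needed" ]; do size=$((size * 2)); done
echo "$size"   # 16384
```

The resulting value would then be passed as \fB\-\-hash\-table\-size=16384\fR\.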
.
.TP
\fB\-\-overflow\-list\-size\fR=\fIOVERFLOWLISTSIZE\fR
size of the overflow list where reads, thrown away from the hash table, get a second chance to meet their pairs (default is 200000 reads); increasing the size reduces the number of temporary files created
.
.TP
\fB\-\-io\-buffer\-size\fR=\fIBUFFERSIZE\fR
controls sizes of two buffers of BUFFERSIZE \fImegabytes\fR each, used for reading and writing BAM during the second pass (default is 128)
.
.SH "SEE ALSO"
Picard \fIhttps://broadinstitute\.github\.io/picard/picard\-metric\-definitions\.html\fR metric definitions for removing duplicates\.
.
.SH "BUGS"
External sort is not implemented\. Thus, memory consumption grows by about 2 GB per 100M reads\. Make sure you have enough RAM before running the tool\.
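For example, applying that figure to a hypothetical input of 500M reads, the expected memory footprint can be estimated as:

```shell
# Rough RAM estimate from the 2 GB per 100M reads figure above.
reads=500000000                      # hypothetical: 500M reads in the input BAM
ram_gb=$((reads / 100000000 * 2))    # integer arithmetic: 10 GB
echo "${ram_gb} GB"                  # prints "10 GB"
```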