sambamba-sort(1) -- tool for sorting BAM files
==============================================

## SYNOPSIS

`sambamba sort` [OPTIONS] <input.bam>

## DESCRIPTION

BAM files can be sorted in either 'coordinate' or 'qname' order.

In 'coordinate' order, reads are sorted by (integer) reference ID and,
within each reference, by start coordinate.

In 'qname' order, reads are sorted lexicographically by
their names.
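
The two orders can be illustrated with a minimal Python sketch. The `Read`
type and its fields are hypothetical stand-ins for a BAM record, not part of
sambamba, and the sketch ignores special cases such as unmapped reads.

```python
from typing import NamedTuple


class Read(NamedTuple):
    """Hypothetical, simplified stand-in for a BAM record."""
    name: str    # read name (QNAME)
    ref_id: int  # integer reference ID
    pos: int     # start coordinate on that reference


def coordinate_key(read: Read):
    # 'coordinate' order: reference ID first, then start coordinate.
    return (read.ref_id, read.pos)


def qname_key(read: Read):
    # 'qname' order: plain lexicographic comparison of read names.
    return read.name


reads = [
    Read("r10", 1, 50),
    Read("r2", 0, 300),
    Read("r1", 0, 100),
]

by_coordinate = sorted(reads, key=coordinate_key)
by_qname = sorted(reads, key=qname_key)

print([r.name for r in by_coordinate])  # ['r1', 'r2', 'r10']
print([r.name for r in by_qname])       # ['r1', 'r10', 'r2']
```

Note that lexicographic comparison places `r10` before `r2`, which differs
from a numeric-aware ordering.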

`sambamba sort` performs an external stable sort on the input file. That is, it
reads the source BAM file in chunks that fit into memory, sorts each chunk
and writes it to a temporary directory, and then merges the sorted chunks.
The temporary files are removed automatically after merging.
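
The general technique can be sketched in Python as follows. This is only an
illustration of an external stable merge sort under simplifying assumptions,
not sambamba's actual implementation: the function names are hypothetical,
chunks are bounded by item count rather than by memory, and items are stored
with `pickle` instead of as BAM records.

```python
import heapq
import os
import pickle
import tempfile
from itertools import islice


def _write_chunk(sorted_items, tmpdir):
    """Write one sorted chunk to a temporary file and return its path."""
    fd, path = tempfile.mkstemp(dir=tmpdir, suffix=".chunk")
    with os.fdopen(fd, "wb") as f:
        for item in sorted_items:
            pickle.dump(item, f)
    return path


def _read_chunk(path):
    """Lazily yield the items of one chunk file, in stored order."""
    with open(path, "rb") as f:
        while True:
            try:
                yield pickle.load(f)
            except EOFError:
                return


def external_sort(items, key, chunk_size, tmpdir=None):
    """Sort `items` by `key`, holding at most `chunk_size` items per chunk."""
    paths = []
    it = iter(items)
    try:
        while True:
            chunk = list(islice(it, chunk_size))
            if not chunk:
                break
            chunk.sort(key=key)  # list.sort is stable
            paths.append(_write_chunk(chunk, tmpdir))
        # Ties are resolved in favour of earlier chunks, so the overall
        # result keeps the input order of equal-key items.
        merged = list(heapq.merge(*(_read_chunk(p) for p in paths), key=key))
    finally:
        # Temporary chunk files are removed once merging is done.
        for p in paths:
            os.remove(p)
    return merged
```

For brevity the sketch collects the merged result into a list; a real tool
would stream the merged records to the output file instead.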

Both sorting orders are supported. The default is 'coordinate',
because this is the order used for building an index later. To
switch to 'qname' order, use the `-n`|`--sort-by-name` flag.

## OPTIONS

  * `-m`, `--memory-limit`=<LIMIT>:
    Sets an upper bound on memory usage. However, this bound is very approximate.
    The default memory limit is 512MiB. Increasing it allows larger chunks
    and reduces the number of I/O seeks, thus improving overall
    performance.

    <LIMIT> must be a number with an optional suffix specifying the unit of measurement.
    The following suffixes are recognized: K, KiB, KB, M, MiB, MB, G, GiB, GB.

  * `--tmpdir`=<TMPDIR>:
    Use <TMPDIR> to store sorted chunks. The default behaviour is to use the system
    temporary directory.

  * `-o`, `--out`=<OUTPUTFILE>:
    Output file name. If not provided, the result is written to a file with the .sorted.bam extension.

  * `-n`, `--sort-by-name`:
    Sort by read name instead of doing coordinate sort.

  * `-l`, `--compression-level`=<COMPRESSION_LEVEL>:
    Compression level to use for the *sorted* BAM file, from 0 (known as uncompressed BAM in samtools) to 9.

  * `-u`, `--uncompressed-chunks`:
    Write sorted chunks as uncompressed BAM.
    The default behaviour is to write them with compression level 1, because that reduces time spent on I/O,
    but in some cases this option can be faster. Note, however, that the disk space
    needed for sorting will typically be 3-4 times larger with this option enabled.
    
  * `-p`, `--show-progress`:
    Show a wget-like progress bar on STDERR (in fact, two of them in sequence: the first one for sorting,
    and the second one for merging).

  * `-t`, `--nthreads`=<NTHREADS>:
    Number of threads to use.
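
As an illustration of the <LIMIT> syntax accepted by `--memory-limit`, the
following Python sketch converts such a string to a byte count. The function
name is hypothetical, and several details are assumptions rather than
documented sambamba behaviour: that a bare number means bytes, that K/KiB,
M/MiB and G/GiB are binary multiples while KB, MB and GB are decimal ones, and
that suffixes are matched case-insensitively.

```python
import re

# Assumed multipliers; the man page lists the suffixes but not their values.
_UNITS = {
    "": 1,
    "K": 2**10, "KIB": 2**10, "KB": 10**3,
    "M": 2**20, "MIB": 2**20, "MB": 10**6,
    "G": 2**30, "GIB": 2**30, "GB": 10**9,
}


def parse_memory_limit(text: str) -> int:
    """Convert a string such as '512MiB' or '2G' to a number of bytes."""
    match = re.fullmatch(r"(\d+)\s*([A-Za-z]*)", text.strip())
    if match is None:
        raise ValueError(f"cannot parse memory limit: {text!r}")
    number, suffix = match.groups()
    try:
        return int(number) * _UNITS[suffix.upper()]
    except KeyError:
        raise ValueError(f"unrecognized unit suffix: {suffix!r}") from None


print(parse_memory_limit("512MiB"))  # 536870912
```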

## SEE ALSO

For more information on the original samtools SORT behaviour, check
out the [samtools documentation](http://samtools.sourceforge.net/samtools.shtml).

## BUGS

At the moment, memory is used quite inefficiently for very large files.