.\" DO NOT MODIFY THIS FILE!  It was generated by help2man 1.47.8.
.TH BLASR "1" "January 2019" "blasr 5.3.2" "User Commands"
.SH NAME
blasr \- Map SMRT Sequences to a reference genome
.SH SYNOPSIS
.B blasr
reads.bam genome.fasta \fB\-\-bam\fR \fB\-\-out\fR out.bam
.P
.B blasr
reads.fasta genome.fasta
.P
.B blasr
reads.fasta genome.fasta \fB\-\-sa\fR genome.fasta.sa
.P
.B blasr
reads.bax.h5 genome.fasta [\-\-sa genome.fasta.sa]
.P
.B blasr
reads.bax.h5 genome.fasta \fB\-\-sa\fR genome.fasta.sa \fB\-\-maxScore\fR 100 \fB\-\-minMatch\fR 15 ...
.P
.B blasr
reads.bax.h5 genome.fasta \fB\-\-sa\fR genome.fasta.sa \fB\-\-nproc\fR 24 \fB\-\-out\fR alignment.out ...
.SH DESCRIPTION
blasr is a read mapping program that maps reads to positions
in a genome by clustering short exact matches between the read and
the genome, and scoring clusters using alignment. The matches are
generated by searching all suffixes of a read against the genome
using a suffix array. Global chaining methods are used to score
clusters of matches.
.P
The only required inputs to blasr are a file of reads and a
reference genome.  It is extremely useful to have read filtering
information, and mapping runtime may decrease substantially when a
precomputed suffix array index on the reference sequence is
specified.
.P
Although reads may be input in FASTA format, the recommended input is
PacBio BAM files because these contain quality value
information that is used in the alignment and leads to higher quality
variant detection.
Although alignments can be output in various formats, the recommended
output format is PacBio BAM.
Support for bax.h5 and plx.h5 files will be DEPRECATED.
Support for region tables for h5 files will be DEPRECATED.
.P
When a suffix array index of the genome is not specified, the suffix array is
built before producing alignments.  This may be prohibitively slow
when the genome is large (e.g. human).  It is best to precompute the
suffix array of a genome using the program sawriter, and then specify
the suffix array on the command line using \fB\-\-sa\fR genome.fasta.sa.
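.P
As a minimal sketch of that workflow, the index can be built once and then
reused for every mapping run.  The file names are placeholders, and the
sawriter argument order shown here (output index first, then the reference
FASTA) is an assumption that should be checked against sawriter's own help
output:
.P
.nf
sawriter genome.fasta.sa genome.fasta
blasr reads.bam genome.fasta \-\-sa genome.fasta.sa \-\-bam \-\-out out.bam
.fi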
.P
The optional parameters are roughly divided into three categories:
control over anchoring, alignment scoring, and output.
.P
The default anchoring parameters are optimal for small genomes and
samples with up to 5% divergence from the reference genome.  The main
parameter governing speed and sensitivity is the \fB\-\-minMatch\fR parameter.
For human genome alignments, a value of 11 or higher is recommended.
Several methods may be used to speed up alignments, at the expense of
possibly decreasing sensitivity.
.P
Regions that are too repetitive may be ignored during mapping by
limiting the number of positions a read maps to with the
\fB\-\-maxAnchorsPerPosition\fR option.  Values between 500 and 1000 are effective
in the human genome.
.P
For small genomes such as bacterial genomes or BACs, the default parameters
are sufficient for maximal sensitivity and good speed.
.SH AUTHOR
This manpage was written by Andreas Tille for the Debian distribution and can be used for any other usage of the program.