File: addSpecies.Rd

% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/taxonomy.R
\name{addSpecies}
\alias{addSpecies}
\title{Add species-level annotation to a taxonomic table.}
\usage{
addSpecies(
  taxtab,
  refFasta,
  allowMultiple = FALSE,
  tryRC = FALSE,
  n = 2000,
  verbose = FALSE
)
}
\arguments{
\item{taxtab}{(Required). A taxonomic table, the output of \code{\link{assignTaxonomy}}.}

\item{refFasta}{(Required). The path to the reference fasta file, or an 
R connection. Can be compressed.
This reference fasta file should be formatted so that the id lines correspond to the
genus-species binomial of the associated sequence:
  
 >SeqID genus species  
 ACGAATGTGAAGTAA......}

\item{allowMultiple}{(Optional). Default FALSE.
Defines the behavior when multiple exact matches against different species are returned.
By default, only unambiguous identifications are returned. If TRUE, a concatenated string
of all exactly matched species is returned. If an integer is provided, multiple
identifications up to that many are returned as a concatenated string.}

\item{tryRC}{(Optional). Default FALSE. 
If TRUE, the reverse-complement of each sequence will be used for classification if it is a
better match to the reference sequences than the forward sequence.}

\item{n}{(Optional). Default 2000.
The number of sequences to perform assignment on at one time.
This controls the peak memory requirement so that large numbers of sequences are supported.}

\item{verbose}{(Optional). Default FALSE.
If TRUE, print status to standard output.}
}
\value{
A character matrix with one more column than the input table. Rows correspond to
  sequences, and columns to the taxonomic levels. NA indicates that the sequence
  was not classified at that level.
}
\description{
\code{addSpecies} wraps the \code{\link{assignSpecies}} function to assign genus-species 
binomials to the input sequences by exact matching against a reference fasta. Those binomials
are then merged with the input taxonomic table, with the species annotation appended as an 
additional column.
Only species identifications for which the genus in the input table is consistent with the
genus of the binomial classification are included in the returned table.
}
\examples{

seqs <- getSequences(system.file("extdata", "example_seqs.fa", package="dada2"))
training_fasta <- system.file("extdata", "example_train_set.fa.gz", package="dada2")
taxa <- assignTaxonomy(seqs, training_fasta)
species_fasta <- system.file("extdata", "example_species_assignment.fa.gz", package="dada2")
taxa.spec <- addSpecies(taxa, species_fasta)
taxa.spec.multi <- addSpecies(taxa, species_fasta, allowMultiple=TRUE)
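
# A further sketch of the documented options, reusing `taxa` and `species_fasta`
# from the lines above. The variable name `taxa.spec.capped` and the cap of 3
# are illustrative choices, not values recommended by the package.
# An integer allowMultiple limits how many exactly matched species are
# concatenated; tryRC=TRUE also considers the reverse-complement of each sequence.
taxa.spec.capped <- addSpecies(taxa, species_fasta, allowMultiple=3, tryRC=TRUE, verbose=TRUE)
# Per the Value section, the result should have one more column than the input.
ncol(taxa.spec.capped) - ncol(taxa)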

}
\seealso{
\code{\link{assignTaxonomy}}, \code{\link{assignSpecies}}
}