% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/chimeras.R
\name{isBimeraDenovoTable}
\alias{isBimeraDenovoTable}
\title{Identify bimeras in a sequence table.}
\usage{
isBimeraDenovoTable(
  seqtab,
  minSampleFraction = 0.9,
  ignoreNNegatives = 1,
  minFoldParentOverAbundance = 1.5,
  minParentAbundance = 2,
  allowOneOff = FALSE,
  minOneOffParentDistance = 4,
  maxShift = 16,
  multithread = FALSE,
  verbose = FALSE
)
}
\arguments{
\item{seqtab}{(Required). A sequence table. That is, an integer matrix with colnames
corresponding to DNA sequences.}

\item{minSampleFraction}{(Optional). Default is 0.9.
The fraction of samples in which a sequence must be flagged as bimeric in order for it to
be classified as a bimera.}

\item{ignoreNNegatives}{(Optional). Default is 1.
The number of unflagged samples to ignore when evaluating whether the fraction of samples
in which a sequence was flagged as a bimera exceeds \code{minSampleFraction}. The purpose
of this parameter is to lower the threshold at which sequences found in few samples are
flagged as bimeras.}

\item{minFoldParentOverAbundance}{(Optional). Default is 1.5.
Only sequences more than this many fold more abundant than a given sequence can be its
"parents". Evaluated on a per-sample basis.}

\item{minParentAbundance}{(Optional). Default is 2.
Only sequences at least this abundant can be "parents". Evaluated on a per-sample basis.}

\item{allowOneOff}{(Optional). Default is FALSE.
If TRUE, sequences that have one mismatch or indel to an exact bimera are also
flagged as bimeric.}

\item{minOneOffParentDistance}{(Optional). Default is 4.
Only sequences with at least this many mismatches to the potential bimeric sequence
are considered as possible "parents" when flagging one-off bimeras. There is
no such screen when considering exact bimeras.}

\item{maxShift}{(Optional). Default is 16.
Maximum shift allowed when aligning sequences to potential "parents".}

\item{multithread}{(Optional). Default is FALSE.
If TRUE, multithreading is enabled. NOT YET IMPLEMENTED.}

\item{verbose}{(Optional). Default is FALSE.
If TRUE, print verbose text output.}
}
\value{
A \code{logical} vector of length equal to the number of sequences in the input table.
 TRUE if the sequence is identified as a bimera, otherwise FALSE.
}
\description{
This function implements a table-specific version of de novo bimera detection. In short,
bimeric sequences are flagged on a sample-by-sample basis. Then, a vote is performed for
each sequence across all samples in which it appeared. If the sequence is flagged in a
sufficiently high fraction of samples, it is identified as a bimera. A logical vector is
returned, with an entry for each sequence in the table indicating whether it was identified
as bimeric by this consensus procedure.
}
\examples{
derep1 <- derepFastq(system.file("extdata", "sam1F.fastq.gz", package="dada2"))
derep2 <- derepFastq(system.file("extdata", "sam2F.fastq.gz", package="dada2"))
dd <- dada(list(derep1,derep2), err=NULL, errorEstimationFunction=loessErrfun, selfConsist=TRUE)
seqtab <- makeSequenceTable(dd)
isBimeraDenovoTable(seqtab)
isBimeraDenovoTable(seqtab, allowOneOff=TRUE, minSampleFraction=0.5)
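
## A minimal base-R sketch of the cross-sample vote described above.
## This is an illustration under stated assumptions, not the package's internal code:
## `voteBimera`, `nflag` and `nsam` are hypothetical names, and the exact form of
## the threshold comparison (including the use of `>=`) is an assumption.
voteBimera <- function(nflag, nsam, minSampleFraction = 0.9, ignoreNNegatives = 1) {
  # nflag: number of samples in which the sequence was flagged as bimeric
  # nsam:  number of samples in which the sequence appeared
  nflag >= nsam |
    (nflag > 0 & nflag >= (nsam - ignoreNNegatives) * minSampleFraction)
}
voteBimera(nflag = 1, nsam = 2)   # one unflagged sample is ignored
voteBimera(nflag = 5, nsam = 10)  # flagged in too small a fraction of samples

## The returned logical vector can be used to drop flagged columns from a table.
## `toy` and `flagged` are made-up values for illustration only.
toy <- matrix(c(10L, 0L, 3L, 4L, 7L, 1L), nrow = 2,
              dimnames = list(c("sam1", "sam2"), c("ACGT", "ACGA", "TCGA")))
flagged <- c(FALSE, TRUE, FALSE)
toy[, !flagged, drop = FALSE]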

}
\seealso{
\code{\link{isBimera}}, \code{\link{removeBimeraDenovo}}
}