% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/sequenceIO.R
\name{derepFastq}
\alias{derepFastq}
\title{Read in and dereplicate a fastq file.}
\usage{
derepFastq(fls, n = 1e+06, verbose = FALSE, qualityType = "Auto")
}
\arguments{
\item{fls}{(Required). \code{character}.
The file path(s) to the fastq file(s), or a directory containing fastq file(s).
Compressed file formats such as .fastq.gz and .fastq.bz2 are supported.}

\item{n}{(Optional). \code{numeric(1)}.
The maximum number of records (reads) to parse and dereplicate
at any one time. This controls the peak memory requirement
so that large fastq files are supported.
Default is \code{1e6} (one million reads).
This value is passed on to \code{\link[ShortRead]{FastqStreamer}};
see its documentation for details.}

\item{verbose}{(Optional). \code{logical(1)}. Default \code{FALSE}.
If \code{TRUE}, emit standard R \code{\link{message}}s
reporting the intermediate and final status of the dereplication.}

\item{qualityType}{(Optional). \code{character(1)}.
The quality encoding of the fastq file(s). \code{"Auto"} (the default)
attempts to auto-detect the encoding. Auto-detection may fail for PacBio files with
uniformly high quality scores; in that case use \code{"FastqQuality"}. This
parameter is passed on to \code{\link[ShortRead]{readFastq}}; see
information there for details.}
}
\value{
A \code{\link{derep-class}} object, or a list of such objects if multiple files are provided.
}
\description{
A custom interface to \code{\link[ShortRead]{FastqStreamer}}
for dereplicating amplicon sequences from fastq or compressed fastq files.
Reads are processed in chunks to limit peak memory use, so that large files are supported.
}
\examples{
# Test that chunk-size, `n`, does not affect the result.
testFastq = system.file("extdata", "sam1F.fastq.gz", package="dada2")
derep1 = derepFastq(testFastq, verbose = TRUE)
derep1.35 = derepFastq(testFastq, n = 35, verbose = TRUE)
all.equal(getUniques(derep1), getUniques(derep1.35)[names(getUniques(derep1))])
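
# A further sketch, not part of the original test: passing several files at once.
# Assumption: "sam1R.fastq.gz" ships in the package's extdata directory
# alongside "sam1F.fastq.gz". With more than one input file, the return
# value is expected to be a list of derep-class objects (see Value).
exampleFiles = c(system.file("extdata", "sam1F.fastq.gz", package="dada2"),
                 system.file("extdata", "sam1R.fastq.gz", package="dada2"))
# A smaller chunk size `n` lowers peak memory use at the cost of more passes.
dereps = derepFastq(exampleFiles, n = 1e4)
length(dereps)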

}