\name{alphabetByCycle}

\alias{alphabetByCycle}
\alias{alphabetByCycle,BStringSet-method}

\title{Summarize nucleotide, amino acid, or quality scores by cycle}

\description{

  \code{alphabetByCycle} summarizes nucleotides, amino acids, or quality
  scores by cycle, e.g., returning the number of occurrences of each
  nucleotide \code{A, T, G, C} at each of the 36 cycles across all reads
  of a Solexa lane.

}
\usage{

alphabetByCycle(stringSet, alphabet, ...)

}

\arguments{

  \item{stringSet}{An R object representing the collection of reads,
    amino acid sequences, or quality scores to be summarized.}

  \item{alphabet}{The alphabet (a character vector of single-character
    strings) from which the sequences in \code{stringSet} are composed.
    Methods often define an appropriate alphabet, so the user does not
    have to provide one.}

  \item{...}{Additional arguments, perhaps used by methods defined on
    this generic.}

}

\details{

  The default method requires that \code{stringSet} extends the
  \code{\link[Biostrings:XStringSet-class]{XStringSet}} class of
  \pkg{Biostrings}.

  The following method is defined, in addition to methods described in
  class-specific documentation:
  \describe{

    \item{alphabetByCycle}{\code{signature(stringSet = "BStringSet")}:
      this method uses an alphabet spanning all ASCII characters, codes
      \code{1:255}. }

  }
}

\value{

  A matrix with number of rows equal to the length of \code{alphabet}
  and number of columns equal to the maximum width of reads or quality
  scores in the string set. Each entry is the number of times, over all
  reads of the set, that the corresponding letter of the alphabet (row)
  appeared at the specified cycle (column).

}

\seealso{

  The IUPAC alphabet in \pkg{Biostrings}.

  \url{http://www.bioperl.org/wiki/FASTQ_sequence_format} for the
  BioPerl definition of fastq.

  Solexa documentation `Data analysis - documentation : Pipeline output
  and visualisation'.

}

\author{Martin Morgan}

\examples{
showMethods("alphabetByCycle")

sp <- SolexaPath(system.file('extdata', package='ShortRead'))
rfq <- readFastq(analysisPath(sp), pattern="s_1_sequence.txt")
alphabetByCycle(sread(rfq))

abcq <- alphabetByCycle(quality(rfq))
dim(abcq)
## 'high' scores, first and last cycles
abcq[64:94,c(1:5, 32:36)]
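
## A minimal base R sketch (not the ShortRead implementation) of the
## tabulation described in the Value section: rows are letters of the
## alphabet, columns are cycles. The names toyReads, toyAlphabet,
## toyWidth and toyCounts are illustrative and not part of the package.
toyReads <- c("ACGT", "AAGT", "CCG")
toyAlphabet <- c("A", "C", "G", "T")
toyWidth <- max(nchar(toyReads))
toyCounts <- vapply(seq_len(toyWidth), function(cycle) {
    ## letter seen at this cycle in each read; "" for reads that are too short
    lettersAtCycle <- substr(toyReads, cycle, cycle)
    ## match() returns NA for "", and tabulate() ignores NA
    tabulate(match(lettersAtCycle, toyAlphabet), nbins = length(toyAlphabet))
}, integer(length(toyAlphabet)))
dimnames(toyCounts) <- list(alphabet = toyAlphabet,
                            cycle = as.character(seq_len(toyWidth)))
## one row per letter, one column per cycle, as in the matrix above
toyCounts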
}
\keyword{manip}