\name{s2n}
\alias{s2n}
\title{Simple numerical encoding of a DNA sequence}
\description{
By default, if no \code{levels} argument is provided, this function
codes a DNA sequence as integer values following the lexical
order \code{(a < c < g < t)}, that is, 0 for "a", 1 for "c", 2 for "g", 3 for
"t", and NA for ambiguous bases.
}
\usage{
s2n(seq, levels = s2c("acgt"), base4 = TRUE, forceToLower = TRUE)
}
\arguments{
  \item{seq}{ the sequence as a vector of single chars }
  \item{levels}{ allowed char values, by default a, c, g and t }
  \item{base4}{if TRUE the numerical encoding will start at 0, if
FALSE at 1}
  \item{forceToLower}{if TRUE the sequence is forced to lower case characters}
}
\value{
  a vector of integers
}
\references{
  \code{citation("seqinr")}
}
\author{J.R. Lobry }
\note{
Numbering starts at 0 by default because this gives a kind of
isomorphism between the paste operator on DNA chars and the +
operator on the integer coding of DNA chars. This way, you can
work either in the char set or in the integer set, whichever
is more convenient for your purpose, and switch from one
set to the other as you like.
}
\seealso{ \code{\link{n2s}}, \code{\link{factor}}, \code{\link{unclass}} }
\examples{
##
## Example of default behaviour:
##
urndna <- s2c("acgt")
seq <- sample( urndna, 100, replace = TRUE ) ; seq
s2n(seq)
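##
## Other arguments, sketched under the assumption that s2n() behaves as
## described in the arguments section above (dna is a name made up for
## this illustration):
##
dna <- s2c("GATTACA")
s2n(dna)                        # upper case is folded to lower case first
s2n(dna, base4 = FALSE)         # same coding shifted to start at 1
s2n(dna, forceToLower = FALSE)  # "G", "A", "T", "C" are not in levels
stopifnot(all(s2n(dna, base4 = FALSE) == s2n(dna) + 1))
stopifnot(all(is.na(s2n(dna, forceToLower = FALSE))))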
##
## How to deal with RNA:
##
urnrna <- s2c("acgu")
seq <- sample( urnrna, 100, replace = TRUE ) ; seq
s2n(seq, levels = urnrna)
##
## What happens with unknown characters:
##
urnmess <- c(urndna,"n")
seq <- sample( urnmess, 100, replace = TRUE ) ; seq
s2n(seq)
##
## How to change the encoding for unknown characters:
##
tmp <- s2n(seq) ; tmp[is.na(tmp)] <- -1; tmp
##
## Simple sanity check:
##
stopifnot(all(s2n(s2c("acgt")) == 0:3))
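##
## A sketch of the idea in the note, under stated assumptions: n2s() is
## taken to be the inverse of s2n() with the same default levels, and
## reading the 0-based codes as base-4 digits is only one possible
## interpretation of the isomorphism mentioned there. The names word,
## codes and idx are made up for this illustration.
##
word <- s2c("gat")
codes <- s2n(word)                            # g = 2, a = 0, t = 3
stopifnot(all(n2s(codes) == word))            # back to characters
idx <- sum(codes * 4^((length(codes) - 1):0)) # 2*16 + 0*4 + 3*1
idx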
}
\keyword{ utilities }