% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/dataproc.R
\name{resample}
\alias{resample}
\alias{downsample}
\alias{prop.sample}
\title{Resample data frame using values from the column with number of clonesets.}
\usage{
resample(.data, .n = -1, .col = c("read.count", "umi.count"))
downsample(.data, .n, .col = c("read.count", "umi.count"))
prop.sample(.data, .perc = 50, .col = c("read.count", "umi.count"))
}
\arguments{
\item{.data}{Data frame with the column \code{.col} or list of such data frames.}
\item{.n}{Number of values / reads / UMIs to choose.}
\item{.col}{Which column to use to represent the quantities of clonotypes. See "Details".}
\item{.perc}{Percentage (0 - 100). See "Details" for more info.}
}
\value{
Subsampled data frame.
}
\description{
Resample data frame using values from the column with number of clonesets. The counts of clonesets (i.e., of rows of a MiTCR data frame)
are reads (usually the "Read.count" column) or UMIs (i.e., barcodes, usually the "Umi.count" column).
}
\details{
\code{resample}. Using the multinomial distribution, compute the number of occurrences of each cloneset, then remove clonotypes with zero
occurrences and return the resulting data frame. The probability passed to \code{rmultinom} for each cloneset is the proportion of this
cloneset in the \code{.col} column. This is a rough simulation of how clonotypes are sampled from an organism. For now it does not work
very well, so use \code{downsample} instead.

\code{downsample}. Choose \code{.n} clones (not clonotypes!) from the input repertoires without any probabilistic simulation,
computing exactly which clones are chosen. Its output has the same form as that of \code{resample} (repertoires), but is more
consistent and biologically plausible.

\code{prop.sample}. Choose the first N clonotypes that together account for \code{.perc} percent of all UMIs / reads.
}
\examples{
\dontrun{
# Get 100K reads (not clones!).
immdata.1.100k <- resample(immdata[[1]], 100000, .col = "read.count")
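
# A minimal base-R sketch of the three sampling ideas described in "Details".
# It is an illustration under stated assumptions, not the package's actual
# implementation; the toy data frame and all variable names are hypothetical.
toy <- data.frame(read.count = c(50, 30, 15, 5))
n <- 20
# resample-like: multinomial draw, each cloneset weighted by its share of reads.
probs <- toy$read.count / sum(toy$read.count)
multi <- as.vector(rmultinom(1, size = n, prob = probs))
toy.multi <- toy[multi > 0, , drop = FALSE]
toy.multi$read.count <- multi[multi > 0]
# downsample-like (assumed): draw n individual reads without replacement.
reads <- rep(seq_len(nrow(toy)), times = toy$read.count)
picked <- tabulate(sample(reads, n), nbins = nrow(toy))
toy.down <- toy[picked > 0, , drop = FALSE]
toy.down$read.count <- picked[picked > 0]
# prop.sample-like (assumed): smallest top-N set covering half of all reads.
ord <- toy[order(toy$read.count, decreasing = TRUE), , drop = FALSE]
top.n <- which(cumsum(ord$read.count) / sum(ord$read.count) >= 0.5)[1]
toy.top <- ord[seq_len(top.n), , drop = FALSE]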
}
}
\seealso{
\link{rmultinom}, \link{clonal.proportion}
}