% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/helper.R
\name{unmix}
\alias{unmix}
\title{Unmix samples using loss in a variance stabilized space}
\usage{
unmix(x, pure, alpha, shift, power = 1, format = "matrix", quiet = FALSE)
}
\arguments{
\item{x}{normalized counts or TPMs of the samples to be unmixed}

\item{pure}{normalized counts or TPMs of the "pure" samples}

\item{alpha}{for normalized counts, the dispersion of the data
when a negative binomial model is fit. This can be found by examining
the asymptotic value of \code{dispersionFunction(dds)} when using
\code{fitType="parametric"}, or the mean value when using
\code{fitType="mean"}. Specify either \code{alpha} or \code{shift}, not both.}

\item{shift}{for TPMs, the shift which approximately stabilizes the variance
of log shifted TPMs. Can be assessed with \code{vsn::meanSdPlot}.}

\item{power}{the power of the loss function: either 1 (L1 loss)
or 2 (squared loss). Default is 1.}

\item{format}{\code{"matrix"} or \code{"list"}, default is \code{"matrix"}.
Whether to output just the matrix of mixture components, or a list (see Value).}

\item{quiet}{whether to suppress the progress bar. Default is \code{FALSE}:
a progress bar is shown if the \code{pbapply} package is installed.}
}
\value{
A matrix of mixture components, with one row for each sample in \code{x}
and one column for each "pure" sample. Each row sums to 1.
If column names exist on the input matrices, they are propagated to the output matrix.
If \code{format="list"}, the output is instead a list containing as elements:
(1) the matrix of mixture components,
(2) the correlations in the variance stabilized space of the fitted samples
to the samples in \code{x}, and
(3) the fitted samples as a matrix with the same dimension as \code{x}.
}
\description{
Unmixes samples in \code{x} according to \code{pure} components,
using numerical optimization. The components in \code{pure}
are added on the scale of gene expression (either normalized counts, or TPMs).
The loss comparing the fitted expression to the
samples in \code{x} is computed in a variance stabilized space.
This task is sometimes referred to as "deconvolution",
and can be used, for example, to identify contributions from
various tissues.

Note: some groups have found that the mixing contributions
may be more accurate if very lowly expressed genes across \code{x}
and \code{pure} are first removed. We have not explored this fully.

Note: if the \code{pbapply} package is installed, a progress bar
will be displayed while mixing components are fit.
}
\examples{

# some artificial data
cts <- matrix(c(80,50,1,100,
                1,1,60,100,
                0,50,60,100), ncol=4, byrow=TRUE)
# make a DESeqDataSet
dds <- DESeqDataSetFromMatrix(cts,
  data.frame(row.names=seq_len(ncol(cts))), ~1)
colnames(dds) <- paste0("sample",1:4)

# note! here you would instead use
# estimateSizeFactors() to do actual normalization
sizeFactors(dds) <- rep(1, ncol(dds))

norm.cts <- counts(dds, normalized=TRUE)

# 'pure' should also have normalized counts...
pure <- matrix(c(10,0,0,
                 0,0,10,
                 0,10,0), ncol=3, byrow=TRUE)
colnames(pure) <- letters[1:3]

# for real data, you need to find alpha after running estimateDispersions()
mix <- unmix(norm.cts, pure, alpha=0.01)
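
# A minimal sketch of the list output described under Value.
# The elements are accessed by position, in the documented order,
# because their names are not stated in this help page.
mix.list <- unmix(norm.cts, pure, alpha=0.01, format="list", quiet=TRUE)

# (1) the mixture components: one row per sample in 'x',
# one column per "pure" sample
mix.list[[1]]

# each row of the mixture components should sum to 1
rowSums(mix.list[[1]])

# (2) correlations, in the variance stabilized space,
# of the fitted samples to the samples in 'x'
mix.list[[2]]

# (3) the fitted samples, with the same dimension as 'x'
dim(mix.list[[3]])
dim(norm.cts)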

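# A hedged sketch of the TPM interface: 'shift' is supplied instead of 'alpha'.
# Assumption for illustration only: the artificial values above are treated
# as if they were TPMs, and shift=0.5 is an arbitrary value. For real data,
# choose the shift by inspecting vsn::meanSdPlot on the log shifted TPMs.
# power=2 selects the squared loss instead of the default L1 loss.
mix.tpm <- unmix(norm.cts, pure, shift=0.5, power=2, quiet=TRUE)

# one row per sample in 'x', one column per "pure" sample
dim(mix.tpm)
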
}