% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/differential_expression.R
\name{diff_mean_test_conserved}
\alias{diff_mean_test_conserved}
\title{Find differentially expressed genes that are conserved across samples}
\usage{
diff_mean_test_conserved(
y,
group_labels,
sample_labels,
balanced = TRUE,
compare = "each_vs_rest",
pval_th = 1e-04,
...
)
}
\arguments{
\item{y}{A matrix of counts; must be (or inherit from) class dgCMatrix; genes are rows,
cells are columns}
\item{group_labels}{The group labels (e.g. clusters or time points);
will be converted to factor}
\item{sample_labels}{The sample labels; will be converted to factor}
\item{balanced}{Logical; if TRUE (the default), each sample is assumed to span multiple
groups; if FALSE, each sample is assumed to contribute to one group; see details}
\item{compare}{Specifies which groups to compare, see details; currently only 'each_vs_rest'
(the default) is supported}
\item{pval_th}{P-value threshold used to call a gene differentially expressed when summarizing
the tests per gene}
\item{...}{Parameters passed to diff_mean_test}
}
\value{
Data frame of results aggregated per group and gene; see details for the output columns
}
\description{
Find differentially expressed genes that are conserved across samples
}
\section{Details}{
This function calls diff_mean_test repeatedly and aggregates the results per group and gene.
If balanced is TRUE (the default), it is assumed that each sample spans multiple groups,
as would be the case when merging or integrating samples from the same tissue followed by
clustering. Here the group labels would be the clusters and cluster markers would have support
in each sample.
If balanced is FALSE, an unbalanced design is assumed where each sample contributes to one
group. An example is a time series experiment where some samples are taken from time point
1 while other samples are taken from time point 2. The time point would be the group label
and the goal would be to identify differentially expressed genes between time points that
are supported by many between-sample comparisons.
Output columns:
\describe{
\item{group1}{Group label of the first group of cells}
\item{group2}{Group label of the second group of cells; currently fixed to 'rest'}
\item{gene}{Gene name (from rownames of input matrix)}
\item{n_tests}{The number of tests this gene participated in for this group}
\item{log2FC_min,median,max}{Summary statistics for log2FC across the tests}
\item{mean1,2_median}{Median of group mean across the tests}
\item{pval_max}{Maximum of p-values across tests}
\item{de_tests}{Number of tests that showed this gene having a log2FC going in the same
direction as log2FC_median and having a p-value <= pval_th}
}
The output is ordered by group1, then by decreasing de_tests, decreasing abs(log2FC_median),
and increasing pval_max.
}
\examples{
\donttest{
clustering <- 1:ncol(pbmc) \%\% 2
sample_id <- 1:ncol(pbmc) \%\% 3
vst_out <- vst(pbmc, return_corrected_umi = TRUE)
de_res <- diff_mean_test_conserved(y = vst_out$umi_corrected,
group_labels = clustering, sample_labels = sample_id)
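# A hedged sketch, not part of the original example: using only the output
# columns documented in the details section, keep the genes that were called
# differentially expressed in every test they participated in. The name
# fully_supported is chosen here for illustration.
fully_supported <- de_res[de_res$de_tests == de_res$n_tests, ]
# The output is already ordered by group and strength of support, so the first
# rows are the top candidates for the first group.
head(fully_supported)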
}
}