1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90
|
% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/summaryMarkerStats.R
\name{summaryMarkerStats}
\alias{summaryMarkerStats}
\alias{summaryMarkerStats,ANY-method}
\alias{summaryMarkerStats,SummarizedExperiment-method}
\title{Summary marker statistics}
\usage{
summaryMarkerStats(x, ...)
\S4method{summaryMarkerStats}{ANY}(
x,
groups,
row.data = NULL,
average = "mean",
BPPARAM = SerialParam()
)
\S4method{summaryMarkerStats}{SummarizedExperiment}(x, ..., assay.type = "logcounts")
}
\arguments{
\item{x}{A numeric matrix-like object of expression values,
where each column corresponds to a cell and each row corresponds to an endogenous gene.
This is generally expected to be normalized log-expression values unless one knows better.
Alternatively, a \linkS4class{SummarizedExperiment} or \linkS4class{SingleCellExperiment} object containing such a matrix.}
\item{...}{For the generic, further arguments to pass to specific methods.
For the SummarizedExperiment method, further arguments to pass to the ANY method.}
\item{groups}{A vector of length equal to \code{ncol(x)},
specifying the group to which each cell is assigned.
If \code{x} is a \linkS4class{SingleCellExperiment}, this defaults to \code{\link{colLabels}(x)} if available.}
\item{row.data}{A \linkS4class{DataFrame} containing additional row metadata for each gene in \code{x},
to be included in each of the output DataFrames.
This should generally have row names identical to those of \code{x}.
Alternatively, a list containing one such DataFrame per level of \code{groups},
where each DataFrame contains group-specific metadata for each gene to be included in the appropriate output DataFrame.}
\item{average}{String specifying the type of average, to be passed to \code{\link{sumCountsAcrossCells}}.}
\item{BPPARAM}{A \linkS4class{BiocParallelParam} object indicating whether and how parallelization should be performed across genes.}
\item{assay.type}{A string specifying which assay values to use, usually \code{"logcounts"}.}
}
\value{
A named \linkS4class{List} of \linkS4class{DataFrame}s, with one entry per level of \code{groups}.
Each DataFrame has number of rows corresponding to the rows in \code{x} and contains the fields:
\itemize{
\item \code{self.average}, the average (log-)expression across all cells in the current group.
\item \code{other.average}, the grand average of the average (log-)expression across cells in the other groups.
\item \code{self.detected}, the proportion of cells with detected expression in the current group.
\item \code{other.detected}, the average proportion of cells with detected expression in the other groups.
}
}
\description{
Compute additional gene-level statistics for each group to assist in identifying marker genes,
to complement the formal test statistics generated by \code{\link{findMarkers}}.
}
\details{
This function only generates descriptive statistics for each gene to assist marker selection.
It does not consider blocking factors or covariates that would otherwise be available from comparisons between groups.
For the sake of brevity, statistics for the \dQuote{other} groups are summarized into a single value.
}
\examples{
library(scuttle)
sce <- mockSCE()
sce <- logNormCounts(sce)
# Any clustering method is okay.
kout <- kmeans(t(logcounts(sce)), centers=3)
sum.out <- summaryMarkerStats(sce, kout$cluster)
sum.out[["1"]]
# Add extra rowData if you like.
rd <- DataFrame(Symbol=sample(LETTERS, nrow(sce), replace=TRUE),
row.names=rownames(sce))
sum.out <- summaryMarkerStats(sce, kout$cluster, row.data=rd)
sum.out[["1"]]
}
\seealso{
\code{\link{findMarkers}}, where the output of this function can be used in \code{row.data=}.
}
\author{
Aaron Lun
}
|