% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/multiMarkerStats.R
\name{multiMarkerStats}
\alias{multiMarkerStats}
\title{Combine multiple sets of marker statistics}
\usage{
multiMarkerStats(..., repeated = NULL, sorted = TRUE)
}
\arguments{
\item{...}{Two or more lists or \linkS4class{List}s produced by \code{\link{findMarkers}} or \code{\link{combineMarkers}}.
Each list should contain \linkS4class{DataFrame}s of results, one for each group/cluster of cells.

The names of each List should be the same; the universe of genes in each DataFrame should be the same;
and the same number of columns in each DataFrame should be named.
All elements in \code{...} are also expected to be named.}

\item{repeated}{Character vector of columns that are present in one or more DataFrames but should only be reported once.
Typically used to avoid reporting redundant copies of annotation-related columns.}

\item{sorted}{Logical scalar indicating whether each output DataFrame should be sorted by the combined \code{Top} value, or by the combined p-value if \code{Top} is not reported.}
}
\value{
A named List of DataFrames with one DataFrame per group/cluster.
Each DataFrame contains statistics from the corresponding entry of each List in \code{...},
prefixed with the name of the List.
In addition, several combined statistics are reported:
\itemize{
\item \code{Top}, the largest rank of each gene across all DataFrames for that group.
This is only reported if each list in \code{...} was generated with \code{pval.type="any"} in \code{\link{combineMarkers}}.
\item \code{p.value}, the largest p-value of each gene across all DataFrames for that group.
This is replaced by \code{log.p.value} if p-values in \code{...} are log-transformed.
\item \code{FDR}, the BH-adjusted value of \code{p.value}.
This is replaced by \code{log.FDR} if p-values in \code{...} are log-transformed.
}
}
\description{
Combine multiple sets of marker statistics, typically from different tests,
into a single \linkS4class{DataFrame} per group for convenient inspection.
}
\details{
The combined statistics are designed to favor a gene that is highly ranked in each of the individual test results.
This is highly conservative and aims to identify robust differential expression (DE) that is significant under all testing schemes.

A combined \code{Top} value of T indicates that the gene is among the top T genes of one or more pairwise comparisons
in each of the testing schemes.
(We can be even more aggressive if the individual results were generated with a larger \code{min.prop} value.)
In effect, a gene can only achieve a low \code{Top} value if it is consistently highly ranked in each test.
If \code{sorted=TRUE}, this is used to order the genes in the output DataFrame.
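
As a minimal sketch of this behaviour (an illustration in base R, not the function's actual implementation), the combined rank can be pictured as the element-wise maximum of the per-scheme \code{Top} values.
The gene names, ranks and variable names below are made up for illustration.
\preformatted{
# Hypothetical Top values for four genes under two testing schemes.
top.t      <- c(geneA=1, geneB=2,  geneC=5, geneD=3)
top.wilcox <- c(geneA=2, geneB=10, geneC=4, geneD=3)

# The combined rank of a gene is its worst (largest) rank across schemes,
# so geneB is penalized for its poor rank in the second scheme.
combined.top <- pmax(top.t, top.wilcox)

# Mimic sorted=TRUE by ordering genes on the combined rank.
sorted.top <- sort(combined.top)
}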

The combined \code{p.value} is effectively the result of applying an intersection-union test to the per-test results.
This will only be low if the gene has a low p-value in each of the test results.
If \code{sorted=TRUE} and \code{Top} is not present, this will be used to order the genes in the output DataFrame.
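
A minimal base R sketch of this combination is shown here, assuming made-up p-values for five genes under two testing schemes.
It illustrates the logic of taking the maximum and then applying the BH adjustment, and is not the function's actual implementation.
\preformatted{
# Hypothetical per-gene p-values from two testing schemes.
p.t      <- c(geneA=0.001, geneB=0.2,  geneC=0.03, geneD=0.5, geneE=0.0005)
p.wilcox <- c(geneA=0.004, geneB=0.01, geneC=0.04, geneD=0.6, geneE=0.3)

# Intersection-union logic: a gene is only as significant as its
# least significant result, so take the largest p-value per gene.
combined.p <- pmax(p.t, p.wilcox)

# BH adjustment of the combined p-values.
combined.fdr <- p.adjust(combined.p, method="BH")
}
Here, geneE has a very low p-value in the first scheme but not in the second, so its combined p-value remains high.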
}
\examples{
library(scuttle)
sce <- mockSCE()
sce <- logNormCounts(sce)

# Any clustering method is okay, only using k-means for convenience.
kout <- kmeans(t(logcounts(sce)), centers=4) 

tout <- findMarkers(sce, groups=kout$cluster, direction="up")
wout <- findMarkers(sce, groups=kout$cluster, direction="up", test="wilcox")

combined <- multiMarkerStats(t=tout, wilcox=wout)
colnames(combined[[1]])

}
\seealso{
\code{\link{findMarkers}} and \code{\link{combineMarkers}}, to generate elements in \code{...}.
}
\author{
Aaron Lun
}