% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/coassignProb.R
\name{coassignProb}
\alias{coassignProb}
\title{Compute coassignment probabilities}
\usage{
coassignProb(ref, alt, summarize = FALSE)
}
\arguments{
\item{ref}{A character vector or factor containing one set of groupings, considered to be the reference.}

\item{alt}{A character vector or factor containing another set of groupings, to be compared to \code{ref}.}

\item{summarize}{Logical scalar indicating whether the output matrix should be converted into a per-label summary.}
}
\value{
If \code{summarize=FALSE}, a numeric matrix is returned with upper triangular entries filled with the coassignment probabilities for each pair of labels in \code{ref}.

Otherwise, a \linkS4class{DataFrame} is returned with one row per label in \code{ref} containing the \code{self} and \code{other} coassignment probabilities.
}
\description{
Compute coassignment probabilities for each label in a reference grouping when compared to an alternative grouping of samples.
This function is now deprecated in favor of \code{\link{pairwiseRand}}.
}
\details{
The coassignment probability for each pair of labels in \code{ref} is the probability that a randomly chosen cell from each of the two reference labels will have the same label in \code{alt}.
High coassignment probabilities indicate that a particular pair of labels in \code{ref} are frequently assigned to the same label in \code{alt}, which has some implications for cluster stability.
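
As a sketch of this definition, assume that the two cells are drawn independently and uniformly from their respective labels.
Writing \eqn{n_{ik}}{n_ik} for the number of cells with label \eqn{i} in \code{ref} and label \eqn{k} in \code{alt} (notation introduced here for illustration only), the coassignment probability for labels \eqn{i} and \eqn{j} would then be
\deqn{P_{ij} = \sum_k \frac{n_{ik}}{n_{i\cdot}} \frac{n_{jk}}{n_{j\cdot}},}{P_ij = sum_k (n_ik / n_i.) * (n_jk / n_j.),}
where \eqn{n_{i\cdot} = \sum_k n_{ik}}{n_i. = sum_k n_ik} is the total number of cells with label \eqn{i} in \code{ref}.
For \eqn{i = j}, this corresponds to drawing the two cells with replacement, so pairs consisting of the same cell are included.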

When \code{summarize=TRUE}, we summarize the matrix of coassignment probabilities into a set of per-label values.
The \dQuote{self} coassignment probability is simply the diagonal entry of the matrix, i.e., the probability that two cells from the same label in \code{ref} also have the same label in \code{alt}.
The \dQuote{other} coassignment probability is the maximum probability across all pairs involving that label and a different label in \code{ref}.
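
As a sketch of this summary, write \eqn{P_{ij}}{P_ij} for the coassignment probability between labels \eqn{i} and \eqn{j} in \code{ref} (notation introduced here for illustration only).
Assuming that the maximum is taken only over pairs with a different label, the two values for label \eqn{i} would be
\deqn{\mathrm{self}_i = P_{ii}, \qquad \mathrm{other}_i = \max_{j \ne i} P_{ij}.}{self_i = P_ii,  other_i = max_{j != i} P_ij.}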

% One might consider instead reporting the 'other' probability as the probability that a randomly chosen cell in the cluster and a randomly chosen cell in any other cluster belong in the same cluster.
% However, this results in very small probabilities in all cases, simply because most of the other clusters are well separated.
% Reporting the maximum is more useful as at least you can tell that a cluster is well-separated from _all_ other clusters if it has a low 'other' probability.

In general, \code{ref} is well-recapitulated by \code{alt} if the diagonal entries of the matrix are much higher than the sum of the off-diagonal entries.
This manifests as higher values for the self probabilities compared to the other probabilities.

Note that the coassignment probability is closely related to the Rand index-based ratios
broken down by cluster pair in \code{\link{pairwiseRand}} with \code{mode="ratio"} and \code{adjusted=FALSE}.
The off-diagonal coassignment probabilities are simply 1 minus the off-diagonal ratio,
while the on-diagonal values differ only in that \code{\link{pairwiseRand}} does not consider pairs of the same cell.
}
\examples{
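# Mock up a dataset and compute log-normalized expression values.
library(scuttle)
sce <- mockSCE(ncells=200)
sce <- logNormCounts(sce)

# Generate two alternative clusterings of the same cells.
set.seed(100) # kmeans() uses random starts, so fix the seed.
clust1 <- kmeans(t(logcounts(sce)), 3)$cluster
clust2 <- kmeans(t(logcounts(sce)), 5)$cluster

# Matrix of pairwise coassignment probabilities, then the per-label summary.
coassignProb(clust1, clust2)
coassignProb(clust1, clust2, summarize=TRUE)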

}
\seealso{
\code{\link{bootstrapCluster}}, to compute coassignment probabilities across bootstrap replicates.

\code{\link{pairwiseRand}}, for another way to compare different clusterings.
}
\author{
Aaron Lun
}