% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/librarySizeFactors.R
\name{librarySizeFactors}
\alias{librarySizeFactors}
\alias{librarySizeFactors,ANY-method}
\alias{librarySizeFactors,SummarizedExperiment-method}
\alias{computeLibraryFactors}
\title{Compute library size factors}
\usage{
librarySizeFactors(x, ...)

\S4method{librarySizeFactors}{ANY}(
  x,
  subset.row = NULL,
  geometric = FALSE,
  BPPARAM = SerialParam(),
  subset_row = NULL,
  pseudo_count = 1
)

\S4method{librarySizeFactors}{SummarizedExperiment}(x, ..., assay.type = "counts", exprs_values = NULL)

computeLibraryFactors(x, ...)
}
\arguments{
\item{x}{For \code{librarySizeFactors}, a numeric matrix of counts with one row per feature and one column per cell.
Alternatively, a \linkS4class{SummarizedExperiment} or \linkS4class{SingleCellExperiment} containing such counts.

For \code{computeLibraryFactors}, only a \linkS4class{SingleCellExperiment} containing a count matrix is accepted.}

\item{...}{For the \code{librarySizeFactors} generic, arguments to pass to specific methods.
For the SummarizedExperiment method, further arguments to pass to the ANY method.

For \code{computeLibraryFactors}, further arguments to pass to \code{librarySizeFactors}.}

\item{subset.row}{A vector specifying the subset of rows of \code{x} from which the size factors should be computed.
If \code{NULL}, all rows are used.}

\item{geometric}{Deprecated. A logical scalar indicating whether the size factors should be defined using the geometric mean.}

\item{BPPARAM}{A \linkS4class{BiocParallelParam} object indicating how calculations are to be parallelized.
Only relevant when \code{x} is a \linkS4class{DelayedArray} object.}

\item{subset_row, exprs_values}{Soft-deprecated equivalents of \code{subset.row} and \code{assay.type}, respectively.}

\item{pseudo_count}{Deprecated. A numeric scalar specifying the pseudo-count to add when \code{geometric=TRUE}.}

\item{assay.type}{String or integer scalar indicating the assay of \code{x} containing the counts.}
}
\value{
For \code{librarySizeFactors}, a numeric vector of size factors is returned for all methods.

For \code{computeLibraryFactors}, \code{x} is returned with the size factors stored in \code{\link{sizeFactors}(x)}.
}
\description{
Define per-cell size factors from the library sizes (i.e., total sum of counts per cell).
}
\details{
Library sizes are converted into size factors by scaling them so that their mean across cells is unity.
This ensures that the normalized values remain on the same scale as the raw counts.
Preserving the scale aids interpretation of operations on the normalized values,
e.g., the pseudo-count used in \code{\link{logNormCounts}} can be interpreted as an additional read/UMI.
It also ensures that the effect of the pseudo-count decreases with increasing sequencing depth;
see \code{?\link{normalizeCounts}} for a discussion of this effect.
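
As a minimal sketch of this scaling (written in base R on a hypothetical toy matrix, and not taken from the package's internal code), the size factors are the per-cell column sums divided by their mean:

\preformatted{# Hypothetical toy counts: 2 features (rows) by 3 cells (columns).
toy <- matrix(c(1, 3, 2, 6, 4, 8), nrow = 2)
lib.sizes <- colSums(toy)                    # library size of each cell
size.factors <- lib.sizes / mean(lib.sizes)  # rescale to a mean of 1
mean(size.factors)                           # equals 1 by construction
}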

With library size-derived size factors, we implicitly assume that sequencing coverage is the only difference between cells.
This is reasonable for homogeneous cell populations but is compromised by composition biases arising from differential expression (DE) between cell types.
In such cases, the library size factors will be inaccurate, though the effect on downstream conclusions varies,
e.g., clustering is usually unaffected by composition biases but log-fold change estimates will be less accurate.
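
As a rough illustration of composition bias (a base R sketch on hypothetical toy counts, assuming that both cells were captured with equal efficiency so that the true size factors are equal), a single upregulated gene inflates the library size factor of its cell and makes the unchanged genes appear downregulated:

\preformatted{# Hypothetical toy counts: genes 1-3 are identical in both cells,
# while gene 4 is strongly upregulated in cell B.
toy <- cbind(A = c(10, 10, 10, 10), B = c(10, 10, 10, 70))
lib.sizes <- colSums(toy)
size.factors <- lib.sizes / mean(lib.sizes)
# Divide each column (cell) by its size factor.
normalized <- t(t(toy) / size.factors)
# Gene 1 has the same count in both cells but differs after normalization.
normalized[1, "A"] / normalized[1, "B"]
}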
}
\examples{
example_sce <- mockSCE()
summary(librarySizeFactors(example_sce))
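
# As described in Details, the size factors are scaled to have a mean of 1.
mean(librarySizeFactors(example_sce))

# Computing size factors from a subset of rows; the choice of the
# first 100 rows is arbitrary and purely for illustration.
summary(librarySizeFactors(example_sce, subset.row=1:100))

# Storing the size factors in the SingleCellExperiment itself.
example_sce <- computeLibraryFactors(example_sce)
summary(sizeFactors(example_sce))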
}
\seealso{
\code{\link{normalizeCounts}} and \code{\link{logNormCounts}}, where these size factors are used by default.

\code{\link{geometricSizeFactors}} and \code{\link{medianSizeFactors}}, 
for two other simple methods of computing size factors.
}
\author{
Aaron Lun
}