\name{topTags}
\alias{topTags}
\alias{TopTags-class}
\alias{show,TopTags-method}

\title{Table of the Top Differentially Expressed Genes/Tags}

\description{Extracts the most differentially expressed genes (or sequence tags) from a test object, ranked either by p-value or by absolute log-fold-change.}

\usage{
topTags(object, n = 10, adjust.method = "BH", sort.by = "PValue", p.value = 1)
}

\arguments{ 
\item{object}{a \code{\link[edgeR:DGEExact-class]{DGEExact}} or \code{\link[edgeR:DGELRT-class]{DGELRT}} object containing test statistics and p-values.
Usually created by \code{exactTest}, \code{glmLRT}, \code{glmTreat} or \code{glmQLFTest}.}

\item{n}{integer, maximum number of genes/tags to return.}

\item{adjust.method}{character string specifying the method used to adjust p-values for multiple testing. See \code{\link{p.adjust}} for possible values.}

\item{sort.by}{character string specifying the sort method. Possibilities are \code{"PValue"} for p-value, \code{"logFC"} for absolute log-fold change or \code{"none"} for no sorting.}

\item{p.value}{numeric cutoff value for adjusted p-values. Only tags with adjusted p-values less than or equal to this value are returned.}
}

\details{
This function is closely analogous to the \code{\link{topTable}} function in the limma package.
It accepts a test statistic object created by any of the edgeR functions \code{exactTest}, \code{glmLRT}, \code{glmTreat} or \code{glmQLFTest} and extracts a readable data.frame of the most differentially expressed genes.
The data.frame collates the annotation and differential expression statistics for the top genes.
The data.frame is wrapped in a \code{TopTags} output object that records the test statistic used and the multiple testing adjustment method.

\code{TopTags} objects have dimensions, so functions such as \code{dim}, \code{nrow} and \code{ncol} are defined on them.
\code{TopTags} objects also have a \code{show} method so that printing produces a compact summary of their contents.

\code{topTags} permits ranking by fold-change, but the authors do not recommend fold-change ranking or fold-change cutoffs for routine RNA-seq analysis.
Ranking by p-value is intended to be more biologically meaningful, especially if the p-values were computed using \code{glmTreat}.
}

\value{
An object of class \code{TopTags}, which is a list-based class with the following components:
\item{table}{a data.frame containing differential expression results for the top genes in sorted order.
The number of rows is the smaller of \code{n} and the number of genes with adjusted p-value less than or equal to \code{p.value}.
The data.frame includes all the annotation columns from \code{object$genes} and all statistic columns from \code{object$table} plus one of:
\tabular{rl}{
\code{FDR}: \tab false discovery rate (only when \code{adjust.method} is \code{"BH"}, \code{"BY"} or \code{"fdr"})\cr
\code{FWER}: \tab family-wise error rate (only when \code{adjust.method} is \code{"holm"}, \code{"hochberg"}, \code{"hommel"} or \code{"bonferroni"}).
}
}
\item{adjust.method}{character string specifying the method used to adjust p-values for multiple testing, same as input argument.}
\item{comparison}{character vector giving the names of the two groups being compared (for \code{DGEExact} objects) or the glm contrast being tested (for \code{DGELRT} objects).}
\item{test}{character string stating the name of the test.}
}

\note{
The terms `tag' and `gene' are used synonymously on this page and refer to the rows of \code{object}.
In general, the rows might be genes, sequence tags, transcripts, exons or whatever type of genomic feature is appropriate for the analysis at hand.
}

\author{Mark Robinson, Davis McCarthy, Yunshun Chen, Gordon Smyth}

\references{
Chen Y, Lun ATL, Smyth GK (2016).
From reads to genes to pathways: differential expression analysis of RNA-Seq experiments using Rsubread and the edgeR quasi-likelihood pipeline.
\emph{F1000Research} 5, 1438.
\url{http://f1000research.com/articles/5-1438}

McCarthy DJ, Chen Y, Smyth GK (2012). Differential expression analysis of multifactor RNA-Seq experiments with respect to biological variation.
\emph{Nucleic Acids Research} 40, 4288-4297.
\doi{10.1093/nar/gks042}

Robinson MD, Smyth GK (2008). Small-sample estimation of negative binomial dispersion, with applications to SAGE data. \emph{Biostatistics} 9, 321-332.

Robinson MD, Smyth GK (2007). Moderated statistical tests for assessing differences in tag abundance. \emph{Bioinformatics} 23, 2881-2887.
}

\seealso{
\code{\link{exactTest}}, \code{\link{glmLRT}}, \code{\link{glmTreat}}, \code{\link{glmQLFTest}}, \code{\link{dim.TopTags}}, \code{\link{p.adjust}}.
}

\examples{
# generate raw counts from NB, create list object
y <- matrix(rnbinom(80,size=1,mu=10),nrow=20)
d <- DGEList(counts=y,group=rep(1:2,each=2),lib.size=rep(1000:1001,2))
rownames(d$counts) <- paste("gene",1:nrow(d$counts),sep=".")

# estimate common dispersion and find differences in expression
# here we demonstrate the 'exact' methods, but the use of topTags is
# the same for a GLM analysis
d <- estimateCommonDisp(d)
de <- exactTest(d)

# look at top 10
topTags(de)
# Can specify how many genes to view
tp <- topTags(de, n=15)
# Here we view top 15
tp
# Or order by fold change instead
topTags(de,sort.by="logFC")
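
# The remaining calls are an illustrative sketch, not part of the original
# examples. They use only the arguments and components documented above.
# The object names 'fw' and 'all.tags' are hypothetical.
# Extract the underlying data.frame from the TopTags object
head(tp$table)
# dim, nrow and ncol can be applied directly to the TopTags object
dim(tp)
# Request a family-wise error rate adjustment instead of the default "BH";
# the adjusted p-value column is then expected to be named FWER, not FDR
fw <- topTags(de, adjust.method="holm")
colnames(fw$table)
# Return every gene in its original row order
# (n=Inf is assumed here to mean no limit on the number of rows)
all.tags <- topTags(de, n=Inf, sort.by="none")
nrow(all.tags)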
}