File: empPvals.Rd

% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/empPvals.R
\name{empPvals}
\alias{empPvals}
\title{Calculate p-values from a set of observed test statistics and
  simulated null test statistics}
\usage{
empPvals(stat, stat0, pool = TRUE)
}
\arguments{
\item{stat}{A vector of calculated test statistics.}

\item{stat0}{A vector or matrix of simulated or data-resampled null test
statistics.}

\item{pool}{If TRUE (the default), all of \code{stat0} is pooled to calculate
each p-value. If FALSE, \code{stat0} must be a matrix with the number of rows
equal to the length of \code{stat}.}
}
\value{
A vector of p-values, one for each entry of \code{stat}, calculated as
described in Details.
}
\description{
Calculates p-values from a set of observed test statistics and
  simulated null test statistics.
}
\details{
The argument \code{stat} must be defined so that larger values indicate
  greater deviation from the null hypothesis (i.e., are "more extreme").
  Examples include an F-statistic or the absolute value of a t-statistic. The
  argument \code{stat0} should be calculated analogously on data that
  represents observations from the null hypothesis distribution. Each p-value
  is calculated as the proportion of values from \code{stat0} that are
  greater than or equal to the corresponding entry of \code{stat}. If \code{pool=TRUE} is
  selected, then all of \code{stat0} is used in calculating the p-value for a
  given entry of \code{stat}. If \code{pool=FALSE}, then it is assumed that
  \code{stat0} is a matrix, where \code{stat0[i,]} is used to calculate the
  p-value for \code{stat[i]}. The function \code{empPvals} calculates
  "pooled" p-values faster than using a for-loop.

  See page 18 of the Supporting Information in Storey et al. (2005) PNAS
  (\url{http://www.pnas.org/content/suppl/2005/08/26/0504609102.DC1/04609SuppAppendix.pdf})
  for an explanation of why calculating p-values from pooled empirical
  null statistics and then estimating FDR on these p-values is equivalent to
  directly thresholding the test statistics themselves and utilizing an
  analogous FDR estimator.
}
\examples{
# import data
data(hedenfalk)
stat <- hedenfalk$stat
stat0 <- hedenfalk$stat0 # null statistics; must be a matrix with one row per entry of stat for pool=FALSE below

# calculate p-values
p.pooled <- empPvals(stat=stat, stat0=stat0)
p.testspecific <- empPvals(stat=stat, stat0=stat0, pool=FALSE)

# compare pooled to test-specific p-values
qqplot(p.pooled, p.testspecific); abline(0,1)
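
# A minimal base-R sketch of the calculation described in Details, on toy
# data. The names toy.stat, toy.stat0, naive.pooled and naive.testspecific
# are hypothetical and used only for illustration. This is not the package
# implementation: empPvals may handle ties or lower bounds differently, so
# its output is not guaranteed to match these values exactly.
toy.stat <- c(0.5, 2.0, 3.5)
toy.stat0 <- matrix(c(0, 1, 2, 3,
                      1, 2, 3, 4,
                      0, 2, 4, 6), nrow = 3, byrow = TRUE)

# pooled: compare each statistic against all 12 null statistics
naive.pooled <- vapply(toy.stat,
                       function(s) mean(as.vector(toy.stat0) >= s),
                       numeric(1))
naive.pooled  # 10/12, 8/12, 3/12

# test-specific: compare toy.stat[i] against row i of toy.stat0 only
naive.testspecific <- vapply(seq_along(toy.stat),
                             function(i) mean(toy.stat0[i, ] >= toy.stat[i]),
                             numeric(1))
naive.testspecific  # 0.75, 0.75, 0.50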

}
\references{
Storey JD and Tibshirani R. (2003) Statistical significance for
  genome-wide experiments. Proceedings of the National Academy of Sciences,
  100: 9440-9445.  \cr \url{http://www.pnas.org/content/100/16/9440.full}

  Storey JD, Xiao W, Leek JT, Tompkins RG, Davis RW. (2005) Significance
  analysis of time course microarray experiments.  Proceedings of the
  National Academy of Sciences, 102 (36), 12837-12842. \cr
  \url{http://www.pnas.org/content/102/36/12837.full.pdf?with-ds=yes}
}
\seealso{
\code{\link{qvalue}}
}
\author{
John D. Storey
}
\keyword{pvalues}