% File: boot.phylo.Rd (r-cran-ape 5.7-1)
\name{boot.phylo}
\alias{boot.phylo}
\alias{prop.part}
\alias{prop.clades}
\alias{print.prop.part}
\alias{summary.prop.part}
\alias{plot.prop.part}
\title{Tree Bipartition and Bootstrapping Phylogenies}
\description{
  These functions analyse bipartitions found in a series of trees.

  \code{prop.part} counts the number of bipartitions found in a series
  of trees given as \code{\dots}. If a single tree is passed, the
  returned object is a list of vectors with the tips descending from
  each node (i.e., clade compositions indexed by node number).

  \code{prop.clades} counts the number of times the bipartitions present
  in \code{phy} are present in a series of trees given as \code{\dots} or
  in the list previously computed and given with \code{part}.

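  \code{boot.phylo} performs a bootstrap analysis: the columns of
  \code{x} are resampled with replacement, a tree is estimated from each
  replicate with \code{FUN}, and each node of \code{phy} is scored by the
  number of replicate trees in which it is found.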
}
\usage{
boot.phylo(phy, x, FUN, B = 100, block = 1,
           trees = FALSE, quiet = FALSE,
           rooted = is.rooted(phy), jumble = TRUE,
           mc.cores = 1)
prop.part(..., check.labels = TRUE)
prop.clades(phy, ..., part = NULL, rooted = FALSE)
\method{print}{prop.part}(x, ...)
\method{summary}{prop.part}(object, ...)
\method{plot}{prop.part}(x, barcol = "blue", leftmar = 4, col = "red", ...)
}
\arguments{
  \item{phy}{an object of class \code{"phylo"}.}
  \item{x}{in the case of \code{boot.phylo}: a taxa (rows) by characters
    (columns) matrix; in the case of \code{print} and \code{plot}: an
    object of class \code{"prop.part"}.}
  \item{FUN}{the function used to estimate \code{phy} (see details).}
  \item{B}{the number of bootstrap replicates.}
  \item{block}{the number of columns in \code{x} that will be resampled
    together (see details).}
  \item{trees}{a logical specifying whether to return the bootstrapped
    trees (\code{FALSE} by default).}
  \item{quiet}{a logical: if \code{FALSE} (the default), a progress bar
    is displayed.}
  \item{rooted}{a logical specifying whether the trees should be treated
    as rooted or not.}
  \item{jumble}{a logical value. By default, the rows of \code{x} are
    randomized to avoid artificially inflated bootstrap values
    associated with very short branches.}
  \item{mc.cores}{the number of cores (CPUs) to be used (passed to
    \pkg{parallel}).}
  \item{\dots}{either (i) a single object of class \code{"phylo"}, (ii) a
    series of such objects separated by commas, or (iii) a list
    containing such objects. In the case of \code{plot} further
    arguments for the plot (see details).}
  \item{check.labels}{a logical specifying whether to check the labels
    of each tree. If \code{FALSE}, it is assumed that all trees have the
    same tip labels, and that they are in the same order (see details).}
  \item{part}{a list of partitions as returned by \code{prop.part}; if
    this is used then \code{\dots} is ignored.}
  \item{object}{an object of class \code{"prop.part"}.}
  \item{barcol}{the colour used for the bars displaying the number of
    partitions in the upper panel.}
  \item{leftmar}{the size of the margin on the left to display the tip
    labels.}
  \item{col}{the colour used to visualise the bipartitions.}
}
\details{
  The argument \code{FUN} in \code{boot.phylo} must be the function used
  to estimate the tree from the original data matrix. Thus, if the tree
  was estimated with neighbor-joining (see \code{nj}), one may want
  something like \code{FUN = function(xx) nj(dist.dna(xx))}.

  \code{block} in \code{boot.phylo} specifies the number of columns to
  be resampled together. For instance, if one wants to resample at the
  codon-level, then \code{block = 3} must be used.

  Using \code{check.labels = FALSE} in \code{prop.part} decreases
  computing times. This requires that (i) all trees have the same tip
  labels, \emph{and} (ii) these labels are ordered similarly in all
  trees (in other words, the elements \code{tip.label} are identical in
  all trees).

  The plot function represents a contingency table of the different
  partitions (on the \emph{x}-axis) in the lower panel, and their observed
  numbers in the upper panel. Any further arguments (\dots) are used to
  change the aspects of the points in the lower panel: these may be
  \code{pch}, \code{col}, \code{bg}, \code{cex}, etc. This function
  works only if there is an attribute \code{labels} in the object.

  The print method displays the partitions and their numbers. The
  summary method extracts the numbers only.
}
\note{
  \code{prop.clades} internally calls \code{prop.part} with the option
  \code{check.labels = TRUE}, which may be very slow. If the trees
  passed as \code{\dots} fulfil conditions (i) and (ii) above, then it
  might be faster to first call, e.g., \code{pp <- prop.part(...)}, then
  use the option \code{part}: \code{prop.clades(phy, part = pp)}.

  Since \pkg{ape} 3.5, \code{prop.clades} should return sensible results
  for all values of \code{rooted}: if \code{FALSE}, the numbers of
  bipartitions (or splits); if \code{TRUE}, the number of clades (of
  hopefully rooted trees).
}
\value{
  \code{prop.part} returns an object of class \code{"prop.part"} which
  is a list with an attribute \code{"number"}. The elements of this list
  are the observed clades, and the attribute their respective
  numbers. If the default \code{check.labels = TRUE} is used, an
  attribute \code{"labels"} is added, and the vectors of the returned
  object contain the indices of these labels instead of the labels
  themselves.

  \code{prop.clades} and \code{boot.phylo} return a numeric vector
  whose \emph{i}th element is the number associated with the \emph{i}th
  node of \code{phy}. If \code{trees = TRUE}, \code{boot.phylo} returns
  a list whose first element (named \code{"BP"}) is this vector, and
  whose second element (\code{"trees"}) is a list with the bootstrapped
  trees.

  \code{summary} returns a numeric vector.
}
\references{
  Efron, B., Halloran, E. and Holmes, S. (1996) Bootstrap confidence
  levels for phylogenetic trees. \emph{Proceedings of the National
    Academy of Sciences USA}, \bold{93}, 13429--13434.

  Felsenstein, J. (1985) Confidence limits on phylogenies: an approach
  using the bootstrap. \emph{Evolution}, \bold{39}, 783--791.
}
\author{Emmanuel Paradis}
\seealso{
  \code{\link{as.bitsplits}}, \code{\link{dist.topo}},
  \code{\link{consensus}}, \code{\link{nodelabels}}
}
\examples{
data(woodmouse)
f <- function(x) nj(dist.dna(x))
tr <- f(woodmouse)
### Are bootstrap values stable?
for (i in 1:5)
  print(boot.phylo(tr, woodmouse, f, quiet = TRUE))
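### Resampling codons rather than single sites (see block in the details).
### A sketch only: it ASSUMES that the first column of the alignment is
### a first codon position, which is not checked here. The alignment is
### trimmed (wm3, a name used only for this illustration) so that its
### number of columns is a multiple of 3.
nc3 <- 3 * floor(ncol(woodmouse) / 3)
wm3 <- woodmouse[, seq_len(nc3)]
tr3 <- f(wm3)
bp3 <- boot.phylo(tr3, wm3, f, block = 3, quiet = TRUE)
bp3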
### How many partitions in 100 random trees of 10 labels?...
TR <- rmtree(100, 10)
pp10 <- prop.part(TR)
length(pp10)
### ... and in 100 random trees of 20 labels?
TR <- rmtree(100, 20)
pp20 <- prop.part(TR)
length(pp20)
plot(pp10, pch = "x", col = 2)
plot(pp20, pch = "x", col = 2)
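### The summary method extracts the counts only, one per partition
### (n10 is a name used only for this illustration):
n10 <- summary(pp10)
length(n10) == length(pp10)
head(n10)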

set.seed(2)
tr <- rtree(10) # rooted
## the following used to return a wrong result with ape <= 3.4:
prop.clades(tr, tr)
prop.clades(tr, tr, rooted = TRUE)
tr <- rtree(10, rooted = FALSE)
prop.clades(tr, tr) # correct

### an illustration of the use of prop.clades with bootstrap trees:

fun <- function(x) as.phylo(hclust(dist.dna(x), "average")) # upgma() in phangorn
tree <- fun(woodmouse)
## get 100 bootstrap trees:
bstrees <- boot.phylo(tree, woodmouse, fun, trees = TRUE)$trees
## get proportions of each clade:
clad <- prop.clades(tree, bstrees, rooted = TRUE)
## get proportions of each bipartition:
boot <- prop.clades(tree, bstrees)
layout(1)
par(mar = rep(2, 4))
plot(tree, main = "Bipartition vs. Clade Support Values")
drawSupportOnEdges(boot)
nodelabels(clad)
legend("bottomleft", legend = c("Bipartitions", "Clades"), pch = 22,
       pt.bg = c("green", "lightblue"), pt.cex = 2.5)
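
## The partitions can also be computed once with prop.part() and then
## reused through the argument part (see the note). A sketch with the
## bootstrap trees from above; clad2 should hold the same counts as clad:
pp <- prop.part(bstrees)
clad2 <- prop.clades(tree, part = pp, rooted = TRUE)
all.equal(clad, clad2)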

\dontrun{
## an example of double bootstrap:
nrep1 <- 100
nrep2 <- 100
p <- ncol(woodmouse)
DB <- 0

for (b in 1:nrep1) {
    X <- woodmouse[, sample(p, p, TRUE)]
    DB <- DB + boot.phylo(tr, X, f, nrep2, quiet = TRUE)
}
DB
## to compare with:
boot.phylo(tr, woodmouse, f, 1e4)
}
}
\keyword{manip}
\keyword{htest}