1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375
|
---
title: "Installing and Managing _Bioconductor_ Packages"
author:
- name: Marcel Ramos
affiliation: Roswell Park Comprehensive Cancer Center, Buffalo, NY
- name: Martin Morgan
affiliation: Roswell Park Comprehensive Cancer Center, Buffalo, NY
output:
BiocStyle::html_document:
toc: true
vignette: |
%\VignetteIndexEntry{Installing and Managing Bioconductor Packages}
%\VignetteEngine{knitr::rmarkdown}
%\VignetteEncoding{UTF-8}
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE, eval = interactive())
```
# Introduction
Use the [BiocManager][1] package to install and manage packages from the
_[Bioconductor][2]_ project for the statistical analysis and comprehension of
high-throughput genomic data.
Current _Bioconductor_ packages are available on a 'release' version intended
for every-day use, and a 'devel' version where new features are introduced. A
new release version is created every six months. Using the [BiocManager][1]
package helps users install packages from the same release.
# Basic use
## Installing BiocManager
Use standard _R_ installation procedures to install the
[BiocManager][1] package. This command is requried only once per _R_
installation.
```{r, eval = FALSE}
chooseCRANmirror()
install.packages("BiocManager")
```
## Installing _Bioconductor_, _CRAN_, or github packages
Install _Bioconductor_ (or CRAN) packages with
```{r, eval = FALSE}
BiocManager::install(c("GenomicRanges", "Organism.dplyr"))
```
Installed packages can be updated to their current version with
```{r, eval = FALSE}
BiocManager::install()
```
## Version and validity of installations
Use `version()` to discover the version of _Bioconductor_ currently in
use.
```{r}
BiocManager::version()
```
_Bioconductor_ packages work best when they are all from the same release. Use
`valid()` to identify packages that are out-of-date or from unexpected
versions.
```{r}
BiocManager::valid()
```
`valid()` returns an object that can be queried for detailed
information about invalid packages, as illustrated in the following
screen capture
```
> v <- valid()
Warning message:
6 packages out-of-date; 0 packages too new
> names(v)
[1] "out_of_date" "too_new"
> head(v$out_of_date, 2)
Package LibPath
bit "bit" "/home/mtmorgan/R/x86_64-pc-linux-gnu-library/3.5-Bioc-3.8"
ff "ff" "/home/mtmorgan/R/x86_64-pc-linux-gnu-library/3.5-Bioc-3.8"
Installed Built ReposVer Repository
bit "1.1-12" "3.5.0" "1.1-13" "https://cran.rstudio.com/src/contrib"
ff "2.2-13" "3.5.0" "2.2-14" "https://cran.rstudio.com/src/contrib"
>
```
## Available packages
Packages available for your version of _Bioconductor_ can be
discovered with `available()`; the first argument can be used to
filter package names based on a regular expression, e.g., 'BSgenome'
package available for _Homo sapiens_
```{r}
avail <- BiocManager::available()
length(avail) # all CRAN & Bioconductor packages
BiocManager::available("BSgenome.Hsapiens") # BSgenome.Hsapiens.* packages
```
Questions about installing and managing _Bioconductor_ packages should
be addressed to the [_Bioconductor_ support site][3].
# Troubleshooting
## Failure after updating _R_
After updating _R_ (e.g., from _R_ version 3.5.x to _R_ version 3.6.x
at the time of writing this) and trying to load `BiocManager`, _R_
replies
```
Error: .onLoad failed in loadNamespace() for 'BiocManager', details:
call: NULL
error: Bioconductor version '3.8' requires R version '3.5'; see
https://bioconductor.org/install
```
This problem arises because `BiocManager` uses a second package,
`BiocVersion`, to indicate the version of _Bioconductor_ in use. In
the original installation, `BiocManager` had installed `BiocVersion`
appropriate for _R_ version 3.5. With the update, the version of
_Bioconductor_ indicated by `BiocVersion` is no longer valid -- you'll
need to update `BiocVersion` and all _Bioconductor_ packages to the
most recent version available for your new version of _R_.
The recommended practice is to maintain a separate library for each
_R_ and _Bioconductor_ version. So instead of installing packages into
_R_'s system library (e.g., as 'administrator'), install only base _R_
into the system location. Then use aliases or other approaches to
create _R_ / _Bioconductor_ version-specific installations. This is
described in the section on [maintaining multiple
versions](#multiple-versions) of _R_ and _Bioconductor_.
Alternatively, one could update all _Bioconductor_ packages in the
previous installation directory. The problem with this is that the
previous version of _Bioconductor_ is removed, compromising the
ability to reproduce earlier results. Update all _Bioconductor_
packages in the previous installation directory by removing _all_
versions of `BiocVersion`
```
remove.packages("BiocVersion") # repeat until all instances removed
```
Then install the updated `BiocVersion`, and update all _Bioconductor_
packages; answer 'yes' when you are asked to update a potentially
large number of _Bioconductor_ packages.
```{r, eval = FALSE}
BiocManager::install()
```
Confirm that the updated _Bioconductor_ is valid for your version of
_R_
```{r, eval = FALSE}
BiocManager::valid()
```
# Advanced use
## Changing version
Use the `version=` argument to update all packages to a specific _Bioconductor_
version
```{r, eval = FALSE}
BiocManager::install(version="3.7")
```
_Bioconductor_ versions are associated with specific _R_ versions, as
summarized [here][5]. Attempting to install a version of
_Bioconductor_ that is not supported by the version of _R_ in use
leads to an error; using the most recent version of _Bioconductor_ may require
installing a new version of _R_.
```
> BiocManager::install(version="3.9")
Error: Bioconductor version '3.9' requires R version '3.6'; see
https://bioconductor.org/install
```
A special version, `version="devel"`, allows use of _Bioconductor_
packages that are under development.
## Managing multiple versions {#multiple-versions}
It is possible to have multiple versions of _Bioconductor_ installed on the
same computer. A best practice is to [create an initial _R_ installation][6].
Then create and use a library for each version of _Bioconductor_. The library
will contain all _Bioconductor_, CRAN, and other packages for that version of
_Bioconductor_. We illustrate the process assuming use of _Bioconductor_
version 3.7, available using _R_ version 3.5
Create a directory to contain the library (replace `USER_NAME` with your user
name on Windows)
- Linux: `~/R/3.5-Bioc-3.7`
- macOS: `~/Library/R/3.5-Bioc-3.7/library`
- Windows: `C:\Users\USER_NAME\Documents\R\3.5-Bioc-3.7`
Set the environment variable `R_LIBS_USER` to this directory, and invoke _R_.
Command line examples for Linux are
- Linux: `R_LIBS_USER=~/R/3.5-Bioc-3.7 R`
- macOS: `R_LIBS_USER=~~/Library/R/3.5-Bioc-3.7/library R`
- Windows: `cmd /C "set R_LIBS_USER=C:\Users\USER_NAME\Documents\R\3.5-Bioc-3.7 && R"`
Once in _R_, confirm that the version-specific library path has been set
```{r, eval = FALSE}
.libPaths()
```
On Linux and macOS, create a bash alias to save typing, e.g.,
- Linux: `alias Bioc3.7='R_LIBS_USER=~/R/3.5-Bioc-3.7 R'`
- macOS: `alias Bioc3.7='R_LIBS_USER=~/Library/R/3.5-Bioc-3.7/library R'`
Invoke these from the command line as `Bioc3.7`.
On Windows, create a shortcut. Go to My Computer and navigate to a directory
that is in your PATH. Then right-click and choose New->Shortcut.
In the "type the location of the item" box, put:
```
cmd /C "set R_LIBS_USER=C:\Users\USER_NAME\Documents\R\3.5-Bioc-3.7 && R"
```
Click "Next". In the "Type a name for this shortcut" box, type `Bioc-3.7`.
## Offline use
_BiocManager_ expects to be able to connect to online _Bioconductor_
and _CRAN_ repositories. Off-line use generally requires the following
steps.
1. Use `rsync` to create local repositories of [CRAN][8] and
[Bioconductor][7]. Tell _R_ about these repositories using (e.g.,
in a site-wide `.Rprofile`, see `?.Rprofile`).
```{r, eval = FALSE}
options(
repos = "file:///path/to/CRAN-mirror",
BioC_mirror = "file:///path/to/Bioc-mirror"
)
```
2. Create an environment variable or option, e.g.,
```{r, eval = FALSE}
options(
BIOCONDUCTOR_ONLINE_VERSION_DIAGNOSIS = FALSE
)
```
3. Use `install.packages()` to bootstrap the BiocManager installation.
```{r, eval = FALSE}
install.package(c("BiocManager", "BiocVersion"))
```
BiocManager can then be used for subsequent installations, e.g.,
`BiocManager::install(c("ggplot2", "GenomicRanges"))`.
# How it works -- details for advanced trouble-shooting
BiocManager's job is to make sure that all packages are installed from
the same _Bioconductor_ version, using compatible _R_ and _CRAN_
packages. However, _R_ has an annual release cycle, whereas
_Bioconductor_ has a twice-yearly release cycle. Also, _Bioconductor_
has a 'devel' branch where new packages and features are introduced,
and a 'release' branch where bug fixes and relative stability are
important; _CRAN_ packages do not distinguish between devel and
release branches.
In the past, one would install a _Bioconductor_ package by evaluating
the command `source("https://.../biocLite.R")` to read a file from the
web. The file contained an installation script that was smart enough
to figure out what version of _R_ and _Bioconductor_ were in use or
appropriate for the person invoking the script. Sourcing an executable
script from the web is an obvious security problem.
Our solution is to use a CRAN package BiocManager, so that users
install from pre-configured CRAN mirrors rather than typing in a URL
and sourcing from the web.
But how does a CRAN package know what version of _Bioconductor_ is in
use? Can we use BiocManager? No, because we don't have enough control
over the version of BiocManager available on CRAN, e.g., everyone using
the same version of _R_ would get the same version of BiocManager and
hence of _Bioconductor_. But there are two _Bioconductor_ versions per R
version, so that does not work!
BiocManager could write information to a cache on the user disk, but
this is not a robust solution for a number of reasons. Is there any
other way that _R_ could keep track of version information? Yes, by
installing a _Bioconductor_ package (BiocVersion) whose sole purpose is
to indicate the version of _Bioconductor_ in use.
By default, BiocManager installs the BiocVersion package corresponding
to the most recent released version of _Bioconductor_ for the version
of _R_ in use. At the time this section was written, the most recent
version of R is R-3.6.1, associated with _Bioconductor_ release
version 3.9. Hence on first use of `BiocManager::install()` we see
BiocVersion version 3.9.0 being installed.
```
> BiocManager::install()
Bioconductor version 3.9 (BiocManager 1.30.4), R 3.6.1 Patched (2019-07-06
r76792)
Installing package(s) 'BiocVersion'
trying URL 'https://bioconductor.org/packages/3.9/bioc/src/contrib/\
BiocVersion_3.9.0.tar.gz'
...
```
Requesting a specific version of _Bioconductor_ updates, if possible,
the BiocVersion package.
```
> ## 3.10 is available for R-3.6
> BiocManager::install(version="3.10")
Upgrade 3 packages to Bioconductor version '3.10'? [y/n]: y
Bioconductor version 3.10 (BiocManager 1.30.4), R 3.6.1 Patched (2019-07-06
r76792)
Installing package(s) 'BiocVersion'
trying URL 'https://bioconductor.org/packages/3.10/bioc/src/contrib/\
BiocVersion_3.10.1.tar.gz'
...
> ## back down again...
> BiocManager::install(version="3.9")
Downgrade 3 packages to Bioconductor version '3.9'? [y/n]: y
Bioconductor version 3.9 (BiocManager 1.30.4), R 3.6.1 Patched (2019-07-06
r76792)
Installing package(s) 'BiocVersion'
trying URL 'https://bioconductor.org/packages/3.9/bioc/src/contrib/\
BiocVersion_3.9.0.tar.gz'
...
```
Answering `n` to the prompt to up- or downgrade packages leaves the
installation unchanged, since this would immediately create an
inconsistent installation.
One potential problem occurs when there are two or more `.libPaths()`,
with more than one BiocVersion package installed. This might occur for
instance if a 'system administrator' installed BiocVersion, and then a
user installed their own version. In this circumstance, it seems
appropriate to standardize the installation by repeatedly calling
`remove.pacakges("BiocVersion")` until all versions are removed, and
then installing the desired version.
# `sessionInfo()`
```{r, eval = TRUE}
sessionInfo()
```
[1]: https://cran.r-project.org/package=BiocManager
[2]: https://bioconductor.org
[3]: https://support.bioconductor.org
[5]: https://bioconductor.org/about/release-announcements/
[6]: https://cran.R-project.org/
[7]: https://bioconductor.org/about/mirrors/mirror-how-to/
[8]: https://cran.r-project.org/mirror-howto.html
|