# Introduction
MultiQC is a reporting tool that parses summary statistics from results and log files
generated by other bioinformatics tools. MultiQC doesn't run other tools for you -
it's designed to be placed at the end of analysis pipelines or to be run manually
when you've finished running your tools.
When you launch MultiQC, it recursively searches through any provided file paths
and finds files that it recognises. It parses relevant information from these and
generates a single stand-alone HTML report file. It also saves a directory of data
files with all parsed data for further downstream use.
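As a rough illustration of the recursive search described above, the sketch below walks a directory tree and collects files whose names match a set of patterns. It is not MultiQC's actual implementation: the function name `find_matching_files` and the example patterns are hypothetical and chosen only for illustration.

```python
import fnmatch
import os


def find_matching_files(root, patterns):
    """Recursively collect paths under `root` whose file name matches any pattern.

    Illustrative only: `patterns` is a hypothetical list of glob patterns,
    not the search patterns that MultiQC itself ships with.
    """
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if any(fnmatch.fnmatch(name, pattern) for pattern in patterns):
                matches.append(os.path.join(dirpath, name))
    return sorted(matches)


if __name__ == "__main__":
    # Hypothetical patterns for log and summary files below the current directory
    for path in find_matching_files(".", ["*.log", "*_summary.txt"]):
        print(path)
```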
# Installing MultiQC
## System Python
Before we start - a quick note that using the system-wide installation of Python
is _not_ recommended. This often causes problems and it's a little risky to mess
with it. If you find yourself prepending `sudo` to any MultiQC commands, take a
step back and think about Python virtual environments / conda instead (see below).
## Installing Python
To see if you have Python installed, run `python --version` on the command line.
MultiQC needs Python version 2.7+, 3.4+ or 3.5+.
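The same check can be made from inside Python. This is a minimal sketch that assumes Python 3.4 as the minimum (taken from the "3.4+" figure above, and in line with the Python 2 note later in this document); the helper name `meets_minimum` and the constant `MINIMUM_VERSION` are hypothetical.

```python
import sys

# Assumption: (3, 4) reflects the "3.4+" requirement quoted above; adjust as needed.
MINIMUM_VERSION = (3, 4)


def meets_minimum(version_info=None, minimum=MINIMUM_VERSION):
    """Return True if the (major, minor) version is at least `minimum`."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) >= tuple(minimum)


if __name__ == "__main__":
    status = "OK" if meets_minimum() else "too old"
    print("Python", sys.version.split()[0], status)
```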
We recommend using virtual environments to manage your Python installation.
Our favourite is _conda_, a cross-platform tool to manage Python environments.
You can find installation instructions for Miniconda
[here](https://docs.conda.io/en/latest/miniconda.html).
Once conda is installed, you can create a Python environment with the following commands:
```bash
conda create --name py3.7 python=3.7
conda activate py3.7
```
You'll want to add the `conda activate py3.7` line to your `.bashrc` file so
that the environment is activated every time you open a terminal.
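If you are unsure whether an environment is actually active, a short Python check can help. The sketch below relies on the `CONDA_DEFAULT_ENV` variable that conda sets on activation and on the `sys.prefix` / `sys.base_prefix` comparison for standard virtual environments; the function name `active_environment` is hypothetical.

```python
import os
import sys


def active_environment(environ=None):
    """Describe the active Python environment, if any.

    Returns the conda environment name when CONDA_DEFAULT_ENV is set,
    "virtualenv" when running inside a venv/virtualenv, otherwise None.
    """
    if environ is None:
        environ = os.environ
    conda_env = environ.get("CONDA_DEFAULT_ENV")
    if conda_env:
        return conda_env
    # Inside a venv, sys.prefix differs from sys.base_prefix
    if sys.prefix != getattr(sys, "base_prefix", sys.prefix):
        return "virtualenv"
    return None


if __name__ == "__main__":
    print("Active environment:", active_environment() or "none (system Python?)")
```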
## Installing with conda
If you're using `conda` as described above, you can install MultiQC from the `bioconda`
channel as follows:
```bash
conda install -c bioconda -c conda-forge multiqc
```
Please see the [Bioconda documentation](https://bioconda.github.io/user/install.html) for more details.
## Installation with pip
This is the easiest way to install MultiQC. `pip` is the package manager for
the Python Package Index. It comes bundled with recent versions of Python,
otherwise you can find installation instructions [here](http://pip.readthedocs.org/en/stable/installing/).
You can now install MultiQC from
[PyPI](https://pypi.python.org/pypi/multiqc) as follows:
```bash
pip install multiqc
```
If you would like the development version, the command is:
```bash
pip install git+https://github.com/ewels/MultiQC.git
```
Note that if you have problems with read-only directories, you can install to
your home directory with the `--user` parameter (though it's probably better
to use virtual environments, as described above).
```bash
pip install --user multiqc
```
## Manual installation
If you'd rather not use either of these tools, you can clone the code and install it yourself:
```bash
git clone https://github.com/ewels/MultiQC.git
cd MultiQC
pip install .
```
`git` not installed? No problem - just download the flat files:
```bash
curl -LOk https://github.com/ewels/MultiQC/archive/master.zip
unzip master.zip
cd MultiQC-master
pip install .
```
Note that it is _not_ recommended to use the command `python setup.py install`
as this has been superseded by `pip` and does not correctly handle some package
management, such as pre-releases.
## Updating MultiQC
You can update MultiQC from [PyPI](https://pypi.python.org/pypi/multiqc)
at any time by running the following command:
```bash
pip install --upgrade multiqc
```
To update the development version, use:
```bash
pip install --force-reinstall git+https://github.com/ewels/MultiQC.git
```
If you cloned the `git` repo, just pull the latest changes and install:
```bash
cd MultiQC
git pull
pip install .
```
If you downloaded the flat files, just repeat the installation procedure.
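To confirm that an update took effect, you can query the installed package version from Python. This is a small sketch using the standard library's `importlib.metadata` (available from Python 3.8); the helper name `installed_version` is hypothetical.

```python
from importlib import metadata  # standard library from Python 3.8


def installed_version(package="multiqc"):
    """Return the installed version string of `package`, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    print("multiqc version:", installed_version() or "not installed")
```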
## Using a specific python interpreter
If you prefer, you can also run MultiQC with a specific python interpreter.
The command line usage and flags are then exactly the same as if you ran just `multiqc`.
For example:
```bash
python -m multiqc .
python3 -m multiqc .
~/my_env/bin/python -m multiqc .
```
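The same idea carries over to launching MultiQC from another Python program: calling the current interpreter with `-m multiqc` ensures that the subprocess uses the same environment as the calling script. The helper names `multiqc_command` and `run_multiqc` below are hypothetical, and the sketch assumes MultiQC is installed for the chosen interpreter.

```python
import subprocess
import sys


def multiqc_command(analysis_dir=".", python=sys.executable, extra_args=()):
    """Build the argument list for `<python> -m multiqc <analysis_dir> [args...]`."""
    return [python, "-m", "multiqc", analysis_dir, *extra_args]


def run_multiqc(analysis_dir=".", python=sys.executable, extra_args=()):
    """Run MultiQC with the chosen interpreter.

    Raises subprocess.CalledProcessError if MultiQC exits with an error.
    """
    return subprocess.run(multiqc_command(analysis_dir, python, extra_args), check=True)
```

For example, `run_multiqc("results", python="/path/to/my_env/bin/python")` would run MultiQC on a `results` directory with that interpreter (both paths are placeholders).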
## Using with a Python script
You can import and run MultiQC from within a Python script, using
the `multiqc.run()` function as follows:
```python
import multiqc
multiqc.run("/path/to/dir")
```
## Installing on Windows
MultiQC has primarily been designed for use on Unix systems (Linux, macOS).
However, it _should_ work on Windows too. Indeed, automated
[continuous integration tests](https://github.com/ewels/MultiQC/actions)
run using GitHub Actions to check compatibility (see test config
[here](https://github.com/ewels/MultiQC/blob/master/.github/workflows/multiqc_windows.yml)).
Note that support for using the base `multiqc` command was improved in MultiQC version 1.8.
## Using the Docker container
A Docker container is provided on Docker Hub called `ewels/multiqc`.
It's based on a `python-slim` base image to give the smallest image size possible.
To use it, call `docker run` with your current working directory mounted as a volume and set as the working directory. Then just specify the MultiQC command at the end as usual:
```bash
docker run -t -v `pwd`:`pwd` -w `pwd` ewels/multiqc multiqc .
```
You can specify additional MultiQC parameters as normal:
```bash
docker run -t -v `pwd`:`pwd` -w `pwd` ewels/multiqc multiqc . --title "My amazing report" -b "This was made with docker"
```
By default, docker will use the `:latest` tag. For MultiQC, this is set to be the most recent release.
To use the most recent development code, use `ewels/multiqc:dev`.
You can also specify specific versions, eg: `ewels/multiqc:1.9`.
Note that all files on the command line (eg. config files) must also be mounted in the docker container to be accessible.
For more help, see [the Docker documentation](https://docs.docker.com/engine/reference/commandline/run/).
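Because every file passed on the command line must be mounted, it can be convenient to assemble the `docker run` arguments programmatically. The following is a sketch under stated assumptions: the function name `docker_multiqc_command` and its parameters are hypothetical, paths are POSIX-style, and each extra file is made visible by mounting its parent directory at the same path inside the container.

```python
import os


def docker_multiqc_command(workdir, multiqc_args=(".",), extra_files=(), image="ewels/multiqc"):
    """Build a `docker run` argument list that mounts `workdir` and any extra files.

    The parent directory of each entry in `extra_files` is mounted at the same
    path inside the container, so the original paths stay valid.
    """
    workdir = os.path.abspath(workdir)
    cmd = ["docker", "run", "-t", "-v", f"{workdir}:{workdir}", "-w", workdir]
    mounted = {workdir}
    for path in extra_files:
        parent = os.path.dirname(os.path.abspath(path))
        # Skip directories that are already mounted (exact match only)
        if parent not in mounted:
            cmd += ["-v", f"{parent}:{parent}"]
            mounted.add(parent)
    return cmd + [image, "multiqc", *multiqc_args]
```

The returned list can be passed straight to `subprocess.run(...)`. Building a list rather than a shell string avoids quoting problems with paths that contain spaces.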
### Docker bash alias
The above base command is a little verbose, so if you are using this a lot it may be worth adding the following bash alias to your `~/.bashrc` file:
```bash
# Single quotes delay evaluation of `pwd` until the alias is actually used
alias multiqc='docker run -tv `pwd`:`pwd` -w `pwd` ewels/multiqc'
```
Once applied (you may need to reload your shell if added to your `.bashrc`) you can then just use `multiqc` instead:
```bash
multiqc .
```
## Using Singularity
Although there is no dedicated Singularity image available for MultiQC, you can use the above Docker container.
First, build a Singularity container image from the Docker image (where `1.9` is the MultiQC version):
```bash
singularity build multiqc-1.9.sif docker://ewels/multiqc:1.9
```
Then, use `singularity run` to run the image with the normal MultiQC arguments:
```bash
singularity run multiqc-1.9.sif my_results/ --title "Report made using Singularity"
```
### Import errors with Singularity
Sometimes, Singularity can be over-ambitious with sharing file paths which can result in the Python
environment in your local system interacting with Python inside the image.
This can give rise to `ImportError` errors for `numpy` and other packages.
The giveaway for when this is the problem is that the traceback will list Python package paths which
are on your system and look different from those of MultiQC inside the container (eg. `/usr/lib/python3.8/site-packages/multiqc/`).
To fix this, run the command `export PYTHONNOUSERSITE=1` before running MultiQC.
This variable [tells Python](https://docs.python.org/3/using/cmdline.html#envvar-PYTHONNOUSERSITE)
not to add the user site-packages directory to `sys.path` when loading, which should avoid the conflicts.
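If you launch Singularity from a Python script rather than a shell, the same variable can be set for just that one subprocess. This is a minimal sketch: the helper names `isolated_env` and `run_singularity_multiqc` are hypothetical, and it assumes that `singularity` is on your `PATH` and passes the host environment through to the container.

```python
import os
import subprocess


def isolated_env(environ=None):
    """Return a copy of the environment with PYTHONNOUSERSITE=1 set."""
    env = dict(os.environ if environ is None else environ)
    env["PYTHONNOUSERSITE"] = "1"
    return env


def run_singularity_multiqc(image, multiqc_args):
    """Run `singularity run <image> <args...>` with the user site-packages disabled."""
    cmd = ["singularity", "run", image, *multiqc_args]
    return subprocess.run(cmd, env=isolated_env(), check=True)
```

Setting the variable on a copy of the environment leaves the calling process untouched, so other tools started by the same script are not affected.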
## Using the nix flake
If you're using the [nix package manager](https://nixos.org/download.html#download-nixm) with [flakes](https://nixos.wiki/wiki/Flakes) enabled, you can
run `nix develop` in the MultiQC repository to enter a shell
with required dependencies. To build MultiQC, run `nix build`.
## Python 2
As of MultiQC version 1.9, **Python 2 is no longer officially supported**.
Automatic CI tests will no longer run with Python 2 and Python 2 specific workarounds
are no longer guaranteed.
Whilst it may be possible to continue using MultiQC with Python 2 for a short time by
pinning dependencies, MultiQC compatibility for Python 2 will now slowly drift and start
to break. If you haven't already, **you need to switch to Python 3 now**.
Python 2 had its [official sunset date](https://www.python.org/doc/sunset-python-2/)
on January 1st 2020, meaning that it will no longer be developed by the Python community.
Part of the [python.org statement](https://www.python.org/doc/sunset-python-2/) reads:
> That means that we will not improve it anymore after that day,
> even if someone finds a security problem in it.
> You should upgrade to Python 3 as soon as you can.
[Very many Python packages no longer support Python 2](https://python3statement.org/)
and whilst the MultiQC code is currently compatible with both Python 2 and Python 3,
it is increasingly difficult to maintain compatibility with the dependency packages it
uses, such as Matplotlib, NumPy and more.
## Using MultiQC through Galaxy
### On the main Galaxy instance
The easiest and fastest way to use MultiQC is through the [usegalaxy.org](https://usegalaxy.org/) main Galaxy instance, where you will find the [MultiQC Galaxy tool](https://usegalaxy.org/?tool_id=toolshed.g2.bx.psu.edu%2Frepos%2Fengineson%2Fmultiqc%2Fmultiqc%2F1.0.0.0&version=1.0.0.0&__identifer=2sjdq8d9r3l) under the _NGS: QC and manipulation_ tool panel section.
### On your instance
You can install MultiQC on your own Galaxy instance through your Galaxy admin space, searching on the [main Toolshed](https://toolshed.g2.bx.psu.edu/) for the [MultiQC repository](https://toolshed.g2.bx.psu.edu/view/iuc/multiqc/3bad335ccea9) available under the _visualization_, _statistics_ and _Fastq Manipulation_ sections.
## Installing on FreeBSD
If you're using FreeBSD you can install MultiQC via the FreeBSD ports system:
```bash
pkg install py36-multiqc
```
_(or `py27-multiqc`, `py37-multiqc`, or any other currently mainstream python version)._
This will install a prebuilt binary using only highly-portable
optimizations, much like `apt`, `yum`, etc.
FreeBSD ports can also be built and installed from source:
```bash
cd /usr/ports/biology/py-multiqc
make install
```
To report issues with a FreeBSD port, please submit a PR on the
[FreeBSD bug reports page](https://www.freebsd.org/support/bugreports.html).
For more information, visit [https://www.freebsd.org/ports/](https://www.freebsd.org/ports/index.html).
## Installing as an environment module
Many people using MultiQC will be working in an HPC environment.
Every server / cluster is different, and you're probably best off asking
your friendly sysadmin to install MultiQC for you. However, with that
in mind, here are a few general tips for installing MultiQC into an
environment module system:
MultiQC comes in two parts - the `multiqc` python package and the
`multiqc` executable script. The former must be available in `$PYTHONPATH`
and the script must be available on the `$PATH`.
A typical installation procedure with an environment module Python install
might look like this: _(Note that `$PYTHONPATH` must be defined before `pip` installation.)_
```bash
VERSION=0.7
INST=/path/to/software/multiqc/$VERSION
module load python/2.7.6
mkdir $INST
export PYTHONPATH=$INST/lib/python2.7/site-packages
pip install --install-option="--prefix=$INST" multiqc
```
Once installed, you'll need to create an environment module file.
Again, these vary between systems a lot, but here's an example:
```tcl
#%Module1.0#####################################################################
##
## MultiQC
##
set components [ file split [ module-info name ] ]
set version [ lindex $components 1 ]
set modroot /path/to/software/multiqc/$version
proc ModulesHelp { } {
global version modroot
puts stderr "\tMultiQC - use MultiQC $version"
puts stderr "\n\tVersion $version\n"
}
module-whatis "Loads MultiQC environment."
# load required modules
module load python/2.7.6
# only one version at a time
conflict multiqc
# Make the directories available
prepend-path PATH $modroot/bin
prepend-path PYTHONPATH $modroot/lib/python2.7/site-packages
```
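After loading the module, it is worth checking that both parts described above are visible: the package on the Python path and the executable on `$PATH`. The sketch below uses only the standard library; the function name `check_multiqc_install` is hypothetical.

```python
import importlib.util
import shutil


def check_multiqc_install():
    """Report whether the `multiqc` package and executable can both be found."""
    return {
        # True if `import multiqc` would find the package (e.g. via $PYTHONPATH)
        "package_importable": importlib.util.find_spec("multiqc") is not None,
        # True if a `multiqc` executable is found on $PATH
        "executable_on_path": shutil.which("multiqc") is not None,
    }


if __name__ == "__main__":
    for check, ok in check_multiqc_install().items():
        print(check, "yes" if ok else "NO")
```

If only one of the two checks passes, the module file is probably missing one of its `prepend-path` lines or points at the wrong directory.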