---
Name: mosdepth
URL: https://github.com/brentp/mosdepth
Description: >
  Mosdepth performs fast BAM/CRAM depth calculation for WGS, exome, or targeted sequencing.
---

[Mosdepth](https://github.com/brentp/mosdepth/) performs fast BAM/CRAM depth calculation for WGS, exome, or targeted sequencing.

It can generate several output files, all sharing a common prefix but with different endings:

- per-base depth (`{prefix}.per-base.bed.gz`),
- mean per-window depth given a window size (`{prefix}.regions.bed.gz`, if a window size is provided with `--by`),
- mean per-region depth given a BED file of regions (`{prefix}.regions.bed.gz`, if a BED file is provided with `--by`),
- a distribution of the proportion of bases covered at or above a given threshold for each chromosome and genome-wide (`{prefix}.mosdepth.global.dist.txt` and `{prefix}.mosdepth.region.dist.txt`),
- quantized output that merges adjacent bases as long as they fall in the same coverage bins (`{prefix}.quantized.bed.gz`),
- threshold output to indicate how many bases in each region are covered at the given thresholds (`{prefix}.thresholds.bed.gz`)

The MultiQC module plots coverage distributions from 2 kinds of outputs:

- `{prefix}.mosdepth.region.dist.txt`
- `{prefix}.mosdepth.global.dist.txt`

The module uses the "region" file if it exists, and otherwise falls back to "global". It plots three figures:

- Proportion of bases in the reference genome with, at least, a given depth of coverage (cumulative coverage distribution).
- Proportion of bases in the reference genome with a given depth of coverage (absolute coverage distribution).
- Average coverage per contig/chromosome.
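
The relationship between the cumulative and absolute distributions can be sketched in Python. This is an illustrative sketch, not the MultiQC implementation: it assumes each line of a `*.dist.txt` file is tab-separated as contig (or `total`), depth, and the proportion of bases covered at or above that depth, and that every depth from 0 to the maximum is listed. The function names and the example data are hypothetical.

```python
from collections import defaultdict


def parse_dist(text):
    """Parse dist-style text into {contig: {depth: cumulative_proportion}}.

    Assumed line format: contig <TAB> depth <TAB> proportion of bases at >= depth.
    """
    cumulative = defaultdict(dict)
    for line in text.splitlines():
        if not line.strip():
            continue
        contig, depth, proportion = line.split("\t")
        cumulative[contig][int(depth)] = float(proportion)
    return dict(cumulative)


def absolute_from_cumulative(cum):
    """Convert P(depth >= d) into P(depth == d) by differencing adjacent depths."""
    absolute = {}
    for depth in sorted(cum):
        # A missing next depth is treated as 0: no bases are covered that deeply.
        absolute[depth] = cum[depth] - cum.get(depth + 1, 0.0)
    return absolute


def mean_depth(cum):
    """Mean depth via the identity E[X] = sum over d >= 1 of P(X >= d)."""
    return sum(p for depth, p in cum.items() if depth >= 1)


# Made-up example data, listed from the highest depth down to 0.
example = "total\t3\t0.10\ntotal\t2\t0.40\ntotal\t1\t0.90\ntotal\t0\t1.00\n"
dist = parse_dist(example)
absolute = absolute_from_cumulative(dist["total"])
```

Differencing works only because the cumulative values are monotonically non-increasing in depth; the same cumulative values also give a mean depth without needing the absolute distribution.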

The module also reports the percentage of the genome covered at each of a set of thresholds in the General Stats section.
The default thresholds are 1, 5, 10, 30, 50, which can be customised in the config as follows:

```yaml
mosdepth_config:
  general_stats_coverage:
    - 10
    - 20
    - 40
    - 200
    - 30000
```
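
As a rough illustration of what these thresholds select, the sketch below looks up the percentage of bases covered at or above each configured depth in a cumulative distribution. It is not MultiQC's code: the `percent_covered` helper and the example proportions are hypothetical, and it assumes the distribution is a mapping from depth to the proportion of bases with at least that depth.

```python
def percent_covered(cum, thresholds):
    """Return {threshold: percentage of bases with depth >= threshold}.

    `cum` maps depth -> proportion of bases covered at or above that depth.
    A threshold with no entry is reported as 0, on the assumption that it is
    deeper than any observed coverage.
    """
    return {t: 100.0 * cum.get(t, 0.0) for t in thresholds}


# Made-up cumulative proportions; only a few depths are shown for brevity.
cum = {0: 1.0, 10: 0.92, 20: 0.85, 40: 0.30, 200: 0.01}
percentages = percent_covered(cum, [10, 20, 40, 200, 30000])
```

A very high threshold such as 30000 simply yields 0 for a sample that is never covered that deeply.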

You can also specify which columns are hidden when the report loads (by default, all values are hidden except 30X):

```yaml
mosdepth_config:
  general_stats_coverage_hidden:
    - 10
    - 20
    - 200
```

For the per-contig coverage plot, you can include and exclude contigs based on name or pattern.

For example, you could add the following to your MultiQC config file:

```yaml
mosdepth_config:
  include_contigs:
    - "chr*"
  exclude_contigs:
    - "*_alt"
    - "*_decoy"
    - "*_random"
    - "chrUn*"
    - "HLA*"
    - "chrM"
    - "chrEBV"
```

Note that exclusion supersedes inclusion for the contig filters.
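
The precedence rule can be sketched as follows. This is a minimal illustration that assumes shell-style glob matching via Python's `fnmatch` module; the matching that MultiQC actually performs may differ, and the `keep_contig` helper is hypothetical.

```python
from fnmatch import fnmatchcase


def keep_contig(name, include=None, exclude=None):
    """Decide whether a contig is kept, with exclusion taking precedence."""
    # Exclusion is checked first, so it wins even if an include pattern matches.
    if exclude and any(fnmatchcase(name, pattern) for pattern in exclude):
        return False
    # With an include list, only matching contigs are kept.
    if include:
        return any(fnmatchcase(name, pattern) for pattern in include)
    # With no include list, everything not excluded is kept.
    return True


include = ["chr*"]
exclude = ["*_alt", "*_decoy", "*_random", "chrUn*", "HLA*", "chrM", "chrEBV"]
contigs = [
    "chr1",
    "chr1_KI270706v1_random",
    "chrUn_GL000195v1",
    "chrM",
    "HLA-A*01:01:01:01",
    "1",
    "chr22_KI270879v1_alt",
    "chrX",
]
kept = [c for c in contigs if keep_contig(c, include, exclude)]
```

Here `chrM` matches the include pattern `chr*` but is still dropped because it also matches an exclude pattern, while `1` is dropped because it matches no include pattern.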

If you want to see what is being excluded, you can set `show_excluded_debug_logs` to `True`:

```yaml
mosdepth_config:
  show_excluded_debug_logs: True
```

This will then print a debug log message (use `multiqc -v`) for each excluded contig.
This is disabled by default because the number of excluded contigs can be very large in some cases.