File: advanced-usage.rst

package info (click to toggle)
bedtools 2.31.1%2Bdfsg-2
  • links: PTS, VCS
  • area: main
  • in suites: forky, sid, trixie
  • size: 57,304 kB
  • sloc: ansic: 38,507; cpp: 29,721; sh: 8,001; makefile: 663; python: 240; javascript: 16
file content (105 lines) | stat: -rwxr-xr-x 3,261 bytes parent folder | download | duplicates (3)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
###############
Advanced usage
###############


==========================================================================
Mask all regions in a genome except for targeted capture regions.
==========================================================================

Step 1. Add 500 bp up and downstream of each probe

.. code-block:: bash

  bedtools slop -i probes.bed -g hg18.genome -b 500 > probes.500bp.bed
  
NB genome is two column chromosome size list - i.e. https://genome.ucsc.edu/goldenpath/help/hg18.chrom.sizes

Step 2. Get a BED file of all regions not covered by the probes (+500 bp up/down)


.. code-block:: bash

  bedtools complement -i probes.500bp.bed -g hg18.genome > probes.500bp.complement.bed
  

Step 3. Create a masked genome where all bases are masked except for the probes +500bp

.. code-block:: bash

  bedtools maskfasta -fi hg18.fa -bed probes.500bp.complement.bed \
  -fo hg18.probecomplement.masked.fa



==========================================================================
Screening for novel SNPs.
==========================================================================
Find all SNPs that are not in dbSnp and not in the latest 1000 genomes calls

.. code-block:: bash

  bedtools intersect -a snp.calls.bed -b dbSnp.bed -v | \ 
  bedtools intersect -a - -b 1KG.bed -v | \
  > snp.calls.novel.bed


==========================================================================
Computing the coverage of features that align entirely within an interval.
==========================================================================

By default, bedtools ``coverage`` counts any feature in A that overlaps B 
by >= 1 bp. If you want to require that a feature align entirely within B for 
it to be counted, you can first use intersectBed with the "-f 1.0" option.

.. code-block:: bash

  bedtools intersect -a features.bed -b windows.bed -f 1.0 | \
  bedtools coverage -a windows.bed -b - \
  > windows.bed.coverage


==========================================================================
Computing the coverage of BAM alignments on exons.
==========================================================================
One can combine ``samtools`` with ``bedtools`` to compute coverage directly 
from the BAM data by using ``bamtobed``.

.. code-block:: bash

  bedtools bamtobed -i reads.bam | \
  bedtools coverage -a exons.bed -b - \
  > exons.bed.coverage
  

Take it a step further and require that coverage be from properly-paired reads.

.. code-block:: bash

  samtools view -uf 0x2 reads.bam | \
  coverageBed -abam - -b exons.bed \
  > exons.bed.proper.coverage



==========================================================================
Computing coverage separately for each strand.
==========================================================================
Use grep to only look at forward strand features (i.e. those that end in "+").

.. code-block:: bash

  bedtools bamtobed -i reads.bam | \
  grep \+$  | \
  bedtools coverage -a - -b genes.bed \
  > genes.bed.forward.coverage

Use grep to only look at reverse strand features (i.e. those that end in "-").

.. code-block:: bash

  bedtools bamtobed -i reads.bam | \
  grep \-$ | \
  bedtools coverage -a - -b genes.bed \
  > genes.bed.reverse.coverage