.. _annotate:

###############
*annotate*
###############
``bedtools annotate``, well, annotates one BED/VCF/GFF file with the coverage 
and number of overlaps observed from multiple other BED/VCF/GFF files. 
In this way, a single command can show to what degree each feature coincides 
with several other feature types.

==========================================================================
Usage and option summary
==========================================================================
**Usage**:
::

  bedtools annotate [OPTIONS] -i <BED/GFF/VCF> -files FILE1 FILE2 FILE3 ... FILEn

**(or)**:
::

  annotateBed [OPTIONS] -i <BED/GFF/VCF> -files FILE1 FILE2 FILE3 ... FILEn
  
  
============  ====================================================================================================
Option        Description
============  ====================================================================================================
**-names**    A list of names (one per file) to describe each file in ``-files``. These names will be printed as a header line.
**-counts**   Report the count of features in each file that overlap ``-i``. Default behavior is to report the fraction of ``-i`` covered by each file.
**-both**     Report the count of features followed by the fraction covered for each annotation file. Default is to report solely the fraction of ``-i`` covered by each file.
**-s**        Force strandedness. That is, only include hits in the annotation files that overlap ``-i`` on the same strand. By default, hits are included without respect to strand.
**-S**        Require different strandedness. That is, only report hits in the annotation files that overlap ``-i`` on the *opposite* strand. By default, overlaps are reported without respect to strand.
============  ====================================================================================================

==========================================================================
Default behavior - annotate one file with coverage from others.
==========================================================================
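By default, each feature in the file to be annotated is printed in full, 
followed by one extra column per annotation file giving the fraction of the 
feature covered by that file. The extra columns follow the order of the files 
given to ``-files``.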

.. code-block:: bash

  $ cat variants.bed
  chr1 100  200   nasty 1  -
  chr2 500  1000  ugly  2  +
  chr3 1000 5000  big   3  -

  $ cat genes.bed
  chr1 150  200   geneA 1  +
  chr1 175  250   geneB 2  +
  chr3 0    10000 geneC 3  -

  $ cat conserve.bed
  chr1 0    10000 cons1 1  +
  chr2 700  10000 cons2 2  -
  chr3 4000 10000 cons3 3  +

  $ cat known_var.bed
  chr1 0    120   known1   -
  chr1 150  160   known2   -
  chr2 0    10000 known3   +

  $ bedtools annotate -i variants.bed -files genes.bed conserve.bed known_var.bed
  chr1	100	200	nasty	1	-	0.500000	1.000000	0.300000	
  chr2	500	1000	ugly	2	+	0.000000	0.600000	1.000000	
  chr3	1000	5000	big	3	-	1.000000	0.250000	0.000000
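
The fractions above can be checked by hand. The following is a minimal Python
sketch of the arithmetic only, not of how bedtools is implemented; the function
name ``fraction_covered`` and the tuple layout are assumptions made for
illustration. Overlapping annotations are counted once, which is what the
``0.500000`` for ``nasty`` against ``genes.bed`` implies.

.. code-block:: python

  def fraction_covered(feature, annotations):
      """Fraction of feature (chrom, start, end) covered by the union of annotations."""
      chrom, start, end = feature
      # Clip every overlapping annotation on the same chromosome to the feature.
      clipped = sorted(
          (max(start, a_start), min(end, a_end))
          for a_chrom, a_start, a_end in annotations
          if a_chrom == chrom and a_start < end and a_end > start
      )
      covered = 0
      cursor = start  # everything left of cursor has already been counted
      for c_start, c_end in clipped:
          if c_end > cursor:
              covered += c_end - max(c_start, cursor)
              cursor = c_end
      return covered / (end - start)


  nasty = ("chr1", 100, 200)
  genes = [("chr1", 150, 200), ("chr1", 175, 250), ("chr3", 0, 10000)]
  known_var = [("chr1", 0, 120), ("chr1", 150, 160), ("chr2", 0, 10000)]

  print(f"{fraction_covered(nasty, genes):.6f}")      # 0.500000
  print(f"{fraction_covered(nasty, known_var):.6f}")  # 0.300000

For ``nasty``, the two genes together cover bases 150 to 200 (50 of 100), and
the two known variants cover 100 to 120 plus 150 to 160 (30 of 100).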


==========================================================================
``-counts`` Report the count of hits from the annotation files
==========================================================================

.. code-block:: bash

  $ bedtools annotate -counts -i variants.bed -files genes.bed conserve.bed known_var.bed
  chr1	100	200	nasty	1	-	2	1	2	
  chr2	500	1000	ugly	2	+	0	1	1	
  chr3	1000	5000	big	3	-	1	1	0
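
The counts above can be reproduced by counting, for each feature, the
annotation records on the same chromosome whose intervals overlap it. The
sketch below is illustrative Python, not bedtools code; ``count_overlaps`` and
the data layout are hypothetical names chosen for the example.

.. code-block:: python

  def count_overlaps(feature, annotations):
      """Number of annotation intervals that overlap feature (chrom, start, end)."""
      chrom, start, end = feature
      return sum(
          1
          for a_chrom, a_start, a_end in annotations
          if a_chrom == chrom and a_start < end and a_end > start
      )


  variants = {
      "nasty": ("chr1", 100, 200),
      "ugly": ("chr2", 500, 1000),
      "big": ("chr3", 1000, 5000),
  }
  genes = [("chr1", 150, 200), ("chr1", 175, 250), ("chr3", 0, 10000)]
  conserve = [("chr1", 0, 10000), ("chr2", 700, 10000), ("chr3", 4000, 10000)]
  known_var = [("chr1", 0, 120), ("chr1", 150, 160), ("chr2", 0, 10000)]

  counts = {
      name: [count_overlaps(feature, f) for f in (genes, conserve, known_var)]
      for name, feature in variants.items()
  }
  for name, row in counts.items():
      print(name, *row)
  # nasty 2 1 2
  # ugly 0 1 1
  # big 1 1 0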



===========================================================================================
``-both`` Report both the count of hits and the fraction covered from the annotation files
===========================================================================================

.. code-block:: bash

  $ bedtools annotate -both -i variants.bed -files genes.bed conserve.bed known_var.bed
  #chr	start	end	name	score	+/-	cnt1	pct1	cnt2	pct2	cnt3	pct3
  chr1	100	200	nasty	1	-	2	0.500000	1	1.000000	2	0.300000	
  chr2	500	1000	ugly	2	+	0	0.000000	1	0.600000	1	1.000000	
  chr3	1000	5000	big	3	-	1	1.000000	1	0.250000	0	0.000000


  
  
==========================================================================
``-s`` Restrict the reporting to overlaps on the **same** strand.
==========================================================================

.. code-block:: bash

  $ bedtools annotate -s -i variants.bed -files genes.bed conserve.bed known_var.bed
  chr1	100	200	nasty	1	-	0.000000	0.000000	0.000000	
  chr2	500	1000	ugly	2	+	0.000000	0.000000	0.000000	
  chr3	1000	5000	big	3	-	1.000000	0.000000	0.000000

Note that ``known_var.bed`` as shown has only five columns, so its strand
symbol is not in the sixth column where BED expects it. This may be why the
last column above is zero even for ``ugly``, which carries the same ``+`` sign
as ``known3``.



==========================================================================
``-S`` Restrict the reporting to overlaps on the **opposite** strand.
==========================================================================

.. code-block:: bash

  $ bedtools annotate -S -i variants.bed -files genes.bed conserve.bed known_var.bed
  chr1	100	200	nasty	1	-	0.500000	1.000000	0.300000	
  chr2	500	1000	ugly	2	+	0.000000	0.600000	1.000000	
  chr3	1000	5000	big	3	-	0.000000	0.250000	0.000000
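
The effect of ``-s`` and ``-S`` can be pictured as a strand filter applied
before coverage is computed. The Python sketch below is an illustration under
that assumption rather than the bedtools implementation; ``covered_fraction``
and ``strand_mode`` are hypothetical names. It uses only ``genes.bed`` and
``conserve.bed``, which carry a strand in the sixth column.

.. code-block:: python

  def covered_fraction(feature, annotations, strand_mode=None):
      """Fraction of feature covered, keeping only hits allowed by strand_mode.

      strand_mode is None (ignore strand), "same" (like -s) or "opposite" (like -S).
      """
      chrom, start, end, strand = feature
      kept = []
      for a_chrom, a_start, a_end, a_strand in annotations:
          if a_chrom != chrom or a_start >= end or a_end <= start:
              continue  # no overlap with the feature
          if strand_mode == "same" and a_strand != strand:
              continue
          if strand_mode == "opposite" and a_strand == strand:
              continue
          kept.append((max(start, a_start), min(end, a_end)))
      covered, cursor = 0, start
      for c_start, c_end in sorted(kept):
          if c_end > cursor:
              covered += c_end - max(c_start, cursor)
              cursor = c_end
      return covered / (end - start)


  variants = [("chr1", 100, 200, "-"), ("chr2", 500, 1000, "+"), ("chr3", 1000, 5000, "-")]
  genes = [("chr1", 150, 200, "+"), ("chr1", 175, 250, "+"), ("chr3", 0, 10000, "-")]
  conserve = [("chr1", 0, 10000, "+"), ("chr2", 700, 10000, "-"), ("chr3", 4000, 10000, "+")]

  for mode in ("same", "opposite"):
      for v in variants:
          row = [f"{covered_fraction(v, f, mode):.6f}" for f in (genes, conserve)]
          print(mode, v[0], *row)
  # same chr1 0.000000 0.000000
  # same chr2 0.000000 0.000000
  # same chr3 1.000000 0.000000
  # opposite chr1 0.500000 1.000000
  # opposite chr2 0.000000 0.600000
  # opposite chr3 0.000000 0.250000

These values match the first two annotation columns of the ``-s`` and ``-S``
outputs shown above.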