.. _multiinter:

###############
*multiinter*
###############

|

.. image:: ../images/tool-glyphs/multiinter-glyph.png 
    :width: 600pt 

|

``bedtools multiinter`` identifies the intervals that are common to multiple
sorted BED/GFF/VCF files, or to any subset of those files.

.. note::

  1. All files must be sorted in the same manner (e.g., ``sort -k 1,1 -k2,2n in.bed > in.sorted.bed``).
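
The sorting requirement above can be checked before running the tool. The following is a minimal sketch, not part of bedtools: the helper name ``is_sorted_bed`` is hypothetical, and it assumes chromosome names are compared by plain character order (as ``LC_ALL=C sort`` would do) and start coordinates numerically.

```python
def is_sorted_bed(lines):
    """Return True if records are ordered by chromosome, then by numeric start."""
    prev = None
    for line in lines:
        line = line.strip()
        # Skip blank lines and comment lines.
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        key = (fields[0], int(fields[1]))
        if prev is not None and key < prev:
            return False
        prev = key
    return True


# The four records of a.bed from this page, in sorted order.
a_bed = "chr1\t6\t12\nchr1\t10\t20\nchr1\t22\t27\nchr1\t24\t30\n"
# The same records reversed, purely to illustrate a failing check.
a_bed_reversed = "\n".join(reversed(a_bed.splitlines()))

print(is_sorted_bed(a_bed.splitlines()))           # True
print(is_sorted_bed(a_bed_reversed.splitlines()))  # False
```

If the system locale orders chromosome names differently from plain character order, the check and the ``sort`` command may disagree, so it is safest to sort every input with the same command and locale.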



===============================
Usage and option summary
===============================
**Usage**:
::

  bedtools multiinter [OPTIONS] -i FILE1 FILE2 .. FILEn

**(or)**:
::

  multiIntersect [OPTIONS]  -i FILE1 FILE2 .. FILEn



===========================      ==========================================================================
 Option                           Description
===========================      ==========================================================================
**-header**                      Print a header line (chrom/start/end + names of each file).
**-names**                       A list of names (one per file) to describe each file in ``-i``.
                                 These names will be printed in the header line.
**-g**                           Use a genome file to calculate empty regions.
**-empty**                       | Report empty regions (i.e., start/end intervals with no
                                 | values in any file). Requires the ``-g FILE`` parameter.
**-filler TEXT**                 | Use TEXT when representing intervals having no value.
                                 | Default is '0', but you can use 'N/A' or any text.
**-examples**                    Show usage examples on the command line.
===========================      ==========================================================================




==========================================================================
Default behavior
==========================================================================
By default, ``bedtools multiinter`` inspects all of the intervals in each input file and
reports the sub-intervals that are overlapped by 1, 2, ... N files. Sub-intervals that are
overlapped by no file are reported only when ``-empty`` is used.
The default output format is as follows:

1. Chromosome.
2. 0-based start coordinate of the sub-interval.
3. 1-based end coordinate of the sub-interval.
4. The number of files with at least one interval overlapping this sub-interval.
5. The list of file numbers (by order on the command line) with at least one interval overlapping this sub-interval.
6. One column per input file, indicating whether that file had (1) or did not have (0) at least one interval overlapping this sub-interval.
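
For downstream scripting, the six-part layout above maps naturally onto a small record. The following is a minimal sketch, not part of bedtools: the function name ``parse_multiinter_line`` and the dictionary keys are hypothetical, and it assumes tab-separated output with the default ``0``/``1`` per-file columns.

```python
def parse_multiinter_line(line):
    """Split one tab-separated multiinter output line into named fields."""
    fields = line.rstrip("\n").split("\t")
    return {
        "chrom": fields[0],
        "start": int(fields[1]),   # 0-based start
        "end": int(fields[2]),     # 1-based end
        "num": int(fields[3]),     # number of files overlapping the sub-interval
        # "none" marks a sub-interval covered by no file (seen with -empty).
        "list": [] if fields[4] == "none" else fields[4].split(","),
        # One 0/1 flag per input file, in command-line order.
        "flags": [int(x) for x in fields[5:]],
    }


record = parse_multiinter_line("chr1\t8\t12\t2\t1,3\t1\t0\t1\n")
print(record["num"], record["list"], record["flags"])  # 2 ['1', '3'] [1, 0, 1]
```

The integer conversion of the per-file columns would need adjusting if a non-numeric ``-filler`` text appears in them.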

For example:

.. code-block:: bash

    $ cat a.bed
    chr1  6   12
    chr1  10  20
    chr1  22  27
    chr1  24  30
    
    $ cat b.bed
    chr1  12  32
    chr1  14  30

    $ cat c.bed
    chr1  8   15
    chr1  10  14
    chr1  32  34

    $ cat sizes.txt
    chr1  5000

    $ bedtools multiinter -i a.bed b.bed c.bed
    chr1	6	8	1	1	1	0	0
    chr1	8	12	2	1,3	1	0	1
    chr1	12	15	3	1,2,3	1	1	1
    chr1	15	20	2	1,2	1	1	0
    chr1	20	22	1	2	0	1	0
    chr1	22	30	2	1,2	1	1	0
    chr1	30	32	1	2	0	1	0
    chr1	32	34	1	3	0	0	1
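
To make the sub-interval logic concrete, the example above can be reproduced with a small sweep over interval boundaries. This is an illustrative sketch only and not a description of how bedtools is implemented: the function name ``multi_intersect`` is hypothetical, and the inputs are the three example files written out as Python lists.

```python
from itertools import groupby


def multi_intersect(files):
    """files: one list of (chrom, start, end) tuples per input file."""
    # Turn every interval into a +1 event at its start and a -1 event at its end.
    events = {}
    for idx, intervals in enumerate(files):
        for chrom, start, end in intervals:
            events.setdefault(chrom, []).append((start, idx, 1))
            events[chrom].append((end, idx, -1))

    rows = []
    for chrom in sorted(events):
        depth = [0] * len(files)   # open intervals per file at the current position
        flags = [0] * len(files)   # 1 if the file covers the current sub-interval
        seg_start = None
        for pos, group in groupby(sorted(events[chrom]), key=lambda e: e[0]):
            for _, idx, delta in group:
                depth[idx] += delta
            new_flags = [int(d > 0) for d in depth]
            # A sub-interval ends only where the set of covering files changes.
            if new_flags != flags:
                if any(flags):
                    hits = [str(i + 1) for i, f in enumerate(flags) if f]
                    rows.append((chrom, seg_start, pos, len(hits), ",".join(hits), *flags))
                flags, seg_start = new_flags, pos
    return rows


a = [("chr1", 6, 12), ("chr1", 10, 20), ("chr1", 22, 27), ("chr1", 24, 30)]
b = [("chr1", 12, 32), ("chr1", 14, 30)]
c = [("chr1", 8, 15), ("chr1", 10, 14), ("chr1", 32, 34)]

for row in multi_intersect([a, b, c]):
    print("\t".join(str(x) for x in row))
```

Running the sketch prints the same eight rows as the ``bedtools multiinter`` output shown above. Whether bedtools treats every edge case (for example, abutting intervals within one file) the same way is not covered by this sketch.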

 
==========================================================================
``-header`` Add a header with column names
==========================================================================
For example:

.. code-block:: bash

    $ bedtools multiinter -header -i a.bed b.bed c.bed
    chrom	start	end	num	list	a.bed	b.bed	c.bed
    chr1	6	8	1	1	1	0	0
    chr1	8	12	2	1,3	1	0	1
    chr1	12	15	3	1,2,3	1	1	1
    chr1	15	20	2	1,2	1	1	0
    chr1	20	22	1	2	0	1	0
    chr1	22	30	2	1,2	1	1	0
    chr1	30	32	1	2	0	1	0
    chr1	32	34	1	3	0	0	1

==========================================================================
``-names`` Add custom labels for each file in the header
==========================================================================

For example:

.. code-block:: bash

    $ bedtools multiinter -header -names A B C -i a.bed b.bed c.bed
    chrom	start	end	num	list	A	B	C
    chr1	6	8	1	A	1	0	0
    chr1	8	12	2	A,C	1	0	1
    chr1	12	15	3	A,B,C	1	1	1
    chr1	15	20	2	A,B	1	1	0
    chr1	20	22	1	B	0	1	0
    chr1	22	30	2	A,B	1	1	0
    chr1	30	32	1	B	0	1	0
    chr1	32	34	1	C	0	0	1

==========================================================================
``-empty`` Report the sub-intervals not covered by any file
==========================================================================
Note that this option requires a ``-g`` file so that it knows the full 
range of each chromosome or contig.


For example:

.. code-block:: bash
  
    $ bedtools multiinter -header -names A B C -i a.bed b.bed c.bed -empty -g sizes.txt
    chrom	start	end	num	list	A	B	C
    chr1	0	6	0	none	0	0	0
    chr1	6	8	1	A	1	0	0
    chr1	8	12	2	A,C	1	0	1
    chr1	12	15	3	A,B,C	1	1	1
    chr1	15	20	2	A,B	1	1	0
    chr1	20	22	1	B	0	1	0
    chr1	22	30	2	A,B	1	1	0
    chr1	30	32	1	B	0	1	0
    chr1	32	34	1	C	0	0	1
    chr1	34	5000	0	none	0	0	0
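
The role of the genome file can be illustrated by computing the uncovered regions by hand. The following is a minimal sketch under stated assumptions, not bedtools' own code: the function name ``empty_regions`` is hypothetical, the covered sub-intervals are taken from the example output, and the chromosome length comes from ``sizes.txt``.

```python
def empty_regions(covered, chrom_sizes):
    """Return the gaps left by sorted (chrom, start, end) intervals.

    chrom_sizes maps each chromosome name to its length, as in a genome file.
    Every chromosome in `covered` is assumed to be present in chrom_sizes.
    """
    gaps = []
    cursor = {chrom: 0 for chrom in chrom_sizes}  # end of coverage seen so far
    for chrom, start, end in covered:
        if start > cursor[chrom]:
            gaps.append((chrom, cursor[chrom], start))
        cursor[chrom] = max(cursor[chrom], end)
    # Without the chromosome length, the trailing gap could not be known.
    for chrom, size in chrom_sizes.items():
        if cursor[chrom] < size:
            gaps.append((chrom, cursor[chrom], size))
    return sorted(gaps)


covered = [("chr1", 6, 8), ("chr1", 8, 12), ("chr1", 12, 15), ("chr1", 15, 20),
           ("chr1", 20, 22), ("chr1", 22, 30), ("chr1", 30, 32), ("chr1", 32, 34)]
sizes = {"chr1": 5000}

print(empty_regions(covered, sizes))  # [('chr1', 0, 6), ('chr1', 34, 5000)]
```

The two gaps match the rows with ``num`` equal to 0 in the ``-empty`` output above.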