File: bed_merge_overlapping.py

#!/usr/bin/python3

"""
Merge any overlapping regions of bed files. Bed files can be provided on the
command line or on stdin. Merged regions are always reported on the '+'
strand, and any fields beyond chrom/start/stop are lost.

usage: %prog bed files ...
"""

import fileinput
import sys

from bx.bitset_builders import binned_bitsets_from_bed_file

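# Read from the files named on the command line, or from stdin if none are given.
# The stream is called bed_input so it does not shadow the built-in input().
bed_filenames = sys.argv[1:]
if bed_filenames:
    bed_input = fileinput.input(bed_filenames)
else:
    bed_input = sys.stdin

# Build one bitset per chromosome; a bit is set for every base covered by a region.
bitsets = binned_bitsets_from_bed_file(bed_input)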

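# Each maximal run of set bits is one merged region.
for chrom in bitsets:
    bits = bitsets[chrom]
    end = 0
    while True:
        # First covered position at or after the end of the previous run.
        start = bits.next_set(end)
        if start == bits.size:
            # No set bits remain on this chromosome.
            break
        # First uncovered position after start, i.e. the exclusive end of the run.
        end = bits.next_clear(start)
        print("%s\t%d\t%d" % (chrom, start, end))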
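
The merge performed by the script can also be illustrated without bx-python. The following is a minimal sketch, not the script's actual method: it sorts intervals instead of using bitsets. The function name `merge_bed_intervals` is hypothetical, and skipping blank lines and lines starting with `#` is an assumption of this sketch. Like runs of set bits, it joins regions that overlap or directly abut and drops empty regions.

```python
from collections import defaultdict


def merge_bed_intervals(lines):
    """Merge overlapping or abutting half-open [start, end) regions per chromosome.

    Hypothetical helper for illustration; only chrom/start/end are kept.
    """
    by_chrom = defaultdict(list)
    for line in lines:
        line = line.strip()
        # Assumption of this sketch: skip blank lines and '#' comments.
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        start, end = int(fields[1]), int(fields[2])
        # An empty region covers no bases, so it contributes nothing.
        if end > start:
            by_chrom[fields[0]].append((start, end))

    merged = []
    for chrom, intervals in by_chrom.items():
        intervals.sort()
        cur_start, cur_end = intervals[0]
        for start, end in intervals[1:]:
            if start <= cur_end:
                # Overlaps or abuts the current run: extend it.
                cur_end = max(cur_end, end)
            else:
                merged.append((chrom, cur_start, cur_end))
                cur_start, cur_end = start, end
        merged.append((chrom, cur_start, cur_end))
    return merged


if __name__ == "__main__":
    example = ["chr1\t10\t20", "chr1\t15\t30", "chr1\t30\t40", "chr1\t50\t60"]
    for chrom, start, end in merge_bed_intervals(example):
        print("%s\t%d\t%d" % (chrom, start, end))
```

Sorting costs O(n log n) in the number of regions but needs no memory proportional to chromosome length, which is the main trade-off against a bitset-based approach.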