File: maf_to_int_seqs.py

#!/usr/bin/python3

"""
For each block in a maf file (read from stdin) write a sequence of ints
corresponding to the columns of the block after applying the provided sequence
mapping.

The 'correct' number of species is determined by the mapping file; blocks that
do not have this number of species will be ignored.

usage: %prog [mapping_file] < maf_file
"""

import sys

import bx.align.maf
from bx import seqmapping


def main():
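    # The mapping file is optional; without it the unmapped column ints are written.
    if len(sys.argv) > 1:
        with open(sys.argv[1]) as mapping_file:
            _, alpha_map = seqmapping.alignment_mapping_from_file(mapping_file)
    else:
        alpha_map = None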

    for maf in bx.align.maf.Reader(sys.stdin):
        # Translate alignment to ints
        int_seq = seqmapping.DNA.translate_list([c.text for c in maf.components])
        # Apply mapping
        if alpha_map:
            int_seq = alpha_map.translate(int_seq)
        # Write ints separated by spaces
        for i in int_seq:
            print(i, end=" ")
        print()


if __name__ == "__main__":
    main()
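

To illustrate what "a sequence of ints corresponding to the columns of the block" means, here is a minimal pure-Python sketch that does not use bx at all. The symbol table `SYMBOLS`, the helper name `columns_to_ints`, and the base-5 positional encoding are assumptions made for illustration only; the actual encoding used by `seqmapping.DNA.translate_list` may differ.

```python
# Hypothetical symbol table for illustration; not taken from bx.seqmapping.
SYMBOLS = {"A": 0, "C": 1, "G": 2, "T": 3, "-": 4}


def columns_to_ints(texts, symbols=SYMBOLS):
    """Encode each alignment column as a single int.

    Every row contributes one digit in base len(symbols), so a column
    of n rows maps to an int in range(len(symbols) ** n).
    """
    if len({len(t) for t in texts}) > 1:
        raise ValueError("all rows of a block must have the same length")
    base = len(symbols)
    result = []
    # zip(*texts) walks the block column by column.
    for column in zip(*texts):
        value = 0
        for ch in column:
            value = value * base + symbols[ch.upper()]
        result.append(value)
    return result


# Two aligned rows, four columns: (A,A) (C,C) (G,T) (-,T)
block = ["ACG-", "ACTT"]
print(" ".join(str(i) for i in columns_to_ints(block)))
```

With two rows and five symbols there are 25 possible column values; a mapping file, as used by the script above, would then collapse those values into a smaller alphabet.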