File: get_refseq.py

fasta3 36.3.8i.14-Nov-2020-3
#!/usr/bin/python3

import sys
import re
import requests

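# At least one accession is required on the command line.
if len(sys.argv) < 2:
    sys.stderr.write("usage: get_refseq.py accession [accession ...]\n")
    sys.exit(1)

# Default to the protein database; switch to nucleotide when the first
# accession looks like a RefSeq mRNA accession (e.g. NM_..., XM_...).
db_type="protein"
if (re.match(r'[A-Z]M_\d+',sys.argv[1])):
    db_type="nucleotide"

# Join all command-line accessions into one comma-separated id list.
acc_str = ",".join(sys.argv[1:])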


# Build the NCBI E-utilities efetch URL that returns FASTA records.
seq_url = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?"
seq_args = "db=%s&id=%s&rettype=fasta" % (db_type, acc_str)
url_string = seq_url+seq_args

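try:
    req = requests.get(url_string)
except requests.exceptions.RequestException as e:
    # The request failed (e.g. connection error or timeout), so there may be
    # no response object; report the exception itself on stderr.
    seq_html = ''
    sys.stderr.write(str(e)+'\n')
else:
    seq_html=req.text
    print(seq_html,end='')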