Translate DNA sequences
-----------------------

.. doctest::

    >>> from cogent.core.genetic_code import DEFAULT as standard_code
    >>> standard_code.translate('TTTGCAAAC')
    'FAN'

Converting a ``DnaSequence`` to a ``ProteinSequence`` is shown in :ref:`translation`.
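
For readers curious about what such a translation involves, here is a minimal standalone sketch in plain Python. It is not PyCogent's implementation. It assumes the NCBI standard genetic code written as a 64-letter string over the base order ``TCAG``, and the names ``CODON_TO_AA`` and ``translate`` are hypothetical.

.. code-block:: python

    from itertools import product

    # Assumption: NCBI standard code (table 1), codons enumerated in TCAG order.
    BASES = "TCAG"
    AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

    # Map each codon, e.g. 'TTT', to its one-letter amino acid ('*' marks a stop).
    CODON_TO_AA = dict(
        ("".join(codon), aa)
        for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
    )

    def translate(dna):
        """Translate complete codons in frame 0; trailing bases are ignored."""
        usable = len(dna) - len(dna) % 3
        return "".join(CODON_TO_AA[dna[i:i + 3]] for i in range(0, usable, 3))

With these definitions, ``translate('TTTGCAAAC')`` returns ``'FAN'``, the same result as the doctest above.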

Translate a codon
-----------------

.. doctest::

    >>> from cogent.core.genetic_code import DEFAULT as standard_code
    >>> standard_code['TTT']
    'F'

Or get the codons for a single amino acid:

.. doctest::

    >>> standard_code['A']
    ['GCT', 'GCC', 'GCA', 'GCG']
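
One way to support both lookups through a single indexing operation is to dispatch on the length of the key. The sketch below illustrates that idea in plain Python; ``TinyCode`` is a hypothetical class, not the PyCogent one, and it assumes the NCBI standard code in ``TCAG`` order.

.. code-block:: python

    from itertools import product

    # Assumption: NCBI standard code (table 1), codons enumerated in TCAG order.
    BASES = "TCAG"
    AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

    class TinyCode(object):
        """Hypothetical two-way codon table; not the PyCogent class."""

        def __init__(self, bases=BASES, amino_acids=AMINO_ACIDS):
            self.codon_to_aa = {}
            self.aa_to_codons = {}
            for codon, aa in zip(product(bases, repeat=3), amino_acids):
                codon = "".join(codon)
                self.codon_to_aa[codon] = aa
                self.aa_to_codons.setdefault(aa, []).append(codon)

        def __getitem__(self, key):
            # Three letters: treat the key as a codon.
            if len(key) == 3:
                return self.codon_to_aa[key]
            # One letter: treat the key as an amino acid and return a copy.
            if len(key) == 1:
                return list(self.aa_to_codons[key])
            raise KeyError(key)

    code = TinyCode()

Here ``code['TTT']`` returns ``'F'`` and ``code['A']`` returns ``['GCT', 'GCC', 'GCA', 'GCG']``.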

For a group of amino acids
--------------------------

.. doctest::

    >>> targets = ['A','C']
    >>> codons = [standard_code[aa] for aa in targets]
    >>> codons
    [['GCT', 'GCC', 'GCA', 'GCG'], ['TGT', 'TGC']]
    >>> flat_list = sum(codons,[])
    >>> flat_list
    ['GCT', 'GCC', 'GCA', 'GCG', 'TGT', 'TGC']
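
Flattening with ``sum(codons, [])`` builds a new list at every addition, which gets slow for many sublists. A standard-library alternative is ``itertools.chain.from_iterable``, sketched here with the codon lists from the doctest above written out literally so that the snippet runs on its own.

.. code-block:: python

    from itertools import chain

    # Codon lists for alanine and cysteine, copied from the output above.
    codons = [['GCT', 'GCC', 'GCA', 'GCG'], ['TGT', 'TGC']]

    # chain.from_iterable walks each sublist in turn without intermediate lists.
    flat_list = list(chain.from_iterable(codons))

The resulting ``flat_list`` holds the same six codons in the same order as the ``sum`` version.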

Converting the ``CodonAlphabet`` to codon series
------------------------------------------------

.. doctest::

    >>> from cogent import DNA
    >>> my_seq = DNA.makeSequence("AGTACACTGGTT")
    >>> sorted(my_seq.CodonAlphabet())
    ['AAA', 'AAC', 'AAG', 'AAT'...
    >>> len(my_seq.CodonAlphabet())
    61
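
The count of 61 is consistent with the 64 possible triplets minus three stop codons. The following plain-Python sketch reproduces that arithmetic; it assumes the stop codons of the standard genetic code and does not use PyCogent. ``STOP_CODONS`` and ``sense_codons`` are hypothetical names.

.. code-block:: python

    from itertools import product

    # Assumption: the three stop codons of the standard genetic code.
    STOP_CODONS = {"TAA", "TAG", "TGA"}

    # All 4**3 = 64 triplets over the DNA alphabet.
    all_codons = ["".join(c) for c in product("ACGT", repeat=3)]

    # Keep only the codons that encode an amino acid.
    sense_codons = sorted(c for c in all_codons if c not in STOP_CODONS)

``len(sense_codons)`` is 61, and the first four entries are ``'AAA'``, ``'AAC'``, ``'AAG'`` and ``'AAT'``, as in the sorted output above.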

Obtaining the codons from a ``DnaSequence`` object
--------------------------------------------------

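Use the ``getInMotifSize()`` method to split a sequence into codons. As the end of this example shows, translating a sequence that still contains a stop codon raises an ``AlphabetError``.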

.. doctest::

    >>> from cogent import DNA
    >>> from cogent.core.alphabet import AlphabetError
    >>> my_seq = DNA.makeSequence('ATGCACTGGTAA','my_gene')
    >>> codons = my_seq.getInMotifSize(3)
    >>> print codons
    ['ATG', 'CAC', 'TGG', 'TAA']
    >>> try:
    ...     pep = my_seq.getTranslation()
    ... except AlphabetError as e:
    ...     print 'AlphabetError', e
    ...
    AlphabetError TAA
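
To check for in-frame stop codons before attempting a translation, a sequence can be split into triplets and each one tested. This is a plain-Python sketch on strings, not the PyCogent method. The helper names are hypothetical, the stop codons are those of the standard code, and dropping an incomplete trailing motif is a choice made for this sketch.

.. code-block:: python

    # Assumption: the three stop codons of the standard genetic code.
    STOP_CODONS = {"TAA", "TAG", "TGA"}

    def in_motif_size(seq, size=3):
        """Split seq into consecutive non-overlapping motifs, dropping any remainder."""
        usable = len(seq) - len(seq) % size
        return [seq[i:i + size] for i in range(0, usable, size)]

    def stop_positions(seq):
        """Return the codon indices of in-frame stop codons in seq."""
        return [i for i, codon in enumerate(in_motif_size(seq))
                if codon in STOP_CODONS]

For ``'ATGCACTGGTAA'``, ``in_motif_size`` returns ``['ATG', 'CAC', 'TGG', 'TAA']`` and ``stop_positions`` returns ``[3]``, pointing at the ``TAA`` that triggers the error above.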

Remove the stop codon first
^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. doctest::

    >>> from cogent import DNA
    >>> my_seq = DNA.makeSequence('ATGCACTGGTAA','my_gene')
    >>> seq = my_seq.withoutTerminalStopCodon()
    >>> pep = seq.getTranslation()
    >>> print pep.toFasta()
    >my_gene
    MHW
    >>> print type(pep)
    <class 'cogent.core.sequence.ProteinSequence'>

Or we can just grab the correct slice from the ``DnaSequence`` object
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. doctest::

    >>> from cogent import DNA
    >>> my_seq = DNA.makeSequence('ATGCACTGGTAA','my_gene')
    >>> pep = my_seq[:-3].getTranslation().toFasta()
    >>> print pep
    >my_gene
    MHW
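
Note that slicing with ``[:-3]`` drops the last three bases whether or not they form a stop codon. A guarded version checks first, as in this plain-Python sketch on strings. ``without_terminal_stop`` is a hypothetical helper, not a PyCogent function, and it assumes the stop codons of the standard genetic code.

.. code-block:: python

    # Assumption: the three stop codons of the standard genetic code.
    STOP_CODONS = {"TAA", "TAG", "TGA"}

    def without_terminal_stop(dna):
        """Return dna minus its final codon, but only if that codon is a stop."""
        in_frame = len(dna) >= 3 and len(dna) % 3 == 0
        if in_frame and dna[-3:].upper() in STOP_CODONS:
            return dna[:-3]
        # No terminal stop (or incomplete final codon): leave the input unchanged.
        return dna

``without_terminal_stop('ATGCACTGGTAA')`` returns ``'ATGCACTGG'``, while ``'ATGCACTGG'`` is returned unchanged. A blind ``[:-3]`` slice on the latter would lose the ``TGG`` codon.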