File: genetic_code.rst

package info (click to toggle)
python-cogent 2023.2.12a1%2Bdfsg-2%2Bdeb12u1
  • links: PTS, VCS
  • area: main
  • in suites: bookworm
  • size: 12,416 kB
  • sloc: python: 89,165; makefile: 117; sh: 16
file content (147 lines) | stat: -rw-r--r-- 3,375 bytes parent folder | download
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
Translate DNA sequences
-----------------------

.. jupyter-execute::

    from cogent3 import get_code

    standard_code = get_code(1)
    standard_code.translate("TTTGCAAAC")

Conversion to a ``ProteinSequence`` from a ``DnaSequence`` is shown in :ref:`translation`.

Translate all six frames
------------------------

.. jupyter-execute::

    from cogent3 import get_code, make_seq

    standard_code = get_code(1)
    seq = make_seq("ATGCTAACATAAA", moltype="dna")
    translations = standard_code.sixframes(seq)
    print(translations)

Find out how many stops in a frame
----------------------------------

.. jupyter-execute::

    from cogent3 import get_code, make_seq

    standard_code = get_code(1)
    seq = make_seq("ATGCTAACATAAA", moltype="dna")
    stops_frame1 = standard_code.get_stop_indices(seq, start=0)
    stops_frame1
    stop_index = stops_frame1[0]
    seq[stop_index : stop_index + 3]

Translate a codon
-----------------

.. jupyter-execute::

    from cogent3 import get_code, make_seq

    standard_code = get_code(1)
    standard_code["TTT"]

or get the codons for a single amino acid

.. jupyter-execute::

    standard_code["A"]

Look up the amino acid corresponding to a single codon
------------------------------------------------------

.. jupyter-execute::

    from cogent3 import get_code

    standard_code = get_code(1)
    standard_code["TTT"]

Get all the codons for one amino acid
-------------------------------------

.. jupyter-execute::

    from cogent3 import get_code

    standard_code = get_code(1)
    standard_code["A"]

Get all the codons for a group of amino acids
---------------------------------------------

.. jupyter-execute::

    targets = ["A", "C"]
    codons = [standard_code[aa] for aa in targets]
    codons
    flat_list = sum(codons, [])
    flat_list

Converting the ``CodonAlphabet`` to codon series
------------------------------------------------

.. jupyter-execute::

    from cogent3 import make_seq

    my_seq = make_seq("AGTACACTGGTT", moltype="dna")
    sorted(my_seq.codon_alphabet())
    len(my_seq.codon_alphabet())

Obtaining the codons from a ``DnaSequence`` object
--------------------------------------------------

Use the method ``get_in_motif_size``

.. jupyter-execute::

    from cogent3 import make_seq

    my_seq = make_seq("ATGCACTGGTAA", name="my_gene", moltype="dna")
    codons = my_seq.get_in_motif_size(3)
    print(codons)

Translating a DNA sequence with a terminating stop codon
--------------------------------------------------------

You can't translate a sequence that contains a stop codon.

.. jupyter-execute::
    :hide-code:

    from cogent3.core.alphabet import AlphabetError

.. jupyter-execute::
    :raises: AlphabetError

    pep = my_seq.get_translation()

By removing the trailing stop codon first
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. jupyter-execute::

    from cogent3 import make_seq

    my_seq = make_seq("ATGCACTGGTAA", name="my_gene", moltype="dna")
    seq = my_seq.trim_stop_codon()
    pep = seq.get_translation()
    print(pep.to_fasta())
    print(type(pep))

By slicing the ``DnaSequence`` first
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. jupyter-execute::

    from cogent3 import make_seq

    my_seq = make_seq("CAAATGTATTAA", name="my_gene", moltype="dna")
    pep = my_seq[:-3].get_translation()
    print(pep.to_fasta())