File: seqsim_aln_sim_user_alphabet.rst

package info (click to toggle)
python-cogent 1.5.3-2
  • links: PTS, VCS
  • area: main
  • in suites: jessie, jessie-kfreebsd
  • size: 16,424 kB
  • ctags: 24,343
  • sloc: python: 134,200; makefile: 100; ansic: 17; sh: 10
file content (79 lines) | stat: -rw-r--r-- 2,213 bytes parent folder | download | duplicates (4)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
Seqsim Alignment Simulation Example with Non-standard alphabet
==============================================================

.. sectionauthor:: Julia Goodrich

This is an example of how to use PyCogent's ``seqsim`` module to simulate an 
alignment where the alphabet is defined by the user for a simple tree starting 
with a random sequence and a random substitution rate matrix. 

First we will perform the necessary imports.


.. doctest::

    >>> from cogent.seqsim.usage import Rates
    >>> from cogent.core.alignment import DenseAlignment
    >>> from cogent.seqsim.tree import RangeNode
    >>> from cogent.parse.tree import DndParser
    >>> from cogent.core.alphabet import CharAlphabet
    >>> from cogent.seqsim.usage import Usage
    >>> from cogent.core.sequence import ModelSequence

Now, lets specify a 4 taxon tree:

.. doctest::

    >>> t = DndParser('(a:0.4,b:0.3,(c:0.15,d:0.2)edge.0:0.1);', 
    ... constructor = RangeNode)

Create the alphabet by passing in the characters to ``CharAlphabet`` then create
tuples of all the possible pairs using ** operator

.. doctest::

    >>> Bases = CharAlphabet('ABCD')
    >>> Pairs = Bases**2

Generate a random sequence with the new alphabet and a random rate matrix,
``Usage`` is being used to define character frequencies for the random
sequence. Then we create a random sequence of length five.

.. doctest::

    >>> u = Usage({'A':0.5,'B':0.2,'C':0.15,'D':0.25}, Alphabet = Bases)    
    >>> s = ModelSequence(u.randomIndices(5))
    >>> q = Rates.random(Pairs)

Set q at the base of the tree and propagate it to all nodes in the tree,

.. doctest::

    >>> t.Q = q
    >>> t.propagateAttr('Q')

Set a P matrix from every Q matrix on each node,

.. doctest::

    >>> t.assignP()

Use ``evolve`` to evolve sequences for each tip, Note: must evolve sequence
data, not sequence object itself (for speed)

.. doctest::

    >>> t.evolve(s._data)

Build alignment,

.. doctest::

    >>> seqs = {}
    >>> for n in t.tips():
    ...     seqs[n.Name] = ModelSequence(n.Sequence,Bases)
    >>> aln = DenseAlignment(seqs,Alphabet=Bases)

The result is a Cogent ``Alignment`` object, which can be used the same way as
any other alignment object.