File: rate_heterogeneity.rst

package info (click to toggle)
python-cogent 1.4.1-1.2
  • links: PTS, VCS
  • area: non-free
  • in suites: squeeze
  • size: 13,260 kB
  • ctags: 20,087
  • sloc: python: 116,163; ansic: 732; makefile: 74; sh: 9
file content (63 lines) | stat: -rw-r--r-- 2,305 bytes parent folder | download
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
Analysis of rate heterogeneity
==============================

.. sectionauthor:: Gavin Huttley

A simple example for analyses involving rate heterogeneity among sites. In this case we will simulate an alignment with two rate categories and then try to recover the rates from the alignment.

.. doctest::

    >>> from cogent.evolve.substitution_model import Nucleotide
    >>> from cogent import LoadTree

Make an alignment with equal split between rates 0.6 and 0.2, and then concatenate them to create a new alignment.

.. doctest::

    >>> model = Nucleotide(equal_motif_probs=True)
    >>> tree = LoadTree("data/test.tree")
    >>> lf = model.makeLikelihoodFunction(tree)
    >>> lf.setParamRule('length', value=0.6, is_const=True)
    >>> aln1 = lf.simulateAlignment(sequence_length=10000)
    >>> lf.setParamRule('length', value=0.2, is_const=True)
    >>> aln2 = lf.simulateAlignment(sequence_length=10000)
    >>> aln3 = aln1 + aln2

Start from scratch, optimising only rates and the rate probability ratio.

.. doctest::

    >>> model = Nucleotide(equal_motif_probs=True, ordered_param="rate",
    ...                    distribution="free")
    >>> lf = model.makeLikelihoodFunction(tree, bins=2, digits=2, space=3)
    >>> lf.setAlignment(aln3)
    >>> lf.optimise(local=True, max_restarts=2, show_progress = False)

We want to know the bin probabilities and the posterior probabilities.

.. doctest::
    
    >>> bprobs = [t for t in lf.getStatistics() if 'bin' in t.Title][0]

Print the ``bprobs`` sorted by ``'rate'`` will generate a table like

.. code-block:: python
    
    bin params
    ====================
     bin   bprobs   rate
    --------------------
    bin0     0.49   0.49
    bin1     0.51   1.48
    --------------------

We'll now use a gamma distribution on the sample alignment, specifying the number of bins as 4. We specify that the bins have equal density using the ``lf.setParamRule('bprobs', is_const=True)`` command.

.. doctest::

    >>> model = Nucleotide(equal_motif_probs=True, ordered_param="rate",
    ...                    distribution="gamma")
    >>> lf = model.makeLikelihoodFunction(tree, bins=4)
    >>> lf.setParamRule('bprobs', is_const=True)
    >>> lf.setAlignment(aln3)
    >>> lf.optimise(local=True, max_restarts=2, show_progress = False)