File: testing_multi_loci.rst

package info (click to toggle)
python-cogent 2024.5.7a1%2Bdfsg-3
  • links: PTS, VCS
  • area: main
  • in suites: sid
  • size: 74,600 kB
  • sloc: python: 92,479; makefile: 117; sh: 16
file content (66 lines) | stat: -rw-r--r-- 2,194 bytes parent folder | download | duplicates (2)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
.. jupyter-execute::
    :hide-code:

    import set_working_directory

Likelihood analysis of multiple loci
====================================

.. sectionauthor:: Gavin Huttley

We want to know whether an exchangeability parameter is different between alignments. We will specify a null model, under which each alignment get's it's own motif probabilities and all alignments share branch lengths and the exchangeability parameter kappa (the transition / transversion ratio). We'll split the example alignment into two-pieces.

.. jupyter-execute::

    from cogent3 import load_aligned_seqs, make_table, make_tree
    from cogent3.evolve.models import HKY85
    from scipy.stats.distributions import chi2
    from cogent3.recalculation.scope import ALL, EACH

    aln = load_aligned_seqs("data/long_testseqs.fasta")
    half = len(aln) // 2
    aln1 = aln[:half]
    aln2 = aln[half:]

We provide names for those alignments, then construct the tree, model instances.

.. jupyter-execute::

    loci_names = ["1st-half", "2nd-half"]
    loci = [aln1, aln2]
    tree = make_tree(tip_names=aln.names)
    mod = HKY85()

To make a likelihood function with multiple alignments we provide the list of loci names. We can then specify a parameter (other than length) to be the same across the loci (using the imported ``ALL``) or different for each locus (using ``EACH``). We conduct a LR test as before.

.. jupyter-execute::

    lf = mod.make_likelihood_function(tree, loci=loci_names, digits=2, space=3)
    lf.set_param_rule("length", is_independent=False)
    lf.set_param_rule("kappa", loci=ALL)
    lf.set_alignment(loci)
    lf.optimise(show_progress=False)
    lf

.. jupyter-execute::

    all_lnL = lf.lnL
    all_nfp = lf.nfp
    lf.set_param_rule("kappa", loci=EACH)
    lf.optimise(show_progress=False)
    lf

.. jupyter-execute::

    each_lnL = lf.lnL
    each_nfp = lf.nfp
    LR = 2 * (each_lnL - all_lnL)
    df = each_nfp - all_nfp

Just to pretty up the result display, I'll print(a table consisting of the test statistics created on the fly.)

.. jupyter-execute::

    make_table(
        header=["LR", "df", "p"], rows=[[LR, df, chi2.sf(LR, df)]], digits=2, space=3,
    )