.. jupyter-execute::
    :hide-code:

    import set_working_directory

****************************
Genetic distance calculation
****************************

Fast pairwise distance estimation
=================================

A fast implementation is available for a limited number of evolutionary
models. The ``available_distances()`` function lists them.

.. jupyter-execute::

    from cogent3 import available_distances

    available_distances()
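
To make concrete what an analytic pairwise distance involves, here is a minimal pure-Python sketch of the proportion of differing sites (p-distance) and the Jukes-Cantor (JC69) correction. It is an illustration only, not the ``cogent3`` implementation. The function names ``p_distance`` and ``jc69_distance`` and the two toy sequences are hypothetical, and the sketch assumes that only positions where both sequences carry an unambiguous nucleotide are compared.

.. jupyter-execute::

    import math


    def p_distance(seq1, seq2):
        """Proportion of differing sites among comparable positions."""
        valid = set("ACGT")
        # keep only aligned positions where both states are unambiguous
        pairs = [
            (a, b)
            for a, b in zip(seq1.upper(), seq2.upper())
            if a in valid and b in valid
        ]
        if not pairs:
            raise ValueError("no comparable sites")
        return sum(a != b for a, b in pairs) / len(pairs)


    def jc69_distance(seq1, seq2):
        """JC69 distance: d = -3/4 * ln(1 - 4p/3)."""
        p = p_distance(seq1, seq2)
        if p >= 0.75:
            raise ValueError("sequences too divergent for JC69")
        return -0.75 * math.log(1 - 4 * p / 3)


    # hypothetical toy sequences that differ at 2 of 20 sites
    s1 = "ACGTACGTACGTACGTACGT"
    s2 = "ACGTACGAACGTACGTTCGT"
    p = p_distance(s1, s2)
    d = jc69_distance(s1, s2)
    print(f"p-distance={p:.4f} JC69={d:.4f}")

Because the estimate comes from a closed-form expression of the observed differences, no iterative fitting is needed, which is what makes this class of estimator fast.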

Computing genetic distances using the ``Alignment`` object
==========================================================

The abbreviations listed by ``available_distances()`` can be used as values for the ``calc`` argument of ``distance_matrix()``.

.. jupyter-execute::

    from cogent3 import load_aligned_seqs

    aln = load_aligned_seqs("data/primate_brca1.fasta", moltype="dna")
    dists = aln.distance_matrix(calc="tn93", show_progress=False)
    dists

Using the distance calculator directly
======================================

.. jupyter-execute::

    from cogent3 import get_distance_calculator, load_aligned_seqs

    aln = load_aligned_seqs("data/primate_brca1.fasta", moltype="dna")
    dist_calc = get_distance_calculator("tn93", alignment=aln)
    dist_calc

.. jupyter-execute::

    dist_calc.run(show_progress=False)
    dists = dist_calc.get_pairwise_distances()
    dists

The distance calculator object can provide more information, for instance the standard errors of the estimates.

.. jupyter-execute::

    dist_calc.stderr
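
As a rough illustration of where such a standard error can come from, the sketch below applies the delta method to the simpler JC69 distance. It assumes sites are independent, so the number of differences is binomial. It is not the TN93 calculation used above and not the ``cogent3`` code; the function name ``jc69_with_stderr`` and the counts are hypothetical.

.. jupyter-execute::

    import math


    def jc69_with_stderr(num_diffs, num_sites):
        """Return the JC69 distance and its approximate standard error."""
        if num_sites <= 0:
            raise ValueError("num_sites must be positive")
        p = num_diffs / num_sites
        if p >= 0.75:
            raise ValueError("sequences too divergent for JC69")
        dist = -0.75 * math.log(1 - 4 * p / 3)
        # binomial variance of the observed proportion of differences
        var_p = p * (1 - p) / num_sites
        # delta method: the derivative of dist with respect to p is 1 / (1 - 4p/3)
        stderr = math.sqrt(var_p) / (1 - 4 * p / 3)
        return dist, stderr


    # hypothetical counts: 20 differences over 200 comparable sites
    dist, se = jc69_with_stderr(num_diffs=20, num_sites=200)
    print(f"distance={dist:.4f} stderr={se:.4f}")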

Likelihood-based pairwise distance estimation
=============================================

The standard ``cogent3`` likelihood function can also be used to estimate distances. Because this approach requires numerical optimisation, it can be significantly slower than the fast estimation approach above.

The following will use the F81 nucleotide substitution model and perform numerical optimisation.

.. jupyter-execute::

    from cogent3 import get_model, load_aligned_seqs
    from cogent3.evolve import distance

    aln = load_aligned_seqs("data/primate_brca1.fasta", moltype="dna")
    d = distance.EstimateDistances(aln, submodel=get_model("F81"))
    d.run(show_progress=False)
    dists = d.get_pairwise_distances()
    dists
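
To show why numerical optimisation costs more than a closed-form estimate, the following pure-Python sketch maximises a pairwise JC69 log-likelihood with a golden-section search and compares the result with the analytic JC69 formula. JC69 is used here instead of F81 only because its pairwise likelihood is short to write down. This is not how ``cogent3`` optimises its likelihood functions; the optimiser, function names, search bounds and counts are all assumptions made for illustration.

.. jupyter-execute::

    import math


    def jc69_log_likelihood(d, num_diffs, num_sites):
        """Pairwise JC69 log-likelihood of distance d (constant terms dropped)."""
        e = math.exp(-4 * d / 3)
        p_same = 0.25 + 0.75 * e  # probability a site is identical
        p_diff = 0.75 - 0.75 * e  # probability a site differs
        return (num_sites - num_diffs) * math.log(p_same) + num_diffs * math.log(p_diff)


    def golden_section_max(func, lo, hi, tol=1e-9):
        """Locate the maximum of a unimodal function on [lo, hi]."""
        inv_phi = (math.sqrt(5) - 1) / 2
        a, b = lo, hi
        x1 = b - inv_phi * (b - a)
        x2 = a + inv_phi * (b - a)
        f1, f2 = func(x1), func(x2)
        while b - a > tol:
            if f1 > f2:  # maximum lies in [a, x2]
                b, x2, f2 = x2, x1, f1
                x1 = b - inv_phi * (b - a)
                f1 = func(x1)
            else:  # maximum lies in [x1, b]
                a, x1, f1 = x1, x2, f2
                x2 = a + inv_phi * (b - a)
                f2 = func(x2)
        return (a + b) / 2


    # hypothetical counts: 20 differences over 200 comparable sites
    num_diffs, num_sites = 20, 200
    ml_d = golden_section_max(
        lambda d: jc69_log_likelihood(d, num_diffs, num_sites), 1e-8, 5.0
    )
    analytic_d = -0.75 * math.log(1 - 4 * (num_diffs / num_sites) / 3)
    print(f"numerical={ml_d:.6f} analytic={analytic_d:.6f}")

The search evaluates the likelihood many times to reach an answer that the analytic formula gives in a single step. For models without a closed-form solution, that repeated evaluation is unavoidable.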