1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110
|
.. jupyter-execute::
:hide-code:
import set_working_directory
*******************
Building alignments
*******************
.. authors, Gavin Huttley, Kristian Rother, Patrick Yannul
Using a ``cogent3`` progressive aligner for nucleotides
=======================================================
We load a canned nucleotide substitution model and the progressive aligner ``tree_align`` function.
.. jupyter-execute::
from cogent3 import load_unaligned_seqs, make_tree
from cogent3.align.progressive import tree_align
We first align without providing a guide tree. The ``tree_align`` algorithm builds pairwise alignments and estimates the substitution model parameters and pairwise distances. The distances are used to build a neighbour joining tree and the median value of substitution model parameters are provided to the substitution model for the progressive alignment step.
.. jupyter-execute::
seqs = load_unaligned_seqs("data/test2.fasta", moltype="dna")
aln, tree = tree_align("HKY85", seqs, show_progress=False)
aln
We then align using a guide tree (pre-estimated) and specifying the ratio of transitions to transversions (kappa).
.. jupyter-execute::
tree = make_tree(
"(((NineBande:0.013,Mouse:0.185):0.023,DogFaced:0.046):0.027,Human:0.034,HowlerMon:0.019)"
)
params = {"kappa": 4.0}
aln, tree = tree_align(
"HKY85", seqs, tree=tree, param_vals=params, show_progress=False
)
aln
Using a ``cogent3`` progressive aligner for codons
==================================================
We load a canned codon substitution model and use a pre-defined tree and parameter estimates.
.. jupyter-execute::
from cogent3 import load_unaligned_seqs, make_tree
from cogent3.align.progressive import tree_align
seqs = load_unaligned_seqs("data/test2.fasta", moltype="dna")
tree = make_tree(
"((NineBande:0.058,Mouse:0.595):0.079,DogFaced:0.142,(HowlerMon:0.062,Human:0.103):0.079)"
)
params = {"kappa": 4.0, "omega": 1.3}
aln, tree = tree_align(
"MG94HKY", seqs, tree=tree, param_vals=params, show_progress=False
)
aln
Converting gaps from aa-seq alignment to nuc seq alignment
==========================================================
We load some unaligned DNA sequences and show their translation.
.. jupyter-execute::
from cogent3 import make_unaligned_seqs
from cogent3.align.progressive import tree_align
from cogent3.evolve.models import get_model
seqs = [
(
"hum",
"AAGCAGATCCAGGAAAGCAGCGAGAATGGCAGCCTGGCCGCGCGCCAGGAGAGGCAGGCCCAGGTCAACCTCACT",
),
(
"mus",
"AAGCAGATCCAGGAGAGCGGCGAGAGCGGCAGCCTGGCCGCGCGGCAGGAGAGGCAGGCCCAAGTCAACCTCACG",
),
("rat", "CTGAACAAGCAGCCACTTTCAAACAAGAAA"),
]
unaligned_DNA = make_unaligned_seqs(seqs, moltype="dna")
unaligned_DNA
.. jupyter-execute::
unaligned_DNA.get_translation()
We load an alignment of these protein sequences.
.. jupyter-execute::
from cogent3 import make_aligned_seqs
aligned_aa_seqs = [
("hum", "KQIQESSENGSLAARQERQAQVNLT"),
("mus", "KQIQESGESGSLAARQERQAQVNLT"),
("rat", "LNKQ------PLS---------NKK"),
]
aligned_aa = make_aligned_seqs(aligned_aa_seqs, moltype="protein")
We then obtain an alignment of the DNA sequences from the alignment of their translation.
.. jupyter-execute::
aligned_DNA = aligned_aa.replace_seqs(unaligned_DNA, aa_to_codon=True)
aligned_DNA
|