.. jupyter-execute::
    :hide-code:

    import set_working_directory

Dotplot basics
==============

A dotplot is a technique (`Gibbs and McIntyre <https://www.ncbi.nlm.nih.gov/pubmed/5456129>`_) for comparing sequences. All ``cogent3`` sequence collection classes (``SequenceCollection``, ``Alignment`` and ``ArrayAlignment``) have a ``dotplot`` method.

.. todo:: Change dotplot ref to a citation

The method returns a drawable, demonstrated below using unaligned sequences.

.. jupyter-execute::

    from cogent3 import load_unaligned_seqs

    seqs = load_unaligned_seqs("data/SCA1-cds.fasta", moltype="dna")
    draw = seqs.dotplot()
    draw.show()

.. jupyter-execute::
    :hide-code:

    outpath = set_working_directory.get_thumbnail_dir() / "plot_aln-dotplot-1.png"

    draw.write(outpath)

If sequence names are not provided, two sequences are chosen at random (specifying names is shown below). The plot title reports the parameter values used to define a match. ``window`` is the length of the sequence segments being compared. ``threshold`` is the number of identical positions within ``window`` required for two sequence segments to be considered a match. ``gap`` is the gap size permitted between adjacent matches for them to be merged.
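
To make the roles of ``window`` and ``threshold`` concrete, here is a minimal pure-Python sketch of windowed matching. It is not the ``cogent3`` implementation: the function name ``window_matches`` is hypothetical, the brute-force comparison is chosen for readability rather than speed, and merging by ``gap`` is not modelled.

```python
def window_matches(seq1, seq2, window, threshold):
    """Return (i, j) start positions of segment pairs that count as a match.

    Hypothetical helper for illustration only. A pair of segments of
    length `window` is a match when at least `threshold` positions
    are identical.
    """
    hits = []
    for i in range(len(seq1) - window + 1):
        segment1 = seq1[i : i + window]
        for j in range(len(seq2) - window + 1):
            segment2 = seq2[j : j + window]
            # count identical positions within the window
            identical = sum(a == b for a, b in zip(segment1, segment2))
            if identical >= threshold:
                hits.append((i, j))
    return hits


# requiring all 4 positions to be identical keeps only exact matches
strict = window_matches("ACGTACGT", "ACGTTCGT", window=4, threshold=4)
# tolerating one mismatch per window admits additional segment pairs
relaxed = window_matches("ACGTACGT", "ACGTTCGT", window=4, threshold=3)
```

Each ``(i, j)`` pair corresponds to a point on the plot, with ``i`` indexing one sequence and ``j`` the other. Lowering ``threshold`` relative to ``window`` tolerates mismatches, so the relaxed call returns every pair found by the strict call plus further pairs.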

Modifying the matching parameters
---------------------------------

Setting ``window`` and ``threshold`` to the same value is equivalent to requiring exact matches.

.. jupyter-execute::

    draw = seqs.dotplot(name1="Human", name2="Mouse", window=8, threshold=8)
    draw.show()
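
When every position in the window must be identical, matching reduces to finding shared substrings of length ``window``. The following sketch illustrates that idea with a k-mer index. It is an assumption-laden illustration, not how ``cogent3`` computes matches, and ``exact_kmer_matches`` is a hypothetical name.

```python
from collections import defaultdict


def exact_kmer_matches(seq1, seq2, k):
    """Return (i, j) positions where seq1 and seq2 share an identical k-mer.

    Hypothetical helper for illustration only.
    """
    # index every k-mer of seq2 by its start positions
    index = defaultdict(list)
    for j in range(len(seq2) - k + 1):
        index[seq2[j : j + k]].append(j)

    hits = []
    for i in range(len(seq1) - k + 1):
        # .get avoids creating empty entries in the defaultdict
        for j in index.get(seq1[i : i + k], []):
            hits.append((i, j))
    return hits


shared = exact_kmer_matches("ACGTACGT", "ACGTTCGT", k=4)
```

Here ``k`` plays the role of both ``window`` and ``threshold``. Sequences shorter than ``k`` yield no matches.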

Displaying dotplot for the reverse complement
---------------------------------------------

.. jupyter-execute::

    draw = seqs.dotplot(name1="Human", name2="Mouse", rc=True)
    draw.show()

.. note:: Clicking on an entry in the legend turns it off.
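
For readers unfamiliar with the term, the reverse complement of a DNA sequence is the sequence of the opposite strand read in the conventional direction. The sketch below shows the operation on plain strings. It only handles the four canonical bases, ``reverse_complement`` is a hypothetical helper, and it makes no claim about how ``cogent3`` computes or draws reverse complement matches.

```python
# translation table for the four canonical DNA bases, both cases
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Complement each base, then reverse the result."""
    return seq.translate(COMPLEMENT)[::-1]


rc = reverse_complement("AACGT")
```

Applying the function twice returns the original sequence, which is a quick sanity check for any reverse complement routine.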

Setting plot attributes
-----------------------

Here we modify the title and figure width.

.. jupyter-execute::

    draw = seqs.dotplot(name1="Human", name2="Mouse", rc=True, title="SCA1", width=400)
    draw.show()