Removing highly gapped positions
--------------------------------

Using the ``omit_gap_pos`` app, we can remove positions from an alignment that exceed a specified proportion of gaps.

Let's create a sample alignment in which one position has a gap in one sequence and another position has a gap in both sequences.

.. jupyter-execute::
    :raises:

    from cogent3 import make_aligned_seqs

    aln = make_aligned_seqs({"s1": "ACGA-GA-CG", "s2": "GATGATG-AT"}, moltype="dna")
    aln
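
To make the thresholds used in this section easier to interpret, the following plain-Python sketch computes the fraction of gaps in each column of the sample alignment. It does not use ``cogent3``; the function name ``gap_fractions`` is hypothetical, and it assumes ``-`` is the only gap character.

```python
# Illustration only: plain Python, not the cogent3 implementation.
# Assumption: "-" is the only character treated as a gap.
seqs = {"s1": "ACGA-GA-CG", "s2": "GATGATG-AT"}


def gap_fractions(seqs, gap_char="-"):
    """Return the fraction of sequences with a gap at each alignment column."""
    columns = zip(*seqs.values())  # one tuple of characters per column
    return [col.count(gap_char) / len(col) for col in columns]


fractions = gap_fractions(seqs)
print(fractions)
```

The fifth column has a gap in one of the two sequences and the eighth column has a gap in both, so these are the only columns with a non-zero gap fraction.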

Removing highly gapped nucleotide positions
"""""""""""""""""""""""""""""""""""""""""""

Sites with over 99% gaps are excluded by default. For the sample alignment, this removes only the position where both sequences have a gap.

.. jupyter-execute::
    :raises:

    from cogent3 import get_app

    omit_gap_pos_app = get_app("omit_gap_pos", moltype="dna")
    result = omit_gap_pos_app(aln)
    result

We can alter the threshold for the allowed fraction of gaps with the ``allowed_frac`` argument. Let's create an app that excludes all aligned sites with over 49% gaps.

.. jupyter-execute::
    :raises:

    omit_gap_pos_app = get_app("omit_gap_pos", allowed_frac=0.49, moltype="dna")
    result = omit_gap_pos_app(aln)
    result
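
The filtering rule can be sketched in plain Python as follows. This is a minimal illustration of the idea, not the ``cogent3`` implementation: the function name ``drop_gapped_columns`` is hypothetical, ``-`` is assumed to be the only gap character, and a column is assumed to be kept when its gap fraction is less than or equal to the threshold.

```python
# Illustration only: plain Python, not the cogent3 implementation.
# Assumptions: "-" is the only gap character, and a column is kept when
# its gap fraction is <= allowed_frac.
def drop_gapped_columns(seqs, allowed_frac=0.99, gap_char="-"):
    """Return a copy of seqs without columns whose gap fraction exceeds allowed_frac."""
    names = list(seqs)
    columns = zip(*(seqs[name] for name in names))
    kept = [col for col in columns if col.count(gap_char) / len(col) <= allowed_frac]
    # rebuild each sequence from the retained columns
    return {name: "".join(col[i] for col in kept) for i, name in enumerate(names)}


sample = {"s1": "ACGA-GA-CG", "s2": "GATGATG-AT"}
default_filtered = drop_gapped_columns(sample)
strict_filtered = drop_gapped_columns(sample, allowed_frac=0.49)
print(default_filtered)
print(strict_filtered)
```

With the default threshold only the all-gap column is dropped. With a threshold of 0.49, the column where one of the two sequences has a gap (a gap fraction of 0.5) is dropped as well.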

Removing highly gapped codon positions
""""""""""""""""""""""""""""""""""""""

To eliminate any codon columns (where a column is a triple of nucleotides) that contain a gap character, we set ``allowed_frac=0`` and ``motif_length=3``.

.. jupyter-execute::
    :raises:

    omit_gap_pos_app = get_app(
        "omit_gap_pos", allowed_frac=0, motif_length=3, moltype="dna"
    )
    result = omit_gap_pos_app(aln)
    result
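
The same rule applied to codons can be sketched in plain Python. This is an illustration under stated assumptions rather than the ``cogent3`` implementation: the function name ``drop_gapped_motifs`` is hypothetical, a motif is counted as gapped if it contains at least one ``-``, and trailing positions that do not fill a complete motif are discarded.

```python
# Illustration only: plain Python, not the cogent3 implementation.
# Assumptions: a motif counts as gapped if it contains any "-", and an
# incomplete motif at the end of the alignment is discarded.
def drop_gapped_motifs(seqs, allowed_frac=0.0, motif_length=3, gap_char="-"):
    """Return a copy of seqs without motif columns whose gap fraction exceeds allowed_frac."""
    names = list(seqs)
    length = len(seqs[names[0]])
    usable = length - length % motif_length  # ignore an incomplete final motif
    kept = []
    for start in range(0, usable, motif_length):
        motifs = [seqs[name][start : start + motif_length] for name in names]
        gapped = sum(gap_char in motif for motif in motifs)
        if gapped / len(motifs) <= allowed_frac:
            kept.append(motifs)
    # rebuild each sequence from the retained motifs
    return {name: "".join(m[i] for m in kept) for i, name in enumerate(names)}


sample = {"s1": "ACGA-GA-CG", "s2": "GATGATG-AT"}
codon_filtered = drop_gapped_motifs(sample)
print(codon_filtered)
```

In this sketch only the first codon column of the sample alignment is free of gaps, so it is the only one retained.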