Sample an alignment to a fixed length
-------------------------------------

Let's load an alignment of rodent protein sequences, stored in PHYLIP format, to use in the examples.

.. jupyter-execute::
    :raises:
    
    from cogent3 import get_app

    loader = get_app("load_aligned", moltype="protein", format="phylip")
    aln = loader("data/abglobin_aa.phylip")
    aln

How to sample the first ``n`` positions of an alignment
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

We can use the ``fixed_length`` app to sample an alignment to a fixed length. By default, it samples from the beginning of the alignment. The argument ``length=20`` sets how many positions to sample.


.. jupyter-execute::
    :raises:

    from cogent3 import get_app

    first_20 = get_app("fixed_length", length=20)
    first_20(aln)
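
To make the effect concrete, here is a minimal plain-Python sketch of taking the first ``n`` columns of an alignment. It is an illustration only, not how ``fixed_length`` is implemented: the alignment is modelled as a dict of equal-length strings, and ``toy_aln`` and ``take_first`` are hypothetical names with made-up data.

.. code-block:: python

    # Toy alignment: names mapped to equal-length aligned sequences (made-up data).
    toy_aln = {
        "seq1": "MVLSPADKTNVKAAWGKVGAHAGEYGAE",
        "seq2": "MVLSGEDKSNIKAAWGKIGGHGAEYGAE",
    }


    def take_first(aln, length):
        """Keep the first `length` columns of every sequence."""
        return {name: seq[:length] for name, seq in aln.items()}


    toy_first_20 = take_first(toy_aln, 20)

Every sequence is cut at the same column, so the result is still an alignment: all rows have the same length and the columns stay in register.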


How to sample ``n`` positions from within an alignment
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Creating the ``fixed_length`` app with the argument ``start=x`` makes the sample begin ``x`` positions into the alignment, so the first ``x`` positions are skipped.

.. jupyter-execute::
    :raises:

    from cogent3 import get_app

    skip_10_take_20 = get_app("fixed_length", length=20, start=10)
    skip_10_take_20(aln)
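
The relationship between ``start`` and ``length`` can be sketched with ordinary Python slicing. This assumes ``start`` is a zero-based offset, as the name ``skip_10_take_20`` suggests. The helper ``window`` is hypothetical and is not part of cogent3.

.. code-block:: python

    def window(start, length):
        """Return the zero-based column indices covered by a sample."""
        return list(range(start, start + length))


    # Skipping 10 columns and taking 20 covers indices 10 to 29 inclusive,
    # which is the same span as the slice [10:30].
    columns = window(start=10, length=20)
    same_as_slice = list(range(50))[10:30]

Under that assumption, the sample ends at index ``start + length - 1``, so the alignment needs at least ``start + length`` positions for the whole window to fit.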


How to sample ``n`` positions randomly from within an alignment
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The start position can be selected at random with ``random=True``. An optional ``seed`` can be provided so that the same start position is chosen each time the app is run.

.. jupyter-execute::
    :raises:

    from cogent3 import get_app

    random_20 = get_app("fixed_length", length=20, random=True)
    random_20(aln)
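
Why a seed gives a reproducible start can be shown with the standard library alone. The sketch below is an assumption about the general technique, not a description of the internals of ``fixed_length``. ``random_start`` is a hypothetical helper and the alignment length of 100 is made up.

.. code-block:: python

    import random


    def random_start(aln_length, sample_length, seed=None):
        """Pick a start position so the whole sample fits in the alignment."""
        if sample_length > aln_length:
            raise ValueError("sample is longer than the alignment")
        # A private generator: the same seed always yields the same start.
        rng = random.Random(seed)
        # Valid starts run from 0 to aln_length - sample_length inclusive.
        return rng.randrange(aln_length - sample_length + 1)


    start_a = random_start(100, 20, seed=42)
    start_b = random_start(100, 20, seed=42)

With the same seed, ``start_a`` and ``start_b`` are identical. Leaving ``seed`` as ``None`` lets the start vary between runs.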