************
Axe Tutorial
************

In this tutorial, we'll use Axe to demultiplex some paired-end,
combinatorially-indexed Genotyping-by-Sequencing reads. The data for this
tutorial is available from figshare:
https://figshare.com/articles/axe-tutorial_tar/6143720 .

Run Axe as the first step of any analysis. Don't run sequence QC tools like
AdapterRemoval or Trimmomatic before Axe, as they may trim away the index
sequences or discard pairing information.

Step 0: Download the trial data
-------------------------------

This will download the trial data, and extract it on the fly:

.. code-block:: bash

   curl -LS https://ndownloader.figshare.com/files/11094782 | tar -xvf -

Step 1: Prepare a key file
--------------------------

The key file associates index sequences with sample names. A key file can be
prepared in a spreadsheet editor, like LibreOffice Calc, or Excel. The format
is quite strict, and is described in detail in the online usage documentation.

Let's now inspect the key file provided for this tutorial.

.. code-block:: bash

   head axe-keyfile.tsv
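
Because the format is strict, it can help to check that every line has the
same number of columns. The snippet below is only a sketch: it assumes the key
file is tab-separated (as the ``.tsv`` extension suggests), and
``keyfile_column_counts`` is a hypothetical helper, not part of Axe.

.. code-block:: bash

   # Hypothetical helper, not part of Axe: tally how many lines of a key
   # file have each number of tab-separated columns.
   # Output format: <column count><TAB><number of lines>
   keyfile_column_counts() {
       awk -F '\t' '{ n[NF]++ } END { for (k in n) print k "\t" n[k] }' "$1" | sort -n
   }

   # Run it on the tutorial key file if it is in the current directory.
   if [ -f axe-keyfile.tsv ]; then
       keyfile_column_counts axe-keyfile.tsv
   fi

A single line of output means every row has the same number of columns. More
than one line points at a ragged row that is worth checking against the format
described in the usage documentation.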


Step 2: Demultiplex with Axe
----------------------------


In this step, we will demultiplex our interleaved input file to per-sample
interleaved output files. To see a full range of Axe's options, please run
``axe-demux -h``, or inspect the online usage documentation.

First, let's inspect the input.

.. code-block:: bash

   zcat axe-tutorial.fastq.gz | head -n 8
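
It can also be useful to know how many reads go in, so that the demultiplexed
totals can be compared later. A minimal sketch, assuming the usual four lines
per FASTQ record; ``count_fastq_records`` is a hypothetical helper, not an Axe
command.

.. code-block:: bash

   # Hypothetical helper: count records in a gzipped FASTQ file, assuming
   # four lines per record (no wrapped sequence lines).
   count_fastq_records() {
       gzip -dc "$1" | awk 'END { print NR / 4 }'
   }

   # Run it on the tutorial input if it is in the current directory.
   if [ -f axe-tutorial.fastq.gz ]; then
       count_fastq_records axe-tutorial.fastq.gz
   fi

Since the input is interleaved paired-end data, the number of read pairs is
half of the reported record count.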

Next, we need to ensure that Axe has somewhere to put the demultiplexed reads.
Axe writes one file (or more, depending on pairing) per sample, naming each by
appending the sample name to a prefix given by the ``-I``, ``-F``, and/or
``-R`` options. If the prefix is a directory, the per-sample FASTQ files are
created inside it, but the directory must already exist. Let's make an output
directory:

.. code-block:: bash

   mkdir -p output

Now, let's demultiplex the reads!

.. code-block:: bash

   axe-demux -i axe-tutorial.fastq.gz -I output/ \
      -c -b axe-keyfile.tsv -t demux-stats.tsv -z 1

The command above demultiplexes reads from ``axe-tutorial.fastq.gz`` into
separate files under ``output``, based on the combinatorial (``-c``)
sample-to-index-sequence mapping described in ``axe-keyfile.tsv``, and saves a
file of statistics as ``demux-stats.tsv``. Note that ``-z 1`` enables
compression of the output files, in case you are short on disk space, at the
cost of making Axe slightly slower.
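
Once the run finishes, a quick sanity check is to count the reads that landed
in each output file. The sketch below assumes that the compressed outputs end
in ``.gz`` and hold four-line FASTQ records; ``count_demuxed_records`` is a
hypothetical helper, not an Axe command.

.. code-block:: bash

   # Hypothetical helper: print "<file><TAB><records>" for every gzipped
   # file directly under a directory, assuming four lines per FASTQ record.
   count_demuxed_records() {
       for f in "$1"/*.gz; do
           [ -e "$f" ] || continue   # skip the unexpanded glob if nothing matches
           printf '%s\t%s\n' "$(basename "$f")" \
               "$(gzip -dc "$f" | awk 'END { print NR / 4 }')"
       done
   }

   # Run it on the tutorial output directory if it exists.
   if [ -d output ]; then
       count_demuxed_records output
   fi

These per-file counts can then be compared with the statistics saved in
``demux-stats.tsv``.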