.. _examples:

Example projects and data-sets
==============================

PALEOMIX includes small example projects for its larger pipelines. These projects are designed to run in a short amount of time and to help verify that the pipelines have been installed correctly.


.. _examples_bam:

BAM Pipeline example project
----------------------------

The example project for the BAM pipeline involves the processing of a small data set consisting of (simulated) ancient sequences derived from the human mitochondrial genome. The runtime of this project on a typical desktop or laptop ranges from around 1 minute to around 1 hour (when full modeling of ancient DNA damage patterns is enabled). To access this example project, use the 'example' command for the BAM pipeline to copy the project files to a given directory (here, the current directory)::

    $ paleomix bam example .
    $ cd bam_pipeline
    $ paleomix bam run makefile.yaml

The output generated by the pipeline is described in the :ref:`bam_filestructure` section. Please see the :ref:`troubleshooting` section if you run into problems running the pipeline.

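
As a convenience, the three commands above can be wrapped in a small shell function that first checks that the 'paleomix' executable can be found. This is a minimal sketch and not part of PALEOMIX: the function name 'run_bam_example' and its destination argument are hypothetical, and only the two 'paleomix' invocations are taken from the example above.

```shell
# Hypothetical helper (not part of PALEOMIX): copy the BAM example project
# into the directory given as $1 (default: current directory) and run it.
run_bam_example() {
    dest="${1:-.}"

    # Fail early if the paleomix executable cannot be found
    if ! command -v paleomix >/dev/null 2>&1; then
        echo "error: paleomix not found on PATH" >&2
        return 127
    fi

    mkdir -p "$dest" || return 1

    # Run in a subshell so the caller's working directory is unchanged;
    # '&&' stops the sequence at the first failing command
    (
        cd "$dest" &&
        paleomix bam example . &&
        cd bam_pipeline &&
        paleomix bam run makefile.yaml
    )
}
```

For instance, calling 'run_bam_example /tmp/paleomix_test' would place the project files in '/tmp/paleomix_test/bam_pipeline' and run the pipeline from there.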

.. _examples_phylo:

Phylogenetic Pipeline example project
-------------------------------------

The example project for the phylogenetic pipeline involves the processing and mapping of a small data set consisting of (simulated) sequences derived from the human and primate mitochondrial genomes, followed by the genotyping of gene sequences and the construction of a maximum likelihood phylogeny. Since this example project starts from raw reads, it requires that the BAM pipeline has been correctly installed (as described in section :ref:`bam_requirements`). The runtime of this project on a typical desktop or laptop ranges from around 30 minutes to around 1 hour.


To access this example project, use the 'example' command for the phylogenetic pipeline to copy the project files to a given directory (here, the current directory), and then run the 'setup.sh' script in the root directory to generate the data set::

    $ paleomix phylo example .
    $ cd phylo_pipeline
    $ ./setup.sh

Once the example data has been generated, the two pipelines may be executed::

    $ cd alignment
    $ paleomix bam run makefile.yaml
    $ cd ../phylogeny
    $ paleomix phylo genotype+msa+phylogeny makefile.yaml

The output generated by the pipeline is described in the :ref:`phylo_filestructure` section. Please see the :ref:`troubleshooting` section if you run into problems running the pipeline.

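
In this example the mapping step is run before the genotyping step, so the order of the commands matters. The following sketch chains the commands shown above so that a failure at any step stops the run. The function name 'run_phylo_example' is hypothetical, and the function is assumed to be called from inside the 'phylo_pipeline' directory created by the 'example' command.

```shell
# Hypothetical helper (not part of PALEOMIX): generate the example data,
# then run the BAM pipeline followed by the phylogenetic pipeline.
# Assumes the current directory is 'phylo_pipeline'.
run_phylo_example() {
    # The subshell keeps the caller's working directory unchanged;
    # each '&&' prevents later steps from running after a failure
    (
        ./setup.sh &&
        cd alignment &&
        paleomix bam run makefile.yaml &&
        cd ../phylogeny &&
        paleomix phylo genotype+msa+phylogeny makefile.yaml
    )
}
```

If the BAM pipeline exits with a non-zero status, the phylogenetic pipeline is not started and the function returns that status.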

.. _examples_zonkey:

Zonkey Pipeline example project
-------------------------------

The example project for the Zonkey pipeline is based on a synthetic hybrid between a Domestic donkey and an Arabian horse (obtained from [Orlando2013]_), using a low number of reads (1200). The runtime of these examples on a typical desktop or laptop ranges from around 30 minutes to around 1 hour, depending on your local configuration.


To access this example project, download the Zonkey reference database (see the 'Prerequisites' section of the :ref:`zonkey_usage` page for instructions), and use the 'example' command for zonkey to copy the project files to a given directory. Here, the current directory is used; to place the example files in a different location, simply replace the '.' with the full path to the desired directory::

    $ paleomix zonkey example database.tar .
    $ cd zonkey_pipeline

The example directory contains three BAM files: one containing a nuclear alignment ('nuclear.bam'); one containing a mitochondrial alignment ('mitochondrial.bam'); and one containing a combined nuclear and mitochondrial alignment ('combined.bam'). In addition, a sample table is included which shows how multiple samples may be specified and processed at once. Each of these may be run as follows::

    # Process only the nuclear BAM;
    # by default, results are saved in 'nuclear.zonkey'
    $ paleomix zonkey run database.tar nuclear.bam

    # Process only the mitochondrial BAM;
    # by default, results are saved in 'mitochondrial.zonkey'
    $ paleomix zonkey run database.tar mitochondrial.bam

    # Process both the nuclear and the mitochondrial BAMs;
    # note that it is necessary to specify an output directory
    $ paleomix zonkey run database.tar nuclear.bam mitochondrial.bam results

    # Process the combined nuclear and mitochondrial BAM;
    # by default, results are saved in 'combined.zonkey'
    $ paleomix zonkey run database.tar combined.bam

    # Process multiple samples; the table corresponds to the four
    # cases listed above.
    $ paleomix zonkey run database.tar samples.txt

Please see the :ref:`troubleshooting` section if you run into problems running the pipeline. The output generated by the pipeline is described in the :ref:`zonkey_filestructure` section.

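
The three single-BAM runs above differ only in the input file, so they can also be expressed as a loop. This is a minimal sketch under stated assumptions: the function name 'run_zonkey_examples' is hypothetical, the function is assumed to be called from inside the 'zonkey_pipeline' directory, and each run relies on the default output location described in the comments above.

```shell
# Hypothetical helper (not part of PALEOMIX): run Zonkey on each of the
# three example BAM files in turn, using the database given as $1
# (default: database.tar, as in the commands above).
run_zonkey_examples() {
    database="${1:-database.tar}"

    for bam in nuclear.bam mitochondrial.bam combined.bam; do
        echo "Processing $bam" >&2
        # Stop at the first run that fails
        paleomix zonkey run "$database" "$bam" || return 1
    done
}
```

The run that takes both 'nuclear.bam' and 'mitochondrial.bam' is left out of the loop because it requires an explicit output directory.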