*** Fiona: a parallel and automatic strategy for read error correction ***
https://www.seqan.de/apps/fiona.html
March, 2014
------------------------------------------------------------------------------
Table of Contents
------------------------------------------------------------------------------
1. Overview
2. Installation
3. Usage
4. Contact
------------------------------------------------------------------------------
1. Overview
------------------------------------------------------------------------------
Fiona is a tool for correcting errors in next-generation sequencing (NGS)
reads. The input consists of the read set as a FASTA or FASTQ file, an
estimate of the genome length, and an estimate of the per-base error rate.
The tool then estimates all other parameters automatically and corrects the
reads using a suffix tree (emulated through a suffix array).
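To illustrate what "a suffix tree emulated through a suffix array" can mean,
here is a minimal Python sketch. It is not Fiona's implementation: the
function names, the '$' separator, and the naive construction are assumptions
made for illustration only. It shows the general idea that all suffixes
sharing a prefix (one node of a suffix tree) form one contiguous interval in
the sorted suffix array, so substring counts can be found by binary search.

```python
from bisect import bisect_left, bisect_right


def build_suffix_array(text):
    """Return the start positions of all suffixes of text in sorted order.

    Naive construction by sorting the suffixes directly; fine for a
    toy example, far too slow for real read sets.
    """
    return sorted(range(len(text)), key=lambda i: text[i:])


def count_occurrences(text, suffix_array, pattern):
    """Count how often pattern occurs in text.

    Suffixes that start with pattern are adjacent in the suffix array,
    so the count is the width of that interval.
    """
    k = len(pattern)
    # Truncating sorted suffixes to k characters keeps them sorted.
    prefixes = [text[i:i + k] for i in suffix_array]
    return bisect_right(prefixes, pattern) - bisect_left(prefixes, pattern)


# Hypothetical toy read set; '$' separates reads so that no match
# spans two reads (patterns must not contain '$').
reads = ["ACGTAC", "CGTACG"]
text = "$".join(reads)
sa = build_suffix_array(text)
print(count_occurrences(text, sa, "GTAC"))  # GTAC occurs once in each read
```

A substring that occurs far less often than expected from the coverage is a
candidate for containing a sequencing error, which is why such counts are
useful for read correction in general.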
------------------------------------------------------------------------------
2. Installation
------------------------------------------------------------------------------
The binaries were compiled on a Debian 6.0.6 system, so they might not work
on other systems. We will soon provide the source code for you to compile
yourself.
After downloading the binary files, you are good to go!
Try using fiona to correct the example data set:
$ fiona -g 10000 example/reads.fa output.fa
The file output.fa will be created:
$ cat output.fa
>random.fasta.fasta.000000000
TTGCAGTCTGATGTACCAATACTCTCGCATATCCGCCGGACACTAAGATCTGGCACCCCTAAAGCTGGGC
TTTAT
>random.fasta.fasta.000000001
AGTATCTATTTCTCAGCCCACTCACGAATACTGTCTTTCTCCCACCTATACATGAAGTCATACAGGTACC
TGTTC
>random.fasta.fasta.000000002
TGCTTGTAAACCACTCGACGACGAGTAGGTTGGCCCGTTTAACTCGACTCTCTCTGGTGGAGCCCGACCT
CAGCT
>random.fasta.fasta.000000003
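As the listing shows, each sequence in the output is wrapped across several
lines. To inspect the corrected reads programmatically, a small reader that
joins the wrapped lines is enough. The following Python sketch is not part
of Fiona; the function name read_fasta is an assumption chosen for this
example.

```python
def read_fasta(lines):
    """Yield (name, sequence) pairs from an iterable of FASTA lines.

    Sequence lines belonging to one record are concatenated, so
    line-wrapped output is returned as a single string per read.
    """
    name, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            name, chunks = line[1:], []
        elif name is not None:
            chunks.append(line)
    # Emit the last record, which has no following header line.
    if name is not None:
        yield name, "".join(chunks)
```

For example, iterating over read_fasta(open("output.fa")) yields one
(name, sequence) pair per corrected read.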
Currently, you have to specify quite a number of parameters, but future
versions will provide them as default values. The following parameters are
always the same for the default settings.
------------------------------------------------------------------------------
3. Usage
------------------------------------------------------------------------------
The usage of fiona is as follows:
fiona [OPTIONS] -g GENOME_LENGTH INPUT_FILE OUTPUT_FILE
For a detailed list and description of fiona's parameters, see the output of
fiona -h
Options Describing Input Properties
  -g GENOME_LENGTH  The genome length.
  -e ERROR_RATE     Per-base error rate. Overestimate it slightly, e.g. use
                    0.01 for current (2012) Illumina data.
Options for Method Configuration
  -id INDELS        Number of indel errors to allow [0-4]. Set to 0 for
                    Hamming distance; 1 is the usual choice for edit
                    distance.
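When fiona is called from a script, the command line can be assembled from
the options described above. The following Python sketch uses only the
documented options -g, -e, and -id. The helper name build_fiona_command, its
argument names, and the range checks are assumptions for this example and
are not part of Fiona.

```python
def build_fiona_command(input_file, output_file, genome_length,
                        error_rate=None, indels=None, binary="fiona"):
    """Build the argument list for a fiona call.

    Follows the usage line
        fiona [OPTIONS] -g GENOME_LENGTH INPUT_FILE OUTPUT_FILE
    Optional values are only added when they are given.
    """
    if genome_length <= 0:
        raise ValueError("genome length must be positive")
    cmd = [binary, "-g", str(genome_length)]
    if error_rate is not None:
        # Assumed sanity check: a rate is a fraction between 0 and 1.
        if not 0 < error_rate < 1:
            raise ValueError("error rate must be between 0 and 1")
        cmd += ["-e", str(error_rate)]
    if indels is not None:
        # The documented range for -id is [0-4].
        if not 0 <= indels <= 4:
            raise ValueError("indels must be in the range 0 to 4")
        cmd += ["-id", str(indels)]
    cmd += [input_file, output_file]
    return cmd
```

The resulting list can be passed to subprocess.run(cmd, check=True) from the
Python standard library, provided the fiona binary is on the search path.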
------------------------------------------------------------------------------
4. Contact
------------------------------------------------------------------------------
For questions, comments, or suggestions feel free to contact
Hugues Richard <hugues.richard@upmc.fr>
Marcel Schulz <maschulz@andrew.cmu.edu>