  *** Fiona: a parallel and automatic strategy for read error correction ***

                 https://www.seqan.de/apps/fiona.html
                              March, 2014

------------------------------------------------------------------------------
Table of Contents
------------------------------------------------------------------------------
  1. Overview
  2. Installation
  3. Usage
  4. Contact

------------------------------------------------------------------------------
1. Overview
------------------------------------------------------------------------------

Fiona is a tool for NGS read correction.  The input consists of the read
set as a FASTA or FASTQ file, an estimate of the genome length, and an
estimate of the per-base error rate.

The tool will then estimate all other parameters automatically and correct the
reads using a suffix tree (emulated through a suffix array).

------------------------------------------------------------------------------
2. Installation
------------------------------------------------------------------------------

The binaries were compiled on a Debian 6.0.6 system, so they might not work
on other systems.  We will soon provide the source code for you to compile
yourself.

After downloading the binary files, you are good to go!

Try fiona by correcting the example data set:

  $ fiona -g 10000 example/reads.fa output.fa

The file output.fa will be created:

  $ cat output.fa
  >random.fasta.fasta.000000000
  TTGCAGTCTGATGTACCAATACTCTCGCATATCCGCCGGACACTAAGATCTGGCACCCCTAAAGCTGGGC
  TTTAT
  >random.fasta.fasta.000000001
  AGTATCTATTTCTCAGCCCACTCACGAATACTGTCTTTCTCCCACCTATACATGAAGTCATACAGGTACC
  TGTTC
  >random.fasta.fasta.000000002
  TGCTTGTAAACCACTCGACGACGAGTAGGTTGGCCCGTTTAACTCGACTCTCTCTGGTGGAGCCCGACCT
  CAGCT
  >random.fasta.fasta.000000003
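
As a quick sanity check on the run above, you can compare the number of
records in the input and output files.  The sketch below counts FASTA
header lines; the helper name count_fasta_records is hypothetical, and the
expectation that both counts match assumes that fiona writes one record per
input read, which this README does not state.

```shell
# Count FASTA records by counting header lines (lines starting with '>').
# count_fasta_records is a hypothetical helper, not part of fiona.
count_fasta_records() {
    grep -c '^>' "$1"
}

# Report the counts only if both files from the example run exist.
if [ -f example/reads.fa ] && [ -f output.fa ]; then
    echo "input:  $(count_fasta_records example/reads.fa) reads"
    echo "output: $(count_fasta_records output.fa) reads"
fi
```

This only works for FASTA files; FASTQ quality lines may also start with
'>', so a FASTQ input needs a different counting method.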

Currently, you have to specify quite a number of parameters, but in future
versions they will have default values.  The following parameters are
always the same for the default settings.

------------------------------------------------------------------------------
3. Usage
------------------------------------------------------------------------------

The usage of fiona is as follows:

  fiona [OPTIONS] -g GENOME_LENGTH INPUT_FILE OUTPUT_FILE

For a detailed list and description of fiona's parameters, see the output of

  fiona -h

Options Describing Input Properties

    -g GENOME_LENGTH    The genome length.
    -e ERROR_RATE       Per-base error rate.  Overestimate it slightly, e.g.
                        0.01 for current (2012) Illumina data.

Options for Method Configuration

    -id INDELS          Number of indel errors to allow [0-4].  Set to 0
                        for Hamming distance; 1 is the usual setting for
                        edit distance.
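
To illustrate how the options above combine, here is a minimal sketch of a
shell helper that assembles a fiona command line.  The helper name
fiona_cmd and its argument order are hypothetical; only the flags -g, -e,
and -id and the [0-4] range for indels come from this README.  The helper
prints the command instead of running it, so you can inspect it first.

```shell
# Sketch: build a fiona command line from the documented options.
# fiona_cmd is a hypothetical helper; it prints the command, it does not run it.
fiona_cmd() {
    genome_length=$1   # -g: estimated genome length
    error_rate=$2      # -e: per-base error rate (slightly overestimated)
    indels=$3          # -id: 0 = Hamming distance, 1 = usual edit-distance setting
    input=$4           # FASTA or FASTQ read set
    output=$5          # file for the corrected reads

    # Reject indel values outside the documented range [0-4].
    case $indels in
        [0-4]) ;;
        *) echo "indels must be between 0 and 4" >&2; return 1 ;;
    esac

    printf 'fiona -g %s -e %s -id %s %s %s\n' \
        "$genome_length" "$error_rate" "$indels" "$input" "$output"
}

# Command for the bundled example reads, with edit distance and 1% error rate.
fiona_cmd 10000 0.01 1 example/reads.fa output.fa
```

Once the printed command looks right, paste it into the shell to run it.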

------------------------------------------------------------------------------
4. Contact
------------------------------------------------------------------------------

For questions, comments, or suggestions feel free to contact

    Hugues Richard <hugues.richard@upmc.fr>
    Marcel Schulz <maschulz@andrew.cmu.edu>