.. _denoiser_preprocess:

.. index:: denoiser_preprocess.py

*denoiser_preprocess.py* -- Run the first phase of the denoiser algorithm: prefix clustering
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

**Description:**

The script `denoiser_preprocess.py <./denoiser_preprocess.html>`_ runs the first clustering phase,
which groups reads based on common prefixes.
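
The idea of grouping reads by common prefixes can be illustrated with a minimal sketch. It is not the script's implementation: it works on plain nucleotide strings rather than flowgrams, uses a simple quadratic scan, and the function name ``cluster_by_prefix`` and the read ids are hypothetical.

::

    def cluster_by_prefix(reads):
        """Map each centroid id to the ids of reads that are prefixes of it.

        `reads` maps read id -> sequence (assumed input shape for this sketch).
        """
        # Visit the longest reads first, so a candidate centroid is always
        # seen before any shorter read that is a prefix of it.
        ordered = sorted(reads.items(), key=lambda item: (-len(item[1]), item[0]))
        clusters = {}
        for read_id, seq in ordered:
            for centroid_id in clusters:
                if reads[centroid_id].startswith(seq):
                    clusters[centroid_id].append(read_id)
                    break
            else:
                # No existing centroid starts with this read: open a new cluster.
                clusters[read_id] = []
        return clusters

    reads = {
        "r1": "ACGTACGT",
        "r2": "ACGT",
        "r3": "ACGTAC",
        "r4": "TTGA",
    }
    clusters = cluster_by_prefix(reads)
    print(clusters)

Here ``r2`` and ``r3`` are both prefixes of ``r1`` and join its cluster, while ``r4`` shares no prefix with a longer read and remains a singleton.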


**Usage:** :file:`denoiser_preprocess.py [options]`

**Input Arguments:**

.. note::

	
	**[REQUIRED]**
		
	-i, `-`-input_files
		Path to flowgram files (.sff.txt), comma separated
	
	**[OPTIONAL]**
		
	-f, `-`-fasta_file
		Path to fasta input file [default: None]
	-s, `-`-squeeze
		Use run-length encoding for prefix filtering [default: False]
	-l, `-`-log_file
		Path to log file [default: preprocess.log]
	-p, `-`-primer
		Primer sequence used for the amplification [default: CATGCTGCCTCCCGTAGGAGT]
	-o, `-`-output_dir
		Path to output directory [default: /tmp/]
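
The run-length encoding mentioned for ``--squeeze`` can be pictured with a short sketch. It assumes, without confirmation from the script itself, that squeezing means collapsing each homopolymer run to a single base before prefixes are compared; the helper names ``squeeze`` and ``run_length_encode`` are hypothetical.

::

    from itertools import groupby

    def run_length_encode(seq):
        """Return (base, run length) pairs for each homopolymer run."""
        return [(base, len(list(run))) for base, run in groupby(seq)]

    def squeeze(seq):
        """Keep one base per homopolymer run, discarding the run lengths."""
        return "".join(base for base, _ in groupby(seq))

    encoded = run_length_encode("AAACCGTT")
    squeezed = squeeze("AAACCGTT")
    print(encoded)
    print(squeezed)

Under this assumption, two reads that differ only in the length of a homopolymer run, such as ``AAACG`` and ``AACG``, squeeze to the same string and would therefore be treated as sharing a prefix.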


**Output:**


prefix_dereplicated.sff.txt: Human-readable sff file containing the flowgram of the
cluster representative of each cluster.

prefix_dereplicated.fasta: Fasta file containing the cluster representative of each cluster.

prefix_mapping.txt: This file contains the actual clusters. The cluster centroid is given first;
the cluster members follow after the ':'.
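
A mapping file of this shape could be read back with a few lines of Python. This is a sketch under two assumptions: the members after the ':' are separated by whitespace, and read ids contain no ':' themselves. The function name ``parse_prefix_mapping`` and the read ids in the example are hypothetical.

::

    def parse_prefix_mapping(lines):
        """Return a dict mapping each centroid id to a list of member ids."""
        clusters = {}
        for line in lines:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            # The centroid comes before the first ':', the members after it.
            centroid, _, members = line.partition(":")
            clusters[centroid.strip()] = members.split()
        return clusters

    example = ["read_a: read_b read_c", "read_d:"]
    mapping = parse_prefix_mapping(example)
    print(mapping)

To parse a real file, pass an open file handle instead of the example list, for instance ``parse_prefix_mapping(open("prefix_mapping.txt"))``.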



**Example:**

Run the program on the flowgrams in Fasting_Example.sff.txt. Remove reads which are not in seqs.fna.
Remove the primer CATGCTGCCTCCCGTAGGAGT from the reads before running phase I.

::

	denoiser_preprocess.py -i Fasting_Example.sff.txt -f seqs.fna -p CATGCTGCCTCCCGTAGGAGT