File: Instructions3.tut.xml

package info (click to toggle)
bali-phy 4.0-1
  • links: PTS, VCS
  • area: main
  • in suites: forky, sid, trixie
  • size: 15,392 kB
  • sloc: cpp: 120,442; xml: 13,966; haskell: 9,975; python: 2,936; yacc: 1,328; perl: 1,169; lex: 912; sh: 343; makefile: 26
file content (131 lines) | stat: -rw-r--r-- 6,838 bytes parent folder | download | duplicates (6)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
<!DOCTYPE book [
<!ENTITY % sgml.features "IGNORE">
<!ENTITY % xml.features "INCLUDE">

<!ENTITY % ent-mmlalias
      PUBLIC "-//W3C//ENTITIES Aiases for MathML 2.0//EN"
             "/usr/share/xml/schema/w3c/mathml/dtd/mmlalias.ent" >
%ent-mmlalias;
]>
<article xmlns="http://docbook.org/ns/docbook" version="5.0" 
         xmlns:mml="http://www.w3.org/1998/Math/MathML"
	 xml:lang="en">
  <info><title>Alignment Practical Exercises</title>
    <author><personname><firstname>Benjamin</firstname><surname>Redelings</surname></personname></author>
  </info>

<para>This tutorial assumes a UNIX command line.  On Linux or Mac this is built in; on Windows install Cygwin.</para>
<section><info><title>Exercise (Setup, MAFFT, HoT, Seaview)</title></info>

<section><info><title>Setup</title></info>
Look around:
% cd                        					// go to your home directory
% pwd                                                           // where are we? = print working directory
% ls                                                            // look around
<section><info><title>Make a <filename>~/bin</filename> directory (for programs) and add it to your PATH</title></info>
Check if <filename>~/bin/</filename> is in your PATH
% echo $PATH
If its not there, then we need to add it:
% mkdir ~/bin            					        // make a subdirectory
% nano ~/.profile	    				                // edit the startup script
% less ~/.profile					                // examine .profile to see if your changes are there
Add a line “PATH=~/bin:$PATH”.  Then log out and log back in.
% echo $PATH                                                    // check that ~/bin is in PATH
</section>

<section><info><title>Download the alignment files we will use</title></info>
% cd
% mkdir alignment_files
% cd ~/alignment_files
% wget http://www.bali-phy.org/examples.tgz       // Download the files – Linux (On cygwin, first install wget).
% curl -O http://www.bali-phy.org/examples.tgz   // Download the files – Mac
(Alternatively, download the files using your browser)
% tar -zxf examples.tgz                                       	// extract the compressed archive
% ls examples						        // look inside the new directory that was created
% less examples/5S-rRNA/5d.fasta                                // examine the downloaded files – type q to quit
% alignment-info examples/5S-rRNA/5d.fasta
</section>

<section><info><title>Install fasta-flip.pl to ~/bin</title></info>
% cd ~/alignment_files/examples
% chmod +x fasta-flip.pl				// make the file executable
% cp fasta-flip.pl ~/bin				// copy the file to ~/bin
% which fasta-flip.pl					// check to make sure its in your PATH
</section>
</section>
<section><info><title>Align forward and backwards (HoT) with mafft.</title></info>
% cp 5S-rRNA/25.fasta .				                                  // make a copy of a file
% mafft --auto 25.fasta &gt; 25-mafft.fasta                                           // try a simple sequence alignment 
% fasta-flip.pl 25.fasta | mafft --auto - | fasta-flip.pl &gt; 25-mafft-flipped.fasta // align the sequences when flipped
</section>
<section><info><title>Visualize the alignments with seaview</title></info>
% seaview 25-mafft.fasta &#38; seaview 25-mafft-flipped.fasta &#38;                       // take a look at the two alignments
</section>
</section>

  <section><info><title>Exercise (muscle, Probcons, SuiteMSA, and consensus)</title></info>
<section><info><title>Make a local copy of a data file</title></info>
% cd ~/alignment_files/examples
% cp HIV/HIVSIV.fasta .                                                               	// Make a copy of the file in the current directory
% alignment-info HIVSIV.fasta					// Some basic info about the sequences
</section>
<section><info><title>Muscle</title></info>
Look at the time to perform different numbers of iterations with muscle
% time muscle -maxiters 1 &lt; HIVSIV.fasta &gt; HIVSIV-muscle1.fasta	// use a bad guide tree
% time muscle -maxiters 2 &lt; HIVSIV.fasta &gt; HIVSIV-muscle2.fasta	// obtain a better guide tree and realign
% time muscle -maxiters 3 &lt; HIVSIV.fasta &gt; HIVSIV-muscle3.fasta	// also add refinement
% time muscle &lt; HIVSIV.fasta &gt; HIVSIV-muscle.fasta
% muscle -in HIVSIV.fasta -out HIVSIV-muscle.fasta                 	// This is the same as the previous command.

Compare the first and last alignments
% alignment-cat HIVSIV-muscle1.fasta --reorder-by-alignment=HIVSIV-muscle.fasta &gt; a      // reorder sequences
% mv a HIVSIV-muscle1.fasta 
% seaview HIVSIV-muscle1.fasta &#38; seaview HIVSIV-muscle.fasta &#38;
</section>

<section><info><title>Probcons</title></info>
Align sequences and reversed sequences with probcons
% probcons HIVSIV.fasta &gt; HIVSIV-probcons.fasta
% fasta-flip.pl HIVSIV.fasta &gt; HIVSIV-flipped.fasta
% probcons HIVSIV-flipped.fasta | fasta-flip.pl &gt; HIVSIV-probcons-flipped.fasta
</section>
<section><info><title>SuiteMSA</title></info>
Compare alignments using SuiteMSA
% SuiteMSA &#38;
Choose <guilabel>Pixel Plot</guilabel>, and select <filename>HIVSIV-probcons.fasta</filename>.  Click on <guilabel>Add</guilabel> to load a new alignment file, and choose <filename>HIVSIV-probcons-flipped.fasta</filename>.  Click on <guilabel>Add</guilabel> to load a new alignment file, and choose <filename>HIVSIV-muscle.fasta</filename>.
Q: What is the spatial pattern of the differences?
</section>
<section><info><title>Consensus</title></info>
Construct an alignment that only contains homologies present in both mafft and probcons alignments
% echo &gt; blank
% cat blank  // It should print a blank line to the screen.

% cat HIVSIV-muscle.fasta blank HIVSIV-probcons.fasta &gt; two-alignments.fasta
% alignment-consensus –strict=0.99 &lt; two-alignments.fasta &gt; consensus.fasta
% seaview consensus.fasta &#38;
</section>
  </section>
<section><info><title>Exercise (FSA and Prank)</title></info>
<section><info><title>FSA</title></info>
Align sequences with fsa.  This will take some time.
% cd ~/alignment_files/examples
% fsa --fast --gapfactor 0 HIVSIV.fasta &gt; HIVSIV-fsa0.fasta &#38;     	// This is similar to probcons – few gaps.
% fsa --fast --gapfactor 0.5 HIVSIV.fasta &gt; HIVSIV-fsa.fasta &#38;           // This is the default. Some gaps.

Look at the alignment lengths for the different fsa parameters:
% alignment-info HIVSIV-fsa0.fasta
% alignment-info HIVSIV-fsa.fasta
</section>
<section><info><title>Prank</title></info>
Align a (short) 5S data set with prank
% cd ~/alignment_files/examples/
% prank -d=ITS/ITS1.fasta +F
% cp output.best.fas ITS1-prank1.fasta

% mafft ITS/ITS1.fasta &gt; ITS1-mafft.fasta
% muscle &lt; ITS/ITS1.fasta &gt; ITS1-muscle.fasta
% SuiteMSA &#38;
Compare ITS1-prank.fasta, ITS1-mafft.fasta, and ITS1-muscle.fasta using SuiteMSA
</section>
</section>
</article>