#=================================================================#
#                                                                 #
#  PAL2NAL: robust conversion of protein sequence alignments      #
#           into the corresponding codon-based DNA alignments     #
#                                                                 #
#               version 1.0        June 15, 2005                  #
#               version 1.1        October 11, 2005               #
#               version 2.1        March 28, 2006                 #
#               version 8.0        May 29, 2006                   #
#               version 10.0       August 1, 2006                 #
#               version 11.0       August 9, 2006                 #
#               version 12.0       February 2, 2007               #
#               version 12.1       June 22, 2009                  #
#               version 12.2       September 9, 2009              #
#               version 13.0       July 26, 2010                  #
#               version 14.0       December 2, 2011               #
#               version 14.1       October 17, 2017               #
#                                                                 #
#               Mikita Suyama (mikita@bioreg.kyushu-u.ac.jp)      #
#                                                                 #
#=================================================================#


#------------------#
# What is pal2nal?
#------------------#

PAL2NAL is a program that converts a multiple sequence alignment
of proteins and the corresponding DNA (or mRNA) sequences into
a codon-based DNA alignment. The program automatically assigns
the corresponding codon sequence even if the input DNA sequence
has mismatches with the input protein sequence, or contains UTRs
or polyA tails. It can also deal with frame shifts in the input
alignment, which makes it suitable for the analysis of pseudogenes.
The resulting codon-based DNA alignment can then be used to
calculate synonymous (Ks) and non-synonymous (Ka) substitution
rates.
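
The core idea can be illustrated with a minimal Python sketch. This is
not the PAL2NAL implementation: it assumes the DNA is the exact coding
sequence of each protein (no mismatches, UTRs, polyA tails or frame
shifts, all of which PAL2NAL itself handles). The function name and the
toy sequences are hypothetical.

```python
def thread_codons(aligned_pep: str, dna: str) -> str:
    """Build one codon-alignment row from one aligned protein row.

    Each residue takes the next three bases of `dna`; each gap '-'
    becomes '---'. Assumes `dna` is exactly the coding sequence.
    """
    out = []
    pos = 0
    for ch in aligned_pep:
        if ch == "-":
            out.append("---")
        else:
            out.append(dna[pos:pos + 3])
            pos += 3
    return "".join(out)


# Toy input (hypothetical): two aligned proteins and their coding DNA.
pep_rows = {"seqA": "MK-L", "seqB": "MKTL"}
dna_seqs = {"seqA": "ATGAAACTG", "seqB": "ATGAAAACCCTG"}

codon_rows = {name: thread_codons(row, dna_seqs[name])
              for name, row in pep_rows.items()}
for name, row in codon_rows.items():
    print(name, row)
```

Every row of the result is three times as long as the protein alignment,
so codon columns stay aligned across sequences.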

The script is licensed under GPL v2.



#-----------#
# Reference
#-----------#

If you use PAL2NAL, please cite the following paper:

  - Mikita Suyama, David Torrents, and Peer Bork (2006)
    PAL2NAL: robust conversion of protein sequence alignments into
    the corresponding codon alignments.
    Nucleic Acids Res. 34:W609-W612.


#-------#
# Files
#-------#

The distribution version should contain the following files:

    README                 - This document
    pal2nal.pl             - The script (version 14) (written in Perl)
    test.aln               - test data (protein alignment)
    test.nuc               - test data (DNA sequences)

    for_paml (directory)
      test.cnt             - control file for codeml
      test.tree            - tree file used for codeml
      test.codeml.ori      - an example of codeml output


#-------#
# Usage
#-------#

Usage:  pal2nal.pl  pep.aln  nuc.fasta  [nuc.fasta...]  [options]

    pep.aln:     protein alignment either in CLUSTAL or FASTA format

                 - works not only with pairwise alignments but also
                   with alignments of more than two sequences.

                 - if there are frame shifts in your alignment,
                   you have to specify those positions by numbers:
                   for example '2' in the alignment means that
                   there are only two bases (i.e. one base deletion)
                   (see 'test.aln').
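
As a rough illustration of the frame-shift notation, the following Python
sketch counts how many DNA bases one aligned protein row accounts for.
It assumes that each residue letter stands for three bases, a digit
stands for that many bases, and a gap stands for none. The function name
and the example row are hypothetical, and this is not how pal2nal.pl is
implemented.

```python
def expected_bases(aligned_pep: str) -> int:
    """Count the DNA bases implied by one aligned protein row."""
    total = 0
    for ch in aligned_pep:
        if ch == "-":
            continue            # alignment gap: no bases
        if ch in "0123456789":
            total += int(ch)    # frame-shift position: that many bases
        else:
            total += 3          # ordinary residue: one full codon
    return total


# 'M', 'K' and 'L' contribute 3 bases each, '2' contributes 2.
print(expected_bases("MK2L-"))  # prints 11
```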


    nuc.fasta:   DNA sequences
                 (a single multi-FASTA file,
                  or several separate files)

    Options:

       -h            Show help

       -output (clustal|paml|fasta|codon)
                     Output format, default = clustal

       -blockonly    Show only user-specified blocks, marked with
                     '#' under the CLUSTAL alignment (see example)

       -nogap        remove columns with gaps and in-frame stop codons

       -nomismatch   remove mismatched codons (mismatch between
                     pep and cDNA) from the output

       -codontable (1(default)|2|3|4|5|6|9|10|11|12|13|14|15|16|21|22|23)
                     NCBI GenBank codon table
                     1  Universal code
                     2  Vertebrate mitochondrial code
                     3  Yeast mitochondrial code
                     4  Mold, Protozoan, and Coelenterate Mitochondrial code
                        and Mycoplasma/Spiroplasma code
                     5  Invertebrate mitochondrial
                     6  Ciliate, Dasycladacean and Hexamita nuclear code
                     9  Echinoderm and Flatworm mitochondrial code
                    10  Euplotid nuclear code
                    11  Bacterial, archaeal and plant plastid code
                    12  Alternative yeast nuclear code
                    13  Ascidian mitochondrial code
                    14  Alternative flatworm mitochondrial code
                    15  Blepharisma nuclear code
                    16  Chlorophycean mitochondrial code
                    21  Trematode mitochondrial code
                    22  Scenedesmus obliquus mitochondrial code
                    23  Thraustochytrium mitochondrial code

       -html         HTML output (only for the web server)

       -nostderr     No STDERR messages (only for the web server)
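
To make the effect of -nogap and -nomismatch concrete, here is a small
Python sketch of the two filters applied to codon-alignment rows. It is
an illustration under stated assumptions, not the script's own logic:
it uses only the universal code (codon table 1), works on whole codon
columns, assumes that a mismatch in any sequence removes the column, and
ignores frame-shift digits. All function names are hypothetical.

```python
# Universal genetic code (NCBI table 1), bases in the order T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def split_codons(row):
    """Split one codon-alignment row into three-character codons."""
    return [row[i:i + 3] for i in range(0, len(row), 3)]


def join_columns(codons_per_row, keep):
    """Rebuild each row from the codon columns listed in `keep`."""
    return ["".join(codons[c] for c in keep) for codons in codons_per_row]


def drop_gap_and_stop_columns(codon_rows):
    """-nogap idea: keep columns with no gap and no stop codon."""
    codons = [split_codons(row) for row in codon_rows]
    keep = [
        col for col in range(len(codons[0]))
        if all("-" not in cds[col] and STANDARD.get(cds[col].upper()) != "*"
               for cds in codons)
    ]
    return join_columns(codons, keep)


def drop_mismatch_columns(pep_rows, codon_rows):
    """-nomismatch idea: keep columns where every codon encodes its residue."""
    codons = [split_codons(row) for row in codon_rows]
    keep = [
        col for col in range(len(pep_rows[0]))
        if all(pep[col] == "-"
               or STANDARD.get(cds[col].upper()) == pep[col].upper()
               for pep, cds in zip(pep_rows, codons))
    ]
    return join_columns(codons, keep)


# Column 2 holds a stop codon (TAA) and column 3 a gap; both are dropped.
no_gap = drop_gap_and_stop_columns(["ATGAAA---CTG", "ATGTAAACCCTG"])
# GAA encodes E, not K, so the second column is dropped.
no_mismatch = drop_mismatch_columns(["MKL", "MKL"], ["ATGAAACTG", "ATGGAACTG"])
print(no_gap)
print(no_mismatch)
```

Other codon tables would only change the AMINO string, which follows the
same layout as the NCBI genetic code listings.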



    - The correspondence of IDs between pep.aln and nuc.fasta is
      automatically checked:
        - If you use the same IDs in both pep.aln and nuc.fasta,
          the sequences don't have to be in the same order.
        - If not, the order of the sequences in pep.aln and nuc.fasta
          has to be the same.

    - IDs in pep.aln are used in the output.
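
The ID rule above can be sketched in Python as follows. The function
name and the example IDs are hypothetical, and the check that the real
script performs may differ in detail (for instance in how it treats
partially matching ID sets).

```python
def pair_ids(pep_ids, nuc_ids):
    """Pair each protein ID with the ID of its DNA sequence.

    If both files use the same set of IDs, sequences are matched by ID,
    so their order does not matter. Otherwise they are matched by
    position, so the order in both files has to be the same.
    """
    if len(pep_ids) != len(nuc_ids):
        raise ValueError("different numbers of protein and DNA sequences")
    if set(pep_ids) == set(nuc_ids):
        return [(pid, pid) for pid in pep_ids]
    return list(zip(pep_ids, nuc_ids))


# Same IDs in a different order: matched by ID.
same_ids = pair_ids(["human", "mouse"], ["mouse", "human"])
# Different IDs: matched by position.
by_order = pair_ids(["human", "mouse"], ["cds_1", "cds_2"])
print(same_ids)
print(by_order)
```

In both cases the first element of each pair, the ID from pep.aln, is
the name that would appear in the output.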


Example:  pal2nal.pl  test.aln  test.nuc  -output paml  -nogap


#---------------------------------#
# How to calculate Ks, Ka values?
#---------------------------------#

To calculate Ks and Ka values, you need the codeml program, which
is included in the PAML package. You can download PAML from

  http://abacus.gene.ucl.ac.uk/software/paml.html

As an example, a control file (test.cnt) and a tree file (test.tree)
are in the "for_paml" sub-directory.

This control file and tree file are designed for the 'test' data
distributed with PAL2NAL.

Example:

   pal2nal.pl  test.aln  test.nuc  -output paml  -nogap  >  for_paml/test.codon

   cd for_paml

   codeml  test.cnt

     You can find the output of codeml in "test.codeml".
     The Ks and Ka values are at the very end of the output file.
     For comparison, a sample output file, "test.codeml.ori", is provided.
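
The same two steps can be scripted, for example from Python. The sketch
below only wraps the commands shown above. It assumes that pal2nal.pl is
executable and that both pal2nal.pl and codeml can be found on the PATH;
the function name and the dry_run switch are hypothetical conveniences.

```python
import subprocess
from pathlib import Path


def run_pipeline(script="pal2nal.pl", workdir="for_paml", dry_run=True):
    """Build the two commands and, unless dry_run is set, run them."""
    pal2nal_cmd = [script, "test.aln", "test.nuc", "-output", "paml", "-nogap"]
    codeml_cmd = ["codeml", "test.cnt"]
    if not dry_run:
        # Step 1: write the codon alignment to for_paml/test.codon.
        with open(Path(workdir) / "test.codon", "w") as out:
            subprocess.run(pal2nal_cmd, stdout=out, check=True)
        # Step 2: run codeml inside for_paml, as 'cd for_paml' does above.
        subprocess.run(codeml_cmd, cwd=workdir, check=True)
    return pal2nal_cmd, codeml_cmd


# With dry_run=True nothing is executed; the commands are only printed.
for cmd in run_pipeline():
    print(" ".join(cmd))
```

Calling run_pipeline(dry_run=False) from the directory that contains
test.aln and test.nuc would execute both steps.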


#------------#
# WWW server
#------------#

http://www.bork.embl.de/pal2nal
 or
http://www.genome.med.kyoto-u.ac.jp/cgi-bin/suyama/pal2nal/index.cgi  (not working)


#---------#
# Contact
#---------#

If you have any questions or comments, please email me:

  Mikita Suyama
  Medical Institute of Bioregulation
  Kyushu University,
  812-8582 Fukuoka, JAPAN
  tel: +81 92 642 6384
  fax: +81 92 642 6562
  email: mikita@bioreg.kyushu-u.ac.jp