<?xml version='1.0' encoding='UTF-8'?>
<!-- XML Authors: Corinne Maufrais, Nicolas Joly and Bertrand Neron,             -->
<!-- 'Biological Software and Databases' Group, Institut Pasteur, Paris.         -->
<!-- Distributed under LGPLv2 License. Please refer to the COPYING.LIB document. -->
<program>
  <head>
    <name>repeats</name>
    <version>1.1</version>
    <doc>
      <title>repeats</title>
      <description>
        <text lang="en">Search repeats in DNA sequence</text>
      </description>
      <comment>
      <div xmlns="http://www.w3.org/1999/xhtml">
      <p>The program scans a DNA sequence file, looking for tandemly repeated
patterns where the period of the repeat has a user-specified *size* from
1 to 32 nucleotides.  A possible repeat is found if *lookcount*
characters are repeated at a separation of *size*.</p>
<p>Example: Suppose size is 6 and lookcount is 3. Then the sequence
                <pre>
                ACGTGTCCGTA
                 ^^^   ^^^
                </pre>
could be part of a possible repeat of the pattern CGTGTC
because the first 3 characters CGT are repeated at a separation of 6.</p>

<p>Once a possible pattern is found, the program uses dynamic programming
to compute a similarity score of the pattern versus the sequence in
the area where the pattern was found.  The dynamic programming uses
weights for single indels rather than gap functions.  This is so that
the program quickly identifies the repeats rather than producing an
optimal alignment score.</p>

<p>If the similarity score exceeds a threshold, then a consensus pattern is
computed.  This consensus is aligned with the sequence and the
alignment is displayed.</p>
      </div>
      </comment>
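      <!--
      Illustration only, not part of the repeats program: a minimal Python
      sketch of the seed test described in the comment above, where a possible
      repeat is flagged when *lookcount* characters recur at a separation of
      *size*. The function name find_seeds is hypothetical. In the example
      sequence ACGTGTCCGTA the two CGT runs begin at 0-based offsets 1 and 7,
      so the seed is reported for a separation of 6.

```python
def find_seeds(sequence, size, lookcount):
    """Return 0-based offsets i where sequence[i:i+lookcount] recurs size characters later."""
    seeds = []
    # Last offset at which both windows still fit inside the sequence.
    last_start = len(sequence) - size - lookcount
    for i in range(last_start + 1):
        first = sequence[i:i + lookcount]
        second = sequence[i + size:i + size + lookcount]
        if first == second:
            seeds.append(i)
    return seeds


# CGT at offset 1 recurs at offset 7, a separation of 6.
print(find_seeds("ACGTGTCCGTA", size=6, lookcount=3))
```

      Each reported offset would only be a candidate; per the description, the
      program then scores the surrounding region with dynamic programming.
      -->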
      <authors>G. Benson</authors>
      <reference>A method for fast database search for all k-nucleotide repeats, by Gary Benson and Michael S. Waterman, Nucleic Acids Research (1994) Vol. 22, No. 22, pp 4828-4836.</reference>
    </doc>
    <category>sequence:nucleic:pattern</category>
    <command>repeats</command>
  </head>
  <parameters>
    <parameter ismandatory="1" issimple="1">
      <name>seq</name>
      <prompt lang="en">Sequence File</prompt>
      <type>
        <biotype>DNA</biotype>
        <datatype>
          <class>Sequence</class>
        </datatype>
        <dataFormat>GENBANK</dataFormat>
      </type>
      <format>
        <code proglang="perl">" $value"</code>
        <code proglang="python">" "+str(value)</code>
      </format>
      <argpos>1</argpos>
      <comment>
        <text lang="en">The data file must conform to the GenBank format.</text>
      </comment>
    </parameter>
    <parameter ismandatory="1" issimple="1">
      <name>alpha</name>
      <prompt lang="en">Match bonus (input as positive) (Alpha)</prompt>
      <type>
        <datatype>
          <class>Integer</class>
        </datatype>
      </type>
      <vdef>
      	<value>2</value>
      </vdef>
      <format>
        <code proglang="perl"> " $value"</code>
        <code proglang="python">" "+str(value)</code>
      </format>
      <ctrl>
        <message>
          <text lang="en">Value must not be negative</text>
        </message>
        <code proglang="perl">$value &gt;= 0</code>
        <code proglang="python">value &gt;= 0</code>
      </ctrl>
      <argpos>2</argpos>
    </parameter>
    <parameter ismandatory="1" issimple="1">
      <name>beta</name>
      <prompt lang="en">Mismatch penalty (input as positive) (Beta)</prompt>
      <type>
        <datatype>
          <class>Integer</class>
        </datatype>
      </type>
      <vdef>
      	<value>6</value>
      </vdef>
      <format>
        <code proglang="perl"> " $value"  </code>
        <code proglang="python">" "+str(value)</code>
      </format>
      <ctrl>
        <message>
          <text lang="en">Value must be positive</text>
        </message>
        <code proglang="perl">$value &gt; 0</code>
        <code proglang="python">value &gt; 0</code>
      </ctrl>
      <argpos>3</argpos>
    </parameter>
    <parameter ismandatory="1" issimple="1">
      <name>delta</name>
      <prompt lang="en">Indel penalty (input as positive) (Delta)</prompt>
      <type>
        <datatype>
          <class>Integer</class>
        </datatype>
      </type>
      <vdef>
      	<value>9</value>
      </vdef>
      <format>
        <code proglang="perl"> " $value"  </code>
        <code proglang="python">" " + str(value)</code>
      </format>
      <ctrl>
        <message>
          <text lang="en">Value must not be negative</text>
        </message>
        <code proglang="perl">$value &gt;= 0</code>
        <code proglang="python">value &gt;= 0</code>
      </ctrl>
      <argpos>4</argpos>
    </parameter>
    <parameter ismandatory="1" issimple="1">
      <name>reportmax</name>
      <prompt lang="en">Threshold score to report an alignment (Reportmax)</prompt>
      <type>
        <datatype>
          <class>Integer</class>
        </datatype>
      </type>
      <vdef>
      	<value>30</value>
      </vdef>
      <format>
        <code proglang="perl"> " $value"  </code>
        <code proglang="python">" " + str(value)</code>
      </format>
      <argpos>5</argpos>
    </parameter>
    <parameter ismandatory="1" issimple="1">
      <name>Size</name>
      <prompt lang="en">Pattern size (Size)</prompt>
      <type>
        <datatype>
          <class>Integer</class>
        </datatype>
      </type>
      <format>
        <code proglang="perl"> " $value"  </code>
        <code proglang="python">" " + str(value)</code>
      </format>
      <argpos>6</argpos>
    </parameter>
    <parameter ismandatory="1" issimple="1">
      <name>lookcount</name>
      <prompt lang="en">Number of characters to match to trigger dynamic programming (Lookcount)</prompt>
      <type>
        <datatype>
          <class>Integer</class>
        </datatype>
      </type>
      <format>
        <code proglang="perl"> " $value"  </code>
        <code proglang="python">" " + str(value)</code>
      </format>
      <argpos>7</argpos>
      <comment>
        <text lang="en">A possible repeat is found if *lookcount* characters are repeated at a separation of *size*. Values between 3 and 8 are recommended.</text>
      </comment>
    </parameter>
    <parameter issimple="1">
      <name>noshortperiods</name>
      <prompt lang="en">Exclude patterns with shorter periods? (Noshortperiods)</prompt>
      <type>
        <datatype>
          <class>Boolean</class>
        </datatype>
      </type>
      <vdef>
        <value>0</value>
      </vdef>
      <format>
        <code proglang="perl"> ($value)? " 1 ":" 0"</code>
        <code proglang="python">(" 0" , " 1 ")[ value ]</code>
      </format>
      <argpos>8</argpos>
    </parameter>
  </parameters>
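  <!--
  Illustration only: a minimal Python sketch of how the python format
  fragments above could combine into one command line. It assumes the
  fragments are concatenated in ascending argpos order after the command
  name; that ordering rule is an assumption about the framework, not
  something stated in this file. FORMATS, build_command_line and the file
  name seq.gb are hypothetical names used for the example.

```python
# (argpos, format function) pairs modeled on the python <code> elements above.
FORMATS = {
    "seq": (1, lambda value: " " + str(value)),
    "alpha": (2, lambda value: " " + str(value)),
    "beta": (3, lambda value: " " + str(value)),
    "delta": (4, lambda value: " " + str(value)),
    "reportmax": (5, lambda value: " " + str(value)),
    "Size": (6, lambda value: " " + str(value)),
    "lookcount": (7, lambda value: " " + str(value)),
    "noshortperiods": (8, lambda value: (" 0", " 1 ")[value]),
}


def build_command_line(command, values):
    """Join the formatted fragments in ascending argpos order (assumed rule)."""
    ordered = sorted(values, key=lambda name: FORMATS[name][0])
    return command + "".join(FORMATS[name][1](values[name]) for name in ordered)


# Defaults from the vdef elements, plus example values for Size and lookcount.
values = {
    "seq": "seq.gb",
    "alpha": 2,
    "beta": 6,
    "delta": 9,
    "reportmax": 30,
    "Size": 6,
    "lookcount": 3,
    "noshortperiods": False,
}
print(build_command_line("repeats", values))
```

  Under that assumption, every argument is positional, so all eight values
  must be supplied for the later ones to land in the right slot.
  -->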
</program>