File: TOPP_PTModel.html

<HTML>
<HEAD>
<TITLE>PTModel</TITLE>
<LINK HREF="doxygen.css" REL="stylesheet" TYPE="text/css">
<LINK HREF="style_ini.css" REL="stylesheet" TYPE="text/css">
</HEAD>
<BODY BGCOLOR="#FFFFFF">
<A href="index.html">Home</A> &nbsp;&middot;
<A href="classes.html">Classes</A> &nbsp;&middot;
<A href="annotated.html">Annotated Classes</A> &nbsp;&middot;
<A href="modules.html">Modules</A> &nbsp;&middot;
<A href="functions_func.html">Members</A> &nbsp;&middot;
<A href="namespaces.html">Namespaces</A> &nbsp;&middot;
<A href="pages.html">Related Pages</A>
<HR style="height:1px; border:none; border-top:1px solid #c0c0c0;">
<!-- Generated by Doxygen 1.8.5 -->
</div><!-- top -->
<div class="header">
  <div class="headertitle">
<div class="title">PTModel </div>  </div>
</div><!--header-->
<div class="contents">
<div class="textblock"><p>Used to train a model for the prediction of proteotypic peptides.</p>
<p>The input consists of two files: One file contains the positive examples (the peptides which are proteotypic) and the other contains the negative examples (the nonproteotypic peptides).</p>
<p>Parts of this model have been described in the publication:</p>
<p>Ole Schulz-Trieglaff, Nico Pfeifer, Clemens Gr&ouml;pl, Oliver Kohlbacher and Knut Reinert. LC-MSsim - a simulation software for Liquid Chromatography Mass Spectrometry data. BMC Bioinformatics 2008, 9:423.</p>
<p>Several parameters of the SVM can be changed in the INI file: </p>
<ul>
<li>
kernel_type: the kernel function (e.g., POLY for the polynomial kernel, LINEAR for the linear kernel or RBF for the Gaussian kernel); we recommend SVMWrapper::OLIGO for our paired oligo-border kernel (POBK)  </li>
<li>
border_length: border length for the POBK  </li>
<li>
k_mer_length: length of the signals considered in the POBK  </li>
<li>
sigma: the amount of positional smoothing for the POBK  </li>
<li>
degree: the degree parameter for the polynomial kernel  </li>
<li>
c: the penalty parameter of the svm  </li>
<li>
nu: the nu parameter for nu-SVC  </li>
</ul>
<p>The last four parameters (sigma, degree, c and nu) can be optimized in a cross validation (CV) to find the best values for the training set. For each of them you specify a start value, a step size, and a final value; the tested value never exceeds the final value. For example, to cross-validate the parameter c, specify <b>c_start</b>, <b>c_step_size</b> and <b>c_stop</b> in the INI file. Say you want to perform a CV for c from 0.1 to 2 with a step size of 0.1: open your INI file with INIFileEditor and set the fields c_start, c_step_size, and c_stop accordingly. Note that step sizes are applied multiplicatively unless <b>additive_cv</b> is set, so an additive step of 0.1 also requires that flag.</p>
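<p>As a rough illustration of how a start value, a step size and a stop value translate into tested values, the following Python sketch enumerates the grid for a single parameter. The function name <code>cv_grid</code> and the small floating-point tolerance are assumptions made for this example; PTModel's own implementation may differ in detail.</p>

```python
def cv_grid(start, step, stop, additive=False):
    """Return the values tested for one parameter, never exceeding stop.

    Hypothetical helper: with additive=False the current value is
    multiplied by the step size, otherwise the step size is added.
    """
    if additive and step <= 0:
        raise ValueError("additive step size must be positive")
    if not additive and (step <= 1 or start <= 0):
        raise ValueError("multiplicative grids need step > 1 and start > 0")
    # Tolerance so that rounding noise does not drop the final value.
    eps = 1e-9 * max(1.0, abs(stop))
    values = []
    i = 0
    while True:
        value = start + i * step if additive else start * step ** i
        if value > stop + eps:
            break
        values.append(round(value, 10))
        i += 1
    return values


# Default c range (1 to 1000, step 100), interpreted multiplicatively.
print(cv_grid(1, 100, 1000))  # [1, 100]
# The example from the text: c from 0.1 to 2 in additive steps of 0.1.
print(len(cv_grid(0.1, 0.1, 2, additive=True)))  # 20
```

<p>The comparison shows why the step interpretation matters: the default c range yields only two candidates when multiplied, but ten when the same numbers are applied additively.</p>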
<p>If the CV should test additional parameters over a range, specify them analogously to the example above. You can also set the number of partitions for the CV with <b>number_of_partitions</b> and the number of runs with <b>number_of_runs</b> in the INI file.</p>
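<p>To make the roles of the two settings concrete, here is a minimal Python sketch of repeated k-fold partitioning. It assumes that each run reshuffles the training examples before splitting them into partitions; this page does not describe how PTModel partitions the data, and the function name <code>cv_partitions</code> and the <code>seed</code> argument are hypothetical.</p>

```python
import random


def cv_partitions(n_samples, number_of_partitions=10, number_of_runs=10, seed=0):
    """Yield (run, fold, train_idx, test_idx) for repeated k-fold CV."""
    if number_of_partitions < 2 or number_of_partitions > n_samples:
        raise ValueError("need 2 <= number_of_partitions <= n_samples")
    rng = random.Random(seed)
    for run in range(number_of_runs):
        # Assumption: every run uses a fresh random order of the examples.
        order = list(range(n_samples))
        rng.shuffle(order)
        # Deal the shuffled indices into nearly equal-sized partitions.
        folds = [order[i::number_of_partitions] for i in range(number_of_partitions)]
        for fold, test_idx in enumerate(folds):
            # Train on every partition except the held-out one.
            train_idx = [i for j, f in enumerate(folds) if j != fold for i in f]
            yield run, fold, train_idx, test_idx


splits = list(cv_partitions(25, number_of_partitions=5, number_of_runs=3))
print(len(splits))  # 15 train/test splits: 5 partitions in each of 3 runs
```

<p>Each parameter combination would be evaluated on every split, so the cost of the CV grows with the product of runs, partitions and grid size.</p>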
<p><br/>
 Consequently, there are two ways to use this application:</p>
<ol>
<li>
Set the parameters of the SVM and set <b>skip_cv</b>: The PTModel application will train the SVM with the training data and store the SVM model.  </li>
<li>
Give a range of parameters for which a CV should be performed: The PTModel application will perform a CV to find the best parameter combination in the given range and afterwards train the SVM with the best parameters on the whole training data. Then the model is stored.  </li>
</ol>
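<p>The two modes can be sketched as command lines assembled in Python. The option names are taken from the tool's help output on this page; the helper <code>ptmodel_command</code> and the file names are placeholders invented for this example. The sketch only builds the argument list and does not run the tool.</p>

```python
def ptmodel_command(in_positive, in_negative, out, options=None, skip_cv=False):
    """Build an argument list for PTModel (hypothetical helper)."""
    cmd = ["PTModel",
           "-in_positive", in_positive,
           "-in_negative", in_negative,
           "-out", out]
    if skip_cv:
        # Mode 1: train directly with the given parameters, no CV.
        cmd.append("-cv:skip_cv")
    for name, value in (options or {}).items():
        if value is True:
            cmd.append("-" + name)  # plain flag such as -additive_cv
        elif value is False:
            continue  # unset flags are simply omitted
        else:
            cmd += ["-" + name, str(value)]
    return cmd


# Mode 1: fixed parameters, cross validation skipped.
fixed = ptmodel_command("positive.idXML", "negative.idXML", "model.txt",
                        {"c": 1, "sigma": 5}, skip_cv=True)

# Mode 2: cross-validate c from 0.1 to 2 in additive steps of 0.1.
search = ptmodel_command("positive.idXML", "negative.idXML", "model.txt",
                         {"additive_cv": True, "cv:c_start": 0.1,
                          "cv:c_step_size": 0.1, "cv:c_stop": 2})

print(" ".join(fixed))
print(" ".join(search))
```

<p>If PTModel is installed, either list could be passed to <code>subprocess.run</code>; the same settings can equally be stored in an INI file and supplied with <b>-ini</b>.</p>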
<p><br/>
 The model can be used in <a class="el" href="TOPP_PTPredict.html">PTPredict</a> to predict the likelihood that a peptide is proteotypic.</p>
<p><b>The command line parameters of this tool are:</b> </p>
<pre class="fragment">
PTModel -- Trains a model for the prediction of proteotypic peptides from a training set.
Version: 1.11.1 Nov 14 2013, 11:18:15, Revision: 11976

Usage:
  PTModel &lt;options&gt;

Options (mandatory options marked with '*'):
  -in_positive &lt;file&gt;*            Input file with positive examples (valid formats: 'idXML')
  -in_negative &lt;file&gt;*            Input file with negative examples (valid formats: 'idXML')
  -out &lt;file&gt;*                    Output file: the model in libsvm format (valid formats: 'txt')
  -c &lt;float&gt;                      The penalty parameter of the svm (default: '1')
  -svm_type &lt;type&gt;                The type of the svm (NU_SVC or C_SVC) (default: 'C_SVC' valid: 'NU_SVC', 
                                  'C_SVC')
  -nu &lt;float&gt;                     The nu parameter [0..1] of the svm (for nu-SVC) (default: '0.5' min: '0' 
                                  max: '1')
  -kernel_type &lt;type&gt;             The kernel type of the svm (default: 'OLIGO' valid: 'LINEAR', 'RBF', 'POLY'
                                  , 'OLIGO')
  -degree &lt;int&gt;                   The degree parameter of the kernel function of the svm (POLY kernel) 
                                  (default: '1' min: '1')
  -border_length &lt;int&gt;            Length of the POBK (default: '22' min: '1')
  -k_mer_length &lt;int&gt;             K_mer length of the POBK (default: '1' min: '1')
  -sigma &lt;float&gt;                  Sigma of the POBK (default: '5')
  -max_positive_count &lt;int&gt;       Quantity of positive samples for training (randomly chosen if smaller than 
                                  available quantity) (default: '1000' min: '1')
  -max_negative_count &lt;int&gt;       Quantity of negative samples for training (randomly chosen if smaller than 
                                  available quantity) (default: '1000' min: '1')
  -redundant                      If the input sets are redundant and the redundant peptides should occur 
                                  more than once in the training set, this flag has to be set
  -additive_cv                    If the step sizes should be interpreted additively (otherwise the current 
                                  value is multiplied with the step size to get the new value)
                                  

Parameters for the grid search / cross validation:
  -cv:skip_cv                     Has to be set if the cv should be skipped and the model should just be 
                                  trained with the specified parameters.
  -cv:number_of_runs &lt;int&gt;        Number of runs for the CV (default: '10' min: '1')
  -cv:number_of_partitions &lt;int&gt;  Number of CV partitions (default: '10' min: '2')
  -cv:degree_start &lt;int&gt;          Starting point of degree (default: '1' min: '1')
  -cv:degree_step_size &lt;int&gt;      Step size of degree (default: '2')
  -cv:degree_stop &lt;int&gt;           Stopping point of degree (default: '4')
  -cv:c_start &lt;float&gt;             Starting point of c (default: '1')
  -cv:c_step_size &lt;float&gt;         Step size of c (default: '100')
  -cv:c_stop &lt;float&gt;              Stopping point of c (default: '1000')
  -cv:nu_start &lt;float&gt;            Starting point of nu (default: '0.1' min: '0' max: '1')
  -cv:nu_step_size &lt;float&gt;        Step size of nu (default: '1.3')
  -cv:nu_stop &lt;float&gt;             Stopping point of nu (default: '0.9' min: '0' max: '1')
  -cv:sigma_start &lt;float&gt;         Starting point of sigma (default: '1')
  -cv:sigma_step_size &lt;float&gt;     Step size of sigma (default: '1.3')
  -cv:sigma_stop &lt;float&gt;          Stopping point of sigma (default: '15')

                                  
Common TOPP options:
  -ini &lt;file&gt;                     Use the given TOPP INI file
  -threads &lt;n&gt;                    Sets the number of threads allowed to be used by the TOPP tool (default: 
                                  '1')
  -write_ini &lt;file&gt;               Writes the default configuration file
  --help                          Shows options
  --helphelp                      Shows all options (including advanced)

</pre><p> <b>INI file documentation of this tool:</b> <div class="ini_global">
<div class="legend">
<b>Legend:</b><br>
 <div class="item item_required">required parameter</div>
 <div class="item item_advanced">advanced parameter</div>
</div>
  <div class="node"><span class="node_name">+PTModel</span><span class="node_description">Trains a model for the prediction of proteotypic peptides from a training set.</span></div>
    <div class="item item_advanced"><span class="item_name" style="padding-left:16px;">version</span><span class="item_value">1.11.1</span>
<span class="item_description">Version of the tool that generated this parameters file.</span><span class="item_tags"></span><span class="item_restrictions"> </span></div>    <div class="node"><span class="node_name">++1</span><span class="node_description">Instance '1' section for 'PTModel'</span></div>
      <div class="item"><span class="item_name item_required" style="padding-left:24px;">in_positive</span><span class="item_value"></span>
<span class="item_description">input file with positive examples</span><span class="item_tags">input file</span><span class="item_restrictions">*.idXML</span></div>      <div class="item"><span class="item_name item_required" style="padding-left:24px;">in_negative</span><span class="item_value"></span>
<span class="item_description">input file with negative examples</span><span class="item_tags">input file</span><span class="item_restrictions">*.idXML</span></div>      <div class="item"><span class="item_name item_required" style="padding-left:24px;">out</span><span class="item_value"></span>
<span class="item_description">output file: the model in libsvm format</span><span class="item_tags">output file</span><span class="item_restrictions">*.txt</span></div>      <div class="item"><span class="item_name" style="padding-left:24px;">c</span><span class="item_value">1</span>
<span class="item_description">the penalty parameter of the svm</span><span class="item_tags"></span><span class="item_restrictions"> </span></div>      <div class="item"><span class="item_name" style="padding-left:24px;">svm_type</span><span class="item_value">C_SVC</span>
<span class="item_description">the type of the svm (NU_SVC or C_SVC)</span><span class="item_tags"></span><span class="item_restrictions">NU_SVC,C_SVC</span></div>      <div class="item"><span class="item_name" style="padding-left:24px;">nu</span><span class="item_value">0.5</span>
<span class="item_description">the nu parameter [0..1] of the svm (for nu-SVC)</span><span class="item_tags"></span><span class="item_restrictions">0:1</span></div>      <div class="item"><span class="item_name" style="padding-left:24px;">kernel_type</span><span class="item_value">OLIGO</span>
<span class="item_description">the kernel type of the svm</span><span class="item_tags"></span><span class="item_restrictions">LINEAR,RBF,POLY,OLIGO</span></div>      <div class="item"><span class="item_name" style="padding-left:24px;">degree</span><span class="item_value">1</span>
<span class="item_description">the degree parameter of the kernel function of the svm (POLY kernel)</span><span class="item_tags"></span><span class="item_restrictions">1:&#8734;</span></div>      <div class="item"><span class="item_name" style="padding-left:24px;">border_length</span><span class="item_value">22</span>
<span class="item_description">length of the POBK</span><span class="item_tags"></span><span class="item_restrictions">1:&#8734;</span></div>      <div class="item"><span class="item_name" style="padding-left:24px;">k_mer_length</span><span class="item_value">1</span>
<span class="item_description">k_mer length of the POBK</span><span class="item_tags"></span><span class="item_restrictions">1:&#8734;</span></div>      <div class="item"><span class="item_name" style="padding-left:24px;">sigma</span><span class="item_value">5</span>
<span class="item_description">sigma of the POBK</span><span class="item_tags"></span><span class="item_restrictions"> </span></div>      <div class="item"><span class="item_name" style="padding-left:24px;">max_positive_count</span><span class="item_value">1000</span>
<span class="item_description">quantity of positive samples for training (randomly chosen if smaller than available quantity)</span><span class="item_tags"></span><span class="item_restrictions">1:&#8734;</span></div>      <div class="item"><span class="item_name" style="padding-left:24px;">max_negative_count</span><span class="item_value">1000</span>
<span class="item_description">quantity of negative samples for training (randomly chosen if smaller than available quantity)</span><span class="item_tags"></span><span class="item_restrictions">1:&#8734;</span></div>      <div class="item"><span class="item_name" style="padding-left:24px;">redundant</span><span class="item_value">false</span>
<span class="item_description">if the input sets are redundant and the redundant peptides should occur more than once in the training set, this flag has to be set</span><span class="item_tags"></span><span class="item_restrictions">true,false</span></div>      <div class="item"><span class="item_name" style="padding-left:24px;">additive_cv</span><span class="item_value">false</span>
<span class="item_description">if the step sizes should be interpreted additively (otherwise the current value is multiplied with the step size to get the new value)</span><span class="item_tags"></span><span class="item_restrictions">true,false</span></div>      <div class="item item_advanced"><span class="item_name" style="padding-left:24px;">log</span><span class="item_value"></span>
<span class="item_description">Name of log file (created only when specified)</span><span class="item_tags"></span><span class="item_restrictions"> </span></div>      <div class="item item_advanced"><span class="item_name" style="padding-left:24px;">debug</span><span class="item_value">0</span>
<span class="item_description">Sets the debug level</span><span class="item_tags"></span><span class="item_restrictions"> </span></div>      <div class="item"><span class="item_name" style="padding-left:24px;">threads</span><span class="item_value">1</span>
<span class="item_description">Sets the number of threads allowed to be used by the TOPP tool</span><span class="item_tags"></span><span class="item_restrictions"> </span></div>      <div class="item item_advanced"><span class="item_name" style="padding-left:24px;">no_progress</span><span class="item_value">false</span>
<span class="item_description">Disables progress logging to command line</span><span class="item_tags"></span><span class="item_restrictions">true,false</span></div>      <div class="item item_advanced"><span class="item_name" style="padding-left:24px;">test</span><span class="item_value">false</span>
<span class="item_description">Enables the test mode (needed for internal use only)</span><span class="item_tags"></span><span class="item_restrictions">true,false</span></div>      <div class="node"><span class="node_name">+++cv</span><span class="node_description">Parameters for the grid search / cross validation:</span></div>
        <div class="item"><span class="item_name" style="padding-left:32px;">skip_cv</span><span class="item_value">false</span>
<span class="item_description">Has to be set if the cv should be skipped and the model should just be trained with the specified parameters.</span><span class="item_tags"></span><span class="item_restrictions">true,false</span></div>        <div class="item"><span class="item_name" style="padding-left:32px;">number_of_runs</span><span class="item_value">10</span>
<span class="item_description">number of runs for the CV</span><span class="item_tags"></span><span class="item_restrictions">1:&#8734;</span></div>        <div class="item"><span class="item_name" style="padding-left:32px;">number_of_partitions</span><span class="item_value">10</span>
<span class="item_description">number of CV partitions</span><span class="item_tags"></span><span class="item_restrictions">2:&#8734;</span></div>        <div class="item"><span class="item_name" style="padding-left:32px;">degree_start</span><span class="item_value">1</span>
<span class="item_description">starting point of degree</span><span class="item_tags"></span><span class="item_restrictions">1:&#8734;</span></div>        <div class="item"><span class="item_name" style="padding-left:32px;">degree_step_size</span><span class="item_value">2</span>
<span class="item_description">step size of degree</span><span class="item_tags"></span><span class="item_restrictions"> </span></div>        <div class="item"><span class="item_name" style="padding-left:32px;">degree_stop</span><span class="item_value">4</span>
<span class="item_description">stopping point of degree</span><span class="item_tags"></span><span class="item_restrictions"> </span></div>        <div class="item"><span class="item_name" style="padding-left:32px;">c_start</span><span class="item_value">1</span>
<span class="item_description">starting point of c</span><span class="item_tags"></span><span class="item_restrictions"> </span></div>        <div class="item"><span class="item_name" style="padding-left:32px;">c_step_size</span><span class="item_value">100</span>
<span class="item_description">step size of c</span><span class="item_tags"></span><span class="item_restrictions"> </span></div>        <div class="item"><span class="item_name" style="padding-left:32px;">c_stop</span><span class="item_value">1000</span>
<span class="item_description">stopping point of c</span><span class="item_tags"></span><span class="item_restrictions"> </span></div>        <div class="item"><span class="item_name" style="padding-left:32px;">nu_start</span><span class="item_value">0.1</span>
<span class="item_description">starting point of nu</span><span class="item_tags"></span><span class="item_restrictions">0:1</span></div>        <div class="item"><span class="item_name" style="padding-left:32px;">nu_step_size</span><span class="item_value">1.3</span>
<span class="item_description">step size of nu</span><span class="item_tags"></span><span class="item_restrictions"> </span></div>        <div class="item"><span class="item_name" style="padding-left:32px;">nu_stop</span><span class="item_value">0.9</span>
<span class="item_description">stopping point of nu</span><span class="item_tags"></span><span class="item_restrictions">0:1</span></div>        <div class="item"><span class="item_name" style="padding-left:32px;">sigma_start</span><span class="item_value">1</span>
<span class="item_description">starting point of sigma</span><span class="item_tags"></span><span class="item_restrictions"> </span></div>        <div class="item"><span class="item_name" style="padding-left:32px;">sigma_step_size</span><span class="item_value">1.3</span>
<span class="item_description">step size of sigma</span><span class="item_tags"></span><span class="item_restrictions"> </span></div>        <div class="item"><span class="item_name" style="padding-left:32px;">sigma_stop</span><span class="item_value">15</span>
<span class="item_description">stopping point of sigma</span><span class="item_tags"></span><span class="item_restrictions"> </span></div></div>
 </div></div><!-- contents -->
<HR style="height:1px; border:none; border-top:1px solid #c0c0c0;">
<TABLE width="100%" border="0">
<TR>
<TD><font color="#c0c0c0">OpenMS / TOPP release 1.11.1</font></TD>
<TD align="right"><font color="#c0c0c0">Documentation generated on Thu Nov 14 2013 11:19:24 using doxygen 1.8.5</font></TD>
</TR>
</TABLE>
</BODY>
</HTML>