<HTML>
<HEAD>
<TITLE>PTModel</TITLE>
<LINK HREF="doxygen.css" REL="stylesheet" TYPE="text/css">
<LINK HREF="style_ini.css" REL="stylesheet" TYPE="text/css">
</HEAD>
<BODY BGCOLOR="#FFFFFF">
<A href="index.html">Home</A> ·
<A href="classes.html">Classes</A> ·
<A href="annotated.html">Annotated Classes</A> ·
<A href="modules.html">Modules</A> ·
<A href="functions_func.html">Members</A> ·
<A href="namespaces.html">Namespaces</A> ·
<A href="pages.html">Related Pages</A>
<HR style="height:1px; border:none; border-top:1px solid #c0c0c0;">
<!-- Generated by Doxygen 1.8.5 -->
</div><!-- top -->
<div class="header">
<div class="headertitle">
<div class="title">PTModel </div> </div>
</div><!--header-->
<div class="contents">
<div class="textblock"><p>Used to train a model for the prediction of proteotypic peptides.</p>
<p>The input consists of two files: one file contains the positive examples (the proteotypic peptides) and the other contains the negative examples (the non-proteotypic peptides).</p>
<p>Parts of this model have been described in the following publication:</p>
<p>Ole Schulz-Trieglaff, Nico Pfeifer, Clemens Gröpl, Oliver Kohlbacher and Knut Reinert. LC-MSsim - a simulation software for Liquid Chromatography Mass Spectrometry data. BMC Bioinformatics 2008, 9:423.</p>
<p>There are a number of parameters which can be changed for the svm (specified in the ini file): </p>
<ul>
<li>
kernel_type: the kernel function (e.g., POLY for the polynomial kernel, LINEAR for the linear kernel or RBF for the Gaussian kernel); we recommend OLIGO (SVMWrapper::OLIGO) for our paired oligo-border kernel (POBK) </li>
<li>
border_length: border length for the POBK </li>
<li>
k_mer_length: length of the signals considered in the POBK </li>
<li>
sigma: the amount of positional smoothing for the POBK </li>
<li>
degree: the degree parameter for the polynomial kernel </li>
<li>
c: the penalty parameter of the svm </li>
<li>
nu: the nu parameter for nu-SVC </li>
</ul>
<p>The last four parameters (sigma, degree, c and nu) can be optimized in a cross validation (CV) to find the best values for the training set. For each parameter to be tested, you specify a start value, a step size and a final value; the tested value never exceeds the final value. By default the current value is multiplied by the step size to obtain the next value; set the <b>additive_cv</b> flag if the step size should be added instead. To perform a cross validation for the parameter c, for example, you have to specify <b>c_start</b>, <b>c_step_size</b> and <b>c_stop</b> in the ini file. Let's say you want to perform a CV for c from 0.1 to 2 with an additive step size of 0.1. Open your ini file with INIFileEditor, set the fields c_start, c_step_size and c_stop to 0.1, 0.1 and 2, and enable additive_cv.</p>
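<p>To illustrate how a start value, a step size and a stop value translate into the values tested in the CV, here is a minimal Python sketch. It is not PTModel code: the function name <code>grid_values</code> is hypothetical, and the use of exact decimal arithmetic is an assumption made so that the example is reproducible. PTModel's own floating point handling may differ at the boundaries.</p>

```python
from decimal import Decimal


def grid_values(start, step, stop, additive):
    """Enumerate the values from start up to (and never beyond) stop.

    additive=True  mimics the additive_cv flag (next = current + step).
    additive=False mimics the default (next = current * step).
    Hypothetical helper for illustration only, not part of OpenMS.
    """
    # Decimal avoids binary rounding drift such as 0.1 + 0.2 != 0.3.
    start, step, stop = (Decimal(str(v)) for v in (start, step, stop))
    if additive and step <= 0:
        raise ValueError("additive step size must be > 0")
    if not additive and (step <= 1 or start <= 0):
        raise ValueError("multiplicative search needs step > 1 and start > 0")

    values = []
    current = start
    while current <= stop:
        values.append(float(current))
        current = current + step if additive else current * step
    return values


# The example from the text: c from 0.1 to 2 in additive steps of 0.1.
c_values = grid_values(0.1, 0.1, 2, additive=True)

# The documented defaults for c (start 1, step size 100, stop 1000),
# interpreted multiplicatively because additive_cv is off by default.
c_default_values = grid_values(1, 100, 1000, additive=False)
```

<p>Under these assumptions the additive example yields 20 values for c, while the multiplicative defaults yield only 1 and 100, because the next candidate (10000) would exceed c_stop.</p>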
<p>If the CV should test additional parameters over a certain range, specify their start, step size and stop values in the same way as in the example above. Furthermore, you can specify the number of partitions for the CV with <b>number_of_partitions</b> and the number of runs with <b>number_of_runs</b> in the ini file.</p>
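<p>The size of the grid, the number of partitions and the number of runs together determine how long a CV takes. The following Python sketch gives a rough estimate of the number of svm trainings. It assumes that every parameter combination is trained once per partition and per run, which is the usual scheme for repeated k-fold cross validation but is not confirmed here for PTModel; the function name <code>cv_training_count</code> is hypothetical.</p>

```python
from math import prod


def cv_training_count(grid_sizes, number_of_partitions=10, number_of_runs=10):
    """Estimate the number of svm trainings for a grid search with repeated CV.

    grid_sizes lists how many values are tested per parameter,
    e.g. [20, 11] for 20 values of c and 11 values of sigma.
    Assumption: one training per combination, partition and run.
    """
    if number_of_partitions < 2:
        raise ValueError("number_of_partitions must be at least 2")
    if number_of_runs < 1:
        raise ValueError("number_of_runs must be at least 1")
    combinations = prod(grid_sizes)  # every combination of the tested values
    return combinations * number_of_partitions * number_of_runs


# 20 values of c with the default 10 partitions and 10 runs.
trainings = cv_training_count([20])
```

<p>Since the estimate grows multiplicatively with every additional parameter, it can be worth starting with a coarse grid and refining it around the best combination.</p>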
<p><br/>
Consequently, there are two ways to use this application:</p>
<ol>
<li>
Set the parameters of the svm: The PTModel application will train the svm with the training data and store the svm model. </li>
<li>
Give a range of parameters for which a CV should be performed: The PTModel application will perform a CV to find the best parameter combination in the given range and afterwards train the svm with the best parameters and the whole training data. Then the model is stored. </li>
</ol>
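<p>The two modes above could be invoked as sketched below. The Python helper only assembles the argument list and does not run the tool. All option names are taken from the command line help of this tool; the helper name <code>ptmodel_command</code> and the file names are hypothetical placeholders.</p>

```python
def ptmodel_command(in_positive, in_negative, out,
                    fixed=None, cv_ranges=None, additive=False):
    """Assemble a PTModel argument list for one of the two modes.

    fixed:     svm parameters to set directly, e.g. {"c": 1, "sigma": 5}
    cv_ranges: parameter -> (start, step_size, stop), e.g. {"c": (0.1, 0.1, 2)}
    If cv_ranges is empty, the CV is skipped via -cv:skip_cv.
    """
    cmd = ["PTModel",
           "-in_positive", in_positive,
           "-in_negative", in_negative,
           "-out", out]
    for name, value in (fixed or {}).items():
        cmd += [f"-{name}", str(value)]
    if cv_ranges:
        if additive:
            cmd.append("-additive_cv")  # add the step size instead of multiplying
        for name, (start, step, stop) in cv_ranges.items():
            cmd += [f"-cv:{name}_start", str(start),
                    f"-cv:{name}_step_size", str(step),
                    f"-cv:{name}_stop", str(stop)]
    else:
        cmd.append("-cv:skip_cv")  # choice 1: train with the given parameters only
    return cmd


# Choice 1: fixed parameters, no cross validation.
train_only = ptmodel_command("positive.idXML", "negative.idXML", "model.txt",
                             fixed={"c": 1, "sigma": 5})

# Choice 2: cross validation for c from 0.1 to 2 in additive steps of 0.1.
with_cv = ptmodel_command("positive.idXML", "negative.idXML", "model.txt",
                          cv_ranges={"c": (0.1, 0.1, 2)}, additive=True)
```

<p>The resulting list can be handed to <code>subprocess.run</code> on a system where PTModel is installed, or joined with spaces to obtain the equivalent shell command.</p>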
<p><br/>
The model can be used in <a class="el" href="TOPP_PTPredict.html">PTPredict</a> to predict the likelihood for peptides to be proteotypic.</p>
<p><b>The command line parameters of this tool are:</b> </p>
<pre class="fragment">
PTModel -- Trains a model for the prediction of proteotypic peptides from a training set.
Version: 1.11.1 Nov 14 2013, 11:18:15, Revision: 11976
Usage:
PTModel <options>
Options (mandatory options marked with '*'):
-in_positive <file>* Input file with positive examples (valid formats: 'idXML')
-in_negative <file>* Input file with negative examples (valid formats: 'idXML')
-out <file>* Output file: the model in libsvm format (valid formats: 'txt')
-c <float> The penalty parameter of the svm (default: '1')
-svm_type <type> The type of the svm (NU_SVC or C_SVC) (default: 'C_SVC' valid: 'NU_SVC',
'C_SVC')
-nu <float> The nu parameter [0..1] of the svm (for nu-SVC) (default: '0.5' min: '0'
max: '1')
-kernel_type <type> The kernel type of the svm (default: 'OLIGO' valid: 'LINEAR', 'RBF', 'POLY'
, 'OLIGO')
-degree <int> The degree parameter of the kernel function of the svm (POLY kernel)
(default: '1' min: '1')
-border_length <int> Length of the POBK (default: '22' min: '1')
-k_mer_length <int> K_mer length of the POBK (default: '1' min: '1')
-sigma <float> Sigma of the POBK (default: '5')
-max_positive_count <int> Quantity of positive samples for training (randomly chosen if smaller than
available quantity) (default: '1000' min: '1')
-max_negative_count <int> Quantity of negative samples for training (randomly chosen if smaller than
available quantity) (default: '1000' min: '1')
-redundant If the input sets are redundant and the redundant peptides should occur
more than once in the training set, this flag has to be set
-additive_cv If the step sizes should be interpreted additively (otherwise the actual
value is multiplied with the step size to get the new value)
Parameters for the grid search / cross validation::
-cv:skip_cv Has to be set if the cv should be skipped and the model should just be
trained with the specified parameters.
-cv:number_of_runs <int> Number of runs for the CV (default: '10' min: '1')
-cv:number_of_partitions <int> Number of CV partitions (default: '10' min: '2')
-cv:degree_start <int> Starting point of degree (default: '1' min: '1')
-cv:degree_step_size <int> Step size of degree (default: '2')
-cv:degree_stop <int> Stopping point of degree (default: '4')
-cv:c_start <float> Starting point of c (default: '1')
-cv:c_step_size <float> Step size of c (default: '100')
-cv:c_stop <float> Stopping point of c (default: '1000')
-cv:nu_start <float> Starting point of nu (default: '0.1' min: '0' max: '1')
-cv:nu_step_size <float> Step size of nu (default: '1.3')
-cv:nu_stop <float> Stopping point of nu (default: '0.9' min: '0' max: '1')
-cv:sigma_start <float> Starting point of sigma (default: '1')
-cv:sigma_step_size <float> Step size of sigma (default: '1.3')
-cv:sigma_stop <float> Stopping point of sigma (default: '15')
Common TOPP options:
-ini <file> Use the given TOPP INI file
-threads <n> Sets the number of threads allowed to be used by the TOPP tool (default:
'1')
-write_ini <file> Writes the default configuration file
--help Shows options
--helphelp Shows all options (including advanced)
</pre><p> <b>INI file documentation of this tool:</b> <div class="ini_global">
<div class="legend">
<b>Legend:</b><br>
<div class="item item_required">required parameter</div>
<div class="item item_advanced">advanced parameter</div>
</div>
<div class="node"><span class="node_name">+PTModel</span><span class="node_description">Trains a model for the prediction of proteotypic peptides from a training set.</span></div>
<div class="item item_advanced"><span class="item_name" style="padding-left:16px;">version</span><span class="item_value">1.11.1</span>
<span class="item_description">Version of the tool that generated this parameters file.</span><span class="item_tags"></span><span class="item_restrictions"> </span></div> <div class="node"><span class="node_name">++1</span><span class="node_description">Instance '1' section for 'PTModel'</span></div>
<div class="item"><span class="item_name item_required" style="padding-left:24px;">in_positive</span><span class="item_value"></span>
<span class="item_description">input file with positive examples</span><span class="item_tags">input file</span><span class="item_restrictions">*.idXML</span></div> <div class="item"><span class="item_name item_required" style="padding-left:24px;">in_negative</span><span class="item_value"></span>
<span class="item_description">input file with negative examples</span><span class="item_tags">input file</span><span class="item_restrictions">*.idXML</span></div> <div class="item"><span class="item_name item_required" style="padding-left:24px;">out</span><span class="item_value"></span>
<span class="item_description">output file: the model in libsvm format</span><span class="item_tags">output file</span><span class="item_restrictions">*.txt</span></div> <div class="item"><span class="item_name" style="padding-left:24px;">c</span><span class="item_value">1</span>
<span class="item_description">the penalty parameter of the svm</span><span class="item_tags"></span><span class="item_restrictions"> </span></div> <div class="item"><span class="item_name" style="padding-left:24px;">svm_type</span><span class="item_value">C_SVC</span>
<span class="item_description">the type of the svm (NU_SVC or C_SVC)</span><span class="item_tags"></span><span class="item_restrictions">NU_SVC,C_SVC</span></div> <div class="item"><span class="item_name" style="padding-left:24px;">nu</span><span class="item_value">0.5</span>
<span class="item_description">the nu parameter [0..1] of the svm (for nu-SVC)</span><span class="item_tags"></span><span class="item_restrictions">0:1</span></div> <div class="item"><span class="item_name" style="padding-left:24px;">kernel_type</span><span class="item_value">OLIGO</span>
<span class="item_description">the kernel type of the svm</span><span class="item_tags"></span><span class="item_restrictions">LINEAR,RBF,POLY,OLIGO</span></div> <div class="item"><span class="item_name" style="padding-left:24px;">degree</span><span class="item_value">1</span>
<span class="item_description">the degree parameter of the kernel function of the svm (POLY kernel)</span><span class="item_tags"></span><span class="item_restrictions">1:∞</span></div> <div class="item"><span class="item_name" style="padding-left:24px;">border_length</span><span class="item_value">22</span>
<span class="item_description">length of the POBK</span><span class="item_tags"></span><span class="item_restrictions">1:∞</span></div> <div class="item"><span class="item_name" style="padding-left:24px;">k_mer_length</span><span class="item_value">1</span>
<span class="item_description">k_mer length of the POBK</span><span class="item_tags"></span><span class="item_restrictions">1:∞</span></div> <div class="item"><span class="item_name" style="padding-left:24px;">sigma</span><span class="item_value">5</span>
<span class="item_description">sigma of the POBK</span><span class="item_tags"></span><span class="item_restrictions"> </span></div> <div class="item"><span class="item_name" style="padding-left:24px;">max_positive_count</span><span class="item_value">1000</span>
<span class="item_description">quantity of positive samples for training (randomly chosen if smaller than available quantity)</span><span class="item_tags"></span><span class="item_restrictions">1:∞</span></div> <div class="item"><span class="item_name" style="padding-left:24px;">max_negative_count</span><span class="item_value">1000</span>
<span class="item_description">quantity of negative samples for training (randomly chosen if smaller than available quantity)</span><span class="item_tags"></span><span class="item_restrictions">1:∞</span></div> <div class="item"><span class="item_name" style="padding-left:24px;">redundant</span><span class="item_value">false</span>
<span class="item_description">if the input sets are redundant and the redundant peptides should occur more than once in the training set, this flag has to be set</span><span class="item_tags"></span><span class="item_restrictions">true,false</span></div> <div class="item"><span class="item_name" style="padding-left:24px;">additive_cv</span><span class="item_value">false</span>
<span class="item_description">if the step sizes should be interpreted additively (otherwise the actual value is multiplied with the step size to get the new value)</span><span class="item_tags"></span><span class="item_restrictions">true,false</span></div> <div class="item item_advanced"><span class="item_name" style="padding-left:24px;">log</span><span class="item_value"></span>
<span class="item_description">Name of log file (created only when specified)</span><span class="item_tags"></span><span class="item_restrictions"> </span></div> <div class="item item_advanced"><span class="item_name" style="padding-left:24px;">debug</span><span class="item_value">0</span>
<span class="item_description">Sets the debug level</span><span class="item_tags"></span><span class="item_restrictions"> </span></div> <div class="item"><span class="item_name" style="padding-left:24px;">threads</span><span class="item_value">1</span>
<span class="item_description">Sets the number of threads allowed to be used by the TOPP tool</span><span class="item_tags"></span><span class="item_restrictions"> </span></div> <div class="item item_advanced"><span class="item_name" style="padding-left:24px;">no_progress</span><span class="item_value">false</span>
<span class="item_description">Disables progress logging to command line</span><span class="item_tags"></span><span class="item_restrictions">true,false</span></div> <div class="item item_advanced"><span class="item_name" style="padding-left:24px;">test</span><span class="item_value">false</span>
<span class="item_description">Enables the test mode (needed for internal use only)</span><span class="item_tags"></span><span class="item_restrictions">true,false</span></div> <div class="node"><span class="node_name">+++cv</span><span class="node_description">Parameters for the grid search / cross validation:</span></div>
<div class="item"><span class="item_name" style="padding-left:32px;">skip_cv</span><span class="item_value">false</span>
<span class="item_description">Has to be set if the cv should be skipped and the model should just be trained with the specified parameters.</span><span class="item_tags"></span><span class="item_restrictions">true,false</span></div> <div class="item"><span class="item_name" style="padding-left:32px;">number_of_runs</span><span class="item_value">10</span>
<span class="item_description">number of runs for the CV</span><span class="item_tags"></span><span class="item_restrictions">1:∞</span></div> <div class="item"><span class="item_name" style="padding-left:32px;">number_of_partitions</span><span class="item_value">10</span>
<span class="item_description">number of CV partitions</span><span class="item_tags"></span><span class="item_restrictions">2:∞</span></div> <div class="item"><span class="item_name" style="padding-left:32px;">degree_start</span><span class="item_value">1</span>
<span class="item_description">starting point of degree</span><span class="item_tags"></span><span class="item_restrictions">1:∞</span></div> <div class="item"><span class="item_name" style="padding-left:32px;">degree_step_size</span><span class="item_value">2</span>
<span class="item_description">step size of degree</span><span class="item_tags"></span><span class="item_restrictions"> </span></div> <div class="item"><span class="item_name" style="padding-left:32px;">degree_stop</span><span class="item_value">4</span>
<span class="item_description">stopping point of degree</span><span class="item_tags"></span><span class="item_restrictions"> </span></div> <div class="item"><span class="item_name" style="padding-left:32px;">c_start</span><span class="item_value">1</span>
<span class="item_description">starting point of c</span><span class="item_tags"></span><span class="item_restrictions"> </span></div> <div class="item"><span class="item_name" style="padding-left:32px;">c_step_size</span><span class="item_value">100</span>
<span class="item_description">step size of c</span><span class="item_tags"></span><span class="item_restrictions"> </span></div> <div class="item"><span class="item_name" style="padding-left:32px;">c_stop</span><span class="item_value">1000</span>
<span class="item_description">stopping point of c</span><span class="item_tags"></span><span class="item_restrictions"> </span></div> <div class="item"><span class="item_name" style="padding-left:32px;">nu_start</span><span class="item_value">0.1</span>
<span class="item_description">starting point of nu</span><span class="item_tags"></span><span class="item_restrictions">0:1</span></div> <div class="item"><span class="item_name" style="padding-left:32px;">nu_step_size</span><span class="item_value">1.3</span>
<span class="item_description">step size of nu</span><span class="item_tags"></span><span class="item_restrictions"> </span></div> <div class="item"><span class="item_name" style="padding-left:32px;">nu_stop</span><span class="item_value">0.9</span>
<span class="item_description">stopping point of nu</span><span class="item_tags"></span><span class="item_restrictions">0:1</span></div> <div class="item"><span class="item_name" style="padding-left:32px;">sigma_start</span><span class="item_value">1</span>
<span class="item_description">starting point of sigma</span><span class="item_tags"></span><span class="item_restrictions"> </span></div> <div class="item"><span class="item_name" style="padding-left:32px;">sigma_step_size</span><span class="item_value">1.3</span>
<span class="item_description">step size of sigma</span><span class="item_tags"></span><span class="item_restrictions"> </span></div> <div class="item"><span class="item_name" style="padding-left:32px;">sigma_stop</span><span class="item_value">15</span>
<span class="item_description">stopping point of sigma</span><span class="item_tags"></span><span class="item_restrictions"> </span></div></div>
</div></div><!-- contents -->
<HR style="height:1px; border:none; border-top:1px solid #c0c0c0;">
<TABLE width="100%" border="0">
<TR>
<TD><font color="#c0c0c0">OpenMS / TOPP release 1.11.1</font></TD>
<TD align="right"><font color="#c0c0c0">Documentation generated on Thu Nov 14 2013 11:19:24 using doxygen 1.8.5</font></TD>
</TR>
</TABLE>
</BODY>
</HTML>