<HTML>
<HEAD>
<TITLE>Peak Intensity Prediction</TITLE>
<LINK HREF="doxygen.css" REL="stylesheet" TYPE="text/css">
<LINK HREF="style_ini.css" REL="stylesheet" TYPE="text/css">
</HEAD>
<BODY BGCOLOR="#FFFFFF">
<A href="index.html">Home</A> &nbsp;&middot;
<A href="classes.html">Classes</A> &nbsp;&middot;
<A href="annotated.html">Annotated Classes</A> &nbsp;&middot;
<A href="modules.html">Modules</A> &nbsp;&middot;
<A href="functions_func.html">Members</A> &nbsp;&middot;
<A href="namespaces.html">Namespaces</A> &nbsp;&middot;
<A href="pages.html">Related Pages</A>
<HR style="height:1px; border:none; border-top:1px solid #c0c0c0;">
<!-- Generated by Doxygen 1.8.5 -->
</div><!-- top -->
<div class="header">
  <div class="headertitle">
<div class="title">Peak Intensity Prediction </div>  </div>
</div><!--header-->
<div class="contents">
<div class="textblock"><p>This tutorial gives an overview of how to use the peak intensity prediction (PIP). In general, PIP allows you to predict the peak intensity of a peptide relative to other peptides of the same abundance from its sequence alone. The predicted value can also be used to correct peak intensities for peptide-specific instrument sensitivity in a label-free quantitation application.</p>
<p>This method is still in an early phase: A proof of concept has been conducted and published in <b><a href="#References">[1]</a></b>. Peak intensities <em>can</em> be predicted with significant correlations, but application tests are yet to come.</p>
<h1><a class="anchor" id="PIP_background"></a>
Background</h1>
<p>The sensitivity of a mass spectrometer depends on the analyzed peptides, among other factors. This peptide-specific sensitivity causes peak heights of peptides with the same abundance to be generally different. PIP incorporates a model that maps peptide sequences to peptide-specific sensitivities.</p>
<h1><a class="anchor" id="PIP_details"></a>
Machine learning details</h1>
<p>The incorporated model has been trained as a Local Linear Map <b><a href="#References">[2]</a></b>, a machine learning algorithm that combines supervised and unsupervised learning during training and is fast and easy to implement. Better results can be achieved with other learning architectures <b><a href="#References">[3]</a></b>; however, these are not yet implemented at this prototype stage.</p>
<h1><a class="anchor" id="PIP_training"></a>
About the training data</h1>
<p>The model used by the PIP module has been trained with data from a Bruker Ultraflex MALDI-TOF instrument. Details about these data can be found in <b><a href="#References">[3]</a></b>. A squared Pearson correlation of 0.43 was achieved in ten-fold cross-validation, and of 0.34 across datasets from the same instrument (but with different settings and operators). There is no experience yet with the performance across instruments, so we would be pleased if you could share your experience with the model incorporated in PIP when applied to other datasets.</p>
<p><br/>
 At this point, it is not possible to train a model with your own data, but this is a planned feature. It is not yet known how similar peptide-specific sensitivities are across different MALDI instruments.</p>
<h1><a class="anchor" id="PIP_howto"></a>
How to use PIP</h1>
<p>PIP lets you predict intensities using peptide sequences as input. The output values have been normalized to a mean of 0 and a variance of 1.</p>
<p><br/>
 To <b>test</b> PIP with data from your instrument, MALDI spectra that contain only peptides of one protein can be used:</p>
<ol type="1">
<li>Normalize your peak intensities by the sum of the peptide peaks only, so that they are comparable across spectra.</li>
<li>Logarithmize the resulting values.</li>
<li>Center your peak intensities and scale them to unit variance (multiple spectra should be used to estimate the mean and variance). These values are referred to as <em>tI</em> in the following.</li>
<li>Predict the peptides' peak intensities (referred to as <em>pI</em> in the following).</li>
<li>Calculate the correlation between <em>tI</em> and <em>pI</em>. In this test, calculating exp(log(tI) - pI) should give a result of 1.</li>
</ol>
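<p>The preprocessing and correlation steps above can be sketched in plain C++ as follows. This is only a sketch under stated assumptions: the helper names (<code>logSumNormalize</code>, <code>meanOf</code>, <code>varianceOf</code>, <code>standardize</code>, <code>pearson</code>) are hypothetical and not part of OpenMS, the natural logarithm is assumed for step 2, and the population variance is assumed for step 3. The mean and variance passed to <code>standardize</code> should be estimated from the values of several spectra pooled together.</p>

```cpp
#include <cmath>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Steps 1 and 2: divide each peptide peak by the sum of all peptide peaks
// in the spectrum, then take the natural logarithm (assumed base).
std::vector<double> logSumNormalize(const std::vector<double>& peaks)
{
  double sum = 0.0;
  for (double p : peaks) sum += p;
  std::vector<double> out;
  out.reserve(peaks.size());
  for (double p : peaks) out.push_back(std::log(p / sum));
  return out;
}

// Arithmetic mean; v must not be empty.
double meanOf(const std::vector<double>& v)
{
  double s = 0.0;
  for (double x : v) s += x;
  return s / static_cast<double>(v.size());
}

// Population variance (divides by N); v must not be empty.
double varianceOf(const std::vector<double>& v)
{
  const double m = meanOf(v);
  double s = 0.0;
  for (double x : v) s += (x - m) * (x - m);
  return s / static_cast<double>(v.size());
}

// Step 3: center and scale with a given mean and variance,
// ideally estimated from several spectra pooled together.
std::vector<double> standardize(const std::vector<double>& v, double mean, double variance)
{
  const double sd = std::sqrt(variance);
  std::vector<double> out;
  out.reserve(v.size());
  for (double x : v) out.push_back((x - mean) / sd);
  return out;
}

// Step 5: Pearson correlation between measured (tI) and predicted (pI) values.
// Returns NaN if either input has zero variance.
double pearson(const std::vector<double>& a, const std::vector<double>& b)
{
  if (a.size() != b.size() || a.empty())
  {
    throw std::invalid_argument("pearson: inputs must be non-empty and of equal size");
  }
  const double ma = meanOf(a);
  const double mb = meanOf(b);
  double sab = 0.0, saa = 0.0, sbb = 0.0;
  for (std::size_t i = 0; i < a.size(); ++i)
  {
    sab += (a[i] - ma) * (b[i] - mb);
    saa += (a[i] - ma) * (a[i] - ma);
    sbb += (b[i] - mb) * (b[i] - mb);
  }
  return sab / std::sqrt(saa * sbb);
}
```

<p>Squaring the value returned by <code>pearson</code> gives a squared correlation that can be compared with the figures reported for the training data.</p>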
<p><br/>
 To calculate relative peptide abundance (relative to the other peptides in the mixture) from the intensities of a peptide mixture using values predicted by PIP, perform steps 2 to 4 above. Then calculate the peptide level <em>x</em> = exp(log(tI) - pI). <b>Note:</b> Quantification of an actual protein mixture has never been tested with this model.</p>
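<p>The peptide level formula can be written as a one-line helper. The function below is a literal transcription of <em>x</em> = exp(log(tI) - pI); the name <code>peptideLevel</code> is hypothetical and not part of OpenMS. Because of the logarithm, the expression is only defined for tI &gt; 0, and the sketch makes no attempt to decide how <em>tI</em> should be scaled to satisfy that.</p>

```cpp
#include <cmath>

// Literal transcription of x = exp(log(tI) - pI).
// Only defined for tI > 0; std::log yields NaN for negative input.
double peptideLevel(double tI, double pI)
{
  return std::exp(std::log(tI) - pI);
}
```

<p>For positive <em>tI</em> this is algebraically the same as tI * exp(-pI), so a larger predicted value lowers the resulting level.</p>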
<h1><a class="anchor" id="PIP_example"></a>
Example code</h1>
<p>There is a usage example for the PeakIntensityPredictor class in <code>source/EXAMPLES/Tutorial_PeakIntensityPredictor.C</code>.</p>
<p>Sequences of peptides to be predicted should be stored in a vector of AASequence instances:</p>
<p><div class="fragment"><div class="line">  <span class="comment">//Create a vector holding the peptide sequences to be predicted</span></div>
<div class="line">  vector&lt;AASequence&gt; peptides;</div>
<div class="line">  peptides.push_back(AASequence(<span class="stringliteral">&quot;IVGLMPHPEHAVEK&quot;</span>));</div>
<div class="line">  peptides.push_back(AASequence(<span class="stringliteral">&quot;LADNISNAMQGISEATEPR&quot;</span>));</div>
<div class="line">  peptides.push_back(AASequence(<span class="stringliteral">&quot;ELDHSDTIEVIVNPEDIDYDAASEQAR&quot;</span>));</div>
<div class="line">  peptides.push_back(AASequence(<span class="stringliteral">&quot;AVDTVR&quot;</span>));</div>
<div class="line">  peptides.push_back(AASequence(<span class="stringliteral">&quot;AAWQVK&quot;</span>));</div>
<div class="line">  peptides.push_back(AASequence(<span class="stringliteral">&quot;FLGTQGR&quot;</span>));</div>
<div class="line">  peptides.push_back(AASequence(<span class="stringliteral">&quot;NYPSDWSDVDTK&quot;</span>));</div>
<div class="line">  peptides.push_back(AASequence(<span class="stringliteral">&quot;GSPSFGPESISTETWSAEPYGR&quot;</span>));</div>
<div class="line">  peptides.push_back(AASequence(<span class="stringliteral">&quot;TELGFDPEAHFAIDDEVIAHTR&quot;</span>));</div>
</div><!-- fragment --></p>
<p>Then create an instance of the model, and predict the peak intensities of the peptides:</p>
<p><div class="fragment"><div class="line"></div>
<div class="line">  <span class="comment">//Create a new predictor model</span></div>
<div class="line">  PeakIntensityPredictor model;</div>
<div class="line"></div>
<div class="line">  <span class="comment">//Predict one value per peptide with the Local Linear Map (LLM) model</span></div>
<div class="line">  vector&lt;DoubleReal&gt; predicted = model.predict(peptides);</div>
</div><!-- fragment --></p>
<p>You can output AASequence instances like normal strings:</p>
<p><div class="fragment"><div class="line"></div>
<div class="line">  <span class="comment">//For each peptide, print its sequence and the corresponding predicted peak intensity</span></div>
<div class="line">  <span class="keywordflow">for</span> (<a class="code" href="group__Concept.html#gaf9ecec2d692138fab9167164a457cbd4">Size</a> i = 0; i &lt; peptides.size(); i++)</div>
<div class="line">  {</div>
<div class="line">    cout &lt;&lt; <span class="stringliteral">&quot;Intensity of &quot;</span> &lt;&lt; peptides[i] &lt;&lt; <span class="stringliteral">&quot; is &quot;</span> &lt;&lt; predicted[i] &lt;&lt; endl;</div>
<div class="line">  }</div>
</div><!-- fragment --></p>
<h1><a class="anchor" id="References"></a>
References</h1>
<p><code> <b>[1]</b> </code>: <a href="http://bieson.ub.uni-bielefeld.de/frontdoor.php?source_opus=1370">Wiebke Timm: <em>Peak Intensity Prediction in Mass Spectra using Machine Learning Methods</em>, PhD Thesis (2008)</a><br/>
<code> <b>[2]</b> </code>: Helge Ritter: <em>Learning with the Self-Organizing Map</em>, in T. Kohonen et al., eds.: Artificial Neural Networks, Elsevier Science Publishers (1991), 379-384<br/>
<code> <b>[3]</b> </code>: <a href="http://www.biomedcentral.com/1471-2105/9/443">W. Timm, A. Scherbart, S. B&ouml;cker, O. Kohlbacher, T.W. Nattkemper: <em>Peak Intensity Prediction in MALDI-TOF Mass Spectrometry: A Machine Learning Study to support Quantitative Proteomics</em>, BMC Bioinformatics (2008)</a> </p>
</div></div><!-- contents -->
<HR style="height:1px; border:none; border-top:1px solid #c0c0c0;">
<TABLE width="100%" border="0">
<TR>
<TD><font color="#c0c0c0">OpenMS / TOPP release 1.11.1</font></TD>
<TD align="right"><font color="#c0c0c0">Documentation generated on Thu Nov 14 2013 11:19:24 using doxygen 1.8.5</font></TD>
</TR>
</TABLE>
</BODY>
</HTML>