File: tutorial_filtering.html

<HTML>
<HEAD>
<TITLE>Signal processing (Smoothing, baseline reduction, calibration)</TITLE>
<LINK HREF="doxygen.css" REL="stylesheet" TYPE="text/css">
<LINK HREF="style_ini.css" REL="stylesheet" TYPE="text/css">
</HEAD>
<BODY BGCOLOR="#FFFFFF">
<A href="index.html">Home</A> &nbsp;&middot;
<A href="classes.html">Classes</A> &nbsp;&middot;
<A href="annotated.html">Annotated Classes</A> &nbsp;&middot;
<A href="modules.html">Modules</A> &nbsp;&middot;
<A href="functions_func.html">Members</A> &nbsp;&middot;
<A href="namespaces.html">Namespaces</A> &nbsp;&middot;
<A href="pages.html">Related Pages</A>
<HR style="height:1px; border:none; border-top:1px solid #c0c0c0;">
<!-- Generated by Doxygen 1.8.5 -->
</div><!-- top -->
<div class="header">
  <div class="headertitle">
<div class="title">Signal processing (Smoothing, baseline reduction, calibration) </div>  </div>
</div><!--header-->
<div class="contents">
<div class="textblock"><p>OpenMS offers several filters that reduce the noise and baseline which disturb LC-MS measurements. These filters work spectrum-wise and can therefore be applied to a whole raw data map as well as to a single raw spectrum. Every filter offers a function "filter" for a single raw data container (e.g. <em>PeakSpectrum</em>) and a function "filterExperiment" for a collection of raw data containers (e.g. <em>PeakMap</em>). Both functions can be invoked either with an input container and an output container, or with iterators that define a range on the input container and an output container. The classes described in this section can be found in the <em>FILTERING</em> folder.</p>
<h1><a class="anchor" id="filtering_baseline"></a>
Baseline filters</h1>
<p>Baseline reduction can be performed with the top-hat method of the <em>MorphologicalFilter</em>. The top-hat filter is a morphological filter which uses the basic morphological operations "erosion" and "dilation" to remove the baseline in raw data. Because both operations are implemented as described by van Herk, the top-hat filter expects equally spaced raw data points. If your data is not yet uniformly spaced, use the <em>LinearResampler</em> to generate equally spaced data.</p>
<p>The top-hat filter removes signal structures in the raw data which are broader than the structuring element.</p>
<p>The following example (Tutorial_MorphologicalFilter.C) shows how to instantiate a <em>MorphologicalFilter</em>, select the top-hat method, set the length of the structuring element and remove the baseline in a raw LC-MS map.</p>
 <div class="fragment"><div class="line"><span class="keywordtype">int</span> <a class="code" href="RNPxl_8C.html#a217dbf8b442f20279ea00b898af96f52">main</a>(<span class="keywordtype">int</span> argc, <span class="keyword">const</span> <span class="keywordtype">char</span>** argv)</div>
<div class="line">{</div>
<div class="line">  <span class="keywordflow">if</span> (argc &lt; 2) <span class="keywordflow">return</span> 1;</div>
<div class="line">  <span class="comment">// the path to the data should be given on the command line</span></div>
<div class="line">  String tutorial_data_path(argv[1]);</div>
<div class="line">  </div>
<div class="line">  PeakMap exp;</div>
<div class="line"></div>
<div class="line">  MzMLFile mzml_file;</div>
<div class="line">  mzml_file.load(tutorial_data_path + <span class="stringliteral">&quot;/data/Tutorial_MorphologicalFilter.mzML&quot;</span>, exp);</div>
<div class="line"></div>
<div class="line">  Param parameters;</div>
<div class="line">  parameters.setValue(<span class="stringliteral">&quot;struc_elem_length&quot;</span>, 1.0);</div>
<div class="line">  parameters.setValue(<span class="stringliteral">&quot;struc_elem_unit&quot;</span>, <span class="stringliteral">&quot;Thomson&quot;</span>);</div>
<div class="line">  parameters.setValue(<span class="stringliteral">&quot;method&quot;</span>, <span class="stringliteral">&quot;tophat&quot;</span>);</div>
<div class="line"></div>
<div class="line">  MorphologicalFilter mf;</div>
<div class="line">  mf.setParameters(parameters);</div>
<div class="line"></div>
<div class="line">  mf.filterExperiment(exp);</div>
<div class="line"></div>
<div class="line">  <span class="keywordflow">return</span> 0;</div>
<div class="line">} <span class="comment">//end of main</span></div>
</div><!-- fragment --></p>
<dl class="section note"><dt>Note</dt><dd>In order to remove the baseline, the width of the structuring element should be greater than the width of a peak.</dd></dl>
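<p>The idea behind the top-hat filter can also be sketched without OpenMS. The following stand-alone C++ fragment is only an illustration under simplifying assumptions and not the OpenMS implementation: the function names <em>slidingExtreme</em> and <em>topHat</em> are hypothetical, the structuring element is flat and given as a half width in data points, the data is assumed to be equally spaced, and erosion and dilation are computed naively rather than with the van Herk algorithm.</p>

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Sliding-window minimum (erosion) or maximum (dilation) over a flat
// structuring element that spans 2 * half_width + 1 data points.
// The window is truncated at the borders of the signal.
std::vector<double> slidingExtreme(const std::vector<double>& y,
                                   std::size_t half_width, bool take_min)
{
  std::vector<double> out(y.size());
  for (std::size_t i = 0; i < y.size(); ++i)
  {
    const std::size_t lo = (i >= half_width) ? i - half_width : 0;
    const std::size_t hi = std::min(y.size() - 1, i + half_width);
    double v = y[lo];
    for (std::size_t j = lo + 1; j <= hi; ++j)
    {
      v = take_min ? std::min(v, y[j]) : std::max(v, y[j]);
    }
    out[i] = v;
  }
  return out;
}

// Top-hat: the signal minus its opening (erosion followed by dilation).
// The opening keeps only structures broader than the structuring element,
// so subtracting it removes those broad structures (the baseline).
std::vector<double> topHat(const std::vector<double>& y, std::size_t half_width)
{
  assert(!y.empty());
  const std::vector<double> opened =
      slidingExtreme(slidingExtreme(y, half_width, true), half_width, false);
  std::vector<double> out(y.size());
  for (std::size_t i = 0; i < y.size(); ++i)
  {
    out[i] = y[i] - opened[i];
  }
  return out;
}
```

<p>For the intensities 5, 5, 5, 9, 5, 5, 5 and a half width of one data point, <em>topHat</em> returns 0, 0, 0, 4, 0, 0, 0: the flat baseline is removed and the height of the one-point peak above the baseline is kept, because the structuring element (three points) is wider than the peak.</p>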
<h1><a class="anchor" id="filtering_smoothing"></a>
Smoothing filters</h1>
<p>We offer two smoothing filters to reduce noise in LC-MS measurements.</p>
<h2><a class="anchor" id="filtering_smoothing_gaussian"></a>
Gaussian filter</h2>
<p>The class <em>GaussFilter</em> is a Gaussian filter. The wider the kernel, the smoother the signal, but the more detail is lost.</p>
<p>The following example (Tutorial_GaussFilter.C) shows how to smooth a raw data map. The Gaussian kernel width is set to 1 m/z.</p>
 <div class="fragment"><div class="line"><span class="keywordtype">int</span> <a class="code" href="RNPxl_8C.html#a217dbf8b442f20279ea00b898af96f52">main</a>(<span class="keywordtype">int</span> argc, <span class="keyword">const</span> <span class="keywordtype">char</span>** argv)</div>
<div class="line">{</div>
<div class="line">  <span class="keywordflow">if</span> (argc &lt; 2) <span class="keywordflow">return</span> 1;</div>
<div class="line">  <span class="comment">// the path to the data should be given on the command line</span></div>
<div class="line">  String tutorial_data_path(argv[1]);</div>
<div class="line"></div>
<div class="line">  PeakMap exp;</div>
<div class="line"></div>
<div class="line">  MzMLFile mzdata_file;</div>
<div class="line">  mzdata_file.load(tutorial_data_path + <span class="stringliteral">&quot;/data/Tutorial_GaussFilter.mzML&quot;</span>, exp);</div>
<div class="line"></div>
<div class="line">  GaussFilter g;</div>
<div class="line">  Param param;</div>
<div class="line">  param.setValue(<span class="stringliteral">&quot;gaussian_width&quot;</span>, 1.0);</div>
<div class="line">  g.setParameters(param);</div>
<div class="line"></div>
<div class="line">  g.filterExperiment(exp);</div>
<div class="line"></div>
<div class="line">  <span class="keywordflow">return</span> 0;</div>
<div class="line">} <span class="comment">//end of main</span></div>
</div><!-- fragment --></p>
<dl class="section note"><dt>Note</dt><dd>Use a Gaussian filter kernel which has approximately the same width as your mass peaks.</dd></dl>
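<p>To illustrate why a wider kernel loses more detail, the following stand-alone C++ sketch smooths equally spaced intensities with a normalized Gaussian kernel. It is not the <em>GaussFilter</em> implementation: the function name <em>gaussSmooth</em> is hypothetical, and the parameter <em>sigma_pts</em> (the standard deviation in data points) is an assumption made for this sketch and is not the same quantity as the "gaussian_width" parameter. The kernel is cut off at four standard deviations and renormalized at the borders.</p>

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Smooths equally spaced intensities with a normalized Gaussian kernel.
// sigma_pts: standard deviation of the kernel in data points
// (hypothetical parameter of this sketch).
std::vector<double> gaussSmooth(const std::vector<double>& y, double sigma_pts)
{
  assert(sigma_pts > 0.0);
  const long half = static_cast<long>(std::ceil(4.0 * sigma_pts));
  const long n = static_cast<long>(y.size());
  std::vector<double> out(y.size(), 0.0);
  for (long i = 0; i < n; ++i)
  {
    double weighted_sum = 0.0;
    double weight_total = 0.0;
    for (long k = -half; k <= half; ++k)
    {
      const long j = i + k;
      if (j < 0 || j >= n) continue; // truncate the kernel at the borders
      const double w = std::exp(-0.5 * (k / sigma_pts) * (k / sigma_pts));
      weighted_sum += w * y[j];
      weight_total += w;
    }
    // Dividing by the sum of the weights actually used keeps flat
    // regions flat, also near the borders.
    out[i] = weighted_sum / weight_total;
  }
  return out;
}
```

<p>A constant signal passes through unchanged, whereas an isolated one-point spike is lowered and spread over its neighbours. Increasing <em>sigma_pts</em> lowers the spike further, which is the loss of detail described above.</p>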
<h2><a class="anchor" id="filtering_smoothing_sgolay"></a>
Savitzky-Golay filter</h2>
<p>The Savitzky-Golay filter is implemented in two ways: <em>SavitzkyGolaySVDFilter</em> and <em>SavitzkyGolayQRFilter</em>. Both filters produce the same result, but in most cases the <em>SavitzkyGolaySVDFilter</em> has a better run time. The Savitzky-Golay filter works only on equally spaced data. If your data is not uniformly spaced, use the <em>LinearResampler</em> to generate equally spaced data. The degree of smoothing depends on two parameters: the frame size and the order of the polynomial used for smoothing. The frame size corresponds to the number of filter coefficients, so the width of the smoothing interval is given by frame_size * spacing of the raw data. The bigger the frame size or the smaller the order, the smoother the signal, but the more detail is lost.</p>
<p>The following example (Tutorial_SavitzkyGolayFilter.C) shows how to use a <em>SavitzkyGolaySVDFilter</em> (the <em>SavitzkyGolayQRFilter</em> has the same interface) to smooth a single spectrum. The single raw data spectrum is loaded and resampled to uniform data with a spacing of 0.01 m/z. The frame size of the Savitzky-Golay filter is set to 21 data points and the polynomial order is set to 3, so the smoothing interval is 21 * 0.01 = 0.21 m/z wide. Afterwards the filter is applied to the resampled spectrum.</p>
 <div class="fragment"><div class="line"><span class="keywordtype">int</span> <a class="code" href="RNPxl_8C.html#a217dbf8b442f20279ea00b898af96f52">main</a>(<span class="keywordtype">int</span> argc, <span class="keyword">const</span> <span class="keywordtype">char</span>** argv)</div>
<div class="line">{</div>
<div class="line">  <span class="keywordflow">if</span> (argc &lt; 2) <span class="keywordflow">return</span> 1;</div>
<div class="line">  <span class="comment">// the path to the data should be given on the command line</span></div>
<div class="line">  String tutorial_data_path(argv[1]);</div>
<div class="line">  </div>
<div class="line">  PeakSpectrum spectrum;</div>
<div class="line"></div>
<div class="line">  DTAFile dta_file;</div>
<div class="line">  dta_file.load(tutorial_data_path + <span class="stringliteral">&quot;/data/Tutorial_SavitzkyGolayFilter.dta&quot;</span>, spectrum);</div>
<div class="line"></div>
<div class="line">  LinearResampler lr;</div>
<div class="line">  Param param_lr;</div>
<div class="line">  param_lr.setValue(<span class="stringliteral">&quot;spacing&quot;</span>, 0.01);</div>
<div class="line">  lr.setParameters(param_lr);</div>
<div class="line">  lr.raster(spectrum);</div>
<div class="line"></div>
<div class="line">  SavitzkyGolayFilter sg;</div>
<div class="line">  Param param_sg;</div>
<div class="line">  param_sg.setValue(<span class="stringliteral">&quot;frame_length&quot;</span>, 21);</div>
<div class="line">  param_sg.setValue(<span class="stringliteral">&quot;polynomial_order&quot;</span>, 3);</div>
<div class="line">  sg.setParameters(param_sg);</div>
<div class="line">  sg.filter(spectrum);</div>
<div class="line"></div>
<div class="line">  <span class="keywordflow">return</span> 0;</div>
<div class="line">} <span class="comment">//end of main</span></div>
</div><!-- fragment --></p>
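<p>The role of the filter coefficients can be illustrated with the smallest common case. The following stand-alone C++ sketch is not the OpenMS implementation: it hard-codes the well-known 5-point Savitzky-Golay smoothing coefficients (-3, 12, 17, 12, -3) / 35, which correspond to a frame size of 5 and a polynomial order of 2 or 3, instead of computing coefficients for an arbitrary frame size and order. The function name <em>savitzkyGolay5</em> is hypothetical, and leaving the two points at each border unchanged is a simplification chosen for this sketch.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Applies the 5-point Savitzky-Golay smoothing filter (polynomial order
// 2 or 3) to equally spaced intensities. Each interior point is replaced
// by the value of the least-squares polynomial fitted to its frame.
std::vector<double> savitzkyGolay5(const std::vector<double>& y)
{
  assert(y.size() >= 5);
  static const double c[5] = {-3.0, 12.0, 17.0, 12.0, -3.0};
  std::vector<double> out(y); // the two points at each border are copied unchanged
  for (std::size_t i = 2; i + 2 < y.size(); ++i)
  {
    double acc = 0.0;
    for (std::size_t k = 0; k < 5; ++k)
    {
      acc += c[k] * y[i + k - 2];
    }
    out[i] = acc / 35.0; // 35 is the sum of the coefficients
  }
  return out;
}
```

<p>Because the fitted polynomial has order 2 or 3, a signal that is itself a polynomial of that order is reproduced exactly at the interior points, while a single-point spike is reduced to 17/35 of its height.</p>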
<h1><a class="anchor" id="filtering_calibration"></a>
Calibration</h1>
<p>OpenMS offers methods for external and internal calibration of raw or peak data.</p>
<h2><a class="anchor" id="filtering_calibration_internal"></a>
Internal Calibration</h2>
<p>The InternalCalibration uses reference masses for calibration. Each spectrum must contain at least two reference masses; otherwise it is not calibrated. The data to be calibrated can be raw data or already picked data. For raw data, a peak picking step is necessary first. For the important peak picking parameters, have a look at the <a class="el" href="tutorial_transformations.html#transformations_pp">Peak picking</a> section.</p>
<p>The following example (Tutorial_InternalCalibration.C) shows how to use the InternalCalibration for raw data. First the data and reference masses are loaded.</p>
 <div class="fragment"><div class="line"><span class="keywordtype">int</span> <a class="code" href="RNPxl_8C.html#a217dbf8b442f20279ea00b898af96f52">main</a>(<span class="keywordtype">int</span> argc, <span class="keyword">const</span> <span class="keywordtype">char</span>** argv)</div>
<div class="line">{</div>
<div class="line">  <span class="keywordflow">if</span> (argc &lt; 2) <span class="keywordflow">return</span> 1;</div>
<div class="line">  <span class="comment">// the path to the data should be given on the command line</span></div>
<div class="line">  String tutorial_data_path(argv[1]);</div>
<div class="line"></div>
<div class="line">  InternalCalibration ic;</div>
<div class="line">  PeakMap exp, exp_calibrated;</div>
<div class="line">  MzMLFile mzml_file;</div>
<div class="line">  mzml_file.load(tutorial_data_path + <span class="stringliteral">&quot;/data/Tutorial_InternalCalibration.mzML&quot;</span>, exp);</div>
<div class="line"></div>
<div class="line">  std::vector&lt;double&gt; ref_masses;</div>
<div class="line">  ref_masses.push_back(1296.68476942);</div>
<div class="line">  ref_masses.push_back(2465.19833942);</div>
</div><!-- fragment --></p>
<p>Then we set the important peak picking parameters and run the internal calibration: <div class="fragment"><div class="line">} <span class="comment">//end of main</span></div>
</div><!-- fragment --></p>
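<p>Why at least two reference masses are needed can be seen from a simple model. The following stand-alone C++ sketch assumes that the calibration is a straight line fitted by least squares to pairs of observed and reference masses; this is an assumption made for illustration, not a description of what InternalCalibration does internally. The names <em>LinearFit</em>, <em>fitLine</em> and <em>calibrate</em> are hypothetical.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Slope and intercept of the line  reference = slope * observed + intercept.
struct LinearFit
{
  double slope;
  double intercept;
};

// Least-squares line through (observed, reference) mass pairs.
// A line has two unknowns, so at least two distinct pairs are required.
LinearFit fitLine(const std::vector<double>& observed,
                  const std::vector<double>& reference)
{
  assert(observed.size() == reference.size());
  assert(observed.size() >= 2);
  const double n = static_cast<double>(observed.size());
  double sum_x = 0.0;
  double sum_y = 0.0;
  for (std::size_t i = 0; i < observed.size(); ++i)
  {
    sum_x += observed[i];
    sum_y += reference[i];
  }
  const double mean_x = sum_x / n;
  const double mean_y = sum_y / n;
  double sxx = 0.0;
  double sxy = 0.0;
  for (std::size_t i = 0; i < observed.size(); ++i)
  {
    const double dx = observed[i] - mean_x;
    sxx += dx * dx;
    sxy += dx * (reference[i] - mean_y);
  }
  assert(sxx > 0.0); // the observed masses must not all be identical
  LinearFit fit;
  fit.slope = sxy / sxx;
  fit.intercept = mean_y - fit.slope * mean_x;
  return fit;
}

// Maps an observed m/z value to its calibrated value.
double calibrate(const LinearFit& fit, double observed_mz)
{
  return fit.slope * observed_mz + fit.intercept;
}
```

<p>With exactly two pairs the line passes through both points, so the observed positions of the two reference peaks are mapped exactly onto the reference masses (for example 1296.68476942 and 2465.19833942 from the listing above). With a single pair the slope would be undetermined.</p>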
<h2><a class="anchor" id="filtering_calibration_external"></a>
TOF Calibration</h2>
<p>The TOFCalibration uses calibrant spectra to convert a spectrum containing time-of-flight (TOF) values into one with m/z values. For the calibrant spectra, both the expected masses and the calibration constants that convert their TOF values into m/z (determined by the instrument) need to be known. First, a quadratic curve is fitted to the TOF and m/z values of the calibrant spectra. The remaining error is then estimated by a spline curve fit. The quadratic function and the splines together determine the calibration equation for the conversion of the experimental data.</p>
<p>The following example (Tutorial_TOFCalibration.C) shows how to use the TOFCalibration for raw data. First the spectra and reference masses are loaded.</p>
 <div class="fragment"><div class="line"><span class="keywordtype">int</span> <a class="code" href="RNPxl_8C.html#a217dbf8b442f20279ea00b898af96f52">main</a>(<span class="keywordtype">int</span> argc, <span class="keyword">const</span> <span class="keywordtype">char</span>** argv)</div>
<div class="line">{</div>
<div class="line">  <span class="keywordflow">if</span> (argc &lt; 2) <span class="keywordflow">return</span> 1;</div>
<div class="line">  <span class="comment">// the path to the data should be given on the command line</span></div>
<div class="line">  String tutorial_data_path(argv[1]);</div>
<div class="line">  </div>
<div class="line">  TOFCalibration ec;</div>
<div class="line">  PeakMap exp_raw, calib_exp;</div>
<div class="line">  MzMLFile mzml_file;</div>
<div class="line">  mzml_file.load(tutorial_data_path + <span class="stringliteral">&quot;/data/Tutorial_TOFCalibration_peak.mzML&quot;</span>, calib_exp);</div>
<div class="line">  mzml_file.load(tutorial_data_path + <span class="stringliteral">&quot;/data/Tutorial_TOFCalibration_raw.mzML&quot;</span>, exp_raw);</div>
<div class="line"></div>
<div class="line">  vector&lt;DoubleReal&gt; ref_masses;</div>
<div class="line">  TextFile ref_file;</div>
<div class="line">  ref_file.load(tutorial_data_path + <span class="stringliteral">&quot;/data/Tutorial_TOFCalibration_masses.txt&quot;</span>, <span class="keyword">true</span>);</div>
<div class="line">  <span class="keywordflow">for</span> (TextFile::Iterator iter = ref_file.begin(); iter != ref_file.end(); ++iter)</div>
<div class="line">  {</div>
<div class="line">    ref_masses.push_back(String(iter-&gt;c_str()).toDouble());</div>
<div class="line">  }</div>
</div><!-- fragment --></p>
<p>Then we set the calibration constants for the calibrant spectra. <div class="fragment"><div class="line"></div>
<div class="line">  std::vector&lt;DoubleReal&gt; ml1;</div>
<div class="line">  ml1.push_back(418327.924993827);</div>
<div class="line"></div>
<div class="line">  std::vector&lt;DoubleReal&gt; ml2;</div>
<div class="line">  ml2.push_back(253.645187196031);</div>
<div class="line"></div>
<div class="line">  std::vector&lt;DoubleReal&gt; ml3;</div>
<div class="line">  ml3.push_back(-0.0414243465397252);</div>
<div class="line"></div>
<div class="line">  ec.setML1s(ml1);</div>
<div class="line">  ec.setML2s(ml2);</div>
<div class="line">  ec.setML3s(ml3);</div>
</div><!-- fragment --></p>
<p>Finally, we set the important peak picking parameters and run the external calibration: <div class="fragment"><div class="line"></div>
<div class="line">  Param param;</div>
<div class="line">  param.setValue(<span class="stringliteral">&quot;PeakPicker:peak_width&quot;</span>, 0.1);</div>
<div class="line">  ec.setParameters(param);</div>
<div class="line">  ec.pickAndCalibrate(calib_exp, exp_raw, ref_masses);</div>
<div class="line"></div>
<div class="line">  <span class="keywordflow">return</span> 0;</div>
<div class="line">} <span class="comment">//end of main</span></div>
</div><!-- fragment --> </p>
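<p>The quadratic step of the TOF calibration can be illustrated with a minimal stand-alone C++ sketch. It is not the TOFCalibration implementation: it assumes exactly three calibrant points and evaluates the unique quadratic curve through them (Lagrange form), whereas a real calibration fits the curve to many calibrant peaks and adds the spline correction described above. The function name <em>quadraticThroughThree</em> is hypothetical.</p>

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Evaluates at time-of-flight t the quadratic curve that passes through
// three calibrant points (tof[i], mz[i]), using the Lagrange form.
double quadraticThroughThree(const std::array<double, 3>& tof,
                             const std::array<double, 3>& mz,
                             double t)
{
  double result = 0.0;
  for (std::size_t i = 0; i < 3; ++i)
  {
    double term = mz[i];
    for (std::size_t j = 0; j < 3; ++j)
    {
      if (j == i) continue;
      assert(tof[i] != tof[j]); // the calibrant TOF values must be distinct
      term *= (t - tof[j]) / (tof[i] - tof[j]);
    }
    result += term;
  }
  return result;
}
```

<p>At the three calibrant TOF values the curve returns the calibrant m/z values themselves, and any TOF value in between is converted by the same quadratic relation.</p>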
</div></div><!-- contents -->
<HR style="height:1px; border:none; border-top:1px solid #c0c0c0;">
<TABLE width="100%" border="0">
<TR>
<TD><font color="#c0c0c0">OpenMS / TOPP release 1.11.1</font></TD>
<TD align="right"><font color="#c0c0c0">Documentation generated on Thu Nov 14 2013 11:19:24 using doxygen 1.8.5</font></TD>
</TR>
</TABLE>
</BODY>
</HTML>