File: TOPP_ProteinQuantifier.html

<HTML>
<HEAD>
<TITLE>ProteinQuantifier</TITLE>
<LINK HREF="doxygen.css" REL="stylesheet" TYPE="text/css">
<LINK HREF="style_ini.css" REL="stylesheet" TYPE="text/css">
</HEAD>
<BODY BGCOLOR="#FFFFFF">
<A href="index.html">Home</A> &nbsp;&middot;
<A href="classes.html">Classes</A> &nbsp;&middot;
<A href="annotated.html">Annotated Classes</A> &nbsp;&middot;
<A href="modules.html">Modules</A> &nbsp;&middot;
<A href="functions_func.html">Members</A> &nbsp;&middot;
<A href="namespaces.html">Namespaces</A> &nbsp;&middot;
<A href="pages.html">Related Pages</A>
<HR style="height:1px; border:none; border-top:1px solid #c0c0c0;">
<!-- Generated by Doxygen 1.8.5 -->
</div><!-- top -->
<div class="header">
  <div class="headertitle">
<div class="title">ProteinQuantifier </div>  </div>
</div><!--header-->
<div class="contents">
<div class="textblock"><p>Compute peptide and protein abundances from annotated feature/consensus maps or from identification results.</p>
<center> <table class="doxtable">
<tr>
<td align="center" bgcolor="#EBEBEB">potential predecessor tools  </td><td valign="middle" rowspan="3"><img class="formulaInl" alt="$ \longrightarrow $" src="form_91.png"/> ProteinQuantifier <img class="formulaInl" alt="$ \longrightarrow $" src="form_91.png"/> </td><td align="center" bgcolor="#EBEBEB">potential successor tools   </td></tr>
<tr>
<td valign="middle" align="center" rowspan="1"><a class="el" href="TOPP_IDMapper.html">IDMapper</a>  </td><td valign="middle" align="center" rowspan="2">external tools <br/>
 e.g. for statistical analysis  </td></tr>
<tr>
<td valign="middle" align="center" rowspan="1"><a class="el" href="TOPP_FeatureLinkerUnlabeled.html">FeatureLinkerUnlabeled</a> <br/>
 (or another feature grouping tool)   </td></tr>
</table>
</center><p>Reference:<br/>
 Weisser <em>et al.</em>: <a href="http://dx.doi.org/10.1021/pr300992u">An automated pipeline for high-throughput label-free quantitative proteomics</a> (J. Proteome Res., 2013, PMID: 23391308).</p>
<p><b>Input: featureXML or consensusXML</b></p>
<p>Quantification is based on the intensity values of the features in the input files. Feature intensities are first accumulated to peptide abundances, according to the peptide identifications annotated to the features/feature groups. Then, abundances of the peptides of a protein are averaged to compute the protein abundance.</p>
<p>The peptide-to-protein step uses a fixed number (set by <code>top</code>, by default 3) of the most abundant proteotypic peptides per protein to compute the protein abundances. This is a generalized version of the "top 3 approach" (but only for relative quantification) described in:<br/>
 Silva <em>et al.</em>: Absolute quantification of proteins by LCMS<sup>E</sup>: a virtue of parallel MS acquisition (Mol. Cell. Proteomics, 2006, PMID: 16219938).</p>
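<p>The peptide-to-protein step described above can be illustrated with a minimal sketch. It is not the OpenMS implementation: the function name <code>protein_abundance</code> and its data layout are hypothetical, and only the roles of <code>top</code>, <code>average</code> and <code>include_all</code> are taken from this page.</p>
<pre class="fragment">
```python
from statistics import mean, median


def protein_abundance(peptide_abundances, top=3, average="median", include_all=False):
    """Combine the proteotypic peptide abundances of one protein in one sample.

    Returns None if the protein is not quantified (too few peptides).
    """
    # Rank peptides from most to least abundant.
    ranked = sorted(peptide_abundances, reverse=True)
    if not ranked:
        return None
    if top != 0:
        # Number of peptides lacking to reach `top`.
        missing = max(0, top - len(ranked))
        if missing and not include_all:
            return None
        ranked = ranked[:top]
    combine = {"median": median, "mean": mean, "sum": sum}[average]
    return combine(ranked)


print(protein_abundance([10, 40, 20, 30]))                    # median of 40, 30, 20
print(protein_abundance([10, 40]))                            # None: fewer than 3 peptides
print(protein_abundance([10, 40], include_all=True))          # median of 40, 10
print(protein_abundance([10, 40, 20], top=0, average="sum"))  # sum of all peptides
```
</pre>
<p>With <code>top</code> set to 0, every proteotypic peptide contributes, so <code>include_all</code> has no effect in that case.</p>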
<p>Only features/feature groups with unambiguous peptide annotation are used for peptide quantification, and generally only proteotypic peptides (i.e. those matching exactly one protein) are used for protein quantification. As an exception to this rule, if ProteinProphet results for the whole sample set are provided with the <code>protxml</code> option, or are already included in a featureXML input, groups of indistinguishable proteins will also be quantified. The reported quantity then refers to the total for the whole group.</p>
<p>Peptide/protein IDs from multiple identification runs can be handled, but will not be differentiated (i.e. protein accessions for a peptide will be accumulated over all identification runs).</p>
<p>Peptides with the same sequence, but with different modifications are quantified separately on the peptide level, but treated as one peptide for the protein quantification (i.e. the contributions of differently-modified variants of the same peptide are accumulated).</p>
<p><b>Input: idXML</b></p>
<p>Quantification based on identification results uses spectral counting, i.e. the abundance of each peptide is the number of times that peptide was identified from an MS2 spectrum (considering only the best hit per spectrum). Different identification runs in the input are treated as different samples; this makes it possible to quantify several related samples at once by merging the corresponding idXML files with <a class="el" href="TOPP_IDMerger.html">IDMerger</a>. Output format and applicable parameters depend on the number of runs: for a single run they are the same as for featureXML, for multiple runs the same as for consensusXML.</p>
<p>The notes above regarding quantification on the protein level and the treatment of modifications also apply to idXML input. In particular, this means that the settings <code>top</code> 0 and <code>average</code> <code>sum</code> should be used to get the "classical" spectral counting quantification on the protein level (where all identifications of all peptides of a protein are summed up).</p>
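<p>As a rough illustration of spectral counting, the sketch below counts the best hit of every spectrum, separately for each identification run. The nested-list input layout and both function names are hypothetical, and it assumes that a higher score means a better hit, which depends on the search engine.</p>
<pre class="fragment">
```python
from collections import Counter


def spectral_counts(runs):
    """Count identifications per peptide, separately for each run (sample).

    `runs` is a list of runs; each run is a list of spectra; each spectrum
    is a list of (peptide_sequence, score) hits.
    """
    counts_per_run = []
    for spectra in runs:
        counts = Counter()
        for hits in spectra:
            if not hits:
                continue  # spectrum without any peptide hit
            # Assumption: a higher score means a better hit.
            best_sequence, _ = max(hits, key=lambda hit: hit[1])
            counts[best_sequence] += 1
        counts_per_run.append(counts)
    return counts_per_run


def classical_protein_count(counts, peptides_of_protein):
    """'top 0' with 'average sum': add up the counts of all peptides of a protein."""
    return sum(counts[peptide] for peptide in peptides_of_protein)


run = [[("PEPA", 0.9), ("PEPB", 0.5)], [("PEPA", 0.8)], [("PEPC", 0.7)], []]
counts = spectral_counts([run])[0]
print(counts["PEPA"], counts["PEPB"], counts["PEPC"])
print(classical_protein_count(counts, ["PEPA", "PEPC"]))
```
</pre>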
<p>More information is given below the parameter specification.</p>
<p><b>The command line parameters of this tool are:</b> </p>
<pre class="fragment">
ProteinQuantifier -- Compute peptide and protein abundances
Version: 1.11.1 Nov 14 2013, 11:18:15, Revision: 11976

Usage:
  ProteinQuantifier &lt;options&gt;

Options (mandatory options marked with '*'):
  -in &lt;file&gt;*               Input file (valid formats: 'featureXML', 'consensusXML', 'idXML')
  -protxml &lt;file&gt;           ProteinProphet results (protXML converted to idXML) for the identification runs 
                            that were used to annotate the input.
                            Information about indistinguishable proteins will be used for protein
                            quantification. (valid formats: 'idXML')
  -out &lt;file&gt;               Output file for protein abundances (valid formats: 'csv')
  -peptide_out &lt;file&gt;       Output file for peptide abundances (valid formats: 'csv')
  -mzTab_out &lt;file&gt;         Export to mzTab.
                            Either 'out', 'peptide_out', or 'mzTab_out' are required. They can be used
                            together. (valid formats: 'csv')
                            
  -top &lt;number&gt;             Calculate protein abundance from this number of proteotypic peptides (most
                            abundant first; '0' for all) (default: '3' min: '0')
  -average &lt;choice&gt;         Averaging method used to compute protein abundances from peptide abundances
                            (default: 'median' valid: 'median', 'mean', 'sum')
  -include_all              Include results for proteins with fewer proteotypic peptides than indicated by 
                            'top' (no effect if 'top' is 0 or 1)
  -filter_charge            Distinguish between charge states of a peptide. For peptides, abundances will be 
                            reported separately for each charge;
                            for proteins, abundances will be computed based only on the most prevalent
                            charge of each peptide.
                            By default, abundances are summed over all charge states.

Additional options for consensus maps (and identification results comprising multiple runs):
  -consensus:normalize      Scale peptide abundances so that medians of all samples are equal
  -consensus:fix_peptides   Use the same peptides for protein quantification across all samples.
                            With 'top 0', all peptides that occur in every sample are considered.
                            Otherwise ('top N'), the N peptides that occur in the most samples
                            (independently of each other) are selected,
                            breaking ties by total abundance (there is no guarantee that the best
                            co-occurring peptides are chosen!).

  -ratios                   Add the log2 ratios of the abundance values to the output. Format: log_2(x_0/x_0)
                            &lt;sep&gt; log_2(x_1/x_0) &lt;sep&gt; log_2(x_2/x_0) ...
  -ratiosSILAC              Add the log2 ratios for a triple SILAC experiment to the output. Only applicable 
                            to consensus maps of exactly three sub-maps. Format: log_2(heavy/light) &lt;sep&gt;
                            log_2(heavy/middle) &lt;sep&gt; log_2(middle/light)

Output formatting options:
  -format:separator &lt;sep&gt;   Character(s) used to separate fields; by default, the 'tab' character is used
  -format:quoting &lt;method&gt;  Method for quoting of strings: 'none' for no quoting, 'double' for quoting with 
                            doubling of embedded quotes,
                            'escape' for quoting with backslash-escaping of embedded quotes (default:
                            'double' valid: 'none', 'double', 'escape')
  -format:replacement &lt;x&gt;   If 'quoting' is 'none', used to replace occurrences of the separator in strings 
                            before writing (default: '_')

                            
Common TOPP options:
  -ini &lt;file&gt;               Use the given TOPP INI file
  -threads &lt;n&gt;              Sets the number of threads allowed to be used by the TOPP tool (default: '1')
  -write_ini &lt;file&gt;         Writes the default configuration file
  --help                    Shows options
  --helphelp                Shows all options (including advanced)

</pre><p> <b>INI file documentation of this tool:</b> <div class="ini_global">
<div class="legend">
<b>Legend:</b><br>
 <div class="item item_required">required parameter</div>
 <div class="item item_advanced">advanced parameter</div>
</div>
  <div class="node"><span class="node_name">+ProteinQuantifier</span><span class="node_description">Compute peptide and protein abundances</span></div>
    <div class="item item_advanced"><span class="item_name" style="padding-left:16px;">version</span><span class="item_value">1.11.1</span>
<span class="item_description">Version of the tool that generated this parameters file.</span><span class="item_tags"></span><span class="item_restrictions"> </span></div>    <div class="node"><span class="node_name">++1</span><span class="node_description">Instance '1' section for 'ProteinQuantifier'</span></div>
      <div class="item"><span class="item_name item_required" style="padding-left:24px;">in</span><span class="item_value"></span>
<span class="item_description">Input file</span><span class="item_tags">input file</span><span class="item_restrictions">*.featureXML,*.consensusXML,*.idXML</span></div>      <div class="item"><span class="item_name" style="padding-left:24px;">protxml</span><span class="item_value"></span>
<span class="item_description">ProteinProphet results (protXML converted to idXML) for the identification runs that were used to annotate the input.<br>Information about indistinguishable proteins will be used for protein quantification.</span><span class="item_tags">input file</span><span class="item_restrictions">*.idXML</span></div>      <div class="item"><span class="item_name" style="padding-left:24px;">out</span><span class="item_value"></span>
<span class="item_description">Output file for protein abundances</span><span class="item_tags">output file</span><span class="item_restrictions">*.csv</span></div>      <div class="item"><span class="item_name" style="padding-left:24px;">peptide_out</span><span class="item_value"></span>
<span class="item_description">Output file for peptide abundances</span><span class="item_tags">output file</span><span class="item_restrictions">*.csv</span></div>      <div class="item"><span class="item_name" style="padding-left:24px;">mzTab_out</span><span class="item_value"></span>
<span class="item_description">Export to mzTab.<br>Either 'out', 'peptide_out', or 'mzTab_out' are required. They can be used together.</span><span class="item_tags">output file</span><span class="item_restrictions">*.csv</span></div>      <div class="item"><span class="item_name" style="padding-left:24px;">top</span><span class="item_value">3</span>
<span class="item_description">Calculate protein abundance from this number of proteotypic peptides (most abundant first; '0' for all)</span><span class="item_tags"></span><span class="item_restrictions">0:&#8734;</span></div>      <div class="item"><span class="item_name" style="padding-left:24px;">average</span><span class="item_value">median</span>
<span class="item_description">Averaging method used to compute protein abundances from peptide abundances</span><span class="item_tags"></span><span class="item_restrictions">median,mean,sum</span></div>      <div class="item"><span class="item_name" style="padding-left:24px;">include_all</span><span class="item_value">false</span>
<span class="item_description">Include results for proteins with fewer proteotypic peptides than indicated by 'top' (no effect if 'top' is 0 or 1)</span><span class="item_tags"></span><span class="item_restrictions">true,false</span></div>      <div class="item"><span class="item_name" style="padding-left:24px;">filter_charge</span><span class="item_value">false</span>
<span class="item_description">Distinguish between charge states of a peptide. For peptides, abundances will be reported separately for each charge;<br>for proteins, abundances will be computed based only on the most prevalent charge of each peptide.<br>By default, abundances are summed over all charge states.</span><span class="item_tags"></span><span class="item_restrictions">true,false</span></div>      <div class="item"><span class="item_name" style="padding-left:24px;">ratios</span><span class="item_value">false</span>
<span class="item_description">Add the log2 ratios of the abundance values to the output. Format: log_2(x_0/x_0) &lt;sep&gt; log_2(x_1/x_0) &lt;sep&gt; log_2(x_2/x_0) ...</span><span class="item_tags"></span><span class="item_restrictions">true,false</span></div>      <div class="item"><span class="item_name" style="padding-left:24px;">ratiosSILAC</span><span class="item_value">false</span>
<span class="item_description">Add the log2 ratios for a triple SILAC experiment to the output. Only applicable to consensus maps of exactly three sub-maps. Format: log_2(heavy/light) &lt;sep&gt; log_2(heavy/middle) &lt;sep&gt; log_2(middle/light)</span><span class="item_tags"></span><span class="item_restrictions">true,false</span></div>      <div class="item item_advanced"><span class="item_name" style="padding-left:24px;">log</span><span class="item_value"></span>
<span class="item_description">Name of log file (created only when specified)</span><span class="item_tags"></span><span class="item_restrictions"> </span></div>      <div class="item item_advanced"><span class="item_name" style="padding-left:24px;">debug</span><span class="item_value">0</span>
<span class="item_description">Sets the debug level</span><span class="item_tags"></span><span class="item_restrictions"> </span></div>      <div class="item"><span class="item_name" style="padding-left:24px;">threads</span><span class="item_value">1</span>
<span class="item_description">Sets the number of threads allowed to be used by the TOPP tool</span><span class="item_tags"></span><span class="item_restrictions"> </span></div>      <div class="item item_advanced"><span class="item_name" style="padding-left:24px;">no_progress</span><span class="item_value">false</span>
<span class="item_description">Disables progress logging to command line</span><span class="item_tags"></span><span class="item_restrictions">true,false</span></div>      <div class="item item_advanced"><span class="item_name" style="padding-left:24px;">test</span><span class="item_value">false</span>
<span class="item_description">Enables the test mode (needed for internal use only)</span><span class="item_tags"></span><span class="item_restrictions">true,false</span></div>      <div class="node"><span class="node_name">+++consensus</span><span class="node_description">Additional options for consensus maps (and identification results comprising multiple runs)</span></div>
        <div class="item"><span class="item_name" style="padding-left:32px;">normalize</span><span class="item_value">false</span>
<span class="item_description">Scale peptide abundances so that medians of all samples are equal</span><span class="item_tags"></span><span class="item_restrictions">true,false</span></div>        <div class="item"><span class="item_name" style="padding-left:32px;">fix_peptides</span><span class="item_value">false</span>
<span class="item_description">Use the same peptides for protein quantification across all samples.<br>With 'top 0', all peptides that occur in every sample are considered.<br>Otherwise ('top N'), the N peptides that occur in the most samples (independently of each other) are selected,<br>breaking ties by total abundance (there is no guarantee that the best co-occurring peptides are chosen!).</span><span class="item_tags"></span><span class="item_restrictions">true,false</span></div>      <div class="node"><span class="node_name">+++format</span><span class="node_description">Output formatting options</span></div>
        <div class="item"><span class="item_name" style="padding-left:32px;">separator</span><span class="item_value"></span>
<span class="item_description">Character(s) used to separate fields; by default, the 'tab' character is used</span><span class="item_tags"></span><span class="item_restrictions"> </span></div>        <div class="item"><span class="item_name" style="padding-left:32px;">quoting</span><span class="item_value">double</span>
<span class="item_description">Method for quoting of strings: 'none' for no quoting, 'double' for quoting with doubling of embedded quotes,<br>'escape' for quoting with backslash-escaping of embedded quotes</span><span class="item_tags"></span><span class="item_restrictions">none,double,escape</span></div>        <div class="item"><span class="item_name" style="padding-left:32px;">replacement</span><span class="item_value">_</span>
<span class="item_description">If 'quoting' is 'none', used to replace occurrences of the separator in strings before writing</span><span class="item_tags"></span><span class="item_restrictions"> </span></div></div>
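<p>The three quoting methods listed under the formatting options can be sketched as follows. The helper <code>format_field</code> is hypothetical and not part of OpenMS; whether the tool escapes anything other than embedded quotes (e.g. backslashes) is not stated here, so the sketch leaves everything else untouched.</p>
<pre class="fragment">
```python
def format_field(text, separator="\t", quoting="double", replacement="_"):
    """Format one string field following the 'format:*' option descriptions."""
    if quoting == "none":
        # No quotes: occurrences of the separator must be replaced instead.
        return text.replace(separator, replacement)
    if quoting == "double":
        # Wrap in quotes and double every embedded quote.
        return '"' + text.replace('"', '""') + '"'
    if quoting == "escape":
        # Wrap in quotes and backslash-escape every embedded quote.
        return '"' + text.replace('"', '\\"') + '"'
    raise ValueError("unknown quoting method: " + quoting)


print(format_field('say "hi"'))
print(format_field('say "hi"', quoting="escape"))
print(format_field("a,b", separator=",", quoting="none"))
```
</pre>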
<p><b>Output format</b></p>
<p>The output files produced by this tool have a table format, with columns as described below:</p>
<p><b>Protein output</b> (one protein/set of indistinguishable proteins per line):</p>
<ul>
<li><b>protein:</b> Protein accession(s) (as in the annotations in the input file; separated by "/" if more than one).</li>
<li><b>n_proteins:</b> Number of indistinguishable proteins quantified (usually "1").</li>
<li><b>protein_score:</b> Protein score, e.g. ProteinProphet probability (if available).</li>
<li><b>n_peptides:</b> Number of proteotypic peptides observed for this protein (or group of indistinguishable proteins) across all samples. Note that not all of these peptides necessarily contribute to the protein abundance (depending on parameter <code>top</code>).</li>
<li><b>abundance:</b> Computed protein abundance. For consensusXML input, there will be one column per sample ("abundance_1", "abundance_2", etc.).</li>
</ul>
<p><b>Peptide output</b> (one peptide or - if <code>filter_charge</code> is set - one charge state of a peptide per line):</p>
<ul>
<li><b>peptide:</b> Peptide sequence. Only peptides that occur in unambiguous annotations of features are reported.</li>
<li><b>protein:</b> Protein accession(s) for the peptide (separated by "/" if more than one).</li>
<li><b>n_proteins:</b> Number of proteins this peptide maps to. (Same as the number of accessions in the previous column.)</li>
<li><b>charge:</b> Charge state quantified in this line. "0" (for "all charges") unless <code>filter_charge</code> was set.</li>
<li><b>abundance:</b> Computed abundance for this peptide. If the charge in the preceding column is 0, this is the total abundance of the peptide over all charge states; otherwise, it is only the abundance observed for the indicated charge (in this case, there may be more than one line for the peptide sequence). Again, for consensusXML input, there will be one column per sample ("abundance_1", "abundance_2", etc.). Also for consensusXML, the reported values are already normalized if <code>consensus:normalize</code> was set.</li>
</ul>
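<p>The effect of <code>consensus:normalize</code> and the layout of the <code>ratios</code> columns can be approximated with the sketch below. Both function names are hypothetical. The page only states that the sample medians are made equal; using the first sample's median as the common target is an assumption of this sketch, as is the requirement that all medians and the reference abundance are non-zero.</p>
<pre class="fragment">
```python
from math import log2
from statistics import median


def normalize(samples):
    """Scale each sample so that all samples share the same median abundance.

    `samples` is a list of dicts mapping peptide to abundance.
    Assumption: the median of the first sample is the common target.
    """
    target = median(samples[0].values())
    scaled = []
    for sample in samples:
        factor = target / median(sample.values())
        scaled.append({pep: value * factor for pep, value in sample.items()})
    return scaled


def log2_ratios(abundances):
    """Ratios of every sample to the first: log_2(x_0/x_0), log_2(x_1/x_0), ..."""
    reference = abundances[0]
    return [log2(x / reference) for x in abundances]


samples = [{"PEPA": 2.0, "PEPB": 4.0}, {"PEPA": 4.0, "PEPB": 8.0}]
print(normalize(samples))
print(log2_ratios([2.0, 4.0, 1.0]))
```
</pre>
<p>The first ratio column is always zero, because the first sample is compared with itself.</p>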
<p><b>Protein quantification examples</b></p>
<p>While quantification on the peptide level is fairly straightforward, a number of options influence quantification on the protein level, especially for consensusXML input. The three parameters <code>top</code>, <code>include_all</code> and <code>consensus:fix_peptides</code> determine which peptides are used to quantify proteins in different samples.</p>
<p>As an example, consider a protein with four proteotypic peptides. Each peptide is detected in a subset of three samples, as indicated in the table below. The peptides are ranked by abundance (1: highest, 4: lowest; assuming for simplicity that the order is the same in all samples).</p>
<center> <table class="doxtable">
<tr>
<td></td><td align="center" bgcolor="#EBEBEB">sample 1  </td><td align="center" bgcolor="#EBEBEB">sample 2  </td><td align="center" bgcolor="#EBEBEB">sample 3   </td></tr>
<tr>
<td align="center" bgcolor="#EBEBEB">peptide 1  </td><td align="center">X  </td><td></td><td align="center">X   </td></tr>
<tr>
<td align="center" bgcolor="#EBEBEB">peptide 2  </td><td align="center">X  </td><td align="center">X  </td><td></td></tr>
<tr>
<td align="center" bgcolor="#EBEBEB">peptide 3  </td><td align="center">X  </td><td align="center">X  </td><td align="center">X   </td></tr>
<tr>
<td align="center" bgcolor="#EBEBEB">peptide 4  </td><td align="center">X  </td><td align="center">X  </td><td></td></tr>
</table>
</center><p>Different parameter combinations lead to different quantification scenarios, as shown here:</p>
<center> <table class="doxtable">
<tr>
<td align="center" bgcolor="#EBEBEB" colspan="3"><b>parameters</b> <br/>
 "*": no effect in this case  </td><td align="center" bgcolor="#EBEBEB" colspan="3"><b>peptides used for quantification</b> <br/>
 "(...)": not quantified here because ...  </td><td align="center" valign="middle" bgcolor="#EBEBEB" rowspan="2">explanation   </td></tr>
<tr>
<td align="center" bgcolor="#EBEBEB"><code>top</code>  </td><td align="center" bgcolor="#EBEBEB"><code>include_all</code>  </td><td align="center" bgcolor="#EBEBEB"><code>consensus:fix_peptides</code>  </td><td align="center" bgcolor="#EBEBEB">sample 1  </td><td align="center" bgcolor="#EBEBEB">sample 2  </td><td align="center" bgcolor="#EBEBEB">sample 3   </td></tr>
<tr>
<td align="center">0  </td><td align="center">*  </td><td align="center">no  </td><td align="center">1, 2, 3, 4  </td><td align="center">2, 3, 4  </td><td align="center">1, 3  </td><td>all peptides   </td></tr>
<tr>
<td align="center">1  </td><td align="center">*  </td><td align="center">no  </td><td align="center">1  </td><td align="center">2  </td><td align="center">1  </td><td>single most abundant peptide   </td></tr>
<tr>
<td align="center">2  </td><td align="center">*  </td><td align="center">no  </td><td align="center">1, 2  </td><td align="center">2, 3  </td><td align="center">1, 3  </td><td>two most abundant peptides   </td></tr>
<tr>
<td align="center">3  </td><td align="center">no  </td><td align="center">no  </td><td align="center">1, 2, 3  </td><td align="center">2, 3, 4  </td><td align="center">(too few peptides)  </td><td>three most abundant peptides   </td></tr>
<tr>
<td align="center">3  </td><td align="center">yes  </td><td align="center">no  </td><td align="center">1, 2, 3  </td><td align="center">2, 3, 4  </td><td align="center">1, 3  </td><td>three or fewer most abundant peptides   </td></tr>
<tr>
<td align="center">4  </td><td align="center">no  </td><td align="center">*  </td><td align="center">1, 2, 3, 4  </td><td align="center">(too few peptides)  </td><td align="center">(too few peptides)  </td><td>four most abundant peptides   </td></tr>
<tr>
<td align="center">4  </td><td align="center">yes  </td><td align="center">*  </td><td align="center">1, 2, 3, 4  </td><td align="center">2, 3, 4  </td><td align="center">1, 3  </td><td>four or fewer most abundant peptides   </td></tr>
<tr>
<td align="center">0  </td><td align="center">*  </td><td align="center">yes  </td><td align="center">3  </td><td align="center">3  </td><td align="center">3  </td><td>all peptides present in every sample   </td></tr>
<tr>
<td align="center">1  </td><td align="center">*  </td><td align="center">yes  </td><td align="center">3  </td><td align="center">3  </td><td align="center">3  </td><td>single peptide present in most samples   </td></tr>
<tr>
<td align="center">2  </td><td align="center">no  </td><td align="center">yes  </td><td align="center">1, 3  </td><td align="center">(peptide 1 missing)  </td><td align="center">1, 3  </td><td>two peptides present in most samples   </td></tr>
<tr>
<td align="center">2  </td><td align="center">yes  </td><td align="center">yes  </td><td align="center">1, 3  </td><td align="center">3  </td><td align="center">1, 3  </td><td>two or fewer peptides present in most samples   </td></tr>
<tr>
<td align="center">3  </td><td align="center">no  </td><td align="center">yes  </td><td align="center">1, 2, 3  </td><td align="center">(peptide 1 missing)  </td><td align="center">(peptide 2 missing)  </td><td>three peptides present in most samples   </td></tr>
<tr>
<td align="center">3  </td><td align="center">yes  </td><td align="center">yes  </td><td align="center">1, 2, 3  </td><td align="center">2, 3  </td><td align="center">1, 3  </td><td>three or fewer peptides present in most samples   </td></tr>
</table>
</center><p><b>Further considerations for parameter selection</b></p>
<p>With <code>filter_charge</code> and <code>average</code>, there is a trade-off between comparability of protein abundances within a sample and of abundances for the same protein across different samples.<br/>
 Setting <code>filter_charge</code> may increase reproducibility between samples, but will distort the proportions of protein abundances within a sample. The reason is that ionization properties vary between peptides, but should remain constant across samples. Filtering by charge state can help to reduce the impact of feature detection differences between samples.<br/>
 For <code>average</code>, there is a qualitative difference between <code>mean</code>/<code>median</code> and <code>sum</code> in the effect of missing peptide abundances (relevant only if <code>include_all</code> is set or <code>top</code> is 0): <code>mean</code> and <code>median</code> ignore missing cases, averaging only the values that are present. If low-abundance peptides are not detected in some samples, the computed protein abundances for those samples may thus be overestimated. <code>sum</code> implicitly treats missing values as zero, so this problem does not occur and comparability across samples is ensured. However, with <code>sum</code> the total number of peptides ("summands") available for a protein may affect the abundances computed for it (depending on <code>top</code>), so protein abundances within a sample may no longer be proportional to each other. </p>
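<p>The peptide selection rules from the protein quantification examples above can be reproduced with a short sketch. It is an illustration under stated assumptions, not the OpenMS code: the function <code>peptides_used</code> and its input layout are hypothetical, and ties under <code>consensus:fix_peptides</code> are broken by abundance rank, which stands in for "total abundance" because the example assumes the same ranking in all samples.</p>
<pre class="fragment">
```python
def peptides_used(presence, n_samples, top, include_all=False, fix_peptides=False):
    """Return a dict: sample number to list of peptides used (None = not quantified).

    `presence` maps each peptide's abundance rank (1 = most abundant) to the
    set of samples in which it was detected.
    """
    ranked = sorted(presence)
    samples = range(1, n_samples + 1)
    if fix_peptides and top == 0:
        # Keep only the peptides that occur in every sample.
        chosen = [p for p in ranked if len(presence[p]) == n_samples]
    elif fix_peptides:
        # Prefer peptides seen in the most samples; break ties by abundance rank
        # (a stand-in for "total abundance", see the assumptions above).
        by_coverage = sorted(ranked, key=lambda p: (-len(presence[p]), p))
        chosen = sorted(by_coverage[:top])
    result = {}
    for s in samples:
        if fix_peptides:
            used = [p for p in chosen if s in presence[p]]
            required = len(chosen)
        else:
            present = [p for p in ranked if s in presence[p]]
            used = present if top == 0 else present[:top]
            required = top
        # Number of peptides lacking in this sample.
        missing = max(0, required - len(used))
        if not used or (missing and not include_all):
            result[s] = None
        else:
            result[s] = used
    return result


# The example protein from the table: peptide rank mapped to detecting samples.
example = {1: {1, 3}, 2: {1, 2}, 3: {1, 2, 3}, 4: {1, 2}}
print(peptides_used(example, 3, top=3))
print(peptides_used(example, 3, top=2, fix_peptides=True))
```
</pre>
<p>Calling the function with the other parameter combinations from the table yields the corresponding rows, with <code>None</code> in place of the "(too few peptides)" and "(peptide ... missing)" entries.</p>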
</div></div><!-- contents -->
<HR style="height:1px; border:none; border-top:1px solid #c0c0c0;">
<TABLE width="100%" border="0">
<TR>
<TD><font color="#c0c0c0">OpenMS / TOPP release 1.11.1</font></TD>
<TD align="right"><font color="#c0c0c0">Documentation generated on Thu Nov 14 2013 11:19:24 using doxygen 1.8.5</font></TD>
</TR>
</TABLE>
</BODY>
</HTML>