File: TOPP_TextExporter.html

<HTML>
<HEAD>
<TITLE>TextExporter</TITLE>
<LINK HREF="doxygen.css" REL="stylesheet" TYPE="text/css">
<LINK HREF="style_ini.css" REL="stylesheet" TYPE="text/css">
</HEAD>
<BODY BGCOLOR="#FFFFFF">
<HR style="height:1px; border:none; border-top:1px solid #c0c0c0;">
<!-- Generated by Doxygen 1.8.5 -->
</div><!-- top -->
<div class="header">
  <div class="headertitle">
<div class="title">TextExporter </div>  </div>
</div><!--header-->
<div class="contents">
<div class="textblock"><p>This application converts several OpenMS XML formats (featureXML, consensusXML, and idXML) to text files.</p>
<center> <table class="doxtable">
<tr>
<td align="center" bgcolor="#EBEBEB">potential predecessor tools  </td><td valign="middle" rowspan="2"><img class="formulaInl" alt="$ \longrightarrow $" src="form_91.png"/> TextExporter <img class="formulaInl" alt="$ \longrightarrow $" src="form_91.png"/> </td><td align="center" bgcolor="#EBEBEB">potential successor tools   </td></tr>
<tr>
<td valign="middle" align="center" rowspan="1">almost any TOPP tool  </td><td valign="middle" align="center" rowspan="1">external tools (MS Excel, OpenOffice, Notepad)  </td></tr>
</table>
</center><p>The goal of this tool is to create output in a table format that is easily readable in Excel or OpenOffice. Lines in the output correspond to rows in the table; the individual columns are delineated by a separator, e.g. tab (default, TSV format) or comma (CSV format).</p>
<p>Output files begin with comment lines, starting with the special character "#". The last such line(s) will be a header with column names, but this may be preceded by more general comments.</p>
<p>Because the <a class="el" href="namespaceOpenMS.html" title="Main OpenMS namespace. ">OpenMS</a> XML formats contain different kinds of data in a hierarchical structure, TextExporter produces somewhat unusual TSV/CSV files for many inputs: Different lines in the output may belong to different types of data, and the number of columns and the meanings of the individual fields depend on the type. In such cases, the first column always contains an indicator (in capital letters) for the data type of the current line. In addition, some lines have to be understood relative to a previous line, if there is a hierarchical relationship in the data. (See below for details and examples.)</p>
<p>Missing values are represented by "-1" or "nan" in numeric fields and by blanks in character/text fields.</p>
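<p>The conventions above (comment lines starting with "#", a separator between columns, "-1" or "nan" for missing numbers) can be sketched in a few lines of Python. This is only an illustration, not part of TextExporter: the function names <code>read_text_exporter</code> and <code>to_number</code> are hypothetical, and the sketch applies the "-1" rule to every numeric field, although the text does not say whether "-1" can also be a real value in some column.</p>

```python
import math


def read_text_exporter(lines, sep="\t"):
    """Split TextExporter-style lines into comment/header rows and data rows."""
    headers, rows = [], []
    for raw in lines:
        line = raw.rstrip("\n")
        if not line:
            continue  # skip empty lines
        if line.startswith("#"):
            # comment or header line: drop the "#" and split like a data row
            headers.append(line[1:].split(sep))
        else:
            rows.append(line.split(sep))
    return headers, rows


def to_number(field):
    """Convert a numeric field; "-1" and "nan" are reported as missing (None)."""
    value = float(field)
    if math.isnan(value) or value == -1:
        return None
    return value
```

<p>Text fields need no conversion in this sketch; a blank string simply stands for a missing value.</p>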
<p>Depending on the input and the parameters, the output contains the following columns:</p>
<p><b>featureXML input:</b></p>
<ul>
<li>first column: <code>RUN</code> / <code>PROTEIN</code> / <code>UNASSIGNEDPEPTIDE</code> / <code>FEATURE</code> / <code>PEPTIDE</code> (indicator for the type of data in the current row)</li>
<li>a <code>RUN</code> line contains information about a protein identification run; further columns: <code>run_id</code>, <code>score_type</code>, <code>score_direction</code>, <code>date_time</code>, <code>search_engine_version</code>, <code>parameters</code> </li>
<li>a <code>PROTEIN</code> line contains data of a protein identified in the previously listed run; further columns: <code>score</code>, <code>rank</code>, <code>accession</code>, <code>coverage</code>, <code>sequence</code> </li>
<li>an <code>UNASSIGNEDPEPTIDE</code> line contains data of a peptide hit that was not assigned to any feature; further columns: <code>rt</code>, <code>mz</code>, <code>score</code>, <code>rank</code>, <code>sequence</code>, <code>charge</code>, <code>aa_before</code>, <code>aa_after</code>, <code>score_type</code>, <code>search_identifier</code>, <code>accessions</code> </li>
<li>a <code>FEATURE</code> line contains data of a single feature; further columns: <code>rt</code>, <code>mz</code>, <code>intensity</code>, <code>charge</code>, <code>width</code>, <code>quality</code>, <code>rt_quality</code>, <code>mz_quality</code>, <code>rt_start</code>, <code>rt_end</code> </li>
<li>a <code>PEPTIDE</code> line contains data of a peptide hit annotated to the previous feature; further columns: same as for <code>UNASSIGNEDPEPTIDE</code> </li>
</ul>
<p>With the <code>no_ids</code> flag, only <code>FEATURE</code> lines (without the <code>FEATURE</code> indicator) are written.</p>
<p>With the <code>feature:minimal</code> flag, only the <code>rt</code>, <code>mz</code>, and <code>intensity</code> columns of <code>FEATURE</code> lines are written.</p>
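<p>Because a <code>PEPTIDE</code> line refers to the <code>FEATURE</code> line before it, a reader has to carry state from one line to the next. The following Python sketch shows one way to do that. It assumes the rows have already been split into lists of fields with the indicator in the first position; the function name <code>group_features</code> is hypothetical, and the sample values used with it are made up.</p>

```python
def group_features(rows):
    """Attach each PEPTIDE row to the FEATURE row that precedes it."""
    features = []
    for row in rows:
        kind, fields = row[0], row[1:]
        if kind == "FEATURE":
            features.append({"fields": fields, "peptides": []})
        elif kind == "PEPTIDE":
            if not features:
                raise ValueError("PEPTIDE line before any FEATURE line")
            features[-1]["peptides"].append(fields)
        # RUN, PROTEIN and UNASSIGNEDPEPTIDE rows are ignored in this sketch
    return features


# Made-up rows, shortened to a few fields for illustration only
example = [
    ["FEATURE", "100.0", "500.25", "1e6"],
    ["PEPTIDE", "100.1", "500.25", "0.9"],
    ["FEATURE", "200.0", "600.30", "2e6"],
]
grouped = group_features(example)
```

<p>The same pattern applies to <code>PROTEIN</code> lines, which belong to the previously listed <code>RUN</code> line.</p>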
<p><b>consensusXML input:</b></p>
<p>Output format produced for the <code>out</code> parameter:</p>
<ul>
<li>first column: <code>MAP</code> / <code>RUN</code> / <code>PROTEIN</code> / <code>UNASSIGNEDPEPTIDE</code> / <code>CONSENSUS</code> / <code>PEPTIDE</code> (indicator for the type of data in the current row)</li>
<li>a <code>MAP</code> line contains information about a sub-map; further columns: <code>id</code>, <code>filename</code>, <code>label</code>, <code>size</code> (potentially followed by further columns containing meta data, depending on the input)</li>
<li>a <code>CONSENSUS</code> line contains data of a single consensus feature; further columns: <code>rt_cf</code>, <code>mz_cf</code>, <code>intensity_cf</code>, <code>charge_cf</code>, <code>width_cf</code>, <code>quality_cf</code>, <code>rt_X0</code>, <code>mz_X0</code>, ..., <code>rt_X1</code>, <code>mz_X1</code>, ...</li>
<li><code>"..._cf"</code> columns refer to the consensus feature itself, <code>"..._Xi"</code> columns refer to a sub-feature from the map with ID "Xi" (no <code>quality</code> column in this case); missing sub-features are indicated by "nan" values</li>
<li>see above for the formats of <code>RUN</code>, <code>PROTEIN</code>, <code>UNASSIGNEDPEPTIDE</code>, <code>PEPTIDE</code> lines</li>
</ul>
<p>With the <code>no_ids</code> flag, only <code>MAP</code> and <code>CONSENSUS</code> lines are written.</p>
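<p>The column layout of a <code>CONSENSUS</code> line can be unpacked by the suffix of each column name. The Python sketch below is an illustration under several assumptions: the column names are taken from the header line (without the indicator column), map IDs contain no underscore, and a sub-feature counts as missing when all of its fields are "nan". The function name <code>split_consensus_row</code> and the sample values are hypothetical.</p>

```python
import math


def split_consensus_row(columns, values):
    """Group the fields of one CONSENSUS row by their column-name suffix."""
    groups = {}
    for name, value in zip(columns, values):
        # "rt_cf" becomes ("rt", "cf"); "mz_0" becomes ("mz", "0")
        attribute, suffix = name.rsplit("_", 1)
        groups.setdefault(suffix, {})[attribute] = float(value)
    # drop sub-features whose fields are all NaN (absent from that map)
    return {
        suffix: fields
        for suffix, fields in groups.items()
        if not all(math.isnan(v) for v in fields.values())
    }


# Made-up example with two sub-maps, "0" and "1"; map "1" has no sub-feature
columns = ["rt_cf", "mz_cf", "rt_0", "mz_0", "rt_1", "mz_1"]
values = ["10.0", "500.2", "9.9", "500.2", "nan", "nan"]
parts = split_consensus_row(columns, values)
```

<p>Here <code>parts</code> holds one entry for the consensus feature itself (key <code>"cf"</code>) and one for each sub-map that contributed a sub-feature.</p>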
<p>Output format produced for the <code>consensus:centroids</code> parameter:</p>
<ul>
<li>one line per consensus centroid</li>
<li>columns: <code>rt</code>, <code>mz</code>, <code>intensity</code>, <code>charge</code>, <code>width</code>, <code>quality</code> </li>
</ul>
<p>Output format produced for the <code>consensus:elements</code> parameter:</p>
<ul>
<li>one line per sub-feature (element) of a consensus feature</li>
<li>first column: <code>H</code> / <code>L</code> (indicator for new/repeated element)</li>
<li><code>H</code> indicates a new element, <code>L</code> indicates the replication of the first element of the current consensus feature (for plotting)</li>
<li>further columns: <code>rt</code>, <code>mz</code>, <code>intensity</code>, <code>charge</code>, <code>width</code>, <code>rt_cf</code>, <code>mz_cf</code>, <code>intensity_cf</code>, <code>charge_cf</code>, <code>width_cf</code>, <code>quality_cf</code> </li>
<li><code>"..._cf"</code> columns refer to the consensus feature, the other columns refer to the sub-feature</li>
</ul>
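<p>When the elements file is used for analysis rather than plotting, the repeated <code>L</code> lines are usually not wanted. The following Python sketch keeps only the <code>H</code> lines and groups them by their consensus feature. It assumes rows that are already split into fields in the column order listed above, and it identifies a consensus feature by its six <code>"..._cf"</code> fields, which would merge two consensus features that happen to share all of them. The function name <code>elements_per_consensus</code> and the sample values are hypothetical.</p>

```python
def elements_per_consensus(rows):
    """Collect the "H" elements of each consensus feature, skipping "L" repeats."""
    grouped = {}
    for row in rows:
        if row[0] != "H":
            continue  # "L" rows only repeat an element for plotting
        element = tuple(row[1:6])     # rt, mz, intensity, charge, width
        consensus = tuple(row[6:12])  # rt_cf ... quality_cf
        grouped.setdefault(consensus, []).append(element)
    return grouped


# Made-up rows for one consensus feature with two elements
cf = ["10.0", "500.2", "3e6", "2", "0.0", "1.0"]
sample = [
    ["H", "9.9", "500.2", "1e6", "2", "0.0"] + cf,
    ["H", "10.1", "500.2", "2e6", "2", "0.0"] + cf,
    ["L", "9.9", "500.2", "1e6", "2", "0.0"] + cf,
]
by_consensus = elements_per_consensus(sample)
```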
<p>Output format produced for the <code>consensus:features</code> parameter:</p>
<ul>
<li>one line per consensus feature (suitable for processing with e.g. <a href="http://www.r-project.org">R</a>)</li>
<li>columns: same as for a <code>CONSENSUS</code> line above, followed by additional columns for identification data</li>
<li>additional columns: <code>peptide_N0</code>, <code>n_diff_peptides_N0</code>, <code>protein_N0</code>, <code>n_diff_proteins_N0</code>, <code>peptide_N1</code>, ...</li>
<li><code>"..._Ni"</code> columns refer to the identification run with index "Ni", <code>n_diff_</code>... stands for "number of different ..."; different peptides/proteins in one column are separated by "/"</li>
</ul>
<p>With the <code>no_ids</code> flag, the additional columns are not included.</p>
<p><b>idXML input:</b></p>
<ul>
<li>first column: <code>RUN</code> / <code>PROTEIN</code> / <code>PEPTIDE</code> (indicator for the type of data in the current row)</li>
<li>see above for the formats of <code>RUN</code>, <code>PROTEIN</code>, <code>PEPTIDE</code> lines</li>
<li>additional column for <code>PEPTIDE</code> lines: <code>predicted_rt</code> </li>
</ul>
<p>With the <code>id:proteins_only</code> flag, only <code>RUN</code> and <code>PROTEIN</code> lines are written.</p>
<p>With the <code>id:peptides_only</code> flag, only <code>PEPTIDE</code> lines (without the <code>PEPTIDE</code> indicator) are written.</p>
<p>With the <code>id:first_dim_rt</code> flag, the additional columns <code>rt_first_dim</code> and <code>predicted_rt_first_dim</code> are included for <code>PEPTIDE</code> lines.</p>
<p><b>The command line parameters of this tool are:</b> </p>
<pre class="fragment">
TextExporter -- Exports various XML formats to a text file.
Version: 1.11.1 Nov 14 2013, 11:18:15, Revision: 11976

Usage:
  TextExporter &lt;options&gt;

Options (mandatory options marked with '*'):
  -in &lt;file&gt;*                         Input file (valid formats: 'featureXML', 'consensusXML', 'idXML',
                                      'mzML')
  -out &lt;file&gt;                         Output file (mandatory for featureXML and idXML) (valid formats: 'csv')
  -separator &lt;sep&gt;                    The used separator character(s); if not set the 'tab' character is used
  -replacement &lt;string&gt;               Used to replace occurrences of the separator in strings before writing,
                                      if 'quoting' is 'none' (default: '_')
  -quoting &lt;method&gt;                   Method for quoting of strings: 'none' for no quoting, 'double' for
                                      quoting with doubling of embedded quotes, 'escape' for quoting with
                                      backslash-escaping of embedded quotes (default: 'none' valid: 'none',
                                      'double', 'escape')
  -no_ids                             Suppresses output of identification data.
                                      

Options for featureXML input files:
  -feature:minimal                    Set this flag to write only three attributes: RT, m/z, and intensity.

                                      

Options for idXML input files:
  -id:proteins_only                   Set this flag if you want only protein information from an idXML file
  -id:peptides_only                   Set this flag if you want only peptide information from an idXML file
  -id:first_dim_rt                    If this flag is set the first_dim RT of the peptide hits will also be 
                                      printed (if present).

                                      

Options for consensusXML input files:
  -consensus:centroids &lt;file&gt;         Output file for centroids of consensus features (valid formats: 'csv')
  -consensus:elements &lt;file&gt;          Output file for elements of consensus features (valid formats: 'csv')
  -consensus:features &lt;file&gt;          Output file for consensus features and contained elements from all maps
                                      (writes 'nan's if elements are missing) (valid formats: 'csv')
  -consensus:sorting_method &lt;method&gt;  Sorting options can be combined. The precedence is: sort_by_size,
                                      sort_by_maps, sorting_method (default: 'none' valid: 'none', 'RT', 'MZ',
                                      'RT_then_MZ', 'intensity', 'quality_decreasing', 'quality_increasing')
  -consensus:sort_by_maps             Apply a stable sort by the covered maps, lexicographically
  -consensus:sort_by_size             Apply a stable sort by decreasing size (i.e., the number of elements)

                                      
Common TOPP options:
  -ini &lt;file&gt;                         Use the given TOPP INI file
  -threads &lt;n&gt;                        Sets the number of threads allowed to be used by the TOPP tool
                                      (default: '1')
  -write_ini &lt;file&gt;                   Writes the default configuration file
  --help                              Shows options
  --helphelp                          Shows all options (including advanced)

</pre><p> <b>INI file documentation of this tool:</b> <div class="ini_global">
<div class="legend">
<b>Legend:</b><br>
 <div class="item item_required">required parameter</div>
 <div class="item item_advanced">advanced parameter</div>
</div>
  <div class="node"><span class="node_name">+TextExporter</span><span class="node_description">Exports various XML formats to a text file.</span></div>
    <div class="item item_advanced"><span class="item_name" style="padding-left:16px;">version</span><span class="item_value">1.11.1</span>
<span class="item_description">Version of the tool that generated this parameters file.</span><span class="item_tags"></span><span class="item_restrictions"> </span></div>    <div class="node"><span class="node_name">++1</span><span class="node_description">Instance '1' section for 'TextExporter'</span></div>
      <div class="item"><span class="item_name item_required" style="padding-left:24px;">in</span><span class="item_value"></span>
<span class="item_description">Input file </span><span class="item_tags">input file</span><span class="item_restrictions">*.featureXML,*.consensusXML,*.idXML,*.mzML</span></div>      <div class="item"><span class="item_name" style="padding-left:24px;">out</span><span class="item_value"></span>
<span class="item_description">Output file (mandatory for featureXML and idXML)</span><span class="item_tags">output file</span><span class="item_restrictions">*.csv</span></div>      <div class="item"><span class="item_name" style="padding-left:24px;">separator</span><span class="item_value"></span>
<span class="item_description">The used separator character(s); if not set the 'tab' character is used</span><span class="item_tags"></span><span class="item_restrictions"> </span></div>      <div class="item"><span class="item_name" style="padding-left:24px;">replacement</span><span class="item_value">_</span>
<span class="item_description">Used to replace occurrences of the separator in strings before writing, if 'quoting' is 'none'</span><span class="item_tags"></span><span class="item_restrictions"> </span></div>      <div class="item"><span class="item_name" style="padding-left:24px;">quoting</span><span class="item_value">none</span>
<span class="item_description">Method for quoting of strings: 'none' for no quoting, 'double' for quoting with doubling of embedded quotes,<br>'escape' for quoting with backslash-escaping of embedded quotes</span><span class="item_tags"></span><span class="item_restrictions">none,double,escape</span></div>      <div class="item"><span class="item_name" style="padding-left:24px;">no_ids</span><span class="item_value">false</span>
<span class="item_description">Suppresses output of identification data.</span><span class="item_tags"></span><span class="item_restrictions">true,false</span></div>      <div class="item item_advanced"><span class="item_name" style="padding-left:24px;">log</span><span class="item_value"></span>
<span class="item_description">Name of log file (created only when specified)</span><span class="item_tags"></span><span class="item_restrictions"> </span></div>      <div class="item item_advanced"><span class="item_name" style="padding-left:24px;">debug</span><span class="item_value">0</span>
<span class="item_description">Sets the debug level</span><span class="item_tags"></span><span class="item_restrictions"> </span></div>      <div class="item"><span class="item_name" style="padding-left:24px;">threads</span><span class="item_value">1</span>
<span class="item_description">Sets the number of threads allowed to be used by the TOPP tool</span><span class="item_tags"></span><span class="item_restrictions"> </span></div>      <div class="item item_advanced"><span class="item_name" style="padding-left:24px;">no_progress</span><span class="item_value">false</span>
<span class="item_description">Disables progress logging to command line</span><span class="item_tags"></span><span class="item_restrictions">true,false</span></div>      <div class="item item_advanced"><span class="item_name" style="padding-left:24px;">test</span><span class="item_value">false</span>
<span class="item_description">Enables the test mode (needed for internal use only)</span><span class="item_tags"></span><span class="item_restrictions">true,false</span></div>      <div class="node"><span class="node_name">+++feature</span><span class="node_description">Options for featureXML input files</span></div>
        <div class="item"><span class="item_name" style="padding-left:32px;">minimal</span><span class="item_value">false</span>
<span class="item_description">Set this flag to write only three attributes: RT, m/z, and intensity.</span><span class="item_tags"></span><span class="item_restrictions">true,false</span></div>      <div class="node"><span class="node_name">+++id</span><span class="node_description">Options for idXML input files</span></div>
        <div class="item"><span class="item_name" style="padding-left:32px;">proteins_only</span><span class="item_value">false</span>
<span class="item_description">Set this flag if you want only protein information from an idXML file</span><span class="item_tags"></span><span class="item_restrictions">true,false</span></div>        <div class="item"><span class="item_name" style="padding-left:32px;">peptides_only</span><span class="item_value">false</span>
<span class="item_description">Set this flag if you want only peptide information from an idXML file</span><span class="item_tags"></span><span class="item_restrictions">true,false</span></div>        <div class="item"><span class="item_name" style="padding-left:32px;">first_dim_rt</span><span class="item_value">false</span>
<span class="item_description">If this flag is set the first_dim RT of the peptide hits will also be printed (if present).</span><span class="item_tags"></span><span class="item_restrictions">true,false</span></div>      <div class="node"><span class="node_name">+++consensus</span><span class="node_description">Options for consensusXML input files</span></div>
        <div class="item"><span class="item_name" style="padding-left:32px;">centroids</span><span class="item_value"></span>
<span class="item_description">Output file for centroids of consensus features</span><span class="item_tags">output file</span><span class="item_restrictions">*.csv</span></div>        <div class="item"><span class="item_name" style="padding-left:32px;">elements</span><span class="item_value"></span>
<span class="item_description">Output file for elements of consensus features</span><span class="item_tags">output file</span><span class="item_restrictions">*.csv</span></div>        <div class="item"><span class="item_name" style="padding-left:32px;">features</span><span class="item_value"></span>
<span class="item_description">Output file for consensus features and contained elements from all maps (writes 'nan's if elements are missing)</span><span class="item_tags">output file</span><span class="item_restrictions">*.csv</span></div>        <div class="item"><span class="item_name" style="padding-left:32px;">sorting_method</span><span class="item_value">none</span>
<span class="item_description">Sorting options can be combined. The precedence is: sort_by_size, sort_by_maps, sorting_method</span><span class="item_tags"></span><span class="item_restrictions">none,RT,MZ,RT_then_MZ,intensity,quality_decreasing,quality_increasing</span></div>        <div class="item"><span class="item_name" style="padding-left:32px;">sort_by_maps</span><span class="item_value">false</span>
<span class="item_description">Apply a stable sort by the covered maps, lexicographically</span><span class="item_tags"></span><span class="item_restrictions">true,false</span></div>        <div class="item"><span class="item_name" style="padding-left:32px;">sort_by_size</span><span class="item_value">false</span>
<span class="item_description">Apply a stable sort by decreasing size (i.e., the number of elements)</span><span class="item_tags"></span><span class="item_restrictions">true,false</span></div></div>
 </div></div><!-- contents -->
<HR style="height:1px; border:none; border-top:1px solid #c0c0c0;">
<TABLE width="100%" border="0">
<TR>
<TD><font color="#c0c0c0">OpenMS / TOPP release 1.11.1</font></TD>
<TD align="right"><font color="#c0c0c0">Documentation generated on Thu Nov 14 2013 11:19:24 using doxygen 1.8.5</font></TD>
</TR>
</TABLE>
</BODY>
</HTML>