.. _quick-start:

Quick Start
===========

Parsing an mzML file and setting measured precision
---------------------------------------------------


.. autofunction:: simple_parser.main
	:noindex:

.. include:: code_inc/simple_parser.inc


Seeking in an mzML file
-----------------------

pymzML can create and read indexed gzip files, which lets mzML file sizes
approach those of the original RAW format. Random access into an mzML file
is provided by the ``__getitem__`` magic method of pymzML's ``run.Reader``
class, so a spectrum can be fetched by its ID with ``run[ID]``.

Alternatively, pymzML can also rapidly seek into any uncompressed mzML file,
whether or not an index is included in the file.

.. code-block:: python

    #!/usr/bin/env python
    import pymzml

    run = pymzml.run.Reader( 'tests/data/BSA1.mzML.gz' )
    spectrum_with_id_2540 = run[ 2540 ]

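The idea behind this kind of random access can be illustrated with a small,
standard-library-only sketch. It is not pymzML's implementation: the records,
``build_offset_index`` and ``read_record`` are hypothetical names made up for
this illustration. It only shows how a table of byte offsets lets a reader
jump straight to one spectrum instead of scanning the whole file.

.. code-block:: python

    import io

    # Hypothetical stand-in for an uncompressed mzML file: three tiny records.
    records = {
        1: b'<spectrum id="scan=1">first</spectrum>\n',
        2: b'<spectrum id="scan=2">second</spectrum>\n',
        3: b'<spectrum id="scan=3">third</spectrum>\n',
    }

    def build_offset_index(records):
        """Write all records to a buffer and note where each one starts."""
        buffer = io.BytesIO()
        offsets = {}
        for spec_id, payload in records.items():
            offsets[spec_id] = buffer.tell()  # byte offset of this record
            buffer.write(payload)
        buffer.seek(0)
        return buffer, offsets

    def read_record(buffer, offsets, spec_id):
        """Seek to the stored offset and read one record (one line here)."""
        buffer.seek(offsets[spec_id])
        return buffer.readline()

    data, index = build_offset_index(records)
    second = read_record(data, index, 2)

With an index like this, the cost of fetching a spectrum does not depend on
its position in the file, which is what makes ``run[ 2540 ]`` fast.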

Reading mzML indices with a custom regular expression
------------------------------------------------------

When the IDs in an mzML index are neither plain integers nor of the form "scan=1" or similar,
you can pass a custom regular expression to the reader on initialization to parse the index.

Say, for example, you have an index as in the example file ``Manuels_customs_ids.mzML``:

.. code-block:: xml

    <offset idRef="ManuelsCustomID=1 diesdas">4026</offset>

.. code-block:: python

    #!/usr/bin/env python
    import pymzml
    import re

    index_re = re.compile(
        b'.*idRef="ManuelsCustomID=(?P<ID>.*) diesdas">(?P<offset>[0-9]*)</offset>'
    )
    your_file_path = 'Manuels_customs_ids.mzML'  # adjust to where your copy of the file lives
    run = pymzml.run.Reader(your_file_path, index_regex=index_re)
    spec_1 = run[1]

The regular expression must contain a named group called ``ID`` and a named group called ``offset``.
Also be aware that the pattern must be a byte string (``b'...'``), not a ``str``.
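
Before handing a custom pattern to the reader, it can help to check it against
a single index line using only the standard ``re`` module. The sketch below
reuses the pattern and the index line from this section. The variable names
``index_line``, ``spec_id`` and ``byte_offset`` are chosen for this
illustration only, and how pymzML applies the pattern internally is not shown.

.. code-block:: python

    import re

    # Index line quoted from the example above, as bytes.
    index_line = b'<offset idRef="ManuelsCustomID=1 diesdas">4026</offset>'

    index_re = re.compile(
        b'.*idRef="ManuelsCustomID=(?P<ID>.*) diesdas">(?P<offset>[0-9]*)</offset>'
    )

    match = index_re.match(index_line)
    # Named groups of a bytes pattern return bytes; int() accepts them directly.
    spec_id = int(match.group('ID'))
    byte_offset = int(match.group('offset'))

If ``match`` is ``None``, the pattern does not fit the index lines of your
file and needs to be adjusted before it is passed as ``index_regex``.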