Usage
=====

Trollsift includes a collection of modules that assist with formatting, parsing and filtering satellite granule file names. These modules are useful and necessary for writing higher-level applications and APIs for satellite batch processing. Currently we are implementing the string parsing and composing functionality. Watch this space for further modules covering various types of filtering of satellite data granules.

Parser
------
The trollsift string parser module is useful for composing (formatting) and parsing strings
compatible with the Python :ref:`python:formatstrings`. In satellite data file name filtering,
the library is useful for extracting typical information from granule filenames, such
as observation time, platform and instrument names. The trollsift Parser can also
verify that the string formatting is invertible, i.e. specific enough to ensure that
parsing and composing of strings are bijective mappings (a one-to-one correspondence),
which may be essential for some applications, such as predicting granule filenames.
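
To make the idea of invertibility concrete, the following is a minimal sketch that uses only the standard library rather than trollsift itself. The helper ``round_trip`` and the hand-written regular expressions are assumptions made for illustration; they are not how trollsift checks invertibility. The sketch shows that two free-width fields with no separator cannot be recovered after composing, while fixed-width fields can:

```python
import re


def round_trip(template, regex, fields):
    """Compose `fields` with str.format, then parse the result back with `regex`."""
    composed = template.format(**fields)
    match = re.fullmatch(regex, composed)
    return composed, match.groupdict()


fields = {"platform": "noaa", "platnum": "16"}

# Ambiguous: two free-width fields with nothing between them.
_, ambiguous = round_trip(
    "{platform}{platnum}",
    r"(?P<platform>.*?)(?P<platnum>.*)",
    fields,
)

# Unambiguous: fixed widths, in the spirit of {platform:4s}{platnum:2s}.
_, fixed = round_trip(
    "{platform:4s}{platnum:2s}",
    r"(?P<platform>.{4})(?P<platnum>.{2})",
    fields,
)

print(ambiguous == fields)  # False: the field boundary was lost
print(fixed == fields)      # True: the round trip recovers the input
```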

parsing
^^^^^^^
The Parser object holds a format string, allowing us to parse and compose strings:

  >>> from trollsift import Parser
  >>>
  >>> p = Parser("/somedir/{directory}/hrpt_{platform:4s}{platnum:2s}_{time:%Y%m%d_%H%M}_{orbit:05d}.l1b")
  >>> data = p.parse("/somedir/otherdir/hrpt_noaa16_20140210_1004_69022.l1b")
  >>> print(data) # doctest: +NORMALIZE_WHITESPACE
  {'directory': 'otherdir', 'platform': 'noaa', 'platnum': '16',
   'time': datetime.datetime(2014, 2, 10, 10, 4), 'orbit': 69022}
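
As a rough illustration of what parsing such a pattern involves, the sketch below reproduces the result above with the standard library only. The regular expression is written by hand for this one pattern and is an assumption for illustration, not the expression trollsift generates internally:

```python
import re
from datetime import datetime

# Hand-written equivalent of the format string above (illustrative only).
pattern = re.compile(
    r"/somedir/(?P<directory>.*?)/hrpt_(?P<platform>.{4})(?P<platnum>.{2})"
    r"_(?P<time>\d{8}_\d{4})_(?P<orbit>\d{5})\.l1b"
)

raw = pattern.fullmatch(
    "/somedir/otherdir/hrpt_noaa16_20140210_1004_69022.l1b"
).groupdict()

# Every captured group is a string; convert the typed fields explicitly.
data = dict(
    raw,
    time=datetime.strptime(raw["time"], "%Y%m%d_%H%M"),
    orbit=int(raw["orbit"]),
)
print(data)
```

The Parser performs the equivalent matching and type conversion from the format string alone, so no regular expression needs to be written by hand.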

Parsing in trollsift is not "greedy". This means that in the case of ambiguous
patterns it will match the shortest portion of the string possible. For example:

  >>> from trollsift import Parser
  >>>
  >>> p = Parser("{field_one}_{field_two}")
  >>> data = p.parse("abc_def_ghi")
  >>> print(data)
  {'field_one': 'abc', 'field_two': 'def_ghi'}

So even though the first field could have matched to "abc_def", the non-greedy
parsing chose the shorter possible match of "abc".
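
The same distinction exists in Python's ``re`` module, which can serve as a point of comparison. The sketch below is stdlib-only and does not use trollsift; the two regular expressions are illustrative assumptions that contrast a lazy quantifier with a greedy one on the first field:

```python
import re

text = "abc_def_ghi"

# Lazy quantifier on the first field: match as little as possible.
lazy = re.fullmatch(r"(?P<field_one>.*?)_(?P<field_two>.*)", text).groupdict()

# Greedy quantifier on the first field: match as much as possible.
greedy = re.fullmatch(r"(?P<field_one>.*)_(?P<field_two>.*)", text).groupdict()

print(lazy)    # {'field_one': 'abc', 'field_two': 'def_ghi'}
print(greedy)  # {'field_one': 'abc_def', 'field_two': 'ghi'}
```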

composing
^^^^^^^^^
The reverse operation is called 'compose', and is equivalent to the Python
string class format method. Here we take the filename pattern from earlier,
change the time stamp of the data, and write out a new file name:

  >>> from datetime import datetime
  >>>
  >>> p = Parser("/somedir/{directory}/hrpt_{platform:4s}{platnum:2s}_{time:%Y%m%d_%H%M}_{orbit:05d}.l1b")
  >>> data = {'directory': 'otherdir', 'platform': 'noaa', 'platnum': '16', 'time': datetime(2012, 1, 1, 1, 1), 'orbit': 69022}
  >>> p.compose(data)
  '/somedir/otherdir/hrpt_noaa16_20120101_0101_69022.l1b'
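
For comparison, this is the same composition written with the built-in ``str.format`` method and no trollsift import. It works for this pattern because ``datetime`` objects interpret the format specification as a ``strftime`` pattern:

```python
from datetime import datetime

fmt = "/somedir/{directory}/hrpt_{platform:4s}{platnum:2s}_{time:%Y%m%d_%H%M}_{orbit:05d}.l1b"
data = {
    "directory": "otherdir",
    "platform": "noaa",
    "platnum": "16",
    "time": datetime(2012, 1, 1, 1, 1),
    "orbit": 69022,
}

# datetime.__format__ hands the spec after the colon to strftime.
result = fmt.format(**data)
print(result)  # /somedir/otherdir/hrpt_noaa16_20120101_0101_69022.l1b
```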

It is also possible to compose only partially, i.e., compose by specifying values
for only a subset of the parameters in the format string. Example:

  >>> p = Parser("/somedir/{directory}/hrpt_{platform:4s}{platnum:2s}_{time:%Y%m%d_%H%M}_{orbit:05d}.l1b")
  >>> data = {'directory':'my_dir'}
  >>> p.compose(data, allow_partial=True)
  '/somedir/my_dir/hrpt_{platform:4s}{platnum:2s}_{time:%Y%m%d_%H%M}_{orbit:05d}.l1b'
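
Plain ``str.format`` raises ``KeyError`` when a field is missing, so partial composition needs extra handling. The sketch below shows one way to get similar behaviour with ``string.Formatter`` from the standard library. The function ``partial_format`` is a hypothetical helper, not trollsift's implementation; it assumes simple field names (no attribute or index access) and no escaped braces in the pattern:

```python
from string import Formatter


def partial_format(fmt, data):
    """Fill in the fields present in `data`; leave the others as placeholders."""
    formatter = Formatter()
    pieces = []
    for literal, name, spec, conversion in formatter.parse(fmt):
        pieces.append(literal)
        if name is None:
            continue  # trailing literal text with no field after it
        # Rebuild the placeholder text exactly as it appeared in the pattern.
        placeholder = "{" + name
        if conversion:
            placeholder += "!" + conversion
        if spec:
            placeholder += ":" + spec
        placeholder += "}"
        if name in data:
            pieces.append(formatter.format(placeholder, **data))
        else:
            pieces.append(placeholder)
    return "".join(pieces)


fmt = "/somedir/{directory}/hrpt_{platform:4s}{platnum:2s}_{time:%Y%m%d_%H%M}_{orbit:05d}.l1b"
partial = partial_format(fmt, {"directory": "my_dir"})
print(partial)
```

Because the untouched placeholders survive unchanged, the result is itself a valid pattern that can be composed again once the remaining values are known.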

In addition to Python's built-in string formatting functionality, trollsift also
provides extra conversion options such as making all characters lowercase:

  >>> my_parser = Parser("{platform_name!l}")
  >>> my_parser.compose({'platform_name': 'NPP'})
  'npp'

For all of the options see :class:`~trollsift.parser.StringFormatter`.
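
Custom conversions of this kind can be built on the standard library's ``string.Formatter`` by overriding its ``convert_field`` method. The class ``LowerFormatter`` below is a hypothetical, stdlib-only sketch that supports only the ``!l`` conversion; it is not the trollsift ``StringFormatter`` class:

```python
from string import Formatter


class LowerFormatter(Formatter):
    """Formatter that adds a lowercase conversion, written as `!l`."""

    def convert_field(self, value, conversion):
        if conversion == "l":
            return str(value).lower()
        # Fall back to the standard conversions (!s, !r, !a).
        return super().convert_field(value, conversion)


lowered = LowerFormatter().format("{platform_name!l}", platform_name="NPP")
print(lowered)  # npp
```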

standalone parse and compose
----------------------------

The parse and compose methods also exist as standalone functions.
Depending on your requirements, you can call them directly:

  >>> from datetime import datetime
  >>> from trollsift import parse, compose
  >>> fmt = "/somedir/{directory}/hrpt_{platform:4s}{platnum:2s}_{time:%Y%m%d_%H%M}_{orbit:05d}.l1b"
  >>> data = parse(fmt, "/somedir/otherdir/hrpt_noaa16_20140210_1004_69022.l1b")
  >>> data['time'] = datetime(2012, 1, 1, 1, 1)
  >>> compose(fmt, data)
  '/somedir/otherdir/hrpt_noaa16_20120101_0101_69022.l1b'

This achieves exactly the same result as the Parser object example above.