Examples
#############

This section demonstrates how to use the common functionality of ``PyFDB``.
It contains examples for all methods of the ``FDB`` object.

In general ``PyFDB`` is used to refer to the Python API of the ``FDB``, whereas
``FDB`` is used to refer to the underlying C++ class or the created Python instance.


.. _mars_selection_label:

MARS Selections
***************
A central concept when interacting with the ``FDB`` is the MARS selection. A
selection is a dictionary-like object describing ranges or sets of
coordinates, which point to multiple elements or sub-datacubes within a
datacube; see `Datacube Spec <https://github.com/ecmwf/datacube-spec>`__.

The recommended way of interacting with the ``FDB`` is to describe a MARS
selection with a dictionary. The following example shows how to create such a
selection:

.. code-block:: python

    mars_selection = {
        "key-1": ["value-2", "value-4", "value-5"],     # String values
        "key-2": ["0.1", 0.2, "0.5"],                   # Mixed types
        "key-3": [0.1, 0.2, 0.5],                       # Float values
        "key-4": [1, 2, 0.5],                           # Integer and float values
        "key-5": 1,                                     # Single int value
        "key-6": [1 + 0.5 * x for x in range(2)],       # List of float values generated by a list comprehension
    }

.. note::

    The type of a ``MarsSelection`` is ``Mapping[str, str | int | float | Collection[str | int | float]]``.
    Such a mapping can be passed to the ``FDB`` object of the ``PyFDB`` module.

    Some of the methods accept wildcard selections, e.g. :ref:`listing
    <list_label>`. For those it's possible to hand over the wildcard selection
    directly.

MARS Identifier
***************
There is also the concept of a MARS identifier. Identifiers are a strict subset of MARS selections:
they only allow a single value per key.

.. code-block:: python

    mars_identifier = {
        "key-1": "value-1",     # String value
        "key-2": 2,             # Integer value
        # ...
    }

For further information about the individual MARS data types, see the `Datacube Spec <https://github.com/ecmwf/datacube-spec>`__.
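
The relationship between a selection and its identifiers can be illustrated with a
short, self-contained sketch. The helper ``expand_selection`` is hypothetical and
not part of ``PyFDB``. It assumes that a selection denotes the Cartesian product of
its per-key values and enumerates the identifiers covered by it.

.. code-block:: python

    from itertools import product
    from typing import Collection, Dict, Iterator, Mapping, Union

    Value = Union[str, int, float]
    MarsSelection = Mapping[str, Union[Value, Collection[Value]]]
    MarsIdentifier = Dict[str, Value]


    def expand_selection(selection: MarsSelection) -> Iterator[MarsIdentifier]:
        """Yield every identifier covered by the selection (hypothetical helper)."""
        keys = list(selection)
        # Wrap singular values in a one-element list so every key has a list of values.
        value_lists = [
            [v] if isinstance(v, (str, int, float)) else list(v)
            for v in selection.values()
        ]
        for combination in product(*value_lists):
            yield dict(zip(keys, combination))


    selection = {"class": "ea", "param": ["167", "165"], "step": [0, 6]}
    identifiers = list(expand_selection(selection))

    print(len(identifiers))   # 4
    print(identifiers[0])     # {'class': 'ea', 'param': '167', 'step': 0}

Every dictionary produced this way has singular values only, which is the defining
property of an identifier.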


PyFDB Initialisation
********************

.. invisible-code-block: python

   import pyfdb

.. code-block:: python

    fdb = pyfdb.FDB()

If no configuration is supplied, ``FDB`` falls back to deriving the configuration
location from a predefined set of locations. You can supply a custom location
by setting the ``FDB_HOME`` environment variable. If you only want to set the
location of the configuration file, use the ``FDB5_CONFIG_FILE``
environment variable. There are many configuration options; if in doubt, refer
to the official ``FDB`` documentation at `ReadTheDocs
<https://fields-database.readthedocs.io/en/latest/>`__.

.. clear-namespace

You can also pass (dynamically) created custom configurations as parameters to
the ``FDB`` constructor. Those can be supplied as a ``Path`` pointing to the
location of the configuration file, as a ``str`` containing the ``yaml``
representation of the configuration, or as a ``Dict[str, Any]`` as shown below.

.. code-block:: python

    config = {
        "type":"local",
        "engine":"toc",
        "schema":"/path/to/fdb_schema",
        "spaces":[
            {
                "handler":"Default",
                "roots":[
                    {"path": "/path/to/root"}
                ]
            }
        ],
    }
    fdb = pyfdb.FDB(config=config, user_config={})

.. clear-namespace

For convenience, the ``FDB`` instance implements the ``Python`` context manager interface:

.. code-block:: python

    with pyfdb.FDB() as fdb:
        # Use fdb in here
        pass

On exiting the context, :ref:`flush_label` is called to guarantee that any pending
:ref:`archive_label` operation has been synced. This can lead to unintended
sync behaviour, see :ref:`archive_label` for further information.

The methods of the ``FDB`` class can be leveraged for different use cases. The
sections below give examples for the most common methods and show different
ways of using the Python API.

.. _archive_label:

Archive
***********

**Archive binary data into the underlying FDB.**

.. invisible-code-block: python

   import pyfdb

.. code-block:: python

    with pyfdb.FDB(fdb_config_path) as fdb:
        filename = data_path / "x138-300.grib"

        fdb.archive(filename.read_bytes())
        # On exit of this scope fdb is flushed

In this scenario a ``GRIB`` file is archived to the configured ``FDB``. If no
optional ``identifier`` is supplied, the FDB reads the metadata from the given
``GRIB`` file and stores the data under it. If we set an ``identifier``, no
consistency checks take place and our data is saved with the metadata taken
from the supplied ``identifier``. This enables us to store arbitrary binary data
under the given key, as shown below:

.. code-block:: python

    identifier = {
        "class": "rd",
        "expver": "zzzz",
        "stream": "oper",
        "date": "20191110",
        "time": "0000",
        "domain": "g",
        "type": "an",
        "levtype": "pl",
        "step": "0",
        "levelist": "300",
        "param": "138",
    }

    with pyfdb.FDB(fdb_config_path) as fdb:
        fdb.archive(b"test-binary-data", identifier=identifier)
        # On exit of this scope fdb is flushed

The ``flush`` command guarantees that the archived data has been flushed to the
``FDB``. In combination with using the context manager of the ``FDB`` object,
the syncing may behave differently from what the user expects. Take a look at
the following code, utilizing the archive method:

.. skip: start

.. Sybil can't skip code blocks in the default version. Therefore we need an example here

>> fdb = pyfdb.FDB()
>> for step in range(240):
>>    with fdb:
>>        fdb.archive(...)
>>        fdb.archive(...)

.. skip: end

This calls the ``__exit__`` method of the ``FDB`` after each iteration of
the step loop and therefore causes a ``flush`` of the archived data. Compare the snippet above
with the following:

.. skip: start

.. Sybil can't skip code blocks in the default version. Therefore we need an example here

>> fdb = pyfdb.FDB()
>> for step in range(240):
>>    fdb.archive(...)
>>    fdb.archive(...)
>>    fdb.flush()

.. skip: end

Both of these examples achieve the same result in the normal, successful
execution case. They differ when an exception interrupts the execution after
some ``archive()`` calls have succeeded and others have not. With a manual call
to ``flush()``, none of this data becomes visible to the user. With the context
manager, ``flush()`` is called implicitly when leaving the scope, including when
an exception is thrown, which makes the partial output visible to the user.
Which of these outcomes is more desirable depends on the workflow.
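
The difference can be reproduced without a real database. The sketch below uses
``FakeFDB``, a hypothetical stand-in that is not part of ``PyFDB`` and only records
which data would be visible. It assumes, as described above, that leaving the
context triggers a flush even when an exception is raised.

.. code-block:: python

    class FakeFDB:
        """Hypothetical stand-in for pyfdb.FDB that only tracks visibility."""

        def __init__(self):
            self.pending = []   # archived but not yet flushed
            self.visible = []   # flushed, i.e. visible to readers

        def archive(self, data):
            self.pending.append(data)

        def flush(self):
            self.visible.extend(self.pending)
            self.pending.clear()

        def __enter__(self):
            return self

        def __exit__(self, exc_type, exc_value, traceback):
            # Flush on every exit, including exits caused by an exception.
            self.flush()
            return False  # do not swallow the exception


    def archive_two(fdb):
        fdb.archive(b"field-1")
        raise RuntimeError("simulated failure before the second archive call")


    manual = FakeFDB()
    try:
        archive_two(manual)
        manual.flush()  # never reached because of the exception
    except RuntimeError:
        pass

    managed = FakeFDB()
    try:
        with managed:
            archive_two(managed)
    except RuntimeError:
        pass

    print(manual.visible)   # []
    print(managed.visible)  # [b'field-1']

With the manual flush nothing is visible after the failure, whereas the context
manager exposes the partially archived data.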

.. clear-namespace

.. _flush_label:

Flush
***********

**Flush all buffers and close all data handles of the underlying FDB into a consistent DB state.**

.. invisible-code-block: python

   import pyfdb

.. tip::

   It's always safe to call ``flush``.

.. code-block:: python

    fdb = pyfdb.FDB(fdb_config_path)

    filename = data_path / "x138-300.grib"

    fdb.archive(filename.read_bytes())
    fdb.flush()

The ``flush`` command guarantees that the :ref:`archived <archive_label>` data
has been flushed to the ``FDB``. You can either call the method explicitly or
rely on the context manager capabilities of the ``FDB``, which call it for you.
See :ref:`archive_label`.

.. clear-namespace

.. _retrieve_label:

Retrieving
***********

**Retrieve data which is specified by a MARS selection.**

.. invisible-code-block: python

   import pyfdb

To Memory
=========

.. code-block:: python

    fdb = pyfdb.FDB(fdb_config_path)

    selection = {
        "type": "an",
        "class": "ea",
        "domain": "g",
        "expver": "0001",
        "stream": "oper",
        "date": "20200101",
        "levtype": "sfc",
        "step": "0",
        "param": ["167", "165", "166"],
        "time": "1800",
    }

    with fdb.retrieve(selection) as data_handle:
        data_handle.read(4) # == b"GRIB"
        # data_handle.readall() # As an alternative to read all messages

The code above shows how to retrieve a MARS selection given as a dictionary.
The retrieved ``data_handle`` has to be opened before being read and closed
afterwards. If you are interested in reading the entire ``data_handle``, you
could use the ``readall`` method. 

.. tip::

    When using the ``readall`` method, there is no need to ``open`` the
    ``data_handle`` before or to ``close`` it after the call.

.. clear-namespace

To File
=========

Another common use case is saving certain ``GRIB`` data to a
file on your local machine. The following code shows how to achieve this:

.. code-block:: python

    import shutil

    fdb = pyfdb.FDB(fdb_config_path)

    # Specify selection here
    # --------------------
    selection = {
        "type": "an",
        "class": "ea",
        "domain": "g",
        "expver": "0001",
        "stream": "oper",
        "date": "20200101",
        "levtype": "sfc",
        "step": "0",
        "param": ["167", "165", "166"],
        "time": "1800",
    }

    filename = test_case_tmp / 'output.grib'

    with open(filename, 'wb') as out:
        with fdb.retrieve(selection) as data_handle:
            out.write(data_handle.readall())

The example above first reads all data into memory and then writes it to the
specified file. In case the data of the selection is too large to fit in
memory, we can leverage ``shutil`` to read the content in buffered chunks and
write the individual chunks to a single file on disk:

.. invisible-code-block: python

   import pyfdb
   import tempfile
   import shutil

.. code-block:: python

    with tempfile.TemporaryFile() as out:
        with fdb.retrieve(selection) as data_handle:
            assert data_handle
            shutil.copyfileobj(data_handle, out)

Depending on the implementation, ``shutil`` uses either the ``read`` method or the
``readinto`` method. The former copies one buffer at a time until
the entire data handle is depleted. The latter leverages a ``memoryview`` to make it a zero-copy
function.
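
The chunked copy can be tried without an ``FDB`` at hand. In the sketch below an
``io.BytesIO`` object stands in for the data handle, under the assumption that the
handle behaves like a readable binary file object. The payload is made up for
illustration.

.. code-block:: python

    import io
    import shutil

    # io.BytesIO stands in for the data handle returned by ``retrieve``.
    source = io.BytesIO(b"GRIB" + bytes(10_000))
    target = io.BytesIO()

    # Copy in chunks of 4 KiB instead of reading everything at once.
    shutil.copyfileobj(source, target, length=4096)

    print(target.getvalue()[:4])   # b'GRIB'
    print(len(target.getvalue()))  # 10004

The ``length`` argument controls the chunk size, so the peak memory usage stays
independent of the total size of the retrieved data.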

.. clear-namespace

.. _list_label:

List
****

**List data that is present in the underlying FDB archive and can be retrieved.**

.. invisible-code-block: python

   import pyfdb

.. code-block:: python

    fdb = pyfdb.FDB(fdb_config_path)

    selection = {
        "type": "an",
        "class": "ea",
    }

    list_iterator = fdb.list(selection, level=1)
    elements = list(list_iterator)

    for el in elements:
        print(el)

    assert len(elements) == 32

The code above shows an example of listing the contents of the ``FDB`` for a given selection.
The selection is a ``MarsSelection`` (a MARS request without the verb).

.. note::

   A ``MarsSelection`` doesn't need to be fully specified. In the example above
   you can see that many of the MARS keys aren't specified. In case of ``list``, the given keys
   are treated as a selector, meaning that all data which matches those keys is returned. For
   every key that isn't explicitly stated, all found data is returned.

   We recommend always using lists of values, as seen in :ref:`mars_selection_label`.
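
As a rough mental model, a partially specified selection behaves like a filter over
the stored identifiers. The following sketch is not how the ``FDB`` implements
listing; ``matches`` is a hypothetical helper and the catalogue entries are made up
for illustration.

.. code-block:: python

    def matches(selection, identifier):
        """Return True if the identifier is covered by the (partial) selection."""
        for key, wanted in selection.items():
            # Accept both singular values and collections of values.
            wanted_values = wanted if isinstance(wanted, (list, tuple, set)) else [wanted]
            if str(identifier.get(key)) not in {str(v) for v in wanted_values}:
                return False
        # Keys that the selection does not mention are unconstrained.
        return True


    catalogue = [
        {"class": "ea", "type": "an", "param": "167"},
        {"class": "ea", "type": "fc", "param": "167"},
        {"class": "rd", "type": "an", "param": "131"},
    ]

    selected = [e for e in catalogue if matches({"class": ["ea"], "type": ["an"]}, e)]
    print(selected)  # [{'class': 'ea', 'type': 'an', 'param': '167'}]

An empty selection constrains nothing and therefore matches every entry of the
catalogue.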


``level=1`` refers to the schema level of the ``FDB``. A given ``Rule`` in an
``FDB`` schema could look like:

.. code-block::

    [ class, expver, stream, date, time, domain?
      ^^^^^^^^^^^^^^^ Level 1 ^^^^^^^^^^^^^^^^^^
      [ type, levtype
        ^^ Level 2 ^^
              [ step, levelist?, param ]]
                ^^^^^^^ Level 3 ^^^^^^^
    ]

Depending on the given ``level`` different outputs are to be expected:

.. invisible-code-block: python

   import pyfdb

.. code-block:: python

    fdb = pyfdb.FDB(fdb_config_path)

    selection = {
        "type": "an",
        "class": "ea",
        "domain": "g",
        "expver": "0001",
        "stream": "oper",
        "date": "20200101",
        "levtype": "sfc",
        "step": "0",
        "time": "1800",
    }
    list_iterator = fdb.list(selection) # level == 3
    elements = list(list_iterator)
    print(elements[0])

::

    {class=ea,expver=0001,stream=oper,date=20200101,time=1800,domain=g}
    {type=an,levtype=sfc}
    {step=0,param=131},
    TocFieldLocation[uri=URI[scheme=file,name=/<path-to-db_store>/ea:0001:oper:20200101:1800:g/an:sfc.20251118.151917.<?>.375861178007828.data],offset=10732,length=10732,remapKey={}],
    length=10732,
    timestamp=1763479157

    {class=ea,expver=0001,stream=oper,date=20200101,time=1800,domain=g}
    {type=an,levtype=sfc}
    {step=0,param=132},
    TocFieldLocation[uri=URI[scheme=file,name=/<path-to-db_store>/db_store/ea:0001:oper:20200101:1800:g/an:sfc.20251118.151917.<?>.375861178007828.data],offset=21464,length=10732,remapKey={}],
    length=10732,
    timestamp=1763479157

    {class=ea,expver=0001,stream=oper,date=20200101,time=1800,domain=g}
    {type=an,levtype=sfc}
    {step=0,param=167},
    TocFieldLocation[uri=URI[scheme=file,name=/<path-to-db_store>/db_store/ea:0001:oper:20200101:1800:g/an:sfc.20251118.151917.<?>.375861178007828.data],offset=0,length=10732,remapKey={}],
    length=10732,
    timestamp=1763479157

.. code-block:: python

    fdb = pyfdb.FDB(fdb_config_path)
    selection = {
        "type": "an",
        "class": "ea",
        "domain": "g",
        "expver": "0001",
        "stream": "oper",
        "date": "20200101",
        "levtype": "sfc",
        "step": "0",
        "time": "1800",
    }
    list_iterator = fdb.list(selection, level=2)
    elements = list(list_iterator)
    print(elements[0])

::

    {class=ea,expver=0001,stream=oper,date=20200101,time=1800,domain=g}
    {type=an,levtype=sfc},
    length=0,
    timestamp=0

.. code-block:: python

    fdb = pyfdb.FDB(fdb_config_path)
    selection = {
        "type": "an",
        "class": "ea",
        "domain": "g",
        "expver": "0001",
        "stream": "oper",
        "date": "20200101",
        "levtype": "sfc",
        "step": "0",
        "time": "1800",
    }
    list_iterator = fdb.list(selection, level=1)
    elements = list(list_iterator)
    print(elements[0])

:: 

    {class=ea,expver=0001,stream=oper,date=20200101,time=1800,domain=g},
    length=0,
    timestamp=0

For each level, the returned iterator of ``ListElement`` restricts the elements to the corresponding
level of the underlying FDB. ``level=1`` returns elements whose key only contains MARS keys of level 1,
``level=2`` returns elements whose key contains MARS keys up to level 2, and ``level=3`` returns elements
whose key contains all MARS keys together with the corresponding ``DataHandle`` pointing to the location
of the file on disk.

You can use this ``DataHandle`` directly to read the message represented by the ``ListElement``, e.g.:

.. code-block:: python

    fdb = pyfdb.FDB(fdb_config_path)

    # Define the selection before it is passed to ``list``
    selection = {
        "type": "an",
        "class": "ea",
        "domain": "g",
        "expver": "0001",
        "stream": "oper",
        "date": "20200101",
        "levtype": "sfc",
        "step": "0",
        "param": ["167", "131", "132"],
        "time": "1800",
    }
    list_iterator = fdb.list(selection, level=3)

    for el in list_iterator:
        data_handle = el.data_handle
        data_handle.open()
        assert data_handle.read(4) == b"GRIB"
        data_handle.close()

.. clear-namespace

.. _inspect_label:
   
Inspect
*******

**Inspects the content of the underlying FDB and returns a generator of list elements
describing which field was part of the MARS selection.**

.. invisible-code-block: python

   import pyfdb

.. code-block:: python

    fdb = pyfdb.FDB(fdb_config_path)

    identifier = {
        "type": "an",
        "class": "ea",
        "domain": "g",
        "expver": "0001",
        "stream": "oper",
        "date": "20200101",
        "levtype": "sfc",
        "step": "0",
        "param": "131",
        "time": "1800",
    }

    inspect_iterator = fdb.inspect(identifier)
    elements = list(inspect_iterator)

    # Because the identifier needs to be fully specified, there
    # should be only a single element returned
    assert len(elements) == 1

    for el in elements:
        with el.data_handle as data_handle:
            assert data_handle.read(4) == b"GRIB"

The code above shows how to inspect certain elements stored in the ``FDB``. This call is similar to
a ``list`` call with ``level=3``, although the internals are quite different. The functionality is
designed to list a vast amount of individual fields. 

Similar to the :ref:`list <list_label>` command, each returned ``ListElement`` contains a ``DataHandle``, which can
be used to directly access the data associated with the element; see the example of ``list``.

.. note::

   Due to the internals of the ``FDB`` only a fully specified MARS selection
   with singular values (also called Identifier) is accepted. If a list is given
   for a key, e.g. ``param=131/132``, the second value is silently dropped.

.. clear-namespace

.. _status_label:

Status
*******

**List the status of all FDB entries with their control identifiers, e.g., whether a certain database was locked for retrieval.**

.. invisible-code-block: python

   import pyfdb

.. code-block:: python

    fdb = pyfdb.FDB(fdb_config_path)

    selection = {
        "type": "an",
        "class": "ea",
        "domain": "g",
    }

    status_iterator = fdb.status(selection)
    elements = list(status_iterator)

    len(elements) # == 32

The output of such a command can look like the example below. It is the same output you get from a
call to :ref:`control <control_label>` when setting certain ``ControlIdentifiers`` for elements of the ``FDB``.

::

    ControlElement(
        control_identifiers=[WIPE], 
        key={'class': ['ea'], 'date': ['20200104'], 'domain': ['g'], 'expver': ['0001'], 'stream': ['oper'], 'time': ['2100']},
        location=/<some-path>/db_store/ea:0001:oper:20200104:2100:g
    )


You can see that the ``ControlIdentifier`` for ``WIPE`` is active for the given entry of the ``FDB``.

.. tip::
   Use the ``control`` functionality of FDB to switch certain properties of ``FDB`` elements.
   Refer to the :ref:`control_label` section for further information.

.. _wipe_label:

Wipe
*******

**Wipe data from the database**

.. invisible-code-block: python

   import pyfdb

Delete FDB databases and the data contained therein. The passed
selection identifies the database to delete. This is equivalent to a UNIX ``rm`` command.
The function deletes either whole databases or whole indexes within databases.
.. tip::

   You should check the elements of a deletion before running it with the ``doit`` flag.
   Double check that the dry-run, which is active by default, really returns the elements you are
   expecting.

A potential deletion operation could look like this:

.. code-block:: python

    fdb = pyfdb.FDB(fdb_config_path)

    elements = list(fdb.wipe({"class": "ea"}))
    len(elements) > 0

    # NOTE: Double check that the returned elements are those you want to delete
    for element in elements:
        print(element)
    

    # Do the actual deletion with the `doit=True` flag
    wipe_iterator = fdb.wipe({"class": "ea"}, doit=True)
    wiped_elements = list(wipe_iterator)

    for element in wiped_elements:
        print(element)

.. clear-namespace

.. _purge_label:

Purge
*******
**Remove duplicate data from the database.**

.. invisible-code-block: python

   import pyfdb

Purge duplicate entries from the database and remove the associated data if the data is owned and not adopted.
Data in the ``FDB`` is immutable. It is masked, but not removed, when overwritten with new data using the same key.
Masked data can no longer be accessed. Indexes and data files that only contain masked data may be removed.

If an index refers to data that is not owned by the FDB (in particular data which has been adopted from an
existing ``FDB``), this data will not be removed.

.. tip::

   It's always advisable to check the elements of a deletion before running it with the ``doit`` flag.
   Double check that the dry-run, which is active by default, really returns the elements you are
   expecting.

.. code-block:: python

    fdb = pyfdb.FDB(fdb_config_path)

    elements = list(fdb.purge({"class": "ea"}))
    len(elements) > 0

    # NOTE: Double check that the returned elements are those you want to delete
    for element in elements:
        print(element)

    # Do the actual deletion with the `doit=True` flag
    purge_iterator = fdb.purge({"class": "ea"}, doit=True)
    purge_elements = list(purge_iterator)

    for element in purge_elements:
        print(element)

.. clear-namespace

.. _stats_label:

Stats
*******
**Print information about FDB databases, aggregating the information over all the databases visited into a final summary.**

.. invisible-code-block: python

   import pyfdb

.. code-block:: python

    fdb = pyfdb.FDB(fdb_config_path)

    selection = {
            "type": "an",
            "class": "ea",
            "domain": "g",
            "expver": "0001",
            "stream": "oper",
            "date": "20200101",
            "levtype": "sfc",
            "step": "0",
            "param": ["167", "165", "166"],
            "time": "1800",
        }

    elements = list(fdb.stats(selection))

    for el in elements:
        print(el)

The example above shows how to use the ``stats`` function to get an overview of the statistics for a given MARS
selection. For every database and every index the selection touches, it aggregates statistics and shows the result
in a table. The ``StatsElement`` s returned from the call are Python strings resembling individual lines of the
output generated by the underlying ``FDB``. A call like the one above could lead to the following output:

::

    Index Statistics:
    Fields                          : 3
    Size of fields                  : 32,196 (31.4414 Kbytes)
    Reacheable fields               : 3
    Reachable size                  : 32,196 (31.4414 Kbytes)

    DB Statistics:
    Databases                       : 1
    TOC records                     : 2
    Size of TOC files               : 2,048 (2 Kbytes)
    Size of schemas files           : 228 (228 bytes)
    TOC records                     : 2
    Owned data files                : 1
    Size of owned data files        : 32,196 (31.4414 Kbytes)
    Index files                     : 1
    Size of index files             : 131,072 (128 Kbytes)
    Size of TOC files               : 2,048 (2 Kbytes)
    Total owned size                : 165,544 (161.664 Kbytes)
    Total size                      : 165,544 (161.664 Kbytes)
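
If the numbers are needed programmatically, the lines can be parsed. The sketch
below assumes that every element is a plain string in the ``name : count`` layout
shown above; ``parse_stats`` is a hypothetical helper and not part of ``PyFDB``.

.. code-block:: python

    def parse_stats(lines):
        """Collect ``name : count`` lines into a dictionary (hypothetical helper)."""
        stats = {}
        for line in lines:
            name, _, value = line.partition(":")
            value = value.strip()
            if not value:
                continue  # section headers and blank lines carry no value
            # The first token is the raw count; drop the thousands separators.
            stats[name.strip()] = int(value.split()[0].replace(",", ""))
        return stats


    lines = [
        "Index Statistics:",
        "Fields                          : 3",
        "Size of fields                  : 32,196 (31.4414 Kbytes)",
    ]
    parsed = parse_stats(lines)
    print(parsed)  # {'Fields': 3, 'Size of fields': 32196}

Names that occur more than once in the output, such as ``TOC records``, simply keep
the value of their last occurrence in this sketch.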

.. clear-namespace

.. _control_label:

Control
*******
**Enable or disable certain features of FDB databases, e.g., retrieving, listing, etc.**

The example given below shows how the activation/deactivation of the wipe functionality of the ``FDB``
works for a certain selection. 

.. invisible-code-block: python

   import pyfdb
   import pytest

.. tip::
   Consume the iterator, returned by the ``control`` call, completely. Otherwise, the lock file
   won't be created.

.. code-block:: python

    fdb = pyfdb.FDB(fdb_config_path)

    selection = {
        "class": "ea",
        "domain": "g",
        "expver": "0001",
        "stream": "oper",
        "date": "20200101",
        "time": "1800",
    }

    print("Lock the database for wiping")
    control_iterator = fdb.control(selection, pyfdb.ControlAction.DISABLE, [pyfdb.ControlIdentifier.WIPE])
    elements = list(control_iterator)

    assert len(elements) == 1
    assert (fdb_config_path.parent / "db_store" / "ea:0001:oper:20200101:1800:g" / "wipe.lock").exists()

    print("Try Wipe")
    wipe_iterator = fdb.wipe(selection, doit=True)

    elements = []

    with pytest.raises(RuntimeError):
        for el in wipe_iterator:
            elements.append(el)

    assert len(elements) == 0

    print("Unlock the database for wiping")
    control_iterator = fdb.control(selection, pyfdb.ControlAction.ENABLE, [pyfdb.ControlIdentifier.WIPE])
    elements = list(control_iterator)

    assert len(elements) > 0
    assert not (fdb_config_path.parent / "db_store" / "ea:0001:oper:20200101:1800:g" / "wipe.lock").exists()

    print("Wipe")
    fdb.wipe(selection, doit=True)
    fdb.flush()

    print("Success")


After specifying the selection we want to target (it has to contain keys of
the first and second level of the schema), we can call the ``control`` function and specify the desired action:
in this case ``ControlIdentifier.WIPE`` and ``ControlAction.DISABLE``, which translates to disabling
wiping for the specified database. Multiple ``ControlIdentifier`` s can be specified in a single call.

For each ``ControlIdentifier``, the underlying ``FDB`` creates a ``<control-identifier-name>.lock`` file,
which resides inside the database specified by the MARS selection. If we enable the action again, this
file gets deleted.
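
Because the locks are plain files, the state of a database directory can be checked
with the standard library. The sketch below simulates the behaviour described above
in a temporary directory instead of a real database; ``locked_identifiers`` is a
hypothetical helper and not part of ``PyFDB``.

.. code-block:: python

    import tempfile
    from pathlib import Path


    def locked_identifiers(database_dir):
        """Return the names of all ``*.lock`` files in a directory, sorted."""
        return sorted(path.stem for path in Path(database_dir).glob("*.lock"))


    with tempfile.TemporaryDirectory() as tmp:
        database_dir = Path(tmp)
        before = locked_identifiers(database_dir)

        # Simulate what disabling WIPE is described to do: create ``wipe.lock``.
        (database_dir / "wipe.lock").touch()
        during = locked_identifiers(database_dir)

        # Simulate enabling WIPE again: the lock file is removed.
        (database_dir / "wipe.lock").unlink()
        after = locked_identifiers(database_dir)

    print(before, during, after)  # [] ['wipe'] []

For a real ``FDB``, the directory would be the database location, for example the
``location`` reported by :ref:`status <status_label>`.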

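After disabling the action, a call to it no longer yields any elements. In the example above, iterating
over the wipe iterator raises a ``RuntimeError`` and the list of wiped elements stays empty.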

.. clear-namespace

.. _axes_label:

Axes
*******
**Return the 'axes' and their extent of a selection for a given level of the schema in an IndexAxis object.**

If a key is not specified, its entire extent (all values) is returned.

.. invisible-code-block: python

   import pyfdb

.. code-block:: python

    fdb = pyfdb.FDB(fdb_config_path)

    selection = {
            "type": "an",
            "class": "ea",
            "domain": "g",
            "expver": "0001",
            "stream": "oper",
            # "date": "20200101", # Left out to show all values are returned
            "levtype": "sfc",
            "step": "0",
            "time": "1800",
        }

    print("---------- Level 3: ----------")
    index_axis = fdb.axes(selection)
    # len(index_axis.items()) == 11

    for k, v in index_axis.items():
        print(f"k={k} | v={v}")

    print("---------- Level 2: ----------")
    index_axis = fdb.axes(selection, level=2)
    # len(index_axis.items()) == 8

    for k, v in index_axis.items():
        print(f"k={k} | v={v}")

    print("---------- Level 1: ----------")
    index_axis = fdb.axes(selection, level=1)
    # len(index_axis.items()) == 6

    for k, v in index_axis.items():
        print(f"k={k} | v={v}")

The example above produces the following output:

::

   ---------- Level 3: ----------
   k=class    | v=['ea']
   k=date     | v=['20200101', '20200102', '20200103', '20200104']
   k=domain   | v=['g']
   k=expver   | v=['0001']
   k=levelist | v=['']
   k=levtype  | v=['sfc']
   k=param    | v=['131', '132', '167']
   k=step     | v=['0']
   k=stream   | v=['oper']
   k=time     | v=['1800']
   k=type     | v=['an']

   ---------- Level 2: ----------
   k=class    | v=['ea']
   k=date     | v=['20200101', '20200102', '20200103', '20200104']
   k=domain   | v=['g']
   k=expver   | v=['0001']
   k=levtype  | v=['sfc']
   k=stream   | v=['oper']
   k=time     | v=['1800']
   k=type     | v=['an']

   ---------- Level 1: ----------
   k=class    | v=['ea']
   k=date     | v=['20200101', '20200102', '20200103', '20200104']
   k=domain   | v=['g']
   k=expver   | v=['0001']
   k=stream   | v=['oper']
   k=time     | v=['1800']

For each specified ``level``, the keys affected by the MARS selection at that level are returned. 
Optional keys in the ``FDB`` schema appear as empty lists. If a key is missing from the selection,
the key and all values stored in the ``FDB`` are returned (see the ``date`` key above).
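
The behaviour can be mimicked with a toy model. The sketch below is not the ``FDB``
implementation: ``toy_axes`` is a hypothetical helper, the level layout is copied
from the schema rule shown in the :ref:`list <list_label>` section, the entries are
made up, and the selection is restricted to singular values for brevity.

.. code-block:: python

    LEVELS = [
        ["class", "expver", "stream", "date", "time", "domain"],  # level 1
        ["type", "levtype"],                                      # level 2
        ["step", "levelist", "param"],                            # level 3
    ]


    def toy_axes(entries, selection, level=3):
        """Collect the values per key, restricted to the first ``level`` levels."""
        keys = [key for keys_of_level in LEVELS[:level] for key in keys_of_level]
        result = {key: set() for key in keys}
        for entry in entries:
            # An entry contributes only if it agrees with every key of the selection.
            if all(entry.get(k) == v for k, v in selection.items()):
                for key in keys:
                    result[key].add(entry.get(key, ""))
        return {key: sorted(values) for key, values in result.items()}


    base = {"class": "ea", "expver": "0001", "stream": "oper", "time": "1800",
            "domain": "g", "type": "an", "levtype": "sfc", "step": "0"}
    entries = [
        {**base, "date": "20200101", "param": "167"},
        {**base, "date": "20200102", "param": "131"},
    ]

    selection = {"class": "ea", "time": "1800"}  # "date" is left out on purpose
    print(toy_axes(entries, selection, level=1)["date"])  # ['20200101', '20200102']
    print(toy_axes(entries, selection)["levelist"])       # ['']

The unspecified ``date`` key is reported with all of its values, and the optional
``levelist`` key, which no entry sets, shows up with an empty value.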

In case you want to see the 'span' of all elements stored in an ``FDB`` you could use the following code:

.. warning::

   The following code is an expensive call (depending on the size of the ``FDB``).
   For testing purposes or locally configured ``FDB`` instances this is fine.

.. code-block:: python

    fdb = pyfdb.FDB(fdb_config_path)
    index_axis: pyfdb.IndexAxis = fdb.axes({})

.. clear-namespace

.. _enabled_label:

Enabled
*******
**Check whether a specific control identifier is enabled.**

.. invisible-code-block: python

   import pyfdb

.. code-block:: python

    from pyfdb import ControlIdentifier

    fdb = pyfdb.FDB(fdb_config_path)

    assert fdb.enabled(ControlIdentifier.NONE) is True
    assert fdb.enabled(ControlIdentifier.LIST) is True
    assert fdb.enabled(ControlIdentifier.RETRIEVE) is True
    assert fdb.enabled(ControlIdentifier.ARCHIVE) is True
    assert fdb.enabled(ControlIdentifier.WIPE) is True
    assert fdb.enabled(ControlIdentifier.UNIQUEROOT) is True

The example above shows how a default ``FDB`` is configured: all ``ControlIdentifier`` s
are enabled by default.

If we configure the ``FDB`` to disallow writing by setting ``writable = False`` in the ``fdb_config.yaml``,
we end up with the following ``ControlIdentifier`` s:

.. code-block:: python

    import yaml
    from pyfdb import ControlIdentifier

    fdb_config = yaml.safe_load(fdb_config_path.read_text())
    fdb_config["writable"] = False

    fdb = pyfdb.FDB(fdb_config)

    assert fdb.enabled(ControlIdentifier.NONE) is True
    assert fdb.enabled(ControlIdentifier.LIST) is True
    assert fdb.enabled(ControlIdentifier.RETRIEVE) is True
    assert fdb.enabled(ControlIdentifier.ARCHIVE) is False
    assert fdb.enabled(ControlIdentifier.WIPE) is False
    assert fdb.enabled(ControlIdentifier.UNIQUEROOT) is True

The configuration changes accordingly, if we substitute ``writable = False`` with ``visitable = False``.

.. clear-namespace

.. _dirty_label:

Dirty
*************
**Return whether a flush of the FDB is needed, for example if data was archived since the last flush.**


.. code-block:: python

    fdb = pyfdb.FDB(fdb_config_path)

    filename = data_path / "x138-300.grib"

    fdb.archive(filename.read_bytes())
    fdb.dirty() # == True
    fdb.flush()
    fdb.dirty() # == False

The example above shows that ``dirty`` returns ``True`` after an :ref:`archive <archive_label>` command.
Flushing resets the internal status of the ``FDB``, and the call to ``dirty`` returns ``False`` afterwards.

.. clear-namespace