Examples
#############
This section demonstrates how to use the common functionality of ``PyFDB`` and
contains examples for all methods of the ``FDB`` object.
In general, ``PyFDB`` refers to the Python API of the ``FDB``, whereas
``FDB`` refers to the underlying C++ class or the created Python instance.
.. _mars_selection_label:
MARS Selections
***************
One main concept of interacting with the ``FDB`` is the MARS selection. A
selection is a dictionary-like object describing ranges or sets of
coordinates, which point to multiple elements or sub-datacubes within a
datacube, see `Datacube Spec <https://github.com/ecmwf/datacube-spec>`__.
The recommended way of interacting with the ``FDB`` is to use a dictionary
describing a MARS selection. The following example shows how to create one:
.. code-block:: python

    mars_selection = {
        "key-1": ["value-2", "value-4", "value-5"],  # String values
        "key-2": ["0.1", 0.2, "0.5"],  # Mixed types
        "key-3": [0.1, 0.2, 0.5],  # Float values
        "key-4": [1, 2, 0.5],  # Integer and float values
        "key-5": 1,  # Single int value
        "key-6": [1 + 0.5 * x for x in range(2)],  # List of float values generated by a list comprehension
    }

.. note::
The type of a MarsSelection is ``Mapping[str, str | int | float | Collection[str | int | float]]`` which can be given to
the FDB object of the PyFDB module.
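
To illustrate which shapes this type allows, the following minimal sketch turns every value of a selection into a list of strings. ``normalize_selection`` is a hypothetical helper written for this example; it is not part of ``PyFDB`` and says nothing about how ``PyFDB`` treats selections internally.

```python
from collections.abc import Collection, Mapping
from typing import Union

Scalar = Union[str, int, float]
MarsSelection = Mapping[str, Union[Scalar, Collection[Scalar]]]


def normalize_selection(selection: MarsSelection) -> dict[str, list[str]]:
    """Return a copy in which every value is a list of strings (illustration only)."""
    normalized = {}
    for key, value in selection.items():
        # A str is itself a Collection, so scalars are checked explicitly first.
        if isinstance(value, (str, int, float)):
            normalized[key] = [str(value)]
        else:
            normalized[key] = [str(v) for v in value]
    return normalized


example = normalize_selection(
    {"key-1": ["value-2", "value-4"], "key-4": [1, 2, 0.5], "key-5": 1}
)
```

Single values and collections end up in the same shape, which can make selections easier to compare or log.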
Some of the methods accept a wildcard selection, e.g. :ref:`listing
<list_label>`. For those, the wildcard selection can be handed over
directly.
MARS Identifier
***************
There is also the concept of a MARS identifier. Identifiers are a strict subset of MARS selections and
differ only in that they allow a single value per key.
.. code-block:: python

    mars_identifier = {
        "key-1": "value-1",  # String value
        "key-2": 2,  # Integer value
        # ...
    }

For further information about the individual MARS data types, see the `Datacube Spec <https://github.com/ecmwf/datacube-spec>`__.
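
To make the difference to a selection concrete, the following minimal sketch checks whether a dictionary qualifies as an identifier. ``is_identifier`` is a hypothetical helper written for this example and is not part of ``PyFDB``.

```python
def is_identifier(selection):
    """Return True if every key maps to a single str, int or float value."""
    return all(isinstance(value, (str, int, float)) for value in selection.values())


single_valued = is_identifier({"key-1": "value-1", "key-2": 2})
multi_valued = is_identifier({"key-1": ["value-1", "value-2"], "key-2": 2})
```

Here ``single_valued`` is ``True`` and ``multi_valued`` is ``False``, because the second dictionary contains a list.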
PyFDB Initialisation
********************
.. invisible-code-block: python
import pyfdb
.. code-block:: python
fdb = pyfdb.FDB()
If no configuration is supplied, ``FDB`` derives the configuration location
from a predefined set of locations. You can supply a custom location
by specifying the ``FDB_HOME`` environment variable. If you only want to set the
location of the configuration file, use the ``FDB5_CONFIG_FILE``
environment variable. There is a plethora of configuration options;
if in doubt, refer to the official ``FDB`` documentation at `ReadTheDocs
<https://fields-database.readthedocs.io/en/latest/>`__.
.. clear-namespace
You can also pass (dynamically) created custom configurations as parameters to
the ``FDB`` constructor. Those can be supplied as a ``Path`` pointing to the
location of the configuration file, as a ``str`` which is the ``yaml``
representation of the configuration or as a ``Dict[str, Any]`` as shown below.
.. code-block:: python
config = {
"type":"local",
"engine":"toc",
"schema":"/path/to/fdb_schema",
"spaces":[
{
"handler":"Default",
"roots":[
{"path": "/path/to/root"}
]
}
],
}
fdb = pyfdb.FDB(config=config, user_config={})
.. clear-namespace
For convenience, the ``FDB`` instance implements the ``Python`` context manager interface:
.. code-block:: python

    with pyfdb.FDB() as fdb:
        # Use fdb in here
        pass

On exiting the context, :ref:`flush_label` is called to guarantee that any
:ref:`archive_label` operation has been synced. This can lead to unintended
sync behaviour, see :ref:`archive_label` for further information.

The different methods of the ``FDB`` class can be leveraged for different
use cases. The examples below cover the most common methods and show
different ways of using the Python API.
.. _archive_label:
Archive
***********
**Archive binary data into the underlying FDB.**
.. invisible-code-block: python
import pyfdb
.. code-block:: python

    with pyfdb.FDB(fdb_config_path) as fdb:
        filename = data_path / "x138-300.grib"
        fdb.archive(filename.read_bytes())
    # On exit of this scope fdb is flushed

In this scenario a ``GRIB`` file is archived to the configured ``FDB``. If no
optional ``identifier`` is supplied, the ``FDB`` reads the metadata from the
given ``GRIB`` file and archives the data under it. If we set an ``identifier``,
no consistency checks take place and our data is saved with the metadata given
by the supplied ``identifier``. This enables us to store arbitrary binary data
under the given key, as shown below:
.. code-block:: python
identifier = {
"class": "rd",
"expver": "zzzz",
"stream": "oper",
"date": "20191110",
"time": "0000",
"domain": "g",
"type": "an",
"levtype": "pl",
"step": "0",
"levelist": "300",
"param": "138",
}
with pyfdb.FDB(fdb_config_path) as fdb:
fdb.archive(b"test-binary-data", identifier=identifier)
# On exit of this scope fdb is flushed
The ``flush`` command guarantees that the archived data has been flushed to the
``FDB``. In combination with using the context manager of the ``FDB`` object,
the syncing may behave differently from what the user expects. Take a look at
the following code, utilizing the archive method:
.. skip: start
.. Sybil can't skip code blocks in the default version. Therefore we need an example here
>> fdb = pyfdb.FDB()
>> for step in range(240):
>> with fdb:
>> fdb.archive(...)
>> fdb.archive(...)
.. skip: end
This calls the ``__exit__`` method of the ``FDB`` after each iteration of
the step loop and therefore causes a ``flush`` of the archived data. Compare this
with the following:
.. skip: start
.. Sybil can't skip code blocks in the default version. Therefore we need an example here
>> fdb = pyfdb.FDB()
>> for step in range(240):
>> fdb.archive(...)
>> fdb.archive(...)
>> fdb.flush()
.. skip: end
Both of these examples achieve the same result in the normal, successful
execution case. They differ when an exception interrupts the execution after
some ``archive()`` calls have succeeded and others have not. With a manual
call to ``flush()``, none of this data becomes visible to the user. With the
context manager, ``flush()`` is called implicitly when leaving the scope,
including when an exception is thrown, which makes partial output visible to
the user. Which of these outcomes is more desirable depends on the workflow.
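
The difference can be reproduced without an ``FDB`` at hand. The sketch below uses ``RecordingStore``, a hypothetical stand-in class written for this example (it is not part of ``PyFDB``), which only models the behaviour described above: archived data becomes visible on ``flush``, and leaving the context always flushes.

```python
class RecordingStore:
    """Hypothetical stand-in for an FDB: data is only visible after flush()."""

    def __init__(self):
        self.pending = []
        self.visible = []

    def archive(self, data):
        self.pending.append(data)

    def flush(self):
        self.visible.extend(self.pending)
        self.pending.clear()

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        # Flush on every exit, even while an exception is propagating.
        self.flush()
        return False  # do not swallow the exception


def failing_workload(store):
    store.archive(b"first")
    raise RuntimeError("simulated failure before the second archive call")


def run_with_context_manager():
    store = RecordingStore()
    try:
        with store:
            failing_workload(store)
    except RuntimeError:
        pass
    return store.visible


def run_with_manual_flush():
    store = RecordingStore()
    try:
        failing_workload(store)
        store.flush()  # skipped because the workload raised
    except RuntimeError:
        pass
    return store.visible


partial_output = run_with_context_manager()
no_output = run_with_manual_flush()
```

With the context manager the first message is visible despite the failure, whereas with the manual flush nothing is.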
.. clear-namespace
.. _flush_label:
Flush
***********
**Flush all buffers and close all data handles of the underlying FDB into a consistent DB state.**
.. invisible-code-block: python
import pyfdb
.. tip::
It's always safe to call ``flush``.
.. code-block:: python

    fdb = pyfdb.FDB(fdb_config_path)
    filename = data_path / "x138-300.grib"

    # read_bytes() opens and closes the file for us
    fdb.archive(filename.read_bytes())
    fdb.flush()

The ``flush`` command guarantees that the :ref:`archived <archive_label>` data
has been flushed to the ``FDB``. You can either call the method explicitly or
rely on the context manager capabilities of the ``FDB``. See :ref:`archive_label`.
.. clear-namespace
.. _retrieve_label:
Retrieving
***********
**Retrieve data which is specified by a MARS selection.**
.. invisible-code-block: python
import pyfdb
To Memory
=========
.. code-block:: python
fdb = pyfdb.FDB(fdb_config_path)
selection = {
"type": "an",
"class": "ea",
"domain": "g",
"expver": "0001",
"stream": "oper",
"date": "20200101",
"levtype": "sfc",
"step": "0",
"param": ["167", "165", "166"],
"time": "1800",
}
with fdb.retrieve(selection) as data_handle:
data_handle.read(4) # == b"GRIB"
# data_handle.readall() # As an alternative to read all messages
The code above shows how to retrieve a MARS selection given as a dictionary.
The retrieved ``data_handle`` has to be opened before being read and closed
afterwards; the ``with`` statement in the example takes care of both. If you are
interested in reading the entire ``data_handle``, you can use the ``readall`` method.
.. tip::

    When using ``readall``, there is no need to ``open`` the ``data_handle``
    before the call or to ``close`` it afterwards.

.. clear-namespace
To File
=========
Another common use case is saving certain ``GRIB`` data to a
file on your local machine. The following code shows how to achieve this:
.. code-block:: python
import shutil
fdb = pyfdb.FDB(fdb_config_path)
# Specify selection here
# --------------------
selection = {
"type": "an",
"class": "ea",
"domain": "g",
"expver": "0001",
"stream": "oper",
"date": "20200101",
"levtype": "sfc",
"step": "0",
"param": ["167", "165", "166"],
"time": "1800",
}
filename = test_case_tmp / 'output.grib'
with open(filename, 'wb') as out:
with fdb.retrieve(selection) as data_handle:
out.write(data_handle.readall())
The example above first reads all data into memory and then writes it to
the specified file. In case the data of the selection is too large to fit
in memory, we can leverage ``shutil`` to read the content in buffered chunks
and write the individual chunks into a single file on disk:
.. invisible-code-block: python
import pyfdb
import tempfile
import shutil
.. code-block:: python

    with tempfile.TemporaryFile() as out:
        with fdb.retrieve(selection) as data_handle:
            assert data_handle
            shutil.copyfileobj(data_handle, out)

Depending on the implementation, ``shutil`` uses either the ``read`` method or the
``readinto`` method. The former copies one buffer at a time until
the entire data handle is depleted. The latter leverages a ``memoryview`` to avoid
intermediate copies.
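
The chunked copy can be tried without an ``FDB``. In the following minimal sketch, ``io.BytesIO`` objects stand in for the data handle and the output file (an assumption made only for this example), and the ``length`` argument of ``shutil.copyfileobj`` sets the chunk size:

```python
import io
import shutil

payload = b"GRIB" + bytes(1024 * 1024)  # roughly 1 MiB of dummy data
source = io.BytesIO(payload)  # stand-in for a data handle
target = io.BytesIO()  # stand-in for an output file

# Copy in 64 KiB chunks instead of reading everything at once.
shutil.copyfileobj(source, target, length=64 * 1024)
copied = target.getvalue()
```

Only one chunk is held by the copy loop at a time, which keeps the memory footprint independent of the total size.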
.. clear-namespace
.. _list_label:
List
****
**List data that is present in the underlying FDB archive and can be retrieved.**
.. invisible-code-block: python
import pyfdb
.. code-block:: python
fdb = pyfdb.FDB(fdb_config_path)
selection = {
"type": "an",
"class": "ea"
}
list_iterator = fdb.list(selection, level=1)
elements = list(list_iterator)
for el in elements:
print(el)
assert len(elements) == 32
The code above shows an example of listing the contents of the ``FDB`` for a given selection.
The selection is describing a ``MarsSelection`` (A MARS request without the verb).
.. note::

    A ``MarsSelection`` doesn't need to be fully specified. In the example above
    you can see that many of the MARS keys aren't specified. In the case of ``list``, the given keys
    are treated as a selector, meaning that all data which matches those keys is returned. For
    every key which isn't explicitly stated, all found data is returned.
    We recommend always using lists of values, as seen in :ref:`mars_selection_label`.

``level=1`` refers to the schema level of the ``FDB``. A ``Rule`` in an
``FDB`` schema could look like:
.. code-block::

    [ class, expver, stream, date, time, domain?
      ^^^^^^^^^^^^^^^ Level 1 ^^^^^^^^^^^^^^^^^^
        [ type, levtype
          ^^ Level 2 ^^
            [ step, levelist?, param ]]
              ^^^^^^^ Level 3 ^^^^^^^
    ]

Depending on the given ``level`` different outputs are to be expected:
.. invisible-code-block: python
import pyfdb
.. code-block:: python
fdb = pyfdb.FDB(fdb_config_path)
selection = {
"type": "an",
"class": "ea",
"domain": "g",
"expver": "0001",
"stream": "oper",
"date": "20200101",
"levtype": "sfc",
"step": "0",
"time": "1800",
}
list_iterator = fdb.list(selection) # level == 3
elements = list(list_iterator)
print(elements[0])
::
{class=ea,expver=0001,stream=oper,date=20200101,time=1800,domain=g}
{type=an,levtype=sfc}
{step=0,param=131},
TocFieldLocation[uri=URI[scheme=file,name=/<path-to-db_store>/ea:0001:oper:20200101:1800:g/an:sfc.20251118.151917.<?>.375861178007828.data],offset=10732,length=10732,remapKey={}],
length=10732,
timestamp=1763479157
{class=ea,expver=0001,stream=oper,date=20200101,time=1800,domain=g}
{type=an,levtype=sfc}
{step=0,param=132},
TocFieldLocation[uri=URI[scheme=file,name=/<path-to-db_store>/db_store/ea:0001:oper:20200101:1800:g/an:sfc.20251118.151917.<?>.375861178007828.data],offset=21464,length=10732,remapKey={}],
length=10732,
timestamp=1763479157
{class=ea,expver=0001,stream=oper,date=20200101,time=1800,domain=g}
{type=an,levtype=sfc}
{step=0,param=167},
TocFieldLocation[uri=URI[scheme=file,name=/<path-to-db_store>/db_store/ea:0001:oper:20200101:1800:g/an:sfc.20251118.151917.<?>.375861178007828.data],offset=0,length=10732,remapKey={}],
length=10732,
timestamp=1763479157
.. code-block:: python
fdb = pyfdb.FDB(fdb_config_path)
selection = {
"type": "an",
"class": "ea",
"domain": "g",
"expver": "0001",
"stream": "oper",
"date": "20200101",
"levtype": "sfc",
"step": "0",
"time": "1800",
}
list_iterator = fdb.list(selection, level=2)
elements = list(list_iterator)
print(elements[0])
::
{class=ea,expver=0001,stream=oper,date=20200101,time=1800,domain=g}
{type=an,levtype=sfc},
length=0,
timestamp=0
.. code-block:: python
fdb = pyfdb.FDB(fdb_config_path)
selection = {
"type": "an",
"class": "ea",
"domain": "g",
"expver": "0001",
"stream": "oper",
"date": "20200101",
"levtype": "sfc",
"step": "0",
"time": "1800",
}
list_iterator = fdb.list(selection, level=1)
elements = list(list_iterator)
print(elements[0])
::
{class=ea,expver=0001,stream=oper,date=20200101,time=1800,domain=g},
length=0,
timestamp=0
For each level, the returned iterator of ``ListElement`` restricts the elements to the corresponding
level of the underlying FDB. ``level=1`` returns elements whose key only contains MARS keys of level 1,
``level=2`` returns elements whose key contains MARS keys up to level 2, and ``level=3`` returns elements
whose key contains all MARS keys together with the corresponding ``DataHandle`` pointing to the location of the
file on disk.
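
Which keys to expect at each level can be sketched in plain Python. ``LEVELS`` is copied from the example rule above, and ``keys_up_to_level`` is a hypothetical helper written for this example (not part of ``PyFDB``); a real schema may contain several rules with different keys.

```python
# Key names per level, copied from the example rule shown above.
LEVELS = (
    ("class", "expver", "stream", "date", "time", "domain"),
    ("type", "levtype"),
    ("step", "levelist", "param"),
)


def keys_up_to_level(selection, level):
    """Keep only the keys that belong to schema levels 1..level."""
    allowed = {key for names in LEVELS[:level] for key in names}
    return {key: value for key, value in selection.items() if key in allowed}


selection = {
    "type": "an",
    "class": "ea",
    "domain": "g",
    "expver": "0001",
    "stream": "oper",
    "date": "20200101",
    "levtype": "sfc",
    "step": "0",
    "time": "1800",
}

level_1_keys = set(keys_up_to_level(selection, 1))
level_2_keys = set(keys_up_to_level(selection, 2))
```

For this selection, level 2 adds ``type`` and ``levtype`` to the six level 1 keys, which matches the keys shown in the listings above.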
You can use the ``DataHandle`` directly to read the message represented by a ``ListElement``, e.g.:

.. code-block:: python

    fdb = pyfdb.FDB(fdb_config_path)

    selection = {
        "type": "an",
        "class": "ea",
        "domain": "g",
        "expver": "0001",
        "stream": "oper",
        "date": "20200101",
        "levtype": "sfc",
        "step": "0",
        "param": ["167", "131", "132"],
        "time": "1800",
    }

    # The selection has to be defined before it is passed to list
    list_iterator = fdb.list(selection, level=3)

    for el in list_iterator:
        data_handle = el.data_handle
        data_handle.open()
        assert data_handle.read(4) == b"GRIB"
        data_handle.close()

.. clear-namespace
.. _inspect_label:
Inspect
*******
**Inspects the content of the underlying FDB and returns a generator of list elements
describing which field was part of the MARS selection.**
.. invisible-code-block: python
import pyfdb
.. code-block:: python
fdb = pyfdb.FDB(fdb_config_path)
identifier = {
"type": "an",
"class": "ea",
"domain": "g",
"expver": "0001",
"stream": "oper",
"date": "20200101",
"levtype": "sfc",
"step": "0",
"param": "131",
"time": "1800",
}
inspect_iterator = fdb.inspect(identifier)
elements = list(inspect_iterator)
# Because the identifier needs to be fully specified, there
# should be only a single element returned
assert len(elements) == 1
for el in elements:
with el.data_handle as data_handle:
assert data_handle.read(4) == b"GRIB"
The code above shows how to inspect certain elements stored in the ``FDB``. This call is similar to
a ``list`` call with ``level=3``, although the internals are quite different. The functionality is
designed to list a vast amount of individual fields.
Similar to the :ref:`list <list_label>` command, each returned ``ListElement`` contains a ``DataHandle`` which can
be used to directly access the data associated with the element, see the example of ``list``.
.. note::
Due to the internals of the ``FDB`` only a fully specified MARS selection
with singular values (also called Identifier) is accepted. If a list is given
for a key, e.g. ``param=131/132``, the second value is silently dropped.
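
One way to avoid losing values is to expand a multi-valued selection into individual identifiers before calling ``inspect``. The following is a minimal sketch; ``expand_to_identifiers`` is a hypothetical helper written for this example and not part of ``PyFDB``.

```python
from itertools import product


def expand_to_identifiers(selection):
    """Return one single-valued dict per combination of the given values."""
    keys = list(selection)
    value_lists = [
        [value] if isinstance(value, (str, int, float)) else list(value)
        for value in selection.values()
    ]
    return [dict(zip(keys, combination)) for combination in product(*value_lists)]


identifiers = expand_to_identifiers({"step": "0", "param": ["131", "132"]})
```

Each resulting dictionary holds exactly one value per key and could be passed to ``inspect`` on its own, provided it is fully specified.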
.. clear-namespace
.. _status_label:
Status
*******
**List the status of all FDB entries with their control identifiers, e.g., whether a certain database was locked for retrieval.**
.. invisible-code-block: python
import pyfdb
.. code-block:: python
fdb = pyfdb.FDB(fdb_config_path)
selection = {
"type": "an",
"class": "ea",
"domain": "g",
}
status_iterator = fdb.status(selection)
elements = list(status_iterator)
len(elements) # == 32
The output of such a command can look like the following. It is the same output you get from a
call to :ref:`control <control_label>` when setting certain ``ControlIdentifiers`` for elements of the ``FDB``.
::
ControlElement(
control_identifiers=[WIPE],
key={'class': ['ea'], 'date': ['20200104'], 'domain': ['g'], 'expver': ['0001'], 'stream': ['oper'], 'time': ['2100']},
location=/<some-path>/db_store/ea:0001:oper:20200104:2100:g
)
You can see that the ``ControlIdentifier`` for ``WIPE`` is active for the given entry of the ``FDB``.
.. tip::
Use the ``control`` functionality of FDB to switch certain properties of ``FDB`` elements.
Refer to the :ref:`control_label` section for further information.
.. _wipe_label:
Wipe
*******
**Wipe data from the database.**
.. invisible-code-block: python
import pyfdb
Delete FDB databases and the data contained therein. Use the passed
selection to identify the database to delete. This is equivalent to a UNIX ``rm`` command.
This function deletes either whole databases or whole indexes within databases.
.. tip::

    You should check the elements of a deletion before running it with the ``doit`` flag.
    Double check that the dry run, which is active by default, really returns the elements you are
    expecting.

A potential deletion operation could look like this:
.. code-block:: python
fdb = pyfdb.FDB(fdb_config_path)
elements = list(fdb.wipe({"class": "ea"}))
len(elements) > 0
# NOTE: Double check that the returned elements are those you want to delete
for element in elements:
print(element)
# Do the actual deletion with the `doit=True` flag
wipe_iterator = fdb.wipe({"class": "ea"}, doit=True)
wiped_elements = list(wipe_iterator)
for element in wiped_elements:
print(element)
.. clear-namespace
.. _purge_label:
Purge
*******
**Remove duplicate data from the database.**
.. invisible-code-block: python
import pyfdb
Purge duplicate entries from the database and remove the associated data if the data is owned and not adopted.
Data in the ``FDB`` is immutable. It is masked, but not removed, when overwritten with new data using the same key.
Masked data can no longer be accessed. Indexes and data files that only contain masked data may be removed.
If an index refers to data that is not owned by the FDB (in particular data which has been adopted from an
existing ``FDB``), this data will not be removed.
.. tip::

    It's always advised to check the elements of a deletion before running it with the ``doit`` flag.
    Double check that the dry run, which is active by default, really returns the elements you are
    expecting.

.. code-block:: python
fdb = pyfdb.FDB(fdb_config_path)
elements = list(fdb.purge({"class": "ea"}))
len(elements) > 0
# NOTE: Double check that the returned elements are those you want to delete
for element in elements:
print(element)
# Do the actual deletion with the `doit=True` flag
purge_iterator = fdb.purge({"class": "ea"}, doit=True)
purge_elements = list(purge_iterator)
for element in purge_elements:
print(element)
.. clear-namespace
.. _stats_label:
Stats
*******
**Print information about FDB databases, aggregating the information over all the databases visited into a final summary.**
.. invisible-code-block: python
import pyfdb
.. code-block:: python
fdb = pyfdb.FDB(fdb_config_path)
selection = {
"type": "an",
"class": "ea",
"domain": "g",
"expver": "0001",
"stream": "oper",
"date": "20200101",
"levtype": "sfc",
"step": "0",
"param": ["167", "165", "166"],
"time": "1800",
}
elements = list(fdb.stats(selection))
for el in elements:
print(el)
The example above shows how to use the ``stats`` function to get an overview of the statistics for a given MARS selection.
For every database and every index the selection touches, it aggregates statistics and shows the result in a table.
The ``StatsElement`` s returned from the call are Python strings resembling individual lines of the output generated by
the underlying ``FDB``. A call of the example above could lead to the following output:
::
Index Statistics:
Fields : 3
Size of fields : 32,196 (31.4414 Kbytes)
Reacheable fields : 3
Reachable size : 32,196 (31.4414 Kbytes)
DB Statistics:
Databases : 1
TOC records : 2
Size of TOC files : 2,048 (2 Kbytes)
Size of schemas files : 228 (228 bytes)
TOC records : 2
Owned data files : 1
Size of owned data files : 32,196 (31.4414 Kbytes)
Index files : 1
Size of index files : 131,072 (128 Kbytes)
Size of TOC files : 2,048 (2 Kbytes)
Total owned size : 165,544 (161.664 Kbytes)
Total size : 165,544 (161.664 Kbytes)
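
Since the elements are plain strings, they can be post-processed with ordinary string handling. The following minimal sketch groups such lines by section. ``parse_stats_lines`` is a hypothetical helper written for this example, and the ``label : value`` layout is assumed from the sample output above rather than guaranteed by the ``FDB``.

```python
def parse_stats_lines(lines):
    """Group 'label : value' lines under the preceding 'Section:' header."""
    stats = {}
    section = None
    for line in lines:
        label, _, value = line.partition(":")
        label, value = label.strip(), value.strip()
        if not label:
            continue
        if not value:
            # A line such as "Index Statistics:" starts a new section.
            section = label
            stats[section] = {}
        elif section is not None:
            stats[section][label] = value
    return stats


sample = [
    "Index Statistics:",
    "Fields : 3",
    "Size of fields : 32,196 (31.4414 Kbytes)",
    "DB Statistics:",
    "Databases : 1",
]
parsed = parse_stats_lines(sample)
```

Values are kept as strings, so ``parsed["Index Statistics"]["Fields"]`` is ``"3"``; converting them to numbers is left to the caller.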
.. clear-namespace
.. _control_label:
Control
*******
**Enable or disable certain features of FDB databases, e.g., retrieving, listing, etc.**
The example given below shows how the activation/deactivation of the wipe functionality of the ``FDB``
works for a certain selection.
.. invisible-code-block: python
import pyfdb
import pytest
.. tip::
Consume the iterator, returned by the ``control`` call, completely. Otherwise, the lock file
won't be created.
.. code-block:: python
fdb = pyfdb.FDB(fdb_config_path)
selection = {
"class": "ea",
"domain": "g",
"expver": "0001",
"stream": "oper",
"date": "20200101",
"time": "1800",
}
print("Lock the database for wiping")
control_iterator = fdb.control(selection, pyfdb.ControlAction.DISABLE, [pyfdb.ControlIdentifier.WIPE])
elements = list(control_iterator)
assert len(elements) == 1
assert (fdb_config_path.parent / "db_store" / "ea:0001:oper:20200101:1800:g" / "wipe.lock").exists()
print("Try Wipe")
wipe_iterator = fdb.wipe(selection, doit=True)
elements = []
with pytest.raises(RuntimeError):
for el in wipe_iterator:
elements.append(el)
assert len(elements) == 0
print("Unlock the database for wiping")
control_iterator = fdb.control(selection, pyfdb.ControlAction.ENABLE, [pyfdb.ControlIdentifier.WIPE])
elements = list(control_iterator)
assert len(elements) > 0
assert not (fdb_config_path.parent / "db_store" / "ea:0001:oper:20200101:1800:g" / "wipe.lock").exists()
print("Wipe")
fdb.wipe(selection, doit=True)
fdb.flush()
print("Success")
After specifying the selection we want to target, which has to be a selection containing keys of
the first and second level of the schema, we can call the ``control`` function and specify the desired action:
in this case ``ControlIdentifier.WIPE`` and ``ControlAction.DISABLE``, which translates to disabling
wiping for the specified database. Multiple ``ControlIdentifier`` values can be specified in a single call.
For each ``ControlIdentifier`` the underlying ``FDB`` creates a ``<control-identifier-name>.lock`` file,
which resides inside the database specified by the MARS selection. If we enable the action again, this
file gets deleted.

After disabling the action, a call to it yields no elements. In the example above, iterating over the result of
``wipe`` raises a ``RuntimeError`` before any element is returned.
.. clear-namespace
.. _axes_label:
Axes
*******
**Return the 'axes' and their extent of a selection for a given level of the schema in an IndexAxis object.**
If a key is not specified, its entire extent (all values) is returned.
.. invisible-code-block: python
import pyfdb
.. code-block:: python
fdb = pyfdb.FDB(fdb_config_path)
selection = {
"type": "an",
"class": "ea",
"domain": "g",
"expver": "0001",
"stream": "oper",
# "date": "20200101", # Left out to show all values are returned
"levtype": "sfc",
"step": "0",
"time": "1800",
}
print("---------- Level 3: ----------")
index_axis = fdb.axes(selection)
# len(index_axis.items()) == 11
for k, v in index_axis.items():
print(f"k={k} | v={v}")
print("---------- Level 2: ----------")
index_axis = fdb.axes(selection, level=2)
# len(index_axis.items()) == 8
for k, v in index_axis.items():
print(f"k={k} | v={v}")
print("---------- Level 1: ----------")
index_axis = fdb.axes(selection, level=1)
# len(index_axis.items()) == 6
for k, v in index_axis.items():
print(f"k={k} | v={v}")
The example above produces the following output:
::
---------- Level 3: ----------
k=class | v=['ea']
k=date | v=['20200101', '20200102', '20200103', '20200104']
k=domain | v=['g']
k=expver | v=['0001']
k=levelist | v=['']
k=levtype | v=['sfc']
k=param | v=['131', '132', '167']
k=step | v=['0']
k=stream | v=['oper']
k=time | v=['1800']
k=type | v=['an']
---------- Level 2: ----------
k=class | v=['ea']
k=date | v=['20200101', '20200102', '20200103', '20200104']
k=domain | v=['g']
k=expver | v=['0001']
k=levtype | v=['sfc']
k=stream | v=['oper']
k=time | v=['1800']
k=type | v=['an']
---------- Level 1: ----------
k=class | v=['ea']
k=date | v=['20200101', '20200102', '20200103', '20200104']
k=domain | v=['g']
k=expver | v=['0001']
k=stream | v=['oper']
k=time | v=['1800']
For each specified ``level``, the keys affected by the MARS selection at that level are returned.
Optional keys in the ``FDB`` schema appear as empty lists. If a key is missing from the selection,
the key and all values stored in the ``FDB`` are returned (see the ``date`` key above).
In case you want to see the 'span' of all elements stored in an ``FDB`` you could use the following code:
.. warning::

    The following call is expensive (depending on the size of the ``FDB``).
    For testing purposes or locally configured ``FDB`` instances this is fine.

.. code-block:: python
fdb = pyfdb.FDB(fdb_config_path)
index_axis: pyfdb.IndexAxis = fdb.axes({})
.. clear-namespace
.. _enabled_label:
Enabled
*******
**Check whether a specific control identifier is enabled.**
.. invisible-code-block: python
import pyfdb
.. code-block:: python
from pyfdb import ControlIdentifier
fdb = pyfdb.FDB(fdb_config_path)
assert fdb.enabled(ControlIdentifier.NONE) is True
assert fdb.enabled(ControlIdentifier.LIST) is True
assert fdb.enabled(ControlIdentifier.RETRIEVE) is True
assert fdb.enabled(ControlIdentifier.ARCHIVE) is True
assert fdb.enabled(ControlIdentifier.WIPE) is True
assert fdb.enabled(ControlIdentifier.UNIQUEROOT) is True
The examples above show how a default ``FDB`` is configured, that is, all ``ControlIdentifier`` s
are enabled by default.
If we configure the ``FDB`` to disallow writing by setting ``writable = False`` in the ``fdb_config.yaml``,
we end up with the following ``ControlIdentifier`` s:
.. code-block:: python
import yaml
from pyfdb import ControlIdentifier
fdb_config = yaml.safe_load(fdb_config_path.read_text())
fdb_config["writable"] = False
fdb = pyfdb.FDB(fdb_config)
assert fdb.enabled(ControlIdentifier.NONE) is True
assert fdb.enabled(ControlIdentifier.LIST) is True
assert fdb.enabled(ControlIdentifier.RETRIEVE) is True
assert fdb.enabled(ControlIdentifier.ARCHIVE) is False
assert fdb.enabled(ControlIdentifier.WIPE) is False
assert fdb.enabled(ControlIdentifier.UNIQUEROOT) is True
The configuration changes accordingly, if we substitute ``writable = False`` with ``visitable = False``.
.. clear-namespace
.. _dirty_label:
Dirty
*************
**Return whether a flush of the FDB is needed, for example if data was archived since the last flush.**
.. code-block:: python

    fdb = pyfdb.FDB(fdb_config_path)
    filename = data_path / "x138-300.grib"

    # read_bytes() opens and closes the file for us
    fdb.archive(filename.read_bytes())
    fdb.dirty()  # == True
    fdb.flush()
    fdb.dirty()  # == False

The example above shows that ``dirty`` returns ``True`` after an :ref:`archive <archive_label>` command.
Flushing resets the internal status of the ``FDB``, and a subsequent call to ``dirty`` returns ``False``.
.. clear-namespace