Metadata-Version: 2.4
Name: h5netcdf
Version: 1.6.4
Summary: netCDF4 via h5py
Author-email: Stephan Hoyer <shoyer@gmail.com>, Kai Mühlbauer <kmuehlbauer@wradlib.org>
Maintainer-email: h5netcdf developers <devteam@h5netcdf.org>
License: Copyright (c) 2015, h5netcdf developers
All rights reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:
1. Redistributions of source code must retain the above copyright notice, this
list of conditions and the following disclaimer.
2. Redistributions in binary form must reproduce the above copyright notice,
this list of conditions and the following disclaimer in the documentation
and/or other materials provided with the distribution.
3. Neither the name of the copyright holder nor the names of its contributors
may be used to endorse or promote products derived from this software
without specific prior written permission.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
Project-URL: homepage, https://h5netcdf.org
Project-URL: documentation, https://h5netcdf.org
Project-URL: repository, https://github.com/h5netcdf/h5netcdf
Project-URL: changelog, https://github.com/h5netcdf/h5netcdf/blob/main/CHANGELOG.rst
Classifier: Development Status :: 5 - Production/Stable
Classifier: License :: OSI Approved :: BSD License
Classifier: Operating System :: OS Independent
Classifier: Intended Audience :: Science/Research
Classifier: Programming Language :: Python
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Scientific/Engineering
Requires-Python: >=3.9
Description-Content-Type: text/x-rst
License-File: LICENSE
License-File: AUTHORS.txt
Requires-Dist: h5py
Requires-Dist: packaging
Provides-Extra: test
Requires-Dist: netCDF4; extra == "test"
Requires-Dist: pytest; extra == "test"
Dynamic: license-file
h5netcdf
========
.. image:: https://github.com/h5netcdf/h5netcdf/workflows/CI/badge.svg
:target: https://github.com/h5netcdf/h5netcdf/actions
.. image:: https://badge.fury.io/py/h5netcdf.svg
:target: https://pypi.org/project/h5netcdf/
.. image:: https://github.com/h5netcdf/h5netcdf/actions/workflows/pages/pages-build-deployment/badge.svg?branch=gh-pages
:target: https://h5netcdf.github.io/h5netcdf/
A Python interface for the `netCDF4`_ file-format that reads and writes local or
remote HDF5 files directly via `h5py`_ or `h5pyd`_, without relying on the Unidata
netCDF library.
.. _netCDF4: https://docs.unidata.ucar.edu/netcdf-c/current/file_format_specifications.html#netcdf_4_spec
.. _h5py: https://www.h5py.org/
.. _h5pyd: https://github.com/HDFGroup/h5pyd
.. why-h5netcdf
Why h5netcdf?
-------------
- It has one fewer binary dependency (netCDF C). If you already have h5py
installed, reading netCDF4 with h5netcdf may be much easier than installing
netCDF4-Python.
- We've seen occasional reports of better performance with h5py than
netCDF4-python, though in many cases performance is identical. For
`one workflow`_, h5netcdf was reported to be almost **4x faster** than
`netCDF4-python`_.
- Anecdotally, HDF5 users seem to be unexcited about switching to netCDF --
hopefully this will convince them that netCDF4 is actually quite sane!
- Finally, side-stepping the netCDF C library (and Cython bindings to it)
gives us an easier way to identify the source of performance issues and
bugs in the netCDF libraries/specification.
.. _one workflow: https://github.com/Unidata/netcdf4-python/issues/390#issuecomment-93864839
.. _xarray: https://github.com/pydata/xarray/
Install
-------
Ensure you have a recent version of h5py installed (I recommend using `conda`_ or
the community effort `conda-forge`_).
At least h5py version 3.0 is required. Then::
$ pip install h5netcdf
Or if you are already using conda::
$ conda install h5netcdf
Note:
From version 1.2, h5netcdf tries to align with a `nep29`_-like support policy with regard
to its upstream dependencies.
.. _conda: https://conda.io/
.. _conda-forge: https://conda-forge.org/
.. _nep29: https://numpy.org/neps/nep-0029-deprecation_policy.html
Usage
-----
h5netcdf has two APIs, a new API and a legacy API. Both interfaces currently
reproduce most of the features of the netCDF interface, with the notable
exception of support for operations that rename or delete existing objects.
We simply haven't gotten around to implementing this yet. Patches
would be very welcome.
New API
~~~~~~~
The new API supports direct hierarchical access of variables and groups. Its
design is an adaptation of h5py to the netCDF data model. For example:
.. code-block:: python

    import h5netcdf
    import numpy as np

    with h5netcdf.File("mydata.nc", "w") as f:
        # set dimensions with a dictionary
        f.dimensions = {"x": 5}
        # and update them with a dict-like interface
        # f.dimensions['x'] = 5
        # f.dimensions.update({'x': 5})

        v = f.create_variable("hello", ("x",), float)
        v[:] = np.ones(5)

        # you don't need to create groups first
        # you also don't need to create dimensions first if you supply data
        # with the new variable
        v = f.create_variable("/grouped/data", ("y",), data=np.arange(10))

        # access and modify attributes with a dict-like interface
        v.attrs["foo"] = "bar"

        # you can access variables and groups directly using hierarchical
        # keys, as in h5py
        print(f["/grouped/data"])

        # add an unlimited dimension
        f.dimensions["z"] = None
        # explicitly resize a dimension and all variables using it
        f.resize_dimension("z", 3)

Notes:
- Automatic resizing of unlimited dimensions with array indexing is not available.
- Dimensions need to be manually resized with ``Group.resize_dimension(dimension, size)``.
- Arrays are returned padded with ``fillvalue`` (taken from the underlying HDF5 dataset) up to
  the current size of the variable's dimensions. The behaviour is equivalent to netCDF4-python's
  ``Dataset.set_auto_mask(False)``.
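The notes above can be illustrated with a minimal sketch. The scratch file path and the variable name ``series`` are made up for this illustration, and the ``fillvalue`` keyword of ``create_variable`` is set explicitly only so that the padding value is easy to recognise.

```python
import os
import tempfile

import h5netcdf
import numpy as np

# Hypothetical scratch location, used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "resize_demo.nc")

with h5netcdf.File(path, "w") as f:
    # "t" is unlimited and starts with a current size of 0
    f.dimensions["t"] = None
    v = f.create_variable("series", ("t",), float, fillvalue=-1.0)
    shape_before = v.shape

    # indexing does not grow the dimension, so resize it explicitly
    f.resize_dimension("t", 3)
    shape_after = v.shape

    # write two of the three elements; the third keeps the fill value
    v[:2] = np.array([10.0, 20.0])
    values = v[:]

print(shape_before, shape_after, values)
```

The unwritten element comes back as the fill value rather than as a masked entry, which is the behaviour the last note describes.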
Legacy API
~~~~~~~~~~
The legacy API is designed for compatibility with `netCDF4-python`_. To use it, import
``h5netcdf.legacyapi``:
.. _netCDF4-python: https://github.com/Unidata/netcdf4-python
.. code-block:: python

    import h5netcdf.legacyapi as netCDF4

    # everything here would also work with this instead:
    # import netCDF4
    import numpy as np

    with netCDF4.Dataset("mydata.nc", "w") as ds:
        ds.createDimension("x", 5)
        v = ds.createVariable("hello", float, ("x",))
        v[:] = np.ones(5)

        g = ds.createGroup("grouped")
        g.createDimension("y", 10)
        g.createVariable("data", "i8", ("y",))
        v = g["data"]
        v[:] = np.arange(10)
        v.foo = "bar"
        print(ds.groups["grouped"].variables["data"])

The legacy API is designed to be easy to try out for netCDF4-python users, but it is not an
exact match. Here is an incomplete list of functionality we don't include:
- Utility functions ``chartostring``, ``num2date``, etc., that are not directly necessary
for writing netCDF files.
- h5netcdf variables do not support automatic masking or scaling (e.g., of values matching
the ``_FillValue`` attribute). We prefer to leave this functionality to client libraries
  (e.g., `xarray`_), which can implement their exact desired scaling behavior. Nevertheless,
  arrays are returned padded with ``fillvalue`` (taken from the underlying HDF5 dataset) up to
  the current size of the variable's dimensions. The behaviour is equivalent to netCDF4-python's
  ``Dataset.set_auto_mask(False)``.
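Because masking is left to the caller, a client can apply it by hand. The following is a minimal sketch, assuming a scratch file and a hypothetical variable ``temp`` whose fill value is ``-999.0``; it uses NumPy's masked arrays, which is only one of several ways to do this (libraries such as xarray handle it for you).

```python
import os
import tempfile

import h5netcdf.legacyapi as netCDF4
import numpy as np

# Hypothetical scratch location, used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "mask_demo.nc")

with netCDF4.Dataset(path, "w") as ds:
    ds.createDimension("x", 4)
    v = ds.createVariable("temp", "f8", ("x",), fill_value=-999.0)
    # two real measurements and two missing entries
    v[:] = np.array([1.5, -999.0, 2.5, -999.0])

with netCDF4.Dataset(path, "r") as ds:
    v = ds.variables["temp"]
    raw = v[:]  # plain ndarray, fill values are not masked
    # the attribute may come back as a scalar or a one-element array
    fill = np.asarray(v.attrs["_FillValue"]).item()
    masked = np.ma.masked_where(raw == fill, raw)

print(raw)
print(masked)
```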
.. _invalid netcdf:
Invalid netCDF files
~~~~~~~~~~~~~~~~~~~~
h5py implements some features that do not (yet) result in valid netCDF files:
- Data types:

  - Booleans
  - Reference types

- Arbitrary filters:

  - Scale-offset filters

By default [#]_, h5netcdf will not allow writing files using any of these features,
as files with such features are not readable by other netCDF tools.
However, these are still valid HDF5 files. If you don't care about netCDF
compatibility, you can use these features by setting ``invalid_netcdf=True``
when creating a file:
.. code-block:: python

    import h5netcdf
    import h5netcdf.legacyapi

    # avoid the .nc extension for non-netcdf files
    f = h5netcdf.File("mydata.h5", "w", invalid_netcdf=True)
    ...
    # works with the legacy API, too, though compression options are not exposed
    ds = h5netcdf.legacyapi.Dataset("mydata.h5", "w", invalid_netcdf=True)
    ...

In such cases the ``_NCProperties`` attribute will not be saved to the file, or it will be
removed from an existing file. A warning will be issued if the file has a ``.nc`` extension.
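As a minimal sketch of the difference, the snippet below tries to store boolean data twice: once with the default settings and once with ``invalid_netcdf=True``. The file names and the variable name ``flags`` are hypothetical.

```python
import os
import tempfile

import h5netcdf
import numpy as np

# Hypothetical scratch directory, used only for this illustration.
tmpdir = tempfile.mkdtemp()
flags = np.array([True, False, True])

# Default: boolean data is rejected so the file stays valid netCDF4.
rejected = False
with h5netcdf.File(os.path.join(tmpdir, "strict.nc"), "w") as f:
    f.dimensions = {"x": 3}
    try:
        f.create_variable("flags", ("x",), data=flags)
    except h5netcdf.CompatibilityError:
        rejected = True

# Opt in: the result is plain HDF5, hence the .h5 extension.
relaxed_path = os.path.join(tmpdir, "relaxed.h5")
with h5netcdf.File(relaxed_path, "w", invalid_netcdf=True) as f:
    f.dimensions = {"x": 3}
    v = f.create_variable("flags", ("x",), data=flags)
    stored = v[:].tolist()

print(rejected, stored)
```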
.. rubric:: Footnotes
.. [#] h5netcdf will raise ``h5netcdf.CompatibilityError``.
Decoding variable length strings
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
h5py 3.0 introduced `new behavior`_ for handling variable-length strings.
Instead of being automatically decoded with UTF-8 into NumPy arrays of ``str``,
they are returned as arrays of ``bytes``.
The legacy API preserves the old behavior of h5py (which matches netCDF4),
and automatically decodes strings.
The new API matches h5py behavior. Explicitly set ``decode_vlen_strings=True``
in the ``h5netcdf.File`` constructor to opt in to automatic decoding.
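A minimal sketch of the two read modes with the new API follows. The scratch file and the variable name ``names`` are hypothetical, and ``h5py.string_dtype()`` is used here as one way to request a variable-length string type.

```python
import os
import tempfile

import h5netcdf
import h5py

# Hypothetical scratch location, used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "strings_demo.nc")

with h5netcdf.File(path, "w") as f:
    f.dimensions = {"x": 2}
    v = f.create_variable("names", ("x",), dtype=h5py.string_dtype())
    v[0] = "alpha"
    v[1] = "beta"

# Default for the new API: elements come back as bytes.
with h5netcdf.File(path, "r") as f:
    raw = f["names"][:]

# Opt in to decoding: elements come back as str.
with h5netcdf.File(path, "r", decode_vlen_strings=True) as f:
    decoded = f["names"][:]

print(raw, decoded)
```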
.. _new behavior: https://docs.h5py.org/en/stable/strings.html
.. _phony dims:
Datasets with missing dimension scales
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
By default [#]_ h5netcdf raises a ``ValueError`` if variables with no dimension
scale associated with one of their axes are accessed.
You can set ``phony_dims='sort'`` when opening a file to let h5netcdf invent
phony dimensions according to `netCDF`_ behaviour.
.. code-block:: python

    # mimic netCDF-behaviour for non-netcdf files
    f = h5netcdf.File("mydata.h5", mode="r", phony_dims="sort")
    ...

Note that this iterates once over the whole group hierarchy, which affects
performance if you rely on lazy group access.
You can set ``phony_dims='access'`` instead to defer phony dimension creation
to group access time. The created phony dimension naming will differ from
`netCDF`_ behaviour.
.. code-block:: python

    f = h5netcdf.File("mydata.h5", mode="r", phony_dims="access")
    ...

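To see the behaviour end to end, the following sketch writes a plain HDF5 file with h5py (so no dimension scales are attached) and then opens it with h5netcdf, first with the default setting and then with ``phony_dims="sort"``. The file name and the dataset name ``data`` are hypothetical.

```python
import os
import tempfile

import h5netcdf
import h5py
import numpy as np

# Hypothetical scratch location, used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "plain.h5")

# Plain HDF5 written with h5py: the dataset has no dimension scales.
with h5py.File(path, "w") as h5:
    h5.create_dataset("data", data=np.zeros((2, 3)))

# Default (phony_dims=None): asking for the dimensions raises ValueError.
failed = False
with h5netcdf.File(path, "r") as f:
    try:
        f["data"].dimensions
    except ValueError:
        failed = True

# phony_dims="sort": dimension names are invented when the file is opened.
with h5netcdf.File(path, "r", phony_dims="sort") as f:
    dims = f["data"].dimensions
    shape = f["data"].shape

print(failed, dims, shape)
```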
.. rubric:: Footnotes
.. [#] Keyword default setting ``phony_dims=None`` for backwards compatibility.
.. _netCDF: https://docs.unidata.ucar.edu/netcdf-c/current/interoperability_hdf5.html
Track Order
~~~~~~~~~~~
As of h5netcdf 1.1.0, if h5py 3.7.0 or greater is detected, the ``track_order``
parameter is set to ``True`` enabling `order tracking`_ for newly created
netCDF4 files. This helps ensure that files created with the h5netcdf library
can be modified by the netCDF4-c and netCDF4-python implementations used in
other software stacks. Since this change should be transparent to most users,
it was made without deprecation.
Since ``track_order`` is set at creation time, any dataset that was created with
``track_order=False`` (h5netcdf version 1.0.2 and older, except for 0.13.0) will
continue to be opened with order tracking disabled.
The following describes the behavior of h5netcdf with respect to order tracking
for a few key versions:
- In version 0.12.0 and earlier, the ``track_order`` parameter was missing
  and thus order tracking was implicitly set to ``False``.
- Version 0.13.0 enabled order tracking by setting the parameter
``track_order`` to ``True`` by default without deprecation.
- Versions 0.13.1 to 1.0.2 set ``track_order`` to ``False`` due to an
  `upstream bug`_ in h5py, a core dependency of h5netcdf, which was resolved in
  h5py 3.7.0 with the help of the h5netcdf team.
- In version 1.1.0, if h5py 3.7.0 or above is detected, the ``track_order``
parameter is set to ``True`` by default.
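The version rules above can be condensed into a small lookup. The helper ``default_track_order`` below is hypothetical (it is not part of h5netcdf) and only restates the list; it also assumes that h5netcdf 1.1.0 and later fall back to ``False`` when h5py is older than 3.7.0, which the list implies but does not state.

```python
def _parse(version):
    """Turn a plain 'X.Y.Z' string into a tuple of ints for comparison."""
    return tuple(int(part) for part in version.split("."))


def default_track_order(h5netcdf_version, h5py_version):
    """Return the default order-tracking setting described in the list above.

    Hypothetical helper for illustration; not an h5netcdf function.
    """
    h5netcdf_v = _parse(h5netcdf_version)
    h5py_v = _parse(h5py_version)

    if h5netcdf_v >= (1, 1, 0):
        # enabled only when the h5py fix is available (assumed fallback: False)
        return h5py_v >= (3, 7, 0)
    if h5netcdf_v == (0, 13, 0):
        # the single early release that enabled order tracking
        return True
    # 0.12.0 and earlier, and 0.13.1 to 1.0.2
    return False


for versions in [("0.12.0", "3.6.0"), ("0.13.0", "3.6.0"),
                 ("1.0.2", "3.7.0"), ("1.6.4", "3.7.0")]:
    print(versions, default_track_order(*versions))
```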
.. _order tracking: https://docs.unidata.ucar.edu/netcdf-c/current/file_format_specifications.html#creation_order
.. _upstream bug: https://github.com/h5netcdf/h5netcdf/issues/136
.. _[*]: https://github.com/h5netcdf/h5netcdf/issues/128
.. changelog
Changelog
---------
`Changelog`_
.. _Changelog: https://github.com/h5netcdf/h5netcdf/blob/main/CHANGELOG.rst
.. license
License
-------
`3-clause BSD`_
.. _3-clause BSD: https://github.com/h5netcdf/h5netcdf/blob/main/LICENSE