1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121
|
.. _refs:
Object and Region References
============================
In addition to soft and external links, HDF5 supplies one more mechanism to
refer to objects and data in a file. HDF5 *references* are low-level pointers
to other objects. The great advantage of references is that they can be
stored and retrieved as data; you can create an attribute or an entire dataset
of reference type.
References come in two flavors, object references and region references.
As the name suggests, object references point to a particular object in a file,
either a dataset, group or named datatype. Region references always point to
a dataset, and additionally contain information about a certain selection
(*dataset region*) on that dataset. For example, if you have a dataset
representing an image, you could specify a region of interest, and store it
as an attribute on the dataset.
.. _refs_object:
Using object references
-----------------------
It's trivial to create a new object reference; every high-level object
in h5py has a read-only property "ref", which when accessed returns a new
object reference:
>>> myfile = h5py.File('myfile.hdf5')
>>> mygroup = myfile['/some/group']
>>> ref = mygroup.ref
>>> print(ref)
<HDF5 object reference>
"Dereferencing" these objects is straightforward; use the same syntax as when
opening any other object:
>>> mygroup2 = myfile[ref]
>>> print(mygroup2)
<HDF5 group "/some/group" (0 members)>
.. _refs_region:
Using region references
-----------------------
Region references always contain a selection. You create them using the
dataset property "regionref" and standard NumPy slicing syntax:
>>> myds = myfile.create_dataset('dset', shape=(200,200))
>>> regref = myds.regionref[0:10, 0:5]
>>> print(regref)
<HDF5 region reference>
The reference itself can now be used in place of slicing arguments to the
dataset:
>>> subset = myds[regref]
For selections which don't conform to a regular grid, h5py copies the behavior
of NumPy's fancy indexing, which returns a 1D array. Note that for h5py release
before 2.2, h5py always returns a 1D array.
In addition to storing a selection, region references inherit from object
references, and can be used anywhere an object reference is accepted. In this
case the object they point to is the dataset used to create them.
Storing references in a dataset
-------------------------------
HDF5 treats object and region references as data. Consequently, there is a
special HDF5 type to represent them. However, NumPy has no equivalent type.
Rather than implement a special "reference type" for NumPy, references are
handled at the Python layer as plain, ordinary python objects. To NumPy they
are represented with the "object" dtype (kind 'O'). A small amount of
metadata attached to the dtype tells h5py to interpret the data as containing
reference objects.
These dtypes are available from h5py for references and region references:
* ``h5py.ref_dtype`` - for object references
* ``h5py.regionref_dtype`` - for region references
To store an array of references, use the appropriate dtype when initializing the
dataset:
>>> ref_dataset = myfile.create_dataset("MyRefs", shape=(100,), dtype=h5py.ref_dtype)
You can write the reference to the dataset and retrieve the reference as you would any other dataset:
>>> ref_dataset[0] = myfile.ref
>>> print(ref_dataset[0])
<HDF5 object reference>
You can also create a dataset containing references in a single line.
Here is an example with the region reference ``regref`` defined above:
>>> regref_dataset = myfile.create_dataset("MyRegRefs", data=regref)
Storing references in an attribute
----------------------------------
Simply assign the reference to a name; h5py will figure it out and store it
with the correct type:
>>> myref = myfile.ref
>>> myfile.attrs["Root group reference"] = myref
Null references
---------------
When you create a dataset of reference type, the uninitialized elements are
"null" references. H5py uses the truth value of a reference object to
indicate whether or not it is null:
>>> print(bool(myfile.ref))
True
>>> nullref = ref_dataset[50]
>>> print(bool(nullref))
False
|