.. _data-files:
Shipping data files for `nipy`
===============================
When developing or using nipy, many data files can be useful. We divide
the data files nipy uses into at least three categories:
#. *test data* - data files required for routine code testing
#. *template data* - data files required for algorithms to function,
such as templates or atlases
#. *example data* - data files for running examples, or optional tests
The files used for testing are very small. They are shipped with nipy,
live in the code repository, and are available under the module path
``nipy.testing.data``.
.. now a comment .. automodule:: nipy.testing
*template data* and *example data* are examples of *data packages*. What
follows is a discussion of the design and use of data packages.
Use cases for data packages
+++++++++++++++++++++++++++
Using the data package
``````````````````````
The programmer will want to use the data something like this:
.. testcode::
from nipy.utils import make_datasource
templates = make_datasource('nipy', 'templates')
fname = templates.get_filename('ICBM152', '2mm', 'T1.nii.gz')
where ``fname`` will be the absolute path to the template image
``ICBM152/2mm/T1.nii.gz``.
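
As a rough illustration of the interface above, a datasource can be thought
of as a thin wrapper around a base directory. The class below is a
hypothetical sketch, not the ``nipy.utils`` implementation; the name
``SketchDatasource`` and its constructor argument are assumptions made for
this example.

.. code-block:: python

    import os


    class SketchDatasource(object):
        """Hypothetical stand-in for the object returned by make_datasource."""

        def __init__(self, base_path):
            # Directory holding the installed data package,
            # for example <ROOT>/nipy/templates
            self.base_path = os.path.abspath(base_path)

        def get_filename(self, *path_parts):
            # Join the relative path components onto the base directory
            return os.path.join(self.base_path, *path_parts)

With this sketch, ``SketchDatasource(root).get_filename('ICBM152', '2mm',
'T1.nii.gz')`` returns an absolute path below ``root``, whether or not the
file exists.
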
The programmer can insist on a particular version of a ``datasource``:
.. testcode::

    if templates.version < '0.4':
        raise ValueError('Need datasource version at least 0.4')

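
The comparison above uses plain strings, which Python orders
lexicographically, so a version such as ``'0.10'`` would sort before
``'0.4'``. A minimal sketch of a numeric comparison follows. The helper
``version_tuple`` is hypothetical and assumes purely numeric, dot-separated
versions such as ``0.4`` or ``1.2.3``.

.. code-block:: python

    def version_tuple(version):
        """Convert a version string such as '0.10' to a tuple of ints."""
        return tuple(int(part) for part in str(version).split('.'))


    # Plain string comparison orders '0.10' before '0.4'
    assert '0.10' < '0.4'
    # Comparing integer tuples gives the intended ordering
    assert version_tuple('0.10') > version_tuple('0.4')

A check written as ``version_tuple(templates.version) < version_tuple('0.4')``
would then behave as intended for two-digit components.
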
If ``make_datasource`` cannot find the data, then:
>>> make_datasource('nipy', 'implausible')
Traceback
...
nipy.utils.DataError
where ``DataError`` gives a helpful warning about why the data was not
found, and how it should be installed.
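
One way to make the error helpful is to build its message from the paths
that were searched. The following is a sketch under assumptions: the
``DataError`` class here is a local stand-in rather than
``nipy.utils.DataError``, and ``data_error_message`` is a hypothetical
helper.

.. code-block:: python

    class DataError(Exception):
        """Local stand-in for the error raised when data cannot be found."""


    def data_error_message(relative_path, searched_paths):
        # List every directory that was searched, so the user can see
        # where the package was expected
        searched = '\n'.join('  ' + path for path in searched_paths)
        return ("Could not find data directory '%s' in any of:\n%s\n"
                "Install the data package, or add its location to the "
                "NIPY_DATA_PATH environment variable."
                % (relative_path, searched))
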
Warnings during installation
````````````````````````````
The example data and template data may be important, and it would be
useful to warn the user if NIPY cannot find either of the two sets of
data when installing the package. Thus::
python setup.py install
will import nipy after installation to check whether these raise an error:
>>> from nipy.utils import make_datasource
>>> template = make_datasource('nipy', 'templates')
>>> example_data = make_datasource('nipy', 'data')
and warn the user accordingly, with some basic instructions for how to
install the data.
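
The post-install check described above might look roughly like the sketch
below. The function ``check_data_packages`` is hypothetical. It takes
``make_datasource`` as an argument so that the example does not depend on a
particular nipy version, and it catches any exception rather than
``DataError`` specifically.

.. code-block:: python

    import warnings


    def check_data_packages(make_datasource, names=('templates', 'data')):
        """Return the names of nipy data packages that could not be loaded."""
        missing = []
        for name in names:
            try:
                make_datasource('nipy', name)
            except Exception:  # the text above expects a DataError here
                missing.append(name)
                warnings.warn("Could not find nipy data package '%s'; "
                              "see the data installation instructions" % name)
        return missing
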
.. _find-data:
Finding the data
````````````````
The routine ``make_datasource`` will need to be able to find the data
that has been installed. For the following call:
>>> templates = make_datasource('nipy', 'templates')
We propose to:
#. Get a list of paths where data is known to be stored with
``nipy.data.get_data_path()``
#. For each of these paths, search for directory ``nipy/templates``. If
found, and of the correct format (see below), return a datasource,
otherwise raise an Exception
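
The two search steps above can be sketched as follows. The function
``find_data_dir`` is a hypothetical helper, not part of nipy, and the check
for the correct package format is left out.

.. code-block:: python

    import os


    def find_data_dir(root_paths, *names):
        """Return the first existing ``<root>/<names...>`` directory.

        Raises IOError if none of the root paths contains the directory.
        """
        relative = os.path.join(*names)
        for root in root_paths:
            candidate = os.path.join(root, relative)
            if os.path.isdir(candidate):
                return candidate
        raise IOError("Could not find '%s' in %s"
                      % (relative, list(root_paths)))

Because the roots are tried in order, a directory found under an earlier
path takes precedence over one found under a later path.
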
The paths collected by ``nipy.data.get_data_path()`` will be
constructed from ':' (Unix) or ';' (Windows) separated strings. The
sources of the strings (in the order in which they will be used in the
search above) are:
#. The value of the ``NIPY_DATA_PATH`` environment variable, if set
#. A section = ``DATA``, parameter = ``path`` entry in a
``config.ini`` file in ``nipy_dir`` where ``nipy_dir`` is
``$HOME/.nipy`` or equivalent.
#. Section = ``DATA``, parameter = ``path`` entries in configuration
``.ini`` files, where the ``.ini`` files are found by
``glob.glob(os.path.join(etc_dir, '*.ini'))`` and ``etc_dir`` is
``/etc/nipy`` on Unix, and some suitable equivalent on Windows.
#. The result of ``os.path.join(sys.prefix, 'share', 'nipy')``
#. If ``sys.prefix`` is ``/usr``, we add ``/usr/local/share/nipy``. We
need this because Python 2.6 in Debian / Ubuntu does default installs
to ``/usr/local``.
#. The result of ``get_nipy_user_dir()``
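
The ordering above can be sketched in code. This is an illustration under
assumptions, not the nipy implementation: ``sketch_data_path`` and
``paths_from_ini`` are hypothetical names, the user directory and ``etc_dir``
are passed in as arguments, and the ``nipy_dir`` of item 2 is assumed to be
the same directory that ``get_nipy_user_dir()`` returns.

.. code-block:: python

    import glob
    import os
    import sys
    from configparser import ConfigParser


    def paths_from_ini(ini_files):
        """Collect [DATA] path entries from the given ini files."""
        paths = []
        for fname in ini_files:
            parser = ConfigParser()
            parser.read(fname)  # missing files are silently skipped
            if parser.has_option('DATA', 'path'):
                paths.extend(parser.get('DATA', 'path').split(os.pathsep))
        return paths


    def sketch_data_path(environ, user_dir, etc_dir):
        """Assemble the search paths in the order listed above."""
        paths = []
        env_value = environ.get('NIPY_DATA_PATH')
        if env_value:
            paths.extend(env_value.split(os.pathsep))
        paths.extend(paths_from_ini([os.path.join(user_dir, 'config.ini')]))
        paths.extend(paths_from_ini(
            sorted(glob.glob(os.path.join(etc_dir, '*.ini')))))
        paths.append(os.path.join(sys.prefix, 'share', 'nipy'))
        if sys.prefix == '/usr':
            paths.append('/usr/local/share/nipy')
        paths.append(user_dir)
        return paths

A call such as ``sketch_data_path(os.environ, user_dir, '/etc/nipy')``
returns the candidate roots with the environment variable entries first.
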
Requirements for a data package
```````````````````````````````
To be a valid NIPY project data package, you need to satisfy:
#. The installer installs the data in some place that can be found using
the method defined in :ref:`find-data`.
We recommend that:
#. By default, you install data in a standard location such as
``<prefix>/share/nipy`` where ``<prefix>`` is the standard Python
prefix obtained by ``>>> import sys; print(sys.prefix)``
Remember that there is a distinction between the NIPY project - the
umbrella of neuroimaging in python - and the NIPY package - the main
code package in the NIPY project. Thus, if you want to install data
under the NIPY *package* umbrella, your data might go to
``/usr/share/nipy/nipy/packagename`` (on Unix). Note ``nipy`` twice -
once for the project, once for the package. If you want to install data
under, say, the ``pbrain`` package umbrella, that would go in
``/usr/share/nipy/pbrain/packagename``.
Data package format
```````````````````
The following tree is an example of the kind of pattern we would expect
in a data directory, where the ``nipy-data`` and ``nipy-templates``
packages have been installed::

    <ROOT>
    `-- nipy
        |-- data
        |   |-- config.ini
        |   `-- placeholder.txt
        `-- templates
            |-- ICBM152
            |   `-- 2mm
            |       `-- T1.nii.gz
            |-- colin27
            |   `-- 2mm
            |       `-- T1.nii.gz
            `-- config.ini

The ``<ROOT>`` directory is the directory that will appear somewhere in
the list from ``nipy.data.get_data_path()``. The ``nipy`` subdirectory
signifies data for the ``nipy`` package (as opposed to other
NIPY-related packages such as ``pbrain``). The ``data`` subdirectory of
``nipy`` contains files from the ``nipy-data`` package. In the
``nipy/data`` or ``nipy/templates`` directories, there is a
``config.ini`` file, that has at least an entry like this::
[DEFAULT]
version = 0.1
giving the version of the data package.
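
Reading the version back from such a file can be done with the standard
library ``configparser`` module. The helper name ``read_package_version`` is
an assumption made for this sketch.

.. code-block:: python

    from configparser import ConfigParser


    def read_package_version(config_fname):
        """Return the ``version`` entry from the [DEFAULT] section."""
        parser = ConfigParser()
        with open(config_fname) as fobj:
            parser.read_file(fobj)
        # Entries in [DEFAULT] are exposed through ``defaults()``
        return parser.defaults()['version']

The value is returned as a string, for example ``'0.1'`` for the file shown
above.
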
.. _install-data-pkgs:
Installing the data
```````````````````
We will use python distutils to install data packages, and the
``data_files`` mechanism to install the data. On Unix, with the
following command::
python setup.py install --prefix=/my/prefix
data will go to::
/my/prefix/share/nipy
For the example above this will result in these subdirectories::
/my/prefix/share/nipy/nipy/data
/my/prefix/share/nipy/nipy/templates
because ``nipy`` is both the project, and the package to which the data
relates.
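
The ``data_files`` argument takes a list of ``(target_directory,
[source_files])`` pairs, and relative target directories are interpreted
relative to the installation prefix. A sketch of how such a list might be
built for a data tree follows. The helper ``collect_data_files`` is
hypothetical and is not part of the NIPY data packaging machinery.

.. code-block:: python

    import os


    def collect_data_files(source_dir, target_root):
        """Build a ``data_files`` list mirroring ``source_dir``."""
        data_files = []
        for dirpath, dirnames, filenames in os.walk(source_dir):
            if not filenames:
                continue  # directories without files need no entry
            # Path of this directory relative to the top of the data tree
            rel_dir = os.path.relpath(dirpath, source_dir)
            target = os.path.normpath(os.path.join(target_root, rel_dir))
            sources = [os.path.join(dirpath, fname)
                       for fname in sorted(filenames)]
            data_files.append((target, sources))
        return sorted(data_files)

In a ``setup.py`` for the templates package, the result of
``collect_data_files('templates', os.path.join('share', 'nipy', 'nipy',
'templates'))`` could be passed as ``data_files``.
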
If you install to a particular location, you will need to add that
location to the output of ``nipy.data.get_data_path()`` using one of the mechanisms above, for example, in your system configuration::
export NIPY_DATA_PATH=/my/prefix/share/nipy
Packaging for distributions
```````````````````````````
For a particular data package - say ``nipy-templates`` - distributions
will want to:
#. Install the data in a set location. The default from
``python setup.py install`` for the data packages will be
``/usr/share/nipy`` on Unix.
#. Point a system installation of NIPY to these data.
For the latter, the most obvious route is to copy an ``.ini`` file named
for the data package into the NIPY ``etc_dir``. In this case, on Unix,
we will want a file called ``/etc/nipy/nipy_templates.ini`` with
contents::
[DATA]
path = /usr/share/nipy
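
A packager could also generate this file from a script. The following sketch
uses the standard library ``configparser`` module; the function name
``write_data_ini`` is an assumption made for this example.

.. code-block:: python

    from configparser import ConfigParser


    def write_data_ini(ini_fname, data_path):
        """Write an ini file with a [DATA] section pointing at data_path."""
        parser = ConfigParser()
        parser['DATA'] = {'path': data_path}
        with open(ini_fname, 'w') as fobj:
            parser.write(fobj)
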
Current implementation
``````````````````````
This section describes how we (the NIPY package) implement data packages
at the moment.
The data in the data packages will not be under source control.
The data packages will be available at a central release location. For
now this will be: http://nipy.sourceforge.net/data-packages/ .
A package, such as ``nipy-templates-0.1.tar.gz`` will have the following
contents::

    <ROOT>
      |-- setup.py
      |-- README.txt
      |-- MANIFEST.in
      `-- templates
          |-- ICBM152
          |   `-- 2mm
          |       `-- T1.nii.gz
          |-- colin27
          |   `-- 2mm
          |       `-- T1.nii.gz
          `-- config.ini

There should be only one ``nipy/packagename`` directory delivered by a
particular package. For example, this package installs
``nipy/templates``, but does not contain ``nipy/data``.
Making a new package tarball is simply:
#. Downloading and unpacking e.g. ``nipy-templates-0.1.tar.gz`` to form
the directory structure above.
#. Making any changes to the directory
#. Running ``setup.py sdist`` to recreate the package.
The process of making a release should be:
#. Increment the major or minor version number in the ``config.ini`` file
#. Make a package tarball as above
#. Upload to distribution site
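
The version increment in the first step can be sketched as a small helper.
The name ``bump_version`` is hypothetical. The sketch assumes versions of the
form ``major.minor`` and, as a further assumption, resets the minor number to
zero when the major number is incremented.

.. code-block:: python

    def bump_version(version, part='minor'):
        """Return ``version`` ('major.minor') with one part incremented."""
        major, minor = (int(piece) for piece in version.split('.'))
        if part == 'major':
            return '%d.0' % (major + 1)
        if part == 'minor':
            return '%d.%d' % (major, minor + 1)
        raise ValueError("part must be 'major' or 'minor'")
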
There is an example nipy data package ``nipy-examplepkg`` in the
``examples`` directory of the NIPY repository.
The machinery for creating and maintaining data packages is available with::
svn co https://nipy.svn.sourceforge.net/svnroot/nipy/data-packaging/trunk
See the ``README.txt`` file there for more information.