.. _data-files:

Shipping data files for `nipy`
===============================

When developing or using nipy, many data files can be useful. We divide
the data files nipy uses into at least three categories:

#. *test data* - data files required for routine code testing
#. *template data* - data files required for algorithms to function,
   such as templates or atlases
#. *example data* - data files for running examples, or optional tests

The files used for testing are very small. They ship with nipy, live in
the code repository, and are found under the module path
``nipy.testing.data``.

.. now a comment .. automodule:: nipy.testing

*template data* and *example data* are examples of *data packages*.  What
follows is a discussion of the design and use of data packages.

Use cases for data packages
+++++++++++++++++++++++++++

Using the data package
``````````````````````

The programmer will want to use the data something like this:

.. testcode::

   from nipy.utils import make_datasource

   templates = make_datasource('nipy', 'templates')
   fname = templates.get_filename('ICBM152', '2mm', 'T1.nii.gz')
   
where ``fname`` will be the absolute path to the template image
``ICBM152/2mm/T1.nii.gz``. 
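
As a minimal sketch of the behaviour described above, the class below joins
path components onto a package root to give an absolute filename.  The
``Datasource`` class, its constructor argument and the relative root used
here are assumptions for illustration, not the nipy implementation.

```python
import os


class Datasource:
    """Hypothetical stand-in for the object returned by make_datasource."""

    def __init__(self, base_path):
        # Store the package root as an absolute path
        self.base_path = os.path.abspath(base_path)

    def get_filename(self, *path_parts):
        # Join the parts onto the package root to give an absolute path
        return os.path.join(self.base_path, *path_parts)


# The root directory here is made up for the example
templates = Datasource(os.path.join('share', 'nipy', 'nipy', 'templates'))
fname = templates.get_filename('ICBM152', '2mm', 'T1.nii.gz')
```

The sketch does not check that the file exists; a real datasource would
presumably do so.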

The programmer can insist on a particular version of a ``datasource``:

.. testcode::

   if templates.version < '0.4':
      raise ValueError('Need datasource version at least 0.4')
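
Note that comparing version strings directly orders them character by
character, so ``'0.10'`` sorts before ``'0.4'``.  One possible way round
this, assuming versions are dotted integers such as ``0.4``, is to compare
tuples of integers.  The helpers ``version_tuple`` and ``require_version``
are hypothetical names used only for this sketch.

```python
def version_tuple(version):
    """Convert a dotted version string such as '0.10' to a tuple of ints."""
    return tuple(int(part) for part in version.split('.'))


def require_version(found, minimum):
    """Raise ValueError if version `found` is older than `minimum`."""
    if version_tuple(found) < version_tuple(minimum):
        raise ValueError('Need datasource version at least %s' % minimum)
```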

If ``make_datasource`` cannot find the data, then:

>>> make_datasource('nipy', 'implausible')
Traceback (most recent call last):
    ...
nipy.utils.DataError

where ``DataError`` gives a helpful warning about why the data was not
found, and how it should be installed.  

Warnings during installation
````````````````````````````

The example data and template data may be important, and it would be
useful to warn the user if NIPY cannot find either of the two sets of
data when installing the package.  Thus::

   python setup.py install

will import nipy after installation to check whether these raise an error:

>>> from nipy.utils import make_datasource
>>> template = make_datasource('nipy', 'templates')
>>> example_data = make_datasource('nipy', 'data')

and warn the user accordingly, with some basic instructions for how to
install the data.
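
A rough sketch of such a post-install check follows.  It assumes that a
missing package raises ``DataError``; the ``DataError`` class defined here
is a stand-in, and ``warn_if_missing`` and its arguments are hypothetical
names.  The function that creates datasources is passed in so that the
sketch does not depend on nipy being importable.

```python
import warnings


class DataError(Exception):
    """Hypothetical stand-in for nipy.utils.DataError."""


def warn_if_missing(make_datasource, package_names):
    """Warn about each nipy data package that cannot be found.

    Returns the list of package names that were missing.
    """
    missing = []
    for name in package_names:
        try:
            make_datasource('nipy', name)
        except DataError as err:
            # Pass on the error text, which should say how to install
            warnings.warn('Could not find nipy %s package: %s' % (name, err))
            missing.append(name)
    return missing
```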

.. _find-data:

Finding the data
````````````````

The routine ``make_datasource`` will need to be able to find the data
that has been installed.  For the following call:

>>> templates = make_datasource('nipy', 'templates')

We propose to:

#. Get a list of paths where data is known to be stored with
   ``nipy.data.get_data_path()``
#. For each of these paths in turn, search for the directory
   ``nipy/templates``.  Return a datasource for the first one that is
   found and has the correct format (see below); if no path has one,
   raise an exception.
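
The two steps above could be sketched roughly as follows.  This is an
illustration under stated assumptions, not the nipy code: "correct format"
is taken to mean only that the directory contains a ``config.ini`` file,
``DataError`` is a stand-in class, and ``find_package_dir`` is a
hypothetical name.

```python
import os


class DataError(Exception):
    """Hypothetical stand-in for nipy.utils.DataError."""


def find_package_dir(data_paths, *names):
    """Return the first ``<path>/<names...>`` directory with a config.ini."""
    relative = os.path.join(*names)
    for path in data_paths:
        candidate = os.path.join(path, relative)
        # Assumed format check: the package directory holds a config.ini
        if os.path.isfile(os.path.join(candidate, 'config.ini')):
            return candidate
    raise DataError(
        'Could not find %r in any of: %s. Install the data package, or add '
        'its location to NIPY_DATA_PATH.'
        % (relative, os.pathsep.join(data_paths)))
```

Earlier entries in ``data_paths`` take priority, so a copy of the data in a
user directory can override a system-wide copy if it comes first.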

The paths collected by ``nipy.data.get_data_path()`` will be
constructed from ':' (Unix) or ';' (Windows) separated strings.  The
sources of the strings (in the order in which they will be used in the
search above) are:

#. The value of the ``NIPY_DATA_PATH`` environment variable, if set
#. A section = ``DATA``, parameter = ``path`` entry in a
   ``config.ini`` file in ``nipy_dir`` where ``nipy_dir`` is
   ``$HOME/.nipy`` or equivalent.
#. Section = ``DATA``, parameter = ``path`` entries in configuration
   ``.ini`` files, where the ``.ini`` files are found by
   ``glob.glob(os.path.join(etc_dir, '*.ini'))`` and ``etc_dir`` is
   ``/etc/nipy`` on Unix, and some suitable equivalent on Windows.
#. The result of ``os.path.join(sys.prefix, 'share', 'nipy')``
#. If ``sys.prefix`` is ``/usr``, we add ``/usr/local/share/nipy``. We
   need this because Python 2.6 in Debian / Ubuntu does default installs
   to ``/usr/local``.
#. The result of ``get_nipy_user_dir()``
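
The search order above might be assembled along these lines.  This is a
sketch only: the function signature, the helper ``paths_from_ini``, and the
assumption that ``get_nipy_user_dir()`` returns the same directory as
``nipy_dir`` are all illustrative.  Strings are split with ``os.pathsep``,
which is ':' on Unix and ';' on Windows.

```python
import glob
import os
import sys
from configparser import RawConfigParser


def paths_from_ini(ini_files):
    """Collect [DATA] path entries from ini files, split on os.pathsep."""
    paths = []
    for ini_file in ini_files:
        parser = RawConfigParser()
        parser.read(ini_file)  # silently skips files that do not exist
        if parser.has_option('DATA', 'path'):
            paths.extend(parser.get('DATA', 'path').split(os.pathsep))
    return paths


def get_data_path(environ, user_dir, etc_dir):
    """Return candidate data directories in the order listed above."""
    paths = []
    env_value = environ.get('NIPY_DATA_PATH')
    if env_value:
        paths.extend(env_value.split(os.pathsep))
    paths.extend(paths_from_ini([os.path.join(user_dir, 'config.ini')]))
    # glob order is arbitrary, so sort to keep the result reproducible
    ini_files = sorted(glob.glob(os.path.join(etc_dir, '*.ini')))
    paths.extend(paths_from_ini(ini_files))
    paths.append(os.path.join(sys.prefix, 'share', 'nipy'))
    if sys.prefix == '/usr':
        paths.append('/usr/local/share/nipy')
    # Assumption: the user directory is also the last place searched
    paths.append(user_dir)
    return paths
```

A call might look like
``get_data_path(os.environ, os.path.expanduser('~/.nipy'), '/etc/nipy')``.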

Requirements for a data package
```````````````````````````````

To be a valid NIPY project data package, you need to satisfy:

#. The installer installs the data in some place that can be found using
   the method defined in :ref:`find-data`.

We recommend that:

#. By default, you install data in a standard location such as
   ``<prefix>/share/nipy`` where ``<prefix>`` is the standard Python
   prefix obtained by ``import sys; print(sys.prefix)``

Remember that there is a distinction between the NIPY project - the
umbrella of neuroimaging in python - and the NIPY package - the main
code package in the NIPY project.  Thus, if you want to install data
under the NIPY *package* umbrella, your data might go to
``/usr/share/nipy/nipy/packagename`` (on Unix).  Note ``nipy`` twice -
once for the project, once for the package.  If you want to install data
under - say - the ``pbrain`` package umbrella, that would go in
``/usr/share/nipy/pbrain/packagename``.

Data package format
```````````````````

The following tree is an example of the kind of pattern we would expect
in a data directory, where the ``nipy-data`` and ``nipy-templates``
packages have been installed::

  <ROOT> 
  `-- nipy
      |-- data
      |   |-- config.ini
      |   `-- placeholder.txt
      `-- templates
          |-- ICBM152
          |   `-- 2mm
          |       `-- T1.nii.gz
          |-- colin27
          |   `-- 2mm
          |       `-- T1.nii.gz
          `-- config.ini

The ``<ROOT>`` directory is the directory that will appear somewhere in
the list from ``nipy.data.get_data_path()``.  The ``nipy`` subdirectory
signifies data for the ``nipy`` package (as opposed to other
NIPY-related packages such as ``pbrain``).  The ``data`` subdirectory of
``nipy`` contains files from the ``nipy-data`` package.  In the
``nipy/data`` or ``nipy/templates`` directories, there is a
``config.ini`` file, that has at least an entry like this::

  [DEFAULT]
  version = 0.1

giving the version of the data package.
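
For illustration, the version entry can be read with the standard library
``configparser`` module.  The function name ``read_version`` is
hypothetical, and the sketch parses text rather than a file so that it
stands alone.

```python
from configparser import RawConfigParser


def read_version(config_text):
    """Return the version string from the [DEFAULT] section."""
    parser = RawConfigParser()
    parser.read_string(config_text)
    # Values in the DEFAULT section are exposed through defaults()
    return parser.defaults()['version']


version = read_version('[DEFAULT]\nversion = 0.1\n')
```

The value comes back as a string, not a number, so it needs parsing before
any numeric comparison.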

.. _install-data-pkgs:

Installing the data
```````````````````

We will use python distutils to install data packages, and the
``data_files`` mechanism to install the data.  On Unix, with the
following command::

   python setup.py install --prefix=/my/prefix

data will go to::

   /my/prefix/share/nipy

For the example above this will result in these subdirectories::

   /my/prefix/share/nipy/nipy/data
   /my/prefix/share/nipy/nipy/templates

because ``nipy`` is both the project, and the package to which the data
relates.
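
The ``data_files`` argument to ``setup()`` is a list of
``(install_directory, [source_files])`` pairs, with relative install
directories interpreted relative to the installation prefix.  A sketch of
building that list by walking a data directory is below; the helper name
``collect_data_files`` is an assumption, and the real data packages may
construct the list differently.

```python
import os


def collect_data_files(source_dir, install_dir):
    """Build a ``data_files`` list mirroring `source_dir` in `install_dir`."""
    data_files = []
    for dirpath, dirnames, filenames in os.walk(source_dir):
        dirnames.sort()  # visit subdirectories in a stable order
        if not filenames:
            continue  # directories holding only subdirectories add nothing
        rel_dir = os.path.relpath(dirpath, source_dir)
        target = os.path.normpath(os.path.join(install_dir, rel_dir))
        sources = [os.path.join(dirpath, name) for name in sorted(filenames)]
        data_files.append((target, sources))
    return data_files
```

In a ``setup.py`` this might be used as
``data_files=collect_data_files('templates', 'share/nipy/nipy/templates')``.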

If you install to a location that is not searched by default, you will
need to add that location to the output of
``nipy.data.get_data_path()`` using one of the mechanisms described in
:ref:`find-data`, for example, in your system configuration::

   export NIPY_DATA_PATH=/my/prefix/share/nipy

Packaging for distributions
```````````````````````````

For a particular data package - say ``nipy-templates`` - distributions
will want to:

#. Install the data in a set location.  The default from
   ``python setup.py install`` for the data packages will be
   ``/usr/share/nipy`` on Unix.
#. Point a system installation of NIPY to these data. 

For the latter, the most obvious route is to copy an ``.ini`` file named
for the data package into the NIPY ``etc_dir``.  In this case, on Unix,
we will want a file called ``/etc/nipy/nipy_templates.ini`` with
contents::

   [DATA]
   path = /usr/share/nipy
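
A packaging script could generate this file rather than ship a static
copy.  The sketch below produces the text with the standard library
``configparser`` module; the function name ``make_data_ini`` is
hypothetical, and writing the result to ``/etc/nipy`` is left to the
packager.

```python
import io
from configparser import RawConfigParser


def make_data_ini(data_path):
    """Return ini text with a [DATA] section pointing at `data_path`."""
    parser = RawConfigParser()
    parser.add_section('DATA')
    parser.set('DATA', 'path', data_path)
    buffer = io.StringIO()
    parser.write(buffer)
    return buffer.getvalue()


ini_text = make_data_ini('/usr/share/nipy')
```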

Current implementation
``````````````````````

This section describes how we (the NIPY package) implement data packages
at the moment.

The data in the data packages will not be under source control.

The data packages will be available at a central release location.  For
now this will be: http://nipy.sourceforge.net/data-packages/ .

A package such as ``nipy-templates-0.1.tar.gz`` will have the following
contents::


  <ROOT>
    |-- setup.py
    |-- README.txt
    |-- MANIFEST.in
    `-- templates
        |-- ICBM152
        |   `-- 2mm
        |       `-- T1.nii.gz
        |-- colin27
        |   `-- 2mm
        |       `-- T1.nii.gz
        `-- config.ini


There should be only one ``nipy/packagename`` directory delivered by a
particular package.  For example, this package installs
``nipy/templates``, but does not contain ``nipy/data``.  

Making a new package tarball is simply:

#. Downloading and unpacking e.g. ``nipy-templates-0.1.tar.gz`` to form
   the directory structure above
#. Making any changes to the directory
#. Running ``python setup.py sdist`` to recreate the package

The process of making a release should be:

#. Increment the major or minor version number in the ``config.ini`` file
#. Make a package tarball as above
#. Upload to distribution site
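
The first release step could be scripted.  The sketch below assumes
versions of the form ``major.minor`` and resets the minor number to zero
on a major bump; both are assumptions made for the example, and
``bump_version`` is a hypothetical helper.

```python
def bump_version(version, part='minor'):
    """Return `version` with its major or minor number incremented."""
    major, minor = (int(piece) for piece in version.split('.')[:2])
    if part == 'major':
        # Assumed convention: a major bump resets the minor number
        return '%d.0' % (major + 1)
    return '%d.%d' % (major, minor + 1)
```

For example, ``bump_version('0.1')`` gives ``'0.2'``, which would then be
written back to the ``version`` entry in ``config.ini``.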

There is an example nipy data package ``nipy-examplepkg`` in the
``examples`` directory of the NIPY repository.

The machinery for creating and maintaining data packages is available with::
   
   svn co https://nipy.svn.sourceforge.net/svnroot/nipy/data-packaging/trunk

See the ``README.txt`` file there for more information.