Contributing
************
RosettaSciIO is meant to be a community-maintained project. We welcome contributions
in the form of bug reports, documentation, code (in particular :ref:`new io plugins
<defining-plugins>`), feature requests, and more. The sections below point to
resources that will help you make useful contributions.
Issues
======
The `issue tracker <https://github.com/hyperspy/rosettasciio/issues>`_ can be used to
report bugs or propose new features. When reporting a bug, the following is
useful:
- give a minimal example demonstrating the bug,
- copy and paste the error traceback.
.. _making_test_files:
Making test data files
======================
Test data files are typically generated using third-party software, for example
proprietary software on a scientific instrument. These files are added to the `test suite <https://en.wikipedia.org/wiki/Test_suite>`_
of RosettaSciIO to make sure that future code development will not introduce bugs or feature
regressions. It is important that the test data files are as small as possible to avoid working
with a repository that contains gigabytes of test data. Indeed, the test suite is made of several hundred
test data files and this number will keep growing as new features and formats are added
to RosettaSciIO.

Users can contribute by generating these files with software they have access to and by making these
files openly available; a RosettaSciIO developer will then help with adding these data to the test suite.
What characterizes good test data files:

- Relevant features: the test data files do not need to contain any meaningful data, but they need to
  cover as many of the format's functionalities as possible.
- Small size:

  - Acquire a minimal number of pixels or channels. In the case of maps or spectrum images, acquire a
    non-square grid (e.g. "x" and "y" have different lengths).
  - If possible, generate data that contains no signal (e.g. zeros), as files containing only very few
    distinct values compress very well.
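
The effect of empty data on file size can be illustrated with a minimal sketch using only the
Python standard library. It assumes generic ``zlib`` compression, which may differ from the
compression a given file format applies, and the 4 x 7 grid with 1024 channels is an arbitrary
choice made for illustration:

```python
import os
import zlib

# Hypothetical non-square map: 4 x 7 pixels, 1024 one-byte channels per pixel
n_y, n_x, n_channels = 4, 7, 1024
n_bytes = n_y * n_x * n_channels

# Data containing no signal (all zeros) versus noisy data (random bytes)
empty_data = bytes(n_bytes)
noisy_data = os.urandom(n_bytes)

empty_size = len(zlib.compress(empty_data))
noisy_size = len(zlib.compress(noisy_data))

print(f"uncompressed: {n_bytes} bytes")
print(f"zeros, compressed: {empty_size} bytes")
print(f"noise, compressed: {noisy_size} bytes")
```

The zero-filled buffer shrinks to a small fraction of its original size, while the random
buffer hardly compresses at all.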
Pull Requests
=============
If you want to contribute to the RosettaSciIO source code, you can send us a
`pull request <https://github.com/hyperspy/rosettasciio/pulls>`_ against the ``main``
branch. Small bug fixes or corrections to the user guide are typically a good
starting point. But don't hesitate to also submit significant code contributions, such
as support for a new file format; if needed, we'll help you get the code up
to common standards.
Please refer to the
`HyperSpy developer guide <http://hyperspy.org/hyperspy-doc/current/dev_guide/intro.html>`_
in order to get started and for detailed contributing guidelines.
Lint
----
.. _pre-commit.ci: https://pre-commit.ci
To keep the code style consistent (and more readable), `black <https://black.readthedocs.io/>`_
is used to check the code formatting. When the code doesn't comply with the expected formatting,
the `pre-commit.ci build <https://results.pre-commit.ci/latest/github/hyperspy/rosettasciio/main>`_
will fail. In practice, the code formatting can be fixed by installing ``black`` and running it on the
source code or by using :ref:`pre-commit hooks <pre-commit-hooks>`.
Alternatively, adding the message ``pre-commit.ci autofix`` in a pull request will push a commit with
the fixes using `pre-commit.ci`_.
.. _adding-and-updating-test-data:
Adding and Updating Test Data
-----------------------------
The test data are located in the corresponding subfolder of the ``rsciio/tests/data`` folder.
The test data are not packaged in the distribution files (wheel, sdist) to keep the packages
as small as possible in size. When running the test suite, the test data will be downloaded
from GitHub using pooch. When adding or updating test data, it is necessary to update the test
data registry.
To add or update test data:
#. Use git as usual to add files to the repository.
#. Update ``rsciio.tests.registry.txt`` by running
   :py:func:`~.tests.registry_utils.update_registry` (Unix only):

   .. code-block:: python

       from rsciio.tests.registry_utils import update_registry

       update_registry()

   On Windows, you can use :ref:`pre-commit.ci <pre-commit-hooks>` by adding a message to
   the pull request to update the registry.
.. note::
   The URL used by pooch to download the test data can be set with the environment variable
   ``POOCH_BASE_URL``; otherwise, the data are downloaded from the
   `hyperspy/rosettasciio <https://github.com/hyperspy/rosettasciio>`_ GitHub repository.
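
For orientation, a pooch registry is a plain-text file listing each file name together with its
hash. The sketch below builds such lines with the standard library only. ``build_registry_lines``
is a hypothetical helper written for illustration and is not the implementation of
``update_registry``; the use of SHA256 hashes and POSIX-style relative paths is an assumption:

```python
import hashlib
import tempfile
from pathlib import Path


def build_registry_lines(data_dir):
    """Return one "relative/path sha256-hash" line per file in ``data_dir``."""
    data_dir = Path(data_dir)
    lines = []
    for path in sorted(data_dir.rglob("*")):
        if path.is_file():
            file_hash = hashlib.sha256(path.read_bytes()).hexdigest()
            # Use forward slashes so the lines are identical on all platforms
            relative_path = path.relative_to(data_dir).as_posix()
            lines.append(f"{relative_path} {file_hash}")
    return lines


# Usage with a throwaway directory standing in for ``rsciio/tests/data``
with tempfile.TemporaryDirectory() as tmp_dir:
    plugin_dir = Path(tmp_dir) / "spamandeggs"
    plugin_dir.mkdir()
    (plugin_dir / "empty.dat").write_bytes(b"")
    registry_lines = build_registry_lines(tmp_dir)

print("\n".join(registry_lines))
```

Because the hash changes whenever a file's content changes, the registry has to be regenerated
each time test data are added or updated.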
Review
------
For quality assurance, to improve the code, and to ensure generalized
functionality, pull requests need to be thoroughly reviewed by at least one
other member of the development team before being merged.
.. _pre-commit-hooks:
Pre-commit Hooks
----------------
Two pre-commit hooks are set up:
* Linting: run ``black``
* Update test data registry (Unix only)
These can be run locally by using `pre-commit <https://pre-commit.com>`__.
Alternatively, the comment ``pre-commit.ci autofix`` can be added to a PR to fix the formatting
using `pre-commit.ci`_.
.. _defining-plugins:
Defining new RosettaSciIO plugins
=================================
Each read/write plugin resides in a separate directory, e.g. ``spamandeggs``, the
name of which should be descriptive of the file type/manufacturer/software. This
directory should contain the following files:
* ``__init__.py`` -- Defines the exposed API functions, ``file_reader`` and optionally ``file_writer``:

  .. code-block:: python

      from ._api import file_reader, file_writer

      __all__ = [
          "file_reader",
          "file_writer",
      ]


      def __dir__():
          return sorted(__all__)

* ``specifications.yaml`` -- The characteristics of the IO plugin in *yaml* format:

  .. code-block:: yaml

      name: <String>  # unique, concise, no whitespace; corresponding to directory name (e.g. ``spamandeggs``)
      name_aliases: [<String>]  # List of strings, may contain whitespaces (empty if no alias defined)
      description: <String>
      full_support: <Bool>  # Whether all the HyperSpy features are supported
      file_extensions: <Tuple of strings>  # Recognised file extensions
      default_extension: <Int>  # Index of the extension that will be used by default
      writes: <Bool>/[Nested list]  # Writing capabilities
      # if only limited dimensions are supported, the supported combinations of signal
      # dimensions (sd) and navigation dimensions (nd) are given as list [[sd, nd], ...]
      non_uniform_axis: <Bool>  # Support for non-uniform axis

* ``_api.py`` -- Python file that implements the actual reader. The IO functionality
  should be interfaced with the following functions:

  * A function called ``file_reader`` with at least one argument, ``filename``,
    that returns the :ref:`standardized signal dictionary <interfacing-api>`.
  * (optional) A function called ``file_writer`` with at least two arguments,
    ``filename`` and ``signal`` (a Python dictionary), in that order.
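
As an illustration of these two functions, here is a minimal sketch of an ``_api.py`` for an
imaginary plain-text format with one number per line. The format, the dictionary keys
(``data``, ``axes``, ``metadata``, ``original_metadata``) and the choice to return a list holding
one dictionary are assumptions made for this example; the
:ref:`standardized signal dictionary <interfacing-api>` section remains the authoritative
description. Plain lists keep the sketch dependency-free, whereas a real plugin would typically
hold the data in a numpy or dask array:

```python
import os
import tempfile


def file_reader(filename):
    """Read a hypothetical text file with one number per line."""
    with open(filename, "r", encoding="utf-8") as f:
        data = [float(line) for line in f if line.strip()]
    # Assumed axis description; the key names are illustrative
    axes = [
        {"name": "x", "size": len(data), "offset": 0.0, "scale": 1.0, "units": ""}
    ]
    return [
        {
            "data": data,
            "axes": axes,
            "metadata": {"General": {"original_filename": os.path.basename(filename)}},
            "original_metadata": {},
        }
    ]


def file_writer(filename, signal):
    """Write the data of a signal dictionary, one number per line."""
    with open(filename, "w", encoding="utf-8") as f:
        for value in signal["data"]:
            f.write(f"{value}\n")


# Round trip through a temporary file
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "example.spam")
    file_writer(path, {"data": [1.0, 2.5, 4.0]})
    signals = file_reader(path)

print(signals[0]["data"])
```

Writing a file and reading it back, as done at the end of the sketch, is also a convenient
starting point for a plugin's tests.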
**Tests** covering the functionality of the plugin should be added to the
``tests`` directory in a file named ``test_spamandeggs.py``, corresponding to
the plugin residing in the directory ``spamandeggs``. Data files for the tests
should be placed in a corresponding subdirectory - see the
:ref:`Adding and Updating Test Data <adding-and-updating-test-data>` section for more
information.
**Documentation** should be added both as **docstrings** and to the **user guide**,
for which a corresponding ``spamandeggs.rst`` file should be created in the directory
``doc/user_guide/supported_formats/`` and the format added to the lists in
``doc/user_guide/supported_formats/index.rst`` and ``doc/user_guide/supported_formats/supported_formats.rst``.
A few standard *docstring* components are provided by ``rsciio._docstrings.py`` and should
be used (see existing plugins).
The *docstrings* are automatically added to the *user guide* using the following lines:

.. code-block:: rst

    API functions
    ^^^^^^^^^^^^^

    .. automodule:: rsciio.spamandeggs
       :members:

The *docstrings* follow the `Numpy docstring style <https://numpydoc.readthedocs.io>`_.
Links to the RosettaSciIO API and to other Sphinx-documented APIs are checked when building the documentation,
and broken links raise warnings. To identify potentially broken links during pull
request review, the `Documentation <https://github.com/hyperspy/rosettasciio/actions/workflows/Documentation.yml>`_
GitHub CI workflow is set to fail when the doc build raises warnings.
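
For reference, a docstring in Numpy style could look like the sketch below. The ``lazy``
parameter and the wording are placeholders chosen for illustration, and the shared components
from ``rsciio._docstrings.py`` are deliberately not used here because their names are not listed
in this guide:

```python
def file_reader(filename, lazy=False):
    """
    Read data from a file in the hypothetical ``spamandeggs`` format.

    Parameters
    ----------
    filename : str or pathlib.Path
        Name of the file to read.
    lazy : bool, default=False
        Placeholder option: whether to load the data lazily.

    Returns
    -------
    list of dict
        One dictionary per signal found in the file.
    """
    # Stub body: the point of this sketch is the docstring layout
    return []


print(file_reader.__doc__)
```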
.. note::

   It is advisable to clone the files of an existing plugin when initiating a new
   plugin.

RosettaSciIO version
====================
The version of RosettaSciIO is defined by `setuptools_scm <https://setuptools-scm.readthedocs.io/>`_
and retrieved with ``importlib.metadata`` at runtime for user installations.

- Version at build time: the version is defined from the tag or the "distance from the tag".
- Version at runtime: use the version of the package (``sdist`` or ``wheel``), which would have been
  defined at build time. At runtime, the version is obtained using ``importlib.metadata`` as follows:

  .. code-block:: python

      from importlib.metadata import version

      __version__ = version("rosettasciio")

- Version at runtime for editable installation: the version is defined from the tag or the "distance from the tag".

.. note::
   To define the version in a development installation or at build time, ``setuptools_scm`` uses
   the git history with all commits, so a shallow checkout will provide an incorrect version.
   For user installations in ``site-packages``, ``setuptools_scm`` is not used.
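
When the package metadata may be missing, for instance when running from a source tree that was
never installed, the lookup can be guarded. The helper ``get_version`` and the fallback string
below are hypothetical and not part of RosettaSciIO:

```python
from importlib.metadata import PackageNotFoundError, version


def get_version(distribution_name, fallback="unknown"):
    """Return the installed version of a distribution, or ``fallback``."""
    try:
        return version(distribution_name)
    except PackageNotFoundError:
        # No metadata found: the distribution is not installed
        return fallback


print(get_version("rosettasciio"))
missing = get_version("a-distribution-that-is-not-installed")
```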
Dependencies
============
``RosettaSciIO`` strives to be easy to install with a minimum of dependencies and depends solely on
standard library modules, numpy and dask. Non-pure Python (binary) dependencies are optional for
the following reasons:

- Provide maximum flexibility in usability and avoid forcing users to install libraries that they don't need:
  for use cases where only a file reader is necessary, it should be possible to install ``RosettaSciIO``
  without installing large or non-pure Python dependencies, which are not always easy to install.
- Some binary dependencies are not supported for all Python implementations (``pypy`` or ``pyodide``)
  or for all platforms.
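
One common way to keep such dependencies optional is to import them lazily and fail with an
informative message only when a feature needs them. This is a generic sketch of that pattern;
``import_optional`` is a hypothetical helper, not a RosettaSciIO function:

```python
import importlib


def import_optional(module_name, feature):
    """Import ``module_name`` or raise an error explaining what needs it."""
    try:
        return importlib.import_module(module_name)
    except ImportError as err:
        raise ImportError(
            f"{feature} requires the optional dependency '{module_name}', "
            "which is not installed."
        ) from err


# An available module is returned as usual
json_module = import_optional("json", feature="This example")

# A missing module only fails when the feature is actually requested
try:
    import_optional("a_module_that_does_not_exist", feature="Reading spamandeggs files")
except ImportError as err:
    message = str(err)
    print(message)
```

With this approach, importing the package itself never fails because of a missing optional
library; only the plugin that needs it reports the problem.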
Maintenance
===========
Please refer to the
`HyperSpy developer guide <http://hyperspy.org/hyperspy-doc/current/dev_guide/intro.html>`_
for maintenance guidelines.