1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219
|
.. -*- mode: rst -*-
.. vi: set ft=rst sts=4 ts=4 sw=4 et tw=79:
.. _chap_design_result_records:
**************
Result records
**************
.. topic:: Specification scope and status
This specification describes the current implementation.
Result records are the standard return value format for all DataLad commands.
Each command invocation yields one or more result records. Result records are
routinely inspected throughout the code base, and are used to inform generic
error handling, as well as particular calling commands on how to proceed with
a specific operation.
The technical implementation of a result record is a Python dictionary. This
dictionary must contain a number of mandatory fields/keys (see below). However,
an arbitrary number of additional fields may be added to a result record.
The ``get_status_dict()`` function simplifies the creation of result records.
.. note::
Developers *must* compose result records with care! DataLad supports custom
user-provided hook configurations that use result record fields to
decide when to trigger a custom post-result operation. Such custom hooks
rely on a persistent naming and composition of result record fields.
Changes to result records, including field name changes, field value changes,
but also timing/order of record emitting potentially break user set ups!
Mandatory fields
================
The following keys *must* be present in any result record. If any of these
keys is missing, DataLad's behavior is undefined.
``action``
----------
A string label identifying which type of operation a result is associated with.
Labels *must not* contain white space. They should be compact, and lower-cases,
and use ``_`` (underscore) to separate words in compound labels.
A result without an ``action`` label will not be processed and is discarded.
``path``
--------
A string with an *absolute* path describing the local entity a result is
associated with. Paths must be platform-specific (e.g., Windows paths on
Windows, and POSIX paths on other operating systems). When a result is about an
entity that has no meaningful relation to the local file system (e.g., a URL to
be downloaded), to ``path`` value should be determined with respect to the
potential impact of the result on any local entity (e.g., a URL downloaded
to a local file path, a local dataset modified based on remote information).
.. _target-result-status:
``status``
----------
This field indicates the nature of a result in terms of four categories, identified
by a string label.
- ``ok``: a standard, to-be-expected result
- ``notneeded``: an operation that was requested, but found to be unnecessary
in order to achieve a desired goal
- ``impossible``: a requested operation cannot be performed, possibly because
its preconditions are not met
- ``error``: an error occurred while performing an operation
Based on the ``status`` field, a result is categorized into *success* (``ok``,
``notneeded``) and *failure* (``impossible``, ``error``). Depending on the
``on_failure`` parameterization of a command call, any failure-result emitted
by a command can lead to an ``IncompleteResultsError`` being raised on command
exit, or a non-zero exit code on the command line. With ``on_failure='stop'``,
an operation is halted on the first failure and the command errors out
immediately, with ``on_failure='continue'`` an operation will continue despite
intermediate failures and the command only errors out at the very end, with
``on_failure='ignore'`` the command will not error even when failures occurred.
The latter mode can be used in cases where the initial status-characterization
needs to be corrected for the particular context of an operation (e.g., to
relabel expected and recoverable errors).
Common optional fields
======================
The following fields are not required, but can be used to enrich a result
record with additional information that improves its interpretability, or
triggers particular optional functionality in generic result processing.
``type``
--------
This field indicates the type of entity a result is associated with. This may
or may not be the type of the local entity identified by the ``path`` value.
The following values are common, and should be used in matching cases, but
arbitrary other values are supported too:
- ``dataset``: a DataLad dataset
- ``file``: a regular file
- ``directory``: a directory
- ``symlink``: a symbolic link
- ``key``: a git-annex key
- ``sibling``: a Dataset sibling or Git remote
``message``
-----------
A message providing additional human-readable information on the nature or
provenance of a result. Any non-``ok`` results *should* have a message providing
information on the rational of their status characterization.
A message can be a string or a tuple. In case of a tuple, the second item can
contain values for ``%``-expansion of the message string. Expansion is performed
only immediately prior to actually outputting the message, hence string formatting
runtime costs can be avoided this way, if a message is not actually shown.
``logger``
----------
If a result record has a ``message`` field, then a given `Logger` instance
(typically from ``logging.getLogger()``) will be used to automatically log
this message. The log channel/level is determined based on
``datalad.log.result-level`` configuration setting. By default, this is
the ``debug`` level. When set to ``match-status`` the log level is determined
based on the ``status`` field of a result record:
- ``debug`` for ``'ok'``, and ``'notneeded'`` results
- ``warning`` for ``'impossible'`` results
- ``error`` for ``'error'`` results
This feature should be used with care. Unconditional logging can lead to
confusing double-reporting when results rendered and also visibly logged.
``refds``
---------
This field can identify a path (using the same semantics and requirements as
the ``path`` field) to a reference dataset that represents the larger context
of an operation. For example, when recursively processing multiple files across
a number of subdatasets, a ``refds`` value may point to the common superdataset.
This value may influence, for example, how paths are rendered in user-output.
``parentds``
------------
This field can identify a path (using the same semantics and requirements as
the ``path`` field) to a dataset containing an entity.
``state``
---------
A string label categorizing the state of an entity. Common values are:
- ``clean``
- ``untracked``
- ``modified``
- ``deleted``
- ``absent``
- ``present``
``error_message``
-----------------
An error message that was captured or produced while achieving a result.
An error message can be a string or a tuple. In the case of a tuple, the
second item can contain values for ``%``-expansion of the message string.
``exception``
-------------
An exception that occurred while achieving the reported result.
``exception_traceback``
-----------------------
A string with a traceback for the exception reported in ``exception``.
Additional fields observed "in the wild"
========================================
Given that arbitrary fields are supported in result records, it is impossible
to compose a comprehensive list of field names (keys). However, in order to
counteract needless proliferation, the following list describes fields that
have been observed in implementations. Developers are encouraged to preferably
use compatible names from this list, or extend the list for additional items.
In alphabetical order:
``bytesize``
The size of an entity in bytes (integer).
``gitshasum``
SHA1 of an entity (string)
``prev_gitshasum``
SHA1 of a previous state of an entity (string)
``key``
The git-annex key associated with a ``type``-``file`` entity.
|