===============
Task statistics
===============
We store metadata about previous runs of various tasks. Runtime data from
former runs can be used to:

* select a powerful worker when the task is resource-hungry (or a modest
  one when it is not)
* decide whether a failure is fatal based on the existence of previous
  successful runs
* etc.

Output data
===========

The ``WorkRequest`` model has a JSON field named ``output_data``, set upon
completion of the work request. The values are provided by the worker. Its
structure is as follows (a sample payload is sketched after this list):

* ``runtime_statistics``: see :ref:`runtime-statistics` below.
* ``errors``: a list of errors. Each error is a dictionary with the
  following keys:

  * ``message``: a user-friendly error message
  * ``code``: a computer-friendly error code

  .. note::

     This is typically used to return validation/configuration errors to
     the user that resulted in the task not being run at all. Other
     additional keys might be set depending on the error code.

     The ``errors`` key is not required for the design described here,
     but it explains why I opted to create an ``output_data`` field
     instead of a ``runtime_statistics`` field. See :issue:`432` for a
     related issue that we could fix with this.

* ``skip_reason``: may be set to a human-readable explanation of why this
  work request was skipped rather than being run normally.
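
As an illustration, here is what ``output_data`` could look like for a
completed work request; the concrete values below are invented for this
sketch:

.. code-block:: python

   # Hypothetical output_data payload for a successful run.
   output_data = {
       "runtime_statistics": {
           "duration": 142,              # seconds
           "cpu_time": 128,              # seconds (user + system)
           "disk_space": 2_500_000_000,  # bytes, peak usage
           "memory": 1_200_000_000,      # bytes, peak usage
           "available_disk_space": 50_000_000_000,  # bytes, rounded
           "available_memory": 8_000_000_000,       # bytes, rounded
           "cpu_count": 4,
       },
       # "errors" and "skip_reason" would appear instead when the task
       # failed validation or was skipped.
   }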

.. _runtime-statistics:

``RuntimeStatistics`` model
---------------------------

The model combines runtime data about the task itself (a possible
declaration is sketched after the lists below):

* ``duration`` (optional, integer): the runtime duration of the task, in
  seconds
* ``cpu_time`` (optional, integer): the amount of CPU time used, in
  seconds (combining user and system CPU time)
* ``disk_space`` (optional, integer): the maximum disk space used during
  the task's execution (in bytes)
* ``memory`` (optional, integer): the maximum amount of RAM used during
  the task's execution (in bytes)

It also includes some data about the worker, to help analyze the above
values and/or to provide reference data when runtime data is missing:

* ``available_disk_space`` (optional, integer): the disk space available
  when the task started (in bytes, may be rounded)
* ``available_memory`` (optional, integer): the amount of RAM available
  when the task started (in bytes, may be rounded)
* ``cpu_count`` (optional, integer): the number of CPU cores on the
  worker that ran the task
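
One possible declaration, sketched here with pydantic; the base class and
exact field defaults are assumptions, not a settled implementation:

.. code-block:: python

   from typing import Optional

   from pydantic import BaseModel


   class RuntimeStatistics(BaseModel):
       """Runtime statistics of a task run; all fields are optional."""

       # Data about the task itself
       duration: Optional[int] = None    # seconds
       cpu_time: Optional[int] = None    # seconds (user + system)
       disk_space: Optional[int] = None  # bytes, peak usage
       memory: Optional[int] = None      # bytes, peak usage

       # Reference data about the worker
       available_disk_space: Optional[int] = None  # bytes, may be rounded
       available_memory: Optional[int] = None      # bytes, may be rounded
       cpu_count: Optional[int] = None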

Open question: how and where to use the statistics
==================================================

In theory, the statistics might only be available when the task becomes
pending, i.e. when we have the final result of ``compute_dynamic_data()``
and are guaranteed to have values for subject/context.

If we want to use those statistics to tweak the configuration of the work
request (e.g. by adding new worker requirements), this requires careful
coordination between the scheduler and the workflow.

In practice, many workflows will know the subject/context values in
advance and can thus configure the work request at creation time.
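
As a rough illustration of the kind of logic involved, here is a
hypothetical helper that could derive worker requirements from past
statistics; the function name, requirement keys and 20% safety margin are
all invented for this sketch:

.. code-block:: python

   def derive_worker_requirements(
       past_stats: list[RuntimeStatistics],
   ) -> dict[str, int]:
       """Derive worker requirements from statistics of previous runs.

       Applies a 20% safety margin over the peak memory and disk usage
       observed in past runs of the same task.
       """
       requirements: dict[str, int] = {}
       memory_values = [s.memory for s in past_stats if s.memory is not None]
       if memory_values:
           requirements["minimum_memory"] = int(max(memory_values) * 1.2)
       disk_values = [
           s.disk_space for s in past_stats if s.disk_space is not None
       ]
       if disk_values:
           requirements["minimum_disk_space"] = int(max(disk_values) * 1.2)
       return requirements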