Spanning Processes and Threads
==============================

Introduction
------------

In many applications a task is not confined to a single thread or a single process.
For example, one server may send a request to another server over a network, and we would like to trace the combined operation across both servers' logs.
To make this as easy as possible, Eliot supports serializing task identifiers for transfer over the network (or between threads), allowing tasks to span multiple processes.

.. _cross thread tasks:

Cross-Thread Tasks
------------------

To trace actions across threads, Eliot provides the ``eliot.preserve_context`` API.
It takes a callable that is about to be passed to a thread constructor and preserves the current Eliot context, returning a new callable.
This new callable should be used only in the thread where it will run; it restores the Eliot context and runs the original function inside it.
For example:

.. literalinclude:: ../../../examples/cross_thread.py

Here is the output when the example is run:

.. code-block:: shell-session

   $ python examples/cross_thread.py | eliot-tree
   11a85c42-a13f-491c-ad44-c48b2efad0e3
   +-- main_thread@1/started
       +-- eliot:remote_task@2,1/started
           +-- in_thread@2,2,1/started
               |-- x: 3
               `-- y: 4
               +-- in_thread@2,2,2/succeeded
                   |-- result: 7
           +-- eliot:remote_task@2,3/succeeded
       +-- main_thread@3/succeeded

.. _cross process tasks:

Cross-Process Tasks
-------------------

``eliot.Action.serialize_task_id()`` can be used to create some ``bytes`` identifying a particular location within a task.
``eliot.Action.continue_task()`` converts a serialized task identifier into an ``eliot.Action`` and then starts the ``Action``.
The process which created the task serializes the task identifier and sends it over the network to the process which will continue the task.
This second process deserializes the identifier and uses it as a context for its own messages.

In the following example the task identifier is added as a header to an HTTP request:

.. literalinclude:: ../../../examples/cross_process_client.py

The server that receives the request then extracts the identifier:

.. literalinclude:: ../../../examples/cross_process_server.py

Tracing logs across multiple processes makes debugging problems dramatically easier.
For example, let's run the following:

.. code-block:: shell-session

   $ python examples/cross_process_server.py > server.log
   $ python examples/cross_process_client.py 3 0 > client.log

Here are the resulting combined logs, as visualized by `eliot-tree`_.
The cause of the client's 500 error is visible in these logs, where the server's ``divide`` action fails with a ``ZeroDivisionError``:

.. code-block:: shell-session

  $ cat client.log server.log | eliot-tree
  1e0be9be-ae56-49ef-9bce-60e850a7db09
  +-- main@1/started
      |-- process: client
      +-- http_request@2,1/started
          |-- process: client
          |-- x: 3
          `-- y: 0
          +-- eliot:remote_task@2,2,1/started
              |-- process: server
              +-- divide@2,2,2,1/started
                  |-- process: server
                  |-- x: 3
                  `-- y: 0
                  +-- divide@2,2,2,2/failed
                      |-- exception: exceptions.ZeroDivisionError
                      |-- process: server
                      |-- reason: integer division or modulo by zero
              +-- eliot:remote_task@2,2,3/failed
                  |-- exception: exceptions.ZeroDivisionError
                  |-- process: server
                  |-- reason: integer division or modulo by zero
          +-- http_request@2,3/failed
              |-- exception: requests.exceptions.HTTPError
              |-- process: client
              |-- reason: 500 Server Error: INTERNAL SERVER ERROR
      +-- main@3/failed
          |-- exception: requests.exceptions.HTTPError
          |-- process: client
          |-- reason: 500 Server Error: INTERNAL SERVER ERROR

.. _eliot-tree: https://warehouse.python.org/project/eliot-tree/

Actions and Threads
-------------------

``eliot.Action`` objects should only be used on the thread that created them.
If you want your task to span multiple threads, use the ``eliot.preserve_context`` API described above.


Ensuring Message Uniqueness
---------------------------

Serialized task identifiers should be used at most once.
For example, every time a remote operation is retried a new call to ``serialize_task_id()`` should be made to create a new identifier.
Otherwise there is a chance that you will end up with messages that have duplicate identification (i.e. two messages with matching ``task_uuid`` and ``task_level`` values), making it more difficult to trace causality.

If this is not possible you may wish to start a new Eliot task upon receiving a remote request, while still making sure to log the serialized remote task identifier.
The inclusion of the remote task identifier will allow manual or automated reconstruction of the cross-process relationship between the original and new tasks.

Another alternative in some cases is to rely on unique process or thread identity to distinguish between the log messages.
For example, if the same serialized task identifier is sent to multiple processes, log messages within the task can still have a unique identity if a process identifier is included with each message.
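
To check whether a set of logs suffers from this problem, duplicated identification can be detected after the fact.
The following is a rough sketch, assuming the logs are stored as one JSON object per line with ``task_uuid`` and ``task_level`` fields; ``find_duplicate_ids`` and the sample messages are hypothetical and not part of Eliot.

```python
import json
from collections import Counter


def find_duplicate_ids(log_lines):
    """Return (task_uuid, task_level) pairs that occur more than once."""
    counts = Counter()
    for line in log_lines:
        line = line.strip()
        if not line:
            continue
        message = json.loads(line)
        # task_level is a JSON list; convert it to a tuple so it is hashable.
        counts[(message["task_uuid"], tuple(message["task_level"]))] += 1
    return sorted(key for key, count in counts.items() if count > 1)


# Made-up messages: the same serialized identifier was reused by two servers.
sample = [
    '{"task_uuid": "abc", "task_level": [1], "action_type": "main"}',
    '{"task_uuid": "abc", "task_level": [2, 1], "process": "server-1"}',
    '{"task_uuid": "abc", "task_level": [2, 1], "process": "server-2"}',
]
duplicates = find_duplicate_ids(sample)
```

In this sample the two colliding messages can still be told apart by their ``process`` field, which is the approach described in the paragraph above.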


Logging Output for Multiple Processes
-------------------------------------

If logs are being combined from multiple processes, an identifier indicating the originating process should be included in log messages.
This can be done in a number of ways, e.g.:

* Have your destination add another field to the output.
* Rely on Logstash, or whatever your logging pipeline tool is, to add a field when shipping the logs to your centralized log store.