Workceptor
==========

.. contents::
   :local:

Workceptor is a component of receptor that handles units of work.
``work-commands`` defines a type of work that can run on the node.
foo.yml

.. code-block:: yaml

    ---
    version: 2
    node:
      id: foo

    log-level:
      level: Debug

    tcp-listeners:
      - port: 2222

    control-services:
      - service: control
        filename: /tmp/foo.sock

    work-commands:
      - workType: echoint
        command: bash
        params: "-c \"for i in {1..10}; do echo $i; sleep 1; done\""

bar.yml

.. code-block:: yaml

    ---
    version: 2
    node:
      id: bar

    log-level:
      level: Debug

    tcp-peer:
      address: localhost:2222

    control-services:
      - service: control

    work-commands:
      - workType: echoint
        command: bash
        params: "-c \"for i in {1..10}; do echo $i; sleep 1; done\""
      - workType: echopayload
        command: bash
        params: "-c \"while read -r line; do echo ${line^^}; sleep 5; done\""

Configuring work commands
--------------------------
``workType`` User-defined name to give this work definition.

``command`` The executable that is invoked when running this work.

``params`` Command-line options passed to this executable.

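
As a rough illustration of how ``command`` and ``params`` fit together, the Python sketch below builds the argument vector for the ``echoint`` work type defined in `bar.yml`. It assumes shell-style splitting of ``params`` (done here with ``shlex.split``); receptor's own parsing may differ, and ``build_argv`` is a hypothetical helper that is not part of receptor.

.. code-block:: python

    import shlex
    from typing import List


    def build_argv(command: str, params: str) -> List[str]:
        """Combine an executable and its params string into an argv list.

        Assumption: params are split using POSIX shell quoting rules.
        """
        return [command] + shlex.split(params)


    # The params value from bar.yml, after YAML has unescaped the \" sequences.
    argv = build_argv("bash", '-c "for i in {1..10}; do echo $i; sleep 1; done"')
    print(argv)

Under this assumption the quoted script reaches ``bash`` as a single argument after ``-c``, which is why the inner quotes in ``params`` matter.
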
Local work
-----------
Start the work by connecting to the control service defined under ``control-services`` and issuing a "work submit" command.

.. code-block:: bash

    $ receptorctl --socket /tmp/foo.sock work submit echoint --no-payload
    Result: Job Started
    Unit ID: t1BlAB18

Receptor started an instance of this work type and labeled it with a unique "Unit ID".

Work results
-------------
Use the "Unit ID" to get work results.

.. code-block:: bash

    $ receptorctl --socket /tmp/foo.sock work results t1BlAB18
    1
    2
    3
    4
    5
    6
    7
    8
    9
    10

Remote work
------------
Although receptorctl is connected to `foo`, providing the "--node" option starts the work on node `bar`.
The work type must be defined on the node it is intended to run on. In this case, `bar` must have a ``work-commands`` entry called "echoint".

.. code-block:: bash

    $ receptorctl --socket /tmp/foo.sock work submit echoint --node bar --no-payload
    Result: Job Started
    Unit ID: 87Vwqb6A

Remote work submission ultimately results in two work units running at the same time: a local work unit and the remote work unit. These two units have their own Unit IDs. The local work unit's goal is to monitor and stream results back from the running remote work unit.

Sequence of events for remote work submission:

- `foo` starts a local work unit of work type "remote". This is a special work type that is built into receptor.
- This work unit attempts to connect to `bar`'s control service and issue a "work submit echoint" command. From `bar`'s perspective, this is the exact same operation as if a user connected to `bar` directly and issued a work submit command. `bar` is not aware that `foo` is the one that issued the command.
- Once the work is submitted, `foo` streams the work results back to itself and stores them on disk. It also periodically gets the ``work status`` of the work running on `bar`. Status includes information about the work state and the stdout size.
- `foo` continues streaming stdout results until the size stored on disk matches the StdoutSize reported in `bar`'s status.

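
The polling loop described in the last two steps can be sketched in Python as a small simulation. This is not receptor's implementation: ``RemoteUnit`` and ``stream_results`` are hypothetical stand-ins, and the remote command's progress is faked by advancing a few bytes on every status request.

.. code-block:: python

    from dataclasses import dataclass


    @dataclass
    class RemoteUnit:
        """Stand-in for the work unit running on the remote node."""

        output: bytes      # everything the command will eventually print
        produced: int = 0  # bytes printed so far
        step: int = 4      # bytes produced per poll (arbitrary)

        def status(self) -> dict:
            # Advance the fake command a little on every status request.
            self.produced = min(len(self.output), self.produced + self.step)
            done = self.produced == len(self.output)
            return {"StateName": "Succeeded" if done else "Running",
                    "StdoutSize": self.produced}

        def results(self, start: int) -> bytes:
            # Return the stdout produced so far, from byte offset `start`.
            return self.output[start:self.produced]


    def stream_results(remote: RemoteUnit) -> bytes:
        """Copy remote stdout until the local size matches the reported size."""
        local_stdout = b""
        while True:
            status = remote.status()
            local_stdout += remote.results(len(local_stdout))
            finished = status["StateName"] in ("Succeeded", "Failed")
            if finished and len(local_stdout) == status["StdoutSize"]:
                return local_stdout


    remote = RemoteUnit(output=b"1\n2\n3\n4\n5\n6\n7\n8\n9\n10\n")
    copied = stream_results(remote)
    print(len(copied))

The loop only stops once the remote unit has reached a final state and the locally stored byte count equals the reported ``StdoutSize``, so no trailing output is lost.
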
.. _work_payload:
Payload
--------
In `bar.yml`:

.. code-block:: yaml

    - workType: echopayload
      command: bash
      params: "-c \"while read -r line; do echo ${line^^}; sleep 5; done\""

Here the bash command reads lines from stdin, echoes each line in all uppercase letters, and sleeps for 5 seconds after each one.
Payloads can be passed into receptor using the "--payload" option.

.. code-block:: bash

    $ echo -e "hi\ni am foo\nwhat is your name" | receptorctl --socket /tmp/foo.sock work submit echopayload --node bar --payload - -f
    HI
    I AM FOO
    WHAT IS YOUR NAME

"--payload -" means the payload should be whatever the stdin is, which is piped in from the "echo -e ..." command.
Note: "-f" instructs receptorctl to follow the work unit immediately, i.e. stream results to stdout. One could also use "work results" to stream the results.
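
To make the expected transformation concrete, the following Python sketch mimics what the bash loop does to the payload, leaving out the sleep. ``run_echopayload`` is a hypothetical helper used only for illustration; the real work is done by the bash command on `bar`.

.. code-block:: python

    def run_echopayload(payload: str) -> str:
        """Upper-case each payload line, like ``echo ${line^^}`` in the bash loop."""
        return "".join(line.upper() + "\n" for line in payload.splitlines())


    # Same text that "echo -e" pipes into receptorctl in the example above.
    payload = "hi\ni am foo\nwhat is your name\n"
    result = run_echopayload(payload)
    print(result, end="")
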
Runtime Parameters
-------------------
Work commands can be configured to allow parameters to be passed to commands when work is submitted:

.. code-block:: yaml

    work-commands:
      - workType: listcontents
        command: ls
        allowruntimeparams: true

The ``allowruntimeparams`` option allows the client submitting the work to pass parameters to the work command.
For example, the contents of specific directories can be listed by passing their paths to receptorctl as
positional arguments immediately after the ``workType``:

.. code-block:: bash

    $ receptorctl --socket /tmp/foo.sock work submit --node bar --no-payload -f listcontents /root/ /bin/
    /bin/:
    bash
    sh

    /root/:
    helloworld.sh

Options or flags for the work command must be passed with the ``--param`` option, which
extends the ``params`` setting of the work command. The ``--all`` flag can be passed to the work command this way:

.. code-block:: bash

    $ receptorctl --socket /tmp/foo.sock work submit --node bar --no-payload -f --param params='--all' listcontents /root/
    .
    ..
    .bash_logout
    .bash_profile
    .bashrc
    .cache
    helloworld.sh

Work list
----------

"work list" returns information about all work units that have run on this receptor node. The following shows two work units, ``12L8s8h2`` and ``T0oN0CAp``.

.. code-block:: bash

    $ receptorctl --socket /tmp/foo.sock work list
    {'12L8s8h2': {'Detail': 'exit status 0',
                  'ExtraData': None,
                  'State': 2,
                  'StateName': 'Succeeded',
                  'StdoutSize': 21,
                  'WorkType': 'echoint'},
     'T0oN0CAp': {'Detail': 'Running: PID 1700818',
                  'ExtraData': {'Expiration': '0001-01-01T00:00:00Z',
                                'LocalCancelled': False,
                                'LocalReleased': False,
                                'RemoteNode': 'bar',
                                'RemoteParams': {},
                                'RemoteStarted': True,
                                'RemoteUnitID': 'ATDzdViR',
                                'RemoteWorkType': 'echoint',
                                'TLSClient': ''},
                  'State': 1,
                  'StateName': 'Running',
                  'StdoutSize': 4,
                  'WorkType': 'remote'}}

Notice that ``T0oN0CAp`` was a remote work submission, therefore its work type is "remote". On `bar` there is a local unit ``ATDzdViR``, with the "echoint" work type.
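
The relationship between a local "remote" unit and the unit it tracks can be read out of this structure. The Python sketch below works on a trimmed copy of the output above; ``remote_units`` is a hypothetical helper, and only fields visible in the example are used.

.. code-block:: python

    # Trimmed copy of the "work list" output shown above.
    work_list = {
        "12L8s8h2": {"StateName": "Succeeded", "WorkType": "echoint",
                     "ExtraData": None},
        "T0oN0CAp": {"StateName": "Running", "WorkType": "remote",
                     "ExtraData": {"RemoteNode": "bar",
                                   "RemoteUnitID": "ATDzdViR",
                                   "RemoteWorkType": "echoint"}},
    }


    def remote_units(units: dict) -> dict:
        """Map each local "remote" unit ID to (remote node, remote unit ID)."""
        return {
            unit_id: (info["ExtraData"]["RemoteNode"],
                      info["ExtraData"]["RemoteUnitID"])
            for unit_id, info in units.items()
            if info["WorkType"] == "remote"
        }


    print(remote_units(work_list))
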
Work cancel
------------
Cancel will stop any running work unit. Upon canceling a "remote" work unit, the local node will attempt to connect to the remote node's control service and issue a work cancel. If the remote node is down, receptor will periodically attempt to connect to the remote node to do the cancellation.
Work release
-------------
Release will cancel the work and then delete files on disk associated with that work unit. For remote work submission, release will attempt to delete files both locally and on the remote machine. Like work cancel, the release can be pending if the remote node is down. In that situation, the local files will remain on disk until the remote node can be contacted.
Work force-release
--------------------
It might be preferable to force a release, using the ``work force-release`` command. This will do a one-time attempt to connect to the remote node and issue a work release there. After this one attempt, it will then proceed to delete all local files associated with the work unit.
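
The difference between the two release modes can be sketched as a small Python simulation. It only models the behavior described in this document and is not receptor's code: ``release`` and ``remote_is_down`` are hypothetical names, and ``max_attempts`` exists only to keep the simulation finite, whereas the text above describes periodic retries without a stated limit.

.. code-block:: python

    from typing import Callable, List


    def release(contact_remote: Callable[[], bool], force: bool,
                max_attempts: int = 5) -> List[str]:
        """Simulate releasing a remote work unit and log what happens.

        contact_remote() returns True when the remote node accepted the release.
        """
        log = []
        remote_released = False
        attempts = 1 if force else max_attempts  # force-release tries only once
        for attempt in range(1, attempts + 1):
            if contact_remote():
                remote_released = True
                log.append(f"attempt {attempt}: remote released")
                break
            log.append(f"attempt {attempt}: remote unreachable")
        if remote_released or force:
            log.append("local files deleted")
        else:
            log.append("release pending, local files kept")
        return log


    def remote_is_down() -> bool:
        return False


    print(release(remote_is_down, force=False, max_attempts=3))
    print(release(remote_is_down, force=True))
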
States
---------
A unit of work can be in the Pending, Running, Succeeded, or Failed state.

For local work, the transition from Pending to Running occurs the moment the ``command`` executable is started.

For remote work, the transition from Pending to Running occurs when the status reported from the remote node has a Running state.

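
When processing status data in scripts, the numeric ``State`` field can be mapped to these names. In the sketch below, the values 1 (Running) and 2 (Succeeded) come from the "work list" output in this document; the values for Pending and Failed are assumptions and should be checked against the receptor version in use. ``WorkState`` and ``is_finished`` are hypothetical names.

.. code-block:: python

    from enum import IntEnum


    class WorkState(IntEnum):
        PENDING = 0    # assumed value, not shown in this document
        RUNNING = 1    # seen in the "work list" output
        SUCCEEDED = 2  # seen in the "work list" output
        FAILED = 3     # assumed value, not shown in this document


    def is_finished(state: int) -> bool:
        """Return True once a unit has reached a final state."""
        return WorkState(state) in (WorkState.SUCCEEDED, WorkState.FAILED)


    print(WorkState(1).name, is_finished(1))
    print(WorkState(2).name, is_finished(2))
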
Signed work
------------
Remote work submissions can be digitally signed by the sender. The target node will verify the signature of the work command before starting the work unit.
A *single* pair of RSA public and private keys is created offline and distributed to the nodes. Distribute the public key (PKIX format) to any node that should receive work. Distribute the private key (PKCS1 format) to any node that needs authority to submit work.
The following commands can be used to create keys for signing work:

.. code-block:: bash

    openssl genrsa -out signworkprivate.pem 2048
    openssl rsa -in signworkprivate.pem -pubout -out signworkpublic.pem

In `bar.yml`:

.. code-block:: yaml

    # PKIX
    work-verification:
      publickey: /full/path/signworkpublic.pem

    work-commands:
      - workType: echopayload
        command: bash
        params: "-c \"while read -r line; do echo ${line^^}; sleep 5; done\""
        verifysignature: true

In `foo.yml`:

.. code-block:: yaml

    # PKCS1
    work-signing:
      privatekey: /full/path/signworkprivate.pem
      tokenexpiration: 30m

``tokenexpiration`` determines how long the signature is valid. This expiration directly corresponds to the "expiresAt" field in the generated JSON web token. Valid units include "h" and "m", e.g. 1h30m for one hour and 30 minutes.

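
As an illustration of the duration format, the Python sketch below converts values such as ``30m`` or ``1h30m`` into a ``timedelta``. It handles only the "h" and "m" units named above and is a hypothetical helper, not the parser receptor uses.

.. code-block:: python

    import re
    from datetime import timedelta

    # Optional hours part followed by an optional minutes part, e.g. "1h30m".
    _PATTERN = re.compile(r"(?:(\d+)h)?(?:(\d+)m)?")


    def parse_expiration(text: str) -> timedelta:
        """Parse a duration made of "h" and "m" parts into a timedelta."""
        match = _PATTERN.fullmatch(text)
        if not text or match is None:
            raise ValueError(f"unsupported duration: {text!r}")
        hours, minutes = (int(group) if group else 0 for group in match.groups())
        return timedelta(hours=hours, minutes=minutes)


    print(parse_expiration("30m"))
    print(parse_expiration("1h30m"))
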
Use the "--signwork" parameter to sign the work.

.. code-block:: bash

    $ receptorctl --socket /tmp/foo.sock work submit echoint --node bar --no-payload --signwork

Units on disk
--------------
Netceptor, the main component of receptor that handles mesh connectivity and traffic, operates entirely in memory. That is, it does not store any state information on disk. However, Workceptor functionality is designed to be persistent across receptor restarts. Work units might be running commands that could take hours to complete, so Workceptor needs to store some relevant information on disk in case the receptor process restarts.

By default receptor stores data under ``/tmp/receptor``. This location can be changed by setting the ``datadir`` parameter under ``node`` in the config file.

For a given work unit, receptor will store files in ``{datadir}/{nodeID}/{unitID}/``.
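
The layout can be expressed as a small path helper. The sketch below is illustrative only: ``unit_dir`` is a hypothetical function, and the node and unit IDs are the ones used in the examples in this document.

.. code-block:: python

    from pathlib import PurePosixPath


    def unit_dir(datadir: str, node_id: str, unit_id: str) -> PurePosixPath:
        """Build {datadir}/{nodeID}/{unitID} for a work unit."""
        return PurePosixPath(datadir) / node_id / unit_id


    # Location of the stdout file for unit BsAjS4wi on node foo.
    print(unit_dir("/tmp/receptor", "foo", "BsAjS4wi") / "stdout")
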
Here is the receptor directory tree after running ``work submit echopayload`` described in :ref:`work_payload`.

.. code-block:: bash

    $ tree /tmp/receptor
    /tmp/receptor
    ├── bar
    │   └── NImim5WA
    │       ├── status
    │       ├── status.lock
    │       ├── stdin
    │       └── stdout
    └── foo
        └── BsAjS4wi
            ├── status
            ├── status.lock
            ├── stdin
            └── stdout

The main purpose of work unit ``BsAjS4wi`` on `foo` is to copy stdin, stdout, and status from ``NImim5WA`` on `bar` back to its own working directory.
``stdin`` is a copy of the submitted payload. The contents of this file are the same on both the local (`foo`) and remote (`bar`) machines.

.. code-block:: bash

    $ cat /tmp/receptor/bar/NImim5WA/stdin
    hi
    i am foo
    what is your name

``stdout`` contains the work unit results, that is, the stdout of the command execution. It is also the same on both the local node and the remote node.

.. code-block:: bash

    $ cat /tmp/receptor/bar/NImim5WA/stdout
    HI
    I AM FOO
    WHAT IS YOUR NAME

``status`` contains additional information related to the work unit. The contents of status are different on `foo` and `bar`.

.. code-block:: bash

    $ cat /tmp/receptor/bar/NImim5WA/status
    {
        "State":2,
        "Detail":"exit status 0",
        "StdoutSize":30,
        "WorkType":"echopayload",
        "ExtraData":null
    }

.. code-block:: text

    $ cat /tmp/receptor/foo/BsAjS4wi/status
    {
        "State":2,
        "Detail":"exit status 0",
        "StdoutSize":30,
        "WorkType":"remote",
        "ExtraData":{
            "RemoteNode":"bar",
            "RemoteWorkType":"echopayload",
            "RemoteParams":{},
            "RemoteUnitID":"NImim5WA",
            "RemoteStarted":true,
            "LocalCancelled":false,
            "LocalReleased":false,
            "TLSClient":"",
            "Expiration":"0001-01-01T00:00:00Z"
        }
    }

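
A script that inspects these files could check that the stored stdout is complete and find the remote unit a local unit tracks. The Python sketch below does this for a trimmed copy of the status on `foo`; ``describe`` is a hypothetical helper, and the data is embedded as strings so that the example does not depend on files under ``/tmp/receptor``.

.. code-block:: python

    import json

    # Trimmed copy of the status file of unit BsAjS4wi on foo.
    local_status = json.loads("""
    {"State": 2, "Detail": "exit status 0", "StdoutSize": 30,
     "WorkType": "remote",
     "ExtraData": {"RemoteNode": "bar", "RemoteWorkType": "echopayload",
                   "RemoteUnitID": "NImim5WA", "RemoteStarted": true}}
    """)

    # Contents of the matching stdout file.
    stdout = b"HI\nI AM FOO\nWHAT IS YOUR NAME\n"


    def describe(status: dict, stdout_bytes: bytes) -> dict:
        """Summarize a status dict next to the stdout stored for the unit."""
        extra = status.get("ExtraData") or {}
        remote = None
        if status["WorkType"] == "remote":
            remote = (extra.get("RemoteNode"), extra.get("RemoteUnitID"))
        return {
            "stdout_complete": status["StdoutSize"] == len(stdout_bytes),
            "remote_unit": remote,
        }


    print(describe(local_status, stdout))

The 30 bytes reported in ``StdoutSize`` correspond to the three upper-cased lines including their newline characters.
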

.. image:: remote.png
   :alt: sequence of events during remote work submission

The sequence of events during a remote work submission. Blue lines indicate moments when receptor writes files to disk.