# Prekill Hooks
Prekill hooks are an experimental, generic, pluggable way to do work just before
oomd kills a cgroup.
## Background
Owners of an OOM-killed process may want a heap dump or other memory
statistics captured at the time the program died, to gain insight into
potential misbehavior.

Prekill hooks direct oomd to collect these metrics, or do other arbitrary work,
just before it kills a cgroup. The interface is generic: it is not tied to any
particular metric collection approach or, indeed, to metric collection at all.

Hooks may time out, and should not be assumed to run to completion. Process
owners should also know that the kernel may OOM-kill their code independently
of oomd, in which case prekill hooks will not run at all.
## Configuration
Prekill hooks are configured in the oomd.json config file, under a top-level
"prekill_hooks" key adjacent to "rulesets".
Prekill hooks are at the top level because they run on every kill oomd makes,
across all rulesets.
Prekill hooks are not interchangeable with plugins but are configured in
the same way, via "name" and "args". Hooks can't be used where plugins are
expected, and vice versa.

    {
        "rulesets": [
            ...
        ],
        "prekill_hooks": [
            {
                "name": "hypothetical_prekill_hook",
                "args": {
                    "cgroup": "/foo,/bar/*/baz"
                }
            }
        ]
    }

On a kill, oomd runs the first configured prekill hook whose "cgroup" arg
matches the path of the cgroup to be killed. At most one prekill hook runs per
kill.
Dropins may contain prekill_hooks. Dropped-in prekill hooks get priority over
those in the base configuration. Like ruleset dropins, prekill hook dropins
added later get higher priority.
The "cgroup" arg is a list of comma-separated patterns. Patterns are cgroup
paths, except that a path component may be "*". The only supported glob is a
"*" standing in for a single whole path component.
A cgroup path matches a pattern if it 1) exactly matches the pattern, 2) is an
ancestor of a path that would match the pattern, or 3) is a descendant of a path
that matches the pattern.
To run on all kills, set `"cgroup": "/"`.
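
The matching rules above can be sketched in a few lines of C++. This is a minimal illustration written from the description in this section, not oomd's own implementation; the function names (`splitNonEmpty`, `matchesPattern`, `matchesCgroupArg`, `firstMatchingHook`) are hypothetical. It relies on the observation that "exact match", "ancestor of a match", and "descendant of a match" all reduce to the same check: every path component the cgroup path and the pattern have in common must agree, with "*" agreeing with anything.

```cpp
#include <algorithm>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Split on a delimiter, dropping empty pieces, so "/" yields no components
// and "/foo/bar" yields {"foo", "bar"}.
std::vector<std::string> splitNonEmpty(const std::string& s, char delim) {
  std::vector<std::string> out;
  std::stringstream ss(s);
  std::string piece;
  while (std::getline(ss, piece, delim)) {
    if (!piece.empty()) {
      out.push_back(piece);
    }
  }
  return out;
}

// True if `path` equals, is an ancestor of, or is a descendant of a path the
// single `pattern` matches. Only the shared leading components are compared.
bool matchesPattern(const std::string& path, const std::string& pattern) {
  const auto pathParts = splitNonEmpty(path, '/');
  const auto patternParts = splitNonEmpty(pattern, '/');
  const std::size_t common = std::min(pathParts.size(), patternParts.size());
  for (std::size_t i = 0; i < common; ++i) {
    if (patternParts[i] != "*" && patternParts[i] != pathParts[i]) {
      return false;
    }
  }
  return true;
}

// True if `path` matches any pattern in a comma-separated "cgroup" arg.
bool matchesCgroupArg(const std::string& path, const std::string& cgroupArg) {
  for (const auto& pattern : splitNonEmpty(cgroupArg, ',')) {
    if (matchesPattern(path, pattern)) {
      return true;
    }
  }
  return false;
}

// Index of the first hook, in priority order, whose "cgroup" arg matches
// `path`, or -1 if none does (hypothetical helper, for illustration only).
int firstMatchingHook(
    const std::vector<std::string>& hookCgroupArgs,
    const std::string& path) {
  for (std::size_t i = 0; i < hookCgroupArgs.size(); ++i) {
    if (matchesCgroupArg(path, hookCgroupArgs[i])) {
      return static_cast<int>(i);
    }
  }
  return -1;
}
```

Under this sketch, with `"cgroup": "/foo,/bar/*/baz"`, the path `/foo` matches exactly, `/bar` matches as an ancestor, `/bar/x/baz/child` matches as a descendant, and `/bar/x/qux` does not match. The pattern `/` has no components, so every path matches it.
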
Rulesets may set a "prekill_hook_timeout" in seconds. If unset, the default is 5
seconds.

    {
        "rulesets": [
            {
                "name": "memory pressure protection",
                "prekill_hook_timeout": "30",
                "detectors": [...],
                "actions": [...]
            }
        ],
        "prekill_hooks": [...]
    }

The prekill hook timeout sets a window for all prekill hooks in an action
chain to finish running. For example, consider:
- a ruleset with two kill plugin actions and a 5s prekill hook timeout
- the action chain fires
- the first action targets /foo.slice and fires a prekill hook on it
- the prekill hook finishes in 3s
- /foo.slice fails to die, so the first action returns CONTINUE
- the second kill plugin runs, targets /bar.slice, and fires a prekill hook

The second prekill hook has at most 2s to run before it times out: at least 3s
have elapsed since the action chain started, and the ruleset allows a 5s
window for all prekill hooks in the chain.
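
The arithmetic in this example amounts to subtracting the time already elapsed in the action chain from the ruleset's window. A minimal sketch, assuming the window is measured from the start of the action chain as described above; `remainingHookBudget` is a hypothetical name, not part of oomd:

```cpp
#include <algorithm>
#include <chrono>

// Time left for a prekill hook that starts `elapsed` after the action chain
// began, given the ruleset's total prekill hook window. Clamped at zero so a
// hook fired after the window has closed gets no time at all.
std::chrono::seconds remainingHookBudget(
    std::chrono::seconds window,
    std::chrono::seconds elapsed) {
  return std::max(window - elapsed, std::chrono::seconds(0));
}
```

With a 5s window and 3s elapsed this yields 2s, matching the example. Once the elapsed time reaches or exceeds the window, the result is zero.
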
## API
Prekill hook implementers should subclass PrekillHook and PrekillHookInvocation
and implement these core methods:

    /* same as BasePlugin::init(args, context) */
    int PrekillHook::init(
        const Engine::PluginArgs& args,
        const PluginConstructionContext& context);

    /* main method for a hook, called just before the cgroup is killed */
    std::unique_ptr<PrekillHookInvocation> PrekillHook::fire(
        const CgroupContext&);

    /* Invocation object returned from fire() is polled to see when the hook
       has finished running, and killing may begin */
    bool PrekillHookInvocation::didFinish();

    /* Invocation object is destructed either when it finishes, or early
       if it times out */
    PrekillHookInvocation::~PrekillHookInvocation();

oomd kicks off a hook by calling PrekillHook::fire(cgroup), passing the cgroup
it intends to kill.

oomd is designed as a single-threaded event loop, so fire() shouldn't do long
work that blocks the main thread. Instead, it returns an Invocation object,
whose didFinish() is polled on every main loop tick (typically 1s). The cgroup
will not be killed until didFinish() returns true or the timeout is reached.

If oomd determines that a PrekillHookInvocation has timed out, the invocation
is destructed, which calls PrekillHookInvocation::~PrekillHookInvocation().
The destructor is always called before the cgroup is killed, regardless of
whether the hook timed out or didFinish() returned true.
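
The polling contract can be illustrated with a simplified sketch. Everything here is an assumption made for illustration: `PrekillHookInvocation` is a reduced stand-in for oomd's class, `CountdownInvocation` and `pollUntilDone` are hypothetical, and the timeout is counted in ticks rather than wall-clock time. The real event loop does not spin in a loop like this; it returns to its other work between ticks.

```cpp
#include <memory>

// Stand-in for oomd's PrekillHookInvocation, reduced to the polled method.
class PrekillHookInvocation {
 public:
  virtual ~PrekillHookInvocation() = default;
  virtual bool didFinish() = 0;
};

// Hypothetical invocation that reports completion on its Nth poll and
// records its own destruction so the ordering can be observed.
class CountdownInvocation : public PrekillHookInvocation {
 public:
  CountdownInvocation(int pollsNeeded, bool* destroyed)
      : pollsLeft_(pollsNeeded), destroyed_(destroyed) {}
  ~CountdownInvocation() override { *destroyed_ = true; }
  bool didFinish() override { return --pollsLeft_ <= 0; }

 private:
  int pollsLeft_;
  bool* destroyed_;
};

enum class HookOutcome { Finished, TimedOut };

// Polls once per tick, for at most maxTicks ticks. The invocation is owned by
// this function, so it is destroyed on return for either outcome, before the
// caller goes on to kill the cgroup.
HookOutcome pollUntilDone(
    std::unique_ptr<PrekillHookInvocation> invocation,
    int maxTicks) {
  for (int tick = 0; tick < maxTicks; ++tick) {
    if (invocation->didFinish()) {
      return HookOutcome::Finished;
    }
    // A real event loop would yield here until the next tick.
  }
  return HookOutcome::TimedOut;
}
```

Holding the invocation in a `std::unique_ptr` that goes out of scope when polling ends is one way to make sure the destructor runs on both the finished and the timed-out path.
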

All methods (fire, didFinish, ~PrekillHookInvocation) will always be called
on the main thread and should not block for any nontrivial time. If blocking
work is needed, it should be done in other threads, possibly spawned in
PrekillHook::init().
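
One possible shape for an invocation that keeps blocking work off the main thread is sketched below. It is an assumption-laden illustration, not oomd code: `PrekillHookInvocation` is a reduced stand-in, and `SharedState` and `ThreadedInvocation` are hypothetical names. For brevity the worker thread is spawned per invocation rather than in PrekillHook::init(). The invocation copies the cgroup path instead of pointing back at the hook that created it, and its destructor signals cancellation rather than joining, so it never waits on the worker.

```cpp
#include <atomic>
#include <functional>
#include <memory>
#include <string>
#include <thread>
#include <utility>

// Stand-in for oomd's PrekillHookInvocation, reduced to the polled method.
class PrekillHookInvocation {
 public:
  virtual ~PrekillHookInvocation() = default;
  virtual bool didFinish() = 0;
};

// State shared between the invocation (main thread) and its worker thread.
// Both sides hold a shared_ptr, so it outlives whichever finishes first.
struct SharedState {
  std::atomic<bool> done{false};
  std::atomic<bool> cancelled{false};
};

class ThreadedInvocation : public PrekillHookInvocation {
 public:
  using Work =
      std::function<void(const std::string&, const std::atomic<bool>&)>;

  // `work` runs on a detached worker thread. It receives a private copy of
  // the cgroup path and a flag it should check to stop early on timeout.
  ThreadedInvocation(std::string cgroupPath, Work work)
      : state_(std::make_shared<SharedState>()) {
    auto state = state_;
    std::thread([state, path = std::move(cgroupPath), w = std::move(work)] {
      w(path, state->cancelled);
      state->done.store(true);
    }).detach();
  }

  // Non-blocking: only reads a flag.
  bool didFinish() override { return state_->done.load(); }

  // Non-blocking: asks the worker to stop instead of joining it.
  ~ThreadedInvocation() override { state_->cancelled.store(true); }

 private:
  std::shared_ptr<SharedState> state_;
};
```

A hook's fire() could return such an object wrapped in a `std::unique_ptr<PrekillHookInvocation>`. The work function is expected to check the cancellation flag periodically, since a detached worker that ignores it will keep running after the invocation has been destroyed.
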
## Guarantees
- At most one prekill hook will be running per ruleset at any moment. There may
be multiple instances of a prekill hook running at the same time, as part of
different rulesets.
- If a prekill hook is run on a cgroup, the cgroup is not guaranteed to die.
Oomd may fail to kill it. (Oomd will then pick a different cgroup to try to
kill, and again call the prekill hook on its new target before trying to kill
it.)
- PrekillHooks are not guaranteed to outlive the Invocations they fire().
Invocations should encapsulate any data they need to run to completion.