# Prekill Hooks

Prekill hooks are an experimental, generic, pluggable way to do work just before
oomd kills a cgroup.

## Background

Owners of an OOM-killed process may want a heap dump or other memory
statistics from the program at the time it died, to gain insight into
potential misbehavior.

Prekill hooks direct oomd to collect these metrics, or do other arbitrary work,
just before it kills a cgroup. They are a generic interface, not tied to any
particular metric collection approach or, indeed, to metric collection at all.

Hooks may time out, and should not be assumed to run to completion. Process
owners should know the kernel may OOM-kill their code separately from oomd, in
which case prekill hooks will not run at all.

## Configuration

Prekill hooks are configured in the oomd.json config file under a top-level
"prekill_hooks" key, adjacent to "rulesets".

Prekill hooks are at the top level because they run on every kill oomd makes,
across all rulesets.

Prekill hooks are not interchangeable with plugins but are configured in
the same way, via "name" and "args". Hooks can't be used where plugins are
expected, and vice versa.

  {
      "rulesets": [
        ...
      ],
      "prekill_hooks": [
          {
              "name": "hypothetical_prekill_hook",
              "args": {
                "cgroup": "/foo,/bar/*/baz"
              }
          }
      ]
  }

On a kill, oomd runs the first configured prekill hook whose "cgroup" arg
matches the path of the cgroup to be killed. At most one prekill hook runs per
kill.

Dropins may contain prekill_hooks. Dropped-in prekill hooks get priority over
those in the base configuration. Like ruleset dropins, prekill hook dropins
added later get higher priority.

The "cgroup" arg is a list of comma-separated patterns. Patterns are cgroup
paths, except that path components may be "*". The only glob supported is "*",
and it stands for a single whole path component.

A cgroup path matches a pattern if it 1) exactly matches the pattern, 2) is an
ancestor of a path that would match the pattern, or 3) is a descendant of a path
that matches the pattern.

To run on all kills, set `"cgroup": "/"`.
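
The matching rules and the first-match selection described above can be sketched
in a few lines of C++. This is an illustration only, not oomd's actual
implementation: the names `splitNonEmpty`, `matchesPattern`, `matchesCgroupArg`,
`HookConfig` and `firstMatchingHook` are hypothetical, and the sketch assumes
patterns contain no whitespace around the commas. It relies on the observation
that all three rules (exact match, ancestor, descendant) reduce to comparing the
path and the pattern component by component over their shared prefix length.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Split on a delimiter, dropping empty pieces, so "/foo/bar" becomes
// {"foo", "bar"} and "/" becomes an empty list.
std::vector<std::string> splitNonEmpty(const std::string& s, char delim) {
  std::vector<std::string> out;
  std::stringstream ss(s);
  std::string piece;
  while (std::getline(ss, piece, delim)) {
    if (!piece.empty()) {
      out.push_back(piece);
    }
  }
  return out;
}

// True if the path equals the pattern, is an ancestor of a path that would
// match it, or is a descendant of a path that matches it. Only the components
// both sides have in common need to agree; "*" agrees with any one component.
bool matchesPattern(const std::string& cgroup_path, const std::string& pattern) {
  const auto path = splitNonEmpty(cgroup_path, '/');
  const auto pat = splitNonEmpty(pattern, '/');
  const std::size_t n = std::min(path.size(), pat.size());
  for (std::size_t i = 0; i < n; ++i) {
    if (pat[i] != "*" && pat[i] != path[i]) {
      return false;
    }
  }
  return true;
}

// The "cgroup" arg is a comma-separated list; any matching pattern suffices.
bool matchesCgroupArg(const std::string& cgroup_path, const std::string& arg) {
  for (const auto& pattern : splitNonEmpty(arg, ',')) {
    if (matchesPattern(cgroup_path, pattern)) {
      return true;
    }
  }
  return false;
}

// Hypothetical stand-in for one entry of "prekill_hooks".
struct HookConfig {
  std::string name;
  std::string cgroup_arg;
};

// Returns the first configured hook that matches, or nullptr if none does.
const HookConfig* firstMatchingHook(
    const std::vector<HookConfig>& hooks, const std::string& cgroup_path) {
  for (const auto& hook : hooks) {
    if (matchesCgroupArg(cgroup_path, hook.cgroup_arg)) {
      return &hook;
    }
  }
  return nullptr;
}
```

With the pattern list "/foo,/bar/*/baz" from the example config, this sketch
accepts /foo, /foo/child, /bar and /bar/x/baz, and rejects /bar/x/qux. A pattern
of "/" has no components, so it accepts every path.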

Rulesets may set a "prekill_hook_timeout" in seconds. If unset, the default is 5
seconds.

  {
      "rulesets": [
          {
              "name": "memory pressure protection",
              "prekill_hook_timeout": "30",
              "detectors": [...],
              "actions": [...]
          }
      ],
      "prekill_hooks": [...]
  }

The prekill hook timeout sets a window for all prekill hooks in an action
chain to finish running. For example, consider:
- a ruleset with two kill plugin actions and a 5s prekill hook timeout
- the action chain fires
- the first action targets /foo.slice and fires a prekill hook on it
- the prekill hook finishes in 3s
- /foo.slice fails to die, so the first action returns CONTINUE
- the second kill plugin runs, targets /bar.slice, and fires a prekill hook

The second prekill hook only has 2s to run before it times out, since it's been
3s (or more) since the action chain started, and the action chain set a 5s
max window for prekill hooks to run.
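
The arithmetic in this example can be written down as a small helper. The
function name `remainingHookBudget` and its parameters are hypothetical, and
this is only a sketch of the shared-window idea, not a description of how oomd
tracks the deadline internally.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>

using Clock = std::chrono::steady_clock;

// Time left for prekill hooks in an action chain that started at
// `chain_start`, given the ruleset's prekill hook timeout. The window is
// shared by every hook fired in the chain, so later hooks get less time.
// Never returns a negative duration.
std::chrono::milliseconds remainingHookBudget(
    Clock::time_point chain_start,
    Clock::time_point now,
    std::chrono::milliseconds timeout) {
  const auto elapsed =
      std::chrono::duration_cast<std::chrono::milliseconds>(now - chain_start);
  return std::max(std::chrono::milliseconds::zero(), timeout - elapsed);
}
```

For the scenario above, a 5s timeout with 3s already elapsed leaves 2s for the
second hook. Once the elapsed time exceeds the timeout, the budget is zero.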

## API

Prekill hook implementers should subclass PrekillHook and PrekillHookInvocation
and implement these core methods:

      /* same as BasePlugin::init(args, context) */
      int PrekillHook::init(
          const Engine::PluginArgs& args,
          const PluginConstructionContext& context);

      /* main method for a hook, called just before the cgroup is killed */
      std::unique_ptr<PrekillHookInvocation> PrekillHook::fire(
            const CgroupContext&);

      /* Invocation object returned from fire() is polled to see when the hook
         has finished running, and killing may begin */
      bool PrekillHookInvocation::didFinish();

      /* Invocation object is destructed either when it finishes, or early
         if it times out */
      PrekillHookInvocation::~PrekillHookInvocation();

Hooks are kicked off by a call to PrekillHook::fire(cgroup), which is passed the
cgroup oomd intends to kill.

Oomd is designed as a single-threaded event loop, so fire() shouldn't do long
work that blocks the main thread. Instead, it vends an Invocation object which
will be polled every main loop tick (typically 1s) for didFinish(). The cgroup
will not be killed until didFinish() returns true or the timeout is reached.

If oomd determines that a PrekillHookInvocation has timed out, it destructs the
object, which calls PrekillHookInvocation::~PrekillHookInvocation(). The
destructor is called before the cgroup is killed, regardless of whether the
hook timed out or didFinish() returned true.

All methods (fire, didFinish, ~PrekillHookInvocation) will always be called
on the main thread and should not block for nontrivial time. If blocking work
is needed, it should be done in other threads, possibly spawned in
PrekillHook::init().
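
As a rough illustration of this non-blocking contract, the sketch below moves
the slow work to a worker thread and lets didFinish() poll an atomic flag. It
does not use oomd's real headers: the `CgroupContext`, `PrekillHookInvocation`
and `PrekillHook` declarations here are minimal stand-ins, and `SharedState`,
`SleepyInvocation` and `SleepyHook` are hypothetical names. init() is left out
because its argument types are not reproduced here, and the thread is spawned
per invocation for brevity rather than in init().

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <memory>
#include <string>
#include <thread>
#include <utility>

// Stand-ins so the sketch compiles alone. NOT oomd's real declarations.
struct CgroupContext {
  std::string path;  // hypothetical field
};

class PrekillHookInvocation {
 public:
  virtual ~PrekillHookInvocation() = default;
  virtual bool didFinish() = 0;
};

class PrekillHook {
 public:
  virtual ~PrekillHook() = default;
  virtual std::unique_ptr<PrekillHookInvocation> fire(
      const CgroupContext& cgroup) = 0;
};

// Shared by the main thread and the worker. Held through a shared_ptr so the
// worker can outlive the invocation safely after a timeout.
struct SharedState {
  std::atomic<bool> done{false};
  std::atomic<bool> cancelled{false};
  std::string report;  // written by the worker before `done` is set
};

class SleepyInvocation : public PrekillHookInvocation {
 public:
  SleepyInvocation(std::string path, std::chrono::milliseconds work)
      : state_(std::make_shared<SharedState>()) {
    auto state = state_;
    // The invocation copies everything it needs (here, the path), since the
    // hook that fired it may be gone before the work completes.
    std::thread([state, path = std::move(path), work] {
      const auto deadline = std::chrono::steady_clock::now() + work;
      // Simulated slow work, done in slices so cancellation is noticed.
      while (!state->cancelled &&
             std::chrono::steady_clock::now() < deadline) {
        std::this_thread::sleep_for(std::chrono::milliseconds(1));
      }
      state->report = "collected stats for " + path;
      state->done = true;
    }).detach();
  }

  // Runs on the main thread: only flips a flag, never joins or waits.
  ~SleepyInvocation() override { state_->cancelled = true; }

  // Polled on the main thread: a single atomic load.
  bool didFinish() override { return state_->done.load(); }

 private:
  std::shared_ptr<SharedState> state_;
};

class SleepyHook : public PrekillHook {
 public:
  std::unique_ptr<PrekillHookInvocation> fire(
      const CgroupContext& cgroup) override {
    return std::make_unique<SleepyInvocation>(
        cgroup.path, std::chrono::milliseconds(20));
  }
};
```

Detaching the thread and sharing state through a shared_ptr is one way to keep
the destructor from blocking. A design that joins the worker in the destructor
would stall the main loop whenever a hook times out.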

## Guarantees

- At most one prekill hook will be running per ruleset at any moment. There may
  be multiple instances of a prekill hook running at the same time, as part of
  different rulesets.
- If a prekill hook is run on a cgroup, the cgroup is not guaranteed to die.
  Oomd may fail to kill it. (Oomd will then pick a different cgroup to try to
  kill, and again call the prekill hook on its new target before trying to kill
  it.)
- PrekillHooks are not guaranteed to outlive the Invocations they fire().
  Invocations should encapsulate any data they need to run to completion.