File: README.md

package info (click to toggle)
golang-github-containerd-nri 0.8.0-1
  • links: PTS, VCS
  • area: main
  • in suites: forky, sid
  • size: 1,980 kB
  • sloc: makefile: 117; sh: 45
file content (381 lines) | stat: -rw-r--r-- 14,486 bytes parent folder | download
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
# Node Resource Interface, Revisited

[![PkgGoDev](https://pkg.go.dev/badge/github.com/containerd/nri)](https://pkg.go.dev/github.com/containerd/nri)
[![Build Status](https://github.com/containerd/nri/workflows/CI/badge.svg)](https://github.com/containerd/nri/actions?query=workflow%3ACI)
[![codecov](https://codecov.io/gh/containerd/nri/branch/main/graph/badge.svg)](https://codecov.io/gh/containerd/nri)
[![Go Report Card](https://goreportcard.com/badge/github.com/containerd/nri)](https://goreportcard.com/report/github.com/containerd/nri)

*This project is currently in DRAFT status*

## Goal

NRI allows plugging domain- or vendor-specific custom logic into OCI-
compatible runtimes. This logic can make controlled changes to containers
or perform extra actions outside the scope of OCI at certain points in a
containers lifecycle. This can be used, for instance, for improved allocation
and management of devices and other container resources.

NRI defines the interfaces and implements the common infrastructure for
enabling such pluggable runtime extensions, NRI plugins. This also keeps
the plugins themselves runtime-agnostic.

The goal is to enable NRI support in the most commonly used OCI runtimes,
[containerd](https://github.com/containerd/containerd) and
[CRI-O](https://github.com/cri-o/cri-o).

## Background

The revisited API is a major rewrite of NRI. It changes the scope of NRI
and how it gets integrated into runtimes. It reworks how plugins are
implemented, how they communicate with the runtime, and what kind of
changes they can make to containers.

[NRI v0.1.0](README-v0.1.0.md) used an OCI hook-like one-shot plugin invocation
mechanism where a separate instance of a plugin was spawned for every NRI
event. This instance then used its standard input and output to receive a
request and provide a response, both as JSON data.

Plugins in NRI are daemon-like entities. A single instance of a plugin is
now responsible for handling the full stream of NRI events and requests. A
unix-domain socket is used as the transport for communication. Instead of
JSON requests and responses NRI is defined as a formal, protobuf-based
'NRI plugin protocol' which is compiled into ttRPC bindings. This should
result in improved communication efficiency with lower per-message overhead,
and enable straightforward implementation of stateful NRI plugins.

## Components

The NRI implementation consists of a number of components. The core of
these are essential for implementing working end-to-end NRI support in
runtimes. These core components are the actual [NRI protocol](pkg/api),
and the [NRI runtime adaptation](pkg/adaptation).

Together these establish the model of how a runtime interacts with NRI and
how plugins interact with containers in the runtime through NRI. They also
define under which conditions plugins can make changes to containers and
the extent of these changes.

The rest of the components are the [NRI plugin stub library](pkg/stub)
and [some sample NRI plugins](plugins). [Some plugins](plugins/hook-injector)
implement useful functionality in real world scenarios. [A few](plugins/differ)
[others](plugins/logger) are useful for debugging. All of the sample plugins
serve as practical examples of how the stub library can be used to implement
NRI plugins.

## Protocol, Plugin API

The core of NRI is defined by a protobuf [protocol definition](pkg/api/api.proto)
of the low-level plugin API. The API defines two services, Runtime and Plugin.

The Runtime service is the public interface runtimes expose for NRI plugins. All
requests on this interface are initiated by the plugin. The interface provides
functions for

  - initiating plugin registration
  - requesting unsolicited updates to containers

The Plugin service is the public interface NRI uses to interact with plugins.
All requests on this interface are initiated by NRI/the runtime. The interface
provides functions for

  - configuring the plugin
  - getting initial list of already existing pods and containers
  - hooking the plugin into pod/container lifecycle events
  - shutting down the plugin

### Plugin Registration

Before a plugin can start receiving and processing container events, it needs
to register itself with NRI. During registration the plugin and NRI perform a
handshake sequence which consists of the following steps:

  1. the plugin identifies itself to the runtime
  2. NRI provides plugin-specific configuration data to the plugin
  3. the plugin subscribes to pod and container lifecycle events of interest
  4. NRI sends list of existing pods and containers to plugin
  5. the plugin requests any updates deemed necessary to existing containers

The plugin identifies itself to NRI by a plugin name and a plugin index. The
plugin index is used by NRI to determine in which order the plugin is hooked
into pod and container lifecycle event processing with respect to any other
plugins.

The plugin name is used to pick plugin-specific data to send to the plugin
as configuration. This data is only present if the plugin has been launched
by NRI. If the plugin has been externally started it is expected to acquire
its configuration also by external means. The plugin subscribes to pod and
container lifecycle events of interest in its response to configuration.

As the last step in the registration and handshaking process, NRI sends the
full set of pods and containers known to the runtime. The plugin can request
updates it considers necessary to any of the known containers in response.

Once the handshake sequence is over and the plugin has registered with NRI,
it will start receiving pod and container lifecycle events according to its
subscription.

### Pod Data and Available Lifecycle Events

<details>
<summary>NRI Pod Lifecycle Events</summary>
<p align="center">
<img src="./docs/nri-pod-lifecycle.svg" title="NRI Pod Lifecycle Events">
</p>
</details>

NRI plugins can subscribe to the following pod lifecycle events:

  - creation
  - stopping
  - removal

The following pieces of pod metadata are available to plugins in NRI:

  - ID
  - name
  - UID
  - namespace
  - labels
  - annotations
  - cgroup parent directory
  - runtime handler name

### Container Data and Available Lifecycle Events

<details>
<summary>NRI Container Lifecycle Events</summary>
<p align="center">
<img src="./docs/nri-container-lifecycle.svg" title="NRI Container Lifecycle Events">
</p>
</details>

NRI plugins can subscribe to the following container lifecycle events:

  - creation (*)
  - post-creation
  - starting
  - post-start
  - updating (*)
  - post-update
  - stopping (*)
  - removal

*) Plugins can request adjustment or updates to containers in response to
these events.

The following pieces of container metadata are available to plugins in NRI:

  - ID
  - pod ID
  - name
  - state
  - labels
  - annotations
  - command line arguments
  - environment variables
  - mounts
  - OCI hooks
  - rlimits
  - linux
    - namespace IDs
    - devices
    - resources
      - memory
        - limit
        - reservation
        - swap limit
        - kernel limit
        - kernel TCP limit
        - swappiness
        - OOM disabled flag
        - hierarchical accounting flag
        - hugepage limits
      - CPU
        - shares
        - quota
        - period
        - realtime runtime
        - realtime period
        - cpuset CPUs
        - cpuset memory
      - Block I/O class
      - RDT class

Apart from data identifying the container, these pieces of information
represent the corresponding data in the container's OCI Spec.

### Container Adjustment

During container creation plugins can request changes to the following
container parameters:

  - annotations
  - mounts
  - environment variables
  - OCI hooks
  - rlimits
  - linux
    - devices
    - resources
      - memory
        - limit
        - reservation
        - swap limit
        - kernel limit
        - kernel TCP limit
        - swappiness
        - OOM disabled flag
        - hierarchical accounting flag
        - hugepage limits
      - CPU
        - shares
        - quota
        - period
        - realtime runtime
        - realtime period
        - cpuset CPUs
        - cpuset memory
      - Block I/O class
      - RDT class

### Container Updates

Once a container has been created plugins can request updates to them.
These updates can be requested in response to another containers creation
request, in response to any containers update request, in response to any
containers stop request, or they can be requested as part of a separate
unsolicited container update request. The following container parameters
can be updated this way:

  - resources
    - memory
      - limit
      - reservation
      - swap limit
      - kernel limit
      - kernel TCP limit
      - swappiness
      - OOM disabled flag
      - hierarchical accounting flag
      - hugepage limits
    - CPU
      - shares
      - quota
      - period
      - realtime runtime
      - realtime period
      - cpuset CPUs
      - cpuset memory
    - Block I/O class
    - RDT class


## Runtime Adaptation

The NRI [runtime adaptation](pkg/adaptation) package is the interface
runtimes use to integrate to NRI and interact with NRI plugins. It
implements basic plugin discovery, startup and configuration. It also
provides the functions necessary to hook NRI plugins into lifecycle
events of pods and containers from the runtime.

The package hides the fact that multiple NRI plugins might be processing
any single pod or container lifecycle event. It takes care of invoking
plugins in the correct order and combining responses by multiple plugins
into a single one. While combining responses, the package detects any
unintentional conflicting changes made by multiple plugins to a single
container and flags such an event as an error to the runtime.

## Wrapped OCI Spec Generator

The [OCI Spec generator](pkg/runtime-tools/generate) package wraps the
[corresponding package](https://github.com/opencontainers/runtime-tools/tree/master/generate)
and adds functions for applying NRI container adjustments and updates to
OCI Specs. This package can be used by runtime NRI integration code to
apply NRI responses to containers.

## Plugin Stub Library

The plugin stub hides many of the low-level details of implementing an NRI
plugin. It takes care of connection establishment, plugin registration,
configuration, and event subscription. All [sample plugins](plugins)
are implemented using the stub. Any of these can be used as a tutorial on
how the stub library should be used.

## Sample Plugins

The following sample plugins exist for NRI:

  - [logger](plugins/logger)
  - [differ](plugins/differ)
  - [device injector](plugins/device-injector)
  - [network device injector](plugins/network-device-injector)
  - [OCI hook injector](plugins/hook-injector)
  - [ulimit adjuster](plugins/ulimit-adjuster)
  - [NRI v0.1.0 plugin adapter](plugins/v010-adapter)

Please see the documentation of these plugins for further details
about what and how each of these plugins can be used for.

## Security Considerations

From a security perspective NRI plugins should be considered part of the
container runtime. NRI does not implement granular access control to the
functionality it offers. Access to NRI is controlled by restricting access
to the systemwide NRI socket. If a process can connect to the NRI socket
and send data, it has access to the full scope of functionality available
via NRI.

In particular this includes

  - injection of OCI hooks, which allow for arbitrary execution of processes with the same privilege level as the container runtime
  - arbitrary changes to mounts, including new bind-mounts, changes to the proc, sys, mqueue, shm, and tmpfs mounts
  - the addition or removal of arbitrary devices
  - arbitrary changes to the limits for memory, CPU, block I/O, and RDT resources available, including the ability to deny service by setting limits very low

The same precautions and principles apply to protecting the NRI socket as
to protecting the socket of the runtime itself. Unless it already exists,
NRI itself creates the directory to hold its socket with permissions that
allow access only for the user ID of the runtime process. By default this
limits NRI access to processes running as root (UID 0). Changing the default
socket permissions is strongly advised against. Enabling more permissive
access control to NRI should never be done without fully understanding the
full implications and potential consequences to container security.

### Plugins as Kubernetes DaemonSets

When the runtime manages pods and containers in a Kubernetes cluster, it
is convenient to deploy and manage NRI plugins using Kubernetes DaemonSets.
Among other things, this requires bind-mounting the NRI socket into the
filesystem of a privileged container running the plugin. Similar precautions
apply and the same care should be taken for protecting the NRI socket and
NRI plugins as for the kubelet DeviceManager socket and Kubernetes Device
Plugins.

The cluster configuration should make sure that unauthorized users cannot
bind-mount host directories and create privileged containers which gain
access to these sockets and can act as NRI or Device Plugins. See the
[related documentation](https://kubernetes.io/docs/concepts/security/)
and [best practices](https://kubernetes.io/docs/setup/best-practices/enforcing-pod-security-standards/)
about Kubernetes security.

## API Stability

NRI APIs should not be considered stable yet. We try to avoid unnecessarily
breaking APIs, especially the Stub API which plugins use to interact with NRI.
However, before NRI reaches a stable 1.0.0 release, this is only best effort
and cannot be guaranteed. Meanwhile we do our best to document any API breaking
changes for each release in the [release notes](RELEASES.md).

The current target for a stable v1 API through a 1.0.0 release is the end of
this year.

## Project details

nri is a containerd sub-project, licensed under the [Apache 2.0 license](./LICENSE).
As a containerd sub-project, you will find the:

 * [Project governance](https://github.com/containerd/project/blob/main/GOVERNANCE.md),
 * [Maintainers](https://github.com/containerd/project/blob/main/MAINTAINERS),
 * and [Contributing guidelines](https://github.com/containerd/project/blob/main/CONTRIBUTING.md)

information in our [`containerd/project`](https://github.com/containerd/project) repository.