1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381
|
# Node Resource Interface, Revisited
[](https://pkg.go.dev/github.com/containerd/nri)
[](https://github.com/containerd/nri/actions?query=workflow%3ACI)
[](https://codecov.io/gh/containerd/nri)
[](https://goreportcard.com/report/github.com/containerd/nri)
*This project is currently in DRAFT status*
## Goal
NRI allows plugging domain- or vendor-specific custom logic into OCI-
compatible runtimes. This logic can make controlled changes to containers
or perform extra actions outside the scope of OCI at certain points in a
containers lifecycle. This can be used, for instance, for improved allocation
and management of devices and other container resources.
NRI defines the interfaces and implements the common infrastructure for
enabling such pluggable runtime extensions, NRI plugins. This also keeps
the plugins themselves runtime-agnostic.
The goal is to enable NRI support in the most commonly used OCI runtimes,
[containerd](https://github.com/containerd/containerd) and
[CRI-O](https://github.com/cri-o/cri-o).
## Background
The revisited API is a major rewrite of NRI. It changes the scope of NRI
and how it gets integrated into runtimes. It reworks how plugins are
implemented, how they communicate with the runtime, and what kind of
changes they can make to containers.
[NRI v0.1.0](README-v0.1.0.md) used an OCI hook-like one-shot plugin invocation
mechanism where a separate instance of a plugin was spawned for every NRI
event. This instance then used its standard input and output to receive a
request and provide a response, both as JSON data.
Plugins in NRI are daemon-like entities. A single instance of a plugin is
now responsible for handling the full stream of NRI events and requests. A
unix-domain socket is used as the transport for communication. Instead of
JSON requests and responses NRI is defined as a formal, protobuf-based
'NRI plugin protocol' which is compiled into ttRPC bindings. This should
result in improved communication efficiency with lower per-message overhead,
and enable straightforward implementation of stateful NRI plugins.
## Components
The NRI implementation consists of a number of components. The core of
these are essential for implementing working end-to-end NRI support in
runtimes. These core components are the actual [NRI protocol](pkg/api),
and the [NRI runtime adaptation](pkg/adaptation).
Together these establish the model of how a runtime interacts with NRI and
how plugins interact with containers in the runtime through NRI. They also
define under which conditions plugins can make changes to containers and
the extent of these changes.
The rest of the components are the [NRI plugin stub library](pkg/stub)
and [some sample NRI plugins](plugins). [Some plugins](plugins/hook-injector)
implement useful functionality in real world scenarios. [A few](plugins/differ)
[others](plugins/logger) are useful for debugging. All of the sample plugins
serve as practical examples of how the stub library can be used to implement
NRI plugins.
## Protocol, Plugin API
The core of NRI is defined by a protobuf [protocol definition](pkg/api/api.proto)
of the low-level plugin API. The API defines two services, Runtime and Plugin.
The Runtime service is the public interface runtimes expose for NRI plugins. All
requests on this interface are initiated by the plugin. The interface provides
functions for
- initiating plugin registration
- requesting unsolicited updates to containers
The Plugin service is the public interface NRI uses to interact with plugins.
All requests on this interface are initiated by NRI/the runtime. The interface
provides functions for
- configuring the plugin
- getting initial list of already existing pods and containers
- hooking the plugin into pod/container lifecycle events
- shutting down the plugin
### Plugin Registration
Before a plugin can start receiving and processing container events, it needs
to register itself with NRI. During registration the plugin and NRI perform a
handshake sequence which consists of the following steps:
1. the plugin identifies itself to the runtime
2. NRI provides plugin-specific configuration data to the plugin
3. the plugin subscribes to pod and container lifecycle events of interest
4. NRI sends list of existing pods and containers to plugin
5. the plugin requests any updates deemed necessary to existing containers
The plugin identifies itself to NRI by a plugin name and a plugin index. The
plugin index is used by NRI to determine in which order the plugin is hooked
into pod and container lifecycle event processing with respect to any other
plugins.
The plugin name is used to pick plugin-specific data to send to the plugin
as configuration. This data is only present if the plugin has been launched
by NRI. If the plugin has been externally started it is expected to acquire
its configuration also by external means. The plugin subscribes to pod and
container lifecycle events of interest in its response to configuration.
As the last step in the registration and handshaking process, NRI sends the
full set of pods and containers known to the runtime. The plugin can request
updates it considers necessary to any of the known containers in response.
Once the handshake sequence is over and the plugin has registered with NRI,
it will start receiving pod and container lifecycle events according to its
subscription.
### Pod Data and Available Lifecycle Events
<details>
<summary>NRI Pod Lifecycle Events</summary>
<p align="center">
<img src="./docs/nri-pod-lifecycle.svg" title="NRI Pod Lifecycle Events">
</p>
</details>
NRI plugins can subscribe to the following pod lifecycle events:
- creation
- stopping
- removal
The following pieces of pod metadata are available to plugins in NRI:
- ID
- name
- UID
- namespace
- labels
- annotations
- cgroup parent directory
- runtime handler name
### Container Data and Available Lifecycle Events
<details>
<summary>NRI Container Lifecycle Events</summary>
<p align="center">
<img src="./docs/nri-container-lifecycle.svg" title="NRI Container Lifecycle Events">
</p>
</details>
NRI plugins can subscribe to the following container lifecycle events:
- creation (*)
- post-creation
- starting
- post-start
- updating (*)
- post-update
- stopping (*)
- removal
*) Plugins can request adjustment or updates to containers in response to
these events.
The following pieces of container metadata are available to plugins in NRI:
- ID
- pod ID
- name
- state
- labels
- annotations
- command line arguments
- environment variables
- mounts
- OCI hooks
- rlimits
- linux
- namespace IDs
- devices
- resources
- memory
- limit
- reservation
- swap limit
- kernel limit
- kernel TCP limit
- swappiness
- OOM disabled flag
- hierarchical accounting flag
- hugepage limits
- CPU
- shares
- quota
- period
- realtime runtime
- realtime period
- cpuset CPUs
- cpuset memory
- Block I/O class
- RDT class
Apart from data identifying the container, these pieces of information
represent the corresponding data in the container's OCI Spec.
### Container Adjustment
During container creation plugins can request changes to the following
container parameters:
- annotations
- mounts
- environment variables
- OCI hooks
- rlimits
- linux
- devices
- resources
- memory
- limit
- reservation
- swap limit
- kernel limit
- kernel TCP limit
- swappiness
- OOM disabled flag
- hierarchical accounting flag
- hugepage limits
- CPU
- shares
- quota
- period
- realtime runtime
- realtime period
- cpuset CPUs
- cpuset memory
- Block I/O class
- RDT class
### Container Updates
Once a container has been created plugins can request updates to them.
These updates can be requested in response to another containers creation
request, in response to any containers update request, in response to any
containers stop request, or they can be requested as part of a separate
unsolicited container update request. The following container parameters
can be updated this way:
- resources
- memory
- limit
- reservation
- swap limit
- kernel limit
- kernel TCP limit
- swappiness
- OOM disabled flag
- hierarchical accounting flag
- hugepage limits
- CPU
- shares
- quota
- period
- realtime runtime
- realtime period
- cpuset CPUs
- cpuset memory
- Block I/O class
- RDT class
## Runtime Adaptation
The NRI [runtime adaptation](pkg/adaptation) package is the interface
runtimes use to integrate to NRI and interact with NRI plugins. It
implements basic plugin discovery, startup and configuration. It also
provides the functions necessary to hook NRI plugins into lifecycle
events of pods and containers from the runtime.
The package hides the fact that multiple NRI plugins might be processing
any single pod or container lifecycle event. It takes care of invoking
plugins in the correct order and combining responses by multiple plugins
into a single one. While combining responses, the package detects any
unintentional conflicting changes made by multiple plugins to a single
container and flags such an event as an error to the runtime.
## Wrapped OCI Spec Generator
The [OCI Spec generator](pkg/runtime-tools/generate) package wraps the
[corresponding package](https://github.com/opencontainers/runtime-tools/tree/master/generate)
and adds functions for applying NRI container adjustments and updates to
OCI Specs. This package can be used by runtime NRI integration code to
apply NRI responses to containers.
## Plugin Stub Library
The plugin stub hides many of the low-level details of implementing an NRI
plugin. It takes care of connection establishment, plugin registration,
configuration, and event subscription. All [sample plugins](plugins)
are implemented using the stub. Any of these can be used as a tutorial on
how the stub library should be used.
## Sample Plugins
The following sample plugins exist for NRI:
- [logger](plugins/logger)
- [differ](plugins/differ)
- [device injector](plugins/device-injector)
- [network device injector](plugins/network-device-injector)
- [OCI hook injector](plugins/hook-injector)
- [ulimit adjuster](plugins/ulimit-adjuster)
- [NRI v0.1.0 plugin adapter](plugins/v010-adapter)
Please see the documentation of these plugins for further details
about what and how each of these plugins can be used for.
## Security Considerations
From a security perspective NRI plugins should be considered part of the
container runtime. NRI does not implement granular access control to the
functionality it offers. Access to NRI is controlled by restricting access
to the systemwide NRI socket. If a process can connect to the NRI socket
and send data, it has access to the full scope of functionality available
via NRI.
In particular this includes
- injection of OCI hooks, which allow for arbitrary execution of processes with the same privilege level as the container runtime
- arbitrary changes to mounts, including new bind-mounts, changes to the proc, sys, mqueue, shm, and tmpfs mounts
- the addition or removal of arbitrary devices
- arbitrary changes to the limits for memory, CPU, block I/O, and RDT resources available, including the ability to deny service by setting limits very low
The same precautions and principles apply to protecting the NRI socket as
to protecting the socket of the runtime itself. Unless it already exists,
NRI itself creates the directory to hold its socket with permissions that
allow access only for the user ID of the runtime process. By default this
limits NRI access to processes running as root (UID 0). Changing the default
socket permissions is strongly advised against. Enabling more permissive
access control to NRI should never be done without fully understanding the
full implications and potential consequences to container security.
### Plugins as Kubernetes DaemonSets
When the runtime manages pods and containers in a Kubernetes cluster, it
is convenient to deploy and manage NRI plugins using Kubernetes DaemonSets.
Among other things, this requires bind-mounting the NRI socket into the
filesystem of a privileged container running the plugin. Similar precautions
apply and the same care should be taken for protecting the NRI socket and
NRI plugins as for the kubelet DeviceManager socket and Kubernetes Device
Plugins.
The cluster configuration should make sure that unauthorized users cannot
bind-mount host directories and create privileged containers which gain
access to these sockets and can act as NRI or Device Plugins. See the
[related documentation](https://kubernetes.io/docs/concepts/security/)
and [best practices](https://kubernetes.io/docs/setup/best-practices/enforcing-pod-security-standards/)
about Kubernetes security.
## API Stability
NRI APIs should not be considered stable yet. We try to avoid unnecessarily
breaking APIs, especially the Stub API which plugins use to interact with NRI.
However, before NRI reaches a stable 1.0.0 release, this is only best effort
and cannot be guaranteed. Meanwhile we do our best to document any API breaking
changes for each release in the [release notes](RELEASES.md).
The current target for a stable v1 API through a 1.0.0 release is the end of
this year.
## Project details
nri is a containerd sub-project, licensed under the [Apache 2.0 license](./LICENSE).
As a containerd sub-project, you will find the:
* [Project governance](https://github.com/containerd/project/blob/main/GOVERNANCE.md),
* [Maintainers](https://github.com/containerd/project/blob/main/MAINTAINERS),
* and [Contributing guidelines](https://github.com/containerd/project/blob/main/CONTRIBUTING.md)
information in our [`containerd/project`](https://github.com/containerd/project) repository.
|