.. _provisioning:
Customizing the kernel's runtime environment
============================================
Kernel Provisioning
~~~~~~~~~~~~~~~~~~~
Introduced in the 7.0 release, Kernel Provisioning enables third
parties to manage the lifecycle of a kernel's runtime environment. By
implementing and configuring a *kernel provisioner*, third parties can
provision kernels for different environments, typically managed by
resource managers like Kubernetes, Hadoop YARN, Slurm, etc. For
example, a *Kubernetes Provisioner* would
be responsible for launching a kernel within its own Kubernetes pod,
communicating the kernel's connection information back to the
application (residing in a separate pod), and terminating the pod upon
the kernel's termination. In essence, a kernel provisioner is an
*abstraction layer* between the ``KernelManager`` and today's kernel
*process* (i.e., ``Popen``).
The kernel manager and kernel provisioner relationship
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Prior to this enhancement, the only extension point for customizing a
kernel's behavior was to subclass ``KernelManager``. This
proved to be a limitation because the Jupyter framework allows for a
single ``KernelManager`` class at any time. While applications could
introduce a ``KernelManager`` subclass of their own, that
``KernelManager`` was then tied directly to *that* application and
thereby not usable as a ``KernelManager`` in another application. As a
result, we consider the ``KernelManager`` class to be an
*application-owned entity* upon which application-specific behaviors can
be implemented.
Kernel provisioners, on the other hand, are contained within the
``KernelManager`` (i.e., a *has-a* relationship) and applications are
agnostic as to what *kind* of provisioner is in use other than what is
conveyed via the kernel's specification (kernelspec). All kernel
interactions still occur via the ``KernelManager`` and ``KernelClient``
classes within ``jupyter_client`` and potentially subclassed by the
application.
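
To make the *has-a* relationship concrete, here is a minimal sketch
(assuming ``ipykernel`` is installed so that a ``python3`` kernelspec
exists) showing that the application only ever talks to the
``KernelManager``, which in turn holds whatever provisioner the
kernelspec selected:

.. code:: python

    import asyncio

    from jupyter_client.manager import AsyncKernelManager


    async def main() -> None:
        # The application deals only with the kernel manager...
        km = AsyncKernelManager(kernel_name="python3")
        await km.start_kernel()

        # ...while the manager holds ("has-a") the provisioner instance,
        # e.g., LocalProvisioner when the kernelspec names none.
        print(type(km.provisioner).__name__)

        await km.shutdown_kernel()


    asyncio.run(main())
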
Kernel provisioners have no class relationship to the ``KernelManager``
instance that controls their lifecycle, nor do they have any affinity to
the application within which they are used. They merely provide a
vehicle by which authors can extend the environments in which a kernel
can reside, without affecting the application. That said, some kernel
provisioners may introduce requirements on the application. For example
(and completely hypothetically speaking), a ``SlurmProvisioner`` may
impose the constraint that the server (``jupyter_client``) resides on an
edge node of the Slurm cluster. These kinds of requirements can be
mitigated by leveraging applications like `Jupyter Kernel Gateway <https://github.com/jupyter/kernel_gateway>`_ or
`Jupyter Enterprise Gateway <https://github.com/jupyter/enterprise_gateway>`_
where the gateway server resides on the edge
node of (or within) the cluster, etc.
Discovery
~~~~~~~~~
Kernel provisioning does not alter today's kernel discovery mechanism
that utilizes well-known directories of ``kernel.json`` files. Instead,
it optionally extends the current ``metadata`` stanza within the
``kernel.json`` to include the specification of the kernel provisioner
name, along with an optional ``config`` stanza, consisting of
provisioner-specific configuration items. For example, a container-based
provisioner will likely need to specify the image name in this section.
The important point is that the content of this section is
provisioner-specific.
.. code:: JSON
"metadata": {
"kernel_provisioner": {
"provisioner_name": "k8s-provisioner",
"config": {
"image_name": "my_docker_org/kernel:2.1.5",
"max_cpus": 4
}
}
},
Kernel provisioner authors implement their provisioners by deriving from
:class:`KernelProvisionerBase` and expose their provisioner for consumption
via entry-points:
.. code::
'jupyter_client.kernel_provisioners': [
'k8s-provisioner = my_package:K8sProvisioner',
],
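
A provisioner typically receives the items of its ``config`` stanza as
configurable traits, so a hypothetical ``K8sProvisioner`` backing the
``k8s-provisioner`` entry shown above might begin like this (the same
trait-based pattern appears in the ``RBACProvisioner`` example later in
this document):

.. code:: python

    from traitlets import Int, Unicode

    from jupyter_client.provisioning import KernelProvisionerBase


    class K8sProvisioner(KernelProvisionerBase):  # hypothetical
        # Items from the kernelspec's "config" stanza (image_name, max_cpus)
        # are received as configurable traits of the same name.
        image_name = Unicode(config=True)
        max_cpus = Int(default_value=1, config=True)

        # ... the abstract methods (launch_kernel, poll, wait, etc.) are
        # omitted here for brevity ...
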
Backwards Compatibility
~~~~~~~~~~~~~~~~~~~~~~~
Prior to this release, no ``kernel.json`` (kernelspec) contained a
provisioner entry, yet the framework is now based on using provisioners.
As a result, when a ``kernel_provisioner`` stanza is **not** present in
a selected kernelspec, ``jupyter_client`` will, by default, use the built-in
``LocalProvisioner`` implementation as its provisioner. This provisioner
retains today's local kernel functionality. It can also be subclassed
for those provisioner authors wanting to extend the functionality of
local kernels. The result of launching a kernel in this manner is
equivalent to the following stanza existing in the ``kernel.json`` file:
.. code:: JSON
"metadata": {
"kernel_provisioner": {
"provisioner_name": "local-provisioner",
"config": {
}
}
},
Should a given installation wish to use a *different* provisioner as
their "default provisioner" (including subclasses of
``LocalProvisioner``), they can do so by specifying a value for
``KernelProvisionerFactory.default_provisioner_name``.
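
For example, to make a (hypothetical) ``rbac-provisioner`` the default
for all kernelspecs that don't specify one, the following could be
placed in the relevant Jupyter config file (e.g.,
``jupyter_server_config.py``):

.. code:: python

    # Use this provisioner whenever a kernelspec has no kernel_provisioner stanza.
    c.KernelProvisionerFactory.default_provisioner_name = "rbac-provisioner"
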
Implementing a custom provisioner
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Kernel Provisioning enables the implementation of custom kernel
provisioners to manage a kernel's lifecycle within any runtime
environment. There are currently two approaches by which that can be
accomplished: extending the ``KernelProvisionerBase`` class, or
extending the built-in class - ``LocalProvisioner``. As more
provisioners are introduced, some may be implemented in an abstract
sense, from which specific implementations can be authored.
Extending ``LocalProvisioner``
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
If you're interested in running kernels locally but want to adjust
their behavior, there's a good chance you can simply extend
``LocalProvisioner`` via subclassing. This amounts to deriving from
``LocalProvisioner`` and overriding the appropriate methods to provide
your custom functionality.

In this example, ``RBACProvisioner`` will verify whether the current user is
in the role meant for this kernel by calling a method implemented within *this*
provisioner. If the user is not in the role, an exception will be raised.
.. code:: python
    from typing import Any, Dict

    from traitlets import Unicode
    from jupyter_client.provisioning import LocalProvisioner


    class RBACProvisioner(LocalProvisioner):
        role: str = Unicode(config=True)

        async def pre_launch(self, **kwargs: Any) -> Dict[str, Any]:
            # user_in_role() is a method implemented within this provisioner (not shown).
            if not self.user_in_role(self.role):
                raise PermissionError(
                    f"User is not in role {self.role} and cannot launch this kernel."
                )
            return await super().pre_launch(**kwargs)
It is important to note *when* it's necessary to call the superclass in
a given method, since the operations it performs may be critical to the
kernel's management. As a result, you'll likely need to become familiar
with how ``LocalProvisioner`` operates.
Extending ``KernelProvisionerBase``
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
If you'd like to launch your kernel in an environment other than the
local server, then you will need to consider subclassing :class:`KernelProvisionerBase`
directly. This will allow you to implement the various kernel process
controls relative to your target environment. For instance, if you
want your kernel hosted in a Hadoop YARN cluster, you will
need to implement process-control methods like :meth:`poll` and :meth:`wait`
to use the YARN REST API. Or, similarly, a Kubernetes-based provisioner
would need to implement the process-control methods using the Kubernetes client
API, etc.
Because the :class:`KernelProvisionerBase` methods are modeled after
:class:`subprocess.Popen`, a natural mapping to today's kernel lifecycle
management takes place. This, coupled with the ability to add configuration
directly into the ``config:`` stanza of the ``kernel_provisioner`` metadata,
allows things like endpoint addresses, image names, namespaces, host lists,
etc. to be specified relative to your kernel provisioner implementation.
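
The following hypothetical skeleton illustrates that mapping. The method
names and signatures follow :class:`KernelProvisionerBase` (consult that
class for the authoritative definitions), while the cluster interactions
themselves are elided (``...``) since they depend entirely on the target
environment:

.. code:: python

    from typing import Any, Dict, List, Optional

    from jupyter_client.connect import KernelConnectionInfo
    from jupyter_client.provisioning import KernelProvisionerBase


    class MyClusterProvisioner(KernelProvisionerBase):
        """Sketch of a provisioner that talks to a (fictional) cluster API."""

        @property
        def has_process(self) -> bool:
            # True once the cluster has assigned an application/pod to this kernel.
            return getattr(self, "application_id", None) is not None

        async def pre_launch(self, **kwargs: Any) -> Dict[str, Any]:
            # Adjust the launch arguments (cmd, env, etc.) before the launch occurs.
            return await super().pre_launch(**kwargs)

        async def launch_kernel(self, cmd: List[str], **kwargs: Any) -> KernelConnectionInfo:
            # Submit `cmd` to the cluster (e.g., via its REST API), then build
            # and return the kernel's connection information.
            ...
            return self.connection_info

        async def poll(self) -> Optional[int]:
            # Return None while the cluster reports the kernel as running,
            # otherwise an exit code (analogous to Popen.poll()).
            ...

        async def wait(self) -> Optional[int]:
            # Wait (asynchronously) for the kernel's application to terminate.
            ...

        async def send_signal(self, signum: int) -> None:
            # Map signal requests onto cluster operations (see the YARN example below).
            ...

        async def kill(self, restart: bool = False) -> None:
            ...

        async def terminate(self, restart: bool = False) -> None:
            ...

        async def cleanup(self, restart: bool = False) -> None:
            # Release any cluster resources associated with this kernel.
            ...
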
The ``kernel_id`` corresponding to the launched kernel and used by the
kernel manager is now available *prior* to the kernel's launch. This
provides provisioners with a unique *key* they can use to discover and
control their kernel when launched into resource-managed clusters such
as Hadoop YARN or Kubernetes.
.. tip::
Use ``kernel_id`` as a discovery mechanism from your provisioner!
Here's a prototyped implementation of a couple of the abstract methods
of :class:`KernelProvisionerBase` for use in a Hadoop YARN cluster to
help illustrate a provisioner's implementation. Note that the built-in
implementation of :class:`LocalProvisioner` can also be used as a reference.
Notice the internal method ``_get_application_id()``. This method is
what the provisioner uses to determine if the YARN application (i.e.,
the kernel) is still running within the cluster. Although the provisioner
doesn't dictate the application id, it discovers the id via the
application *name*, which is a function of ``kernel_id``.
.. code:: python
    async def poll(self) -> Optional[int]:
        """Submitting a new kernel/app to YARN will take a while to be ACCEPTED.
        Thus the application ID will probably not be available immediately for poll,
        so we regard the application as RUNNING while the application ID is still in
        the ACCEPTED or SUBMITTED state.

        :return: None if the application ID is available and the state is
            ACCEPTED/SUBMITTED/RUNNING.  Otherwise 0.
        """
        result = 0
        if self._get_application_id():
            state = self._query_app_state_by_id(self.application_id)
            if state in YarnProvisioner.initial_states:
                result = None
        return result

    async def send_signal(self, signum):
        """Currently, only signal 0 (used to poll the process) and SIGKILL are
        handled directly; any other signal is delegated to the superclass.

        :param signum: the signal number to send
        """
        if signum == 0:
            return await self.poll()
        elif signum == signal.SIGKILL:
            return await self.kill()
        else:
            return await super().send_signal(signum)
Notice how in some cases we can compose provisioner methods to implement others. For
example, since sending a signal number of 0 is tantamount to polling the process, we
go ahead and call :meth:`poll` to handle ``signum`` of 0 and :meth:`kill` to handle
``SIGKILL`` requests.
Here we see how ``_get_application_id`` uses the ``kernel_id`` to acquire the application
id - which is the *primary id* for controlling YARN application lifecycles. Since startup
in resource-managed clusters can tend to take much longer than for local kernels, you'll
typically need a polling or notification mechanism within your provisioner. In addition,
your provisioner will be asked by the ``KernelManager`` how much time should be allowed
for the kernel's shutdown to complete. This answer is provided by the provisioner via
the :meth:`get_shutdown_wait_time` method.
.. code:: python
def _get_application_id(self, ignore_final_states: bool = False) -> str:
if not self.application_id:
app = self._query_app_by_name(self.kernel_id)
state_condition = True
if type(app) is dict:
state = app.get("state")
self.last_known_state = state
if ignore_final_states:
state_condition = state not in YarnProvisioner.final_states
if len(app.get("id", "")) > 0 and state_condition:
self.application_id = app["id"]
self.log.info(
f"ApplicationID: '{app['id']}' assigned for "
f"KernelID: '{self.kernel_id}', state: {state}."
)
if not self.application_id:
self.log.debug(
f"ApplicationID not yet assigned for KernelID: "
f"'{self.kernel_id}' - retrying..."
)
return self.application_id
def get_shutdown_wait_time(self, recommended: Optional[float] = 5.0) -> float:
if recommended < yarn_shutdown_wait_time:
recommended = yarn_shutdown_wait_time
self.log.debug(
f"{type(self).__name__} shutdown wait time adjusted to "
f"{recommended} seconds."
)
return recommended
Registering your custom provisioner
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Once your custom provisioner has been authored, it needs to be exposed
as an
`entry point <https://packaging.python.org/specifications/entry-points/>`_.
To do this, add the following to your ``setup.py`` (or equivalent) in its
``entry_points`` stanza using the group name
``jupyter_client.kernel_provisioners``:
::
'jupyter_client.kernel_provisioners': [
'rbac-provisioner = acme.rbac.provisioner:RBACProvisioner',
],
where:
- ``rbac-provisioner`` is the *name* of your provisioner and what will
be referenced within the ``kernel.json`` file
- ``acme.rbac.provisioner`` identifies the provisioner module name, and
- ``RBACProvisioner`` is the name of the custom provisioner class
  (implementation) that (directly or indirectly) derives from
  ``KernelProvisionerBase``
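
Putting this together, a minimal (and hypothetical) ``setup.py`` for the
``acme`` package might look like the following:

.. code:: python

    from setuptools import find_packages, setup

    setup(
        name="acme-rbac-provisioner",  # hypothetical package name
        version="0.1.0",
        packages=find_packages(),
        install_requires=["jupyter_client>=7.0"],
        entry_points={
            "jupyter_client.kernel_provisioners": [
                "rbac-provisioner = acme.rbac.provisioner:RBACProvisioner",
            ],
        },
    )
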
Deploying your custom provisioner
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
The final step in getting your custom provisioner deployed is to add a
``kernel_provisioner`` stanza to the appropriate ``kernel.json`` files.
This can be accomplished manually or programmatically (in which case
some tooling creates the appropriate ``kernel.json`` file).
In either case, the end result is the same - a ``kernel.json`` file with
the appropriate stanza within ``metadata``. The *vision* is that kernel
provisioner packages will include an application that creates kernel
specifications (i.e., ``kernel.json`` et al.) pertaining to that
provisioner.
Following on the previous example of ``RBACProvisioner``, one would find
the following ``kernel.json`` file in directory
``/usr/local/share/jupyter/kernels/rbac_kernel``:
.. code:: JSON
{
"argv": ["python", "-m", "ipykernel_launcher", "-f", "{connection_file}"],
"env": {},
"display_name": "RBAC Kernel",
"language": "python",
"interrupt_mode": "signal",
"metadata": {
"kernel_provisioner": {
"provisioner_name": "rbac-provisioner",
"config": {
"role": "data_scientist"
}
}
}
}
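
As noted above, such a file could also be generated by tooling shipped with
the provisioner package. A minimal (hypothetical) sketch that writes the file
shown above might look like this:

.. code:: python

    import json
    from pathlib import Path

    # Hypothetical destination; any directory on the kernelspec search path works.
    kernel_dir = Path("/usr/local/share/jupyter/kernels/rbac_kernel")
    kernel_dir.mkdir(parents=True, exist_ok=True)

    kernel_spec = {
        "argv": ["python", "-m", "ipykernel_launcher", "-f", "{connection_file}"],
        "env": {},
        "display_name": "RBAC Kernel",
        "language": "python",
        "interrupt_mode": "signal",
        "metadata": {
            "kernel_provisioner": {
                "provisioner_name": "rbac-provisioner",
                "config": {"role": "data_scientist"},
            }
        },
    }

    (kernel_dir / "kernel.json").write_text(json.dumps(kernel_spec, indent=2))
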
Listing available kernel provisioners
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
To confirm that your custom provisioner is available for use,
the ``jupyter kernelspec`` command has been extended to include
a ``provisioners`` sub-command. As a result, running ``jupyter kernelspec provisioners``
will list the available provisioners by name followed by their module and object
names (colon-separated):
.. code:: bash
$ jupyter kernelspec provisioners
Available kernel provisioners:
local-provisioner jupyter_client.provisioning:LocalProvisioner
rbac-provisioner acme.rbac.provisioner:RBACProvisioner