.. _contents_api:
Contents API
============
.. currentmodule:: jupyter_server.services.contents
The Jupyter Notebook web application provides a graphical interface for
creating, opening, renaming, and deleting files in a virtual filesystem.
The :class:`~manager.ContentsManager` class defines an abstract
API for translating these interactions into operations on a particular storage
medium. The default implementation,
:class:`~filemanager.FileContentsManager`, uses the local
filesystem of the server for storage and straightforwardly serializes notebooks
into JSON. Users can override these behaviors by supplying custom subclasses
of ContentsManager.
This section describes the interface implemented by ContentsManager subclasses.
We refer to this interface as the **Contents API**.
Data Model
----------
.. currentmodule:: jupyter_server.services.contents.manager
Filesystem Entities
~~~~~~~~~~~~~~~~~~~
.. _notebook models:
ContentsManager methods represent virtual filesystem entities as dictionaries,
which we refer to as **models**.
Models may contain the following entries:

+--------------------+------------+-------------------------------+
| Key                | Type       | Info                          |
+====================+============+===============================+
| **name**           | unicode    | Basename of the entity.       |
+--------------------+------------+-------------------------------+
| **path**           | unicode    | Full                          |
|                    |            | (:ref:`API-style<apipaths>`)  |
|                    |            | path to the entity.           |
+--------------------+------------+-------------------------------+
| **type**           | unicode    | The entity type. One of       |
|                    |            | ``"notebook"``, ``"file"`` or |
|                    |            | ``"directory"``.              |
+--------------------+------------+-------------------------------+
| **created**        | datetime   | Creation date of the entity.  |
+--------------------+------------+-------------------------------+
| **last_modified**  | datetime   | Last modified date of the     |
|                    |            | entity.                       |
+--------------------+------------+-------------------------------+
| **content**        | variable   | The "content" of the entity.  |
|                    |            | (:ref:`See                    |
|                    |            | Below<modelcontent>`)         |
+--------------------+------------+-------------------------------+
| **mimetype**       | unicode or | The mimetype of ``content``,  |
|                    | ``None``   | if any. (:ref:`See            |
|                    |            | Below<modelcontent>`)         |
+--------------------+------------+-------------------------------+
| **format**         | unicode or | The format of ``content``,    |
|                    | ``None``   | if any. (:ref:`See            |
|                    |            | Below<modelcontent>`)         |
+--------------------+------------+-------------------------------+
| [optional]         |            |                               |
| **hash**           | unicode or | The hash of the contents.     |
|                    | ``None``   | It cannot be null if          |
|                    |            | ``hash_algorithm`` is         |
|                    |            | defined.                      |
+--------------------+------------+-------------------------------+
| [optional]         |            |                               |
| **hash_algorithm** | unicode or | The algorithm used to compute |
|                    | ``None``   | hash value.                   |
|                    |            | It cannot be null             |
|                    |            | if ``hash`` is defined.       |
+--------------------+------------+-------------------------------+

.. _modelcontent:
Certain model fields vary in structure depending on the ``type`` field of the
model. There are three model types: **notebook**, **file**, and **directory**.

- ``notebook`` models

  - The ``format`` field is always ``"json"``.
  - The ``mimetype`` field is always ``None``.
  - The ``content`` field contains a
    :class:`nbformat.notebooknode.NotebookNode` representing the .ipynb file
    represented by the model. See the `NBFormat`_ documentation for a full
    description.
  - The ``hash`` field is a hexdigest string of the hash value of the file.
    If ``ContentsManager.get`` does not support hashing, it should always be ``None``.
  - The ``hash_algorithm`` field is the algorithm used to compute the hash value.

- ``file`` models

  - The ``format`` field is either ``"text"`` or ``"base64"``.
  - The ``mimetype`` field is ``text/plain`` for text-format models and
    ``application/octet-stream`` for base64-format models.
  - The ``content`` field is always of type ``unicode``. For text-format
    file models, ``content`` simply contains the file's bytes after decoding
    as UTF-8. Non-text (``base64``) files are read as bytes, base64 encoded,
    and then decoded as UTF-8.
  - The ``hash`` field is a hexdigest string of the hash value of the file.
    If ``ContentsManager.get`` does not support hashing, it should always be ``None``.
  - The ``hash_algorithm`` field is the algorithm used to compute the hash value.

- ``directory`` models

  - The ``format`` field is always ``"json"``.
  - The ``mimetype`` field is always ``None``.
  - The ``content`` field contains a list of :ref:`content-free<contentfree>`
    models representing the entities in the directory.
  - The ``hash`` field is always ``None``.

.. note::

   .. _contentfree:

   In certain circumstances, we don't need the full content of an entity to
   complete a Contents API request. In such cases, we omit the ``mimetype``,
   ``content``, and ``format`` keys from the model. This most commonly occurs
   when listing a directory, in which circumstance we represent files within
   the directory as content-less models to avoid having to recursively traverse
   and serialize the entire filesystem.

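
The rules for ``file`` models above can be sketched in a few lines. This is a minimal illustration, not the code used by ``FileContentsManager``: the helper name ``file_model_fields`` is hypothetical, and the choice of SHA-256 is an assumption (any algorithm may be used as long as ``hash_algorithm`` names it).

```python
import base64
import hashlib


def file_model_fields(raw: bytes) -> dict:
    """Return the content-related fields of a ``file`` model for raw bytes.

    Hypothetical helper, for illustration only.
    """
    try:
        # Text files: content is the file's bytes decoded as UTF-8.
        content, fmt, mimetype = raw.decode("utf-8"), "text", "text/plain"
    except UnicodeDecodeError:
        # Everything else: base64-encode the bytes, then decode to str.
        content = base64.b64encode(raw).decode("utf-8")
        fmt, mimetype = "base64", "application/octet-stream"
    return {
        "content": content,
        "format": fmt,
        "mimetype": mimetype,
        # Assumption: SHA-256; hash and hash_algorithm are always set together.
        "hash": hashlib.sha256(raw).hexdigest(),
        "hash_algorithm": "sha256",
    }
```

Trying the UTF-8 decode first and falling back to base64 keeps the choice of ``format`` dependent on the bytes alone, so the caller does not need to know the file type in advance.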
**Sample Models**
.. code-block:: python

    from datetime import datetime

    # Notebook Model with Content and Hash
    {
        "content": {
            "metadata": {},
            "nbformat": 4,
            "nbformat_minor": 0,
            "cells": [
                {
                    "cell_type": "markdown",
                    "metadata": {},
                    "source": "Some **Markdown**",
                },
            ],
        },
        "created": datetime(2015, 7, 25, 19, 50, 19, 19865),
        "format": "json",
        "last_modified": datetime(2015, 7, 25, 19, 50, 19, 19865),
        "mimetype": None,
        "name": "a.ipynb",
        "path": "foo/a.ipynb",
        "type": "notebook",
        "writable": True,
        "hash": "f5e43a0b1c2e7836ab3b4d6b1c35c19e2558688de15a6a14e137a59e4715d34b",
        "hash_algorithm": "sha256",
    }

    # Notebook Model without Content
    {
        "content": None,
        "created": datetime(2015, 7, 25, 20, 17, 33, 271931),
        "format": None,
        "last_modified": datetime(2015, 7, 25, 20, 17, 33, 271931),
        "mimetype": None,
        "name": "a.ipynb",
        "path": "foo/a.ipynb",
        "type": "notebook",
        "writable": True,
    }

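
A custom manager can sanity-check the models it produces against the rules described in this section. The sketch below is illustrative only: ``model_problems``, ``ENTITY_TYPES``, and ``BASE_KEYS`` are hypothetical names, and the set of keys treated as mandatory is an assumption based on the table and samples above.

```python
ENTITY_TYPES = {"notebook", "file", "directory"}
# Assumption: these keys appear on every model, with or without content.
BASE_KEYS = {"name", "path", "type", "created", "last_modified"}


def model_problems(model: dict) -> list:
    """Return a list of human-readable problems (empty if none were found)."""
    problems = [f"missing key: {key}" for key in sorted(BASE_KEYS - model.keys())]
    if model.get("type") not in ENTITY_TYPES:
        problems.append(f"unknown type: {model.get('type')!r}")
    # hash and hash_algorithm must be both set or both unset.
    if (model.get("hash") is None) != (model.get("hash_algorithm") is None):
        problems.append("hash and hash_algorithm must be set together")
    # The format is only constrained when the model carries content.
    if model.get("content") is not None:
        expected = {"text", "base64"} if model.get("type") == "file" else {"json"}
        if model.get("format") not in expected:
            problems.append(f"unexpected format: {model.get('format')!r}")
    return problems
```

Returning a list instead of raising on the first failure makes it easier to report every issue at once in a test suite.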
API Paths
~~~~~~~~~
.. _apipaths:
ContentsManager methods represent the locations of filesystem resources as
**API-style paths**. Such paths are interpreted as relative to the root
directory of the notebook server. For compatibility across systems, the
following guarantees are made:
* Paths are always ``unicode``, not ``bytes``.
* Paths are not URL-escaped.
* Paths are always forward-slash (/) delimited, even on Windows.
* Leading and trailing slashes are stripped. For example, ``/foo/bar/buzz/``
becomes ``foo/bar/buzz``.
* The empty string (``""``) represents the root directory.
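
As a rough illustration of these guarantees, the following sketch normalizes a path into API style. The function name ``normalize_api_path`` is hypothetical, and treating every backslash as a separator is an assumption that suits Windows-style input only.

```python
def normalize_api_path(path: str) -> str:
    """Convert a slash- or backslash-delimited path to an API-style path.

    Hypothetical helper, for illustration only.
    """
    if not isinstance(path, str):
        raise TypeError("API paths must be str, not bytes")
    # Assumption: backslashes are separators (Windows-style input).
    # Leading and trailing slashes are stripped, so the root becomes "".
    return path.replace("\\", "/").strip("/")
```

With this helper, ``normalize_api_path("/foo/bar/buzz/")`` gives ``"foo/bar/buzz"`` and ``normalize_api_path("/")`` gives the empty string, which denotes the root directory.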
Writing a Custom ContentsManager
--------------------------------
The default ContentsManager is designed for users running the notebook as an
application on a personal computer. It stores notebooks as .ipynb files on the
local filesystem, and it maps files and directories in the Notebook UI to files
and directories on disk. It is possible to override how notebooks are stored
by implementing your own custom subclass of ``ContentsManager``. For example,
if you deploy the notebook in a context where you don't trust or don't have
access to the filesystem of the notebook server, it's possible to write your
own ContentsManager that stores notebooks and files in a database.
Required Methods
~~~~~~~~~~~~~~~~
A minimal complete implementation of a custom
:class:`~manager.ContentsManager` must implement the following
methods:
.. autosummary::

    ContentsManager.get
    ContentsManager.save
    ContentsManager.delete_file
    ContentsManager.rename_file
    ContentsManager.file_exists
    ContentsManager.dir_exists
    ContentsManager.is_hidden

You may be required to specify a Checkpoints object, as the default one,
``FileCheckpoints``, could be incompatible with your custom
ContentsManager.
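
To make the shape of these methods concrete, here is a dependency-free sketch that keeps everything in a dictionary. It is an assumption-laden stand-in, not a working manager: the class name ``InMemoryContents`` is hypothetical, it does not subclass ``ContentsManager``, it supports only a flat root directory, and it raises ``KeyError`` where a real implementation would raise an HTTP 404 error.

```python
from datetime import datetime, timezone


class InMemoryContents:
    """Dict-backed stand-in that mirrors the required method names."""

    def __init__(self):
        self._files = {}  # API path -> full model, including content

    def dir_exists(self, path):
        # Only the root directory ("") exists in this flat sketch.
        return path.strip("/") == ""

    def file_exists(self, path=""):
        return path.strip("/") in self._files

    def is_hidden(self, path):
        # Assumption: any dot-prefixed path component marks the entity hidden.
        return any(part.startswith(".") for part in path.strip("/").split("/"))

    def get(self, path, content=True, type=None, format=None):
        # KeyError stands in for the 404 a real manager would raise.
        model = dict(self._files[path.strip("/")])
        if not content:
            model.update(content=None, format=None, mimetype=None)
        return model

    def save(self, model, path):
        path = path.strip("/")
        now = datetime.now(timezone.utc)
        previous = self._files.get(path, {})
        self._files[path] = {
            "name": path.rsplit("/", 1)[-1],
            "path": path,
            "type": model["type"],
            "format": model.get("format"),
            "mimetype": model.get("mimetype"),
            "content": model.get("content"),
            "created": previous.get("created", now),
            "last_modified": now,
            "writable": True,
        }
        # Return the saved model without its content.
        return self.get(path, content=False)

    def delete_file(self, path):
        del self._files[path.strip("/")]

    def rename_file(self, old_path, new_path):
        new_path = new_path.strip("/")
        model = self._files.pop(old_path.strip("/"))
        model.update(name=new_path.rsplit("/", 1)[-1], path=new_path)
        self._files[new_path] = model
```

A database-backed manager would follow the same outline, replacing the dictionary lookups with queries.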
Customizing Checkpoints
-----------------------
.. currentmodule:: jupyter_server.services.contents.checkpoints
Custom ``Checkpoints`` definitions allow checkpoint behavior to be
altered and extended.
The ``Checkpoints`` and ``GenericCheckpointsMixin`` classes
(from :mod:`jupyter_server.services.contents.checkpoints`)
have reusable code and are intended to be used together,
but require the following methods to be implemented.
.. autosummary::

    Checkpoints.rename_checkpoint
    Checkpoints.list_checkpoints
    Checkpoints.delete_checkpoint
    GenericCheckpointsMixin.create_file_checkpoint
    GenericCheckpointsMixin.create_notebook_checkpoint
    GenericCheckpointsMixin.get_file_checkpoint
    GenericCheckpointsMixin.get_notebook_checkpoint

No-op example
~~~~~~~~~~~~~
Here is an example of a no-op checkpoints object - note the mixin
comes first. The docstrings indicate what each method should do or
return for a more complete implementation.
.. code-block:: python

    from jupyter_server.services.contents.checkpoints import (
        Checkpoints,
        GenericCheckpointsMixin,
    )


    class NoOpCheckpoints(GenericCheckpointsMixin, Checkpoints):
        """requires the following methods:"""

        def create_file_checkpoint(self, content, format, path):
            """-> checkpoint model"""

        def create_notebook_checkpoint(self, nb, path):
            """-> checkpoint model"""

        def get_file_checkpoint(self, checkpoint_id, path):
            """-> {'type': 'file', 'content': <str>, 'format': {'text', 'base64'}}"""

        def get_notebook_checkpoint(self, checkpoint_id, path):
            """-> {'type': 'notebook', 'content': <output of nbformat.read>}"""

        def delete_checkpoint(self, checkpoint_id, path):
            """deletes a checkpoint for a file"""

        def list_checkpoints(self, path):
            """returns a list of checkpoint models for a given file,
            default just does one per file
            """
            return []

        def rename_checkpoint(self, checkpoint_id, old_path, new_path):
            """renames checkpoint from old path to new path"""

See ``GenericFileCheckpoints`` in :mod:`jupyter_server.services.contents.filecheckpoints`
for a more complete example.
Testing
-------
.. currentmodule:: jupyter_server.services.contents.tests
:mod:`jupyter_server.services.contents.tests` includes several test suites written
against the abstract Contents API. This means that an excellent way to test a
new ContentsManager subclass is to subclass our tests to make them use your
ContentsManager.
.. note::

   PGContents_ is an example of a complete implementation of a custom
   ``ContentsManager``. It stores notebooks and files in PostgreSQL_ and encodes
   directories as SQL relations. PGContents also provides an example of how to
   reuse the notebook's tests.

.. _NBFormat: https://nbformat.readthedocs.io/en/latest/index.html
.. _PGContents: https://github.com/quantopian/pgcontents
.. _PostgreSQL: https://www.postgresql.org/
Asynchronous Support
--------------------
An asynchronous version of the Contents API is available to run slow IO processes concurrently.
- :class:`~manager.AsyncContentsManager`
- :class:`~filemanager.AsyncFileContentsManager`
- :class:`~largefilemanager.AsyncLargeFileManager`
- :class:`~checkpoints.AsyncCheckpoints`
- :class:`~checkpoints.AsyncGenericCheckpointsMixin`
.. note::

   .. _asynccontents:

   In most cases, the synchronous Contents API performs well on local filesystems.
   However, if the Jupyter Notebook web application interacts with a high-latency virtual filesystem, you may see performance gains from the asynchronous version.
   For example, if slow, blocking file operations cause terminal lag in the web application, the asynchronous version can reduce that lag.
   Before opting in, we recommend comparing the performance of the synchronous and asynchronous options.
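
The following sketch shows why asynchronous reads can help with high-latency storage. It does not use the classes listed above: ``slow_get`` and ``fetch_all`` are hypothetical names, and ``asyncio.sleep`` stands in for a slow network round trip.

```python
import asyncio


async def slow_get(path: str, log: list) -> dict:
    """Stand-in for a high-latency read; sleep() plays the role of network I/O."""
    log.append(f"start {path}")
    await asyncio.sleep(0.01)
    log.append(f"end {path}")
    return {"path": path, "type": "file"}


async def fetch_all(paths):
    log = []
    # gather() lets every read wait on its I/O at the same time
    # and returns the results in the order the paths were given.
    models = await asyncio.gather(*(slow_get(p, log) for p in paths))
    return models, log


models, log = asyncio.run(fetch_all(["a.txt", "b.txt"]))
```

Both reads start before either finishes, so the total wait is close to that of the slowest single read instead of the sum of all reads. A blocking implementation would complete each read before starting the next.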