(use/cli)=
# Command-Line
Note: you can follow this tutorial by cloning <https://github.com/executablebooks/jupyter-cache> and running these commands inside it:
tox
```{jcache-clear}
```
```{jcache-cli} jupyter_cache.cli.commands.cmd_main:jcache
:args: --help
```
The first time the cache is required, it will be lazily created:
```{jcache-cli} jupyter_cache.cli.commands.cmd_notebook:cmnd_notebook
:command: list
:input: y
```
You can specify the path to the cache, with the `--cache-path` option,
or set the `JUPYTERCACHE` environment variable.
You can also clear it at any time:
```{jcache-cli} jupyter_cache.cli.commands.cmd_project:cmnd_project
:command: clear
:input: y
```
```{jcache-cli} jupyter_cache.cli.commands.cmd_notebook:cmnd_notebook
:command: list
:input: y
```
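The two ways of locating the cache mentioned above (the `--cache-path` option and the `JUPYTERCACHE` environment variable) can be pictured with a small sketch. It is not the library's code: the helper name `resolve_cache_path`, the precedence order (explicit option first), and the default directory name are all assumptions made for illustration.

```python
import os
from pathlib import Path
from typing import Mapping, Optional

# Assumed default directory name; check `jcache --help` for the real one.
DEFAULT_CACHE_DIR = ".jupyter_cache"


def resolve_cache_path(
    cli_option: Optional[str] = None,
    environ: Optional[Mapping[str, str]] = None,
) -> Path:
    """Pick the cache directory: explicit option, then env var, then default."""
    env = os.environ if environ is None else environ
    if cli_option:  # value given via --cache-path
        return Path(cli_option)
    if env.get("JUPYTERCACHE"):  # value exported in the shell
        return Path(env["JUPYTERCACHE"])
    return Path(DEFAULT_CACHE_DIR)
```

Passing `environ` explicitly keeps the function easy to exercise without touching the real process environment.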
````{tip}
Execute this in the terminal for auto-completion:
```console
eval "$(_JCACHE_COMPLETE=source jcache)"
```
````
## Adding notebooks to the project
```{jcache-cli} jupyter_cache.cli.commands.cmd_notebook:cmnd_notebook
:args: --help
```
A project consists of a set of notebooks to be executed.
When adding notebooks to the project, they are recorded by their URI (e.g. file path),
i.e. no physical copying takes place until execution time.
```{jcache-cli} jupyter_cache.cli.commands.cmd_notebook:cmnd_notebook
:command: add
:args: tests/notebooks/basic.ipynb tests/notebooks/basic_failing.ipynb tests/notebooks/basic_unrun.ipynb tests/notebooks/complex_outputs.ipynb tests/notebooks/external_output.ipynb
```
You can list the notebooks in the project. At present, none of them has an existing execution record in the cache:
```{jcache-cli} jupyter_cache.cli.commands.cmd_notebook:cmnd_notebook
:command: list
```
You can remove a notebook from the project by its URI or ID:
```{jcache-cli} jupyter_cache.cli.commands.cmd_notebook:cmnd_notebook
:command: remove
:args: 4
```
```{jcache-cli} jupyter_cache.cli.commands.cmd_notebook:cmnd_notebook
:command: list
```
or clear all notebooks from the project:
```{jcache-cli} jupyter_cache.cli.commands.cmd_notebook:cmnd_notebook
:command: clear
:input: y
```
## Add a custom reader to read notebook files
By default, notebook files are read using the [nbformat reader](https://nbformat.readthedocs.io/en/latest/api.html#nbformat.read).
However, you can also specify a custom reader, defined by an entry point in the `jcache.readers` group.
Included with jupyter_cache is the [jupytext](https://jupytext.readthedocs.io) reader, for formats like MyST Markdown:
```{jcache-cli} jupyter_cache.cli.commands.cmd_notebook:cmnd_notebook
:command: add
:args: --reader nbformat tests/notebooks/basic.ipynb tests/notebooks/basic_failing.ipynb
```
```{jcache-cli} jupyter_cache.cli.commands.cmd_notebook:cmnd_notebook
:command: add
:args: --reader jupytext tests/notebooks/basic.md
```
```{jcache-cli} jupyter_cache.cli.commands.cmd_notebook:cmnd_notebook
:command: list
```
:::{important}
To use the `jupytext` reader, you must have the `jupytext` package installed.
:::
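Because readers are discovered through the `jcache.readers` entry point group, you can inspect which ones are installed from Python. The following is a minimal sketch using only the standard library; `list_entry_points` is a hypothetical helper, not part of jupyter-cache, and it returns an empty mapping when nothing is registered in the group.

```python
from importlib.metadata import entry_points
from typing import Dict


def list_entry_points(group: str) -> Dict[str, str]:
    """Map entry point names in `group` to their 'module:attr' targets."""
    eps = entry_points()
    if hasattr(eps, "select"):  # Python 3.10+ API
        selected = eps.select(group=group)
    else:  # Python 3.8/3.9 return a plain dict keyed by group name
        selected = eps.get(group, [])
    return {ep.name: ep.value for ep in selected}


if __name__ == "__main__":
    # Prints whatever readers are registered in the current environment.
    print(list_entry_points("jcache.readers"))
```

The same helper works for any other group name, since it only queries installed package metadata.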
## Executing the notebooks
Call the `execute` command to execute all notebooks in the project that do not have an existing record in the cache.
Executors are defined by entry points in the `jcache.executors` group.
jupyter-cache includes these executors:
- `local-serial`: execute notebooks with the working directory set to their path, in serial mode (using a single process).
- `local-parallel`: execute notebooks with the working directory set to their path, in parallel mode (using multiple processes).
- `temp-serial`: execute notebooks with a temporary working directory, in serial mode (using a single process).
- `temp-parallel`: execute notebooks with a temporary working directory, in parallel mode (using multiple processes).
```{jcache-cli} jupyter_cache.cli.commands.cmd_project:cmnd_project
:command: execute
:args: --executor local-serial
```
Successfully executed notebooks will now have a record in the cache, uniquely identified by a hash of their code and metadata content:
```{jcache-cli} jupyter_cache.cli.commands.cmd_cache:cmnd_cache
:command: list
:args: --hashkeys
```
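To give a feel for a hash over "code and metadata content", here is a sketch that hashes only the code cell sources and one piece of notebook metadata. Which metadata fields take part (here `kernelspec`) and the use of SHA-256 are assumptions for illustration; `notebook_hashkey` is a hypothetical name, and the real hashing scheme may differ.

```python
import hashlib
import json
from typing import Any, Dict, List, Union


def _source_text(source: Union[str, List[str]]) -> str:
    """Notebook JSON may store a cell source as a string or a list of lines."""
    return "".join(source) if isinstance(source, list) else source


def notebook_hashkey(nb: Dict[str, Any]) -> str:
    """Hash the code cell sources plus selected notebook metadata."""
    payload = {
        # Only code cells contribute; Markdown cells are ignored.
        "code": [
            _source_text(cell.get("source", ""))
            for cell in nb.get("cells", [])
            if cell.get("cell_type") == "code"
        ],
        # Assumption: the kernel specification is the metadata that matters.
        "kernelspec": nb.get("metadata", {}).get("kernelspec", {}),
    }
    encoded = json.dumps(payload, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()
```

Under this scheme, editing a Markdown cell leaves the key unchanged, while editing any code cell produces a new key.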
These records are then compared to the hashes of notebooks in the project, to find which have up-to-date executions.
Note here both notebooks share the same cached notebook (denoted by `[1]` in the status):
```{jcache-cli} jupyter_cache.cli.commands.cmd_notebook:cmnd_notebook
:command: list
```
Next time you execute the project, only notebooks which don't match a cached record will be executed:
```{jcache-cli} jupyter_cache.cli.commands.cmd_project:cmnd_project
:command: execute
:args: --executor local-serial -v CRITICAL
```
You can also `force` all notebooks to be re-executed:
```{jcache-cli} jupyter_cache.cli.commands.cmd_project:cmnd_project
:command: execute
:args: --force
```
If you modify a code cell, the notebook will no longer match a cached notebook. If you wish to re-execute unchanged notebook(s) (for example, if the runtime environment has changed), you can remove their records from the cache (keeping the project record):
```{jcache-cli} jupyter_cache.cli.commands.cmd_cache:cmnd_cache
:command: clear
:input: n
:allow-exception:
```
:::{note}
The number of notebooks in the cache is limited
(current default 1000).
Once this limit is reached, the oldest (last accessed) notebooks begin to be deleted.
You can change this default with `jcache config cache-limit`.
:::
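The eviction behaviour described in the note can be sketched as a size-limited store that drops the least recently accessed record first. This is an illustration only: `LimitedCache` is a hypothetical class, and the real cache persists its records rather than holding them in memory.

```python
from collections import OrderedDict
from typing import List


class LimitedCache:
    """Keep at most `limit` records, evicting the least recently accessed."""

    def __init__(self, limit: int = 1000) -> None:
        self.limit = limit
        self._records: "OrderedDict[str, str]" = OrderedDict()

    def add(self, hashkey: str, uri: str) -> List[str]:
        """Store a record and return the hashkeys evicted to make room."""
        self._records[hashkey] = uri
        self._records.move_to_end(hashkey)  # newest entry goes to the end
        evicted = []
        while len(self._records) > self.limit:
            old_key, _ = self._records.popitem(last=False)  # drop the oldest
            evicted.append(old_key)
        return evicted

    def get(self, hashkey: str) -> str:
        """Return a record and mark it as the most recently accessed."""
        self._records.move_to_end(hashkey)  # raises KeyError if absent
        return self._records[hashkey]
```

Accessing a record refreshes it, so frequently used notebooks survive longer than ones that were cached once and never read again.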
## Analysing executed/excepted notebooks
You can see the elapsed execution time of a notebook via its ID in the cache:
```{jcache-cli} jupyter_cache.cli.commands.cmd_cache:cmnd_cache
:command: info
:args: 1
```
Failed execution tracebacks are also available on the project record:
```{jcache-cli} jupyter_cache.cli.commands.cmd_notebook:cmnd_notebook
:command: info
:args: --tb tests/notebooks/basic_failing.ipynb
```
```{tip}
Code cells can be tagged with `raises-exception` to let the executor know that a cell *may* raise an exception
(see [this issue on its behaviour](https://github.com/jupyter/nbconvert/issues/730)).
```
## Retrieving executed notebooks
Notebooks added to the project are not modified in any way during or after execution.
You can create a new "final" notebook, with the cached outputs merged into the source notebook with:
```{jcache-cli} jupyter_cache.cli.commands.cmd_notebook:cmnd_notebook
:command: merge
:args: tests/notebooks/basic.md final_notebook.ipynb
```
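Conceptually, merging leaves the source untouched and copies the outputs of each cached code cell onto the matching code cell of a new notebook. A minimal sketch on plain notebook dictionaries, assuming cells are matched purely by position; `merge_outputs` is a hypothetical helper, not the library's function.

```python
import copy
from typing import Any, Dict


def merge_outputs(source_nb: Dict[str, Any], cached_nb: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of source_nb with outputs taken from cached_nb."""
    merged = copy.deepcopy(source_nb)  # the source notebook is never modified
    source_code = [c for c in merged["cells"] if c["cell_type"] == "code"]
    cached_code = [c for c in cached_nb["cells"] if c["cell_type"] == "code"]
    if len(source_code) != len(cached_code):
        raise ValueError("code cell counts differ; the cached record does not match")
    # Assumption: code cells correspond one-to-one, in order.
    for src, cached in zip(source_code, cached_code):
        src["outputs"] = copy.deepcopy(cached.get("outputs", []))
        src["execution_count"] = cached.get("execution_count")
    return merged
```

Markdown cells come from the source notebook only, so prose edits made after execution are preserved in the merged result.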
## Invalidating cached notebooks
If you want to invalidate a notebook's cached execution,
for example if you have changed the notebook's execution environment,
you can do so by calling the `invalidate` command:
```{jcache-cli} jupyter_cache.cli.commands.cmd_notebook:cmnd_notebook
:command: invalidate
:args: tests/notebooks/basic.ipynb
```
```{jcache-cli} jupyter_cache.cli.commands.cmd_notebook:cmnd_notebook
:command: list
```
## Specifying notebooks with assets
When executing in a temporary directory, you may want to specify additional "asset" files that also need to be copied to this directory for the notebook to run.
```{jcache-cli} jupyter_cache.cli.commands.cmd_notebook:cmnd_notebook
:command: remove
:args: tests/notebooks/basic.ipynb
```
```{jcache-cli} jupyter_cache.cli.commands.cmd_notebook:cmnd_notebook
:command: add-with-assets
:args: -nb tests/notebooks/basic.ipynb tests/notebooks/artifact_folder/artifact.txt
```
```{jcache-cli} jupyter_cache.cli.commands.cmd_notebook:cmnd_notebook
:command: info
:args: tests/notebooks/basic.ipynb
```
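The idea behind assets can be sketched as staging the notebook and its extra files in a fresh temporary directory before execution. This is an illustration under stated assumptions: `stage_notebook` is a hypothetical helper, and it assumes every asset lives at or below the notebook's own folder so that relative paths can be preserved.

```python
import shutil
import tempfile
from pathlib import Path
from typing import Iterable


def stage_notebook(notebook: Path, assets: Iterable[Path] = ()) -> Path:
    """Copy a notebook and its assets into a new temporary directory."""
    workdir = Path(tempfile.mkdtemp(prefix="jcache_sketch_"))
    base = notebook.parent
    for path in [notebook, *assets]:
        # relative_to raises ValueError if the asset is outside the notebook folder.
        target = workdir / path.relative_to(base)
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(path, target)
    return workdir  # the caller is responsible for removing this directory
```

Keeping the relative layout means a notebook that opens `artifact_folder/artifact.txt` finds it at the same relative location in the temporary directory.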
## Adding notebooks directly to the cache
Pre-executed notebooks can be added to the cache directly, without executing them.
A check will be made that the notebooks look to have been executed correctly,
i.e. the cell execution counts go sequentially up from 1.
```{jcache-cli} jupyter_cache.cli.commands.cmd_cache:cmnd_cache
:command: add
:args: tests/notebooks/complex_outputs.ipynb
:input: y
```
Or to skip the validation:
```{jcache-cli} jupyter_cache.cli.commands.cmd_cache:cmnd_cache
:command: add
:args: --no-validate tests/notebooks/external_output.ipynb
```
```{jcache-cli} jupyter_cache.cli.commands.cmd_cache:cmnd_cache
:command: list
```
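The validation described earlier in this section, that cell execution counts go sequentially up from 1, can be sketched as follows. `looks_executed` is a hypothetical helper, and treating every code cell (including empty ones) as counted is an assumption.

```python
from typing import Any, Dict


def looks_executed(nb: Dict[str, Any]) -> bool:
    """Check that code cell execution counts are exactly 1, 2, 3, ..."""
    counts = [
        cell.get("execution_count")
        for cell in nb.get("cells", [])
        if cell.get("cell_type") == "code"
    ]
    # Unexecuted cells carry None, so they never match the expected sequence.
    return counts == list(range(1, len(counts) + 1))
```

A notebook whose cells were run out of order, or only partly run, fails this check, which is the situation the `--no-validate` flag is meant to bypass.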
:::{tip}
To show only the latest versions of cached notebooks:
```console
$ jcache cache list --latest-only
```
:::
## Diffing notebooks
You can diff any of the cached notebooks with any (external) notebook:
```{warning}
This requires `pip install nbdime`
```
```{jcache-cli} jupyter_cache.cli.commands.cmd_cache:cmnd_cache
:command: diff
:args: 1 tests/notebooks/basic_unrun.ipynb
```
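If `nbdime` is not available, a rough textual comparison of code cells can still be made with the standard library. The sketch below uses `difflib` instead of `nbdime`, so it is a plain line diff of code sources and ignores outputs and metadata; `diff_code_cells` is a hypothetical helper that assumes each cell source is stored as a single string.

```python
import difflib
from typing import Any, Dict, List


def _code_lines(nb: Dict[str, Any]) -> List[str]:
    """Collect the source lines of all code cells, in order."""
    lines: List[str] = []
    for cell in nb.get("cells", []):
        if cell.get("cell_type") == "code":
            lines.extend(cell.get("source", "").splitlines())
    return lines


def diff_code_cells(
    nb_a: Dict[str, Any],
    nb_b: Dict[str, Any],
    name_a: str = "cached",
    name_b: str = "external",
) -> List[str]:
    """Return a unified diff of the code cell sources of two notebooks."""
    return list(
        difflib.unified_diff(
            _code_lines(nb_a),
            _code_lines(nb_b),
            fromfile=name_a,
            tofile=name_b,
            lineterm="",
        )
    )
```

An empty list means the code cells are textually identical; it says nothing about differences in outputs.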