(use/cli)=

# Command-Line

Note: you can follow this tutorial by cloning <https://github.com/executablebooks/jupyter-cache> and running these commands inside it:
tox
```{jcache-clear}
```

```{jcache-cli} jupyter_cache.cli.commands.cmd_main:jcache
:args: --help
```

The first time the cache is required, it will be lazily created:

```{jcache-cli} jupyter_cache.cli.commands.cmd_notebook:cmnd_notebook
:command: list
:input: y
```

You can specify the path to the cache, with the `--cache-path` option,
or set the `JUPYTERCACHE` environment variable.
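
As a rough illustration of how the two settings above might interact, the following sketch resolves a cache folder from an explicit option, then the `JUPYTERCACHE` environment variable, then a default. The helper name `resolve_cache_path`, the precedence order, and the `.jupyter_cache` default folder are assumptions made for this example, not confirmed behaviour of the CLI.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def resolve_cache_path(
    cli_option: Optional[str] = None,
    environ: Optional[Mapping[str, str]] = None,
) -> Path:
    """Hypothetical helper: explicit option first, then JUPYTERCACHE, then a default."""
    env = os.environ if environ is None else environ
    if cli_option:
        return Path(cli_option)
    if env.get("JUPYTERCACHE"):
        return Path(env["JUPYTERCACHE"])
    # Assumed default: a `.jupyter_cache` folder in the current working directory.
    return Path.cwd() / ".jupyter_cache"


print(resolve_cache_path("my_cache", {"JUPYTERCACHE": "env_cache"}))  # my_cache
print(resolve_cache_path(None, {"JUPYTERCACHE": "env_cache"}))  # env_cache
```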

You can also clear it at any time:

```{jcache-cli} jupyter_cache.cli.commands.cmd_project:cmnd_project
:command: clear
:input: y
```

```{jcache-cli} jupyter_cache.cli.commands.cmd_notebook:cmnd_notebook
:command: list
:input: y
```

````{tip}
Execute this in the terminal for auto-completion:

```console
eval "$(_JCACHE_COMPLETE=source jcache)"
```
````

## Adding notebooks to the project

```{jcache-cli} jupyter_cache.cli.commands.cmd_notebook:cmnd_notebook
:args: --help
```

A project consists of a set of notebooks to be executed.

When adding notebooks to the project, they are recorded by their URI (e.g. file path),
i.e. no physical copying takes place until execution time.

```{jcache-cli} jupyter_cache.cli.commands.cmd_notebook:cmnd_notebook
:command: add
:args: tests/notebooks/basic.ipynb tests/notebooks/basic_failing.ipynb tests/notebooks/basic_unrun.ipynb tests/notebooks/complex_outputs.ipynb tests/notebooks/external_output.ipynb
```

You can list the notebooks in the project. At present, none of them has an existing execution record in the cache:

```{jcache-cli} jupyter_cache.cli.commands.cmd_notebook:cmnd_notebook
:command: list
```

You can remove a notebook from the project by its URI or ID:

```{jcache-cli} jupyter_cache.cli.commands.cmd_notebook:cmnd_notebook
:command: remove
:args: 4
```

```{jcache-cli} jupyter_cache.cli.commands.cmd_notebook:cmnd_notebook
:command: list
```

Or clear all notebooks from the project:

```{jcache-cli} jupyter_cache.cli.commands.cmd_notebook:cmnd_notebook
:command: clear
:input: y
```

## Add a custom reader to read notebook files

By default, notebook files are read using the [nbformat reader](https://nbformat.readthedocs.io/en/latest/api.html#nbformat.read).
However, you can also specify a custom reader, defined by an entry point in the `jcache.readers` group.
Included with jupyter_cache is the [jupytext](https://jupytext.readthedocs.io) reader, for formats like MyST Markdown:

```{jcache-cli} jupyter_cache.cli.commands.cmd_notebook:cmnd_notebook
:command: add
:args: --reader nbformat tests/notebooks/basic.ipynb tests/notebooks/basic_failing.ipynb
```

```{jcache-cli} jupyter_cache.cli.commands.cmd_notebook:cmnd_notebook
:command: add
:args: --reader jupytext tests/notebooks/basic.md
```

```{jcache-cli} jupyter_cache.cli.commands.cmd_notebook:cmnd_notebook
:command: list
```

:::{important}
To use the `jupytext` reader, you must have the `jupytext` package installed.
:::
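
To check which readers are registered in your environment, you can inspect the `jcache.readers` entry point group with the standard library. This is a minimal sketch: the helper name `list_entry_point_names` is hypothetical, and the result depends entirely on which packages are installed.

```python
from importlib.metadata import entry_points


def list_entry_point_names(group: str) -> list:
    """Return the sorted names of all entry points registered in `group`."""
    eps = entry_points()
    if hasattr(eps, "select"):
        # Python 3.10+ returns an object with a `select` method.
        selected = eps.select(group=group)
    else:
        # Python 3.8/3.9 return a plain dict keyed by group name.
        selected = eps.get(group, [])
    return sorted({ep.name for ep in selected})


# An empty list means no package in this environment registers a reader.
print(list_entry_point_names("jcache.readers"))
```

The same helper works for any other entry point group name.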

## Executing the notebooks

Call the `execute` command to execute all notebooks in the project that do not have an existing record in the cache.

Executors are defined by entry points in the `jcache.executors` group.
jupyter-cache includes these executors:

- `local-serial`: execute notebooks with the working directory set to their path, in serial mode (using a single process).
- `local-parallel`: execute notebooks with the working directory set to their path, in parallel mode (using multiple processes).
- `temp-serial`: execute notebooks with a temporary working directory, in serial mode (using a single process).
- `temp-parallel`: execute notebooks with a temporary working directory, in parallel mode (using multiple processes).

```{jcache-cli} jupyter_cache.cli.commands.cmd_project:cmnd_project
:command: execute
:args: --executor local-serial
```

Successfully executed notebooks will now have a record in the cache, uniquely identified by a hash of their code and metadata content:

```{jcache-cli} jupyter_cache.cli.commands.cmd_cache:cmnd_cache
:command: list
:args: --hashkeys
```

These records are then compared to the hashes of notebooks in the project, to find which have up-to-date executions.
Note here both notebooks share the same cached notebook (denoted by `[1]` in the status):

```{jcache-cli} jupyter_cache.cli.commands.cmd_notebook:cmnd_notebook
:command: list
```
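
The matching described above can be pictured with a small sketch. It assumes the hash covers only code-cell sources and a chosen subset of notebook metadata. The function `hash_notebook`, the choice of `kernelspec` as the hashed metadata key, and the use of SHA-256 are illustrative assumptions, not the library's actual implementation.

```python
import copy
import hashlib
import json


def hash_notebook(nb: dict, metadata_keys=("kernelspec",)) -> str:
    """Hypothetical hash over code-cell sources and selected notebook metadata."""
    payload = {
        "code": [c["source"] for c in nb["cells"] if c["cell_type"] == "code"],
        "metadata": {k: nb.get("metadata", {}).get(k) for k in metadata_keys},
    }
    text = json.dumps(payload, sort_keys=True)
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


base = {
    "metadata": {"kernelspec": {"name": "python3"}},
    "cells": [
        {"cell_type": "markdown", "source": "# Title"},
        {"cell_type": "code", "source": "a = 1\nprint(a)", "outputs": []},
    ],
}
reworded = copy.deepcopy(base)
reworded["cells"][0]["source"] = "# New title"  # only the Markdown text changes
edited = copy.deepcopy(base)
edited["cells"][1]["source"] = "a = 2\nprint(a)"  # a code cell changes

cache = {hash_notebook(base): 1}  # maps hash -> cache record ID
print(cache.get(hash_notebook(reworded)))  # 1 (still matches the cached record)
print(cache.get(hash_notebook(edited)))  # None (needs re-execution)
```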

Next time you execute the project, only notebooks which don't match a cached record will be executed:

```{jcache-cli} jupyter_cache.cli.commands.cmd_project:cmnd_project
:command: execute
:args: --executor local-serial -v CRITICAL
```

You can also `force` all notebooks to be re-executed:

```{jcache-cli} jupyter_cache.cli.commands.cmd_project:cmnd_project
:command: execute
:args: --force
```

If you modify a code cell, the notebook will no longer match a cached notebook. If you wish to re-execute unchanged notebooks (for example, if the runtime environment has changed), you can remove their records from the cache (keeping the project record):

```{jcache-cli} jupyter_cache.cli.commands.cmd_cache:cmnd_cache
:command: clear
:input: n
:allow-exception:
```

:::{note}
The number of notebooks in the cache is limited (current default: 1000).
Once this limit is reached, the notebooks that were accessed least recently are deleted first.
You can change this default with `jcache config cache-limit`.
:::
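
The eviction policy in the note can be sketched as a toy in-memory cache that drops the least recently accessed record once the limit is exceeded. The class `LimitedCache` is hypothetical and only illustrates the policy; it says nothing about how the real cache stores its records.

```python
from collections import OrderedDict


class LimitedCache:
    """Toy cache that evicts the least recently accessed record beyond a limit."""

    def __init__(self, limit: int = 1000):
        self.limit = limit
        self._records = OrderedDict()

    def add(self, key, value):
        self._records[key] = value
        self._records.move_to_end(key)  # adding counts as an access
        while len(self._records) > self.limit:
            self._records.popitem(last=False)  # drop the oldest access

    def get(self, key):
        value = self._records[key]
        self._records.move_to_end(key)  # reading refreshes the access time
        return value

    def keys(self):
        return list(self._records)


cache = LimitedCache(limit=2)
cache.add("nb1", "outputs 1")
cache.add("nb2", "outputs 2")
cache.get("nb1")  # nb1 is now the most recently accessed
cache.add("nb3", "outputs 3")  # exceeds the limit, so nb2 is evicted
print(cache.keys())  # ['nb1', 'nb3']
```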

## Analysing executed/excepted notebooks

You can see the elapsed execution time of a notebook via its ID in the cache:

```{jcache-cli} jupyter_cache.cli.commands.cmd_cache:cmnd_cache
:command: info
:args: 1
```

Failed execution tracebacks are also available on the project record:

```{jcache-cli} jupyter_cache.cli.commands.cmd_notebook:cmnd_notebook
:command: info
:args: --tb tests/notebooks/basic_failing.ipynb
```

```{tip}
Code cells can be tagged with `raises-exception` to let the executor know that a cell *may* raise an exception
(see [this issue on its behaviour](https://github.com/jupyter/nbconvert/issues/730)).
```

## Retrieving executed notebooks

Notebooks added to the project are not modified in any way during or after execution.

You can create a new "final" notebook, with the cached outputs merged into the source notebook with:

```{jcache-cli} jupyter_cache.cli.commands.cmd_notebook:cmnd_notebook
:command: merge
:args: tests/notebooks/basic.md final_notebook.ipynb
```
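
Conceptually, a merge of this kind leaves the source notebook untouched and copies the cached outputs onto a new notebook. The sketch below works on plain dictionaries shaped like notebook JSON. The function `merge_outputs` and the assumption that code cells are paired by position are illustrative only.

```python
import copy


def merge_outputs(source_nb: dict, cached_nb: dict) -> dict:
    """Return a copy of `source_nb` whose code cells carry the cached outputs."""
    merged = copy.deepcopy(source_nb)  # the source notebook is never modified
    merged_code = [c for c in merged["cells"] if c["cell_type"] == "code"]
    cached_code = [c for c in cached_nb["cells"] if c["cell_type"] == "code"]
    if len(merged_code) != len(cached_code):
        raise ValueError("source and cached notebooks have different code cells")
    for target, executed in zip(merged_code, cached_code):
        target["outputs"] = copy.deepcopy(executed.get("outputs", []))
        target["execution_count"] = executed.get("execution_count")
    return merged


source = {"cells": [
    {"cell_type": "markdown", "source": "# Title"},
    {"cell_type": "code", "source": "print(1)", "outputs": [], "execution_count": None},
]}
cached = {"cells": [
    {"cell_type": "code", "source": "print(1)", "execution_count": 1,
     "outputs": [{"output_type": "stream", "name": "stdout", "text": "1\n"}]},
]}

final = merge_outputs(source, cached)
print(final["cells"][1]["execution_count"])  # 1
print(source["cells"][1]["outputs"])  # [] (source left unchanged)
```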

## Invalidating cached notebooks

If you want to invalidate a notebook's cached execution,
for example if you have changed the notebook's execution environment,
you can do so by calling the `invalidate` command:

```{jcache-cli} jupyter_cache.cli.commands.cmd_notebook:cmnd_notebook
:command: invalidate
:args: tests/notebooks/basic.ipynb
```

```{jcache-cli} jupyter_cache.cli.commands.cmd_notebook:cmnd_notebook
:command: list
```

## Specifying notebooks with assets

When executing in a temporary directory, you may want to specify additional "asset" files that also need to be copied to this directory for the notebook to run.
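
The idea can be sketched as staging a notebook and its assets in a temporary directory before execution. The helper `stage_in_temp_dir` is hypothetical, and the rule that assets keep their path relative to the notebook's folder is an assumption made for this example.

```python
import shutil
import tempfile
from pathlib import Path


def stage_in_temp_dir(notebook: Path, assets, temp_dir: Path) -> Path:
    """Copy a notebook and its assets into `temp_dir`, keeping relative paths."""
    root = notebook.parent
    staged_nb = temp_dir / notebook.name
    shutil.copy2(notebook, staged_nb)
    for asset in assets:
        # Assumption: assets live in or below the notebook's folder;
        # relative_to raises ValueError otherwise.
        relative = Path(asset).relative_to(root)
        target = temp_dir / relative
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(asset, target)
    return staged_nb


with tempfile.TemporaryDirectory() as src, tempfile.TemporaryDirectory() as tmp:
    src, tmp = Path(src), Path(tmp)
    nb = src / "basic.ipynb"
    nb.write_text("{}")
    asset = src / "artifact_folder" / "artifact.txt"
    asset.parent.mkdir()
    asset.write_text("data")
    stage_in_temp_dir(nb, [asset], tmp)
    staged = sorted(p.relative_to(tmp).as_posix() for p in tmp.rglob("*"))
    print(staged)  # ['artifact_folder', 'artifact_folder/artifact.txt', 'basic.ipynb']
```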

```{jcache-cli} jupyter_cache.cli.commands.cmd_notebook:cmnd_notebook
:command: remove
:args: tests/notebooks/basic.ipynb
```

```{jcache-cli} jupyter_cache.cli.commands.cmd_notebook:cmnd_notebook
:command: add-with-assets
:args: -nb tests/notebooks/basic.ipynb tests/notebooks/artifact_folder/artifact.txt
```

```{jcache-cli} jupyter_cache.cli.commands.cmd_notebook:cmnd_notebook
:command: info
:args: tests/notebooks/basic.ipynb
```

## Adding notebooks directly to the cache

Pre-executed notebooks can be added to the cache directly, without executing them.

A check will be made that the notebooks look to have been executed correctly,
i.e. the cell execution counts go sequentially up from 1.
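
A check of this kind can be sketched on a notebook held as a plain dictionary. The function `check_execution_counts` and its error message are hypothetical; the sketch only shows the "counts go 1, 2, 3, ..." rule described above.

```python
def check_execution_counts(nb: dict) -> None:
    """Raise ValueError unless code cells are numbered 1, 2, 3, ... in order."""
    expected = 1
    for index, cell in enumerate(nb["cells"]):
        if cell["cell_type"] != "code":
            continue  # Markdown and raw cells have no execution count
        found = cell.get("execution_count")
        if found != expected:
            raise ValueError(
                f"cell {index}: expected execution_count {expected}, got {found}"
            )
        expected += 1


executed = {"cells": [
    {"cell_type": "code", "execution_count": 1},
    {"cell_type": "markdown"},
    {"cell_type": "code", "execution_count": 2},
]}
unrun = {"cells": [{"cell_type": "code", "execution_count": None}]}

check_execution_counts(executed)  # passes silently
try:
    check_execution_counts(unrun)
except ValueError as err:
    print(err)  # cell 0: expected execution_count 1, got None
```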

```{jcache-cli} jupyter_cache.cli.commands.cmd_cache:cmnd_cache
:command: add
:args: tests/notebooks/complex_outputs.ipynb
:input: y
```

Or to skip the validation:

```{jcache-cli} jupyter_cache.cli.commands.cmd_cache:cmnd_cache
:command: add
:args: --no-validate tests/notebooks/external_output.ipynb
```

```{jcache-cli} jupyter_cache.cli.commands.cmd_cache:cmnd_cache
:command: list
```

:::{tip}
To show only the latest versions of cached notebooks:

```console
$ jcache cache list --latest-only
```

:::

## Diffing notebooks

You can diff any of the cached notebooks with any (external) notebook:

```{warning}
This requires `pip install nbdime`
```

```{jcache-cli} jupyter_cache.cli.commands.cmd_cache:cmnd_cache
:command: diff
:args: 1 tests/notebooks/basic_unrun.ipynb
```