File: README.rst

BorgStore
=========

A key/value store implementation in Python, supporting multiple backends.

Keys
----

A key (str) can look like:

- 0123456789abcdef... (usually a long, hex-encoded hash value)
- Any other pure ASCII string without '/', '..', or spaces.


Namespaces
----------

To keep things separate, keys should be prefixed with a namespace, such as:

- config/settings
- meta/0123456789abcdef...
- data/0123456789abcdef...

Please note:

1. You should always use namespaces.
2. Nested namespaces like namespace1/namespace2/key are not supported.
3. The code can work without a namespace (empty namespace ""), but then you
   can't add another namespace later, because that would create
   nested namespaces.

Values
------

Values can be any arbitrary binary data (bytes).

Store Operations
----------------

The high-level Store API implementation transparently deals with nesting and
soft deletion, so the caller doesn't need to care much about that, and the backend
API can be much simpler:

- create/destroy: initialize or remove the whole store.
- list: flat list of the items in the given namespace (by default, only non-deleted
  items; optionally, only soft-deleted items).
- store: write a new item into the store (providing its key/value pair).
- load: read a value from the store (given its key); partial loads specifying
  an offset and/or size are supported.
- info: get information about an item via its key (exists, size, ...).
- delete: immediately remove an item from the store (given its key).
- move: implements renaming, soft delete/undelete, and moving to the current
  nesting level.
- stats: API call counters, time spent in API methods, data volume/throughput.
- latency/bandwidth emulator: can emulate higher latency (via BORGSTORE_LATENCY
  [us]) and lower bandwidth (via BORGSTORE_BANDWIDTH [bit/s]) than what is
  actually provided by the backend.

Store operations (and per-op timing and volume) are logged at DEBUG log level.
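
The following is a minimal usage sketch. It assumes the operations listed
above map directly to ``Store`` methods and that the calls shown exist in this
form; check the actual API for exact signatures::

    from borgstore.store import Store

    store = Store(url="file:///tmp/demo-store")
    store.create()                                  # initialize the store
    store.open()
    try:
        store.store("config/settings", b"some binary value")
        value = store.load("config/settings")       # read it back
        info = store.info("config/settings")        # exists, size, ...
        for item in store.list("config"):           # non-deleted items in the namespace
            print(item)
        store.delete("config/settings")             # hard delete
    finally:
        store.close()

    # To emulate a slow backend, set BORGSTORE_LATENCY (microseconds) and/or
    # BORGSTORE_BANDWIDTH (bit/s) in the environment before running this.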

Automatic Nesting
-----------------

For the Store user, items have names such as:

- namespace/0123456789abcdef...
- namespace/abcdef0123456789...

If there are very many items in the namespace, this could lead to scalability
issues in the backend. The Store implementation therefore offers transparent
nesting, so that internally the backend API is called with names such as:

- namespace/01/23/45/0123456789abcdef...
- namespace/ab/cd/ef/abcdef0123456789...

The nesting depth can be configured from 0 (= no nesting) to N levels and
there can be different nesting configurations depending on the namespace.

The Store supports operating at different nesting levels in the same
namespace at the same time.

When using nesting depth > 0, the backends assume that keys are hashes
(contain hex digits) because some backends pre-create the nesting
directories at initialization time to optimize backend performance.
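
The nested name is derived from the key itself. The helper below is purely
illustrative (it is not borgstore's internal API) and assumes two-character
levels, as in the examples above::

    def nested_name(name: str, levels: int = 3) -> str:
        """Illustration only: insert `levels` two-character components taken
        from the key prefix, e.g. "data/0123..." -> "data/01/23/45/0123..."."""
        namespace, key = name.rsplit("/", 1)
        prefixes = [key[2 * i : 2 * i + 2] for i in range(levels)]
        return "/".join([namespace, *prefixes, key])

    nested_name("data/0123456789abcdef", levels=3)
    # -> 'data/01/23/45/0123456789abcdef'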

Soft deletion
-------------

To soft-delete an item (so its value can still be read or it can be
undeleted), the store just renames the item, appending ".del" to its name.

Undelete reverses this by removing the ".del" suffix from the name.

Some store operations provide a boolean flag "deleted" to control whether they
consider soft-deleted items.
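
In other words, the visible effect is just a name change; the value itself is
untouched. A small illustration (the keyword names in the commented calls are
assumptions, not confirmed API)::

    name = "data/0123456789abcdef"
    soft_deleted = name + ".del"                   # name after soft deletion
    restored = soft_deleted.removesuffix(".del")   # name after undelete

    # Sketch of the corresponding Store calls (keyword names assumed):
    # store.move(name, delete=True)                # soft delete
    # store.move(name, undelete=True)              # undelete
    # store.list("data", deleted=True)             # list only soft-deleted items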

Backends
--------

The backend API is rather simple; one only needs to provide some very
basic operations.

Existing backends are listed below; more might come in the future.

posixfs
~~~~~~~

Use storage on a local POSIX filesystem:

- URL: ``file:///absolute/path``
- It is the caller's responsibility to convert a relative path into an absolute
  filesystem path.
- Namespaces: directories
- Values: in key-named files
- Permissions: This backend can enforce a simple, test-friendly permission system
  and raises ``PermissionDenied`` if access is not permitted by the configuration.

  You provide a mapping of names (paths) to granted permission letters. Permissions
  apply to the exact name and all of its descendants (inheritance). If a name is not
  present in the mapping, its nearest ancestor is consulted, up to the empty name
  "" (the store root). If no mapping is provided at all, all operations are allowed.

  Permission letters:

  - ``l``: allow listing object names (directory/namespace listing)
  - ``r``: allow reading objects (contents)
  - ``w``: allow writing new objects (must not already exist)
  - ``W``: allow writing objects including overwriting existing objects
  - ``D``: allow deleting objects

  Operation requirements:

  - create(): requires ``w`` or ``W`` on the store root (``wW``)
  - destroy(): requires ``D`` on the store root
  - mkdir(name): requires ``w``
  - rmdir(name): requires ``w`` or ``D`` (``wD``)
  - list(name): requires ``l``
  - info(name): requires ``l`` (``r`` also accepted)
  - load(name): requires ``r``
  - store(name, value): requires ``w`` for new objects, ``W`` for overwrites (``wW``)
  - delete(name): requires ``D``
  - move(src, dst): requires ``D`` for the source and ``w``/``W`` for the destination

  Examples:

  - Read-only store (recursively): ``permissions = {"": "lr"}``
  - No-delete, no-overwrite (but allow adding new items): ``permissions = {"": "lrw"}``
  - Hierarchical rules: only allow listing at root, allow read/write in "dir",
    but only read for "dir/file":

    ::

        permissions = {
            "": "l",
            "dir": "lrw",
            "dir/file": "r",
        }

  To use permissions with ``Store`` and ``posixfs``, pass the mapping to Store and it
  will be handed to the posixfs backend:

  ::

      from borgstore.store import Store
      store = Store(url="file:///abs/path", permissions={"": "lrwWD"})
      store.create()
      store.open()
      # ...
      store.close()

sftp
~~~~

Use storage on an SFTP server:

- URL: ``sftp://user@server:port/relative/path`` (strongly recommended)

  For users' and admins' convenience, the mapping of the URL path to the server filesystem path
  depends on the server configuration (home directory, sshd/sftpd config, ...).
  Usually the path is relative to the user's home directory.
- URL: ``sftp://user@server:port//absolute/path``

  As this uses an absolute path, some things become more difficult:

  - A user's configuration might break if a server admin moves a user's home to a new location.
  - Users must know the full absolute path of the space they are permitted to use.
- Namespaces: directories
- Values: in key-named files
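
For example, following the same ``Store(url=...)`` pattern as in the posixfs
example (user, host, and paths below are placeholders)::

    from borgstore.store import Store

    # path relative to the user's home directory on the server:
    store = Store(url="sftp://backup@example.org:22/stores/mystore")

    # or an absolute server-side path (note the double slash):
    store = Store(url="sftp://backup@example.org:22//srv/stores/mystore")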

rclone
~~~~~~

Use storage on any of the many cloud providers `rclone <https://rclone.org/>`_ supports:

- URL: ``rclone:remote:path`` — we just prefix "rclone:" and pass everything to the right
  of that to rclone; see: https://rclone.org/docs/#syntax-of-remote-paths
- Behaviour and performance depend primarily on the specific rclone remote used.
- The rclone binary path can be set via the environment variable ``RCLONE_BINARY`` (default: "rclone").
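
For example (the remote name and path are placeholders; the remote must
already be configured via ``rclone config``)::

    import os
    from borgstore.store import Store

    # optional: point to a specific rclone binary (default: "rclone")
    os.environ["RCLONE_BINARY"] = "/usr/local/bin/rclone"

    store = Store(url="rclone:myremote:bucket/borgstore")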


s3
~~

Use storage on an S3-compliant cloud service:

- URL: ``(s3|b2):[profile|(access_key_id:access_key_secret)@][scheme://hostname[:port]]/bucket/path``

  The underlying backend is based on ``boto3``, so all standard boto3 authentication methods are supported:

  - provide a named profile (from your boto3 config),
  - include access key ID and secret in the URL,
  - or use default credentials (e.g., environment variables, IAM roles, etc.).

  See the `boto3 credentials documentation <https://boto3.amazonaws.com/v1/documentation/api/latest/guide/credentials.html>`_ for more details.

  If you're connecting to **AWS S3**, the ``[scheme://hostname[:port]]`` part is optional.
  Bucket and path are always required.

  .. note::

     There is a known issue with some S3-compatible services (e.g., **Backblaze B2**).
     If you encounter problems, try using ``b2:`` instead of ``s3:`` in the URL.

- Namespaces: directories
- Values: in key-named files
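
Some example URLs following the pattern above (bucket names, hostnames,
profiles, and keys are placeholders)::

    from borgstore.store import Store

    # AWS S3, default boto3 credentials, endpoint part omitted:
    store = Store(url="s3:/mybucket/mystore")

    # named boto3 profile and an explicit endpoint:
    store = Store(url="s3:myprofile@https://s3.example.org:9000/mybucket/mystore")

    # explicit access key pair, Backblaze B2 compatibility mode:
    store = Store(url="b2:KEY_ID:KEY_SECRET@https://s3.example-b2-endpoint.com/mybucket/mystore")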


Scalability
-----------

- Count of key/value pairs stored in a namespace: automatic nesting is
  provided for keys to address common scalability issues.
- Key size: there are no special provisions for extremely long keys (e.g.,
  exceeding backend limitations). Usually this is not a problem, though.
- Value size: there are no special provisions for dealing with large value
  sizes (e.g., more than available memory, more than backend storage limitations,
  etc.). If one deals with very large values, one usually cuts them into
  chunks before storing them in the store.
- Partial loads improve performance by avoiding a full load if only part
  of the value is needed (e.g., a header with metadata).
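
For example, reading only a small metadata header before deciding whether the
full value is needed, assuming an open store as in the earlier sketch (the
``size``/``offset`` keyword names are assumptions)::

    # read just the first 128 bytes (e.g., a metadata header):
    header = store.load("data/0123456789abcdef", size=128)

    # read the rest, skipping the header:
    rest = store.load("data/0123456789abcdef", offset=128)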

Installation
------------

Install without the ``sftp:`` or ``s3:`` backend::

    pip install borgstore
    pip install "borgstore[none]"  # same thing (simplifies automation)

Install with the ``sftp:`` backend (more dependencies)::

    pip install "borgstore[sftp]"

Install with the ``s3:`` backend (more dependencies)::

    pip install "borgstore[s3]"

Please note that ``rclone:`` also supports SFTP and S3 remotes.

Want a demo?
------------

Run this to get instructions on how to run the demo::

    python3 -m borgstore

State of this project
---------------------

**API is still unstable and expected to change as development goes on.**

**As long as the API is unstable, there will be no data migration tools,
such as tools for upgrading an existing store's data to a new release.**

There are tests, and they pass for the basic functionality, so the core functionality already works well.

There might be missing features or optimization potential. Feedback is welcome!

Many possible backends are still missing. If you want to create and support one, pull requests are welcome.

Borg?
-----

Please note that this code is currently **not** used by the stable release of
BorgBackup (also known as "borg"), but only by Borg 2 beta 10+ and the master branch.

License
-------

BSD license.