File: accounting-overview.txt

package info (click to toggle)
tahoe-lafs 1.9.2-1
  • links: PTS, VCS
  • area: main
  • in suites: wheezy
  • size: 7,240 kB
  • sloc: python: 71,758; makefile: 215; lisp: 89
file content (722 lines) | stat: -rw-r--r-- 36,740 bytes parent folder | download | duplicates (2)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
716
717
718
719
720
721
722

= Accounting =

"Accounting" is the arena of the Tahoe system that concerns measuring,
controlling, and enabling the ability to upload and download files, and to
create new directories. In contrast with the capability-based access control
model, which dictates how specific files and directories may or may not be
manipulated, Accounting is concerned with resource consumption: how much disk
space a given person/account/entity can use.

Tahoe releases up to and including 1.4.1 have a nearly-unbounded resource
usage model. Anybody who can talk to the Introducer gets to talk to all the
Storage Servers, and anyone who can talk to a Storage Server gets to use as
much disk space as they want (up to the reserved_space= limit imposed by the
server, which affects all users equally). Not only is the per-user space
usage unlimited, it is also unmeasured: the owner of the Storage Server has
no way to find out how much space Alice or Bob is using.

The goals of the Accounting system are thus:

 * allow the owner of a storage server to control who gets to use disk space,
   with separate limits per user
 * allow both the server owner and the user to measure how much space the user
   is consuming, in an efficient manner
 * provide grid-wide aggregation tools, so a set of cooperating server
   operators can easily measure how much a given user is consuming across all
   servers. This information should also be available to the user in question.

For the purposes of this document, the terms "Account" and "User" are mostly
interchangeable. The fundamental unit of Accounting is the "Account", in that
usage and quota enforcement is performed separately for each account. These
accounts might correspond to individual human users, or they might be shared
among a group, or a user might have an arbitrary number of accounts.

Accounting interacts with Garbage Collection. To protect their shares from
GC, clients maintain limited-duration leases on those shares: when the last
lease expires, the share is deleted. Each lease has a "label", which
indicates the account or user which wants to keep the share alive. A given
account's "usage" (their per-server aggregate usage) is simply the sum of the
sizes of all shares on which they hold a lease. The storage server may limit
the user to a fixed "quota" (an upper bound on their usage). To keep a file
alive, the user must be willing to use up some of their quota.

Note that a popular file might have leases from multiple users, in which case
one user might take a chance and decline to add their own lease, saving some
of their quota and hoping that the other leases continue to keep the file
alive despite their personal unwillingness to contribute to the effort. One
could imagine a "pro-rated quotas" scheme, in which a 10MB file with 5
leaseholders would deduct 2MB from each leaseholder's quota. We have decided
to not implement pro-rated quotas, because such a scheme would make usage
values hard to predict: a given account might suddenly go over quota solely
because of a third party's actions.

== Authority Flow ==

The authority to consume space on the storage server originates, of course,
with the storage server operator. These operators start with complete control
over their space, and delegate portions of it to others: either directly to
clients who want to upload files, or to intermediaries who can then delegate
attenuated authority onwards. The operators have various reasons for wanting
to share their space: monetary consideration, expectations of in-kind
exchange, or simple generosity. But the final authority always rests with the
operator.

The server operator grants limited authority over their space by configuring
their server to accept requests that demonstrate knowledge of certain
secrets. They then share those secrets with the client who intends to use
this space, or with an intermediary who will generate still more secrets and
share those with the client. Eventually, an upload or create-directory
operation will be performed that needs this authority. Part of the operation
will involve proving knowledge of the secret to the storage server, and the
server will require this proof before accepting the uploaded share or adding
a new lease.

The authority is expressed as a string, containing cryptographically-signed
messages and keys. The string also contains "restrictions", which are
annotations that explain the limits imposed upon this authority, either by
the original grantor (the storage server operator) or by one of the
intermediaries. Authority can be reduced but not increased. Any holder of a
given authority can delegate some or all of it to another party.

The authority string may be short enough to include as an argument to a CLI
command (--with-authority ABCDE), or it may be long enough that it must be
stashed in a file and referenced in some other fashion (--with-authority-file
~/.my_authority). There are CLI tools to create brand new authority strings,
to derive attenuated authorities from an existing one, and to explain the
contents of an authority string. These authority strings can be shared with
others just like filecaps and dircaps: knowledge of the authority string is
both necessary and complete to wield the authority it represents.

Web-API requests will include the authority necessary to complete the
operation. When used by a CLI tool, the authority is likely to come from
~/.tahoe/private/authority (i.e. it is ambient to the user who has access to
that node, just like aliases provide similar access to a specific "root
directory"). When used by the browser-oriented WUI, the authority will [TODO]
somehow be retained on each page in a way that minimizes the risk of CSRF
attacks and allows safe sharing (cut-and-paste of a URL without sharing the
storage authority too). The client node receiving the web-API request will
extract the authority string from the request and use it to build the storage
server messages that it sends to fulfill that request.

== Definition Of Authority ==

The term "authority" is used here in the object-capability sense: it refers
to the ability of some principal to cause some action to occur, whether
because they can do it themselves, or because they can convince some other
principal to do it for them. In Tahoe terms, "storage authority" is the
ability to do one of the following actions:

 * upload a new share, thus consuming storage space
 * adding a new lease to a share, thus preventing space from being reclaimed
 * modify an existing mutable share, potentially increasing the space consumed

The Accounting effort may involve other kinds of authority that get limited
in a similar manner as storage authority, like the ability to download a
share or query whether a given share is present: anything that may consume
CPU time, disk bandwidth, or other limited resources. The authority to renew
or cancel a lease may be controlled in a similar fashion.

Storage authority, as granted from a server operator to a client, is not
simply a binary "use space or not" grant. Instead, it is parameterized by a
number of "restrictions". The most important of these restrictions (with
respect to the goals of Accounting) is the "Account Label".

=== Account Labels ===

A Tahoe "Account" is defined by a variable-length sequence of small integers.
(they are not required to be small, the actual limit is 2**64, but neither
are they required to be unguessable). For the purposes of discussion, these
lists will be expressed as period-joined strings: the two-element list (1,4)
will be displayed here as "1.4".

These accounts are arranged in a hierarchy: the account identifier 1.4 is
considered to be a "parent" of 1.4.2 . There is no relationship between the
values used by unrelated accounts: 1.4 is unrelated to 2.4, despite both
coincidentally using a "4" in the second element.

Each lease has a label, which contains the Account identifier. The storage
server maintains an aggregate size count for each label prefix: when asked
about account 1.4, it will report the amount of space used by shares labeled
1.4, 1.4.2, 1.4.7, 1.4.7.8, etc (but *not* 1 or 1.5).

The "Account Label" restriction allows a client to apply any label it wants,
as long as that label begins with a specific prefix. If account 1 is
associated with Alice, then Alice will receive a storage authority string
that contains a "must start with 1" restriction, enabling her to to use
storage space but obligating her to lease her shares with a label that can be
traced back to her. She can delegate part of her authority to others (perhaps
with other non-label restrictions, such as a space restriction or time limit)
with or without an additional label restriction. For example, she might
delegate some of her authority to her friend Amy, with a 1.4 label
restriction. Amy could then create labels with 1.4 or 1.4.7, but she could
not create labels with the same 1 identifier that Alice can do, nor could she
create labels with 1.5 (which Alice might have given to her other friend
Annette). The storage server operator can ask about the usage of 1 to find
out how much Alice is responsible for (which includes the space that she has
delegated to Amy and Annette), and none of the A-users can avoid being
counted in this total. But Alice can ask the storage server about the usage
of 1.4 to find out how much Amy has taken advantage of her gift. Likewise,
Alice has control over any lease with a label that begins with 1, so she can
cancel Amy's leases and free the space they were consuming. If this seems
surprising, consider that the storage server operator considered Alice to be
responsible for that space anyways: with great responsibility (for space
consumed) comes great power (to stop consuming that space).

=== Server Space Restriction ===

The storage server's basic control over how space usage (apart from the
binary use-it-or-not authority granted by handing out an authority string at
all) is implemented by keeping track of the space used by any given account
identifier. If account 1.4 sends a request to allocate a 1MB share, but that
1MB would bring the 1.4 usage over its quota, the request will be denied.

For this to be useful, the storage server must give each usage-limited
principal a separate account, and it needs to configure a size limit at the
same time as the authority string is minted. For a friendnet, the CLI "add
account" tool can do both at once:

 tahoe server add-account --quota 5GB Alice
 --> Please give the following authority string to "Alice", who should
     provide it to the "tahoe add-authority" command
     (authority string..)

This command will allocate an account identifier, add Alice to the "pet name
table" to associate it with the new account, and establish the 5GB sizelimit.
Both the sizelimit and the petname can be changed later.

Note that this restriction is independent for each server: some additional
mechanism must be used to provide a grid-wide restriction.

Also note that this restriction is not expressed in the authority string. It
is purely local to the storage server.

=== Attenuated Server Space Restriction ===

TODO (or not)

The server-side space restriction described above can only be applied by the
storage server, and cannot be attenuated by other delegates. Alice might be
allowed to use 5GB on this server, but she cannot use that restriction to
delegate, say, just 1GB to Amy.

Instead, Alice's sub-delegation should include a "server_size" restriction
key, which contains a size limit. The storage server will only honor a
request that uses this authority string if it does not cause the aggregate
usage of this authority string's account prefix to rise above the given size
limit.

Note that this will not enforce the desired restriction if the size limits
are not consistent across multiple delegated authorities for the same label.
For example, if Amy ends up with two delagations, A1 (which gives her a size
limit of 1GB) and A2 (which gives her 5GB), then she can consume 5GB despite
the limit in A1.

=== Other Restrictions ===

Many storage authority restrictions are meant for internal use by tahoe tools
as they delegate short-lived subauthorities to each other, and are not likely
to be set by end users.

 * "SI": a storage index string. The authority can only be used to upload
   shares of a single file.
 * "serverid": a server identifier. The authority can only be used when
   talking to a specific server
 * "UEB_hash": a binary hash. The authority can only be used to upload shares
   of a single file, identified by its share's contents. (note: this
   restricton would require the server to parse the share and validate the
   hash)
 * "before": a timestamp. The authority is only valid until a specific time.
   Requires synchronized clocks or a better definition of "timestamp".
 * "delegate_to_furl": a string, used to acquire a FURL for an object that
   contains the attenuated authority. When it comes time to actually use the
   authority string to do something, this is the first step.
 * "delegate_to_key": an ECDSA pubkey, used to grant attenuated authority to
   a separate private key.

== User Experience ==

The process starts with Bob the storage server operator, who has just created
a new Storage Server:

 tahoe create-node
 --> creates ~/.tahoe
 # edit ~/.tahoe/tahoe.cfg, add introducer.furl, configure storage, etc

Now Bob decides that he wants to let his friend Alice use 5GB of space on his
new server.

 tahoe server add-account --quota=5GB Alice
 --> Please give the following authority string to "Alice", who should
     provide it to the "tahoe add-authority" command
     (authority string XYZ..)

Bob copies the new authority string into an email message and sends it to
Alice. Meanwhile, Alice has created her own client, and attached it to the
same Introducer as Bob. When she gets the email, she pastes the authority
string into her local client:

 tahoe client add-authority (authority string XYZ..)
 --> new authority added: account (1)

Now all CLI commands that Alice runs with her node will take advantage of
Bob's space grant. Once Alice's node connects to Bob's, any upload which
needs to send a share to Bob's server will search her list of authorities to
find one that allows her to use Bob's server.

When Alice uses her WUI, upload will be disabled until and unless she pastes
one or more authority strings into a special "storage authority" box. TODO:
Once pasted, we'll use some trick to keep the authority around in a
convenient-yet-safe fashion.

When Alice uses her javascript-based web drive, the javascript program will
be launched with some trick to hand it the storage authorities, perhaps via a
fragment identifier (http://server/path#fragment).

If Alice decides that she wants Amy to have some space, she takes the
authority string that Bob gave her and uses it to create one for Amy:

 tahoe authority dump (authority string XYZ..)
 --> explanation of what is in XYZ
 tahoe authority delegate --account 4,1 --space 2GB (authority string XYZ..)
 --> (new authority string ABC..)

Alice sends the ABC string to Amy, who uses "tahoe client add-authority" to
start using it.

Later, Bob would like to find out how much space Alice is using. He brings up
his node's Storage Server Web Status page. In addition to the overall usage
numbers, the page will have a collapsible-treeview table with lines like:

 AccountID  Usage  TotalUsage Petname
 (1)        1.5GB  2.5GB      Alice
 +(1,4)     1.0GB  1.0GB      ?

This indicates that Alice, as a whole, is using 2.5GB. It also indicates that
Alice has delegated some space to a (1,4) account, and that delegation has
used 1.0GB. Alice has used 1.5GB on her own, but is responsible for the full
2.5GB. If Alice tells Bob that the subaccount is for Amy, then Bob can assign
a pet name for (1,4) with "tahoe server add-pet-name 1,4 Amy". Note that Bob
is not aware of the 2GB limit that Alice has imposed upon Amy: the size
restriction may have appeared on all the requests that have showed up thus
far, but Bob has no way of being sure that a less-restrictive delgation
hasn't been created, so his UI does not attempt to remember or present the
restrictions it has seen before.

=== Friendnet ===

A "friendnet" is a set of nodes, each of which is both a storage server and a
client, each operated by a separate person, all of which have granted storage
rights to the others.

The simplest way to get a friendnet started is to simply grant storage
authority to everybody. "tahoe server enable-ambient-storage-authority" will
configure the storage server to give space to anyone who asks. This behaves
just like a 1.3.0 server, without accounting of any sort.

The next step is to restrict server use to just the participants. "tahoe
server disable-ambient-storage-authority" will undo the previous step, then
there are two basic approaches:

 * "full mesh": each node grants authority directory to all the others.
   First, agree upon a userid number for each participant (the value doesn't
   matter, as long as it is unique). Each user should then use "tahoe server
   add-account" for all the accounts (including themselves, if they want some
   of their shares to land on their own machine), including a quota if they
   wish to restrict individuals:

    tahoe server add-account --account 1 --quota 5GB Alice
    --> authority string for Alice
    tahoe server add-account --account 2 --quota 5GB Bob
    --> authority string for Bob
    tahoe server add-account --account 3 --quota 5GB Carol
    --> authority string for Carol

  Then email Alice's string to Alice, Bob's string to Bob, etc. Once all
  users have used "tahoe client add-authority" on everything, each server
  will accept N distinct authorities, and each client will hold N distinct
  authorities.

 * "account manager": the group designates somebody to be the "AM", or
   "account manager". The AM generates a keypair and publishes the public key
   to all the participants, who create a local authority which delgates full
   storage rights to the corresponding private key. The AM then delegates
   account-restricted authority to each user, sending them their personal
   authority string:

    AM:
     tahoe authority create-authority --write-private-to=private.txt
     --> public.txt
     # email public.txt to all members
    AM:
     tahoe authority delegate --from-file=private.txt --account 1 --quota 5GB
     --> alice_authority.txt # email this to Alice
     tahoe authority delegate --from-file=private.txt --account 2 --quota 5GB
     --> bob_authority.txt # email this to Bob
     tahoe authority delegate --from-file=private.txt --account 3 --quota 5GB
     --> carol_authority.txt # email this to Carol
     ...
    Alice:
     # receives alice_authority.txt
     tahoe client add-authority --from-file=alice_authority.txt
     # receives public.txt
     tahoe server add-authorization --from-file=public.txt
    Bob:
     # receives bob_authority.txt
     tahoe client add-authority --from-file=bob_authority.txt
     # receives public.txt
     tahoe server add-authorization --from-file=public.txt
    Carol:
     # receives carol_authority.txt
     tahoe client add-authority --from-file=carol_authority.txt
     # receives public.txt
     tahoe server add-authorization --from-file=public.txt

   If the members want to see names next to their local usage totals, they
   can set local petnames for the accounts:

     tahoe server set-petname 1 Alice
     tahoe server set-petname 2 Bob
     tahoe server set-petname 3 Carol

   Alternatively, the AM could provide a usage aggregator, which will collect
   usage values from all the storage servers and show the totals in a single
   place, and add the petnames to that display instead.

   The AM gets more authority than anyone else (they can spoof everybody),
   but each server has just a single authorization instead of N, and each
   client has a single authority instead of N. When a new member joins the
   group, the amount of work that must be done is significantly less, and
   only two parties are involved instead of all N:

    AM:
     tahoe authority delegate --from-file=private.txt --account 4 --quota 5GB
     --> dave_authority.txt # email this to Dave
    Dave:
     # receives dave_authority.txt
     tahoe client add-authority --from-file=dave_authority.txt
     # receives public.txt
     tahoe server add-authorization --from-file=public.txt

   Another approach is to let everybody be the AM: instead of keeping the
   private.txt file secret, give it to all members of the group (but not to
   outsiders). This lets current members bring new members into the group
   without depending upon anybody else doing work. It also renders any notion
   of enforced quotas meaningless, so it is only appropriate for actual
   friends who are voluntarily refraining from spoofing each other.

=== Commercial Grid ===

A "commercial grid", like the one that allmydata.com manages as a for-profit
service, is characterized by a large number of independent clients (who do
not know each other), and by all of the storage servers being managed by a
single entity. In this case, we use an Account Manager like above, to
collapse the potential N*M explosion of authorities into something smaller.
We also create a dummy "parent" account, and give all the real clients
subaccounts under it, to give the operations personnel a convenient "total
space used" number. Each time a new customer joins, the AM is directed to
create a new authority for them, and the resulting string is provided to the
customer's client node.

 AM:
  tahoe authority create-authority --account 1 \
   --write-private-to=AM-private.txt --write-public-to=AM-public.txt

Each time a new storage server is brought up:

 SERVER:
  tahoe server add-authorization --from-file=AM-public.txt

Each time a new client joins:

 AM:
  N = next_account++
  tahoe authority delegate --from-file=AM-private.txt --account 1,N
  --> new_client_authority.txt # give this to new client

== Programmatic Interfaces ==

The storage authority can be passed as a string in a single serialized form,
which is cut-and-pasteable and printable. It uses minimal punctuation, to
make it possible to include it as a URL query argument or HTTP header field
without requiring character-escaping.

Before passing it over HTTP, however, note that revealing the authority
string to someone is equivalent to irrevocably delegating all that authority
to them. While this is appropriate when transferring authority from, say, a
receptive storage server to your local agent, it is not appropriate when
using a foreign tahoe node, or when asking a Helper to upload a specific
file. Attenuations (see below) should be used to limit the delegated
authority in these cases.

In the programmatic web-API, any operation that consumes storage will accept
a storage-authority= query argument, the value of which will be the printable
form of an authority string. This includes all PUT operations, POST t=upload
and t=mkdir, and anything which creates a new file, creates a directory
(perhaps an intermediate one), or modifies a mutable file.

Alternatively, the authority string can also be passed through an HTTP
header. A single "X-Tahoe-Storage-Authority:" header can be used with the
printable authority string. If the string is too large to fit in a single
header, the application can provide a series of numbered
"X-Tahoe-Storage-Authority-1:", "X-Tahoe-Storage-Authority-2:", etc, headers,
and these will be sorted in alphabetical order (please use 08/09/10/11 rather
than 8/9/10/11), stripped of leading and trailing whitespace, and
concatenated. The HTTP header form can accomodate larger authority strings,
since these strings can grow too large to pass as a query argument
(especially when several delegations or attenuations are involved). However,
depending upon the HTTP client library being used, passing extra HTTP headers
may be more complicated than simply modifying the URL, and may be impossible
in some cases (such as javascript running in a web browser).

TODO: we may add a stored-token form of authority-passing to handle
environments in which query-args won't work and headers are not available.
This approach would use a special PUT which takes the authority string as the
HTTP body, and remembers it on the server side in associated with a
brief-but-unguessable token. Later operations would then use the authority by
passing a --storage-authority-token=XYZ query argument. These authorities
would expire after some period.

== Quota Management, Aggregation, Reporting ==

The storage server will maintain enough information to efficiently compute
usage totals for each account referenced in all of their leases, as well as
all their parent accounts. This information is used for several purposes:

 * enforce server-space restrictions, by selectively rejecting storage
   requests which would cause the account-usage-total to rise above the limit
   specified in the enabling authorization string
 * report individual account usage to the account-holder (if a client can
   consume space under account A, they are also allowed to query usage for
   account A or a subaccount).
 * report individual account usage to the storage-server operator, possibly
   associated with a pet name
 * report usage for all accounts to the storage-server operator, possibly
   associated with a pet name, in the form of a large table
 * report usage for all accounts to an external aggregator

The external aggregator would take usage information from all the storage
servers in a single grid and sum them together, providing a grid-wide usage
number for each account. This could be used by e.g. clients in a commercial
grid to report overall-space-used to the end user.

There will be web-API URLs available for all of these reports.

TODO: storage servers might also have a mechanism to apply space-usage limits
to specific account ids directly, rather than requiring that these be
expressed only through authority-string limitation fields. This would let a
storage server operator revoke their space-allocation after delivering the
authority string.

== Low-Level Formats ==

This section describes the low-level formats used by the Accounting process,
beginning with the storage-authority data structure and working upwards. This
section is organized to follow the storage authority, starting from the point
of grant. The discussion will thus begin at the storage server (where the
authority is first created), work back to the client (which receives the
authority as a web-API argument), then follow the authority back to the
servers as it is used to enable specific storage operations. It will then
detail the accounting tables that the storage server is obligated to
maintain, and describe the interfaces through which these tables are accessed
by other parties.

=== Storage Authority ===

==== Terminology ====

Storage Authority is represented as a chain of certificates and a private
key. Each certificate authorizes and restricts a specific private key. The
initial certificate in the chain derives its authority by being placed in the
storage server's tahoe.cfg file (i.e. by being authorized by the storage
server operator). All subsequent certificates are signed by the authorized
private key that was identified in the previous certificate: they derive
their authority by delegation. Each certificate has restrictions which limit
the authority being delegated.

 authority: ([cert[0], cert[1], cert[2] ...], privatekey)

The "restrictions dictionary" is a table which establishes an upper bound on
how this authority (or any attenuations thereof) may be used. It is
effectively a set of key-value pairs.

A "signing key" is an EC-DSA192 private key string, as supplied to the
pycryptopp SigningKey() constructor, and is 12 bytes long. A "verifying key"
is an EC-DSA192 public key string, as produced by pycryptopp, and is 24 bytes
long. A "key identifier" is a string which securely identifies a specific
signing/verifying keypair: for long RSA keys it would be a secure hash of the
public key, but since ECDSA192 keys are so short, we simply use the full
verifying key verbatim. A "key hint" is a variable-length prefix of the key
identifier, perhaps zero bytes long, used to help a recipient reduce the
number of verifying keys that it must search to find one that matches a
signed message.

==== Authority Chains ====

The authority chain consists of a list of certificates, each of which has a
serialized restrictions dictionary. Each dictionary will have a
"delegate-to-key" field, which delegates authority to a private key,
referenced with a key identifier. In addition, the non-initial certs are
signed, so they each contain a signature and a key hint:

 cert[0]: serialized(restrictions_dictionary)
 cert[1]: serialized(restrictions_dictionary), signature, keyhint
 cert[2]: serialized(restrictions_dictionary), signature, keyhint

In this example, suppose cert[0] contains a delegate-to-key field that
identifies a keypair sign_A/verify_A. In this case, cert[1] will have a
signature that was made with sign_A, and the keyhint in cert[1] will
reference verify_A.

 cert[0].restrictions[delegate-to-key] = A_keyid

 cert[1].signature = SIGN(sign_A, serialized(cert[0].restrictions))
 cert[1].keyhint = verify_A
 cert[1].restrictions[delegate-to-key] = B_keyid

 cert[2].signature = SIGN(sign_B, serialized(cert[1].restrictions))
 cert[2].keyhint = verify_B
 cert[2].restrictions[delete-to-key] = C_keyid

In this example, the full storage authority consists of the cert[0,1,2] chain
and the sign_C private key: anyone who is in possession of both will be able
to exert this authority. To wield the authority, a client will present the
cert[0,1,2] chain and an action message signed by sign_C; the server will
validate the chain and the signature before performing the requested action.
The only circumstances that might prompt the client to share the sign_C
private key with another party (including the server) would be if it wanted
to irrevocably share its full authority with that party.

==== Restriction Dictionaries ====

Within a restriction dictionary, the following keys are defined. Their full
meanings are defined later.

 'accountid': an arbitrary-length sequence of integers >=0, restricting the
              accounts which can be manipulated or used in leases
 'SI': a storage index (binary string), controlling which file may be
       manipulated
 'serverid': binary string, limiting which server will accept requests
 'UEB-hash': binary string, limiting the content of the file being manipulated
 'before': timestamp (seconds since epoch), limits the lifetime of this
           authority
 'server-size': integer >0, maximum aggregate storage (in bytes) per account
 'delegate-to-key': binary string (DSA pubkey identifier)
 'furl-to': printable FURL string

==== Authority Serialization ====

There is only one form of serialization: a somewhat-compact URL-safe
cut-and-pasteable printable form. We are interested in minimizing the size of
the resulting authority, so rather than using a general-purpose (perhaps
JSON-based) serialization scheme, we use one that is specialized for this
task.

This URL-safe form will use minimal punctuation to avoid quoting issues when
used in a URL query argument. It would be nice to avoid word-breaking
characters that make cut-and-paste troublesome, however this is more
difficult because most non-alphanumeric characters are word-breaking in at
least one application.

The serialized storage authority as a whole contains a single version
identifier and magic number at the beginning. None of the internal components
contain redundant version numbers: they are implied by the container. If
components are serialized independently for other reasons, they may contain
version identifers in that form.

Signing keys (i.e. private keys) are URL-safe-serialized using Zooko's base62
alphabet, which offers almost the same density as standard base64 but without
any non-URL-safe or word-breaking characters. Since we used fixed-format keys
(EC-DSA, 192bit, with SHA256), the private keys are fixed-length (96 bits or
12 bytes), so there is no length indicator: all URL-safe-serialized signing
keys are 17 base62 characters long. The 192-bit verifying keys (i.e. public
keys) use the same approach: the URL-safe form is 33 characters long.

An account-id sequence (a variable-length sequence of non-negative numbers)
is serialized by representing each number in decimal ASCII, then joining the
pieces with commas. The string is terminated by the first non-[0-9,]
character encountered, which will either be the key-identifier letter of the
next field, or the dictionary-terminating character at the end.

Any single integral decimal number (such as the "before" timestamp field, or
the "server-size" field) is serialized as a variable-length sequence of ASCII
decimal digits, terminated by any non-digit.

The restrictions dictionary is serialized as a concatenated series of
key-identifier-letter / value string pairs, ending with the marker "E.". The
URL-safe form uses a single printable letter to indicate the which key is
being serialized. Each type of value string is serialized differently:

 "A": accountid: variable-length sequence of comma-joned numbers
 "I": storage index: fixed-length 26-character *base32*-encoded storage index
 "P": server id (peer id): fixed-length 32-character *base32* encoded serverid
      (matching the printable Tub.tubID string that Foolscap provides)
 "U": UEB hash: fixed-length 43-character base62 encoded UEB hash
 "B": before: variable-length sequence of decimal digits, seconds-since-epoch.
 "S": server-size: variable-length sequence of decimal digits, max size in bytes
 "D": delegate-to-key: ECDSA public key, 33 base62 characters.
 "F": furl-to: variable-length FURL string, wrapped in a netstring:
      "%d:%s," % (len(FURL), FURL). Note that this is rarely pasted.
 "E.": end-of-dictionary marker

The ECDSA signature is serialized as a variable number of base62 characters,
terminated by a period. We expect the signature to be about 384 bits (48
bytes) long, or 65 base62 characters. A missing signature (such as for the
initial cert) is represented as a single period.

The key hint is serialized with a base62-encoded serialized hint string (a
byte-quantized prefix of the serialized public key), terminated by a period.
An empty hint would thus be serialized as a single period. For the current
design, we expect the key hint to be empty.

The full storage authority string consists of a certificate chain and a
delegate private key. Given the single-certificate serialization scheme
described above, the full authority is serialized as follows:

 * version prefix: depends upon the application, but for storage-authority
                   chains this will be "sa0-", for Storage-Authority Version 0.
 * serialized certificates, concatenated together
 * serialized private key (to which the last certificate delegates authority)

Note that this serialization form does not have an explicit terminator, so
the environment must provide a length indicator or some other way to identify
the end of the authority string. The benefit of this approach is that the
full string will begin and end with alphanumeric characters, making
cut-and-paste easier (increasing the size of the mouse target: anywhere
within the final component will work).

Also note that the period is a reserved delimiter: it cannot appear in the
serialized restrictions dictionary. The parser can remove the version prefix,
split the rest on periods, and expect to see 3*k+1 fields, consisting of k
(restriction-dictionary,signature,keyhint) 3-tuples and a single private key
at the end.

Some examples:

 (example A)
 cert[0] delegates account 1,4 to (pubkey ZlFA / privkey 1f2S):

  sa0-A1,4D2lFA6LboL2xx0ldQH2K1TdSrwuqMMiME3E...1f2SI9UJPXvb7vdJ1

 (example B)
 cert[0] delegates account 1,4 to ZlFA/1f2S
 cert[1] subdelegates 5GB and subaccount 1,4,7 to pubkey 0BPo/06rt:

  sa0-A1,4D2lFA6LboL2xx0ldQH2K1TdSrwuqMMiME3E...A1,4,7S5000000000D0BPoGxJ3M4KWrmdpLnknhJABrWip5e9kPE,7cyhQvv5axdeihmOzIHjs85TcUIYiWHdsxNz50GTerEOR5ucj2TITPXxyaCUli1oF...06rtcPQotR3q4f2cT







== Problems ==

Problems which have thus far been identified with this approach:

 * allowing arbitrary subaccount generation will permit a DoS attack, in
   which an authorized uploader consumes lots of DB space by creating an
   unbounded number of randomly-generated subaccount identifiers. OTOH, they
   can already attach an unbounded number of leases to any file they like,
   consuming a lot of space.