File: design.html

package info (click to toggle)
pcb-rnd 3.1.7b-2
links: PTS, VCS
area: main
in suites: forky, sid
size: 33,108 kB
sloc: ansic: 213,400; yacc: 6,241; sh: 4,698; awk: 3,016; makefile: 2,254; lex: 1,166; python: 519; xml: 261; lisp: 154; tcl: 67; perl: 34; javascript: 6; ruby: 5
file content (101 lines) | stat: -rw-r--r-- 4,293 bytes
parent folder | download | duplicates (5)
<html>
<body>
<h1> libminuid design </h1>

<p>
libminuid generates <b>non-standard</b> UUIDs (GUIDs). This document explains
why it goes for a non-standard model and explains some of the design decisions.

<h2> Prior art <h2>
<p>
Standard <a href="https://en.wikipedia.org/wiki/Universally_unique_identifier">
UUIDs</a> are widespread. There are accessible implementations. However
there are a few drawbacks of them:
<ul>
	<li> complexity: there are different versions; rfc4122 even describes a
	     version field within the UUID.
	<li> privacy concerns (e.g. saving MAC addresses with some versions)
	<li> some parts of the design does not really guarantee unique bits; for example
	     if multiple chroots or VMs ran on a server sharing the same physical
	     network interface, they could all have the same MAC address.
	<li> portability issues: acquiring the MAC address or high precision time
	     stamp is typically easy on systems where acquiring well seeded
	     random numbers is also easy (e.g. modern Linux or BSD)
</ul>


<h2> Design decisions </h2>

<h3> Session prefix: random salt </h3>
<p>
By using a random salt as the first 14 bytes of the UUID, each UUID has 112 bits
that should make it unique across different threads/processes. This is somewhat
lower, but comparable to the standard UUID version 4's 122 bits.
<p>
Using random salt evades the main issues with standard UUIDs:
<ul>
	<li> no MAC address, time stamps or other privacy-sensitive data leaks
	<li> less portability issues: the only one thing the implementation is
	     required to do and is not ANSI-C89 standard is getting 112 bits of
	     random number from some source that preferably got hardware random
	     elements
	<li> running in parallel on the same multi-processor hardware does not
	     increase the chance of generating the same salt
</ul>
<p>
As a fallback mechanism, on systems not offering /dev/urandom or /dev/random,
the host application may add to the salt by using whatever unique identifiers
or random sources available.

<h3> Suffix: 32 bit unsigned counter </h3>
<p>
For the purpose of having unique identifiers, a long-enough, properly obtained
random number should be enough. However generating such random number may be
expensive and in some situations a lot of UUIDs need to be generated at once.
<p>
To solve this, libminuid's UUID consists of the long random number session
prefix, generated only once per session and a plain integer counter suffix
that is simply incremented on each new UUID generation. In the unlikely
situation the 32 bit counter overflows, the random prefix should be altered
or a new random prefix should be generated.
<p>
Typically one process would set up one session, generating one random number.
If the application deals with multiple independent contexts, it's also reasonable
to create one session per context.
<p>
Multi-threaded applications should either initialize one session per thread
(that generate UUIDs) or should implement a thin wrapper around libminuid
API with locking.

<h3> Privacy implications </h3>
<p>
Because of the session seed and counter, looking at UUIDs there are still
three things that can be figured:
<ul>
	<li> if two UUIDs have been generated within the same session
	<li> if so, which one got generated earlier
	<li> and at most how many other UUIDs have been generated in between
</ul>
<p>
Application designers should keep these in mind and where relevant, evade
these by using multiple sessions.


<h3> String format: base64 </h3>
<p>
Standard UUID uses hexadecimal printout and is written in text form as
32 characters + 4 hyphens, which is total 36 characters (for 128 bits
of binary UUID).
<p>
Libminuid uses base64 with no separators between prefix and suffix. It takes
24 characters in plain text to encode the 144 bits long binary UUID.
<p>
Thus libminuid produces more compact text output with two drawbacks:
<ul>
	<li> it's harder to read out loud (because of no split-up and case sensitive encoding)
	<li> it uses more ASCII characters, which may cause problems when encapsulating in a file formats or network protocol
</ul>
<p>
Widely used file formats and protocols typically can transmit base64 encoded
content or arbitrary text strings so the second drawback shouldn't cause
problems in practice.