Wordlists
=========
The passphrases generated by `diceware` naturally depend on the set of
words used, the wordlists.
`diceware` ships with several wordlists that should be a good choice
for ordinary private use.
.. warning::
We do *not* use the `diceware standard wordlist`_,
but the `long EFF wordlist`_ (see below), because it is more secure
and more comfortable to use.
Currently (v1.0) we provide the following lists:
- `ca` (8192/2^13 words)
A list of Catalan words. Compiled by `@jawlenskys`_ from the Debian dict file
for Catalan and a selection of the most used words in the Catalan Wikipedia.
This list provides the `prefix property`_.
- `de` (7776/6^5 words)
A list of German words, suitable for use with dice. Generated with
`diceware-list` based on wordlists from `Institut für Deutsche Sprache`_,
Mannheim and filtering blacklists. This list provides the `prefix property`_.
- `de_8k` (8192/2^13 words)
A longer list of German words, suitable for use with machines, nerds, and
other binary-geared entities. Generated with `diceware-list` based on
wordlists from `Institut für Deutsche Sprache`_, Mannheim and filtering
blacklists. This list provides the `prefix property`_.
- `en_eff` (7776/6^5 words, default)
This is the `long EFF wordlist`_ as published by the `Electronic Frontier
Foundation`_ in mid-2016 and used by default. They put real `scientific
effort`_ into the creation of this list, which might considerably ease the
use of passphrases generated with it. When using real dice (or other
six-based randomness generators), its use is definitely recommended!
It was the first list in `diceware` that provided the
`prefix property`_. That means it contains no word which is a prefix
of another word. Lists without this property might provide slightly
less entropy, because different word sequences can yield the same
passphrase when the words are joined without a delimiter.
- `en_securedrop` (8192/2^13 words)
A hand-crafted wordlist provided by `@Heartsucker`_. It contains 8,192
English words and phrases. This list is based on the
`diceware standard wordlist`_ and extended to offer words that are
easier to memorize. Please see
https://github.com/heartsucker/diceware for details. The name
`en_securedrop` refers to the `securedrop`_ project.
- `en_adjectives` (1296/6^4 words)
A list of English adjectives. This list is relatively short and should be
used together with other lists -- for instance the `en_nouns` list -- to
provide a sufficient security level. List provided by the
`NaturalLanguagePasswords`_ project. This list contains many short terms (good
for comfort, bad for security) and does *not* provide the `prefix property`_.
- `en_nouns` (7776/6^5 words)
A list of English nouns. Can be used together with other lists -- for
instance the `en_adjectives` list -- to form natural language phrases. List
provided by the `NaturalLanguagePasswords`_ project. This list contains many
short terms (good for comfort, bad for security) and does *not* provide the
`prefix property`_.
- `es` (8192/2^13 words)
A list of Spanish words, carefully crafted by `@jawlenskys`_ from the Debian
dict file for Spanish and a selection of the most used Spanish words from
`Corpus de Referencia del Español Actual (CREA)`_. This list provides the
`prefix property`_.
- `fr` (7776/6^5 words)
A list of French words, compiled by @Tango for Tails OS and Tor. Handcrafted
to avoid offensive and rare words. This list provides the `prefix property`_.
- `it` (8192/2^13 words)
A list of Italian words, compiled by `@jawlenskys`_ from the Debian dict file for
Italian and an `Italian frequency list
<https://en.wiktionary.org/wiki/User:Matthias_Buchmeier#Italian_frequency_list>`_
generated from TV and movie subtitles. This list provides the `prefix
property`_.
- `pt-br` (7776/6^5 words)
A list of Brazilian Portuguese words, carefully crafted by `@drebs`_. This
list contains no overshort words. It also provides the `prefix property`_.
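The prefix property mentioned for several lists above can be checked
mechanically. The following is a minimal sketch, not part of `diceware`
itself; the function name ``find_prefix_violations`` is made up for
illustration. It reports every pair of entries where one word is a proper
prefix of another:

```python
def find_prefix_violations(words):
    """Return (prefix, word) pairs where `prefix` is a proper prefix of `word`."""
    ordered = sorted(set(words))
    violations = []
    for i, word in enumerate(ordered):
        # In sorted order, all entries starting with `word` follow it directly.
        for other in ordered[i + 1:]:
            if not other.startswith(word):
                break
            violations.append((word, other))
    return violations


# "toast" is a prefix of "toaster", so this sample list lacks the property.
print(find_prefix_violations(["toast", "toaster", "lax", "reason"]))
```

A list provides the prefix property exactly when this function returns an
empty list for it.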
You can pick wordlists to use with the ``-w`` or ``--wordlist`` option. Lists
with 7776 words are made for six-sided dice (7776 = 6^5) while lists with 8192
(2^13) words are made for machines and 2-sided coins.
You can also select several wordlists at once. In that case each "word" of the
generated passphrase consists of one word from each of the lists in the order
given.
Example::
$ diceware -w en_adjectives en_nouns -n 2 -d '-'
lax-toast-strong-reason
We get two "words" (`lax-toast` and `strong-reason`) each consisting of a
leading adjective and a trailing noun.
If you'd prefer the Yoda style, you could change that order::
$ diceware -w en_nouns en_adjectives -n 2 -d '-'
grains-honest-oxidant-happy
Each such term (like `oxidant-happy`) provides an entropy of about 23 bits.
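The figure of about 23 bits follows from the list sizes given above (1296
adjectives, 7776 nouns). As a small sketch, assuming each word is picked
uniformly and independently, the entropy of one combined term is the sum of
the base-2 logarithms of the list sizes. The helper name ``bits_per_term``
is only illustrative:

```python
import math


def bits_per_term(*list_sizes):
    """Entropy in bits of one term built from one word of each list."""
    return sum(math.log2(size) for size in list_sizes)


# One adjective (1296 entries) combined with one noun (7776 entries).
print(f"{bits_per_term(1296, 7776):.2f}")  # prints 23.26
```

A passphrase of ``-n 2`` such terms therefore carries roughly twice that
amount, under the same assumptions.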
Retired Wordlists
-----------------
Some wordlists have been removed from `diceware` because they contained bad
language and words that users might be uncomfortable with.
- `en` (8192 words, removed in v0.10)
The so-called `8k wordlist`_ from Mr. Reinhold as published on
http://diceware.com/. It was something like the canonical wordlist for use
with binary-geared entities like computers or nerds.
- `en_orig` (7776 words, removed in v0.10)
This is the `diceware standard wordlist`_ as provided by
Mr. Reinhold. Something like the canonical list in former times.
There are now worthwhile alternatives.
Neither of these lists provides the `prefix property`_. They also contain
overshort terms, i.e. words so short that the resulting passphrases can be
easier to break by trying all character combinations than by trying all
combinations of words in the wordlist.
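The overshort problem can be illustrated with a rough comparison. This is
only a sketch under simplifying assumptions that are not taken from
`diceware`: the attacker knows the passphrase length, and the passphrase
consists of lowercase letters only (alphabet size 26). The function name is
hypothetical:

```python
import math


def is_weaker_than_wordlist(passphrase_length, num_words, list_size,
                            alphabet_size=26):
    """True if trying all character strings is cheaper than trying all
    word combinations (alphabet_size=26 is an assumption)."""
    char_bits = passphrase_length * math.log2(alphabet_size)
    word_bits = num_words * math.log2(list_size)
    return char_bits < word_bits


# Six two-letter words from a 7776-word list: only 12 characters in total.
print(is_weaker_than_wordlist(12, 6, 7776))  # prints True
# Six five-letter words from the same list: 30 characters in total.
print(is_weaker_than_wordlist(30, 6, 7776))  # prints False
```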
Using Custom Wordlists
----------------------
You can use any wordlist you like. Simply give the filename and it
will be used::
$ diceware mywordlist.txt
HiHelloHelloHiHiHi
You can even pipe in dynamic wordlists. Just use the dash ``-`` as
filename, for instance::
$ mywordgenerator.sh | diceware -
HiHiHelloHiHiHello
Of course, you have to give the filenames of your files with each call
to `diceware`.
If you want to store a wordlist persistently, you can do that too.
The built-in wordlists we offer for use with `diceware` are all stored in a
single directory. The exact location is output by ``--show-wordlist-dirs`` as
the first entry::
$ diceware --show-wordlist-dirs
/path/to/some/directory
/path/to/other/directory
...
All other directories listed by this command are also searched for
wordlist files (if they exist).
You can put your own wordlists into one of these folders (here:
``/path/to/some/directory``, ``/path/to/other/directory``) and rename the file
to something like ``wordlist_MY_SPECIAL_NAME.txt``. Afterwards you can pick
your wordlist by running::
$ diceware -w MY_SPECIAL_NAME
`diceware` will then use your file to create a
passphrase. Please note that `diceware` only accepts files that are
named like::
wordlist_NAME.txt
or::
wordlist_OTHER_NAME.asc
I.e. we expect ``wordlist_`` at the beginning and some filename
extension like ``.txt`` at the end. Furthermore, names must not contain
funny characters. In fact, we accept regular letters, dashes, numbers,
and underscores only. Files that do not follow this naming convention
are ignored.
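The naming rule can be approximated with a regular expression. The pattern
below is an assumption written for this page, not copied from the `diceware`
sources, and the real check may differ in details (for instance in which
extensions it allows):

```python
import re

# Hypothetical pattern: "wordlist_", a name made of letters, digits,
# dashes and underscores, then a simple filename extension.
WORDLIST_FILENAME = re.compile(r"wordlist_([A-Za-z0-9_-]+)\.[A-Za-z0-9]+")


def wordlist_name(filename):
    """Return the wordlist name encoded in `filename`, or None if the
    filename does not follow the convention."""
    match = WORDLIST_FILENAME.fullmatch(filename)
    return match.group(1) if match else None


print(wordlist_name("wordlist_MY_SPECIAL_NAME.txt"))  # prints MY_SPECIAL_NAME
print(wordlist_name("mywordlist.txt"))                # prints None
```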
A list of all available wordlist names can be retrieved with ``--help``. See
the ``--wordlist`` explanation.
Where Wordlists are Looked Up
-----------------------------
Starting with version 1.0 wordlists can be stored in several directories. We
look for wordlists in certain directories only. The list of these directories
depends partly on environment variables. It can be shown with::
$ diceware --show-wordlist-dirs
/some/installdir/diceware/wordlists
/home/user/.local/share/diceware
/usr/local/share/diceware
/usr/share/diceware
and may be different on your machine. Wordlist directories are searched in the
order listed by ``--show-wordlist-dirs``. Wordlists in earlier directories
override same-named ones in later directories. So, with the order given above,
a wordlist named ``wordlist_foo.txt`` in ``/some/installdir/diceware/wordlists``
takes precedence over a same-named wordlist file located in
``/usr/share/diceware``.
The ``wordlists/`` directory of the Python package itself is always the first
we look into.
Afterwards we look up ``${XDG_DATA_HOME}/diceware/`` or, if this environment
variable is not set or empty, ``${HOME}/.local/share/diceware``.
Finally, we look into each of the directories in the colon-separated list
in ``${XDG_DATA_DIRS}``, each with ``/diceware`` appended. So, if
``${XDG_DATA_DIRS}`` is set to ``/foo:/bar:/etc/foo``, we will look into
``/foo/diceware``, ``/bar/diceware`` and ``/etc/foo/diceware`` (in that order)
for wordlists.
In case the environment variable ``${XDG_DATA_DIRS}`` is not set or empty, we
look into ``/usr/local/share/diceware`` and ``/usr/share/diceware`` instead.
In all cases, we stop searching the wordlist directories as soon as the first
match for a given wordlist name is found.
All these rules try to follow the `XDG Base Directory Specification`_.
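The lookup order described in this section can be sketched as follows. This
is an illustration of the rules as stated here, not the actual `diceware`
implementation; the function name and the ``package_dir`` parameter are
hypothetical:

```python
import os
import posixpath


def wordlist_dirs(package_dir, environ=None):
    """Return the wordlist directories in lookup order."""
    environ = os.environ if environ is None else environ
    # Unset or empty variables fall back to the defaults described above.
    data_home = environ.get("XDG_DATA_HOME") or posixpath.join(
        environ.get("HOME", ""), ".local", "share")
    data_dirs = environ.get("XDG_DATA_DIRS") or "/usr/local/share:/usr/share"
    dirs = [package_dir, posixpath.join(data_home, "diceware")]
    dirs += [posixpath.join(d, "diceware") for d in data_dirs.split(":") if d]
    return dirs


env = {"HOME": "/home/user", "XDG_DATA_DIRS": "/foo:/bar:/etc/foo"}
for path in wordlist_dirs("/some/installdir/diceware/wordlists", env):
    print(path)
```

Passing the environment as a dictionary keeps the sketch easy to try out
without changing real environment variables.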
Plain Wordlists
---------------
Out of the box, `diceware` supports plain wordlists, PGP-signed
wordlists, and numbered wordlists. Plain wordlists look like this::
termone
termtwo
anotherterm
Each line in such a file is considered a word of the wordlist. Empty
lines are ignored.
Whitespace is allowed within a line; leading and trailing whitespace
is stripped.
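Read as a procedure, these rules amount to very little code. A minimal
sketch, with a function name chosen only for illustration:

```python
def read_plain_wordlist(lines):
    """Return the terms of a plain wordlist given as an iterable of lines."""
    words = []
    for line in lines:
        term = line.strip()  # drop leading and trailing whitespace
        if term:             # ignore empty lines
            words.append(term)
    return words


print(read_plain_wordlist(["termone\n", "\n", "  two words  \n"]))
# prints ['termone', 'two words']
```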
Numbered Wordlists
------------------
Numbered wordlists prefix each line with a number that encodes a
sequence of dice rolls, like so::
11111 aterm
11112 anotherterm
...
`diceware` detects such lines and in this case extracts ``aterm`` and
``anotherterm`` as wordlist entries.
Apart from simple digits written next to each other, `diceware` also
accepts numbers separated by dashes like this::
1-1-1-1-1 aterm
1-1-1-1-2 anotherterm
which is handy when working with wordlists for dice with more than 9
sides.
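Both number formats can be recognized with a single regular expression. The
pattern below is an assumption for illustration and works line by line; how
`diceware` actually detects numbered lists may differ:

```python
import re

# Digits, optionally separated by dashes, then whitespace, then the term.
NUMBERED_ENTRY = re.compile(r"\d+(?:-\d+)*\s+(\S.*)")


def strip_number(line):
    """Return the term of a numbered line, or the stripped line itself
    if it carries no leading number."""
    line = line.strip()
    match = NUMBERED_ENTRY.fullmatch(line)
    return match.group(1) if match else line


print(strip_number("11111 aterm"))            # prints aterm
print(strip_number("1-1-1-1-2 anotherterm"))  # prints anotherterm
```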
PGP-signed Wordlists
--------------------
PGP-signed wordlists are wordlists (ordinary or numbered ones) that
have been cryptographically signed with PGP or GPG. They look like
this::
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA512
foo
bar
baz
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iJwEAQEKAAYFAlW00GEACgkQ+5ktCoLaPzSutwP8DVgdjBFqRXNKaZlvd8pR+P3k
8xx5XLC0OFwZQFx4Ls8xl3+/xfvCNxCGSZjD6BGPzNZCK7bmQQYWcrsoEyX5jAC3
dXjAPj0nct/PkJQlrUjUI2qrO0dFfU7sRj0Gn9TOlQQkKoQVwy7pY/6HaScGNepL
J8BNUPYdOWeVgxY1jSY=
=WXfu
-----END PGP SIGNATURE-----
and are normally stored with the ``.asc`` filename extension. Signed
wordlists can be verified to detect changes, although this is not
automatically done by `diceware`.
.. warning:: Diceware does *not* automatically verify PGP-signed
files.
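Extracting the terms from such a file means skipping the armor header and
the signature block. The sketch below assumes the usual cleartext signature
layout (header lines, one empty line, the text, then the signature) and is
not taken from the `diceware` sources. Like `diceware`, it does *not* verify
the signature:

```python
BEGIN_MESSAGE = "-----BEGIN PGP SIGNED MESSAGE-----"
BEGIN_SIGNATURE = "-----BEGIN PGP SIGNATURE-----"


def words_from_clearsigned(text):
    """Return the wordlist lines of a clearsigned text, unverified."""
    lines = text.splitlines()
    start, end = 0, len(lines)
    if BEGIN_MESSAGE in lines:
        start = lines.index(BEGIN_MESSAGE) + 1
        # Skip armor headers such as "Hash: SHA512" up to the empty line.
        while start < len(lines) and lines[start].strip():
            start += 1
        if BEGIN_SIGNATURE in lines[start:]:
            end = lines.index(BEGIN_SIGNATURE, start)
    words = []
    for line in lines[start:end]:
        if line.startswith("- "):  # undo dash-escaping of the signed text
            line = line[2:]
        if line.strip():
            words.append(line.strip())
    return words


sample = "\n".join([
    BEGIN_MESSAGE, "Hash: SHA512", "", "foo", "bar", "baz",
    BEGIN_SIGNATURE, "(signature data omitted)", "-----END PGP SIGNATURE-----",
])
print(words_from_clearsigned(sample))  # prints ['foo', 'bar', 'baz']
```

To actually check a signature, use a tool such as ``gpg --verify`` before
passing the file to `diceware`.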
.. _`8k wordlist`: http://world.std.com/~reinhold/diceware8k.txt
.. _`Corpus de Referencia del Español Actual (CREA)`: https://corpus.rae.es/lfrecuencias.html
.. _`diceware standard wordlist`: http://world.std.com/~reinhold/diceware.wordlist.asc
.. _`@drebs`: https://github.com/drebs
.. _`Electronic Frontier Foundation`: https://eff.org/
.. _`@Heartsucker`: https://github.com/heartsucker/
.. _`Institut für Deutsche Sprache`: https://www.ids-mannheim.de/derewo
.. _`@jawlenskys`: https://github.com/jawlenskys
.. _`long EFF wordlist`: https://www.eff.org/files/2016/07/18/eff_large_wordlist.txt
.. _`NaturalLanguagePasswords`: https://github.com/NaturalLanguagePasswords
.. _`prefix property`: https://en.wikipedia.org/wiki/Prefix_code
.. _`scientific effort`: https://www.eff.org/deeplinks/2016/07/new-wordlists-random-passphrases
.. _`securedrop`: https://github.com/freedomofpress/securedrop
.. _`XDG Base Directory Specification`: https://specifications.freedesktop.org/basedir-spec/latest/