File: ReleaseSigning.html

package info (click to toggle)
hdf5 1.8.13%2Bdocs-15
  • links: PTS, VCS
  • area: main
  • in suites: jessie-kfreebsd
  • size: 171,520 kB
  • sloc: ansic: 387,158; f90: 35,195; sh: 20,035; xml: 17,780; cpp: 13,516; makefile: 1,487; perl: 1,299; yacc: 327; lex: 178; ruby: 37
file content (187 lines) | stat: -rw-r--r-- 9,588 bytes parent folder | download | duplicates (3)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
<html>
<title>
Checksumming and PGP signing HDF5 releases
</title>


<!--
  * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
  * Copyright by The HDF Group.                                               *
  * Copyright by the Board of Trustees of the University of Illinois.         *
  * All rights reserved.                                                      *
  *                                                                           *
  * This file is part of HDF5.  The full HDF5 copyright notice, including     *
  * terms governing use, modification, and redistribution, is contained in    *
  * the files COPYING and Copyright.html.  COPYING can be found at the root   *
  * of the source code distribution tree; Copyright.html can be found at the  *
  * root level of an installed copy of the electronic HDF5 document set and   *
  * is linked from the top-level documents page.  It can also be found at     *
  * http://hdfgroup.org/HDF5/doc/Copyright.html.  If you do not have          *
  * access to either file, you may request a copy from help@hdfgroup.org.     *
  * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
 -->



<body>
<h3>Checksumming and PGP signing HDF5 releases</h3>
<p>James Laird - August 22, 2005</p>
<br>
<p><b>User Request</b> (bug #387)<br><br>
I was hoping I could get the release process modified
to include a md5file checksum and a PGP signature
of the release.</p>
This would go a long way to ensure that the distribution
that is downloaded is actually the software that was 
released by the NCSA.</p> 
<p>For example:</p>
<p>&nbsp&nbsp&nbsp hdf5.1.6.4.tar.gz  - compressed tarfile<br> 
&nbsp&nbsp&nbsp hdf5.1.6.4.tar.gz.md5 - md5 checksum of the compressed tarfile<br>
&nbsp&nbsp&nbsp hdf5.1.6.4.tar.gz.asc - PGP signature of the compressed tarfile</p>
<p>This is common model for ensuring the integrity of open source
Distribution that are downloaded from multiple mirrors.</p>
<p>Here a link to the Apache web site that is a good reference:<br> 
<a href="http://httpd.apache.org/dev/verification.html">http://httpd.apache.org/dev/verification.html</a></p>
<p>Any chance that I can persuade the developers to add this feature
to the current release process?</p>
<p>Regards,<br>
&nbsp&nbsp&nbsp Pete Rothermel</p><br>
<p><b>What is an MD5 checksum?</b><br>
Explanation from the <a href="http://userpages.umbc.edu/~mabzug1/cs/md5/md5.html">MD5 Homepage (unofficial)</a>:</p>
<p>MD5 was developed by 
Professor Ronald L. Rivest of MIT.  What it does, to quote the executive
summary of 
rfc1321, is:

<blockquote>
   [The MD5 algorithm] takes as input a message of arbitrary length and
   produces 
   as output a 128-bit "fingerprint" or "message digest" of the input.
   It is conjectured that it is computationally infeasible to produce
   two messages having the same message digest, or to produce any
   message having a given prespecified target message digest. The MD5
   algorithm is intended for digital signature applications, where a
   large file must be "compressed" in a secure manner before being
   encrypted with a private (secret) key under a public-key cryptosystem
   such as RSA.
</blockquote>

In essence, MD5 is a way to verify data integrity, and is much more reliable
than checksum and many other commonly used methods.</p>
<br>
<p>The "fingerprint" that MD5 produces is (usually) unique to a given message.
To use this in HDF5, we would run MD5 on the release tarball and post the
resulting file (hdf5.tar.md5) on our website along with the tarball itself.
If an attacker attempted to replace the real hdf5.tar archive with a fake,
the fake archive would have a different MD5 checksum and this could be detected
by the user.  A user could also download the hdf5 tarball from a mirror
FTP site, then compare that tarball's MD5 checksum against the correct value
on the main HDF5 website.</p>

<p>MD5 is vulnerable to an attack in which a false checksum is provided for
the false archive.  There must be a secure channel to transmit the correct
checksum.  However, this checksum is very small (50 bytes), so is easier to
transmit than the full hdf5 tarball.  It might be wise to post the MD5
checksum on the HDF5 website in cleartext as well as adding it to the FTP
site to make it more difficult for an attacker to replace it with a false
checksum.<br>
MD5 is known to be vulnerable to certain
collision attacks,
but these would not affect the HDF5 release process; given an existing
MD5 checksum, it is still extremely difficult to create another file with
the same checksum. </p> 

<p><b>What is a PGP signature?</b><br>
For an in-depth discussion of PGP key signing, see the comp.security.pgp FAQ.</p>
<p>Briefly, a PGP signature allows a user to vouch for a message's
trustworthiness.  The signature uses a hash function to ensure that the
message has not changed since it was signed (the same principle as MD5).
Furthermore, the signature confirms that the message has come from the user
who signed it.</p>
<p>PGP signatures rely on asymmetric keys.  A signature is created with a
private key, while the public key is widely published.  Only the
user with the private key could have created the signature and encoded the
result of the hash function, but anyone can decode the signature to check its
validity and that the message is unchanged.  This would prevent an attacker
from substituting a fake hdf5.tar archive, since the fake archive would have
a different hash value--it would be clear that the signature corresponded to
a different file.  In addition, the attacker could not create a fake
signature to accompany the fake archive (as they could with MD5), since they
do not have HDF5's private key.  The signature would ensure that no-one could
masquerade as the HDF group.</p>
<p>PGP's main drawback is that a signature can only be trusted as far as it
is known to belong to its owner.  If an attacker creates an entirely new
key and claims it belongs to HDF, users have no way of knowing which is the
correct key and which is the fake.  The solution to this problem is to
be connected to the "web of trust," which involves having a key signed by
others who can vouch for the authenticity of the key.  If HDF5 doesn't
expend some effort to enter this web of trust (and doesn't have any members
with well-known public keys), users may not be able to tell which
copy of HDF5 is genuine, but it will be easy to detect that an attack is
taking place, since there will be two different keys in existence!</p>
<p><b>Short-term and long-term costs</b><br>
First, the MD5 checksum and PGP signature would take a small amount of work
to create.  Doing this by hand is easy--it took me about ten minutes to
create a PGP key, and the commands to create the checksum and signature are
one line each.  If we wanted to create both a checksum and a signature for
every platform-specific tarball, these two lines would need to be run for
each tarball.  Both of the tools needed (pgp and md5sum) are already installed
on heping.<br></p>
<p>The creation of an MD5 checksum can be automated to take place when the
bin/reconfigure script is run.  PGP signing should <b>not</b> be automated,
any more than a pen-and-paper signature should be automated.  It might be
easiest to have snapshots involve only an MD5 checksum, while full releases
use a PGP signature.</p>
<p>The resulting MD5 checksum and PGP signature files would have to be
distributed along with the main HDF5 tarball.  This would probably involve
a small alteration to the website and would use a trivial amount of space.</p>
<p>In the long term, HDF5's PGP key would need to be maintained.  It would
be most appropriate to use a single person's key (rather than a "group" key),
so this person would need to examine each release that was signed.  Even if
we don't worry about the "web of trust" in the short term, users will probably
request that we have the key signed by other users eventually, so we may
need to have our keyholder attend "key-signing parties" or attend conferences
with his or her key and photo ID.  We will not be forced to expend any more
effort on this than we want to, of course, and a key that has not been
signed by other users will still be more useful than no key at all.<br>
There should be few other long-term costs beyond the effort required to
create and keep track of the MD5 and PGP files.  We will need to ensure that
we have working versions of md5sum and pgp (both of which are under the GNU
license) and keep track of our PGP keyring.</p>
<p>To verify MD5 checksums or PGP signatures, users would need the appropriate
utilities.  md5sum and gpg are free for Linux, and utilities exist for Windows.
Both MD5 and PGP would create a separate files containing the checksum or
signature, users who wouldn't want to or couldn't worry about these files
could safely ignore them and simply untar the archive as usual.</p>
<br>
<p><b>To create a signature file:</b></p>
<pre>
Create a key:
        gpg --gen-key   (follow the instructions)

Create a separate sig file:
        gpg -b --armor hdf5.tar

Which can then be verified with
        gpg --verify hdf5.tar.asc
</pre>
<p><b>To create an MD5 checksum:</b></p>
<pre>
Find the checksum:
        md5sum hdf5.tar > hdf5.tar.md5

Which can then be verified:
        mdf5sum --check hdf5.tar.md5

Alternately or in addition, we can just post the output
from md5sum hdf5.tar on our website, and let users read it
and compare it themselves.
</pre>

<hr>
<address>
Last modified: 19 September 2005
</address>

</body>
</html>