@c PSPP - a program for statistical analysis.
@c Copyright (C) 2019 Free Software Foundation, Inc.
@c Permission is granted to copy, distribute and/or modify this document
@c under the terms of the GNU Free Documentation License, Version 1.3
@c or any later version published by the Free Software Foundation;
@c with no Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts.
@c A copy of the license is included in the section entitled "GNU
@c Free Documentation License".
@c
@node Encrypted File Wrappers
@chapter Encrypted File Wrappers
SPSS 21 and later can package multiple kinds of files inside an
encrypted wrapper. The wrapper has a common format, regardless of the
kind of the file that it contains.
@quotation Warning
The SPSS encryption wrapper is poorly designed. When the password is
unknown, it is much cheaper and faster to decrypt a file encrypted
this way than if a well designed alternative were used. If you must
use this format, use a 10-byte randomly generated password.
@end quotation
@menu
* Common Wrapper Format::
* Password Encoding::
@end menu
@node Common Wrapper Format
@section Common Wrapper Format
An encrypted file wrapper begins with the following 36-byte header,
where @i{xxx} identifies the type of file encapsulated: @code{SAV} for
a system file, @code{SPS} for a syntax file, @code{SPV} for a viewer
file. PSPP code for identifying these files just checks for the
@code{ENCRYPTED} keyword at offset 8, but the other bytes are also
fixed in practice:
@example
0000 1c 00 00 00 00 00 00 00 45 4e 43 52 59 50 54 45 |........ENCRYPTE|
0010 44 @i{xx} @i{xx} @i{xx} 15 00 00 00 00 00 00 00 00 00 00 00 |D@i{xxx}............|
0020 00 00 00 00 |....|
@end example
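
The check described above can be sketched in Python as follows.  This
is an illustration only, not PSPP's code: the names
@code{wrapped_kind}, @code{MAGIC}, and @code{KINDS} are hypothetical,
and rejecting an unrecognized three-letter type is an assumption of
the sketch (PSPP itself only tests for the keyword).

@example
```python
MAGIC = b"ENCRYPTED"          # keyword stored at offset 8
HEADER_SIZE = 36              # length of the fixed header
KINDS = (b"SAV", b"SPS", b"SPV")


def wrapped_kind(header):
    """Return "SAV", "SPS", or "SPV" for an encrypted wrapper, else None."""
    if len(header) < HEADER_SIZE or header[8:17] != MAGIC:
        return None
    kind = header[17:20]      # the three bytes that follow the keyword
    return kind.decode("ascii") if kind in KINDS else None
```
@end example

A caller would typically read the first 36 bytes of a file and pass
them to @code{wrapped_kind} before attempting any decryption.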
Following the fixed header is essentially the regular contents of the
encapsulated file in its usual format, with each 16-byte block
encrypted with AES-256 in ECB mode.

To make the plaintext an even multiple of 16 bytes in length, the
encryption process appends PKCS #7 padding, as specified in RFC 5652
section 6.3. Padding appends 1 to 16 bytes to the plaintext, in which
each byte of padding is the number of padding bytes added. If the
plaintext is, for example, 2 bytes short of a multiple of 16, the
padding is 2 bytes with value 02; if the plaintext is a multiple of 16
bytes in length, the padding is 16 bytes with value 0x10.
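
The padding rule can be illustrated with a short Python sketch.  The
function name @code{pkcs7_pad} and the constant @code{BLOCK} are
hypothetical names chosen for this example.

@example
```python
BLOCK = 16  # AES block size in bytes


def pkcs7_pad(plaintext):
    """Append 1 to 16 bytes, each equal to the number of bytes appended."""
    n = BLOCK - len(plaintext) % BLOCK   # always in the range 1..16
    return plaintext + bytes([n]) * n
```
@end example

With this sketch, a 14-byte plaintext gains two bytes of value 02, and
a 16-byte plaintext gains a full block of sixteen bytes of value 0x10.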

The AES-256 key is derived from a password in the following way:
@enumerate
@item
Start from the literal password typed by the user. Truncate it to at
most 10 bytes, then append as many null bytes as necessary until there
are exactly 32 bytes. Call this @var{password}.
@item
Let @var{constant} be the following 73-byte constant:
@example
0000 00 00 00 01 35 27 13 cc 53 a7 78 89 87 53 22 11
0010 d6 5b 31 58 dc fe 2e 7e 94 da 2f 00 cc 15 71 80
0020 0a 6c 63 53 00 38 c3 38 ac 22 f3 63 62 0e ce 85
0030 3f b8 07 4c 4e 2b 77 c7 21 f5 1a 80 1d 67 fb e1
0040 e1 83 07 d8 0d 00 00 01 00
@end example
@item
Compute CMAC-AES-256(@var{password}, @var{constant}). Call the
16-byte result @var{cmac}.
@item
The 32-byte AES-256 key is @var{cmac} || @var{cmac}, that is,
@var{cmac} repeated twice.
@end enumerate
@subheading Example
Consider the password @samp{pspp}. @var{password} is:
@example
0000 70 73 70 70 00 00 00 00 00 00 00 00 00 00 00 00 |pspp............|
0010 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
@end example
@noindent
@var{cmac} is:
@example
0000 3e da 09 8e 66 04 d4 fd f9 63 0c 2c a8 6f b0 45
@end example
@noindent
The AES-256 key is:
@example
0000 3e da 09 8e 66 04 d4 fd f9 63 0c 2c a8 6f b0 45
0010 3e da 09 8e 66 04 d4 fd f9 63 0c 2c a8 6f b0 45
@end example
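
The first and last steps of the derivation need no cryptographic
library, so they are easy to sketch in Python.  The function names
@code{prepare_password} and @code{key_from_cmac} are hypothetical, and
the sketch assumes that the password has already been converted to
bytes.  The CMAC-AES-256 step is deliberately left out; it must come
from a cryptographic library.

@example
```python
def prepare_password(typed):
    """Truncate to at most 10 bytes, then zero-pad to exactly 32 bytes."""
    return typed[:10].ljust(32, b"\0")


def key_from_cmac(cmac):
    """Build the 32-byte AES-256 key by repeating the 16-byte CMAC twice."""
    if len(cmac) != 16:
        raise ValueError("expected a 16-byte CMAC")
    return cmac + cmac
```
@end example

Applied to the example above, @code{prepare_password(b"pspp")} yields
the 32-byte @var{password} shown, and passing the listed @var{cmac} to
@code{key_from_cmac} yields the listed key.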
@menu
* Checking Passwords::
@end menu
@node Checking Passwords
@subsection Checking Passwords
A program reading an encrypted file may wish to verify that the
password it was given is the correct one. One way is to verify that
the PKCS #7 padding at the end of the file is well formed. However,
any plaintext that ends in byte 01 is well formed PKCS #7, meaning
that about 1 in 256 keys will falsely pass this test. This might be
acceptable for interactive use, but the false positive rate is too
high for a brute-force search of the password space.
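
A minimal sketch of the padding test, in Python, might look like the
following.  The function name @code{pkcs7_ok} is hypothetical, and the
argument is assumed to be the final 16-byte block after decryption.

@example
```python
def pkcs7_ok(last_block):
    """Return True if a decrypted final block ends in well-formed padding."""
    if len(last_block) != 16:
        return False
    n = last_block[-1]                   # claimed number of padding bytes
    return 1 <= n <= 16 and last_block.endswith(bytes([n]) * n)
```
@end example

Any block whose last byte is 01 passes this test, which is the source
of the false positives described above.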

A better test requires some knowledge of the file format being
wrapped, to obtain a ``magic number'' for the beginning of the file.
@itemize @bullet
@item
The plaintext of system files begins with @code{$FL2@@(#)} or
@code{$FL3@@(#)}.
@item
Before encryption, a syntax file is prefixed with a line at the
beginning of the form @code{* Encoding: @var{encoding}.}, where
@var{encoding} is the encoding used for the rest of the file,
e.g.@: @code{windows-1252}.  Thus, @code{* Encoding} may be used as a
magic number for syntax files.
@item
The plaintext of viewer files begins with 50 4b 03 04 14 00 08 (50 4b
is @code{PK}).
@end itemize
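
The magic numbers listed above can be combined into one check on the
first decrypted block.  The following Python sketch is illustrative:
the function name @code{plausible_plaintext} and the use of the
wrapper's three-letter type code to select the magic number are
assumptions of this example.

@example
```python
# \x40 is the at-sign, written as an escape to keep this listing simple.
SYSTEM_MAGICS = (b"$FL2\x40(#)", b"$FL3\x40(#)")
SYNTAX_MAGIC = b"* Encoding"
VIEWER_MAGIC = bytes.fromhex("504b0304140008")


def plausible_plaintext(kind, first_block):
    """Return True if the first decrypted block starts with the expected magic."""
    if kind == "SAV":
        return first_block.startswith(SYSTEM_MAGICS)
    if kind == "SPS":
        return first_block.startswith(SYNTAX_MAGIC)
    if kind == "SPV":
        return first_block.startswith(VIEWER_MAGIC)
    raise ValueError("unknown wrapper type: %s" % kind)
```
@end example

Every magic number here fits within a single 16-byte block, so a
candidate key can be tested by decrypting only the first block that
follows the header.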
@node Password Encoding
@section Password Encoding
SPSS also supports what it calls ``encrypted passwords.'' These are
not encrypted. They are encoded with a simple, fixed scheme. An
encoded password is always a multiple of 2 characters long, and never
longer than 20 characters. The characters in an encoded password are
always in the graphic ASCII range 33 through 126. Each successive
pair of characters in the password encodes a single byte in the
plaintext password.

Use the following algorithm to decode a pair of characters:
@enumerate
@item
Let @var{a} be the ASCII code of the first character, and @var{b} be
the ASCII code of the second character.
@item
Let @var{ah} be the most significant 4 bits of @var{a}. Find the line
in the table below that has @var{ah} on the left side. The right side
of the line is a set of possible values for the most significant 4
bits of the decoded byte.
@display
@t{2 } @result{} @t{2367}
@t{3 } @result{} @t{0145}
@t{47} @result{} @t{89cd}
@t{56} @result{} @t{abef}
@end display
@item
Let @var{bh} be the most significant 4 bits of @var{b}. Find the line
in the second table below that has @var{bh} on the left side. The
right side of the line is a set of possible values for the most
significant 4 bits of the decoded byte. Together with the results of
the previous step, only a single possibility is left.
@display
@t{2 } @result{} @t{139b}
@t{3 } @result{} @t{028a}
@t{47} @result{} @t{46ce}
@t{56} @result{} @t{57df}
@end display
@item
Let @var{al} be the least significant 4 bits of @var{a}. Find the
line in the table below that has @var{al} on the left side. The right
side of the line is a set of possible values for the least significant
4 bits of the decoded byte.
@display
@t{03cf} @result{} @t{0145}
@t{12de} @result{} @t{2367}
@t{478b} @result{} @t{89cd}
@t{569a} @result{} @t{abef}
@end display
@item
Let @var{bl} be the least significant 4 bits of @var{b}. Find the
line in the table below that has @var{bl} on the left side. The right
side of the line is a set of possible values for the least significant
4 bits of the decoded byte. Together with the results of the previous
step, only a single possibility is left.
@display
@t{03cf} @result{} @t{028a}
@t{12de} @result{} @t{139b}
@t{478b} @result{} @t{46ce}
@t{569a} @result{} @t{57df}
@end display
@end enumerate
@subheading Example
Consider the encoded character pair @samp{-|}. @var{a} is
0x2d and @var{b} is 0x7c, so @var{ah} is 2, @var{bh} is 7, @var{al} is
0xd, and @var{bl} is 0xc. @var{ah} means that the most significant
four bits of the decoded character are 2, 3, 6, or 7, and @var{bh}
means that they are 4, 6, 0xc, or 0xe. The single possibility in
common is 6, so the most significant four bits are 6. Similarly,
@var{al} means that the least significant four bits are 2, 3, 6, or 7,
and @var{bl} means they are 0, 2, 8, or 0xa, so the least significant
four bits are 2. The decoded character is therefore 0x62, the letter
@samp{b}.
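
The decoding algorithm can be sketched in Python by transcribing the
four tables directly.  All names below are hypothetical; the length
checks in @code{decode_password} simply enforce the constraints stated
at the start of this section.

@example
```python
# Each row pairs the nibbles of an encoded character (left side of a
# table) with the candidate nibbles of the decoded byte (right side).
HIGH_A = (("2", "2367"), ("3", "0145"), ("47", "89cd"), ("56", "abef"))
HIGH_B = (("2", "139b"), ("3", "028a"), ("47", "46ce"), ("56", "57df"))
LOW_A = (("03cf", "0145"), ("12de", "2367"), ("478b", "89cd"), ("569a", "abef"))
LOW_B = (("03cf", "028a"), ("12de", "139b"), ("478b", "46ce"), ("569a", "57df"))


def candidates(table, nibble):
    """Look up one nibble of an encoded character in one table."""
    digit = format(nibble, "x")
    for left, right in table:
        if digit in left:
            return set(right)
    raise ValueError("nibble %s is not in the table" % digit)


def decode_nibble(table_a, table_b, na, nb):
    """Intersect the two candidate sets; exactly one value must remain."""
    common = candidates(table_a, na) & candidates(table_b, nb)
    if len(common) != 1:
        raise ValueError("pair does not decode to a single value")
    return int(common.pop(), 16)


def decode_pair(pair):
    """Decode two encoded characters into one plaintext byte value."""
    a, b = ord(pair[0]), ord(pair[1])
    high = decode_nibble(HIGH_A, HIGH_B, a >> 4, b >> 4)
    low = decode_nibble(LOW_A, LOW_B, a & 0xF, b & 0xF)
    return (high << 4) | low


def decode_password(encoded):
    """Decode a whole encoded password into plaintext bytes."""
    if len(encoded) % 2 != 0 or len(encoded) > 20:
        raise ValueError("encoded password must be 0 to 20 characters, in pairs")
    pairs = (encoded[i:i + 2] for i in range(0, len(encoded), 2))
    return bytes(decode_pair(p) for p in pairs)
```
@end example

For the pair from the example above, @code{decode_pair("-|")} returns
0x62, the letter @samp{b}.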