@c PSPP - a program for statistical analysis.
@c Copyright (C) 2019 Free Software Foundation, Inc.
@c Permission is granted to copy, distribute and/or modify this document
@c under the terms of the GNU Free Documentation License, Version 1.3
@c or any later version published by the Free Software Foundation;
@c with no Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts.
@c A copy of the license is included in the section entitled "GNU
@c Free Documentation License".
@c

@node Encrypted File Wrappers
@chapter Encrypted File Wrappers

SPSS 21 and later can package multiple kinds of files inside an
encrypted wrapper.  The wrapper has a common format, regardless of the
kind of the file that it contains.

@quotation Warning
The SPSS encryption wrapper is poorly designed.  When the password is
unknown, a file encrypted this way is much cheaper and faster to
decrypt than one protected by a well-designed alternative.  If you must
use this format, use a randomly generated 10-byte password, because
only the first 10 bytes of a password are used.
@end quotation

@menu
* Common Wrapper Format::
* Password Encoding::
@end menu

@node Common Wrapper Format
@section Common Wrapper Format

An encrypted file wrapper begins with the following 36-byte header,
where @i{xxx} identifies the type of file encapsulated: @code{SAV} for
a system file, @code{SPS} for a syntax file, @code{SPV} for a viewer
file.  PSPP code for identifying these files just checks for the
@code{ENCRYPTED} keyword at offset 8, but the other bytes are also
fixed in practice:

@example
0000  1c 00 00 00 00 00 00 00  45 4e 43 52 59 50 54 45  |........ENCRYPTE|
0010  44 @i{xx} @i{xx} @i{xx} 15 00 00 00  00 00 00 00 00 00 00 00  |D@i{xxx}............|
0020  00 00 00 00                                       |....|
@end example
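
As an illustration only, the check described above might be sketched
in Python as follows.  The function name @code{wrapped_file_type} is
hypothetical and is not part of PSPP.  The sketch tests only the
@code{ENCRYPTED} keyword at offset 8 and then reports the three type
bytes that follow it.

```python
HEADER_LEN = 36          # size of the fixed header shown above
KEYWORD = b"ENCRYPTED"   # occupies offsets 8 through 16


def wrapped_file_type(data):
    """Return b"SAV", b"SPS", or b"SPV" for a wrapper, else None."""
    if len(data) < HEADER_LEN:
        return None
    if data[8:17] != KEYWORD:
        return None
    kind = data[17:20]   # the three type bytes at offsets 0x11 to 0x13
    if kind in (b"SAV", b"SPS", b"SPV"):
        return kind
    return None
```

The encrypted payload, if any, starts at offset @code{HEADER_LEN}.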

Following the fixed header is essentially the regular contents of the
encapsulated file in its usual format, with each 16-byte block
encrypted with AES-256 in ECB mode.

To make the plaintext length a multiple of 16 bytes, the
encryption process appends PKCS #7 padding, as specified in RFC 5652
section 6.3.  Padding appends 1 to 16 bytes to the plaintext, and
each padding byte holds the number of padding bytes added.  If the
plaintext is, for example, 2 bytes short of a multiple of 16, the
padding is 2 bytes with value 0x02; if the plaintext is already a
multiple of 16 bytes in length, the padding is 16 bytes with value 0x10.
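
The padding rule can be sketched in a few lines of Python.  This is a
minimal illustration, not PSPP code; the names @code{pkcs7_pad} and
@code{pkcs7_unpad} are made up for this example.

```python
BLOCK = 16  # AES block size in bytes


def pkcs7_pad(plaintext):
    """Append 1 to 16 bytes, each equal to the number of bytes added."""
    n = BLOCK - len(plaintext) % BLOCK
    return plaintext + bytes([n]) * n


def pkcs7_unpad(padded):
    """Strip the padding, raising ValueError if it is not well formed."""
    if not padded or len(padded) % BLOCK != 0:
        raise ValueError("length is not a positive multiple of 16")
    n = padded[-1]
    if n < 1 or n > BLOCK or padded[-n:] != bytes([n]) * n:
        raise ValueError("malformed PKCS #7 padding")
    return padded[:-n]
```

A 14-byte plaintext gains two bytes of value 0x02, and a 16-byte
plaintext gains a full block of sixteen bytes of value 0x10.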

The AES-256 key is derived from a password in the following way:

@enumerate
@item
Start from the literal password typed by the user.  Truncate it to at
most 10 bytes, then append as many null bytes as necessary until there
are exactly 32 bytes.  Call this @var{password}.

@item
Let @var{constant} be the following 73-byte constant:

@example
0000  00 00 00 01 35 27 13 cc  53 a7 78 89 87 53 22 11
0010  d6 5b 31 58 dc fe 2e 7e  94 da 2f 00 cc 15 71 80
0020  0a 6c 63 53 00 38 c3 38  ac 22 f3 63 62 0e ce 85
0030  3f b8 07 4c 4e 2b 77 c7  21 f5 1a 80 1d 67 fb e1
0040  e1 83 07 d8 0d 00 00 01  00
@end example

@item
Compute CMAC-AES-256(@var{password}, @var{constant}).  Call the
16-byte result @var{cmac}.

@item
The 32-byte AES-256 key is @var{cmac} || @var{cmac}, that is,
@var{cmac} repeated twice.
@end enumerate

@subheading Example

Consider the password @samp{pspp}.  @var{password} is:

@example
0000  70 73 70 70 00 00 00 00  00 00 00 00 00 00 00 00  |pspp............|
0010  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
@end example

@noindent
@var{cmac} is:

@example
0000  3e da 09 8e 66 04 d4 fd  f9 63 0c 2c a8 6f b0 45
@end example

@noindent
The AES-256 key is:

@example
0000  3e da 09 8e 66 04 d4 fd  f9 63 0c 2c a8 6f b0 45
0010  3e da 09 8e 66 04 d4 fd  f9 63 0c 2c a8 6f b0 45
@end example

@menu
* Checking Passwords::
@end menu

@node Checking Passwords
@subsection Checking Passwords

A program reading an encrypted file may wish to verify that the
password it was given is the correct one.  One way is to verify that
the PKCS #7 padding at the end of the file is well formed.  However,
any plaintext that ends in byte 01 is well formed PKCS #7, meaning
that about 1 in 256 keys will falsely pass this test.  This might be
acceptable for interactive use, but the false positive rate is too
high for a brute-force search of the password space.

A better test requires some knowledge of the file format being
wrapped, to obtain a ``magic number'' for the beginning of the file.

@itemize @bullet
@item
The plaintext of system files begins with @code{$FL2@@(#)} or
@code{$FL3@@(#)}.

@item
Before encryption, a syntax file is prefixed with a line of the form
@code{* Encoding: @var{encoding}.}, where
@var{encoding} is the encoding used for the rest of the file,
e.g.@: @code{windows-1252}.  Thus, @code{* Encoding} may be used as a
magic number for syntax files.

@item
The plaintext of viewer files begins with 50 4b 03 04 14 00 08 (50 4b
is @code{PK}).
@end itemize
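
A magic-number test along these lines might look like the following
Python sketch.  The function name @code{plausible_plaintext} and its
arguments are hypothetical; the caller is assumed to have already
decrypted the first 16-byte block with the candidate key.

```python
SAV_MAGIC_TAIL = b"\x40(#)"   # 0x40 is the at sign


def plausible_plaintext(kind, first_block):
    """Return True if a decrypted first block starts with known magic.

    kind is b"SAV", b"SPS", or b"SPV"; first_block holds the first
    16 decrypted bytes of the wrapped file.
    """
    if kind == b"SAV":
        return (first_block.startswith(b"$FL2" + SAV_MAGIC_TAIL)
                or first_block.startswith(b"$FL3" + SAV_MAGIC_TAIL))
    if kind == b"SPS":
        return first_block.startswith(b"* Encoding")
    if kind == b"SPV":
        return first_block.startswith(bytes.fromhex("504b0304140008"))
    return False
```

Because each magic number spans at least 7 bytes, a wrong key is far
less likely to pass this test than the padding test.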

@node Password Encoding
@section Password Encoding

SPSS also supports what it calls ``encrypted passwords.''  These are
not encrypted.  They are encoded with a simple, fixed scheme.  An
encoded password is always a multiple of 2 characters long, and never
longer than 20 characters.  The characters in an encoded password are
always in the graphic ASCII range 33 through 126.  Each successive
pair of characters in the encoded password encodes a single byte of the
plaintext password.

Use the following algorithm to decode a pair of characters:

@enumerate
@item
Let @var{a} be the ASCII code of the first character, and @var{b} be
the ASCII code of the second character.

@item
Let @var{ah} be the most significant 4 bits of @var{a}.  Find the line
in the table below that has @var{ah} on the left side.  The right side
of the line is a set of possible values for the most significant 4
bits of the decoded byte.

@display
@t{2 } @result{} @t{2367}
@t{3 } @result{} @t{0145}
@t{47} @result{} @t{89cd}
@t{56} @result{} @t{abef}
@end display

@item
Let @var{bh} be the most significant 4 bits of @var{b}.  Find the line
in the second table below that has @var{bh} on the left side.  The
right side of the line is a set of possible values for the most
significant 4 bits of the decoded byte.  Together with the results of
the previous step, only a single possibility is left.

@display
@t{2 } @result{} @t{139b}
@t{3 } @result{} @t{028a}
@t{47} @result{} @t{46ce}
@t{56} @result{} @t{57df}
@end display

@item
Let @var{al} be the least significant 4 bits of @var{a}.  Find the
line in the table below that has @var{al} on the left side.  The right
side of the line is a set of possible values for the least significant
4 bits of the decoded byte.

@display
@t{03cf} @result{} @t{0145}
@t{12de} @result{} @t{2367}
@t{478b} @result{} @t{89cd}
@t{569a} @result{} @t{abef}
@end display

@item
Let @var{bl} be the least significant 4 bits of @var{b}.  Find the
line in the table below that has @var{bl} on the left side.  The right
side of the line is a set of possible values for the least significant
4 bits of the decoded byte.  Together with the results of the previous
step, only a single possibility is left.

@display
@t{03cf} @result{} @t{028a}
@t{12de} @result{} @t{139b}
@t{478b} @result{} @t{46ce}
@t{569a} @result{} @t{57df}
@end display
@end enumerate

@subheading Example

Consider the encoded character pair @samp{-|}.  @var{a} is
0x2d and @var{b} is 0x7c, so @var{ah} is 2, @var{bh} is 7, @var{al} is
0xd, and @var{bl} is 0xc.  @var{ah} means that the most significant
four bits of the decoded character are 2, 3, 6, or 7, and @var{bh}
means that they are 4, 6, 0xc, or 0xe.  The single possibility in
common is 6, so the most significant four bits are 6.  Similarly,
@var{al} means that the least significant four bits are 2, 3, 6, or 7,
and @var{bl} means they are 0, 2, 8, or 0xa, so the least significant
four bits are 2.  The decoded character is therefore 0x62, the letter
@samp{b}.
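
The decoding algorithm can be sketched in Python by transcribing the
four tables directly.  This is an illustrative sketch rather than PSPP
code; @code{decode_pair}, @code{decode_password}, and the helper names
are invented for the example.

```python
HEX = "0123456789abcdef"

# Each entry pairs the nibbles on the left of a table line with the
# candidate nibbles on its right, in the order the tables appear above.
AH = [("2", "2367"), ("3", "0145"), ("47", "89cd"), ("56", "abef")]
BH = [("2", "139b"), ("3", "028a"), ("47", "46ce"), ("56", "57df")]
AL = [("03cf", "0145"), ("12de", "2367"), ("478b", "89cd"), ("569a", "abef")]
BL = [("03cf", "028a"), ("12de", "139b"), ("478b", "46ce"), ("569a", "57df")]


def candidates(table, nibble):
    """Return the right side of the table line that lists nibble."""
    for left, right in table:
        if HEX[nibble] in left:
            return right
    raise ValueError("nibble not valid in an encoded password")


def common(x, y):
    """Return the single nibble value that appears in both x and y."""
    both = [c for c in x if c in y]
    if len(both) != 1:
        raise ValueError("no unique decoding")
    return HEX.index(both[0])


def decode_pair(a, b):
    """Decode the ASCII codes a and b into one plaintext byte."""
    high = common(candidates(AH, a >> 4), candidates(BH, b >> 4))
    low = common(candidates(AL, a & 15), candidates(BL, b & 15))
    return high * 16 + low


def decode_password(encoded):
    """Decode a whole encoded password string into plaintext bytes."""
    if len(encoded) % 2 != 0 or len(encoded) > 20:
        raise ValueError("bad encoded password length")
    codes = encoded.encode("ascii")
    return bytes([decode_pair(codes[i], codes[i + 1])
                  for i in range(0, len(codes), 2)])
```

For the pair worked through above, @code{decode_pair(0x2d, 0x7c)}
yields 0x62.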