File: abst.tex

package info (click to toggle)
hol88 2.02.19940316dfsg-8
  • links: PTS
  • area: main
  • in suites: forky, sid
  • size: 65,960 kB
  • sloc: ml: 199,939; ansic: 9,666; sh: 6,913; makefile: 6,032; lisp: 2,747; yacc: 894; sed: 201; cpp: 87; awk: 5
file content (325 lines) | stat: -rw-r--r-- 12,492 bytes parent folder | download | duplicates (11)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
\section{Abstract description and specification}
\label{sec-abst}

The informal description of \Tamarack\ introduced a set of
basic data types and operations involving these data types.
Specific details such as the number of
bits in a machine word were avoided
by regarding these data types and
operations as uninterpreted types and uninterpreted primitives.
In this section, we briefly explain some of the motivations for using
uninterpreted types and uninterpreted primitives in our approach
to formal specification.
A more complete explanation may be found in \cite{Joyce:thesis}.
We also describe a formal basis for using uninterpreted types and
uninterpreted primitives in the \HOL\ formulation of higher-order logic.

\subsection{Motivation}

The formal verification of \Tamarack\ focuses very specifically
on the register-transfer level operation of the internal architecture.
Below and above this level
there are aspects of a complete design
for \Tamarack\ which are not formally considered.
For instance, the formal proof does not consider whether
the register-transfer components have been designed correctly.
Similarly, the formal proof only shows that the design satisfies
the semantics of the instruction set but it does not consider whether
these instructions could be combined to write useful programs.

To clearly demarcate the boundaries between what has and what has not
been formally considered in this proof,
we have taken an abstract approach to both the informal description
and formal specification of \Tamarack.
\footnote{
This abstract approach to the specification of \Tamarack\
was prompted by discussions with
researchers at SRI International, Menlo Park.
The EHDM verification system developed at SRI also supports
a form of abstract specification using parameterized modules \cite{EHDM}.
Eveking has similarly advocated `the separation of concerns'
as a general principle of hardware verification \cite{Eveking:glasgow}.}
Very few details are given about the basic data types
and primitive operations used to describe the design and operation
of \Tamarack.

This is quite different than conventional microprocessor descriptions
which are usually very specific about details such as word size
and various operations performed on data.
Nevertheless, it is possible to explain and
reason about a great deal of the
internal architecture without specifying these details.
The ability to make use of this abstract description without knowing
these details is one way in which formal verification is distinguished
from conventional simulation where these details would
generally need to be known.

Because we do not specify details about basic data types and
operations on basic data types,
computational aspects of
our formal theory are somewhat transparent.
This is illustrated by the \verb"add" operation which
appears in both the bottom and top levels of description.
At the bottom level, it is used to describe
one of the functions performed by the ALU.
At the top level, it is used to give
the semantics of the \Tamarack\ ADD instruction.
The name of this operation strongly suggests that it has
something to do with arithmetic addition,
but the exact nature of this operation depends on
both the word size and the particular representation of numbers,
e.g., addition of unsigned 16-bit words.
Although the name \verb"add" is used to denote this operation,
no particular meaning is associated with this operation except its type:
it is only known to be an operation on a pair of machine words which
yields another machine word as a result.
The formal proof only shows that the ADD instruction
is correctly implemented in the sense that the \verb"add" operation
is applied to the correct operands and the result
stored in the correct destination by the end of the instruction cycle.

Another motivation for our abstract approach is pragmatic.
A great deal of extra infrastructure in the formal theory
can be avoided by
using uninterpreted data types and
uninterpreted primitive operations.
The specification of \Tamarack\ differs in this respect
from earlier versions of this example
\cite{Gordon:tech42,Joyce:tech100,Joyce:calgary86}
as well as the verification of \Viper\ \cite{Cohn:calgary86,Cohn:banff87}
by not using the
built-in \HOL\ types
and postulated axioms (inherited from the {\small LCF\_LSM} system)
for bit-patterns and machine words.
This also means that this example can be re-produced
in other formalisms which lack infra-structure
for reasoning about hardware.

The cost of using uninterpreted types and uninterpreted primitives
is surprisely small: just two assumptions
about the data types and primitive operations
are needed to verify \Tamarack.
One of these assumptions,
for instance, states that the numerical value of
every 3-bit word is less than eight.
These two assumptions appear explicitly in
the correctness theorems.

As mentioned earlier,
there is one more benefit of our abstract approach.
By not specifying any details about word size or
the primitive operations performed on data,
we have an informal description and formal specification which are
generalized over a whole range of possible word sizes and possible
interpretations for the primitive operations.

\subsection{Formal basis}
\label{sec-basis}

Our use of the prefix \verb"*"
for uninterpreted data types
in the informal description of \Tamarack, e.g., \verb":*word",
hinted at our intention to use polymorphic types
for uninterpreted types in the formal specification.
We also need a formal basis for uninterpreted primitives, e.g., \verb"add",
which denote operations on these basic data types.

One possibility would be to simply introduce the names of these
operations as constants in the formal theory.
With this approach, the operations denoted by these constants
would have fixed, unknown values.
Although this approach is appealingly simple,
the association of operation names with fixed, unknown values would
be inconvenient if we ever wanted to extend the formal theory
by giving an interpretation to the
heretofore uninterpreted data types and primitive operations.
\footnote{
There is also a technical reason why this scheme might not work
very well for the \HOL\ system, in particular,
that unbound type variables are not allowed in definitions.}

Instead of a theory with constants denoting fixed, unknown values
for primitive operations, we would like to `parameterize', in an
intuitive sense, the theory by the data types and primitive operations
mentioned in the theory.
The theory could be `instantiated' by
giving an interpretation to the data types and primitive operations
(as we show in Section~\ref{sec-relate}).
This would be useful when combining correctness results for
register-transfer architecture of \Tamarack\ with another theory
about the implementation of various register-transfer level components.

Although it is not possible, in a literal sense,
to parameterize a \HOL\ theory,
the desired effect can be achieved in the \HOL\ logic by parameterizing
definitions in the theory by an extra variable, \verb"rep", which
denotes the `representation' of a theory.

The representation parameter can be thought of as a package containing
values for each of the primitive operations.
A particular operation can be specified by applying a selector function
to the representation.
To avoid introducing more names than necessary,
we have just re-used names from the informal description
for the corresponding selector functions.
Whereas these names were used in the informal description to
directly refer to primitive operations,
they are used in the formal specification to
denote selector functions which must be applied
to the representation variable.
Instead of saying,

\begin{quote}
`the operation denoted by \verb"add"'
\end{quote}

\noindent
we refer to this operation as:

\begin{quote}
`the operation selected by \verb"add"'
\end{quote}

For instance, the terms \verb"inc", \verb"add" and \verb"sub" denote
selector functions for the three operations performed by the ALU.
Applying each of these terms to the representation variable \verb"rep"
yields the three ALU operations specified by the representation
variable \verb"rep".

Like any other term in the \HOL\ logic, the representation variable \verb"rep"
is associated with a type.
We have taken the straightforward approach of representing this variable
in the \HOL\ logic as an \mbox{n-tuple} where each element corresponds
to one of the primitive operations on uninterpreted data types.
Selectors for primitive operations are defined in the formal theory
by composing various sequences of the two selectors \verb"FST" and
\verb"SND".

The ML variable \verb"rep_ty"
is used throughout the formal specification to
denote the compound type of the representation variable.
When it appears
inside a HOL term preceded by a caret \verb"^"
(ML anti-quotation),
it is expanded to the type expression shown below.
This use of ML anti-quotation is unfortunate, but
proper type abbreviation cannot be introduced here because
this type expression contains polymorphic variables.

\begintt
let rep_ty =
  ":(*wordn->bool)#                       % iszero %
    (*wordn->*wordn)#                     % inc %
    (*wordn#*wordn->*wordn)#              % add %
    (*wordn#*wordn->*wordn)#              % sub %
    (num->*wordn) #                       % wordn %
    (*wordn->num) #                       % valn %
    (*wordn->*word3)#                     % opcode %
    (*word3->num)#                        % val3 %
    (*wordn->*address)#                   % address %
    (*memory#*address->*wordn)#           % fetch %
    (*memory#*address#*wordn->*memory)#   % store %
    (num->*word4)#                        % word4 %
    (*word4->num)";;                      % val4 %
\endtt

The particular ordering of elements in this \mbox{n-tuple} is completely
arbitrary.
While a less concrete representation could be used,
any extra effort in this regard is probably not worthwhile.
Details of this particular representation are isolated to
a very small part of
the formalization, in particular, the following definitions of the
selector functions for each of the primitive operations.

\begintt
let iszero = new_definition (
	`iszero`,
	"iszero (rep:^rep_ty) = FST rep");;

let inc = new_definition(
	`inc`,
	"inc (rep:^rep_ty) = FST(SND rep)");;

let add = new_definition (
	`add`,
	"add (rep:^rep_ty) = FST(SND(SND rep))");;

let sub = new_definition (
	`sub`,
	"sub (rep:^rep_ty) = FST(SND(SND(SND rep)))");;

let wordn = new_definition (
	`wordn`,
	"wordn (rep:^rep_ty) = FST(SND(SND(SND(SND rep))))");;

let valn = new_definition (
	`valn`,
	"valn (rep:^rep_ty) = FST(SND(SND(SND(SND(SND rep)))))");;
\endtt

\newpage % PBHACK

\begintt
let opcode = new_definition (
	`opcode`,
	"opcode (rep:^rep_ty) =
	  FST(SND(SND(SND(SND(SND(SND rep))))))");;

let val3 = new_definition (
	`val3`,
	"val3 (rep:^rep_ty) =
	  FST(SND(SND(SND(SND(SND(SND(SND rep)))))))");;

let address = new_definition (
	`address`,
	"address (rep:^rep_ty) =
	  FST(SND(SND(SND(SND(SND(SND(SND(SND rep))))))))");;

let fetch = new_definition (
	`fetch`,
	"fetch (rep:^rep_ty) =
	  FST(SND(SND(SND(SND(SND(SND(SND(SND(SND rep)))))))))");;

let store = new_definition (
	`store`,
	"store (rep:^rep_ty) =
	  FST(SND(SND(SND(SND(SND(SND(SND(SND(SND(SND rep))))))))))");;

let word4 = new_definition (
	`word4`,
	"word4 (rep:^rep_ty) =
	  FST(SND(SND(SND(SND(SND(SND(SND(SND(SND(SND(SND rep)))))))))))");;

let val4 = new_definition (
	`val4`,
	"val4 (rep:^rep_ty) =
	  SND(SND(SND(SND(SND(SND(SND(SND(SND(SND(SND(SND rep)))))))))))");;
\endtt

\newpage % PBHACK

As just mentioned,
two assumptions about uninterpreted types and
uninterpreted primitives are needed to verify \Tamarack.
These assumptions are expressed by predicates
\verb"Val3_CASES_ASM" and \verb"Val4Word4_ASM" which are both
parameterized by the representation variable \verb"rep".

\begintt
let Val3_CASES_ASM = new_definition (
  `Val3_CASES_ASM`,
  "Val3_CASES_ASM (rep:^rep_ty) = \(\forall\)w. ((val3 rep) w) < 8");;

let Val4Word4_ASM = new_definition (
  `Val4Word4_ASM`,
  "Val4Word4_ASM (rep:^rep_ty) =
   \(\forall\)n. n < 16 ==> (((val4 rep) ((word4 rep) n)) = n)");;
\endtt

The predicate \verb"Val3_CASES_ASM" expresses the assumption
that the numerical value of
every 3-bit word is less than eight.
The predicate \verb"Val4Word4_ASM" expresses the assumption
that \verb":val4" and \verb":word4" are inverses for numbers
less than sixteen.