1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449 450 451 452 453 454 455 456 457 458 459 460 461 462 463 464 465 466 467 468 469 470 471 472 473 474 475 476 477 478 479 480 481 482 483 484 485 486 487 488 489 490 491 492 493 494 495 496 497 498 499 500 501 502 503 504 505 506 507 508 509 510 511 512 513 514 515 516 517 518 519 520 521 522 523 524 525 526 527 528 529 530 531 532 533 534 535 536 537 538 539 540 541 542 543 544 545 546 547 548 549 550 551 552 553 554 555 556 557 558 559 560 561 562 563 564 565 566 567 568 569 570 571 572 573 574 575 576 577 578 579 580 581 582 583 584 585 586 587 588 589 590 591 592 593 594 595 596 597 598 599 600 601 602 603 604 605 606 607 608 609 610 611 612 613 614 615 616
|
Advice on the use of arithmetic books (based largely on comments from
Robert Krug, and repeated in ../arithmetic/README):
SUMMARY: Power users should include arithmetic-5/top; new users
should probably instead include arithmetic/top-with-meta.
Details:
If you are willing to use dmr (see :doc dmr) and tolerate
sometimes-slower (but more complete) reasoning, use arithmetic-5.
Robert is not aware of any rewriting loops.
Don't use arithmetic-3. It's not as powerful as arithmetic-5
(Robert doesn't know if it's faster), and it may have more rewriting
loops than arithmetic-5.
[NOTE added by Matt K.: I have seen at least one example in which
arithmetic-3 proves the theorem in 0.30 seconds but arithmetic-5
takes 7.97 seconds. So clearly arithmetic-3 is better sometimes!]
Note that arithmetic-2 is totally obsolete. We may stop
distributing it.
If you want fast, lightweight arithmetic, use
arithmetic/top-with-meta.
=================================================================
README for Arithmetic-5
Author: Robert Bellarmine Krug
Read at least Section 1.A.
=================================================================
Contents
0. WHICH ARITHMETIC BOOKS SHOULD I USE?
1. HOW TO USE THE ARITHMETIC-5 BOOKS.
1.A. In Brief
1.B. Choosing a Normal Form.
1.C. Mini-theories.
2. HOW TO PROVE THEOREMS USING ACL2.
2.A. Introduction.
2.B. How to Use the Theorem Prover --- ``The Method''.
2.C. Structured Theory Development.
2.D. A Few Notes.
=================================================================
0. WHICH ARITHMETIC BOOKS SHOULD I USE?
I wish there were a simple and always correct answer to this common
question. Unfortunately, when Goedel kicked us out of Cantor's
paradise, he demonstrated that such an answer could not exist.
Do not, however, abandon hope. The following guidelines should
be good enough.
If you only need a little arithmetic reasoning (and no reasoning
about floor and mod), arithmetic/top-with-meta will probably
suffice. It is a good, fast, minimal arithmetic library.
The book arithmetic-5/top is much more powerful, and includes
rules for reasoning about floor and mod. It also includes
the start of a theory about the logical operators such as logand
and lognot. You can enable nonlinear reasoning with either
(set-default-hints '((nonlinearp-default-hint
stable-under-simplificationp
hist pspv)))
or
(set-default-hints '((nonlinearp-default-hint++
id stable-under-simplificationp
hist nil)))
This library is reasonably fast, but unfortunately suffers from
rewrite loops on occasion.
I do not particularly recommend arithmetic-3, but it is a bit
faster than arithmetic-5, and some users may find it useful.
The most powerful (and therefore the slowest) library is this
one, arithmetic-5/top. Perhaps the greatest drawback for some
users is that arithmetic-5 can cause excessive case-splits.
We would greatly appreciate any feedback on the shortcomings and/or
strengths of these books, as well as any hints as to how to choose
the appropriate library.
=================================================================
1. HOW TO USE THE ARITHMETIC-5 BOOKS.
1.A. In Brief
Load the library with:
(include-book "arithmetic-5/top" :dir :system)
and enable nonlinear arithmetic with either
(set-default-hints '((nonlinearp-default-hint
stable-under-simplificationp
hist pspv)))
or
(set-default-hints '((nonlinearp-default-hint++
id stable-under-simplificationp
hist nil)))
The latter being the suggested (and slightly slower) version.
Note 1: If you are having difficulty proving a ``simple'' theorem
and the arithemtic book's rules seem to be getting in the way, You can
execute
(set-minimal-arithmetic-5-theory)
This will disable almost all of the arithmetic book's rules, leaving
you with a small set.
Execute (set-default-arithmetic-5-theory) to return to the default
starting point of the arithmetic books.
Note 2: If you encounter a theorem for which arithmetic-5 is being
too slow, there are two techniques which can help you to determine
which rules are the most likely culprits, and then disable them.
First (the most powerful, but tied to the use of emacs) is dmr.
The second (and simpler) is accumulated persistence. Both of these
are documented in the standard ACL2 documentation.
Note 3: Perhaps the greatest drawback for some
users is that arithmetic-5 can cause excessive case-splits.
These have two sources --- first it is very eager to split on
the types of operands to arithmetic operations, if it cannot
determine (via type reasoning) the the operand is of the correct
type. These rules have names such as default-plus-1. Disabling
these rules may lead to strange behaviour. The second source
is rules for if normalization. I have found these rules very
usefull at times, but they can be expensive. These rules have
names like |(+ (if a b c) x)|
Note 4: Occasionally non-linear arithmetic will go off into the weeds
and consume too much time. If you notice that ACL2 is taking too long
on a particular goal or theorem, you can try executing
(set-default-hints nil) before retrying the problematic theorem. This
will result in a theory which ``knows'' less about multiplication, so
one should examine any failed goals with this in mind. One should
also remember to re-enable the nonlinearp-default-hint afterwards.
1.B. Choosing a Normal Form.
The major goal of this library is to provide a set of rewrite rules
which will drive systems of arithmetic expressions into a useful
normal form. There are, however, at least two barriers to achieving
this goal. First, as Kurt G\"odel showed, integer arithmetic is
formally undecidable; i.e., there is no algorithm which will fully
normalize all systems of arithmetic expressions. Second, there is no
reason to believe that there is one ``best'' normal form. We
therefore provide the ability to choose any one of several alternative
normal forms.
Before we go any further we should mention that there is, in general,
no reason to use the information in the rest of this section. We
believe that the default setup is sufficient for most purposes.
On occasion one will run across a lemma which requires a
different normal form and it is for these occasions that we provide
the following; however, for many users it will be sufficient to just
include the appropriate books and to enable nonlinear arithmetic
as described above.
Gather/Scatter-exponents:
There are (at least) two ways of normalizing the (almost) equivalent
terms:
(expt x (+ m n)) and
(* (expt x m) (expt x n)).
(Question: Under what conditions are these two terms not equivalent?)
One can choose the first of these as the normal form and gather
exponents, or one can choose the second as the normal form and scatter
exponents. That is, one can choose to rewrite (expt x (+ m n)) to
(* (expt x m) (expt x n)) --- to scatter exponents; or to rewrite
(* (expt x m) (expt x n)) to (expt x (+ m n)) --- to gather exponents.
By default, exponents are gathered when the book Bind-free/top is
loaded.
To switch to scattering exponents, execute (scatter-exponents)
at the top level prompt. To resume gathering exponents, execute
(gather-exponents). These two forms are macros which expand to the
appropriate set of commands. (To see their expansions, one can
execute
:trans (scatter-exponents) or
:trans (gather-exponents) at the top-level.
It is slightly more
complex to switch theories within a :hints form. One can use
:in-theory (e/d (scatter-exponents-theory)
(gather-exponents-theory))
and
:in-theory (e/d (gather-exponents-theory)
(scatter-exponents-theory
prefer-positive-exponents-scatter-exponents-theory))
to switch to scattering exponents or gathering exponents respectively.
Prefer-positive-addends:
This theory (enabled by default) moves negative addends to the other
side of an equality or inequality. A simple example is:
(< (+ a (- b) c)
(+ d (* -3 e)))
===>
(< (+ a c (* 3 e))
(+ b d)).
Execute (do-not-prefer-positive-addends) at the top-level to disable
this theory and (prefer-positive-addends) to re-enable it. To switch
theories within a :hints form, one can use:
:in-theory (disable prefer-positive-addends-theory)
and
:in-theory (enable prefer-positive-addends-theory)
to disable or enable this theory.
Prefer-positive-exponents:
This theory (disabled by default) moves factors with negative
exponents to the other side of an equality or inequality. A simple
example is:
(equal (* a (/ b) c)
(* d (expt e -3)))
===>
(equal (* a c (expt e 3))
(* b d)).
One can enable this with (prefer-positive-exponents) and disable it
with (do-not-prefer-positive-exponents) at the top level prompt.
Note that (prefer-positive-exponents) is a specialization of
(scatter-exponents). It therefore switches to scattering exponents if
this has not been done previously. Thus, one may need to execute
(gather-exponents) after (do-not-prefer-positive-exponents) if one was
originally gathering exponents and wished to resume doing so. To
switch theories within a :hints form, one can use:
:in-theory (e/d (scatter-exponents-theory
prefer-positive-exponents-scatter-exponents-theory)
(gather-exponents-theory))
and
:in-theory (disable prefer-positive-exponents-scatter-exponents-theory)
or
:in-theory (e/d (gather-exponents-theory)
(scatter-exponents-theory
prefer-positive-exponents-scatter-exponents-theory)).
1.C. Mini-theories.
The book mini-theories contains several useful lemmatta not enabled by
default. We recommend that you look briefly at this book. Here we
describe two of the lemmas which are included.
Two lemmas --- |(- x)| and |(/ x)| --- are included for those who like
to replace subtraction with multiplication by -1 and division by
exponentiation. If you enable these, be sure to disable their
inverses, |(* -1 x)| and |(expt x -1)|, or you will get infinite
rewrite loops.
=================================================================
In this mini-essay, we assume a basic familiarity with ACL2.
2. HOW TO PROVE THEOREMS USING ACL2.
2.A. Introduction.
When using ACL2, it is important to take an organized approach to the
proof process. Sections 1.B. and 1.C. are a summary of the material
to be found in Chapter 9 of Computer-Aided Reasoning: An Approach and
Chapter 6 of Computer-Aided Reasoning: ACL2 Case Studies. I highly
recommend these two books to the serious ACL2 user. In section 2.D. we
address a few issues that are too often overlooked by the ACL2 user.
Here, we mention a few simple points about using ACL2.
1. ACL2 is automatic only in the sense that you cannot steer it once
it begins a proof attempt. You can only interrupt it.
2. You are responsible for guiding it, usually by proving the
necessary lemmas.
3. Never prove a lemma without also thinking about how it will be
used.
4. To lead ACL2 to a proof, you usually must know the proof yourself.
2.B. How to Use the Theorem Prover --- ``The Method''
In this section, we outline `The Method''. While it is only one of
many possible styles of working with ACL2, we have found it to be a
useful starting point from which to develop one's own method. In the
following section we will discuss some of its weaknesses and how to
get around them.
Let us imagine that we want to prove some theorem, MAIN; that in
order to do so it is sufficient to use lemmas A, B, and C; and that
each of these lemmas can be proved using sub-lemmas as depicted in the
proof tree below.
MAIN
|
|
--------------------------------------
| | |
| | |
A B C
| | |
| | |
----------- --------- -----------
| | | | | | |
| | | | | | |
A1 A2 A3 B1 B2 C1 C2
|
|
---------
| |
| |
A1a A2b
One will not usually have worked out the proof (or even necessity) of
every lemma before trying to prove MAIN. In fact, most users work out
the detailed structure of the proof tree during their interaction with
ACL2. Merely keeping track of what has been proved and what remains
to be proved is daunting enough. It is this book-keeping task that
``The Method'' is designed to assist.
The goal of the procedure we outline here is to produce a sequence of
defthm commands which will lead ACL2 to a proof of MAIN. The
procedure will produce a post-order traversal of the tree (A1, A1a,
A1b, A2, A3, A, B2, B2, B, C1, C2, C, MAIN).
We use ACL2 in conjunction with a text editor such as Emacs where we
can run ACL2 as a process in a *shell* buffer, and maintain a second
buffer for input commands, referred to as the script buffer.
When we are done, the script buffer will contain the postorder
traversal of the proof tree. But, as we construct the proof the
script buffer is logically divided into two parts by an imaginary line
we call the barrier. The part above the barrier consists of the
commands that have been carried out, while the part below the buffer
contains the commands we intend to carry out later. The barrier is,
by convention, denoted by the s-expression (I-am-here).
Initially, the script buffer should contain only the theorem
MAIN with the barrier at the top of the buffer; i.e., the done list
is empty and the to-do list contains only MAIN. Here is ``The
Method''.
1. Think about the proof of the first theorem in the to-do list.
Have the necessary lemmas been proved? If so, go to step 2.
Otherwise add them to the front of the to-do list and repeat step 2.
2. Try proving the first theorem with ACL2 and let the output stream
into the *shell* buffer. Abort the proof if it runs too long.
3. If ACL2 succeeded, advance the barrier past the successful command
and go to step 2. Otherwise go to step 4.
4. Inspect the output of the failed proof attempt from the beginning
rather than from the end, You should look for the first place the
proof attempt deviated from your imagined proof. Modify the script
appropriately --- this usually means adding lemmas to the front of the
to-do list although you may need to add hints to the current theorem.
Sometimes, especially if the reason for the failure is that the
theorem is false, this may mean modifying the script both above and
below the barrier. Go to step 2.
2.C. Structured Theory Development.
There are several shortcomings of ``The Method''. In this section,
we describe some of these shortcomings and discuss an elaboration of
``The Method'' which should lessen their influence.
First, ``The Method'' provides no guidance for developing the over-all
structure of a substantial proof effort. It is too easy to lose one's
way in the middle of the proof effort and, moreover, once the proof is
complete it can be quite difficult to comprehend its structure. Such
comprehension is important for presenting the proof to others, and is
also useful for modifying the proof --- either in order to clean it up or
in order to prove a related theorem.
Second, use of ``The Method'' is prone to lead to the following:
One desires to prove a certain lemma, but it requires a lemma in support
of its proof, which leads to another lemma to be proved in support of
\emph{that} one, and so on. At some point the effort seems misguided,
but by then the evolving proof structure is far from clear and it is
difficult to decide how far to back up. Even if a decision is made to
back up to a particular lemma, is it clear which lemmas already proved
may be discarded?
Finally, even if the above process is successful for a while, it can
ultimately be problematic in an even more frustrating way. Suppose
that one attempts to prove some goal theorem, and from the failed
proof attempt one identifies rewrite rules that appear to be helpful,
say, L1 and L2. Suppose further that additional rewrite rules are
proved on the way to proving L1 and L2. When one again attempts to
prove the original goal theorem, those additional rules can send the
proof attempt in a new direction and prevent L1 and L2 from being
used.
We describe here a modular, top-down methodology which reflects common
proof development practice in the mathematical community and is
designed to lessen the above described problems.
Here is an outline describing many proofs, both mechanically checked
ones, and others.
1. To prove the main theorem Main:
2. It should suffice to prove lemmas A, B, and C. (Main Reduction)
3. We need to draw upon a body of other, previously completed
work. (Proof Support)
4. We may also need a few additional, minor, lemmas. (Proof Hacking)
The outline may be reflected in the following structure of a top-level
book, which (at least initially) can use defaxiom or skip-proofs as
shown in order to defer the proofs of the main lemmas.
(include-book "lib") ; 3. Support
(defaxiom A ...) ; 2. Main Reduction
(defaxiom B ...)
(defaxiom C ...)
<minor lemmas> ; 4. Proof Hacking
(defthm MAIN ...) ; 1. Goal
This use of defaxiom has the advantage that one can verify that lemmas
A, B, and C are actually the ones needed to prove MAIN without having
to do a lot of work to prove them. In step four of our description of
``The Method'', we said that one should add any new lemmas to the
front of the to-do list and then go directly to step one. Here, we
recommend adding these new lemmas using skip-proofs (or as axioms) and
making sure that they are what is needed before continuing.
For purposes of illustration assume that we next want to prove lemma
C. Above, we suggested that this be done by adding any necessary
sub-lemmas in front of this main lemma. Here, we suggest that one of
two techniques be used. If the proof of C requires only two or three
lemmas, and each of these requires no further sub-lemmas, prove C
within the scope of an encapsulate as in:
(encapsulate
()
(local
(defthm C1
...))
(local
(defthm C2
...))
(defthm C
...)
).
Note that by making lemmas C2 and C2 local to the encapsulate they
will not be seen outside of the context in which they were designed to
be used, and so will not change the behavior of any events later in
the script. Their influence has been limited and the chances for
surprises has been lessened.
If the proof of lemma C is more complex --- if it requires more than a
couple of lemmas or if those sub-lemmas themselves require sub-lemmas
--- lemma C should be proved in a separate book named C.lisp and the
following should appear in the main script:
(encapsulate
()
(local (include-book "C"))
(defthm C
...)
).
The book C.lisp should then be recursively treated as described here
with lemma C playing the role of MAIN. This delegation of the work to
prove C to another book makes the file MAIN.lisp much easier to read,
and the hierarchy of book inclusions reflects the overall structure of
the proof. (See certify-book and include-book in the documentation.
We assume here that C.lisp will be certified at some point during the
proof development cycle.)
Note that we are using two types of books:
2. Lemma Books: Book name is the same as the name of the final theorem
proved in the book. Main lemmas can be postponed to be proved in
subsidiary lemma books, temporarily replaced by defaxiom in the
present book.
2. Library Books: Other than a lemma book, typically it contains
generally useful definitions and lemmas.
Our top-down methodology suggests a focus on developing reasonably
short lemma books. The trick to keeping their length under control is
first to identify the main lemmas in support of the book's goal
theorem, then pushing their proofs into subsidiary lemma sub-books,
especially when supporting sub-lemmas may be required. Each main lemma
is handled as illustrated above with sub-books A, B, and C: an
encapsulate event contains first a local include-book of the
corresponding lemma sub-book, and second the main lemma.
An important aspect of this approach is that the way in which a
sub-lemma is proved in such a sub-book will not affect the
certification of the parent book. That is, the use of local around
the include-book of a lemma sub-book allows the sub-book to be
changed, other than changing its goal theorem, without affecting the
certification of the parent book. This modularity can prevent replay
problems often encountered when using less structured approaches to
mechanically-assisted proof.
Although our focus here has been on lemma books, there is still a role
for library books. It can be useful from time to time to browse ones
current collection of books and to pull out the most general of these
lemmas and put them into one or more library books. This is how the
library books in books/arithmetic, for instance, were developed.
2.D. A Few Notes.
We finish this part with some more simple points which are easy to
overlook:
1. Definitions, both recursive and non-recursive, should be disabled
as soon as possible. Non-recursive functions, which can be regarded
as abbreviations, will be opened up by ACL2 at the first opportunity
and so will rarely be used as intended if they are enabled.
Recursive definitions will not disappear in the same way, and so the
desirability of disabling them is not as readily apparent. However,
if they are enabled ACL2 will open them up each time they are seen,
attempt to rewrite them, and then close them back up again unless the
rewritten definition is ``better'' than the unopened definition.
Unless one is proving simple facts about the function this is often a
large waste of time. We have seen proofs go from taking more than 30
minutes to under 15 seconds by disabling (an admittedly large number
of complicated and mutually) recursive function definitions.
2. In addition to the ``obvious'' facts about a function which one
should prove before disabling it, one should prove lemmas about the
simple cases which can arise surprisingly often. In the case of
arithmetic functions this would include such cases as repeated
arguments, e.g., (logand x x) = x, and base cases, e.g., (expt x 0) =
(if (equal x 0) 0 1), (expt x 1) = (fix x), and (expt x -1) = (/ x).
3. Once one is comfortable with the use of rewrite rules, one should
next explore the use of type-prescription rules and linear lemmas.
Type-prescription rules can be a bit tricky to get used to, but it is
well worth the effort.
One common problem newer users encounter is with the way hypotheses of
type-prescription rules are relieved --- type reasoning alone must be
sufficient. The primitive types in ACL2 are:
*TS-ZERO* ;;; {0}
*TS-POSITIVE-INTEGER* ;;; positive integers
*TS-POSITIVE-RATIO* ;;; positive non-integer rationals
*TS-NEGATIVE-INTEGER* ;;; negative integers
*TS-NEGATIVE-RATIO* ;;; negative non-integer rationals
*TS-COMPLEX-RATIONAL* ;;; complex rationals
*TS-NIL* ;;; {nil}
*TS-T* ;;; {t}
*TS-NON-T-NON-NIL-SYMBOL* ;;; symbols other than nil, t
*TS-PROPER-CONS* ;;; null-terminated non-empty lists
*TS-IMPROPER-CONS* ;;; conses that are not proper
*TS-STRING* ;;; strings
*TS-CHARACTER* ;;; characters.
If ones hypotheses deal only with these types (or there union such as
integerp or true-listp) one is usually OK. Otherwise, the necessary
facts must be present explicitly in the hypotheses of the goal
being proved, or else themselves be relievable by other
type-prescription rules. In particular, rewriting is not used.
(For example, the presence of (< 4 x) as a hypothesis is not
sufficient for type-reasoning to establish (< 3 x).)
4. False theorems are surprisingly hard to prove. Even the most
experienced ACL2 user will regularly write a false theorem. When a
proof just will not go through, examine the output of the failed proof
attempt with an eye to test cases which may reveal a problem.
|