% binomial.tex %
% The Binomial Theorem proven in HOL %
\setlength{\unitlength}{1mm}
% \vskip10mm
% \begin{center}\LARGE\bf
% The Binomial Theorem proven in HOL.
% \end{center}
% \vskip10mm
\def\obj#1{\mbox{\tt#1}}
% ---------------------------------------------------------------------
% Parameters customising the document; set by whoever \input's it
% ---------------------------------------------------------------------
% \self{} is one word denoting this document, such as "article" or "chapter"
% \path{} is the path denoting the case-study directory
%
% Typical examples:
% \def\self{article}
% \def\path{\verb%Training/studies/binomial%}
% ---------------------------------------------------------------------
\section{Pascal's Triangle and the Binomial Theorem}
Pascal's Triangle\footnote{
According to Knuth \cite{knuth73}, the triangle gets its name from
its appearance in Blaise Pascal's {\em Trait\'e du triangle arithm\'etique}
in 1653, although the binomial coefficients were known long before then.
The earliest surviving description is from the tenth century,
in Hal\={a}yudha's commentary on the Hindu classic, the Chandah-S\^{u}tra.
The Binomial Theorem itself was first reported in 1676 by Isaac Newton.
}
is the infinite triangle of numbers which starts like this:
\[\begin{array}{ccccccc}
1 \\
1 & 1 \\
1 & 2 & 1 \\
1 & 3 & 3 & 1 \\
1 & 4 & 6 & 4 & 1 \\
1 & 5 & 10 & 10 & 5 & 1 \\
\cdot & \cdot & \cdot & \cdot & \cdot & \cdot & \cdot \\
\end{array}\]
The construction is as follows. The numbers at the edges
are $1$s. Each number in the interior is the sum of the number immediately
above it and the number to its left in the previous row.
The numbers in Pascal's Triangle are called Binomial Coefficients.
The number in the $n$th row and $k$th column (where $0 \leq k \leq n$, and the
topmost row and the leftmost column are numbered $0$) is written
$n \choose k$ and pronounced ``$n$ choose $k$''. The coefficients are
pronounced this way because $n \choose k$ turns out to be the number of
ways to choose $k$ things out of $n$ things, but that is another story
(see for instance Knuth's book \cite{knuth73}).
A simple form of the Binomial Theorem \cite{maclane67} \cite{mostow63}
is the following equation, where $a$ and $b$ are integers and $n$ is any
natural number,
\[
(a + b)^n = \sum_{k=0}^n {n \choose k} a^k b^{n-k}
\]
The rest of this \self{} describes how the Binomial Theorem can be proven
in \HOL{}. In fact, a more general theorem is proven, where $a$ and $b$
are elements of an arbitrary commutative ring, but the form displayed here
can be derived from the general result.
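As a quick sanity check of the simple integer form stated above (a Python sketch outside \HOL{}; the function name \verb@binomial_sum@ is ours):

```python
from math import comb  # comb(n, k) is the binomial coefficient C(n, k)

def binomial_sum(a, b, n):
    """Right-hand side: sum_{k=0}^{n} C(n,k) * a^k * b^(n-k)."""
    return sum(comb(n, k) * a**k * b**(n - k) for k in range(n + 1))

# The theorem for a few integer instances:
for a, b, n in [(3, -2, 5), (1, 1, 10), (7, 4, 6)]:
    assert (a + b)**n == binomial_sum(a, b, n)
```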
The motivation for the proof of the Binomial Theorem in \HOL{} is
tutorial.
% ---to be a medium sized worked example whose subject matter is
% more widely known than any specific piece of hardware or software.
The proof illustrates specific ways in \HOL{} to represent mathematical
notations and manage theorems proven about general structures such as
monoids, groups and rings. The following files accompany this \self{} in
directory \path{}:
\begin{description}
\item[{\tt mk\_BINOMIAL.ml}]
The \ML{} program containing the definitions, lemmas and theorems\\
needed to prove the Binomial Theorem in \HOL{}.
\item
[{\tt BINOMIAL.th}]
The theory file which is generated by \verb@mk_BINOMIAL.ml@.
\item
[{\tt BINOMIAL.ml}]
An \ML{} program which loads the theory \verb@BINOMIAL@ and declares
a few derived rules and one tactic used in the proof.
The Binomial Theorem is bound to the \ML{} identifier \verb@BINOMIAL@.
\end{description}
% The aim of this \self{} is to introduce the small amount of algebra and
% mathematical notation needed to state and prove the Binomial Theorem, show
% how this is rendered in \HOL{}, and outline the structure of the proof
% contained in \verb@mk_BINOMIAL.ml@.
The \self{} is meant to be intelligible
without examination of any of the accompanying files.
To avoid clutter not every detail is spelt out.
Often definitions and theorems are indicated
in the form displayed by \HOL{}, rather than as \ML{} source text. Specific
details of the \ML{} definitions or tactics can be found in the file
\verb@mk_BINOMIAL.ml@.
The way to work with this case study is first to study this \self{} to see
the structure of the proof, and then to work interactively with \HOL{}.
Start \HOL{} and say:
\begin{session}
\begin{verbatim}
#loadt `BINOMIAL`;;
...
File BINOMIAL loaded
() : void
#BINOMIAL;;
|- !plus times.
RING(plus,times) ==>
(!a b n.
POWER times n(plus a b) =
SIGMA(plus,0,SUC n)(BINTERM plus times a b n))
\end{verbatim}
\end{session}
(This is the first display of an interaction with \HOL{}. The difference
between lines input to and lines output from \HOL{} is that only the former
are preceded by the \HOL{} prompt ``\verb@#@''. A line of ellipsis
``\verb@...@'' stands for output from \HOL{} which has been omitted from the
display.)
The first input line in the display above loads in all the definitions,
theorems, derived rules and the tactic defined for the proof. The second
input line asks \HOL{} to print theorem \verb@BINOMIAL@. The constants
used in \verb@BINOMIAL@ are explained later in the \self{}. The \HOL{}
session is now set up either to make use of \verb@BINOMIAL@ by specialising
it to a specific ring, or to study \verb@mk_BINOMIAL.ml@ by using the subgoal
package to work through proofs contained in it.
The remainder of the \self{} follows the plan:
\begin{description}
\item[Section~\ref{BinomialCoefficients}]
shows how to define the number $n \choose k$ in \HOL{} as the term
$\obj{CHOOSE}\,n\,k$, where $\obj{CHOOSE}$ is a function defined by
primitive recursion.
\item[Section~\ref{MonoidsGroupsRings}]
shows how to define elementary algebraic concepts like associativity,
commutativity, monoid, group and ring in \HOL{}, and how to apply
theorems proved in the general case to particular situations.
\item[Section~\ref{PowersReductionsRangesSums}]
shows how to define the iterated operations of raising a term to a
power and reducing (summing) a list of terms. These operations are
part of the everyday mathematical language used to state the Binomial
Theorem, but they are not built into \HOL{} and so need to be defined.
This section and the previous two together describe enough definitions
in \HOL{} to state the Binomial Theorem.
\item[Section~\ref{BinomialTheorem}]
shows how the Binomial Theorem is proven in \HOL{}.
The proof is by induction, and depends on three main lemmas.
\item[Section~\ref{BinomialTheoremForIntegers}]
outlines how to apply the general theorem to a particular case:
the ring of integers.
\item[Appendix~\ref{PrincipalLemmas}]
states the principal lemmas used to prove the theorem.
Some of these are {\em ad hoc} but others could be reused elsewhere.
\end{description}
% ----------------------------------------------------------------------------
\section{The Binomial Coefficients}
\label{BinomialCoefficients}
The definition of the binomial coefficients as the numbers in Pascal's Triangle
is formalised in three equations:
\begin{eqnarray*}
n \choose 0 &=& 1 \\
0 \choose {k+1} &=& 0 \\
{n+1} \choose {k+1} &=& {{n} \choose {k+1}} + {{n} \choose {k}}
\end{eqnarray*}
(These actually define $n \choose k$ to be $0$ if $k>n$, but this is
consistent with Pascal's Triangle: think of the spaces to the right of
the triangle as filled with $0$'s.) The desire is to define a constant
\verb@CHOOSE@ in \HOL{} such that for all numbers $n$, $k$ in the type
{\tt num}, $\verb@CHOOSE@\,n\,k$ denotes the number $n \choose k$.
Unfortunately these three equations cannot be used directly as the definition
of \verb@CHOOSE@ in \HOL{} because they are not in the form of a base case
(when $n=0$) together with an inductive case (when $n>0$).
To work towards the definition, consider a term that specifies a base case
and an inductive case intended to be equivalent to the three equations above:
\begin{session}
\begin{verbatim}
let base_and_inductive: term =
"(CHOOSE 0 k = ((0 = k) => 1 | 0)) /\
(CHOOSE (SUC n) k =
((0 = k) => 1 | (CHOOSE n (k-1)) + (CHOOSE n k)))";;
\end{verbatim}
\end{session}
The theory \verb@prim_rec@ contains a primitive recursion theorem which
says that any base case and inductive case together identify a unique
function, \verb@fn@:
\begin{session}
\begin{verbatim}
#num_Axiom;;
|- !e f. ?! fn.
(fn 0 = e) /\
(!n. fn (SUC n) = f (fn n) n)
\end{verbatim}
\end{session}
Given the theorem \verb@num_Axiom@, the built-in function
\verb@prove_rec_fn_exists@ can prove that there is a function that satisfies
the property specified by the term \verb@base_and_inductive@:
\begin{session}
\begin{verbatim}
#let RAW_CHOOSE_DEF = prove_rec_fn_exists num_Axiom base_and_inductive;;
RAW_CHOOSE_DEF =
|- ?CHOOSE.
(!k. CHOOSE 0 k = ((0 = k) => 1 | 0)) /\
(!n k.
CHOOSE(SUC n)k = ((0 = k) => 1 | (CHOOSE n(k - 1)) + (CHOOSE n k)))
\end{verbatim}
\end{session}
These equations would not do as the only definition of \verb@CHOOSE@ due
to the presence of the conditional operator, which makes rewriting
difficult. But having obtained the theorem \verb@RAW_CHOOSE_DEF@ it is
easy to prove there is a function which satisfies the original three
equations:
\begin{session}
\begin{verbatim}
CHOOSE_DEF_LEMMA =
|- ?CHOOSE.
(!n. CHOOSE n 0 = 1) /\
(!k. CHOOSE 0(SUC k) = 0) /\
(!n k. CHOOSE(SUC n)(SUC k) = (CHOOSE n(SUC k)) + (CHOOSE n k))
\end{verbatim}
\end{session}
Now the constant \verb@CHOOSE@ can be specified as originally desired:
\begin{session}
\begin{verbatim}
#let CHOOSE_DEF =
# new_specification `CHOOSE_DEF` [`constant`,`CHOOSE`] CHOOSE_DEF_LEMMA;;
CHOOSE_DEF =
|- (!n. CHOOSE n 0 = 1) /\
(!k. CHOOSE 0(SUC k) = 0) /\
(!n k. CHOOSE(SUC n)(SUC k) = (CHOOSE n(SUC k)) + (CHOOSE n k))
\end{verbatim}
\end{session}
Two elementary properties of the binomial coefficients can now be proven
by induction:
\begin{session}
\begin{verbatim}
CHOOSE_LESS = |- !n k. n < k ==> (CHOOSE n k = 0)
CHOOSE_SAME = |- !n. CHOOSE n n = 1
\end{verbatim}
\end{session}
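The three defining equations translate directly into a recursive function. The following Python sketch (outside \HOL{}; the name \verb@choose@ is ours) spot-checks \verb@CHOOSE_SAME@ and \verb@CHOOSE_LESS@ on small arguments:

```python
def choose(n, k):
    # The three equations of CHOOSE_DEF:
    if k == 0:
        return 1                                   # CHOOSE n 0 = 1
    if n == 0:
        return 0                                   # CHOOSE 0 (SUC k) = 0
    return choose(n - 1, k) + choose(n - 1, k - 1) # Pascal's-Triangle step

# CHOOSE_SAME: CHOOSE n n = 1;  CHOOSE_LESS: n < k ==> CHOOSE n k = 0.
for n in range(11):
    assert choose(n, n) == 1
    for k in range(n + 1, 12):
        assert choose(n, k) == 0
```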
% ----------------------------------------------------------------------------
\section{Monoids, Groups and Commutative Rings}
\label{MonoidsGroupsRings}
%A simple way to define algebraic structures in \HOL{}.
\subsection{Associativity and Commutativity}
An {\em algebraic structure} is a set equipped with some operators that
obey some laws. An elementary example of such a structure is the pair
$(A,+)$, where $A$ is a set and $+$ is a binary operator on $A$ that obeys
the laws of associativity,
\[
a + (b + c) = (a + b) + c
\]
and commutativity,
\[
a + b = b + a
\]
How might these laws be represented in \HOL{}? The simplest scheme is
to deal with structures of the form $(\sigma, +)$, where $\sigma$ is a
type of the \HOL{} logic, and $+$ is a \HOL{} term of type
$\sigma\to\sigma\to\sigma$. Notice that the set in the structure is
represented as a \HOL{} type: see the theory \verb@group@ presented in
Elsa Gunter's case study of Modular Arithmetic for a more sophisticated
approach where the set in a structure can be a subset of a \HOL{} type.
The two laws can be defined as two constants in \HOL{}:
\begin{session}
\begin{verbatim}
#let ASSOCIATIVE =
# new_definition (`ASSOCIATIVE`,
# "!plus: *->*->*. ASSOCIATIVE plus =
# (!a b c. plus a (plus b c) = plus (plus a b) c)");;
...
#let COMMUTATIVE =
# new_definition (`COMMUTATIVE`,
# "!plus: *->*->*. COMMUTATIVE plus =
# (!a b. plus a b = plus b a)");;
\end{verbatim}
\end{session}
By instantiating the type variable \verb@*@ and specialising the variable
\verb@plus@ these constants are applicable to any particular structure
of the form $(\sigma, +: \sigma\to\sigma\to\sigma)$.
A simple example to illustrate this methodology is a permutation theorem
about an arbitrary associative commutative operator (which is called
\verb@PERM@ in file \verb@mk_BINOMIAL.ml@). It is worthwhile going
through the proof in detail because it illustrates how a difficulty in
making use of assumptions like \verb@ASSOCIATIVE plus@ and
\verb@COMMUTATIVE plus@ can be solved.
\begin{session}
\begin{verbatim}
#g "!plus: *->*->*.
# ASSOCIATIVE plus /\ COMMUTATIVE plus ==>
# !a b c. plus (plus a b) c = plus b (plus a c)";;
\end{verbatim}
\end{session}
The first step is to strip off the quantifiers, and break up the implication:
\begin{session}
\begin{verbatim}
#e (REPEAT STRIP_TAC);;
OK..
"plus(plus a b)c = plus b(plus a c)"
[ "ASSOCIATIVE plus" ]
[ "COMMUTATIVE plus" ]
\end{verbatim}
\end{session}
The next step is to rewrite \verb@plus a b@ to \verb@plus b a@ on the
left-hand side, from the assumption \verb@COMMUTATIVE plus@. If the theorem
were about a specific operator, such as \verb@$+: num->num->num@ from the
theory of numbers in \HOL{}, the rewriting could be done with
\verb@REWRITE_TAC@ and the specific commutativity theorem, such as
\verb@ADD_SYM@:
\[
\mbox{\verb@|- !m n. m + n = n + m@}.
\]
But in the general case all that is available is the definition of
\verb@COMMUTATIVE plus@ which cannot be used directly with
\verb@REWRITE_TAC@ to swap the terms \verb@a@ and \verb@b@.
One approach is first to prove the
following equation from the definition \verb@COMMUTATIVE plus@,
\[
\mbox{\verb@COMMUTATIVE plus |- !a b. plus a b = plus b a@},
\]
and then to use the equation with \verb@REWRITE_TAC@ in the same way as one
might use \verb@ADD_SYM@. This is a valid tactic because the
assumption \verb@COMMUTATIVE plus@ is already present in the assumption
list. A little forward proof achieves the first step, the derivation of the
equation:
\begin{session}
\begin{verbatim}
#top_print print_all_thm;;
- : (thm -> void)
#let forwards = fst(EQ_IMP_RULE (SPEC "plus: *->*->*" COMMUTATIVE));;
forwards = |- COMMUTATIVE plus ==> (!a b. plus a b = plus b a)
#let eqn = UNDISCH forwards;;
eqn = COMMUTATIVE plus |- !a b. plus a b = plus b a
\end{verbatim}
\end{session}
This forward proof is a special case of a derived
rule of the logic,
\[
\Gamma\turn \uquant{x} t_1 = t_2
\over
\Gamma, t_1[t'/x] \turn t_2[t'/x]
\]
which can be coded as the \ML{} function \verb@SPEC_EQ@:
\begin{session}
\begin{verbatim}
#let SPEC_EQ v th = UNDISCH(fst(EQ_IMP_RULE (SPEC v th)));;
SPEC_EQ = - : (term -> thm -> thm)
\end{verbatim}
\end{session}
Now the equation \verb@eqn@ can be used to swap the terms
\verb@a@ and \verb@b@:
\begin{session}
\begin{verbatim}
#e(SUBST1_TAC (SPECL ["a:*";"b:*"] eqn));;
OK..
"plus(plus b a)c = plus b(plus a c)"
[ "ASSOCIATIVE plus" ]
[ "COMMUTATIVE plus" ]
\end{verbatim}
\end{session}
Now that the terms appear in both sides in the same order but with different
grouping, associativity can be used to make the grouping on both sides
the same. Again the definition of \verb@ASSOCIATIVE@ cannot be used
directly, but the derived rule \verb@SPEC_EQ@ obtains from the definition
an equation analogous to \verb@eqn@ which can be used:
\begin{session}
\begin{verbatim}
#e(REWRITE_TAC [SPEC_EQ "plus: *->*->*" ASSOCIATIVE]);;
OK..
goal proved
...
|- !plus.
ASSOCIATIVE plus /\ COMMUTATIVE plus ==>
(!a b c. plus(plus a b)c = plus b(plus a c))
\end{verbatim}
\end{session}
The point of this example was to illustrate how to prove theorems about
an arbitrary operator whose behaviour is specified by general predicates
like \verb@ASSOCIATIVE@ and \verb@COMMUTATIVE@. The main difficulty is
in deriving equations for use with rewriting or substitution. The solution
used in the example and extensively in the proof of the Binomial Theorem
is to derive equations using the derived rule \verb@SPEC_EQ@. Another derived
rule, \verb@SPEC_IMP@, is used extensively and is a variant of \verb@SPEC_EQ@:
\[
\Gamma\turn \uquant{x} t_1 \Rightarrow t_2
\over
\Gamma, t_1[t'/x] \turn t_2[t'/x]
\]
\begin{session}
\begin{verbatim}
#let SPEC_IMP v th = UNDISCH(SPEC v th);;
SPEC_IMP = - : (term -> thm -> thm)
\end{verbatim}
\end{session}
The following two rules are versions of \verb@SPEC_EQ@ and \verb@SPEC_IMP@
generalised to take lists of variables rather than just one:
\begin{session}
\begin{verbatim}
SPECL_EQ = - : (term list -> thm -> thm)
SPECL_IMP = - : (term list -> thm -> thm)
\end{verbatim}
\end{session}
The final new derived rule used in the proof of the Binomial Theorem
is \verb@STRIP_SPEC_IMP@, a variant of \verb@SPEC_IMP@, which
splits up a conjunction in the antecedent into separate assumptions:
\[
\Gamma\turn \uquant{x} (t_1 \wedge \cdots \wedge t_n) \Rightarrow t_{n+1}
\over
\Gamma, t_1[t'/x], \ldots, t_n[t'/x] \turn t_{n+1}[t'/x]
\]
\begin{session}
\begin{verbatim}
#let STRIP_SPEC_IMP v th =
# UNDISCH_ALL(SPEC v (CONV_RULE (ONCE_DEPTH_CONV ANTE_CONJ_CONV) th));;
STRIP_SPEC_IMP = - : (term -> thm -> thm)
\end{verbatim}
\end{session}
There are other methods for proving theorems which are conditional on
predicates such as \verb@ASSOCIATIVE@ and \verb@COMMUTATIVE@. One is to
rewrite with the definition before assuming the predicate (see, for instance,
the proof of \verb@RIGHT_CANCELLATION@ in \verb@mk_BINOMIAL.ml@). This
method is fine for small proofs, but in larger proofs it presents the problem
of how to specify a particular assumption on the assumption list. Another
method is to put the predicates on the assumption list, and use resolution
to extract the body of the definition (see the proof of
\verb@UNIQUE_LEFT_INV@). When the assumption list is not short this is
rather a blunt instrument, in the sense that resolution can find many more
theorems than the one desired, and is also rather computationally costly
when compared to carefully constructing a theorem with \verb@SPEC_EQ@ and
its variants.
\subsection{Monoids}
A {\em monoid} is a structure $(M,+)$ where $M$ is a set and $+$ is an
associative binary operator on $M$ such that there is an {\em identity
element}, $u \in M$, which is a left and right identity of $+$, that is,
it satisfies $u+a = a+u = a$ for all $a \in M$. When the operator is written
as $+$, the structure $(M,+)$ is called an {\em additive monoid} and the
identity element is written as $0$; otherwise if the operator is written
as $\times$, the structure $(M,\times)$ is called a {\em multiplicative
monoid} and the identity element is written as $1$.
\begin{session}
\begin{verbatim}
#let RAW_MONOID =
# new_definition (`RAW_MONOID`,
# "!plus: *->*->*. MONOID plus =
# ASSOCIATIVE plus /\
# (?u. (!a. plus a u = a) /\ (!a. plus u a = a))");;
\end{verbatim}
\end{session}
This definition is inconvenient to manipulate as it stands, because of
the presence of the existential quantifier. However it is possible via
Hilbert's $\epsilon$-operator to specify a function \verb@Id@ such that
\verb@Id plus@ stands for an identity element of \verb@plus@, if such
exists. In fact it is easy to prove that the identity element in a monoid
is unique. The first step is to prove that there exists a function \verb@Id@
with the property just described:
\begin{session}
\begin{verbatim}
ID_LEMMA =
|- ?Id.
!plus.
(?u. (!a. plus u a = a) /\ (!a. plus a u = a)) ==>
(!a. plus(Id plus)a = a) /\ (!a. plus a(Id plus) = a)
\end{verbatim}
\end{session}
The proof of \verb@ID_LEMMA@ uses Hilbert's $\epsilon$-operator to
construct a witness for the outer existential quantifier,
\begin{verbatim}
\plus.@u:*.(!a:*. (plus u a = a)) /\ (!a. (plus a u = a))
\end{verbatim}
Given \verb@ID_LEMMA@, the constant \verb@Id@ is specified as follows:
\begin{session}
\begin{verbatim}
#let ID = new_specification `Id` [`constant`, `Id`] ID_LEMMA;;
ID =
|- !plus.
(?u. (!a. plus u a = a) /\ (!a. plus a u = a)) ==>
(!a. plus(Id plus)a = a) /\ (!a. plus a(Id plus) = a)
\end{verbatim}
\end{session}
The condition that \verb@Id plus@ is a left and right identity is expressed
by the predicates \verb@LEFT_ID@ and \verb@RIGHT_ID@, respectively:
\begin{session}
\begin{verbatim}
LEFT_ID = |- !plus. LEFT_ID plus = (!a. plus(Id plus)a = a)
RIGHT_ID = |- !plus. RIGHT_ID plus = (!a. plus a(Id plus) = a)
\end{verbatim}
\end{session}
Given these abbreviations it is easy to prove in \HOL{} that
if a left and right identity element exists, then it is unique:
\begin{session}
\begin{verbatim}
UNIQUE_LEFT_ID =
|- !plus.
LEFT_ID plus /\ RIGHT_ID plus ==>
(!l. (!a. plus l a = a) ==> (Id plus = l))
UNIQUE_RIGHT_ID =
|- !plus.
LEFT_ID plus /\ RIGHT_ID plus ==>
(!r. (!a. plus a r = a) ==> (Id plus = r))
\end{verbatim}
\end{session}
The use of these last two theorems is to help identify the element
\verb@Id plus@ for some specific \verb@plus@, such as \verb@$+@ from the theory
of \verb@arithmetic@. The abbreviations \verb@LEFT_ID@ and \verb@RIGHT_ID@
admit a new characterisation of the predicate \verb@MONOID@, which is easier
to manipulate in \HOL{} than \verb@RAW_MONOID@:
\begin{session}
\begin{verbatim}
MONOID =
|- !plus. MONOID plus = LEFT_ID plus /\ RIGHT_ID plus /\ ASSOCIATIVE plus
\end{verbatim}
\end{session}
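As a concrete illustration of the \verb@MONOID@ laws (a Python sketch outside \HOL{}, checking only a finite sample of values; the names are ours), integer addition with identity $0$ and integer multiplication with identity $1$ both qualify:

```python
def is_monoid(plus, identity, sample):
    """Check LEFT_ID, RIGHT_ID and ASSOCIATIVE on a finite sample."""
    left_id = all(plus(identity, a) == a for a in sample)
    right_id = all(plus(a, identity) == a for a in sample)
    assoc = all(plus(a, plus(b, c)) == plus(plus(a, b), c)
                for a in sample for b in sample for c in sample)
    return left_id and right_id and assoc

sample = [-2, -1, 0, 1, 2, 3]
assert is_monoid(lambda a, b: a + b, 0, sample)  # additive monoid (int, +, 0)
assert is_monoid(lambda a, b: a * b, 1, sample)  # multiplicative monoid (int, *, 1)
```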
\subsection{Groups}
A {\em group} is a monoid in which every element is invertible.
\begin{session}
\begin{verbatim}
RAW_GROUP =
|- !plus.
Group plus =
MONOID plus /\
(!a. ?b. (plus b a = Id plus) /\ (plus a b = Id plus))
\end{verbatim}
\end{session}
An alternative characterisation of \verb@Group@\footnote{
The mixed case identifier \verb@Group@ is used rather than \verb@GROUP@,
because the latter is already defined in Elsa Gunter's theory \verb@group@,
a parent of her theory \verb@integer@, which needs to coexist with the
present theory so that integers can be used as an example ring
(Section~\ref{BinomialTheoremForIntegers}).
} which makes the
existential quantification implicit can be had by specifying a function
\verb@Inv@ so that, if \verb@Group plus@ holds, then the element
\verb@Inv plus a@ is an inverse of \verb@a@. The function \verb@Inv@ is specified
in a way analogous to the specification of \verb@Id@. First a lemma is
proven which says there does exist a function \verb@Inv@ with the property
desired:
\begin{session}
\begin{verbatim}
INV_LEMMA =
|- ?Inv.
!plus.
(!a. ?b. (plus b a = Id plus) /\ (plus a b = Id plus)) ==>
(!a.
(plus(Inv plus a)a = Id plus) /\ (plus a(Inv plus a) = Id plus))
\end{verbatim}
\end{session}
Given this lemma, which is proven by an easy tactic almost the same
as the one for \verb@ID_LEMMA@, the constant \verb@Inv@ can be specified:
\begin{session}
\begin{verbatim}
#let INV =
# new_specification `Inv` [`constant`, `Inv`] INV_LEMMA;;
\end{verbatim}
\end{session}
The predicates which say that \verb@Inv@ produces left and right inverses
are defined as new constants:
\begin{session}
\begin{verbatim}
LEFT_INV =
|- !plus. LEFT_INV plus = (!a. plus(Inv plus a)a = Id plus)
RIGHT_INV =
|- !plus. RIGHT_INV plus = (!a. plus a(Inv plus a) = Id plus)
\end{verbatim}
\end{session}
Finally, the alternative characterisation of \verb@Group@ can be proven:
\begin{session}
\begin{verbatim}
GROUP = |- Group plus = MONOID plus /\ LEFT_INV plus /\ RIGHT_INV plus
\end{verbatim}
\end{session}
The theory of groups has been developed only as far as needed to prove
the Binomial Theorem; Appendix~\ref{PrincipalLemmas} shows the one lemma,
\verb@RIGHT_CANCELLATION@, specifically about groups.
\subsection{Rings}
A {\em ring} is a structure $(R,+,\times)$ such that $(R,+)$
is a group, $(R,\times)$ is a monoid, {\em addition}, $+$, is
commutative, and {\em multiplication}, $\times$, distributes (on both sides)
over addition. A {\em commutative ring} is a ring in which multiplication
is commutative. The word ring is used in the remainder of this \self{}
and in the \HOL{} code to mean a commutative ring:
\begin{session}
\begin{verbatim}
RING =
|- !plus times.
RING(plus,times) =
Group plus /\
COMMUTATIVE plus /\
MONOID times /\
COMMUTATIVE times /\
LEFT_DISTRIB(plus,times) /\
RIGHT_DISTRIB(plus,times)
\end{verbatim}
\end{session}
\verb@LEFT_DISTRIB@ and \verb@RIGHT_DISTRIB@ are new constants
defined by the theorems:
\begin{session}
\begin{verbatim}
LEFT_DISTRIB =
|- !plus times.
LEFT_DISTRIB(plus,times) =
(!a b c. times a(plus b c) = plus(times a b)(times a c))
RIGHT_DISTRIB =
|- !plus times.
RIGHT_DISTRIB(plus,times) =
(!a b c. times(plus a b)c = plus(times a c)(times b c))
\end{verbatim}
\end{session}
Notice that the definition of \verb@RING@ abbreviates a conjunction of
predicates, some of which are atomic, in the sense that they are some basic
property of \verb@plus@ or \verb@times@ or both, and some are compound,
in the sense that they abbreviate a further conjunction of properties.
The definition is a tree of predicates, with atomic ones such as
\verb@LEFT_DISTRIB@ and \verb@ASSOCIATIVE@ at the leaves, and compound
ones such as \verb@Group@ and \verb@MONOID@ at the branches. Many theorems
conditional on \verb@RING(plus,times)@ need to make use of both atomic
and compound predicates contained in the tree. To make access to all these
predicates easy, the following lemma is proven, which makes explicit all
the predicates in the tree:
\begin{session}
\begin{verbatim}
RING_LEMMA =
|- RING(plus,times) ==>
RING(plus,times) /\
Group plus /\
MONOID plus /\
LEFT_ID plus /\
RIGHT_ID plus /\
ASSOCIATIVE plus /\
LEFT_INV plus /\
RIGHT_INV plus /\
COMMUTATIVE plus /\
MONOID times /\
LEFT_ID times /\
RIGHT_ID times /\
ASSOCIATIVE times /\
COMMUTATIVE times /\
LEFT_DISTRIB(plus,times) /\
RIGHT_DISTRIB(plus,times)
\end{verbatim}
\end{session}
Goals of the form,
\[
\mbox{\verb@!plus times. RING(plus,times) ==> ...@}
\]
are consistently tackled with the following tactic, which uses \verb@RING_LEMMA@
to add all the atomic and compound predicates in the tree to the assumption
list:
\begin{session}
\begin{verbatim}
#let RING_TAC =
# GEN_TAC THEN GEN_TAC THEN
# DISCH_THEN (\th. STRIP_ASSUME_TAC (MP RING_LEMMA th));;
\end{verbatim}
\end{session}
The one lemma about rings needed here, \verb@RING_0@, is stated in
Appendix~\ref{PrincipalLemmas}.
% ----------------------------------------------------------------------------
\section{Powers, Reductions, Ranges and Sums}
\label{PowersReductionsRangesSums}
%Translation of mathematical definitions such as sums of series into HOL
The aim of this section is to explain how the mathematical notation
$\sum_{k=0}^n {n \choose k} a^k b^{n-k}$ can be rendered in \HOL{}.
If this is written out more explicitly than necessary for human
consumption it looks like this:
\[
\sum_{k=0}^n {n \choose k} \cdot (a^k \times b^{n-k})
\]
where $\times$ is multiplication in the ring, and where exponentiation
and $\cdot$ stand for the scalar powers of multiplication and addition,
respectively. Since binomial coefficients were treated in
Section~\ref{BinomialCoefficients}, what remain to be defined in this section
are scalar powers and the $\Sigma$-notation. The latter is tackled in
three steps: reduction of a list of monoid elements; generation of lists
comprising subranges of natural numbers, $[n,n+1,\ldots,n+k-1]$; and finally
the reduction of a list of ring elements indexed by a subrange of the natural
numbers.
\subsection{Scalar powers in a monoid}
If $(M,+)$ and $(M,\times)$ are additive and multiplicative
monoids respectively, the scalar powers of addition and multiplication are
defined informally by the following equations, where $n>0$:
\begin{eqnarray*}
n \cdot a &=& \overbrace{a + \cdots + a}^{n} \\
a^n &=& \overbrace{a \times \cdots \times a}^{n}
\end{eqnarray*}
When $n=0$, $n \cdot a = 0$ and $a^n = 1$.
Scalar power is defined in \HOL{} via the constant \verb@POWER@ which
is defined by primitive recursion:
\begin{session}
\begin{verbatim}
POWER =
|- (!plus a. POWER plus 0 a = Id plus) /\
(!plus n a. POWER plus(SUC n)a = plus a(POWER plus n a))
\end{verbatim}
\end{session}
Three lemmas about \verb@POWER@ are stated in Appendix~\ref{PrincipalLemmas}.
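The recursion can be mirrored outside \HOL{} (a Python sketch; unlike \HOL{}'s \verb@POWER@, which obtains the identity as \verb@Id plus@, this version takes the identity as an explicit argument):

```python
def power(plus, identity, n, a):
    """POWER plus 0 a = Id plus;
    POWER plus (SUC n) a = plus a (POWER plus n a)."""
    return identity if n == 0 else plus(a, power(plus, identity, n - 1, a))

assert power(lambda x, y: x + y, 0, 4, 3) == 12  # scalar multiple: 4 . 3 = 12
assert power(lambda x, y: x * y, 1, 4, 3) == 81  # exponentiation:  3^4 = 81
assert power(lambda x, y: x + y, 0, 0, 3) == 0   # n = 0 gives the identity
```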
\subsection{Reduction in a monoid}
This section makes use of the standard \HOL{} theory of lists, in particular
the constants \verb@NIL@, \verb@CONS@, \verb@APPEND@ and \verb@MAP@,
definition by primitive recursion and proof by induction. The theory of
lists is described in \DESCRIPTION.
If $(M,+)$ is a monoid, then the reduction of a list $[a_1,\ldots,a_n]$,
where each $a_i$ is an element of $M$, is the sum,
\[
a_1 + \cdots + a_n,
\]
or the monoid's identity element if $n=0$.
Reduction is represented in \HOL{} by the constant \verb@REDUCE@,
with type,
\begin{verbatim}
REDUCE : (*->*->*) -> (*)list -> *
\end{verbatim}
and which is defined by primitive recursion on its second argument, the list:
\begin{session}
\begin{verbatim}
#let REDUCE =
# new_list_rec_definition (`REDUCE`,
# "(!plus. (REDUCE plus NIL) = (Id plus):*) /\
# (!plus (a:*) as. REDUCE plus (CONS a as) = plus a (REDUCE plus as))");;
REDUCE =
|- (!plus. REDUCE plus[] = Id plus) /\
(!plus a as. REDUCE plus(CONS a as) = plus a(REDUCE plus as))
\end{verbatim}
\end{session}
Three lemmas about \verb@REDUCE@ are stated in Appendix~\ref{PrincipalLemmas}.
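Again for illustration, a sketch of \verb@REDUCE@ in OCaml (with the identity
passed explicitly, as before); the definition is exactly a right fold over
the list.

```ocaml
(* Sketch of REDUCE: reduction of a list in a monoid, with the identity
   supplied explicitly.  Equivalent to List.fold_right op l id. *)
let rec reduce op id = function
  | [] -> id                               (* REDUCE plus [] = Id plus *)
  | a :: rest -> op a (reduce op id rest)  (* REDUCE plus (CONS a as) = ... *)
```

Thus \verb@reduce (+) 0 [1; 2; 3; 4]@ sums the list, and reducing the empty
list yields the identity.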
\subsection{Subranges of the natural numbers}
The constant \verb@RANGE@, which is a curried function of two numbers,
is defined by primitive recursion so that,
\[
\verb@RANGE@\,m\,n = [m, m+1, \ldots, m+n-1]
\]
This definition means that the subrange is the first $n$ numbers starting
at $m$. In particular, when $n=0$ the list is empty.
Appendix~\ref{PrincipalLemmas} states two simple properties of \verb@RANGE@.

\verb@RANGE@ generates a subrange given its least element and length;
an alternative method is to generate the subrange given its two endpoints,
via a constant \verb@INTERVAL@ which satisfies the equation:
\[
\verb@INTERVAL@\,m\,n =
\left\{\begin{array}{ll}
[m, m+1, \ldots, n] & \mbox{if $m \leq n$} \\
{}[] & \mbox{otherwise}
\end{array}\right.
\]
For comparison, the two methods are definable in \HOL{} as follows:
\begin{session}
\begin{verbatim}
#let RANGE =
# new_recursive_definition false num_Axiom `RANGE`
# "(RANGE m 0 = NIL) /\
# (RANGE m (SUC n) = CONS m (RANGE (SUC m) n))";;
#
#let INTERVAL =
# new_recursive_definition false num_Axiom `INTERVAL`
# "(INTERVAL m 0 =
# (m > 0) => NIL | [0]) /\
# (INTERVAL m (SUC n) =
# (m > (SUC n)) => NIL | APPEND (INTERVAL m n) ([SUC n]))";;
\end{verbatim}
\end{session}
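The two recursion patterns can also be compared in an OCaml sketch. As in
the \HOL{} definitions above, \verb@INTERVAL@ recurses on its second
argument and appends at the right, which is part of what makes it more
awkward to manipulate than \verb@RANGE@.

```ocaml
(* RANGE m n: the first n numbers starting at m *)
let rec range m n =
  if n = 0 then []
  else m :: range (m + 1) (n - 1)

(* INTERVAL m n: [m; ...; n] if m <= n, otherwise []; note the recursion
   on the second argument and the append at the right *)
let rec interval m n =
  if n = 0 then (if m > 0 then [] else [0])
  else if m > n then []
  else interval m (n - 1) @ [n]
```

For example, \verb@range 3 4@ and \verb@interval 3 6@ both produce
\verb@[3; 4; 5; 6]@, while \verb@range 5 0@ and \verb@interval 5 2@ are
both empty.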
\subsection{$\Sigma$-notation}
The standard $\Sigma$-notation is defined informally by the equation,
\[
\sum_{i=m}^n a_i =
\left\{\begin{array}{ll}
a_m + a_{m+1} + \cdots + a_n & \mbox{if $m \leq n$} \\
0 & \mbox{otherwise}
\end{array}\right.
\]
Therefore the standard $\Sigma$-notation is naturally definable via
a subrange generated by \verb@INTERVAL@.
An alternative notation, naturally definable via \verb@RANGE@, can be
defined informally as follows:
\[
\sum_{i=m}^{(n)} a_i =
\left\{\begin{array}{ll}
a_m + a_{m+1} + \cdots + a_{m+n-1} & \mbox{if $n > 0$} \\
0 & \mbox{otherwise}
\end{array}\right.
\]
The brackets around the $n$ above the $\Sigma$ distinguish this
notation from the standard one.
This second notation is used in the \HOL{} proof of the Binomial Theorem.
The only difference between the two is that the standard notation is
definable via \verb@INTERVAL@, whereas the second is definable via
\verb@RANGE@. The second notation was chosen entirely because the constant
\verb@RANGE@ is easier to manipulate in \HOL{} than \verb@INTERVAL@, the
latter having a more complex definition. The disadvantage of this choice is
that the proof of the theorem uses a non-standard notation, but the
difference is purely one of notation: if desired, the Binomial Theorem
stated in the standard notation could be deduced trivially from its
non-standard statement.
The non-standard $\Sigma$-notation is defined in \HOL{} as the following
abbreviation:
\begin{session}
\begin{verbatim}
#let SIGMA =
# new_definition (`SIGMA`,
# "SIGMA (plus,m,n) f = REDUCE plus (MAP f (RANGE m n)): *");;
\end{verbatim}
\end{session}
Note that an indexed term $a_i$ is represented in \HOL{} as a function
\verb@f@ of type \verb@num -> *@.
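The abbreviation can be mirrored in a self-contained OCaml sketch, with the
role of \verb@REDUCE@ played by \verb@List.fold_right@, the subrange
generated inline, and the monoid identity passed explicitly:

```ocaml
(* Sketch of SIGMA (plus,m,n) f = REDUCE plus (MAP f (RANGE m n)):
   the indexed term a_i is a function f over the indices *)
let sigma op id m n f =
  let rec range m n = if n = 0 then [] else m :: range (m + 1) (n - 1) in
  List.fold_right op (List.map f (range m n)) id
```

For example, \verb@sigma (+) 0 0 5 (fun i -> i * i)@ computes
$0^2 + 1^2 + 2^2 + 3^2 + 4^2 = 30$, and a zero-length sum yields the
identity.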
Appendix~\ref{PrincipalLemmas} states several properties of \verb@SIGMA@;
their proofs make use of properties of \verb@MAP@ from the theory
\verb@list@, and also the lemmas about \verb@REDUCE@ and \verb@RANGE@.
The functions \verb@REDUCE@ and \verb@RANGE@ are needed here only to define
\verb@SIGMA@, and in fact could be omitted if \verb@SIGMA@ were to be defined
directly using primitive recursion. It is worth going to the trouble
of defining \verb@REDUCE@ and \verb@RANGE@, however, because they are likely
to be useful in other situations.
% ----------------------------------------------------------------------------
\section{The Binomial Theorem for a Commutative Ring}
\label{BinomialTheorem}
%Division of a medium sized proof into manageable lemmas.
This section outlines the proof. For full details see the tactics in
\verb@mk_BINOMIAL.ml@.
The statement and proof of the Binomial Theorem use a new \HOL{} constant,
\verb@BINTERM@, to abbreviate indexed terms of the form $\lambda k.\,{n
\choose k} \cdot a^{n-k} \times b^k$. \verb@BINTERM@ is defined as a
curried function as follows:
\begin{session}
\begin{verbatim}
#let BINTERM_DEF =
# new_definition (`BINTERM_DEF`,
# "BINTERM (plus: *->*->*) (times: *->*->*) a b n k =
# POWER plus (CHOOSE n k)
# (times (POWER times (n-k) a) (POWER times k b))");;
\end{verbatim}
\end{session}
This abbreviation lets the Binomial Theorem be stated and proven in \HOL{} as
follows:
\begin{session}
\begin{verbatim}
BINOMIAL =
|- !plus times.
RING(plus,times) ==>
(!a b n.
POWER times n(plus a b) =
SIGMA(plus,0,SUC n)(BINTERM plus times a b n))
\end{verbatim}
\end{session}
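For a concrete reading of the statement, \verb@BINTERM@ can be transcribed
for the ring of integers. In this sketch (not the \HOL{} text) the Pascal
recurrence stands in for the theory's \verb@CHOOSE@, and iterated
multiplication stands in for \verb@POWER times@:

```ocaml
(* C(n,k) via the Pascal recurrence, standing in for CHOOSE *)
let rec choose n k =
  if k = 0 then 1
  else if n = 0 then 0
  else choose (n - 1) (k - 1) + choose (n - 1) k

(* a^n by iterated multiplication, standing in for POWER times *)
let rec pow a n = if n = 0 then 1 else a * pow a (n - 1)

(* BINTERM for the integer ring: the k-th term of the expansion of (a+b)^n *)
let binterm a b n k = choose n k * pow a (n - k) * pow b k
```

For instance, \verb@binterm 2 3 4 2@ is
${4 \choose 2}\,2^2\,3^2 = 6 \times 4 \times 9 = 216$.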
In outline, the proof proceeds as follows. The following equation is
proven by induction on $n$:
\begin{eqnarray*}
(a+b)^n &=& \sum_{k=0}^{(n+1)} {n \choose k} a^{n-k} b^k
\end{eqnarray*}
When $n=0$ both sides reduce to the multiplicative identity element, $1$,
of the ring. For the inductive step, the equation is assumed for $n$, and
the following goal must be proven:
\begin{eqnarray*}
(a+b)^{n+1} &=& \sum_{k=0}^{(n+2)} {{n+1} \choose k} a^{n+1-k} b^k
\end{eqnarray*}
Three lemmas are used in the inductive step:
\[\begin{array}{rcll}
\displaystyle
\sum_{k=1}^{(n)} {{n+1} \choose k} a^{n+1-k} b^k &=&
\displaystyle
a \times \sum_{k=1}^{(n)} {n \choose k} a^{n-k} b^k +
b \times \sum_{k=0}^{(n)} {n \choose k} a^{n-k} b^k
& \mbox{({\tt LEMMA\_1})} \\
\displaystyle
a \times \sum_{k=0}^{(n+1)} {n \choose k} a^{n-k} b^k &=&
\displaystyle
a^{n+1} +
a \times \sum_{k=1}^{(n)} {n \choose k} a^{n-k} b^k
& \mbox{({\tt LEMMA\_2})} \\
\displaystyle
b \times \sum_{k=0}^{(n+1)} {n \choose k} a^{n-k} b^k &=&
\displaystyle
b \times \sum_{k=0}^{(n)} {n \choose k} a^{n-k} b^k
+ b^{n+1}
& \mbox{({\tt LEMMA\_3})}
\end{array}\]
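Over the integer ring, the key lemma can be spot-checked numerically. In the
sketch below (an illustration, not the \HOL{} text), \verb@sigma' m n f@
computes $f(m) + \cdots + f(m+n-1)$, i.e.\ the non-standard
$(n)$-superscript sum, and the lemma splits each ${{n+1} \choose k}$ by
Pascal's rule, factoring $a$ and $b$ out of the two resulting sums:

```ocaml
let rec choose n k =
  if k = 0 then 1 else if n = 0 then 0
  else choose (n - 1) (k - 1) + choose (n - 1) k

let rec pow a n = if n = 0 then 1 else a * pow a (n - 1)

(* sigma' m n f = f m + ... + f (m+n-1): the (n)-superscript sum *)
let rec sigma' m n f = if n = 0 then 0 else f m + sigma' (m + 1) (n - 1) f

(* LEMMA_1: split C(n+1,k) = C(n,k) + C(n,k-1), then factor out a and b *)
let lemma1_holds a b n =
  sigma' 1 n (fun k -> choose (n + 1) k * pow a (n + 1 - k) * pow b k)
  = a * sigma' 1 n (fun k -> choose n k * pow a (n - k) * pow b k)
    + b * sigma' 0 n (fun k -> choose n k * pow a (n - k) * pow b k)
```

Evaluating \verb@lemma1_holds@ at sample values of $a$, $b$ and $n$
returns \verb@true@.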
\verb@LEMMA_1@ is the key to the inductive step. The three stages of the
inductive step are indicated (a), (b) and (c) in \ML{} comments. Step
(a) expands the right-hand side of the goal with \verb@LEMMA_1@:
\begin{eqnarray*}
\mbox{rhs} &=&
a^{n+1} +
a \times \sum_{k=1}^{(n)} {n \choose k} a^{n-k} b^k +
b \times \sum_{k=0}^{(n)} {n \choose k} a^{n-k} b^k +
b^{n+1}
\end{eqnarray*}
Step (b) expands the left-hand side with the induction hypothesis:
\begin{eqnarray*}
\mbox{lhs} &=&
(a+b)\sum_{k=0}^{(n+1)} {n \choose k} a^{n-k} b^k
\end{eqnarray*}
Finally, in step (c) the expansions of the two sides are shown to be the
same, using distributivity of $\times$ over $+$, associativity of $+$,
and \verb@LEMMA_2@ and \verb@LEMMA_3@:
\begin{eqnarray*}
\mbox{lhs} &=&
a \times \sum_{k=0}^{(n+1)} {n \choose k} a^{n-k} b^k
+
b \times \sum_{k=0}^{(n+1)} {n \choose k} a^{n-k} b^k \\
&=&
a^{n+1} +
a \times \sum_{k=1}^{(n)} {n \choose k} a^{n-k} b^k
+
b \times \sum_{k=0}^{(n)} {n \choose k} a^{n-k} b^k
+ b^{n+1} \\
&=&
\mbox{rhs}
\end{eqnarray*}
% ----------------------------------------------------------------------------
\section{The Binomial Theorem for Integers}
\label{BinomialTheoremForIntegers}
The previous sections have dealt with the material in theory \verb@BINOMIAL@,
concerning how to prove the Binomial Theorem for an arbitrary ring. A
natural next step is to produce some specific commutative ring, and prove
a Binomial Theorem for it, as an instance of the general case.
To see how to do so for Elsa Gunter's theory of integers, take a look
at the file {\tt mk\_BINOMIAL\_integer.ml}, contained in directory \path{}.
When loaded into \HOL{}, this \ML{} file proves that the integers are a
ring---by quoting laws contained in the theory \verb@integer@---and then
derives a Binomial Theorem for the integers---by Modus Ponens from the
general theorem \verb@BINOMIAL@.
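The instantiated theorem amounts to the familiar arithmetic identity, which
can be checked directly for sample values. The sketch below uses native
OCaml integers rather than the theory \verb@integer@:

```ocaml
(* Direct check of the binomial expansion over the integers:
   (a+b)^n = sum over k = 0..n of C(n,k) * a^(n-k) * b^k *)
let binomial_holds a b n =
  let rec choose n k =
    if k = 0 then 1 else if n = 0 then 0
    else choose (n - 1) (k - 1) + choose (n - 1) k
  in
  let rec pow a n = if n = 0 then 1 else a * pow a (n - 1) in
  let rec sum k acc =
    if k > n then acc
    else sum (k + 1) (acc + choose n k * pow a (n - k) * pow b k)
  in
  pow (a + b) n = sum 0 0
```

Evaluating \verb@binomial_holds@ at any integers $a$, $b$ and any natural
number $n$ returns \verb@true@, as the general theorem guarantees.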