% binomial.tex %
% The Binomial Theorem proven in HOL %

\setlength{\unitlength}{1mm}
% \vskip10mm
% \begin{center}\LARGE\bf
% The Binomial Theorem proven in HOL.
% \end{center}
% \vskip10mm

\def\obj#1{\mbox{\tt#1}}

% ---------------------------------------------------------------------
% Parameters customising the document; set by whoever \input's it
% ---------------------------------------------------------------------
% \self{} is one word denoting this document, such as "article" or "chapter"
% \path{} is the path denoting the case-study directory
%
% Typical examples:
% \def\self{article}
% \def\path{\verb%Training/studies/binomial%}
% ---------------------------------------------------------------------

\section{Pascal's Triangle and the Binomial Theorem}

Pascal's Triangle\footnote{
According to Knuth \cite{knuth73}, the triangle gets its name from
its appearance in Blaise Pascal's {\em Trait\'e du triangle arithm\'etique}
in 1653, although the binomial coefficients were known long before then.
The earliest surviving description is from the tenth century,
in Hal\={a}yudha's commentary on the Hindu classic, the Chandah-S\^{u}tra.
The Binomial Theorem itself was first reported in 1676 by Isaac Newton.
}
is the infinite triangle of numbers which starts like this:
\[\begin{array}{ccccccc}
1  \\
1 & 1 \\
1 & 2 & 1 \\
1 & 3 & 3 & 1 \\
1 & 4 & 6 & 4 & 1 \\
1 & 5 & 10 & 10 & 5 & 1 \\
\cdot & \cdot & \cdot & \cdot & \cdot & \cdot & \cdot \\
\end{array}\]
The construction is as follows. The numbers at the edges
are $1$s. Each interior number is the sum of the number immediately
above it and the number to the left of that in the row above.
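This construction can be sketched in executable form (a Python illustration only; the HOL development below is written in \ML{}, and the name \verb@pascal_rows@ is invented here):
\begin{verbatim}
def pascal_rows(count):
    """Generate the first `count` rows of Pascal's Triangle.

    In the left-justified layout, each interior entry is the sum of
    the entry directly above it and the entry above and to its left."""
    row = [1]
    for _ in range(count):
        yield row
        # Next row: a 1 at each edge, interior entries are sums of
        # adjacent pairs from the current row.
        row = [1] + [row[i] + row[i + 1] for i in range(len(row) - 1)] + [1]

print(list(pascal_rows(4)))  # [[1], [1, 1], [1, 2, 1], [1, 3, 3, 1]]
\end{verbatim}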

The numbers in Pascal's Triangle are called Binomial Coefficients. 
The number in the $n$th row and $k$th column (where $0 \leq k \leq n$, and the 
topmost row and the leftmost column are numbered $0$) is written 
$n \choose k$ and pronounced ``$n$ choose $k$''.  The coefficients are 
pronounced this way because $n \choose k$ turns out to be the number of 
ways to choose $k$ things out of $n$ things, but that is another story
(see for instance Knuth's book \cite{knuth73}).

A simple form of the Binomial Theorem \cite{maclane67} \cite{mostow63} 
is the following equation, where $a$ and $b$ are integers and $n$ is any 
natural number,
\[
(a + b)^n = \sum_{k=0}^n {n \choose k} a^k b^{n-k}
\]
The rest of this \self{} describes how the Binomial Theorem can be proven 
in \HOL{}.  In fact, a more general theorem is proven, where $a$ and $b$ 
are elements of an arbitrary commutative ring, but the form displayed here 
can be derived from the general result.
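For the integer instance displayed above, the identity is easy to check numerically (an illustrative Python sketch, not part of the HOL development; the name \verb@binomial_rhs@ is invented here):
\begin{verbatim}
from math import comb  # comb(n, k) is the binomial coefficient C(n, k)

def binomial_rhs(a, b, n):
    """Right-hand side of the Binomial Theorem:
    the sum over k of C(n,k) * a^k * b^(n-k)."""
    return sum(comb(n, k) * a**k * b**(n - k) for k in range(n + 1))

# The identity (a + b)^n = binomial_rhs(a, b, n) holds for all
# integers a, b and natural numbers n; spot-check a small grid:
assert all((a + b)**n == binomial_rhs(a, b, n)
           for a in range(-3, 4) for b in range(-3, 4) for n in range(6))
\end{verbatim}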

The motivation for the proof of the Binomial Theorem in \HOL{} is 
tutorial.
% ---to be a medium sized worked example whose subject matter is 
% more widely known than any specific piece of hardware or software.
The proof illustrates specific ways in \HOL{} to represent mathematical 
notations and manage theorems proven about general structures such as 
monoids, groups and rings. The following files accompany this \self{} in 
directory \path{}:
\begin{description}
\item[{\tt mk\_BINOMIAL.ml}]
    The \ML{} program containing the definitions, lemmas and theorems\\
    needed to prove the Binomial Theorem in \HOL{}.
\item
    [{\tt BINOMIAL.th}]
    The theory file which is generated by \verb@mk_BINOMIAL.ml@.
\item
    [{\tt BINOMIAL.ml}]
    An \ML{} program which loads the theory \verb@BINOMIAL@ and declares
    a few derived rules and one tactic used in the proof.
    The Binomial Theorem is bound to the \ML{} identifier \verb@BINOMIAL@.
\end{description}
% The aim of this \self{} is to introduce the small amount of algebra and 
% mathematical notation needed to state and prove the Binomial Theorem, show 
% how this is rendered in \HOL{}, and outline the structure of the proof 
% contained in \verb@mk_BINOMIAL.ml@.
The \self{} is meant to be intelligible 
without examination of any of the accompanying files.
To avoid clutter not every detail is spelt out.
Often definitions and theorems are indicated 
in the form displayed by \HOL{}, rather than as \ML{} source text. Specific 
details of the \ML{} definitions or tactics can be found in the file 
\verb@mk_BINOMIAL.ml@.

The way to work with this case study is first to study this \self{} to see
the structure of the proof, and then to work interactively with \HOL{}.
Start \HOL{} and say:
\begin{session}
\begin{verbatim}
#loadt `BINOMIAL`;;
...
File BINOMIAL loaded
() : void

#BINOMIAL;;
|- !plus times.
    RING(plus,times) ==>
    (!a b n.
      POWER times n(plus a b) =
      SIGMA(plus,0,SUC n)(BINTERM plus times a b n))
\end{verbatim}
\end{session}
(This is the first display of an interaction with \HOL{}.  The difference
between lines input to and lines output from \HOL{} is that only the former
are preceded by the \HOL{} prompt ``\verb@#@''.  A line of ellipsis
``\verb@...@'' stands for output from \HOL{} which has been omitted from the 
display.)

The first input line in the display above loads in all the definitions, 
theorems, derived rules and the tactic defined for the proof.  The second 
input line asks \HOL{} to print theorem \verb@BINOMIAL@.  The constants 
used in \verb@BINOMIAL@ are explained later in the \self{}.  The \HOL{} 
session is now set up either to make use of \verb@BINOMIAL@ by specialising 
it to a specific ring, or to study \verb@mk_BINOMIAL.ml@ by using the subgoal 
package to work through proofs contained in it.

The remainder of the \self{} follows the plan:
\begin{description}
\item[Section~\ref{BinomialCoefficients}]
    shows how to define the number $n \choose k$ in \HOL{} as the term 
    $\obj{CHOOSE}\,n\,k$, where $\obj{CHOOSE}$ is a function defined by 
    primitive recursion.
\item[Section~\ref{MonoidsGroupsRings}]
    shows how to define elementary algebraic concepts like associativity, 
    commutativity, monoid, group and ring in \HOL{}, and how to apply 
    theorems proved in the general case to particular situations.
\item[Section~\ref{PowersReductionsRangesSums}]
    shows how to define the iterated operations of raising a term to a 
    power and reducing (summing) a list of terms. These operations are 
    part of the everyday mathematical language used to state the Binomial 
    Theorem, but they are not built into \HOL{} and so need to be defined.  
    This section and the previous two together describe enough definitions 
    in \HOL{} to state the Binomial Theorem.
\item[Section~\ref{BinomialTheorem}]
    shows how the Binomial Theorem is proven in \HOL{}.
    The proof is by induction, and depends on three main lemmas.
\item[Section~\ref{BinomialTheoremForIntegers}]
    outlines how to apply the general theorem to a particular case:
    the ring of integers.
\item[Appendix~\ref{PrincipalLemmas}]
    states the principal lemmas used to prove the theorem.
    Some of these are {\em ad hoc} but others could be reused elsewhere.
\end{description}

% ----------------------------------------------------------------------------

\section{The Binomial Coefficients}
\label{BinomialCoefficients}

The definition of the binomial coefficients as the numbers in Pascal's Triangle
is formalised in three equations:
\begin{eqnarray*}
n \choose 0 &=& 1 \\
0 \choose {k+1} &=& 0 \\
{n+1} \choose {k+1} &=& {{n} \choose {k+1}} + {{n} \choose {k}}
\end{eqnarray*}
(These actually define $n \choose k$ to be $0$ if $k>n$, but this is 
consistent with Pascal's Triangle: think of the spaces to the right of 
the triangle as filled with $0$'s.) The desire is to define a constant 
\verb@CHOOSE@ in \HOL{} such that for all numbers $n$, $k$ in the type 
{\tt num}, $\verb@CHOOSE@\,n\,k$ denotes the number $n \choose k$.  
Unfortunately these three equations cannot be used directly as the definition 
of \verb@CHOOSE@ in \HOL{} because they are not in the form of a base case 
(when $n=0$) together with an inductive case (when $n>0$).

To work towards the definition, consider a term that specifies a base case 
and an inductive case intended to be equivalent to the three equations above:
\begin{session}
\begin{verbatim}
let base_and_inductive: term =
    "(CHOOSE 0 k = ((0 = k) => 1 | 0)) /\
     (CHOOSE (SUC n) k =
         ((0 = k) => 1 | (CHOOSE n (k-1)) + (CHOOSE n k)))";;
\end{verbatim}
\end{session}
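The intent of \verb@base_and_inductive@ can be transcribed directly (an illustrative Python sketch; HOL's conditional \verb@b => t | e@ becomes Python's \verb@t if b else e@):
\begin{verbatim}
def choose(n, k):
    """CHOOSE in base-and-inductive form: one clause for n = 0 and
    one for n = SUC n', each guarded by the test 0 = k."""
    if n == 0:
        # CHOOSE 0 k = (0 = k) => 1 | 0
        return 1 if k == 0 else 0
    # CHOOSE (SUC n') k = (0 = k) => 1 | CHOOSE n' (k-1) + CHOOSE n' k
    return 1 if k == 0 else choose(n - 1, k - 1) + choose(n - 1, k)

print(choose(5, 2))  # row 5, column 2 of Pascal's Triangle: 10
\end{verbatim}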
The theory \verb@prim_rec@ contains a primitive recursion theorem which 
says that any base case together with any inductive case determines a unique 
function, \verb@fn@:
\begin{session}
\begin{verbatim}
#num_Axiom;;
|- !e f. ?! fn.
    (fn 0 = e) /\
    (!n. fn (SUC n) = f (fn n) n)
\end{verbatim}
\end{session}
Given the theorem \verb@num_Axiom@, the builtin function 
\verb@prove_rec_fn_exists@ can prove that there is a function that satisfies 
the property specified by the term \verb@base_and_inductive@:
\begin{session}
\begin{verbatim}
#let RAW_CHOOSE_DEF = prove_rec_fn_exists num_Axiom base_and_inductive;;
RAW_CHOOSE_DEF = 
|- ?CHOOSE.
    (!k. CHOOSE 0 k = ((0 = k) => 1 | 0)) /\
    (!n k.
      CHOOSE(SUC n)k = ((0 = k) => 1 | (CHOOSE n(k - 1)) + (CHOOSE n k)))
\end{verbatim}
\end{session}
These equations would not do as the only definition of \verb@CHOOSE@ due 
to the presence of the conditional operator, which makes rewriting 
difficult.  But having obtained the theorem \verb@RAW_CHOOSE_DEF@ it is 
easy to prove there is a function which satisfies the original three 
equations:
\begin{session}
\begin{verbatim}
CHOOSE_DEF_LEMMA = 
|- ?CHOOSE.
    (!n. CHOOSE n 0 = 1) /\
    (!k. CHOOSE 0(SUC k) = 0) /\
    (!n k. CHOOSE(SUC n)(SUC k) = (CHOOSE n(SUC k)) + (CHOOSE n k))
\end{verbatim}
\end{session}
Now the constant \verb@CHOOSE@ can be specified as originally desired:
\begin{session}
\begin{verbatim}
#let CHOOSE_DEF =
#    new_specification `CHOOSE_DEF` [`constant`,`CHOOSE`] CHOOSE_DEF_LEMMA;;
CHOOSE_DEF = 
|- (!n. CHOOSE n 0 = 1) /\
   (!k. CHOOSE 0(SUC k) = 0) /\
   (!n k. CHOOSE(SUC n)(SUC k) = (CHOOSE n(SUC k)) + (CHOOSE n k))
\end{verbatim}
\end{session}
Two elementary properties of the binomial coefficients can now be proven
by induction:
\begin{session}
\begin{verbatim}
CHOOSE_LESS = |- !n k. n < k ==> (CHOOSE n k = 0)
CHOOSE_SAME = |- !n. CHOOSE n n = 1
\end{verbatim}
\end{session}
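Both properties are easy to spot-check against the three defining equations (again an illustrative Python transcription, not the HOL proof):
\begin{verbatim}
def choose(n, k):
    """The three defining equations:
    CHOOSE n 0 = 1, CHOOSE 0 (SUC k) = 0, and
    CHOOSE (SUC n) (SUC k) = CHOOSE n (SUC k) + CHOOSE n k."""
    if k == 0:
        return 1
    if n == 0:
        return 0
    return choose(n - 1, k) + choose(n - 1, k - 1)

# CHOOSE_LESS: n < k implies CHOOSE n k = 0
assert all(choose(n, k) == 0 for n in range(8) for k in range(n + 1, 10))
# CHOOSE_SAME: CHOOSE n n = 1
assert all(choose(n, n) == 1 for n in range(8))
\end{verbatim}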

% ----------------------------------------------------------------------------

\section{Monoids, Groups and Commutative Rings}
\label{MonoidsGroupsRings}
%A simple way to define algebraic structures in \HOL{}.

\subsection{Associativity and Commutativity}

An {\em algebraic structure} is a set equipped with some operators that 
obey some laws.  An elementary example of such a structure is the pair 
$(A,+)$, where $A$ is a set and $+$ is a binary operator on $A$ that obeys 
the laws of associativity,
\[
a + (b + c) = (a + b) + c
\]
and commutativity,
\[
a + b = b + a
\]

How might these laws be represented in \HOL{}?  The simplest scheme is 
to deal with structures of the form $(\sigma, +)$, where $\sigma$ is a 
type of the \HOL{} logic, and $+$ is a \HOL{} term of type 
$\sigma\to\sigma\to\sigma$.  Notice that the set in the structure is 
represented as a \HOL{} type: see the theory \verb@group@ presented in 
Elsa Gunter's case study of Modular Arithmetic for a more sophisticated 
approach where the set in a structure can be a subset of a \HOL{} type.

The two laws can be defined as two constants in \HOL{}:
\begin{session}
\begin{verbatim}
#let ASSOCIATIVE =
#    new_definition (`ASSOCIATIVE`,
#      "!plus: *->*->*. ASSOCIATIVE plus =
#          (!a b c. plus a (plus b c) = plus (plus a b) c)");;
...
#let COMMUTATIVE =
#    new_definition (`COMMUTATIVE`,
#      "!plus: *->*->*. COMMUTATIVE plus =
#          (!a b. plus a b = plus b a)");;
\end{verbatim}
\end{session}
By instantiating the type variable \verb@*@ and specialising the variable 
\verb@plus@ these constants are applicable to any particular structure 
of the form $(\sigma, +: \sigma\to\sigma\to\sigma)$.

A simple example to illustrate this methodology is a permutation theorem 
about an arbitrary associative commutative operator (which is called 
\verb@PERM@ in file \verb@mk_BINOMIAL.ml@).  It is worthwhile going 
through the proof in detail because it illustrates how a difficulty in 
making use of assumptions like \verb@ASSOCIATIVE plus@ and \verb@COMMUTATIVE 
plus@ can be solved.
\begin{session}
\begin{verbatim}
#g "!plus: *->*->*.
#       ASSOCIATIVE plus /\ COMMUTATIVE plus ==>
#           !a b c. plus (plus a b) c = plus b (plus a c)";;
\end{verbatim}
\end{session}
The first step is to strip off the quantifiers, and break up the implication:
\begin{session}
\begin{verbatim}
#e (REPEAT STRIP_TAC);;
OK..
"plus(plus a b)c = plus b(plus a c)"
    [ "ASSOCIATIVE plus" ]
    [ "COMMUTATIVE plus" ]
\end{verbatim}
\end{session}
The next step is to rewrite \verb@plus a b@ to \verb@plus b a@ on the 
left-hand side, from the assumption \verb@COMMUTATIVE plus@. If the theorem 
was about a specific operator, such as \verb@$+: num->num->num@ from the 
theory of numbers in \HOL{}, the rewriting could be done with 
\verb@REWRITE_TAC@ and the specific commutativity theorem, such as 
\verb@ADD_SYM@:
\[
\mbox{\verb@|- !m n. m + n = n + m@}.
\]
But in the general case all that is available is the definition of
\verb@COMMUTATIVE plus@ which cannot be used directly with
\verb@REWRITE_TAC@ to swap the terms \verb@a@ and \verb@b@.
One approach is first to prove the
following equation from the definition \verb@COMMUTATIVE plus@,
\[
\mbox{\verb@COMMUTATIVE plus |- !a b. plus a b = plus b a@},
\]
and then to use the equation with \verb@REWRITE_TAC@ in the same way as one
might use \verb@ADD_SYM@.  This is a valid tactic because the
assumption \verb@COMMUTATIVE plus@ is already present in the assumption 
list.  A little forward proof achieves the first step, the derivation of the
equation:
\begin{session}
\begin{verbatim}
#top_print print_all_thm;;
- : (thm -> void)

#let forwards = fst(EQ_IMP_RULE (SPEC "plus: *->*->*" COMMUTATIVE));;
forwards = |- COMMUTATIVE plus ==> (!a b. plus a b = plus b a)

#let eqn = UNDISCH forwards;;
eqn = COMMUTATIVE plus |- !a b. plus a b = plus b a
\end{verbatim}
\end{session}
This forward proof is a special case of a derived
rule of the logic,
\[
\Gamma\turn \uquant{x} t_1 = t_2
\over
\Gamma, t_1[t'/x] \turn t_2[t'/x]
\]
which can be coded as the \ML{} function \verb@SPEC_EQ@:
\begin{session}
\begin{verbatim}
#let SPEC_EQ v th = UNDISCH(fst(EQ_IMP_RULE (SPEC v th)));;
SPEC_EQ = - : (term -> thm -> thm)
\end{verbatim}
\end{session}
Now the equation \verb@eqn@ can be used to swap the terms
\verb@a@ and \verb@b@:
\begin{session}
\begin{verbatim}
#e(SUBST1_TAC (SPECL ["a:*";"b:*"] eqn));;
OK..
"plus(plus b a)c = plus b(plus a c)"
    [ "ASSOCIATIVE plus" ]
    [ "COMMUTATIVE plus" ]
\end{verbatim}
\end{session}
Now that the terms appear on both sides in the same order but with different 
grouping, associativity can be used to make the grouping on both sides 
the same.  Again the definition of \verb@ASSOCIATIVE@ cannot be used
directly, but the derived rule \verb@SPEC_EQ@ obtains from the definition
an equation analogous to \verb@eqn@ which can be used:
\begin{session}
\begin{verbatim}
#e(REWRITE_TAC [SPEC_EQ "plus: *->*->*" ASSOCIATIVE]);;
OK..
goal proved
...
|- !plus.
    ASSOCIATIVE plus /\ COMMUTATIVE plus ==>
    (!a b c. plus(plus a b)c = plus b(plus a c))
\end{verbatim}
\end{session}

The point of this example was to illustrate how to prove theorems about 
an arbitrary operator whose behaviour is specified by general predicates 
like \verb@ASSOCIATIVE@ and \verb@COMMUTATIVE@. The main difficulty is 
in deriving equations for use with rewriting or substitution.  The solution 
used in the example and extensively in the proof of the Binomial Theorem 
is to derive equations using the derived rule \verb@SPEC_EQ@. Another derived 
rule, \verb@SPEC_IMP@, is used extensively and is a variant of \verb@SPEC_EQ@:
\[
\Gamma\turn \uquant{x} t_1 \Rightarrow t_2
\over
\Gamma, t_1[t'/x] \turn t_2[t'/x]
\]
\begin{session}
\begin{verbatim}
#let SPEC_IMP v th = UNDISCH(SPEC v th);;
SPEC_IMP = - : (term -> thm -> thm)
\end{verbatim}
\end{session}
The following two rules are versions of \verb@SPEC_EQ@ and \verb@SPEC_IMP@ 
generalised to take lists of variables rather than just one:
\begin{session}
\begin{verbatim}
SPECL_EQ = - : (term list -> thm -> thm)
SPECL_IMP = - : (term list -> thm -> thm)
\end{verbatim}
\end{session}
The final new derived rule used in the proof of the Binomial Theorem
is \verb@STRIP_SPEC_IMP@, a variant of \verb@SPEC_IMP@, which
splits up a conjunction in the antecedent into separate assumptions:
\[
\Gamma\turn \uquant{x} (t_1 \wedge \cdots \wedge t_n) \Rightarrow t_{n+1}
\over
\Gamma, t_1[t'/x], \ldots, t_n[t'/x] \turn t_{n+1}[t'/x]
\]
\begin{session}
\begin{verbatim}
#let STRIP_SPEC_IMP v th =
#    UNDISCH_ALL(SPEC v (CONV_RULE (ONCE_DEPTH_CONV ANTE_CONJ_CONV) th));;
STRIP_SPEC_IMP = - : (term -> thm -> thm)
\end{verbatim}
\end{session}

There are other methods for proving theorems which are conditional on 
predicates such as \verb@ASSOCIATIVE@ and \verb@COMMUTATIVE@.  One is to 
rewrite with the definition before assuming the predicate (see, for instance, 
the proof of \verb@RIGHT_CANCELLATION@ in \verb@mk_BINOMIAL.ml@). This 
method is fine for small proofs, but in larger proofs it presents the problem 
of how to specify a particular assumption on the assumption list.  Another 
method is to put the predicates on the assumption list, and use resolution 
to extract the body of the definition (see the proof of 
\verb@UNIQUE_LEFT_INV@).  When the assumption list is not short this is 
rather a blunt instrument, in the sense that resolution can find many more 
theorems than the one desired, and is also rather computationally costly 
when compared to carefully constructing a theorem with \verb@SPEC_EQ@ and 
its variants.

\subsection{Monoids}

A {\em monoid} is a structure $(M,+)$, where $M$ is a set and $+$ is an 
associative binary operator on $M$ for which there is an {\em identity 
element}: an element $u \in M$ which is both a left and a right identity of 
$+$, that is, which satisfies $u+a = a+u = a$ for all $a \in M$. When the operator is written 
as $+$, the structure $(M,+)$ is called an {\em additive monoid} and the 
identity element is written as $0$; otherwise if the operator is written 
as $\times$, the structure $(M,\times)$ is called a {\em multiplicative 
monoid} and the identity element is written as $1$.
\begin{session}
\begin{verbatim}
#let RAW_MONOID =
#    new_definition (`RAW_MONOID`,
#      "!plus: *->*->*. MONOID plus =
#          ASSOCIATIVE plus /\
#          (?u. (!a. plus a u = a) /\ (!a. plus u a = a))");;
\end{verbatim}
\end{session}
This definition is inconvenient to manipulate as it stands, because of 
the presence of the existential quantifier.  However it is possible via 
Hilbert's $\epsilon$-operator to specify a function \verb@Id@ such that 
\verb@Id plus@ stands for an identity element of \verb@plus@, if such 
exists.  In fact it is easy to prove that the identity element in a monoid 
is unique.  The first step is to prove that there exists a function \verb@Id@ 
with the property just described:
\begin{session}
\begin{verbatim}
ID_LEMMA = 
|- ?Id.
    !plus.
     (?u. (!a. plus u a = a) /\ (!a. plus a u = a)) ==>
     (!a. plus(Id plus)a = a) /\ (!a. plus a(Id plus) = a)
\end{verbatim}
\end{session}
The proof of \verb@ID_LEMMA@ uses Hilbert's $\epsilon$-operator to
construct a witness for the outer existential quantifier,
\begin{verbatim}
    \plus.@u:*.(!a:*. (plus u a = a)) /\ (!a. (plus a u = a))
\end{verbatim}
Given \verb@ID_LEMMA@, the constant \verb@Id@ is specified as follows:
\begin{session}
\begin{verbatim}
#let ID = new_specification `Id` [`constant`, `Id`] ID_LEMMA;;
ID = 
|- !plus.
    (?u. (!a. plus u a = a) /\ (!a. plus a u = a)) ==>
    (!a. plus(Id plus)a = a) /\ (!a. plus a(Id plus) = a)
\end{verbatim}
\end{session}
The condition that \verb@Id plus@ is a left and right identity is expressed 
by the predicates \verb@LEFT_ID@ and \verb@RIGHT_ID@, respectively:
\begin{session}
\begin{verbatim}
LEFT_ID = |- !plus. LEFT_ID plus = (!a. plus(Id plus)a = a)
RIGHT_ID = |- !plus. RIGHT_ID plus = (!a. plus a(Id plus) = a)
\end{verbatim}
\end{session}
Given these abbreviations it is easy to prove in \HOL{} that
if a left and right identity element exists, then it is unique:
\begin{session}
\begin{verbatim}
UNIQUE_LEFT_ID =
|- !plus.
    LEFT_ID plus /\ RIGHT_ID plus ==>
    (!l. (!a. plus l a = a) ==> (Id plus = l))

UNIQUE_RIGHT_ID =
|- !plus.
    LEFT_ID plus /\ RIGHT_ID plus ==>
    (!r. (!a. plus a r = a) ==> (Id plus = r))
\end{verbatim}
\end{session}
The use of these last two theorems is to help identify the element \verb@Id 
plus@ for some specific \verb@plus@, such as \verb@$+@ from the theory 
of \verb@arithmetic@.  The abbreviations \verb@LEFT_ID@ and \verb@RIGHT_ID@ 
admit a new characterisation of the predicate \verb@MONOID@, which is easier 
to manipulate in \HOL{} than \verb@RAW_MONOID@:
\begin{session}
\begin{verbatim}
MONOID =
|- !plus. MONOID plus = LEFT_ID plus /\ RIGHT_ID plus /\ ASSOCIATIVE plus
\end{verbatim}
\end{session}

\subsection{Groups}

A {\em group} is a monoid in which every element is invertible.
\begin{session}
\begin{verbatim}
RAW_GROUP = 
|- !plus.
    Group plus =
    MONOID plus /\
    (!a. ?b. (plus b a = Id plus) /\ (plus a b = Id plus))
\end{verbatim}
\end{session}
An alternative characterisation of \verb@Group@\footnote{
The mixed case identifier \verb@Group@ is used rather than \verb@GROUP@, 
because the latter is already defined in Elsa Gunter's theory \verb@group@, 
a parent of her theory \verb@integer@, which needs to coexist with the 
present theory so that integers can be used as an example ring 
(Section~\ref{BinomialTheoremForIntegers}).
} which makes the 
existential quantification implicit can be had by specifying a function 
\verb@Inv@ so that, if \verb@Group plus@ holds, then the element \verb@Inv 
plus a@ is an inverse of \verb@a@.  The function \verb@Inv@ is specified 
in a way analogous to the specification of \verb@Id@.  First a lemma is 
proven which says there does exist a function \verb@Inv@ with the property 
desired:
\begin{session}
\begin{verbatim}
INV_LEMMA = 
|- ?Inv.
    !plus.
     (!a. ?b. (plus b a = Id plus) /\ (plus a b = Id plus)) ==>
     (!a.
       (plus(Inv plus a)a = Id plus) /\ (plus a(Inv plus a) = Id plus))
\end{verbatim}
\end{session}
Given this lemma, which is proven by an easy tactic almost the same
as the one for \verb@ID_LEMMA@, the constant \verb@Inv@ can be specified:
\begin{session}
\begin{verbatim}
#let INV =
#    new_specification `Inv` [`constant`, `Inv`] INV_LEMMA;;
\end{verbatim}
\end{session}
The predicates which say that \verb@Inv@ produces left and right inverses
are defined as new constants:
\begin{session}
\begin{verbatim}
LEFT_INV =
|- !plus. LEFT_INV plus = (!a. plus(Inv plus a)a = Id plus)
RIGHT_INV =
|- !plus. RIGHT_INV plus = (!a. plus a(Inv plus a) = Id plus)
\end{verbatim}
\end{session}
Finally, the alternative characterisation of \verb@Group@ can be proven:
\begin{session}
\begin{verbatim}
GROUP = |- Group plus = MONOID plus /\ LEFT_INV plus /\ RIGHT_INV plus
\end{verbatim}
\end{session}
The theory of groups has been developed only as far as needed to prove 
the Binomial Theorem; Appendix~\ref{PrincipalLemmas} shows the one lemma, 
\verb@RIGHT_CANCELLATION@, specifically about groups.

\subsection{Rings}

A {\em ring} is a structure $(R,+,\times)$ such that structure $(R,+)$ 
is a group, structure $(R,\times)$ is a monoid, {\em addition}, $+$, is 
commutative, and {\em multiplication}, $\times$, distributes (on both sides) 
over addition.  A {\em commutative ring} is a ring in which multiplication 
is commutative.  The word ring is used in the remainder of this \self{} 
and in the \HOL{} code to mean a commutative ring:
\begin{session}
\begin{verbatim}
RING = 
|- !plus times.
    RING(plus,times) =
    Group plus /\
    COMMUTATIVE plus /\
    MONOID times /\
    COMMUTATIVE times /\
    LEFT_DISTRIB(plus,times) /\
    RIGHT_DISTRIB(plus,times)
\end{verbatim}
\end{session}
\verb@LEFT_DISTRIB@ and \verb@RIGHT_DISTRIB@ are new constants
defined by the theorems:
\begin{session}
\begin{verbatim}
LEFT_DISTRIB = 
|- !plus times.
    LEFT_DISTRIB(plus,times) =
    (!a b c. times a(plus b c) = plus(times a b)(times a c))

RIGHT_DISTRIB = 
|- !plus times.
    RIGHT_DISTRIB(plus,times) =
    (!a b c. times(plus a b)c = plus(times a c)(times b c))
\end{verbatim}
\end{session}
Notice that the definition of \verb@RING@ abbreviates a conjunction of 
predicates, some of which are atomic, in the sense that they are some basic 
property of \verb@plus@ or \verb@times@ or both, and some are compound, 
in the sense that they abbreviate a further conjunction of properties.  
The definition is a tree of predicates with atomic ones like 
\verb@LEFT_DISTRIB@ and \verb@ASSOCIATIVE@, at the leaves, and compound 
ones like \verb@Group@ and \verb@MONOID@, at the branches.  Many theorems 
conditional on \verb@RING(plus,times)@ need to make use of both atomic 
and compound predicates contained in the tree.  To make access to all these 
predicates easy, the following lemma is proven, which makes explicit all 
the predicates in the tree: 
\begin{session}
\begin{verbatim}
RING_LEMMA = 
|- RING(plus,times) ==>
   RING(plus,times) /\
   Group plus /\
   MONOID plus /\
   LEFT_ID plus /\
   RIGHT_ID plus /\
   ASSOCIATIVE plus /\
   LEFT_INV plus /\
   RIGHT_INV plus /\
   COMMUTATIVE plus /\
   MONOID times /\
   LEFT_ID times /\
   RIGHT_ID times /\
   ASSOCIATIVE times /\
   COMMUTATIVE times /\
   LEFT_DISTRIB(plus,times) /\
   RIGHT_DISTRIB(plus,times)
\end{verbatim}
\end{session}
Goals of the form,
\[
\mbox{\verb@!plus times. RING(plus,times) ==> ...@}
\]
are consistently tackled with the following tactic, which uses \verb@RING_LEMMA@
to add all the atomic and compound predicates in the tree to the assumption 
list:
\begin{session}
\begin{verbatim}
#let RING_TAC =
#    GEN_TAC THEN GEN_TAC THEN
#    DISCH_THEN (\th. STRIP_ASSUME_TAC (MP RING_LEMMA th));;
\end{verbatim}
\end{session}
The one lemma about rings needed here, \verb@RING_0@, is stated in
Appendix~\ref{PrincipalLemmas}.

% ----------------------------------------------------------------------------

\section{Powers, Reductions, Ranges and Sums}
\label{PowersReductionsRangesSums}
%Translation of mathematical definitions such as sums of series into HOL

The aim of this section is to explain how the mathematical notation 
$\sum_{k=0}^n {n \choose k} a^k b^{n-k}$ can be rendered in \HOL{}.
If this is written out more explicitly than necessary for human
consumption it looks like this:
\[
\sum_{k=0}^n {n \choose k} \cdot (a^k \times b^{n-k})
\]
where $\times$ is multiplication in the ring, and where exponentiation 
and $\cdot$ stand for the scalar powers of multiplication and addition, 
respectively.  Since binomial coefficients were treated in 
Section~\ref{BinomialCoefficients}, what remain to be defined in this section 
are scalar powers and the $\Sigma$-notation.  The latter is tackled in 
three steps: reduction of a list of monoid elements; generation of lists 
comprising subranges of natural numbers, $[n,n+1,\ldots,n+k-1]$; and finally 
the reduction of a list of ring elements indexed by a subrange of the natural 
numbers.

\subsection{Scalar powers in a monoid}

If $(M,+)$ and $(M,\times)$ are additive and multiplicative
monoids respectively, the scalar powers of addition and multiplication are
defined informally by the following equations, where $n>0$:
\begin{eqnarray*}
n \cdot a &=& \overbrace{a + \cdots + a}^{n} \\
a^n &=& \overbrace{a \times \cdots \times a}^{n}
\end{eqnarray*}
When $n=0$, $n \cdot a = 0$ and $a^n = 1$.
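A scalar power in this sense is just an $n$-fold application of the operator, starting from the identity element (an illustrative Python sketch; the names \verb@power@, \verb@op@ and \verb@identity@ are invented here and stand for the monoid data):
\begin{verbatim}
def power(op, identity, n, a):
    """n-fold application of a monoid operator:
    power(op, e, 0, a) = e, and
    power(op, e, n+1, a) = op(a, power(op, e, n, a))."""
    result = identity
    for _ in range(n):
        result = op(a, result)
    return result

assert power(lambda x, y: x + y, 0, 4, 3) == 12  # 4 . 3 in (num, +)
assert power(lambda x, y: x * y, 1, 4, 3) == 81  # 3^4 in (num, *)
\end{verbatim}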

Scalar power is represented in \HOL{} by the constant \verb@POWER@,
which is defined by primitive recursion:
\begin{session}
\begin{verbatim}
POWER = 
|- (!plus a. POWER plus 0 a = Id plus) /\
   (!plus n a. POWER plus(SUC n)a = plus a(POWER plus n a))
\end{verbatim}
\end{session}
Three lemmas about \verb@POWER@ are stated in Appendix~\ref{PrincipalLemmas}.
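The recursion scheme behind \verb@POWER@ can be sketched in Python as
follows (a sketch only: the names are illustrative, and the monoid's
identity element is passed explicitly, whereas in \HOL{} it is recovered
from the operation by \verb@Id@):

```python
# Sketch of scalar power in a monoid: n-fold application of `op` to `a`,
# mirroring the primitive-recursive definition of POWER.
# The identity element is an explicit parameter here (illustrative only).
def power(op, identity, n, a):
    if n == 0:
        return identity
    return op(a, power(op, identity, n - 1, a))

# Scalar powers of addition (n . a) and multiplication (a^n) over the integers:
print(power(lambda x, y: x + y, 0, 3, 5))  # 3 . 5 = 15
print(power(lambda x, y: x * y, 1, 3, 5))  # 5^3 = 125
```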

\subsection{Reduction in a monoid}

This section makes use of the standard \HOL{} theory of lists, in particular 
the constants \verb@NIL@, \verb@CONS@, \verb@APPEND@ and \verb@MAP@, 
definition by primitive recursion and proof by induction. The theory of 
lists is described in \DESCRIPTION.

If $(M,+)$ is a monoid, then the reduction of a list $[a_1,\ldots,a_n]$,
where each $a_i$ is an element of $M$, is the sum,
\[
a_1 + \cdots + a_n,
\]
or the monoid's identity element if $n=0$.

Reduction is represented in \HOL{} by the constant \verb@REDUCE@,
with type,
\begin{verbatim}
REDUCE : (*->*->*) -> (*)list -> *
\end{verbatim}
and which is defined by primitive recursion on its second (list) argument:
\begin{session}
\begin{verbatim}
#let REDUCE =
#  new_list_rec_definition (`REDUCE`,
#    "(!plus. (REDUCE plus NIL) = (Id plus):*) /\
#     (!plus (a:*) as. REDUCE plus (CONS a as) = plus a (REDUCE plus as))");;
REDUCE = 
|- (!plus. REDUCE plus[] = Id plus) /\
   (!plus a as. REDUCE plus(CONS a as) = plus a(REDUCE plus as))
\end{verbatim}
\end{session}
Three lemmas about \verb@REDUCE@ are stated in Appendix~\ref{PrincipalLemmas}.
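The behaviour of \verb@REDUCE@ can be sketched in Python (illustrative
names; the identity element is an explicit parameter here rather than
being recovered from the operation by \verb@Id@):

```python
# Sketch of REDUCE: fold the monoid operation over a list by primitive
# recursion, returning the identity element for the empty list.
def reduce_(op, identity, xs):
    if not xs:
        return identity
    return op(xs[0], reduce_(op, identity, xs[1:]))

print(reduce_(lambda x, y: x + y, 0, [1, 2, 3, 4]))  # 1 + 2 + 3 + 4 = 10
```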

\subsection{Subranges of the natural numbers}

The constant \verb@RANGE@, which is a curried function of two numbers,
is defined by primitive recursion so that,
\[
\verb@RANGE@\,m\,n = [m, m+1, \ldots, m+n-1]
\]
This definition means that the subrange is the first $n$ numbers starting 
at $m$.  In particular, when $n=0$ the list is empty.
Appendix~\ref{PrincipalLemmas} states two simple properties of \verb@RANGE@.

\verb@RANGE@ generates a subrange given its least element and length;
an alternative method is to generate the subrange given its two endpoints,
via a constant \verb@INTERVAL@ which satisfies the equation:
\[
\verb@INTERVAL@\,m\,n =
    \left\{\begin{array}{ll}
    [m, m+1, \ldots, n] & \mbox{if $m \leq n$} \\
    {}[] & \mbox{otherwise}
    \end{array}\right.
\]

For comparison, the two methods are definable in \HOL{} as follows:
\begin{session}
\begin{verbatim}
#let RANGE = 
#    new_recursive_definition false num_Axiom `RANGE`
#      "(RANGE m 0 = NIL) /\
#       (RANGE m (SUC n) = CONS m (RANGE (SUC m) n))";;
#
#let INTERVAL =
#    new_recursive_definition false num_Axiom `INTERVAL`
#      "(INTERVAL m 0 =
#          (m > 0) => NIL | [0]) /\
#      (INTERVAL m (SUC n) =
#          (m > (SUC n)) => NIL | APPEND (INTERVAL m n) ([SUC n]))";;
\end{verbatim}
\end{session}
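The two generators can be sketched in Python, making the recursions
explicit (illustrative names; \verb@range_@ avoids clashing with
Python's built-in \verb@range@):

```python
# range_(m, n) = [m, m+1, ..., m+n-1]: the first n numbers starting at m.
def range_(m, n):
    return [] if n == 0 else [m] + range_(m + 1, n - 1)

# interval(m, n) = [m, ..., n] if m <= n, and [] otherwise;
# like INTERVAL, it recurses on the upper endpoint n.
def interval(m, n):
    if m > n:
        return []
    if n == 0:
        return [0]
    return interval(m, n - 1) + [n]

print(range_(3, 4))    # [3, 4, 5, 6]
print(interval(2, 4))  # [2, 3, 4]
```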

\subsection{$\Sigma$-notation}

The standard $\Sigma$-notation is defined informally by the equation,
\[
\sum_{i=m}^n a_i =
    \left\{\begin{array}{ll}
    a_m + a_{m+1} + \cdots + a_n & \mbox{if $m \leq n$} \\
    0 & \mbox{otherwise}
    \end{array}\right.
\]
Therefore the standard $\Sigma$-notation is naturally definable via
a subrange generated by \verb@INTERVAL@.

An alternative notation, naturally definable via \verb@RANGE@, can be
defined informally as follows:
\[
\sum_{i=m}^{(n)} a_i =
    \left\{\begin{array}{ll}
    a_m + a_{m+1} + \cdots + a_{m+n-1} & \mbox{if $n > 0$} \\
    0 & \mbox{otherwise}
    \end{array}\right.
\]
The brackets around the $n$ above the $\Sigma$ distinguish this
notation from the standard one.

This second notation is used in the \HOL{} proof of the Binomial Theorem.
It was chosen entirely because the constant \verb@RANGE@ is easier to
manipulate in \HOL{} than \verb@INTERVAL@, the latter having a more
complex definition.  The disadvantage of this choice is that the proof
of the theorem uses a non-standard notation; but the difference is one
of notation only, and if desired the Binomial Theorem stated in the
standard notation could be trivially deduced from its non-standard
statement.

The non-standard $\Sigma$-notation is defined in \HOL{} as the following
abbreviation:
\begin{session}
\begin{verbatim}
#let SIGMA =
#    new_definition (`SIGMA`,
#      "SIGMA (plus,m,n) f = REDUCE plus (MAP f (RANGE m n)): *");;
\end{verbatim}
\end{session}
Note that an indexed term $a_i$ is represented in \HOL{} as a function
\verb@f@ of type \verb@num -> *@.

Appendix~\ref{PrincipalLemmas} states several properties of \verb@SIGMA@; 
their proofs make use of properties of \verb@MAP@ from the theory 
\verb@list@, and also the lemmas about \verb@REDUCE@ and \verb@RANGE@.
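The abbreviation itself can be sketched in Python by composing the three
pieces, reduction, mapping and range generation (a sketch with
illustrative names; the identity element is passed explicitly):

```python
# Sketch of SIGMA: reduce `plus` over the terms f(m), f(m+1), ..., f(m+n-1),
# i.e. REDUCE applied to MAP f over RANGE m n.
def range_(m, n):
    return [] if n == 0 else [m] + range_(m + 1, n - 1)

def reduce_(op, identity, xs):
    return identity if not xs else op(xs[0], reduce_(op, identity, xs[1:]))

def sigma(plus, identity, m, n, f):
    return reduce_(plus, identity, [f(k) for k in range_(m, n)])

# Sum of five squares, starting at index 1: 1 + 4 + 9 + 16 + 25.
print(sigma(lambda x, y: x + y, 0, 1, 5, lambda k: k * k))  # 55
```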

The functions \verb@REDUCE@ and \verb@RANGE@ are needed here only to define 
\verb@SIGMA@, and in fact could be omitted if \verb@SIGMA@ were to be defined 
directly  using primitive recursion.  It is worth going to the trouble 
of defining \verb@REDUCE@ and \verb@RANGE@, however, because they are likely 
to be useful in other situations. 

% ----------------------------------------------------------------------------

\section{The Binomial Theorem for a Commutative Ring}
\label{BinomialTheorem}
%Division of a medium sized proof into manageable lemmas.

This section outlines the proof.  For full details see the tactics in
\verb@mk_BINOMIAL.ml@.

The statement and proof of the Binomial Theorem use a new \HOL{} constant, 
\verb@BINTERM@, to abbreviate indexed terms of the form $\lambda k.\,{n 
\choose k} \cdot a^{n-k} \times b^k$.  \verb@BINTERM@ is defined as a
curried function as follows:
\begin{session}
\begin{verbatim}
#let BINTERM_DEF =
#    new_definition (`BINTERM_DEF`,
#      "BINTERM (plus: *->*->*) (times: *->*->*) a b n k =
#          POWER plus (CHOOSE n k)
#              (times (POWER times (n-k) a) (POWER times k b))");;
\end{verbatim}
\end{session}
This abbreviation lets the Binomial Theorem be stated and proven in \HOL{} as
follows:
\begin{session}
\begin{verbatim}
BINOMIAL = 
|- !plus times.
    RING(plus,times) ==>
    (!a b n.
      POWER times n(plus a b) =
      SIGMA(plus,0,SUC n)(BINTERM plus times a b n))
\end{verbatim}
\end{session}
In outline, the proof proceeds as follows.  The following equation is
proven by induction on $n$:
\begin{eqnarray*}
(a+b)^n &=& \sum_{k=0}^{(n+1)} {n \choose k} a^{n-k} b^k
\end{eqnarray*}
When $n=0$ both sides reduce to the multiplicative identity element, $1$, 
of the ring. For the inductive step, the equation is assumed for $n$, and 
the following goal must be proven:
\begin{eqnarray*}
(a+b)^{n+1} &=& \sum_{k=0}^{(n+2)} {{n+1} \choose k} a^{n+1-k} b^k
\end{eqnarray*}
Three lemmas are used in the inductive step:
\[\begin{array}{rcll}
\displaystyle
\sum_{k=1}^{(n)} {{n+1} \choose k} a^{n+1-k} b^k &=&
\displaystyle
    a \times \sum_{k=1}^{(n)} {n \choose k} a^{n-k} b^k +
    b \times \sum_{k=0}^{(n)} {n \choose k} a^{n-k} b^k
    & \mbox{({\tt LEMMA\_1})} \\
\displaystyle
a \times \sum_{k=0}^{(n+1)} {n \choose k} a^{n-k} b^k &=&
\displaystyle
    a^{n+1} +
    a \times \sum_{k=1}^{(n)} {n \choose k} a^{n-k} b^k
    & \mbox{({\tt LEMMA\_2})} \\
\displaystyle
b \times \sum_{k=0}^{(n+1)} {n \choose k} a^{n-k} b^k &=&
\displaystyle
    b \times \sum_{k=0}^{(n)} {n \choose k} a^{n-k} b^k
    + b^{n+1}
    & \mbox{({\tt LEMMA\_3})}
\end{array}\]
\verb@LEMMA_1@ is the key to the inductive step. The three stages of the 
inductive step are indicated (a), (b) and (c) in \ML{} comments.  Step 
(a) expands the right-hand side of the goal with \verb@LEMMA_1@:
\begin{eqnarray*}
\mbox{rhs} &=&
    a^{n+1} +
    a \times \sum_{k=1}^{(n)} {n \choose k} a^{n-k} b^k +
    b \times \sum_{k=0}^{(n)} {n \choose k} a^{n-k} b^k +
    b^{n+1}
\end{eqnarray*}
Step (b) expands the left-hand side with the induction hypothesis:
\begin{eqnarray*}
\mbox{lhs} &=&
    (a+b)\sum_{k=0}^{(n+1)} {n \choose k} a^{n-k} b^k
\end{eqnarray*}
Finally, in step (c) the expansions of the two sides are shown to be the 
same, using distributivity of $\times$ over $+$, associativity of $+$,
and \verb@LEMMA_2@ and \verb@LEMMA_3@:
\begin{eqnarray*}
\mbox{lhs} &=&
    a \times \sum_{k=0}^{(n+1)} {n \choose k} a^{n-k} b^k
    +
    b \times \sum_{k=0}^{(n+1)} {n \choose k} a^{n-k} b^k \\
  &=&
    a^{n+1} +
    a \times \sum_{k=1}^{(n)} {n \choose k} a^{n-k} b^k
    +
    b \times \sum_{k=0}^{(n)} {n \choose k} a^{n-k} b^k
    + b^{n+1} \\
  &=&
      \mbox{rhs}
\end{eqnarray*}

% ----------------------------------------------------------------------------

\section{The Binomial Theorem for Integers}
\label{BinomialTheoremForIntegers}

The previous sections have dealt with the material in theory \verb@BINOMIAL@,
concerning how to prove the Binomial Theorem for an arbitrary ring.  A 
natural next step is to produce some specific commutative ring, and prove
a Binomial Theorem for it, as an instance of the general case.

To see how to do so for Elsa Gunter's theory of integers, take a look 
at the file {\tt mk\_BINOMIAL\_integer.ml}, contained in directory \path{}.
When loaded into \HOL{}, this \ML{} file proves that the integers are a 
ring---by quoting laws contained in the theory \verb@integer@---and then 
derives a Binomial Theorem for the integers---by Modus Ponens from the 
general theorem \verb@BINOMIAL@.
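
As an informal numeric check of the instantiated theorem (quite apart
from the formal \HOL{} proof), the equation $(a+b)^n = \sum_{k=0}^n {n
\choose k} a^{n-k} b^k$ can be evaluated over the integers; in the
following Python sketch \verb@choose@ implements Pascal's rule, and all
names are illustrative rather than part of any \HOL{} theory:

```python
# Binomial coefficient via Pascal's rule:
# C(n,0) = 1, C(0,k) = 0 for k > 0, C(n,k) = C(n-1,k-1) + C(n-1,k).
def choose(n, k):
    if k == 0:
        return 1
    if n == 0:
        return 0
    return choose(n - 1, k - 1) + choose(n - 1, k)

# Right-hand side of the Binomial Theorem over the integers:
# the sum of C(n,k) * a^(n-k) * b^k for k = 0..n.
def binomial_sum(a, b, n):
    return sum(choose(n, k) * a ** (n - k) * b ** k for k in range(n + 1))

print(binomial_sum(2, 3, 4), (2 + 3) ** 4)  # both 625
```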