% binomial.tex %
% The Binomial Theorem proven in HOL %

\setlength{\unitlength}{1mm}
% \vskip10mm
% \begin{center}\LARGE\bf
% The Binomial Theorem proven in HOL.
% \end{center}
% \vskip10mm

\def\obj#1{\mbox{\tt#1}}

% ---------------------------------------------------------------------
% Parameters customising the document; set by whoever \input's it
% ---------------------------------------------------------------------
% \self{} is one word denoting this document, such as "article" or "chapter"
% \path{} is the path denoting the case-study directory
%
% Typical examples:
% \def\self{article}
% \def\path{\verb%Training/studies/binomial%}
% ---------------------------------------------------------------------

\section{Pascal's Triangle and the Binomial Theorem}

Pascal's Triangle\footnote{
According to Knuth \cite{knuth73}, the triangle gets its name from
its appearance in Blaise Pascal's {\em Trait\'e du triangle arithm\'etique}
in 1653, although the binomial coefficients were known long before then.
The earliest surviving description is from the tenth century,
in Hal\={a}yudha's commentary on the Hindu classic, the Chandah-S\^{u}tra.
The Binomial Theorem itself was first reported in 1676 by Isaac Newton.
}
is the infinite triangle of numbers which starts like this:
\[\begin{array}{ccccccc}
1  \\
1 & 1 \\
1 & 2 & 1 \\
1 & 3 & 3 & 1 \\
1 & 4 & 6 & 4 & 1 \\
1 & 5 & 10 & 10 & 5 & 1 \\
\cdot & \cdot & \cdot & \cdot & \cdot & \cdot & \cdot \\
\end{array}\]
The construction is as follows. The numbers at the edges
are $1$s. Each interior number is the sum of the number immediately
above it and the number to the left of that in the row above.
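This construction can be sketched in executable form (a Python illustration only; the HOL development below is written in \ML{}, and the name \verb@pascal_rows@ is invented here):
\begin{verbatim}
def pascal_rows(count):
    """Generate the first `count` rows of Pascal's Triangle.

    In the left-justified layout, each interior entry is the sum of
    the entry directly above it and the entry above and to its left."""
    row = [1]
    for _ in range(count):
        yield row
        # Next row: a 1 at each edge, interior entries are sums of
        # adjacent pairs from the current row.
        row = [1] + [row[i] + row[i + 1] for i in range(len(row) - 1)] + [1]

print(list(pascal_rows(4)))  # [[1], [1, 1], [1, 2, 1], [1, 3, 3, 1]]
\end{verbatim}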

The numbers in Pascal's Triangle are called Binomial Coefficients. 
The number in the $n$th row and $k$th column (where $0 \leq k \leq n$, and the 
topmost row and the leftmost column are numbered $0$) is written 
$n \choose k$ and pronounced ``$n$ choose $k$''.  The coefficients are 
pronounced this way because $n \choose k$ turns out to be the number of 
ways to choose $k$ things out of $n$ things, but that is another story
(see for instance Knuth's book \cite{knuth73}).

A simple form of the Binomial Theorem \cite{maclane67} \cite{mostow63} 
is the following equation, where $a$ and $b$ are integers and $n$ is any 
natural number,
\[
(a + b)^n = \sum_{k=0}^n {n \choose k} a^k b^{n-k}
\]
The rest of this \self{} describes how the Binomial Theorem can be proven 
in \HOL{}.  In fact, a more general theorem is proven, where $a$ and $b$ 
are elements of an arbitrary commutative ring, but the form displayed here 
can be derived from the general result.
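For the integer instance displayed above, the identity is easy to check numerically (an illustrative Python sketch, not part of the HOL development; the name \verb@binomial_rhs@ is invented here):
\begin{verbatim}
from math import comb  # comb(n, k) is the binomial coefficient C(n, k)

def binomial_rhs(a, b, n):
    """Right-hand side of the Binomial Theorem:
    the sum over k of C(n,k) * a^k * b^(n-k)."""
    return sum(comb(n, k) * a**k * b**(n - k) for k in range(n + 1))

# The identity (a + b)^n = binomial_rhs(a, b, n) holds for all
# integers a, b and natural numbers n; spot-check a small grid:
assert all((a + b)**n == binomial_rhs(a, b, n)
           for a in range(-3, 4) for b in range(-3, 4) for n in range(6))
\end{verbatim}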

The motivation for the proof of the Binomial Theorem in \HOL{} is 
tutorial.
% ---to be a medium sized worked example whose subject matter is 
% more widely known than any specific piece of hardware or software.
The proof illustrates specific ways in \HOL{} to represent mathematical 
notations and manage theorems proven about general structures such as 
monoids, groups and rings. The following files accompany this \self{} in 
directory \path{}:
\begin{description}
\item[{\tt mk\_BINOMIAL.ml}]
    The \ML{} program containing the definitions, lemmas and theorems\\
    needed to prove the Binomial Theorem in \HOL{}.
\item
    [{\tt BINOMIAL.th}]
    The theory file which is generated by \verb@mk_BINOMIAL.ml@.
\item
    [{\tt BINOMIAL.ml}]
    An \ML{} program which loads the theory \verb@BINOMIAL@ and declares
    a few derived rules and one tactic used in the proof.
    The Binomial Theorem is bound to the \ML{} identifier \verb@BINOMIAL@.
\end{description}
% The aim of this \self{} is to introduce the small amount of algebra and 
% mathematical notation needed to state and prove the Binomial Theorem, show 
% how this is rendered in \HOL{}, and outline the structure of the proof 
% contained in \verb@mk_BINOMIAL.ml@.
The \self{} is meant to be intelligible 
without examination of any of the accompanying files.
To avoid clutter not every detail is spelt out.
Often definitions and theorems are indicated 
in the form displayed by \HOL{}, rather than as \ML{} source text. Specific 
details of the \ML{} definitions or tactics can be found in the file 
\verb@mk_BINOMIAL.ml@.

The way to work with this case study is first to study this \self{} to see
the structure of the proof, and then to work interactively with \HOL{}.
Start \HOL{} and say:
\begin{session}
\begin{verbatim}
#loadt `BINOMIAL`;;
...
File BINOMIAL loaded
() : void

#BINOMIAL;;
|- !plus times.
    RING(plus,times) ==>
    (!a b n.
      POWER times n(plus a b) =
      SIGMA(plus,0,SUC n)(BINTERM plus times a b n))
\end{verbatim}
\end{session}
(This is the first display of an interaction with \HOL{}.  The difference
between lines input to and lines output from \HOL{} is that only the former
are preceded by the \HOL{} prompt ``\verb@#@''.  A line of ellipsis
``\verb@...@'' stands for output from \HOL{} which has been omitted from the 
display.)

The first input line in the display above loads in all the definitions, 
theorems, derived rules and the tactic defined for the proof.  The second 
input line asks \HOL{} to print theorem \verb@BINOMIAL@.  The constants 
used in \verb@BINOMIAL@ are explained later in the \self{}.  The \HOL{} 
session is now set up either to make use of \verb@BINOMIAL@ by specialising 
it to a specific ring, or to study \verb@mk_BINOMIAL.ml@ by using the subgoal 
package to work through proofs contained in it.

The remainder of the \self{} follows the plan:
\begin{description}
\item[Section~\ref{BinomialCoefficients}]
    shows how to define the number $n \choose k$ in \HOL{} as the term 
    $\obj{CHOOSE}\,n\,k$, where $\obj{CHOOSE}$ is a function defined by 
    primitive recursion.
\item[Section~\ref{MonoidsGroupsRings}]
    shows how to define elementary algebraic concepts like associativity, 
    commutativity, monoid, group and ring in \HOL{}, and how to apply 
    theorems proved in the general case to particular situations.
\item[Section~\ref{PowersReductionsRangesSums}]
    shows how to define the iterated operations of raising a term to a 
    power and reducing (summing) a list of terms. These operations are 
    part of the everyday mathematical language used to state the Binomial 
    Theorem, but they are not built into \HOL{} and so need to be defined.  
    This section and the previous two together describe enough definitions 
    in \HOL{} to state the Binomial Theorem.
\item[Section~\ref{BinomialTheorem}]
    shows how the Binomial Theorem is proven in \HOL{}.
    The proof is by induction, and depends on three main lemmas.
\item[Section~\ref{BinomialTheoremForIntegers}]
    outlines how to apply the general theorem to a particular case:
    the ring of integers.
\item[Appendix~\ref{PrincipalLemmas}]
    states the principal lemmas used to prove the theorem.
    Some of these are {\em ad hoc} but others could be reused elsewhere.
\end{description}

% ----------------------------------------------------------------------------

\section{The Binomial Coefficients}
\label{BinomialCoefficients}

The definition of the binomial coefficients as the numbers in Pascal's Triangle
is formalised in three equations:
\begin{eqnarray*}
n \choose 0 &=& 1 \\
0 \choose {k+1} &=& 0 \\
{n+1} \choose {k+1} &=& {{n} \choose {k+1}} + {{n} \choose {k}}
\end{eqnarray*}
(These actually define $n \choose k$ to be $0$ if $k>n$, but this is 
consistent with Pascal's Triangle: think of the spaces to the right of 
the triangle as filled with $0$'s.) The desire is to define a constant 
\verb@CHOOSE@ in \HOL{} such that for all numbers $n$, $k$ in the type 
{\tt num}, $\verb@CHOOSE@\,n\,k$ denotes the number $n \choose k$.  
Unfortunately these three equations cannot be used directly as the definition 
of \verb@CHOOSE@ in \HOL{} because they are not in the form of a base case 
(when $n=0$) together with an inductive case (when $n>0$).

To work towards the definition, consider a term that specifies a base case 
and an inductive case intended to be equivalent to the three equations above:
\begin{session}
\begin{verbatim}
let base_and_inductive: term =
    "(CHOOSE 0 k = ((0 = k) => 1 | 0)) /\
     (CHOOSE (SUC n) k =
         ((0 = k) => 1 | (CHOOSE n (k-1)) + (CHOOSE n k)))";;
\end{verbatim}
\end{session}
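The intent of \verb@base_and_inductive@ can be transcribed directly (an illustrative Python sketch; HOL's conditional \verb@b => t | e@ becomes Python's \verb@t if b else e@):
\begin{verbatim}
def choose(n, k):
    """CHOOSE in base-and-inductive form: one clause for n = 0 and
    one for n = SUC n', each guarded by the test 0 = k."""
    if n == 0:
        # CHOOSE 0 k = (0 = k) => 1 | 0
        return 1 if k == 0 else 0
    # CHOOSE (SUC n') k = (0 = k) => 1 | CHOOSE n' (k-1) + CHOOSE n' k
    return 1 if k == 0 else choose(n - 1, k - 1) + choose(n - 1, k)

print(choose(5, 2))  # row 5, column 2 of Pascal's Triangle: 10
\end{verbatim}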
The theory \verb@prim_rec@ contains a primitive recursion theorem which 
says that any base case together with any inductive case determines a unique 
function, \verb@fn@:
\begin{session}
\begin{verbatim}
#num_Axiom;;
|- !e f. ?! fn.
    (fn 0 = e) /\
    (!n. fn (SUC n) = f (fn n) n)
\end{verbatim}
\end{session}
Given the theorem \verb@num_Axiom@, the builtin function 
\verb@prove_rec_fn_exists@ can prove that there is a function that satisfies 
the property specified by the term \verb@base_and_inductive@:
\begin{session}
\begin{verbatim}
#let RAW_CHOOSE_DEF = prove_rec_fn_exists num_Axiom base_and_inductive;;
RAW_CHOOSE_DEF = 
|- ?CHOOSE.
    (!k. CHOOSE 0 k = ((0 = k) => 1 | 0)) /\
    (!n k.
      CHOOSE(SUC n)k = ((0 = k) => 1 | (CHOOSE n(k - 1)) + (CHOOSE n k)))
\end{verbatim}
\end{session}
These equations would not do as the only definition of \verb@CHOOSE@ due 
to the presence of the conditional operator, which makes rewriting 
difficult.  But having obtained the theorem \verb@RAW_CHOOSE_DEF@ it is 
easy to prove there is a function which satisfies the original three 
equations:
\begin{session}
\begin{verbatim}
CHOOSE_DEF_LEMMA = 
|- ?CHOOSE.
    (!n. CHOOSE n 0 = 1) /\
    (!k. CHOOSE 0(SUC k) = 0) /\
    (!n k. CHOOSE(SUC n)(SUC k) = (CHOOSE n(SUC k)) + (CHOOSE n k))
\end{verbatim}
\end{session}
Now the constant \verb@CHOOSE@ can be specified as originally desired:
\begin{session}
\begin{verbatim}
#let CHOOSE_DEF =
#    new_specification `CHOOSE_DEF` [`constant`,`CHOOSE`] CHOOSE_DEF_LEMMA;;
CHOOSE_DEF = 
|- (!n. CHOOSE n 0 = 1) /\
   (!k. CHOOSE 0(SUC k) = 0) /\
   (!n k. CHOOSE(SUC n)(SUC k) = (CHOOSE n(SUC k)) + (CHOOSE n k))
\end{verbatim}
\end{session}
Two elementary properties of the binomial coefficients can now be proven
by induction:
\begin{session}
\begin{verbatim}
CHOOSE_LESS = |- !n k. n < k ==> (CHOOSE n k = 0)
CHOOSE_SAME = |- !n. CHOOSE n n = 1
\end{verbatim}
\end{session}
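Both properties are easy to spot-check against the three defining equations (again an illustrative Python transcription, not the HOL proof):
\begin{verbatim}
def choose(n, k):
    """The three defining equations:
    CHOOSE n 0 = 1, CHOOSE 0 (SUC k) = 0, and
    CHOOSE (SUC n) (SUC k) = CHOOSE n (SUC k) + CHOOSE n k."""
    if k == 0:
        return 1
    if n == 0:
        return 0
    return choose(n - 1, k) + choose(n - 1, k - 1)

# CHOOSE_LESS: n < k implies CHOOSE n k = 0
assert all(choose(n, k) == 0 for n in range(8) for k in range(n + 1, 10))
# CHOOSE_SAME: CHOOSE n n = 1
assert all(choose(n, n) == 1 for n in range(8))
\end{verbatim}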

% ----------------------------------------------------------------------------

\section{Monoids, Groups and Commutative Rings}
\label{MonoidsGroupsRings}
%A simple way to define algebraic structures in \HOL{}.

\subsection{Associativity and Commutativity}

An {\em algebraic structure} is a set equipped with some operators that 
obey some laws.  An elementary example of such a structure is the pair 
$(A,+)$, where $A$ is a set and $+$ is a binary operator on $A$ that obeys 
the laws of associativity,
\[
a + (b + c) = (a + b) + c
\]
and commutativity,
\[
a + b = b + a
\]

How might these laws be represented in \HOL{}?  The simplest scheme is 
to deal with structures of the form $(\sigma, +)$, where $\sigma$ is a 
type of the \HOL{} logic, and $+$ is a \HOL{} term of type 
$\sigma\to\sigma\to\sigma$.  Notice that the set in the structure is 
represented as a \HOL{} type: see the theory \verb@group@ presented in 
Elsa Gunter's case study of Modular Arithmetic for a more sophisticated 
approach where the set in a structure can be a subset of a \HOL{} type.

The two laws can be defined as two constants in \HOL{}:
\begin{session}
\begin{verbatim}
#let ASSOCIATIVE =
#    new_definition (`ASSOCIATIVE`,
#      "!plus: *->*->*. ASSOCIATIVE plus =
#          (!a b c. plus a (plus b c) = plus (plus a b) c)");;
...
#let COMMUTATIVE =
#    new_definition (`COMMUTATIVE`,
#      "!plus: *->*->*. COMMUTATIVE plus =
#          (!a b. plus a b = plus b a)");;
\end{verbatim}
\end{session}
By instantiating the type variable \verb@*@ and specialising the variable 
\verb@plus@ these constants are applicable to any particular structure 
of the form $(\sigma, +: \sigma\to\sigma\to\sigma)$.

A simple example to illustrate this methodology is a permutation theorem 
about an arbitrary associative commutative operator (which is called 
\verb@PERM@ in file \verb@mk_BINOMIAL.ml@).  It is worthwhile going 
through the proof in detail because it illustrates how a difficulty in 
making use of assumptions like \verb@ASSOCIATIVE plus@ and \verb@COMMUTATIVE 
plus@ can be solved.
\begin{session}
\begin{verbatim}
#g "!plus: *->*->*.
#       ASSOCIATIVE plus /\ COMMUTATIVE plus ==>
#           !a b c. plus (plus a b) c = plus b (plus a c)";;
\end{verbatim}
\end{session}
The first step is to strip off the quantifiers, and break up the implication:
\begin{session}
\begin{verbatim}
#e (REPEAT STRIP_TAC);;
OK..
"plus(plus a b)c = plus b(plus a c)"
    [ "ASSOCIATIVE plus" ]
    [ "COMMUTATIVE plus" ]
\end{verbatim}
\end{session}
The next step is to rewrite \verb@plus a b@ to \verb@plus b a@ on the 
left-hand side, from the assumption \verb@COMMUTATIVE plus@. If the theorem 
was about a specific operator, such as \verb@$+: num->num->num@ from the 
theory of numbers in \HOL{}, the rewriting could be done with 
\verb@REWRITE_TAC@ and the specific commutativity theorem, such as 
\verb@ADD_SYM@:
\[
\mbox{\verb@|- !m n. m + n = n + m@}.
\]
But in the general case all that is available is the definition of
\verb@COMMUTATIVE plus@ which cannot be used directly with
\verb@REWRITE_TAC@ to swap the terms \verb@a@ and \verb@b@.
One approach is first to prove the
following equation from the definition \verb@COMMUTATIVE plus@,
\[
\mbox{\verb@COMMUTATIVE plus |- !a b. plus a b = plus b a@},
\]
and then to use the equation with \verb@REWRITE_TAC@ in the same way as one
might use \verb@ADD_SYM@.  This is a valid tactic because the
assumption \verb@COMMUTATIVE plus@ is already present in the assumption 
list.  A little forward proof achieves the first step, the derivation of the
equation:
\begin{session}
\begin{verbatim}
#top_print print_all_thm;;
- : (thm -> void)

#let forwards = fst(EQ_IMP_RULE (SPEC "plus: *->*->*" COMMUTATIVE));;
forwards = |- COMMUTATIVE plus ==> (!a b. plus a b = plus b a)

#let eqn = UNDISCH forwards;;
eqn = COMMUTATIVE plus |- !a b. plus a b = plus b a
\end{verbatim}
\end{session}
This forward proof is a special case of a derived
rule of the logic,
\[
\Gamma\turn \uquant{x} t_1 = t_2
\over
\Gamma, t_1[t'/x] \turn t_2[t'/x]
\]
which can be coded as the \ML{} function \verb@SPEC_EQ@:
\begin{session}
\begin{verbatim}
#let SPEC_EQ v th = UNDISCH(fst(EQ_IMP_RULE (SPEC v th)));;
SPEC_EQ = - : (term -> thm -> thm)
\end{verbatim}
\end{session}
Now the equation \verb@eqn@ can be used to swap the terms
\verb@a@ and \verb@b@:
\begin{session}
\begin{verbatim}
#e(SUBST1_TAC (SPECL ["a:*";"b:*"] eqn));;
OK..
"plus(plus b a)c = plus b(plus a c)"
    [ "ASSOCIATIVE plus" ]
    [ "COMMUTATIVE plus" ]
\end{verbatim}
\end{session}
Now that the terms appear on both sides in the same order but with different 
grouping, associativity can be used to make the grouping on both sides 
the same.  Again the definition of \verb@ASSOCIATIVE@ cannot be used
directly, but the derived rule \verb@SPEC_EQ@ obtains from the definition
an equation analogous to \verb@eqn@ which can be used:
\begin{session}
\begin{verbatim}
#e(REWRITE_TAC [SPEC_EQ "plus: *->*->*" ASSOCIATIVE]);;
OK..
goal proved
...
|- !plus.
    ASSOCIATIVE plus /\ COMMUTATIVE plus ==>
    (!a b c. plus(plus a b)c = plus b(plus a c))
\end{verbatim}
\end{session}

The point of this example was to illustrate how to prove theorems about 
an arbitrary operator whose behaviour is specified by general predicates 
like \verb@ASSOCIATIVE@ and \verb@COMMUTATIVE@. The main difficulty is 
in deriving equations for use with rewriting or substitution.  The solution 
used in the example and extensively in the proof of the Binomial Theorem 
is to derive equations using the derived rule \verb@SPEC_EQ@. Another derived 
rule, \verb@SPEC_IMP@, is used extensively and is a variant of \verb@SPEC_EQ@:
\[
\Gamma\turn \uquant{x} t_1 \Rightarrow t_2
\over
\Gamma, t_1[t'/x] \turn t_2[t'/x]
\]
\begin{session}
\begin{verbatim}
#let SPEC_IMP v th = UNDISCH(SPEC v th);;
SPEC_IMP = - : (term -> thm -> thm)
\end{verbatim}
\end{session}
The following two rules are versions of \verb@SPEC_EQ@ and \verb@SPEC_IMP@ 
generalised to take lists of variables rather than just one:
\begin{session}
\begin{verbatim}
SPECL_EQ = - : (term list -> thm -> thm)
SPECL_IMP = - : (term list -> thm -> thm)
\end{verbatim}
\end{session}
The final new derived rule used in the proof of the Binomial Theorem
is \verb@STRIP_SPEC_IMP@, a variant of \verb@SPEC_IMP@, which
splits up a conjunction in the antecedent into separate assumptions:
\[
\Gamma\turn \uquant{x} (t_1 \wedge \cdots \wedge t_n) \Rightarrow t_{n+1}
\over
\Gamma, t_1[t'/x], \ldots, t_n[t'/x] \turn t_{n+1}[t'/x]
\]
\begin{session}
\begin{verbatim}
#let STRIP_SPEC_IMP v th =
#    UNDISCH_ALL(SPEC v (CONV_RULE (ONCE_DEPTH_CONV ANTE_CONJ_CONV) th));;
STRIP_SPEC_IMP = - : (term -> thm -> thm)
\end{verbatim}
\end{session}

There are other methods for proving theorems which are conditional on 
predicates such as \verb@ASSOCIATIVE@ and \verb@COMMUTATIVE@.  One is to 
rewrite with the definition before assuming the predicate (see, for instance, 
the proof of \verb@RIGHT_CANCELLATION@ in \verb@mk_BINOMIAL.ml@). This 
method is fine for small proofs, but in larger proofs it presents the problem 
of how to specify a particular assumption on the assumption list.  Another 
method is to put the predicates on the assumption list, and use resolution 
to extract the body of the definition (see the proof of 
\verb@UNIQUE_LEFT_INV@).  When the assumption list is not short this is 
rather a blunt instrument, in the sense that resolution can find many more 
theorems than the one desired, and is also rather computationally costly 
when compared to carefully constructing a theorem with \verb@SPEC_EQ@ and 
its variants.

\subsection{Monoids}

A {\em monoid} is a structure $(M,+)$, where $M$ is a set and $+$ is an 
associative binary operator on $M$ for which there is an {\em identity 
element}: an element $u \in M$ which is both a left and a right identity of 
$+$, that is, which satisfies $u+a = a+u = a$ for all $a \in M$. When the operator is written 
as $+$, the structure $(M,+)$ is called an {\em additive monoid} and the 
identity element is written as $0$; otherwise if the operator is written 
as $\times$, the structure $(M,\times)$ is called a {\em multiplicative 
monoid} and the identity element is written as $1$.
\begin{session}
\begin{verbatim}
#let RAW_MONOID =
#    new_definition (`RAW_MONOID`,
#      "!plus: *->*->*. MONOID plus =
#          ASSOCIATIVE plus /\
#          (?u. (!a. plus a u = a) /\ (!a. plus u a = a))");;
\end{verbatim}
\end{session}
This definition is inconvenient to manipulate as it stands, because of 
the presence of the existential quantifier.  However it is possible via 
Hilbert's $\epsilon$-operator to specify a function \verb@Id@ such that 
\verb@Id plus@ stands for an identity element of \verb@plus@, if such 
exists.  In fact it is easy to prove that the identity element in a monoid 
is unique.  The first step is to prove that there exists a function \verb@Id@ 
with the property just described:
\begin{session}
\begin{verbatim}
ID_LEMMA = 
|- ?Id.
    !plus.
     (?u. (!a. plus u a = a) /\ (!a. plus a u = a)) ==>
     (!a. plus(Id plus)a = a) /\ (!a. plus a(Id plus) = a)
\end{verbatim}
\end{session}
The proof of \verb@ID_LEMMA@ uses Hilbert's $\epsilon$-operator to
construct a witness for the outer existential quantifier,
\begin{verbatim}
    \plus.@u:*.(!a:*. (plus u a = a)) /\ (!a. (plus a u = a))
\end{verbatim}
Given \verb@ID_LEMMA@, the constant \verb@Id@ is specified as follows:
\begin{session}
\begin{verbatim}
#let ID = new_specification `Id` [`constant`, `Id`] ID_LEMMA;;
ID = 
|- !plus.
    (?u. (!a. plus u a = a) /\ (!a. plus a u = a)) ==>
    (!a. plus(Id plus)a = a) /\ (!a. plus a(Id plus) = a)
\end{verbatim}
\end{session}
The condition that \verb@Id plus@ is a left and right identity is expressed 
by the predicates \verb@LEFT_ID@ and \verb@RIGHT_ID@, respectively:
\begin{session}
\begin{verbatim}
LEFT_ID = |- !plus. LEFT_ID plus = (!a. plus(Id plus)a = a)
RIGHT_ID = |- !plus. RIGHT_ID plus = (!a. plus a(Id plus) = a)
\end{verbatim}
\end{session}
Given these abbreviations it is easy to prove in \HOL{} that
if a left and right identity element exists, then it is unique:
\begin{session}
\begin{verbatim}
UNIQUE_LEFT_ID =
|- !plus.
    LEFT_ID plus /\ RIGHT_ID plus ==>
    (!l. (!a. plus l a = a) ==> (Id plus = l))

UNIQUE_RIGHT_ID =
|- !plus.
    LEFT_ID plus /\ RIGHT_ID plus ==>
    (!r. (!a. plus a r = a) ==> (Id plus = r))
\end{verbatim}
\end{session}
The use of these last two theorems is to help identify the element \verb@Id 
plus@ for some specific \verb@plus@, such as \verb@$+@ from the theory 
of \verb@arithmetic@.  The abbreviations \verb@LEFT_ID@ and \verb@RIGHT_ID@ 
admit a new characterisation of the predicate \verb@MONOID@, which is easier 
to manipulate in \HOL{} than \verb@RAW_MONOID@:
\begin{session}
\begin{verbatim}
MONOID =
|- !plus. MONOID plus = LEFT_ID plus /\ RIGHT_ID plus /\ ASSOCIATIVE plus
\end{verbatim}
\end{session}

\subsection{Groups}

A {\em group} is a monoid in which every element is invertible.
\begin{session}
\begin{verbatim}
RAW_GROUP = 
|- !plus.
    Group plus =
    MONOID plus /\
    (!a. ?b. (plus b a = Id plus) /\ (plus a b = Id plus))
\end{verbatim}
\end{session}
An alternative characterisation of \verb@Group@\footnote{
The mixed case identifier \verb@Group@ is used rather than \verb@GROUP@, 
because the latter is already defined in Elsa Gunter's theory \verb@group@, 
a parent of her theory \verb@integer@, which needs to coexist with the 
present theory so that integers can be used as an example ring 
(Section~\ref{BinomialTheoremForIntegers}).
} which makes the 
existential quantification implicit can be had by specifying a function 
\verb@Inv@ so that, if \verb@Group plus@ holds, then the element \verb@Inv 
plus a@ is an inverse of \verb@a@.  The function \verb@Inv@ is specified 
in a way analogous to the specification of \verb@Id@.  First a lemma is 
proven which says there does exist a function \verb@Inv@ with the property 
desired:
\begin{session}
\begin{verbatim}
INV_LEMMA = 
|- ?Inv.
    !plus.
     (!a. ?b. (plus b a = Id plus) /\ (plus a b = Id plus)) ==>
     (!a.
       (plus(Inv plus a)a = Id plus) /\ (plus a(Inv plus a) = Id plus))
\end{verbatim}
\end{session}
Given this lemma, which is proven by an easy tactic almost the same
as the one for \verb@ID_LEMMA@, the constant \verb@Inv@ can be specified:
\begin{session}
\begin{verbatim}
#let INV =
#    new_specification `Inv` [`constant`, `Inv`] INV_LEMMA;;
\end{verbatim}
\end{session}
The predicates which say that \verb@Inv@ produces left and right inverses
are defined as new constants:
\begin{session}
\begin{verbatim}
LEFT_INV =
|- !plus. LEFT_INV plus = (!a. plus(Inv plus a)a = Id plus)
RIGHT_INV =
|- !plus. RIGHT_INV plus = (!a. plus a(Inv plus a) = Id plus)
\end{verbatim}
\end{session}
Finally, the alternative characterisation of \verb@Group@ can be proven:
\begin{session}
\begin{verbatim}
GROUP = |- Group plus = MONOID plus /\ LEFT_INV plus /\ RIGHT_INV plus
\end{verbatim}
\end{session}
The theory of groups has been developed only as far as needed to prove 
the Binomial Theorem; Appendix~\ref{PrincipalLemmas} shows the one lemma, 
\verb@RIGHT_CANCELLATION@, specifically about groups.

\subsection{Rings}

A {\em ring} is a structure $(R,+,\times)$ such that structure $(R,+)$ 
is a group, structure $(R,\times)$ is a monoid, {\em addition}, $+$, is 
commutative, and {\em multiplication}, $\times$, distributes (on both sides) 
over addition.  A {\em commutative ring} is a ring in which multiplication 
is commutative.  The word ring is used in the remainder of this \self{} 
and in the \HOL{} code to mean a commutative ring:
\begin{session}
\begin{verbatim}
RING = 
|- !plus times.
    RING(plus,times) =
    Group plus /\
    COMMUTATIVE plus /\
    MONOID times /\
    COMMUTATIVE times /\
    LEFT_DISTRIB(plus,times) /\
    RIGHT_DISTRIB(plus,times)
\end{verbatim}
\end{session}
\verb@LEFT_DISTRIB@ and \verb@RIGHT_DISTRIB@ are new constants
defined by the theorems:
\begin{session}
\begin{verbatim}
LEFT_DISTRIB = 
|- !plus times.
    LEFT_DISTRIB(plus,times) =
    (!a b c. times a(plus b c) = plus(times a b)(times a c))

RIGHT_DISTRIB = 
|- !plus times.
    RIGHT_DISTRIB(plus,times) =
    (!a b c. times(plus a b)c = plus(times a c)(times b c))
\end{verbatim}
\end{session}
Notice that the definition of \verb@RING@ abbreviates a conjunction of 
predicates, some of which are atomic, in the sense that they are some basic 
property of \verb@plus@ or \verb@times@ or both, and some are compound, 
in the sense that they abbreviate a further conjunction of properties.  
The definition is a tree of predicates with atomic ones like 
\verb@LEFT_DISTRIB@ and \verb@ASSOCIATIVE@, at the leaves, and compound 
ones like \verb@Group@ and \verb@MONOID@, at the branches.  Many theorems 
conditional on \verb@RING(plus,times)@ need to make use of both atomic 
and compound predicates contained in the tree.  To make access to all these 
predicates easy, the following lemma is proven, which makes explicit all 
the predicates in the tree: 
\begin{session}
\begin{verbatim}
RING_LEMMA = 
|- RING(plus,times) ==>
   RING(plus,times) /\
   Group plus /\
   MONOID plus /\
   LEFT_ID plus /\
   RIGHT_ID plus /\
   ASSOCIATIVE plus /\
   LEFT_INV plus /\
   RIGHT_INV plus /\
   COMMUTATIVE plus /\
   MONOID times /\
   LEFT_ID times /\
   RIGHT_ID times /\
   ASSOCIATIVE times /\
   COMMUTATIVE times /\
   LEFT_DISTRIB(plus,times) /\
   RIGHT_DISTRIB(plus,times)
\end{verbatim}
\end{session}
Goals of the form,
\[
\mbox{\verb@!plus times. RING(plus,times) ==> ...@}
\]
are consistently tackled with the following tactic, which uses \verb@RING_LEMMA@
to add all the atomic and compound predicates in the tree to the assumption 
list:
\begin{session}
\begin{verbatim}
#let RING_TAC =
#    GEN_TAC THEN GEN_TAC THEN
#    DISCH_THEN (\th. STRIP_ASSUME_TAC (MP RING_LEMMA th));;
\end{verbatim}
\end{session}
The one lemma about rings needed here, \verb@RING_0@, is stated in
Appendix~\ref{PrincipalLemmas}.

% ----------------------------------------------------------------------------

\section{Powers, Reductions, Ranges and Sums}
\label{PowersReductionsRangesSums}
%Translation of mathematical definitions such as sums of series into HOL

The aim of this section is to explain how the mathematical notation 
$\sum_{k=0}^n {n \choose k} a^k b^{n-k}$ can be rendered in \HOL{}.
If this is written out more explicitly than necessary for human
consumption it looks like this:
\[
\sum_{k=0}^n {n \choose k} \cdot (a^k \times b^{n-k})
\]
where $\times$ is multiplication in the ring, and where exponentiation 
and $\cdot$ stand for the scalar powers of multiplication and addition, 
respectively.  Since binomial coefficients were treated in 
Section~\ref{BinomialCoefficients}, what remain to be defined in this section 
are scalar powers and the $\Sigma$-notation.  The latter is tackled in 
three steps: reduction of a list of monoid elements; generation of lists 
comprising subranges of natural numbers, $[n,n+1,\ldots,n+k-1]$; and finally 
the reduction of a list of ring elements indexed by a subrange of the natural 
numbers.

\subsection{Scalar powers in a monoid}

If $(M,+)$ and $(M,\times)$ are additive and multiplicative
monoids respectively, the scalar powers of addition and multiplication are
defined informally by the following equations, where $n>0$:
\begin{eqnarray*}
n \cdot a &=& \overbrace{a + \cdots + a}^{n} \\
a^n &=& \overbrace{a \times \cdots \times a}^{n}
\end{eqnarray*}
When $n=0$, $n \cdot a = 0$ and $a^n = 1$.
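A scalar power in this sense is just an $n$-fold application of the operator, starting from the identity element (an illustrative Python sketch; the names \verb@power@, \verb@op@ and \verb@identity@ are invented here and stand for the monoid data):
\begin{verbatim}
def power(op, identity, n, a):
    """n-fold application of a monoid operator:
    power(op, e, 0, a) = e, and
    power(op, e, n+1, a) = op(a, power(op, e, n, a))."""
    result = identity
    for _ in range(n):
        result = op(a, result)
    return result

assert power(lambda x, y: x + y, 0, 4, 3) == 12  # 4 . 3 in (num, +)
assert power(lambda x, y: x * y, 1, 4, 3) == 81  # 3^4 in (num, *)
\end{verbatim}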

Scalar power is represented in \HOL{} by the constant \verb@POWER@,
which is defined by primitive recursion:
\begin{session}
\begin{verbatim}
POWER = 
|- (!plus a. POWER plus 0 a = Id plus) /\
   (!plus n a. POWER plus(SUC n)a = plus a(POWER plus n a))
\end{verbatim}
\end{session}
Three lemmas about \verb@POWER@ are stated in Appendix~\ref{PrincipalLemmas}.
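The recursion scheme behind \verb@POWER@ can be sketched in Python as
follows (a sketch only: the names are illustrative, and the monoid's
identity element is passed explicitly, whereas in \HOL{} it is recovered
from the operation by \verb@Id@):

```python
# Sketch of scalar power in a monoid: n-fold application of `op` to `a`,
# mirroring the primitive-recursive definition of POWER.
# The identity element is an explicit parameter here (illustrative only).
def power(op, identity, n, a):
    if n == 0:
        return identity
    return op(a, power(op, identity, n - 1, a))

# Scalar powers of addition (n . a) and multiplication (a^n) over the integers:
print(power(lambda x, y: x + y, 0, 3, 5))  # 3 . 5 = 15
print(power(lambda x, y: x * y, 1, 3, 5))  # 5^3 = 125
```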

\subsection{Reduction in a monoid}

This section makes use of the standard \HOL{} theory of lists, in particular 
the constants \verb@NIL@, \verb@CONS@, \verb@APPEND@ and \verb@MAP@, 
definition by primitive recursion and proof by induction. The theory of 
lists is described in \DESCRIPTION.

If $(M,+)$ is a monoid, then the reduction of a list $[a_1,\ldots,a_n]$,
where each $a_i$ is an element of $M$, is the sum,
\[
a_1 + \cdots + a_n,
\]
or the monoid's identity element if $n=0$.

Reduction is represented in \HOL{} by the constant \verb@REDUCE@,
with type,
\begin{verbatim}
REDUCE : (*->*->*) -> (*)list -> *
\end{verbatim}
and which is defined by primitive recursion on its second (list) argument:
\begin{session}
\begin{verbatim}
#let REDUCE =
#  new_list_rec_definition (`REDUCE`,
#    "(!plus. (REDUCE plus NIL) = (Id plus):*) /\
#     (!plus (a:*) as. REDUCE plus (CONS a as) = plus a (REDUCE plus as))");;
REDUCE = 
|- (!plus. REDUCE plus[] = Id plus) /\
   (!plus a as. REDUCE plus(CONS a as) = plus a(REDUCE plus as))
\end{verbatim}
\end{session}
Three lemmas about \verb@REDUCE@ are stated in Appendix~\ref{PrincipalLemmas}.
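The behaviour of \verb@REDUCE@ can be sketched in Python (illustrative
names; the identity element is an explicit parameter here rather than
being recovered from the operation by \verb@Id@):

```python
# Sketch of REDUCE: fold the monoid operation over a list by primitive
# recursion, returning the identity element for the empty list.
def reduce_(op, identity, xs):
    if not xs:
        return identity
    return op(xs[0], reduce_(op, identity, xs[1:]))

print(reduce_(lambda x, y: x + y, 0, [1, 2, 3, 4]))  # 1 + 2 + 3 + 4 = 10
```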

\subsection{Subranges of the natural numbers}

The constant \verb@RANGE@, which is a curried function of two numbers,
is defined by primitive recursion so that,
\[
\verb@RANGE@\,m\,n = [m, m+1, \ldots, m+n-1]
\]
This definition means that the subrange is the first $n$ numbers starting 
at $m$.  In particular, when $n=0$ the list is empty.
Appendix~\ref{PrincipalLemmas} states two simple properties of \verb@RANGE@.

\verb@RANGE@ generates a subrange given its least element and length;
an alternative method is to generate the subrange given its two endpoints,
via a constant \verb@INTERVAL@ which satisfies the equation:
\[
\verb@INTERVAL@\,m\,n =
    \left\{\begin{array}{ll}
    [m, m+1, \ldots, n] & \mbox{if $m \leq n$} \\
    {}[] & \mbox{otherwise}
    \end{array}\right.
\]

For comparison, the two methods are definable in \HOL{} as follows:
\begin{session}
\begin{verbatim}
#let RANGE = 
#    new_recursive_definition false num_Axiom `RANGE`
#      "(RANGE m 0 = NIL) /\
#       (RANGE m (SUC n) = CONS m (RANGE (SUC m) n))";;
#
#let INTERVAL =
#    new_recursive_definition false num_Axiom `INTERVAL`
#      "(INTERVAL m 0 =
#          (m > 0) => NIL | [0]) /\
#      (INTERVAL m (SUC n) =
#          (m > (SUC n)) => NIL | APPEND (INTERVAL m n) ([SUC n]))";;
\end{verbatim}
\end{session}
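The two generators can be sketched in Python, making the recursions
explicit (illustrative names; \verb@range_@ avoids clashing with
Python's built-in \verb@range@):

```python
# range_(m, n) = [m, m+1, ..., m+n-1]: the first n numbers starting at m.
def range_(m, n):
    return [] if n == 0 else [m] + range_(m + 1, n - 1)

# interval(m, n) = [m, ..., n] if m <= n, and [] otherwise;
# like INTERVAL, it recurses on the upper endpoint n.
def interval(m, n):
    if m > n:
        return []
    if n == 0:
        return [0]
    return interval(m, n - 1) + [n]

print(range_(3, 4))    # [3, 4, 5, 6]
print(interval(2, 4))  # [2, 3, 4]
```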

\subsection{$\Sigma$-notation}

The standard $\Sigma$-notation is defined informally by the equation,
\[
\sum_{i=m}^n a_i =
    \left\{\begin{array}{ll}
    a_m + a_{m+1} + \cdots + a_n & \mbox{if $m \leq n$} \\
    0 & \mbox{otherwise}
    \end{array}\right.
\]
Therefore the standard $\Sigma$-notation is naturally definable via
a subrange generated by \verb@INTERVAL@.

An alternative notation, naturally definable via \verb@RANGE@, can be
defined informally as follows:
\[
\sum_{i=m}^{(n)} a_i =
    \left\{\begin{array}{ll}
    a_m + a_{m+1} + \cdots + a_{m+n-1} & \mbox{if $n > 0$} \\
    0 & \mbox{otherwise}
    \end{array}\right.
\]
The brackets around the $n$ above the $\Sigma$ distinguish this
notation from the standard one.

This second notation is used in the \HOL{} proof of the Binomial Theorem.
It was chosen entirely because the constant \verb@RANGE@ is easier to
manipulate in \HOL{} than \verb@INTERVAL@, the latter having a more
complex definition.  The disadvantage of this choice is that the proof
of the theorem uses a non-standard notation; but the difference is one
of notation only, and if desired the Binomial Theorem stated in the
standard notation could be trivially deduced from its non-standard
statement.

The non-standard $\Sigma$-notation is defined in \HOL{} as the following
abbreviation:
\begin{session}
\begin{verbatim}
#let SIGMA =
#    new_definition (`SIGMA`,
#      "SIGMA (plus,m,n) f = REDUCE plus (MAP f (RANGE m n)): *");;
\end{verbatim}
\end{session}
Note that an indexed term $a_i$ is represented in \HOL{} as a function
\verb@f@ of type \verb@num -> *@.

Appendix~\ref{PrincipalLemmas} states several properties of \verb@SIGMA@; 
their proofs make use of properties of \verb@MAP@ from the theory 
\verb@list@, and also the lemmas about \verb@REDUCE@ and \verb@RANGE@.
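The abbreviation itself can be sketched in Python by composing the three
pieces, reduction, mapping and range generation (a sketch with
illustrative names; the identity element is passed explicitly):

```python
# Sketch of SIGMA: reduce `plus` over the terms f(m), f(m+1), ..., f(m+n-1),
# i.e. REDUCE applied to MAP f over RANGE m n.
def range_(m, n):
    return [] if n == 0 else [m] + range_(m + 1, n - 1)

def reduce_(op, identity, xs):
    return identity if not xs else op(xs[0], reduce_(op, identity, xs[1:]))

def sigma(plus, identity, m, n, f):
    return reduce_(plus, identity, [f(k) for k in range_(m, n)])

# Sum of five squares, starting at index 1: 1 + 4 + 9 + 16 + 25.
print(sigma(lambda x, y: x + y, 0, 1, 5, lambda k: k * k))  # 55
```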

The functions \verb@REDUCE@ and \verb@RANGE@ are needed here only to define 
\verb@SIGMA@, and in fact could be omitted if \verb@SIGMA@ were to be defined 
directly  using primitive recursion.  It is worth going to the trouble 
of defining \verb@REDUCE@ and \verb@RANGE@, however, because they are likely 
to be useful in other situations. 

% ----------------------------------------------------------------------------

\section{The Binomial Theorem for a Commutative Ring}
\label{BinomialTheorem}
%Division of a medium sized proof into manageable lemmas.

This section outlines the proof.  For full details see the tactics in
\verb@mk_BINOMIAL.ml@.

The statement and proof of the Binomial Theorem use a new \HOL{} constant, 
\verb@BINTERM@, to abbreviate indexed terms of the form $\lambda k.\,{n 
\choose k} \cdot a^{n-k} \times b^k$.  \verb@BINTERM@ is defined as a
curried function as follows:
\begin{session}
\begin{verbatim}
#let BINTERM_DEF =
#    new_definition (`BINTERM_DEF`,
#      "BINTERM (plus: *->*->*) (times: *->*->*) a b n k =
#          POWER plus (CHOOSE n k)
#              (times (POWER times (n-k) a) (POWER times k b))");;
\end{verbatim}
\end{session}
This abbreviation lets the Binomial Theorem be stated and proven in \HOL{} as
follows:
\begin{session}
\begin{verbatim}
BINOMIAL = 
|- !plus times.
    RING(plus,times) ==>
    (!a b n.
      POWER times n(plus a b) =
      SIGMA(plus,0,SUC n)(BINTERM plus times a b n))
\end{verbatim}
\end{session}
In outline, the proof proceeds as follows.  The following equation is
proven by induction on $n$:
\begin{eqnarray*}
(a+b)^n &=& \sum_{k=0}^{(n+1)} {n \choose k} a^{n-k} b^k
\end{eqnarray*}
When $n=0$ both sides reduce to the multiplicative identity element, $1$, 
of the ring. For the inductive step, the equation is assumed for $n$, and 
the following goal must be proven:
\begin{eqnarray*}
(a+b)^{n+1} &=& \sum_{k=0}^{(n+2)} {{n+1} \choose k} a^{n+1-k} b^k
\end{eqnarray*}
Three lemmas are used in the inductive step:
\[\begin{array}{rcll}
\displaystyle
\sum_{k=1}^{(n)} {{n+1} \choose k} a^{n+1-k} b^k &=&
\displaystyle
    a \times \sum_{k=1}^{(n)} {n \choose k} a^{n-k} b^k +
    b \times \sum_{k=0}^{(n)} {n \choose k} a^{n-k} b^k
    & \mbox{({\tt LEMMA\_1})} \\
\displaystyle
a \times \sum_{k=0}^{(n+1)} {n \choose k} a^{n-k} b^k &=&
\displaystyle
    a^{n+1} +
    a \times \sum_{k=1}^{(n)} {n \choose k} a^{n-k} b^k
    & \mbox{({\tt LEMMA\_2})} \\
\displaystyle
b \times \sum_{k=0}^{(n+1)} {n \choose k} a^{n-k} b^k &=&
\displaystyle
    b \times \sum_{k=0}^{(n)} {n \choose k} a^{n-k} b^k
    + b^{n+1}
    & \mbox{({\tt LEMMA\_3})}
\end{array}\]
\verb@LEMMA_1@ is the key to the inductive step. The three stages of the 
inductive step are indicated (a), (b) and (c) in \ML{} comments.  Step 
(a) expands the right-hand side of the goal with \verb@LEMMA_1@:
\begin{eqnarray*}
\mbox{rhs} &=&
    a^{n+1} +
    a \times \sum_{k=1}^{(n)} {n \choose k} a^{n-k} b^k +
    b \times \sum_{k=0}^{(n)} {n \choose k} a^{n-k} b^k +
    b^{n+1}
\end{eqnarray*}
Step (b) expands the left-hand side with the induction hypothesis:
\begin{eqnarray*}
\mbox{lhs} &=&
    (a+b)\sum_{k=0}^{(n+1)} {n \choose k} a^{n-k} b^k
\end{eqnarray*}
Finally, in step (c) the expansions of the two sides are shown to be the 
same, using distributivity of $\times$ over $+$, associativity of $+$,
and \verb@LEMMA_2@ and \verb@LEMMA_3@:
\begin{eqnarray*}
\mbox{lhs} &=&
    a \times \sum_{k=0}^{(n+1)} {n \choose k} a^{n-k} b^k
    +
    b \times \sum_{k=0}^{(n+1)} {n \choose k} a^{n-k} b^k \\
  &=&
    a^{n+1} +
    a \times \sum_{k=1}^{(n)} {n \choose k} a^{n-k} b^k
    +
    b \times \sum_{k=0}^{(n)} {n \choose k} a^{n-k} b^k
    + b^{n+1} \\
  &=&
      \mbox{rhs}
\end{eqnarray*}

% ----------------------------------------------------------------------------

\section{The Binomial Theorem for Integers}
\label{BinomialTheoremForIntegers}

The previous sections have dealt with the material in theory \verb@BINOMIAL@,
concerning how to prove the Binomial Theorem for an arbitrary ring.  A 
natural next step is to produce some specific commutative ring, and prove
a Binomial Theorem for it, as an instance of the general case.

To see how to do so for Elsa Gunter's theory of integers, take a look 
at the file {\tt mk\_BINOMIAL\_integer.ml}, contained in directory \path{}.
When loaded into \HOL{}, this \ML{} file proves that the integers are a 
ring---by quoting laws contained in the theory \verb@integer@---and then 
derives a Binomial Theorem for the integers---by Modus Ponens from the 
general theorem \verb@BINOMIAL@.
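
As an informal numeric check of the instantiated theorem (quite apart
from the formal \HOL{} proof), the equation $(a+b)^n = \sum_{k=0}^n {n
\choose k} a^{n-k} b^k$ can be evaluated over the integers; in the
following Python sketch \verb@choose@ implements Pascal's rule, and all
names are illustrative rather than part of any \HOL{} theory:

```python
# Binomial coefficient via Pascal's rule:
# C(n,0) = 1, C(0,k) = 0 for k > 0, C(n,k) = C(n-1,k-1) + C(n-1,k).
def choose(n, k):
    if k == 0:
        return 1
    if n == 0:
        return 0
    return choose(n - 1, k - 1) + choose(n - 1, k)

# Right-hand side of the Binomial Theorem over the integers:
# the sum of C(n,k) * a^(n-k) * b^k for k = 0..n.
def binomial_sum(a, b, n):
    return sum(choose(n, k) * a ** (n - k) * b ** k for k in range(n + 1))

print(binomial_sum(2, 3, 4), (2 + 3) ** 4)  # both 625
```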