17. The Natural Numbers and Induction

This chapter marks a transition from the abstract to the concrete. Viewing the mathematical universe in terms of sets, relations, and functions gives us useful ways of thinking about mathematical objects and structures and the relationships between them. At some point, however, we need to start thinking about particular mathematical objects and structures, and the natural numbers are a good place to start. The nineteenth-century mathematician Leopold Kronecker once proclaimed “God created the whole numbers; everything else is the work of man.” By this he meant that the natural numbers (and the integers, which we will also discuss below) are a fundamental component of the mathematical universe, and that many other objects and structures of interest can be constructed from these.

In this chapter, we will consider the natural numbers and the basic principles that govern them. In Chapter 18 we will see that even basic operations like addition and multiplication can be defined using means described here, and their properties derived from these basic principles. Our presentation in this chapter will remain informal, however. In Chapter 19, we will see how these principles play out in number theory, one of the oldest and most venerable branches of mathematics.

17.1. The Principle of Induction

The set of natural numbers is the set

\[\mathbb{N} = \{ 0, 1, 2, 3, \ldots \}.\]

In the past, opinions have differed as to whether the set of natural numbers should start with 0 or 1, but these days most mathematicians take them to start with 0. Logicians often call the function \(s(n) = n + 1\) the successor function, since it maps each natural number, \(n\), to the one that follows it. What makes the natural numbers special is that they are generated by the number zero and the successor function, which is to say, the only way to construct a natural number is to start with \(0\) and apply the successor function finitely many times. From a foundational standpoint, we are in danger of running into a circularity here, because it is not clear how we can explain what it means to apply a function “finitely many times” without talking about the natural numbers themselves. But the following principle, known as the principle of induction, describes this essential property of the natural numbers in a non-circular way.


Principle of Induction. Let \(P\) be any property of natural numbers. Suppose \(P\) holds of zero, and whenever \(P\) holds of a natural number \(n\), then it holds of its successor, \(n + 1\). Then \(P\) holds of every natural number.


This reflects the image of the natural numbers as being generated by zero and the successor operation: by covering the zero and successor cases, we take care of all the natural numbers.

The principle of induction provides a recipe for proving that every natural number has a certain property: to show that \(P\) holds of every natural number, show that it holds of \(0\), and show that whenever it holds of some number \(n\), it holds of \(n + 1\). This form of proof is called a proof by induction. The first required task is called the base case, and the second required task is called the induction step. The induction step requires temporarily fixing a natural number \(n\), assuming that \(P\) holds of \(n\), and then showing that \(P\) holds of \(n + 1\). In this context, the assumption that \(P\) holds of \(n\) is called the inductive hypothesis.

You can visualize proof by induction as a method of knocking down an infinite stream of dominoes, all at once. We set the mechanism in place and knock down domino 0 (the base case), and every domino knocks down the next domino (the induction step). So domino 0 knocks down domino 1; that knocks down domino 2, and so on.

Here is an example of a proof by induction.


Theorem. For every natural number \(n\),

\[1 + 2 + \ldots + 2^n = 2^{n+1} - 1.\]

Proof. We prove this by induction on \(n\). In the base case, when \(n = 0\), we have \(1 = 2^{0+1} - 1\), as required.

For the induction step, fix \(n\), and assume the inductive hypothesis

\[1 + 2 + \ldots + 2^n = 2^{n+1} - 1.\]

We need to show that this same claim holds with \(n\) replaced by \(n + 1\). But this is just a calculation:

\[\begin{split}1 + 2 + \ldots + 2^{n+1} & = (1 + 2 + \ldots + 2^n) + 2^{n+1} \\ & = 2^{n+1} - 1 + 2^{n+1} \\ & = 2 \cdot 2^{n+1} - 1 \\ & = 2^{n+2} - 1.\end{split}\]
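The identity can also be spot-checked numerically. The following Python sketch computes the left-hand side by direct summation and compares it with the closed form for small values of \(n\). The function name `geometric_sum` is our own and not part of the text, and a finite check of this kind is no substitute for the proof by induction; it only guards against a misstated formula.

```python
def geometric_sum(n: int) -> int:
    """Return 1 + 2 + ... + 2**n, computed by direct summation."""
    return sum(2**i for i in range(n + 1))


# Compare the sum with the closed form 2**(n+1) - 1 for the first few n.
for n in range(20):
    assert geometric_sum(n) == 2 ** (n + 1) - 1
```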

In the notation of first-order logic, if we write \(P(n)\) to mean that \(P\) holds of \(n\), we could express the principle of induction as follows:

\[P(0) \wedge \forall n \; (P(n) \to P(n + 1)) \to \forall n \; P(n).\]

But notice that the principle of induction says that the axiom holds for every property \(P\), which means that we should properly use a universal quantifier for that, too:

\[\forall P \; (P(0) \wedge \forall n \; (P(n) \to P(n + 1)) \to \forall n \; P(n)).\]

Quantifying over properties takes us out of the realm of first-order logic; induction is therefore a second-order principle.

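The pattern for a proof by induction is expressed even more naturally by the following natural deduction rule:

\[\frac{P(0) \qquad \begin{array}{c} [P(n)] \\ \vdots \\ P(n + 1) \end{array}}{\forall n \; P(n)}\]

Here the bracketed hypothesis \(P(n)\) is canceled when the rule is applied.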

You should think about how some of the proofs in this chapter could be represented formally using natural deduction.

For another example of a proof by induction, let us derive a formula that, given any finite set \(S\), determines the number of subsets of \(S\). For example, there are four subsets of the two-element set \(\{1, 2\}\), namely \(\emptyset\), \(\{1\}\), \(\{2\}\), and \(\{1, 2\}\). You should convince yourself that there are eight subsets of the set \(\{1, 2, 3\}\). The following theorem establishes the general pattern.


Theorem. For any finite set \(S\), if \(S\) has \(n\) elements, then there are \(2^n\) subsets of \(S\).

Proof. We use induction on \(n\). In the base case, there is only one set with \(0\) elements, the empty set, and there is exactly one subset of the empty set, as required.

In the inductive case, suppose \(S\) has \(n + 1\) elements. Let \(a\) be any element of \(S\), and let \(S'\) be the set containing the remaining \(n\) elements. In order to count the subsets of \(S\), we divide them into two groups.

First, we consider the subsets of \(S\) that don’t contain \(a\). These are exactly the subsets of \(S'\), and by the inductive hypothesis, there are \(2^n\) of those.

Next we consider the subsets of \(S\) that do contain \(a\). Each of these is obtained by choosing a subset of \(S'\) and adding \(a\). Since there are \(2^n\) subsets of \(S'\), there are \(2^n\) subsets of \(S\) that contain \(a\).

Taken together, then, there are \(2^n + 2^n = 2^{n+1}\) subsets of \(S\), as required.
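The counting argument in this proof can be read as a recursive procedure for listing subsets. Here is a minimal Python sketch along those lines; the function name `subsets` and the use of `frozenset` to represent finite sets are our own choices for illustration.

```python
def subsets(s: frozenset) -> list:
    """List all subsets of s, following the split used in the proof."""
    if not s:
        return [frozenset()]            # base case: the empty set has one subset
    a = next(iter(s))                   # pick any element a of S
    without_a = subsets(s - {a})        # the subsets of S', i.e. those not containing a
    with_a = [t | {a} for t in without_a]  # each subset of S' with a added
    return without_a + with_a


print(len(subsets(frozenset({1, 2, 3}))))  # 8
```

Each recursive call doubles the length of the list, which is the computational counterpart of the step \(2^n + 2^n = 2^{n+1}\).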


We have seen that there is a correspondence between properties of a domain and subsets of a domain. For every property \(P\) of natural numbers, we can consider the set \(S\) of natural numbers with that property, and for every set of natural numbers, we can consider the property of being in that set. For example, we can talk about the property of being even, or talk about the set of even numbers. Under this correspondence, the principle of induction can be cast as follows:


Principle of Induction. Let \(S\) be any set of natural numbers that contains \(0\) and is closed under the successor operation. Then \(S = \mathbb{N}\).


Here, saying that \(S\) is “closed under the successor operation” means that whenever a number \(n\) is in \(S\), so is \(n + 1\).

17.2. Variants of Induction

In this section, we will consider variations on the principle of induction that are often useful. It is important to recognize that each of these can be justified using the principle of induction as stated in the last section, so they need not be taken as fundamental.

The first one is no great shakes: instead of starting from \(0\), we can start from any natural number, \(m\).


Principle of Induction from a Starting Point. Let \(P\) be any property of natural numbers, and let \(m\) be any natural number. Suppose \(P\) holds of \(m\), and whenever \(P\) holds of a natural number \(n\) greater than or equal to \(m\), then it holds of its successor, \(n + 1\). Then \(P\) holds of every natural number greater than or equal to \(m\).


Assuming the hypotheses of this last principle, if we let \(P'(n)\) be the property “\(P\) holds of \(m + n\),” we can prove that \(P'\) holds of every \(n\) by the ordinary principle of induction. But this means that \(P\) holds of every number greater than or equal to \(m\).

Here is one example of a proof using this variant of induction.


Theorem. For every natural number \(n \geq 5\), \(2^n > n^2\).

Proof. By induction on \(n\). When \(n = 5\), we have \(2^n = 32 > 25 = n^2\), as required.

For the induction step, suppose \(n \ge 5\) and \(2^n > n^2\). Since \(n\) is greater than or equal to \(5\), we have \(2n + 1 \leq 3 n \leq n^2\), and so

\[\begin{split}(n+1)^2 &= n^2 + 2n + 1 \\ & \leq n^2 + n^2 \\ & < 2^n + 2^n \\ & = 2^{n+1}.\end{split}\]
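To see why the theorem starts at \(5\) rather than \(0\), it helps to tabulate where the inequality fails. The short Python check below is only an illustration; the helper name `holds` is hypothetical.

```python
def holds(n: int) -> bool:
    """Return True when 2**n > n**2."""
    return 2**n > n**2


# The natural numbers below 10 at which the inequality fails.
print([n for n in range(10) if not holds(n)])  # [2, 3, 4]
```

The inequality holds at \(0\) and \(1\) but fails at \(2\), \(3\), and \(4\), so the induction step cannot be carried out from \(0\); it needs the assumption \(n \geq 5\) (in fact, \(n \geq 3\) suffices for the step, but the base case fails before \(5\)).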

For another example, let us derive a formula for the sum total of the angles in a convex polygon. A polygon is said to be convex if every line between two vertices stays inside the polygon. We will accept without proof the visually obvious fact that one can subdivide any convex polygon with more than three sides into a triangle and a convex polygon with one fewer side, namely, by closing off any two consecutive sides to form a triangle. We will also accept, without proof, the basic geometric fact that the sum of the angles of any triangle is 180 degrees.


Theorem. For any \(n \geq 3\), the sum of the angles of any convex \(n\)-gon is \(180(n - 2)\) degrees.

Proof. In the base case, when \(n = 3\), this reduces to the statement that the sum of the angles in any triangle is 180 degrees.

For the induction step, suppose \(n \geq 3\), and let \(P\) be a convex \((n+1)\)-gon. Divide \(P\) into a triangle and a convex \(n\)-gon. By the inductive hypothesis, the sum of the angles of the \(n\)-gon is \(180(n-2)\) degrees, and we have accepted that the sum of the angles of the triangle is \(180\) degrees. The measures of these angles taken together make up the sum of the measures of the angles of \(P\), for a total of \(180(n-2) + 180 = 180(n-1)\) degrees.


For our second variant, we will consider the principle of complete induction, also sometimes known as total induction.


Principle of Complete Induction. Let \(P\) be any property that satisfies the following: for any natural number \(n\), whenever \(P\) holds of every number less than \(n\), it also holds of \(n\). Then \(P\) holds of every natural number.


Notice that there is no need to break out a special case for zero: for any property \(P\), \(P\) holds of all the natural numbers less than zero, for the trivial reason that there aren’t any! So, in particular, any such property automatically holds of zero.

Notice also that if such a property \(P\) holds of every number less than \(n\), then it also holds of every number less than \(n + 1\) (why?). So, for such a \(P\), the ordinary principle of induction implies that for every natural number \(n\), \(P\) holds of every natural number less than \(n\). But this is just a roundabout way of saying that \(P\) holds of every natural number. In other words, we have justified the principle of complete induction using ordinary induction.

To use the principle of complete induction we merely have to let \(n\) be any natural number and show that \(P\) holds of \(n\), assuming that it holds of every smaller number. Compare this to the ordinary principle of induction, which requires us to show \(P (n + 1)\) assuming only \(P(n)\). The following example of the use of this principle is taken verbatim from the introduction to this book:


Theorem. Every natural number greater than or equal to 2 can be written as a product of primes.

Proof. We proceed by complete induction on \(n\). Let \(n\) be any natural number greater than or equal to 2. If \(n\) is prime, we are done; we can consider \(n\) itself as a product with one factor. Otherwise, \(n\) is composite, and we can write \(n = m \cdot k\) where \(m\) and \(k\) are smaller than \(n\) and greater than 1. By the inductive hypothesis, each of \(m\) and \(k\) can be written as a product of primes:

\[\begin{split}m = p_1 \cdot p_2 \cdot \ldots \cdot p_u \\ k = q_1 \cdot q_2 \cdot \ldots \cdot q_v.\end{split}\]

But then we have

\[n = m \cdot k = p_1 \cdot p_2 \cdot \ldots \cdot p_u \cdot q_1 \cdot q_2 \cdot \ldots \cdot q_v.\]

We see that \(n\) is a product of primes, as required.
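The proof has a direct computational reading: the two appeals to the inductive hypothesis become two recursive calls on smaller numbers. The following Python sketch illustrates this. The function name `prime_factors` is our own, and the search for a divisor by trial division is an assumption of the sketch, since the proof itself only uses the fact that a composite number has some factorization.

```python
def prime_factors(n: int) -> list:
    """Return a list of primes whose product is n, for n >= 2."""
    if n < 2:
        raise ValueError("n must be at least 2")
    m = 2
    while m * m <= n:
        if n % m == 0:
            # n is composite: n = m * k with 1 < m, k < n.
            # Recurse on both factors, as in the inductive hypothesis.
            return prime_factors(m) + prime_factors(n // m)
        m += 1
    return [n]  # no divisor found, so n is prime: a product with one factor


print(prime_factors(60))  # [2, 2, 3, 5]
```

The recursion terminates because each call is made on a strictly smaller number, which is exactly what complete induction requires.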


Finally, we will consider another formulation of induction, known as the least element principle.


The Least Element Principle. Suppose \(P\) is some property of natural numbers, and suppose \(P\) holds of some \(n\). Then there is a smallest value of \(n\) for which \(P\) holds.


In fact, using classical reasoning, this is equivalent to the principle of complete induction. To see this, consider the contrapositive of the statement above: “if there is no smallest value for which \(P\) holds, then \(P\) doesn’t hold of any natural number.” Let \(Q(n)\) be the property “\(P\) does not hold of \(n\).” Saying that there is no smallest value for which \(P\) holds means that, for every \(n\), if \(P\) holds at \(n\), then it holds of some number smaller than \(n\); and this is equivalent to saying that, for every \(n\), if \(Q\) doesn’t hold at \(n\), then there is a smaller value for which \(Q\) doesn’t hold. And that is equivalent to saying that if \(Q\) holds for every number less than \(n\), it holds for \(n\) as well. Similarly, saying that \(P\) doesn’t hold of any natural number is equivalent to saying that \(Q\) holds of every natural number. In other words, replacing the least element principle by its contrapositive, and replacing \(P\) by “not \(Q\),” we have the principle of complete induction. Since every statement is equivalent to its contrapositive, and every predicate has its negated version, the two principles are the same.

It is not surprising, then, that the least element principle can be used in much the same way as the principle of complete induction. Here, for example, is a formulation of the previous proof in these terms. Notice that it is phrased as a proof by contradiction.


Theorem. Every natural number greater than or equal to 2 can be written as a product of primes.

Proof. Suppose, to the contrary, some natural number greater than or equal to 2 cannot be written as a product of primes. By the least element principle, there is a smallest such element; call it \(n\). Then \(n\) is not prime, and since it is greater than or equal to 2, it must be composite. Hence we can write \(n = m \cdot k\) where \(m\) and \(k\) are smaller than \(n\) and greater than 1. By the assumption on \(n\), each of \(m\) and \(k\) can be written as a product of primes:

\[\begin{split}m = p_1 \cdot p_2 \cdot \ldots \cdot p_u \\ k = q_1 \cdot q_2 \cdot \ldots \cdot q_v.\end{split}\]

But then we have

\[n = m \cdot k = p_1 \cdot p_2 \cdot \ldots \cdot p_u \cdot q_1 \cdot q_2 \cdot \ldots \cdot q_v.\]

We see that \(n\) is a product of primes, contradicting the fact that \(n\) cannot be written as a product of primes.


Here is another example:


Theorem. Every natural number is interesting.

Proof. Suppose, to the contrary, some natural number is uninteresting. Then there is a smallest one, \(n\). In other words, \(n\) is the smallest uninteresting number. But that is really interesting! Contradiction.


17.3. Recursive Definitions

Suppose I tell you that I have a function \(f : \mathbb{N} \to \mathbb{N}\) in mind, satisfying the following properties:

\[\begin{split}f(0) & = 1 \\ f(n + 1) & = 2 \cdot f(n)\end{split}\]

What can you infer about \(f\)? Try calculating a few values:

\[\begin{split}f(1) & = f(0 + 1) = 2 \cdot f(0) = 2 \\ f(2) & = f(1 + 1) = 2 \cdot f(1) = 4 \\ f(3) & = f(2 + 1) = 2 \cdot f(2) = 8\end{split}\]

It soon becomes apparent that for every \(n\), \(f(n) = 2^n\).

What is more interesting is that the two conditions above specify all the values of \(f\), which is to say, there is exactly one function meeting the specification above. In fact, it does not matter that \(f\) takes values in the natural numbers; it could take values in any other domain. All that is needed is a value of \(f(0)\) and a way to compute the value of \(f(n+1)\) in terms of \(n\) and \(f(n)\). This is what the principle of definition by recursion asserts:


Principle of Definition by Recursion. Let \(A\) be any set, and suppose \(a\) is in \(A\), and \(g : \mathbb{N} \times A \to A\). Then there is a unique function \(f\) satisfying the following two clauses:

\[\begin{split}f(0) & = a \\ f(n + 1) & = g(n, f(n)).\end{split}\]

The principle of recursive definition makes two claims at once: first, that there is a function \(f\) satisfying the clauses above, and, second, that any two functions \(f_1\) and \(f_2\) satisfying those clauses are equal, which is to say, they have the same values for every input. In the example with which we began this section, \(A\) is just \(\mathbb{N}\) and \(g(n, x) = 2 \cdot x\).
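The existence half of the principle can be illustrated by a small program that builds \(f\) from \(a\) and \(g\). This is only a sketch: the name `define_by_recursion` is hypothetical, Python integers stand in for natural numbers, and the code says nothing about uniqueness.

```python
from typing import Callable, TypeVar

A = TypeVar("A")


def define_by_recursion(a: A, g: Callable[[int, A], A]) -> Callable[[int], A]:
    """Return the function f with f(0) = a and f(n + 1) = g(n, f(n))."""

    def f(n: int) -> A:
        value = a                # value of f(0)
        for i in range(n):
            value = g(i, value)  # pass from f(i) to f(i + 1)
        return value

    return f


# The doubling example from the start of this section: a = 1, g(n, x) = 2 * x.
f = define_by_recursion(1, lambda n, x: 2 * x)
print([f(n) for n in range(5)])  # [1, 2, 4, 8, 16]
```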

In some axiomatic frameworks, the principle of recursive definition can be justified using the principle of induction. In others, the principle of induction can be viewed as a special case of the principle of recursive definition. For now, we will simply take both to be fundamental properties of the natural numbers.

As another example of a recursive definition, consider the function \(g : \mathbb{N} \to \mathbb{N}\) defined recursively by the following clauses:

\[\begin{split}g(0) & = 1 \\ g(n+1) & = (n + 1) \cdot g(n)\end{split}\]

Try calculating the first few values. Unwrapping the definition, we see that \(g(n) = 1 \cdot 2 \cdot 3 \cdot \ldots \cdot (n-1) \cdot n\) for every \(n\); indeed, definition by recursion is usually the proper way to make expressions using “…” precise. The value \(g(n)\) is read “\(n\) factorial,” and written \(n!\).

Indeed, summation notation

\[\sum_{i < n} f (i) = f(0) + f(1) + \ldots + f(n-1)\]

and product notation

\[\prod_{i < n} f (i) = f(0) \cdot f(1) \cdot \cdots \cdot f(n-1)\]

can also be made precise using recursive definitions. For example, the function \(k(n) = \sum_{i < n} f (i)\) can be defined recursively as follows:

\[\begin{split}k(0) &= 0 \\ k(n+1) &= k(n) + f(n)\end{split}\]

Induction and recursion are complementary principles, and typically the way to prove something about a recursively defined function is to use the principle of induction. For example, the following theorem provides a formula for the sum \(1 + 2 + \ldots + n\), in terms of \(n\).


Theorem. For every \(n\), \(\sum_{i < n + 1} i = n (n + 1) / 2\).

Proof. In the base case, when \(n = 0\), both sides are equal to \(0\).

In the inductive step, we have

\[\begin{split}\sum_{i < n + 2} i & = \left(\sum_{i < n + 1} i\right) + (n + 1) \\ & = n (n + 1) / 2 + n + 1 \\ & = \frac{n^2 +n}{2} + \frac{2n + 2}{2} \\ & = \frac{n^2 + 3n + 2}{2} \\ & = \frac{(n+1)(n+2)}{2}.\end{split}\]
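The recursive clauses for \(\sum_{i < n} f(i)\) translate directly into code, and the formula just proved can then be spot-checked. In the Python sketch below, the function name `k` follows the text, while passing \(f\) as an explicit argument is our own choice; the check covers only finitely many \(n\) and is not a proof.

```python
def k(f, n: int) -> int:
    """Sum of f(i) for i < n, via k(0) = 0 and k(n + 1) = k(n) + f(n)."""
    if n == 0:
        return 0
    return k(f, n - 1) + f(n - 1)


# Check that the sum of i for i < n + 1 equals n (n + 1) / 2,
# written without division to stay within the integers.
for n in range(50):
    assert 2 * k(lambda i: i, n + 1) == n * (n + 1)
```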

There are just as many variations on the principle of recursive definition as there are on the principle of induction. For example, in analogy to the principle of complete induction, we can specify a value of \(f(n)\) in terms of the values that \(f\) takes at all inputs smaller than \(n\). When \(n \geq 2\), for example, the following definition specifies the value of a function \(\mathrm{fib}(n)\) in terms of its two predecessors:

\[\begin{split}\mathrm{fib}(0) & = 0 \\ \mathrm{fib}(1) & = 1 \\ \mathrm{fib}(n+2) & = \mathrm{fib}(n + 1) + \mathrm{fib}(n)\end{split}\]

Calculating the values of \(\mathrm{fib}\) on \(0, 1, 2, \ldots\) we obtain

\[0, 1, 1, 2, 3, 5, 8, 13, 21, \ldots\]

Here, after the second number, each successive number is the sum of the two values preceding it. This is known as the Fibonacci sequence, and the corresponding numbers are known as the Fibonacci numbers. An ordinary mathematical presentation would write \(F_n\) instead of \(\mathrm{fib}(n)\) and specify the sequence with the following equations:

\[F_0 = 0, \quad F_1 = 1, \quad F_{n+2} = F_{n+1} + F_n\]

But you can now recognize such a specification as an implicit appeal to the principle of definition by recursion. We ask you to prove some facts about the Fibonacci sequence in the exercises below.
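For illustration, here is one way to compute the Fibonacci numbers from these equations in Python. The sketch keeps track of two consecutive values at a time, which is an implementation choice of ours rather than something the definition requires.

```python
def fib(n: int) -> int:
    """Return the n-th Fibonacci number, with fib(0) = 0 and fib(1) = 1."""
    a, b = 0, 1              # the pair (fib(0), fib(1))
    for _ in range(n):
        a, b = b, a + b      # from (fib(i), fib(i+1)) to (fib(i+1), fib(i+2))
    return a


print([fib(n) for n in range(9)])  # [0, 1, 1, 2, 3, 5, 8, 13, 21]
```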

17.4. Defining Arithmetic Operations

In fact, we can even use the principle of recursive definition to define the most basic operations on the natural numbers and show that they have the properties we expect them to have. From a foundational standpoint, we can characterize the natural numbers as a set, \(\mathbb{N}\), with a distinguished element \(0\) and a function, \(\mathrm{succ}(m)\), which, for every natural number \(m\), returns its successor. These satisfy the following:

  • \(0 \neq \mathrm{succ}(m)\) for any \(m\) in \(\mathbb{N}\).

  • For every \(m\) and \(n\) in \(\mathbb{N}\), if \(m \neq n\), then \(\mathrm{succ}(m) \neq \mathrm{succ}(n)\). In other words, \(\mathrm{succ}\) is injective.

  • If \(A\) is any subset of \(\mathbb{N}\) with the property that \(0\) is in \(A\) and whenever \(n\) is in \(A\) then \(\mathrm{succ}(n)\) is in \(A\), then \(A = \mathbb{N}\).

The last clause can be reformulated as the principle of induction:

Suppose \(P(n)\) is any property of natural numbers, such that \(P\) holds of \(0\), and for every \(n\), \(P(n)\) implies \(P(\mathrm{succ}(n))\). Then \(P\) holds of every natural number.

Remember that this principle can be used to justify the principle of definition by recursion:

Let \(A\) be any set, \(a\) be any element of \(A\), and let \(g(n,m)\) be any function from \(\mathbb{N} \times A\) to \(A\). Then there is a unique function \(f: \mathbb{N} \to A\) satisfying the following two clauses:

  • \(f(0) = a\)

  • \(f(\mathrm{succ}(n)) = g(n,f(n))\) for every \(n\) in \(\mathbb{N}\)

We can use the principle of recursive definition to define addition with the following two clauses:

\[\begin{split}m + 0 & = m \\ m + \mathrm{succ}(n) & = \mathrm{succ}(m + n)\end{split}\]

Note that we are fixing \(m\), and viewing this as a function of \(n\). If we write \(1 = \mathrm{succ}(0)\), \(2 = \mathrm{succ}(1)\), and so on, it is easy to prove \(n + 1 = \mathrm{succ}(n)\) from the definition of addition.

We can proceed to define multiplication using the following two clauses:

\[\begin{split}m \cdot 0 & = 0 \\ m \cdot \mathrm{succ}(n) & = m \cdot n + m\end{split}\]

We can also define a predecessor function by

\[\begin{split}\mathrm{pred}(0) & = 0 \\ \mathrm{pred}(\mathrm{succ}(n)) & = n\end{split}\]

We can define truncated subtraction by

\[\begin{split}m \dot - 0 & = m \\ m \dot - (\mathrm{succ}(n)) & = \mathrm{pred}(m \dot - n)\end{split}\]
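The defining clauses above can be read almost literally as recursive programs. The following Python sketch does so under some stated assumptions: Python integers stand in for natural numbers, `succ` and `pred` are implemented with built-in arithmetic and treated as primitive, and the names `add`, `mul`, and `monus` (for truncated subtraction) are our own.

```python
def succ(n: int) -> int:
    return n + 1


def pred(n: int) -> int:
    # pred(0) = 0 and pred(succ(n)) = n
    return 0 if n == 0 else n - 1


def add(m: int, n: int) -> int:
    # m + 0 = m and m + succ(n) = succ(m + n)
    return m if n == 0 else succ(add(m, pred(n)))


def mul(m: int, n: int) -> int:
    # m * 0 = 0 and m * succ(n) = m * n + m
    return 0 if n == 0 else add(mul(m, pred(n)), m)


def monus(m: int, n: int) -> int:
    # m -. 0 = m and m -. succ(n) = pred(m -. n)
    return m if n == 0 else pred(monus(m, pred(n)))


print(add(3, 4), mul(3, 4), monus(5, 3), monus(3, 5))  # 7 12 2 0
```

In each function, a nonzero second argument \(n\) is treated as \(\mathrm{succ}(\mathrm{pred}(n))\), so the recursive call is on \(\mathrm{pred}(n)\). Note that truncated subtraction returns \(0\) whenever the second argument is at least as large as the first.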

With these definitions and the induction principle, one can prove all the following identities:

  • \(n \neq 0\) implies \(\mathrm{succ}(\mathrm{pred}(n)) = n\)

  • \(0 + n = n\)

  • \(\mathrm{succ}(m) + n = \mathrm{succ}(m + n)\)

  • \((m + n) + k = m + (n + k)\)

  • \(m + n = n + m\)

  • \(m(n + k) = mn + mk\)

  • \(0 \cdot n = 0\)

  • \(1 \cdot n = n\)

  • \((mn)k = m(nk)\)

  • \(mn = nm\)

We will do the first five here, and leave the remaining ones as exercises.


Proposition. For every natural number \(n\), if \(n \neq 0\) then \(\mathrm{succ}(\mathrm{pred}(n)) = n\).

Proof. By induction on \(n\). The base case holds vacuously, since the hypothesis \(n \neq 0\) rules out the case where \(n\) is \(0\). So we only need to show that the claim holds for \(\mathrm{succ}(n)\). But in that case, we have \(\mathrm{succ}(\mathrm{pred}(\mathrm{succ}(n))) = \mathrm{succ}(n)\) by the second defining clause of the predecessor function.

Proposition. For every \(n\), \(0 + n = n\).

Proof. By induction on \(n\). We have \(0 + 0 = 0\) by the first defining clause for addition. And assuming \(0 + n = n\), we have \(0 + \mathrm{succ}(n) = \mathrm{succ}(0 + n) = \mathrm{succ}(n)\), using the second defining clause for addition and the inductive hypothesis.

Proposition. For every \(m\) and \(n\), \(\mathrm{succ}(m) + n = \mathrm{succ}(m + n)\).

Proof. Fix \(m\) and use induction on \(n\). When \(n = 0\), we have \(\mathrm{succ}(m) + 0 = \mathrm{succ}(m) = \mathrm{succ}(m + 0)\), using the first defining clause for addition. Assuming the claim holds for \(n\), we have

\[\begin{split}\mathrm{succ}(m) + \mathrm{succ}(n) & = \mathrm{succ}(\mathrm{succ}(m) + n) \\ & = \mathrm{succ} (\mathrm{succ} (m + n)) \\ & = \mathrm{succ} (m + \mathrm{succ}(n))\end{split}\]

using the inductive hypothesis and the second defining clause for addition.

Proposition. For every \(m\), \(n\), and \(k\), \((m + n) + k = m + (n + k)\).

Proof. By induction on \(k\). The case where \(k = 0\) is easy, and in the induction step we have

\[\begin{split}(m + n) + \mathrm{succ}(k) & = \mathrm{succ} ((m + n) + k) \\ & = \mathrm{succ} (m + (n + k)) \\ & = m + \mathrm{succ} (n + k) \\ & = m + (n + \mathrm{succ} (k))\end{split}\]

using the inductive hypothesis and the definition of addition.

Proposition. For every pair of natural numbers \(m\) and \(n\), \(m + n = n + m\).

Proof. By induction on \(n\). The base case is easy using the second proposition above. In the inductive step, we have

\[\begin{split}m + \mathrm{succ}(n) & = \mathrm{succ}(m + n) \\ & = \mathrm{succ} (n + m) \\ & = \mathrm{succ}(n) + m\end{split}\]

using the inductive hypothesis and the third proposition above.


17.5. Arithmetic on the Natural Numbers

Continuing as in the last section, we can establish all the basic properties of the natural numbers that play a role in day-to-day mathematics. We summarize the main ones here:

\[\begin{split}m + n &= n + m \quad \text{(commutativity of addition)}\\ m + (n + k) &= (m + n) + k \quad \text{(associativity of addition)}\\ n + 0 &= n \quad \text{($0$ is a neutral element for addition)}\\ n \cdot m &= m \cdot n \quad \text{(commutativity of multiplication)}\\ m \cdot (n \cdot k) &= (m \cdot n) \cdot k \quad \text{(associativity of multiplication)}\\ n \cdot 1 &= n \quad \text{($1$ is a neutral element for multiplication)}\\ n \cdot (m + k) &= n \cdot m + n \cdot k \quad \text{(distributivity)}\\ n \cdot 0 &= 0 \quad \text{($0$ is an absorbing element for multiplication)}\end{split}\]

In an ordinary mathematical argument or calculation, they can be used without explicit justification. We also have the following properties:

  • \(n + 1 \neq 0\)

  • if \(n + k = m + k\) then \(n = m\)

  • if \(n \cdot k = m \cdot k\) and \(k \neq 0\) then \(n = m\)

We can define \(m \le n\), “\(m\) is less than or equal to \(n\),” to mean that there exists a \(k\) such that \(m + k = n\). If we do that, it is not hard to show that the less-than-or-equal-to relation satisfies all the following properties, for every \(n\), \(m\), and \(k\):

  • \(n \le n\) (reflexivity)

  • if \(n \le m\) and \(m \le k\) then \(n \le k\) (transitivity)

  • if \(n \le m\) and \(m \le n\) then \(n = m\) (antisymmetry)

  • for all \(n\) and \(m\), either \(n \le m\) or \(m \le n\) is true (totality)

  • if \(n \le m\) then \(n + k \le m + k\)

  • if \(n + k \le m + k\) then \(n \le m\)

  • if \(n \le m\) then \(nk \le mk\)

  • if \(m \ge n\) then \(m = n\) or \(m \ge n + 1\)

  • \(0 \le n\)

Remember from Chapter 13 that the first four items assert that \(\le\) is a linear order. Note that when we write \(m \ge n\), we mean \(n \le m\).
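The definition of \(\le\) in terms of addition can be tested directly on small numbers. In the Python sketch below, the function name `le` is hypothetical, and the search for a witness \(k\) is restricted to \(k \le n\), which loses nothing because any \(k\) with \(m + k = n\) is at most \(n\). The loops check totality and antisymmetry on a finite range only.

```python
def le(m: int, n: int) -> bool:
    """Return True when there is a natural number k with m + k == n."""
    return any(m + k == n for k in range(n + 1))


for m in range(15):
    for n in range(15):
        assert le(m, n) or le(n, m)       # totality
        if le(m, n) and le(n, m):
            assert m == n                 # antisymmetry
```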

As usual, then, we can define \(m < n\) to mean that \(m \le n\) and \(m \ne n\). In that case, we have that \(m \le n\) holds if and only if \(m < n\) or \(m = n\).


Proposition. For every \(m\), \(m + 1 \not\le 0\).

Proof. Otherwise, we would have \((m + 1) + k = (m + k) + 1 = 0\) for some \(k\), contradicting the fact that \(n + 1 \neq 0\) for every \(n\).


In particular, taking \(m = 0\), we have \(1 \not\le 0\).


Proposition. We have \(m < n\) if and only if \(m + 1 \le n\).

Proof. Suppose \(m < n\). Then \(m \le n\) and \(m \ne n\). So there is a \(k\) such that \(m + k = n\), and since \(m \ne n\), we have \(k \ne 0\). Then \(k = u + 1\) for some \(u\), which means we have \(m + (u + 1) = (m + 1) + u = n\), so \(m + 1 \le n\), as required.

In the other direction, suppose \(m + 1 \le n\). Then \(m \le n\). We also have \(m \ne n\), since if \(m = n\), we would have \(m + 1 \le m + 0\) and hence \(1 \le 0\), a contradiction.


Recall that \(m < n\) was defined to mean \(m \le n\) and \(m \ne n\). In fact, we can demonstrate all of the following from this definition, the previous proposition, and the properties of \(\le\):

  • \(n < n\) is never true (irreflexivity)

  • if \(n < m\) and \(m < k\) then \(n < k\) (transitivity)

  • for all \(n\) and \(m\), either \(n < m\), \(n = m\) or \(m < n\) is true (trichotomy)

  • if \(n < m\) then \(n + k < m + k\)

  • if \(k > 0\) and \(n < m\) then \(nk < mk\)

  • if \(m > n\) then \(m = n + 1\) or \(m > n + 1\)

  • for all \(n\), \(n = 0\) or \(n > 0\)

The first three items mean that \(<\) is a strict linear order, and the properties above mean that \(\le\) is the associated linear order, in the sense described in Section 13.1.


Proof. We will prove some of these properties using the previous characterization of the less-than relation.

The first property is straightforward: by the previous proposition, \(n < n\) would mean \(n + 1 \le n\). But we know \(n \le n + 1\), so by antisymmetry we would have \(n = n + 1\), a contradiction.

For the second property, assume \(n < m\) and \(m < k\). Then \(n + 1 \le m \le m + 1 \le k\), which implies \(n < k\).

For the third, we know that either \(n \le m\) or \(m \le n\). If \(m = n\), we are done, and otherwise we have either \(n < m\) or \(m < n\).

For the fourth, if \(n + 1 \le m\), we have \(n + 1 + k = (n + k) + 1 \le m + k\), as required.

For the fifth, suppose \(k > 0\), which is to say, \(k \ge 1\). If \(n < m\), then \(n + 1 \le m\), and so \(nk + 1 \le n k + k \le mk\). But this implies \(n k < m k\), as required.

The remaining proofs are left as exercises for the reader.


Here are some additional properties of \(<\) and \(\le\):

  • \(n < m\) and \(m < n\) cannot both hold (asymmetry)

  • \(n + 1 > n\)

  • if \(n < m\) and \(m \le k\) then \(n < k\)

  • if \(n \le m\) and \(m < k\) then \(n < k\)

  • if \(m > n\) then \(m \ge n + 1\)

  • if \(m \ge n\) then \(m + 1 > n\)

  • if \(n + k < m + k\) then \(n < m\)

  • if \(nk < mk\) then \(k > 0\) and \(n < m\)

These can be proved from the ones above. Moreover, the collection of principles we have just seen can be used to justify basic facts about the natural numbers, which are again typically taken for granted in informal mathematical arguments.


Proposition. If \(m\) and \(n\) are natural numbers such that \(m + n = 0\), then \(m = n = 0\).

Proof. If \(m + n = 0\), then \(m \le 0\), so \(m = 0\) and \(n = 0 + n = m + n = 0\).

Proposition. If \(n\) is a natural number such that \(n < 3\), then \(n = 0\), \(n = 1\) or \(n = 2\).

Proof. In this proof we repeatedly use the property that if \(m > n\) then \(m = n + 1\) or \(m > n + 1\). Since \(2 + 1 = 3 > n\), we conclude that either \(2 + 1 = n + 1\) or \(2 + 1 > n + 1\). In the first case we conclude \(n = 2\), and we are done. In the second case we conclude \(2 > n\), which implies that either \(2 = n + 1\), or \(2 > n + 1\). In the first case, we conclude \(n = 1\), and we are done. In the second case, we conclude \(1 > n\), and appeal one last time to the general principle presented above to conclude that either \(1 = n + 1\) or \(1 > n + 1\). In the first case, we conclude \(n = 0\), and we are once again done. In the second case, we conclude that \(0 > n\). This leads to a contradiction, since now \(0 > n \ge 0\), hence \(0 > 0\), which contradicts the irreflexivity of \(>\).


17.6. The Integers

The natural numbers are designed for counting discrete quantities, but they suffer an annoying drawback: it is possible to subtract \(n\) from \(m\) if \(n\) is less than or equal to \(m\), but not if \(n\) is greater than \(m\). The set of integers, \(\mathbb{Z}\), extends the natural numbers with negative values, to make it possible to carry out subtraction in full:

\[\mathbb{Z} = \{ \ldots, -3, -2, -1, 0, 1, 2, 3, \ldots \}.\]

We will see in a later chapter that the integers can be extended to the rational numbers, the real numbers, and the complex numbers, each of which serves useful purposes. For dealing with discrete quantities, however, the integers will get us pretty far.

You can think of the integers as consisting of two copies of the natural numbers, a positive one and a negative one, sharing a common zero. Conversely, once we have the integers, you can think of the natural numbers as consisting of the nonnegative integers, that is, the integers that are greater than or equal to \(0\). Most mathematicians blur the distinction between the two, though we will see that in Lean, for example, the natural numbers and the integers represent two different data types.
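
A short Lean 4 sketch makes the distinction between the two types concrete. In Lean, subtraction on `Nat` is truncated at zero, while subtraction on `Int` behaves as expected. The last line assumes the `omega` tactic is available.

```lean
#eval (2 : Nat) - 5            -- 0: subtraction on Nat stops at zero
#eval (2 : Int) - 5            -- -3
#eval ((7 : Nat) : Int) - 10   -- -3: a natural number viewed as an integer

-- Every natural number, viewed as an integer, is nonnegative.
example (n : Nat) : (0 : Int) ≤ (n : Int) := by omega
```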

Most of the properties of the natural numbers that were enumerated in the last section hold of the integers as well, but not all. For example, it is no longer the case that \(n + 1 \neq 0\) for every \(n\), since the claim is false for \(n = -1\). For another example, it is not the case that every integer is either equal to \(0\) or greater than \(0\), since this fails to hold of the negative integers.

The key property that the integers enjoy, which sets them apart from the natural numbers, is that for every integer \(n\) there is a value \(-n\) with the property that \(n + (-n) = 0\). The value \(-n\) is called the negation of \(n\). We define subtraction \(n - m\) to be \(n + (-m)\). For any integer \(n\), we also define the absolute value of \(n\), written \(|n|\), to be \(n\) if \(n \geq 0\), and \(-n\) otherwise.
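
These definitions can be tried out in Lean 4. In the sketch below, `absVal` is a hypothetical helper that transcribes the definition of \(|n|\) given above. Lean's own library provides `Int.natAbs`, which returns a natural number instead. The two general laws are checked with the `omega` tactic, assumed to be available.

```lean
-- Hypothetical helper following the definition of |n| in the text.
def absVal (n : Int) : Int := if n ≥ 0 then n else -n

#eval absVal (-5)   -- 5
#eval absVal 3      -- 3

-- Negation is an additive inverse.
example (n : Int) : n + (-n) = 0 := by omega

-- Subtraction is addition of the negation.
example (n m : Int) : n - m = n + (-m) := by omega
```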

We can no longer use proof by induction on the integers, because induction does not cover the negative numbers. But we can use induction to show that a property holds of every nonnegative integer, for example. Moreover, we know that every negative integer is the negation of a positive one. As a result, proofs involving the integers often break down into two cases, where one case covers the nonnegative integers, and the other case covers the negative ones.
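
Lean 4 builds this two-case structure into the type `Int` itself: an integer is either `Int.ofNat k`, representing the natural number \(k\), or `Int.negSucc k`, representing \(-(k + 1)\). The following sketch states the resulting case split, and shows ordinary induction applied to the nonnegative integers. The theorem names `int_two_cases` and `nonneg_induction` are chosen here for illustration.

```lean
-- To prove P for every integer, handle the nonnegative and negative cases.
theorem int_two_cases (P : Int → Prop)
    (hnonneg : ∀ k : Nat, P (Int.ofNat k))
    (hneg : ∀ k : Nat, P (Int.negSucc k)) :
    ∀ n : Int, P n
  | Int.ofNat k => hnonneg k
  | Int.negSucc k => hneg k

-- Ordinary induction still covers the nonnegative integers.
theorem nonneg_induction (P : Int → Prop)
    (base : P (Int.ofNat 0))
    (step : ∀ k : Nat, P (Int.ofNat k) → P (Int.ofNat (k + 1))) :
    ∀ k : Nat, P (Int.ofNat k) := by
  intro k
  induction k with
  | zero => exact base
  | succ k ih => exact step k ih
```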

17.7. Exercises

  1. Write the principle of complete induction using the notation of symbolic logic. Also write the least element principle this way, and use logical manipulations to show that the two are equivalent.

  2. Show that for every \(n\), \(0^2 + 1^2 + 2^2 + \ldots + n^2 = \frac{1}{6}n(1+n)(1+2n)\).

  3. Show that for every \(n\), \(0^3 + 1^3 + \ldots + n^3 = \frac{1}{4} n^2 (n+1)^2\).

  4. Show that for every \(n\), \(\sum_{i \le n} \frac{i}{(i + 1)!} = \frac{(n + 1)! - 1}{(n + 1)!}\).

  5. Given the definition of the Fibonacci numbers in Section 17.3, prove Cassini’s identity: for every \(n\), \(F^2_{n+1} - F_{n+2} F_n = (-1)^n\). Hint: in the induction step, write \(F_{n+2}^2\) as \(F_{n+2}(F_{n+1} + F_n)\).

  6. Prove \(\sum_{i < n} F_{2i+1} = F_{2n}\).

  7. Prove the following two identities:

    • \(F_{2n+1} = F^2_{n+1} + F^2_n\)

    • \(F_{2n+2} = F^2_{n+2} - F^2_n\)

    Hint: use induction on \(n\), and prove them both at once. In the induction step, expand \(F_{2n+3} = F_{2n+2} + F_{2n+1}\), and similarly for \(F_{2n+4}\). Proving the second equation is especially tricky. Use the inductive hypothesis and the first identity to simplify the left-hand side, and repeatedly unfold the Fibonacci number with the highest index and simplify the equation you need to prove. (When you have worked out a solution, write a clear equational proof, calculating in the “forward” direction.)

  8. Prove that every natural number can be written as a sum of distinct powers of 2. For this problem, \(1 = 2^0\) is counted as a power of 2.

  9. Let \(V\) be a non-empty set of integers such that the following two properties hold:

    • If \(x, y \in V\), then \(x - y \in V\).

    • If \(x \in V\), then every multiple of \(x\) is an element of \(V\).

    Prove that there is some \(d \in V\), such that \(V\) is equal to the set of multiples of \(d\). Hint: use the least element principle.

  10. Give an informal but detailed proof that for every natural number \(n\), \(1 \cdot n = n\), using a proof by induction, the definition of multiplication, and the theorems proved in Section 17.4.

  11. Show that multiplication distributes over addition. In other words, prove that for natural numbers \(m\), \(n\), and \(k\), \(m (n + k) = m n + m k\). You should use the definitions of addition and multiplication and facts proved in Section 17.4 (but nothing more).

  12. Prove that multiplication is associative, in the same way. You can use any of the facts proved in Section 17.4 and the previous exercise.

  13. Prove that multiplication is commutative.

  14. Prove \((m^n)^k = m^{nk}\).

  15. Following the example in Section 17.5, prove that if \(n\) is a natural number and \(n < 5\), then \(n\) is one of the values \(0, 1, 2, 3\), or \(4\).

  16. Prove that if \(n\) and \(m\) are natural numbers and \(n m = 1\), then \(n = m = 1\), using only properties listed in Section 17.5.

    This is tricky. First show that \(n\) and \(m\) are greater than \(0\), and hence greater than or equal to \(1\). Then show that if either one of them is greater than \(1\), then \(n m > 1\).

  17. Prove any of the other claims in Section 17.5 that were stated without proof.

  18. Prove the following properties of negation and subtraction on the integers, using only the properties of negation and subtraction given in Section 17.6.

    • If \(n + m = 0\) then \(m = -n\).

    • \(-0 = 0\).

    • If \(-n = -m\) then \(n = m\).

    • \(m + (n - m) = n\).

    • \(-(n + m) = -n - m\).

    • If \(m < n\) then \(n - m > 0\).

    • If \(m < n\) then \(-m > -n\).

    • \(n \cdot (-m) = -nm\).

    • \(n(m - k) = nm - nk\).

    • If \(n < m\) then \(n - k < m - k\).

  19. Suppose you have an infinite chessboard with a natural number written in each square. The value in each square is the average of the values of the four neighboring squares. Prove that all the values on the chessboard are equal.

  20. Prove that every natural number can be written as a sum of distinct non-consecutive Fibonacci numbers. For example, \(22 = 1 + 3 + 5 + 13\) is not allowed, since 3 and 5 are consecutive Fibonacci numbers, but \(22 = 1 + 21\) is allowed.
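
Before attempting a proof by induction, it can be helpful to spot-check an identity on small values. The Lean 4 sketch below does this for Exercises 2, 3, and 6. It is not a proof. The function `sumTo` is a hypothetical helper, and `fib` assumes the convention \(F_0 = 0\), \(F_1 = 1\), which should be adjusted if the definition in Section 17.3 differs. Each `#eval` should report `true`.

```lean
-- Assumed convention: F₀ = 0 and F₁ = 1.
def fib : Nat → Nat
  | 0 => 0
  | 1 => 1
  | n + 2 => fib (n + 1) + fib n

-- Hypothetical helper: sumTo f n = f 0 + f 1 + ... + f n.
def sumTo (f : Nat → Nat) : Nat → Nat
  | 0 => f 0
  | n + 1 => sumTo f n + f (n + 1)

-- Exercise 2, with both sides multiplied by 6 to avoid division.
#eval (List.range 30).all fun n =>
  6 * sumTo (fun i => i ^ 2) n == n * (1 + n) * (1 + 2 * n)

-- Exercise 3, with both sides multiplied by 4.
#eval (List.range 30).all fun n =>
  4 * sumTo (fun i => i ^ 3) n == n ^ 2 * (n + 1) ^ 2

-- Exercise 6: summing F_{2i+1} over i ≤ n is the sum over i < n + 1,
-- which the exercise claims is F_{2(n+1)}.
#eval (List.range 10).all fun n =>
  sumTo (fun i => fib (2 * i + 1)) n == fib (2 * (n + 1))
```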