# Arrays and Termination

To write efficient code, it is important to select appropriate data structures. Linked lists have their place: in some applications, the ability to share the tails of lists is very important. However, most use cases for a variable-length sequential collection of data are better served by arrays, which have both less memory overhead and better locality.

Arrays, however, have two drawbacks relative to lists:

- Arrays are accessed through indexing, rather than by pattern matching, which imposes proof obligations in order to maintain safety.
- A loop that processes an entire array from left to right is a tail-recursive function, but it does not have an argument that decreases on each call.

Making effective use of arrays requires knowing how to prove to Lean that an array index is in bounds, and how to prove that an array index that approaches the size of the array also causes the program to terminate. Both of these are expressed using an inequality proposition, rather than propositional equality.

## Inequality

Because different types have different notions of ordering, inequality is governed by two type classes, called `LE`

and `LT`

.
The table in the section on standard type classes describes how these classes relate to the syntax:

Expression | Desugaring | Class Name |
---|---|---|

`x < y` | `LT.lt x y` | `LT` |

`x ≤ y` | `LE.le x y` | `LE` |

`x > y` | `LT.lt y x` | `LT` |

`x ≥ y` | `LE.le y x` | `LE` |

In other words, a type may customize the meaning of the `<`

and `≤`

operators, while `>`

and `≥`

derive their meanings from `<`

and `≤`

.
The classes `LT`

and `LE`

have methods that return propositions rather than `Bool`

s:

```
class LE (α : Type u) where
le : α → α → Prop
class LT (α : Type u) where
lt : α → α → Prop
```

The instance of `LE`

for `Nat`

delegates to `Nat.le`

:

```
instance : LE Nat where
le := Nat.le
```

Defining `Nat.le`

requires a feature of Lean that has not yet been presented: it is an inductively-defined relation.

### Inductively-Defined Propositions, Predicates, and Relations

`Nat.le`

is an *inductively-defined relation*.
Just as `inductive`

can be used to create new datatypes, it can also be used to create new propositions.
When a proposition takes an argument, it is referred to as a *predicate* that may be true for some, but not all, potential arguments.
Propositions that take multiple arguments are called *relations*.

Each constructor of an inductively defined proposition is a way to prove it. In other words, the declaration of the proposition describes the different forms of evidence that it is true. A proposition with no arguments that has a single constructor can be quite easy to prove:

```
inductive EasyToProve : Prop where
| heresTheProof : EasyToProve
```

The proof consists of using its constructor:

```
theorem fairlyEasy : EasyToProve := by
constructor
```

In fact, the proposition `True`

, which should always be easy to prove, is defined just like `EasyToProve`

:

```
inductive True : Prop where
| intro : True
```

Inductively-defined propositions that don't take arguments are not nearly as interesting as inductively-defined datatypes.
This is because data is interesting in its own right—the natural number `3`

is different from the number `35`

, and someone who has ordered 3 pizzas will be upset if 35 arrive at their door 30 minutes later.
The constructors of a proposition describe ways in which the proposition can be true, but once a proposition has been proved, there is no need to know *which* underlying constructors were used.
This is why most interesting inductively-defined types in the `Prop`

universe take arguments.

The inductively-defined predicate `IsThree`

states that its argument is three:

```
inductive IsThree : Nat → Prop where
| isThree : IsThree 3
```

The mechanism used here is just like indexed families such as `HasCol`

, except the resulting type is a proposition that can be proved rather than data that can be used.

Using this predicate, it is possible to prove that three is indeed three:

```
theorem three_is_three : IsThree 3 := by
constructor
```

Similarly, `IsFive`

is a predicate that states that its argument is `5`

:

```
inductive IsFive : Nat → Prop where
| isFive : IsFive 5
```

If a number is three, then the result of adding two to it should be five. This can be expressed as a theorem statement:

```
theorem three_plus_two_five : IsThree n → IsFive (n + 2) := by
skip
```

The resulting goal has a function type:

```
unsolved goals
n : Nat
⊢ IsThree n → IsFive (n + 2)
```

Thus, the `intro`

tactic can be used to convert the argument into an assumption:

```
theorem three_plus_two_five : IsThree n → IsFive (n + 2) := by
intro three
```

```
unsolved goals
n : Nat
three : IsThree n
⊢ IsFive (n + 2)
```

Given the assumption that `n`

is three, it should be possible to use the constructor of `IsFive`

to complete the proof:

```
theorem three_plus_two_five : IsThree n → IsFive (n + 2) := by
intro three
constructor
```

However, this results in an error:

```
tactic 'constructor' failed, no applicable constructor found
n : Nat
three : IsThree n
⊢ IsFive (n + 2)
```

This error occurs because `n + 2`

is not definitionally equal to `5`

.
In an ordinary function definition, dependent pattern matching on the assumption `three`

could be used to refine `n`

to `3`

.
The tactic equivalent of dependent pattern matching is `cases`

, which has a syntax similar to that of `induction`

:

```
theorem three_plus_two_five : IsThree n → IsFive (n + 2) := by
intro three
cases three with
| isThree => skip
```

In the remaining case, `n`

has been refined to `3`

:

```
unsolved goals
case isThree
⊢ IsFive (3 + 2)
```

Because `3 + 2`

is definitionally equal to `5`

, the constructor is now applicable:

```
theorem three_plus_two_five : IsThree n → IsFive (n + 2) := by
intro three
cases three with
| isThree => constructor
```

The standard false proposition `False`

has no constructors, making it impossible to provide direct evidence for.
The only way to provide evidence for `False`

is if an assumption is itself impossible, similarly to how `nomatch`

can be used to mark code that the type system can see is unreachable.
As described in the initial Interlude on proofs, the negation `Not A`

is short for `A → False`

.
`Not A`

can also be written `¬A`

.

It is not the case that four is three:

```
theorem four_is_not_three : ¬ IsThree 4 := by
skip
```

The initial proof goal contains `Not`

:

```
unsolved goals
⊢ ¬IsThree 4
```

The fact that it's actually a function type can be exposed using `simp`

:

```
theorem four_is_not_three : ¬ IsThree 4 := by
simp [Not]
```

```
unsolved goals
⊢ IsThree 4 → False
```

Because the goal is a function type, `intro`

can be used to convert the argument into an assumption.
There is no need to keep `simp`

, as `intro`

can unfold the definition of `Not`

itself:

```
theorem four_is_not_three : ¬ IsThree 4 := by
intro h
```

```
unsolved goals
h : IsThree 4
⊢ False
```

In this proof, the `cases`

tactic solves the goal immediately:

```
theorem four_is_not_three : ¬ IsThree 4 := by
intro h
cases h
```

Just as a pattern match on a `Vect String 2`

doesn't need to include a case for `Vect.nil`

, a proof by cases over `IsThree 4`

doesn't need to include a case for `isThree`

.

### Inequality of Natural Numbers

The definition of `Nat.le`

has a parameter and an index:

```
inductive Nat.le (n : Nat) : Nat → Prop
| refl : Nat.le n n
| step : Nat.le n m → Nat.le n (m + 1)
```

The parameter `n`

is the number that should be smaller, while the index is the number that should be greater than or equal to `n`

.
The `refl`

constructor is used when both numbers are equal, while the `step`

constructor is used when the index is greater than `n`

.

From the perspective of evidence, a proof that \( n \leq k \) consists of finding some number \( d \) such that \( n + d = m \).
In Lean, the proof then consists of a `Nat.le.refl`

constructor wrapped by \( d \) instances of `Nat.le.step`

.
Each `step`

constructor adds one to its index argument, so \( d \) `step`

constructors adds \( d \) to the larger number.
For example, evidence that four is less than or equal to seven consists of three `step`

s around a `refl`

:

```
theorem four_le_seven : 4 ≤ 7 :=
open Nat.le in
step (step (step refl))
```

The strict less-than relation is defined by adding one to the number on the left:

```
def Nat.lt (n m : Nat) : Prop :=
Nat.le (n + 1) m
instance : LT Nat where
lt := Nat.lt
```

Evidence that four is strictly less than seven consists of two `step`

's around a `refl`

:

```
theorem four_lt_seven : 4 < 7 :=
open Nat.le in
step (step refl)
```

This is because `4 < 7`

is equivalent to `5 ≤ 7`

.

## Proving Termination

The function `Array.map`

transforms an array with a function, returning a new array that contains the result of applying the function to each element of the input array.
Writing it as a tail-recursive function follows the usual pattern of delegating to a function that passes the output array in an accumulator.
The accumulator is initialized with an empty array.
The accumulator-passing helper function also takes an argument that tracks the current index into the array, which starts at `0`

:

```
def Array.map (f : α → β) (arr : Array α) : Array β :=
arrayMapHelper f arr Array.empty 0
```

The helper should, at each iteration, check whether the index is still in bounds.
If so, it should loop again with the transformed element added to the end of the accumulator and the index incremented by `1`

.
If not, then it should terminate and return the accumulator.
An initial implementation of this code fails because Lean is unable to prove that the array index is valid:

```
def arrayMapHelper (f : α → β) (arr : Array α) (soFar : Array β) (i : Nat) : Array β :=
if i < arr.size then
arrayMapHelper f arr (soFar.push (f arr[i])) (i + 1)
else soFar
```

```
failed to prove index is valid, possible solutions:
- Use `have`-expressions to prove the index is valid
- Use `a[i]!` notation instead, runtime check is perfomed, and 'Panic' error message is produced if index is not valid
- Use `a[i]?` notation instead, result is an `Option` type
- Use `a[i]'h` notation instead, where `h` is a proof that index is valid
α : Type ?u.1742
β : Type ?u.1745
f : α → β
arr : Array α
soFar : Array β
i : Nat
⊢ i < Array.size arr
```

However, the conditional expression already checks the precise condition that the array index's validity demands (namely, `i < arr.size`

).
Adding a name to the `if`

resolves the issue, because it adds an assumption that the array indexing tactic can use:

```
def arrayMapHelper (f : α → β) (arr : Array α) (soFar : Array β) (i : Nat) : Array β :=
if inBounds : i < arr.size then
arrayMapHelper f arr (soFar.push (f arr[i])) (i + 1)
else soFar
```

Lean does not, however, accept the modified program, because the recursive call is not made on an argument to one of the input constructors. In fact, both the accumulator and the index grow, rather than shrinking:

```
fail to show termination for
arrayMapHelper
with errors
argument #6 was not used for structural recursion
failed to eliminate recursive application
arrayMapHelper f✝ arr (Array.push soFar (f✝ arr[i])) (i + 1)
structural recursion cannot be used
failed to prove termination, use `termination_by` to specify a well-founded relation
```

Nevertheless, this function terminates, so simply marking it `partial`

would be unfortunate.

Why does `arrayMapHelper`

terminate?
Each iteration checks whether the index `i`

is still in bounds for the array `arr`

.
If so, `i`

is incremented and the loop repeats.
If not, the program terminates.
Because `arr.size`

is a finite number, `i`

can be incremented only a finite number of times.
Even though no argument to the function decreases on each call, `arr.size - i`

decreases toward zero.

Lean can be instructed to use another expression for termination by providing a `termination_by`

clause at the end of a definition.
The `termination_by`

clause has two components: names for the function's arguments and an expression using those names that should decrease on each call.
For `arrayMapHelper`

, the final definition looks like this:

```
def arrayMapHelper (f : α → β) (arr : Array α) (soFar : Array β) (i : Nat) : Array β :=
if inBounds : i < arr.size then
arrayMapHelper f arr (soFar.push (f arr[i])) (i + 1)
else soFar
termination_by arrayMapHelper _ arr _ i _ => arr.size - i
```

A similar termination proof can be used to write `Array.find`

, a function that finds the first element in an array that satisfies a Boolean function and returns both the element and its index:

```
def Array.find (arr : Array α) (p : α → Bool) : Option (Nat × α) :=
findHelper arr p 0
```

Once again, the helper function terminates because `arr.size - i`

decreases as `i`

increases:

```
def findHelper (arr : Array α) (p : α → Bool) (i : Nat) : Option (Nat × α) :=
if h : i < arr.size then
let x := arr[i]
if p x then
some (i, x)
else findHelper arr p (i + 1)
else none
termination_by findHelper arr p i => arr.size - i
```

Not all termination arguments are as quite as simple as this one. However, the basic structure of identifying some expression based on the function's arguments that will decrease in each call occurs in all termination proofs. Sometimes, creativity can be required in order to figure out just why a function terminates, and sometimes Lean requires additional proofs in order to accept the termination argument.

## Exercises

- Implement a
`ForM (Array α)`

instance on arrays using a tail-recursive accumulator-passing function and a`termination_by`

clause. - Implement a function to reverse arrays using a tail-recursive accumulator-passing function that
*doesn't*need a`termination_by`

clause. - Reimplement
`Array.map`

,`Array.find`

, and the`ForM`

instance using`for ... in ...`

loops in the identity monad and compare the resulting code. - Reimplement array reversal using a
`for ... in ...`

loop in the identity monad. Compare it to the tail-recursive function.