Insertion Sort and Array Mutation
While insertion sort does not have the optimal worstcase time complexity for a sorting algorithm, it still has a number of useful properties:
 It is simple and straightforward to implement and understand
 It is an inplace algorithm, requiring no additional space to run
 It is a stable sort
 It is fast when the input is already almost sorted
Inplace algorithms are particularly useful in Lean due to the way it manages memory. In some cases, operations that would normally copy an array can be optimized into mutation. This includes swapping elements in an array.
Most languages and runtime systems with automatic memory management, including JavaScript, the JVM, and .NET, use tracing garbage collection. When memory needs to be reclaimed, the system starts at a number of roots (such as the call stack and global values) and then determines which values can be reached by recursively chasing pointers. Any values that can't be reached are deallocated, freeing memory.
Reference counting is an alternative to tracing garbage collection that is used by a number of languages, including Python, Swift, and Lean. In a system with reference counting, each object in memory has a field that tracks how many references there are to it. When a new reference is established, the counter is incremented. When a reference ceases to exist, the counter is decremented. When the counter reaches zero, the object is immediately deallocated.
Reference counting has one major disadvantage compared to a tracing garbage collector: circular references can lead to memory leaks. If object \( A \) references object \( B \) , and object \( B \) references object \( A \), they will never be deallocated, even if nothing else in the program references either \( A \) or \( B \). Circular references result either from uncontrolled recursion or from mutable references. Because Lean supports neither, it is impossible to construct circular references.
Reference counting means that the Lean runtime system's primitives for allocating and deallocating data structures can check whether a reference count is about to fall to zero, and reuse an existing object instead of allocating a new one. This is particularly important when working with large arrays.
An implementation of insertion sort for Lean arrays should satisfy the following criteria:
 Lean should accept the function without a
partial
annotation  If passed an array to which there are no other references, it should modify the array inplace rather than allocating a new one
The first criterion is easy to check: if Lean accepts the definition, then it is satisfied.
The second, however, requires a means of testing it.
Lean provides a builtin function called dbgTraceIfShared
with the following signature:
#check dbgTraceIfShared
dbgTraceIfShared.{u} {α : Type u} (s : String) (a : α) : α
It takes a string and a value as arguments, and prints a message that uses the string to standard error if the value has more than one reference, returning the value. This is not, strictly speaking, a pure function. However, it is intended to be used only during development to check that a function is in fact able to reuse memory rather than allocating and copying.
When learning to use dbgTraceIfShared
, it's important to know that #eval
will report that many more values are shared than in compiled code.
This can be confusing.
It's important to build an executable with lake
rather than experimenting in an editor.
Insertion sort consists of two loops. The outer loop moves a pointer from left to right across the array to be sorted. After each iteration, the region of the array to the left of the pointer is sorted, while the region to the right may not yet be sorted. The inner loop takes the element pointed to by the pointer and moves it to the left until the appropriate location has been found and the loop invariant has been restored. In other words, each iteration inserts the next element of the array into the appropriate location in the sorted region.
The Inner Loop
The inner loop of insertion sort can be implemented as a tailrecursive function that takes the array and the index of the element being inserted as arguments.
The element being inserted is repeatedly swapped with the element to its left until either the element to the left is smaller or the beginning of the array is reached.
The inner loop is structurally recursive on the Nat
that is inside the Fin
used to index into the array:
def insertSorted [Ord α] (arr : Array α) (i : Fin arr.size) : Array α :=
match i with
 ⟨0, _⟩ => arr
 ⟨i' + 1, _⟩ =>
have : i' < arr.size := by
simp [Nat.lt_of_succ_lt, *]
match Ord.compare arr[i'] arr[i] with
 .lt  .eq => arr
 .gt =>
insertSorted (arr.swap ⟨i', by assumption⟩ i) ⟨i', by simp [*]⟩
If the index i
is 0
, then the element being inserted into the sorted region has reached the beginning of the region and is the smallest.
If the index is i' + 1
, then the element at i'
should be compared to the element at i
.
Note that while i
is a Fin arr.size
, i'
is just a Nat
because it results from the val
field of i
.
It is thus necessary to prove that i' < arr.size
before i'
can be used to index into arr
.
Omitting the have
expression with the proof that i' < arr.size
reveals the following goal:
unsolved goals
α : Type ?u.7
inst✝ : Ord α
arr : Array α
i : Fin (Array.size arr)
i' : Nat
isLt✝ : i' + 1 < Array.size arr
⊢ i' < Array.size arr
The hint Nat.lt_of_succ_lt
is a theorem from Lean's standard library.
Its signature, found by #check Nat.lt_of_succ_lt
, is
Nat.lt_of_succ_lt {n m : Nat} (a✝ : Nat.succ n < m) : n < m
In other words, it states that if n + 1 < m
, then n < m
.
The *
passed to simp
causes it to combine Nat.lt_of_succ_lt
with the isLt
field from i
to get the final proof.
Having established that i'
can be used to look up the element to the left of the element being inserted, the two elements are looked up and compared.
If the element to the left is less than or equal to the element being inserted, then the loop is finished and the invariant has been restored.
If the element to the left is greater than the element being inserted, then the elements are swapped and the inner loop begins again.
Array.swap
takes both of its indices as Fin
s, and the by assumption
that establishes that i' < arr.size
makes use of the have
.
The index to be examined on the next round through the inner loop is also i'
, but by assumption
is not sufficient in this case.
This is because the proof was written for the original array arr
, not the result of swapping two elements.
The simp
tactic's database contains the fact that swapping two elements of an array doesn't change its size, and the [*]
argument instructs it to additionally use the assumption introduced by have
.
The Outer Loop
The outer loop of insertion sort moves the pointer from left to right, invoking insertSorted
at each iteration to insert the element at the pointer into the correct position in the array.
The basic form of the loop resembles the implementation of Array.map
:
def insertionSortLoop [Ord α] (arr : Array α) (i : Nat) : Array α :=
if h : i < arr.size then
insertionSortLoop (insertSorted arr ⟨i, h⟩) (i + 1)
else
arr
The resulting error is also the same as the error that occurs without a termination_by
clause on Array.map
, because there is no argument that decreases at every recursive call:
fail to show termination for
insertionSortLoop
with errors
argument #4 was not used for structural recursion
failed to eliminate recursive application
insertionSortLoop (insertSorted arr { val := i, isLt := h }) (i + 1)
structural recursion cannot be used
failed to prove termination, use `termination_by` to specify a wellfounded relation
Before constructing the termination proof, it can be convenient to test the definition with a partial
modifier to make sure that it returns the expected answers:
partial def insertionSortLoop [Ord α] (arr : Array α) (i : Nat) : Array α :=
if h : i < arr.size then
insertionSortLoop (insertSorted arr ⟨i, h⟩) (i + 1)
else
arr
#eval insertionSortLoop #[5, 17, 3, 8] 0
#[3, 5, 8, 17]
#eval insertionSortLoop #["metamorphic", "igneous", "sedentary"] 0
#["igneous", "metamorphic", "sedentary"]
Termination
Once again, the function terminates because the difference between the index and the size of the array being processed decreases on each recursive call.
This time, however, Lean does not accept the termination_by
:
def insertionSortLoop [Ord α] (arr : Array α) (i : Nat) : Array α :=
if h : i < arr.size then
insertionSortLoop (insertSorted arr ⟨i, h⟩) (i + 1)
else
arr
termination_by insertionSortLoop arr i => arr.size  i
failed to prove termination, possible solutions:
 Use `have`expressions to prove the remaining goals
 Use `termination_by` to specify a different wellfounded relation
 Use `decreasing_by` to specify your own tactic for discharging this kind of goal
α : Type u_1
inst✝ : Ord α
arr : Array α
i : Nat
h : i < Array.size arr
⊢ Array.size (insertSorted arr { val := i, isLt := h })  (i + 1) < Array.size arr  i
The problem is that Lean has no way to know that insertSorted
returns an array that's the same size as the one it is passed.
In order to prove that insertionSortLoop
terminates, it is necessary to first prove that insertSorted
doesn't change the size of the array.
Copying the unproved termination condition from the error message to the function and "proving" it with sorry
allows the function to be temporarily accepted:
def insertionSortLoop [Ord α] (arr : Array α) (i : Nat) : Array α :=
if h : i < arr.size then
have : (insertSorted arr ⟨i, h⟩).size  (i + 1) < arr.size  i := by
sorry
insertionSortLoop (insertSorted arr ⟨i, h⟩) (i + 1)
else
arr
termination_by insertionSortLoop arr i => arr.size  i
declaration uses 'sorry'
Because insertSorted
is structurally recursive on the index of the element being inserted, the proof should be by induction on the index.
In the base case, the array is returned unchanged, so its length certainly does not change.
For the inductive step, the induction hypothesis is that a recursive call on the next smaller index will not change the length of the array.
There are two cases two consider: either the element has been fully inserted into the sorted region and the array is returned unchanged, in which case the length is also unchanged, or the element is swapped with the next one before the recursive call.
However, swapping two elements in an array doesn't change the size of it, and the induction hypothesis states that the recursive call with the next index returns an array that's the same size as its argument.
Thus, the size remains unchanged.
Translating this Englishlanguage theorem statement to Lean and proceeding using the techniques from this chapter is enough to prove the base case and make progress in the inductive step:
theorem insert_sorted_size_eq [Ord α] (arr : Array α) (i : Fin arr.size) :
(insertSorted arr i).size = arr.size := by
match i with
 ⟨j, isLt⟩ =>
induction j with
 zero => simp [insertSorted]
 succ j' ih =>
simp [insertSorted]
The simplification using insertSorted
in the inductive step revealed the pattern match in insertSorted
:
unsolved goals
case succ
α : Type u_1
inst✝ : Ord α
arr : Array α
i : Fin (Array.size arr)
j' : Nat
ih : ∀ (isLt : j' < Array.size arr), Array.size (insertSorted arr { val := j', isLt := isLt }) = Array.size arr
isLt : Nat.succ j' < Array.size arr
⊢ Array.size
(match compare arr[j'] arr[{ val := Nat.succ j', isLt := isLt }] with
 Ordering.lt => arr
 Ordering.eq => arr
 Ordering.gt =>
insertSorted
(Array.swap arr { val := j', isLt := (_ : j' < Array.size arr) } { val := Nat.succ j', isLt := isLt })
{ val := j',
isLt :=
(_ :
j' <
Array.size
(Array.swap arr { val := j', isLt := (_ : j' < Array.size arr) }
{ val := Nat.succ j', isLt := isLt })) }) =
Array.size arr
When faced with a goal that includes if
or match
, the split
tactic (not to be confused with the split
function used in the definition of merge sort) replaces the goal with one new goal for each path of control flow:
theorem insert_sorted_size_eq [Ord α] (arr : Array α) (i : Fin arr.size) :
(insertSorted arr i).size = arr.size := by
match i with
 ⟨j, isLt⟩ =>
induction j with
 zero => simp [insertSorted]
 succ j' ih =>
simp [insertSorted]
split
Additionally, each new goal has an assumption that indicates which branch led to that goal, named heq✝
in this case:
unsolved goals
case succ.h_1
α : Type u_1
inst✝ : Ord α
arr : Array α
i : Fin (Array.size arr)
j' : Nat
ih : ∀ (isLt : j' < Array.size arr), Array.size (insertSorted arr { val := j', isLt := isLt }) = Array.size arr
isLt : Nat.succ j' < Array.size arr
x✝ : Ordering
heq✝ : compare arr[j'] arr[{ val := Nat.succ j', isLt := isLt }] = Ordering.lt
⊢ Array.size arr = Array.size arr
case succ.h_2
α : Type u_1
inst✝ : Ord α
arr : Array α
i : Fin (Array.size arr)
j' : Nat
ih : ∀ (isLt : j' < Array.size arr), Array.size (insertSorted arr { val := j', isLt := isLt }) = Array.size arr
isLt : Nat.succ j' < Array.size arr
x✝ : Ordering
heq✝ : compare arr[j'] arr[{ val := Nat.succ j', isLt := isLt }] = Ordering.eq
⊢ Array.size arr = Array.size arr
case succ.h_3
α : Type u_1
inst✝ : Ord α
arr : Array α
i : Fin (Array.size arr)
j' : Nat
ih : ∀ (isLt : j' < Array.size arr), Array.size (insertSorted arr { val := j', isLt := isLt }) = Array.size arr
isLt : Nat.succ j' < Array.size arr
x✝ : Ordering
heq✝ : compare arr[j'] arr[{ val := Nat.succ j', isLt := isLt }] = Ordering.gt
⊢ Array.size
(insertSorted
(Array.swap arr { val := j', isLt := (_ : j' < Array.size arr) } { val := Nat.succ j', isLt := isLt })
{ val := j',
isLt :=
(_ :
j' <
Array.size
(Array.swap arr { val := j', isLt := (_ : j' < Array.size arr) }
{ val := Nat.succ j', isLt := isLt })) }) =
Array.size arr
Rather than write proofs for both simple cases, adding <;> try rfl
after split
causes the two straightforward cases to disappear immediately, leaving only a single goal:
theorem insert_sorted_size_eq [Ord α] (arr : Array α) (i : Fin arr.size) :
(insertSorted arr i).size = arr.size := by
match i with
 ⟨j, isLt⟩ =>
induction j with
 zero => simp [insertSorted]
 succ j' ih =>
simp [insertSorted]
split <;> try rfl
unsolved goals
case succ.h_3
α : Type u_1
inst✝ : Ord α
arr : Array α
i : Fin (Array.size arr)
j' : Nat
ih : ∀ (isLt : j' < Array.size arr), Array.size (insertSorted arr { val := j', isLt := isLt }) = Array.size arr
isLt : Nat.succ j' < Array.size arr
x✝ : Ordering
heq✝ : compare arr[j'] arr[{ val := Nat.succ j', isLt := isLt }] = Ordering.gt
⊢ Array.size
(insertSorted
(Array.swap arr { val := j', isLt := (_ : j' < Array.size arr) } { val := Nat.succ j', isLt := isLt })
{ val := j',
isLt :=
(_ :
j' <
Array.size
(Array.swap arr { val := j', isLt := (_ : j' < Array.size arr) }
{ val := Nat.succ j', isLt := isLt })) }) =
Array.size arr
Unfortunately, the induction hypothesis is not strong enough to prove this goal.
The induction hypothesis states that calling insertSorted
on arr
leaves the size unchanged, but the proof goal is to show that the result of the recursive call with the result of swapping leaves the size unchanged.
Successfully completing the proof requires an induction hypothesis that works for any array that is passed to insertSorted
together with the smaller index as an argument
It is possible to get a strong induction hypothesis by using the generalizing
option to the induction
tactic.
This option brings additional assumptions from the context into the statement that's used to generate the base case, the induction hypothesis, and the goal to be shown in the inductive step.
Generalizing over arr
leads to a stronger hypothesis:
theorem insert_sorted_size_eq [Ord α] (arr : Array α) (i : Fin arr.size) :
(insertSorted arr i).size = arr.size := by
match i with
 ⟨j, isLt⟩ =>
induction j generalizing arr with
 zero => simp [insertSorted]
 succ j' ih =>
simp [insertSorted]
split <;> try rfl
In the resulting goal, arr
is now part of a "for all" statement in the inductive hypothesis:
unsolved goals
case succ.h_3
α : Type u_1
inst✝ : Ord α
j' : Nat
ih :
∀ (arr : Array α),
Fin (Array.size arr) →
∀ (isLt : j' < Array.size arr), Array.size (insertSorted arr { val := j', isLt := isLt }) = Array.size arr
arr : Array α
i : Fin (Array.size arr)
isLt : Nat.succ j' < Array.size arr
x✝ : Ordering
heq✝ : compare arr[j'] arr[{ val := Nat.succ j', isLt := isLt }] = Ordering.gt
⊢ Array.size
(insertSorted
(Array.swap arr { val := j', isLt := (_ : j' < Array.size arr) } { val := Nat.succ j', isLt := isLt })
{ val := j',
isLt :=
(_ :
j' <
Array.size
(Array.swap arr { val := j', isLt := (_ : j' < Array.size arr) }
{ val := Nat.succ j', isLt := isLt })) }) =
Array.size arr
However, this whole proof is beginning to get unmanageable.
The next step would be to introduce a variable standing for the length of the result of swapping, show that it is equal to arr.size
, and then show that this variable is also equal to the length of the array that results from the recursive call.
These equality statement can then be chained together to prove the goal.
It's much easier, however, to carefully reformulate the theorem statement such that the induction hypothesis is automatically strong enough and the variables are already introduced.
The reformulated statement reads:
theorem insert_sorted_size_eq [Ord α] (len : Nat) (i : Nat) :
(arr : Array α) → (isLt : i < arr.size) → arr.size = len →
(insertSorted arr ⟨i, isLt⟩).size = len := by
skip
This version of the theorem statement is easier to prove for a few reasons:
 Rather than bundling up the index and the proof of its validity in a
Fin
, the index comes before the array. This allows the induction hypothesis to naturally generalize over the array and the proof thati
is in bounds.  An abstract length
len
is introduced to stand forarray.size
. Proof automation is often better at working with explicit statements of equality.
The resulting proof state shows the statement that will be used to generate the induction hypothesis, as well as the base case and the goal of the inductive step:
unsolved goals
α : Type u_1
inst✝ : Ord α
len i : Nat
⊢ ∀ (arr : Array α) (isLt : i < Array.size arr),
Array.size arr = len → Array.size (insertSorted arr { val := i, isLt := isLt }) = len
Compare the statement with the goals that result from the induction
tactic:
theorem insert_sorted_size_eq [Ord α] (len : Nat) (i : Nat) :
(arr : Array α) → (isLt : i < arr.size) → arr.size = len →
(insertSorted arr ⟨i, isLt⟩).size = len := by
induction i with
 zero => skip
 succ i' ih => skip
In the base case, each occurrence of i
has been replaced by 0
.
Using intro
to introduce each assumption and then simplifying using insertSorted
will prove the goal, because insertSorted
at index zero
returns its argument unchanged:
unsolved goals
case zero
α : Type u_1
inst✝ : Ord α
len : Nat
⊢ ∀ (arr : Array α) (isLt : Nat.zero < Array.size arr),
Array.size arr = len → Array.size (insertSorted arr { val := Nat.zero, isLt := isLt }) = len
In the inductive step, the induction hypothesis has exactly the right strength.
It will be useful for any array, so long as that array has length len
:
unsolved goals
case succ
α : Type u_1
inst✝ : Ord α
len i' : Nat
ih :
∀ (arr : Array α) (isLt : i' < Array.size arr),
Array.size arr = len → Array.size (insertSorted arr { val := i', isLt := isLt }) = len
⊢ ∀ (arr : Array α) (isLt : Nat.succ i' < Array.size arr),
Array.size arr = len → Array.size (insertSorted arr { val := Nat.succ i', isLt := isLt }) = len
In the base case, simp
reduces the goal to arr.size = len
:
theorem insert_sorted_size_eq [Ord α] (len : Nat) (i : Nat) :
(arr : Array α) → (isLt : i < arr.size) → arr.size = len →
(insertSorted arr ⟨i, isLt⟩).size = len := by
induction i with
 zero =>
intro arr isLt hLen
simp [insertSorted]
 succ i' ih => skip
unsolved goals
case zero
α : Type u_1
inst✝ : Ord α
len : Nat
arr : Array α
isLt : Nat.zero < Array.size arr
hLen : Array.size arr = len
⊢ Array.size arr = len
This can be proved using the assumption hLen
.
Adding the *
parameter to simp
instructs it to additionally use assumptions, which solves the goal:
theorem insert_sorted_size_eq [Ord α] (len : Nat) (i : Nat) :
(arr : Array α) → (isLt : i < arr.size) → arr.size = len →
(insertSorted arr ⟨i, isLt⟩).size = len := by
induction i with
 zero =>
intro arr isLt hLen
simp [insertSorted, *]
 succ i' ih => skip
In the inductive step, introducing assumptions and simplifying the goal results once again in a goal that contains a pattern match:
theorem insert_sorted_size_eq [Ord α] (len : Nat) (i : Nat) :
(arr : Array α) → (isLt : i < arr.size) → (arr.size = len) →
(insertSorted arr ⟨i, isLt⟩).size = len := by
induction i with
 zero =>
intro arr isLt hLen
simp [insertSorted, *]
 succ i' ih =>
intro arr isLt hLen
simp [insertSorted]
unsolved goals
case succ
α : Type u_1
inst✝ : Ord α
len i' : Nat
ih :
∀ (arr : Array α) (isLt : i' < Array.size arr),
Array.size arr = len → Array.size (insertSorted arr { val := i', isLt := isLt }) = len
arr : Array α
isLt : Nat.succ i' < Array.size arr
hLen : Array.size arr = len
⊢ Array.size
(match compare arr[i'] arr[{ val := Nat.succ i', isLt := isLt }] with
 Ordering.lt => arr
 Ordering.eq => arr
 Ordering.gt =>
insertSorted
(Array.swap arr { val := i', isLt := (_ : i' < Array.size arr) } { val := Nat.succ i', isLt := isLt })
{ val := i',
isLt :=
(_ :
i' <
Array.size
(Array.swap arr { val := i', isLt := (_ : i' < Array.size arr) }
{ val := Nat.succ i', isLt := isLt })) }) =
len
Using the split
tactic results in one goal for each pattern.
Once again, the first two goals result from branches without recursive calls, so the induction hypothesis is not necessary:
theorem insert_sorted_size_eq [Ord α] (len : Nat) (i : Nat) :
(arr : Array α) → (isLt : i < arr.size) → (arr.size = len) →
(insertSorted arr ⟨i, isLt⟩).size = len := by
induction i with
 zero =>
intro arr isLt hLen
simp [insertSorted, *]
 succ i' ih =>
intro arr isLt hLen
simp [insertSorted]
split
unsolved goals
case succ.h_1
α : Type u_1
inst✝ : Ord α
len i' : Nat
ih :
∀ (arr : Array α) (isLt : i' < Array.size arr),
Array.size arr = len → Array.size (insertSorted arr { val := i', isLt := isLt }) = len
arr : Array α
isLt : Nat.succ i' < Array.size arr
hLen : Array.size arr = len
x✝ : Ordering
heq✝ : compare arr[i'] arr[{ val := Nat.succ i', isLt := isLt }] = Ordering.lt
⊢ Array.size arr = len
case succ.h_2
α : Type u_1
inst✝ : Ord α
len i' : Nat
ih :
∀ (arr : Array α) (isLt : i' < Array.size arr),
Array.size arr = len → Array.size (insertSorted arr { val := i', isLt := isLt }) = len
arr : Array α
isLt : Nat.succ i' < Array.size arr
hLen : Array.size arr = len
x✝ : Ordering
heq✝ : compare arr[i'] arr[{ val := Nat.succ i', isLt := isLt }] = Ordering.eq
⊢ Array.size arr = len
case succ.h_3
α : Type u_1
inst✝ : Ord α
len i' : Nat
ih :
∀ (arr : Array α) (isLt : i' < Array.size arr),
Array.size arr = len → Array.size (insertSorted arr { val := i', isLt := isLt }) = len
arr : Array α
isLt : Nat.succ i' < Array.size arr
hLen : Array.size arr = len
x✝ : Ordering
heq✝ : compare arr[i'] arr[{ val := Nat.succ i', isLt := isLt }] = Ordering.gt
⊢ Array.size
(insertSorted
(Array.swap arr { val := i', isLt := (_ : i' < Array.size arr) } { val := Nat.succ i', isLt := isLt })
{ val := i',
isLt :=
(_ :
i' <
Array.size
(Array.swap arr { val := i', isLt := (_ : i' < Array.size arr) }
{ val := Nat.succ i', isLt := isLt })) }) =
len
Running try assumption
in each goal that results from split
eliminates both of the nonrecursive goals:
theorem insert_sorted_size_eq [Ord α] (len : Nat) (i : Nat) :
(arr : Array α) → (isLt : i < arr.size) → (arr.size = len) →
(insertSorted arr ⟨i, isLt⟩).size = len := by
induction i with
 zero =>
intro arr isLt hLen
simp [insertSorted, *]
 succ i' ih =>
intro arr isLt hLen
simp [insertSorted]
split <;> try assumption
unsolved goals
case succ.h_3
α : Type u_1
inst✝ : Ord α
len i' : Nat
ih :
∀ (arr : Array α) (isLt : i' < Array.size arr),
Array.size arr = len → Array.size (insertSorted arr { val := i', isLt := isLt }) = len
arr : Array α
isLt : Nat.succ i' < Array.size arr
hLen : Array.size arr = len
x✝ : Ordering
heq✝ : compare arr[i'] arr[{ val := Nat.succ i', isLt := isLt }] = Ordering.gt
⊢ Array.size
(insertSorted
(Array.swap arr { val := i', isLt := (_ : i' < Array.size arr) } { val := Nat.succ i', isLt := isLt })
{ val := i',
isLt :=
(_ :
i' <
Array.size
(Array.swap arr { val := i', isLt := (_ : i' < Array.size arr) }
{ val := Nat.succ i', isLt := isLt })) }) =
len
The new formulation of the proof goal, in which a constant len
is used for the lengths of all the arrays involved in the recursive function, falls nicely within the kinds of problems that simp
can solve.
This final proof goal can be solved by simp [*]
, because the assumptions that relate the array's length to len
are important:
theorem insert_sorted_size_eq [Ord α] (len : Nat) (i : Nat) :
(arr : Array α) → (isLt : i < arr.size) → (arr.size = len) →
(insertSorted arr ⟨i, isLt⟩).size = len := by
induction i with
 zero =>
intro arr isLt hLen
simp [insertSorted, *]
 succ i' ih =>
intro arr isLt hLen
simp [insertSorted]
split <;> try assumption
simp [*]
Finally, because simp [*]
can use assumptions, the try assumption
line can be replaced by simp [*]
, shortening the proof:
theorem insert_sorted_size_eq [Ord α] (len : Nat) (i : Nat) :
(arr : Array α) → (isLt : i < arr.size) → (arr.size = len) →
(insertSorted arr ⟨i, isLt⟩).size = len := by
induction i with
 zero =>
intro arr isLt hLen
simp [insertSorted, *]
 succ i' ih =>
intro arr isLt hLen
simp [insertSorted]
split <;> simp [*]
This proof can now be used to replace the sorry
in insertionSortLoop
.
Providing arr.size
as the len
argument to the theorem causes the final conclusion to be (insertSorted arr ⟨i, isLt⟩).size = arr.size
, so the rewrite ends with a very manageable proof goal:
def insertionSortLoop [Ord α] (arr : Array α) (i : Nat) : Array α :=
if h : i < arr.size then
have : (insertSorted arr ⟨i, h⟩).size  (i + 1) < arr.size  i := by
rw [insert_sorted_size_eq arr.size i arr h rfl]
insertionSortLoop (insertSorted arr ⟨i, h⟩) (i + 1)
else
arr
termination_by insertionSortLoop arr i => arr.size  i
unsolved goals
α : Type ?u.22342
inst✝ : Ord α
arr : Array α
i : Nat
h : i < Array.size arr
⊢ Array.size arr  (i + 1) < Array.size arr  i
The proof Nat.sub_succ_lt_self
is part of Lean's standard library.
It's type is ∀ (a i : Nat), i < a → a  (i + 1) < a  i
, which is exactly what's needed:
def insertionSortLoop [Ord α] (arr : Array α) (i : Nat) : Array α :=
if h : i < arr.size then
have : (insertSorted arr ⟨i, h⟩).size  (i + 1) < arr.size  i := by
rw [insert_sorted_size_eq arr.size i arr h rfl]
simp [Nat.sub_succ_lt_self, *]
insertionSortLoop (insertSorted arr ⟨i, h⟩) (i + 1)
else
arr
termination_by insertionSortLoop arr i => arr.size  i
The Driver Function
Insertion sort itself calls insertionSortLoop
, initializing the index that demarcates the sorted region of the array from the unsorted region to 0
:
def insertionSort [Ord α] (arr : Array α) : Array α :=
insertionSortLoop arr 0
A few quick tests show the function is at least not blatantly wrong:
#eval insertionSort #[3, 1, 7, 4]
#[1, 3, 4, 7]
#eval insertionSort #[ "quartz", "marble", "granite", "hematite"]
#["granite", "hematite", "marble", "quartz"]
Is This Really Insertion Sort?
Insertion sort is defined to be an inplace sorting algorithm. What makes it useful, despite its quadratic worstcase run time, is that it is a stable sorting algorithm that doesn't allocate extra space and that handles almostsorted data efficiently. If each iteration of the inner loop allocated a new array, then the algorithm wouldn't really be insertion sort.
Lean's array operations, such as Array.set
and Array.swap
, check whether the array in question has a reference count that is greater than one.
If so, then the array is visible to multiple parts of the code, which means that it must be copied.
Otherwise, Lean would no longer be a pure functional language.
However, when the reference count is exactly one, there are no other potential observers of the value.
In these cases, the array primitives mutate the array in place.
What other parts of the program don't know can't hurt them.
Lean's proof logic works at the level of pure functional programs, not the underlying implementation.
This means that the best way to discover whether a program unnecessarily copies data is to test it.
Adding calls to dbgTraceIfShared
at each point where mutation is desired causes the provided message to be printed to stderr
when the value in question has more than one reference.
Insertion sort has precisely one place that is at risk of copying rather than mutating: the call to Array.swap
.
Replacing arr.swap ⟨i', by assumption⟩ i
with ((dbgTraceIfShared "array to swap" arr).swap ⟨i', by assumption⟩ i)
causes the program to emit shared RC array to swap
whenever it is unable to mutate the array.
However, this change to the program changes the proofs as well, because now there's a call to an additional function.
Because dbgTraceIfShared
returns its second argument directly, adding it to the calls to simp
is enough to fix the proofs.
The complete instrumented code for insertion sort is:
def insertSorted [Ord α] (arr : Array α) (i : Fin arr.size) : Array α :=
match i with
 ⟨0, _⟩ => arr
 ⟨i' + 1, _⟩ =>
have : i' < arr.size := by
simp [Nat.lt_of_succ_lt, *]
match Ord.compare arr[i'] arr[i] with
 .lt  .eq => arr
 .gt =>
insertSorted
((dbgTraceIfShared "array to swap" arr).swap ⟨i', by assumption⟩ i)
⟨i', by simp [dbgTraceIfShared, *]⟩
theorem insert_sorted_size_eq [Ord α] (len : Nat) (i : Nat) :
(arr : Array α) → (isLt : i < arr.size) → (arr.size = len) →
(insertSorted arr ⟨i, isLt⟩).size = len := by
induction i with
 zero =>
intro arr isLt hLen
simp [insertSorted, *]
 succ i' ih =>
intro arr isLt hLen
simp [insertSorted, dbgTraceIfShared]
split <;> simp [*]
def insertionSortLoop [Ord α] (arr : Array α) (i : Nat) : Array α :=
if h : i < arr.size then
have : (insertSorted arr ⟨i, h⟩).size  (i + 1) < arr.size  i := by
rw [insert_sorted_size_eq arr.size i arr h rfl]
simp [Nat.sub_succ_lt_self, *]
insertionSortLoop (insertSorted arr ⟨i, h⟩) (i + 1)
else
arr
termination_by insertionSortLoop arr i => arr.size  i
def insertionSort [Ord α] (arr : Array α) : Array α :=
insertionSortLoop arr 0
A bit of cleverness is required to check whether the instrumentation actually works.
First off, the Lean compiler aggressively optimizes function calls away when all their arguments are known at compile time.
Simply writing a program that applies insertionSort
to a large array is not sufficient, because the resulting compiled code may contain only the sorted array as a constant.
The easiest way to ensure that the compiler doesn't optimize away the sorting routine is to read the array from stdin
.
Secondly, the compiler performs dead code elimination.
Adding extra let
s to the program won't necessarily result in more references in running code if the let
bound variables are never used.
To ensure that the extra reference is not eliminated entirely, it's important to ensure that the extra reference is somehow used.
The first step in testing the instrumentation is to write getLines
, which reads an array of lines from standard input:
def getLines : IO (Array String) := do
let stdin ← IO.getStdin
let mut lines : Array String := #[]
let mut currLine ← stdin.getLine
while !currLine.isEmpty do
 Drop trailing newline:
lines := lines.push (currLine.dropRight 1)
currLine ← stdin.getLine
pure lines
IO.FS.Stream.getLine
returns a complete line of text, including the trailing newline.
It returns ""
when the endoffile marker has been reached.
Next, two separate main
routines are needed.
Both read the array to be sorted from standard input, ensuring that the calls to insertionSort
won't be replaced by their return values at compile time.
Both then print to the console, ensuring that the calls to insertionSort
won't be optimized away entirely.
One of them prints only the sorted array, while the other prints both the sorted array and the original array.
The second function should trigger a warning that Array.swap
had to allocate a new array:
def mainUnique : IO Unit := do
let lines ← getLines
for line in insertionSort lines do
IO.println line
def mainShared : IO Unit := do
let lines ← getLines
IO.println " Sorted lines: "
for line in insertionSort lines do
IO.println line
IO.println ""
IO.println " Original data: "
for line in lines do
IO.println line
The actual main
simply selects one of the two main actions based on the provided commandline arguments:
def main (args : List String) : IO UInt32 := do
match args with
 ["shared"] => mainShared; pure 0
 ["unique"] => mainUnique; pure 0
 _ =>
IO.println "Expected single argument, either \"shared\" or \"unique\""
pure 1
Running it with no arguments produces the expected usage information:
$ sort
Expected single argument, either "shared" or "unique"
The file testdata
contains the following rocks:
schist
feldspar
diorite
pumice
obsidian
shale
gneiss
marble
flint
Using the instrumented insertion sort on these rocks results them being printed in alphabetical order:
$ sort unique < testdata
diorite
feldspar
flint
gneiss
marble
obsidian
pumice
schist
shale
However, the version in which a reference is retained to the original array results in a notification on stderr
(namely, shared RC array to swap
) from the first call to Array.swap
:
$ sort shared < testdata
shared RC array to swap
 Sorted lines: 
diorite
feldspar
flint
gneiss
marble
obsidian
pumice
schist
shale
 Original data: 
schist
feldspar
diorite
pumice
obsidian
shale
gneiss
marble
flint
The fact that only a single shared RC
notification appears means that the array is copied only once.
This is because the copy that results from the call to Array.swap
is itself unique, so no further copies need to be made.
In an imperative language, subtle bugs can result from forgetting to explicitly copy an array before passing it by reference.
When running sort shared
, the array is copied as needed to preserve the pure functional meaning of Lean programs, but no more.
Other Opportunities for Mutation
The use of mutation instead of copying when references are unique is not limited to array update operators.
Lean also attempts to "recycle" constructors whose reference counts are about to fall to zero, reusing them instead of allocating new data.
This means, for instance, that List.map
will mutate a linked list in place, at least in cases when nobody could possibly notice.
One of the most important steps in optimizing hot loops in Lean code is making sure that the data being modified is not referred to from multiple locations.
Exercises

Write a function that reverses arrays. Test that if the input array has a reference count of one, then your function does not allocate a new array.

Implement either merge sort or quicksort for arrays. Prove that your implementation terminates, and test that it doesn't allocate more arrays than expected. This is a challenging exercise!