# Arrays and Indexing

The Interlude describes how to use indexing notation in order to look up entries in a list by their position. This syntax is also governed by a type class, and it can be used for a variety of different types.

## Arrays

For instance, Lean arrays are much more efficient than linked lists for most purposes.
In Lean, the type `Array α`

is a dynamically-sized array holding values of type `α`

, much like a Java `ArrayList`

, a C++ `std::vector`

, or a Rust `Vec`

.
Unlike `List`

, which has a pointer indirection on each use of the `cons`

constructor, arrays occupy a contiguous region of memory, which is much better for processor caches.
Also, looking up a value in an array takes constant time, while lookup in a linked list takes time proportional to the index being accessed.

In pure functional languages like Lean, it is not possible to mutate a given position in a data structure. Instead, a copy is made that has the desired modifications. When using an array, the Lean compiler and runtime contain an optimizations that can allow modifications to be implemented as mutations behind the scenes when there is only a single unique reference to an array, while a list would require copies of all prior nodes to be made.

Arrays are written similarly to lists, but with a leading `#`

:

```
def northernTrees : Array String :=
#["beech", "birch", "elm", "oak"]
```

The number of values in an array can be found using `Array.size`

.
For instance, `northernTrees.size`

evaluates to `4`

.
For indices that are smaller than an array's size, indexing notation can be used to find the corresponding value, just as with lists.
That is, `northernTrees[2]`

evaluates to `"elm"`

.
Similarly, the compiler requires a proof that an index is in bounds, and attempting to look up a value outside the bounds of the array results in a compile-time error, just as with lists.
For instance, `northernTrees[8]`

results in:

```
failed to prove index is valid, possible solutions:
- Use `have`-expressions to prove the index is valid
- Use `a[i]!` notation instead, runtime check is perfomed, and 'Panic' error message is produced if index is not valid
- Use `a[i]?` notation instead, result is an `Option` type
- Use `a[i]'h` notation instead, where `h` is a proof that index is valid
⊢ 8 < Array.size northernTrees
```

## Non-Empty Lists

A datatype that represents non-empty lists can be defined as a structure with a field for the head of the list and a field for the tail, which is an ordinary, potentially empty list:

```
structure NonEmptyList (α : Type) : Type where
head : α
tail : List α
```

For example, the non-empty list `idahoSpiders`

(which contains some spider species native to the US state of Idaho) consists of `"Banded Garden Spider"`

followed by four other spiders, for a total of five spiders.

Looking up the value at a specific index in this list with a recursive function should consider three possibilities:

- The index is
`0`

, in which case the head of the list should be returned. - The index is
`n + 1`

and the tail is empty, in which case the index is out of bounds. - The index is
`n + 1`

and the tail is non-empty, in which case the function can be called recursively on the tail and`n`

.

For example, a lookup function that returns an `Option`

can be written as follows:

```
def NonEmptyList.get? : NonEmptyList α → Nat → Option α
| xs, 0 => some xs.head
| {head := _, tail := []}, _ + 1 => none
| {head := _, tail := h :: t}, n + 1 => get? {head := h, tail := t} n
```

Each case in the pattern match corresponds to one of the possibilities above.
The recursive call to `get?`

does not require a `NonEmptyList`

namespace qualifier because the body of the definition is implicitly in the definition's namespace.
Another way to write this function uses `get?`

for lists when the index is greater than zero:

```
def NonEmptyList.get? : NonEmptyList α → Nat → Option α
| xs, 0 => some xs.head
| xs, n + 1 => xs.tail.get? n
```

If the list contains one entry, then only `0`

is a valid index.
If it contains two entries, then both `0`

and `1`

are valid indices.
If it contains three entries, then `0`

, `1`

, and `2`

are valid indices.
In other words, the valid indices into a non-empty list are natural numbers that are strictly less than the length of the list, which are less than or equal to the length of the tail.

The definition of what it means for an index to be in bounds should be written as an `abbrev`

because the tactics used to find evidence that indices are acceptable are able to solve inequalities of numbers, but they don't know anything about the name `NonEmptyList.inBounds`

:

```
abbrev NonEmptyList.inBounds (xs : NonEmptyList α) (i : Nat) : Prop :=
i ≤ xs.tail.length
```

This function returns a proposition that might be true or false.
For instance, `2`

is in bounds for `idahoSpiders`

, while `5`

is not:

```
theorem atLeastThreeSpiders : idahoSpiders.inBounds 2 := by simp
theorem notSixSpiders : ¬idahoSpiders.inBounds 5 := by simp
```

The logical negation operator has a very low precedence, which means that `¬idahoSpiders.inBounds 5`

is equivalent to `¬(idahoSpiders.inBounds 5)`

.

This fact can be used to write a lookup function that requires evidence that the index is valid, and thus need not return `Option`

, by delegating to the version for lists that checks the evidence at compile time:

```
def NonEmptyList.get (xs : NonEmptyList α) (i : Nat) (ok : xs.inBounds i) : α :=
match i with
| 0 => xs.head
| n + 1 => xs.tail[n]
```

It is, of course, possible to write this function to use the evidence directly, rather than delegating to a standard library function that happens to be able to use the same evidence. This requires techniques for working with proofs and propositions that are described later in this book.

## Overloading Indexing

Indexing notation for a collection type can be overloaded by defining an instance of the `GetElem`

type class.
For the sake of flexiblity, `GetElem`

has four parameters:

- The type of the collection
- The type of the index
- The type of elements that are extracted from the collection
- A function that determines what counts as evidence that the index is in bounds

The element type and the evidence function are both output parameters.
`GetElem`

has a single method, `getElem`

, which takes a collection value, an index value, and evidence that the index is in bounds as arguments, and returns an element:

```
class GetElem (coll : Type) (idx : Type) (item : outParam Type) (inBounds : outParam (coll → idx → Prop)) where
getElem : (c : coll) → (i : idx) → inBounds c i → item
```

In the case of `NonEmptyList α`

, these parameters are:

- The collection is
`NonEmptyList α`

- Indices have type
`Nat`

- The type of elements is
`α`

- An index is in bounds if it is less than or equal to the length of the tail

In fact, the `GetElem`

instance can delegate directly to `NonEmptyList.get`

:

```
instance : GetElem (NonEmptyList α) Nat α NonEmptyList.inBounds where
getElem := NonEmptyList.get
```

With this instance, `NonEmptyList`

becomes just as convenient to use as `List`

.
Evaluating `idahoSpiders[0]`

yields `"Banded Garden Spider"`

, while `idahoSpiders[9]`

leads to the compile-time error:

```
failed to prove index is valid, possible solutions:
- Use `have`-expressions to prove the index is valid
- Use `a[i]!` notation instead, runtime check is perfomed, and 'Panic' error message is produced if index is not valid
- Use `a[i]?` notation instead, result is an `Option` type
- Use `a[i]'h` notation instead, where `h` is a proof that index is valid
⊢ NonEmptyList.inBounds idahoSpiders 9
```

Because both the collection type and the index type are input parameters to the `GetElem`

type class, new types can be used to index into existing collections.
The positive number type `Pos`

is a perfectly reasonable index into a `List`

, with the caveat that it cannot point at the first entry.
The follow instance of `GetElem`

allows `Pos`

to be used just as conveniently as `Nat`

to find a list entry:

```
instance : GetElem (List α) Pos α (fun list n => list.length > n.toNat) where
getElem (xs : List α) (i : Pos) ok := xs[i.toNat]
```

Indexing can also make sense for non-numeric indices.
For example, `Bool`

can be used to select between the fields in a point, with `false`

corresponding to `x`

and `true`

corresponding to `y`

:

```
instance : GetElem (PPoint α) Bool α (fun _ _ => True) where
getElem (p : PPoint α) (i : Bool) _ :=
if not i then p.x else p.y
```

In this case, both Booleans are valid indices.
Because every possible `Bool`

is in bounds, the evidence is simply the true proposition `True`

.