# Applicative Functors

An *applicative functor* is a functor that has two additional operations available: `pure`

and `seq`

.
`pure`

is the same operator used in `Monad`

, because `Monad`

in fact inherits from `Applicative`

.
`seq`

is much like `map`

: it allows a function to be used in order to transform the contents of a datatype.
However, with `seq`

, the function is itself contained in the datatype: `f (α → β) → (Unit → f α) → f β`

.
Having the function under the type `f`

allows the `Applicative`

instance to control how the function is applied, while `Functor.map`

unconditionally applies a function.
The second argument has a type that begins with `Unit →`

to allow the definition of `seq`

to short-circuit in cases where the function will never be applied.

The value of this short-circuiting behavior can be seen in the instance of `Applicative Option`

:

```
instance : Applicative Option where
pure x := .some x
seq f x :=
match f with
| none => none
| some g => g <$> x ()
```

In this case, if there is no function for `seq`

to apply, then there is no need to compute its argument, so `x`

is never called.
The same consideration informs the instance of `Applicative`

for `Except`

:

```
instance : Applicative (Except ε) where
pure x := .ok x
seq f x :=
match f with
| .error e => .error e
| .ok g => g <$> x ()
```

This short-circuiting behavior depends only on the `Option`

or `Except`

structures that *surround* the function, rather than on the function itself.

Monads can be seen as a way of capturing the notion of sequentially executing statements into a pure functional language.
The result of one statement can affect which further statements run.
This can be seen in the type of `bind`

: `m α → (α → m β) → m β`

.
The first statement's resulting value is an input into a function that computes the next statement to execute.
Successive uses of `bind`

are like a sequence of statements in an imperative programming language, and `bind`

is powerful enough to implement control structures like conditionals and loops.

Following this analogy, `Applicative`

captures function application in a language that has side effects.
The arguments to a function in languages like Kotlin or C# are evaluated from left to right.
Side effects performed by earlier arguments occur before those performed by later arguments.
A function is not powerful enough to implement custom short-circuiting operators that depend on the specific *value* of an argument, however.

Typically, `seq`

is not invoked directly.
Instead, the operator `<*>`

is used.
This operator wraps its second argument in `fun () => ...`

, simplifying the call site.
In other words, `E1 <*> E2`

is syntactic sugar for `Seq.seq E1 (fun () => E2)`

.

The key feature that allows `seq`

to be used with multiple arguments is that a multiple-argument Lean function is really a single-argument function that returns another function that's waiting for the rest of the arguments.
In other words, if the first argument to `seq`

is awaiting multiple arguments, then the result of the `seq`

will be awaiting the rest.
For example, `some Plus.plus`

can have the type `Option (Nat → Nat → Nat)`

.
Providing one argument, `some Plus.plus <*> some 4`

, results in the type `Option (Nat → Nat)`

.
This can itself be used with `seq`

, so `some Plus.plus <*> some 4 <*> some 7`

has the type `Option Nat`

.

Not every functor is applicative.
`Pair`

is like the built-in product type `Prod`

:

```
structure Pair (α β : Type) : Type where
first : α
second : β
```

Like `Except`

, `Pair`

has type `Type → Type → Type`

.
This means that `Pair α`

has type `Type → Type`

, and a `Functor`

instance is possible:

```
instance : Functor (Pair α) where
map f x := ⟨x.first, f x.second⟩
```

This instance obeys the `Functor`

contract.

The two properties to check are that `id <$> Pair.mk x y = Pair.mk x y`

and that `f <$> g <$> Pair.mk x y = (f ∘ g) <$> Pair.mk x y`

.
The first property can be checked by just stepping through the evaluation of the left side, and noticing that it evaluates to the right side:

```
id <$> Pair.mk x y
===>
Pair.mk x (id y)
===>
Pair.mk x y
```

The second can be checked by stepping through both sides, and noting that they yield the same result:

```
f <$> g <$> Pair.mk x y
===>
f <$> Pair.mk x (g y)
===>
Pair.mk x (f (g y))
(f ∘ g) <$> Pair.mk x y
===>
Pair.mk x ((f ∘ g) y)
===>
Pair.mk x (f (g y))
```

Attempting to define an `Applicative`

instance, however, does not work so well.
It will require a definition of `pure`

:

```
def Pair.pure (x : β) : Pair α β := _
```

```
don't know how to synthesize placeholder
context:
β α : Type
x : β
⊢ Pair α β
```

There is a value with type `β`

in scope (namely `x`

), and the error message from the underscore suggests that the next step is to use the constructor `Pair.mk`

:

```
def Pair.pure (x : β) : Pair α β := Pair.mk _ x
```

```
don't know how to synthesize placeholder for argument 'first'
context:
β α : Type
x : β
⊢ α
```

Unfortunately, there is no `α`

available.
Because `pure`

would need to work for *all possible types* α to define an instance of `Applicative (Pair α)`

, this is impossible.
After all, a caller could choose `α`

to be `Empty`

, which has no values at all.

## A Non-Monadic Applicative

When validating user input to a form, it's generally considered to be best to provide many errors at once, rather than one error at a time. This allows the user to have an overview of what is needed to please the computer, rather than feeling badgered as they correct the errors field by field.

Ideally, validating user input will be visible in the type of the function that's doing the validating. It should return a datatype that is specific—checking that a text box contains a number should return an actual numeric type, for instance. A validation routine could throw an exception when the input does not pass validation. Exceptions have a major drawback, however: they terminate the program at the first error, making it impossible to accumulate a list of errors.

On the other hand, the common design pattern of accumulating a list of errors and then failing when it is non-empty is also problematic.
A long nested sequences of `if`

statements that validate each sub-section of the input data is hard to maintain, and it's easy to lose track of an error message or two.
Ideally, validation can be performed using an API that enables a new value to be returned yet automatically tracks and accumulates error messages.

An applicative functor called `Validate`

provides one way to implement this style of API.
Like the `Except`

monad, `Validate`

allows a new value to be constructed that characterizes the validated data accurately.
Unlike `Except`

, it allows multiple errors to be accumulated, without a risk of forgetting to check whether the list is empty.

### User Input

As an example of user input, take the following structure:

```
structure RawInput where
name : String
birthYear : String
```

The business logic to be implemented is the following:

- The name may not be empty
- The birth year must be numeric and non-negative
- The birth year must be greater than 1900, and less than or equal to the year in which the form is validated

Representing these as a datatype will require a new feature, called *subtypes*.
With this tool in hand, a validation framework can be written that uses an applicative functor to track errors, and these rules can be implemented in the framework.

### Subtypes

Representing these conditions is easiest with one additional Lean type, called `Subtype`

:

```
structure Subtype {α : Type} (p : α → Prop) where
val : α
property : p val
```

This structure has two type parameters: an implicit parameter that is the type of data `α`

, and an explicit parameter `p`

that is a predicate over `α`

.
A *predicate* is a logical statement with a variable in it that can be replaced with a value to yield an actual statement, like the parameter to `GetElem`

that describes what it means for an index to be in bounds for a lookup.
In the case of `Subtype`

, the predicate slices out some subset of the values of `α`

for which the predicate holds.
The structure's two fields are, respectively, a value from `α`

and evidence that the value satisfies the predicate `p`

.
Lean has special syntax for `Subtype`

.
If `p`

has type `α → Prop`

, then the type `Subtype p`

can also be written `{x : α // p x}`

, or even `{x // p x}`

when the type can be inferred automatically.

Representing positive numbers as inductive types is clear and easy to program with.
However, it has a key disadvantage.
While `Nat`

and `Int`

have the structure of ordinary inductive types from the perspective of Lean programs, the compiler treats them specially and uses fast arbitrary-precision number libraries to implement them.
This is not the case for additional user-defined types.
However, a subtype of `Nat`

that restricts it to non-zero numbers allows the new type to use the efficient representation while still ruling out zero at compile time:

```
def FastPos : Type := {x : Nat // x > 0}
```

The smallest fast positive number is still one.
Now, instead of being a constructor of an inductive type, it's an instance of a structure that's constructed with angle brackets.
The first argument is the underlying `Nat`

, and the second argument is the evidence that said `Nat`

is greater than zero:

```
def one : FastPos := ⟨1, by simp⟩
```

The `OfNat`

instance is very much like that for `Pos`

, except it uses a short tactic proof to provide evidence that `n + 1 > 0`

:

```
instance : OfNat FastPos (n + 1) where
ofNat := ⟨n + 1, by simp_arith⟩
```

The `simp_arith`

tactic is a version of `simp`

that takes additional arithmetic identities into account.

Subtypes are a two-edged sword.
They allow efficient representation of validation rules, but they transfer the burden of maintaining these rules to the users of the library, who have to *prove* that they are not violating important invariants.
Generally, it's a good idea to use them internally to a library, providing an API to users that automatically ensures that all invariants are satisfied, with any necessary proofs being internal to the library.

Checking whether a value of type `α`

is in the subtype `{x : α // p x}`

usually requires that the proposition `p x`

be decidable.
The section on equality and ordering classes describes how decidable propositions can be used with `if`

.
When `if`

is used with a decidable proposition, a name can be provided.
In the `then`

branch, the name is bound to evidence that the proposition is true, and in the `else`

branch, it is bound to evidence that the proposition is false.
This comes in handy when checking whether a given `Nat`

is positive:

```
def Nat.asFastPos? (n : Nat) : Option FastPos :=
if h : n > 0 then
some ⟨n, h⟩
else none
```

In the `then`

branch, `h`

is bound to evidence that `n > 0`

, and this evidence can be used as the second argument to `Subtype`

's constructor.

### Validated Input

The validated user input is a structure that expresses the business logic using multiple techniques:

- The structure type itself encodes the year in which it was checked for validity, so that
`CheckedInput 2019`

is not the same type as`CheckedInput 2020`

- The birth year is represented as a
`Nat`

rather than a`String`

- Subtypes are used to constrain the allowed values in the name and birth year fields

```
structure CheckedInput (thisYear : Nat) : Type where
name : {n : String // n ≠ ""}
birthYear : {y : Nat // y > 1900 ∧ y ≤ thisYear}
```

An input validator should take the current year and a `RawInput`

as arguments, returning either a checked input or at least one validation failure.
This is represented by the `Validate`

type:

```
inductive Validate (ε α : Type) : Type where
| ok : α → Validate ε α
| errors : NonEmptyList ε → Validate ε α
```

It looks very much like `Except`

.
The only difference is that the `error`

constructor may contain more than one failure.

Validate is a functor.
Mapping a function over it transforms any successful value that might be present, just as in the `Functor`

instance for `Except`

:

```
instance : Functor (Validate ε) where
map f
| .ok x => .ok (f x)
| .errors errs => .errors errs
```

The `Applicative`

instance for `Validate`

has an important difference from the instance for `Except`

: while the instance for `Except`

terminates at the first error encountered, the instance for `Validate`

is careful to accumulate all errors from *both* the function and the argument branches:

```
instance : Applicative (Validate ε) where
pure := .ok
seq f x :=
match f with
| .ok g => g <$> (x ())
| .errors errs =>
match x () with
| .ok _ => .errors errs
| .errors errs' => .errors (errs ++ errs')
```

Using `.errors`

together with the constructor for `NonEmptyList`

is a bit verbose.
Helpers like `reportError`

make code more readable.
In this application, error reports will consist of field names paired with messages:

```
def Field := String
def reportError (f : Field) (msg : String) : Validate (Field × String) α :=
.errors { head := (f, msg), tail := [] }
```

The `Applicative`

instance for `Validate`

allows the checking procedures for each field to be written independently and then composed.
Checking a name consists of ensuring that a string is non-empty, then returning evidence of this fact in the form of a `Subtype`

.
This uses the evidence-binding version of `if`

:

```
def checkName (name : String) : Validate (Field × String) {n : String // n ≠ ""} :=
if h : name = "" then
reportError "name" "Required"
else pure ⟨name, h⟩
```

In the `then`

branch, `h`

is bound to evidence that `name = ""`

, while it is bound to evidence that `¬name = ""`

in the `else`

branch.

It's certainly the case that some validation errors make other checks impossible.
For example, it makes no sense to check whether the birth year field is greater than 1900 if a confused user wrote the word `"syzygy"`

instead of a number.
Checking the allowed range of the number is only meaningful after ensuring that the field in fact contains a number.
This can be expressed using the function `andThen`

:

```
def Validate.andThen (val : Validate ε α) (next : α → Validate ε β) : Validate ε β :=
match val with
| .errors errs => .errors errs
| .ok x => next x
```

While this function's type signature makes it suitable to be used as `bind`

in a `Monad`

instance, there are good reasons not to do so.
They are described in the section that describes the `Applicative`

contract.

To check that the birth year is a number, a built-in function called `String.toNat? : String → Option Nat`

is useful.
It's most user-friendly to eliminate leading and trailing whitespace first using `String.trim`

:

```
def checkYearIsNat (year : String) : Validate (Field × String) Nat :=
match year.trim.toNat? with
| none => reportError "birth year" "Must be digits"
| some n => pure n
```

To check that the provided year is in the expected range, nested uses of the evidence-providing form of `if`

are in order:

```
def checkBirthYear (thisYear year : Nat) : Validate (Field × String) {y : Nat // y > 1900 ∧ y ≤ thisYear} :=
if h : year > 1900 then
if h' : year ≤ thisYear then
pure ⟨year, by simp [*]⟩
else reportError "birth year" s!"Must be no later than {thisYear}"
else reportError "birth year" "Must be after 1900"
```

Finally, these three components can be combined using `seq`

:

```
def checkInput (year : Nat) (input : RawInput) : Validate (Field × String) (CheckedInput year) :=
pure CheckedInput.mk <*>
checkName input.name <*>
(checkYearIsNat input.birthYear).andThen fun birthYearAsNat =>
checkBirthYear year birthYearAsNat
```

Testing `checkInput`

shows that it can indeed return multiple pieces of feedback:

```
#eval checkInput 2023 {name := "David", birthYear := "1984"}
```

```
Validate.ok { name := "David", birthYear := 1984 }
```

```
#eval checkInput 2023 {name := "", birthYear := "2045"}
```

```
Validate.errors { head := ("name", "Required"), tail := [("birth year", "Must be no later than 2023")] }
```

```
#eval checkInput 2023 {name := "David", birthYear := "syzygy"}
```

```
Validate.errors { head := ("birth year", "Must be digits"), tail := [] }
```

Form validation with `checkInput`

illustrates a key advantage of `Applicative`

over `Monad`

.
Because `>>=`

provides enough power to modify the rest of the program's execution based on the value from the first step, it *must* receive a value from the first step to pass on.
If no value is received (e.g. because an error has occurred), then `>>=`

cannot execute the rest of the program.
`Validate`

demonstrates why it can be useful to run the rest of the program anyway: in cases where the earlier data isn't needed, running the rest of the program can yield useful information (in this case, more validation errors).
`Applicative`

's `<*>`

may run both of its arguments before recombining the results.
Similarly, `>>=`

forces sequential execution.
Each step must complete before the next may run.
This is generally useful, but it makes it impossible to have parallel execution of different threads that naturally emerges from the program's actual data dependencies.
A more powerful abstraction like `Monad`

increases the flexibility that's available to the API consumer, but it decreases the flexibility that is available to the API implementor.