Standard Classes

This section presents a variety of operators and functions that can be overloaded using type classes in Lean. Each operator or function corresponds to a method of a type class. Unlike C++, infix operators in Lean are defined as abbreviations for named functions; this means that overloading them for new types is not done using the operator itself, but rather refers to a name such as HAdd.hAdd.

Arithmetic

Most arithmetic operators are available in a heterogeneous form, where the arguments may have different type and an output parameter decides the type of the resulting expression. For each heterogeneous operator, there is a corresponding homogeneous version that can found by removing the letter h, so that HAdd.hAdd becomes Add.add. The following arithmetic operators are overloaded:

ExpressionDesugaringClass Name
x + yHAdd.hAdd x yHAdd
x - yHSub.hSub x yHSub
x * yHMul.hMul x yHMul
x / yHDiv.hDiv x yHDiv
x % yHMod.hMod x yHMod
x ^ yHPow.hPow x yHPow
(- x)Neg.neg xNeg

Bitwise Operators

Lean contains a number of standard bitwise operators that are overloaded using type classes. There are instances for fixed-width types such as UInt8, UInt16, UInt32, UInt64, and USize. The latter is the size of words on the current platform, typically 32 or 64 bits. The following bitwise operators are overloaded:

ExpressionDesugaringClass Name
x &&& yHAnd.hAnd x yHAnd
x ||| y HOr.hOr x yHOr
x ^^^ yHXor.hXor x yHXor
~~~ xComplement.complement xComplement
x >>> yHShiftRight.hShiftRight x yHShiftRight
x <<< yHShiftLeft.hShiftLeft x yHShiftLeft

Because the names And and Or are already taken as the names of logical connectives, the homogeneous versions of HAnd and HOr are called AndOp and OrOp rather than And and Or.

Equality and Ordering

Testing equality of two values typically uses the BEq class, which is short for "Boolean equality". Due to Lean's use as a theorem prover, there are really two kinds of equality operators in Lean:

  • Boolean equality is the same kind of equality that is found in other programming languages. It is a function that takes two values and returns a Bool. Boolean equality is written with two equals signs, just as in Python and C#. Because Lean is a pure functional language, there's no separate notions of reference vs value equality—pointers cannot be observed directly.
  • Propositional equality is the mathematical statement that two things are equal. Propositional equality is not a function; rather, it is a mathematical statement that admits proof. It is written with a single equals sign. A statement of propositional equality is like a type that classifies evidence of this equality.

Both notions of equality are important, and used for different purposes. Boolean equality is useful in programs, when a decision needs to be made about whether two values are equal. For example, "Octopus" == "Cuttlefish" evaluates to false, and "Octopodes" == "Octo".append "podes" evaluates to true. Some values, such as functions, cannot be checked for equality. For example, (fun (x : Nat) => 1 + x) == (Nat.succ ·) yields the error:

failed to synthesize instance
  BEq (Nat → Nat)

As this message indicates, == is overloaded using a type class. The expression x == y is actually shorthand for BEq.beq x y.

Propositional equality is a mathematical statement rather than an invocation of a program. Because propositions are like types that describe evidence for some statement, propositional equality has more in common with types like String and Nat → List Int than it does with Boolean equality. This means that it can't automatically be checked. However, the equality of any two expressions can be stated in Lean, so long as they have the same type. The statement (fun (x : Nat) => 1 + x) = (Nat.succ ·) is a perfectly reasonable statement. From the perspective of mathematics, two functions are equal if they map equal inputs to equal outputs, so this statement is even true, though it requires a two-line proof to convince Lean of this fact.

Generally speaking, when using Lean as a programming language, it's easiest to stick to Boolean functions rather than propositions. However, as the names true and false for Bool's constructors suggest, this difference is sometimes blurred. Some propositions are decidable, which means that they can be checked just like a Boolean function. The function that checks whether the proposition is true or false is called a decision procedure, and it returns evidence of the truth or falsity of the proposition. Some examples of decidable propositions include equality and inequality of natural numbers, equality of strings, and "ands" and "ors" of propositions that are themselves decidable.

In Lean, if works with decidable propositions. For example, 2 < 4 is a proposition:

#check 2 < 4
2 < 4 : Prop

Nonetheless, it is perfectly acceptable to write it as the condition in an if. For example, if 2 < 4 then 1 else 2 has type Nat and evaluates to 1.

Not all propositions are decidable. If they were, then computers would be able to prove any true proposition just by running the decision procedure, and mathematicians would be out of a job. More specifically, decidable propositions have an instance of the Decidable type class which has a method that is the decision procedure. Trying to use a proposition that isn't decidable as if it were a Bool results in a failure to find the Decidable instance. For example, if (fun (x : Nat) => 1 + x) = (Nat.succ ·) then "yes" else "no" results in:

failed to synthesize instance
  Decidable ((fun x => 1 + x) = fun x => Nat.succ x)

The following propositions, that are usually decidable, are overloaded with type classes:

ExpressionDesugaringClass Name
x < yLT.lt x yLT
x ≤ yLE.le x yLE
x > yLT.lt y xLT
x ≥ yLE.le y xLE

Because defining new propositions hasn't yet been demonstrated, it may be difficult to define new instances of LT and LE.

Additionally, comparing values using <, ==, and > can be inefficient. Checking first whether one value is less than another, and then whether they are equal, can require two traversals over large data structures. To solve this problem, Java and C# have standard compareTo and CompareTo methods (respectively) that can be overridden by a class in order to implement all three operations at the same time. These methods return a negative integer if the receiver is less than the argument, zero if they are equal, and a positive integer if the receiver is greater than the argument. Rather than overload the meaning of integers, Lean has a built-in inductive type that describes these three possibilities:

inductive Ordering where
| lt
| eq
| gt

The Ord type class can be overloaded to produce these comparisons. For Pos, an implementation can be:

def Pos.comp : Pos → Pos → Ordering
  | Pos.one, Pos.one => Ordering.eq
  | Pos.one, Pos.succ _ => Ordering.lt
  | Pos.succ _, Pos.one => Ordering.gt
  | Pos.succ n, Pos.succ k => comp n k

instance : Ord Pos where
  compare := Pos.comp

In situations where compareTo would be the right approach in Java, use Ord.compare in Lean.

Hashing

Java and C# have hashCode and GetHashCode methods, respectively, that compute a hash of a value for use in data structures such as hash tables. The Lean equivalent is a type class called Hashable:

class Hashable (α : Type) where
  hash : α → UInt64

If two values are considered equal according to a BEq instance for their type, then they should have the same hashes. In other words, if x == y then hash x == hash y. If x ≠ y, then hash x won't necessarily differ from hash y (after all, there are infinitely more Nat values than there are UInt64 values), but data structures built on hashing will have better performance if unequal values are likely to have unequal hashes. This is the same expectation as in Java and C#.

The standard library contains a function mixHash with type UInt64 → UInt64 → UInt64 that can be used to combine hashes for different fields for a constructor. A reasonable hash function for an inductive datatype can be written by assigning a unique number to each constructor, and then mixing that number with the hashes of each field. For example, a Hashable instance for Pos can be written:

def hashPos : Pos → UInt64
  | Pos.one => 0
  | Pos.succ n => mixHash 1 (hashPos n)

instance : Hashable Pos where
  hash := hashPos

Hashable instances for polymorphic types can use recursive instance search. Hashing a NonEmptyList α is only possible when α can be hashed:

instance [Hashable α] : Hashable (NonEmptyList α) where
  hash xs := mixHash (hash xs.head) (hash xs.tail)

Binary trees use both recursion and recursive instance search in the implementations of BEq and Hashable:

inductive BinTree (α : Type) where
  | leaf : BinTree α
  | branch : BinTree α → α → BinTree α → BinTree α

def eqBinTree [BEq α] : BinTree α → BinTree α → Bool
  | BinTree.leaf, BinTree.leaf =>
    true
  | BinTree.branch l x r, BinTree.branch l2 x2 r2 =>
    x == x2 && eqBinTree l l2 && eqBinTree r r2
  | _, _ =>
    false

instance [BEq α] : BEq (BinTree α) where
  beq := eqBinTree

def hashBinTree [Hashable α] : BinTree α → UInt64
  | BinTree.leaf =>
    0
  | BinTree.branch left x right =>
    mixHash 1 (mixHash (hashBinTree left) (mixHash (hash x) (hashBinTree right)))

instance [Hashable α] : Hashable (BinTree α) where
  hash := hashBinTree

Deriving Standard Classes

Instance of classes like BEq and Hashable are often quite tedious to implement by hand. Lean includes a feature called instance deriving that allows the compiler to automatically construct well-behaved instances of many type classes. In fact, the deriving Repr phrase in the definition of Point in the section on structures is an example of instance deriving.

Instances can be derived in two ways. The first can be used when defining a structure or inductive type. In this case, add deriving to the end of the type declaration followed by the names of the classes for which instances should be derived. For a type that is already defined, a standalone deriving command can be used. Write deriving instance C1, C2, ... for T to deriving instances of C1, C2, ... for the type T after the fact.

BEq and Hashable instances can be derived for Pos and NonEmptyList using a very small amount of code:

deriving instance BEq, Hashable for Pos
deriving instance BEq, Hashable, Repr for NonEmptyList

Instance can be derived for at least the following classes:

  • Inhabited
  • BEq
  • Repr
  • Hashable
  • Ord

In some cases, however, the derived Ord instance may not produce precisely the ordering desired in an application. When this is the case, it's fine to write an Ord instance by hand. The collection of classes for which instances can be derived can be extended by advanced users of Lean.

Aside from the clear advantages in programmer productivity and code readability, deriving instances also makes code easier to maintain, because the instances are updated as the definitions of types evolve. Changesets involving updates to datatypes are easier to read without line after line of formulaic modifications to equality tests and hash computation.

Appending

Many datatypes have some sort of append operator. In Lean, appending two values is overloaded with the type class HAppend, which is a heterogeneous operation like that used for arithmetic operations:

class HAppend (α : Type) (β : Type) (γ : outParam Type) where
  hAppend : α → β → γ

The syntax xs ++ ys desugars to HAppend.hAppend xs ys. For homogeneous cases, it's enough to implement an instance of Append, which follows the usual pattern:

instance : Append (NonEmptyList α) where
  append xs ys :=
    { head := xs.head, tail := xs.tail ++ ys.head :: ys.tail }

After defining the above instance,

#eval idahoSpiders ++ idahoSpiders

has the following output:

{ head := "Banded Garden Spider",
tail := ["Long-legged Sac Spider",
         "Wolf Spider",
         "Hobo Spider",
         "Cat-faced Spider",
         "Banded Garden Spider",
         "Long-legged Sac Spider",
         "Wolf Spider",
         "Hobo Spider",
         "Cat-faced Spider"] }

Similarly, a definition of HAppend allows non-empty lists to be appended to ordinary lists:

instance : HAppend (NonEmptyList α) (List α) (NonEmptyList α) where
  hAppend xs ys :=
    { head := xs.head, tail := xs.tail ++ ys }

With this instance available,

#eval idahoSpiders ++ ["Trapdoor Spider"]

results in

{ head := "Banded Garden Spider",
  tail := ["Long-legged Sac Spider", "Wolf Spider", "Hobo Spider", "Cat-faced Spider", "Trapdoor Spider"] }

Functors

A polymorphic type is a functor if it has an overload for a function named map that transforms every element contained in it by a function. While most languages use this terminology, C#'s equivalent to map is called System.Linq.Enumerable.Select. For example, mapping a function over a list constructs a new list in which each entry from the starting list has been replaced by the result of the function on that entry. Mapping a function f over an Option leaves none untouched, and replaces some x with some (f x).

Here are some examples of functors and how their Functor instances overload map:

  • Functor.map (· + 5) [1, 2, 3] evaluates to [6, 7, 8]
  • Functor.map toString (some (List.cons 5 List.nil)) evaluates to some "[5]"
  • Functor.map List.reverse [[1, 2, 3], [4, 5, 6]] evaluates to [[3, 2, 1], [6, 5, 4]]

Because Functor.map is a bit of a long name for this common operation, Lean also provides an infix operator for mapping a function, namely <$>. The prior examples can be rewritten as follows:

  • (· + 5) <$> [1, 2, 3] evaluates to [6, 7, 8]
  • toString <$> (some (List.cons 5 List.nil)) evaluates to some "[5]"
  • List.reverse <$> [[1, 2, 3], [4, 5, 6]] evaluates to [[3, 2, 1], [6, 5, 4]]

An instance of Functor for NonEmptyList requires specifying the map function.

instance : Functor NonEmptyList where
  map f xs := { head := f xs.head, tail := f <$> xs.tail }

Here, map uses the Functor instance for List to map the function over the tail. This instance is defined for NonEmptyList rather than for NonEmptyList α because the argument type α plays no role in resolving the type class. A NonEmptyList can have a function mapped over it no matter what the type of entries is. If α were a parameter to the class, then it would be possible to make versions of Functor that only worked for NonEmptyList Nat, but part of being a functor is that map works for any entry type.

Here is an instance of Functor for PPoint:

instance : Functor PPoint where
  map f p := { x := f p.x, y := f p.y }

In this case, f has been applied to both x and y.

Even when the type contained in a functor is itself a functor, mapping a function only goes down one layer. That is, when using map on a NonEmptyList (PPoint Nat), the function being mapped should take PPoint Nat as its argument rather than Nat.

The definition of the Functor class uses one more language feature that has not yet been discussed: default method definitions. Normally, a class will specify some minimal set of overloadable operations that make sense together, and then use polymorphic functions with instance implicit arguments that build on the overloaded operations to provide a larger library of features. For example, the function concat can concatenate any non-empty list whose entries are appendable:

def concat [Append α] (xs : NonEmptyList α) : α :=
  let rec catList (start : α) : List α → α
    | [] => start
    | (z :: zs) => catList (start ++ z) zs
  catList xs.head xs.tail

However, for some classes, there are operations that can be more efficiently implemented with knowledge of the internals of a datatype.

In these cases, a default method definition can be provided. A default method definition provides a default implementation of a method in terms of the other methods. However, instance implementors may choose to override this default with something more efficient. Default method definitions contain := in a class definition.

In the case of Functor, some types have a more efficient way of implementing map when the function being mapped ignores its argument. Functions that ignore their arguments are called constant functions because they always return the same value. Here is the definition of Functor, in which mapConst has a default implementation:

class Functor (f : Type → Type) where
  map : {α β : Type} → (α → β) → f α → f β

  mapConst {α β : Type} (x : α) (coll : f β) : f α :=
    map (fun _ => x) coll

Just as a Hashable instance that doesn't respect BEq is buggy, a Functor instance that moves around the data as it maps the function is also buggy. For example, a buggy Functor instance for List might throw away its argument and always return the empty list, or it might reverse the list. A bad instance for PPoint might place f x in both the x and the y fields. Specifically, Functor instances should follow two rules:

  1. Mapping the identity function should result in the original argument.
  2. Mapping two composed functions should have the same effect as composing their mapping.

More formally, the first rule says that id <$> x equals x. The second rule says that map (fun y => f (g y)) x equals map f (map g x). The composition fun y => f (g y) can also be written f ∘ g. These rules prevent implementations of map that move the data around or delete some of it.

Exercises

  • Write an instance of HAppend (List α) (NonEmptyList α) (NonEmptyList α) and test it.
  • Implement a Functor instance for the binary tree datatype.