Coercions
In mathematics, it is common to use the same symbol to stand for different aspects of some object in different contexts.
For example, if a ring is referred to in a context where a set is expected, then it is understood that the ring's underlying set is what's intended.
In programming languages, it is common to have rules to automatically translate values of one type into values of another type.
For instance, Java allows a byte
to be automatically promoted to an int
, and Kotlin allows a non-nullable type to be used in a context that expects a nullable version of the type.
In Lean, both purposes are served by a mechanism called coercions. When Lean encounters an expression of one type in a context that expects a different type, it will attempt to coerce the expression before reporting a type error. Unlike Java, C, and Kotlin, the coercions are extensible by defining instances of type classes.
Positive Numbers
For example, every positive number corresponds to a natural number.
The function Pos.toNat
that was defined earlier converts a Pos
to the corresponding Nat
:
def Pos.toNat : Pos → Nat
| Pos.one => 1
| Pos.succ n => n.toNat + 1
The function List.drop
, with type {α : Type} → Nat → List α → List α
, removes a prefix of a list.
Applying List.drop
to a Pos
, however, leads to a type error:
[1, 2, 3, 4].drop (2 : Pos)
application type mismatch
List.drop 2
argument
2
has type
Pos : Type
but is expected to have type
Nat : Type
Because the author of List.drop
did not make it a method of a type class, it can't be overridden by defining a new instance.
The type class Coe
describes overloaded ways of coercing from one type to another:
class Coe (α : Type) (β : Type) where
coe : α → β
An instance of Coe Pos Nat
is enough to allow the prior code to work:
instance : Coe Pos Nat where
coe x := x.toNat
#eval [1, 2, 3, 4].drop (2 : Pos)
[3, 4]
Using #check
shows the result of the instance search that was used behind the scenes:
#check [1, 2, 3, 4].drop (2 : Pos)
List.drop (Pos.toNat 2) [1, 2, 3, 4] : List Nat
Chaining Coercions
When searching for coercions, Lean will attempt to assemble a coercion out of a chain of smaller coercions.
For example, there is already a coercion from Nat
to Int
.
Because of that instance, combined with the Coe Pos Nat
instance, the following code is accepted:
def oneInt : Int := Pos.one
This definition uses two coercions: from Pos
to Nat
, and then from Nat
to Int
.
The Lean compiler does not get stuck in the presence of circular coercions.
For example, even if two types A
and B
can be coerced to one another, their mutual coercions can be used to find a path:
inductive A where
| a
inductive B where
| b
instance : Coe A B where
coe _ := B.b
instance : Coe B A where
coe _ := A.a
instance : Coe Unit A where
coe _ := A.a
def coercedToB : B := ()
Remember: the double parentheses ()
is short for the constructor Unit.unit
.
After deriving a Repr B
instance,
#eval coercedToB
results in:
B.b
The Option
type can be used similarly to nullable types in C# and Kotlin: the none
constructor represents the absence of a value.
The Lean standard library defines a coercion from any type α
to Option α
that wraps the value in some
.
This allows option types to be used in a manner even more similar to nullable types, because some
can be omitted.
For instance, the function List.getLast?
that finds the last entry in a list can be written without a some
around the return value x
:
def List.last? : List α → Option α
| [] => none
| [x] => x
| _ :: x :: xs => last? (x :: xs)
Instance search finds the coercion, and inserts a call to coe
, which wraps the argument in some
.
These coercions can be chained, so that nested uses of Option
don't require nested some
constructors:
def perhapsPerhapsPerhaps : Option (Option (Option String)) :=
"Please don't tell me"
Coercions are only activated automatically when Lean encounters a mismatch between an inferred type and a type that is imposed from the rest of the program. In cases with other errors, coercions are not activated. For example, if the error is that an instance is missing, coercions will not be used:
def perhapsPerhapsPerhapsNat : Option (Option (Option Nat)) :=
392
failed to synthesize instance
OfNat (Option (Option (Option Nat))) 392
This can be worked around by manually indicating the desired type to be used for OfNat
:
def perhapsPerhapsPerhapsNat : Option (Option (Option Nat)) :=
(392 : Nat)
Additionally, coercions can be manually inserted using an up arrow:
def perhapsPerhapsPerhapsNat : Option (Option (Option Nat)) :=
↑(392 : Nat)
In some cases, this can be used to ensure that Lean finds the right instances. It can also make the programmer's intentions more clear.
Non-Empty Lists and Dependent Coercions
An instance of Coe α β
makes sense when the type β
has a value that can represent each value from the type α
.
Coercing from Nat
to Int
makes sense, because the type Int
contains all the natural numbers.
Similarly, a coercion from non-empty lists to ordinary lists makes sense because the List
type can represent every non-empty list:
instance : Coe (NonEmptyList α) (List α) where
coe
| { head := x, tail := xs } => x :: xs
This allows non-empty lists to be used with the entire List
API.
On the other hand, it is impossible to write an instance of Coe (List α) (NonEmptyList α)
, because there's no non-empty list that can represent the empty list.
This limitation can be worked around by using another version of coercions, which are called dependent coercions.
Dependent coercions can be used when the ability to coerce from one type to another depends on which particular value is being coerced.
Just as the OfNat
type class takes the particular Nat
being overloaded as a parameter, dependent coercion takes the value being coerced as a parameter:
class CoeDep (α : Type) (x : α) (β : Type) where
coe : β
This is a chance to select only certain values, either by imposing further type class constraints on the value or by writing certain constructors directly.
For example, any List
that is not actually empty can be coerced to a NonEmptyList
:
instance : CoeDep (List α) (x :: xs) (NonEmptyList α) where
coe := { head := x, tail := xs }
Coercing to Types
In mathematics, it is common to have a concept that consists of a set equipped with additional structure. For example, a monoid is some set S, an element s of S, and an associative binary operator on S, such that s is neutral on the left and right of the operator. S is referred to as the "carrier set" of the monoid. The natural numbers with zero and addition form a monoid, because addition is associative and adding zero to any number is the identity. Similarly, the natural numbers with one and multiplication also form a monoid. Monoids are also widely used in functional programming: lists, the empty list, and the append operator form a monoid, as do strings, the empty string, and string append:
structure Monoid where
Carrier : Type
neutral : Carrier
op : Carrier → Carrier → Carrier
def natMulMonoid : Monoid :=
{ Carrier := Nat, neutral := 1, op := (· * ·) }
def natAddMonoid : Monoid :=
{ Carrier := Nat, neutral := 0, op := (· + ·) }
def stringMonoid : Monoid :=
{ Carrier := String, neutral := "", op := String.append }
def listMonoid (α : Type) : Monoid :=
{ Carrier := List α, neutral := [], op := List.append }
Given a monoid, it is possible to write the foldMap
function that, in a single pass, transforms the entries in a list into a monoid's carrier set and then combines them using the monoid's operator.
Because monoids have a neutral element, there is a natural result to return when the list is empty, and because the operator is associative, clients of the function don't have to care whether the recursive function combines elements from left to right or from right to left.
def foldMap (M : Monoid) (f : α → M.Carrier) (xs : List α) : M.Carrier :=
let rec go (soFar : M.Carrier) : List α → M.Carrier
| [] => soFar
| y :: ys => go (M.op soFar (f y)) ys
go M.neutral xs
Even though a monoid consists of three separate pieces of information, it is common to just refer to the monoid's name in order to refer to its set. Instead of saying "Let A be a monoid and let x and y be elements of its carrier set", it is common to say "Let A be a monoid and let x and y be elements of A". This practice can be encoded in Lean by defining a new kind of coercion, from the monoid to its carrier set.
The CoeSort
class is just like the Coe
class, with the exception that the target of the coercion must be a sort, namely Type
or Prop
.
The term sort in Lean refers to these types that classify other types—Type
classifies types that themselves classify data, and Prop
classifies propositions that themselves classify evidence of their truth.
Just as Coe
is checked when a type mismatch occurs, CoeSort
is used when something other than a sort is provided in a context where a sort would be expected.
The coercion from a monoid into its carrier set extracts the carrier:
instance : CoeSort Monoid Type where
coe m := m.Carrier
With this coercion, the type signatures become less bureaucratic:
def foldMap (M : Monoid) (f : α → M) (xs : List α) : M :=
let rec go (soFar : M) : List α → M
| [] => soFar
| y :: ys => go (M.op soFar (f y)) ys
go M.neutral xs
Another useful example of CoeSort
is used to bridge the gap between Bool
and Prop
.
As discussed in the section on ordering and equality, Lean's if
expression expects the condition to be a decidable proposition rather than a Bool
.
Programs typically need to be able to branch based on Boolean values, however.
Rather than have two kinds of if
expression, the Lean standard library defines a coercion from Bool
to the proposition that the Bool
in question is equal to true
:
instance : CoeSort Bool Prop where
coe b := b = true
In this case, the sort in question is Prop
rather than Type
.
Coercing to Functions
Many datatypes that occur regularly in programming consist of a function along with some extra information about it.
For example, a function might be accompanied by a name to show in logs or by some configuration data.
Additionally, putting a type in a field of a structure, similarly to the Monoid
example, can make sense in contexts where there is more than one way to implement an operation and more manual control is needed than type classes would allow.
For example, the specific details of values emitted by a JSON serializer may be important because another application expects a particular format.
Sometimes, the function itself may be derivable from just the configuration data.
A type class called CoeFun
can transform values from non-function types to function types.
CoeFun
has two parameters: the first is the type whose values should be transformed into functions, and the second is an output parameter that determines exactly which function type is being targeted.
class CoeFun (α : Type) (makeFunctionType : outParam (α → Type)) where
coe : (x : α) → makeFunctionType x
The second parameter is itself a function that computes a type. In Lean, types are first-class and can be passed to functions or returned from them, just like anything else.
For example, a function that adds a constant amount to its argument can be represented as a wrapper around the amount to add, rather than by defining an actual function:
structure Adder where
howMuch : Nat
A function that adds five to its argument has a 5
in the howMuch
field:
def add5 : Adder := ⟨5⟩
This Adder
type is not a function, and applying it to an argument results in an error:
#eval add5 3
function expected at
add5
term has type
Adder
Defining a CoeFun
instance causes Lean to transform the adder into a function with type Nat → Nat
:
instance : CoeFun Adder (fun _ => Nat → Nat) where
coe a := (· + a.howMuch)
#eval add5 3
8
Because all Adder
s should be transformed into Nat → Nat
functions, the argument to CoeFun
's second parameter was ignored.
When the value itself is needed to determine the right function type, then CoeFun
's second parameter is no longer ignored.
For example, given the following representation of JSON values:
inductive JSON where
| true : JSON
| false : JSON
| null : JSON
| string : String → JSON
| number : Float → JSON
| object : List (String × JSON) → JSON
| array : List JSON → JSON
deriving Repr
a JSON serializer is a structure that tracks the type it knows how to serialize along with the serialization code itself:
structure Serializer where
Contents : Type
serialize : Contents → JSON
A serializer for strings need only wrap the provided string in the JSON.string
constructor:
def Str : Serializer :=
{ Contents := String,
serialize := JSON.string
}
Viewing JSON serializers as functions that serialize their argument requires extracting the inner type of serializable data:
instance : CoeFun Serializer (fun s => s.Contents → JSON) where
coe s := s.serialize
Given this instance, a serializer can be applied directly to an argument:
def buildResponse (title : String) (R : Serializer) (record : R.Contents) : JSON :=
JSON.object [
("title", JSON.string title),
("status", JSON.number 200),
("record", R record)
]
The serializer can be passed directly to buildResponse
:
#eval buildResponse "Functional Programming in Lean" Str "Programming is fun!"
JSON.object
[("title", JSON.string "Functional Programming in Lean"),
("status", JSON.number 200.000000),
("record", JSON.string "Programming is fun!")]
Aside: JSON as a String
It can be a bit difficult to understand JSON when encoded as Lean objects.
To help make sure that the serialized response was what was expected, it can be convenient to write a simple converter from JSON
to String
.
The first step is to simplify the display of numbers.
JSON
doesn't distinguish between integers and floating point numbers, and the type Float
is used to represent both.
In Lean, Float.toString
includes a number of trailing zeros:
#eval (5 : Float).toString
"5.000000"
The solution is to write a little function that cleans up the presentation by dropping all trailing zeros, followed by a trailing decimal point:
def dropDecimals (numString : String) : String :=
if numString.contains '.' then
let noTrailingZeros := numString.dropRightWhile (· == '0')
noTrailingZeros.dropRightWhile (· == '.')
else numString
With this definition, #eval dropDecimals (5 : Float).toString
yields "5"
, and #eval dropDecimals (5.2 : Float).toString
yields "5.2"
.
The next step is to define a helper function to append a list of strings with a separator in between them:
def String.separate (sep : String) (strings : List String) : String :=
match strings with
| [] => ""
| x :: xs => String.join (x :: xs.map (sep ++ ·))
This function is useful to account for comma-separated elements in JSON arrays and objects.
#eval ", ".separate ["1", "2"]
yields "1, 2"
, #eval ", ".separate ["1"]
yields "1"
, and #eval ", ".separate []
yields ""
.
Finally, a string escaping procedure is needed for JSON strings, so that the Lean string containing "Hello!"
can be output as "\"Hello!\""
.
Fortunately, the Lean compiler contains an internal function for escaping JSON strings already, called Lean.Json.escape
.
To access this function, add import Lean
to the beginning of your file.
The function that emits a string from a JSON
value is declared partial
because Lean cannot see that it terminates.
This is because recursive calls to asString
occur in functions that are being applied by List.map
, and this pattern of recursion is complicated enough that Lean cannot see that the recursive calls are actually being performed on smaller values.
In an application that just needs to produce JSON strings and doesn't need to mathematically reason about the process, having the function be partial
is not likely to cause problems.
partial def JSON.asString (val : JSON) : String :=
match val with
| true => "true"
| false => "false"
| null => "null"
| string s => "\"" ++ Lean.Json.escape s ++ "\""
| number n => dropDecimals n.toString
| object members =>
let memberToString mem :=
"\"" ++ Lean.Json.escape mem.fst ++ "\": " ++ asString mem.snd
"{" ++ ", ".separate (members.map memberToString) ++ "}"
| array elements =>
"[" ++ ", ".separate (elements.map asString) ++ "]"
With this definition, the output of serialization is easier to read:
#eval (buildResponse "Functional Programming in Lean" Str "Programming is fun!").asString
"{\\"title\\": \\"Functional Programming in Lean\\", \\"status\\": 200, \\"record\\": \\"Programming is fun!\\"}"
Messages You May Meet
Natural number literals are overloaded with the OfNat
type class.
Because coercions fire in cases where types don't match, rather than in cases of missing instances, a missing OfNat
instance for a type does not cause a coercion from Nat
to be applied:
def perhapsPerhapsPerhapsNat : Option (Option (Option Nat)) :=
392
failed to synthesize instance
OfNat (Option (Option (Option Nat))) 392
Design Considerations
Coercions are a powerful tool that should be used responsibly. On the one hand, they can allow an API to naturally follow the everyday rules of the domain being modeled. This can be the difference between a bureaucratic mess of manual conversion functions and a clear program. As Abelson and Sussman wrote in the preface to Structure and Interpretation of Computer Programs (MIT Press, 1996),
Programs must be written for people to read, and only incidentally for machines to execute.
Coercions, used wisely, are a valuable means of achieving readable code that can serve as the basis for communication with domain experts. APIs that rely heavily on coercions have a number of important limitations, however. Think carefully about these limitations before using coercions in your own libraries.
First off, coercions are only applied in contexts where enough type information is available for Lean to know all of the types involved, because there are no output parameters in the coercion type classes. This means that a return type annotation on a function can be the difference between a type error and a successfully applied coercion. For example, the coercion from non-empty lists to lists makes the following program work:
def lastSpider : Option String :=
List.getLast? idahoSpiders
On the other hand, if the type annotation is omitted, then the result type is unknown, so Lean is unable to find the coercion:
def lastSpider :=
List.getLast? idahoSpiders
application type mismatch
List.getLast? idahoSpiders
argument
idahoSpiders
has type
NonEmptyList String : Type
but is expected to have type
List ?m.34258 : Type
More generally, when a coercion is not applied for some reason, the user receives the original type error, which can make it difficult to debug chains of coercions.
Finally, coercions are not applied in the context of field accessor notation. This means that there is still an important difference between expressions that need to be coerced and those that don't, and this difference is visible to users of your API.