The Lean Theorem Prover

Leonardo de Moura (joint work with Soonho Kong, Jeremy Avigad, Floris van Doorn, Jakob von Raumer)

Microsoft Research, 2015/02/18

Introduction: Lean

New automated & interactive theorem prover
Powerful system for
- reasoning about complex systems
- reasoning about mathematics
- proving claims about both
de Bruijn's Principle: small trusted kernel

Introduction: Lean

Proving should be as easy as programming
Expressive logic
Partial constructions: automation fills the "holes"

Introduction: Lean

Platform for
- Software verification & development
- Formalized mathematics
- Education (mathematics, logic, computer science)
- Synthesis (proofs & programs)

Introduction: Lean

It is an ongoing and long term effort
At CMU, it is already being used for formalizing
- Homotopy Type Theory
- Category Theory
- Algebraic Hierarchy
Interactive theory proving course at CMU

What is new?

Poweful elaboration engine that can handle
- Higher-order unification
- Definitional reductions
- Coercions
- Ad-hoc polymorphism (aka overloading)
- Type classes
- Tactics (the automation gateway)
- in an integrated way

What is new?

Poweful elaboration engine that can handle
Small trusted kernel: it does not contain
- Termination checker
- Fixpoint operators
- Pattern matching

What is new?

Poweful elaboration engine that can handle
Small trusted kernel
Flexible kernel
- Proof irrelevance is optional
- Extensible (e.g., future support for HITs)

What is new?

Poweful elaboration engine that can handle
Small trusted kernel
Flexible kernel
Multi-core support
- Process theorems in parallel
- Execute/try tactics (automation) in parallel

What is new?

Poweful elaboration engine that can handle
Small trusted kernel
Flexible kernel
Multi-core support
Incremental compilation

What is new?

Poweful elaboration engine that can handle
Small trusted kernel
Flexible kernel
Multi-core support
Incremental compilation
Universe polymorphism

What is new?

Poweful elaboration engine that can handle
Small trusted kernel
Flexible kernel
Multi-core support
Incremental compilation
Universe polymorphism
Freedom to trust
- Support for "macros" (procedural attachments to kernel)
- Every "macro" can be eliminated
- Faster type checking and reduction

What is new?

Poweful elaboration engine that can handle
Small trusted kernel
Flexible kernel
Multi-core support
Incremental compilation
Universe polymorphism
Freedom to trust
Support for mixed declarative and tactic proof style
Future: powerful automation

What is new?

Poweful elaboration engine that can handle
Small trusted kernel
Flexible kernel
Multi-core support
Incremental compilation
Universe polymorphism
Freedom to trust
Support for mixed declarative and tactic proof style
Future: powerful automation

The Logical Framework

First layer: dependent lambda calculus

Type -- Sort nat -- Constant λ x : nat, x -- Lambda abstraction vector bool 3 -- Application Π (n : nat), vector nat n -- Function Space nat → bool -- Function Space (no dependency)

The Logical Framework

First layer: dependent lambda calculus

definition f (A B : Type) : A → B → A := λ (a : A) (b : B), a definition f (A B : Type) (a : A) (b : B) : A := a

The Logical Framework

First layer: dependent lambda calculus
Hierarchy of universes

set_option pp.universes true check Type.{0} -- Prop : Type.{1} check Type.{1} -- Type.{1} : Type.{2} check Type.{2} -- Type.{2} : Type.{3}

The Logical Framework

First layer: dependent lambda calculus
Hierarchy of universes
Prop is the type of all propositions

theorem f (A B : Prop) : A → B → A := λ (a : A) (b : B), a theorem f (A B : Prop) (a : A) (b : B) : A := a

The Logical Framework

First layer: dependent lambda calculus
Hierarchy of universes
Prop is the type of all propositions
Configurable
- Impredicative Prop
Π (x : nat), x = x -- is a Proposition ∀ (x : nat), x = x -- Alternative notation
- Proof irrelevance
example (p : Prop) (H₁ : p) (H₂ : p) : H₁ = H₂ := rfl -- reflexivity
We have two flavors
- Standard (CoC): uses an impredicative and proof irrelevant Prop
- HoTT: proof relevant, and no Prop

The Logical Framework

First layer: dependent lambda calculus
Hierarchy of universes
Prop is the type of all propositions
Configurable
We have two flavors
Second layer: inductive families
- We "believe" in recursion
inductive vector (A : Type) : nat → Type := | nil : vector A zero | cons : Π {n : nat}, A → vector A n → vector A (succ n)

Inductive families

Given
inductive nat : Type := | zero : nat | succ : nat → nat
Produces
-- 1) A new type check nat -- 2) Introduction rules (aka Constructors) check nat.zero check nat.succ -- 3) Eliminator (aka Recursor) check @nat.rec -- 4) Computational rule variable C : nat → Type variable Hz : C nat.zero variable Hs : Π (a : nat), C a → C (nat.succ a) eval nat.rec Hz Hs nat.zero -- Hz variable a : nat eval nat.rec Hz Hs (nat.succ a) -- Hs a (nat.rec Hz Hs a)

Inductive families

definition pred (a : nat) : nat := nat.rec nat.zero (λ (a₁ : nat) (r : nat), a₁) a eval pred (nat.succ (nat.succ nat.zero)) -- => nat.succ nat.zero

Inductive families

It is possible to construct a substantial edifice of mathematics based on nothing more than the type universes, function spaces, and inductive types; everything else follows from those.
inductive true : Prop := intro : true inductive false : Prop -- empty type inductive and (a b : Prop) : Prop := intro : a → b → and a b inductive or (a b : Prop) : Prop := | inl : a → or a b | inr : b → or a b inductive Exists (A : Type) (P : A → Prop) : Prop := intro : Π (a : A), P a → Exists P inductive eq (A : Type) (a : A) : A → Prop := refl : eq A a a

Inductive families

theorem and_comm (p q : Prop) (H : p ∧ q) : q ∧ p := and.rec (λ Hp Hq, and.intro Hq Hp) H theorem and_comm (p q : Prop) (H : p ∧ q) : q ∧ p := match H with and.intro Hp Hq := and.intro Hq Hp end definition swap (A B : Type) (p : A × B) : B × A := match p with (a, b) := (b, a) end

Universe polymorphism

definition id.{u} (A : Type.{u}) (a : A) : A := a definition arrow.{u₁ u₂} (A : Type.{u₁}) (B : Type.{u₂}) : Type.{imax u₁ u₂} := A → B check arrow.{0 0} true false check arrow.{1 1} nat nat check arrow.{1 0} nat true definition arrow (A : Type) (B : Type) : Type := A → B check arrow nat true

Elaboration

"By relieving the brain of all unnecessary work, a good notation sets it free to concentrate on more advanced problems, and in effect increases the mental power of the race." A. N. Whitehead

We want a convenient system
The following features are implemented outside of the kernel
- Implicit arguments
- Coercions
- Namespace management
- Ad-hoc polymorphism (overloading)
- Type classes
- Recursive equations
- Dependent pattern matching
- Human-readable proofs
- Structures
- Tactics (gateway to decision procedures, rewriters, …)

Implicit arguments

Curly braces indicate that argument should be inferred rather than entered explicitly.

definition id {A : Type} (a : A) : A := a check id 10 -- @id num 10 check @id num 10

Implicit arguments

check @eq.subst -- eq.subst : ∀ {A : Type} {a b : A} {P : A → Prop}, -- a = b → P a → P b variables (A : Type) (R : A → A → Prop) variables (a b c : A) (f : A → A → A) example (H₁ : R (f a a) (f a a)) (H₂ : a = b) : R (f a b) (f b a) := eq.subst H₂ H₁ example (H₁ : R (f a a) (f a a)) (H₂ : a = b) : R (f a b) (f b a) := @eq.subst A a b (λ x : A, R (f a x) (f x a)) H₂ H₁

Definitional Reductions

Elaborator must respect the computational interpretation of terms

example (A : Type) (a b : A) : (a, b).1 = a := rfl example (A B : Type) (f : A → A → B) (a : A) : (λ x, f x x) a = f a a := rfl

Coercions

In Lean, we can associate attributes to definitions.
Coercion is one of the available attributes.

definition to_list [coercion] {A : Type} {n : nat} (v : vector A n) : list A := vector.rec list.nil (λ n h t r, list.cons h r) v variable f : list nat → Prop variable v : vector nat 10 check f v -- f (@to_list nat 10 v)

Namespaces

We can group definitions, metaobjects (e.g., notation declarations) and attributes into namespaces.
We can open namespaces

namespace foo definition f (a : nat) := nat.succ a eval f nat.zero end foo eval foo.f nat.zero open foo eval f nat.zero open nat check 1 + 2 open nat (hiding add sub) open nat (renaming add → nadd) open nat (rec_on) open [notations] nat open -[classes] nat

Ad-hoc polymorphism

notation a + b := add a b notation a + b := bor a b eval 1 + 2 eval tt + ff

We can use namespaces to avoid unwanted ambiguity.
We can override ad-hoc polymorphism

check (#nat a + b)

Type classes

Synthesis procedure based on lambda-Prolog
Big picture
- Mark some inductive families as classes
- Mark some definitions as (generators of) instances
- Indicate that some implicit arguments must be synthesized using type classes
Instances are treated as Horn clauses

Inhabited Type Class

inductive inhabited [class] (A : Type) : Type := mk : A → inhabited A definition default (A : Type) [h : inhabited A] : A := inhabited.rec (λ a, a) h definition prop_inhabited [instance] : inhabited Prop := inhabited.mk true definition nat_inhabited [instance] : inhabited nat := inhabited.mk nat.zero definition fun_inhabited [instance] (A B : Type) (h : inhabited B) : inhabited (A → B) := inhabited.mk (λ x : A, default B) definition prod_inhabited [instance] (A B : Type) (ha : inhabited A) (hb : inhabited B) : inhabited (A × B) := inhabited.mk (default A, default B) eval default (nat → nat × Prop)

Inhabited Type Class (Applications)

Some propositions only hold for inhabited types
theorem ex_trivial (A : Type) [h : inhabited A] : ∃ x : A, x = x := exists.intro (default A) rfl
"Corner cases"
definition head {A : Type} [h : inhabited A] (l : list A) : A := list.rec (default A) (λ h t r, h) l

Recursive equations

Recursors are inconvenient to use.
Compiler from recursive equations to recursors.
Two compilation strategies: structural and well-founded recursion

definition fib : nat → nat | fib 0 := 1 | fib 1 := 1 | fib (a+2) := fib (a+1) + fib a example (a : nat) : fib (a+2) = fib (a+1) + fib a := rfl

Recursive equations

Proofs by induction

theorem fib_pos : ∀ n, 0 < fib n | fib_pos 0 := show 0 < 1, from zero_lt_succ 0 | fib_pos 1 := show 0 < 1, from zero_lt_succ 0 | fib_pos (a+2) := calc 0 = 0 + 0 : rfl ... < fib (a+1) + 0 : lt_right (fib_pos (a+1)) 0 ... < fib (a+1) + fib a : lt_left (fib_pos a) (fib (a+1)) ... = fib (a+2) : rfl

Recursive equations

Dependent pattern matching

definition map {A B C : Type} (f : A → B → C) : Π {n : nat}, vector A n → vector B n → vector C n | map nil nil := nil | map (a::va) (b::vb) := f a b :: map va vb definition zip {A B : Type} : Π {n}, vector A n → vector B n → vector (A × B) n | zip nil nil := nil | zip (a::va) (b::vb) := (a, b) :: zip va vb

Human-readable proofs

Simulate Mizar and Isabelle/Isar
Proof terms + extensible parser
- have, show, obtain constructs
- assume and take as alternative notation for lambda abstraction
- Calculational proofs

theorem dvd_of_dvd_add_left {m n₁ n₂ : ℕ} (H₁ : m | n₁ + n₂) (H₂ : m | n₁) : m | n₂ := obtain (c₁ : nat) (Hc₁ : n₁ + n₂ = m * c₁), from H₁, obtain (c₂ : nat) (Hc₂ : n₁ = m * c₂), from H₂, have aux : m * (c₁ - c₂) = n₂, from calc m * (c₁ - c₂) = m * c₁ - m * c₂ : mul_sub_left_distrib ... = n₁ + n₂ - m * c₂ : Hc₁ ... = n₁ + n₂ - n₁ : Hc₂ ... = n₂ : add_sub_cancel_left, dvd.intro aux

Tactics

Automation such as rewrite engined, simplifiers and decision procedures are integrated into the system as tactics.
A placeholder/hole can be viewed as a goal
A proof state is a sequence of goals, substitution (already solved holes), and postponed constraints.
A tactic is a function from proof state to a lazy stream of proof states.
Tacticals are tactic combinadors: andthen, orelse, par, …

Tactics

We can switch to tactic mode using begin … end or by …

example (p q : Prop) (Hp : p) (Hq : q) : p ∧ q ∧ p := begin apply and.intro, exact Hp, apply and.intro, exact Hq, exact Hp end example (H₁ : a + 0 = 0) (H₂ : b + 0 = 0) : a + b + 0 = 0 := begin rewrite add_zero at *, rewrite [H₁, H₂] end

Structures

Special kind of inductive datatype (only one constructor)
Projections are generated automatically
"Inheritance"
Extensively used to formalize the algebraic hierarchy
We can view them as parametric modules

structure has_mul [class] (A : Type) := (mul : A → A → A) structure semigroup [class] (A : Type) extends has_mul A := (mul_assoc : ∀a b c, mul (mul a b) c = mul a (mul b c)) ... structure group [class] (A : Type) extends monoid A, has_inv A := (mul_left_inv : ∀a, mul (inv a) a = one)

Structures

structure ring [class] (A : Type) extends add_comm_group A, monoid A, distrib A, zero_ne_one_class A theorem ring.mul_zero [s : ring A] (a : A) : a * 0 = 0 := have H : a * 0 + 0 = a * 0 + a * 0, from calc a * 0 + 0 = a * 0 : by rewrite add_zero ... = a * (0 + 0) : by rewrite add_zero ... = a * 0 + a * 0 : by rewrite ring.left_distrib, show a * 0 = 0, by rewrite -(add.left_cancel H)

Future work

Auto tactic based on equational reasoning, matching, heuristic instantiation, …
Decision procedures for arithmetic
Better support for proof by reflection
Independent/reference type checker in LISP
Better libraries (ongoing work)
HIT (Higher-order inductive datatypes in HoTT mode)
Fibrant universes (in HoTT mode)
Structured editor (for education)

Thank you

Website: http://leanprover.github.io/
Source code: https://github.com/leanprover/lean
Theorem proving in Lean: https://leanprover.github.io/tutorial/index.html