have2lemma¶

Extract have statements from proofs and convert them into standalone lemmas.

Python API¶

result = await axle.have2lemma(
    content=lean_code,
    environment="lean-4.26.0",
    names=["main_theorem"],         # Optional
    include_have_body=False,        # Optional: use sorry instead
    include_whole_context=True,     # Optional
    reconstruct_callsite=False,     # Optional
    verbosity=0,                    # Optional: 0-2
)
print(result.content)
print(result.lemma_names)  # ["main_theorem.h1", "main_theorem.h2"]

CLI¶

Usage: axle have2lemma CONTENT [OPTIONS]

# Extract all have statements
axle have2lemma theorem.lean --environment lean-4.26.0
# Extract from specific theorems
axle have2lemma theorem.lean --names main_proof,helper --environment lean-4.26.0
# Include proof bodies in extracted lemmas
axle have2lemma theorem.lean --include-have-body --environment lean-4.26.0
# Reconstruct callsites (replace have with lemma call)
axle have2lemma theorem.lean --reconstruct-callsite --environment lean-4.26.0
# Skip context cleanup
axle have2lemma theorem.lean --no-include-whole-context --environment lean-4.26.0
# Pipeline usage
cat theorem.lean | axle have2lemma - --environment lean-4.26.0 | axle check - --environment lean-4.26.0

HTTP API¶

curl -s -X POST http://10.1.10.100:8989/v1/axle/have2lemma \
    -d '{"content": "import Mathlib\ntheorem foo : 1 = 1 ∧ 2 = 2 := by\n  have h1 : 1 = 1 := by rfl\n  have h2 : 2 = 2 := by rfl\n  exact ⟨h1, h2⟩", "environment": "lean-4.26.0"}' | jq

Input Parameters¶

Name	Type	Required	Description
`content`	`str`	Yes	A string of Lean code
`environment`	`str`	Yes	Environment to use (e.g., `"lean-4.26.0"`, `"lean-4.25.1"`)
`names`	`list[str]`	No	List of theorem names to process
`indices`	`list[int]`	No	List of theorem indices (0-based, negative allowed)
`include_have_body`	`bool`	No	Include have bodies in extracted lemmas (default: false, uses sorry)
`include_whole_context`	`bool`	No	Include whole context when extracting (default: true)
`reconstruct_callsite`	`bool`	No	Replace have statement with lemma call (default: false)
`verbosity`	`int`	No	Pretty-printer verbosity: 0=default, 1=robust, 2=extra robust
`ignore_imports`	`bool`	No	If true, ignore import statement mismatches and use the environment's default imports. If false, raise an error on mismatch. Default: false
`timeout_seconds`	`float`	No	Maximum execution time (default: 120)

Note: Specify only one of names or indices. If neither specified, processes all theorems.

Output Fields¶

Name	Type	Description
`lean_messages`	`dict`	Messages from Lean compiler with `errors`, `warnings`, `infos` lists
`tool_messages`	`dict`	Messages from have2lemma tool with `errors`, `warnings`, `infos` lists
`content`	`str`	The transformed Lean code with have statements extracted as lemmas
`lemma_names`	`list[str]`	List of names of the newly created lemmas
`timings`	`dict`	Timing info: `total_ms`, `parse_ms`

Example Response¶

{
  "lean_messages": {
    "errors": [],
    "warnings": [],
    "infos": []
  },
  "tool_messages": {
    "errors": [],
    "warnings": [],
    "infos": []
  },
  "content": "import Mathlib\n\nlemma foo.h1 : 1 = 1 := sorry\n\nlemma foo.h2 (h1 : 1 = 1) : 2 = 2 := sorry\n\ntheorem foo : 1 = 1 ∧ 2 = 2 := by\n  have h1 : 1 = 1 := by rfl\n  have h2 : 2 = 2 := by rfl\n  exact ⟨h1, h2⟩",
  "lemma_names": ["foo.h1", "foo.h2"],
  "timings": {
    "total_ms": 95,
    "parse_ms": 88
  }
}

Demo¶

There are a lot of configurable options for have2lemma. Let's go through them and discuss why they exist.

Options¶

There are three main options to discuss:

include_have_body: Whether to include have bodies in extracted lemmas. If false, lemmas will use sorry instead. Defaults to false.
include_whole_context: Whether to include the whole context (skip cleanup) when extracting have statements. Defaults to true.
reconstruct_callsite: Whether to reconstruct the callsite (replace have statement with lemma call). Defaults to false.

Default behavior¶

Let's look at the simple following example.

theorem example_theorem (p q r : Prop) : p ∧ r → p ∨ q := by
  intro hpr
  have h1 : p := by simp_all
  have h2 : r := by simp_all
  left
  assumption

The default behavior sets include_have_body=false, include_whole_context=true, and reconstruct_callsite=false, giving us

lemma example_theorem.h1 (p q r : Prop) (hpr : p ∧ r) : p := sorry

lemma example_theorem.h2 (p q r : Prop) (hpr : p ∧ r) (h1 : p) : r := sorry

theorem example_theorem (p q r : Prop) : p ∧ r → p ∨ q := by
  intro hpr
  have h1 : p := by simp_all
  have h2 : r := by simp_all
  left
  assumption

Here,

both generated lemmas are sorried out -- this is the result of include_have_body=false.
in both lemmas, the entire local context is provided, which is the result of include_whole_context=true. This might include redundant variables -- in this case, q isn't relevant to the goal.
the main theorem is left unchanged -- this is the result of reconstruct_callsite=false.

`include_have_body`¶

Let's see what happens if we set this value to true:

lemma example_theorem.h1 (p q r : Prop) (hpr : p ∧ r) : p := by simp_all

lemma example_theorem.h2 (p q r : Prop) (hpr : p ∧ r) (h1 : p) : r := by simp_all

The output now includes the proof body!

Why bother making this configurable?

This option is NOT guaranteed to be robust, and might introduce errors into the file. In this example:

theorem complex_types : ∀ (n : Nat), n + 0 = n := by
  intro n
  have base : 0 + 0 = 0 := by rfl
  have step : ∀ m, m + 0 = m → (m + 1) + 0 = m + 1 := by
    intro m ih
    rfl
  sorry

the second generated lemma is

lemma complex_types.step : ∀ (n : ℕ), 0 + 0 = 0 → ∀ (m : ℕ), m + 0 = m → m + 1 + 0 = m + 1 := by
    intro m ih
    rfl

This does not compile!!! Notice that Lean has decided to revert n in the type. This means the proof will fail, because there is a missing intro n ....

`include_whole_context`¶

Now let's set this option to false. In our original example, this gives us:

lemma example_theorem.h1 (p r : Prop) (hpr : p ∧ r) : p := sorry
lemma example_theorem.h2 (p r : Prop) (hpr : p ∧ r) (h1 : p) : r := sorry

Notice that the tool has now removed the q variable from both lemmas, as it is irrelevant to the goal and hypotheses.

Why make this configurable?

In general, Lean's dependency analysis is purely based on heuristics. See the source:

A variable is relevant if (1) it occurs in the target type, (2) there is a relevant variable that depends on it, or (3) the type of the variable is a proposition that depends on a relevant variable.

Therefore, it's possible that a hypothesis in the context is useful even though Lean judges it to be irrelevant. In rare cases, it can break the proof when used in conjunction with include_have_body=true. For example:

theorem foo : Odd 5 ∨ Even 5 := by
  have odd : Odd 5 := by exists 2
  have sol : Odd 5 ∨ Even 5 := by
    left
    assumption
  exact sol

When running with include_have_body=true and include_whole_context=false, the tool will output the lemmas

lemma foo.odd : Odd 5 := by exists 2

lemma foo.sol : Odd 5 ∨ Even 5 := by
    left
    assumption

Notably, in the second lemma, Lean judged the hypothesis odd as irrelevant -- no good! The proof body now breaks on assumption.

`reconstruct_callsite`¶

Our final option is the most intricate. Let's try enabling this option:

...

theorem example_theorem (p q r : Prop) : p ∧ r → p ∨ q := by
  intro hpr
  have h1 : p := example_theorem.h1 p q r hpr
  have h2 : r := example_theorem.h2 p q r hpr h1
  left
  assumption

Here, in the main theorem, we removed the body of the have statement, replacing it with an application of the lemmas we just generated!

Why make this configurable?

Let's make a very small change to our original proof. Instead of running intro hpr, we'll have Lean generate the name for us, and just run intros.

theorem example_theorem (p q r : Prop) : p ∧ r → p ∨ q := by
  intros
  have h1 : p := by simp_all
  ...

Now we'll run have2lemma again.

lemma example_theorem.h1 (p q r : Prop) (a : p ∧ r) : p := sorry

lemma example_theorem.h2 (p q r : Prop) (a : p ∧ r) (h1 : p) : r := sorry

theorem example_theorem (p q r : Prop) : p ∧ r → p ∨ q := by
  intros
  have h1 : p := sorry /- try using example_theorem.h1 here -/
  have h2 : r := sorry /- try using example_theorem.h2 here -/
  left
  assumption

Uh oh. What happened? Notice that when we run intros, we introduce a new hypothesis with type p ∧ r -- but we haven't given it a name! This means we can't ever refer to it explicitly (i.e., it is inaccessible). (This is a Lean quirk which can be disabled, but hygienic names are generally a good thing.) have2lemma automatically generated the name a in the lemmas, but we can't assign anything to it -- so our tool complains that we've encountered an inaccessible variable, and gives up.

Verbosity¶

The verbosity parameter controls how explicit the pretty-printer is when generating lemma signatures. Higher verbosity levels produce more explicit output, which can help avoid ambiguity in complex type situations.

verbosity=0 (default): Standard pretty-printing options
verbosity=1: Robust options with additional explicitness
verbosity=2: Extra robust options with maximum explicitness

When to use higher verbosity¶

Consider this example involving coercions:

theorem explicit_coercion_test (n : ℕ) (hn : n > 0) : True := by
  have h : (∑ i : Fin n, (1 : ℝ) / (i.val + 1)) ≤ (harmonic n : ℝ) + 1 := by
    sorry
  trivial

With default verbosity (verbosity=0), the coercion (harmonic n : ℝ) may be pretty-printed as Rat.cast (harmonic n), losing the target type ℝ. This causes Lean to fail with errors like "failed to synthesize RatCast ℕ" because it can't infer the correct target type for the coercion.

With verbosity=2, the pretty-printer uses pp.explicit=true, which preserves the target type information and produces a valid lemma signature.

Rule of thumb: If you encounter type inference errors in generated lemmas—especially involving coercions, casts, or polymorphic functions—try increasing the verbosity level.

Do note that at verbosity=2, type signatures may become incredibly complex and unreadable, so it should be used sparingly.

Summary¶

These configuration options provide some flexibility around usage, at the cost of correctness in some cases. Try to keep this in mind when generating bug reports -- some of these errors aren't fixable without significant effort.