🪴 Berwin Gan

STP: Self-Play LLM Theorem Provers with Iterative Conjecturing and Proving

Jun 28, 2025 · 1 min read

self-play
agents

Reasoning ~ coding + math

“Classical” reasoning = brute-force, search-based LLM

pattern matching
hallucination
self-reflection
extracts vague intuition from few-shot examples
incapable of long, precise mechanical deduction (tool use)

Issues

Lack of data for proofs
How do human mathematicians develop intuition despite the lack of data?
They practice on variants/extensions of known theorems
- make statement more abstract
- Conjecturer dataset
  - can’t be too easy (proved in only 1-8 of 32 tries)
- Prover dataset
  - correct and non-trivial
Elegancy score: $\frac{\text{length of shortest correct proof}}{\text{length of the conjecture}}$
Diversity across different domains of math

Question

Is there an ablation study on how different domains of the dataset affect end ability?
What about re-weighting the domains of the dataset?
What makes a problem novel compared with the existing dataset?
