Reasoning ~ coding + math
“Classical” reasoning = brute-force, search-based LLM
- pattern matching
- hallucination
- self-reflection
- extracts vague intuition from few-shot examples
- incapable of long, precise mechanical deduction (tool use)
Issues
- Lack of data for proofs
- How did human mathematicians develop intuition despite the lack of data?
- practice on variants/extensions of known theorems
- make statements more abstract
- Conjecturer dataset
- can’t be too easy (prover succeeds in 1-8 of 32 tries)
- Prover dataset
- correct and non-trivial
- Elegance score:
- Diversity across different domains of math
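A minimal sketch of the conjecturer difficulty filter, assuming "1-8/32" means the prover succeeded on between 1 and 8 of 32 attempts. The function name, the thresholds as parameters, and the input format (statement mapped to pass count) are hypothetical, chosen only for illustration:

```python
from typing import Dict, List


def select_conjectures(
    pass_counts: Dict[str, int],
    min_pass: int = 1,    # assumption: at least one success, so the statement is provable
    max_pass: int = 8,    # assumption: more than 8 successes counts as "too easy"
    attempts: int = 32,   # assumption: 32 proof attempts per conjecture
) -> List[str]:
    """Keep conjectures whose pass count lies in [min_pass, max_pass]."""
    selected = []
    for statement, passes in pass_counts.items():
        if not 0 <= passes <= attempts:
            raise ValueError(f"pass count {passes} outside 0..{attempts}")
        if min_pass <= passes <= max_pass:
            selected.append(statement)
    return selected


# Hypothetical pass counts out of 32 attempts.
example = {"conj_a": 0, "conj_b": 3, "conj_c": 8, "conj_d": 32}
kept = select_conjectures(example)  # keeps conj_b and conj_c
```

Unproved conjectures (0 successes) are dropped because nothing verifies them, and near-always-proved ones are dropped as too easy.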
Questions
- Ablation study on how different domains of the dataset affect end ability?
- Re-weighting the domains of the dataset?
- What makes a problem novel compared with existing datasets?
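One possible way to set up the re-weighting and ablation questions, as a sketch only. Power-law smoothing of domain sizes is a swapped-in, commonly used heuristic, not something these notes specify; the function names, the `alpha` parameter, and the domain counts are hypothetical:

```python
from typing import Dict


def domain_weights(counts: Dict[str, int], alpha: float = 0.5) -> Dict[str, float]:
    """Sampling weight per domain, proportional to count ** alpha.

    alpha = 1 keeps the natural proportions; alpha = 0 samples domains uniformly.
    """
    scaled = {d: n ** alpha for d, n in counts.items() if n > 0}
    total = sum(scaled.values())
    return {d: s / total for d, s in scaled.items()}


def ablate(counts: Dict[str, int], domain: str) -> Dict[str, int]:
    """Drop one domain, giving the dataset for a leave-one-domain-out run."""
    return {d: n for d, n in counts.items() if d != domain}


# Hypothetical domain sizes.
counts = {"algebra": 900, "geometry": 100}
weights = domain_weights(counts, alpha=0.5)  # up-weights the smaller domain
without_geometry = ablate(counts, "geometry")
```

Training once per ablated dataset and comparing end ability against the full run would answer the first question; sweeping `alpha` would probe the second.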