Article · HuggingFace Blog

Kimina-Prover: Applying Test-time RL Search on Large Formal Reasoning Models

Kimina-Prover-72B applies Test-Time Reinforcement Learning (TTRL) search to autonomously discover and reuse lemmas in long-horizon Lean proofs, and adds an error-fixing module that interprets Lean error messages to make targeted corrections. It reaches a state-of-the-art 92.2% pass rate on miniF2F, with pass@1/32/1024 of 63.9%/84.0%/87.7%. Two distilled variants (8B and 1.7B) are also released.

Published 10 Jul. 2025

Read the source: huggingface.co/blog/AI-MO/kimina-prover

Source: HuggingFace Blog
Ingested: 10 Jul. 2025 · 19:10
Editorial score: 4.0 / 5