Article · HuggingFace Blog

BigCodeArena: Judging code generations end to end with code executions

BigCodeArena is a human-in-the-loop platform that evaluates AI code generation by executing code in sandboxed environments across multiple languages and frameworks. It enables interactive testing, multi-turn conversations, and community voting to rank models, addressing key evaluation gaps in code generation. The platform has gathered over 14,000 conversations since February 2025.

Published 07 Oct. 2025 · ★★★★
Read the source: huggingface.co/blog/bigcode/arena
Source: HuggingFace Blog
Ingested: 07 Oct. 2025, 19:10
Editorial score: 4.0 / 5