Article · HuggingFace Blog
Measuring Open-Source Llama Nemotron Models on DeepResearch Bench
AI-Q combines Llama-3.3-70B-Instruct and Llama-3.3-Nemotron-Super-49B-v1.5 to support long-context retrieval, agentic reasoning, and tool use in an open-source stack. NVIDIA details the models' lineage and post-training, the transparent evaluation metrics it uses (hallucination detection, multi-source synthesis, citation trustworthiness, RAGAS), and a 49B Nemotron model that runs on a single H100 GPU. As of August 2025, DeepResearch Bench ranks AI-Q top among fully open stacks in the "LLM with Search" category, with a score of 40.52.
Published 04 AUG 2025
Read the source: huggingface.co/blog/nvidia/ai-q-top-ranking-open-portable-deep-research-agent
- Source: HuggingFace Blog
- Ingested: 04 AUG 2025 · 19:10
- Editorial score: 4.0 / 5