article · HuggingFace Blog
Measuring Open-Source Llama Nemotron Models on DeepResearch Bench
AI-Q combines Llama 3.3-70B Instruct and Llama-3.3-Nemotron-Super-49B-v1.5 to enable long-context retrieval, agentic reasoning, and tool use in open-source stacks. NVIDIA details the models' lineage and post-training, describes transparent evaluation metrics (hallucination detection, multi-source synthesis, citation trust, RAGAS), and notes that the 49B Nemotron model runs on a single H100. DeepResearch Bench ranks AI-Q first among fully open stacks, with a score of 40.52 in the LLM with Search category (Aug 2025).
published AUG 04, 2025
Read the source: huggingface.co/blog/nvidia/ai-q-top-ranking-open-portable-deep-research-agent
- Source: HuggingFace Blog
- Ingested: AUG 04, 2025 · 19:10
- Editorial score: 4.0 / 5