Article · HuggingFace Blog
Introducing AutoRound: Intel’s Advanced Quantization for LLMs and VLMs
AutoRound is Intel's weight-only post-training quantization (PTQ) method. It uses signed gradient descent to jointly optimize weight rounding and clipping ranges for accurate low-bit quantization (INT2 to INT8). The post claims up to 2.1x higher relative accuracy at 2-bit and quantization of a 72B model in about 37 minutes on an A100 GPU. It also supports mixed-bit tuning, multiple export formats, and preset recipes (auto-round-best and auto-round-light) that need only small calibration sets.
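To make the idea of tuning rounding and clipping with signed gradient descent concrete, here is a minimal pure-Python sketch on a single weight vector. It is not the AutoRound implementation: the function names (`quantize`, `loss_and_grad`, `tune`), the single clipping factor `alpha`, the bounds on the offsets, the learning rate, and the toy calibration data are all assumptions made for illustration. The rounding step is treated as identity when differentiating (a straight-through estimator).

```python
import random

def sign(x):
    return (x > 0) - (x < 0)

def quantize(w, v, alpha, bits=2):
    """Fake-quantize w using rounding offsets v and clipping factor alpha (hypothetical helper)."""
    qmax = 2 ** (bits - 1) - 1
    qmin = -(2 ** (bits - 1))
    scale = alpha * max(abs(x) for x in w) / qmax
    q = [min(max(round(x / scale + o), qmin), qmax) for x, o in zip(w, v)]
    return [scale * qi for qi in q], q, scale

def loss_and_grad(X, w, w_hat):
    """Squared output error over calibration rows X, and its gradient w.r.t. w_hat."""
    loss, grad = 0.0, [0.0] * len(w)
    for x in X:
        err = sum(xi * (a - b) for xi, a, b in zip(x, w, w_hat))
        loss += err * err
        for i, xi in enumerate(x):
            grad[i] -= 2.0 * err * xi
    return loss, grad

def tune(X, w, bits=2, steps=200, lr=0.005):
    """Signed gradient descent over offsets v and clipping factor alpha; keeps the best point seen."""
    v, alpha = [0.0] * len(w), 1.0  # v = 0, alpha = 1 is plain round-to-nearest
    w_hat, _, _ = quantize(w, v, alpha, bits)
    rtn_loss, _ = loss_and_grad(X, w, w_hat)
    best = (rtn_loss, list(v), alpha)
    for _ in range(steps):
        w_hat, q, scale = quantize(w, v, alpha, bits)
        loss, g = loss_and_grad(X, w, w_hat)
        if loss < best[0]:
            best = (loss, list(v), alpha)
        # Straight-through estimator: d w_hat_i / d v_i is approximated by scale.
        # Only the sign of the gradient is used; offsets are kept in [-0.5, 0.5] (assumed bound).
        v = [min(max(o - lr * sign(gi * scale), -0.5), 0.5) for o, gi in zip(v, g)]
        # Treating the integer codes q as constant: d w_hat_i / d alpha = q_i * scale / alpha.
        g_alpha = sum(gi * qi * scale / alpha for gi, qi in zip(g, q))
        alpha = min(max(alpha - lr * sign(g_alpha), 0.5), 1.0)
    return rtn_loss, best

# Toy usage with random weights and random calibration inputs (illustrative data only).
random.seed(0)
weights = [random.uniform(-1, 1) for _ in range(16)]
calib = [[random.gauss(0, 1) for _ in range(16)] for _ in range(32)]
rtn, (tuned, _, tuned_alpha) = tune(calib, weights)
print(f"round-to-nearest loss: {rtn:.4f}, tuned loss: {tuned:.4f}, alpha: {tuned_alpha:.3f}")
```

Because the loop records the best loss seen, starting from plain round-to-nearest, the tuned result in this sketch can never be worse than round-to-nearest on the calibration data. How much it improves depends on the data and the step size.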
Published Apr 29, 2025 · ★★★★★
Read the source: huggingface.co/blog/autoround
- Source: HuggingFace Blog
- Ingested: Apr 29, 2025 · 19:10
- Editorial score: 4.0 / 5