article · HuggingFace Blog
Introducing AutoRound: Intel’s Advanced Quantization for LLMs and VLMs
AutoRound is Intel's weight-only post-training quantization (PTQ) method. It uses signed gradient descent to jointly optimize weight rounding and clipping ranges for accurate low-bit quantization (INT2–INT8). The post claims up to 2.1x higher relative accuracy at 2-bit and reports quantizing a 72B model in about 37 minutes on an A100 GPU. It also describes support for mixed-bit tuning, multiple export formats, and tuning recipes (auto-round-best and auto-round-light) that work with small calibration sets.
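To make the idea of jointly tuning rounding and clipping with signed gradient descent concrete, here is a minimal NumPy sketch on a single linear layer. It is an illustration under simplifying assumptions, not Intel's implementation. It uses symmetric per-row quantization, a straight-through estimator for the rounding step, and a fixed learning rate. The names `quantize`, `signsgd_round`, `V` (rounding offset), and `alpha` (clipping factor) are hypothetical and chosen only for this example.

```python
import numpy as np


def quantize(W, V, alpha, bits):
    """Symmetric fake-quantization with rounding offset V and clipping factor alpha."""
    qmax = 2 ** (bits - 1) - 1
    # Per-row scale; alpha < 1 shrinks the clipping range (assumes no all-zero rows).
    scale = alpha * np.maximum(np.abs(W).max(axis=1, keepdims=True), 1e-12) / qmax
    q_raw = np.round(W / scale + V)
    q = np.clip(q_raw, -qmax - 1, qmax)
    inside = (q_raw == q).astype(W.dtype)  # 1 where the value was not clipped
    return scale * q, q, scale, inside


def signsgd_round(W, X, bits=4, steps=200, lr=5e-3):
    """Tune V and alpha with sign-of-gradient updates to match the layer output W @ X."""
    qmax = 2 ** (bits - 1) - 1
    V = np.zeros_like(W)                # V = 0 is plain round-to-nearest (RTN)
    alpha = np.ones((W.shape[0], 1))    # alpha = 1 means no extra clipping
    row_max = np.maximum(np.abs(W).max(axis=1, keepdims=True), 1e-12)
    target = W @ X
    rtn_loss, best_loss, best_Wq = None, np.inf, None

    for _ in range(steps + 1):
        Wq, q, scale, inside = quantize(W, V, alpha, bits)
        err = Wq @ X - target
        loss = float(np.mean(err ** 2))
        if rtn_loss is None:
            rtn_loss = loss             # first pass is the RTN baseline
        if loss < best_loss:
            best_loss, best_Wq = loss, Wq.copy()

        # Gradient of the mean squared output error with respect to Wq.
        G = 2.0 * err @ X.T / err.size
        # Straight-through estimator: treat round() as identity where not clipped.
        grad_V = G * scale * inside
        dWq_dscale = q - (W / scale) * inside
        grad_alpha = np.sum(G * dWq_dscale, axis=1, keepdims=True) * row_max / qmax

        # Signed gradient descent: only the sign of each gradient is used.
        V = np.clip(V - lr * np.sign(grad_V), -0.5, 0.5)
        alpha = np.clip(alpha - lr * np.sign(grad_alpha), 0.5, 1.0)

    return best_Wq, rtn_loss, best_loss


rng = np.random.default_rng(0)
W = rng.normal(size=(8, 32))    # toy weight matrix (out_features x in_features)
X = rng.normal(size=(32, 64))   # toy calibration activations
Wq, rtn_loss, tuned_loss = signsgd_round(W, X, bits=4)
print(f"RTN loss {rtn_loss:.5f} -> tuned loss {tuned_loss:.5f}")
```

Because the loop keeps the best weights seen so far and its first pass is plain round-to-nearest, the tuned loss can never be worse than the RTN baseline on the calibration data. How much it improves depends on the data, bit width, and step count.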
published APR 29, 2025
Read the source: huggingface.co/blog/autoround
- Source: HuggingFace Blog
- Ingested: APR 29, 2025 · 19:10
- Editorial score: 4.0 / 5