Article · HuggingFace Blog

Introducing AutoRound: Intel’s Advanced Quantization for LLMs and VLMs

AutoRound is Intel's weight-only post-training quantization (PTQ) method. It uses signed gradient descent to jointly optimize weight rounding and clipping ranges for accurate low-bit quantization (INT2 to INT8). The source claims up to 2.1x higher relative accuracy at 2-bit and quantization of a 72B model in about 37 minutes on an A100. It also supports mixed-bit tuning, multiple export formats, and preset recipes (auto-round-best, auto-round-light) that work with small calibration sets.
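To make the idea of tuning rounding and clipping with signed gradient descent concrete, here is a minimal toy sketch in pure Python. It is not the auto-round library's API or Intel's implementation. The function names (`quantize`, `loss_and_grads`, `tune`), the single-row "layer", the straight-through gradient approximation, the clip range of [0.5, 1.0], and the step size of 1/steps are all assumptions chosen for illustration.

```python
import random

def sign(x):
    return (x > 0) - (x < 0)

def quantize(w, v, alpha, bits):
    """Symmetric fake-quantization. v holds per-weight rounding offsets,
    alpha shrinks the clipping range (alpha=1, v=0 is round-to-nearest)."""
    qmax = 2 ** (bits - 1) - 1
    scale = alpha * max(abs(x) for x in w) / qmax
    q = [max(-qmax - 1, min(qmax, round(wi / scale + vi))) for wi, vi in zip(w, v)]
    return [qi * scale for qi in q], q, scale

def loss_and_grads(w, v, alpha, bits, X):
    """Mean squared output error of the quantized row on calibration inputs X,
    plus approximate gradients with respect to v and alpha."""
    wq, q, scale = quantize(w, v, alpha, bits)
    grad_wq = [0.0] * len(w)
    loss = 0.0
    for x in X:
        err = sum((a - b) * xi for a, b, xi in zip(wq, w, x))
        loss += err * err / len(X)
        for i, xi in enumerate(x):
            grad_wq[i] += 2 * err * xi / len(X)
    # Straight-through estimator: treat round() as identity and, for
    # simplicity, ignore the clamp when differentiating.
    grad_v = [g * scale for g in grad_wq]
    grad_alpha = sum(g * (qi - wi / scale) * (scale / alpha)
                     for g, qi, wi in zip(grad_wq, q, w))
    return loss, grad_v, grad_alpha

def tune(w, X, bits=2, steps=200):
    """Signed gradient descent: only the sign of each gradient is used.
    Returns the best (loss, v, alpha) seen, starting from round-to-nearest."""
    lr = 1.0 / steps  # assumed step size for this sketch
    v, alpha = [0.0] * len(w), 1.0
    best = (float("inf"), list(v), alpha)
    for _ in range(steps + 1):
        loss, grad_v, grad_alpha = loss_and_grads(w, v, alpha, bits, X)
        if loss < best[0]:
            best = (loss, list(v), alpha)
        v = [min(0.5, max(-0.5, vi - lr * sign(g))) for vi, g in zip(v, grad_v)]
        alpha = min(1.0, max(0.5, alpha - lr * sign(grad_alpha)))
    return best

# Toy usage with random weights and random calibration inputs.
random.seed(0)
w = [random.uniform(-1, 1) for _ in range(16)]
X = [[random.gauss(0, 1) for _ in range(16)] for _ in range(32)]
rtn_loss = loss_and_grads(w, [0.0] * len(w), 1.0, 2, X)[0]
best_loss, best_v, best_alpha = tune(w, X, bits=2)
print("round-to-nearest loss:", rtn_loss)
print("tuned loss:", best_loss, "alpha:", best_alpha)
```

Because the loop evaluates the round-to-nearest starting point first and keeps the best state seen, the returned loss can never be worse than plain rounding on the calibration data. How much it improves depends on the weights and inputs.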

Published: 29 Apr. 2025

Read the source: huggingface.co/blog/autoround

Source: HuggingFace Blog
Ingested: 29 Apr. 2025 · 19:10
Editorial score: 4.0 / 5