articleHuggingFace Blog

Welcome EmbeddingGemma, Google's new efficient embedding model

Google introduces EmbeddingGemma, a 308M-parameter multilingual embedding model optimized for on-device use with a 2K context window. It supports 100+ languages and achieves top scores on the MMTEB/MTEB benchmarks while staying under 200 MB RAM when quantized. The model uses a Gemma3-based encoder with mean pooling and an optional 768-d output that can be truncated to 512/256/128 for speed and memory efficiency, enabling fast on-device retrieval and RAG pipelines.

publié 04 SEPT. 2025★★★★★

Lire la sourcehuggingface.co/blog/embeddinggemma

[*] Ouvre dans un nouvel onglet · pas de tracking côté Lantern

Source: HuggingFace Blog
Ingéré: 04 SEPT. 2025 · 19:10
Score édito: 5.0 / 5