articleHuggingFace Blog
mmBERT: ModernBERT goes Multilingual
mmBERT is a massively multilingual encoder trained on over 3T tokens across 1,800+ languages, achieving state-of-the-art results with faster inference than prior models. It extends ModernBERT with novel components and a three-phase training schedule to improve learning for low-resource languages, using progressively sampled, broader data.
published SEP 09, 2025★★★★★
Read the sourcehuggingface.co/blog/mmbert
[*] Opens in a new tab · no tracking on Lantern's side
- Source
- HuggingFace Blog
- Ingested
- SEP 09, 2025 · 19:10
- Editorial score
- 4.0 / 5