GitHub Trending — Python
deepseek-ai/DeepSeek-V3
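DeepSeek-V3 is a Mixture-of-Experts LLM with 671B total parameters, of which 37B are activated per token. It uses Multi-head Latent Attention (MLA), an auxiliary-loss-free load-balancing strategy, and a Multi-Token Prediction training objective. The model was pre-trained on 14.8T tokens, and post-training distills reasoning capability from DeepSeek-R1. The repository reports leading results among open-source models on its benchmarks and gives steps for running the model locally on several inference backends (SGLang, LMDeploy, TRT-LLM, vLLM, LightLLM), including FP8 inference on supported backends.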
Published 28 Apr. 2026 ★★★★★
Read the source: github.com/deepseek-ai/DeepSeek-V3
- Source: GitHub Trending — Python
- Ingested: 28 Apr. 2026 · 08:40
- Editorial score: 5.0 / 5