Feed·Digest·Sources·About·

[⋯]Chargement

© 2026 Lantern·Set in Geist Mono·Sources [52]·Methodology·Privacy

Built solo in Lille, FR·v0.6

Veille dev & IA

Le meilleur du dev et de l'IA, scoré chaque jour par un agent. Filtré, résumé, classé. Aucune couleur, aucun bruit — juste la matière.

Issue: No. 138
Date: 18 MAI 2026
Édition: FR · DAILY
Sources: 14 actives
Articles: 0 aujourd'hui

§ Feed·Vol. 02·No. 138

Last ingest·10:00 UTC+2·Next·08:00

Filtres

Reference PanelA.1

01. Type— 5

02. Période— 3

03. Source— 7

04. Score— min.

0 actifs

$⌘K

Articles / jour0

7-jour moy.47

Lun → Dim-100%

Feed · 805 articles

trier parscore·DESC ↓

62121 MAI00:00

articleHuggingFace Blog·l’an dernier

nanoVLM: The simplest repository to train your VLM in pure PyTorch

nanoVLM provides a minimal PyTorch toolkit to train a Vision Language Model on a free Colab tier. It fuses a SigLIP-based vision transformer with a Llama 3 language backbone, using a Modality Projection (pixel shuffle + linear) to align image and text embeddings for decoding. The post offers quickstart steps: clone the repo and run train.py, or use the Colab notebook to begin training without local setup.

★★★★★·HuggingFace Blog

62219 MAI00:00

articleHuggingFace Blog·l’an dernier

Microsoft and Hugging Face expand collaboration

Microsoft et Hugging Face étendent leur collaboration pour déployer facilement des modèles open sur Azure via AI Foundry, avec plus de 10 000 modèles disponibles. Les modèles sont vérifiés pour la sécurité (safetensors, ProtectAI Guardian et JFrog) et déployables en quelques clics via le bouton Deploy, en choisissant VM et paramètres.

★★★★★·HuggingFace Blog

62315 MAI13:13

articleHuggingFace Blog·l’an dernier

Falcon-Edge: A series of powerful, universal, fine-tunable 1.58bit language models.

Falcon-Edge is a series of universal, fine-tunable LLMs built on BitNet with ternary weights (-1,0,1). It trains end-to-end to yield both non-quantized and quantized variants (bf16, BitNet, pre-quantized) in one pass, available in 1B and 3B sizes with base and instruction-tuned models. Early results on Hugging Face leaderboard v2 show competitive performance for its size, backed by ~1.5T tokens of pre-training and a 1-bit LLM tooling package.

★★★★★·HuggingFace Blog

62415 MAI00:00

articleHuggingFace Blog·l’an dernier

The Transformers Library: standardizing model definitions

Transformers is evolving toward a standard model-definition library, aiming to be the pivot across frameworks so an architecture supported by transformers is available in the broader ecosystem. It now covers 300+ architectures with day-0 support and tight interoperability with inference engines (vLLM, SGLang, TGI) as well as llama.cpp and MLX, while simplifying model definitions to lower the barrier to contributions.

★★★★★·HuggingFace Blog

62514 MAI00:00

articleHuggingFace Blog·l’an dernier

Improving Hugging Face Model Access for Kaggle Users

Kaggle and Hugging Face announce an integration that makes Hugging Face models visible directly on Kaggle, with a pre-populated code snippet to load models in Kaggle notebooks. You can authenticate private models with your HF_TOKEN, and consent-gated models require access requests; notebooks referencing HF models will auto-create a Kaggle model page.

★★★★★·HuggingFace Blog

62613 MAI00:00

articleHuggingFace Blog·l’an dernier

Blazingly fast whisper transcriptions with Inference Endpoints

Hugging Face unveils a blazing-fast OpenAI Whisper deployment on Inference Endpoints, delivering up to 8x speedups using vLLM and CUDA graphs on NVIDIA GPUs. The stack adds torch.compile, dynamic quantization to float8 and reduced KV cache precision to boost throughput without sacrificing transcription quality, with WER comparable to Transformer baselines across standard datasets.

★★★★★·HuggingFace Blog

62712 MAI00:00

articleHuggingFace Blog·l’an dernier

Vision Language Models (Better, faster, stronger)

Vision Language Models are getting smaller while becoming more capable, with new architectures enabling any-to-any inputs/outputs, multimodal retrieval and agents. The post surveys models like Chameleon/Lumina-mGPT, Qwen 2.5 Omni (Thinker-Talker), MiniCPM-o 2.6, Janus-Pro-7B, and Kimi-VL-A3B-Thinking, plus MoE decoders, RAG, safety, and new benchmarks (MMT-Bench, MMMU-Pro).

★★★★★·HuggingFace Blog

62811 MAI00:00

articleHuggingFace Blog·l’an dernier

LeRobot Community Datasets: The “ImageNet” of Robotics — When and How?

Cet article présente LeRobot Community Datasets comme une tentative de créer l'équivalent ImageNet pour la robotique, en insistant sur la nécessité de données diversifiées et de pratiques de curation robustes. Il identifie les défis actuels (annotations incohérentes, problèmes de correspondance de caractéristiques, épisodes de faible qualité) et propose des orientations pour améliorer la généralisation via des jeux de données plus ouverts et variés.

★★★★★·HuggingFace Blog

62930 AVR00:00

articleHuggingFace Blog·l’an dernier

The 4 Things Qwen-3’s Chat Template Teaches Us

Le Qwen-3 introduit un template de chat plus sophistiqué: la pensée peut être activée ou désactivée via enable_thinking, et la gestion du contexte est dynamique grâce à un rolling checkpoint qui conserve les réflexions pertinentes pendant les appels d’outils. Le texte souligne aussi une meilleure sérialisation des arguments des outils et montre comment ces choix influencent les performances et la lisibilité du flux de conversation par rapport à Qwen-2.5 et QwQ.

★★★★★·HuggingFace Blog

63030 AVR00:00

articleHuggingFace Blog·l’an dernier

How to Build an MCP Server with Gradio

Gradio now exposes Python functions as MCP tools, enabling LLMs to call them via an MCP server. The guide shows a concise 5-line example converting a letter-counting function into a tool, launching the server, and wiring it into MCP clients with a config snippet.

★★★★★·HuggingFace Blog

63129 AVR00:00

articleHuggingFace Blog·l’an dernier

Welcoming Llama Guard 4 on Hugging Face Hub

Meta lance Llama Guard 4, un modèle dense 12B multimodal pour filtrer les contenus inappropriés, disponible sur Hugging Face. Pruné à partir de Llama 4 Scout (pas MoE), il tourne sur un seul GPU et évalue texte et image avec 14 catégories de risques. La release inclut aussi Llama Prompt Guard 2 et des checkpoints ouverts, accompagnés d’un notebook interactif.

★★★★★·HuggingFace Blog

63229 AVR00:00

articleHuggingFace Blog·l’an dernier

Introducing AutoRound: Intel’s Advanced Quantization for LLMs and VLMs

AutoRound is Intel's weight-only post-training quantization that uses signed gradient descent to jointly optimize weight rounding and clipping for accurate low-bit quantization (INT2–INT8). It claims up to 2.1x higher relative accuracy at 2-bit, quantizes a 72B model in ~37 minutes on A100, and supports mixed-bit tuning, multiple export formats, and recipes (auto-round-best/light) with small calibration sets.

★★★★★·HuggingFace Blog

63325 AVR22:37

articleHuggingFace Blog·l’an dernier

PipelineRL

PipelineRL introduit des mises à jour de poids en vol pendant l'entraînement RL des LLM, permettant un débit d'inférence élevé tout en restant proche de l'on-policy. L'étude montre des résultats compétitifs sur 7B et 32B par rapport à Open-Reasoner-Zero sur AIME 2024 et MATH 500, avec une implémentation plus simple (pas de fonction valeur et sans pénalité KL).

★★★★★·HuggingFace Blog

63425 AVR00:00

articleHuggingFace Blog·l’an dernier

Tiny Agents: an MCP-powered agent in 50 lines of code

Cet article présente Tiny Agents, un agent MCP-powered en 50 lignes de code. Il explique que l’agent se résume à une boucle while sur un MCP client et détaille comment lancer des serveurs MCP locaux et exécuter des prompts (ex: recherche web, manipulation de fichiers) via un exemple en TypeScript.

★★★★★·HuggingFace Blog

63522 AVR18:33

articleHuggingFace Blog·l’an dernier

Finetuning olmOCR to be a faithful OCR-Engine

Researchers fine-tuned olmOCR-7B-0225-preview to preserve header and footer information, making it a more faithful OCR engine for business documents. They created a dataset of 8,000 documents with Qwen2.5-VL-72B-Instruct, trained with 4 gradient accumulation steps on 8xH100 for 2.5 epochs, and evaluated on header/footer-inclusive data using document anchoring. The result is a practical improvement for invoices and other layout-rich texts.

★★★★★·HuggingFace Blog

63616 AVR10:10

articleHuggingFace Blog·l’an dernier

Prefill and Decode for Concurrent Requests - Optimizing LLM Performance

Explains the two stages of token generation for LLMs—prefill, where input tokens are processed in parallel to produce the first token, and decode, where subsequent tokens are generated sequentially using a KV cache. It defines latency metrics (time to first token and time per output token) and analyzes how concurrent requests and batching affect throughput on multi-GPU setups. It also hints at batching patterns like prefill-first and chunked prefill to optimize latency.

★★★★★·HuggingFace Blog

63716 AVR00:00

articleHuggingFace Blog·l’an dernier

Introducing HELMET: Holistically Evaluating Long-context Language Models

HELMET introduces a comprehensive benchmark for evaluating long-context language models, addressing the shortcomings of perplexity and synthetic tasks by emphasizing diversity, controllability, and reliability. The blog reports evaluation across 59 LCLMs, highlights real-world task gaps, and provides a quickstart guide and links to code, data, and the paper for practical replication.

★★★★★·HuggingFace Blog

63816 AVR00:00

articleHuggingFace Blog·l’an dernier

Cohere on Hugging Face Inference Providers

Cohere devient fournisseur d'inférence sur Hugging Face Hub, permettant l'inférence serverless via Cohere et Cohere Labs sur une gamme de modèles optimisés pour l'entreprise. Les points forts incluent des contextes longs (256k sur le modèle A-03-2025), une prise en charge multilingue (23 langues) et une RAG avec citations, sécurité et outils d'agentivité. L'article décrit l'usage via l'UI, les SDK clients et un notebook Colab pour tester, avec des exemples Python utilisant huggingface_hub.

★★★★★·HuggingFace Blog

63916 AVR00:00

articleHuggingFace Blog·l’an dernier

17 Reasons Why Gradio Isn't Just Another UI Library

Gradio is framed as a full framework for ML apps, not just a UI library, offering features like universal API access, an Interactive API Recorder, SSR for fast apps, automatic queueing and real-time streaming, and enterprise-grade security. The piece lists 17 capabilities that differentiate it for production ML workflows.

★★★★★·HuggingFace Blog

64014 AVR00:00

articleHuggingFace Blog·l’an dernier

4M Models Scanned: Protect AI + Hugging Face 6 Months In

Protect AI and Hugging Face expanded Guardian's threat detection with four new modules (PAIT-ARV-100, PAIT-JOBLIB-101, PAIT-TF-200, PAIT-LMAFL-300) to cover more formats and obfuscation techniques. The integration emphasizes a zero-trust security stance with inline alerts on Hugging Face and a huntr bug‑bounty program, reporting 4.47M scans and 352k unsafe issues.

★★★★★·HuggingFace Blog

Page 32 / 41

← Préc.Suiv. →

20 sur 805 affichés

Issue 138 · Digest

Le résumé hebdo, livré dimanche.

20 articles classés par un agent. Aucun bruit, aucune pub. Désabonnement en un clic.

[top 7 jours]B.1

01.
Mullvad exit IPs as a fingerprinting vector
Lobsters
02.
Granite Embedding Multilingual R2: Open Apache 2.0 Multilingual Embeddings with 32K Context — Best Sub-100M Retrieval Quality
HuggingFace Blog
03.
what 262,715 regex questions on stack overflow haven't answered
Lobsters
04.
Mythos finds a curl vulnerability
Lobsters
05.
int a = 5; a = a++ + ++a; a = ? (2011)
Lobsters

Colophon · MakerC.1

Quentin Lecocq · @celdama

Dev fullstack · CRO freelance · Lille, FR

Lantern est un side-project — agrégation, scoring IA, digest hebdo. Construit avec Next.js 16, Drizzle, Neon & Claude. Un seul mainteneur.

[X][GitHub][RSS][Site]

RaccourcisC.2

Recherche⌘ K
Article suivantJ
Article précédentK
OuvrirEnter
FavoriF