Built solo in Lille, FR · v0.6

Dev & AI feed

The best of dev and AI, scored every day by an agent. Filtered, summarized, ranked. No color, no noise — just the substance.

Issue: No. 141
Date: MAY 21, 2026
Edition: EN · DAILY
Sources: 14 active
Articles: 0 today

§ Feed · Vol. 02 · No. 141

Last ingest · 08:00 UTC+0 · Next · 08:00

Articles / day: 0

7-day avg.: 36

Mon → Sun: -74%

Feed · 847 articles

sort by score · DESC ↓

621 · JUL 21 · 18:01

article · HuggingFace Blog · 10 mo. ago

Accelerate a World of LLMs on Hugging Face with NVIDIA NIM

NVIDIA NIM offers a single container for quickly deploying a wide range of LLMs via Hugging Face, automating adaptation, model analysis, and backend selection (TensorRT-LLM, vLLM, SGLang). It supports the Hugging Face, GGUF, and TensorRT-LLM formats and illustrates deployment with Codestral-22B using a Docker command and API tokens.

★★★★★·HuggingFace Blog

622 · JUL 18 · 00:00

article · HuggingFace Blog · 10 mo. ago

Arc Virtual Cell Challenge: A Primer

Arc Institute is launching the Virtual Cell Challenge, which aims to train a model that can predict the effect of silencing a gene on a cell, even in previously unseen cell types (context generalization). The dataset gathers roughly 300k single-cell RNA-seq profiles, and the primer proposes a mathematical framework for separating the perturbation signal from heterogeneity and technical noise, with architectures such as the State Transition Model and the State Embedding Model.

★★★★★·HuggingFace Blog

623 · JUL 17 · 00:00

article · HuggingFace Blog · 10 mo. ago

Back to The Future: Evaluating AI Agents on Predicting Future Events

AI evaluation should shift from recalling facts to forecasting future events. FutureBench uses real-world prediction markets and live news to test agents' ability to reason under uncertainty and synthesize information to predict outcomes.

★★★★★·HuggingFace Blog

624 · JUL 17 · 00:00

article · HuggingFace Blog · 10 mo. ago

Consilium: When Multiple LLMs Collaborate

Consilium is a platform that has multiple LLMs debate and reach consensus through structured discussions, with modes such as consensus, majority vote, or ranked choice. Deployed as a Gradio interface and an MCP server, it visualizes a roundtable and the exchanges between the experts. The article ties the idea to Microsoft's MAI-DxO to demonstrate the effectiveness of the multi-LLM approach.

★★★★★·HuggingFace Blog
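
As a rough illustration of the majority-vote mode mentioned in the Consilium summary, the sketch below tallies the final answers from several models and returns the one held by a strict majority. It is not Consilium's code: the function name `majority_vote` and the tie handling (returning `None` when no answer has more than half the votes) are assumptions made for this example.

```python
from collections import Counter


def majority_vote(answers):
    """Return the answer given by a strict majority of models, else None.

    `answers` is a list with one final answer per model (hypothetical input
    shape chosen for this sketch).
    """
    if not answers:
        return None
    # most_common(1) yields the single most frequent answer and its count.
    answer, count = Counter(answers).most_common(1)[0]
    # Require more than half of all votes, so ties produce no winner.
    return answer if count > len(answers) / 2 else None


if __name__ == "__main__":
    print(majority_vote(["Paris", "Paris", "Lyon"]))
    print(majority_vote(["A", "B"]))
```

A ranked-choice or consensus mode would need more state (ballot orderings, discussion rounds) than this single tally.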

625 · JUL 17 · 00:00

mcp · HuggingFace Blog · 10 mo. ago

Five Big Improvements to Gradio MCP Servers

Gradio has released version 5.38.0 to enhance MCP servers with Seamless Local File Support via a new File Upload endpoint, real-time progress streaming for MCP clients, and a one-line OpenAPI-to-MCP conversion using gr.load_openapi. The update also improves authentication by allowing server arguments to be declared as gr.Header and surfaced in docs.

★★★★★·HuggingFace Blog

626 · JUL 16 · 00:00

article · HuggingFace Blog · 10 mo. ago

Ettin Suite: SoTA Paired Encoders and Decoders

Ettin introduces the first SoTA paired encoder-only and decoder-only models (17M–1B params) trained identically for apples-to-apples comparisons. It extends the ModernBERT recipe to both architectures, with encoders beating ModernBERT and decoders beating Llama 3.2 and SmolLM2, while preserving architecture-specific advantages.

★★★★★·HuggingFace Blog

627 · JUL 15 · 00:00

article · HuggingFace Blog · 10 mo. ago

Migrating the Hub from Git LFS to Xet

Hugging Face has deployed Xet as the Hub’s storage backend, migrating hundreds of petabytes and millions of repos with minimal disruption. The migration relies on a Git LFS Bridge and background content migrations to support both LFS and Xet, allowing a no-hard-cutover transition. The approach aims to scale storage with AI workloads while preserving existing workflows.

★★★★★·HuggingFace Blog

628 · JUL 10 · 12:54

article · HuggingFace Blog · 10 mo. ago

Kimina-Prover: Applying Test-time RL Search on Large Formal Reasoning Models

Kimina-Prover-72B uses Test-Time Reinforcement Learning (TTRL) search to autonomously discover and reuse lemmas for long-horizon Lean proofs, plus an error-fixing module that interprets Lean messages for targeted corrections. It achieves state-of-the-art miniF2F performance (92.2% pass rate) and shows pass@1/32/1024 of 63.9/84.0/87.7, with two distilled variants released (8B and 1.7B).

★★★★★·HuggingFace Blog
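
For readers unfamiliar with the pass@k figures quoted in the Kimina-Prover summary, the sketch below computes the widely used unbiased pass@k estimator, 1 - C(n-c, k) / C(n, k), for one problem with n sampled attempts of which c are correct. Whether the Kimina-Prover authors use exactly this estimator is an assumption; the function name `pass_at_k` and the sample counts are illustrative only.

```python
from math import comb


def pass_at_k(n, c, k):
    """Estimate the probability that at least one of k samples is correct.

    n: total attempts sampled for a problem
    c: number of those attempts that were correct
    k: budget of attempts being evaluated (k <= n)
    """
    if not 0 <= c <= n or not 1 <= k <= n:
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    # comb(n - c, k) counts the ways to draw k attempts that all fail;
    # it is 0 when fewer than k failing attempts exist.
    return 1.0 - comb(n - c, k) / comb(n, k)


if __name__ == "__main__":
    # Hypothetical counts: 10 attempts, 3 of them correct.
    print(pass_at_k(10, 3, 1))
    print(pass_at_k(10, 3, 5))
```

A benchmark-level number is then the mean of this value over all problems.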

629 · JUL 10 · 00:00

article · HuggingFace Blog · 11 mo. ago

ScreenEnv: Deploy your full stack Desktop Agent

ScreenEnv is a Python library that runs isolated Ubuntu desktop environments in Docker to test and deploy GUI agents with full desktop control, including window management and file operations. It supports both the Model Context Protocol (MCP) and a Direct Sandbox API, enabling flexible integration with existing backends or AI systems. The article provides setup examples and a quick path to build a custom Desktop Agent using smolagents.

★★★★★·HuggingFace Blog

630 · JUL 10 · 00:00

mcp · HuggingFace Blog · 11 mo. ago

Building the Hugging Face MCP Server

The Hugging Face MCP Server gives customized AI assistants access to the Hub and thousands of apps through a single URL. The article compares MCP transports (Streamable HTTP vs SSE) and outlines three patterns: Direct Response, Request Scoped Streams, and Server Push Streams, with their trade-offs and the connection management each needs. It also covers making the server dynamic and remotely configurable, plus client-side usage hints (TypeScript/Python).

★★★★★·HuggingFace Blog

631 · JUL 10 · 00:00

article · HuggingFace Blog · 11 mo. ago

Asynchronous Robot Inference: Decoupling Action Prediction and Execution

Asynchronous inference decouples action prediction from execution in robotic policies, reducing runtime lag and enabling replanning with action chunks. The article describes a two-component architecture (PolicyServer and RobotClient) using gRPC to achieve ~2× speedups and continuous operation, and explains why sequential inference falls short.

★★★★★·HuggingFace Blog
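
The decoupling described in the asynchronous inference summary can be sketched in-process with a bounded queue: one thread keeps predicting action chunks while another executes them. This is a toy under stated assumptions, not the article's implementation. The article uses gRPC between a PolicyServer and a RobotClient, whereas here plain threads stand in for both, and `fake_policy` and `run_async` are hypothetical names.

```python
import queue
import threading


def fake_policy(observation, chunk_size):
    """Hypothetical stand-in for a policy: returns a chunk of future actions."""
    return [f"obs{observation}-a{i}" for i in range(chunk_size)]


def run_async(num_chunks, chunk_size=3):
    """Predict chunks in one thread while another thread executes them."""
    chunks = queue.Queue(maxsize=2)  # small buffer: prediction runs slightly ahead
    executed = []

    def policy_server():
        for obs in range(num_chunks):
            chunks.put(fake_policy(obs, chunk_size))  # blocks if the buffer is full
        chunks.put(None)  # sentinel: no more chunks

    def robot_client():
        while True:
            chunk = chunks.get()
            if chunk is None:
                break
            executed.extend(chunk)  # "execute" each action in order

    workers = [
        threading.Thread(target=policy_server),
        threading.Thread(target=robot_client),
    ]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    return executed


if __name__ == "__main__":
    print(run_async(num_chunks=2, chunk_size=2))
```

With a sequential loop the robot would idle during every prediction; here the executor only waits when the queue is empty.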

632 · JUL 09 · 00:00

mcp · HuggingFace Blog · 11 mo. ago

Upskill your LLMs With Gradio MCP Servers

The article introduces the Model Context Protocol (MCP) and explains how MCP servers can extend LLMs with new abilities via an app-store-like ecosystem. It guides readers to find MCP-compatible spaces in Hugging Face Spaces and demonstrates wiring a Flux.1 Kontext[dev] MCP server to an LLM to edit images from text prompts.

★★★★★·HuggingFace Blog

633 · JUL 09 · 00:00

article · HuggingFace Blog · 11 mo. ago

Reachy Mini - The Open-Source Robot for Today's and Tomorrow's AI Builders

Reachy Mini is an open-source, programmable robot for human-robot interaction and AI prototyping. It runs Python (and soon JS/Scratch), ships as a kit, and starts at $299, with a wireless compute version at $449. It emphasizes open hardware/software, a growing community, and privacy controls by design.

★★★★★·HuggingFace Blog

634 · JUL 09 · 00:00

article · HuggingFace Blog · 11 mo. ago

Creating custom kernels for the AMD MI300

Hugging Face and AMD optimize custom kernels for MI300X to boost Llama 3.1 405B inference in FP8 on 8 GPUs. They combine fused residual/RMS norm/FP8 conversion, SwiGLU, and a Skinny GEMM kernel, with benchmarks and open-source tooling in hf-rocm-kernels.

★★★★★·HuggingFace Blog

635 · JUL 08 · 00:00

article · HuggingFace Blog · 11 mo. ago

Three Mighty Alerts Supporting Hugging Face’s Production Infrastructure

Hugging Face outlines three major alerts that bolster production infrastructure monitoring. The highlighted NAT Gateway throughput alert acts as an early warning for unusual egress traffic and helps with cost optimization, while other alerts cover Kubernetes API request errors, rate limiting, and a bonus alert for new clusters sending zero metrics.

★★★★★·HuggingFace Blog

636 · JUL 08 · 00:00

article · HuggingFace Blog · 11 mo. ago

SmolLM3: smol, multilingual, long-context reasoner

SmolLM3 is a competitive open 3B model (11T tokens) with 128k long context via NoPE and YaRN, and multilingual support across six languages. It uses a Llama-based decoder with Grouped Query Attention, intra-document masking, and training-stability tweaks, plus a dual Instruct/Reasoning model.

★★★★★·HuggingFace Blog

637 · JUL 08 · 00:00

article · HuggingFace Blog · 11 mo. ago

Efficient MultiModal Data Pipeline

The post identifies sources of waste in multimodal training data pipelines, such as idle GPUs and excessive padding. It outlines a five-stage approach, from visualizing the dataset to packing batches with a knapsack strategy, to reduce padding and maximize throughput. It also introduces an iterable dataset and a repository for practical reproduction.

★★★★★·HuggingFace Blog
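
To make the padding argument in the data pipeline summary concrete, the sketch below packs sequence lengths into fixed-size rows with first-fit decreasing, a common greedy bin-packing heuristic. The summary does not say which knapsack variant the post uses, so this is a swapped-in technique rather than the post's algorithm, and `pack_sequences`, `padding_tokens`, and the example lengths are all hypothetical.

```python
def pack_sequences(lengths, max_len):
    """Greedily pack sequence lengths into rows of capacity max_len.

    Uses first-fit decreasing: longest sequences first, each placed in the
    first row that still has room.
    """
    rows = []       # each row is a list of sequence lengths
    free = []       # free[i] is the unused capacity of rows[i]
    for length in sorted(lengths, reverse=True):
        if length > max_len:
            raise ValueError(f"sequence of length {length} exceeds max_len")
        for i, room in enumerate(free):
            if length <= room:
                rows[i].append(length)
                free[i] -= length
                break
        else:
            # No existing row fits, so open a new one.
            rows.append([length])
            free.append(max_len - length)
    return rows


def padding_tokens(rows, max_len):
    """Count padding tokens needed to fill every row up to max_len."""
    return sum(max_len - sum(row) for row in rows)


if __name__ == "__main__":
    lengths = [7, 5, 4, 3, 1]
    unpacked = [[n] for n in lengths]      # one sequence per row
    packed = pack_sequences(lengths, max_len=10)
    print(padding_tokens(unpacked, 10), padding_tokens(packed, 10))
```

For these lengths, one sequence per row wastes 30 of 50 token slots, while the packed layout fills two rows completely.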

638 · JUL 04 · 12:25

article · HuggingFace Blog · 11 mo. ago

Announcing NeurIPS 2025 E2LM Competition: Early Training Evaluation of Language Models

Announcement of the NeurIPS 2025 E2LM competition, which aims to capture signals in the early training phase of LLMs in the scientific domain. Submissions will be evaluated via ScoreSQ, ScoreRC, and ScoreCS, with Score = α1·ScoreSQ + α2·ScoreRC + α3·ScoreCS and weights α1=0.5, α2=0.1, α3=0.4. Registration takes place on Hugging Face and evaluation relies on lm-evaluation-harness, with phases running from July to November 2025 and results expected on November 4, 2025.

★★★★★·HuggingFace Blog
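
The weighted sum quoted in the E2LM summary is easy to check by hand. The sketch below applies the stated weights (α1=0.5, α2=0.1, α3=0.4) to three component scores. The function name `e2lm_score` and the input values are hypothetical, and how the competition computes each component is not covered here.

```python
# Weights as given in the summary: alpha1 (SQ), alpha2 (RC), alpha3 (CS).
ALPHA_SQ = 0.5
ALPHA_RC = 0.1
ALPHA_CS = 0.4


def e2lm_score(score_sq, score_rc, score_cs):
    """Combine the three component scores into the final weighted score."""
    return ALPHA_SQ * score_sq + ALPHA_RC * score_rc + ALPHA_CS * score_cs


if __name__ == "__main__":
    # Hypothetical component scores, for illustration only.
    print(round(e2lm_score(0.8, 0.5, 0.6), 4))
```

With these made-up inputs the terms are 0.40 + 0.05 + 0.24, so ScoreRC moves the total far less than the other two components.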

639 · JUL 01 · 00:00

article · HuggingFace Blog · 11 mo. ago

Training and Finetuning Sparse Embedding Models with Sentence Transformers

Sentence Transformers can finetune sparse encoder/embedding models for retrieval, hybrid search, and reranking. The post outlines training components (model, datasets, losses, trainer, evaluators) and provides practical examples, including using naver/splade-v3 and decoding sparse embeddings for interpretability.

★★★★★·HuggingFace Blog

640 · JUN 27 · 21:09

article · HuggingFace Blog · 11 mo. ago

Welcome the NVIDIA Llama Nemotron Nano VLM to Hugging Face Hub

NVIDIA Llama Nemotron Nano VL is an 8B vision-language model optimized for document identification and OCR, offering high accuracy on OCRBench v2 and reliable extraction of text, tables, and visual elements. Available on Hugging Face, it builds on Llama-3.1-8B-Instruct and C-RADIOv2-VLM-H and can be post-trained with NVIDIA NeMo; a notebook tutorial shows how to quickly build automation solutions for invoices, contracts, and healthcare documents.

★★★★★·HuggingFace Blog

Page 32 / 43

20 of 847 shown

Issue 141 · Digest

The weekly digest, every Sunday.

20 articles ranked by an agent. No noise, no ads. One-click unsubscribe.

[top 7 days] · B.1

01.
Chasing down why installing the kernel segfaulted
Lobsters
02.
I turned a $80 RK3562 Android tablet into a Debian Linux workstation
Hacker News (100+ pts)
03.
Mullvad exit IPs as a fingerprinting vector
Lobsters
04.
Granite Embedding Multilingual R2: Open Apache 2.0 Multilingual Embeddings with 32K Context — Best Sub-100M Retrieval Quality
HuggingFace Blog
05.
int a = 5; a = a++ + ++a; a = ? (2011)
Lobsters

Colophon · Maker · C.1

Quentin Lecocq · @celdama

Fullstack dev · CRO freelance · Lille, FR

Lantern is a side-project — aggregation, AI scoring, weekly digest. Built with Next.js 16, Drizzle, Neon & Claude. One maintainer.

