Built solo in Lille, FR·v0.6

Dev & AI feed

The best of dev and AI, scored every day by an agent. Filtered, summarized, ranked. No color, no noise — just the substance.

Issue: No. 141
Date: MAY 21, 2026
Edition: EN · DAILY
Sources: 14 active
Articles: 42 today

§ Feed·Vol. 02·No. 141

Last ingest·08:00 UTC+0·Next·08:00

Articles / day: 42

7-day avg.: 42

Mon → Sun: -62%

Feed · 851 articles

sort by score · DESC ↓

721 · FEB 13 · 00:00

article · HuggingFace Blog · last yr.

1 Billion Classifications

The piece breaks down how to cost-effectively run 1B+ classifications or embeddings at scale, analyzing model architectures, hardware options, and deployment choices. It offers a framework to estimate cost and latency, plus a practical stack (Inference Endpoints, Hugging Face Hub, Infinity, k6) to benchmark and optimize throughput.

★★★★★·HuggingFace Blog

722 · FEB 12 · 00:00

tool · HuggingFace Blog · last yr.

Build awesome datasets for video generation

The post describes tooling to build video-generation datasets, extending the img2dataset approach to videos. It presents a three-stage pipeline (Acquisition, Pre-processing/Filtering, Processing) using yt-dlp, Video to Scenes, watermark and aesthetic/NSFW checks, motion scoring, and Florence-2-based captions/OCR to filter data for fine-tuning.

★★★★★·HuggingFace Blog

723 · FEB 12 · 00:00

article · HuggingFace Blog · last yr.

From Chunks to Blocks: Accelerating Uploads and Downloads on the Hub

The article argues that content-defined chunking (CDC) is only a means to speed up data movement, not the ultimate goal of deduplication. To scale, the team moves from per-chunk transfers to aggregation: blocks up to 64MB reduce CAS entries by ~1000x, while shards map files to blocks and detect changes, enabling faster uploads/downloads.

★★★★★·HuggingFace Blog

724 · FEB 10 · 16:10

article · HuggingFace Blog · last yr.

Open R1: Update #2

OpenR1-Math-220k is a large-scale math-reasoning dataset built on 512 H100s, with two (often four) solutions per problem and about 800k reasoning traces. It uses automated filtering (Math Verify) with Llama-3.3-70B-Instruct as a judge, and achieves high throughput with vLLM and SGLang (~300k solutions/day). The update also details distillation results and a pipeline extensible to other domains.

★★★★★·HuggingFace Blog

725 · FEB 10 · 00:00

article · HuggingFace Blog · last yr.

The Open Arabic LLM Leaderboard 2

The Open Arabic LLM Leaderboard evolved from fragmented benchmarks into a unified platform. Since May 2024, OALL has hosted 14 benchmarks; later, the Balsam Index added ~1,400 datasets and 50,000 questions, AraGen launched 3C3H with private test cycles, and SEAL introduced a private Arabic leaderboard with human-preference evaluation. The ecosystem drew 46k visitors and 700+ models from 180+ organizations.

★★★★★·HuggingFace Blog

726 · FEB 04 · 00:00

article · HuggingFace Blog · last yr.

DABStep: Data Agent Benchmark for Multi-step Reasoning

Introduces DABStep, a benchmark of 450+ real-world data analysis tasks for evaluating multi-step reasoning in AI agents. The study finds that current top agents reach only about 16% accuracy, which shows how far they remain from reliably handling real data tasks that mix structured data and unstructured documents.

★★★★★·HuggingFace Blog

727 · FEB 04 · 00:00

article · HuggingFace Blog · last yr.

Open-source DeepResearch – Freeing our search agents

OpenAI released Deep Research, a system that browses the web to summarize content and answer from that summary. The article presents an open-source reproduction of the agentic framework (CodeAgent) and results on the GAIA benchmark (≈67% one-shot, 47.6% on level 3). The authors aim to open-source the framework and outline the next steps toward reproducibility.

★★★★★·HuggingFace Blog

728 · FEB 04 · 00:00

article · HuggingFace Blog · last yr.

π0 and π0-FAST: Vision-Language-Action Models for General Robot Control

π0 and π0-FAST are Vision-Language-Action models for generalist robot control. They use flow-matching for real-time action trajectories (50 Hz) across seven platforms and 68 tasks, and introduce fast attention techniques (FlashAttention2, FlexAttention) to handle 2D masks and cross-embodiment training. The post also points to Hugging Face LeRobot repos for code and pretrained models.

★★★★★·HuggingFace Blog

729 · FEB 02 · 00:04

article · HuggingFace Blog · last yr.

Open-R1: Update #1

Open-R1: Update #1 summarizes progress on replicating DeepSeek-R1's training pipeline and synthetic data (MATH-500, GRPO in TRL 0.14, DeepSpeed ZeRO, vLLM). The post also discusses the challenges posed by long generations and points to community resources and a public leaderboard.

★★★★★·HuggingFace Blog

730 · JAN 31 · 10:29

article · HuggingFace Blog · last yr.

Mini-R1: Reproduce Deepseek R1 „aha moment“ a RL tutorial

Replaying the DeepSeek-R1 'aha moment', this post uses Group Relative Policy Optimization (GRPO) and the Countdown Game to train an open model via RL. It details a distributed setup with DeepSpeed and vLLM on 4× NVIDIA H100 GPUs and explains how GRPO replaces a value function with group-based baselines. The aim is self-verification and search abilities learned with minimal human data, illustrating a concrete RL workflow for LLMs.
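The group-based baseline mentioned in this summary can be sketched in a few lines. This is a minimal illustration, not the post's code: it assumes each prompt gets a group of sampled completions with one scalar reward each, and that a completion's advantage is its reward minus the group mean, divided by the group's standard deviation. The function name, the `eps` term, and the use of the population standard deviation are assumptions made for the example.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Score each completion against the other completions for the same prompt.

    The group mean stands in for a learned value function as the baseline.
    `eps` (an assumption) avoids division by zero when all rewards are equal.
    """
    baseline = mean(rewards)
    spread = pstdev(rewards)  # population std; the choice is an assumption
    return [(r - baseline) / (spread + eps) for r in rewards]


# Hypothetical rewards for four completions of one prompt (1.0 = correct).
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)
```

Completions that beat their group's average get a positive advantage and the rest get a negative one, so no separate value network is needed to produce the baseline.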

★★★★★·HuggingFace Blog

731 · JAN 31 · 00:00

article · HuggingFace Blog · last yr.

The AI tools for Art Newsletter - Issue 1

2024 saw major open-source breakthroughs in AI art, with a shift to diffusion transformers (DiT) and flow matching in text-to-image generation. Flux.1 achieved state-of-the-art results, outperforming some closed models, as open releases like Stable Diffusion 3, Stable Diffusion 3.5, AuraFlow and HunyuanDiT expanded the open ecosystem. The article also highlights personalization advances via SDXL and hints at 2025’s ongoing open-source momentum.

★★★★★·HuggingFace Blog

732 · JAN 30 · 00:00

article · HuggingFace Blog · last yr.

How to deploy and fine-tune DeepSeek models on AWS

The article shows how to deploy and fine-tune DeepSeek R1 models on AWS with Hugging Face. It covers Inference Endpoints, Bedrock, SageMaker, and EC2 Neuron deployments, plus notes on pricing and upcoming Inferentia support.

★★★★★·HuggingFace Blog

733 · JAN 28 · 00:00

article · HuggingFace Blog · last yr.

Welcome to Inference Providers on the Hub

Hugging Face is rolling out four serverless inference providers (fal, Replicate, Sambanova, Together AI) directly on Hub model pages and integrating them into the JS and Python SDKs. Users can configure their own API keys or route requests through Hugging Face, with usage examples that call models such as DeepSeek-R1 via InferenceClient.

★★★★★·HuggingFace Blog

734 · JAN 28 · 00:00

article · HuggingFace Blog · last yr.

Open-R1: a fully open reproduction of DeepSeek-R1

Open-R1 aims to reproduce DeepSeek-R1's reasoning capabilities using reinforcement learning with minimal human supervision, and to reveal training data and hyperparameters. It builds on DeepSeek-V3, a 671B Mixture-of-Experts model, and contrasts the RL-only DeepSeek-R1-Zero with the refined DeepSeek-R1.

★★★★★·HuggingFace Blog

735

tool · GitHub Trending — Python

5 min to read

YILING0013/AI_NovelGenerator

Documents an AI novel generator, with a modular architecture, detailed configuration, and a step-by-step workflow (setup, generation, and checks).

★★★★★·GitHub Trending — Python

736

article · GitHub Trending — Python

2 min to read

dreammis/social-auto-upload

Open-source tool for automatically publishing videos to multiple platforms via a CLI and headless mode, with a modular architecture and plans for a refactor.

★★★★★·GitHub Trending — Python

737

tool · GitHub Trending — Python

6 min to read

lllyasviel/Fooocus

Reviews Fooocus: an offline image tool based on SDXL, with simplified installation and model management, plus warnings about official sources and hardware requirements.

★★★★★·GitHub Trending — Python

738

tool · GitHub Trending — Python

3 min to read

conorluddy/ios-simulator-skill

Ready-to-use Claude skill for building, testing, and automating iOS with 22 scripts: Xcode via xcodebuild, navigation via idb/simctl, and summarized output.

★★★★★·GitHub Trending — Python

739

tool · GitHub Trending — Python

2 min to read

cocoindex-io/cocoindex

CocoIndex provides an incremental layer that keeps AI contexts fresh by reusing only the delta between varied sources, with Python/Rust runtimes and …

★★★★★·GitHub Trending — Python

740

tool · GitHub Trending — Python

5 min to read

OpenBMB/VoxCPM

VoxCPM2 is a tokenizer-free autoregressive diffusion TTS system with 2B parameters, 30 languages, Voice Design and controllable cloning, 48 kHz output, and an open-source deployment.

★★★★★·GitHub Trending — Python

Page 37 / 43

20 of 851 shown

Issue 141 · Digest

The weekly digest, every Sunday.

20 articles ranked by an agent. No noise, no ads. One-click unsubscribe.

[top 7 days] · B.1

01.
Chasing down why installing the kernel segfaulted
Lobsters
02.
I turned a $80 RK3562 Android tablet into a Debian Linux workstation
Hacker News (100+ pts)
03.
Mullvad exit IPs as a fingerprinting vector
Lobsters
04.
Granite Embedding Multilingual R2: Open Apache 2.0 Multilingual Embeddings with 32K Context — Best Sub-100M Retrieval Quality
HuggingFace Blog
05.
int a = 5; a = a++ + ++a; a = ? (2011)
Lobsters

Colophon · Maker · C.1

Quentin Lecocq · @celdama

Fullstack dev · CRO freelance · Lille, FR

Lantern is a side-project — aggregation, AI scoring, weekly digest. Built with Next.js 16, Drizzle, Neon & Claude. One maintainer.
