🌀 RLM — Recursive & Recurrent Language Models

Last updated: 2026-05-01 at 03:32 UTC

Monitoring of recurrent alternatives to Transformers: RWKV, Mamba/SSM, RetNet, xLSTM, Hyena, Griffin/Hawk. Sources: official RSS feeds, arXiv publications, Hacker News discussions.
PDF = direct link to the arXiv preprint. History: 227 article(s) over 360 days.

📰 Today's news

🌀

RWKV Language Model

https://github.com/BlinkDL/RWKV-LM

Official blog

RWKV-v5

RWKV-v4neo

RWKV v2 - RNN with Transformer Performance

0.02

0.01

Hacker News

2023-03-30

The RWKV language model: An RNN with the advantages of a transformer

HN 184▲

2024-12-30

RWKV Language Model

HN 183▲

2023-07-04

How the RWKV language model works

HN 71▲

2023-06-29

RWKV Language Model Math

HN 1▲

2025-03-29

DeltaProduct: Improving State-Tracking in Linear RNNs via Householder Products

HN 1▲

🐍

Mamba / SSM

https://github.com/state-spaces/mamba

Official blog

v2.3.1

v2.3.0

v2.2.6.post3

v2.2.6.post2

v2.2.6

v2.2.6.post1

v2.2.5

v2.2.4

v2.2.3

v2.2.3.post2

Hacker News

2024-02-25

Mamba Explained: The State Space Model Taking On Transformers

HN 270▲

2024-01-09

MoE-Mamba: Efficient Selective State Space Models with Mixture of Experts

HN 129▲

2026-01-28

A verification layer for browser agents: Amazon case study

HN 56▲

2026-01-21

A verification layer for browser agents: Amazon case study

HN 28▲

2023-12-04

Mamba: New SSM arch with linear-time scaling that outperforms Transformers

HN 6▲

2024-03-30

A Visual Guide to Mamba and State Space Models

HN 3▲

🤗

Hugging Face — Sequence

https://huggingface.co/blog

Official blog

2026-04-29

AI evals are becoming the new compute bottleneck

2026-04-29

Granite 4.1 LLMs: How They’re Built

2026-04-29

DeepInfra on Hugging Face Inference Providers 🔥

2026-04-28

Introducing NVIDIA Nemotron 3 Nano Omni: Long-Context Multimodal Intelligence for Documents, Audio and Video Agents

2026-04-27

How to build scalable web apps with OpenAI's Privacy Filter

2026-04-24

DeepSeek-V4: a million-token context that agents can actually use

2026-04-23

How to Use Transformers.js in a Chrome Extension

2026-04-21

QIMMA قِمّة ⛰: A Quality-First Arabic LLM Leaderboard

2026-04-21

AI and the Future of Cybersecurity: Why Openness Matters

2026-04-16

Ecom-RLVE: Adaptive Verifiable Environments for E-Commerce Conversational Agents

Hacker News

2023-07-25

Retentive Network: A Successor to Transformer Implemented in PyTorch

HN 12▲

2016-01-01

Character-Aware LSTM CNN Neural Language Models

HN 4▲

2024-05-12

Show HN: NanoXLSTM: minimal codebase for playing with xLSTM language models

HN 3▲

2026-02-05

Language Modeling, Part 5: Reverse Engineering LSTM Cells

HN 1▲

2025-03-29

DeltaProduct: Improving State-Tracking in Linear RNNs via Householder Products

HN 1▲

⚡

EleutherAI

https://blog.eleuther.ai

No articles retrieved.

🔬

Google DeepMind Research

https://deepmind.google

Hacker News

2025-03-29

DeltaProduct: Improving State-Tracking in Linear RNNs via Householder Products

HN 1▲

📘

arXiv publications

RWKV · Mamba · RetNet · xLSTM · Hyena · Griffin — PDF = direct link to the preprint

2026-04-30

HERMES++: Toward a Unified Driving World Model for 3D Scene Understanding and Generation

— Xin Zhou

Driving world models serve as a pivotal technology for autonomous driving by simulating environmental dynamics. However, existing approaches predominantly focus on future scene generation, often overlooking comprehensive 3D scene understanding. Conversely, while Large Language Models (LLMs) demonstr

PDF

2026-04-30

OmniRobotHome: A Multi-Camera Platform for Real-Time Multiadic Human-Robot Interaction

— Dingkang Liang

Human-robot collaboration has been studied primarily in dyadic or sequential settings. However, real homes require multiadic collaboration, where multiple humans and robots share a workspace, acting concurrently on interleaved subtasks with tight spatial and temporal coupling. This regime remains un

PDF

2026-04-30

Covariant Locally Localized Gravity and vDVZ Continuity

— Xiwu Chen

The Karch-Randall braneworld concerns the physics of an AdS$_{d}$ brane embedded in an ambient gravitational AdS$_{d+1}$ spacetime. The gravitational theory induced on the AdS$_{d}$ brane has a very light but massive graviton. It has been established that the zero graviton mass limit of the $d$-dime

PDF

2026-04-30

LaST-R1: Reinforcing Action via Adaptive Physical Latent Reasoning for VLA Models

— Feiyang Tan

Vision-Language-Action (VLA) models have increasingly incorporated reasoning mechanisms for complex robotic manipulation. However, existing approaches share a critical limitation: whether employing explicit linguistic reasoning that suffers from latency and discretization, or utilizing more expressi

PDF

2026-04-30

Representation Fréchet Loss for Visual Generation

— Dingyuan Zhang

We show that Fréchet Distance (FD), long considered impractical as a training objective, can in fact be effectively optimized in the representation space. Our idea is simple: decouple the population size for FD estimation (e.g., 50k) from the batch size for gradient computation (e.g., 1024). We term

PDF

2026-04-30

Towards Systematics of Calabi-Yau Landscape for String Cosmology

— Hengshuang Zhao

In this review, we discuss the relevance and impact of studying Calabi-Yau threefolds in the context of global model building in string phenomenology. First, taking a phenomenologist-friendly approach, we review how the topologies of the various divisors and curves of the compactifying CY threefolds

PDF

2026-04-30

Cosmology of fractional gravity

— Xiang Bai

This is a first study of the cosmology of classical fractional gravity, a nonlocal proposal endowed with self-adjoint fractional d'Alembertian operators which serves as the basis for an ultraviolet-complete theory of quantum gravity. We derive the classical covariant nonlocal equations of motio

PDF

2026-04-30

Visual Generation in the New Era: An Evolution from Atomic Mapping to Agentic World Modeling

— Junyoung Lee

Recent visual generation models have made major progress in photorealism, typography, instruction following, and interactive editing, yet they still struggle with spatial reasoning, persistent state, long-horizon consistency, and causal understanding. We argue that the field should move beyond appea

PDF

2026-04-30

Chemical Taxonomy of $ω$~Centauri: Ten Populations Reveal a Multi-Phase Enrichment History

— Dingkang Liang

$ω$~Centauri, the most massive globular cluster in the Milky Way, exhibits a level of stellar population complexity that has long resisted a unified chemical characterisation. We exploit high-resolution near-infrared spectroscopy from the Milky Way Mapper survey (MWM DR19) to construct one of the la

PDF

2026-04-30

Generalizable Sparse-View 3D Reconstruction from Unconstrained Images

— Dingyuan Zhang

Reconstructing 3D scenes from sparse, unposed images remains challenging under real-world conditions with varying illumination and transient occlusions. Existing methods rely on scene-specific optimization using appearance embeddings or dynamic masks, which requires extensive per-scene training and

PDF

2026-04-30

Exploration Hacking: Can LLMs Learn to Resist RL Training?

— Xiwu Chen

Reinforcement learning (RL) has become essential to the post-training of large language models (LLMs) for reasoning, agentic capabilities and alignment. Successful RL relies on sufficient exploration of diverse actions by the model during training, which creates a potential failure mode: a model cou

PDF

2026-04-30

Synthetic Computers at Scale for Long-Horizon Productivity Simulation

— Feiyang Tan

Realistic long-horizon productivity work is strongly conditioned on user-specific computer environments, where much of the work context is stored and organized through directory structures and content-rich artifacts. To scale synthetic data creation for such productivity scenarios, we introduce Synt

PDF

2026-04-30

An adaptive wavelet-based PINN for problems with localized high-magnitude source

— Dingyuan Zhang

In recent years, physics-informed neural networks (PINNs) have gained significant attention for solving differential equations, although they suffer from two fundamental limitations, namely, spectral bias inherent in neural networks and loss imbalance arising from multiscale phenomena. This paper pr

PDF

2026-04-30

LLM as Clinical Graph Structure Refiner: Enhancing Representation Learning in EEG Seizure Diagnosis

— Hengshuang Zhao

Electroencephalogram (EEG) signals are vital for automated seizure detection, but their inherent noise makes robust representation learning challenging. Existing graph construction methods, whether correlation-based or learning-based, often generate redundant or irrelevant edges due to the noisy nat

PDF

2026-04-30

AEGIS: A Holistic Benchmark for Evaluating Forensic Analysis of AI-Generated Academic Images

— Xiang Bai

We introduce AEGIS, A holistic benchmark for Evaluating forensic analysis of AI-Generated academic ImageS. Compared to existing benchmarks, AEGIS features three key advances: (1) Domain-Specific Complexity: covering seven academic categories with 39 fine-grained subtypes, exposing intrinsic forensic

PDF

2026-04-30

Strait: Perceiving Priority and Interference in ML Inference Serving

— Hao Chen

Machine learning (ML) inference serving systems host deep neural network (DNN) models and schedule incoming inference requests across deployed GPUs. However, limited support for task prioritization and insufficient latency estimation under concurrent execution may restrict their applicability in on-

PDF

2026-04-30

Enhancement of superconducting stiffness in hybrid superconducting-metallic bilayers

— Xiwu Chen

Boosting superconductivity by metallic reservoirs is the essence of Kivelson's bilayer proposal. One layer provides pairing to the electrons, while the weakly coupled metal provides additional phase coherence to those pairs by mediating extended-range pair-pair coupling. Demonstrating significa

PDF

2026-04-30

Computing Equilibrium beyond Unilateral Deviation

— Feiyang Tan

Most familiar equilibrium concepts, such as Nash and correlated equilibrium, guarantee only that no single player can improve their utility by deviating unilaterally. They offer no guarantees against profitable coordinated deviations by coalitions. Although the literature proposes solution concepts

PDF

2026-04-30

Observation of Vinen turbulence during far-from-equilibrium Bose-Einstein condensation

— Timur Şahin

Relaxation of far-from-equilibrium quantum fluids, intimately related to the emergence of long-range order, is theoretically associated with the decay of a turbulent isotropic tangle of vortex lines. We observe and study such decaying quantum turbulence in a homogeneous 3D atomic Bose gas. Using mat

PDF

2026-04-30

Intrinsic anomalous thermal hall effect as a signature of quantum metric in d-wave altermagnets

— Dingkang Liang

We investigate the intrinsic anomalous thermal Hall effect in d-wave altermagnets, where a transverse heat current is generated by a longitudinal temperature gradient in the absence of a magnetic field, with the leading response proportional to $(\nabla T)^3$. In these systems, the intrinsic Berry c

PDF

2026-04-30

RopeDreamer: A Kinematic Recurrent State Space Model for Dynamics of Flexible Deformable Linear Objects

— Dingyuan Zhang

The robotic manipulation of Deformable Linear Objects (DLOs) is a fundamental challenge due to the high-dimensional, non-linear dynamics of flexible structures and the complexity of maintaining topological integrity during contact-rich tasks. While recent data-driven methods have utilized Recurrent

PDF

2026-04-30

FlashRT: Towards Computationally and Memory Efficient Red-Teaming for Prompt Injection and Knowledge Corruption

— Hengshuang Zhao

Long-context large language models (LLMs)-for example, Gemini-3.1-Pro and Qwen-3.5-are widely used to empower many real-world applications, such as retrieval-augmented generation, autonomous agents, and AI assistants. However, security remains a major concern for their widespread deployment, with th

PDF

2026-04-30

Mapping data sensitivities in global QCD analysis with linear response and influence functions

— Xiang Bai

Global QCD analyses provide the primary framework for extracting hadron structure from experimental data, yet the mechanisms by which data constrain non-perturbative functions remain difficult to interpret due to the high dimensionality and complexity of these fits. Here we develop a framework based

PDF

2026-04-30

Beyond first-order accuracy in continuous-forcing immersed boundary methods, and their well-conditioned projection-based solution

— Rishi G. Gopalakrishnan

We introduce a refined immersed boundary (IB) methodology that is better-than-first-order accurate in practice, while preserving key properties of "continuous-forcing" IB approaches that retain a singular source term in the governing equations. Our method leverages a smoothed indicator (He

PDF

🔬

OpenAlex — Academic publications

Academic corpus indexed by OpenAlex (ResearchGate, Crossref, PubMed…)

2026-12-31

Main rooms and coffered ceilings as definers of a noble palace model in renaissance Spain. The case study of the palace of Peñaranda de Duero, Spain

— Manuel de Miguel Sánchez, Miguel Carlos Fernández Cabo, Ana González Uriel

2026-12-31

Cic. "Ad Brut." 1.9 and the Death of Porcia

— François, Ide

2026-12-31

DYNAMICS OF PORTFOLIO ASSESSMENT STRATEGY IN EVALUATING THE ENGLISH LANGUAGE SKILLS FOR PUPILS WITH HEARING IMPAIRMENTS IN FAKO DIVISION, SOUTH WEST REGION OF CAMEROON

— Ngenwie Emilia Tanyie, Therese Mungah Shalo Tchombe

2026-12-31

Toward a global eco-social policy? The OECD and Green Keynesianism

— R. Schulze Waltrup, M. Büchs, A. Kaasch

2026-12-24

Technology-Enhanced Writing Pedagogy for EFL Learners: A Multi-Study Dissertation on Practice, Effectiveness, and Teacher Perceptions

— Xiao Lin

2026-12-09

Introducing a sociocultural practices framework: how it helps to explain the emergence and spread of ‘grassroots’ housing models

— Hamiduddin, Iqbal, Pauker, Madeleine, Field, Martin

2026-06-30

The Ontological Status of AI-Generated Music: From Stylistic Mimicry to Collaborative Agency

— Shen Jiang

2026-05-31

On Efficient Approximate Aggregate Nearest Neighbor Queries over Learned Representations

— Wang, Carrie, Amer-Yahia, Sihem, Lakshmanan, Laks et al.

2026-05-03

Amélioration de la prévision spatio-temporelle par fusion du voisinage spatial : étude de cas sur la mobilité liée à la COVID-19 au Pérou

— Chuan Li, Jiang You, Hassine Moungla et al.

2026-05-01

SENS-ASR: Semantic Embedding injection in Neural-transducer for Streaming Automatic Speech Recognition

— Youness Dkhissi, Valentin Vielzeuf, Elys Allesiardo et al.

2026-05-01

Adversarial Heterogeneous Agent Learning for Robotic Systems: A Framework for Coordinated Competitive Behaviors

— Christopher Allred

2026-04-28

Synolitic Graph Neural Networks for MRI-Derived Radiomic-Based Prediction of Prostate Cancer Progression on Active Surveillance

— Mikhail I. Krivonosov, Arseniy Trukhanov, Nikita Sushentsev et al.

2026-04-27

Spurious alignment between large language models and brains can emerge from non-robust methods and overlooked confounds

— Nima Hadidi, Ebrahim Feghhi, Bryan H Song et al.

2026-04-24

HubRouter: A Pluggable Sub-Quadratic Routing Primitive for Hybrid Sequence Models

— Abhinaba Basu

2026-04-24

SpikingBrain2.0: Brain-Inspired Foundation Models for Efficient Long-Context and Cross-Platform Inference

— Yuqi Pan, Jinghao Zhuang, Yupeng Feng et al.

2026-04-23

Focus Session: Hardware and Software Techniques for Accelerating Multimodal Foundation Models

— Muhammad Shafique, Abdul Basit, Muhammad Abdullah Hanif et al.

2026-04-23

The Recurrent Transformer: Greater Effective Depth and Efficient Decoding

— Costin-Andrei Oncescu, Depen Morwani, Samy Jelassi et al.

2026-04-23

LayerBoost: Layer-Aware Attention Reduction for Efficient LLMs

— Mohamed Ali Souibgui, Jan Fostier, Rodrigo Abadía-Heredia et al.

📅 History — last 360 days

May 2026 — 29 article(s)

2026-04-30

HERMES++: Toward a Unified Driving World Model for 3D Scene Understanding and Generation

— Xin Zhou

PDF

2026-04-30

OmniRobotHome: A Multi-Camera Platform for Real-Time Multiadic Human-Robot Interaction

— Dingkang Liang

PDF

2026-04-30

Covariant Locally Localized Gravity and vDVZ Continuity

— Xiwu Chen

PDF

2026-04-30

LaST-R1: Reinforcing Action via Adaptive Physical Latent Reasoning for VLA Models

— Feiyang Tan

PDF

2026-04-30

Representation Fréchet Loss for Visual Generation

— Dingyuan Zhang

PDF

2026-04-30

Towards Systematics of Calabi-Yau Landscape for String Cosmology

— Hengshuang Zhao

PDF

2026-04-30

Cosmology of fractional gravity

— Xiang Bai

PDF

2026-04-30

Visual Generation in the New Era: An Evolution from Atomic Mapping to Agentic World Modeling

— Junyoung Lee

PDF

2026-04-30

Chemical Taxonomy of $ω$~Centauri: Ten Populations Reveal a Multi-Phase Enrichment History

— Dingkang Liang

PDF

2026-04-30

Generalizable Sparse-View 3D Reconstruction from Unconstrained Images

— Dingyuan Zhang

PDF

2026-04-30

Exploration Hacking: Can LLMs Learn to Resist RL Training?

— Xiwu Chen

PDF

2026-04-30

Synthetic Computers at Scale for Long-Horizon Productivity Simulation

— Feiyang Tan

PDF

2026-04-30

An adaptive wavelet-based PINN for problems with localized high-magnitude source

— Dingyuan Zhang

PDF

2026-04-30

LLM as Clinical Graph Structure Refiner: Enhancing Representation Learning in EEG Seizure Diagnosis

— Hengshuang Zhao

PDF

2026-04-30

AEGIS: A Holistic Benchmark for Evaluating Forensic Analysis of AI-Generated Academic Images

— Xiang Bai

PDF

2026-04-30

Strait: Perceiving Priority and Interference in ML Inference Serving

— Hao Chen

PDF

2026-04-30

Enhancement of superconducting stiffness in hybrid superconducting-metallic bilayers

— Xiwu Chen

PDF

2026-04-30

Computing Equilibrium beyond Unilateral Deviation

— Feiyang Tan

PDF

2026-04-30

Observation of Vinen turbulence during far-from-equilibrium Bose-Einstein condensation

— Timur Şahin

PDF

2026-04-30

Intrinsic anomalous thermal hall effect as a signature of quantum metric in d-wave altermagnets

— Dingkang Liang

PDF

2026-04-30

RopeDreamer: A Kinematic Recurrent State Space Model for Dynamics of Flexible Deformable Linear Objects

— Dingyuan Zhang

PDF

2026-04-30

FlashRT: Towards Computationally and Memory Efficient Red-Teaming for Prompt Injection and Knowledge Corruption

— Hengshuang Zhao

PDF

2026-04-30

Mapping data sensitivities in global QCD analysis with linear response and influence functions

— Xiang Bai

PDF

2026-04-30

Beyond first-order accuracy in continuous-forcing immersed boundary methods, and their well-conditioned projection-based solution

— Rishi G. Gopalakrishnan

PDF

2026-04-28

Synolitic Graph Neural Networks for MRI-Derived Radiomic-Based Prediction of Prostate Cancer Progression on Active Surveillance

— Mikhail I. Krivonosov, Arseniy Trukhanov, Nikita Sushentsev et al.

2026-04-24

HubRouter: A Pluggable Sub-Quadratic Routing Primitive for Hybrid Sequence Models

— Abhinaba Basu

2026-04-24

SpikingBrain2.0: Brain-Inspired Foundation Models for Efficient Long-Context and Cross-Platform Inference

— Yuqi Pan, Jinghao Zhuang, Yupeng Feng et al.

2026-04-23

Focus Session: Hardware and Software Techniques for Accelerating Multimodal Foundation Models

— Muhammad Shafique, Abdul Basit, Muhammad Abdullah Hanif et al.

2026-04-23

LayerBoost: Layer-Aware Attention Reduction for Efficient LLMs

— Mohamed Ali Souibgui, Jan Fostier, Rodrigo Abadía-Heredia et al.

April 2026 — 198 article(s)

2026-04-29

AI evals are becoming the new compute bottleneck

2026-04-29

Granite 4.1 LLMs: How They’re Built

2026-04-29

DeepInfra on Hugging Face Inference Providers 🔥

2026-04-29

Outer-Crust Equations of State for Neutron Stars

— P. S. Koliogiannis

We construct and systematically assess four outer-crust equations of state based on relativistic nuclear mass models and a machine-learning mass table. Our aim is to quantify the sensitivity of the equilibrium composition and thermodynamic properties of the outer crust to the underlying nuclear inpu

PDF

2026-04-29

Turning the TIDE: Cross-Architecture Distillation for Diffusion Large Language Models

— N. Paar

Diffusion large language models (dLLMs) offer parallel decoding and bidirectional context, but state-of-the-art dLLMs require billions of parameters for competitive performance. While existing distillation methods for dLLMs reduce inference steps within a single architecture, none address cross-arch

PDF

2026-04-29

Anomalous Transport and Explicit Symmetry Breaking in Holography

— Gongbo Zhang

We consider a holographic Einstein-Maxwell model in five dimensions with pure gauge and mixed gauge-gravitational Chern-Simons terms to study anomaly-induced transport in the presence of explicit symmetry breaking. We include the full backreaction of the scalar field and gauge fields on the metric a

PDF

2026-04-29

Optimizing Dynamic Metasurface Antenna Configurations for Direction-of-Arrival and Polarization Estimation Using an Experimentally Calibrated Multiport-Network Model

— Wen Wang

Sensing the direction of arrival and polarization of impinging signals is a key prerequisite for beamforming and interference mitigation in modern wireless communication systems. Dynamic metasurface antennas (DMAs) can multiplex direction- and polarization-dependent field information onto a single d

PDF

2026-04-29

Large quantum dot energy level shifts in anomalous photon-assisted tunneling

— Ye Tian

Orbital energy splittings are important quantum dot parameters for the operation of hole spin qubits. They are known to depend on the lateral confinement of the quantum dots. However, when changing top, plunger gate voltages, which are the typical control parameter for qubit applications, such energ

PDF

2026-04-29

Three-Step Nav: A Hierarchical Global-Local Planner for Zero-Shot Vision-and-Language Navigation

— Li Yuan

Breakthrough progress in vision-based navigation through unknown environments has been achieved by using multimodal large language models (MLLMs). These models can plan a sequence of motions by evaluating the current view at each time step against the task and goal given to the agent. However, curre

PDF

2026-04-29

Fractions of Recurrence Operators for Generalized Fourier Series in Classical Orthogonal Polynomials

— Ashis Tamang

We consider series expansions in bases of classical orthogonal polynomials. When such a series solves a linear differential equation with polynomial coefficients, its coefficients satisfy a linear recurrence equation. We interpret this equation as the numerator of a fraction of linear recurrence ope

PDF

2026-04-29

Select to Think: Unlocking SLM Potential with Local Sufficiency

— Nishal Rai

Small language models (SLMs) offer computational efficiency for scalable deployment, yet they often fall short of the reasoning power exhibited by their larger counterparts (LLMs). To mitigate this gap, current approaches invoke an LLM to generate tokens at points of reasoning divergence, but these

PDF

2026-04-29

Simulating dynamics of RLC circuits with a quantum differential-algebraic equations solver

— Ashis Tamang

We introduce a quantum algorithm for simulating the dynamics of electrical circuits consisting of resistors, inductors and capacitors (aka RLC circuits) along with power sources. Given oracle access to the connectivity of the circuit and values of the electrical elements, our algorithm prepares a qu

PDF

2026-04-29

Hyper Input Convex Neural Networks for Shape Constrained Learning and Optimal Transport

— Nishal Rai

We introduce Hyper Input Convex Neural Networks (HyCNNs), a novel neural network architecture designed for learning convex functions. HyCNNs combine the principles of Maxout networks with input convex neural networks (ICNNs) to create a neural network that is always convex in the input, theoreticall

PDF

2026-04-29

Weighted Linearization of Vector Fields via a Formal Moser Trick

— Gongbo Zhang

Many well-known theorems establish sufficient criteria for linearizability of a vector field in terms of the eigenvalues of its linear approximation. By attaching weights to coordinates so that some directions are considered "linear", others "quadratic", and so on, one can define

PDF

2026-04-29

Degree-dependent and distance-dependent contact rates interpolate between explosive, exponential and polynomial epidemic growth

— Wanrong Zheng

It is a fundamental question in epidemiology to estimate, model and predict the growth rate of a pandemic. Analogously, analysing the diffusion of innovation, (fake) news, memes, and rumours is of key importance in the social sciences. The resulting epidemic growth curves can be classified according

PDF

2026-04-29

Meta-learning-enhanced implicit full waveform inversion

— Yunhao Ge

Implicit full waveform inversion (IFWI) introduces implicit neural representations to parameterize the subsurface velocity model as a continuous function of spatial coordinates, which alleviates the dependence on the initial model and improves inversion flexibility. However, IFWI still requires a la

PDF

2026-04-29

Thermodynamics formalism for singular flows

— Ashis Tamang

We establish that $C^\infty$ three-dimensional flows with positive topological entropy admit only finitely many ergodic measures of maximal entropy, even when singularities (zero-velocity points) are present. Furthermore, every ergodic measure of maximal entropy is rapid mixing for such flows within

PDF

2026-04-29

World2VLM: Distilling World Model Imagination into VLMs for Dynamic Spatial Reasoning

— Nishal Rai

Vision-language models (VLMs) have shown strong performance on static visual understanding, yet they still struggle with dynamic spatial reasoning that requires imagining how scenes evolve under egocentric motion. Recent efforts address this limitation either by scaling spatial supervision with synt

PDF

2026-04-29

Data-driven discovery of polynomial ODEs with provably bounded solutions

— Karl Landsteiner

We introduce SILAS, a data-driven framework for discovering polynomial ordinary differential equations (ODEs) with provably bounded trajectories. Boundedness is certified by compact absorbing sets defined via polynomial Lyapunov functions. We jointly identify the ODE vector field and the Lyapunov fu

PDF

2026-04-29

High Coupling Tunable Acoustic Resonators in Monolithic Barium Titanate

— Eugenio Megias

The growing number of wireless communication bands has driven demand for compact, low-loss, and frequency adjustable RF filtering. Tunable acoustic resonators are well suited to address these needs, offering a path toward reconfigurable front ends with reduced component count. In this work, we exten

PDF

2026-04-29

Adaptive Self-Organization in Anonymous Dynamic Networks

— Brighton X. Coe

We introduce the problem of adaptive self-organization in which the nodes of an anonymous, synchronous dynamic network must distributively change the collective distribution of their responses (or "colors") as a function of time-varying environmental signals, even when these signals are on

PDF

2026-04-29

Exact Dynamic Programming for Solow--Polasky Diversity Subset Selection on Lines and Staircases

— Tyler J. Kovach

We study exact fixed-cardinality Solow--Polasky diversity subset selection on ordered finite $\ell_1$ sets, with monotone biobjective Pareto fronts and their higher-dimensional staircase analogues as central applications. Solow--Polasky diversity was introduced in biodiversity conservation, whereas

PDF

2026-04-29

Schwinger-Keldysh Path Integral for Gauge theories

— Nishal Rai

We develop the Schwinger-Keldysh path-integral formalism for open non-Abelian gauge theories that are gauge-fixed via the BRST method in covariant gauges. We focus on generic initial states, pure and mixed, specified at finite times suitable for non-equilibrium processes. We pay particular attention

PDF

2026-04-28

Enhancing customer value through artificial intelligence and machine learning: Personalization, big data analytics, and customer experience management

— Adejoke Olumide Dele-Rotimi

2026-04-27

Spurious alignment between large language models and brains can emerge from non-robust methods and overlooked confounds

— Nima Hadidi, Ebrahim Feghhi, Bryan H Song et al.

2026-04-28

Introducing NVIDIA Nemotron 3 Nano Omni: Long-Context Multimodal Intelligence for Documents, Audio and Video Agents

2026-04-28

Recursive Multi-Agent Systems

— Xiyuan Yang

Recursive or looped language models have recently emerged as a new scaling axis by iteratively refining the same model computation over latent states to deepen reasoning. We extend such scaling principle from a single model to multi-agent systems, and ask: Can agent collaboration itself be scaled th

PDF

2026-04-28

DV-World: Benchmarking Data Visualization Agents in Real-World Scenarios

— Jiaru Zou

Real-world data visualization (DV) requires native environmental grounding, cross-platform evolution, and proactive intent alignment. Yet, existing benchmarks often suffer from code-sandbox confinement, single-language creation-only tasks, and assumption of perfect intent. To bridge these gaps, we i

PDF

2026-04-28

Credit Limits beyond Full Collateralization in Decentralized Micropayments: Incentive Conditions

— Rui Pan

In decentralized non-custodial micropayments, the central challenge is not whether payments can be executed directly, but under what conditions such systems can offer credit limits without requiring full collateral backing. Existing approaches typically tie available credit to posted collateral, cau

PDF

2026-04-28

From short-lived to long-lived clouds: impact of star formation models on giant molecular cloud evolution in simulations of an NGC 300-like galaxy

— Ruizhong Qiu

Multi-wavelength observations of molecular and ionized gas indicate that GMCs are short-lived, generally dispersing within one or two dynamical timescales. To investigate the physical origin of these short lifetimes and the role of star formation prescriptions, we conduct radiation-hydrodynamic simu

PDF