🌀 RLM — Recursive & Recurrent Language Models

Last updated: 2026-05-01 at 03:32 UTC

Monitoring of recurrent alternatives to Transformers: RWKV, Mamba/SSM, RetNet, xLSTM, Hyena, Griffin/Hawk. Sources: official RSS feeds, arXiv publications, Hacker News discussions.
PDF = direct link to the arXiv preprint. History: 227 article(s) over 360 days.


📰 Today's news

🌀

Official blog

RWKV-v5
RWKV-v4neo
RWKV v2 - RNN with Transformer Performance
0.02 0.02
0.01 0.01

Hacker News

2023-03-30
The RWKV language model: An RNN with the advantages of a transformer
HN 184▲
2024-12-30
RWKV Language Model
HN 183▲
2023-07-04
How the RWKV language model works
HN 71▲
2023-06-29
RWKV Language Model Math
HN 1▲
2025-03-29
DeltaProduct: Improving State-Tracking in Linear RNNs via Householder Products
HN 1▲

Official blog

v2.3.1
v2.3.0
v2.2.6.post3
v2.2.6.post2
v2.2.6
v2.2.6.post1
v2.2.5
v2.2.4
v2.2.3
v2.2.3.post2

Hacker News

2024-02-25
Mamba Explained: The State Space Model Taking On Transformers
HN 270▲
2024-01-09
MoE-Mamba: Efficient Selective State Space Models with Mixture of Experts
HN 129▲
2026-01-28
A verification layer for browser agents: Amazon case study
HN 56▲
2026-01-21
A verification layer for browser agents: Amazon case study
HN 28▲
2023-12-04
Mamba: New SSM arch with linear-time scaling that outperforms Transformers
HN 6▲
2024-03-30
A Visual Guide to Mamba and State Space Models
HN 3▲
🤗

Hugging Face — Sequence

https://huggingface.co/blog

Official blog

2026-04-29
AI evals are becoming the new compute bottleneck
2026-04-29
Granite 4.1 LLMs: How They’re Built
2026-04-29
DeepInfra on Hugging Face Inference Providers 🔥
2026-04-28
Introducing NVIDIA Nemotron 3 Nano Omni: Long-Context Multimodal Intelligence for Documents, Audio and Video Agents
2026-04-27
How to build scalable web apps with OpenAI's Privacy Filter
2026-04-24
DeepSeek-V4: a million-token context that agents can actually use
2026-04-23
How to Use Transformers.js in a Chrome Extension
2026-04-21
QIMMA قِمّة ⛰: A Quality-First Arabic LLM Leaderboard
2026-04-21
AI and the Future of Cybersecurity: Why Openness Matters
2026-04-16
Ecom-RLVE: Adaptive Verifiable Environments for E-Commerce Conversational Agents

Hacker News

2023-07-25
Retentive Network: A Successor to Transformer Implemented in PyTorch
HN 12▲
2016-01-01
Character-Aware LSTM CNN Neural Language Models
HN 4▲
2024-05-12
Show HN: NanoXLSTM: minimal codebase for playing with xLSTM language models
HN 3▲
2026-02-05
Language Modeling, Part 5: Reverse Engineering LSTM Cells
HN 1▲
2025-03-29
DeltaProduct: Improving State-Tracking in Linear RNNs via Householder Products
HN 1▲

No articles retrieved.

🔬

Google DeepMind Research

https://deepmind.google

Hacker News

2025-03-29
DeltaProduct: Improving State-Tracking in Linear RNNs via Householder Products
HN 1▲
📘

Publications arXiv

RWKV · Mamba · RetNet · xLSTM · Hyena · Griffin — PDF = direct link to the preprint

2026-04-30
HERMES++: Toward a Unified Driving World Model for 3D Scene Understanding and Generation
— Xin Zhou
Driving world models serve as a pivotal technology for autonomous driving by simulating environmental dynamics. However, existing approaches predominantly focus on future scene generation, often overlooking comprehensive 3D scene understanding. Conversely, while Large Language Models (LLMs) demonstr
PDF
2026-04-30
OmniRobotHome: A Multi-Camera Platform for Real-Time Multiadic Human-Robot Interaction
— Dingkang Liang
Human-robot collaboration has been studied primarily in dyadic or sequential settings. However, real homes require multiadic collaboration, where multiple humans and robots share a workspace, acting concurrently on interleaved subtasks with tight spatial and temporal coupling. This regime remains un
PDF
2026-04-30
Covariant Locally Localized Gravity and vDVZ Continuity
— Xiwu Chen
The Karch-Randall braneworld concerns the physics of an AdS$_{d}$ brane embedded in an ambient gravitational AdS$_{d+1}$ spacetime. The gravitational theory induced on the AdS$_{d}$ brane has a very light but massive graviton. It has been established that the zero graviton mass limit of the $d$-dime
PDF
2026-04-30
LaST-R1: Reinforcing Action via Adaptive Physical Latent Reasoning for VLA Models
— Feiyang Tan
Vision-Language-Action (VLA) models have increasingly incorporated reasoning mechanisms for complex robotic manipulation. However, existing approaches share a critical limitation: whether employing explicit linguistic reasoning that suffers from latency and discretization, or utilizing more expressi
PDF
2026-04-30
Representation Fréchet Loss for Visual Generation
— Dingyuan Zhang
We show that Fréchet Distance (FD), long considered impractical as a training objective, can in fact be effectively optimized in the representation space. Our idea is simple: decouple the population size for FD estimation (e.g., 50k) from the batch size for gradient computation (e.g., 1024). We term
PDF
2026-04-30
Towards Systematics of Calabi-Yau Landscape for String Cosmology
— Hengshuang Zhao
In this review, we discuss the relevance and impact of studying Calabi-Yau threefolds in the context of global model building in string phenomenology. First, taking a phenomenologist-friendly approach, we review how the topologies of the various divisors and curves of the compactifying CY threefolds
PDF
2026-04-30
Cosmology of fractional gravity
— Xiang Bai
This is a first study of the cosmology of classical fractional gravity, a nonlocal proposal endowed with self-adjoint fractional d'Alembertian operators which serves as the basis for an ultraviolet-complete theory of quantum gravity. We derive the classical covariant nonlocal equations of motio
PDF
2026-04-30
Visual Generation in the New Era: An Evolution from Atomic Mapping to Agentic World Modeling
— Junyoung Lee
Recent visual generation models have made major progress in photorealism, typography, instruction following, and interactive editing, yet they still struggle with spatial reasoning, persistent state, long-horizon consistency, and causal understanding. We argue that the field should move beyond appea
PDF
2026-04-30
Chemical Taxonomy of $ω$~Centauri: Ten Populations Reveal a Multi-Phase Enrichment History
— Dingkang Liang
$ω$~Centauri, the most massive globular cluster in the Milky Way, exhibits a level of stellar population complexity that has long resisted a unified chemical characterisation. We exploit high-resolution near-infrared spectroscopy from the Milky Way Mapper survey (MWM DR19) to construct one of the la
PDF
2026-04-30
Generalizable Sparse-View 3D Reconstruction from Unconstrained Images
— Dingyuan Zhang
Reconstructing 3D scenes from sparse, unposed images remains challenging under real-world conditions with varying illumination and transient occlusions. Existing methods rely on scene-specific optimization using appearance embeddings or dynamic masks, which requires extensive per-scene training and
PDF
2026-04-30
Exploration Hacking: Can LLMs Learn to Resist RL Training?
— Xiwu Chen
Reinforcement learning (RL) has become essential to the post-training of large language models (LLMs) for reasoning, agentic capabilities and alignment. Successful RL relies on sufficient exploration of diverse actions by the model during training, which creates a potential failure mode: a model cou
PDF
2026-04-30
Synthetic Computers at Scale for Long-Horizon Productivity Simulation
— Feiyang Tan
Realistic long-horizon productivity work is strongly conditioned on user-specific computer environments, where much of the work context is stored and organized through directory structures and content-rich artifacts. To scale synthetic data creation for such productivity scenarios, we introduce Synt
PDF
2026-04-30
An adaptive wavelet-based PINN for problems with localized high-magnitude source
— Dingyuan Zhang
In recent years, physics-informed neural networks (PINNs) have gained significant attention for solving differential equations, although they suffer from two fundamental limitations, namely, spectral bias inherent in neural networks and loss imbalance arising from multiscale phenomena. This paper pr
PDF
2026-04-30
LLM as Clinical Graph Structure Refiner: Enhancing Representation Learning in EEG Seizure Diagnosis
— Hengshuang Zhao
Electroencephalogram (EEG) signals are vital for automated seizure detection, but their inherent noise makes robust representation learning challenging. Existing graph construction methods, whether correlation-based or learning-based, often generate redundant or irrelevant edges due to the noisy nat
PDF
2026-04-30
AEGIS: A Holistic Benchmark for Evaluating Forensic Analysis of AI-Generated Academic Images
— Xiang Bai
We introduce AEGIS, A holistic benchmark for Evaluating forensic analysis of AI-Generated academic ImageS. Compared to existing benchmarks, AEGIS features three key advances: (1) Domain-Specific Complexity: covering seven academic categories with 39 fine-grained subtypes, exposing intrinsic forensic
PDF
2026-04-30
Strait: Perceiving Priority and Interference in ML Inference Serving
— Hao Chen
Machine learning (ML) inference serving systems host deep neural network (DNN) models and schedule incoming inference requests across deployed GPUs. However, limited support for task prioritization and insufficient latency estimation under concurrent execution may restrict their applicability in on-
PDF
2026-04-30
Enhancement of superconducting stiffness in hybrid superconducting-metallic bilayers
— Xiwu Chen
Boosting superconductivity by metallic reservoirs is the essence of Kivelson's bilayer proposal. One layer provides pairing to the electrons, while the weakly coupled metal provides additional phase coherence to those pairs by mediating extended-range pair-pair coupling. Demonstrating significa
PDF
2026-04-30
Computing Equilibrium beyond Unilateral Deviation
— Feiyang Tan
Most familiar equilibrium concepts, such as Nash and correlated equilibrium, guarantee only that no single player can improve their utility by deviating unilaterally. They offer no guarantees against profitable coordinated deviations by coalitions. Although the literature proposes solution concepts
PDF
2026-04-30
Observation of Vinen turbulence during far-from-equilibrium Bose-Einstein condensation
— Timur Şahin
Relaxation of far-from-equilibrium quantum fluids, intimately related to the emergence of long-range order, is theoretically associated with the decay of a turbulent isotropic tangle of vortex lines. We observe and study such decaying quantum turbulence in a homogeneous 3D atomic Bose gas. Using mat
PDF
2026-04-30
Intrinsic anomalous thermal hall effect as a signature of quantum metric in d-wave altermagnets
— Dingkang Liang
We investigate the intrinsic anomalous thermal Hall effect in d-wave altermagnets, where a transverse heat current is generated by a longitudinal temperature gradient in the absence of a magnetic field, with the leading response proportional to $(\nabla T)^3$. In these systems, the intrinsic Berry c
PDF
2026-04-30
RopeDreamer: A Kinematic Recurrent State Space Model for Dynamics of Flexible Deformable Linear Objects
— Dingyuan Zhang
The robotic manipulation of Deformable Linear Objects (DLOs) is a fundamental challenge due to the high-dimensional, non-linear dynamics of flexible structures and the complexity of maintaining topological integrity during contact-rich tasks. While recent data-driven methods have utilized Recurrent
PDF
2026-04-30
FlashRT: Towards Computationally and Memory Efficient Red-Teaming for Prompt Injection and Knowledge Corruption
— Hengshuang Zhao
Long-context large language models (LLMs)-for example, Gemini-3.1-Pro and Qwen-3.5-are widely used to empower many real-world applications, such as retrieval-augmented generation, autonomous agents, and AI assistants. However, security remains a major concern for their widespread deployment, with th
PDF
2026-04-30
Mapping data sensitivities in global QCD analysis with linear response and influence functions
— Xiang Bai
Global QCD analyses provide the primary framework for extracting hadron structure from experimental data, yet the mechanisms by which data constrain non-perturbative functions remain difficult to interpret due to the high dimensionality and complexity of these fits. Here we develop a framework based
PDF
2026-04-30
Beyond first-order accuracy in continuous-forcing immersed boundary methods, and their well-conditioned projection-based solution
— Rishi G. Gopalakrishnan
We introduce a refined immersed boundary (IB) methodology that is better-than-first-order accurate in practice, while preserving key properties of "continuous-forcing" IB approaches that retain a singular source term in the governing equations. Our method leverages a smoothed indicator (He
PDF
🔬

OpenAlex — Academic publications

Academic corpus indexed by OpenAlex (ResearchGate, Crossref, PubMed…)

2026-12-31
Main rooms and coffered ceilings as definers of a noble palace model in renaissance Spain. The case study of the palace of Peñaranda de Duero, Spain
— Manuel de Miguel Sánchez, Miguel Carlos Fernández Cabo, Ana González Uriel
2026-12-31
Cic. "Ad Brut." 1.9 and the Death of Porcia
— François, Ide
2026-12-31
DYNAMICS OF PORTFOLIO ASSESSMENT STRATEGY IN EVALUATING THE ENGLISH LANGUAGE SKILLS FOR PUPILS WITH HEARING IMPAIRMENTS IN FAKO DIVISION, SOUTH WEST REGION OF CAMEROON
— Ngenwie Emilia Tanyie, Therese Mungah Shalo Tchombe
2026-12-31
Toward a global eco-social policy? The OECD and Green Keynesianism
— R. Schulze Waltrup, M. Büchs, A. Kaasch
2026-12-24
Technology-Enhanced Writing Pedagogy for EFL Learners: A Multi-Study Dissertation on Practice, Effectiveness, and Teacher Perceptions
— Xiao Lin
2026-12-09
Introducing a sociocultural practices framework: how it helps to explain the emergence and spread of ‘grassroots’ housing models
— Hamiduddin, Iqbal, Pauker, Madeleine, Field, Martin
2026-06-30
The Ontological Status of AI-Generated Music: From Stylistic Mimicry to Collaborative Agency
— Shen Jiang
2026-05-31
On Efficient Approximate Aggregate Nearest Neighbor Queries over Learned Representations
— Wang, Carrie, Amer-Yahia, Sihem, Lakshmanan, Laks et al.
2026-05-03
Amélioration de la prévision spatio-temporelle par fusion du voisinage spatial : étude de cas sur la mobilité liée à la COVID-19 au Pérou
Improving spatio-temporal forecasting through spatial neighborhood fusion: a case study on COVID-19-related mobility in Peru
— Chuan Li, Jiang You, Hassine Moungla et al.
2026-05-01
SENS-ASR: Semantic Embedding injection in Neural-transducer for Streaming Automatic Speech Recognition
— Youness Dkhissi, Valentin Vielzeuf, Elys Allesiardo et al.
2026-05-01
Adversarial Heterogeneous Agent Learning for Robotic Systems: A Framework for Coordinated Competitive Behaviors
— Christopher Allred
2026-04-28
Synolitic Graph Neural Networks for MRI-Derived Radiomic-Based Prediction of Prostate Cancer Progression on Active Surveillance
— Mikhail I. Krivonosov, Arseniy Trukhanov, Nikita Sushentsev et al.
2026-04-27
Spurious alignment between large language models and brains can emerge from non-robust methods and overlooked confounds
— Nima Hadidi, Ebrahim Feghhi, Bryan H Song et al.
2026-04-24
HubRouter: A Pluggable Sub-Quadratic Routing Primitive for Hybrid Sequence Models
— Abhinaba Basu
2026-04-24
SpikingBrain2.0: Brain-Inspired Foundation Models for Efficient Long-Context and Cross-Platform Inference
— Yuqi Pan, Jinghao Zhuang, Yupeng Feng et al.
2026-04-23
Focus Session: Hardware and Software Techniques for Accelerating Multimodal Foundation Models
— Muhammad Shafique, Abdul Basit, Muhammad Abdullah Hanif et al.
2026-04-23
The Recurrent Transformer: Greater Effective Depth and Efficient Decoding
— Costin-Andrei Oncescu, Depen Morwani, Samy Jelassi et al.
2026-04-23
LayerBoost: Layer-Aware Attention Reduction for Efficient LLMs
— Mohamed Ali Souibgui, Jan Fostier, Rodrigo Abadía-Heredia et al.

📅 History — last 360 days

May 2026 — 29 article(s)

2026-04-30
HERMES++: Toward a Unified Driving World Model for 3D Scene Understanding and Generation
— Xin Zhou
Driving world models serve as a pivotal technology for autonomous driving by simulating environmental dynamics. However, existing approaches predominantly focus on future scene generation, often overlooking comprehensive 3D scene understanding. Conversely, while Large Language Models (LLMs) demonstr
PDF
2026-04-30
OmniRobotHome: A Multi-Camera Platform for Real-Time Multiadic Human-Robot Interaction
— Dingkang Liang
Human-robot collaboration has been studied primarily in dyadic or sequential settings. However, real homes require multiadic collaboration, where multiple humans and robots share a workspace, acting concurrently on interleaved subtasks with tight spatial and temporal coupling. This regime remains un
PDF
2026-04-30
Covariant Locally Localized Gravity and vDVZ Continuity
— Xiwu Chen
The Karch-Randall braneworld concerns the physics of an AdS$_{d}$ brane embedded in an ambient gravitational AdS$_{d+1}$ spacetime. The gravitational theory induced on the AdS$_{d}$ brane has a very light but massive graviton. It has been established that the zero graviton mass limit of the $d$-dime
PDF
2026-04-30
LaST-R1: Reinforcing Action via Adaptive Physical Latent Reasoning for VLA Models
— Feiyang Tan
Vision-Language-Action (VLA) models have increasingly incorporated reasoning mechanisms for complex robotic manipulation. However, existing approaches share a critical limitation: whether employing explicit linguistic reasoning that suffers from latency and discretization, or utilizing more expressi
PDF
2026-04-30
Representation Fréchet Loss for Visual Generation
— Dingyuan Zhang
We show that Fréchet Distance (FD), long considered impractical as a training objective, can in fact be effectively optimized in the representation space. Our idea is simple: decouple the population size for FD estimation (e.g., 50k) from the batch size for gradient computation (e.g., 1024). We term
PDF
2026-04-30
Towards Systematics of Calabi-Yau Landscape for String Cosmology
— Hengshuang Zhao
In this review, we discuss the relevance and impact of studying Calabi-Yau threefolds in the context of global model building in string phenomenology. First, taking a phenomenologist-friendly approach, we review how the topologies of the various divisors and curves of the compactifying CY threefolds
PDF
2026-04-30
Cosmology of fractional gravity
— Xiang Bai
This is a first study of the cosmology of classical fractional gravity, a nonlocal proposal endowed with self-adjoint fractional d'Alembertian operators which serves as the basis for an ultraviolet-complete theory of quantum gravity. We derive the classical covariant nonlocal equations of motio
PDF
2026-04-30
Visual Generation in the New Era: An Evolution from Atomic Mapping to Agentic World Modeling
— Junyoung Lee
Recent visual generation models have made major progress in photorealism, typography, instruction following, and interactive editing, yet they still struggle with spatial reasoning, persistent state, long-horizon consistency, and causal understanding. We argue that the field should move beyond appea
PDF
2026-04-30
Chemical Taxonomy of $ω$~Centauri: Ten Populations Reveal a Multi-Phase Enrichment History
— Dingkang Liang
$ω$~Centauri, the most massive globular cluster in the Milky Way, exhibits a level of stellar population complexity that has long resisted a unified chemical characterisation. We exploit high-resolution near-infrared spectroscopy from the Milky Way Mapper survey (MWM DR19) to construct one of the la
PDF
2026-04-30
Generalizable Sparse-View 3D Reconstruction from Unconstrained Images
— Dingyuan Zhang
Reconstructing 3D scenes from sparse, unposed images remains challenging under real-world conditions with varying illumination and transient occlusions. Existing methods rely on scene-specific optimization using appearance embeddings or dynamic masks, which requires extensive per-scene training and
PDF
2026-04-30
Exploration Hacking: Can LLMs Learn to Resist RL Training?
— Xiwu Chen
Reinforcement learning (RL) has become essential to the post-training of large language models (LLMs) for reasoning, agentic capabilities and alignment. Successful RL relies on sufficient exploration of diverse actions by the model during training, which creates a potential failure mode: a model cou
PDF
2026-04-30
Synthetic Computers at Scale for Long-Horizon Productivity Simulation
— Feiyang Tan
Realistic long-horizon productivity work is strongly conditioned on user-specific computer environments, where much of the work context is stored and organized through directory structures and content-rich artifacts. To scale synthetic data creation for such productivity scenarios, we introduce Synt
PDF
2026-04-30
An adaptive wavelet-based PINN for problems with localized high-magnitude source
— Dingyuan Zhang
In recent years, physics-informed neural networks (PINNs) have gained significant attention for solving differential equations, although they suffer from two fundamental limitations, namely, spectral bias inherent in neural networks and loss imbalance arising from multiscale phenomena. This paper pr
PDF
2026-04-30
LLM as Clinical Graph Structure Refiner: Enhancing Representation Learning in EEG Seizure Diagnosis
— Hengshuang Zhao
Electroencephalogram (EEG) signals are vital for automated seizure detection, but their inherent noise makes robust representation learning challenging. Existing graph construction methods, whether correlation-based or learning-based, often generate redundant or irrelevant edges due to the noisy nat
PDF
2026-04-30
AEGIS: A Holistic Benchmark for Evaluating Forensic Analysis of AI-Generated Academic Images
— Xiang Bai
We introduce AEGIS, A holistic benchmark for Evaluating forensic analysis of AI-Generated academic ImageS. Compared to existing benchmarks, AEGIS features three key advances: (1) Domain-Specific Complexity: covering seven academic categories with 39 fine-grained subtypes, exposing intrinsic forensic
PDF
2026-04-30
Strait: Perceiving Priority and Interference in ML Inference Serving
— Hao Chen
Machine learning (ML) inference serving systems host deep neural network (DNN) models and schedule incoming inference requests across deployed GPUs. However, limited support for task prioritization and insufficient latency estimation under concurrent execution may restrict their applicability in on-
PDF
2026-04-30
Enhancement of superconducting stiffness in hybrid superconducting-metallic bilayers
— Xiwu Chen
Boosting superconductivity by metallic reservoirs is the essence of Kivelson's bilayer proposal. One layer provides pairing to the electrons, while the weakly coupled metal provides additional phase coherence to those pairs by mediating extended-range pair-pair coupling. Demonstrating significa
PDF
2026-04-30
Computing Equilibrium beyond Unilateral Deviation
— Feiyang Tan
Most familiar equilibrium concepts, such as Nash and correlated equilibrium, guarantee only that no single player can improve their utility by deviating unilaterally. They offer no guarantees against profitable coordinated deviations by coalitions. Although the literature proposes solution concepts
PDF
2026-04-30
Observation of Vinen turbulence during far-from-equilibrium Bose-Einstein condensation
— Timur Şahin
Relaxation of far-from-equilibrium quantum fluids, intimately related to the emergence of long-range order, is theoretically associated with the decay of a turbulent isotropic tangle of vortex lines. We observe and study such decaying quantum turbulence in a homogeneous 3D atomic Bose gas. Using mat
PDF
2026-04-30
Intrinsic anomalous thermal Hall effect as a signature of quantum metric in d-wave altermagnets
— Dingkang Liang
We investigate the intrinsic anomalous thermal Hall effect in d-wave altermagnets, where a transverse heat current is generated by a longitudinal temperature gradient in the absence of a magnetic field, with the leading response proportional to $(\nabla T)^3$. In these systems, the intrinsic Berry c
PDF
2026-04-30
RopeDreamer: A Kinematic Recurrent State Space Model for Dynamics of Flexible Deformable Linear Objects
— Dingyuan Zhang
The robotic manipulation of Deformable Linear Objects (DLOs) is a fundamental challenge due to the high-dimensional, non-linear dynamics of flexible structures and the complexity of maintaining topological integrity during contact-rich tasks. While recent data-driven methods have utilized Recurrent
PDF
2026-04-30
FlashRT: Towards Computationally and Memory Efficient Red-Teaming for Prompt Injection and Knowledge Corruption
— Hengshuang Zhao
Long-context large language models (LLMs), for example Gemini-3.1-Pro and Qwen-3.5, are widely used to empower many real-world applications, such as retrieval-augmented generation, autonomous agents, and AI assistants. However, security remains a major concern for their widespread deployment, with th
PDF
2026-04-30
Mapping data sensitivities in global QCD analysis with linear response and influence functions
— Xiang Bai
Global QCD analyses provide the primary framework for extracting hadron structure from experimental data, yet the mechanisms by which data constrain non-perturbative functions remain difficult to interpret due to the high dimensionality and complexity of these fits. Here we develop a framework based
PDF
2026-04-30
Beyond first-order accuracy in continuous-forcing immersed boundary methods, and their well-conditioned projection-based solution
— Rishi G. Gopalakrishnan
We introduce a refined immersed boundary (IB) methodology that is better-than-first-order accurate in practice, while preserving key properties of "continuous-forcing" IB approaches that retain a singular source term in the governing equations. Our method leverages a smoothed indicator (He
PDF
2026-04-28
Synolitic Graph Neural Networks for MRI-Derived Radiomic-Based Prediction of Prostate Cancer Progression on Active Surveillance
— Mikhail I. Krivonosov, Arseniy Trukhanov, Nikita Sushentsev et al.
2026-04-24
HubRouter: A Pluggable Sub-Quadratic Routing Primitive for Hybrid Sequence Models
— Abhinaba Basu
2026-04-24
SpikingBrain2.0: Brain-Inspired Foundation Models for Efficient Long-Context and Cross-Platform Inference
— Yuqi Pan, Jinghao Zhuang, Yupeng Feng et al.
2026-04-23
Focus Session: Hardware and Software Techniques for Accelerating Multimodal Foundation Models
— Muhammad Shafique, Abdul Basit, Muhammad Abdullah Hanif et al.
2026-04-23
LayerBoost: Layer-Aware Attention Reduction for Efficient LLMs
— Mohamed Ali Souibgui, Jan Fostier, Rodrigo Abadía-Heredia et al.

April 2026 — 198 article(s)

2026-04-29
AI evals are becoming the new compute bottleneck
2026-04-29
Granite 4.1 LLMs: How They’re Built
2026-04-29
DeepInfra on Hugging Face Inference Providers 🔥
2026-04-29
Outer-Crust Equations of State for Neutron Stars
— P. S. Koliogiannis
We construct and systematically assess four outer-crust equations of state based on relativistic nuclear mass models and a machine-learning mass table. Our aim is to quantify the sensitivity of the equilibrium composition and thermodynamic properties of the outer crust to the underlying nuclear inpu
PDF
2026-04-29
Turning the TIDE: Cross-Architecture Distillation for Diffusion Large Language Models
— N. Paar
Diffusion large language models (dLLMs) offer parallel decoding and bidirectional context, but state-of-the-art dLLMs require billions of parameters for competitive performance. While existing distillation methods for dLLMs reduce inference steps within a single architecture, none address cross-arch
PDF
2026-04-29
Anomalous Transport and Explicit Symmetry Breaking in Holography
— Gongbo Zhang
We consider a holographic Einstein-Maxwell model in five dimensions with pure gauge and mixed gauge-gravitational Chern-Simons terms to study anomaly-induced transport in the presence of explicit symmetry breaking. We include the full backreaction of the scalar field and gauge fields on the metric a
PDF
2026-04-29
Optimizing Dynamic Metasurface Antenna Configurations for Direction-of-Arrival and Polarization Estimation Using an Experimentally Calibrated Multiport-Network Model
— Wen Wang
Sensing the direction of arrival and polarization of impinging signals is a key prerequisite for beamforming and interference mitigation in modern wireless communication systems. Dynamic metasurface antennas (DMAs) can multiplex direction- and polarization-dependent field information onto a single d
PDF
2026-04-29
Large quantum dot energy level shifts in anomalous photon-assisted tunneling
— Ye Tian
Orbital energy splittings are important quantum dot parameters for the operation of hole spin qubits. They are known to depend on the lateral confinement of the quantum dots. However, when changing top, plunger gate voltages, which are the typical control parameter for qubit applications, such energ
PDF
2026-04-29
Three-Step Nav: A Hierarchical Global-Local Planner for Zero-Shot Vision-and-Language Navigation
— Li Yuan
Breakthrough progress in vision-based navigation through unknown environments has been achieved by using multimodal large language models (MLLMs). These models can plan a sequence of motions by evaluating the current view at each time step against the task and goal given to the agent. However, curre
PDF
2026-04-29
Fractions of Recurrence Operators for Generalized Fourier Series in Classical Orthogonal Polynomials
— Ashis Tamang
We consider series expansions in bases of classical orthogonal polynomials. When such a series solves a linear differential equation with polynomial coefficients, its coefficients satisfy a linear recurrence equation. We interpret this equation as the numerator of a fraction of linear recurrence ope
PDF
2026-04-29
Select to Think: Unlocking SLM Potential with Local Sufficiency
— Nishal Rai
Small language models (SLMs) offer computational efficiency for scalable deployment, yet they often fall short of the reasoning power exhibited by their larger counterparts (LLMs). To mitigate this gap, current approaches invoke an LLM to generate tokens at points of reasoning divergence, but these
PDF
2026-04-29
Simulating dynamics of RLC circuits with a quantum differential-algebraic equations solver
— Ashis Tamang
We introduce a quantum algorithm for simulating the dynamics of electrical circuits consisting of resistors, inductors and capacitors (aka RLC circuits) along with power sources. Given oracle access to the connectivity of the circuit and values of the electrical elements, our algorithm prepares a qu
PDF
2026-04-29
Hyper Input Convex Neural Networks for Shape Constrained Learning and Optimal Transport
— Nishal Rai
We introduce Hyper Input Convex Neural Networks (HyCNNs), a novel neural network architecture designed for learning convex functions. HyCNNs combine the principles of Maxout networks with input convex neural networks (ICNNs) to create a neural network that is always convex in the input, theoreticall
PDF
2026-04-29
Weighted Linearization of Vector Fields via a Formal Moser Trick
— Gongbo Zhang
Many well-known theorems establish sufficient criteria for linearizability of a vector field in terms of the eigenvalues of its linear approximation. By attaching weights to coordinates so that some directions are considered "linear", others "quadratic", and so on, one can define
PDF
2026-04-29
Degree-dependent and distance-dependent contact rates interpolate between explosive, exponential and polynomial epidemic growth
— Wanrong Zheng
It is a fundamental question in epidemiology to estimate, model and predict the growth rate of a pandemic. Analogously, analysing the diffusion of innovation, (fake) news, memes, and rumours is of key importance in the social sciences. The resulting epidemic growth curves can be classified according
PDF
2026-04-29
Meta-learning-enhanced implicit full waveform inversion
— Yunhao Ge
Implicit full waveform inversion (IFWI) introduces implicit neural representations to parameterize the subsurface velocity model as a continuous function of spatial coordinates, which alleviates the dependence on the initial model and improves inversion flexibility. However, IFWI still requires a la
PDF
2026-04-29
Thermodynamics formalism for singular flows
— Ashis Tamang
We establish that $C^\infty$ three-dimensional flows with positive topological entropy admit only finitely many ergodic measures of maximal entropy, even when singularities (zero-velocity points) are present. Furthermore, every ergodic measure of maximal entropy is rapid mixing for such flows within
PDF
2026-04-29
World2VLM: Distilling World Model Imagination into VLMs for Dynamic Spatial Reasoning
— Nishal Rai
Vision-language models (VLMs) have shown strong performance on static visual understanding, yet they still struggle with dynamic spatial reasoning that requires imagining how scenes evolve under egocentric motion. Recent efforts address this limitation either by scaling spatial supervision with synt
PDF
2026-04-29
Data-driven discovery of polynomial ODEs with provably bounded solutions
— Karl Landsteiner
We introduce SILAS, a data-driven framework for discovering polynomial ordinary differential equations (ODEs) with provably bounded trajectories. Boundedness is certified by compact absorbing sets defined via polynomial Lyapunov functions. We jointly identify the ODE vector field and the Lyapunov fu
PDF
2026-04-29
High Coupling Tunable Acoustic Resonators in Monolithic Barium Titanate
— Eugenio Megias
The growing number of wireless communication bands has driven demand for compact, low-loss, and frequency adjustable RF filtering. Tunable acoustic resonators are well suited to address these needs, offering a path toward reconfigurable front ends with reduced component count. In this work, we exten
PDF
2026-04-29
Adaptive Self-Organization in Anonymous Dynamic Networks
— Brighton X. Coe
We introduce the problem of adaptive self-organization in which the nodes of an anonymous, synchronous dynamic network must distributively change the collective distribution of their responses (or "colors") as a function of time-varying environmental signals, even when these signals are on
PDF
2026-04-29
Exact Dynamic Programming for Solow-Polasky Diversity Subset Selection on Lines and Staircases
— Tyler J. Kovach
We study exact fixed-cardinality Solow-Polasky diversity subset selection on ordered finite $\ell_1$ sets, with monotone biobjective Pareto fronts and their higher-dimensional staircase analogues as central applications. Solow-Polasky diversity was introduced in biodiversity conservation, whereas
PDF
2026-04-29
Schwinger-Keldysh Path Integral for Gauge theories
— Nishal Rai
We develop the Schwinger-Keldysh path-integral formalism for open non-Abelian gauge theories that are gauge-fixed via the BRST method in covariant gauges. We focus on generic initial states, pure and mixed, specified at finite times suitable for non-equilibrium processes. We pay particular attention
PDF
2026-04-28
Enhancing customer value through artificial intelligence and machine learning: Personalization, big data analytics, and customer experience management
— Adejoke Olumide Dele-Rotimi
2026-04-27
Spurious alignment between large language models and brains can emerge from non-robust methods and overlooked confounds
— Nima Hadidi, Ebrahim Feghhi, Bryan H Song et al.
2026-04-28
Introducing NVIDIA Nemotron 3 Nano Omni: Long-Context Multimodal Intelligence for Documents, Audio and Video Agents
2026-04-28
Recursive Multi-Agent Systems
— Xiyuan Yang
Recursive or looped language models have recently emerged as a new scaling axis by iteratively refining the same model computation over latent states to deepen reasoning. We extend such scaling principle from a single model to multi-agent systems, and ask: Can agent collaboration itself be scaled th
PDF
2026-04-28
DV-World: Benchmarking Data Visualization Agents in Real-World Scenarios
— Jiaru Zou
Real-world data visualization (DV) requires native environmental grounding, cross-platform evolution, and proactive intent alignment. Yet, existing benchmarks often suffer from code-sandbox confinement, single-language creation-only tasks, and assumption of perfect intent. To bridge these gaps, we i
PDF
2026-04-28
Credit Limits beyond Full Collateralization in Decentralized Micropayments: Incentive Conditions
— Rui Pan
In decentralized non-custodial micropayments, the central challenge is not whether payments can be executed directly, but under what conditions such systems can offer credit limits without requiring full collateral backing. Existing approaches typically tie available credit to posted collateral, cau
PDF
2026-04-28
From short-lived to long-lived clouds: impact of star formation models on giant molecular cloud evolution in simulations of an NGC 300-like galaxy
— Ruizhong Qiu
Multi-wavelength observations of molecular and ionized gas indicate that GMCs are short-lived, generally dispersing within one or two dynamical timescales. To investigate the physical origin of these short lifetimes and the role of star formation prescriptions, we conduct radiation-hydrodynamic simu
PDF