AI LLM March 06, 2026

Lyapunov Probes for Hallucination Detection in Large Foundation Models

Authors

Bozhi Luan, Gen Li, Yalan Qin, Jifeng Guo, Yun Zhou, Faguo Wu, Hongwei Zheng, Wenjun Wu, Zhaoxin Fan

Abstract

We address hallucination detection in Large Language Models (LLMs) and Multimodal Large Language Models (MLLMs) by framing the problem through the lens of dynamical systems stability theory. Rather than treating hallucination as a straightforward classification task, we conceptualize (M)LLMs as dynamical systems, where factual knowledge is represented by stable equilibrium points within the representation space. Our main insight is that hallucinations tend to arise at the boundaries of knowledge-transition regions separating stable and unstable zones. To capture this phenomenon, we propose Lyapunov Probes: lightweight networks trained with derivative-based stability constraints that enforce a monotonic decay in confidence under input perturbations. By performing systematic perturbation analysis and applying a two-stage training process, these probes reliably distinguish between stable factual regions and unstable, hallucination-prone regions. Experiments on diverse datasets and models demonstrate consistent improvements over existing baselines.
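The abstract does not give the exact form of the stability constraint, so the following is only a minimal sketch of one way a "monotonic decay in confidence under input perturbations" requirement could be expressed as a finite-difference penalty. Everything here is an assumption made for illustration: the linear-sigmoid probe, the names `probe_confidence` and `monotonic_decay_penalty`, the single perturbation `direction`, and the grid of magnitudes `eps_grid`. It is not the authors' implementation.

```python
import numpy as np


def probe_confidence(h, w, b):
    """Hypothetical linear probe: sigmoid(w . h + b), a confidence in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-(h @ w + b)))


def monotonic_decay_penalty(h, w, b, direction, eps_grid):
    """Finite-difference stand-in for a derivative-based stability constraint.

    Evaluates the probe on h + eps * direction for increasing eps and sums
    every rise in confidence. The penalty is zero exactly when confidence
    never increases as the perturbation grows (assumed form, not the paper's).
    """
    # Confidence at each perturbation magnitude eps_0 < eps_1 < ...
    conf = np.array(
        [probe_confidence(h + eps * direction, w, b) for eps in eps_grid]
    )
    # A positive difference means confidence rose under a larger perturbation.
    increases = np.maximum(np.diff(conf), 0.0)
    return float(increases.sum())


if __name__ == "__main__":
    w = np.array([1.0, 0.0])          # toy probe weights
    h = np.array([2.0, 0.0])          # toy hidden representation
    eps_grid = np.linspace(0.0, 1.0, 5)

    # Perturbing against w lowers confidence, so nothing is penalized.
    print(monotonic_decay_penalty(h, w, 0.0, np.array([-1.0, 0.0]), eps_grid))
    # Perturbing along w raises confidence, so the penalty is positive.
    print(monotonic_decay_penalty(h, w, 0.0, np.array([1.0, 0.0]), eps_grid))
```

In a training loop, a term like this would presumably be added to an ordinary classification loss on the probe output; how the paper weights the two terms, chooses perturbations, or splits the work across its two training stages is not stated in the abstract.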

Metadata

arXiv ID: 2603.06081

Provider: ARXIV

Primary Category: cs.CV

Published: 2026-03-06

Fetched: 2026-03-09 06:05
