Research

Papers

Research papers from arXiv and related sources

Total: 4694 | AI/LLM: 2583 | Testing: 2111
AI LLM

TRACE: Evaluating Execution Efficiency of LLM-Based Code Translation

While Large Language Models (LLMs) have substantially improved the functional correctness of code translation, the critical dimension of execution efficiency remains overlooked. We present...

Zhihao Gong, Zeyu Sun, Dong Huang, Qingyuan Liang, Jie M. Zhang, Dan Hao

2603.16479 2026-03-17
AI LLM

Breaking the Chain: A Causal Analysis of LLM Faithfulness to Intermediate Structures

Schema-guided reasoning pipelines ask LLMs to produce explicit intermediate structures -- rubrics, checklists, verification queries -- before committing to a final decision. But do these structures...

Oleg Somov, Mikhail Chaichuk, Mikhail Seleznyov, Alexander Panchenko, Elena Tutubalina

2603.16475 2026-03-17
TESTING

A Novel Approach for Fault Detection and Failure Analysis of CMOS Copper Metal Stacks

For the Inner Tracking System 3 (ITS3) upgrade, the ALICE experiment at CERN requires monolithic active pixel sensors of dimensions up to 97 mm × 266 mm, occupying a large fraction of a 30...

Gregor Hieronymus Eberwein, Gianluca Aglieri Rinella, Daniela Bortoletto, Szymon Bugiel, Francesc...

2603.16473 2026-03-17
AI LLM

GAP-MLLM: Geometry-Aligned Pre-training for Activating 3D Spatial Perception in Multimodal Large Language Models

Multimodal Large Language Models (MLLMs) demonstrate exceptional semantic reasoning but struggle with 3D spatial perception when restricted to pure RGB inputs. Despite leveraging implicit geometric...

Jiaxin Zhang, Junjun Jiang, Haijie Li, Youyu Chen, Kui Jiang, Dave Zhenyu Chen

2603.16461 2026-03-17
AI LLM

DynHD: Hallucination Detection for Diffusion Large Language Models via Denoising Dynamics Deviation Learning

Diffusion large language models (D-LLMs) have emerged as a promising alternative to auto-regressive models due to their iterative refinement capabilities. However, hallucinations remain a critical ...

Yanyu Qian, Yue Tan, Yixin Liu, Wang Yu, Shirui Pan

2603.16459 2026-03-17
AI LLM

Agentic AI for SAGIN Resource Management: Semantic Awareness, Orchestration, and Optimization

Space-air-ground integrated networks (SAGIN) promise ubiquitous 6G connectivity but face significant resource management challenges due to heterogeneous infrastructure, dynamic topologies, and stri...

Linghao Zhang, Haitao Zhao, Bo Xu, Hongbo Zhu, Xianbin Wang

2603.16458 2026-03-17
AI LLM

Evo-Retriever: LLM-Guided Curriculum Evolution with Viewpoint-Pathway Collaboration for Multimodal Document Retrieval

Visual-language models (VLMs) excel at data mappings, but real-world document heterogeneity and unstructuredness disrupt the consistency of cross-modal embeddings. Recent late-interaction methods e...

Weiqing Li, Jinyue Guo, Yaqi Wang, Haiyang Xiao, Yuewei Zhang, Guohua Liu, Hao Henry Wang

2603.16455 2026-03-17
AI LLM

RetailBench: Evaluating Long-Horizon Autonomous Decision-Making and Strategy Stability of LLM Agents in Realistic Retail Environments

Large Language Model (LLM)-based agents have achieved notable success on short-horizon and highly structured tasks. However, their ability to maintain coherent decision-making over long horizons in...

Linghua Zhang, Jun Wang, Jingtong Wu, Zhisong Zhang

2603.16453 2026-03-17
AI LLM

TRUST-SQL: Tool-Integrated Multi-Turn Reinforcement Learning for Text-to-SQL over Unknown Schemas

Text-to-SQL parsing has achieved remarkable progress under the Full Schema Assumption. However, this premise fails in real-world enterprise environments where databases contain hundreds of tables w...

Ai Jian, Xiaoyun Zhang, Wanrou Du, Jingqing Ruan, Jiangbo Pei, Weipeng Zhang, Ke Zeng, Xunliang Cai

2603.16448 2026-03-17
AI LLM

Visual Distraction Undermines Moral Reasoning in Vision-Language Models

Moral reasoning is fundamental to safe Artificial Intelligence (AI), yet ensuring its consistency across modalities becomes critical as AI systems evolve from text-based assistants to embodied agen...

Xinyi Yang, Chenheng Xu, Weijun Hong, Ce Mo, Qian Wang, Fang Fang, Yixin Zhu

2603.16445 2026-03-17
TESTING

Capability-Guided Compression: Toward Interpretability-Aware Budget Allocation for Large Language Models

Large language model compression has made substantial progress through pruning, quantization, and low-rank decomposition, yet a fundamental limitation persists across all existing methods: compress...

Rishaank Gupta

2603.16440 2026-03-17
AI LLM

VQKV: High-Fidelity and High-Ratio Cache Compression via Vector-Quantization

The growing context length of Large Language Models (LLMs) enlarges the Key-Value (KV) cache, limiting deployment in resource-limited environments. Prior training-free approaches for KV cache compr...

Yixuan Wang, Qingyu Shi, Jiayu Zhou, Dianbo Liu, Ziwei He, Zhouhan Lin

2603.16435 2026-03-17
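The VQKV abstract above describes compressing the KV cache via vector quantization. As a generic illustration of the vector-quantization primitive itself (a toy NumPy sketch with made-up codebook size and dimensions, not the paper's actual VQKV method):

```python
import numpy as np

# Generic vector-quantization sketch (illustrative only, not VQKV itself):
# each vector is replaced by the index of its nearest codebook entry, so
# N d-dimensional float vectors compress to N small integers plus one
# shared K x d codebook.

rng = np.random.default_rng(0)
codebook = rng.standard_normal((16, 4))   # K=16 codes, d=4 (toy sizes)
vectors = rng.standard_normal((100, 4))   # stand-ins for cached key/value vectors

# Assign each vector to its nearest code by Euclidean distance
dists = np.linalg.norm(vectors[:, None, :] - codebook[None, :, :], axis=-1)
codes = dists.argmin(axis=1)              # shape (100,), small-integer indices

# Lossy reconstruction: look the codes back up in the codebook
recon = codebook[codes]                   # shape (100, 4)
```

The compression ratio comes from storing `codes` (one integer per vector) instead of the full float vectors; fidelity depends on how well the codebook covers the vector distribution.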
AI LLM

From Natural Language to Executable Option Strategies via Large Language Models

Large Language Models (LLMs) excel at general code generation, yet translating natural-language trading intents into correct option strategies remains challenging. Real-world option design requires...

Haochen Luo, Zhengzhao Lai, Junjie Xu, Yifan Li, Tang Pok Hin, Yuan Zhang, Chen Liu

2603.16434 2026-03-17
AI LLM

IRIS: A Real-World Benchmark for Inverse Recovery and Identification of Physical Dynamic Systems from Monocular Video

Unsupervised physical parameter estimation from video lacks a common benchmark: existing methods evaluate on non-overlapping synthetic data, the sole real-world dataset is restricted to single-body...

Rasul Khanbayov, Mohamed Rayan Barhdadi, Erchin Serpedin, Hasan Kurban

2603.16432 2026-03-17
AI LLM

EngGPT2: Sovereign, Efficient and Open Intelligence

EngGPT2-16B-A3B is the latest iteration of Engineering Group's Italian LLM, built to be a Sovereign, Efficient and Open model. EngGPT2 is trained on 2.5 trillion tokens - less than Qwen3's ...

G. Ciarfaglia, A. Rosanova, S. Cipolla, J. Bartoli, A. Di Domenico, C. Fioroni, A. Fontana, M. R....

2603.16430 2026-03-17
AI LLM

An Efficient Heterogeneous Co-Design for Fine-Tuning on a Single GPU

Fine-tuning Large Language Models (LLMs) has become essential for domain adaptation, but its memory-intensive property exceeds the capabilities of most GPUs. To address this challenge and democrati...

Ruijia Yang, Zeyi Wen

2603.16428 2026-03-17
AI LLM

HGP-Mamba: Integrating Histology and Generated Protein Features for Mamba-based Multimodal Survival Risk Prediction

Recent advances in multimodal learning have significantly improved cancer survival risk prediction. However, the joint prognostic potential of protein markers and histopathology images remains unde...

Jing Dai, Chen Wu, Ming Wu, Qibin Zhang, Zexi Wu, Jingdong Zhang, Hongming Xu

2603.16421 2026-03-17
AI LLM

Via Negativa for AI Alignment: Why Negative Constraints Are Structurally Superior to Positive Preferences

Recent empirical results have demonstrated that training large language models (LLMs) with negative-only feedback can match or exceed standard reinforcement learning from human feedback (RLHF). Neg...

Quan Cheng

2603.16417 2026-03-17
AI LLM

IndexRAG: Bridging Facts for Cross-Document Reasoning at Index Time

Multi-hop question answering (QA) requires reasoning across multiple documents, yet existing retrieval-augmented generation (RAG) approaches address this either through graph-based methods requirin...

Zhenghua Bao, Yi Shi

2603.16415 2026-03-17
AI LLM

Trained Persistent Memory for Frozen Encoder–Decoder LLMs: Six Architectural Methods

Frozen encoder–decoder language models are stateless: the latent representation is discarded after every forward pass, so no information persists across sessions. This paper presents a pro...

Hong Jeong

2603.16413 2026-03-17