Papers
Research papers from arXiv and related sources
TRACE: Evaluating Execution Efficiency of LLM-Based Code Translation
While Large Language Models (LLMs) have substantially improved the functional correctness of code translation, the critical dimension of execution efficiency remains overlooked. We present...
Zhihao Gong, Zeyu Sun, Dong Huang, Qingyuan Liang, Jie M. Zhang, Dan Hao
Breaking the Chain: A Causal Analysis of LLM Faithfulness to Intermediate Structures
Schema-guided reasoning pipelines ask LLMs to produce explicit intermediate structures (rubrics, checklists, verification queries) before committing to a final decision. But do these structures...
Oleg Somov, Mikhail Chaichuk, Mikhail Seleznyov, Alexander Panchenko, Elena Tutubalina
A Novel Approach for Fault Detection and Failure Analysis of CMOS Copper Metal Stacks
For the Inner Tracking System 3 (ITS3) upgrade, the ALICE experiment at CERN requires monolithic active pixel sensors of dimensions up to 97 mm × 266 mm, occupying a large fraction of a 30...
Gregor Hieronymus Eberwein, Gianluca Aglieri Rinella, Daniela Bortoletto, Szymon Bugiel, Francesc...
GAP-MLLM: Geometry-Aligned Pre-training for Activating 3D Spatial Perception in Multimodal Large Language Models
Multimodal Large Language Models (MLLMs) demonstrate exceptional semantic reasoning but struggle with 3D spatial perception when restricted to pure RGB inputs. Despite leveraging implicit geometric...
Jiaxin Zhang, Junjun Jiang, Haijie Li, Youyu Chen, Kui Jiang, Dave Zhenyu Chen
DynHD: Hallucination Detection for Diffusion Large Language Models via Denoising Dynamics Deviation Learning
Diffusion large language models (D-LLMs) have emerged as a promising alternative to auto-regressive models due to their iterative refinement capabilities. However, hallucinations remain a critical ...
Yanyu Qian, Yue Tan, Yixin Liu, Wang Yu, Shirui Pan
Agentic AI for SAGIN Resource Management: Semantic Awareness, Orchestration, and Optimization
Space-air-ground integrated networks (SAGIN) promise ubiquitous 6G connectivity but face significant resource management challenges due to heterogeneous infrastructure, dynamic topologies, and stri...
Linghao Zhang, Haitao Zhao, Bo Xu, Hongbo Zhu, Xianbin Wang
Evo-Retriever: LLM-Guided Curriculum Evolution with Viewpoint-Pathway Collaboration for Multimodal Document Retrieval
Vision-language models (VLMs) excel at data mappings, but real-world document heterogeneity and lack of structure disrupt the consistency of cross-modal embeddings. Recent late-interaction methods e...
Weiqing Li, Jinyue Guo, Yaqi Wang, Haiyang Xiao, Yuewei Zhang, Guohua Liu, Hao Henry Wang
RetailBench: Evaluating Long-Horizon Autonomous Decision-Making and Strategy Stability of LLM Agents in Realistic Retail Environments
Large Language Model (LLM)-based agents have achieved notable success on short-horizon and highly structured tasks. However, their ability to maintain coherent decision-making over long horizons in...
Linghua Zhang, Jun Wang, Jingtong Wu, Zhisong Zhang
TRUST-SQL: Tool-Integrated Multi-Turn Reinforcement Learning for Text-to-SQL over Unknown Schemas
Text-to-SQL parsing has achieved remarkable progress under the Full Schema Assumption. However, this premise fails in real-world enterprise environments where databases contain hundreds of tables w...
Ai Jian, Xiaoyun Zhang, Wanrou Du, Jingqing Ruan, Jiangbo Pei, Weipeng Zhang, Ke Zeng, Xunliang Cai
Visual Distraction Undermines Moral Reasoning in Vision-Language Models
Moral reasoning is fundamental to safe Artificial Intelligence (AI), yet ensuring its consistency across modalities becomes critical as AI systems evolve from text-based assistants to embodied agen...
Xinyi Yang, Chenheng Xu, Weijun Hong, Ce Mo, Qian Wang, Fang Fang, Yixin Zhu
Capability-Guided Compression: Toward Interpretability-Aware Budget Allocation for Large Language Models
Large language model compression has made substantial progress through pruning, quantization, and low-rank decomposition, yet a fundamental limitation persists across all existing methods: compress...
Rishaank Gupta
VQKV: High-Fidelity and High-Ratio Cache Compression via Vector-Quantization
The growing context length of Large Language Models (LLMs) enlarges the Key-Value (KV) cache, limiting deployment in resource-limited environments. Prior training-free approaches for KV cache compr...
Yixuan Wang, Qingyu Shi, Jiayu Zhou, Dianbo Liu, Ziwei He, Zhouhan Lin
From Natural Language to Executable Option Strategies via Large Language Models
Large Language Models (LLMs) excel at general code generation, yet translating natural-language trading intents into correct option strategies remains challenging. Real-world option design requires...
Haochen Luo, Zhengzhao Lai, Junjie Xu, Yifan Li, Tang Pok Hin, Yuan Zhang, Chen Liu
IRIS: A Real-World Benchmark for Inverse Recovery and Identification of Physical Dynamic Systems from Monocular Video
Unsupervised physical parameter estimation from video lacks a common benchmark: existing methods evaluate on non-overlapping synthetic data, and the sole real-world dataset is restricted to single-body...
Rasul Khanbayov, Mohamed Rayan Barhdadi, Erchin Serpedin, Hasan Kurban
EngGPT2: Sovereign, Efficient and Open Intelligence
EngGPT2-16B-A3B is the latest iteration of Engineering Group's Italian LLM, built to be a Sovereign, Efficient and Open model. EngGPT2 is trained on 2.5 trillion tokens - less than Qwen3's ...
G. Ciarfaglia, A. Rosanova, S. Cipolla, J. Bartoli, A. Di Domenico, C. Fioroni, A. Fontana, M. R....
An Efficient Heterogeneous Co-Design for Fine-Tuning on a Single GPU
Fine-tuning Large Language Models (LLMs) has become essential for domain adaptation, but its memory-intensive property exceeds the capabilities of most GPUs. To address this challenge and democrati...
Ruijia Yang, Zeyi Wen
HGP-Mamba: Integrating Histology and Generated Protein Features for Mamba-based Multimodal Survival Risk Prediction
Recent advances in multimodal learning have significantly improved cancer survival risk prediction. However, the joint prognostic potential of protein markers and histopathology images remains unde...
Jing Dai, Chen Wu, Ming Wu, Qibin Zhang, Zexi Wu, Jingdong Zhang, Hongming Xu
Via Negativa for AI Alignment: Why Negative Constraints Are Structurally Superior to Positive Preferences
Recent empirical results have demonstrated that training large language models (LLMs) with negative-only feedback can match or exceed standard reinforcement learning from human feedback (RLHF). Neg...
Quan Cheng
IndexRAG: Bridging Facts for Cross-Document Reasoning at Index Time
Multi-hop question answering (QA) requires reasoning across multiple documents, yet existing retrieval-augmented generation (RAG) approaches address this either through graph-based methods requirin...
Zhenghua Bao, Yi Shi
Trained Persistent Memory for Frozen Encoder–Decoder LLMs: Six Architectural Methods
Frozen encoder–decoder language models are stateless: the latent representation is discarded after every forward pass, so no information persists across sessions. This paper presents a pro...
Hong Jeong