Research

Papers

Research papers from arXiv and related sources

Total: 4694, AI/LLM: 2583, Testing: 2111
AI LLM

Functionality-Oriented LLM Merging on the Fisher–Rao Manifold

Weight-space merging aims to combine multiple fine-tuned LLMs into a single model without retraining, yet most existing approaches remain fundamentally parameter-space heuristics. This creates thre...

Jiayu Wang, Zuojun Ye, Wenpeng Yin

2603.04972 2026-03-05
AI LLM

MPCEval: A Benchmark for Multi-Party Conversation Generation

Multi-party conversation generation, such as smart reply and collaborative assistants, is an increasingly important capability of generative AI, yet its evaluation remains a critical bottleneck. Co...

Minxing Zhang, Yi Yang, Zhuofan Jia, Xuan Yang, Jian Pei, Yuchen Zang, Xingwang Deng, Xianglong Chen

2603.04969 2026-03-05
AI LLM

When Weak LLMs Speak with Confidence, Preference Alignment Gets Stronger

Preference alignment is an essential step in adapting large language models (LLMs) to human values, but existing approaches typically depend on costly human annotations or large-scale API-based mod...

Amirabbas Afzali, Myeongho Jeon, Maria Brbic

2603.04968 2026-03-05
AI LLM

WaterSIC: information-theoretically (near) optimal linear layer quantization

This paper considers the problem of converting a given dense linear layer to low precision. The tradeoff between compressed length and output discrepancy is analyzed information theoretically (IT)....

Egor Lifar, Semyon Savkin, Or Ordentlich, Yury Polyanskiy

2603.04956 2026-03-05
AI LLM

Retrieval-Augmented Generation with Covariate Time Series

While RAG has greatly enhanced LLMs, extending this paradigm to Time-Series Foundation Models (TSFMs) remains a challenge. This is exemplified in the Predictive Maintenance of the Pressure Regulati...

Kenny Ye Liang, Zhongyi Pei, Huan Zhang, Yuhui Liu, Shaoxu Song, Jianmin Wang

2603.04951 2026-03-05
AI LLM

∇-Reasoner: LLM Reasoning via Test-Time Gradient Descent in Latent Space

Scaling inference-time compute for Large Language Models (LLMs) has unlocked unprecedented reasoning capabilities. However, existing inference-time scaling methods typically rely on inefficient and...

Peihao Wang, Ruisi Cai, Zhen Wang, Hongyuan Mei, Qiang Liu, Pan Li, Zhangyang Wang

2603.04948 2026-03-05
AI LLM

LocalSUG: Geography-Aware LLM for Query Suggestion in Local-Life Services

In local-life service platforms, the query suggestion module plays a crucial role in enhancing user experience by generating candidate queries based on user input prefixes, thus reducing user effor...

Jinwen Chen, Shuai Gong, Shiwen Zhang, Zheng Zhang, Yachao Zhao, Lingxiang Wang, Haibo Zhou, Yuan...

2603.04946 2026-03-05
AI LLM

Detecting RAG Advertisements Across Advertising Styles

Large language models (LLMs) enable a new form of advertising for retrieval-augmented generation (RAG) systems in which organic responses are blended with contextually relevant ads. The prospect of...

Sebastian Heineking, Wilhelm Pertsch, Ines Zelch, Janek Bevendorff, Benno Stein, Matthias Hagen, ...

2603.04925 2026-03-05
AI LLM

AILS-NTUA at SemEval-2026 Task 10: Agentic LLMs for Psycholinguistic Marker Extraction and Conspiracy Endorsement Detection

This paper presents a novel agentic LLM pipeline for SemEval-2026 Task 10 that jointly extracts psycholinguistic conspiracy markers and detects conspiracy endorsement. Unlike traditional classifier...

Panagiotis Alexios Spanakis, Maria Lymperaiou, Giorgos Filandrianos, Athanasios Voulodimos, Giorg...

2603.04921 2026-03-05
AI LLM

BandPO: Bridging Trust Regions and Ratio Clipping via Probability-Aware Bounds for LLM Reinforcement Learning

Proximal constraints are fundamental to the stability of Large Language Model reinforcement learning. While the canonical clipping mechanism in PPO serves as an efficient surrogate for trust re...

Yuan Li, Bo Wang, Yufei Gao, Yuqian Yao, Xinyuan Wang, Zhangyue Yin, Xipeng Qiu

2603.04918 2026-03-05
AI LLM

Roomify: Spatially-Grounded Style Transformation for Immersive Virtual Environments

We present Roomify, a spatially-grounded transformation system that generates themed virtual environments anchored to users' physical rooms while maintaining spatial structure and functional semant...

Xueyang Wang, Qinxuan Cen, Weitao Bi, Yunxiang Ma, Xin Yi, Robert Xiao, Xinyi Fu, Hewu Li

2603.04917 2026-03-05
AI LLM

EVMbench: Evaluating AI Agents on Smart Contract Security

Smart contracts on public blockchains now manage large amounts of value, and vulnerabilities in these systems can lead to substantial losses. As AI agents become more capable at reading, writing, a...

Justin Wang, Andreas Bigger, Xiaohai Xu, Justin W. Lin, Andy Applebaum, Tejal Patwardhan, Alpin Y...

2603.04915 2026-03-05
AI LLM

Alignment Backfire: Language-Dependent Reversal of Safety Interventions Across 16 Languages in LLM Multi-Agent Systems

In perpetrator treatment, a recurring observation is the dissociation between insight and action: offenders articulate remorse yet behavioral change does not follow. We report four preregistered st...

Hiroki Fukui

2603.04904 2026-03-05
AI LLM

AgentSCOPE: Evaluating Contextual Privacy Across Agentic Workflows

Agentic systems are increasingly acting on users' behalf, accessing calendars, email, and personal files to complete everyday tasks. Privacy evaluation for these systems has focused on the input an...

Ivoline C. Ngong, Keerthiram Murugesan, Swanand Kadhe, Justin D. Weisz, Amit Dhurandhar, Karthike...

2603.04902 2026-03-05
AI LLM

EvoTool: Self-Evolving Tool-Use Policy Optimization in LLM Agents via Blame-Aware Mutation and Diversity-Aware Selection

LLM-based agents depend on effective tool-use policies to solve complex tasks, yet optimizing these policies remains challenging due to delayed supervision and the difficulty of credit assignment i...

Shuo Yang, Soyeon Caren Han, Xueqi Ma, Yan Li, Mohammad Reza Ghasemi Madani, Eduard Hovy

2603.04900 2026-03-05
AI LLM

U-Parking: Distributed UWB-Assisted Autonomous Parking System with Robust Localization and Intelligent Planning

This demonstration presents U-Parking, a distributed Ultra-Wideband (UWB)-assisted autonomous parking system. By integrating Large Language Model (LLM)-assisted planning with robust fusion locali...

Yiang Wu, Qiong Wu, Pingyi Fan, Kezhi Wang, Wen Chen, Guoqiang Mao, Khaled B. Letaief

2603.04898 2026-03-05
AI LLM

Can LLMs Capture Expert Uncertainty? A Comparative Analysis of Value Alignment in Ethnographic Qualitative Research

Qualitative analysis of open-ended interviews plays a central role in ethnographic and economic research by uncovering individuals' values, motivations, and culturally embedded financial behaviors....

Arina Kostina, Marios Dikaiakos, Alejandro Porcel, Tassos Stassopoulos

2603.04897 2026-03-05
AI LLM

SEA-TS: Self-Evolving Agent for Autonomous Code Generation of Time Series Forecasting Algorithms

Accurate time series forecasting underpins decision-making across domains, yet conventional ML development suffers from data scarcity in new deployments, poor adaptability under distribution shift,...

Longkun Xu, Xiaochun Zhang, Qiantu Tuo, Rui Li

2603.04873 2026-03-05
AI LLM

Diffusion-Based sRGB Real Noise Generation via Prompt-Driven Noise Representation Learning

Denoising in the sRGB image space is challenging due to noise variability. Although end-to-end methods perform well, their effectiveness in real-world scenarios is limited by the scarcity of real n...

Jaekyun Ko, Dongjin Kim, Soomin Lee, Guanghui Wang, Tae Hyun Kim

2603.04870 2026-03-05
AI LLM

K-Gen: A Multimodal Language-Conditioned Approach for Interpretable Keypoint-Guided Trajectory Generation

Generating realistic and diverse trajectories is a critical challenge in autonomous driving simulation. While Large Language Models (LLMs) show promise, existing methods often rely on structured da...

Mingxuan Mu, Guo Yang, Lei Chen, Ping Wu, Jianxun Cui

2603.04868 2026-03-05