Research

Papers

Research papers from arXiv and related sources

Total: 4513 AI/LLM: 2483 Testing: 2030
AI LLM

SAGE-LLM: Towards Safe and Generalizable LLM Controller with Fuzzy-CBF Verification and Graph-Structured Knowledge Retrieval for UAV Decision

In UAV dynamic decision, complex and variable hazardous factors pose severe challenges to the generalization capability of algorithms. Despite offering semantic understanding and scene generalizati...

Wenzhe Zhao, Yang Zhao, Ganchao Liu, Zhiyu Jiang, Dandan Ma, Zihao Li, Xuelong Li

2602.23719 2026-02-27
AI LLM

ProductResearch: Training E-Commerce Deep Research Agents via Multi-Agent Synthetic Trajectory Distillation

Large Language Model (LLM)-based agents show promise for e-commerce conversational shopping, yet existing implementations lack the interaction depth and contextual breadth required for complex prod...

Jiangyuan Wang, Kejun Xiao, Huaipeng Zhao, Tao Luo, Xiaoyi Zeng

2602.23716 2026-02-27
AI LLM

A Reliable Indoor Navigation System for Humans Using AR-based Technique

Reliable navigation systems are not available indoors, such as in campuses and small areas. Users must depend on confusing, time-consuming static signage or floor maps. In this paper, an AR-based t...

Vijay U. Rathod, Manav S. Sharma, Shambhavi Verma, Aadi Joshi, Sachin Aage, Sujal Shahane

2602.23706 2026-02-27
AI LLM

From Flat Logs to Causal Graphs: Hierarchical Failure Attribution for LLM-based Multi-Agent Systems

LLM-powered Multi-Agent Systems (MAS) have demonstrated remarkable capabilities in complex domains but suffer from inherent fragility and opaque failure mechanisms. Existing failure attribution met...

Yawen Wang, Wenjie Wu, Junjie Wang, Qing Wang

2602.23701 2026-02-27
AI LLM

Does Personalized Nudging Wear Off? A Longitudinal Study of AI Self-Modeling for Behavioral Engagement

Sustaining the effectiveness of behavior change technologies remains a key challenge. AI self-modeling, which generates personalized portrayals of one's ideal self, has shown promise for motivating...

Qing He, Zeyu Wang, Yuzhou Du, Jiahuan Ding, Yuanchun Shi, Yuntao Wang

2602.23688 2026-02-27
AI LLM

ODAR: Principled Adaptive Routing for LLM Reasoning via Active Inference

The paradigm of large language model (LLM) reasoning is shifting from parameter scaling to test-time compute scaling, yet many existing approaches still rely on uniform brute-force sampling (for ex...

Siyuan Ma, Bo Gao, Xiaojun Jia, Simeng Qin, Tianlin Li, Ke Ma, Xiaoshuang Jia, Wenqi Ren, Yang Liu

2602.23681 2026-02-27
AI LLM

Suppressing Prior-Comparison Hallucinations in Radiology Report Generation via Semantically Decoupled Latent Steering

Automated radiology report generation using vision-language models (VLMs) is limited by the risk of prior-comparison hallucination, where the model generates historical findings unsupported by the ...

Ao Li, Rui Liu, Mingjie Li, Sheng Liu, Lei Wang, Xiaodan Liang, Lina Yao, Xiaojun Chang, Lei Xing

2602.23676 2026-02-27
AI LLM

PseudoAct: Leveraging Pseudocode Synthesis for Flexible Planning and Action Control in Large Language Model Agents

Large language model (LLM) agents typically rely on reactive decision-making paradigms such as ReAct, selecting actions conditioned on growing execution histories. While effective for short tasks, ...

Yihan, Wen, Xin Chen

2602.23668 2026-02-27
AI LLM

TRIZ-RAGNER: A Retrieval-Augmented Large Language Model for TRIZ-Aware Named Entity Recognition in Patent-Based Contradiction Mining

TRIZ-based contradiction mining is a fundamental task in patent analysis and systematic innovation, as it enables the identification of improving and worsening technical parameters that drive inven...

Zitong Xu, Yuqing Wu, Yue Zhao

2602.23656 2026-02-27
AI LLM

AudioCapBench: Quick Evaluation on Audio Captioning across Sound, Music, and Speech

We introduce AudioCapBench, a benchmark for evaluating audio captioning capabilities of large multimodal models. \method covers three distinct audio domains, including environmental sound, music, a...

Jielin Qiu, Jianguo Zhang, Zixiang Chen, Liangwei Yang, Ming Zhu, Juntao Tan, Haolin Chen, Wentin...

2602.23649 2026-02-27
AI LLM

SGAgent: Suggestion-Guided LLM-Based Multi-Agent Framework for Repository-Level Software Repair

The rapid advancement of Large Language Models (LLMs) has led to the emergence of intelligent agents capable of autonomously interacting with environments and invoking external tools. Recently, age...

Quanjun Zhang, Chengyu Gao, Yu Han, Ye Shang, Chunrong Fang, Zhenyu Chen, Liang Xiao

2602.23647 2026-02-27
AI LLM

AI Must Embrace Specialization via Superhuman Adaptable Intelligence

Everyone from AI executives and researchers to doomsayers, politicians, and activists is talking about Artificial General Intelligence (AGI). Yet, they often don't seem to agree on its exact defini...

Judah Goldfeder, Philippe Wyder, Yann LeCun, Ravid Shwartz Ziv

2602.23643 2026-02-27
AI LLM

FlexGuard: Continuous Risk Scoring for Strictness-Adaptive LLM Content Moderation

Ensuring the safety of LLM-generated content is essential for real-world deployment. Most existing guardrail models formulate moderation as a fixed binary classification task, implicitly assuming a...

Zhihao Ding, Jinming Li, Ze Lu, Jieming Shi

2602.23636 2026-02-27
AI LLM

When LLMs Help -- and Hurt -- Teaching Assistants in Proof-Based Courses

Teaching assistants (TAs) are essential to grading and feedback provision in proof-based courses, yet these tasks are time-intensive and difficult to scale. Although Large Language Models (LLMs) ha...

Romina Mahinpei, Sofiia Druchyna, Manoel Horta Ribeiro

2602.23635 2026-02-27
AI LLM

Toward E2E Intelligence in 6G Networks: An AI Agent-Based RAN-CN Converged Intelligence Framework

Recent advances in intelligent network control have primarily relied on task-specific Artificial Intelligence (AI) models deployed separately within the Radio Access Network (RAN) and Core Network ...

Youbin Han, Haneul Ko, Namseok Ko, Tarik Taleb, Yan Chen

2602.23623 2026-02-27
AI LLM

DLEBench: Evaluating Small-scale Object Editing Ability for Instruction-based Image Editing Model

Significant progress has been made in the field of Instruction-based Image Editing Models (IIEMs). However, while these models demonstrate plausible adherence to instructions and strong reasoning a...

Shibo Hong, Boxian Ai, Jun Kuang, Wei Wang, FengJiao Chen, Zhongyuan Peng, Chenhao Huang, Yixin Cao

2602.23622 2026-02-27
AI LLM

Synthetic Data Powers Product Retrieval for Long-tail Knowledge-Intensive Queries in E-commerce Search

Product retrieval is the backbone of e-commerce search: for each user query, it identifies a high-recall candidate set from billions of items, laying the foundation for high-quality ranking and use...

Gui Ling, Weiyuan Li, Yue Jiang, Wenjun Peng, Xingxian Liu, Dongshuai Li, Fuyu Lv, Dan Ou, Haihon...

2602.23620 2026-02-27
AI LLM

LLM-Driven Multi-Turn Task-Oriented Dialogue Synthesis for Realistic Reasoning

The reasoning capability of large language models (LLMs), defined as their ability to analyze, infer, and make decisions based on input information, is essential for building intelligent task-orien...

Yu Zhu, Kai Yang

2602.23610 2026-02-27
AI LLM

LFQA-HP-1M: A Large-Scale Human Preference Dataset for Long-Form Question Answering

Long-form question answering (LFQA) demands nuanced evaluation of multi-sentence explanatory responses, yet existing metrics often fail to reflect human judgment. We present LFQA-HP-1M, a large-sca...

Rafid Ishrak Jahan, Fahmid Shahriar Iqbal, Sagnik Ray Choudhury

2602.23603 2026-02-27
AI LLM

KEEP: A KV-Cache-Centric Memory Management System for Efficient Embodied Planning

Memory-augmented Large Language Models (LLMs) have demonstrated remarkable capability for complex and long-horizon embodied planning. By keeping track of past experiences and environmental states, ...

Zebin Yang, Tong Xie, Baotong Lu, Shaoshan Liu, Bo Yu, Meng Li

2602.23592 2026-02-27