Papers
Research papers from arXiv and related sources
ProductResearch: Training E-Commerce Deep Research Agents via Multi-Agent Synthetic Trajectory Distillation
Large Language Model (LLM)-based agents show promise for e-commerce conversational shopping, yet existing implementations lack the interaction depth and contextual breadth required for complex prod...
Jiangyuan Wang, Kejun Xiao, Huaipeng Zhao, Tao Luo, Xiaoyi Zeng
A Reliable Indoor Navigation System for Humans Using AR-based Technique
Reliable navigation systems are not available indoors, such as in campuses and small areas. Users must depend on confusing, time-consuming static signage or floor maps. In this paper, an AR-based t...
Vijay U. Rathod, Manav S. Sharma, Shambhavi Verma, Aadi Joshi, Sachin Aage, Sujal Shahane
From Flat Logs to Causal Graphs: Hierarchical Failure Attribution for LLM-based Multi-Agent Systems
LLM-powered Multi-Agent Systems (MAS) have demonstrated remarkable capabilities in complex domains but suffer from inherent fragility and opaque failure mechanisms. Existing failure attribution met...
Yawen Wang, Wenjie Wu, Junjie Wang, Qing Wang
Privacy-Preserving Local Energy Trading Considering Network Fees
Driven by the widespread deployment of distributed energy resources, local energy markets (LEMs) have emerged as a promising approach for enabling direct trades among prosumers and consumers to bal...
Eman Alqahtani, Mustafa A. Mustafa
Does Personalized Nudging Wear Off? A Longitudinal Study of AI Self-Modeling for Behavioral Engagement
Sustaining the effectiveness of behavior change technologies remains a key challenge. AI self-modeling, which generates personalized portrayals of one's ideal self, has shown promise for motivating...
Qing He, Zeyu Wang, Yuzhou Du, Jiahuan Ding, Yuanchun Shi, Yuntao Wang
ODAR: Principled Adaptive Routing for LLM Reasoning via Active Inference
The paradigm of large language model (LLM) reasoning is shifting from parameter scaling to test-time compute scaling, yet many existing approaches still rely on uniform brute-force sampling (for ex...
Siyuan Ma, Bo Gao, Xiaojun Jia, Simeng Qin, Tianlin Li, Ke Ma, Xiaoshuang Jia, Wenqi Ren, Yang Liu
Suppressing Prior-Comparison Hallucinations in Radiology Report Generation via Semantically Decoupled Latent Steering
Automated radiology report generation using vision-language models (VLMs) is limited by the risk of prior-comparison hallucination, where the model generates historical findings unsupported by the ...
Ao Li, Rui Liu, Mingjie Li, Sheng Liu, Lei Wang, Xiaodan Liang, Lina Yao, Xiaojun Chang, Lei Xing
PseudoAct: Leveraging Pseudocode Synthesis for Flexible Planning and Action Control in Large Language Model Agents
Large language model (LLM) agents typically rely on reactive decision-making paradigms such as ReAct, selecting actions conditioned on growing execution histories. While effective for short tasks, ...
Yihan Wen, Xin Chen
TRIZ-RAGNER: A Retrieval-Augmented Large Language Model for TRIZ-Aware Named Entity Recognition in Patent-Based Contradiction Mining
TRIZ-based contradiction mining is a fundamental task in patent analysis and systematic innovation, as it enables the identification of improving and worsening technical parameters that drive inven...
Zitong Xu, Yuqing Wu, Yue Zhao
ProtoDCS: Towards Robust and Efficient Open-Set Test-Time Adaptation for Vision-Language Models
Large-scale Vision-Language Models (VLMs) exhibit strong zero-shot recognition, yet their real-world deployment is challenged by distribution shifts. While Test-Time Adaptation (TTA) can mitigate t...
Wei Luo, Yangfan Ou, Jin Deng, Zeshuai Deng, Xiquan Yan, Zhiquan Wen, Mingkui Tan
AudioCapBench: Quick Evaluation on Audio Captioning across Sound, Music, and Speech
We introduce AudioCapBench, a benchmark for evaluating audio captioning capabilities of large multimodal models. AudioCapBench covers three distinct audio domains, including environmental sound, music, a...
Jielin Qiu, Jianguo Zhang, Zixiang Chen, Liangwei Yang, Ming Zhu, Juntao Tan, Haolin Chen, Wentin...
SGAgent: Suggestion-Guided LLM-Based Multi-Agent Framework for Repository-Level Software Repair
The rapid advancement of Large Language Models (LLMs) has led to the emergence of intelligent agents capable of autonomously interacting with environments and invoking external tools. Recently, age...
Quanjun Zhang, Chengyu Gao, Yu Han, Ye Shang, Chunrong Fang, Zhenyu Chen, Liang Xiao
AI Must Embrace Specialization via Superhuman Adaptable Intelligence
Everyone from AI executives and researchers to doomsayers, politicians, and activists is talking about Artificial General Intelligence (AGI). Yet, they often don't seem to agree on its exact defini...
Judah Goldfeder, Philippe Wyder, Yann LeCun, Ravid Shwartz Ziv
Stress-Testing Assumptions: A Guide to Bayesian Sensitivity Analyses in Causal Inference
While observational data are routinely used to estimate causal effects of biomedical treatments, doing so requires special methods to adjust for observed confounding. These methods invariably rely ...
Arman Oganisian
Learning to Reflect and Correct: Towards Better Decoding Trajectories for Large-Scale Generative Recommendation
Generative Recommendation (GR) has become a promising paradigm for large-scale recommendation systems. However, existing GR models typically perform single-pass decoding without explicit refinement...
Haibo Xing, Hao Deng, Lingyu Mu, Jinxin Hu, Yu Zhang, Xiaoyi Zeng, Jing Zhang
FlexGuard: Continuous Risk Scoring for Strictness-Adaptive LLM Content Moderation
Ensuring the safety of LLM-generated content is essential for real-world deployment. Most existing guardrail models formulate moderation as a fixed binary classification task, implicitly assuming a...
Zhihao Ding, Jinming Li, Ze Lu, Jieming Shi
When LLMs Help -- and Hurt -- Teaching Assistants in Proof-Based Courses
Teaching assistants (TAs) are essential to grading and feedback provision in proof-based courses, yet these tasks are time-intensive and difficult to scale. Although Large Language Models (LLMs) ha...
Romina Mahinpei, Sofiia Druchyna, Manoel Horta Ribeiro
MMKG-RDS: Reasoning Data Synthesis via Deep Mining of Multimodal Knowledge Graphs
Synthesizing high-quality training data is crucial for enhancing domain models' reasoning abilities. Existing methods face limitations in long-tail knowledge coverage, effectiveness verification, a...
Lun Zhan, Feng Xiong, Huanyong Liu, Feng Zhang, Yuhui Yin
Toward E2E Intelligence in 6G Networks: An AI Agent-Based RAN-CN Converged Intelligence Framework
Recent advances in intelligent network control have primarily relied on task-specific Artificial Intelligence (AI) models deployed separately within the Radio Access Network (RAN) and Core Network ...
Youbin Han, Haneul Ko, Namseok Ko, Tarik Taleb, Yan Chen
DLEBench: Evaluating Small-scale Object Editing Ability for Instruction-based Image Editing Model
Significant progress has been made in the field of Instruction-based Image Editing Models (IIEMs). However, while these models demonstrate plausible adherence to instructions and strong reasoning a...
Shibo Hong, Boxian Ai, Jun Kuang, Wei Wang, FengJiao Chen, Zhongyuan Peng, Chenhao Huang, Yixin Cao