Research

Papers

Research papers from arXiv and related sources

Total: 4513, AI/LLM: 2483, Testing: 2030
AI LLM

DALI: LLM-Agent Enhanced Dual-Stream Adaptive Leadership Identification for Group Recommendations

Group recommendation systems play a pivotal role in supporting collective decisions across various contexts, from leisure activities to organizational team-building. Existing group recommendation a...

Boxun Song, Min Gao, Jiawei Cheng

2603.19909 2026-03-20
AI LLM

Utility-Guided Agent Orchestration for Efficient LLM Tool Use

Tool-using large language model (LLM) agents often face a fundamental tension between answer quality and execution cost. Fixed workflows are stable but inflexible, while free-form multi-step reason...

Boyan Liu, Gongming Zhao, Hongli Xu

2603.19896 2026-03-20
AI LLM

What If Consensus Lies? Selective-Complementary Reinforcement Learning at Test Time

Test-Time Reinforcement Learning (TTRL) enables Large Language Models (LLMs) to enhance reasoning capabilities on unlabeled test streams by deriving pseudo-rewards from majority voting consensus. H...

Dong Yan, Jian Liang, Yanbo Wang, Shuo Lu, Ran He, Tieniu Tan

2603.19880 2026-03-20
AI LLM

MedQ-Engine: A Closed-Loop Data Engine for Evolving MLLMs in Medical Image Quality Assessment

Medical image quality assessment (Med-IQA) is a prerequisite for clinical AI deployment, yet multimodal large language models (MLLMs) still fall substantially short of human experts, particularly w...

Jiyao Liu, Junzhi Ning, Wanying Qu, Lihao Liu, Chenglong Ma, Junjun He, Ningsheng Xu

2603.19863 2026-03-20
AI LLM

Beyond detection: cooperative multi-agent reasoning for rapid onboard EO crisis response

Rapid identification of hazardous events is essential for next-generation Earth Observation (EO) missions supporting disaster response. However, current monitoring pipelines remain largely ground-c...

Alejandro D. Mousist, Pedro Delgado de Robles Martín, Raquel Lladró Climent, Julian Cobos Aparicio

2603.19858 2026-03-20
AI LLM

Semantic Delta: An Interpretable Signal Differentiating Human and LLMs Dialogue

Do LLMs talk like us? This question intrigues a multitude of scholars, and it is relevant in many fields, from education to academia. This work presents an interpretable statistical feature for disti...

Riccardo Scantamburlo, Mauro Mezzanzana, Giacomo Buonanno, Francesco Bertolotti

2603.19849 2026-03-20
AI LLM

Overreliance on AI in Information-seeking from Video Content

The ubiquity of multimedia content is reshaping online information spaces, particularly in social media environments. At the same time, search is being rapidly transformed by generative AI, with la...

Anders Giovanni Møller, Elisa Bassignana, Francesco Pierri, Luca Maria Aiello

2603.19843 2026-03-20
AI LLM

Gesture2Speech: How Far Can Hand Movements Shape Expressive Speech?

Human communication seamlessly integrates speech and bodily motion, where hand gestures naturally complement vocal prosody to express intent, emotion, and emphasis. While recent text-to-speech (TTS...

Lokesh Kumar, Nirmesh Shah, Ashishkumar P. Gudmalwar, Pankaj Wasnik

2603.19831 2026-03-20
AI LLM

FormalEvolve: Neuro-Symbolic Evolutionary Search for Diverse and Prover-Effective Autoformalization

Autoformalization aims to translate natural-language mathematics into compilable, machine-checkable statements. However, semantic consistency does not imply prover effectiveness: even semantically ...

Haijian Lu, Wei Wang, Jing Liu

2603.19828 2026-03-20
AI LLM

Borderless Long Speech Synthesis

Most existing text-to-speech (TTS) systems either synthesize speech sentence by sentence and stitch the results together, or drive synthesis from plain-text dialogues alone. Both approaches leave m...

Xingchen Song, Di Wu, Dinghao Zhou, Pengyu Cheng, Hongwu Ding, Yunchao He, Jie Wang, Shengfan She...

2603.19798 2026-03-20
AI LLM

Text-Based Personas for Simulating User Privacy Decisions

The ability to simulate human privacy decisions has significant implications for aligning autonomous agents with individual intent and conducting cost-effective, large-scale privacy-centric user st...

Kassem Fawaz, Ren Yi, Octavian Suciu, Rishabh Khandelwal, Hamza Harkous, Nina Taft, Marco Gruteser

2603.19791 2026-03-20
AI LLM

Embodied Science: Closing the Discovery Loop with Agentic Embodied AI

Artificial intelligence has demonstrated remarkable capability in predicting scientific properties, yet scientific discovery remains an inherently physical, long-horizon pursuit governed by experim...

Xiang Zhuang, Chenyi Zhou, Kehua Feng, Zhihui Zhu, Yunfan Gao, Yijie Zhong, Yichi Zhang, Junjie H...

2603.19782 2026-03-20
AI LLM

Evaluating Image Editing with LLMs: A Comprehensive Benchmark and Intermediate-Layer Probing Approach

Evaluating text-guided image editing (TIE) methods remains a challenging problem, as reliable assessment should simultaneously consider perceptual quality, alignment with textual instructions, and ...

Shiqi Gao, Zitong Xu, Kang Fu, Huiyu Duan, Xiongkuo Min, Jia Wang, Guangtao Zhai

2603.19775 2026-03-20
AI LLM

ConSearcher: Supporting Conversational Information Seeking in Online Communities with Member Personas

Many people browse online communities to learn from others' experiences and opinions, e.g., for constructing travel plans. Conversational search powered by large language models (LLMs) could ease t...

Shiwei Wu, Xinyue Chen, Yuheng Liu, Xingbo Wang, Qingyu Guo, Longfei Chen, Chuhan Shi, Zhenhui Peng

2603.19747 2026-03-20
AI LLM

Rethinking Ground Truth: A Case Study on Human Label Variation in MLLM Benchmarking

Human Label Variation (HLV), i.e. systematic differences among annotators' judgments, remains underexplored in benchmarks despite rapid progress in large language model (LLM) development. We addres...

Tomas Ruiz, Tanalp Agustoslu, Carsten Schwemmer

2603.19744 2026-03-20
AI LLM

Dual Path Attribution: Efficient Attribution for SwiGLU-Transformers through Layer-Wise Target Propagation

Understanding the internal mechanisms of transformer-based large language models (LLMs) is crucial for their reliable deployment and effective operation. While recent efforts have yielded a plethor...

Lasse Marten Jantsch, Dong-Jae Koh, Seonghyeon Lee, Young-Kyoon Suh

2603.19742 2026-03-20
AI LLM

FedPDPO: Federated Personalized Direct Preference Optimization for Large Language Model Alignment

Aligning large language models (LLMs) with human preferences in federated learning (FL) is challenging due to decentralized, privacy-sensitive, and highly non-IID preference data. Direct Preference...

Kewen Zhu, Liping Yi, Zhiming Zhao, Zhuang Qi, Han Yu, Qinghua Hu

2603.19741 2026-03-20
AI LLM

PoC: Performance-oriented Context Compression for Large Language Models via Performance Prediction

While context compression can mitigate the growing inference costs of Large Language Models (LLMs) by shortening contexts, existing methods that specify a target compression ratio or length suffer ...

Runsong Zhao, Shilei Liu, Jiwei Tang, Langming Liu, Haibin Chen, Weidong Zhang, Yujin Yuan, Tong ...

2603.19733 2026-03-20
AI LLM

Stepwise: Neuro-Symbolic Proof Search for Automated Systems Verification

Formal verification via interactive theorem proving is increasingly used to ensure the correctness of critical systems, yet constructing large proof scripts remains highly manual and limits scalabi...

Baoding He, Zenan Li, Wei Sun, Yuan Yao, Taolue Chen, Xiaoxing Ma, Zhendong Su

2603.19715 2026-03-20
AI LLM

TAB-AUDIT: Detecting AI-Fabricated Scientific Tables via Multi-View Likelihood Mismatch

AI-generated fabricated scientific manuscripts raise growing concerns with large-scale breaches of academic integrity. In this work, we present the first systematic study on detecting AI-generated ...

Shuo Huang, Yan Pen, Lizhen Qu

2603.19712 2026-03-20