Research

Papers

Research papers from arXiv and related sources

Total: 4513, AI/LLM: 2483, Testing: 2030
AI LLM

The Reasoning Error About Reasoning: Why Different Types of Reasoning Require Different Representational Structures

Different types of reasoning impose different structural demands on representational systems, yet no systematic account of these demands exists across psychology, AI, and philosophy of mind. I prop...

Yiling Wu

2603.21736 2026-03-23
AI LLM

Cognitive Agency Surrender: Defending Epistemic Sovereignty via Scaffolded AI Friction

The proliferation of Generative Artificial Intelligence has transformed benign cognitive offloading into a systemic risk of cognitive agency surrender. Driven by the commercial dogma of "zero-frict...

Kuangzhe Xu, Yu Shen, Longjie Yan, Yinghui Ren

2603.21735 2026-03-23
AI LLM

EvoIdeator: Evolving Scientific Ideas through Checklist-Grounded Reinforcement Learning

Scientific idea generation is a cornerstone of autonomous knowledge discovery, yet the iterative evolution required to transform initial concepts into high-quality research proposals remains a form...

Andreas Sauter, Yuyue Zhao, Jacopo Urbani, Wenxiang Hu, Zaiqiao Meng, Lun Zhou, Xiaohui Yan, Youg...

2603.21728 2026-03-23
AI LLM

LSAI: A Large Small AI Model Codesign Framework for Agentic Robot Scenarios

The development of Artificial Intelligence (AI) has made agentic robots an appealing paradigm for various applications, such as search and rescue in complex environments. In this context, the n...

Longyu Zhou, Supeng Leng, Tianhao Liang, Jianping Yao

2603.21726 2026-03-23
AI LLM

CurvZO: Adaptive Curvature-Guided Sparse Zeroth-Order Optimization for Efficient LLM Fine-Tuning

Fine-tuning large language models (LLMs) with backpropagation achieves high performance but incurs substantial memory overhead, limiting scalability on resource-constrained hardware. Zeroth-order (...

Shuo Wang, Ziyu Chen, Ming Tang

2603.21725 2026-03-23
AI LLM

Probing How Scalable Table Data Enhances General Long-Context Reasoning

As real-world tasks grow increasingly complex, long-context reasoning has become a core capability for Large Language Models (LLMs). However, few studies explore which data types are effective for ...

Huaibing Xie, Guoliang Zhao, Yang Liu, Shihan Dou, Siming Huang, Yanling Xiao, Shaolei Wang, Yiti...

2603.21719 2026-03-23
AI LLM

When Exploration Comes for Free with Mixture-Greedy: Do we need UCB in Diversity-Aware Multi-Armed Bandits?

Efficient selection among multiple generative models is increasingly important in modern generative AI, where sampling from suboptimal models is costly. This problem can be formulated as a multi-ar...

Bahar Dibaei Nia, Farzan Farnia

2603.21716 2026-03-23
AI LLM

Compensating Visual Insufficiency with Stratified Language Guidance for Long-Tail Class Incremental Learning

Long-tail class incremental learning (LT-CIL) remains highly challenging because the scarcity of samples in tail classes not only hampers their learning but also exacerbates catastrophic forgetting...

Xi Wang, Xu Yang, Donghao Sun, Cheng Deng

2603.21708 2026-03-23
AI LLM

Data-Free Layer-Adaptive Merging via Fisher Information for Long-to-Short Reasoning LLMs

Model merging has emerged as a practical approach to combine capabilities of specialized large language models (LLMs) without additional training. In the Long-to-Short (L2S) scenario, merging a bas...

Tian Xia

2603.21705 2026-03-23
AI LLM

Rethinking Token Reduction for Large Vision-Language Models

Large Vision-Language Models (LVLMs) excel in visual understanding and reasoning, but the excessive visual tokens lead to high inference costs. Although recent token reduction methods mitigate this...

Yi Wang, Haofei Zhang, Qihan Huang, Anda Cao, Gongfan Fang, Wei Wang, Xuan Jin, Jie Song, Mingli ...

2603.21701 2026-03-23
AI LLM

Structured Visual Narratives Undermine Safety Alignment in Multimodal Large Language Models

Multimodal Large Language Models (MLLMs) extend text-only LLMs with visual reasoning, but also introduce new safety failure modes under visually grounded instructions. We study comic-template jailb...

Rui Yang Tan, Yujia Hu, Roy Ka-Wei Lee

2603.21697 2026-03-23
AI LLM

MIND: Multi-agent inference for negotiation dialogue in travel planning

While Multi-Agent Debate (MAD) research has advanced, its efficacy in coordinating complex stakeholder interests such as travel planning remains largely unexplored. To bridge this gap, we propose M...

Hunmin Do, Taejun Yoon, Kiyong Jung

2603.21696 2026-03-23
AI LLM

Reasoning Provenance for Autonomous AI Agents: Structured Behavioral Analytics Beyond State Checkpoints and Execution Traces

As AI agents transition from human-supervised copilots to autonomous platform infrastructure, the ability to analyze their reasoning behavior across populations of investigations becomes a pressing...

Neelmani Vispute

2603.21692 2026-03-23
AI LLM

AI Token Futures Market: Commoditization of Compute and Derivatives Contract Design

As large language models (LLMs) and vision-language-action models (VLAs) become widely deployed, the tokens consumed by AI inference are evolving into a new type of commodity. This paper systematic...

Yicai Xing

2603.21690 2026-03-23
AI LLM

Mirage: The Illusion of Visual Understanding

Multimodal AI systems have achieved remarkable performance across a broad range of real-world tasks, yet the mechanisms underlying visual-language reasoning remain surprisingly poorly understood. W...

Mohammad Asadi, Jack W. O'Sullivan, Fang Cao, Tahoura Nedaee, Kamyar Fardi, Fei-Fei Li, Ehsan Ade...

2603.21687 2026-03-23
AI LLM

Is AI Ready for Multimodal Hate Speech Detection? A Comprehensive Dataset and Benchmark Evaluation

Hate speech online targets individuals or groups based on identity attributes and spreads rapidly, posing serious social risks. Memes, which combine images and text, have emerged as a nuanced vehic...

Rui Xing, Qi Chai, Jie Ma, Jing Tao, Pinghui Wang, Shuming Zhang, Xinping Wang, Hao Wang

2603.21686 2026-03-23
AI LLM

Optimizing Multi-Agent Weather Captioning via Text Gradient Descent: A Training-Free Approach with Consensus-Aware Gradient Fusion

Generating interpretable natural language captions from weather time series data remains a significant challenge at the intersection of meteorological science and natural language processing. While...

Shixu Liu

2603.21673 2026-03-23
AI LLM

TAMTRL: Teacher-Aligned Reward Reshaping for Multi-Turn Reinforcement Learning in Long-Context Compression

The rapid progress of large language models (LLMs) has led to remarkable performance gains across a wide range of tasks. However, when handling long documents that exceed the model's context window...

Li Wang, Yandong Wang, Xin Yu, Kui Zhang, Tianhao Peng, Wenjun Wu

2603.21663 2026-03-23
AI LLM

OmniFM: Toward Modality-Robust and Task-Agnostic Federated Learning for Heterogeneous Medical Imaging

Federated learning (FL) has become a promising paradigm for collaborative medical image analysis, yet existing frameworks remain tightly coupled to task-specific backbones and are fragile under het...

Meilin Liu, Jiaying Wang, Jing Shan

2603.21660 2026-03-23
AI LLM

A Comparative Analysis of LLM Memorization at Statistical and Internal Levels: Cross-Model Commonalities and Model-Specific Signatures

Memorization is a fundamental component of intelligence for both humans and LLMs. However, while LLM performance scales rapidly, our understanding of memorization lags. Due to limited access to the...

Bowen Chen, Namgi Han, Yusuke Miyao

2603.21658 2026-03-23