Research

Papers

Research papers from arXiv and related sources

Total: 4513 | AI/LLM: 2483 | Testing: 2030
AI LLM

A Subgoal-driven Framework for Improving Long-Horizon LLM Agents

Large language model (LLM)-based agents have emerged as powerful autonomous controllers for digital environments, including mobile interfaces, operating systems, and web browsers. Web navigation, f...

Taiyi Wang, Sian Gooding, Florian Hartmann, Oriana Riva, Edward Grefenstette

2603.19685 2026-03-20
AI LLM

GoAgent: Group-of-Agents Communication Topology Generation for LLM-based Multi-Agent Systems

Large language model (LLM)-based multi-agent systems (MAS) have demonstrated exceptional capabilities in solving complex tasks, yet their effectiveness depends heavily on the underlying communicati...

Hongjiang Chen, Xin Zheng, Yixin Liu, Pengfei Jiao, Shiyuan Li, Huan Liu, Zhidong Zhao, Ziqi Xu, ...

2603.19677 2026-03-20
TESTING

ATHENA: Adaptive Test-Time Steering for Improving Count Fidelity in Diffusion Models

Text-to-image diffusion models achieve high visual fidelity but surprisingly exhibit systematic failures in numerical control when prompts specify explicit object counts. To address this limitation...

Mohammad Shahab Sepehri, Asal Mehradfar, Berk Tinaz, Salman Avestimehr, Mahdi Soltanolkotabi

2603.19676 2026-03-20
AI LLM

Structured Prompting for Arabic Essay Proficiency: A Trait-Centric Evaluation Approach

This paper presents a novel prompt engineering framework for trait-specific Automatic Essay Scoring (AES) in Arabic, leveraging large language models (LLMs) under zero-shot and few-shot configurati...

Salim Al Mandhari, Hieu Pham Dinh, Mo El-Haj, Paul Rayson

2603.19668 2026-03-20
TESTING

GenFacet: End-to-End Generative Faceted Search via Multi-Task Preference Alignment in E-Commerce

Faceted search acts as a critical bridge for navigating massive e-commerce catalogs, yet traditional systems rely on static rule-based extraction or statistical ranking, struggling with emerging voc...

Zhouwei Zhai, Min Yang, Jin Li

2603.19665 2026-03-20
TESTING

The Residual Stream Is All You Need: On the Redundancy of the KV Cache in Transformer Inference

The key-value (KV) cache is widely treated as essential state in transformer inference, and a large body of work engineers policies to compress, evict, or approximate its entries. We prove that thi...

Kaleem Ullah Qasim, Jiashu Zhang, Muhammad Kafeel Shaheen, Razan Alharith, Heying Zhang

2603.19664 2026-03-20
TESTING

Accurate Open-Loop Control of a Soft Continuum Robot Through Visually Learned Latent Representations

This work addresses open-loop control of a soft continuum robot (SCR) from video-learned latent dynamics. Visual Oscillator Networks (VONs) from previous work are used, which provide mechanistically...

Henrik Krauss, Johann Licher, Naoya Takeishi, Annika Raatz, Takehisa Yairi

2603.19655 2026-03-20
TESTING

Ensembles-based Feature Guided Analysis

Recent Deep Neural Networks (DNN) applications ask for techniques that can explain their behavior. Existing solutions, such as Feature Guided Analysis (FGA), extract rules on their internal behavio...

Federico Formica, Stefano Gregis, Andrea Rota, Aurora Francesca Zanenga, Mark Lawford, Claudio Me...

2603.19653 2026-03-20
AI LLM

PolicySim: An LLM-Based Agent Social Simulation Sandbox for Proactive Policy Optimization

Social platforms serve as central hubs for information exchange, where user behaviors and platform interventions jointly shape opinions. However, intervention policies like recommendation and conte...

Renhong Huang, Ning Tang, Jiarong Xu, Yuxuan Cao, Qingqian Tu, Sheng Guo, Bo Zheng, Huiyuan Liu, ...

2603.19649 2026-03-20
AI LLM

HyEvo: Self-Evolving Hybrid Agentic Workflows for Efficient Reasoning

Although agentic workflows have demonstrated strong potential for solving complex tasks, existing automated generation methods remain inefficient and underperform, as they rely on predefined operat...

Beibei Xu, Yutong Ye, Chuyun Shen, Yingbo Zhou, Cheng Chen, Mingsong Chen

2603.19639 2026-03-20
AI LLM

BEAVER: A Training-Free Hierarchical Prompt Compression Method via Structure-Aware Page Selection

The exponential expansion of context windows in LLMs has unlocked capabilities for long-document understanding but introduced severe bottlenecks in inference latency and information utilization. Ex...

Zhengpei Hu, Kai Li, Dapeng Fu, Chang Zeng, Yue Li, Yuanhao Tang, Jianqiang Huang

2603.19635 2026-03-20
AI LLM

MetaCues: Enabling Critical Engagement with Generative AI for Information Seeking and Sensemaking

Generative AI (GenAI) search tools are increasingly used for information seeking, yet their design tends to encourage cognitive offloading, which may lead to passive engagement, selective attention...

Anjali Singh, Karan Taneja, Zhitong Guan, Soo Young Rieh

2603.19634 2026-03-20
TESTING

Dual Prompt-Driven Feature Encoding for Nighttime UAV Tracking

Robust feature encoding constitutes the foundation of UAV tracking by enabling the nuanced perception of target appearance and motion, thereby playing a pivotal role in ensuring reliable tracking. ...

Yiheng Wang, Changhong Fu, Liangliang Yao, Haobo Zuo, Zijie Zhang

2603.19628 2026-03-20
TESTING

A Concept of Next-Generation Atmospheric Cherenkov Telescope Array (NG-ACTA)

The Next-Generation Atmospheric Cherenkov Telescope Array (NG-ACTA) is proposed as a prospective infrastructure for very high energy (VHE) gamma-ray astronomy, consisting of a mixed-aperture array ...

Jiancheng Wang, Jirong Mao

2603.19622 2026-03-20
TESTING

Blow-up of solutions to the Euler-Poisson-Darboux equation with critical power nonlinearity

In our recent previous work, we established the finite-time blow-up result and an upper bound on the lifespan estimate for the singular Cauchy problem of the semilinear Euler-Poisson-Darboux equation in R^n wi...

Mengting Fan, Ning-An Lai, Hiroyuki Takamura

2603.19614 2026-03-20
TESTING

Universal method for optimized robustness in self-testing of quantum resources

Self-testing is a phenomenon where the use of specific quantum states or measurements can be inferred solely from the correlations they generate. We introduce a universal method for conducting robu...

Shin-Liang Chen, Nikolai Miklin

2603.19612 2026-03-20
AI LLM

Demonstrations, CoT, and Prompting: A Theoretical Analysis of ICL

In-Context Learning (ICL) enables pretrained LLMs to adapt to downstream tasks by conditioning on a small set of input-output demonstrations, without any parameter updates. Although there have been...

Xuhan Tong, Yuchen Zeng, Jiawei Zhang

2603.19611 2026-03-20
AI LLM

ParallelVLM: Lossless Video-LLM Acceleration with Visual Alignment Aware Parallel Speculative Decoding

Although current Video-LLMs achieve impressive performance in video understanding tasks, their autoregressive decoding efficiency remains constrained by the massive number of video tokens. Visual t...

Quan Kong, Yuhao Shen, Yicheng Ji, Huan Li, Cong Wang

2603.19610 2026-03-20
AI LLM

Physion-Eval: Evaluating Physical Realism in Generated Video via Human Reasoning

Video generation models are increasingly used as world simulators for storytelling, simulation, and embodied AI. As these models advance, a key question arises: do generated videos obey the physica...

Qin Zhang, Peiyu Jing, Hong-Xing Yu, Fangqiang Ding, Fan Nie, Weimin Wang, Yilun Du, James Zou, J...

2603.19607 2026-03-20
AI LLM

CO-EVOLVE: Bidirectional Co-Evolution of Graph Structure and Semantics for Heterophilous Learning

The integration of Large Language Models (LLMs) and Graph Neural Networks (GNNs) promises to unify semantic understanding with structural reasoning, yet existing methods typically rely on static, u...

Jinming Xing, Muhammad Shahzad

2603.19596 2026-03-20