Papers

Research papers from arXiv and related sources

Total: 4513 · AI/LLM: 2483 · Testing: 2030
AI LLM

When More Is Less: A Systematic Analysis of Spatial and Commonsense Information for Visual Spatial Reasoning

Visual spatial reasoning (VSR) remains challenging for modern vision-language models (VLMs), despite advances in multimodal architectures. A common strategy is to inject additional information at i...

Muku Akasaka, Soyeon Caren Han

2602.21619 2026-02-25
AI LLM

Structurally Aligned Subtask-Level Memory for Software Engineering Agents

Large Language Models (LLMs) have demonstrated significant potential as autonomous software engineering (SWE) agents. Recent work has further explored augmenting these agents with memory mechanisms...

Kangning Shen, Jingyuan Zhang, Chenxi Sun, Wencong Zeng, Yang Yue

2602.21611 2026-02-25
AI LLM

MixSarc: A Bangla-English Code-Mixed Corpus for Implicit Meaning Identification

Bangla-English code-mixing is widespread across South Asian social media, yet resources for implicit meaning identification in this setting remain scarce. Existing sentiment and sarcasm models larg...

Kazi Samin Yasar Alam, Md Tanbir Chowdhury, Tamim Ahmed, Ajwad Abrar, Md Rafid Haque

2602.21608 2026-02-25
AI LLM

Inverse prediction of capacitor multiphysics dynamic parameters using deep generative model

Package design engineers run finite element simulations to model design structures. The process is irreversible, meaning every minute structural adjustment requires a fresh input parameter ru...

Kart-Leong Lim, Rahul Dutta, Mihai Rotaru

2602.21606 2026-02-25
AI LLM

Towards Autonomous Graph Data Analytics with Analytics-Augmented Generation

This paper argues that reliable end-to-end graph data analytics cannot be achieved by retrieval- or code-generation-centric LLM agents alone. Although large language models (LLMs) provide strong re...

Qiange Wang, Chaoyi Chen, Jingqi Gao, Zihan Wang, Yanfeng Zhang, Ge Yu

2602.21604 2026-02-25
AI LLM

AQR-HNSW: Accelerating Approximate Nearest Neighbor Search via Density-aware Quantization and Multi-stage Re-ranking

Approximate Nearest Neighbor (ANN) search has become fundamental to modern AI infrastructure, powering recommendation systems, search engines, and large language models across industry leaders from...

Ganap Ashit Tewary, Nrusinga Charan Gantayat, Jeff Zhang

2602.21600 2026-02-25
AI LLM

Retrieval Challenges in Low-Resource Public Service Information: A Case Study on Food Pantry Access

Public service information systems are often fragmented, inconsistently formatted, and outdated. These characteristics create low-resource retrieval environments that hinder timely access to critic...

Touseef Hasan, Laila Cure, Souvika Sarkar

2602.21598 2026-02-25
AI LLM

SPOC: Safety-Aware Planning Under Partial Observability And Physical Constraints

Embodied Task Planning with large language models faces safety challenges in real-world environments, where partial observability and physical constraints must be respected. Existing benchmarks oft...

Hyungmin Kim, Hobeom Jeon, Dohyung Kim, Minsu Jang, Jeahong Kim

2602.21595 2026-02-25
AI LLM

Breaking Semantic-Aware Watermarks via LLM-Guided Coherence-Preserving Semantic Injection

Generative images have proliferated on Web platforms in social media and online copyright distribution scenarios, and semantic watermarking has increasingly been integrated into diffusion models to...

Zheng Gao, Xiaoyu Li, Zhicheng Bao, Xiaoyan Feng, Jiaojiao Jiang

2602.21593 2026-02-25
AI LLM

CADC: Content Adaptive Diffusion-Based Generative Image Compression

Diffusion-based generative image compression has demonstrated remarkable potential for achieving realistic reconstruction at ultra-low bitrates. The key to unlocking this potential lies in making t...

Xihua Sheng, Lingyu Zhu, Tianyu Zhang, Dong Liu, Shiqi Wang, Jing Wang

2602.21591 2026-02-25
AI LLM

Hall effect on nontrivial quadrupole order in quasi-kagome compound URhSn

This study focuses on the transport properties of the quasi-kagome compound URhSn, which exhibits successive phase transitions at TC =16 K (ferromagnetic phase) and TO =54 K (intermediate phase). A...

Yusei Shimizu, Arvind Maurya, Yoshiya Homma, Motoi Kimata, Toni Helm, Ai Nakamura, Dexin Li, Atsu...

2602.21587 2026-02-25
AI LLM

Duel-Evolve: Reward-Free Test-Time Scaling via LLM Self-Preferences

Many applications seek to optimize LLM outputs at test time by iteratively proposing, scoring, and refining candidates over a discrete output space. Existing methods use a calibrated scalar evaluat...

Sweta Karlekar, Carolina Zheng, Magnus Saebo, Nicolas Beltran-Velez, Shuyang Yu, John Bowlan, Mic...

2602.21585 2026-02-25
AI LLM

Exploring Human-Machine Coexistence in Symmetrical Reality

In the context of the evolution of artificial intelligence (AI), the interaction between humans and AI entities has become increasingly salient, challenging the conventional human-centric paradigms...

Zhenliang Zhang

2602.21584 2026-02-25
AI LLM

Power and Limitations of Aggregation in Compound AI Systems

When designing compound AI systems, a common approach is to query multiple copies of the same model and aggregate the responses to produce a synthesized output. Given the homogeneity of these model...

Nivasini Ananthakrishnan, Meena Jagadeesan

2602.21556 2026-02-25
AI LLM

DualPath: Breaking the Storage Bandwidth Bottleneck in Agentic LLM Inference

The performance of multi-turn, agentic LLM inference is increasingly dominated by KV-Cache storage I/O rather than computation. In prevalent disaggregated architectures, loading the massive KV-Cach...

Yongtong Wu, Shaoyuan Chen, Yinmin Zhong, Rilin Huang, Yixuan Tan, Wentao Zhang, Liyue Zhang, Sha...

2602.21548 2026-02-25
AI LLM

RAC: Relation-Aware Cache Replacement for Large Language Models

The scaling of Large Language Model (LLM) services faces significant cost and latency challenges, making effective caching under tight capacity crucial. Existing cache replacement policies, from he...

Yuchong Wu, Zihuan Xu, Wangze Ni, Peng Cheng, Lei Chen, Xuemin Lin, Heng Tao Shen, Kui Ren

2602.21547 2026-02-25
AI LLM

ARLArena: A Unified Framework for Stable Agentic Reinforcement Learning

Agentic reinforcement learning (ARL) has rapidly gained attention as a promising paradigm for training agents to solve complex, multi-step interactive tasks. Despite encouraging early results, ARL ...

Xiaoxuan Wang, Han Zhang, Haixin Wang, Yidan Shi, Ruoyan Li, Kaiqiao Han, Chenyi Tong, Haoran Den...

2602.21534 2026-02-25
AI LLM

Reasoning-Driven Design of Single Atom Catalysts via a Multi-Agent Large Language Model Framework

Large language models (LLMs) are becoming increasingly applied beyond natural language processing, demonstrating strong capabilities in complex scientific tasks that traditionally require human exp...

Dong Hyeon Mok, Seoin Back, Victor Fung, Guoxiang Hu

2602.21533 2026-02-25
AI LLM

One Brain, Omni Modalities: Towards Unified Non-Invasive Brain Decoding with Large Language Models

Deciphering brain function through non-invasive recordings requires synthesizing complementary high-frequency electromagnetic (EEG/MEG) and low-frequency metabolic (fMRI) signals. However, despite ...

Changli Tang, Shurui Li, Junliang Wang, Qinfan Xiao, Zhonghao Zhai, Lei Bai, Yu Qiao, Bowen Zhou,...

2602.21522 2026-02-25
AI LLM

Which Tool Response Should I Trust? Tool-Expertise-Aware Chest X-ray Agent with Multimodal Agentic Learning

AI agents with tool-use capabilities show promise for integrating the domain expertise of various tools. In the medical field, however, tools are usually AI models that are inherently error-prone a...

Zheang Huai, Honglong Yang, Xiaomeng Li

2602.21517 2026-02-25