Research

Papers

Research papers from arXiv and related sources

Total: 4513, AI/LLM: 2483, Testing: 2030
AI LLM

Which Tool Response Should I Trust? Tool-Expertise-Aware Chest X-ray Agent with Multimodal Agentic Learning

AI agents with tool-use capabilities show promise for integrating the domain expertise of various tools. In the medical field, however, tools are usually AI models that are inherently error-prone a...

Zheang Huai, Honglong Yang, Xiaomeng Li

2602.21517 2026-02-25
AI LLM

Training Generalizable Collaborative Agents via Strategic Risk Aversion

Many emerging agentic paradigms require agents to collaborate with one another (or people) to achieve shared goals. Unfortunately, existing approaches to learning policies for such collaborative pr...

Chengrui Qu, Yizhou Zhang, Nicholas Lanzetti, Eric Mazumdar

2602.21515 2026-02-25
TESTING

Deep Unfolding Real-Time Super-Resolution Using Subpixel-Shift Twin Image and Convex Self-Similarity Prior

Multi-image super-resolution (MISR) is a critical technique for satellite remote sensing. In the perspective of information, twin-image super-resolution (TISR) is regarded as the most challenging M...

Chia-Hsiang Lin, Wei-Chih Liu, Yu-En Chiu, Jhao-Ting Lin

2602.21513 2026-02-25
TESTING

AHAN: Asymmetric Hierarchical Attention Network for Identical Twin Face Verification

Identical twin face verification represents an extreme fine-grained recognition challenge where even state-of-the-art systems fail due to overwhelming genetic similarity. Current face recognition m...

Hoang-Nhat Nguyen

2602.21503 2026-02-25
TESTING

See It, Say It, Sorted: An Iterative Training-Free Framework for Visually-Grounded Multimodal Reasoning in LVLMs

Recent large vision-language models (LVLMs) have demonstrated impressive reasoning ability by generating long chain-of-thought (CoT) responses. However, CoT reasoning in multimodal contexts is high...

Yongchang Zhang, Xianzheng Ma, Tianyi Liu, Guangquan Zhou, Yang Chen

2602.21497 2026-02-25
AI LLM

Beyond Refusal: Probing the Limits of Agentic Self-Correction for Semantic Sensitive Information

While defenses for structured PII are mature, Large Language Models (LLMs) pose a new threat: Semantic Sensitive Information (SemSI), where models infer sensitive identity attributes, generate repu...

Umid Suleymanov, Zaur Rajabov, Emil Mirzazada, Murat Kantarcioglu

2602.21496 2026-02-25
AI LLM

GradAlign: Gradient-Aligned Data Selection for LLM Reinforcement Learning

Reinforcement learning (RL) has become a central post-training paradigm for large language models (LLMs), but its performance is highly sensitive to the quality of training problems. This sensitivi...

Ningyuan Yang, Weihua Du, Weiwei Sun, Sean Welleck, Yiming Yang

2602.21492 2026-02-25
AI LLM

StoryComposerAI: Supporting Human-AI Story Co-Creation Through Decomposition and Linking

GenAI's ability to produce text and images is increasingly incorporated into human-AI co-creation tasks such as storytelling and video editing. However, integrating GenAI into these tasks requires ...

Shuo Niu, Dylan Clements, Marina Margalit Nemanov, Hyungsin Kim

2602.21486 2026-02-25
TESTING

Global Sequential Testing for Multi-Stream Auditing

Across many risk-sensitive areas, it is critical to continuously audit the performance of machine learning systems and detect any unusual behavior quickly. This can be modeled as a sequential hypot...

Beepul Bharti, Ambar Pal, Jeremias Sulam

2602.21479 2026-02-25
TESTING

A Knowledge-Driven Approach to Music Segmentation, Music Source Separation and Cinematic Audio Source Separation

We propose a knowledge-driven, model-based approach to segmenting audio into single-category and mixed-category chunks with applications to source separation. "Knowledge" here denotes information a...

Chun-wei Ho, Sabato Marco Siniscalchi, Kai Li, Chin-Hui Lee

2602.21476 2026-02-25
TESTING

Measuring elastic properties of granular hydrogels: Effects of capillary interaction and ionic conditions

The elastic properties of granular hydrogels are commonly characterised under wet conditions, yet the influence of capillary interactions remains unclear. In practical applications, hydrogels opera...

Jiayin Zhao, Haiyi Zhong, Yixiang Gan

2602.21457 2026-02-25
TESTING

Adversarial Robustness of Deep Learning-Based Thyroid Nodule Segmentation in Ultrasound

Introduction: Deep learning-based segmentation models are increasingly integrated into clinical imaging workflows, yet their robustness to adversarial perturbations remains incompletely characteriz...

Nicholas Dietrich, David McShannon

2602.21452 2026-02-25
TESTING

Surrogate-assisted global sensitivity analysis of a hybrid-dimensional Stokes–Brinkman–Darcy model

Development of new multiscale mathematical models often entails considerable complexity and multiple undetermined parameters, typically arising from closure relations. To enable reliable simulation...

Linheng Ruan, Ilja Kröker, Sergey Oladyshkin, Iryna Rybak

2602.21448 2026-02-24
TESTING

VLA Knows Its Limits

Action chunking has recently emerged as a standard practice in flow-based Vision-Language-Action (VLA) models. However, the effect and choice of the execution horizon - the number of actions to be ...

Haoxuan Wang, Gengyu Zhang, Yan Yan, Ramana Rao Kompella, Gaowen Liu

2602.21445 2026-02-24
TESTING

Breathing Black Hole Shadows in Modified Gravity (MOG)

In this paper, we investigate the dynamic phenomenological signatures of a Schwarzschild-MOG black hole shadow perturbed by passing gravitational waves. By perturbing the Hamilton-Jacobi equation f...

Nikko John Leo S. Lobos, Emmanuel T. Rodulfo

2602.21432 2026-02-24
TESTING

PSF-Med: Measuring and Explaining Paraphrase Sensitivity in Medical Vision Language Models

Medical Vision Language Models (VLMs) can change their answers when clinicians rephrase the same question, which raises deployment risks. We introduce Paraphrase Sensitivity Failure (PSF)-Med, a be...

Binesh Sadanandan, Vahid Behzadan

2602.21428 2026-02-24
TESTING

Automating Timed Up and Go Phase Segmentation and Gait Analysis via the tugturn Markerless 3D Pipeline

Instrumented Timed Up and Go (TUG) analysis can support clinical and research decision-making, but robust and reproducible markerless pipelines are still limited. We present tugturn.py, a ...

Abel Gonçalves Chinaglia, Guilherme Manna Cesar, Paulo Roberto Pereira Santiago

2602.21425 2026-02-24
TESTING

Generative Bayesian Computation as a Scalable Alternative to Gaussian Process Surrogates

Gaussian process (GP) surrogates are the default tool for emulating expensive computer experiments, but cubic cost, stationarity assumptions, and Gaussian predictive distributions limit their reach...

Nick Polson, Vadim Sokolov

2602.21408 2026-02-24
TESTING

The turbulence driving mode in NGC7793 and NGC1313

We present spatially resolved measurements of turbulence driving modes across entire extragalactic discs of NGC7793 and NGC1313, using Atacama Large Millimetre/submillimetre Array (ALMA) CO(J=2-1) ...

Lewis J Miller, Kathryn Grasha, Christoph Federrath

2602.21405 2026-02-24
TESTING

The Headless Firm: How AI Reshapes Enterprise Boundaries

The boundary of the firm is determined by coordination cost. We argue that agentic AI induces a structural change in how coordination costs scale: in prior modular systems, integration cost grew wi...

Tassilo Klein, Sebastian Wieczorek

2602.21401 2026-02-24