Papers
Research papers from arXiv and related sources
Which Tool Response Should I Trust? Tool-Expertise-Aware Chest X-ray Agent with Multimodal Agentic Learning
AI agents with tool-use capabilities show promise for integrating the domain expertise of various tools. In the medical field, however, tools are usually AI models that are inherently error-prone a...
Zheang Huai, Honglong Yang, Xiaomeng Li
Training Generalizable Collaborative Agents via Strategic Risk Aversion
Many emerging agentic paradigms require agents to collaborate with one another (or people) to achieve shared goals. Unfortunately, existing approaches to learning policies for such collaborative pr...
Chengrui Qu, Yizhou Zhang, Nicholas Lanzetti, Eric Mazumdar
Deep Unfolding Real-Time Super-Resolution Using Subpixel-Shift Twin Image and Convex Self-Similarity Prior
Multi-image super-resolution (MISR) is a critical technique for satellite remote sensing. From an information perspective, twin-image super-resolution (TISR) is regarded as the most challenging M...
Chia-Hsiang Lin, Wei-Chih Liu, Yu-En Chiu, Jhao-Ting Lin
AHAN: Asymmetric Hierarchical Attention Network for Identical Twin Face Verification
Identical twin face verification represents an extreme fine-grained recognition challenge where even state-of-the-art systems fail due to overwhelming genetic similarity. Current face recognition m...
Hoang-Nhat Nguyen
See It, Say It, Sorted: An Iterative Training-Free Framework for Visually-Grounded Multimodal Reasoning in LVLMs
Recent large vision-language models (LVLMs) have demonstrated impressive reasoning ability by generating long chain-of-thought (CoT) responses. However, CoT reasoning in multimodal contexts is high...
Yongchang Zhang, Xianzheng Ma, Tianyi Liu, Guangquan Zhou, Yang Chen
Beyond Refusal: Probing the Limits of Agentic Self-Correction for Semantic Sensitive Information
While defenses for structured PII are mature, Large Language Models (LLMs) pose a new threat: Semantic Sensitive Information (SemSI), where models infer sensitive identity attributes, generate repu...
Umid Suleymanov, Zaur Rajabov, Emil Mirzazada, Murat Kantarcioglu
GradAlign: Gradient-Aligned Data Selection for LLM Reinforcement Learning
Reinforcement learning (RL) has become a central post-training paradigm for large language models (LLMs), but its performance is highly sensitive to the quality of training problems. This sensitivi...
Ningyuan Yang, Weihua Du, Weiwei Sun, Sean Welleck, Yiming Yang
StoryComposerAI: Supporting Human-AI Story Co-Creation Through Decomposition and Linking
GenAI's ability to produce text and images is increasingly incorporated into human-AI co-creation tasks such as storytelling and video editing. However, integrating GenAI into these tasks requires ...
Shuo Niu, Dylan Clements, Marina Margalit Nemanov, Hyungsin Kim
Global Sequential Testing for Multi-Stream Auditing
Across many risk-sensitive areas, it is critical to continuously audit the performance of machine learning systems and detect any unusual behavior quickly. This can be modeled as a sequential hypot...
Beepul Bharti, Ambar Pal, Jeremias Sulam
A Knowledge-Driven Approach to Music Segmentation, Music Source Separation and Cinematic Audio Source Separation
We propose a knowledge-driven, model-based approach to segmenting audio into single-category and mixed-category chunks with applications to source separation. "Knowledge" here denotes information a...
Chun-wei Ho, Sabato Marco Siniscalchi, Kai Li, Chin-Hui Lee
Measuring elastic properties of granular hydrogels: Effects of capillary interaction and ionic conditions
The elastic properties of granular hydrogels are commonly characterised under wet conditions, yet the influence of capillary interactions remains unclear. In practical applications, hydrogels opera...
Jiayin Zhao, Haiyi Zhong, Yixiang Gan
Adversarial Robustness of Deep Learning-Based Thyroid Nodule Segmentation in Ultrasound
Introduction: Deep learning-based segmentation models are increasingly integrated into clinical imaging workflows, yet their robustness to adversarial perturbations remains incompletely characteriz...
Nicholas Dietrich, David McShannon
Surrogate-assisted global sensitivity analysis of a hybrid-dimensional Stokes–Brinkman–Darcy model
Development of new multiscale mathematical models often entails considerable complexity and multiple undetermined parameters, typically arising from closure relations. To enable reliable simulation...
Linheng Ruan, Ilja Kröker, Sergey Oladyshkin, Iryna Rybak
VLA Knows Its Limits
Action chunking has recently emerged as a standard practice in flow-based Vision-Language-Action (VLA) models. However, the effect and choice of the execution horizon - the number of actions to be ...
Haoxuan Wang, Gengyu Zhang, Yan Yan, Ramana Rao Kompella, Gaowen Liu
Breathing Black Hole Shadows in Modified Gravity (MOG)
In this paper, we investigate the dynamic phenomenological signatures of a Schwarzschild-MOG black hole shadow perturbed by passing gravitational waves. By perturbing the Hamilton-Jacobi equation f...
Nikko John Leo S. Lobos, Emmanuel T. Rodulfo
PSF-Med: Measuring and Explaining Paraphrase Sensitivity in Medical Vision Language Models
Medical Vision Language Models (VLMs) can change their answers when clinicians rephrase the same question, which raises deployment risks. We introduce Paraphrase Sensitivity Failure (PSF)-Med, a be...
Binesh Sadanandan, Vahid Behzadan
Automating Timed Up and Go Phase Segmentation and Gait Analysis via the tugturn Markerless 3D Pipeline
Instrumented Timed Up and Go (TUG) analysis can support clinical and research decision-making, but robust and reproducible markerless pipelines are still limited. We present tugturn.py, a ...
Abel Gonçalves Chinaglia, Guilherme Manna Cesar, Paulo Roberto Pereira Santiago
Generative Bayesian Computation as a Scalable Alternative to Gaussian Process Surrogates
Gaussian process (GP) surrogates are the default tool for emulating expensive computer experiments, but cubic cost, stationarity assumptions, and Gaussian predictive distributions limit their reach...
Nick Polson, Vadim Sokolov
The turbulence driving mode in NGC7793 and NGC1313
We present spatially resolved measurements of turbulence driving modes across entire extragalactic discs of NGC7793 and NGC1313, using Atacama Large Millimetre/submillimetre Array (ALMA) CO(J=2-1) ...
Lewis J Miller, Kathryn Grasha, Christoph Federrath
The Headless Firm: How AI Reshapes Enterprise Boundaries
The boundary of the firm is determined by coordination cost. We argue that agentic AI induces a structural change in how coordination costs scale: in prior modular systems, integration cost grew wi...
Tassilo Klein, Sebastian Wieczorek