Papers

Research papers from arXiv and related sources

Total: 4694 | AI/LLM: 2583 | Testing: 2111
AI LLM

ROSE: Reordered SparseGPT for More Accurate One-Shot Large Language Models Pruning

Pruning is widely recognized as an effective method for reducing the parameters of large language models (LLMs), potentially leading to more efficient deployment and inference. One classic and prom...

Mingluo Su, Huan Wang

2603.05878 2026-03-06
AI LLM

Shifting Adaptation from Weight Space to Memory Space: A Memory-Augmented Agent for Medical Image Segmentation

Medical image segmentation is fundamental to clinical workflows, yet models trained on a single dataset often fail to generalize across institutions, scanners, or patient populations. While vision ...

Bowen Chen, Qiaohui Gao, Shaowen Wan, Shanhui Sun, Wei Liu, Xiang Li, Tianming Liu, Lin Zhao

2603.05873 2026-03-06
AI LLM

Evolving Deception: When Agents Evolve, Deception Wins

Self-evolving agents offer a promising path toward scalable autonomy. However, in this work, we show that in competitive environments, self-evolution can instead give rise to a serious and previous...

Zonghao Ying, Haowen Dai, Tianyuan Zhang, Yisong Xiao, Quanchen Zou, Aishan Liu, Jian Yang, Yaodo...

2603.05872 2026-03-06
AI LLM

Challenges in Synchronous & Remote Collaboration Around Visualization

We characterize 16 challenges faced by those investigating and developing remote and synchronous collaborative experiences around visualization. Our work reflects the perspectives and prior researc...

Matthew Brehmer, Maxime Cordeil, Christophe Hurter, Takayuki Itoh, Wolfgang Büschel, Mahmood Jasi...

2603.05871 2026-03-06
TESTING

AnyCamVLA: Zero-Shot Camera Adaptation for Viewpoint Robust Vision-Language-Action Models

Despite remarkable progress in Vision-Language-Action models (VLAs) for robot manipulation, these large pre-trained models require fine-tuning to be deployed in specific environments. These fine-tu...

Hyeongjun Heo, Seungyeon Woo, Sang Min Kim, Junho Kim, Junho Lee, Yonghyeon Lee, Young Min Kim

2603.05868 2026-03-06
AI LLM

ReflexiCoder: Teaching Large Language Models to Self-Reflect on Generated Code and Self-Correct It via Reinforcement Learning

While Large Language Models (LLMs) have revolutionized code generation, standard "System 1" approaches, generating solutions in a single forward pass, often hit a performance ceiling when faced wit...

Juyong Jiang, Jiasi Shen, Sunghun Kim, Kang Min Yoo, Jeonghoon Kim, Sungju Kim

2603.05863 2026-03-06
AI LLM

Evolving Medical Imaging Agents via Experience-driven Self-skill Discovery

Clinical image interpretation is inherently multi-step and tool-centric: clinicians iteratively combine visual evidence with patient context, quantify findings, and refine their decisions through a...

Lin Fan, Pengyu Dai, Zhipeng Deng, Haolin Wang, Xun Gong, Yefeng Zheng, Yafei Ou

2603.05860 2026-03-06
AI LLM

How Well Do Current Speech Deepfake Detection Methods Generalize to the Real World?

Recent advances in speech synthesis and voice conversion have greatly improved the naturalness and authenticity of generated audio. Meanwhile, evolving encoding, compression, and transmission mecha...

Daixian Li, Jun Xue, Yanzhen Ren, Zhuolin Yi, Yihuan Huang, Guanxiang Feng, Yi Chai

2603.05852 2026-03-06
AI LLM

The Values of Value in AI Adoption: Rethinking Efficiency in UX Designers' Workplaces

Although organizations increasingly position AI adoption as a pathway to competitiveness and innovation, organizations' perspectives on productivity and efficiency often clash with workers' perspec...

Inha Cha, Catherine Wieczorek, Richmond Y. Wong

2603.05848 2026-03-06
AI LLM

Evaluating LLM Alignment With Human Trust Models

Trust plays a pivotal role in enabling effective cooperation, reducing uncertainty, and guiding decision-making in both human interactions and multi-agent systems. Although it is significant, there...

Anushka Debnath, Stephen Cranefield, Bastin Tony Roy Savarimuthu, Emiliano Lorini

2603.05839 2026-03-06
TESTING

Multi-Segment Consistency Tests of General Relativity

As the LIGO-Virgo-KAGRA Network of gravitational-wave detectors improves in sensitivity, accumulating hundreds of gravitational-wave detections per year, it becomes imperative to improve tests of g...

Vaishak Prasad

2603.05835 2026-03-06
AI LLM

Lexara: A User-Centered Toolkit for Evaluating Large Language Models for Conversational Visual Analytics

Large Language Models (LLMs) are transforming Conversational Visual Analytics (CVA) by enabling data analysis through natural language. However, evaluating LLMs for CVA remains a challenge: requiri...

Srishti Palani, Vidya Setlur

2603.05832 2026-03-06
AI LLM

Knowledge-driven Reasoning for Mobile Agentic AI: Concepts, Approaches, and Directions

Mobile agentic AI is extending autonomous capabilities to resource-constrained platforms such as edge robots and unmanned aerial vehicles (UAVs), where strict size, weight, power, and cost (SWAP-C)...

Guangyuan Liu, Changyuan Zhao, Yinqiu Liu, Dusit Niyato, Biplab Sikdar

2603.05831 2026-03-06
AI LLM

Test-Time Adaptation via Many-Shot Prompting: Benefits, Limits, and Pitfalls

Test-time adaptation enables large language models (LLMs) to modify their behavior at inference without updating model parameters. A common approach is many-shot prompting, where large numbers of i...

Shubhangi Upasani, Chen Wu, Jay Rainton, Bo Li, Changran Hu, Qizheng Zhang, Urmish Thakker

2603.05829 2026-03-06
AI LLM

HART: Data-Driven Hallucination Attribution and Evidence-Based Tracing for Large Language Models

Large language models (LLMs) have demonstrated remarkable performance in text generation and knowledge-intensive question answering. Nevertheless, they are prone to producing hallucinated content, ...

Shize Liang, Hongzhi Wang

2603.05828 2026-03-06
AI LLM

Self-Auditing Parameter-Efficient Fine-Tuning for Few-Shot 3D Medical Image Segmentation

Adapting foundation models to new clinical sites remains challenging in practice. Domain shift and scarce annotations must be handled by experts, yet many clinical groups do not have ready access t...

Son Thai Ly, Hien V. Nguyen

2603.05822 2026-03-06
TESTING

ImKWS: Test-Time Adaptation for Keyword Spotting with Class Imbalance

Keyword spotting (KWS) identifies words for voice assistants, but environmental noise frequently reduces accuracy. Standard adaptation fixes this issue but strictly requires original or labeled aud...

Hanyu Ding, Yang Xiao, Jiaheng Dong, Ting Dang

2603.05821 2026-03-06
TESTING

Which Data Matter? Embedding-Based Data Selection for Speech Recognition

Modern ASR systems are typically trained on large-scale pseudo-labeled, in-the-wild data spanning multiple domains. While such heterogeneous data benefit generalist models designed for broad deploy...

Zakaria Aldeneh, Skyler Seto, Maureen de Seyssel, Jie Chi, Zijin Gu, Takuya Higuchi, Jee-weon Jun...

2603.05819 2026-03-06
TESTING

Two Localization Strategies for Sequential MCMC Data Assimilation with Applications to Nonlinear Non-Gaussian Geophysical Models

We present a localized data assimilation (DA) scheme based on the sequential Markov Chain Monte Carlo (SMCMC) technique [Ruzayqat et al., 2024], a provably convergent method for filtering high-dime...

Hamza Ruzayqat, Hristo G. Chipilski, Omar Knio

2603.05817 2026-03-06
TESTING

Nonlinear Conjugate Gradient Method for Multiobjective Optimization Problems of Interval-Valued Maps

In this article, we propose an algorithm for the nonlinear conjugate gradient method to find a Pareto critical point of unconstrained multiobjective interval optimization problems. In this algorith...

Tapas Mondal, Debdas Ghosh, Jingxin Liu, Jie Li

2603.05814 2026-03-06