Research

Papers

Research papers from arXiv and related sources

Total: 4694 | AI/LLM: 2583 | Testing: 2111
AI/LLM

CARE: Towards Clinical Accountability in Multi-Modal Medical Reasoning with an Evidence-Grounded Agentic Framework

Large visual language models (VLMs) have shown strong multi-modal medical reasoning ability, but most operate as end-to-end black boxes, diverging from clinicians' evidence-based, staged workflows ...

Yuexi Du, Jinglu Wang, Shujie Liu, Nicha C. Dvornek, Yan Lu

2603.01607 2026-03-02
TESTING

A Block Least Mean Square Method for Fiber Longitudinal Power Profile Monitoring

We propose a block least mean square (LMS) algorithm to monitor the longitudinal power profile of a fiber-optic link through receiver-based digital data from a coherent detector. Compared to the be...

Paolo Serena, Chiara Lasagni, Alberto Bononi, Fabien Boitier, Joana Girard-Jollet

2603.01604 2026-03-02
AI/LLM

MigMate: A VS Code Extension for LLM-based Library Migration of Python Projects

Modern software relies heavily on third-party software libraries to streamline the development process. The act of switching one library for a similar counterpart, called library migration, natural...

Matthias Kebede, May Mahmoud, Mohayeminul Islam, Sarah Nadi

2603.01596 2026-03-02
TESTING

IDProxy: Cold-Start CTR Prediction for Ads and Recommendation at Xiaohongshu with Multimodal LLMs

Click-through rate (CTR) models in advertising and recommendation systems rely heavily on item ID embeddings, which struggle in item cold-start settings. We present IDProxy, a solution that leverag...

Yubin Zhang, Haiming Xu, Guillaume Salha-Galvan, Ruiyan Han, Feiyang Xiao, Yanhua Huang, Li Lin, ...

2603.01590 2026-03-02
TESTING

DualSentinel: A Lightweight Framework for Detecting Targeted Attacks in Black-box LLM via Dual Entropy Lull Pattern

Recent intelligent systems integrate powerful Large Language Models (LLMs) through APIs, but their trustworthiness may be critically undermined by targeted attacks like backdoor and prompt injectio...

Xiaoyi Pang, Xuanyi Hao, Pengyu Liu, Qi Luo, Song Guo, Zhibo Wang

2603.01574 2026-03-02
TESTING

Testing Hooke-like isotropic hyper-/hypo-elastic material models under finite simple shear deformations

We test some Hooke-like isotropic hyper-/hypo-elastic material models under finite simple shear deformations (cf., Thiel et al. Int. J. Non-linear Mech. 112: 57–72, 2019) and show that (1) the com...

Sergey N. Korobeynikov, Alexey Yu. Larichkin, Patrizio Neff

2603.01551 2026-03-02
TESTING

Pharmacology Knowledge Graphs: Do We Need Chemical Structure for Drug Repurposing?

The contributions of model complexity, data volume, and feature modalities to knowledge graph-based drug repurposing remain poorly quantified under rigorous temporal validation. We constructed a ph...

Youssef Abo-Dahab, Ruby Hernandez, Ismael Caleb Arechiga Duran

2603.01537 2026-03-02
TESTING

Benchmarking Semantic Segmentation Models via Appearance and Geometry Attribute Editing

Semantic segmentation plays a pivotal role in various applications such as autonomous driving and medical image analysis. When deploying segmentation models in practice, it is critical to test their...

Zijin Yin, Bing Li, Kongming Liang, Hao Sun, Zhongjiang He, Zhanyu Ma, Jun Guo

2603.01535 2026-03-02
TESTING

RoboGPU: Accelerating GPU Collision Detection for Robotics

Autonomous robots are increasingly prevalent in our society, emerging in medical care, transportation vehicles, and home assistance. These robots rely on motion planning and collision detection to ...

Lufei Liu, Liwei Xue, Youssef Mohammed, Jocelyn Zhao, Yuan Hsi Chou, Tor M. Aamodt

2603.01517 2026-03-02
TESTING

Retrieval, Refinement, and Ranking for Text-to-Video Generation via Prompt Optimization and Test-Time Scaling

While large-scale datasets have driven significant progress in Text-to-Video (T2V) generative models, these models remain highly sensitive to input prompts, demonstrating that prompt design is crit...

Zillur Rahman, Alex Sheng, Cristian Meo

2603.01509 2026-03-02
TESTING

FATE: Closed-Loop Feasibility-Aware Task Generation with Active Repair for Physically Grounded Robotic Curricula

Recent breakthroughs in generative simulation have harnessed Large Language Models (LLMs) to generate diverse robotic task curricula, yet these open-loop paradigms frequently produce linguistically...

Bingchuan Wei, Bingqi Huang, Jingheng Ma, Zeyu Zhang, Sen Cui

2603.01505 2026-03-02
TESTING

Wild Bootstrap Inference for Non-Negative Matrix Factorization with Random Effects

Non-negative matrix factorization (NMF) is widely used for parts-based representations, yet formal inference for covariate effects is rarely available when the basis is learned under non-negativity...

Kenichi Satoh

2603.01468 2026-03-02
TESTING

Modified Teukolsky formalism: Null testing and numerical benchmarking

Next-generation gravitational-wave detectors will make black-hole ringdown an increasingly sensitive probe of small departures from General Relativity in the strong-field regime. This motivates obt...

Fawzi Aly, Mahmoud A. Mansour, Luis Lehner, Dejan Stojkovic, Dongjun Li, Pratik Wagle

2603.01456 2026-03-02
TESTING

VidDoS: Universal Denial-of-Service Attack on Video-based Large Language Models

Video-LLMs are increasingly deployed in safety-critical applications but are vulnerable to Energy-Latency Attacks (ELAs) that exhaust computational resources. Current image-centric methods fail bec...

Duoxun Tang, Dasen Dai, Jiyao Wang, Xiao Yang, Jianyu Wang, Siqi Cai

2603.01454 2026-03-02
TESTING

Autoregressive Synthesis of Sparse and Semi-Structured Mixed-Type Data

Synthetic data generation is a critical capability for data sharing, privacy compliance, system benchmarking and test data provisioning. Existing methods assume dense, fixed-schema tabular data, ye...

Thomas Rückstieß, Robin Vujanic

2603.01444 2026-03-02
TESTING

Quantifying Conversational Reliability of Large Language Models under Multi-Turn Interaction

Large Language Models (LLMs) are increasingly deployed in real-world applications where users engage in extended, mixed-topic conversations that depend on prior context. Yet, their reliability unde...

Jiyoon Myung

2603.01423 2026-03-02
TESTING

MIST-RL: Mutation-based Incremental Suite Testing via Reinforcement Learning

Large Language Models (LLMs) often fail to generate correct code on the first attempt, which requires using generated unit tests as verifiers to validate the solutions. Despite the success of recen...

Sicheng Zhu, Jiajun Wang, Jiawei Ai, Xin Li

2603.01409 2026-03-02
TESTING

Quasar: Quantized Self-Speculative Acceleration for Rapid Inference via Memory-Efficient Verification

Speculative Decoding (SD) has emerged as a premier technique for accelerating Large Language Model (LLM) inference by decoupling token generation into rapid drafting and parallel verification. Whil...

Guang Huang, Zeyi Wen

2603.01399 2026-03-02
TESTING

Continuous Exposure-Time Modeling for Realistic Atmospheric Turbulence Synthesis

Atmospheric turbulence significantly degrades long-range imaging by introducing geometric warping and exposure-time-dependent blur, which adversely affects both visual quality and the performance o...

Junwei Zeng, Dong Liang, Sheng-Jun Huang, Kun Zhan, Songcan Chen

2603.01398 2026-03-02
TESTING

Destruction of wall-bounded vortices using synthetic jet actuators

We experimentally explore the effectiveness of a rectangular orifice synthetic jet actuator for wall-bounded vortex destruction. Vortex flows near a boundary often present unforeseen or undesired f...

Frank A. Tricouros, Cameron Hoober, John C. Vaccaro, Tyler Van Buren

2603.01392 2026-03-02