Research

Papers

Research papers from arXiv and related sources

Total: 4694, AI/LLM: 2583, Testing: 2111
TESTING

Ensuring Safety in Automated Mechanical Ventilation through Offline Reinforcement Learning and Digital Twin Verification

Mechanical ventilation (MV) is a life-saving intervention for patients with acute respiratory failure (ARF) in the ICU. However, inappropriate ventilator settings could cause ventilator-induced lun...

Hang Yu, Huidong Liu, Qingchen Zhang, William Joy, Kateryna Nikulina, Andreas A. Schuppert, Sina ...

2603.11372 2026-03-11
TESTING

Relaxed Efficient Acquisition of Context and Temporal Features

In many biomedical applications, measurements are not freely available at inference time: each laboratory test, imaging modality, or assessment incurs financial cost, time burden, or patient risk. ...

Yunni Qu, Dzung Dinh, Grant King, Whitney Ringwald, Bing Cai Kok, Kathleen Gates, Aiden Wright, J...

2603.11370 2026-03-11
TESTING

abx_amr_simulator: A simulation environment for antibiotic prescribing policy optimization under antimicrobial resistance

Antimicrobial resistance (AMR) poses a global health threat, reducing the effectiveness of antibiotics and complicating clinical decision-making. To address this challenge, we introduce abx_amr_sim...

Joyce Lee, Seth Blumberg

2603.11369 2026-03-11
TESTING

Fair-Gate: Fairness-Aware Interpretable Risk Gating for Sex-Fair Voice Biometrics

Voice biometric systems can exhibit sex-related performance gaps even when overall verification accuracy is strong. We attribute these gaps to two practical mechanisms: (i) demographic shortcut lea...

Yangyang Qu, Todisco Massimiliano, Galdi Chiara, Evans Nicholas

2603.11360 2026-03-11
TESTING

Teleodynamic Learning a new Paradigm For Interpretable AI

We introduce Teleodynamic Learning, a new paradigm for machine learning in which learning is not the minimization of a fixed objective, but the emergence and stabilization of functional organizatio...

Enrique ter Horst, Juan Diego Zambrano

2603.11355 2026-03-11
TESTING

Human Navigation Behaviour and Brain Dynamics in Real-world Contexts

The study of navigation behaviour and the associated brain dynamics has been a focus of increasing research over the last decades. Coinciding with this has been an increased focus on a more ecologica...

Pablo Fernandez Velasco, Antoine Coutrot, Hugo J. Spiers

2603.11347 2026-03-11
TESTING

Hybrid eTFCE-GRF: Exact Cluster-Size Retrieval with Analytical p-Values for Voxel-Based Morphometry

Threshold-free cluster enhancement (TFCE) integrates cluster extent across thresholds to improve voxel-wise neuroimaging inference, but permutation testing makes it prohibitively slow for large dat...

Don Yin, Hao Chen, Takeshi Miki, Boxing Liu, Enyu Yang

2603.11344 2026-03-11
TESTING

FinRule-Bench: A Benchmark for Joint Reasoning over Financial Tables and Principles

Large language models (LLMs) are increasingly applied to financial analysis, yet their ability to audit structured financial statements under explicit accounting principles remains poorly explored....

Arun Vignesh Malarkkan, Manan Roy Choudhury, Guangwei Zhang, Vivek Gupta, Qingyun Wang, Yanjie Fu...

2603.11339 2026-03-11
TESTING

RewardHackingAgents: Benchmarking Evaluation Integrity for LLM ML-Engineering Agents

LLM agents increasingly perform end-to-end ML engineering tasks where success is judged by a single scalar test metric. This creates a structural vulnerability: an agent can increase the reported s...

Yonas Atinafu, Robin Cohen

2603.11337 2026-03-11
TESTING

Proto-NUX: A prototype telescope for ground-based near-ultraviolet observations

The Near-UV-eXplorer (NUX) is a proposed ground-based, wide-field telescope array with a field of view of $\sim$70 square degrees, designed to operate over the 300-350 nm wavelength range and to ac...

Rasjied Sloot, Rudy Wijnands, Steven Bloemen, Rik ter Horst, Hans Ellermeijer, Alexander Hoogerbrug

2603.11336 2026-03-11
AI LLM

COMIC: Agentic Sketch Comedy Generation

We propose a fully automated AI system that produces short comedic videos similar to sketch shows such as Saturday Night Live. Starting with character references, the system employs a population of...

Susung Hong, Brian Curless, Ira Kemelmacher-Shlizerman, Steve Seitz

2603.11048 2026-03-11
TESTING

On Utility Maximization under Multivariate Fake Stationary Affine Volterra Models

This paper is concerned with Merton's portfolio optimization problem in a Volterra stochastic environment described by a multivariate fake stationary Volterra--Heston model. Due to the non-Markovia...

Emmanuel Gnabeyeu

2603.11046 2026-03-11
TESTING

V2M-Zero: Zero-Pair Time-Aligned Video-to-Music Generation

Generating music that temporally aligns with video events is challenging for existing text-to-music models, which lack fine-grained temporal control. We introduce V2M-Zero, a zero-pair video-to-mus...

Yan-Bo Lin, Jonah Casebeer, Long Mai, Aniruddha Mahapatra, Gedas Bertasius, Nicholas J. Bryan

2603.11042 2026-03-11
AI LLM

Chasing RATs: Tracing Reading for and as Creative Activity

Creativity research has privileged making over the interpretive labor that precedes and shapes it. We introduce Reading Activity Traces (RATs), a proposal that treats reading -- broadly defined to ...

Sophia Liu, Shm Garanganao Almeda

2603.11031 2026-03-11
AI LLM

Beyond the Illusion of Consensus: From Surface Heuristics to Knowledge-Grounded Evaluation in LLM-as-a-Judge

The paradigm of LLM-as-a-judge relies on a critical assumption, namely that high inter-evaluator agreement indicates reliable and objective evaluation. We present two complementary findings that ch...

Mingyang Song, Mao Zheng, Chenning Xu

2603.11027 2026-03-11
AI LLM

LLMGreenRec: LLM-Based Multi-Agent Recommender System for Sustainable E-Commerce

Rising environmental awareness in e-commerce necessitates recommender systems that not only guide users to sustainable products but also minimize their own digital carbon footprints. Traditional se...

Hao N. Nguyen, Hieu M. Nguyen, Son Van Nguyen, Nguyen Thi Hanh

2603.11025 2026-03-11
AI LLM

Does AI See like Art Historians? Interpreting How Vision Language Models Recognize Artistic Style

VLMs have become increasingly proficient at a range of computer vision tasks, such as visual question answering and object detection. This includes increasingly strong capabilities in the domain of...

Marvin Limpijankit, Milad Alshomary, Yassin Oulad Daoud, Amith Ananthram, Tim Trombley, Elias Ste...

2603.11024 2026-03-11
AI LLM

Leech Lattice Vector Quantization for Efficient LLM Compression

Scalar quantization of large language models (LLMs) is fundamentally limited by information-theoretic bounds. While vector quantization (VQ) overcomes these limits by encoding blocks of parameters ...

Tycho F. A. van der Ouderaa, Mart van Baalen, Paul Whatmough, Markus Nagel

2603.11021 2026-03-11
TESTING

Gravitational Anomaly Measurement in Wide Binaries is Sensitive to Orbital Modeling

Recent work by Chae et al. (2026) reported a gravitational anomaly in 36 wide-binary pairs, finding a gravity boost factor of $\gamma \equiv G_{\rm eff}/G_{\rm N} \approx 1.60_{-0.14}^{+0.17}$ at low acc...

Serat M. Saad, Yuan-Sen Ting

2603.11015 2026-03-11
AI LLM

Task-Aware Delegation Cues for LLM Agents

LLM agents increasingly present as conversational collaborators, yet human--agent teamwork remains brittle due to information asymmetry: users lack task-specific reliability cues, and agents rarely...

Xingrui Gu

2603.11011 2026-03-11