Research

Papers

Research papers from arXiv and related sources

Total: 4694, AI/LLM: 2583, Testing: 2111
TESTING

Unifying On- and Off-Policy Variance Reduction Methods

Continuous and efficient experimentation is key to the practical success of user-facing applications on the web, both through online A/B-tests and off-policy evaluation. Despite their shared object...

Olivier Jeunen

2603.08370 2026-03-09
AI LLM

M$^3$-ACE: Rectifying Visual Perception in Multimodal Math Reasoning via Multi-Agentic Context Engineering

Multimodal large language models have recently shown promising progress in visual mathematical reasoning. However, their performance is often limited by a critical yet underexplored bottleneck: ina...

Peijin Xie, Zhen Xu, Bingquan Liu, Baoxun Wang

2603.08369 2026-03-09
TESTING

Torque Hyperuniformity in Frictional Granular Matter - Theory and Experiments

A question of some fundamental importance is whether a given assembly of frictional granules (say sand or powder) will exhibit stress autocorrelations with long-range anisotropic decay as determine...

Jin Shang, Jie Zhang, Itamar Procaccia

2603.08363 2026-03-09
AI LLM

Local-Global Prompt Learning via Sparse Optimal Transport

Few-shot adaptation of vision-language models (VLMs) like CLIP typically relies on learning textual prompts matched to global image embeddings. Recent works extend this paradigm by incorporating lo...

Deniz Kizaroğlu, Ülku Tuncer Küçüktas, Emre Çakmakyurdu, Alptekin Temizel

2603.08347 2026-03-09
TESTING

Amortized Phylodynamic Inference with Neural Bayes Estimators and Recursive Neural Networks

Phylodynamics is used to estimate epidemic dynamics from phylogenetic trees or genomic sequences of pathogens, but the likelihood calculations needed can be challenging for complex models. We prese...

Alexander E. Zarebski, Thomas Williams, Louis du Plessis

2603.08345 2026-03-09
AI LLM

SPD-RAG: Sub-Agent Per Document Retrieval-Augmented Generation

Answering complex, real-world queries often requires synthesizing facts scattered across vast document corpora. In these settings, standard retrieval-augmented generation (RAG) pipelines suffer fro...

Yagiz Can Akay, Muhammed Yusuf Kartal, Esra Alparslan, Faruk Ortakoyluoglu, Arda Akpinar

2603.08329 2026-03-09
AI LLM

Beyond Attention Heatmaps: How to Get Better Explanations for Multiple Instance Learning Models in Histopathology

Multiple instance learning (MIL) has enabled substantial progress in computational histopathology, where a large number of patches from gigapixel whole slide images are aggregated into slide-level ...

Mina Jamshidi Idaji, Julius Hense, Tom Neuhäuser, Augustin Krause, Yanqing Luo, Oliver Eberle, Th...

2603.08328 2026-03-09
AI LLM

Agentic Neurosymbolic Collaboration for Mathematical Discovery: A Case Study in Combinatorial Design

We study mathematical discovery through the lens of neurosymbolic reasoning, where an AI agent powered by a large language model (LLM), coupled with symbolic computation tools, and human strategic ...

Hai Xia, Carla P. Gomes, Bart Selman, Stefan Szeider

2603.08322 2026-03-09
AI LLM

CORE-Acu: Structured Reasoning Traces and Knowledge Graph Safety Verification for Acupuncture Clinical Decision Support

Large language models (LLMs) show significant potential for clinical decision support (CDS), yet their black-box nature -- characterized by untraceable reasoning and probabilistic hallucinations --...

Liuyi Xu, Yun Guo, Ming Chen, Zihan Dun, Yining Qian, An-Yang Lu, Shuang Li, Lijun Liu

2603.08321 2026-03-09
AI LLM

Human-AI Divergence in Ego-centric Action Recognition under Spatial and Spatiotemporal Manipulations

Humans consistently outperform state-of-the-art AI models in action recognition, particularly in challenging real-world conditions involving low resolution, occlusion, and visual clutter. Understan...

Sadegh Rahmaniboldaji, Filip Rybansky, Quoc C. Vuong, Anya C. Hurlbert, Frank Guerin, Andrew Gilbert

2603.08317 2026-03-09
AI LLM

Concept-Guided Fine-Tuning: Steering ViTs away from Spurious Correlations to Improve Robustness

Vision Transformers (ViTs) often degrade under distribution shifts because they rely on spurious correlations, such as background cues, rather than semantically meaningful features. Existing regula...

Yehonatan Elisha, Oren Barkan, Noam Koenigstein

2603.08309 2026-03-09
TESTING

Weighted Chernoff information and optimal loss exponent in context-sensitive hypothesis testing

We consider context-sensitive (binary) hypothesis testing for i.i.d. observations under a multiplicative weight function. We establish the logarithmic asymptotic, as the sample size grows, of the o...

Mark Kelbert, El'mira Yu. Kalimulina

2603.08308 2026-03-09
TESTING

Nonminimal Lorentz Violation in Atomic and Molecular Spectroscopy Experiments

This presentation discusses potential signals of Lorentz violation that could be observed in atomic and molecular spectroscopy experiments. It provides a general overview of the nonrelativistic eff...

Arnaldo J. Vargas

2603.08298 2026-03-09
AI LLM

Novel Semantic Prompting for Zero-Shot Action Recognition

Zero-shot action recognition relies on transferring knowledge from vision-language models to unseen actions using semantic descriptions. While recent methods focus on temporal modeling or architect...

Salman Iqbal, Waheed Rehman

2603.08289 2026-03-09
AI LLM

A Blockchain-based Traceability System for AI-Driven Engine Blade Inspection

Aircraft engine blade maintenance relies on inspection records shared across manufacturers, airlines, maintenance organizations, and regulators. Yet current systems are fragmented, difficult to aud...

Mahmoud Hafez, Eman Ouda, Mohammed A. Mohammed Eltoum, Khaled Salah, Yusra Abdulrahman

2603.08288 2026-03-09
AI LLM

LAMUS: A Large-Scale Corpus for Legal Argument Mining from U.S. Caselaw using LLMs

Legal argument mining aims to identify and classify the functional components of judicial reasoning, such as facts, issues, rules, analysis, and conclusions. Progress in this area is limited by the...

Serene Wang, Lavanya Pobbathi, Haihua Chen

2603.08286 2026-03-09
TESTING

An objective non-local prior for skew-symmetric models

We propose an objective non-local prior for testing symmetry against skew-symmetric alternatives. The prior is derived through a formal construction rule by assigning a uniform distribution to a di...

F. J. Rubio

2603.08285 2026-03-09
AI LLM

Evaluating LLM-Based Grant Proposal Review via Structured Perturbations

As AI-assisted grant proposals outpace manual review capacity in a kind of "Malthusian trap" for the research ecosystem, this paper investigates the capabilities and limitations of LLM-based gran...

William Thorne, Joseph James, Yang Wang, Chenghua Lin, Diana Maynard

2603.08281 2026-03-09
AI LLM

AdaCultureSafe: Adaptive Cultural Safety Grounded by Cultural Knowledge in Large Language Models

With the widespread adoption of Large Language Models (LLMs), respecting indigenous cultures becomes essential for models' cultural safety and responsible global applications. Existing studies se...

Hankun Kang, Di Lin, Zhirong Liao, Pengfei Bai, Xinyi Zeng, Jiawei Jiang, Yuanyuan Zhu, Tieyun Qian

2603.08275 2026-03-09
AI LLM

How Much Do LLMs Hallucinate in Document Q&A Scenarios? A 172-Billion-Token Study Across Temperatures, Context Lengths, and Hardware Platforms

How much do large language models actually hallucinate when answering questions grounded in provided documents? Despite the critical importance of this question for enterprise AI deployments, relia...

JV Roig

2603.08274 2026-03-09