Papers
Research papers from arXiv and related sources
Unifying On- and Off-Policy Variance Reduction Methods
Continuous and efficient experimentation is key to the practical success of user-facing applications on the web, both through online A/B-tests and off-policy evaluation. Despite their shared object...
Olivier Jeunen
M$^3$-ACE: Rectifying Visual Perception in Multimodal Math Reasoning via Multi-Agentic Context Engineering
Multimodal large language models have recently shown promising progress in visual mathematical reasoning. However, their performance is often limited by a critical yet underexplored bottleneck: ina...
Peijin Xie, Zhen Xu, Bingquan Liu, Baoxun Wang
Torque Hyperuniformity in Frictional Granular Matter - Theory and Experiments
A question of some fundamental importance is whether a given assembly of frictional granules (say sand or powder) will exhibit stress autocorrelations with long-range anisotropic decay as determine...
Jin Shang, Jie Zhang, Itamar Procaccia
Local-Global Prompt Learning via Sparse Optimal Transport
Few-shot adaptation of vision-language models (VLMs) like CLIP typically relies on learning textual prompts matched to global image embeddings. Recent works extend this paradigm by incorporating lo...
Deniz Kizaroğlu, Ülku Tuncer Küçüktas, Emre Çakmakyurdu, Alptekin Temizel
Amortized Phylodynamic Inference with Neural Bayes Estimators and Recursive Neural Networks
Phylodynamics is used to estimate epidemic dynamics from phylogenetic trees or genomic sequences of pathogens, but the likelihood calculations needed can be challenging for complex models. We prese...
Alexander E. Zarebski, Thomas Williams, Louis du Plessis
SPD-RAG: Sub-Agent Per Document Retrieval-Augmented Generation
Answering complex, real-world queries often requires synthesizing facts scattered across vast document corpora. In these settings, standard retrieval-augmented generation (RAG) pipelines suffer fro...
Yagiz Can Akay, Muhammed Yusuf Kartal, Esra Alparslan, Faruk Ortakoyluoglu, Arda Akpinar
Beyond Attention Heatmaps: How to Get Better Explanations for Multiple Instance Learning Models in Histopathology
Multiple instance learning (MIL) has enabled substantial progress in computational histopathology, where a large number of patches from gigapixel whole slide images are aggregated into slide-level ...
Mina Jamshidi Idaji, Julius Hense, Tom Neuhäuser, Augustin Krause, Yanqing Luo, Oliver Eberle, Th...
Agentic Neurosymbolic Collaboration for Mathematical Discovery: A Case Study in Combinatorial Design
We study mathematical discovery through the lens of neurosymbolic reasoning, where an AI agent powered by a large language model (LLM) is coupled with symbolic computation tools, and human strategic ...
Hai Xia, Carla P. Gomes, Bart Selman, Stefan Szeider
CORE-Acu: Structured Reasoning Traces and Knowledge Graph Safety Verification for Acupuncture Clinical Decision Support
Large language models (LLMs) show significant potential for clinical decision support (CDS), yet their black-box nature -- characterized by untraceable reasoning and probabilistic hallucinations --...
Liuyi Xu, Yun Guo, Ming Chen, Zihan Dun, Yining Qian, An-Yang Lu, Shuang Li, Lijun Liu
Human-AI Divergence in Ego-centric Action Recognition under Spatial and Spatiotemporal Manipulations
Humans consistently outperform state-of-the-art AI models in action recognition, particularly in challenging real-world conditions involving low resolution, occlusion, and visual clutter. Understan...
Sadegh Rahmaniboldaji, Filip Rybansky, Quoc C. Vuong, Anya C. Hurlbert, Frank Guerin, Andrew Gilbert
Concept-Guided Fine-Tuning: Steering ViTs away from Spurious Correlations to Improve Robustness
Vision Transformers (ViTs) often degrade under distribution shifts because they rely on spurious correlations, such as background cues, rather than semantically meaningful features. Existing regula...
Yehonatan Elisha, Oren Barkan, Noam Koenigstein
Weighted Chernoff information and optimal loss exponent in context-sensitive hypothesis testing
We consider context-sensitive (binary) hypothesis testing for i.i.d. observations under a multiplicative weight function. We establish the logarithmic asymptotic, as the sample size grows, of the o...
Mark Kelbert, El'mira Yu. Kalimulina
Nonminimal Lorentz Violation in Atomic and Molecular Spectroscopy Experiments
This presentation discusses potential signals of Lorentz violation that could be observed in atomic and molecular spectroscopy experiments. It provides a general overview of the nonrelativistic eff...
Arnaldo J. Vargas
Novel Semantic Prompting for Zero-Shot Action Recognition
Zero-shot action recognition relies on transferring knowledge from vision-language models to unseen actions using semantic descriptions. While recent methods focus on temporal modeling or architect...
Salman Iqbal, Waheed Rehman
A Blockchain-based Traceability System for AI-Driven Engine Blade Inspection
Aircraft engine blade maintenance relies on inspection records shared across manufacturers, airlines, maintenance organizations, and regulators. Yet current systems are fragmented, difficult to aud...
Mahmoud Hafez, Eman Ouda, Mohammed A. Mohammed Eltoum, Khaled Salah, Yusra Abdulrahman
LAMUS: A Large-Scale Corpus for Legal Argument Mining from U.S. Caselaw using LLMs
Legal argument mining aims to identify and classify the functional components of judicial reasoning, such as facts, issues, rules, analysis, and conclusions. Progress in this area is limited by the...
Serene Wang, Lavanya Pobbathi, Haihua Chen
An objective non-local prior for skew-symmetric models
We propose an objective non-local prior for testing symmetry against skew-symmetric alternatives. The prior is derived through a formal construction rule by assigning a uniform distribution to a di...
F. J. Rubio
Evaluating LLM-Based Grant Proposal Review via Structured Perturbations
As AI-assisted grant proposals outpace manual review capacity in a kind of "Malthusian trap" for the research ecosystem, this paper investigates the capabilities and limitations of LLM-based gran...
William Thorne, Joseph James, Yang Wang, Chenghua Lin, Diana Maynard
AdaCultureSafe: Adaptive Cultural Safety Grounded by Cultural Knowledge in Large Language Models
With the widespread adoption of Large Language Models (LLMs), respecting indigenous cultures becomes essential for models' cultural safety and responsible global applications. Existing studies se...
Hankun Kang, Di Lin, Zhirong Liao, Pengfei Bai, Xinyi Zeng, Jiawei Jiang, Yuanyuan Zhu, Tieyun Qian
How Much Do LLMs Hallucinate in Document Q&A Scenarios? A 172-Billion-Token Study Across Temperatures, Context Lengths, and Hardware Platforms
How much do large language models actually hallucinate when answering questions grounded in provided documents? Despite the critical importance of this question for enterprise AI deployments, relia...
JV Roig