Research

Papers

Research papers from arXiv and related sources

Total: 4694, AI/LLM: 2583, Testing: 2111
TESTING

Irradiation Studies of TGC Electronics Components for the ATLAS Experiment at High-Luminosity LHC

This paper evaluates the radiation tolerance of commercial off-the-shelf (COTS) electronics components for use in the Thin Gap Chamber (TGC) frontend electronics of the ATLAS experiment at the High...

Yuya Ohsumi, Daisuke Hashimoto, Yasuyuki Horii, Takumi Aoki, Haruka Asada, Kazumasa Hashizume, Ha...

2603.04129 2026-03-04
AI LLM

Crab$^{+}$: A Scalable and Unified Audio-Visual Scene Understanding Model with Explicit Cooperation

Developing Audio-Visual Large Language Models (AV-LLMs) for unified scene understanding is pivotal in multimodal intelligence. While instruction tuning enables pre-trained models with multi-task ab...

Dongnuan Cai, Henghui Du, Chang Zhou, Xi Chen, Dan Guo, Hongyuan Zhang, Xuelong Li, Di Hu

2603.04128 2026-03-04
AI LLM

BeamPERL: Parameter-Efficient RL with Verifiable Rewards Specializes Compact LLMs for Structured Beam Mechanics Reasoning

Can reinforcement learning with hard, verifiable rewards teach a compact language model to reason about physics, or does it primarily learn to pattern-match toward correct answers? We study this qu...

Tarjei Paule Hage, Markus J. Buehler

2603.04124 2026-03-04
AI LLM

FINEST: Improving LLM Responses to Sensitive Topics Through Fine-Grained Evaluation

Large Language Models (LLMs) often generate overly cautious and vague responses on sensitive topics, sacrificing helpfulness for safety. Existing evaluation frameworks lack systematic methods to id...

Juhyun Oh, Nayeon Lee, Chani Jung, Jiho Jin, Junho Myung, Jongwon Lee, Taeui Song, Alice Oh

2603.04123 2026-03-04
TESTING

When to restart? Exploring escalating restarts on convergence

Learning rate scheduling plays a critical role in the optimization of deep neural networks, directly influencing convergence speed, stability, and generalization. While existing schedulers such as ...

Ayush K. Varshney, Šarūnas Girdzijauskas, Konstantinos Vandikas, Aneta Vulgarakis Feljan

2603.04117 2026-03-04
AI LLM

Understanding Sources of Demographic Predictability in Brain MRI via Disentangling Anatomy and Contrast

Demographic attributes such as age, sex, and race can be predicted from medical images, raising concerns about bias in clinical AI systems. In brain MRI, this signal may arise from anatomical varia...

Mehmet Yigit Avci, Akshit Achara, Andrew King, Jorge Cardoso

2603.04113 2026-03-04
TESTING

Testing Full Mediation of Treatment Effects and the Identifiability of Causal Mechanisms

In causal analysis, understanding the causal mechanisms through which an intervention or treatment affects an outcome is often of central interest. We propose a test to evaluate (i) whether the cau...

Martin Huber, Kevin Kloiber, Lukáš Lafférs

2603.04109 2026-03-04
TESTING

A Random Rule Model

We propose a Random Rule Model (RRM) in which behavior is generated by switching among a small library of transparent, parameter-free decision rules. A differentiable gate learns environment-depend...

Avner Seror

2603.04105 2026-03-04
TESTING

Characterizing Machine Learning Force Fields as Emerging Molecular Dynamics Workloads on Graphics Processing Units

Molecular dynamics (MD) simulates the time evolution of atomic systems governed by interatomic forces, and the fidelity of these simulations depends critically on the underlying force model. Classi...

Udari De Alwis, Benjamin E. Mayer, Tom J. Ashby, Maria Barrera, Timon Evenblij, Joyjit Kundu

2603.04092 2026-03-04
TESTING

The Steiner Tree Problem: Novel QUBO Formulation and Quantum Annealing Implementation

The Steiner Tree Problem (STP) is a well-known NP-hard combinatorial optimization problem, which has wide applications in network design, integrated circuit layout, bioinformatics, and other fields...

Dan Li, Xiang-Hui Wu, Ji-Rong Liu

2603.04089 2026-03-04
AI LLM

Hindsight Quality Prediction Experiments in Multi-Candidate Human-Post-Edited Machine Translation

This paper investigates two complementary paradigms for predicting machine translation (MT) quality: source-side difficulty prediction and candidate-side quality estimation (QE). The rapid adoption...

Malik Marmonier, Benoît Sagot, Rachel Bawden

2603.04083 2026-03-04
TESTING

Features of Spacetime-Symmetry Breaking and the Standard-Model Extension in Riemann-Cartan Geometry

For over two decades, the gravity sector of the Standard-Model Extension (SME) has served as a phenomenological framework for testing spacetime symmetry breaking in the presence of gravity. During ...

Robert Bluhm

2603.04079 2026-03-04
TESTING

Modified-gradient methods for exact divergence-free in meshless magnetohydrodynamics

We present a novel gradient regularization to completely eliminate the magnetic divergence error in meshless magnetohydrodynamics (MHD), which offers a high spatial resolution and conservative adva...

Xiongbiao Tu, Qiao Wang, Liang Gao, Yifa Tang

2603.04077 2026-03-04
TESTING

Dose-Dependent Cardiac Complexity Changes in Children Following Prenatal Glucocorticoid Exposure: Complementary Evidence from Multiscale Entropy Analysis and ECG Foundation Models

Background: Prenatal glucocorticoid exposure alters cardiac development, but whether persistent cardiac effects in childhood follow a dose-response relationship remains unknown. We...

Nicolas B. Garnier, Michelle Dreiling, Valeska Kozik, Matthias Schwab, Florian Rakers, Martin G F...

2603.04074 2026-03-04
TESTING

SaFeR: Safety-Critical Scenario Generation for Autonomous Driving Test via Feasibility-Constrained Token Resampling

Safety-critical scenario generation is crucial for evaluating autonomous driving systems. However, existing approaches often struggle to balance three conflicting objectives: adversarial criticalit...

Jinlong Cui, Fenghua Liang, Guo Yang, Chengcheng Tang, Jianxun Cui

2603.04071 2026-03-04
AI LLM

Monitoring Emergent Reward Hacking During Generation via Internal Activations

Fine-tuned large language models can exhibit reward-hacking behavior arising from emergent misalignment, which is difficult to detect from final outputs alone. While prior work has studied reward h...

Patrick Wilhelm, Thorsten Wittkopp, Odej Kao

2603.04069 2026-03-04
TESTING

Fermi-Dirac thermal measurements: A framework for quantum hypothesis testing and semidefinite optimization

Quantum measurements are the means by which we recover messages encoded into quantum states. They are at the forefront of quantum hypothesis testing, wherein the goal is to perform an optimal measu...

Nana Liu, Mark M. Wilde

2603.04061 2026-03-04
TESTING

Morphologies for DECaLS Galaxies through a combination of non-parametric indices and machine learning methods: A comprehensive catalog using the Galaxy Morphology Extractor (galmex) code

Galaxy morphology encodes key information about formation and evolution. Large imaging surveys require automated, reproducible methods beyond visual inspection. Non-parametric indices provide an u...

V. M. Sampaio, Y. Jaffé, C. Lima-Dias, S. Véliz Astudillo, M. Martínez-Marín, H. Méndez-Hernández...

2603.04040 2026-03-04
AI LLM

The Empty Quadrant: AI Teammates for Embodied Field Learning

For four decades, AIED research has rested on what we term the Sedentary Assumption: the unexamined design commitment to a stationary learner seated before a screen. Mobile learning and museum guid...

Hyein Kim, Sung Park

2603.04034 2026-03-04
AI LLM

Who Judges the Judge? Evaluating LLM-as-a-Judge for French Medical open-ended QA

Automatic evaluation of medical open-ended question answering (OEQA) remains challenging due to the need for expert annotations. We evaluate whether large language models (LLMs) can act as judges o...

Ikram Belmadani, Oumaima El Khettari, Pacôme Constant dit Beaufils, Richard Dufour, Benoit Favre

2603.04033 2026-03-04