Papers
Research papers from arXiv and related sources
Momentum Measurement of Charged Particles in FASER's Emulsion Detector at the LHC
We present a momentum measurement method based on multiple Coulomb scattering (MCS) in the FASER$ν$ emulsion detector. The measurement of charged-particle momenta is essential for studying neutrino...
FASER Collaboration, Roshan Mammen Abraham, Xiaocong Ai, Saul Alonso Monsalve, John Anders, Emma...
A Hybrid Federated Learning Based Ensemble Approach for Lung Disease Diagnosis Leveraging Fusion of SWIN Transformer and CNN
The significant advancements in computational power create a vast opportunity for using Artificial Intelligence in different applications of healthcare and medical science. A Hybrid FL-Enabled ...
Asif Hasan Chowdhury, Md. Fahim Islam, M Ragib Anjum Riad, Faiyaz Bin Hashem, Md Tanzim Reza, Md....
ODESteer: A Unified ODE-Based Steering Framework for LLM Alignment
Activation steering, or representation engineering, offers a lightweight approach to align large language models (LLMs) by manipulating their internal activations at inference time. However, curren...
Hongjue Zhao, Haosen Sun, Jiangtao Kong, Xiaochang Li, Qineng Wang, Liwei Jiang, Qi Zhu, Tarek Ab...
A Theoretical Framework for Modular Learning of Robust Generative Models
Training large-scale generative models is resource-intensive and relies heavily on heuristic dataset weighting. We address two fundamental questions: Can we train Large Language Models (LLMs) modul...
Corinna Cortes, Mehryar Mohri, Yutao Zhong
MASPO: Unifying Gradient Utilization, Probability Mass, and Signal Reliability for Robust and Sample-Efficient LLM Reasoning
Existing Reinforcement Learning with Verifiable Rewards (RLVR) algorithms, such as GRPO, rely on rigid, uniform, and symmetric trust region mechanisms that are fundamentally misaligned with the com...
Xiaoliang Fu, Jiaye Lin, Yangyi Fang, Binbin Zheng, Chaowen Hu, Zekai Shao, Cong Qin, Lu Pan, Ke ...
KLong: Training LLM Agent for Extremely Long-horizon Tasks
This paper introduces KLong, an open-source LLM agent trained to solve extremely long-horizon tasks. The principle is to first cold-start the model via trajectory-splitting SFT, then scale it via p...
Yue Liu, Zhiyuan Hu, Flood Sung, Jiaheng Zhang, Bryan Hooi
Evaluating Chain-of-Thought Reasoning through Reusability and Verifiability
In multi-agent IR pipelines for tasks such as search and ranking, LLM-based agents exchange intermediate reasoning in terms of Chain-of-Thought (CoT) with each other. Current CoT evaluation narrowl...
Shashank Aggarwal, Ram Vikas Mishra, Amit Awekar
Using LLMs for Knowledge Component-level Correctness Labeling in Open-ended Coding Problems
Fine-grained skill representations, commonly referred to as knowledge components (KCs), are fundamental to many approaches in student modeling and learning analytics. However, KC-level correctness ...
Zhangqi Duan, Arnav Kankaria, Dhruv Kartik, Andrew Lan
Toward a Fully Autonomous, AI-Native Particle Accelerator
This position paper presents a vision for self-driving particle accelerators that operate autonomously with minimal human intervention. We propose that future facilities be designed through artific...
Chris Tennant
LATA: Laplacian-Assisted Transductive Adaptation for Conformal Uncertainty in Medical VLMs
Medical vision-language models (VLMs) are strong zero-shot recognizers for medical imaging, but their reliability under domain shift hinges on calibrated uncertainty with guarantees. Split conforma...
Behzad Bozorgtabar, Dwarikanath Mahapatra, Sudipta Roy, Muzammal Naseer, Imran Razzak, Zongyuan Ge
Systematic Evaluation of Single-Cell Foundation Model Interpretability Reveals Attention Captures Co-Expression Rather Than Unique Regulatory Signal
We present a systematic evaluation framework - thirty-seven analyses, 153 statistical tests, four cell types, two perturbation modalities - for assessing mechanistic interpretability in single-cell...
Ihor Kendiukhov
Provably Explaining Neural Additive Models
Despite significant progress in post-hoc explanation methods for neural networks, many remain heuristic and lack provable guarantees. A key approach for obtaining explanations with provable guarant...
Shahaf Bassan, Yizhak Yisrael Elboher, Tobias Ladner, Volkan Şahin, Jan Kretinsky, Matthias Altho...
Enhancing Large Language Models (LLMs) for Telecom using Dynamic Knowledge Graphs and Explainable Retrieval-Augmented Generation
Large language models (LLMs) have shown strong potential across a variety of tasks, but their application in the telecom field remains challenging due to domain complexity, evolving standards, and ...
Dun Yuan, Hao Zhou, Xue Liu, Hao Chen, Yan Xin, Jianzhong Zhang
The Anxiety of Influence: Bloom Filters in Transformer Attention Heads
Some transformer attention heads appear to function as membership testers, dedicating themselves to answering the question "has this token appeared before in the context?" We identify these heads a...
Peter Balogh
Inspiral tests of general relativity and waveform geometry
The phase evolution of gravitational waves encodes critical information about the orbital dynamics of binary systems. In this work, we test the robustness of parameterized tests against unmodeled d...
Brian C. Seymour, Jacob Golomb, Yanbei Chen
When Models Ignore Definitions: Measuring Semantic Override Hallucinations in LLM Reasoning
Large language models (LLMs) demonstrate strong performance on standard digital logic and Boolean reasoning tasks, yet their reliability under locally redefined semantics remains poorly understood....
Yogeswar Reddy Thota, Setareh Rafatirad, Homayoun Houman, Tooraj Nikoubin
Dodging the Moose: Experimental Insights in Real-Life Automated Collision Avoidance
The sudden appearance of a static obstacle on the road, i.e. the moose test, is a well-known emergency scenario in collision avoidance for automated driving. Model Predictive Control (MPC) has long...
Leila Gharavi, Simone Baldi, Yuki Hosomi, Tona Sato, Bart De Schutter, Binh-Minh Nguyen, Hiroshi ...
Pareto Optimal Benchmarking of AI Models on ARM Cortex Processors for Sustainable Embedded Systems
This work presents a practical benchmarking framework for optimizing artificial intelligence (AI) models on ARM Cortex processors (M0+, M4, M7), focusing on energy efficiency, accuracy, and resourc...
Pranay Jain, Maximilian Kasper, Göran Köber, Axel Plinge, Dominik Seuß
Proximal powered knee placement: a case study
Lower limb amputation affects millions worldwide, leading to impaired mobility, reduced walking speed, and limited participation in daily and social activities. Powered prosthetic knees can partial...
Kyle R. Embry, Lorenzo Vianello, Jim Lipsey, Frank Ursetta, Michael Stephens, Zhi Wang, Ann M. Si...
Retrospective In-Context Learning for Temporal Credit Assignment with Large Language Models
Learning from self-sampled data and sparse environmental feedback remains a fundamental challenge in training self-evolving agents. Temporal credit assignment mitigates this issue by transforming s...
Wen-Tse Chen, Jiayu Chen, Fahim Tajwar, Hao Zhu, Xintong Duan, Ruslan Salakhutdinov, Jeff Schneider