Papers
Research papers from arXiv and related sources
In-Context Symbolic Regression for Robustness-Improved Kolmogorov-Arnold Networks
Symbolic regression aims to replace black-box predictors with concise analytical expressions that can be inspected and validated in scientific machine learning. Kolmogorov-Arnold Networks (KANs) ar...
Francesco Sovrano, Lidia Losavio, Giulia Vilone, Marc Langheinrich
Mechanistic Foundations of Goal-Directed Control
Mechanistic interpretability has transformed the analysis of transformer circuits by decomposing model behavior into competing algorithms, identifying phase transitions during training, and derivin...
Alma Lago
Investigating How Neighbourhood Scores Reflect Forecast Error
Meaningful scores for forecast verification are essential for developing reliable forecasts, and there has been much effort to develop scores that align well with human perceptions of forecast qual...
Bobby Antonio
Practicing with Language Models Cultivates Human Empathic Communication
Empathy is central to human connection, yet people often struggle to express it effectively. In blinded evaluations, large language models (LLMs) generate responses that are often judged more empat...
Aakriti Kumar, Nalin Poungpeth, Diyi Yang, Bruce Lambert, Matthew Groh
A proof-of-concept for automated AI-driven stellarator coil optimization with in-the-loop finite-element calculations
Finding feasible coils for stellarator fusion devices is a critical challenge of realizing this concept for future power plants. Years of research work can be put into the design of even a single r...
Alan A. Kaptanoglu, Pedro F. Gil
Molecular gas and star formation in GASP jellyfish galaxies
Several studies have reported a nearly linear correlation between the molecular gas and star formation rate surface density, the so-called Kennicutt-Schmidt (KS) law. We aim to retrieve the KS rela...
A. Moretti, R. Paladino, C. Bacchini, A. Marasco, E. Giunchi, B. M. Poggianti, L. K. Hunt, T. Deb...
Why the Valuable Capabilities of LLMs Are Precisely the Unexplainable Ones
This paper proposes and argues for a counterintuitive thesis: the truly valuable capabilities of large language models (LLMs) reside precisely in the part that cannot be fully captured by human-rea...
Quan Cheng
Multi-turn Physics-informed Vision-language Model for Physics-grounded Anomaly Detection
Vision-Language Models (VLMs) demonstrate strong general-purpose reasoning but remain limited in physics-grounded anomaly detection, where causal understanding of dynamics is essential. Existing VL...
Yao Gu, Xiaohao Xu, Yingna Wu
Towards physically more comprehensive AGN modelling in cosmological simulations: A MACER-based modification of IllustrisTNG
Active galactic nuclei (AGN) feedback plays a significant role in many aspects of galaxy formation and evolution and has become a key ingredient in cosmological simulations. However, the subgrid mo...
Bocheng Zhu, Volker Springel, Feng Yuan
SCAN: Sparse Circuit Anchor Interpretable Neuron for Lifelong Knowledge Editing
Large Language Models (LLMs) often suffer from catastrophic forgetting and collapse during sequential knowledge editing. This vulnerability stems from the prevailing dense editing paradigm, which t...
Yuhuan Liu, Haitian Zhong, Xinyuan Xia, Qiang Liu, Shu Wu, Liang Wang
Bidirectional Chinese and English Passive Sentences Dataset for Machine Translation
Machine Translation (MT) evaluation has gone beyond metrics, towards more specific linguistic phenomena. Regarding English-Chinese language pairs, passive sentences are constructed and distributed ...
Xinyue Ma, Pol Pastells, Mireia Farrús, Mariona Taulé
Tracking the Discriminative Axis: Dual Prototypes for Test-Time OOD Detection Under Covariate Shift
For reliable deployment of deep-learning systems, out-of-distribution (OOD) detection is indispensable. In the real world, where test-time inputs often arrive as streaming mixtures of in-distributi...
Wooseok Lee, Jin Mo Yang, Saewoong Bahk, Hyung-Sin Kim
LMetric: Simple is Better - Multiplication May Be All You Need for LLM Request Scheduling
High-quality LLM request scheduling requires achieving two key objectives: whether the routed instance has KV$ to accelerate the request execution and whether the workload is balanced across instan...
Dingyan Zhang, Jinbo Han, Kaixi Zhang, Xingda Wei, Sijie Shen, Chenguang Fang, Wenyuan Yu, Jingre...
The effects of polarization on the observables in the decay $\Xi_{cc}^{++} \rightarrow \Xi_{c}^{+} \bar{\ell}\nu_{\ell}$
We investigate the effects of polarization on several physical observables in the semileptonic decay $\Xi_{cc}^{++} \rightarrow \Xi_{c}^{+} \bar{\ell}\nu_{\ell}$. We analyze the polarization effects of the...
Qazi Maaz Us Salam, Anamta Asif, Ishtiaq Ahmed, Rizwan Khalid
Joint Routing and Model Pruning for Decentralized Federated Learning in Bandwidth-Constrained Multi-Hop Wireless Networks
Decentralized federated learning (D-FL) enables privacy-preserving training without a central server, but multi-hop model exchanges and aggregation are often bottlenecked by communication resource ...
Xiaoyu He, Weicai Li, Tiejun Lv, Xi Yu
The Hrunting of AI: Where and How to Improve English Dialectal Fairness
It is known that large language models (LLMs) underperform in English dialects, and that improving them is difficult due to data scarcity. In this work we investigate how quality and availability i...
Wei Li, Adrian de Wynter
CATFormer: When Continual Learning Meets Spiking Transformers With Dynamic Thresholds
Although deep neural networks perform extremely well in controlled environments, they fail in real-world scenarios where data isn't available all at once, and the model must adapt to a new data dis...
Vaishnavi Nagabhushana, Kartikay Agrawal, Ayon Borthakur
Token Coherence: Adapting MESI Cache Protocols to Minimize Synchronization Overhead in Multi-Agent LLM Systems
Multi-agent LLM orchestration incurs synchronization costs scaling as $O(n \times S \times |D|)$ in agents, steps, and artifact size under naive broadcast -- a regime I term broadcast-induced triply-multiplica...
Vladyslav Parakhin
Viaggiu holographic dark energy in light of DESI DR2
We test the cosmological viability of the Viaggiu holographic dark energy (VHDE) model by using late-time observational data. In particular, we place constraints on the free parameters of the model...
Amlan K. Halder, Andronikos Paliathanasis, Stefano Viaggiu, Abdulla Al Mamon, Subhajit Saha
ForceVLA2: Unleashing Hybrid Force-Position Control with Force Awareness for Contact-Rich Manipulation
Embodied intelligence for contact-rich manipulation has predominantly relied on position control, while explicit awareness and regulation of interaction forces remain under-explored, limiting stabi...
Yang Li, Zhaxizhuoma, Hongru Jiang, Junjie Xia, Hongquan Zhang, Jinda Du, Yunsong Zhou, Jia Zeng...