Papers
Research papers from arXiv and related sources
FAMOSE: A ReAct Approach to Automated Feature Discovery
Feature engineering remains a critical yet challenging bottleneck in machine learning, particularly for tabular data, as identifying optimal features from an exponentially large feature space tradi...
Keith Burghardt, Jienan Liu, Sadman Sakib, Yuning Hao, Bo Li
When to Trust the Cheap Check: Weak and Strong Verification for Reasoning
Reasoning with LLMs increasingly unfolds inside a broader verification loop. Internally, systems use cheap checks, such as self-consistency or proxy rewards, which we call weak verification. Extern...
Shayan Kiyani, Sima Noorani, George Pappas, Hamed Hassani
Unmasking the Factual-Conceptual Gap in Persian Language Models
While emerging Persian NLP benchmarks have expanded into pragmatics and politeness, they rarely distinguish between memorized cultural facts and the ability to reason about implicit social norms. W...
Alireza Sakhaeirad, Ali Ma'manpoosh, Arshia Hemmat
What Makes a Good LLM Agent for Real-world Penetration Testing?
LLM-based agents show promise for automating penetration testing, yet reported performance varies widely across systems and benchmarks. We analyze 28 LLM-based penetration testing systems and evalu...
Gelei Deng, Yi Liu, Yuekang Li, Ruozhao Yang, Xiaofei Xie, Jie Zhang, Han Qiu, Tianwei Zhang
Stable Asynchrony: Variance-Controlled Off-Policy RL for LLMs
Reinforcement learning (RL) is widely used to improve large language models on reasoning tasks, and asynchronous RL training is attractive because it increases end-to-end throughput. However, for w...
Luke Huang, Zhuoyang Zhang, Qinghao Hu, Shang Yang, Song Han
Exploring Novel Data Storage Approaches for Large-Scale Numerical Weather Prediction
Driven by scientific and industry ambition, HPC and AI applications such as operational Numerical Weather Prediction (NWP) require processing and storing ever-increasing data volumes as fast as pos...
Nicolau Manubens Gil
Towards Anytime-Valid Statistical Watermarking
The proliferation of Large Language Models (LLMs) necessitates efficient mechanisms to distinguish machine-generated content from human text. While statistical watermarking has emerged as a promisi...
Baihe Huang, Eric Xu, Kannan Ramchandran, Jiantao Jiao, Michael I. Jordan
AutoNumerics: An Autonomous, PDE-Agnostic Multi-Agent Pipeline for Scientific Computing
PDEs are central to scientific and engineering modeling, yet designing accurate numerical solvers typically requires substantial mathematical expertise and manual tuning. Recent neural network-base...
Jianda Du, Youran Sun, Haizhao Yang
Adapting Actively on the Fly: Relevance-Guided Online Meta-Learning with Latent Concepts for Geospatial Discovery
In many real-world settings, such as environmental monitoring, disaster response, or public health, with costly and difficult data collection and dynamic environments, strategically sampling from u...
Jowaria Khan, Anindya Sarkar, Yevgeniy Vorobeychik, Elizabeth Bondi-Kelly
MolHIT: Advancing Molecular-Graph Generation with Hierarchical Discrete Diffusion Models
Molecular generation with diffusion models has emerged as a promising direction for AI-driven drug discovery and materials science. While graph diffusion models have been widely adopted due to the ...
Hojung Jung, Rodrigo Hormazabal, Jaehyeong Jo, Youngrok Park, Kyunggeun Roh, Se-Young Yun, Sehui ...
Art2Mus: Artwork-to-Music Generation via Visual Conditioning and Large-Scale Cross-Modal Alignment
Music generation has advanced markedly through multimodal deep learning, enabling models to synthesize audio from text and, more recently, from images. However, existing image-conditioned systems s...
Ivan Rinaldi, Matteo Mendula, Nicola Fanelli, Florence Levé, Matteo Testi, Giovanna Castellano, G...
The Cascade Equivalence Hypothesis: When Do Speech LLMs Behave Like ASR$\rightarrow$LLM Pipelines?
Current speech LLMs largely perform implicit ASR: on tasks solvable from a transcript, they are behaviorally and mechanistically equivalent to simple Whisper$\to$LLM cascades. We show this through ...
Jayadev Billa
Stochastic galactic supernova flux of semi-relativistic particles
New exotic particles with MeV masses, such as axion-like particles or light dark matter, can be emitted from core-collapse supernovae (SNe) with semi-relativistic velocities. Due to their speed dis...
David Alonso-González, David Cerdeño, Marina Cermeño, Andres D. Perez
Asymptotic Smoothing of the Lipschitz Loss Landscape in Overparameterized One-Hidden-Layer ReLU Networks
We study the topology of the loss landscape of one-hidden-layer ReLU networks under overparameterization. On the theory side, we (i) prove that for convex $L$-Lipschitz losses with an $\ell_1$-regu...
Saveliy Baturin
AI Gamestore: Scalable, Open-Ended Evaluation of Machine General Intelligence with Human Games
Rigorously evaluating machine intelligence against the broad spectrum of human general intelligence has become increasingly important and challenging in this era of rapid technological advance. Con...
Lance Ying, Ryan Truong, Prafull Sharma, Kaiya Ivy Zhao, Nathan Cloos, Kelsey R. Allen, Thomas L....
BMW: Bayesian Model-Assisted Adaptive Phase II Clinical Trial Design for Win Ratio Statistic
The win ratio (WR) statistic is increasingly used to evaluate treatment effects based on prioritized composite endpoints, yet existing Bayesian adaptive designs are not directly applicable because ...
Di Zhu, Yong Zang
BMC4TimeSec: Verification Of Timed Security Protocols
We present BMC4TimeSec, an end-to-end tool for verifying Timed Security Protocols (TSP) based on SMT-based bounded model checking and multi-agent modelling in the form of Timed Interpreted Systems ...
Agnieszka M. Zbrzezny
Asymptotically Optimal Sequential Testing with Markovian Data
We study one-sided and $α$-correct sequential hypothesis testing for data generated by an ergodic Markov chain. The null hypothesis is that the unknown transition matrix belongs to a prescribed set...
Alhad Sethi, Kavali Sofia Sagar, Shubhada Agrawal, Debabrota Basu, P. N. Karthik
Building an AI-native Research Ecosystem for Experimental Particle Physics: A Community Vision
Experimental particle physics seeks to understand the universe by probing its fundamental particles and forces and exploring how they govern the large-scale processes that shape cosmic evolution. T...
Thea Klaeboe Aarrestad, Alaa Abdelhamid, Haider Abidi, Jahred Adelman, Jennifer Adelman-McCarthy,...
Optimal control of stochastic Volterra integral equations with completely monotone kernels and stochastic differential equations on Hilbert spaces with unbounded control and diffusion operators
The dynamic programming approach is one of the most powerful ones in optimal control. However, when dealing with optimal control problems of stochastic Volterra integral equations (SVIEs) with comp...
Gabriele Bolli, Filippo de Feo