Papers
Research papers from arXiv and related sources
Field-angle dependence of magnetoresistance in UTe2
We theoretically study angle-resolved magnetoresistance under rotated magnetic field in the normal state of a spin-triplet superconductor UTe$_2$. The Wannier model derived from a GGA+$U$ calculati...
Jun Ishizuka, Youichi Yanase
Draft-and-Prune: Improving the Reliability of Auto-formalization for Logical Reasoning
Auto-formalization (AF) translates natural-language reasoning problems into solver-executable programs, enabling symbolic solvers to perform sound logical deduction. In practice, however, AF pipeli...
Zhiyu Ni, Zheng Liang, Liangcheng Song, Chenrui Cao, Xian Zhang, Alberto Sangiovanni-Vincentelli,...
From Drop-off to Recovery: A Mechanistic Analysis of Segmentation in MLLMs
Multimodal Large Language Models (MLLMs) are increasingly applied to pixel-level vision tasks, yet their intrinsic capacity for spatial understanding remains poorly understood. We investigate segme...
Boyong Wu, Sanghwan Kim, Zeynep Akata
Alignment Makes Language Models Normative, Not Descriptive
Post-training alignment optimizes language models to match human preference signals, but this objective is not equivalent to modeling observed human behavior. We compare 120 base-aligned model pair...
Eilam Shapira, Moshe Tennenholtz, Roi Reichart
Talk is Cheap, Logic is Hard: Benchmarking LLMs on Post-Condition Formalization
Formal specifications, such as pre- and post-conditions, provide a solid basis for performing thorough program verification. However, developers rarely provide such formal specifications, hence if A...
I. S. W. B. Prasetya, Fitsum Kifetew, Davide Prandi
Influence of Gripper Design on Human Demonstration Quality for Robot Learning
Opening sterile medical packaging is routine for healthcare workers but remains challenging for robots. Learning from demonstration enables robots to acquire manipulation skills directly from human...
Gina L. Georgadarellis, Natalija Beslic, Seonhun Lee, Frank C. Sup, Meghan E. Huber
Reconstructing the Type Ia Supernova Absolute Magnitude with Two-Probe Physics-Informed Neural Networks
We apply two variants of Physics-Informed Neural Networks (PINNs) to reconstruct the Type Ia supernova absolute magnitude $M_B(z)$ from joint BAO and supernova data under four cosmological models (...
Denitsa Staicova
Generalist Multimodal LLMs Gain Biometric Expertise via Human Salience
Iris presentation attack detection (PAD) is critical for secure biometric deployments, yet developing specialized models faces significant practical barriers: collecting data representing future un...
Jacob Piland, Byron Dowling, Christopher Sweet, Adam Czajka
Noise-Response Calibration: A Causal Intervention Protocol for LLM-Judges
Large language models (LLMs) are increasingly used as automated judges and synthetic labelers, especially in low-label settings. Yet these systems are stochastic and often overconfident, which make...
Maxim Khomiakov, Jes Frellsen
PAuth - Precise Task-Scoped Authorization For Agents
The emerging agentic web envisions AI agents that reliably fulfill users' natural-language (NL)-based tasks by interacting with existing web services. However, existing authorization models are mis...
Reshabh K Sharma, Linxi Jiang, Zhiqiang Lin, Shuo Chen
Quadratic Surrogate Attractor for Particle Swarm Optimization
This paper presents a particle swarm optimization algorithm that leverages surrogate modeling to replace the conventional global best solution with the minimum of an n-dimensional quadratic form, p...
Maurizio Clemente, Marcello Canova
Energy Flow Graph: Modeling Software Energy Consumption
The growing energy demands of computational systems necessitate a fundamental shift from performance-centric design to one that treats energy consumption as one of the primary design considerations...
Saurabhsingh Rajput, Tushar Sharma
Intent Formalization: A Grand Challenge for Reliable Coding in the Age of AI Agents
Agentic AI systems can now generate code with remarkable fluency, but a fundamental question remains: does the generated code actually do what the user intended? The gap between informal nat...
Shuvendu K. Lahiri
Multilingual Reference Need Assessment System for Wikipedia
Wikipedia is a critical source of information for millions of users across the Web. It serves as a key resource for large language models, search engines, question-answering systems, and other Web-...
Aitolkyn Baigutanova, Francisco Navas, Pablo Aragon, Mykola Trokhymovych, Muniza Aslam, Ai-Jou Ch...
SLSim: a strong lensing population simulation package
Gravitational lensing offers unique insights into cosmology by bending light around massive objects. Strong gravitational lensing, in particular, produces magnified and often multiple images of dis...
Narayan Khadka, Simon Birrer, Henry Best, Paras Sharma, Katsuya T. Abe, Xianzhe Tang, Carly Misti...
A Longitudinal Study of Usability in Identity-Based Software Signing
Identity-based software signing tools aim to make software artifact provenance verifiable while reducing the operational burden of long-lived key management. However, there is limited cross-tool lo...
Kelechi G. Kalu, Hieu Tran, Santiago Torres-Arias, Sooyeon Jeong, James C. Davis
Upward Book Embeddings of Partitioned Digraphs
In 1999, Heath, Pemmaraju, and Trenk [SIAM J. Comput. 28(4), 1999] extended the classic notion of book embeddings to digraphs, introducing the concept of upward book embeddings, in which the vertic...
Giordano Da Lozzo, Fabrizio Frati, Ignaz Rutter
Hidden Clones: Exposing and Fixing Family Bias in Vision-Language Model Ensembles
Ensembling Vision-Language Models (VLMs) from different providers maximizes benchmark accuracy, yet models from the same architectural family share correlated errors that standard voting ignores. W...
Zacharie Bugaud
On Big-M Reformulations of Bilevel Linear Programs: Hardness of A Posteriori Verification
A standard approach to solving optimistic bilevel linear programs (BLPs) is to replace the lower-level problem with its Karush-Kuhn-Tucker (KKT) optimality conditions and reformulate the resulting ...
Sergey S. Ketkov, Oleg A. Prokopyev
How Proxy Race Distorts Regression-Based Fairness Audits
Proxy-based race inference is increasingly used to conduct fairness assessments when protected-class data are unavailable or legally restricted -- most prominently in U.S. fair-lending enforcement,...
Xi Xin, Giles Hooker, Fei Huang