Papers
Research papers from arXiv and related sources
Revisiting foundation models for cell instance segmentation
Cell segmentation is a fundamental task in microscopy image analysis. Several foundation models for cell segmentation have been introduced; virtually all of them are extensions of Segment Anything ...
Anwai Archit, Constantin Pape
Uncertainty equality for SU(N) observables enabling the experimentally friendly detection of k-inseparability via purity measurements
We derive an exact uncertainty relation for arbitrary quantum states of finite-dimensional Hilbert spaces. For any given $k$-partition of a $d$-dimensional multipartite system, we introduce the tot...
G. Tartaglione, G. Zanfardino, F. Illuminati
The Revised Evolutionary Volume Tolman Test: Cosmological Constraints from Galaxy Evolution
In this study we adapt a classical cosmology measurement, the volume or number density test, to a modern synthesis of observed galaxy evolution. We do this by using measured galaxy mass functions a...
Christopher J. Conselice, Edmund J. Copeland, Sergio Sevillano Muñoz
How do LLMs Compute Verbal Confidence
Verbal confidence -- prompting LLMs to state their confidence as a number or category -- is widely used to extract uncertainty estimates from black-box models. However, how LLMs internally generate...
Dharshan Kumaran, Arthur Conmy, Federico Barbero, Simon Osindero, Viorica Patraucean, Petar Velic...
Event-Centric Human Value Understanding in News-Domain Texts: An Actor-Conditioned, Multi-Granularity Benchmark
Existing human value datasets do not directly support value understanding in factual news: many are actor-agnostic, rely on isolated utterances or synthetic scenarios, and lack explicit event struc...
Yao Wang, Xin Liu, Zhuochen Liu, Jiankang Chen, Adam Jatowt, Kyoungsook Kim, Noriko Kando, Haitao Yu
Verification and Validation of Physics-Informed Surrogate Component Models for Dynamic Power-System Simulation
Physics-informed machine learning surrogates are increasingly explored to accelerate dynamic simulation of generators, converters, and other power grid components. The key question, however, is not...
Petros Ellinas, Indrajit Chaudhuri, Johanna Vorwerk, Spyros Chatzivasileiadis
Generative Control as Optimization: Time Unconditional Flow Matching for Adaptive and Robust Robotic Control
Diffusion models and flow matching have become a cornerstone of robotic imitation learning, yet they suffer from a structural inefficiency where inference is often bound to a fixed integration sche...
Zunzhe Zhang, Runhan Huang, Yicheng Liu, Shaoting Zhu, Linzhan Mou, Hang Zhao
ArchBench: Benchmarking Generative-AI for Software Architecture Tasks
Benchmarks for large language models (LLMs) have progressed from snippet-level function generation to repository-level issue resolution, yet they overwhelmingly target implementation correctness. S...
Bassam Adnan, Aviral Gupta, Sreemaee Akshathala, Karthik Vaidhyanathan
Text-to-Stage: Spatial Layouts from Long-form Narratives
In this work, we probe the ability of a language model to demonstrate spatial reasoning from unstructured text, mimicking human capabilities and automating a process that benefits many downstream m...
Jefferson Hernandez, Swarnadeep Saha, Chenxi Whitehouse, Sanjeel Parekh, Calvin Murdock, Yuliang ...
RPMS: Enhancing LLM-Based Embodied Planning through Rule-Augmented Memory Synergy
LLM agents often fail in closed-world embodied environments because actions must satisfy strict preconditions -- such as location, inventory, and container states -- and failure feedback is sparse....
Zhenhang Yuan, Shenghai Yuan, Lihua Xie
CodeScout: An Effective Recipe for Reinforcement Learning of Code Search Agents
A prerequisite for coding agents to perform tasks on large repositories is code localization - the identification of relevant files, classes, and functions to work on. While repository-level code l...
Lintang Sutawika, Aditya Bharat Soni, Bharath Sriraam R R, Apurva Gandhi, Taha Yassine, Sanidhya ...
FailureMem: A Failure-Aware Multimodal Framework for Autonomous Software Repair
Multimodal Automated Program Repair (MAPR) extends traditional program repair by requiring models to jointly reason over source code, textual issue descriptions, and visual artifacts such as GUI sc...
Ruize Ma, Yilei Jiang, Shilin Zhang, Zheng Ma, Yi Feng, Vincent Ng, Zhi Wang, Xiangyu Yue, Chuany...
Discovering Decoupled Functional Modules in Large Language Models
Understanding the internal functional organization of Large Language Models (LLMs) is crucial for improving their trustworthiness and performance. However, how LLMs organize different functions int...
Yanke Yu, Jin Li, Ying Sun, Ping Li, Zhefeng Wang, Yi Zheng
CodeT5-RNN: Reinforcing Contextual Embeddings for Enhanced Code Comprehension
Contextual embeddings generated by LLMs exhibit strong positional inductive biases, which can limit their ability to fully capture long-range, order-sensitive dependencies in highly structured sour...
Md Mostafizer Rahman, Ariful Islam Shiplu, Yutaka Watanobe, Md Faizul Ibne Amin, Syed Rameez Naqv...
Process Supervision for Chain-of-Thought Reasoning via Monte Carlo Net Information Gain
Multi-step reasoning improves the capabilities of large language models (LLMs) but increases the risk of errors propagating through intermediate steps. Process reward models (PRMs) mitigate this by...
Corentin Royer, Debarun Bhattacharjya, Gaetano Rossiello, Andrea Giovannini, Mennatallah El-Assady
M2P: Improving Visual Foundation Models with Mask-to-Point Weakly-Supervised Learning for Dense Point Tracking
Tracking Any Point (TAP) has emerged as a fundamental tool for video understanding. Current approaches adapt Vision Foundation Models (VFMs) like DINOv2 via offline finetuning or test-time optimiza...
Qiangqiang Wu, Tianyu Yang, Bo Fang, Jia Wan, Matias Di Martino, Guillermo Sapiro, Antoni B. Chan
Swarm: Co-Activation Aware KVCache Offloading Across Multiple SSDs
The key-value (KV) cache has become the dominant contributor to memory consumption in large language model (LLM) inference. Although offloading KVCache from GPU high-bandwidth memory (HBM) to CPU D...
Tuowei Wang, Liyun Chu, Ruwen Fan, Ju Ren
Simulating the influence of stoichiometry on the spectral emissivity of Mo$_x$Si$_y$ thin films
In this work, we simulate the spectral emissivity of various stoichiometric crystal phases of Mo$_x$Si$_y$ compounds using density functional perturbation theory. The dielectric function, including...
Zahra Golsanamlou, Arseniy Baskakov, Robbert van de Kruijs, Silvester Houweling, Giorgio Colombi,...
Multivariate Residual Estimation Risk
The purpose of this paper is to describe and extend the use of the newly introduced measure, residual estimation risk. Following the seminal work of Bignozzi and Tsanakas, the quantification of res...
D. J. Manuge
Algorithms for the Generation of Snarks
The essential requirement for a cubic graph to be called a snark is that it cannot be edge-coloured with three colours. To avoid trivial cases, varying restrictions on the connectivity are impos...
Gunnar Brinkmann, Steven Van Overberghe