Papers
Research papers from arXiv and related sources
The Rise of Null Hypothesis Significance Testing (NHST): Institutional Massification and the Emergence of a Procedural Epistemology
It has long been a puzzle why, despite sustained reform efforts, many applied scientific fields remain dominated by Null Hypothesis Significance Testing (NHST), a framework that dichotomizes study ...
Carol Ting
Towards Privacy-Preserving Machine Translation at the Inference Stage: A New Task and Benchmark
Current online translation services require sending user text to cloud servers, posing a risk of privacy leakage when the text contains sensitive information. This risk hinders the application of o...
Wei Shao, Lemao Liu, Yinqiao Li, Guoping Huang, Shuming Shi, Linqi Song
A Skill-augmented Agentic Framework and Benchmark for Multi-Video Understanding
Multimodal Large Language Models have achieved strong performance in single-video understanding, yet their ability to reason across multiple videos remains limited. Existing approaches typically co...
Yue Zhang, Liqiang Jing, Jia Li, Yapeng Tian, Xinya Du, Yunhui Guo, Vibhav Gogate
GameUIAgent: An LLM-Powered Framework for Automated Game UI Design with Structured Intermediate Representation
Game UI design requires consistent visual assets across rarity tiers yet remains a predominantly manual process. We present GameUIAgent, an LLM-powered agentic framework that translates natural lan...
Wei Zeng, Fengwei An, Zhen Liu, Jian Zhao
Multimodal Deep Learning for Early Prediction of Patient Deterioration in the ICU: Integrating Time-Series EHR Data with Clinical Notes
Early identification of patients at risk for clinical deterioration in the intensive care unit (ICU) remains a critical challenge. Delayed recognition of impending adverse events, including mortali...
Binesh Sadanandan
Quantum-Kinetic Dark Energy (QKDE): An effective dark energy framework with a covariantly completed time-dependent scalar kinetic normalization
A minimal effective dark-energy framework, Quantum-Kinetic Dark Energy (QKDE), is developed in which the scalar kinetic normalization carries a slow background time dependence through a covariant...
Daniel Brown
Visual Confused Deputy: Exploiting and Defending Perception Failures in Computer-Using Agents
Computer-using agents (CUAs) act directly on graphical user interfaces, yet their perception of the screen is often unreliable. Existing work largely treats these failures as performance limitation...
Xunzhuo Liu, Bowei He, Xue Liu, Andy Luo, Haichen Zhang, Huamin Chen
AdapterTune: Zero-Initialized Low-Rank Adapters for Frozen Vision Transformers
Frozen-backbone transfer with Vision Transformers faces two under-addressed issues: optimization instability when adapters are naively inserted into a fixed feature extractor, and the absence of pr...
Salim Khazem
Beyond Local Code Optimization: Multi-Agent Reasoning for Software System Optimization
Large language models and AI agents have recently shown promise in automating software performance optimization, but existing approaches predominantly rely on local, syntax-driven code transformati...
Huiyun Peng, Parth Vinod Patil, Antonio Zhong Qiu, George K. Thiruvathukal, James C. Davis
Representation Learning for Spatiotemporal Physical Systems
Machine learning approaches to spatiotemporal physical systems have primarily focused on next-frame prediction, with the goal of learning an accurate emulator for the system's evolution in time. Ho...
Helen Qu, Rudy Morel, Michael McCabe, Alberto Bietti, François Lanusse, Shirley Ho, Yann LeCun
Visual-ERM: Reward Modeling for Visual Equivalence
Vision-to-code tasks require models to reconstruct structured visual inputs, such as charts, tables, and SVGs, into executable or structured representations with high visual fidelity. While recent ...
Ziyu Liu, Shengyuan Ding, Xinyu Fang, Xuanlang Dai, Penghui Yang, Jianze Liang, Jiaqi Wang, Kai C...
A Generative Model of Conspicuous Consumption and Status Signaling
Status signaling drives human behavior and the allocation of scarce resources such as mating opportunities, yet the generative mechanisms governing how specific goods, signals, or behaviors acquire...
Logan Cross, Jordi Grau-Moya, William A. Cunningham, Alexander Sasha Vezhnevets, Joel Z. Leibo
Bounds on Agreement between Subjective and Objective Measurements
Objective estimators of multimedia quality are often judged by comparing estimates with subjective "truth data," most often via Pearson correlation coefficient (PCC) or mean-squared error (MSE). Bu...
Jaden Pieper, Stephen D. Voran
Neuron-Aware Data Selection In Instruction Tuning For Large Language Models
Instruction Tuning (IT) has been proven to be an effective approach to unlock the powerful capabilities of large language models (LLMs). Recent studies indicate that excessive IT data can degrade L...
Xin Chen, Junchao Wu, Shu Yang, Runzhe Zhan, Zeyu Wu, Min Yang, Shujian Huang, Lidia S. Chao, Der...
Navig-AI-tion: Navigation by Contextual AI and Spatial Audio
Audio-only walking navigation can leave users disoriented, relying on vague cardinal directions and lacking real-time environmental context, leading to frequent errors. To address this, we present ...
Mathias N. Lystbæk, Haley Adams, Ranjith Kagathi Ananda, Eric J Gonzalez, Luca Ballan, Qiuxuan Wu...
From Experiments to Expertise: Scientific Knowledge Consolidation for AI-Driven Computational Research
While large language models (LLMs) have transformed AI agents into proficient executors of computational materials science, performing a hundred simulations does not make a researcher. What disting...
Haonan Huang
Lattice Discrete Particle Model (LDPM): Comparison of Various Time Integration Solvers and Implementations
This article presents a comparison of various implementations of the Lattice Discrete Particle Model (LDPM) for the numerical simulation of concrete and other heterogeneous quasibrittle materials. ...
Erol Lale, Jan Eliáš, Ke Yu, Matthew Troemner, Monika Středulová, Julien Khoury, Tianju Xue, Ioan...
LLM Constitutional Multi-Agent Governance
Large Language Models (LLMs) can generate persuasive influence strategies that shift cooperative behavior in multi-agent populations, but a critical question remains: does the resulting cooperation...
J. de Curtò, I. de Zarzà
Verification of Robust Properties for Access Control Policies
Existing methods for verifying access control policies require the policy to be complete and fully determined before verification can proceed, but in practice policies are developed iteratively, co...
Alexander V. Gheorghiu
Semantic Invariance in Agentic AI
Large Language Models (LLMs) increasingly serve as autonomous reasoning agents in decision support, scientific problem-solving, and multi-agent coordination systems. However, deploying LLM agents i...
I. de Zarzà, J. de Curtò, Jordi Cabot, Pietro Manzoni, Carlos T. Calafate