Papers
Research papers from arXiv and related sources
AgenticRS-EnsNAS: Ensemble-Decoupled Self-Evolving Architecture Search
Neural Architecture Search (NAS) deployment in industrial production systems faces a fundamental validation bottleneck: verifying a single candidate architecture pi requires evaluating the deployed...
Yun Chen, Moyu Zhang, Jinxin Hu, Yu Zhang, Xiaoyi Zeng
ReViSQL: Achieving Human-Level Text-to-SQL
Translating natural language to SQL (Text-to-SQL) is a critical challenge in both database research and data analytics applications. Recent efforts have focused on enhancing SQL reasoning by develo...
Yuxuan Zhu, Tengjun Jin, Yoojin Choi, Daniel Kang
An Agentic Approach to Generating XAI-Narratives
Explainable AI (XAI) research has experienced substantial growth in recent years. Existing XAI methods, however, have been criticized for being technical and expert-oriented, motivating the develop...
Yifan He, David Martens
Sound State Encodings in Translational Separation Logic Verifiers (Extended Version)
Automated program verifiers are often organized into a front-end, which encodes an input program into an intermediate verification language (IVL), and a back-end, which proves that the IVL program ...
Hongyi Ling, Thibault Dardinier, Ellen Arlt, Peter Müller
When Contextual Inference Fails: Cancelability in Interactive Instruction Following
We investigate the separation of literal interpretation from contextual inference in a collaborative block-building task where a builder must resolve underspecified instructions using contextual in...
Natalia Bila, Kata Naszádi, Alexandra Mayn, Christof Monz
Evaluating Test-Time Adaptation For Facial Expression Recognition Under Natural Cross-Dataset Distribution Shifts
Deep learning models often struggle under natural distribution shifts, a common challenge in real-world deployments. Test-Time Adaptation (TTA) addresses this by adapting models during inference wi...
John Turnbull, Shivam Grover, Amin Jalali, Ali Etemad
Breaking the Capability Ceiling of LLM Post-Training by Reintroducing Markov States
Reinforcement learning (RL) has become a standard paradigm for post-training and aligning Large Language Models (LLMs), yet recent evidence suggests it faces a persistent "capability ceiling": unli...
Yurun Yuan, Tengyang Xie
Stone-in-Waiting: A Cloud-Based Accelerator for the Quantum Approximate Optimization Algorithm
The Quantum Approximate Optimization Algorithm (QAOA) and its advanced variant, the Quantum Alternating Operator Ansatz (QAOA), are major research topics in the current era of Noisy Intermediate-Sc...
Shuai Zeng
X-World: Controllable Ego-Centric Multi-Camera World Models for Scalable End-to-End Driving
Scalable and reliable evaluation is increasingly critical in the end-to-end era of autonomous driving, where vision-language-action (VLA) policies directly map raw sensor streams to driving actio...
Chaoda Zheng, Sean Li, Jinhao Deng, Zhennan Wang, Shijia Chen, Liqiang Xiao, Ziheng Chi, Hongbin ...
Promoting Critical Thinking With Domain-Specific Generative AI Provocations
The evidence on the effects of generative AI (GenAI) on critical thinking is mixed, with studies suggesting both potential harms and benefits depending on its implementation. Some argue that AI-dri...
Thomas Şerban von Davier, Hao-Ping Lee, Jodi Forlizzi, Sauvik Das
Trojan's Whisper: Stealthy Manipulation of OpenClaw through Injected Bootstrapped Guidance
Autonomous coding agents are increasingly integrated into software development workflows, offering capabilities that extend beyond code suggestion to active system interaction and environment manag...
Fazhong Liu, Zhuoyan Chen, Tu Lan, Haozhen Tan, Zhenyu Xu, Xiang Li, Guoxing Chen, Yan Meng, Haoj...
Model-Driven Learning-Based Physical Layer Authentication for Mobile Wi-Fi Devices
The rise of wireless technologies has made the Internet of Things (IoT) ubiquitous, but the broadcast nature of wireless communications exposes IoT to authentication risks. Physical layer authentic...
Yijia Guo, Junqing Zhang, Yao-Win Peter Hong, Stefano Tomasin
Interpreting Reinforcement Learning Model Behavior via Koopman with Control
Reinforcement learning (RL) models have shown the capability of learning complex behaviors, but quantitatively assessing those behaviors - which is critical for safety assurance and the discovery o...
William T. Redman
HiPath: Hierarchical Vision-Language Alignment for Structured Pathology Report Prediction
Pathology reports are structured, multi-granular documents encoding diagnostic conclusions, histological grades, and ancillary test results across one or more anatomical sites; yet existing patholo...
Ruicheng Yuan, Zhenxuan Zhang, Anbang Wang, Liwei Hu, Xiangqian Hua, Yaya Peng, Jiawei Luo, Guang...
On the Ability of Transformers to Verify Plans
Transformers have shown inconsistent success in AI planning tasks, and theoretical understanding of when generalization should be expected has been limited. We take important steps towards addressi...
Yash Sarrof, Yupei Du, Katharina Stein, Alexander Koller, Sylvie Thiébaux, Michael Hahn
On the Capacity of Future Lane-Free Urban Infrastructure
In this paper, the potential capacity and spatial efficiency of future autonomous lane-free traffic in urban environments are explored using a combination of analytical and simulation-based approac...
Patrick Malcolm, Klaus Bogenberger
TAPAS: Efficient Two-Server Asymmetric Private Aggregation Beyond Prio(+)
Privacy-preserving aggregation is a cornerstone for AI systems that learn from distributed data without exposing individual records, especially in federated learning and telemetry. Existing two-ser...
Harish Karthikeyan, Antigoni Polychroniadou
Large Language Models and Stock Investing: Is the Human Factor Required?
This paper investigates whether large language models (LLMs) can generate reliable stock market predictions. We evaluate four state-of-the-art models - ChatGPT, Gemini, DeepSeek, and Perplexity - a...
Ricardo Crisostomo, Diana Mykhalyuk
Hybrid topic modelling for computational close reading: Mapping narrative themes in Pushkin's Evgenij Onegin
This study presents a hybrid topic modelling framework for computational literary analysis that integrates Latent Dirichlet Allocation (LDA) with sparse Partial Least Squares Discriminant Analysis ...
Angelo Maria Sabatini
Memori: A Persistent Memory Layer for Efficient, Context-Aware LLM Agents
As large language models (LLMs) evolve into autonomous agents, persistent memory at the API layer is essential for enabling context-aware behavior across LLMs and multi-session interactions. Existi...
Luiz C. Borro, Luiz A. B. Macarini, Gordon Tindall, Michael Montero, Adam B. Struck