Papers
Research papers from arXiv and related sources
RADIUS: Ranking, Distribution, and Significance - A Comprehensive Alignment Suite for Survey Simulation
Simulation of surveys using LLMs is emerging as a powerful application for generating human-like responses at scale. Prior work evaluates survey simulation using metrics borrowed from other domains...
Weronika Łajewska, Paul Missault, George Davidson, Saab Mansour
SVLAT: Scientific Visualization Literacy Assessment Test
Scientific visualization (SciVis) has become an essential means for exploring, understanding, and communicating complex scientific phenomena. However, the field still lacks a validated instrument a...
Patrick Phuoc Do, Kaiyuan Tang, Kuangshi Ai, Chaoli Wang
Regret Bounds for Competitive Resource Allocation with Endogenous Costs
We study online resource allocation among N interacting modules over T rounds. Unlike standard online optimization, costs are endogenous: they depend on the full allocation vector through an intera...
Rui Chai
Book your room in the Turing Hotel! A symmetric and distributed Turing Test with multiple AIs and humans
In this paper, we report our experience with "TuringHotel", a novel extension of the Turing Test based on interactions within mixed communities of Large Language Models (LLMs) and human participa...
Christian Di Maio, Tommaso Guidi, Luigi Quarantiello, Jack Bell, Marco Gori, Stefano Melacci, Vin...
Evaluating 5W3H Structured Prompting for Intent Alignment in Human-AI Interaction
Natural language prompts often suffer from intent transmission loss: the gap between what users actually need and what they communicate to AI systems. We evaluate PPS (Prompt Protocol Specification...
Peng Gang
Terms of (Ab)Use: An Analysis of GenAI Services
Generative AI services like ChatGPT and Gemini are some of the fastest-growing consumer services. Individuals using such services must accept their terms of use before access, and conform to these ...
Harshvardhan J. Pandit, Dick A. H. Blankvoort, Sasha Luccioni, Abeba Birhane
Sketch2Topo: Using Hand-Drawn Inputs for Diffusion-Based Topology Optimization
Topology optimization (TO) is employed in engineering to optimize structural performance while maximizing material efficiency. However, traditional TO methods incur significant computational and ti...
Shuyue Feng, Cedric Caremel, Yoshihiro Kawahara
Balancing Performance and Fairness in Explainable AI for Anomaly Detection in Distributed Power Plants Monitoring
Reliable anomaly detection in distributed power plant monitoring systems is essential for ensuring operational continuity and reducing maintenance costs, particularly in regions where telecom opera...
Corneille Niyonkuru, Marcellin Atemkeng, Gabin Maxime Nguegnang, Arnaud Nguembang Fadja
Context Bootstrapped Reinforcement Learning
Reinforcement Learning from Verifiable Rewards (RLVR) suffers from exploration inefficiency, where models struggle to generate successful rollouts, resulting in minimal learning signal. This challe...
Saaket Agashe, Jayanth Srinivasa, Gaowen Liu, Ramana Kompella, Xin Eric Wang
Entropy trajectory shape predicts LLM reasoning reliability: A diagnostic study of uncertainty dynamics in chain-of-thought
Chain-of-thought (CoT) reasoning improves LLM accuracy, yet detecting failures cheaply remains elusive. We study whether the shape of uncertainty dynamics across reasoning steps--captured by sampli...
Xinghao Zhao
Agentic Business Process Management: A Research Manifesto
This paper presents a manifesto that articulates the conceptual foundations of Agentic Business Process Management (APM), an extension of Business Process Management (BPM) for governing autonomous ...
Diego Calvanese, Angelo Casciani, Giuseppe De Giacomo, Marlon Dumas, Fabiana Fournier, Timotheus ...
Security, privacy, and agentic AI in a regulatory view: From definitions and distinctions to provisions and reflections
The rapid proliferation of artificial intelligence (AI) technologies has led to a dynamic regulatory landscape, where legislative frameworks strive to keep pace with technical advancements. As AI p...
Shiliang Zhang, Sabita Maharjan
GHOST: Fast Category-agnostic Hand-Object Interaction Reconstruction from RGB Videos using Gaussian Splatting
Understanding realistic hand-object interactions from monocular RGB videos is essential for AR/VR, robotics, and embodied AI. Existing methods rely on category-specific templates or heavy computati...
Ahmed Tawfik Aboukhadra, Marcel Rogge, Nadia Robertini, Abdalla Arafa, Jameel Malik, Ahmed Elhaye...
Progressive Training for Explainable Citation-Grounded Dialogue: Reducing Hallucination to Zero in English-Hindi LLMs
Knowledge-grounded dialogue systems aim to generate informative, contextually relevant responses by conditioning on external knowledge sources. However, most existing approaches focus exclusively o...
Vedant Pandya
Uniform a priori bounds and error analysis for the Adam stochastic gradient descent optimization method
The adaptive moment estimation (Adam) optimizer proposed by Kingma & Ba (2014) is presumably the most popular stochastic gradient descent (SGD) optimization method for the training of deep neural n...
Steffen Dereich, Thang Do, Arnulf Jentzen
Comparative Analysis of Large Language Models in Generating Telugu Responses for Maternal Health Queries
Large Language Models (LLMs) have been progressively exhibiting their capabilities in various areas of research. The performance of the LLMs in acute maternal healthcare area, predominantly in low ...
Anagani Bhanusree, Sai Divya Vissamsetty, K VenkataKrishna Rao, Rimjhim
Act While Thinking: Accelerating LLM Agents via Pattern-Aware Speculative Tool Execution
LLM-powered agents are emerging as a dominant paradigm for autonomous task solving. Unlike standard inference workloads, agents operate in a strictly serial "LLM-tool" loop, where the LLM must wait...
Yifan Sui, Han Zhao, Rui Ma, Zhiyuan He, Hao Wang, Jianxun Li, Yuqing Yang
Translating MRI to PET through Conditional Diffusion Models with Enhanced Pathology Awareness
Positron emission tomography (PET) is a widely recognized technique for diagnosing neurodegenerative diseases, offering critical functional insights. However, its high costs and radiation exposure ...
Yitong Li, Igor Yakushev, Dennis M. Hedderich, Christian Wachinger
From Accuracy to Readiness: Metrics and Benchmarks for Human-AI Decision-Making
Artificial intelligence (AI) systems are deployed as collaborators in human decision-making. Yet, evaluation practices focus primarily on model accuracy rather than whether human-AI teams are prepa...
Min Hun Lee
I Can't Believe It's Corrupt: Evaluating Corruption in Multi-Agent Governance Systems
Large language models are increasingly proposed as autonomous agents for high-stakes public workflows, yet we lack systematic evidence about whether they would follow institutional rules when grant...
Vedanta S P, Ponnurangam Kumaraguru