Research

Papers

Research papers from arXiv and related sources

Total: 4513; AI/LLM: 2483; Testing: 2030
AI LLM

RADIUS: Ranking, Distribution, and Significance - A Comprehensive Alignment Suite for Survey Simulation

Simulation of surveys using LLMs is emerging as a powerful application for generating human-like responses at scale. Prior work evaluates survey simulation using metrics borrowed from other domains...

Weronika Łajewska, Paul Missault, George Davidson, Saab Mansour

2603.19002 2026-03-19
AI LLM

SVLAT: Scientific Visualization Literacy Assessment Test

Scientific visualization (SciVis) has become an essential means for exploring, understanding, and communicating complex scientific phenomena. However, the field still lacks a validated instrument a...

Patrick Phuoc Do, Kaiyuan Tang, Kuangshi Ai, Chaoli Wang

2603.19000 2026-03-19
AI LLM

Regret Bounds for Competitive Resource Allocation with Endogenous Costs

We study online resource allocation among N interacting modules over T rounds. Unlike standard online optimization, costs are endogenous: they depend on the full allocation vector through an intera...

Rui Chai

2603.18999 2026-03-19
AI LLM

Book your room in the Turing Hotel! A symmetric and distributed Turing Test with multiple AIs and humans

In this paper, we report our experience with "TuringHotel", a novel extension of the Turing Test based on interactions within mixed communities of Large Language Models (LLMs) and human participa...

Christian Di Maio, Tommaso Guidi, Luigi Quarantiello, Jack Bell, Marco Gori, Stefano Melacci, Vin...

2603.18981 2026-03-19
AI LLM

Evaluating 5W3H Structured Prompting for Intent Alignment in Human-AI Interaction

Natural language prompts often suffer from intent transmission loss: the gap between what users actually need and what they communicate to AI systems. We evaluate PPS (Prompt Protocol Specification...

Peng Gang

2603.18976 2026-03-19
AI LLM

Terms of (Ab)Use: An Analysis of GenAI Services

Generative AI services like ChatGPT and Gemini are some of the fastest-growing consumer services. Individuals using such services must accept their terms of use before access, and conform to these ...

Harshvardhan J. Pandit, Dick A. H. Blankvoort, Sasha Luccioni, Abeba Birhane

2603.18964 2026-03-19
AI LLM

Sketch2Topo: Using Hand-Drawn Inputs for Diffusion-Based Topology Optimization

Topology optimization (TO) is employed in engineering to optimize structural performance while maximizing material efficiency. However, traditional TO methods incur significant computational and ti...

Shuyue Feng, Cedric Caremel, Yoshihiro Kawahara

2603.18960 2026-03-19
AI LLM

Balancing Performance and Fairness in Explainable AI for Anomaly Detection in Distributed Power Plants Monitoring

Reliable anomaly detection in distributed power plant monitoring systems is essential for ensuring operational continuity and reducing maintenance costs, particularly in regions where telecom opera...

Corneille Niyonkuru, Marcellin Atemkeng, Gabin Maxime Nguegnang, Arnaud Nguembang Fadja

2603.18954 2026-03-19
AI LLM

Context Bootstrapped Reinforcement Learning

Reinforcement Learning from Verifiable Rewards (RLVR) suffers from exploration inefficiency, where models struggle to generate successful rollouts, resulting in minimal learning signal. This challe...

Saaket Agashe, Jayanth Srinivasa, Gaowen Liu, Ramana Kompella, Xin Eric Wang

2603.18953 2026-03-19
AI LLM

Entropy trajectory shape predicts LLM reasoning reliability: A diagnostic study of uncertainty dynamics in chain-of-thought

Chain-of-thought (CoT) reasoning improves LLM accuracy, yet detecting failures cheaply remains elusive. We study whether the shape of uncertainty dynamics across reasoning steps--captured by sampli...

Xinghao Zhao

2603.18940 2026-03-19
AI LLM

Agentic Business Process Management: A Research Manifesto

This paper presents a manifesto that articulates the conceptual foundations of Agentic Business Process Management (APM), an extension of Business Process Management (BPM) for governing autonomous ...

Diego Calvanese, Angelo Casciani, Giuseppe De Giacomo, Marlon Dumas, Fabiana Fournier, Timotheus ...

2603.18916 2026-03-19
AI LLM

Security, privacy, and agentic AI in a regulatory view: From definitions and distinctions to provisions and reflections

The rapid proliferation of artificial intelligence (AI) technologies has led to a dynamic regulatory landscape, where legislative frameworks strive to keep pace with technical advancements. As AI p...

Shiliang Zhang, Sabita Maharjan

2603.18914 2026-03-19
AI LLM

GHOST: Fast Category-agnostic Hand-Object Interaction Reconstruction from RGB Videos using Gaussian Splatting

Understanding realistic hand-object interactions from monocular RGB videos is essential for AR/VR, robotics, and embodied AI. Existing methods rely on category-specific templates or heavy computati...

Ahmed Tawfik Aboukhadra, Marcel Rogge, Nadia Robertini, Abdalla Arafa, Jameel Malik, Ahmed Elhaye...

2603.18912 2026-03-19
AI LLM

Progressive Training for Explainable Citation-Grounded Dialogue: Reducing Hallucination to Zero in English-Hindi LLMs

Knowledge-grounded dialogue systems aim to generate informative, contextually relevant responses by conditioning on external knowledge sources. However, most existing approaches focus exclusively o...

Vedant Pandya

2603.18911 2026-03-19
AI LLM

Uniform a priori bounds and error analysis for the Adam stochastic gradient descent optimization method

The adaptive moment estimation (Adam) optimizer proposed by Kingma & Ba (2014) is presumably the most popular stochastic gradient descent (SGD) optimization method for the training of deep neural n...

Steffen Dereich, Thang Do, Arnulf Jentzen

2603.18899 2026-03-19
AI LLM

Comparative Analysis of Large Language Models in Generating Telugu Responses for Maternal Health Queries

Large Language Models (LLMs) have been progressively exhibiting their capabilities in various areas of research. The performance of the LLMs in the acute maternal healthcare area, predominantly in low ...

Anagani Bhanusree, Sai Divya Vissamsetty, K VenkataKrishna Rao, Rimjhim

2603.18898 2026-03-19
AI LLM

Act While Thinking: Accelerating LLM Agents via Pattern-Aware Speculative Tool Execution

LLM-powered agents are emerging as a dominant paradigm for autonomous task solving. Unlike standard inference workloads, agents operate in a strictly serial "LLM-tool" loop, where the LLM must wait...

Yifan Sui, Han Zhao, Rui Ma, Zhiyuan He, Hao Wang, Jianxun Li, Yuqing Yang

2603.18897 2026-03-19
AI LLM

Translating MRI to PET through Conditional Diffusion Models with Enhanced Pathology Awareness

Positron emission tomography (PET) is a widely recognized technique for diagnosing neurodegenerative diseases, offering critical functional insights. However, its high costs and radiation exposure ...

Yitong Li, Igor Yakushev, Dennis M. Hedderich, Christian Wachinger

2603.18896 2026-03-19
AI LLM

From Accuracy to Readiness: Metrics and Benchmarks for Human-AI Decision-Making

Artificial intelligence (AI) systems are deployed as collaborators in human decision-making. Yet, evaluation practices focus primarily on model accuracy rather than whether human-AI teams are prepa...

Min Hun Lee

2603.18895 2026-03-19
AI LLM

I Can't Believe It's Corrupt: Evaluating Corruption in Multi-Agent Governance Systems

Large language models are increasingly proposed as autonomous agents for high-stakes public workflows, yet we lack systematic evidence about whether they would follow institutional rules when grant...

Vedanta S P, Ponnurangam Kumaraguru

2603.18894 2026-03-19