Papers
Research papers from arXiv and related sources
Machine-Generated, Machine-Checked Proofs for a Verified Compiler (Experience Report)
We report on using an agentic coding assistant (Claude Code, powered by Claude Opus 4.6) to mechanize a substantial Rocq correctness proof from scratch, with human guidance but without human proof ...
Zoe Paraskevopoulou
The Digital Gorilla: Rebalancing Power in the Age of AI
Contemporary artificial intelligence (AI) policy suffers from a basic categorical error. Existing frameworks rely on analogizing AI to inherited technology types -- such as products, platforms, or ...
M. Alejandra Parra-Orlandoni, Roxanne A. Schnyder, Christopher J. Mallet
HeatPrompt: Zero-Shot Vision-Language Modeling of Urban Heat Demand from Satellite Images
Accurate heat-demand maps play a crucial role in decarbonizing space heating, yet most municipalities lack detailed building-level data needed to calculate them. We introduce HeatPrompt, a zero-sho...
Kundan Thota, Xuanhao Mu, Thorsten Schlachter, Veit Hagenmeyer
Multilingual Large Language Models do not comprehend all natural languages to equal degrees
Large Language Models (LLMs) play a critical role in how humans access information. While their core use relies on comprehending written requests, our understanding of this ability is currently lim...
Natalia Moskvina, Raquel Montero, Masaya Yoshida, Ferdy Hubers, Paolo Morosi, Walid Irhaymi, Jin ...
The LLMbda Calculus: AI Agents, Conversations, and Information Flow
A conversation with a large language model (LLM) is a sequence of prompts and responses, with each response generated from the preceding conversation. AI agents build such conversations automatical...
Zac Garby, Andrew D. Gordon, David Sands
Can You Tell It's AI? Human Perception of Synthetic Voices in Vishing Scenarios
Large Language Models and commercial speech synthesis systems now enable highly realistic AI-generated voice scams (vishing), raising urgent concerns about deception at scale. Yet it remains unclea...
Zoha Hayat Bhatti, Bakhtawar Ahtisham, Seemal Tausif, Niklas George, Nida ul Habib Bajwa, Mobin J...
Interaction Theater: A case of LLM Agents Interacting at Scale
As multi-agent architectures and agent-to-agent protocols proliferate, a fundamental question arises: what actually happens when autonomous LLM agents interact at scale? We study this question empi...
Sarath Shekkizhar, Adam Earle
To Move or Not to Move: Constraint-based Planning Enables Zero-Shot Generalization for Interactive Navigation
Visual navigation typically assumes the existence of at least one obstacle-free path between start and goal, which must be discovered/planned by the robot. However, in real-world scenarios, such as...
Apoorva Vashisth, Manav Kulshrestha, Pranav Bakshi, Damon Conover, Guillaume Sartoretti, Aniket Bera
Entropy in Large Language Models
In this study, the output of large language models (LLMs) is considered an information source generating an unlimited sequence of symbols drawn from a finite alphabet. Given the probabilistic nature...
Marco Scharringhausen
CodeCompass: Navigating the Navigation Paradox in Agentic Code Intelligence
Modern code intelligence agents operate in contexts exceeding 1 million tokens--far beyond the scale where humans manually locate relevant files. Yet agents consistently fail to discover architectu...
Tarakanath Paipuru
Let There Be Claws: An Early Social Network Analysis of AI Agents on Moltbook
Within twelve days of launch, an AI-native social platform exhibits extreme attention concentration, hierarchical role separation, and one-way attention flow, consistent with the hypothesis that st...
H. C. W. Price, H. AlMuhanna, P. M. Bassani, M. Ho, T. S. Evans
AgenticSum: An Agentic Inference-Time Framework for Faithful Clinical Text Summarization
Large language models (LLMs) offer substantial promise for automating clinical text summarization, yet maintaining factual consistency remains challenging due to the length, noise, and heterogeneit...
Fahmida Liza Piya, Rahmatollah Beheshti
Latent Introspection: Models Can Detect Prior Concept Injections
We uncover a latent capacity for introspection in a Qwen 32B model, demonstrating that the model can detect when concepts have been injected into its earlier context and identify which concept was ...
Theia Pearson-Vogel, Martin Vanek, Raymond Douglas, Jan Kulveit
Agents of Chaos
We report an exploratory red-teaming study of autonomous language-model-powered agents deployed in a live laboratory environment with persistent memory, email accounts, Discord access, file systems...
Natalie Shapira, Chris Wendler, Avery Yen, Gabriele Sarti, Koyena Pal, Olivia Floody, Adam Belfki...
Protecting and Promoting Human Agency in Education in the Age of Artificial Intelligence
Human agency is crucial in education and increasingly challenged by the use of generative AI. This meeting report synthesizes interdisciplinary insights and conceptualizes four aspects that delinea...
Olga Viberg, Mutlu Cukurova, Rene F. Kizilcec, Simon Buckingham Shum, Dorottya Demszky, Dragan Ga...
SongEcho: Towards Cover Song Generation via Instance-Adaptive Element-wise Linear Modulation
Cover songs constitute a vital aspect of musical culture, preserving the core melody of an original composition while reinterpreting it to infuse novel emotional depth and thematic emphasis. Althou...
Sifei Li, Yang Li, Zizhou Wang, Yuxin Zhang, Fuzhang Wu, Oliver Deussen, Tong-Yee Lee, Weiming Dong
RL-RIG: A Generative Spatial Reasoner via Intrinsic Reflection
Recent advancements in image generation have achieved impressive results in producing high-quality images. However, existing image generation models still generally struggle with a spatial reasonin...
Tianyu Wang, Zhiyuan Ma, Qian Wang, Xinyi Zhang, Xinwei Long, Bowen Zhou
ReAttn: Improving Attention-based Re-ranking via Attention Re-weighting
The strong capabilities of recent Large Language Models (LLMs) have made them highly effective for zero-shot re-ranking tasks. Attention-based re-ranking methods, which derive relevance scores direc...
Yuxing Tian, Fengran Mo, Weixu Zhang, Yiyan Qi, Jian-Yun Nie
Probabilistic Photonic Computing
Probabilistic computing excels in approximating combinatorial problems and modelling uncertainty. However, using conventional deterministic hardware for probabilistic models is challenging: (pseudo...
Frank Brückerhoff-Plückelmann, Anna P. Ovvyan, Akhil Varri, Hendrik Borras, Bernhard Klein, C. Da...
Multidimensional photonic computing
The rapidly increasing demands for computational throughput, bandwidth, and memory capacity fueled by breakthroughs in machine learning pose substantial challenges for conventional electronic compu...
Ivonne Bente, Shabnam Taheriniya, Francesco Lenzini, Frank Brückerhoff-Plückelmann, Michael Kues,...