Papers

Research papers from arXiv and related sources

Total: 4513, AI/LLM: 2483, Testing: 2030
AI LLM

Machine-Generated, Machine-Checked Proofs for a Verified Compiler (Experience Report)

We report on using an agentic coding assistant (Claude Code, powered by Claude Opus 4.6) to mechanize a substantial Rocq correctness proof from scratch, with human guidance but without human proof ...

Zoe Paraskevopoulou

2602.20082 2026-02-23
AI LLM

The Digital Gorilla: Rebalancing Power in the Age of AI

Contemporary artificial intelligence (AI) policy suffers from a basic categorical error. Existing frameworks rely on analogizing AI to inherited technology types -- such as products, platforms, or ...

M. Alejandra Parra-Orlandoni, Roxanne A. Schnyder, Christopher J. Mallet

2602.20080 2026-02-23
AI LLM

HeatPrompt: Zero-Shot Vision-Language Modeling of Urban Heat Demand from Satellite Images

Accurate heat-demand maps play a crucial role in decarbonizing space heating, yet most municipalities lack detailed building-level data needed to calculate them. We introduce HeatPrompt, a zero-sho...

Kundan Thota, Xuanhao Mu, Thorsten Schlachter, Veit Hagenmeyer

2602.20066 2026-02-23
AI LLM

Multilingual Large Language Models do not comprehend all natural languages to equal degrees

Large Language Models (LLMs) play a critical role in how humans access information. While their core use relies on comprehending written requests, our understanding of this ability is currently lim...

Natalia Moskvina, Raquel Montero, Masaya Yoshida, Ferdy Hubers, Paolo Morosi, Walid Irhaymi, Jin ...

2602.20065 2026-02-23
AI LLM

The LLMbda Calculus: AI Agents, Conversations, and Information Flow

A conversation with a large language model (LLM) is a sequence of prompts and responses, with each response generated from the preceding conversation. AI agents build such conversations automatical...

Zac Garby, Andrew D. Gordon, David Sands

2602.20064 2026-02-23
AI LLM

Can You Tell It's AI? Human Perception of Synthetic Voices in Vishing Scenarios

Large Language Models and commercial speech synthesis systems now enable highly realistic AI-generated voice scams (vishing), raising urgent concerns about deception at scale. Yet it remains unclea...

Zoha Hayat Bhatti, Bakhtawar Ahtisham, Seemal Tausif, Niklas George, Nida ul Habib Bajwa, Mobin J...

2602.20061 2026-02-23
AI LLM

Interaction Theater: A case of LLM Agents Interacting at Scale

As multi-agent architectures and agent-to-agent protocols proliferate, a fundamental question arises: what actually happens when autonomous LLM agents interact at scale? We study this question empi...

Sarath Shekkizhar, Adam Earle

2602.20059 2026-02-23
AI LLM

To Move or Not to Move: Constraint-based Planning Enables Zero-Shot Generalization for Interactive Navigation

Visual navigation typically assumes the existence of at least one obstacle-free path between start and goal, which must be discovered/planned by the robot. However, in real-world scenarios, such as...

Apoorva Vashisth, Manav Kulshrestha, Pranav Bakshi, Damon Conover, Guillaume Sartoretti, Aniket Bera

2602.20055 2026-02-23
AI LLM

Entropy in Large Language Models

In this study, the output of large language models (LLMs) is considered an information source generating an unlimited sequence of symbols drawn from a finite alphabet. Given the probabilistic nature...

Marco Scharringhausen

2602.20052 2026-02-23
AI LLM

CodeCompass: Navigating the Navigation Paradox in Agentic Code Intelligence

Modern code intelligence agents operate in contexts exceeding 1 million tokens--far beyond the scale where humans manually locate relevant files. Yet agents consistently fail to discover architectu...

Tarakanath Paipuru

2602.20048 2026-02-23
AI LLM

Let There Be Claws: An Early Social Network Analysis of AI Agents on Moltbook

Within twelve days of launch, an AI-native social platform exhibits extreme attention concentration, hierarchical role separation, and one-way attention flow, consistent with the hypothesis that st...

H. C. W. Price, H. AlMuhanna, P. M. Bassani, M. Ho, T. S. Evans

2602.20044 2026-02-23
AI LLM

AgenticSum: An Agentic Inference-Time Framework for Faithful Clinical Text Summarization

Large language models (LLMs) offer substantial promise for automating clinical text summarization, yet maintaining factual consistency remains challenging due to the length, noise, and heterogeneit...

Fahmida Liza Piya, Rahmatollah Beheshti

2602.20040 2026-02-23
AI LLM

Latent Introspection: Models Can Detect Prior Concept Injections

We uncover a latent capacity for introspection in a Qwen 32B model, demonstrating that the model can detect when concepts have been injected into its earlier context and identify which concept was ...

Theia Pearson-Vogel, Martin Vanek, Raymond Douglas, Jan Kulveit

2602.20031 2026-02-23
AI LLM

Agents of Chaos

We report an exploratory red-teaming study of autonomous language-model-powered agents deployed in a live laboratory environment with persistent memory, email accounts, Discord access, file systems...

Natalie Shapira, Chris Wendler, Avery Yen, Gabriele Sarti, Koyena Pal, Olivia Floody, Adam Belfki...

2602.20021 2026-02-23
AI LLM

Protecting and Promoting Human Agency in Education in the Age of Artificial Intelligence

Human agency is crucial in education and increasingly challenged by the use of generative AI. This meeting report synthesizes interdisciplinary insights and conceptualizes four aspects that delinea...

Olga Viberg, Mutlu Cukurova, Rene F. Kizilcec, Simon Buckingham Shum, Dorottya Demszky, Dragan Ga...

2602.20014 2026-02-23
AI LLM

SongEcho: Towards Cover Song Generation via Instance-Adaptive Element-wise Linear Modulation

Cover songs constitute a vital aspect of musical culture, preserving the core melody of an original composition while reinterpreting it to infuse novel emotional depth and thematic emphasis. Althou...

Sifei Li, Yang Li, Zizhou Wang, Yuxin Zhang, Fuzhang Wu, Oliver Deussen, Tong-Yee Lee, Weiming Dong

2602.19976 2026-02-23
AI LLM

RL-RIG: A Generative Spatial Reasoner via Intrinsic Reflection

Recent advancements in image generation have achieved impressive results in producing high-quality images. However, existing image generation models still generally struggle with a spatial reasonin...

Tianyu Wang, Zhiyuan Ma, Qian Wang, Xinyi Zhang, Xinwei Long, Bowen Zhou

2602.19974 2026-02-23
AI LLM

ReAttn: Improving Attention-based Re-ranking via Attention Re-weighting

The strong capabilities of recent Large Language Models (LLMs) have made them highly effective for zero-shot re-ranking tasks. Attention-based re-ranking methods, which derive relevance scores direc...

Yuxing Tian, Fengran Mo, Weixu Zhang, Yiyan Qi, Jian-Yun Nie

2602.19969 2026-02-23
AI LLM

Probabilistic Photonic Computing

Probabilistic computing excels in approximating combinatorial problems and modelling uncertainty. However, using conventional deterministic hardware for probabilistic models is challenging: (pseudo...

Frank Brückerhoff-Plückelmann, Anna P. Ovvyan, Akhil Varri, Hendrik Borras, Bernhard Klein, C. Da...

2602.19968 2026-02-23
AI LLM

Multidimensional photonic computing

The rapidly increasing demands for computational throughput, bandwidth, and memory capacity fueled by breakthroughs in machine learning pose substantial challenges for conventional electronic compu...

Ivonne Bente, Shabnam Taheriniya, Francesco Lenzini, Frank Brückerhoff-Plückelmann, Michael Kues,...

2602.19957 2026-02-23