Research

Papers

Research papers from arXiv and related sources

Total: 4513 | AI/LLM: 2483 | Testing: 2030
AI LLM

Switching-Reference Voltage Control for Distribution Systems with AI-Training Data Centers

Large-scale AI training workloads in modern data centers exhibit rapid and periodic power fluctuations, which may induce significant voltage deviations in power distribution systems. Existing volta...

Mingyuan Yan, Trager Joswig-Jones, Baosen Zhang, Yize Chen, Wenqi Cui

2603.15588 2026-03-16
AI LLM

Grounding World Simulation Models in a Real-World Metropolis

What if a world simulation model could render not an imagined environment but a city that actually exists? Prior generative world models synthesize visually plausible yet artificial environments by...

Junyoung Seo, Hyunwook Choi, Minkyung Kwon, Jinhyeok Choi, Siyoon Jin, Gayoung Lee, Junho Kim, Jo...

2603.15583 2026-03-16
AI LLM

Mamba-3: Improved Sequence Modeling using State Space Principles

Scaling inference-time compute has emerged as an important driver of LLM performance, making inference efficiency a central focus of model design alongside model quality. While the current Transfor...

Aakash Lahoti, Kevin Y. Li, Berlin Chen, Caitlin Wang, Aviv Bick, J. Zico Kolter, Tri Dao, Albert Gu

2603.15569 2026-03-16
AI LLM

Lore: Repurposing Git Commit Messages as a Structured Knowledge Protocol for AI Coding Agents

As AI coding agents become both primary producers and consumers of source code, the software industry faces an accelerating loss of institutional knowledge. Each commit captures a code diff but dis...

Ivan Stetsenko

2603.15566 2026-03-16
AI LLM

The PokeAgent Challenge: Competitive and Long-Context Learning at Scale

We present the PokeAgent Challenge, a large-scale benchmark for decision-making research built on Pokemon's multi-agent battle system and expansive role-playing game (RPG) environment. Partial obse...

Seth Karten, Jake Grigsby, Tersoo Upaa, Junik Bae, Seonghun Hong, Hyunyoung Jeong, Jaeyoon Jung, ...

2603.15563 2026-03-16
AI LLM

Panoramic Affordance Prediction

Affordance prediction serves as a critical bridge between perception and action in embodied AI. However, existing research is confined to pinhole camera models, which suffer from narrow Fields of V...

Zixin Zhang, Chenfei Liao, Hongfei Zhang, Harold Haodong Chen, Kanghao Chen, Zichen Wen, Litao Gu...

2603.15558 2026-03-16
AI LLM

Anatomy of a Lie: A Multi-Stage Diagnostic Framework for Tracing Hallucinations in Vision-Language Models

Vision-Language Models (VLMs) frequently "hallucinate" - generate plausible yet factually incorrect statements - posing a critical barrier to their trustworthy deployment. In this work, we propose ...

Lexiang Xiong, Qi Li, Jingwen Ye, Xinchao Wang

2603.15557 2026-03-16
AI LLM

Can LLMs Model Incorrect Student Reasoning? A Case Study on Distractor Generation

Modeling plausible student misconceptions is critical for AI in education. In this work, we examine how large language models (LLMs) reason about misconceptions when generating multiple-choice dist...

Yanick Zengaffinen, Andreas Opedal, Donya Rooein, Kv Aditya Srivatsa, Shashank Sonkar, Mrinmaya S...

2603.15547 2026-03-16
AI LLM

Kimodo: Scaling Controllable Human Motion Generation

High-quality human motion data is becoming increasingly important for applications in robotics, simulation, and entertainment. Recent generative models offer a potential data source, enabling human...

Davis Rempe, Mathis Petrovich, Ye Yuan, Haotian Zhang, Xue Bin Peng, Yifeng Jiang, Tingwu Wang, U...

2603.15546 2026-03-16
AI LLM

InterveneBench: Benchmarking LLMs for Intervention Reasoning and Causal Study Design in Real Social Systems

Causal inference in social science relies on end-to-end, intervention-centered research-design reasoning grounded in real-world policy interventions, but current benchmarks fail to evaluate this ca...

Shaojie Shi, Zhengyu Shi, Lingran Zheng, Xinyu Su, Anna Xie, Bohao Lv, Rui Xu, Zijian Chen, Zhich...

2603.15542 2026-03-16
AI LLM

QiboAgent: a practitioner's guideline to open source assistants for Quantum Computing code development

We introduce QiboAgent, a reference implementation designed to serve as a practitioner's guideline for developing specialized coding assistants in Quantum Computing middleware. Addressing the limit...

Lorenzo Esposito, Andrea Papaluca, Stefano Carrazza

2603.15538 2026-03-16
AI LLM

DUET: Disaggregated Hybrid Mamba-Transformer LLMs with Prefill and Decode-Specific Packages

Large language models operate in distinct compute-bound prefill followed by memory bandwidth-bound decode phases. Hybrid Mamba-Transformer models inherit this asymmetry while adding state space mod...

Alish Kanani, Sangwan Lee, Han Lyu, Jiahao Lin, Jaehyun Park, Umit Y. Ogras

2603.15530 2026-03-16
AI LLM

Are Dilemmas and Conflicts in LLM Alignment Solvable? A View from Priority Graph

As Large Language Models (LLMs) become more powerful and autonomous, they increasingly face conflicts and dilemmas in many scenarios. We first summarize and taxonomize these diverse conflicts. Then...

Zhenheng Tang, Xiang Liu, Qian Wang, Eunsol Choi, Bo Li, Xiaowen Chu

2603.15527 2026-03-16
AI LLM

Clinically Aware Synthetic Image Generation for Concept Coverage in Chest X-ray Models

The clinical deployment of AI diagnostic models demands more than benchmark accuracy - it demands robustness across the full spectrum of disease presentations. However, publicly available chest rad...

Amy Rafferty, Rishi Ramaesh, Ajitha Rajan

2603.15525 2026-03-16
AI LLM

SlovKE: A Large-Scale Dataset and LLM Evaluation for Slovak Keyphrase Extraction

Keyphrase extraction for morphologically rich, low-resource languages remains understudied, largely due to the scarcity of suitable evaluation datasets. We address this gap for Slovak by constructi...

David Števaňák, Marek Šuppa

2603.15523 2026-03-16
AI LLM

Beyond the Covariance Trap: Unlocking Generalization in Same-Subject Knowledge Editing for Large Language Models

While locate-then-edit knowledge editing efficiently updates knowledge encoded within Large Language Models (LLMs), a critical generalization failure mode emerges in the practical same-subject know...

Xiyu Liu, Qingyi Si, Zhengxiao Liu, Chenxu Yang, Naibin Gu, Zheng Lin

2603.15518 2026-03-16
AI LLM

ViX-Ray: A Vietnamese Chest X-Ray Dataset for Vision-Language Models

Vietnamese medical research has become an increasingly vital domain, particularly with the rise of intelligent technologies aimed at reducing time and resource burdens in clinical diagnosis. Recent...

Duy Vu Minh Nguyen, Chinh Thanh Truong, Phuc Hoang Tran, Hung Tuan Le, Nguyen Van-Thanh Dat, Trun...

2603.15513 2026-03-16
AI LLM

Not All Invariants Are Equal: Curating Training Data to Accelerate Program Verification with SLMs

The synthesis of inductive loop invariants is a critical bottleneck in automated program verification. While Large Language Models (LLMs) show promise in mitigating this issue, they often fail on h...

Ido Pinto, Yizhak Yisrael Elboher, Haoze Wu, Nina Narodytska, Guy Katz

2603.15510 2026-03-16
AI LLM

Seeking SOTA: Time-Series Forecasting Must Adopt Taxonomy-Specific Evaluation to Dispel Illusory Gains

We argue that the current practice of evaluating AI/ML time-series forecasting models, predominantly on benchmarks characterized by strong, persistent periodicities and seasonalities, obscures real...

Raeid Saqur, Christoph Bergmeir, Blanka Horvath, Daniel Schmidt, Frank Rudzicz, Terry Lyons

2603.15506 2026-03-16
AI LLM

Understanding Reasoning in LLMs through Strategic Information Allocation under Uncertainty

LLMs often exhibit Aha moments during reasoning, such as apparent self-correction following tokens like "Wait," yet their underlying mechanisms remain unclear. We introduce an information-theoretic...

Jeonghye Kim, Xufang Luo, Minbeom Kim, Sangmook Lee, Dongsheng Li, Yuqing Yang

2603.15500 2026-03-16