Papers

Research papers from arXiv and related sources

Total: 4694, AI/LLM: 2583, Testing: 2111
TESTING

Artificial Intelligence for Climate Adaptation: Reinforcement Learning for Climate Change-Resilient Transport

Climate change is expected to intensify rainfall and, consequently, pluvial flooding, leading to increased disruptions in urban transportation systems over the coming decades. Designing effective a...

Miguel Costa, Arthur Vandervoort, Carolin Schmidt, João Miranda, Morten W. Petersen, Martin Drews...

2603.06278 2026-03-06
AI LLM

Story Point Estimation Using Large Language Models

This study investigates the use of large language models (LLMs) for story point estimation. Story points are unitless, project-specific effort estimates that help developers on the scrum team forec...

Pranam Prakash Shetty, Adarsh Balakrishnan, Mengqiao Xu, Xiaoyin Xi, Zhe Yu

2603.06276 2026-03-06
AI LLM

Stem: Rethinking Causal Information Flow in Sparse Attention

The quadratic computational complexity of self-attention remains a fundamental bottleneck for scaling Large Language Models (LLMs) to long contexts, particularly during the pre-filling phase. In th...

Lin Niu, Xin Luo, Linchuan Xie, Yifu Sun, Guanghua Yu, Jianchen Zhu, S Kevin Zhou

2603.06274 2026-03-06
AI LLM

Agentic retrieval-augmented reasoning reshapes collective reliability under model variability in radiology question answering

Agentic retrieval-augmented reasoning pipelines are increasingly used to structure how large language models (LLMs) incorporate external evidence in clinical decision support. These systems iterati...

Mina Farajiamiri, Jeta Sopa, Saba Afza, Lisa Adams, Felix Barajas Ordonez, Tri-Thien Nguyen, Mahs...

2603.06271 2026-03-06
AI LLM

Mind the Gap: Pitfalls of LLM Alignment with Asian Public Opinion

Large Language Models (LLMs) are increasingly being deployed in multilingual, multicultural settings, yet their reliance on predominantly English-centric training data risks misalignment with the d...

Hari Shankar, Vedanta S P, Sriharini Margapuri, Debjani Mazumder, Ponnurangam Kumaraguru, Abhijna...

2603.06264 2026-03-06
AI LLM

NOVA: Next-step Open-Vocabulary Autoregression for 3D Multi-Object Tracking in Autonomous Driving

Generalizing across unknown targets is critical for open-world perception, yet existing 3D Multi-Object Tracking (3D MOT) pipelines remain limited by closed-set assumptions and "semantic-blind" h...

Kai Luo, Xu Wang, Rui Fan, Kailun Yang

2603.06254 2026-03-06
TESTING

Skill-Adaptive Ghost Instructors: Enhancing Retention and Reducing Over-Reliance in VR Piano Learning

Motor-skill learning systems in XR rely on persistent cues. However, constant cueing can induce overreliance and erode memorization and skill transfer. We introduce a skill-adaptive, dynamically tr...

Tzu-Hsin Hsieh, Cassandra Michelle Stefanie Visser, Elmar Eisemann, Ricardo Marroquim

2603.06253 2026-03-06
AI LLM

MLLMRec-R1: Incentivizing Reasoning Capability in Large Language Models for Multimodal Sequential Recommendation

Group relative policy optimization (GRPO) has become a standard post-training paradigm for improving reasoning and preference alignment in large language models (LLMs), and has recently shown stron...

Yu Wang, Yonghui Yang, Le Wu, Jiancan Wu, Hefei Xu, Hui Lin

2603.06243 2026-03-06
AI LLM

Human, Algorithm, or Both? Gender Bias in Human-Augmented Recruiting

Recent years have seen rapid growth in the market for HR technology and AI-driven HR solutions in particular. This popularity has also resulted in increased attention to the negative aspects of usi...

Mesut Kaya, Toine Bogers

2603.06240 2026-03-06
AI LLM

What are AI researchers worried about?

As AI attracts vast investment and attention, there are competing concerns about the technology's opportunities and uncertainties that blend technical and social questions. The public debate, domin...

Cian O'Donovan, Sarp Gurakan, Ananya Karanam, Xiaomeng Wu, Jack Stilgoe

2603.06223 2026-03-06
AI LLM

Conversational Demand Response: Bidirectional Aggregator-Prosumer Coordination through Agentic AI

Residential demand response depends on sustained prosumer participation, yet existing coordination is either fully automated, or limited to one-way dispatch signals and price alerts that offer litt...

Reda El Makroum, Sebastian Zwickl-Bernhard, Lukas Kranzl, Hans Auer

2603.06217 2026-03-06
AI LLM

LIT-RAGBench: Benchmarking Generator Capabilities of Large Language Models in Retrieval-Augmented Generation

Retrieval-Augmented Generation (RAG) is a framework in which a Generator, such as a Large Language Model (LLM), produces answers by retrieving documents from an external collection using a Retrieve...

Koki Itai, Shunichi Hasegawa, Yuta Yamamoto, Gouki Minegishi, Masaki Otsuki

2603.06198 2026-03-06
AI LLM

Wisdom of the AI Crowd (AI-CROWD) for Ground Truth Approximation in Content Analysis: A Research Protocol & Validation Using Eleven Large Language Models

Large-scale content analysis is increasingly limited by the absence of observable ground truth or gold-standard labels, as creating such benchmarks through extensive human coding becomes impractica...

Luis de-Marcos, Manuel Goyanes, Adrián Domínguez-Díaz

2603.06197 2026-03-06
AI LLM

Transformer-Based Pulse Shape Discrimination in HPGe Detectors with Masked Autoencoder Pre-training

Pulse-shape discrimination (PSD) in high-purity germanium (HPGe) detectors is central to rare-event searches such as neutrinoless double-beta decay (0νββ), yet conventional approaches compress each...

Marta Babicz, Saúl Alonso-Monsalve, Alain Fauquex, Laura Baudis

2603.06192 2026-03-06
AI LLM

CRIMSON: A Clinically-Grounded LLM-Based Metric for Generative Radiology Report Evaluation

We introduce CRIMSON, a clinically grounded evaluation framework for chest X-ray report generation that assesses reports based on diagnostic correctness, contextual relevance, and patient safety. U...

Mohammed Baharoon, Thibault Heintz, Siavash Raissi, Mahmoud Alabbad, Mona Alhammad, Hassan AlOmai...

2603.06183 2026-03-06
AI LLM

Towards Motion Turing Test: Evaluating Human-Likeness in Humanoid Robots

Humanoid robots have achieved significant progress in motion generation and control, exhibiting movements that appear increasingly natural and human-like. Inspired by the Turing Test, we propose th...

Mingzhe Li, Mengyin Liu, Zekai Wu, Xincheng Lin, Junsheng Zhang, Ming Yan, Zengye Xie, Changwang ...

2603.06181 2026-03-06
TESTING

Reflective Flow Sampling Enhancement

The growing demand for text-to-image generation has led to rapid advances in generative modeling. Recently, text-to-image diffusion models trained with flow matching algorithms, such as FLUX, have ...

Zikai Zhou, Muyao Wang, Shitong Shao, Lichen Bai, Haoyi Xiong, Bo Han, Zeke Xie

2603.06165 2026-03-06
TESTING

Do Compact SSL Backbones Matter for Audio Deepfake Detection? A Controlled Study with RAPTOR

Self-supervised learning (SSL) underpins modern audio deepfake detection, yet most prior work centers on a single large wav2vec2-XLSR backbone, leaving compact backbones understudied. We present RAPTOR, Rep...

Ajinkya Kulkarni, Sandipana Dowerah, Atharva Kulkarni, Tanel Alumäe, Mathew Magimai Doss

2603.06164 2026-03-06
TESTING

Homogeneous Border Bases on Infinite Order Ideals

Border bases are traditionally restricted to 0-dimensional ideals due to the finiteness of the underlying order ideal. In this paper we extend the theory to homogeneous ideals of positive Krull dim...

Cristina Bertone, Sofia Bovero

2603.06155 2026-03-06
TESTING

Machine Learning Based Mesh Movement for Non-Hydrostatic Tsunami Simulation

This study investigates the use of machine learning based mesh adaptivity, specifically mesh movement methods (UM2N), with depth integrated non-hydrostatic shallow water models. Motivation for this...

Yezhang Li, Stephan C. Kramer, Matthew D. Piggott

2603.06152 2026-03-06