Papers

Research papers from arXiv and related sources

Total: 4513 | AI/LLM: 2483 | Testing: 2030
AI LLM

Current LLMs still cannot 'talk much' about grammar modules: Evidence from syntax

We aim to examine the extent to which Large Language Models (LLMs) can 'talk much' about grammar modules, providing evidence from syntax core properties translated by ChatGPT into Arabic. We collec...

Mohammed Q. Shormani

2603.20114 2026-03-20
TESTING

Cislunar State and Uncertainty Propagation via the Modified Generalized Equinoctial Orbital Elements

The complex cislunar dynamical environment poses challenges for spacecraft navigation and Space Domain Awareness (SDA) operations, where the knowledge of current and future spacecraft states is ess...

Maaninee Gupta, Kyle J. DeMars

2603.20110 2026-03-20
AI LLM

GO-GenZip: Goal-Oriented Generative Sampling and Hybrid Compression

Current network data telemetry pipelines consist of massive streams of fine-grained Key Performance Indicators (KPIs) from multiple distributed sources towards central aggregators, making data stor...

Pietro Talli, Qi Liao, Alessandro Lieto, Parijat Bhattacharjee, Federico Chiariotti, Andrea Zanella

2603.20109 2026-03-20
TESTING

Trojan horse hunt in deep forecasting models: Insights from the European Space Agency competition

Forecasting plays a crucial role in modern safety-critical applications, such as space operations. However, the increasing use of deep forecasting models introduces a new security risk of trojan ho...

Krzysztof Kotowski, Ramez Shendy, Jakub Nalepa, Agata Kaczmarek, Dawid Płudowski, Piotr Wilczyńsk...

2603.20108 2026-03-20
TESTING

Sharing The Secret: Distributed Privacy-Preserving Monitoring

In traditional runtime verification, a system is typically observed by a monolithic monitor. Enforcing privacy in such settings is computationally expensive, as it necessitates heavy cryptographic ...

Mahyar Karimi, K. S. Thejaswini, Roderick Bloem, Thomas A. Henzinger

2603.20107 2026-03-20
AI LLM

The $\mathbf{Y}$-Combinator for LLMs: Solving Long-Context Rot with $λ$-Calculus

LLMs are increasingly used as general-purpose reasoners, but long inputs remain bottlenecked by a fixed context window. Recursive Language Models (RLMs) address this by externalising the prompt and...

Amartya Roy, Rasul Tutunov, Xiaotong Ji, Matthieu Zimmer, Haitham Bou-Ammar

2603.20105 2026-03-20
AI LLM

Pitfalls in Evaluating Interpretability Agents

Automated interpretability systems aim to reduce the need for human labor and scale analysis to increasingly large models and diverse tasks. Recent efforts toward this goal leverage large language ...

Tal Haklay, Nikhil Prakash, Sana Pandey, Antonio Torralba, Aaron Mueller, Jacob Andreas, Tamar Ro...

2603.20101 2026-03-20
AI LLM

LLM-Enhanced Semantic Data Integration of Electronic Component Qualifications in the Aerospace Domain

Large manufacturing companies face challenges in information retrieval due to data silos maintained by different departments, leading to inconsistencies and misalignment across databases. This pape...

Antonio De Santis, Marco Balduini, Matteo Belcao, Andrea Proia, Marco Brambilla, Emanuele Della V...

2603.20094 2026-03-20
AI LLM

Beyond Accuracy: Towards a Robust Evaluation Methodology for AI Systems for Language Education

The rapid adoption of large language models in AI-powered language education has created an urgent need for evaluations that assess pedagogical effectiveness, particularly in language learning--one...

James Edgell, Wm. Matthew Kennedy, Isaac Pattis, Ben Knight, Danielle Carvalho, Elizabeth Wonnacott

2603.20088 2026-03-20
TESTING

Inference in high-dimensional logistic regression under tensor network dependence

We investigate the problem of statistical inference for logistic regression with high-dimensional covariates in settings where dependence among individuals is induced by an underlying Markov random...

Josh Miles, Sohom Bhattacharya

2603.20082 2026-03-20
AI LLM

Agentic Harness for Real-World Compilers

Compilers are critical to modern computing, yet fixing compiler bugs is difficult. While recent large language model (LLM) advancements enable automated bug repair, compiler bugs pose unique challe...

Yingwei Zheng, Cong Li, Shaohua Li, Yuqun Zhang, Zhendong Su

2603.20075 2026-03-20
AI LLM

The End of Rented Discovery: How AI Search Redistributes Power Between Hotels and Intermediaries

When a traveler asks an AI search engine to recommend a hotel, which sources get cited -- and does query framing matter? We audit 1,357 grounding citations from Google Gemini across 156 hotel queri...

Peiying Zhu, Sidi Chang

2603.20062 2026-03-20
AI LLM

From School AI Readiness to Student AI Literacy: A National Multilevel Mediation Analysis of Institutional Capacity and Teacher Capability

Artificial intelligence (AI) is increasingly embedded in vocational education systems, yet empirical evidence linking institutional AI readiness to student learning outcomes remains limited. This s...

Xiu Guan, Mingmin Zheng, Dragan Gašević, Wenxin Guo, Yingqun Liu, Xibin Han, Danijela Gasevic, Ru...

2603.20056 2026-03-20
TESTING

Feasible Deviations from Unitarity with Vector-Like Quark Singlets

We deduce pertinent relations between the elements of the CKM matrix, and find that not all of these are totally compatible with experiment and/or the assumption of the $3 \times 3$ unitarity. We i...

Francisco Albergaria, Francisco J. Botella, G. C. Branco, José Filipe Bastos, J. I. Silva-Marcos

2603.20047 2026-03-20
AI LLM

Experience is the Best Teacher: Motivating Effective Exploration in Reinforcement Learning for LLMs

Reinforcement Learning (RL) with rubric-based rewards has recently shown remarkable progress in enhancing general reasoning capabilities of Large Language Models (LLMs), yet still suffers from inef...

Wenjian Zhang, Kongcheng Zhang, Jiaxin Qi, Baisheng Lai, Jianqiang Huang

2603.20046 2026-03-20
AI LLM

LoASR-Bench: Evaluating Large Speech Language Models on Low-Resource Automatic Speech Recognition Across Language Families

Large language models (LLMs) have driven substantial advances in speech language models (SpeechLMs), yielding strong performance in automatic speech recognition (ASR) under high-resource conditions...

Jianan Chen, Xiaoxue Gao, Tatsuya Kawahara, Nancy F. Chen

2603.20042 2026-03-20
TESTING

CoverageBench: Evaluating Information Coverage across Tasks and Domains

We wish to measure the information coverage of an ad hoc retrieval algorithm, that is, how much of the range of available relevant information is covered by the search results. Information coverage...

Saron Samuel, Andrew Yates, Dawn Lawrie, Ian Soboroff, Trevor Adriaanse, Benjamin Van Durme, Euge...

2603.20034 2026-03-20
AI LLM

Orchestrating Human-AI Software Delivery: A Retrospective Longitudinal Field Study of Three Software Modernization Programs

Evidence on AI in software engineering still leans heavily toward individual task completion, while evidence on team-level delivery remains scarce. We report a retrospective longitudinal field stud...

Maximiliano Armesto, Christophe Kolb

2603.20028 2026-03-20
AI LLM

Detached Skip-Links and $R$-Probe: Decoupling Feature Aggregation from Gradient Propagation for MLLM OCR

Multimodal large language models (MLLMs) excel at high-level reasoning yet fail on OCR tasks where fine-grained visual details are compromised or misaligned. We identify an overlooked optimization ...

Ziye Yuan, Ruchang Yao, Chengxin Zheng, Yusheng Zhao, Daxiang Dong, Ming Zhang

2603.20020 2026-03-20
AI LLM

RouterKGQA: Specialized--General Model Routing for Constraint-Aware Knowledge Graph Question Answering

Knowledge graph question answering (KGQA) is a promising approach for mitigating LLM hallucination by grounding reasoning in structured and verifiable knowledge graphs. Existing approaches fall int...

Bo Yuan, Hexuan Deng, Xuebo Liu, Min Zhang

2603.20017 2026-03-20