Papers

Research papers from arXiv and related sources

Total: 4694 (AI/LLM: 2583, Testing: 2111)
AI LLM

RECOVER: Robust Entity Correction via agentic Orchestration of hypothesis Variants for Evidence-based Recovery

Entity recognition in Automatic Speech Recognition (ASR) is challenging for rare and domain-specific terms. In domains such as finance, medicine, and air traffic control, these errors are costly. I...

Abhishek Kumar, Aashraya Sachdeva

2603.16411 2026-03-17
AI LLM

PlotTwist: A Creative Plot Generation Framework with Small Language Models

Creative plot generation presents a fundamental challenge for language models: transforming a concise premise into a coherent narrative that sustains global structure, character development, and em...

Abhinav Thorat, Ravi Kolla, Jyotin Goel, Niranjan Pedanekar

2603.16410 2026-03-17
AI LLM

Who Benchmarks the Benchmarks? A Case Study of LLM Evaluation in Icelandic

This paper evaluates current Large Language Model (LLM) benchmarking for Icelandic, identifies problems, and calls for improved evaluation methods in low/medium-resource languages in particular. We...

Finnur Ágúst Ingimundarson, Steinunn Rut Friðriksdóttir, Bjarki Ármannsson, Iris Edda Nowenstein,...

2603.16406 2026-03-17
AI LLM

Fanar 2.0: Arabic Generative AI Stack

We present Fanar 2.0, the second generation of Qatar's Arabic-centric Generative AI platform. Sovereignty is a first-class design principle: every component, from data pipelines to deployment infra...

FANAR TEAM, Ummar Abbas, Mohammad Shahmeer Ahmad, Minhaj Ahmad, Abdulaziz Al-Homaid, Anas Al-Nua...

2603.16397 2026-03-17
AI LLM

Rotated Robustness: A Training-Free Defense against Bit-Flip Attacks on Large Language Models

Hardware faults, specifically bit-flips in quantized weights, pose a severe reliability threat to Large Language Models (LLMs), often triggering catastrophic model collapses. We demonstrate that th...

Deng Liu, Song Chen

2603.16382 2026-03-17
AI LLM

InViC: Intent-aware Visual Cues for Medical Visual Question Answering

Medical visual question answering (Med-VQA) aims to answer clinically relevant questions grounded in medical images. However, existing multimodal large language models (MLLMs) often exhibit shortcu...

Zhisong Wang, Ziyang Chen, Zanting Ye, Hongze Zhu, Yefeng Zheng, Yong Xia

2603.16372 2026-03-17
AI LLM

FactorEngine: A Program-level Knowledge-Infused Factor Mining Framework for Quantitative Investment

We study alpha factor mining, the automated discovery of predictive signals from noisy, non-stationary market data, under a practical requirement that mined factors be directly executable and audita...

Qinhong Lin, Ruitao Feng, Yinglun Feng, Zhenxin Huang, Yukun Chen, Zhongliang Yang, Linna Zhou, B...

2603.16365 2026-03-17
TESTING

High-Precision Photometry with a scientific CMOS Camera: II On-Sky Testing of the Marana camera at the NGTS facility

Modern scientific CMOS cameras offer very fast readout speeds and low read noise. In this study, we evaluate the performance of the Andor Marana CMOS camera through on-sky testing carried out at th...

Ioannis Apergis, Daniel Bayliss, Paul Chote, James McCormac, Peter J. Wheatley, Morgan A. Mitchel...

2603.16361 2026-03-17
AI LLM

One Kiss: Emojis as Agents of Genre Flux in Generative Comics

Generative AI has made visual storytelling widely accessible, yet current prompt-based interactions often force users into a trade-off between precise control and creative flow. We present One Kiss...

Xiruo Wang, Xinyi Jiang, Ziqi Lyu

2603.16359 2026-03-17
AI LLM

Beyond Grading Accuracy: Exploring Alignment of TAs and LLMs

In this paper, we investigate the potential of open-source Large Language Models (LLMs) for grading Unified Modeling Language (UML) class diagrams. In contrast to existing work, which primarily eva...

Matthijs Jansen op de Haar, Nacir Bouali, Faizan Ahmed

2603.16357 2026-03-17
AI LLM

Toward Experimentation-as-a-Service in 5G/6G: The Plaza6G Prototype for AI-Assisted Trials

This paper presents Plaza6G, the first operational Experiment-as-a-Service (ExaS) platform unifying cloud resources with next-generation wireless infrastructure. Developed at CTTC in Barcelona, Pla...

Sergio Barrachina-Muñoz, Marc Carrascosa-Zamacois, Horacio Bleda, Umair Riaz, Yasir Maqsood, Xavi...

2603.16356 2026-03-17
AI LLM

PashtoCorp: A 1.25-Billion-Word Corpus, Evaluation Suite, and Reproducible Pipeline for Low-Resource Language Development

We present PashtoCorp, a 1.25-billion-word corpus for Pashto, a language spoken by 60 million people that remains severely underrepresented in NLP. The corpus is assembled from 39 sources spanning ...

Hanif Rahman

2603.16354 2026-03-17
AI LLM

Automated identification of Ichneumonoidea wasps via YOLO-based deep learning: Integrating HiresCam for Explainable AI

Accurate taxonomic identification of parasitoid wasps within the superfamily Ichneumonoidea is essential for biodiversity assessment, ecological monitoring, and biological control programs. However...

Joao Manoel Herrera Pinheiro, Gabriela Do Nascimento Herrera, Alvaro Doria Dos Santos, Luciana Bu...

2603.16351 2026-03-17
AI LLM

Prompts Blend Requirements and Solutions: From Intent to Implementation

AI coding assistants are reshaping software development by shifting focus from writing code to formulating prompts. In chat-focused approaches such as vibe coding, prompts become the primary arbite...

Shalini Chakraborty, Jan-Philipp Steghöfer

2603.16348 2026-03-17
TESTING

Decoding the Critique Mechanism in Large Reasoning Models

Large Reasoning Models (LRMs) exhibit backtracking and self-verification mechanisms that enable them to revise intermediate steps and reach correct solutions, yielding strong performance on complex...

Hoang Phan, Quang H. Nguyen, Hung T. Q. Le, Xiusi Chen, Heng Ji, Khoa D. Doan

2603.16331 2026-03-17
AI LLM

An Interpretable Machine Learning Framework for Non-Small Cell Lung Cancer Drug Response Analysis

Lung cancer is a condition where there is abnormal growth of malignant cells that spread in an uncontrollable fashion in the lungs. Some common treatment strategies are surgery, chemotherapy, and r...

Ann Rachel, Pranav M Pawar, Mithun Mukharjee, Raja M, Tojo Mathew

2603.16330 2026-03-17
TESTING

Parameter Optimization of Domain-Wall Fermion using Machine Learning

We study a parameter optimization of domain-wall fermions to improve chiral symmetry based on machine learning. Domain-wall fermions involve coefficients along the fifth dimension, which can be tre...

Shunsuke Yasunaga, Kenta Yoshimura, Akio Tomiya, Yuki Nagai

2603.16329 2026-03-17
AI LLM

A Human-Centred Architecture for Large Language Models-Cognitive Assistants in Manufacturing within Quality Management Systems

Large Language Models-Cognitive Assistants (LLM-CAs) can enhance Quality Management Systems (QMS) in manufacturing, fostering continuous process improvement and knowledge management. However, there...

Marcos Galdino, Johanna Grahl, Tobias Hamann, Anas Abdelrazeq, Ingrid Isenhardt

2603.16325 2026-03-17
AI LLM

Learning to Predict, Discover, and Reason in High-Dimensional Discrete Event Sequences

Electronic control units (ECUs) embedded within modern vehicles generate a large number of asynchronous events known as diagnostic trouble codes (DTCs). These discrete events form complex temporal ...

Hugo Math

2603.16313 2026-03-17
AI LLM

Omnilingual MT: Machine Translation for 1,600 Languages

High-quality machine translation (MT) can scale to hundreds of languages, setting a high bar for multilingual systems. However, compared to the world's 7,000 languages, current systems still offer ...

Omnilingual MT Team, Belen Alastruey, Niyati Bafna, Andrea Caciolai, Kevin Heffernan, Artyom Koz...

2603.16309 2026-03-17