Research

Papers

Research papers from arXiv and related sources

Total: 4694 | AI/LLM: 2583 | Testing: 2111
AI LLM

When Fine-Tuning Fails and when it Generalises: Role of Data Diversity and Mixed Training in LLM-based TTS

Large language models are increasingly adopted as semantic backbones for neural text-to-speech systems. However, frozen LLM representations are insufficient for modeling speaker-specific acoustic a...

Anupam Purwar, Aditya Choudhary

2603.10904 2026-03-11
AI LLM

LookaheadKV: Fast and Accurate KV Cache Eviction by Glimpsing into the Future without Generation

Transformer-based large language models (LLMs) rely on key-value (KV) caching to avoid redundant computation during autoregressive inference. While this mechanism greatly improves efficiency, the c...

Jinwoo Ahn, Ingyu Seong, Akhil Kedia, Junhan Kim, Hyemi Jang, Kangwook Lee, Yongkweon Jeon

2603.10899 2026-03-11
AI LLM

A Hybrid Knowledge-Grounded Framework for Safety and Traceability in Prescription Verification

Medication errors pose a significant threat to patient safety, making pharmacist verification (PV) a critical, yet heavily burdened, final safeguard. The direct application of Large Language Models...

Yichi Zhu, Kan Ling, Xu Liu, Hengrun Zhang, Huiqun Yu, Guisheng Fan

2603.10891 2026-03-11
AI LLM

Dynamics-Predictive Sampling for Active RL Finetuning of Large Reasoning Models

Reinforcement learning (RL) finetuning has become a key technique for enhancing the reasoning abilities of large language models (LLMs). However, its effectiveness critically depends on the selecti...

Yixiu Mao, Yun Qu, Qi Wang, Heming Zou, Xiangyang Ji

2603.10887 2026-03-11
TESTING

Kernel Tests of Equivalence

We propose novel kernel-based tests for assessing the equivalence between distributions. Traditional goodness-of-fit testing is inappropriate for concluding the absence of distributional difference...

Xing Liu, Axel Gandy

2603.10886 2026-03-11
AI LLM

An Extreme Multi-label Text Classification (XMTC) Library Dataset: What if we took "Use of Practical AI in Digital Libraries" seriously?

Subject indexing is vital for discovery but hard to sustain at scale and across languages. We release a large bilingual (English/German) corpus of catalog records annotated with the Integrated Auth...

Jennifer D'Souza, Sameer Sadruddin, Maximilian Kähler, Andrea Salfinger, Luca Zaccagna, Francesca...

2603.10876 2026-03-11
TESTING

SNPgen: Phenotype-Supervised Genotype Representation and Synthetic Data Generation via Latent Diffusion

Polygenic risk scores and other genomic analyses require large individual-level genotype datasets, yet strict data access restrictions impede sharing. Synthetic genotype generation offers a privacy...

Andrea Lampis, Michela Carlotta Massi, Nicola Pirastu, Francesca Ieva, Matteo Matteucci, Emanuele...

2603.10873 2026-03-11
TESTING

Exploring Indicators of Developers' Sentiment Perceptions in Student Software Projects

Communication is a crucial social factor in the success of software projects, as positively or negatively perceived statements can influence how recipients feel and affect team collaboration throug...

Martin Obaidi, Marc Herrmann, Jendrik Martensen, Jil Klünder, Kurt Schneider

2603.10864 2026-03-11
AI LLM

OSUM-Pangu: An Open-Source Multidimension Speech Understanding Foundation Model Built upon OpenPangu on Ascend NPUs

Recent advancements in Speech Large Language Models have significantly enhanced multi-dimensional speech understanding. However, the majority of high-performance frameworks are predominantly optimi...

Yujie Liao, Xuelong Geng, Hongfei Xue, Shuiyuan Wang, Lei Xie

2603.10862 2026-03-11
AI LLM

SiDiaC-v.2.0: Sinhala Diachronic Corpus Version 2.0

SiDiaC-v.2.0 is the largest comprehensive Sinhala Diachronic Corpus to date, covering a period from 1800 CE to 1955 CE in terms of publication dates, and a historical span from the 5th to the 20th ...

Nevidu Jayatilleke, Nisansa de Silva, Uthpala Nimanthi, Gagani Kulathilaka, Azra Safrullah, Johan...

2603.10861 2026-03-11
AI LLM

Numerical analysis for leaky-integrate-fire networks under Euler–Maruyama

Leaky integrate-and-fire (LIF) networks are canonical models in computational neuroscience and a standard substrate for neuromorphic AI. We study Euler–Maruyama simulation of current-based LIF net...

Xu'an Dou, Frank Chen, Kevin K Lin, Zhuo-Cheng Xiao

2603.10854 2026-03-11
TESTING

Spectroscopic galaxy redshifts in the Peanut cluster - a massive nearly head-on cluster merger shortly after pericenter passage

The Peanut cluster (SRGe J023820.8+200556, SRGe CL0238.3+2005, $z_{spec}$ = 0.42) has recently emerged as a candidate for a rare, massive merger, potentially analogous to the Bullet cluster. We pre...

I. Zaznobin, N. Lyskova, I. Bikmaev, R. Burenin, A. Arshinova, E. Churazov, S. Dodonov, M. Gilfan...

2603.10849 2026-03-11
TESTING

$V_{0.5}$: Generalist Value Model as a Prior for Sparse RL Rollouts

In Reinforcement Learning with Verifiable Rewards (RLVR), constructing a robust advantage baseline is critical for policy gradients, effectively guiding the policy model to reinforce desired behavi...

Yi-Kai Zhang, Yueqing Sun, Hongyan Hao, Qi Gu, Xunliang Cai, De-Chuan Zhan, Han-Jia Ye

2603.10848 2026-03-11
TESTING

Evaluating Few-Shot Pill Recognition Under Visual Domain Shift

Adverse drug events are a significant source of preventable harm, which has led to the development of automated pill recognition systems to enhance medication safety. Real-world deployment of these...

W. I. Chu, G. Tarroni, L. Li

2603.10833 2026-03-11
AI LLM

BALD-SAM: Disagreement-based Active Prompting in Interactive Segmentation

The Segment Anything Model (SAM) has revolutionized interactive segmentation through spatial prompting. While existing work primarily focuses on automating prompts in various settings, real-world a...

Prithwijit Chowdhury, Mohit Prabhushankar, Ghassan AlRegib

2603.10828 2026-03-11
AI LLM

Speaker Verification with Speech-Aware LLMs: Evaluation and Augmentation

Speech-aware large language models (LLMs) can accept speech inputs, yet their training objectives largely emphasize linguistic content or specific fields such as emotions or the speaker's gender, l...

Thomas Thebaud, Yuzhe Wang, Laureano Moro-Velazquez, Jesus Villalba-Lopez, Najim Dehak

2603.10827 2026-03-11
AI LLM

A dataset of medication images with instance segmentation masks for preventing adverse drug events

Medication errors and adverse drug events (ADEs) pose significant risks to patient safety, often arising from difficulties in reliably identifying pharmaceuticals in real-world settings. AI-based p...

W. I. Chu, S. Hirani, G. Tarroni, L. Li

2603.10825 2026-03-11
TESTING

Evaluating randomized smoothing as a defense against adversarial attacks in trajectory prediction

Accurate and robust trajectory prediction is essential for safe and efficient autonomous driving, yet recent work has shown that even state-of-the-art prediction models are highly vulnerable to inp...

Julian F. Schumann, Eduardo Figueiredo, Frederik Baymler Mathiesen, Luca Laurenti, Jens Kober, Ar...

2603.10821 2026-03-11
AI LLM

HanMoVLM: Large Vision-Language Models for Professional Artistic Painting Evaluation

While Large Vision-Language Models (VLMs) demonstrate impressive general visual capabilities, they remain artistically blind and unable to offer professional evaluation of artworks within specific ...

Hongji Yang, Yucheng Zhou, Wencheng Han, Songlian Li, Xiaotong Zhao, Jianbing Shen

2603.10814 2026-03-11
AI LLM

Nurture-First Agent Development: Building Domain-Expert AI Agents Through Conversational Knowledge Crystallization

The emergence of large language model (LLM)-based agent frameworks has shifted the primary challenge in building domain-expert AI agents from raw capability to effective encoding of domain expertis...

Linghao Zhang

2603.10808 2026-03-11