Research

Papers

Research papers from arXiv and related sources

Total: 4694; AI/LLM: 2583; Testing: 2111
TESTING

Event-by-Event Multiplicity Fluctuations in Heavy-Ion Collisions Using Modified HIJING Monte Carlo Generator

This work presents an analysis of event-by-event multiplicity fluctuations as a sensitive tool for diagnosing the state of matter produced in relativistic heavy-ion collisions. Using a modified ver...

Y. A. Rusak, L. F. Babichev

2603.09732 2026-03-10
TESTING

EXPLORE-Bench: Egocentric Scene Prediction with Long-Horizon Reasoning

Multimodal large language models (MLLMs) are increasingly considered as a foundation for embodied agents, yet it remains unclear whether they can reliably reason about the long-term physical conseq...

Chengjun Yu, Xuhan Zhu, Chaoqun Du, Pengfei Yu, Wei Zhai, Yang Cao, Zheng-Jun Zha

2603.09731 2026-03-10
AI LLM

WVA: A Global Optimization Control Plane for llmd

As Large Language Models (LLMs) scale to handle massive concurrent traffic, optimizing the infrastructure required for inference has become a primary challenge. To manage the high cost of GPU resou...

Abhishek Malvankar, Lionel Villard, Mohammed Abdi, Evgeny Shindin, Braulio Dumba, Vishakha Ramani...

2603.09730 2026-03-10
AI LLM

A Multi-Prototype-Guided Federated Knowledge Distillation Approach in AI-RAN Enabled Multi-Access Edge Computing System

With the development of wireless network, Multi-Access Edge Computing (MEC) and Artificial Intelligence (AI)-native Radio Access Network (RAN) have attracted significant attention. Particularly, th...

Luyao Zou, Hayoung Oh, Chu Myaet Thwal, Apurba Adhikary, Seohyeon Hong, Zhu Han

2603.09727 2026-03-10
TESTING

Idempotent Slices with Applications to Code-Size Reduction

Given a value computed within a program, an idempotent backward slice with respect to this value is a maximal subprogram that computes it. An informal notion of an idempotent slice has previously b...

Rafael Alvarenga de Azevedo, Daniel Augusto Costa de Sa, Rodrigo Caetano Rocha, Fernando Magno Qu...

2603.09726 2026-03-10
TESTING

A Semi-spontaneous Dutch Speech Dataset for Speech Enhancement and Speech Recognition

We present DRES: a 1.5-hour Dutch realistic elicited (semi-spontaneous) speech dataset from 80 speakers recorded in noisy, public indoor environments. DRES was designed as a test set for the evalua...

Dimme de Groot, Yuanyuan Zhang, Jorge Martinez, Odette Scharenborg

2603.09725 2026-03-10
AI LLM

RbtAct: Rebuttal as Supervision for Actionable Review Feedback Generation

Large language models (LLMs) are increasingly used across the scientific workflow, including to draft peer-review reports. However, many AI-generated reviews are superficial and insufficiently acti...

Sihong Wu, Yiling Ma, Yilun Zhao, Tiansheng Hu, Owen Jiang, Manasi Patwardhan, Arman Cohan

2603.09723 2026-03-10
TESTING

The Flint Hills Series, Mixed Tate Motives, and a Criterion for the Irrationality Measure of $π$

We undertake a rigorous structural analysis of the Flint Hills series $S = \sum_{n=1}^{\infty} \frac{1}{n^3 \sin^2 n}$. Our primary contribution is a reduction theorem that expresses $S$ as a linea...

Carlos Lopez Zapata

2603.09719 2026-03-10
AI LLM

AutoAgent: Evolving Cognition and Elastic Memory Orchestration for Adaptive Agents

Autonomous agent frameworks still struggle to reconcile long-term experiential learning with real-time, context-sensitive decision-making. In practice, this gap appears as static cognition, rigid w...

Xiaoxing Wang, Ning Liao, Shikun Wei, Chen Tang, Feiyu Xiong

2603.09716 2026-03-10
AI LLM

Robotic Scene Cloning: Advancing Zero-Shot Robotic Scene Adaptation in Manipulation via Visual Prompt Editing

Modern robots can perform a wide range of simple tasks and adapt to diverse scenarios in the well-trained environment. However, deploying pre-trained robot models in real-world user scenarios remai...

Binyuan Huang, Yuqing Wen, Yucheng Zhao, Yaosi Hu, Tiancai Wang, Chang Wen Chen, Haoqiang Fan, Zh...

2603.09712 2026-03-10
TESTING

Finetuning a Text-to-Audio Model for Room Impulse Response Generation

Room Impulse Responses (RIRs) enable realistic acoustic simulation, with applications ranging from multimedia production to speech data augmentation. However, acquiring high-quality real-world RIRs...

Kirak Kim, Sungyoung Kim

2603.09708 2026-03-10
AI LLM

Evaluation of LLMs in retrieving food and nutritional context for RAG systems

In this article, we evaluate four Large Language Models (LLMs) and their effectiveness at retrieving data within a specialized Retrieval-Augmented Generation (RAG) system, using a comprehensive foo...

Maks Požarnik Vavken, Matevž Ogrinc, Tome Eftimov, Barbara Koroušić Seljak

2603.09704 2026-03-10
AI LLM

An Empirical Study of Interaction Smells in Multi-Turn Human-LLM Collaborative Code Generation

Large Language Models (LLMs) have revolutionized code generation, evolving from static tools into dynamic conversational interfaces that facilitate complex, multi-turn collaborative programming. Wh...

Binquan Zhang, Li Zhang, Lin Shi, Song Wang, Yuwei Qian, Linhui Zhao, Fang Liu, An Fu, Yida Ye

2603.09701 2026-03-10
AI LLM

ActiveUltraFeedback: Efficient Preference Data Generation using Active Learning

Reinforcement Learning from Human Feedback (RLHF) has become the standard for aligning Large Language Models (LLMs), yet its efficacy is bottlenecked by the high cost of acquiring preference data, ...

Davit Melikidze, Marian Schneider, Jessica Lam, Martin Wertich, Ido Hakimi, Barna Pásztor, Andrea...

2603.09692 2026-03-10
AI LLM

ESAinsTOD: A Unified End-to-End Schema-Aware Instruction-Tuning Framework for Task-Oriented Dialog Modeling

Existing end-to-end modeling methods for modular task-oriented dialog systems are typically tailored to specific datasets, making it challenging to adapt to new dialog scenarios. In this work, we p...

Dechuan Teng, Chunlin Lu, Libo Qin, Wanxiang Che

2603.09691 2026-03-10
AI LLM

Automatic Cardiac Risk Management Classification using large-context Electronic Patients Health Records

To overcome the limitations of manual administrative coding in geriatric Cardiovascular Risk Management, this study introduces an automated classification framework leveraging unstructured Electron...

Jacopo Vitale, David Della Morte, Luca Bacco, Mario Merone, Mark de Groot, Saskia Haitjema, Leand...

2603.09685 2026-03-10
AI LLM

Murmurations: a case study in AI-assisted mathematics

We report the emergence of a striking new phenomenon in arithmetic, which we call murmurations. First observed experimentally through averages over large arithmetic datasets, murmurations can be de...

Yang-Hui He, Kyu-Hwan Lee, Thomas Oliver, Alexey Pozdnyakov

2603.09680 2026-03-10
AI LLM

EsoLang-Bench: Evaluating Genuine Reasoning in Large Language Models via Esoteric Programming Languages

Large language models achieve near-ceiling performance on code generation benchmarks, yet these results increasingly reflect memorization rather than genuine reasoning. We introduce EsoLang-Bench, ...

Aman Sharma, Paras Chopra

2603.09678 2026-03-10
TESTING

No evaluation without fair representation: Impact of label and selection bias on the evaluation, performance and mitigation of classification models

Bias can be introduced in diverse ways in machine learning datasets, for example via selection or label bias. Although these bias types in themselves have an influence on important aspects of fair ...

Magali Legast, Toon Calders, François Fouss

2603.09662 2026-03-10
AI LLM

FreqCycle: A Multi-Scale Time-Frequency Analysis Method for Time Series Forecasting

Mining time-frequency features is critical for time series forecasting. Existing research has predominantly focused on modeling low-frequency patterns, where most time series energy is concentrated...

Boya Zhang, Shuaijie Yin, Huiwen Zhu, Xing He

2603.09661 2026-03-10