Research

Papers

Research papers from arXiv and related sources

Total: 4694 | AI/LLM: 2583 | Testing: 2111
AI LLM

Large Language Model Empowered CSI Feedback in Massive MIMO Systems

Despite the success of large language models (LLMs) across domains, their potential for efficient channel state information (CSI) compression and feedback in frequency division duplex (FDD) massive...

Jie Wu, Wei Xu, Le Liang, Xiaohu You, Mérouane Debbah

2603.02686 2026-03-03
AI LLM

HateMirage: An Explainable Multi-Dimensional Dataset for Decoding Faux Hate and Subtle Online Abuse

Subtle and indirect hate speech remains an underexplored challenge in online safety research, particularly when harmful intent is embedded within misleading or manipulative narratives. Existing hat...

Sai Kartheek Reddy Kasu, Shankar Biradar, Sunil Saumya, Md. Shad Akhtar

2603.02684 2026-03-03
TESTING

VisionCreator: A Native Visual-Generation Agentic Model with Understanding, Thinking, Planning and Creation

Visual content creation tasks demand a nuanced understanding of design conventions and creative workflows, capabilities challenging for general models, while workflow-based agents lack specialized k...

Jinxiang Lai, Zexin Lu, Jiajun He, Rongwei Quan, Wenzhe Zhao, Qinyu Yang, Qi Chen, Qin Lin, Chuyu...

2603.02681 2026-03-03
AI LLM

LLMs for High-Frequency Decision-Making: Normalized Action Reward-Guided Consistency Policy Optimization

While Large Language Models (LLMs) form the cornerstone of sequential decision-making agent development, they have inherent limitations in high-frequency decision tasks. Existing research mainly fo...

Yang Zhao, Zihao Li, Zhiyu Jiang, Dandan Ma, Ganchao Liu, Wenzhe Zhao

2603.02680 2026-03-03
AI LLM

Causal Learning Should Embrace the Wisdom of the Crowd

Learning causal structures, typically represented by directed acyclic graphs (DAGs), from observational data is notoriously challenging due to the combinatorial explosion of possible graphs and inher...

Ryan Feng Lin, Yuantao Wei, Huiling Liao, Xiaoning Qian, Shuai Huang

2603.02678 2026-03-03
AI LLM

ITLC at SemEval-2026 Task 11: Normalization and Deterministic Parsing for Formal Reasoning in LLMs

Large language models suffer from content effects in reasoning tasks, particularly in multilingual contexts. We introduce a novel method that reduces these biases through explicit structural abstr...

Wicaksono Leksono Muhamad, Joanito Agili Lopo, Tack Hwa Wong, Muhammad Ravi Shulthan Habibi, Samu...

2603.02676 2026-03-03
AI LLM

IMR-LLM: Industrial Multi-Robot Task Planning and Program Generation using Large Language Models

In modern industrial production, multiple robots often collaborate to complete complex manufacturing tasks. Large language models (LLMs), with their strong reasoning capabilities, have shown potent...

Xiangyu Su, Juzhan Xu, Oliver van Kaick, Kai Xu, Ruizhen Hu

2603.02669 2026-03-03
AI LLM

SorryDB: Can AI Provers Complete Real-World Lean Theorems?

We present SorryDB, a dynamically updating benchmark of open Lean tasks drawn from 78 real-world formalization projects on GitHub. Unlike existing static benchmarks, often composed of competition p...

Austin Letson, Leopoldo Sarra, Auguste Poiroux, Oliver Dressler, Paul Lezeau, Dhyan Aranha, Frede...

2603.02668 2026-03-03
TESTING

Quantum Algorithms for Approximate Graph Isomorphism Testing

The graph isomorphism problem asks whether two graphs are identical up to vertex relabeling. While the exact problem admits quasi-polynomial-time classical algorithms, many applications in molecula...

Prateek P. Kulkarni

2603.02656 2026-03-03
TESTING

Improving Diffusion Planners by Self-Supervised Action Gating with Energies

Diffusion planners are a strong approach for offline reinforcement learning, but they can fail when value-guided selection favours trajectories that score well yet are locally inconsistent with the...

Yuan Lu, Dongqi Han, Yansen Wang, Dongsheng Li

2603.02650 2026-03-03
TESTING

Rethinking Training Targets, Architectures and Data Quality for Universal Speech Enhancement

Universal Speech Enhancement (USE) aims to restore speech quality under diverse degradation conditions while preserving signal fidelity. Despite recent progress, key challenges in training target s...

Szu-Wei Fu, Rong Chao, Xuesong Yang, Sung-Feng Huang, Ryandhimas E. Zezario, Rauf Nasretdinov, An...

2603.02641 2026-03-03
TESTING

The Vienna 4G/5G Drive-Test Dataset

Machine learning for mobile network analysis, planning, and optimization is often limited by the lack of large, comprehensive real-world datasets. This paper introduces the Vienna 4G/5G Drive-Test ...

Wilfried Wiedner, Lukas Eller, Mariam Mussbah, Dominik Rössler, Valerian Maresch, Philipp Svoboda...

2603.02638 2026-03-03
TESTING

Topological bounds on the dynamical growth rate of chemical reaction networks

Growth and decay are system-level properties of chemical reaction networks (CRNs) relevant from prebiotic chemistry to cellular metabolism. Their properties are typically analyzed through the kinet...

Praful Gagrani, Jiwei Wang, Yannick De Decker, David Lacoste

2603.02627 2026-03-03
TESTING

Same Error, Different Function: The Optimizer as an Implicit Prior in Financial Time Series

Neural networks applied to financial time series operate in a regime of underspecification, where model predictors achieve indistinguishable out-of-sample error. Using large-scale volatility foreca...

Federico Vittorio Cortesi, Giuseppe Iannone, Giulia Crippa, Tomaso Poggio, Pierfrancesco Beneventano

2603.02620 2026-03-03
TESTING

Mind the Way You Select Negative Texts: Pursuing the Distance Consistency in OOD Detection with VLMs

Out-of-distribution (OOD) detection seeks to identify samples from unknown classes, a critical capability for deploying machine learning models in open-world scenarios. Recent research has demonstr...

Zhikang Xu, Qianqian Xu, Zitai Wang, Cong Hua, Sicong Li, Zhiyong Yang, Qingming Huang

2603.02618 2026-03-03
TESTING

His2Trans: A Skeleton First Framework for Self Evolving C to Rust Translation with Historical Retrieval

Automated C-to-Rust migration encounters systemic obstacles when scaling from code snippets to industrial projects, mainly because build context is often unavailable ("dependency hell") and domain-...

Shengbo Wang, Mingwei Liu, Guangsheng Ou, Yuwen Chen, Zike Li, Yanlin Wang, Zibin Zheng

2603.02617 2026-03-03
TESTING

Bayesian Optimization in Chemical Compound Sub-Spaces using Low-Dimensional Molecular Descriptors

Efficient optimization of molecules with targeted properties remains a significant challenge due to the vast size and discrete nature of chemical compound space. Conventional machine-learning-based...

Yun-Wen Mao, Roman V. Krems

2603.02605 2026-03-03
TESTING

AgentAssay: Token-Efficient Regression Testing for Non-Deterministic AI Agent Workflows

Autonomous AI agents are deployed at unprecedented scale, yet no principled methodology exists for verifying that an agent has not regressed after changes to its prompts, tools, models, or orch...

Varun Pratap Bhardwaj

2603.02601 2026-03-03
TESTING

Synthetic-Child: An AIGC-Based Synthetic Data Pipeline for Privacy-Preserving Child Posture Estimation

Accurate child posture estimation is critical for AI-powered study companion devices, yet collecting large-scale annotated datasets of children is both expensive and ethically prohibitive due to pr...

Taowen Zeng

2603.02598 2026-03-03
TESTING

GPUTOK: GPU Accelerated Byte Level BPE Tokenization

As large language models move toward million-token context windows, CPU tokenizers become a major slowdown because they process text one step at a time while powerful GPUs sit unused. We built a GP...

Venu Gopal Kadamba, Kanishkha Jaisankar

2603.02597 2026-03-03