Research

Papers

Research papers from arXiv and related sources

Total: 4513, AI/LLM: 2483, Testing: 2030
AI LLM

MuxTune: Efficient Multi-Task LLM Fine-Tuning in Multi-Tenant Datacenters via Spatial-Temporal Backbone Multiplexing

Parameter-Efficient Fine-Tuning (PEFT) is widely applied as the backend of fine-tuning APIs for large language model (LLM) customization in datacenters. Service providers deploy separate instances ...

Chunyu Xue, Yi Pan, Weihao Cui, Quan Chen, Shulai Zhang, Bingsheng He, Minyi Guo

2603.02885 2026-03-03
AI LLM

SIGMark: Scalable In-Generation Watermark with Blind Extraction for Video Diffusion

Artificial Intelligence Generated Content (AIGC), particularly video generation with diffusion models, has advanced rapidly. Invisible watermarking is a key technology for protecting AI-genera...

Xinjie Zhu, Zijing Zhao, Hui Jin, Qingxiao Guo, Yilong Ma, Yunhao Wang, Xiaobing Guo, Weifeng Zhang

2603.02882 2026-03-03
AI LLM

Emerging trends in Cislunar Space for Lunar Science Exploration and Space Robotics aiding Human Spaceflight Safety

In recent years, the Moon has emerged as an unparalleled extraterrestrial testbed for advancing cutting-edge technological and scientific research critical to enabling sustained human presence on it...

Arsalan Muhammad, Yue Wang, Hai Huang, Hao Wang

2603.02878 2026-03-03
AI LLM

Eval4Sim: An Evaluation Framework for Persona Simulation

Large Language Model (LLM) personas with explicit specifications of attributes, background, and behavioural tendencies are increasingly used to simulate human conversations for tasks such as user m...

Eliseo Bao, Anxo Perez, Xi Wang, Javier Parapar

2603.02876 2026-03-03
AI LLM

LaTeX Compilation: Challenges in the Era of LLMs

As large language models (LLMs) increasingly assist scientific writing, limitations and the significant token cost of TeX become more and more visible. This paper analyzes TeX's fundamental defects...

Tianyou Liu, Ziqiang Li, Yansong Li, Xurui Liu

2603.02873 2026-03-03
AI LLM

LLM-based Argument Mining meets Argumentation and Description Logics: a Unified Framework for Reasoning about Debates

Large Language Models (LLMs) achieve strong performance in analyzing and generating text, yet they struggle with explicit, transparent, and verifiable reasoning over complex texts such as those con...

Gianvincenzo Alfano, Sergio Greco, Lucio La Cava, Stefano Francesco Monea, Irina Trubitsyna

2603.02858 2026-03-03
AI LLM

A Browser-based Open Source Assistant for Multimodal Content Verification

Disinformation and false content produced by generative AI pose a significant challenge for journalists and fact-checkers who must rapidly verify digital media information. While there is an abunda...

Rosanna Milner, Michael Foster, Olesya Razuvayevskaya, Ian Roberts, Valentin Porcellini, Denis Te...

2603.02842 2026-03-03
AI LLM

Faster, Cheaper, More Accurate: Specialised Knowledge Tracing Models Outperform LLMs

Predicting future student responses to questions is particularly valuable for educational learning platforms where it enables effective interventions. One of the key approaches to do this has been ...

Prarthana Bhattacharyya, Joshua Mitton, Ralph Abboud, Simon Woodhead

2603.02830 2026-03-03
AI LLM

Toward Early Quality Assessment of Text-to-Image Diffusion Models

Recent text-to-image (T2I) diffusion and flow-matching models can produce highly realistic images from natural language prompts. In practical scenarios, T2I systems are often run in a "generate–t...

Huanlei Guo, Hongxin Wei, Bingyi Jing

2603.02829 2026-03-03
AI LLM

BrandFusion: A Multi-Agent Framework for Seamless Brand Integration in Text-to-Video Generation

The rapid advancement of text-to-video (T2V) models has revolutionized content creation, yet their commercial potential remains largely untapped. We introduce, for the first time, the task of seaml...

Zihao Zhu, Ruotong Wang, Siwei Lyu, Min Zhang, Baoyuan Wu

2603.02816 2026-03-03
AI LLM

Benchmarking Speech Systems for Frontline Health Conversations: The DISPLACE-M Challenge

The DIarization and Speech Processing for LAnguage understanding in Conversational Environments - Medical (DISPLACE-M) challenge introduces a conversational AI benchmark focused on understanding go...

Dhanya E, Ankita Meena, Manas Nanivadekar, Noumida A, Victor Azad, Ashwini Nagaraj Shenoy, Pratik...

2603.02813 2026-03-03
AI LLM

Guideline-Grounded Evidence Accumulation for High-Stakes Agent Verification

As LLM-powered agents have been used for high-stakes decision-making, such as clinical diagnosis, it becomes critical to develop reliable verification of their decisions to facilitate trustworthy d...

Yichi Zhang, Nabeel Seedat, Yinpeng Dong, Peng Cui, Jun Zhu, Mihaela van der Schaar

2603.02798 2026-03-03
AI LLM

From Heuristic Selection to Automated Algorithm Design: LLMs Benefit from Strong Priors

Large Language Models (LLMs) have already been widely adopted for automated algorithm design, demonstrating strong abilities in generating and evolving algorithms across various fields. Existing wo...

Qi Huang, Furong Ye, Ananta Shahane, Thomas Bäck, Niki van Stein

2603.02792 2026-03-03
AI LLM

OCR or Not? Rethinking Document Information Extraction in the MLLMs Era with Real-World Large-Scale Datasets

Multimodal Large Language Models (MLLMs) enhance the potential of natural language processing. However, their actual impact on document information extraction remains unclear. In particular, it is ...

Jiyuan Shen, Peiyue Yuan, Atin Ghosh, Yifan Mai, Daniel Dahlmeier

2603.02789 2026-03-03
AI LLM

Rethinking Code Similarity for Automated Algorithm Design with LLMs

The rise of Large Language Model-based Automated Algorithm Design (LLM-AAD) has transformed algorithm development by autonomously generating code implementations of expert-level algorithms. Unlike ...

Rui Zhang, Zhichao Lu

2603.02787 2026-03-03
AI LLM

From Solver to Tutor: Evaluating the Pedagogical Intelligence of LLMs with KMP-Bench

Large Language Models (LLMs) show significant potential in AI mathematical tutoring, yet current evaluations often rely on simplistic metrics or narrow pedagogical scenarios, failing to assess comp...

Weikang Shi, Houxing Ren, Junting Pan, Aojun Zhou, Ke Wang, Zimu Lu, Yunqiao Yang, Yuxuan Hu, Lin...

2603.02775 2026-03-03
AI LLM

Agentic Self-Evolutionary Replanning for Embodied Navigation

Failure is inevitable for embodied navigation in complex environments. To enhance the resilience, replanning (RP) is a viable option, where the robot is allowed to fail, but is capable of adjusting...

Guoliang Li, Ruihua Han, Chengyang Li, He Li, Shuai Wang, Wenchao Ding, Hong Zhang, Chengzhong Xu

2603.02772 2026-03-03
AI LLM

EvoSkill: Automated Skill Discovery for Multi-Agent Systems

Coding agents are increasingly used as general-purpose problem solvers, but their flexibility does not by itself confer the domain expertise needed for specialized tasks. Recent work addresses this...

Salaheddin Alzubi, Noah Provenzano, Jaydon Bingham, Weiyuan Chen, Tu Vu

2603.02766 2026-03-03
AI LLM

Seeing Clearly without Training: Mitigating Hallucinations in Multimodal LLMs for Remote Sensing

Multimodal large language models (MLLMs) suffer from pronounced hallucinations in remote sensing visual question-answering (RS-VQA), primarily caused by visual grounding failures in large-scale sce...

Yi Liu, Jing Zhang, Di Wang, Xiaoyu Tian, Haonan Guo, Bo Du

2603.02754 2026-03-03
AI LLM

CoShadow: Multi-Object Shadow Generation for Image Compositing via Diffusion Model

Realistic shadow generation is crucial for achieving seamless image compositing, yet existing methods primarily focus on single-object insertion and often fail to generalize when multiple foregroun...

Waqas Ahmed, Dean Diepeveen, Ferdous Sohel

2603.02743 2026-03-03