Papers
Research papers from arXiv and related sources
MuxTune: Efficient Multi-Task LLM Fine-Tuning in Multi-Tenant Datacenters via Spatial-Temporal Backbone Multiplexing
Parameter-Efficient Fine-Tuning (PEFT) is widely applied as the backend of fine-tuning APIs for large language model (LLM) customization in datacenters. Service providers deploy separate instances ...
Chunyu Xue, Yi Pan, Weihao Cui, Quan Chen, Shulai Zhang, Bingsheng He, Minyi Guo
SIGMark: Scalable In-Generation Watermark with Blind Extraction for Video Diffusion
Artificial Intelligence Generated Content (AIGC), particularly video generation with diffusion models, has advanced rapidly. Invisible watermarking is a key technology for protecting AI-genera...
Xinjie Zhu, Zijing Zhao, Hui Jin, Qingxiao Guo, Yilong Ma, Yunhao Wang, Xiaobing Guo, Weifeng Zhang
Emerging trends in Cislunar Space for Lunar Science Exploration and Space Robotics aiding Human Spaceflight Safety
In recent years, the Moon has emerged as an unparalleled extraterrestrial testbed for advancing cutting-edge technological and scientific research critical to enabling sustained human presence on it...
Arsalan Muhammad, Yue Wang, Hai Huang, Hao Wang
Eval4Sim: An Evaluation Framework for Persona Simulation
Large Language Model (LLM) personas with explicit specifications of attributes, background, and behavioural tendencies are increasingly used to simulate human conversations for tasks such as user m...
Eliseo Bao, Anxo Perez, Xi Wang, Javier Parapar
LaTeX Compilation: Challenges in the Era of LLMs
As large language models (LLMs) increasingly assist scientific writing, limitations and the significant token cost of TeX become more and more visible. This paper analyzes TeX's fundamental defects...
Tianyou Liu, Ziqiang Li, Yansong Li, Xurui Liu
LLM-based Argument Mining meets Argumentation and Description Logics: a Unified Framework for Reasoning about Debates
Large Language Models (LLMs) achieve strong performance in analyzing and generating text, yet they struggle with explicit, transparent, and verifiable reasoning over complex texts such as those con...
Gianvincenzo Alfano, Sergio Greco, Lucio La Cava, Stefano Francesco Monea, Irina Trubitsyna
A Browser-based Open Source Assistant for Multimodal Content Verification
Disinformation and false content produced by generative AI pose a significant challenge for journalists and fact-checkers who must rapidly verify digital media information. While there is an abunda...
Rosanna Milner, Michael Foster, Olesya Razuvayevskaya, Ian Roberts, Valentin Porcellini, Denis Te...
Faster, Cheaper, More Accurate: Specialised Knowledge Tracing Models Outperform LLMs
Predicting future student responses to questions is particularly valuable for educational learning platforms where it enables effective interventions. One of the key approaches to do this has been ...
Prarthana Bhattacharyya, Joshua Mitton, Ralph Abboud, Simon Woodhead
Toward Early Quality Assessment of Text-to-Image Diffusion Models
Recent text-to-image (T2I) diffusion and flow-matching models can produce highly realistic images from natural language prompts. In practical scenarios, T2I systems are often run in a "generate–t...
Huanlei Guo, Hongxin Wei, Bingyi Jing
BrandFusion: A Multi-Agent Framework for Seamless Brand Integration in Text-to-Video Generation
The rapid advancement of text-to-video (T2V) models has revolutionized content creation, yet their commercial potential remains largely untapped. We introduce, for the first time, the task of seaml...
Zihao Zhu, Ruotong Wang, Siwei Lyu, Min Zhang, Baoyuan Wu
Benchmarking Speech Systems for Frontline Health Conversations: The DISPLACE-M Challenge
The DIarization and Speech Processing for LAnguage understanding in Conversational Environments - Medical (DISPLACE-M) challenge introduces a conversational AI benchmark focused on understanding go...
Dhanya E, Ankita Meena, Manas Nanivadekar, Noumida A, Victor Azad, Ashwini Nagaraj Shenoy, Pratik...
Guideline-Grounded Evidence Accumulation for High-Stakes Agent Verification
As LLM-powered agents have been used for high-stakes decision-making, such as clinical diagnosis, it becomes critical to develop reliable verification of their decisions to facilitate trustworthy d...
Yichi Zhang, Nabeel Seedat, Yinpeng Dong, Peng Cui, Jun Zhu, Mihaela van der Schaar
From Heuristic Selection to Automated Algorithm Design: LLMs Benefit from Strong Priors
Large Language Models (LLMs) have already been widely adopted for automated algorithm design, demonstrating strong abilities in generating and evolving algorithms across various fields. Existing wo...
Qi Huang, Furong Ye, Ananta Shahane, Thomas Bäck, Niki van Stein
OCR or Not? Rethinking Document Information Extraction in the MLLMs Era with Real-World Large-Scale Datasets
Multimodal Large Language Models (MLLMs) enhance the potential of natural language processing. However, their actual impact on document information extraction remains unclear. In particular, it is ...
Jiyuan Shen, Peiyue Yuan, Atin Ghosh, Yifan Mai, Daniel Dahlmeier
Rethinking Code Similarity for Automated Algorithm Design with LLMs
The rise of Large Language Model-based Automated Algorithm Design (LLM-AAD) has transformed algorithm development by autonomously generating code implementations of expert-level algorithms. Unlike ...
Rui Zhang, Zhichao Lu
From Solver to Tutor: Evaluating the Pedagogical Intelligence of LLMs with KMP-Bench
Large Language Models (LLMs) show significant potential in AI mathematical tutoring, yet current evaluations often rely on simplistic metrics or narrow pedagogical scenarios, failing to assess comp...
Weikang Shi, Houxing Ren, Junting Pan, Aojun Zhou, Ke Wang, Zimu Lu, Yunqiao Yang, Yuxuan Hu, Lin...
Agentic Self-Evolutionary Replanning for Embodied Navigation
Failure is inevitable for embodied navigation in complex environments. To enhance resilience, replanning (RP) is a viable option, where the robot is allowed to fail, but is capable of adjusting...
Guoliang Li, Ruihua Han, Chengyang Li, He Li, Shuai Wang, Wenchao Ding, Hong Zhang, Chengzhong Xu
EvoSkill: Automated Skill Discovery for Multi-Agent Systems
Coding agents are increasingly used as general-purpose problem solvers, but their flexibility does not by itself confer the domain expertise needed for specialized tasks. Recent work addresses this...
Salaheddin Alzubi, Noah Provenzano, Jaydon Bingham, Weiyuan Chen, Tu Vu
Seeing Clearly without Training: Mitigating Hallucinations in Multimodal LLMs for Remote Sensing
Multimodal large language models (MLLMs) suffer from pronounced hallucinations in remote sensing visual question-answering (RS-VQA), primarily caused by visual grounding failures in large-scale sce...
Yi Liu, Jing Zhang, Di Wang, Xiaoyu Tian, Haonan Guo, Bo Du
CoShadow: Multi-Object Shadow Generation for Image Compositing via Diffusion Model
Realistic shadow generation is crucial for achieving seamless image compositing, yet existing methods primarily focus on single-object insertion and often fail to generalize when multiple foregroun...
Waqas Ahmed, Dean Diepeveen, Ferdous Sohel