Papers
Research papers from arXiv and related sources
LK Losses: Direct Acceptance Rate Optimization for Speculative Decoding
Speculative decoding accelerates autoregressive large language model (LLM) inference by using a lightweight draft model to propose candidate tokens that are then verified in parallel by the target ...
Alexander Samarin, Sergei Krutikov, Anton Shevtsov, Sergei Skvortsov, Filipp Fisin, Alexander Gol...
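The draft-then-verify loop this abstract describes can be sketched as a toy accept/reject step. This is a minimal illustration of generic speculative decoding, not the LK-loss method of the paper; `draft_probs` and `target_probs` are hypothetical per-token probabilities the two models assign to the proposed tokens.

```python
import random

def accept_draft_tokens(draft_tokens, draft_probs, target_probs, rng=random.random):
    """Return the prefix of draft tokens the target model accepts.

    Each proposed token is accepted with probability min(1, p_target / p_draft);
    the first rejection truncates the draft (standard rejection scheme).
    """
    accepted = []
    for tok, q, p in zip(draft_tokens, draft_probs, target_probs):
        if rng() < min(1.0, p / q):
            accepted.append(tok)
        else:
            break  # target disagrees from here on; resample and redraft
    return accepted

# With the target at least as confident as the draft, every token is accepted:
print(accept_draft_tokens(["a", "b"], [0.5, 0.5], [0.9, 0.9], rng=lambda: 0.0))
```

The acceptance rate of this loop is exactly the quantity such methods aim to maximize: the more draft tokens survive verification, the fewer target-model forward passes are needed per generated token.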
RF-Agent: Automated Reward Function Design via Language Agent Tree Search
Designing efficient reward functions for low-level control tasks is a challenging problem. Recent research aims to reduce reliance on expert experience by using Large Language Models (LLMs) with ta...
Ning Gao, Xiuhui Zhang, Xingyu Jiang, Mukang You, Mohan Zhang, Yue Deng
SWE-rebench V2: Language-Agnostic SWE Task Collection at Scale
Software engineering (SWE) agents are improving rapidly, with recent gains largely driven by reinforcement learning (RL). However, RL training is constrained by the scarcity of large-scale task col...
Ibragim Badertdinov, Maksim Nekrashevich, Anton Shevtsov, Alexander Golubev
RUMAD: Reinforcement-Unifying Multi-Agent Debate
Multi-agent debate (MAD) systems leverage collective intelligence to enhance reasoning capabilities, yet existing approaches struggle to simultaneously optimize accuracy, consensus formation, and c...
Chao Wang, Han Lin, Huaze Tang, Huijing Lin, Wenbo Ding
NAU-QMUL: Utilizing BERT and CLIP for Multi-modal AI-Generated Image Detection
With the aim of detecting AI-generated images and identifying the specific models responsible for their generation, we propose a multi-modal multi-task model. The model leverages pre-trained BERT a...
Xiaoyu Guo, Arkaitz Zubiaga
CLFEC: A New Task for Unified Linguistic and Factual Error Correction in Paragraph-Level Chinese Professional Writing
Chinese text correction has traditionally focused on spelling and grammar, while factual error correction is usually treated separately. However, in paragraph-level Chinese professional writing, li...
Jian Kai, Zidong Zhang, Jiwen Chen, Zhengxiang Wu, Songtao Sun, Fuyang Li, Yang Cao, Qiang Liu
Measurement of Born Cross Sections for $e^+e^-\to\Sigma^-\bar\Sigma^+$ at $\sqrt{s}=3.51$-$4.95$ GeV and Observation of $\psi(3770)\to\Sigma^-\bar\Sigma^+$
Using $e^+e^-$ collision data corresponding to an integrated luminosity of 44 fb$^{-1}$ collected with the BESIII detector at the BEPCII collider, we report the first measurement of Born cross sect...
BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, X. C. Ai, C. S. Akondi, R. Alibert...
Enhancing Continual Learning for Software Vulnerability Prediction: Addressing Catastrophic Forgetting via Hybrid-Confidence-Aware Selective Replay for Temporal LLM Fine-Tuning
Recent work applies Large Language Models (LLMs) to source-code vulnerability detection, but most evaluations still rely on random train-test splits that ignore time and overestimate real-world per...
Xuhui Dou, Hayretdin Bahsi, Alejandro Guerra-Manzanares
ReasonX: Declarative Reasoning on Explanations
Explaining opaque Machine Learning (ML) models has become an increasingly important challenge. However, current eXplainable AI (XAI) methods suffer several shortcomings, including insufficient a...
Laura State, Salvatore Ruggieri, Franco Turini
GRAIL: Post-hoc Compensation by Linear Reconstruction for Compressed Networks
Structured deep model compression methods are hardware-friendly and substantially reduce memory and inference costs. However, under aggressive compression, the resulting accuracy degradation often ...
Wenwu Tang, Dong Wang, Lothar Thiele, Olga Saukh
Divide and Conquer: Accelerating Diffusion-Based Large Language Models via Adaptive Parallel Decoding
Diffusion-based large language models (dLLMs) have shown promising performance across various reasoning tasks, establishing themselves as an alternative to autoregressive large language models (LLM...
Xiangzhong Luo, Yilin An, Zhicheng Yu, Weichen Liu, Xu Yang
Reasoning-Driven Multimodal LLM for Domain Generalization
This paper addresses the domain generalization (DG) problem in deep learning. While most DG methods focus on enforcing visual feature invariance, we leverage the reasoning capability of multimodal ...
Zhipeng Xu, Zilong Wang, Xinyang Jiang, Dongsheng Li, De Cheng, Nannan Wang
UniFAR: A Unified Facet-Aware Retrieval Framework for Scientific Documents
Existing scientific document retrieval (SDR) methods primarily rely on document-centric representations learned from inter-document relationships for document-document (doc-doc) retrieval. However,...
Zheng Dou, Zhao Zhang, Deqing Wang, Yikun Ban, Fuzhen Zhuang
OPTIAGENT: A Physics-Driven Agentic Framework for Automated Optical Design
Optical design is the process of configuring optical elements to precisely manipulate light for high-fidelity imaging. It is inherently a highly non-convex optimization problem that relies heavily ...
Yuyu Geng, Lei Sun, Yao Gao, Xinxin Hu, Zhonghua Yi, Xiaolong Qian, Weijian Hu, Jian Bai, Kaiwei ...
Shape vs. Context: Examining Human-AI Gaps in Ambiguous Japanese Character Recognition
High text recognition performance does not guarantee that Vision-Language Models (VLMs) share human-like decision patterns when resolving ambiguity. We investigate this behavioral gap by directly c...
Daichi Haraguchi
A Difference-in-Difference Approach to Detecting AI-Generated Images
Diffusion models are able to produce AI-generated images that are almost indistinguishable from real ones. This raises concerns about their potential misuse and poses substantial challenges for det...
Xinyi Qi, Kai Ye, Chengchun Shi, Ying Yang, Hongyi Zhou, Jin Zhu
Unlocking Cognitive Capabilities and Analyzing the Perception-Logic Trade-off
Recent advancements in Multimodal Large Language Models (MLLMs) pursue omni-perception capabilities, yet integrating robust sensory grounding with complex reasoning remains a challenge, particularl...
Longyin Zhang, Shuo Sun, Yingxu He, Won Cheng Yi Lewis, Muhammad Huzaifah Bin Md Shahrin, Hardik ...
From Static Benchmarks to Dynamic Protocol: Agent-Centric Text Anomaly Detection for Evaluating LLM Reasoning
The evaluation of large language models (LLMs) has predominantly relied on static datasets, which offer limited scalability and fail to capture the evolving reasoning capabilities of recent models....
Seungdong Yoa, Sanghyu Yoon, Suhee Yoon, Dongmin Kim, Ye Seul Sim, Junhyun Lee, Woohyung Lim
SLA-Aware Distributed LLM Inference Across Device-RAN-Cloud
Embodied AI requires sub-second inference near the Radio Access Network (RAN), but deployments span heterogeneous tiers (on-device, RAN-edge, cloud) and must not disrupt real-time baseband processi...
Hariz Yet, Nguyen Thanh Tam, Mao V. Ngo, Lim Yi Shen, Lin Wei, Jihong Park, Binbin Chen, Tony Q. ...
The Auton Agentic AI Framework
The field of Artificial Intelligence is undergoing a transition from Generative AI -- probabilistic generation of text and images -- to Agentic AI, in which autonomous systems execute actions withi...
Sheng Cao, Zhao Chang, Chang Li, Hannan Li, Liyao Fu, Ji Tang