Papers
Research papers from arXiv and related sources
Surgical Post-Training: Cutting Errors, Keeping Knowledge
Enhancing the reasoning capabilities of Large Language Models (LLMs) via post-training is often constrained by the trade-off between efficiency and catastrophic forgetting. While prior research emp...
Wenye Lin, Kai Han
HeRo: Adaptive Orchestration of Agentic RAG on Heterogeneous Mobile SoC
With the increasing computational capability of mobile devices, deploying agentic retrieval-augmented generation (RAG) locally on heterogeneous System-on-Chips (SoCs) has become a promising way to ...
Maoliang Li, Jiayu Chen, Zihao Zheng, Ziqian Li, Xinhao Sun, Guojie Luo, Chenchen Liu, Xiang Chen
CeProAgents: A Hierarchical Agents System for Automated Chemical Process Development
The development of chemical processes, a cornerstone of chemical engineering, presents formidable challenges due to its multi-faceted nature, integrating specialized knowledge, conceptual design, a...
Yuhang Yang, Ruikang Li, Jifei Ma, Kai Zhang, Qi Liu, Jianyu Han, Yonggan Bu, Jibin Zhou, Defu Li...
LexChronos: An Agentic Framework for Structured Event Timeline Extraction in Indian Jurisprudence
Understanding and predicting judicial outcomes demands nuanced analysis of legal documents. Traditional approaches treat judgments and proceedings as unstructured text, limiting the effectiveness o...
Anka Chandrahas Tummepalli, Preethu Rose Anish
PromptStereo: Zero-Shot Stereo Matching via Structure and Motion Prompts
Modern stereo matching methods have leveraged monocular depth foundation models to achieve superior zero-shot generalization performance. However, most existing methods primarily focus on extractin...
Xianqi Wang, Hao Yang, Hangtian Wang, Junda Cheng, Gangwei Xu, Min Lin, Xin Yang
QCAgent: An agentic framework for quality-controllable pathology report generation from whole slide image
Recent methods for pathology report generation from whole-slide images (WSIs) are capable of producing slide-level diagnostic descriptions but fail to ground fine-grained statements in localized visu...
Rundong Wang, Wei Ba, Ying Zhou, Yingtai Li, Bowen Liu, Baizhi Wang, Yuhao Wang, Zhidong Yang, Ku...
Learning to Draft: Adaptive Speculative Decoding with Reinforcement Learning
Speculative decoding accelerates large language model (LLM) inference by using a small draft model to generate candidate tokens for a larger target model to verify. The efficacy of this technique h...
Jiebin Zhang, Zhenghan Yu, Liang Wang, Nan Yang, Eugene J. Yu, Zheng Li, Yifan Song, Dawei Zhu, X...
Who Explains Privacy Policies to Me? Embodied and Textual LLM-Powered Privacy Assistants in Virtual Reality
Virtual Reality (VR) systems collect fine-grained behavioral and biometric data, yet privacy policies are rarely read or understood due to their complex language, length, and poor integration into ...
Vincent Freiberger, Moritz Dresch, Florian Alt, Arthur Fleig, Viktorija Paneva
DeLo: Dual Decomposed Low-Rank Experts Collaboration for Continual Missing Modality Learning
Adapting Large Multimodal Models (LMMs) to real-world scenarios poses the dual challenges of learning from sequential data streams while handling frequent modality incompleteness, a task known as C...
Xiwei Liu, Yulong Li, Feilong Tang, Imran Razzak
Assessing Crime Disclosure Patterns in a Large-Scale Cybercrime Forum
Cybercrime forums play a central role in the cybercrime ecosystem, serving as hubs for the exchange of illicit goods, services, and knowledge. Previous studies have explored the market and social s...
Raphael Hoheisel, Tom Meurs, Jai Wientjes, Marianne Junger, Abhishta Abhishta, Masarah Paquet-Clo...
The Invisibility Hypothesis: Promises of AGI and the Future of the Global South
Discussions surrounding Artificial General Intelligence have largely focused on technical feasibility, timelines, and existential risk, often treating its social impact as being the same across dif...
L. Julian Lechuga Lopez, Luis Lara
Closing the Gap Between Float and Posit Hardware Efficiency
The b-posit, or bounded posit, is a variation of the posit format designed for high performance computing (HPC) and AI applications. Unlike traditional floating-point formats (floats), posits use v...
Aditya Anirudh Jonnalagadda, Rishi Thotli, John L. Gustafson
Evaluating and Understanding Scheming Propensity in LLM Agents
As frontier language models are increasingly deployed as autonomous agents pursuing complex, long-term objectives, there is increased risk of scheming: agents covertly pursuing misaligned goals. Pr...
Mia Hopman, Jannes Elstner, Maria Avramidou, Amritanshu Prasad, David Lindner
CARE: Towards Clinical Accountability in Multi-Modal Medical Reasoning with an Evidence-Grounded Agentic Framework
Large visual language models (VLMs) have shown strong multi-modal medical reasoning ability, but most operate as end-to-end black boxes, diverging from clinicians' evidence-based, staged workflows ...
Yuexi Du, Jinglu Wang, Shujie Liu, Nicha C. Dvornek, Yan Lu
MigMate: A VS Code Extension for LLM-based Library Migration of Python Projects
Modern software relies heavily on third-party software libraries to streamline the development process. The act of switching one library for a similar counterpart, called library migration, natural...
Matthias Kebede, May Mahmoud, Mohayeminul Islam, Sarah Nadi
DARE-bench: Evaluating Modeling and Instruction Fidelity of LLMs in Data Science
The fast-growing demand for using Large Language Models (LLMs) to tackle complex multi-step data science tasks creates an emergent need for accurate benchmarking. There are two major gaps in existin...
Fan Shu, Yite Wang, Ruofan Wu, Boyi Liu, Zhewei Yao, Yuxiong He, Feng Yan
Do LLMs Benefit From Their Own Words?
Multi-turn interactions with large language models typically retain the assistant's own past responses in the conversation history. In this work, we revisit this design choice by asking whether lar...
Jenny Y. Huang, Leshem Choshen, Ramon Astudillo, Tamara Broderick, Jacob Andreas
CUDA Agent: Large-Scale Agentic RL for High-Performance CUDA Kernel Generation
GPU kernel optimization is fundamental to modern deep learning but remains a highly specialized task requiring deep hardware expertise. Despite strong performance in general programming, large lang...
Weinan Dai, Hanlin Wu, Qiying Yu, Huan-ang Gao, Jiahao Li, Chengquan Jiang, Weiqiang Lou, Yufan S...
A Minimal Agent for Automated Theorem Proving
We propose a minimal agentic baseline that enables systematic comparison across different AI-based theorem prover architectures. This design implements the core features shared among state-of-the-a...
Borja Requena Pozo, Austin Letson, Krystian Nowakowski, Izan Beltran Ferreiro, Leopoldo Sarra
From Efficiency to Meaning: Adolescents' Envisioned Role of AI in Health Management
While prior research has focused on providers, caregivers, and adult patients, little is known about adolescents' perceptions of AI in health learning and management. Utilizing design fiction and c...
Jamie Lee, Kyuha Jung, Cecilia Lee, Lauren MacDonnell, Jessica Kim, Daniel Otterson, Erin Newman,...