Papers
Research papers from arXiv and related sources
Why Does RLAIF Work At All?
Reinforcement Learning from AI Feedback (RLAIF) enables language models to improve by training on their own preference judgments, yet no theoretical account explains why this self-improvement seemi...
Robin Young
Contextualized Privacy Defense for LLM Agents
LLM agents increasingly act on users' personal information, yet existing privacy defenses remain limited in both design and adaptability. Most prior approaches rely on static or passive defenses, s...
Yule Wen, Yanzhe Zhang, Jianxun Lian, Xiaoyuan Yi, Xing Xie, Diyi Yang
TagaVLM: Topology-Aware Global Action Reasoning for Vision-Language Navigation
Vision-Language Navigation (VLN) presents a unique challenge for Large Vision-Language Models (VLMs) due to their inherent architectural mismatch: VLMs are primarily pretrained on static, disembodi...
Jiaxing Liu, Zexi Zhang, Xiaoyan Li, Boyue Wang, Yongli Hu, Baocai Yin
Delegation and Verification Under AI
As AI systems enter institutional workflows, workers must decide whether to delegate task execution to AI and how much effort to invest in verifying AI outputs, while institutions evaluate workers ...
Lingxiao Huang, Wenyang Xiao, Nisheeth K. Vishnoi
Architecting Trust in Artificial Epistemic Agents
Large language models increasingly function as epistemic agents -- entities that can 1) autonomously pursue epistemic goals and 2) actively shape our shared knowledge environment. They curate the i...
Nahema Marchal, Stephanie Chan, Matija Franklin, Manon Revel, Geoff Keeling, Roberta Fischli, Bil...
Changing Pedagogical Paradigms: Integrating Generative AI in Mathematics to Enhance Digital Literacy through 'Mathematical Battles with AI'
This paper introduces 'Math Battles with AI', an innovative competitive format designed at ITMO University to redefine the role of generative AI in mathematics education. Moving away from a purely ...
Maria Moskalenko, Alexander Trifanov, Roman Popkov, Arina Tabieva, Maria Smirnova, Konstantin Pra...
The Geometry of Learning Under AI Delegation
As AI systems shift from tools to collaborators, a central question is how the skills of humans relying on them change over time. We study this question mathematically by modeling the joint evoluti...
Lingxiao Huang, Nisheeth K. Vishnoi
SEALing the Gap: A Reference Framework for LLM Inference Carbon Estimation via Multi-Benchmark Driven Embodiment
Large Language Models are rapidly gaining traction in software engineering, yet their growing carbon footprint raises pressing sustainability concerns. While training emissions are substantial, inf...
Priyavanshi Pathania, Rohit Mehra, Vibhu Saujanya Sharma, Vikrant Kaulgud, Tiffani Nevels, Sanjay...
Reducing Labeling Effort in Architecture Technical Debt Detection through Active Learning and Explainable AI
Self-Admitted Technical Debt (SATD) refers to technical compromises explicitly admitted by developers in natural language artifacts such as code comments, commit messages, and issue trackers. Among...
Edi Sutoyo, Paris Avgeriou, Andrea Capiluppi
ShipTraj-R1: Reinforcing Ship Trajectory Prediction in Large Language Models via Group Relative Policy Optimization
Recent advancements in reinforcement fine-tuning have significantly improved the reasoning ability of large language models (LLMs). In particular, methods such as group relative policy optimization...
Yang Zhan, Yunhao Li, Zhang Chao, Yuxu Lu, Yan Li
Beyond One-Size-Fits-All: Adaptive Subgraph Denoising for Zero-Shot Graph Learning with Large Language Models
Graph-based tasks in the zero-shot setting remain a significant challenge due to data scarcity and the inability of traditional Graph Neural Networks (GNNs) to generalize to unseen domains or label...
Fengzhi Li, Liang Zhang, Yuan Zuo, Ruiqing Zhao, YanSong Liu, Yunfei Ma, Fanyu Meng, Junlan Feng
GloPath: An Entity-Centric Foundation Model for Glomerular Lesion Assessment and Clinicopathological Insights
Glomerular pathology is central to the diagnosis and prognosis of renal diseases, yet the heterogeneity of glomerular morphology and fine-grained lesion patterns remain challenging for current AI a...
Qiming He, Jing Li, Tian Guan, Yifei Ma, Zimo Zhao, Yanxia Wang, Hongjing Chen, Yingming Xu, Shua...
Eliciting Numerical Predictive Distributions of LLMs Without Autoregression
Large Language Models (LLMs) have recently been successfully applied to regression tasks -- such as time series forecasting and tabular prediction -- by leveraging their in-context learning abiliti...
Julianna Piskorz, Katarzyna Kobalczyk, Mihaela van der Schaar
Articulation in Motion: Prior-free Part Mobility Analysis for Articulated Objects By Dynamic-Static Disentanglement
Articulated objects are ubiquitous in daily life. Our goal is to achieve a high-quality reconstruction, segmentation of independent moving parts, and analysis of articulation. Recent methods analys...
Hao Ai, Wenjie Chang, Jianbo Jiao, Ales Leonardis, Ofek Eyal
Learning to Generate and Extract: A Multi-Agent Collaboration Framework For Zero-shot Document-level Event Arguments Extraction
Document-level event argument extraction (DEAE) is essential for knowledge acquisition, aiming to extract participants of events from documents. In the zero-shot setting, existing methods employ LLM...
Guangjun Zhang, Hu Zhang, Yazhou Han, Yue Fan, Yuhang Shao, Ru Li, Hongye Tan
SAE as a Crystal Ball: Interpretable Features Predict Cross-domain Transferability of LLMs without Training
In recent years, pre-trained large language models have achieved remarkable success across diverse tasks. Besides the pivotal role of self-supervised pre-training, their effectiveness in downstream...
Qi Zhang, Yifei Wang, Xiaohan Wang, Jiajun Chai, Guojun Yin, Wei Lin, Yisen Wang
Speech recognition assisted by large language models to command software orally -- Application to an augmented and virtual reality web app for immersive molecular graphics
This project successfully developed, evaluated and integrated a Voice User Interface (VUI) into a web application that we are developing for immersive molecular graphics. Said app provides augmente...
Fabio Cortes Rodriguez, Luciano Abriata
SpecLoop: An Agentic RTL-to-Specification Framework with Formal Verification Feedback Loop
RTL implementations frequently lack up-to-date or consistent specifications, making comprehension, maintenance, and verification costly and error-prone. While prior work has explored generating spe...
Fu-Chieh Chang, Yu-Hsin Yang, Hung-Ming Huang, Yun-Chia Hsu, Yin-Yu Lin, Ming-Fang Tsai, Chun-Chi...
Kraken: Higher-order EM Side-Channel Attacks on DNNs in Near and Far Field
The multi-million dollar investment required for modern machine learning (ML) has made large ML models a prime target for theft. In response, the field of model stealing has emerged. Attacks based ...
Peter Horvath, Ilia Shumailov, Lukasz Chmielewski, Lejla Batina, Yuval Yarom
LLandMark: A Multi-Agent Framework for Landmark-Aware Multimodal Interactive Video Retrieval
The increasing diversity and scale of video data demand retrieval systems capable of multimodal understanding, adaptive reasoning, and domain-specific knowledge integration. This paper presents LLa...
Minh-Chi Phung, Thien-Bao Le, Cam-Tu Tran-Thi, Thu-Dieu Nguyen-Thi, Vu-Hung Dao