Papers
Research papers from arXiv and related sources
Compress the Easy, Explore the Hard: Difficulty-Aware Entropy Regularization for Efficient LLM Reasoning
Chain-of-Thought (CoT) has substantially empowered Large Language Models (LLMs) to tackle complex reasoning tasks, yet the verbose nature of explicit reasoning steps incurs prohibitive inference la...
Qin-Wen Luo, Sheng Ren, Xiang Chen, Rui Liu, Jun Fang, Naiqiang Tan, Sheng-Jun Huang
MobilityBench: A Benchmark for Evaluating Route-Planning Agents in Real-World Mobility Scenarios
Route-planning agents powered by large language models (LLMs) have emerged as a promising paradigm for supporting everyday human mobility through natural language interaction and tool-mediated deci...
Zhiheng Song, Jingshuai Zhang, Chuan Qin, Chao Wang, Chao Chen, Longfei Xu, Kaikui Liu, Xiangxian...
Tackling Privacy Heterogeneity in Differentially Private Federated Learning
Differentially private federated learning (DP-FL) enables clients to collaboratively train machine learning models while preserving the privacy of their local data. However, most existing DP-FL app...
Ruichen Xu, Ying-Jun Angela Zhang, Jianwei Huang
Fine-grained Semantics Integration for Large Language Model-based Recommendation
Recent advances in Large Language Models (LLMs) have shifted recommendation systems from the discriminative paradigm to the LLM-based generative paradigm, where the recommender autoregressively ...
Jiawen Feng, Xiaoyu Kong, Leheng Sheng, Bin Wu, Chao Yi, Feifang Yang, Xiang-Rong Sheng, Han Zhu,...
TorchLean: Formalizing Neural Networks in Lean
Neural networks are increasingly deployed in safety- and mission-critical pipelines, yet many verification and analysis results are produced outside the programming environment that defines and run...
Robert Joseph George, Jennifer Cruden, Xiangru Zhong, Huan Zhang, Anima Anandkumar
HyperKKL: Enabling Non-Autonomous State Estimation through Dynamic Weight Conditioning
This paper proposes HyperKKL, a novel learning approach for designing Kazantzis-Kravaris/Luenberger (KKL) observers for non-autonomous nonlinear systems. While KKL observers offer a rigorous theore...
Yahia Salaheldin Shaaban, Salem Lahlou, Abdelrahman Sayed Sayed
ContextRL: Enhancing MLLM's Knowledge Discovery Efficiency with Context-Augmented RL
We propose ContextRL, a novel framework that leverages context augmentation to overcome these bottlenecks. Specifically, to enhance Identifiability, we provide the reward model with full reference ...
Xingyu Lu, Jinpeng Wang, YiFan Zhang, Shijie Ma, Xiao Hu, Tianke Zhang, Haonan Fan, Kaiyu Jiang, ...
Semantic Tube Prediction: Beating LLM Data Efficiency with JEPA
Large Language Models (LLMs) obey consistent scaling laws -- empirical power-law fits that predict how loss decreases with compute, data, and parameters. While predictive, these laws are descriptiv...
Hai Huang, Yann LeCun, Randall Balestriero
Spectrally Distilled Representations Aligned with Instruction-Augmented LLMs for Satellite Imagery
Vision-language foundation models (VLFMs) promise zero-shot and retrieval understanding for Earth observation. While operational satellite systems often lack full multi-spectral coverage, making RG...
Minh Kha Do, Wei Xiang, Kang Han, Di Wu, Khoa Phan, Yi-Ping Phoebe Chen, Gaowen Liu, Ramana Rao K...
Mitigating Membership Inference in Intermediate Representations via Layer-wise MIA-risk-aware DP-SGD
In Embedding-as-an-Interface (EaaI) settings, pre-trained models are queried for Intermediate Representations (IRs). The distributional properties of IRs can leak training-set membership signals, e...
Jiayang Meng, Tao Huang, Chen Hou, Guolong Zheng, Hong Chen
EvolveGen: Algorithmic Level Hardware Model Checking Benchmark Generation through Reinforcement Learning
Progress in hardware model checking depends critically on high-quality benchmarks. However, the community faces a significant benchmark gap: existing suites are limited in number, often distributed...
Guangyu Hu, Xiaofeng Zhou, Wei Zhang, Hongce Zhang
CoLyricist: Enhancing Lyric Writing with AI through Workflow-Aligned Support
We propose CoLyricist, an AI-assisted lyric writing tool designed to support the typical workflows of experienced lyricists and enhance their creative efficiency. While lyricists have unique proces...
Masahiro Yoshida, Bingxuan Li, Songyan Zhao, Qinyi Zhou, Shiwei Hu, Xiang Anthony Chen, Nanyun Peng
SideQuest: Model-Driven KV Cache Management for Long-Horizon Agentic Reasoning
Long-running agentic tasks, such as deep research, require multi-hop reasoning over information distributed across multiple webpages and documents. In such tasks, the LLM context is dominated by to...
Sanjay Kariyappa, G. Edward Suh
Beyond Vintage Rotation: Bias-Free Sparse Representation Learning with Oracle Inference
Learning low-dimensional latent representations is a central topic in statistics and machine learning, and rotation methods have long been used to obtain sparse and interpretable representations. D...
Chengyu Cui, Yunxiao Chen, Jing Ouyang, Gongjun Xu
Towards Faithful Industrial RAG: A Reinforced Co-adaptation Framework for Advertising QA
Industrial advertising question answering (QA) is a high-stakes task in which hallucinated content, particularly fabricated URLs, can lead to financial loss, compliance violations, and legal risk. ...
Wenwei Li, Ming Xu, Tianle Xia, Lingxiang Hu, Yiding Sun, Linfang Shang, Liqun Liu, Peng Shu, Hua...
Strategy Executability in Mathematical Reasoning: Leveraging Human-Model Differences for Effective Guidance
Example-based guidance is widely used to improve mathematical reasoning at inference time, yet its effectiveness is highly unstable across problems and models, even when the guidance is correct and ...
Weida Liang, Yiyou Sun, Shuyuan Nan, Chuang Li, Dawn Song, Kenji Kawaguchi
Metamorphic Testing of Vision-Language Action-Enabled Robots
Vision-Language-Action (VLA) models are multimodal robotic task controllers that, given an instruction and visual inputs, produce a sequence of low-level control actions (or motor commands) enablin...
Pablo Valle, Sergio Segura, Shaukat Ali, Aitor Arrieta
GIFSplat: Generative Prior-Guided Iterative Feed-Forward 3D Gaussian Splatting from Sparse Views
Feed-forward 3D reconstruction offers substantial runtime advantages over per-scene optimization, which remains slow at inference and often fragile under sparse views. However, existing feed-forwar...
Tianyu Chen, Wei Xiang, Kang Han, Yu Lu, Di Wu, Gaowen Liu, Ramana Rao Kompella
Operationalizing Fairness: Post-Hoc Threshold Optimization Under Hard Resource Limits
The deployment of machine learning in high-stakes domains requires a balance between predictive safety and algorithmic fairness. However, existing fairness interventions often assume unconstraine...
Moirangthem Tiken Singh, Amit Kalita, Sapam Jitu Singh
RepoMod-Bench: A Benchmark for Code Repository Modernization via Implementation-Agnostic Testing
The evolution of AI coding agents has shifted the frontier from simple snippet completion to autonomous repository-level engineering. However, evaluating these agents remains ill-posed in general c...
Xuefeng Li, Nir Ben-Israel, Yotam Raz, Belal Ahmed, Doron Serebro, Antoine Raux