Papers
Research papers from arXiv and related sources
UIS-Digger: Towards Comprehensive Research Agent Systems for Real-world Unindexed Information Seeking
Recent advancements in LLM-based information-seeking agents have achieved record-breaking performance on established benchmarks. However, these agents remain heavily reliant on search-engine-indexe...
Chang Liu, Chuqiao Kuang, Tianyi Zhuang, Yuxin Cheng, Huichi Zhou, Xiaoguang Li, Lifeng Shang
SAMoE-VLA: A Scene Adaptive Mixture-of-Experts Vision-Language-Action Model for Autonomous Driving
Recent advances in Vision-Language-Action (VLA) models have shown promising capabilities in autonomous driving by leveraging the understanding and reasoning strengths of Large Language Models (LLMs)...
Zihan You, Hongwei Liu, Chenxu Dang, Zhe Wang, Sining Ang, Aoqi Wang, Yan Wang
Invisible Safety Threat: Malicious Finetuning for LLM via Steganography
Understanding and addressing potential safety alignment risks in large language models (LLMs) is critical for ensuring their safe and trustworthy deployment. In this paper, we highlight an insidiou...
Guangnian Wan, Xinyin Ma, Gongfan Fang, Xinchao Wang
TrianguLang: Geometry-Aware Semantic Consensus for Pose-Free 3D Localization
Localizing objects and parts from natural language in 3D space is essential for robotics, AR, and embodied AI, yet existing methods face a trade-off between the accuracy and geometric consistency o...
Bryce Grant, Aryeh Rothenberg, Atri Banerjee, Peng Wang
Language-Invariant Multilingual Speaker Verification for the TidyVoice 2026 Challenge
Multilingual speaker verification (SV) remains challenging due to limited cross-lingual data and language-dependent information in speaker embeddings. This paper presents a language-invariant multi...
Ze Li, Xiaoxiao Miao, Juan Liu, Ming Li
Toward Robust LLM-Based Judges: Taxonomic Bias Evaluation and Debiasing Optimization
Large language model (LLM)-based judges are widely adopted for automated evaluation and reward modeling, yet their judgments are often affected by judgment biases. Accurately evaluating these biase...
Hongli Zhou, Hui Huang, Rui Zhang, Kehai Chen, Bing Xu, Conghui Zhu, Tiejun Zhao, Muyun Yang
DSH-Bench: A Difficulty- and Scenario-Aware Benchmark with Hierarchical Subject Taxonomy for Subject-Driven Text-to-Image Generation
Significant progress has been achieved in subject-driven text-to-image (T2I) generation, which aims to synthesize new images depicting target subjects according to user instructions. However, evalu...
Zhenyu Hu, Qing Wang, Te Cao, Luo Liao, Longfei Lu, Liqun Liu, Shuang Li, Hang Chen, Mengge Xue, ...
EAGLE-Pangu: Accelerator-Safe Tree Speculative Decoding on Ascend NPUs
Autoregressive decoding remains a primary bottleneck in large language model (LLM) serving, motivating speculative decoding methods that reduce expensive teacher-model invocations by verifying mult...
Chang Han, Yijie Hu, Jingling Liu
From Reactive to Map-Based AI: Tuned Local LLMs for Semantic Zone Inference in Object-Goal Navigation
Object-Goal Navigation (ObjectNav) requires an agent to find and navigate to a target object category in unknown environments. While recent Large Language Model (LLM)-based agents exhibit zero-shot...
Yudai Noda, Kanji Tanaka
The AI Amplifier Effect: Defining Human-AI Intimacy and Romantic Relationships with Conversational AI
What does it mean to fall in love with something we know is virtual? The proliferation of conversational AI enables users to create customizable companions, fostering new intimate relationships tha...
Ching Christie Pang, Yi Gao, Xuetong Wang, Pan Hui
High-Fidelity Pruning for Large Language Models
Large Language Models (LLMs) have demonstrated exceptional performance across a wide range of tasks, yet their significant computational and memory requirements present major challenges for deploym...
Yijun Zhu, Jianxin Wang, Chengchao Shen
Why Large Language Models can Secretly Outperform Embedding Similarity in Information Retrieval
With the emergence of Large Language Models (LLMs), new methods in Information Retrieval are available in which relevance is estimated directly through language understanding and reasoning, instead...
Matei Benescu, Ivo Pascal de Jong
TALON: Test-time Adaptive Learning for On-the-Fly Category Discovery
On-the-fly category discovery (OCD) aims to recognize known categories while simultaneously discovering novel ones from an unlabeled online stream, using a model trained only on labeled data. Exist...
Yanan Wu, Yuhan Yan, Tailai Chen, Zhixiang Chi, ZiZhang Wu, Yi Jin, Yang Wang, Zhenbo Li
Synthetic Defect Image Generation for Power Line Insulator Inspection Using Multimodal Large Language Models
Utility companies increasingly rely on drone imagery for post-event and routine inspection, but training accurate defect-type classifiers remains difficult because defect examples are rare and insp...
Xuesong Wang, Caisheng Wang
In-Context Reinforcement Learning for Tool Use in Large Language Models
While large language models (LLMs) exhibit strong reasoning abilities, their performance on complex tasks is often constrained by the limitations of their internal knowledge. A compelling approach ...
Yaoqi Ye, Yiran Zhao, Keyu Duan, Zeyu Zheng, Kenji Kawaguchi, Cihang Xie, Michael Qizhe Shieh
Deterministic Differentiable Structured Pruning for Large Language Models
Structured pruning reduces LLM inference cost by removing low-importance architectural components. This can be viewed as learning a multiplicative gate for each component under an ℓ0 sparsity const...
Weiyu Huang, Pengle Zhang, Xiaolu Zhang, Jun Zhou, Jun Zhu, Jianfei Chen
Evaluating Generative Models via One-Dimensional Code Distributions
Most evaluations of generative models rely on feature-distribution metrics such as FID, which operate on continuous recognition features that are explicitly trained to be invariant to appearance va...
Zexi Jia, Pengcheng Luo, Yijia Zhong, Jinchao Zhang, Jie Zhou
CinemaWorld: Generative Augmented Reality with LLMs and 3D Scene Generation for Movie Augmentation
We introduce CinemaWorld, a generative augmented reality system that augments the viewer's physical surroundings with automatically generated mixed reality 3D content extracted from and synchronize...
Keiichi Ihara, DaeHo Lee, Manato Abe, Hye-Young Jo, Ryo Suzuki
Stabilized Fine-Tuning with LoRA in Federated Learning: Mitigating the Side Effect of Client Size and Rank via the Scaling Factor
Large Language Models (LLMs) are pivotal in natural language processing. The impracticality of full fine-tuning has prompted Parameter-Efficient Fine-Tuning (PEFT) methods like Low-Rank Adaptation ...
Jiayu Huang, Xiaohu Wu, Tiantian He, Qicheng Lao
Samyama: A Unified Graph-Vector Database with In-Database Optimization, Agentic Enrichment, and Hardware Acceleration
Modern data architectures are fragmented across graph databases, vector stores, analytics engines, and optimization solvers, resulting in complex ETL pipelines and synchronization overhead. We pres...
Madhulatha Mandarapu, Sandeep Kunkunuru