Papers
Research papers from arXiv and related sources
Design-Specification Tiling for ICL-based CAD Code Generation
Large language models (LLMs) have demonstrated remarkable capabilities in code generation, yet they underperform on domain-specific tasks such as Computer-Aided Design (CAD) code generation due to ...
Yali Du, San-Zhuo Xi, Hui Sun, Ming Li
AI Planning Framework for LLM-Based Web Agents
Developing autonomous agents for web-based tasks is a core challenge in AI. While Large Language Model (LLM) agents can interpret complex user requests, they often operate as black boxes, making it...
Orit Shahnovsky, Rotem Dror
Cost-Efficient Multimodal LLM Inference via Cross-Tier GPU Heterogeneity
Multimodal large language model (MLLM) inference splits into two phases with opposing hardware demands: vision encoding is compute-bound, while language generation is memory-bandwidth-bound. We sho...
Donglin Yu
Fisher information based lower bounds on the cost of quantum phase estimation
Quantum phase estimation (QPE) is a cornerstone of quantum algorithms designed to estimate the eigenvalues of a unitary operator. QPE is typically implemented through two paradigms with distinct ci...
Ryosuke Kimura, Kosuke Mitarai
FGTR: Fine-Grained Multi-Table Retrieval via Hierarchical LLM Reasoning
With the rapid advancement of large language models (LLMs), growing efforts have been made on LLM-based table retrieval. However, existing studies typically focus on single-table query, and impleme...
Chaojie Sun, Bin Cao, Tiantian Li, Chenyu Hou, Ruizhe Li, Qing Fan
Seeing Eye to Eye: Enabling Cognitive Alignment Through Shared First-Person Perspective in Human-AI Collaboration
Despite advances in multimodal AI, current vision-based assistants often remain inefficient in collaborative tasks. We identify two key gulfs: a communication gulf, where users must translate rich ...
Zhuyu Teng, Pei Chen, Yichen Cai, Ruoqing Lu, Zhaoqu Jiang, Jiayang Li, Weitao You, Lingyun Sun
EvolveCoder: Evolving Test Cases via Adversarial Verification for Code Reinforcement Learning
Reinforcement learning with verifiable rewards (RLVR) is a promising approach for improving code generation in large language models, but its effectiveness is limited by weak and static verificatio...
Chi Ruan, Dongfu Jiang, Huaye Zeng, Ping Nie, Wenhu Chen
RXNRECer Enables Fine-grained Enzymatic Function Annotation through Active Learning and Protein Language Models
A key challenge in enzyme annotation is identifying the biochemical reactions catalyzed by proteins. Most existing methods rely on Enzyme Commission (EC) numbers as intermediaries: they first predi...
Zhenkun Shi, Jun Zhu, Dehang Wang, BoYu Chen, Qianqian Yuan, Zhitao Mao, Fan Wei, Weining Wu, Xia...
STRAP-ViT: Segregated Tokens with Randomized -- Transformations for Defense against Adversarial Patches in ViTs
Adversarial patches are physically realizable localized noise that can hijack Vision Transformer (ViT) self-attention, pulling focus toward a small, high-contrast region and corrupting t...
Nandish Chattopadhyay, Anadi Goyal, Chandan Karfa, Anupam Chattopadhyay
Experimental evidence of progressive ChatGPT models self-convergence
Large Language Models (LLMs) that undergo recursive training on synthetically generated data are susceptible to model collapse, a phenomenon marked by the generation of meaningless output. Existing...
Konstantinos F. Xylogiannopoulos, Petros Xanthopoulos, Panagiotis Karampelas, Georgios A. Bakamitsos
Colluding LoRA: A Composite Attack on LLM Safety Alignment
We introduce Colluding LoRA (CoLoRA), an attack in which each adapter appears benign and plausibly functional in isolation, yet their linear composition consistently compromises safety. Unlike atta...
Sihao Ding
Why Neural Structural Obfuscation Can't Kill White-Box Watermarks for Good!
Neural Structural Obfuscation (NSO) (USENIX Security'23) is a family of "zero cost" structure-editing transforms (nso_zero, nso_clique, nso_split) that inject dummy ...
Yanna Jiang, Guangsheng Yu, Qingyuan Yu, Yi Chen, Qin Wang
Testing the AdS/CFT Correspondence Through Thermodynamic Geometry of Nonlinear Electrodynamics AdS Black Holes with Generalized Entropies
We investigate the thermodynamics and thermodynamic geometry of several Anti-de Sitter black hole solutions arising from nonlinear electromagnetic theories, namely the ModMax, nonlinear electrodyn...
Abhishek Baruah, Amijit Bhattacharjee, Prabwal Jyoti Phukon
MetaKE: Meta-learning Aligned Knowledge Editing via Bi-level Optimization
Knowledge editing (KE) aims to precisely rectify specific knowledge in Large Language Models (LLMs) without disrupting general capabilities. State-of-the-art methods suffer from an open-loop contro...
Shuxin Liu, Ou Wu
Disentangled Latent Dynamics Manifold Fusion for Solving Parameterized PDEs
Generalizing neural surrogate models across different PDE parameters remains difficult because changes in PDE coefficients often make learning harder and optimization less stable. The problem becom...
Zhangyong Liang, Ji Zhang
Multivariate normality test based on the uniform distribution on the Stiefel manifold
This study presents a new procedure for necessary tests of multivariate normality based on the uniform distribution on the Stiefel manifold. We demonstrate that the test statistic, which is formed ...
Koki Shimizu, Toshiya Iwashita
HyGra: Accelerating Network-State Simulation for LLM Training in DCNs via Adaptive Packet-Flow Granularity
In recent years, large language models (LLMs) have driven substantial intelligent transformation across diverse industries. Commercial LLM training is typically performed over data center networks ...
Wenyi Wang, Zheng Wu, Yanmeng Wang, Haolin Mao, Lei Han, Gaogang Xie, Fu Xiao
Vision Verification Enhanced Fusion of VLMs for Efficient Visual Reasoning
With the growing number and diversity of Vision-Language Models (VLMs), many works explore language-based ensemble, collaboration, and routing techniques across multiple VLMs to improve multi-model...
Selim Furkan Tekin, Yichang Xu, Gaowen Liu, Ramana Rao Kompella, Margaret L. Loper, Ling Liu
RetroReasoner: A Reasoning LLM for Strategic Retrosynthesis Prediction
Retrosynthesis prediction is a core task in organic synthesis that aims to predict reactants for a given product molecule. Traditionally, chemists select a plausible bond disconnection and derive c...
Hanbum Ko, Chanhui Lee, Ye Rin Kim, Rodrigo Hormazabal, Sehui Han, Sungbin Lim, Sungwoong Kim
From Text to Forecasts: Bridging Modality Gap with Temporal Evolution Semantic Space
Incorporating textual information into time-series forecasting holds promise for addressing event-driven non-stationarity; however, a fundamental modality gap hinders effective fusion: textual desc...
Lehui Li, Yuyao Wang, Jisheng Yan, Wei Zhang, Jinliang Deng, Haoliang Sun, Zhongyi Han, Yongshun ...