Research

Papers

Research papers from arXiv and related sources

Total: 4694, AI/LLM: 2583, Testing: 2111
AI LLM

Design-Specification Tiling for ICL-based CAD Code Generation

Large language models (LLMs) have demonstrated remarkable capabilities in code generation, yet they underperform on domain-specific tasks such as Computer-Aided Design (CAD) code generation due to ...

Yali Du, San-Zhuo Xi, Hui Sun, Ming Li

2603.12712 2026-03-13
AI LLM

AI Planning Framework for LLM-Based Web Agents

Developing autonomous agents for web-based tasks is a core challenge in AI. While Large Language Model (LLM) agents can interpret complex user requests, they often operate as black boxes, making it...

Orit Shahnovsky, Rotem Dror

2603.12710 2026-03-13
AI LLM

Cost-Efficient Multimodal LLM Inference via Cross-Tier GPU Heterogeneity

Multimodal large language model (MLLM) inference splits into two phases with opposing hardware demands: vision encoding is compute-bound, while language generation is memory-bandwidth-bound. We sho...

Donglin Yu

2603.12707 2026-03-13
TESTING

Fisher information based lower bounds on the cost of quantum phase estimation

Quantum phase estimation (QPE) is a cornerstone of quantum algorithms designed to estimate the eigenvalues of a unitary operator. QPE is typically implemented through two paradigms with distinct ci...

Ryosuke Kimura, Kosuke Mitarai

2603.12706 2026-03-13
AI LLM

FGTR: Fine-Grained Multi-Table Retrieval via Hierarchical LLM Reasoning

With the rapid advancement of large language models (LLMs), growing efforts have been made on LLM-based table retrieval. However, existing studies typically focus on single-table query, and impleme...

Chaojie Sun, Bin Cao, Tiantian Li, Chenyu Hou, Ruizhe Li, Qing Fan

2603.12702 2026-03-13
AI LLM

Seeing Eye to Eye: Enabling Cognitive Alignment Through Shared First-Person Perspective in Human-AI Collaboration

Despite advances in multimodal AI, current vision-based assistants often remain inefficient in collaborative tasks. We identify two key gulfs: a communication gulf, where users must translate rich ...

Zhuyu Teng, Pei Chen, Yichen Cai, Ruoqing Lu, Zhaoqu Jiang, Jiayang Li, Weitao You, Lingyun Sun

2603.12701 2026-03-13
TESTING

EvolveCoder: Evolving Test Cases via Adversarial Verification for Code Reinforcement Learning

Reinforcement learning with verifiable rewards (RLVR) is a promising approach for improving code generation in large language models, but its effectiveness is limited by weak and static verificatio...

Chi Ruan, Dongfu Jiang, Huaye Zeng, Ping Nie, Wenhu Chen

2603.12698 2026-03-13
TESTING

RXNRECer Enables Fine-grained Enzymatic Function Annotation through Active Learning and Protein Language Models

A key challenge in enzyme annotation is identifying the biochemical reactions catalyzed by proteins. Most existing methods rely on Enzyme Commission (EC) numbers as intermediaries: they first predi...

Zhenkun Shi, Jun Zhu, Dehang Wang, BoYu Chen, Qianqian Yuan, Zhitao Mao, Fan Wei, Weining Wu, Xia...

2603.12694 2026-03-13
TESTING

STRAP-ViT: Segregated Tokens with Randomized Transformations for Defense against Adversarial Patches in ViTs

Adversarial patches are physically realizable localized noise, which are able to hijack Vision Transformers (ViT) self-attention, pulling focus toward a small, high-contrast region and corrupting t...

Nandish Chattopadhyay, Anadi Goyal, Chandan Karfa, Anupam Chattopadhyay

2603.12688 2026-03-13
AI LLM

Experimental evidence of progressive ChatGPT models self-convergence

Large Language Models (LLMs) that undergo recursive training on synthetically generated data are susceptible to model collapse, a phenomenon marked by the generation of meaningless output. Existing...

Konstantinos F. Xylogiannopoulos, Petros Xanthopoulos, Panagiotis Karampelas, Georgios A. Bakamitsos

2603.12683 2026-03-13
AI LLM

Colluding LoRA: A Composite Attack on LLM Safety Alignment

We introduce Colluding LoRA (CoLoRA), an attack in which each adapter appears benign and plausibly functional in isolation, yet their linear composition consistently compromises safety. Unlike atta...

Sihao Ding

2603.12681 2026-03-13
TESTING

Why Neural Structural Obfuscation Can't Kill White-Box Watermarks for Good!

Neural Structural Obfuscation (NSO) (USENIX Security'23) is a family of "zero cost" structure-editing transforms (nso_zero, nso_clique, nso_split) that inject dummy ...

Yanna Jiang, Guangsheng Yu, Qingyuan Yu, Yi Chen, Qin Wang

2603.12679 2026-03-13
TESTING

Testing the AdS/CFT Correspondence Through Thermodynamic Geometry of Nonlinear Electrodynamics AdS Black Holes with Generalized Entropies

We investigate the thermodynamics and thermodynamic geometry of several Anti-de Sitter black hole solutions arising from nonlinear electromagnetic theories, namely the ModMax, nonlinear electrodyn...

Abhishek Baruah, Amijit Bhattacharjee, Prabwal Jyoti Phukon

2603.12678 2026-03-13
AI LLM

MetaKE: Meta-learning Aligned Knowledge Editing via Bi-level Optimization

Knowledge editing (KE) aims to precisely rectify specific knowledge in Large Language Models (LLMs) without disrupting general capabilities. State-of-the-art methods suffer from an open-loop contro...

Shuxin Liu, Ou Wu

2603.12677 2026-03-13
TESTING

Disentangled Latent Dynamics Manifold Fusion for Solving Parameterized PDEs

Generalizing neural surrogate models across different PDE parameters remains difficult because changes in PDE coefficients often make learning harder and optimization less stable. The problem becom...

Zhangyong Liang, Ji Zhang

2603.12676 2026-03-13
TESTING

Multivariate normality test based on the uniform distribution on the Stiefel manifold

This study presents a new procedure for necessary tests of multivariate normality based on the uniform distribution on the Stiefel manifold. We demonstrate that the test statistic, which is formed ...

Koki Shimizu, Toshiya Iwashita

2603.12672 2026-03-13
AI LLM

HyGra: Accelerating Network-State Simulation for LLM Training in DCNs via Adaptive Packet-Flow Granularity

In recent years, large language models (LLMs) have driven substantial intelligent transformation across diverse industries. Commercial LLM training is typically performed over data center networks ...

Wenyi Wang, Zheng Wu, Yanmeng Wang, Haolin Mao, Lei Han, Gaogang Xie, Fu Xiao

2603.12671 2026-03-13
TESTING

Vision Verification Enhanced Fusion of VLMs for Efficient Visual Reasoning

With the growing number and diversity of Vision-Language Models (VLMs), many works explore language-based ensemble, collaboration, and routing techniques across multiple VLMs to improve multi-model...

Selim Furkan Tekin, Yichang Xu, Gaowen Liu, Ramana Rao Kompella, Margaret L. Loper, Ling Liu

2603.12669 2026-03-13
AI LLM

RetroReasoner: A Reasoning LLM for Strategic Retrosynthesis Prediction

Retrosynthesis prediction is a core task in organic synthesis that aims to predict reactants for a given product molecule. Traditionally, chemists select a plausible bond disconnection and derive c...

Hanbum Ko, Chanhui Lee, Ye Rin Kim, Rodrigo Hormazabal, Sehui Han, Sungbin Lim, Sungwoong Kim

2603.12666 2026-03-13
AI LLM

From Text to Forecasts: Bridging Modality Gap with Temporal Evolution Semantic Space

Incorporating textual information into time-series forecasting holds promise for addressing event-driven non-stationarity; however, a fundamental modality gap hinders effective fusion: textual desc...

Lehui Li, Yuyao Wang, Jisheng Yan, Wei Zhang, Jinliang Deng, Haoliang Sun, Zhongyi Han, Yongshun ...

2603.12664 2026-03-13