Papers
Research papers from arXiv and related sources
STEP: Scientific Time-Series Encoder Pretraining via Cross-Domain Distillation
Scientific time series are central to scientific AI but are typically sparse, highly heterogeneous, and limited in scale, making unified representation learning particularly challenging. Meanwhile,...
Chen Zhang, Liwei Liu, Jun Tao, Xiaoyu Yang, Xuenan Xu, Kai Chen, Bowen Zhou, Wen Wu, Chao Zhang
Cognitive Amplification vs Cognitive Delegation in Human-AI Systems: A Metric Framework
Artificial intelligence is increasingly embedded in human decision-making, where it can either enhance human reasoning or induce excessive cognitive dependence. This paper introduces a conceptual a...
Eduardo Di Santi
Multimodal Model for Computational Pathology: Representation Learning and Image Compression
Whole slide imaging (WSI) has transformed digital pathology by enabling computational analysis of gigapixel histopathology images. Recent foundation model advances have accelerated progress in comp...
Peihang Wu, Zehong Chen, Lijian Xu
Robust Discrete Pricing Optimization via Multiple-Choice Knapsack Reductions
We study a discrete portfolio pricing problem that selects one price per product from a finite menu under margin and fairness constraints. To account for demand uncertainty, we incorporate a budget...
Zi Yuan Eric Shao
Benchmarking PDF Parsers on Table Extraction with LLM-based Semantic Evaluation
Reliably extracting tables from PDFs is essential for large-scale scientific data mining and knowledge base construction, yet existing evaluation approaches rely on rule-based metrics that fail to ...
Pius Horn, Janis Keuper
Click-to-Ask: An AI Live Streaming Assistant with Offline Copywriting and Online Interactive QA
Live streaming commerce has become a prominent form of broadcasting in the modern era. To facilitate more efficient and convenient product promotions for streamers, we present Click-to-Ask, an AI-d...
Ruizhi Yu, Keyang Zhong, Peng Liu, Qi Wu, Haoran Zhang, Yanhao Zhang, Chen Chen, Haonan Lu
Beyond TVLA: Anderson-Darling Leakage Assessment for Neural Network Side-Channel Leakage Detection
Test Vector Leakage Assessment (TVLA) based on Welch's $t$-test has become a standard tool for detecting side-channel leakage. However, its mean-based nature can limit sensitivity when leakage mani...
Ján Mikulec, Jakub Breier, Xiaolu Hou
MOSAIC: Multi-Objective Slice-Aware Iterative Curation for Alignment
We study how to allocate a fixed supervised fine-tuning budget when three objectives must be balanced at once: multi-turn safety alignment, low over-refusal on benign boundary queries, and instruct...
Yipu Dou, Wang Yang
An Onto-Relational-Sophic Framework for Governing Synthetic Minds
The rapid evolution of artificial intelligence, from task-specific systems to foundation models exhibiting broad, flexible competence across reasoning, creative synthesis, and social interaction, h...
Huansheng Ning, Jianguo Ding
D-Mem: A Dual-Process Memory System for LLM Agents
Driven by the development of persistent, self-adapting autonomous agents, equipping these systems with high-fidelity memory access for long-horizon reasoning has emerged as a critical requirement. ...
Zhixing You, Jiachen Yuan, Jason Cai
GenVideoLens: Where LVLMs Fall Short in AI-Generated Video Detection?
In recent years, AI-generated videos have become increasingly realistic and sophisticated. Meanwhile, Large Vision-Language Models (LVLMs) have shown strong potential for detecting such content. Ho...
Yueying Zou, Pei Pei Li, Zekun Li, Xinyu Guo, Xing Cui, Huaibo Huang, Ran He
REST: Receding Horizon Explorative Steiner Tree for Zero-Shot Object-Goal Navigation
Zero-shot object-goal navigation (ZSON) requires navigating unknown environments to find a target object without task-specific training. Prior hierarchical training-free solutions invest in scene u...
Shuqi Xiao, Maani Ghaffari, Chengzhong Xu, Hui Kong
Reduced-order turbulent flow solver to simulate streamwise periodic fins with iso-thermal walls
Assessment of the thermo-hydraulic performance of heat exchangers using computational fluid dynamics is a challenging task. The intricate geometries of a heat exchanger require a fine discretizatio...
Nitish Anand, Praharsh Pai Raikar, Carlo De Servi
Learning to Self-Evolve
We introduce Learning to Self-Evolve (LSE), a reinforcement learning framework that trains large language models (LLMs) to improve their own contexts at test time. We situate LSE in the setting of ...
Xiaoyin Chen, Canwen Xu, Yite Wang, Boyi Liu, Zhewei Yao, Yuxiong He
ZEBRAARENA: A Diagnostic Simulation Environment for Studying Reasoning-Action Coupling in Tool-Augmented LLMs
Tool-augmented large language models (LLMs) must tightly couple multi-step reasoning with external actions, yet existing benchmarks often confound this interplay with complex environment dynamics, ...
Wanjia Zhao, Ludwig Schmidt, James Zou, Vidhisha Balachandran, Lingjiao Chen
DiscoPhon: Benchmarking the Unsupervised Discovery of Phoneme Inventories With Discrete Speech Units
We introduce DiscoPhon, a multilingual benchmark for evaluating unsupervised phoneme discovery from discrete speech units. DiscoPhon covers 6 dev and 6 test languages, chosen to span a wide range o...
Maxime Poli, Manel Khentout, Angelo Ortiz Tandazo, Ewan Dunbar, Emmanuel Chemla, Emmanuel Dupoux
SQL-Commenter: Aligning Large Language Models for SQL Comment Generation with Direct Preference Optimization
SQL query comprehension is a significant challenge due to complex syntax, diverse join types, and deep nesting. Many queries lack adequate comments, severely hindering code readability, maintainabi...
Lei Yu, Peng Wang, Jingyuan Zhang, Xin Wang, Jia Xu, Li Yang, Changzhi Deng, Jiajia Ma, Fengjun Z...
AutORAN: LLM-driven Natural Language Programming for Agile xApp Development
Traditional RAN systems are closed and monolithic, stifling innovation. The openness and programmability enabled by Open Radio Access Network (O-RAN) are envisioned to revolutionize cellular networ...
Xin Li, Shiming Yu, Leming Shen, Jianing Zhang, Yuanqing Zheng, Yaxiong Xie
From Connectivity to Multi-Orbit Intelligence: Space-Based Data Center Architectures for 6G and Beyond
Direct handset-to-satellite (DHTS) communication is emerging as a core capability of 6G non-terrestrial networks, enabling standard devices to directly access low Earth orbit (LEO) satellites. Whil...
Shimaa Naser, Maryam Tariq, Raneem Abdel-Rahim, De Mi, Azzam Mourad, Hadi Otrok, Mahmoud Al-Qutay...
Design and implementation of a high-density sub-nanosecond timing system for a C-band photocathode electron gun test platform
This paper presents the design and implementation of a high-density, deterministic trigger distribution system tailored for the C-band photocathode electron gun test platform at the Southern Advanc...
Peng Zhu, Kangjia Xue, Lin Wang, Yuliang Zhang, Yongcheng Hea, Xuan Wu, Mingtao Li, Sinong Cheng,...