Papers
Research papers from arXiv and related sources
Hardware Implementation of Photonic Spiking Hash Retrieval
Hashing retrieval is a pivotal technology for large-scale similarity search, widely applied in retrieval-augmented generation (RAG) for large language models (LLMs), massive image repositories, and...
Shangxuan Shi, Shuiying Xiang, Xintao Zeng, Yonghang Chen, Wanting Yu, Yahui Zhang, Xingxing Guo,...
Ouroboros: Wafer-Scale SRAM CIM with Token-Grained Pipelining for Large Language Model Inference
Conventional LLM inference architectures suffer from high energy and latency due to frequent data movement across memory hierarchies. We propose Ouroboros, a wafer-scale SRAM-based Computing-in-Mem...
Yiqi Liu, Yudong Pan, Mengdi Wang, Shixin Zhao, Haonan Zhu, Yinhe Han, Lei Zhang, Ying Wang
An Empirical Analysis of Calibration and Selective Prediction in Multimodal Clinical Condition Classification
As artificial intelligence systems move toward clinical deployment, ensuring reliable prediction behavior is fundamental for safety-critical decision-making tasks. One proposed safeguard is selecti...
L. Julián Lechuga López, Farah E. Shamout, Tim G. J. Rudner
Designing XY and Dzyaloshinskii–Moriya couplings in Majorana Cooper pair boxes
We theoretically study how to design spin couplings in networks of Majorana Cooper pair boxes (MCBs) connected by multiple normal-metal leads. The inter-box interaction is generated by the conducti...
Manato Teranishi, Shintaro Hoshino, Ai Yamakage
From "What" to "How": Constrained Reasoning for Autoregressive Image Generation
Autoregressive image generation has seen recent improvements with the introduction of chain-of-thought and reinforcement learning. However, current methods merely specify "What" details to depict b...
Ruxue Yan, Xubo Liu, Wenya Guo, Zhengkun Zhang, Ying Zhang, Xiaojie Yuan
A Natural Language Agentic Approach to Study Affective Polarization
Affective polarization has been central to political and social studies, with growing focus on social media, where partisan divisions are often exacerbated. Real-world studies tend to have limited ...
Stephanie Anneris Malvicini, Ewelina Gajewska, Arda Derbent, Katarzyna Budzynska, Jarosław A. Chu...
FinTexTS: Financial Text-Paired Time-Series Dataset via Semantic-Based and Multi-Level Pairing
The financial domain involves a variety of important time-series problems. Recently, time-series analysis methods that jointly leverage textual and numerical information have gained increasing atte...
Jaehoon Lee, Suhwan Park, Tae Yoon Lim, Seunghan Lee, Jun Seo, Dongwan Kang, Hwanil Choi, Minjae ...
Graph-GRPO: Stabilizing Multi-Agent Topology Learning via Group Relative Policy Optimization
Optimizing communication topology is fundamental to the efficiency and effectiveness of Large Language Model (LLM)-based Multi-Agent Systems (MAS). While recent approaches utilize reinforcement lea...
Yueyang Cang, Xiaoteng Zhang, Erlu Zhao, Zehua Ji, Yuhang Liu, Yuchen He, Zhiyuan Ning, Chen Yiju...
Large Language Model Empowered CSI Feedback in Massive MIMO Systems
Despite the success of large language models (LLMs) across domains, their potential for efficient channel state information (CSI) compression and feedback in frequency division duplex (FDD) massive...
Jie Wu, Wei Xu, Le Liang, Xiaohu You, Mérouane Debbah
HateMirage: An Explainable Multi-Dimensional Dataset for Decoding Faux Hate and Subtle Online Abuse
Subtle and indirect hate speech remains an underexplored challenge in online safety research, particularly when harmful intent is embedded within misleading or manipulative narratives. Existing hat...
Sai Kartheek Reddy Kasu, Shankar Biradar, Sunil Saumya, Md. Shad Akhtar
LLMs for High-Frequency Decision-Making: Normalized Action Reward-Guided Consistency Policy Optimization
While Large Language Models (LLMs) form the cornerstone of sequential decision-making agent development, they have inherent limitations in high-frequency decision tasks. Existing research mainly fo...
Yang Zhao, Zihao Li, Zhiyu Jiang, Dandan Ma, Ganchao Liu, Wenzhe Zhao
Causal Learning Should Embrace the Wisdom of the Crowd
Learning causal structures typically represented by directed acyclic graphs (DAGs) from observational data is notoriously challenging due to the combinatorial explosion of possible graphs and inher...
Ryan Feng Lin, Yuantao Wei, Huiling Liao, Xiaoning Qian, Shuai Huang
ITLC at SemEval-2026 Task 11: Normalization and Deterministic Parsing for Formal Reasoning in LLMs
Large language models suffer from content effects in reasoning tasks, particularly in multi-lingual contexts. We introduce a novel method that reduces these biases through explicit structural abstr...
Wicaksono Leksono Muhamad, Joanito Agili Lopo, Tack Hwa Wong, Muhammad Ravi Shulthan Habibi, Samu...
IMR-LLM: Industrial Multi-Robot Task Planning and Program Generation using Large Language Models
In modern industrial production, multiple robots often collaborate to complete complex manufacturing tasks. Large language models (LLMs), with their strong reasoning capabilities, have shown potent...
Xiangyu Su, Juzhan Xu, Oliver van Kaick, Kai Xu, Ruizhen Hu
SorryDB: Can AI Provers Complete Real-World Lean Theorems?
We present SorryDB, a dynamically-updating benchmark of open Lean tasks drawn from 78 real-world formalization projects on GitHub. Unlike existing static benchmarks, often composed of competition p...
Austin Letson, Leopoldo Sarra, Auguste Poiroux, Oliver Dressler, Paul Lezeau, Dhyan Aranha, Frede...
VoiceAgentRAG: Solving the RAG Latency Bottleneck in Real-Time Voice Agents Using Dual-Agent Architectures
We present VoiceAgentRAG, an open-source dual-agent memory router that decouples retrieval from response generation. A background Slow Thinker agent continuously monitors the conversation stream, p...
Jielin Qiu, Jianguo Zhang, Zixiang Chen, Liangwei Yang, Ming Zhu, Juntao Tan, Haolin Chen, Wentin...
Frontier Models Can Take Actions at Low Probabilities
Pre-deployment evaluations inspect only a limited sample of model actions. A malicious model seeking to evade oversight could exploit this by randomizing when to "defect": misbehaving so rarely tha...
Alex Serrano, Wen Xing, David Lindner, Erik Jenner
From Leaderboard to Deployment: Code Quality Challenges in AV Perception Repositories
Autonomous vehicle (AV) perception models are typically evaluated solely on benchmark performance metrics, with limited attention to code quality, production readiness and long-term maintainability...
Mateus Karvat, Bram Adams, Sidney Givigi
Personal Health Data Integration and Intelligence through Semantic Web and Blockchain Technologies
Data integration among various stakeholders in the healthcare space remains a challenge, despite the impressive advances in Health AI in the past decade. There is a lot of "messy" non-standard bu...
Oshani Seneviratne, Manan Shukla, Jianjing Lin
Organizing, Orchestrating, and Benchmarking Agent Skills at Ecosystem Scale
The rapid proliferation of Claude agent skills has raised the central question of how to effectively leverage, manage, and scale the agent skill ecosystem. In this paper, we propose AgentSkillOS, t...
Hao Li, Chunjiang Mu, Jianhao Chen, Siyue Ren, Zhiyao Cui, Yiqun Zhang, Lei Bai, Shuyue Hu