Papers
Research papers from arXiv and related sources
Low-Degree Method Fails to Predict Robust Subspace Recovery
The low-degree polynomial framework has been highly successful in predicting computational versus statistical gaps for high-dimensional problems in average-case analysis and machine learning. This ...
He Jia, Aravindan Vijayaraghavan
Maximizing Generalization: The Effect of Different Augmentation Techniques on Lightweight Vision Transformer for Bengali Character Classification
Deep learning models have proven to be highly effective in computer vision, with deep convolutional neural networks achieving impressive results across various computer vision tasks. However, these...
Rafi Hassan Chowdhury, Naimul Haque, Kaniz Fatiha
ExpGuard: LLM Content Moderation in Specialized Domains
With the growing deployment of large language models (LLMs) in real-world applications, establishing robust safety guardrails to moderate their inputs and outputs has become essential to ensure adh...
Minseok Choi, Dongjin Kim, Seungbin Yang, Subin Kim, Youngjun Kwak, Juyoung Oh, Jaegul Choo, Jung...
LiveAgentBench: Comprehensive Benchmarking of Agentic Systems Across 104 Real-World Challenges
As large language models grow more capable, general AI agents have become increasingly prevalent in practical applications. However, existing benchmarks face significant limitations, failing to rep...
Hao Li, Huan Wang, Jinjie Gu, Wenjie Wang, Chenyi Zhuang, Sikang Bian
An Augmented Rating System for Test cricket: adapting Glicko's model
ICC's current ranking system does not adequately account for key contextual factors such as home advantage, toss impact and scheduling imbalances, leading to inconsistencies in team evaluation in T...
Rhitankar Bandyopadhyay, Diganta Mukherjee
Molecular Dynamics Simulations Reveal PolyQ-Length-Dependent Conformational Changes in Huntingtin Exon-1: Implications for Environmental Co-Solvent Modulation of Aggregation-Prone States
Huntington's disease (HD) is caused by CAG-repeat expansion in HTT, which lengthens the polyglutamine (polyQ) tract in huntingtin (HTT) and promotes misfolding and aggregation. While polyQ-length-d...
Jai Geddes-Nelson, Xiaochen Liu, Ken-Tye Yong
An LLM-Assisted Toolkit for Inspectable Multimodal Emotion Data Annotation
Multimodal Emotion Recognition (MER) increasingly depends on fine-grained, evidence-grounded annotations, yet inspection and label construction are hard to scale when cues are dynamic and misaligne...
Zheyuan Kuang, Weiwei Jiang, Nicholas Koemel, Matthew Ahmadi, Emmanuel Stamatakis, Benjamin Tag, ...
Relevance Matters: A Multi-Task and Multi-Stage Large Language Model Approach for E-commerce Query Rewriting
For e-commerce search, user experience is measured by users' behavioral responses to returned products, like click-through rate and conversion rate, as well as the relevance between returned produc...
Aijun Dai, Jixiang Zhang, Haiqing Hu, Guoyu Tang, Lin Liu, Ziguang Cheng
Fuzzing Microservices in Face of Intrinsic Uncertainties
The widespread adoption of microservices has fundamentally transformed how modern software systems are designed, deployed, operated and maintained. However, well-known microservice properties (e.g....
Man Zhang, Tao Yue, Andrea Arcuri
A Neuropsychologically Grounded Evaluation of LLM Cognitive Abilities
Large language models (LLMs) exhibit a unified "general factor" of capability across 10 benchmarks, a finding confirmed by our factor analysis of 156 models, yet they still struggle with simple, tr...
Faiz Ghifari Haznitrama, Faeyza Rishad Ardi, Alice Oh
Exploiting PendingIntent Provenance Confusion to Spoof Android SDK Authentication
A single authentication bypass in a partner SDK grants attackers the identity of every partner in the ecosystem -- and millions of apps use SDKs with exactly this vulnerability. OWASP's 2024 Mobile...
Ramanpreet Singh Khinda
PathSpace: Rapid continuous map approximation for efficient SLAM using B-Splines in constrained environments
Simultaneous Localization and Mapping (SLAM) plays a crucial role in enabling autonomous vehicles to navigate previously unknown environments. Semantic SLAM mostly extends visual SLAM, leveraging...
Aduen Benjumea, Andrew Bradley, Alexander Rast, Matthias Rolf
Agentic Mixed-Source Multi-Modal Misinformation Detection with Adaptive Test-Time Scaling
Vision-language models (VLMs) have been proven effective for detecting multi-modal misinformation on social platforms, especially in zero-shot settings with unavailable or delayed annotations. Howe...
Wei Jiang, Tong Chen, Wei Yuan, Quoc Viet Hung Nguyen, Hongzhi Yin
Multiscale Ultrabroadband Polymer Scattering Media with Tailored Emittance for Radiative Thermal Management
A surface that selectively emits heat in the long-wave infrared (LWIR) can enable passive cooling in hot environments while retaining partial radiative insulation in cold conditions, but its real-w...
Zhenpeng Li, Mathis Degeorges, Nithin Jo Varghese, Jyotirmoy Mandal
Measurement of a quantum system using spin-mechanical conversion
Levitated macroscopic particles exhibiting quantum mechanical effects are garnering increased attention as a means for precision sensing and testing quantum mechanics. Defects in diamond, such as t...
A. A. Wood, D. S. Rice, T. Xie, F. H. Cassells, R. M. Goldblatt, T. Delord, G. Hétet, A. M. Martin
NeuroProlog: Multi-Task Fine-Tuning for Neurosymbolic Mathematical Reasoning via the Cocktail Effect
Large Language Models (LLMs) achieve strong performance on natural language tasks but remain unreliable in mathematical reasoning, frequently generating fluent yet logically inconsistent solutions....
Pratibha Zunjare, Michael Hsiao
Joint Estimation of Dynamic O-D Demand and Choice Models for Dynamic Multi-modal Networks: Computational Graph-Based Learning and Hypothesis Tests
Understanding travel demand and behavior, particularly route and mode choices, is critical for effective transportation planning and policy design in multi-modal systems with emerging mobility opti...
Xiaoyu Ma, Sean Qian
Probing Planck-Scale Physics with High-Frequency Gravitational Waves
We develop a framework for testing quantum gravity through the stochastic gravitational-wave background produced by evaporating near-Planck-mass primordial black holes. Because gravitons free-strea...
Stefano Profumo
Reasoning Core: A Scalable Procedural Data Generation Suite for Symbolic Pre-training and Post-Training
Training on verifiable symbolic data is a promising way to expand the reasoning frontier of language models beyond what standard pre-training corpora provide. Yet existing procedural generators oft...
Valentin Lacombe, Valentin Quesnel, Damien Sileo
VoiceAgentRAG: Solving the RAG Latency Bottleneck in Real-Time Voice Agents Using Dual-Agent Architectures
We present VoiceAgentRAG, an open-source dual-agent memory router that decouples retrieval from response generation. A background Slow Thinker agent continuously monitors the conversation stream, p...
Jielin Qiu, Jianguo Zhang, Zixiang Chen, Liangwei Yang, Ming Zhu, Juntao Tan, Haolin Chen, Wentin...