Papers
Research papers from arXiv and related sources
DariMis: Harm-Aware Modeling for Dari Misinformation Detection on YouTube
Dari, the primary language of Afghanistan, is spoken by tens of millions of people yet remains largely absent from the misinformation detection literature. We address this gap with DariMis, the fir...
Jawid Ahmad Baktash, Mosa Ebrahimi, Mohammad Zarif Joya, Mursal Dawodi
Beyond Theoretical Bounds: Empirical Privacy Loss Calibration for Text Rewriting Under Local Differential Privacy
The growing use of large language models has increased interest in sharing textual data in a privacy-preserving manner. One prominent line of work addresses this challenge through text rewriting un...
Weijun Li, Arnaud Grivet Sébert, Qiongkai Xu, Annabelle McIver, Mark Dras
Asymptotic Learning Curves for Diffusion Models with Random Features Score and Manifold Data
We study the theoretical behavior of denoising score matching--the learning task associated to diffusion models--when the data distribution is supported on a low-dimensional manifold and the score ...
Anand Jerry George, Nicolas Macris
How well does MAGPHYS recover galaxy properties? A test using EAGLE simulated star-forming galaxies
Spectral energy distribution (SED) models are widely used to infer the physical properties of galaxies from multi-wavelength photometry, but their accuracy is difficult to assess because the true p...
Zoe R. Jones, Elisabete da Cunha, Andrew Battisti
ProGRank: Probe-Gradient Reranking to Defend Dense-Retriever RAG from Corpus Poisoning
Retrieval-Augmented Generation (RAG) improves the reliability of large language model applications by grounding generation in retrieved evidence, but it also introduces a new attack surface: corpus...
Xiangyu Yin, Yi Qi, Chih-hong Cheng
GateSID: Adaptive Gating for Semantic-Collaborative Alignment in Cold-Start Recommendation
In cold-start scenarios, the scarcity of collaborative signals for new items exacerbates the Matthew effect, which undermines platform diversity and remains a persistent challenge in real-world rec...
Hai Zhu, Yantao Yu, Lei Shen, Bing Wang, Xiaoyi Zeng
A diffuse-interface model for N-phase flows with liquid-solid phase change
In this work, we first propose a diffuse-interface model for simulating N-phase flows with solid-liquid phase change. In this model, a phase-field approach is adopted to capture multiphase fluid in...
Jiangxu Huang, Chengjie Zhan, Zhenhua Chai, Changsheng Huang, Xi Liu
Critical LAN and Score Tests for Mixed Fractional Models under High-Frequency Observation at H=3/4
We study the critical boundary $H=3/4$ for two mixed fractional models under high-frequency observation, namely mixed fractional Brownian motion and mixed fractional Ornstein--Uhlenbeck. For differ...
Chunhao Cai, Yiwu Shang, Cong Zhang
TreeTeaming: Autonomous Red-Teaming of Vision-Language Models via Hierarchical Strategy Exploration
The rapid advancement of Vision-Language Models (VLMs) has brought their safety vulnerabilities into sharp focus. However, existing red-teaming methods are fundamentally constrained by an inherent ...
Chunxiao Li, Lijun Li, Jing Shao
Portfolio Optimization under Recursive Utility via Reinforcement Learning
We study whether a risk-sensitive objective from asset-pricing theory -- recursive utility -- improves reinforcement learning for portfolio allocation. The Bellman equation under recursive utility ...
Minkey Chang
Continuous Optimization for Satisfiability Modulo Theories on Linear Real Arithmetic
Efficient solutions for satisfiability modulo theories (SMT) are integral in industrial applications such as hardware verification and design automation. Existing approaches are predominantly based...
Yunuo Cen, Daniel Ebler, Xuanyao Fong
Grounding Sim-to-Real Generalization in Dexterous Manipulation: An Empirical Study with Vision-Language-Action Models
Learning a generalist control policy for dexterous manipulation typically relies on large-scale datasets. Given the high cost of real-world data collection, a practical alternative is to generate s...
Ruixing Jin, Zicheng Zhu, Ruixiang Ouyang, Sheng Xu, Bo Yue, Zhizheng Wu, Guiliang Liu
Designing to Forget: Deep Semi-parametric Models for Unlearning
Recent advances in machine unlearning have focused on developing algorithms to remove specific training samples from a trained model. In contrast, we observe that not all models are equally easy to...
Amber Yijia Zheng, Yu-Shan Tai, Raymond A. Yeh
Breaking news
This paper examines how regulatory interventions in high-frequency financial markets affect price discovery. We focus on Breaking news, where dynamic circuit breakers trigger trading halts immediat...
Lars Winkelmann, Wenying Yao
Probing the Bias of Large-Scale Structure with Unlocalized Fast Radio Bursts
Large-scale structure (LSS) and tracer bias connect observable populations to the cosmic matter distribution. While galaxies are standard tracers, transient events such as gravitational-wave source...
Yu-Tong Su, Zhengxiang Li
When AI Shows Its Work, Is It Actually Working? Step-Level Evaluation Reveals Frontier Language Models Frequently Bypass Their Own Reasoning
Language models increasingly "show their work" by writing step-by-step reasoning before answering. But are these reasoning steps genuinely used, or decorative narratives generated after the model h...
Abhinaba Basu, Pavan Chakraborty
Universal and efficient graph neural networks with dynamic attention for machine learning interatomic potentials
The core of molecular dynamics simulation fundamentally lies in the interatomic potential. Traditional empirical potentials lack accuracy, while first-principles methods are computationally prohibi...
Shuyu Bi, Zhede Zhao, Qiangchao Sun, Tao Hu, Xionggang Lu, Hongwei Cheng
The Costs of Early-career Disciplinary Pivots: Evidence from PhD Admissions
Scientific innovation often comes from researchers who pivot across disciplines. However, prior work found that established researchers face productivity penalties when pivoting. Here, we investiga...
Sidney Xiang, Nicholas David, Dallas Card, Wenhao Sun, Daniel M Romero, Misha Teplitskiy
Search for the radiative decays $D^0\to \gamma\bar K_1(1270)^0$ and $D^+\to \gamma K_1(1270)^+$
A search for the radiative decays $D^0\to \gamma\bar K_1(1270)^0$ and $D^+\to \gamma K_1(1270)^+$ is conducted using $20.3~\mathrm{fb}^{-1}$ of $e^+e^-$ annihilation data collected at the center-of-mass energ...
BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, X. C. Ai, R. Aliberti, A. Amoroso,...
ABSTRAL: Automatic Design of Multi-Agent Systems Through Iterative Refinement and Topology Optimization
How should multi-agent systems be designed, and can that design knowledge be captured in a form that is inspectable, revisable, and transferable? We introduce ABSTRAL, a framework that treats MAS a...
Weijia Song, Jiashu Yue, Zhe Pang