Personal Assistant Web

AI LLM

Risk-Adjusted Harm Scoring for Automated Red Teaming for LLMs in Financial Services

The rapid adoption of large language models (LLMs) in financial services introduces new operational, regulatory, and security risks. Yet most red-teaming benchmarks remain domain-agnostic and fail ...

Fabrizio Dimino, Bhaskarjit Sarmah, Stefano Pasquali

2603.10807 • 2026-03-11

TESTING

Backdoor Directions in Vision Transformers

This paper investigates how Backdoor Attacks are represented within Vision Transformers (ViTs). By assuming knowledge of the trigger, we identify a specific "trigger direction" in the model's act...

Sengim Karayalcin, Marina Krcek, Pin-Yu Chen, Stjepan Picek

2603.10806 • 2026-03-11

AI LLM

AI-Enhanced Spatial Cellular Traffic Demand Prediction with Contextual Clustering and Error Correction for 5G/6G Planning

Accurate spatial prediction of cellular traffic demand is essential for 5G NR capacity planning, network densification, and data-driven 6G planning. Although machine learning can fuse heterogeneous...

Mohamad Alkadamani, Colin Brown, Halim Yanikomeroglu

2603.10800 • 2026-03-11

TESTING

Quantum Limits of Passive Optical Surface Metrology and Defect Detection

We develop a quantum statistical framework for passive optical surface metrology. Modelling a surface as an incoherent ensemble of point emitters imaged through a diffraction-limited system, we emp...

Jernej Frank, George Brumpton, Tommaso Tufarelli, Gerardo Adesso, Samanta Piano

2603.10796 • 2026-03-11

AI LLM

Re-Evaluating EVMBench: Are AI Agents Ready for Smart Contract Security?

EVMbench, released by OpenAI, Paradigm, and OtterSec, is the first large-scale benchmark for AI agents on smart contract security. Its results -- agents detect up to 45.6% of vulnerabilities and ex...

Chaoyuan Peng, Lei Wu, Yajin Zhou

2603.10795 • 2026-03-11

AI LLM

Interpretable Chinese Metaphor Identification via LLM-Assisted MIPVU Rule Script Generation: A Comparative Protocol Study

Metaphor identification is a foundational task in figurative language processing, yet most computational approaches operate as opaque classifiers offering no insight into why an expression is judge...

Weihang Huang, Mengna Liu

2603.10784 • 2026-03-11

AI LLM

Guiding Diffusion Models with Semantically Degraded Conditions

Classifier-Free Guidance (CFG) is a cornerstone of modern text-to-image models, yet its reliance on a semantically vacuous null prompt ($\varnothing$) generates a guidance signal prone to geometric...

Shilong Han, Yuming Zhang, Hongxia Wang

2603.10780 • 2026-03-11

AI LLM

A Control-Theoretic Foundation for Agentic Systems

This paper develops a control-theoretic framework for analyzing agentic systems embedded within feedback control loops. In such systems, an AI agent may adapt controller parameters, select among co...

Ali Eslami, Jiangbo Yu

2603.10779 • 2026-03-11

AI LLM

Large Language Models as Annotators for Machine Translation Quality Estimation

Large Language Models (LLMs) have demonstrated excellent performance on Machine Translation Quality Estimation (MTQE), yet their high inference costs make them impractical for direct application. I...

Sidi Wang, Sophie Arnoult, Amir Kamran

2603.10775 • 2026-03-11

AI LLM

AI-Generated Rubric Interfaces: K-12 Teachers' Perceptions and Practices

This study investigates K-12 teachers' perceptions and experiences with AI-supported rubric generation during a summer professional development workshop ($n = 25$). Teachers used MagicSchool.ai to...

Bahare Riahi, Sayali Patukale, Joy Niranjan, Yogya Koneru, Tiffany Barnes, Veronica Cateté

2603.10773 • 2026-03-11

TESTING

Multiple change-point detection on the circle via isolation using permutation testing

In this paper we propose a new method for multiple change-point detection for piecewise-constant circular signals, a setting that, despite its importance in many scientific domains, remains compara...

Sophia Loizidou, Andreas Anastasiou, Christophe Ley

2603.10772 • 2026-03-11

AI LLM

Word Recovery in Large Language Models Enables Character-Level Tokenization Robustness

Large language models (LLMs) trained with canonical tokenization exhibit surprising robustness to non-canonical inputs such as character-level tokenization, yet the mechanisms underlying this robus...

Zhipeng Yang, Shu Yang, Lijie Hu, Di Wang

2603.10771 • 2026-03-11

AI LLM

RAGPerf: An End-to-End Benchmarking Framework for Retrieval-Augmented Generation Systems

We present the design and implementation of a RAG-based AI system benchmarking (RAGPerf) framework for characterizing the system behaviors of RAG pipelines. To facilitate detailed profiling and fin...

Shaobo Li, Yirui Zhou, Yuan Xu, Kevin Chen, Daniel Waddington, Swaminathan Sundararaman, Hubertus...

2603.10765 • 2026-03-11

AI LLM

Prioritizing Gradient Sign Over Modulus: An Importance-Aware Framework for Wireless Federated Learning

Wireless federated learning (FL) facilitates collaborative training of artificial intelligence (AI) models to support ubiquitous intelligent applications at the wireless edge. However, the inherent...

Yiyang Yue, Jiacheng Yao, Wei Xu, Zhaohui Yang, George K. Karagiannidis, Dusit Niyato

2603.10763 • 2026-03-11

AI LLM

CodePercept: Code-Grounded Visual STEM Perception for MLLMs

When MLLMs fail at Science, Technology, Engineering, and Mathematics (STEM) visual reasoning, a fundamental question arises: is it due to perceptual deficiencies or reasoning limitations? Through s...

Tongkun Guan, Zhibo Yang, Jianqiang Wan, Mingkun Yang, Zhengtao Guo, Zijian Hu, Ruilin Luo, Ruize...

2603.10757 • 2026-03-11

AI LLM

AttriGuard: Defeating Indirect Prompt Injection in LLM Agents via Causal Attribution of Tool Invocations

LLM agents are highly vulnerable to Indirect Prompt Injection (IPI), where adversaries embed malicious directives in untrusted tool outputs to hijack execution. Most existing defenses treat IPI as ...

Yu He, Haozhe Zhu, Yiming Li, Shuo Shao, Hongwei Yao, Zhihao Liu, Zhan Qin

2603.10749 • 2026-03-11

AI LLM

Pneuma-Seeker: A Relational Reification Mechanism to Align AI Agents with Human Work over Relational Data

When faced with data problems, many data workers cannot articulate their information need precisely enough for software to help. Although LLMs interpret natural-language requests, they behave britt...

Muhammad Imam Luthfi Balaka, John Hillesland, Kemal Badur, Raul Castro Fernandez

2603.10747 • 2026-03-11

AI LLM

CUPID: A Plug-in Framework for Joint Aleatoric and Epistemic Uncertainty Estimation with a Single Model

Accurate estimation of uncertainty in deep learning is critical for deploying models in high-stakes domains such as medical diagnosis and autonomous decision-making, where overconfident predictions...

Xinran Xu, Xiuyi Fan

2603.10745 • 2026-03-11

TESTING

A Grammar of Machine Learning Workflows

Data leakage affected 294 published papers across 17 scientific fields (Kapoor & Narayanan, 2023). The dominant response has been documentation: checklists, linters, best-practice guides. Documenta...

Simon Roth

2603.10742 • 2026-03-11

TESTING

Zero crossings of the differential scalar polarizability of Ba$^+$ clock transition

The differential scalar polarizability $Δα_0(ω)$ of the Ba$^+$ S$_{1/2}$-to-D$_{5/2}$ clock transition has a zero crossing near 481 nm, which is measured to be 623.603 13(17) THz. From this measur...

N Jayjong, M D K Lee, K J Arnold, M D Barrett

2603.10740 • 2026-03-11
