AI LLM

Risk-Adjusted Harm Scoring for Automated Red Teaming for LLMs in Financial Services

The rapid adoption of large language models (LLMs) in financial services introduces new operational, regulatory, and security risks. Yet most red-teaming benchmarks remain domain-agnostic and fail ...

Fabrizio Dimino, Bhaskarjit Sarmah, Stefano Pasquali

2603.10807 • 2026-03-11


AI-Enhanced Spatial Cellular Traffic Demand Prediction with Contextual Clustering and Error Correction for 5G/6G Planning

Accurate spatial prediction of cellular traffic demand is essential for 5G NR capacity planning, network densification, and data-driven 6G planning. Although machine learning can fuse heterogeneous...

Mohamad Alkadamani, Colin Brown, Halim Yanikomeroglu

2603.10800 • 2026-03-11


Re-Evaluating EVMBench: Are AI Agents Ready for Smart Contract Security?

EVMbench, released by OpenAI, Paradigm, and OtterSec, is the first large-scale benchmark for AI agents on smart contract security. Its results -- agents detect up to 45.6% of vulnerabilities and ex...

Chaoyuan Peng, Lei Wu, Yajin Zhou

2603.10795 • 2026-03-11


Interpretable Chinese Metaphor Identification via LLM-Assisted MIPVU Rule Script Generation: A Comparative Protocol Study

Metaphor identification is a foundational task in figurative language processing, yet most computational approaches operate as opaque classifiers offering no insight into why an expression is judge...

Weihang Huang, Mengna Liu

2603.10784 • 2026-03-11


Guiding Diffusion Models with Semantically Degraded Conditions

Classifier-Free Guidance (CFG) is a cornerstone of modern text-to-image models, yet its reliance on a semantically vacuous null prompt ($\varnothing$) generates a guidance signal prone to geometric...

Shilong Han, Yuming Zhang, Hongxia Wang

2603.10780 • 2026-03-11


A Control-Theoretic Foundation for Agentic Systems

This paper develops a control-theoretic framework for analyzing agentic systems embedded within feedback control loops. In such systems, an AI agent may adapt controller parameters, select among co...

Ali Eslami, Jiangbo Yu

2603.10779 • 2026-03-11


Large Language Models as Annotators for Machine Translation Quality Estimation

Large Language Models (LLMs) have demonstrated excellent performance on Machine Translation Quality Estimation (MTQE), yet their high inference costs make them impractical for direct application. I...

Sidi Wang, Sophie Arnoult, Amir Kamran

2603.10775 • 2026-03-11


AI-Generated Rubric Interfaces: K-12 Teachers' Perceptions and Practices

This study investigates K-12 teachers' perceptions and experiences with AI-supported rubric generation during a summer professional development workshop ($n = 25$). Teachers used MagicSchool.ai to...

Bahare Riahi, Sayali Patukale, Joy Niranjan, Yogya Koneru, Tiffany Barnes, Veronica Cateté

2603.10773 • 2026-03-11


Word Recovery in Large Language Models Enables Character-Level Tokenization Robustness

Large language models (LLMs) trained with canonical tokenization exhibit surprising robustness to non-canonical inputs such as character-level tokenization, yet the mechanisms underlying this robus...

Zhipeng Yang, Shu Yang, Lijie Hu, Di Wang

2603.10771 • 2026-03-11


RAGPerf: An End-to-End Benchmarking Framework for Retrieval-Augmented Generation Systems

We present the design and implementation of a RAG-based AI system benchmarking (RAGPerf) framework for characterizing the system behaviors of RAG pipelines. To facilitate detailed profiling and fin...

Shaobo Li, Yirui Zhou, Yuan Xu, Kevin Chen, Daniel Waddington, Swaminathan Sundararaman, Hubertus...

2603.10765 • 2026-03-11


Prioritizing Gradient Sign Over Modulus: An Importance-Aware Framework for Wireless Federated Learning

Wireless federated learning (FL) facilitates collaborative training of artificial intelligence (AI) models to support ubiquitous intelligent applications at the wireless edge. However, the inherent...

Yiyang Yue, Jiacheng Yao, Wei Xu, Zhaohui Yang, George K. Karagiannidis, Dusit Niyato

2603.10763 • 2026-03-11


CodePercept: Code-Grounded Visual STEM Perception for MLLMs

When MLLMs fail at Science, Technology, Engineering, and Mathematics (STEM) visual reasoning, a fundamental question arises: is it due to perceptual deficiencies or reasoning limitations? Through s...

Tongkun Guan, Zhibo Yang, Jianqiang Wan, Mingkun Yang, Zhengtao Guo, Zijian Hu, Ruilin Luo, Ruize...

2603.10757 • 2026-03-11


AttriGuard: Defeating Indirect Prompt Injection in LLM Agents via Causal Attribution of Tool Invocations

LLM agents are highly vulnerable to Indirect Prompt Injection (IPI), where adversaries embed malicious directives in untrusted tool outputs to hijack execution. Most existing defenses treat IPI as ...

Yu He, Haozhe Zhu, Yiming Li, Shuo Shao, Hongwei Yao, Zhihao Liu, Zhan Qin

2603.10749 • 2026-03-11


Pneuma-Seeker: A Relational Reification Mechanism to Align AI Agents with Human Work over Relational Data

When faced with data problems, many data workers cannot articulate their information need precisely enough for software to help. Although LLMs interpret natural-language requests, they behave britt...

Muhammad Imam Luthfi Balaka, John Hillesland, Kemal Badur, Raul Castro Fernandez

2603.10747 • 2026-03-11


CUPID: A Plug-in Framework for Joint Aleatoric and Epistemic Uncertainty Estimation with a Single Model

Accurate estimation of uncertainty in deep learning is critical for deploying models in high-stakes domains such as medical diagnosis and autonomous decision-making, where overconfident predictions...

Xinran Xu, Xiuyi Fan

2603.10745 • 2026-03-11


CacheSolidarity: Preventing Prefix Caching Side Channels in Multi-tenant LLM Serving Systems

Large Language Models (LLMs) rely on optimizations like Automatic Prefix Caching (APC) to accelerate inference. APC works by reusing previously computed states for the beginning part of a request (...

Panagiotis Georgios Pennas, Konstantinos Papaioannou, Marco Guarnieri, Thaleia Dimitra Doudali

2603.10726 • 2026-03-11


Believing vs. Achieving -- The Disconnect between Efficacy Beliefs and Collaborative Outcomes

As artificial intelligence (AI) becomes increasingly integrated into workflows, humans must decide when to rely on AI advice. These decisions depend on general efficacy beliefs, i.e., humans' confi...

Philipp Spitzer, Joshua Holstein

2603.10708 • 2026-03-11


Prism-$Δ$: Differential Subspace Steering for Prompt Highlighting in Large Language Models

Prompt highlighting steers a large language model to prioritize user-specified text spans during generation. A key challenge is extracting steering directions that capture the difference between re...

Yuyao Ge, Shenghua Liu, Yiwei Wang, Tianyu Liu, Baolong Bi, Lingrui Mei, Jiayu Yao, Jiafeng Guo, ...

2603.10705 • 2026-03-11


Structured Linked Data as a Memory Layer for Agent-Orchestrated Retrieval

Retrieval-Augmented Generation (RAG) systems typically treat documents as flat text, ignoring the structured metadata and linked relationships that knowledge graphs provide. In this paper, we inves...

Andrea Volpini, Elie Raad, Beatrice Gamba, David Riccitelli

2603.10700 • 2026-03-11


EvoSchema: Towards Text-to-SQL Robustness Against Schema Evolution

Neural text-to-SQL models, which translate natural language questions (NLQs) into SQL queries given a database schema, have achieved remarkable performance. However, database schemas frequently evo...

Tianshu Zhang, Kun Qian, Siddhartha Sahai, Yuan Tian, Shaddy Garg, Huan Sun, Yunyao Li

2603.10697 • 2026-03-11

