Papers

Research papers from arXiv and related sources

Total: 4513, AI/LLM: 2483, Testing: 2030
AI LLM

Can Large Language Models Reason and Optimize Under Constraints?

Large Language Models (LLMs) have demonstrated great capabilities across diverse natural language tasks; yet their ability to solve abstraction and optimization problems with constraints remains sc...

Fabien Bernier, Salah Ghamizi, Pantelis Dogoulis, Maxime Cordy

2603.23004 2026-03-24
TESTING

In-orbit Test of the Weak Equivalence Principle with Atom Interferometry

The Weak Equivalence Principle (WEP) is a central pillar of general relativity. Its precise test with quantum systems in space offers a unique window onto new physics. Here we report the first in-o...

Dan-Fang Zhang, Jing-Ting Li, Wen-Zhang Wang, Wei-Hao Xu, Jia-Yi Wei, Xiao Li, Yi-Bo Wang, Dong-F...

2603.22981 2026-03-24
AI LLM

JFTA-Bench: Evaluate LLM's Ability of Tracking and Analyzing Malfunctions Using Fault Trees

In the maintenance of complex systems, fault trees are used to locate problems and provide targeted solutions. To enable fault trees stored as images to be directly processed by large language mode...

Yuhui Wang, Zhixiong Yang, Ming Zhang, Shihan Dou, Zhiheng Xi, Enyu Zhou, Senjie Jin, Yujiong She...

2603.22978 2026-03-24
TESTING

DariMis: Harm-Aware Modeling for Dari Misinformation Detection on YouTube

Dari, the primary language of Afghanistan, is spoken by tens of millions of people yet remains largely absent from the misinformation detection literature. We address this gap with DariMis, the fir...

Jawid Ahmad Baktash, Mosa Ebrahimi, Mohammad Zarif Joya, Mursal Dawodi

2603.22977 2026-03-24
TESTING

Beyond Theoretical Bounds: Empirical Privacy Loss Calibration for Text Rewriting Under Local Differential Privacy

The growing use of large language models has increased interest in sharing textual data in a privacy-preserving manner. One prominent line of work addresses this challenge through text rewriting un...

Weijun Li, Arnaud Grivet Sébert, Qiongkai Xu, Annabelle McIver, Mark Dras

2603.22968 2026-03-24
AI LLM

Set-Valued Prediction for Large Language Models with Feasibility-Aware Coverage Guarantees

Large language models (LLMs) inherently operate over a large generation space, yet conventional usage typically reports the most likely generation (MLG) as a point prediction, which underestimates ...

Ye Li, Anqi Hu, Yuanchang Ye, Shiyan Tong, Zhiyuan Wang, Bo Fu

2603.22966 2026-03-24
TESTING

Asymptotic Learning Curves for Diffusion Models with Random Features Score and Manifold Data

We study the theoretical behavior of denoising score matching--the learning task associated to diffusion models--when the data distribution is supported on a low-dimensional manifold and the score ...

Anand Jerry George, Nicolas Macris

2603.22962 2026-03-24
AI LLM

Toward Integrated Sensing, Communications, and Edge Intelligence Networks

Wireless systems are expanding their purposes, from merely connecting humans and things to connecting intelligence and opportunistically sensing of the environment through radio-frequency signals. ...

Mattia Merluzzi, Miltiadis C. Filippou, Paolo Di Lorenzo, George C. Alexandropoulos

2603.22958 2026-03-24
AI LLM

Privacy-Preserving EHR Data Transformation via Geometric Operators: A Human-AI Co-Design Technical Report

Electronic health records (EHRs) and other real-world clinical data are essential for clinical research, medical artificial intelligence, and life science, but their sharing is severely limited by ...

Maolin Wang, Beining Bao, Gan Yuan, Hongyu Chen, Bingkun Zhao, Baoshuo Kan, Jiming Xu, Qi Shi, Yi...

2603.22954 2026-03-24
AI LLM

Caption Generation for Dongba Paintings via Prompt Learning and Semantic Fusion

Dongba paintings, the treasured pictorial legacy of the Naxi people in southwestern China, feature richly layered visual elements, vivid color palettes, and pronounced ethnic and regional cultural ...

Shuangwu Qian, Xiaochan Yuan, Pengfei Liu

2603.22946 2026-03-24
TESTING

How well does MAGPHYS recover galaxy properties? A test using EAGLE simulated star-forming galaxies

Spectral energy distribution (SED) models are widely used to infer the physical properties of galaxies from multi-wavelength photometry, but their accuracy is difficult to assess because the true p...

Zoe R. Jones, Elisabete da Cunha, Andrew Battisti

2603.22945 2026-03-24
AI LLM

From Morality Installation in LLMs to LLMs in Morality-as-a-System

Work on morality in large language models (LLMs) has progressed via constitutional AI, reinforcement learning from human feedback (RLHF) and systematic benchmarking, yet it still lacks tools to con...

Gunter Bombaerts

2603.22944 2026-03-24
AI LLM

PersonalQ: Select, Quantize, and Serve Personalized Diffusion Models for Efficient Inference

Personalized text-to-image generation lets users fine-tune diffusion models into repositories of concept-specific checkpoints, but serving these repositories efficiently is difficult for two reason...

Qirui Wang, Qi Guo, Yiding Sun, Junkai Yang, Dongxu Zhang, Shanmin Pang, Qing Guo

2603.22943 2026-03-24
AI LLM

Optimizing Small Language Models for NL2SQL via Chain-of-Thought Fine-Tuning

Translating Natural Language to SQL (NL2SQL) remains a critical bottleneck for democratization of data in enterprises. Although Large Language Models (LLMs) like Gemini 2.5 and other LLMs have demo...

Anshul Solanki, Sanchit Latawa, Koushik Chakraborty, Navneet Kamboj

2603.22942 2026-03-24
AI LLM

Ran Score: a LLM-based Evaluation Score for Radiology Report Generation

Chest X-ray report generation and automated evaluation are limited by poor recognition of low-prevalence abnormalities and inadequate handling of clinically important language, including negation a...

Ran Zhang, Yucong Lin, Zhaoli Su, Bowen Liu, Danni Ai, Tianyu Fu, Deqiang Xiao, Jingfan Fan, Yuan...

2603.22935 2026-03-24
TESTING

ProGRank: Probe-Gradient Reranking to Defend Dense-Retriever RAG from Corpus Poisoning

Retrieval-Augmented Generation (RAG) improves the reliability of large language model applications by grounding generation in retrieved evidence, but it also introduces a new attack surface: corpus...

Xiangyu Yin, Yi Qi, Chih-hong Cheng

2603.22934 2026-03-24
AI LLM

SoK: The Attack Surface of Agentic AI -- Tools, and Autonomy

Recent AI systems combine large language models with tools, external knowledge via retrieval-augmented generation (RAG), and even autonomous multi-agent decision loops. This agentic AI paradigm gre...

Ali Dehghantanha, Sajad Homayoun

2603.22928 2026-03-24
AI LLM

The EU AI Act and the Rights-based Approach to Technological Governance

The EU AI Act constitutes an important development in shaping the Union's digital regulatory architecture. The Act places fundamental rights at the heart of a risk-based governance framework. The a...

Georgios Pavlidis

2603.22920 2026-03-24
AI LLM

IntentWeave: A Progressive Entry Ladder for Multi-Surface Browser Agents in Cloud Portals

Browser agents built on LLMs can act in web interfaces, yet most remain confined to a single chat surface (e.g., a sidebar). This mismatch with real browsing can increase context-switching and redu...

Wanying Mo, Jijia Lai, Xiaoming Wang

2603.22917 2026-03-24
TESTING

GateSID: Adaptive Gating for Semantic-Collaborative Alignment in Cold-Start Recommendation

In cold-start scenarios, the scarcity of collaborative signals for new items exacerbates the Matthew effect, which undermines platform diversity and remains a persistent challenge in real-world rec...

Hai Zhu, Yantao Yu, Lei Shen, Bing Wang, Xiaoyi Zeng

2603.22916 2026-03-24