Research

Papers

Research papers from arXiv and related sources

Total: 4513 (AI/LLM: 2483, Testing: 2030)
TESTING

Fractal universe and quantum gravity made simple

Quantum field theory (QFT) on fractal spacetimes is a program aiming at quantizing the gravitational interaction consistently at all energy scales thanks to an intrinsically or dynamically induced ...

Fabio Briscese, Gianluca Calcagni

2603.24593 2026-03-25
AI LLM

Vibe Coding XR: Accelerating AI + XR Prototyping with XR Blocks and Gemini

While large language models have accelerated software development through "vibe coding", prototyping intelligent Extended Reality (XR) experiences remains inaccessible due to the friction of comple...

Ruofei Du, Benjamin Hersh, David Li, Nels Numan, Xun Qian, Yanhe Chen, Zhongyi Zhou, Xingyue Chen...

2603.24591 2026-03-25
AI LLM

Comparing Developer and LLM Biases in Code Evaluation

As LLMs are increasingly used as judges in code applications, they should be evaluated in realistic interactive settings that capture partial context and ambiguous intent. We present TRACE (Tool fo...

Aditya Mittal, Ryan Shar, Zichu Wu, Shyam Agarwal, Tongshuang Wu, Chris Donahue, Ameet Talwalkar,...

2603.24586 2026-03-25
AI LLM

The Stochastic Gap: A Markovian Framework for Pre-Deployment Reliability and Oversight-Cost Auditing in Agentic Artificial Intelligence

Agentic artificial intelligence (AI) in organizations is a sequential decision problem constrained by reliability and oversight cost. When deterministic workflows are replaced by stochastic policie...

Biplab Pal, Santanu Bhattacharya

2603.24582 2026-03-25
AI LLM

Retrieval Improvements Do Not Guarantee Better Answers: A Study of RAG for AI Policy QA

Retrieval-augmented generation (RAG) systems are increasingly used to analyze complex policy documents, but achieving sufficient reliability for expert usage remains challenging in domains characte...

Saahil Mathur, Ryan David Rittner, Vedant Ajit Thakur, Daniel Stuart Schiff, Tunazzina Islam

2603.24580 2026-03-25
AI LLM

MARCH: Multi-Agent Reinforced Self-Check for LLM Hallucination

Hallucination remains a critical bottleneck for large language models (LLMs), undermining their reliability in real-world applications, especially in Retrieval-Augmented Generation (RAG) systems. W...

Zhuo Li, Yupeng Zhang, Pengyu Cheng, Jiajun Song, Mengyu Zhou, Hao Li, Shujie Hu, Yu Qin, Erchao ...

2603.24579 2026-03-25
AI LLM

Anti-I2V: Safeguarding your photos from malicious image-to-video generation

Advances in diffusion-based video generation models, while significantly improving human animation, pose threats of misuse through the creation of fake videos from a specific person's photo and te...

Duc Vu, Anh Nguyen, Chi Tran, Anh Tran

2603.24570 2026-03-25
TESTING

POLY-SIM: Polyglot Speaker Identification with Missing Modality Grand Challenge 2026 Evaluation Plan

Multimodal speaker identification systems typically assume the availability of complete and homogeneous audio-visual modalities during both training and testing. However, in real-world applications...

Marta Moscati, Muhammad Saad Saeed, Marina Zanoni, Mubashir Noman, Rohan Kumar Das, Monorama Swai...

2603.24569 2026-03-25
AI LLM

Boosting LLMs for Mutation Generation

LLM-based mutation testing is a promising testing technology, but existing approaches typically rely on a fixed set of mutations as few-shot examples or none at all. This can result in generic low-...

Bo Wang, Ming Deng, Mingda Chen, Chengran Yang, Youfang Lin, Mark Harman, Mike Papadakis, Jie M. ...

2603.24560 2026-03-25
TESTING

LensWalk: Agentic Video Understanding by Planning How You See in Videos

The dense, temporal nature of video presents a profound challenge for automated analysis. Despite the use of powerful Vision-Language Models, prevailing methods for video understanding are limited ...

Keliang Li, Yansong Li, Hongze Shen, Mengdi Liu, Hong Chang, Shiguang Shan

2603.24558 2026-03-25
AI LLM

Evaluating Chunking Strategies For Retrieval-Augmented Generation in Oil and Gas Enterprise Documents

Retrieval-Augmented Generation (RAG) has emerged as a framework to address the constraints of Large Language Models (LLMs). Yet, its effectiveness fundamentally hinges on document chunking - an oft...

Samuel Taiwo, Mohd Amaluddin Yusoff

2603.24556 2026-03-25
TESTING

Orientation Reconstruction of Proteins using Coulomb Explosions

We solve the orientation recovery of a tumbling protein in the gas phase from single-event measurements of the spatial positions of its ions after an X-ray laser induced explosion. We simulate diff...

Tomas André, Alfredo Bellisario, Nicusor Timneanu, Carl Caleman

2603.24553 2026-03-25
TESTING

The role of spatial context and multitask learning in the detection of organic and conventional farming systems based on Sentinel-2 time series

Organic farming is a key element in achieving more sustainable agriculture. For a better understanding of the development and impact of organic farming, comprehensive, spatially explicit informatio...

Jan Hemmerling, Marcel Schwieder, Philippe Rufin, Leon-Friedrich Thomas, Mirela Tulbure, Patrick ...

2603.24552 2026-03-25
TESTING

Detection of local geometry in random graphs: information-theoretic and computational limits

We study the problem of detecting local geometry in random graphs. We introduce a model $\mathcal{G}(n, p, d, k)$, where a hidden community of average size $k$ has edges drawn as a random geometric...

Jinho Bok, Shuangping Li, Sophie H. Yu

2603.24545 2026-03-25
AI LLM

Analysing the Safety Pitfalls of Steering Vectors

Activation steering has emerged as a powerful tool to shape LLM behavior without the need for weight updates. While its inherent brittleness and unreliability are well-documented, its safety implic...

Yuxiao Li, Alina Fastowski, Efstratios Zaradoukas, Bardh Prenkaj, Gjergji Kasneci

2603.24543 2026-03-25
TESTING

Radial Distribution Function in a Two Dimensional Core-Shoulder Particle System

An important quantity in liquid state theory is the radial distribution function $g(r)$. It can be calculated within the framework of classical density functional theory in two very distinct ways. ...

Michael Wassermair, Gerhard Kahl, Andrew J Archer, Roland Roth

2603.24537 2026-03-25
AI LLM

Robust Multilingual Text-to-Pictogram Mapping for Scalable Reading Rehabilitation

Reading comprehension presents a significant challenge for children with Special Educational Needs and Disabilities (SEND), often requiring intensive one-on-one reading support. To assist therapist...

Soufiane Jhilal, Martina Galletti

2603.24536 2026-03-25
AI LLM

No Single Metric Tells the Whole Story: A Multi-Dimensional Evaluation Framework for Uncertainty Attributions

Research on explainable AI (XAI) has frequently focused on explaining model predictions. More recently, methods have been proposed to explain prediction uncertainty by attributing it to input featu...

Emily Schiller, Teodor Chiaburu, Marco Zullich, Luca Longo

2603.24524 2026-03-25
AI LLM

TuneShift-KD: Knowledge Distillation and Transfer for Fine-tuned Models

To embed domain-specific or specialized knowledge into pre-trained foundation models, fine-tuning using techniques such as parameter-efficient fine-tuning (e.g., LoRA) is a common practice. However,...

Yushi Guan, Jeanine Ohene-Agyei, Daniel Kwan, Jean Sebastien Dandurand, Yifei Zhang, Nandita Vija...

2603.24518 2026-03-25
AI LLM

AVO: Agentic Variation Operators for Autonomous Evolutionary Search

Agentic Variation Operators (AVO) are a new family of evolutionary variation operators that replace the fixed mutation, crossover, and hand-designed heuristics of classical evolutionary search with...

Terry Chen, Zhifan Ye, Bing Xu, Zihao Ye, Timmy Liu, Ali Hassani, Tianqi Chen, Andrew Kerr, Haich...

2603.24517 2026-03-25