Research

Papers

Research papers from arXiv and related sources

Total: 4694 | AI/LLM: 2583 | Testing: 2111
AI LLM

A Systematic Study of Pseudo-Relevance Feedback with LLMs

Pseudo-relevance feedback (PRF) methods built on large language models (LLMs) can be organized along two key design dimensions: the feedback source, which is where the feedback text is derived from...

Nour Jedidi, Jimmy Lin

2603.11008 2026-03-11
TESTING

Layered Performance Analysis of TLS 1.3 Handshakes: Classical, Hybrid, and Pure Post-Quantum Key Exchange

In this paper, we present a laboratory study focused on the impact of post-quantum cryptography (PQC) algorithms on multiple layers of stateful HTTP over TLS transactions: the TCP handshake, the in...

David Gómez-Cambronero, Daniel Munteanu, Ana Isabel González-Tablas

2603.11006 2026-03-11
TESTING

Searching for Magnetic White Dwarfs in LAMOST DR10

Magnetic white dwarfs (MWDs) are key to understanding the origin and evolution of magnetic fields in compact stars. While large spectroscopic surveys such as SDSS have greatly expanded the known sa...

Si-Cheng Yu, Juan-Juan Ren, Vitaly V. Neustroev, Thomas Hackman, Hao-Tong Zhang, Yi-Qiao Dong, Zh...

2603.11004 2026-03-11
AI LLM

RCTs & Human Uplift Studies: Methodological Challenges and Practical Solutions for Frontier AI Evaluation

Human uplift studies - or studies that measure AI effects on human performance relative to a status quo, typically using randomized controlled trial (RCT) methodology - are increasingly used to inf...

Patricia Paskov, Kevin Wei, Shen Zhou Hong, Dan Bateyko, Xavier Roberts-Gaal, Carson Ezell, Gaili...

2603.11001 2026-03-11
AI LLM

Artificial Intelligence as a Catalyst for Innovation in Software Engineering

The rapid evolution and inherent complexity of modern software requirements demand highly flexible and responsive development methodologies. While Agile frameworks have become the industry standard...

Carlos Alberto Fernández-y-Fernández, Jorge R. Aguilar-Cisneros

2603.10994 2026-03-11
AI LLM

Too Vivid to Be Real? Benchmarking and Calibrating Generative Color Fidelity

Recent advances in text-to-image (T2I) generation have greatly improved visual quality, yet producing images that appear visually authentic to real-world photography remains challenging. This is pa...

Zhengyao Fang, Zexi Jia, Yijia Zhong, Pengcheng Luo, Jinchao Zhang, Guangming Lu, Jun Yu, Wenjie Pei

2603.10990 2026-03-11
TESTING

Theory of Cell Body Lensing and Phototaxis Sign Reversal in "Eyeless" Mutants of Chlamydomonas

Phototaxis of many species of green algae relies upon directional sensitivity of their membrane-bound photoreceptors, which arises from the presence of a pigmented "eyespot" behind them that blocks...

Sumit Kumar Birwa, Ming Yang, Adriana I. Pesci, Raymond E. Goldstein

2603.10986 2026-03-11
AI LLM

Learning Adaptive Force Control for Contact-Rich Sample Scraping with Heterogeneous Materials

The increasing demand for accelerated scientific discovery, driven by global challenges, highlights the need for advanced AI-driven robotics. Deploying robotic chemists in human-centric labs is key...

Cenk Cetin, Shreyas Pouli, Gabriella Pizzuto

2603.10979 2026-03-11
AI LLM

GroundCount: Grounding Vision-Language Models with Object Detection for Mitigating Counting Hallucinations

Vision Language Models (VLMs) exhibit persistent hallucinations in counting tasks, with accuracy substantially lower than other visual reasoning tasks (excluding sentiment). This phenomenon persist...

Boyuan Chen, Minghao Shao, Siddharth Garg, Ramesh Karri, Muhammad Shafique

2603.10978 2026-03-11
AI LLM

Report for NSF Workshop on Algorithm-Hardware Co-design for Medical Applications

This report summarizes the discussions and recommendations from the NSF Workshop on Algorithm-Hardware Co-design for Medical Applications, held on September 26-27, 2024, in Pittsburgh, PA. The work...

Peipei Zhou, Zheng Dong, Insup Lee, Aidong Zhang, Robert Dick, Majid Sarrafzadeh, Xiaodong Wu, We...

2603.10976 2026-03-11
TESTING

Violating the All-or-Nothing Picture of Local Charges in Non-Hermitian Bosonic Chains

We present explicit counterexamples to a widespread empirical expectation that local commuting charges display all-or-nothing behavior. In the class of bosonic chains with symmetric nearest-neighbo...

Mizuki Yamaguchi, Naoto Shiraishi

2603.10972 2026-03-11
AI LLM

TOSSS: a CVE-based Software Security Benchmark for Large Language Models

With their increasing capabilities, Large Language Models (LLMs) are now used across many industries. They have become useful tools for software engineers and support a wide range of development ta...

Marc Damie, Murat Bilgehan Ertan, Domenico Essoussi, Angela Makhanu, Gaëtan Peter, Roos Wensveen

2603.10969 2026-03-11
TESTING

Island Sliding Barriers: A first-principles metric for determining remote epitaxy viability

Remote epitaxy, where a 2D van der Waals material (usually graphene) is inserted on top of the substrate before film epitaxy, has emerged as a promising path for growing electronics with lower defe...

Quinn T. Campbell, Manny Xavier de Jesus Lopez, Anthony Rice, Timothy J. Ruggles, Taisuke Ohta, C...

2603.10968 2026-03-11
AI LLM

Ranking Reasoning LLMs under Test-Time Scaling

Test-time scaling evaluates reasoning LLMs by sampling multiple outputs per prompt, but ranking models in this regime remains underexplored. We formalize dense benchmark ranking under test-time sca...

Mohsen Hariri, Michael Hinczewski, Jing Ma, Vipin Chaudhary

2603.10960 2026-03-11
TESTING

Covariate-adjusted statistical dependence representation through partial copulas: bounds and new insights

In this paper, we revisit the notion of partial copula, originally introduced to test conditional independence, highlighting its capability to represent the dependence between two random variables ...

Vinícius Litvinoff Justus, Felipe Fontana Vieira

2603.10941 2026-03-11
TESTING

STADA: Specification-based Testing for Autonomous Driving Agents

Simulation-based testing has become a standard approach to validating autonomous driving agents prior to real-world deployment. A high-quality validation campaign will exercise an agent in diverse ...

Joy Saha, Trey Woodlief, Sebastian Elbaum, Matthew B. Dwyer

2603.10940 2026-03-11
AI LLM

Bridging the Skill Gap in Clinical CBCT Interpretation with CBCTRepD

Generative AI has advanced rapidly in medical report generation; however, its application to oral and maxillofacial CBCT reporting remains limited, largely because of the scarcity of high-quality p...

Qinxin Wu, Fucheng Niu, Hengchuan Zhu, Yifan Sun, Ye Shen, Xu Li, Han Wu, Leqi Liu, Zhiwen Pan, Z...

2603.10933 2026-03-11
TESTING

Novel Architecture of RPA In Oral Cancer Lesion Detection

Accurate and early detection of oral cancer lesions is crucial for effective diagnosis and treatment. This study evaluates two RPA implementations, OC-RPAv1 and OC-RPAv2, using a test set of 31 ima...

Revana Magdy, Joy Naoum, Ali Hamdi

2603.10928 2026-03-11
TESTING

Training-Free Multi-Step Inference for Target Speaker Extraction

Target speaker extraction (TSE) aims to recover a target speaker's speech from a mixture using a reference utterance as a cue. Most TSE systems adopt conditional auto-encoder architectures with one...

Zhenghai You, Ying Shi, Lantian Li, Dong Wang

2603.10921 2026-03-11
AI LLM

LLM2Vec-Gen: Generative Embeddings from Large Language Models

LLM-based text embedders typically encode the semantic content of their input. However, embedding tasks require mapping diverse inputs to similar outputs. Typically, this input-output is addressed ...

Parishad BehnamGhader, Vaibhav Adlakha, Fabian David Schmidt, Nicolas Chapados, Marius Mosbach, S...

2603.10913 2026-03-11