Research

Papers

Research papers from arXiv and related sources

Total: 4513, AI/LLM: 2483, Testing: 2030
AI LLM

AnimeAgent: Is the Multi-Agent via Image-to-Video models a Good Disney Storytelling Artist?

Custom Storyboard Generation (CSG) aims to produce high-quality, multi-character consistent storytelling. Current approaches based on static diffusion models, whether used in a one-shot manner or w...

Hailong Yan, Shice Liu, Tao Wang, Xiangtao Zhang, Yijie Zhong, Jinwei Chen, Le Zhang, Bo Li

2602.20664 2026-02-24
AI LLM

ICSSPulse: A Modular LLM-Assisted Platform for Industrial Control System Penetration Testing

It is well established that industrial control systems comprise the operational backbone of modern critical infrastructures, yet their increasing connectivity exposes them to cyber threats that are...

Michail Takaronis, Athanasia Kollarou, Vyron Kampourakis, Vasileios Gkioulos, Sokratis Katsikas

2602.20663 2026-02-24
AI LLM

TOM: A Ternary Read-only Memory Accelerator for LLM-powered Edge Intelligence

The deployment of Large Language Models (LLMs) for real-time intelligence on edge devices is rapidly growing. However, conventional hardware architectures face a fundamental memory wall challenge, ...

Hongyi Guan, Yijia Zhang, Wenqiang Wang, Yizhao Gao, Shijie Cao, Chen Zhang, Ningyi Xu

2602.20662 2026-02-24
AI LLM

Lagom: Unleashing the Power of Communication and Computation Overlapping for Distributed LLM Training

Overlapping communication with computation is crucial for distributed large-model training, yet optimizing it, especially when computation becomes the bottleneck, remains challenging. We present La...

Guanbin Xu, ZhenGuo Xu, Yuzhe Li, Youhui Bai, Ping Gong, Chaoyi Ruan, Cheng Li

2602.20656 2026-02-24
TESTING

PLATOSpec's first results: Three new transiting warm Jupiters from the WINE survey TIC 147027702, TIC 245076932 and TIC 87422071

We report the discovery and characterisation of three transiting warm Jupiters: TIC 147027702b, TIC 245076932b and TIC 87422071b. These systems were initially identified as transiting candidates us...

Pavol Gajdoš, Rafael Brahm, Lorena Acuña-Aguirre, Matías I. Jones, Helem Salinas, Jozef Lipták, A...

2602.20654 2026-02-24
TESTING

DANCE: Doubly Adaptive Neighborhood Conformal Estimation

The recent developments of complex deep learning models have led to unprecedented ability to accurately predict across multiple data representation types. Conformal prediction for uncertainty quant...

Brandon R. Feng, Brian J. Reich, Daniel Beaglehole, Xihaier Luo, David Keetae Park, Shinjae Yoo, ...

2602.20652 2026-02-24
AI LLM

CARE: An Explainable Computational Framework for Assessing Client-Perceived Therapeutic Alliance Using Large Language Models

Client perceptions of the therapeutic alliance are critical for counseling effectiveness. Accurately capturing these perceptions remains challenging, as traditional post-session questionnaires are ...

Anqi Li, Chenxiao Wang, Yu Lu, Renjun Xu, Lizhi Ma, Zhenzhong Lan

2602.20648 2026-02-24
AI LLM

An LLM-driven Scenario Generation Pipeline Using an Extended Scenic DSL for Autonomous Driving Safety Validation

Real-world crash reports, which combine textual summaries and sketches, are valuable for scenario-based testing of autonomous driving systems (ADS). However, current methods cannot effectively tran...

Fida Khandaker Safa, Yupeng Jiang, Xi Zheng

2602.20644 2026-02-24
AI LLM

Grounding LLMs in Scientific Discovery via Embodied Actions

Large Language Models (LLMs) have shown significant potential in scientific discovery but struggle to bridge the gap between theoretical reasoning and verifiable physical simulation. Existing solut...

Bo Zhang, Jinfeng Zhou, Yuxuan Chen, Jianing Yin, Minlie Huang, Hongning Wang

2602.20639 2026-02-24
AI LLM

AI Combines, Humans Socialise: A SECI-based Experience Report on Business Simulation Games

Background. Business Simulation Games (BSG) are widely used to foster experiential learning in complex managerial and organisational contexts by exposing students to decision-making under uncertain...

Nordine Benkeltoum

2602.20633 2026-02-24
AI LLM

QEDBENCH: Quantifying the Alignment Gap in Automated Evaluation of University-Level Mathematical Proofs

As Large Language Models (LLMs) saturate elementary benchmarks, the research frontier has shifted from generation to the reliability of automated evaluation. We demonstrate that standard "LLM-as-a-...

Santiago Gonzalez, Alireza Amiri Bavandpour, Peter Ye, Edward Zhang, Ruslans Aleksejevs, Todor An...

2602.20629 2026-02-24
AI LLM

When can we trust untrusted monitoring? A safety case sketch across collusion strategies

AIs are increasingly being deployed with greater autonomy and capabilities, which increases the risk that a misaligned AI may be able to cause catastrophic harm. Untrusted monitoring -- using one u...

Nelson Gardner-Challis, Jonathan Bostock, Georgiy Kozhevnikov, Morgan Sinclaire, Joan Velja, Ales...

2602.20628 2026-02-24
AI LLM

Physics-based phenomenological characterization of cross-modal bias in multimodal models

The term 'algorithmic fairness' is used to evaluate whether AI models operate fairly in both comparative (where fairness is understood as formal equality, such as "treat like cases as like") and no...

Hyeongmo Kim, Sohyun Kang, Yerin Choi, Seungyeon Ji, Junhyuk Woo, Hyunsuk Chung, Soyeon Caren Han...

2602.20624 2026-02-24
TESTING

A Low Cost Picoseconds Precision Timing and Synchronization Over A Hundred Kilometer

Large-scale systems, such as very large accelerators used for fundamental research, require the implementation of precise timing and synchronization systems over distances of several tens of kilome...

Alice Renaux, Ronic Chiche, A. Martens, Antoine Back, Paul-Éric Pottie, Daniel Charlet

2602.20622 2026-02-24
AI LLM

RecoverMark: Robust Watermarking for Localization and Recovery of Manipulated Faces

The proliferation of AI-generated content has facilitated sophisticated face manipulation, severely undermining visual integrity and posing unprecedented challenges to intellectual property. In res...

Haonan An, Xiaohui Ye, Guang Hua, Yihang Tao, Hangcheng Cao, Xiangyu Yu, Yuguang Fang

2602.20618 2026-02-24
AI LLM

Amortized Bayesian inference for actigraph time sheet data from mobile devices

Mobile data technologies use "actigraphs" to furnish information on health variables as a function of a subject's movement. The advent of wearable devices and related technologies has propelled t...

Daniel Zhou, Sudipto Banerjee

2602.20611 2026-02-24
AI LLM

SpecMind: Cognitively Inspired, Interactive Multi-Turn Framework for Postcondition Inference

Specifications are vital for ensuring program correctness, yet writing them manually remains challenging and time-intensive. Recent large language model (LLM)-based methods have shown successes in ...

Cuong Chi Le, Minh V. T Pham, Tung Vu Duy, Cuong Duc Van, Huy N. Phan, Hoang N. Phan, Tien N. Nguyen

2602.20610 2026-02-24
TESTING

Correlator-Level Verification of Mass and Current Maps in Abelian Chern-Simons Dualities

We construct an explicit local operator realization that reproduces Dirac fermion correlation functions in three spacetime dimensions within an Abelian Chern-Simons framework and use it to examine ...

Vaibhav Wasnik

2602.20604 2026-02-24
TESTING

A Case Study on Runtime Verification of a Continuous Deployment Process

We report our experience in applying runtime monitoring to a FluxCD-based continuous deployment (CD) process. Our target system consists of GitHub Actions, GitHub Container Registry (GHCR), FluxCD,...

Shoma Ansai, Masaki Waga

2602.20598 2026-02-24
AI LLM

OptiLeak: Efficient Prompt Reconstruction via Reinforcement Learning in Multi-tenant LLM Services

Multi-tenant LLM serving frameworks widely adopt shared Key-Value caches to enhance efficiency. However, this creates side-channel vulnerabilities enabling prompt leakage attacks. Prior studies ide...

Longxiang Wang, Xiang Zheng, Xuhao Zhang, Yao Zhang, Ye Wu, Cong Wang

2602.20595 2026-02-24