Papers

Research papers from arXiv and related sources

Total: 4513, AI/LLM: 2483, Testing: 2030
AI LLM

Assessing Risks of Large Language Models in Mental Health Support: A Framework for Automated Clinical AI Red Teaming

Large Language Models (LLMs) are increasingly utilized for mental health support; however, current safety benchmarks often fail to detect the complex, longitudinal risks inherent in therapeutic dia...

Ian Steenstra, Paola Pedrelli, Weiyan Shi, Stacy Marsella, Timothy W. Bickmore

2602.19948 2026-02-23
AI LLM

When Pretty Isn't Useful: Investigating Why Modern Text-to-Image Models Fail as Reliable Training Data Generators

Recent text-to-image (T2I) diffusion models produce visually stunning images and demonstrate excellent prompt following. But do they perform well as synthetic vision data generators? In this work, ...

Krzysztof Adamkiewicz, Brian Moser, Stanislav Frolov, Tobias Christian Nauen, Federico Raue, Andr...

2602.19946 2026-02-23
AI LLM

Discover, Segment, and Select: A Progressive Mechanism for Zero-shot Camouflaged Object Segmentation

Current zero-shot Camouflaged Object Segmentation methods typically employ a two-stage pipeline (discover-then-segment): using MLLMs to obtain visual prompts, followed by SAM segmentation. However,...

Yilong Yang, Jianxin Tian, Shengchuan Zhang, Liujuan Cao

2602.19944 2026-02-23
AI LLM

A Replicate-and-Quantize Strategy for Plug-and-Play Load Balancing of Sparse Mixture-of-Experts LLMs

Sparse Mixture-of-Experts (SMoE) architectures are increasingly used to scale large language models efficiently, delivering strong accuracy under fixed compute budgets. However, SMoE models often s...

Zijie Liu, Jie Peng, Jinhao Duan, Zirui Liu, Kaixiong Zhou, Mingfu Liang, Luke Simon, Xi Liu, Zha...

2602.19938 2026-02-23
AI LLM

Guiding Peptide Kinetics via Collective-Variable Tuning of Free-Energy Barriers

While recent advances in AI have transformed protein structure prediction, protein function is also often strongly influenced by the thermodynamic and kinetic features encoded in its underlying fre...

Alexander Zhilkin, Muralika Medaparambath, Dan Mendels

2602.19936 2026-02-23
AI LLM

BeamVLM for Low-altitude Economy: Generative Beam Prediction via Vision-language Models

For low-altitude economy (LAE), fast and accurate beam prediction between high-mobility unmanned aerial vehicles (UAVs) and ground base stations is of paramount importance, which ensures seamless c...

Chenran Kou, Changsheng You, Mingjiang Wu, Dingzhu Wen, Zezhong Zhang, Chengwen Xing

2602.19929 2026-02-23
AI LLM

Rethinking LoRA for Privacy-Preserving Federated Learning in Large Models

Fine-tuning large vision models (LVMs) and large language models (LLMs) under differentially private federated learning (DPFL) is hindered by a fundamental privacy-utility trade-off. Low-Rank Adapt...

Jin Liu, Yinbin Miao, Ning Xi, Junkang Liu

2602.19926 2026-02-23
AI LLM

Janus-Q: End-to-End Event-Driven Trading via Hierarchical-Gated Reward Modeling

Financial market movements are often driven by discrete financial events conveyed through news, whose impacts are heterogeneous, abrupt, and difficult to capture under purely numerical prediction o...

Xiang Li, Zikai Wei, Yiyan Qi, Wanyun Zhou, Xiang Liu, Penglei Sun, Yongqi Zhang, Xiaowen Chu

2602.19919 2026-02-23
AI LLM

Watson & Holmes: A Naturalistic Benchmark for Comparing Human and LLM Reasoning

Existing benchmarks for AI reasoning provide limited insight into how closely these capabilities resemble human reasoning in naturalistic contexts. We present an adaptation of the Watson & Holmes d...

Thatchawin Leelawat, Lewis D Griffin

2602.19914 2026-02-23
AI LLM

Multi-Modal Representation Learning via Semi-Supervised Rate Reduction for Generalized Category Discovery

Generalized Category Discovery (GCD) aims to identify both known and unknown categories, with only partial labels given for the known categories, posing a challenging open-set recognition problem. ...

Wei He, Xianghan Meng, Zhiyuan Huang, Xianbiao Qi, Rong Xiao, Chun-Guang Li

2602.19910 2026-02-23
AI LLM

DSDR: Dual-Scale Diversity Regularization for Exploration in LLM Reasoning

Reinforcement learning with verifiers (RLVR) is a central paradigm for improving large language model (LLM) reasoning, yet existing methods often suffer from limited exploration. Policies tend to c...

Zhongwei Wan, Yun Shen, Zhihao Dou, Donghao Zhou, Yu Zhang, Xin Wang, Hui Shen, Jing Xiong, Chaof...

2602.19895 2026-02-23
AI LLM

LLM-enabled Applications Require System-Level Threat Monitoring

LLM-enabled applications are rapidly reshaping the software ecosystem by using large language models as core reasoning components for complex task execution. This paradigm shift, however, introduce...

Yedi Zhang, Haoyu Wang, Xianglin Yang, Jin Song Dong, Jun Sun

2602.19844 2026-02-23
AI LLM

MAS-FIRE: Fault Injection and Reliability Evaluation for LLM-Based Multi-Agent Systems

As LLM-based Multi-Agent Systems (MAS) are increasingly deployed for complex tasks, ensuring their reliability has become a pressing challenge. Since MAS coordinate through unstructured natural lan...

Jin Jia, Zhiling Deng, Zhuangbin Chen, Yingqi Wang, Zibin Zheng

2602.19843 2026-02-23
AI LLM

SAMAS: A Spectrum-Guided Multi-Agent System for Achieving Style Fidelity in Literary Translation

Modern large language models (LLMs) excel at generating fluent and faithful translations. However, they struggle to preserve an author's unique literary style, often producing semantically correct ...

Jingzhuo Wu, Jiajun Zhang, Keyan Jin, Dehua Ma, Junbo Wang

2602.19840 2026-02-23
AI LLM

An Explainable Memory Forensics Approach for Malware Analysis

Memory forensics is an effective methodology for analyzing living-off-the-land malware, including threats that employ evasion, obfuscation, anti-analysis, and steganographic techniques. By capturin...

Silvia Lucia Sanna, Davide Maiorca, Giorgio Giacinto

2602.19831 2026-02-23
AI LLM

Semantic Caching for OLAP via LLM-Based Query Canonicalization (Extended Version)

Analytical workloads exhibit substantial semantic repetition, yet most production caches key entries by SQL surface form (text or AST), fragmenting reuse across BI tools, notebooks, and NL interfac...

Laurent Bindschaedler

2602.19811 2026-02-23
AI LLM

OpenClaw, Moltbook, and ClawdLab: From Agent-Only Social Networks to Autonomous Scientific Research

In January 2026, the open-source agent framework OpenClaw and the agent-only social network Moltbook produced a large-scale dataset of autonomous AI-to-AI interaction, attracting six academic publi...

Lukas Weidener, Marko Brkić, Mihailo Jovanović, Ritvik Singh, Emre Ulgac, Aakaash Meduri

2602.19810 2026-02-23
AI LLM

Stop Preaching and Start Practising Data Frugality for Responsible Development of AI

This position paper argues that the machine learning community must move from preaching to practising data frugality for responsible artificial intelligence (AI) development. For long, progress has...

Sophia N. Wilson, Guðrún Fjóla Guðmundsdóttir, Andrew Millard, Raghavendra Selvan, Sebastian Mair

2602.19789 2026-02-23
AI LLM

Janus-Faced Technological Progress and the Arms Race in the Education of Humans and Chatbots

We study the conditions under which technological advances, in combination with a lognormal wage distribution, incentivize agents into an inefficient educational arms race. Our model emphasizes tha...

Wolfgang Kuhle

2602.19783 2026-02-23
AI LLM

AegisSat: Securing AI-Enabled SoC FPGA Satellite Platforms

The increasing adoption of System-on-Chip Field-Programmable Gate Arrays (SoC FPGAs) in AI-enabled satellite systems, valued for their reconfigurability and in-orbit update capabilities, introduces...

Huimin Li, Vusal Novruzov, Nikhilesh Singh, Lichao Wu, Mohamadreza Rostami, Ahmad-Reza Sadeghi

2602.19777 2026-02-23