Research

Papers

Research papers from arXiv and related sources

Total: 4513, AI/LLM: 2483, Testing: 2030
AI LLM

Alignment Reduces Expressed but Not Encoded Gender Bias: A Unified Framework and Study

During training, Large Language Models (LLMs) learn social regularities that can lead to gender bias in downstream applications. Most mitigation efforts focus on reducing bias in generated outputs,...

Nour Bouchouchi, Thibault Laugel, Xavier Renard, Christophe Marsala, Marie-Jeanne Lesot, Marcin D...

2603.24125 2026-03-25
AI LLM

The Alignment Tax: Response Homogenization in Aligned LLMs and Its Implications for Uncertainty Estimation

RLHF-aligned language models exhibit response homogenization: on TruthfulQA (n=790), 40-79% of questions produce a single semantic cluster across 10 i.i.d. samples. On affected questions, sampling-...

Mingyi Liu

2603.24124 2026-03-25
AI LLM

Towards Effective Experiential Learning: Dual Guidance for Utilization and Internalization

Recently, reinforcement learning (RL) has become an important approach for improving the capabilities of large language models (LLMs). In particular, reinforcement learning from verifiable rewards...

Fei Bai, Zhipeng Chen, Chuan Hao, Ming Yang, Ran Tao, Bryan Dai, Wayne Xin Zhao, Jian Yang, Hongt...

2603.24093 2026-03-25
AI LLM

LGTM: Training-Free Light-Guided Text-to-Image Diffusion Model via Initial Noise Manipulation

Diffusion models have demonstrated high-quality performance in conditional text-to-image generation, particularly with structural cues such as edges, layouts, and depth. However, lighting condition...

Ryugo Morita, Stanislav Frolov, Brian Bernhard Moser, Ko Watanabe, Riku Takahashi, Andreas Dengel

2603.24086 2026-03-25
AI LLM

LLMpedia: A Transparent Framework to Materialize an LLM's Encyclopedic Knowledge at Scale

Benchmarks such as MMLU suggest flagship language models approach factuality saturation, with scores above 90%. We show this picture is incomplete. LLMpedia generates encyclopedic articles ...

Muhammed Saeed, Simon Razniewski

2603.24080 2026-03-25
AI LLM

When Understanding Becomes a Risk: Authenticity and Safety Risks in the Emerging Image Generation Paradigm

Recently, multimodal large language models (MLLMs) have emerged as a unified paradigm for language and image generation. Compared with diffusion models, MLLMs possess a much stronger capability for...

Ye Leng, Junjie Chu, Mingjie Li, Chenhao Lin, Chao Shen, Michael Backes, Yun Shen, Yang Zhang

2603.24079 2026-03-25
AI LLM

PosterIQ: A Design Perspective Benchmark for Poster Understanding and Generation

We present PosterIQ, a design-driven benchmark for poster understanding and generation, annotated across composition structure, typographic hierarchy, and semantic intent. It includes 7,765 image-a...

Yuheng Feng, Wen Zhang, Haodong Duan, Xingxing Zou

2603.24078 2026-03-25
AI LLM

ConceptKT: A Benchmark for Concept-Level Deficiency Prediction in Knowledge Tracing

Knowledge Tracing (KT) is a critical technique for modeling student knowledge to support personalized learning. However, most KT systems focus on binary correctness prediction and cannot diagnose t...

Yu-Chen Kang, Yu-Chien Tang, An-Zi Yen

2603.24073 2026-03-25
AI LLM

Enhanced Mycelium of Thought (EMoT): A Bio-Inspired Hierarchical Reasoning Architecture with Strategic Dormancy and Mnemonic Encoding

Current prompting paradigms for large language models (LLMs), including Chain-of-Thought (CoT) and Tree-of-Thoughts (ToT), follow linear or tree-structured reasoning paths that lack persistent memo...

Florian Odi Stummer

2603.24065 2026-03-25
AI LLM

SOMA: Strategic Orchestration and Memory-Augmented System for Vision-Language-Action Model Robustness via In-Context Adaptation

Despite the promise of Vision-Language-Action (VLA) models as generalist robotic controllers, their robustness against perceptual noise and environmental variations in out-of-distribution (OOD) tas...

Zhuoran Li, Zhiyang Li, Kaijun Zhou, Jinyu Gu

2603.24060 2026-03-25
AI LLM

FinToolSyn: A forward synthesis Framework for Financial Tool-Use Dialogue Data with Dynamic Tool Retrieval

Tool-use capabilities are vital for Large Language Models (LLMs) in finance, a domain characterized by massive investment targets and data-intensive inquiries. However, existing data synthesis meth...

Caishuang Huang, Yang Qiao, Rongyu Zhang, Junjie Ye, Pu Lu, Wenxi Wu, Meng Zhou, Xiku Du, Tao Gui...

2603.24051 2026-03-25
AI LLM

Human Factors in Detecting AI-Generated Portraits: Age, Sex, Device, and Confidence

Generative AI now produces photorealistic portraits that circulate widely in social and newslike contexts. Human ability to distinguish real from synthetic faces is time-sensitive because image gen...

Sunwhi Kim, Sunyul Kim

2603.24048 2026-03-25
AI LLM

From Oracle to Noisy Context: Mitigating Contextual Exposure Bias in Speech-LLMs

Contextual automatic speech recognition (ASR) with Speech-LLMs is typically trained with oracle conversation history, but relies on error-prone history at inference, causing a train-test mismatch i...

Xiaoyong Guo, Nanjie Li, Zijie Zeng, Kai Wang, Hao Huang, Haihua Xu, Wei Shi

2603.24034 2026-03-25
AI LLM

Decompose and Transfer: CoT-Prompting Enhanced Alignment for Open-Vocabulary Temporal Action Detection

Open-Vocabulary Temporal Action Detection (OV-TAD) aims to classify and localize action segments in untrimmed videos for unseen categories. Previous methods rely solely on global alignment between ...

Sa Zhu, Wanqian Zhang, Lin Wang, Xiaohua Chen, Chenxu Cui, Jinchao Zhang, Bo Li

2603.24030 2026-03-25
AI LLM

Schema on the Inside: A Two-Phase Fine-Tuning Method for High-Efficiency Text-to-SQL at Scale

Applying large, proprietary API-based language models to text-to-SQL tasks poses a significant industry challenge: reliance on massive, schema-heavy prompts results in prohibitive per-token API cos...

Chinmay Soni, Shivam Chourasia, Gaurav Kumar, Hitesh Kapoor

2603.24023 2026-03-25
AI LLM

Language-Grounded Multi-Agent Planning for Personalized and Fair Participatory Urban Sensing

Participatory urban sensing leverages human mobility for large-scale urban data collection, yet existing methods typically rely on centralized optimization and assume homogeneous participants, resu...

Xusen Guo, Mingxing Peng, Hongliang Lu, Hai Yang, Jun Ma, Yuxuan Liang

2603.24014 2026-03-25
AI LLM

CVPD at QIAS 2026: RAG-Guided LLM Reasoning for Al-Mawarith Share Computation and Heir Allocation

Islamic inheritance (Ilm al-Mawarith) is a multi-stage legal reasoning task requiring the identification of eligible heirs, resolution of blocking rules (hajb), assignment of fixed and residual sha...

Wassim Swaileh, Mohammed-En-Nadhir Zighem, Hichem Telli, Salah Eddine Bekhouche, Abdellah Zakaria...

2603.24012 2026-03-25
AI LLM

Analyzing animal movement using deep learning

Understanding how animals move through heterogeneous landscapes is central to ecology and conservation. In this context, step selection functions (SSFs) have emerged as the main statistical framewo...

Thibault Fronville, Maximilian Pichler, Johannes Signer, Marius Grabow, Stephanie Kramer-Schadt, ...

2603.24009 2026-03-25
AI LLM

Thinking with Tables: Enhancing Multi-Modal Tabular Understanding via Neuro-Symbolic Reasoning

Multimodal Large Language Models (MLLMs) have demonstrated remarkable reasoning capabilities across modalities such as images and text. However, tabular data, despite being a critical real-world mo...

Kun-Yang Yu, Zhi Zhou, Shi-Yu Tian, Xiao-Wen Yang, Zi-Yi Jia, Ming Yang, Zi-Jian Cheng, Lan-Zhe G...

2603.24004 2026-03-25
AI LLM

Forensic Implications of Localized AI: Artifact Analysis of Ollama, LM Studio, and llama.cpp

The proliferation of local Large Language Model (LLM) runners, such as Ollama, LM Studio and llama.cpp, presents a new challenge for digital forensics investigators. These tools enable users to dep...

Shariq Murtuza

2603.23996 2026-03-25