Papers

Research papers from arXiv and related sources

Total: 4694; AI/LLM: 2583; Testing: 2111
AI LLM

Evolving Medical Imaging Agents via Experience-driven Self-skill Discovery

Clinical image interpretation is inherently multi-step and tool-centric: clinicians iteratively combine visual evidence with patient context, quantify findings, and refine their decisions through a...

Lin Fan, Pengyu Dai, Zhipeng Deng, Haolin Wang, Xun Gong, Yefeng Zheng, Yafei Ou

2603.05860 2026-03-06
AI LLM

How Well Do Current Speech Deepfake Detection Methods Generalize to the Real World?

Recent advances in speech synthesis and voice conversion have greatly improved the naturalness and authenticity of generated audio. Meanwhile, evolving encoding, compression, and transmission mecha...

Daixian Li, Jun Xue, Yanzhen Ren, Zhuolin Yi, Yihuan Huang, Guanxiang Feng, Yi Chai

2603.05852 2026-03-06
AI LLM

The Values of Value in AI Adoption: Rethinking Efficiency in UX Designers' Workplaces

Although organizations increasingly position AI adoption as a pathway to competitiveness and innovation, organizations' perspectives on productivity and efficiency often clash with workers' perspec...

Inha Cha, Catherine Wieczorek, Richmond Y. Wong

2603.05848 2026-03-06
AI LLM

Evaluating LLM Alignment With Human Trust Models

Trust plays a pivotal role in enabling effective cooperation, reducing uncertainty, and guiding decision-making in both human interactions and multi-agent systems. Although it is significant, there...

Anushka Debnath, Stephen Cranefield, Bastin Tony Roy Savarimuthu, Emiliano Lorini

2603.05839 2026-03-06
AI LLM

Lexara: A User-Centered Toolkit for Evaluating Large Language Models for Conversational Visual Analytics

Large Language Models (LLMs) are transforming Conversational Visual Analytics (CVA) by enabling data analysis through natural language. However, evaluating LLMs for CVA remains a challenge: requiri...

Srishti Palani, Vidya Setlur

2603.05832 2026-03-06
AI LLM

Knowledge-driven Reasoning for Mobile Agentic AI: Concepts, Approaches, and Directions

Mobile agentic AI is extending autonomous capabilities to resource-constrained platforms such as edge robots and unmanned aerial vehicles (UAVs), where strict size, weight, power, and cost (SWAP-C)...

Guangyuan Liu, Changyuan Zhao, Yinqiu Liu, Dusit Niyato, Biplab Sikdar

2603.05831 2026-03-06
AI LLM

Test-Time Adaptation via Many-Shot Prompting: Benefits, Limits, and Pitfalls

Test-time adaptation enables large language models (LLMs) to modify their behavior at inference without updating model parameters. A common approach is many-shot prompting, where large numbers of i...

Shubhangi Upasani, Chen Wu, Jay Rainton, Bo Li, Changran Hu, Qizheng Zhang, Urmish Thakker

2603.05829 2026-03-06
AI LLM

HART: Data-Driven Hallucination Attribution and Evidence-Based Tracing for Large Language Models

Large language models (LLMs) have demonstrated remarkable performance in text generation and knowledge-intensive question answering. Nevertheless, they are prone to producing hallucinated content, ...

Shize Liang, Hongzhi Wang

2603.05828 2026-03-06
AI LLM

Self-Auditing Parameter-Efficient Fine-Tuning for Few-Shot 3D Medical Image Segmentation

Adapting foundation models to new clinical sites remains challenging in practice. Domain shift and scarce annotations must be handled by experts, yet many clinical groups do not have ready access t...

Son Thai Ly, Hien V. Nguyen

2603.05822 2026-03-06
AI LLM

POET-X: Memory-efficient LLM Training by Scaling Orthogonal Transformation

Efficient and stable training of large language models (LLMs) remains a core challenge in modern machine learning systems. To address this challenge, Reparameterized Orthogonal Equivalence Training...

Zeju Qiu, Lixin Liu, Adrian Weller, Han Shi, Weiyang Liu

2603.05500 2026-03-05
AI LLM

Censored LLMs as a Natural Testbed for Secret Knowledge Elicitation

Large language models sometimes produce false or misleading responses. Two approaches to this problem are honesty elicitation -- modifying prompts or weights so that the model answers truthfully --...

Helena Casademunt, Bartosz Cywiński, Khoi Tran, Arya Jakkli, Samuel Marks, Neel Nanda

2603.05494 2026-03-05
AI LLM

cuRoboV2: Dynamics-Aware Motion Generation with Depth-Fused Distance Fields for High-DoF Robots

Effective robot autonomy requires motion generation that is safe, feasible, and reactive. Current methods are fragmented: fast planners output physically unexecutable trajectories, reactive control...

Balakumar Sundaralingam, Adithyavairavan Murali, Stan Birchfield

2603.05493 2026-03-05
AI LLM

NL2GDS: LLM-aided interface for Open Source Chip Design

The growing complexity of hardware design and the widening gap between high-level specifications and register-transfer level (RTL) implementation hinder rapid prototyping and system design. We intr...

Max Eland, Jeyan Thiyagalingam, Dinesh Pamunuwa, Roshan Weerasekera

2603.05489 2026-03-05
AI LLM

Observing and Controlling Features in Vision-Language-Action Models

Vision-Language-Action Models (VLAs) have shown remarkable progress towards embodied intelligence. While their architecture partially resembles that of Large Language Models (LLMs), VLAs exhibit hi...

Hugo Buurmeijer, Carmen Amo Alonso, Aiden Swann, Marco Pavone

2603.05487 2026-03-05
AI LLM

Towards Provably Unbiased LLM Judges via Bias-Bounded Evaluation

As AI models progress beyond simple chatbots into more complex workflows, we draw ever closer to the event horizon beyond which AI systems will be utilized in autonomous, self-maintaining feedback ...

Benjamin Feuer, Lucas Rosenblatt, Oussama Elachqar

2603.05485 2026-03-05
AI LLM

Leveraging LLM Parametric Knowledge for Fact Checking without Retrieval

Trustworthiness is a core research challenge for agentic AI systems built on Large Language Models (LLMs). To enhance trust, natural language claims from diverse sources, including human-written te...

Artem Vazhentsev, Maria Marina, Daniil Moskovskiy, Sergey Pletenev, Mikhail Seleznyov, Mikhail Sa...

2603.05471 2026-03-05
AI LLM

Kraus Constrained Sequence Learning For Quantum Trajectories from Continuous Measurement

Real-time reconstruction of conditional quantum states from continuous measurement records is a fundamental requirement for quantum feedback control, yet standard stochastic master equation (SME) s...

Priyanshi Singh, Krishna Bhatia

2603.05468 2026-03-05
AI LLM

FlashAttention-4: Algorithm and Kernel Pipelining Co-Design for Asymmetric Hardware Scaling

Attention, as a core layer of the ubiquitous Transformer architecture, is the bottleneck for large language models and long-context applications. While FlashAttention-3 optimized attention for Hopp...

Ted Zadouri, Markus Hoehnerbach, Jay Shah, Timmy Liu, Vijay Thakkar, Tri Dao

2603.05451 2026-03-05
AI LLM

Distributed Partial Information Puzzles: Examining Common Ground Construction Under Epistemic Asymmetry

Establishing common ground, a shared set of beliefs and mutually recognized facts, is fundamental to collaboration, yet remains a challenge for current AI systems, especially in multimodal, multipa...

Yifan Zhu, Mariah Bradford, Kenneth Lai, Timothy Obiso, Videep Venkatesha, James Pustejovsky, Nik...

2603.05450 2026-03-05
AI LLM

SAIL: Similarity-Aware Guidance and Inter-Caption Augmentation-based Learning for Weakly-Supervised Dense Video Captioning

Weakly-Supervised Dense Video Captioning aims to localize and describe events in videos trained only on caption annotations, without temporal boundaries. Prior work introduced an implicit supervisi...

Ye-Chan Kim, SeungJu Cha, Si-Woo Kim, Minju Jeon, Hyungee Kim, Dong-Jin Kim

2603.05437 2026-03-05