Research

Papers

Research papers from arXiv and related sources

Total: 4513, AI/LLM: 2483, Testing: 2030
AI LLM

BRIDGE the Gap: Mitigating Bias Amplification in Automated Scoring of English Language Learners via Inter-group Data Augmentation

In the field of educational assessment, automated scoring systems increasingly rely on deep learning and large language models (LLMs). However, these systems face significant risks of bias amplific...

Yun Wang, Xuansheng Wu, Jingyuan Huang, Lei Liu, Xiaoming Zhai, Ninghao Liu

2602.23580 2026-02-27
AI LLM

VaSST: Variational Inference for Symbolic Regression using Soft Symbolic Trees

Symbolic regression has recently gained traction in AI-driven scientific discovery, aiming to recover explicit closed-form expressions from data that reveal underlying physical laws. Despite recent...

Somjit Roy, Pritam Dey, Bani K. Mallick

2602.23561 2026-02-27
AI LLM

Rudder: Steering Prefetching in Distributed GNN Training using LLM Agents

Large-scale Graph Neural Networks (GNNs) are typically trained by sampling a vertex's neighbors to a fixed distance. Because large input graphs are distributed, training requires frequent irregular...

Aishwarya Sarkar, Sayan Ghosh, Nathan Tallent, Aman Chadha, Tanya Roosta, Ali Jannesari

2602.23556 2026-02-26
AI LLM

Humans and LLMs Diverge on Probabilistic Inferences

Human reasoning often involves working over limited information to arrive at probabilistic conclusions. In its simplest form, this involves making an inference that is not strictly entailed by a pr...

Gaurav Kamath, Sreenath Madathil, Sebastian Schuster, Marie-Catherine de Marneffe, Siva Reddy

2602.23546 2026-02-26
AI LLM

Uncertainty-aware Language Guidance for Concept Bottleneck Models

Concept Bottleneck Models (CBMs) provide inherent interpretability by first mapping input samples to high-level semantic concepts, followed by a combination of these concepts for the final classifi...

Yangyi Li, Mengdi Huai

2602.23495 2026-02-26
AI LLM

Automated Extraction of Unstructured Post-SBRT Toxicity Data from Radiology Reports Using Large Language Models

We evaluated the viability of using a Large Language Model (LLM) to extract patient-specific toxicity and progression outcomes from unstructured radiology reports. We retrospectively extra...

Justin Pijanowski, Yakout Mezgueldi, Alan Lee, Drew Moghanaki, Ricky R. Savjani, James Lamb

2602.23492 2026-02-26
AI LLM

IDP Accelerator: Agentic Document Intelligence from Extraction to Compliance Validation

Understanding and extracting structured insights from unstructured documents remains a foundational challenge in industrial NLP. While Large Language Models (LLMs) enable zero-shot extraction, trad...

Md Mofijul Islam, Md Sirajus Salekin, Joe King, Priyashree Roy, Vamsi Thilak Gudi, Spencer Romo, ...

2602.23481 2026-02-26
AI LLM

FHIRPath-QA: Executable Question Answering over FHIR Electronic Health Records

Though patients are increasingly granted digital access to their electronic health records (EHRs), existing interfaces may not support precise, trustworthy answers to patient-specific questions. La...

Michael Frew, Nishit Bheda, Bryan Tripp

2602.23479 2026-02-26
AI LLM

CACTUSDB: Unlock Co-Optimization Opportunities for SQL and AI/ML Inferences

There is a growing demand for supporting inference queries that combine Structured Query Language (SQL) and Artificial Intelligence / Machine Learning (AI/ML) model inferences in database systems, ...

Lixi Zhou, Kanchan Chowdhury, Lulu Xie, Jaykumar Tandel, Hong Guan, Zhiwei Fan, Xinwei Fu, Jia Zou

2602.23469 2026-02-26
AI LLM

CiteAudit: You Cited It, But Did You Read It? A Benchmark for Verifying Scientific References in the LLM Era

Scientific research relies on accurate citation for attribution and integrity, yet large language models (LLMs) introduce a new risk: fabricated references that appear plausible but correspond to n...

Zhengqing Yuan, Kaiwen Shi, Zheyuan Zhang, Lichao Sun, Nitesh V. Chawla, Yanfang Ye

2602.23452 2026-02-26
AI LLM

Truncated Step-Level Sampling with Process Rewards for Retrieval-Augmented Reasoning

Training large language models to reason with search engines via reinforcement learning is hindered by a fundamental credit assignment problem: existing methods such as Search-R1 provide only a spa...

Chris Samarinas, Haw-Shiuan Chang, Hamed Zamani

2602.23440 2026-02-26
AI LLM

Learning dynamics from online-offline systems of LLM agents

Online information is increasingly linked to real-world instability, especially as automated accounts and LLM-based agents help spread and amplify news. In this work, we study how information sprea...

Moyi Tian, George Mohler, P. Jeffrey Brantingham, Nancy Rodríguez

2602.23437 2026-02-26
AI LLM

MediX-R1: Open Ended Medical Reinforcement Learning

We introduce MediX-R1, an open-ended Reinforcement Learning (RL) framework for medical multimodal large language models (MLLMs) that enables clinically grounded, free-form answers beyond multiple-c...

Sahal Shaji Mullappilly, Mohammed Irfan Kurpath, Omair Mohamed, Mohamed Zidan, Fahad Khan, Salman...

2602.23363 2026-02-26
AI LLM

EvoX: Meta-Evolution for Automated Discovery

Recent work such as AlphaEvolve has shown that combining LLM-driven optimization with evolutionary search can effectively improve programs, prompts, and algorithms across domains. In this paradigm,...

Shu Liu, Shubham Agarwal, Monishwaran Maheswaran, Mert Cemri, Zhifei Li, Qiuyang Mang, Ashwin Nar...

2602.23413 2026-02-26
AI LLM

Retrieve and Segment: Are a Few Examples Enough to Bridge the Supervision Gap in Open-Vocabulary Segmentation?

Open-vocabulary segmentation (OVS) extends the zero-shot recognition capabilities of vision-language models (VLMs) to pixel-level prediction, enabling segmentation of arbitrary categories specified...

Tilemachos Aravanis, Vladan Stojnić, Bill Psomas, Nikos Komodakis, Giorgos Tolias

2602.23339 2026-02-26
AI LLM

Understanding Usage and Engagement in AI-Powered Scientific Research Tools: The Asta Interaction Dataset

AI-powered scientific research tools are rapidly being integrated into research workflows, yet the field lacks a clear lens into how researchers use these systems in real-world settings. We present...

Dany Haddad, Dan Bareket, Joseph Chee Chang, Jay DeYoung, Jena D. Hwang, Uri Katz, Mark Polak, Sa...

2602.23335 2026-02-26
AI LLM

Utilizing LLMs for Industrial Process Automation

In recent years, a growing number of publications have addressed best practices for using Large Language Models (LLMs) in software engineering. However, most of this work focuses on widely-used general p...

Salim Fares

2602.23331 2026-02-26
AI LLM

Toward Expert Investment Teams: A Multi-Agent LLM System with Fine-Grained Trading Tasks

The advancement of large language models (LLMs) has accelerated the development of autonomous financial trading systems. While mainstream approaches deploy multi-agent systems mimicking analyst and...

Kunihiro Miyazaki, Takanobu Kawahara, Stephen Roberts, Stefan Zohren

2602.23330 2026-02-26
AI LLM

LLM Novice Uplift on Dual-Use, In Silico Biology Tasks

Large language models (LLMs) perform increasingly well on biology benchmarks, but it remains unclear whether they uplift novice users -- i.e., enable humans to perform better than with internet-onl...

Chen Bo Calvin Zhang, Christina Q. Knight, Nicholas Kruus, Jason Hausenloy, Pedro Medeiros, Natha...

2602.23329 2026-02-26
AI LLM

Invariant Transformation and Resampling based Epistemic-Uncertainty Reduction

An artificial intelligence (AI) model can be viewed as a function that maps inputs to outputs in high-dimensional spaces. Once designed and well trained, the AI model is applied for inference. Howe...

Sha Hu

2602.23315 2026-02-26