Papers
Research papers from arXiv and related sources
BRIDGE the Gap: Mitigating Bias Amplification in Automated Scoring of English Language Learners via Inter-group Data Augmentation
In the field of educational assessment, automated scoring systems increasingly rely on deep learning and large language models (LLMs). However, these systems face significant risks of bias amplific...
Yun Wang, Xuansheng Wu, Jingyuan Huang, Lei Liu, Xiaoming Zhai, Ninghao Liu
VaSST: Variational Inference for Symbolic Regression using Soft Symbolic Trees
Symbolic regression has recently gained traction in AI-driven scientific discovery, aiming to recover explicit closed-form expressions from data that reveal underlying physical laws. Despite recent...
Somjit Roy, Pritam Dey, Bani K. Mallick
Rudder: Steering Prefetching in Distributed GNN Training using LLM Agents
Large-scale Graph Neural Networks (GNNs) are typically trained by sampling a vertex's neighbors to a fixed distance. Because large input graphs are distributed, training requires frequent irregular...
Aishwarya Sarkar, Sayan Ghosh, Nathan Tallent, Aman Chadha, Tanya Roosta, Ali Jannesari
Humans and LLMs Diverge on Probabilistic Inferences
Human reasoning often involves working over limited information to arrive at probabilistic conclusions. In its simplest form, this involves making an inference that is not strictly entailed by a pr...
Gaurav Kamath, Sreenath Madathil, Sebastian Schuster, Marie-Catherine de Marneffe, Siva Reddy
Uncertainty-aware Language Guidance for Concept Bottleneck Models
Concept Bottleneck Models (CBMs) provide inherent interpretability by first mapping input samples to high-level semantic concepts, followed by a combination of these concepts for the final classifi...
Yangyi Li, Mengdi Huai
Automated Extraction of Unstructured Post-SBRT Toxicity Data from Radiology Reports Using Large Language Models
We evaluated the viability of using a Large Language Model (LLM) to extract patient-specific toxicity and progression outcomes from unstructured radiology reports. We retrospectively extra...
Justin Pijanowski, Yakout Mezgueldi, Alan Lee, Drew Moghanaki, Ricky R. Savjani, James Lamb
IDP Accelerator: Agentic Document Intelligence from Extraction to Compliance Validation
Understanding and extracting structured insights from unstructured documents remains a foundational challenge in industrial NLP. While Large Language Models (LLMs) enable zero-shot extraction, trad...
Md Mofijul Islam, Md Sirajus Salekin, Joe King, Priyashree Roy, Vamsi Thilak Gudi, Spencer Romo, ...
FHIRPath-QA: Executable Question Answering over FHIR Electronic Health Records
Though patients are increasingly granted digital access to their electronic health records (EHRs), existing interfaces may not support precise, trustworthy answers to patient-specific questions. La...
Michael Frew, Nishit Bheda, Bryan Tripp
CACTUSDB: Unlock Co-Optimization Opportunities for SQL and AI/ML Inferences
There is a growing demand for supporting inference queries that combine Structured Query Language (SQL) and Artificial Intelligence / Machine Learning (AI/ML) model inferences in database systems, ...
Lixi Zhou, Kanchan Chowdhury, Lulu Xie, Jaykumar Tandel, Hong Guan, Zhiwei Fan, Xinwei Fu, Jia Zou
CiteAudit: You Cited It, But Did You Read It? A Benchmark for Verifying Scientific References in the LLM Era
Scientific research relies on accurate citation for attribution and integrity, yet large language models (LLMs) introduce a new risk: fabricated references that appear plausible but correspond to n...
Zhengqing Yuan, Kaiwen Shi, Zheyuan Zhang, Lichao Sun, Nitesh V. Chawla, Yanfang Ye
Truncated Step-Level Sampling with Process Rewards for Retrieval-Augmented Reasoning
Training large language models to reason with search engines via reinforcement learning is hindered by a fundamental credit assignment problem: existing methods such as Search-R1 provide only a spa...
Chris Samarinas, Haw-Shiuan Chang, Hamed Zamani
Learning dynamics from online-offline systems of LLM agents
Online information is increasingly linked to real-world instability, especially as automated accounts and LLM-based agents help spread and amplify news. In this work, we study how information sprea...
Moyi Tian, George Mohler, P. Jeffrey Brantingham, Nancy Rodríguez
MediX-R1: Open Ended Medical Reinforcement Learning
We introduce MediX-R1, an open-ended Reinforcement Learning (RL) framework for medical multimodal large language models (MLLMs) that enables clinically grounded, free-form answers beyond multiple-c...
Sahal Shaji Mullappilly, Mohammed Irfan Kurpath, Omair Mohamed, Mohamed Zidan, Fahad Khan, Salman...
EvoX: Meta-Evolution for Automated Discovery
Recent work such as AlphaEvolve has shown that combining LLM-driven optimization with evolutionary search can effectively improve programs, prompts, and algorithms across domains. In this paradigm,...
Shu Liu, Shubham Agarwal, Monishwaran Maheswaran, Mert Cemri, Zhifei Li, Qiuyang Mang, Ashwin Nar...
Retrieve and Segment: Are a Few Examples Enough to Bridge the Supervision Gap in Open-Vocabulary Segmentation?
Open-vocabulary segmentation (OVS) extends the zero-shot recognition capabilities of vision-language models (VLMs) to pixel-level prediction, enabling segmentation of arbitrary categories specified...
Tilemachos Aravanis, Vladan Stojnić, Bill Psomas, Nikos Komodakis, Giorgos Tolias
Understanding Usage and Engagement in AI-Powered Scientific Research Tools: The Asta Interaction Dataset
AI-powered scientific research tools are rapidly being integrated into research workflows, yet the field lacks a clear lens into how researchers use these systems in real-world settings. We present...
Dany Haddad, Dan Bareket, Joseph Chee Chang, Jay DeYoung, Jena D. Hwang, Uri Katz, Mark Polak, Sa...
Utilizing LLMs for Industrial Process Automation
In recent years, a growing number of publications have addressed best practices for using Large Language Models (LLMs) in software engineering. However, most of this work focuses on widely-used general p...
Salim Fares
Toward Expert Investment Teams: A Multi-Agent LLM System with Fine-Grained Trading Tasks
The advancement of large language models (LLMs) has accelerated the development of autonomous financial trading systems. While mainstream approaches deploy multi-agent systems mimicking analyst and...
Kunihiro Miyazaki, Takanobu Kawahara, Stephen Roberts, Stefan Zohren
LLM Novice Uplift on Dual-Use, In Silico Biology Tasks
Large language models (LLMs) perform increasingly well on biology benchmarks, but it remains unclear whether they uplift novice users -- i.e., enable humans to perform better than with internet-onl...
Chen Bo Calvin Zhang, Christina Q. Knight, Nicholas Kruus, Jason Hausenloy, Pedro Medeiros, Natha...
Invariant Transformation and Resampling based Epistemic-Uncertainty Reduction
An artificial intelligence (AI) model can be viewed as a function that maps inputs to outputs in high-dimensional spaces. Once designed and well trained, the AI model is applied for inference. Howe...
Sha Hu