Papers
Research papers from arXiv and related sources
Implicit Style Conditioning: A Structured Style-Rewrite Framework for Low-Resource Character Modeling
Large Language Models (LLMs) have demonstrated impressive capabilities in role-playing (RP); however, achieving highly stylized personas with Small Language Models (SLMs) remains a challenge due to data scar...
Chanhui Zhu
A Persistent-State Dataflow Accelerator for Memory-Bound Linear Attention Decode on FPGA
Gated DeltaNet (GDN) is a linear attention mechanism that replaces the growing KV cache with a fixed-size recurrent state. Hybrid LLMs like Qwen3-Next use 75% GDN layers and achieve competitive acc...
Neelesh Gupta, Peter Wang, Rajgopal Kannan, Viktor K. Prasanna
Weak-SIGReg: Covariance Regularization for Stable Deep Learning
Modern neural network optimization relies heavily on architectural priors, such as Batch Normalization and residual connections, to stabilize training dynamics. Without these, or in low-data regimes wi...
Habibullah Akbar
Learning Next Action Predictors from Human-Computer Interaction
Truly proactive AI systems must anticipate what we will do next. This foresight demands far richer information than the sparse signals we type into our prompts -- it demands reasoning over the enti...
Omar Shaikh, Valentin Teutschbein, Kanishk Gandhi, Yikun Chi, Nick Haber, Thomas Robinson, Nilam ...
DeepFact: Co-Evolving Benchmarks and Agents for Deep Research Factuality
Search-augmented LLM agents can produce deep research reports (DRRs), but verifying claim-level factuality remains challenging. Existing fact-checkers are primarily designed for general-domain, fac...
Yukun Huang, Leonardo F. R. Ribeiro, Momchil Hardalov, Bhuwan Dhingra, Markus Dreyer, Venkatesh S...
The World Won't Stay Still: Programmable Evolution for Agent Benchmarks
LLM-powered agents fulfill user requests by interacting with environments, querying data, and invoking tools in a multi-turn process. Yet, most existing benchmarks assume static environments with f...
Guangrui Li, Yaochen Xie, Yi Liu, Ziwei Dong, Xingyuan Pan, Tianqi Zheng, Jason Choi, Michael J. ...
InfoGatherer: Principled Information Seeking via Evidence Retrieval and Strategic Questioning
LLMs are increasingly deployed in high-stakes domains such as medical triage and legal assistance, often as document-grounded QA systems in which a user provides a description, relevant sources are...
Maksym Taranukhin, Shuyue Stella Li, Evangelos Milios, Geoff Pleiss, Yulia Tsvetkov, Vered Shwartz
LUMINA: LLM-Guided GPU Architecture Exploration via Bottleneck Analysis
GPU design space exploration (DSE) for modern AI workloads, such as Large-Language Model (LLM) inference, is challenging because of GPUs' vast, multi-modal design spaces, high simulation costs, and...
Tao Zhang, Rui Ma, Shuotao Xu, Peng Cheng, Yongqiang Xiong
Reference-guided Policy Optimization for Molecular Optimization via LLM Reasoning
Large language models (LLMs) benefit substantially from supervised fine-tuning (SFT) and reinforcement learning with verifiable rewards (RLVR) in reasoning tasks. However, these recipes perform poo...
Xuan Li, Zhanke Zhou, Zongze Li, Jiangchao Yao, Yu Rong, Lu Zhang, Bo Han
Building an Ensemble LLM Semantic Tagger for UN Security Council Resolutions
This paper introduces a new methodology for using LLM-based systems for accurate and efficient semantic tagging of UN Security Council resolutions. The main goal is to leverage LLM performance vari...
Hussein Ghaly
Lost in Stories: Consistency Bugs in Long Story Generation by LLMs
What happens when a storyteller forgets its own story? Large Language Models (LLMs) can now generate narratives spanning tens of thousands of words, but they often fail to maintain consistency thro...
Junjie Li, Xinrui Guo, Yuhao Wu, Roy Ka-Wei Lee, Hongzhi Li, Yutao Xie
Measuring Perceptions of Fairness in AI Systems: The Effects of Infra-marginality
Differences in data distributions between demographic groups, known as the problem of infra-marginality, complicate how people evaluate fairness in machine learning models. We present a user study ...
Schrasing Tong, Minseok Jung, Ilaria Liccardi, Lalana Kagal
Computational Pathology in the Era of Emerging Foundation and Agentic AI -- International Expert Perspectives on Clinical Integration and Translational Readiness
Recent breakthroughs in artificial intelligence through foundation models and agents have accelerated the evolution of computational pathology. Demonstrated performance gains reported across academ...
Qian Da, Yijiang Chen, Min Ju, Zheyi Ji, Albert Zhou, Wenwen Wang, Matthew A Abikenari, Philip Ch...
VerChol -- Grammar-First Tokenization for Agglutinative Languages
Tokenization is the foundational step in all large language model (LLM) pipelines, yet the dominant approach, Byte Pair Encoding (BPE) and its variants, is inherently script-agnostic and optimized fo...
Prabhu Raja
Confidence Before Answering: A Paradigm Shift for Efficient LLM Uncertainty Estimation
Reliable deployment of large language models (LLMs) requires accurate uncertainty estimation. Existing methods are predominantly answer-first, producing confidence only after generating an answer, ...
Changcheng Li, Jiancan Wu, Hengheng Zhang, Zhengsu Chen, Guo An, Junxiang Qiu, Xiang Wang, Qi Tian
ROSE: Reordered SparseGPT for More Accurate One-Shot Large Language Models Pruning
Pruning is widely recognized as an effective method for reducing the parameters of large language models (LLMs), potentially leading to more efficient deployment and inference. One classic and prom...
Mingluo Su, Huan Wang
Shifting Adaptation from Weight Space to Memory Space: A Memory-Augmented Agent for Medical Image Segmentation
Medical image segmentation is fundamental to clinical workflows, yet models trained on a single dataset often fail to generalize across institutions, scanners, or patient populations. While vision ...
Bowen Chen, Qiaohui Gao, Shaowen Wan, Shanhui Sun, Wei Liu, Xiang Li, Tianming Liu, Lin Zhao
Evolving Deception: When Agents Evolve, Deception Wins
Self-evolving agents offer a promising path toward scalable autonomy. However, in this work, we show that in competitive environments, self-evolution can instead give rise to a serious and previous...
Zonghao Ying, Haowen Dai, Tianyuan Zhang, Yisong Xiao, Quanchen Zou, Aishan Liu, Jian Yang, Yaodo...
Challenges in Synchronous & Remote Collaboration Around Visualization
We characterize 16 challenges faced by those investigating and developing remote and synchronous collaborative experiences around visualization. Our work reflects the perspectives and prior researc...
Matthew Brehmer, Maxime Cordeil, Christophe Hurter, Takayuki Itoh, Wolfgang Büschel, Mahmood Jasi...
ReflexiCoder: Teaching Large Language Models to Self-Reflect on Generated Code and Self-Correct It via Reinforcement Learning
While Large Language Models (LLMs) have revolutionized code generation, standard "System 1" approaches, generating solutions in a single forward pass, often hit a performance ceiling when faced wit...
Juyong Jiang, Jiasi Shen, Sunghun Kim, Kang Min Yoo, Jeonghoon Kim, Sungju Kim