Papers
Research papers from arXiv and related sources
Why AI systems don't learn and what to do about it: Lessons on autonomous learning from cognitive science
We critically examine the limitations of current AI models in achieving autonomous learning and propose a learning architecture inspired by human and animal cognition. The proposed framework integr...
Emmanuel Dupoux, Yann LeCun, Jitendra Malik
More Test-Time Compute Can Hurt: Overestimation Bias in LLM Beam Search
Wider beam search should improve LLM reasoning, but when should you stop widening? Prior work on beam width selection has focused on inference efficiency \citep{qin2025dsbd, freitag2017beam}, witho...
Gal Dalal, Assaf Hallak, Gal Chechik, Yftach Ziser
Formalizing and validating properties in Asmeta with Large Language Models (Extended Abstract)
Writing temporal logic properties is often a challenging task for users of model-based development frameworks, particularly when translating informal requirements into formal specifications. In thi...
Andrea Bombarda, Silvia Bonfanti, Angelo Gargantini, Nico Pellegrinelli
GradCFA: A Hybrid Gradient-Based Counterfactual and Feature Attribution Explanation Algorithm for Local Interpretation of Neural Networks
Explainable Artificial Intelligence (XAI) is increasingly essential as AI systems are deployed in critical fields such as healthcare and finance, offering transparency into AI-driven decisions. Two...
Jacob Sanderson, Hua Mao, Wai Lok Woo
SKILLS: Structured Knowledge Injection for LLM-Driven Telecommunications Operations
As telecommunications operators accelerate adoption of AI-enabled automation, a practical question remains unresolved: can general-purpose large language model (LLM) agents reliably execute telecom...
Ivo Brett
Brain-Inspired Graph Multi-Agent Systems for LLM Reasoning
Large Language Models (LLMs) have demonstrated remarkable capabilities across a wide range of language tasks, yet complex multi-step reasoning remains a fundamental challenge. While Large Reasoning...
Guangfu Hao, Yuming Dai, Xianzhe Qin, Shan Yu
CRASH: Cognitive Reasoning Agent for Safety Hazards in Autonomous Driving
As autonomous vehicles (AVs) grow in complexity and diversity, identifying the root causes of operational failures has become increasingly difficult. The heterogeneity of system architectures across manufacturers, ranging...
Erick Silva, Rehana Yasmin, Ali Shoker
PMAx: An Agentic Framework for AI-Driven Process Mining
Process mining provides powerful insights into organizational workflows, but extracting these insights typically requires expertise in specialized query languages and data science tools. Large Lang...
Anton Antonov, Humam Kourani, Alessandro Berti, Gyunam Park, Wil M. P. van der Aalst
Intelligent Co-Design: An Interactive LLM Framework for Interior Spatial Design via Multi-Modal Agents
In architectural interior design, miscommunication frequently arises as clients lack design knowledge, while designers struggle to explain complex spatial relationships, leading to delayed timeline...
Ren Jian Lim, Rushi Dai
The Neuroscience of Transformers
Neuroscience has long informed the development of artificial neural networks, but the success of modern architectures invites, in turn, the converse: can modern networks teach us lessons about brai...
Peter Koenig, Mario Negrello
PYTHEN: A Flexible Framework for Legal Reasoning in Python
This paper introduces PYTHEN, a novel Python-based framework for defeasible legal reasoning. PYTHEN is designed to model the inherently defeasible nature of legal argumentation, providing a flexibl...
Ha-Thanh Nguyen, Ken Satoh
CCTU: A Benchmark for Tool Use under Complex Constraints
Solving problems through tool use under explicit constraints constitutes a highly challenging yet unavoidable scenario for large language models (LLMs), requiring capabilities such as function call...
Junjie Ye, Guoqiang Zhang, Wenjie Fu, Tao Gui, Qi Zhang, Xuanjing Huang
The Impact of AI-Assisted Development on Software Security: A Study of Gemini and Developer Experience
The ongoing shortage of skilled developers, particularly in security-critical software development, has led organizations to increasingly adopt AI-powered development tools to boost productivity an...
Nadine Jost, Benjamin Berens, Manuel Karl, Stefan Albert Horstmann, Martin Johns, Alena Naiakshina
Evolutionary Transfer Learning for Dragonchess
Dragonchess, a three-dimensional chess variant introduced by Gary Gygax, presents unique strategic and computational challenges that make it an ideal environment for studying the transfer of artifi...
Jim O'Connor, Annika Hoag, Sarah Goyette, Gary B. Parker
Datasets for Verb Alternations across Languages: BLM Templates and Data Augmentation Strategies
Large language models (LLMs) have shown remarkable performance across various sentence-based linguistic phenomena, yet their ability to capture cross-sentence paradigmatic patterns, such as verb al...
Giuseppe Samo, Paola Merlo
From Documents to Spans: Code-Centric Learning for LLM-based ICD Coding
ICD coding is a critical yet challenging task in healthcare. Recently, LLM-based methods have demonstrated stronger generalization than discriminative methods in ICD coding. However, fine-tuning LLMs for...
Xu Zhang, Wenxin Ma, Chenxu Wu, Rongsheng Wang, Kun Zhang, S. Kevin Zhou
Probe-then-Plan: Environment-Aware Planning for Industrial E-commerce Search
Modern e-commerce search is evolving to resolve complex user intents. While Large Language Models (LLMs) offer strong reasoning, existing LLM-based paradigms face a fundamental blindness-latency di...
Mengxiang Chen, Zhouwei Zhai, Jin Li
Directional Embedding Smoothing for Robust Vision Language Models
The safety and reliability of vision-language models (VLMs) are a crucial part of deploying trustworthy agentic AI systems. However, VLMs remain vulnerable to jailbreaking attacks that undermine th...
Ye Wang, Jing Liu, Toshiaki Koike-Akino
SAGE: Multi-Agent Self-Evolution for LLM Reasoning
Reinforcement learning with verifiable rewards improves reasoning in large language models (LLMs), but many methods still rely on large human-labeled datasets. While self-play reduces this dependen...
Yulin Peng, Xinxin Zhu, Chenxing Wei, Nianbo Zeng, Leilei Wang, Ying Tiffany He, F. Richard Yu
Mechanistic Foundations of Goal-Directed Control
Mechanistic interpretability has transformed the analysis of transformer circuits by decomposing model behavior into competing algorithms, identifying phase transitions during training, and derivin...
Alma Lago