Papers
Research papers from arXiv and related sources
X-GS: An Extensible Open Framework Unifying 3DGS Architectures with Downstream Multimodal Models
3D Gaussian Splatting (3DGS) has emerged as a powerful technique for novel view synthesis, subsequently extending into numerous spatial AI applications. However, most existing 3DGS methods are isol...
Yueen Ma, Irwin King
Context Engineering: From Prompts to Corporate Multi-Agent Architecture
As artificial intelligence (AI) systems evolve from stateless chatbots to autonomous multi-step agents, prompt engineering (PE), the discipline of crafting individual queries, proves necessary but ...
Vera V. Vishnyakova
A saccade-inspired approach to image classification using vision transformer attention maps
Human vision achieves remarkable perceptual performance while operating under strict metabolic constraints. A key ingredient is the selective attention mechanism, driven by rapid saccadic eye movem...
Matthis Dallain, Laurent Rodriguez, Laurent Udo Perrinet, Benoît Miramond
A Variational Latent Equilibrium for Learning in Cortex
Brains remain unrivaled in their ability to recognize and generate complex spatiotemporal patterns. While AI is able to reproduce some of these capabilities, deep learning algorithms remain largely...
Simon Brandt, Paul Haider, Walter Senn, Federico Benitez, Mihai A. Petrovici
Preparing Students for AI-Driven Agile Development: A Project-Based AI Engineering Curriculum
Generative AI and agentic tools are reshaping agile software development, yet many engineering curricula still teach agile methods and AI competencies separately and largely lecture-based. This pap...
Andreas Rausch, Stefan Wittek, Tobias Geger, David Inkermann
Routing without Forgetting
Continual learning in transformers is commonly addressed through parameter-efficient adaptation: prompts, adapters, or LoRA modules are specialized per task while the backbone remains frozen. Altho...
Alessio Masano, Giovanni Bellitto, Dipam Goswami, Joost Van de Weijer, Concetto Spampinato
ALARM: Audio-Language Alignment for Reasoning Models
Large audio language models (ALMs) extend LLMs with auditory understanding. A common approach freezes the LLM and trains only an adapter on self-generated targets. However, this fails for reasoning...
Petr Grinberg, Hassan Shahmohammadi
Compartmentalization-Aware Automated Program Repair
Software compartmentalization breaks down an application into compartments isolated from each other: an attacker taking over a compartment will be confined to it, limiting the damage they can cause...
Jia Hu, Youcheng Sun, Pierre Olivier
Dynamic Multimodal Expression Generation for LLM-Driven Pedagogical Agents: From User Experience Perspective
In virtual reality (VR) educational scenarios, pedagogical agents (PAs) enhance immersive learning through realistic appearances and interactive behaviors. However, most existing PAs rely on static...
Ninghao Wan, Jiarun Song, Fuzheng Yang
Enhancing Debunking Effectiveness through LLM-based Personality Adaptation
This study proposes a novel methodology for generating personalized fake news debunking messages by prompting Large Language Models (LLMs) with persona-based inputs aligned to the Big Five personal...
Pietro Dell'Oglio, Alessandro Bondielli, Francesco Marcelloni, Lucia C. Passaro
Efficiently Aligning Draft Models via Parameter- and Data-Efficient Adaptation
Speculative decoding accelerates LLM inference but suffers from performance degradation when target models are fine-tuned for specific domains. A naive solution is to retrain draft models for every...
Luxi Lin, Zhihang Lin, Zhanpeng Zeng, Yuhao Chen, Qingyu Zhang, Jixiang Luo, Xuelong Li, Rongrong Ji
You Didn't Have to Say It like That: Subliminal Learning from Faithful Paraphrases
When language models are trained on synthetic data, they (student model) can covertly acquire behavioral traits from the data-generating model (teacher model). Subliminal learning refers to the tra...
Isaia Gisler, Zhonghao He, Tianyi Qiu
Beyond Short-Horizon: VQ-Memory for Robust Long-Horizon Manipulation in Non-Markovian Simulation Benchmarks
The high cost of collecting real-robot data has made robotic simulation a scalable platform for both evaluation and data generation. Yet most existing benchmarks concentrate on simple manipulation ...
Wang Honghui, Jing Zhi, Ao Jicong, Song Shiji, Li Xuelong, Huang Gao, Bai Chenjia
EmbC-Test: How to Speed Up Embedded Software Testing Using LLMs and RAG
Manual development of automatic tests for embedded C software is a strenuous and time-consuming task that does not scale well. With the accelerating pace of software release cycles, verification in...
Maximilian Harnot, Sebastian Komarnicki, Michal Polok, Timo Oksanen
Evolving Prompt Adaptation for Vision-Language Models
The adaptation of large-scale vision-language models (VLMs) to downstream tasks with limited labeled data remains a significant challenge. While parameter-efficient prompt learning methods offer a ...
Enming Zhang, Jiayang Li, Yanru Wu, Zhenyu Liu, Yang Li
Vibe-Creation: The Epistemology of Human-AI Emergent Cognition
The encounter between human reasoning and generative artificial intelligence (GenAI) cannot be adequately described by inherited metaphors of tool use, augmentation, or collaborative partnership. T...
Ilya Levin
GenePlan: Evolving Better Generalized PDDL Plans using Large Language Models
We present GenePlan (GENeralized Evolutionary Planner), a novel framework that leverages large language model (LLM) assisted evolutionary algorithms to generate domain-dependent generalized planner...
Andrew Murray, Danial Dervovic, Alberto Pozanco, Michael Cashmore
The Patrologia Graeca Corpus: OCR, Annotation, and Open Release of Noisy Nineteenth-Century Polytonic Greek Editions
We present the Patrologia Graeca Corpus, the first large-scale open OCR and linguistic resource for nineteenth-century editions of Ancient Greek. The collection covers the remaining undigitized volu...
Chahan Vidal-Gorène, Bastien Kindt
TopoOR: A Unified Topological Scene Representation for the Operating Room
Surgical Scene Graphs abstract the complexity of surgical operating rooms (OR) into a structure of entities and their relations, but existing paradigms suffer from strictly dyadic structural limita...
Tony Danjun Wang, Ka Young Kim, Tolga Birdal, Nassir Navab, Lennart Bastian
An Empirical Study and Theoretical Explanation on Task-Level Model-Merging Collapse
Model merging unifies independently fine-tuned LLMs from the same base, enabling reuse and integration of parallel development efforts without retraining. However, in practice we observe that mergi...
Yuan Cao, Dezhi Ran, Yuzhe Guo, Mengzhou Wu, Simin Chen, Linyi Li, Wei Yang, Tao Xie