Papers
Research papers from arXiv and related sources
Spatial Colour Mixing Illusions as a Perception Stress Test for Vision-Language Models
Vision-language models (VLMs) achieve strong benchmark results, yet can exhibit systematic perceptual weaknesses: structured, large changes to pixel values can cause confident yet nonsensical predi...
Nicoleta-Nina Basoc, Adrian Cosma, Emilian Radoi
Partial Policy Gradients for RL in LLMs
Reinforcement learning is a framework for learning to act sequentially in an unknown environment. We propose a natural approach for modeling policy structure in policy gradients. The key idea is to...
Puneet Mathur, Branislav Kveton, Subhojyoti Mukherjee, Viet Dac Lai
Making Implicit Premises Explicit in Logical Understanding of Enthymemes
Real-world arguments in text and dialogues are normally enthymemes (i.e. some of their premises and/or claims are implicit). Natural language processing (NLP) methods for handling enthymemes can po...
Xuyao Feng, Anthony Hunter
Untangling dust emission and CIB anisotropies with the Scattering Transform Statistics
The template-fit approach is often used to separate Galactic dust emission from cosmic infrared background (CIB) anisotropies in regions of low $\text{HI}$ column density, with an underlying assumpt...
Srijita Sinha, Tuhin Ghosh, Erwan Allys, François Boulanger, Jean-Marc Delouis
Real-World Fault Detection for C-Extended Python Projects with Automated Unit Test Generation
Many popular Python libraries use C-extensions for performance-critical operations allowing users to combine the best of the two worlds: The simplicity and versatility of Python and the performance...
Lucas Berg, Lukas Krodinger, Stephan Lukasczyk, Annibale Panichella, Gordon Fraser, Wim Vanhoof, ...
Variability Study and Searching for QPOs with day-like periods in the blazar S5 0716+714 with TESS
Using an unprecedented cadence of 30 minutes provided by the Transiting Exoplanet Survey Satellite (TESS), we have examined the optical light curves (LCs) of the blazar S5 0716+714 obtained from it...
Shubham Kishore, Alok C. Gupta, Paul J. Wiita
Enhancing Neural Video Compression of Static Scenes with Positive-Incentive Noise
Static scene videos, such as surveillance feeds and videotelephony streams, constitute a dominant share of storage consumption and network traffic. However, both traditional standardized codecs and...
Cheng Yuan, Zhenyu Jia, Jiawei Shao, Xuelong Li
Experiences Build Characters: The Linguistic Origins and Functional Impact of LLM Personality
Human problem-solving is enriched by a diversity of styles and personality traits, yet the development of Large Language Models (LLMs) has largely prioritized uniform performance benchmarks that fa...
Xi Wang, Mengdie Zhuang, Jiqun Liu
Lyapunov Probes for Hallucination Detection in Large Foundation Models
We address hallucination detection in Large Language Models (LLMs) and Multimodal Large Language Models (MLLMs) by framing the problem through the lens of dynamical systems stability theory. Rather...
Bozhi Luan, Gen Li, Yalan Qin, Jifeng Guo, Yun Zhou, Faguo Wu, Hongwei Zheng, Wenjun Wu, Zhaoxin Fan
Distributed Semantic Alignment over Interference Channels: A Game-Theoretic Approach
Semantic communication acts as a key enabler for effective task execution in AI-driven systems, prioritizing the extraction of the underlying meaning before transmission. However, when devices rely...
Giuseppe Di Poce, Mattia Merluzzi, Emilio Calvanese Strinati, Paolo Di Lorenzo
Aggregative Semantics for Quantitative Bipolar Argumentation Frameworks
Formal argumentation is being used increasingly in artificial intelligence as an effective and understandable way to model potentially conflicting pieces of information, called arguments, and ident...
Yann Munro, Isabelle Bloch, Marie-Jeanne Lesot
Evaluating Austrian A-Level German Essays with Large Language Models for Automated Essay Scoring
Automated Essay Scoring (AES) has been explored for decades with the goal of supporting teachers by reducing grading workload and mitigating subjective biases. While early systems relied on handcrafte...
Jonas Kubesch, Lena Huber, Clemens Havas
ChatShopBuddy: Towards Reliable Conversational Shopping Agents via Reinforcement Learning
Conversational shopping agents represent a critical consumer-facing application of Large Language Model (LLM)-powered agents, yet how to effectively apply post-training Reinforcement Learning (RL) ...
Yiruo Cheng, Kelong Mao, Tianhao Li, Jiejun Tan, Ji-Rong Wen, Zhicheng Dou
Agentic LLM Planning via Step-Wise PDDL Simulation: An Empirical Characterisation
Task planning, the problem of sequencing actions to reach a goal from an initial state, is a core capability requirement for autonomous robotic systems. Whether large language models (LLMs) can ser...
Kai Göbel, Pierrick Lorang, Patrik Zips, Tobias Glück
RODEO: RObotic DEcentralized Organization
Robots are improving their autonomy with minimal human supervision. However, auditable actions, transparent decision processes, and new human-robot interaction models are still missing requirements...
Milan Groshev, Eduardo Castelló Ferrer
A LINDDUN-based Privacy Threat Modeling Framework for GenAI
As generative AI (GenAI) systems become increasingly prevalent across various technological stacks, the question of how such systems handle sensitive and personal data flows becomes increasingly im...
Qianying Liao, Jonah Bellemans, Laurens Sion, Xue Jiang, Dmitrii Usynin, Xuebing Zhou, Dimitri Va...
Pre-AI Baseline: Developer IDE Satisfaction and Tool Autonomy in 2022
To quantify the impact of AI on software development, the community requires a robust pre-AI baseline. This study analyzes valid satisfaction data from 1,155 software developers collected in July 2...
Nikola Balić
Detecting Semantic Alignments between Textual Specifications and Domain Models
Context: Having domain models derived from textual specifications has proven to be very useful in the early phases of software engineering. However, creating correct domain models and establishing ...
Shwetali Shimangaud, Lola Burgueño, Rijul Saini, Jörg Kienzle
Ensemble Learning with Sparse Hypercolumns
Directly inspired by findings in biological vision, high-dimensional hypercolumns are feature vectors built by concatenating multi-scale activations of convolutional neural networks for a single im...
Julia Dietlmeier, Vayangi Ganepola, Oluwabukola G. Adegboro, Mayug Maniparambil, Claudia Mazo, No...
Occlusion-Aware SORT: Observing Occlusion for Robust Multi-Object Tracking
Multi-object tracking (MOT) involves analyzing object trajectories and counting the number of objects in video sequences. However, 2D MOT faces challenges due to positional cost confusion arising f...
Chunjiang Li, Jianbo Ma, Li Shen, Yanru Chen, Liangyin Chen