Papers
Research papers from arXiv and related sources
Automatic Configuration of LLM Post-Training Pipelines
LLM post-training pipelines that combine supervised fine-tuning and reinforcement learning are difficult to configure under realistic compute budgets: the configuration space is high-dimensional an...
Channe Chwa, Xinle Wu, Yao Lu
A Concept is More Than a Word: Diversified Unlearning in Text-to-Image Diffusion Models
Concept unlearning has emerged as a promising direction for reducing the risks of harmful content generation in text-to-image diffusion models by selectively erasing undesirable concepts from a mod...
Duc Hao Pham, Van Duy Truong, Duy Khanh Dinh, Tien Cuong Nguyen, Dien Hy Ngo, Tuan Anh Bui
Implicit Grading Bias in Large Language Models: How Writing Style Affects Automated Assessment Across Math, Programming, and Essay Tasks
As large language models (LLMs) are increasingly deployed as automated graders in educational settings, concerns about fairness and bias in their evaluations have become critical. This study invest...
Rudra Jadhav, Janhavi Danve, Sonalika Shaw
ClawTrap: A MITM-Based Red-Teaming Framework for Real-World OpenClaw Security Evaluation
Autonomous web agents such as \textbf{OpenClaw} are rapidly moving into high-impact real-world workflows, but their security robustness under live network threats remains insufficiently evaluated. ...
Haochen Zhao, Shaoyang Cui
Dual-Model Prediction of Affective Engagement and Vocal Attractiveness from Speaker Expressiveness in Video Learning
This paper outlines a machine learning-enabled speaker-centric Emotion AI approach capable of predicting audience-affective engagement and vocal attractiveness in asynchronous video-based learning,...
Hung-Yue Suen, Kuo-En Hung, Fan-Hsun Tseng
Are complicated loss functions necessary for teaching LLMs to reason?
Recent advances in large language models (LLMs) highlight the importance of post training techniques for improving reasoning and mathematical ability. Group Relative Policy Optimization (GRPO) has ...
Gabriele Carrino, Andrea Sassella, Nicolo Brunello, Federico Toschi, Mark James Carman
Automatic detection of Gen-AI texts: A comparative framework of neural models
The rapid proliferation of Large Language Models has significantly increased the difficulty of distinguishing between human-written and AI generated texts, raising critical issues across academic, ...
Cristian Buttaro, Irene Amerini
Memento-Skills: Let Agents Design Agents
We introduce \emph{Memento-Skills}, a generalist, continually-learnable LLM agent system that functions as an \emph{agent-designing agent}: it autonomously constructs, adapts, and improves task-spe...
Huichi Zhou, Siyuan Guo, Anjie Liu, Zhongwei Yu, Ziqin Gong, Bowen Zhao, Zhixun Chen, Menglong Zh...
Measuring and Exploiting Confirmation Bias in LLM-Assisted Security Code Review
Security code reviews increasingly rely on systems integrating Large Language Models (LLMs), ranging from interactive assistants to autonomous agents in CI/CD pipelines. We study whether confirmati...
Dimitris Mitropoulos, Nikolaos Alexopoulos, Georgios Alexopoulos, Diomidis Spinellis
EdgeCrafter: Compact ViTs for Edge Dense Prediction via Task-Specialized Distillation
Deploying high-performance dense prediction models on resource-constrained edge devices remains challenging due to strict limits on computation and memory. In practice, lightweight systems for obje...
Longfei Liu, Yongjie Hou, Yang Li, Qirui Wang, Youyang Sha, Yongjun Yu, Yinzhi Wang, Peizhe Ru, X...
CausalRM: Causal-Theoretic Reward Modeling for RLHF from Observational User Feedbacks
Despite the success of reinforcement learning from human feedback (RLHF) in aligning language models, current reward modeling heavily relies on experimental feedback data collected from human annot...
Hao Wang, Licheng Pan, Zhichao Chen, Chunyuan Zheng, Zhixuan Chu, Xiaoxi Li, Yuan Lu, Xinggao Liu...
Green Architectural Tactics in ML-enabled Systems: An LLM-based Repository Mining Study
Context: The increasing adoption of machine learning (ML) and artificial intelligence (AI) technologies raises growing concerns about their environmental sustainability. Developing and deploying ML...
Vincenzo De Martino, Silverio Martínez-Fernández, Fabio Palomba
Analysis Of Linguistic Stereotypes in Single and Multi-Agent Generative AI Architectures
Many works in the literature show that LLM outputs exhibit discriminatory behaviour, triggering stereotype-based inferences based on the dialect in which the inputs are written. This bias has been ...
Martina Ullasci, Marco Rondina, Riccardo Coppola, Flavio Giobergia, Riccardo Bellanca, Gabriele M...
Ontology-Guided Diffusion for Zero-Shot Visual Sim2Real Transfer
Bridging the simulation-to-reality (sim2real) gap remains challenging as labelled real-world data is scarce. Existing diffusion-based approaches rely on unstructured prompts or statistical alignmen...
Mohamed Youssef, Mayar Elfares, Anna-Maria Meer, Matteo Bortoletto, Andreas Bulling
MemMA: Coordinating the Memory Cycle through Multi-Agent Reasoning and In-Situ Self-Evolution
Memory-augmented LLM agents maintain external memory banks to support long-horizon interaction, yet most existing systems treat construction, retrieval, and utilization as isolated subroutines. Thi...
Minhua Lin, Zhiwei Zhang, Hanqing Lu, Hui Liu, Xianfeng Tang, Qi He, Xiang Zhang, Suhang Wang
Holter-to-Sleep: AI-Enabled Repurposing of Single-Lead ECG for Sleep Phenotyping
Sleep disturbances are tightly linked to cardiovascular risk, yet polysomnography (PSG)-the clinical reference standard-remains resource-intensive and poorly suited for multi-night, home-based, and...
Donglin Xie, Qingshuo Zhao, Jingyu Wang, Shijia Geng, Jiarui Jin, Jun Li, Rongrong Guo, Guangkun ...
STEP: Scientific Time-Series Encoder Pretraining via Cross-Domain Distillation
Scientific time series are central to scientific AI but are typically sparse, highly heterogeneous, and limited in scale, making unified representation learning particularly challenging. Meanwhile,...
Chen Zhang, Liwei Liu, Jun Tao, Xiaoyu Yang, Xuenan Xu, Kai Chen, Bowen Zhou, Wen Wu, Chao Zhang
Cognitive Amplification vs Cognitive Delegation in Human-AI Systems: A Metric Framework
Artificial intelligence is increasingly embedded in human decision-making, where it can either enhance human reasoning or induce excessive cognitive dependence. This paper introduces a conceptual a...
Eduardo Di Santi
Multimodal Model for Computational Pathology:Representation Learning and Image Compression
Whole slide imaging (WSI) has transformed digital pathology by enabling computational analysis of gigapixel histopathology images. Recent foundation model advances have accelerated progress in comp...
Peihang Wu, Zehong Chen, Lijian Xu
Benchmarking PDF Parsers on Table Extraction with LLM-based Semantic Evaluation
Reliably extracting tables from PDFs is essential for large-scale scientific data mining and knowledge base construction, yet existing evaluation approaches rely on rule-based metrics that fail to ...
Pius Horn, Janis Keuper