Papers
Research papers from arXiv and related sources
ResNet-50 with Class Reweighting and Anatomy-Guided Temporal Decoding for Gastrointestinal Video Analysis
We developed a multi-label gastrointestinal video analysis pipeline based on a ResNet-50 frame classifier followed by anatomy-guided temporal event decoding. The system predicts 17 labels, includin...
Romil Imtiaz, Dimitris K. Iakovidis
Exploring parameter-efficient fine-tuning (PEFT) of billion-parameter vision models with QLoRA and DoRA: insights into generalization for limited-data image classification under a 98:1 test-to-train regime
Automated behavior classification is essential for precision livestock farming but faces challenges of high computational costs and limited labeled data. This study systematically compared three ap...
Haiyu Yang, Sumit Sharma, Enhong Liu, Miel Hostens
Facts as First Class Objects: Knowledge Objects for Persistent LLM Memory
Large language models increasingly serve as persistent knowledge workers, with in-context memory - facts stored in the prompt - as the default strategy. We benchmark in-context memory against Knowl...
Oliver Zahn, Simran Chana
CoVerRL: Breaking the Consensus Trap in Label-Free Reasoning via Generator-Verifier Co-Evolution
Label-free reinforcement learning enables large language models to improve reasoning capabilities without ground-truth supervision, typically by treating majority-voted answers as pseudo-labels. Ho...
Teng Pan, Yuchen Yan, Zixuan Wang, Ruiqing Zhang, Gaiyang Han, Wanqi Zhang, Weiming Lu, Jun Xiao,...
Large Language Models in Teaching and Learning: Reflections on Implementing an AI Chatbot in Higher Education
The landscape of education is changing rapidly, shaped by emerging pedagogical approaches, technological innovations such as artificial intelligence (AI), and evolving societal expectations, all of...
Fiammetta Caccavale, Carina L. Gargalo, Julian Kager, Magdalena Skowyra, Steen Larsen, Krist V. G...
Attention Sinks Induce Gradient Sinks
Attention sinks and massive activations are recurring and closely related phenomena in Transformer models. Existing studies have largely focused on the forward pass, making it unclear whether their...
Yihong Chen, Quanming Yao
Facial Movement Dynamics Reveal Workload During Complex Multitasking
Real-time cognitive workload monitoring is crucial in safety-critical environments, yet established measures are intrusive, expensive, or lack temporal resolution. We tested whether facial movement...
Carter Sale, Melissa N. Stolar, Gaurav Patil, Michael J. Gostelow, Julia Wallier, Margaret C. Mac...
Multi-Source Human-in-the-Loop Digital Twin Testbed for Connected and Autonomous Vehicles in Mixed Traffic Flow
In the emerging mixed traffic environments, Connected and Autonomous Vehicles (CAVs) have to interact with surrounding human-driven vehicles (HDVs). This paper introduces MSH-MCCT (Multi-Source Hum...
Jianghong Dong, Jiawei Wang, Chunying Yang, Mengchi Cai, Chaoyi Chen, Qing Xu, Jianqiang Wang, Ke...
Concept-to-Pixel: Prompt-Free Universal Medical Image Segmentation
Universal medical image segmentation seeks to use a single foundational model to handle diverse tasks across multiple imaging modalities. However, existing approaches often rely heavily on manual v...
Haoyun Chen, Fenghe Tang, Wenxin Ma, Shaohua Kevin Zhou
Fast stabilizer state preparation via AI-optimized graph decimation
We propose a general method for preparing stabilizer states with reduced two-qubit gate count and depth compared to the state of the art. The method starts from a graph state representation of the ...
Michael Doherty, Matteo Puviani, Jasmine Brewer, Gabriel Matos, David Amaro, Ben Criger, David T....
Embedding World Knowledge into Tabular Models: Towards Best Practices for Embedding Pipeline Design
Embeddings are a powerful way to enrich data-driven machine learning models with the world knowledge of large language models (LLMs). Yet, there is limited evidence on how to design effective LLM-b...
Oksana Kolomenko, Ricardo Knauer, Erik Rodner
LR-Robot: A Unified Supervised Intelligent Framework for Real-Time Systematic Literature Reviews with Large Language Models
Recent advances in artificial intelligence (AI) and natural language processing (NLP) have enabled tools to support systematic literature reviews (SLRs), yet existing frameworks often produce outpu...
Wei Wei, Jin Zheng, Zining Wang
DiffVP: Differential Visual Semantic Prompting for LLM-Based CT Report Generation
While large language models (LLMs) have advanced CT report generation, existing methods typically encode 3D volumes holistically, failing to distinguish informative cues from redundant anatomical b...
Yuhe Tian, Kun Zhang, Haoran Ma, Rui Yan, Yingtai Li, Rongsheng Wang, Shaohua Kevin Zhou
Machine Learning for Network Attacks Classification and Statistical Evaluation of Adversarial Learning Methodologies for Synthetic Data Generation
Supervised detection of network attacks has always been a critical part of network intrusion detection systems (NIDS). Nowadays, in a pivotal time for artificial intelligence (AI), with even more s...
Iakovos-Christos Zarkadis, Christos Douligeris
Eye image segmentation using visual and concept prompts with Segment Anything Model 3 (SAM3)
Previous work has reported that vision foundation models show promising zero-shot performance in eye image segmentation. Here we examine whether the latest iteration of the Segment Anything Model, ...
Diederick C. Niehorster, Marcus Nyström
Electron-Hole Scattering Dichotomy and Anisotropic Warping in Quasi-Two-Dimensional Fermi Surfaces of UTe2
We present a combined experimental and theoretical study of the detailed Fermi-surface (FS) geometry of UTe2, a heavy-fermion superconductor that has recently attracted considerable attention as a ...
Motoi Kimata, Jun Ishizuka, Freya Husstedt, Yusei Shimizu, Ai Nakamura, Dexin Li, Yoshiya Homma, ...
Parameter-Efficient Modality-Balanced Symmetric Fusion for Multimodal Remote Sensing Semantic Segmentation
Multimodal remote sensing semantic segmentation enhances scene interpretation by exploiting complementary physical cues from heterogeneous data. Although pretrained Vision Foundation Models (VFMs) ...
Haocheng Li, Juepeng Zheng, Shuangxi Miao, Ruibo Lu, Guosheng Cai, Haohuan Fu, Jianxi Huang
MALLES: A Multi-agent LLMs-based Economic Sandbox with Consumer Preference Alignment
In the real economy, modern decision-making is fundamentally challenged by high-dimensional, multimodal environments, which are further complicated by agent heterogeneity and combinatorial data spa...
Yusen Wu, Yiran Liu, Xiaotie Deng
Can Blindfolded LLMs Still Trade? An Anonymization-First Framework for Portfolio Optimization
For LLM trading agents to be genuinely trustworthy, they must demonstrate understanding of market dynamics rather than exploitation of memorized ticker associations. Building responsible multi-agen...
Joohyoung Jeon, Hongchul Lee
Stochastic set-valued optimization and its application to robust learning
In this paper, we develop a stochastic set-valued optimization (SVO) framework tailored for robust machine learning. In the SVO setting, each decision variable is mapped to a set of objective value...
Tommaso Giovannelli, Jingfu Tan, Luis Nunes Vicente