Research

Paper

TESTING March 09, 2026

Alfa: Attentive Low-Rank Filter Adaptation for Structure-Aware Cross-Domain Personalized Gaze Estimation

Authors

He-Yen Hsieh, Wei-Te Mark Ting, H. T. Kung

Abstract

Pre-trained gaze models learn to identify useful patterns commonly found across users, but subtle user-specific variations (i.e., eyelid shape or facial structure) can degrade model performance. Test-time personalization (TTP) adapts pre-trained models to these user-specific domain shifts using only a few unlabeled samples. Efficient fine-tuning is critical in performing this domain adaptation: data and computation resources can be limited-especially for on-device customization. While popular parameter-efficient fine-tuning (PEFT) methods address adaptation costs by updating only a small set of weights, they may not be taking full advantage of structures encoded in pre-trained filters. To more effectively leverage existing structures learned during pre-training, we reframe personalization as a process to reweight existing features rather than learning entirely new ones. We present Attentive Low-Rank Filter Adaptation (Alfa) to adapt gaze models by reweighting semantic patterns in pre-trained filters. With Alfa, singular value decomposition (SVD) extracts dominant spatial components that capture eye and facial characteristics across users. Via an attention mechanism, we need only a few unlabeled samples to adjust and reweight pre-trained structures, selectively amplifying those relevant to a target user. Alfa achieves the lowest average gaze errors across four cross-dataset gaze benchmarks, outperforming existing TTP methods and low-rank adaptation (LoRA)-based variants. We also show that Alfa's attentive low-rank methods can be applied to applications beyond vision, such as diffusion-based language models.

Metadata

arXiv ID: 2603.08445

Provider: ARXIV

Primary Category: cs.CV

Published: 2026-03-09

Fetched: 2026-03-10 05:43

Related papers

Cosmic Shear in Effective Field Theory at Two-Loop Order: Revisiting $S_8$ in Dark Energy Survey Data

Shi-Fan Chen, Joseph DeRose, Mikhail M. Ivanov, Oliver H. E. Philcox • 2026-03-30

Stop Probing, Start Coding: Why Linear Probes and Sparse Autoencoders Fail at Compositional Generalisation

Vitória Barin Pacela, Shruti Joshi, Isabela Camacho, Simon Lacoste-Julien, Da... • 2026-03-30

SNID-SAGE: A Modern Framework for Interactive Supernova Classification and Spectral Analysis

Fiorenzo Stoppa, Stephen J. Smartt • 2026-03-30

Acoustic-to-articulatory Inversion of the Complete Vocal Tract from RT-MRI with Various Audio Embeddings and Dataset Sizes

Sofiane Azzouz, Pierre-André Vuissoz, Yves Laprie • 2026-03-30

Rotating black hole shadows in metric-affine bumblebee gravity

Jose R. Nascimento, Ana R. M. Oliveira, Albert Yu. Petrov, Paulo J. Porfírio,... • 2026-03-30

Raw Data (Debug)

{
  "raw_xml": "<entry>\n    <id>http://arxiv.org/abs/2603.08445v1</id>\n    <title>Alfa: Attentive Low-Rank Filter Adaptation for Structure-Aware Cross-Domain Personalized Gaze Estimation</title>\n    <updated>2026-03-09T14:43:04Z</updated>\n    <link href='https://arxiv.org/abs/2603.08445v1' rel='alternate' type='text/html'/>\n    <link href='https://arxiv.org/pdf/2603.08445v1' rel='related' title='pdf' type='application/pdf'/>\n    <summary>Pre-trained gaze models learn to identify useful patterns commonly found across users, but subtle user-specific variations (i.e., eyelid shape or facial structure) can degrade model performance. Test-time personalization (TTP) adapts pre-trained models to these user-specific domain shifts using only a few unlabeled samples. Efficient fine-tuning is critical in performing this domain adaptation: data and computation resources can be limited-especially for on-device customization. While popular parameter-efficient fine-tuning (PEFT) methods address adaptation costs by updating only a small set of weights, they may not be taking full advantage of structures encoded in pre-trained filters. To more effectively leverage existing structures learned during pre-training, we reframe personalization as a process to reweight existing features rather than learning entirely new ones.\n  We present Attentive Low-Rank Filter Adaptation (Alfa) to adapt gaze models by reweighting semantic patterns in pre-trained filters. With Alfa, singular value decomposition (SVD) extracts dominant spatial components that capture eye and facial characteristics across users. Via an attention mechanism, we need only a few unlabeled samples to adjust and reweight pre-trained structures, selectively amplifying those relevant to a target user. Alfa achieves the lowest average gaze errors across four cross-dataset gaze benchmarks, outperforming existing TTP methods and low-rank adaptation (LoRA)-based variants. We also show that Alfa's attentive low-rank methods can be applied to applications beyond vision, such as diffusion-based language models.</summary>\n    <category scheme='http://arxiv.org/schemas/atom' term='cs.CV'/>\n    <published>2026-03-09T14:43:04Z</published>\n    <arxiv:comment>21 pages, 16 figures, AAAI2026</arxiv:comment>\n    <arxiv:primary_category term='cs.CV'/>\n    <author>\n      <name>He-Yen Hsieh</name>\n    </author>\n    <author>\n      <name>Wei-Te Mark Ting</name>\n    </author>\n    <author>\n      <name>H. T. Kung</name>\n    </author>\n  </entry>"
}