No Labels, No Look-Ahead: Unsupervised Online Video Stabilization with Classical Priors

Authors

Tao Liu, Gang Wan, Kan Ren, Shibo Wen

Abstract

We propose a new unsupervised framework for online video stabilization. Unlike deep-learning methods that require paired stable and unstable training data, our approach instantiates the classical three-stage stabilization pipeline and incorporates a multithreaded buffering mechanism. This design addresses three longstanding challenges of end-to-end learning: limited data, poor controllability, and inefficiency on resource-constrained hardware. Existing benchmarks focus mainly on handheld, forward-facing, visible-light video, which restricts the applicability of stabilization to domains such as nighttime UAV remote sensing. To fill this gap, we introduce a new multimodal UAV aerial video dataset (UAV-Test). Experiments show that our method consistently outperforms state-of-the-art online stabilizers in both quantitative metrics and visual quality, while achieving performance comparable to offline methods.
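
The abstract names the classical pipeline's three stages but gives no implementation details. As a rough sketch only, the Python snippet below implements a generic version of such a pipeline with OpenCV: sparse feature tracking for motion estimation, a causal buffer with a one-sided moving average for smoothing, and an affine warp for correction. The buffer size, smoother, and tracking setup are all illustrative assumptions, not the authors' actual design; the causal buffer is what keeps the method online, since smoothing uses only past frames and needs no look-ahead.

import cv2
import numpy as np
from collections import deque

# Hypothetical sketch of a three-stage online stabilizer. The buffer size
# and smoothing scheme are assumptions; the paper does not specify them.
BUFFER_SIZE = 30  # causal window of past frames only (no look-ahead)

def estimate_motion(prev_gray, gray):
    # Stage 1 (assumed): frame-to-frame motion from sparse feature tracks.
    pts = cv2.goodFeaturesToTrack(prev_gray, maxCorners=200,
                                  qualityLevel=0.01, minDistance=30)
    if pts is None:
        return np.zeros(3)
    nxt, status, _ = cv2.calcOpticalFlowPyrLK(prev_gray, gray, pts, None)
    good = status.ravel() == 1
    if good.sum() < 4:
        return np.zeros(3)
    m, _ = cv2.estimateAffinePartial2D(pts[good], nxt[good])
    if m is None:
        return np.zeros(3)
    # Translation plus rotation angle of the estimated similarity transform.
    return np.array([m[0, 2], m[1, 2], np.arctan2(m[1, 0], m[0, 0])])

def stabilize(video_path):
    cap = cv2.VideoCapture(video_path)
    ok, prev = cap.read()
    if not ok:
        return
    prev_gray = cv2.cvtColor(prev, cv2.COLOR_BGR2GRAY)
    h, w = prev_gray.shape
    trajectory = np.zeros(3)        # cumulative camera path: (x, y, theta)
    buf = deque(maxlen=BUFFER_SIZE)
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        trajectory = trajectory + estimate_motion(prev_gray, gray)
        buf.append(trajectory.copy())
        # Stage 2 (assumed): one-sided moving average over past frames only,
        # so the stabilizer stays online with zero look-ahead latency.
        dx, dy, da = np.mean(buf, axis=0) - trajectory
        # Stage 3 (assumed): warp the frame toward the smoothed path.
        m = np.array([[np.cos(da), -np.sin(da), dx],
                      [np.sin(da),  np.cos(da), dy]], dtype=np.float32)
        cv2.imshow("stabilized", cv2.warpAffine(frame, m, (w, h)))
        if cv2.waitKey(1) == 27:    # Esc to quit
            break
        prev_gray = gray
    cap.release()
    cv2.destroyAllWindows()

A one-sided average sacrifices some smoothness relative to offline, full-trajectory filters, which is the usual trade-off for zero-latency online stabilization.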

Metadata

arXiv ID: 2602.23141
Provider: ARXIV
Primary Category: cs.CV
Comment: CVPR2026
Published: 2026-02-26
Links: https://arxiv.org/abs/2602.23141v1 (abstract), https://arxiv.org/pdf/2602.23141v1 (PDF)
Fetched: 2026-02-27 04:35
