March 06, 2026

Match4Annotate: Propagating Sparse Video Annotations via Implicit Neural Feature Matching

Authors

Zhuorui Zhang, Roger Pallarès-López, Praneeth Namburi, Brian W. Anthony

Abstract

Acquiring per-frame video annotations remains a primary bottleneck for deploying computer vision in specialized domains such as medical imaging, where expert labeling is slow and costly. Label propagation offers a natural solution, yet existing approaches face fundamental limitations. Video trackers and segmentation models can propagate labels within a single sequence but require per-video initialization and cannot generalize across videos. Classic correspondence pipelines operate on detector-chosen keypoints and struggle in low-texture scenes, while dense feature matching and one-shot segmentation methods enable cross-video propagation but lack spatiotemporal smoothness and unified support for both point and mask annotations. We present Match4Annotate, a lightweight framework for both intra-video and inter-video propagation of point and mask annotations. Our method fits a SIREN-based implicit neural representation to DINOv3 features at test time, producing a continuous, high-resolution spatiotemporal feature field, and learns a smooth implicit deformation field between frame pairs to guide correspondence matching. We evaluate on three challenging clinical ultrasound datasets. Match4Annotate achieves state-of-the-art inter-video propagation, outperforming feature matching and one-shot segmentation baselines, while remaining competitive with specialized trackers for intra-video propagation. Our results show that lightweight, test-time-optimized feature matching pipelines have the potential to offer an efficient and accessible solution for scalable annotation workflows.
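
The abstract mentions two building blocks that a short sketch can make concrete: a sine-activated (SIREN) layer and correspondence matching on dense features. The following is a minimal illustration under stated assumptions, not the authors' implementation. The function names `siren_layer` and `propagate_point` are hypothetical, the feature maps are plain arrays standing in for DINOv3 features, and the sketch omits the test-time fitting of the implicit representation and the learned deformation field described above. It uses plain nearest-neighbor cosine matching in their place.

```python
import numpy as np


def siren_layer(x, weight, bias, omega_0=30.0):
    """One SIREN layer: a sine activation applied to a scaled affine map.

    x: (N, in_dim) coordinates, weight: (in_dim, out_dim), bias: (out_dim,).
    omega_0 is the frequency scale (30 is a commonly used default).
    """
    return np.sin(omega_0 * (x @ weight + bias))


def propagate_point(src_feats, tgt_feats, src_xy):
    """Propagate one point annotation from a source frame to a target frame.

    src_feats, tgt_feats: (H, W, C) feature maps (assumed precomputed).
    src_xy: (x, y) integer pixel location of the annotation in the source.
    Returns the (x, y) location in the target whose feature vector has the
    highest cosine similarity to the source feature.
    """
    h, w, c = tgt_feats.shape
    x, y = src_xy

    # Normalize the query feature so the dot product is a cosine similarity.
    query = src_feats[y, x]
    query = query / (np.linalg.norm(query) + 1e-8)

    # Normalize every target feature the same way.
    flat = tgt_feats.reshape(-1, c)
    flat = flat / (np.linalg.norm(flat, axis=1, keepdims=True) + 1e-8)

    # Pick the best-matching target location and convert back to (x, y).
    idx = int(np.argmax(flat @ query))
    return idx % w, idx // w
```

Calling `propagate_point(src_feats, tgt_feats, (x, y))` returns the best-matching pixel in the target frame. Independent per-point matching of this kind has no built-in smoothness across space or time, which is the limitation the abstract attributes to dense feature matching baselines.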

Metadata

arXiv ID: 2603.06471
Provider: ARXIV
Primary Category: cs.CV
Published: 2026-03-06
Fetched: 2026-03-09 06:05
