Unsupervised Baseline Clustering and Incremental Adaptation for IoT Device Traffic Profiling

Authors

Sean M. Alderman, John D. Hastings

Abstract

The growth and heterogeneity of IoT devices create security challenges where static identification models can degrade as traffic evolves. This paper presents a two-stage, flow-feature-based pipeline for unsupervised IoT device traffic profiling and incremental model updating, evaluated on selected long-duration captures from the Deakin IoT dataset. For baseline profiling, density-based clustering (DBSCAN) isolates a substantial outlier portion of the data and produces the strongest alignment with ground-truth device labels among tested classical methods (NMI 0.78), outperforming centroid-based clustering on cluster purity. For incremental adaptation, we evaluate stream-oriented clustering approaches and find that BIRCH supports efficient updates (0.13 seconds per update) and forms comparatively coherent clusters for a held-out novel device (purity 0.87), but with limited capture of novel traffic (share 0.72) and a measurable trade-off in known-device accuracy after adaptation (0.71). Overall, the results highlight a practical trade-off between high-purity static profiling and the flexibility of incremental clustering for evolving IoT environments.
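The abstract reports cluster purity and normalized mutual information (NMI) against ground-truth device labels. The sketch below shows one common way to compute both metrics from two label lists. It is an illustration only, not the authors' evaluation code: the function names, the toy device labels, and the arithmetic-mean normalization for NMI are assumptions, and the paper may treat DBSCAN outliers or normalization differently.

```python
import math
from collections import Counter


def purity(true_labels, cluster_ids):
    """Fraction of points that belong to the majority class of their cluster."""
    by_cluster = {}
    for t, c in zip(true_labels, cluster_ids):
        by_cluster.setdefault(c, Counter())[t] += 1
    majority = sum(max(counts.values()) for counts in by_cluster.values())
    return majority / len(true_labels)


def entropy(labels):
    """Shannon entropy (natural log) of a label list."""
    n = len(labels)
    return -sum((k / n) * math.log(k / n) for k in Counter(labels).values())


def nmi(true_labels, cluster_ids):
    """Mutual information divided by the mean of the two entropies (assumed normalization)."""
    n = len(true_labels)
    joint = Counter(zip(true_labels, cluster_ids))
    n_true = Counter(true_labels)
    n_clus = Counter(cluster_ids)
    mi = sum(
        (k / n) * math.log(k * n / (n_true[t] * n_clus[c]))
        for (t, c), k in joint.items()
    )
    denom = (entropy(true_labels) + entropy(cluster_ids)) / 2
    return mi / denom if denom > 0 else 1.0


# Hypothetical toy data: device labels per flow and the cluster each flow landed in.
devices = ["cam", "cam", "cam", "plug", "plug", "hub"]
clusters = [0, 0, 1, 1, 1, 2]

toy_purity = purity(devices, clusters)
toy_nmi = nmi(devices, clusters)
```

On the toy data, five of the six flows sit in a cluster whose majority device matches their own, so purity is 5/6. Purity alone rewards over-splitting (one cluster per point scores 1.0), which is one reason to report it alongside NMI.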

Metadata

arXiv ID: 2602.24047
Provider: ARXIV
Primary Category: cs.NI
Published: 2026-02-27
Fetched: 2026-03-02 06:04
Categories: cs.NI, cs.CR, cs.LG
Comments: 6 pages, 2 figures, 4 tables
Abstract page: https://arxiv.org/abs/2602.24047v1
PDF: https://arxiv.org/pdf/2602.24047v1