Paper
DKD-KAN: A Lightweight knowledge-distilled KAN intrusion detection framework, based on MLP and KAN
Authors
Mohammad Alikhani
Abstract
Cyber-security systems often operate in resource-constrained environments, such as edge devices and real-time monitoring systems, where model size and inference time are crucial. We propose a lightweight intrusion detection framework that combines the ability of the Kolmogorov-Arnold Network (KAN) to capture complex features in the data with the efficiency of the decoupled knowledge distillation (DKD) training approach. A high-capacity KAN is first trained to detect attacks performed on the test bed. This model then serves as a teacher to guide a much smaller multilayer perceptron (MLP) student model via DKD. The resulting DKD-MLP model contains only 2,522 and 1,622 parameters for the WADI and SWaT datasets, respectively, which is significantly smaller than the KAN teacher model, making it well suited for deployment on devices with limited computational resources. Despite its small size, the student model maintains high performance. Our approach demonstrates the practicality of using KAN as a knowledge-rich teacher to train much smaller student models without a considerable drop in accuracy in intrusion detection frameworks. We validated our approach on two publicly available datasets and report F1-score improvements of 4.18% on WADI and 3.07% on SWaT for the DKD-MLP model compared to the bare student model. The implementation of this paper is available on our GitHub repository.
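The decoupled knowledge distillation objective named in the abstract can be sketched in pure Python. This is a minimal illustration of DKD as introduced by Zhao et al. (2022), not the paper's implementation: the temperature T and the alpha/beta weights below are illustrative defaults, and the function names are our own. DKD splits classic KD into a target-class term (TCKD) over the binary target/non-target split and a non-target-class term (NCKD) over the remaining classes, weighted independently.

```python
import math

def softmax(logits, T=1.0):
    # Temperature-scaled softmax over a list of logits
    exps = [math.exp(z / T) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]

def kl(p, q):
    # KL(p || q); assumes strictly positive probabilities
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

def dkd_loss(t_logits, s_logits, target, T=4.0, alpha=1.0, beta=8.0):
    """Decoupled knowledge distillation loss for one sample (sketch).

    t_logits / s_logits: teacher and student logits (lists of floats)
    target: index of the ground-truth class
    """
    pt = softmax(t_logits, T)
    ps = softmax(s_logits, T)
    # TCKD: KL between binary (target vs. non-target) distributions
    tckd = kl([pt[target], 1.0 - pt[target]],
              [ps[target], 1.0 - ps[target]])
    # NCKD: KL between softmaxes over the non-target logits only
    nt = softmax([z for i, z in enumerate(t_logits) if i != target], T)
    ns = softmax([z for i, z in enumerate(s_logits) if i != target], T)
    nckd = kl(nt, ns)
    # T^2 scaling keeps gradient magnitudes comparable across temperatures
    return (alpha * tckd + beta * nckd) * T * T
```

When teacher and student logits coincide the loss is zero, and the beta weight lets NCKD (the "dark knowledge" among non-target classes) be emphasized independently of the target-class term, which is the key difference from vanilla KD.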
Metadata
arXiv: 2603.03486v1
Published: 2026-03-03
Categories: cs.CR (primary), eess.SP, eess.SY
Links: https://arxiv.org/abs/2603.03486v1 (abstract), https://arxiv.org/pdf/2603.03486v1 (pdf)