Dealing with Dataset Class Imbalance for Multi-Class Network Intrusion Detection Systems
Keywords:
Intrusion Detection System (IDS), Class Imbalance, SMOTE-ENN, Machine Learning, Network Security, CICIoT2023 Dataset, Multi-Class Classification, Internet of Things (IoT)
Abstract
As the number of IoT devices grows, Intrusion Detection Systems (IDS) have become increasingly important for protecting network environments. However, class imbalance substantially degrades IDS performance, especially in multi-class classification, where some attack classes are prevalent and others are rare. Our research explores the effects of class imbalance by experimenting with the CICIoT2023 dataset in three distinct classification settings: 2-class, 8-class, and 33-class. Our experiments reveal that, although binary classification performance is high (accuracy ≈ 0.99), performance drops in the multi-class settings: accuracy ≈ 0.85 and macro-F1 of 0.59 in the 8-class scenario, falling further to accuracy ≈ 0.76 and macro-F1 of 0.47 in the 33-class setting because of the severe class imbalance. To address this problem, we examine SMOTE, class weighting, and a combined SMOTE-ENN method. Our results show that the SMOTE-ENN approach effectively balances the classes (up to 0.97 accuracy and macro-F1 ≈ 0.96 on the test dataset after training with balanced data) and enhances detection performance for minority classes. By contrast, the baseline models have lower macro-F1 and lower recall for minority classes. These results show that hybrid resampling improves not only classification accuracy but also the ability to detect rare attacks, making it an effective strategy for robust multi-class IoT intrusion detection systems.
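The gap between accuracy and macro-F1 reported above can be illustrated with a small worked example. The sketch below is not the paper's evaluation code. It uses a hypothetical toy label set (95 "common" flows, 5 "rare" flows) and a hypothetical classifier that always predicts the majority class, and it computes both metrics from their standard definitions using only the Python standard library.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores, so every class counts equally."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for c in labels:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class is never hit.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical imbalanced toy data: labels and counts are illustrative only.
y_true = ["common"] * 95 + ["rare"] * 5
# A degenerate classifier that ignores the minority class entirely.
y_pred = ["common"] * 100

acc = accuracy(y_true, y_pred)
mf1 = macro_f1(y_true, y_pred)
print(f"accuracy = {acc:.3f}, macro-F1 = {mf1:.3f}")
```

In this toy case the classifier misses every rare sample yet still scores 0.95 accuracy, while macro-F1 falls to about 0.49 because the rare class contributes an F1 of zero. This is the same pattern, in miniature, as the accuracy and macro-F1 gap described for the 8-class and 33-class settings.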
Copyright (c) 2026 50sea

This work is licensed under a Creative Commons Attribution 4.0 International License.


















