Investigating the Impact of Adversarial Evasion Attacks on Model Explanation in IoT-based DDoS Detection Systems
Keywords:
Explainable IDS, IoT, DDoS Detection, XAI Explanations, Adversarial Perturbations
Abstract
Explainable Artificial Intelligence (XAI) is increasingly used in intrusion detection systems (IDS) to enhance transparency and support human-centered cybersecurity decision-making. However, adversarial evasion attacks present a dual threat: they can mislead both the models themselves and the interpretability outputs intended to explain IDS predictions. This study investigates how vulnerable DDoS detection in IoT networks is to such adversarial manipulations using the CICIoT2023 dataset. Machine learning and deep learning models, including Random Forest (RF) and LSTM, were trained to detect DDoS traffic and then analyzed with the LIME and SHAP frameworks to generate human-understandable explanations. Adversarially perturbed examples were introduced to test the resilience of explanation integrity and prediction accuracy. The findings indicate that small perturbations can reduce attack-detection accuracy and produce misleading feature attributions. In adversarial settings, LSTM false negatives rose from 24,876 to 27,993, lowering accuracy from 87.13% to 86.86%, whereas RF misclassifications rose from 363 to 12,195, lowering accuracy from 99.96% to 98.95%. The overlap in top-5 feature rankings was 50% for LSTM and 45% for RF. In addition, SHAP cosine similarity declined to 35% for LSTM and 55% for RF, indicating substantial divergence in the explanations. These results highlight limitations in existing explainable IDS approaches and the need for adversarially robust XAI methods to ensure reliable, trustworthy, and understandable cybersecurity analytics in IoT environments.
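The two explanation-stability measures reported above, top-5 feature overlap and SHAP cosine similarity, can be illustrated with a minimal sketch. The abstract does not give the exact definitions used in the study, so the following are assumptions: features are ranked by absolute attribution, overlap is the fraction of shared features among the top k, and cosine similarity is computed on the raw attribution vectors. The function names and the toy attribution values are hypothetical and are not taken from the paper's experiments.

```python
import numpy as np


def top_k_overlap(clean_attr, adv_attr, k=5):
    """Fraction of the k highest-|attribution| features shared by both vectors.

    Assumption: ranking by absolute attribution value; the study may rank
    differently (e.g. by mean |SHAP| over many samples).
    """
    clean_attr = np.asarray(clean_attr, dtype=float)
    adv_attr = np.asarray(adv_attr, dtype=float)
    top_clean = set(np.argsort(-np.abs(clean_attr))[:k].tolist())
    top_adv = set(np.argsort(-np.abs(adv_attr))[:k].tolist())
    return len(top_clean & top_adv) / k


def cosine_similarity(clean_attr, adv_attr):
    """Cosine of the angle between two attribution vectors (0.0 if either is all zeros)."""
    a = np.asarray(clean_attr, dtype=float)
    b = np.asarray(adv_attr, dtype=float)
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0


# Hypothetical attribution vectors for one sample (8 features),
# before and after an adversarial perturbation.
clean_attr = np.array([0.9, 0.7, 0.5, 0.3, 0.2, 0.05, 0.01, 0.0])
adv_attr = np.array([0.1, 0.8, 0.02, 0.4, 0.03, 0.6, 0.5, 0.7])

print("top-5 overlap:", top_k_overlap(clean_attr, adv_attr, k=5))
print("cosine similarity:", round(cosine_similarity(clean_attr, adv_attr), 3))
```

Averaging such per-sample scores over a test set is one plausible way to arrive at aggregate percentages like those quoted in the abstract, although the aggregation actually used is not stated there.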
License
Copyright (c) 2026 50sea

This work is licensed under a Creative Commons Attribution 4.0 International License.