Ransomware Resilience: A Real-time Detection Framework using Kafka and Machine Learning

Saad Khan; Rana Marwat Hussain; Talha Saleem Baig; Mian Muhammad Qasim

Authors

Saad Khan School of Systems and Technology, University of the Management and Technology, Lahore, Pakistan
Rana Marwat Hussain School of Systems and Technology, University of the Management and Technology, Lahore, Pakistan
Talha Saleem Baig School of Systems and Technology, University of the Management and Technology, Lahore, Pakistan
Mian Muhammad Qasim School of Systems and Technology, University of the Management and Technology, Lahore, Pakistan

Keywords:

Ransomware, Machine Learning, Real Time, Kafka

Abstract

Ransomware has emerged as a prominent cyber threat in recent years, targeting numerous businesses. In response to the escalating frequency of attacks, organizations are increasingly seeking effective tools and strategies to mitigate the impact of ransomware incidents. This research addresses the pressing need for real-time detection of ransomware, offering a solution that leverages cutting-edge technologies. The surge in ransomware attacks poses a significant challenge to the cybersecurity landscape, compelling organizations to adopt proactive measures. Recognizing the urgency of the situation, this study motivates the exploration of an innovative approach to ransomware detection. By utilizing advanced tools such as Apache Kafka and Spark, we aim to enhance detection capabilities and contribute to the resilience of businesses against cyber threats. Our methodology employs the Kafka tool and Spark for real-time identification of ransomware exploits. The research utilizes the CIC-MalMem-2022 dataset to develop and validate the proposed model. The integration of Apache Kafka with traditional machine learning techniques is explored to improve the accuracy of cyber threat detection, offering a comprehensive and efficient solution. The implemented model exhibits a commendable detection rate of 95.2%, demonstrating its effectiveness in identifying ransomware attacks in real-time. The combination of Apache Kafka's streaming capabilities and established machine learning methodologies proves to be a potent defense against the evolving landscape of cyber threats. In conclusion, our research provides a robust and practical approach to combating ransomware threats through real-time detection. By leveraging the synergy of Kafka and machine learning, organizations can fortify their cybersecurity defenses and respond proactively to potential ransomware exploits. This study contributes valuable insights and tools to the ongoing efforts in enhancing cyber resilience.

Author Biography

Rana Marwat Hussain, School of Systems and Technology, University of the Management and Technology, Lahore, Pakistan

I am an energetic and enthusiastic person who enjoys a challenge and achieving personal goals. My present career aim is to work within IT because I enjoy working with computers, I enjoy the environment and I find the work interesting and satisfying. The opportunity to learn new skills and work with new technologies is particularly attractive to me. I have a clear, logical mind with a practical approach to problem solving and a drive to see things through to completion. I enjoy working on my own initiative or in a team. In short, I am reliable, trustworthy, hardworking, and eager to learn and have a genuine interest in PR. I am ambitious and looking forward to getting exposure in leading companies or institutes, to experience the professional working environment and enhance technical skills.

References

“Internet Organised Crime Threat Assessment (IOCTA) 2017 | Europol.” Accessed: Feb. 07, 2024. [Online]. Available: https://www.europol.europa.eu/publications-events/main-reports/internet-organised-crime-threat-assessment-iocta-2017

S. Bukhari, A. Yousaf, S. Niazi, and M. R. Anjum, “A Novel Technique for the Generation and Application of Substitution Boxes (s-box) for the Image Encryption,” Nucl., vol. 55, no. 4, pp. 219–225, 2018, Accessed: Feb. 07, 2024. [Online]. Available: http://thenucleuspak.org.pk/index.php/Nucleus/article/view/229

K. Rieck, T. Holz, C. Willems, P. Düssel, and P. Laskov, “Learning and classification of malware behavior,” Lect. Notes Comput. Sci. (including Subser. Lect. Notes Artif. Intell. Lect. Notes Bioinformatics), vol. 5137 LNCS, pp. 108–125, 2008, doi: 10.1007/978-3-540-70542-0_6/COVER.

E. Konstantinou, S. Wolthusen, and R. Holloway, “Metamorphic Virus: Analysis and Detection,” 2008, Accessed: Feb. 07, 2024. [Online]. Available: http://www.rhul.ac.uk/mathematics/techreports

Y. Özkan, “Malware Detection in Forensic Memory Dumps: The Use of Deep Meta-Learning Models,” Acta Infologica, vol. 7, no. 1, pp. 165–172, Jun. 2023, doi: 10.26650/ACIN.1282824.

V. Minkevics and J. Kampars, “Methods, models and techniques to improve information system’s security in large organizations,” ICEIS 2020 - Proc. 22nd Int. Conf. Enterp. Inf. Syst., vol. 1, pp. 632–639, 2020, doi: 10.5220/0009572406320639.

“An Efficient Approach for Advanced Malware Analysis Using Memory Forensic Technique | IEEE Conference Publication | IEEE Xplore.” Accessed: Feb. 12, 2024. [Online]. Available: https://ieeexplore.ieee.org/document/8029568

A. H. Lashkari, B. Li, T. L. Carrier, and G. Kaur, “VolMemLyzer: Volatile Memory Analyzer for Malware Classification using Feature Engineering,” 2021 Reconciling Data Anal. Autom. Privacy, Secur. A Big Data Challenge, RDAAPS 2021, May 2021, doi: 10.1109/RDAAPS48126.2021.9452028.

D. Sgandurra, L. Muñoz-González, R. Mohsen, and E. C. Lupu, “Automated Dynamic Analysis of Ransomware: Benefits, Limitations and use for Detection,” Sep. 2016, Accessed: Feb. 07, 2024. [Online]. Available: https://arxiv.org/abs/1609.03020v1

Y. L. Wan, J. C. Chang, R. J. Chen, and S. J. Wang, “Feature-Selection-Based Ransomware Detection with Machine Learning of Data Analysis,” 2018 3rd Int. Conf. Comput. Commun. Syst. ICCCS 2018, pp. 392–396, Sep. 2018, doi: 10.1109/CCOMS.2018.8463300.

W. Z. A. Zakaria, M. F. Abdollah, O. Mohd, and A. F. M. Ariffin, “The rise of ransomware,” ACM Int. Conf. Proceeding Ser., pp. 66–70, Dec. 2017, doi: 10.1145/3178212.3178224.

M. Dener, G. Ok, and A. Orman, “Malware Detection Using Memory Analysis Data in Big Data Environment,” Appl. Sci. 2022, Vol. 12, Page 8604, vol. 12, no. 17, p. 8604, Aug. 2022, doi: 10.3390/APP12178604.

“Leveraging Feature Selection to Improve the Accuracy for Malware Detection,” Jun. 2023, doi: 10.21203/RS.3.RS-3045391/V1.

A. Carrega, L. Caviglione, M. Repetto, and M. Zuppelli, “Programmable data gathering for detecting stegomalware,” Proc. 2020 IEEE Conf. Netw. Softwarization Bridg. Gap Between AI Netw. Softwarization, NetSoft 2020, pp. 422–429, Jun. 2020, doi: 10.1109/NETSOFT48620.2020.9165537.

J. Zhou, A. H. Gandomi, F. Chen, and A. Holzinger, “Evaluating the Quality of Machine Learning Explanations: A Survey on Methods and Metrics,” Electron. 2021, Vol. 10, Page 593, vol. 10, no. 5, p. 593, Mar. 2021, doi: 10.3390/ELECTRONICS10050593.

M. M. Hasan and M. M. Rahman, “RansHunt: A support vector machines based ransomware analysis framework with integrated feature set,” 20th Int. Conf. Comput. Inf. Technol. ICCIT 2017, vol. 2018-January, pp. 1–7, Jul. 2017, doi: 10.1109/ICCITECHN.2017.8281835.

G. Cusack, O. Michel, and E. Keller, “Machine learning-based detection of ransomware using SDN,” SDN-NFVSec 2018 - Proc. 2018 ACM Int. Work. Secur. Softw. Defin. Networks Netw. Funct. Virtualization, Co-located with CODASPY 2018, vol. 2018-January, pp. 1–6, Mar. 2018, doi: 10.1145/3180465.3180467.

A. Ichinose, A. Takefusa, H. Nakada, and M. Oguchi, “A study of a video analysis framework using Kafka and spark streaming,” Proc. - 2017 IEEE Int. Conf. Big Data, Big Data 2017, vol. 2018-January, pp. 2396–2401, Jul. 2017, doi: 10.1109/BIGDATA.2017.8258195.

M. A. Arshed, S. Mumtaz, O. Riaz, W. Sharif, and S. Abdullah, “A Deep Learning Framework for Multi Drug Side Effects Prediction with Drug Chemical Substructure,” Int. J. Innov. Sci. Technol., vol. 4, no. 1, pp. 19–31, Jan. 2022, doi: 10.33411/IJIST/2022040102.

M. T. Ubaid, M. Z. Khan, M. Rumaan, M. A. Arshed, M. U. G. Khan, and A. Darboe, “COVID-19 SOP’s Violations Detection in Terms of Face Mask Using Deep Learning,” 4th Int. Conf. Innov. Comput. ICIC 2021, 2021, doi: 10.1109/ICIC53490.2021.9692999.

M. A. Arshed, H. Ghassan, M. Hussain, M. Hassan, A. Kanwal, and R. Fayyaz, “A Light Weight Deep Learning Model for Real World Plant Identification,” 2022 2nd Int. Conf. Distrib. Comput. High Perform. Comput. DCHPC 2022, pp. 40–45, 2022, doi: 10.1109/DCHPC55044.2022.9731841.

A. Shahzad, M. A. Arshed, F. Liaquat, M. Tanveer, M. Hussain, and R. Alamdar, “Pneumonia Classification from Chest X-ray Images Using Pre-Trained Network Architectures,” VAWKUM Trans. Comput. Sci., vol. 10, no. 2, pp. 34–44, Dec. 2022, doi: 10.21015/VTCS.V10I2.1271.

M. Mubeen, M. A. Arshed, and H. A. Rehman, “DeepFireNet - A Light-Weight Neural Network for Fire-Smoke Detection,” Commun. Comput. Inf. Sci., vol. 1616 CCIS, pp. 171–181, 2022, doi: 10.1007/978-3-031-10525-8_14.

M. A. Arshed, S. Mumtaz, M. Hussain, R. Alamdar, M. T. Hassan, and M. Tanveer, “DeepFinancial Model for Exchange Rate Impacts Prediction of Political and Financial Statements,” 3rd IEEE Int. Conf. Artif. Intell. ICAI 2023, pp. 13–19, 2023, doi: 10.1109/ICAI58407.2023.10136658.

H. Younis, M. Asad Arshed, F. ul Hassan, M. Khurshid, H. Ghassan, and M. Haseeb-, “Tomato Disease Classification using Fine-Tuned Convolutional Neural Network,” Int. J. Innov. Sci. Technol., vol. 4, no. 1, pp. 123–134, Feb. 2022, doi: 10.33411/IJIST/2022040109.

M. A. Arshed, A. Shahzad, K. Arshad, D. Karim, S. Mumtaz, and M. Tanveer, “Multiclass Brain Tumor Classification from MRI Images using Pre-Trained CNN Model,” VFAST Trans. Softw. Eng., vol. 10, no. 4, pp. 22–28, Nov. 2022, doi: 10.21015/VTSE.V10I4.1182.

M. A. Arshed et al., “Machine Learning with Data Balancing Technique for IoT Attack and Anomalies Detection,” Int. J. Innov. Sci. Technol., vol. 4, no. 2, pp. 490–498, 2022, doi: 10.33411/ijist/2022040218.

H. A. Arshad, M. Hussain, A. Amin, and M. A. Arshed, “Impact of Artificial Intelligence in COVID-19 Pandemic: A Comprehensive Review,” 2022 2nd Int. Conf. Distrib. Comput. High Perform. Comput. DCHPC 2022, pp. 66–73, 2022, doi: 10.1109/DCHPC55044.2022.9732091.

M. A. Arshed, W. Qureshi, M. Rumaan, M. T. Ubaid, A. Qudoos, and M. U. G. Khan, “Comparison of Machine Learning Classifiers for Breast Cancer Diagnosis,” 4th Int. Conf. Innov. Comput. ICIC 2021, 2021, doi: 10.1109/ICIC53490.2021.9692926.

M. A. Arshed and F. Riaz, “Machine Learning for High Risk Cardiovascular Patient Identification 1,” J. Distrib. Comput. Syst., vol. 4, no. 2, pp. 34–39, 2021.

M. Hussain, A. Shahzad, F. Liaquat, M. A. Arshed, S. Mansoor, and Z. Akram, “Performance Analysis of Machine Learning Algorithms for Early Prognosis of Cardiac Vascular Disease,” Tech. J., vol. 28, no. 02, pp. 31–41, Jun. 2023, Accessed: Dec. 26, 2023. [Online]. Available: https://tj.uettaxila.edu.pk/index.php/technical-journal/article/view/1778

S. K. Shaukat and V. J. Ribeiro, “RansomWall: A layered defense system against cryptographic ransomware attacks using machine learning,” 2018 10th Int. Conf. Commun. Syst. Networks, COMSNETS 2018, vol. 2018-January, pp. 356–363, Mar. 2018, doi: 10.1109/COMSNETS.2018.8328219.

and T. L. A. Tseng, Y. Chen, Y. Kao, “Deep learning for ransomware detection”.

S. Maniath, A. Ashok, P. Poornachandran, V. G. Sujadevi, A. U. P. Sankar, and S. Jan, “Deep learning LSTM based ransomware detection,” 2017 Recent Dev. Control. Autom. Power Eng. RDCAPE 2017, pp. 442–446, May 2018, doi: 10.1109/RDCAPE.2017.8358312.

H. Daku, P. Zavarsky, and Y. Malik, “Behavioral-Based Classification and Identification of Ransomware Variants Using Machine Learning,” Proc. - 17th IEEE Int. Conf. Trust. Secur. Priv. Comput. Commun. 12th IEEE Int. Conf. Big Data Sci. Eng. Trust. 2018, pp. 1560–1564, Sep. 2018, doi: 10.1109/TRUSTCOM/BIGDATASE.2018.00224.

Y. Takeuchi, K. Sakai, and S. Fukumoto, “Detecting ransomware using support vector machines,” ACM Int. Conf. Proceeding Ser., Aug. 2018, doi: 10.1145/3229710.3229726.

“Machine Learning-Based Detection of Ransomware Using SDN | Proceedings of the 2018 ACM International Workshop on Security in Software Defined Networks & Network Function Virtualization.” Accessed: Feb. 07, 2024. [Online]. Available: https://dl.acm.org/doi/10.1145/3180465.3180467

O. M. K. Alhawi, J. Baldwin, and A. Dehghantanha, “Leveraging Machine Learning Techniques for Windows Ransomware Network Traffic Detection,” vol. 70, 2018, doi: 10.1007/978-3-319-73951-9_5.

S. Poudyal, K. P. Subedi, and D. Dasgupta, “A Framework for Analyzing Ransomware using Machine Learning,” Proc. 2018 IEEE Symp. Ser. Comput. Intell. SSCI 2018, pp. 1692–1699, Jul. 2018, doi: 10.1109/SSCI.2018.8628743.

K. Lee, S. Y. Lee, and K. Yim, “Machine Learning Based File Entropy Analysis for Ransomware Detection in Backup Systems,” IEEE Access, vol. 7, pp. 110205–110215, 2019, doi: 10.1109/ACCESS.2019.2931136.

F. Khan, C. Ncube, L. K. Ramasamy, S. Kadry, and Y. Nam, “A Digital DNA Sequencing Engine for Ransomware Detection Using Machine Learning,” IEEE Access, vol. 8, pp. 119710–119719, 2020, doi: 10.1109/ACCESS.2020.3003785.

L. Chen, C.-Y. Yang, A. Paul, and R. Sahita, “Towards resilient machine learning for ransomware detection,” Dec. 2018, Accessed: Feb. 07, 2024. [Online]. Available: https://arxiv.org/abs/1812.09400v2

S. Il Bae, G. Bin Lee, and E. G. Im, “Ransomware detection using machine learning algorithms,” Concurr. Comput. Pract. Exp., vol. 32, no. 18, p. e5422, Sep. 2020, doi: 10.1002/CPE.5422.

J. Hwang, J. Kim, S. Lee, and K. Kim, “Two-Stage Ransomware Detection Using Dynamic Analysis and Machine Learning Techniques,” Wirel. Pers. Commun., vol. 112, no. 4, pp. 2597–2609, Jun. 2020, doi: 10.1007/S11277-020-07166-9/METRICS.

H. Zuhair, A. Selamat, and O. Krejcar, “A Multi-Tier Streaming Analytics Model of 0-Day Ransomware Detection Using Machine Learning,” Appl. Sci. 2020, Vol. 10, Page 3210, vol. 10, no. 9, p. 3210, May 2020, doi: 10.3390/APP10093210.