A Modified K-Nearest Neighbors Algorithm for the Detection of Heart Disease

Authors

  • Muhammad Muntazir Khan, Institute of Computer Science and Information Technology, ICS/IT, FMCS, The University of Agriculture, Peshawar 25130, Pakistan
  • Shakila Parveen Jan, Department of Information Technology, Qurtuba University, Peshawar, Pakistan
  • Jamal Uddin, Computer Science, Riphah School of Computing and Innovation, Riphah International University, Lahore, Pakistan
  • Basharat Ahmad Hassan, Institute of Computer Science and Information Technology, ICS/IT, FMCS, The University of Agriculture, Peshawar 25130, Pakistan
  • Anees Ur Rahman, Youth Affairs, District Youth Office, Upper Chitral

Keywords:

Heart Disease, KNN, Jaccard, Cosine, Accuracy, Confusion Matrix

Abstract

Heart disease, also referred to as cardiovascular disease, is the leading cause of mortality worldwide. It is a serious condition that affects the heart and blood vessels. A significant amount of recent research has aimed to improve the accuracy and reliability of the analysis of heart disease data. Machine learning is crucial in this field because it provides diagnostic tools that can predict illness and improve healthcare. This study proposes detecting heart disease by combining K-nearest neighbors (KNN) with the Jaccard and Cosine similarity measures. The results of the Jaccard- and Cosine-integrated KNN models are compared with those of widely used models, namely standard KNN and decision trees. Python and its libraries are used for the simulations. On the Cleveland heart disease dataset, the Jaccard-based KNN (JKNN) achieved the best accuracy (97%). The Cosine-based KNN (CKNN) also performed strongly, with 91% accuracy. In contrast, the decision tree had the lowest accuracy (85%), which makes it the least suitable of the compared models for classifying heart disease, and standard KNN was only slightly better at 86%. According to these results, JKNN is the best model for this task, followed by CKNN. These findings have implications for the use of machine learning in the diagnosis and prognosis of heart disease.
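
The abstract does not state how the two similarity measures are plugged into KNN, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes non-negative feature vectors, uses the generalized (min/max) form of Jaccard similarity, and runs on made-up toy rows rather than the Cleveland data. All function names (`jaccard_similarity`, `cosine_similarity`, `knn_predict`, `accuracy`) are hypothetical.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Cosine of the angle between two numeric feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen here for all-zero vectors
    return dot / (norm_a * norm_b)


def jaccard_similarity(a, b):
    """Generalized Jaccard similarity: sum of minima over sum of maxima.

    Assumption: the study's exact Jaccard formulation is not given in the
    abstract. On 0/1 vectors this equals intersection size over union size.
    """
    numerator = sum(min(x, y) for x, y in zip(a, b))
    denominator = sum(max(x, y) for x, y in zip(a, b))
    if denominator == 0:
        return 1.0  # two all-zero vectors are treated as identical
    return numerator / denominator


def knn_predict(train_X, train_y, query, k, similarity):
    """Majority vote among the k training rows most similar to `query`."""
    ranked = sorted(
        zip(train_X, train_y),
        key=lambda pair: similarity(pair[0], query),
        reverse=True,  # highest similarity first
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


# Toy, made-up binary feature rows (NOT the Cleveland data); label 1 = disease.
train_X = [
    [1, 1, 0, 0],
    [1, 1, 1, 0],
    [1, 0, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 0, 1],
    [0, 1, 1, 1],
]
train_y = [1, 1, 1, 0, 0, 0]
test_X = [[1, 1, 0, 1], [0, 0, 1, 0]]
test_y = [1, 0]

# The same KNN routine becomes "JKNN" or "CKNN" by swapping the similarity.
jknn_pred = [knn_predict(train_X, train_y, q, 3, jaccard_similarity) for q in test_X]
cknn_pred = [knn_predict(train_X, train_y, q, 3, cosine_similarity) for q in test_X]

print("JKNN accuracy:", accuracy(test_y, jknn_pred))
print("CKNN accuracy:", accuracy(test_y, cknn_pred))
```

Passing the similarity function as a parameter keeps the neighbor search and voting identical across variants, so any difference in accuracy can be attributed to the measure itself. The toy scores here say nothing about the accuracies reported in the study.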

Published

2025-11-24

How to Cite

Khan, M. M., Shakila Parveen Jan, Jamal Uddin, Basharat Ahmad Hassan, & Anees Ur Rahman. (2025). A Modified K-Nearest Neighbors Algorithm for the Detection of Heart Disease. International Journal of Innovations in Science & Technology, 7(4), 2863–2880. Retrieved from https://journal.50sea.com/index.php/IJIST/article/view/1654