Auscultation-Based Pulmonary Disease Detection and Classification Using Deep Neural Networks

Yusra Shaikh; Areej Fatemah Meghji; Zobiya Jumani; Mehak Jatoi

Authors

Yusra Shaikh, Department of Software Engineering, Mehran University of Engineering and Technology, Jamshoro, Pakistan
Areej Fatemah Meghji, Department of Software Engineering, Mehran University of Engineering and Technology, Jamshoro, Pakistan
Zobiya Jumani, Department of Software Engineering, Mehran University of Engineering and Technology, Jamshoro, Pakistan
Mehak Jatoi, Department of Software Engineering, Mehran University of Engineering and Technology, Jamshoro, Pakistan

Keywords:

Pulmonary Disease, Auscultations, Deep Learning, Recurrent Neural Network (RNN), Data Augmentation, Gated Recurrent Unit

Abstract

Pulmonary diseases such as Pneumonia, Bronchiectasis, and Chronic Obstructive Pulmonary Disease cause a large number of deaths worldwide. Early and accurate diagnosis is essential for treating and managing these diseases effectively. In this work, we propose a deep learning model based on Recurrent Neural Networks (RNN) that detects three pulmonary diseases, as well as healthy lung sounds, using only auscultation recordings. The model was trained on the ICBHI dataset, which contains 920 recordings from 126 people and covers more than 6,800 respiratory cycles. To make the inputs uniform, the audio recordings are padded to equal length. To tackle class imbalance in the dataset, three augmentation techniques are used: Gaussian noise injection, time shifting, and time stretching. We employ a simplified Gated Recurrent Unit (GRU)-based RNN architecture to handle the padded sequences, along with a dropout layer to reduce overfitting. The model is trained using the Adamax optimizer with categorical cross-entropy loss, along with model checkpointing to ensure learning consistency. In addition to model accuracy, we evaluated the F1-score and the accuracy and loss curves to assess the competitiveness of our approach. Across six experiments, covering different data variations and two model architectures, the best-performing model achieved an accuracy of 98.53%, a precision of 98.57%, a recall of 98.53%, and an F1-score of 98.52%.
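
The preprocessing steps named in the abstract (padding to equal length, Gaussian noise injection, time shifting, and time stretching) can be illustrated with a minimal sketch. This is not the authors' implementation: all function names, parameter values, and the use of plain Python lists in place of audio arrays are assumptions made for illustration. In particular, the time stretch below is a naive linear-interpolation resampling, which also changes pitch, and the time shift is assumed to be circular; the paper may use different variants.

```python
import random


def add_gaussian_noise(signal, noise_std=0.005, rng=None):
    """Add zero-mean Gaussian noise to every sample (noise_std is an assumed value)."""
    rng = rng or random.Random()
    return [s + rng.gauss(0.0, noise_std) for s in signal]


def time_shift(signal, shift):
    """Circularly shift the signal to the right by `shift` samples (assumed variant)."""
    if not signal:
        return []
    k = shift % len(signal)
    return signal[-k:] + signal[:-k] if k else list(signal)


def time_stretch(signal, rate):
    """Naive stretch by linear-interpolation resampling.

    rate > 1 shortens the signal, rate < 1 lengthens it. Unlike a phase
    vocoder, this stand-in also shifts pitch.
    """
    if rate <= 0:
        raise ValueError("rate must be positive")
    if len(signal) < 2:
        return list(signal)
    n_out = max(2, round(len(signal) / rate))
    last = len(signal) - 1
    out = []
    for i in range(n_out):
        pos = i * last / (n_out - 1)   # fractional index into the input
        lo = min(int(pos), last)
        hi = min(lo + 1, last)
        frac = pos - lo
        out.append(signal[lo] * (1.0 - frac) + signal[hi] * frac)
    return out


def pad_to_length(signal, target_len, pad_value=0.0):
    """Truncate or right-pad the signal so it has exactly target_len samples."""
    clipped = list(signal[:target_len])
    return clipped + [pad_value] * (target_len - len(clipped))


# Hypothetical usage: build three augmented copies of one short clip,
# then pad everything to a common length before batching.
clip = [0.0, 0.1, 0.2, 0.3, 0.2, 0.1, 0.0, -0.1]
variants = [
    clip,
    add_gaussian_noise(clip, rng=random.Random(0)),
    time_shift(clip, 2),
    time_stretch(clip, 0.5),
]
target_len = max(len(v) for v in variants)
batch = [pad_to_length(v, target_len) for v in variants]
```

Applying such augmentations only to under-represented classes is one common way to reduce class imbalance. The zero padding added at the end of each sequence is the part a recurrent model would typically be told to ignore, for example through a masking step.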
