Clear Tic-AI: Detection of Dysarthria and its Severity Analysis

Authors

  • Huma Sheraz National University of Modern Languages (NUML).
  • Iqra Ashraf National University of Modern Languages (NUML).
  • Sidra Ashraf National University of Modern Languages (NUML).
  • Muhammad Zain National University of Modern Languages (NUML).
  • Babar Nawaz National University of Modern Languages (NUML).

Keywords

Dysarthria, Speech Production, Sequential Neural Network

Abstract

Dysarthria and other motor speech disorders result from abnormalities in the neural or muscular processes that control speech production; these abnormalities impair the strength, coordination, and tone of the vocal muscles, ultimately producing less intelligible speech. Because dysarthria can range from moderate distortion of articulation to severe impairment of speech, early and accurate assessment is critical. This paper proposes Clear Tic-AI, an automated speech analysis platform that leverages artificial intelligence to diagnose vocal disorders. The system fuses Wav2Vec2 embeddings with traditional acoustic features such as Mel-Frequency Cepstral Coefficients (MFCCs), pitch estimates, and spectral descriptors, and uses a sequential neural network architecture to classify abnormal voice and grade its severity. The system is evaluated extensively on 10,000 voice recordings drawn from the TORGO and Mozilla Common Voice datasets. Experimental results demonstrate that the proposed model achieves a classification accuracy of 94.2% (±1.3), an F1-score of 0.943, and an Area Under the Curve (AUC) of 0.987 on the test set, establishing the effectiveness of this framework for dysarthric speech detection.


Published

2025-12-01

How to Cite

Huma Sheraz, Iqra Ashraf, Sidra Ashraf, Muhammad Zain, & Babar Nawaz. (2025). Clear Tic-AI: Detection of Dysarthria and its Severity Analysis. International Journal of Innovations in Science & Technology, 7(4), 3018–3032. Retrieved from https://journal.50sea.com/index.php/IJIST/article/view/1664