AI-Based Sindhi Handwritten Alphabets Classification with Web-Based Development
Keywords:
Sindhi, Handwriting Recognition, Convolutional Neural Network, Data Augmentation, Web-Based Application, Machine Learning, Open-source
Abstract
Handwriting recognition has made remarkable progress for widely used scripts, but low-resource languages such as Sindhi have received little attention. In this research, we propose the design and implementation of a robust AI-based model to classify handwritten Sindhi alphabets. To overcome the difficulties caused by variation in handwriting and the lack of publicly available datasets, the model builds on a manually curated, heterogeneous dataset, convolutional neural network (CNN) architectures, and data augmentation techniques. To support further research, the dataset will be made publicly available in two versions: raw and augmented. This study's key contributions include achieving approximately 93% training accuracy and 96% validation accuracy with a loss below 1%, and creating open-source datasets for Sindhi handwriting recognition. While a web-based application is planned as future work, these results provide a strong foundation for digitizing Sindhi texts, building educational tools, and helping preserve Sindhi language heritage.
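The abstract does not specify which augmentation operations were used, so the following is only a minimal sketch of the general idea, assuming small random translations of a grayscale character image. The function names (shift_image, augment), the parameter values, and the toy 5x5 glyph are hypothetical and are not taken from the paper. Horizontal flips are deliberately left out, since mirroring a character can change its identity.

```python
import random


def shift_image(image, dx, dy, fill=0):
    """Translate a 2D grayscale image (list of rows) by dx columns and dy rows.

    Pixels shifted outside the frame are dropped; uncovered pixels get `fill`.
    """
    height, width = len(image), len(image[0])
    shifted = [[fill] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            new_y, new_x = y + dy, x + dx
            if 0 <= new_y < height and 0 <= new_x < width:
                shifted[new_y][new_x] = image[y][x]
    return shifted


def augment(image, copies=4, max_shift=2, seed=None):
    """Return `copies` randomly shifted variants of a single image."""
    rng = random.Random(seed)  # seeded generator keeps runs reproducible
    variants = []
    for _ in range(copies):
        dx = rng.randint(-max_shift, max_shift)
        dy = rng.randint(-max_shift, max_shift)
        variants.append(shift_image(image, dx, dy))
    return variants


# Toy 5x5 "glyph" (hypothetical data): one bright stroke pixel in the centre.
glyph = [[0] * 5 for _ in range(5)]
glyph[2][2] = 255

# The raw image plus its shifted variants mirror the "raw" and "augmented"
# dataset versions mentioned in the abstract.
augmented = augment(glyph, copies=4, max_shift=1, seed=0)
```

In practice a deep learning framework's own augmentation utilities would normally replace this hand-written loop, and rotation, scaling, or noise could be added in the same way.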

License
Copyright (c) 2025 50SEA

This work is licensed under a Creative Commons Attribution 4.0 International License.