Sentiment Classification Using Multinomial Logistic Regression on Roman Urdu Text

Authors

  • Irfan Qutab, Department of Computer Science, University of Lahore.
  • Khawar Iqbal Malik, Department of Computer Science, University of Lahore.
  • Hira Arooj, Department of Mathematics & Statistics, University of Lahore.

Keywords:

Sentiment Analysis, Emotion Analysis, Multinomial Logistic Regression, Machine Learning.

Abstract

Sentiment analysis seeks to extract knowledge from text in which people share their thoughts and views on public platforms such as social blogs. On these blogs, user input is typically available as short comments. The large volume of information published there has made sentiment analysis a pressing problem. Although some language libraries exist for emotion analysis, limited work is available for Roman Urdu, because most comments and opinions published online are written in a free-text style. The present study evaluates emotions in Roman Urdu comments using a machine learning technique. The analysis proceeded in stages: data collection, labeling, pre-processing, and feature extraction. In the final phase, we used a pipeline with Multinomial Logistic Regression to classify the dataset into four categories (Politics, Sports, Education, and Religion). The dataset was divided into training and test sets. We evaluated the model on the test set using Precision, Recall, Accuracy, F1 Score, and the Confusion Matrix, and found an accuracy of up to 94%.
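
The stages named in the abstract (pre-processing, feature extraction, multinomial logistic regression, and evaluation on a held-out test set) can be illustrated with a minimal sketch. This is not the authors' implementation. The abstract does not specify the pipeline library, the features, or the training procedure, so the sketch assumes whitespace tokenization, bag-of-words counts, and softmax regression fitted by stochastic gradient descent, using only the Python standard library. The comments in `TRAIN` and `TEST` are invented toy examples, not the study's dataset, and all function names are hypothetical.

```python
import math
from collections import Counter

LABELS = ["Politics", "Sports", "Education", "Religion"]

# Invented toy comments for illustration only (NOT the study's dataset).
TRAIN = [
    ("hukumat ne naya budget pesh kiya", "Politics"),
    ("wazir e azam ne election ka elaan kiya", "Politics"),
    ("team ne cricket match jeet liya", "Sports"),
    ("khiladi ne match mein century banai", "Sports"),
    ("school mein imtihan shuru ho gaye", "Education"),
    ("university ne naya course shuru kiya", "Education"),
    ("masjid mein namaz ada ki gayi", "Religion"),
    ("ramzan mein roza rakhna farz hai", "Religion"),
]
TEST = [
    ("cricket team ne match jeet liya", "Sports"),
    ("masjid mein namaz ada ki", "Religion"),
    ("hukumat ne election ka elaan kiya", "Politics"),
    ("university mein imtihan shuru", "Education"),
]

def tokenize(text):
    """Pre-processing (assumed): lowercase and split on whitespace."""
    return text.lower().split()

# Vocabulary is built from the training set only.
VOCAB = {tok: i for i, tok in enumerate(
    sorted({t for text, _ in TRAIN for t in tokenize(text)}))}

def featurize(text):
    """Feature extraction (assumed): bag-of-words counts; unseen tokens are ignored."""
    vec = [0.0] * len(VOCAB)
    for tok, count in Counter(tokenize(text)).items():
        if tok in VOCAB:
            vec[VOCAB[tok]] = float(count)
    return vec

def predict_proba(model, text):
    """Softmax over one linear score per class."""
    weights, bias = model
    x = featurize(text)
    scores = [sum(w * v for w, v in zip(weights[k], x)) + bias[k]
              for k in range(len(LABELS))]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def train(data, epochs=300, lr=0.1):
    """Fit weights by stochastic gradient descent on the cross-entropy loss."""
    model = ([[0.0] * len(VOCAB) for _ in LABELS], [0.0] * len(LABELS))
    weights, bias = model
    for _ in range(epochs):
        for text, label in data:
            x = featurize(text)
            probs = predict_proba(model, text)
            for k, name in enumerate(LABELS):
                grad = probs[k] - (1.0 if name == label else 0.0)
                bias[k] -= lr * grad
                for j, v in enumerate(x):
                    weights[k][j] -= lr * grad * v
    return model

def predict(model, text):
    probs = predict_proba(model, text)
    return LABELS[probs.index(max(probs))]

def evaluate(model, data):
    """Return (confusion matrix, accuracy, per-class precision/recall/F1)."""
    n = len(LABELS)
    matrix = [[0] * n for _ in range(n)]  # rows: true label, columns: predicted
    for text, label in data:
        matrix[LABELS.index(label)][LABELS.index(predict(model, text))] += 1
    accuracy = sum(matrix[k][k] for k in range(n)) / len(data)
    report = {}
    for k, name in enumerate(LABELS):
        tp = matrix[k][k]
        predicted = sum(matrix[r][k] for r in range(n))
        actual = sum(matrix[k])
        precision = tp / predicted if predicted else 0.0
        recall = tp / actual if actual else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        report[name] = (precision, recall, f1)
    return matrix, accuracy, report

model = train(TRAIN)
matrix, accuracy, report = evaluate(model, TEST)
print("accuracy:", accuracy)
for name, (p, r, f1) in report.items():
    print(f"{name}: precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
```

Numbers produced on this toy data say nothing about the reported 94% accuracy. In practice a library pipeline, for example a vectorizer chained with a logistic regression estimator, would likely replace the hand-written training loop.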

Published

2022-04-17

How to Cite

Irfan Qutab, Malik, K. I., & Hira Arooj. (2022). Sentiment Classification Using Multinomial Logistic Regression on Roman Urdu Text. International Journal of Innovations in Science & Technology, 4(2), 323–335. Retrieved from https://journal.50sea.com/index.php/IJIST/article/view/217