Emotion Classification in Bangla Text Data Using Gaussian Naive Bayes Classifier: A Computational Linguistic Study

Authors

  • S M Abdullah Shafi, Computer Science Department, American International University-Bangladesh, Dhaka, Bangladesh. https://orcid.org/0000-0001-7365-5145
  • Myesha Samia, Computer Science Department, American International University-Bangladesh, Dhaka, Bangladesh. https://orcid.org/0009-0008-8385-6280
  • Sultanul Arifeen Hamim, Computer Science Department, American International University-Bangladesh, Dhaka, Bangladesh

DOI:

https://doi.org/10.56532/mjsat.v4i4.273

Keywords:

Emotion Detection, Natural Language Processing, Naïve Bayes, Machine Learning

Abstract

Emotion analysis of Bengali text is challenging because of the language's intricate structure and the scarcity of resources tailored to emotion and sentiment classification. In this paper, the authors use machine learning algorithms, specifically Gaussian Naive Bayes (GNB) and Support Vector Machine (SVM), to classify six emotions in Bengali text. The data is pre-processed through segmentation, emoticon handling, stop-word removal, and stemming. Features are built from unigrams, bigrams, and term frequency-inverse document frequency (TF-IDF) weights to improve classification accuracy. The main aim of the paper is to present an in-depth analysis of emotion detection in Bengali text, which should be useful to scholars working on NLP problems in non-English languages. The research thereby addresses a gap in emotion analysis for Bengali, which remains underexplored compared with other languages. The methodology comprises dataset preparation, preprocessing, feature extraction and selection, and classification. In the reported experiments, the GNB classifier attains an accuracy of 93.83%, which the authors take as evidence that the proposed model can capture subtle emotional nuances in Bengali text.
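
The feature extraction and classification steps named in the abstract can be illustrated with a minimal, standard-library Python sketch. It is not the authors' implementation: the four toy sentences, the two labels, the function names (`ngrams`, `fit_tfidf`, `transform`, `fit_gnb`, `predict`), the smoothed IDF formula, and the variance floor are all assumptions made for illustration. The paper's preprocessing (segmentation, emoticon handling, stop-word removal, stemming), its six emotion classes, and the SVM comparison are omitted here.

```python
import math
from collections import Counter

# Hypothetical toy corpus; the paper's dataset and its six labels are not reproduced.
TRAIN = [
    ("আমি আজ খুব খুশি", "joy"),
    ("আজ খুব আনন্দ লাগছে", "joy"),
    ("আমার মন খুব খারাপ", "sadness"),
    ("আমি আজ খুব কষ্ট পেলাম", "sadness"),
]

def ngrams(text):
    """Return whitespace tokens (unigrams) followed by adjacent-token bigrams."""
    toks = text.split()
    return toks + [" ".join(pair) for pair in zip(toks, toks[1:])]

def fit_tfidf(docs):
    """Learn a sorted vocabulary and smoothed IDF weights from training documents."""
    df = Counter(term for doc in docs for term in set(ngrams(doc)))
    n = len(docs)
    vocab = sorted(df)
    idf = {t: math.log((1 + n) / (1 + df[t])) + 1.0 for t in vocab}
    return vocab, idf

def transform(doc, vocab, idf):
    """Map a document to an L2-normalised TF-IDF vector over the vocabulary."""
    tf = Counter(ngrams(doc))
    vec = [tf[t] * idf[t] for t in vocab]
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def _variance(values, mean):
    return sum((v - mean) ** 2 for v in values) / len(values)

def fit_gnb(X, y, smoothing=1e-9):
    """Estimate class priors and per-feature Gaussian means and variances."""
    cols = list(zip(*X))
    # Variance floor (assumption): a small fraction of the largest feature variance.
    eps = smoothing * max(_variance(c, sum(c) / len(c)) for c in cols) or smoothing
    model = {}
    for label in sorted(set(y)):
        rows = [x for x, lab in zip(X, y) if lab == label]
        class_cols = list(zip(*rows))
        means = [sum(c) / len(c) for c in class_cols]
        variances = [_variance(c, m) + eps for c, m in zip(class_cols, means)]
        model[label] = (math.log(len(rows) / len(X)), means, variances)
    return model

def predict(model, x):
    """Return the label with the highest log prior plus Gaussian log-likelihood."""
    def log_posterior(log_prior, means, variances):
        return log_prior + sum(
            -0.5 * math.log(2 * math.pi * s) - (v - m) ** 2 / (2 * s)
            for v, m, s in zip(x, means, variances)
        )
    return max(model, key=lambda label: log_posterior(*model[label]))

docs = [doc for doc, _ in TRAIN]
labels = [label for _, label in TRAIN]
vocab, idf = fit_tfidf(docs)
model = fit_gnb([transform(d, vocab, idf) for d in docs], labels)

print(predict(model, transform("খুব আনন্দ লাগছে", vocab, idf)))
```

Gaussian Naive Bayes models each feature as a continuous value, so it operates on dense TF-IDF vectors rather than sparse counts. With a real corpus, a held-out test split would be needed before reporting any accuracy figure.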

Published

2024-09-02

How to Cite

[1]
S M Abdullah Shafi, Myesha Samia, and Sultanul Arifeen Hamim, “Emotion Classification in Bangla Text Data Using Gaussian Naive Bayes Classifier: A Computational Linguistic Study”, Malaysian J. Sci. Adv. Tech., vol. 4, no. 4, pp. 405–412, Sep. 2024.

Section

Articles