Journal Neurocomputers, No. 3, 2025
Article in issue:
Analysis of challenges of automatic sentiment detection methods in unstructured texts
Type of article: review article
DOI: https://doi.org/10.18127/j19998554-202503-07
UDC: 004.852:004.912
Authors:

E.A. Presnov1, A.N. Alpatov2
1, 2 MIREA – Russian Technological University (Moscow, Russia)

1 presnov.e.a@yandex.ru, 2 alpatov@mirea.ru

Abstract:

In recent years, automatic sentiment detection in text has gained particular relevance due to the growing volume of user-generated content in social media, blogs, product and service reviews, political discussions, and other digital communication environments. The advent of deep neural networks has introduced new approaches to extracting complex linguistic patterns and contextual dependencies, significantly improving sentiment classification quality across a wide range of benchmark datasets. Despite these advances, several challenges remain in applying machine learning and neural networks to sentiment analysis. First, the heterogeneity and contextual variability of sentiment-related vocabulary pose a serious obstacle. Second, sarcasm, irony, and implicitly expressed attitudes are difficult to handle: standard machine learning methods often fail to interpret such nuances, especially without incorporating pragmatic context. Third, the quality of input data plays a crucial role in building reliable sentiment detection systems. A further challenge is recognizing emotions that go beyond the simple positive/negative polarity spectrum.

The objective of the article is to analyze existing machine learning methods for sentiment analysis in text in order to identify promising directions for further research.

A range of existing text analysis methods applicable to sentiment detection is reviewed. The current challenges are identified: limited availability of domain-specific training data, class imbalance across sentiment categories, and restricted access to representative datasets. Directions for future research are also outlined, including the handling of linguistic ambiguity and idiomatic expressions and the expansion of detectable emotional categories.

The results of the analysis can be applied in the design of systems focused on identifying authors' opinions regarding specific issues or objects, as well as emotional responses to particular events or entities. These findings may support the early stages of system design and technical requirements analysis in selecting the most appropriate methodological approach.

Pages: 49-61
For citation

Presnov E.A., Alpatov A.N. Analysis of challenges of automatic sentiment detection methods in unstructured texts. Neurocomputers. 2025. V. 27. № 3. P. 49–61. DOI: https://doi.org/10.18127/j19998554-202503-07 (in Russian)

Date of receipt: 03.04.2025
Approved after review: 06.05.2025
Accepted for publication: 26.05.2025