An ensemble machine learning approach for Twitter sentiment analysis

Radiuk, Pavlo; Pavlova, Olga; Hrypynska, Nadiia

Files

Radiuk_An-ensemble-machine-learning.pdf (842.92 KB)

Date

2022-07-17

Authors

Radiuk, Pavlo

Pavlova, Olga

Hrypynska, Nadiia

Publisher

CEUR-WS

Abstract

This study addresses the classification of emotional expressions in short texts (tweets) collected from the social network Twitter. We propose a novel approach to preprocessing tweets so that they fit the classification model more effectively, and we use two types of features, unigrams and bigrams, to expand the feature vector. Emotional expressions were classified with several machine learning algorithms: raw random forest, gradient boosting random forest, support vector machine, multilayer perceptron, recurrent neural network, and convolutional neural network. The elements of the feature vector are represented as sparse and dense subvectors. Computational experiments showed that the “appearance” representation of the sparse vector performed better than the “regularity” representation. The experiments also showed that deep learning approaches outperformed traditional machine learning techniques: the best recurrent neural network achieved an accuracy of 83.0% on the test dataset, while the best convolutional neural network reached 83.34%. In addition, the convolutional model combined with a support vector machine classifier outperformed the standalone convolutional neural network. Overall, the proposed ensemble method, which takes a majority vote over the predictions of the five best models, reached an accuracy of 85.71%, which supports its practical usefulness.
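
The abstract mentions three ideas that a short example can make concrete: unigram and bigram features, two encodings of the sparse vector, and majority voting over several models. The sketch below is illustrative only and is not the authors' implementation. It assumes that “appearance” means a 0/1 indicator of whether a feature occurs and that “regularity” means how many times it occurs; the abstract does not define either term. The function names, the toy vocabulary, and the example predictions are all hypothetical.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return all n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def sparse_vector(tokens, vocabulary, mode="presence"):
    """Encode a tweet over a fixed vocabulary of unigrams and bigrams.

    mode="presence"  -> 1 if the feature occurs, else 0 (assumed "appearance")
    mode="frequency" -> number of occurrences (assumed "regularity")
    """
    counts = Counter(ngrams(tokens, 1) + ngrams(tokens, 2))
    if mode == "presence":
        return [1 if counts[term] > 0 else 0 for term in vocabulary]
    if mode == "frequency":
        return [counts[term] for term in vocabulary]
    raise ValueError("mode must be 'presence' or 'frequency'")


def majority_vote(predictions):
    """Combine per-model label lists by taking the most common label per sample.

    With an odd number of models and two classes there are no ties; otherwise
    Counter.most_common keeps the label that was seen first.
    """
    return [Counter(labels).most_common(1)[0][0] for labels in zip(*predictions)]


# Hypothetical toy data: one tokenized tweet and a five-term vocabulary.
tokens = "good good movie".split()
vocabulary = [("good",), ("movie",), ("good", "good"), ("good", "movie"), ("bad",)]
presence = sparse_vector(tokens, vocabulary, mode="presence")
frequency = sparse_vector(tokens, vocabulary, mode="frequency")

# Hypothetical predictions of five models on three tweets (1 = positive).
model_predictions = [
    [1, 0, 1],
    [1, 1, 0],
    [0, 0, 1],
    [1, 0, 1],
    [0, 1, 1],
]
ensemble = majority_vote(model_predictions)
```

In this toy case the two encodings differ only for the unigram “good”, which occurs twice in the tweet, and the ensemble label for each tweet is the one predicted by at least three of the five models.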

Keywords

Machine learning, deep learning, ensemble model, Twitter, sentiment analysis, sentiment classification

Bibliographic description

Radiuk P., Pavlova O., Hrypynska N. An ensemble machine learning approach for Twitter sentiment analysis. The 6th International Conference on Computational Linguistics and Intelligent Systems (CoLInS-2022). Volume I: Main Conference : CEUR-Workshop Proceedings. Vol. 3171. (Gliwice, Poland, 12-13 May 2022). Gliwice, 2022. Pp. 387-397. URL: http://ceur-ws.org/Vol-3171/paper32.pdf

URI

https://elar.khmnu.edu.ua/handle/123456789/12310

Collections

Department of Computer Science
