Using deep learning neural networks to classify toxic comments on social media

D. V. Zakharenko

https://orcid.org/0009-0000-0306-5684

DOI: https://doi.org/10.47813/2782-5280-2023-2-4-0119-0133

Keywords: artificial neural networks, deep learning, text classification, text preprocessing, toxic comments, social networks, digital civility


Abstract

This study investigates the use of deep artificial neural networks to classify toxic comments on social networks. Toxic interactions on these platforms have reached an all-time high, which has lowered the level of digital civility. Platform moderators have to spend considerable time and effort keeping negative comments under control. The study examines various algorithms and methods for building artificial neural networks and compares the performance of three selected models to determine which is most effective for this task. Comments from Wikipedia discussion pages serve as the data for building the classification models. The study includes an overview of the methods used to achieve the target results with Python and its libraries. It also covers technical aspects such as building, training, and evaluating artificial neural network models. The necessary theoretical foundations are reviewed, and some previous studies and solutions are discussed. Classifying the nature of hateful comments will give platforms flexibility in dealing with them and open the door to new discussions and solutions.


Author Biography

D. V. Zakharenko

Danil Zakharenko, Siberian Federal University, Institute of Space and Information Technologies, Department of Software Engineering, Krasnoyarsk, Russia


References

Data Preprocessing in Machine learning. URL: https://blogs.microsoft.com/on-the-issues/2020/02/10/digital-civility-lowest. (accessed: 14.09.2023).

Javatpoint. URL: https://www.javatpoint.com/data-preprocessing-machine-learning. (accessed: 17.09.2023).

NLTK. nltk.tokenize package. NLTK 3.8.1. URL: https://www.nltk.org/api/nltk.tokenize.html. (accessed: 19.09.2023).

Brownlee, J. Why One-Hot Encode Data in Machine Learning? URL: https://machinelearningmastery.com/why-one-hot-encode-data-in-machine-learning. (accessed: 21.09.2023).

WordNet. A Lexical Database for English. URL: https://wordnet.princeton.edu. (accessed: 23.09.2023).

Datastart. A Gentle Introduction to Natural Language Processing (NLP) (in Russian). URL: https://datastart.ru/blog/read/plavnoe-vvedenie-v-natural-language-processing-nlp. (accessed: 25.09.2023).

Brownlee, J. Data Preparation for Variable Length Input Sequences. URL: https://machinelearningmastery.com/data-preparation-variable-length-input-sequences-sequence-prediction. (accessed: 27.09.2023).

TensorFlow. tf.keras.layers.Dropout. TensorFlow Core v2.14.0. URL: https://www.tensorflow.org/api_docs/python/tf/keras/layers/Dropout. (accessed: 29.09.2023).

TensorFlow. tf.keras.layers.Dense. TensorFlow Core v2.14.0. URL: https://www.tensorflow.org/api_docs/python/tf/keras/layers/Dense. (accessed: 03.10.2023).

Randolph. Deep Learning for Multi-Label Text Classification. URL: https://github.com/RandolphVI/Multi-Label-Text-Classification. (accessed: 14.10.2023).
