AMR-CNN: Abstract Meaning Representation with Convolution Neural Network for Toxic Content Detection
DOI:
https://doi.org/10.13052/jwe1540-9589.2135
Keywords:
Toxic content detection, Text analysis, Abstract meaning representation, Convolution neural network, Natural Language Processing
Abstract
Recognizing offensive, abusive, and profane multimedia content on the web is an ongoing challenge in maintaining a web environment that supports users' freedom of speech. As profanity filters have been developed and applied to text, audio, and video on social media, entertainment, and education platforms, the number of methods for tricking these web-based applications has also increased, creating a new problem to solve. In contrast to common toxic content detection systems, which rely on lexicon- and keyword-based detection, this work takes a different approach based on the meaning of the sentence. A meaning representation is a way to capture the meaning of linguistic input. This work proposes a data-driven approach that uses Abstract Meaning Representation (AMR) to extract the meaning of online text content and feeds it into a convolutional neural network to detect the level of profanity. The proposed model is evaluated on two datasets: the Offensive Language Identification Dataset, and the Offensive Hate dataset merged with the Twitter Sentiment Analysis dataset. The results indicate that the proposed model performs effectively and achieves satisfactory accuracy in recognizing the toxicity level of online text content.
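To make the idea of passing a meaning representation to a convolutional network more concrete, the following is a minimal sketch, not the authors' architecture. It assumes the AMR graph is available as a PENMAN-notation string, flattens it into concept and role tokens, and applies a one-dimensional convolution with max-over-time pooling. The example graph, the function names (`linearize`, `embed`, `conv1d_max`, `toxicity_score`), and all hyperparameters are hypothetical, and the weights are random and untrained, so the returned score carries no meaning beyond showing the data flow.

```python
import math
import random
import re

# Hypothetical AMR graph (PENMAN notation) for "The boy wants to go."
AMR = "(w / want-01 :ARG0 (b / boy) :ARG1 (g / go-01 :ARG0 b))"


def linearize(amr):
    """Flatten a PENMAN string into concept and role tokens.

    Variables, brackets, and re-entrant variable references are dropped.
    """
    tokens = re.findall(r":[A-Za-z0-9-]+|/\s*[^\s()]+", amr)
    return [t.lstrip("/ ") if t.startswith("/") else t for t in tokens]


def embed(tokens, dim, rng):
    """Assign each distinct token a random vector (stand-in for learned embeddings)."""
    table = {}
    for tok in tokens:
        if tok not in table:
            table[tok] = [rng.uniform(-1, 1) for _ in range(dim)]
    return [table[tok] for tok in tokens]


def conv1d_max(seq, filt):
    """Slide one filter (width x dim) over the sequence, apply ReLU, then max-pool over time."""
    width = len(filt)
    best = 0.0  # starting at 0 folds the ReLU into the running maximum
    for i in range(len(seq) - width + 1):
        act = sum(
            seq[i + j][d] * filt[j][d]
            for j in range(width)
            for d in range(len(filt[j]))
        )
        best = max(best, act)
    return best


def toxicity_score(amr, dim=8, n_filters=4, width=3, seed=0):
    """Return a sigmoid score in [0, 1] from untrained, randomly initialised weights."""
    rng = random.Random(seed)
    seq = embed(linearize(amr), dim, rng)
    filters = [
        [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(width)]
        for _ in range(n_filters)
    ]
    pooled = [conv1d_max(seq, f) for f in filters]
    out_w = [rng.uniform(-1, 1) for _ in range(n_filters)]
    z = sum(p * w for p, w in zip(pooled, out_w))
    return 1.0 / (1.0 + math.exp(-z))


if __name__ == "__main__":
    print(linearize(AMR))
    print(toxicity_score(AMR))
```

In a real system the AMR string would come from an AMR parser, and the embeddings, filters, and output weights would be learned from labelled data such as the datasets named in the abstract.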