Using LLMs to discover emerging coded antisemitic hate-speech in extremist social media

Kikkisetti, Dhanush; Mustafa, Raza Ul; Melillo, Wendy; Corizzo, Roberto; Boukouvalas, Zois; Gill, Jeff; Japkowicz, Nathalie

arXiv:2401.10841 (cs)

[Submitted on 19 Jan 2024 (v1), last revised 23 Jan 2024 (this version, v2)]

Abstract: The proliferation of online hate speech has created a difficult problem for social media platforms. A particular challenge is the use of coded language by groups interested both in creating a sense of belonging for their members and in evading detection. Coded language evolves quickly, and its use varies over time. This paper proposes a methodology for detecting emerging coded hate-laden terminology and tests it in the context of online antisemitic discourse. The approach considers posts scraped from social media platforms often used by extremist users. The posts are scraped using seed expressions related to previously known discourse of hatred towards Jews. The method begins by identifying the expressions most representative of each post and calculating their frequency in the whole corpus. It filters out grammatically incoherent expressions as well as previously encountered ones, so as to focus on emergent, well-formed terminology. This is followed by an assessment of semantic similarity to known antisemitic terminology using a fine-tuned large language model, and the subsequent filtering out of expressions that are too distant from known expressions of hatred. Emergent antisemitic expressions containing terms clearly relating to Jewish topics are then removed, so that only coded expressions of hatred are returned.
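
The filtering stages described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: every function and parameter name is hypothetical, word bigrams stand in for the "most representative expressions" step, a stopword check stands in for the grammatical-coherence filter, a word-overlap score stands in for the fine-tuned language model, and the `min_freq` and `min_sim` thresholds are invented for the example. The demo data uses neutral placeholder tokens only.

```python
from collections import Counter
from typing import Callable, Iterable

STOPWORDS = {"the", "a", "of"}  # assumption: tiny placeholder list


def candidate_expressions(post: str, n: int = 2) -> list[str]:
    """Placeholder extraction step: all word n-grams of a post."""
    words = post.lower().split()
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]


def looks_well_formed(expr: str) -> bool:
    """Placeholder coherence check: reject expressions that start or end on a stopword."""
    words = expr.split()
    return words[0] not in STOPWORDS and words[-1] not in STOPWORDS


def word_overlap(a: str, b: str) -> float:
    """Placeholder similarity (Jaccard over words); the paper uses a fine-tuned LLM here."""
    sa, sb = set(a.split()), set(b.split())
    return len(sa & sb) / len(sa | sb)


def discover_coded_terms(
    posts: Iterable[str],
    known_terms: set[str],
    explicit_topic_words: set[str],
    similarity: Callable[[str, str], float] = word_overlap,
    min_freq: int = 2,     # assumption: frequency cut-off
    min_sim: float = 0.3,  # assumption: similarity cut-off
) -> list[str]:
    # 1. Extract candidate expressions and count them over the whole corpus.
    counts = Counter(e for post in posts for e in candidate_expressions(post))
    found = []
    for expr, freq in counts.most_common():
        if freq < min_freq:
            continue
        # 2. Drop incoherent expressions and previously encountered ones.
        if not looks_well_formed(expr) or expr in known_terms:
            continue
        # 3. Keep only expressions close enough to some known term.
        best = max((similarity(expr, k) for k in known_terms), default=0.0)
        if best < min_sim:
            continue
        # 4. Drop expressions that name the topic explicitly; what remains is "coded".
        if any(w in explicit_topic_words for w in expr.split()):
            continue
        found.append(expr)
    return found


posts = [
    "the new phrase appears here",
    "the new phrase again",
    "seed phrase shown",
    "seed phrase twice",
    "topicword phrase x",
    "topicword phrase y",
]
found = discover_coded_terms(posts, {"seed phrase"}, {"topicword"})
print(found)
```

In this toy run, "seed phrase" is dropped as already known, "the new" fails the coherence check, and "topicword phrase" is dropped as explicit, leaving only "new phrase". In a real setting the `similarity` argument would be replaced by embeddings from a language model; it is passed in as a parameter so that the surrounding filters do not depend on that choice.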

Comments:	9 pages, 4 figures, 2 algorithms, 3 tables
Subjects:	Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Information Retrieval (cs.IR); Machine Learning (cs.LG)
Cite as:	arXiv:2401.10841 [cs.CL]
	(or arXiv:2401.10841v2 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2401.10841

Submission history

From: Nathalie Japkowicz Ph.D.
[v1] Fri, 19 Jan 2024 17:40:50 UTC (1,336 KB)
[v2] Tue, 23 Jan 2024 20:05:30 UTC (1,462 KB)
