Authors:
Debora Nozza, Elisabetta Fersini and Enza Messina
Affiliation:
University of Milano-Bicocca, Italy
Keyword(s):
Irony Detection, Unsupervised Learning, Probabilistic Model, Word Embeddings.
Related Ontology Subjects/Areas/Topics:
Artificial Intelligence; Business Analytics; Clustering and Classification Methods; Computational Intelligence; Data Analytics; Data Engineering; Evolutionary Computing; Knowledge Discovery and Information Retrieval; Knowledge-Based Systems; Machine Learning; Mining Text and Semi-Structured Data; Soft Computing; Symbolic Systems
Abstract:
The automatic detection of figurative language, such as irony and sarcasm, is one of the most challenging tasks
of Natural Language Processing (NLP). This is because machine learning methods can be easily misled by the
presence of words that have a strong polarity but are used ironically, which means that the opposite polarity
was intended. In this paper, we propose an unsupervised framework for domain-independent irony detection.
In particular, to derive an unsupervised Topic-Irony Model (TIM), we built upon an existing probabilistic topic
model initially introduced for sentiment analysis purposes. Moreover, in order to improve its generalization
abilities, we took advantage of Word Embeddings to obtain domain-aware ironic orientation of words. This is
the first work that addresses this task in unsupervised settings and the first study on the topic-irony distribution.
Experimental results show that TIM is comparable to, and sometimes better than, supervised state-of-the-art approaches for irony detection. Moreover, integrating the probabilistic model with word embeddings (TIM+WE) yields promising results in a more complex, real-world scenario.
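The "domain-aware ironic orientation of words" obtained from word embeddings can be illustrated with a minimal sketch. This is not the paper's actual TIM+WE formulation: it assumes, purely for illustration, that a word's orientation is scored as cosine similarity to the centroid of a small set of seed words, and it uses toy hand-made vectors where a real system would use embeddings trained on the target domain.

```python
# Illustrative sketch only: score a word's "ironic orientation" as the cosine
# similarity between its embedding and the centroid of seed-word embeddings.
# The function names, seed set, and 3-d toy vectors are all hypothetical.
import numpy as np

def cosine(a, b):
    """Cosine similarity between two vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def ironic_orientation(word_vec, seed_vecs):
    """Similarity of a word vector to the centroid of seed (ironic) vectors."""
    centroid = np.mean(seed_vecs, axis=0)
    return cosine(word_vec, centroid)

# Toy 3-d embeddings standing in for domain-trained word vectors.
seeds = np.array([[1.0, 0.2, 0.0], [0.9, 0.1, 0.1]])
print(ironic_orientation(np.array([1.0, 0.15, 0.05]), seeds))  # close to 1.0
print(ironic_orientation(np.array([0.0, 0.0, 1.0]), seeds))    # close to 0.0
```

A word whose vector lies near the ironic seed centroid receives a high score; an unrelated word receives a low one. In a full system, these scores could serve as priors over word-level ironic orientation inside a probabilistic topic model.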