Personalized Intended and Perceived Sarcasm Detection on Twitter
Sarcasm detection is a challenging task for various NLP applications. It often requires additional context related to the conversation or participants involved to interpret the intended meaning. In this work, we introduce an extended reactive supervision method to collect sarcastic data from Twitter and improve the quality of the data that is extracted. Our new dataset contains around 35K labeled tweets (sarcastic or non-sarcastic), as well as additional tweets that provide both conversational and author context. The experiments focus on two tasks: the binary classification of sarcastic vs. non-sarcastic tweets, and intended vs. perceived sarcasm. We compare models that use only the textual features of tweets with models that additionally use author embeddings derived from the authors' historical tweets. Moreover, we show the importance of combining conversational features with author features.
- Published in: Workshop on Computational Linguistics for the Political and Social Sciences
- Type: Inproceedings
- Authors: Plepi, Joan; Buski, Magdalena; Flek, Lucie
- Year: 2023
Citation information
Plepi, Joan; Buski, Magdalena; Flek, Lucie: Personalized Intended and Perceived Sarcasm Detection on Twitter, Workshop on Computational Linguistics for the Political and Social Sciences, 2023, https://aclanthology.org/2023.cpss-1.2/, Plepi.etal.2023a
@Inproceedings{Plepi.etal.2023a,
author={Plepi, Joan and Buski, Magdalena and Flek, Lucie},
title={Personalized Intended and Perceived Sarcasm Detection on Twitter},
booktitle={Workshop on Computational Linguistics for the Political and Social Sciences},
url={https://aclanthology.org/2023.cpss-1.2/},
year={2023},
abstract={Sarcasm detection is a challenging task for various NLP applications. It often requires additional context related to the conversation or participants involved to interpret the intended meaning. In this work, we introduce an extended reactive supervision method to collect sarcastic data from Twitter and improve the quality of the data that is extracted. Our new dataset contains around 35K labeled...}}