SARCASM DETECTION IN PERSIAN

by Zahra Bokaee Nezhad, Mohammad Ali Deihimi

Published in Journal of Information and Communication Technology by UUM Press, Universiti Utara Malaysia.

2020   Volume 20

Abstract

Sarcasm is a form of communication in which the speaker states the opposite of what is meant. Detecting a sarcastic tone is therefore difficult because of its ambiguous nature. At the same time, identifying sarcasm is vital to various natural language processing tasks such as sentiment analysis and text summarisation. However, research on sarcasm detection in Persian is very limited. This paper investigated sarcasm detection on Persian tweets by combining deep learning-based and machine learning-based approaches. Four sets of features that cover different types of sarcasm were proposed: deep polarity, sentiment, part of speech, and punctuation features. These features were used to classify tweets as sarcastic or non-sarcastic. The deep polarity feature was obtained by conducting a sentiment analysis with a deep neural network architecture. To extract the sentiment feature, a Persian sentiment dictionary consisting of four sentiment categories was developed. The study also used a new Persian proverb dictionary in the preparation step to improve the accuracy of the proposed model. The performance of the model was analysed using several standard machine learning algorithms. The experimental results showed that the method outperformed the baseline method and reached an accuracy of 80.82%. The study also examined the importance of each proposed feature set and evaluated its added value to the classification.
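
As a rough illustration of what a punctuation feature set might look like, the sketch below counts a few punctuation cues in a tweet and turns them into a fixed-order vector that a standard classifier could consume. The abstract does not list the actual cues, so the choice of marks here and the names `punctuation_features` and `to_vector` are assumptions made for illustration, not the authors' definitions.

```python
import re


def punctuation_features(tweet: str) -> dict:
    """Count punctuation cues in a tweet.

    The cues chosen here are hypothetical; the paper's own
    punctuation feature set is not specified in the abstract.
    """
    return {
        "exclamations": tweet.count("!"),
        # Count both the Latin and the Persian/Arabic question mark.
        "questions": tweet.count("?") + tweet.count("؟"),
        # Runs of two or more dots, e.g. "..." or "....".
        "ellipses": len(re.findall(r"\.{2,}", tweet)),
        # Straight quotes plus the guillemets commonly used in Persian text.
        "quotes": tweet.count('"') + tweet.count("«") + tweet.count("»"),
    }


def to_vector(features: dict) -> list:
    """Flatten the feature dict into a list ordered by feature name."""
    return [features[name] for name in sorted(features)]


example = punctuation_features("Great... another Monday!!")
example_vector = to_vector(example)
```

In a fuller pipeline, vectors like `example_vector` would be concatenated with the other three feature sets before being passed to a classifier; that step is omitted here because the abstract gives no detail on it.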

Archived Files and Locations

application/pdf  1.4 MB
file_guxnfpolnfekdecuzzqg3d7wgy
e-journal.uum.edu.my (publisher)
web.archive.org (webarchive)
application/pdf  1.5 MB
file_dfsazbuhv5f53inwnwtdbayk5a
www.scienceopen.com (web)
web.archive.org (webarchive)
Type  article-journal
Stage   published
Date   2020-11-04
Language   en
Journal Metadata
Open Access Publication
In DOAJ
Not in Keepers Registry
ISSN-L:  1675-414X
Catalog Record
Revision: 8f068d95-a08d-45ee-ae1a-07873b086a9c