Detection of opinion spam based on anomalous rating deviation

by David Savage, Xiuzhen Zhang, Xinghuo Yu, Pauline Chou, Qingmai Wang

Released as an article.

2016  

Abstract

The publication of fake reviews by parties with vested interests has become a severe problem for consumers who use online product reviews in their decision making. To counter this problem, a number of methods for detecting these fake reviews, termed opinion spam, have been proposed. However, to date, many of these methods focus on analysis of review text, making them unsuitable for many review systems where accompanying text is optional or not possible. Moreover, these approaches are often computationally expensive, requiring extensive resources to handle text analysis over the scale of data typically involved. In this paper, we consider opinion spammers' manipulation of average ratings for products, focusing on differences between spammer ratings and the majority opinion of honest reviewers. We propose a lightweight, effective method for detecting opinion spammers based on these differences. This method uses binomial regression to identify reviewers having an anomalous proportion of ratings that deviate from the majority opinion. Experiments on real-world and synthetic data show that our approach is able to successfully identify opinion spammers. Comparison with the current state-of-the-art approach, also based only on ratings, shows that our method is able to achieve similar detection accuracy while removing the need for assumptions regarding probabilities of spam and non-spam reviews and reducing the heavy computation required for learning.
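
The idea of flagging reviewers with an anomalous proportion of deviating ratings can be illustrated with a minimal sketch. This is not the authors' binomial regression model, which the abstract does not specify in detail; it substitutes a simple one-sided binomial tail test. The function names, the use of the per-product median as the "majority opinion", the deviation threshold of 1.0, and the significance level of 0.05 are all assumptions made for illustration.

```python
from collections import defaultdict
from math import comb
from statistics import median


def binomial_tail(k, n, p):
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def flag_reviewers(ratings, threshold=1.0, alpha=0.05):
    """Flag reviewers whose share of deviating ratings is unusually high.

    ratings: iterable of (reviewer, product, score) tuples.
    threshold and alpha are illustrative assumptions, not values from the paper.
    """
    ratings = list(ratings)
    if not ratings:
        return set()

    # Assumption: the per-product median stands in for the majority opinion.
    scores_by_product = defaultdict(list)
    for _, product, score in ratings:
        scores_by_product[product].append(score)
    majority = {p: median(s) for p, s in scores_by_product.items()}

    # Count, per reviewer, how many ratings deviate from the majority opinion.
    deviating = defaultdict(int)
    total = defaultdict(int)
    for reviewer, product, score in ratings:
        total[reviewer] += 1
        if abs(score - majority[product]) > threshold:
            deviating[reviewer] += 1

    # Baseline deviation rate pooled over all ratings.
    p0 = sum(deviating.values()) / len(ratings)

    # Flag reviewers whose deviation count is improbable under the baseline.
    return {
        r for r in total
        if binomial_tail(deviating[r], total[r], p0) < alpha
    }
```

With five reviewers who each give ten products a score of 4 and one reviewer who gives all ten a score of 1, only the last reviewer is flagged. A regression formulation, as named in the abstract, would go further than this pooled baseline by modelling the deviation probability rather than fixing it.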

Type  article
Stage   submitted
Date   2016-08-02
Version   v1
Language   en
arXiv  1608.00684v1