A Machine Learning Pipeline to Examine Political Bias with Congressional Speeches
release_e6vwafinfrel7ellyygla7qfr4

by Prasad Hajare, Sadia Kamal, Siddharth Krishnan, Arunkumar Bagavathi

Released as an article.

2021  

Abstract

Computational methods to model political bias in social media face several challenges due to the heterogeneity, high dimensionality, multiple modalities, and scale of the data. Political bias in social media has been studied from multiple viewpoints, such as media bias, political ideology, echo chambers, and controversies, using machine learning pipelines. Most current methods rely heavily on manually labeled ground-truth data for the underlying political bias prediction tasks. Limitations of such methods include human-intensive labeling, labels that relate to only a specific problem, and the inability to determine the near-future bias state of a social media conversation. In this work, we address these problems and give machine learning approaches to study political bias in two ideologically diverse social media forums, Gab and Twitter, without the availability of human-annotated data. Our proposed methods use transcripts collected from political speeches in the US Congress to label the data, and achieve highest accuracies of 70.5% and 65.1% on Twitter and Gab data, respectively, in predicting political bias. We also present a machine learning approach that combines features from cascades and text to forecast a cascade's political bias with an accuracy of about 85%.
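The idea of using congressional speech transcripts in place of human-annotated posts can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the abstract does not name a classifier, so a small multinomial Naive Bayes model is swapped in here, and the party labels "R" and "D", the helper names, and the example sentences are all invented placeholders rather than data from the paper.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Naive whitespace tokenizer; a real pipeline would need proper preprocessing.
    return text.lower().split()


def train_naive_bayes(labeled_docs):
    """Fit word counts per label from (text, label) pairs."""
    word_counts = defaultdict(Counter)
    doc_counts = Counter()
    vocab = set()
    for text, label in labeled_docs:
        tokens = tokenize(text)
        word_counts[label].update(tokens)
        doc_counts[label] += 1
        vocab.update(tokens)
    return word_counts, doc_counts, vocab


def predict(model, text):
    """Return the label with the highest Laplace-smoothed log-probability."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(doc_counts):
        score = math.log(doc_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for tok in tokenize(text):
            if tok in vocab:  # ignore words never seen in the speeches
                score += math.log(
                    (word_counts[label][tok] + 1) / (total_words + len(vocab))
                )
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical stand-ins for speech transcripts, labeled by the speaker's party.
speeches = [
    ("we must cut taxes and reduce regulation", "R"),
    ("secure the border and cut spending", "R"),
    ("expand health care coverage for families", "D"),
    ("protect voting rights and expand health care", "D"),
]
model = train_naive_bayes(speeches)

# Hypothetical unlabeled social media posts, labeled by the speech-trained model.
posts = ["cut taxes now", "health care for all families"]
predicted = [predict(model, p) for p in posts]
print(predicted)
```

The point of the sketch is only the direction of the label transfer: the model is trained on text whose labels come for free from speaker affiliation, then applied to posts that have no annotation at all.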

Archived Files and Locations

application/pdf  446.6 kB
file_pyok7ojz75cv5p5bhmx5cgsvvi
arxiv.org (repository)
web.archive.org (webarchive)
Preserved and Accessible
Type: article
Stage: submitted
Date: 2021-09-18
Version: v1
Language: en
arXiv: 2109.09014v1
Work Entity
access all versions, variants, and formats of this work (e.g., pre-prints)
Catalog Record
Revision: d2822103-d077-42bd-8464-ee3bac472565