
BERT-based Classifiers for Fake News Detection on Short and Long Texts with Noisy Data: A Comparative Analysis

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Free, uncontrolled access to the Internet is the main reason fake news spreads, both on social media and in regular online publications. In this paper we study the potential of several BERT-based models to detect fake news related to politics. Our contribution consists of testing the BERT, RoBERTa and MNLI RoBERTa models with (a) short and long texts; (b) ensembling of the best models; (c) noisy texts. To improve ensembling, we introduce an additional class, ‘Doubtful news’. To create noisy data we use cross-translation. For the experiments we consider the well-known FRN (Fake vs. Real News, long texts) and LIAR (short texts) datasets. The results obtained on the long-text dataset are higher than those obtained on the short-text dataset. The proposed approach to ensembling provided a significant improvement in the results. The experiments with noisy data demonstrated high noise immunity of the BERT model on long news and of the RoBERTa model on short news.
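The abstract does not say how the ‘Doubtful news’ class is assigned, so the following is only an illustrative sketch of one way an ensemble could use such a class, not the authors' method. It assumes each model outputs a probability that an item is fake; the function name `ensemble_label`, the `margin` parameter, and the 0.5 decision threshold are all hypothetical choices made for this example.

```python
from statistics import mean


def ensemble_label(fake_probs, margin=0.2):
    """Combine per-model P(fake) scores into 'fake', 'real', or 'doubtful'.

    Hypothetical rule (an assumption, not taken from the paper):
    an item is 'doubtful' when the models disagree on the side of 0.5,
    or when the averaged score lies within `margin` of 0.5.
    """
    if not fake_probs:
        raise ValueError("need at least one model score")

    # Each model's hard vote: True means "fake".
    votes = [p >= 0.5 for p in fake_probs]
    avg = mean(fake_probs)

    # Disagreement between models, or a low-confidence average.
    if len(set(votes)) > 1 or abs(avg - 0.5) < margin:
        return "doubtful"
    return "fake" if avg >= 0.5 else "real"


if __name__ == "__main__":
    # Scores below are made-up inputs, not results from the paper.
    for scores in ([0.9, 0.8], [0.1, 0.2], [0.9, 0.2], [0.55, 0.6]):
        print(scores, "->", ensemble_label(scores))
```

Under this rule, only items on which all models agree with a confident average receive a hard ‘fake’ or ‘real’ label; everything else is routed to the third class.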

Original language: English
Title of host publication: Text, Speech, and Dialogue - 25th International Conference, TSD 2022, Proceedings
Editors: Petr Sojka, Aleš Horák, Ivan Kopeček, Karel Pala
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 263-274
Number of pages: 12
ISBN (Print): 9783031162695
Publication status: Published - 2022
Event: 25th International Conference on Text, Speech, and Dialogue, TSD 2022 - Brno, Czech Republic
Duration: 6 Sep 2022 → 9 Sep 2022

Publication series

Name: Lecture Notes in Computer Science
Volume: 13502 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 25th International Conference on Text, Speech, and Dialogue, TSD 2022
Country/Territory: Czech Republic
City: Brno
Period: 6/09/22 → 9/09/22

Keywords

  • BERT
  • Ensembling
  • Fake News
  • MNLI RoBERTa
  • Noise Immunity
  • RoBERTa
