Abstract
The spread of fake news and disinformation can have profound consequences, e.g. social conflict, distrust in media, and political instability. Fake news identification is a well-established area of natural language processing (NLP). Given its recent success on English, it is currently used as a tool by a variety of organizations, including companies and large media houses. However, fake news identification still poses a challenge for languages other than English and for low-resource languages. Bidirectional encoders trained with masked language modeling, e.g. bidirectional encoder representations from Transformers (BERT) and multilingual BERT (mBERT), produce state-of-the-art results in numerous NLP tasks. This transfer learning strategy is very effective when labeled data is scarce, especially in low-resource scenarios. This paper investigates the application of BERT to fake news identification in Brazilian Portuguese. In addition to BERT, we also tested a number of widely used machine learning (ML) algorithms, methods and strategies for this task. We found that fake news identification models built using advanced ML algorithms, including BERT, performed very well in this task; BERT was the best-performing model, producing an F1 score of 98.4 on the hold-out test set.
| Original language | English |
|---|---|
| Title of host publication | Natural Language Processing and Information Systems - 27th International Conference on Applications of Natural Language to Information Systems, NLDB 2022, Proceedings |
| Editors | Paolo Rosso, Valerio Basile, Raquel Martínez, Elisabeth Métais, Farid Meziane |
| Publisher | Springer Science and Business Media Deutschland GmbH |
| Pages | 111-118 |
| Number of pages | 8 |
| ISBN (Print) | 9783031084720 |
| Publication status | Published - 2022 |
| Externally published | Yes |
| Event | 27th International Conference on Applications of Natural Language to Information Systems, NLDB 2022 - Valencia, Spain. Duration: 15 Jun 2022 → 17 Jun 2022 |
Publication series
| Name | Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) |
|---|---|
| Volume | 13286 LNCS |
| ISSN (Print) | 0302-9743 |
| ISSN (Electronic) | 1611-3349 |
Conference
| Conference | 27th International Conference on Applications of Natural Language to Information Systems, NLDB 2022 |
|---|---|
| Country/Territory | Spain |
| City | Valencia |
| Period | 15/06/22 → 17/06/22 |
UN SDGs
This output contributes to the following UN Sustainable Development Goals (SDGs)
- SDG 16 Peace, Justice and Strong Institutions
Keywords
- Deep learning
- Fact checking
- Fake news identification
Fingerprint
Dive into the research topics of 'Identifying Fake News in Brazilian Portuguese'. Together they form a unique fingerprint.