At a time when podcasts are booming and misinformation is harder than ever to track, researchers at the University of Stavanger (UiS) and SFI MediaFutures have developed a groundbreaking open-source tool to help tackle false claims in audio content. Associate Professor Vinay Setty and former UiS student Adam James Becker have introduced a first-of-its-kind platform for real-time podcast transcription, annotation, and fact-checking, described in their paper “Annotation Tool and Dataset for Fact-Checking Podcasts”.

Podcasts are powerful and popular, but largely unregulated. With millions of episodes published in dozens of languages, fact-checking them has been a nearly impossible task. Traditional tools fall short due to the length of audio, lack of accessible data, and the challenges of multilingual processing.

A Smarter Way to Fact-Check Audio

The new tool changes that. By combining speech recognition technology with an easy-to-use annotation interface, the platform allows users to listen to podcast audio while highlighting claims, fixing transcription errors, and noting when statements need verification. It supports over 90 languages and integrates with open-source systems like OpenAI’s Whisper for transcription and F-Coref for co-reference resolution.

Users can also log why a statement should be fact-checked: whether it’s potentially harmful, surprising, or simply worth a second look. All this happens in a streamlined interface designed for researchers, journalists, and anyone interested in the truth behind the talk.
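To make the annotation workflow more concrete, here is a minimal Python sketch of how a transcribed episode and its claim annotations could be represented. It is an illustration only and is not taken from the tool's code: the class names (Episode, Segment, CheckReason), the field names, and the sample sentences are all hypothetical, and the real tool may store its data quite differently.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class CheckReason(Enum):
    """Hypothetical reasons an annotator might give for flagging a claim."""
    HARMFUL = "potentially harmful"
    SURPRISING = "surprising"
    WORTH_A_LOOK = "worth a second look"


@dataclass
class Segment:
    """One transcribed stretch of audio; start and end are in seconds."""
    start: float
    end: float
    text: str
    is_claim: bool = False
    reason: Optional[CheckReason] = None


@dataclass
class Episode:
    """A podcast episode with its transcript segments (assumed structure)."""
    title: str
    language: str
    segments: List[Segment] = field(default_factory=list)

    def correct_text(self, index: int, new_text: str) -> None:
        """Fix a transcription error in one segment."""
        self.segments[index].text = new_text

    def flag_claim(self, index: int, reason: CheckReason) -> None:
        """Mark a segment as a claim and record why it should be checked."""
        self.segments[index].is_claim = True
        self.segments[index].reason = reason

    def claims_to_check(self) -> List[Segment]:
        """Return only the segments that were flagged as claims."""
        return [s for s in self.segments if s.is_claim]


# Invented sample data, standing in for the output of a speech recognizer.
episode = Episode(
    title="Example episode",
    language="en",
    segments=[
        Segment(0.0, 4.2, "Welcome back to the show."),
        Segment(4.2, 9.8, "Coffee consumtion doubled last year."),
    ],
)

# The annotator fixes a transcription typo, then flags the claim.
episode.correct_text(1, "Coffee consumption doubled last year.")
episode.flag_claim(1, CheckReason.SURPRISING)

flagged = episode.claims_to_check()
for seg in flagged:
    print(f"[{seg.start:.1f}-{seg.end:.1f}s] {seg.text} ({seg.reason.value})")
```

Keeping the timestamps on every segment is what would let an interface play the matching audio while the annotator reads and corrects the text.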

Big Dataset, Real Impact

The researchers have also released a large, openly available dataset featuring 531 podcast episodes from 38 shows in English, Norwegian, and German. It includes detailed annotations for claims, fact-checking motivations, and verification results. The dataset supports advanced tasks like claim detection and stance classification and has already been used to fine-tune AI models such as XLM-RoBERTa, offering competitive results when compared with tools like GPT-4.

“This is about giving people the tools to check facts in the formats they actually consume,” said Professor Setty. “Most fact-checking today focuses on text. But more and more information is shared through podcasts—and we need systems that can keep up.”

Open and Accessible

The entire system, including the tool and datasets, is freely available on GitHub: https://github.com/factiverse/factcheck-podcasts

With support from SFI MediaFutures and the Research Council of Norway, the team aims to expand the project and make podcast fact-checking easier and more reliable for everyone.


Vinay Setty

Associate Professor, UiS

Dr. Vinay Setty is an Associate Professor at the Department of Electrical Engineering and Computer Science. Before that, he was an Assistant Professor at Aalborg University in Denmark and a Postdoctoral Researcher at the Max Planck Institute for Informatics. Setty received his PhD from the University of Oslo, Norway.

 

2025

Setty, Vinay; Becker, Adam James. Annotation Tool and Dataset for Fact-Checking Podcasts. Conference. The Web Conference 2025, 2025.

 

2024

Aarnes, Peter Røysland; Setty, Vinay; Galuščáková, Petra. IAI Group at CheckThat! 2024: Transformer Models and Data Augmentation for Checkworthy Claim Detection. Conference. Conference and Labs of the Evaluation Forum, 2024.

2023

Opdahl, Andreas L.; Tessem, Bjørnar; Dang-Nguyen, Duc-Tien; Motta, Enrico; Setty, Vinay; Throndsen, Eivind; Tverberg, Are; Trattner, Christoph. Trustworthy Journalism Through AI. Journal Article. In: Data & Knowledge Engineering (DKE), Elsevier, 2023.