Detection of Medical Conspiracy Theories with Limited Resources: Using Data from Prior Epidemics and LLMs
Online dissemination of conspiracy theories (CTs) during epidemics poses significant risks to public health. This paper addresses the problem of detecting CTs in social media posts with an emphasis on the resource-constrained scenarios characterized by the absence of labeled datasets and the high cost of expert annotation. To address these challenges, we investigate resource-efficient methods for CT detection across multiple epidemics. We construct a novel dataset of CT-labeled social media posts covering four major epidemics from the past decade: Ebola, Zika, COVID-19, and Monkeypox. We conduct extensive experiments addressing four research questions: (1) the performance of BERT-like models on individual epidemics, (2) the ability to transfer knowledge from past epidemics to new ones, (3) the efficacy of zero-shot classification using Large Language Models (LLMs), and (4) the feasibility of training BERT-like models on LLM-labeled datasets. Our findings indicate that BERT-like models exhibit highly variable performance across epidemics. Transfer learning from prior epidemics can be effective, and its performance can improve as the number of prior datasets increases. Zero-shot LLM classifiers, including ensemble methods, achieve performance that matches or surpasses that of fine-tuned BERT-like models. Finally, we demonstrate that BERT-like models trained on LLM-labeled datasets achieve results close to those of models trained on expert-annotated data, offering a practical alternative when expert labeling is infeasible. While automated methods can be useful for data analysis, we caution against automating content filtering due to the inherent difficulty of CT detection and the potential biases of language models.
- Published in: Authorea
- Type: Article
- Authors: Ipek Baris Schlicht, Damir Korenčić, Berta Chulvi, Lucie Flek, Paolo Rosso
- Year: 2025
- Source: https://www.authorea.com/users/915424/articles/1288454-detection-of-medical-conspiracy-theories-with-limited-resources-using-data-from-prior-epidemics-and-llms
Citation information
Ipek Baris Schlicht, Damir Korenčić, Berta Chulvi, Lucie Flek, Paolo Rosso: Detection of Medical Conspiracy Theories with Limited Resources: Using Data from Prior Epidemics and LLMs, Authorea, 2025, https://www.authorea.com/users/915424/articles/1288454-detection-of-medical-conspiracy-theories-with-limited-resources-using-data-from-prior-epidemics-and-llms, Schlicht.etal.2025b
@Article{Schlicht.etal.2025b,
author={Schlicht, Ipek Baris and Korenčić, Damir and Chulvi, Berta and Flek, Lucie and Rosso, Paolo},
title={Detection of Medical Conspiracy Theories with Limited Resources: Using Data from Prior Epidemics and {LLMs}},
journal={Authorea},
url={https://www.authorea.com/users/915424/articles/1288454-detection-of-medical-conspiracy-theories-with-limited-resources-using-data-from-prior-epidemics-and-llms},
year={2025},
abstract={Online dissemination of conspiracy theories (CTs) during epidemics poses significant risks to public health. This paper addresses the problem of detecting CTs in social media posts with an emphasis on the resource-constrained scenarios characterized by the absence of labeled datasets and the high cost of expert annotation. To address these challenges, we investigate resource-efficient methods for...}}