Supporting verification of news articles with automated search for semantically similar articles

Fake information poses one of the major threats to society in the 21st century. Identifying misinformation has become a key challenge due to the amount of fake news that is published daily. Yet no established approach addresses the dynamics and versatility of fake news editorials. Instead of classifying content, we propose an evidence retrieval approach to handle fake content. The learning task is formulated as an unsupervised machine learning problem. We provide the reader with a set of documents from reliable news sources that support the hypothesis of a text, and the final decision is left to the reader. Technically, we propose a two-step process: (i) high-recall step: With information extracted from the given text, we query for similar content from reliable news sources. (ii) high-precision step: We narrow the supporting evidence down by measuring the semantic distance between the text and the collection from step (i). The distance is calculated based on Word2Vec and the Word Mover’s Distance. In our experiments, only content that is below a certain distance threshold is considered supporting evidence. We find that our approach is agnostic to concept drifts, i.e., the machine learning task is independent of the hypotheses in a text. This makes it highly adaptable in times when fake content is as diverse as classical news. Our pipeline offers the possibility for further analysis in the future, such as investigating bias and differences in news reporting.
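The two-step process described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy 2-D vectors stand in for trained Word2Vec embeddings, step (i) is approximated by simple word overlap instead of a query against reliable news sources, and the distance is the relaxed Word Mover's Distance (a cheaper lower bound on the exact Word Mover's Distance). All function names, the corpus, and the threshold of 0.5 are hypothetical.

```python
import math
from collections import Counter

# Toy 2-D vectors standing in for trained Word2Vec embeddings (assumption).
EMBEDDINGS = {
    "president": (1.0, 0.0), "leader": (0.9, 0.1),
    "speaks": (0.0, 1.0), "talks": (0.1, 0.9),
    "press": (0.5, 0.5), "media": (0.55, 0.45),
    "banana": (-1.0, -1.0), "prices": (-0.8, -0.9), "rise": (-0.9, -0.7),
}


def tokenize(text):
    """Lowercase, split on whitespace, keep only words that have a vector."""
    return [w for w in text.lower().split() if w in EMBEDDINGS]


def nbow(tokens):
    """Normalized bag-of-words weights that sum to 1."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}


def one_sided_cost(src, dst):
    """Send each source word's whole weight to its nearest target word."""
    return sum(
        weight * min(math.dist(EMBEDDINGS[w], EMBEDDINGS[v]) for v in dst)
        for w, weight in src.items()
    )


def relaxed_wmd(text_a, text_b):
    """Relaxed WMD: the larger of the two one-sided costs (a lower bound on WMD)."""
    a, b = nbow(tokenize(text_a)), nbow(tokenize(text_b))
    if not a or not b:
        return math.inf
    return max(one_sided_cost(a, b), one_sided_cost(b, a))


def retrieve_candidates(query, corpus):
    """Step (i), high recall: keep any article sharing a known word with the query."""
    query_words = set(tokenize(query))
    return [doc for doc in corpus if query_words & set(tokenize(doc))]


def supporting_evidence(query, corpus, threshold=0.5):
    """Step (ii), high precision: keep candidates below the distance threshold."""
    candidates = retrieve_candidates(query, corpus)
    scored = sorted((relaxed_wmd(query, doc), doc) for doc in candidates)
    return [doc for dist, doc in scored if dist < threshold]


# Hypothetical stand-in for a collection of articles from reliable sources.
CORPUS = [
    "leader speaks media",
    "president talks press",
    "banana prices rise",
    "president banana prices rise",
]
QUERY = "president speaks press"
EVIDENCE = supporting_evidence(QUERY, CORPUS)
print(EVIDENCE)
```

In this toy setup the fourth article passes step (i) because it shares a word with the query, but step (ii) discards it because its remaining words lie far from the query in the embedding space. With real data, the overlap filter would be replaced by an actual search and the toy vectors by a trained embedding model.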

  • Published in:
    Reducing Online Misinformation through Credible Information Retrieval Workshop (ROMCIR) at the European Conference on Information Retrieval (ECIR)
  • Type:
    Inproceedings
  • Authors:
    V. Gupta, K. Beckh, S. Giesselbach, D. Wegener, T. Wirtz
  • Year:
    2021

Citation information

V. Gupta, K. Beckh, S. Giesselbach, D. Wegener, T. Wirtz: Supporting verification of news articles with automated search for semantically similar articles, Reducing Online Misinformation through Credible Information Retrieval Workshop (ROMCIR) at the European Conference on Information Retrieval (ECIR), 2021, https://doi.org/10.48550/arXiv.2103.15581