MultiProp Framework: Ensemble Models for Enhanced Cross-Lingual Propaganda Detection in Social Media and News using Data Augmentation, Text Segmentation, and Meta-Learning

Propaganda, a pervasive tool for influencing public opinion, demands robust automated detection systems, particularly for underresourced languages. Current efforts largely focus on well-resourced languages like English, leaving significant gaps in languages such as Arabic. This research addresses these gaps by introducing the MultiProp Framework, a crosslingual meta-learning framework designed to enhance propaganda detection across multiple languages, including Arabic, German, Italian, French, and English. We constructed a multilingual dataset using data translation techniques, beginning with Arabic data from the PTC and WANLP shared tasks, expanding it with translations into German, Italian, and French, and further enriching it with the SemEval23 dataset. Our proposed framework encompasses three distinct models: MultiProp-Baseline, which combines ensembles of pre-trained models such as GPT-2, mBART, and XLM-RoBERTa; MultiProp-ML, designed to handle languages with minimal or no training data by utilizing advanced meta-learning techniques; and MultiProp-Chunk, which overcomes the challenges of processing longer texts that exceed the token limits of pre-trained models. Together, they deliver superior performance compared to state-of-the-art methods, representing a significant advancement in the field of crosslingual propaganda detection.
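The abstract does not specify how MultiProp-Chunk segments long texts, so the following is only a minimal sketch of one common way to handle inputs that exceed a model's token limit: split the text into overlapping windows, score each window, and average the scores. The function names (`chunk_tokens`, `predict_long_text`), the whitespace tokenization, the window sizes, and the mean aggregation are all illustrative assumptions, not the authors' method.

```python
from typing import Callable, List


def chunk_tokens(tokens: List[str], max_len: int, stride: int) -> List[List[str]]:
    """Split tokens into windows of at most max_len, advancing by stride.

    Hypothetical helper: consecutive windows overlap by max_len - stride tokens.
    """
    if max_len <= 0 or stride <= 0 or stride > max_len:
        raise ValueError("require 0 < stride <= max_len")
    if not tokens:
        return []
    chunks = []
    start = 0
    while True:
        chunks.append(tokens[start:start + max_len])
        if start + max_len >= len(tokens):
            break  # the last window already reaches the end of the text
        start += stride
    return chunks


def predict_long_text(
    text: str,
    score_chunk: Callable[[List[str]], float],
    max_len: int = 512,
    stride: int = 256,
) -> float:
    """Average chunk-level scores into one document-level score.

    Assumption: whitespace splitting stands in for a real subword tokenizer,
    and score_chunk stands in for a fine-tuned classifier.
    """
    chunks = chunk_tokens(text.split(), max_len, stride)
    if not chunks:
        return 0.0
    scores = [score_chunk(chunk) for chunk in chunks]
    return sum(scores) / len(scores)
```

In practice, `score_chunk` would wrap a pre-trained classifier, and other aggregation rules (for example, taking the maximum chunk score) may suit propaganda detection better than the mean; which rule the paper uses is not stated here.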

  • Published in:
    Proceedings of the 1st Workshop on NLP for Languages Using Arabic Script
  • Type:
    Inproceedings
  • Authors:
    Aldabbas, Farizeh; Ashraf, Shaina; Sifa, Rafet; Flek, Lucie
  • Year:
    2025
  • Source:
    https://aclanthology.org/2025.abjadnlp-1.2/

Citation information

Aldabbas, Farizeh; Ashraf, Shaina; Sifa, Rafet; Flek, Lucie: MultiProp Framework: Ensemble Models for Enhanced Cross-Lingual Propaganda Detection in Social Media and News using Data Augmentation, Text Segmentation, and Meta-Learning, Proceedings of the 1st Workshop on NLP for Languages Using Arabic Script, pp. 7-22, January 2025, Association for Computational Linguistics, https://aclanthology.org/2025.abjadnlp-1.2/

Associated Lamarr Researchers

Prof. Dr. Rafet Sifa

Principal Investigator Hybrid ML
Prof. Dr. Lucie Flek

Area Chair NLP