MultiProp Framework: Ensemble Models for Enhanced Cross-Lingual Propaganda Detection in Social Media and News using Data Augmentation, Text Segmentation, and Meta-Learning

Propaganda, a pervasive tool for influencing public opinion, demands robust automated detection systems, particularly for under-resourced languages. Current efforts largely focus on well-resourced languages like English, leaving significant gaps in languages such as Arabic. This research addresses these gaps by introducing MultiProp Framework, a cross-lingual meta-learning framework designed to enhance propaganda detection across multiple languages, including Arabic, German, Italian, French and English. We constructed a multilingual dataset using data translation techniques, beginning with Arabic data from PTC and WANLP shared tasks, and expanded it with translations into German, Italian and French, further enriched by the SemEval23 dataset. Our proposed framework encompasses three distinct models: MultiProp-Baseline, which combines ensembles of pre-trained models such as GPT-2, mBART, and XLM-RoBERTa; MultiProp-ML, designed to handle languages with minimal or no training data by utilizing advanced meta-learning techniques; and MultiProp-Chunk, which overcomes the challenges of processing longer texts that exceed the token limits of pre-trained models. Together, they deliver superior performance compared to state-of-the-art methods, representing a significant advancement in the field of cross-lingual propaganda detection.
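The abstract does not say how MultiProp-Chunk segments long inputs. As a rough illustration of the general idea only, the sketch below splits a token sequence into overlapping windows that fit a model's token limit and averages per-chunk scores. The function names, the window and stride values, and the averaging rule are all assumptions made for this example, not the authors' method.

```python
from typing import Callable, List, Sequence


def chunk_tokens(tokens: Sequence[str], max_len: int = 512, stride: int = 256) -> List[List[str]]:
    """Split tokens into overlapping windows of at most max_len tokens.

    Hypothetical helper: max_len and stride are illustrative defaults.
    """
    if max_len <= 0 or stride <= 0 or stride > max_len:
        raise ValueError("require 0 < stride <= max_len")
    chunks: List[List[str]] = []
    start = 0
    while start < len(tokens):
        chunks.append(list(tokens[start:start + max_len]))
        # Stop once the current window reaches the end of the text.
        if start + max_len >= len(tokens):
            break
        start += stride
    return chunks


def score_long_text(
    tokens: Sequence[str],
    score_chunk: Callable[[List[str]], float],
    max_len: int = 512,
    stride: int = 256,
) -> float:
    """Average a per-chunk score over all windows (assumed aggregation rule)."""
    chunks = chunk_tokens(tokens, max_len, stride)
    if not chunks:
        raise ValueError("cannot score an empty text")
    scores = [score_chunk(chunk) for chunk in chunks]
    return sum(scores) / len(scores)
```

Here `score_chunk` stands in for any classifier that returns a propaganda probability for one window. Other aggregation rules, such as taking the maximum over chunks, would be equally plausible.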

  • Published in:
    Proceedings of the 1st Workshop on NLP for Languages Using Arabic Script
  • Type:
    Inproceedings
  • Authors:
    Aldabbas, Farizeh; Ashraf, Shaina; Sifa, Rafet; Flek, Lucie
  • Year:
    2025
  • Source:
    https://aclanthology.org/2025.abjadnlp-1.2/

Citation information

Aldabbas, Farizeh; Ashraf, Shaina; Sifa, Rafet; Flek, Lucie: MultiProp Framework: Ensemble Models for Enhanced Cross-Lingual Propaganda Detection in Social Media and News using Data Augmentation, Text Segmentation, and Meta-Learning, Proceedings of the 1st Workshop on NLP for Languages Using Arabic Script, January 2025, pp. 7–22, Association for Computational Linguistics, https://aclanthology.org/2025.abjadnlp-1.2/

Associated Lamarr researchers

Prof. Dr. Rafet Sifa
Principal Investigator, Hybrid ML

Prof. Dr. Lucie Flek
Area Chair, NLP