MultiProp Framework: Ensemble Models for Enhanced Cross-Lingual Propaganda Detection in Social Media and News using Data Augmentation, Text Segmentation, and Meta-Learning
Propaganda, a pervasive tool for influencing public opinion, demands robust automated detection systems, particularly for under-resourced languages. Current efforts largely focus on well-resourced languages like English, leaving significant gaps in languages such as Arabic. This research addresses these gaps by introducing MultiProp Framework, a cross-lingual meta-learning framework designed to enhance propaganda detection across multiple languages, including Arabic, German, Italian, French, and English. We constructed a multilingual dataset using data translation techniques, beginning with Arabic data from PTC and WANLP shared tasks, and expanded it with translations into German, Italian, and French, further enriched by the SemEval23 dataset. Our proposed framework encompasses three distinct models: MultiProp-Baseline, which combines ensembles of pre-trained models such as GPT-2, mBART, and XLM-RoBERTa; MultiProp-ML, designed to handle languages with minimal or no training data by utilizing advanced meta-learning techniques; and MultiProp-Chunk, which overcomes the challenges of processing longer texts that exceed the token limits of pre-trained models. Together, they deliver superior performance compared to state-of-the-art methods, representing a significant advancement in the field of cross-lingual propaganda detection.
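As an illustration of the general idea behind handling texts longer than a model's token limit and combining several models, the following is a minimal sketch, not the authors' implementation. It assumes a simple overlapping sliding window and plain score averaging; the function names `chunk_tokens` and `ensemble_score`, the `max_len` and `stride` parameters, and the toy scoring callables are all hypothetical and do not come from the paper.

```python
from typing import Callable, List, Sequence


def chunk_tokens(tokens: Sequence[str], max_len: int, stride: int) -> List[List[str]]:
    """Split a token sequence into overlapping windows of at most max_len tokens.

    Assumption: a fixed-size sliding window; the paper may segment differently.
    """
    if max_len <= 0 or stride <= 0 or stride > max_len:
        raise ValueError("require 0 < stride <= max_len")
    if not tokens:
        return []
    chunks: List[List[str]] = []
    start = 0
    while True:
        chunks.append(list(tokens[start:start + max_len]))
        if start + max_len >= len(tokens):
            break  # the last window already reaches the end of the text
        start += stride
    return chunks


def ensemble_score(
    tokens: Sequence[str],
    models: Sequence[Callable[[List[str]], float]],
    max_len: int = 512,
    stride: int = 256,
) -> float:
    """Average each model's per-chunk scores, then average across models.

    Each entry of `models` is a hypothetical callable that maps one chunk
    to a propaganda probability in [0, 1].
    """
    chunks = chunk_tokens(tokens, max_len, stride)
    if not chunks or not models:
        raise ValueError("need at least one chunk and one model")
    per_model = []
    for model in models:
        chunk_scores = [model(chunk) for chunk in chunks]
        per_model.append(sum(chunk_scores) / len(chunk_scores))
    return sum(per_model) / len(per_model)


# Toy usage with stand-in scorers (placeholders for real classifiers).
example_tokens = [f"t{i}" for i in range(10)]
example_chunks = chunk_tokens(example_tokens, max_len=4, stride=2)
example_score = ensemble_score(
    example_tokens, [lambda c: 0.2, lambda c: 0.6], max_len=4, stride=2
)
```

In practice the stand-in scorers would be replaced by real classifiers, and averaging is only one possible aggregation; taking the maximum chunk score or using a learned combiner are common alternatives.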
- Published in: Proceedings of the 1st Workshop on NLP for Languages Using Arabic Script
- Type: Inproceedings
- Authors: Aldabbas, Farizeh; Ashraf, Shaina; Sifa, Rafet; Flek, Lucie
- Year: 2025
- Source: https://aclanthology.org/2025.abjadnlp-1.2/
Citation information
Aldabbas, Farizeh; Ashraf, Shaina; Sifa, Rafet; Flek, Lucie: MultiProp Framework: Ensemble Models for Enhanced Cross-Lingual Propaganda Detection in Social Media and News using Data Augmentation, Text Segmentation, and Meta-Learning, Proceedings of the 1st Workshop on NLP for Languages Using Arabic Script, 2025, pp. 7–22, January, Association for Computational Linguistics, https://aclanthology.org/2025.abjadnlp-1.2/
@Inproceedings{Aldabbas.etal.2025a,
author={Aldabbas, Farizeh and Ashraf, Shaina and Sifa, Rafet and Flek, Lucie},
title={{MultiProp} Framework: Ensemble Models for Enhanced Cross-Lingual Propaganda Detection in Social Media and News using Data Augmentation, Text Segmentation, and Meta-Learning},
booktitle={Proceedings of the 1st Workshop on {NLP} for Languages Using Arabic Script},
pages={7--22},
month={January},
publisher={Association for Computational Linguistics},
url={https://aclanthology.org/2025.abjadnlp-1.2/},
year={2025},
abstract={Propaganda, a pervasive tool for influencing public opinion, demands robust automated detection systems, particularly for under-resourced languages. Current efforts largely focus on well-resourced languages like English, leaving significant gaps in languages such as Arabic. This research addresses these gaps by introducing {MultiProp} Framework, a cross-lingual meta-learning framework...}}