EDAudio: Easy Data Augmentation for Dialectal Audio
We investigate lightweight and easily applicable data augmentation techniques for dialectal audio classification. We evaluate four main methods, namely shifting pitch, interval removal, background noise insertion and interval swap, as well as several subvariants, on recordings from 20 German dialects. Each main method is tested across multiple hyperparameter combinations, including augmentation length, coverage ratio and number of augmentations per original sample. Our results show that frequency-based techniques, particularly frequency masking, consistently yield performance improvements, while others such as time masking or speaker-based insertion can negatively affect the results. Our comparative analysis identifies which augmentations are most effective under realistic conditions, offering simple and efficient strategies to improve dialectal speech classification.
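As a rough illustration of three of the interval-based methods named in the abstract (interval removal, interval swap and background noise insertion), the following minimal sketch operates on a waveform stored as a plain list of floats. It is not the authors' implementation: the function names, the parameters, and the choice of additive Gaussian noise as the "background noise" are all assumptions made for illustration. Pitch shifting is left out because it usually relies on a dedicated signal-processing library.

```python
import random
from typing import List, Optional


def remove_interval(samples: List[float], start: int, length: int) -> List[float]:
    """Interval removal (hypothetical helper): drop `length` samples from `start`."""
    return samples[:start] + samples[start + length:]


def swap_intervals(samples: List[float], first: int, second: int, length: int) -> List[float]:
    """Interval swap (hypothetical helper): exchange two equal-length,
    non-overlapping intervals that begin at `first` and `second`."""
    if first > second:
        first, second = second, first
    if first < 0 or first + length > second or second + length > len(samples):
        raise ValueError("intervals must lie inside the signal and must not overlap")
    out = list(samples)
    out[first:first + length] = samples[second:second + length]
    out[second:second + length] = samples[first:first + length]
    return out


def add_noise(samples: List[float], start: int, length: int, scale: float,
              rng: Optional[random.Random] = None) -> List[float]:
    """Noise insertion (hypothetical helper): add Gaussian noise to one interval.
    Gaussian noise is an assumption; recorded background audio could be mixed in instead."""
    rng = rng or random.Random()
    out = list(samples)
    for i in range(start, min(start + length, len(out))):
        out[i] += rng.gauss(0.0, scale)
    return out


if __name__ == "__main__":
    signal = [float(i) for i in range(8)]
    print(remove_interval(signal, 2, 3))
    print(swap_intervals(signal, 0, 6, 2))
    print(add_noise(signal, 1, 2, 0.1, random.Random(0)))
```

In this sketch, the augmentation length corresponds to the `length` argument. A coverage ratio could be approximated by applying a helper repeatedly until the desired fraction of the signal has been modified, and several augmentations per original sample by calling the helpers with different random positions.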
- Published in: Proceedings of the 15th International Conference on Recent Advances in Natural Language Processing - Natural Language Processing in the Generative AI era
- Type: Inproceedings
- Year: 2025
- Source: https://aclanthology.org/2025.ranlp-1.44
Citation information
Fischbach, Lea; Karimi, Akbar; Lameli, Alfred; Flek, Lucie: EDAudio: Easy Data Augmentation for Dialectal Audio, Proceedings of the 15th International Conference on Recent Advances in Natural Language Processing - Natural Language Processing in the Generative AI era, pp. 363-368, September 2025, INCOMA Ltd., Shoumen, Bulgaria, https://aclanthology.org/2025.ranlp-1.44
@Inproceedings{Fischbach.etal.2025c,
author={Fischbach, Lea and Karimi, Akbar and Lameli, Alfred and Flek, Lucie},
title={EDAudio: Easy Data Augmentation for Dialectal Audio},
booktitle={Proceedings of the 15th International Conference on Recent Advances in Natural Language Processing - Natural Language Processing in the Generative AI era},
pages={363--368},
month={September},
publisher={INCOMA Ltd., Shoumen, Bulgaria},
url={https://aclanthology.org/2025.ranlp-1.44},
year={2025},
abstract={We investigate lightweight and easily applicable data augmentation techniques for dialectal audio classification. We evaluate four main methods, namely shifting pitch, interval removal, background noise insertion and interval swap, as well as several subvariants, on recordings from 20 German dialects. Each main method is tested across multiple hyperparameter combinations, including augmentation...}}