Automating translation checks of financial documents using large language models

We introduce a tool for automated translation checking of German-English financial reports. It uses a heuristic matching algorithm followed by a transformer-encoder-based error detection model that operates at the sentence-pair level. To generate the training data, we leverage state-of-the-art large language models such as GPT-4o, thereby alleviating the need for expert annotations. The results suggest that smaller models fine-tuned specifically for this task significantly outperform large multi-purpose generative models like GPT-4 on this problem, and that a combination of informed and deep learning approaches works best in this case. The tool is being made publicly available as a demonstrator.
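The two-stage design described in the abstract (heuristic matching, then error detection per sentence pair) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the abstract does not specify the matching heuristic, so the sketch scores candidate pairs by shared numbers and length ratio, and a simple number-mismatch rule stands in for the fine-tuned transformer encoder. The names `match_score`, `align`, and `flag_number_mismatch`, along with the weights and the threshold, are hypothetical.

```python
import re

# Matches integers and numbers with thousands/decimal separators, e.g. 1.234,5
NUMBER = re.compile(r"\d+(?:[.,]\d+)*")


def numbers(sentence: str) -> set[str]:
    """Extract numbers with separators stripped, so that German '1.234,5'
    and English '1,234.5' compare equal. Limitation of this assumption:
    '1.5' and '15' also collapse to the same key."""
    return {re.sub(r"[.,]", "", token) for token in NUMBER.findall(sentence)}


def match_score(de: str, en: str) -> float:
    """Hypothetical heuristic: number overlap (Jaccard) plus length ratio."""
    de_nums, en_nums = numbers(de), numbers(en)
    union = de_nums | en_nums
    overlap = len(de_nums & en_nums) / len(union) if union else 0.0
    length_ratio = min(len(de), len(en)) / max(len(de), len(en), 1)
    return 0.7 * overlap + 0.3 * length_ratio  # weights are arbitrary


def align(de_sents: list[str], en_sents: list[str],
          threshold: float = 0.5) -> list[tuple[str, str]]:
    """Stage 1: pair each German sentence with its best-scoring English one."""
    pairs = []
    for de in de_sents:
        best = max(en_sents, key=lambda en: match_score(de, en), default=None)
        if best is not None and match_score(de, best) >= threshold:
            pairs.append((de, best))
    return pairs


def flag_number_mismatch(de: str, en: str) -> bool:
    """Stage 2 placeholder: a rule-based check, NOT the paper's model.
    A trained sentence-pair classifier would take this function's place."""
    return numbers(de) != numbers(en)
```

In this sketch, `align` would be run over the sentences of a report and its translation, and each resulting pair passed to the stage-2 check. The split keeps the cheap, rule-based matching separate from the error detector, so the detector can be swapped out without touching the alignment step.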

  • Published in:
    Language Resources and Evaluation
  • Type:
    Article
  • Authors:
    Pielka, Maren; Hahnbück, Max; Deußer, Tobias; Uedelhoven, Daniel; Chatterjee, Moinam; Shah, Vijul; Soliman, Osama; von der Bank, Jannis; Das, Writwick; Talarico, Maria Chiara; Zhao, Cong; Held Celis, Carolina; Temath, Christian; Sifa, Rafet
  • Year:
    2025
  • Source:
    https://doi.org/10.1007/s10579-025-09862-z

Citation information

Pielka, Maren; Hahnbück, Max; Deußer, Tobias; Uedelhoven, Daniel; Chatterjee, Moinam; Shah, Vijul; Soliman, Osama; von der Bank, Jannis; Das, Writwick; Talarico, Maria Chiara; Zhao, Cong; Held Celis, Carolina; Temath, Christian; Sifa, Rafet: Automating translation checks of financial documents using large language models, Language Resources and Evaluation, July 2025, https://doi.org/10.1007/s10579-025-09862-z

Associated Lamarr Researchers

Maren Pielka

Author

Dr. Christian Temath

Area Chair Transfer

Prof. Dr. Rafet Sifa

Principal Investigator Hybrid ML