Automating translation checks of financial documents using large language models

We introduce a tool for automated translation checking of German-English financial reports. It uses a heuristic matching algorithm followed by a transformer-encoder-based error detection model at the sentence-pair level. To generate the training data, we leverage state-of-the-art large language models such as GPT-4o, thereby alleviating the need for expert annotations. The results suggest that smaller models fine-tuned specifically for this task significantly outperform large multi-purpose generative models like GPT-4 on this particular problem, and that a combination of informed and deep learning approaches works best. The tool is being made publicly available as a demonstrator.

  • Published in:
    Language Resources and Evaluation
  • Type:
    Article
  • Authors:
    Pielka, Maren; Hahnbück, Max; Deußer, Tobias; Uedelhoven, Daniel; Chatterjee, Moinam; Shah, Vijul; Soliman, Osama; von der Bank, Jannis; Das, Writwick; Talarico, Maria Chiara; Zhao, Cong; Held Celis, Carolina; Temath, Christian; Sifa, Rafet
  • Year:
    2025
  • Source:
    https://doi.org/10.1007/s10579-025-09862-z

Citation information

Pielka, Maren; Hahnbück, Max; Deußer, Tobias; Uedelhoven, Daniel; Chatterjee, Moinam; Shah, Vijul; Soliman, Osama; von der Bank, Jannis; Das, Writwick; Talarico, Maria Chiara; Zhao, Cong; Held Celis, Carolina; Temath, Christian; Sifa, Rafet: Automating translation checks of financial documents using large language models, Language Resources and Evaluation, 2025, July, https://doi.org/10.1007/s10579-025-09862-z