A Comparative Study of Large Language Models for Named Entity Recognition in the Legal Domain

Named Entity Recognition (NER) in the legal domain presents unique challenges due to specialized terminology and complex linguistic structures inherent in legal texts. While large language models (LLMs) like GPT-4, Llama-3, and others have significantly advanced natural language processing, their effectiveness in domain-specific tasks like legal Named Entity Recognition remains underexplored. This study conducts a comprehensive comparative analysis of eleven state-of-the-art LLMs on legal NER tasks across seven diverse datasets in five languages, namely English, Portuguese, German, Turkish, and Ukrainian. We evaluate the models’ performance using F1 scores, focusing on their ability to accurately identify and classify legal entities. Our findings reveal significant variability in LLM performance across different languages and legal contexts, with proprietary models like GPT-4 achieving the highest overall scores. The results highlight the influence of model architecture, dataset characteristics, and prompt design on the effectiveness of legal NER tasks. This study provides valuable benchmarks for legal NER applications and offers insights into the strengths and limitations of current LLMs, guiding future research and development in legal natural language processing.

Published in: 2024 IEEE International Conference on Big Data (BigData)
Type: Inproceedings
Authors: Deußer, Tobias; Zhao, Cong; Sparrenberg, Lorenz; Uedelhoven, Daniel; Berger, Armin; Pielka, Maren; Hillebrand, Lars; Bauckhage, Christian; Sifa, Rafet
Year: 2024
Source: https://ieeexplore.ieee.org/abstract/document/10825695

Citation information

Deußer, Tobias; Zhao, Cong; Sparrenberg, Lorenz; Uedelhoven, Daniel; Berger, Armin; Pielka, Maren; Hillebrand, Lars; Bauckhage, Christian; Sifa, Rafet: A Comparative Study of Large Language Models for Named Entity Recognition in the Legal Domain, 2024 IEEE International Conference on Big Data (BigData), December 2024, pp. 4737–4742, https://ieeexplore.ieee.org/abstract/document/10825695

Associated Lamarr Researchers

Maren Pielka

Prof. Dr. Christian Bauckhage

Prof. Dr. Rafet Sifa