A Comparative Study of Large Language Models for Named Entity Recognition in the Legal Domain

Named Entity Recognition (NER) in the legal domain presents unique challenges due to specialized terminology and complex linguistic structures inherent in legal texts. While large language models (LLMs) like GPT-4, Llama-3, and others have significantly advanced natural language processing, their effectiveness in domain-specific tasks like legal Named Entity Recognition remains underexplored. This study conducts a comprehensive comparative analysis of eleven state-of-the-art LLMs on legal NER tasks across seven diverse datasets in five languages, namely English, Portuguese, German, Turkish, and Ukrainian. We evaluate the models’ performance using F1 scores, focusing on their ability to accurately identify and classify legal entities. Our findings reveal significant variability in LLM performance across different languages and legal contexts, with proprietary models like GPT-4 achieving the highest overall scores. The results highlight the influence of model architecture, dataset characteristics, and prompt design on the effectiveness of legal NER tasks. This study provides valuable benchmarks for legal NER applications and offers insights into the strengths and limitations of current LLMs, guiding future research and development in legal natural language processing.

  • Published in:
    2024 IEEE International Conference on Big Data (BigData)
  • Type:
    Inproceedings
  • Authors:
    Deußer, Tobias; Zhao, Cong; Sparrenberg, Lorenz; Uedelhoven, Daniel; Berger, Armin; Pielka, Maren; Hillebrand, Lars; Bauckhage, Christian; Sifa, Rafet
  • Year:
    2024
  • Source:
    https://ieeexplore.ieee.org/abstract/document/10825695

Citation information

Deußer, Tobias; Zhao, Cong; Sparrenberg, Lorenz; Uedelhoven, Daniel; Berger, Armin; Pielka, Maren; Hillebrand, Lars; Bauckhage, Christian; Sifa, Rafet: A Comparative Study of Large Language Models for Named Entity Recognition in the Legal Domain, 2024 IEEE International Conference on Big Data (BigData), December 2024, pp. 4737–4742, https://ieeexplore.ieee.org/abstract/document/10825695

Associated Lamarr Researchers

Maren Pielka

Author

Prof. Dr. Christian Bauckhage

Director

Prof. Dr. Rafet Sifa

Principal Investigator, Hybrid ML