A Comparative Study of Large Language Models for Named Entity Recognition in the Legal Domain
Named Entity Recognition (NER) in the legal domain presents unique challenges due to specialized terminology and complex linguistic structures inherent in legal texts. While large language models (LLMs) like GPT-4, Llama-3, and others have significantly advanced natural language processing, their effectiveness in domain-specific tasks like legal Named Entity Recognition remains underexplored. This study conducts a comprehensive comparative analysis of eleven state-of-the-art LLMs on legal NER tasks across seven diverse datasets in five languages, namely English, Portuguese, German, Turkish, and Ukrainian. We evaluate the models’ performance using F1 scores, focusing on their ability to accurately identify and classify legal entities. Our findings reveal significant variability in LLM performance across different languages and legal contexts, with proprietary models like GPT-4 achieving the highest overall scores. The results highlight the influence of model architecture, dataset characteristics, and prompt design on the effectiveness of legal NER tasks. This study provides valuable benchmarks for legal NER applications and offers insights into the strengths and limitations of current LLMs, guiding future research and development in legal natural language processing.
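The abstract names F1 as the evaluation metric but does not spell out the matching protocol. As an illustration only, the following minimal Python sketch computes a micro-averaged, entity-level F1 under the assumption of exact span-and-label matching; the function name `micro_f1`, the `(start, end, label)` tuple layout, and the example labels are hypothetical and are not taken from the paper.

```python
from typing import Iterable, Set, Tuple

# Hypothetical entity representation: (start offset, end offset, label).
Entity = Tuple[int, int, str]


def micro_f1(gold: Iterable[Set[Entity]], pred: Iterable[Set[Entity]]) -> float:
    """Micro-averaged F1 over documents, counting exact span+label matches."""
    tp = fp = fn = 0
    for g, p in zip(gold, pred):
        tp += len(g & p)   # predicted entities that match a gold entity exactly
        fp += len(p - g)   # predicted entities with no exact gold match
        fn += len(g - p)   # gold entities the model missed
    if tp == 0:
        return 0.0         # avoids division by zero when nothing matches
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy data with made-up labels: one entity in the first document is mislabeled.
gold = [{(0, 2, "COURT"), (5, 7, "PERSON")}, {(1, 3, "LAW")}]
pred = [{(0, 2, "COURT"), (5, 7, "ORG")}, {(1, 3, "LAW")}]
score = micro_f1(gold, pred)
```

In this toy case a mislabeled span counts as both a false positive and a false negative, which is why exact-match F1 penalizes classification errors as well as boundary errors.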
- Published in: 2024 IEEE International Conference on Big Data (BigData)
- Type: Inproceedings
- Authors: Deußer, Tobias; Zhao, Cong; Sparrenberg, Lorenz; Uedelhoven, Daniel; Berger, Armin; Pielka, Maren; Hillebrand, Lars; Bauckhage, Christian; Sifa, Rafet
- Year: 2024
- Source: https://ieeexplore.ieee.org/abstract/document/10825695
Citation information
Deußer, Tobias; Zhao, Cong; Sparrenberg, Lorenz; Uedelhoven, Daniel; Berger, Armin; Pielka, Maren; Hillebrand, Lars; Bauckhage, Christian; Sifa, Rafet: A Comparative Study of Large Language Models for Named Entity Recognition in the Legal Domain, 2024 IEEE International Conference on Big Data (BigData), 2024, pp. 4737-4742, December, https://ieeexplore.ieee.org/abstract/document/10825695, Deusser.etal.2024b
@Inproceedings{Deusser.etal.2024b,
author={Deußer, Tobias and Zhao, Cong and Sparrenberg, Lorenz and Uedelhoven, Daniel and Berger, Armin and Pielka, Maren and Hillebrand, Lars and Bauckhage, Christian and Sifa, Rafet},
title={A Comparative Study of Large Language Models for Named Entity Recognition in the Legal Domain},
booktitle={2024 {IEEE} International Conference on Big Data ({BigData})},
pages={4737--4742},
month={December},
url={https://ieeexplore.ieee.org/abstract/document/10825695},
year={2024},
abstract={Named Entity Recognition ({NER}) in the legal domain presents unique challenges due to specialized terminology and complex linguistic structures inherent in legal texts. While large language models ({LLMs}) like {GPT}-4, Llama-3, and others have significantly advanced natural language processing, their effectiveness in domain-specific tasks like legal Named Entity Recognition remains...}}