ALiBERT: Improved Automated List Inspection (ALI) with BERT
We consider Automated List Inspection (ALI), a content-based text recommendation system that assists auditors in matching relevant text passages from notes in financial statements to specific law regulations. ALI follows a ranking paradigm in which a fixed number of requirements per textual passage are shown to the user. Despite achieving impressive ranking performance, the user experience can still be improved by showing a dynamic number of recommendations. Besides, existing models rely on a feature-based language model that needs to be pre-trained on a large corpus of domain-specific datasets. Moreover, they cannot be trained in an end-to-end fashion by jointly optimizing with language model parameters. In this work, we alleviate these concerns by considering a multi-label classification approach that predicts dynamic requirement sequences. We base our model on pre-trained BERT that allows us to fine-tune the whole model in an end-to-end fashion, thereby avoiding the need for training a language representation model. We conclude by presenting a detailed evaluation of the proposed model on two German financial datasets.
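The difference between the ranking paradigm and the multi-label approach described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the requirement identifiers, the logit values, and the 0.5 decision threshold are all assumptions chosen for illustration, and the scores stand in for whatever a fine-tuned classifier head might output for one text passage.

```python
import math


def sigmoid(x: float) -> float:
    """Map a raw score (logit) to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def top_k(scores: dict[str, float], k: int) -> list[str]:
    """Ranking paradigm: return the k best-scoring requirements,
    no matter how confident the model is about each of them."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:k]


def multi_label(logits: dict[str, float], threshold: float = 0.5) -> list[str]:
    """Multi-label paradigm: return every requirement whose predicted
    probability reaches the threshold, so the list length varies."""
    return [req for req, z in logits.items() if sigmoid(z) >= threshold]


# Hypothetical per-requirement logits for a single text passage.
logits = {"req_a": 2.1, "req_b": -0.3, "req_c": 0.4, "req_d": -3.0}

fixed = top_k(logits, k=3)      # always three recommendations
dynamic = multi_label(logits)   # only the requirements above the threshold

print(fixed)
print(dynamic)
```

With these made-up scores, the fixed-size list includes a requirement the model considers unlikely, while the thresholded list contains only the two requirements with probability above 0.5.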
- Published in: DocEng '21: Proceedings of the 21st ACM Symposium on Document Engineering, ACM Symposium on Document Engineering (DocEng)
- Type: Inproceedings
- Authors: R. Ramamurthy, M. Pielka, R. Stenzel, C. Bauckhage, R. Sifa, T. D. Khameneh, U. Warning, B. Kliem, R. Loitz
- Year: 2021
Citation information
R. Ramamurthy, M. Pielka, R. Stenzel, C. Bauckhage, R. Sifa, T. D. Khameneh, U. Warning, B. Kliem, R. Loitz: ALiBERT: Improved Automated List Inspection (ALI) with BERT, ACM Symposium on Document Engineering (DocEng), DocEng '21: Proceedings of the 21st ACM Symposium on Document Engineering, 2021, https://doi.org/10.1145/3469096.3474928
@Inproceedings{Ramamurthy.etal.2021,
author={R. Ramamurthy and M. Pielka and R. Stenzel and C. Bauckhage and R. Sifa and T. D. Khameneh and U. Warning and B. Kliem and R. Loitz},
title={ALiBERT: Improved Automated List Inspection (ALI) with BERT},
booktitle={DocEng '21: Proceedings of the 21st ACM Symposium on Document Engineering},
series={ACM Symposium on Document Engineering (DocEng)},
url={https://doi.org/10.1145/3469096.3474928},
year={2021},
abstract={We consider Automated List Inspection (ALI), a content-based text recommendation system that assists auditors in matching relevant text passages from notes in financial statements to specific law regulations. ALI follows a ranking paradigm in which a fixed number of requirements per textual passage are shown to the user. Despite achieving impressive ranking performance, the user experience can...}}