Zero-Shot Text Matching for Automated Auditing using Sentence Transformers
Natural language processing methods have several applications in automated auditing, including document or passage classification, information retrieval, and question answering. However, training such models requires large amounts of annotated data, which are scarce in industrial settings. At the same time, techniques like zero-shot and unsupervised learning allow models pre-trained on general-domain data to be applied to unseen domains. In this work, we study the efficiency of unsupervised text matching using Sentence-BERT, a transformer-based model, by applying it to the semantic similarity of financial passages. Experimental results show that this model is robust to documents from in- and out-of-domain data.
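As a rough illustration of what unsupervised text matching by semantic similarity can look like, the sketch below ranks candidate passages against a requirement by cosine similarity of their embeddings, with no task-specific training. It is not the authors' implementation: the names `toy_embed`, `cosine`, and `match` are hypothetical, and `toy_embed` is a deliberately simple bag-of-words stand-in so the example runs without any model download. In a setting like the one the abstract describes, a Sentence-BERT encoder would presumably take its place.

```python
import math
from collections import Counter
from typing import Callable, Dict, List, Tuple

Vector = Dict[str, float]


def toy_embed(text: str) -> Vector:
    """Stand-in encoder (assumption): sparse bag-of-words counts.

    This is NOT Sentence-BERT; it only keeps the example self-contained.
    """
    tokens = [t.strip(".,;:()").lower() for t in text.split()]
    return dict(Counter(t for t in tokens if t))


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity between two sparse vectors."""
    dot = sum(v * b.get(k, 0.0) for k, v in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def match(
    requirements: List[str],
    passages: List[str],
    embed: Callable[[str], Vector] = toy_embed,
) -> List[Tuple[str, str, float]]:
    """For each requirement, return the most similar passage and its score."""
    passage_vecs = [embed(p) for p in passages]
    results = []
    for req in requirements:
        req_vec = embed(req)
        scores = [cosine(req_vec, pv) for pv in passage_vecs]
        best = max(range(len(passages)), key=scores.__getitem__)
        results.append((req, passages[best], scores[best]))
    return results
```

Calling `match(requirements, passages)` returns one `(requirement, best_passage, score)` tuple per requirement. The `embed` parameter is left pluggable so that a pre-trained sentence encoder could be substituted without changing the matching logic.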
- Published in: 2022 21st IEEE International Conference on Machine Learning and Applications (ICMLA)
- Type: Inproceedings
- Authors: Biesner, David; Pielka, Maren; Ramamurthy, Rajkumar; Dilmaghani, Tim; Kliem, Bernd; Loitz, Rüdiger; Sifa, Rafet
- Year: 2022
- Source: https://ieeexplore.ieee.org/document/10069326
Citation information
Biesner, David; Pielka, Maren; Ramamurthy, Rajkumar; Dilmaghani, Tim; Kliem, Bernd; Loitz, Rüdiger; Sifa, Rafet: Zero-Shot Text Matching for Automated Auditing using Sentence Transformers, 2022 21st IEEE International Conference on Machine Learning and Applications (ICMLA), 2022, https://ieeexplore.ieee.org/document/10069326, Biesner.etal.2022b
@Inproceedings{Biesner.etal.2022b,
author={Biesner, David and Pielka, Maren and Ramamurthy, Rajkumar and Dilmaghani, Tim and Kliem, Bernd and Loitz, Rüdiger and Sifa, Rafet},
title={Zero-Shot Text Matching for Automated Auditing using Sentence Transformers},
booktitle={2022 21st IEEE International Conference on Machine Learning and Applications (ICMLA)},
url={https://ieeexplore.ieee.org/document/10069326},
year={2022},
abstract={Natural language processing methods have several applications in automated auditing, including document or passage classification, information retrieval, and question answering. However, training such models requires a large amount of annotated data which is scarce in industrial settings. At the same time, techniques like zero-shot and unsupervised learning allow for application of models...}}