Automatic Indexing of Financial Documents via Information Extraction

Authors: R. Ramamurthy, M. Lübbering, T. Bell, M. Gebauer, B. Ulusay, D. Uedelhoven, T. D. Khameneh, R. Loitz, M. Pielka, C. Bauckhage, R. Sifa
Journal: 2021 IEEE Symposium Series on Computational Intelligence (SSCI)
Year: 2021

Citation information

R. Ramamurthy, M. Lübbering, T. Bell, M. Gebauer, B. Ulusay, D. Uedelhoven, T. D. Khameneh, R. Loitz, M. Pielka, C. Bauckhage, R. Sifa,
2021 IEEE Symposium Series on Computational Intelligence (SSCI),
2021,
1-5,
IEEE,
Orlando, FL, USA,
https://doi.org/10.1109/SSCI50451.2021.9659977

Extracting information from large volumes of unstructured documents is a pervasive problem in the financial domain. Enterprises and investors need automatic methods that can extract information from these documents, particularly for indexing and efficient retrieval. To this end, we present a scalable end-to-end document processing system for indexing and information retrieval from large volumes of financial documents. While we demonstrate the system on the use case of financial document processing, the system itself is agnostic to both the document type and the type of machine learning model. Thus, it can be applied to any large-scale document processing task involving domain-specific extractors.
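
The idea of an indexing system that stays agnostic to the document type and to the extractor can be illustrated with a minimal Python sketch. This is not the authors' implementation: the `Extractor` type, the `year_extractor` function, and the `DocumentIndex` class are hypothetical names introduced here for illustration, and a simple regular expression stands in for a domain-specific machine learning model.

```python
import re
from collections import defaultdict
from typing import Callable, Dict, Iterable, List, Set, Tuple

# Hypothetical extractor interface (an assumption, not from the paper):
# any callable that maps raw document text to (field, value) pairs.
Extractor = Callable[[str], List[Tuple[str, str]]]


def year_extractor(text: str) -> List[Tuple[str, str]]:
    """Toy stand-in for a learned extractor: tags four-digit years (19xx, 20xx)."""
    return [("year", y) for y in re.findall(r"\b(?:19|20)\d{2}\b", text)]


class DocumentIndex:
    """Inverted index from extracted (field, value) pairs to document ids.

    The index never inspects the documents itself; it only stores what the
    pluggable extractors return, so extractors can be swapped freely.
    """

    def __init__(self, extractors: Iterable[Extractor]) -> None:
        self.extractors: List[Extractor] = list(extractors)
        self.index: Dict[Tuple[str, str], Set[str]] = defaultdict(set)

    def add(self, doc_id: str, text: str) -> None:
        # Run every registered extractor and record which document
        # produced each (field, value) pair.
        for extract in self.extractors:
            for field, value in extract(text):
                self.index[(field, value)].add(doc_id)

    def search(self, field: str, value: str) -> Set[str]:
        # Return a copy so callers cannot mutate the index.
        return set(self.index.get((field, value), set()))


idx = DocumentIndex([year_extractor])
idx.add("a", "Annual report 2020 compared with 2019.")
idx.add("b", "Quarterly statement for 2020.")
```

In this sketch, supporting a new document type would only require registering another extractor callable; the indexing and lookup code stays unchanged.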