Utilization of Reconstructive Representation Learning for Robust Classification
Deep neural networks (DNNs) are generally trained via empirical risk minimization (ERM) on classification tasks. While this has led to impressive results on scientific benchmarks as well as in industrial applications, it has also been shown that DNNs tend to give wrong predictions with elevated confidence on out-of-distribution data. In the past, various AI accidents have been associated with these robustness deficiencies of DNNs, making the development of safer DNN architectures essential.
To this end, we examine this issue from a theoretical, optimization-based point of view and empirically verify the deficiency across various benchmarks. As a potential multistep solution, we first turn to outlier detection methods, as such methods aim to capture out-of-distribution data, i.e., data that cannot be explained by the data-generating process of normal data. In particular, we utilize reconstructive representation learning, i.e., autoencoders, to learn a representation of normality and leverage the reconstruction error as an outlierness signal to filter outliers. We find that integrating outlier data into the training process, in contrast to previous works (e.g., one-class autoencoders), significantly benefits model robustness. We therefore propose a novel architecture, the adversarially trained autoencoder (ATA), which incorporates this insight by actively maximizing the reconstruction error of outliers and minimizing that of inliers.
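The idea of using the reconstruction error as an outlierness signal, and of pushing it in opposite directions for inliers and outliers, can be illustrated with a small sketch. This is not the loss defined in the thesis: the function names, the hinge `margin` on the outlier term, and the fixed `threshold` are all assumptions made for illustration, and the autoencoder itself is left out (the reconstructions `x_hat` are taken as given).

```python
import numpy as np

def reconstruction_error(x, x_hat):
    """Mean squared reconstruction error per sample (rows are samples)."""
    x = np.asarray(x, dtype=float)
    x_hat = np.asarray(x_hat, dtype=float)
    return np.mean((x - x_hat) ** 2, axis=1)

def adversarial_reconstruction_loss(errors, is_outlier, margin=1.0):
    """Hypothetical objective (an assumption, not the thesis's formula).

    Inlier errors are minimized directly. Outlier errors are pushed up
    until they reach `margin`; the hinge keeps the outlier term bounded.
    """
    errors = np.asarray(errors, dtype=float)
    is_outlier = np.asarray(is_outlier, dtype=bool)
    inlier_term = errors[~is_outlier].sum()
    outlier_term = np.maximum(0.0, margin - errors[is_outlier]).sum()
    return (inlier_term + outlier_term) / len(errors)

def filter_outliers(errors, threshold):
    """Flag samples whose reconstruction error exceeds `threshold`."""
    return np.asarray(errors, dtype=float) > threshold

# Toy data: the first sample is reconstructed perfectly, the second is not.
x = [[0.0, 0.0], [1.0, 1.0]]
x_hat = [[0.0, 0.0], [0.0, 0.0]]
errors = reconstruction_error(x, x_hat)
flags = filter_outliers(errors, threshold=0.5)
```

In this toy setup, the loss is zero when the poorly reconstructed sample is labeled an outlier, and positive when the labels are swapped, which is the direction of pressure the paragraph above describes.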
In the second step, we consider the related problem of open-set recognition (OSR), which aims to separate a fixed set of inlier classes from all other possible classes, including out-of-distribution data. We show that our supervised outlier detection method ATA can solve this generalized one-vs-rest classification task without exhibiting the robustness deficiencies of DNNs optimized via ERM. To actively reduce the open-space risk, a principal robustness criterion in OSR, we extend ATA to our decoupled autoencoder (DAE) architecture, which learns a tighter hull around the inlier data and, unlike ATA, provides probability scores for the inlierness of a sample. To support our empirical evidence, we prove the existence of an upper bound on the open-space risk for ATA and DAE.
In the final step, we perform multi-class classification on the inlier classes in the OSR setting, which resembles multi-class classification in real-world deployments because of the exposure to out-of-distribution data. To this end, we compose an ensemble of DAEs, each learning a different one-vs-rest relationship on the inlier classes, and demonstrate the ensemble's robustness benefits and its capability to distinguish between aleatoric and epistemic uncertainty. All three properties together are unmatched by any other DNN architecture.
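How a one-vs-rest ensemble can classify inliers while rejecting unknown inputs can be sketched as follows. This is an illustrative decision rule only: the function names, the `threshold` of 0.5, the `-1` rejection label, and the reading of "no model accepts" as unknown input and "several models accept" as class ambiguity are assumptions, not the procedure defined in the thesis. The per-class inlierness scores are taken as given.

```python
import numpy as np

def classify_with_rejection(scores, threshold=0.5):
    """Pick the class whose one-vs-rest model gives the highest score.

    scores[i, k] is the inlierness probability that the k-th model
    assigns to sample i. Returns -1 when no model reaches `threshold`.
    """
    scores = np.asarray(scores, dtype=float)
    best = scores.argmax(axis=1)
    accepted = scores.max(axis=1) >= threshold
    return np.where(accepted, best, -1)

def uncertainty_flags(scores, threshold=0.5):
    """Heuristic flags (an assumption): 'unknown' if no model accepts
    the sample, 'ambiguous' if more than one model accepts it."""
    scores = np.asarray(scores, dtype=float)
    n_accepting = (scores >= threshold).sum(axis=1)
    return {"unknown": n_accepting == 0, "ambiguous": n_accepting > 1}

# Three samples scored by three hypothetical one-vs-rest models.
scores = [
    [0.9, 0.1, 0.2],  # clearly class 0
    [0.1, 0.2, 0.1],  # rejected by every model
    [0.8, 0.7, 0.1],  # accepted by two models
]
predictions = classify_with_rejection(scores)
flags = uncertainty_flags(scores)
```

Because each model scores its own class independently, the scores of a sample need not sum to one, which is what allows a sample to be rejected by every model at once.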
Finally, the applicability to real-world settings is demonstrated on the use case of toxicity detection in online communication and in a deployment case study of a large-scale information extraction system for financial data.
- Type: PhD thesis
- Authors: Lübbering, Max
- Year: 2023
Citation information
Lübbering, Max: Utilization of Reconstructive Representation Learning for Robust Classification, July 2023, https://hdl.handle.net/20.500.11811/10947, Luebbering.2023a
@Phdthesis{Luebbering.2023a,
author={Lübbering, Max},
title={Utilization of Reconstructive Representation Learning for Robust Classification},
month={July},
url={https://hdl.handle.net/20.500.11811/10947},
year={2023},
abstract={Deep neural networks (DNNs) are generally trained via empirical risk minimization (ERM) on classification tasks. While this has led to impressive results on scientific benchmarks as well as in industrial applications, it has also been shown that DNNs tend to give wrong predictions with elevated confidence on out-of-distribution data. In the past, various AI accidents have been associated with...}}