The path to secure AI: an overview of practical methods
Over the course of just a decade, deep neural networks (DNNs) have revolutionized Machine Learning, consistently demonstrating unmatched performance in a growing number of highly relevant tasks. They frequently surpass human capabilities in narrowly defined problem scenarios. Many of these neural models have made their way into mature applications such as smart speakers, automated translation, or content feeds. However, safety-critical systems, where there may be a risk to life and limb, pose specific challenges for the application of Machine Learning methods. These challenges stem from various potential weaknesses of DNNs, ranging from inadequate generalization and insufficient interpretability to vulnerability to manipulated inputs. The need to adequately consider, address, and resolve these weaknesses makes the use of neural networks in safety-critical systems difficult. Examples of such systems include nearly all applications of autonomous robotics involving human interaction, among them automated driving.
“KI Absicherung” (AI Security) – Secure AI for Autonomous Driving
In recent years, a variety of state-of-the-art techniques have been developed to address potential safety concerns of Machine Learning processes. In the research project “AI Security/KI Absicherung,” 24 partners from industry and research, under the leadership of Volkswagen AG and Fraunhofer IAIS (as part of the Lamarr Institute), are developing methods and argumentation chains to demonstrably ensure the safety of AI-based perception functions. A survey paper resulting from this collaboration provides a structured and comprehensive overview of state-of-the-art techniques for securing ML models. The goal of this survey and collaboration is to summarize current research approaches that identify potential weaknesses in DNNs, quantify them (using safety metrics), or at least partially address them (using safety measures).
To develop these metrics and measures, some research approaches focus on building a theory for a better understanding of DNNs, while others are dedicated to the practical development of methods that adapt training, prediction, or the development process itself to new safety requirements. Currently, there are numerous works, each addressing only a small part of all safety concerns. However, a comprehensive safety argumentation for a complex system based on a DNN will generally rely on the interplay of several metrics and measures. This article summarizes the best-known and most promising ones in eight categories and gives a brief overview of each. For a detailed examination of individual metrics and measures, please refer to the survey paper.
Eight Paths to Secure AI
Dataset Optimization: Measures for dataset optimization focus on the training and evaluation process of neural networks. They are motivated by the fact that neural networks, unlike humans, generally make poor predictions on inputs that differ fundamentally from the data presented to them during training. The problem arises from incomplete, i.e., non-representative, datasets and from distributional shifts. For instance, the appearance of pedestrians and of the street scene can vary significantly from country to country and over time. Possible countermeasures include data augmentation and outlier detection methods.
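The two countermeasures named above can be illustrated with a minimal sketch. It assumes images are plain nested lists and that a single scalar feature (here a hypothetical mean-brightness value) summarizes each image; the function names and the z-score threshold are illustrative choices, not methods taken from the survey.

```python
import statistics

def horizontal_flip(image):
    """Data augmentation: mirror an image given as a list of pixel rows."""
    return [row[::-1] for row in image]

def fit_feature_stats(train_values):
    """Mean and standard deviation of a scalar feature over the training set."""
    return statistics.fmean(train_values), statistics.pstdev(train_values)

def is_outlier(value, mean, std, threshold=3.0):
    """Flag an input whose feature lies more than `threshold` standard
    deviations away from the training mean (a simple z-score test)."""
    if std == 0:
        return value != mean
    return abs(value - mean) / std > threshold

# Hypothetical mean-brightness values of the training images.
train_brightness = [0.48, 0.52, 0.50, 0.47, 0.53]
mean, std = fit_feature_stats(train_brightness)

flipped = horizontal_flip([[1, 2, 3], [4, 5, 6]])
typical_flag = is_outlier(0.51, mean, std)   # close to the training data
shifted_flag = is_outlier(0.95, mean, std)   # e.g. a strongly overexposed scene
```

In practice, outlier detection would operate on learned feature representations rather than a single hand-picked statistic, but the decision rule follows the same pattern.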
Robustness and “Adversarial Attacks”: Because a neural network chains together numerous nonlinear functions, even small changes in its inputs can lead to a significantly different outcome. For instance, changing the clothing of a person might be considered a minor alteration within an image, yet it can result in completely different predictions by the Machine Learning model. Measures in the area of robustness address this issue. “Adversarial attacks”, a special case within robustness, involve deliberately searching for minimal manipulations that cause significant changes in a network’s behavior.
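How a minimal manipulation can flip a prediction is easiest to see on a toy model. The sketch below applies the fast gradient sign method (FGSM), one well-known adversarial attack, to a small logistic-regression classifier; the weights, the input, and the perturbation budget epsilon are hypothetical values chosen for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(weights, bias, x):
    """Probability of class 1 under a toy logistic-regression model."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)

def sign(v):
    return (v > 0) - (v < 0)

def fgsm(weights, bias, x, label, epsilon):
    """Fast gradient sign method: move every input component by epsilon
    in the direction that increases the cross-entropy loss."""
    p = predict(weights, bias, x)
    # For logistic regression, d(loss)/d(x_i) = (p - label) * w_i.
    grad = [(p - label) * w for w in weights]
    return [xi + epsilon * sign(g) for xi, g in zip(x, grad)]

weights, bias = [2.0, -3.0, 1.0], 0.0   # hypothetical model parameters
x = [0.2, 0.1, 0.1]                     # hypothetical input with true label 1
x_adv = fgsm(weights, bias, x, label=1, epsilon=0.1)

clean_is_class_1 = predict(weights, bias, x) > 0.5
adv_is_class_1 = predict(weights, bias, x_adv) > 0.5
```

Each input component changes by at most 0.1, yet the predicted class switches from 1 to 0. For a deep network the gradient would come from backpropagation rather than a closed-form expression.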
Explainability: Measures in the area of explainability deal with disclosing the internal workings of neural networks. From a safety perspective, this is important because the interpretability and transparency of Machine Learning methods allow error cases to be traced back to the model so that targeted improvements can be initiated (see blog post Why AI needs to be explainable). These methods also make it possible to identify shortcut learning, a special case of overfitting in which a prediction is based on a spurious correlation in the training data rather than a causal relationship.
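One simple way to probe what a model relies on is occlusion: replace one input feature at a time with a baseline value and record how much the output drops. The sketch below assumes a model that is just a Python callable on a feature list; the toy model and its "watermark" feature are hypothetical and stand in for a learned shortcut.

```python
def occlusion_attribution(model, x, baseline=0.0):
    """Attribute the model output to each feature by measuring the score
    drop when that feature alone is replaced with a baseline value."""
    reference = model(x)
    attributions = []
    for i in range(len(x)):
        occluded = list(x)
        occluded[i] = baseline
        attributions.append(reference - model(occluded))
    return attributions

def toy_model(x):
    """Hypothetical scorer: features 0 and 1 describe the object,
    feature 2 is a watermark that happens to co-occur with the class."""
    return 0.1 * x[0] + 0.2 * x[1] + 5.0 * x[2]

attributions = occlusion_attribution(toy_model, [1.0, 1.0, 1.0])
most_influential = max(range(len(attributions)), key=lambda i: attributions[i])
```

Here the watermark feature receives by far the largest attribution, which is the kind of signal that would prompt a closer look for shortcut learning.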
Uncertainty: Methods and measures in the area of uncertainty aim to equip DNNs with a situation-dependent quality assessment, allowing them to judge how reliable a given prediction is. Without appropriate adjustments, neural networks tend to be overconfident: they assign the predicted result a probability that does not match the probability that the prediction is actually correct. However, when a DNN is one component of a system, possibly with multiple redundantly operating models, its confidence is essential for selecting the best predictions and discarding others. Poor confidence estimates occur especially with input data that differs significantly from the training set. Additionally, the self-assessment of a network should not only be correct globally but should also match the error probability locally, i.e., on relevant subsets of the data. An example is a pedestrian detection system that detects all female pedestrians but no male ones. While a confidence of 50% for each output of the network may be correct globally, the lack of a data-adapted confidence assessment clearly presents a safety issue. Methods and measures in the category of uncertainty assessment address this problem.
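The gap between global and local correctness of confidence values can be made concrete with the expected calibration error (ECE), a common calibration metric. The sketch below is a minimal binned implementation; the four data points loosely mirror the pedestrian example and are invented for illustration.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Weighted average gap between mean confidence and accuracy,
    computed over equally wide confidence bins."""
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        index = min(int(conf * n_bins), n_bins - 1)
        bins[index].append((conf, ok))
    total = len(confidences)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        ece += len(members) / total * abs(avg_conf - accuracy)
    return ece

# Hypothetical outputs: every prediction carries a confidence of 50%.
confidences = [0.5, 0.5, 0.5, 0.5]
correct = [True, True, False, False]   # first two: group A, last two: group B

ece_global = expected_calibration_error(confidences, correct)
ece_group_a = expected_calibration_error(confidences[:2], correct[:2])
ece_group_b = expected_calibration_error(confidences[2:], correct[2:])
```

The global ECE is 0, so the confidence looks perfectly calibrated overall, while each group on its own shows an ECE of 0.5. Evaluating calibration per subgroup is one way to surface this kind of hidden weakness.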
Aggregation: Methods and measures in the area of aggregation deal with the combination of different models in a complex system. Key representatives of these ideas include ensembling, the combination of predictions from different methods, and temporal consistency, the combination of predictions at different time points.
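Both ideas can be sketched in a few lines. The example assumes that each model outputs a class-probability vector and that a per-frame detection score is available over time; averaging and exponential smoothing are only one simple choice among many possible aggregation schemes, and the smoothing factor alpha is a hypothetical parameter.

```python
def ensemble_average(prob_vectors):
    """Ensembling: average the class probabilities of several models."""
    n_models = len(prob_vectors)
    n_classes = len(prob_vectors[0])
    return [sum(p[k] for p in prob_vectors) / n_models for k in range(n_classes)]

def temporal_smooth(frame_scores, alpha=0.5):
    """Temporal consistency: exponential moving average over the scores
    of consecutive frames, damping single-frame outliers."""
    smoothed = []
    state = None
    for score in frame_scores:
        state = score if state is None else alpha * score + (1 - alpha) * state
        smoothed.append(state)
    return smoothed

# Two hypothetical models voting on the classes (pedestrian, background).
combined = ensemble_average([[0.6, 0.4], [0.8, 0.2]])

# Hypothetical pedestrian scores over three frames with a one-frame dropout.
smoothed = temporal_smooth([0.9, 0.1, 0.9], alpha=0.5)
```

The dropout in the second frame is softened from 0.1 to 0.5, so a downstream threshold is less likely to lose the pedestrian for a single frame.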
Verification and Validation: Methods and measures for verification attempt to define quantitative requirements for neural networks and to test networks against them systematically and comprehensively. Whether these requirements are sufficient to guarantee the safety of a system, or whether, for example, risks have been overlooked, is the subject of validation. For AI models, these established concepts from functional safety need to be rethought, with verification largely consisting of defining, selecting, compiling, and maintaining a test dataset. To determine the coverage quality of this test dataset, procedures are used to evaluate the completeness of activations of a layer for all training examples of a neural network (or a surrogate model).
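One simple coverage criterion of this kind is neuron coverage: the fraction of neurons in a layer that are activated above a threshold by at least one input of the dataset. The sketch below assumes the layer activations have already been recorded as one list per input; the threshold and the example values are hypothetical.

```python
def neuron_coverage(activations, threshold=0.0):
    """Fraction of neurons whose activation exceeds `threshold`
    for at least one input in the dataset."""
    n_neurons = len(activations[0])
    covered = [False] * n_neurons
    for per_input in activations:
        for j, value in enumerate(per_input):
            if value > threshold:
                covered[j] = True
    return sum(covered) / n_neurons

# Hypothetical activations of a 4-neuron layer for two inputs.
layer_activations = [
    [0.0, 1.2, 0.0, 0.3],
    [0.5, 0.9, 0.0, 0.0],
]
coverage = neuron_coverage(layer_activations)
```

Three of the four neurons fire at least once, giving a coverage of 0.75. The never-activated neuron indicates behavior of the network that this dataset does not exercise.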
Architecture: The way a neural network is constructed has a significant impact on both generalizability and robustness. While early architectures started with a few convolutional layers and gradually added more over time, today's networks use specialized blocks that are integrated as a network within the network and are in part designed and combined automatically. These approaches are complemented by methods of multi-task learning, where a network is trained on multiple tasks, requiring its feature representation to be versatile.
Compression: Lastly, the inference time and the associated computational cost of neural networks can prohibit their use in safety-critical applications, because the required results cannot be computed in an acceptable time or would consume too much energy. A range of solutions fall under the term pruning. The goal here is to determine the influence of weights, neurons, or layers on the loss value and gradually remove those elements with minimal impact. Dataset-independent methods use different heuristics to assess the relevance of individual filters based on statistical properties.
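A common dataset-independent heuristic is magnitude pruning, which treats the absolute value of a weight as a proxy for its influence. The sketch below zeroes out the smallest weights of a flat weight list; the sparsity level and the weights are hypothetical, and real pipelines usually prune gradually and fine-tune between steps.

```python
def magnitude_prune(weights, sparsity):
    """Set the fraction `sparsity` of weights with the smallest
    absolute value to zero and return the pruned copy."""
    n_prune = int(len(weights) * sparsity)
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = list(weights)
    for i in order[:n_prune]:
        pruned[i] = 0.0
    return pruned

# Hypothetical weights of one layer, pruned to 50% sparsity.
layer_weights = [0.8, -0.05, 0.3, 0.01, -0.6, 0.02]
pruned_weights = magnitude_prune(layer_weights, sparsity=0.5)
```

The three weights closest to zero are removed, while the larger weights that dominate the layer's output are kept unchanged.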
Challenges with Far-Reaching Impact
Modern AI is defined not only by the pursuit of ever-higher performance. Researchers and practitioners today must also be aware of the safety-related weaknesses of neural networks and stay abreast of the most important developments in addressing them. While a comprehensive solution does not seem imminent at present, the methods presented in this article have high practical relevance and already have far-reaching implications for a range of commercial applications.
More information can be found on the AI Security/KI Absicherung project page (in German) or in the corresponding survey paper.
Inspect, Understand, Overcome: A Survey of Practical Methods for AI Safety
S. Houben, S. Abrecht, M. Akila, A. Bär, F. Brockherde, P. Feifel, T. Fingscheidt, S. S. Gannamaneni, S. E. Ghobadi, A. Hammam, A. Haselhoff, F. Hauser, C. Heinzemann, M. Hoffmann, N. Kapoor, F. Kappel, M. Klingner, J. Kronenberger, F. Küppers, J. Löhdefink, M. Mlynarski, M. Mock, F. Mualla, S. Pavlitskaya, M. Poretschkin, A. Pohl, V. Ravi-Kumar, J. Rosenzweig, M. Rottmann, S. Rüping, T. Sämann, J. D. Schneider, E. Schulz, G. Schwalbe, J. Sicking, T. Srivastava, S. Varghese, Weber, S. Wirkert, T. Wirtz, M. Woehrle, 2021