Why AI needs to be explainable

AI's explainability
© Nature Communications / CC BY

Today, Machine Learning methods are already applied in many everyday situations: in autonomous driving, where Artificial Intelligence (AI) identifies pedestrians and intersections, or in the unlocking of modern smartphones and laptops through facial recognition. When AI makes decisions that affect people, it is crucial that the decision-making processes are understandable to both experts and users.

For example, in the context of autonomous driving, trust in the correct decision-making of such methods is indispensable for the public acceptance and use of these technologies. Ensuring that the underlying AI algorithms are free from inherent biases is equally critical. In automated decisions, such as determining a person’s creditworthiness, discrimination based on factors like origin or gender should not occur. The General Data Protection Regulation (GDPR), applied in all European Union member states since 2018, is also the topic of ongoing debate among experts. The discussion revolves around whether the GDPR grants all citizens the right to an explanation if a decision affecting them has been made using automated processes, such as Machine Learning methods.

Making complex processes transparent

In the research area of Machine Learning, neural networks have emerged as popular methods for solving decision problems in recent years. For instance, neural networks can learn from images to identify the objects depicted in them, a decision problem known as classification. However, compared to many earlier methods, the decision processes of neural networks are considerably harder, if not nearly impossible, to comprehend directly because of their complexity. Despite this, neural networks often achieve higher accuracy than other Machine Learning methods that also classify images. In Machine Learning there is generally a trade-off between simpler methods, which are often not accurate enough, and complex methods with high accuracy.

Current research on the explainability of Artificial Intelligence therefore often focuses on making the decision-making process of complex neural networks more transparent. One approach to explainability is to find an appropriate weighting for each feature that is relevant to a decision. For a single decision, such as determining a person’s creditworthiness, features like age, expenditures over the last year, and previous creditworthiness assessments can be decisive. A high, positive weighting means that the feature contributes positively to the assessed creditworthiness. When the features are the pixels of an image, the weightings can themselves be displayed as an image, with a weight assigned to each pixel. High weights are often visualized in prominent red, while low weights are depicted in blue tones.
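To make the idea more concrete, the following minimal sketch computes such per-pixel weightings with one common technique: the gradient of the predicted class score with respect to the input image. It only illustrates the general idea and is not necessarily the method behind the figures in this post; the tiny classifier and the random input image are placeholders.

import torch
import torch.nn as nn

# Tiny stand-in classifier (purely illustrative; any image classifier works)
model = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
    nn.Linear(8, 2),   # two classes, e.g. "train" / "no train"
)
model.eval()

# Placeholder input image; gradients with respect to it are requested
image = torch.rand(1, 3, 224, 224, requires_grad=True)

scores = model(image)
top_class = scores.argmax(dim=1).item()

# Gradient of the winning class score with respect to every input pixel
scores[0, top_class].backward()

# Collapse the color channels so that each pixel gets a single weight
saliency = image.grad.abs().max(dim=1).values.squeeze(0)  # shape: 224 x 224

The resulting map of weights can then be overlaid on the original image, with high weights shown in red and low weights in blue, as in the figure below.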

High accuracy alone is not enough

The illustration below provides a visualization of such decision weightings. In this application, a system trained with automated learning processes was to decide whether a given photo depicts a train. Initially, researchers observed that the neural network could very accurately distinguish between images with trains and those without. However, the transparent breakdown of the decision process revealed that the algorithm had not actually learned to identify trains. Instead, its decision relied primarily on recognizing rails. Although the algorithm achieved high accuracy in identifying images with trains, it based its decision on the wrong features.

The neural network decided that a train is depicted in this image. However, an insight into the decision-making process (right-hand image) shows that the model has only learned to identify rails.
© Nature Communications / CC BY

In practice, a model like the one above should not be used, because it has not learned which features actually constitute a train. For images showing only rails and no train, the model might incorrectly decide that a train is present. For safety-critical applications such as autonomous driving, this behavior is unacceptable. Consider a scenario in which such a neural network is used to recognize pedestrians. If the model identifies pedestrians based on the paving of the sidewalk rather than on the people themselves, the same pedestrian may not be recognized on a differently colored bike path. Identifying such errors before deploying a neural network is therefore crucial, particularly for safety-critical use cases.
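As a sketch of how such an error could be caught before deployment, one could measure how often the trained classifier predicts "train" on a held-out set of images that show rails but no train. The names model and rails_only_loader and the class index below are assumptions made for this illustration, not details from the study described above.

import torch

@torch.no_grad()
def false_positive_rate(model, rails_only_loader, train_class=1):
    # Fraction of rails-only images that the classifier labels as "train".
    # rails_only_loader is assumed to yield batches of images without trains.
    model.eval()
    false_positives, total = 0, 0
    for images in rails_only_loader:
        predictions = model(images).argmax(dim=1)
        false_positives += (predictions == train_class).sum().item()
        total += images.shape[0]
    return false_positives / total

A rate far above the model's usual error level would indicate that its decisions are driven by the rails rather than by the trains themselves.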

While neural networks are indispensable in current Machine Learning research due to their high accuracy, they are presently ill-suited for safety-critical areas: certifying them is difficult because they quickly become too complex. However, initial approaches that make decision processes more transparent already exist. The method illustrated above, which uses visual representations of the weightings, makes it possible to check whether the neural network has internalized the intended concepts (such as the visual features of a train) or whether it erroneously bases its decisions on other features that often co-occur (such as the rails in the example above). Further developing such approaches for deciphering the decisions of complex learning methods is therefore a central research goal, so that the advantages of neural networks can be made accessible to the general public.

For further reading, there is a wealth of additional material on the explainability of Artificial Intelligence, including the book “Interpretable Machine Learning – A Guide for Making Black Box Models Explainable.”

Matthias Jakobs
13 January 2021


Matthias Jakobs

Matthias Jakobs focuses his research on trustworthy machine learning. He is currently working on various problems, including providing guarantees for explanation methods based on Shapley values, both in theory and in practical applications. Additionally, he is exploring the combination of explainability models with Bayesian Neural Networks (BNNs). He is particularly interested in illuminating the decision-making process of black-box models to instill greater trust in the decisions made by neural networks, […]
