Insights from the AAAI XAI4Sci Explainable Machine Learning for Sciences Workshop


Science is about discovery and getting closer to understanding the natural world. The scientific method follows a process of investigating a question, formulating hypotheses, experimental testing, analysis, and conclusion reporting. Machine Learning is used in and for this process, and because most models are opaque, the need for model explanations has become a key concern.

In general, explanations are needed to better understand model behavior and build trust in Machine Learning predictions. However, in the sciences, the need for explanations goes beyond these points. In scientific domains, explanations are essential for validation and for advancing knowledge about the underlying process. It matters less that explanations are “appealing”; rather, they must be comprehensible and verifiable, e.g., for the purpose of generating new hypotheses.

So far, approaches to explainable Machine Learning for the sciences have not been sufficiently discussed. Moreover, the state of the art in explainable Machine Learning faces challenges such as differing explanations produced by different explanation methods or unstable explanations. Hence, it is vital to examine more closely what works, what does not, and what is missing.

At the AAAI Conference on Artificial Intelligence, a workshop was organized to shed light on this topic and answer precisely these questions. The goal of the “XAI4Sci: Explainable Machine Learning for Sciences” workshop was to initiate exchange and to create a space for discussion about what makes explanations unique in the scientific domain and what challenges and potentials may arise in the future.

Here, we provide insights from the workshop by summarizing the talks and placing the findings in a broader context. In particular, we will see that current explanation methods do not necessarily work and will explore alternative approaches.

Workshop structure

The workshop started off with talks about prior knowledge and cognition and then progressed to application examples in physics, medicine, and materials science.

In total, eight speakers provided input on the topic. In the following, we briefly introduce each topic and highlight take-aways and further reading.

Exploring prior knowledge

Knowledge is a key component of science: What knowledge do we already have, and how can we generate new knowledge? This is why it is interesting to look at how prior knowledge can be used for the goal of explainable Machine Learning.

Often, we encounter the challenge that model outputs do not align with domain expertise, but there is research specifically addressing this issue. The talk introduced three approaches that utilize prior knowledge for explainability. The first approach allows the integration of additional knowledge as individual, interpretable components and thereby increases the interpretability of the whole system. An important research area in this direction is physics-informed Machine Learning, which integrates physical constraints directly into the Machine Learning pipeline. The second approach integrates knowledge into the explainability component and can serve as a filtering step to remove explanation candidates that do not conform to prior knowledge. The third approach uses explanations to gain insight into the underlying process. These insights can be seen as new knowledge, which is in turn integrated into the pipeline. This approach has potential for iterative insight via improvement loops.
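To make the first approach concrete, here is a minimal sketch of physics-informed fitting. This is our own toy example, not from the talk: a flexible polynomial model is fit to noisy decay data, and an extra penalty on the residual of an assumed known governing law dy/dt + k·y = 0 constrains the fit toward physically plausible solutions.

```python
import numpy as np

# Toy sketch of physics-informed fitting (illustrative only, not a full
# physics-informed neural network). All names and the setup are ours.
rng = np.random.default_rng(0)
k = 1.5                                 # known physical decay constant
t = np.linspace(0.0, 2.0, 15)           # few, noisy observations
y_obs = np.exp(-k * t) + 0.05 * rng.standard_normal(t.size)

deg = 6
Phi = np.vander(t, deg, increasing=True)           # polynomial basis 1, t, ..., t^5
dPhi = np.hstack([np.zeros((t.size, 1)),           # d/dt of each basis column
                  Phi[:, :-1] * np.arange(1, deg)])

def fit(lam):
    # least squares on the data term plus lam * residual of dy/dt + k*y = 0
    A = np.vstack([Phi, np.sqrt(lam) * (dPhi + k * Phi)])
    b = np.concatenate([y_obs, np.zeros(t.size)])
    w, *_ = np.linalg.lstsq(A, b, rcond=None)
    return Phi @ w

err_plain = np.mean((fit(0.0) - np.exp(-k * t)) ** 2)    # data-only fit
err_pinn = np.mean((fit(10.0) - np.exp(-k * t)) ** 2)    # physics-constrained fit
```

With the physics term active, the fit is pulled toward functions satisfying the decay law, which typically stabilizes it when data is sparse or noisy.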

Take-aways from the talk and the discussion

Sometimes it is not possible to integrate knowledge, e.g., because the knowledge is intuitive. An example of this is experienced medical personnel who can recognize whether a patient is getting worse without being able to provide quantitative measures such as changes in skin color, sweating, or the like.

We can also have knowledge which does not align with the information in the training data, e.g., because the training data is flawed or lacks certain data. This poses a challenge. In one of our current projects we are re-annotating training data, which is, of course, non-trivial and laborious. An interesting question is whether there are already studies investigating the impact of integrating “wrong” knowledge. Possibly, there is adjacent work in the area of Adversarial Attacks.

Talk title: How prior knowledge can be utilized for Explainable Machine Learning

Speaker: Katharina Beckh

For more details on the three approaches for knowledge-driven explainable Machine Learning we refer to the corresponding blog post and publication.

Exploring uncertainty

While knowledge is validated information that usually refers to hard facts, the sciences are often confronted with uncertainty. Hence, the second talk explored uncertainty, in particular, how well humans can learn environmental uncertainty.

When faced with the task of learning environmental uncertainty, it is actually unclear how our brain internalizes uncertainty with limited cognitive resources. A user study explored how well the probability distribution reported by users aligned with the actual distribution. The users received sequential observations, e.g., in the form of the landing locations of lava rocks, and were asked to predict the probability density of future landing locations. A density estimation framework was presented to model the human behavior.
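As a rough illustration of the task the participants faced (our own sketch, not the study's framework), a simple kernel density estimate turns sequential landing observations into an estimated probability density:

```python
import numpy as np

# Hypothetical stand-in for "landing locations": a two-cluster mixture,
# loosely mirroring a two-volcano setting. Names and numbers are ours.
rng = np.random.default_rng(1)
obs = np.concatenate([rng.normal(-2.0, 0.5, 100), rng.normal(2.0, 0.5, 100)])

def kde(x, data, bandwidth=0.3):
    # average of Gaussian kernels centred on each observed landing location
    z = (x[:, None] - data[None, :]) / bandwidth
    return np.exp(-0.5 * z ** 2).sum(axis=1) / (data.size * bandwidth * np.sqrt(2 * np.pi))

grid = np.linspace(-5.0, 5.0, 401)
density = kde(grid, obs)   # bimodal estimate peaking near the two cluster centres
```

The study's question is, in essence, how closely the density a human reports after seeing `obs` matches such an estimate of the true distribution.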

Take-aways from the talk and the discussion

The talk and discussion left open the question of how exactly the notion of uncertainty representation and explainable Machine Learning are linked. What can be said is that the topic is less about how we should generate explanations and more about how intelligent systems should work, namely how neural systems could manage uncertainty and how we can explain human behavior regarding uncertainty. The most significant take-away was that humans are indeed able to learn good density models, despite a bias to overestimate the number of clusters (e.g., volcanoes). From the study design, we learned that it is difficult to model memory and to find a good user study set-up. Especially the last point is very relevant for evaluating explanations in general.

Talk title: Uncertainty representations in cognitive systems

Speaker: Kevin Li

The talk is based on the paper “Bounded rationality in structured density estimation”, which provides more technical insights into the density estimation framework.

Application: Face processing

In the first application example, advances and insights from face recognition were presented. It was shown that with neural networks, error rates dropped year after year, eventually surpassing human performance. The other-race effect is the phenomenon that we recognize own-race faces better than other-race faces; it was found that this effect also occurs for algorithms. For frontal images, Machine Learning models perform better than the majority of people, even compared to super recognizers (people who are especially skilled at recognizing faces).

Take-aways from the talk and the discussion

Similar to the previous talk, this talk only indirectly linked to explainable Machine Learning. In the discussion, we learned that post-hoc methods (LIME and GradCAM) were tested, and the results were not striking. It remains questionable whether these methods would be helpful for the task of face recognition. It would also be interesting to better understand what the system focuses on. Typical features such as the distance between the eyes were not separately investigated, and it remains unclear what examiners, especially super recognizers, are looking at in contrast to, e.g., students. This analysis could be realized, for instance, using eye-tracking technology.

Talk title: Challenges in using deep learning to model the face processing in humans

Speaker: P. Jonathon Phillips

For more information on face recognition systems and the other-race effect, we refer the reader to the published works of Phillips and recommend the following paper as a starting point.

Application: Medicine

Medicine is a high-risk field which especially requires interpretability. The two speakers both advocated strongly for interpretable models and against post-hoc approaches.

In the first talk, an approach was introduced to transform a black box model into an interpretable model. To give some background on the use case: Breast cancer is the most commonly diagnosed cancer worldwide. Regular screening is performed for women over 40. However, mammography in general is a challenging task, with apparently 1 in 5 cases being missed. The goal is to reduce the screening burden by predicting cancer risk for the next 5 years. A state-of-the-art model has good predictive performance, but due to its lack of interpretability, it is unclear which features it considers. Through experimentation with the model, it was found, somewhat by accident, that the dissimilarity between the left and right breast is a very good predictor of breast cancer risk. With this, an interpretable model was developed that takes the localized dissimilarities between both sides into account. The new system has performance comparable to the black box, and no trade-off between accuracy and interpretability was found.

The second talk highlighted which models work particularly well in practice. Here, the use case was sepsis prediction. It was recommended to use explainable boosting machines and GAM Changer. GAMs are additive models which allow inspection of, e.g., local overfitting. Since sepsis is comparatively rare, a model may learn spurious correlations. With GAMs, it is possible to inspect the learned shape functions and smooth such artifacts out.
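The kind of inspection and manual smoothing described here can be sketched with a hand-rolled, one-feature shape function (a toy stand-in, not the EBM or GAM Changer implementations):

```python
import numpy as np

# Toy GAM-style shape function: per-bin averages of the target over one
# feature. A spurious spike (our artificial artifact at x ~ 7.1) is first
# located and then manually smoothed out, mimicking a GAM Changer edit.
rng = np.random.default_rng(2)
x = rng.uniform(0.0, 10.0, 2000)
y = (np.sin(x) + np.where(np.abs(x - 7.1) < 0.1, 5.0, 0.0)
     + 0.1 * rng.standard_normal(x.size))

bins = np.linspace(0.0, 10.0, 51)
idx = np.digitize(x, bins) - 1
shape = np.array([y[idx == b].mean() for b in range(50)])   # learned shape function

# inspection: the spike dominates; in practice an expert spots it in a plot
spike = int(np.argmax(shape))
edited = shape.copy()
edited[spike] = 0.5 * (shape[spike - 1] + shape[spike + 1])  # expert smoothing edit
```

The edited shape function keeps the smooth trend while removing the spurious bump, which is exactly the kind of targeted, inspectable change that opaque models would require retraining for.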

Take-aways from the talks and the discussion

The process of taking a well-performing black box and trying to understand the reasons for its decisions may work for other systems, too. It beautifully showcases the process of using a black box, deriving insight, and using that insight to build a model that is more suitable for the application scenario.

Models which allow for easy iterative changes are preferred over opaque ones which would require time-consuming retraining. In the latter case, it is still necessary to reason from the data and consider, e.g., measurement errors and missing entries.

Talk title: We turned a black box into a scientific discovery and an interpretable model: a study in predicting breast cancer years in advance

Speaker: Cynthia Rudin

Talk title: Safety and explainability in clinical predictive models

Speaker: Peter A. Stella

The system for breast cancer risk prediction is described in more detail in this paper.

The visual analytics tool described in the second talk can be found here.

Application: Materials Science

A domain with lower risk than medicine is materials science. In fact, simulations in materials engineering experiments are comparatively cheap.

Materials science is confronted with the question of why a particular nanophotonic structure works. This insight would enable the creation of optimized devices. In the quest to gain a better functional understanding of why a structure performs well, post-hoc methods were used. The output of these methods served as the basis to explore and escape the local minima of the optimization algorithm. Another step filters out candidates which are impossible to fabricate.

Take-aways from the talk and the discussion

The talk nicely showed that the use of post-hoc methods can be beneficial in scientific domains where risk and experimental costs are low. This contrasts with the medical domain, which has much higher risk and associated costs. Moreover, it seems that studying the latent space representation is a difficult endeavor, as it is hard to interpret, and thus currently not a solution. Ultimately, the post-hoc methods did not provide insights into the relevant features, leaving the deeper principles unknown. This also raises the question of whether post-hoc methods are sufficient for achieving this kind of understanding at all.

Talk title: Explainable AI to both elucidate and optimize the design of complex optical materials and devices

Speaker: Aaswath P. Raman

For more technical insights on the topic of designing nanophotonic structures, we refer to the following publication.

Application: Physics

Physics is a field that has been engaged in interpretable Machine Learning for some time now, thanks to research on physics-informed Machine Learning. The idea is to incorporate physical constraints and principles into Machine Learning to bring the model behavior closer to the desired behavior.  

The first talk presented several examples of physics-informed Machine Learning, among them the use of Graph Neural Networks (GNNs) and the finding that physical relations can be extracted from the model via symbolic regression. The second talk focused on the construction of white-box models in physics. The aim is a model that can reveal the underlying physical laws of high-energy particle collisions. In particular, the work focused on parton showers and utilized Generative Adversarial Networks (GANs). With this, the underlying parton branching mechanism can be recovered. It is proposed that white-box construction can be helpful in several areas of high-energy physics phenomenology.
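The symbolic-regression idea can be illustrated in miniature (our own sketch; the actual work runs a full symbolic-regression search over a learned GNN component): given input-output pairs from a learned function, test a small library of candidate expressions and keep the best fit.

```python
import numpy as np

# Stand-in for a learned GNN message function that actually encodes an
# inverse-square law; the names and the candidate library are ours.
rng = np.random.default_rng(3)
r = rng.uniform(0.5, 5.0, 200)
learned_output = 1.0 / r ** 2 + 0.001 * rng.standard_normal(r.size)

candidates = {
    "r": r,
    "1/r": 1.0 / r,
    "1/r^2": 1.0 / r ** 2,
    "log r": np.log(r),
}

def fit_error(basis):
    # best scalar coefficient c minimising ||c * basis - learned_output||^2
    c = basis @ learned_output / (basis @ basis)
    return float(np.mean((c * basis - learned_output) ** 2))

best = min(candidates, key=lambda name: fit_error(candidates[name]))
# 'best' picks out the inverse-square form from the candidate library
```

Real symbolic-regression tools search over compositions of operators rather than a fixed library, but the principle is the same: replace an opaque learned component with the simplest expression that reproduces its behavior.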

Take-aways from the talks and the discussion

The benefit of injecting prior knowledge into the inductive bias of architectures was highlighted, e.g., aligning architectural parts with mechanistic concepts. At the same time, it was pointed out that we can be fooled by misalignment. During the discussion, it was asked why GANs were used for transparency goals. In the second talk, GANs had the benefit of enabling data augmentation and the possibility to rerun simulations. Interestingly, the physics talks did not mention post-hoc methods, and it remains unclear whether this is a useful approach or not.

Talk title: Explainable Generative Adversarial Network for Particle Shower

Speaker: Yue Shi Lai

Talk title: Injecting knowledge and extracting insight: promise and perils

Speaker: Kyle Cranmer

For a more in-depth read on distilling symbolic representations of GNNs, we refer to the following link.

The details of the white box approach for high-energy particle collisions can be found here.


Conclusion

From the talks, it became clear that the explainability approaches in the various domains are different and risk-dependent. While medicine strongly advocates for interpretable models, materials science utilizes post-hoc methods for structure design purposes. Throughout the workshop, I got the impression that many speakers and participants are critical of post-hoc methods. Explainable ML in the sciences is about understanding underlying processes. In certain areas, it is costly if uncertainty remains about whether the methods reflect the actual model behavior. The phenomenon that explanation methods differ (disagreement) certainly contributes to this critical attitude.

Outside of this workshop, I have the impression that a lot of people are still under the misconception that the use of an ML system will magically solve a problem or lead to understanding. We have seen in several of the talks that there is still a human who interprets the results and reasons about the data. So, one major take-away is that ML will not solve scientific problems on its own, but rather with a human in the loop. In these cases, it is much more important to design methods that help humans achieve their “understanding goals” – taking human cognition into account. Of course, exceptions such as constrained use or the example of face recognition remain.

For future work, it will be interesting to see more research that investigates how we can leverage black boxes to better understand an underlying phenomenon – especially with the amount of language models that are available now.

Katharina Beckh

Katharina Beckh is a research associate at the Lamarr site of Fraunhofer IAIS. Her research focuses on human-oriented Machine Learning. She is working on interactive learning algorithms and explainable language models.
