In a new study, researchers at the University of Bonn led by Prof. Dr. Jürgen Bajorath, Chair of Life Sciences and Principal Investigator at the Lamarr Institute for Machine Learning and Artificial Intelligence, report groundbreaking findings on the use of artificial intelligence (AI) in drug discovery. The study, published in Nature Machine Intelligence, sheds new light on how AI applications previously regarded as black boxes actually work.
The study analyzed graph neural networks (GNNs), a form of machine learning used in drug discovery to predict drug efficacy. Until now, it has been difficult to understand how these models arrive at their predictions, which is why they have been considered opaque.
Shedding light on the “black box” with the “EdgeSHAPer” analysis tool
The researchers developed the “EdgeSHAPer” method to examine in detail how six different GNN models work. The result was surprising: most of the GNNs largely memorized known data instead of learning the specific chemical interactions that matter in drug research.
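As the name suggests, EdgeSHAPer builds on Shapley values, which assign each edge (chemical bond) of a molecular graph a share of the model's prediction. The following minimal sketch illustrates the general idea with a standard permutation-sampling estimator. It is not the authors' implementation: `toy_model`, the edge list, and all scores are hypothetical stand-ins for a trained GNN and a real molecule.

```python
import random

def toy_model(edges):
    """Hypothetical stand-in for a GNN: maps a set of edges to a score."""
    score = 0.0
    if (0, 1) in edges:
        score += 1.0
    if (1, 2) in edges:
        score += 0.5
    if (0, 1) in edges and (1, 2) in edges:
        score += 0.5  # extra contribution when both edges are present
    return score

def edge_shapley(edges, model, n_samples=2000, seed=0):
    """Estimate a Shapley value per edge by sampling random edge orderings."""
    rng = random.Random(seed)
    phi = {e: 0.0 for e in edges}
    for _ in range(n_samples):
        order = list(edges)
        rng.shuffle(order)
        present = set()
        prev = model(present)
        for e in order:
            # Marginal contribution of adding edge e to the edges seen so far
            present.add(e)
            cur = model(present)
            phi[e] += cur - prev
            prev = cur
    return {e: total / n_samples for e, total in phi.items()}

# Hypothetical four-atom chain; the edge (2, 3) is ignored by toy_model.
edges = [(0, 1), (1, 2), (2, 3)]
for edge, value in edge_shapley(edges, toy_model).items():
    print(edge, round(value, 2))
```

In this toy case the edge that the model ignores receives a value of zero, and the values of all edges add up to the difference between the score of the full graph and that of the empty graph. Inspecting which bonds receive high values is what allows an analysis of this kind to tell whether a model relies on interaction-relevant parts of a graph.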
Prof. Dr. Jürgen Bajorath, Principal Investigator of the Lamarr Institute and renowned cheminformatics researcher at the University of Bonn and the Bonn Aachen International Center for Information Technology (b-it), commented on the results: “The GNNs are very dependent on the data they are trained with. Instead of focusing on the core aspects of specific protein-drug interactions, the models mainly remember chemically similar molecules that they have learned about in training.”
The scientists compared the GNNs’ behavior to the “Clever Hans effect”, a reference to a horse that appeared to be able to do arithmetic but was in fact reading subtle cues from its surroundings, and so often arrived at the correct result. Prof. Bajorath emphasized that the predictions of GNNs are largely overrated, as predictions of similar quality can be made using chemical knowledge and simpler methods.
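To make the idea of a “simpler method” concrete, here is a minimal sketch of a similarity-based baseline that predicts a value for a new molecule by copying the value of its most similar training molecule. This is an illustrative assumption, not necessarily a baseline used in the study: the feature sets, the values, and the function names are all hypothetical.

```python
def tanimoto(a, b):
    """Tanimoto similarity between two sets of structural features."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0

def nearest_neighbor_predict(query, training):
    """Return the value of the training molecule most similar to the query."""
    _, best_value = max(training, key=lambda item: tanimoto(query, item[0]))
    return best_value

# Hypothetical training set: (feature set, measured value)
training = [
    (frozenset({1, 2, 3, 4}), 8.1),
    (frozenset({7, 8, 9}), 5.2),
    (frozenset({1, 2, 8}), 6.4),
]
query = frozenset({1, 2, 3, 5})
print(nearest_neighbor_predict(query, training))
```

A baseline of this kind has no notion of protein-drug interactions at all. If a complex model does not clearly outperform it, that suggests the model is mostly exploiting similarity to molecules seen in training.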
New opportunities for drug research and the prediction of drug efficacy
Despite this critical assessment of current AI applications, our Principal Investigator Prof. Bajorath sees room for improvement. Some of the GNN models studied tended to learn more interactions as the efficacy of known drugs increased. According to Bajorath, this tendency could be strengthened through modified training methods.
The Lamarr PI sees the results as an opportunity to bring more transparency into the “black box” of artificial intelligence. Analysis tools such as “EdgeSHAPer” could help explain how complex models arrive at their predictions.