Talk at AI4X Conference by Lamarr PI Prof. Jürgen Bajorath
Lamarr Principal Investigator Prof. Dr. Jürgen Bajorath will speak at the upcoming AI4X Conference, presenting his talk “Chemical Language Models and Their Learning Characteristics”.
The AI4X Conference brings together leading voices in AI for science and industry, showcasing cutting-edge research at the intersection of machine learning, life sciences, and engineering.
In this context, Prof. Bajorath will explore how chemical language models (CLMs)—deep learning architectures adapted from natural language processing—are used in drug discovery to generate novel molecular structures under specific property constraints.
Abstract
In the life sciences and drug discovery, a variety of generative machine learning models are utilized for different applications. Among these are chemical language models (CLMs) that are based on deep learning architectures adopted from natural language processing. CLMs learn textual representations of molecular structure and probability distributions to predict new chemical matter and are often conditioned by context-dependent rules such as specific property constraints. Transformers have become preferred CLM architectures. Hallmarks of transformer CLMs include the self-attention mechanism and the ability to learn a variety of mappings of molecular representations and associated property measures. The ensuing versatility of CLMs in addressing different machine translation tasks provides new opportunities for generative molecular design. Transformer CLMs often deliver promising results in off-the-beaten-path prediction tasks. However, rationalizing predictions of these models is challenging and a topical issue in explainable artificial intelligence (XAI). So far, transformer predictions have mostly been analyzed by determining attention weight distributions and attention flow, but other approaches are beginning to emerge. For instance, depending on the application, careful control calculations often help to unveil model-specific learning characteristics. This is often crucial to avoid over-interpretation of predictions or confusion caused by “Clever Hans” effects.
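As a rough illustration of what "attention weight distributions" over a textual molecular representation can look like, the following minimal sketch tokenizes a SMILES string and computes scaled dot-product self-attention weights. It is not taken from the talk: the tokenizer pattern, the `embed` function, and the use of identical query and key vectors (no learned projections) are simplifying assumptions made here for illustration only; a trained transformer CLM would learn its embeddings and projections.

```python
import math
import re

# Hypothetical, simplified SMILES tokenizer (an assumption for this sketch):
# bracket atoms, the two-letter halogens Cl and Br, then any single character.
TOKEN_PATTERN = re.compile(r"\[[^\]]+\]|Br|Cl|.")


def tokenize(smiles):
    """Split a SMILES string into tokens."""
    return TOKEN_PATTERN.findall(smiles)


def embed(token, dim=4):
    """Toy deterministic embedding; a real CLM would learn these vectors."""
    code = sum(ord(ch) for ch in token)
    return [math.sin(code * (i + 1)) for i in range(dim)]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_weights(tokens, dim=4):
    """Row i holds the attention distribution of token i over all tokens."""
    vectors = [embed(t, dim) for t in tokens]
    scale = math.sqrt(dim)
    weights = []
    for query in vectors:
        # Queries and keys are the same toy vectors (no learned projections).
        scores = [sum(a * b for a, b in zip(query, key)) / scale for key in vectors]
        weights.append(softmax(scores))
    return weights


tokens = tokenize("CC(=O)Cl")  # acetyl chloride
weights = attention_weights(tokens)
for token, row in zip(tokens, weights):
    print(token, [round(w, 2) for w in row])
```

Each printed row is a probability distribution over the input tokens. Inspecting such rows is the basic idea behind attention-based analysis, although with toy embeddings the values carry no chemical meaning.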
Details
Date
10 July 2025
16:20 - 16:40
Topics
Life Sciences, Science