Unraveling learning characteristics of transformer models for molecular design
Transformer networks are the basis for large language models and are also widely used in different scientific fields, including drug design. Given their flexible architecture and signature (self-)attention mechanism, transformers are suitable for many generative modeling tasks. Predictions of transformer models, however, are difficult to rationalize. Indeed, concerns have been raised that these models can sometimes act as “Clever Hans” predictors, evoking a comparison to a famous early 20th-century horse that appeared to be able to count but was simply reading subtle body-language cues from his trainer. Such models may provide desirable results, but for reasons different from those anticipated, leading to a potentially false understanding of the causal relationships in the system. Here, we used sequence-based generative compound design as a test system to study the learning characteristics of transformer models. Our findings show that predictions of protein-sequence-based transformer models are purely statistically driven and that care should be taken not to over-interpret them.
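For reference, the (self-)attention mechanism named in the abstract can be summarized in a few lines. Below is a minimal, illustrative NumPy sketch of single-head scaled dot-product self-attention in its standard formulation; it is not the authors' model, and the function name, dimensions, and random weights are hypothetical placeholders.

import numpy as np

def self_attention(x, w_q, w_k, w_v):
    # x: (seq_len, d_model) token embeddings (hypothetical inputs)
    # w_q, w_k, w_v: (d_model, d_k) learned projection matrices
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    d_k = q.shape[-1]
    scores = q @ k.T / np.sqrt(d_k)               # pairwise token affinities
    scores -= scores.max(axis=-1, keepdims=True)  # numerically stable softmax
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v                            # attention-weighted mix of values

# Toy usage with hypothetical dimensions: 5 tokens, model width 8
rng = np.random.default_rng(0)
x = rng.normal(size=(5, 8))
w_q, w_k, w_v = (rng.normal(size=(8, 8)) for _ in range(3))
print(self_attention(x, w_q, w_k, w_v).shape)  # (5, 8)

Because the attention weights form an explicit soft pairing between tokens, they invite mechanistic interpretation; the abstract's caution is precisely that such model outputs are statistically driven and easily over-interpreted.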
- Published in: Patterns
- Type: Article
- Year: 2025
- Source: https://www.cell.com/patterns/abstract/S2666-3899(25)00240-5
Citation information:
Roth, J.P.; Bajorath, J. Unraveling learning characteristics of transformer models for molecular design. Patterns, October 2025, Elsevier. https://www.cell.com/patterns/abstract/S2666-3899(25)00240-5
@Article{Roth.Bajorath.2025a,
author={Roth, Jannik P. and Bajorath, Jürgen},
title={Unraveling learning characteristics of transformer models for molecular design},
journal={Patterns},
month={October},
publisher={Elsevier},
url={https://www.cell.com/patterns/abstract/S2666-3899(25)00240-5},
year={2025},
abstract={Transformer networks are the basis for large language models and are also widely used in different scientific fields, including drug design. Given their flexible architecture and signature (self-)attention mechanism, transformers are suitable for many generative modeling tasks. Predictions of transformer models, however, are difficult to rationalize. Indeed, concerns have been raised that these models can sometimes act as “Clever Hans” predictors, evoking a comparison to a famous early 20th-century horse that appeared to be able to count but was simply reading subtle body-language cues from his trainer. Such models may provide desirable results, but for reasons different from those anticipated, leading to a potentially false understanding of the causal relationships in the system. Here, we used sequence-based generative compound design as a test system to study the learning characteristics of transformer models. Our findings show that predictions of protein-sequence-based transformer models are purely statistically driven and that care should be taken not to over-interpret them.}}