Designing highly potent compounds using a chemical language model

Compound potency prediction is a major task in medicinal chemistry and drug design. Inspired by the concept of activity cliffs (which capture large differences in potency between structurally similar active compounds), we have devised a new methodology for predicting potent compounds from weakly potent input molecules. To this end, we implemented a chemical language model based on a conditional transformer architecture, in which compound design is guided by observed potency differences. The model was evaluated using a newly generated compound test system that enabled a rigorous assessment of its performance. It was shown to predict known potent compounds from different activity classes not encountered during training. Moreover, the model was capable of creating highly potent compounds that were structurally distinct from the input molecules. It also produced many novel candidate compounds not included in the test sets. Taken together, the findings confirmed the ability of the new methodology to generate structurally diverse, highly potent compounds.

  • Published in:
    Scientific Reports
  • Type:
    Article
  • Authors:
    Chen, Hengwei; Bajorath, Jürgen
  • Year:
    2023

Citation information

Chen, Hengwei; Bajorath, Jürgen: Designing highly potent compounds using a chemical language model, Scientific Reports, 2023, 13, 7412, https://www.nature.com/articles/s41598-023-34683-x#citeas

Associated Lamarr Researchers

Prof. Dr. Jürgen Bajorath

Area Chair Life Sciences