Uncovering and tackling fundamental limitations of compound potency predictions using machine learning models
Molecular property predictions play a central role in computer-aided drug discovery. Although a variety of physicochemical (e.g., solubility or chemical reactivity) or physiological properties (e.g., metabolic stability or toxicity) can be predicted, biological activity is by far the most frequently investigated compound feature. Activity predictions are carried out in a qualitative (target-based activity, through compound classification) or quantitative (compound potency or ligand-target affinity, through regression modeling) manner. Many studies have evaluated and compared different machine learning methods for activity and potency predictions, recently with a focus on deep learning. Regardless of the methods used, these studies generally rely on conventional benchmark settings. Recent work has shown that potency prediction benchmarks have severe general limitations that have long been unnoticed but prevent a reliable assessment of different methods and their relative performance. In this perspective, we outline general limitations of benchmark settings for compound potency predictions, introduce potential alternatives enabling a more realistic assessment of state-of-the-art predictive models, and discuss future directions for elucidating predictions and further increasing their impact.
- Published in: Cell Reports Physical Science
- Type: Article
- Authors: Janela, Tiago; Bajorath, Jürgen
- Year: 2024
- Source: https://www.cell.com/cell-reports-physical-science/fulltext/S2666-3864(24)00248-0
Citation information
Janela, Tiago; Bajorath, Jürgen: Uncovering and tackling fundamental limitations of compound potency predictions using machine learning models, Cell Reports Physical Science, 2024, 5, https://www.cell.com/cell-reports-physical-science/fulltext/S2666-3864(24)00248-0
@Article{Janela.Bajorath.2024a,
author={Janela, Tiago and Bajorath, Jürgen},
title={Uncovering and tackling fundamental limitations of compound potency predictions using machine learning models},
journal={Cell Reports Physical Science},
volume={5},
url={https://www.cell.com/cell-reports-physical-science/fulltext/S2666-3864(24)00248-0},
year={2024},
abstract={Molecular property predictions play a central role in computer-aided drug discovery. Although a variety of physicochemical (e.g., solubility or chemical reactivity) or physiological properties (e.g., metabolic stability or toxicity) can be predicted, biological activity is by far the most frequently investigated compound feature. Activity predictions are carried out in a qualitative (target-based...}}