Context-dependent similarity analysis of analogue series for structure–activity relationship transfer based on a concept from natural language processing
Analogue series (AS) are generated during compound optimization in medicinal chemistry and are the major source of structure–activity relationship (SAR) information. Pairs of active AS consisting of compounds with corresponding substituents and comparable potency progression represent SAR transfer events for the same target or across different targets. We report a new computational approach to systematically search for SAR transfer series that combines an AS alignment algorithm with context-dependent similarity assessment based on vector embeddings adapted from natural language processing. The methodology comprehensively accounts for substituent similarity, identifies non-classical bioisosteres, captures substituent-property relationships, and generates accurate AS alignments. Context-dependent similarity assessment is conceptually novel in computational medicinal chemistry and should also be of interest for other applications.
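As a rough illustration of what context-dependent similarity can mean here, the sketch below treats each analogue series as a "sentence" of substituent tokens and compares substituents by the contexts they occur in. It is not the authors' implementation: the toy series, the `window` parameter, and the function names are hypothetical, and plain co-occurrence counts stand in for the learned vector embeddings mentioned in the abstract.

```python
from collections import Counter, defaultdict
from math import sqrt

# Hypothetical toy data (not from the paper): each analogue series is an
# ordered list of substituent labels, e.g. ordered by potency progression.
series = [
    ["H", "F", "Cl", "OMe"],
    ["H", "Cl", "Br", "OMe"],
    ["H", "F", "Br", "CF3"],
]


def cooccurrence_vectors(all_series, window=1):
    """Build one sparse context vector per substituent.

    Each vector counts which other substituents occur within `window`
    positions of the substituent across all series.
    """
    vectors = defaultdict(Counter)
    for s in all_series:
        for i, token in enumerate(s):
            lo, hi = max(0, i - window), min(len(s), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[token][s[j]] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity of two sparse count vectors (0.0 if either is empty)."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm_u = sqrt(sum(x * x for x in u.values()))
    norm_v = sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


vectors = cooccurrence_vectors(series)
sim_cl_f = cosine(vectors["Cl"], vectors["F"])      # share the contexts H and Br
sim_h_cf3 = cosine(vectors["H"], vectors["CF3"])    # no shared context
```

In this toy setting, two substituents score as similar when they appear next to the same neighbours across series, regardless of their structural resemblance. A real workflow would likely replace the count vectors with trained embeddings and feed the resulting similarities into a series alignment step.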
- Published in: Journal of Cheminformatics
- Type: Article
- Authors: Yoshimori, Atsushi; Bajorath, Jürgen
- Year: 2025
- Source: https://doi.org/10.1186/s13321-025-00951-3
Citation information
Yoshimori, Atsushi; Bajorath, Jürgen: Context-dependent similarity analysis of analogue series for structure–activity relationship transfer based on a concept from natural language processing. Journal of Cheminformatics 17(1), 5 (January 2025). https://doi.org/10.1186/s13321-025-00951-3
@Article{Yoshimori.Bajorath.2025a,
author={Yoshimori, Atsushi and Bajorath, Jürgen},
title={Context-dependent similarity analysis of analogue series for structure–activity relationship transfer based on a concept from natural language processing},
journal={Journal of Cheminformatics},
volume={17},
number={1},
pages={5},
month={January},
url={https://doi.org/10.1186/s13321-025-00951-3},
year={2025},
abstract={Analogue series ({AS}) are generated during compound optimization in medicinal chemistry and are the major source of structure–activity relationship ({SAR}) information. Pairs of active {AS} consisting of compounds with corresponding substituents and comparable potency progression represent {SAR} transfer events for the same target or across different targets. We report a new computational...}}