RESEARCH PRODUCT

Improving structural similarity based virtual screening using background knowledge

Tobias Girschick, Lucia Puchbauer, Stefan Kramer

subject

Virtual screening; Enrichment; Physical and Theoretical Chemistry; Library and Information Sciences; Structural similarity; 004 Informatik; Computer Graphics and Computer-Aided Design; Data mining; Background knowledge; 004 Data processing; Computer Science Applications; Research Article

description

Background: Virtual screening in the form of similarity rankings is often applied in the early drug discovery process to rank and prioritize compounds from a database. Such rankings can be produced with structural similarity measures. However, because these measures are general-purpose, their performance can be insufficient in some applications. In this paper, we provide a link between ranking-based virtual screening and fragment-based data mining methods. Including binding-relevant background knowledge in a structural similarity measure improves the quality of the similarity rankings. This background knowledge, in the form of binding-relevant substructures, can be derived either by hand selection or by automated fragment-based data mining methods.

Results: In virtual screening experiments, we show that our approach clearly improves enrichment factors in both variants: extending the structural similarity measure with a hand-selected relevant substructure, and extending it with background knowledge derived by data mining methods.

Conclusion: Our study shows that adding binding-relevant background knowledge can lead to significantly improved similarity rankings in virtual screening, and that even basic data mining approaches can lead to competitive results, making hand selection of the background knowledge less crucial. This is especially important in drug discovery and development projects where no receptor structure is available or, more frequently, no verified binding mode is known, so that mostly ligand-based approaches can be applied to generate hit compounds.
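The general idea of a knowledge-extended similarity ranking and its evaluation by enrichment factor can be illustrated with a minimal sketch. This is not the authors' similarity measure: compounds are represented here as plain sets of fragment labels, the weighted combination in `knowledge_weighted_similarity` is an assumed toy scheme, and all function names, the `weight` parameter, and the example data are hypothetical. The enrichment factor follows the common definition (active rate in the top fraction of the ranking divided by the active rate in the whole database).

```python
def tanimoto(a, b):
    """Tanimoto coefficient between two sets of fragment labels."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def knowledge_weighted_similarity(query, candidate, relevant, weight=0.5):
    """Blend plain Tanimoto similarity with a bonus for shared
    binding-relevant fragments (assumed toy scheme, not the paper's)."""
    base = tanimoto(query, candidate)
    query_relevant = query & relevant
    shared_relevant = query_relevant & candidate
    bonus = len(shared_relevant) / len(query_relevant) if query_relevant else 0.0
    return (1.0 - weight) * base + weight * bonus


def rank_database(query, database, relevant, weight=0.5):
    """Return compound names sorted by decreasing similarity to the query."""
    return sorted(
        database,
        key=lambda name: knowledge_weighted_similarity(
            query, database[name], relevant, weight
        ),
        reverse=True,
    )


def enrichment_factor(ranked_labels, fraction):
    """Active rate in the top `fraction` of the ranking divided by
    the active rate in the whole ranked list (labels: 1 = active)."""
    n = len(ranked_labels)
    actives_total = sum(ranked_labels)
    if n == 0 or actives_total == 0:
        return 0.0
    n_top = max(1, round(n * fraction))
    actives_top = sum(ranked_labels[:n_top])
    return (actives_top / n_top) / (actives_total / n)


# Hypothetical example data: fragment sets and activity labels.
query = {"ring_a", "amide", "core"}
relevant = {"core"}  # hand-selected binding-relevant substructure
database = {
    "cpd1": {"ring_a", "amide", "ester"},   # similar overall, lacks the core
    "cpd2": {"core", "ring_b", "halogen"},  # less similar, contains the core
    "cpd3": {"ring_b", "ester"},
    "cpd4": {"halogen", "ester"},
}
actives = {"cpd2"}

ranking = rank_database(query, database, relevant, weight=0.5)
labels = [1 if name in actives else 0 for name in ranking]
print(ranking, enrichment_factor(labels, 0.25))
```

With `weight=0.0` the ranking reduces to plain Tanimoto similarity, so comparing the enrichment factor at the two settings gives a rough feel for how background knowledge can change which compounds reach the top of the list.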

https://dx.doi.org/10.25358/openscience-7924