AUTHOR
Tobias Girschick
Adapted Transfer of Distance Measures for Quantitative Structure-Activity Relationships and Data-Driven Selection of Source Datasets
Quantitative structure–activity relationships are regression models relating chemical structure to biological activity. Such models make it possible to predict toxicologically relevant endpoints, which constitute the target outcomes of experiments. The task is often tackled by instance-based methods, which are all based on the notion of chemical (dis-)similarity. Our starting point is the observation by Raymond and Willett that the two families of chemical distance measures, fingerprint-based and maximum common subgraph-based measures, provide orthogonal information about chemical similarity. This paper presents a novel method for finding suitable combinations of them, called adapted tran…
Improving structural similarity based virtual screening using background knowledge
Background: Virtual screening in the form of similarity rankings is often applied in the early drug discovery process to rank and prioritize compounds from a database. This similarity ranking can be achieved with structural similarity measures. However, their general nature can lead to insufficient performance in some application cases. In this paper, we provide a link between ranking-based virtual screening and fragment-based data mining methods. The inclusion of binding-relevant background knowledge into a structural similarity measure improves the quality of the similarity rankings. This background knowledge in the form of binding-relevant substructures can either be derived by hand selec…
Similarity boosted quantitative structure-activity relationship--a systematic study of enhancing structural descriptors by molecular similarity.
The concept of molecular similarity is one of the most central in the fields of predictive toxicology and quantitative structure-activity relationship (QSAR) research. Many toxicological responses result from a multimechanistic process and, consequently, structural diversity among the active compounds is likely. Combining this knowledge, we introduce similarity boosted QSAR modeling, where we calculate molecular descriptors using similarities with respect to representative reference compounds to aid a statistical learning algorithm in distinguishing between different structural classes. We present three approaches for the selection of reference compounds, one by literature search and two by…
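The idea of describing a molecule by its similarities to reference compounds can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the similarity measure, so the Tanimoto coefficient is assumed here, fingerprints are modeled as plain sets of on-bit indices, and the names `tanimoto` and `similarity_descriptors` as well as the toy data are hypothetical.

```python
from typing import FrozenSet, List

Fingerprint = FrozenSet[int]  # assumption: a fingerprint is the set of its on-bit indices


def tanimoto(a: Fingerprint, b: Fingerprint) -> float:
    """Tanimoto coefficient: shared on-bits divided by the union of on-bits."""
    union = a | b
    if not union:
        # Two empty fingerprints: define the similarity as 0.0 to avoid 0/0.
        return 0.0
    return len(a & b) / len(union)


def similarity_descriptors(fp: Fingerprint, references: List[Fingerprint]) -> List[float]:
    """One descriptor per reference compound: the similarity of fp to that reference."""
    return [tanimoto(fp, ref) for ref in references]


# Toy reference compounds (hypothetical bit patterns, not real molecules).
references = [
    frozenset({1, 2, 3}),
    frozenset({2, 3, 4}),
    frozenset({7, 8}),
]

query = frozenset({1, 2, 3})
descriptors = similarity_descriptors(query, references)
print(descriptors)  # [1.0, 0.5, 0.0]
```

The resulting vector could be appended to the ordinary structural descriptors before training a learner, so that compounds close to different reference compounds become easier to separate.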