RESEARCH PRODUCT
Learning Molecular Classes from Small Numbers of Positive Examples Using Graph Grammars
Domenico Mosca, Andreas Hildebrandt, Ernst Althaus
subject
Class (set theory), Property (philosophy), Theoretical computer science, Grammar, Rule-based machine translation, Computer science, Small number, Graph (abstract data type), Construct (python library), Type (model theory)
description
We consider the following problem: a researcher has identified a small number of molecules with a certain property of interest and now wants to find further molecules sharing this property in a database. This can be described as learning molecular classes from small numbers of positive examples. In this work, we propose a method based on learning a graph grammar for the molecular class. We use the type of graph grammar proposed by Althaus et al. [2], as it is easy to interpret and allows relatively efficient queries. We identify rules that occur frequently in the positive examples and use them to construct a graph grammar. We then classify a molecule as belonging to the class if it matches the computed graph grammar. We evaluate our method on several known groups of molecules defined by structural properties and show that it achieves low false-negative and low false-positive rates.
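The pipeline described in the abstract (collect frequent rules from positive examples, build a model from them, accept molecules that match) can be illustrated with a deliberately simplified sketch. It does not implement the graph grammar formalism of Althaus et al. [2]; as a stand-in, a "rule" here is just an atom label together with the sorted labels of its neighbors. The molecule encoding, the function names `local_rules`, `learn_rules`, and `matches`, and the `min_support` parameter are all assumptions made for illustration.

```python
from collections import Counter

# Toy encoding (an assumption): a molecule is (atom labels, undirected bonds
# given as pairs of atom indices). Hydrogens are omitted.

def local_rules(atoms, bonds):
    """Return the set of (atom label, sorted neighbor labels) patterns."""
    neighbors = {i: [] for i in range(len(atoms))}
    for a, b in bonds:
        neighbors[a].append(atoms[b])
        neighbors[b].append(atoms[a])
    return {(atoms[i], tuple(sorted(neighbors[i]))) for i in range(len(atoms))}

def learn_rules(positives, min_support=0.5):
    """Keep patterns found in at least min_support of the positive examples."""
    counts = Counter()
    for atoms, bonds in positives:
        # Each molecule contributes a pattern at most once (set semantics).
        counts.update(local_rules(atoms, bonds))
    threshold = min_support * len(positives)
    return {rule for rule, n in counts.items() if n >= threshold}

def matches(molecule, rules):
    """Accept a molecule only if every local pattern it contains was learned."""
    atoms, bonds = molecule
    return local_rules(atoms, bonds) <= rules

# Two positive examples: carbon chains ending in an oxygen.
ethanol = (["C", "C", "O"], [(0, 1), (1, 2)])
propanol = (["C", "C", "C", "O"], [(0, 1), (1, 2), (2, 3)])
rules = learn_rules([ethanol, propanol], min_support=0.5)

# Query molecules that were not part of the training set.
butanol = (["C", "C", "C", "C", "O"], [(0, 1), (1, 2), (2, 3), (3, 4)])
ethylamine = (["C", "C", "N"], [(0, 1), (1, 2)])
print(matches(butanol, rules))     # True: all local patterns were learned
print(matches(ethylamine, rules))  # False: carbon bonded to nitrogen is unseen
```

Raising `min_support` makes the learned rule set stricter (fewer false positives, more false negatives), which is the trade-off the abstract's evaluation refers to; the threshold value used here is arbitrary.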
year | journal | country | edition | language |
---|---|---|---|---|
2021-01-01 | | | | |