AUTHOR
Emili Besalú
Superposing significant interaction rules (SSIR) method: a simple procedure for rapid ranking of congeneric compounds
The Superposing Significant Interaction Rules (SSIR) method is revised and implemented. The method is a simple combinatorial procedure that generates rules in situ for a dichotomized congeneric molecular family and selects the most probabilistically relevant ones. Simply counting the number of relevant rules attached to new compounds generates a molecular ranking useful for database filtering, refinement and prediction. The algorithm needs only a symbolic molecular representation, which allows the database to be mined confidentially: third parties will not know which real compounds are being worked on. The procedure is tested for a complete s…
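The rule-counting idea described above can be illustrated with a minimal sketch. It is not the author's implementation: the function names, the restriction to rules of order 1 and 2, the use of a hypergeometric tail probability as the significance test, and the threshold `alpha` are all assumptions made for illustration. Compounds are encoded purely as tuples of symbols (one per substitution site), so no real structures are needed.

```python
from itertools import combinations
from math import comb


def hypergeom_tail(n_total, n_actives, covered, hits):
    """P(X >= hits) when `covered` items are drawn at random from n_total,
    of which n_actives are active (assumed significance test)."""
    upper = min(n_actives, covered)
    favourable = sum(
        comb(n_actives, i) * comb(n_total - n_actives, covered - i)
        for i in range(hits, upper + 1)
    )
    return favourable / comb(n_total, covered)


def significant_rules(train, labels, max_order=2, alpha=0.05):
    """Collect symbol patterns (rules) that cover significantly many actives.

    train  : list of tuples of symbols, one symbol per site (hypothetical encoding)
    labels : list of bools, True for the class of interest
    """
    n_total, n_actives = len(train), sum(labels)
    n_sites = len(train[0])
    rules = []
    for order in range(1, max_order + 1):
        for sites in combinations(range(n_sites), order):
            # Candidate rules are the patterns actually present in the training set.
            patterns = {tuple(mol[s] for s in sites) for mol in train}
            for pattern in sorted(patterns):
                covered = [
                    lab for mol, lab in zip(train, labels)
                    if tuple(mol[s] for s in sites) == pattern
                ]
                p = hypergeom_tail(n_total, n_actives, len(covered), sum(covered))
                if p <= alpha:
                    rules.append((sites, pattern))
    return rules


def votes(mol, rules):
    """Number of significant rules a (new) compound satisfies."""
    return sum(
        1 for sites, pattern in rules
        if tuple(mol[s] for s in sites) == pattern
    )


if __name__ == "__main__":
    # Toy data: symbol 'a' at site 0 goes with activity.
    train = [("a", "x"), ("a", "y"), ("a", "x"), ("a", "y"),
             ("b", "x"), ("b", "y"), ("b", "x"), ("b", "y")]
    labels = [True, True, True, True, False, False, False, False]
    rules = significant_rules(train, labels)
    candidates = [("b", "x"), ("a", "y")]
    # Rank new compounds by decreasing number of votes.
    print(sorted(candidates, key=lambda m: votes(m, rules), reverse=True))
```

Sorting candidates by vote count gives the ranking; the sketch only counts votes from rules favouring the class of interest and does not model any further refinements of the published method.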
Checking the Efficacy of Two Basic Descriptors With a Set of Properties of Alkanes
Several experimental properties of alkanes are described by means of multilinear models at the cross-validation level. The models have been obtained considering two main sets of descriptors: mathematically based and experimental ones. The best models normally involve one of the two sets. The main goal of this work is to show that the theoretical descriptors can compete with the experimental ones. This is an important topic in the quantitative structure-property relationships field because it validates mathematical and in silico descriptors as a proper tool for model building. Activity distributions of the properties and indices…
Molecular Rearrangement of an Aza-Scorpiand Macrocycle Induced by pH: A Computational Study †
Rearrangements and their control are a hot topic in supramolecular chemistry because of the possibilities these phenomena open in the design of synthetic receptors and molecular machines. Aza-scorpiand macrocycles are an interesting system that can reorganize its spatial structure depending on pH variations or the presence of metal cations. In this study, the relative stabilities of these conformations were predicted computationally by semi-empirical and density functional theory approximations, and the reorganization from closed to open conformations was simulated by using the Monte Carlo multiple minimum method. Financial support by the Spanish Ministerio de Economía y Competitiv…
A Probabilistic Analysis About the Concepts of Difficulty and Usefulness of a Molecular Ranking Classification
Distinguishing between the difficulty and the usefulness of a molecular ranking classification is of significant importance in virtual design chemistry. Here, both concepts are viewed from the statistical and practical points of view according to the standard definitions of enrichment and statistical significance p-values. These parameters are useful not only for comparing distinct rankings obtained for the same molecular database, but also for comparing, from an objective point of view, rankings established in distinct molecular sets.
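As a rough illustration of the two quantities mentioned above, the sketch below computes an enrichment factor and a chance p-value for the top of a ranked list. The formulas are the commonly used ones (ratio of hit rates, and a hypergeometric tail probability); whether they match the article's exact definitions is an assumption, and the function and parameter names are hypothetical.

```python
from math import comb


def enrichment_factor(n_total, n_actives, top_k, hits):
    """Hit rate in the top_k selection divided by the hit rate of the whole set."""
    return (hits / top_k) / (n_actives / n_total)


def chance_p_value(n_total, n_actives, top_k, hits):
    """Probability of finding at least `hits` actives among top_k compounds
    picked at random (hypergeometric tail)."""
    upper = min(n_actives, top_k)
    favourable = sum(
        comb(n_actives, i) * comb(n_total - n_actives, top_k - i)
        for i in range(hits, upper + 1)
    )
    return favourable / comb(n_total, top_k)


if __name__ == "__main__":
    # 10 compounds, 3 actives; the top 3 of the ranking contain 2 actives.
    print(f"{enrichment_factor(10, 3, 3, 2):.3f}")  # prints 2.222
    print(f"{chance_p_value(10, 3, 3, 2):.3f}")     # prints 0.183
```

In this toy case the enrichment looks good (more than twice the base rate), yet the p-value shows that a random selection would do at least as well about 18% of the time, which is the kind of contrast between usefulness and difficulty the abstract refers to.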
Internal Test Sets Studies in a Group of Antimalarials
Topological indices have been applied to build QSAR models for a set of 20 antimalarial cyclic peroxy cetals. To evaluate the reliability of the proposed linear models, leave-n-out and Internal Test Sets (ITS) approaches have been considered. The proposed procedure resulted in a robust consensus prediction equation, and it is shown why it is superior to the standard cross-validation algorithms employed with multilinear regression models.
Ranking Series of Cancer-Related Gene Expression Data by Means of the Superposing Significant Interaction Rules Method
The Superposing Significant Interaction Rules (SSIR) method is a combinatorial procedure that deals with symbolic descriptors of samples. It is able to rank the series of samples when those items are classified into two classes. The method selects preferential descriptors and, with them, generates rules that make up the rank by means of a simple voting procedure. Here, two application examples are provided. In both cases, binary or multilevel strings encoding gene expressions are considered as descriptors. It is shown how the SSIR procedure is useful for ranking the series of patient transcription data to diagnose two types of cancer (leukemia and prostate cancer) obtaining Area Under Recei…
Equivalence of the Pecka–Ponec Correlation Probability and the Statistical F Significance for MLR Models
In an article in this journal, Pecka and Ponec [J. Math. Chem. 27 (2000) 13] proposed, by means of a probability calculation, a method to evaluate the statistical importance of correlations obtained from multilinear regression equations involving an arbitrary number of experimental points and parameters. Here, it is demonstrated that this probability coincides exactly with a more general concept: the confidence probability of an F distribution having the appropriate degrees of freedom.
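For orientation, the F-based quantity referred to above can be written in the standard textbook form below. The notation is an assumption, not taken from the article: $n$ is the number of experimental points, $p$ the number of fitted regression coefficients excluding the intercept, and $R^2$ the squared multiple correlation coefficient. The Pecka–Ponec expression itself is not reproduced here.

```latex
% Overall F statistic of a multilinear regression with intercept
\[
  F \;=\; \frac{R^{2}/p}{\left(1-R^{2}\right)/\left(n-p-1\right)}
\]
% Confidence probability: cumulative F distribution with (p, n-p-1)
% degrees of freedom, evaluated at the observed F
\[
  P \;=\; \Pr\!\left(F_{p,\,n-p-1} \le F\right)
    \;=\; \int_{0}^{F} f_{p,\,n-p-1}(x)\,\mathrm{d}x
\]
```

Under these assumptions, $1 - P$ is the usual significance level (p-value) of the regression as a whole.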