Search results for "discriminant analysis"
showing 10 items of 229 documents
Feature selection on a dataset of protein families: from exploratory data analysis to statistical variable importance
2016
Proteins are characterized by several types of features (structural, geometrical, energy). Most of these features are expected to be similar within a protein family. We are interested in detecting which features can identify proteins that belong to a family, as well as in defining the boundaries among families. Some features are redundant: they could generate noise in identifying which variables are essential as a fingerprint and, consequently, whether or not they are related to a function of a protein family. We defined an original approach to analyze protein features for defining their relationships and peculiarities within protein families. A multistep approach has been mainly performed in R …
ChemInform Abstract: Discrimination and Molecular Design of New Theoretical Hypolipaemic Agents Using the Molecular Connectivity Functions.
2010
The molecular topology model and discriminant analysis have been applied to the prediction and QSAR interpretation of some pharmacological properties of hypolipaemic drugs using multivariable regre...
Protein linear indices of the ‘macromolecular pseudograph α-carbon atom adjacency matrix’ in bioinformatics. Part 1: Prediction of protein stability …
2005
A novel approach to bio-macromolecular design from a linear algebra point of view is introduced. A protein’s total (whole protein) and local (one or more amino acid) linear indices are a new set of bio-macromolecular descriptors of relevance to protein QSAR/QSPR studies. These amino-acid level biochemical descriptors are based on the calculation of linear maps on $\mathbb{R}^n$ [$f_k(x_{mi}): \mathbb{R}^n \to \mathbb{R}^n$] in canonical basis. These bio-macromolecular indices are calculated from the $k$th power of the macromolecular pseudograph α-carbon atom adjacency matrix. Total linear indices are linear functionals on $\mathbb{R}^n$. That is, the $k$th total linear indices are linear maps from $\mathbb{R}^n$ to the scalar $\mathbb{R}$ [$f_k$ …
Discrimination and Molecular Design of New Theoretical Hypolipaemic Agents Using the Molecular Connectivity Functions
2000
The molecular topology model and discriminant analysis have been applied to the prediction and QSAR interpretation of some pharmacological properties of hypolipaemic drugs using multivariable regression equations with their statistical parameters. Regression analysis showed that the molecular topology model predicts these properties. The corresponding stability (cross-validation) studies done on the selected prediction models confirmed the goodness of the fits. The method used for hypolipaemic activity selection was a linear discriminant analysis (LDA). We make use of the pharmacological distribution diagrams (PDDs) as a visualizing technique for the identification and design of new hypolip…
Predicting antitrichomonal activity: A computational screening using atom-based bilinear indices and experimental proofs
2006
Existing Trichomonas vaginalis therapies are out of reach for most people with trichomoniasis in developing countries and, where available, they are limited by their toxicity (mainly in pregnant women) and their cost. New antitrichomonal agents are needed to combat emerging metronidazole-resistant trichomoniasis and reduce the side effects associated with currently available drugs. Toward this end, atom-based bilinear indices, a new TOMOCOMD-CARDD molecular descriptor, and linear discriminant analysis (LDA) were used to discover novel, potent, and non-toxic lead trichomonacidal chemicals. Two discriminant functions were obtained with the use of non-stochastic and stochastic atom-type bilinear in…
Dragon method for finding novel tyrosinase inhibitors: Biosilico identification and experimental in vitro assays
2006
QSAR (quantitative structure-activity relationship) studies of tyrosinase inhibitors employing Dragon descriptors and linear discriminant analysis (LDA) are presented here. A data set of 653 compounds, 245 with tyrosinase inhibitory activity and 408 having other clinical uses, was used. The active data set was processed by k-means cluster analysis in order to design training and prediction series. Seven LDA-based QSAR models were obtained. The discriminant functions applied showed a globally good classification of 99.79% in the training set for the best model, $\mathrm{Class} = -96.067 + 1.988 \times 10^{2}\,\mathrm{X0Av} + 91.907\,\mathrm{BIC3} + 6.853\,\mathrm{CIC1}$. External validation processes to assess the robustness and predictive pow…
Retrained Classification of Tyrosinase Inhibitors and “In Silico” Potency Estimation by Using Atom-Type Linear Indices
2012
In this paper, the authors present an effort to increase the applicability domain (AD) by means of retraining models using a database of 701 highly dissimilar molecules presenting anti-tyrosinase activity and 728 drugs with other uses. Atom-based linear indices and best subset linear discriminant analysis (LDA) were used to develop individual classification models. Eighteen individual classification-based QSAR models for the tyrosinase inhibitory activity were obtained, with global accuracy varying from 88.15% to 91.60% in the training set and values of Matthews correlation coefficients (C) varying from 0.76 to 0.82. The external validation set shows global classifications above 85.99% and 0.72 fo…
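For readers unfamiliar with the Matthews correlation coefficient (C) quoted in the abstract above, the following is a minimal sketch of the standard formula computed from confusion-matrix counts. The function name and the example counts are hypothetical and do not come from the paper, which reports only the resulting coefficients.

```python
import math


def matthews_corrcoef(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient from confusion-matrix counts."""
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0.0 when any marginal total is zero.
    return numerator / denominator if denominator else 0.0


# Hypothetical counts, for illustration only (not taken from the paper).
example_c = matthews_corrcoef(tp=90, tn=85, fp=15, fn=10)
print(f"C = {example_c:.2f}")
```

Unlike global accuracy, C stays informative when the two classes differ in size, which may be why both figures are reported.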
New hypoglycaemic agents selected by molecular topology.
2003
New compounds showing hypoglycaemic activity have been designed through a computer-aided method based on quantitative structure–activity relationship (QSAR) and molecular connectivity. After calculation of topological indices for a set of 89 compounds, including active and inactive ones with regard to hypoglycaemic action, linear discriminant analysis was performed so that a useful model to predict such an activity was achieved. Later on, the discriminant model was applied to a large database so that fourteen compounds were selected as potential new hypoglycaemics. Of these, just five were finally selected for experimental testing of the expected hypoglycaemic activity. Among the selected comp…
A topological substructural approach for the prediction of P-glycoprotein substrates
2006
A topological substructural molecular design approach (TOPS-MODE) has been used to predict whether a given compound is a P-glycoprotein (P-gp) substrate or not. A linear discriminant model was developed to classify a data set of 163 compounds as substrates or nonsubstrates (91 substrates and 72 nonsubstrates). The final model fit the data with a sensitivity of 82.42% and a specificity of 79.17%, for a final accuracy of 80.98%. The model was validated through the use of an external validation set (40 compounds, 22 substrates and 18 nonsubstrates) with a prediction accuracy of 77.50%; fivefold full cross-validation (removing 40 compounds in each cycle, 80.50% of good prediction) and the predictio…
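The percentages in the abstract above are mutually consistent, as a short worked check shows. This is a sketch under one assumption: the counts of correctly classified compounds (75 of 91 substrates, 57 of 72 nonsubstrates) are inferred from the reported percentages and are not stated in the abstract. The function name is illustrative.

```python
def classification_metrics(tp: int, fn: int, tn: int, fp: int):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # fraction of substrates correctly classified
    specificity = tn / (tn + fp)   # fraction of nonsubstrates correctly classified
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# Assumed counts, inferred from the reported percentages:
# 75 of 91 substrates and 57 of 72 nonsubstrates classified correctly.
sens, spec, acc = classification_metrics(tp=75, fn=16, tn=57, fp=15)
print(f"{sens:.2%} {spec:.2%} {acc:.2%}")  # prints "82.42% 79.17% 80.98%"
```

By the same arithmetic, the 77.50% external accuracy corresponds to 31 of the 40 validation compounds being predicted correctly.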
New tyrosinase inhibitors selected by atomic linear indices-based classification models.
2005
In the present report, the use of atom-based linear indices for finding functions that discriminate between tyrosinase inhibitor compounds and inactive ones is presented. Discriminant models were applied, and globally good classifications of 93.51% and 92.46% were observed in the training set for the best non-stochastic and stochastic linear indices models, respectively. The external prediction sets had accuracies of 91.67% and 89.44%. In addition, these fitted models were used in the screening of new cycloartane compounds isolated from herbal plants. Good agreement is shown between the theoretical and experimental results. These results provide a tool that can be used i…