Search results for "Test set"
showing 10 items of 50 documents
Comparative Study of Several Machine Learning Algorithms for Classification of Unifloral Honeys
2021
Unifloral honeys are highly demanded by honey consumers, especially in Europe. To ensure that a honey belongs to a very appreciated botanical class, the classical methodology is palynological analysis to identify and count pollen grains. Highly trained personnel are needed to perform this task, which complicates the characterization of honey botanical origins. Organoleptic assessment of honey by expert personnel helps to confirm such classification. In this study, the ability of different machine learning (ML) algorithms to correctly classify seven types of Spanish honeys of single botanical origins (rosemary, citrus, lavender, sunflower, eucalyptus, heather and forest honeydew) was investi…
ChemInform Abstract: Antimicrobial Activity Characterization in a Heterogeneous Group of Compounds.
2010
In this work we carry out a study of pattern recognition to detect the microbiological activity in a group of heterogeneous compounds. The structural descriptors utilized are the topological connectivity indexes. The methods followed are stepwise linear discriminant analysis (linear analysis) and artificial neural network (nonlinear analysis). Although both methods are appropriate to differentiate between active and inactive compounds, the artificial neural network is, in this case, more adequate, since it shows in a test set a prediction success of 98%, versus 92% obtained with linear discriminant analysis.
Antimicrobial Activity Characterization in a Heterogeneous Group of Compounds
1998
In this work we carry out a study of pattern recognition to detect the microbiological activity in a group of heterogeneous compounds. The structural descriptors utilized are the topological connectivity indexes. The methods followed are stepwise linear discriminant analysis (linear analysis) and artificial neural network (nonlinear analysis). Although both methods are appropriate to differentiate between active and inactive compounds, the artificial neural network is, in this case, more adequate, since it shows in a test set a prediction success of 98%, versus 92% obtained with linear discriminant analysis.
Internal Test Sets Studies in a Group of Antimalarials
2006
Topological indices have been applied to build QSAR models for a set of 20 antimalarial cyclic peroxy cetals. In order to evaluate the reliability of the proposed linear models, leave-n-out and Internal Test Sets (ITS) approaches have been considered. The proposed procedure resulted in a robust consensus prediction equation, and here it is shown why it is superior to the employed standard cross-validation algorithms involving multilinear regression models.
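As a rough illustration of the leave-n-out cross-validation mentioned in this abstract, the sketch below computes a cross-validated q² for a one-descriptor linear model. It is not the authors' ITS procedure or their multilinear models: the single descriptor, the exhaustive enumeration of held-out subsets, the function names `fit_line` and `leave_n_out_q2`, and the toy data are all assumptions made for the example.

```python
from itertools import combinations


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def leave_n_out_q2(xs, ys, n_out=1):
    """Cross-validated q^2 = 1 - PRESS / SS over every subset of n_out held-out points."""
    indices = range(len(xs))
    press = 0.0  # squared errors of predictions for held-out points
    ss = 0.0     # squared deviations of held-out points from the training mean
    for held in combinations(indices, n_out):
        held_set = set(held)
        train_x = [xs[i] for i in indices if i not in held_set]
        train_y = [ys[i] for i in indices if i not in held_set]
        slope, intercept = fit_line(train_x, train_y)
        train_mean = sum(train_y) / len(train_y)
        for i in held:
            press += (ys[i] - (slope * xs[i] + intercept)) ** 2
            ss += (ys[i] - train_mean) ** 2
    return 1 - press / ss


# Hypothetical toy data: one descriptor value and one activity value per compound.
descriptor = [1.0, 2.0, 3.0, 4.0, 5.0]
activity = [2.1, 3.9, 6.2, 7.8, 10.1]
q2_loo = leave_n_out_q2(descriptor, activity, n_out=1)
q2_l2o = leave_n_out_q2(descriptor, activity, n_out=2)
```

A q² close to 1 means the held-out points are predicted well by models that never saw them. Enumerating every subset is only practical for small sets such as the 20 compounds mentioned here.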
Overrating Classifier Performance in ROC Analysis in the Absence of a Test Set: Evidence from Simulation and Italian CARATkids Validation
2019
Background: The use of receiver operating characteristic curves, or “ROC analysis,” has become quite common in biomedical research to support decisions. However, sensitivity, specificity, and misclassification rates are still often estimated using the training sample, overlooking the risk of overrating the test performance. Methods: A simulation study was performed to highlight the inferential implications of splitting (or not) the dataset into training and test set. The normality assumption was made for the classifier given the disease status, and the Youden's criterion considered for the detection of the optimal cutoff. Then, an ROC analysis with sample split was applied to assess the disc…
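The effect described in this abstract can be sketched with a small simulation. It is a minimal sketch under stated assumptions, not the study's actual design: scores are drawn from unit-variance normal distributions given disease status, the cutoff is chosen by Youden's criterion on a training sample, and Youden's J is then re-estimated on an independent test sample. The group size, mean shift, number of repetitions, and all function names are hypothetical.

```python
import random


def sens_spec(healthy, diseased, cutoff):
    """Sensitivity and specificity when scores >= cutoff are called positive."""
    sens = sum(s >= cutoff for s in diseased) / len(diseased)
    spec = sum(s < cutoff for s in healthy) / len(healthy)
    return sens, spec


def youden_cutoff(healthy, diseased):
    """Return (cutoff, J) maximising Youden's J = sens + spec - 1 over observed scores."""
    best_cutoff, best_j = None, -1.0
    for c in sorted(set(healthy) | set(diseased)):
        sens, spec = sens_spec(healthy, diseased, c)
        j = sens + spec - 1
        if j > best_j:
            best_cutoff, best_j = c, j
    return best_cutoff, best_j


def simulate(n_per_group=30, shift=1.0, n_reps=500, seed=0):
    """Average Youden's J on the training sample vs. an independent test sample."""
    rng = random.Random(seed)

    def draw(mean):
        # Normality assumption: score ~ N(mean, 1) given disease status.
        return [rng.gauss(mean, 1.0) for _ in range(n_per_group)]

    train_total, test_total = 0.0, 0.0
    for _ in range(n_reps):
        h_train, d_train = draw(0.0), draw(shift)
        h_test, d_test = draw(0.0), draw(shift)
        cutoff, j_train = youden_cutoff(h_train, d_train)  # cutoff tuned on training data
        sens, spec = sens_spec(h_test, d_test, cutoff)      # same cutoff applied to test data
        train_total += j_train
        test_total += sens + spec - 1
    return train_total / n_reps, test_total / n_reps


mean_j_train, mean_j_test = simulate()
```

Because the cutoff is tuned to the training sample, the training estimate of J tends to be optimistic. Comparing `mean_j_train` with `mean_j_test` shows the size of that optimism for the chosen settings.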
A three-factor optimisation strategy for micellar liquid chromatography
2000
An interpretive optimisation methodology for micellar liquid chromatography (MLC) is shown, taking into account pH, surfactant (sodium dodecyl sulphate) and organic modifier (propanol) concentration. Two objectives are considered: to develop a highly practical straightforward three-factor optimisation for practical MLC, and, in order to avoid unnecessary experiments, to link two and three-factor optimisations through a step-wise construction of the experimental design at different pH levels. The whole pH range for an ODS column (from 3 to 7) is covered. The proposed strategy was thoroughly evaluated using the chromatographic data from 81 experimental mobile phases, applied to the separation …
TOPS-MODE approach for the prediction of blood-brain barrier permeation.
2004
The blood-brain barrier permeation has been investigated by using a topological substructural molecular design approach (TOPS-MODE). A linear regression model was developed to predict the in vivo blood-brain partitioning coefficient on a data set of 119 compounds, treated as the logarithm of the blood-brain concentration ratio. The final model explained 70% of the variance and was validated through the use of an external validation set (33 compounds of the 119, MAE = 0.33), a leave-one-out crossvalidation (q(2) = 0.65, S(press) = 0.43), fivefold full crossvalidation (removing 28 compounds in each cycle, MAE = 33, RMSE = 0.43) and the prediction of +/- values for an external test set …
Data Quality Model-based Testing of Information Systems
2020
This paper proposes a model-based testing approach that uses the data quality model (DQ-model) instead of the program’s control flow graph as a testing model. The DQ-model contains the definitions and conditions under which a data object is considered correct. The study proposes to automatically generate a complete test set (CTS) using a DQ-model that allows all data quality conditions to be tested, resulting in full coverage of the DQ-model. In addition, the possibility to check the conformity of the data to be entered and already stored in the database is ensured. The proposed alternative approach changes the testing process: (1) CTS can be generated prior to software developmen…
3D-Chiral quadratic indices of the ‘molecular pseudograph’s atom adjacency matrix’ and their application to central chirality codification: classific…
2004
Quadratic indices of the 'molecular pseudograph's atom adjacency matrix' have been generalized to codify chemical structure information for chiral drugs. These 3D-chiral quadratic indices make use of a trigonometric 3D-chirality correction factor. These indices are nonsymmetric and reduce to classical (2D) descriptors when symmetry is not codified. For this reason, it is expected that they will be useful to predict symmetry-dependent properties. 3D-Chirality quadratic indices are real numbers and thus can be easily calculated in TOMOCOMD-CARDD software. These descriptors circumvent the inability of conventional 2D quadratic indices (Molecules 2003, 8, 687-726. http://www.mdpi.org) and othe…
Analysis of tear protein patterns by a neural network as a diagnostical tool for the detection of dry eyes
1999
The electrophoretic patterns of tears from patients with dry-eye disease (n = 43) and from healthy subjects (n = 17) were analyzed by means of multivariate statistical methods and an artificial neural network (ANN), following sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE). From each electrophoretic pattern a data set was created and randomly divided into test patterns (unknown samples) and training patterns (known samples), with the ANN trained on one of these sets. After training, the performance of the ANN was checked by presenting the test data set to the ANN. Furthermore, the data were classified using multivariate discriminant analysis. The groups were significantly different…