Search results for "Test set"

showing 10 items of 50 documents

Comparative Study of Several Machine Learning Algorithms for Classification of Unifloral Honeys

2021

Unifloral honeys are highly demanded by honey consumers, especially in Europe. To ensure that a honey belongs to a very appreciated botanical class, the classical methodology is palynological analysis to identify and count pollen grains. Highly trained personnel are needed to perform this task, which complicates the characterization of honey botanical origins. Organoleptic assessment of honey by expert personnel helps to confirm such classification. In this study, the ability of different machine learning (ML) algorithms to correctly classify seven types of Spanish honeys of single botanical origins (rosemary, citrus, lavender, sunflower, eucalyptus, heather and forest honeydew) was investi…

Health (social science); Organoleptic; Plant Science; TP1-1185; Machine learning; computer.software_genre; 01 natural sciences; Health Professions (miscellaneous); Microbiology; Article; 0404 agricultural biotechnology; Partial least squares regression; Mathematics; Aliments Consum; botanical origin; Artificial neural network; business.industry; Intel·ligència artificial; Chemical technology; 010401 analytical chemistry; physicochemical parameters; 04 agricultural and veterinary sciences; Linear discriminant analysis; 040401 food science; 0104 chemical sciences; Random forest; Support vector machine; Tree (data structure); machine learning; classification; Test set; Artificial intelligence; business; Apicultura; Algorithm; computer; unifloral honeys; Food Science; Foods
researchProduct

ChemInform Abstract: Antimicrobial Activity Characterization in a Heterogeneous Group of Compounds.

2010

In this work we carry out a study of pattern recognition to detect the microbiological activity in a group of heterogeneous compounds. The structural descriptors utilized are the topological connectivity indexes. The methods followed are stepwise linear discriminant analysis (linear analysis) and artificial neural network (nonlinear analysis). Although both methods are appropriate to differentiate between active and inactive compounds, the artificial neural network is, in this case, more adequate, since it shows in a test set a prediction success of 98%, versus 92% obtained with linear discriminant analysis.

Heterogeneous group; Artificial neural network; business.industry; Chemistry; Test set; Pattern recognition (psychology); Pattern recognition; General Medicine; Artificial intelligence; Linear analysis; Linear discriminant analysis; business; Antimicrobial; ChemInform
researchProduct

Antimicrobial Activity Characterization in a Heterogeneous Group of Compounds

1998

In this work we carry out a study of pattern recognition to detect the microbiological activity in a group of heterogeneous compounds. The structural descriptors utilized are the topological connectivity indexes. The methods followed are stepwise linear discriminant analysis (linear analysis) and artificial neural network (nonlinear analysis). Although both methods are appropriate to differentiate between active and inactive compounds, the artificial neural network is, in this case, more adequate, since it shows in a test set a prediction success of 98%, versus 92% obtained with linear discriminant analysis.

Heterogeneous group; Molecular Structure; Artificial neural network; business.industry; Linear model; Discriminant Analysis; Pattern recognition; General Chemistry; Linear analysis; Antimicrobial; Linear discriminant analysis; Pattern Recognition Automated; Computer Science Applications; Anti-Infective Agents; Nonlinear Dynamics; Computational Theory and Mathematics; Test set; Pattern recognition (psychology); Linear Models; Neural Networks Computer; Artificial intelligence; business; Information Systems; Mathematics; Journal of Chemical Information and Computer Sciences
researchProduct

Internal Test Sets Studies in a Group of Antimalarials

2006

Topological indices have been applied to build QSAR models for a set of 20 antimalarial cyclic peroxy cetals. In order to evaluate the reliability of the proposed linear models, leave-n-out and Internal Test Sets (ITS) approaches have been considered. The proposed procedure resulted in a robust consensus prediction equation, and it is shown here why it is superior to the employed standard cross-validation algorithms involving multilinear regression models.

Internal test sets method; topological indices; linear models; QSAR; statistical validation; Quantitative structure–activity relationship; Multilinear map; Internal test sets method; Linear models (Statistics); Catalysis; Inorganic Chemistry; Set (abstract data type); lcsh:Chemistry; QSAR (Bioquímica); Order (group theory); Applied mathematics; Physical and Theoretical Chemistry; Molecular Biology; lcsh:QH301-705.5; Spectroscopy; Reliability (statistics); Mathematics; statistical validation; Group (mathematics); QSAR; Organic Chemistry; Linear model; Regression analysis; General Medicine; Computer Science Applications; lcsh:Biology (General); lcsh:QD1-999; Models lineals (Estadística); topological indices; linear models; International Journal of Molecular Sciences
researchProduct

Overrating Classifier Performance in ROC Analysis in the Absence of a Test Set: Evidence from Simulation and Italian CARATkids Validation

2019

Background: The use of receiver operating characteristic curves, or “ROC analysis,” has become quite common in biomedical research to support decisions. However, sensitivity, specificity, and misclassification rates are still often estimated using the training sample, overlooking the risk of overrating the test performance. Methods: A simulation study was performed to highlight the inferential implications of splitting (or not) the dataset into training and test sets. The normality assumption was made for the classifier given the disease status, and Youden's criterion was considered for the detection of the optimal cutoff. Then, an ROC analysis with sample split was applied to assess the disc…

Male; 020205 medical informatics; performance estimators; media_common.quotation_subject; Health Informatics; 02 engineering and technology; 03 medical and health sciences; 0302 clinical medicine; Health Information Management; Surveys and Questionnaires; Statistics; true predictive performance; Rinite Alérgica; 0202 electrical engineering electronic engineering information engineering; medicine; Humans; sample split; Computer Simulation; 030212 general & internal medicine; Child; Asma; Normality; Asthma; Mathematics; media_common; Advanced and Specialized Nursing; Receiver operating characteristic; asthma control test; asthma control test sample split performance estimators optimal cutoff simulation study true predictive performance; Discriminant validity; Reproducibility of Results; Estimator; medicine.disease; simulation study; Rhinitis Allergic; Asthma; Confidence interval; ROC Curve; Test set; optimal cutoff; Female; Classifier (UML)
researchProduct
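
The overrating effect described in the ROC abstract above can be illustrated with a small simulation. This is a minimal sketch, not the authors' procedure: it assumes a binormal classifier (healthy scores drawn from N(0, 1), diseased scores from N(shift, 1)), picks the cutoff by Youden's criterion on a training sample, and compares the Youden index on that same sample with the index on an independent test sample. The function names, sample sizes, and the shift value are all hypothetical.

```python
import random


def youden_index(scores, labels, cutoff):
    """Sensitivity + specificity - 1 when 'score >= cutoff' predicts disease."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    sensitivity = sum(s >= cutoff for s in pos) / len(pos)
    specificity = sum(s < cutoff for s in neg) / len(neg)
    return sensitivity + specificity - 1.0


def best_cutoff(scores, labels):
    """Cutoff among the observed scores that maximises the Youden index."""
    return max(sorted(set(scores)),
               key=lambda c: youden_index(scores, labels, c))


def draw_sample(rng, n_per_class, shift):
    """Binormal scores (assumed): healthy ~ N(0, 1), diseased ~ N(shift, 1)."""
    scores = [rng.gauss(0.0, 1.0) for _ in range(n_per_class)]
    scores += [rng.gauss(shift, 1.0) for _ in range(n_per_class)]
    labels = [0] * n_per_class + [1] * n_per_class
    return scores, labels


def simulate(n_per_class=30, shift=1.0, n_rep=200, seed=0):
    """Average Youden index on the training sample and on a fresh test sample."""
    rng = random.Random(seed)
    train_total = test_total = 0.0
    for _ in range(n_rep):
        train_x, train_y = draw_sample(rng, n_per_class, shift)
        test_x, test_y = draw_sample(rng, n_per_class, shift)
        cutoff = best_cutoff(train_x, train_y)  # chosen on training data only
        train_total += youden_index(train_x, train_y, cutoff)
        test_total += youden_index(test_x, test_y, cutoff)
    return train_total / n_rep, test_total / n_rep


if __name__ == "__main__":
    j_train, j_test = simulate()
    print(f"mean Youden J on training sample: {j_train:.3f}")
    print(f"mean Youden J on test sample:     {j_test:.3f}")
```

Because the cutoff is tuned to the training sample, the training-sample index tends to be optimistic; the test-sample index is the less biased estimate of performance at that cutoff.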

A three-factor optimisation strategy for micellar liquid chromatography

2000

An interpretive optimisation methodology for micellar liquid chromatography (MLC) is shown, taking into account pH, surfactant (sodium dodecyl sulphate) and organic modifier (propanol) concentration. Two objectives are considered: to develop a straightforward, highly practical three-factor optimisation for MLC, and, in order to avoid unnecessary experiments, to link two- and three-factor optimisations through a step-wise construction of the experimental design at different pH levels. The whole pH range for an ODS column (from 3 to 7) is covered. The proposed strategy was thoroughly evaluated using the chromatographic data from 81 experimental mobile phases, applied to the separation …

Mean squared error; Chemistry; Organic Chemistry; Clinical Biochemistry; Analytical chemistry; Biochemistry; High-performance liquid chromatography; Micellar electrokinetic chromatography; Analytical Chemistry; Set (abstract data type); Chemometrics; Propanol; chemistry.chemical_compound; Micellar liquid chromatography; Test set; Biological system; Chromatographia
researchProduct

TOPS-MODE approach for the prediction of blood-brain barrier permeation.

2004

Blood-brain barrier permeation has been investigated by using a topological substructural molecular design approach (TOPS-MODE). A linear regression model was developed to predict the in vivo blood-brain partitioning coefficient, treated as the logarithm of the blood-brain concentration ratio, on a data set of 119 compounds. The final model explained 70% of the variance and was validated through the use of an external validation set (33 compounds of the 119, MAE = 0.33), a leave-one-out crossvalidation (q(2) = 0.65, S(press) = 0.43), fivefold full crossvalidation (removing 28 compounds in each cycle, MAE = 33, RMSE = 0.43) and the prediction of +/- values for an external test set …

Mean squared error; Logarithm; Chemistry; Pharmaceutical Science; Thermodynamics; Penetration (firestop); Permeation; Concentration ratio; Models Biological; Partition coefficient; Capillary Permeability; Blood-Brain Barrier; Predictive Value of Tests; Test set; Linear regression; Linear Models; Computer Simulation; Journal of pharmaceutical sciences
researchProduct
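
The leave-one-out crossvalidation statistic q(2) quoted in the abstract above can be sketched for the simplest case. This is an illustrative example only, not the TOPS-MODE model: it assumes a single-descriptor least-squares line and the common convention q(2) = 1 - PRESS / SS_total, where PRESS sums the squared errors of predictions made for each compound by a model fitted without it and SS_total is taken around the mean of all observed values. The helper names and the toy data are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def loo_q2(xs, ys):
    """Leave-one-out q2 = 1 - PRESS / SS_total (assumed convention)."""
    n = len(xs)
    press = 0.0
    for i in range(n):
        # Refit the line without observation i, then predict observation i.
        slope, intercept = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        press += (ys[i] - (slope * xs[i] + intercept)) ** 2
    mean_y = sum(ys) / n
    ss_total = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - press / ss_total


if __name__ == "__main__":
    # Hypothetical descriptor values and responses, for illustration only.
    descriptor = [0.0, 1.0, 2.0, 3.0, 4.0]
    response = [1.0, 3.5, 4.5, 7.5, 9.0]
    print(f"leave-one-out q2: {loo_q2(descriptor, response):.3f}")
```

Each compound is predicted by a model that never saw it, which is why q(2) is usually lower than the fitted R(2) and is read as a measure of predictive rather than descriptive quality.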

Data Quality Model-based Testing of Information Systems

2020

This paper proposes a model-based testing approach that uses the data quality model (DQ-model), instead of the program’s control flow graph, as the testing model. The DQ-model contains the definitions and conditions under which a data object is considered correct. The study proposes to automatically generate a complete test set (CTS) from a DQ-model that allows all data quality conditions to be tested, resulting in full coverage of the DQ-model. In addition, the possibility to check the conformity of the data to be entered and already stored in the database is ensured. The proposed alternative approach changes the testing process: (1) CTS can be generated prior to software developmen…

Model-based testing; business.industry; Computer science; Software development; Process (computing); 020207 software engineering; 02 engineering and technology; computer.software_genre; Software; System under test; 020204 information systems; Data quality; Test set; 0202 electrical engineering electronic engineering information engineering; Control flow graph; Data mining; business; computer; Proceedings of the 2020 Federated Conference on Computer Science and Information Systems
researchProduct

3D-Chiral quadratic indices of the ‘molecular pseudograph’s atom adjacency matrix’ and their application to central chirality codification: classific…

2004

Quadratic indices of the 'molecular pseudograph's atom adjacency matrix' have been generalized to codify chemical structure information for chiral drugs. These 3D-chiral quadratic indices make use of a trigonometric 3D-chirality correction factor. These indices are nonsymmetric and reduce to classical (2D) descriptors when symmetry is not codified. For this reason, they are expected to be useful for predicting symmetry-dependent properties. 3D-chirality quadratic indices are real numbers and thus can be easily calculated in TOMOCOMD-CARDD software. These descriptors circumvent the inability of conventional 2D quadratic indices (Molecules 2003, 8, 687-726. http://www.mdpi.org) and othe…

Models Molecular; Quantitative structure–activity relationship; Chemistry; Stereochemistry; Organic Chemistry; Clinical Biochemistry; Stability (learning theory); Computational Biology; Quantitative Structure-Activity Relationship; Pharmaceutical Science; Angiotensin-Converting Enzyme Inhibitors; Stereoisomerism; Linear discriminant analysis; Biochemistry; Cross-validation; Quadratic equation; Test set; Drug Discovery; Linear regression; Receptors sigma; Molecular Medicine; Applied mathematics; Adjacency matrix; Molecular Biology; Bioorganic & Medicinal Chemistry
researchProduct

Analysis of tear protein patterns by a neural network as a diagnostical tool for the detection of dry eyes

1999

The electrophoretic patterns of tears from patients with dry-eye disease (n = 43) and from healthy subjects (n = 17) were analyzed by means of multivariate statistical methods and an artificial neural network (ANN), following sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE). From each electrophoretic pattern a data set was created and randomly divided into test patterns (unknown samples) and training patterns (known samples), with the ANN trained on one of these sets. After training, the performance of the ANN was checked by presenting the test data set to the ANN. Furthermore, the data were classified using multivariate analysis of discriminance. The groups were significantly different…

Multivariate analysis; Chromatography; Artificial neural network; business.industry; Clinical Biochemistry; Tear proteins; Dry eyes; Pattern recognition; medicine.disease; Biochemistry; Analytical Chemistry; Set (abstract data type); Data set; Test set; medicine; Artificial intelligence; business; Mathematics; Test data; Electrophoresis
researchProduct