Search results for "cross-validation"

showing 10 items of 50 documents

Assessment of the statistical significance of classifications in infrared spectroscopy based diagnostic models.

2014

Fourier transform infrared (IR) spectroscopy in combination with multivariate data analysis is a versatile tool that can be applied to disease diagnosis. However, a rigorous validation of the obtained models is necessary in order to obtain robust results. This work evaluates the advantages of the use of permutation testing for determining the statistical significance of the misclassification errors obtained from IR based diagnostic models through cross validation (CV). The model performance, estimated by CV, is compared to a distribution of CV-performance values obtained using randomly permuted class labels. The distribution of ‘random CV-values’ is considered as a null distribution and use…

Multivariate analysis; Feature selection; Clinical Chemistry Tests; 02 engineering and technology; 01 natural sciences; Biochemistry; Cross-validation; Analytical Chemistry; Resampling; Statistics; Diagnosis; Spectroscopy Fourier Transform Infrared; Electrochemistry; Null distribution; Environmental Chemistry; Humans; Spectroscopy; Mathematics; Models Statistical; 010401 analytical chemistry; Estimator; Contrast (statistics); Discriminant Analysis; Reproducibility of Results; 021001 nanoscience & nanotechnology; 0104 chemical sciences; Random forest; 0210 nano-technology; The Analyst
researchProduct
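
The permutation test described in the abstract above can be sketched in a few lines. This is a minimal illustration, not the authors' pipeline: the nearest-centroid classifier, the one-dimensional synthetic data, the interleaved 5-fold split, and every function name are assumptions made for the example. Only the overall idea (compare the observed CV score with CV scores obtained under randomly permuted class labels) comes from the abstract.

```python
import random

def nearest_centroid_predict(train_X, train_y, x):
    """Predict the label whose class mean is closest to x (1-D features)."""
    sums, counts = {}, {}
    for xi, yi in zip(train_X, train_y):
        sums[yi] = sums.get(yi, 0.0) + xi
        counts[yi] = counts.get(yi, 0) + 1
    return min(sums, key=lambda c: abs(x - sums[c] / counts[c]))

def cv_accuracy(X, y, k=5):
    """k-fold CV accuracy; sample i is assigned to fold i % k."""
    correct = 0
    for fold in range(k):
        train_idx = [i for i in range(len(X)) if i % k != fold]
        test_idx = [i for i in range(len(X)) if i % k == fold]
        tr_X = [X[i] for i in train_idx]
        tr_y = [y[i] for i in train_idx]
        correct += sum(
            nearest_centroid_predict(tr_X, tr_y, X[i]) == y[i] for i in test_idx
        )
    return correct / len(X)

def permutation_p_value(X, y, n_perm=200, seed=0):
    """Compare the observed CV accuracy with a null distribution of CV
    accuracies computed on randomly permuted class labels."""
    rng = random.Random(seed)
    observed = cv_accuracy(X, y)
    null_scores = []
    for _ in range(n_perm):
        shuffled = y[:]
        rng.shuffle(shuffled)
        null_scores.append(cv_accuracy(X, shuffled))
    exceed = sum(s >= observed for s in null_scores)
    # The +1 terms count the observed labelling as one of the permutations.
    return observed, (exceed + 1) / (n_perm + 1)

# Synthetic, partly overlapping two-class data (illustrative values only).
X = [0.1 * i for i in range(20)] + [1.5 + 0.1 * i for i in range(20)]
y = [0] * 20 + [1] * 20
observed, p_value = permutation_p_value(X, y)
print(f"CV accuracy = {observed:.3f}, permutation p-value = {p_value:.4f}")
```

A small p-value indicates that the observed CV accuracy is unlikely under the null distribution of permuted-label scores. The smallest attainable value is 1 / (n_perm + 1), so the number of permutations bounds the resolution of the test.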

Multivariate regression analysis applied to the calibration of equipment used in pig meat classification in Romania.

2016

This paper highlights the statistical methodology used in a dissection experiment carried out in Romania to calibrate and standardize two classification devices, OptiGrade PRO (OGP) and Fat-o-Meat'er (FOM). One hundred forty-five carcasses were measured using the two probes and dissected according to the European reference method. To derive prediction formulas for each device, multiple linear regression analysis was performed on the relationship between the reference lean meat percentage and the back fat and muscle thicknesses, using the ordinary least squares technique. The root mean squared error of prediction calculated using the leave-one-out cross validation met European Commission (EC…

Multivariate statistics; Meat; Mean squared error; Food Handling; Swine; 0211 other engineering and technologies; 02 engineering and technology; Cross-validation; Statistics; Calibration; Medicine; Animals; 021110 strategic defence & security studies; business.industry; Back fat; Romania; 0402 animal and dairy science; Regression analysis; 04 agricultural and veterinary sciences; 040201 dairy & animal science; Adipose Tissue; Ordinary least squares; Calibration; Body Composition; Multiple linear regression analysis; business; Food Science; Meat science
researchProduct
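
The leave-one-out root mean squared error of prediction mentioned in the abstract above can be illustrated as follows. This is a sketch under stated assumptions: the eight (fat, muscle) measurements, the generating coefficients, and all function names are invented for the example and are not taken from the study. The fit itself is ordinary least squares via the normal equations.

```python
import math

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

def ols_fit(X, y):
    """Fit y = b0 + b1*x1 + b2*x2 + ... by solving the normal equations."""
    rows = [[1.0] + list(r) for r in X]
    p = len(rows[0])
    XtX = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    Xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    return solve(XtX, Xty)

def predict(coef, x):
    return coef[0] + sum(c * v for c, v in zip(coef[1:], x))

def loo_rmsep(X, y):
    """Refit with each sample held out once and score the held-out prediction."""
    sq = 0.0
    for i in range(len(X)):
        coef = ols_fit(X[:i] + X[i + 1:], y[:i] + y[i + 1:])
        sq += (y[i] - predict(coef, X[i])) ** 2
    return math.sqrt(sq / len(X))

# Hypothetical (back fat mm, muscle mm) pairs; not data from the paper.
X = [(10, 50), (12, 55), (15, 52), (18, 60), (20, 58), (25, 65), (14, 62), (22, 54)]
noise = [0.5, -0.4, 0.3, -0.6, 0.2, -0.1, 0.4, -0.3]
# Hypothetical generating model for lean meat percentage plus small noise.
y = [60 - 0.8 * f + 0.2 * m + e for (f, m), e in zip(X, noise)]
rmsep = loo_rmsep(X, y)
print(f"LOO RMSEP = {rmsep:.3f}")
```

Because each prediction is made by a model that never saw the held-out carcass, the resulting RMSEP is a less optimistic error estimate than the residual error of a single fit on all samples.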

A topological sub-structural approach for predicting human intestinal absorption of drugs.

2004

The human intestinal absorption (HIA) of drugs was studied using a topological sub-structural approach (TOPS-MODE). The drugs were divided into three classes according to reported cutoff values for HIA. "Poor" absorption was defined as HIA ≤ 30%, "high" absorption as HIA ≥ 80%, whereas "moderate" absorption was defined between these two values (30% < HIA < 79%). Two linear discriminant analyses were carried out on a training set of 82 compounds. The percentage of correct classification, for both models, was 89.02%. The predictive power of the models was validated by three tests: a leave-one-out cross validation procedure (88.9% and 87.9%), an external prediction set of 127 drugs (92.9% and 80…

Pharmacology; Quantitative structure–activity relationship; Chemistry; Organic Chemistry; Biological Availability; Quantitative Structure-Activity Relationship; General Medicine; Models Theoretical; Linear discriminant analysis; Topology; Cross-validation; Intestinal absorption; Bioavailability; Intestinal Absorption; Pharmaceutical Preparations; Test set; Drug Discovery; Human intestinal absorption; Cutoff; Humans; Intestinal Mucosa; European journal of medicinal chemistry
researchProduct

Predicting ACL Injury Using Machine Learning on Data From an Extensive Screening Test Battery of 880 Female Elite Athletes

2022

Background: Injury risk prediction is an emerging field in which more research is needed to recognize the best practices for accurate injury risk assessment. Important issues related to predictive machine learning need to be considered, for example, to avoid overinterpreting the observed prediction performance. Purpose: To carefully investigate the predictive potential of multiple predictive machine learning methods on a large set of risk factor data for anterior cruciate ligament (ACL) injury; the proposed approach takes into account the effect of chance and random variations in prediction performance. Study Design: Case-control study; Level of evidence, 3. Methods: The authors used 3-dime…

Physical Therapy Sports Therapy and Rehabilitation; cross-validation; Machine Learning; urheilu [sports]; Humans; prediction significance; Orthopedics and Sports Medicine; joukkueurheilu [team sports]; Prospective Studies; liikeanalyysi [motion analysis]; suorituskyky [performance]; urheiluvammat [sports injuries]; ACL injury; Anterior Cruciate Ligament Injuries; motion analysis; predictive methods; machine learning; koneoppiminen [machine learning]; Athletes; Case-Control Studies; Athletic Injuries; ennustettavuus [predictability]; Female; team sports; loukkaantuminen (fyysinen) [injury (physical)]; urheilijat [athletes]
researchProduct

Feature selection on a dataset of protein families: from exploratory data analysis to statistical variable importance

2016

Proteins are characterized by several types of features (structural, geometrical, energy). Most of these features are expected to be similar within a protein family. We are interested in detecting which features can identify proteins that belong to a family, as well as in defining the boundaries among families. Some features are redundant: they could generate noise when identifying which variables are essential as a fingerprint and, consequently, whether or not they are related to a function of a protein family. We defined an original approach to analyze protein features for defining their relationships and peculiarities within protein families. A multistep approach has been mainly performed in R …

Quantitative Biology::Biomolecules; business.industry; Sparse PCA; Pattern recognition; Feature selection; Linear discriminant analysis; Cross-validation; Random forest; Exploratory data analysis; Statistical classification; Artificial intelligence; business; Cluster analysis; Mathematics
researchProduct

Atom-based 3D-chiral quadratic indices. Part 2: prediction of the corticosteroid-binding globulin binding affinity of the 31 benchmark steroids data s…

2005

A quantitative structure-activity relationship (QSAR) study to predict the relative affinities of the steroid 'benchmark' data set to the corticosteroid-binding globulin (CBG) is described. It is shown that the 3D-chiral quadratic indices closely correlate with the measured CBG affinity values for the 31 steroids. The calculated descriptors were correlated with biological data through multiple linear regressions. Two statistically significant models were obtained when non-stochastic (R = 0.924 and s = 0.46) as well as stochastic (R = 0.929 and s = 0.46) 3D-chiral quadratic indices were used. A leave-one-out (LOO) approach to model validation is used here; the best results obtained in the cr…

Quantitative structure–activity relationship; Clinical Biochemistry; Pharmaceutical Science; Quantitative Structure-Activity Relationship; Biochemistry; Cross-validation; Structure-Activity Relationship; Quadratic equation; Drug Discovery; Linear regression; Applied mathematics; Computer Simulation; Molecular Biology; Transcortin; Chromatography; Molecular Structure; Chemistry; Organic Chemistry; Computational Biology; Regression analysis; Affinities; Data set; Databases as Topic; Models Chemical; Topological index; Molecular Medicine; Steroids; Bioorganic & Medicinal Chemistry
researchProduct

Predicting antitrichomonal activity: A computational screening using atom-based bilinear indices and experimental proofs

2006

Existing Trichomonas vaginalis therapies are out of reach for most people with trichomoniasis in developing countries and, where available, they are limited by their toxicity (mainly in pregnant women) and their cost. New antitrichomonal agents are needed to combat emerging metronidazole-resistant trichomoniasis and reduce the side effects associated with currently available drugs. Toward this end, atom-based bilinear indices, a new TOMOCOMD-CARDD molecular descriptor, and linear discriminant analysis (LDA) were used to discover novel, potent, and non-toxic lead trichomonacidal chemicals. Two discriminant functions were obtained with the use of non-stochastic and stochastic atom-type bilinear in…

Quantitative structure–activity relationship; Databases Factual; Molecular model; Stereochemistry; Clinical Biochemistry; Drug Evaluation Preclinical; Pharmaceutical Science; Antitrichomonal Agents; Ligands; Biochemistry; Cross-validation; Chemometrics; Structure-Activity Relationship; chemistry.chemical_compound; Artificial Intelligence; Predictive Value of Tests; Molecular descriptor; Drug Discovery; Trichomonas vaginalis; Animals; Cluster Analysis; Computer Simulation; Molecular Biology; Stochastic Processes; Organic Chemistry; Computational Biology; Reproducibility of Results; Linear discriminant analysis; Antitrichomonal agent; chemistry; Data Interpretation Statistical; Topological index; Linear Models; Molecular Medicine; Biological system; Algorithms; Bioorganic & Medicinal Chemistry
researchProduct

Predicting Proteasome Inhibition using Atomic Weighted Vector and Machine Learning

2018

The Ubiquitin/Proteasome System (UPS) is a highly regulated mechanism of intracellular protein degradation and turnover. Through the concerted actions of a series of enzymes, proteins are marked for proteasomal degradation by being linked to the polypeptide co-factor, ubiquitin. The UPS participates in a wide array of biological functions such as antigen presentation, regulation of gene transcription and the cell cycle, and activation of NF-κB. Some researchers have applied QSAR methods and machine learning to the study of proteasome inhibition (EC50(µmol/L)), such as: the analysis of proteasome inhibition prediction, in the prediction of multi-target inhibitors of UPP and in the prediction of p…

Quantitative structure–activity relationship; business.industry; Protein contact map; Perceptron; Machine learning; computer.software_genre; Cross-validation; Random forest; Statistical classification; Molecular descriptor; Linear regression; Artificial intelligence; business; computer; Mathematics; Proceedings of MOL2NET 2018, International Conference on Multidisciplinary Sciences, 4th edition
researchProduct

A General Frame for Building Optimal Multiple SVM Kernels

2012

The aim of this paper is to define a general frame for building optimal multiple SVM kernels. Our scheme follows 5 steps: formal representation of the multiple kernels, structural representation, choice of genetic algorithm, SVM algorithm, and model evaluation. The optimal parameter values of the SVM kernels are computed using an evolutionary method that relies on the SVM algorithm to evaluate the quality of chromosomes. After the multiple kernel is found by the genetic algorithm, we apply cross-validation to estimate the performance of our predictive model. We implemented and compared many hybrid methods derived from this scheme. Improved co-mutation operators are u…

Scheme (programming language); Multiple kernel learning; business.industry; Computation; Pattern recognition; Cross-validation; Support vector machine; Genetic algorithm; Artificial intelligence; General frame; business; computer; Kernel (category theory); Mathematics; computer.programming_language
researchProduct

Determination of total phenolic compounds in compost by infrared spectroscopy

2016

Middle and near infrared (MIR and NIR) were applied to determine the total phenolic compounds (TPC) content in compost samples based on models built by using partial least squares (PLS) regression. The multiplicative scatter correction, standard normal variate and first derivative were employed as spectra pretreatment, and the number of latent variables was optimized by leave-one-out cross-validation. The performance of PLS-ATR-MIR and PLS-DR-NIR models was evaluated according to root mean square error of cross validation and prediction (RMSECV and RMSEP), the coefficient of determination for prediction ($R^2_{\text{pred}}$) and residual predictive deviation (RPD) being obtained for this la…

Spectroscopy Near-Infrared; Coefficient of determination; Spectrophotometry Infrared; Mean squared error; Chemistry; Compost; 010401 analytical chemistry; Near-infrared spectroscopy; Analytical chemistry; Infrared spectroscopy; 04 agricultural and veterinary sciences; engineering.material; Residual; 040401 food science; 01 natural sciences; Cross-validation; 0104 chemical sciences; Analytical Chemistry; Soil; 0404 agricultural biotechnology; Phenols; Partial least squares regression; engineering; Least-Squares Analysis; Talanta
researchProduct
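
Choosing a model-complexity setting by leave-one-out cross-validation, as the abstract above does for the number of PLS latent variables, follows the pattern sketched here. To keep the example short and dependency-free, a k-nearest-neighbour regressor stands in for PLS and the neighbour count k stands in for the number of latent variables. The data and all names are illustrative assumptions, not material from the paper.

```python
import math

def knn_predict(train, x, k):
    """Average the targets of the k training points nearest to x (1-D)."""
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    return sum(t for _, t in nearest) / k

def loo_rmsecv(data, k):
    """Root mean square error of leave-one-out cross-validation for a given k."""
    sq = 0.0
    for i, (x, t) in enumerate(data):
        train = data[:i] + data[i + 1:]
        sq += (t - knn_predict(train, x, k)) ** 2
    return math.sqrt(sq / len(data))

# Synthetic data: a linear trend with an alternating +/-1 perturbation.
data = [(float(i), 2.0 * i + (1.0 if i % 2 else -1.0)) for i in range(12)]

# Score every candidate setting and keep the one with the lowest RMSECV.
scores = {k: loo_rmsecv(data, k) for k in range(1, 7)}
best_k = min(scores, key=scores.get)
for k, s in scores.items():
    print(f"k = {k}: RMSECV = {s:.3f}")
print("selected k:", best_k)
```

The same loop applies to any tunable setting: compute the cross-validated error for each candidate, then select the candidate that minimizes it. A separate prediction set is still needed for an unbiased RMSEP, since the selected setting has been tuned on the cross-validation error.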