Search results for "Data mining"

showing 10 items of 907 documents

Photonic non-contact estimation of blood lactate level

2015

The ability to measure the blood lactate level in a non-invasive, non-contact manner is very appealing to the sports industry as well as the home care field, mainly because this level is an essential parameter in developing a personal workout program. Moreover, the blood lactate level is also a key means of estimating muscle performance capability. In this manuscript we propose an optical non-contact approach to estimate the concentration of blood lactate. First, we introduce the connection between physiological muscle tremor and blood lactate levels. Second, we suggest a photonic method to estimate the physiological tremo…

Computer science; Atomic and Molecular Physics and Optics; Article; Physiological tremor; Electromagnetic optics; Proof of concept; Control theory; Blood lactate; Data mining; Photonics; Laser beams; Biotechnology
researchProduct

Mass Spectrometry in Food Quality and Safety

2015

In recent years, mass spectrometry has gained wide recognition as a selective and fast technique for the analysis and assessment of a wide range of food products. The state of the art in determining the safety and quality of food is presented to illustrate the capability of this technique for classification and grading, defect and disease detection, distribution and visualization of chemical attributes, and evaluation of the overall quality of meat, fish, fruits, vegetables, and other food products. The features of mass spectrometry for each category are summarized in terms of the investigated quality and safety attributes, the systems used (triple quadrupole, quadrupole…

Computer science; Mass spectrometry; Food safety; Orbitrap; Triple quadrupole mass spectrometer; Chemometrics; Data analysis; Quality (business); Data mining; Food quality

Gaussian Process Regression (GPR) Representation in Predictive Model Markup Language (PMML)

2017

This paper describes Gaussian process regression (GPR) models presented in the Predictive Model Markup Language (PMML). PMML is an Extensible Markup Language (XML)-based standard language used to represent data-mining and predictive analytic models, as well as pre- and post-processed data. The previous PMML version, PMML 4.2, did not provide capabilities for representing probabilistic (stochastic) machine-learning algorithms, which are widely used for constructing predictive models that take the associated uncertainties into consideration. The newly released PMML version 4.3, which includes the GPR model, provides new features: confidence bounds and distribution for the pred…

Computer science; 02 engineering and technology; Industrial and Manufacturing Engineering; Article; Set (abstract data type); [SPI] Engineering Sciences [physics]; Kriging; 020204 information systems; 0202 electrical engineering electronic engineering information engineering; Uncertainty quantification; Representation (mathematics); predictive model markup language (PMML); Probabilistic logic; data mining; Predictive analytics; XML; Computer Science Applications; predictive analytics; Control and Systems Engineering; Predictive Model Markup Language; standards; 020201 artificial intelligence & image processing; Data mining; Gaussian process regression

A methodology to assess the intrinsic discriminative ability of a distance function and its interplay with clustering algorithms for microarray data …

2013

Background: Clustering is one of the best-known activities in scientific investigation and the object of research in many disciplines, ranging from statistics to computer science. Following Handl et al., it can be summarized as a three-step process: (1) choice of a distance function; (2) choice of a clustering algorithm; (3) choice of a validation method. Although such a purist approach to clustering is hardly seen in many areas of science, genomic data require that level of attention if inferences made from cluster analysis are to be of some relevance to biomedical research. Results: A procedure is proposed for the assessment of the discriminative ability of a distance functi…

Computer science; Biochemistry; Discriminative model; Structural Biology; Cluster Analysis; Relevance (information retrieval); Cluster analysis; Molecular Biology; Oligonucleotide Array Sequence Analysis; Clustering discriminative ability of a distance function external validation indices; Settore INF/01 - Informatica; Research; Applied Mathematics; Mutual information; Pearson product-moment correlation coefficient; Computer Science Applications; Hierarchical clustering; Euclidean distance; Range (mathematics); Metric (mathematics); Data mining; Transcriptome; Algorithms; BMC Bioinformatics

Indexing a sequence for mapping reads with a single mismatch

2014

Mapping reads against a genome sequence is an interesting and useful problem in computational molecular biology and bioinformatics. In this paper, we focus on the problem of indexing a sequence for mapping reads with a single mismatch. We first focus on a simpler problem where the length of the pattern is given beforehand during the data structure construction. This version of the problem is interesting in its own right in the context of the next generation sequencing. In the sequel, we show how to solve the more general problem. In both cases, our algorithm can construct an efficient data structure in time and space and can answer subsequent queries in time. Here, n is the length of the s…

Computer science; genome sequence; General Mathematics; [INFO.INFO-DS] Computer Science [cs]/Data Structures and Algorithms [cs.DS]; General Physics and Astronomy; Context (language use); algorithms; Pattern matching; Sequence; Search engine indexing; General Engineering; Wildcard character; Articles; Construct (python library); Data structure; mapping reads; pattern matching; ComputingMethodologies_DOCUMENTANDTEXTPROCESSING; Data mining; [INFO.INFO-BI] Computer Science [cs]/Bioinformatics [q-bio.QM]; Focus (optics); mismatch; Algorithm; indexing; Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences

Automated Uncertainty Quantification Through Information Fusion in Manufacturing Processes

2017

Evaluation of key performance indicators (KPIs) such as energy consumption is essential for decision-making during the design and operation of smart manufacturing systems. The measurements of KPIs are strongly affected by several uncertainty sources, such as input material uncertainty, the inherent variability in the manufacturing process, model uncertainty, and the uncertainty in the sensor measurements of operational data. A comprehensive understanding of the uncertainty sources and their effect on the KPIs is required to make manufacturing processes more efficient. Towards this objective, this paper proposes an automated methodology to generate a hierarchical B…

Computer science; injection molding; 02 engineering and technology; Industrial and Manufacturing Engineering; [SPI] Engineering Sciences [physics]; GME; 0202 electrical engineering electronic engineering information engineering; Uncertainty quantification; uncertainty; automation; hierarchical; Bayesian network; 020207 software engineering; meta-model; Automation; Computer Science Applications; Metamodeling; Information fusion; Control and Systems Engineering; semantic; 020201 artificial intelligence & image processing; Data mining

Mesh Visual Quality Assessment Metrics: A Comparison Study

2017

3D graphics technologies have advanced considerably in recent years, and several processing operations can be applied to 3D meshes, such as watermarking, compression, and simplification. Mesh visual quality assessment has become an important issue for evaluating the visual appearance of a 3D shape after such modifications. Several metrics have been proposed in this context, from classical distance-based metrics to perceptual metrics that incorporate information about the human visual system. In this paper, we study the performance of several mesh visual quality metrics. First, the comparison is conducted regardless of the distortion types neither…

Computer science; 020207 software engineering; Context (language use); 02 engineering and technology; Visual appearance; Visualization; Metric (mathematics); Human visual system model; 0202 electrical engineering electronic engineering information engineering; 020201 artificial intelligence & image processing; Quality (business); Polygon mesh; Data mining; 3D computer graphics; 2017 13th International Conference on Signal-Image Technology & Internet-Based Systems (SITIS)

Diversity in random subspacing ensembles

2004

Ensembles of learnt models constitute one of the main current directions in machine learning and data mining. It was shown experimentally and theoretically that in order for an ensemble to be effective, it should consist of classifiers having diversity in their predictions. A number of ways are known to quantify diversity in ensembles, but little research has been done about their appropriateness. In this paper, we compare eight measures of the ensemble diversity with regard to their correlation with the accuracy improvement due to ensembles. We conduct experiments on 21 data sets from the UCI machine learning repository, comparing the correlations for random subspacing ensembles with diffe…

Computer science; Ambiguity; Ensemble diversity; Ensemble learning; Data warehouse; Correlation; Information extraction; Knowledge extraction; Statistics; Entropy (information theory); Data mining

Missing values in deduplication of electronic patient data

2011

Data deduplication refers to the process in which records referring to the same real-world entities are detected in datasets so that duplicate records can be eliminated. The term ‘record linkage’ is used here for the same problem [1]. A typical application is the deduplication of medical registry data [2, 3]. Medical registries are institutions that collect medical and personal data in a standardized and comprehensive way. The primary aims are the creation of a pool of patients eligible for clinical or epidemiological studies and the computation of certain indices, such as incidence, in order to monitor the development of diseases. The latter task in particular requires a database in wh…

Computer science; Inference; Health Informatics; Ambiguity; Patient data; Missing data; Research and Applications; Regression; Neoplasms; Statistics; Data deduplication; Electronic Health Records; Humans; Data mining; Imputation (statistics); Medical Record Linkage; Registries; Record linkage

A Large-Scale Empirical Evaluation of Cross-Validation and External Test Set Validation in (Q)SAR.

2013

(Q)SAR model validation is essential to ensure the quality of inferred models and to indicate future model predictivity on unseen compounds. Proper validation is also one of the requirements of regulatory authorities for accepting a (Q)SAR model and approving its use in real-world scenarios as an alternative testing method. At the same time, however, the question of how to validate a (Q)SAR model, in particular whether to employ variants of cross-validation or external test set validation, is still under discussion. In this paper, we empirically compare k-fold cross-validation with external test set validation. To this end we introduce a workflow that makes it possible to realistically simulate t…

Computer science; Organic Chemistry; Scale (descriptive set theory); Variance (accounting); Cross-validation; Computer Science Applications; Model validation; Workflow; Structural Biology; Cheminformatics; Test set; Drug Discovery; Molecular Medicine; Quality (business); Data mining; Molecular informatics