Search results for "Data mining"
showing 10 items of 907 documents
Application of model quality evaluation to systems biology
2008
Application of model quality evaluation to quasispecies models is presented. These models are useful for the analysis of DNA and RNA evolution and for the description of the population dynamics of viruses and bacteria. An estimate of the parameters together with their interval of variability is computed, and the quality evaluation is tested on the basis of the model's prediction error.
Data Mining in Cancer Research [Application Notes]
2010
This article is not intended as a comprehensive survey of data mining applications in cancer. Rather, it provides starting points for further, more targeted, literature searches, by embarking on a guided tour of computational intelligence applications in cancer medicine, structured in increasing order of the physical scales of biological processes.
A Metamodeling Approach to Evolution
2001
With the increasing complexity of systems being modeled, analysis & design move towards more and more abstract methodologies. Most of them rely on metamodeling tools that employ multi-view models and the four-layer metamodeling architecture. Our idea is to use the metamodeling approach to classify and to constrain the possible evolutions of an information system in order to improve both detection of evolution conflicts and disciplined reuse. Within the domain of UML metamodeling, a refinement of the metamodel-level classification is proposed that includes bases for defining a metric of the evolution (in terms of distance between metamodels).
Concept Drift Detection Using Online Histogram-Based Bayesian Classifiers
2016
In this paper, we present a novel algorithm that performs online histogram-based classification, i.e., specifically designed for the case when the data is dynamic and its distribution is non-stationary. Our method, called the Online Histogram-based Naïve Bayes Classifier (OHNBC) involves a statistical classifier based on the well-established Bayesian theory, but which makes some assumptions with respect to the independence of the attributes. Moreover, this classifier generates a prediction model using uni-dimensional histograms, whose segments or buckets are fixed in terms of their cardinalities but dynamic in terms of their widths. Additionally, our algorithm invokes the principles of info…
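The phrase "fixed in terms of their cardinalities but dynamic in terms of their widths" describes what is usually called an equal-frequency histogram. The following is a minimal batch sketch of that idea only, not the OHNBC algorithm itself (which is online and whose details are not given in this snippet); the function names `equal_frequency_edges` and `bucket_index` are hypothetical.

```python
from bisect import bisect_right

def equal_frequency_edges(values, n_buckets):
    """Return inner bucket boundaries so each bucket holds about len(values)/n_buckets points."""
    ordered = sorted(values)
    n = len(ordered)
    # Boundary k is the value that starts bucket k, so widths adapt to the data.
    return [ordered[(k * n) // n_buckets] for k in range(1, n_buckets)]

def bucket_index(edges, x):
    """Index of the bucket that x falls into (0 .. len(edges))."""
    return bisect_right(edges, x)

# Illustrative, skewed data: bucket widths differ, bucket counts do not.
values = [1, 2, 3, 4, 10, 20, 30, 100]
edges = equal_frequency_edges(values, 4)
counts = [0] * 4
for v in values:
    counts[bucket_index(edges, v)] += 1
```

An online variant would have to move the boundaries as new examples arrive; this sketch recomputes them from a stored batch, which is an assumption made for brevity.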
Online Estimation of Discrete Densities
2013
We address the problem of estimating a discrete joint density online, that is, the algorithm is only provided the current example and its current estimate. The proposed online estimator of discrete densities, EDDO (Estimation of Discrete Densities Online), uses classifier chains to model dependencies among features. Each classifier in the chain estimates the probability of one particular feature. Because a single chain may not provide a reliable estimate, we also consider ensembles of classifier chains and ensembles of weighted classifier chains. For all density estimators, we provide consistency proofs and propose algorithms to perform certain inference tasks. The empirical evaluation of t…
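The chain idea in this abstract rests on the factorization p(x1, ..., xd) = p(x1) · p(x2 | x1) · ... · p(xd | x1, ..., xd-1), with one estimator per feature. The sketch below illustrates that factorization under a loud assumption: each "classifier" is replaced by a Laplace-smoothed count table, which is not necessarily what EDDO uses. The class name `ChainDensity` and its methods are hypothetical.

```python
from collections import defaultdict

class ChainDensity:
    """Toy online joint-density estimator built from a chain of conditional count tables."""

    def __init__(self, n_values):
        # n_values[i] = number of distinct values feature i can take
        self.n_values = n_values
        # counts[i][prefix][value] = times feature i took `value` after seeing `prefix`
        self.counts = [defaultdict(lambda: defaultdict(int)) for _ in n_values]

    def update(self, x):
        """Consume one example; no past examples are stored."""
        for i, v in enumerate(x):
            self.counts[i][tuple(x[:i])][v] += 1

    def prob(self, x):
        """Chain-rule product of Laplace-smoothed conditional estimates."""
        p = 1.0
        for i, v in enumerate(x):
            table = self.counts[i].get(tuple(x[:i]), {})
            total = sum(table.values())
            p *= (table.get(v, 0) + 1) / (total + self.n_values[i])
        return p

model = ChainDensity([2, 2])
for example in [(0, 0), (0, 1), (1, 1), (0, 0)]:
    model.update(example)
total_mass = sum(model.prob((a, b)) for a in range(2) for b in range(2))
```

Because every conditional table is normalized, the estimated probabilities over all value combinations sum to one, whichever order the features are chained in.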
Handling local concept drift with dynamic integration of classifiers : domain of antibiotic resistance in nosocomial infections
2006
In the real world, concepts and data distributions are often not stable but change with time. This problem, known as concept drift, complicates the task of learning a model from data and requires special approaches, different from commonly used techniques, which treat arriving instances as equally important contributors to the target concept. Among the most popular and effective approaches to handle concept drift is ensemble learning, where a set of models built over different time periods is maintained and the best model is selected or the predictions of models are combined. In this paper we consider the use of an ensemble integration technique that helps to better handle concept drift at t…
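The general ensemble strategy mentioned here (keep models trained on different time periods, then select the best one) can be sketched as follows. This illustrates only the generic selection step, not the dynamic integration technique of the paper, whose description is cut off in this snippet. The base learner is a toy majority-class predictor, and all names and data are hypothetical.

```python
from collections import Counter

def train_majority(labels):
    """Toy base model: always predicts the most common label in its time window."""
    majority = Counter(labels).most_common(1)[0][0]
    return lambda _x: majority

def accuracy(model, xs, ys):
    """Fraction of examples the model labels correctly."""
    return sum(model(x) == y for x, y in zip(xs, ys)) / len(ys)

def select_best(models, recent_xs, recent_ys):
    """Pick the ensemble member that scores highest on the most recent labelled data."""
    return max(models, key=lambda m: accuracy(m, recent_xs, recent_ys))

# Three consecutive time windows; the concept drifts from mostly "a" to mostly "b".
windows = [["a", "a", "a"], ["a", "b", "a"], ["b", "b", "a"]]
models = [train_majority(w) for w in windows]
recent_xs, recent_ys = [0, 1, 2], ["b", "b", "a"]
best = select_best(models, recent_xs, recent_ys)
```

Combining predictions, the other option the abstract names, would replace `select_best` with a vote weighted by each member's recent accuracy.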
Effect of amount of biomaterial used for maxillary sinus lift on volume maintenance of grafts
2020
Background: Regardless of the kind of biomaterial used for the graft, it is clear that, over time, the graft undergoes dimensional changes that could influence the final bone volume obtained, which could alter the stability of the installed implants. The aim of the present study was to compare and correlate the graft behavior with the amount (in grams) of xenogeneic and alloplastic biomaterials used in grafts for maxillary sinus lift. Material and Methods: This retrospective cohort study used 148 CBCT images of 74 grafts from 68 maxillary sinus lift patients in a university postgraduate clinic. The weights of biomaterials, categorized in intervals according to amount used, were correlat…
On utilizing dependence-based information to enhance micro-aggregation for secure statistical databases
2011
Published version of an article in the journal: Pattern Analysis and Applications. Also available from the publisher at: http://dx.doi.org/10.1007/s10044-011-0199-9 We consider the micro-aggregation problem, which involves partitioning a set of individual records in a micro-data file into a number of mutually exclusive and exhaustive groups. This problem, which seeks the best partition of the micro-data file, is known to be NP-hard, and has been tackled using many heuristic solutions. In this paper, we would like to demonstrate that in the process of developing micro-aggregation techniques (MATs), it is expedient to incorporate information about the dependence between the random variable…
Kernels for Remote Sensing Image Classification
2015
Classification of images acquired by airborne and satellite sensors is a very challenging problem. These remotely sensed images usually acquire information from the scene at different wavelengths or spectral channels. The main problems involved are related to the high dimensionality of the data to be classified and the very few existing labeled samples, the diverse noise sources involved in the acquisition process, the intrinsic nonlinearity and non-Gaussianity of the data distribution in feature spaces, and the high computational cost involved to process big data cubes in near real time. The framework of statistical learning in general, and of kernel methods in particular, has gained popul…
2004
The current progress in sequencing projects calls for rapid, reliable and accurate function assignments of gene products. A variety of methods has been designed to annotate sequences on a large scale. However, these methods can either only be applied for specific subsets, or their results are not formalised, or they do not provide precise confidence estimates for their predictions. We have developed a large-scale annotation system that tackles all of these shortcomings. In our approach, annotation was provided through Gene Ontology terms by applying multiple Support Vector Machines (SVM) for the classification of correct and false predictions. The general performance of the system was bench…