Search results for "Software"
showing 10 items of 7396 documents
Feature Selection for Ensembles of Simple Bayesian Classifiers
2002
A popular method for creating an accurate classifier from a set of training data is to train several classifiers and then combine their predictions. Ensembles of simple Bayesian classifiers have traditionally not been a focus of research. However, the simple Bayesian classifier has much broader applicability than previously thought. Besides its high classification accuracy, it also has advantages in terms of simplicity, learning speed, classification speed, storage space, and incrementality. One way to generate an ensemble of simple Bayesian classifiers is to use different feature subsets, as in the random subspace method. In this paper we present a technique for building ensembles o…
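The random subspace idea mentioned in this abstract can be sketched as follows. This is a minimal illustration, not the technique the paper itself presents (the abstract is cut off before describing it). It assumes numeric features modelled with per-class Gaussians, plain majority voting, and a toy data set; the names `GaussianNaiveBayes`, `random_subspace_ensemble`, and `vote` are hypothetical.

```python
import math
import random
from collections import Counter


class GaussianNaiveBayes:
    """Simple Bayesian classifier restricted to a subset of feature indices.

    Assumption: each selected feature is Gaussian within each class.
    """

    def __init__(self, feature_idx):
        self.feature_idx = list(feature_idx)

    def fit(self, X, y):
        # Class priors estimated from label frequencies.
        self.priors = {c: n / len(y) for c, n in Counter(y).items()}
        # Per class: one (mean, variance) pair for each selected feature.
        self.stats = {}
        for c in self.priors:
            rows = [x for x, label in zip(X, y) if label == c]
            params = []
            for j in self.feature_idx:
                values = [r[j] for r in rows]
                mean = sum(values) / len(values)
                # Small constant keeps the variance strictly positive.
                var = sum((v - mean) ** 2 for v in values) / len(values) + 1e-6
                params.append((mean, var))
            self.stats[c] = params
        return self

    def predict(self, x):
        def log_posterior(c):
            score = math.log(self.priors[c])
            for j, (mean, var) in zip(self.feature_idx, self.stats[c]):
                score += -0.5 * math.log(2 * math.pi * var)
                score -= (x[j] - mean) ** 2 / (2 * var)
            return score

        return max(self.priors, key=log_posterior)


def random_subspace_ensemble(X, y, n_members=5, subset_size=2, seed=0):
    """Train each member on a randomly drawn subset of the features."""
    rng = random.Random(seed)
    n_features = len(X[0])
    return [
        GaussianNaiveBayes(rng.sample(range(n_features), subset_size)).fit(X, y)
        for _ in range(n_members)
    ]


def vote(ensemble, x):
    """Combine member predictions by simple majority vote."""
    return Counter(member.predict(x) for member in ensemble).most_common(1)[0][0]


# Toy data (made up for illustration): two well-separated classes.
X = [[1.0, 1.2, 0.9], [0.8, 1.0, 1.1], [1.1, 0.9, 1.0],
     [5.0, 5.2, 4.9], [4.8, 5.1, 5.0], [5.1, 4.9, 5.2]]
y = ["a", "a", "a", "b", "b", "b"]

ensemble = random_subspace_ensemble(X, y)
print(vote(ensemble, [1.0, 1.0, 1.0]), vote(ensemble, [5.0, 5.0, 5.0]))
```

Each member sees only `subset_size` features, so the members differ from one another even though they share the same training rows; that diversity is what the voting step exploits.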
Ensemble Feature Selection Based on the Contextual Merit
2001
Recent research has proved the benefits of using ensembles of classifiers for classification problems. Ensembles constructed by machine learning methods manipulating the training set are used to create diverse sets of accurate classifiers. Different feature selection techniques based on applying different heuristics for generating base classifiers can be adjusted to specific domain characteristics. In this paper we consider and experiment with the contextual feature merit measure as a feature selection heuristic. We use the diversity of an ensemble as the evaluation function in our new algorithm with a refinement cycle. We have evaluated our algorithm on seven data sets from UCI. The experiment…
Putting the user into the active learning loop: Towards realistic but efficient photointerpretation
2012
In recent years, several studies have been published on the smart definition of training sets using active learning algorithms. However, none of these works considers the contradiction between active learning methods, which rank pixels according to their uncertainty, and the confidence of the user in labeling, which is related both to the homogeneity of the pixel context and to the user's knowledge of the scene. In this paper, we propose a two-step procedure based on a filtering scheme to learn the confidence of the user in labeling. This way, candidate training pixels are ranked according to both their uncertainty and their chances of being labeled correctly by the user. In…
2004
This paper presents the use of Support Vector Machines (SVMs) for prediction and analysis of antisense oligonucleotide (AO) efficacy. The collected database comprises 315 AO molecules including 68 features each, inducing a problem well-suited to SVMs. The task of feature selection is crucial given the presence of noisy or redundant features, and the well-known problem of the curse of dimensionality. We propose a two-stage strategy to develop an optimal model: (1) feature selection using correlation analysis, mutual information, and SVM-based recursive feature elimination (SVM-RFE), and (2) AO prediction using standard and profiled SVM formulations. A profiled SVM gives different weights to …
Multilayer neural networks: an experimental evaluation of on-line training methods
2004
Artificial neural networks (ANN) are inspired by the structure of biological neural networks and their ability to integrate knowledge and learning. In ANN training, the objective is to minimize the error over the training set. The most popular method for training these networks is back propagation, a gradient descent technique. Other non-linear optimization methods such as conjugate directions set or conjugate gradient have also been used for this purpose. Recently, metaheuristics such as simulated annealing, genetic algorithms or tabu search have also been adapted to this context. There are situations in which the necessary training data are being generated in real time and, an extensive tr…
Intelligent Sampling for Vegetation Nitrogen Mapping Based on Hybrid Machine Learning Algorithms
2021
Upcoming satellite imaging spectroscopy missions will deliver spatiotemporally explicit data streams to be exploited for mapping vegetation properties, such as nitrogen (N) content. Within retrieval workflows for real-time mapping over agricultural regions, such crop-specific information products need to be derived precisely and rapidly. To allow fast processing, intelligent sampling schemes for training databases should be incorporated to establish efficient machine learning (ML) models. In this study, we implemented active learning (AL) heuristics using kernel ridge regression (KRR) to minimize and optimize a training database for variational heteroscedastic Gaussian processes regression (V…
Ensemble Feature Selection Based on Contextual Merit and Correlation Heuristics
2001
Recent research has proven the benefits of using ensembles of classifiers for classification problems. Ensembles of diverse and accurate base classifiers are constructed by machine learning methods manipulating the training sets. One way to manipulate the training set is to use feature selection heuristics for generating the base classifiers. In this paper we examine two of them: correlation-based and contextual merit-based heuristics. Both rely on quite similar assumptions concerning heterogeneous classification problems. Experiments are conducted on several data sets from the UCI Repository. We construct a fixed number of base classifiers over selected feature subsets and refine the ensemble iter…
One-Class Classifiers: A Review and Analysis of Suitability in the Context of Mobile-Masquerader Detection
2007
One-class classifiers, which are trained only on data from one class, are justified when data from other classes are difficult to obtain. In particular, their use is justified in mobile-masquerader detection, where user characteristics are classified as belonging to the legitimate user class or to the impostor class, and where collecting data originating from impostors is problematic. This paper systematically reviews various one-class classification methods and analyses their suitability in the context of mobile-masquerader detection. For each classification method, its sensitivity to the errors in the training set, computational requirements, and other characteristics are consi…
Learning Similarity Scores by Using a Family of Distance Functions in Multiple Feature Spaces
2017
There exist a large number of distance functions that allow one to measure similarity between feature vectors and thus can be used for ranking purposes. When multiple representations of the same object are available, distances in each representation space may be combined to produce a single similarity score. In this paper, we present a method to build such a similarity ranking out of a family of distance functions. Unlike other approaches that aim to select the best distance function for a particular context, we use several distances and combine them in a convenient way. To this end, we adopt a classical similarity learning approach and frame the problem as a standard supervised machine lea…
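One way to read the idea of combining several distances into a single learned similarity score is sketched below. It is an assumption-laden illustration rather than the authors' method (the abstract is truncated before the details): it assumes three common distances, a single feature space, labelled pairs marked similar (1) or dissimilar (0), and a small logistic regression trained by stochastic gradient descent. All function names and the toy pairs are hypothetical.

```python
import math


def euclidean(a, b):
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def manhattan(a, b):
    return sum(abs(p - q) for p, q in zip(a, b))


def chebyshev(a, b):
    return max(abs(p - q) for p, q in zip(a, b))


# Assumed family of distance functions; any other set could be substituted.
DISTANCES = (euclidean, manhattan, chebyshev)


def distance_features(a, b):
    """Describe a pair of objects by the vector of all distances between them."""
    return [d(a, b) for d in DISTANCES]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train(pairs, labels, lr=0.1, epochs=500):
    """Fit logistic-regression weights over the distance vector with SGD."""
    feats = [distance_features(a, b) for a, b in pairs]
    w = [0.0] * len(DISTANCES)
    bias = 0.0
    for _ in range(epochs):
        for f, target in zip(feats, labels):
            pred = sigmoid(bias + sum(wi * fi for wi, fi in zip(w, f)))
            err = pred - target  # gradient of the log loss w.r.t. the logit
            w = [wi - lr * err * fi for wi, fi in zip(w, f)]
            bias -= lr * err
    return w, bias


def similarity(a, b, w, bias):
    """Learned similarity score in (0, 1); higher means more similar."""
    f = distance_features(a, b)
    return sigmoid(bias + sum(wi * fi for wi, fi in zip(w, f)))


# Toy labelled pairs (made up): 1 = similar, 0 = dissimilar.
pairs = [((0, 0), (0.1, 0.1)), ((1, 1), (1.2, 0.9)),
         ((0, 0), (3, 3)), ((1, 1), (4, 0))]
labels = [1, 1, 0, 0]

w, bias = train(pairs, labels)
print(similarity((0, 0), (0.2, 0.1), w, bias))
print(similarity((0, 0), (3, 4), w, bias))
```

The learned weights decide how much each distance contributes, so no single "best" distance has to be chosen in advance; candidates can then be ranked by the resulting score.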
Tomographic image processing for the morphological and metrological study of Valsalva sinuses
2012
This PhD thesis deals with the design and use of image processing tools to allow a reliable and objective study of the sinuses of Valsalva, which are important cavities of the aortic root. The proposed methods can be applied to cine-MR sequences and CT examinations without any change in the settings between two examinations. Firstly, we studied the morphology of this anatomical area and its constant properties in all images of the dataset. Sinuses are one of the main bright organs with limited movements. Hence a new algorithm has been designed. It detects and characterizes each bright organ by a single trajectory. Various tools of mathematical morphology are used for this step, a…