Search results for "Machine learning"
showing 10 items of 1464 documents
Diversity in search strategies for ensemble feature selection
2005
Ensembles of learnt models constitute one of the main current directions in machine learning and data mining. Ensembles allow us to achieve higher accuracy, which is often not achievable with single models. It was shown theoretically and experimentally that in order for an ensemble to be effective, it should consist of base classifiers that have diversity in their predictions. One technique, which proved to be effective for constructing an ensemble of diverse base classifiers, is the use of different feature subsets, or so-called ensemble feature selection. Many ensemble feature selection strategies incorporate diversity as an objective in the search for the best collection of feature subse…
Distance functions in dynamic integration of data mining techniques
2000
One of the most important directions in the improvement of data mining and knowledge discovery is the integration of multiple data mining techniques. An integration method needs to be able either to evaluate and select the most appropriate data mining technique or to combine two or more techniques efficiently. A recent integration method for the dynamic integration of multiple data mining techniques is based on the assumption that each of the data mining techniques is the best one inside a certain subarea of the whole domain area. This method uses an instance-based learning approach to collect information about the competence areas of the mining techniques and applies a distance function to…
Machine Learning Methods for Spatial and Temporal Parameter Estimation
2020
Monitoring vegetation with satellite remote sensing is of paramount importance for understanding the status and health of our planet. Accurate and constant monitoring of the biosphere has large societal, economic, and environmental implications, given the world population's increasing demand for biofuels and food. The current democratization of machine learning, big data, and high processing capabilities allows us to take on this endeavor decisively. This chapter proposes three novel machine learning approaches to exploit spatial, temporal, multi-sensor, and large-scale data characteristics. We show (1) the application of multi-output Gaussian processes for gap-filling time series of…
Computer-Aided Diagnosis in Chest Radiology: Current Topics and Techniques (Computerunterstützte Diagnostik in der Thoraxradiologie - aktuelle Schwerpunkte und Techniken)
2003
The proliferation of digital data sets and the increasing number of images, e.g., through the use of multislice spiral CT or multiple follow-up examinations in the context of new therapies, are ideal prerequisites for computer-aided diagnosis (CAD) in chest radiology. Multiple studies have described the applications and advantages of computer assistance in performing different diagnostic tasks. More powerful computers will enable the introduction of these systems into the clinical routine and could provide an enormous increase in morphological and functional information. The commercial introduction of tools for detection and visualization of pulmonary nodules has already begun. This is one …
A generalizability measure for program synthesis with genetic programming
2021
The generalizability of programs synthesized by genetic programming (GP) to unseen test cases is one of the main challenges of GP-based program synthesis. Recent work showed that increasing the amount of training data improves the generalizability of the programs synthesized by GP. However, generating training data is usually an expensive task as the output value for every training case must be calculated manually by the user. Therefore, this work suggests an approximation of the expected generalization ability of solution candidates found by GP. To obtain candidate solutions that all solve the training cases, but are structurally different, a GP run is not stopped after the first solution …
Inference of Spatiotemporal Processes over Graphs via Kernel Kriged Kalman Filtering
2018
Inference of space-time signals evolving over graphs emerges naturally in a number of network science related applications. A frequently encountered challenge pertains to reconstructing such dynamic processes given their values over a subset of vertices and time instants. The present paper develops a graph-aware kernel-based kriged Kalman filtering approach that leverages the spatio-temporal dynamics to allow for efficient online reconstruction, while also coping with dynamically evolving network topologies. Laplacian kernels are employed to perform kriging over the graph when spatial second-order statistics are unknown, as is often the case. Numerical tests with synthetic and real data ill…
A word prediction methodology for automatic sentence completion
2015
Word prediction generally relies on n-gram occurrence statistics, which may have huge data storage requirements and do not take into account the general meaning of the text. We propose an alternative methodology, based on Latent Semantic Analysis, to address these issues. An asymmetric Word-Word frequency matrix is employed to achieve higher scalability with large training datasets than the classic Word-Document approach. We propose a function for scoring candidate terms for the missing word in a sentence. We show how this function approximates the probability of occurrence of a given candidate word. Experimental results show that the proposed approach outperforms non neural network lang…
A family of kernel anomaly change detectors
2014
This paper introduces the nonlinear extension of the anomaly change detection algorithms in [1] based on the theory of reproducing kernels. The presented methods generalize their linear counterparts, under both the Gaussian and elliptically-contoured assumptions, and produce both improved detection accuracies and reduced false alarm rates. We study the Gaussianity of the data in Hilbert spaces with kernel dependence estimates, provide low-rank kernel versions to cope with the high computational cost of the methods, and give prescriptions about the selection of the kernel functions and their parameters. We illustrate the performance of the introduced kernel methods in both pervasive and anom…
Validation of a Reinforcement Learning Policy for Dosage Optimization of Erythropoietin
2007
This paper deals with the validation of a Reinforcement Learning (RL) policy for dosage optimization of Erythropoietin (EPO). This policy was obtained using data from patients in a haemodialysis program during the year 2005. The goal of this policy was to maintain patients' Haemoglobin (Hb) level between 11.5 g/dl and 12.5 g/dl. Individualized management was needed, as each patient usually responds differently to the treatment. RL provides an attractive and satisfactory solution, showing that a policy based on RL would be much more successful in achieving the goal of maintaining patients within the desired target of Hb than the policy followed by the hospital so far. In this work, t…
Restricted Neighborhood Search Clustering Revisited: An Evolutionary Computation Perspective
2013
Protein-protein interaction networks have been broadly studied in the last few years in order to understand the behavior of proteins inside the cell. Proteins interacting with each other often share common biological functions or participate in the same biological process. Thus, discovering protein complexes made up of groups of strictly related proteins can be useful to predict protein functions. Clustering techniques have been widely employed to detect significant biological complexes. In this paper, we integrate one of the most popular network clustering techniques, namely the Restricted Neighborhood Search Clustering (RNSC), with evolutionary computation. The two cost functions in…