Search results for "Density estimation"
Showing 10 of 61 documents
Forest of Normalized Trees: Fast and Accurate Density Estimation of Streaming Data
2018
Density estimation of streaming data is a relevant task in numerous domains. In this paper, a novel non-parametric density estimator called FRONT (forest of normalized trees) is introduced. It uses a structure of multiple normalized trees, segments the feature space of the data stream through a periodically updated linear transformation, and is able to adapt to ever-evolving data streams. FRONT provides accurate density estimation and performs favorably compared to existing online density estimators in terms of the average log score on multiple standard data sets. Its low complexity, with linear runtime and constant memory usage, makes FRONT suitable by design for large data streams. Final…
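The FRONT structure itself is not spelled out in this abstract, but the general task it addresses — online density estimation with constant memory — can be illustrated with a minimal sliding-window Gaussian kernel estimator (all names hypothetical; FRONT uses normalized trees, not a window of kernels):

```python
import numpy as np
from collections import deque

class StreamingKDE:
    """Minimal sliding-window Gaussian KDE for a 1-D stream.
    Illustrative sketch only, not the FRONT algorithm."""
    def __init__(self, window=500, bandwidth=0.3):
        self.buf = deque(maxlen=window)  # constant memory: old points are evicted
        self.h = bandwidth

    def update(self, x):
        self.buf.append(x)

    def pdf(self, x):
        pts = np.asarray(self.buf)
        z = (x - pts) / self.h
        return np.exp(-0.5 * z**2).sum() / (len(pts) * self.h * np.sqrt(2 * np.pi))

rng = np.random.default_rng(0)
kde = StreamingKDE()
for v in rng.normal(0.0, 1.0, 2000):  # stream of N(0, 1) samples
    kde.update(v)
density_at_mode = kde.pdf(0.0)        # close to 1/sqrt(2*pi) for this stream
```

The bounded deque gives the constant-memory property the abstract highlights; adapting to drift here simply means old samples fall out of the window.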
Online Density Estimation of Heterogeneous Data Streams in Higher Dimensions
2016
The joint density of a data stream is suitable for performing data mining tasks without having access to the original data. However, the methods proposed so far only target a small to medium number of variables, since their estimates rely on representing all the interdependencies between the variables of the data. High-dimensional data streams, which are becoming more and more frequent due to increasing numbers of interconnected devices, are, therefore, pushing these methods to their limits. To mitigate these limitations, we present an approach that projects the original data stream into a vector space and uses a set of representatives to provide an estimate. Due to the structure of the est…
Optimized Kernel Entropy Components
2016
This work addresses two main issues of the standard Kernel Entropy Component Analysis (KECA) algorithm: the optimization of the kernel decomposition and the optimization of the Gaussian kernel parameter. KECA roughly reduces to a sorting of the importance of kernel eigenvectors by entropy instead of by variance as in Kernel Principal Components Analysis. In this work, we propose an extension of the KECA method, named Optimized KECA (OKECA), that directly extracts the optimal features retaining most of the data entropy by means of compacting the information in very few features (often in just one or two). The proposed method produces features which have higher expressive power. In particular…
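The sorting step the abstract describes can be sketched in a few lines: for a Gaussian kernel matrix K with eigenpairs (λ_i, e_i), KECA ranks components by their contribution λ_i·(1ᵀe_i)² to the Rényi entropy estimate, rather than by λ_i alone as in KPCA. A sketch under that standard KECA formulation (OKECA's kernel-parameter optimization is omitted):

```python
import numpy as np

def keca_order(X, sigma=1.0):
    """Rank kernel eigenvectors by entropy contribution (KECA)
    instead of by eigenvalue/variance (KPCA)."""
    # Gaussian kernel matrix on the samples
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    K = np.exp(-d2 / (2 * sigma**2))
    lam, E = np.linalg.eigh(K)            # ascending eigenvalues
    lam, E = lam[::-1], E[:, ::-1]        # descending: the KPCA ordering
    entropy = lam * (E.sum(axis=0) ** 2)  # lambda_i * (1^T e_i)^2 per component
    return np.argsort(entropy)[::-1]      # KECA ordering: by entropy contribution

rng = np.random.default_rng(1)
X = np.vstack([rng.normal(m, 0.2, (20, 2)) for m in (0.0, 3.0)])
order = keca_order(X)  # permutation of component indices, entropy-ranked
```

The entropy ranking can differ from the variance ranking, which is exactly why KECA may retain most of the data entropy in very few components.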
Kernel methods and their derivatives: Concept and perspectives for the earth system sciences
2020
Kernel methods are powerful machine learning techniques which implement generic non-linear functions to solve complex tasks in a simple way. They have a solid mathematical background and exhibit excellent performance in practice. However, kernel machines are still considered black-box models, as the feature mapping is not directly accessible and is difficult to interpret. The aim of this work is to show that it is indeed possible to interpret the functions learned by various kernel methods, despite their complexity. Specifically, we show that derivatives of these functions have a simple mathematical formulation, are easy to compute, and can be applied to many different problems. We n…
A Unified SVM Framework for Signal Estimation
2013
This paper presents a unified framework to tackle estimation problems in Digital Signal Processing (DSP) using Support Vector Machines (SVMs). The use of SVMs in estimation problems has traditionally been limited to their mere use as a black-box model. Noting such limitations in the literature, we take advantage of several properties of Mercer's kernels and functional analysis to develop a family of SVM methods for estimation in DSP. Three types of signal model equations are analyzed. First, when a specific time-signal structure is assumed to model the underlying system that generated the data, the linear signal model (the so-called Primal Signal Model formulation) is stated and analyzed. T…
Hybrid chaotic firefly decision making model for Parkinson’s disease diagnosis
2020
Parkinson’s disease is a progressive neurodegenerative condition that affects the motor circuit through the loss of up to 70% of dopaminergic neurons. Diagnosing it at an early stage is therefore of great importance. In this article, a novel chaos-based stochastic model is proposed by combining the characteristics of the chaotic firefly algorithm with the Kernel-based Naïve Bayes (KNB) algorithm for diagnosis of Parkinson’s disease at an early stage. The efficiency of the model is tested on a voice measurement dataset collected from the UC Irvine Machine Learning Repository. The dynamics of the chaos optimization algorithm enhance the firefly algorithm by introducing six types of chao…
Functional Data Analysis in NTCP Modeling: A New Method to Explore the Radiation Dose-Volume Effects
2014
Purpose/Objective(s): To describe a novel method to explore radiation dose-volume effects. Functional data analysis is used to investigate the information contained in differential dose-volume histograms. The method is applied to the normal tissue complication probability modeling of rectal bleeding (RB) for patients irradiated in the prostatic bed by 3-dimensional conformal radiation therapy. Methods and Materials: Kernel density estimation was used to estimate the individual probability density functions from each of the 141 rectum differential dose-volume histograms. Functional principal component analysis was performed on the estimated probability density functions to explore the variatio…
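The first step described here — turning each differential dose-volume histogram into a probability density via kernel density estimation — can be sketched with SciPy's `gaussian_kde` on synthetic dose data (the 141 patient histograms are not reproduced here, so the bimodal sample below is purely illustrative):

```python
import numpy as np
from scipy.stats import gaussian_kde

# Hypothetical per-patient dose samples (Gy), standing in for one
# differential dose-volume histogram; two dose "plateaus" for illustration.
rng = np.random.default_rng(42)
doses = np.concatenate([rng.normal(30, 5, 300), rng.normal(60, 4, 200)])

# Kernel density estimate of the dose distribution for this patient.
kde = gaussian_kde(doses)
grid = np.linspace(0, 80, 161)
pdf = kde(grid)

# Sanity check: the estimated density integrates to ~1 over the dose range.
area = pdf.sum() * (grid[1] - grid[0])
```

In the paper's pipeline, one such estimated density per patient would then feed into functional principal component analysis.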
GIS-data related route optimization, hierarchical clustering, location optimization, and kernel density methods are useful for promoting distributed …
2019
Currently, geographic information system (GIS) models are popular for studying location-allocation-related questions concerning bioenergy plants. The aim of this study was to develop a model to investigate optimal locations for two different types of bioenergy plants, for farm and centralized biogas plants, and for wood terminals in rural areas based on minimizing transportation distances. The optimal locations of biogas plants were determined using location optimization tools in R software, and the optimal locations of wood terminals were determined using kernel density tools in ArcGIS. The present case study showed that the utilized GIS tools are useful for bioenergy-related decision-maki…
Kernel estimation and display of a five-dimensional conditional intensity function
2018
The aim of this paper is to find a convenient and effective method of displaying some second-order properties in a neighbourhood of a selected point of the process. The techniques used are based on very general high-dimensional nonparametric smoothing, developed to define a more general version of the conditional intensity function introduced in earlier earthquake studies by Vere-Jones (1978). Ripley's K-function (Ripley, 1976) is commonly used for such a purpose in discussing the cumulative behavior of interpoint distances about an initial point. It is defined as the expected number of events falling within a given distance of the initial event, divided by the overall density (rate in 2 dimensions) of the process, sa…
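The cumulative statistic defined at the end of this abstract — expected count of events within a given distance of an initial event, divided by the overall rate — can be sketched for a 2-D point pattern (synthetic data, hypothetical helper name, no edge correction):

```python
import numpy as np

def k_statistic(points, r, area):
    """Naive estimate of the statistic described in the abstract:
    mean number of other events within distance r of an event,
    divided by the overall intensity. No edge correction."""
    n = len(points)
    d = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    within = (d < r).sum() - n          # drop the n self-pairs at distance 0
    intensity = n / area
    return within / n / intensity

rng = np.random.default_rng(7)
pts = rng.uniform(0, 10, (400, 2))      # homogeneous pattern in a 10 x 10 window
k = k_statistic(pts, r=1.0, area=100.0)
# For complete spatial randomness this is near pi * r**2, biased slightly
# downward here because points near the border have truncated neighbourhoods.
```

Edge-corrected and higher-dimensional versions are what the paper's smoothing machinery generalizes.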
Learning with the kernel signal to noise ratio
2012
This paper presents the application of the kernel signal-to-noise ratio (KSNR) in the context of feature extraction to general machine learning and signal processing domains. The proposed approach maximizes the signal variance while minimizing the estimated noise variance in a reproducing kernel Hilbert space (RKHS). The KSNR can be used in any kernel method to deal with correlated (possibly non-Gaussian) noise. We illustrate the method in nonlinear regression examples, dependence estimation and causal inference, nonlinear channel equalization, and nonlinear feature extraction from high-dimensional satellite images. Results show that the proposed KSNR yields better-fitted solutions and extract…