Search results for "Feature selection"
showing 10 items of 139 documents
Feature Selection Methods to Extract Knowledge and Enhance Analysis of Ventricular Fibrillation Signals
2014
Optimized spatio-temporal descriptors for real-time fall detection: comparison of support vector machine and Adaboost-based classification
2013
We propose a supervised approach to detect falls in a home environment using an optimized descriptor adapted to real-time tasks. We introduce a realistic dataset of 222 videos, a new metric allowing evaluation of fall detection performance in a video stream, and an automatically optimized set of spatio-temporal descriptors which feed a supervised classifier. We build the initial spatio-temporal descriptor named STHF using several combinations of transformations of geometrical features (height and width of human body bounding box, the user’s trajectory with her/his orientation, projection histograms, and moments of orders 0, 1, and 2). We study the combinations of usual transformations of the…
DBSCAN Algorithm for Document Clustering
2019
Document clustering is the problem of automatically grouping similar documents into categories based on some similarity metric. Almost all available data, usually on the web, are unclassified, so we need powerful clustering algorithms that work with these types of data. All common search engines return a list of pages relevant to the user query. This list needs to be generated quickly and as accurately as possible. For this type of problem, because the web pages are unclassified, we need powerful clustering algorithms. In this paper we present a clustering algorithm called DBSCAN – Density-Based Spatial Clustering of Applications with Noise – and its limitations on documents (or web pages)…
Parameter Rating by Diffusion Gradient
2014
Anomaly detection is a central task in high-dimensional data analysis. It can be performed by using dimensionality reduction methods to obtain a low-dimensional representation of the data, which reveals the geometry and the patterns that exist and govern it. Usually, anomaly detection methods classify high-dimensional vectors that represent data points as either normal or abnormal. Revealing the parameters (i.e., features) that cause detected abnormal behaviors is critical in many applications. However, this problem is not addressed by recent anomaly-detection methods and, specifically, by nonparametric methods, which are based on feature-free analysis of the data. In this chapter, we provi…
Context-related data processing in artificial neural networks for higher reliability of telerehabilitation systems
2015
Classification is a data processing technique of great significance both for native eHealth systems and web telemedicine solutions. In this sense, artificial neural networks have been widely applied in telerehabilitation as powerful tools to process information and acquire new medical knowledge. But effective analysis of multidimensional heterogeneous medical data still poses considerable difficulties. It was shown that processing too many data features simultaneously is costly and has some adverse effects on the resulting models' classification properties. Therefore, there is a strong need to develop new techniques for selecting features from very large data sets that include many …
Local dimensionality reduction and supervised learning within natural clusters for biomedical data analysis
2006
Inductive learning systems were successfully applied in a number of medical domains. Nevertheless, the effective use of these systems often requires data preprocessing before applying a learning algorithm. This is especially important for multidimensional heterogeneous data presented by a large number of features of different types. Dimensionality reduction (DR) is one commonly applied approach. The goal of this paper is to study the impact of natural clustering--clustering according to expert domain knowledge--on DR for supervised learning (SL) in the area of antibiotic resistance. We compare several data-mining strategies that apply DR by means of feature extraction or feature selection w…
Feature selection for KNN classifier to improve accurate detection of subthalamic nucleus during deep brain stimulation surgery in Parkinson’s patien…
2017
The tremor and dystonia associated with Parkinson’s disease can be treated with deep brain stimulation (DBS) implanted into the subthalamic nucleus (STN). Accurate STN detection is a complex task for the neurosurgeon during DBS surgery, since proper fixing of the stimulating electrodes will affect the patient’s future life. The brain electrical signals obtained with Micro Electrodes Register (MER) are acquired at different depths of the brain during DBS surgery to detect the STN. In our previous work, we achieved good accuracy in localizing the STN using the K-Nearest Neighbours (KNN) supervised learning algorithm. However, for real-time classification, it is essential to reduce…
Diversity in Ensemble Feature Selection
2003
Ensembles of learnt models constitute one of the main current directions in machine learning and data mining. Ensembles allow us to achieve higher accuracy, which is often not achievable with single models. It was shown theoretically and experimentally that in order for an ensemble to be effective, it should consist of high-accuracy base classifiers that should have high diversity in their predictions. One technique, which proved to be effective for constructing an ensemble of accurate and diverse base classifiers, is to use different feature subsets, or so-called ensemble feature selection. Many ensemble feature selection strategies incorporate diversity as a component of the fitness funct…
Feature selection for distance-based regression: An umbrella review and a one-shot wrapper
2023
Feature selection (FS) may improve the performance, cost-efficiency, and understandability of supervised machine learning models. In this paper, FS for the recently introduced distance-based supervised machine learning model is considered for regression problems. The study is contextualized by first providing an umbrella review (review of reviews) of recent development in the research field. We then propose a saliency-based one-shot wrapper algorithm for FS, which is called MAS-FS. The algorithm is compared with a set of other popular FS algorithms, using a versatile set of simulated and benchmark datasets. Finally, experimental results underline the usefulness of FS for regression, confirm…
Variable Selection in Predictive MIDAS Models
2014
In short-term forecasting, it is essential to take into account all available information on the current state of the economic activity. Yet, the fact that various time series are sampled at different frequencies prevents an efficient use of available data. In this respect, the Mixed-Data Sampling (MIDAS) model has proved to outperform existing tools by combining data series of different frequencies. However, major issues remain regarding the choice of explanatory variables. The paper first addresses this point by developing MIDAS based dimension reduction techniques and by introducing two novel approaches based on either a method of penalized variable selection or Bayesian stochastic searc…