Search results for "Random forest"
showing 10 items of 121 documents
Comparing data mining and deterministic pedology to assess the frequency of WRB reference soil groups in the legend of small scale maps
2015
The assessment of class frequency in soil map legends is affected by uncertainty, especially at small scales where generalization is greater. The aim of this study was to test the hypothesis that data mining techniques provide better estimation of class frequency than traditional deterministic pedology in a national soil map. In the 1:5,000,000 map of Italian soil regions, the soil classes are the WRB reference soil groups (RSGs). Different data mining techniques, namely neural networks, random forests, boosted trees, classification and regression trees, and support vector machine (SVM), were tested and the last one gave the best RSG predictions using selected auxiliary variables a…
Breaking the curse of dimensionality in quadratic discriminant analysis models with a novel variant of a Bayes classifier enhances automated taxa ide…
2013
Macroinvertebrate samples are commonly used in biomonitoring to study changes in aquatic ecosystems. Traditionally, specimens are identified manually to taxa by human experts, which is time-consuming and cost-intensive. Using the image data of 35 taxa and 64 features, we propose a novel variant of the quadratic discriminant analysis for breaking the curse of dimensionality in quadratic discriminant analysis models. Our variant, called a random Bayes array (RBA), uses bagging and random feature selection similar to random forest. We explore several variations of RBA. We consider three classification (i.e., taxa identification) decisions: majority vote, averaged posterior probabilities, and a novel…
Model-Assisted Estimation Through Random Forests in Finite Population Sampling
2021
In surveys, the interest lies in estimating finite population parameters such as population totals and means. In most surveys, some auxiliary information is available at the estimation stage. This information may be incorporated in the estimation procedures to increase their precision. In this article, we use random forests (RFs) to estimate the functional relationship between the survey variable and the auxiliary variables. In recent years, RFs have become attractive as National Statistical Offices now have access to a variety of data sources, potentially exhibiting a large number of observations on a large number of variables. We establish the theoretical properties of model-assisted proc…
Adaptive sparse representation of continuous input for Tsetlin machines based on stochastic searching on the line
2021
This paper introduces a novel approach to representing continuous inputs in Tsetlin Machines (TMs). Instead of using one Tsetlin Automaton (TA) for every unique threshold found when Booleanizing continuous input, we employ two Stochastic Searching on the Line (SSL) automata to learn discriminative lower and upper bounds. The two resulting Boolean features are adapted to the rest of the clause by equipping each clause with its own team of SSLs, which update the bounds during the learning process. Two standard TAs finally decide whether to include the resulting features as part of the clause. In this way, only four automata altogether represent one continuous feature (instead of potentially h…
Deep learning approach for prediction of impact peak appearance at ground reaction force signal of running activity
2020
A protruding impact peak is one of the features of vertical ground reaction force (GRF) that is related to injury risk while running. The present research is dedicated to predicting GRF impact peak appearance by posing it as a binary classification problem. Kinematic data, namely a number of raw signals in the sagittal plane, collected by the Vicon motion capture system (Oxford Metrics Group, UK), were employed as predictors. Therefore, the input data for the predictive model are presented as a multi-channel time series. Deep learning techniques, namely five convolutional neural network (CNN) models, were applied to the binary classification analysis, based on a Multi-Layer Perceptron (MLP) classifi…
Spectrum Hole Detection for Cognitive Radio through Energy Detection using Random Forest
2020
The growth of wireless data is the major driving force for an exponential increase in wireless communication. Cognitive Radio is one of the emerging wireless technologies that can be used for smart utility networks. Optimum utilization of the wireless spectrum is the objective of Cognitive Radio. Finding a spectrum hole through intelligent means is essential for the success of Cognitive Radio. Dynamic spectrum allocation is also an efficient technique for spectrum allocation and leads to better spectrum utilization. In this paper, several machine learning techniques are used to find a frequency range for dynamic spectrum allocation. Different machine learning techniques such as Lo…
Supervised Analysis for Phenotype Identification: The Case of Heart Failure Ejection Fraction Class
2021
Artificial Intelligence is creating a paradigm shift in health care, with phenotyping patients through clustering techniques being one of the areas of interest. Objective: To develop a predictive model to classify heart failure (HF) patients according to their left ventricular ejection fraction (LVEF), by using available data from Electronic Health Records (EHR). Subjects and methods: 2854 subjects over 25 years old with a diagnosis of HF and LVEF, measured by echocardiography, were selected to develop an algorithm to predict patients with reduced EF using supervised analysis. The performance of the developed algorithm was tested in heart failure patients from Primary Care. To select the mo…
Potential of artificial intelligence methods in the processing and interpretation of air pollution data
2022
Automated data quality control is becoming increasingly popular in the processing of large data sets. Citizen science is also gaining popularity in environmental measurement, allowing the use of alternative and less complex measuring devices and methods (so-called non-reference devices). Data from these non-reference stations are often publicly available, but unless measurement quality control is performed, the quality of these data must be regarded as questionable, with a low level of confidence. In the field of air pollution, measurements with sensor devices are often taken at intervals of 1 to 10 minutes, which yields enormous data sets. A sufficiently dense network of citizen science non-reference devices provides substantial spatial environmental qua…
A Supervised Learning Framework for Automatic Prostate Segmentation in Trans Rectal Ultrasound Images
2012
Heterogeneous intensity distribution inside the prostate gland, significant variations in prostate shape, size, inter-dataset contrast variations, and imaging artifacts like shadow regions and speckle in Trans Rectal Ultrasound (TRUS) images challenge computer aided automatic or semi-automatic segmentation of the prostate. In this paper, we propose a supervised learning schema based on random forest for automatic initialization and propagation of statistical shape and appearance model. Parametric representation of the statistical model of shape and appearance is derived from principal component analysis (PCA) of the probability distribution inside the prostate and PC…
Region-based segmentation on depth images from a 3D reference surface for tree species recognition.
2013
The aim of the work presented in this paper is to develop a method for the automatic identification of tree species using Terrestrial Light Detection and Ranging (T-LiDAR) data. The approach that we propose analyses depth images built from 3D point clouds corresponding to a 30 cm segment of the tree trunk in order to extract characteristic shape features used for classifying the different tree species using the Random Forest classifier. We will present the method used to transform the 3D point cloud to a depth image and the region based segmentation method used to segment the depth images before shape features are computed on the segmented images. Our approach has be…