Search results for "Random forest"
showing 10 items of 121 documents
Risk Assessment of Hip Fracture Based on Machine Learning
2020
Identifying patients with high risk of hip fracture is a great challenge in osteoporosis clinical assessment. Bone Mineral Density (BMD) measured by Dual-Energy X-Ray Absorptiometry (DXA) is the current gold standard in osteoporosis clinical assessment. However, its classification accuracy is only around 65%. In order to improve this accuracy, this paper proposes the use of Machine Learning (ML) models trained with data from a biomechanical model that simulates a sideways fall. ML models are able to learn and to make predictions from data. During a training process, ML models learn a function that maps inputs to outputs without previous knowledge of the probl…
Mass Spectrometry Imaging Differentiates Chromophobe Renal Cell Carcinoma and Renal Oncocytoma with High Accuracy
2020
Background: While subtyping of the majority of malignant chromophobe renal cell carcinoma (cRCC) and benign renal oncocytoma (rO) is possible on morphology alone, additional histochemical, immunohistochemical or molecular investigations are required in a subset of cases. As currently used histochemical and immunohistological stains as well as genetic aberrations show considerable overlap in both tumors, additional techniques are required for differential diagnostics. Mass spectrometry imaging (MSI) combining the detection of multiple peptides with information about their localization in tissue may be a suitable technology to overcome this diagnostic challenge. Patients and Methods: Formalin…
Preselection statistics and Random Forest classification identify population informative single nucleotide polymorphisms in cosmopolitan and autochth…
2018
Commercial single nucleotide polymorphism (SNP) arrays have been recently developed for several species and can be used to identify informative markers to differentiate breeds or populations for several downstream applications. To identify the most discriminating genetic markers among thousands of genotyped SNPs, a few statistical approaches have been proposed. In this work, we compared several methods of SNP preselection (Delta, Fst and principal component analyses (PCA)) in addition to Random Forest classifications to analyse SNP data from six dairy cattle breeds, including cosmopolitan (Holstein, Brown and Simmental) and autochthonous Italian breeds raised in two different regions and …
2020
Human movements are characterized by highly non-linear and multi-dimensional interactions within the motor system. Recently, an increasing emphasis on machine-learning applications has led to a significant contribution to the field of gait analysis, e.g., in increasing the classification performance. In order to ensure the generalizability of the machine-learning models, different data preprocessing steps are usually carried out to process the measured raw data before the classifications. In the past, various methods have been used for each of these preprocessing steps. However, there are hardly any standard procedures or systematic comparisons of these different methods and their im…
A Methodological Framework to Discover Pharmacogenomic Interactions Based on Random Forests
2021
The identification of genomic alterations in tumor tissues, including somatic mutations, deletions, and gene amplifications, produces large amounts of data, which can be correlated with a diversity of therapeutic responses. We aimed to provide a methodological framework to discover pharmacogenomic interactions based on Random Forests. We matched two databases from the Cancer Cell Line Encyclopaedia (CCLE) project, and the Genomics of Drug Sensitivity in Cancer (GDSC) project. For a total of 648 shared cell lines, we considered 48,270 gene alterations from CCLE as input features and the area under the dose-response curve (AUC) for 265 drugs from GDSC as the outcomes. A three-step reduction t…
Penalized classification for optimal statistical selection of markers from high-throughput genotyping: application in sheep breeds
2018
The identification of individuals’ breed of origin has several practical applications in livestock and is useful in different biological contexts such as conservation genetics, breeding and authentication of animal products. In this paper, penalized multinomial regression was applied to identify the minimum number of single nucleotide polymorphisms (SNPs) from high-throughput genotyping data for individual assignment to dairy sheep breeds reared in Sicily. The combined use of penalized multinomial regression and stability selection reduced the number of SNPs required to 48. A final validation step on an independent population was carried out obtaining 100% correctly classified individuals. …
Cell state prediction through distributed estimation of transmit power
2019
Determining the state of each cell, for instance, cell outages, in a densely deployed cellular network is a difficult problem. Several prior studies have used minimization of drive test (MDT) reports to detect cell outages. In this paper, we propose a two-step process. First, using the MDT reports, we estimate the serving base station’s transmit power for each user. Second, we learn summary statistics of estimated transmit power for various network states and use these to classify the network state on test data. Our approach is able to achieve an accuracy of 96% on an NS-3 simulation dataset. Decision tree, random forest and SVM classifiers were able to achieve a classification accuracy of…
Creación de un modelo estadístico predictivo para la determinación de las funciones de atenuación en español hablado
2018
Recently, several authors have defined different variables to characterize linguistic attenuation within the framework of a multidimensional database (Briz/Albelda; Albelda et al.). In this study, 982 attenuation elements were selected from eighteen interviews of spoken Spanish; all of them were reviewed by up to four interviewers from the Es.VaG.Atenuación project. Finally, the data were evaluated using three statistical tests for classification and variable reduction: multiple correspondence analysis, the classification tree, and Random Forest. The most decisive variables in discriminating the attenuation functions have been …
2021
Nitrogen (N) is one of the key nutrients supplied in agricultural production worldwide. Over-fertilization can have negative effects at the field and regional levels (e.g., on agro-ecosystems). Remote sensing of the plant N of field crops presents a valuable tool for the monitoring of N flows in agro-ecosystems. Available data for validation of satellite-based remote sensing of N are scarce. Therefore, in this study, field spectrometer measurements were used to simulate data of the Sentinel-2 (S2) satellites developed for vegetation monitoring by the ESA. The prediction performance of normalized ratio indices (NRIs), random forest regression (RFR) and Gaussian processes regression (GPR) f…
A finite element-based machine learning approach for modeling the mechanical behavior of the breast tissues under compression in real-time
2017
This work presents a data-driven method to simulate, in real-time, the biomechanical behavior of the breast tissues in some image-guided interventions such as biopsies or radiotherapy dose delivery as well as to speed up multimodal registration algorithms. Ten real breasts were used for this work. Their deformation due to the displacement of two compression plates was simulated off-line using the finite element (FE) method. Three machine learning models were trained with the data from those simulations. Then, they were used to predict in real-time the deformation of the breast tissues during the compression. The models were a decision tree and two tree-based ensemble methods (extremely…