Search results for "Covar"
showing 10 items of 509 documents
Regularized Regression Incorporating Network Information: Simultaneous Estimation of Covariate Coefficients and Connection Signs
2014
We develop an algorithm that incorporates network information into regression settings. It simultaneously estimates the covariate coefficients and the signs of the network connections (i.e. whether the connections are of an activating or of a repressing type). For the coefficient estimation steps an additional penalty is set on top of the lasso penalty, similarly to Li and Li (2008). We develop a fast implementation for the new method based on coordinate descent. Furthermore, we show how the new methods can be applied to time-to-event data. The new method yields good results in simulation studies concerning sensitivity and specificity of non-zero covariate coefficients, estimation of networ…
Irrelevant Features, Class Separability, and Complexity of Classification Problems
2011
In this paper, class separability measures are analyzed in an attempt to relate their descriptive abilities to the geometrical properties of classification problems in the presence of irrelevant features. The study is performed on synthetic and benchmark data with known irrelevant features and other characteristics of interest, such as class boundaries, shapes, margins between classes, and density. The results show that some measures are individually informative, while others are less reliable and can only provide complementary information. Classification problem complexity measurements on selected data sets are made to gain additional insight into the obtained results.
Does Sedentary Behavior Predict Academic Performance in Adolescents or the Other Way Round? A Longitudinal Path Analysis.
2016
This study examined whether adolescents’ time spent on sedentary behaviors (academic, technology-based and social-based activities) was a better predictor of academic performance than the reverse. A cohort of 755 adolescents participated in a three-year study. Structural Equation Modeling techniques were used to test plausible causal hypotheses. Four competing models were analyzed to determine which model best fitted the data. The Best Model was tested separately by gender. The Best Model showed that academic performance was a better predictor of sedentary behaviors than the other way round. It also indicated that students who obtained excellent academic results were more likely t…
Tuning of Extended Kalman Filters for Sensorless Motion Control with Induction Motor
2019
This work deals with the tuning of an Extended Kalman Filter for sensorless control of induction motors for electric traction in automotive applications. Assuming that the parameters of the induction motor-load model are known, Genetic Algorithms are used to obtain the system noise covariance matrix, with the measurement noise covariance matrix taken to be the identity matrix. It is shown that only stator currents have to be acquired for reaching this objective, which is easy to accomplish using Hall-effect transducers. In fact, the Genetic Algorithm minimizes, with respect to the system covariance matrix, a suitable measure of the displacement between the stator currents experimentally acquir…
The impact of sample reduction on PCA-based feature extraction for supervised learning
2006
"The curse of dimensionality" is pertinent to many learning algorithms, and it denotes the drastic rise in computational complexity and classification error in high dimensions. In this paper, different feature extraction (FE) techniques are analyzed as means of dimensionality reduction and constructive induction with respect to the performance of the Naive Bayes classifier. When a data set contains a large number of instances, some sampling approach is applied to address the computational complexity of the FE and classification processes. The main goal of this paper is to show the impact of sample reduction on the process of FE for supervised learning. In our study we analyzed the conventional PC…
A Bayesian unified framework for risk estimation and cluster identification in small area health data analysis.
2020
Many statistical models have been proposed to analyse small area disease data with the aim of describing spatial variation in disease risk. In this paper, we propose a Bayesian hierarchical model that simultaneously allows for risk estimation and cluster identification. Our model formulation assumes that there is an unknown number of risk classes and small areas are assigned to a risk class by means of independent allocation variables. Therefore, areas within each cluster are assumed to share a common risk but they may be geographically separated. The posterior distribution of the parameter representing the number of risk classes is estimated using a novel procedure that combines its prior …
Statistical Methods for the Geographical Analysis of Rare Diseases
2010
In this chapter we provide a summary of different methods for the detection of disease clusters. First of all, we give a summary of methods for computing estimates of the relative risk. These estimates provide smoothed values of the relative risks that can account for its spatial variation. Some methods for assessing spatial autocorrelation and general clustering are also discussed to test for significant spatial variation of the risk. In order to find the actual location of the clusters, scan methods are introduced. The spatial scan statistic is discussed as well as its extension by means of Generalised Linear Models that allows for the inclusion of covariates and cluster effects. In this …
On utilizing dependence-based information to enhance micro-aggregation for secure statistical databases
2011
Published version of an article in the journal: Pattern Analysis and Applications. Also available from the publisher at: http://dx.doi.org/10.1007/s10044-011-0199-9 We consider the micro-aggregation problem, which involves partitioning a set of individual records in a micro-data file into a number of mutually exclusive and exhaustive groups. This problem, which seeks the best partition of the micro-data file, is known to be NP-hard, and has been tackled using many heuristic solutions. In this paper, we would like to demonstrate that in the process of developing micro-aggregation techniques (MATs), it is expedient to incorporate information about the dependence between the random variable…
Finding condensed descriptions for multi-dimensional data.
1976
We describe two programs that may be used to find condensed descriptions for data available in a contingency table or in a covariance matrix in the case that these data follow a multinomial or a multivariate normal distribution, respectively. The programs perform a stepwise model search among multiplicative models by computing appropriate likelihood-ratio test statistics.
Uncertainty analysis of gross primary production upscaling using Random Forests, remote sensing and eddy covariance data
2015
The accurate quantification of carbon fluxes at continental spatial scale is important for future policy decisions in the context of global climate change. However, many elements contribute to the uncertainty of such estimates. In this study, the uncertainties of eight-day gross primary production (GPP) predicted by Random Forest (RF) machine learning models were analysed at the site, ecosystem and European spatial scales. At the site level, the uncertainties caused by missing key drivers were evaluated. The most accurate predictions of eight-day GPP were obtained when all available drivers were used (Pearson's correlation coefficient, ρ ~ 0.84; Root Mean Square Error (RMSE…