Search results for "Outlier"
showing 10 items of 73 documents
On the distribution of education and democracy
2006
This paper empirically analyzes the influence of the distribution of education on democracy by controlling for unobservable heterogeneity and by taking into account the persistence of some of the variables. The most novel finding is that an increase in the education attained by the majority of the population is what matters for the implementation and sustainability of democracy, rather than the average years of schooling. We show this result is robust to issues pertaining to omitted variables, outliers, sample selection, or a narrow definition of the variables used to measure democracy.
Robust refinement of initial prototypes for partitioning-based clustering algorithms
2007
Non-uniqueness of solutions and sensitivity to erroneous data are common problems in large-scale data clustering tasks. To avoid poor-quality solutions with partitioning-based clustering methods, robust estimates (that are highly insensitive to erroneous data values) are needed and initial cluster prototypes should be determined properly. In this paper, a robust density estimation initialization method that uses the spatial median estimate for the prototype update is presented. Besides being insensitive to noise and outliers, the new method is also computationally comparable with other traditional methods. The methods are compared by numerical experiments on a set of syntheti…
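The spatial median named in this abstract is the point that minimizes the sum of Euclidean distances to the data. The sketch below computes it with the Weiszfeld iteration, which is a standard choice but an assumption here: the abstract does not say how the paper computes the estimate. The function name `spatial_median` and its parameters are hypothetical, and skipping a data point that coincides with the current estimate is a simplification.

```python
import math

def spatial_median(points, tol=1e-9, max_iter=1000):
    """Approximate the spatial median with Weiszfeld's iteration (illustrative sketch)."""
    dim = len(points[0])
    # Start from the coordinate-wise mean, which outliers can pull far away.
    y = [sum(p[k] for p in points) / len(points) for k in range(dim)]
    for _ in range(max_iter):
        num = [0.0] * dim
        den = 0.0
        for p in points:
            d = math.dist(p, y)
            if d < 1e-12:
                continue  # simplification: skip a point that coincides with the estimate
            w = 1.0 / d  # distant points receive small weights
            den += w
            for k in range(dim):
                num[k] += w * p[k]
        if den == 0.0:
            break
        y_new = [v / den for v in num]
        if math.dist(y, y_new) < tol:
            return y_new
        y = y_new
    return y

# Four corners of the unit square plus one gross outlier.
pts = [(0, 0), (1, 0), (0, 1), (1, 1), (100, 100)]
mean = [sum(p[k] for p in pts) / len(pts) for k in range(2)]
median = spatial_median(pts)
print(mean, median)
```

On this toy data the mean is dragged to (20.4, 20.4), while the spatial median stays inside the unit square, which illustrates the insensitivity to outliers that the abstract refers to.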
Iteratively reweighted least squares in crystal structure refinements
2011
The use of robust techniques in crystal structure multipole refinements of small molecules as an alternative to the commonly adopted weighted least squares is presented and discussed. As is well known, the main disadvantage of least-squares fitting is its sensitivity to outliers. The elimination from the data set of the most aberrant reflections (due to both experimental errors and incompleteness of the model) is an effective practice that could yield satisfactory results, but it is often complicated in the presence of a great number of bad data points, whose one-by-one elimination could become unattainable. This problem can be circumvented by means of a robust least-squares regression that…
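To illustrate the general idea behind iteratively reweighted least squares, here is a minimal sketch on a toy straight-line fit. It is not the refinement procedure of the paper: the Huber weight function, the tuning constant `k = 1.345`, the MAD-based scale estimate, and the function names are all assumptions chosen for illustration.

```python
import statistics

def weighted_line_fit(xs, ys, ws):
    """Closed-form weighted least squares for y = a + b * x."""
    sw = sum(ws)
    mx = sum(w * x for w, x in zip(ws, xs)) / sw
    my = sum(w * y for w, y in zip(ws, ys)) / sw
    sxx = sum(w * (x - mx) ** 2 for w, x in zip(ws, xs))
    sxy = sum(w * (x - mx) * (y - my) for w, x, y in zip(ws, xs, ys))
    b = sxy / sxx
    return my - b * mx, b

def irls_line(xs, ys, k=1.345, n_iter=50):
    """IRLS with Huber weights (assumed weighting scheme, for illustration only)."""
    ws = [1.0] * len(xs)  # first pass is ordinary least squares
    for _ in range(n_iter):
        a, b = weighted_line_fit(xs, ys, ws)
        res = [y - (a + b * x) for x, y in zip(xs, ys)]
        # Robust residual scale from the median absolute residual.
        scale = statistics.median([abs(r) for r in res]) / 0.6745
        if scale == 0.0:
            break  # at least half of the points are fitted exactly
        # Huber weights: 1 for small residuals, shrinking for large ones.
        ws = [1.0 if abs(r) <= k * scale else k * scale / abs(r) for r in res]
    return a, b

xs = list(range(10))
ys = [2.0 + 3.0 * x for x in xs]
ys[9] = 100.0  # one aberrant observation
print(weighted_line_fit(xs, ys, [1.0] * 10))  # ordinary least squares
print(irls_line(xs, ys))                      # robust fit
```

In this example the single aberrant point inflates the ordinary least-squares slope well above 3, while the reweighted fit recovers the line through the remaining points without removing any observation by hand.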
Adaptive distributed outlier detection for WSNs
2014
The paradigm of pervasive computing is gaining more and more attention nowadays, thanks to the possibility of obtaining precise and continuous monitoring. Ease of deployment and adaptivity are typically implemented by adopting autonomous and cooperative sensory devices; however, for such systems to be of any practical use, reliability and fault tolerance must be guaranteed, for instance by detecting corrupted readings amidst the huge amount of gathered sensory data. This paper proposes an adaptive distributed Bayesian approach for detecting outliers in data collected by a wireless sensor network; our algorithm aims at optimizing classification accuracy, time complexity and communication com…
StalAge – An algorithm designed for construction of speleothem age models
2011
Here we present a new algorithm (StalAge), which is designed to construct speleothem age models. The algorithm uses U-series ages and their corresponding age uncertainty for modelling and also includes stratigraphic information in order to further constrain and improve the age model. StalAge is applicable to problematic datasets that include outliers, age inversions, hiatuses and large changes in growth rate. Manual selection of potentially inaccurate ages prior to application is not required. StalAge can be applied by the general, non-expert user and has no adjustable free parameters. This offers the highest degree of reproducibility and comparability of speleothem records from …
The Yearly Land Cover Dynamics (YLCD) method: An analysis of global vegetation from NDVI and LST parameters
2009
NDVI (Normalized Difference Vegetation Index) has been widely used to monitor vegetation changes since the early eighties. On the other hand, little use has been made of land surface temperatures (LST), due to their sensitivity to the orbital drift which affects the NOAA (National Oceanic and Atmospheric Administration) platforms flying AVHRR sensor. This study presents a new method for monitoring vegetation by using NDVI and LST data, based on an orbital drift corrected dataset derived from data provided by the GIMMS (Global Inventory Modeling and Mapping Studies) group. This method, named Yearly Land Cover Dynamics (YLCD), characterizes NDVI and LST behavior on a yearly basis, through the…
A spatially filtered mixture of β-convergence regressions for EU regions, 1980–2002
2007
Assessing regional growth and convergence across Europe is a matter of primary relevance. Empirical models that do not account for structural heterogeneities and spatial effects may face serious misspecification problems. In this work, a mixture regression approach is applied to the beta-convergence model, in order to produce an endogenous selection of regional growth patterns. A priori choices, such as North-South or centre-periphery divisions, are avoided. In addition to this, we deal with the spatial dependence existing in the data, applying a local filter to the data. The results indicate that spatial effects matter, and either absolute, conditional, or club convergence, if extended to …
A gradient-based deletion diagnostic measure for generalized linear mixed models
2016
A gradient-statistic-based diagnostic measure is developed in the context of generalized linear mixed models. Its performance is assessed by real examples and simulation studies, in terms of its ability to detect influential data structures and its concordance with the most widely used influence measures.
Design-based estimation for geometric quantiles with application to outlier detection
2010
Geometric quantiles are investigated using data collected from a complex survey. Geometric quantiles are an extension of univariate quantiles in a multivariate set-up that uses the geometry of multivariate data clouds. A very important application of geometric quantiles is the detection of outliers in multivariate data by means of quantile contours. A design-based estimator of geometric quantiles is constructed and used to compute quantile contours in order to detect outliers in both multivariate data and survey sampling set-ups. An algorithm for computing geometric quantile estimates is also developed. Under broad assumptions, the asymptotic variance of the quantile estimator is derived an…
Bayesian measures of surprise for outlier detection
2003
From a Bayesian point of view, testing whether an observation is an outlier is usually reduced to a testing problem concerning a parameter of a contaminating distribution. This requires elicitation of both (i) the contaminating distribution that generates the outlier and (ii) prior distributions on its parameters. However, very little information is typically available about how the possible outlier could have been generated. Thus easy, preliminary checks in which these assessments can often be avoided may prove useful. Several such measures of surprise are derived for outlier detection in normal models. Results are applied to several examples. Default Bayes factors, where the contaminating…