Search results for "Outliers"
showing 10 items of 14 documents
Influence diagnostics for generalized linear mixed models: a gradient-like statistic
2013
In the literature, many influence measures proposed for Generalized Linear Mixed Models (GLMMs) require the information matrix, which can be difficult to calculate. In the present paper, a known influence measure is approximated to obtain a simpler form for which the information matrix is no longer necessary. The proposed measure is shown to have a form similar to the recently introduced gradient statistic. It performed well in simulation studies.
The Performance of the Gradient-Like Influence Measure in Generalized Linear Mixed Models
2015
A gradient-like statistic, recently introduced as an influence measure, has been proven to work well in large samples, thanks to its asymptotic properties. In this work, through small-scale simulation schemes, the performance of this diagnostic measure is further investigated in terms of concordance with the main influence measures used for outlier identification. The simulation studies are performed using generalized linear mixed models (GLMMs).
Outlier detection to hierarchical and mixed effects models
2008
Hierarchical and mixed effects models are models where a varying number of coefficients may be random at different levels of the hierarchy. The purpose of outlier analysis for these models is to determine whether an outlying unit at a higher level is entirely outlying, or outlying due to the effect of one or a few aberrant lower level units. Most work on diagnostics for these complex models has focused on the mixed model rather than on the hierarchical model, obscuring some relevant aspects of the hierarchical model. In this paper we will present an approach to influence analysis and outlier detection for mixed and hierarchical models, focusing on the special structure of nested data that these…
RAD SNP markers as a tool for conservation of dolphinfish Coryphaena hippurus in the Mediterranean Sea: Identification of subtle genetic structure an…
2016
Dolphinfish is an important fish species for both commercial and sport fishing, but so far limited information is available on genetic variability and pattern of differentiation of dolphinfish populations in the Mediterranean basin. Recently developed techniques allow genome-wide identification of genetic markers for better understanding of population structure in species with limited genome information. Using restriction-site associated DNA analysis we successfully genotyped 140 individuals of dolphinfish from eight locations in the Mediterranean Sea at 3324 SNP loci. We identified 311 sex-related loci that were used to assess sex-ratio in dolphinfish populations. In addition, we identifie…
Reputation or peer review? The role of outliers
2018
We present an agent-based model of paper publication and consumption that allows us to study the effect of two different evaluation mechanisms, peer review and reputation, on the quality of the manuscripts accessed by a scientific community. The model was empirically calibrated on two data sets, mono- and multi-disciplinary. Our results point out that disciplinary settings differ in the rapidity with which they deal with extreme events, namely papers of extremely high quality, which we call outliers. In the mono-disciplinary case, reputation is better than traditional peer review at optimizing the quality of papers read by researchers. In the multi-disciplinary case, if the quality landscape is…
A gradient-based deletion diagnostic measure for generalized linear mixed models
2016
A gradient-statistic-based diagnostic measure is developed in the context of generalized linear mixed models. Its performance is assessed through real examples and simulation studies, in terms of its ability to detect influential data structures and its concordance with the most widely used influence measures.
Geometric quality and appearance of surfaces : local and global approaches
2012
Accounting for customers' perception of manufactured goods has become a major challenge for the industry. This process is to be established from early design to retail. Customers are nowadays more aware and detail oriented about perceived quality of products. This allows one to set not only an estimated price but also the expected quality of the product. Surface appearance analysis has therefore become a key industrial issue. Two approaches are proposed here to formalize the detection methodology and provide objective criteria for experts to evaluate surface anomalies. The first proposed approach is based on surface metrology. It consists in analyzing the measured topologies in order to bin…
Métodos estadísticos y computacionales en el estudio de tiempos de respuesta de reconocimiento de palabras
2021
This doctoral thesis is a research work on the statistical analysis of response times. The study aims to contribute some information on the treatment of outliers by incorporating the alternative transformation 1 t2 . To analyze the reliability of transformations of outlier data, the linear t-test model, which looks for a linear pattern in the pooled data set, is usually used; as a novelty in the present work, a generalized linear mixed model will be used, which is intended to improve on the original linear model by differentiating the data by their source (subject and item). In all cases the response times…
A new method to "clean up" ultra high-frequency data
2007
In applied econometrics, the availability of ultra high-frequency databases is having an important impact on research in market microstructure theory. The ultra high-frequency databases contain detailed reports of all the available information on financial market activity. However, ultra high-frequency databases cannot be used directly. On the one hand, recording mistakes can be present; on the other hand, missing information has to be inferred from the available data. In this paper, we propose a simple method to clean up ultra high-frequency data from possible errors, and we examine the method's efficacy when we analyze data by using an autoregressive conditional duration (A…
TERMITE: An R script for fast reduction of laser ablation inductively coupled plasma mass spectrometry data and its application to trace element measur…
2017
RATIONALE: High spatial resolution Laser Ablation Inductively Coupled Plasma Mass Spectrometry (LA-ICPMS) determination of trace element concentrations is of great interest for geological and environmental studies. Data reduction is a very important aspect of LA-ICPMS, and several commercial programs for handling LA-ICPMS trace element data are available. Each of these software packages has its specific advantages and disadvantages. METHODS: Here we present TERMITE, an R script for the reduction of LA-ICPMS data, which can reduce both spot and line scan measurements. Several parameters can be adjusted by the user, who does not necessarily need prior knowledge in R. Currently, ten reference m…