Search results for "outlier"
showing 10 items of 73 documents
A method for detecting malfunctions in PV solar panels based on electricity production monitoring
2017
In this paper a new method is developed for automatically detecting outliers or faults in the solar energy production of identical sets (sister arrays) of photovoltaic (PV) solar panels. The method involves a two-stage unsupervised approach. In the first stage, "in control" energy production data are created by using outlier detection methods and functional principal component analysis in order to remove global and local outliers from the data set. In the second stage, control charts for the "in control" data are constructed using both a parametric method and three non-parametric methods. The control charts can be used to detect outliers or faults in the production data in real-time or at t…
Influence diagnostics for generalized linear mixed models: a gradient-like statistic
2013
In the literature, many influence measures proposed for Generalized Linear Mixed Models (GLMMs) require the information matrix, which can be difficult to calculate. In the present paper, a known influence measure is approximated to obtain a simpler form for which the information matrix is no longer necessary. The proposed measure is shown to have a form similar to the recently introduced gradient statistic. Good performance was obtained in simulation studies.
RAD SNP markers as a tool for conservation of dolphinfish Coryphaena hippurus in the Mediterranean Sea: Identification of subtle genetic structure an…
2016
Dolphinfish is an important fish species for both commercial and sport fishing, but so far limited information is available on genetic variability and pattern of differentiation of dolphinfish populations in the Mediterranean basin. Recently developed techniques allow genome-wide identification of genetic markers for better understanding of population structure in species with limited genome information. Using restriction-site associated DNA analysis we successfully genotyped 140 individuals of dolphinfish from eight locations in the Mediterranean Sea at 3324 SNP loci. We identified 311 sex-related loci that were used to assess sex-ratio in dolphinfish populations. In addition, we identifie…
Environmental effects on molecular and phenotypic variation in populations of Eruca sativa across a steep climatic gradient
2013
In Israel, Eruca sativa has a geographically narrow distribution across a steep climatic gradient that ranges from mesic Mediterranean to hot desert environments. These conditions offer an opportunity to study the influence of the environment on intraspecific genetic variation. For this, we combined an analysis of neutral genetic markers with a phenotypic evaluation in common-garden experiments, and environmental characterization of populations that included climatic and edaphic parameters, as well as geographic distribution. A Bayesian clustering of individuals from nine representative populations based on amplified fragment length polymorphism (AFLP) divided the populations into a…
Robust Graph Topology Learning and Application in Stock Market Inference
2019
In many applications, there are multiple interacting entities, generating time series of data over the space. To describe the relation within the set of data, the underlying topology may be used. In many real applications, the signal/data of interest is not only measured in noise but also contaminated with outliers. The proposed method, called RGTL, infers the graph topology from noisy measurements and removes these outliers simultaneously. Here, it is assumed that we have no information about the space graph topology, while we know that graph signals are sampled consecutively in time and thus the graph in the time domain is given. The simulation results show that the proposed algorithm h…
Looking for representative fit models for apparel sizing
2014
This paper is concerned with the generation of optimal fit models for use in apparel design. Representative fit models or prototypes are important for defining a meaningful sizing system. However, there is no agreement among apparel manufacturers, and each one has its own prototypes and size charts, i.e., there is a lack of standard sizes in garments from different apparel manufacturers. We propose two algorithms based on a new hierarchical partitioning around medoids clustering method originally developed for gene expression data. We are concerned with a different application; therefore, the dissimilarity between the objects has to be different and must be designed to deal with anthropometr…
Improving point matching on multimodal images using distance and orientation automatic filtering
2016
Speeded Up Robust Features (SURF) is one of the most popular and efficient methods used for the image registration task. In order to achieve a correct registration, a good matching of feature points is required. However, in the case of multimodal images, the high and non-linear intensity changes between different modalities lead to many outliers (mismatches of detected points) and consequently to a failure of the registration. Therefore, in this paper we introduce an efficient method devoted to the detection and removal of such outliers. It is based on an automatic filtering of outliers on both distance and orientation between feature points. We tested our proposed method on a set of…
Comparison of Epithor clinical national database and medico-administrative database to identify the influence of case-mix on the estimation of hospit…
2019
Background: The national Epithor database was initiated in 2003 in France. Fifteen years on, a quality assessment of the recorded data seemed necessary. This study examines the completeness of the data recorded in Epithor through a comparison with the French PMSI database, which is the national medico-administrative reference database. The aim of this study was to demonstrate the influence of data quality with respect to identifying 30-day mortality hospital outliers. Methods: We used each hospital's individual FINESS code to compare the number of pulmonary resections and deaths recorded in Epithor to the figures found in the PMSI. Centers were classified into either the good-quality data (GQD) …
Alignment of Noisy and Uniformly Scaled Time Series
2009
The alignment of noisy and uniformly scaled time series is an important but difficult task. Given two time series, one of which is a uniformly stretched subsequence of the other, we want to determine the stretching factor and the offset of the second time series within the first one. We adapted and enhanced different methods to address this problem: classical FFT-based approaches to determine the offset combined with a naive search for the stretching factor or its direct computation in the frequency domain, bounded dynamic time warping and a new approach called shotgun analysis, which is inspired by sequencing and reassembling of genomes in bioinformatics. We thoroughly examined the strengt…
Diagnostics for meta-analysis based on generalized linear mixed models
2012
Meta-analysis is a method for combining data from multiple studies, with the aim of providing an overall event-risk measure of interest that summarizes the information from those studies. In meta-analysis, generalized linear mixed models (GLMMs) are often used for a number of measures of interest, since they allow the true effect size to differ from study to study while accepting binary, discrete, and continuous response variables. In the present paper, some strategies for influence diagnostics based on the log-likelihood are suggested and discussed. These are considered for Individual Patient Data, Aggregate Data, and their compounding.