Search results for "Data mining"
Showing 10 of 907 documents
On Distinguishing between Reliable and Unreliable Sensors Without a Knowledge of the Ground Truth
2015
In many applications, data from different sensors are aggregated in order to obtain more reliable information about the process that the sensors are monitoring. However, the quality of the aggregated information is intricately dependent on the reliability of the individual sensors. In fact, unreliable sensors will tend to report erroneous values of the ground truth, and thus degrade the quality of the fused information. Finding strategies to identify unreliable sensors can help to counteract their respective detrimental influences on the fusion process, and this has been a focal concern in the literature. The purpose of this paper is to propose a solution to an extreme…
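The abstract is truncated, but the general idea it points at can be illustrated with a minimal sketch: with no ground truth available, score each sensor by how often it agrees with the majority vote of the other sensors. Everything below is hypothetical (synthetic binary sensors with made-up accuracies), and this agreement heuristic is a generic baseline, not necessarily the paper's actual method.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: 5 binary sensors report on 200 events whose ground
# truth is unknown to us. Sensors 0-3 are reliable (90% accurate), while
# sensor 4 is barely better than chance (55%).
truth = rng.integers(0, 2, size=200)
accuracy = [0.9, 0.9, 0.9, 0.9, 0.55]
reports = np.array([
    np.where(rng.random(200) < a, truth, 1 - truth) for a in accuracy
])

def reliability_scores(reports):
    """Score each sensor by agreement with the majority of the others.

    No ground truth is used: a sensor that often disagrees with the
    consensus of its peers is flagged as likely unreliable.
    """
    scores = []
    for i in range(len(reports)):
        others = np.delete(reports, i, axis=0)
        majority = (others.mean(axis=0) > 0.5).astype(int)
        scores.append((reports[i] == majority).mean())
    return scores

scores = reliability_scores(reports)
print(scores)  # the unreliable sensor gets a noticeably lower score
```

With enough events, the unreliable sensor's agreement score separates cleanly from the reliable ones, even though the ground truth was never consulted.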
Data sets for energy rating of photovoltaic modules
2013
A proposal for generating standard climatic data sets for use in the energy rating of photovoltaic (PV) modules is presented, which will give good comparability between different technologies. The current proposal of standard data sets consisting of “typical days” does not give realistic estimates of PV performance and is thus not sufficient as a rating standard. A data set is required that strikes a balance between being representative of any location and not containing too much data. A method to generate such a data set is presented, meeting all the requirements of an international standard while being sufficiently accurate to differentiate between different devices of different man…
Dataset shift adaptation with active queries
2011
In remote sensing image classification, it is commonly assumed that the distribution of the classes is stable over the entire image. This way, training pixels labeled by photointerpretation are assumed to be representative of the whole image. However, differences in distribution of the classes throughout the image make this assumption weak and a model built on a single area may be suboptimal when applied to the rest of the image. In this paper, we investigate the use of active learning to correct the shifts that may appear when training and test data do not come from the same distribution. Experiments are carried out on a VHR remote sensing classification scenario showing that active learni…
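The active-learning idea described above can be sketched in a few lines: query labels for the target-area pixels the current model is least certain about, add them to the training set, and retrain. This sketch uses scikit-learn and synthetic Gaussian "pixels" with an artificial distribution shift; the query strategy (plain uncertainty sampling) is a generic choice, not necessarily the one investigated in the paper.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)

# Hypothetical shift: training pixels and target-area pixels come from
# class-conditional Gaussians whose means differ between the two areas.
def sample(n, shift):
    y = rng.integers(0, 2, n)
    X = rng.normal(loc=y[:, None] * 2.0 + shift, scale=1.0, size=(n, 2))
    return X, y

X_train, y_train = sample(100, shift=0.0)
X_target, y_target = sample(500, shift=1.5)   # shifted target distribution

model = LogisticRegression().fit(X_train, y_train)

# Uncertainty sampling: repeatedly query the label of the target pixel the
# model is least sure about (probability closest to 0.5), then retrain.
labeled = np.zeros(len(X_target), dtype=bool)
for _ in range(10):
    proba = model.predict_proba(X_target)[:, 1]
    uncertainty = np.abs(proba - 0.5)
    uncertainty[labeled] = np.inf              # never re-query a pixel
    q = int(np.argmin(uncertainty))            # most uncertain pixel
    labeled[q] = True
    X_train = np.vstack([X_train, X_target[q]])
    y_train = np.append(y_train, y_target[q])  # label by photointerpretation
    model = LogisticRegression().fit(X_train, y_train)

print(model.score(X_target, y_target))
```

Each queried pixel sits near the current decision boundary in the shifted area, which is exactly where the original training set is least representative.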
Analysis of Ten Reverse Engineering Tools
2009
Reverse engineering tools can be used to satisfy the information needs of software maintainers. Especially when maintaining large-scale legacy systems, tool support is essential. Reverse engineering tools offer various kinds of capabilities for supplying the needed information to the tool user. In this paper we analyze the provided capabilities in terms of four aspects: provided data structures, visualization mechanisms, information request specification mechanisms, and navigation features. We provide a compact analysis of ten representative reverse engineering tools for supporting C, C++ or Java: Eclipse Java Development Tools, Wind River Workbench (for C and C++), Understand (for C…
Positioning Error Prediction and Training Data Evaluation in RF Fingerprinting Method
2019
Radio Frequency (RF) fingerprinting-based localization has become a research interest due to its minimal hardware requirements and satisfactory positioning accuracy. However, despite the significant attention this topic has gained, most of the research has focused on the calculation of position estimates. In this paper, we propose a simple and novel method that can be used as an indicator of fingerprinting positioning error. The method is based on cluster radius evaluation of multiple fingerprinting data during the test phase, which can be used by a Location Based Service (LBS) provider to predict the user position estimation accuracy. This method can be used effectively in real-time to predict t…
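The cluster-radius idea can be sketched as follows: look up the k fingerprints in the database that best match the measured signal vector, and measure how spread out their known positions are. A tight cluster suggests a trustworthy estimate; a wide one flags likely error. The database below is entirely hypothetical, and the exact radius definition in the paper may differ from this max-distance-to-centroid variant.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical fingerprint database: 200 reference points on a 50 m x 50 m
# floor, each with a position (x, y) and an RSS fingerprint (dBm) from 6 APs.
positions = rng.uniform(0, 50, size=(200, 2))
fingerprints = rng.normal(-60, 8, size=(200, 6))

def cluster_radius(query, fingerprints, positions, k=5):
    """Radius of the position cluster formed by the k nearest fingerprints.

    A large radius means the best-matching fingerprints are scattered over
    a wide area, suggesting a less trustworthy position estimate.
    """
    d = np.linalg.norm(fingerprints - query, axis=1)   # signal-space distance
    nearest = np.argsort(d)[:k]                        # k best matches
    centroid = positions[nearest].mean(axis=0)
    return np.linalg.norm(positions[nearest] - centroid, axis=1).max()

query = rng.normal(-60, 8, size=6)                     # a measured RSS vector
print(cluster_radius(query, fingerprints, positions))
```

An LBS provider could threshold this radius at test time to decide whether a position estimate is accurate enough to act on.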
A general strategy to determine the congruence between a hierarchical and a non-hierarchical classification
2007
This article is available from: http://www.biomedcentral.com/1471-2105/8/442
Quantifying unpredictability: A multiple-model approach based on satellite imagery data from Mediterranean ponds.
2017
Fluctuations in environmental parameters are increasingly being recognized as essential features of any habitat. The quantification of whether environmental fluctuations are prevalently predictable or unpredictable is remarkably relevant to understanding the evolutionary responses of organisms. However, when characterizing the relevant features of natural habitats, ecologists typically face two problems: (1) gathering long-term data and (2) handling the hard-won data. This paper takes advantage of the free access to long-term recordings of remote sensing data (27 years, Landsat TM/ETM+) to assess a set of environmental models for estimating environmental predictability. The case study inclu…
Comparative study of techniques for large-scale feature selection* *This work was supported by a SERC grant GR/E 97549. The first author was also supp…
1994
The combinatorial search problem arising in feature selection in high-dimensional spaces is considered. Recently developed techniques based on the classical sequential methods and the (l, r) search, called floating search algorithms, are compared against the genetic approach to feature subset search. Both approaches have been designed with a view to giving a good compromise between efficiency and effectiveness for large problems. The purpose of this paper is to investigate the applicability of these techniques to high-dimensional problems of feature selection. The aim is to establish whether the properties inferred for these techniques from medium scale experiments involving up to a few tens …
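The floating search family mentioned above can be sketched with sequential floating forward selection (SFFS): greedily add the feature that most improves a criterion, then "float" backwards, dropping features as long as dropping one improves the criterion. The criterion here (a least-squares R² with a small per-feature penalty, on synthetic data) is a toy stand-in for whatever classifier-based criterion the paper actually uses.

```python
import numpy as np

rng = np.random.default_rng(3)

# Toy data: 8 candidate features, of which only features 0 and 3 matter.
X = rng.normal(size=(200, 8))
y = X[:, 0] + 0.5 * X[:, 3] + 0.1 * rng.normal(size=200)

def score(subset):
    """R^2 of a least-squares fit on the subset, minus a small per-feature
    penalty (so that the backward 'floating' step can ever pay off)."""
    if not subset:
        return 0.0
    A = X[:, sorted(subset)]
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    r2 = 1.0 - (y - A @ coef).var() / y.var()
    return r2 - 0.02 * len(subset)

def sffs(n_features, target_size):
    """Sequential floating forward selection."""
    subset = set()
    while len(subset) < target_size:
        # forward step: add the single best remaining feature
        best = max(set(range(n_features)) - subset,
                   key=lambda f: score(subset | {f}))
        subset.add(best)
        # backward ("floating") step: drop features while it strictly helps
        while len(subset) > 1:
            worst = max(subset, key=lambda f: score(subset - {f}))
            if score(subset - {worst}) > score(subset):
                subset.remove(worst)
            else:
                break
    return subset

print(sorted(sffs(8, 2)))  # → [0, 3]
```

The backward step is what distinguishes floating search from plain sequential forward selection: it lets the algorithm undo an early greedy choice that a later addition has made redundant.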
Towards a mean body for apparel design
2016
This paper focuses on shape average with applications to the apparel industry. The apparel industry uses a consensus sizing system; its major concern is to fit most of the population into it. Since anthropometric measures do not grow linearly, it is important to find prototypes that accurately represent each size. This is done using random compact mean sets, obtained from a cloud of 3D points given by a scanner and applying a previous definition of mean set to the sample. Additionally, two approaches to defining confidence sets are introduced. The methodology is applied to data obtained from a real anthropometric survey. This paper has been partially supported by the following grants: TIN2009-14392…
A new methodology for Functional Principal Component Analysis from scarce data. Application to stroke rehabilitation.
2015
Functional Principal Component Analysis (FPCA) is an increasingly used methodology for analysis of biomedical data. This methodology aims to obtain Functional Principal Components (FPCs) from Functional Data (time dependent functions). However, in biomedical data, the most common scenario of this analysis is from discrete time values. Standard procedures for FPCA require obtaining the functional data from these discrete values before extracting the FPCs. The problem appears when there are missing values in a non-negligible sample of subjects, especially at the beginning or the end of the study, because this approach can compromise the analysis due to the need to extrapolate or dismiss subje…
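One common workaround for the missing-value problem described above is pooled estimation: compute the mean and covariance over the time grid from whatever observations are available at each grid point (and pair of points), then take the FPCs as the leading eigenvectors of that covariance estimate, so no subject has to be extrapolated or dismissed. The sketch below uses synthetic sparse curves; it illustrates this generic pooled approach, not necessarily the new methodology the paper proposes.

```python
import numpy as np

rng = np.random.default_rng(4)

# Hypothetical sparse functional data: 50 subjects on a common 20-point
# time grid, built from two known modes of variation, with ~30% of the
# values missing (NaN).
t = np.linspace(0, 1, 20)
subject_scores = rng.normal(size=(50, 2))
curves = (subject_scores[:, :1] * np.sin(2 * np.pi * t)
          + 0.5 * subject_scores[:, 1:] * np.cos(2 * np.pi * t)
          + 0.05 * rng.normal(size=(50, 20)))
curves[rng.random((50, 20)) < 0.3] = np.nan

# Pooled mean: uses all available observations at each grid point.
mean = np.nanmean(curves, axis=0)
centered = curves - mean

# Pairwise-complete covariance: entry (i, j) uses every subject observed
# at both grid points i and j.
cov = np.empty((20, 20))
for i in range(20):
    for j in range(20):
        both = ~np.isnan(centered[:, i]) & ~np.isnan(centered[:, j])
        cov[i, j] = np.mean(centered[both, i] * centered[both, j])

# FPCs are the leading eigenvectors of the covariance estimate.
eigvals, eigvecs = np.linalg.eigh(cov)
fpcs = eigvecs[:, ::-1][:, :2]               # two leading components
explained = eigvals[::-1][:2] / eigvals.sum()
print(explained)
```

Because each covariance entry is estimated pairwise, subjects with values missing at the start or end of the study still contribute to every entry where they were observed.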