Search results for "Data Quality"
Showing 10 of 96 documents
Executable Data Quality Models
2017
The paper discusses an external solution for data quality management in information systems. In contrast to traditional data quality assurance methods, the proposed approach uses a domain-specific language (DSL) for describing data quality models. Data quality models consist of graphical diagrams whose elements contain requirements for data objects' values and procedures for data object analysis. The DSL interpreter makes the data quality model executable, thereby enabling the measurement and improvement of data quality. The described approach can be applied: (1) to check the completeness, accuracy and consistency of accumulated data; (2) to support data migration in c…
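The paper's DSL is graphical, but the core idea — declarative quality rules attached to data objects and run by an interpreter — can be sketched in a few lines of Python. All names below are illustrative assumptions of mine, not the authors' notation:

```python
# Minimal sketch of an executable data quality model: declarative rules
# (completeness, accuracy, consistency) attached to records and executed
# by a small interpreter that reports a pass rate per rule.
from dataclasses import dataclass
from typing import Callable

@dataclass
class Rule:
    name: str                       # human-readable requirement
    check: Callable[[dict], bool]   # True if the record satisfies the rule

def run_model(rules: list[Rule], records: list[dict]) -> dict[str, float]:
    """Execute the quality model: return the fraction of passing records per rule."""
    return {
        r.name: sum(r.check(rec) for rec in records) / len(records)
        for r in rules
    }

rules = [
    Rule("completeness: name present", lambda rec: bool(rec.get("name"))),
    Rule("accuracy: age in [0, 120]", lambda rec: 0 <= rec.get("age", -1) <= 120),
    Rule("consistency: discharge >= admission",
         lambda rec: rec["discharge"] >= rec["admission"]),
]

records = [
    {"name": "A", "age": 34,  "admission": 1, "discharge": 5},
    {"name": "",  "age": 250, "admission": 4, "discharge": 2},
]
print(run_model(rules, records))  # each rule passes for 1 of 2 records -> 0.5
```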
Response Determination Criteria for ELISPOT: Toward a Standard that Can Be Applied Across Laboratories
2011
ELISPOT assay readout is often dichotomized into positive or negative responses according to prespecified criteria. However, these criteria can vary widely across institutions. The adoption of a common response criterion is a key step toward cross-laboratory comparability. This chapter describes the two main approaches to response determination, identifying the strengths and limitations of each. Nonparametric statistical tests and consideration of data quality are recommended, and instructions are provided for their ready implementation by nonstatisticians and statisticians alike.
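As one concrete illustration of the nonparametric route, a common pattern is to compare replicate antigen-stimulated wells against negative-control wells with a rank-based test. The sketch below uses SciPy's Mann-Whitney U test; the spot counts, the choice of test, and the alpha level are assumptions for illustration, not the chapter's prescribed criterion:

```python
# Hedged sketch: dichotomize an ELISPOT readout by comparing antigen wells
# to negative-control wells with a one-sided rank test.
from scipy.stats import mannwhitneyu

antigen_wells = [52, 61, 48, 55]   # spot counts in antigen-stimulated replicates
control_wells = [10, 14, 9, 12]    # spot counts in negative-control replicates

stat, p = mannwhitneyu(antigen_wells, control_wells, alternative="greater")
response = "positive" if p < 0.05 else "negative"
print(f"U={stat}, p={p:.3f} -> {response}")
```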
Publication Data Integration as a Tool for Excellence-Based Research Analysis at the University of Latvia
2017
The evaluation of research results can be carried out for different purposes aligned with the strategic goals of an institution, for example, to decide on the distribution of research funding or to recruit or promote employees involved in research. Whereas quantitative measures such as the number of scientific papers or the number of scientific staff are commonly used for such evaluation, the strategy of the institution can be set to achieve more ambitious scientific goals. Therefore, a question arises as to how more quality-oriented aspects of the research outcomes should be measured. To supply an appropriate dataset for the evaluation of both types of metrics, a suitable framework should be p…
Development and analysis of the Soil Water Infiltration Global database
2018
27 pages, 11 tables, 8 figures. © Author(s) 2018. This work is distributed under the Creative Commons Attribution 4.0 License.
The Reprocessed Proba-V Collection 2: Product Validation
2021
With the objective of improving data quality in terms of cloud detection, absolute radiometric calibration, and atmospheric correction, the PRoject for On-Board Autonomy-Vegetation (PROBA-V) data archive (October 2013 - June 2020) will be reprocessed to Collection 2 (C2). The product validation is organized in three phases and focuses on the intercomparison with PROBA-V Collection 1 (C1), but consistency analyses with SPOT-VGT, Sentinel-3 SYN-VGT, Terra-MODIS, and METOP-AVHRR are also foreseen. First preliminary results show improved performance of the cloud and snow/ice masking and indicate that the statistical consistency between PROBA-V C2 and C1 is in line with expectations. PROBA-V C2 data are …
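The consistency analysis mentioned here typically reduces to pixel-wise agreement statistics between matched scenes from the two collections. A minimal sketch with synthetic data, assuming bias, RMSD, and correlation as the metrics (the actual PROBA-V validation protocol is richer than this):

```python
# Hedged sketch of a pixel-wise consistency check between two reflectance
# products (e.g. C1 vs C2 of the same scenes). Arrays are synthetic.
import numpy as np

rng = np.random.default_rng(0)
c1 = rng.uniform(0.0, 0.6, size=100_000)           # Collection 1 reflectances
c2 = c1 + rng.normal(0.0, 0.01, size=c1.shape)     # Collection 2, small differences

bias = np.mean(c2 - c1)                            # systematic offset
rmsd = np.sqrt(np.mean((c2 - c1) ** 2))            # overall disagreement
r = np.corrcoef(c1, c2)[0, 1]                      # linear consistency
print(f"bias={bias:.4f}, RMSD={rmsd:.4f}, r={r:.4f}")
```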
Improving data quality in construction engineering projects : an action design research approach
2014
Author's version of an article in the Journal of Management in Engineering; also available from the publisher at: http://dx.doi.org/10.1061/(ASCE)ME.1943-5479.0000202 The topic of data and information quality (DQ/IQ) is a longstanding issue of interest in both academia and practice in the construction engineering field. Poor DQ/IQ has led to poor engineering drawings that, in turn, have led to delays and, eventually, cost overruns. This paper reports a study that took an action design research (ADR) approach to develop and evaluate a DQ/IQ assessment tool, called the information quality system (IQS), in a large global engineering and construction company. The eval…
Controlling false match rates in record linkage using extreme value theory
2011
Cleansing data of synonym and homonym errors is a relevant task in fields where high data quality is crucial, for example in disease registries and medical research networks. Record linkage provides methods for minimizing synonym and homonym errors, thereby improving data quality. We focus on the case of homonym errors (in the following denoted as ‘false matches’), in which records belonging to different entities are wrongly classified as equal. Synonym errors (‘false non-matches’) occur when a single entity maps to multiple records in the linkage result. They are not considered in this study because, in our application domain, they are not as crucial as false matches. Fa…
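The extreme-value idea in the title can be sketched as a peaks-over-threshold analysis: model the upper tail of comparison weights for non-matching record pairs with a generalized Pareto distribution, then choose a classification threshold whose estimated false-match rate stays below a target. The data and tuning below are illustrative only, not the paper's procedure:

```python
# Hedged sketch: GPD tail fit on non-match comparison weights, then
# threshold selection for a target false-match rate.
import numpy as np
from scipy.stats import genpareto

rng = np.random.default_rng(1)
nonmatch_weights = rng.normal(0.0, 1.0, size=50_000)  # stand-in weight scores

u = np.quantile(nonmatch_weights, 0.95)               # tail threshold
excesses = nonmatch_weights[nonmatch_weights > u] - u
shape, loc, scale = genpareto.fit(excesses, floc=0.0)

target_fmr = 1e-4                                     # desired false-match rate
p_exceed_u = np.mean(nonmatch_weights > u)            # empirical P(W > u)
# Solve P(W > t) = p_exceed_u * (1 - GPD_cdf(t - u)) = target_fmr for t:
t = u + genpareto.ppf(1.0 - target_fmr / p_exceed_u, shape, loc=0.0, scale=scale)
print(f"classify pairs with weight > {t:.2f} as matches")
```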
Applying a Data Quality Model to Experiments in Software Engineering
2014
Data collection and analysis produce key artifacts in any software engineering experiment. However, these data might contain errors. We propose a Data Quality model specific to data obtained from software engineering experiments, which provides a framework for analyzing and improving these data. We apply the model to two controlled experiments, resulting in the discovery of data quality problems that need to be addressed. We conclude that data quality issues have to be considered before the experimental results are obtained.
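A small sketch of the kind of checks such a model can operationalize on experiment data, assuming a per-subject measurement table (the column names and ranges are made up for illustration):

```python
# Hedged sketch: completeness and range-accuracy checks on experimental data.
import pandas as pd

df = pd.DataFrame({
    "subject": ["s1", "s2", "s3"],
    "loc":     [120, None, 430],    # lines of code written in the task
    "minutes": [55, 48, -3],        # task duration; -3 is clearly an error
})

report = {
    "completeness: loc present":    df["loc"].notna().mean(),
    "accuracy: minutes in [1, 240]": df["minutes"].between(1, 240).mean(),
}
print(report)  # pass rates below 1.0 flag columns needing inspection
```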
Integration of foreign trade and maritime transport statistics in Spain
2010
This article aims to contribute to improving maritime trade data by analysing the possibility of integrating two different databases, the Spanish foreign trade and maritime transport datasets, into a single enlarged databank. The methodology adopted consisted of studying the primary sources providing the data compiled in the foreign trade and maritime traffic databases, analysing the electronic processes used by informants and data collectors, and examining the linkages between the different electronic messages involved. Once the links between the trade and sea transport documents and electronic messages were found, a solution for integrating both databases was envisaged. The main outcome is a new …
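Once a common key between the two record streams has been identified, the integration step is essentially a join that preserves coverage gaps. A minimal sketch, where the `declaration_id` key and all fields are hypothetical, not the actual Spanish message formats:

```python
# Hedged sketch: join trade and transport records on a shared document key.
import pandas as pd

trade = pd.DataFrame({
    "declaration_id": ["D1", "D2", "D3"],
    "commodity":      ["steel", "citrus", "machinery"],
    "value_eur":      [150_000, 42_000, 310_000],
})
transport = pd.DataFrame({
    "declaration_id": ["D1", "D2", "D4"],
    "port":           ["Valencia", "Algeciras", "Bilbao"],
    "gross_tonnes":   [820.0, 310.5, 95.0],
})

# Outer join keeps unmatched records from both sources, so coverage gaps
# between the two statistical systems stay visible in the enlarged databank.
merged = trade.merge(transport, on="declaration_id", how="outer", indicator=True)
print(merged)
```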
EHRtemporalVariability
2020
Functions to delineate temporal dataset shifts in Electronic Health Records through the projection and visualization of dissimilarities among temporal batches of data. This is done by estimating the statistical distributions of the data over time and projecting them in non-parametric statistical manifolds, uncovering the patterns of the data's latent temporal variability. EHRtemporalVariability is particularly suitable for multi-modal data and categorical variables with a high number of values, common features of biomedical data for which traditional statistical process control or time-series methods may not be appropriate. EHRtemporalVariability allows you to explore and identify dataset shifts t…
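EHRtemporalVariability itself is an R package; as a language-agnostic sketch of the underlying idea — estimate a distribution per temporal batch, compute pairwise dissimilarities, and project the batches into a low-dimensional space — the Python below uses Jensen-Shannon distances and classical MDS. These choices are my assumptions for illustration, not the package's API:

```python
# Conceptual sketch: per-month categorical distributions, pairwise
# Jensen-Shannon distances, and a 2-D embedding that makes dataset
# shifts visible as batches drifting apart.
import numpy as np
from scipy.spatial.distance import jensenshannon
from sklearn.manifold import MDS

categories = ["A", "B", "C"]
# Toy monthly batches of a categorical variable; the last month drifts.
batches = {
    "2020-01": ["A"] * 50 + ["B"] * 40 + ["C"] * 10,
    "2020-02": ["A"] * 48 + ["B"] * 42 + ["C"] * 10,
    "2020-03": ["A"] * 15 + ["B"] * 35 + ["C"] * 50,
}

def distribution(values):
    counts = np.array([values.count(c) for c in categories], dtype=float)
    return counts / counts.sum()

dists = [distribution(v) for v in batches.values()]
n = len(dists)
D = np.array([[jensenshannon(dists[i], dists[j]) for j in range(n)]
              for i in range(n)])

coords = MDS(n_components=2, dissimilarity="precomputed",
             random_state=0).fit_transform(D)
for month, (x, y) in zip(batches, coords):
    print(f"{month}: ({x:+.3f}, {y:+.3f})")  # the drifted month lands far away
```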