Search results for "Data mining"
showing 10 items of 907 documents
Power estimation for non-standardized multisite studies
2016
A concern for researchers planning multisite studies is that scanner and T1-weighted sequence-related biases on regional volumes could overshadow true effects, especially for studies with a heterogeneous set of scanners and sequences. Current approaches attempt to harmonize data by standardizing hardware, pulse sequences, and protocols, or by calibrating across sites using phantom-based corrections to ensure the same raw image intensities. We propose to avoid harmonization and phantom-based correction entirely. We hypothesized that the bias of estimated regional volumes is scaled between sites due to the contrast and gradient distortion differences between scanners and sequences. Given this…
Computation of Psycho-Acoustic Annoyance Using Deep Neural Networks
2019
Psycho-acoustic parameters have been extensively used to evaluate the discomfort or pleasure produced by the sounds in our environment. In this context, wireless acoustic sensor networks (WASNs) can be an interesting solution for monitoring subjective annoyance in certain soundscapes, since they can be used to register the evolution of such parameters in time and space. Unfortunately, calculating the psycho-acoustic parameters involved in common annoyance models carries a significant computational cost, which makes it difficult to acquire and transmit these parameters at the nodes. As a result, monitoring psycho-acoustic annoyance becomes an expensive and inefficient task. Thi…
Classification of reference models: a methodology and its application
2003
Classification is an important tool for perception and can be found in numerous scientific disciplines. Several application areas of classification are described in the context of information modeling. The usefulness of classification for the reuse or selection of reference models is emphasized. A methodology for systematically creating classification systems is introduced. Furthermore, a classification system for reference models is developed with the aid of the proposed methodology. This classification system gives a comprehensive but abstract survey of 26 reference models found in the literature.
Querying and reasoning over large scale building data sets
2016
The architectural design and construction domains work on a daily basis with massive amounts of data. Properly managing, exchanging and exploiting these data is an ongoing challenge in this domain. This has resulted in large semantic RDF graphs that are to be combined with a significant number of other data sets (building product catalogues, regulation data, geometric point cloud data, simulation data, sensor data), thus making an already huge dataset even larger. Making these big data available at high speed and in the correct (intuitive) formats is therefore a considerable challenge in this domain. Yet, hardly any benchmark is avai…
Basic Sampling Techniques
2004
Data mining and information retrieval
2007
Executable Data Quality Models
2017
The paper discusses an external solution for data quality management in information systems. In contrast to traditional data quality assurance methods, the proposed approach uses a domain-specific language (DSL) to describe data quality models. Data quality models consist of graphical diagrams whose elements contain requirements for data object values and procedures for data object analysis. The DSL interpreter makes the data quality model executable, thereby supporting the measurement and improvement of data quality. The described approach can be applied: (1) to check the completeness, accuracy and consistency of accumulated data; (2) to support data migration in c…
Editing prototypes in the finite sample size case using alternative neighborhoods
1998
The recently introduced concept of Nearest Centroid Neighborhood is applied to discard outliers and prototypes in class overlapping regions in order to improve the performance of the Nearest Neighbor rule through an editing procedure. This approach is related to graph-based editing algorithms, which also define alternative neighborhoods in terms of geometric relations. Classical editing algorithms are compared to these alternative editing schemes using several synthetic and real data problems. The empirical results show that the proposed editing algorithm constitutes a good trade-off between performance and computational burden.
Entropy-Based Classifier Enhancement to Handle Imbalanced Class Problem
2017
The paper presents a possible enhancement of entropy-based classifiers to handle problems caused by class imbalance in the original dataset. The proposed method was tested on synthetic data in order to analyse its robustness in a controlled environment with different class proportions. The proposed method was also tested on real medical data with imbalanced classes and compared with the results of the original classification algorithm. The medical field was chosen for testing because uneven class ratios occur frequently there.
MetNet: A two-level approach to reconstructing and comparing metabolic networks
2021
Metabolic pathway comparison and interaction between different species can reveal important information for drug engineering and medical science. In the literature, proposals for reconstructing and comparing metabolic networks present two main problems: network reconstruction usually requires human intervention to integrate information from different sources and, in metabolic comparison, the size of the networks leads to a challenging computational problem. We propose to automatically reconstruct a metabolic network on the basis of KEGG database information. Our proposal relies on a two-level representation of the huge metabolic network: the first level is graph-based and depicts pathways a…