Search results for "Mining"
showing 10 items of 1730 documents
Gene set to diseases
2016
Disease enrichment analysis on gene sets.
Spatio-temporal Schema Integration with Validation: A Practical Approach
2005
We propose to enhance a schema integration process with a validation phase employing logic-based data models. In our methodology, we validate the source schemas against the data model; the inter-schema mappings are validated against the semantics of the data model and the syntax of the correspondence language. In this paper, we focus on how to employ a reasoning engine to validate spatio-temporal schemas and describe where the reasoning engine is plugged into our integration methodology. The validation phase distinguishes our integration methodology from other approaches. We shift the emphasis on automation from the a priori discovery to the a posteriori checking of the inter-schema mapping…
Context-related data processing in artificial neural networks for higher reliability of telerehabilitation systems
2015
Classification is a data processing technique of great significance for both native eHealth systems and web telemedicine solutions. In this sense, artificial neural networks have been widely applied in telerehabilitation as powerful tools to process information and acquire new medical knowledge. However, effective analysis of multidimensional heterogeneous medical data still poses considerable difficulties. It was shown that processing too many data features simultaneously is costly and has some adverse effects on the resulting models' classification properties. Therefore, there is a strong need to develop new techniques for selecting features from very large data sets that include many …
A hybrid multi-criteria approach to GPR image mining applied to water supply system maintenance
2018
Data processing techniques for Ground Penetrating Radar (GPR) image mining provide essential information to optimize maintenance management of Water Supply Systems (WSSs). These techniques aim to elaborate on radargrams in order to produce meaningful graphical representations of critical buried components of WSSs. These representations are helpful non-destructive evaluation tools to prevent possible failures in WSSs by keeping them adequately monitored. This paper proposes an integrated multi-criteria decision making (MCDM) approach to prioritize various data processing techniques by means of ranking their outputs, namely their resulting GPR image representations. The Fuzzy Analytic Hi…
Geo‐referencing naturalistic driving data using a novel method based on vehicle speed
2013
Naturalistic driving is an experimentation model that allows us to recognise driving modes by observing the behaviour of a set of drivers at the wheel in natural conditions over long periods of observation. This research methodology aims at increasing the representativeness of the data collected, in contrast to data stemming from highly controlled laboratory experiments. However, naturalistic driving research designs produce large volumes of data that are difficult to handle. Thus, it is very important to work with suitable methods for representing and interpreting data, allowing us to observe the variability of the results. The aim of this study is to implement a new methodolog…
MDA: a MATLAB-based program for morphospace-disparity analysis
2003
A MATLAB® program that examines patterns of state-space occupation is described. Four subroutines are available with which to visualize morphospace patterns: (i) in terms of their features such as dispersion, aggregation and location, thereby allowing users to extract complementary quantitative information about how the state-space is structured, and (ii) in terms of changes in those patterns that can be compared with other biotic (e.g., extinction, origination rates) or abiotic (e.g., environmental proxy) information. The program incorporates many of the latest and most widely used statistical parameters for describing multivariate spaces. The parameters are estimated on the basis of boots…
A New Approach to Investigate Students’ Behavior by Using Cluster Analysis as an Unsupervised Methodology in the Field of Education
2016
The problem of taking a set of data and separating it into subgroups where the elements of each subgroup are more similar to each other than they are to elements not in the subgroup has been extensively studied through the statistical method of cluster analysis. In this paper we want to discuss the application of this method to the field of education: particularly, we want to present the use of cluster analysis to separate students into groups that can be recognized and characterized by common traits in their answers to a questionnaire, without any prior knowledge of what form those groups would take (unsupervised classification). We start from a detailed study of the data processing need…
OpenHVSR - Processing toolkit: Enhanced HVSR processing of distributed microtremor measurements and spatial variation of their informative content
2018
The investigation of seismic ambient noise (microtremor) in spectral ratio form, known as the Horizontal-to-Vertical Spectral Ratio technique, is extremely popular nowadays both to investigate large areas in a reduced amount of time, and to leverage a wider choice of low cost equipment. In general, measurements at multiple locations are collected to generate multiple, individual spectral ratio curves. Recently, however, there has been an increasing interest in spatially correlating informative content from different locations. Accordingly, we introduce a new computer program, “OpenHVSR – Processing Toolkit”, developed in Matlab (R2015b), specifically engineered to enhance data proc…
Environmental Data Processing by Clustering Methods for Energy Forecast and Planning
2011
This paper presents a statistical approach based on the k-means clustering technique to manage environmental sampled data in order to evaluate and forecast the energy deliverable by different renewable sources at a given site. In particular, wind speed and solar irradiance sampled data are studied in association with the energy capability of a wind generator and a photovoltaic (PV) plant, respectively. The proposed method allows the subsets of useful data, describing the energy capability of a site, to be extracted from a set of experimental observations belonging to the considered site. The data collection is performed in Sicily, in the south of Italy, as a case study. As far as the wind generation…
Hierarchically nested factor model from multivariate data
2005
We show how to achieve a statistical description of the hierarchical structure of a multivariate data set. Specifically, we show that the similarity matrix resulting from a hierarchical clustering procedure is the correlation matrix of a factor model, the hierarchically nested factor model. In this model, factors are mutually independent and hierarchically organized. Finally, we use a bootstrap-based procedure to reduce the number of factors in the model, with the aim of retaining only those factors that are significantly robust with respect to the statistical uncertainty due to the finite length of data records.
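To make the last abstract's "similarity matrix resulting from a hierarchical clustering procedure" concrete, here is a minimal sketch in plain Python. It assumes average-linkage clustering applied directly to a correlation matrix; the abstract does not say which linkage the authors use, and the function name `filtered_similarity` and the 4x4 example matrix are hypothetical. Each off-diagonal entry of the result is the similarity level at which the two variables first end up in the same cluster. The factor-model interpretation and the bootstrap step are not covered here.

```python
from itertools import combinations


def filtered_similarity(corr):
    """Average-linkage clustering on a correlation matrix (assumed linkage).

    Returns a matrix whose (i, j) entry is the average correlation between
    the two clusters whose merge first placed i and j in the same cluster.
    """
    n = len(corr)
    clusters = [[i] for i in range(n)]
    out = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

    def avg_sim(a, b):
        # Mean correlation over all cross-cluster pairs.
        return sum(corr[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > 1:
        # Pick the pair of clusters with the highest average correlation.
        a, b = max(
            combinations(range(len(clusters)), 2),
            key=lambda p: avg_sim(clusters[p[0]], clusters[p[1]]),
        )
        s = avg_sim(clusters[a], clusters[b])
        # Record the merge level for every pair joined by this merge.
        for i in clusters[a]:
            for j in clusters[b]:
                out[i][j] = out[j][i] = s
        merged = clusters[a] + clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
    return out


# Hypothetical correlation matrix with two visible groups: {0, 1} and {2, 3}.
corr = [
    [1.0, 0.8, 0.2, 0.1],
    [0.8, 1.0, 0.3, 0.2],
    [0.2, 0.3, 1.0, 0.6],
    [0.1, 0.2, 0.6, 1.0],
]

for row in filtered_similarity(corr):
    print([round(x, 3) for x in row])
```

In this example the within-group entries keep their original values (0.8 and 0.6), while all four cross-group entries are replaced by their common average, so the output has the nested block structure that the abstract describes.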