Search results for "mining"
showing 10 items of 1730 documents
Compression-based classification of biological sequences and structures via the Universal Similarity Metric: experimental assessment.
2007
Background: Similarity of sequences is a key mathematical notion for classification and phylogenetic studies in biology. It is currently handled primarily using alignments. However, alignment methods seem inadequate for post-genomic studies, since they do not scale well with data set size and they seem to be confined to genomic and proteomic sequences. Therefore, alignment-free similarity measures are actively pursued. Among those, USM (Universal Similarity Metric) has gained prominence. It is based on the deep theory of Kolmogorov complexity, and universality is its most novel and striking feature. Since it can only be approximated via data compression, USM is a methodology rath…
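As a rough illustration of how a compression-based similarity can stand in for Kolmogorov complexity, the sketch below computes the Normalized Compression Distance (NCD), one commonly used compression-based approximation of USM. The choice of zlib as the compressor, the function names, and the toy sequences are all assumptions made for this example; they are not taken from the paper.

```python
import zlib


def compressed_size(data: bytes) -> int:
    """Length in bytes of the zlib-compressed input (stand-in for Kolmogorov complexity)."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Normalized Compression Distance: (C(xy) - min(C(x), C(y))) / max(C(x), C(y))."""
    cx, cy = compressed_size(x), compressed_size(y)
    cxy = compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


def pseudo_random_dna(length: int, seed: int = 1) -> bytes:
    """Deterministic pseudo-random DNA-like string (hypothetical helper for the demo)."""
    state = seed
    out = bytearray()
    for _ in range(length):
        state = (state * 1103515245 + 12345) % 2**31
        out.append(b"ACGT"[(state >> 16) & 3])
    return bytes(out)


# Toy data: two near-identical repetitive sequences and one unrelated sequence.
seq_a = b"ACGTACGTTAGC" * 40
seq_b = b"ACGTACGTTAGC" * 39 + b"ACGTACGATAGC"
seq_c = pseudo_random_dna(480)

if __name__ == "__main__":
    print("NCD(a, b):", ncd(seq_a, seq_b))
    print("NCD(a, c):", ncd(seq_a, seq_c))
```

Smaller values indicate more similar inputs. With a real compressor the result is only an approximation, and it depends on the compressor and on input length.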
Machine Learning Techniques for Intrusion Detection: A Comparative Analysis
2016
With the growth of the Internet, the world has transformed into a global market, with all monetary and business activities being carried out online. Being the most important resource of the developing world, it is a vulnerable target and hence needs to be secured against malicious users. Since the Internet has no central surveillance component, attackers occasionally find a way to bypass a system's security using varied and evolving hacking techniques, and one such class of attacks is intrusion. An intrusion is the act of breaking into a system by compromising its security arrangements. The techniq…
Combining conjunctive rule extraction with diffusion maps for network intrusion detection
2013
Network security and intrusion detection are important in the modern world where communication happens via information networks. Traditional signature-based intrusion detection methods cannot find previously unknown attacks. On the other hand, algorithms used for anomaly detection often have black box qualities that are difficult to understand for people who are not algorithm experts. Rule extraction methods create interpretable rule sets that act as classifiers. They have mostly been combined with already labeled data sets. This paper aims to combine unsupervised anomaly detection with rule extraction techniques to create an online anomaly detection framework. Unsupervised anomaly detectio…
Vibrational spectroscopy provides a green tool for multi-component analysis
2010
Based on the literature published in the past decade, we focus on the possibilities offered by vibrational-spectroscopy-based techniques to make multi-component analysis of samples independently of their physical state. We discuss the main chemometric tools proposed for developing calibration models and solving problems derived from spectroscopic non-idealities (e.g., highly overlapped spectral bands or the presence of spectral non-linearity), and the benefits provided by vibrational-spectroscopy-based multi-component analysis in industry. Our main objective is to show that vibrational spectroscopy provides fast analytical methods that enable non-destructive analysis and permits, i…
Power estimation for non-standardized multisite studies
2016
A concern for researchers planning multisite studies is that scanner and T1-weighted sequence-related biases on regional volumes could overshadow true effects, especially for studies with a heterogeneous set of scanners and sequences. Current approaches attempt to harmonize data by standardizing hardware, pulse sequences, and protocols, or by calibrating across sites using phantom-based corrections to ensure the same raw image intensities. We propose to avoid harmonization and phantom-based correction entirely. We hypothesized that the bias of estimated regional volumes is scaled between sites due to the contrast and gradient distortion differences between scanners and sequences. Given this…
Computation of Psycho-Acoustic Annoyance Using Deep Neural Networks
2019
Psycho-acoustic parameters have been extensively used to evaluate the discomfort or pleasure produced by the sounds in our environment. In this context, wireless acoustic sensor networks (WASNs) can be an interesting solution for monitoring subjective annoyance in certain soundscapes, since they can be used to register the evolution of such parameters in time and space. Unfortunately, calculating the psycho-acoustic parameters involved in common annoyance models carries a significant computational cost, which makes it difficult to acquire and transmit these parameters at the nodes. As a result, monitoring psycho-acoustic annoyance becomes an expensive and inefficient task. Thi…
Classification of reference models: a methodology and its application
2003
Classification is an important tool for perception and can be found in numerous scientific disciplines. Several application areas of classification are described in the context of information modeling. The usefulness of classification for the reuse or selection of reference models is emphasized. A methodology to systematically create classification systems is introduced. Furthermore, a classification system for reference models is developed with the aid of the proposed methodology. This classification system gives a comprehensive but abstract survey of 26 reference models found in the literature.
Querying and reasoning over large scale building data sets
2016
The architectural design and construction domains work on a daily basis with massive amounts of data. Properly managing, exchanging and exploiting these data is an ongoing challenge in this domain. This has resulted in large semantic RDF graphs that are to be combined with a significant number of other data sets (building product catalogues, regulation data, geometric point cloud data, simulation data, sensor data), thus making an already huge dataset even larger. Making these big data available at high performance and in the correct (intuitive) formats is therefore a considerable challenge in this domain. Yet, hardly any benchmark is avai…