Search results for "Data pre-processing"
showing 8 items of 18 documents
Rule-guided identification of cosmic-ray patterns in PLASTEX
1992
Some techniques devised in the computer science fields of pattern recognition and expert systems are being applied to the interpretation of EAS responses in the PLASTEX experiment. An attempt is made to codify in a set of rules the expertise of trained researchers who are able to recognize and classify different hit patterns even in the presence of noisy background, and in spite of imperfections in the detector response. The patterns expected to be useful include, but are not limited to, track patterns. The software described here, as a progress report, automatically finds patterns corresponding to isolated tracks, and patterns composed of tracks that connect with each other in a layer of d…
Modeling recurrent distributions in streams using possible worlds
2015
Discovering changes in the data distribution of streams and discovering recurrent data distributions are challenging problems in data mining and machine learning. Both have received a lot of attention in the context of classification. With the ever-increasing growth of data, however, there is a high demand for compact and universal representations of data streams that enable the user to analyze current as well as historic data without having access to the raw data. As a first step in this direction, we propose a condensed representation that captures the various (possibly recurrent) data distributions of the stream by extending the notion of possible worlds. The representation en…
RPPanalyzer Toolbox: An improved R package for analysis of reverse phase protein array data
2014
Analysis of large-scale proteomic data sets requires specialized software tools, tailored toward the requirements of individual approaches. Here we introduce an extension of an open-source software solution for analyzing reverse phase protein array (RPPA) data. The R package RPPanalyzer was designed for data preprocessing followed by basic statistical analyses and proteomic data visualization. In this update, we merged relevant data preprocessing steps into a single user-friendly function and included a new method for background noise correction as well as new methods for noise estimation and averaging of replicates to transform data in such a way that they can be used as input for a new t…
On the Optimization of Self-Organizing Maps by Genetic Algorithms
1999
Publisher Summary: This chapter reviews the research on the genetic optimization of self-organizing maps (SOMs). The optimization of learning rule parameters and of initial weights is able to improve network performance. The latter, however, requires chromosome sizes proportional to the size of the SOM and becomes unwieldy for large networks. The optimization of learning rule structures leads to self-organization processes of character similar to the standard learning rule. A particularly strong potential lies in the optimization of SOM topologies, which allows the study of global dynamical properties of SOMs and related models, as well as the development of tools for their analysis. Hierarchies of …
MuLiMs-MCoMPAs: A Novel Multiplatform Framework to Compute Tensor Algebra-Based Three-Dimensional Protein Descriptors
2019
This report introduces the MuLiMs-MCoMPAs software (acronym for Multi-Linear Maps based on N-Metric and Contact Matrices of 3D Protein and Amino-acid weightings), designed to compute tensor-based 3D protein structural descriptors by applying two- and three-linear algebraic forms. Moreover, these descriptors contemplate generalizing components such as novel 3D protein structural representations, (dis)similarity metrics, and multimetrics to extract geometry-related information between two and three amino acids, weighting schemes based on amino acid properties, matrix normalization procedures that consider simple-stochastic and mutual probability transformations, topological and geometrical…
WhoSNext: Recommending Twitter Users to Follow Using a Spreading Activation Network Based Approach
2020
The huge number of modern social network users has made the web a fertile ground for the growth and development of a plethora of recommender systems. To date, recommending a new user profile X to a given user U who could be interested in creating a relationship with X has been tackled using techniques based on content analysis, existing friendship relationships and other pieces of information coming from different social networks or websites. In this paper we propose a recommending architecture, called WhoSNext (WSN), tested on Twitter, whose aim is to promote the creation of new relationships among users. As recent research shows, this is an interesting recommendation problem: for a …
Comparative evaluation of data preprocessing software tools to increase efficiency and accuracy in diffusion kurtosis imaging
2016
Local dimensionality reduction within natural clusters for medical data analysis
2005
Inductive learning systems have been successfully applied in a number of medical domains. Nevertheless, the effective use of these systems requires data preprocessing before applying a learning algorithm. This is especially important for multidimensional heterogeneous data, represented by a large number of features of different types. Dimensionality reduction is one commonly applied approach. The goal of this paper is to study the impact of natural clustering on dimensionality reduction for classification. We compare several data mining strategies that apply dimensionality reduction by means of feature extraction or feature selection for subsequent classification. We show experimentally on micr…