Search results for "A* algorithm"
showing 10 items of 2538 documents
The on-line curvilinear component analysis (onCCA) for real-time data reduction
2015
Real-time pattern recognition applications often deal with high-dimensional data, which require a data reduction step that is performed only offline. However, this loses the possibility of adaptation to a changing environment. The same holds for applications other than pattern recognition, such as data visualization for input inspection. Only linear projections, such as principal component analysis, can work in real time by using iterative algorithms, whereas no known nonlinear technique can be implemented in this way: they always work on the whole database at each epoch. Among these nonlinear tools, the Curvilinear Component Analysis (CCA), which is a non-convex techni…
Computation Cluster Validation in the Big Data Era
2017
Data-driven class discovery, i.e., the inference of cluster structure in a dataset, is a fundamental task in Data Analysis, in particular for the Life Sciences. We provide a tutorial on the most common approaches used for that task, focusing on methodologies for the prediction of the number of clusters in a dataset. Although the methods that we present are general in terms of the data for which they can be used, we offer a case study relevant for Microarray Data Analysis.
GenClust: A genetic algorithm for clustering gene expression data
2005
Background: Clustering is a key step in the analysis of gene expression data, and in fact, many classical clustering algorithms are used, or more innovative ones have been designed and validated for the task. Despite the widespread use of artificial intelligence techniques in bioinformatics and, more generally, data analysis, there are very few clustering algorithms based on the genetic paradigm, yet that paradigm has great potential in finding good heuristic solutions to a difficult optimization problem such as clustering. Results: GenClust is a new genetic algorithm for clustering gene expression data. It has two key features: (a) a novel coding of the search space that is simple, …
Data Analysis and Bioinformatics
2007
Data analysis methods and techniques are revisited in the case of biological data sets. Particular emphasis is given to clustering and mining issues. Clustering is still a subject of active research in several fields such as statistics, pattern recognition, and machine learning. Data mining adds to clustering the complications of very large data sets with many attributes of different types, which is a typical situation in biology. Some case studies are also described.
Structural clustering of millions of molecular graphs
2014
We propose an algorithm for clustering very large molecular graph databases according to scaffolds (i.e., large structural overlaps) that are common between cluster members. Our approach first partitions the original dataset into several smaller datasets using a greedy clustering approach named APreClus based on dynamic seed clustering. APreClus is an online and instance incremental clustering algorithm delaying the final cluster assignment of an instance until one of the so-called pending clusters the instance belongs to has reached significant size and is converted to a fixed cluster. Once a cluster is fixed, APreClus recalculates the cluster centers, which are used as representatives for…
The Three Steps of Clustering In The Post-Genomic Era
2013
This chapter describes the basic algorithmic components that are involved in clustering, with particular attention to classification of microarray data.
Incrementally Assessing Cluster Tendencies with a Maximum Variance Cluster Algorithm
2003
A straightforward and efficient way to discover clustering tendencies in data is presented, based on a recently proposed Maximum Variance Clustering algorithm. The approach shares the benefits of the plain clustering algorithm with regard to other approaches for clustering. Experiments using both synthetic and real data have been performed in order to evaluate the differences between the proposed methodology and the plain use of the Maximum Variance algorithm. According to the results obtained, the proposal constitutes an efficient and accurate alternative.
Genome-wide detection of signatures of selection in three Valdostana cattle populations
2020
The Valdostana is a local dual purpose cattle breed developed in Italy. Three populations are recognized within this breed, based on coat colour, production level, morphology and temperament: Valdostana Red Pied (VPR), Valdostana Black Pied (VPN) and Valdostana Chestnut (VCA). Here, we investigated putative genomic regions under selection among these three populations using the Bovine 50K SNP array by combining three different statistical methods based either on allele frequencies (F-ST) or extended haplotype homozygosity (iHS and Rsb). In total, 8, 5 and 8 chromosomes harbouring 13, 13 and 16 genomic regions potentially under selection were identified by at least tw…
Decision Making in Evolving Artificial Systems
2001
The theme of this workshop is artificial perception. In this chapter we will argue that the ecological function of perception is to serve decision-making. If this is so the mechanisms chosen to implement perception, in natural or artificial systems, will be constrained by the requirements of decision-making and theories of decision-making will inevitably influence theories of perception. In what follows we will look at decision-making from what we hope is a new perspective, applying concepts and techniques developed by what we will call “new artificial intelligence”. We will begin, in the second part of the chapter, with a review of traditional, “normative” theories of decision-making and o…
1993
Genetics and developmental genetics have given us such a wealth of new insight that, at the end of this century, the synthetic theory can no longer be maintained in the strict “orthodox” sense in which it was started.