Search results for "computer.software_genre"

showing 10 items of 3858 documents

Adding semantics to cloud computing to enhance service discovery and access

2012

Cloud computing is a technological paradigm that makes it possible to offer computing services over the Internet. This new service model is closely related to previous, well-known distributed computing initiatives such as Web services and grid computing. In the current socio-economic climate, the affordability of cloud computing has made it popular among today's innovations. Under these circumstances, more and more cloud services are becoming available. Consequently, it is becoming increasingly difficult for service consumers to find and access the cloud services that fulfill their requirements. In this work, a cloud computing ontology is proposed that facilitates a semantic identification, discove…

Cloud computing security; business.industry; Computer science; Services computing; Cloud computing; computer.software_genre; World Wide Web; Semantic grid; Grid computing; Utility computing; Semantic computing; Cloud testing; business; computer; Proceedings of the 6th Euro American Conference on Telematics and Information Systems

researchProduct

SMART: Unique splitting-while-merging framework for gene clustering

2014

© 2014 Fa et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. Successful clustering algorithms are highly dependent on parameter settings. The clustering performance degrades significantly unless parameters are properly set, and yet, it is difficult to set these parameters a priori. To address this issue, in this paper, we propose a unique splitting-while-merging clustering framework, named "splitting merging awareness tactics" (SMART), which does not require any a priori knowledge of either the number …

Clustering algorithms; Microarrays; lcsh:Medicine; Gene Expression; Bioinformatics; computer.software_genre; Cell Signaling; Data Mining; Cluster Analysis; lcsh:Science; Finite mixture model; Oligonucleotide Array Sequence Analysis; Physics; Multidisciplinary; SMART framework; Constrained clustering; Competitive learning model; Bioassays and Physiological Analysis; Multigene Family; Canopy clustering algorithm; Engineering and Technology; Data mining; Information Technology; Genomic Signal Processing; Algorithms; Research Article; Signal Transduction; Computer and Information Sciences; Fuzzy clustering; Correlation clustering; Research and Analysis Methods; Clustering; Molecular Genetics; CURE data clustering algorithm; Genetics; Gene Regulation; Cluster analysis; ta113; Gene Expression Profiling; lcsh:R; Biology and Life Sciences; Computational Biology; Cell Biology; Determining the number of clusters in a data set; ComputingMethodologies_PATTERNRECOGNITION; Splitting-merging awareness tactics (SMART); Signal Processing; Affinity propagation; lcsh:Q; Gene expression; Clustering framework; computer

researchProduct

Computation Cluster Validation in the Big Data Era

2017

Data-driven class discovery, i.e., the inference of cluster structure in a dataset, is a fundamental task in Data Analysis, in particular for the Life Sciences. We provide a tutorial on the most common approaches used for that task, focusing on methodologies for the prediction of the number of clusters in a dataset. Although the methods that we present are general in terms of the data for which they can be used, we offer a case study relevant for Microarray Data Analysis.

Clustering high-dimensional data; Class (computer programming); Clustering validation measure; Settore INF/01 - Informatica; Computer science; business.industry; Big data; Inference; Microarrays data analysis; computer.software_genre; Gap statistic; Task (project management); ComputingMethodologies_PATTERNRECOGNITION; CURE data clustering algorithm; Consensus clustering; Hypothesis testing in statistic; Clustering Class Discovery in Data Algorithmsb Clustering algorithm; Figure of merit; Consensus clustering; Data mining; Cluster analysis; business; computer

researchProduct

A local complexity based combination method for decision forests trained with high-dimensional data

2012

Accurate machine learning with high-dimensional data is affected by phenomena known as the “curse” of dimensionality. One of the main strategies explored in the last decade to deal with this problem is the use of multi-classifier systems. Several such approaches are inspired by the Random Subspace Method for the construction of decision forests. Furthermore, other studies rely on estimates of the individual classifiers' competence to enhance the combination in the multi-classifier and improve accuracy. We propose a competence estimate, based on local complexity measurements, to perform a weighted average combination of the decision forest. Experimental results show how thi…

Clustering high-dimensional data; Computational complexity theory; business.industry; Computer science; Decision tree; Machine learning; computer.software_genre; Random forest; Random subspace method; Artificial intelligence; Data mining; business; Competence (human resources); computer; Classifier (UML); Curse of dimensionality; 2012 12th International Conference on Intelligent Systems Design and Applications (ISDA)

researchProduct

GenClust: A genetic algorithm for clustering gene expression data

2005

Background: Clustering is a key step in the analysis of gene expression data, and in fact, many classical clustering algorithms are used, or more innovative ones have been designed and validated for the task. Despite the widespread use of artificial intelligence techniques in bioinformatics and, more generally, data analysis, there are very few clustering algorithms based on the genetic paradigm, yet that paradigm has great potential in finding good heuristic solutions to a difficult optimization problem such as clustering. Results: GenClust is a new genetic algorithm for clustering gene expression data. It has two key features: (a) a novel coding of the search space that is simple, …

Clustering high-dimensional data; DNA Complementary; Computer science; Rand index; Correlation clustering; Oligonucleotides; Evolutionary algorithm; lcsh:Computer applications to medicine. Medical informatics; computer.software_genre; Biochemistry; Pattern Recognition Automated; Biclustering; Open Reading Frames; Structural Biology; CURE data clustering algorithm; Consensus clustering; Genetic algorithm; Cluster Analysis; Cluster analysis; lcsh:QH301-705.5; Molecular Biology; Gene expression data Clustering Evolutionary algorithms; Oligonucleotide Array Sequence Analysis; Models Statistical; Brown clustering; Heuristic; Gene Expression Profiling; Applied Mathematics; Computational Biology; Computer Science Applications; lcsh:Biology (General); Gene Expression Regulation; Mutation; lcsh:R858-859.7; Data mining; Sequence Alignment; computer; Software; Algorithms; BMC Bioinformatics

researchProduct

Data Analysis and Bioinformatics

2007

Data analysis methods and techniques are revisited in the case of biological data sets. Particular emphasis is given to clustering and mining issues. Clustering is still a subject of active research in several fields such as statistics, pattern recognition, and machine learning. Data mining adds to clustering the complications of very large data sets with many attributes of different types, which is a typical situation in biology. Some case studies are also described.

Clustering high-dimensional data; Fuzzy clustering; Computer science; business.industry; Correlation clustering; Conceptual clustering; Machine learning; computer.software_genre; ComputingMethodologies_PATTERNRECOGNITION; CURE data clustering algorithm; Consensus clustering; Canopy clustering algorithm; Data mining; Artificial intelligence; Cluster analysis; business; computer

researchProduct

Distance Functions, Clustering Algorithms and Microarray Data Analysis

2010

Distance functions are a fundamental ingredient of classification and clustering procedures, and this also holds true in the particular case of microarray data. In the general data mining and classification literature, functions such as Euclidean distance or Pearson correlation have gained their status as de facto standards thanks to a considerable amount of experimental validation. For microarray data, the issue of which distance function works best has been investigated, but no final conclusion has been reached. The aim of this extended abstract is to shed further light on that issue. Indeed, we present an experimental study, involving several distances, assessing (a) their intrinsic sepa…

Clustering high-dimensional data; Fuzzy clustering; Settore INF/01 - Informatica; business.industry; Correlation clustering; Machine learning; computer.software_genre; Pearson product-moment correlation coefficient; Ranking (information retrieval); Euclidean distance; symbols.namesake; Clustering distance measures; symbols; Artificial intelligence; Data mining; business; Cluster analysis; computer; Mathematics; De facto standard

researchProduct

Structural clustering of millions of molecular graphs

2014

We propose an algorithm for clustering very large molecular graph databases according to scaffolds (i.e., large structural overlaps) that are common between cluster members. Our approach first partitions the original dataset into several smaller datasets using a greedy clustering approach named APreClus based on dynamic seed clustering. APreClus is an online and instance incremental clustering algorithm delaying the final cluster assignment of an instance until one of the so-called pending clusters the instance belongs to has reached significant size and is converted to a fixed cluster. Once a cluster is fixed, APreClus recalculates the cluster centers, which are used as representatives for…

Clustering high-dimensional data; Fuzzy clustering; Theoretical computer science; k-medoids; Computer science; Single-linkage clustering; Correlation clustering; Constrained clustering; computer.software_genre; Complete-linkage clustering; Graph; Hierarchical clustering; ComputingMethodologies_PATTERNRECOGNITION; Data stream clustering; CURE data clustering algorithm; Canopy clustering algorithm; FLAME clustering; Affinity propagation; Data mining; Cluster analysis; computer; k-medians clustering; Clustering coefficient; Proceedings of the 29th Annual ACM Symposium on Applied Computing

researchProduct

Making nonlinear manifold learning models interpretable: The manifold grand tour

2015

Smooth nonlinear topographic maps of the data distribution to guide a Grand Tour visualisation. Prioritisation of data linear views that are most consistent with data structure in the maps. Useful visualisations that cannot be obtained by other more classical approaches. Dimensionality reduction is required to produce visualisations of high dimensional data. In this framework, one of the most straightforward approaches to visualising high dimensional data is based on reducing complexity and applying linear projections while tumbling the projection axes in a defined sequence which generates a Grand Tour of the data. We propose using smooth nonlinear topographic maps of the data distribution to…

Clustering high-dimensional data; QA75; Nonlinear dimensionality reduction; Discriminative clustering; Computer science; Visualització de la informació; computer.software_genre; Data visualization; Projection (mathematics); Information visualization; Artificial Intelligence; QA; :Informàtica::Infografia [Àrees temàtiques de la UPC]; business.industry; Data visualization; Dimensionality reduction; Grand tour; General Engineering; Nonlinear dimensionality reduction; Topographic map; Data structure; Computer Science Applications; Visualization; Manifold learning; Data mining; business; computer; Generative topographic mapping; Linear projections

researchProduct

The Three Steps of Clustering In The Post-Genomic Era

2013

This chapter describes the basic algorithmic components that are involved in clustering, with particular attention to classification of microarray data.

Clustering high-dimensional data; Settore INF/01 - Informatica; business.industry; Correlation clustering; Pattern recognition; computer.software_genre; Biclustering; CURE data clustering algorithm; Clustering Classification Biological Data Mining; Consensus clustering; Artificial intelligence; Data mining; business; Cluster analysis; computer; Mathematics

researchProduct