Search results for "cluster analysis."

Showing 10 of 805 documents

Structural clustering of millions of molecular graphs

2014

We propose an algorithm for clustering very large molecular graph databases according to scaffolds (i.e., large structural overlaps) that are common between cluster members. Our approach first partitions the original dataset into several smaller datasets using a greedy clustering approach named APreClus, which is based on dynamic seed clustering. APreClus is an online, instance-incremental clustering algorithm that delays the final cluster assignment of an instance until one of the so-called pending clusters the instance belongs to has reached significant size and is converted to a fixed cluster. Once a cluster is fixed, APreClus recalculates the cluster centers, which are used as representatives for…

Clustering high-dimensional data | Fuzzy clustering | Theoretical computer science | k-medoids | Computer science | Single-linkage clustering | Correlation clustering | Constrained clustering | computer.software_genre | Complete-linkage clustering | Graph | Hierarchical clustering | ComputingMethodologies_PATTERNRECOGNITION | Data stream clustering | CURE data clustering algorithm | Canopy clustering algorithm | FLAME clustering | Affinity propagation | Data mining | Cluster analysis | computer | k-medians clustering | Clustering coefficient | Proceedings of the 29th Annual ACM Symposium on Applied Computing
researchProduct

The Three Steps of Clustering In The Post-Genomic Era

2013

This chapter describes the basic algorithmic components that are involved in clustering, with particular attention to classification of microarray data.

Clustering high-dimensional data | Settore INF/01 - Informatica | business.industry | Correlation clustering | Pattern recognition | computer.software_genre | Biclustering | CURE data clustering algorithm | Clustering Classification Biological Data Mining | Consensus clustering | Artificial intelligence | Data mining | business | Cluster analysis | computer | Mathematics
researchProduct

Incrementally Assessing Cluster Tendencies with a Maximum Variance Cluster Algorithm

2003

A straightforward and efficient way to discover clustering tendencies in data using a recently proposed Maximum Variance Clustering algorithm is proposed. The approach shares the benefits of the plain clustering algorithm with regard to other approaches for clustering. Experiments using both synthetic and real data have been performed in order to evaluate the differences between the proposed methodology and the plain use of the Maximum Variance algorithm. According to the results obtained, the proposal constitutes an efficient and accurate alternative.

Clustering high-dimensional data | k-medoids | Computer science | CURE data clustering algorithm | Single-linkage clustering | Canopy clustering algorithm | Variance (accounting) | Data mining | Cluster analysis | computer.software_genre | computer | k-medians clustering
researchProduct

Bayesian versus data driven model selection for microarray data

2014

Clustering is one of the most well known activities in scientific investigation and the object of research in many disciplines, ranging from Statistics to Computer Science. In this beautiful area, one of the most difficult challenges is a particular instance of the model selection problem, i.e., the identification of the correct number of clusters in a dataset. In what follows, for ease of reference, we refer to that instance still as model selection. It is an important part of any statistical analysis. The techniques used for solving it are mainly either Bayesian or data-driven, and are both based on internal knowledge. That is, they use information obtained by processing the input data. A…

Clustering Model selection Bayesian information criterion Akaike information criterion Minimum message length Bioinformatics | Settore INF/01 - Informatica | Computer science | business.industry | Model selection | Bayesian probability | computer.software_genre | Machine learning | Computer Science Applications | Data-driven | Determining the number of clusters in a data set | Identification (information) | Bayesian information criterion | Data mining | Artificial intelligence | Akaike information criterion | Cluster analysis | business | computer
researchProduct

Project Management Information Systems (PMISs): A Statistical-Based Analysis for the Evaluation of Software Packages Features

2021

Project Managers (PMs) working in competitive markets are finding Project Management Information Systems (PMISs) useful for planning, organizing and controlling projects of varying complexity. A wide variety of PMIS software is available, suitable for projects differing in scope and user needs. This paper identifies the most useful features found in PMISs. An extensive literature review and analysis of commercial software is made to identify the main features of PMISs. Afterwards, the list is reduced by a panel of project management experts, and a statistical analysis is performed on data acquired by means of two different surveys. The relative importance of listed features is properly comp…

Clustering; Conjoint analysis; Design of Experiment (DoE); Project Management Information System (PMIS); Ranking method; Survey | ranking method | Technology | Computer science | QH301-705.5 | QC1-999 | Software | Settore ING-IND/17 - Impianti Industriali Meccanici | General Materials Science | survey | Project management | Biology (General) | Cluster analysis | Instrumentation | QD1-999 | Fluid Flow and Transfer Processes | Commercial software | Scope (project management) | business.industry | Process Chemistry and Technology | T | Physics | General Engineering | Project Management Information System (PMIS); survey; Design of Experiment (DoE); conjoint analysis; ranking method; clustering | Clustering Conjoint analysis Design of Experiment (DoE) Project Management Information System (PMIS) Ranking method Survey | Project Management Information System (PMIS) | Engineering (General). Civil engineering (General) | Data science | Design of Experiment (DoE) | Computer Science Applications | Conjoint analysis | Variety (cybernetics) | Chemistry | Respondent | conjoint analysis | TA1-2040 | business | clustering | Applied Sciences; Volume 11; Issue 23; Pages: 11233
researchProduct

A reappraisal of the Pleurotus eryngii complex – New species and taxonomic combinations based on the application of a polyphasic approach, and an ide…

2014

The Pleurotus eryngii species-complex comprises choice edible mushrooms growing on roots and lower stem residues of Apiaceae (umbellifers) plants. Material deriving from extensive sampling was studied by mating compatibility, morphological and ecological criteria, and through analysis of ITS1-5.8S-ITS2 and IGS1 rRNA sequences. Results revealed that P. eryngii sensu stricto forms a diverse and widely distributed aggregate composed of varieties elaeoselini, eryngii, ferulae, thapsiae, and tingitanus. Pleurotus eryngii subsp. tuoliensis comb. nov. is a phylogenetically sister group to the former growing only on various Ferula species in Asia. The existence of Pleurotus nebrodensis outside of S…

Co-evolution of plants and fungi Fungal phylogeny Pleurotus eryngii subsp. tuoliensis comb. nov. Pleurotus ferulaginis sp. nov. Pleurotus nebrodensis subsp. fossulatus comb. nov. | Molecular Sequence Data | Identification key | Pleurotus | DNA Ribosomal Spacer | Botany | Genetics | Cluster Analysis | Pleurotus eryngii | DNA Fungal | Ecology Evolution Behavior and Systematics | Recombination Genetic | Microscopy | Pleurotus | Apiaceae | Phylogenetic tree | biology | Settore BIO/02 - Botanica Sistematica | Biodiversity | Sequence Analysis DNA | biology.organism_classification | RNA Ribosomal 5.8S | Phylogeography | Infectious Diseases | Taxon | Sister group | Settore BIO/03 - Botanica Ambientale E Applicata | Key (lock) | Apiaceae | Fungal Biology
researchProduct

Comprehensive analysis of forty yeast microarray datasets reveals a novel subset of genes (APha-RiB) consistently negatively associated with ribosome…

2014

Background: The scale and complexity of genomic data lend themselves to analysis using sophisticated mathematical techniques to yield information that can generate new hypotheses and so guide further experimental investigations. An ensemble clustering method has the ability to perform consensus clustering over the same set of genes from different microarray datasets by combining results from different clustering methods into a single consensus result. Results: In this paper we have performed comprehensive analysis of forty yeast microarray datasets. One recently described Bi-CoPaM method can analyse expressions of the same set of genes from various microarray datasets while using different cl…

Co-regulation | (Binarisation of consensus partition matrices) Bi-CoPaM | Gene Expression Profiling | Stress response | Genes Fungal | Co-expression | Genome-wide analysis | Gene Expression Regulation Fungal | Ribosome biogenesis | Saccharomycetales | Cluster Analysis | Gene Regulatory Networks | Budding yeast | Ribosomes | Oligonucleotide Array Sequence Analysis | Research Article | BMC bioinformatics
researchProduct

Comparative genomics and protein domain graph analyses link ubiquitination and RNA metabolism.

2006

The human gene parkin, known to cause familial Parkinson disease, as well as several other genes, likely involved in other neurodegenerative diseases or in cancer, encode proteins of the RBR family of ubiquitin ligases. Here, we describe the structural diversity of the RBR family in order to infer their functional roles. Of particular interest is a relationship detected between RBR-mediated ubiquitination and RNA metabolism: a few RBR proteins contain RNA binding domains and DEAH-box RNA helicase domains. Global protein domain graph analyses demonstrate that this connection is not RBR-specific, but instead many other proteins contain both ubiquitination and RNA-related domains. These protei…

Comparative genomics | Genetics | biology | Protein Conformation | Ubiquitin | Ubiquitin-Protein Ligases | Protein domain | Molecular Sequence Data | RNA | Genomics | F-box protein | RNA Helicase A | Parkin | Ubiquitin ligase | Protein Structure Tertiary | Structural Biology | biology.protein | Animals | Cluster Analysis | Humans | RNA | Molecular Biology | Gene | Algorithms | Journal of molecular biology
researchProduct

On the determination of coherent solar climates over a tropical island with a complex topography

2020

Many tropical islands aim to develop greener, self-sufficient energy production systems based on renewable energy, notably solar-generated electricity. This work explores the mean diurnal and annual solar cycles over La Reunion island (southwest Indian Ocean: 21°S, 55.5°E), and their spatial behavior, using the Solar surfAce RAdiation Heliosat – East (SARAH-E) satellite-derived data at high spatial (0.05° × 0.05°) and time (hourly) resolutions over the period 1999–2016. Comparisons of the SARAH-E data with ground-based measurements over the period 2011–2015 show differences of ~15% for diurnal-seasonal variations. The solar resource over the island displays strong spatial var…

Complex topography | 020209 energy | La Réunion island | 02 engineering and technology | Atmospheric sciences | Surface solar radiation | Cluster analysis | Solar Resource | 0202 electrical engineering electronic engineering information engineering | General Materials Science | 14. Life underwater | geography | geography.geographical_feature_category | Renewable Energy Sustainability and the Environment | business.industry | Complex meteorological context | 021001 nanoscience & nanotechnology | Renewable energy | Indian ocean | Tropical islands | Volcano | [SDU.STU.CL]Sciences of the Universe [physics]/Earth Sciences/Climatology | 13. Climate action | Period (geology) | Seasonal/diurnal cycles | Environmental science | Spatial variability | 0210 nano-technology | business | SARAH-E | Solar Energy
researchProduct

Continuous reformulations and heuristics for the Euclidean travelling salesperson problem

2008

We consider continuous reformulations of the Euclidean travelling salesperson problem (TSP), based on certain clustering problem formulations. These reformulations allow us to apply a generalisation with perturbations of the Weiszfeld algorithm in an attempt to find local approximate solutions to the Euclidean TSP.

Computational Mathematics | Mathematical optimization | Control and Optimization | Control and Systems Engineering | Problem Formulations | Euclidean geometry | Applied mathematics | Computer Science::Data Structures and Algorithms | Heuristics | Cluster analysis | Mathematics | ESAIM: Control, Optimisation and Calculus of Variations
researchProduct