Search results for "cluster analysis."

showing 10 items of 805 documents

On the complexity of the Saccharomyces bayanus taxon: Hybridization and potential hybrid speciation

2014

Although the genus Saccharomyces has been thoroughly studied, some species in the genus have not yet been accurately resolved; an example is S. bayanus, a taxon that includes genetically diverse lineages of pure and hybrid strains. This diversity makes the assignment and classification of strains belonging to this species unclear and controversial. They have been subdivided by some authors into two varieties (bayanus and uvarum), which have been raised to the species level by others. In this work, we evaluate the complexity of 46 different strains included in the S. bayanus taxon by means of PCR-RFLP analysis and by sequencing of 34 gene regions and one mitochondrial gene. Using the sequenc…

Evolutionary Genetics; Saccharomyces bayanus; DIVERSITY; Sequence Homology; lcsh:Medicine; Saccharomyces; Polymerase Chain Reaction; //purl.org/becyt/ford/1 [https]; Genética y Herencia; PCR-RFLP analysis; Fungal Evolution; Cluster Analysis; lcsh:Science; Genome Evolution; Phylogeny; Genetics; Multidisciplinary; SACCHAROMYCES EUBAYANUS; Phylogenetic analysis; biology; Strain (biology); Systems Biology; Genomics; S. bayanus; Polymorphism Restriction Fragment Length; CIENCIAS NATURALES Y EXACTAS; Research Article; Evolutionary Processes; Genetic Speciation; Molecular Sequence Data; Introgression; Mycology; Genome Complexity; Microbiology; Genètica molecular; Ciencias Biológicas; Saccharomyces; Species Specificity; Phylogenetics; Genetic variation; Genetics; YEAST; //purl.org/becyt/ford/1.6 [https]; Hybridization; Alleles; Hybrid; Evolutionary Biology; Base Sequence; lcsh:R; Organisms; Fungi; Biology and Life Sciences; Computational Biology; Genetic Variation; SACCHAROMYCES PASTORIANUS; Sequence Analysis DNA; Comparative Genomics; biology.organism_classification; Yeast; Genetics Population; Haplotypes; Fungal Classification; Hybridization Genetic; Hybrid speciation; lcsh:Q
researchProduct

Deep-Time Phylogenetic Clustering of Extinctions in an Evolutionarily Dynamic Clade (Early Jurassic Ammonites)

2012

7 pages; International audience; Conservation biologists and palaeontologists are increasingly investigating the phylogenetic distribution of extinctions and its evolutionary consequences. However, the dearth of palaeontological studies on that subject and the lack of methodological consensus hamper our understanding of that major evolutionary phenomenon. Here we address this issue by (i) reviewing the approaches used to quantify the phylogenetic selectivity of extinctions and extinction risks; (ii) investigating with a high-resolution dataset whether extinctions and survivals were phylogenetically clustered among early Pliensbachian (Early Jurassic) ammonites; (iii) exploring the phylogene…

Evolutionary Processes; Ecological Metrics; Combined use; lcsh:Medicine; Biology; Forms of Evolution; Extinction Biological; Phylogenetics; Phyletic Patterns; Animals; Cluster Analysis; Evolutionary Systematics; Clade; lcsh:Science; Biology; Deep time; Species Extinction; Phylogeny; [ SDU.STU.PG ] Sciences of the Universe [physics]/Earth Sciences/Paleontology; Ammonite; Evolutionary Biology; Multidisciplinary; Extinction; Models Statistical; Phylogenetic tree; Ecology; Ecology; Fossils; lcsh:R; Paleontology; social sciences; Biological Evolution; language.human_language; humanities; [ SDV.BID.EVO ] Life Sciences [q-bio]/Biodiversity/Populations and Evolution [q-bio.PE]; Cephalopoda; Phylogenetic Pattern; Extinction Risk; language; Earth Sciences; Macroevolution; lcsh:Q; Paleoecology; Paleobiology; Research Article; PLoS ONE
researchProduct

Statistical analysis of yeast genomic downstream sequences reveals putative polyadenylation signals

2000

The study of a few genes has permitted the identification of three elements that constitute a yeast polyadenylation signal: the efficiency element (EE), the positioning element and the actual site for cleavage and polyadenylation. In this paper we perform an analysis of oligonucleotide composition on the sequences located downstream of the stop codon of all yeast genes. Several oligonucleotide families appear over-represented with a high significance (referred to herein as "words"). The family with the highest over-representation includes the oligonucleotides shown experimentally to play a role as EEs. The word with the highest score is TATATA, followed, among others, by a series of singl…

Expressed Sequence Tags; Genetics; Expressed sequence tag; Base Sequence; Polyadenylation; [SDV]Life Sciences [q-bio]; Saccharomyces cerevisiae; Saccharomyces cerevisiae; Biology; biology.organism_classification; Saccharomyces; Article; Yeast; Stop codon; Saccharomyces; Genetics; Cluster Analysis; [INFO.INFO-BI]Computer Science [cs]/Bioinformatics [q-bio.QM]; Genome Fungal; ORFS; Poly A; Gene; ComputingMilieux_MISCELLANEOUS
researchProduct
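
The oligonucleotide analysis described in the abstract above can be illustrated with a small counting sketch. The abstract does not state which significance statistic the authors use, so the ratio below (observed count divided by the count expected under independent base frequencies) is only an assumed, simplified null model. The function names and the toy sequences are hypothetical and are not yeast data.

```python
from collections import Counter


def count_kmers(sequences, k=6):
    """Count every overlapping k-mer across a list of DNA sequences."""
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    return counts


def overrepresentation(counts, sequences, k=6):
    """Observed / expected count for each k-mer.

    The expected count assumes independent bases drawn with the observed
    base frequencies (an assumption made here for illustration only).
    """
    bases = Counter("".join(s.upper() for s in sequences))
    total_bases = sum(bases.values())
    freq = {b: n / total_bases for b, n in bases.items()}
    positions = sum(max(len(s) - k + 1, 0) for s in sequences)
    ratios = {}
    for word, observed in counts.items():
        p = 1.0
        for b in word:
            p *= freq[b]
        ratios[word] = observed / (p * positions)
    return ratios


# Toy "downstream" sequences, made up for the example.
toy = ["TATATAGC", "GCTATATA", "TATATATA"]
counts = count_kmers(toy)
ratios = overrepresentation(counts, toy)
print(counts.most_common(3))
```

A ratio well above 1 flags a candidate word; a real analysis would need a proper background model and a multiple-testing correction.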

Distributed and proximity-constrained C-means for discrete coverage control

2018

In this paper we present a novel distributed coverage control framework for a network of mobile agents, in charge of covering a finite set of points of interest (PoI), such as people in danger, geographically dispersed equipment or environmental landmarks. The proposed algorithm is inspired by C-Means, an unsupervised learning algorithm originally proposed for non-exclusive clustering and for identification of cluster centroids from a set of observations. To cope with the agents' limited sensing range and avoid infeasible coverage solutions, traditional C-Means needs to be enhanced with proximity constraints, ensuring that each agent takes into account only neighboring PoIs. The proposed co…

FOS: Computer and information sciences; 0209 industrial biotechnology; Control and Optimization; Computer science; Distributed computing; 02 engineering and technology; Industrial and Manufacturing Engineering; Set (abstract data type); Disaster relief; Computer Science - Robotics; 020901 industrial engineering & automation; 0202 electrical engineering electronic engineering information engineering; Decision Sciences (miscellaneous); Cluster analysis; Data fusion process; Points of interest(poi); Sensing ranges; Non-exclusive clustering; Data fusion; Disaster prevention; Sensor fusion; Euclidean distance; Coverage control; Identification (information); Range (mathematics); Information concerning; Ranking; 020201 artificial intelligence & image processing; Mobile agents; Robotics (cs.RO); Cluster centroids
researchProduct
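
For orientation, the sketch below shows the textbook, centralized fuzzy C-means iteration, assuming that is the "C-Means" the abstract above refers to. It is not the distributed, proximity-constrained variant proposed in the paper: every point is visible to every center here. The function names, the fuzzifier `m = 2`, and the toy points are assumptions for illustration.

```python
import math


def memberships(points, centers, m=2.0):
    """Fuzzy membership of each point in each cluster (each row sums to 1)."""
    power = 2.0 / (m - 1.0)
    result = []
    for p in points:
        dists = [math.dist(p, c) for c in centers]
        if any(d == 0.0 for d in dists):
            # A point lying exactly on a center belongs fully to it.
            zero = [1.0 if d == 0.0 else 0.0 for d in dists]
            s = sum(zero)
            result.append([z / s for z in zero])
            continue
        result.append(
            [1.0 / sum((dj / dk) ** power for dk in dists) for dj in dists]
        )
    return result


def update_centers(points, u, m=2.0):
    """Each center is the mean of the points weighted by u_ij ** m."""
    dim = len(points[0])
    centers = []
    for j in range(len(u[0])):
        weights = [row[j] ** m for row in u]
        total = sum(weights)
        centers.append(tuple(
            sum(w * p[d] for w, p in zip(weights, points)) / total
            for d in range(dim)
        ))
    return centers


def fuzzy_c_means(points, centers, m=2.0, iterations=50):
    """Alternate membership and center updates for a fixed number of steps."""
    for _ in range(iterations):
        u = memberships(points, centers, m)
        centers = update_centers(points, u, m)
    return centers, memberships(points, centers, m)


# Four toy points of interest and two initial agent positions.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
centers, u = fuzzy_c_means(points, [(1.0, 1.0), (9.0, 9.0)])
print(centers)
```

Because memberships are non-exclusive, every point influences every center a little; a proximity constraint would restrict each center to the points within its sensing range.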

Multilingual Clustering of Streaming News

2018

Clustering news across languages enables efficient media monitoring by aggregating articles from multilingual sources into coherent stories. Doing so in an online setting allows scalable processing of massive news streams. To this end, we describe a novel method for clustering an incoming stream of multilingual documents into monolingual and crosslingual story clusters. Unlike typical clustering approaches that consider a small and known number of labels, we tackle the problem of discovering an ever growing number of cluster labels in an online fashion, using real news datasets in multiple languages. Our method is simple to implement, computationally efficient and produces state-of-the-art …

FOS: Computer and information sciences; Computer Science - Computation and Language; Information retrieval; Computer science; InformationSystems_INFORMATIONSTORAGEANDRETRIEVAL; 02 engineering and technology; Clustering; Media Monitoring; Computer Science - Information Retrieval; ComputingMethodologies_PATTERNRECOGNITION; Multilingual Methods; 0202 electrical engineering electronic engineering information engineering; 020201 artificial intelligence & image processing; Cluster analysis; Computation and Language (cs.CL); Information Retrieval (cs.IR)
researchProduct
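
The idea of discovering an ever-growing number of clusters online, as described in the abstract above, can be sketched with a simple threshold rule: assign each incoming document to its most similar existing cluster, or open a new cluster when nothing is similar enough. This is a generic, monolingual illustration and not the method of the paper; the class name `OnlineClusterer`, the bag-of-words representation, and the threshold value are all assumptions.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


class OnlineClusterer:
    """Single-pass clustering: the number of clusters is not fixed in advance."""

    def __init__(self, threshold=0.3):
        self.threshold = threshold
        self.centroids = []  # one summed term-count vector per cluster

    def add(self, text):
        """Assign a document to a cluster and return the cluster index."""
        vec = Counter(text.lower().split())
        best, best_sim = None, 0.0
        for i, centroid in enumerate(self.centroids):
            sim = cosine(vec, centroid)
            if sim > best_sim:
                best, best_sim = i, sim
        if best is None or best_sim < self.threshold:
            self.centroids.append(vec)      # open a new story cluster
            return len(self.centroids) - 1
        self.centroids[best].update(vec)    # fold the document into the story
        return best


clusterer = OnlineClusterer(threshold=0.3)
labels = [
    clusterer.add("earthquake hits city centre"),
    clusterer.add("city centre earthquake damage"),
    clusterer.add("football cup final tonight"),
]
print(labels)
```

Each document is processed once and compared only against cluster summaries, which is what keeps this kind of scheme cheap on a stream.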

Diffusion map for clustering fMRI spatial maps extracted by Independent Component Analysis

2013

Functional magnetic resonance imaging (fMRI) produces data about activity inside the brain, from which spatial maps can be extracted by independent component analysis (ICA). In datasets, there are n spatial maps that contain p voxels. The number of voxels is very high compared to the number of analyzed spatial maps. Clustering of the spatial maps is usually based on correlation matrices. This usually works well, although such a similarity matrix inherently can explain only a certain amount of the total variance contained in the high-dimensional data where n is relatively small but p is large. For high-dimensional space, it is reasonable to perform dimensionality reduction before clustering.…

FOS: Computer and information sciences; Diffusion (acoustics); Computer science; diffusion map; Machine Learning (stat.ML); 02 engineering and technology; computer.software_genre; Machine Learning (cs.LG); Computational Engineering Finance and Science (cs.CE); Correlation; 03 medical and health sciences; Total variation; 0302 clinical medicine; Statistics - Machine Learning; Voxel; 0202 electrical engineering electronic engineering information engineering; Computer Science - Computational Engineering Finance and Science; Cluster analysis; dimensionality reduction; ta113; spatial maps; business.industry; Dimensionality reduction; functional magnetic resonance imaging (fMRI); Pattern recognition; Independent component analysis; Spectral clustering; Computer Science - Learning; independent component analysis; ta6131; 020201 artificial intelligence & image processing; Artificial intelligence; DYNAMICAL-SYSTEMS; business; computer; 030217 neurology & neurosurgery; clustering
researchProduct

Heretical Multiple Importance Sampling

2016

Multiple Importance Sampling (MIS) methods approximate moments of complicated distributions by drawing samples from a set of proposal distributions. Several ways to compute the importance weights assigned to each sample have been recently proposed, with the so-called deterministic mixture (DM) weights providing the best performance in terms of variance, at the expense of an increase in the computational cost. A recent work has shown that it is possible to achieve a trade-off between variance reduction and computational effort by performing an a priori random clustering of the proposals (partial DM algorithm). In this paper, we propose a novel "heretical" MIS framework, where the clustering …

FOS: Computer and information sciences; Mean squared error; Computer science; Applied Mathematics; Estimator; 020206 networking & telecommunications; 02 engineering and technology; Variance (accounting); Statistics - Computation; 01 natural sciences; Reduction (complexity); 010104 statistics & probability; [INFO.INFO-TS]Computer Science [cs]/Signal and Image Processing; Signal Processing; 0202 electrical engineering electronic engineering information engineering; A priori and a posteriori; Variance reduction; 0101 mathematics; Electrical and Electronic Engineering; Cluster analysis; Algorithm; [SPI.SIGNAL]Engineering Sciences [physics]/Signal and Image processing; Importance sampling; Computation (stat.CO); ComputingMilieux_MISCELLANEOUS
researchProduct
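
The trade-off mentioned in the abstract above, between standard weights, deterministic mixture (DM) weights, and partial DM weights, can be sketched as follows. The sketch assumes the usual definitions: each sample is weighted by the target density divided by the average of the proposal densities in its group. It does not implement the "heretical" scheme of the paper. The Gaussian target, the proposal means, and the fixed grouping are assumptions for illustration (the partial DM algorithm would draw the grouping at random).

```python
import math
import random


def normal_pdf(x, mean, std):
    """Density of a univariate Gaussian."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def target(x):
    """Toy target density: a standard normal (assumed for the example)."""
    return normal_pdf(x, 0.0, 1.0)


def mis_weights(samples, means, std, groups):
    """Weight sample n by target / (average proposal density over its group).

    One group per proposal      -> standard importance weights
    A single group of all       -> deterministic mixture (DM) weights
    Anything in between         -> partial DM weights
    """
    weights = [0.0] * len(samples)
    for group in groups:
        for n in group:
            mix = sum(
                normal_pdf(samples[n], means[j], std) for j in group
            ) / len(group)
            weights[n] = target(samples[n]) / mix
    return weights


rng = random.Random(0)
means = [-2.0, -1.0, 1.0, 2.0]
std = 1.5
samples = [rng.gauss(mu, std) for mu in means]  # one draw per proposal

standard = mis_weights(samples, means, std, [[0], [1], [2], [3]])
partial = mis_weights(samples, means, std, [[0, 2], [1, 3]])
full_dm = mis_weights(samples, means, std, [[0, 1, 2, 3]])
print(standard, partial, full_dm)
```

With N proposals split into groups of size M, the number of proposal-density evaluations grows as N for standard weights, N·M for partial DM, and N² for full DM, which is the computational side of the trade-off.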

Mixture Hidden Markov Models for Sequence Data: The seqHMM Package in R

2019

Sequence analysis is increasingly used for the analysis of social sequences and other multivariate categorical time series data. However, it is often complex to describe, visualize, and compare large sequence data, especially when there are multiple parallel sequences per subject. Hidden (latent) Markov models (HMMs) are able to detect underlying latent structures and they can be used in various longitudinal settings: to account for measurement error, to detect unobservable states, or to compress information across several types of observations. Extending to mixture hidden Markov models (MHMMs) allows clustering data into homogeneous subsets, with or without external covariate…

FOS: Computer and information sciences; Statistics and Probability; Multivariate statistics; sequence analysis; aikasarjat; Computer science; r; Markov model; Statistics - Computation; Statistics - Applications; 01 natural sciences; Unobservable; categorical time series; R-kieli; 010104 statistics & probability; multi-channel sequences; categorical time series; visualizing sequence data; visualizing models; latent Markov models; latent class models; R; Covariate; Applications (stat.AP); Sannolikhetsteori och statistik; Computer software; 0101 mathematics; Time series; Probability Theory and Statistics; Hidden Markov model; Cluster analysis; lcsh:Statistics; lcsh:HA1-4737; Categorical variable; Computation (stat.CO); ta112; business.industry; visualizing sequence data; R (programming languages); Pattern recognition; multi-channel sequences; visualizing models; latent class models; sekvenssianalyysi; Artificial intelligence; latent markov models; time series; Statistics Probability and Uncertainty; business; Software; Journal of Statistical Software
researchProduct

A multi-scale area-interaction model for spatio-temporal point patterns

2018

Models for fitting spatio-temporal point processes should incorporate spatio-temporal inhomogeneity and allow for different types of interaction between points (clustering or regularity). This paper proposes an extension of the spatial multi-scale area-interaction model to a spatio-temporal framework. This model allows for interaction between points at different spatio-temporal scales and the inclusion of covariates. We fit the proposed model to varicella cases registered during 2013 in Valencia, Spain. The fitted model indicates small scale clustering and regularity for higher spatio-temporal scales.

FOS: Computer and information sciences; Statistics and Probability; Scale (ratio); Computer science; Management Monitoring Policy and Law; Multi-scale area-interaction model; computer.software_genre; Varicella; 01 natural sciences; Point process; Methodology (stat.ME); 010104 statistics & probability; 0502 economics and business; Statistics; Covariate; 60D05 60G55 62M30; Point (geometry); 0101 mathematics; Computers in Earth Sciences; Cluster analysis; Statistics - Methodology; 050205 econometrics; 05 social sciences; Interaction model; Extension (predicate logic); Gibbs point processes; ComputingMethodologies_PATTERNRECOGNITION; Spatio-temporal point processes; Data mining; computer
researchProduct

Fast PET Scan Tumor Segmentation Using Superpixels, Principal Component Analysis and K-Means Clustering

2018

Positron Emission Tomography scan images are extensively used in radiotherapy planning, clinical diagnosis, and assessment of the growth and treatment of a tumor. All of these rely on the fidelity and speed of the detection and delineation algorithm. Despite intensive research, segmentation remains a challenging problem due to the diverse image content, resolution, shape, and noise. This paper presents a fast positron emission tomography tumor segmentation method in which superpixels are first extracted from the input image. Principal component analysis is then applied to the superpixels and also to their average. The distance vector of each superpixel from the average is computed in principal components coordin…

FOS: Computer and information sciences; positron emission tomography; principal component analysis; Computer science; Computer Vision and Pattern Recognition (cs.CV); k-means; Coordinate system; Computer Science - Computer Vision and Pattern Recognition; FOS: Physical sciences; 02 engineering and technology; Benchmark; Quantitative Biology - Quantitative Methods; Biochemistry Genetics and Molecular Biology (miscellaneous); 030218 nuclear medicine & medical imaging; superpixels; 03 medical and health sciences; 0302 clinical medicine; Structural Biology; 0202 electrical engineering electronic engineering information engineering; medicine; Segmentation; Computer vision; Tissues and Organs (q-bio.TO); Cluster analysis; Quantitative Methods (q-bio.QM); Pixel; medicine.diagnostic_test; business.industry; segmentation; k-means clustering; Quantitative Biology - Tissues and Organs; Pattern recognition; Physics - Medical Physics; Positron emission tomography; FOS: Biological sciences; Physics - Data Analysis Statistics and Probability; Principal component analysis; 020201 artificial intelligence & image processing; Medical Physics (physics.med-ph); Artificial intelligence; Noise (video); business; Data Analysis Statistics and Probability (physics.data-an); Biotechnology; Methods and Protocols
researchProduct