Search results for "Medoid"

showing 8 items of 8 documents

Non-parametric approaches to the impact of Holstein heifer growth from birth to insemination on their dairy performance at lactation one

2012

SUMMARY: Parametric approaches have been used widely to model animal growth and study the impact of growth profile on performance. Individual variation is often not considered in such approaches. However, non-parametric modelling allows this. Such an approach, based on spline functions, was used to study the importance of growth profiles from age 0 to 15 months (i.e. insemination) on milk yield and composition in primiparous cows. A dataset of 447 heifers was used for analysis of growth performance; 296 of them were also used to study impact on lactation. All of them originated from a French experimental herd and were born between 1986 and 2006. Clustering methods were also tested. Comparison…

040301 veterinary sciences; FUNCTIONAL DATA; [SDV]Life Sciences [q-bio]; MODELS; CATTLE; Beef cattle; Insemination; Milking; 0403 veterinary science; Lactation; Statistics; Genetics; medicine; Mathematics; 2. Zero hunger; COWS; 0402 animal and dairy science; Nonparametric statistics; 04 agricultural and veterinary sciences; 040201 dairy & animal science; Medoid; medicine.anatomical_structure; Herd; Animal Science and Zoology; WEIGHT; Spline interpolation; Agronomy and Crop Science
researchProduct

Structural clustering of millions of molecular graphs

2014

We propose an algorithm for clustering very large molecular graph databases according to scaffolds (i.e., large structural overlaps) that are common between cluster members. Our approach first partitions the original dataset into several smaller datasets using a greedy clustering approach named APreClus based on dynamic seed clustering. APreClus is an online and instance incremental clustering algorithm delaying the final cluster assignment of an instance until one of the so-called pending clusters the instance belongs to has reached significant size and is converted to a fixed cluster. Once a cluster is fixed, APreClus recalculates the cluster centers, which are used as representatives for…

Clustering high-dimensional data; Fuzzy clustering; Theoretical computer science; k-medoids; Computer science; Single-linkage clustering; Correlation clustering; Constrained clustering; computer.software_genre; Complete-linkage clustering; Graph; Hierarchical clustering; ComputingMethodologies_PATTERNRECOGNITION; Data stream clustering; CURE data clustering algorithm; Canopy clustering algorithm; FLAME clustering; Affinity propagation; Data mining; Cluster analysis; computer; k-medians clustering; Clustering coefficient; Proceedings of the 29th Annual ACM Symposium on Applied Computing
researchProduct

Incrementally Assessing Cluster Tendencies with a Maximum Variance Cluster Algorithm

2003

A straightforward and efficient way to discover clustering tendencies in data is presented, based on a recently proposed Maximum Variance Clustering algorithm. The approach shares the benefits of the plain clustering algorithm with regard to other approaches for clustering. Experiments using both synthetic and real data have been performed in order to evaluate the differences between the proposed methodology and the plain use of the Maximum Variance algorithm. According to the results obtained, the proposal constitutes an efficient and accurate alternative.

Clustering high-dimensional data; k-medoids; Computer science; CURE data clustering algorithm; Single-linkage clustering; Canopy clustering algorithm; Variance (accounting); Data mining; Cluster analysis; computer.software_genre; computer; k-medians clustering
researchProduct

Looking for representative fit models for apparel sizing

2014

This paper is concerned with the generation of optimal fit models for use in apparel design. Representative fit models or prototypes are important for defining a meaningful sizing system. However, there is no agreement among apparel manufacturers, and each one has its own prototypes and size charts, i.e. there is a lack of standard sizes in garments from different apparel manufacturers. We propose two algorithms based on a new hierarchical partitioning around medoids clustering method originally developed for gene expression data. We are concerned with a different application; therefore, the dissimilarity between the objects has to be different and must be designed to deal with anthropometr…

Hierarchical tree; Information Systems and Management; Computer science; computer.software_genre; Machine learning; Management Information Systems; INCA statistic; Arts and Humanities (miscellaneous); Mean split silhouette; Developmental and Educational Psychology; Market share; Cluster analysis; business.industry; Clothing; Medoid; Sizing; HIPAM; Outlier; Partitioning around medoids; Artificial intelligence; Data mining; business; computer; Information Systems; Fit models
researchProduct
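
For readers unfamiliar with the term shared by these results: a medoid is the member of a group with the smallest total dissimilarity to all other members, so unlike a mean it is always a real observation. The following is a minimal sketch of that definition only, not the hierarchical partitioning around medoids method from the paper above; the function name `medoid` and the caller-supplied `dissimilarity` function are hypothetical.

```python
def medoid(items, dissimilarity):
    """Return the element of `items` with the smallest summed dissimilarity.

    `items` must be a sequence (it is iterated more than once) and
    `dissimilarity(a, b)` is any caller-supplied non-negative function.
    Ties are broken in favour of the earliest element.
    """
    return min(items, key=lambda a: sum(dissimilarity(a, b) for b in items))


# Illustrative data (made up): one extreme value pulls the mean far away,
# but the medoid stays on a typical observation.
values = [1, 2, 3, 4, 100]
representative = medoid(values, lambda a, b: abs(a - b))
```

Because the dissimilarity is passed in as a function, the same sketch works for measures that are not Euclidean distances, which is the situation the abstract describes for anthropometric data.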

Apparel sizing using trimmed PAM and OWA operators

2012

This paper is concerned with apparel sizing system design. One of the most important issues in the apparel development process is to define a sizing system that provides a good fit to the majority of the population. A sizing system classifies a specific population into homogeneous subgroups based on some key body dimensions. Standard sizing systems range linearly from very small to very large. However, anthropometric measures do not grow linearly with size, so such systems cannot accommodate all body types. It is important to determine each class in the sizing system based on a real prototype that is as representative as possible of each class. In this paper we propose a methodology to develop an …

Mathematical optimization; education.field_of_study; Anthropometric data; Trimmed k-medoids; Computer science; Process (engineering); Population; General Engineering; Class (biology); Sizing; Computer Science Applications; Range (mathematics); Artificial Intelligence; Key (cryptography); Sizing systems; Systems design; OWA operators; Cluster analysis; education; Simulation
researchProduct

A fast and recursive algorithm for clustering large datasets with k-medians

2012

Clustering large samples of high-dimensional data with fast algorithms is an important challenge in computational statistics. Borrowing ideas from MacQueen (1967), who introduced a sequential version of the $k$-means algorithm, a new class of recursive stochastic gradient algorithms designed for the $k$-medians loss criterion is proposed. By their recursive nature, these algorithms are very fast and are well adapted to deal with large samples of data that are allowed to arrive sequentially. It is proved that the stochastic gradient algorithm converges almost surely to the set of stationary points of the underlying loss criterion. Particular attention is paid to the averaged versions, which…

Statistics and Probability; Clustering high-dimensional data; FOS: Computer and information sciences; Mathematical optimization; high dimensional data; Machine Learning (stat.ML); 02 engineering and technology; Stochastic approximation; 01 natural sciences; Statistics - Computation; 010104 statistics & probability; k-medoids; Statistics - Machine Learning; [MATH.MATH-ST]Mathematics [math]/Statistics [math.ST]; stochastic approximation; 0202 electrical engineering electronic engineering information engineering; Computational statistics; recursive estimators; Almost surely; [ MATH.MATH-ST ] Mathematics [math]/Statistics [math.ST]; 0101 mathematics; Cluster analysis; Computation (stat.CO); Mathematics; averaging; k-medoids; Robbins Monro; Applied Mathematics; Estimator; [STAT.TH]Statistics [stat]/Statistics Theory [stat.TH]; stochastic gradient; [ STAT.TH ] Statistics [stat]/Statistics Theory [stat.TH]; Medoid; Computational Mathematics; Computational Theory and Mathematics; online clustering; 020201 artificial intelligence & image processing; partitioning around medoids; Algorithm
researchProduct
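
The recursive idea in the abstract above can be illustrated with a generic one-pass stochastic gradient update for a k-medians criterion with Euclidean norm: each arriving point moves only its nearest center, by a fixed-length step in the direction of the point, and a running average of each center's iterates is kept alongside. This is a sketch under stated assumptions, not the authors' exact algorithm: the function name `online_k_medians`, the per-center step schedule `c_gamma / n ** alpha`, and the default values are all hypothetical.

```python
import math


def online_k_medians(stream, init_centers, c_gamma=1.0, alpha=0.75):
    """One pass over `stream`, updating the nearest center for each point.

    Returns (centers, averaged): the last iterates and the running
    averages of the iterates of each center.
    """
    centers = [list(c) for c in init_centers]
    averaged = [list(c) for c in init_centers]
    counts = [0] * len(centers)  # number of updates seen by each center

    for x in stream:
        # Assign the point to its nearest current center.
        j = min(range(len(centers)), key=lambda i: math.dist(x, centers[i]))
        d = math.dist(x, centers[j])
        counts[j] += 1
        n = counts[j]

        if d > 0:
            # The gradient of ||x - c|| with respect to c is -(x - c) / ||x - c||,
            # so a descent step moves c toward x by exactly `step`.
            step = c_gamma / n ** alpha  # assumed schedule, decreasing in n
            centers[j] = [c + step * (xi - c) / d
                          for c, xi in zip(centers[j], x)]

        # Running mean of the iterates of center j (the averaged version).
        averaged[j] = [a + (c - a) / n
                       for a, c in zip(averaged[j], centers[j])]

    return centers, averaged
```

Since `stream` can be any iterable, including a generator, the points never need to be held in memory at once, which matches the sequential setting the abstract describes.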

Online Customer Segmentation Using Clustering Methods (Tiešsaistes Klientu Segmentācija, Izmantojot Klasterizācijas Metodes)

2021

The aim of this bachelor's thesis is to study online customer segmentation so that it can support sound decisions about the efficient use of marketing and advertising resources. Two clustering methods were used: K-Medoids clustering and K-Prototypes clustering. The choice of methods is justified by the characteristics of the problem studied. The thesis describes the theoretical aspects of both methods and also applies them in practice (using the program R) to solve a specific task. The results obtained were analysed and compared. The thesis also explains the importance of customer segmentation for a successful business and describes internet…

Online customer segmentation; K-Prototypes clustering; Average silhouette method; Mathematics; K-Medoids clustering; Cluster analysis
researchProduct

SparseHC: A Memory-efficient Online Hierarchical Clustering Algorithm

2014

Computing a hierarchical clustering of objects from a pairwise distance matrix is an important algorithmic kernel in computational science. Since the storage of this matrix requires quadratic space with respect to the number of objects, the design of memory-efficient approaches is of high importance to this research area. In this paper, we address this problem by presenting a memory-efficient online hierarchical clustering algorithm called SparseHC. SparseHC scans a sorted and possibly sparse distance matrix chunk-by-chunk. Meanwhile, a dendrogram is built by merging cluster pairs as and when the distance between them is determined to be the smallest among all remaining cluster pairs. The k…

sparse matrix; Clustering high-dimensional data; Theoretical computer science; online algorithms; Computer science; Single-linkage clustering; Complete-linkage clustering; Nearest-neighbor chain algorithm; Consensus clustering; memory-efficient clustering; Cluster analysis; k-medians clustering; General Environmental Science; Sparse matrix; :Engineering::Computer science and engineering [DRNTU]; k-medoids; Dendrogram; Constrained clustering; Hierarchical clustering; Distance matrix; Canopy clustering algorithm; General Earth and Planetary Sciences; FLAME clustering; Hierarchical clustering of networks; hierarchical clustering; Algorithm; Procedia Computer Science
researchProduct
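
The idea of building a dendrogram while scanning distances in sorted order can be illustrated for the simplest case, single linkage, where each pair that joins two different clusters is immediately a merge. This sketch is not SparseHC itself and covers no other linkage; the function name and the `(i, j, distance)` edge format are assumptions made for illustration.

```python
def single_linkage_from_sorted_edges(n, sorted_edges):
    """Build single-linkage merges from distances given in ascending order.

    `n` is the number of objects (labelled 0..n-1) and `sorted_edges` is
    any iterable of (i, j, distance) tuples sorted by distance. Pairs
    missing from the iterable are simply never merged through directly,
    so a sparse input is allowed. Returns a list of
    (root_a, root_b, distance) merges in the order they occur.
    """
    parent = list(range(n))  # union-find forest over the objects

    def find(i):
        # Follow parent links to the cluster root, halving the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    merges = []
    for i, j, d in sorted_edges:
        ri, rj = find(i), find(j)
        if ri != rj:  # the pair links two different clusters: merge them
            parent[rj] = ri
            merges.append((ri, rj, d))
    return merges
```

Because the edges are consumed one at a time, they can come from a generator that reads a sorted file chunk by chunk, so only the `parent` array and the merge list stay in memory.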