Search results for "Cluster Analysis"

showing 10 items of 848 documents

ValWorkBench: an open source Java library for cluster validation, with applications to microarray data analysis.

2015

Background: Cluster analysis is one of the best-known activities in scientific investigation and the object of research in many disciplines, ranging from statistics to computer science. It is central to the life sciences due to the advent of high-throughput technologies, e.g., classification of tumors. In particular, in cluster analysis, it is of relevance to assess cluster quality and to predict the number of clusters in a dataset, if any. This latter task is usually performed via internal validation measures. Despite their potentially important role, both the use of classic internal validation measures and the design of new ones, specific for microarray data, do not seem to have grea…

Software documentation; Information retrieval; Settore INF/01 - Informatica; business.industry; Computer science; Software development; Algorithm engineering; Health Informatics; Pattern discovery in bioinformatics and biomedicine; computer.software_genre; Data science; Software metric; Computer Science Applications; Software framework; Microarray cluster analysis; Software; Bioinformatics software; Software construction; Component-based software engineering; Cluster Analysis; Programming Languages; business; computer; Software; Algorithms; Computer methods and programs in biomedicine
researchProduct

Description, microhabitat selection and infection patterns of sealworm larvae (Pseudoterranova decipiens species complex, nematoda: ascaridoidea) in …

2013

Third-stage larvae of the Pseudoterranova decipiens species complex (also known as sealworms) have been reported in at least 40 marine fish species belonging to 21 families and 10 orders along the South American coast. Sealworms are a cause for concern because they can infect humans who consume raw or undercooked fish. However, despite their economic and zoonotic importance, morphological and molecular characterization of species of Pseudoterranova in South America is still scarce. Methods: A total of 542 individual fish from 20 species from the Patagonian coast of Argentina were examined for sealworms. The body cavity, the muscles, internal organs, and the mesenteries were examined to dete…

Species complex; Anisakidae; Molecular Sequence Data; Argentina; PSEUDOTERRANOVA CATTANI; Zoology; Sealworms; Helminth genetics; ANISAKIDAE; https://purl.org/becyt/ford/1; Ciencias Biológicas; Electron Transport Complex IV; Ascaridoidea; Animals; Cluster Analysis; Southwestern Atlantic; https://purl.org/becyt/ford/1.6; Pseudoterranova cattani; Mesenteries; Phylogeny; Taxonomy; Microscopy; Ecology; biology; Paralichthys; Marine fishes; Ecology; Research; Fishes; Animal Structures; Zoología Ornitología Entomología Etología; Sequence Analysis DNA; TAXONOMY; Biología Marina Limnología; DNA Helminth; Otaria flavescens; biology.organism_classification; Pseudoterranova decipiens; Ascaridida Infections; Anisakidae; Infectious Diseases; SEALWORMS; Larva; Parasitology; Taxonomy (biology); Cox1; CIENCIAS NATURALES Y EXACTAS; Parasites & Vectors
The age and evolution of sociality in Stegodyphus spiders: a molecular phylogenetic perspective

2006

Social, cooperative breeding behaviour is rare in spiders and generally characterized by inbreeding, skewed sex ratios and high rates of colony turnover, processes that when combined may reduce genetic variation and lower individual fitness quickly. On these grounds, social spider species have been suggested to be unstable in evolutionary time, and hence sociality a rare phenomenon in spiders. Based on a partial molecular phylogeny of the genus Stegodyphus, we address the hypothesis that social spiders in this genus are evolutionarily transient. We estimate the age of the three social species, test whether they represent an ancestral or derived state and assess diversification relative to s…

Species complex; genetic structures; Lineage (evolution); Molecular Sequence Data; General Biochemistry Genetics and Molecular Biology; Intraspecific competition; Sexual Behavior Animal; Species Specificity; Cooperative breeding; Animals; Cluster Analysis; Social Behavior; Sociality; Phylogeny; General Environmental Science; Stegodyphus; DNA Primers; Likelihood Functions; General Immunology and Microbiology; biology; Base Sequence; Models Genetic; Spiders; General Medicine; Sequence Analysis DNA; Anelosimus; biology.organism_classification; Evolutionary biology; General Agricultural and Biological Sciences; Social spider; Research Article
Nondestructive Direct Determination of Heroin in Seized Illicit Street Drugs by Diffuse Reflectance near-Infrared Spectroscopy

2008

A new method has been developed for the fast and nondestructive direct determination of heroin in seized illicit street drugs using partial least-squares regression analysis of diffuse reflectance near-infrared spectra. Data were obtained from untreated samples placed in standard glass chromatography vials. A heterogeneous population of 31 samples, previously analyzed by a reference method, was employed to build the calibration model and to provide a separate validation set. Based on the use of zero-order data for a calibration set of 21 samples, after standard normal variate and quadratic linear removed baseline correction (detrending), in the wavelength range from 1111 to 1647 nm, 8 PLS fac…

Spectroscopy Near-Infrared; Mean squared error; Illicit Drugs; Chemistry; Direct method; Street drugs; Near-infrared spectroscopy; Analytical chemistry; Reproducibility of Results; Residual; Analytical Chemistry; Heroin; Heterogeneous population; Calibration; Cluster Analysis; Diffuse reflection; Least-Squares Analysis; Analytical Chemistry
Lexical and sublexical units in speech perception.

2009

Saffran, Newport, and Aslin (1996a) found that human infants are sensitive to statistical regularities corresponding to lexical units when hearing an artificial spoken language. Two sorts of segmentation strategies have been proposed to account for this early word-segmentation ability: bracketing strategies, in which infants are assumed to insert boundaries into continuous speech, and clustering strategies, in which infants are assumed to group certain speech sequences together into units (Swingley, 2005). In the present study, we test the predictions of two computational models instantiating each of these strategies i.e., Serial Recurrent Networks: Elman, 1990; and Parser: Perruchet & Vint…

Speech perception; Parsing; business.industry; Cognitive Neuroscience; Speech recognition; Text segmentation; Experimental and Cognitive Psychology; computer.software_genre; Lexicon; Speech segmentation; Artificial Intelligence; [SCCO.PSYC]Cognitive science/Psychology; Lexico; Artificial intelligence; Cluster analysis; Psychology; business; computer; Natural language processing; ComputingMilieux_MISCELLANEOUS; computer.programming_language; Spoken language; Cognitive science
A branch-and-cut algorithm for the soft-clustered vehicle-routing problem

2021

The soft-clustered vehicle-routing problem is a variant of the classical capacitated vehicle-routing problem (CVRP) in which customers are partitioned into clusters and all customers of the same cluster must be served by the same vehicle. We introduce a novel symmetric formulation of the problem in which the clustering part is modeled with an asymmetric sub-model. We solve the new model with a branch-and-cut algorithm exploiting some known valid inequalities for the CVRP that can be adapted. In addition, we derive problem-specific cutting planes and new heuristic and exact separation procedures. For square grid instances in the Euclidean plane, we provide lower-bounding techniques …

Square tiling; Heuristic (computer science); Applied Mathematics; 0211 other engineering and technologies; 021107 urban & regional planning; 0102 computer and information sciences; 02 engineering and technology; 01 natural sciences; Travelling salesman problem; Reduction (complexity); 010201 computation theory & mathematics; Vehicle routing problem; Benchmark (computing); Discrete Mathematics and Combinatorics; Cluster analysis; Branch and cut; Algorithm; Mathematics; Discrete Applied Mathematics
Detection of spatial disease clusters with LISA functions.

2011

Detection of disease clusters is an important tool in epidemiology that can help to identify risk factors associated with the disease and to understand its etiology. In this article we propose a method for the detection of spatial clusters where the locations of a set of cases and a set of controls are available. The method is based on local indicators of spatial association functions (LISA functions), particularly on the development of a local version of the product density, which is a second-order characteristic of spatial point processes. The behavior of the method is evaluated and compared with Kulldorff's spatial scan statistic by means of a simulation study. It is shown that the LI…

Statistics and Probability; Adult; Male; Disease clusters; Epidemiology; Scan statistic; Irregular shape; Point process; Disease Outbreaks; Set (abstract data type); Statistics; Cluster Analysis; Humans; Computer Simulation; Sensitivity (control systems); Mathematics; Aged; Aged 80 and over; business.industry; Pattern recognition; Middle Aged; Spain; Data Interpretation Statistical; Spatial clustering; Female; Kidney Diseases; Artificial intelligence; business; Epidemiologic Methods; Type I and type II errors; Statistics in medicine
Using mathematical morphology for unsupervised classification of functional data

2011

This paper is concerned with the unsupervised classification of functional data by using mathematical morphology. Different morphological operators are used to extract relevant structures of the functions (considered as sets through their subgraph representations). These operators can be considered as preprocessing tools whose outputs are also functional data. We explore some dissimilarity measures and clustering methods for the classification of the transformed data. Our approach is illustrated through a detailed analysis of two data sets. These techniques, which have mainly been used in image processing, provide a flexible and robust toolbox for improving the results in unsupervised funct…

Statistics and Probability; Applied Mathematics; Data classification; Image processing; Mathematical morphology; computer.software_genre; Toolbox; ComputingMethodologies_PATTERNRECOGNITION; Modeling and Simulation; Preprocessor; Data mining; Statistics Probability and Uncertainty; Cluster analysis; Morphological operators; computer; Mathematics; Journal of Statistical Computation and Simulation
Cluster-Localized Sparse Logistic Regression for SNP Data

2012

The task of analyzing high-dimensional single nucleotide polymorphism (SNP) data in a case-control design using multivariable techniques has only recently been tackled. While many available approaches investigate only main effects in a high-dimensional setting, we propose a more flexible technique, cluster-localized regression (CLR), based on localized logistic regression models, that allows different SNPs to have an effect for different groups of individuals. Separate multivariable regression models are fitted for the different groups of individuals by incorporating weights into componentwise boosting, which provides simultaneous variable selection, hence sparse fits. For model fitting, th…

Statistics and Probability; Boosting (machine learning); Computer science; Multivariable calculus; Computational Biology; High-Throughput Nucleotide Sequencing; Feature selection; Regression analysis; Models Theoretical; Logistic regression; computer.software_genre; Polymorphism Single Nucleotide; Regression; Computational Mathematics; Logistic Models; Data Interpretation Statistical; Genetics; Cluster Analysis; Humans; Data mining; Cluster analysis; Molecular Biology; Unit-weighted regression; computer; Genome-Wide Association Study; Statistical Applications in Genetics and Molecular Biology
A fast and recursive algorithm for clustering large datasets with k-medians

2012

Clustering large samples of high-dimensional data with fast algorithms is an important challenge in computational statistics. Borrowing ideas from MacQueen (1967), who introduced a sequential version of the $k$-means algorithm, a new class of recursive stochastic gradient algorithms designed for the $k$-medians loss criterion is proposed. By their recursive nature, these algorithms are very fast and are well adapted to deal with large samples of data that are allowed to arrive sequentially. It is proved that the stochastic gradient algorithm converges almost surely to the set of stationary points of the underlying loss criterion. Particular attention is paid to the averaged versions, which…
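
The recursive idea in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `recursive_k_medians`, the step-size schedule `step0 / n_k ** alpha`, and the running-mean form of the averaged centers are all assumptions chosen for illustration. Each incoming point is assigned to its nearest center, which then moves a small distance along the unit vector pointing toward the point (a stochastic gradient step on the Euclidean-distance loss).

```python
import numpy as np

def recursive_k_medians(stream, init_centers, step0=1.0, alpha=0.75):
    """Hypothetical one-pass sketch of stochastic-gradient k-medians.

    step0 and alpha define an assumed step size step0 / n_k**alpha,
    where n_k counts the points assigned so far to center k.
    """
    centers = np.array(init_centers, dtype=float)
    averaged = centers.copy()            # running mean of each center's iterates
    counts = np.zeros(len(centers), dtype=int)
    for x in stream:
        x = np.asarray(x, dtype=float)
        dists = np.linalg.norm(centers - x, axis=1)
        k = int(np.argmin(dists))        # nearest center receives the update
        counts[k] += 1
        if dists[k] > 0.0:               # the gradient is undefined at x == center
            step = step0 / counts[k] ** alpha
            centers[k] += step * (x - centers[k]) / dists[k]
        # Averaged version: mean of the iterates produced so far for center k.
        averaged[k] += (centers[k] - averaged[k]) / counts[k]
    return centers, averaged
```

Because every update has length equal to the step size regardless of how far the point lies from the center, a single outlier cannot drag a center far, which is the usual motivation for a median-type criterion over a mean-type one.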

Statistics and Probability; Clustering high-dimensional data; FOS: Computer and information sciences; Mathematical optimization; high dimensional data; Machine Learning (stat.ML); 02 engineering and technology; Stochastic approximation; 01 natural sciences; Statistics - Computation; 010104 statistics & probability; k-medoids; Statistics - Machine Learning; [MATH.MATH-ST]Mathematics [math]/Statistics [math.ST]; stochastic approximation; 0202 electrical engineering electronic engineering information engineering; Computational statistics; recursive estimators; Almost surely; [ MATH.MATH-ST ] Mathematics [math]/Statistics [math.ST]; 0101 mathematics; Cluster analysis; Computation (stat.CO); Mathematics; averaging; k-medoids; Robbins Monro; Applied Mathematics; Estimator; [STAT.TH]Statistics [stat]/Statistics Theory [stat.TH]; stochastic gradient; [ STAT.TH ] Statistics [stat]/Statistics Theory [stat.TH]; Medoid; Computational Mathematics; Computational Theory and Mathematics; online clustering; 020201 artificial intelligence & image processing; partitioning around medoids; Algorithm