Search results for "artificial intelligence"

showing 10 items of 6122 documents

A fast and recursive algorithm for clustering large datasets with k-medians

2012

Clustering large samples of high-dimensional data with fast algorithms is an important challenge in computational statistics. Borrowing ideas from MacQueen (1967), who introduced a sequential version of the $k$-means algorithm, a new class of recursive stochastic gradient algorithms designed for the $k$-medians loss criterion is proposed. By their recursive nature, these algorithms are very fast and are well adapted to deal with large samples of data that are allowed to arrive sequentially. It is proved that the stochastic gradient algorithm converges almost surely to the set of stationary points of the underlying loss criterion. Particular attention is paid to the averaged versions, which…

Statistics and Probability; Clustering high-dimensional data; FOS: Computer and information sciences; Mathematical optimization; high dimensional data; Machine Learning (stat.ML); 02 engineering and technology; Stochastic approximation; 01 natural sciences; Statistics - Computation; 010104 statistics & probability; k-medoids; Statistics - Machine Learning; [MATH.MATH-ST] Mathematics [math]/Statistics [math.ST]; 0202 electrical engineering, electronic engineering, information engineering; Computational statistics; recursive estimators; Almost surely; 0101 mathematics; Cluster analysis; Computation (stat.CO); Mathematics; averaging; Robbins Monro; Applied Mathematics; Estimator; [STAT.TH] Statistics [stat]/Statistics Theory [stat.TH]; stochastic gradient; Medoid; Computational Mathematics; Computational Theory and Mathematics; online clustering; 020201 artificial intelligence & image processing; partitioning around medoids; Algorithm

Online Principal Component Analysis in High Dimension: Which Algorithm to Choose?

2017

Principal component analysis (PCA) is a method of choice for dimension reduction. In the current context of data explosion, online techniques that do not require storing all data in memory are indispensable to perform the PCA of streaming data and/or massive data. Despite the wide availability of recursive algorithms that can efficiently update the PCA when new data are observed, the literature offers little guidance on how to select a suitable algorithm for a given application. This paper reviews the main approaches to online PCA, namely, perturbation techniques, incremental methods and stochastic optimisation, and compares the most widely employed techniques in terms of statistical a…

Statistics and Probability; Computer science; Computation; Dimensionality reduction; Incremental methods; 02 engineering and technology; Missing data; 01 natural sciences; 010104 statistics & probability; Data explosion; Streaming data; Principal component analysis; 0202 electrical engineering, electronic engineering, information engineering; 020201 artificial intelligence & image processing; 0101 mathematics; Statistics, Probability and Uncertainty; Algorithm; Eigendecomposition of a matrix; International Statistical Review

Blind Source Separation Based on Joint Diagonalization in R: The Packages JADE and BSSasymp

2017

Blind source separation (BSS) is a well-known signal processing tool which is used to solve practical data analysis problems in various fields of science. In BSS, we assume that the observed data consists of linear mixtures of latent variables. The mixing system and the distributions of the latent variables are unknown. The aim is to find an estimate of an unmixing matrix which then transforms the observed data back to latent sources. In this paper we present the R packages JADE and BSSasymp. The package JADE offers several BSS methods which are based on joint diagonalization. Package BSSasymp contains functions for computing the asymptotic covariance matrices as well as their data-based es…

Statistics and Probability; Computer science; JADE (programming language); 02 engineering and technology; Latent variable; Machine learning; computer.software_genre; 01 natural sciences; Blind signal separation; 010104 statistics & probability; Matrix (mathematics); nonstationary source separation; Mixing (mathematics); 0202 electrical engineering, electronic engineering, information engineering; second order source separation; 0101 mathematics; lcsh:Statistics; lcsh:HA1-4737; computer.programming_language; ta113; Signal processing; ta112; multivariate time series; mathematics; business.industry; Estimator; 020206 networking & telecommunications; independent component analysis; performance indices; statistics; Artificial intelligence; Statistics, Probability and Uncertainty; business; computer; Algorithm; Software; Journal of Statistical Software

Algorithms and tools for protein-protein interaction networks clustering, with a special focus on population-based stochastic methods

2014

Motivation: Protein–protein interaction (PPI) networks are powerful models to represent the pairwise protein interactions of organisms. Clustering PPI networks can be useful for isolating groups of interacting proteins that participate in the same biological processes or that together perform specific biological functions. Evolutionary orthologies can be inferred this way, as well as functions and properties of yet uncharacterized proteins. Results: We present an overview of the main state-of-the-art clustering methods that have been applied to PPI networks over the past decade. We distinguish five specific categories of approaches, describe and compare their main features and …

Statistics and Probability; Computer science; Population; Population based; Machine learning; computer.software_genre; Biochemistry; Protein protein interaction network; genetic algorithms; Protein–protein interaction; Bioinformatics Clustering Biological Networks; PPI networks; complex detection; Protein Interaction Mapping; Animals; Cluster Analysis; Humans; education; Molecular Biology; Topology (chemistry); Class (computer programming); education.field_of_study; business.industry; food and beverages; Proteins; Computer Science Applications; Computational Mathematics; Computational Theory and Mathematics; Artificial intelligence; Data mining; business; Focus (optics); computer; Algorithms

Anthropometry: An R Package for Analysis of Anthropometric Data

2017

The development of powerful new 3D scanning techniques has enabled the generation of large up-to-date anthropometric databases which provide highly valued data to improve the ergonomic design of products adapted to the user population. As a consequence, Ergonomics and Anthropometry are two increasingly quantitative fields, so advanced statistical methodologies and modern software tools are required to get the maximum benefit from anthropometric data. This paper presents a new R package, called Anthropometry, which is available on the Comprehensive R Archive Network. It brings together some statistical methodologies concerning clustering, statistical shape analysis, statistical archetypal an…

Statistics and Probability; Computer science; Population; statistical shape analysis; 02 engineering and technology; computer.software_genre; 01 natural sciences; 010104 statistics & probability; Software; 0202 electrical engineering, electronic engineering, information engineering; R; anthropometric data; clustering; archetypal analysis; data depth; 0101 mathematics; Cluster analysis; education; lcsh:Statistics; lcsh:HA1-4737; education.field_of_study; business.industry; Human factors and ergonomics; Anthropometry; Vignette; 020201 artificial intelligence & image processing; Data mining; Statistics, Probability and Uncertainty; business; computer; Journal of Statistical Software

Overall Objective Priors

2015

In multi-parameter models, reference priors typically depend on the parameter or quantity of interest, and it is well known that this is necessary to produce objective posterior distributions with optimal properties. There are, however, many situations where one is simultaneously interested in all the parameters of the model or, more realistically, in functions of them that include aspects such as prediction, and it would then be useful to have a single objective prior that could safely be used to produce reasonable posterior inferences for all the quantities of interest. In this paper, we consider three methods for selecting a single objective prior and study, in a variety of problems incl…

Statistics and Probability; Computer science; business.industry; Applied Mathematics; Mathematics - Statistics Theory; Statistics Theory (math.ST); Joint Reference Prior; Reference Analysis; Machine learning; computer.software_genre; Logarithmic Divergence; Objective Priors; Variety (cybernetics); Single objective; Multinomial Model; Prior probability; FOS: Mathematics; Multinomial distribution; Artificial intelligence; business; computer; Bayesian Analysis

Sequential Monte Carlo methods in Bayesian joint models for longitudinal and time-to-event data

2020

The statistical analysis of the information generated by medical follow-up is a very important challenge in the field of personalized medicine. As the evolutionary course of a patient's disease progresses, his/her medical follow-up generates more and more information that should be processed immediately in order to review and update his/her prognosis and treatment. Hence, we focus on this update process through sequential inference methods for joint models of longitudinal and time-to-event data from a Bayesian perspective. More specifically, we propose the use of sequential Monte Carlo (SMC) methods for static parameter joint models with the intention of reducing computational time in each…

Statistics and Probability; Computer science; business.industry; Bayesian probability; Sequential monte carlo methods; Machine learning; computer.software_genre; 01 natural sciences; Field (computer science); 010104 statistics & probability; 03 medical and health sciences; 0302 clinical medicine; Event data; 030220 oncology & carcinogenesis; Statistical analysis; Personalized medicine; Artificial intelligence; 0101 mathematics; Statistics, Probability and Uncertainty; business; Joint (audio engineering); Cartography; computer; Statistical Modelling

A review of second‐order blind identification methods

2021

Second-order source separation (SOS) is a data analysis tool which can be used for revealing hidden structures in multivariate time series data or as a tool for dimension reduction. Such methods are nowadays increasingly important as more and more high-dimensional multivariate time series data are measured in numerous fields of applied science. Dimension reduction is crucial, as modeling such high-dimensional data with multivariate time series models is often impractical as the number of parameters describing dependencies between the component time series is usually too high. SOS methods have their roots in the signal processing literature, where they were first used to separate source sign…

Statistics and Probability; Computer science; business.industry; Dimensionality reduction; Second order blind identification; Pattern recognition; Artificial intelligence; business; Blind signal separation; WIREs Computational Statistics

Archetypoids: A new approach to define representative archetypal data

2015

The new concept of archetypoids is introduced. Archetypoid analysis represents each observation in a dataset as a mixture of actual observations in the dataset, which are pure types, or archetypoids. Unlike archetype analysis, archetypoids are real observations, not a mixture of observations. This is relevant when existing archetypal observations are needed, rather than fictitious ones. An algorithm is proposed to find them and some of their theoretical properties are introduced. It is also shown how they can be obtained when only dissimilarities between observations are known (features are unavailable). Archetypoid analysis is illustrated in two design problems and several examples, compar…

Statistics and Probability; Convex hull; Archetype; business.industry; Applied Mathematics; Non-negative matrix factorization; Extremal point; Type (model theory); Unsupervised learning; Computational Mathematics; Computational Theory and Mathematics; Artificial intelligence; business; Mathematics

Testing abnormality in the spatial arrangement of cells in the corneal endothelium using spatial point processes

2001

The study of central corneal endothelium morphology is important in Ophthalmology. Some of the pathologies that could compromise endothelial cell morphology are trauma, cataract, surgery, use of contact lenses, corneal dystrophies or degenerations. The quantitative analysis of cell shape and cellular pattern is more sensitive in detecting subtle changes in endothelial morphology than cell density measurement or cell area analysis. In this paper, the morphology of the central cornea, the most important area from the point of view of vision, is studied through an associated bivariate spatial point pattern: the centroids of the cells and the triple points, that is, the points where three diffe…

Statistics and Probability; Corneal endothelium; Epidemiology; business.industry; Centroid; Pattern recognition; Bivariate analysis; Nearest neighbour distribution; Biology; Point process; medicine.anatomical_structure; Cornea; medicine; Artificial intelligence; Abnormality; business; Cell shape; Statistics in Medicine