RESEARCH PRODUCT

Estimating finite mixtures of semi-Markov chains: an application to the segmentation of temporal sensory data

Hervé Cardot, Pascal Schlich, Guillaume Lecuelle, Michel Visalli

subject

future; Statistics and Probability; FOS: Computer and information sciences; Gamma distribution; mice; Computer science; Population; dominance; Statistics - Applications; 01 natural sciences; Methodology (stat.ME); models; Expectation–maximization algorithm; Model-based clustering; 010104 statistics & probability; 0404 agricultural biotechnology; [MATH.MATH-ST] Mathematics [math]/Statistics [math.ST]; Bayesian information criterion; Perception; Applications (stat.AP); Temporal dominance of sensations; [MATH] Mathematics [math]; 0101 mathematics; Statistics - Methodology; 2. Zero hunger; Markov chain; Markov renewal process; Statistical model; 04 agricultural and veterinary sciences; Identifiability; Mixture model; 040401 food science; [MATH.MATH-PR] Mathematics [math]/Probability [math.PR]; Penalized likelihood; Data mining; Statistics, Probability and Uncertainty; tds; Categorical time series; sensations

description

In food science, information about how foods are perceived over time is valuable for creating new products, modifying existing ones and, more generally, understanding the mechanisms of perception. Temporal dominance of sensations (TDS) is a technique for measuring temporal perception in which attributes describing a food product are chosen sequentially during tasting. This work introduces new statistical models, based on finite mixtures of semi-Markov chains, to describe data collected with the TDS protocol; the mixture allows different temporal perceptions of the same product within a population. The identifiability of the parameters of such mixture models is discussed. Sojourn times are modelled with gamma distributions, and a penalty is added to the log-likelihood to ensure that the expectation–maximization algorithm converges to a non-degenerate solution. Information criteria are used to determine the number of mixture components. The individual qualitative trajectories are then clustered by maximum a posteriori probability. A simulation study confirms the good behaviour of the proposed estimation procedure. The methodology is illustrated on consumers’ perception of a Gouda cheese, where it provides evidence of several distinct perception behaviours for this product.
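To make the modelling idea concrete, the following is a minimal Python sketch of a two-component mixture of semi-Markov chains with gamma sojourn times, together with the maximum a posteriori assignment of a trajectory to a component. It is an illustration under stated assumptions, not the authors' implementation: the attribute names, all parameter values, and the helper names (`simulate`, `log_likelihood`, `map_cluster`) are hypothetical, the last sojourn time is treated as fully observed, and parameters are taken as known rather than estimated with the penalized expectation–maximization algorithm.

```python
import math
import random

def gamma_logpdf(t, shape, scale):
    """Log-density of a gamma distribution with the given shape and scale."""
    return ((shape - 1.0) * math.log(t) - t / scale
            - math.lgamma(shape) - shape * math.log(scale))

def make_component(init, trans, shape, scale):
    """Bundle one semi-Markov chain; every state shares (shape, scale) here."""
    return {"init": init, "trans": trans,
            "sojourn": {s: (shape, scale) for s in init}}

# Hypothetical attributes and parameters (not taken from the paper).
# Transition rows exclude the current state: no self-transitions.
COMPONENTS = [
    make_component(
        {"Salty": 0.6, "Fatty": 0.3, "Milky": 0.1},
        {"Salty": {"Fatty": 0.7, "Milky": 0.3},
         "Fatty": {"Salty": 0.5, "Milky": 0.5},
         "Milky": {"Salty": 0.5, "Fatty": 0.5}},
        shape=2.0, scale=1.0),   # short sojourn times (mean 2)
    make_component(
        {"Salty": 0.2, "Fatty": 0.3, "Milky": 0.5},
        {"Salty": {"Fatty": 0.4, "Milky": 0.6},
         "Fatty": {"Salty": 0.5, "Milky": 0.5},
         "Milky": {"Salty": 0.5, "Fatty": 0.5}},
        shape=2.0, scale=5.0),   # long sojourn times (mean 10)
]
WEIGHTS = [0.5, 0.5]  # mixing proportions, assumed known

def simulate(component, n_steps, rng):
    """Draw one trajectory [(state, sojourn_time), ...] from one component."""
    init = component["init"]
    state = rng.choices(list(init), weights=list(init.values()))[0]
    path = []
    for _ in range(n_steps):
        shape, scale = component["sojourn"][state]
        path.append((state, rng.gammavariate(shape, scale)))
        row = component["trans"][state]
        state = rng.choices(list(row), weights=list(row.values()))[0]
    return path

def log_likelihood(path, component):
    """Log-likelihood of a fully observed trajectory under one component."""
    ll = math.log(component["init"][path[0][0]])
    for k, (state, duration) in enumerate(path):
        ll += gamma_logpdf(duration, *component["sojourn"][state])
        if k + 1 < len(path):
            ll += math.log(component["trans"][state][path[k + 1][0]])
    return ll

def map_cluster(path, components, weights):
    """Return posterior component probabilities and the MAP label."""
    logs = [math.log(w) + log_likelihood(path, c)
            for w, c in zip(weights, components)]
    top = max(logs)                          # subtract max for stability
    post = [math.exp(v - top) for v in logs]
    total = sum(post)
    post = [p / total for p in post]
    return post, max(range(len(post)), key=post.__getitem__)
```

For example, `map_cluster(simulate(COMPONENTS[0], 5, random.Random(0)), COMPONENTS, WEIGHTS)` returns the posterior probabilities of the two components for one simulated trajectory and the index of the most probable one. In an actual fit, these posterior probabilities would be the quantities computed in the E-step, with the component parameters and mixing proportions re-estimated in the M-step.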

https://dx.doi.org/10.48550/arxiv.1806.04420