6533b86cfe1ef96bd12c8bb6

RESEARCH PRODUCT

Mixture Hidden Markov Models for Sequence Data: The seqHMM Package in R

Jouni HelskeSatu Helske

subject

FOS: Computer and information sciencesStatistics and ProbabilityMultivariate statisticssequence analysisaikasarjatComputer sciencerMarkov modelStatistics - ComputationStatistics - Applications01 natural sciencesUnobservablecategorical time seriesR-kieli010104 statistics & probabilitymulti-channel sequences; categorical time series; visualizing sequence data; visualizing models; latent Markov models; latent class models; RCovariateApplications (stat.AP)Sannolikhetsteori och statistikComputer software0101 mathematicsTime seriesProbability Theory and StatisticsHidden Markov modelCluster analysislcsh:Statisticslcsh:HA1-4737Categorical variableComputation (stat.CO)ta112business.industryvisualizing sequence dataR (programming languages)Pattern recognitionmulti-channel sequencesvisualizing modelslatent class modelssekvenssianalyysiArtificial intelligencelatent markov modelstime seriesStatistics Probability and UncertaintybusinessSoftware

description

Sequence analysis is being more and more widely used for the analysis of social sequences and other multivariate categorical time series data. However, it is often complex to describe, visualize, and compare large sequence data, especially when there are multiple parallel sequences per subject. Hidden (latent) Markov models (HMMs) are able to detect underlying latent structures and they can be used in various longitudinal settings: to account for measurement error, to detect unobservable states, or to compress information across several types of observations. Extending to mixture hidden Markov models (MHMMs) allows clustering data into homogeneous subsets, with or without external covariates. The seqHMM package in R is designed for the efficient modeling of sequences and other categorical time series data containing one or multiple subjects with one or multiple interdependent sequences using HMMs and MHMMs. Also other restricted variants of the MHMM can be fitted, e.g., latent class models, Markov models, mixture Markov models, or even ordinary multinomial regression models with suitable parameterization of the HMM. Good graphical presentations of data and models are useful during the whole analysis process from the first glimpse at the data to model fitting and presentation of results. The package provides easy options for plotting parallel sequence data, and proposes visualizing HMMs as directed graphs.

https://doi.org/10.18637/jss.v088.i03