RESEARCH PRODUCT

Usage of HMM-Based Speech Recognition Methods for Automated Determination of a Similarity Level Between Languages

Ansis Ataols Bērziņš

subject

Space (punctuation); Kullback–Leibler divergence; Language identification; Similarity (network science); Computer science; Speech recognition; Computer Science::Computation and Language (Computational Linguistics and Natural Language and Speech Processing); Hidden Markov model; USable; Divergence (statistics)

description

The problem of automatically determining how similar two languages are (or even of defining a distance on the space of languages) can be approached in different ways: by working with phonetic transcriptions, with speech recordings, or with both. For recordings, we propose and test an HMM-based approach. In the first part of the article we successfully apply it to language detection; we then try to calculate distances between the HMM-based models using different metrics and divergences. The Kullback-Leibler divergence is the only one that gave good results, in the sense that the calculated distances between languages correspond to the analytical understanding of the similarity between them. Although it does not work very well, we conclude that the method is usable, but that some other methods may be a more rational choice.
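As an illustration of what a Kullback-Leibler divergence between two HMM-based models can look like in practice, the following is a minimal sketch, not the procedure used in the article (the abstract does not specify it). It assumes toy HMMs with discrete emissions, whereas models trained on speech recordings would normally use continuous acoustic features, and it estimates the divergence rate by Monte Carlo sampling because no closed form exists for HMMs in general. All names (`log_likelihood`, `sample`, `kl_estimate`, `symmetric_kl`, `lang_a`, `lang_b`) and all parameter values are hypothetical.

```python
import numpy as np


def log_likelihood(obs, start, trans, emit):
    """Scaled forward algorithm: log P(obs | HMM) for discrete emissions."""
    alpha = start * emit[:, obs[0]]
    scale = alpha.sum()
    log_p = np.log(scale)
    alpha = alpha / scale
    for symbol in obs[1:]:
        # Propagate state probabilities one step, then weight by the emission.
        alpha = (alpha @ trans) * emit[:, symbol]
        scale = alpha.sum()
        log_p += np.log(scale)
        alpha = alpha / scale
    return log_p


def sample(start, trans, emit, length, rng):
    """Draw one observation sequence of the given length from an HMM."""
    state = rng.choice(len(start), p=start)
    obs = []
    for _ in range(length):
        obs.append(rng.choice(emit.shape[1], p=emit[state]))
        state = rng.choice(len(start), p=trans[state])
    return obs


def kl_estimate(p, q, n_seq=200, length=50, seed=0):
    """Monte Carlo estimate of the per-symbol KL divergence from p to q."""
    rng = np.random.default_rng(seed)
    total = 0.0
    for _ in range(n_seq):
        obs = sample(*p, length, rng)
        total += log_likelihood(obs, *p) - log_likelihood(obs, *q)
    return total / (n_seq * length)


def symmetric_kl(p, q, **kwargs):
    """Symmetrized variant, usable as a rough dissimilarity score."""
    return 0.5 * (kl_estimate(p, q, **kwargs) + kl_estimate(q, p, **kwargs))


# Hypothetical two-state, three-symbol models standing in for two languages:
# (start probabilities, transition matrix, emission matrix).
lang_a = (
    np.array([0.6, 0.4]),
    np.array([[0.7, 0.3], [0.4, 0.6]]),
    np.array([[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]]),
)
lang_b = (
    np.array([0.5, 0.5]),
    np.array([[0.2, 0.8], [0.9, 0.1]]),
    np.array([[0.1, 0.2, 0.7], [0.6, 0.3, 0.1]]),
)

if __name__ == "__main__":
    print("KL(a || b) per symbol:", kl_estimate(lang_a, lang_b))
    print("symmetric KL:", symmetric_kl(lang_a, lang_b))
```

The Kullback-Leibler divergence is not symmetric and does not satisfy the triangle inequality, so even the symmetrized score above is a dissimilarity measure rather than a metric in the strict sense.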

https://doi.org/10.1007/978-3-030-34518-1_8