
RESEARCH PRODUCT

Shrinking language models by robust approximation

A.J. Buchsbaum, J.R. Westbrook, R. Giancarlo

subject

Theoretical computer science; Finite-state machine; Nested word; Computer science; Quantum finite automata; Automata theory; Language model; Algorithm; Natural language; Automaton

description

We study the problem of reducing the size of a language model while preserving recognition performance (accuracy and speed). A successful approach has been to represent language models as weighted finite-state automata (WFAs). Analogues of the classical determinization and minimization algorithms for automata then provide a general method for producing smaller but equivalent WFAs. We extend this approach by introducing the notion of approximate determinization. We provide an algorithm that, applied to language models for the North American Business task, achieves a 25-35% size reduction over previous techniques, with negligible effects on recognition time and accuracy.
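The trade-off described in the abstract can be illustrated with a toy sketch. This is not the paper's approximate-determinization algorithm; it is a hypothetical example showing the general idea of accepting small weight perturbations in exchange for a smaller machine, by merging states of a deterministic WFA whose outgoing transitions agree on labels and targets and whose weights differ by at most a tolerance `EPS` (a name chosen here for illustration).

```python
import copy

EPS = 0.1  # hypothetical weight tolerance for merging

# Deterministic WFA sketch: state -> {symbol: (next_state, weight)}
wfa = {
    "q0": {"a": ("q1", 0.50), "b": ("q2", 0.50)},
    "q1": {"c": ("q3", 1.00)},
    "q2": {"c": ("q3", 0.95)},   # nearly identical to q1
    "q3": {},
}

def mergeable(wfa, s, t, eps):
    """States are mergeable if they read the same symbols into the same
    target states, with weights within eps of each other."""
    ts, tt = wfa[s], wfa[t]
    if ts.keys() != tt.keys():
        return False
    return all(ts[a][0] == tt[a][0] and abs(ts[a][1] - tt[a][1]) <= eps
               for a in ts)

def approx_merge(wfa, eps):
    """Greedily merge eps-equivalent states, redirecting incoming edges
    of the dropped state to the kept one."""
    wfa = copy.deepcopy(wfa)
    states = list(wfa)
    for i, s in enumerate(states):
        for t in states[i + 1:]:
            if s in wfa and t in wfa and mergeable(wfa, s, t, eps):
                del wfa[t]  # keep s, drop t
                for trans in wfa.values():
                    for a, (dst, w) in list(trans.items()):
                        if dst == t:
                            trans[a] = (s, w)
    return wfa

small = approx_merge(wfa, eps=EPS)
print(len(small))  # q1 and q2 merge: 3 states instead of 4
```

Here the merged machine assigns weight 1.00 where the original assigned 0.95, a bounded distortion; the paper's contribution is an algorithm controlling this kind of approximation during determinization itself while preserving recognition accuracy.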

https://doi.org/10.1109/icassp.1998.675357