An Approximate Determinization Algorithm for Weighted Finite-State Automata
Adam L. Buchsbaum, Raffaele Giancarlo, Jeffery Westbrook
subject
TheoryofComputation_COMPUTATIONBYABSTRACTDEVICES; Finite-state machine; Theoretical computer science; General Computer Science; Computer science; Applied Mathematics; Computer Science Applications; Automaton; Nondeterministic algorithm; Nondeterministic finite automaton with ε-moves; Computer Science::Sound; Deterministic automaton; Theory of computation; Standard test; Minification; Algorithm; Computer Science::Formal Languages and Automata Theory
description
Nondeterministic weighted finite-state automata are a key abstraction in automatic speech recognition systems. The efficiency of automatic speech recognition depends directly on the sizes of these automata and the degree of nondeterminism present, so recent research has studied ways to determinize and minimize them, using analogues of classical automata determinization and minimization. Although, as we describe here, determinization can in the worst case cause poly-exponential blowup in the number of states of a weighted finite-state automaton, in practice it is remarkably successful. In extensive experiments in automatic speech recognition systems, deterministic weighted finite-state automata tend to be smaller than the corresponding nondeterministic inputs. Our observations show that these size reductions depend critically on the interplay between weights and topology in nondeterministic weighted finite-state automata. We exploit these observations to design a new approximate determinization algorithm, which produces a deterministic weighted finite-state automaton that preserves the strings of a weighted language but not necessarily their weights. We apply our algorithm to two different types of weighted finite-state automata that occur in automatic speech recognition systems and in each case provide extensive experimental results showing that, compared with current techniques, we achieve significant size reductions without affecting performance. In particular, for a standard test bed, we can reduce automatic speech recognition memory requirements by 25-35% with negligible effects on recognition time and accuracy.
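To make the idea of weighted determinization concrete, here is a minimal sketch of the textbook weighted subset construction over the tropical (min, +) semiring. It is an illustration only and is not the approximate algorithm of the paper, which this record does not spell out. The function names `determinize` and `weight`, the input encoding, and the `quantum` parameter are all assumptions made for this example. `quantum` is a hypothetical knob that rounds residual weights so that more subsets coincide; it shows one way an automaton can keep its strings while changing their weights. The loop is not guaranteed to terminate on cyclic automata that are not determinizable.

```python
from collections import deque


def determinize(start, finals, arcs, quantum=None):
    """Weighted subset construction over the tropical (min, +) semiring.

    start   : initial state of the nondeterministic automaton
    finals  : dict mapping state -> final weight
    arcs    : dict mapping state -> list of (label, weight, next_state)
    quantum : hypothetical knob (not from the paper); if set, residual
              weights are rounded to multiples of it, so accepted strings
              are kept but their weights may shift
    Returns (initial_subset, det_arcs, det_finals).
    """
    def canon(residuals):
        # A deterministic state is a set of (state, residual weight) pairs.
        if quantum is not None:
            residuals = {q: round(r / quantum) * quantum
                         for q, r in residuals.items()}
        return frozenset(residuals.items())

    init = canon({start: 0.0})
    det_arcs, det_finals = {}, {}
    queue, seen = deque([init]), {init}
    while queue:
        subset = queue.popleft()
        # For each label, keep the best (minimum) weight to each target.
        by_label = {}
        for q, r in subset:
            for label, w, nxt in arcs.get(q, []):
                targets = by_label.setdefault(label, {})
                targets[nxt] = min(targets.get(nxt, float("inf")), r + w)
        for label, targets in by_label.items():
            out_w = min(targets.values())  # weight emitted on the new arc
            dest = canon({q: w - out_w for q, w in targets.items()})
            det_arcs[(subset, label)] = (out_w, dest)
            if dest not in seen:
                seen.add(dest)
                queue.append(dest)
        fin = [r + finals[q] for q, r in subset if q in finals]
        if fin:
            det_finals[subset] = min(fin)
    return init, det_arcs, det_finals


def weight(det, word):
    """Weight the deterministic automaton assigns to word, or None."""
    state, det_arcs, det_finals = det
    total = 0.0
    for label in word:
        if (state, label) not in det_arcs:
            return None
        w, state = det_arcs[(state, label)]
        total += w
    return total + det_finals[state] if state in det_finals else None


# Toy input: two nondeterministic paths spell "ab" with weights 5 and 4.
arcs = {0: [("a", 1, 1), ("a", 3, 2)], 1: [("b", 4, 3)], 2: [("b", 1, 3)]}
finals = {3: 0}
exact = determinize(0, finals, arcs)
coarse = determinize(0, finals, arcs, quantum=10)
print(weight(exact, "ab"))   # 4.0, the minimum over both paths
print(weight(coarse, "ab"))  # still accepted, but with a different weight
```

In the exact run each deterministic state records how far every member state lags behind the best path, which is why weights and topology together decide how many distinct subsets arise. Coarsening those residuals, as the hypothetical `quantum` does, can only merge subsets, at the price of altered weights.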
year | journal |
---|---|
2001-10-01 | Algorithmica |