An Approximate Determinization Algorithm for Weighted Finite-State Automata

RESEARCH PRODUCT

Adam L. Buchsbaum, Raffaele Giancarlo, Jeffery Westbrook

subject

TheoryofComputation_COMPUTATIONBYABSTRACTDEVICES; Finite-state machine; Theoretical computer science; General Computer Science; Computer science; Applied Mathematics; Computer Science Applications; Automaton; Nondeterministic algorithm; Nondeterministic finite automaton with ε-moves; Computer Science::Sound; Deterministic automaton; Theory of computation; Standard test; Minification; Algorithm; Computer Science::Formal Languages and Automata Theory

description

Nondeterministic weighted finite-state automata are a key abstraction in automatic speech recognition systems. The efficiency of automatic speech recognition depends directly on the sizes of these automata and the degree of nondeterminism present, so recent research has studied ways to determinize and minimize them, using analogues of classical automata determinization and minimization. Although, as we describe here, determinization can in the worst case cause poly-exponential blowup in the number of states of a weighted finite-state automaton, in practice it is remarkably successful. In extensive experiments in automatic speech recognition systems, deterministic weighted finite-state automata tend to be smaller than the corresponding nondeterministic inputs. Our observations show that these size reductions depend critically on the interplay between weights and topology in nondeterministic weighted finite-state automata. We exploit these observations to design a new approximate determinization algorithm, which produces a deterministic weighted finite-state automaton that preserves the strings of a weighted language but not necessarily their weights. We apply our algorithm to two different types of weighted finite-state automata that occur in automatic speech recognition systems and in each case provide extensive experimental results showing that, compared with current techniques, we achieve significant size reductions without affecting performance. In particular, for a standard test bed, we can reduce automatic speech recognition memory requirements by 25 to 35 percent with negligible effects on recognition time and accuracy.

year	journal	country	edition	language
2001-10-01	Algorithmica

https://doi.org/10.1007/s00453-001-0026-6