Language Modeling: A detailed look at stochastic language models
This essay provides a detailed look at stochastic language models and their alternatives. First I explain the problem that language models seek to solve in the field of speech recognition, then I outline the various approaches taken. Before doing so, I should mention that by the time the language model becomes useful, a speech recognizer will already have endeavored to identify the phonemes (atomic sound units) present in an utterance. It will most likely have done this by matching them against Gaussian models of sound contained in Hidden Markov Models (HMMs) (see [Cassidy, 2002], pages 63 to 68, for a fuller explanation of HMMs).
Ever since [Baker, 1975] first proposed the use of network representations for speech recognition in the form of HMMs, they have come to dominate the approaches used by the speech recognition community. The language model provides the structure that allows decisions to be made (often based upon probabilities, again using HMMs) as to how phonemes may be combined into words and sentences. This is, of course, a classic search problem, well explored in artificial intelligence circles.
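This kind of probabilistic search over an HMM is commonly carried out with the Viterbi algorithm, which finds the single most likely hidden state sequence (e.g. phonemes) for a sequence of acoustic observations. The following is a minimal sketch only; the state names and probability tables in any usage are toy values of my own, not drawn from this essay:

```python
def viterbi(obs, states, start_p, trans_p, emit_p):
    """Return the most probable hidden state path and its probability.

    obs     -- list of observation symbols
    states  -- list of hidden state names (e.g. phonemes)
    start_p -- start_p[s]: probability of starting in state s
    trans_p -- trans_p[p][s]: probability of moving from state p to s
    emit_p  -- emit_p[s][o]: probability of state s emitting observation o
    """
    # best[t][s]: probability of the best path ending in state s at time t
    best = [{s: start_p[s] * emit_p[s][obs[0]] for s in states}]
    back = [{}]  # back[t][s]: predecessor state on that best path
    for t in range(1, len(obs)):
        best.append({})
        back.append({})
        for s in states:
            prob, prev = max(
                (best[t - 1][p] * trans_p[p][s] * emit_p[s][obs[t]], p)
                for p in states
            )
            best[t][s] = prob
            back[t][s] = prev
    # Pick the best final state, then trace back to recover the full path.
    prob, last = max((best[-1][s], s) for s in states)
    path = [last]
    for t in range(len(obs) - 1, 0, -1):
        path.insert(0, back[t][path[0]])
    return path, prob
```

In a real recognizer the "observations" are acoustic feature vectors scored by Gaussian models, and the transition structure is supplied by the language model; the dynamic-programming skeleton, however, is the same.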
2 Non-stochastic Models
At this time most speech applications use non-stochastic language models. Currently, most applications are task oriented, and the range of utterances allowed at each stage in a dialogue is limited, so limited that all the alternatives can be captured in a deterministic grammar. A simple binary response might be captured with a VoiceXML grammar by the following:

<grammar version="1.0" xmlns="http://www.w3.org/2001/06/grammar" root="yesno">
  <rule id="yesno">
    <one-of>
      <item>Yes</item>
      <item>No</item>
    </one-of>
  </rule>
</grammar>

or, in the alternative grammar format, ABNF, as:

Yes {true} | No {false}
.
.
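The effect of such a deterministic grammar is that the recognizer accepts only the utterances the grammar lists, and each accepted utterance maps to a semantic value, as the {true} and {false} tags in the ABNF rule above suggest. A minimal sketch of that interpretation step (the function and dictionary names are my own illustration, not part of VoiceXML):

```python
# Utterance -> semantic tag, mirroring "Yes {true} | No {false}".
YES_NO_GRAMMAR = {"yes": True, "no": False}

def interpret(utterance):
    """Return the semantic value for an in-grammar utterance,
    or None when the utterance is out of grammar (a no-match)."""
    return YES_NO_GRAMMAR.get(utterance.strip().lower())
```

So interpret("Yes") yields True, while an out-of-grammar utterance such as "maybe" yields None, which a dialogue system would typically handle by reprompting the caller.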
3 Stochastic Models .