Hidden Markov Models in Supervised and Unsupervised Training
Church and DeRose brought Hidden Markov Models (HMMs), originally developed for speech recognition, into computational linguistics. An HMM is a probabilistic model that generates a sequence of states and a parallel sequence of output symbols. In language modeling the output symbols are words, so an output sequence represents a sentence of natural language. The automaton defined by an HMM has a finite number of distinct states. It starts by randomly choosing a state; it then emits a symbol, chooses a new state, and repeats the process. Each choice is stochastic: it is made at random according to a probability distribution over output symbols or over next states, and which distribution is used depends on the kind of choice being made and the state the automaton is currently in.
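To make the generative process concrete, here is a minimal Python sketch. The two states, the vocabulary, and all probabilities are made-up toy values, not anything from Church and DeRose; the point is only the alternation of emit and transition steps.

# Toy HMM: two part-of-speech-like states and a tiny vocabulary.
import random

START = {"DET": 0.6, "NOUN": 0.4}                      # initial state distribution
TRANS = {"DET": {"DET": 0.1, "NOUN": 0.9},             # P(next state | current state)
         "NOUN": {"DET": 0.5, "NOUN": 0.5}}
EMIT = {"DET": {"the": 0.7, "a": 0.3},                 # P(output symbol | state)
        "NOUN": {"dog": 0.5, "cat": 0.5}}

def sample(dist):
    """Draw one outcome from a {value: probability} distribution."""
    r, total = random.random(), 0.0
    for value, p in dist.items():
        total += p
        if r < total:
            return value
    return value  # guard against floating-point rounding

def generate(length):
    """Run the automaton: pick a start state, then alternate emitting and transitioning."""
    state = sample(START)
    states, symbols = [], []
    for _ in range(length):
        states.append(state)
        symbols.append(sample(EMIT[state]))
        state = sample(TRANS[state])
    return states, symbols

print(generate(4))  # e.g. (['DET', 'NOUN', 'DET', 'NOUN'], ['the', 'dog', 'a', 'cat'])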
The model itself is just a set of numeric values: the probabilities of choosing each particular transition or emission. Learning an HMM is straightforward if labeled data is available, that is, data that pairs state sequences with output sequences. To estimate the probability of a particular value for a given kind of stochastic choice, one counts how often that choice was made in the labeled data and takes the relative frequency.
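The counting estimate is simple enough to show directly. The sketch below uses invented labeled sentences and tag names; a real system would use a tagged corpus and would smooth the counts to handle events never seen in training.

# Supervised estimation by counting: relative frequencies of transitions and emissions.
from collections import Counter, defaultdict

labeled = [
    (["DET", "NOUN", "VERB"], ["the", "dog", "barks"]),
    (["DET", "NOUN", "VERB"], ["a", "cat", "sleeps"]),
]

trans_counts = defaultdict(Counter)   # state -> Counter of next states
emit_counts = defaultdict(Counter)    # state -> Counter of emitted words

for states, words in labeled:
    for state, word in zip(states, words):
        emit_counts[state][word] += 1
    for prev, nxt in zip(states, states[1:]):
        trans_counts[prev][nxt] += 1

def normalize(counter):
    """Turn raw counts into a probability distribution (relative frequencies)."""
    total = sum(counter.values())
    return {key: count / total for key, count in counter.items()}

trans = {s: normalize(c) for s, c in trans_counts.items()}
emit = {s: normalize(c) for s, c in emit_counts.items()}

print(emit["NOUN"])   # {'dog': 0.5, 'cat': 0.5}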
Church and DeRose applied HMMs to part-of-speech tagging by identifying the states of the automaton with parts of speech. The automaton generates a sequence of parts of speech and emits a word for each one, so the result is a tagged text in which every word is annotated with its part of speech. Supervised learning of an HMM for part-of-speech tagging works well: HMM taggers for English reach error rates of roughly 3.5 to 4 percent. It was the success of these models on part-of-speech tagging that first drew the attention of computational linguists to probabilistic models.
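The paragraph above describes the generative direction (tags produce words); to tag new text, a tagger runs the model in reverse and searches for the most probable tag sequence given the words. The standard way to do this is Viterbi decoding. The sketch below is my own minimal illustration, reusing the toy probability tables from the earlier examples, and is not a description of Church and DeRose's exact implementation; among other things, a real tagger needs smoothing for unseen words.

# Viterbi decoding: find the most probable tag sequence for a word sequence.
STATES = ["DET", "NOUN"]
START = {"DET": 0.6, "NOUN": 0.4}
TRANS = {"DET": {"DET": 0.1, "NOUN": 0.9}, "NOUN": {"DET": 0.5, "NOUN": 0.5}}
EMIT = {"DET": {"the": 0.7, "a": 0.3}, "NOUN": {"dog": 0.5, "cat": 0.5}}

def viterbi(words):
    # best[i][s] = probability of the best tag sequence for words[:i+1] that ends in s
    best = [{s: START[s] * EMIT[s].get(words[0], 0.0) for s in STATES}]
    back = [{}]
    for i in range(1, len(words)):
        best.append({})
        back.append({})
        for s in STATES:
            prob, prev = max(
                (best[i - 1][p] * TRANS[p].get(s, 0.0) * EMIT[s].get(words[i], 0.0), p)
                for p in STATES
            )
            best[i][s], back[i][s] = prob, prev
    # Trace back from the most probable final state.
    state = max(best[-1], key=best[-1].get)
    tags = [state]
    for i in range(len(words) - 1, 0, -1):
        state = back[i][state]
        tags.append(state)
    return list(reversed(tags))

print(viterbi(["the", "dog"]))  # ['DET', 'NOUN']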