Houjun Liu

Hidden Markov Model

  1. Draw an initial state \(q_1\) from the initial state distribution \(\pi\)
  2. For each time step \(t\), while in state \(q_{i}\)… (sketched in code after this list)
    1. Draw an observation \(o_{t}\) according to the emission distribution of state \(q_{i}\)
    2. Use transition probability \(a_{i,j}\) to draw the next state \(q_{j}\)
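
A minimal NumPy sketch of this generative loop, assuming a discrete emission matrix \(B\) (the symbol \(B\) and the fixed length `T` are my additions; the note itself only names \(\pi\) and \(a_{i,j}\)):

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_hmm(pi, A, B, T):
    """Sample (states, observations) of length T from a discrete HMM.

    pi: (N,)   initial state distribution
    A:  (N, N) transitions, A[i, j] = a_{i,j} = P(q_{t+1}=j | q_t=i)
    B:  (N, M) emissions,   B[i, k] = P(o_t=k | q_t=i)  (assumed discrete)
    """
    states, obs = [], []
    q = rng.choice(len(pi), p=pi)                    # step 1: q_1 ~ pi
    for _ in range(T):
        states.append(q)
        obs.append(rng.choice(B.shape[1], p=B[q]))   # step 2.1: emit o_t from state q
        q = rng.choice(A.shape[1], p=A[q])           # step 2.2: next state via a_{i,j}
    return states, obs
```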

Isolated recognition: train a family of HMMs, one per word in the vocabulary. Then, given new data, extract features and score each word's HMM on them; the best-scoring model wins.
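
A sketch of that recipe, where `word_models` and `score` are hypothetical names: `score` stands for any routine returning \(\log P(O|\lambda)\), such as the forward algorithm under scoring below:

```python
def recognize_isolated(word_models, features, score):
    """Return the word whose HMM assigns the features the highest score.

    word_models: dict mapping word -> trained HMM parameters lambda
    score:       callable (lambda, features) -> log P(O | lambda)
    """
    return max(word_models, key=lambda w: score(word_models[w], features))
```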

components of HMMs

scoring

Given an observation sequence \(o_1, \ldots, o_{T}\) and a model, we compute \(P(O|\lambda)\): the probability of the sequence given the model \(\lambda\).

“forward algorithm” (the forward half of the forward-backward procedure)
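
A minimal sketch of the forward pass, same discrete-emission notation as above; a real implementation would scale or work in log-space to avoid underflow:

```python
import numpy as np

def forward_score(pi, A, B, obs):
    """P(O | lambda) via the forward algorithm.

    alpha[i] holds P(o_1, ..., o_t, q_t = i | lambda) as t sweeps left to
    right; obs is a sequence of integer symbol indices.
    """
    alpha = pi * B[:, obs[0]]
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]   # sum over previous states, then emit o
    return alpha.sum()                  # marginalize out the final state
```

For isolated recognition, `forward_score` can serve as the `score` callable sketched earlier.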

decoding

Given observations, find the state sequence \(q_1, \ldots, q_{T}\) most likely to have generated them; this is solved by the Viterbi algorithm.
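
A Viterbi sketch, same notation as above (working in log-space is my choice, to keep long sequences from underflowing):

```python
import numpy as np

def viterbi(pi, A, B, obs):
    """Most likely state sequence for obs, by dynamic programming.

    delta[i] = best log-probability of any state path that ends in state i
    and explains the observations seen so far.
    """
    logA, logB = np.log(A), np.log(B)
    T = len(obs)
    delta = np.log(pi) + logB[:, obs[0]]
    psi = np.zeros((T, len(pi)), dtype=int)     # backpointers
    for t in range(1, T):
        scores = delta[:, None] + logA          # scores[i, j]: j reached from i
        psi[t] = scores.argmax(axis=0)
        delta = scores.max(axis=0) + logB[:, obs[t]]
    path = [int(delta.argmax())]
    for t in range(T - 1, 0, -1):               # follow backpointers from the end
        path.append(int(psi[t][path[-1]]))
    return path[::-1]
```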

training

Given observations \(O\), find the model parameters \(\lambda\) that maximize \(P(O|\lambda)\): Maximum Likelihood Parameter Learning, done in practice with the Baum-Welch (EM) algorithm.
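
A sketch of one Baum-Welch update on a single sequence, using the standard re-estimation formulas; real systems scale the passes (or use log-space) and pool counts across many sequences:

```python
import numpy as np

def baum_welch_step(pi, A, B, obs):
    """One EM (Baum-Welch) update of (pi, A, B) from one observation sequence."""
    obs = np.asarray(obs)
    T, N = len(obs), len(pi)
    # E-step: forward and backward passes.
    alpha = np.zeros((T, N)); alpha[0] = pi * B[:, obs[0]]
    for t in range(1, T):
        alpha[t] = (alpha[t-1] @ A) * B[:, obs[t]]
    beta = np.ones((T, N))
    for t in range(T - 2, -1, -1):
        beta[t] = A @ (B[:, obs[t+1]] * beta[t+1])
    likelihood = alpha[-1].sum()            # P(O | lambda)
    gamma = alpha * beta / likelihood       # gamma[t, i] = P(q_t=i | O, lambda)
    # xi[t, i, j] = P(q_t=i, q_{t+1}=j | O, lambda)
    xi = alpha[:-1, :, None] * A * (B[:, obs[1:]].T * beta[1:])[:, None, :] / likelihood
    # M-step: normalized expected counts become the new parameters.
    new_pi = gamma[0]
    new_A = xi.sum(axis=0) / gamma[:-1].sum(axis=0)[:, None]
    new_B = np.stack([gamma[obs == k].sum(axis=0) for k in range(B.shape[1])], axis=1)
    new_B /= gamma.sum(axis=0)[:, None]
    return new_pi, new_A, new_B
```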

continuous-density HMM

Some HMM variants replace the discrete per-state emission distributions with Gaussian mixture models over continuous feature vectors.
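
Concretely, the emission probability of state \(j\) becomes a mixture density (standard notation, assumed here; the note itself does not fix these symbols):

\begin{equation}
b_{j}(o_{t}) = \sum_{m=1}^{M} c_{jm}\, \mathcal{N}(o_{t};\, \mu_{jm}, \Sigma_{jm}), \qquad \sum_{m=1}^{M} c_{jm} = 1
\end{equation}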

continuous speech

Scoring becomes hard because you would have to go through and score every possible word sequence. Therefore, we apply Bayes' rule:

\begin{equation} P(W|O) = \frac{P(O|W) P(W)}{P(O)} \end{equation}

Since \(P(O)\) is constant across candidate word sequences \(W\), we really desire:

\begin{equation} \arg\max_{W} P(O|W) P(W) \end{equation}
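
A toy log-space sketch of this decision rule over a finite candidate set; `acoustic_loglik` (\(\log P(O|W)\), e.g. from the forward algorithm above) and `lm_logprior` (\(\log P(W)\)) are hypothetical inputs:

```python
def noisy_channel_decode(acoustic_loglik, lm_logprior):
    """argmax_W P(O|W) P(W), computed as a sum of log-probabilities.

    acoustic_loglik, lm_logprior: dicts mapping candidate W -> log-probability.
    """
    return max(acoustic_loglik, key=lambda w: acoustic_loglik[w] + lm_logprior[w])
```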