In the last chapter, we introduced Gibbs sampling for HMMs. Gibbs sampling is a special case of a general family of techniques called Markov chain Monte Carlo (MCMC). The idea behind MCMC is that we can sample from a complicated distribution by taking small steps, each conditioned on the preceding sample. Frequently, MCMC works by resampling a small subset of the random variables in a complex joint distribution while leaving the others fixed—this is how our Gibbs sampler worked.

Recall that earlier in the course we used the term Markov model to refer to n-gram models or strictly local models. A Markov chain is a more general term for a stochastic process defined by transition distributions from one state to the next.
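The idea of a process driven by transition distributions can be made concrete with a small simulation. The following is an illustrative sketch (the states and probabilities are invented for the example, not taken from the text): each step samples the next state from the transition distribution of the current one.

```python
import random

# Hypothetical three-state chain; TRANSITIONS[s] is the
# transition distribution out of state s.
TRANSITIONS = {
    "sunny":  {"sunny": 0.8, "rainy": 0.1, "cloudy": 0.1},
    "rainy":  {"sunny": 0.2, "rainy": 0.6, "cloudy": 0.2},
    "cloudy": {"sunny": 0.4, "rainy": 0.3, "cloudy": 0.3},
}

def step(state):
    """Sample the next state from the current state's transition distribution."""
    dist = TRANSITIONS[state]
    states = list(dist)
    return random.choices(states, weights=[dist[s] for s in states])[0]

def run_chain(start, n):
    """Run the chain for n steps; return the sequence of visited states."""
    visited = [start]
    for _ in range(n):
        visited.append(step(visited[-1]))
    return visited
```

Note that the next state depends only on the current one, not on the earlier history—this is the defining Markov property.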

Consider the Gibbs sampler. Our states here are points in the joint space of derivations and parameter values. By sampling from the conditional posteriors over parameters and derivations, we resample a small part of the state while leaving the rest fixed, making small, correlated moves. Thus, our Gibbs sampler defines a Markov chain.
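A toy example may make this concrete. The sketch below is not the HMM sampler from the text but a deliberately simple stand-in: a Gibbs sampler for a bivariate Gaussian with correlation `rho`, where each conditional distribution is known in closed form. Each update resamples one coordinate while holding the other fixed, so the whole state after each sweep depends only on the preceding sample—exactly the Markov-chain structure described above.

```python
import random

def gibbs_bivariate_normal(rho, n_samples, seed=0):
    """Gibbs sampling from a standard bivariate Gaussian with correlation rho.

    The conditionals are x | y ~ N(rho * y, 1 - rho^2) and symmetrically
    for y | x, so each sweep makes two small, correlated moves.
    """
    rng = random.Random(seed)
    x, y = 0.0, 0.0                       # arbitrary starting state
    cond_sd = (1.0 - rho * rho) ** 0.5    # std. dev. of each conditional
    samples = []
    for _ in range(n_samples):
        x = rng.gauss(rho * y, cond_sd)   # resample x, holding y fixed
        y = rng.gauss(rho * x, cond_sd)   # resample y, holding x fixed
        samples.append((x, y))
    return samples
```

Run long enough, the collected samples approximate draws from the joint distribution, even though no step ever sampled from the joint directly.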

An important concept in the theory of Markov chains is the notion of a stationary distribution. The stationary distribution is a distribution over states that gives the long-run average fraction of time the chain spends in each state if you ran it forever.
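One way to see this numerically is to push a distribution over states through the transition matrix repeatedly until it stops changing. The sketch below uses a made-up two-state chain (the numbers are illustrative, not from the text); the fixed point it converges to is the stationary distribution.

```python
# Hypothetical two-state chain; P[i][j] is the probability of
# moving from state i to state j.
P = [[0.9, 0.1],
     [0.5, 0.5]]

def apply_transition(dist, P):
    """One step: push a distribution over states through P."""
    n = len(dist)
    return [sum(dist[i] * P[i][j] for i in range(n)) for j in range(n)]

def stationary(P, iters=1000):
    """Iterate the transition until the state distribution converges."""
    dist = [1.0, 0.0]  # start entirely in state 0
    for _ in range(iters):
        dist = apply_transition(dist, P)
    return dist
```

For this chain the fixed point satisfies pi_0 = 0.9 pi_0 + 0.5 pi_1, giving pi = (5/6, 1/6): in the long run the chain spends five sixths of its time in state 0, regardless of where it started.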

