Most probable state sequence
What can we do with a trained HMM?
Well, we can infer the most probable hidden state sequence for given observed sequence.
$$ \mZ^\star = \argmax_{\vz_1,\ldots,\vz_\nlabeled} p(\vx_1,\ldots,\vx_\nlabeled, \vz_1,\ldots,\vz_\nlabeled) $$
Solving this optimization problem by brute force is infeasible: with \( K \) possible states at each step, a sequence of length \( \nlabeled \) has \( K^\nlabeled \) candidate state sequences.
An efficient alternative that discovers the most probable state sequence incrementally is known as the Viterbi algorithm. It is a dynamic programming approach that breaks the overall problem into a step-by-step search, keeping only the most promising paths at each point in time.
Note that for any \( \nlabeledsmall \), to discover the best \( \vz_\nlabeledsmall \), it suffices to know, for each of the \( K \) possible values of \( \vz_\nlabeledsmall \), the best path leading to that value. The highest probability of the model being in the \( k \)-th state at step \( \nlabeledsmall \) is
$$ p(z_{\nlabeledsmall k}) = \max_{\vz_1,\ldots,\vz_{\nlabeledsmall-1}} p(\vx_1,\ldots,\vx_\nlabeledsmall, \vz_1,\ldots,\vz_{\nlabeledsmall-1}, z_{\nlabeledsmall k} = 1) $$
When we move to the next step to find the best value of \( \vz_{\nlabeledsmall+1} \), we only need to retain, for each of the \( K \) possible values of \( \vz_\nlabeledsmall \), the single best path leading to it; all other paths can be discarded.
By the time we reach the end of the sequence, we are left with a single surviving path. We can backtrack along this path to recover the best values of all the intermediate states in the sequence.
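As a concrete sketch, the recursion and backtracking above can be implemented in log space (to avoid numerical underflow on long sequences). The function name `viterbi` and the tabular parameterization below are illustrative choices, not notation from the text: `delta[n, k]` stores the best log-probability of any path ending in state \( k \) at step \( n \), and `psi` records the predecessor used to achieve it.

```python
import numpy as np

def viterbi(log_pi, log_A, log_B, x):
    """Most probable hidden state sequence for observations x.

    log_pi: (K,)   log initial state probabilities
    log_A:  (K, K) log transitions, log_A[i, j] = log p(z_n = j | z_{n-1} = i)
    log_B:  (K, M) log emissions,   log_B[k, m] = log p(x = m | z = k)
    x:      length-N sequence of observation indices
    """
    K, N = log_pi.shape[0], len(x)
    delta = np.empty((N, K))            # best log-prob of a path ending in state k at step n
    psi = np.empty((N, K), dtype=int)   # best predecessor of state k at step n

    delta[0] = log_pi + log_B[:, x[0]]
    for n in range(1, N):
        # scores[i, j]: extend the best path ending in state i with a move to state j.
        scores = delta[n - 1][:, None] + log_A
        psi[n] = scores.argmax(axis=0)
        delta[n] = scores.max(axis=0) + log_B[:, x[n]]

    # Backtrack from the best final state to recover the full path.
    z = np.empty(N, dtype=int)
    z[-1] = delta[-1].argmax()
    for n in range(N - 2, -1, -1):
        z[n] = psi[n + 1, z[n + 1]]
    return z
```

Note that only \( K \) paths are alive at any step, so the cost is \( O(\nlabeled K^2) \) time rather than the \( K^\nlabeled \) of exhaustive search.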