In a Hidden Markov Model with $N$ states, each (hidden) state is associated with an emission law, the state evolves according to a Markov chain, and at each time step the (visible) observation is drawn from the emission law of the current state.
Denoting $(s_t)$ the hidden-state sequence and $(O_t)$ the visible-observation sequence, we get the following graphical representation:
Formally, the hidden-state chain $(s_t)$ is defined by an initial probability distribution vector $\Pi_0$ and a transition matrix $M$ such that:

$$\Pi_t = M \, \Pi_{t-1}$$
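As an illustration (not part of the original derivation), here is a minimal NumPy sketch of this state dynamic, assuming a hypothetical 3-state, column-stochastic transition matrix $M$ so that $\Pi_t = M \, \Pi_{t-1}$:

```python
import numpy as np

# Hypothetical 3-state transition matrix, column-stochastic so that
# M[i, j] = P(s_t = i | s_{t-1} = j) and each column sums to 1.
M = np.array([[0.8, 0.1, 0.3],
              [0.1, 0.8, 0.3],
              [0.1, 0.1, 0.4]])
pi_0 = np.array([1.0, 0.0, 0.0])   # initial distribution Pi_0

pi_t = pi_0
for _ in range(5):
    pi_t = M @ pi_t                # Pi_t = M . Pi_{t-1}
print(pi_t)                        # state distribution after 5 steps
```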
In our case we assume that the $N$ emission laws are Gaussian vectors, each determined by its mean vector and covariance matrix:

$$O(s_t) \sim \mathcal{N}(\mu_{s_t}, \Omega_{s_t})$$
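A minimal sampling sketch, assuming a hypothetical 2-state model with made-up means and covariances, shows how state transitions and Gaussian emissions interact:

```python
import numpy as np

rng = np.random.default_rng(0)

pi_0 = np.array([0.6, 0.4])                     # initial state distribution
M = np.array([[0.9, 0.1],                       # here M[i, j] = P(s_{t+1} = j | s_t = i)
              [0.2, 0.8]])
mus = [np.array([0.0, 0.0]), np.array([3.0, 3.0])]   # mean vectors mu_i
omegas = [np.eye(2), 0.5 * np.eye(2)]                 # covariance matrices Omega_i

T = 10
s = rng.choice(2, p=pi_0)
states, observations = [], []
for _ in range(T):
    states.append(s)
    observations.append(rng.multivariate_normal(mus[s], omegas[s]))  # O(s_t) ~ N(mu_{s_t}, Omega_{s_t})
    s = rng.choice(2, p=M[s])                    # draw the next hidden state
```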
The graph structure shows that, conditionally on the state sequence, the observations are independent. Furthermore, each single observation depends solely on the state at the same time step. Thus we have:

$$P(O_1, \ldots, O_T \mid s_1, \ldots, s_T) = \prod_{k=1}^{T} P(O_k \mid s_k)$$
As a result, if we know the probability distribution of the state sequence, we can perform the Maximization step of the EM algorithm the same way we do for the Gaussian Mixture Model. Fortunately, the Baum-Welch forward-backward procedure allows us to compute the probability of the state at each time step.
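To make the factorization above concrete, here is a small sketch (the function name and arguments are illustrative, not from the original post) that evaluates $\log P(O_1, \ldots, O_T \mid s_1, \ldots, s_T)$ as a sum of per-step Gaussian log-densities:

```python
import numpy as np
from scipy.stats import multivariate_normal

def conditional_log_likelihood(observations, states, mus, omegas):
    """log P(O_1, ..., O_T | s_1, ..., s_T) = sum_k log P(O_k | s_k)."""
    return sum(multivariate_normal.logpdf(o, mean=mus[s], cov=omegas[s])
               for o, s in zip(observations, states))
```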
Estimation with the Baum-Welch algorithm
Forward pass (adapted from Wikipedia)
Let $\alpha_i(t) = P(O_1 = o_1, \ldots, O_t = o_t, s_t = i \mid \theta)$, i.e. the joint probability of observing $o_1, o_2, \ldots, o_t$ and having $s_t = i$.

By forward recursion:

$$\alpha_i(1) = \pi_i \, g_i(o_1)$$

$$\alpha_j(t+1) = g_j(o_{t+1}) \sum_{i=1}^{N} \alpha_i(t) \, m_{ij}$$
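A minimal sketch of this forward recursion, assuming `g[t, i]` already holds the Gaussian density $g_i(o_t)$ evaluated at each observation and `M[i, j]` $= m_{ij}$ (no rescaling is done here, so long sequences would underflow in practice):

```python
import numpy as np

def forward(pi_0, M, g):
    """alpha[t, i] = P(O_1 = o_1, ..., O_t = o_t, s_t = i | theta)."""
    T, N = g.shape
    alpha = np.zeros((T, N))
    alpha[0] = pi_0 * g[0]                    # alpha_i(1) = pi_i g_i(o_1)
    for t in range(1, T):
        alpha[t] = g[t] * (alpha[t - 1] @ M)  # alpha_j(t+1) = g_j(o_{t+1}) sum_i alpha_i(t) m_ij
    return alpha
```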
Backward pass (adapted from Wikipedia)
Let $\beta_i(t) = P(O_{t+1} = o_{t+1}, \ldots, O_T = o_T \mid s_t = i, \theta)$, i.e. the conditional probability of observing $o_{t+1}, \ldots, o_T$ given $s_t = i$.

By backward recursion:

$$\beta_i(T) = 1$$

$$\beta_i(t) = \sum_{j=1}^{N} \beta_j(t+1) \, m_{ij} \, g_j(o_{t+1})$$
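A matching sketch of the backward recursion, with the same assumed conventions as the forward pass above:

```python
import numpy as np

def backward(M, g):
    """beta[t, i] = P(O_{t+1} = o_{t+1}, ..., O_T = o_T | s_t = i, theta)."""
    T, N = g.shape
    beta = np.zeros((T, N))
    beta[-1] = 1.0                              # beta_i(T) = 1
    for t in range(T - 2, -1, -1):
        beta[t] = M @ (beta[t + 1] * g[t + 1])  # beta_i(t) = sum_j beta_j(t+1) m_ij g_j(o_{t+1})
    return beta
```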
Expected state dynamic (adapted from Wikipedia)
Probability of being in state $s_t = i$ given the observed sequence $O$ and the parameters $\theta$:

$$\gamma_i(t) = P(s_t = i \mid O, \theta) = \frac{\alpha_i(t)\, \beta_i(t)}{\sum_{j=1}^{N} \alpha_j(t)\, \beta_j(t)}$$
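Given the `alpha` and `beta` arrays from the two sketches above, $\gamma$ is a simple normalized product:

```python
import numpy as np

def gamma(alpha, beta):
    """gamma[t, i] = P(s_t = i | O, theta)."""
    num = alpha * beta                            # alpha_i(t) beta_i(t)
    return num / num.sum(axis=1, keepdims=True)   # normalize over the states j
```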
Probability of being in state $s_t = i$ and $s_{t+1} = j$ given the observed sequence $O$ and parameters $\theta$:

$$\xi_{ij}(t) = P(s_t = i, s_{t+1} = j \mid O, \theta) = \frac{\alpha_i(t)\, m_{ij}\, \beta_j(t+1)\, g_j(o_{t+1})}{\sum_{i=1}^{N} \sum_{j=1}^{N} \alpha_i(t)\, m_{ij}\, \beta_j(t+1)\, g_j(o_{t+1})}$$
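And a sketch of $\xi$ under the same assumptions (`M[i, j]` $= m_{ij}$, `g[t, j]` $= g_j(o_t)$), using broadcasting over the $(t, i, j)$ axes:

```python
import numpy as np

def xi(alpha, beta, M, g):
    """xi[t, i, j] = P(s_t = i, s_{t+1} = j | O, theta), for t = 1, ..., T-1."""
    # numerator: alpha_i(t) * m_ij * beta_j(t+1) * g_j(o_{t+1})
    num = alpha[:-1, :, None] * M[None, :, :] * (beta[1:] * g[1:])[:, None, :]
    return num / num.sum(axis=(1, 2), keepdims=True)
```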