

W9.L5. Hidden Markov Model - Baum-Welch Algorithm

| Learning Parameters with Only $X$

Given only $X$, let's work out all the parameters!

 

Previously we assumed that $\pi, a, b$ were known: in effect, that both $X$ and $Z$ were observed.

... which is rarely the case in practice.

 

EM Algorithm

Iteratively optimizing $\hat{\pi}, \hat{a}, \hat{b}$ and $Z$

1. Find the optimized $\hat{\pi}, \hat{a}, \hat{b}$ given $X$ (and the current assignment of $Z$)

2. Find the most probable $Z$ given $X, \hat{\pi}, \hat{a}, \hat{b}$

 

| EM Algorithm

Initialize $\theta^{0}$ to an arbitrary point

Loop until the likelihood converges

- Expectation step

  * $q^{t+1}(Z)=argmin_{q}\ KL(\ q(Z)\parallel P(Z \mid X,\ \theta^{t})\ )$

     →  $q^{t+1}(Z)=P(Z \mid X,\ \theta^{t})$ 

          ... Assign $Z$ by $P(Z \mid X,\ \theta^{t})$

  * Assign $Z$ !

 

- Maximization step

  * $\theta^{t+1}=argmax_{\theta}\ Q(\theta, q^{t+1}) = argmax_{\theta}\ E_{q^{t+1}(Z)}\ ln P(X, Z \mid \theta) $

  * With $Z$ fixed, there are no unobserved variables

  * The optimization is then the same as ordinary MLE

  * With both $Z$ and $X$ in hand, compute $\theta$! (A generic sketch of this loop follows below.)
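The loop structure is independent of the model. Below is a minimal Python sketch, assuming hypothetical `e_step` / `m_step` callables supplied by the model; none of these names come from the lecture:

```python
import numpy as np

def em(X, theta0, e_step, m_step, tol=1e-6, max_iter=100):
    # Generic EM loop: alternate E- and M-steps until the likelihood converges.
    # e_step(X, theta) -> (q, log_likelihood), with q(Z) = P(Z | X, theta)
    # m_step(X, q)     -> theta maximizing E_{q(Z)} ln P(X, Z | theta)
    theta, prev_ll = theta0, -np.inf
    for _ in range(max_iter):
        q, ll = e_step(X, theta)     # Expectation: q^{t+1}(Z) = P(Z | X, theta^t)
        theta = m_step(X, q)         # Maximization: fixed q makes this ordinary MLE
        if abs(ll - prev_ll) < tol:  # stop once the likelihood stops improving
            break
        prev_ll = ll
    return theta
```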

 

 

| EM Algorithm for HMM

Initialize $\pi^{0}, a^{0}, b^{0}$ to an arbitrary point

Loop until the likelihood converges

- Expectation step

  * $q^{t+1}(Z)=P(Z \mid X,\pi^{t}, a^{t}, b^{t})$ 

  * Assign $Z$ by $P(Z \mid X,\pi^{t}, a^{t}, b^{t})$

 

- Maximization step

  * $ P(X, Z \mid \pi, a, b) = \pi_{z_1} \prod_{t=2}^{T} a_{z_{t-1}, z_t} \prod_{t=1}^{T}b_{z_t, x_t} $

  * $ ln P(X, Z \mid \pi, a, b) = ln \pi_{z_1} + \sum_{t=2}^{T} ln a_{z_{t-1}, z_t} + \sum_{t=1}^{T} ln b_{z_t, x_t} $

  * $ E_{q^{t+1}(z)}ln P(X, Z \mid \pi, a, b) = \sum_{Z}q^{t+1}(z)ln P(X, Z \mid \pi, a, b) = \sum_{Z}P(Z \mid X, \pi^t, a^t, b^t) \cdot ln P(X, Z \mid \pi, a, b) $
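For one fixed state path, the complete-data log-likelihood above is just a sum of table lookups. A minimal numpy sketch (the array names and shapes are my own convention: `pi` is $(K,)$, `a` is $(K, K)$, `b` is $(K, M)$, and `z`, `x` hold integer indices):

```python
import numpy as np

def log_joint(z, x, pi, a, b):
    # ln P(X, Z | pi, a, b) for a fixed state path z and observation sequence x
    ll = np.log(pi[z[0]])                    # ln pi_{z_1}
    ll += np.sum(np.log(a[z[:-1], z[1:]]))   # sum_{t=2}^{T} ln a_{z_{t-1}, z_t}
    ll += np.sum(np.log(b[z, x]))            # sum_{t=1}^{T} ln b_{z_t, x_t}
    return ll
```

The expectation $E_{q^{t+1}(z)}$ weights this quantity over all $K^{T}$ possible paths, which is why the forward-backward recursions below are needed instead of brute-force enumeration.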

 

 

| Derivation of EM

 

$$
\begin{align*}
E_{q^{t+1}(z)}ln P(X, Z \mid \pi, a, b) &= \sum_{Z}q^{t+1}(z)ln P(X, Z \mid \pi, a, b) \\
&= \sum_{Z}P(Z \mid X, \pi^t, a^t, b^t) \cdot ln P(X, Z \mid \pi, a, b) \\
&= \sum_{Z}P(Z \mid X, \pi^{t}, a^{t}, b^{t})\left\{ ln \pi_{z_1} + \sum_{t=2}^{T} ln a_{z_{t-1}, z_t} + \sum_{t=1}^{T} ln b_{z_t, x_t}\right\} \\
\end{align*}
$$

 

Optimize under the constraint $\sum_{i} \pi_{i} = 1$ (each row of $a$ and $b$ must likewise sum to 1)

... the Lagrange multiplier method!

... Working it out:
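Only the first term involves $\pi$, and marginalizing the posterior over everything except $z_1$ collapses it to $\sum_{i} P(z_{1}^{i}=1 \mid X, \pi^{t}, a^{t}, b^{t})\ ln\ \pi_{i}$. The standard derivation then sets up the Lagrangian and its stationarity condition:

$$
\begin{align*}
\mathcal{L}(\pi, \lambda) &= \sum_{i=1}^{K} P(z_{1}^{i}=1 \mid X, \pi^{t}, a^{t}, b^{t})\ ln\ \pi_{i} + \lambda \left( \sum_{i=1}^{K} \pi_{i} - 1 \right) \\
\frac{\partial \mathcal{L}}{\partial \pi_{i}} &= \frac{P(z_{1}^{i}=1 \mid X, \pi^{t}, a^{t}, b^{t})}{\pi_{i}} + \lambda = 0 \\
\end{align*}
$$

Eliminating $\lambda$ with the constraint gives the normalized update for $\pi$; the same argument applied row-wise to $a$ and $b$ gives the other two: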

 

$$
\begin{align*}
\pi_{i}^{t+1} &= \frac{P(z_{1}^{i}=1 \mid X, \pi^{t}, a^{t}, b^{t})}{\sum_{j=1}^{K}P(z_{1}^{j}=1 \mid X, \pi^{t}, a^{t}, b^{t})} \\
a_{i,j}^{t+1} &= \frac{\sum_{t=2}^{T}P(z_{t-1}^{i}=1, z_{t}^{j}=1 \mid X, \pi^{t}, a^{t}, b^{t})}{\sum_{t=2}^{T}P(z_{t-1}^{i}=1 \mid X, \pi^{t}, a^{t}, b^{t})} \\
b_{i,j}^{t+1} &= \frac{\sum_{t=1}^{T}P(z_{t}^{i}=1 \mid X, \pi^{t}, a^{t}, b^{t}) \cdot \delta (idx(x_t)=j)}{\sum_{t=1}^{T}P(z_{t}^{i}=1 \mid X, \pi^{t}, a^{t}, b^{t})} \\
\end{align*}
$$
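A minimal numpy sketch of these three updates, assuming the E-step posteriors are available as arrays (the names `gamma`, `xi`, `M` are mine, not the lecture's): `gamma[t, i]` is the singleton posterior of state $i$ at time $t$ (0-based), `xi[t, i, j]` the pairwise posterior for times $t, t+1$, and `M` the emission alphabet size.

```python
import numpy as np

def m_step(x, gamma, xi, M):
    # gamma: (T, K) singleton posteriors, xi: (T-1, K, K) pairwise posteriors
    pi = gamma[0] / gamma[0].sum()                        # pi_i^{t+1}
    a = xi.sum(axis=0) / gamma[:-1].sum(axis=0)[:, None]  # a_{i,j}^{t+1}
    b = np.zeros((gamma.shape[1], M))
    for t, obs in enumerate(x):       # numerator: gamma mass per emitted symbol
        b[:, obs] += gamma[t]
    b /= gamma.sum(axis=0)[:, None]                       # b_{i,j}^{t+1}
    return pi, a, b
```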

 

 

| Baum-Welch Algorithm

The EM algorithm specialized to HMMs is known as the Baum-Welch algorithm.

 

Initialize $\pi^{0}, a^{0}, b^{0}$ to an arbitrary point

Loop until the likelihood converges

- Expectation step

  * $q^{t+1}(Z)=P(Z \mid X,\pi^{t}, a^{t}, b^{t})$ 

  * Assign $Z$ by $P(Z \mid X,\pi^{t}, a^{t}, b^{t})$

 

$$
\begin{align*}
P(z_{t}^{k}=1, X) &= \alpha_{t}^{k} \cdot \beta_{t}^{k} \\
\alpha_{1}^{k} &= \pi_{k} \cdot b_{k, x_1} \\
\alpha_{t}^{k} &= b_{k, x_t} \sum_{i} \alpha_{t-1}^{i} \cdot a_{i, k} \\
\beta_{T}^{k} &= 1 \\
\beta_{t}^{k} &= \sum_{i} a_{k, i} \cdot b_{i, x_{t+1}} \cdot \beta_{t+1}^{i} \\
P(X) &= \sum_{i} \alpha_{T}^{i} \\
\end{align*}
$$
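These recursions translate directly into numpy. The sketch below is unscaled, so on long sequences the probabilities underflow; real implementations rescale each step or work in log space (the function and array names are my own):

```python
import numpy as np

def forward_backward(x, pi, a, b):
    # alpha/beta recursions for P(z_t^k = 1, X); x holds observation indices
    T, K = len(x), len(pi)
    alpha, beta = np.zeros((T, K)), np.zeros((T, K))
    alpha[0] = pi * b[:, x[0]]                        # alpha_1^k = pi_k b_{k,x_1}
    for t in range(1, T):
        alpha[t] = b[:, x[t]] * (alpha[t - 1] @ a)    # forward recursion
    beta[T - 1] = 1.0                                 # beta_T^k = 1
    for t in range(T - 2, -1, -1):
        beta[t] = a @ (b[:, x[t + 1]] * beta[t + 1])  # backward recursion
    return alpha, beta, alpha[-1].sum()               # P(X) = sum_i alpha_T^i
```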

 

- Maximization step

 

$$
\begin{align*}
\pi_{i}^{t+1} &= \frac{P(z_{1}^{i}=1 \mid X, \pi^{t}, a^{t}, b^{t})}{\sum_{j=1}^{K}P(z_{1}^{j}=1 \mid X, \pi^{t}, a^{t}, b^{t})} \\
a_{i,j}^{t+1} &= \frac{\sum_{t=2}^{T}P(z_{t-1}^{i}=1, z_{t}^{j}=1 \mid X, \pi^{t}, a^{t}, b^{t})}{\sum_{t=2}^{T}P(z_{t-1}^{i}=1 \mid X, \pi^{t}, a^{t}, b^{t})} \\
b_{i,j}^{t+1} &= \frac{\sum_{t=1}^{T}P(z_{t}^{i}=1 \mid X, \pi^{t}, a^{t}, b^{t}) \cdot \delta (idx(x_t)=j)}{\sum_{t=1}^{T}P(z_{t}^{i}=1 \mid X, \pi^{t}, a^{t}, b^{t})} \\
\end{align*}
$$
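Putting the two steps together, one iteration computes the posteriors from $\alpha, \beta$ and then applies the closed-form updates. This sketch reuses the hypothetical `forward_backward` and `m_step` functions defined above:

```python
import numpy as np

def baum_welch(x, pi, a, b, max_iter=100, tol=1e-6):
    # EM for HMMs (Baum-Welch), built from forward_backward and m_step above
    prev_ll = -np.inf
    for _ in range(max_iter):
        alpha, beta, px = forward_backward(x, pi, a, b)
        ll = np.log(px)
        if ll - prev_ll < tol:       # likelihood converged (EM never decreases it)
            break
        prev_ll = ll
        gamma = alpha * beta / px    # gamma[t, i] = P(z_t^i = 1 | X, pi, a, b)
        # xi[t, i, j] = alpha_t^i a_{i,j} b_{j, x_{t+1}} beta_{t+1}^j / P(X)
        xi = (alpha[:-1, :, None] * a[None, :, :]
              * (b[:, x[1:]].T * beta[1:])[:, None, :]) / px
        pi, a, b = m_step(x, gamma, xi, b.shape[1])
    return pi, a, b
```

Each iteration costs only $O(TK^{2})$ via the forward-backward pass, so the loop scales to long sequences once rescaling is added.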

 

 

 

 

Reference
Lecture by Prof. Il-Chul Moon (문일철)
https://www.youtube.com/watch?v=mnUcZbT5E28&list=PLbhbGI_ppZISMV4tAWHlytBqNq1-lb8bz&index=61