
W8.L3-4. Gaussian Mixture Model - Multinomial Distribution

공부해라이 2023. 6. 6. 01:27

| Multinomial Distribution

Binary variable

- Selecting 0 or 1 → binomial distribution

- Selecting 0, 1, 2, ... → multinomial distribution

 

How about K options?

Multinomial distribution (A generalization of binomial distribution)

 

 

Rolling a die once

One observation: $X_1=(0, 0, 1, 0, 0, 0)$

 

 

$P(X \mid \mu) = \prod_{k=1}^{K} \mu_{k}^{x_k}$

 

such that $\mu_{k} \geqslant 0,\ \sum_{k}\mu_{k}=1$

 

$\sum_{k} x_{k} = 1 $

 

 

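Probability of selecting a particular option: $\mu_{k}$

- Probability of selecting the first option: $\mu_{1}$

- Probability of selecting the second option: $\mu_{2}$

- ...

- Probability of selecting the $k$-th option: $\mu_{k}$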

 

 

Probability of observing the data $X$ when $\mu$ is given

$$
\begin{align*}
P(X \mid \mu) &= \prod_{k=1}^{K} \mu_{k}^{x_k} \\
&= \mu_{1}^{x_1} \cdot \mu_{2}^{x_2} \cdot \mu_{3}^{x_3} \cdot \mu_{4}^{x_4} \cdot \mu_{5}^{x_5} \cdot \mu_{6}^{x_6} \\
&= \mu_{3} \\
\end{align*}
$$
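As a quick numerical check of the product above, here is a minimal sketch in Python with NumPy. The fair-die values for `mu` and the helper name `categorical_likelihood` are assumptions made for illustration; they are not part of the lecture.

```python
import numpy as np

# Hypothetical fair-die probabilities (an assumption for illustration).
mu = np.full(6, 1.0 / 6.0)

# One-hot observation: the third face came up.
x = np.array([0, 0, 1, 0, 0, 0])


def categorical_likelihood(x, mu):
    """Return P(x | mu) = prod_k mu_k ** x_k for a one-hot vector x."""
    return float(np.prod(mu ** x))


p = categorical_likelihood(x, mu)
print(p, mu[2])  # both values are the probability of the third face
```

Every factor with $x_k = 0$ equals 1, so only $\mu_3$ survives in the product.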

 

 

 

Rolling a die 25 times

N=25 observations: $X_1, X_2, ... , X_N $

 

$m_k$: the number of times the $k$-th option is selected out of $N$ selections

 

$$P(X \mid \mu) = \prod_{n=1}^{N} \prod_{k=1}^{K} \mu_{k}^{x_{nk}} = \prod_{k=1}^{K} \mu_{k}^{\sum_{n=1}^{N} x_{nk}} = \prod_{k=1}^{K} \mu_{k}^{m_k}$$

 

where $ m_k = \sum_{n=1}^{N} x_{nk} $
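The equality between the per-observation product and the count-based product can be checked numerically. The sketch below simulates 25 rolls; the fair die, the random seed, and the variable names are assumptions. It compares log-likelihoods rather than raw products, since a product of 25 small probabilities is tiny.

```python
import numpy as np

rng = np.random.default_rng(0)   # arbitrary seed, for reproducibility only
K, N = 6, 25
mu = np.full(K, 1.0 / K)         # hypothetical fair die

# Simulate N rolls and encode each one as a one-hot row: X has shape (N, K).
faces = rng.integers(0, K, size=N)
X = np.eye(K, dtype=int)[faces]

# m_k = sum_n x_nk: how many times option k was selected.
m = X.sum(axis=0)

# log of prod_n prod_k mu_k ** x_nk
log_lik_per_obs = float(np.sum(X * np.log(mu)))
# log of prod_k mu_k ** m_k
log_lik_counts = float(np.sum(m * np.log(mu)))

print(m, log_lik_per_obs, log_lik_counts)
```

The counts `m` are all the likelihood needs from the data; the order of the rolls does not matter.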

 

 

 

| How to determine the maximum likelihood solution of $\mu$ ?

 

MLE

 

 

Maximize $P(X \mid \mu) = \prod_{k=1}^{K} \mu_{k}^{m_k} $

 

Subject to $\mu_{k} \geqslant 0,\ \sum_{k}\mu_{k}=1$

 

This is a constrained optimization problem; solving it with the Lagrange method gives ...

 

$\mu_{k} = \frac{m_k}{N} $
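A sketch of the Lagrange step behind this result, assuming we maximize the log-likelihood (which has the same maximizer as the likelihood) and attach a multiplier $\lambda$ to the sum constraint. The symbols $L$ and $\lambda$ are notation introduced here, not taken from the lecture.

$$
\begin{align*}
L(\mu, \lambda) &= \sum_{k=1}^{K} m_k \ln \mu_k + \lambda \left( \sum_{k=1}^{K} \mu_k - 1 \right) \\
\frac{\partial L}{\partial \mu_k} &= \frac{m_k}{\mu_k} + \lambda = 0 \quad \Rightarrow \quad \mu_k = -\frac{m_k}{\lambda} \\
\sum_{k=1}^{K} \mu_k &= -\frac{1}{\lambda} \sum_{k=1}^{K} m_k = -\frac{N}{\lambda} = 1 \quad \Rightarrow \quad \lambda = -N
\end{align*}
$$

Substituting $\lambda = -N$ back into $\mu_k = -\frac{m_k}{\lambda}$ recovers the estimate. The constraint $\mu_k \geqslant 0$ holds automatically because $m_k \geqslant 0$.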

 

Number of times a particular option was selected / total number of selections

 

(just like $\frac{a_H}{a_H + a_T}$ in the binomial case)
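The count-ratio estimate is easy to try on simulated data. In the sketch below, the biased-die probabilities `true_mu`, the seed, and the helper `log_likelihood` are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)   # arbitrary seed
true_mu = np.array([0.1, 0.1, 0.2, 0.2, 0.1, 0.3])  # hypothetical biased die
N = 25

# Roll the die N times and count how often each face appears.
faces = rng.choice(6, size=N, p=true_mu)
m = np.bincount(faces, minlength=6)

# MLE: number of times option k was selected / total number of selections.
mu_hat = m / N


def log_likelihood(mu, m):
    """Return sum_k m_k * log(mu_k), skipping options that were never selected."""
    mask = m > 0
    return float(np.sum(m[mask] * np.log(mu[mask])))


uniform = np.full(6, 1.0 / 6.0)
print(mu_hat)
print(log_likelihood(mu_hat, m), log_likelihood(uniform, m))
```

On the observed counts, `mu_hat` scores at least as high as any other probability vector, including the uniform one and the `true_mu` that generated the data.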

 

 

 

 

| Multivariate Gaussian Distribution

 

Probability density function of the Gaussian distribution

 

$$ N(x \mid \mu,\ \sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}}e^{-\frac{1}{2\sigma^2}(x-\mu)^2} $$

$$ N(x \mid \mu,\ \Sigma) = \frac{1}{(2 \pi)^{\frac{D}{2}}}\frac{1}{\left| \Sigma \right| ^{\frac{1}{2}}}e^{-\frac{1}{2}(x-\mu)^T \Sigma^{-1} (x-\mu)} $$
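The multivariate density can be written directly in NumPy. This is a minimal sketch, not a library API: the function name `gaussian_pdf` is hypothetical, and the covariance matrix is assumed to be symmetric positive definite.

```python
import numpy as np


def gaussian_pdf(x, mu, Sigma):
    """Evaluate the D-dimensional Gaussian density at a single point x."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    mu = np.atleast_1d(np.asarray(mu, dtype=float))
    Sigma = np.atleast_2d(np.asarray(Sigma, dtype=float))
    D = mu.shape[0]
    diff = x - mu
    # (x - mu)^T Sigma^{-1} (x - mu), computed with a linear solve
    # instead of forming the inverse explicitly.
    maha = diff @ np.linalg.solve(Sigma, diff)
    norm = (2.0 * np.pi) ** (D / 2.0) * np.sqrt(np.linalg.det(Sigma))
    return float(np.exp(-0.5 * maha) / norm)


# With D = 1 and Sigma = [[sigma^2]] this reduces to the univariate formula.
p_1d = gaussian_pdf(0.0, 0.0, 1.0)
# Standard 2D Gaussian evaluated at its mean.
p_2d = gaussian_pdf([0.0, 0.0], [0.0, 0.0], np.eye(2))
print(p_1d, p_2d)
```

At the mean, the exponent is zero, so the two values are just the normalizing constants $1/\sqrt{2\pi}$ and $1/(2\pi)$.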

 

 

Estimating $\mu$ and $\Sigma$ by MLE gives ...

 

$$ \hat{\mu}=\frac{\sum_{n=1}^{N}x_n}{N} $$

$$\hat{\Sigma}=\frac{1}{N}\sum_{n=1}^{N}(x_n - \hat{\mu})(x_n - \hat{\mu})^T $$
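Both estimators translate to a few lines of NumPy. The sketch below draws synthetic 2D data; `true_mu`, `true_Sigma`, the sample size, and the seed are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)   # arbitrary seed
true_mu = np.array([1.0, -2.0])                  # hypothetical mean
true_Sigma = np.array([[2.0, 0.6],
                       [0.6, 1.0]])              # hypothetical covariance

# N samples, one per row: X has shape (N, D).
X = rng.multivariate_normal(true_mu, true_Sigma, size=500)
N = X.shape[0]

# mu_hat = (1/N) * sum_n x_n
mu_hat = X.sum(axis=0) / N

# Sigma_hat = (1/N) * sum_n (x_n - mu_hat)(x_n - mu_hat)^T
centered = X - mu_hat
Sigma_hat = centered.T @ centered / N

print(mu_hat)
print(Sigma_hat)
```

Note the division by $N$, not $N-1$: the MLE of the covariance matches `np.cov(X, rowvar=False, bias=True)`, whereas NumPy's default uses the $N-1$ normalization.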

 

 

2D - Covariance

 

 

 

 

 

| Mixture Model

The mixture model connects the two ingredients covered above (multinomial distribution, multivariate Gaussian distribution)

 

$$ P(x) = \sum_{k=1}^{K} \pi_{k}\cdot N (x \mid \mu_{k},\ \Sigma_{k} ) $$

 

Mixing coefficients $ \pi_{k} $ :

One of the $K$ normal distributions is chosen with probability $\pi_k$

 

$\sum_{k=1}^{K} \pi_{k} = 1 $, $ 0\leq  \pi_k \leq 1 $

 

So $\pi_k$ is a probability (as well as a weight)

 

 

Mixture component $ N (x \mid \mu_k,\ \Sigma_k ) $ :

A distribution for the subpopulation
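The two roles of the mixing coefficients (weight in the density, probability of picking a component) can be seen in a small sketch. For brevity it uses one-dimensional components, and the values of `pi`, `means`, `variances` as well as the function names are assumptions made for illustration.

```python
import numpy as np

# Hypothetical 1D mixture with K = 3 components.
pi = np.array([0.5, 0.3, 0.2])          # mixing coefficients, sum to 1
means = np.array([-2.0, 0.0, 3.0])
variances = np.array([0.5, 1.0, 2.0])


def normal_pdf(x, mean, var):
    """Univariate Gaussian density."""
    return np.exp(-0.5 * (x - mean) ** 2 / var) / np.sqrt(2.0 * np.pi * var)


def mixture_pdf(x):
    """P(x) = sum_k pi_k * N(x | mean_k, var_k): pi_k acts as a weight."""
    x = np.asarray(x, dtype=float)
    return sum(p * normal_pdf(x, m, v) for p, m, v in zip(pi, means, variances))


def sample_mixture(n, rng):
    """Draw n samples: pick a component with probability pi_k, then sample from it."""
    k = rng.choice(len(pi), size=n, p=pi)               # multinomial step
    return rng.normal(means[k], np.sqrt(variances[k]))  # Gaussian step


# The mixture is itself a density: a Riemann sum over a wide grid is close to 1.
grid = np.linspace(-15.0, 15.0, 20001)
area = float(np.sum(mixture_pdf(grid)) * (grid[1] - grid[0]))

samples = sample_mixture(1000, np.random.default_rng(3))
print(area, samples.mean())
```

Because the weights sum to 1 and each component integrates to 1, the weighted sum integrates to 1 as well.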

 

 

 

 

 

 

 

Reference
Lectures by Prof. Il-Chul Moon (문일철)
https://www.youtube.com/watch?v=mnUcZbT5E28&list=PLbhbGI_ppZISMV4tAWHlytBqNq1-lb8bz&index=50

https://www.youtube.com/watch?v=mnUcZbT5E28&list=PLbhbGI_ppZISMV4tAWHlytBqNq1-lb8bz&index=51