T&C LAB-AI

Robotics

Expectation Maximization and

Gaussian Mixture Model

Lecture 12

Jeong-Yean Yang

2020/12/10

Multi-Dimensional
Probability Distribution

Gaussian Distribution

$\Pr(a \le x \le b) = \int_a^b p(x)\,dx$, where the PDF is

$p(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)$

With C++ or Python,

How to Generate Gaussian Distribution?

• rand() returns an integer from 0 to RAND_MAX (32767 on some platforms)

– rand() is uniformly distributed; it is NOT a Gaussian (normal) distribution

• Recall the video

* Marsaglia polar method: turns uniform samples into samples of N(0,1)
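
The Marsaglia polar method named above can be sketched as follows. This is a minimal illustration using only the Python standard library; the function names `marsaglia_polar_pair` and `randn` are chosen here for illustration and are not taken from the lecture's code.

```python
import math
import random

def marsaglia_polar_pair(rng):
    """Return two independent N(0,1) samples built from uniform samples."""
    while True:
        # Draw a point uniformly in the square [-1, 1) x [-1, 1).
        u = 2.0 * rng.random() - 1.0
        v = 2.0 * rng.random() - 1.0
        s = u * u + v * v
        # Accept only points strictly inside the unit circle, excluding the origin.
        if 0.0 < s < 1.0:
            break
    factor = math.sqrt(-2.0 * math.log(s) / s)
    return u * factor, v * factor

def randn(n, seed=0):
    """Return a list of n samples of N(0,1) (hypothetical helper name)."""
    rng = random.Random(seed)
    samples = []
    while len(samples) < n:
        samples.extend(marsaglia_polar_pair(rng))
    return samples[:n]

xs = randn(20000, seed=1)
sample_mean = sum(xs) / len(xs)
sample_var = sum((x - sample_mean) ** 2 for x in xs) / len(xs)
print(sample_mean, sample_var)  # should be close to 0 and 1
```

The rejection loop discards roughly 21% of the uniform pairs (those outside the unit circle), which is the price paid for avoiding sine and cosine calls.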

N(0,1) returns Gaussian Distribution

randn(1,1000) generates

1000 samples

Question:

How do we generate x' with a given mean μ and standard
deviation σ?

$x \sim N(0,1) \;\Rightarrow\; x' \sim N(\mu, \sigma^2)$


Gaussian Generation

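• Mean value: an offset from 0

$x \sim N(0,1), \quad x' = x + \mu \sim N(\mu, 1)$

Example: $x' = x + 4 \sim N(4, 1)$

• Standard deviation: a scale factor

$x \sim N(0,1), \quad x' = \sigma x \sim N(0, \sigma^2)$

Example: $x' = 3x \sim N(0, 3^2)$

• Both together: $x' = \sigma x + \mu \sim N(\mu, \sigma^2)$

[Figure: histograms of the shifted and the scaled samples]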
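
The shift-and-scale rule can be checked numerically with a short sketch. It assumes the standard library generator `random.gauss(0, 1)` as the N(0,1) source; the helper name `gaussian_samples` is hypothetical.

```python
import random
import statistics

def gaussian_samples(n, mu, sigma, seed=0):
    """x' = sigma * x + mu with x ~ N(0,1), so that x' ~ N(mu, sigma^2)."""
    rng = random.Random(seed)
    return [sigma * rng.gauss(0.0, 1.0) + mu for _ in range(n)]

shifted = gaussian_samples(20000, mu=4.0, sigma=1.0)  # N(4, 1)
scaled = gaussian_samples(20000, mu=0.0, sigma=3.0)   # N(0, 3^2)

print(statistics.mean(shifted), statistics.stdev(shifted))
print(statistics.mean(scaled), statistics.stdev(scaled))
```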



Gaussian Distribution or

Normal Distribution(Z)

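• We learned it in high school.

• Z follows the standard normal distribution

• X is normalized with its mean and standard deviation

$z = \frac{x - \mu}{\sigma} \sim N(0,1), \quad x \sim N(\mu, \sigma^2)$

$p(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)$

$\mathrm{PDF}(z) = \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{z^2}{2}\right)$
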

Probability in 2D Space

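• How do we generate a 2D Gaussian distribution?

– Easy: A = randn(1000,2) and plot(A(:,1),A(:,2),'.')

[Figure: scatter plot from plot(A(:,1),A(:,2),'.')]

1 DIM: $z \sim N(0, 1)$

2 DIM: $z \sim N\left(\begin{bmatrix} 0 \\ 0 \end{bmatrix}, \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}\right)$, with a mean vector and a covariance matrix
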

[Figure: three scatter plots of the same samples]

plot(A(:,1),A(:,2),'.'): both axes unscaled

plot(2*A(:,1),A(:,2),'.'): first axis scaled by 2

plot(A(:,1),1.5*A(:,2),'.'): second axis scaled by 1.5

[Figure: scatter plot of a tilted distribution]

How do we make it?

$\Sigma = \begin{bmatrix} ? & 0.5 \\ 0.5 & 1.5 \end{bmatrix}$
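
One common way to produce such a tilted cloud is to multiply independent N(0,1) samples by a Cholesky factor of the desired covariance. The sketch below is an illustration under assumptions, not the lecture's code: the upper-left covariance entry is not legible in the slide, so the value 1.0 used here is hypothetical, and the helper names are invented for this example.

```python
import math
import random

def cholesky_2x2(cov):
    """Lower-triangular L with L * L^T == cov, for a 2x2 positive-definite cov."""
    a, b, d = cov[0][0], cov[0][1], cov[1][1]
    l11 = math.sqrt(a)
    l21 = b / l11
    l22 = math.sqrt(d - l21 * l21)
    return l11, l21, l22

def sample_2d(n, mean, cov, seed=0):
    """Draw n points of N(mean, cov) as mean + L z with z ~ N(0, I)."""
    rng = random.Random(seed)
    l11, l21, l22 = cholesky_2x2(cov)
    points = []
    for _ in range(n):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        points.append((mean[0] + l11 * z1, mean[1] + l21 * z1 + l22 * z2))
    return points

def covariance_2d(points):
    """Sample covariance matrix of a list of 2D points."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / n
    syy = sum((p[1] - my) ** 2 for p in points) / n
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / n
    return [[sxx, sxy], [sxy, syy]]

# Assumed covariance: only 0.5 and 1.5 come from the slide; 1.0 is a placeholder.
COV = [[1.0, 0.5], [0.5, 1.5]]
points = sample_2d(20000, (0.0, 0.0), COV)
estimate = covariance_2d(points)
print(estimate)  # should be close to COV
```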



 

 

 



 

 

  

 

Quiz 1

$\Sigma = \begin{bmatrix} ? & 3 \\ 3 & 1.5 \end{bmatrix}$

How will it be distributed?

Hint: $\mathrm{Det}(\Sigma) = 0$

[Figure: two scatter plots]


Quiz 2

Why Can a PDF Exceed One?

• What is a PDF?

• A PDF is not a probability. p(0) may be over 1.

• The Gaussian function is NOT a probability function
but a probability density function

$\Pr(a \le x \le b) = \int_a^b p(x)\,dx$, $\mathrm{PDF} = p(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)$

With $\sigma = 0.1$ and $\mu = 0$:

$p(0) = \frac{1}{0.1\sqrt{2\pi}} \exp(0) = 3.99$
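
The value 3.99 and the fact that the density still integrates to one can both be checked with a few lines. This is a minimal sketch; the trapezoidal integrator `integrate` is a helper written for this illustration.

```python
import math

def gaussian_pdf(x, mu=0.0, sigma=1.0):
    """Density of N(mu, sigma^2) at x."""
    return math.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2)) / (sigma * math.sqrt(2.0 * math.pi))

def integrate(f, a, b, steps=10000):
    """Trapezoidal rule for the integral of f over [a, b]."""
    h = (b - a) / steps
    total = 0.5 * (f(a) + f(b))
    for i in range(1, steps):
        total += f(a + i * h)
    return total * h

# The density at the peak is larger than 1 ...
peak = gaussian_pdf(0.0, mu=0.0, sigma=0.1)
# ... but the area under the curve (the probability) is still 1.
area = integrate(lambda x: gaussian_pdf(x, 0.0, 0.1), -1.0, 1.0)
print(peak, area)
```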

























Cumulative Distribution Function (CDF)

is the integral of the PDF

• Think about probability exactly

$\mathrm{PDF} = g(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)$

$\Pr(x \le a) = \mathrm{CDF}(a) = \int_{-\infty}^{a} g(x)\,dx$

$\Pr(a \le x \le b) = \int_a^b g(x)\,dx$

• d(CDF)/dx = PDF
• p(x) in a PDF is NOT a

probability
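
As a small check of the relation between CDF and PDF, the Gaussian CDF can be written with the error function `math.erf` and differentiated numerically. The helper names below are chosen for this sketch.

```python
import math

def gaussian_pdf(x, mu=0.0, sigma=1.0):
    """Density of N(mu, sigma^2) at x."""
    return math.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2)) / (sigma * math.sqrt(2.0 * math.pi))

def gaussian_cdf(x, mu=0.0, sigma=1.0):
    """Pr(X <= x) for X ~ N(mu, sigma^2), via the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def prob_between(a, b, mu=0.0, sigma=1.0):
    """Pr(a <= X <= b) as a difference of CDF values."""
    return gaussian_cdf(b, mu, sigma) - gaussian_cdf(a, mu, sigma)

def cdf_slope(x, mu=0.0, sigma=1.0, h=1e-5):
    """Central-difference estimate of d(CDF)/dx at x."""
    return (gaussian_cdf(x + h, mu, sigma) - gaussian_cdf(x - h, mu, sigma)) / (2.0 * h)

one_sigma = prob_between(-1.0, 1.0)  # probability within one standard deviation
print(one_sigma, cdf_slope(0.3), gaussian_pdf(0.3))
```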

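Probability Density Function

in n-dim. Space

• 1 Dim: $x \sim N(\mu, \sigma^2)$

$\Pr(a \le x \le b) = \int_a^b g(x)\,dx$, $\mathrm{PDF} = g(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)$

• N-Dim: $x \sim N(\mu, \Sigma)$

$g(x) = \frac{1}{(2\pi)^{n/2}\,\mathrm{Det}(\Sigma)^{1/2}} \exp\left(-\frac{1}{2}(x-\mu)^T \Sigma^{-1} (x-\mu)\right)$

• Look at the Sigma matrix, for example $\Sigma = \begin{bmatrix} ? & 0.5 \\ 0.5 & 1.5 \end{bmatrix}$

– Diagonal entries (such as 1.5): scale factors for the principal axes
– Off-diagonal entries (such as 0.5): rotation

Important for map

matching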
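
For n = 2 the density can be evaluated by hand, since the determinant and inverse of a 2x2 matrix have closed forms. The function below is an illustrative sketch (the name `gaussian_pdf_2d` is not from the lecture).

```python
import math

def gaussian_pdf_2d(x, mean, cov):
    """Density of a 2D Gaussian N(mean, cov) at the point x."""
    dx, dy = x[0] - mean[0], x[1] - mean[1]
    a, b = cov[0][0], cov[0][1]
    c, d = cov[1][0], cov[1][1]
    det = a * d - b * c
    # (x - mean)^T cov^{-1} (x - mean), using cov^{-1} = [[d, -b], [-c, a]] / det
    mahalanobis = (d * dx * dx - (b + c) * dx * dy + a * dy * dy) / det
    return math.exp(-0.5 * mahalanobis) / (2.0 * math.pi * math.sqrt(det))

identity = [[1.0, 0.0], [0.0, 1.0]]
at_mean = gaussian_pdf_2d((0.0, 0.0), (0.0, 0.0), identity)
stretched = gaussian_pdf_2d((2.0, 0.0), (0.0, 0.0), [[4.0, 0.0], [0.0, 1.0]])
print(at_mean, stretched)
```

With a diagonal covariance the 2D density factors into two 1D densities, which gives an easy sanity check.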

Two types of Probability

• A Priori Probability

– A plain probability such as Pr(A) is a prior probability

• Posterior Probability (conditional probability)

– Bayesian probability
– Probability of A on the condition that B occurs

• Prior and posterior probabilities are very different.

Pr(A) = 0.6 (prior)

Pr(A | B) = 0.6 (posterior)


Conditional Probability

• What is Pr(A|B)?

– The probability of A under the condition B
– Or the probability of A within the given B

[Figure: Venn diagram; the overlap A^B, taken relative to B, is Pr(A|B)]

Posterior Prob.

• When events A and B occur,
• P(A): probability that A occurs
• P(B): probability that B occurs
• P(A^B): probability that both A and B occur
• Definition:

$P(A \mid B) = \frac{P(A \wedge B)}{P(B)}$

• It follows that

$P(A \mid B)\,P(B) = P(A \wedge B) = P(B \mid A)\,P(A)$

$P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)}$
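
A tiny worked example makes the definition concrete. The fair six-sided die and the two events below are chosen purely for illustration and do not come from the lecture.

```python
from fractions import Fraction

OUTCOMES = range(1, 7)  # fair six-sided die (illustrative assumption)

def prob(event):
    """Probability of an event, given as a predicate over the outcomes."""
    return Fraction(sum(1 for o in OUTCOMES if event(o)), len(OUTCOMES))

def is_even(o):   # event A: the roll is even
    return o % 2 == 0

def is_high(o):   # event B: the roll is greater than 3
    return o > 3

p_a = prob(is_even)
p_b = prob(is_high)
p_ab = prob(lambda o: is_even(o) and is_high(o))  # outcomes {4, 6}

p_a_given_b = p_ab / p_b              # definition of conditional probability
p_b_given_a = p_ab / p_a
bayes = p_b_given_a * p_a / p_b       # Bayes' rule gives the same value
print(p_a, p_a_given_b, bayes)
```

Here the prior P(A) is 1/2 while the posterior P(A|B) is 2/3, so knowing B changes the probability of A.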





Engineering Notation

$P(w \mid x) = \frac{P(x \mid w)\,P(w)}{P(x)}$

– P(w | x): posterior
– P(x | w): likelihood
– P(w): prior
– P(x): evidence

In engineering, using the likelihood is one of the popular approaches.

Prob. Of Event X between w1 and w2

• p(x) = probability that event x occurs
• The posterior probability is required for classification

Prior prob.: $p(w_1),\ p(w_2)$

$p(x) = \sum_i p(x, w_i) = p(x, w_1) + p(x, w_2) = p(x \mid w_1)\,p(w_1) + p(x \mid w_2)\,p(w_2)$

$p(w_i \mid x) = \frac{p(x \mid w_i)\,p(w_i)}{p(x)} = \frac{p(x \mid w_i)\,p(w_i)}{\sum_j p(x \mid w_j)\,p(w_j)}$
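
The posterior formula can be turned into a two-class classifier in a few lines. In this sketch the class-conditional densities are assumed to be 1D Gaussians, and the priors, means, and standard deviations in `CLASSES` are illustrative values, not values from the lecture.

```python
import math

def gaussian_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) at x."""
    return math.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2)) / (sigma * math.sqrt(2.0 * math.pi))

def posteriors(x, classes):
    """Return p(w_i | x) for each class given as (prior, mu, sigma)."""
    joint = [prior * gaussian_pdf(x, mu, sigma) for prior, mu, sigma in classes]
    evidence = sum(joint)  # p(x) = sum_i p(x | w_i) p(w_i)
    return [j / evidence for j in joint]

def classify(x, classes):
    """Index of the class with the largest posterior probability."""
    post = posteriors(x, classes)
    return max(range(len(post)), key=lambda i: post[i])

# Hypothetical classes w1 and w2: (prior, mean, standard deviation).
CLASSES = [(0.5, 1.0, 1.0), (0.5, 3.0, 1.0)]

print(posteriors(2.0, CLASSES))  # midpoint between the two means
print(classify(0.5, CLASSES), classify(3.5, CLASSES))
```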



 

Concept of Clustering

What is Clustering?

• Grouping similar objects and labeling a Group

– Labeling a Class

• Grouping a set of Objects which are more similar to

each other than to those in other groups

Clustering Method

Important Tools for Intelligent Robotics

• Pattern recognition requires Class definition

• How many classes here?

• There are only two lumps → two clusters (2 classes).

Famous Clustering Methods

• 1. K-means clustering method

– Geometry-based method
– Simple, with a low computational burden.
– Shortcoming: the initial guess determines the final result

• 2. Expectation Maximization method

– Probabilistic method
– Very popular for fitting mixture distributions
– Backbone of the Gaussian Mixture Model (GMM)

K-Means Clustering

• Find the mean value (centroid) of each cluster
• Algorithm
• 1. Assume there are K clusters.
• 2. Guess the centroid of each cluster.
• 3. Find the k points closest to each centroid.
• 4. Recompute the centroid of each cluster.

[Figure: data points and their centroids]
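
The iteration can be sketched as below. Note the assumptions: this is the textbook (Lloyd) form of k-means, in which every point is assigned to its nearest centroid, so it may differ in detail from the lecture's ex/ml/l12kmean.py; the cluster spread of 5 and the helper name `kmeans` are also chosen for this illustration.

```python
import math
import random

def kmeans(points, centroids, iterations=20):
    """Textbook k-means: assign each point to its nearest centroid, then re-average."""
    centroids = [tuple(c) for c in centroids]
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        new_centroids = []
        for old, members in zip(centroids, clusters):
            if members:
                new_centroids.append(tuple(sum(v) / len(members) for v in zip(*members)))
            else:
                new_centroids.append(old)  # keep an empty cluster's centroid in place
        if new_centroids == centroids:
            break  # assignments stopped changing
        centroids = new_centroids
    return centroids, clusters

# Two synthetic groups around (50, 50) and (70, 60); the spread of 5 is an assumption.
rng = random.Random(0)
blue = [(rng.gauss(50, 5), rng.gauss(50, 5)) for _ in range(200)]
red = [(rng.gauss(70, 5), rng.gauss(60, 5)) for _ in range(200)]
points = blue + red

centroids, clusters = kmeans(points, [(40, 50), (80, 50)])
print(sorted(centroids))
```

Trying other initial guesses, for example (20, 30) and (80, 80), is an easy way to see how strongly the outcome can depend on the starting centroids.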

ex/ml/l12kmean.py

• Two groups with Blue and Red
• It looks easy to find two groups

$\text{Blue} \sim N(\mu_B, \Sigma_B) = N([50, 50], \Sigma_B)$

$\text{Red} \sim N(\mu_R, \Sigma_R) = N([70, 60], \Sigma_R)$


Real Problem is to find Two Groups

• It is NOT easy.
• By iteration, we find two groups from initial guesses.

[Figure: l12kmean.test( ) results by iteration]


l12kmean.test with Different Guesses

• The results are strongly affected by the initial guesses

[Figure: true values compared with the results for initial guesses (40,50) and (80,50), and for (20,30) and (80,80)]


Centroid of Cluster

What is it?

• In k-means clustering,

– The centroid approaches the mean value of the test distribution.
– But it does not land on the exact mean value.
– Why?

• Think about the role of k-means clustering.

– The K closest points are not the whole data, just a sample.

→ In each turn, the k-means clustering method finds the centroid
of the K closest points.
– If the initial centroid is biased, the resulting centroid is sometimes biased.

• If we guess the wrong number of centroids, how does it work?

Wrong Number of Groups

kmean.test3(50,50,70,70,60,30,20)

kmean.test3(40,80,70,30,50,50,20)

• Thus, what is the answer? → No answer in general.

Expectation Maximization

Introduction to

Expectation Maximization

• Let's think about EM in a simple way.

• We have a random variable, X

• Maybe X has two groups.

• How do we separate X into
two groups, probabilistically?

EM has two Steps

• Clusters are represented by probability distributions

– In k-means clustering, a cluster is a set of data around a centroid.
– In EM, a cluster is a probability distribution.

• Assumption:

– The data are a mixture of Gaussian distributions
– Blue, Red, and Green points are each drawn from a Gaussian distribution and mixed together

$g(x) = \frac{1}{(2\pi)^{n/2}\,\mathrm{Det}(\Sigma)^{1/2}} \exp\left(-\frac{1}{2}(x-\mu)^T \Sigma^{-1} (x-\mu)\right)$

[Figure: data set x whose points are colored Red, Blue, and Green]


Simple EM Procedure

1. We get labeled data
2. Mix randomly
3. Initial guess of the PDFs
4. Expectation: compare the PDFs and rearrange the classes
5. Maximization: recalculate the PDFs
6. Repeat E-M


Probability Density Function has

mean and variance

• 0. Data is given: a set of points x
• 1. Guess groups: label each point as one of two groups (for example, Blue and Red)
• 2. The maximum PDF is wrong for some data: such a point (Red?) is relabeled (Fix). This is the Expectation step.

• 3. Find the mean and standard deviation of each group with mean( ) and std( ). This is the Maximization step.


Expectation and Maximization

Step 1. Expectation

• Density function p(x|C_i) for each cluster C_i

$p(x \mid C_i) = \frac{1}{(2\pi)^{n/2}\,\mathrm{Det}(\Sigma_i)^{1/2}} \exp\left(-\frac{1}{2}(x-\mu_i)^T \Sigma_i^{-1} (x-\mu_i)\right)$

• Density function P(x) for the clustering model $M = \{C_1, \ldots, C_k\}$

$P(x) = \sum_{i=1}^{k} W_i\, p(x \mid C_i)$

– W_i is the fraction of the cluster C_i in the entire data

• Assign points to clusters (for example, Blue or Green)

$P(C_i \mid x) = \frac{W_i\, p(x \mid C_i)}{P(x)} = \frac{W_i\, p(x \mid C_i)}{\sum_{j} W_j\, p(x \mid C_j)}$
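
For 1D data, the Expectation step amounts to the following sketch. The function name `e_step` and the example parameters (two clusters with means 1 and 3, unit standard deviation, equal fractions) are assumptions made for illustration.

```python
import math

def gaussian_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) at x."""
    return math.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2)) / (sigma * math.sqrt(2.0 * math.pi))

def e_step(xs, weights, mus, sigmas):
    """Return resp[j][i] = P(C_i | x_j) = W_i p(x_j | C_i) / sum_k W_k p(x_j | C_k)."""
    resp = []
    for x in xs:
        joint = [w * gaussian_pdf(x, m, s) for w, m, s in zip(weights, mus, sigmas)]
        total = sum(joint)  # P(x_j) under the current model
        resp.append([v / total for v in joint])
    return resp

# Illustrative parameters: two clusters with equal fractions.
xs = [0.0, 2.0, 4.0]
resp = e_step(xs, weights=[0.5, 0.5], mus=[1.0, 3.0], sigmas=[1.0, 1.0])
for x, row in zip(xs, resp):
    print(x, row)
```

The point halfway between the two means is shared equally, while points far on either side go almost entirely to the nearer cluster.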





Expectation and Maximization

Step 2: Maximization

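• Recompute the model $M' = \{C'_1, \ldots, C'_k\}$ from the n data points

$W_i = \frac{1}{n} \sum_{x} P(C_i \mid x)$

$\mu_i = \frac{\sum_{x} x\, P(C_i \mid x)}{\sum_{x} P(C_i \mid x)}$

$\Sigma_i = \frac{\sum_{x} P(C_i \mid x)\,(x-\mu_i)(x-\mu_i)^T}{\sum_{x} P(C_i \mid x)}$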
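
In 1D, the three update formulas reduce to weighted averages. The sketch below assumes the responsibilities P(C_i | x_j) are already available as a list of rows; the name `m_step` and the toy data are illustrative.

```python
import math

def m_step(xs, resp):
    """Recompute W_i, mu_i, sigma_i from resp[j][i] = P(C_i | x_j)."""
    n = len(xs)
    k = len(resp[0])
    weights, mus, sigmas = [], [], []
    for i in range(k):
        r = [row[i] for row in resp]
        total = sum(r)
        mu = sum(ri * x for ri, x in zip(r, xs)) / total
        var = sum(ri * (x - mu) ** 2 for ri, x in zip(r, xs)) / total
        weights.append(total / n)   # fraction of the data explained by cluster i
        mus.append(mu)
        sigmas.append(math.sqrt(var))
    return weights, mus, sigmas

# Toy data with hard (0 or 1) responsibilities so the result is easy to verify.
xs = [0.0, 2.0, 10.0, 12.0]
resp = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
weights, mus, sigmas = m_step(xs, resp)
print(weights, mus, sigmas)
```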





 



EM in 1 Dim.

• Assume that there are 2 groups
• Guess which points of x belong to the Blue and Red groups

$\text{Blue} \sim N(1, 1), \quad \text{Red} \sim N(3, 1)$

• Use the same initial guess
• It is very robust

[Figure: EM result; a value of 0.5 is annotated]


But EM Must Be Designed Carefully

• EM looks simple.
• The orders E-M and M-E give very different results

• 1. Expectation with the given parameters

– The initial guesses of the mean, variance, and fraction factor W are used
first.

– At the first step, do not recalculate the mean, variance, and so on

• 2. Maximization with p(c|x), not with p(x|c)

– E and M look similar, which causes confusion

• 3. If M (calculating the parameters) runs first, EM often fails.
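
The ordering described above (E first with the initial guesses, then M with p(c|x)) can be sketched as a complete 1D loop. This is an illustration under assumptions, not the lecture's implementation: the synthetic data, the initial guesses, the iteration count, and the small variance floor are all choices made here.

```python
import math
import random

def gaussian_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) at x."""
    return math.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2)) / (sigma * math.sqrt(2.0 * math.pi))

def em_1d(xs, weights, mus, sigmas, iterations=50):
    """Run EM for a 1D Gaussian mixture, starting from the given initial guesses."""
    k, n = len(mus), len(xs)
    for _ in range(iterations):
        # E-step first: p(c | x) from the CURRENT parameters (initial guesses on pass one).
        resp = []
        for x in xs:
            joint = [weights[i] * gaussian_pdf(x, mus[i], sigmas[i]) for i in range(k)]
            total = sum(joint)
            resp.append([v / total for v in joint])
        # M-step second: recompute W, mean, and standard deviation using p(c | x).
        new_w, new_mu, new_sigma = [], [], []
        for i in range(k):
            r = [row[i] for row in resp]
            total = sum(r)
            mu = sum(ri * x for ri, x in zip(r, xs)) / total
            var = sum(ri * (x - mu) ** 2 for ri, x in zip(r, xs)) / total
            new_w.append(total / n)
            new_mu.append(mu)
            new_sigma.append(math.sqrt(max(var, 1e-6)))  # floor avoids a collapsed cluster
        weights, mus, sigmas = new_w, new_mu, new_sigma
    return weights, mus, sigmas

# Assumed test data: 300 samples of N(0, 3^2) mixed with 300 samples of N(10, 1).
rng = random.Random(0)
data = [rng.gauss(0.0, 3.0) for _ in range(300)] + [rng.gauss(10.0, 1.0) for _ in range(300)]
rng.shuffle(data)

weights, mus, sigmas = em_1d(data, weights=[0.5, 0.5], mus=[2.0, 8.0], sigmas=[2.0, 2.0])
print(weights, mus, sigmas)
```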

Example) ex/ml/l12em1.py

Generate Blue and Red

$\text{Blue} \sim N(0, 3^2), \quad \text{Red} \sim N(10, 1)$

Example) ex/ml/l12em1.py

Initial Guess

Points: 0.2, 1.3, 10.1, 3.3, 11.5 (each with a label of 0 or 1)

• Matrix X has two columns

– 1st column is the random data
– 2nd column is the label: 0 is blue and
1 is red

• Mb=mean of blue
• Sb= standard deviation of blue
• Mr = mean of red
• Sr =standard deviation of red
• W[1,1] = W1
• W[1,2] = W2

Example) ex/ml/l12em1.py

Expectation

• P(x|C) is the p.d.f. of x with respect to a cluster
• P(C|x) means that a new cluster C is determined by
comparing p(x) across the clusters

[Figure: PDFs of the Blue and Red clusters over the labeled points]

1. This PDF is given by the
previous (or initial)
parameters.

2. Blue p(x3) < Red p(x3):
change x3's label to 1 (red)
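
The relabeling rule in item 2 can be tried on the five points listed for this example. As an assumption, the sketch uses the generating distributions Blue ~ N(0, 3^2) and Red ~ N(10, 1) as the "previous parameters" and compares the two densities directly, without the fraction factor W; the actual initial guesses in ex/ml/l12em1.py may differ.

```python
import math

def gaussian_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) at x."""
    return math.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2)) / (sigma * math.sqrt(2.0 * math.pi))

POINTS = [0.2, 1.3, 10.1, 3.3, 11.5]  # the points listed for this example
BLUE = (0.0, 3.0)    # assumed (mean, standard deviation) for label 0
RED = (10.0, 1.0)    # assumed (mean, standard deviation) for label 1

def relabel(points, blue, red):
    """Give each point label 1 (red) if the red density is larger, else 0 (blue)."""
    labels = []
    for x in points:
        is_red = gaussian_pdf(x, *red) > gaussian_pdf(x, *blue)
        labels.append(1 if is_red else 0)
    return labels

labels = relabel(POINTS, BLUE, RED)
print(list(zip(POINTS, labels)))
```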

Example) ex/ml/l12em1.py

Maximization

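• With a new model, $M' = \{C'_1, C'_2, \ldots, C'_k\}$

• Recompute W_i

$W_i = \frac{1}{n} \sum_{x} P(C_i \mid x)$

• New mean and variance

$\mu_i = \frac{\sum_{x} x\, P(C_i \mid x)}{\sum_{x} P(C_i \mid x)}$

$\sigma_i^2 = \frac{\sum_{x} (x-\mu_i)^2\, P(C_i \mid x)}{\sum_{x} P(C_i \mid x)}$
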



EM in 2Dim

• The two marked points are regarded as Blue in the
right picture (EM result).

– This is because EM is based on probability distributions.

[Figure: left panel "true value" (True Case); right panel "data clustered by EM" (EM Result); the two points in question are marked "See these points"]

Why Do We Learn EM and GMM?

Imitation Learning Is Not Replaying Memorized Motion

• 1990s: encoder recording and replay
• After 2005: trajectories are treated as a set of
stochastic processes

Gaussian Mixture Model

Gaussian Mixture Model

• Extends k-means clustering into a probabilistic framework,
in the same way as the EM method

• The signal on the left is a mixture of two different Gaussians

– The goal of GMM is to find the multiple Gaussian distributions

Modeling of GMM

• Assume that the jth point of the vector x belongs to
the ith cluster.

• The Gaussian PDF of the ith cluster is defined as

$G_i(x_j) = f(x_j; \mu_i, \Sigma_i) = \frac{1}{\sqrt{(2\pi)^N\,\mathrm{Det}(\Sigma_i)}} \exp\left(-\frac{1}{2}(x_j-\mu_i)^T \Sigma_i^{-1} (x_j-\mu_i)\right)$

$x_j$: the input vector

$\mu_i$: the mean value of the ith cluster

$\Sigma_i$: the covariance (variance) of the ith cluster

$G_i(x_j) = p(x_j \mid C_i)$


Example

i indexes the cluster and j indexes the input x

$x = \{1,\ 1.1,\ 10,\ 10.1\}$

• The prior probability of the ith cluster is $\Pr(x \in C_i)$.


Probability that

the jth point belongs to the ith cluster

$G_i(x_j)$, with $x = \{1,\ 1.1,\ 10,\ 10.1\}$

For the points 1 and 1.1: $G_1(x_j) = f(x_j; \mu_1, \Sigma_1)$

For the points 10 and 10.1: $G_2(x_j) = f(x_j; \mu_2, \Sigma_2)$


Expectation Procedure:

Probability that

the jth point belongs to the ith cluster

$P(C_i \mid x_j) = \frac{\Pr(x \in C_i)\,G_i(x_j)}{\sum_k \Pr(x \in C_k)\,G_k(x_j)}$

evaluated for each cluster i at the points $x_j = 1,\ 1.1,\ 10,\ 10.1$


Maximization

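• What is the objective function?

$p(x) \rightarrow p(x; \theta)$: find the best $\theta$ for each cluster

• Log likelihood

$\ell(\phi, \mu, \Sigma) = \log \prod_j p(x_j; \phi, \mu, \Sigma) = \sum_j \log p(x_j; \phi, \mu, \Sigma) = \sum_j \log \sum_i p(x_j \mid w_i; \mu, \Sigma)\,p(w_i; \phi)$

where $\phi$ parameterizes the prior $p(w)$, and

$p(x) = \sum_w p(x, w) = \sum_w p(x \mid w)\,p(w)$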
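
The log likelihood of a 1D mixture can be evaluated directly, which makes it easy to see that well-placed clusters score higher. In this sketch the data values 1, 1.1, 10, 10.1 follow the lecture's small example, while the two parameter sets being compared are hypothetical.

```python
import math

def gaussian_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) at x."""
    return math.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2)) / (sigma * math.sqrt(2.0 * math.pi))

def log_likelihood(xs, priors, mus, sigmas):
    """sum_j log sum_i p(x_j | w_i) p(w_i) for a 1D Gaussian mixture."""
    total = 0.0
    for x in xs:
        mixture = sum(p * gaussian_pdf(x, m, s) for p, m, s in zip(priors, mus, sigmas))
        total += math.log(mixture)
    return total

XS = [1.0, 1.1, 10.0, 10.1]

# Hypothetical parameter sets: clusters centered on the data versus misplaced clusters.
good = log_likelihood(XS, priors=[0.5, 0.5], mus=[1.05, 10.05], sigmas=[0.5, 0.5])
poor = log_likelihood(XS, priors=[0.5, 0.5], mus=[4.0, 7.0], sigmas=[0.5, 0.5])
print(good, poor)
```

Maximization searches for the parameters that make this quantity as large as possible.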





Maximization of Log likelihood

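For the clusters $\{C_1, C_2, \ldots, C_k\}$ and N data points $x_j$:

$W_i = \frac{1}{N}\sum_{j=1}^{N} P(C_i \mid x_j)$

$\mu_i = \frac{\sum_j P(C_i \mid x_j)\,x_j}{\sum_j P(C_i \mid x_j)}$

$\Sigma_i = \frac{\sum_j P(C_i \mid x_j)\,(x_j-\mu_i)(x_j-\mu_i)^T}{\sum_j P(C_i \mid x_j)}$
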

Example of gmm1

• Edit ex/ml/gmm1

Blue: Data

Red : GMM

Ref:

Maximum Likelihood Estimation(MLE)

• Estimating parameters of a probability distribution

– by maximizing a likelihood function

$L(\theta; X) = p(X \mid \theta) = \int p(X, Z \mid \theta)\,dZ$

Z: unobserved or latent data





