Smoothing in Mixture of Bernoullis for high-dimensional vectors
I've implemented a mixture of Bernoullis in Python for a data set of
416-dimensional multivariate random variables. As expected, when
calculating the probability of such an R.V., I run into numerical
underflow about 300 dimensions into the calculation.
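To see the underflow concretely: multiplying a few hundred per-dimension
probabilities drives the product below the smallest positive double, while
the equivalent sum of logs stays comfortably in range (toy value below, not
from my actual data):

    import math

    p = 0.1          # toy per-dimension probability
    prod = 1.0
    log_sum = 0.0
    for _ in range(416):
        prod *= p                 # hits exactly 0.0 somewhere past dimension 300
        log_sum += math.log(p)    # stays finite

    print(prod)      # 0.0
    print(log_sum)   # about -957.9 (= 416 * log(0.1))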
I have implemented smoothing in two places. The first replaces the product
over small probability values with a sum of log-probs. This seems to work,
until I'm caught taking the log of 0 whenever a Bernoulli mean equals 1 or
0. So I smooth that as well: if the mean is 1, I set it to .99, and if it
is 0, I set it to .01. Ultimately, the combination of these two smoothing
techniques leads to problems after a few iterations of EM. Initially, the
cluster assignments adequately explain the data (they come from two
different sources, and two separate clusters are apparent when the data are
visualized). After that, though, the cluster proportions shift
dramatically, and one cluster assignment completely dominates the other
after only a few more iterations. Any idea why this is happening? I have
provided a code snippet below containing both of the smoothing techniques,
in case the problem lies there.
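In isolation, the intent of the two smoothing steps is something like this
minimal NumPy sketch, where x is an (N, D) 0/1 data matrix and mu a (K, D)
matrix of Bernoulli means (illustrative names, not the variables in my
snippet below):

    import numpy as np

    def log_bernoulli(x, mu, eps=0.01):
        # Smoothing #2: clamp the means away from exactly 0 and 1 before logging.
        mu = np.clip(mu, eps, 1.0 - eps)
        # Smoothing #1: sum log-probs over the D dimensions (the inner sum on
        # Bishop p. 446) instead of multiplying tiny probabilities together.
        # Result is (N, K): row n, column k holds log p(x_n | mu_k).
        return x @ np.log(mu).T + (1.0 - x) @ np.log(1.0 - mu).T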
Also, here is a sample output. (Crude as yet, I know.)
Evaluating initial log-likelihood.
log likelihood: -288528.389451
Iteration #: 1  old mixing_proportions: [0.1, 0.1]  new mixing_proportions: [0.46218130113590206, 0.5378186988640978]
Iteration #: 2  old mixing_proportions: [0.46218130113590206, 0.5378186988640978]  new mixing_proportions: [0.5602861839787503, 0.43971381602124976]
Iteration #: 3  old mixing_proportions: [0.5602861839787503, 0.43971381602124976]  new mixing_proportions: [0.4137759544813409, 0.5862240455186591]
Iteration #: 4  old mixing_proportions: [0.4137759544813409, 0.5862240455186591]  new mixing_proportions: [0.6131604985789163, 0.38683950142108375]
Iteration #: 5  old mixing_proportions: [0.6131604985789163, 0.38683950142108375]  new mixing_proportions: [0.3537971913067085, 0.6462028086932916]
Iteration #: 6  old mixing_proportions: [0.3537971913067085, 0.6462028086932916]  new mixing_proportions: [0.6972187269487877, 0.3027812730512123]
Iteration #: 7  old mixing_proportions: [0.6972187269487877, 0.3027812730512123]  new mixing_proportions: [0.23220293867520006, 0.7677970613247999]
Iteration #: 8  old mixing_proportions: [0.23220293867520006, 0.7677970613247999]  new mixing_proportions: [0.8774645158252881, 0.12253548417471194]
Iteration #: 9  old mixing_proportions: [0.8774645158252881, 0.12253548417471194]  new mixing_proportions: [0.0013733019674422116, 0.9986266980325578]
Iteration #: 10  old mixing_proportions: [0.0013733019674422116, 0.9986266980325578]  new mixing_proportions: [1.0, 1.863681856053267e-20]
Any advice would be tremendously appreciated. I'm really in a jam!
import math

for i in range(0, 20):
    print "Iteration #: " + str(i + 1)

    # E-STEP: re-calculate the gammas.
    for n in range(0, N):
        k_numerators = []
        for k in range(0, K):
            log_prob_xn = 0.0
            # Calculate the inner summation of the log-probs over the
            # vector dimensions (Bishop p. 446).
            for d in range(0, D):
                if MUs[k][d] == 1.0:
                    smoothed_mu = .99
                    log_prob_xn = log_prob_xn + ((babies[n][d] * math.log(smoothed_mu))
                                                 + ((1 - babies[n][d]) * math.log(1 - smoothed_mu)))
                elif MUs[k][d] == 0.0:
                    smoothed_mu = .01
                    log_prob_xn = log_prob_xn + ((babies[n][d] * math.log(smoothed_mu))
                                                 + ((1 - babies[n][d]) * math.log(1 - smoothed_mu)))
                else:
                    log_prob_xn = log_prob_xn + ((babies[n][d] * math.log(MUs[k][d]))
                                                 + ((1 - babies[n][d]) * math.log(1 - MUs[k][d])))
            numerator = math.log(mixing_proportions[k]) * log_prob_xn
            k_numerators.append(numerator)

        # Get m, the max of the numerators, so that we can compute the
        # gammas with numerically stable values.
        max_numerator = k_numerators[0]
        for k in range(1, K):
            if k_numerators[k] > max_numerator:
                max_numerator = k_numerators[k]

        # Approximate the denominator, ignoring very small terms; right
        # now we're trying -100 as our threshold.
        denominator = 0.0
        for k in range(0, K):
            if (k_numerators[k] - max_numerator) >= -100:
                denominator = denominator + math.exp(k_numerators[k] - max_numerator)
        for k in range(0, K):
            if (k_numerators[k] - max_numerator) < -100:
                gamma[n][k] = 0
            else:
                gamma[n][k] = math.exp(k_numerators[k] - max_numerator) / denominator
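For reference, the max-subtraction above is the log-sum-exp trick, and
scipy.special.logsumexp performs the same stabilization without a
hand-rolled threshold. Here is a sketch of the whole E-step in that style,
where log_px would be the (N, K) array of per-component log-likelihoods
from a function like the one sketched earlier, and pi the K mixing
proportions (again illustrative names):

    import numpy as np
    from scipy.special import logsumexp

    def e_step(log_px, pi):
        # Numerator in log space: log(pi_k) + log p(x_n | mu_k). Taking logs
        # turns the product pi_k * p(x_n | mu_k) into a sum of logs.
        log_num = np.log(pi) + log_px
        # Stable per-row log of the denominator sum_k exp(log_num[n, k]).
        log_den = logsumexp(log_num, axis=1, keepdims=True)
        return np.exp(log_num - log_den)   # gamma; each row sums to 1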