Log-Sum-Exp trick

It is quite common to process small numbers in probability (no really, so small that double precision 10^{-308} may not be enough). They are encountered while modeling with Gaussians, which are so ubiquitous today. However, it is only so much that one Gaussian can do, and hence the obvious extension is to use a mixture of Gaussians, weighted differently with different means and covariances.

The probability function for a mixture of multi-variate Gaussians looks like
p(x) = \sum_i w_i \frac{1}{\sqrt{(2\pi)^N |C_i|}} \exp[-\frac{1}{2} (x-\mu_i)^T C_i^{-1} (x-\mu_i)]
where both x and \mu_i are vectors of dimension N, and i a count of the number of Gaussians in the mixture.

Everyone knows the above equation, but the problem comes when actually trying to compute it. So typically papers report this Log-Likelihood, which is basically the \log(p(x)). Consider an example of 2 gaussians. Then, you may need to evaluate something like f(x) = \log(e^{-1000} + e^{-1001}). Those are not made up. It is quite easy to obtain a score of 1000. Direct evaluation will usually return -Inf, but we clearly know that it cannot be true, and has to be around -999.something.

The simple thing to do is rewrite f(x) as
\log(e^{-1000}(e^0 + e^{-1})) = -1000 + \log(1 + e^{-1}) \approx -999.7.

And that is the computational trick. Pull out the maximum, from the sum, leave the rest (which can be evaluated). Just FYI, the maximum value is exp(709) which lies just below the numerical Infinity.


One thought on “Log-Sum-Exp trick

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s