Affinity Diffusion

This post is based on a very simple, yet powerful technique presented at CVPR 2013. The authors compare many existing graph-based diffusion techniques, generalize them, and evaluate them to pick the best alternative. The goal: to improve the quality of affinity matrices.

M. Donoser and H. Bischof. Diffusion Processes for Retrieval Revisited. CVPR 2013. [paper]

Given N data points x_1, \ldots, x_N which can represent anything (objects, faces, actions, words, audio segments, etc.) we first build a pair-wise affinity matrix using an appropriate function a_{ij} = f(x_i, x_j) that captures their similarity. This affinity matrix (or its counterpart, the distance matrix, which captures pair-wise differences/dissimilarities) is very commonly used in hierarchical clustering and retrieval applications. In fact, it can also be seen as a kernel matrix where the function f is the kernel function.
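As a concrete example, one common choice of f is a Gaussian kernel on Euclidean distance. Here is a minimal NumPy sketch; the kernel choice and bandwidth sigma are illustrative assumptions, not prescribed by the paper:

```python
import numpy as np

def affinity_matrix(X, sigma=1.0):
    """Pairwise Gaussian affinities a_ij = exp(-||x_i - x_j||^2 / (2 sigma^2))."""
    sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    return np.exp(-sq_dists / (2 * sigma ** 2))

# 6 random points in R^3; A is symmetric with ones on the diagonal
X = np.random.default_rng(0).normal(size=(6, 3))
A = affinity_matrix(X)
```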

Since many tasks such as clustering / retrieval / classification are based on this matrix, improving its quality, i.e. making similarity between elements from the same class higher and between different classes lower, can help improve performance drastically. That is precisely what this paper demonstrates, and although the MPEG-7 bulls-eye score may be a controversial measure to capture this performance improvement, an improvement in mAP is also observed.

So what is the magic that improves the affinity? It boils down to “elements similar to my neighbors are probably also similar to me”. Neighbors (in the nearest-neighbor sense) are elements that are similar to a selected element. In math, it essentially involves iterating over matrix multiplications a few times (5-10). The best combination the authors pick is to initialize the iterations with P_{kNN}, a sparse k-NN matrix obtained from the original affinity matrix, and to update the affinity matrix by pre- and post-multiplying it with P_{kNN} until convergence. The update step is
A^{t+1} = P_{kNN} \, A^t \, P_{kNN}^T.
That’s it. The resulting converged matrix is then used for the application and shows improved performance.
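The update above can be sketched in a few lines of NumPy. This is a minimal dense illustration, not the authors' Matlab implementation: the number of neighbors k and the fixed iteration count (in place of a convergence check) are assumptions for the sake of the example.

```python
import numpy as np

def knn_transition(A, k):
    """Keep each row's k largest affinities, zero the rest, and row-normalize."""
    P = np.zeros_like(A)
    idx = np.argsort(A, axis=1)[:, -k:]        # indices of the k nearest neighbors per row
    rows = np.arange(A.shape[0])[:, None]
    P[rows, idx] = A[rows, idx]
    return P / P.sum(axis=1, keepdims=True)

def diffuse(A, k=3, iters=10):
    """Iterate the update A <- P_kNN A P_kNN^T a few times."""
    P = knn_transition(A, k)
    for _ in range(iters):
        A = P @ A @ P.T
    return A

# toy example: Gaussian affinities between 8 random 2-D points
rng = np.random.default_rng(0)
X = rng.normal(size=(8, 2))
A = np.exp(-((X[:, None] - X[None, :]) ** 2).sum(-1))
A_diffused = diffuse(A, k=3, iters=10)
```

Note that the update preserves symmetry: if A is symmetric, so is P A P^T, which is one reason the diffused matrix can be dropped straight into the same clustering or retrieval pipeline.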

Apart from the 3 data sets where the authors show that the above works, we also quickly tried this algorithm on face-based person retrieval and saw an improvement in mAP from 0.463 to 0.648! That is huge!

The authors also provide easy-to-use Matlab code to replicate their results and apply the method to your own problem.
