M. Donoser and H. Bischof. Diffusion Processes for Retrieval Revisited. CVPR 2013. [paper]
Given data points which can represent anything (objects, faces, actions, words, audio segments, etc.) we first build a pair-wise affinity matrix using an appropriate function that captures their similarity. This affinity matrix (or its counterpart, the distance matrix, which captures pair-wise differences/dissimilarities) is very commonly used in hierarchical clustering and retrieval applications. In fact, it can also be seen as a kernel matrix, where the similarity function is the kernel function.
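As a concrete illustration, here is a minimal NumPy sketch of building such an affinity matrix with a Gaussian (RBF) kernel. The function name and the choice of kernel are mine for illustration; the paper's framework works with any suitable similarity function.

```python
import numpy as np

def gaussian_affinity(X, sigma=1.0):
    """Pair-wise affinity A[i, j] = exp(-||x_i - x_j||^2 / (2 sigma^2))."""
    # Squared Euclidean distances via ||a - b||^2 = ||a||^2 + ||b||^2 - 2 a.b
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    d2 = np.maximum(d2, 0.0)  # guard against tiny negative values from round-off
    return np.exp(-d2 / (2.0 * sigma ** 2))

X = np.random.default_rng(0).normal(size=(5, 3))  # 5 points in R^3
A = gaussian_affinity(X)  # symmetric, with ones on the diagonal
```

High affinity means similar points; the diagonal is 1 since every point is maximally similar to itself.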
Since many tasks such as clustering, retrieval, and classification are based on this matrix, improving its quality, i.e. raising the similarity between elements of the same class and lowering it between elements of different classes, can drastically improve performance. That is precisely what this paper demonstrates, and although the MPEG-7 bulls-eye score may be a controversial measure of this improvement, an improvement in mAP is also observed.
So what is the magic that improves the affinity? It boils down to “elements similar to my neighbors are probably also similar to me”. Neighbors (in the nearest-neighbor sense) are elements that are similar to a selected element. In math, it essentially involves iterating over matrix multiplications a few times (5-10). The best combination that the authors pick is to initialize the iterations with a sparse K-NN matrix obtained from the original affinity matrix, and to update it by pre- and post-multiplication with itself until convergence. The update step is A_{t+1} = P A_t P^T, where P is the row-normalized sparse K-NN transition matrix.
That’s it. The resulting converged matrix is then used for the application and shows improved performance.
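The iteration described above can be sketched in a few lines of NumPy. This is my own minimal reimplementation, not the authors' Matlab code: the argsort-based K-NN selection, the parameter names, and the fixed iteration count are assumptions made for clarity.

```python
import numpy as np

def knn_transition(A, k):
    """Row-stochastic transition matrix keeping only each row's k strongest affinities."""
    P = np.zeros_like(A)
    for i in range(A.shape[0]):
        idx = np.argsort(A[i])[-k:]   # indices of the k largest affinities (includes self)
        P[i, idx] = A[i, idx]
    return P / P.sum(axis=1, keepdims=True)

def diffuse(A, k=3, iters=5):
    """Diffusion on the K-NN graph: repeatedly apply W <- P W P^T."""
    P = knn_transition(A, k)
    W = P.copy()                      # initialize with the sparse K-NN matrix
    for _ in range(iters):
        W = P @ W @ P.T
    return W

# Toy example: two tight, well-separated clusters of 4 points each.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 0.1, (4, 2)), rng.normal(5, 0.1, (4, 2))])
A = np.exp(-np.sum((X[:, None] - X[None, :]) ** 2, axis=-1))
W = diffuse(A, k=3, iters=5)
```

On this toy data the diffused matrix W keeps within-cluster similarities high while cross-cluster entries vanish, which is exactly the cleanup effect the paper exploits for retrieval.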
Apart from the three data sets on which the authors demonstrate the above, we also quickly tried this algorithm on face-based person retrieval and saw an improvement in mAP from 0.463 to 0.648! That is huge!
The authors also provide easy-to-use Matlab code, so you can replicate their results and apply the method to your own problem.