Generate Positive and Negative Pairs given Identities

This post is about generating positive pairs of features (in-class data points) and negative pairs (out-of-class data points) for tasks such as Metric Learning or simply learning classifiers. We just need 7 lines of Matlab code to generate these pairs by relying on the concept of lower / upper triangular matrices and block diagonal matrices.

So here’s the code! Given a list of ground truth identities ids, we first compute the unique identities uids to know the number of classes. For each class k, we count the number of data points that belong to it with sum(uids(k) == ids) and generate a square all-ones matrix of that size.

To find negative pairs, we build a block diagonal matrix from the per-class all-ones matrices and logically negate it, which leaves a 1 at every out-of-class position and a 0 at every in-class position. Taking the lower triangular part then removes the duplicate pairs: tril(~blkdiag(onesmats{:})).

For the positive pairs, we need a 1 at every intra-class position. To remove duplicates and self-pairs, we take the upper triangular part of each class's all-ones matrix (diagonal included) and negate it, which keeps the lower triangle without the diagonal 😉: ~triu(ones(n)) for a class with n points. The positive pairs of all classes are then found by placing these per-class matrices along a block diagonal.
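To make the two masks concrete, here is a small sketch for a toy case that is assumed purely for illustration: two classes with 2 and 3 points. The variable names (sizes, blocks, strictlower, posmask, negmask) are hypothetical, and the positive blocks are written as tril(ones(n), -1), which gives the same strictly lower triangular pattern as ~triu(ones(n)) while staying numeric.

```matlab
% Toy case (assumed for illustration): two classes with 2 and 3 points,
% already grouped, so the ids would look like [1 1 2 2 2].
sizes = [2 3];
blocks = cell(1, numel(sizes));
strictlower = cell(1, numel(sizes));
for k = 1:numel(sizes)
    blocks{k} = ones(sizes(k));                  % in-class block of ones
    strictlower{k} = tril(ones(sizes(k)), -1);   % same pattern as ~triu(ones(n))
end

% Negative mask: negate the block diagonal, keep the lower triangle.
negmask = tril(~blkdiag(blocks{:}));
% Positive mask: strictly lower triangles placed along the block diagonal.
posmask = blkdiag(strictlower{:});

% Values of posmask:        Values of negmask:
%   0 0 0 0 0                 0 0 0 0 0
%   1 0 0 0 0                 0 0 0 0 0
%   0 0 0 0 0                 1 1 0 0 0
%   0 0 1 0 0                 1 1 0 0 0
%   0 0 1 1 0                 1 1 0 0 0
```

Added together, the two masks fill the strictly lower triangle of a 5-by-5 matrix, so every unordered pair of points appears exactly once in one of them.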

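uids = unique(ids);
for k = 1:length(uids)
    % all-ones block sized by the number of points in class k
    onesmats{k} = ones(sum(uids(k) == ids));
    % strictly lower triangle: each in-class pair once, no self-pairs
    pospairmat{k} = ~triu(onesmats{k});
end
% row and column indices of the nonzero entries are the pairs
[pospairs(:, 1), pospairs(:, 2)] = find(blkdiag(pospairmat{:}));
[negpairs(:, 1), negpairs(:, 2)] = find(tril(~blkdiag(onesmats{:})));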

Finally, a simple [x, y] = find() returns the row and column indices of the nonzero entries, which are exactly the pairs. Suggestions for shorter, faster and smarter techniques are more than welcome in the comments.
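As a quick check of what comes out, here is a minimal run on a toy input that is assumed for illustration (5 points, sorted ids, 2 classes). The temporary names pr, pc, nr, nc and the cast to double are my own additions and not part of the original snippet; the cast only ensures that blkdiag receives numeric blocks.

```matlab
% Toy input (assumed for illustration): sorted ids with two classes.
ids = [1 1 2 2 2];

uids = unique(ids);
onesmats = cell(1, numel(uids));
pospairmat = cell(1, numel(uids));
for k = 1:numel(uids)
    onesmats{k} = ones(sum(uids(k) == ids));
    pospairmat{k} = double(~triu(onesmats{k}));  % numeric strictly lower triangle
end

[pr, pc] = find(blkdiag(pospairmat{:}));
[nr, nc] = find(tril(~blkdiag(onesmats{:})));
pospairs = [pr, pc];   % [2 1; 4 3; 5 3; 5 4]
negpairs = [nr, nc];   % [3 1; 4 1; 5 1; 3 2; 4 2; 5 2]

% Sanity checks: positives share an id, negatives do not,
% and together they cover all n*(n-1)/2 unordered pairs.
n = numel(ids);
assert(all(ids(pospairs(:, 1)) == ids(pospairs(:, 2))));
assert(all(ids(negpairs(:, 1)) ~= ids(negpairs(:, 2))));
assert(size(pospairs, 1) + size(negpairs, 1) == n * (n - 1) / 2);
```

Each row holds one pair of data point indices, with the larger index in the first column because only the lower triangle is kept.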

PS: The code assumes that the ids are sorted, i.e. features of the same class occur next to each other. If that is not the case for your data, it’s easy to fix: sort the ids and re-index the feature vectors with the same permutation.
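The sort-and-re-index step could look like the sketch below. The unsorted ids, the feature matrix feats (one row per data point), and the names sortedids, order and origpospairs are all hypothetical and only serve the illustration. The last step maps pair indices back to the original numbering, in case you would rather leave the features in their original order.

```matlab
% Toy input (assumed for illustration): unsorted ids and a hypothetical
% feature matrix with one row per data point.
ids = [2 1 2 1 2];
feats = rand(numel(ids), 3);

% Sort the ids and apply the same permutation to the features.
[sortedids, order] = sort(ids);
sortedfeats = feats(order, :);

% Generate the positive pairs on the sorted ids, as in the post.
uids = unique(sortedids);
pospairmat = cell(1, numel(uids));
for k = 1:numel(uids)
    pospairmat{k} = tril(ones(sum(uids(k) == sortedids)), -1);
end
[pr, pc] = find(blkdiag(pospairmat{:}));
pospairs = [pr, pc];            % indices into sortedids / sortedfeats

% Optional: map the pair indices back to the original, unsorted numbering.
origpospairs = reshape(order(pospairs), [], 2);   % indices into ids / feats
```

Either pospairs with sortedfeats or origpospairs with feats refers to the same pairs of points; which one is more convenient depends on whether the rest of your pipeline expects the original ordering.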


