On Distances…

Distances are used every day in statistics and applied mathematics for comparing, seeing how far things are and also tend to be the classifying feature for several pattern recognition problems. Here’s a short overview of the distances I know or had heard of. Please help add others!

Euclidean distance: Straight line distance between any 2 points in n-dimensional Euclidean space (normally most spaces are Euclidean). Most commonly used. More info
Absolute distance (Manhattan distance): Follows the layout of a city while measuring distance. Basically the measure of distance while taking paths that are parallel to the axes. More info
p-norm distance (Minowski distance): Euclidean space distance, generalized to the order of p. (p=1: Absolute, p=2: Euclidean). Derived from the p-norm concepts. More info

Chebyshev distance: The p-norm distance for p=infinity. It picks the axis with the maximum difference between the (corresponding) points of a vector. More info
Mahalanobis distance: Similar in principle to the Euclidean distance, however takes into account the correlation between the data. Widely used in data clustering methods. More info
Algebraic distance: A popular distance metric for the Minimum Mean Square Error solutions. Represented easily with vectors and matrices, it makes this a very powerful tool. More info

Bhattacharya distance: A symmetric measure of distance used to compare probability distributions. eg. Could be used in images for comparing histograms. More info
Kullback-Leibler distance: Used typically by the information theory people, it is an asymmetric measure between probability distributions. It can simplify a lot of Info. Theory proofs 🙂 More info
Earth Mover’s distance: Thought of as the amount of “Earth” one needs to move to make two probability distributions similar. Used typically with image histograms. More info

Hausdorff distance: Idea from set theory however, can be extended to matching shapes using edge images. It is also a measure of how similar two 3D models are in computer graphics. More info
Hamming distance: The number of different symbols while comparing two strings (typically bits). Can be used to measure bit-flips or errors in a data. More info

Metric Learning: See this post.


6 thoughts on “On Distances…

      1. Makarand Tapaswi Post author

        I think Absolute distance is the one which will not satisfy triangle inequality. The rest (p-norm, p>1) should right?

    1. Makarand Tapaswi Post author

      Thanks! Geodesic distance is an example of non-Euclidean space distance in which case the straight line is now replaced by the “geodesics”. A simple example is the great-circle distance on a sphere (Earth) typically known by the shortest distance “as the crow flies”.


Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s