Central to the foundations of machine learning is the concept of distance, or its opposite, similarity. Supervised machine learning, such as approaches to classification, relies on the assumption that there is some similarity among examples of the same class, or conversely, distant examples belong to separate classes. Similarly, unsupervised machine learning assumes that there are groups of similar examples and that such groups are distant from each other. This notion manifests explicitly in distance-based models such as nearest neighbors, support vector machines and K-means clustering.
In this article, we will describe some popular ways of compute similarity or distance among examples, in the context of machine learning.