When reading about machine learning advances in tech news one can’t help but think about what type of learning algorithms would be in R2D2 or any autonomous robot. The K-Means Clustering algorithm decides whether items are similar. A processing unit can use the K-Mean Clustering algorithm to group similar items and use this information to make decisions.

A fully-autonomous being would need to be able to process a very large amount of data. The human brain can obviously perform all five senses simultaneously. These highly parallelized sensors are interpreted by the brain which alert the consciousness when one sense is off. Something that is hot draws your attention.

In this article we will focus on vision which means we will use a camera to take in images to be processed. Just like a human we have to decide what our focal point will be. The focal point can change. If we are playing soccer the focal point may be the ball at one time instant; the next, the defender then the goalie.

For our example here we will take a bottom up approach which means we will break the problem up into small and manageable pieces. We will first solve the pieces that will be used as building blocks. Then using those building blocks to add more functionality.

The first building block or algorithm to consider will assume a known number of objects to find in an image. Let’s say we are looking for one ball on a field. We need to categorize every pixel in an image to be either “field” or “ball”. The categorization of every pixel is the output or result from this building block. Another possible input could be we know we are categorizing two items, a ball, a person, and the field. There will always be the background pixels in every image. The input to our algorithm will be the number of items and the image itself. The algorithm will categorize pixels into the number of items plus one for the background.

## K-Means Clustering Example

The popular machine learning algorithm called K-Means Clustering can be used for categorizing each pixel. The algorithm uses the n-dimensional Euclidean distance as a basis to compare pixels in this case. Each pixel in a bit map image is made up of three values corresponding to red, green, and blue. One way a pixel can be represented is by a imagining it as a point in three dimensional space. Imagining all the pixels in an image as a point in three-dimensional space also helps us imagine the “clustering” part of the K-Means Clustering algorithm. Lets take an example of an image that has four pixels. The first pixel in this example is black, we will denote the first pixel in the upper left-hand corner by \(P_{1,1}\), and since this pixel is black, \(P_{1,1}=[0, 0, 0]\). Moving on to the next pixel \(P_{2,1}=[255, 255, 255]\) is white along with \(P_{1,2}=P_{2,2}=[255,255,255]\). As shown here

## Calculations

The K-Means Clustering algorithm applied to the pixels in P will group the pixels into two categories. We will consider each pixel in column major order. This means that we will start with \(P_{1,1}\), then we will consider \(P_{2,1}\) along with \(P_{1,1}\) and so on. First, with only \(P_{1,1}\) the calculations undefined since we can’t really differentiate two separate groups; not very exciting. With two pixels, \(P_{1,1}\) and \(P_{2,1}\) we can have two groups. The label for group A (GA) and for group B (GB) can be assigned arbitrarily. We will assign \(P_{1,1}\) to GA and \(P_{2.1}\) to GB. Next, we need to decide which group \(P_{1,2}\) will be assigned.

### Inter-Mean-Distance Definition

We consider inter mean distance and intra mean distance to determine which group a pixel will join. First, we will consider inter mean distance where we find the average distance between all points in a group. This metric represents how closely are the points are in the group. One of the goals for the K-Means Clustering Algorithm is to minimize the inter mean distance.

### Intra-Mean-Distance Definition

The intra mean distance is the average distance between centroids of groups. This metric represents how close the groups are and the K-Means Clustering Algorithms maximizes intra mean distance. The larger the intra mean distance the larger the inter mean distance can be without getting pixels categorized as the wrong group.

Getting back to adding \(P_{1,2}\), first consider adding \(P_{1,2}\) to GA. The inter mean distance is 16 \(\sqrt{3}\) and the intra mean distance is eight \(\sqrt{3}\). If \(P_{1,2}\) is added to GB the inter mean distance is 0 and the intra mean distance is 16 \(\sqrt{3}\). Since the K-Means Clustering algorithm minimizes inter mean distance and maximizes intra mean distance \(P_{1,2}\) is put into GB. We can add \(P_{2,2}\) into group GB as well since it has the same coordinates (or color) as \(P_{1,2}\).

We then can render the categories by assigning colors to the categories, shown here:

### Next Post

In the next post the K-Means Clustering Algorithm is used to categorize pixels in a realistic soccer-ball image.