Image Segmentation : K-Means Clustering Approach
Categorized image of the soccer ball on a grass field.

Image Segmentation : K-Means Clustering Approach

We mentioned in a previous post (K-Means Clustering Introduction) that the K-Means Clustering algorithm can be used to recognize a soccer ball on a soccer field. We found we could use the K-Means Clustering algorithm to identity and group similar pixels into [\latex]$k$[/latex] groups. These groups are based on the RGB values of a pixel and with a simple example we saw with a 4-pixel image it was easy to determine which pixels are similar. In this post we will demonstrate image segmentation by identifying a soccer ball on a grassy field.

The challenges with a normal image are much greater. For instance if we have a soccer ball that is black and white. The RBG values are used to segment the image but we want the soccer ball pixels grouped together. The output of the K-Means Clustering algorithm will not do this for us. We need to use some logic in determining where the object is in the picture.

The first issue with real images is that to gain the high resolution and crisp image lines large jumps in color are used. As an aside, all phrases relating to distances (i.e. “Large Jumps” used in the last sentence) is literally in terms of RGB distances described in the previous post. So, for crisp images means large jumps which adds noise to the image segmentation algorithm we are presenting here. So the first step here is to smooth the image. This can be done by updating a pixel’s color by the average of its surrounding pixels.

Image Smoothing

For a smoothing-window (‘sm_win’ in the code) of one. Each pixel is the average of itself and the eight neighboring pixels. If the smoothing-window is two then the eight neighboring pixels and the 16 pixels that neighbor those eight are averaged. The code below that loops over every pixel in the image (less the sm_win edge pixels). For each pixel, the neighboring pixels are averaged (denoted by the operator <*>) and assigned into a new image to avoid previous pixel averages corrupting current pixel averages.

[Nx, Ny, Nrgb] = size(image.jpg);
sm_win = 1;
for nx = 1+sm_win:Nx-sm_win
    for ny = 1+sm_win:Ny-sm_win
        p_smo(nx,ny,1) = <p(nx-sm_win:nx+sm_win,ny-sm_win:ny+sm_win,1)>;
        p_smo(nx,ny,2) = <p(nx-sm_win:nx+sm_win,ny-sm_win:ny+sm_win,2)>;
        p_smo(nx,ny,3) = <p(nx-sm_win:nx+sm_win,ny-sm_win:ny+sm_win,3)>;

The smoothed image just looks like a blurry replication of the original, but the K-Means Clustering algorithm will have an easier time segmenting the image.

Smoothed Image

Categorized Image

MATLAB has a K-Means Clustering Algorithm function MATLAB K-Means Documentation. The output of MATLAB’s ‘kmeans’ function returns the group ID number. Which is what is plotted here:

Categorized Pixels in image

where the blue is one group (background in this case) and the yellow is the white in the soccer ball. Since the segmented this image into two categories the white of the ball, the black of the ball and the green of the grass had to be put in two categories. And as it turns out the black of the soccer ball and the green of the grass are closer than either are to the white of the ball.

We also notice there are some stray yellow pixels out there in the grass. We will need to clean them up.

Clean up Image

The next step is cleaning up these pixels by looping over all the pixels again. This time we will make sure that if a given pixel is in category zero (or one) it has a neighboring pixel also in category zero (or one). If a pixel is in the opposite category that all its neighbors then that pixel flips the category that’s its in. We can see the resulting image here.

Removed stray mis-categorized pixels

Now that we have a cleaned up image free of random mis-categorized pixels we can begin finding the soccer ball in the image.

Outline Ball in Image

A quick note about the K-Means Clustering algorithm, the label of category zero or one, or one or two are not numerical they are just labels and can be interchanged. Because of this we assume a pixel in the upper left corner is background and make sure its label is one. Then the white pixels in the ball are categorized as a two.

To determine the location and size of the ball we loop through the pixels again. We take note of all the unique indices for which the ball (or category two) occupies. The MATLAB code here shows this:

xi = 1;
fiy = [];
for nx = 1:Nx
    objIdxs = find(cat_imag(nx,:) == 2);
    if ~isempty(objIdxs)
        fix(xi) = nx;
        xi = xi + 1;
        fiy = unique([fiy objIdxs]);

where ‘cat_imag’ is the matrix of ones and two representing the pixel categories. We loop over each row with index ‘nx’, and if there are any category two pixels in that row we save off nx and we also add the column indices to fiy and make sure the list is unique (no repeats in the list).

After making a list of the indices that contain the ball in the x direction (rows) and y direction (columns) then it is simple to determine the mean of the x coordinates and the mean of the y coordinates. Now we have the center of the ball we can use the equations for a circle to make an outline  of the ball with the correct radius.

r = abs(fix_mean - fix_min);
x = fix_min:fix_max;
yp = sqrt(r^2 - (x-fix_mean).^2) + fiy_mean;
yn = -sqrt(r^2 - (x-fix_mean).^2) + fiy_mean;
Asterisk in the center of the ball shows calculated center and red dashed line shows calculated ball radius

Complete Image Segmentation

Now we have the location and containing circle we can categorize all the pixels inside the circle as part of the soccer ball.

Leave a Reply