Colour Image Quantization using K-Means

Colour quantization reduces the number of colours in an image while trying to preserve its visual quality and its important global information. An image is composed of pixels, each of which can take one of 16,777,216 different colours in the 24-bit RGB colour space, which is probably the most commonly used colour space. Each colour is represented as a 3D vector whose elements each have an 8-bit range, i.e. 2⁸ = 256 possible values per channel (256 × 256 × 256 = 16,777,216). This representation is often called an RGB triplet. The key to successful colour quantization is choosing a colour palette that adequately summarizes the information in the original image.
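To make the arithmetic above concrete, here is a minimal sketch in plain Python. The tiny 2x2 image and the helper name `pack_rgb` are made up for illustration; they are not part of any library.

```python
# Each RGB channel is 8 bits wide, so it can take 2**8 = 256 values.
levels_per_channel = 2 ** 8
total_colours = levels_per_channel ** 3  # 256 * 256 * 256


def pack_rgb(r, g, b):
    """Pack an RGB triplet into a single 24-bit integer (hypothetical helper)."""
    return (r << 16) | (g << 8) | b


# A hypothetical 2x2 image stored as rows of RGB triplets.
image = [
    [(255, 0, 0), (255, 0, 0)],
    [(0, 128, 255), (255, 255, 255)],
]

# Flatten the rows into one list of pixels and collect the distinct colours.
pixels = [rgb for row in image for rgb in row]
unique_colours = set(pixels)

print(total_colours)             # 16777216
print(len(unique_colours))       # 3
print(pack_rgb(255, 255, 255))   # 16777215, the largest 24-bit value
```

Even though more than 16 million colours are available, a real image only uses a subset of them, and quantization shrinks that subset further to a small palette.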

Motivation: When should we use Colour Image Quantization?

Colour image quantization can be vital when we want to render images on devices that support only a limited number of colours (a limited colour palette), which is usually due to memory constraints.

Colour Quantization as a Clustering Problem

One of the most common ways to compress the colours of an image is clustering, where the features are the colours of the pixels. K-means is a very popular clustering method for vector quantization, and thanks to the scikit-learn library it is really easy to use (as long as you know how it works). Of course, other methods can do the work as well, for example a Gaussian Mixture Model. SVD is sometimes mentioned in this context too, although strictly speaking it is a matrix factorization used for compression rather than a clustering method.
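As a rough sketch of the idea, the snippet below treats every pixel as a point in RGB space, clusters the points with scikit-learn's `KMeans`, and replaces each pixel with the centre of its cluster. The function name `quantize_colours`, the palette size of 8 and the random test image are assumptions made for illustration only.

```python
import numpy as np
from sklearn.cluster import KMeans


def quantize_colours(image, n_colours=8, seed=0):
    """Reduce an H x W x 3 uint8 image to at most n_colours colours (illustrative helper)."""
    # One row per pixel, one column per channel (R, G, B).
    pixels = image.reshape(-1, 3).astype(np.float64)

    # Cluster the pixel colours; the cluster centres become the palette.
    kmeans = KMeans(n_clusters=n_colours, n_init=10, random_state=seed).fit(pixels)
    palette = np.clip(np.rint(kmeans.cluster_centers_), 0, 255).astype(np.uint8)

    # Replace every pixel with the palette colour of its cluster.
    quantized = palette[kmeans.labels_].reshape(image.shape)
    return quantized, palette


# Hypothetical input: a random 32x32 RGB image standing in for a real photo.
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(32, 32, 3), dtype=np.uint8)

quantized, palette = quantize_colours(image, n_colours=8)
print(palette.shape)    # (8, 3)
print(quantized.shape)  # (32, 32, 3)
```

With a real image you would load the pixel array from disk instead of generating it randomly. The quantized image can then be stored as the small palette plus one cluster index per pixel, which is where the memory saving comes from.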

Colour Spaces

Colour spaces are mathematical models that describe how colours can be represented. The easiest way to visualise them is to think of a box containing all the possible colours that can be produced by mixing the three primary colours of light: red, green and blue [Source]. Figure 1 shows a diagram that mathematicians have come up with in order to fit the three axes into a two-dimensional format: