# Color histogram

In image processing and photography, a color histogram is a representation of the distribution of colors in an image. For digital images, a color histogram gives the number of pixels that have colors in each of a fixed list of color ranges that together span the image's color space, the set of all possible colors.

A color histogram can be built for any kind of color space, although the term is more often used for three-dimensional spaces like RGB or HSV. For monochromatic images, the term intensity histogram may be used instead. For multi-spectral images, where each pixel is represented by N measurements, each within its own wavelength range of the light spectrum, some of which may be outside the visible spectrum, the color histogram is N-dimensional.

If the set of possible color values is sufficiently small, each of those colors may be placed in a range by itself; the histogram is then merely the count of pixels that have each possible color. Most often, the space is divided into an appropriate number of ranges, often arranged as a regular grid, each containing many similar color values. The color histogram may also be represented and/or displayed as a smooth function defined over the color space that approximates the pixel counts.
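The regular-grid binning described above can be sketched in plain Python. This is a minimal illustration, assuming 8-bit RGB pixels given as `(r, g, b)` tuples; the function name `color_histogram` and the parameters `bins_per_channel` and `levels` are hypothetical names chosen for this example.

```python
from collections import Counter

def color_histogram(pixels, bins_per_channel=4, levels=256):
    """Count pixels per cell of a regular grid over the RGB cube.

    pixels: iterable of (r, g, b) tuples with integer values in [0, levels).
    Returns a Counter mapping (r_bin, g_bin, b_bin) -> pixel count.
    """
    hist = Counter()
    for r, g, b in pixels:
        # Map each channel value to the index of its range along that axis.
        key = tuple(c * bins_per_channel // levels for c in (r, g, b))
        hist[key] += 1
    return hist

# A tiny hypothetical "image" of five pixels.
pixels = [(250, 10, 10), (240, 20, 5), (10, 10, 250),
          (128, 128, 128), (130, 120, 125)]
hist = color_histogram(pixels)
# The two reddish pixels share the cell (3, 0, 0).
```

Setting `bins_per_channel` equal to `levels` places every color in a range by itself, which corresponds to the small-color-set case mentioned above.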

Like other kinds of histograms, the color histogram is a statistic that can be viewed as an approximation of an underlying continuous distribution of color values.

## Overview

Color histograms are flexible constructs that can be built from images in various color spaces, whether RGB, rg chromaticity or any other color space of any dimension. A histogram of an image is produced by first discretizing the colors in the image into a number of bins and then counting the number of image pixels in each bin. For example, a Red–Blue chromaticity histogram can be formed by first normalizing color pixel values, dividing the RGB values by R+G+B, and then quantizing the normalized R and B coordinates into N bins each. With N = 4, this might yield a 2D histogram that looks like this table:

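| | red 0-63 | red 64-127 | red 128-191 | red 192-255 |
|---|---|---|---|---|
| **blue 0-63** | 43 | 78 | 18 | 0 |
| **blue 64-127** | 45 | 67 | 33 | 2 |
| **blue 128-191** | 127 | 58 | 25 | 8 |
| **blue 192-255** | 140 | 47 | 47 | 13 |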
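A minimal sketch of this Red–Blue chromaticity histogram follows. It assumes pixels are `(r, g, b)` tuples and bins the normalized coordinates directly on the interval [0, 1] rather than rescaling them to 0-255 as the table labels suggest; the function name `rb_chromaticity_histogram` and the choice to skip pure black pixels are assumptions of this example.

```python
def rb_chromaticity_histogram(pixels, n_bins=4):
    """2D histogram over normalized red and blue chromaticities.

    Returns an n_bins x n_bins list of lists indexed as hist[blue_bin][red_bin].
    """
    hist = [[0] * n_bins for _ in range(n_bins)]
    for r, g, b in pixels:
        total = r + g + b
        if total == 0:
            # Chromaticity is undefined for pure black; skipped here (assumption).
            continue
        r_norm = r / total
        b_norm = b / total
        # Values lie in [0, 1]; clamp so that exactly 1.0 falls in the last bin.
        r_bin = min(int(r_norm * n_bins), n_bins - 1)
        b_bin = min(int(b_norm * n_bins), n_bins - 1)
        hist[b_bin][r_bin] += 1
    return hist

# Hypothetical pixels: two reds of different brightness, one blue, one gray, one black.
pixels = [(255, 0, 0), (200, 30, 20), (0, 0, 255), (90, 90, 90), (0, 0, 0)]
hist = rb_chromaticity_histogram(pixels)
```

Because each pixel is divided by its own R+G+B sum, scaling a pixel's brightness (for example from `(200, 30, 20)` to `(100, 15, 10)`) leaves its bin unchanged.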

Similarly a histogram can be made three-dimensional, though it is harder to display.[1]

The histogram provides a compact summary of the distribution of data in an image. The color histogram of an image is relatively invariant to translation and to rotation about the viewing axis, and varies only slowly with the angle of view.[2] Because the histogram signatures of two images can be compared to match the color content of one image with the other, the color histogram is particularly well suited to the problem of recognizing an object of unknown position and rotation within a scene. Importantly, converting an RGB image to the illumination-invariant rg chromaticity space allows the histogram to operate well under varying light levels.

The main drawback of histograms for classification is that the representation depends only on the color of the object being studied and ignores its shape and texture. Color histograms can be identical for two images with different object content that happen to share color information. Conversely, without spatial or shape information, similar objects of different color cannot be matched based solely on color histogram comparisons. There is no way to distinguish a red and white cup from a red and white plate. Put another way, histogram-based algorithms have no concept of a generic 'cup', and a model of a red and white cup is of no use when given an otherwise identical blue and white cup. Another problem is that color histograms are highly sensitive to noise such as changes in lighting intensity and quantization errors. The high dimensionality (number of bins) of color histograms is a further issue: some color histogram feature spaces occupy more than one hundred dimensions.[8]
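The loss of spatial information can be seen in a small sketch. The two 4x4 images below are hypothetical: they contain the same number of red and white pixels in different layouts, so their histograms coincide even though the images differ.

```python
from collections import Counter

RED, WHITE = (255, 0, 0), (255, 255, 255)

# Two hypothetical 4x4 images with the same colors in different layouts.
stripes = [[RED, WHITE, RED, WHITE] for _ in range(4)]
halves = [[RED] * 4 for _ in range(2)] + [[WHITE] * 4 for _ in range(2)]

def histogram(image):
    """Count occurrences of each exact color, ignoring pixel positions."""
    return Counter(pixel for row in image for pixel in row)

same_histogram = histogram(stripes) == histogram(halves)  # True
same_image = stripes == halves                            # False
```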

Proposed solutions include color histogram intersection, color constant indexing, cumulative color histograms, quadratic distance, and color correlograms.[8] See the external link to Stanford for an in-depth look at the equations.

Although there are drawbacks to using histograms for indexing and classification, using color in a real-time system has several relative advantages. One is that color information is faster to compute than other "invariants." It has been shown in some cases that color can be an efficient means of identifying objects of known location and appearance (refer to the external link for the findings of the study).[8]

Further research into the relationship between color histogram data and the physical properties of the objects in an image has shown that histograms can represent not only object color and illumination but also relate to surface roughness and image geometry, and can provide improved estimates of illumination and object color.[3]

Euclidean distance, histogram intersection, cosine distance, or quadratic distance is usually used to calculate the similarity rating of images.[4] None of these values reflects the similarity of two images in itself; each is useful only in comparison with the values obtained for other images. For this reason, practical implementations of content-based image retrieval must compute the measure for every image in the database, which is the main disadvantage of these implementations.
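Three of these measures can be sketched for histograms flattened into equal-length lists. The example is illustrative only: the function names are hypothetical, histograms are normalized to sum to 1 before comparison (an assumption), and the counts reuse three rows of the example table above as stand-in data. Quadratic distance is omitted because it needs a bin-similarity matrix that the text does not specify.

```python
import math

def normalize(hist):
    """Scale counts so that they sum to 1."""
    total = sum(hist)
    return [h / total for h in hist]

def euclidean_distance(p, q):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def intersection(p, q):
    """Histogram intersection of normalized histograms; 1.0 means identical."""
    return sum(min(a, b) for a, b in zip(p, q))

def cosine_similarity(p, q):
    dot = sum(a * b for a, b in zip(p, q))
    norm_p = math.sqrt(sum(a * a for a in p))
    norm_q = math.sqrt(sum(b * b for b in q))
    return dot / (norm_p * norm_q)

# Hypothetical query and database histograms.
query = normalize([43, 78, 18, 0])
database = {
    "a": normalize([45, 67, 33, 2]),
    "b": normalize([140, 47, 47, 13]),
}

# Every database entry must be scored before the results can be ranked.
ranked = sorted(database,
                key=lambda name: intersection(query, database[name]),
                reverse=True)
```

The ranking step reflects the point made above: a single score says little on its own, so the query is compared against every stored histogram and the scores are ordered relative to each other.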

Another approach to representing color image content is the 2D color histogram, which considers the relation between the colors of pixel pairs (not only the lighting component).[5] A 2D color histogram is a two-dimensional array of size Cmax × Cmax, where Cmax is the number of colors used in the color quantization phase. These arrays are treated as matrices, each element of which stores a normalized count of the pixel pairs, within each pixel neighbourhood, whose two colors correspond to the indices of that element. To compare 2D color histograms, it is suggested to calculate their correlation, because a 2D color histogram constructed as described above is a random vector (in other words, a multidimensional random value). When the set of retrieved images is assembled, the images should be arranged in decreasing order of the correlation coefficient. The correlation coefficient may also be used to compare ordinary color histograms. Retrieval results with the correlation coefficient are reported to be better than with other metrics.[6]
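A minimal sketch of a 2D color histogram and its comparison by correlation is given below. The text does not define the pixel neighbourhood, so this example assumes that each pixel is paired with its right and lower neighbours; the function names and the two small test images (already quantized to color indices 0 and 1) are hypothetical.

```python
import math

def pair_histogram(indexed, c_max):
    """Normalized counts of color pairs (i, j) over adjacent pixels.

    indexed: 2D list of quantized color indices in [0, c_max).
    Returns a c_max x c_max matrix whose entries sum to 1.
    """
    counts = [[0] * c_max for _ in range(c_max)]
    rows, cols = len(indexed), len(indexed[0])
    total = 0
    for y in range(rows):
        for x in range(cols):
            for dy, dx in ((0, 1), (1, 0)):  # right and lower neighbour (assumption)
                ny, nx = y + dy, x + dx
                if ny < rows and nx < cols:
                    counts[indexed[y][x]][indexed[ny][nx]] += 1
                    total += 1
    return [[c / total for c in row] for row in counts]

def correlation(h1, h2):
    """Pearson correlation coefficient between two flattened matrices.

    Undefined (division by zero) if either matrix is constant.
    """
    a = [v for row in h1 for v in row]
    b = [v for row in h2 for v in row]
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

# Same color counts, different layouts.
stripes = [[0, 1, 0, 1] for _ in range(4)]
halves = [[0] * 4, [0] * 4, [1] * 4, [1] * 4]

h_stripes = pair_histogram(stripes, c_max=2)
h_halves = pair_histogram(halves, c_max=2)
self_corr = correlation(h_stripes, h_stripes)
cross_corr = correlation(h_stripes, h_halves)
```

Unlike an ordinary color histogram, the pair histogram distinguishes these two layouts: the correlation of an image with itself is 1, while the correlation between the two images is noticeably lower.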

## Applications of color histograms

In photography, color histograms in either 2D or 3D spaces are frequently used in digital cameras for estimating the scene illumination, as part of the camera's automatic white balance algorithm. See image histogram for information about image histograms. In remote sensing, color histograms are typical features used for classifying different ground regions from aerial or satellite photographs. In the case of multi-spectral images, the histograms may have four or more dimensions. In computer vision, color histograms can be used in object recognition and in image retrieval systems and databases. For an example, visit the State Hermitage Museum QBIC system, listed in the external links, which retrieves a large number of images based on the color layout that the user is looking for.

Color histograms are commonly used as an appearance-based signature to classify images in content-based image retrieval (CBIR) systems.[7] Adding further information to the global color histogram signature, such as spatial information, or dividing an image into regions and storing a local histogram for each region, makes the signature of each image increasingly robust. Local color histograms are robust to partial occlusion and can be more efficient than global histograms for image retrieval in some cases.[8] For example, illumination-insensitive object extraction can be achieved by applying a weighted color histogram based on color ratios to local histograms.[9] Another technique for increasing the robustness of color histograms is to incorporate directional edge information in order to retain spatial information.[8]
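The idea of storing local histograms per region can be sketched as follows. The split into a regular `grid x grid` arrangement of blocks is only one possible way of dividing the image and is an assumption of this example, as are the function name `local_histograms` and its parameters.

```python
from collections import Counter

def local_histograms(image, grid=2, bins_per_channel=4, levels=256):
    """Histogram each block of a grid x grid partition of the image.

    image: 2D list of (r, g, b) tuples.
    Returns a dict mapping (block_row, block_col) -> Counter of color bins.
    """
    rows, cols = len(image), len(image[0])
    result = {}
    for y, row in enumerate(image):
        for x, (r, g, b) in enumerate(row):
            # Block that contains pixel (y, x).
            block = (y * grid // rows, x * grid // cols)
            key = tuple(c * bins_per_channel // levels for c in (r, g, b))
            result.setdefault(block, Counter())[key] += 1
    return result

RED, WHITE = (255, 0, 0), (255, 255, 255)
# Hypothetical 4x4 image: red upper half, white lower half.
image = [[RED] * 4 for _ in range(2)] + [[WHITE] * 4 for _ in range(2)]
local = local_histograms(image)
# local[(0, 0)] holds only red pixels, local[(1, 1)] only white ones.
```

The per-block signatures record roughly where each color occurs, which a single global histogram cannot do.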

In one large-scale image database application, over 15,000 images could be queried in under two seconds by refining color histograms using a technique called the color coherence vector.[10]
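The text does not describe the technique itself. As a rough sketch of the general idea behind a color coherence vector, each pixel can be classified as coherent if it belongs to a sufficiently large connected region of the same quantized color, and as incoherent otherwise, with both counts kept per color. The example below assumes 8-connectivity and an input that has already been quantized to color indices, and it skips any smoothing step; the function name and the threshold parameter `tau` are hypothetical.

```python
from collections import deque

def color_coherence_vector(indexed, n_colors, tau):
    """Return [coherent, incoherent] pixel counts for each color index.

    A pixel is coherent if its same-color connected region has more than tau pixels.
    """
    rows, cols = len(indexed), len(indexed[0])
    seen = [[False] * cols for _ in range(rows)]
    ccv = [[0, 0] for _ in range(n_colors)]
    for y in range(rows):
        for x in range(cols):
            if seen[y][x]:
                continue
            color = indexed[y][x]
            seen[y][x] = True
            queue = deque([(y, x)])
            size = 0
            # Breadth-first search over the 8-connected same-color region.
            while queue:
                cy, cx = queue.popleft()
                size += 1
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = cy + dy, cx + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and not seen[ny][nx]
                                and indexed[ny][nx] == color):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            ccv[color][0 if size > tau else 1] += size
    return ccv

# Same color counts, different layouts (hypothetical 4x4 images).
halves = [[0] * 4, [0] * 4, [1] * 4, [1] * 4]
stripes = [[0, 1, 0, 1] for _ in range(4)]
ccv_halves = color_coherence_vector(halves, n_colors=2, tau=4)
ccv_stripes = color_coherence_vector(stripes, n_colors=2, tau=4)
```

Both images have identical plain histograms (eight pixels of each color), but the large regions in `halves` are counted as coherent while the narrow columns in `stripes` are not.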

## Intensity histogram of continuous data

The idea of an intensity histogram can be generalized to continuous data, such as audio signals represented by real functions or images represented by functions with a two-dimensional domain.

Let $f \in L^1(\mathbb{R}^n)$ (see Lebesgue space). Then the cumulative histogram operator $H$ can be defined by:

$H(f)(y) = \mu\{x : f(x)\le y\}$.

Here $\mu$ is the Lebesgue measure of sets, and $H(f)$ is in turn a real function. The (non-cumulative) histogram is defined as its derivative:

$h(f) = H(f)'$.
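A numerical sketch can make these definitions concrete. It assumes, purely for illustration, that the function is restricted to the bounded domain [0, 1] and sampled at equally spaced midpoints, so that the measure of a set is approximated by the fraction of samples falling in it; the function names and the test function f(x) = x² are choices made for this example.

```python
def cumulative_histogram(samples, y, domain_measure=1.0):
    """Approximate H(f)(y): the measure of the set where f <= y."""
    below = sum(1 for value in samples if value <= y)
    return domain_measure * below / len(samples)

def histogram_density(samples, y, step=0.05, domain_measure=1.0):
    """Central-difference approximation of h(f)(y) = H(f)'(y)."""
    upper = cumulative_histogram(samples, y + step, domain_measure)
    lower = cumulative_histogram(samples, y - step, domain_measure)
    return (upper - lower) / (2 * step)

n = 1000
# f(x) = x^2 sampled at the midpoints of n equal subintervals of [0, 1].
samples = [((i + 0.5) / n) ** 2 for i in range(n)]

H_quarter = cumulative_histogram(samples, 0.25)  # exact value: sqrt(0.25) = 0.5
h_quarter = histogram_density(samples, 0.25)     # exact value: 1 / (2 * sqrt(0.25)) = 1
```

For this function the set where f(x) ≤ y is the interval [0, √y], so H(f)(y) = √y and h(f)(y) = 1/(2√y) for 0 < y < 1, which the sampled estimates approximate.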

## References

1. Nello Zuech and Richard K. Miller (1989). Machine Vision. Springer. ISBN 0442237375.
2. Shapiro, Linda G. and Stockman, George C. "Computer Vision". Prentice Hall, 2003. ISBN 0130307963.
3. Anatomy of a color histogram; Novak, C.L.; Shafer, S.A.; Computer Vision and Pattern Recognition, 1992. Proceedings CVPR '92., 1992 IEEE Computer Society Conference on 15-18 June 1992 Page(s): 599-605 doi:10.1109/CVPR.1992.223129
4. Integrated Spatial and Feature Image Systems: Retrieval, Analysis and Compression; Smith, J.R.; Graduate School of Arts and Sciences, Columbia University, 1997
5. Effectiveness estimation of image retrieval by 2D color histogram; Bashkov, E.A.; Kostyukova, N.S.; Journal of Automation and Information Sciences, 2006 (6) Page(s): 84-89
6. Content-Based Image Retrieval Using Color Histogram Correlation; Bashkov, E.A.; Shozda, N.S.; Graphicon proceedings, 2002 Page(s): 458-461
7. The Capacity of Color Histogram Indexing; Stricker, M.; Swain, M.; Computer Vision and Pattern Recognition, 1994. Proceedings CVPR '94., 1994 IEEE Computer Society Conference on 21-23 June 1994 Page(s): 704-708 doi:10.1109/CVPR.1994.323774
8. Edge color histogram for image retrieval; Seong-O Shim; Tae-Sun Choi; Image Processing. 2002. Proceedings. 2002 International Conference on Volume 3, 24-28 June 2002 Page(s): 957-960 vol.3 doi:10.1109/ICIP.2002.1039133
9. Robust object extraction with illumination-insensitive color descriptions; Hashizume, C.; Vinod, V.V.; Murase, H.; Image Processing, 1998. ICIP 98. Proceedings. 1998 International Conference on 4-7 Oct. 1998 Page(s): 50-54 vol.3 doi:10.1109/ICIP.1998.998995
10. Histogram refinement for content-based image retrieval; Pass, G.; Zabih, R.; Applications of Computer Vision, 1996. WACV '96., Proceedings 3rd IEEE Workshop on 2-4 Dec. 1996 Page(s): 96-102 doi:10.1109/ACV.1996.572008