Purpose of a histogram
A histogram is used to graphically summarize and display the distribution of a process data set.
Each rectangle is erected over an interval, with an area equal to the frequency of the interval. The height of a rectangle is therefore equal to the frequency density of the interval, i.e. the frequency divided by the width of the interval. The total area of the histogram is equal to the number of data points.
Shape is relative, even though you may have heard that a good histogram has a smooth, bell-curve shape.
The categories (intervals) must be adjacent, and are often, but not necessarily, chosen to be of the same size.

Etymology
[Figure: an example histogram of the heights of 31 Black Cherry trees.]
The word histogram is derived from the Greek histos 'anything set upright' (as the masts of a ship, the bar of a loom, or the vertical bars of a histogram) and gramma 'drawing, record, writing'. The term was introduced by Karl Pearson in 1895.

Examples
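In addition, you may notice that the right-hand side of each hill is higher than the left-hand side most of the time.
This is likely to have arisen from people rounding their reported journey time, which is a common phenomenon when collecting data from people. For example, in the data below the interval starting at 30 holds 16369 observations, far more than its neighbours starting at 25 (7190) and 35 (3212).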
This diagram uses Q/width from the table.
Data by absolute numbers
Interval | Width | Quantity | Quantity/width
0 | 5 | 4180 | 836
5 | 5 | 13687 | 2737
10 | 5 | 18618 | 3723
15 | 5 | 19634 | 3926
20 | 5 | 17981 | 3596
25 | 5 | 7190 | 1438
30 | 5 | 16369 | 3273
35 | 5 | 3212 | 642
40 | 5 | 4122 | 824
45 | 15 | 9200 | 613
60 | 30 | 6461 | 215
90 | 60 | 3435 | 57
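The Quantity/width column can be reproduced with a few lines of Python. This is a minimal sketch, not part of the original data source: the name `rows` is chosen here for illustration, and the use of integer division is an assumption based on the observation that the tabulated values match truncated (not rounded) quotients.

```python
# Rows copied from the table: (interval start, width, quantity).
rows = [
    (0, 5, 4180), (5, 5, 13687), (10, 5, 18618), (15, 5, 19634),
    (20, 5, 17981), (25, 5, 7190), (30, 5, 16369), (35, 5, 3212),
    (40, 5, 4122), (45, 15, 9200), (60, 30, 6461), (90, 60, 3435),
]

# Frequency density = quantity / width: the height of each rectangle.
# Integer division is an assumption that reproduces the table's whole numbers.
heights = [quantity // width for _, width, quantity in rows]

# Area of each rectangle = exact height * width, which recovers the quantity,
# so the total area equals the total number of observations.
total_area = sum((quantity / width) * width for _, width, quantity in rows)

print(heights)
print(round(total_area))
```

Because the wide intervals (widths 15, 30 and 60) are divided by their width, their rectangles are drawn lower than their raw counts would suggest, which keeps areas comparable across unequal bins.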
Area under the curve equals 1. This diagram uses Q/total/width from the table.
Data by proportion
Interval | Width | Quantity (Q) | Q/total/width
0 | 5 | 4180 | 0.0067
5 | 5 | 13687 | 0.0221
10 | 5 | 18618 | 0.0300
15 | 5 | 19634 | 0.0316
20 | 5 | 17981 | 0.0290
25 | 5 | 7190 | 0.0116
30 | 5 | 16369 | 0.0264
35 | 5 | 3212 | 0.0052
40 | 5 | 4122 | 0.0066
45 | 15 | 9200 | 0.0049
60 | 30 | 6461 | 0.0017
90 | 60 | 3435 | 0.0005
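The claim that the area equals 1 when heights are Q/total/width can be checked numerically. The sketch below is illustrative only; the variable names are hypothetical, and the total is simply the sum of the Quantity column.

```python
# Rows copied from the table: (interval start, width, quantity Q).
rows = [
    (0, 5, 4180), (5, 5, 13687), (10, 5, 18618), (15, 5, 19634),
    (20, 5, 17981), (25, 5, 7190), (30, 5, 16369), (35, 5, 3212),
    (40, 5, 4122), (45, 15, 9200), (60, 30, 6461), (90, 60, 3435),
]

# Total number of observations across all intervals.
total = sum(q for _, _, q in rows)

# Height of each rectangle when plotting proportions: Q / total / width.
densities = [q / total / width for _, width, q in rows]

# Each rectangle's area is height * width = Q / total; the areas sum to 1.
area = sum(d * width for d, (_, width, _) in zip(densities, rows))

print([round(d, 4) for d in densities])
print(round(area, 10))
```

Normalizing this way changes only the vertical scale: the shape of the histogram is the same as in the absolute-number version.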

Mathematical definition
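[Figure: a histogram of a random sample of 10,000 points from a normal distribution with a mean of 0 and a standard deviation of 1.]
In a mathematical sense, a histogram is a function m_{i} that counts the number of observations falling into each of the disjoint bins. Thus, if we let n be the total number of observations and k be the total number of bins, the histogram m_{i} meets the following condition:
n = \sum_{i=1}^{k} m_{i}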
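The condition that the bin counts m_{i} add up to n can be illustrated with a small counting function. This is a sketch under stated assumptions: the function name `histogram`, the choice of k equal-width bins spanning the data range, and the fixed random seed are all illustrative and not taken from the text.

```python
import random


def histogram(data, k):
    """Count how many observations fall into each of k equal-width bins.

    Assumes the data are not all identical, so the bin width is non-zero.
    """
    lo, hi = min(data), max(data)
    width = (hi - lo) / k
    counts = [0] * k
    for x in data:
        # The maximum value would index bin k, so clamp it into the last bin.
        i = min(int((x - lo) / width), k - 1)
        counts[i] += 1
    return counts


random.seed(0)  # fixed seed so the sketch is repeatable
# 10,000 points from a normal distribution with mean 0 and standard deviation 1.
sample = [random.gauss(0, 1) for _ in range(10_000)]

m = histogram(sample, k=20)
n = len(sample)
print(sum(m) == n)  # every observation lands in exactly one bin
```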
Cumulative histogram
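A cumulative histogram counts the cumulative number of observations in all of the bins up to the specified bin. That is, the cumulative histogram M_{i} of a histogram m_{j} is defined as:
M_{i} = \sum_{j=1}^{i} m_{j}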
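A cumulative histogram is a running total of the bin counts, so that M_{i} is the sum of m_{1} through m_{i}. A minimal sketch, using made-up counts purely for illustration:

```python
from itertools import accumulate

# Hypothetical bin counts m_1..m_5 (illustrative values, not from the article).
m = [2, 5, 9, 4, 1]

# M_i = m_1 + ... + m_i: the running total of the counts up to bin i.
M = list(accumulate(m))

print(M)  # [2, 7, 16, 20, 21]
```

The last entry of the cumulative histogram always equals the total number of observations n.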
Number of bins and width
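The number of bins k can be calculated directly, or from a suggested bin width h:
k = \lceil (\max x - \min x) / h \rceil
- Sturges' formula^{[5]}
k = \lceil \log_{2} n + 1 \rceil
which implicitly bases the bin sizes on the range of the data, and can perform poorly if n < 30.
- Scott's choice^{[6]}
h = 3.5 \hat{\sigma} / n^{1/3}
where \hat{\sigma} is the sample standard deviation.
- Square-root choice
k = \sqrt{n}
which takes the square root of the number of data points in the sample (used by Excel histograms and many others).
- Freedman–Diaconis' choice^{[7]}
h = 2 \, \mathrm{IQR}(x) / n^{1/3}
which is based on the interquartile range. A good discussion of this and other rules for choice of bin widths is in Venables and Ripley, "Modern Applied Statistics with S", section 5.6^{[8]}.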
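The four rules can be compared side by side in code. The sketch below uses the commonly cited form of each rule (Sturges: k = ceil(log2 n + 1); Scott: h = 3.5 s / n^(1/3); square root: k = sqrt(n); Freedman–Diaconis: h = 2 IQR / n^(1/3)). The function names are hypothetical, rounding the square-root rule up to a whole number is an assumption, and the quartiles come from Python's `statistics.quantiles`, whose default method may differ slightly from other software.

```python
import math
import random
import statistics


def bins_from_width(data, h):
    """Number of bins needed to cover the data range with bin width h."""
    return math.ceil((max(data) - min(data)) / h)


def sturges_bins(n):
    """Sturges' formula: k = ceil(log2(n) + 1)."""
    return math.ceil(math.log2(n) + 1)


def square_root_bins(n):
    """Square-root choice, rounded up to a whole number of bins (assumption)."""
    return math.ceil(math.sqrt(n))


def scott_width(data):
    """Scott's choice: h = 3.5 * sample standard deviation / n^(1/3)."""
    return 3.5 * statistics.stdev(data) / len(data) ** (1 / 3)


def freedman_diaconis_width(data):
    """Freedman-Diaconis' choice: h = 2 * IQR / n^(1/3)."""
    q1, _, q3 = statistics.quantiles(data, n=4)
    return 2 * (q3 - q1) / len(data) ** (1 / 3)


random.seed(0)  # fixed seed so the sketch is repeatable
sample = [random.gauss(0, 1) for _ in range(10_000)]
n = len(sample)

print("Sturges:", sturges_bins(n))
print("Square root:", square_root_bins(n))
print("Scott:", bins_from_width(sample, scott_width(sample)))
print("Freedman-Diaconis:", bins_from_width(sample, freedman_diaconis_width(sample)))
```

The rules based on a bin width (Scott, Freedman–Diaconis) adapt to the spread of the data, whereas Sturges' formula and the square-root choice depend only on the sample size n.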

References
- Howitt, D. and Cramer, D. (2008). "Statistics in Psychology". Prentice Hall.
- Nancy R. Tague (2004). "Seven Basic Quality Tools". The Quality Toolbox. Milwaukee, Wisconsin: American Society for Quality. p. 15. http://www.asq.org/learn-about-quality/seven-basic-quality-tools/overview/overview.html. Retrieved 2010-02-05.
- M. Eileen Magnello (December 1856). "Karl Pearson and the Origins of Modern Statistics: An Elastician becomes a Statistician". The New Zealand Journal for the History and Philosophy of Science and Technology 1 volume. ISSN 1177–1380. http://www.rutherfordjournal.org/article010107.html.
- Dean, S., & Illowsky, B. (2009, February 19). Descriptive Statistics: Histogram. Retrieved from the Connexions Web site: http://cnx.org/content/m16298/1.11/
- Sturges, H. A. (1926). "The choice of a class interval". J. American Statistical Association: 65–66.
- Scott, David W. (1979). "On optimal and data-based histograms". Biometrika 66 (3): 605–610. doi:10.1093/biomet/66.3.605.
- Freedman, David; Diaconis, P. (1981). "On the histogram as a density estimator: L_{2} theory". Zeitschrift für Wahrscheinlichkeitstheorie und verwandte Gebiete 57 (4): 453–476. doi:10.1007/BF01025868.
- Venables, W. N. and Ripley, B. D. "Modern Applied Statistics with S". Springer (4th edition). Section 5.6: "Density Estimation".
Further reading
- Lancaster, H.O. An Introduction to Medical Statistics. John Wiley and Sons. 1974. ISBN 0 471 51250-8