Absolute deviation

From Wikipedia, the free encyclopedia

In statistics, the absolute deviation of an element of a data set is the absolute difference between that element and a given point. Typically the point from which the deviation is measured is a measure of central tendency, most often the median or sometimes the mean of the data set.

D_i = |x_i - m(X)|

where

D_i is the absolute deviation,
x_i is the data element,
and m(X) is the chosen measure of central tendency of the data set: sometimes the mean (\overline{x}), but most often the median.
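The definition translates directly into code. The following is a minimal sketch; the function name `absolute_deviations` and the sample values are illustrative choices, not part of the source.

```python
def absolute_deviations(data, center):
    """Return |x_i - m| for every element x_i, where m is the chosen point."""
    return [abs(x - center) for x in data]


# Illustrative data; 3 is the median of this sample.
data = [2, 2, 3, 4, 14]
print(absolute_deviations(data, 3))
```

Passing the mean or any other reference point as `center` gives the deviations from that point instead.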

Measures of dispersion

Several measures of statistical dispersion are defined in terms of the absolute deviation.

Average absolute deviation

The average absolute deviation, or simply the average deviation, of a data set is the average of the absolute deviations. It is a summary statistic of statistical dispersion or variability. It is also called the mean absolute deviation, but that name is easily confused with the median absolute deviation, since both are abbreviated MAD.

The average absolute deviation of a set {x_1, x_2, ..., x_n} is

\frac{1}{n}\sum_{i=1}^n |x_i-m(X)|.

The choice of measure of central tendency, m(X), has a marked effect on the value of the average deviation. For example, for the data set {2, 2, 3, 4, 14}:

Measure of central tendency m(X) | Average absolute deviation
Mean = 5 | \frac{|2 - 5| + |2 - 5| + |3 - 5| + |4 - 5| + |14 - 5|}{5} = 3.6
Median = 3 | \frac{|2 - 3| + |2 - 3| + |3 - 3| + |4 - 3| + |14 - 3|}{5} = 2.8
Mode = 2 | \frac{|2 - 2| + |2 - 2| + |3 - 2| + |4 - 2| + |14 - 2|}{5} = 3.0
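The three values in the table can be reproduced with a short script. This is a sketch using Python's standard `statistics` module; the helper name `average_absolute_deviation` is an illustrative choice.

```python
from statistics import mean, median, mode


def average_absolute_deviation(data, center):
    """Average of |x_i - center| over the data set."""
    return sum(abs(x - center) for x in data) / len(data)


data = [2, 2, 3, 4, 14]
centers = {"mean": mean(data), "median": median(data), "mode": mode(data)}
for name, center in centers.items():
    print(name, center, average_absolute_deviation(data, center))
```

The same helper works for any reference point, which makes it easy to compare candidate measures of central tendency on other data sets.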

The average absolute deviation from the median is less than or equal to the average absolute deviation from the mean. In fact, the average absolute deviation from the median is always less than or equal to the average absolute deviation from any other fixed number.
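One way to see why the median is the minimizer, sketched here as a standard argument rather than one taken from the source, is to write f(c) for the average absolute deviation about a point c and look at its slope between data points:

```latex
f(c) = \frac{1}{n}\sum_{i=1}^n |x_i - c|, \qquad
f'(c) = \frac{1}{n}\Bigl(\#\{i : x_i < c\} - \#\{i : x_i > c\}\Bigr)
\quad \text{for } c \notin \{x_1, \dots, x_n\}.
```

For c below the median, at most half of the points lie to the left of c, so f'(c) is at most 0. For c above the median, at most half lie to the right, so f'(c) is at least 0. Because f is continuous and piecewise linear, it attains its minimum at the median.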

The average absolute deviation from the mean is less than or equal to the standard deviation; one way of proving this relies on Jensen's inequality.
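A sketch of the Jensen argument, assuming the standard deviation \sigma is computed with divisor n: since t \mapsto t^2 is convex, the square of an average is at most the average of the squares, so

```latex
\left(\frac{1}{n}\sum_{i=1}^n |x_i-\overline{x}|\right)^2
\le \frac{1}{n}\sum_{i=1}^n |x_i-\overline{x}|^2 = \sigma^2 .
```

Taking square roots gives the inequality. The standard deviation computed with divisor n - 1 is larger still, so the bound holds for that version as well.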

If x is a Gaussian random variable with a mean of 0 then, in expectation for large n, the ratio of the mean absolute deviation to the standard deviation should satisfy the following equality [1]

\frac{\frac{1}{n}\sum|x_i|}{\sqrt{\frac{1}{n}\sum x_i^2}} = \sqrt{\frac{2}{\pi}}.

In other words, for a Gaussian, mean absolute deviation is about 0.8 times the standard deviation.
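The ratio can be checked numerically. The simulation below is an illustrative sketch: the sample size, the seed, and the variable names are arbitrary choices, and the result only approximates \sqrt{2/\pi} because the sample is finite.

```python
import math
import random

random.seed(0)  # fixed seed so the run is repeatable
n = 100_000
xs = [random.gauss(0.0, 1.0) for _ in range(n)]  # zero-mean Gaussian sample

mean_abs = sum(abs(x) for x in xs) / n                 # (1/n) * sum |x_i|
root_mean_sq = math.sqrt(sum(x * x for x in xs) / n)   # sqrt((1/n) * sum x_i^2)

ratio = mean_abs / root_mean_sq
print(ratio, math.sqrt(2 / math.pi))  # the two values should be close
```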

Mean absolute deviation

The mean absolute deviation (MAD) is the mean of the absolute deviations from the mean. A related quantity, the mean absolute error (MAE), is a common measure of forecast error in time series analysis; it measures the average absolute deviation of observations from their forecasts.
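The mean absolute error has the same shape as the average absolute deviation, with each forecast taking the place of the central point. A minimal sketch follows; the function name and the observed and forecast values are made up for illustration.

```python
def mean_absolute_error(observed, forecast):
    """Average of |observation - forecast| over paired values."""
    if len(observed) != len(forecast):
        raise ValueError("observed and forecast must have the same length")
    return sum(abs(o - f) for o, f in zip(observed, forecast)) / len(observed)


# Hypothetical series, not taken from the source.
observed = [10.0, 12.0, 9.0, 11.0]
forecast = [11.0, 11.0, 10.0, 13.0]
print(mean_absolute_error(observed, forecast))  # absolute errors are 1, 1, 1, 2
```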

Although the term mean deviation is often used as a synonym for mean absolute deviation, the two are not strictly the same: taken literally (that is, omitting the absolute value operation), the mean deviation of any data set from its mean is always zero.
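The signed deviations from the mean cancel because the mean is itself the average of the data:

```latex
\frac{1}{n}\sum_{i=1}^n (x_i - \overline{x})
= \frac{1}{n}\sum_{i=1}^n x_i - \overline{x}
= \overline{x} - \overline{x} = 0 .
```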

Median absolute deviation

The median absolute deviation (also MAD) is the median of the absolute deviations from the median. It is a robust estimator of dispersion.

For the example {2, 2, 3, 4, 14}: 3 is the median, so the absolute deviations from the median are {1, 1, 0, 1, 11} (or reordered as {0, 1, 1, 1, 11}) with a median absolute deviation of 1, in this case unaffected by the value of the outlier 14.
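The calculation above can be sketched with the standard `statistics` module. The function name is an illustrative choice, and no scaling constant is applied to the result.

```python
from statistics import median


def median_absolute_deviation(data):
    """Median of the absolute deviations from the median (unscaled)."""
    m = median(data)
    return median([abs(x - m) for x in data])


print(median_absolute_deviation([2, 2, 3, 4, 14]))
# Making the outlier far more extreme leaves the result unchanged.
print(median_absolute_deviation([2, 2, 3, 4, 1400]))
```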

Maximum absolute deviation

The maximum absolute deviation about a point is the maximum of the absolute deviations of a sample from that point. It is realized by the sample maximum or sample minimum and cannot be less than half the range.
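A short sketch of this quantity on the example data set {2, 2, 3, 4, 14} follows. The function and variable names are illustrative.

```python
def maximum_absolute_deviation(data, center):
    """Largest |x_i - center| over the data set."""
    return max(abs(x - center) for x in data)


data = [2, 2, 3, 4, 14]
half_range = (max(data) - min(data)) / 2   # half the range of the sample
mid_range = (max(data) + min(data)) / 2    # midpoint of the extremes

print(maximum_absolute_deviation(data, 3))          # about the median
print(maximum_absolute_deviation(data, mid_range))  # equals half the range
```

About the median the maximum is set by the sample maximum 14. About the mid-range the sample minimum and maximum are equally far away, and the value equals half the range, the lower bound stated above.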

Minimization

The measures of statistical dispersion derived from absolute deviation characterize various measures of central tendency as minimizing dispersion. The median is the measure of central tendency most associated with the absolute deviation, in that:

L2 norm statistics: just as the mean minimizes the root-mean-square deviation (whose minimum value is the standard deviation),
L1 norm statistics: the median minimizes the average absolute deviation,
L∞ norm statistics: the mid-range minimizes the maximum absolute deviation, and
trimmed L∞ norm statistics: for example, the midhinge (the average of the first and third quartiles), which minimizes the median absolute deviation of the whole distribution, also minimizes the maximum absolute deviation of the distribution after the top and bottom 25% have been trimmed off.
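The first three correspondences can be checked by brute force on the example data set {2, 2, 3, 4, 14}. The grid search below is only an illustrative sketch: the candidate grid and function names are arbitrary choices, and the midhinge case is not covered.

```python
import math

data = [2, 2, 3, 4, 14]
candidates = [k / 2 for k in range(0, 33)]  # 0.0, 0.5, ..., 16.0


def rms_deviation(c):
    """Root-mean-square deviation about c (L2)."""
    return math.sqrt(sum((x - c) ** 2 for x in data) / len(data))


def average_absolute_deviation(c):
    """Average absolute deviation about c (L1)."""
    return sum(abs(x - c) for x in data) / len(data)


def maximum_absolute_deviation(c):
    """Maximum absolute deviation about c (L-infinity)."""
    return max(abs(x - c) for x in data)


best_l2 = min(candidates, key=rms_deviation)
best_l1 = min(candidates, key=average_absolute_deviation)
best_linf = min(candidates, key=maximum_absolute_deviation)
print(best_l2, best_l1, best_linf)  # mean, median, mid-range of the data
```

For this data set the three minimizers are the mean 5, the median 3, and the mid-range 8, respectively.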

Estimation

The mean absolute deviation of a sample is a biased estimator of the mean absolute deviation of the population.
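The bias can be illustrated by simulation. The sketch below assumes a standard Gaussian population, for which the population mean absolute deviation is \sqrt{2/\pi} and the expected sample value about the sample mean is smaller by a factor of \sqrt{(n-1)/n}. The sample size, trial count, seed, and names are arbitrary illustrative choices.

```python
import math
import random


def mean_abs_dev_about_mean(sample):
    """Mean absolute deviation of a sample about its own mean."""
    m = sum(sample) / len(sample)
    return sum(abs(x - m) for x in sample) / len(sample)


random.seed(0)  # fixed seed so the run is repeatable
n, trials = 5, 20_000
estimates = [
    mean_abs_dev_about_mean([random.gauss(0.0, 1.0) for _ in range(n)])
    for _ in range(trials)
]

average_estimate = sum(estimates) / trials
population_value = math.sqrt(2 / math.pi)
print(average_estimate, population_value)  # the sample average falls short
```

With small samples the shortfall is noticeable; it shrinks as n grows.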
