In probability theory and statistics, the computational formula for the variance Var(X) of a random variable X is the formula

Var(X) = E(X^{2}) - (E(X))^{2},
where E(X) is the expected value of X. This formula can be generalized for covariance:

Cov(X, Y) = E(XY) - E(X) E(Y),
as well as for the n by n covariance matrix of a random vector of length n:

Var(X) = E(X X^{T}) - E(X) E(X)^{T},
and for the n by m cross-covariance matrix between two random vectors of length n and m:

Cov(X, Y) = E(X Y^{T}) - E(X) E(Y)^{T},

where expectations are taken elementwise and X and Y are random vectors of respective lengths n and m.
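The scalar identities above can be checked numerically. The following sketch uses NumPy and an arbitrary small discrete distribution (the values and probabilities are illustrative assumptions, not from the text):

```python
import numpy as np

# An arbitrary discrete distribution: values and their probabilities.
vals = np.array([1.0, 2.0, 4.0, 7.0])
probs = np.array([0.1, 0.2, 0.3, 0.4])  # sums to 1

E_X = np.sum(probs * vals)        # E(X)
E_X2 = np.sum(probs * vals ** 2)  # E(X^2)

# Definition of the variance: E((X - E(X))^2).
var_def = np.sum(probs * (vals - E_X) ** 2)
# Computational formula: E(X^2) - (E(X))^2.
var_comp = E_X2 - E_X ** 2
assert np.isclose(var_def, var_comp)

# Covariance version, with Y = X^2 as an example of a dependent variable.
Y = vals ** 2
E_Y = np.sum(probs * Y)
E_XY = np.sum(probs * vals * Y)
cov_def = np.sum(probs * (vals - E_X) * (Y - E_Y))  # E((X-E(X))(Y-E(Y)))
cov_comp = E_XY - E_X * E_Y                         # E(XY) - E(X)E(Y)
assert np.isclose(cov_def, cov_comp)
```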
A closely related identity can be used to calculate the sample variance, which is often used as an unbiased estimate of the population variance:

S^{2} = (Σ_{i=1}^{n} X_{i}^{2} - n \bar{X}^{2}) / (n - 1),

where \bar{X} = (1/n) Σ_{i=1}^{n} X_{i} is the sample mean.
These results are often used in practice to calculate the variance when it is inconvenient to center a random variable by subtracting its expected value or to center a set of data by subtracting the sample mean. However in some cases it is an easier calculation to carry out the centering first and then directly apply the definition of the variance.
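The two routes to the sample variance described above can be compared directly in code. This sketch uses NumPy and an arbitrary example data set:

```python
import numpy as np

data = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
n = data.size
mean = data.mean()

# Centering first, then applying the definition directly.
s2_centered = np.sum((data - mean) ** 2) / (n - 1)

# Computational formula: only sums of X_i and X_i^2 are needed,
# so no second (centering) pass over the data is required.
s2_computational = (np.sum(data ** 2) - n * mean ** 2) / (n - 1)

assert np.isclose(s2_centered, s2_computational)
```

One caveat worth noting: in floating-point arithmetic the computational form subtracts two large, nearly equal quantities when the mean is large relative to the spread, so the centered form is usually preferred in numerical code.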
The computational formula for the population variance follows in a straightforward manner from the linearity of expected values and the definition of variance:

Var(X) = E((X - E(X))^{2}) = E(X^{2} - 2 X E(X) + (E(X))^{2}) = E(X^{2}) - 2 E(X) E(X) + (E(X))^{2} = E(X^{2}) - (E(X))^{2}.
To prove the result for the sample variance
note that the sample variance can be expressed as

S^{2} = (n / (n - 1)) Var(X^{*}),
where X^{*} is sampled uniformly with replacement from the observed data X_{1}, ..., X_{n} and the variance on the right side is a population variance. Therefore the computational formula for the sample variance follows directly from the computational formula for the population variance. Alternatively, the result can be derived by a direct algebraic calculation using the identity:

Σ_{i=1}^{n} (X_{i} - \bar{X})^{2} = Σ_{i=1}^{n} X_{i}^{2} - n \bar{X}^{2}.
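The resampling argument can also be checked numerically: X^{*} puts probability 1/n on each observed value, so its population variance is the equally weighted mean squared deviation. A sketch with an arbitrary example data set:

```python
import numpy as np

data = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
n = data.size
mean = data.mean()

# Sample variance of the data (unbiased, denominator n - 1).
s2 = np.sum((data - mean) ** 2) / (n - 1)

# Population variance of X*, sampled uniformly with replacement
# from the data: each observed value has probability 1/n.
var_star = np.sum((data - mean) ** 2) / n

# S^2 = n / (n - 1) * Var(X*).
assert np.isclose(s2, n / (n - 1) * var_star)

# The algebraic identity: sum (X_i - mean)^2 = sum X_i^2 - n * mean^2.
assert np.isclose(np.sum((data - mean) ** 2),
                  np.sum(data ** 2) - n * mean ** 2)
```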
Applications of the computational formula in systolic geometry include Loewner's torus inequality.
