Einstein notation



From Wikipedia, the free encyclopedia

In mathematics, especially in applications of linear algebra to physics, the Einstein notation or Einstein summation convention is a notational convention useful when dealing with coordinate formulas. It was introduced by Albert Einstein in 1916.[1]

According to this convention, when an index variable appears twice in a single term, once in an upper (superscript) and once in a lower (subscript) position, it implies that we are summing over all of its possible values. In typical applications, the index values are 1,2,3 (representing the three dimensions of physical Euclidean space), or 0,1,2,3 or 1,2,3,4 (representing the four dimensions of space-time, or Minkowski space), but they can have any range, even (in some applications) an infinite set. Thus in three dimensions

 y = c_i x^i \,

actually means

 y = \sum_{i=1}^3 c_i x^i = c_1 x^1 + c_2 x^2 + c_3 x^3.

The upper indices are not exponents; they label different axes. Thus, for example, x^2 should be read as "x-two", not "x squared", and corresponds to the traditional y-axis.
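As a minimal numerical sketch of the convention, the sum y = c_i x^i can be reproduced with NumPy's `einsum`, which sums over any index label repeated in its subscript string. The component values below are hypothetical, and the use of NumPy is an assumption of this illustration rather than part of the notation.

```python
import numpy as np

# Hypothetical components: c is a covector (lower index), x a vector (upper index).
c = np.array([1.0, 2.0, 3.0])
x = np.array([4.0, 5.0, 6.0])

# The label "i" appears in both operands, so it is summed over: y = c_i x^i.
y = np.einsum("i,i->", c, x)

# The same quantity written as an explicit sum over the three axes.
y_explicit = sum(c[i] * x[i] for i in range(3))

print(y, y_explicit)
```

Note that `einsum` does not distinguish upper from lower indices; it sums over any repeated label, so keeping track of index position remains the reader's responsibility.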

Abstract index notation is a way of presenting the summation convention so that it is made clear that it is independent of coordinates.

In general relativity, the Greek alphabet and the Roman alphabet are used to distinguish whether summing over 1,2,3 or 0,1,2,3 (usually Roman, i, j, ... for 1,2,3 and Greek, \mu, \nu, ... for 0,1,2,3). As with sign conventions, the convention used in practice varies: Roman and Greek may be reversed.

When there is a fixed basis, one can work with only subscripts, but in general one must distinguish between superscripts and subscripts; see below.

In some fields, Einstein notation is referred to simply as index notation, or indicial notation. The use of the implied summation of repeated indices is also referred to as the Einstein Sum Convention.



The basic idea of Einstein notation is that a covector and a vector can form a scalar:

 y = c_1 x^1+c_2x^2+c_3x^3+ \cdots + c_nx^n \,

This is typically written as an explicit sum:

 y = \sum_{i=1}^n c_ix^i

A scalar is invariant under transformations of basis. When the basis is changed, the components of a vector change by a linear transformation described by a matrix, while the components of the covector change by the inverse matrix. This guarantees that the linear function associated with the covector, the sum above, is the same no matter what the basis is. Since it is only this sum which is invariant under changes of basis, not the individual terms in the sum, Einstein proposed the convention that repeated indices imply the sum:

 y = c_i x^i \,

In Einstein notation, covector indices are subscripts and vector indices are superscripts. The position of the index has a specific meaning. It is important, of course, not to confuse a superscript with an exponent—all the relations with superscripts and subscripts are linear, they involve no power higher than the first. Here, the superscripted i above the symbol x represents an integer-valued index running from 1 to n.

The virtue of Einstein notation is that it represents the invariant quantities with a simple notation.
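The invariance described above can be checked numerically. The following sketch assumes an arbitrary invertible matrix M (a hypothetical example) for the change of basis, transforms the vector components by M and the covector components by its inverse, and compares the contraction before and after.

```python
import numpy as np

c = np.array([1.0, 2.0, 3.0])   # covector components c_i
x = np.array([4.0, 5.0, 6.0])   # vector components x^i

# Hypothetical invertible matrix describing the change of basis (det = 3).
M = np.array([[2.0, 1.0, 0.0],
              [0.0, 1.0, 1.0],
              [1.0, 0.0, 1.0]])
M_inv = np.linalg.inv(M)

x_new = np.einsum("ij,j->i", M, x)       # x'^i = M^i_j x^j
c_new = np.einsum("j,ji->i", c, M_inv)   # c'_i = c_j (M^{-1})^j_i

y_old = np.einsum("i,i->", c, x)           # c_i x^i in the old basis
y_new = np.einsum("i,i->", c_new, x_new)   # c'_i x'^i in the new basis

print(y_old, y_new)
```

The individual products c_i x^i differ between the two bases; only their sum agrees, up to floating-point rounding.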

Vector representations

First, we can use Einstein notation in linear algebra to distinguish easily between vectors and covectors: upper indices are used to label components (coordinates) of vectors, while lower indices are used to label components of covectors. However, vectors themselves (not their components) have lower indices, and covectors have upper indices.[2] Given a vector space V and its dual space V^*, one indexes vectors (elements of V) with subscripts, as in v_i \in V, and covectors with superscripts, as in w^i \in V^*. The coordinates of vectors and covectors follow the opposite convention: if e_i are a basis for V and e^i are the dual basis for V^*, then vectors are expressed as:

v = a^i e_i = \begin{bmatrix}a^1\\a^2\\\vdots\\a^n\end{bmatrix}

and covectors are expressed as

w = a_i e^i = \begin{bmatrix}a_1 & a_2 & \cdots & a_n\end{bmatrix}

This is because a component of a vector (one of its coordinates, in some basis) is the value of a covector: the coefficient of e_i is the value of the corresponding covector in the dual basis, a^i = e^i(v). Note that e^i is a covector, but a^i is a scalar. In other words, since basis vectors are given lower indices and coordinates are labeled with upper indices, summation notation suggests pairing them (in the obvious way) to express the vector.

In terms of covariance and contravariance of vectors, lower indices represent 'components' of covariant vectors (covectors), while upper indices represent components of contravariant vectors (vectors): they transform covariantly (resp., contravariantly) with respect to change of basis.

A particularly confusing notation is to use the same letter both for a (co)vector and its components, as in:

v = v^i e_i = \begin{bmatrix}v^1\\v^2\\\vdots\\v^n\end{bmatrix}
w = w_i e^i = \begin{bmatrix}w_1 & w_2 & \cdots & w_n\end{bmatrix}

Here v^i does not mean "the covector v", but rather, "the components of the vector v".



  • "Upper indices go up to down; lower indices go left to right"
  • You can stack vectors (column matrices) side-by-side:
\begin{bmatrix}v_1 & \cdots & v_k\end{bmatrix}.

Hence the lower index indicates which column you are in.

  • You can stack covectors (row matrices) top-to-bottom:
\begin{bmatrix}w^1 \\ \vdots \\ w^k\end{bmatrix}

Hence the upper index indicates which row you are in.

Superscripts and subscripts vs. only subscripts

In the presence of a non-degenerate form (an isomorphism V \to V^*, for instance a Riemannian metric or Minkowski metric), one can raise and lower indices.

A basis gives such a form (via the dual basis), hence when working on \mathbf{R}^n with a fixed basis, one can work with just subscripts.

However, if one changes coordinates, the way that coefficients change depends on the variance of the object, and one cannot ignore the distinction; see covariance and contravariance of vectors.

Common operations in this notation

In Einstein notation, the usual element reference \mathbf{A}_{mn} for the mth row and nth column of matrix \mathbf{A} becomes \mathbf{A}_n^m. The following operations can then be written in Einstein notation.

Inner product

Given a row vector v_i and a column vector u^i of the same size, we can take the inner product v_i u^i, which is a scalar: it evaluates the covector on the vector.

Multiplication of a vector by a matrix

Given a matrix A^i_j and a (column) vector v^j, the coefficients of the product \mathbf{A}v are given by A^i_j v^j.

Similarly, \mathbf{A}^\mathrm{T} v is equivalent to A^j_i v_j, where the summed index j appears once as an upper and once as a lower index.
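A small sketch of both products, assuming NumPy and hypothetical 2 × 2 values, with the first array axis playing the role of the row (upper) index: summing over the column index of A gives the product with the matrix, while summing over its row index gives the product with the transpose.

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])   # A[i, j] stands for A^i_j (row i, column j)
v = np.array([5.0, 6.0])

Av = np.einsum("ij,j->i", A, v)    # sum over the column index j of A
ATv = np.einsum("ji,j->i", A, v)   # sum over the row index j of A (transpose)

print(Av)    # same values as A @ v
print(ATv)   # same values as A.T @ v
```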

Matrix multiplication

We can represent matrix multiplication as:

C^i_k = A^i_j \, B^j_k

This expression is equivalent to the more conventional (and less compact) notation:

 \mathbf{C}_{ik} = (\mathbf{A} \, \mathbf{B})_{ik} =\sum_{j=1}^N A_{ij} B_{jk}
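The equivalence of the two notations can be checked with a short sketch. The matrices are hypothetical; the subscript string "ij,jk->ik" mirrors C^i_k = A^i_j B^j_k, with the repeated label j summed and the free labels i and k kept.

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
B = np.array([[0.0, 1.0],
              [1.0, 0.0]])

# C^i_k = A^i_j B^j_k: j is the dummy (summed) index, i and k are free.
C = np.einsum("ij,jk->ik", A, B)

print(C)
print(A @ B)   # conventional matrix product for comparison
```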


Trace

Given a square matrix A^i_j, summing over a common index, A^i_i, yields the trace.
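As a brief illustration with a hypothetical matrix, repeating the same label on both axes of a single operand selects the diagonal and sums it:

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])

# A^i_i: the same label on both axes contracts the matrix with itself.
trace = np.einsum("ii->", A)

print(trace, np.trace(A))
```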

Outer product

The outer product of the column vector u by the row vector v yields an M × N matrix A:

 \mathbf{A} = \mathbf{u} \, \mathbf{v}

In Einstein notation, we have:

A^i_j = u^i \, v_j = (uv)^i_j

Since i and j represent two different indices, and in this case over two different ranges M and N respectively, the indices are not eliminated by the multiplication. Both indices survive the multiplication to become the two indices of the newly-created matrix A.
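A sketch with hypothetical vectors of lengths M = 3 and N = 2 shows both indices surviving: no label is repeated, so nothing is summed and the result has one axis per free index.

```python
import numpy as np

u = np.array([1.0, 2.0, 3.0])   # u^i, with M = 3 components
v = np.array([10.0, 20.0])      # v_j, with N = 2 components

# A^i_j = u^i v_j: i and j are both free, so the result is an M x N array.
A = np.einsum("i,j->ij", u, v)

print(A.shape)
print(A)
```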

Coefficients on tensors and related

Given a tensor field and a basis (of linearly independent vector fields), the coefficients of the tensor field in that basis can be computed by evaluating it on a suitable combination of the basis and dual basis; the coefficients inherit the correct indexing. We list notable examples.

Throughout, let e_i be a basis of vector fields (a moving frame).

  • (covariant) metric tensor
g_{ij} = g(e_i, e_j)
  • (contravariant) metric tensor
g^{ij} = g(e^i, e^j)
  • torsion tensor
 T^c_{ab} = \Gamma^c_{ab} - \Gamma^c_{ba}-\gamma^c_{ab},

which follows from the formula

 T(X,Y) = \nabla_X Y - \nabla_Y X - [X,Y].
  • Riemann curvature tensor
{R^\rho}_{\sigma\mu\nu} = dx^\rho(R(\partial_{\mu},\partial_{\nu})\partial_{\sigma})

This also applies for some operations that are not tensorial, for instance:

  • Christoffel symbols
\nabla_i e_j = \Gamma_{ij}^k e_k

where \nabla_i e_j is the covariant derivative. Equivalently,

\Gamma_{ij}^k = e^k(\nabla_i e_j)
  • commutator coefficients
[e_i,e_j] = \gamma_{ij}^k e_k

where [e_i,e_j] is the Lie bracket. Equivalently,

\gamma_{ij}^k = e^k([e_i,e_j]).
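For the simplest case in this section, the coefficients of a metric evaluated on pairs of basis vectors, a numerical sketch is possible. It assumes the Euclidean dot product on the plane as the metric g and a constant, non-orthonormal basis chosen purely for illustration; the covectors of the dual basis are taken as the rows of the inverse of the basis matrix.

```python
import numpy as np

# Columns of E are the assumed basis vectors e_1 = (1, 0) and e_2 = (1, 1),
# written in Cartesian coordinates.
E = np.array([[1.0, 1.0],
              [0.0, 1.0]])

# Lower-index coefficients: g(e_i, e_j) = sum over a of E[a, i] * E[a, j].
g_lower = np.einsum("ai,aj->ij", E, E)

# Rows of E_inv are the dual basis covectors; evaluating the metric on
# pairs of them gives the upper-index coefficients.
E_inv = np.linalg.inv(E)
g_upper = np.einsum("ia,ja->ij", E_inv, E_inv)

# Contracting the two coefficient arrays over one index gives the identity.
check = np.einsum("ij,jk->ik", g_lower, g_upper)

print(g_lower)
print(g_upper)
```

In an orthonormal basis both arrays reduce to the identity matrix, which is why upper and lower indices can be identified in that setting.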

Vector dot product

In mechanics and engineering, vectors in 3D space are often described in relation to orthogonal unit vectors i, j and k.

\mathbf{u} = u_x \mathbf{i} + u_y \mathbf{j} + u_z \mathbf{k}

If the basis vectors i, j, and k are instead expressed as e_1, e_2, and e_3, a vector can be expressed in terms of a summation:

\mathbf{u} = u^1 \mathbf{e}_1 + u^2 \mathbf{e}_2 + u^3 \mathbf{e}_3 = \sum_{i = 1}^3 u^i \mathbf{e}_i

In Einstein notation, the summation symbol is omitted since the index i is repeated once as an upper index and once as a lower index, and we simply write

\mathbf{u} = u^i \mathbf{e}_i

Using e_1, e_2, and e_3 instead of i, j, and k, together with Einstein notation, we obtain a concise algebraic presentation of vector and tensor equations. For example,

\mathbf{u} \cdot \mathbf{v} = \left( \sum_{i = 1}^3 u^i \mathbf{e}_i \right) \cdot \left( \sum_{j = 1}^3 v^j \mathbf{e}_j \right) = (u^i \mathbf{e}_i) \cdot (v^j \mathbf{e}_j)= u^i v^j ( \mathbf{e}_i \cdot \mathbf{e}_j ).


Because the basis vectors are orthonormal, they satisfy

 \mathbf{e}_i \cdot \mathbf{e}_j = \delta_{ij}

where \delta_{ij} is the Kronecker delta, which is equal to 1 when i = j and 0 otherwise. Substituting this into the product above, we find

\mathbf{u} \cdot \mathbf{v} = u^i v^j\delta_{ij}.

One can use \delta_{ij} to lower indices of the vectors; namely, u_i=\delta_{ij}u^j and v_i=\delta_{ij}v^j. Then

\mathbf{u} \cdot \mathbf{v} = u^i v^j\delta_{ij}= u^i v_i = u_j v^j

Note that, although u^i = u_i for any fixed i, it is incorrect to write

 \mathbf{u} \cdot \mathbf{v} = u^iv^i,

since on the right hand side the index i is repeated both times as an upper index and so there is no summation over i according to the Einstein convention. Rather, one should explicitly write the summation:

 \mathbf{u} \cdot \mathbf{v} = \sum_{i=1}^3u^iv^i.
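The steps of this derivation can be traced numerically. The sketch below uses hypothetical components and represents \delta_{ij} by the 3 × 3 identity matrix; it computes the double contraction u^i v^j \delta_{ij}, then lowers the index of v and contracts once.

```python
import numpy as np

u = np.array([1.0, 2.0, 3.0])    # u^i
v = np.array([4.0, -5.0, 6.0])   # v^j
delta = np.eye(3)                # Kronecker delta, delta_ij

# u^i v^j delta_ij: both i and j are summed.
dot_via_delta = np.einsum("i,j,ij->", u, v, delta)

# v_i = delta_ij v^j, followed by the single contraction u^i v_i.
v_lower = np.einsum("ij,j->i", delta, v)
dot_lowered = np.einsum("i,i->", u, v_lower)

print(dot_via_delta, dot_lowered)
```

Because `einsum` sums over any repeated label regardless of position, it would also accept the form that the convention rules out; the distinction between upper and lower indices lives in the notation, not in the arrays.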

Vector cross product

For the cross product,

 \mathbf{u} \times \mathbf{v}= \left( \sum_{j = 1}^3 u^j \mathbf{e}_j \right) \times \left( \sum_{k = 1}^3 v^k \mathbf{e}_k \right) = (u^j \mathbf{e}_j ) \times (v^k \mathbf{e}_k)
 = u^j v^k (\mathbf{e}_j \times \mathbf{e}_k ) = u^j v^k\epsilon^i_{jk} \mathbf{e}_i

where \mathbf{e}_j \times \mathbf{e}_k = \epsilon^i_{jk} \mathbf{e}_i and \epsilon^i_{jk}=\delta^{il}\epsilon_{ljk}, with \epsilon_{ijk} the Levi-Civita symbol defined by:

\epsilon_{ijk} = \left\{ \begin{matrix} 0 & \mbox{unless } i,j,k \mbox{ are distinct} \\ +1 & \mbox{if } (i,j,k) \mbox{ is an even permutation of } (1,2,3) \\ -1 & \mbox{if } (i,j,k) \mbox{ is an odd permutation of } (1,2,3) \end{matrix} \right.

One then recovers

 \mathbf{u} \times \mathbf{v} = (u^2 v^3 - u^3 v^2) \mathbf{e}_1 + (u^3 v^1 - u^1 v^3) \mathbf{e}_2 + (u^1 v^2 - u^2 v^1) \mathbf{e}_3

from

 \mathbf{u} \times \mathbf{v}= \epsilon^i_{jk} u^j v^k\mathbf{e}_i = \sum_{i = 1}^3 \sum_{j = 1}^3 \sum_{k = 1}^3 \epsilon^i_{jk} u^j v^k\mathbf{e}_i .

In other words, if  \mathbf{w} = \mathbf{u} \times \mathbf{v}, then  w^i \mathbf{e}_i= \epsilon^i_{jk} u^j v^k\mathbf{e}_i , so that \ w^i = \epsilon^i_{jk} u^j v^k .
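The component formula w^i = \epsilon^i_{jk} u^j v^k can be checked with a short sketch. It builds the Levi-Civita symbol as a 3 × 3 × 3 array with zero-based indices (so the code's 0, 1, 2 stand for the text's 1, 2, 3), uses hypothetical vectors, and relies on the orthonormal basis so that raising the first index changes nothing.

```python
import numpy as np

# Levi-Civita symbol: zero unless all three indices are distinct.
eps = np.zeros((3, 3, 3))
for i, j, k in [(0, 1, 2), (1, 2, 0), (2, 0, 1)]:
    eps[i, j, k] = 1.0    # even permutations
    eps[i, k, j] = -1.0   # odd permutations (last two indices swapped)

u = np.array([1.0, 2.0, 3.0])
v = np.array([4.0, 5.0, 6.0])

# w^i = epsilon^i_jk u^j v^k: j and k are summed, i is free.
w = np.einsum("ijk,j,k->i", eps, u, v)

print(w)
print(np.cross(u, v))   # library cross product for comparison
```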

Abstract definitions

In the traditional usage, one has in mind a vector space V with finite dimension n, and a specific basis of V. We can write the basis vectors as e_1, e_2, ..., e_n. Then if v is a vector in V, it has coordinates v^1,\dots,v^n relative to this basis.

The basic rule is:

\mathbf{v} = v^i\mathbf{e}_i.

In this expression, the term on the right side is understood to be summed as i goes from 1 to n, because the index i does not appear on both sides of the expression. (Or, using Einstein's convention, because the index i appears twice.)

The i is known as a dummy index since the result is not dependent on it; thus we could also write, for example:

\mathbf{v} = v^j\mathbf{e}_j.

An index that is not summed over is a free index and should be found in each term of the equation or formula. Compare dummy indices and free indices with free variables and bound variables.

The value of the Einstein convention is that it applies to other vector spaces built from V  using the tensor product and duality. For example, V\otimes V, the tensor product of V  with itself, has a basis consisting of tensors of the form \mathbf{e}_{ij} = \mathbf{e}_i \otimes \mathbf{e}_j. Any tensor T in V\otimes V can be written as:

\mathbf{T} = T^{ij}\mathbf{e}_{ij}.

V^*, the dual of V, has a basis e^1, e^2, ..., e^n which obeys the rule

\mathbf{e}^i (\mathbf{e}_j) = \delta^i_j.

Here δ is the Kronecker delta, so \delta^i_j is 1 if i =j  and 0 otherwise.
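The duality rule can be illustrated numerically under some assumptions: the basis vectors are stored as the columns of a hypothetical invertible matrix E, and the dual basis covectors are then the rows of its inverse. The sketch checks the Kronecker delta rule and uses the dual basis to extract the coordinates v^i of a vector before rebuilding it as v^i e_i.

```python
import numpy as np

# Columns of E: assumed basis vectors e_1, e_2, e_3 of R^3.
E = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0],
              [0.0, 0.0, 1.0]])

# Rows of E_dual: the dual basis covectors e^1, e^2, e^3.
E_dual = np.linalg.inv(E)

# pairing[i, j] = e^i(e_j), which should equal the Kronecker delta.
pairing = np.einsum("ia,aj->ij", E_dual, E)

# Coordinates v^i = e^i(v), then the expansion v = v^i e_i.
v = np.array([2.0, 3.0, 4.0])
components = np.einsum("ia,a->i", E_dual, v)
v_rebuilt = np.einsum("i,ai->a", components, E)

print(pairing)
print(components, v_rebuilt)
```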


As

\mathrm{Hom}(V,W) = V^* \otimes W,

the row-column coordinates on a matrix correspond to the upper-lower indices on the tensor product.


Examples

Einstein summation is clarified with the help of a few simple examples. Consider four-dimensional spacetime, where indices run from 0 to 3:

a^\mu b_\mu = a^0 b_0 + a^1 b_1 + a^2 b_2 + a^3 b_3
a^{\mu\nu} b_\mu = a^{0\nu} b_0 + a^{1\nu} b_1 + a^{2\nu} b_2 + a^{3\nu} b_3.

The above example is one of contraction, a common tensor operation. The tensor a^{\mu\nu}b_{\mu} becomes a new tensor by summing over the first upper index and the lower index. Typically the resulting tensor is renamed with the contracted indices removed:

s^{\nu} = a^{\mu\nu}b_{\mu}.
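A sketch of this contraction with hypothetical component values: the label m (standing for \mu) is repeated and therefore summed, while n (standing for \nu) is free and survives as the single index of the result.

```python
import numpy as np

a = np.arange(16.0).reshape(4, 4)     # a^{mu nu}, hypothetical values 0..15
b = np.array([1.0, 0.0, -1.0, 2.0])   # b_mu

# s^nu = a^{mu nu} b_mu: sum over the first index of a.
s = np.einsum("mn,m->n", a, b)

print(s)
```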

For a familiar example, consider the dot product of two vectors a and b. The dot product is defined simply as summation over the indices of a and b:

\mathbf{a}\cdot\mathbf{b} = a^{\alpha}b_{\alpha} = a^0 b_0 + a^1 b_1 + a^2 b_2 + a^3 b_3,

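which is our familiar formula for the vector dot product. Remember that it is sometimes necessary to change the components of a vector in order to lower its index. This is not necessary in Euclidean space with an orthonormal basis, where the metric is the identity. In flat spacetime, whose Minkowski metric is equal to its inverse metric, lowering an index only changes the sign of the time component or of the space components, depending on the sign convention.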
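The lowering of an index can be made explicit in a short sketch. It assumes flat spacetime with the Minkowski metric in the (-, +, +, +) sign convention (the opposite convention is equally common) and hypothetical components for both vectors.

```python
import numpy as np

# Minkowski metric eta_{alpha beta}, assumed signature (-, +, +, +).
eta = np.diag([-1.0, 1.0, 1.0, 1.0])

a = np.array([1.0, 2.0, 3.0, 4.0])   # a^alpha
b = np.array([5.0, 6.0, 7.0, 8.0])   # b^beta

# Lower the index of b: b_alpha = eta_{alpha beta} b^beta.
b_lower = np.einsum("ab,b->a", eta, b)

# Contract: a^alpha b_alpha = a^0 b_0 + a^1 b_1 + a^2 b_2 + a^3 b_3.
product = np.einsum("a,a->", a, b_lower)

print(b_lower)
print(product)
```

With the identity matrix in place of `eta`, the lowered components equal the original ones and the result is the ordinary Euclidean dot product.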


  1. ^ Einstein, Albert (1916). "The Foundation of the General Theory of Relativity" (PDF). Annalen der Physik. Retrieved 2006-09-03.  
  2. ^ This applies only for numerical indices. The situation is the opposite for abstract indices. Then, vectors themselves carry upper abstract indices and covectors carry lower abstract indices. Elements of a basis of vectors may carry a lower numerical index and an upper abstract index.

