In calculus, the chain rule is a formula for the derivative of the composite of two functions.
In intuitive terms, if a variable, y, depends on a second variable, u, which in turn depends on a third variable, x, then the rate of change of y with respect to x can be computed as the rate of change of y with respect to u multiplied by the rate of change of u with respect to x. Schematically,
The chain rule states that, under appropriate conditions,
which in short form is written as
Alternatively, in the Leibniz notation, the chain rule is
The chain rule can be applied to as many composed functions as needed:
In integration, the counterpart to the chain rule is the substitution rule.
The chain rule in one variable may be stated more completely as follows.^{[1]} Let g be a real-valued function on (a,b) which is differentiable at c ∈ (a,b); suppose that f is a real-valued function defined on an interval I containing the range of g, and suppose further that g(c) is an interior point of I. If f is differentiable at g(c), then
Suppose that a mountain climber ascends at a rate of 0.5 kilometers per hour. The temperature is lower at higher elevations; suppose the rate by which it decreases is 6 °C per kilometer. To calculate the decrease in air temperature per hour that the climber experiences, one multiplies 6 °C per kilometer by 0.5 kilometer per hour, to obtain 3 °C per hour. This calculation is a typical chain rule application.
Consider the function f(x) = (x^{2} + 1)^{3}. It follows from the chain rule that
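The chain-rule derivative of f(x) = (x^{2} + 1)^{3}, namely 6x(x^{2} + 1)^{2}, can be checked numerically. The following Python sketch (evaluation point chosen here purely for illustration) compares it with a central-difference approximation:

```python
def f(x):
    return (x**2 + 1)**3

def f_prime(x):
    # Chain rule: derivative 3u^2 of the outer cube at u = x^2 + 1,
    # times the derivative 2x of the inner function.
    return 3 * (x**2 + 1)**2 * (2 * x)

def central_difference(func, x, h=1e-6):
    # Numerical derivative via the central difference quotient.
    return (func(x + h) - func(x - h)) / (2 * h)

x = 1.5
print(f_prime(x), central_difference(f, x))
```

The two printed values agree to many decimal places, as expected.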
In order to differentiate the trigonometric function
one can write:
Differentiate arctan(sin x).
Thus, by the chain rule,
and in particular,
An illuminating exercise is to compute the derivatives of functions that one already knows, but use the chain rule instead.
Rewriting x as , we have
Rewriting x as , we have
Rewriting as , we have
Rewriting x as , we have
In this example, one has to be careful about the domain and range, but we can pretend we are considering only a microscopic portion of the graph.
The chain rule also works for functions of more than one variable.^{[2]} Consider the function z = f(x, y), where x = g(t) and y = h(t) are differentiable with respect to t. Then
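The formula dz/dt = (∂f/∂x)(dx/dt) + (∂f/∂y)(dy/dt) can be verified numerically; in the following Python sketch the functions f, g, h are chosen purely for illustration:

```python
import math

# z = f(x, y) with x = cos(t), y = sin(t); all chosen for illustration.
def f(x, y):
    return x**2 * y

def z(t):
    return f(math.cos(t), math.sin(t))

def dz_dt(t):
    # Multivariable chain rule: dz/dt = f_x * dx/dt + f_y * dy/dt
    x, y = math.cos(t), math.sin(t)
    f_x, f_y = 2 * x * y, x**2
    dx_dt, dy_dt = -math.sin(t), math.cos(t)
    return f_x * dx_dt + f_y * dy_dt

t, h = 0.7, 1e-6
numeric = (z(t + h) - z(t - h)) / (2 * h)
print(dz_dt(t), numeric)
```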
Suppose that each argument of z = f(u, v) is a two-variable function, u = h(x, y) and v = g(x, y), and that these functions are all differentiable. Then the chain rule takes the form:
If we consider
above as a Cartesian vector function, we can use vector notation to write the above equivalently as the dot product of the gradient of f and a derivative of :
More generally, for functions of vectors to vectors, the chain rule says that the Jacobian matrix of a composite function is the product of the Jacobian matrices of the two functions:
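As a sketch of the Jacobian form of the chain rule, J_{f∘g}(x) = J_f(g(x)) · J_g(x), the following Python example (maps f and g chosen here for illustration) multiplies the two Jacobian matrices and checks one entry against a numerical partial derivative:

```python
# Illustrative maps g: R^2 -> R^2 and f: R^2 -> R^2.
def g(x, y):
    return (x**2, 3 * y)

def f(u, v):
    return (u * v, u + v)

def J_g(x, y):
    # Jacobian of g: row i holds the partial derivatives of component i.
    return [[2 * x, 0.0],
            [0.0, 3.0]]

def J_f(u, v):
    return [[v, u],
            [1.0, 1.0]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

x, y = 1.2, -0.5
u, v = g(x, y)
# Chain rule for Jacobians: J of the composite = J_f(g(x, y)) times J_g(x, y).
J_composite = matmul(J_f(u, v), J_g(x, y))

def comp0(x, y):
    # First component of the composite f(g(x, y)).
    return f(*g(x, y))[0]

h = 1e-6
numeric = (comp0(x + h, y) - comp0(x - h, y)) / (2 * h)
print(J_composite[0][0], numeric)
```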
Given where and , determine the value of and using the chain rule.
and
Let f and g be functions and let x be a number such that f is differentiable at g(x) and g is differentiable at x. Then by the definition of differentiability,
where ε(δ) → 0 as δ → 0. Similarly,
where η(α) → 0 as α → 0. Define also^{[3]} that
Now
where
Observe that as δ → 0, α_{δ} / δ → g′(x) and α_{δ} → 0, and thus η(α_{δ}) → 0. It follows that
To prove the multivariate chain rule, we will deal with the case of functions of two variables; a similar proof can be constructed for functions of three or more variables. Let x(t), y(t) be differentiable functions of t and assume f(x, y) is differentiable. If we set and , then we have:
When x is constant, we can regard as a function of . Thus the limit on the right is equal to the derivative of , which by the single variable chain rule is .
To calculate the limit on the left, regard as a function of . By the mean value theorem, we can select a real number such that the numerator on the left limit is equal to . So the left limit is equal to , which equals
Thus, it follows that
The chain rule is a fundamental property of all definitions of derivatives and is therefore valid in much more general contexts. For instance, if E, F and G are Banach spaces (which includes Euclidean space) and f : E → F and g : F → G are functions, and if x is an element of E such that f is differentiable at x and g is differentiable at f(x), then the derivative (the Fréchet derivative) of the composition g ∘ f at the point x is given by
Note that the derivatives here are linear maps and not numbers. If the linear maps are represented as matrices (namely Jacobians), the composition on the right hand side turns into a matrix multiplication.
A particularly clear formulation of the chain rule can be achieved in the most general setting: let M, N and P be C^{k} manifolds (or even Banach manifolds) and let
be differentiable maps. The derivative of f, denoted by df, is then a map from the tangent bundle of M to the tangent bundle of N, and we may write
In this way, the formation of derivatives and tangent bundles is seen as a functor on the category of C^{∞} manifolds with C^{∞} maps as morphisms.
See tensor field for an advanced explanation of the fundamental role the chain rule plays in the geometric nature of tensors.
Faà di Bruno's formula generalizes the chain rule to higher derivatives. The first few derivatives are
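The n = 2 case, (f ∘ g)″(x) = f″(g(x))·g′(x)² + f′(g(x))·g″(x), can be checked numerically; in the Python sketch below, f = sin and g(x) = x² are chosen purely for illustration:

```python
import math

def second_derivative_composite(x):
    # Faà di Bruno for n = 2: (f∘g)'' = f''(g(x))*g'(x)**2 + f'(g(x))*g''(x),
    # here with f = sin and g(x) = x**2.
    g, gp, gpp = x**2, 2 * x, 2.0
    fp, fpp = math.cos(g), -math.sin(g)
    return fpp * gp**2 + fp * gpp

def h(x):
    return math.sin(x**2)

# Second-order central difference approximation of h''.
x, step = 0.8, 1e-4
numeric = (h(x + step) - 2 * h(x) + h(x - step)) / step**2
print(second_derivative_composite(x), numeric)
```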
We know how to differentiate regular polynomial functions. For example:
However, there is a useful technique known as the chain rule. The function above (f(x) = (x^{2} + 5)^{2}) can be decomposed into two nested parts: f(x) = u^{2}, where u = m(x) = (x^{2} + 5). Therefore:
Then:
Then:
The chain rule states that if we have a function of the form y(u(x)) (i.e. y can be written as a function of u and u can be written as a function of x) then:
If a function F(x) is the composition of two differentiable functions g(x) and m(x), so that F(x) = g(m(x)), then F(x) is differentiable and

We can now investigate the original function:
Therefore
This can be performed for more complicated equations. If we consider:
and let and u=1+x^{2}, so that and , then, by applying the chain rule, we find that
In plain words, for the chain rule applied to a power of an inner function: take the derivative of the outer power (make the exponent the coefficient and decrease the exponent by 1, leaving the inside unchanged), then multiply by the derivative of the inside.
When we wish to differentiate a more complicated expression such as:
our only way (up to this point) to differentiate the expression is to expand it and get a polynomial, and then differentiate that polynomial. This method becomes very complicated and is particularly error prone when doing calculations by hand. It is advantageous to find the derivative of h(x) using just the functions f(x) = (x^{2}+5)^{5} and g(x) = (x^{3} + 2)^{3} and their derivatives.

What this rule means is that if one has a function that is the product of two functions, one differentiates the first function and multiplies it by the second function undifferentiated, then adds to that the first function undifferentiated multiplied by the derivative of the second function.
For example, if one were to take the function
its derivative would not be
Instead it would be
Another way of approaching this is if one were to have a function that was a product of the two functions A and B
Its derivative would be
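The product rule described above, (A·B)′ = A′·B + A·B′, can be checked numerically; here is a minimal Python sketch with A and B chosen purely for illustration:

```python
import math

def A(x):
    return x**2

def B(x):
    return math.sin(x)

def product_prime(x):
    # Product rule: (A*B)' = A'(x)*B(x) + A(x)*B'(x)
    return 2 * x * math.sin(x) + x**2 * math.cos(x)

x, h = 1.3, 1e-6
numeric = (A(x + h) * B(x + h) - A(x - h) * B(x - h)) / (2 * h)
print(product_prime(x), numeric)
```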
Proving this rule is relatively straightforward. First, let us state the equation for the derivative:
We will then apply one of the oldest tricks in the book—adding a term that cancels itself out to the middle:
Notice that those terms sum to zero, and so all we have done is add 0 to the equation.
Now we can split the equation up into forms that we already know how to solve:
Looking at this, we see that we can separate the common terms out of the numerators to get:
Which, when we take the limit, becomes:
This can be extended to 3 functions:
For any number of functions, the derivative of their product is the sum, for each function, of its derivative times each other function.
The product rule can be used to give a proof of the power rule for whole numbers. The proof proceeds by mathematical induction. We begin with the base case n = 1. If f_{1}(x) = x then from the definition it is easy to see that
Next we suppose that for fixed value of N, we know that for f_{N}(x) = x^{N}, f_{N}'(x) = Nx^{N − 1}. Consider the derivative of f_{N + 1}(x) = x^{N + 1},
We have shown that the statement is true for n = 1 and that if this statement holds for n = N, then it also holds for n = N + 1. Thus by the principle of mathematical induction, the statement must hold for all n ≥ 1.
For quotients, where one function is divided by another function, the equation is more complicated, but it is simply a special case of the product rule.
Then we can just use the product rule and the chain rule:
We can then multiply by g(x)^{2} / g(x)^{2}, which equals 1, to get:
This leads us to the so-called "quotient rule":

Some people remember this with the mnemonic "low D-high minus high D-low, (over) square the low, and away we go!"
The derivative of (4x − 2) / (x^{2} + 1) is:
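The quotient-rule computation for this example can be verified numerically; the Python sketch below applies "low D-high minus high D-low over the square of the low" and compares against a central difference (evaluation point chosen for illustration):

```python
def quotient(x):
    return (4 * x - 2) / (x**2 + 1)

def quotient_prime(x):
    # Quotient rule: (low * D-high - high * D-low) / low^2
    low, high = x**2 + 1, 4 * x - 2
    d_low, d_high = 2 * x, 4
    return (low * d_high - high * d_low) / low**2

x, h = 2.0, 1e-6
numeric = (quotient(x + h) - quotient(x - h)) / (2 * h)
print(quotient_prime(x), numeric)
```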
Remember: the derivative of a product/quotient is not the product/quotient of the derivatives. (That is, differentiation does not distribute over multiplication or division.) However, one can distribute before taking the derivative. That is,
Generally, one will encounter functions expressed in explicit form, that is, in the form y = f(x). But you might encounter a function that contains a mixture of different variables, and many times it is inconvenient or even impossible to solve for y. A good example is the function . It is too cumbersome to isolate y in this function. One can instead use implicit differentiation to find the derivative: consider y to be a nested function that is defined implicitly by x. You need to employ the chain rule whenever you take the derivative of a variable with respect to a different variable: i.e., the derivative with respect to x of x is 1, while the derivative with respect to x of y is dy/dx.
Remember:
Therefore:
can be solved as:
then differentiated:
However, using implicit differentiation it can also be differentiated like this:
use the product rule:
solve for :
Note that, if we substitute into , we end up with again.
Find the derivative of y^{2} + x^{2} = 25 with respect to x.
You are seeking dy/dx.
Take the derivative of each side of the equation with respect to x.
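Differentiating both sides of y^{2} + x^{2} = 25 gives 2y·(dy/dx) + 2x = 0, so dy/dx = −x/y. The following Python sketch checks this against the explicit upper branch y = √(25 − x^{2}), with the evaluation point chosen for illustration:

```python
import math

def implicit_slope(x, y):
    # From 2*y*(dy/dx) + 2*x = 0, so dy/dx = -x / y.
    return -x / y

def upper_branch(x):
    # Explicit form of the upper semicircle y = sqrt(25 - x^2).
    return math.sqrt(25 - x**2)

x = 3.0
y = upper_branch(x)   # y = 4 on the upper semicircle
h = 1e-6
numeric = (upper_branch(x + h) - upper_branch(x - h)) / (2 * h)
print(implicit_slope(x, y), numeric)
```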
Determining the derivative of an exponential function requires the symmetric difference quotient for the derivative:
First we will solve this for the specific case of an exponent with a base of e and then extend it to the general case with a base of a where a is a positive real number.
First we set up our problem using f(x) = e^{x}:
Then we apply some basic algebra with powers (specifically that a^{b + c} = a^{b} a^{c}):
Since e^{x} is constant with respect to h, the variable of the limit, we can use the limit rules to move it outside, leaving us with:
A careful examination of the limit reveals a hyperbolic sine:
The limit of sinh(h) / h as h approaches 0 is equal to 1 (as can be shown using L'Hôpital's rule), leaving us with:

in which f'(x) = f(x).
Now that we have derived a specific case, let us extend things to the general case. Assuming that a is a positive real constant, we wish to calculate:
One of the oldest tricks in mathematics is to break a problem down into a form that we already know we can handle. Since we have already determined the derivative of e^{x}, we will attempt to rewrite a^{x} in that form.
Using that e^{ln(c)} = c and that ln(a^{b}) = b · ln(a), we find that:
Thus, we simply apply the chain rule:
In which we can solve for the derivative and substitute back with e^{x · ln(a)} = a^{x} to get:
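The resulting rule, d/dx aˣ = ln(a)·aˣ, admits a quick numerical check (base and evaluation point chosen here for illustration):

```python
import math

def exp_base(a, x):
    return a**x

def exp_base_prime(a, x):
    # d/dx a^x = ln(a) * a^x, via the chain rule on e^(x * ln a).
    return math.log(a) * a**x

a, x, h = 2.0, 1.5, 1e-6
numeric = (exp_base(a, x + h) - exp_base(a, x - h)) / (2 * h)
print(exp_base_prime(a, x), numeric)
```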

Closely related to exponentiation is the logarithm. Just as with exponents, we will derive the equation for a specific case first (the natural log, where the base is e), and then work to generalize it for any logarithm.
First let us create a variable y such that:
Note that what we want to find is the derivative of y, that is, dy/dx.
Next we exponentiate both sides (raise e to the power of each side) to remove the logarithm from the right-hand side:
Now, applying the chain rule and the property of exponents we derived earlier, we take the derivative of both sides:
This leaves us with the derivative:
Substituting back our original equation of x = e^{y}, we find that:

If we wanted, we could go through that same process again for a generalized base, but it is easier just to use properties of logs and realize that:
Since 1 / ln(b) is a constant, we can just take it outside of the derivative:
Which leaves us with the generalized form of:
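The generalized form, d/dx log_b(x) = 1 / (x·ln b), can be checked numerically (base and evaluation point chosen here for illustration):

```python
import math

def log_b(b, x):
    # Change of base: log_b(x) = ln(x) / ln(b).
    return math.log(x) / math.log(b)

def log_b_prime(b, x):
    # d/dx log_b(x) = 1 / (x * ln b)
    return 1 / (x * math.log(b))

b, x, h = 10.0, 3.0, 1e-6
numeric = (log_b(b, x + h) - log_b(b, x - h)) / (2 * h)
print(log_b_prime(b, x), numeric)
```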

Sine, cosine, tangent, cosecant, secant, cotangent: these functions crop up constantly in mathematics and engineering and have many practical applications. They also appear in more advanced mathematics, particularly when dealing with things such as line integrals with complex numbers and alternate representations of space like spherical and cylindrical coordinate systems.
We use the definition of the derivative, i.e.,
to work these first two out.
Let us find the derivative of sin x, using the above definition.

Now for the case of cos x

Therefore we have established

To find the derivative of the tangent, we just remember that:
which is a quotient. Applying the quotient rule, we get:
Then, remembering that cos^{2}(x) + sin^{2}(x) = 1, we simplify:
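The simplification yields d/dx tan(x) = 1/cos²(x) = sec²(x), which a short Python sketch can confirm numerically (evaluation point chosen for illustration):

```python
import math

def tan_prime(x):
    # Quotient rule on sin/cos plus sin^2 + cos^2 = 1 gives 1/cos^2(x).
    return 1 / math.cos(x)**2

x, h = 0.6, 1e-6
numeric = (math.tan(x + h) - math.tan(x - h)) / (2 * h)
print(tan_prime(x), numeric)
```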

For secants, we just need to apply the chain rule to the derivatives we have already determined.
So for the secant, we state the equation as:
Taking the derivative of both equations, we find:
Leaving us with:
Simplifying, we get:

Using the same procedure on cosecants:
We get:

Using the same procedure for the cotangent that we used for the tangent, we get:

Arcsine, arccosine, arctangent. These are the functions that allow you to determine the angle given the sine, cosine, or tangent of that angle.
First, let us start with the arcsine such that:
To find dy/dx we first need to break this down into a form we can work with:
Then we can take the derivative of that:
...and solve for dy / dx:
At this point we need to go back to the unit triangle. Since y is the angle, the opposite side is sin(y) (which is equal to x), the adjacent side is cos(y) (which is equal to the square root of 1 minus x^{2}, by the Pythagorean theorem), and the hypotenuse is 1. Since we have determined the value of cos(y) from the unit triangle, we can substitute it back into the above equation and get:
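The result, d/dx arcsin(x) = 1/√(1 − x²), can be checked numerically (evaluation point chosen for illustration):

```python
import math

def arcsin_prime(x):
    # d/dx arcsin(x) = 1 / sqrt(1 - x^2)
    return 1 / math.sqrt(1 - x**2)

x, h = 0.3, 1e-6
numeric = (math.asin(x + h) - math.asin(x - h)) / (2 * h)
print(arcsin_prime(x), numeric)
```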

We can use an identical procedure for the arccosine and arctangent:


We can use the properties of the logarithm, particularly the natural log, to differentiate more difficult functions, such as products with many terms, quotients of composed functions, or functions with variable or function exponents. We do this by taking the natural logarithm of both sides, rearranging terms using the logarithm laws below, and then differentiating both sides implicitly, before multiplying through by y.

See the examples below.
Suppose we wished to differentiate a given function. We take the natural logarithm of both sides, differentiate implicitly, and then multiply by y.
Let us differentiate another function. Taking the natural logarithm of the left and right sides, we then differentiate both sides, recalling the product rule, and finally multiply by the original function y.
Take a function; then take its natural logarithm, differentiate, and finally multiply by y.
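As a concrete instance of logarithmic differentiation (an example added here, not one of the elided originals), consider y = xˣ: taking logs gives ln y = x·ln x, differentiating implicitly gives y′/y = ln x + 1, and multiplying by y gives y′ = xˣ(ln x + 1). A Python sketch verifies this numerically:

```python
import math

def y(x):
    return x**x

def y_prime(x):
    # ln y = x ln x  =>  y'/y = ln x + 1  =>  y' = x^x (ln x + 1)
    return x**x * (math.log(x) + 1)

x, h = 2.0, 1e-6
numeric = (y(x + h) - y(x - h)) / (2 * h)
print(y_prime(x), numeric)
```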
Given the above rules, practice differentiation on the following.
