In mathematics, differential calculus is a subfield of calculus concerned with the study of how functions change when their inputs change. The primary object of study in differential calculus is the derivative. A closely related notion is the differential. The derivative of a function at a chosen input value describes the behavior of the function near that input value. For a real-valued function of a single real variable, the derivative at a point equals the slope of the tangent line to the graph of the function at that point. In general, the derivative of a function at a point determines the best linear approximation to the function at that point.
The process of finding a derivative is called differentiation. The fundamental theorem of calculus states that differentiation is the reverse process to integration.
Differentiation has applications to all quantitative disciplines. In physics, the derivative of the displacement of a moving body with respect to time is the velocity of the body, and the derivative of velocity with respect to time is acceleration. Newton's second law of motion states that the derivative of the momentum of a body equals the force applied to the body. The reaction rate of a chemical reaction is a derivative. In operations research, derivatives determine the most efficient ways to transport materials and design factories. By applying game theory, differentiation can provide best strategies for competing corporations.
Derivatives are frequently used to find the maxima and minima of a function. Equations involving derivatives are called differential equations and are fundamental in describing natural phenomena. Derivatives and their generalizations appear in many fields of mathematics, such as complex analysis, functional analysis, differential geometry, measure theory and abstract algebra.
Suppose that x and y are real numbers and that y is a function of x; that is, for every value of x, we can determine the value of y. This relationship is written as y = f(x). If f is the equation of a straight line, then y = mx + c, where m and c are real numbers that determine the locus of the line in Cartesian coordinates. m is called the slope and is given by

$m = \frac{\Delta y}{\Delta x} = \frac{\text{change in } y}{\text{change in } x},$
where the symbol Δ (the uppercase form of the Greek letter Delta) is an abbreviation for "change in". It follows that Δy = m Δx.
The derivative of f at the point x is the best possible approximation to the idea of the slope of f at the point x. It is usually denoted f'(x) or dy/dx. Together with the value of f at x, the derivative of f determines the best linear approximation, or linearization, of f near the point x. This latter property is usually taken as the definition of the derivative. A nonlinear function does not have a single well-defined slope, so its slope must be captured point by point through this linear approximation.
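As a concrete numerical illustration (the sample function sin and the helper names f, f_prime, and linearization here are our own, not from the text), the error of the linearization shrinks rapidly as x approaches the point of tangency:

```python
import math

def f(x):
    return math.sin(x)          # sample function (illustrative choice)

def f_prime(x):
    return math.cos(x)          # its derivative, known in closed form

def linearization(a):
    """Best linear approximation of f near a: L(x) = f(a) + f'(a)(x - a)."""
    return lambda x: f(a) + f_prime(a) * (x - a)

L = linearization(0.0)          # near 0, sin(x) is approximately x
for x in (0.5, 0.1, 0.01):
    print(x, abs(f(x) - L(x)))  # error shrinks rapidly as x nears 0
```

Running the loop shows the error falling roughly quadratically in the distance from the base point, which is what "best linear approximation" means in practice.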
A closely related notion is the differential of a function.
When x and y are real variables, the derivative of f at x is the slope of the tangent line to the graph of f at x. Because the source and target of f are one-dimensional, the derivative of f is a real number. If x and y are vectors, then the best linear approximation to the graph of f depends on how f changes in several directions at once. Taking the best linear approximation in a single direction determines a partial derivative, which is usually denoted ∂y/∂x. The linearization of f in all directions at once is called the total derivative. It is a linear transformation, and it determines the hyperplane that most closely approximates the graph of f. This hyperplane is called the osculating hyperplane; it is conceptually the same idea as taking tangent lines in all directions at once.
The concept of a derivative in the sense of a tangent line is a very old one, familiar to Greek geometers such as Euclid (c. 300 BCE), Archimedes (c. 287–212 BCE) and Apollonius of Perga (c. 262–190 BCE).^{[1]} Archimedes also introduced the use of infinitesimals, although these were primarily used to study areas and volumes rather than derivatives and tangents; see Archimedes' use of infinitesimals.
The use of infinitesimals to study rates of change can be found in Indian mathematics, perhaps as early as 500 CE, when the astronomer and mathematician Aryabhata (476–550) used infinitesimals to study the motion of the moon.^{[2]} The use of infinitesimals to compute rates of change was developed significantly by Bhāskara II (1114–1185); indeed, it has been argued^{[3]} that many of the key notions of differential calculus can be found in his work, such as "Rolle's theorem".^{[4]} The Persian mathematician Sharaf al-Dīn al-Tūsī (1135–1213) was the first to discover the derivative of cubic polynomials, an important result in differential calculus;^{[5]} his Treatise on Equations developed concepts related to differential calculus, such as the derivative function and the maxima and minima of curves, in order to solve cubic equations which may not have positive solutions.^{[6]} An early version of the mean value theorem was first described by Parameshvara (1370–1460) from the Kerala school of astronomy and mathematics in his commentary on Bhāskara II.^{[7]}
The modern development of calculus is usually credited to Isaac Newton (1643 – 1727) and Gottfried Leibniz (1646 – 1716), who provided independent^{[8]} and unified approaches to differentiation and derivatives. The key insight, however, that earned them this credit, was the fundamental theorem of calculus relating differentiation and integration: this rendered obsolete most previous methods for computing areas and volumes,^{[9]} which had not been significantly extended since the time of Ibn al-Haytham (Alhazen).^{[10]} For their ideas on derivatives, both Newton and Leibniz built on significant earlier work by mathematicians such as Isaac Barrow (1630 – 1677), René Descartes (1596 – 1650), Christiaan Huygens (1629 – 1695), Blaise Pascal (1623 – 1662) and John Wallis (1616 – 1703). In particular, Isaac Barrow is often credited with the early development of the derivative.^{[11]} Nevertheless, Newton and Leibniz remain key figures in the history of differentiation, not least because Newton was the first to apply differentiation to theoretical physics, while Leibniz systematically developed much of the notation still used today.
Since the 17th century many mathematicians have contributed to the theory of differentiation. In the 19th century, calculus was put on a much more rigorous footing by mathematicians such as Augustin Louis Cauchy (1789 – 1857), Bernhard Riemann (1826 – 1866), and Karl Weierstrass (1815 – 1897). It was also during this period that differentiation was generalized to Euclidean space and the complex plane.
If f is a differentiable function on R (or an open interval) and x is a local maximum or a local minimum of f, then the derivative of f at x is zero; points where f '(x) = 0 are called critical points or stationary points (and the value of f at x is called a critical value). (The definition of a critical point is sometimes extended to include points where the derivative does not exist.) Conversely, a critical point x of f can be analysed by considering the second derivative of f at x:

- if it is positive, x is a local minimum;
- if it is negative, x is a local maximum;
- if it is zero, then x could be a local minimum, a local maximum, or neither.

This is called the second derivative test. An alternative approach, called the first derivative test, involves considering the sign of f ' on each side of the critical point.
Taking derivatives and solving for critical points is therefore often a simple way to find local minima or maxima, which can be useful in optimization. By the extreme value theorem, a continuous function on a closed interval must attain its minimum and maximum values at least once. If the function is differentiable, the minima and maxima can only occur at critical points or endpoints.
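This recipe can be sketched in Python; here the illustrative function f(x) = x³ − 3x is used, with its derivative 3x² − 3 and second derivative 6x computed by hand:

```python
def f(x):
    return x**3 - 3*x

def f_prime(x):
    return 3*x**2 - 3          # first derivative, computed by hand

def f_second(x):
    return 6*x                 # second derivative

# Solving f'(x) = 0 by hand: 3x^2 - 3 = 0, so x = -1 or x = 1.
critical_points = [-1.0, 1.0]

# Classify each critical point with the second derivative test
for x in critical_points:
    if f_second(x) > 0:
        print(x, "local minimum")
    elif f_second(x) < 0:
        print(x, "local maximum")
    else:
        print(x, "test inconclusive")
```

The output classifies x = 1 as a local minimum and x = −1 as a local maximum, matching the sign of the second derivative at each point.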
This also has applications in graph sketching: once the local minima and maxima of a differentiable function have been found, a rough plot of the graph can be obtained from the observation that it will be either increasing or decreasing between critical points.
In higher dimensions, a critical point of a scalar-valued function is a point at which the gradient is zero. The second derivative test can still be used to analyse critical points by considering the eigenvalues of the Hessian matrix of second partial derivatives of the function at the critical point. If all of the eigenvalues are positive, then the point is a local minimum; if all are negative, it is a local maximum. If there are some positive and some negative eigenvalues, then the critical point is a saddle point, and if none of these cases hold (i.e., some of the eigenvalues are zero) then the test is inconclusive.
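The eigenvalue classification can be sketched as follows; this is a minimal illustration that hard-codes 2×2 Hessians and uses the closed-form eigenvalues of a symmetric 2×2 matrix rather than a linear-algebra library (the function names are our own):

```python
import math

def hessian_eigenvalues(a, b, c):
    """Eigenvalues of the symmetric 2x2 matrix [[a, b], [b, c]]."""
    mean = (a + c) / 2
    spread = math.sqrt((a - c) ** 2 + 4 * b * b) / 2
    return mean - spread, mean + spread

def classify(a, b, c):
    lo, hi = hessian_eigenvalues(a, b, c)
    if lo > 0:
        return "local minimum"
    if hi < 0:
        return "local maximum"
    if lo < 0 < hi:
        return "saddle point"
    return "inconclusive"       # some eigenvalue is zero

# f(x, y) = x^2 + y^2 has Hessian [[2, 0], [0, 2]] at its critical point (0, 0)
print(classify(2, 0, 2))        # local minimum
# f(x, y) = x^2 - y^2 has Hessian [[2, 0], [0, -2]] at (0, 0)
print(classify(2, 0, -2))       # saddle point
```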
One example of an optimization problem is: Find the shortest curve between two points on a surface, assuming that the curve must also lie on the surface. If the surface is a plane, then the shortest curve is a line. But if the surface is, for example, egg-shaped, then the shortest path is not immediately clear. These paths are called geodesics, and one of the simplest problems in the calculus of variations is finding geodesics. Another example is: Find the smallest area surface filling in a closed curve in space. This surface is called a minimal surface and it, too, can be found using the calculus of variations.
Calculus is of vital importance in physics: many physical processes are described by equations involving derivatives, called differential equations. Physics is particularly concerned with the way quantities change and evolve over time, and the concept of the "time derivative" (the rate of change over time) is essential for the precise definition of several important concepts. In particular, the time derivatives of an object's position are significant in Newtonian physics: velocity is the derivative of the object's position with respect to time, and acceleration is the derivative of the object's velocity with respect to time.
For example, if an object's position on a line is given by

$x(t) = -16t^2 + 16t + 32,$

then the object's velocity is

$\dot{x}(t) = -32t + 16,$

and the object's acceleration is

$\ddot{x}(t) = -32,$

which is constant.
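This relationship can be checked numerically; a short sketch using a hypothetical quadratic position function, where a central-difference quotient recovers the hand-computed velocity:

```python
def position(t):
    return -16 * t**2 + 16 * t + 32   # hypothetical position function

def velocity(t):
    return -32 * t + 16               # its time derivative, computed by hand

def central_difference(f, t, h=1e-6):
    """Numerical time derivative via a central difference quotient."""
    return (f(t + h) - f(t - h)) / (2 * h)

for t in (0.0, 0.5, 1.0):
    print(t, velocity(t), central_difference(position, t))
```

The numerical derivative of velocity is likewise constant (the acceleration), matching the second derivative of the position function.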
A differential equation is a relation between a collection of functions and their derivatives. An ordinary differential equation is a differential equation that relates functions of one variable to their derivatives with respect to that variable. A partial differential equation is a differential equation that relates functions of more than one variable to their partial derivatives. Differential equations arise naturally in the physical sciences, in mathematical modelling, and within mathematics itself. For example, Newton's second law, which describes the relationship between acceleration and position, can be stated as the ordinary differential equation

$F(t) = m \frac{d^2 x}{dt^2}.$
The heat equation in one space variable, which describes how heat diffuses through a straight rod, is the partial differential equation

$\frac{\partial u}{\partial t} = \alpha \frac{\partial^2 u}{\partial x^2}.$
Here u(x, t) is the temperature of the rod at position x and time t and α is a constant that depends on how fast heat diffuses through the rod.
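A minimal explicit finite-difference sketch of this equation follows; the grid size, the value of α, and the initial hot-spot condition are illustrative assumptions, not from the text:

```python
# Explicit finite-difference sketch of u_t = alpha * u_xx on a rod of length 1,
# with both ends held at temperature 0 (all numbers here are illustrative).
alpha = 0.01
n = 21                       # number of grid points
dx = 1.0 / (n - 1)
dt = 0.4 * dx**2 / alpha     # satisfies the stability bound alpha*dt/dx^2 <= 1/2

u = [0.0] * n
u[n // 2] = 1.0              # initial condition: a hot spot mid-rod

for _ in range(100):
    new_u = u[:]
    for i in range(1, n - 1):
        u_xx = (u[i - 1] - 2 * u[i] + u[i + 1]) / dx**2   # discrete u_xx
        new_u[i] = u[i] + dt * alpha * u_xx
    u = new_u

print(max(u))                # the hot spot has spread out and cooled
```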
The mean value theorem gives a relationship between values of the derivative and values of the original function. If f(x) is a real-valued function and a and b are numbers with a < b, then the mean value theorem says that under mild hypotheses, the slope between the two points (a, f(a)) and (b, f(b)) is equal to the slope of the tangent line to f at some point c between a and b. In other words,

$f'(c) = \frac{f(b) - f(a)}{b - a}.$
In practice, what the mean value theorem does is control a function in terms of its derivative. For instance, suppose that f has derivative equal to zero at each point. This means that its tangent line is horizontal at every point, so the function should also be horizontal. The mean value theorem proves that this must be true: The slope between any two points on the graph of f must equal the slope of one of the tangent lines of f. All of those slopes are zero, so any line from one point on the graph to another point will also have slope zero. But that says that the function does not move up or down, so it must be a horizontal line. More complicated conditions on the derivative lead to less precise but still highly useful information about the original function.
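The theorem can be spot-checked numerically; an illustrative example with f(x) = x³ on [0, 2], where the point c can be solved for in closed form:

```python
import math

def f(x):
    return x**3               # illustrative choice of function

def f_prime(x):
    return 3 * x**2

a, b = 0.0, 2.0
secant_slope = (f(b) - f(a)) / (b - a)    # (8 - 0) / 2 = 4

# Solve f'(c) = secant_slope by hand: 3c^2 = 4, so c = sqrt(4/3).
c = math.sqrt(secant_slope / 3)
print(c, f_prime(c))          # c is strictly between a and b
```

As the theorem promises, the tangent slope at c matches the secant slope between the endpoints.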
The derivative gives the best possible linear approximation, but this can be very different from the original function. One way of improving the approximation is to take a quadratic approximation. That is to say, the linearization of a real-valued function f(x) at the point x_{0} is a linear polynomial a + b(x − x_{0}), and it may be possible to get a better approximation by considering a quadratic polynomial a + b(x − x_{0}) + c(x − x_{0})². Still better might be a cubic polynomial a + b(x − x_{0}) + c(x − x_{0})² + d(x − x_{0})³, and this idea can be extended to arbitrarily high degree polynomials. For each one of these polynomials, there should be a best possible choice of coefficients a, b, c, and d that makes the approximation as good as possible.
For a, the best possible choice is always f(x_{0}), and for b, the best possible choice is always f'(x_{0}). For c, d, and higher-degree coefficients, the best choices are determined by higher derivatives of f: c should always be f''(x_{0})/2, and d should always be f'''(x_{0})/3!. Using these coefficients gives the Taylor polynomial of f. The Taylor polynomial of degree d is the polynomial of degree d which best approximates f, and its coefficients can be found by a generalization of the above formulas. Taylor's theorem gives a precise bound on how good the approximation is. If f is a polynomial of degree less than or equal to d, then the Taylor polynomial of degree d equals f.
The limit of the Taylor polynomials is an infinite series called the Taylor series. The Taylor series is frequently a very good approximation to the original function. Functions which are equal to their Taylor series are called analytic functions. It is impossible for functions with discontinuities or sharp corners to be analytic, but there are smooth functions which are not analytic.
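These coefficients can be checked numerically; an illustrative sketch using the Taylor polynomial of the exponential function at x₀ = 0 (every derivative of eˣ at 0 is 1, so the degree-k coefficient is 1/k!):

```python
import math

def taylor_exp(x, degree):
    """Taylor polynomial of exp at 0, up to the given degree: sum of x^k / k!."""
    return sum(x**k / math.factorial(k) for k in range(degree + 1))

x = 1.0
for d in (1, 2, 4, 8):
    print(d, abs(math.exp(x) - taylor_exp(x, d)))   # error shrinks with degree
```

The printed errors fall rapidly with degree, illustrating why the Taylor series of an analytic function converges to the function itself.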
Some natural geometric shapes, such as circles, cannot be drawn as the graph of a function. For instance, if F(x, y) = x² + y² − 1, then the circle is the set of all pairs (x, y) such that F(x, y) = 0. This set is called the zero set of F. It is not the same as the graph of F, which is a cone. The implicit function theorem converts relations such as F(x, y) = 0 into functions. It states that if F is continuously differentiable, then around most points, the zero set of F looks like graphs of functions pasted together. The points where this is not true are determined by a condition on the derivative of F. The circle, for instance, can be pasted together from the graphs of the two functions $y = \sqrt{1 - x^2}$ and $y = -\sqrt{1 - x^2}$. In a neighborhood of every point on the circle except (−1, 0) and (1, 0), one of these two functions has a graph that looks like the circle. (These two functions also happen to meet (−1, 0) and (1, 0), but this is not guaranteed by the implicit function theorem.)
The implicit function theorem is closely related to the inverse function theorem, which states when a function looks like graphs of invertible functions pasted together.
Differential calculus, a branch of calculus, is the process of finding out the rate of change of a variable compared to another variable, by using functions. It is a way to find out how a shape changes from one point to the next, without needing to divide the shape into an infinite number of pieces. Differential calculus is the opposite of integral calculus. It was developed in the 1670s and 1680s by Sir Isaac Newton and Gottfried Leibniz.
Unlike a number such as 5 or 200, a variable can change its value. For example, distance and time are variables. At an Olympic running race, as the person runs, their distance from the starting line goes up. Meanwhile, a stopwatch or clock measures the time as it goes up. We can measure the average speed of the runner if we divide the distance they travelled by the time it took. But this does not say what speed the person was running at exactly 1.5 seconds into the race. If we had the distance at 1 second and the distance at 2 seconds, we would still only have an average, although it would probably be more correct than the average for the whole race.
Until calculus was invented, the only way to work this out was to cut the time into smaller and smaller pieces, so the average speed over the smaller time would get closer and closer to the actual speed at exactly 1.5 seconds. This was a very long and hard process and had to be done each time people wanted to work something out. Imagine a driver trying to figure out a car's speed using only its odometer (distance meter) and clock, without a speedometer!
A very similar problem is to find the slope (how steep it is) at any point on a curve. The slope of a straight line is easy to work out—it is simply how much it goes up (y or vertical) divided by how much it goes across (x or horizontal). If a line is parallel to the x axis, then its slope is zero. If a straight line goes through (x,y) = (2,10) and (4,18), the line goes up 8 and goes across 2, so its slope is 8 divided by 2, which is 4.
On a curve, though, the slope is a variable (it has different values at different points) because the line bends. But if the curve were cut into very, very small pieces, then the curve at a given point would look almost like a very short straight line. So to work out its slope, a straight line can be drawn through the point with the same slope as the curve at that point. If it is done exactly right, the straight line will have the same slope as the curve, and is called a tangent. But there is no way to know (without calculus) whether the tangent is exactly right, and our eyes are not accurate enough to be certain whether it is exact or simply very close.
What Newton and Leibniz found was a way to work out the slope (or the speed in the distance example) exactly, using simple and logical rules. They imagined the curve as made up of an infinite number of very small pieces. They then chose a second point on the curve near the point they were interested in and worked out the slope of the straight line through the two points. As the second point moved closer to the first, that slope approached a particular value: the real slope of the curve at the point. They said that this particular value was the actual slope.
Let's say we have a function y = f(x). f is short for function, so this equation means "y is a function of x". This tells us that how high y is on the vertical axis depends on what x (the horizontal axis) is at that time. For example with the equation y = x², we know that if x is 1, then y will be 1; if x is 3, then y will be 9; if x is 20, then y will be 400.
Choose a point A on the curve, and call its horizontal position x. Then choose another point B on the curve which is a little bit farther across than A, and call its horizontal position x + h. Exactly how big h is does not matter; it is a very small number.
So when we go from point A to point B, the vertical position has gone from f(x) to f(x + h), and the horizontal position has gone from x to x + h. Now remember that the slope is how much it goes up divided by how much it goes across. So the slope will be:
$\frac{f(x+h) - f(x)}{h}$
If you bring B closer and closer to A—which means h gets closer and closer to 0—then we get closer to knowing what the slope is at the point A.
$\lim_{h \to 0} \frac{f(x+h) - f(x)}{h}$
Now let's go back to y = x². The slope of this can be determined as follows:
$f'(x) = \lim_{h \to 0} \frac{f(x+h) - f(x)}{h}$
$= \lim_{h \to 0} \frac{(x+h)^2 - x^2}{h}$
Applying the binomial theorem, which states that $(x+y)^2 = x^2 + 2xy + y^2$, we can reduce this to:
$= \lim_{h \to 0} \frac{x^2 + 2xh + h^2 - x^2}{h}$
$= \lim_{h \to 0} \frac{2xh + h^2}{h}$
$= \lim_{h \to 0} (2x + h)$
$= 2x$
So we know, without having to draw any tangent lines, that at every point on the curve f(x) = x² the derivative f'(x) (marked with an apostrophe) is 2x. This process of working out a slope using limits is called differentiation, or finding the derivative.
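The limit process above can be mimicked numerically; an illustrative sketch evaluating the difference quotient of f(x) = x² at x = 3 for shrinking values of h:

```python
def f(x):
    return x**2

def difference_quotient(f, x, h):
    """Slope of the line through (x, f(x)) and (x + h, f(x + h))."""
    return (f(x + h) - f(x)) / h

# As h shrinks, the quotient approaches the derivative f'(3) = 2 * 3 = 6.
for h in (1.0, 0.1, 0.001):
    print(h, difference_quotient(f, 3.0, h))
```

The printed slopes (7, then about 6.1, then about 6.001) home in on 6, matching f'(x) = 2x at x = 3.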
Leibniz came to the same result, but called h "dx", which means "a tiny amount of x". He called the resulting change in f(x) "dy", which means "a tiny amount of y". Leibniz's notation is used by more books because it is easy to understand when the equations become more complicated. In Leibniz notation:
$\frac{dy}{dx} = f'(x)$
Using the above system, mathematicians have worked out rules which work all the time, no matter which function is being looked at.
| Condition | Function | Derivative | Example | Example's derivative |
|---|---|---|---|---|
| A number by itself | $y = a$ | $\frac{dy}{dx} = 0$ | $y = 3$ | $0$ |
| A straight line | $y = mx + c$ | $\frac{dy}{dx} = m$ | $y = 3x + 5$ | $3$ |
| x to the power of a number | $y = x^a$ | $\frac{dy}{dx} = ax^{a-1}$ | $y = x^{12}$ | $12x^{11}$ |
| A number multiplied by a function | $y = cu$ | $\frac{dy}{dx} = c\frac{du}{dx}$ | $y = 3(x^2 + x)$ | $3(2x + 1)$ |
| A function plus another function | $y = u + v$ | $\frac{dy}{dx} = \frac{du}{dx} + \frac{dv}{dx}$ | $y = 3x^2 + 2\sqrt{x}$ | $6x + \frac{1}{\sqrt{x}}$ |
| A function minus another function | $y = u - v$ | $\frac{dy}{dx} = \frac{du}{dx} - \frac{dv}{dx}$ | $y = 3x^2 - 2\sqrt{x}$ | $6x - \frac{1}{\sqrt{x}}$ |
| A function multiplied by another function (product rule) | $y = u \times v$ | $\frac{dy}{dx} = \frac{du}{dx}v + u\frac{dv}{dx}$ | $y = (x^2 + x + 2)(3x - 1)$ | $(2x + 1)(3x - 1) + 3(x^2 + x + 2)$ |
| A function divided by another function (quotient rule) | $y = u \div v$ | $\frac{dy}{dx} = \frac{\frac{du}{dx}v - u\frac{dv}{dx}}{v^2}$ | $y = \frac{x^2 + 2}{x - 1}$ | $\frac{2x(x - 1) - (x^2 + 2)}{(x - 1)^2}$ |
| An exponential function | $y = e^x$ | $\frac{dy}{dx} = e^x$ | $y = e^x$ | $e^x$ |
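The example columns of the table can be spot-checked numerically; an illustrative sketch verifying the product-rule row against a central-difference quotient:

```python
def y(x):
    return (x**2 + x + 2) * (3*x - 1)     # u * v from the product-rule row

def dy(x):
    # (du/dx) v + u (dv/dx), with u = x^2 + x + 2 and v = 3x - 1
    return (2*x + 1) * (3*x - 1) + (x**2 + x + 2) * 3

def central_difference(f, x, h=1e-6):
    return (f(x + h) - f(x - h)) / (2 * h)

for x in (-1.0, 0.0, 2.0):
    print(x, dy(x), central_difference(y, x))   # the two columns agree
```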