Chain Rule

In calculus, the chain rule is a formula for the derivative of the composition of two functions. Intuitively, if a variable y depends on a second variable u, which in turn depends on a third variable x, then the rate of change of y with respect to x is the product of the rate of change of y with respect to u and the rate of change of u with respect to x.

Suppose, for example, that one is climbing a mountain at a rate of 0.5 kilometres per hour. The temperature is lower at higher elevations; suppose it decreases by 6 degrees per kilometre. How fast does the temperature drop? Multiplying 6 degrees per kilometre by 0.5 kilometres per hour gives 3 degrees per hour. This calculation is a typical application of the chain rule.

In algebraic terms, the chain rule (of one variable) states that if the function g is differentiable at x and the function f is differentiable at g(x), and the function F is defined as f composed with g, that is,
F(x) = (f \circ g)(x) = f(g(x)),
then F is differentiable at x and its derivative is given by
F'(x) = \frac{dF}{dx} = f'(g(x)) \times g'(x).
Alternatively, in Leibniz notation, the chain rule can be expressed as
\frac{dy}{dx} = \frac{dy}{du} \times \frac{du}{dx}
or
\frac{d(f \circ g)}{dx} = \frac{d(f \circ g)}{dg} \times \frac{dg}{dx}.

The general power rule

The general power rule (GPR) can be derived from the chain rule.

Example I

Consider
f(x) = (x^2 + 1)^3.
Here f(x) = h(g(x)), where g(x) = x^2 + 1 and h(x) = x^3; thus,
f'(x) = 3(x^2 + 1)^2 \times 2x = 6x(x^2 + 1)^2.
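The power-rule example, with g(x) = x^2 + 1 and h(x) = x^3, can be checked numerically. The following Python sketch compares the chain-rule product h'(g(x)) × g'(x) with a central finite-difference estimate of the derivative of h(g(x)). The function names, the step size and the sample points are illustrative assumptions, not part of the original text.

```python
def g(x):
    """Inner function g(x) = x^2 + 1."""
    return x ** 2 + 1


def h(u):
    """Outer function h(u) = u^3."""
    return u ** 3


def f(x):
    """Composite function f(x) = h(g(x)) = (x^2 + 1)^3."""
    return h(g(x))


def chain_rule_derivative(x):
    """h'(g(x)) * g'(x), with h'(u) = 3u^2 and g'(x) = 2x."""
    return 3 * g(x) ** 2 * (2 * x)


def central_difference(func, x, step=1e-6):
    """Numerical derivative estimate; the step size is an assumed value."""
    return (func(x + step) - func(x - step)) / (2 * step)


# Sample points chosen arbitrarily for illustration.
for x in (-1.5, 0.0, 0.5, 2.0):
    exact = chain_rule_derivative(x)
    approx = central_difference(f, x)
    print(f"x = {x:5.2f}   chain rule = {exact:12.6f}   finite difference = {approx:12.6f}")
```

The two columns should agree to several decimal places; any small discrepancy comes from the finite-difference approximation, not from the chain rule.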
Example II

To differentiate the trigonometric function
f(x) = \sin(x^2)
one can write f(x) = h(g(x)) with h(x) = \sin(x) and g(x) = x^2. The chain rule then yields
f'(x) = \cos(x^2) \times 2x = 2x \cos(x^2)
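since h'(g(x)) = \cos(x^2) and g'(x) = 2x.

Proof of the chain rule

Let f and g be functions and let x be a number such that f is differentiable at g(x) and g is differentiable at x. Then, by the definition of differentiability,
g(x + \delta) - g(x) = \delta g'(x) + \epsilon(\delta),
where \epsilon(\delta)/\delta \to 0 as \delta \to 0. Similarly,
f(g(x) + \alpha) - f(g(x)) = \alpha f'(g(x)) + \eta(\alpha),
where \eta(\alpha)/\alpha \to 0 as \alpha \to 0, and \eta(0) = 0. Now
f(g(x + \delta)) - f(g(x)) = f(g(x) + \delta g'(x) + \epsilon(\delta)) - f(g(x))
= f(g(x) + \alpha_\delta) - f(g(x))
= \alpha_\delta f'(g(x)) + \eta(\alpha_\delta),
where \alpha_\delta = \delta g'(x) + \epsilon(\delta). Observe that as \delta \to 0, \alpha_\delta/\delta \to g'(x) and \alpha_\delta \to 0. Consequently \eta(\alpha_\delta)/\delta \to 0, because it equals (\eta(\alpha_\delta)/\alpha_\delta) \times (\alpha_\delta/\delta) when \alpha_\delta \neq 0 and equals 0 when \alpha_\delta = 0. Hence
\frac{f(g(x + \delta)) - f(g(x))}{\delta} \to f'(g(x)) \times g'(x)
as \delta \to 0.

The fundamental chain rule

The chain rule is a fundamental property of all definitions of derivative and is therefore valid in much more general contexts. For instance, if E, F and G are Banach spaces (which include Euclidean spaces) and f : E → F and g : F → G are functions, and if x is an element of E such that f is differentiable at x and g is differentiable at f(x), then the derivative of the composition g ∘ f at the point x is given by
D_x(g \circ f) = D_{f(x)}(g) \circ D_x(f).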
Note that the derivatives here are linear maps and not numbers. If the linear maps are represented as matrices (namely Jacobians), the composition on the right-hand side becomes a matrix multiplication.

A particularly nice formulation of the chain rule can be achieved in the most general setting: let M, N and P be C^k manifolds (or even Banach manifolds) and let f : M → N and g : N → P be differentiable maps. The derivative of f, denoted by df, is then a map from the tangent bundle of M to the tangent bundle of N, and we may write
d(g \circ f) = dg \circ df.
In this way, the formation of derivatives and tangent bundles is seen as a functor on the category of C∞ manifolds with C∞ maps as morphisms.

Tensors and the chain rule as cocycle

As an advanced explanation of the tensor concept, one can interpret the chain rule, applied to coordinate changes, as the consistency requirement that makes the concept of a tensor well defined and gives rise to tensor fields. Abstractly, we can identify the chain rule as a cocycle. It gives the consistency required to define the tangent bundle in an intrinsic way. The other vector bundles of tensors have comparable cocycles, which come from applying functorial properties of tensor constructions to the chain rule itself; this is why they are also intrinsic (that is, 'natural') concepts. What can be called the 'classical' approach to tensors tries to read this backwards, and is therefore a heuristic approach rather than a foundational one.
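Returning to the finite-dimensional case, where derivatives are Jacobian matrices and composition of linear maps becomes matrix multiplication, the following Python sketch checks that the Jacobian of g ∘ f equals the product of the Jacobian of g at f(x) and the Jacobian of f at x. The two maps on the plane, the evaluation point and all helper names are assumptions chosen only for illustration.

```python
def f(p):
    """Inner map f : R^2 -> R^2 (an assumed example)."""
    x, y = p
    return [x * y, x + y]


def g(q):
    """Outer map g : R^2 -> R^2 (an assumed example)."""
    u, v = q
    return [u ** 2 + v, u * v]


def jacobian_f(p):
    """Hand-computed Jacobian of f at p."""
    x, y = p
    return [[y, x], [1.0, 1.0]]


def jacobian_g(q):
    """Hand-computed Jacobian of g at q."""
    u, v = q
    return [[2 * u, 1.0], [v, u]]


def matmul(a, b):
    """Plain-Python matrix product of two nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def numerical_jacobian(func, p, step=1e-6):
    """Central-difference Jacobian estimate; the step size is an assumed value."""
    rows = len(func(p))
    jac = [[0.0] * len(p) for _ in range(rows)]
    for j in range(len(p)):
        forward = list(p)
        backward = list(p)
        forward[j] += step
        backward[j] -= step
        plus, minus = func(forward), func(backward)
        for i in range(rows):
            jac[i][j] = (plus[i] - minus[i]) / (2 * step)
    return jac


point = [1.0, 2.0]

# Chain rule: Jacobian of g at f(point) times Jacobian of f at point.
chain_rule_jacobian = matmul(jacobian_g(f(point)), jacobian_f(point))

# Independent estimate obtained by differencing the composite map directly.
finite_difference_jacobian = numerical_jacobian(lambda p: g(f(p)), point)

print("chain rule:       ", chain_rule_jacobian)
print("finite difference:", finite_difference_jacobian)
```

At the point (1, 2) the matrix product is [[9, 5], [8, 5]], which matches differentiating the composite ((xy)^2 + x + y, xy(x + y)) by hand; the finite-difference estimate should agree up to small numerical error.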