Multi-Dimensional Taylor Series
Wayne Hacker
Copyright © Wayne Hacker 2006. All rights reserved.
Hackernotes: Wayne Hacker c 2006
Contents
Higher-order approximations to f (x, y)
Recall that in Calculus I, you approximated a function f by its tangent line: if |x−x0|
was sufficiently small,
\[ f(x) \approx f(x_0) + f'(x_0)(x - x_0) . \]
This is the first two terms in the Taylor expansion of f about the point x0. If you
want more accuracy, you keep more terms in the Taylor series. In particular, by
keeping one additional term, we get what is called a “second-order approximation”.
It has the form
\[ f(x) = f(x_0) + f'(x_0)(x - x_0) + \frac{1}{2} f''(x_0)(x - x_0)^2 + \frac{1}{6} f'''(\xi)(x - x_0)^3 . \tag{0.1} \]
The first two terms make up the tangent-line, or linear, approximation. The first
three terms make up the second-order approximation. The fourth term is called the
error term, and it allows us to use “=” instead of “≈” in the equation. In it, ξ is
between x0 and x: either x0 < ξ < x or x < ξ < x0, depending on whether x0 is
smaller or greater than x. We don’t know exactly what value ξ has, but we can use
it to estimate the maximum possible error in our approximation.
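As a minimal sketch of how the error term can be used, the Python snippet below compares the actual error of a second-order approximation with the bound obtained from the error term. The choice f = sin, the points x0 = 0.5 and x = 0.8, and the helper name taylor2 are illustrative assumptions, not part of these notes. Since |f'''| = |cos| is at most 1 everywhere, the error can be no larger than |x − x0|^3 / 6.

```python
import math

def taylor2(f, df, d2f, x0, x):
    """Second-order Taylor approximation of f about x0, evaluated at x."""
    h = x - x0
    return f(x0) + df(x0) * h + 0.5 * d2f(x0) * h**2

# Illustrative example: f = sin, so f' = cos, f'' = -sin, and |f'''| = |cos| <= 1.
x0, x = 0.5, 0.8
approx = taylor2(math.sin, math.cos, lambda t: -math.sin(t), x0, x)

actual_error = abs(math.sin(x) - approx)
error_bound = 1.0 * abs(x - x0)**3 / 6   # (max of |f'''|) * |x - x0|^3 / 6

print("actual error:", actual_error)
print("error bound: ", error_bound)
```

The actual error should come out below the bound, because the unknown ξ can only make |f'''(ξ)| smaller than the maximum used in the bound.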
Why use the second-order approximation? There are two approaches to answering
this question: a geometric and an algebraic one.
The geometric approach is more intuitive. The first-order approximation, or linear
approximation,
\[ f_{\text{linear}}(x) = f(x_0) + f'(x_0)(x - x_0) \]
approximates f (x) by a line passing through (x0, f(x0)) and tangent to f(x) at that
point. It’s a good approximation as long as x is close enough to x0 that the curve of
f (x) between them can be regarded as a straight line. The second-order approximation
\[ f_{\text{approx}}(x) = f(x_0) + f'(x_0)(x - x_0) + \frac{1}{2} f''(x_0)(x - x_0)^2 \]
approximates f (x) near x0 as a parabola passing through (x0, f(x0)), with the same
tangent line at x0, and also with the same concavity at x0. Thus even as f(x) curves
away from its tangent line at x0, the parabolic approximation can curve with it.
[Figure: the graph of f(x) near x0, together with the tangent line flinear and the parabola fapprox.]
The algebraic approach is based on the error terms in the Taylor expansion. We saw
in (0.1) that the error term was $\frac{1}{6} f'''(\xi)(x - x_0)^3$. The corresponding equation for the
first-order approximation is
\[ f(x) = f(x_0) + f'(x_0)(x - x_0) + \frac{1}{2} f''(\tilde{\xi})(x - x_0)^2 . \]
Note that $\tilde{\xi}$ in the first-order equation need not be the same as ξ in (0.1). However,
like ξ, $\tilde{\xi}$ must lie between x0 and x.
Thus
\[ \begin{aligned}
f(x) - f_{\text{linear}}(x) &= (\text{some constant})\,(x - x_0)^2 ; \\
f(x) - f_{\text{approx}}(x) &= (\text{some other constant})\,(x - x_0)^3 .
\end{aligned} \]
If |x − x0| is “small”, i.e. much smaller than 1, then $|x - x_0|^3$ is much smaller than
$|x - x_0|^2$. You can see this by letting $x - x_0 = 10^{-n}$ for n = 1, 2, 3, ...
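To see this scaling numerically, here is a small Python sketch. The function f = exp, the base point x0 = 1, and the names linear_error and quadratic_error are assumptions made for illustration. It divides the first-order error by h^2 and the second-order error by h^3, where h = x − x0 = 10^(−n).

```python
import math

def linear_error(h, x0=1.0):
    """|f(x0 + h) - f_linear(x0 + h)| for f = exp (so f' = f'' = exp)."""
    e0 = math.exp(x0)
    return abs(math.exp(x0 + h) - (e0 + e0 * h))

def quadratic_error(h, x0=1.0):
    """|f(x0 + h) - f_approx(x0 + h)| for f = exp."""
    e0 = math.exp(x0)
    return abs(math.exp(x0 + h) - (e0 + e0 * h + 0.5 * e0 * h**2))

for n in (1, 2, 3):
    h = 10.0 ** (-n)
    # If the errors scale like h^2 and h^3, these ratios stay roughly constant.
    print(n, linear_error(h) / h**2, quadratic_error(h) / h**3)
```

As n grows, the two ratios should settle near f''(x0)/2 = e/2 and f'''(x0)/6 = e/6 respectively, which is the sense in which the "constants" in the two error formulas are nearly constant for small |x − x0|.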
Now, let us extend this idea to functions of higher dimensions. Recall that the
tangent-plane approximation to the function z = f (x, y) at the point (x0, y0) is
\[ f(x, y) \approx z_{TP}(x, y) = f(x_0, y_0) + \nabla f(x_0, y_0) \cdot d\mathbf{x} , \]
where $d\mathbf{x} = \langle x - x_0,\, y - y_0 \rangle$.
The second-order approximation is
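\[ \begin{aligned}
f(x, y) \approx{} & f(x_0, y_0) + \nabla f(x_0, y_0) \cdot d\mathbf{x} + \frac{1}{2} f_{xx}(x_0, y_0)(x - x_0)^2 \\
& + f_{xy}(x_0, y_0)(x - x_0)(y - y_0) + \frac{1}{2} f_{yy}(x_0, y_0)(y - y_0)^2 .
\end{aligned} \tag{0.2} \]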
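Before deriving (0.2), it may help to try it on a concrete function. The Python sketch below uses f(x, y) = exp(x) sin(y), whose partial derivatives are easy to write by hand. The function, the sample points, and the helper names tangent_plane and second_order are all illustrative assumptions. It compares the tangent-plane error with the second-order error at a nearby point.

```python
import math

def f(x, y):
    # Illustrative choice of function; not taken from the notes.
    return math.exp(x) * math.sin(y)

def tangent_plane(x, y, x0, y0):
    """First-order (tangent-plane) approximation of f about (x0, y0)."""
    fx = math.exp(x0) * math.sin(y0)   # f_x = exp(x) sin(y)
    fy = math.exp(x0) * math.cos(y0)   # f_y = exp(x) cos(y)
    return f(x0, y0) + fx * (x - x0) + fy * (y - y0)

def second_order(x, y, x0, y0):
    """Second-order approximation (0.2) of f about (x0, y0)."""
    fxx = math.exp(x0) * math.sin(y0)    # f_xx = exp(x) sin(y)
    fxy = math.exp(x0) * math.cos(y0)    # f_xy = exp(x) cos(y)
    fyy = -math.exp(x0) * math.sin(y0)   # f_yy = -exp(x) sin(y)
    dx, dy = x - x0, y - y0
    return (tangent_plane(x, y, x0, y0)
            + 0.5 * fxx * dx**2 + fxy * dx * dy + 0.5 * fyy * dy**2)

x0, y0 = 0.2, 0.7
x, y = 0.25, 0.66
plane_error = abs(f(x, y) - tangent_plane(x, y, x0, y0))
quad_error = abs(f(x, y) - second_order(x, y, x0, y0))
print("tangent-plane error:", plane_error)
print("second-order error: ", quad_error)
```

For a point this close to (x0, y0), the second-order error should be noticeably smaller than the tangent-plane error, just as in the one-dimensional case.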
How did we get this formula? We know how to work with a one-dimensional Taylor
series; and we know a directional derivative is just a one-dimensional derivative: the
slope of a curve in the z-u plane, where u is the direction in which we take the
derivative. For example, fx is the same thing as the directional derivative $f_{\hat{\imath}}$, taken in the plane containing
$\hat{\imath}$ (and therefore the x-axis) and the z-axis. By analogy, we might expect a “two-dimensional”
Taylor series to look like a “one-dimensional” one when viewed in the
proper way.
Let (x0, y0) be a fixed point in the plane. Suppose we want to approximate f(x, y)
at some other point (x, y). Since Taylor series are constructed from derivatives, and
since the derivative for a general direction is a directional derivative, it makes sense
to parameterize (x, y) to be on the same line as (x0, y0). In that way, the domain is
reduced to one dimension, just as it is for fu.
We parameterize the line segment joining (x0, y0) and (x, y) by s and write it in terms
of the unit direction vector $\mathbf{u} = \langle \cos\theta_0, \sin\theta_0 \rangle$, where θ0 is the angle of the direction from (x0, y0) to
(x, y). Then x = x(s); y = y(s); and f (x(s), y(s)) = F (s). We want to expand
F (s) about s = 0, i.e. about the point (x(0), y(0)) = (x0, y0). This parameterization reduces a
two-dimensional domain to a one-dimensional one, and a two-dimensional function
f (x, y) to a one-dimensional function F (s). Instead of taking ∂x and ∂y, we take ∂s.
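The situation is illustrated below:
[Figure: the xy-plane, showing the points (x0, y0) and (x, y) joined by the line segment C, the unit vector u at angle θ0, and the displacements dx and dy.]
The parameter s is the distance from (x0, y0) in the direction of (x, y). This direction is
\[ \mathbf{u} = \langle \cos\theta_0, \sin\theta_0 \rangle . \]
The curve C is the line segment from (x0, y0) to (x, y). It is parameterized by
\[ x = x_0 + s\cos\theta_0 , \qquad y = y_0 + s\sin\theta_0 . \]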
We expand F (s) in a one-dimensional Taylor series about s = 0:
\[ F(s) = F(0) + \partial_s F(0)\, s + \frac{1}{2}\, \partial_s^2 F(0)\, s^2 + \frac{1}{6}\, \partial_s^3 F(\bar{s})\, s^3 , \tag{0.3} \]
where $\bar{s}$ is analogous to ξ in (0.1): $0 < \bar{s} < s$.
Consider the second term on the right side of (0.3). By the chain rule,
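\[ \begin{aligned}
\partial_s F(s) = \partial_s f(x(s), y(s)) &= f_x\, \partial_s x + f_y\, \partial_s y \\
&= f_x \cos\theta_0 + f_y \sin\theta_0 \\
&= \nabla f \cdot \mathbf{u} = f_u(x, y) \\
&= \partial_u f(x(s), y(s)) .
\end{aligned} \]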
Thus ∂s = ∂u. By a similar argument, we can show that
\[ \partial_s^2 F(s) = \partial_u^2 f(x(s), y(s)) = f_{uu} . \tag{0.4} \]
At s = 0, (x(s), y(s)) = (x0, y0); so
\[ \partial_s F(0) = f_u(x_0, y_0) = \nabla f(x_0, y_0) \cdot \mathbf{u} = f_x(x_0, y_0) \cos\theta_0 + f_y(x_0, y_0) \sin\theta_0 . \]
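The identity between the s-derivative of F at s = 0 and the directional derivative can be checked numerically. The sketch below is only an illustration: the function f(x, y) = x^2 y + sin(y), the point (1, 0.5), the angle θ0 = π/3, and the step size h are all assumed values. It compares a central-difference estimate of the derivative of F at 0 with the chain-rule value fx cos θ0 + fy sin θ0.

```python
import math

def f(x, y):
    # Illustrative function, chosen so the gradient is easy to write by hand.
    return x**2 * y + math.sin(y)

def grad_f(x, y):
    """Gradient (f_x, f_y) of the illustrative f above."""
    return (2 * x * y, x**2 + math.cos(y))

x0, y0, theta0 = 1.0, 0.5, math.pi / 3
u = (math.cos(theta0), math.sin(theta0))   # unit direction vector

def F(s):
    """f restricted to the line through (x0, y0) in the direction u."""
    return f(x0 + s * u[0], y0 + s * u[1])

h = 1e-5
dF_numeric = (F(h) - F(-h)) / (2 * h)       # central difference for dF/ds at s = 0
gx, gy = grad_f(x0, y0)
dF_chain_rule = gx * u[0] + gy * u[1]       # f_x cos(theta0) + f_y sin(theta0)

print(dF_numeric, dF_chain_rule)
```

The two printed values should agree to many decimal places; the small remaining difference comes from the finite-difference step, not from the formula.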
Therefore
\[ \begin{aligned}
\partial_s F(0)\, s &= f_x(x_0, y_0)(s\cos\theta_0) + f_y(x_0, y_0)(s\sin\theta_0) \\
&= f_x(x_0, y_0)(x - x_0) + f_y(x_0, y_0)(y - y_0) ,
\end{aligned} \tag{0.5} \]
since x − x0 = s cos θ0 and y − y0 = s sin θ0.
Now, let’s look at the third term on the right side of (0.3). From (0.4), we have
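$\partial_s^2 F = f_{uu}$. Now
\[ \begin{aligned}
f_{uu} = \partial_u f_u = \partial_u (\nabla f \cdot \mathbf{u}) &= \partial_u (f_x \cos\theta_0 + f_y \sin\theta_0) \\
&= \partial_u f_x \cos\theta_0 + \partial_u f_y \sin\theta_0 .
\end{aligned} \tag{0.6} \]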
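Recall that $\partial_u g = g_u = \nabla g \cdot \mathbf{u}$. If we apply this to g = fx and then g = fy, we get
\[ \begin{aligned}
\partial_u f_x &= \nabla f_x \cdot \mathbf{u} = f_{xx} \cos\theta_0 + f_{xy} \sin\theta_0 \\
\partial_u f_y &= \nabla f_y \cdot \mathbf{u} = f_{yx} \cos\theta_0 + f_{yy} \sin\theta_0 .
\end{aligned} \]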
When we put these results into (0.6), assuming fxy = fyx, we get
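\[ \begin{aligned}
f_{uu} &= (f_{xx} \cos\theta_0 + f_{xy} \sin\theta_0) \cos\theta_0 + (f_{xy} \cos\theta_0 + f_{yy} \sin\theta_0) \sin\theta_0 \\
&= f_{xx} \cos^2\theta_0 + 2 f_{xy} \sin\theta_0 \cos\theta_0 + f_{yy} \sin^2\theta_0 .
\end{aligned} \]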
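As a sanity check on this formula for fuu, the following Python sketch compares it with a second-difference estimate of the second s-derivative of F at s = 0. The function f(x, y) = x^2 y + sin(y), the base point, the angle, and the step size are assumptions chosen for illustration only.

```python
import math

def f(x, y):
    # Illustrative function with simple second partial derivatives.
    return x**2 * y + math.sin(y)

x0, y0, theta0 = 1.0, 0.5, math.pi / 3
c, s_ = math.cos(theta0), math.sin(theta0)

def F(s):
    """f along the line x = x0 + s cos(theta0), y = y0 + s sin(theta0)."""
    return f(x0 + s * c, y0 + s * s_)

# Second partial derivatives of the illustrative f at (x0, y0).
fxx = 2 * y0            # d/dx (2 x y)
fxy = 2 * x0            # d/dy (2 x y)
fyy = -math.sin(y0)     # d/dy (x^2 + cos y)

f_uu_formula = fxx * c**2 + 2 * fxy * s_ * c + fyy * s_**2

h = 1e-4
f_uu_numeric = (F(h) - 2 * F(0.0) + F(-h)) / h**2   # central second difference

print(f_uu_formula, f_uu_numeric)
```

The two values should agree to several decimal places. The mixed term appears with a factor of 2 because fxy and fyx are assumed equal, so their contributions combine.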
Thus
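\[ \begin{aligned}
\partial_s^2 F(0)\, s^2 &= f_{uu}(x_0, y_0)\, s^2 \\
&= f_{xx}(s\cos\theta_0)^2 + 2 f_{xy}(s\cos\theta_0)(s\sin\theta_0) + f_{yy}(s\sin\theta_0)^2 \\
&= f_{xx}(x - x_0)^2 + 2 f_{xy}(x - x_0)(y - y_0) + f_{yy}(y - y_0)^2 ,
\end{aligned} \tag{0.7} \]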
where we have used x − x0 = s cos θ0 and y − y0 = s sin θ0.
Now, substitute (0.5) and (0.7) into (0.3), along with the fact that F (0) = f (x0, y0).
\[ \begin{aligned}
F(s) = f(x, y) ={} & f(x_0, y_0) + f_x(x_0, y_0)(x - x_0) + f_y(x_0, y_0)(y - y_0) \\
& + \frac{1}{2} f_{xx}(x_0, y_0)(x - x_0)^2 + f_{xy}(x_0, y_0)(x - x_0)(y - y_0) \\
& + \frac{1}{2} f_{yy}(x_0, y_0)(y - y_0)^2 + (\text{error term}) .
\end{aligned} \]
Assuming that the error term is small, this is equivalent to (0.2).
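As a short worked example of (0.2), take f(x, y) = e^x cos y about (x0, y0) = (0, 0). This function is chosen here only for illustration and is not part of the derivation above.

```latex
% Partial derivatives of f(x,y) = e^x \cos y, evaluated at (0,0):
\[ f = 1, \quad f_x = e^x \cos y = 1, \quad f_y = -e^x \sin y = 0, \]
\[ f_{xx} = e^x \cos y = 1, \quad f_{xy} = -e^x \sin y = 0, \quad f_{yy} = -e^x \cos y = -1 . \]
% Substituting into (0.2) with x_0 = y_0 = 0:
\[ e^x \cos y \approx 1 + x + \frac{1}{2} x^2 - \frac{1}{2} y^2 . \]
```

This agrees with what you get by multiplying the one-dimensional expansions e^x ≈ 1 + x + x^2/2 and cos y ≈ 1 − y^2/2 and discarding terms of third order and higher.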