## Fixed point iteration

To answer the question of why the iterative method for solving nonlinear equations works in some cases but fails in others, we need to understand the theory behind the method: the fixed point of a contraction function. If a single-variable function $f(x)$ satisfies

$$ |f(x) - f(y)| \le L\,|x - y| \tag{36} $$

it is Lipschitz continuous, and $L$ is a Lipschitz constant. In particular, if $L \le 1$, then $f$ is a non-expansive function; if $L < 1$, then $f$ is a contraction function, or simply a contraction. These concepts can be generalized to functions of multiple variables (points in an $N$-D metric space with distance $d$):

$$ |f_i(\mathbf{x}) - f_i(\mathbf{y})| \le L\, d(\mathbf{x}, \mathbf{y}), \qquad i = 1, \dots, N \tag{37} $$

which can also be expressed in vector form:

$$ d\big(\mathbf{f}(\mathbf{x}),\, \mathbf{f}(\mathbf{y})\big) \le L\, d(\mathbf{x}, \mathbf{y}) \tag{38} $$

Definition: In a metric space $(X, d)$, with a distance $d(\mathbf{x}, \mathbf{y})$ defined between any two points $\mathbf{x}, \mathbf{y} \in X$, a function $\mathbf{f}: X \to X$ is a contraction if

$$ d\big(\mathbf{f}(\mathbf{x}),\, \mathbf{f}(\mathbf{y})\big) \le L\, d(\mathbf{x}, \mathbf{y}), \qquad 0 \le L < 1 \tag{39} $$

The smallest value $L$ that satisfies the above is called the (best) Lipschitz constant.

In the following, we will use the p-norm of a vector, defined below, as the distance measure:

$$ d(\mathbf{x}, \mathbf{y}) = \|\mathbf{x} - \mathbf{y}\|_p = \left( \sum_{i=1}^N |x_i - y_i|^p \right)^{1/p} \tag{40} $$

where $p \ge 1$, e.g., $p = 1, 2, \infty$. Also, for convenience, we can drop the subscript when $p = 2$, so that $\|\mathbf{x}\| = \|\mathbf{x}\|_2$.
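As a quick illustration (a sketch not part of the original text), the p-norm distance of equation (40) can be computed in a few lines of Python, with `p = float("inf")` giving the max-norm:

```python
def p_norm(x, p=2):
    """p-norm of a vector x (any sequence of numbers), with p >= 1 or infinity."""
    if p == float("inf"):
        return max(abs(xi) for xi in x)
    return sum(abs(xi) ** p for xi in x) ** (1.0 / p)

def distance(x, y, p=2):
    """Distance between points x and y measured by the p-norm of x - y."""
    return p_norm([xi - yi for xi, yi in zip(x, y)], p)

d1 = distance([1, 2, 3], [0, 0, 0], p=1)              # 1-norm: 6.0
d2 = distance([3, 4], [0, 0], p=2)                    # 2-norm: 5.0
di = distance([1, -7, 3], [0, 0, 0], p=float("inf"))  # max-norm: 7
```

The helper names `p_norm` and `distance` are illustrative choices, not functions from the text.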

Intuitively, a contraction reduces the distance between points in the space, i.e., it brings them closer together. A function may not be a contraction throughout its entire domain, but it can be a contraction in the neighborhood of a certain point $\mathbf{x}^*$, in which any $\mathbf{x}$ is sufficiently close to $\mathbf{x}^*$ so that

$$ \|\mathbf{f}(\mathbf{x}) - \mathbf{f}(\mathbf{x}^*)\| \le L\,\|\mathbf{x} - \mathbf{x}^*\| \quad\text{when}\quad \|\mathbf{x} - \mathbf{x}^*\| < \delta \tag{41} $$

Definition: A fixed point of a function $\mathbf{f}$ is a point $\mathbf{x}^*$ in its domain that is mapped to itself:

$$ \mathbf{f}(\mathbf{x}^*) = \mathbf{x}^* \tag{42} $$

We immediately have

$$ \mathbf{f}^{(n)}(\mathbf{x}^*) = \mathbf{f}\big(\cdots \mathbf{f}(\mathbf{f}(\mathbf{x}^*)) \cdots\big) = \mathbf{x}^* \tag{43} $$

A fixed point $\mathbf{x}^*$ is an attractive fixed point if any point $\mathbf{x}$ in its neighborhood converges to $\mathbf{x}^*$, i.e., $\lim_{n\to\infty} \mathbf{f}^{(n)}(\mathbf{x}) = \mathbf{x}^*$.

Fixed Point Theorem: Let $\mathbf{f}$ be a contraction function satisfying

$$ \|\mathbf{f}(\mathbf{x}) - \mathbf{f}(\mathbf{y})\| \le L\,\|\mathbf{x} - \mathbf{y}\|, \qquad 0 \le L < 1 \tag{44} $$

Then there exists a unique fixed point $\mathbf{x}^*$, which can be found by an iteration from an arbitrary initial point $\mathbf{x}_0$:

$$ \mathbf{x}_{n+1} = \mathbf{f}(\mathbf{x}_n), \qquad n = 0, 1, 2, \dots \tag{45} $$

Proof

• We first prove constructively the existence of a fixed point. As $\mathbf{f}$ is a contraction, we have

$$ \|\mathbf{x}_{n+1} - \mathbf{x}_n\| = \|\mathbf{f}(\mathbf{x}_n) - \mathbf{f}(\mathbf{x}_{n-1})\| \le L\,\|\mathbf{x}_n - \mathbf{x}_{n-1}\| \le \cdots \le L^n \|\mathbf{x}_1 - \mathbf{x}_0\| \tag{46} $$

As $L < 1$, we have $\lim_{n\to\infty} \|\mathbf{x}_{n+1} - \mathbf{x}_n\| = 0$. This is a Cauchy sequence that converges to some point $\mathbf{x}^*$ also in the space. We further have

$$ \mathbf{f}(\mathbf{x}^*) = \mathbf{f}\Big(\lim_{n\to\infty} \mathbf{x}_n\Big) = \lim_{n\to\infty} \mathbf{f}(\mathbf{x}_n) = \lim_{n\to\infty} \mathbf{x}_{n+1} = \mathbf{x}^* \tag{47} $$

i.e., the limit of the Cauchy sequence is a fixed point.

• We next prove the uniqueness of the fixed point. Let $\mathbf{x}^*$ and $\mathbf{y}^*$ be two fixed points of $\mathbf{f}$; then we have

$$ \|\mathbf{x}^* - \mathbf{y}^*\| = \|\mathbf{f}(\mathbf{x}^*) - \mathbf{f}(\mathbf{y}^*)\| \le L\,\|\mathbf{x}^* - \mathbf{y}^*\| \tag{48} $$

For any $L < 1$, the above holds only if $\|\mathbf{x}^* - \mathbf{y}^*\| = 0$, i.e., $\mathbf{x}^* = \mathbf{y}^*$ is the unique fixed point.
QED
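The constructive proof above suggests a direct implementation: simply apply equation (45) until successive iterates stop changing. The following Python sketch uses $g(x) = \cos x$ as an illustrative contraction (an example of ours, not from the text); its fixed point $x^* \approx 0.739085$ is the intersection of $y = \cos x$ and $y = x$:

```python
import math

def fixed_point(g, x0, tol=1e-10, max_iter=1000):
    """Iterate x_{n+1} = g(x_n) until successive iterates differ by < tol."""
    x = x0
    for _ in range(max_iter):
        x_next = g(x)
        if abs(x_next - x) < tol:
            return x_next
        x = x_next
    raise RuntimeError("iteration did not converge")

# cos is a contraction near its fixed point: |g'(x*)| = |sin x*| < 1 there
x_star = fixed_point(math.cos, x0=1.0)
print(x_star)  # ~0.7390851332
```

Because the Cauchy-sequence argument needs no derivative information, the same routine works for any contraction `g`.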

Theorem: Let $\mathbf{x}^*$ be a fixed point of a differentiable function $\mathbf{f}$, i.e., $\partial f_i/\partial x_j$ exists for any $i$ and $j$. If the norm of the Jacobian matrix at the fixed point is smaller than 1, $\|\mathbf{J}_{\mathbf{f}}(\mathbf{x}^*)\| < 1$, then $\mathbf{f}$ is a contraction at $\mathbf{x}^*$.

The Jacobian matrix of $\mathbf{f} = [f_1, \dots, f_N]^T$ is defined as

$$ \mathbf{J}_{\mathbf{f}} = \frac{d\mathbf{f}}{d\mathbf{x}} = \begin{bmatrix} \partial f_1/\partial x_1 & \cdots & \partial f_1/\partial x_N \\ \vdots & \ddots & \vdots \\ \partial f_N/\partial x_1 & \cdots & \partial f_N/\partial x_N \end{bmatrix} \tag{49} $$

Proof: Consider the Taylor expansion of the function $\mathbf{f}$ in the neighborhood of $\mathbf{x}^*$:

$$ \mathbf{f}(\mathbf{x}) = \mathbf{f}(\mathbf{x}^*) + \mathbf{J}_{\mathbf{f}}(\mathbf{x}^*)\,(\mathbf{x} - \mathbf{x}^*) + \mathbf{R}(\mathbf{x}) \tag{50} $$

where $\mathbf{R}(\mathbf{x})$ is the remainder composed of second and higher order terms of $\mathbf{x} - \mathbf{x}^*$. Subtracting $\mathbf{f}(\mathbf{x}^*)$ and taking any p-norm on both sides, we get

$$ \|\mathbf{f}(\mathbf{x}) - \mathbf{f}(\mathbf{x}^*)\| = \|\mathbf{J}_{\mathbf{f}}(\mathbf{x}^*)\,(\mathbf{x} - \mathbf{x}^*) + \mathbf{R}(\mathbf{x})\| \tag{51} $$

When $\mathbf{x} \to \mathbf{x}^*$, the second and higher order terms of $\mathbf{x} - \mathbf{x}^*$ disappear and $\|\mathbf{R}(\mathbf{x})\| \to 0$, so we have

$$ \|\mathbf{f}(\mathbf{x}) - \mathbf{f}(\mathbf{x}^*)\| \approx \|\mathbf{J}_{\mathbf{f}}(\mathbf{x}^*)\,(\mathbf{x} - \mathbf{x}^*)\| \le \|\mathbf{J}_{\mathbf{f}}(\mathbf{x}^*)\|\;\|\mathbf{x} - \mathbf{x}^*\| \tag{52} $$

The inequality follows from the definition of the induced matrix norm (for $p = 2$ it can also be obtained from the Cauchy-Schwarz inequality). If $\|\mathbf{J}_{\mathbf{f}}(\mathbf{x}^*)\| < 1$, the function is a contraction at $\mathbf{x}^*$.

QED

In particular, for a single-variable function $g(x)$ in an $N = 1$ dimensional space, we have

$$ \mathbf{J}_g(x^*) = g'(x^*) \tag{53} $$

and

$$ |g(x) - g(x^*)| \approx |g'(x^*)|\,|x - x^*| \tag{54} $$

If $|g'(x^*)| < 1$, then $g$ is a contraction at $x^*$.

Now we understand why, in the examples of the previous section, the iteration leads to convergence in some cases but divergence in others: if $|g'(x^*)| < 1$, the iteration will converge to the root $x^*$ of $f(x) = 0$, but if $|g'(x^*)| > 1$, it will never converge.
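To make the criterion concrete, consider a small example of ours (not from the previous section): the root $x^* = 2$ of $f(x) = x^2 - x - 2 = 0$ can be targeted with either $g_a(x) = \sqrt{x + 2}$, where $|g_a'(2)| = 1/4 < 1$, or $g_b(x) = x^2 - 2$, where $|g_b'(2)| = 4 > 1$. Only the first rearrangement converges:

```python
import math

def iterate(g, x0, n):
    """Apply x_{k+1} = g(x_k) for n steps and return the final iterate."""
    x = x0
    for _ in range(n):
        x = g(x)
    return x

g_a = lambda x: math.sqrt(x + 2)  # |g'(2)| = 1/4 < 1: contraction at x* = 2
g_b = lambda x: x * x - 2         # |g'(2)| = 4  > 1: not a contraction at x* = 2

x_conv = iterate(g_a, 1.5, 50)    # converges to 2
x_div = iterate(g_b, 1.5, 1)      # a single step already moves away from 2

print(x_conv)                        # ~2.0
print(abs(1.5 - 2), abs(x_div - 2))  # error grows: 0.5 -> 1.75
```

The names `g_a` and `g_b` and the equation itself are illustrative assumptions chosen so that both rearrangements share the root $x^* = 2$.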

The iterative process for finding the fixed point of a single-variable function can be shown graphically via the intersections of the function $y = g(x)$ and the identity function $y = x$, as shown below. The iteration converges in the first two cases, as $|g'(x^*)| < 1$, but it diverges in the last two cases, as $|g'(x^*)| > 1$.

We next find the order of convergence of the fixed point iteration. Consider the error $e_n = x_n - x^*$:

$$ |e_{n+1}| = |x_{n+1} - x^*| = |g(x_n) - g(x^*)| \approx |g'(x^*)|\,|x_n - x^*| = |g'(x^*)|\,|e_n| \tag{55} $$

We see that in general the fixed point iteration converges linearly. However, if the iteration function has zero derivative at the fixed point, $g'(x^*) = 0$, we have, by the Taylor expansion,

$$ x_{n+1} = g(x_n) = g(x^*) + g'(x^*)\,e_n + \frac{g''(x^*)}{2}\,e_n^2 + O(e_n^3) = x^* + \frac{g''(x^*)}{2}\,e_n^2 + O(e_n^3) \tag{56} $$

$$ \lim_{n\to\infty} \frac{|e_{n+1}|}{|e_n|^2} = \frac{|g''(x^*)|}{2} = \text{constant} \tag{57} $$

i.e., the iteration converges quadratically. Moreover, if $g''(x^*) = 0$ as well, then the iteration converges cubically.

Consider a specific iteration function in the form of $g(x) = x + f(x)\,h(x)$, which is equivalent to $f(x) = 0$: if $x^*$ is the root of $f$ satisfying $f(x^*) = 0$, it also satisfies $g(x^*) = x^*$, i.e., it is indeed a fixed point of $g$. The derivative of this function is

$$ g'(x) = 1 + f'(x)\,h(x) + f(x)\,h'(x) \tag{58} $$

If $f'(x^*) \ne 0$, then we can define $h(x) = -1/f'(x)$, so that at the root, where $f(x^*) = 0$, we get $g'(x^*) = 1 - f'(x^*)/f'(x^*) = 0$, and the convergence becomes quadratic. This is actually the Newton-Raphson method, to be discussed later.
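The choice $h(x) = -1/f'(x)$ gives the iteration $x_{n+1} = x_n - f(x_n)/f'(x_n)$, which can be sketched as follows. The problem $f(x) = x^2 - 2$ (with root $\sqrt{2}$) is an assumed example of ours; note how few iterations the quadratically convergent scheme needs:

```python
def newton_fixed_point(f, df, x0, tol=1e-14, max_iter=50):
    """Fixed point iteration with g(x) = x - f(x)/f'(x), i.e. Newton-Raphson."""
    x = x0
    for _ in range(max_iter):
        x_next = x - f(x) / df(x)
        if abs(x_next - x) < tol:
            return x_next
        x = x_next
    return x

# Illustrative problem (an assumption, not from the text): f(x) = x^2 - 2
root = newton_fixed_point(lambda x: x * x - 2, lambda x: 2 * x, x0=1.0)
print(root)  # ~1.4142135623730951
```

Starting from $x_0 = 1$, the iterates reach machine precision in about five steps, in line with equation (57).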

Example 1

Find the solution of the following equation:

This equation can be converted into an equivalent form $x = g(x)$ in two different ways:

$x = g_1(x)$  and  $x = g_2(x)$

In the figure below, the two functions $g_1(x)$ (left) and $g_2(x)$ (right), together with the identity function $y = x$, are plotted:

The iteration based on $g_1(x)$ converges to the solution $x^*$ for any initial guess $x_0$, as $g_1$ is a contraction. However, the iteration based on $g_2(x)$ does not converge to $x^*$, as it is not a contraction in the neighborhood of $x^*$; in fact, the iteration will move away from $x^*$ in one direction or the other, depending on whether $x_0 > x^*$ or $x_0 < x^*$.

Example 2

To solve the following equation

we first convert it into the form $x = g(x)$ in two different ways:

As can be seen in the plots, this equation has three solutions, denoted $x_1^* < x_2^* < x_3^*$,

of which $x_1^*$ and $x_3^*$ can be obtained by the iteration based on $g_1(x)$, and $x_2^*$ by the iteration based on $g_2(x)$; but neither of them can find all three roots.

• As shown in the plot on the left, $|g_1'(x)| < 1$ for all $x$ except in the neighborhood of $x_2^*$, i.e., $g_1$ is a contraction mapping everywhere except around $x_2^*$. Therefore the iteration starting from any initial guess $x_0$ will converge to either $x_1^*$ or $x_3^*$, depending on which side of $x_2^*$ the guess lies. However, as $g_1$ is not a contraction mapping around $x_2^*$, the iteration will never converge to $x_2^*$.

• As shown in the plot on the right, $|g_2'(x)| < 1$ for all $x$ except in the neighborhoods of $x_1^*$ and $x_3^*$, i.e., $g_2$ is not a contraction mapping around either $x_1^*$ or $x_3^*$. Therefore the iteration based on $g_2$ will not converge to either $x_1^*$ or $x_3^*$, but it may converge to $x_2^*$ if the initial guess is in the range between $x_1^*$ and $x_3^*$. However, if $x_0$ is outside this range, the iteration will diverge away from the roots in one direction or the other.

Example 3

This equation can be converted into the form $x = g(x)$ in different ways:

Example 4

• The iteration from any initial guess $x_0$ will converge to the fixed point $x^*$.

• Around the other fixed point, where the iteration function is not a contraction, the iteration does not converge.

Example 5

Consider a 3-variable linear vector function of arguments $x_1, x_2, x_3$:

from which the g-function $\mathbf{x} = \mathbf{g}(\mathbf{x})$ can be obtained:

The Jacobian of this linear system is a constant matrix

with the induced $p = 2$ norm (maximum singular value) $\|\mathbf{J}\|_2 < 1$. Consequently, the iteration $\mathbf{x}_{n+1} = \mathbf{g}(\mathbf{x}_n)$ converges from an initial guess $\mathbf{x}_0$ to the solution $\mathbf{x}^*$.

Alternatively, the g-function can also be obtained as

The Jacobian is

with the induced $p = 2$ norm $\|\mathbf{J}\|_2 > 1$. The iteration does not converge.
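Since the specific system of this example is not reproduced here, the following sketch uses a hypothetical diagonally dominant system $A\mathbf{x} = \mathbf{b}$ (our own assumption) with the Jacobi-style rearrangement $\mathbf{g}(\mathbf{x}) = D^{-1}\big(\mathbf{b} - (A - D)\mathbf{x}\big)$, where $D$ is the diagonal of $A$. Its constant Jacobian is $\mathbf{J} = -D^{-1}(A - D)$, and the iteration converges because $\|\mathbf{J}\|_2 < 1$:

```python
import numpy as np

# Hypothetical diagonally dominant system (not the one in the text)
A = np.array([[4.0, 1.0, 1.0],
              [1.0, 5.0, 2.0],
              [1.0, 2.0, 6.0]])
b = np.array([6.0, 8.0, 9.0])

D = np.diag(np.diag(A))         # diagonal part of A
J = -np.linalg.solve(D, A - D)  # constant Jacobian of g(x) = D^{-1}(b - (A - D) x)
print(np.linalg.norm(J, 2))     # induced 2-norm (max singular value), < 1 here

x = np.zeros(3)
for _ in range(200):            # x_{n+1} = g(x_n)
    x = np.linalg.solve(D, b - (A - D) @ x)

print(x, np.allclose(A @ x, b))  # converges to the solution of A x = b
```

Swapping the rearrangement so that the off-diagonal part dominates would push $\|\mathbf{J}\|_2$ above 1 and make the same loop diverge, mirroring the two cases in this example.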

Example 6

Consider a 3-variable nonlinear function of arguments $x_1, x_2, x_3$:

The g-function can be obtained as

With a proper initial guess $\mathbf{x}_0$, the iteration converges to the solution after a number of iterations, with a small error. However, the iteration may not converge from other possible initial guesses.

By Aitken's method, the iteration can be accelerated based on the two consecutive iterates $x_{n+1} = g(x_n)$ and $x_{n+2} = g(x_{n+1})$ following any point $x_n$, as shown in the figure below.

The secant line of $g(x)$ that goes through the two points $(x_n, g(x_n)) = (x_n, x_{n+1})$ and $(x_{n+1}, g(x_{n+1})) = (x_{n+1}, x_{n+2})$ is represented by the equation in terms of its slope:

$$ \frac{y - x_{n+1}}{x - x_n} = \frac{x_{n+2} - x_{n+1}}{x_{n+1} - x_n} \tag{59} $$

Solving for $y$, we get

$$ y = x_{n+1} + \frac{x_{n+2} - x_{n+1}}{x_{n+1} - x_n}\,(x - x_n) \tag{60} $$

To accelerate, instead of moving from $x_{n+1}$ to $x_{n+2}$, we move to the point at which this secant line intersects with the identity function $y = x$. We can therefore replace $y$ in the equation above by $x$ and solve the resulting equation

$$ x = x_{n+1} + \frac{x_{n+2} - x_{n+1}}{x_{n+1} - x_n}\,(x - x_n) \tag{61} $$

to get

$$ x = x_n - \frac{(x_{n+1} - x_n)^2}{x_{n+2} - 2x_{n+1} + x_n} = x_n - \frac{(\Delta x_n)^2}{\Delta^2 x_n} \tag{62} $$

where $\Delta x_n$ and $\Delta^2 x_n$ are respectively the first and second order differences defined below:

$$ \Delta x_n = x_{n+1} - x_n \tag{63} $$

$$ \Delta^2 x_n = \Delta x_{n+1} - \Delta x_n = x_{n+2} - 2x_{n+1} + x_n \tag{64} $$

This result can then be converted into an iterative process:

$$ x_{n+3} = x_n - \frac{(\Delta x_n)^2}{\Delta^2 x_n} = x_n - \frac{(x_{n+1} - x_n)^2}{x_{n+2} - 2x_{n+1} + x_n} \tag{65} $$

Given $x_n$, we skip $x_{n+1}$ and $x_{n+2}$ and move directly to $x_{n+3}$, computed based on $x_{n+1} = g(x_n)$ and $x_{n+2} = g(x_{n+1})$, thereby making a greater step towards the solution.
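Equation (65) translates directly into code. The sketch below applies the accelerated step repeatedly, using $g(x) = \cos x$ as an assumed illustrative iteration function (not the equation of the example that follows):

```python
import math

def aitken(g, x0, tol=1e-12, max_steps=100):
    """Aitken-accelerated fixed point iteration, equation (65):
    from x_n compute x_{n+1} = g(x_n) and x_{n+2} = g(x_{n+1}),
    then jump to x_n - (dx)^2 / d2x, where dx = x_{n+1} - x_n
    and d2x = x_{n+2} - 2 x_{n+1} + x_n."""
    x = x0
    for _ in range(max_steps):
        x1 = g(x)
        x2 = g(x1)
        d2x = x2 - 2 * x1 + x
        if d2x == 0:               # already (numerically) at the fixed point
            return x2
        x_new = x - (x1 - x) ** 2 / d2x
        if abs(x_new - x) < tol:
            return x_new
        x = x_new
    return x

root = aitken(math.cos, 1.0)
print(root)  # ~0.7390851332, in far fewer g-evaluations than plain iteration
```

Reusing the accelerated point as the next starting value, as done here, is also known as Steffensen's method.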

Example: Solve $f(x) = 0$. Construct the iteration function $g(x)$. It takes 18 iterations for the regular fixed point algorithm with the initial guess $x_0$ to get a result that satisfies the error tolerance, but only three iterations for Aitken's method to converge to the same result: