Next: Optimization Up: Solving Non-Linear Equations Previous: Newton-Raphson method (univariate)

## Newton-Raphson method (multivariate)

Before discussing how to solve a multivariate system, it is helpful to review the Taylor series expansion of an N-D function.

The methods discussed above for solving a 1-D equation $f(x)=0$ can be generalized for solving an N-D multivariate equation system:

$$f_i(x_1,\dots,x_N)=0,\qquad i=1,\dots,N$$

where we have defined

$$\mathbf{x}=[x_1,\dots,x_N]^T$$

We can further define a vector function

$$\mathbf{f}(\mathbf{x})=[f_1(\mathbf{x}),\dots,f_N(\mathbf{x})]^T$$

so that the equation system above can be written as

$$\mathbf{f}(\mathbf{x})=\mathbf{0}$$

When $N=2$, the solution of the equation above can be explained geometrically. The equations $f_1(x_1,x_2)=0$ and $f_2(x_1,x_2)=0$ represent contour curves in the $(x_1,x_2)$ plane that partition the plane into regions in which each function takes either positive or negative values. The solutions that satisfy both equations are the intersections of the zero contour curves of both $f_1$ and $f_2$.

### Newton-Raphson method

The Newton-Raphson method can also be generalized for solving an N-D system $\mathbf{f}(\mathbf{x})=\mathbf{0}$. We first consider the Taylor expansions of the $N$ functions:

$$f_i(\mathbf{x}+\boldsymbol{\delta})=f_i(\mathbf{x})+\sum_{j=1}^N\frac{\partial f_i(\mathbf{x})}{\partial x_j}\,\delta_j+O(\|\boldsymbol{\delta}\|^2),\qquad i=1,\dots,N$$

These equations can be written in vector form:

$$\mathbf{f}(\mathbf{x}+\boldsymbol{\delta})=\mathbf{f}(\mathbf{x})+\mathbf{J}(\mathbf{x})\,\boldsymbol{\delta}+O(\|\boldsymbol{\delta}\|^2)$$

where $\mathbf{J}(\mathbf{x})$ is the $N\times N$ Jacobian matrix defined over the function vector $\mathbf{f}(\mathbf{x})$:

$$\mathbf{J}(\mathbf{x})=\frac{d\mathbf{f}(\mathbf{x})}{d\mathbf{x}}
=\begin{bmatrix}
\partial f_1/\partial x_1 & \cdots & \partial f_1/\partial x_N\\
\vdots & \ddots & \vdots\\
\partial f_N/\partial x_1 & \cdots & \partial f_N/\partial x_N
\end{bmatrix}$$

with the $ij$th element being $J_{ij}=\partial f_i/\partial x_j$. Note that the $i$th row vector of $\mathbf{J}(\mathbf{x})$ is the gradient vector of the $i$th function $f_i(\mathbf{x})$:

$$\mathbf{J}_i=\left[\frac{\partial f_i}{\partial x_1},\dots,\frac{\partial f_i}{\partial x_N}\right]=(\nabla f_i)^T$$

As in the 1-D case, if all equations are linear, the error terms are zero and we have

$$\mathbf{f}(\mathbf{x}+\boldsymbol{\delta})=\mathbf{f}(\mathbf{x})+\mathbf{J}(\mathbf{x})\,\boldsymbol{\delta}$$

By assuming $\mathbf{f}(\mathbf{x}+\boldsymbol{\delta})=\mathbf{0}$, we can find the root as $\mathbf{x}+\boldsymbol{\delta}$, where $\boldsymbol{\delta}$ can be obtained by solving the equation above:

$$\boldsymbol{\delta}=-\mathbf{J}^{-1}(\mathbf{x})\,\mathbf{f}(\mathbf{x})$$

and the root can be found from any starting point $\mathbf{x}_0$ as

$$\mathbf{x}=\mathbf{x}_0-\mathbf{J}^{-1}(\mathbf{x}_0)\,\mathbf{f}(\mathbf{x}_0)$$

If the equations are nonlinear, this result is only an approximation of the real root, which can be improved iteratively:

$$\mathbf{x}_{n+1}=\mathbf{x}_n-\mathbf{J}^{-1}(\mathbf{x}_n)\,\mathbf{f}(\mathbf{x}_n)$$

The iteration moves in the N-D space from an initial position $\mathbf{x}_0$ in such a direction that the magnitudes of all functions $f_i(\mathbf{x})$ ($i=1,\dots,N$) are reduced.
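To make the iteration concrete, here is a minimal Python/NumPy sketch (not from the original notes; the example system $x_1^2+x_2^2=4$, $x_1x_2=1$ and the function names are hypothetical illustrations):

```python
import numpy as np

def newton_raphson(f, J, x0, tol=1e-10, max_iter=50):
    """Multivariate Newton-Raphson: x_{n+1} = x_n - J(x_n)^{-1} f(x_n)."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        # Solve J(x) delta = -f(x) rather than forming the inverse explicitly.
        delta = np.linalg.solve(J(x), -f(x))
        x = x + delta
        if np.linalg.norm(delta) < tol:
            break
    return x

# Hypothetical example system: x1^2 + x2^2 = 4 and x1*x2 = 1
f = lambda x: np.array([x[0]**2 + x[1]**2 - 4.0, x[0]*x[1] - 1.0])
J = lambda x: np.array([[2*x[0], 2*x[1]],
                        [  x[1],   x[0]]])

root = newton_raphson(f, J, x0=[2.0, 0.5])
```

In practice the linear system $\mathbf{J}\boldsymbol{\delta}=-\mathbf{f}$ is solved directly at each step, which is cheaper and numerically more stable than computing $\mathbf{J}^{-1}$.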

Newton's method can be further generalized to solve an over-constrained non-linear equation system with $N$ unknowns but $M>N$ equations. In this case the $M\times N$ Jacobian matrix is not square, so its inverse does not exist, but the pseudo-inverse $\mathbf{J}^+=(\mathbf{J}^T\mathbf{J})^{-1}\mathbf{J}^T$ can be used in the iteration:

$$\mathbf{x}_{n+1}=\mathbf{x}_n-\mathbf{J}^+(\mathbf{x}_n)\,\mathbf{f}(\mathbf{x}_n)$$

or

$$\mathbf{x}_{n+1}=\mathbf{x}_n-\left(\mathbf{J}^T(\mathbf{x}_n)\,\mathbf{J}(\mathbf{x}_n)\right)^{-1}\mathbf{J}^T(\mathbf{x}_n)\,\mathbf{f}(\mathbf{x}_n)$$

The iteration attempts to find a solution in the nonlinear least-squares sense, minimizing the sum of squared residuals $\sum_i f_i^2(\mathbf{x})$. This is essentially the Gauss-Newton algorithm to be considered later.
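As a sketch of this pseudo-inverse iteration (assuming NumPy; the anchors, distances, and function names below are hypothetical illustrations, not from the original notes), consider locating a point from its distances to three known anchors, i.e. $M=3$ equations in $N=2$ unknowns:

```python
import numpy as np

def gauss_newton(f, J, x0, tol=1e-10, max_iter=100):
    """Least-squares solution of an over-determined system f(x) = 0
    (M equations, N < M unknowns):  x_{n+1} = x_n - pinv(J(x_n)) f(x_n)."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        # pinv(J) equals (J^T J)^{-1} J^T when J has full column rank.
        delta = -np.linalg.pinv(J(x)) @ f(x)
        x = x + delta
        if np.linalg.norm(delta) < tol:
            break
    return x

# Hypothetical example: locate a point from measured distances to three
# fixed anchors -- M = 3 equations in N = 2 unknowns.
anchors = np.array([[0.0, 0.0], [4.0, 0.0], [0.0, 3.0]])
dists = np.array([2.24, 3.61, 1.41])    # roughly the distances from (1, 2)

f = lambda x: np.linalg.norm(x - anchors, axis=1) - dists
J = lambda x: (x - anchors) / np.linalg.norm(x - anchors, axis=1)[:, None]

p = gauss_newton(f, J, x0=[1.0, 1.0])
```

Since the three distance measurements are slightly inconsistent, no exact root exists; the iteration settles on the point minimizing the squared residuals.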

The Newton-Raphson method assumes that analytical expressions for all partial derivatives $\partial f_i/\partial x_j$ are available, so that the Jacobian matrix $\mathbf{J}$ can be computed. When this is not the case, it is still possible to estimate the $ij$th component of $\mathbf{J}$ from $\mathbf{x}_n$ and $\mathbf{x}_{n-1}$ in two consecutive iterations:

$$J_{ij}\approx\frac{f_i(\mathbf{x}_n)-f_i(\mathbf{x}_{n-1})}{x_{n,j}-x_{n-1,j}}$$

This can be considered as the secant method generalized from 1-D to N-D, based on two initial guesses.
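For comparison, a closely related and widely used alternative is to approximate each column of $\mathbf{J}$ by a forward difference with a small step $h$, rather than from the two previous iterates. A minimal sketch (assuming NumPy; `jacobian_fd` and the example system are hypothetical, not from the original notes):

```python
import numpy as np

def jacobian_fd(f, x, h=1e-6):
    """Forward-difference estimate of the Jacobian, J_ij ~ df_i/dx_j,
    for use when analytical partial derivatives are unavailable."""
    x = np.asarray(x, dtype=float)
    fx = np.asarray(f(x))
    J = np.empty((fx.size, x.size))
    for j in range(x.size):
        xp = x.copy()
        xp[j] += h              # perturb the j-th coordinate only
        J[:, j] = (np.asarray(f(xp)) - fx) / h
    return J

# Hypothetical example system: x1^2 + x2^2 - 4 and x1*x2 - 1
f = lambda x: np.array([x[0]**2 + x[1]**2 - 4.0, x[0]*x[1] - 1.0])
J_est = jacobian_fd(f, np.array([2.0, 0.5]))
J_true = np.array([[4.0, 1.0],          # analytical Jacobian at (2, 0.5)
                   [0.5, 2.0]])
```

The forward-difference version costs one extra function evaluation per unknown, whereas the two-iterate secant estimate above reuses values already computed during the iteration.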

Example 1

Example 2

With one initial guess, the iteration converges to a root; with a different initial guess, it converges to another root.

### Broyden's method

Recall that the secant method approximates the derivative $d_n\approx f'(x_n)$ based on $f(x_n)$ and $f(x_{n-1})$:

$$d_n=\frac{f(x_n)-f(x_{n-1})}{x_n-x_{n-1}}
=d_{n-1}+\frac{f(x_n)-f(x_{n-1})-d_{n-1}(x_n-x_{n-1})}{x_n-x_{n-1}}$$

Here $f(x_{n-1})+d_{n-1}(x_n-x_{n-1})$ is the estimated value of $f(x_n)$ based on the old derivative $d_{n-1}$, which is different from the more accurate estimate based on $d_n$. The difference between these two estimates, when divided by $x_n-x_{n-1}$, approximates the difference between $d_n$ and $d_{n-1}$, i.e., the update formula of $d_n$ given above.

The updated $d_n$ can then be used for updating $x_n$:

$$x_{n+1}=x_n-\frac{f(x_n)}{d_n}$$
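This 1-D scheme can be sketched as follows (a minimal illustration, not from the original notes; the test function is hypothetical):

```python
def secant(f, x0, x1, tol=1e-12, max_iter=100):
    """1-D secant method: estimate the derivative d_n from the two most
    recent points, then update x via x_{n+1} = x_n - f(x_n)/d_n."""
    f0, f1 = f(x0), f(x1)
    for _ in range(max_iter):
        if f1 == f0:                    # degenerate derivative estimate; stop
            break
        d = (f1 - f0) / (x1 - x0)       # d_n = (f(x_n)-f(x_{n-1}))/(x_n-x_{n-1})
        x0, f0 = x1, f1
        x1 = x1 - f1 / d                # x_{n+1} = x_n - f(x_n)/d_n
        f1 = f(x1)
        if abs(x1 - x0) < tol:
            break
    return x1

# Hypothetical example: a root of x^2 - 2, i.e. sqrt(2)
root = secant(lambda x: x**2 - 2.0, 1.0, 2.0)
```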

Broyden's method, one of the quasi-Newton methods, can be considered as a generalization of this secant method for solving an N-D system $\mathbf{f}(\mathbf{x})=\mathbf{0}$. Instead of assuming the availability of the true Jacobian matrix $\mathbf{J}_n$, here we estimate the next Jacobian by an iteration based on the current one. Let

$$\boldsymbol{\delta}_n=\mathbf{x}_n-\mathbf{x}_{n-1},\qquad
\Delta\mathbf{f}_n=\mathbf{f}(\mathbf{x}_n)-\mathbf{f}(\mathbf{x}_{n-1})$$

then

$$\mathbf{J}_n=\mathbf{J}_{n-1}
+\frac{\left(\Delta\mathbf{f}_n-\mathbf{J}_{n-1}\boldsymbol{\delta}_n\right)\boldsymbol{\delta}_n^T}{\boldsymbol{\delta}_n^T\boldsymbol{\delta}_n}$$

With $\mathbf{J}_n$ available, we have the iteration for the solution update as before:

$$\mathbf{x}_{n+1}=\mathbf{x}_n-\mathbf{J}_n^{-1}\,\mathbf{f}(\mathbf{x}_n)$$

Here the inverse of $\mathbf{J}_n$ needs to be computed in each iteration. A better method is to estimate $\mathbf{J}_n^{-1}$ directly. Specifically, we apply the Sherman-Morrison formula

$$\left(\mathbf{A}+\mathbf{u}\mathbf{v}^T\right)^{-1}
=\mathbf{A}^{-1}-\frac{\mathbf{A}^{-1}\mathbf{u}\mathbf{v}^T\mathbf{A}^{-1}}{1+\mathbf{v}^T\mathbf{A}^{-1}\mathbf{u}}$$

to the iteration above for $\mathbf{J}_n$, with

$$\mathbf{A}=\mathbf{J}_{n-1},\qquad
\mathbf{u}=\frac{\Delta\mathbf{f}_n-\mathbf{J}_{n-1}\boldsymbol{\delta}_n}{\boldsymbol{\delta}_n^T\boldsymbol{\delta}_n},\qquad
\mathbf{v}=\boldsymbol{\delta}_n$$

and get

$$\mathbf{J}_n^{-1}=\mathbf{J}_{n-1}^{-1}
+\frac{\left(\boldsymbol{\delta}_n-\mathbf{J}_{n-1}^{-1}\Delta\mathbf{f}_n\right)\boldsymbol{\delta}_n^T\,\mathbf{J}_{n-1}^{-1}}{\boldsymbol{\delta}_n^T\,\mathbf{J}_{n-1}^{-1}\Delta\mathbf{f}_n}$$
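Putting the pieces together, here is a minimal Python/NumPy sketch of Broyden's method with the inverse-Jacobian update (the example system and the initial Jacobian are hypothetical illustrations, not from the original notes):

```python
import numpy as np

def broyden(f, x0, J0_inv, tol=1e-10, max_iter=200):
    """Broyden's method: maintain an estimate B of the inverse Jacobian,
    updated by the Sherman-Morrison rank-1 formula, so that no matrix
    is inverted inside the loop."""
    x = np.asarray(x0, dtype=float)
    B = np.asarray(J0_inv, dtype=float)
    fx = f(x)
    for _ in range(max_iter):
        delta = -B @ fx                 # x_{n+1} = x_n - B f(x_n)
        x = x + delta
        f_new = f(x)
        if np.linalg.norm(delta) < tol:
            break
        df = f_new - fx                 # Delta f_n
        Bdf = B @ df
        # B <- B + (delta - B df) delta^T B / (delta^T B df)
        B = B + np.outer(delta - Bdf, delta @ B) / (delta @ Bdf)
        fx = f_new
    return x

# Hypothetical example system: x1^2 + x2^2 = 4 and x1*x2 = 1
f = lambda x: np.array([x[0]**2 + x[1]**2 - 4.0, x[0]*x[1] - 1.0])
J0 = np.array([[4.0, 1.0],              # analytical Jacobian at x0 = (2, 0.5)
               [0.5, 2.0]])
root = broyden(f, x0=[2.0, 0.5], J0_inv=np.linalg.inv(J0))
```

One matrix inversion is performed once for the starting Jacobian; thereafter each step costs only matrix-vector products and a rank-1 update.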

Ruye Wang 2015-02-12