## Newton-Raphson Method (Multivariate)

The Newton-Raphson method discussed above for solving a single-variable equation can be generalized to the case of a multivariate equation system containing $M$ equations of $N$ variables in $\mathbf{x} = [x_1, \dots, x_N]^T$:

$$f_m(x_1, \dots, x_N) = 0, \qquad m = 1, \dots, M \tag{95}$$

or in vector form, $\mathbf{f}(\mathbf{x}) = \mathbf{0}$, where $\mathbf{f} = [f_1, \dots, f_M]^T$.

To solve the equation system, we first consider the Taylor series expansion of each of the $M$ functions in the neighborhood of the initial point $\mathbf{x}_0 = [x_{01}, \dots, x_{0N}]^T$:

$$f_m(\mathbf{x}) = f_m(\mathbf{x}_0) + \sum_{n=1}^N \frac{\partial f_m(\mathbf{x}_0)}{\partial x_n}\,(x_n - x_{0n}) + O\!\left(\|\mathbf{x} - \mathbf{x}_0\|^2\right), \qquad m = 1, \dots, M \tag{96}$$

where $O(\|\mathbf{x} - \mathbf{x}_0\|^2)$ represents the second and higher order terms in the series beyond the linear term, which can be neglected if $\|\mathbf{x} - \mathbf{x}_0\|$ is small. These $M$ equations can be expressed in matrix form

$$\mathbf{f}(\mathbf{x}) \approx \mathbf{f}(\mathbf{x}_0) + \mathbf{J}(\mathbf{x}_0)\,(\mathbf{x} - \mathbf{x}_0) = \mathbf{f}(\mathbf{x}_0) + \mathbf{J}(\mathbf{x}_0)\,\delta\mathbf{x} \tag{97}$$

where $\delta\mathbf{x} = \mathbf{x} - \mathbf{x}_0$, while $\mathbf{f}(\mathbf{x}_0)$ and $\mathbf{J}(\mathbf{x}_0)$, with components $J_{mn} = \partial f_m/\partial x_n$, are the function and its $M \times N$ Jacobian matrix, both evaluated at $\mathbf{x}_0$. We further consider solving the equation system in the following two cases:

• $M = N$: The number of equations is the same as the number of unknowns, so the Jacobian $\mathbf{J}$ is a square matrix and its inverse exists in general. In the special case where $\mathbf{f}(\mathbf{x})$ is linear, the Taylor series contains only the first two terms while all higher order terms are zero, and the approximation in Eq. (97) becomes exact. To find the root satisfying $\mathbf{f}(\mathbf{x}) = \mathbf{0}$, we set the right-hand side of Eq. (97) to zero and solve the resulting equation for $\mathbf{x}$ to get

$$\mathbf{x} = \mathbf{x}_0 - \mathbf{J}^{-1}(\mathbf{x}_0)\,\mathbf{f}(\mathbf{x}_0) \tag{98}$$

As in general $\mathbf{f}(\mathbf{x})$ is nonlinear, it can only be approximated by the first two terms of the Taylor series; consequently the result above is only an approximation of the optimal solution. But this approximation can be improved iteratively to approach the optimal solution $\mathbf{x}^*$:

$$\mathbf{x}_{n+1} = \mathbf{x}_n - \mathbf{J}^{-1}(\mathbf{x}_n)\,\mathbf{f}(\mathbf{x}_n) = \mathbf{x}_n + \delta\mathbf{x}_n \tag{99}$$

where we have defined $\delta\mathbf{x}_n = -\mathbf{J}^{-1}(\mathbf{x}_n)\,\mathbf{f}(\mathbf{x}_n)$ as the increment, which can also be denoted by $\mathbf{d}_n$ to represent the search or Newton direction. The iteration moves through the N-D space of $\mathbf{x} = [x_1, \dots, x_N]^T$ from some initial guess $\mathbf{x}_0$ along such a path that all function values $f_m(\mathbf{x})$ are reduced toward zero. As in the univariate case, a scaling factor $0 < \alpha < 1$ can be used to control the step size of the iteration

$$\mathbf{x}_{n+1} = \mathbf{x}_n - \alpha\,\mathbf{J}^{-1}(\mathbf{x}_n)\,\mathbf{f}(\mathbf{x}_n) \tag{100}$$

When $\alpha < 1$, the step size becomes smaller and the convergence of the iteration is slower; however, we have a better chance of not skipping over a solution, which may happen if $\mathbf{f}(\mathbf{x})$ is not smooth and the step size is too big.

The algorithm is listed below:

• Select initial guess $\mathbf{x}_0$ and tolerance $\varepsilon$; set $n = 0$
• Obtain $\mathbf{f}(\mathbf{x}_n)$ and $\mathbf{J}(\mathbf{x}_n)$
• Obtain $\delta\mathbf{x}_n = -\mathbf{J}^{-1}(\mathbf{x}_n)\,\mathbf{f}(\mathbf{x}_n)$
• While $\|\delta\mathbf{x}_n\| > \varepsilon$ do
  • $\mathbf{x}_{n+1} = \mathbf{x}_n + \delta\mathbf{x}_n$; set $n = n + 1$
  • Find $\mathbf{f}(\mathbf{x}_n)$ and $\mathbf{J}(\mathbf{x}_n)$, and obtain $\delta\mathbf{x}_n = -\mathbf{J}^{-1}(\mathbf{x}_n)\,\mathbf{f}(\mathbf{x}_n)$
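
As a concrete illustration, the following is a minimal NumPy sketch of this algorithm (the function names, default tolerance, and iteration cap are illustrative assumptions, not from the original text). Rather than forming $\mathbf{J}^{-1}$ explicitly, it solves the linear system $\mathbf{J}(\mathbf{x}_n)\,\delta\mathbf{x}_n = -\mathbf{f}(\mathbf{x}_n)$ in each step:

```python
import numpy as np

def newton_raphson(f, J, x0, alpha=1.0, tol=1e-10, max_iter=100):
    """Multivariate Newton-Raphson for f(x) = 0 with M = N, Eqs. (99)-(100).

    f     : function returning the length-N vector f(x)
    J     : function returning the N x N Jacobian matrix J(x)
    alpha : optional step-size scaling factor (0 < alpha <= 1)
    """
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        # Newton direction: solve J(x) dx = -f(x) instead of inverting J
        dx = np.linalg.solve(J(x), -f(x))
        x = x + alpha * dx
        if np.linalg.norm(dx) < tol:
            break
    return x
```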

• $M > N$: There are more equations than unknowns, i.e., the equation system $\mathbf{f}(\mathbf{x}) = \mathbf{0}$ is over-constrained, and the Jacobian $\mathbf{J}$ is an $M \times N$ non-square matrix without an inverse, i.e., no solution exists for the equation system in general. But we can still seek to find an optimal solution $\mathbf{x}^*$ that minimizes the following sum-of-squares error:

$$\varepsilon(\mathbf{x}) = \frac{1}{2}\,\|\mathbf{f}(\mathbf{x})\|^2 = \frac{1}{2}\sum_{m=1}^M f_m^2(\mathbf{x}) \tag{101}$$

which is minimized where its gradient is zero:

$$\mathbf{g}(\mathbf{x}) = \nabla\varepsilon(\mathbf{x}) = \mathbf{0} \tag{102}$$

The nth component of $\mathbf{g}(\mathbf{x})$ is

$$g_n(\mathbf{x}) = \frac{\partial\varepsilon(\mathbf{x})}{\partial x_n} = \sum_{m=1}^M f_m(\mathbf{x})\,\frac{\partial f_m(\mathbf{x})}{\partial x_n} = \sum_{m=1}^M J_{mn}\,f_m(\mathbf{x}) \tag{103}$$

where $J_{mn} = \partial f_m(\mathbf{x})/\partial x_n$ is the component in the mth row and nth column of the Jacobian matrix $\mathbf{J}$ of $\mathbf{f}(\mathbf{x})$. Now the gradient can be written as

$$\mathbf{g}(\mathbf{x}) = \nabla\varepsilon(\mathbf{x}) = \mathbf{J}^T\,\mathbf{f}(\mathbf{x}) \tag{104}$$

If $\mathbf{f}(\mathbf{x})$ is linear and can therefore be represented exactly as the sum of the first two terms of its Taylor series in Eq. (97), then the gradient is:

$$\mathbf{g}(\mathbf{x}) = \mathbf{J}^T\,\mathbf{f}(\mathbf{x}) = \mathbf{J}^T\left[\mathbf{f}(\mathbf{x}_0) + \mathbf{J}\,\delta\mathbf{x}\right] = \mathbf{J}^T\,\mathbf{f}(\mathbf{x}_0) + \mathbf{J}^T\mathbf{J}\,\delta\mathbf{x} \tag{105}$$

where $\mathbf{x}_0$ is any chosen initial guess. If we assume $\mathbf{x} = \mathbf{x}_0 + \delta\mathbf{x}$ is the optimal solution at which $\varepsilon(\mathbf{x})$ is minimized and the gradient $\mathbf{g}(\mathbf{x})$ is zero:

$$\mathbf{g}(\mathbf{x}) = \mathbf{J}^T\,\mathbf{f}(\mathbf{x}_0) + \mathbf{J}^T\mathbf{J}\,\delta\mathbf{x} = \mathbf{0} \tag{106}$$

then the increment $\delta\mathbf{x}$ can be found by solving the equation

$$\delta\mathbf{x} = -\left(\mathbf{J}^T\mathbf{J}\right)^{-1}\mathbf{J}^T\,\mathbf{f}(\mathbf{x}_0) = -\mathbf{J}^-\,\mathbf{f}(\mathbf{x}_0) \tag{107}$$

Here $\mathbf{J}^- = (\mathbf{J}^T\mathbf{J})^{-1}\mathbf{J}^T$ is the pseudo-inverse of the non-square matrix $\mathbf{J}$. Now the optimal solution can be found as:

$$\mathbf{x} = \mathbf{x}_0 + \delta\mathbf{x} = \mathbf{x}_0 - \mathbf{J}^-\,\mathbf{f}(\mathbf{x}_0) \tag{108}$$

However, as $\mathbf{f}(\mathbf{x})$ is nonlinear in general, the sum of the first two terms of its Taylor series is only an approximation. Consequently the result above is not the optimal solution, but an estimate which can be improved by carrying out this step iteratively:

$$\mathbf{x}_{n+1} = \mathbf{x}_n - \mathbf{J}^-(\mathbf{x}_n)\,\mathbf{f}(\mathbf{x}_n) = \mathbf{x}_n - \left[\mathbf{J}^T(\mathbf{x}_n)\,\mathbf{J}(\mathbf{x}_n)\right]^{-1}\mathbf{J}^T(\mathbf{x}_n)\,\mathbf{f}(\mathbf{x}_n) \tag{109}$$

This iteration will converge to the optimal solution $\mathbf{x}^*$ at which $\mathbf{g}(\mathbf{x}^*) = \nabla\varepsilon(\mathbf{x}^*) = \mathbf{0}$, and the squared error $\varepsilon(\mathbf{x}^*)$ is minimized.
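
The iteration in Eq. (109) is the Gauss-Newton iteration for nonlinear least squares. Below is a minimal sketch under the same assumptions as the Newton-Raphson sketch above, using a least-squares solve in place of an explicit pseudo-inverse:

```python
import numpy as np

def gauss_newton(f, J, x0, tol=1e-10, max_iter=100):
    """Minimize ||f(x)||^2 for an over-determined system (M > N), Eq. (109).

    f : function returning the length-M residual vector f(x)
    J : function returning the M x N Jacobian matrix J(x)
    """
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        # Least-squares solve of J dx = -f, equivalent to dx = -pinv(J) @ f
        dx, *_ = np.linalg.lstsq(J(x), -f(x), rcond=None)
        x = x + dx
        if np.linalg.norm(dx) < tol:
            break
    return x
```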

In particular, for a linear equation system $\mathbf{f}(\mathbf{x}) = \mathbf{A}\mathbf{x} - \mathbf{b} = \mathbf{0}$, the Jacobian is simply $\mathbf{J} = \mathbf{A}$, and the optimal solution is

$$\mathbf{x}^* = \mathbf{x}_0 - \mathbf{A}^-\,(\mathbf{A}\mathbf{x}_0 - \mathbf{b}) = \mathbf{x}_0 - \mathbf{A}^-\mathbf{A}\,\mathbf{x}_0 + \mathbf{A}^-\mathbf{b} = \mathbf{A}^-\mathbf{b} = \left(\mathbf{A}^T\mathbf{A}\right)^{-1}\mathbf{A}^T\,\mathbf{b} \tag{110}$$

i.e., the optimal solution can be found from any initial guess $\mathbf{x}_0$ in a single step, since $\mathbf{A}^-\mathbf{A} = \mathbf{I}$. This result is the same as that in Eq. (7).
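
As a quick check of Eq. (110), consider a hypothetical over-determined linear system (the numbers are chosen only for illustration):

```python
import numpy as np

# Hypothetical system Ax = b with M = 3 equations and N = 2 unknowns
A = np.array([[1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])
b = np.array([1.0, 2.0, 2.0])

# One-step least-squares solution of Eq. (110): x* = pinv(A) @ b
x_star = np.linalg.pinv(A) @ b
print(x_star)  # [0.6667 0.5], same as np.linalg.lstsq(A, b, rcond=None)[0]
```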

Comparing Eqs. (99) and (109), we see that the two algorithms are essentially the same, with the only difference that the regular inverse $\mathbf{J}^{-1}$ is used when $M = N$, but the pseudo-inverse $\mathbf{J}^-$ is used when $M > N$ and $\mathbf{J}^{-1}$ does not exist.

The Newton-Raphson method assumes the availability of the analytical expressions of all partial derivatives $\partial f_m/\partial x_n$ in the Jacobian matrix $\mathbf{J}$. When this is not the case, they need to be approximated by the forward or central difference (secant) method:

$$J_{mn} = \frac{\partial f_m(\mathbf{x})}{\partial x_n} \approx \frac{f_m(\mathbf{x} + \delta\,\mathbf{e}_n) - f_m(\mathbf{x})}{\delta} \tag{111}$$

where $\delta$ is a small increment and $\mathbf{e}_n$ is the nth standard basis vector.
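
A minimal sketch of this forward-difference approximation of the Jacobian (the default increment $\delta = 10^{-7}$ is an arbitrary choice):

```python
import numpy as np

def jacobian_fd(f, x, delta=1e-7):
    """Forward-difference approximation of the M x N Jacobian, Eq. (111)."""
    x = np.asarray(x, dtype=float)
    f0 = f(x)
    J = np.empty((f0.size, x.size))
    for n in range(x.size):
        e = np.zeros_like(x)
        e[n] = delta                   # perturb only the nth variable
        J[:, n] = (f(x + e) - f0) / delta
    return J
```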

Example

Depending on the initial guess $\mathbf{x}_0$, the iteration converges to different roots of the same system: with one initial guess we get a root, while with another initial guess we get another root.
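
For instance, applying the `newton_raphson` sketch above to a hypothetical system (not the one used in the original examples), $f_1 = x_1^2 + x_2^2 - 4 = 0$ and $f_2 = x_1 x_2 - 1 = 0$, shows this dependence on the initial guess:

```python
import numpy as np

# Hypothetical 2-D system: f1 = x1^2 + x2^2 - 4 = 0, f2 = x1*x2 - 1 = 0
f = lambda x: np.array([x[0]**2 + x[1]**2 - 4.0, x[0] * x[1] - 1.0])
J = lambda x: np.array([[2 * x[0], 2 * x[1]],
                        [x[1], x[0]]])

print(newton_raphson(f, J, [2.0, 0.1]))  # one root, near (1.9319, 0.5176)
print(newton_raphson(f, J, [0.1, 2.0]))  # another root, near (0.5176, 1.9319)
```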

### Broyden's method

In the Newton-Raphson method, two main operations are carried out in each iteration: (a) evaluating the Jacobian matrix $\mathbf{J}_n = \mathbf{J}(\mathbf{x}_n)$ and (b) obtaining its inverse $\mathbf{J}_n^{-1}$. To avoid the expensive computation required for these operations, we can consider using Broyden's method, one of the quasi-Newton methods, which approximates the inverse of the Jacobian $\mathbf{J}_{n+1}^{-1}$ from the $\mathbf{J}_n^{-1}$ in the previous iteration step, so that it can be updated iteratively from the initial $\mathbf{J}_0^{-1} = \mathbf{J}^{-1}(\mathbf{x}_0)$.

We first consider, in the single-variable case, how to estimate the next derivative $f'_{n+1}$ from the current $f'_n$ by the secant method:

$$f'_{n+1} = \frac{f(x_{n+1}) - f(x_n)}{x_{n+1} - x_n} = \frac{\Delta f_n}{\Delta x_n} = \frac{f'_n\,\Delta x_n + (\Delta f_n - f'_n\,\Delta x_n)}{\Delta x_n} = f'_n + \Delta f'_n \tag{112}$$

where
• $\Delta f_n = f(x_{n+1}) - f(x_n)$ is the true increment of the function over the interval $\Delta x_n = x_{n+1} - x_n$;
• $f'_n\,\Delta x_n$ is the estimated increment of the function based on the previous derivative $f'_n$;
• $\Delta f'_n$ is the estimated increment of the derivative:

$$\Delta f'_n = \frac{\Delta f_n - f'_n\,\Delta x_n}{\Delta x_n} \tag{113}$$

The equation above indicates that the derivative $f'_{n+1}$ in the $(n+1)$th step can be estimated by adding the estimated increment $\Delta f'_n$ to the derivative $f'_n$ in the current step.

Having obtained $f'_{n+1}$, we can use the same iteration as in the Newton-Raphson method to find $x_{n+2}$:

$$x_{n+2} = x_{n+1} - \frac{f(x_{n+1})}{f'_{n+1}} \tag{114}$$
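
A minimal sketch of this single-variable iteration (seeding the derivative estimate from two starting points is an assumption; in one dimension the update reduces to the secant method):

```python
def broyden_1d(f, x0, x1, tol=1e-12, max_iter=100):
    """Single-variable quasi-Newton root finding, Eqs. (112)-(114)."""
    fp = (f(x1) - f(x0)) / (x1 - x0)   # initial secant estimate of f'
    x_prev, x = x0, x1
    for _ in range(max_iter):
        dx, df = x - x_prev, f(x) - f(x_prev)
        fp = fp + (df - fp * dx) / dx  # Eq. (113): update derivative estimate
        x_prev, x = x, x - f(x) / fp   # Eq. (114): quasi-Newton step
        if abs(x - x_prev) < tol:
            break
    return x
```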

This method for the single-variable case can be generalized to the multivariate case for solving $\mathbf{f}(\mathbf{x}) = \mathbf{0}$. Following the way we estimated the increment of the derivative of a single-variable function in Eq. (113), here we can estimate the increment of the Jacobian of a multivariate function:

$$\Delta\mathbf{J}_n = \frac{(\Delta\mathbf{f}_n - \mathbf{J}_n\,\Delta\mathbf{x}_n)\,\Delta\mathbf{x}_n^T}{\Delta\mathbf{x}_n^T\,\Delta\mathbf{x}_n} \tag{115}$$

where $\Delta\mathbf{x}_n = \mathbf{x}_{n+1} - \mathbf{x}_n$ and $\Delta\mathbf{f}_n = \mathbf{f}(\mathbf{x}_{n+1}) - \mathbf{f}(\mathbf{x}_n)$. Now in each iteration, we can update the estimated Jacobian as well as the estimated root:

$$\mathbf{J}_{n+1} = \mathbf{J}_n + \Delta\mathbf{J}_n = \mathbf{J}_n + \frac{(\Delta\mathbf{f}_n - \mathbf{J}_n\,\Delta\mathbf{x}_n)\,\Delta\mathbf{x}_n^T}{\Delta\mathbf{x}_n^T\,\Delta\mathbf{x}_n}, \qquad \mathbf{x}_{n+2} = \mathbf{x}_{n+1} - \mathbf{J}_{n+1}^{-1}\,\mathbf{f}(\mathbf{x}_{n+1}) \tag{116}$$

The algorithm can be further improved so that the inversion of the Jacobian is avoided. Specifically, consider the inverse of the updated Jacobian:

$$\mathbf{J}_{n+1}^{-1} = \left[\mathbf{J}_n + \frac{(\Delta\mathbf{f}_n - \mathbf{J}_n\,\Delta\mathbf{x}_n)\,\Delta\mathbf{x}_n^T}{\Delta\mathbf{x}_n^T\,\Delta\mathbf{x}_n}\right]^{-1} \tag{117}$$

We can apply the Sherman-Morrison formula:

$$\left(\mathbf{A} + \mathbf{u}\mathbf{v}^T\right)^{-1} = \mathbf{A}^{-1} - \frac{\mathbf{A}^{-1}\mathbf{u}\,\mathbf{v}^T\mathbf{A}^{-1}}{1 + \mathbf{v}^T\mathbf{A}^{-1}\mathbf{u}} \tag{118}$$

to the right-hand side of the equation above by defining

$$\mathbf{A} = \mathbf{J}_n, \qquad \mathbf{u} = \frac{\Delta\mathbf{f}_n - \mathbf{J}_n\,\Delta\mathbf{x}_n}{\Delta\mathbf{x}_n^T\,\Delta\mathbf{x}_n}, \qquad \mathbf{v} = \Delta\mathbf{x}_n \tag{119}$$

and rewrite it as:

$$\mathbf{J}_{n+1}^{-1} = \mathbf{J}_n^{-1} + \frac{\left(\Delta\mathbf{x}_n - \mathbf{J}_n^{-1}\,\Delta\mathbf{f}_n\right)\Delta\mathbf{x}_n^T\,\mathbf{J}_n^{-1}}{\Delta\mathbf{x}_n^T\,\mathbf{J}_n^{-1}\,\Delta\mathbf{f}_n} \tag{120}$$

We see that the next $\mathbf{J}_{n+1}^{-1}$ can be iteratively estimated directly from the previous $\mathbf{J}_n^{-1}$, thereby avoiding computing the inverse of $\mathbf{J}_{n+1}$ altogether. The algorithm is listed below:

• Select $\mathbf{x}_0$ and tolerance $\varepsilon$
• Find $\mathbf{f}(\mathbf{x}_0)$, $\mathbf{J}_0^{-1} = \mathbf{J}^{-1}(\mathbf{x}_0)$, and $\mathbf{x}_1 = \mathbf{x}_0 - \mathbf{J}_0^{-1}\,\mathbf{f}(\mathbf{x}_0)$; set $n = 0$
• While $\|\mathbf{x}_{n+1} - \mathbf{x}_n\| > \varepsilon$ do
  • Update $\mathbf{J}_{n+1}^{-1}$ by Eq. (120) and find $\mathbf{x}_{n+2} = \mathbf{x}_{n+1} - \mathbf{J}_{n+1}^{-1}\,\mathbf{f}(\mathbf{x}_{n+1})$; set $n = n + 1$
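
A minimal sketch of this algorithm (seeding with the true inverse Jacobian at $\mathbf{x}_0$ is one choice; the identity matrix is a common alternative):

```python
import numpy as np

def broyden(f, J0_inv, x0, tol=1e-10, max_iter=100):
    """Broyden's method with the inverse-Jacobian update of Eq. (120).

    f      : function returning the length-N vector f(x)
    J0_inv : initial estimate of the inverse Jacobian, e.g. np.linalg.inv(J(x0))
    """
    x = np.asarray(x0, dtype=float)
    J_inv = np.asarray(J0_inv, dtype=float)
    fx = f(x)
    for _ in range(max_iter):
        dx = -J_inv @ fx               # quasi-Newton step, no new Jacobian needed
        x = x + dx
        fx_new = f(x)
        df = fx_new - fx
        fx = fx_new
        if np.linalg.norm(dx) < tol:
            break
        # Sherman-Morrison update of the inverse Jacobian, Eq. (120)
        J_inv_df = J_inv @ df
        J_inv += np.outer(dx - J_inv_df, dx @ J_inv) / (dx @ J_inv_df)
    return x
```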