## Cubic Spline Interpolation

All previously discussed methods of polynomial interpolation fit a set of given points by an nth degree polynomial, and a higher degree polynomial is needed to fit a larger set of data points. A major drawback of such methods is overfitting, as domonstrated by the following example.

Example:

Vased on equally spaced points from to with increment of 1, a function can be approximated by any of the interpolation methods discussed above by polynomial of degree , as shown in the figure below. We note that the approximation is very poor towards to the two ends where the error is disappointingly high. This is known as Runge's phenomenon, indicating the fact that higher degree polynomial interpolation does not necessarily always produce more accurate result, as the degree of the interpolating polynomial may become unnecessarily high and the polynomial may become oscillatory.

This Runge's phenominon is a typical example of overfitting, due to an excessively complex model with too many parameters relative to the observed data, here specifically a polynomial of a degree too high (requiring too many coefficients) to model the given data points.

Now we consider a different method of spline interpolation, which fits the given points by a piecewise polynomial function , known as the spline, a composite function formed by low-degree polynomials each fitting in the interval between and :

 (61)

As this method does not use a single polynomial of degree to fit all points at once, it avoids high degree polynomials and thereby the potential problem of overfitting. These low-degree polynomials need to be such that the spline they form is not only continuous but also smooth.
• For to be continuous, two consecutive polynomials and must join at :

 (62)

Or, equivalently, must pass the two end-points, i.e.,

 (63)

• For to be smooth, they need to have the same derivatives at the point they joint, i.e.,

 (64)

must hold for some order . The higher the order is, the more smooth the spline becomes.

In the following we consider approximating between any two consecutive points and by a linear, quadratic, and cubic polynomial (of first, second, and third degree).

• Linear spline: with two parameters and can only satisfy the following two equations required for to be continuous:

 (65)

or, equivalently, has to pass the two end points and :

 (66)

Solving either of two problems above, we get:
 (67)

which is represented in the form of by the first expression, or a linear interpolation of the two end points and in the second expression.

As , the linear spline is continuous at . But as in general

 (68)

is not smooth, i.e., .

• Quadratic spline: with three parameters and can satisfy the following three equations required for to be smooth () as well as continuous:

 (69)

To obtain the three parameters , and in , we consider , which, as a linear function, can be linearly fit by the two end points and :

 (70)

Integrating, we get

 (71)

As , we have

 (72)

Solving this for

 (73)

and substituting it back into the expression of , we get

 (74)

Also, as , we have

 (75)

or

 (76)

Given , we can get iteratively all subsequent and thereby . Alternatively, given , we can also get iteratively all previous . It is obvious that with only three free parameters, the quadratic polynomials cannot satisfy both boundary conditions and .

• Cubic spline: with four parameters , and can satisfy the following four equations required for to be continuous and smooth ():

 and (77)

To obtain the four parameters , , and in , we first consdier , which, as a linear function, can be linearly fit by the two end points and :

 (78)

Integrating twice we get

 (79)

As and , we have:

 (80)

Solving these two equations we get the two coefficients and :

 (81)

 (82)

Substituting them back into and rearranging the terms we get
 (83)

To find , we take derivative of and rearrange terms to get
 (84)

which, when evaluated at and , becomes:
 (85)

Replacing by in the second equation, we also get

 (86)

To satisfy , we equate the above to the first equation to get:

 (87)

Multiplying both sides by and rearranging, we get:

 (88)

Note that here

 (89)

is simply the second divided differences. We can rewrite the equation above as

 (90)

where

 (91)

Here we have equations but unknowns . To obtain these unknowns, we need to get two additional equations based on certain assumed boundary conditions.
• Assume the first order derivatives at both ends and are known. Specially is called clamped boundary condition. At the front end, we set

 (92)

Multiplying we get

 (93)

Similarly, at the back end, we also set

 (94)

Multiplying we get

 (95)

We can combine Eqs [?], [?] and [?] into a linear equation system of equations and the same number of unknowns:

 (96)

• Alternatively, we can also assume and are known. Specially is called natural boundary condition. Now we can simply get and and solve the following system for the unknowns :

 (97)

Example:

A function is sampled at the following points:

The interpolation results based on linear, quadratic and cubic splines are shown in the figure below, together with the original function , and the interpolating polynomials , used as the ith segment of between and .

For the quadratic interpolation, based on we get . For the cubic interpolation, we solve the following equation

and get .

The errors of these three methods are , , and , respectively. Obviously the higher the degree of the interpolating polynomial, the higher the accuracy. The error of the cubic spline method is significantly smaller than of the polynomial interpolation.

The Matlab code that implements the cubic spline method is listed below.

function [S C]=Spline3(u,x,y,dya,dyb)
% vectors x and y contain n+1 points and the corresponding function values
% vector u contains all discrete samples of the continuous argument of f(x)
% dya and dyb are the derivatives f'(x_0) and f'(x_n), respectively
n=length(x);       % number of interpolating points
k=length(u);       % number of discrete sample points
C=zeros(n,k);      % the n-1 cubic interpolating polynomials
A=2*eye(n);        % coefficient matrix on left-hand side
A(1,2)=1;
A(n,n-1)=1;
d=zeros(n,1);      % vector on right-hand side
d(1)=((y(2)-y(1))/(x(2)-x(1))-dya)/h0;  % first element of d
for i=2:n-1
h0=x(i)-x(i-1);
h1=x(i+1)-x(i);
h2=x(i+1)-x(i-1);
A(i,i-1)=h0/h2;
A(i,i+1)=h1/h2;
d(i)=((y(i+1)-y(i))/h1-(y(i)-y(i-1))/h0)/h2; % 2nd divided difference
end
d(n)=(dyb-(y(n)-y(n-1))/h1)/h1;   % last element of d
M=6*inv(A)*d;                     % solving linear equation system for M's
for i=2:n
h=x(i)-x(i-1);
x0=u-x(i-1);
x1=x(i)-u;
C(i-1,:)=(x1.^3*M(i-1)+x0.^3*M(i))/6/h... % the ith cubic polynomial
-(M(i-1)*x1+M(i)*x0)*h/6+(y(i-1)*x1+y(i)*x0)/h;
idx=find(u>x(i-1) & u<=x(i));  % indices between x(i-1) and x(i)
S(idx)=C(i-1,idx);             % constructing spline by cubic polynomials
end
end


Example:

The function used before is now approximated by both the Newton's method and the cubic spline method, with very different results as shown below. The Runge's phenomenon suffered by Newton's method is certainly avoided by the cubic spline method.