
Fundamental theorem of linear algebra

An $M$ by $N$ matrix ${\bf A}\in \mathbb{R}^{M\times N}$ can be expressed in terms of either its $N$ M-D column vectors ${\bf c}_j$ or its $M$ N-D row vectors ${\bf r}_i^T$:

\begin{displaymath}
{\bf A}=\left[\begin{array}{ccc}a_{11}&\cdots&a_{1N}\\
\vdots&\ddots&\vdots\\
a_{M1}&\cdots&a_{MN}\end{array}\right]
=[{\bf c}_1,\cdots,{\bf c}_N]
=\left[\begin{array}{c}{\bf r}^T_1\\ \vdots\\ {\bf r}^T_M\end{array}\right]
\end{displaymath}

where ${\bf r}_i$ and ${\bf c}_j$ are respectively the $i$-th row vector and the $j$-th column vector:

\begin{displaymath}
{\bf c}_j=\left[\begin{array}{c}a_{1j}\\ \vdots\\ a_{Mj}\end{array}\right]
\;\;\;(j=1,\cdots,N),\;\;\;\;
{\bf r}_i^T=[a_{i1},\cdots,a_{iN}]
\;\;\;(i=1,\cdots,M)
\end{displaymath}

The $M$ by $N$ matrix ${\bf A}$ in the linear transformation ${\bf f}({\bf x})={\bf A}{\bf x}={\bf b}$ maps an N-D vector ${\bf x}\in \mathbb{R}^N$ in the domain of the function to an M-D vector ${\bf b}\in \mathbb{R}^M$ in its codomain $\mathbb{R}^M$, while ${\bf g}({\bf y})={\bf A}^T{\bf y}={\bf c}$ maps an M-D vector ${\bf y}\in \mathbb{R}^M$ to an N-D vector ${\bf c}\in \mathbb{R}^N$.

[Figure DomainCodomain.png: the domain and codomain of the mappings ${\bf f}({\bf x})={\bf A}{\bf x}$ and ${\bf g}({\bf y})={\bf A}^T{\bf y}$]

The fundamental theorem of linear algebra concerns the following four fundamental subspaces associated with ${\bf A}$ (and ${\bf A}^T$):

  1. The column space $C({\bf A})\subseteq \mathbb{R}^M$, spanned by the $N$ column vectors ${\bf c}_j$ of ${\bf A}$;
  2. The row space $R({\bf A})=C({\bf A}^T)\subseteq \mathbb{R}^N$, spanned by the $M$ row vectors ${\bf r}_i$ of ${\bf A}$;
  3. The nullspace $N({\bf A})=\{{\bf x}\in\mathbb{R}^N\;\vert\;{\bf A}{\bf x}={\bf0}\}$;
  4. The left nullspace $N({\bf A}^T)=\{{\bf y}\in\mathbb{R}^M\;\vert\;{\bf A}^T{\bf y}={\bf0}\}$, i.e., the nullspace of ${\bf A}^T$.

The rank-nullity theorem states:

\begin{displaymath}\mbox{rank}({\bf A})+\mbox{nullity}({\bf A})=N \end{displaymath}
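This can be verified numerically. The following is a minimal sketch using Python's sympy library (assumed here only for illustration), applied to the $3\times 3$ matrix used in the examples below:

from sympy import Matrix

# the 3x3 matrix used in the examples later in this section
A = Matrix([[1, 2, 5],
            [2, 3, 8],
            [3, 1, 5]])

rank = A.rank()                   # dim R(A) = dim C(A)
nullity = len(A.nullspace())      # dim N(A) = number of free variables
assert rank + nullity == A.cols   # rank + nullity = N (here 2 + 1 = 3)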

Moreover, every vector in the nullspace of ${\bf A}$ is orthogonal to each of the row vectors of ${\bf A}$, i.e., the row space and the nullspace are orthogonal and complementary:

\begin{displaymath}
R({\bf A}) \cap N({\bf A})=\{{\bf0}\},\;\;\;\;R({\bf A}) \perp N({\bf A}),\;\;\;\;
R({\bf A}) \oplus N({\bf A})=\mathbb{R}^N,\;\;\;\;R({\bf A})=N({\bf A})^{\perp}
\end{displaymath}

In other words, the nullspace and the row space are orthogonal complements. For $C({\bf A})=R({\bf A}^T)$ and $N({\bf A}^T)$ we similarly have

\begin{displaymath}
C({\bf A}) \cap N({\bf A}^T)=\{{\bf0}\},\;\;\;\;C({\bf A}) \perp N({\bf A}^T),\;\;\;\;
C({\bf A}) \oplus N({\bf A}^T)=\mathbb{R}^M,\;\;\;\;C({\bf A})=N({\bf A}^T)^{\perp}
\end{displaymath}

The bases of these four subspaces can be found by converting ${\bf A}$ into the reduced row echelon form (rref) by elementary row operations (row reduction).
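In practice the rref, the pivot columns, and the subspace bases need not be computed by hand. A minimal sketch using sympy's Matrix methods (an assumed tool, applied to the matrix of the next example):

from sympy import Matrix

A = Matrix([[1, 2, 5],
            [2, 3, 8],
            [3, 1, 5]])

R, pivots = A.rref()     # rref of A and the indices of its pivot columns
print(R)                 # Matrix([[1, 0, 1], [0, 1, 2], [0, 0, 0]])
print(pivots)            # (0, 1): the first two columns of A are independent

print(A.rowspace())      # basis of the row space R(A)
print(A.columnspace())   # basis of C(A): the pivot columns of A itself
print(A.T.rowspace())    # basis of C(A) = R(A^T), read from the rref of A^T
print(A.nullspace())     # basis of N(A): the special solutions of A x = 0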

Example


\begin{displaymath}{\bf A}=\left[\begin{array}{rrr}1&2&5\\ 2&3&8\\ 3&1&5\end{array}\right] \end{displaymath}

By elementary row operations, we can convert ${\bf A}$ into rref form:

\begin{displaymath}{\bf A}=\left[\begin{array}{ccc}1&2&5\\ 2&3&8\\ 3&1&5\end{array}\right]
\Longrightarrow
\left[\begin{array}{ccc}1&0&1\\ 0&1&2\\ 0&0&0\end{array}\right] \end{displaymath}

We see that ${\bf c}_3={\bf c}_1+2{\bf c}_2$, i.e., the third column is not independent. Alternatively, with the second and third columns of ${\bf A}$ swapped,

\begin{displaymath}\left[\begin{array}{ccc}1&5&2\\ 2&8&3\\ 3&5&1\end{array}\right]
\Longrightarrow
\left[\begin{array}{ccc}1&0&-0.5\\ 0&1&0.5\\ 0&0&0\end{array}\right]
\end{displaymath}

Now we have ${\bf c}_3=-0.5{\bf c}_1+0.5{\bf c}_2=({\bf c}_2-{\bf c}_1)/2$ in terms of the reordered columns. We see that in either case, the column space is a 2-D plane spanned by any two of the three column vectors.

The $R=\mbox{rank}({\bf A})$ columns of the rref matrix that each contain a single 1, called a pivot element, are the pivot columns, and the $R$ corresponding rows, each containing a pivot, are the pivot rows, which span the R-D row space $R({\bf A})$. However, note that the columns are changed by the row reduction operations, so the pivot columns of the rref do not span the column space $C({\bf A})$ of ${\bf A}$.

The dimensions of $C({\bf A})$ and $R({\bf A})$ are both equal to the number of pivots in the rref form, which is the rank $R$ of ${\bf A}$, defined as the number of independent rows or columns in ${\bf A}$:

\begin{displaymath}\mbox{dim}\;C({\bf A})=\mbox{dim}\;R({\bf A})=\mbox{rank}({\bf A})=R \end{displaymath}

The $N-R$ non-pivot columns are not independent, as they can always be expressed as linear combinations of the pivot columns. The $R$ pivot rows correspond to the independent equations in the system ${\bf A}{\bf x}={\bf b}$, while the remaining $M-R$ non-pivot rows, containing all zeros, are not independent. They play no role in the equation system ${\bf A}{\bf x}={\bf b}$, as they map any ${\bf x}$ to ${\bf0}$.

Example

Solve the following homogeneous equation

\begin{displaymath}{\bf A}{\bf x}=\left[\begin{array}{ccc}1&2&5\\ 2&3&8\\ 3&1&5\end{array}\right]
\left[\begin{array}{c}x_1\\ x_2\\ x_3\end{array}\right]
={\bf0} \end{displaymath}

Applying the elementary row operations, we get

\begin{displaymath}
\left[\begin{array}{ccc}1&2&5\\ 2&3&8\\ 3&1&5\end{array}\right]
\left[\begin{array}{c}x_1\\ x_2\\ x_3\end{array}\right]
\Longrightarrow
\left[\begin{array}{ccc}1&2&5\\ 0&1&2\\ 0&0&0\end{array}\right]
\left[\begin{array}{c}x_1\\ x_2\\ x_3\end{array}\right]
\Longrightarrow
\left[\begin{array}{ccc}1&0&1\\ 0&1&2\\ 0&0&0\end{array}\right]
\left[\begin{array}{c}x_1\\ x_2\\ x_3\end{array}\right]
=\left[\begin{array}{c}0\\ 0\\ 0\end{array}\right]
\end{displaymath}

The second-to-last matrix is the row echelon form (ref) of the coefficient matrix ${\bf A}$, while the last matrix is its reduced row echelon form (rref). In this rref form there are only two pivot columns and two pivot rows. The last row is all zeros, indicating that ${\bf A}$ is singular ($R=2<M=N=3$), so $x_3$ is a free variable which can take any value. We have

\begin{displaymath}x_1+x_3=0,\;\;\;x_2+2x_3=0 \end{displaymath}

If we let $x_3=1$, then we get $x_1=-1$, $x_2=-2$, and a special solution ${\bf x}_n=[-1,\;-2,\;1]^T$, which is in the nullspace $N({\bf A})$. As the free variable $x_3$ can take any value $c$, the complete solution, i.e., the entire nullspace, is

\begin{displaymath}N({\bf A})=c\,{\bf x}_n=c\left[\begin{array}{r}-1\\ -2\\ 1\end{array}\right] \end{displaymath}
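Both the special solution and the general solution, with the free variable kept symbolic, can be reproduced with sympy (a sketch under the same assumptions as above):

from sympy import Matrix, symbols, linsolve

A = Matrix([[1, 2, 5],
            [2, 3, 8],
            [3, 1, 5]])

print(A.nullspace())   # [Matrix([[-1], [-2], [1]])]: the special solution x_n

x1, x2, x3 = symbols('x1 x2 x3')
print(linsolve((A, Matrix([0, 0, 0])), x1, x2, x3))
# {(-x3, -2*x3, x3)}: every multiple of [-1, -2, 1]^T, i.e., the nullspace N(A)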

Example

Solve the following non-homogeneous equation

\begin{displaymath}{\bf A}{\bf x}=
\left[\begin{array}{ccc}1&2&5\\ 2&3&8\\ 3&1&5\end{array}\right]
\left[\begin{array}{c}x_1\\ x_2\\ x_3\end{array}\right]
={\bf b}=\left[\begin{array}{c}19\\ 31\\ 22\end{array}\right]
\end{displaymath}

We use Gauss-Jordan elimination to solve this system by applying a sequence of elementary row operations:

\begin{displaymath}[\begin{array}{c\vert c}{\bf A}&{\bf b}\end{array}]
=\left[\begin{array}{rrr\vert r}1&2&5&19\\ 2&3&8&31\\ 3&1&5&22\end{array}\right]
\Longrightarrow
\left[\begin{array}{rrr\vert r}1&0&1&5\\ 0&1&2&7\\ 0&0&0&0\end{array}\right]
\end{displaymath}

There are only two independent columns (or rows) in the rref form, i.e., the rank is $R=2<M=N=3$, indicating that ${\bf A}$ is singular and its inverse ${\bf A}^{-1}$ does not exist; we cannot convert ${\bf A}$ into the identity matrix ${\bf I}$. Instead we just have the following system with a coefficient matrix in rref form, which can also be expressed in block matrix form:

\begin{displaymath}\left[\begin{array}{rrr}1&0&1\\ 0&1&2\\ 0&0&0\end{array}\right]
\left[\begin{array}{c}x_1\\ x_2\\ x_3\end{array}\right]
=\left[\begin{array}{c}5\\ 7\\ 0\end{array}\right],
\;\;\;\;\mbox{i.e.,}\;\;\;\;
\left[\begin{array}{cc}{\bf I}&{\bf F}\\ {\bf0}&{\bf0}\end{array}\right]
\left[\begin{array}{c}{\bf x}_{pivot}\\ {\bf x}_{free}\end{array}\right]
=\left[\begin{array}{c}{\bf b}_1\\ {\bf0}\end{array}\right]
\end{displaymath}

where

\begin{displaymath}
{\bf I}=\left[\begin{array}{cc}1&0\\ 0&1\end{array}\right],\;\;\;\;
{\bf F}=\left[\begin{array}{c}1\\ 2\end{array}\right],\;\;\;\;
{\bf b}_1=\left[\begin{array}{c}5\\ 7\end{array}\right]
\end{displaymath}

Here ${\bf x}_{pivot}$ contains the variables corresponding to the pivot columns, and ${\bf x}_{free}$ the free variables corresponding to the non-pivot columns, which can take any value.

Solving for ${\bf x}_{pivot}$, we get

\begin{displaymath}{\bf I}{\bf x}_{pivot}+{\bf F}{\bf x}_{free}={\bf b}_1,\;\;\;\;\mbox{i.e.,}\;\;\;\;
{\bf x}_{pivot}={\bf b}_1-{\bf F}{\bf x}_{free}
=\left[\begin{array}{c}5\\ 7\end{array}\right]
+x_3\left[\begin{array}{c}-1\\ -2\end{array}\right]
\end{displaymath}

If we let ${\bf x}_{free}=x_3=0$, we get a particular solution

\begin{displaymath}{\bf x}_{particular}=\left[\begin{array}{c}5\\ 7\\ 0\end{array}\right] \end{displaymath}

But as the free variable $x_3$ can take any value $c$, the complete solution is

\begin{displaymath}
{\bf x}_{complete}=\left[\begin{array}{c}5\\ 7\\ 0\end{array}\right]
+c\left[\begin{array}{r}-1\\ -2\\ 1\end{array}\right]
={\bf x}_{particular}+c\,{\bf x}_n={\bf x}_{particular}+N({\bf A})
\end{displaymath}

where ${\bf x}_n=[-1,\;-2,\;1]^T$ is the solution of ${\bf A}{\bf x}={\bf0}$ already obtained in the previous example, which spans the 1-D nullspace $N({\bf A})=c\,{\bf x}_n$. We therefore see that the complete solution of ${\bf A}{\bf x}={\bf b}$ is the entire nullspace $N({\bf A})$ translated by the particular solution ${\bf x}_{particular}$. For example, for $c=0,1,2$ we get the following solutions:

\begin{displaymath}
{\bf x}=\left[\begin{array}{c}5\\ 7\\ 0\end{array}\right],\;\;\;\;
{\bf x}=\left[\begin{array}{c}4\\ 5\\ 1\end{array}\right],\;\;\;\;
{\bf x}=\left[\begin{array}{c}3\\ 3\\ 2\end{array}\right]
\end{displaymath}

The following figure illustrates the general picture in the 3-D space $\mathbb{R}^3$ with a 2-D nullspace: the complete solution is the entire nullspace translated by a particular solution ${\bf x}_{particular}$.

[Figure CompleteSolution.png: the complete solution as the nullspace $N({\bf A})$ translated by a particular solution ${\bf x}_{particular}$]
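The complete solution of this example can be checked with the same sympy tools (a sketch; the assertion verifies that the translated nullspace satisfies the system for any $c$):

from sympy import Matrix, symbols, linsolve

A = Matrix([[1, 2, 5],
            [2, 3, 8],
            [3, 1, 5]])
b = Matrix([19, 31, 22])

x1, x2, x3 = symbols('x1 x2 x3')
print(linsolve((A, b), x1, x2, x3))
# {(5 - x3, 7 - 2*x3, x3)}: particular solution [5, 7, 0]^T plus the nullspace

# the particular solution translated by any multiple of x_n still solves A x = b
c = symbols('c')
x = Matrix([5, 7, 0]) + c * Matrix([-1, -2, 1])
assert A * x == b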

Example

\begin{displaymath}
{\bf A}{\bf x}=\left[\begin{array}{rrrr}1&2&3&4\\ 4&3&2&1\\ -2&1&4&7\end{array}\right]
\left[\begin{array}{c}x_1\\ x_2\\ x_3\\ x_4\end{array}\right]
=\left[\begin{array}{r}3\\ 2\\ 4\end{array}\right]
\end{displaymath}

We construct the augmented matrix and then convert it to rref form:

\begin{displaymath}\left[\begin{array}{rrrr\vert r}1&2&3&4&3\\ 4&3&2&1&2\\ -2&1&4&7&4\end{array}\right]
\Longrightarrow
\left[\begin{array}{rrrr\vert r}1&0&-1&-2&-1\\ 0&1&2&3&2\\ 0&0&0&0&0\end{array}\right]
=\left[\begin{array}{cc\vert c}{\bf I}&{\bf F}&{\bf b}_1\\ {\bf0}&{\bf0}&{\bf0}\end{array}\right]
\end{displaymath}

The equation system now takes this block matrix form:

\begin{displaymath}\left[\begin{array}{rr}{\bf I}&{\bf F}\\ {\bf0}&{\bf0}\end{array}\right]
\left[\begin{array}{c}{\bf x}_p\\ {\bf x}_f\end{array}\right]=
\left[\begin{array}{c}{\bf b}_1\\ {\bf0}\end{array}\right]
\end{displaymath}

where

\begin{displaymath}{\bf I}=\left[\begin{array}{cc}1&0\\ 0&1\end{array}\right],\;\;\;\;
{\bf F}=\left[\begin{array}{rr}-1&-2\\ 2&3\end{array}\right],\;\;\;\;
{\bf b}_1=\left[\begin{array}{r}-1\\ 2\end{array}\right]
\end{displaymath}

We first find the solution of the homogeneous equation ${\bf A}{\bf x}={\bf0}$:

\begin{displaymath}
{\bf I}{\bf x}_p+{\bf F}{\bf x}_f={\bf0},\;\;\;\;\mbox{i.e.,}\;\;\;\;
{\bf x}_p=-{\bf F}{\bf x}_f=
\left[\begin{array}{rr}1&2\\ -2&-3\end{array}\right]
\left[\begin{array}{r}x_3\\ x_4\end{array}\right]
\end{displaymath}

Setting the free variables to ${\bf x}_f=[x_3,\;x_4]^T=[1,\;0]^T$ and $[0,\;1]^T$ in turn, we get the two basis vectors of the nullspace $N({\bf A})$:

\begin{displaymath}{\bf x}_p=\left[\begin{array}{c}x_1\\ x_2\end{array}\right]
=\left[\begin{array}{r}1\\ -2\end{array}\right]\;\mbox{and}\;
\left[\begin{array}{r}2\\ -3\end{array}\right],
\;\;\;\;\mbox{i.e.,}\;\;\;\;
{\bf x}_{n_1}=\left[\begin{array}{r}1\\ -2\\ 1\\ 0\end{array}\right],\;\;\;\;
{\bf x}_{n_2}=\left[\begin{array}{r}2\\ -3\\ 0\\ 1\end{array}\right]
\end{displaymath}

We next find the particular solution of the non-homogeneous equation ${\bf A}{\bf x}={\bf b}$ by setting ${\bf x}_f={\bf0}$:

\begin{displaymath}
{\bf I}{\bf x}_p+{\bf F}{\bf x}_f={\bf x}_p={\bf b}_1,\;\;\;\;\mbox{i.e.,}\;\;\;\;
{\bf x}_{particular}=\left[\begin{array}{r}-1\\ 2\\ 0\\ 0\end{array}\right]
\end{displaymath}

We finally get the complete solution

\begin{displaymath}{\bf x}_{complete}={\bf x}_{particular}+N({\bf A})
={\bf x}_{particular}
+c_1\left[\begin{array}{r}1\\ -2\\ 1\\ 0\end{array}\right]
+c_2\left[\begin{array}{r}2\\ -3\\ 0\\ 1\end{array}\right]
\end{displaymath}
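A sketch verifying this example with sympy; linsolve returns the particular solution with the free variables $x_3,x_4$ kept symbolic:

from sympy import Matrix, symbols, linsolve

A = Matrix([[ 1, 2, 3, 4],
            [ 4, 3, 2, 1],
            [-2, 1, 4, 7]])
b = Matrix([3, 2, 4])

x1, x2, x3, x4 = symbols('x1 x2 x3 x4')
print(linsolve((A, b), x1, x2, x3, x4))
# {(x3 + 2*x4 - 1, -2*x3 - 3*x4 + 2, x3, x4)}:
# the particular solution [-1, 2, 0, 0]^T plus the 2-D nullspace

print(A.nullspace())
# [Matrix([[1], [-2], [1], [0]]), Matrix([[2], [-3], [0], [1]])]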

If the right-hand side is ${\bf b}=[1,\;3,\;5]^T$, then the row reduction of the augmented matrix yields:

\begin{displaymath}\left[\begin{array}{rrrr\vert r}1&2&3&4&1\\ 4&3&2&1&3\\ -2&1&4&7&5\end{array}\right]
\Longrightarrow
\left[\begin{array}{rrrr\vert r}1&2&3&4&1\\ 0&1&2&3&0.2\\ 0&0&0&0&1.2\end{array}\right]
\end{displaymath}

The equation corresponding to the last row is $0=1.2$, a contradiction, indicating the system is not solvable (even though it has more unknowns than independent equations), as this ${\bf b}$ is not in the image $C({\bf A})$ of ${\bf A}$.
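Solvability can be tested without carrying out the full reduction: ${\bf A}{\bf x}={\bf b}$ is consistent if and only if appending ${\bf b}$ to ${\bf A}$ does not increase the rank. A sketch:

from sympy import Matrix

A = Matrix([[ 1, 2, 3, 4],
            [ 4, 3, 2, 1],
            [-2, 1, 4, 7]])
b = Matrix([1, 3, 5])

print(A.rank(), A.row_join(b).rank())   # 2 versus 3: b is not in C(A)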

Example

\begin{displaymath}{\bf A}^T{\bf y}=\left[\begin{array}{rrr}1&4&-2\\ 2&3&1\\ 3&2&4\\ 4&1&7\end{array}\right]
\left[\begin{array}{c}y_1\\ y_2\\ y_3\end{array}\right]
={\bf c}
=\left[\begin{array}{c}3\\ 11\\ 19\\ 27\end{array}\right]
\end{displaymath}

Carrying out elementary row operations on the augmented matrix, we get

\begin{displaymath}\left[\begin{array}{rrr\vert r}1&4&-2&3\\ 2&3&1&11\\ 3&2&4&19\\ 4&1&7&27\end{array}\right]
\Longrightarrow
\left[\begin{array}{rrr\vert r}1&0&2&7\\ 0&1&-1&-1\\ 0&0&0&0\\ 0&0&0&0\end{array}\right]
\end{displaymath}

The original system can now be written as

\begin{displaymath}\left[\begin{array}{cc}{\bf I}&{\bf F}\\ {\bf0}&{\bf0}\end{array}\right]
\left[\begin{array}{c}{\bf y}_p\\ {\bf y}_f\end{array}\right]=
\left[\begin{array}{c}{\bf c}_1\\ {\bf0}\end{array}\right]
\end{displaymath}

where

\begin{displaymath}{\bf I}=\left[\begin{array}{cc}1&0\\ 0&1\end{array}\right],\;\;\;\;
{\bf F}=\left[\begin{array}{r}2\\ -1\end{array}\right],\;\;\;\;
{\bf c}_1=\left[\begin{array}{r}7\\ -1\end{array}\right]
\end{displaymath}

We first find the solution of ${\bf A}^T{\bf y}={\bf0}$ by setting the free variable $y_3=1$:

\begin{displaymath}{\bf I}{\bf y}_p+{\bf F}{\bf y}_f={\bf0},
\;\;\;\;\mbox{i.e.,}\;\;\;\;
{\bf y}_p=-{\bf F}y_3=\left[\begin{array}{r}-2\\ 1\end{array}\right],
\;\;\;\;
{\bf y}_n=\left[\begin{array}{r}-2\\ 1\\ 1\end{array}\right]
\end{displaymath}

We next find the particular solution of ${\bf A}^T{\bf y}={\bf c}$ by setting ${\bf y}_f=y_3=0$:

\begin{displaymath}{\bf I}{\bf y}_p+{\bf F}{\bf y}_f={\bf y}_p={\bf c}_1=\left[\begin{array}{r}7\\ -1\end{array}\right],
\;\;\;\;\mbox{i.e.,}\;\;\;\;
{\bf y}_{particular}=\left[\begin{array}{r}7\\ -1\\ 0\end{array}\right]
\end{displaymath}

The complete solution is

\begin{displaymath}{\bf y}_{complete}={\bf y}_{particular}+N({\bf A}^T)={\bf y}_{particular}+c
\left[\begin{array}{r}-2\\ 1\\ 1\end{array}\right]
\end{displaymath}
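The same sketch applied to the transposed system (assuming the sympy tools used above):

from sympy import Matrix, symbols, linsolve

A = Matrix([[ 1, 2, 3, 4],
            [ 4, 3, 2, 1],
            [-2, 1, 4, 7]])
c = Matrix([3, 11, 19, 27])

y1, y2, y3 = symbols('y1 y2 y3')
print(linsolve((A.T, c), y1, y2, y3))
# {(7 - 2*y3, y3 - 1, y3)}: the particular solution [7, -1, 0]^T
# plus any multiple of the nullspace vector [-2, 1, 1]^T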

We can summarize the two examples above by listing all four subspaces associated with this matrix ${\bf A}$ with $M=3$, $N=4$, and rank $R=2$ (verified numerically in the sketch after the list):

  1. The column space of ${\bf A}$ is the same as the row space of ${\bf A}^T$, which is a 2-D subspace of $\mathbb{R}^3$ spanned by the two pivot rows of the rref of ${\bf A}^T$:

    \begin{displaymath}C({\bf A})=R({\bf A}^T)=\mbox{span}\left(\left[\begin{array}{c}1\\ 0\\ 2\end{array}\right],
\left[\begin{array}{r}0\\ 1\\ -1\end{array}\right]\right) \end{displaymath}

  2. The row space of ${\bf A}$ is the same as the column space of ${\bf A}^T$, which is a 2-D subspace of $\mathbb{R}^4$ spanned by the two pivot rows of the rref of ${\bf A}$:

    \begin{displaymath}R({\bf A})=C({\bf A}^T)=
\mbox{span}\left(\left[\begin{array}{r}1\\ 0\\ -1\\ -2\end{array}\right],
\left[\begin{array}{c}0\\ 1\\ 2\\ 3\end{array}\right]\right) \end{displaymath}

  3. The nullspace of ${\bf A}$ is a 2-D subspace of $\mathbb{R}^4$ spanned by the two special solutions:

    \begin{displaymath}N({\bf A})=\mbox{span}\left(\left[\begin{array}{r}1\\ -2\\ 1\\ 0\end{array}\right],
\left[\begin{array}{r}2\\ -3\\ 0\\ 1\end{array}\right]\right)
\end{displaymath}

  4. The nullspace of ${\bf A}^T$ is a 1-D subspace of $\mathbb{R}^3$ spanned by:

    \begin{displaymath}N({\bf A}^T)=\mbox{span}\left(\left[\begin{array}{r}-2\\ 1\\ 1\end{array}\right]\right)
=c\;\left[\begin{array}{r}-2\\ 1\\ 1\end{array}\right]
\end{displaymath}
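These four bases and their orthogonality relations can be verified numerically; a sketch using sympy:

from sympy import Matrix

A = Matrix([[ 1, 2, 3, 4],
            [ 4, 3, 2, 1],
            [-2, 1, 4, 7]])

row_space  = A.rowspace()      # basis of R(A), two vectors in R^4
null_space = A.nullspace()     # basis of N(A), two vectors in R^4
col_space  = A.T.rowspace()    # basis of C(A) = R(A^T), two vectors in R^3
left_null  = A.T.nullspace()   # basis of N(A^T), one vector in R^3

# R(A) is orthogonal to N(A), and C(A) is orthogonal to N(A^T)
assert all(r.dot(n) == 0 for r in row_space for n in null_space)
assert all(u.dot(v) == 0 for u in col_space for v in left_null)

# the dimensions add up: 2 + 2 = 4 and 2 + 1 = 3
print(len(row_space), len(null_space), len(col_space), len(left_null))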

Note that the basis of $R({\bf A})$ can be obtained from the rref form of ${\bf A}$, as its rows are equivalent to those of ${\bf A}$. But the basis of $C({\bf A})$ cannot be obtained this way, as the columns of the rref form are changed by the row reduction operations and are no longer equivalent to the columns of ${\bf A}$. However, the basis of $C({\bf A})$ can be obtained from the rref form of ${\bf A}^T$, as its rows are equivalent to those of ${\bf A}^T$, which are the columns of ${\bf A}$.

We can easily verify that these vectors are independent and can serve as bases spanning the corresponding subspaces, and that these subspaces indeed satisfy the following orthogonality and complementarity properties:

\begin{displaymath}C({\bf A})\perp N({\bf A}^T),\;\;\;\;R({\bf A})\perp N({\bf A}),\;\;\;\;
C({\bf A})\oplus N({\bf A}^T)=\mathbb{R}^3,\;\;\;\;R({\bf A})\oplus N({\bf A})=\mathbb{R}^4
\end{displaymath}

i.e., they are orthogonal complements:

\begin{displaymath}
C({\bf A})=N({\bf A}^T)^\perp,\;\;\;\;R({\bf A})=N({\bf A})^\perp
\end{displaymath}

We can further verify that

\begin{displaymath}
{\bf b}=\left[\begin{array}{r}3\\ 2\\ 4\end{array}\right]
=3\left[\begin{array}{c}1\\ 0\\ 2\end{array}\right]
+2\left[\begin{array}{r}0\\ 1\\ -1\end{array}\right]\in C({\bf A}),\;\;\;\;
{\bf c}=\left[\begin{array}{r}3\\ 11\\ 19\\ 27\end{array}\right]
=3\left[\begin{array}{r}1\\ 0\\ -1\\ -2\end{array}\right]
+11\left[\begin{array}{r}0\\ 1\\ 2\\ 3\end{array}\right]\in R({\bf A})
\end{displaymath}

and indeed ${\bf b}\perp N({\bf A}^T)$ and ${\bf c}\perp N({\bf A})$, i.e., the systems are solvable. However, when ${\bf b}=[1,\;3,\;5]^T\notin C({\bf A})$, the system is unsolvable. The particular solution ${\bf x}_p\in \mathbb{R}^4$ lies in neither $R({\bf A})$ nor $N({\bf A})$; it is a linear combination of the four basis vectors of the two subspaces:

\begin{displaymath}{\bf x}_p=\left[\begin{array}{r}-1\\ 2\\ 0\\ 0\end{array}\right]
=0.1\left[\begin{array}{r}1\\ 0\\ -1\\ -2\end{array}\right]
+0.2\left[\begin{array}{c}0\\ 1\\ 2\\ 3\end{array}\right]
-0.3\left[\begin{array}{r}1\\ -2\\ 1\\ 0\end{array}\right]
-0.4\left[\begin{array}{r}2\\ -3\\ 0\\ 1\end{array}\right]
\end{displaymath}

Similarly, ${\bf y}_p$ lies in neither $C({\bf A})$ nor $N({\bf A}^T)$, but is a linear combination of the three basis vectors of the two subspaces:

\begin{displaymath}{\bf y}_p=\left[\begin{array}{r}7\\ -1\\ 0\end{array}\right]
=2\left[\begin{array}{c}1\\ 0\\ 2\end{array}\right]
+1.5\left[\begin{array}{r}0\\ 1\\ -1\end{array}\right]
-2.5\left[\begin{array}{r}-2\\ 1\\ 1\end{array}\right]
\end{displaymath}
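Such coefficients can be found by collecting the basis vectors as the columns of a square matrix and solving a linear system; a sketch, with the matrices B and C below assembled from the bases listed above:

from sympy import Matrix

# columns: the two basis vectors of R(A) followed by the two of N(A)
B = Matrix([[ 1,  0,  1,  2],
            [ 0,  1, -2, -3],
            [-1,  2,  1,  0],
            [-2,  3,  0,  1]])
x_p = Matrix([-1, 2, 0, 0])
print(B.solve(x_p).T)   # Matrix([[1/10, 1/5, -3/10, -2/5]])

# columns: the two basis vectors of C(A) followed by the one of N(A^T)
C = Matrix([[ 1,  0, -2],
            [ 0,  1,  1],
            [ 2, -1,  1]])
y_p = Matrix([7, -1, 0])
print(C.solve(y_p).T)   # Matrix([[2, 3/2, -5/2]])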

In summary, here are the ways to find the bases of the four subspaces:

  1. $R({\bf A})$: the pivot rows of the rref of ${\bf A}$;
  2. $C({\bf A})$: the pivot rows of the rref of ${\bf A}^T$ (equivalently, the columns of ${\bf A}$ at the pivot positions of its rref);
  3. $N({\bf A})$: the special solutions of ${\bf A}{\bf x}={\bf0}$, obtained by setting each free variable to 1 and the others to 0 in turn;
  4. $N({\bf A}^T)$: the special solutions of ${\bf A}^T{\bf y}={\bf0}$, obtained in the same way.

Note that these bases are not necessarily orthogonal. Also, while ${\bf x}_n\in N({\bf A})$, in general ${\bf x}_p\notin R({\bf A})$.

All these results are illustrated in the figure below, where we assume ${\bf x}_p\in R({\bf A})$ and ${\bf y}_p\in C({\bf A})$; this is not the case in the examples above, but it will be true when the systems are solved by the SVD method discussed in the next section.

[Figure FourSpaces.png: the four fundamental subspaces of ${\bf A}$ and the mappings between them]

Here is a summary of the four subspaces associated with an $M$ by $N$ matrix ${\bf A}$ of rank $R$.

$M$, $N$, $R$ | dim $R({\bf A})$ | dim $N({\bf A})$ | dim $C({\bf A})$ | dim $N({\bf A}^T)$ | solvability of ${\bf A}{\bf x}={\bf b}$
$M=N=R$ | $R$ | $0$ | $R$ | $0$ | solvable; ${\bf x}={\bf A}^{-1}{\bf b}$ is the unique solution
$M>N=R$ | $R$ | $0$ | $R$ | $M-R$ | solvable only if ${\bf b}\in C({\bf A})$, in which case the solution is unique
$N>M=R$ | $R$ | $N-R$ | $R$ | $0$ | always solvable, with infinitely many solutions ${\bf x}={\bf x}_p+N({\bf A})$
$R<\min(M,N)$ | $R$ | $N-R$ | $R$ | $M-R$ | solvable only if ${\bf b}\in C({\bf A})$, with infinitely many solutions

The following figure illustrates the case when $M=N=3$ and ${\bf b}\notin C({\bf A})$. The system can only be solved approximately, for the projection ${\bf p}\in C({\bf A})$ of ${\bf b}$ onto the column space. This is the optimal approximation in the least-squares sense, as the norm of the error ${\bf e}={\bf b}-{\bf p}$ is minimized: $\Vert{\bf e}\Vert\rightarrow\min$.

[Figure FourSpaces1.png: the projection ${\bf p}\in C({\bf A})$ of ${\bf b}$ and the minimized error ${\bf e}={\bf b}-{\bf p}$]
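When ${\bf b}\notin C({\bf A})$, the projection ${\bf p}$ and the least-squares solution can be computed numerically; a sketch using numpy's lstsq, reusing the unsolvable system from the earlier example:

import numpy as np

A = np.array([[ 1.0, 2.0, 3.0, 4.0],
              [ 4.0, 3.0, 2.0, 1.0],
              [-2.0, 1.0, 4.0, 7.0]])
b = np.array([1.0, 3.0, 5.0])     # not in C(A): A x = b has no exact solution

# minimum-norm least-squares solution, computed internally via the SVD
x, _, rank, _ = np.linalg.lstsq(A, b, rcond=None)

p = A @ x                         # projection of b onto the column space C(A)
e = b - p                         # error vector, minimized in norm
print(rank)                       # 2
print(np.allclose(A.T @ e, 0))    # True: e is orthogonal to C(A), i.e., e is in N(A^T)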


Ruye Wang 2014-09-29