A clustering algorithm groups the given samples, each represented as a
vector
in the N-dimensional feature space,
into a set of clusters according to their spatial distribution in the N-D
space. Clustering is a form of unsupervised classification, as no *a priori*
knowledge (such as samples of known classes) is assumed to be available.

- Step 1. Randomly choose $K$ initial cluster centers
$\mathbf{m}_1^{(0)}, \dots, \mathbf{m}_K^{(0)}$ from the given
sample set (e.g., the first $K$ samples of the sample set). Set $l = 0$;
- Step 2. Assign each of the samples $\mathbf{x}$ to one of the $K$ clusters
according to the distance
between the sample
and the mean vectors of the clusters:

  $$\mathbf{x} \in C_i^{(l)} \quad \text{if} \quad \|\mathbf{x} - \mathbf{m}_i^{(l)}\| \le \|\mathbf{x} - \mathbf{m}_j^{(l)}\|, \quad \text{for all } j \ne i$$

  where $C_i^{(l)}$ denotes the $i$th cluster of samples whose center is $\mathbf{m}_i^{(l)}$ at the $l$th iteration;
- Step 3. Update the cluster centers to get $\mathbf{m}_i^{(l+1)}$ so that
the sum of the squared distances
from all points in $C_i^{(l)}$ to the new center,

  $$J_i = \sum_{\mathbf{x} \in C_i^{(l)}} \|\mathbf{x} - \mathbf{m}_i^{(l+1)}\|^2,$$

  is minimized. To find this new center, we let

  $$\frac{\partial J_i}{\partial \mathbf{m}_i^{(l+1)}} = -2 \sum_{\mathbf{x} \in C_i^{(l)}} \left( \mathbf{x} - \mathbf{m}_i^{(l+1)} \right) = \mathbf{0}$$

  Solving this we get

  $$\mathbf{m}_i^{(l+1)} = \frac{1}{N_i} \sum_{\mathbf{x} \in C_i^{(l)}} \mathbf{x}$$

  where $N_i$ is the number of samples currently in $C_i^{(l)}$ at the $l$th iteration.
- Step 4. Terminate if the algorithm has converged, i.e., the membership
of each pattern is no longer changed:

  $$\mathbf{m}_i^{(l+1)} = \mathbf{m}_i^{(l)}, \quad i = 1, \dots, K,$$

  or a preset maximum number of iterations is exceeded. Otherwise, set $l \leftarrow l + 1$ and go to Step 2. (A minimal code sketch of these steps follows the list.)
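
To make the four steps concrete, here is a minimal NumPy sketch of K-means (the function name `kmeans`, the random initialization via `default_rng`, and the label-based convergence test are illustrative choices for this sketch, not part of the formulation above):

```python
import numpy as np

def kmeans(X, K, max_iter=100, seed=0):
    """Cluster the rows of X (n_samples x n_features) into K clusters."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    # Step 1: randomly choose K initial cluster centers from the sample set.
    centers = X[rng.choice(len(X), size=K, replace=False)].copy()
    labels = None
    for _ in range(max_iter):
        # Step 2: assign each sample to the cluster with the nearest mean.
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        new_labels = dists.argmin(axis=1)
        # Step 4: terminate when no membership changes.
        if labels is not None and np.array_equal(new_labels, labels):
            break
        labels = new_labels
        # Step 3: move each center to the mean of its current members.
        for i in range(K):
            members = X[labels == i]
            if len(members) > 0:
                centers[i] = members.mean(axis=0)
    return centers, labels

# Example: three well-separated 2-D blobs.
X = np.vstack([np.random.randn(50, 2) + c for c in ([0, 0], [6, 6], [0, 6])])
centers, labels = kmeans(X, K=3)
```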

**The ISODATA Algorithm**

In the K-means method, the number of clusters remains the same throughout
the iteration, although it may turn out later that more or fewer
clusters would fit the data better. This drawback can be overcome in the
*ISODATA Algorithm* (Iterative Self-Organizing Data Analysis Technique
Algorithm), which allows the number of clusters to be adjusted automatically
during the iteration by merging similar clusters and splitting clusters with
large standard deviations. The algorithm is highly heuristic and based on the
following pre-specified parameters:

- $K_0$: desired number of clusters;
- $N_{\min}$: minimum number of samples in each cluster (for discarding clusters);
- $\sigma_{\max}^2$: maximum variance (for splitting clusters);
- $d_{\min}$: minimum pairwise distance (for merging clusters).

Here are the steps of the algorithm:

- Step 1. Randomly choose $K$ initial mean vectors $\mathbf{m}_1, \dots, \mathbf{m}_K$ from the data set.
- Step 2. Assign each data point $\mathbf{x}$ to the cluster $C_i$ with the
closest mean:

  $$\mathbf{x} \in C_i \quad \text{if} \quad \|\mathbf{x} - \mathbf{m}_i\| \le \|\mathbf{x} - \mathbf{m}_j\|, \quad \text{for all } j \ne i$$

- Step 3. Discard clusters containing too few members, i.e., if $N_i < N_{\min}$, then discard $C_i$, reassign its members to the other clusters, and set $K \leftarrow K - 1$.
- Step 4. For each cluster
$C_i$, update the mean
vector

  $$\mathbf{m}_i = \frac{1}{N_i} \sum_{\mathbf{x} \in C_i} \mathbf{x}$$

  and the covariance matrix:

  $$\Sigma_i = \frac{1}{N_i} \sum_{\mathbf{x} \in C_i} (\mathbf{x} - \mathbf{m}_i)(\mathbf{x} - \mathbf{m}_i)^T$$

  The diagonal elements $\sigma_{i1}^2, \dots, \sigma_{iN}^2$ of $\Sigma_i$ are the variances along the $N$ dimensions.
- Step 5. **If** $K \le K_0/2$ (too few clusters), go to Step 6 for splitting; **else if** $K \ge 2K_0$ (too many clusters), go to Step 7 for merging; **else** go to Step 8.
- Step 6. (split) For each cluster
$C_i$, find the
greatest diagonal element of its covariance matrix:

  $$\sigma_{ij}^2 = \max_k \sigma_{ik}^2$$

  If $\sigma_{ij}^2 > \sigma_{\max}^2$ and $N_i > 2N_{\min}$, then split $C_i$ into two new cluster centers by perturbing $\mathbf{m}_i$ in opposite directions along the $j$th dimension:

  $$\mathbf{m}_i^{\pm} = \mathbf{m}_i \pm \boldsymbol{\delta}_j, \qquad \boldsymbol{\delta}_j = (0, \dots, 0, \sigma_{ij}, 0, \dots, 0)^T$$

  Alternatively, carry out PCA to find the variance corresponding to the greatest eigenvalue and split the cluster along the direction of the corresponding eigenvector. Set $K \leftarrow K + 1$.

  Go to Step 8.

- Step 7. (merge) Compute the pairwise Bhattacharyya distances
between every two clusters:

  $$d_B(i, j) = \frac{1}{8} (\mathbf{m}_i - \mathbf{m}_j)^T \left[ \frac{\Sigma_i + \Sigma_j}{2} \right]^{-1} (\mathbf{m}_i - \mathbf{m}_j) + \frac{1}{2} \ln \frac{\bigl| (\Sigma_i + \Sigma_j)/2 \bigr|}{\sqrt{|\Sigma_i|\,|\Sigma_j|}}$$

  For each of the distances satisfying $d_B(i, j) < d_{\min}$, merge the two corresponding clusters to form a new one with center

  $$\mathbf{m} = \frac{N_i \mathbf{m}_i + N_j \mathbf{m}_j}{N_i + N_j}$$

  Delete $\mathbf{m}_i$ and $\mathbf{m}_j$, and set $K \leftarrow K - 1$. (The split and merge tests are sketched in code after this list.)
- Step 8. Terminate if the maximum number of iterations is reached. Otherwise go to Step 2.
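
The split test of Step 6 and the merge computations of Step 7 can be sketched as follows (a minimal illustration under the definitions above; the helper names and the choice of perturbing by one full standard deviation are assumptions for this sketch):

```python
import numpy as np

def try_split(members, mean, sigma2_max, n_min):
    """Step 6: test a cluster and, if warranted, return two new centers.

    members: (N_i x N) array of the cluster's samples; mean: its mean vector.
    Returns (m_plus, m_minus), or None if the split test fails.
    """
    var = members.var(axis=0)        # diagonal of the covariance matrix
    j = var.argmax()                 # dimension of greatest variance
    if var[j] > sigma2_max and len(members) > 2 * n_min:
        delta = np.zeros_like(mean)
        delta[j] = np.sqrt(var[j])   # perturb along dimension j (assumed step size)
        return mean + delta, mean - delta
    return None

def bhattacharyya(m1, S1, m2, S2):
    """Step 7: Bhattacharyya distance between two Gaussian clusters."""
    S = (S1 + S2) / 2
    dm = m1 - m2
    term1 = dm @ np.linalg.solve(S, dm) / 8
    term2 = 0.5 * np.log(np.linalg.det(S)
                         / np.sqrt(np.linalg.det(S1) * np.linalg.det(S2)))
    return term1 + term2

def merge_means(m1, n1, m2, n2):
    """Step 7: sample-size-weighted center of the merged cluster."""
    return (n1 * m1 + n2 * m2) / (n1 + n2)
```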

As the number of clusters can be dynamically adjusted in the process, the ISODATA algorithm is more flexible than the K-means algorithm. However, all of the additional parameters listed previously have to be chosen empirically.