The ICA method relies on certain measures of non-Gaussianity:

**Kurtosis**

Kurtosis is defined as the normalized form of the fourth central moment of a distribution:

$$\text{kurt}(x) = E\{x^4\} - 3\,(E\{x^2\})^2$$

If we assume $x$ has zero mean and unit variance, then $E\{x^2\} = 1$ and $\text{kurt}(x) = E\{x^4\} - 3$. Kurtosis measures the degree of peakedness (spikiness) of a distribution, and it is zero only for the Gaussian distribution. Any other distribution's kurtosis is either positive if the distribution is super-Gaussian (spikier than Gaussian) or negative if it is sub-Gaussian (flatter than Gaussian). Therefore the absolute value of the kurtosis, or the kurtosis squared, can be used to measure the non-Gaussianity of a distribution. However, kurtosis is very sensitive to outliers, so it is not a robust measure of non-Gaussianity.

**Differential Entropy - Negentropy**

The entropy of a random variable $x$ with density function $p(x)$ is defined as

$$H(x) = -\int p(x)\,\log p(x)\,dx$$
An important property of the Gaussian distribution is that it has the maximum entropy among all distributions of the same variance over the entire real axis $(-\infty, \infty)$. (And the uniform distribution has the maximum entropy among all distributions over a finite range.) Based on this property, the differential entropy, also called *negentropy*, is defined as

$$J(x) = H(x_{gauss}) - H(x)$$

where $x_{gauss}$ is a Gaussian variable with the same variance as $x$. As $J(x)$ is always greater than zero unless $x$ is Gaussian, it is a good measure of non-Gaussianity. This result can be generalized from random variables to random vectors, such as $\mathbf{x} = [x_1, \dots, x_n]^T$, and we want to find a matrix $\mathbf{W}$ so that $\mathbf{y} = \mathbf{W}\mathbf{x}$ has the maximum negentropy $J(\mathbf{y})$, i.e., $\mathbf{y}$ is most non-Gaussian. However, the exact $J(x)$ is difficult to get, as its calculation requires the specific density function $p(x)$.
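Both measures can be estimated directly from samples. The sketch below is an illustration (not part of the original notes; the sample size, bin count, and test distributions are arbitrary choices): it draws unit-variance Gaussian, Laplacian (super-Gaussian), and uniform (sub-Gaussian) samples, then checks that the sample kurtosis is near zero, positive, and negative respectively, while the Gaussian attains the largest entropy at fixed variance.

```python
# Illustration: sample estimates of kurtosis and differential entropy
# for three unit-variance distributions.
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

def kurtosis(x):
    """kurt(x) = E{x^4} - 3 (E{x^2})^2, after enforcing zero mean, unit variance."""
    x = (x - x.mean()) / x.std()
    return np.mean(x**4) - 3 * np.mean(x**2)**2

def entropy_estimate(x, bins=200):
    """Crude histogram estimate of H(x) = -integral p(x) log p(x) dx."""
    p, edges = np.histogram(x, bins=bins, density=True)
    dx = edges[1] - edges[0]
    p = p[p > 0]                     # skip empty bins (0 log 0 -> 0)
    return -np.sum(p * np.log(p)) * dx

samples = {
    "gaussian": rng.standard_normal(n),                    # kurt ~ 0
    "laplace":  rng.laplace(0, 1 / np.sqrt(2), n),         # super-Gaussian, kurt > 0
    "uniform":  rng.uniform(-np.sqrt(3), np.sqrt(3), n),   # sub-Gaussian, kurt < 0
}
for name, x in samples.items():
    print(f"{name:8s} kurt = {kurtosis(x):+.2f}  H = {entropy_estimate(x):.3f}")
```

The Laplace scale $1/\sqrt{2}$ and uniform range $[-\sqrt{3}, \sqrt{3}]$ are chosen so all three distributions have unit variance, making the entropy comparison fair.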

**Approximations of Negentropy**

The negentropy can be approximated by

$$J(x) \approx \frac{1}{12}\,E\{x^3\}^2 + \frac{1}{48}\,\text{kurt}(x)^2$$

However, this approximation also suffers from non-robustness due to the kurtosis term. A better approximation is

$$J(x) \approx \sum_{i=1}^p k_i \left[\, E\{G_i(x)\} - E\{G_i(\nu)\} \,\right]^2$$
where the $k_i$ are some positive constants, $x$ is assumed to have zero mean and unit variance, and $\nu$ is a Gaussian variable also with zero mean and unit variance. The $G_i$ are some non-quadratic functions such as

$$G_1(u) = \frac{1}{a_1}\log\cosh(a_1 u), \qquad G_2(u) = -e^{-u^2/2}$$

where $a_1$, with $1 \le a_1 \le 2$, is some suitable constant. Although this approximation may not be accurate, it is always greater than zero except when $x$ is Gaussian. In particular, when $p = 1$, we have

$$J(x) \propto \left[\, E\{G(x)\} - E\{G(\nu)\} \,\right]^2$$
Since the second term $E\{G(\nu)\}$ is a constant, we want to maximize the first term $E\{G(x)\}$ in order to maximize $J(x)$.
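The one-unit contrast described above can be sketched numerically. The code below is an illustration (not from the original notes); it uses $G(u) = \log\cosh(u)$, i.e. $G_1$ with $a_1 = 1$, estimates the constant $E\{G(\nu)\}$ from Gaussian samples, and verifies that the contrast is near zero for Gaussian data but strictly positive for super- and sub-Gaussian data.

```python
# Illustration: the one-unit negentropy approximation
# J(x) proportional to [E{G(x)} - E{G(nu)}]^2 with G(u) = log cosh(u).
import numpy as np

rng = np.random.default_rng(1)
n = 200_000

def G(u):
    return np.log(np.cosh(u))

# E{G(nu)} for a standard Gaussian nu, estimated once by sampling.
EG_nu = np.mean(G(rng.standard_normal(n)))

def J_approx(x):
    """Proportional to the negentropy approximation (constant k_1 dropped)."""
    x = (x - x.mean()) / x.std()      # enforce zero mean, unit variance
    return (np.mean(G(x)) - EG_nu) ** 2

x_gauss   = rng.standard_normal(n)                       # J ~ 0
x_laplace = rng.laplace(0, 1 / np.sqrt(2), n)            # super-Gaussian, J > 0
x_uniform = rng.uniform(-np.sqrt(3), np.sqrt(3), n)      # sub-Gaussian, J > 0
for name, x in [("gaussian", x_gauss), ("laplace", x_laplace), ("uniform", x_uniform)]:
    print(f"{name:8s} J = {J_approx(x):.6f}")
```

Maximizing this quantity over projections of a random vector is exactly the search for a most non-Gaussian direction described in the negentropy section above.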