A perceptron network (F. Rosenblatt, 1957) also has two layers, as in the previous Hebb's network, except the learning law is different. This is a supervised learning algorithm based on prior knowledge of the correct class of each training sample.
We first consider the case in which there is only one output node in the network. Presented with a pattern vector $\mathbf{x} = [x_1, \dots, x_n]^T$ at the input nodes, the output is computed to be
$$y = \operatorname{sgn}(\mathbf{w}^T \mathbf{x}) = \operatorname{sgn}\left(\sum_{i=1}^n w_i x_i\right)$$
The binary output is either $y = 1$, indicating $\mathbf{x}$ belongs to class 1 ($\mathbf{x} \in C_1$), or $y = -1$, indicating $\mathbf{x}$ belongs to class 2 ($\mathbf{x} \in C_2$), i.e.,
$$y = \begin{cases} 1 & \text{if } \mathbf{x} \in C_1 \\ -1 & \text{if } \mathbf{x} \in C_2 \end{cases}$$
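This forward computation can be sketched as follows (a minimal illustration in Python/NumPy; the function name and example weights are chosen here for illustration, not taken from the text):

```python
import numpy as np

def perceptron_output(w, x):
    """Single-node perceptron: sign of the weighted sum w^T x."""
    return 1 if np.dot(w, x) >= 0 else -1

# Example: a weight vector whose hyperplane is the line x1 + x2 = 0
w = np.array([1.0, 1.0])
print(perceptron_output(w, np.array([2.0, 1.0])))    # 1: class 1
print(perceptron_output(w, np.array([-2.0, -1.0])))  # -1: class 2
```

The sign convention here (output 1 when $\mathbf{w}^T\mathbf{x} \ge 0$) is one common choice; the boundary case could equally be assigned to the other class.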
The weight vector $\mathbf{w}$ is obtained in the training process based on a set of training samples with known classes (labeled): $\{(\mathbf{x}_k, y_k),\ k = 1, \dots, K\}$, where $\mathbf{x}_k$ is an n-D vector representing a pattern and the binary label $y_k \in \{1, -1\}$ indicates to which one of the two classes, $C_1$ or $C_2$, the input belongs.
Specifically, in each step of the training process, one of the patterns $\mathbf{x}_k$, randomly chosen, is presented to the input nodes for the network to generate an output $y = \operatorname{sgn}(\mathbf{w}^T \mathbf{x}_k)$. This output is then compared with the desired output $y_k$ corresponding to this input to obtain their difference $y_k - y$, based on which the weight vector is then updated by the following learning law:
$$\mathbf{w} \leftarrow \mathbf{w} + \eta\,(y_k - y)\,\mathbf{x}_k$$
where $\eta > 0$ is the learning rate. Note that the weights change only when the pattern is misclassified ($y \ne y_k$).

Perceptron convergence theorem:
If $C_1$ and $C_2$ are two linearly separable clusters of patterns, then a perceptron will develop, in a finite number of training trials, a weight vector $\mathbf{w}$ that correctly classifies them.
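The training process above can be sketched as a short program (a sketch in Python/NumPy; the learning rate $\eta = 0.5$, the epoch limit, and the use of a constant bias input are assumptions for this illustration, not values from the text):

```python
import numpy as np

def train_perceptron(X, labels, eta=0.5, max_epochs=100):
    """Train a single-node perceptron with the update w <- w + eta*(y_k - y)*x_k."""
    w = np.zeros(X.shape[1])
    for _ in range(max_epochs):
        errors = 0
        for x, d in zip(X, labels):
            y = 1 if np.dot(w, x) >= 0 else -1  # current output sgn(w^T x)
            if y != d:                          # misclassified: update weights
                w += eta * (d - y) * x
                errors += 1
        if errors == 0:                         # all samples correct: converged
            return w, True
    return w, False

# Logical AND, with a constant bias input prepended to each pattern
X = np.array([[1, 0, 0], [1, 0, 1], [1, 1, 0], [1, 1, 1]], dtype=float)
d = np.array([-1, -1, -1, 1])  # class 1 only when both inputs are 1
w, converged = train_perceptron(X, d)
print(converged)               # True: AND is linearly separable
```

Since AND is linearly separable, the convergence theorem guarantees the loop terminates with zero errors.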
As the perceptron network is only capable of linear operations, the corresponding classification is limited to classes that are linearly separable by a hyperplane in the n-D space. A typical example of two classes that are not linearly separable is the XOR problem, where the 2-D space is divided into four regions alternating between the two classes 0 and 1, just like the Karnaugh map for the exclusive-OR of two logical variables $x_1$ and $x_2$. Indeed, no weights (including a bias term) can realize XOR: adding the two inequalities required for the class-1 patterns $(0,1)$ and $(1,0)$ contradicts adding the two required for the class-0 patterns $(0,0)$ and $(1,1)$.
When there are $m > 1$ output nodes in the network, the network becomes a multiple-class classifier. The weight vectors $\mathbf{w}_i$ ($i = 1, \dots, m$) for these output nodes define $m$ hyperplanes that partition the n-D space into multiple regions, each corresponding to one of the classes. Specifically, if $m \le n$, then the n-D space can be partitioned into as many as $2^m$ regions.
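A small sketch of this partitioning (illustrative Python/NumPy; the two weight vectors are chosen here as an example): each of the $m$ output signs labels one side of a hyperplane, so the vector of $m$ signs identifies a region.

```python
import numpy as np

# m = 2 weight vectors (hyperplanes through the origin) in n = 2 dimensions
W = np.array([[1.0, 0.0],   # hyperplane x1 = 0
              [0.0, 1.0]])  # hyperplane x2 = 0

def region(x):
    """Sign pattern of the m outputs; each distinct pattern is one region."""
    return tuple(1 if v >= 0 else -1 for v in W @ x)

# One sample point per quadrant: all four sign patterns (2^m regions) appear
points = [(1, 1), (-1, 1), (-1, -1), (1, -1)]
regions = {region(np.array(p, dtype=float)) for p in points}
print(len(regions))  # 4, i.e. 2^m for m = 2 <= n
```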
The limitation to linearly separable classes can be overcome by having more than one learning layer in the network. However, the learning law above for the single-layer perceptron network no longer applies, and a new training method is needed.
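To illustrate that a second layer suffices for XOR, here is a sketch with hand-set (not learned) weights, assuming 0/1 hard-limiting units: the hidden layer computes OR and NAND, and the output node ANDs them, which is exactly XOR. All weight values are choices made for this illustration.

```python
import numpy as np

def step(v):
    """Hard-limiting activation: 1 if v >= 0, else 0."""
    return (np.asarray(v) >= 0).astype(int)

def xor_net(x1, x2):
    x = np.array([1.0, x1, x2])          # constant bias input prepended
    # hidden layer (hand-set weights): h1 = OR, h2 = NAND
    Wh = np.array([[-0.5,  1.0,  1.0],   # OR:   x1 + x2 - 0.5 >= 0
                   [ 1.5, -1.0, -1.0]])  # NAND: 1.5 - x1 - x2 >= 0
    h = step(Wh @ x)
    # output layer: AND of the two hidden units gives XOR
    wo = np.array([-1.5, 1.0, 1.0])
    return int(step(wo @ np.array([1.0, *h])))

for a in (0, 1):
    for b in (0, 1):
        print(a, b, xor_net(a, b))  # outputs 0, 1, 1, 0: the XOR truth table
```

These weights are fixed by hand; finding such weights automatically is precisely what the single-layer learning law cannot do, hence the need for a new training method.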