Each of these nets is detailed below.
ADALINE (ADA), or ADaptive LINear Element, is not technically a neural network, since
it consists only of a number of inputs feeding into a single output unit. ADALINEs are
often used in signal processing, and several ADALINE units can be combined to form a MADALINE (for
Many ADALINEs). ADALINE learns by comparing the net's output to the desired output
for each pattern and using the difference between the two to adjust the weights on the
connections between the inputs and output. Because it is a one-layer "net", the
patterns that ADALINE can learn must be linearly separable.
ADA uses either a bipolar discrete or a linear activation
function as detailed below.
ADALINE takes the following parameters:
max error - the desired value for the average squared difference between
desired and obtained outputs for a cycle.
mu - the learning constant. Usually between .05 and .10.
bias - often called the threshold value. The value you enter for bias will be
assigned as the activation for the "extra" neuron in the augmented input layer.
The bias value is typically +1.
inputs - the number of neurons in the input layer.
patterns - the number of patterns in the training set.
In ADALINE, a pattern is presented to the net and the output is calculated. The input to the output unit is simply the sum of the products of each input unit and its associated weight. Each weight is then changed according to the following:
weight change = mu * -2.0 * (desired_output - obtained_output) * input
The next pattern is then presented. When the average squared error (i.e., the average of the squared difference between desired_output and obtained_output over a cycle) falls below the desired value, training stops.
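For readers who prefer code to prose, here is a minimal sketch of the training loop just described, written in Python with NumPy. The parameter names (mu, bias, max error) mirror the ones above; the function name, the random weight initialization, and the choice of a maximum number of cycles are illustrative assumptions rather than Neural Net Lab's actual code, and the update is written in the usual delta-rule form so that the weights move to reduce the squared error.

    import numpy as np

    def train_adaline(patterns, targets, mu=0.05, bias=1.0,
                      max_error=0.01, max_cycles=1000):
        """Minimal ADALINE (delta rule) sketch.

        patterns : (n_patterns, n_inputs) array of training inputs
        targets  : (n_patterns,) array of desired outputs
        """
        n_patterns, n_inputs = patterns.shape
        # Augment each pattern with the bias unit, as described above.
        x = np.hstack([patterns, np.full((n_patterns, 1), bias)])
        w = np.random.uniform(-0.5, 0.5, n_inputs + 1)

        for cycle in range(max_cycles):
            sq_err = 0.0
            for p in range(n_patterns):
                net = np.dot(w, x[p])            # sum of input * weight products
                err = targets[p] - net           # desired_output - obtained_output
                w += 2.0 * mu * err * x[p]       # delta rule; reduces the squared error
                sq_err += err ** 2
            if sq_err / n_patterns < max_error:  # average squared error for the cycle
                break
        return w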
The bidirectional associative memory (BAM) is the simplest of the nets
implemented in Neural Net Lab. It attempts to form associations between pairs
of patterns so that, after learning, presentation of one of the items will result
in recovery of the other. The number of patterns that can be stored is equal to
the number of units in the smallest layer. The BAM algorithm used in the
Neural Net Lab was taken from Zurada (1992).
BAM uses a bipolar discrete activation function as detailed below.
BAM networks take the following parameters:
inputs - the number of units in the input layer.
outputs - the number of units in the output layer.
patterns - the number of patterns in the training set.
In a BAM, an input pattern is applied. Simultaneously, the desired output is
applied to the output units. Each weight is then adjusted by adding the
product of the input and output values associated with that weight. Training
stops when all patterns have been presented. The patterns must be orthogonal
for the net to perform optimally.
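As a rough illustration (not Neural Net Lab's source), the weight formation and recall just described can be sketched in Python as follows. The function names are assumptions, and the bipolar sign function simply implements the bipolar discrete activation mentioned above.

    import numpy as np

    def sgn(v):
        """Bipolar discrete activation: +1 when the net input is >= 0, -1 otherwise."""
        return np.where(v >= 0, 1, -1)

    def train_bam(inputs, outputs):
        """Form BAM weights by summing the input*output products for each bipolar pair."""
        W = np.zeros((inputs.shape[1], outputs.shape[1]))
        for x, y in zip(inputs, outputs):
            W += np.outer(x, y)      # each weight gets the product of its input and output
        return W

    def recall(W, x, max_steps=100):
        """Recover the associated output by bouncing activations between the two layers."""
        y = sgn(x @ W)
        for _ in range(max_steps):
            x_new = sgn(W @ y)       # output layer drives the input layer
            y_new = sgn(x_new @ W)   # input layer drives the output layer
            if np.array_equal(x_new, x) and np.array_equal(y_new, y):
                break                # stable pair reached
            x, y = x_new, y_new
        return x, y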
Backpropagation nets (BPN) are undoubtedly the most popular of the neural nets.
They learn by comparing the net's output to the desired output for each pattern
and using the difference between the two to adjust the weights on the
connections between units. BPN nets have at least three layers of neurons: an
input layer, an output layer, and a hidden layer. The BPN algorithm used here is
a modification of the one given in Zurada (1992). I've added a momentum
parameter to improve performance of the net.
BPN uses either a unipolar or bipolar continuous activation
function as detailed elsewhere.
BPN nets take the following parameters:
max error - the desired value for the average squared difference between
desired and obtained outputs for a cycle.
eta - the learning constant. Usually between .1 and .9.
lambda - Determines the steepness of the activation function. See the
discussion of the activation function below
for more information.
alpha - the momentum parameter. Usually between .1 and .7.
bias - often called the threshold value. The value you enter for bias will be
assigned as the activation for the "extra" neuron in the augmented layer.
The bias value is typically either +1 or -1.
inputs - the number of neurons in the input layer.
hidden - the number of neurons in the hidden layer.
outputs - you guessed it - the number of neurons in the output layer.
patterns - the number of patterns in the training set.
The formulas for a BPN net are just too darned difficult to present without
mathematical symbols. Any of the references
listed will go into much more detail than could be provided here.
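For those who would like something more concrete than prose, below is a minimal Python sketch of a three-layer BPN trained with the unipolar continuous (sigmoid) activation and a momentum term. It follows the parameter names given above (eta, alpha, lambda written as lam, bias, hidden, max error); the function names, weight initialization range, and layout are assumptions, not the program's actual code.

    import numpy as np

    def sigmoid(net, lam=1.0):
        """Unipolar continuous activation; lam sets the steepness."""
        return 1.0 / (1.0 + np.exp(-lam * net))

    def train_bpn(x, d, hidden=4, eta=0.5, alpha=0.3, lam=1.0, bias=1.0,
                  max_error=0.01, max_cycles=10000):
        """x: (patterns, inputs), d: (patterns, outputs); returns the two weight matrices."""
        n_pat, n_in = x.shape
        n_out = d.shape[1]
        x = np.hstack([x, np.full((n_pat, 1), bias)])           # augmented input layer
        V = np.random.uniform(-0.5, 0.5, (hidden, n_in + 1))    # input -> hidden weights
        W = np.random.uniform(-0.5, 0.5, (n_out, hidden + 1))   # hidden -> output weights
        dV_prev = np.zeros_like(V)
        dW_prev = np.zeros_like(W)

        for cycle in range(max_cycles):
            sq_err = 0.0
            for p in range(n_pat):
                # Forward pass.
                h = sigmoid(V @ x[p], lam)
                h = np.append(h, bias)                           # augmented hidden layer
                o = sigmoid(W @ h, lam)
                err = d[p] - o
                sq_err += np.sum(err ** 2)
                # Backward pass: error signals use the derivative of the sigmoid.
                delta_o = err * lam * o * (1.0 - o)
                delta_h = (W[:, :-1].T @ delta_o) * lam * h[:-1] * (1.0 - h[:-1])
                # Weight updates with momentum (alpha) applied to the previous change.
                dW = eta * np.outer(delta_o, h) + alpha * dW_prev
                dV = eta * np.outer(delta_h, x[p]) + alpha * dV_prev
                W += dW
                V += dV
                dW_prev, dV_prev = dW, dV
            if sq_err / n_pat < max_error:                       # average squared error per cycle
                break
        return V, W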
A counterpropagation network (CPN) consists of two layers: the first is a
self-organizing Kohonen layer that clusters the training patterns based on
their similarity to one another. The measure of similarity uses a Euclidean
distance metric. Once the Kohonen layer has been trained, the second
layer - an Outstar layer - is trained in a supervised mode to assign the
clusters to a desired classification/category.
CPN nets take the following parameters:
alpha - the learning rate for the Kohonen layer. Usually set initially
between .1 and .7. Over the course of training, alpha gets smaller
(see the manual for the formula).
beta - the learning rate for the Outstar layer. Functions in the same
manner as alpha.
Max change - The largest acceptable weight change in the Kohonen layer
for each element of a pattern.
out error - the desired value for the average squared difference between the desired
output of the Outstar layer neurons and what was actually obtained.
scaling - uses the number of "wins" a given neuron has had to
adjust the distance measure. This parameter promotes the formation
of more than a single cluster.
inputs - the number of neurons in the input layer.
outputs - the number of neurons in the output layer.
patterns - the number of patterns in the training set.
The precise formulas for a CPN net are too difficult to present without
mathematical symbols. Any of the references
listed will go into much more detail than could be provided here. Briefly, in English,
what happens is this: a pattern is presented. The pattern is then compared to the
weight vector for each "output" unit. The vector most similar to the input is declared
the "winner" and it gets adjusted to more closely resemble the input. No other weights
are adjusted. When the average weight change becomes small enough, training of the Kohonen
layer stops. In this way, similar patterns are grouped into clusters.
The Outstar (Grossberg) layer is then trained to assign a given output to each cluster.
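As an illustration of that last step, here is a small Python sketch of the supervised Outstar stage, assuming the Kohonen layer has already been trained. The winner is found with the Euclidean distance described above, and its outgoing weights are pulled toward the desired output at rate beta; the function name, data layout, and stopping test are assumptions rather than the program's exact procedure.

    import numpy as np

    def train_outstar(patterns, targets, kohonen_w, beta=0.3,
                      out_error=0.01, max_cycles=1000):
        """Supervised Outstar (Grossberg) stage of a CPN.

        kohonen_w : (clusters, inputs)  weight vectors of the trained Kohonen layer
        targets   : (patterns, outputs) desired classification for each pattern
        """
        n_clusters = kohonen_w.shape[0]
        n_out = targets.shape[1]
        grossberg_w = np.zeros((n_clusters, n_out))

        for cycle in range(max_cycles):
            sq_err = 0.0
            for x, d in zip(patterns, targets):
                # Euclidean distance to every Kohonen weight vector; smallest wins.
                winner = np.argmin(np.linalg.norm(kohonen_w - x, axis=1))
                out = grossberg_w[winner]                # the cluster's current output
                grossberg_w[winner] += beta * (d - out)  # move toward the desired output
                sq_err += np.sum((d - out) ** 2)
            if sq_err / len(patterns) < out_error:       # average squared output error
                break
        return grossberg_w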
A Kohonen net (KOH) is a self-organizing net that, unlike the other nets
in Neural Net Lab, learns in an unsupervised mode. You do not need to
provide a file of desired classifications; the net will cluster patterns based
on their similarity. In other words, given a set of patterns, the net will
attempt to discover the regularities in them. The results can often be
surprising.
KOH nets take the following parameters:
alpha - the learning rate for the Kohonen layer. Usually set initially
between .1 and .7. Over the course of training, alpha gets smaller
(see the manual for the formula).
Max change - The largest acceptable weight change in the Kohonen layer
for each element of a pattern.
scaling - uses the number of "wins" a given neuron has had to
adjust the distance measure. This parameter promotes the formation
of more than a single cluster.
inputs - the number of neurons in the input layer.
outputs - the number of neurons in the output layer.
patterns - the number of patterns in the training set.
The precise formulas for a Kohonen layer are too difficult to present without
mathematical symbols. Any of the references
listed will go into much more detail than could be provided here. Briefly, in English,
what happens is this: a pattern is presented. The pattern is then compared to the
weight vector for each output unit. The vector most similar to the input is declared
the "winner" and it gets adjusted to more closely resemble the input. No other weights
are adjusted. When the average weight change becomes small enough, training of the Kohonen
layer stops. In this way, similar patterns are grouped into clusters.
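To make the winner-take-all idea concrete, here is a small Python sketch of the clustering step. It uses the parameter names above (alpha, max change, and scaling via win counts); the particular scaling formula and the rate at which alpha shrinks are placeholders, since the manual gives the exact formulas.

    import numpy as np

    def train_kohonen(patterns, outputs=4, alpha=0.5, max_change=0.001, max_cycles=1000):
        """Winner-take-all Kohonen clustering sketch; no desired outputs are needed."""
        n_pat, n_in = patterns.shape
        w = np.random.uniform(0.0, 1.0, (outputs, n_in))  # one weight vector per output unit
        wins = np.ones(outputs)                            # win counts, used for scaling

        for cycle in range(max_cycles):
            biggest_change = 0.0
            for x in patterns:
                # Euclidean distance to each weight vector, scaled by past wins so one
                # unit cannot capture every pattern (one common form of the "scaling"
                # parameter; the manual's exact formula may differ).
                dist = np.linalg.norm(w - x, axis=1) * wins
                winner = np.argmin(dist)
                change = alpha * (x - w[winner])           # pull the winner toward the input
                w[winner] += change
                wins[winner] += 1
                biggest_change = max(biggest_change, np.max(np.abs(change)))
            alpha *= 0.99                                  # alpha shrinks over training (placeholder rate)
            if biggest_change < max_change:                # stop when weight changes are small enough
                break
        return w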