Club Amiga de Montreal

Text File | 1992-03-09 | 16.3 KB | 320 lines

Neural networks are very useful for solving problems where you know the inputs that produce some output, but you have no idea how they are related. Some simple examples are the weather, speech recognition, and vision. It is very hard to program a routine which will recognize speech when you don't even know what makes the word "network" sound like the word "network". Neural networks are good classifiers and are able to learn why an input generates a certain output.

To train a neural network, you first show it many examples of inputs for which you know the correct output. After a while, the network will generate the correct output for each of those inputs. If the network has learned correctly, then the next time you show it an input which it has never seen before, it should give the correct output.

Now that some of you are totally confused, let's do an example.

Problem: Let us say that we are trying to predict the weather for the next day. We believe that tomorrow's weather depends upon today's temperature, sky conditions (sunny, cloudy, rainy), wind, barometric pressure, and humidity. Assuming that this belief is correct (it is not, but let's keep the example simple), all we have to do is give the inputs to the network and tell it what has happened in the past. We have recorded 100 days of readings (inputs) and the next day's weather (output). Now we show the network the inputs and tell it what the outputs must look like based on our data. After training the network for a while, all 100 input examples will generate the correct output (prediction of tomorrow's weather). If the network has sufficient information, then when we give it today's weather and ask it what tomorrow should be like, it should predict correctly.

Now the question is, how do we implement a neural network?

The network design is a two hidden layer, feed-forward, fully-connected network.
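As a rough illustration of what "two hidden layer, feed-forward, fully-connected" means, here is a minimal sketch of a forward pass. It is not taken from the Neural_network class. The names Vec, Mat, sigmoid, layer_forward and feed_forward are hypothetical, the logistic activation is an assumption (this text does not say which squashing function the class uses), and bias terms are left out to keep the sketch short.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper types for this sketch only.
using Vec = std::vector<double>;
using Mat = std::vector<Vec>;   // weights[from][to]

// Logistic squashing function (an assumption, see the note above).
double sigmoid(double x) { return 1.0 / (1.0 + std::exp(-x)); }

// Propagate one layer. "Fully connected" means every node in `in`
// feeds every node of the next layer through its own weight.
Vec layer_forward(const Vec& in, const Mat& weights) {
    assert(!weights.empty());
    assert(in.size() == weights.size());
    Vec out(weights[0].size(), 0.0);
    for (std::size_t to = 0; to < out.size(); ++to) {
        double sum = 0.0;
        for (std::size_t from = 0; from < in.size(); ++from)
            sum += in[from] * weights[from][to];
        out[to] = sigmoid(sum);
    }
    return out;
}

// "Feed-forward" means activations only flow
// input -> hidden1 -> hidden2 -> output, never backwards or across layers.
Vec feed_forward(const Vec& input, const Mat& w_in,
                 const Mat& w_h1, const Mat& w_h2) {
    Vec hidden1 = layer_forward(input, w_in);
    Vec hidden2 = layer_forward(hidden1, w_h1);
    return layer_forward(hidden2, w_h2);
}
```

With every weight set to zero, each node sums to zero and the logistic function returns 0.5, which makes a convenient sanity check for the layer sizes.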
The size of each layer (input, hidden1, hidden2, and output) is set at run-time and can be as large as your computer's memory allows.

The INPUTS: The inputs must be between +-1.0.

The OUTPUTS: Generally each output will be either 1 or 0. For the weather problem, a 1 at output 1 may indicate sunny while a 0 indicates rain. Likewise, output 2 could indicate windy or not windy.

To train the network:

STEP 1: You first construct the Neural_network with the size and the learning parameters that you want. You may instead read in the size and previously trained weights from a file.

STEP 2: You call calc_forward () with a known input, which calculates the actual output.

STEP 3: You call back_propagation () with the desired output, which will compare the actual output with the desired output and then calculate how to change the connections to reduce the difference.

STEP 4: Go back to step 2 with another known input until all known inputs have been shown once.

STEP 5: Call update_weights (), which will actually change all the interconnections the way back_propagation () calculated they should.

STEP 6: Go back to step 2 and show all the known inputs again until the actual output is close to (usually within 0.1 of) the desired output.

STEP 7: Save the weights (connections) to a file.

See the programs xor_dbd.cc and xor_bp.cc for an example of this basic procedure.

//****************************************************************************
//
//  Neural_network class:
//
//      This class performs all the necessary functions needed to train
//      a Neural Network.  The network has an input layer, two hidden
//      layers, and an output layer.  The size of each layer is specified
//      at run time so there is no restriction on size except memory.
//      This is a feed-forward network with full connections from one
//      layer to the next.
//
//      The network can perform straight back-propagation with no
//      modifications (Rumelhart, Hinton, and Williams, 1985) which
//      will find a solution but not very quickly.
//      The network can also perform back-propagation with the
//      delta-bar-delta rule developed by Robert A. Jacobs, University
//      of Massachusetts (Neural Networks, Vol. 1, pp. 295-307, 1988).
//      The basic idea of this rule is that every weight has its own
//      learning rate and each learning rate should be continuously
//      changed according to the following rules -
//      - If the weight changes in the same direction as the previous update,
//        then the learning rate for that weight should increase by a constant.
//      - If the weight changes in the opposite direction to the previous
//        update, then the learning rate for that weight should decrease
//        exponentially.
//
//      learning rate = e(t) for each individual weight
//      The exact formula for the change in learning rate (DELTA e(t)) is
//
//                      |  K            if DELTA_BAR(t-1)*DELTA(t) > 0
//      DELTA e(t)  =   |  -PHI*e(t)    if DELTA_BAR(t-1)*DELTA(t) < 0
//                      |  0            otherwise
//
//      where DELTA(t) = dJ(t) / dw(t)  ---> Partial derivative
//
//      and DELTA_BAR(t) = (1 - THETA)*DELTA(t) + THETA*DELTA_BAR(t-1).
//
//      For full details of the algorithm, read the article in
//      Neural Networks.
//
//
//      To perform straight back-propagation, just construct a Neural_network
//      with no learning parameters specified (they default to straight
//      back-propagation) or set them to
//          K = 0, PHI = 0, THETA = 1.0
//
//      However, using the delta-bar-delta rule should increase your rate of
//      convergence by a factor of 10 to 100 generally.  The parameters for
//      the delta-bar-delta rule I use are
//          K = 0.025, PHI = 0.2, THETA = 0.8
//
//      One more heuristic method has been employed in this Neural net class-
//      the skip heuristic.  This is something I thought of and I am sure
//      other people have also.  If the output activation is within
//      skip_epsilon of its desired value for each output, then the
//      calc_forward routine returns the skip_flag = 1.  This allows you to
//      not waste time trying to push already very close examples to the
//      exact value.
//      If the skip_flag comes back '1', then don't bother calculating forward
//      or back-propagating the example for X number of epochs.  You must
//      write the routine to skip the example yourself, but the Neural_network
//      will tell you when to skip the example.  This heuristic also has the
//      advantage of reducing memorization and increasing generalization.
//      Typical values I use for this heuristic -
//          skip_epsilon = 0.01 - 0.05
//          number skipped = 2-10.
//
//      Experiment with all the values to see which work best for your
//      application.
//
//
//      Comments and suggestions are welcome and can be emailed to me
//          anstey@sun.soe.clarkson.edu
//
//****************************************************************************

//***********************************************************************
//  Constructors :
//      Full size specifications and learning parameters.
//      Learning parameters are provided defaults which are set to
//      just use the BP algorithm with no modifications.
//
//      Read constructor which reads in the size and all the weights from
//      a file.  The network is resized to match the size specified
//      by the file.  Learning parameters must be specified
//      separately.
//***********************************************************************
Neural_network (int number_inputs = 1, int number_hidden1 = 1,
                int number_hidden2 = 1, int number_outputs = 1,
                double t_epsilon = 0.1, double t_skip_epsilon = 0.0,
                double t_learning_rate = 0.1, double t_theta = 1.0,
                double t_phi = 0.0, double t_K = 0.0, double range = 3.0);

Neural_network (char *filename, int& file_error, double t_epsilon = 0.1,
                double t_skip_epsilon = 0.0, double t_learning_rate = 0.1,
                double t_theta = 1.0, double t_phi = 0.0, double t_K = 0.0);

~Neural_network ();

//**************************************************************************
//  Weight parameter routines:
//      save_weights : This routine saves the weights of the network
//          to the file <filename>.
//
//      read_weights : This routine reads the weight values from the file
//          <filename>.  The network is automatically resized to the
//          size specified by the file.
//
//      Activation routines return the node activation after a calc_forward
//          has been performed.
//
//      get_weight routines return the weight between node1 and node2.
//
//**************************************************************************
int save_weights (char *filename);
int read_weights (char *filename);

double get_hidden1_activation (int node);
double get_hidden2_activation (int node);
double get_output_activation (int node);

double get_input_weight (int input_node, int hidden1_node);
double get_hidden1_weight (int hidden1_node, int hidden2_node);
double get_hidden2_weight (int hidden2_node, int output_node);

//*******************************************************************
//  Size parameters of network.
//  The size of the network may be changed at any time.  The weights
//  will be copied from the old size to the new size.  If the new
//  size is larger, then the extra weights will be randomly set
//  between +-range.  The matrices used to hold learning updates
//  and activations will be re-initialized (cleared).
//*******************************************************************
int get_number_of_inputs ();
int get_number_of_hidden1 ();
int get_number_of_hidden2 ();
int get_number_of_outputs ();

void set_size_parameters (int number_inputs, int number_hidden1,
                          int number_hidden2, int number_outputs,
                          double range = 3.0);

//*******************************************************************
//  Learning parameters functions.  These parameters may be changed
//  on the fly.  The learning rate and K may have to be reduced as
//  more and more training is done to prevent oscillations.
//*******************************************************************
void set_epsilon (double eps);
void set_skip_epsilon (double eps);
void set_learning_rate (double l_rate);
void set_theta (double t_theta);
void set_phi (double t_phi);
void set_K (double t_K);

double get_epsilon ();
double get_skip_epsilon ();
double get_learning_rate ();
double get_theta ();
double get_phi ();
double get_K ();
long get_iterations ();

//**************************************************************************
//  The main neural network routines:
//
//      The network input is an array of doubles which has a size of
//          number_inputs.
//      The network desired output is an array of doubles which has a size
//          of number_outputs.
//
//      back_propagation : Calculates how each weight should be changed.
//          Assumes that calc_forward has been called just prior to
//          this routine to calculate all of the node activations.
//
//      calc_forward : Calculates the output for a given input.  Finds
//          all node activations which are needed for back_propagation
//          to calculate weight adjustment.  Returns abs (error).
//          The parameter skip is for use with the skip_epsilon
//          parameter.  What it means is if the output is within
//          skip_epsilon of the desired, then it is so close that it
//          should be skipped from being calculated the next X times.
//          Careful use of this parameter can significantly increase
//          the rate of convergence and also help prevent over-learning.
//
//      calc_forward_test : Calculates the output for a given input.  This
//          routine is used for testing rather than training.  It returns
//          whether the test was CORRECT, GOOD or WRONG which is
//          determined by the parameters correct_epsilon and
//          good_epsilon.  CORRECT > GOOD > WRONG.
//
//      update_weights : Actually adjusts all the weights according to
//          the calculations of back_propagation.  This routine should
//          be called at the end of every training epoch.
//          The weights can be updated by the straight BP algorithm, or
//          by the delta-bar-delta algorithm developed by Robert A. Jacobs
//          which increases the rate of convergence generally by at
//          least a factor of 10.  The parameters THETA, PHI, and K
//          determine which algorithm is used.  The default settings
//          for these parameters cause update_weights to use the straight
//          BP algorithm.
//
//      kick_weights : This routine changes all weights by a random amount
//          within +-range.  It is useful in case the network gets
//          'stuck' and is having trouble converging to a solution.  I
//          use it when the number wrong has not changed for the last 200
//          epochs.  Getting the range right will take some trial and
//          error as it depends on the application and the weights'
//          actual values.
//
//**************************************************************************
void back_propagation (double input [], double desired_output [],
                       int& done);

double calc_forward (double input [], double desired_output [],
                     int& num_wrong, int& skip, int print_it,
                     int& actual_printed);

int calc_forward_test (double input [], double desired_output [],
                       int print_it, double correct_eps, double good_eps);

void update_weights ();

void kick_weights (double range);
};

KNOWN BUGS:

There are no known bugs in my code (does not mean there aren't any), but there is one bug with the Aztec C 5.0a m8 library. The fscanf routine does not correctly read in floating point numbers from a text file so you must use a different math library to compile and link.

This code has been compiled with g++, gcc, and Aztec C 5.0a. The C version has not been fully tested but it seems to work just fine. I also do not plan to continue revising the C version, just the C++ version. However, I will fix any bugs in either version.

You can email me at anstey@sun.soe.clarkson.edu

Hope you like the code!
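
For readers who want to see the delta-bar-delta formula from the class comments in code form, here is a minimal sketch of one update for a single weight. It is an illustration only and not the body of update_weights (). The struct WeightState and the function delta_bar_delta_step are hypothetical names, and the order of operations (adjust the learning rate first, then take the gradient step with the new rate) is an assumption about how the rule is applied.

```cpp
#include <cassert>

// Hypothetical per-weight state; the names are illustrative only.
struct WeightState {
    double weight;         // w(t)
    double learning_rate;  // e(t), one learning rate per weight
    double delta_bar;      // DELTA_BAR(t-1), running average of past gradients
};

// One delta-bar-delta step for a single weight.
// `gradient` is DELTA(t) = dJ(t)/dw(t) for this weight.
void delta_bar_delta_step(WeightState& s, double gradient,
                          double K, double PHI, double THETA) {
    assert(THETA >= 0.0 && THETA <= 1.0);

    const double agreement = s.delta_bar * gradient;
    if (agreement > 0.0)
        s.learning_rate += K;                      // same direction: add a constant
    else if (agreement < 0.0)
        s.learning_rate -= PHI * s.learning_rate;  // direction flipped: shrink by a fraction
    // agreement == 0: learning rate is left unchanged

    // Plain gradient-descent step with the adjusted rate (assumed form).
    s.weight -= s.learning_rate * gradient;

    // DELTA_BAR(t) = (1 - THETA)*DELTA(t) + THETA*DELTA_BAR(t-1)
    s.delta_bar = (1.0 - THETA) * gradient + THETA * s.delta_bar;
}
```

With K = 0, PHI = 0 and THETA = 1.0 the learning rate never changes, so the sketch reduces to straight back-propagation with a fixed rate, which matches the default parameter settings described above.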