Newsgroups: comp.ai.neural-nets
Path: sparky!uunet!sequent!muncher.sequent.com!jjb
From: jjb@sequent.com (Jeff Berkowitz)
Subject: Summary of CUPS + new question
Message-ID: <1992Sep9.061803.28712@sequent.com>
Keywords: backprop implementation errors (?)
Sender: usenet@sequent.com (usenet)
Nntp-Posting-Host: eng3.sequent.com
Organization: Sequent Computer Systems, Inc.
Date: Wed, 9 Sep 92 06:18:03 GMT
Lines: 56

Some weeks back I posted a request for real examples of the performance
of back propagation simulators in "backprops/second."  Bill Armstrong
at the University of Alberta, Canada pointed out in a posting that the
accepted unit of measurement was CUPS (connection updates per second).
He also pointed me at ALNs (adaptive logic networks) for use when
performance was important.

However, I received no replies reporting real-world performance figures.
Now, the question.  While I was trying to debug my backprop simulator,
my wife discovered what appears to be a subtle difference between the
precise description of backprop and several of the "C" implementations
I've picked up via ftp.

In the "original" paper (Rumelhart, Hinton, Williams, "Learning
Internal Representations by Error Propagation", 1986) the backward
pass is described as follows:

    ...The first step is to compute delta for each of
    the output units.  [...]  We can then compute the
    weight changes for all connections that feed into
    the final layer.  AFTER this is done, then compute
    deltas for all units in the penultimate layer...
    [etc.; emphasis mine.]

Dayhoff, "Neural Network Architectures", p. 67, is even more specific:

    Delta values are computed for the output layer...
    its incoming weights are adjusted...delta values
    are computed for the hidden layer.

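For reference, here is the generalized delta rule that both books are
describing, in roughly the paper's notation (my own transcription, with
\eta the learning rate and f the unit squashing function):

    % delta for an output unit k with target t_k and output o_k
    \delta_k = (t_k - o_k) \, f'(net_k)
    % weight change for the connection from unit j into unit k
    \Delta w_{jk} = \eta \, \delta_k \, o_j
    % delta for a hidden unit j, summed over the units k it feeds
    \delta_j = f'(net_j) \sum_k \delta_k \, w_{jk}

The question below comes down to whether the w_{jk} in the hidden-unit
sum is taken from before or after the \Delta w_{jk} update.
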
I have found two implementations, however, in which the inner loops
of the backprop do something like this:

    compute deltas for output layer;
    for layer := prev(output) downto input
        for each element in layer
            for each outgoing arc from element
                (1) accumulate (destination unit's delta * weight
                    of arc) into this element's "error"
                (2) adjust weight of arc
            done -- arcs from element
            use "error" to compute this element's delta
        done -- elements in layer
    done -- layer

It appears to me that (1) and (2) are reversed in this pseudocode;
as I read the description, I should change the weight FIRST as I back
up, and then use the NEW value in the accumulated error.  At least
the Dayhoff description pretty much states this in black and white.

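To make the two orderings concrete, here is a minimal C sketch of one
hidden layer's backward pass, written the way the ftp'd programs seem
to do it.  Every name in it (backward_hidden, ETA, delta_out, and so
on) is my own invention for illustration, not taken from any of those
programs:

    /* One hidden layer's backward pass.  w[j*nout + k] is the
     * weight from hidden unit j to output unit k, out[j] is unit
     * j's forward-pass output, fprime[j] its f'(net_j), and
     * delta_out[k] the already-computed delta of output unit k.
     */
    #define ETA 0.1                     /* learning rate (assumed) */

    void backward_hidden(int nhid, int nout, double *w,
                         double *out, double *fprime,
                         double *delta_out, double *delta_hid)
    {
        int j, k;
        double err;

        for (j = 0; j < nhid; j++) {
            err = 0.0;
            for (k = 0; k < nout; k++) {
                /* (1) accumulate the error through the OLD
                 *     weight value...                           */
                err += delta_out[k] * w[j * nout + k];
                /* (2) ...then adjust that weight.  Swapping (1)
                 *     and (2) makes the error flow through the
                 *     NEW weight, which is how I read the
                 *     Rumelhart and Dayhoff descriptions.       */
                w[j * nout + k] += ETA * delta_out[k] * out[j];
            }
            delta_hid[j] = fprime[j] * err;
        }
    }

Swapping lines (1) and (2) is a two-line change, so either behavior is
easy to produce for comparison.
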
Either I'm confused, or this is a very common implementation error.
If it's an implementation error, can anybody suggest what its effect
is?  I found this in programs that appear to have been used quite a
bit.
--
Jeff Berkowitz, Sequent Computer Systems   jjb@sequent.com   uunet!sequent!jjb