NetNews Usenet Archive 1992 #18

Internet Message Format | 1992-08-20 | 2.8 KB

Path: sparky!uunet!cis.ohio-state.edu!pacific.mps.ohio-state.edu!linac!uwm.edu!ogicse!das-news.harvard.edu!cantaloupe.srv.cs.cmu.edu!crabapple.srv.cs.cmu.edu!news
From: sef@sef-pmax.slisp.cs.cmu.edu
Newsgroups: comp.ai.neural-nets
Subject: Re: Reducing Training time vs Generalisation
Message-ID: <BtBJCw.M2n.1@cs.cmu.edu>
Date: 21 Aug 92 05:31:43 GMT
Article-I.D.: cs.BtBJCw.M2n.1
Sender: news@cs.cmu.edu (Usenet News System)
Organization: School of Computer Science, Carnegie Mellon
Lines: 53
Nntp-Posting-Host: sef-pmax.slisp.cs.cmu.edu

    From: arms@cs.UAlberta.CA (Bill Armstrong)

    Now let's see what it takes to get lazy evaluation: first of all, I
    think you would have to insist that the sign of all weights on an
    element be positive, and all signals in the net too.  Otherwise in
    forming a weighted sum of inputs, you can not be sure you are on one
    side of the sharp threshold or not until you have evaluated all
    inputs (not lazy!).  I think the signals would have to be bounded
    too.  I think this would be OK.  ALNs are still faster, because they
    don't do arithmetic, but ALNs don't have as powerful nodes.

For lazy evaluation, the inputs and hidden unit values would have to be
bounded, but I think you could use bipolar weights.  Sort the weights
for each unit by magnitude.  Evaluate subtrees in order, biggest weight
first.  Give up when no combination of remaining weights times the input
limits could pull the total back across threshold.

In fact, you wouldn't have to replace the sigmoids with thresholds.
Divide the sigmoid into three regions: saturated-on, saturated-off, and
in-between.  If you find yourself in one of the saturated regions and no
combination of other inputs can pull the net back into the in-between
region, stop evaluating and flush the remaining sub-trees.

    One argument for going whole hog into ALNs is that you don't have to
    train using sigmoids, then risk damaging the result of learning by
    going to sharp thresholds.  If there were a training procedure for
    networks of the above kind of node with a sharp threshold, that
    would be very promising.  I thought backprop required
    differentiability to work though.

It does.  The Perceptron learning rule (and variants such as Gallant's
pocket algorithm) can train sharp-threshold units, but not multiple
layers.

    >Myself, I prefer to think in terms of parallel hardware, so lazy evaluation
    >isn't an issue.

    Not true!  If you have a fixed amount of hardware, then to do large
    problems, you will have to iterate it...

Sorry, I should have said "sufficiently parallel hardware", meaning you
don't have to share.

-- Scott
===========================================================================
Scott E. Fahlman
School of Computer Science
Carnegie Mellon University
5000 Forbes Avenue
Pittsburgh, PA 15213

Internet: sef+@cs.cmu.edu
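
Editor's sketch: the lazy-evaluation scheme described in the post (sort
weights by magnitude, evaluate the biggest first, give up once the
remaining inputs cannot pull the total back across threshold) might look
roughly like the Python below for a single sharp-threshold unit.  This is
not code from the thread.  The function name lazy_threshold_unit, the
representation of subtrees as zero-argument callables, the default input
bounds lo=0.0 and hi=1.0, and the convention that the unit fires when the
sum is >= threshold are all assumptions made for illustration.

```python
from typing import Callable, Sequence, Tuple


def lazy_threshold_unit(
    weights: Sequence[float],
    inputs: Sequence[Callable[[], float]],
    threshold: float,
    lo: float = 0.0,
    hi: float = 1.0,
) -> Tuple[int, int]:
    """Evaluate one sharp-threshold unit lazily.

    Each entry of `inputs` is a callable standing in for a subtree whose
    value is assumed to lie in [lo, hi].  Weights may be positive or
    negative.  Returns (output, number_of_inputs_actually_evaluated).
    """
    # Visit inputs in order of decreasing weight magnitude.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]),
                   reverse=True)

    # Largest and smallest total the unevaluated inputs could still add.
    max_rest = sum(max(w * lo, w * hi) for w in weights)
    min_rest = sum(min(w * lo, w * hi) for w in weights)

    total = 0.0
    evaluated = 0
    for i in order:
        # Even the worst case keeps us at or above threshold: fire.
        if total + min_rest >= threshold:
            return 1, evaluated
        # Even the best case leaves us below threshold: stay off.
        if total + max_rest < threshold:
            return 0, evaluated
        w = weights[i]
        max_rest -= max(w * lo, w * hi)
        min_rest -= min(w * lo, w * hi)
        total += w * inputs[i]()  # only now is the subtree evaluated
        evaluated += 1
    return (1 if total >= threshold else 0), evaluated


# Hypothetical unit: the input on the weight of 4.0 is off, so the
# remaining weights cannot reach the threshold and evaluation stops.
weights = [1.0, 4.0, -2.0, 0.5]
inputs = [lambda: 1.0, lambda: 0.0, lambda: 1.0, lambda: 1.0]
print(lazy_threshold_unit(weights, inputs, threshold=3.0))  # (0, 1)
```

The three-region sigmoid variant would presumably use the same running
bounds with two cutoffs instead of one: stop once the total is certain to
stay in the saturated-on region or certain to stay in the saturated-off
region, and otherwise keep evaluating.  With floating-point weights the
running bounds can accumulate rounding error, so a real implementation
might want a small safety margin on the early exits.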