NetNews Usenet Archive 1992 #18


Text File | 1992-08-22 | 3.1 KB | 71 lines

Newsgroups: comp.ai.neural-nets
Path: sparky!uunet!elroy.jpl.nasa.gov!sdd.hp.com!caen!destroyer!ubc-cs!unixg.ubc.ca!kakwa.ucs.ualberta.ca!alberta!arms
From: arms@cs.UAlberta.CA (Bill Armstrong)
Subject: Re: Reducing Training time vs Generalisation
Message-ID: <arms.714517521@spedden>
Sender: news@cs.UAlberta.CA (News Administrator)
Nntp-Posting-Host: spedden.cs.ualberta.ca
Organization: University of Alberta, Edmonton, Canada
References: <BtBJCw.M2n.1@cs.cmu.edu>
Date: Sat, 22 Aug 1992 21:05:21 GMT
Lines: 58

sef@sef-pmax.slisp.cs.cmu.edu writes:

>    From: arms@cs.UAlberta.CA (Bill Armstrong)
>
>    Now let's see what it takes to get lazy evaluation: first of
>    all, I think you would have to insist that the sign of all weights on
>    an element be positive, and all signals in the net too.  Otherwise in
>    forming a weighted sum of inputs, you cannot be sure whether you are on
>    one side of the sharp threshold or not until you have evaluated all
>    inputs (not lazy!).  I think the signals would have to be bounded too.
>    I think this would be OK.  ALNs are still faster, because they don't
>    do arithmetic, but ALNs don't have as powerful nodes.
>
>For lazy evaluation, the inputs and hidden unit values would have to be
>bounded, but I think you could use bipolar weights.  Sort the weights for
>each unit by magnitude.  Evaluate subtrees in order, biggest weight first.
>Give up when no combination of remaining weights times the input limits
>could pull the total back across threshold.
>
>In fact, you wouldn't have to replace the sigmoids with thresholds.  Divide
>the sigmoid into three regions: saturated-on, saturated-off, and in-between.
>If you find yourself in one of the saturated regions and no combination of
>other inputs can pull the net back into the in-between region, stop
>evaluating and flush the remaining sub-trees.

I agree with everything you have said.  It becomes clear that the
non-saturated portion of the sigmoid is the costly part of evaluation,
doesn't it?

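The pruning rule described in the quoted text can be sketched in Python as follows. This is only a minimal illustration under stated assumptions, not code from either poster: the name `lazy_threshold_unit`, the threshold fixed at zero (with the bias folded into the sum), the default input bounds [0, 1], and the representation of subtrees as zero-argument callables are all hypothetical choices made for the example.

```python
def lazy_threshold_unit(weights, subtrees, bias=0.0, lo=0.0, hi=1.0):
    """Sharp-threshold unit with lazy evaluation of its inputs.

    weights  -- one weight per input; signs may be mixed (bipolar)
    subtrees -- zero-argument callables, each returning a value in [lo, hi]
    Returns (output, number_of_subtrees_evaluated), where output is 1 if
    bias + sum(w * x) >= 0 and 0 otherwise.
    """
    # Visit inputs in order of decreasing weight magnitude.
    order = sorted(range(len(weights)), key=lambda i: -abs(weights[i]))
    n = len(order)

    # rem_min[k] / rem_max[k]: the least / greatest total contribution that
    # the not-yet-evaluated inputs order[k:] could still make.
    rem_min = [0.0] * (n + 1)
    rem_max = [0.0] * (n + 1)
    for k in range(n - 1, -1, -1):
        w = weights[order[k]]
        rem_min[k] = rem_min[k + 1] + min(w * lo, w * hi)
        rem_max[k] = rem_max[k + 1] + max(w * lo, w * hi)

    total = bias
    for k in range(n):
        if total + rem_min[k] >= 0.0:
            return 1, k   # nothing left can pull the sum below threshold
        if total + rem_max[k] < 0.0:
            return 0, k   # nothing left can pull the sum back above it
        i = order[k]
        total += weights[i] * subtrees[i]()   # evaluate this subtree now
    return (1 if total >= 0.0 else 0), n


# Usage: the dominant input alone settles the output, so the other
# two subtrees are never evaluated.
inputs = [lambda: 1.0, lambda: 1.0, lambda: 1.0]
print(lazy_threshold_unit([5.0, 1.0, -1.0], inputs, bias=-2.0))  # (1, 1)
```

The same bookkeeping would presumably carry over to the three-region sigmoid idea: instead of testing against a single threshold, one would stop once the interval [total + rem_min, total + rem_max] lies wholly inside one saturated region.
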
>    One argument for going whole hog into ALNs is that you don't have to
>    train using sigmoids, then risk damaging the result of learning by
>    going to sharp thresholds.  If there were a training procedure for
>    networks of the above kind of node with a sharp threshold, that would
>    be very promising.  I thought backprop required differentiability to
>    work though.
>
>It does.  The Perceptron learning rule (and variants such as Gallant's
>pocket algorithm) can train sharp-threshold units, but not multiple layers.
>

OK, then you have to admit that since an ALN is a multi-layer
perceptron, and since the ALN training procedure works, and since this
fact has been in the scientific and patent literature for twenty
years, it is about time that people revise their thinking: multilayer
perceptrons *can* be trained.  Scientists, for whatever reason, seem to
prefer to downplay the fact.  Now, maybe if I could walk on water ...

Thanks for your comments.

Bill
--
***************************************************
Prof. William W. Armstrong, Computing Science Dept.
University of Alberta; Edmonton, Alberta, Canada T6G 2H1
arms@cs.ualberta.ca Tel(403)492 2374 FAX 492 1071