- Path: sparky!uunet!usc!sdd.hp.com!uakari.primate.wisc.edu!ames!lll-winken!tazdevil!henrik
- From: henrik@mpci.llnl.gov (Henrik Klagges)
- Newsgroups: comp.ai.neural-nets
- Subject: Re: Reducing Training time vs Generalisation
- Message-ID: <?.714340347@tazdevil>
- Date: 20 Aug 92 19:52:27 GMT
- References: <Bt9GIx.9In.1@cs.cmu.edu>
- Sender: usenet@lll-winken.LLNL.GOV
- Lines: 32
- Nntp-Posting-Host: tazdevil.llnl.gov
-
- sef@sef-pmax.slisp.cs.cmu.edu writes:
- >For example, in the example about the big
- >gaussian spike, it would drive the output weight to zero if the Gaussian is
- >not helping to fit the data.
-
- Key point. A few add-on 'reasonability criteria' like weight decay are
- quite effective at avoiding pathological results.
-
- >Well, since you keep pounding on this, I will point out that in most
- >backprop-style nets after training, almost all of the hidden units are
- >saturated almost all of the time. So you can replace them with sharp
-
- Same in our experiments. The decision trees being built do benefit a lot
- from the remaining nonlinearities, though (smoother decision surfaces,
- really 8-).
-
- >Myself, I prefer to think in terms of parallel hardware, so lazy evaluation
- >isn't an issue. Yes, sigmoid unit hardware is a bit more expensive to
- >implement than simple gates, but I don't need nearly as many of them.
-
- It is not terribly expensive - a 256-entry table is usually enough.
- Pipelined access to such a lookup table can run at one lookup per cycle
- with a pipe stall of less than 5 (if not much better, hihi). Moreover,
- weight accumulation/update are matrix operations, while the lookup is
- only a vector operation. It is no bottleneck at all.
-
- Cheers,
- Henrik
-
-
- IBM Research Division physics group Munich
- massively parallel group at Lawrence Livermore
-