Path: sparky!uunet!cis.ohio-state.edu!pacific.mps.ohio-state.edu!linac!uwm.edu!ogicse!das-news.harvard.edu!cantaloupe.srv.cs.cmu.edu!crabapple.srv.cs.cmu.edu!news
From: sef@sef-pmax.slisp.cs.cmu.edu
Newsgroups: comp.ai.neural-nets
Subject: Re: Reducing Training time vs Generalisation
Message-ID: <BtBJCw.M2n.1@cs.cmu.edu>
Date: 21 Aug 92 05:31:43 GMT
Article-I.D.: cs.BtBJCw.M2n.1
Sender: news@cs.cmu.edu (Usenet News System)
Organization: School of Computer Science, Carnegie Mellon
Lines: 53
Nntp-Posting-Host: sef-pmax.slisp.cs.cmu.edu

From: arms@cs.UAlberta.CA (Bill Armstrong)

Now let's see what it takes to get lazy evaluation: first of all, I
think you would have to insist that the sign of all weights on an
element be positive, and all signals in the net too. Otherwise, in
forming a weighted sum of inputs, you cannot be sure which side of the
sharp threshold you are on until you have evaluated all the inputs
(not lazy!). I think the signals would have to be bounded too.
I think this would be OK. ALNs are still faster, because they don't
do arithmetic, but ALN nodes are not as powerful.

For lazy evaluation, the inputs and hidden-unit values would have to be
bounded, but I think you could use bipolar weights. Sort the weights for
each unit by magnitude. Evaluate subtrees in order, biggest weight first.
Give up when no combination of the remaining weights times the input
limits could pull the total back across the threshold.

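A rough sketch of that rule in Python (mine, not from the post): the
function name and the convention that an unevaluated subtree may be
passed as a zero-argument thunk are illustrative choices, and inputs
are assumed bounded in [-input_bound, +input_bound].

def lazy_threshold_unit(weights, inputs, threshold=0.0, input_bound=1.0):
    # Visit terms in order of decreasing |weight|, tracking the largest
    # possible contribution of the terms not yet evaluated.
    order = sorted(range(len(weights)), key=lambda i: -abs(weights[i]))
    remaining = sum(abs(weights[i]) * input_bound for i in order)
    total = 0.0
    for i in order:
        x = inputs[i]() if callable(inputs[i]) else inputs[i]  # lazy subtree
        total += weights[i] * x
        remaining -= abs(weights[i]) * input_bound
        # Give up once the remaining terms cannot pull the total back
        # across the threshold in either direction.
        if total - remaining > threshold:
            return 1
        if total + remaining <= threshold:
            return 0
    return 1 if total > threshold else 0
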
In fact, you wouldn't have to replace the sigmoids with thresholds. Divide
the sigmoid into three regions: saturated-on, saturated-off, and in-between.
If you find yourself in one of the saturated regions and no combination of
the other inputs can pull the net back into the in-between region, stop
evaluating and flush the remaining sub-trees.

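The same sketch adapts to that three-region scheme; the saturation
cutoff of 4.0 on the net input below is an arbitrary illustrative
choice, not something from the post.

import math

SATURATION = 4.0  # net inputs beyond this are treated as saturated

def lazy_sigmoid_unit(weights, inputs, input_bound=1.0):
    order = sorted(range(len(weights)), key=lambda i: -abs(weights[i]))
    remaining = sum(abs(weights[i]) * input_bound for i in order)
    total = 0.0
    for i in order:
        x = inputs[i]() if callable(inputs[i]) else inputs[i]
        total += weights[i] * x
        remaining -= abs(weights[i]) * input_bound
        if total - remaining > SATURATION:   # pinned in saturated-on
            return 1.0
        if total + remaining < -SATURATION:  # pinned in saturated-off
            return 0.0
    # Still possibly in the in-between region: compute the sigmoid exactly.
    return 1.0 / (1.0 + math.exp(-total))
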
One argument for going whole hog into ALNs is that you don't have to
train using sigmoids and then risk damaging the result of learning by
going to sharp thresholds. If there were a training procedure for
networks of the above kind of node with a sharp threshold, that would
be very promising. I thought backprop required differentiability to
work, though.

It does. The Perceptron learning rule (and variants such as Gallant's
pocket algorithm) can train sharp-threshold units, but not multiple layers.

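For reference, a minimal sketch of the single-unit Perceptron rule in
Python (names and the 0/1 target encoding are illustrative):

def perceptron_train(samples, n_inputs, epochs=100, lr=1.0):
    # samples: list of (input_vector, target) pairs, targets in {0, 1}
    w = [0.0] * n_inputs
    b = 0.0
    for _ in range(epochs):
        mistakes = 0
        for x, target in samples:
            out = 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0
            err = target - out              # -1, 0, or +1
            if err:
                mistakes += 1
                w = [wi + lr * err * xi for wi, xi in zip(w, x)]
                b += lr * err
        if mistakes == 0:                   # found a separating hyperplane
            break
    return w, b

Gallant's pocket variant additionally keeps the best weight vector seen
so far, which matters when the data are not linearly separable.
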
>Myself, I prefer to think in terms of parallel hardware, so lazy evaluation
>isn't an issue.

Not true! If you have a fixed amount of hardware, then to do large
problems, you will have to iterate it...

Sorry, I should have said "sufficiently parallel hardware", meaning you
don't have to share.

-- Scott
===========================================================================

Scott E. Fahlman
School of Computer Science
Carnegie Mellon University
5000 Forbes Avenue
Pittsburgh, PA 15213

Internet: sef+@cs.cmu.edu