Newsgroups: comp.ai.neural-nets
Path: sparky!uunet!elroy.jpl.nasa.gov!sdd.hp.com!caen!destroyer!ubc-cs!unixg.ubc.ca!kakwa.ucs.ualberta.ca!alberta!arms
From: arms@cs.UAlberta.CA (Bill Armstrong)
Subject: Re: Reducing Training time vs Generalisation
Message-ID: <arms.714517521@spedden>
Sender: news@cs.UAlberta.CA (News Administrator)
Nntp-Posting-Host: spedden.cs.ualberta.ca
Organization: University of Alberta, Edmonton, Canada
References: <BtBJCw.M2n.1@cs.cmu.edu>
Date: Sat, 22 Aug 1992 21:05:21 GMT
Lines: 58

sef@sef-pmax.slisp.cs.cmu.edu writes:

> From: arms@cs.UAlberta.CA (Bill Armstrong)
>
> Now let's see what it takes to get lazy evaluation: first of all, I
> think you would have to insist that the sign of all weights on an
> element be positive, and all signals in the net too. Otherwise, in
> forming a weighted sum of inputs, you cannot be sure which side of the
> sharp threshold you are on until you have evaluated all inputs (not
> lazy!). I think the signals would have to be bounded too. I think this
> would be OK. ALNs are still faster, because they don't do arithmetic,
> but their nodes are not as powerful.
>
>For lazy evaluation, the inputs and hidden unit values would have to be
>bounded, but I think you could use bipolar weights. Sort the weights for
>each unit by magnitude. Evaluate subtrees in order, biggest weight first.
>Give up when no combination of remaining weights times the input limits
>could pull the total back across threshold.

>In fact, you wouldn't have to replace the sigmoids with thresholds. Divide
>the sigmoid into three regions: saturated-on, saturated-off, and in-between.
>If you find yourself in one of the saturated regions and no combination of
>other inputs can pull the net back into the in-between region, stop
>evaluating and flush the remaining sub-trees.

I agree with everything you have said. It becomes clear that the
non-saturated portion of the sigmoid is the costly part of evaluation,
doesn't it?

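Just to make the lazy evaluation concrete, here is a rough sketch of
what I take you to be describing: weights sorted by magnitude, child
values assumed bounded in [-bound, bound], and evaluation abandoned as
soon as the remaining terms cannot pull the sum out of a saturated
region of the sigmoid. The function and parameter names are my own
invention, not taken from any existing package, so read it as a sketch
of the idea rather than a reference implementation.

    import math

    def lazy_unit(weighted_children, bias, low=-2.0, high=2.0, bound=1.0):
        """Lazily evaluate one sigmoid unit.

        weighted_children: list of (weight, thunk); thunk() returns the
        child's output, assumed to lie in [-bound, bound].  [low, high]
        is the "in-between" region of the sigmoid's input; outside it
        the unit is treated as saturated (output 1.0 or 0.0).
        """
        # Biggest weights first, so the total settles as early as possible.
        pairs = sorted(weighted_children, key=lambda wc: abs(wc[0]), reverse=True)
        # Largest swing the not-yet-evaluated children could still cause.
        slack = sum(abs(w) for w, _ in pairs) * bound
        total = bias
        for w, child in pairs:
            slack -= abs(w) * bound
            total += w * child()          # evaluate this subtree
            if total - slack > high:      # stuck in the saturated-on region
                return 1.0
            if total + slack < low:       # stuck in the saturated-off region
                return 0.0
        return 1.0 / (1.0 + math.exp(-total))  # in-between: full sigmoid

If each child is itself such a lazy unit wrapped in a thunk, whole
subtrees get flushed whenever a parent saturates early, which is where
the savings come from.
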
> One argument for going whole hog into ALNs is that you don't have to
> train using sigmoids and then risk damaging the result of learning by
> going to sharp thresholds. If there were a training procedure for
> networks of the above kind of node with a sharp threshold, that would
> be very promising. I thought backprop required differentiability to
> work, though.

>It does. The Perceptron learning rule (and variants such as Gallant's
>pocket algorithm) can train sharp-threshold units, but not multiple layers.
>

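(For anyone following along who hasn't seen it, the perceptron rule for
a single sharp-threshold unit really is just a few lines; the sketch
below is the generic textbook version, with names I made up rather than
code from any particular system. Gallant's pocket algorithm adds one
twist: keep a copy of the best weight vector seen so far, so you still
get a usable unit when the data are not linearly separable.)

    def perceptron_train(samples, n_inputs, rate=1.0, epochs=100):
        """Classic perceptron rule for one sharp-threshold (0/1) unit.

        samples: list of (x, target) where x is a list of n_inputs
        numbers and target is 0 or 1.  Returns (weights, bias).
        """
        w = [0.0] * n_inputs
        b = 0.0
        for _ in range(epochs):
            mistakes = 0
            for x, target in samples:
                out = 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0
                err = target - out                  # -1, 0, or +1
                if err:
                    mistakes += 1
                    for i in range(n_inputs):
                        w[i] += rate * err * x[i]   # nudge weights toward target
                    b += rate * err
            if mistakes == 0:                       # consistent with training set
                break
        return w, b
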
OK, then you have to admit that since an ALN is a multilayer
perceptron, and since the ALN training procedure works, and since this
fact has been in the scientific and patent literature for twenty
years, it is about time that people revise their thinking: multilayer
perceptrons *can* be trained.

Scientists, for whatever reason, seem to prefer to downplay the fact.

Now, maybe if I could walk on water ...

Thanks for your comments.

Bill
--
***************************************************
Prof. William W. Armstrong, Computing Science Dept.
University of Alberta; Edmonton, Alberta, Canada T6G 2H1
arms@cs.ualberta.ca Tel(403)492 2374 FAX 492 1071