Path: sparky!uunet!cs.utexas.edu!sun-barr!olivea!spool.mu.edu!yale.edu!yale!gumby!destroyer!cs.ubc.ca!alberta!arms
From: arms@cs.UAlberta.CA (Bill Armstrong)
Newsgroups: comp.ai.neural-nets
Subject: Re: Questions about sigmoids etc.
Keywords: Sigmoids, output layers
Message-ID: <arms.724120505@spedden>
Date: 12 Dec 92 00:35:05 GMT
References: <waugh.723705045@probitas> <1992Dec8.161935@sees.bangor.ac.uk> <1992Dec9.160218.25286@cs.brown.edu> <1992Dec10.084458.12506@dxcern.cern.ch> <1992Dec10.123626.28838@cs.brown.edu>
Sender: news@cs.UAlberta.CA (News Administrator)
Organization: University of Alberta, Edmonton, Canada
Lines: 42
Nntp-Posting-Host: spedden.cs.ualberta.ca

hm@cs.brown.edu (Harry Mamaysky) writes:

>In article <1992Dec10.084458.12506@dxcern.cern.ch>, block@dxlaa.cern.ch (Frank Block) writes:
>|>
>|> In article <1992Dec9.160218.25286@cs.brown.edu>, pcm@cs.brown.edu (Peter C. McCluskey) writes:
>|> |> In article <1992Dec8.161935@sees.bangor.ac.uk>, paulw@sees.bangor.ac.uk
>|> |> (Mr P Williams (AD)) writes:
>|> |> |> For backpropagation networks (i.e. Rumelhart, McClelland and Williams),
>|> |> |> it is necessary to have a monotonically increasing, DIFFERENTIABLE
>|> |> |> function as the output

>|>
>|> But how are you going to train a network with non-differentiable functions?
>|> Certainly not with the standard BP?

A partial derivative is only one way of measuring the influence of a
weight change on the output error of a network. You can also use
ratios of non-infinitesimal (finite) perturbations of values. Even
though we speak of the derivative, we can never take the limit in an
implementation anyway: on a real machine, every measurement of a
weight's influence is a finite difference.
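
To make that concrete, here is a little Python sketch (illustrative
only, not code from any package; the "network" is just a single
linear unit, and names like error_of and sensitivity are made up).
It measures a weight's influence with a finite perturbation instead
of a derivative, then descends against that measure:

    # Sketch: finite-perturbation sensitivity in place of a partial
    # derivative.  The "network" is a single linear unit, purely for
    # illustration.

    def error_of(weights, data):
        """Sum of squared errors over (inputs, target) pairs."""
        return sum((y - sum(w * x for w, x in zip(weights, xs))) ** 2
                   for xs, y in data)

    def sensitivity(weights, data, i, delta=0.01):
        """Finite ratio (E(w + delta*e_i) - E(w)) / delta.
        No limit is taken; delta stays non-infinitesimal."""
        bumped = list(weights)
        bumped[i] += delta
        return (error_of(bumped, data) - error_of(weights, data)) / delta

    def step(weights, data, lr=0.1, delta=0.01):
        """Descend against the measured sensitivities, just as one
        would with true partial derivatives."""
        return [w - lr * sensitivity(weights, data, i, delta)
                for i, w in enumerate(weights)]

    # Usage: learn y = 2*x0 - x1 from four samples.
    data = [([1.0, 0.0], 2.0), ([0.0, 1.0], -1.0),
            ([1.0, 1.0], 1.0), ([2.0, 1.0], 3.0)]
    w = [0.0, 0.0]
    for _ in range(200):
        w = step(w, data)
    print(w)  # approaches [2.0, -1.0]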

The ALN adaptive algorithm (atree release 2.7, in pub/atre27.exe for
Windows 3.x on menaik.cs.ualberta.ca [129.128.4.241]) deals with
finite changes. In fact, since we are then in a boolean tree, either
the output changes when a "weight" changes or it doesn't. This
logical measure is much faster to evaluate in combinational hardware
than a derivative, and in software it can be computed by lazy
evaluation of the logic.
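
The lazy evaluation goes roughly like this Python sketch (my reading
of the idea, not the actual atree code; it assumes each leaf index
occurs once in the tree):

    # Sketch: in a boolean tree of AND/OR nodes, a flipped input either
    # propagates to the root or it doesn't -- a yes/no "sensitivity"
    # instead of a derivative.  Assumes each leaf index occurs once.

    def evaluate(node, x):
        kind = node[0]
        if kind == "LEAF":
            return x[node[1]]
        a, b = evaluate(node[1], x), evaluate(node[2], x)
        return (a and b) if kind == "AND" else (a or b)

    def flip_reaches_root(node, x, leaf):
        """Would flipping input `leaf` change this node's output?"""
        kind = node[0]
        if kind == "LEAF":
            return node[1] == leaf
        l, r = node[1], node[2]
        if kind == "AND":
            # A flip below one child matters only if the sibling is
            # True; testing the sibling first skips subtrees (lazy).
            return ((evaluate(r, x) and flip_reaches_root(l, x, leaf))
                    or (evaluate(l, x) and flip_reaches_root(r, x, leaf)))
        # OR: a flip below one child matters only if the sibling is False.
        return ((not evaluate(r, x) and flip_reaches_root(l, x, leaf))
                or (not evaluate(l, x) and flip_reaches_root(r, x, leaf)))

    # (x0 AND x1) OR x2, evaluated at x = (True, False, False):
    tree = ("OR", ("AND", ("LEAF", 0), ("LEAF", 1)), ("LEAF", 2))
    x = (True, False, False)
    print(flip_reaches_root(tree, x, 1))  # True: x1 gates the AND
    print(flip_reaches_root(tree, x, 0))  # False: blocked by x1 = False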

I think that in large adaptive systems we have to give up on
differentiability, and even on the idea of having only functions; we
need to learn relationships that are many-to-many. In the general
case, I don't see any way we can use derivatives; however, finite
perturbations can still be used to reduce error. Error reduction of
this kind, combined with Widrow's idea of "least disturbance",
provides lots of possible adaptive algorithms.
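
For instance, here is a toy Python sketch of one such algorithm (my
own illustration, not anything published; it reuses error_of from
the sketch above): among the finite weight perturbations that reduce
the error, apply the smallest one.

    # Sketch: one of many possible perturbation-based rules.  Among the
    # finite weight changes that reduce the error, apply the smallest
    # one -- a crude rendering of Widrow's "least disturbance" idea.

    def least_disturbance_step(weights, data,
                               deltas=(-0.5, -0.1, -0.02, 0.02, 0.1, 0.5)):
        base = error_of(weights, data)
        best = None  # (abs(delta), weight index, delta)
        for i in range(len(weights)):
            for d in deltas:
                trial = list(weights)
                trial[i] += d
                if error_of(trial, data) < base and \
                   (best is None or abs(d) < best[0]):
                    best = (abs(d), i, d)
        if best is None:
            return weights          # no single perturbation helps
        new = list(weights)
        new[best[1]] += best[2]
        return new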

--
***************************************************
Prof. William W. Armstrong, Computing Science Dept.
University of Alberta; Edmonton, Alberta, Canada T6G 2H1
arms@cs.ualberta.ca Tel(403)492 2374 FAX 492 1071