NetNews Usenet Archive 1992 #16

Text File | 1992-07-25 | 4.5 KB | 100 lines

Newsgroups: comp.ai.neural-nets
Path: sparky!uunet!zaphod.mps.ohio-state.edu!sol.ctr.columbia.edu!destroyer!ubc-cs!alberta!arms
From: arms@cs.UAlberta.CA (Bill Armstrong)
Subject: Re: Neural Nets and Brains
Message-ID: <arms.712096926@spedden>
Sender: news@cs.UAlberta.CA (News Administrator)
Nntp-Posting-Host: spedden.cs.ualberta.ca
Organization: University of Alberta, Edmonton, Canada
References: <arms.711907358@spedden>> <BILL.92Jul23135614@ca3.nsma.arizona.edu> <arms.711935064@spedden> <50994@seismo.CSS.GOV>
Date: Sat, 25 Jul 1992 20:42:06 GMT
Lines: 87

black@seismo.CSS.GOV (Mike Black) writes:

>In article <arms.711935064@spedden> arms@cs.UAlberta.CA (Bill Armstrong) writes:
>>Sorry, but after you have looked at ALN software, you may no longer
>>feel non-logical nets have any real advantages at all.
>>
>I took the atree software and (for times sake) reduced the multiplication
>problem to the 1 and 2 times tables. I removed 1*6 from the table and
>let atree crank. When I tested 1*6 it gave me an answer of ~35.

This depends on the coding a:b used. It looks like the random walk for
the output curved back upon itself. A different coding might partly
correct the problem. However, there is a separate problem, discussed
below, that would tend to give 1*6 the same answer as a neighboring
input, e.g. 1*5, 1*7, 2*6 or 0*6.

>I do
>NOT call this superior as the backprop net I trained gave me an answer
>that was at least BETWEEN 1*5 and 1*7. The other problem I ran into
>was running out of memory (16 meg + 64meg swap space) on a problem that
>I had previously solved with backprop.

>My conclusions:
>1. backprop is able to generalize to a linear solution whereas atree
>cannot (in at least one provable case).

I agree with you that ALNs do not interpolate. Atree generalizes not
by interpolation, but by maintaining some neighboring training point's
output. This is because the tree functions are insensitive to
perturbations of the inputs.

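The difference between holding a neighbor's output and interpolating can be sketched in a few lines of Python. This is an illustration only, not atree code: the functions `hold_neighbor` and `interpolate` are hypothetical names, the inputs are used directly rather than through a random-walk bit coding, and the training set is assumed to be the 1-times table for 0 to 9 with 1*6 removed. It therefore shows the "answer equal to a neighbor's" behavior, not the ~35 result, which is attributed above to the coding.

```python
# Illustrative only: not the atree implementation.
# Training set: the 1-times table for 0..9 with 1*6 held out.
train = {b: 1 * b for b in range(10) if b != 6}

def hold_neighbor(b):
    """Return the output stored for the nearest training input.

    Ties are broken in favor of the smaller input.
    """
    nearest = min(train, key=lambda t: (abs(t - b), t))
    return train[nearest]

def interpolate(b):
    """Linearly interpolate between the closest training inputs on each side.

    Assumes b lies inside the range covered by the training inputs.
    """
    lo = max(t for t in train if t <= b)
    hi = min(t for t in train if t >= b)
    if lo == hi:
        return float(train[lo])
    w = (b - lo) / (hi - lo)
    return (1 - w) * train[lo] + w * train[hi]

print(hold_neighbor(6))  # 5, the output stored for 1*5
print(interpolate(6))    # 6.0, halfway between 1*5 and 1*7
```

Under these assumptions the held-out input gets the value of a neighbor in the first case and a value between its neighbors in the second, which is the distinction drawn in the reply to point 1.
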
In order to get smooth interpolation, there are other ways of using
ALNs. For example you can use a forest of ALNs to compute an index
which says which part of the space an input point is in. From there,
you can access coefficients of a smooth function for that part of the
space. Then the number of arithmetic operations is quite small, just
enough to describe the local function. On the other hand, backprop
always accesses enough coefficients to describe the function on the
whole space, and that's inefficient.

This ALN technique gives you only piecewise continuous functions. It
is possible to compute continuous functions too, with additional
complexity. Fortunately, the number of arithmetic operations still
depends only on functions defined in the locality of the input point.

>2. atree hits memory constraints before backprop does.

This is not intrinsic to atree. The Unix version depends on virtual
memory, and so shouldn't run out; and the Windows version uses the
Windows facility for getting access to memory above 1 Meg. Part of the
problem is in the way atree uses bit-vectors. This is unnecessary, and
will be changed in future versions.

>3. I have no doubt that atree does well in certain applications, but
>it NOT superior to backprop in ALL cases.

The technique of random walks to encode continuous values has to be
replaced in order to get smooth interpolations, and more importantly
to be able to force the functions synthesized to be piecewise
monotonic so we don't get wild values. The bit-vectors also have to
go.

Since the software which does this is not available yet, all I can do
is hope that the ideas presented above are enough to indicate where
ALNs have problems, and how they can be solved. BP-nets (or rather
MLPs with multiply-adds and sigmoids that are non-constant in every
part of the real line) have inefficiencies built in that are going to
be difficult or impossible to overcome.

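The index-then-local-coefficients idea from the reply to point 1 can be sketched as follows. Everything here is an assumption made for illustration: a one-dimensional input, a hypothetical partition `edges`, made-up per-cell coefficients `coeffs`, and a plain binary search in `cell_index` standing in for the forest of ALNs that would compute the index.

```python
import bisect

# Hypothetical partition of [0, 4] into four cells, with one
# (slope, intercept) pair per cell. The values are made up: they are
# chords of f(x) = x*x, so they happen to join at the cell edges,
# but nothing in the lookup itself enforces that.
edges = [0.0, 1.0, 2.0, 3.0, 4.0]
coeffs = [(1.0, 0.0), (3.0, -2.0), (5.0, -6.0), (7.0, -12.0)]

def cell_index(x):
    """Stand-in for the ALN forest: map an input to the index of its cell.

    Inputs outside [0, 4] are clamped to the first or last cell.
    """
    i = bisect.bisect_right(edges, x) - 1
    return min(max(i, 0), len(coeffs) - 1)

def evaluate(x):
    """Evaluate using only the coefficients of the cell that contains x."""
    slope, intercept = coeffs[cell_index(x)]
    return slope * x + intercept  # one multiply and one add per evaluation

print(cell_index(2.5))  # 2
print(evaluate(2.5))    # 6.5
```

In this sketch an evaluation reads two coefficients regardless of how many cells the partition has, whereas a net whose units are non-constant everywhere reads every weight on every evaluation.
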
A significant step forward would be if BP could use squashing
functions that are constant outside an interval (e.g. [-1,1]). If you
could train BP nets using that kind of squashing function, they could
be much more efficient to evaluate and many of my arguments against BP
would break down.

With ALNs, approximations can be produced which only depend on local
data, and that will ultimately give ALNs a significant advantage even
where fitting smooth functions is concerned.

--
***************************************************
Prof. William W. Armstrong, Computing Science Dept.
University of Alberta; Edmonton, Alberta, Canada T6G 2H1
arms@cs.ualberta.ca Tel(403)492 2374 FAX 492 1071