Path: sparky!uunet!wupost!sdd.hp.com!hplabs!ucbvax!NEURON.SIEMENS.COM!kpfleger
From: kpfleger@NEURON.SIEMENS.COM (Karl Pfleger)
Newsgroups: comp.ai.neural-nets
Subject: Re: Wild values (was Reducing Training time ...)
Message-ID: <9208201551.AA02766@neuron.siemens.com>
Date: 20 Aug 92 15:51:31 GMT
Sender: daemon@ucbvax.BERKELEY.EDU
Lines: 50
References: <arms.714146123@spedden> <36967@sdcc12.ucsd.edu> <arms.714208873@spedden>
Keywords: back propagation, training, generalisation

1 quick point and 1 wild idea:

First, if one desires to avoid wild output values for certain regions of
input space, one ought to include training pairs from those regions in the
training set. The point about desiring certain behavior on the interval
from 0 to 1 while not including any training pairs from that interval has
already been made.

Vaguely similar: since the inputs that will be thrown at the system in actual
use will have some probability distribution based on whatever the system is
doing, the training set should be generated by sampling that same
distribution, or something as close to it as possible, NOT by picking a
few values by hand or by using a lattice of regularly spaced points
(unless such a lattice happens to represent the distribution well).
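
For example, here is a rough sketch of the difference. The log-normal below
is only a stand-in for whatever distribution actually describes the
application, and the target function and all the numbers are made up purely
for illustration:

import numpy as np

rng = np.random.default_rng(0)

def target(x):
    # Hypothetical function the net would be trained to approximate.
    return np.sin(x)

# Easy to write down, but possibly a poor match to actual use: a lattice
# of regularly spaced points.
lattice_inputs = np.linspace(0.0, 10.0, 20)

# Sampling the (assumed) operating distribution instead.
realistic_inputs = rng.lognormal(mean=0.0, sigma=0.75, size=200)

train_lattice   = list(zip(lattice_inputs, target(lattice_inputs)))
train_realistic = list(zip(realistic_inputs, target(realistic_inputs)))

# Most of the realistic input mass sits well below the top of the lattice,
# so the evenly spaced design spends examples where the system rarely
# operates and is sparse where it usually does.
print("90th percentile of realistic inputs:", np.percentile(realistic_inputs, 90))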

I have a much more difficult time picturing wild values coming from a
network trained on a significant number of random, real inputs than I do
coming from a network trained on a handful of regularly spaced
integers.

A wild idea for people trying to avoid wild values (e.g. for safety
critical applications etc.): Once the network has been trained and the
weights are fixed, it should be possible to determine the maximum and
minimum output values over all inputs. This can be done with normal
gradient ascent and descent. Simply calculate partials of the output(s)
with respect to the input(s). Convergence should be much faster than
training networks in the first place due to the generally smaller number
of inputs than weights. Multiple runs from different starting positions,
or the use of stochastic techniques likely to converge to global
maxima/minima, can reduce the chance of missing a wild value that
actually exists. With a small number of inputs (definitely 1, maybe a
few more) analytical techniques should be able to provably determine the
global maximum and minimum.
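
A rough sketch of what this might look like in practice (the tiny tanh
network, the [-1, 1] input box, and all of the numbers below are purely
illustrative assumptions):

import numpy as np

rng = np.random.default_rng(1)

n_in, n_hid = 3, 5
W1 = rng.normal(size=(n_hid, n_in))   # frozen input-to-hidden weights
b1 = rng.normal(size=n_hid)
w2 = rng.normal(size=n_hid)           # frozen hidden-to-output weights
b2 = rng.normal()

def output(x):
    # Scalar output of the fixed-weight network.
    return w2 @ np.tanh(W1 @ x + b1) + b2

def d_output_dx(x):
    # Partials of the output with respect to each input.
    a = np.tanh(W1 @ x + b1)
    return W1.T @ (w2 * (1.0 - a**2))

def extreme(sign, restarts=20, steps=500, lr=0.05, lo=-1.0, hi=1.0):
    # sign=+1: gradient ascent on the inputs toward the maximum output;
    # sign=-1: gradient descent toward the minimum output.
    best = None
    for _ in range(restarts):
        x = rng.uniform(lo, hi, size=n_in)
        for _ in range(steps):
            x = np.clip(x + sign * lr * d_output_dx(x), lo, hi)
        y = output(x)
        if best is None or sign * y > sign * best:
            best = y
    return best

print("estimated maximum output over the input box:", extreme(+1))
print("estimated minimum output over the input box:", extreme(-1))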

This idea's usefulness is limited by a number of requirements, such as the
weights being fixed and "wild" meaning simply too large or too small an
output, but it still seems like it should have fairly wide applicability.

-Karl kpfleger@learning.siemens.com