Newsgroups: comp.ai.neural-nets
Path: sparky!uunet!spool.mu.edu!sol.ctr.columbia.edu!destroyer!cs.ubc.ca!alberta!arms
From: arms@cs.UAlberta.CA (Bill Armstrong)
Subject: Re: How to train a lifeless network (of "silicon atoms")?
Message-ID: <arms.722488897@spedden>
Sender: news@cs.UAlberta.CA (News Administrator)
Nntp-Posting-Host: spedden.cs.ualberta.ca
Organization: University of Alberta, Edmonton, Canada
References: <1992Nov21.002654.13198@news.columbia.edu> <1992Nov22.182325.24185@dxcern.cern.ch>
Date: Mon, 23 Nov 1992 03:21:37 GMT
Lines: 63

block@dxlaa.cern.ch (Frank Block) writes:

>In article <1992Nov21.002654.13198@news.columbia.edu>, rs69@cunixb.cc.columbia.edu (Rong Shen) writes:

>|> Please allow me to ask you this childish question:
>|>
>|> Suppose you have a neural network and you want to train it to
>|> perform a task; for the moment, let's say the task is to recognize
>|> handwriting. Now suppose the network has recognized the word "hello,"
>|> and the weight in the synapse between neurodes (neurons) X and Y is k.
>|> If you proceed to train the network to recognize the word "goodbye"
>|> (by back propagation, or whatever algorithms), and since all the
>|> neurodes are connected in some way (through some interneurons, maybe),
>|> the synaptic weight between X and Y is likely to change from k to some
>|> other number; similarly, the weights in other synapses will change.
>|> Therefore, it is extremely likely that one training session will erase
>|> the efforts of previous sessions.
>|>
>|> My question is, What engineering tricks shall we use to
>|> overcome this apparent difficulty?
>|>
>|> Thanks.
>|>
>|> --
>|> rs69@cunixb.cc.columbia.edu
>|>
>--

>What you normally do during training is to present (taking your example) the
>words 'hello' and 'goodbye' alternately.  You should not train the net first
>just on one and then, when it has learned to recognize it, on the other.
>The training is a statistical process which in the end (let's hope) converges
>to a good set of weights (a compromise which recognizes all patterns in an
>optimal way).
>The engineering trick is mostly the so-called 'gradient descent' (in backprop).
>This always moves your current weight vector in a direction which decreases the
>network error measure.

>Hope this helps a bit

>Frank

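To make Frank's point concrete, here is a little sketch: just a
single sigmoid "neurode" trained by gradient descent on two made-up
binary patterns standing in for "hello" and "goodbye" (the learning
rate and the number of presentations are arbitrary, and none of this
comes from real handwriting data).  Training to convergence on the
first pattern and only then on the second largely erases the first,
while alternating them converges to a compromise set of weights:

import math, random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def output(w, x):
    # a single "neurode": weighted sum of the inputs through a sigmoid
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)))

def train_step(w, x, target, rate=0.5):
    # one gradient-descent step on the squared error for one pattern
    y = output(w, x)
    delta = (y - target) * y * (1.0 - y)      # d(error)/d(net input)
    return [wi - rate * delta * xi for wi, xi in zip(w, x)]

# two overlapping input patterns standing in for the two words
hello   = ([1.0, 1.0, 0.0, 1.0], 1.0)         # (pattern, target)
goodbye = ([1.0, 0.0, 1.0, 1.0], 0.0)

random.seed(0)
w_seq = [random.uniform(-0.5, 0.5) for _ in range(4)]
w_mix = list(w_seq)

# sequential training: every "hello" presentation, then every "goodbye"
for x, t in [hello] * 2000 + [goodbye] * 2000:
    w_seq = train_step(w_seq, x, t)

# interleaved training: alternate the two patterns
for _ in range(2000):
    for x, t in (hello, goodbye):
        w_mix = train_step(w_mix, x, t)

print("sequential :", [round(output(w_seq, x), 2) for x, _ in (hello, goodbye)])
print("interleaved:", [round(output(w_mix, x), 2) for x, _ in (hello, goodbye)])

The sequentially trained unit should end up getting "hello" wrong
again, while the interleaved one settles on weights that serve both
patterns at once.  The particular unit and numbers don't matter; the
alternating presentation is what lets gradient descent find the
compromise.
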
I once attended a tutorial by Bernard Widrow in which he referred to
the "least disturbance principle".  The idea is to correct an error in
such a way that the overall state of the net is perturbed as little as
possible.  This is another way of looking at backprop.  Not only does
it reduce the error (in theory anyway), it does so by favoring changes
to those weights which have the greatest effect on the error, so it can
change them by the smallest amount needed to achieve a given amount of
error correction.
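
To see what "least disturbance" means in the simplest case, here is
a sketch of my own for a single linear unit, in the spirit of
Widrow's LMS rule (the numbers and the fraction alpha below are only
illustrative, not something from the tutorial).  The weight change
that removes a given fraction of the error on the current pattern
while moving the weight vector as little as possible is the step
taken along the input, and any other change achieving the same
correction is necessarily bigger:

import random

# A single linear unit: y = w . x.  To remove a fraction alpha of the
# error e = t - y on the current pattern we need  dw . x = alpha * e.
# Among all dw satisfying that constraint, the smallest one (the least
# disturbance to the weight vector) lies along x itself:
#     dw = alpha * e * x / (x . x)    -- the Widrow-Hoff (alpha-LMS) step

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def lms_step(w, x, t, alpha=0.5):
    e = t - dot(w, x)
    scale = alpha * e / dot(x, x)
    return [wi + scale * xi for wi, xi in zip(w, x)], e

random.seed(1)
w = [random.uniform(-1.0, 1.0) for _ in range(5)]
x = [random.uniform(-1.0, 1.0) for _ in range(5)]
t = 1.0

w_new, e = lms_step(w, x, t)
dw = [a - b for a, b in zip(w_new, w)]
print("error before     :", round(e, 4))
print("error after      :", round(t - dot(w_new, x), 4))    # cut in half
print("size of the step :", round(dot(dw, dw) ** 0.5, 4))

# Any other weight change giving the same correction must be larger,
# e.g. the same step plus a component orthogonal to x:
noise = [0.2, -0.1, 0.0, 0.1, -0.2]
ortho = [n - dot(noise, x) / dot(x, x) * xi for n, xi in zip(noise, x)]
dw2 = [d + o for d, o in zip(dw, ortho)]
w_alt = [wi + di for wi, di in zip(w, dw2)]
print("same error after :", round(t - dot(w_alt, x), 4))
print("but a bigger step:", round(dot(dw2, dw2) ** 0.5, 4))

With alpha = 1 the error on the current pattern is wiped out
completely; a smaller alpha trades slower correction for even less
disturbance to whatever the other patterns have built up.  Backprop
only does this approximately, but that is the flavor of the
principle.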

This simple principle shows Widrow's profound insight into adaptation.
It has helped me a lot in thinking about adaptation algorithms for ALNs
(adaptive logic networks) too, but the scope of its applicability seems
much broader.

Bill
--
***************************************************
Prof. William W. Armstrong, Computing Science Dept.
University of Alberta; Edmonton, Alberta, Canada T6G 2H1
arms@cs.ualberta.ca Tel(403)492 2374 FAX 492 1071