Newsgroups: comp.ai.neural-nets
Path: serval!netnews.nwnet.net!usenet.coe.montana.edu!saimiri.primate.wisc.edu!eng.ufl.edu!usenet.ufl.edu!darwin.sura.net!zaphod.mps.ohio-state.edu!caen!batcomputer!munnari.oz.au!bunyip.cc.uq.oz.au!s1.elec.uq.oz.au!young
From: young@s1.elec.uq.oz.au (Steven Young)
Subject: Re: Pruning units and weights
Message-ID: <young.737166133@s1.elec.uq.oz.au>
Sender: news@bunyip.cc.uq.oz.au (USENET News System)
Organization: Prentice Centre, University of Queensland
References: <1993May9.011044.21062@leland.Stanford.EDU>
Date: Wed, 12 May 1993 00:22:13 GMT
Lines: 70

furman@leland.Stanford.EDU (Elliot M Furman) writes:
>Can anyone tell me how to prune unnecessary units and weights?
>I would like to start training a fully feedforward NN with
>too many units and then prune those that aren't contributing
>much to the "solution".

This is the standard approach that comes to mind when people consider
pruning, and it is suggested in various papers I have seen (many people
seem to attribute the idea to Rumelhart). I'll include a list of the
references I know of at the end of this post.

One approach is weight decay: train with a decay term and then remove
connections whose final (trained) weight is small (pick a threshold and
remove any weight whose absolute value falls below it). A number of
papers include weight decay as part of the error function minimized
with standard gradient-descent techniques. There is a range of
approaches and many papers expounding this idea:
(Hanson and Pratt, 1989), (Chauvin, 1989), (Le Cun, Denker and Solla, 1990),
(Ji, Snapp, Psaltis, 1990), (Bishop, 1990), (Weigend, Rumelhart, and Huberman,
1991).
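
To make the recipe concrete, here is a minimal sketch of magnitude
pruning after training with weight decay (Python; the layer size, decay
coefficient and threshold are arbitrary values made up for illustration,
not taken from any of the papers above):

import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(scale=0.5, size=(10, 5))      # weights of one trained layer

def decayed_gradient(W, grad_error, decay=1e-3):
    # Adding (decay/2) * sum(W**2) to the error function contributes an
    # extra decay * W term to the gradient, which shrinks weights that
    # the task error does not need.
    return grad_error + decay * W

# After training, remove connections whose weight is small in magnitude.
threshold = 0.05
mask = np.abs(W) >= threshold                # True where the weight survives
W_pruned = np.where(mask, W, 0.0)
print("removed", int((~mask).sum()), "of", W.size, "connections")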

There are other schemes that make pruning decisions directly. One simple
rule (suggested initially by Sietsma and Dow (1988)) is to check whether
units in the same layer are duplicating each other's function and, if so,
to remove one of the duplicates. Mozer and Smolensky (1989) suggested a
different scheme, called skeletonization, which makes the decision based
on the `relevance' of a unit: relevance is estimated by comparing the
performance of the network with the unit included and with it removed.
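
As a rough sketch of the relevance idea (my own toy formulation in
Python, not Mozer and Smolensky's code; the network shape and random
data are just placeholders): force a hidden unit's output to zero and
see how much the error goes up.

import numpy as np

def forward(X, W1, W2, drop_unit=None):
    H = np.tanh(X @ W1)                      # hidden layer activations
    if drop_unit is not None:
        H[:, drop_unit] = 0.0                # remove the unit from the net
    return H @ W2

def relevance(X, T, W1, W2, unit):
    # Relevance = increase in squared error when the unit is removed;
    # a value near zero means the unit contributes little.
    full = np.mean((forward(X, W1, W2) - T) ** 2)
    without = np.mean((forward(X, W1, W2, drop_unit=unit) - T) ** 2)
    return without - full

rng = np.random.default_rng(1)
X = rng.normal(size=(20, 4))
T = rng.normal(size=(20, 1))
W1 = rng.normal(size=(4, 6))
W2 = rng.normal(size=(6, 1))
scores = [relevance(X, T, W1, W2, u) for u in range(W1.shape[1])]
print(scores)    # units with scores near zero are pruning candidates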

J. Sietsma, R. J. F. Dow, `Neural Network Pruning --- Why and How',
ICNN 1988, vol. I, pages 325--333, 1988.
Yves Chauvin, `A Back-Propagation Algorithm with Optimal Use of Hidden
Units', NIPS 1, pages 519--526, 1989.
Stephen Jos{\'e} Hanson, Lorien Y. Pratt, `Comparing Biases for Minimal
Network Construction with Back-Propagation', NIPS 1, pages 177--185, 1989.
Michael C. Mozer, Paul Smolensky, `Skeletonization: A Technique for
Trimming the Fat from a Network via Relevance Assessment', NIPS 1,
pages 107--115, 1989.
Michael C. Mozer, Paul Smolensky, `Using Relevance to Reduce Network
Size Automatically', Connection Science, vol. 1, no. 1, pages 3--16, 1989.
C. M. Bishop, `Curvature-Driven Smoothing in Backpropagation Neural
Networks', INNC-1990-Paris, pages 749--752, 1990.
Chuanyi Ji, Robert R. Snapp, Demetri Psaltis, `Generalizing Smoothness
Constraints from Discrete Samples', Neural Computation, vol. 2, pages
188--197, 1990.
Yann Le Cun, John S. Denker, Sara A. Solla, `Optimal Brain Damage',
NIPS 2, pages 598--605, 1990.
Jocelyn Sietsma, Robert J. F. Dow, `Creating Artificial Neural Networks
That Generalize', Neural Networks, vol. 4, pages 67--79, 1991.
Andreas S. Weigend, David E. Rumelhart, Bernardo A. Huberman,
`Generalization by Weight-Elimination with Application to Forecasting',
NIPS 3, pages 875--882, 1991.

Hope this is helpful.
Steven
--
Steven Young PhD Student | Dept of Electrical Engineering
email : young@s1.elec.uq.oz.au | University of Queensland
Murphy was an anarchist! | AUSTRALIA 4072 Ph:61+7 3653564