- Path: sparky!uunet!olivea!decwrl!cache.crc.ricoh.com!crc.ricoh.com!wolff
- From: wolff@crc.ricoh.com (Gregory J. Wolff)
- Newsgroups: comp.ai.neural-nets
- Subject: Paper available on Neuroprose archive: Stork.obs.ps.Z
- Keywords: second derivatives, pruning, Hessian, neural-networks
- Message-ID: <1992Sep14.161526.749@crc.ricoh.com>
- Date: 14 Sep 92 16:15:26 GMT
- Sender: news@crc.ricoh.com (USENET News System)
- Reply-To: wolff@crc.ricoh.com (Gregory J. Wolff)
- Organization: RICOH California Research Center
- Lines: 65
- Nntp-Posting-Host: styx.crc.ricoh.com
-
-
- The following paper has been placed on the neuroprose archive as
- stork.obs.ps.Z and is available via anonymous ftp (from
- archive.cis.ohio-state.edu in the pub/neuroprose directory).
- This paper will be presented at NIPS-92.
-
- =========================================================================
- Second Order Derivatives for Network Pruning:
- Optimal Brain Surgeon
-
- Babak Hassibi and David G. Stork, Ricoh California Research Center
-
- ABSTRACT: We investigate the use of information from all second order
- derivatives of the error function to perform network pruning (i.e.,
- removing unimportant weights from a trained network) in order to improve
- generalization and increase the speed of further training. Our method,
- Optimal Brain Surgeon (OBS), is significantly better than
- magnitude-based methods, which can often remove the wrong weights. OBS
- also represents a major improvement over other methods, such as Optimal
- Brain Damage [Le Cun, Denker and Solla, 1990], because ours uses the
- full off-diagonal information of the Hessian matrix H. Crucial to OBS
- is a recursion relation for calculating H inverse from training data and
- structural information of the net. We illustrate OBS on standard
- benchmark problems: the MONK's problems. The most successful method in
- a recent competition in machine learning [Thrun et al., 1991] was
- backpropagation using weight decay, which yielded a network with 58
- weights for one MONK's problem. OBS requires only 14 weights for the
- same performance accuracy. On two other MONK's problems, our method
- required only 38% and 10% of the weights found by magnitude-based pruning.
-
- ===========================================================================
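
To make the abstract concrete, here is a minimal sketch of a single OBS
pruning step in plain Python. It is an illustration, not the authors' code.
It assumes the standard OBS formulation: the saliency of weight q is
L_q = w_q^2 / (2 [H^-1]_qq), and the remaining weights are adjusted by
dw = -(w_q / [H^-1]_qq) H^-1 e_q. The inverse Hessian is built by a
rank-one recursion over per-pattern gradient vectors. The function names,
the `alpha` default, and the toy numbers are assumptions made for this
example.

```python
def inverse_hessian(grads, alpha=1e-4):
    """Build H^-1 recursively for H = alpha*I + (1/P) * sum_k x_k x_k^T.

    grads: list of P gradient vectors x_k (network output w.r.t. weights).
    alpha: small positive constant so that the starting H is invertible
           (assumed value, tune for the problem at hand).
    """
    p = len(grads)
    n = len(grads[0])
    # Start from H_0^-1 = (1/alpha) * I.
    h_inv = [[(1.0 / alpha if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    for x in grads:
        # hx = H^-1 x; since H^-1 is symmetric, x^T H^-1 is its transpose.
        hx = [sum(h_inv[i][j] * x[j] for j in range(n)) for i in range(n)]
        denom = p + sum(x[i] * hx[i] for i in range(n))
        # Sherman-Morrison rank-one update of the inverse.
        h_inv = [[h_inv[i][j] - hx[i] * hx[j] / denom for j in range(n)]
                 for i in range(n)]
    return h_inv


def obs_prune_step(weights, h_inv):
    """Delete the weight with the smallest saliency and adjust the rest.

    Returns (index of pruned weight, its saliency, updated weight list).
    """
    n = len(weights)
    saliency = [weights[q] ** 2 / (2.0 * h_inv[q][q]) for q in range(n)]
    q = min(range(n), key=lambda i: saliency[i])
    scale = weights[q] / h_inv[q][q]
    # dw = -scale * (column q of H^-1); this drives w_q exactly to zero.
    new_weights = [weights[i] - scale * h_inv[i][q] for i in range(n)]
    new_weights[q] = 0.0  # remove floating-point residue
    return q, saliency[q], new_weights


if __name__ == "__main__":
    # Toy example: H = [[2, 1], [1, 2]], so H^-1 = (1/3) * [[2, -1], [-1, 2]].
    h_inv = [[2.0 / 3.0, -1.0 / 3.0], [-1.0 / 3.0, 2.0 / 3.0]]
    print(obs_prune_step([1.0, 0.5], h_inv))
```

In the toy example the second weight is removed and the first is adjusted
to compensate. A magnitude-based method would delete the same weight but
leave the other one unchanged. The sketch covers one step only: repeated
pruning would also need to exclude weights that are already deleted and to
recompute H^-1 for the reduced network.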
-
-
- Here is an example of how to retrieve this file:
-
- gvax> ftp archive.cis.ohio-state.edu
- Connected to archive.cis.ohio-state.edu.
- 220 archive.cis.ohio-state.edu FTP server ready.
- Name: anonymous
- 331 Guest login ok, send ident as password.
- Password: neuron@wherever
- 230 Guest login ok, access restrictions apply.
- ftp> binary
- 200 Type set to I.
- ftp> cd pub/neuroprose
- 250 CWD command successful.
- ftp> get stork.obs.ps.Z
- 200 PORT command successful.
- 150 Opening BINARY mode data connection for stork.obs.ps.Z
- 226 Transfer complete.
- 100000 bytes sent in 3.14159 seconds
- ftp> quit
- 221 Goodbye.
- gvax> uncompress stork.obs.ps
- gvax> lpr stork.obs.ps
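
For readers who prefer to script the retrieval, the session above can be
sketched with Python's standard ftplib module. This is an assumption-laden
illustration: the function name, its parameters, and the `ftp_factory` hook
are hypothetical, and the host, directory, and password string are simply
those shown in the transcript. The server may no longer be reachable.

```python
import os
from ftplib import FTP


def fetch_paper(filename, dest_dir=".",
                host="archive.cis.ohio-state.edu",
                directory="pub/neuroprose",
                email="neuron@wherever",
                ftp_factory=FTP):
    """Download `filename` by anonymous FTP and return the local path.

    ftp_factory is a hypothetical hook so that a stand-in class can replace
    ftplib.FTP when no network is available.
    """
    path = os.path.join(dest_dir, filename)
    ftp = ftp_factory(host)             # "ftp archive.cis.ohio-state.edu"
    try:
        ftp.login("anonymous", email)   # Name: anonymous / Password: ...
        ftp.cwd(directory)              # "cd pub/neuroprose"
        with open(path, "wb") as out:
            # retrbinary transfers in binary (image) mode, like "binary".
            ftp.retrbinary("RETR " + filename, out.write)
    finally:
        ftp.quit()                      # "quit"
    return path
```

Calling fetch_paper("stork.obs.ps.Z") would mirror the "get" command in the
transcript. The downloaded file is still in compress (.Z) format, so the
uncompress step shown above is needed before printing.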
-
-
- --
- Gregory J. Wolff
- Ricoh California Research Center
- 2882 Sand Hill Rd. Suite 115
- Menlo Park, CA 94025-7022
- wolff@crc.ricoh.com
- (415) 496-5718
- fax: (415) 854-8740
-