- Comments: Gated by NETNEWS@AUVM.AMERICAN.EDU
- Path: sparky!uunet!zaphod.mps.ohio-state.edu!howland.reston.ans.net!paladin.american.edu!auvm!CCB.BBN.COM!BNEVIN
- Return-Path: <@VMD.CSO.UIUC.EDU:bnevin@ccb.bbn.com>
- Message-ID: <CSG-L%93011110145148@VMD.CSO.UIUC.EDU>
- Newsgroups: bit.listserv.csg-l
- Date: Mon, 11 Jan 1993 11:06:46 EST
- Sender: "Control Systems Group Network (CSGnet)" <CSG-L@UIUCVMD.BITNET>
- From: "Bruce E. Nevin" <bnevin@CCB.BBN.COM>
- Subject: transformations and learning
- Lines: 113
-
- [From: Bruce Nevin (Mon 930111 11:11:11)]
-
- The following outline of ideas from a fellow BBNer, Al Boulanger,
- seems to me like a fruitful direction to look for certain aspects
- of higher level control and reorganization as involved in
- learning at those levels. The header info that comes first
- indicates where I copied this from.
-
- (OK, I cheated on the time stamp above, but it was only a few
- minutes short of what appears there. My not-quite-11-year-old
- would love it.)
-
- Bruce
-
- -=+=-=+=-=+=-=+=-=+=-=+=-=+=-=+=-=+=-=+=-=+=-=+=-=+=-=+=-=+=-=+=-=+=-=+=-
-
- Date: Mon, 04 Jan 93 11:37:55 -0800
- From: Michael Pazzani <pazzani@ics.uci.edu>
- Message-ID: <9301041146.aa27802@q2.ics.uci.edu>
-
-
- Machine Learning List: Vol. 5, No. 1
- Monday, January 4, 1993
-
- ------------------------------
-
- Subject: Minimum Description Length & Transformations in Machine Learning
- From: aboulang@bbn.COM
- Date: Sat, 2 Jan 93 19:00:10 EST
-
- Minimum Description Length & Transformations in Machine Learning
-
- Or, Is there a Principle of Least Action for Machine Learning?
-
- In this short note I want to posit that MDL-like methodologies will
- become the unifying "Least Action Principles" of machine learning.
- Furthermore, machine learning architectures will evolve to include a
- fundamental capability for doing coordinate transformations and this
- capability will be intimately tied to the use of MDL-like
- methodologies in Machine Learning.
-
- By MDL-like methodologies I mean the use of information-theoretic
- metrics on the results of any machine learning algorithm in its
- generalization phase. Such a metric is used as a decision criterion
- for overtraining, by comparing the MDL-like metric of the machine
- learning algorithm's results against that of the data itself.
- MDL-like methodologies are applicable to supervised and unsupervised
- learning. By the term "MDL-like" I mean that there is an applicable
- body of work in this area -- including the work of Wallace, Akaike,
- and Rissanen. It is possible to use MDL-like metrics in the
- generation phase as well.
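- The comparison of code lengths can be sketched concretely. Below is a
- toy illustration (not from the original post): a two-part code length
- in the style of Rissanen, where the exact form of the penalty and the
- precision floor are my assumptions, applied to three models of a
- noisy linear trend.

```python
import math

def two_part_dl(rss, n, k, precision=0.1):
    # Two-part code length in bits (Rissanen-style sketch):
    #   (n/2) * log2(RSS/n)  -- cost of the residuals under a Gaussian code
    #   (k/2) * log2(n)      -- cost of stating k fitted parameters
    # RSS is floored at the measurement precision so that a model which
    # memorizes the data cannot claim an infinitely short code.
    rss = max(rss, n * precision ** 2)
    return 0.5 * n * math.log2(rss / n) + 0.5 * k * math.log2(n)

# Toy data: a clear linear trend plus small alternating "noise".
xs = list(range(20))
ys = [3.0 * x + 1.0 + 0.1 * (-1) ** i for i, x in enumerate(xs)]
n = len(xs)

# Model 1: constant (k = 1 parameter) -- underfits.
ybar = sum(ys) / n
rss_mean = sum((y - ybar) ** 2 for y in ys)

# Model 2: least-squares line (k = 2 parameters).
xbar = sum(xs) / n
sxx = sum((x - xbar) ** 2 for x in xs)
slope = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys)) / sxx
rss_line = sum((y - (ybar + slope * (x - xbar))) ** 2 for x, y in zip(xs, ys))

# Model 3: memorize every point (k = n parameters, RSS = 0) -- overfits.
dl_mean = two_part_dl(rss_mean, n, 1)
dl_line = two_part_dl(rss_line, n, 2)
dl_memo = two_part_dl(0.0, n, n)
```

- The decision against overtraining falls out of the comparison: the
- memorizing model pays more in parameter bits than it saves in
- residual bits, so the line yields the shortest total description.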
-
- Transformations and Machine Learning
-
- Many paradigmatic problems in machine learning become
- "embarrassingly" simple under straightforward coordinate
- transformations. For instance, the two spirals problem becomes two
- simple lines under a polar coordinate transformation. Much of a
- physicist's activity consists in finding an appropriate
- coordinate-system hosting of a problem so as to exploit its
- symmetries. I posit that at least one phase of any machine learning
- system should include a search for an appropriate coordinate-system
- hosting.
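- The two-spirals case can be made concrete. The sketch below is a toy
- version (the particular spiral parameterization and the
- angle-minus-radius feature are my choices, not the original post's):
- in Cartesian coordinates the classes are famously entangled, but after
- a polar transform each arm satisfies a trivial linear relation.

```python
import math

def make_spirals(n=100):
    # Classic two-spirals data: class 1 is class 0 rotated by pi.
    pts = []
    for i in range(n):
        t = 0.5 + 3.0 * math.pi * i / n      # radius grows with angle
        pts.append((t * math.cos(t), t * math.sin(t), 0))
        pts.append((-t * math.cos(t), -t * math.sin(t), 1))
    return pts

def polar_feature(x, y):
    # After the polar transform each spiral arm satisfies
    # angle - radius = const (0 for class 0, pi for class 1).
    r = math.hypot(x, y)
    theta = math.atan2(y, x)
    return (theta - r) % (2.0 * math.pi)

def classify(x, y):
    # Nearest of the two constants on the circle:
    # near 0 -> class 0, near pi -> class 1.
    f = polar_feature(x, y)
    return 0 if min(f, 2.0 * math.pi - f) < abs(f - math.pi) else 1

points = make_spirals()
accuracy = sum(classify(x, y) == label for x, y, label in points) / len(points)
```

- A one-dimensional threshold in the transformed coordinates separates
- the classes perfectly; no learning machinery is needed at all once
- the hosting is right.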
-
- These transformations come in many different colors. For example,
- the temporal-differences method is a relativizing transformation in
- time coordinates. Another example is the growing use of wavelets for
- time-frequency features.
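- On the simplest reading of a relativizing time transformation (plain
- first differencing of a series; a toy sketch, not the TD-learning
- algorithm itself), absolute time coordinates are traded for relative
- ones, and a trend collapses to a constant:

```python
# First differencing: the transform s[t] -> s[t] - s[t-1] replaces
# absolute values with relative (time-local) changes.
series = [2 * t + 5 for t in range(10)]              # linear trend
diffs = [b - a for a, b in zip(series, series[1:])]
# The trend collapses to the constant sequence [2, 2, ..., 2]:
# a far shorter description than the original coordinates allow.
```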
-
- A significant contributor to the complexity of the description of a
- problem is its chosen coordinate-system hosting. Coordinate
- transformations can be of two types: local and global. An example of a
- global transformation is the aforementioned polar hosting for the two
- spirals problem. The Fukushima network makes use of local
- transformations for robust pattern recognition. MDL can be used as
- the selection criterion in the transformation search.
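- A minimal sketch of such an MDL-guided transformation search follows.
- Everything here is my own toy construction: the candidate set
- (identity vs. logarithm), the straight-line "fit model" applied after
- each transform, and the code-length form. The transform that yields
- the shortest description of the series wins.

```python
import math

def dl_of_linear_fit(zs, k_extra, precision=0.1):
    # Description length of a straight-line fit to the (transformed)
    # series: residual cost plus 2 line parameters plus k_extra
    # parameters charged for naming the transform itself.
    n = len(zs)
    xs = list(range(n))
    xbar, zbar = sum(xs) / n, sum(zs) / n
    sxx = sum((x - xbar) ** 2 for x in xs)
    slope = sum((x - xbar) * (z - zbar) for x, z in zip(xs, zs)) / sxx
    rss = sum((z - (zbar + slope * (x - xbar))) ** 2 for x, z in zip(xs, zs))
    rss = max(rss, n * precision ** 2)
    return 0.5 * n * math.log2(rss / n) + 0.5 * (2 + k_extra) * math.log2(n)

# An exponentially growing series: hostile to a linear hosting,
# trivial under a logarithmic one.
ys = [math.exp(0.3 * t) for t in range(20)]
candidates = {
    "identity": (list(ys), 0),
    "log":      ([math.log(y) for y in ys], 1),
}
best = min(candidates, key=lambda name: dl_of_linear_fit(*candidates[name]))
```

- Even after paying the extra parameter cost for the log transform, its
- hosting makes the series exactly linear, so its total description
- length is the smaller of the two.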
-
- MDL as a Least Action Principle for Machine Learning
-
- MDL-like methods hold promise as a unifying principle in machine
- learning -- much as Lagrangian methods, which make use of the action
- and its minimization, are *the* unifying approach in physics, cutting
- across classical physics, relativistic physics, and quantum
- mechanics. MDL-like metrics are a type of *action* for machine
- learning. (In fact, for certain types of search in machine learning,
- Lagrangian optimization can be used.)
-
- (Recent work in machine vision at MIT has suggested the use of MDL
- as a principle for 3-D object recognition and disambiguation. It is
- posited that what is perceived is related to an MDL description of
- the 3-D scene. By the way, who is doing this work?)
-
- There are a couple of long-standing conceptual issues in machine learning:
-
- The relationship between learning methodologies - supervised,
- unsupervised, reinforcement learning, etc. Somehow, one would like a
- unifying framework for all of them. The fact that MDL-like methods
- can be used in several methodologies means that it could help in
- building such a framework.
-
- The relationship between optimization and machine learning. MDL-like
- metrics are posited to be the *general* optimization criterion for
- machine learning.
-
- MDL has broad applicability in machine learning. It can be used to
- guide search in both unsupervised and supervised learning. It can be
- used as the common optimization criterion for "multi-algorithm machine
- learning systems". Finally, it can be used to tie the search in
- feature space to the search for a coordinate-system hosting.
-
-
- Seeking a higher form for machine learning,
- Albert Boulanger
- aboulanger@bbn.com
-