NetNews Usenet Archive 1993 #1

Text File | 1993-01-11 | 5.4 KB | 125 lines

Comments: Gated by NETNEWS@AUVM.AMERICAN.EDU
Path: sparky!uunet!zaphod.mps.ohio-state.edu!howland.reston.ans.net!paladin.american.edu!auvm!CCB.BBN.COM!BNEVIN
Return-Path: <@VMD.CSO.UIUC.EDU:bnevin@ccb.bbn.com>
Message-ID: <CSG-L%93011110145148@VMD.CSO.UIUC.EDU>
Newsgroups: bit.listserv.csg-l
Date: Mon, 11 Jan 1993 11:06:46 EST
Sender: "Control Systems Group Network (CSGnet)" <CSG-L@UIUCVMD.BITNET>
From: "Bruce E. Nevin" <bnevin@CCB.BBN.COM>
Subject: transformations and learning
Lines: 113

[From: Bruce Nevin (Mon 930111 11:11:11)]

The following outline of ideas from a fellow BBNer, Al Boulanger, seems
to me like a fruitful direction to look for certain aspects of higher
level control and reorganization as involved in learning at those
levels. The header info that comes first indicates where I copied this
from.

(OK, I cheated on the time stamp above, but it was only a few minutes
short of what appears there. My not-quite-11-year-old would love it.)

        Bruce

-=+=-=+=-=+=-=+=-=+=-=+=-=+=-=+=-=+=-=+=-=+=-=+=-=+=-=+=-=+=-=+=-=+=-=+=-

Date: Mon, 04 Jan 93 11:37:55 -0800
From: Michael Pazzani <pazzani@ics.uci.edu>
Message-ID: <9301041146.aa27802@q2.ics.uci.edu>

Machine Learning List: Vol. 5, No. 1
Monday, January 4, 1993

------------------------------

Subject: Minimum Description Length & Transformations in Machine Learning
From: aboulang@bbn.COM
Date: Sat, 2 Jan 93 19:00:10 EST

Minimum Description Length & Transformations in Machine Learning
Or,
Is there a Principle of Least Action for Machine Learning?

In this short note I want to posit that MDL-like methodologies will
become the unifying "Least Action Principles" of machine learning.
Furthermore, machine learning architectures will evolve to include a
fundamental capability for doing coordinate transformations and this
capability will be intimately tied to the use of MDL-like methodologies
in Machine Learning.

By MDL-like methodologies I mean the use of information-theoretic
metrics on the results of any machine learning algorithm in its
generalization phase. This metric is used as a decision criterion for
overtraining by comparing the MDL-like metric of the results of the
machine learning algorithm against the data itself. MDL-like
methodologies are applicable to supervised and unsupervised learning.
What I want to mean by the term "MDL-like" is that there is an
applicable body of work in this area -- including the work of Wallace,
Akaike and Rissanen. It is possible to use MDL-like metrics in the
generation phase as well.

Transformations and Machine Learning

Many paradigmatic problems in machine learning become "embarrassingly"
simple under straightforward coordinate transformations. For instance,
the two spirals problem becomes two simple lines under a polar
coordinate transformation. Much of a physicist's activity lies in
examining which coordinate-system hosting of a problem best exploits
the symmetries of that problem. I posit that at least one phase of any
machine learning system should include a search for an appropriate
coordinate-system hosting.

These transformations come in many different colors. For example,
taking temporal differences is a relativizing transformation in time
coordinates. Another example is the growing use of wavelets for
time-frequency features. A significant contributor to the complexity of
the description of a problem is its chosen coordinate-system hosting.

Coordinate transformations can be of two types: local and global. An
example of a global transformation is the aforementioned polar hosting
for the two spirals problem. The Fukushima network makes use of local
transformations for robust pattern recognition. MDL can be used as the
selection criterion in the transformation search.
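
The two-spirals remark and the use of MDL as a selection criterion can
be sketched together. What follows is only a minimal illustration under
assumptions that are not in the note: Archimedean spirals with
r = theta, a one-parameter sign rule as the "learner", a flat 16 bits
per parameter, and a crude two-part code that charges for listing the
misclassified points. All function names are hypothetical.

```python
import math


def two_spirals(n_per_arm=100, turns=3.0):
    """Return (x, y, label) points on two interleaved Archimedean spirals."""
    points = []
    for i in range(1, n_per_arm + 1):
        theta = turns * 2.0 * math.pi * i / n_per_arm
        r = theta  # assumed spiral shape: r = theta
        x, y = r * math.cos(theta), r * math.sin(theta)
        points.append((x, y, 0))    # arm 0
        points.append((-x, -y, 1))  # arm 1 is arm 0 rotated by pi
    return points


def description_length(n_points, n_errors, n_params, bits_per_param=16):
    """Crude two-part code length in bits (an assumed, simplified scheme).

    Model cost plus the cost of saying how many points are exceptions
    and which ones they are.
    """
    model_bits = n_params * bits_per_param
    count_bits = math.log2(n_points + 1)
    which_bits = math.log2(math.comb(n_points, n_errors))
    return model_bits + count_bits + which_bits


def sign_rule_errors(values, labels):
    """Errors of 'label 0 if value > 0', using the better of both polarities."""
    wrong = sum(1 for v, lab in zip(values, labels) if (0 if v > 0 else 1) != lab)
    return min(wrong, len(labels) - wrong)


points = two_spirals()
labels = [lab for _, _, lab in points]

# Two candidate hostings of the same data: the raw x coordinate, and
# cos(r - phi) after a polar transformation. On arm 0, r - phi is a
# multiple of 2*pi; on arm 1 it is offset by pi.
cart_values = [x for x, _, _ in points]
polar_values = [math.cos(math.hypot(x, y) - math.atan2(y, x)) for x, y, _ in points]

cart_errors = sign_rule_errors(cart_values, labels)
polar_errors = sign_rule_errors(polar_values, labels)

raw_bits = len(labels)  # baseline: send each binary label as one bit
dl_cart = description_length(len(labels), cart_errors, n_params=1)
dl_polar = description_length(len(labels), polar_errors, n_params=1)

print(f"raw labels: {raw_bits} bits")
print(f"cartesian : {cart_errors} errors, {dl_cart:.1f} bits")
print(f"polar     : {polar_errors} errors, {dl_polar:.1f} bits")
```

Under these assumptions the polar rule has no exceptions, so its code
length is just the model cost plus the exception count. The Cartesian
rule has to list roughly half the points as exceptions and comes out
longer than sending the labels raw, which is the kind of comparison
against the data itself that the note describes.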

MDL as a Least Action Principle for Machine Learning

MDL-like methods hold promise as a unifying principle in machine
learning -- much as Lagrangian methods, which make use of action and
its minimization, are *the* unifying approach in physics, cutting
across classical physics, relativistic physics, and quantum mechanics.
MDL-like metrics are a type of *action* for machine learning. (In fact,
for certain types of search in machine learning, Lagrangian
optimization can be used.)

(Recent work in machine vision at MIT has suggested the use of MDL as a
principle for 3-D object recognition and disambiguation. It is posited
that what is perceived is related to an MDL description of the 3-D
scene. By the way, who is doing this work?)

There are a couple of long-standing conceptual issues in machine
learning:

    The relationship between learning methodologies - supervised,
    unsupervised, reinforcement learning, etc. Somehow, one would like
    a unifying framework for all of them. The fact that MDL-like
    methods can be used in several methodologies means that they could
    help in building such a framework.

    The relationship between optimization and machine learning.
    MDL-like metrics are posited to be the *general* optimization
    criterion for machine learning.

MDL has broad applicability in machine learning. It can be used to
guide search in both unsupervised and supervised learning. It can be
used as the common optimization criterion for "multi-algorithm machine
learning systems". Finally, it can be used to tie the search in feature
space to the search for a coordinate-system hosting.

Seeking a higher form for machine learning,
Albert Boulanger
aboulanger@bbn.com