NetNews Usenet Archive 1992 #16

Text File | 1992-07-21 | 2.5 KB | 56 lines

Newsgroups: comp.ai.neural-nets
Path: sparky!uunet!europa.asd.contel.com!darwin.sura.net!wupost!gumby!destroyer!ubc-cs!unixg.ubc.ca!kakwa.ucs.ualberta.ca!alberta!arms
From: arms@cs.UAlberta.CA (Bill Armstrong)
Subject: Re: need for unique test sets
Message-ID: <arms.711754972@spedden>
Sender: news@cs.UAlberta.CA (News Administrator)
Nntp-Posting-Host: spedden.cs.ualberta.ca
Organization: University of Alberta, Edmonton, Canada
References: <1992Jul19.070433.5896@afterlife.ncsc.mil> <25633@life.ai.mit.edu> <arms.711688181@spedden> <25663@life.ai.mit.edu>
Date: Tue, 21 Jul 1992 21:42:52 GMT
Lines: 43

marcus@sun-of-smokey.NoSubdomain.NoDomain (Jeff Marcus) writes:

... lots of deletions.

>However, the more test data I have, the more sure I am that the
>result on the test data is a good predictor of the classifier's
>performance on new data. This can be quantified with a confidence
>interval on my classification error. This scheme is only inadequate
>in the sense that I can never achieve a zero confidence bound. But I
>don't see why that is important.
>Of course, all this assumes that I have a good way of
>selecting a representative sample in my test set, a non-trivial problem
>in its own right.
>
> you can frame performance estimation as a statistical estimation
>problem and the same arguments I just made would apply: namely,
>that as you get more data, you get more confident of your result, but
>you can never be sure that your result is exactly right.

I agree with you, that we have a statistical estimation problem, but
my point is that people shouldn't have to be happy with just a
statistical estimate of the probability of error. Sometimes, as in
cases where safety is critical, you would like to have a system that
can be proven to do the right thing in all cases, even though the
number of cases is astronomical, far beyond the possibilities for
testing.

Take software development, for example. Would you accept testing
alone as the basis for declaring a piece of software safe? Say it is
to be used in a nuclear reactor control system. What about the
10000000-th path through the code that you didn't test for?

What I want to propose is that we in the NN field have to develop NNs
that are at least as reliable as commercial software, and preferably
even more reliable.

--
***************************************************
Prof. William W. Armstrong, Computing Science Dept.
University of Alberta; Edmonton, Alberta, Canada T6G 2H1
arms@cs.ualberta.ca Tel(403)492 2374 FAX 492 1071
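
Editorial illustration (not part of the original post): the point in the quoted text that testing can never yield a zero confidence bound can be made concrete with a small calculation. The sketch below assumes the test cases are independent and drawn from the same distribution the system will see in use (the "representative sample" caveat raised in the quoted text), and that no errors at all were observed. The function name zero_error_upper_bound is hypothetical. It solves (1 - p)^n = 1 - confidence for p, which for 95% confidence is roughly 3/n.

```python
import math


def zero_error_upper_bound(n_tests: int, confidence: float = 0.95) -> float:
    """Upper confidence bound on the true error rate after n_tests
    independent test cases with zero observed errors.

    Assumption: test cases are i.i.d. samples from the deployment
    distribution. The bound p solves (1 - p)**n_tests = 1 - confidence.
    """
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must lie strictly between 0 and 1")
    alpha = 1.0 - confidence
    # 1 - alpha**(1/n), written with expm1 for accuracy when n is large.
    return -math.expm1(math.log(alpha) / n_tests)


if __name__ == "__main__":
    # The bound shrinks as the test set grows, but never reaches zero.
    for n in (100, 10_000, 1_000_000):
        print(f"{n:>9} error-free tests -> error rate < "
              f"{zero_error_upper_bound(n):.3g} (95% confidence)")
```

Under these assumptions the bound falls only in proportion to 1/n, so a statistical guarantee over an astronomical number of cases would require an astronomical number of tests, which is the gap between estimation and proof that the post is pointing at.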