- Newsgroups: comp.ai.neural-nets
- Path: sparky!uunet!europa.asd.contel.com!darwin.sura.net!wupost!gumby!destroyer!ubc-cs!unixg.ubc.ca!kakwa.ucs.ualberta.ca!alberta!arms
- From: arms@cs.UAlberta.CA (Bill Armstrong)
- Subject: Re: need for unique test sets
- Message-ID: <arms.711754972@spedden>
- Sender: news@cs.UAlberta.CA (News Administrator)
- Nntp-Posting-Host: spedden.cs.ualberta.ca
- Organization: University of Alberta, Edmonton, Canada
- References: <1992Jul19.070433.5896@afterlife.ncsc.mil> <25633@life.ai.mit.edu> <arms.711688181@spedden> <25663@life.ai.mit.edu>
- Date: Tue, 21 Jul 1992 21:42:52 GMT
- Lines: 43
-
- marcus@sun-of-smokey.NoSubdomain.NoDomain (Jeff Marcus) writes:
- ... lots of deletions.
-
- >However, the more test data I have, the more sure I am that the result
- >on the test data is a good predictor of the classifier's performance on
- >new data. This can be quantified with a confidence interval on my
- >classification error. This scheme is only inadequate in the sense that
- >I can never achieve a zero confidence bound. But I don't see why that
- >is important.
-
- >Of course, all this assumes that I have a good way of
- >selecting a representative sample in my test set, a non-trivial problem
- >in its own right.
-
- > you can frame performance estimation as a statistical estimation
- >problem and the same arguments I just made would apply: namely, that
- >as you get more data, you get more confident of your result, but you
- >can never be sure that your result is exactly right.
-
- I agree with you that we have a statistical estimation problem, but
- my point is that people shouldn't have to settle for just a statistical
- estimate of the probability of error. Sometimes, as in safety-critical
- applications, you would like a system that can be proven to do the
- right thing in all cases, even though the number of cases is
- astronomical, far beyond what testing could ever cover.
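
To make the "statistical estimate of the probability of error" concrete, here is a minimal Python sketch of a confidence interval on a classifier's error rate. It is not from either poster: the function name wilson_interval, the choice of the Wilson score interval, the 95% level (z = 1.96), and the sample sizes are all assumptions made for illustration, and it presumes the test cases are independent draws from the population of interest.

```python
import math


def wilson_interval(errors, n, z=1.96):
    """Wilson score interval for a binomial error rate.

    errors: number of misclassified test cases (hypothetical input)
    n:      number of independent test cases
    z:      normal quantile; 1.96 corresponds to roughly 95% confidence
    """
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0 <= errors <= n:
        raise ValueError("errors must be between 0 and n")
    p_hat = errors / n
    denom = 1.0 + z * z / n
    centre = (p_hat + z * z / (2 * n)) / denom
    half = z * math.sqrt(p_hat * (1 - p_hat) / n + z * z / (4 * n * n)) / denom
    return max(0.0, centre - half), min(1.0, centre + half)


if __name__ == "__main__":
    # Same observed error rate (5%), increasingly large test sets:
    # the interval narrows but its width never reaches zero.
    for n in (100, 10_000, 1_000_000):
        lo, hi = wilson_interval(n // 20, n)
        print(f"n={n:>9}: [{lo:.6f}, {hi:.6f}]  width={hi - lo:.6f}")

    # Zero observed errors still leaves a nonzero upper bound.
    lo, hi = wilson_interval(0, 1_000_000)
    print(f"0 errors in 1,000,000 tests: upper bound = {hi:.2e}")
```

The last case is the one that matters for safety-critical use: even a million error-free tests only bound the error rate above by a small positive number, which is a different kind of statement from a proof that covers all cases.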
-
- Take software development, for example. Would you accept testing
- alone as the basis for declaring a piece of software safe? Say it is
- to be used in a nuclear reactor control system. What about the
- 10,000,000th path through the code, the one you didn't test? What I
- want to propose is that we in the NN field must develop NNs that are
- at least as reliable as commercial software, and preferably even more
- reliable.
-
-
- --
- ***************************************************
- Prof. William W. Armstrong, Computing Science Dept.
- University of Alberta; Edmonton, Alberta, Canada T6G 2H1
- arms@cs.ualberta.ca Tel(403)492 2374 FAX 492 1071
-