Path: sparky!uunet!zaphod.mps.ohio-state.edu!uakari.primate.wisc.edu!sdd.hp.com!mips!darwin.sura.net!haven.umd.edu!mimsy!afterlife!hcbarth
From: hcbarth@afterlife.ncsc.mil (Bart Bartholomew)
Newsgroups: comp.ai.neural-nets
Subject: Re: re:need for unique test sets
Message-ID: <1992Jul22.031319.15531@afterlife.ncsc.mil>
Date: 22 Jul 92 03:13:19 GMT
References: <1992Jul19.070433.5896@afterlife.ncsc.mil> <25633@life.ai.mit.edu>
Organization: The Great Beyond
Lines: 93

In article <25633@life.ai.mit.edu> marcus@goldilocks.lcs.mit.edu (Jeff Marcus) writes:
>In article <1992Jul19.070433.5896@afterlife.ncsc.mil>,
>hcbarth@afterlife.ncsc.mil (Bart Bartholomew) writes:
>|>
>|> If you want to have any confidence that your network has
>|> deduced the correct generating function from the training set,
>|> you must be sure that the test set has no members from the
>|> training set. Ponder: If the net gets some number correct on
>|> the training set, and if the test set has some members from the
>|> training set, then the apparent performance metric on the test
>|> set will be skewed by the contribution of the members of the
>|> training set.
>|> Since there is a large, possibly infinite set of
>|> functions that can generate any data set, the question we pose
>|> to the test set is whether we have found the correct function.
>|> The answer to that question is measured by how well the net can
>|> 'generalize' - can the function that the net found to explain the
>|> training set also explain the test set?
>|> Clearly, the larger both sets are (up to the point where
>|> all possible members of the function are included), the more
>|> confidence we have in the answer.
>|> So, I disagree with Prof Armstrong and you about the need
>|> for having no overlap between the training and test sets.
>|> Sincerely,
>|> Bart
>|>
>|> --
>
>I stand by my original comments. I don't know what you mean by
>"a contribution by some members of the training set." If the two
>sets are chosen independently, there is no contribution; the test
>set is just reflecting the fact that it is possible to have data
>in the population that has been captured in the training set. If
>the two sets are not chosen independently, you are doing something
>wrong.
>
>One thing that I am unclear about is:
>
>Are you drawing samples from a finite, discrete-valued population
>or a continuous-valued one? If the latter, you don't have to worry
>about having the exact same sample in the test set. If the former,
>then your argument would imply that the more training data you
>have, the smaller the allowable test set for testing your network.
>So let's say you include all possible test patterns but one in
>your training set. Then by your argument, the test set can consist
>of only one pattern. Does the performance on this one pattern give
>a better indication of the network's performance, or does the
>performance on some test set that reflects the distribution of
>what you are likely to see in using the network? It's obvious
>that the latter is better.
>
> Jeff

Let me try again.
    If you have trained the net on the training set, and the net
gets all the answers right according to some arbitrary measure (not
necessarily MSE), AND if your test set contains some of the same
input/output pairs as the training set, then the net will always
get those pairs right, making the apparent success on the test set
look better than it really is. Unless, of course, it gets all of
the test set right, in which case the point is probably moot.
    The point of having a test set (taken from the same source as
the training set) is to make sure the net has found the right (or
an equivalent) function.
    On the other hand, if the net does well on the training set but
falls apart (scores badly) on the test set, you know that the net
has found a nice function that describes the training set well, but
*is not the function that actually generated the data*. In that
case, the net is worthless. If it scores pretty well on the test
set, but not as well as on the training set, then you have probably
found a first cousin of the correct function and may be able to
coax the net into finding the correct one.
    About the second point (the size of the training and test
sets) - some functions have too many possible points to make it
practical to train on even a small fraction of them, so you can
have very large sets for both training and testing. Generally, I
put most of my eggs in the training set and hold out 10%-25% for
testing. Bear in mind that you may not know what the generating
function is, yet still have a reasonable estimate of its
dimensions. Indeed, to have any hope of succeeding, the input
layer must be large enough to be sure that the output is dependent
on the input.
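    The 10%-25% holdout scheme might look like this (an
illustrative sketch; the function name, fraction, and toy data are
assumptions, not from the post). Duplicate input/output pairs are
dropped first so that the two sets cannot share a member:

```python
import random

def holdout_split(pairs, test_fraction=0.2, seed=0):
    """Split input/output pairs into disjoint training and test sets."""
    pairs = list(dict.fromkeys(pairs))     # drop duplicate pairs first
    rng = random.Random(seed)
    rng.shuffle(pairs)                     # shuffle once, reproducibly
    n_test = max(1, int(len(pairs) * test_fraction))
    return pairs[n_test:], pairs[:n_test]  # (training set, test set)

data = [(x, x % 3) for x in range(100)]    # toy generating function
train, test = holdout_split(data)
assert not set(train) & set(test)          # no overlap between the sets
```

Shuffling before the split keeps both sets drawn from the same
source, which is the condition stated above for the test to mean
anything.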
Bart
--
"It's not the thing you fling, the fling's the thing." - Chris Stevens
If there's one thing I just can't stand, it's intolerance.
*No One* is responsible for my views, I'm a committee. Please do not
infer that which I do not imply. hcbarth@afterlife.ncsc.mil