Path: sparky!uunet!zaphod.mps.ohio-state.edu!magnus.acs.ohio-state.edu!usenet.ins.cwru.edu!agate!overload.lbl.gov!lll-winken!tazdevil!henrik
From: henrik@mpci.llnl.gov (Henrik Klagges)
Newsgroups: comp.ai.neural-nets
Subject: How to correctly measure time series generalization (?)
Message-ID: <?.711737203@tazdevil>
Date: 21 Jul 92 16:46:43 GMT
Sender: usenet@lll-winken.LLNL.GOV
Lines: 30
Nntp-Posting-Host: tazdevil.llnl.gov

Suppose a time series is available, from t=0 to t=99. How should these
datapoints be partitioned into training and test sets? (It is assumed that a
training/test vector at time t is made up of (t-k, t-k+1, ..., t) as inputs
and (t+1) as the target output.)
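
For concreteness, a minimal sketch of how such vectors could be built
(Python with NumPy, purely for illustration; 'make_vectors' is my name):

import numpy as np

def make_vectors(series, k):
    # Input vector at time t is (series[t-k], ..., series[t]);
    # the target is series[t+1].
    inputs, targets = [], []
    for t in range(k, len(series) - 1):
        inputs.append(series[t - k : t + 1])
        targets.append(series[t + 1])
    return np.array(inputs), np.array(targets)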

a) Use 0->X for training, (X+1)->99 for testing;
b) Use n randomly selected t's for training, and the rest for testing.

'a' is straightforward extrapolation, but it fails if the training 'window'
of the time series is too small to capture all major cycles in the series (a
cycle not visible in 0->X only shows up in (X+1)->99). This is likely the
case with, e.g., sunspot data. If the network's approximation captures the
complete problem mechanics, though, a solution trained under 'a' works
satisfactorily for any time t.
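
In the sketch above, 'a' is just a cut at a fixed position (again, 'split_a'
and its signature are my own illustration):

def split_a(inputs, targets, cut):
    # Scheme (a): vectors before 'cut' train, the rest test.
    return (inputs[:cut], targets[:cut]), (inputs[cut:], targets[cut:])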

'b' is a bit of a cheat - it is interpolation. However, for practical purposes
like forecasting only the next ('99+1') sunspot, it may be more reliable than
'a'.
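
'b' in the same sketch ('split_b' is again my own name; the seed is only
there to make the random split reproducible):

def split_b(inputs, targets, n, seed=0):
    # Scheme (b): n randomly selected vectors train, the rest test.
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(inputs))
    train, test = order[:n], order[n:]
    return (inputs[train], targets[train]), (inputs[test], targets[test])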

Comments, please.

Cheers, Henrik

massively parallel group at Lawrence Livermore
IBM Research physics group Munich