- Path: sparky!uunet!newsserver.pixel.kodak.com!kodak!eastman!b56vxg.kodak.com!ekdug
- From: ekdug@b56vxg.kodak.com (James Cox)
- Newsgroups: sci.math.stat
- Subject: Comparison Of Model With Historical Data
- Message-ID: <3SEP199216491106@b56vxg.kodak.com>
- Date: 3 Sep 92 21:49:00 GMT
- Sender: news@eastman.UUCP
- Organization: Eastman Kodak Company, Rochester NY
- Lines: 48
- News-Software: VAX/VMS VNEWS 1.41
-
-
- We have purchased a simplified model of a complex kinetic chemical reaction.
- This model uses as inputs data from an industrial-scale process (and the data
- from that process has all the usual errors associated with real-life data:
- noise, drift, bias, etc.).
-
- The model produces six outputs (concentrations) that can be directly compared
- to composition analyses from the plant (which are also subject to all sorts
- of errors). There is one tuning parameter (that has a semi-theoretical basis)
- for the model that adjusts the production of one key component (and indirectly
- all the other concentrations - there are a total of 19 predicted compositions).
-
- The problem is that when we tune the model to match the plant's production of
- the key component at one set of (historical) operating conditions, it cannot
- accurately predict even the key component's concentration at other sets of
- operating conditions.
-
- Naturally, we think that the model has been oversimplified to the point of
- unusability, while the view of the company that developed the model is that
- our plant data is too noisy. Their suggested approach is to use an average
- of the plant data inputs to develop an average tuning parameter value and use
- that. Their justification for this is that the variation in the predicted
- compositions from the model with an "average" tuning parameter should be less
- than the variation in the plant's measured compositions. (Even if the
- variations do turn out this way, I don't think it would take a model with a
- theoretical basis to get this sort of behavior!)
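The parenthetical point above can be made concrete with a small sketch. A predictor that always returns the historical mean has zero variation, so "less variation than the plant data" is a test that even a model with no content passes. One common alternative, which is an assumption here and not something either party proposed, is to score prediction error against that constant-mean baseline. The function names and the numbers are hypothetical.

```python
import statistics

def mse(predicted, measured):
    """Mean squared error between two equal-length sequences."""
    return statistics.mean([(p - m) ** 2 for p, m in zip(predicted, measured)])

def skill_vs_mean(predicted, measured):
    """Return 1 - MSE(model) / MSE(constant-mean baseline).

    A value above 0 means the model predicts better than simply quoting the
    historical mean; 0 or below means it does not.
    Undefined (division by zero) if the measured values are all identical.
    """
    baseline = [statistics.mean(measured)] * len(measured)
    return 1.0 - mse(predicted, measured) / mse(baseline, measured)

if __name__ == "__main__":
    # Made-up key-component measurements at four operating conditions.
    measured = [1.0, 2.0, 3.0, 4.0]
    # A "model" that ignores operating conditions entirely.
    flat = [2.5, 2.5, 2.5, 2.5]
    print("variation of flat predictions:", statistics.pvariance(flat))
    print("variation of plant data:      ", statistics.pvariance(measured))
    print("skill of flat predictions:    ", skill_vs_mean(flat, measured))
```

The flat predictions vary less than the plant data yet have no skill, which is the objection raised in the parenthetical.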
-
- We can analyze the variation in a large set of plant data. We can also tune
- the model to match the key component concentration for every point in the set
- of data. If we then define a function which is the sum of the differences
- between the six plant concentrations and the six predicted ones, we would know
- the variation in the defined function, the variation in the tuning parameter
- and the variation in the plant data inputs to the model.
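The bookkeeping described above might be sketched as follows. Everything here is an assumption made for illustration: `toy_model` is a hypothetical two-input stand-in for the purchased model (which is not available), tuning is done by bisection on the assumption that the key component increases monotonically with the tuning parameter, and the defined function is taken as the sum of absolute differences so that positive and negative errors do not cancel.

```python
import random
import statistics

def toy_model(inputs, k):
    """Hypothetical stand-in for the purchased model; returns 6 concentrations.

    Output 0 is the key component and increases monotonically with k.
    """
    temp, flow = inputs
    key = k * temp / (1.0 + flow)
    # The other five outputs depend on k only through the key component.
    others = [(1.0 - key) * w for w in (0.30, 0.25, 0.20, 0.15, 0.10)]
    return [key] + others

def tune(inputs, key_measured, lo=0.0, hi=10.0, tol=1e-10):
    """Bisect on k until the predicted key component matches the plant value."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if toy_model(inputs, mid)[0] < key_measured:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def discrepancy(predicted, measured):
    """Sum of absolute differences over the six concentrations."""
    return sum(abs(p - m) for p, m in zip(predicted, measured))

def per_point_summary(records):
    """records: list of (inputs, six measured concentrations) pairs."""
    ks, ds = [], []
    for inputs, measured in records:
        k = tune(inputs, measured[0])  # tune to this point's key component
        ks.append(k)
        ds.append(discrepancy(toy_model(inputs, k), measured))
    input_columns = zip(*(inputs for inputs, _ in records))
    return {
        "k_mean": statistics.mean(ks),
        "k_stdev": statistics.stdev(ks),
        "d_mean": statistics.mean(ds),
        "d_stdev": statistics.stdev(ds),
        "input_stdev": [statistics.stdev(col) for col in input_columns],
    }

if __name__ == "__main__":
    random.seed(0)
    records = []
    for _ in range(200):
        inputs = (random.uniform(1.0, 2.0), random.uniform(0.5, 1.5))
        # Synthetic "plant": the true parameter drifts with the first input,
        # which a single fixed k cannot capture, plus measurement noise.
        k_true = 0.3 + 0.1 * (inputs[0] - 1.5)
        measured = [m + random.gauss(0.0, 0.005) for m in toy_model(inputs, k_true)]
        records.append((inputs, measured))
    print(per_point_summary(records))
```

This only produces the spreads the paragraph mentions; it does not by itself settle whether comparing them is statistically valid.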
-
- What we would like to say, based on the above information, is something like:
-
- "Ah hah! Since the variation in the tuning parameter is larger (or has a
- larger effect on the defined function) than any of the variations in the plant
- data, obviously the model is not valid!"
-
- Is there any way to reach this conclusion in a valid statistical way?
- Especially given the fact that we do not know the theoretical relationships
- between the plant data inputs to the model and the model outputs?
-
- Any help will be *most* appreciated (and might contribute to my job security!).
- Please reply here or to james_cox@Kodak.COM (which may or may not actually
- get to me.)
-
-