NetNews Usenet Archive 1992 #19


Path: sparky!uunet!newsserver.pixel.kodak.com!kodak!eastman!b56vxg.kodak.com!ekdug
From: ekdug@b56vxg.kodak.com (James Cox)
Newsgroups: sci.math.stat
Subject: Comparison Of Model With Historical Data
Message-ID: <3SEP199216491106@b56vxg.kodak.com>
Date: 3 Sep 92 21:49:00 GMT
Sender: news@eastman.UUCP
Organization: Eastman Kodak Company, Rochester NY
Lines: 48
News-Software: VAX/VMS VNEWS 1.41

We have purchased a simplified model of a complex kinetic chemical reaction. This model uses as inputs data from an industrial scale process (and the data from that process has all the usual errors associated with real-life data: noise, drift, bias, etc). The model produces six outputs (concentrations) that can be directly compared to composition analyses from the plant (which are also subject to all sorts of errors). There is one tuning parameter (that has a semi-theoretical basis) for the model that adjusts the production of one key component (and indirectly all the other concentrations - there are a total of 19 predicted compositions).

The problem is when we tune the model to match the plant's production of the key component at one set of (historical) operating conditions, it cannot accurately predict even the key component's concentration at other sets of operating conditions.

Naturally, we think that the model has been oversimplified to the point of unusability, while the view of the company that developed the model is that our plant data is too noisy. Their suggested approach is to use an average of the plant data inputs to develop an average tuning parameter value and use that. Their justification for this is that the variation in the predicted compositions from the model with an "average" tuning parameter should be less than the variation in the plant's measured compositions. (Even if the variations do turn out this way, I don't think it would take a model with a theoretical basis to get this sort of behavior!)

We can analyze the variation in a large set of plant data.
We can also tune the model to match the key component concentration for every point in the set of data. If we then define a function which is the sum of the differences between the six plant concentrations and the six predicted ones, we would know the variation in the defined function, the variation in the tuning parameter and the variation in the plant data inputs to the model.

What we would like to say, based on the above information, is something like: "Ah hah! Since the variation in the tuning parameter is larger (or has a larger effect on the defined function) than any of the variations in the plant data, obviously the model is not valid!"

Is there any way to reach this conclusion in a valid statistical way? Especially given the fact that we do not know the theoretical relationships between the plant data inputs to the model and the model outputs?

Any help will be *most* appreciated (and might contribute to my job security!). Please reply here or to james_cox@Kodak.COM (which may or may not actually get to me.)
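
To make the bookkeeping described above concrete, here is a rough sketch of the per-point tuning and the "defined function". Everything in it is an assumption for illustration only: `toy_model` is a made-up stand-in for the purchased model (which is not available here), the two inputs, the bisection bounds and the names `tune`, `summed_difference` and `analyze` are all hypothetical. It only shows how the tuning-parameter variation and the function variation could be tabulated; it is not a statistical test of model validity.

```python
import statistics


def toy_model(inputs, k):
    """Hypothetical stand-in for the purchased model (NOT the real one).

    Returns six concentrations; output 0 is the key component and is
    assumed to rise monotonically with the tuning parameter k.
    """
    temp, flow = inputs
    key = k * temp / (1.0 + flow)
    return [key] + [key * w + 0.01 * flow for w in (0.5, 0.4, 0.3, 0.2, 0.1)]


def tune(inputs, key_measured, lo=0.0, hi=10.0, tol=1e-10):
    """Bisect on k until the predicted key component matches the plant value.

    Assumes the matching k lies inside [lo, hi].
    """
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if toy_model(inputs, mid)[0] < key_measured:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def summed_difference(measured, predicted):
    """Signed sum of (plant - model) over the six concentrations.

    Signed differences can cancel each other; kept signed to follow
    the wording of the question.
    """
    return sum(m - p for m, p in zip(measured, predicted))


def analyze(points):
    """points: list of (inputs, six measured concentrations)."""
    # Tune k separately at every data point.
    ks = [tune(x, m[0]) for x, m in points]
    # The "average tuning parameter" alternative.
    k_avg = statistics.mean(ks)
    d_tuned = [summed_difference(m, toy_model(x, k))
               for (x, m), k in zip(points, ks)]
    d_avg = [summed_difference(m, toy_model(x, k_avg)) for x, m in points]
    return {
        "k_mean": k_avg,
        "k_stdev": statistics.stdev(ks),
        "d_tuned_stdev": statistics.stdev(d_tuned),
        "d_avg_stdev": statistics.stdev(d_avg),
    }
```

With real data, `points` would hold the plant inputs and the six measured compositions for each historical operating point, and `toy_model` would be replaced by a call into the actual model. Comparing `k_stdev` against the spread of the plant inputs is then the comparison the question is asking about; whether that comparison supports a valid conclusion is the open statistical question.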