NetNews Usenet Archive 1992 #18

Text File | 1992-08-21 | 4.2 KB | 104 lines

Newsgroups: sci.math.stat
Path: sparky!uunet!caen!news.cs.indiana.edu!uceng.uc.edu!juber
From: juber@uceng.UC.EDU (james uber)
Subject: Re: linear covariance estimate for max likelihood
Message-ID: <1992Aug21.190537.23867@uceng.UC.EDU>
Keywords: parameter estimation, maximum likelihood, covariance estimate
Organization: College of Engineering, University of Cincinnati
References: <1992Aug20.142353.6297@uceng.UC.EDU> <thompson.714414133@daphne.socsci.umn.edu>
Date: Fri, 21 Aug 1992 19:05:37 GMT
Lines: 92

In article <thompson.714414133@daphne.socsci.umn.edu> thompson@atlas.socsci.umn.edu writes:
>juber@uceng.UC.EDU (james uber) writes:
>
>>I obtain parameter estimates via maximum likelihood where
>>my model is in the standard reduced form y = f(p), y are the
>>data and p are the parameters. I assume that the distribution
>>of the model + measurement errors is normal with zero mean
>>and known covariance matrix Ve. Thus i am solving the optimization
>>problem:
>
>>    min Tr(y - f(p))Inv(Ve)(y - f(p))
>>     p
>
>[rest of post deleted]
>
>I do not understand the question. If y = f(p) (where f is presumably
>a known and fixed function of the parameters) and y is observed then
>there is no measurement error. Perhaps you meant y = f(p) + e where e
>is a vector of measurement errors. (This seems implicit in your
>description of the nonlinear least squares procedure.)
>
>But you also refer to "model errors". What are these and how do they
>fit in? If the model is really
>
>    y = f(p) + <error>
>
>and f(p) is known (up to the parameters p) and fixed, then
>Var(y) = Var(<error>) regardless of the source of the errors.
>
>Please clarify.

Thanks for replying to my post. That was my fault for being too
hasty. This is what i really meant (now given a second chance).
The _measurement_ errors are defined as:

    e1 = y - y*

where y are the data and y* are the (unknown) true values. Now the
_model_ errors are defined as:

    y* = f(p*) + e2

where p* are the "true" parameter values.
That is, even given the measurements without error and the true
parameters, there still is some error due to faults in the model
theory, inaccuracy in solution of f(p), and the like. Combining
these two equations gets me back to where i should have started
in the beginning:

    y - e1 = f(p*) + e2
    y = f(p*) + e1 + e2
    y = f(p*) + E

where E = e1 + e2 is the combined model + measurement error. Thus
the relevant distribution to use in estimation of p via maximum
likelihood is the distribution of E, which i previously assumed
was a normal distribution with known covariance Ve. My understanding
is that, while it is often possible to specify the distribution of
e1 in a logical way (normal and i.i.d., for example), the same cannot
necessarily be said for e2. Hence the significance of knowing that
both exist. I realize that the parameters of the distribution of
E can be estimated, under certain assumptions about their form.

Now, like you said, if f(p) is known (not random), then the
variability in y comes from the variability in E. This is where
my brain starts to hurt, 'cause i think to myself, "hey, wait a
minute, we're talking about the variability in the _data_, which
are the _measurement_ errors." Thus i get confused when i look at
derivations of the covariance matrix of the parameter estimates
that say things like, "the maximum likelihood objective function
depends on the _data_," or "if we vary the _data_ slightly,
replacing y by y + dy, this would cause the minimum (of the log
likelihood function) to shift from p* to p* + dp*." I just can't
get around thinking of the variability in my data as the
measurement (e1) errors!

So, this leads to my original question: in the derivation of the
p covariance estimate, when we talk of the variability in p being
caused by the variability in y, do we mean variability caused by
e1 or by E? I am fairly certain that it must be E, but applied
statistics is a tough business for part-timers, and i'm just not sure.
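
One way to check the e1-versus-E question numerically is a small
Monte Carlo sketch. It is only an illustration under assumptions that
are not in the post: a scalar linear model f(p) = p*x, independent
zero-mean normal e1 and e2 with made-up standard deviations, and
Ve = sigma^2 * I, so that the maximum likelihood estimate reduces to
ordinary least squares. The function names are hypothetical. Under
those assumptions the spread of the estimates should track
Var(E) = Var(e1) + Var(e2) rather than Var(e1) alone.

```python
import random
import statistics


def simulate_estimates(p_true, x, sd_e1, sd_e2, n_trials, seed=0):
    """Monte Carlo draws of the least-squares estimate of p
    for the assumed model y = p*x + e1 + e2."""
    rng = random.Random(seed)
    sxx = sum(xi * xi for xi in x)
    estimates = []
    for _ in range(n_trials):
        # y = f(p*) + E with E = e1 + e2 (assumed independent normals)
        y = [p_true * xi + rng.gauss(0.0, sd_e1) + rng.gauss(0.0, sd_e2)
             for xi in x]
        # With Ve = sigma^2 * I the weighted problem is plain least squares
        estimates.append(sum(xi * yi for xi, yi in zip(x, y)) / sxx)
    return estimates


def linear_variance(x, sd):
    """Scalar case of (J' Inv(Ve) J)^-1 with J = x and Ve = sd^2 * I."""
    return sd * sd / sum(xi * xi for xi in x)


# Illustrative values only; none of these numbers come from the post.
x = [0.5 * k for k in range(1, 21)]
est = simulate_estimates(p_true=2.0, x=x, sd_e1=1.0, sd_e2=2.0,
                         n_trials=20000)

empirical = statistics.variance(est)
var_from_E = linear_variance(x, (1.0 ** 2 + 2.0 ** 2) ** 0.5)
var_from_e1 = linear_variance(x, 1.0)

print("empirical variance of p estimates:", empirical)
print("prediction using Var(E):          ", var_from_E)
print("prediction using Var(e1) only:    ", var_from_e1)
```

Because the assumed f is linear in p, the linearized covariance is
exact here; for a nonlinear f it would hold only approximately near
p*, and correlated e1 and e2 would need the full covariance of E.
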

Again, thanks for taking interest in my question. I hope that, at the
least, my ignorance might now be clear.

jim uber
dept. of civil & environmental eng.
univ. of cincinnati
juber@uceng.uc.edu
--
james uber        juber@uceng.uc.edu