home
***
CD-ROM
|
disk
|
FTP
|
other
***
search
/
The C Users' Group Library 1994 August
/
wc-cdrom-cusersgrouplibrary-1994-08.iso
/
vol_200
/
209_01
/
simpfitr.doc
< prev
next >
Wrap
Text File
|
1990-03-04
|
15KB
|
533 lines
SIMPFITR.DOC VERS:- 01.00 DATE:- 09/26/86 TIME:- 10:02:49 PM
documentation for C routines for simplex fitting and
quadratic approximation of least squares surface.
By J.A. Rupley, Tucson, Arizona
NOTES ON THE SIMPLEX FITTING PROGRAM XXXXFITn
AND ITS SUPPORTING MODULES
A. DESCRIPTION OF THE PROGRAM:
A function explicitly relating a dependent variable to
one or more independent variables and up to ten variable
parameters is fit to a set of data points (observed values of the
dependent and independent variables). The Nelder-Mead algorithm
is used to locate the least squares minimum by a simplex search
procedure, giving estimates of the best-fit values of the
parameters. Standard deviations of the parameters are estimated
by use of a quadratic approximation of the least squares surface
near the minimum. The quadratic approximation, equivalent to
minimization by use of a Taylor's series approximation, also
returns an improved estimate of the parameter. Thus the
quadratic approximation can be used as part of the minimization
procedure.
Alternation of several cycles (eg, 30) of simplex
minimization with one cycle of quadratic minimization quickly
converges on the least squares minimum. The simplex search
efficiently defines the region of parameter space that contains
the minimum; bounds are easily incorporated; because derivatives
are not used, ill-behaved functions can be handled. The
quadratic minimization, on the other hand, rapidly moves to the
least squares minimum, but it is reliable only when the parameter
estimates have been brought near the minimum.
One cycle of quadratic minimization takes about as much time
as 30 cycles of simplex minimization. If the quadratic
minimization fails, as it often will in the early stages of the
search, considerable time is wasted. To reduce this waste, the
next one or several cycles of quadratic minimization are skipped
after a failure.
1
USAGE:
A>xxxxfitn [d:]input_file [[d:]output_file]
the optional output file can be LST: or a disk file;
the required input file has the following structure:
one-line title, with no tabs or ctrl characters;
lines following give control variables, the starting
simplex, and the data array;
the order of entry is fixed by <read_data()>;
see sample file for structure;
the one-word descriptors must be present;
the data format is free-form (no set field widths);
comments at the end of the file are not read by the
program and any other information to be
stored should be there.
2
B. CONTENTS OF THE C SOURCE FILES:
SIMPMAIN:
main program controlling input, output & fitting = main()
XXXXFITn:
routines for simplex fitting that depend on the data and function
to be fit (must be changed when function fit is changed):
all external definitions, except for <data>, which is
defined as dummy storage in SIMPLIB1
or its modification
function for calculation of dependent variable and
weighted sum of residuals squared = func()
print of <data> records = fdatprint()
customizable display called by <simpfit()> = fspecial()
other special functions called by above functions
SIMPLIB0:
general routines for simplex fitting:
simplex fitting routine = simpfit()
quadratic fit for standard deviations = simpdev()
supporting functions
SIMPLIB1:
routines for data input that depend (partially) on the data
and function to be fit:
definition of the aggregate <data>, with a
dummy structure declaration
usage message displayed on command line error = use_mess()
opening of files for input and optional output = file()
input of variables and data = read_data()
3
C. PROGRAM FLOW:
Program flow is controlled by <main()>.
I. A call to <read_data()> at the start of execution initializes
the following:
a one-line title string, without tabs or other control
characters;
control values that are used in testing (1) for exit from the
minimization (<exit_test>), (2) for cycles on which to print
intermediate fitting results (<prt_cycle>), and (3) for cycles on
which to pass through the quadratic approximation (<maxquad_skip>
and <quad_test>);
<iter>, the starting interation number, generally set at 0;
the starting simplex <p>, with <nvert> vertices and <nparm>
parameters; <nvert> = 1 + number of free parameters, <nfree>;
parameters that are the same for all vertices are treated as
fixed;
the data set, stored in the aggregate <data> of size <ndata>,
each record of <data[ndata]> having <ndatval> members.
On return from <read_data()>:
<maxiter> is set at <iter> + <prt_cycle>; <nquad_skip> and
<quad_cycle>, used in resetting <quad_test>, are set at zero and
<quad_test>, respectively;
The first command line argument specifies the input file
from which the above information is read. The (optional) second
argument specifies the (optional) output file or device, on which
a selection of the crt display is saved.
A sample input file is given in a following section.
II. Fitting by use of the Nelder-Mead simplex algorithm is
carried out by call of <simpfit()>. Control is returned to main
every <prt_cycle> number of fitting cycles, for printout of the
current simplex and its centroid.
III. On certain cycles of simplex minimization, determined by
the value of <quad_test>, the current simplex is passed to the
quadratic approximation routine <simpdev()>. Printout consists
of (1) the data array with calculated values of the dependent
variable and (2) a summary of the results of the quadratic fit,
4
including estimates of the standard deviations of the parameters.
<Simpdev()> returns a modified simplex, with, if quadratic
minimization was successful, the new low vertex closer to the
least squares minimum. The variance-covariance matrix is
returned in <qmat[nfree][nfree]>. The standard deviations of the
parameters and other information are returned in the structure
<q>.
IV. Minimization consists of looping through steps II and III,
until the value of <test> is less than <exit_test>:
<exit_test> = input value
<test> = <rms_func>/<mean_func>
<rms_func> = (root mean square of the deviations
of the least squares values
at the simplex vertices)
<mean_func> = (mean of the least squares values)
On exit from the minimization, a pass through <simpdev()>
gives the final estimates of the standard deviations of the
parameters.
The alternation of a set of simplex fitting cycles with a
pass through the quadratic approximation gives rapid convergence
to the least squares minimum. In order to control this,
<quad_test> can be used in either of two ways:
If in the input file <quad_test> is set < 1, then
<simpdev()> is entered at the first printing cycle for which
<test> is less than <quad_test>. Before looping back to
<simpfit()>, <quad_test> is reduced by a factor of 10.
If in the input file <quad_test> is set >= 1 (in which case
it should be an integer multiple of <prt_cycle>), then
<simpdev()> is entered every {<quad_cycle> * (<nquad_skip> + 1)}
number of cycles. <nquad_skip> is initially zero; it is
incremented on each failure of the quadratic minimization, up to
the limit <maxquad_skip>; on a successful quadratic minimization,
<nquad_skip> is reset at zero. This algorithm reduces the time
spent in unsuccessful quadratic minimization.
The operator can choose, from within <simpfit()>, to return
to <main()> and pass through <simpdev()> for a cycle of quadratic
approximation.
5
D. CODING IN XXXXFITn FOR THE SPECIFIC FUNCTION TO BE FIT:
The routine <func()> calculates for a simplex vertex the
weighted sum of residuals for the set of parameter values passed
to it. The result is stored in <p.val>. Bounds on the
parameters can be implemented easily, by a test that returns an
excessively large function value (eg, 1.E38) if a parameter
exceeds a bound. Coding should be efficient, because much of the
minimization time is spent in <func()>.
The print statement of the routine <fdatprint()> may have to
be recoded after change in the function or data.
The routine <fspecial()> controls the customizable display
printed at the end of the standard display for each cycle of the
simplex minimization in <simpfit()>.
Other routines should be function and data independent,
unless special manipulation of the data is required.
NOTE: change in the model being fit should require changes
in the coding only of the function of XXXXFITn.
6
E. COMMENTS ON THE CONSTRUCTION OF THE SIMPLEX FITTING PROGRAM:
About 4K of memory are reserved to allow expansion of
<main()> and<func()>, for more elaborate output, functions, etc.
The maximum number of parameters, NPARM, is set at 10; if
more are needed, all routines (SIMPMAIN, SIMPLIB0, etc) must be
recompiled with the new value of NPARM.
To make more readable the coding in <func()> of the model
equation to be fit to the data:
(1) use mnemonic member names in declaring <struct dat> in
XXXXFITn;
(2) declare a dummy structure, <struct pnamestruct>, that is
entirely equivalent to the structure that holds the parameter
values, <pstruct>, but that has mnemonic member names; the
mnemonic dummy structure then can be used with the <pstruct>
address passed as a parameter to <func()>.
The DEFINITION of the aggregate <data> and the functions
<use_mess()>, <file()>, and <read_data()> are in a separate file,
SIMPLIB1; this is to to allow expansion of the aggregate <data>
by overwriting most of SIMPLIB1; the SIMPLIB1 routines are
entered only once, at the start of execution.
the DECLARATION of <struct dat> according to the
requirements of the model described by <func()> is given in
XXXXFITn; <func()>, etc. reference the aggregate <data> as
external to XXXXFITn, but through the structure <dat>, declared
locally in XXXXFITn with mnemonic member names suitable for use
in the coding of <func()>, etc.
The intent is to generalize the read of the data file and
the allocation of data storage (in SIMPLIB1), while retaining
flexibility in the declaration of <struct dat data> in XXXXFITn;
the following comments bear upon this arrangement:
The loading of values into the aggregate <data> is done in
the SIMPLIB1 module <read_me()> by use of:
(1) a generalized ("dummy") structure for <data>;
(2) a read loop that moves successive double values from the
ascii data file into the storage at and above <data[0]>, without
referencing the elements of <data> by structure member or index.
7
The "useful" declaration of the structure for <data> is in
XXXXFITn, where it is referenced by <func()> and <fdatprint()>;
<struct dat> must be changed to accord with the requirements of
<func()> and <fdatprint()>; all members of <struct dat> MUST be
of type double.
Change in the model being fit should not require recoding
and recompilation of <read_me()> or of any other routines except
those of XXXXFITn; of course, change in the model requires change
of <func()>, <fdatprint()>, and of the declarations of <struct
dat>
and <struct pnamestruct> in XXXXFITn.
The size of the <data> aggregate is limited by (1) the size
of free memory and (2) the size of SIMPLIB1, most of which can be
overwritten by data records; for this version of the program,
SIMPLIB1 corresponds to about 600 double values, and unused
memory to about 600 double values; overwriting of the code of
SIMPLIB1 may not be allowed by some compilers.
For the six-member structure <dat> used in this version of
the program, the maximum number of data points is more than 100
(600 double values), expandable to more than 200 (1200 double
values) if SIMPLIB1 is recompiled with an increase of NDATA;
NDATA is currently set at 350 double values; increase of NDATA of
course decreases the amount of memory available for expansion of
the code of <main()>, <func()>, etc.
8