Tue 22-September-1992 (All rights reserved)
About TS2ST in General (Multiple regression analysis)
======================
Apply question mark ? with the program call for a brief description
of a program.
This package may be used and distributed freely for NON-COMMERCIAL,
NON-INSTITUTIONAL, PRIVATE purposes, provided it is not changed in
any way.
┌────────────────────────────────────────────────────────────────┐
│ For ANY other usage (such as use in a business enterprise or a │
│ university) or the full scale version contact the author for a │
│ personal or a site license.                                    │
└────────────────────────────────────────────────────────────────┘
Please do not distribute any part of this package separately.
Uploading to BBSes is encouraged.
The registered version is strictly for the registrant only.
Identical programs must NOT be running on more than one computer at
a time. Site licensed programs must not be run outside the licensed
site.
The programs are under development. Comments and contacts are
solicited. If you have any questions, please do not hesitate to use
electronic mail for communication.
InterNet address: ts@uwasa.fi (preferred)
Bitnet address: SALMI@FINFUN
The author shall not be liable to the user for any direct, indirect
or consequential loss arising from the use of, or inability to use,
any program or file howsoever caused. No warranty is given that the
programs will work under all circumstances.
Timo Salmi
Professor of Accounting and Business Finance
Faculty of Accounting and Industrial Management
University of Vaasa
P.O. BOX 297, SF-65101 Vaasa, Finland
CONTENTS:
1. Acknowledgements
2. General Description
3. Release Notes
4. The Statistics Set for Other Computers
5. List of the files in the package
1. ACKNOWLEDGEMENTS
In developing and testing multiple regression programs for the
VAX 11/750 during 1983-86 I have had many useful discussions with my
colleague Martti Luoma (Associate Professor of Statistics). This has
directly benefited programming the current STATREGR program for PC
compatibles. I have also had useful suggestions from Antti Kanto
(Acting Associate Professor of Statistics) in developing STATREGR.
2. GENERAL DESCRIPTION
STATREGR (Ver 2.0)
STATistics: multiple REGRession analysis is part of the
interactive statistical system by Timo Salmi. It is the second
program in the set. The first program in the set is STATistical
MEASures (STATMEAS in TS1STxx.ZIP), which is intended for univariate
analysis. The third program in the set is STATistics:
TRANsformations (STATTRAN in TS3STxx.ZIP), which can be used for
transforming the observations, and, if necessary, also as an editor.
The fourth program in the set is STATistics: Ranks and CORrelations
(STATRCOR in TS4STxx.ZIP). The fifth program in the set is
STATistics: Least Absolute Deviation multiple Regression (STATLADR
in TS5STxx.ZIP).
STATREGR includes a handy built-in help system, which can be
invoked by typing ? at any interactive question. Because of this
built-in help, and the interactive nature of the program's user
interface, no long-winded instructions have been included. (Who
reads instructions anyhow?)
The program performs an ordinary least squares (OLS) multiple
regression analysis, that is, estimates the coefficients of
Y = a + b(1)X(1) + ... + b(M)X(M)
from a set of observations. Furthermore, it draws (or rather writes)
low-resolution scatter diagrams of the data, and the regression
analysis.
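As an illustration of what estimating the coefficients involves, here is
a minimal sketch in Python. It is not the code of STATREGR itself. The
function names solve and ols are hypothetical, and the sketch solves the
normal equations by Gaussian elimination rather than inverting the
cross-product matrix explicitly.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination
    with partial pivoting (illustrative helper, not from STATREGR)."""
    n = len(b)
    # Work on an augmented copy so the caller's data is left untouched.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        # Crude absolute threshold, chosen only for this sketch.
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("cross-product matrix is (nearly) singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    # Back substitution.
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def ols(xs, ys):
    """Return [a, b(1), ..., b(M)] for Y = a + b(1)X(1) + ... + b(M)X(M).

    xs is a list of observations, each a sequence of M explaining values;
    ys is the list of the corresponding Y values.
    """
    rows = [[1.0] + list(x) for x in xs]  # the leading 1 carries the constant a
    k = len(rows[0])
    # Cross-product matrix X'X and vector X'Y of the normal equations.
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(k)]
    return solve(xtx, xty)
```

With observations generated exactly from Y = 1 + 2X(1) - 3X(2), ols
returns values very close to 1, 2 and -3.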
The data can either be given from the keyboard or taken from a
file. If the input is to be taken from a file, it must first be
prepared with some editor, or some word processor which includes an
option for preparing ordinary ASCII text. (Also STATTRAN can be used
for this purpose.)
The data is given to the program in the following format:
X1 X2 X3 !variable names (! denotes a comment)
3.56 6.32 -1.73
5.12 -4.21 9.18
14.2 5.11 0.31
END !END is optional in a file
A missing item in an observation is marked by a hash (#). E.g. if
the first item of the second observation were missing, the
observation should be written as # -4.21 9.18
The items in an observation can be separated with blanks, as in
the above, or with commas (,) e.g. 5.12,-4.21,9.18. The number of
intervening blanks is irrelevant, and can be chosen for
increased readability. Thus e.g. 5.12 -4.21 9.18 and
5.12    -4.21    9.18 are equivalent.
A row can be continued using an ampersand (&). E.g. the variables
could be given as
X1 X2 &
X3
Alternatively, * or \ can be used instead of & as the continuation
marker.
Comments can be added to the input data. If ! appears on a line
all text after ! will be considered as a comment.
A header can be entered on each page if output is directed to a
file. To accomplish this, start the very first line of the input file
with a double exclamation mark (!!) and the rest of the line will be
used as the header. Thus !! indicates a header, a single ! an
ordinary comment.
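The input rules described above can be sketched as a small reader in
Python. This is an illustration only, not the parser used by STATREGR.
The function name parse_data is hypothetical, and details the text does
not settle (such as treating END as case-sensitive, or skipping blank
lines) are assumptions of the sketch.

```python
import re


def parse_data(text):
    """Parse STATREGR-style input text.

    Returns (header, names, observations); a missing item (#) becomes None.
    """
    lines = text.splitlines()
    header = None
    # A double exclamation mark on the very first line marks a page header.
    if lines and lines[0].startswith("!!"):
        header = lines[0][2:].strip()
        lines = lines[1:]

    rows, pending = [], ""
    for line in lines:
        line = line.split("!", 1)[0].strip()  # drop everything after !
        if not line:
            continue  # assumption: blank and comment-only lines are skipped
        if line[-1] in "&*\\":  # continuation marker: &, * or backslash
            pending += line[:-1] + " "
            continue
        line, pending = pending + line, ""
        if line == "END":  # END is optional; assumed case-sensitive here
            break
        # Items are separated by blanks or commas.
        rows.append([t for t in re.split(r"[,\s]+", line) if t])

    if not rows:
        raise ValueError("no variable names found")
    names = rows[0]
    observations = [
        [None if t == "#" else float(t) for t in row] for row in rows[1:]
    ]
    return header, names, observations
```

Given the example data shown earlier, with the second observation written
as # -4.21 9.18, the sketch returns the names X1, X2, X3 and three
observations, the second of which starts with None.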
The maximum number of variables is 25. The maximum number of
observations is 400 (for each variable). The public domain version,
however, sets the limits at 4 and 100 respectively.
3. RELEASE NOTES
Version 1.1 of STATREGR includes some minor changes and
corrections in the user interface.
Version 1.2 of STATREGR introduces CGA high-resolution diagrams
for the analysis. Unlike the low-resolution text-mode diagrams, the
high-resolution diagrams will always be drawn on the screen. (In the
former case, directing the output to a file is also possible).
Further overflow checks have been added to prevent the program
crashing because of bad data.
Multiple regression estimates are based on inverting a cross-product
matrix of the observations. If the explaining variables are very
similar (multicollinearity), the matrix will be nearly singular, and
the estimates are very unstable. Further problems of significance
can arise if the values of the explaining variables are of very
different scales. To test the reliability of the estimation results
the cross-product matrix is multiplied by its computed inverse, the
result is compared with a unit matrix, and the sum of absolute
deviations is reported as ABS DEVIATION FROM UNIT MATRIX. The
smaller this figure, the lower the probability of computationally
weak estimates. Although seldom reported, this problem is inherent in
most (even the top commercial) statistics packages.
If the input file is not found, you have the choice of listing
directories from within STATREGR. The directory routine has been
rewritten for a more relaxed syntax. (For details see the
information on DIRW in TSUTIL.INF in TSUTILxx.ZIP version 1.8 or
later.)
In version 1.3 the program no longer crashes from an attempt to
rewrite a write-protected file. Second, the user now has more
control over the choice of the graphics driver in the
high-resolution scatter diagrams. Apply ? at the question "USE CGA
GRAPHICS DRIVER" for more information. Third, the t-values of the
residuals have been included in the tableau giving the regressed
values.
In version 1.4 the program has been recompiled with some minor
changes.
Version 1.5: This version introduces input recall and line
editing. The special keys Del, CrLf, CrRg, CrUp, Home, End, and Esc
are functional for this purpose. PgUp is the recall key. Line
editing uses insert mode.
Disk access has been made faster (the program has its own cache).
The directory routine has been updated.
Read-only files can be read properly.
The program size has been reduced by limiting the graphics to CGA,
EGA, and VGA.
Version 1.6: In compiling version 1.5 I made an error in setting
the default heap size. This caused an out of memory condition, which
now has been remedied. In line editing the Insert key has been made
functional.
Version 1.7: The regression line in the high resolution scatter
diagram was incorrectly drawn for negative regression coefficients.
My thanks are due to acting associate professor Roy Dahlstedt for
pointing it out.
Version 1.8: The line editing potential of the program has been
improved. When the task is given from the keyboard, and continuation
lines are used, the repeated input recall (CursorUp) gets each line
in turn. Ctrl-C and break-key abort have been enhanced.
A line can be continued using an &, or a * at the end. Now also the
backslash \ is accepted. This was suggested by Tuomas Eerola.
Some help texts within the program have been extended.
Multiple regression estimates are based on inverting a cross-product
matrix of the observations. If the explaining variables are very
similar (multicollinearity), the matrix will be nearly singular,
and the estimates are very unstable. Further problems of
significance can arise if the values of the explaining variables are
of very different scales. To test the reliability of the estimation
results the cross-product matrix is multiplied by its computed
inverse, the result is compared with a unit matrix, and the square
root of the sum of the squared deviations is reported as DEVIATION
FROM UNIT MATRIX. (Earlier I used the sum of absolute deviations,
but the norm, that is the square root of the sum of squared
deviations, is theoretically better, since it can be considered the
length of the deviation vector.) The smaller this figure, the lower
the probability of computationally weak estimates. Although seldom
reported, this problem is inherent in most (even the top commercial)
statistics packages.
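The reliability check can be sketched in Python as follows. This is a
reconstruction from the description above, not the code of STATREGR. The
function names invert and deviation_from_unit are hypothetical, and the
Gauss-Jordan inversion is an assumption, since the text does not say
which inversion method the program uses.

```python
import math


def invert(a):
    """Invert a square matrix by Gauss-Jordan elimination with
    partial pivoting (illustrative helper, not from STATREGR)."""
    n = len(a)
    # Augment the matrix with a unit matrix on the right.
    m = [row[:] + [1.0 if i == j else 0.0 for j in range(n)]
         for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if m[pivot][col] == 0.0:
            raise ValueError("matrix is exactly singular")
        m[col], m[pivot] = m[pivot], m[col]
        p = m[col][col]
        m[col] = [v / p for v in m[col]]
        for r in range(n):
            if r != col:
                f = m[r][col]
                m[r] = [v - f * w for v, w in zip(m[r], m[col])]
    return [row[n:] for row in m]


def deviation_from_unit(a):
    """Multiply a by its computed inverse and measure how far the product
    is from the unit matrix.

    Returns (sum of absolute deviations, square root of the sum of
    squared deviations), i.e. the earlier and the later measure.
    """
    n = len(a)
    inv = invert(a)
    abs_sum = sq_sum = 0.0
    for i in range(n):
        for j in range(n):
            prod = sum(a[i][k] * inv[k][j] for k in range(n))
            d = prod - (1.0 if i == j else 0.0)
            abs_sum += abs(d)
            sq_sum += d * d
    return abs_sum, math.sqrt(sq_sum)
```

For a well-conditioned matrix both figures are at the level of rounding
error. The norm is never larger than the sum of absolute deviations.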
Found and corrected an annoyingly elusive bug, that caused problems
if the last observation in the data had items with more digits than
10.
Version 1.9: Several improvements to the nuts and bolts of the
user interface.
The new usage of the call is
PROGNAME [/h(elp)] [/iInputFileName] [/oOutputFileName] [/cColumnsPerRow]
(the /c option, which regulates the width of the output, is for
registered versions, only). If you use the /i switch, it stuffs the
InputFileName into the appropriate recall buffer. This means that
when the program asks you for the input file name, you can invoke
the input file name just by pressing the CursorUp key. (The same
goes for the /o switch, respectively.) This is very convenient, if
you use the program many times successively making small changes in
your data in between. (This assumes, of course, that you have a
command line editor like DOSEDIT or CED to recall previous MsDos
commands. These common shareware programs can be obtained from any
well-stocked BBS or FTP site.)
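To make the switch syntax concrete, here is a minimal sketch in Python of
how a call of this form could be taken apart. It is not how STATREGR
itself processes its parameters. The function name parse_switches and the
dictionary keys are hypothetical, and accepting the switch letters in
either case is an assumption.

```python
def parse_switches(args):
    """Split switches of the form /h, /iName, /oName, /cNumber.

    The value follows the switch letter directly, without a blank.
    """
    opts = {"help": False, "input": None, "output": None, "columns": None}
    for arg in args:
        if len(arg) < 2 or not arg.startswith("/"):
            raise ValueError("unrecognized parameter: " + arg)
        key, value = arg[1].lower(), arg[2:]
        if key == "h":        # /h or /help
            opts["help"] = True
        elif key == "i":      # input file name
            opts["input"] = value
        elif key == "o":      # output file name
            opts["output"] = value
        elif key == "c":      # columns per row (registered versions only)
            opts["columns"] = int(value)
        else:
            raise ValueError("unknown switch: " + arg)
    return opts
```

For instance, the parameters /idata.txt and /oresult.txt (made-up file
names) give the input name data.txt and the output name result.txt.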
The printer readiness test has been rewritten to be more general.
The earlier test failed for some printers, because the codes the
printers send when they are offline are not standardized.
The "file exists, overwrite?" question is no longer asked when the
output file is prn, in other words when the output is directed to
the printer.
The user now has a choice of a left margin from 0 to 20 blanks
when output is directed to the printer.
The user now has a choice between formfeed and four blank lines
to start each new page of output.
When an input file is not found, the user is given the choice of
listing a directory. The directory routine has been rewritten.
The file ready message now also includes the file size besides
the name.
Version 2.0: The input and output file names can be optionally
given as parameters in the program calls, e.g.
STATREGR /ic:\stat\test.dat /or:\tmp
This option has been improved. The "prefilled" name (e.g.
c:\stat\test.dat) will now automatically appear on the input line
without the need of pressing the cursor up key. All you need to do
is to press enter.
Also made some minor internal changes not worth recording.
Rewrote the document files using a 68 column wrap instead of the
former 80 to make the text easier to read and handle. Added the list
of files in the package to the documentation.
4. THE STATISTICS SET FOR OTHER COMPUTERS
The Statistical programs by Timo Salmi are also available for the
Sinclair QL computer. Named STATPREP, the system is part of a Public
Domain library for QUANTA members. The descriptions of the files in
the Quanta Library are given in STATMEAS.INF contained in
TS1STxx.ZIP, i.e. the first part of the set of statistical programs.
5. LIST OF THE FILES IN THE PACKAGE
TS2ST Statistics by T.Salmi, Part II
Filename       Comment
--------       --------------------------------
FILE_ID.DIZ    Brief characterization of TS2ST
STATREGR.EXE   Multiple regression analysis
STATREGR.INF   Document
STATREGR.NWS   News announcements about ts2st
TSPROG.INF     List of PD programs from T.Salmi