Text File | 1992-12-21 | 31KB | 616 lines
kspdat 2.10
Joseph C. Hudson
4903 Algonquin
Clarkston, MI 48348
Introduction
kspdat is a contraction of ks probability data.
Introductory prob and stat textbooks usually have a few tables
for common distributions and a few pictures of probability and
density functions. Occasionally, cdfs and oc curves may be seen.
kspdat is a first attempt at allowing prob and stat instructors
to use the tables and pictures they want to use, rather than
being restricted to those that the author chose.
kspdat does not produce pictures, or directly usable tables, for
that matter. It does produce tables of pdfs (density or
distribution functions), cdfs, hazard functions, reliability
(survival) functions and inverse cdfs. The output can be edited
into replacements for a book's tables (for example, when you want to pass
out tables for testing purposes without violating someone's
copyright), alternate forms for these tables, additional tables,
or, most important for me, output can be fed into a graphing
program to produce pictures.
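For readers who want the relationships behind these function names, here is a minimal Python sketch. It is not part of kspdat. It assumes the usual textbook definitions (reliability = 1 - cdf, hazard = pdf / reliability) and, purely for illustration, an exponential distribution whose parameter is taken to be the mean µ; ks.doc is the authority on how kspdat itself parameterizes each distribution.

```python
import math

def exp_pdf(x, mu):
    """Density of an exponential distribution with mean mu (assumed parameterization)."""
    return math.exp(-x / mu) / mu if x >= 0 else 0.0

def exp_cdf(x, mu):
    """Cumulative distribution function of the same distribution."""
    return 1.0 - math.exp(-x / mu) if x >= 0 else 0.0

def reliability(x, mu):
    """Reliability (survival) function: probability of exceeding x."""
    return 1.0 - exp_cdf(x, mu)

def hazard(x, mu):
    """Hazard function: density divided by reliability."""
    return exp_pdf(x, mu) / reliability(x, mu)

# One row per x value: x, pdf, cdf, reliability, hazard.
for x in (0.0, 1.0, 2.0):
    print(f"{x:4.1f} {exp_pdf(x, 2.0):.4f} {exp_cdf(x, 2.0):.4f} "
          f"{reliability(x, 2.0):.4f} {hazard(x, 2.0):.4f}")
```

For the exponential case the hazard column is constant, which makes it a convenient sanity check.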
I do not offer a warranty or guarantee of any kind for this
program. I've tried hard to make the output correct, but using
it with new data sets and different machines may reveal errors
I'm not aware of. Follow the advice of Gerard E. Dallal
(Statistical Microcomputing - Like It Is, American Statistician,
V42 N3 Aug 1988): assume that this program does everything wrong
until you put it through its paces with difficult input and
conclude otherwise. Above all, enjoy. If you care to send me a
brief report about what you like and don't like about this
program, it would be very much appreciated.
kspdat is copyright (C) 1990-93 Joseph C. Hudson 4903 Algonquin
Clarkston MI 48348. All rights are reserved.
kspdat page 2
examples of use
Let's start with a couple of quick examples to illustrate what
kspdat does. Start kspdat. You see the main menu:
┌──────────────────────────────────────────────────────────────┐
│ kspdat 2.10 │
│ │
│ exit help save spec get spec view data view dfile │
│ compute dir save data view file view names view cfile │
│ view graph │
│ │
│ cols to graph: │
│ data file: │
│ indep var: │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
└──────────────────────────────────────────────────────────────┘
The cursor will be on "help". If you hit return, you will be able
to page through a series of help screens. Don't bother to right
now. Instead, use the down arrow to move to the "data file:"
prompt. The help box at the bottom of the screen will ask you to
enter the name of a file to use for output. Do it. For this
exercise, use bintable, prefaced by the drive/subdirectory you
want the output to go to.
If you hit return or the down arrow after entering the name, the
cursor will be at the <indep var:> prompt. If not, use the arrow
keys to move the cursor there. Type in
      x 0(1)10
Hit return or down arrow when done. The cursor will go to the
blank area below and prompt you with "c2:". Type in
      bi 10 .2 cdf
Hit return or down arrow when done. You will get a "c3:" prompt.
Hold the alt key down and type the letter C. The bi 10 .2 cdf
should have been copied from the line above. Arrow over to the 2
and change it to a 4, so that the line reads
      bi 10 .4 cdf
Repeat twice, to get c4: bi 10 .6 cdf and c5: bi 10 .8 cdf.
After you get the last line, hold the alt key down and type the
letter Q (I'll refer to this simply as alt-Q or alt-whatever from
now on). The program should put you at the "compute" prompt.
Hit return. After the computer is done working, you will be at
"view data". Hit return. You will see the data that was just
generated. There will be some strange names heading each column.
Hit Esc to get back to the main menu, go left and down with the
arrow keys to the "save data" prompt and hit return. A new menu
will appear:
┌──────────────────────────────────────────────────────────────┐
│ kspdat save data 2/26/90 20:06 390K free mem                 │
│                                                              │
│ missing values: mvcode      number of tables: one            │
│                                                              │
│ col  column     number of blank  number of spaces for digits │
│ no.  name       leading spaces   before d.p.    after d.p.   │
│  1   x0(1)10          0               0              0       │
│  2   bic10.2          0               0              0       │
│  3   bic10.4          0               0              0       │
│  4   bic10.6          0               0              0       │
│  5   bic10.8          0               0              0       │
│                                                              │
│                                                              │
│                                                              │
│                                                              │
└──────────────────────────────────────────────────────────────┘
Leave the missing values and number of tables as they are, (hit
return to get past them) and change the zeros to these values:
│  1   x0(1)10          2               2              0       │
│  2   bic10.2          2               1              4       │
│  3   bic10.4          2               1              4       │
│  4   bic10.6          2               1              4       │
│  5   bic10.8          2               1              4       │
Hit alt-Q when done, and then return. After a bit, you'll be
back at the main menu at "view dfile". Hit return, and you will
see the fruits of your labor:
   0  0.1074  0.0060  0.0001  0.0000
   1  0.3758  0.0464  0.0017  0.0000
   2  0.6778  0.1673  0.0123  0.0001
   3  0.8791  0.3823  0.0548  0.0009
   4  0.9672  0.6331  0.1662  0.0064
   5  0.9936  0.8338  0.3669  0.0328
   6  0.9991  0.9452  0.6177  0.1209
   7  0.9999  0.9877  0.8327  0.3222
   8  1.0000  0.9983  0.9536  0.6242
   9  1.0000  0.9999  0.9940  0.8926
  10  1.0000  1.0000  1.0000  1.0000
The first column of output contains the x values from 0 to 10, the
second column contains binomial cdf values for n = 10 and p = .2,
and the remaining columns contain binomial cdf values for p = .4,
.6 and .8. Take this into a text editor, add headers and you have
a (small) binomial table.
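If you want to follow Dallal's advice and check such a table independently, the values can be recomputed outside kspdat. The following is a minimal Python sketch, not kspdat's own algorithm: it sums the binomial probabilities directly and prints the rows in a layout similar to the one above. The function names are made up for this illustration.

```python
from math import comb

def binom_cdf(x, n, p):
    """P(X <= x) for a binomial(n, p) random variable, x a nonnegative integer."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(0, x + 1))

def table_rows(n=10, probs=(0.2, 0.4, 0.6, 0.8)):
    """Build one text row per x value: x followed by the cdf for each p."""
    rows = []
    for x in range(n + 1):
        cells = "".join(f"  {binom_cdf(x, n, p):6.4f}" for p in probs)
        rows.append(f"  {x:2d}{cells}")
    return rows

print("\n".join(table_rows()))
```

Direct summation is fine for small n like this; it is only meant as a cross-check, not as a replacement for the program.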
Exit kspdat by going to "exit" in the menu and typing X.
The usual t, F and chi-square tables of upper tail areas can be
constructed by specifying df as the independent variable. Let's
run through the making of a small t table.
Start kspdat and arrow down to the "indep var:" prompt. Type in
      df 1(4)5(5)30(10)40(20)120
Hit return or the down arrow when done and you will be at the
"c2:" prompt. Type in
      st .01 x
Hit return or down arrow and you will go to the next line and
get the "c3:" prompt. Type alt-C to copy the line above, then
right arrow over to the 1 and change it to a 5, to get
      st .05 x
Hit return or down arrow. At the "c4:" prompt, do your thing
again to get
      st .10 x
Now hit alt-Q to get to the "compute" prompt and hit return.
When computations are complete, you will be at "view data".
Hit return and you should see
            1         2         3         4
    df1(4)5(5    stx.01    stx.05    stx.10
  1    1.0000   31.8205    6.3138    3.0777
  2    5.0000    3.3649    2.0150    1.4759
  3   10.0000    2.7638    1.8125    1.3722
  4   15.0000    2.6025    1.7531    1.3406
  5   20.0000    2.5280    1.7247    1.3253
  6   25.0000    2.4851    1.7081    1.3163
  7   30.0000    2.4573    1.6973    1.3104
  8   40.0000    2.4233    1.6839    1.3031
  9   60.0000    2.3901    1.6706    1.2958
 10   80.0000    2.3739    1.6641    1.2922
 11  100.0000    2.3642    1.6602    1.2901
 12  120.0000    2.3578    1.6577    1.2886
Hit Esc to get back to the menu. Arrow down to the "data
file:" prompt and enter a path and file name (with no
extension). Then arrow up and across to "save data". Hit return,
then hit return twice to begin editing the numbers of spaces.
Enter 1 3 0 in the first row and 2 3 3 in rows 2 - 4,
then hit alt-Q and return. You have created an ascii file,
probably with a .d01 extension, with the following contents:
   1   31.821    6.314    3.078
   5    3.365    2.015    1.476
  10    2.764    1.812    1.372
  15    2.602    1.753    1.341
  20    2.528    1.725    1.325
  25    2.485    1.708    1.316
  30    2.457    1.697    1.310
  40    2.423    1.684    1.303
  60    2.390    1.671    1.296
  80    2.374    1.664    1.292
 100    2.364    1.660    1.290
 120    2.358    1.658    1.289
Add column headings, maybe a little picture on top like
most books have; voilà, a t table. You can even make
the picture with the help of kspdat (and your favorite
graphing program and your favorite graphics editor and
your favorite graphics incorporating word processor).
Degrees of freedom (df) can be specified as the independent
variable only with dependent variables that are x values from
the chi-square, F and Student's t distributions and their
noncentral versions. In specifying the dependent variable,
substitute the right tail area for df in the ch, nx, st and nt
specifications, and for df2 in the fd and nf specs. For the F and
noncentral F, the ind var df is used as the denominator degrees of
freedom. df1 must still be specified. In all cases, x must be
specified as what to compute.
examples with indep var specification df 1(1)30
this spec:       computes a column of 30 values with
ch .05 x         5% right tail % pts for the chi-square dist
ch .95 x         5% left tail % pts for the chi-square dist
nx .95 8.5 x     5% left tail % pts for the noncentral chi-
                 square dist with noncentrality 8.5.
st .01 x         1% right tail % pts for the Student's t dist
fd 5 .025 x      2.5% right tail % pts for F with 5 numerator df
My main motivation in writing kspdat was to produce data sets
for graphing. Try this exercise to see how to do this:
Start kspdat again and use bino as the name of the output file.
Enter x -0.5(.1)16.5 as the independent variable. In c2, put
bi 16 .5 bar and in c3 put no 8 2 pdf. Go to "compute" and
then "save data". In save data, make sure the number of tables
is "many" (hit the T key when you're at the <number of tables:>
prompt if necessary). Make the menu look like this:
┌──────────────────────────────────────────────────────────────┐
│ kspdat save data 2/26/90 22:33 387K free mem                 │
│                                                              │
│ missing values: mvcode      number of tables: many           │
│                                                              │
│ col  column     number of blank  number of spaces for digits │
│ no.  name       leading spaces   before d.p.    after d.p.   │
│  1   x-0.5(.1)        2               2              2       │
│  2   bib16.5          2               1              6       │
│  3   nop82            2               1              6       │
│                                                              │
└──────────────────────────────────────────────────────────────┘
Save the data and exit the program. You should have 4 new files:
bino.d01, bino.c01, bino.d02 and bino.c02. The ".d" files are
data files and the ".c" files are codebook files. The codebook
files are used by ksstat for missing value info, column names and
so on. For humans, they are a record of what's in the data
files. Look at bino.d01. It has all x values that are n + .5,
where n is an integer, repeated three times, with three different
values in the second column. These values are there to allow a
graphing program that can graph a data set to trace out a
histogram of the binomial pdf with n = 16 and p = .5. We really
don't need the .1 step size for this, but we do for the second
data set, bino.d02. This contains values of the normal (Gaussian)
pdf with µ = 8 and σ = 2, the same mean and standard deviation as
the binomial distribution in bino.d01. Graphing both of these
data sets together will show a normal curve superimposed on top
of the binomial histogram.
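The claim that the two distributions share a mean and standard deviation follows from the usual binomial formulas, mean = np and sd = sqrt(np(1-p)). The short Python sketch below (an illustration only, independent of kspdat) checks this for n = 16, p = .5 and prints the binomial probabilities next to the normal density at the integers, which is roughly what the superimposed graph shows.

```python
from math import comb, exp, pi, sqrt

n, p = 16, 0.5
mean = n * p                  # binomial mean, np
sd = sqrt(n * p * (1 - p))    # binomial standard deviation, sqrt(np(1-p))

def binom_pmf(k):
    """Probability of exactly k successes in n trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def normal_pdf(x):
    """Normal density with the same mean and standard deviation."""
    z = (x - mean) / sd
    return exp(-0.5 * z * z) / (sd * sqrt(2 * pi))

print(f"mean = {mean}, sd = {sd}")
for k in range(n + 1):
    print(f"{k:2d}  {binom_pmf(k):8.6f}  {normal_pdf(k):8.6f}")
```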
I can't supply a graphing program to do the graphing, but can
recommend gnuplot. An early version of gnuplot produced the
graphs in the files bino.com and bino.prn. All of these files are
in the self extracting archive kspdbn.exe. To see the image on
screen, run bino.com. Use the command copy bino.prn prn /b to
print the image on an Epson compatible printer. Bino.com was
produced using a program called grabber. With grabber produced
screen images captured as .com files, the directory utility
dirmagic can be used to make simple but effective (and cheap)
slide shows.
There is one additional file in kspdbn.exe, bino.spc. This is a
kspdat specification file created with the "save spec" option in
the main menu. It has all of the stuff that I typed into kspdat
to create the .c## and .d## files.
The "get spec" menu option reads .spc files. Once a .spc file is
read in, you can edit it and produce new output without retyping
all the information.
running kspdat
In running kspdat, you have some purpose in mind: a picture you
want to draw or a table you want to write. To accomplish your
task, you will have to
1. give a name for the data file(s).
2. describe the independent variable. In a table, the
independent variable is what goes in the first column.
In a graph, it is what is graphed on the horizontal (x)
axis.
3. describe the dependent variable(s). These are what go in
all other columns of a table or are graphed on the
vertical (y) axis of a graph. You may want more than one
of these for either tables or graphs.
4. compute the values of the dependent and independent
variables.
5. save the computed data to disk file(s) for further
processing with either a text editor or graphics program.
6. optionally, save the specifications used to generate your
disk files so that you can later recall them for reuse
without retyping. Since some specifications can be
lengthy, this can save time.
7. exit the program.
The next section describes the menu selections used to achieve
these steps.
the kspdat main menu
The main menu consists of 17 choices or places to enter data.
There is a help window at the bottom of the screen that shows
brief help for each menu selection. The arrow keys will move you
through the selections, with one exception. In the dependent
variable selection area at the bottom of the menu, use alt-Q to
leave this area. The menu choices are:
exit            press x to leave the program. If there is unsaved
                data, you will be given a chance to rescind your
                choice.
help            pressing the enter key will bring up a series of
                help screens. Esc brings you back to the main menu.
save spec       after you have entered all the information
                necessary to produce your output, including the
                formatting information entered in the save data
                menu selection, you can save all of the things you
                typed in to a .spc file. Hit enter to begin. You
                will be asked for a file name to use. The .spc
                extension is automatically used, so you need not
                type this in.
get spec        retrieves information saved previously in a .spc
                file. You will be asked for a file name. As with
                save spec, the .spc extension is forced.
view data       after using compute, there is data in memory.
                Hitting enter on this selection allows you to see
                that data. Column names will appear at the top.
                These are constructed from the information used to
                generate the data and are not pretty, but they
                serve as a rough reminder of what is in the column.
                Don't worry about the format here. You can specify
                that when saving the data to disk. The word missing
                will appear in place of any values not computed.
view names      lets you see the names of the columns of data
                currently in memory.
compute         this is what gets the work done. After you have
                specified dependent and independent variable
                information, come here to actually compute the
                data.
dir             shows a disk directory.
save data       this lets you create your output file or files. The
                kspdat save data menu will appear. There are two
                choices to be made at the top of the menu. When you
                enter the menu, the cursor will be at the "missing
                values:" prompt, with "mvcode" showing. mvcode
                stands for missing value code. If there are any
                values that are missing in your data, kspdat will
                output either a missing value code or a blank.
                Blanks are appropriate when tables are the final
                product, missing value codes when the data will be
                read by another program. Missing value codes are
                numbers that do not otherwise appear in the column
                with the missing value. They are chosen by the
                program and reported, along with column names, in
                a codebook file that is written along with the data
                file. You specified a data file name in the main
                menu. This name is used, with extension .dnn for
                data files and .cnn for codebook files. The nn are
                digits chosen to avoid conflict with other file
                names. A data file and its corresponding codebook
                file always have the same name and same digits.
                The second menu selection is "number of tables:".
                The choices are one and many. With many, there is
                one table produced for each dependent variable.
                Each table is written to a separate disk file. Each
                file contains two columns, the independent variable
                in the first column and a dependent variable in the
                second column. This multiple file arrangement is
                necessary for some plotting programs. Many must be
                chosen if you have any "bar" dep vars and more
                than one dependent variable. With "one", bar and
                pdf produce the same output with more than one
                dependent variable. With only one dep var, the many
                or one choice is irrelevant.
                The third area of the save data menu allows you to
                specify the format of the output. For each column
                of output, specify three things: the number of
                leading blank spaces, the number of digits to allow
                for before the decimal point and the number of
                digits to print after the decimal point. If the
                number of digits after is 0, the decimal point is
                not printed. After entering this information, type
                alt-Q. You will be given the choice of saving the
                data, going back to edit the information entered or
                aborting.
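The effect of the three format numbers can be pictured with a short Python sketch. This is only an illustration of the rule described above, not kspdat's code; in particular, how kspdat treats negative signs or values too wide for the field is not documented here, so the sketch simply lets Python widen the field. The names format_cell and format_row are invented for the example.

```python
def format_cell(value, leading, before, after):
    """Format one number from three widths: leading blanks, digits before
    the decimal point, digits after the decimal point."""
    if after > 0:
        width = before + 1 + after   # digits, decimal point, digits
    else:
        width = before               # no decimal point is printed
    return " " * leading + f"{value:{width}.{after}f}"

def format_row(values, specs):
    """Join the formatted cells of one table row."""
    return "".join(format_cell(v, *s) for v, s in zip(values, specs))

# The t table settings used earlier: 1 3 0 for column 1, 2 3 3 for the rest.
specs = [(1, 3, 0), (2, 3, 3), (2, 3, 3), (2, 3, 3)]
print(format_row([10.0, 2.7638, 1.8125, 1.3722], specs))
```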
view file       choosing this entry will let you see any disk file.
                You will be asked for the file name.
view dfile      lets you see the last data file created. When
                multiple files are written, this is the one with
                the largest number in the extension, corresponding
                to the last dependent variable column. You can see
                other data files with the view file selection.
view cfile      lets you see the last codebook file created. Same
                comments as view dfile.
view graph      lets you see rough graphs of one or more dependent
                variable columns against the independent variable.
                This is meant only to give you a rough idea of what
                the data looks like.
cols to graph:  lets you specify which columns to view.
data file:      this is where you give the name to be used for the
                data and codebook files. No extension is necessary.
                If you give one, it will be discarded.
indep var:      here, you specify the independent variable, the
                contents of the first column of the table. Enter
                either x, cdf or df followed by a range of values
                specified in from(step)to format or in min, max,
                number of steps format. Don't mix the two formats.
                e.g.
                x 1(1)10             start at 1 and go to 10 in
                                     steps of 1
                x 1,10,9             start at 1 and go to 10 in
                                     9 steps
                cdf 0(0.1).5(.05)1   start at 0 and go to .5 in
                                     steps of .1 then to 1 in steps
                                     of .05
                cdf 0,.5,5,.5,1,10   start at 0 and go to .5 in
                                     5 steps then from .5 to 1 in
                                     10 steps.
                cdf 0,.5,5,.55,1,9   does the same thing
                df 1(1)30(10)120     both of these generate
                df 1,30,29,40,120,8  1 2 .. 30 40 50 .. 120
                Personally, I much prefer the () notation and
                hardly ever use the other.
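To make the two range notations concrete, here is a minimal Python sketch that expands each into the list of values it describes. It is a reading of the examples above, not kspdat's parser: the function names are invented, values are rounded to 10 decimals to hide floating point drift, and a boundary value repeated between segments is assumed to be emitted only once.

```python
import re

def expand_paren(spec):
    """Expand from(step)to[(step)to...] notation, e.g. '1(1)30(10)120'."""
    parts = [float(t) for t in re.split(r"[()]", spec.strip())]
    values = [parts[0]]
    # parts alternates: start, step, to, step, to, ...
    for step, stop in zip(parts[1::2], parts[2::2]):
        start = values[-1]
        count = round((stop - start) / step)
        values.extend(round(start + i * step, 10) for i in range(1, count + 1))
    return values

def expand_comma(spec):
    """Expand min,max,steps[,min,max,steps...] notation, e.g. '1,30,29,40,120,8'."""
    parts = [float(t) for t in spec.split(",")]
    values = []
    for lo, hi, steps in zip(parts[0::3], parts[1::3], parts[2::3]):
        n = int(steps)
        for i in range(n + 1):
            v = round(lo + i * (hi - lo) / n, 10)
            if not values or v != values[-1]:   # skip a repeated boundary value
                values.append(v)
    return values

print(expand_paren("1(4)5(5)30(10)40(20)120"))
print(expand_comma("1,30,29,40,120,8") == expand_paren("1(1)30(10)120"))
```

The first print shows the twelve df values used for the small t table earlier; the second confirms that the two df specifications in the examples describe the same list.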
dep var         when you hit return or the down arrow from the
                <indep var:> prompt, you will go to the blank area
                below and be prompted with c2:. At this point,
                enter
                   1. a two letter code for the distribution
                   2. values of the parameters
                   3. bar, pdf, cdf, rel, haz or x
                The sixth help screen summarizes the choices:
                enter <dist> <params> <bar,pdf,cdf,haz,rel or x>
                use two letter code for dist, values of params
                in order listed
                                            F           fd df1 df2
                binomial        bi n p      noncen F    nf df1 df2 nc
                disc uniform    du min max  gamma       ga Θ a
                disc Weibull    dw p ß      inv Gausn   ig µ lambda
                hypergeometric  hy N n k    Laplace     la a b
                neg binomial    nb p n      lognormal   ln µ σ
                Poisson         po µ        logistic    lo µ σ
                beta            be a b      normal      no µ σ
                Cauchy          ca a b      observed    ob fname col#
                chi-square      ch df       Pareto      pa b
                noncen chi-sq   nx df nc    Rayleigh    ra b
                cont uniform    cu min max  Students t  st df
                extr value lg   el a b      noncen t    nt df nc
                extr value sm   es a b      triangular  tr a b
                exponential     ex µ        Weibull     we Θ ß δ
                Descriptions of the distributions and their
                parameters can be found in ks.doc.
                The solidus can be used to enter fractions, e.g.
                you can use 1/3 (1.0 / 3, 1/3.0, etc) instead of
                .33333333333333333. Mixed numbers cannot be used.
                That is, 4 1/3 is taken as two separate numbers,
                not as 13/3.
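The fraction rule can be sketched in a few lines of Python. This is an illustration of the behavior described above and not kspdat's parser; the function names are invented, and the sketch assumes fractions are typed without spaces around the solidus.

```python
def parse_number(token):
    """Convert one token such as '.4', '10' or '1/3' to a float."""
    if "/" in token:
        numerator, denominator = token.split("/", 1)
        return float(numerator) / float(denominator)
    return float(token)

def parse_params(text):
    """Split on whitespace; each token is its own number,
    so '4 1/3' gives two values rather than 13/3."""
    return [parse_number(t) for t in text.split()]

print(parse_params("1/3"))
print(parse_params("4 1/3"))
```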
                If the independent variable is x, the dependent
                variables can be any combination of
                pdf, bar, cdf or rel from discrete distributions or
                pdf, cdf, haz or rel from continuous distributions.
                If the independent variable is cdf, the dependent
                variables can be x from any distribution or
                observed data.
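With cdf as the independent variable, each dependent column holds the x value at which the distribution's cdf equals the given probability (the inverse cdf). A minimal Python sketch for the normal case follows, roughly what an independent variable of cdf .1(.1).9 with a dependent variable of no 0 1 x would be expected to tabulate. It uses the standard library's NormalDist (Python 3.8 or later) rather than anything from kspdat, and the function name is invented.

```python
from statistics import NormalDist

def inverse_cdf_column(probabilities, mu=0.0, sigma=1.0):
    """Return the x value for each cdf value under a normal(mu, sigma) distribution."""
    dist = NormalDist(mu, sigma)
    return [dist.inv_cdf(p) for p in probabilities]

cdf_values = [i / 10 for i in range(1, 10)]   # like cdf .1(.1).9
for p, x in zip(cdf_values, inverse_cdf_column(cdf_values)):
    print(f"{p:4.2f}  {x:8.4f}")
```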
                If you compute pdf values from a discrete
                distribution, you can compute either the actual pdf
                or bars. With bars, x is rounded to the nearest
                integer and the pdf is computed for that integer.
                This is useful for graphing.
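The rounding step of the bar format can be sketched as follows in Python. This covers only the rule stated above (round x, then evaluate the pdf at that integer). It does not reproduce the repeated n + .5 rows that kspdat writes to trace the vertical edges of each bar, and the tie-breaking rule at exactly n + .5 is an assumption of this sketch.

```python
from math import comb, floor

def binom_pmf(k, n, p):
    """Binomial probability of exactly k successes; 0 outside 0..n."""
    if k < 0 or k > n:
        return 0.0
    return comb(n, k) * p**k * (1 - p)**(n - k)

def bar_value(x, n, p):
    """Round x to the nearest integer and return the pmf at that integer."""
    k = floor(x + 0.5)   # round half up; kspdat's tie rule is not documented here
    return binom_pmf(k, n, p)

# First part of the grid x -0.5(.1)16.5 used in the graphing exercise.
xs = [round(-0.5 + 0.1 * i, 10) for i in range(31)]
for x in xs:
    print(f"{x:5.2f}  {bar_value(x, 16, 0.5):8.6f}")
```

Every x between two consecutive half-integers gets the same height, which is what lets a line-drawing graph program show flat-topped bars.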
                Examples:
                bi 10 .4 bar    binomial n = 10 p = .4 bar format
                bi 10 .4 pdf    binomial n = 10 p = .4 pdf format
                no 0 1 cdf      normal (Gaussian) µ = 0 σ = 1 cdf
                ga 2 .5 x       gamma a = 2 Θ = .5 inverse cdf
                ob c:df.dat 3   observed data in file df.dat, col 3
                Use <enter>, the up arrow or the down arrow to
                complete the current entry. The cursor will go to
                the next available field, scrolling if necessary.
                Use alt-Q to end data entry. During data entry, you
                can use the up and down arrow keys to move to a
                previously entered field to edit it.
                The information you type is checked for form and
                number of parameters, but parameter values are not
                checked for correctness. Misspecified parameters
                will lead to either missing or garbage output. For
                example, giving 10.5 for the binomial n will be
                accepted but will be truncated to 10.
Most computed values are good to 12+ significant digits. A few
algorithms used, for the normal cdf and inv, t inv, chi sq inv, f
inv, gamma cdf and inv, and beta cdf and inv, require that the
number of significant digits in the returned value be specified
beforehand. For most cases, the number specified is 10. This is a
compromise between computation speed and accuracy.
For references, see ks.doc.