Text File | 1989-05-22 | 22KB | 541 lines
C a t J A C K
Statistical Calculator
by
Brad Strausbaugh
350 E. Del Mar Blvd #125
Pasadena, Ca 91101
CatJACK is an easy-to-use interactive tool for fast quantitative
analysis of small data sets. The statistics available include single
and paired distribution, ANOVA, correlation and regression, and
crosstab analysis. No knowledge of programming is required. You will
need only a basic understanding of elementary statistics.
The Shareware Concept
CatJACK is widely available to the public on electronic bulletin
boards and from shareware distributors around the country. You are
invited to share CatJACK with colleagues and/or your local bulletin
board. To keep all the files together, please share only the
self-extracting CJ100.EXE. If you find CatJACK useful in meeting your
needs for quantitative analysis, please send $20 to the author.
Brad Strausbaugh
350 E. Del Mar Blvd #125
Pasadena, Ca 91101
Your support of the shareware concept helps ensure the continued
availability of quality software on a test-drive-first basis.
Shareware distributors are encouraged to copy and distribute CatJACK
for the few dollars per diskette typical in the shareware marketplace
today. Charging the author's fee mentioned above is prohibited and in
violation of the copyright.
Getting Started
After extracting CJ100.EXE you will find the following files.
CATJACK.DOC   This documentation file
CATJACK.EXE   The CatJACK program
BRUN30.EXE    The run-time module
T.TBL         The probability tables
F1.TBL
F5.TBL
CHISQ.TBL
STUDENTS      A data file for demonstration purposes
You can invoke CatJACK by setting your default device and subdirectory
to where you are keeping these files, and entering the following
command at a DOS prompt.
C> CATJACK
After taking a few seconds to load the probability tables, CatJACK
will display the login banner. Just press any key and you are ready
to start.
The 30 Minute Tour
This document uses the rather unoriginal example of a junior college
psychology class's data to illustrate CatJACK's operation. Although
this data has already been created, you will be asked in the next few
paragraphs to define and create a portion of it yourself. Doing so
will help you learn to make the best use of CatJACK.
Controlling the CatJACK Environment
Here are a few basic conventions used throughout the CatJACK system.
1. The main menu is always present at the top of the screen.
The currently active option is highlighted in reverse video.
2. You can change the current menu option with the left and
right arrow keys on the numeric keypad (be sure the "Num
Lock" key is off), or by entering the first letter of the
option you want. When the menu option you want is
highlighted, press the Return key to select it.
3. The Return key always advances you forward to the next
step while the Escape key allows you to "stepback" to the
previous step.
4. The bottom line of the screen will always display pertinent
help information.
Defining an Array
By way of illustration, let us define an array containing information
on our students in the junior college psychology class. We want to
determine (1) the distribution of our students' mid-term exam
scores, and (2) whether or not there is a significant difference
between the scores of male and female students. I have kept the
sample size small to minimize data entry time.
We begin by selecting the "Define" option from the main menu. We will
define a data array with two variables, Grade and Sex, for our 15
students. Just think of the array as a two-dimensional table where
each horizontal row represents a student and each vertical column
represents a variable.
After selecting "Define", a prompt will appear on line 2 asking for
the width of the first variable. Keep the following points in mind
when you select a width.
1. Variables can be from 1 to 10 characters wide (8 is the
default). This includes decimal points and signs. Be sure
you choose a width that will hold the maximum value you
expect in your data.
2. The width you select to accommodate the variable's data will
also be the maximum width allowed for the variable's name.
Let's make the first variable, Grade, 8 characters wide. Since 8 is
the default we can just press the Return key at the width prompt.
Next, CatJACK asks for the name we want to give this variable. Notice
the space available to enter the variable name is only 8 characters
wide, the same as our width selection. Enter the variable name
"Grade" (without the quotes).
Finally CatJACK asks whether Grade is a text or numeric variable.
Data for numeric variables is right justified and available for math
operations, while data for text variables is left justified and cannot
be summed, or averaged, etc. Since we want to perform calculations on
Grade, enter "N" for numeric.
Now our first variable has been defined and its name appears in the
upper left corner of the array editor. CatJACK now asks if we want to
define another variable. We still have to define a variable to
identify each student's sex, so enter "Y" for Yes.
We want the width for this variable to be 3 characters wide, the name
to be "Sex" and the data type to be Text. Go ahead and define the
variable Sex. Remember, the helpline is always present at the bottom
of the screen if you get stuck.
This time when CatJACK asks if you want to define another variable,
enter "N" for No. CatJACK will return you to the main menu.
Entering Data
With an array defined, we are now ready to enter our students' data.
Select the "Edit" option from the main menu.
CatJACK prompts for the grade of the first student. It also
highlights the corresponding cell on the spreadsheet. The first
student's mid-term exam grade was 96%, so enter "96" at the prompt,
then press Return. (Never enter a percent sign, dollar sign, or
pound sign. The only valid characters for a numeric variable are the
digits 0 to 9, a decimal point, and a sign if needed.)
CatJACK then prompts for the same student's Sex. Enter "M" for Male,
then press Return.
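The numeric entry rule described above can be sketched in a few lines
of Python. This is only an illustration, not CatJACK's own code: the
function name is_valid_numeric is hypothetical, and the exact checks
CatJACK performs (for example, whether a bare decimal point is
accepted) are assumptions.

```python
import re

MAX_WIDTH = 10  # the manual allows variables from 1 to 10 characters wide

# Assumed rule: optional sign, digits, optional decimal point,
# with at least one digit somewhere in the entry.
_NUMERIC = re.compile(r"[+-]?(\d+\.?\d*|\.\d+)")


def is_valid_numeric(entry: str, width: int = 8) -> bool:
    """Return True if entry fits the width and uses only permitted characters."""
    if not 1 <= width <= MAX_WIDTH:
        raise ValueError("width must be between 1 and 10")
    return len(entry) <= width and _NUMERIC.fullmatch(entry) is not None


print(is_valid_numeric("96"))    # a plain grade
print(is_valid_numeric("96%"))   # percent sign is not permitted
```

Note that the width counts the sign and decimal point as well as the
digits, which is why the length check is applied to the whole entry.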
CatJACK is now awaiting data for the second student. Go ahead and
enter data for the remaining students as shown below. If you make a
mistake and need to change the data in a certain cell, use the arrow
keys to move to that cell, then enter the correct data. If you get in
trouble, remember the help line at the bottom of the screen.
Define Edit Save Load Sort Analyze Print Quit
+- Grade -Sex--------
 1    96  M
 2    80  F
 3    84  F
 4    90  M
 5    88  F
 6    78  M
 7    86  M
 8    88  F
 9    92  M
10    90  F
11    84  F
12    88  F
13    68  M
14    74  F
15   100  M
16
17
+-------------------
When you are finished press the Escape key to stepback to the main
menu.
We have kept the number of students in our example down to 15 just to
minimize data entry time. Your actual data will probably exceed the
17 lines showing in the first spreadsheet window. CatJACK allows up
to 200 lines of data. The edit window scrolls as needed to show the
current cell.
Single Distribution Analysis
Now we can examine the students' Grade distribution. Select the
"Analyze" main menu option. CatJACK will ask whether we want
Distribution, ANOVA, Correlation/Regression, or Crosstab analysis.
Select "Distribution".
CatJACK then asks whether we want a "Single" or "Paired" distribution
analysis. Let's select "Single" first to see the distribution of the
whole class.
CatJACK prompts for which variable to analyze. Enter "Grade". If you
enter a variable that does not exist, an error message will appear
asking you to reenter the correct variable name.
After a few seconds CatJACK will display the Single Distribution
Analysis of the variable Grade. This screen shows the sample size
(n), mean, median, mode, standard deviation (StdDev), standard error
of the mean (StdErr), highest, lowest, range, variance and sum. It
also shows a grouped frequency histogram with mid-point raw score and
mid-point z score.
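For readers who want to check the figures on this screen by hand, the
same summary statistics can be sketched in Python using the 15 grades
entered earlier. This is an illustration only. It assumes the sample
(n - 1) form of the variance and standard deviation; the manual does
not say which form CatJACK uses, so the last digits may differ from
the screen.

```python
import math
import statistics

# Mid-term grades for the 15 students in the tour.
grades = [96, 80, 84, 90, 88, 78, 86, 88, 92, 90, 84, 88, 68, 74, 100]


def describe(values):
    """Summary statistics named after the labels on the CatJACK screen."""
    n = len(values)
    std_dev = statistics.stdev(values)  # assumption: sample (n - 1) form
    return {
        "n": n,
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "mode": statistics.mode(values),
        "StdDev": std_dev,
        "StdErr": std_dev / math.sqrt(n),  # standard error of the mean
        "highest": max(values),
        "lowest": min(values),
        "range": max(values) - min(values),
        "variance": statistics.variance(values),
        "sum": sum(values),
    }


summary = describe(grades)
for name, value in summary.items():
    print(f"{name:>8}: {value:.4g}")
```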
Notice the histogram is made up of asterisks (*), one for each case.
We should mention here that if the frequency of the mode exceeds 15,
each asterisk will represent more than one case. Just how many cases
each asterisk represents depends on how many multiples of 15 that
frequency exceeds. In such cases the histogram may show a group with a
frequency greater than 1 but no asterisks. The histogram should be
used only to get a general picture of the distribution curve.
If you want to send this display to your printer select the "Print"
option from the line-2 menu. The "Print" option is available for all
analysis screens. When you are finished select the "Menu" option, or
just press the Escape key to stepback to the distribution menu.
Paired Distribution Analysis
We can compare the mid-term grades of male and female students.
Stepback to the Distribution menu with the Escape key and select
"Paired".
At the appropriate prompts select "Sex" as the independent variable,
and "Grade" as the dependent variable. For independent group x
enter "F" (Female), and for independent group y enter "M" (Male).
After a few seconds CatJACK will display the Paired Distribution
Analysis of Grade by Sex.
In addition to the statistics we saw in the Single Distribution
Analysis, we also see the t-ratio, degrees of freedom (d.f.) and p for
both the one- and two-tailed probability distributions. The histogram
shows occurrences of the x group as "x", the y group as "y",
and where both occur, "#".
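The t-ratio and degrees of freedom for Grade by Sex can be reproduced
with a short Python sketch. It assumes the classic pooled-variance
t-test for two independent groups; the manual does not state which
form CatJACK uses. The p values are left out because CatJACK reads
them from its probability tables, and the function name pooled_t is
hypothetical.

```python
import math
import statistics

grades = [96, 80, 84, 90, 88, 78, 86, 88, 92, 90, 84, 88, 68, 74, 100]
sexes = list("MFFMFMMFMFFFMFM")  # one letter per student, in entry order


def pooled_t(x, y):
    """Return (t-ratio, degrees of freedom) for two independent groups."""
    nx, ny = len(x), len(y)
    df = nx + ny - 2
    # Assumption: both groups share one variance, so pool the two estimates.
    pooled_var = (
        (nx - 1) * statistics.variance(x) + (ny - 1) * statistics.variance(y)
    ) / df
    std_err = math.sqrt(pooled_var * (1 / nx + 1 / ny))
    return (statistics.mean(x) - statistics.mean(y)) / std_err, df


group_x = [g for g, s in zip(grades, sexes) if s == "F"]
group_y = [g for g, s in zip(grades, sexes) if s == "M"]
t_ratio, df = pooled_t(group_x, group_y)
print(f"t = {t_ratio:.3f}, d.f. = {df}")
```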
Adding New Variables To The Current Data Array
Suppose that instead of comparing the mid-term exam grades by Sex, we
want to compare them by the students' high schools. To do this we
will add a new variable to the array. We will name it "HS" for High
School.
Stepback to the main menu with the Escape key and select "Define".
Since CatJACK now has our current array in memory, it asks whether we
want to "Create" a new array or "Modify" the current one. We want to
add a variable to the current array so select "Modify". Then CatJACK
asks if we want to "Add" a new variable or "Delete" an existing one.
Select "Add".
Subsequent prompts should look familiar. They are the Width, Name,
and Text/Numeric prompts we've seen before. Define the new high
school variable as Width: 2, Name: HS, and data type T (text).
Now stepback to the main menu and select the "Edit" option. Using the
arrow keys to position the active cell on line 1 under the HS
variable, enter the data into the new variable as shown below.
Terminate each entry with the Down Arrow key instead of the Return key
to stay in the HS column.
By the way, "WY" is for West York, "EY" is for East York, and "C" is
for Central.
Define Edit Save Load Sort Analyze Print Quit
+- Grade -Sex-HS-------
 1    96  M   WY
 2    80  F   C
 3    84  F   C
 4    90  M   WY
 5    88  F   EY
 6    78  M   C
 7    86  M   C
 8    88  F   WY
 9    92  M   WY
10    90  F   EY
11    84  F   WY
12    88  F   EY
13    68  M   EY
14    74  F   C
15   100  M   WY
16
17
+-----------------------
When you have entered the high schools, stepback to the main menu with
the Escape key.
Analysis of Variance
Now we can compare grades across high schools. Since there are more
than two high schools represented in this junior college class we use
"ANOVA".
Select "Analyze" from the main menu, then "ANOVA" from the analyze
menu. Enter "HS" as the independent variable and "Grade" as the
dependent.
After computing for a few seconds CatJACK displays the Analysis of
Variance Summary Table which includes the within and between groups
sum of squared deviations from the mean (SS), degrees of freedom
(d.f.), mean square (MS), F-ratio, and p.
If you want to see statistics for each group of students by high
school, select "GroupDetail" from the ANOVA menu. This display shows
each group's n, mean, standard deviation, and standard error of the
mean.
When you are using ANOVA on your own data, keep in mind that the
maximum number of independent groups allowed is ten.
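The entries of the summary table can be checked with a small Python
sketch of a one-way analysis of variance on Grade by HS. It follows
the textbook definitions of SS, d.f., MS and F and is not taken from
CatJACK itself; the function name one_way_anova is hypothetical, and
p is omitted because CatJACK looks it up in its F tables.

```python
from collections import defaultdict
import statistics

grades = [96, 80, 84, 90, 88, 78, 86, 88, 92, 90, 84, 88, 68, 74, 100]
schools = ["WY", "C", "C", "WY", "EY", "C", "C", "WY",
           "WY", "EY", "WY", "EY", "EY", "C", "WY"]


def one_way_anova(values, labels):
    """Return the between/within SS, d.f., MS and the F-ratio."""
    groups = defaultdict(list)
    for value, label in zip(values, labels):
        groups[label].append(value)
    grand_mean = statistics.mean(values)
    # Between groups: each group mean's squared distance from the grand mean.
    ss_between = sum(
        len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups.values()
    )
    # Within groups: each score's squared distance from its own group mean.
    ss_within = sum(
        sum((v - statistics.mean(g)) ** 2 for v in g) for g in groups.values()
    )
    df_between = len(groups) - 1
    df_within = len(values) - len(groups)
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return {
        "SS_between": ss_between, "SS_within": ss_within,
        "df_between": df_between, "df_within": df_within,
        "MS_between": ms_between, "MS_within": ms_within,
        "F": ms_between / ms_within,
    }


table = one_way_anova(grades, schools)
for name, value in table.items():
    print(f"{name:>10}: {value:.4g}")
```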
Saving and Loading Arrays from Disk
We still haven't seen correlation and crosstab analyses. To do that
we will need some new variables. Since you now know how to define
arrays and enter data, this would be a good time to look at Saving the
array you've just entered to disk and Loading an array I've prepared
for you that contains the new variables.
It is a good habit to save new data to disk immediately after you
leave the Edit mode. In the event of a system failure you can recover
simply by reloading the array from disk. And if you are entering a
particularly large array you may want to save the array after entering
every twenty lines or so.
First Save your current array to disk. Use the Escape key to stepback
to the main menu, then select the Save option. CatJACK asks for a
name to identify the array when it's on disk. Enter the array name
"CLASS". CatJACK will then save your array to disk.
Now you can load the new larger array. Select the Load option from
the main menu. When CatJACK asks for the name of the array to load,
enter "STUDENTS". CatJACK will take a few seconds to load the array
before it displays it on the spreadsheet.
In addition to the variables you are familiar with, our new array
contains each student's GPA and SAT scores. Notice, also, that the
variable name for HS has been changed to "HighSchool" and its data is
now spelled out.
Correlation Analysis
Suppose we want to know the extent to which we can predict a student's
mid-term exam grade from his or her GPA.
Select the main menu option "Analyze", then select the analyze menu
option "Correlation/Regression". CatJACK then prompts for which
correlation coefficient to use, Pearson or Spearman. Since Grades and
GPA are interval data we select "Pearson". Using Spearman for
anything other than ranked data will yield erroneous results.
CatJACK then asks which display you want. Select "Detail" (we'll
look at the others in a moment). Then CatJACK prompts for which
variables to use. Enter "GPA" as the independent (predictor) variable,
and "Grade" as the dependent variable.
After a few seconds CatJACK displays the Pearson Correlation Analysis.
This display includes the Pearson Correlation coefficient (r), r
Square, standard error of prediction (StdErr), degrees of freedom,
t-ratio, regression line slope and Y intercept. It also shows a brief
scatterplot where points are represented as ".", ":", "+", or "#"
depending on their frequencies.
Now we can predict a student's likely mid-term exam score from his or
her GPA. Select the line-2 menu option "Predict", then enter the GPA
of a student whose Grade you want to predict, say, "3.2". CatJACK
will display the predicted Grade (Y Prime) along with the Grade high,
low, and range for the 99%, 95%, 90%, and 80% confidence levels.
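The coefficient, regression line and predicted Grade can be sketched
in Python from the GPA and Grade columns of the STUDENTS array. This
is the ordinary least-squares calculation found in any statistics
text, offered as a check rather than as CatJACK's own method. The
function name pearson_fit is hypothetical, and the confidence ranges
are not reproduced because they depend on CatJACK's t tables.

```python
import math

gpa = [3.6, 3.3, 3.4, 3.4, 3.5, 2.7, 3.5, 3.4, 3.6, 2.8, 3.0, 3.1, 2.3, 2.4, 3.8]
grades = [96, 80, 84, 90, 88, 78, 86, 88, 92, 90, 84, 88, 68, 74, 100]


def pearson_fit(x, y):
    """Return (r, slope, intercept) for the least-squares line of y on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    r = sxy / math.sqrt(sxx * syy)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return r, slope, intercept


r, slope, intercept = pearson_fit(gpa, grades)
predicted = intercept + slope * 3.2  # Y Prime for a student with a 3.2 GPA
print(f"r = {r:.3f}, r Square = {r * r:.3f}")
print(f"slope = {slope:.3f}, Y intercept = {intercept:.3f}")
print(f"predicted Grade at GPA 3.2 = {predicted:.1f}")
```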
You can compare one numeric variable with all others. Let's stepback
to the Correlation menu using the Escape key, select the "Pearson"
coefficient again, and now select the "OneToAll" display. At the
independent variable prompt, enter "Grade". CatJACK now displays
statistics showing how Grade correlates to all other numeric
variables.
And finally we can see how all numeric variables correlate to all
others. Proceed to the display prompt as before, and this time select
"AllToAll". CatJACK will display the correlation coefficient of all
numeric variables to all others.
Crosstab Analysis
We can examine the contingency of two variables. Suppose we want to
compare the frequency of male vs female students from each high
school. Use the Escape key to stepback to the analysis menu and
select "Crosstab". Enter "HighSchool" as the X variable and "Sex" as
the Y variable. After computing for a few seconds, CatJACK will
display the Crosstab Analysis screen, with each cell containing the
observed (O) and theoretical (T) frequencies. It will also show the
degrees of freedom, chi-square, and p.
When you use Crosstab analysis on your data keep in mind that each
variable can have no more than ten groups.
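The observed and theoretical frequencies can be worked out with a
brief Python sketch of a HighSchool by Sex crosstab. It assumes the
plain Pearson chi-square with no continuity correction, which the
manual does not confirm, and it omits p because CatJACK reads that
from CHISQ.TBL. The function name crosstab is hypothetical.

```python
from collections import Counter

schools = ["WESTYORK", "CENTRAL", "CENTRAL", "WESTYORK", "EASTYORK",
           "CENTRAL", "CENTRAL", "WESTYORK", "WESTYORK", "EASTYORK",
           "WESTYORK", "EASTYORK", "EASTYORK", "CENTRAL", "WESTYORK"]
sexes = list("MFFMFMMFMFFFMFM")


def crosstab(x, y):
    """Return ({cell: (observed, theoretical)}, chi-square, d.f.)."""
    observed = Counter(zip(x, y))
    x_totals, y_totals = Counter(x), Counter(y)
    n = len(x)
    cells = {}
    chi_square = 0.0
    for x_value in x_totals:
        for y_value in y_totals:
            o = observed[(x_value, y_value)]
            # Theoretical frequency: row total times column total over n.
            t = x_totals[x_value] * y_totals[y_value] / n
            cells[(x_value, y_value)] = (o, t)
            chi_square += (o - t) ** 2 / t
    df = (len(x_totals) - 1) * (len(y_totals) - 1)
    return cells, chi_square, df


cells, chi_square, df = crosstab(schools, sexes)
for (school, sex), (o, t) in sorted(cells.items()):
    print(f"{school:<10} {sex}  O={o}  T={t:.2f}")
print(f"chi-square = {chi_square:.3f}, d.f. = {df}")
```

With only 15 students the theoretical frequencies are all small, so
the result should be read as an illustration of the arithmetic rather
than as a dependable test.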
Sorting
We can sort the data array by any number of variables. For example,
suppose we want to sort our students by HighSchool, and within each
HighSchool we want to sort by Grade.
To do this, stepback to the main menu and select the "Sort" option.
CatJACK will prompt for the variables or "keys" to sort by. Enter
"HighSchool" and "Grade" for keys 1 and 2 respectively. When CatJACK
prompts for key 3, just press Return.
After a few seconds CatJACK will display the array in the sort order
requested.
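A two-key sort of this kind can be sketched in Python as follows. The
sketch assumes ascending order on both keys, which the manual does
not state, and it carries only the HighSchool and Grade columns to
keep the example short.

```python
from operator import itemgetter

rows = [
    {"HighSchool": "WESTYORK", "Grade": 96},
    {"HighSchool": "CENTRAL", "Grade": 80},
    {"HighSchool": "CENTRAL", "Grade": 84},
    {"HighSchool": "WESTYORK", "Grade": 90},
    {"HighSchool": "EASTYORK", "Grade": 88},
    {"HighSchool": "CENTRAL", "Grade": 78},
    {"HighSchool": "CENTRAL", "Grade": 86},
    {"HighSchool": "WESTYORK", "Grade": 88},
    {"HighSchool": "WESTYORK", "Grade": 92},
    {"HighSchool": "EASTYORK", "Grade": 90},
    {"HighSchool": "WESTYORK", "Grade": 84},
    {"HighSchool": "EASTYORK", "Grade": 88},
    {"HighSchool": "EASTYORK", "Grade": 68},
    {"HighSchool": "CENTRAL", "Grade": 74},
    {"HighSchool": "WESTYORK", "Grade": 100},
]

# Key 1 is HighSchool, key 2 is Grade: rows are ordered by school first,
# and ties within a school are broken by grade.
sorted_rows = sorted(rows, key=itemgetter("HighSchool", "Grade"))

for row in sorted_rows:
    print(f"{row['HighSchool']:<10} {row['Grade']:>3}")
```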
Printing the Data Array
To list the entire data array on your printer just select the "Print"
main menu option. The array will be printed out in the current sort
order with the variable names appearing across the top and line
numbers down the left side.
To Quit from CatJACK
You can terminate your CatJACK interactive session and return to DOS
by selecting the "Quit" main menu option. If the current data array
has been changed but not saved to disk, a warning will appear, asking
if you want to save it before quitting.
Analyzing Data Output by Other Programs
You can use CatJACK to analyze data you already have on disk from
other programs. This data must have the characteristics outlined
here. See the example of our STUDENTS file below.
1. Variables cannot be more than 10 characters wide.
2. Variables must be separated by one space.
3. On the first line, an "N" for numeric or "T" for text must
appear, left justified, over each variable.
4. On the second line, the variable name must appear, left
justified, over each variable.
5. On the third line, hyphens must appear across the width of
each variable. This tells CatJACK each variable's start and
end position.
6. From the fourth line on the data should appear. Numeric data
must be right justified within each variable, and text data
must be left justified.
7. CatJACK allows a maximum of 200 lines of data.
Any deviation from these characteristics can have unpredictable
results. You can use your favorite text editor to make the necessary
modifications to your file. If you are using a word processor, be
sure it is set to non-document mode.
N        T   T          N        N
Grade    Sex HighSchool GPA      SAT
-------- --- ---------- -------- --------
      96 M   WESTYORK        3.6     1300
      80 F   CENTRAL         3.3     1166
      84 F   CENTRAL         3.4     1210
      90 M   WESTYORK        3.4     1280
      88 F   EASTYORK        3.5     1277
      78 M   CENTRAL         2.7     1120
      86 M   CENTRAL         3.5     1260
      88 F   WESTYORK        3.4     1256
      92 M   WESTYORK        3.6     1380
      90 F   EASTYORK        2.8     1390
      84 F   WESTYORK        3.0     1250
      88 F   EASTYORK        3.1     1310
      68 M   EASTYORK        2.3      824
      74 F   CENTRAL         2.4     1022
     100 M   WESTYORK        3.8     1498
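If you are preparing a file with another program, it may help to
check the layout before loading it. The Python sketch below reads a
file in the format described above, using the hyphens on the third
line to find where each variable starts and ends. It is an
illustration under those stated rules, not CatJACK's loader; the
function name parse_catjack is hypothetical, and the two-row sample
is built with format strings so that the columns line up exactly.

```python
import re


def parse_catjack(text):
    """Return (names, kinds, rows) from text in the CatJACK array layout."""
    lines = text.splitlines()
    types_line, names_line, rule_line = lines[0], lines[1], lines[2]
    # Each run of hyphens marks one variable's start and end position.
    spans = [m.span() for m in re.finditer(r"-+", rule_line)]
    names = [names_line[a:b].strip() for a, b in spans]
    kinds = [types_line[a:b].strip() for a, b in spans]
    rows = []
    for line in lines[3:]:
        if not line.strip():
            continue  # skip blank lines
        row = {}
        for name, kind, (a, b) in zip(names, kinds, spans):
            cell = line[a:b].strip()
            # Numeric ("N") cells become floats; text ("T") cells stay strings.
            row[name] = float(cell) if kind == "N" and cell else cell
        rows.append(row)
    return names, kinds, rows


# Widths 8, 3, 10, 8, 8 with one space between variables, as in STUDENTS.
sample = "\n".join([
    f"{'N':<8} {'T':<3} {'T':<10} {'N':<8} {'N':<8}",
    f"{'Grade':<8} {'Sex':<3} {'HighSchool':<10} {'GPA':<8} {'SAT':<8}",
    f"{'-' * 8} {'-' * 3} {'-' * 10} {'-' * 8} {'-' * 8}",
    f"{96:>8} {'M':<3} {'WESTYORK':<10} {3.6:>8} {1300:>8}",
    f"{80:>8} {'F':<3} {'CENTRAL':<10} {3.3:>8} {1166:>8}",
])

names, kinds, rows = parse_catjack(sample)
print(names)
print(rows[0])
```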
CatJACK
Jack is a brown long-haired Persian cat who kept me company through
the long hours of developing this program.