- STATS 3 MENU
-
- REGRESSION
- For the tests that follow, all except LOGIT regression have
- similar input and output structures. You will be asked for the
- variables that are the independent variables and for the one
- dependent variable. You will then be asked for the variable
- (column) into which the calculated values should be placed. The
- program does not place the residuals in variable (column) a, as
- this would restrict the number of variables which could actually
- be used in the regression. To get the residuals, simply subtract
- the calculated values from the actual values in the data editor.
- The methods differ only in the additional inputs they require.
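As a rough illustration of the subtraction step described above, the following Python sketch fits a one-variable least-squares line and forms the residuals as actual minus calculated. The function name fit_line and the sample data are assumptions made for this example; they are not part of B/STAT.

```python
def fit_line(x, y):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


# Hypothetical data: one independent variable and one dependent variable.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
actual = [2.1, 3.9, 6.2, 7.8, 10.0]

a, b = fit_line(x, actual)

# The calculated values are what the program writes to the chosen column.
calculated = [a + b * xi for xi in x]

# Residual = actual - calculated, the subtraction done in the data editor.
residuals = [yi - ci for yi, ci in zip(actual, calculated)]
```

With an intercept in the model, the residuals from a least-squares fit sum to zero (up to rounding), which is a quick check that the subtraction was done the right way round.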
- -Multiple regression is a traditional regression.
- -Ridge regression will require the entry of a ridge factor, which
- should be small and between 0 and 1 (most often below .2).
- -Stepwise regression is like multiple regression, except that you
- specify all independent variables to be considered. The program
- decides on which of these to actually use in the regression.
- -Cochran refers to a regression done using the Cochrane-Orcutt
- procedure. A "Cochran" factor between 0 and 1 must be supplied.
- This type of regression actually uses a part of the previous point
- in the calculation. If the Cochran factor is 1, then the
- regression is actually calculated upon the first differences of
- the variables.
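A common form of this kind of procedure is the quasi-difference transform: the factor times the previous value is subtracted from each value before an ordinary regression is run. The Python sketch below assumes that form. The name quasi_difference is hypothetical, and the exact calculation B/STAT performs is not stated in this help text.

```python
def quasi_difference(values, factor):
    """Return values[t] - factor * values[t-1] for each t after the first.

    The first observation is dropped because it has no predecessor.
    """
    if not 0.0 <= factor <= 1.0:
        raise ValueError("factor must be between 0 and 1")
    return [cur - factor * prev for prev, cur in zip(values, values[1:])]


y = [10.0, 12.0, 15.0, 19.0]      # hypothetical series

half = quasi_difference(y, 0.5)   # half of the previous point is removed
full = quasi_difference(y, 1.0)   # a factor of 1 gives first differences
```

The same transform would be applied to every variable in the regression, dependent and independent alike.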
- -Huber regression is used to reduce the weight given to outliers
- in the data. You will need to specify two additional pieces of
- data. The first is the variable into which the program places the
- weights, and the second is the value of the residual at which the
- weights should start to be changed. This procedure can only be
- used after first doing a traditional regression.
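The sketch below shows one standard way such weights are formed, the usual Huber weight function: residuals no larger than the threshold keep full weight, and larger ones are scaled down in proportion to their size. It is an assumption that B/STAT uses exactly this formula; the name huber_weights and the sample residuals are made up for the example.

```python
def huber_weights(residuals, threshold):
    """Weight 1 inside the threshold, threshold/|residual| outside it."""
    if threshold <= 0:
        raise ValueError("threshold must be positive")
    return [
        1.0 if abs(r) <= threshold else threshold / abs(r)
        for r in residuals
    ]


# Hypothetical residuals from a traditional regression done beforehand.
residuals = [0.5, -1.0, 4.0, -8.0]
weights = huber_weights(residuals, 2.0)
```

Here the first two points keep full weight, while the residuals of 4 and -8 are weighted 0.5 and 0.25. The weights would then be used in a weighted regression.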
- -Weighted regression requires you to specify a weight variable
- before execution.
- -Chow regression is a simple modification of multiple regression.
- It is used to see if the regression parameters are constant over
- the scope of the data variables. You will have to specify the
- number of points to keep in the first sample.
- -LOGIT regression is used when the dependent variable is to be
- constrained to a value above 0 but below 1. LOGIT setup converts
- unsummarized data to the form required by the regression program.
- (Save original data first!)
- -Principal Components is not actually a regression method at all.
- It is a process used to reduce the number of variables needed to
- explain the variation in the data. The resultant variables are
- orthogonal; that is, the correlation between any two of them is
- 0. Regression can often then be carried out against these
- pseudo-variables. The process is destructive, in that it wipes out
- the existing variables. Each new variable is a linear combination
- of the original ones.
- -Correlation matrix shows the correlation between a group of
- variables, rather than doing a full regression. This is often done
- to check for multicollinearity among the variables.
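To make the correlation matrix concrete, here is a minimal Python sketch that computes Pearson correlations between every pair of columns. The function names and the three sample columns are assumptions for illustration only.

```python
from math import sqrt


def pearson(u, v):
    """Pearson correlation coefficient between two equal-length columns."""
    n = len(u)
    mean_u = sum(u) / n
    mean_v = sum(v) / n
    suv = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    suu = sum((a - mean_u) ** 2 for a in u)
    svv = sum((b - mean_v) ** 2 for b in v)
    return suv / sqrt(suu * svv)


def correlation_matrix(columns):
    """Square matrix whose entry [i][j] is the correlation of columns i, j."""
    return [[pearson(u, v) for v in columns] for u in columns]


# Hypothetical variables; the second is nearly a multiple of the first.
columns = [
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [2.0, 4.1, 5.9, 8.0, 10.1],
    [5.0, 3.0, 4.0, 1.0, 2.0],
]
matrix = correlation_matrix(columns)
```

An off-diagonal entry close to 1 or -1, such as the one between the first two columns here, suggests the two variables carry nearly the same information and may cause trouble if both are used as independent variables.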
-
- TIME SERIES
- These are methods of smoothing or projecting data. They are often
- used in combination with other procedures.
- -Moving average requires you to choose the variable and the period
- of the moving average. As well, you must select a variable into
- which the averaged variable will be placed.
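A moving average of a given period can be sketched in Python as follows. The example assumes a trailing window (each result averages the current value and the ones before it) and leaves the first positions empty; whether B/STAT uses a trailing or a centered window is not stated here. The name moving_average is hypothetical.

```python
def moving_average(values, period):
    """Trailing moving average; the first period-1 slots are None."""
    if period < 1 or period > len(values):
        raise ValueError("period must be between 1 and the series length")
    out = [None] * (period - 1)
    window_sum = sum(values[:period])
    out.append(window_sum / period)
    for i in range(period, len(values)):
        # Slide the window: add the newest value, drop the oldest.
        window_sum += values[i] - values[i - period]
        out.append(window_sum / period)
    return out


series = [2.0, 4.0, 6.0, 8.0, 10.0]    # hypothetical variable
averaged = moving_average(series, 3)   # goes into the result variable
```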
- -Geometric moving average requires the same input as the linear
- moving average.
- -Fourier smoothing requires a variable to smooth and a variable to
- place the result. It also asks for the number of terms to be kept
- in the intermediate calculations. This value should be less than
- 50, usually less than 15. There must be no missing data for this
- procedure to work. Note that this can be a slow process.
- -Brown's 1-way exponential smoothing is simple exponential
- smoothing. You will be asked to specify the variable to smooth,
- and a variable in which to store the result. In addition, you will
- need a smoothing constant (0 to 1) and a starting value. If you do
- not specify the starting value, the program will generate one.
- This process is not designed for data with a distinct trend line.
- If there is a distinct linear trend, then 2-way exponential
- smoothing should be used.
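Simple exponential smoothing is usually written as: new level = constant * observation + (1 - constant) * previous level. The Python sketch below assumes that recurrence and, when no starting value is given, assumes the first observation is used. How B/STAT generates its own starting value is not stated, and the name exponential_smooth is hypothetical.

```python
def exponential_smooth(values, alpha, start=None):
    """Simple exponential smoothing with smoothing constant alpha."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    # Assumed default: start from the first observation.
    level = values[0] if start is None else start
    smoothed = []
    for x in values:
        level = alpha * x + (1.0 - alpha) * level
        smoothed.append(level)
    return smoothed


series = [10.0, 12.0, 11.0, 13.0]             # hypothetical variable
result = exponential_smooth(series, 0.5)      # generated starting value
from_zero = exponential_smooth(series, 0.5, start=0.0)
```

A constant near 1 follows the data closely; a constant near 0 smooths heavily and reacts slowly, which is why a series with a steady trend lags behind under this method.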
- -Brown's 2-way exponential smoothing uses linear regression to
- estimate a starting value and trend. You must specify the
- smoothing coefficient, the variable to smooth, and the variable
- for the result.
- -Holt's 2-way exponential smoothing is similar to Brown's, except
- that a separate smoothing coefficient is used for the trend
- factor.
- -Winters' exponential smoothing is used if there is a seasonal
- aspect to the data (like retail sales which have a December peak).
- You will have to enter 4 quantities. The first is the smoothing
- coefficient for level. The second is for trend. The third is for
- seasonality. The fourth value is the period of seasonality. Note
- that this method should not be used with data fluctuating above
- and below zero. With data that go below zero, add a constant to
- the data to eliminate negative values. Then, after smoothing,
- subtract the constant.
- -Interpolation
- B/STAT uses 3 forms of estimating unavailable data.
- -Simple linear interpolation requires that you simply select the
- variable.
- -Lagrangian interpolation requires two variables: an "X" variable
- and a "Y" variable. There can be no missing "X" variables. This
- can be slow with a large data set, since each point is used in
- estimating missing data.
- -Cubic spline interpolation assumes that the data set in the
- selected variable consists of evenly-spaced observations.
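The simplest of the three forms, linear interpolation, can be sketched in Python as below. The sketch assumes missing values are marked with None, that observations are evenly spaced, and that gaps at the very start or end are left unfilled; these are assumptions for the example, not a description of how B/STAT stores or treats missing data. The name fill_linear is hypothetical.

```python
def fill_linear(values):
    """Fill interior None gaps on a straight line between known neighbours."""
    filled = list(values)
    known = [i for i, v in enumerate(values) if v is not None]
    for left, right in zip(known, known[1:]):
        span = right - left
        for i in range(left + 1, right):
            fraction = (i - left) / span
            filled[i] = values[left] + fraction * (values[right] - values[left])
    return filled


data = [1.0, None, None, 4.0, None, 8.0, None]   # hypothetical variable
filled = fill_linear(data)
```

Only the two nearest known points are used for each gap, which is why this form needs no "X" variable and is much faster than Lagrangian interpolation on a large data set.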
-
- EXTRACT
- These selections allow you to reduce the size of the data set. The
- first option sums the data. For example, if you want to get yearly
- totals from a data set of monthly data, you can extract summed data
- and reduce the data by a factor of 12. Each element would then be
- a yearly total. In the non-summed case, only every 12th value would
- be left. No summing would be done. This is useful if you want to
- look at subsets in isolation.
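The two extract options can be sketched in Python as follows. The sketch assumes that the non-summed option keeps the last value of each group (the 12th, 24th, and so on) and that an incomplete group at the end is dropped; the help text does not say which value is kept, so treat both as assumptions. The name extract is hypothetical.

```python
def extract(values, factor, summed):
    """Reduce a series by the given factor, summing each group or not."""
    groups = [
        values[i:i + factor]
        for i in range(0, len(values) - factor + 1, factor)
    ]
    if summed:
        return [sum(group) for group in groups]
    # Assumed: keep the last element of each complete group.
    return [group[-1] for group in groups]


monthly = list(range(1, 25))                        # 24 months of made-up data

yearly_totals = extract(monthly, 12, summed=True)   # one total per year
every_12th = extract(monthly, 12, summed=False)     # one month per year
```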
-
- MISCELLANEOUS
- This menu has two procedures, in addition to the usual help
- selection.
- -Crosstabs is used to summarize data contained in two or
- three variables. It produces a count for the combination of values
- in the chosen variables. For example, you may have data on the
- height and weight of a group of army recruits. You could use
- crosstabs to find out the number in each height and weight
- classification, where these could be height in 2-inch increments
- and weight in 5-pound increments. It is most commonly used in
- market research for crosses such as people aged 30 to 34 who earn
- between 20,000 and 30,000 dollars per year.
-
- You first select the variables to use in the crosstab. If you
- select two, then a 2-way crosstab is done. If three, then a 3-way
- crosstab is done. Next, you select the breakpoints for the
- classes in each variable. There may be up to 14 breakpoints,
- giving a maximum of 15 classes for each variable. You need only
- type in as many breakpoints as a specific variable requires, and
- leave the rest blank. The number of breakpoints can be different
- for each variable. Note that the lower class includes the
- breakpoint value. Thus, a breakpoint of 200 pounds would put
- 200-pound people in the lower class and 200.01-pound people in the
- higher class. The program will print out the results. If you want,
- you may replace the data in memory with the summarized totals.
- This can be quite useful if you then want to perform a Chi square
- test, type 2, on the result to see if there are any significant
- relationships.
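The class rule described above (a value equal to a breakpoint falls in the lower class) can be sketched for the 2-way case in Python using the standard bisect module. The function names and the recruit data are made up for the example, and the breakpoints are assumed to be entered in ascending order.

```python
from bisect import bisect_left


def class_index(value, breakpoints):
    """Class number for a value; a value equal to a breakpoint goes low."""
    # bisect_left returns the position of the first breakpoint >= value,
    # so a value sitting exactly on a breakpoint lands in the lower class.
    return bisect_left(breakpoints, value)


def crosstab(rows, row_breaks, cols, col_breaks):
    """Count observations in each combination of row and column class."""
    table = [[0] * (len(col_breaks) + 1) for _ in range(len(row_breaks) + 1)]
    for r, c in zip(rows, cols):
        table[class_index(r, row_breaks)][class_index(c, col_breaks)] += 1
    return table


# Hypothetical recruits: height in inches, weight in pounds.
heights = [66.0, 68.0, 70.5, 72.0, 74.0]
weights = [150.0, 200.0, 200.01, 180.0, 220.0]

# Two height breakpoints give three classes; one weight breakpoint gives two.
table = crosstab(heights, [68.0, 72.0], weights, [200.0])
```

The resulting table of counts is the kind of summarized data that could replace the data in memory before a Chi square test.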
- -Difference is a rather simple process. The difference of a
- variable is simply the amount of its change from one period to the
- next. Some procedures work better on the change in a variable
- than on the variable itself. This is especially true in
- Box-Jenkins analysis. You merely supply the variable to
- difference and the variable into which to place the result.
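A minimal Python sketch of differencing follows. The result is one value shorter than the input because the first period has no predecessor; how B/STAT fills that first position is not stated. The optional order argument, which simply repeats the process, is an assumption added for the example, as is the name difference.

```python
def difference(values, order=1):
    """Period-to-period change, applied `order` times."""
    result = list(values)
    for _ in range(order):
        result = [cur - prev for prev, cur in zip(result, result[1:])]
    return result


series = [100.0, 103.0, 101.0, 106.0]   # hypothetical variable

first = difference(series)              # change from one period to the next
second = difference(series, order=2)    # difference of the differences
```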
-
-