- Comments: Gated by NETNEWS@AUVM.AMERICAN.EDU
- Path: sparky!uunet!nevada.edu!news.unomaha.edu!sol.ctr.columbia.edu!zaphod.mps.ohio-state.edu!darwin.sura.net!paladin.american.edu!auvm!LEVEN.APPCOMP.UTAS.EDU.AU!BROBINSO
- Message-ID: <9211151030.AA00944@leven>
- Newsgroups: bit.listserv.sas-l
- Date: Sun, 15 Nov 1992 21:30:34 +1200
- Reply-To: brobinso@LEVEN.APPCOMP.UTAS.EDU.AU
- Sender: "SAS(r) Discussion" <SAS-L@UGA.BITNET>
- From: brobinso@LEVEN.APPCOMP.UTAS.EDU.AU
- Subject: Covariance with data step
- Comments: To: sas-l@uga.cc.uga.edu
- Lines: 135
-
- Content: Question
- Platform/Release: SunOS/6.07
- Name: Barrie Robinson
- Email: B.Robinson@appcomp.utas.edu.au
-
- I'm sorry to burden the list with the following (and I don't even
- subscribe to it at the moment; I simply haven't the time to read or delete
- all the mail for the next few weeks). I've RTFMed till I'm blue in the
- face, but I can't figure out how to do what I feel ought to be
- quite simple.
-
- I have a data file with about 20 variables of interest (llength--lspor),
- and I wish to compute the covariance matrix using all available
- information for each pair of variables, much like pairwise deletion of
- missing values in SPSS. Now there are probably other ways of doing this,
- including PROC IML, but *surely* one ought to be able to do it with a
- DATA step.
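-
- For reference, one common way to write a pairwise-deletion covariance is
- sketched below. The notation is an assumption, not taken from the post:
- P_ij is the set of cases where variables i and j are both non-missing,
- n_ij is the number of such cases, and x_ki is the value of variable i in
- case k. Note that this version takes the sums (and hence the means) over
- the pairwise-complete cases only; other conventions exist.

```latex
c_{ij} \;=\; \frac{1}{n_{ij}-1}
\left(
  \sum_{k \in P_{ij}} x_{ki}\,x_{kj}
  \;-\;
  \frac{1}{n_{ij}}
  \Bigl(\sum_{k \in P_{ij}} x_{ki}\Bigr)
  \Bigl(\sum_{k \in P_{ij}} x_{kj}\Bigr)
\right)
```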
-
- The following is an early attempt:
-
- libname g 'GrammDir';
-
- data sp3;
-    set g.Gramm3;
-    if species=3;
-
- data g.sp3cov (keep=cov1-cov20);
-    array vars{*} llength lwidth lpos lpet larc lldens lpdens llthick lnsor
-                  llsrat lslrat lsleng sang sarc llsep lscalen lscawid
-                  lsporang annulus lspor;
-    array sums{20}(0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0);
-    array nnm{20}(0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0);
-    array cov{20};
-    array nt{20,20};
-    array sscp{20,20};
-    /* pass 1: count and sum the non-missing values of each variable */
-    do k=1 to last;
-       set sp3 point=k nobs=last;
-       do j=1 to 20;
-          if vars{j}^=. then do;
-             nnm{j}+1;
-             sums{j}+vars{j};
-          end;
-       end;
-    end;
-    /* zero the pairwise counts and cross-product sums */
-    do i=1 to 20;
-       do j=1 to 20;
-          nt{i,j}=0;
-          sscp{i,j}=0;
-       end;
-    end;
-    /* pass 2: accumulate pairwise counts and cross-products */
-    do k=1 to last;
-       set sp3 point=k;
-       do i=1 to 20;
-          do j=1 to 20;
-             if vars{i}^=. and vars{j}^=. then do;
-                nt{i,j}+1;
-                sscp{i,j}+vars{i}*vars{j};
-             end;
-          end;
-       end;
-    end;
-    /* one output row per variable i, holding its covariances with each j */
-    do i=1 to 20;
-       do j=1 to 20;
-          cov{j}=(sscp{i,j}-nt{i,j}*sums{i}/nnm{i}*sums{j}/nnm{j})
-                 /(nt{i,j}-1);
-       end;
-       output;
-    end;
-
- The above seemed to want to produce hundreds of cases (instead of the 20 I
- wanted) and to include the original variables, despite the keep=cov1-cov20
- clause.
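-
- A possible explanation for the runaway output, offered as a guess rather
- than a confirmed diagnosis: a DATA step that reads only through POINT=
- never reaches end-of-file, so unless it executes a STOP statement it
- starts over from the top and writes its rows again. The minimal sketch
- below shows the pattern on a single variable; work.demo and total are
- hypothetical names used only for illustration.

```sas
/* Sketch only: sum one variable via direct access, then stop.      */
/* work.demo and total are hypothetical names, not from the post.   */
data work.demo (keep=total);
   total = 0;
   do k = 1 to last;
      set sp3 point=k nobs=last;   /* read observation number k      */
      if llength ^= . then total + llength;
   end;
   output;                         /* write the single summary row   */
   stop;                           /* without STOP the step repeats  */
run;
```

- Placing STOP after the last OUTPUT ends the step after one pass, so the
- output data set is closed with exactly the rows written in that pass.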
-
- The following is a later attempt:
-
- libname g 'GrammDir';
-
- data sp3;
-    set g.Gramm3;
-    if species=3;
-
- data g.sp3cov (keep=cov1-cov20);
-    array vars{*} llength lwidth lpos lpet larc lldens lpdens llthick lnsor
-                  llsrat lslrat lsleng sang sarc llsep lscalen lscawid
-                  lsporang annulus lspor;
-    array sums{20}(0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0);
-    array nnm{20}(0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0);
-    array nt{20,20};
-    array sscp{20,20};
-    retain _all_;
-    array cov{20};
-    /* zero the pairwise counts and cross-product sums */
-    do i=1 to 20;
-       do j=1 to 20;
-          nt{i,j}=0;
-          sscp{i,j}=0;
-       end;
-    end;
-    /* single pass: per-variable sums plus pairwise cross-products */
-    do k=1 to last;
-       set sp3 point=k nobs=last;
-       do i=1 to 20;
-          if vars{i}^=. then do;
-             nnm{i}+1;
-             sums{i}+vars{i};
-          end;
-          do j=1 to 20;
-             if vars{i}^=. and vars{j}^=. then do;
-                nt{i,j}+1;
-                sscp{i,j}+vars{i}*vars{j};
-             end;
-          end;
-       end;
-    end;
-    /* one output row per variable i, holding its covariances with each j */
-    do i=1 to 20;
-       do j=1 to 20;
-          cov{j}=(sscp{i,j}-nt{i,j}*sums{i}/nnm{i}*sums{j}/nnm{j})
-                 /(nt{i,j}-1);
-       end;
-       output;
-    end;
-
- This one doesn't seem to do much better. Both programs overflow my
- available disk space (over 1 Mbyte to spare), and take a long time doing
- it.
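-
- As one of the "other ways" mentioned earlier, PROC CORR may already do
- what is wanted: by default it uses every pair of non-missing values
- (the NOMISS option switches to listwise deletion), and the COV option
- adds covariances to the OUTP= data set. The sketch below assumes these
- options behave the same way under release 6.07; allstats is a
- hypothetical data set name.

```sas
/* Sketch only: pairwise covariances via PROC CORR.                 */
/* allstats is a hypothetical name; sp3 is the subset built above.  */
proc corr data=sp3 cov noprint outp=allstats;
   var llength--lspor;             /* relies on the variable order  */
run;

/* Keep only the covariance rows of the OUTP= data set. */
data g.sp3cov;
   set allstats;
   if _type_ = 'COV';
run;
```

- The result has one row per variable, with columns named after the
- original variables rather than cov1-cov20, plus _TYPE_ and _NAME_.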
-
- If anyone has any ideas, I'd be grateful.
-
- *Please* email to my address above.
-
- Regards,
-
- Barrie.
-
- --
- Barrie Robinson,                       |email: brobinso@leven.appcomp.utas.edu.au
- University of Tasmania at Launceston.  |phone: (61)(03)260211
-