- Comments: Gated by NETNEWS@AUVM.AMERICAN.EDU
- Path: sparky!uunet!wupost!darwin.sura.net!paladin.american.edu!auvm!COMPUSERVE.COM!71020.1025
- Message-ID: <930107031142_71020.1025_EHC114-1@CompuServe.COM>
- Newsgroups: bit.listserv.sas-l
- Date: Wed, 6 Jan 1993 22:11:42 EST
- Reply-To: William Kahn <71020.1025@COMPUSERVE.COM>
- Sender: "SAS(r) Discussion" <SAS-L@UGA.BITNET>
- From: William Kahn <71020.1025@COMPUSERVE.COM>
- Subject: t
- Comments: To: sas-l@ohstvma.bitnet
- Lines: 40
-
- Patrick Haggard wrote
-
- > I have some data containing between n and m observations in each
- > of C conditions. I would like to have exactly n observations in
- > each condition
-
- This received (as of my last scan) two similar responses, each keeping the
- first n observations of the up to m (m>=n) in each group.
-
- May I suggest that an explicitly _random_ subset of each group be selected
- rather than the first n? Even if the data have no known order, there is
- often a non-random (though unknown) order.
-
- %let n = 10; *placeholder value, set n to the desired group size;
- data t; set old; x=ranuni(8911002); run; *x is a uniform random sort key;
- proc sort data=t; by group x; run; *note explicit scrambling within group;
- data new; set t; by group;
-   if first.group then count=0;
-   count+1; *using implicit retain implied by this syntax;
-   tag=(count>&n); *keep all observations in same dataset;
- run;
- proc glm data=new; class group; where tag=0; model dv=group; run; quit; *use where;
-
- BUT--a statistics question arises. When is it better to throw out data in
- order to attain balance than to analyze the unbalanced design? Granted, the
- estimates that assume balance are no longer minimum-variance unbiased, but
- don't they always have smaller mean squared error than the estimates you
- get after throwing out data? If you have a procedure which requires balance
- (proc anova), won't you get smaller-MSE estimates by averaging your m points
- down to n points (say, by averaging m-n pairs) and ignoring the averaging in
- the analysis than by throwing out m-n data points? Data is so precious--it
- seems a crying waste to throw it out just because some
- mathematicians/programmers don't give us an optimal analysis algorithm.
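
One way to put numbers on the question above, as a rough sketch rather than a full answer: assume the observations within a group are independent with a common variance sigma^2, that the quantity of interest is the group mean, and that n <= m <= 2n so that m-n disjoint pairs can be averaged. None of these assumptions come from the original thread. Under them all three estimates of the group mean are unbiased, so mean squared error equals variance. The function names are made up for illustration, and Python is used here only to do the arithmetic exactly.

```python
from fractions import Fraction


def var_random_subset(n):
    """Variance of the group mean, in units of sigma^2, when only n of the
    m observations are kept (the rest are thrown out)."""
    return Fraction(1, n)


def var_pair_averaged(n, m):
    """Variance of the unweighted mean of n values, in units of sigma^2,
    when m observations are reduced to n values by averaging m-n disjoint
    pairs and leaving the remaining 2n-m observations as they are."""
    if not n <= m <= 2 * n:
        raise ValueError("pair averaging as sketched here needs n <= m <= 2n")
    pairs = m - n          # each pair average has variance sigma^2 / 2
    singles = 2 * n - m    # each untouched observation has variance sigma^2
    return (Fraction(pairs, 2) + singles) / n**2


def var_all_points(m):
    """Variance of the equally weighted mean of all m observations, in
    units of sigma^2 (the best available under the stated assumptions)."""
    return Fraction(1, m)


# Illustrative sizes only; they are not taken from the original question.
n, m = 10, 14
print(var_random_subset(n), var_pair_averaged(n, m), var_all_points(m))
```

For n=10 and m=14 this prints 1/10 2/25 1/14: under the stated assumptions, averaging pairs gives a less variable group mean than discarding observations, though not quite as good as weighting all m points equally. The sketch covers only the group mean; it says nothing about how ignoring the averaging affects the error term in the ANOVA.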
-
- Bill Kahn <71020.1025@compuserve.com>
- W. L. Gore and Associates