
From:  Renan Levine 
Subject:  PSPPBUG: Logistic Regression bugs 
Date:  Wed, 14 Nov 2012 13:29:54 -0500 
User-agent:  Mozilla/5.0 (Windows NT 6.1; WOW64; rv:16.0) Gecko/20121010 Thunderbird/16.0.1 
Dear John:

I did not have access to SPSS, but one of my grad students did us a favour... As a result, I can confirm that missing cases on the dependent variable are dropped from the logistic regression analysis by SPSS. She writes:

See attached. Originally, the DV had 32.8% system-missing cases and 0.4% DK (which I declared missing). In the first test, these missing cases are dropped. In the second and third tests, I coded the 32.8% missing cases as 0 and 1 respectively, and still declared the 0.4% DK as missing. This does indeed produce different Ns and coefficients.

As for how SPSS and others estimate a missing dependent-variable value when none of the independent variables are missing: the model produces a predicted Y, which after rounding would predict 0 or 1. I think this link has a more extensive discussion if you are interested:
http://www.statisticalhorizons.com/wpcontent/uploads/MissingDataByML.pdf

At this point, my concerns/interests are much more mundane. Having a simple logistic regression routine to use in the classroom is my primary ambition. Ideally, it would be nice if PSPP could replicate SPSS' classification table (a 2x2 table showing how well the model predicted actual responses and the percentage of observations that were correctly predicted). See http://www.ats.ucla.edu/stat/spss/dae/logit.htm for a brief exposition and annotated discussion of the analysis.

Yours,
Renan

-----Original Message-----
On Tue, Nov 13, 2012 at 08:28:31PM -0500, Renan Levine wrote:

> Dear Mr. Darrington,

Please call me John :) - except on formal occasions, when I enjoy Dr. Darrington.

> The problem with the error message only concerns dichotomous dependent
> variables, not predictor variables. Missing values on the predictor
> variables do not pose any problems. Cases with missing values on any
> independent variables are dropped just like when completing OLS
> regressions.

Yes. Currently PSPP drops cases with missing values on any independent variable.

> I think unequivocally that what the routine needs to do is to ignore
> all missing values and just focus on the non-missing categories. For
> example, STATA's manual says: logit fits a maximum-likelihood logit
> model. depvar=0 indicates a negative outcome; depvar!=0 & depvar!=.
> (typically depvar=1) indicate a positive outcome.

So you are suggesting dropping cases with missing dependent variables too? That would seem reasonable.

> The way I understand that SPSS statement (if it's not a typo) is that
> the SPSS routine will generate a predicted value for any observations
> with a missing value on the dependent variable, assuming that none of
> the independent variables contain any missing values for that
> observation. This is one way that some use maximum likelihood
> techniques to impute missing values.

Yes, that is what it seems to be saying. The question which arises is, HOW does it generate the predicted value? The only reasonable way I can think of would be to calculate it from the coefficients of the predictors - but we don't know them a priori (the very purpose of logistic regression is to find them). Of course, it is possible to run the procedure ignoring the cases with missing dependents, then impute the values from the calculated coefficients, and run the procedure again, this time including the cases with imputed values. However, that would yield exactly the same results, except for slightly better (misleadingly better) confidence values. So doing that doesn't make much sense. Hence my confusion.

If you have access to SPSS, perhaps you could try some experiments for me? Can you see if SPSS simply drops cases with missing values on the dependent variable? Or does it treat them all as 0, or as 1, or what ...

Thanks for your help.

John

-- 
PGP Public key ID: 1024D/2DE827B3
fingerprint = 8797 A26D 0854 2EAB 0285 A290 8A67 719C 2DE8 27B3
See http://keys.gnupg.net or any PGP keyserver for public key.
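The two behaviours discussed in this thread (dropping cases whose dependent variable is missing, with 0 coded as negative and any other non-missing value as positive, and reporting a 2x2 classification table) might be sketched roughly as follows in Python. This is only an illustration, not PSPP or SPSS code: the function names `code_outcome` and `classification_table` are hypothetical, the 0.5 cutoff is an assumption, and the predicted probabilities are made-up inputs standing in for the output of a fitted model.

```python
from typing import List, Optional, Sequence, Tuple


def code_outcome(value: Optional[float]) -> Optional[int]:
    """Apply the rule quoted from the Stata manual: missing stays missing,
    0 is a negative outcome, any other non-missing value is positive."""
    if value is None:  # None stands in for a system- or user-missing value
        return None
    return 0 if value == 0 else 1


def classification_table(
    actual: Sequence[Optional[float]],
    predicted_prob: Sequence[float],
    cutoff: float = 0.5,  # assumed cutoff, not taken from the thread
) -> Tuple[List[List[int]], float, int]:
    """Return (table, percent_correct, n_used).

    table[observed][predicted] counts cases; cases whose dependent
    variable is missing are dropped before counting.
    """
    table = [[0, 0], [0, 0]]
    for y_raw, p in zip(actual, predicted_prob):
        y = code_outcome(y_raw)
        if y is None:
            continue  # drop the case, as the SPSS experiment suggests
        y_hat = 1 if p >= cutoff else 0
        table[y][y_hat] += 1
    n_used = sum(sum(row) for row in table)
    correct = table[0][0] + table[1][1]
    percent_correct = 100.0 * correct / n_used if n_used else float("nan")
    return table, percent_correct, n_used


# Made-up data: two cases have a missing dependent variable.
actual = [0, 1, None, 1, 0, 2, None, 0]
probs = [0.2, 0.7, 0.6, 0.4, 0.1, 0.9, 0.3, 0.8]

table, percent_correct, n_used = classification_table(actual, probs)
print("observed 0:", table[0])
print("observed 1:", table[1])
print("N used:", n_used, "percent correct: %.1f" % percent_correct)
```

With these inputs only six of the eight cases enter the table, which mirrors the change in N that the grad student's first test showed when the missing cases were dropped rather than recoded.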
Attachment: Tests_Renan.docx
Description: application/vnd.openxmlformats-officedocument.wordprocessingml.document