Re: PSPP-BUG: Logistic Regression bugs
From: John Darrington
Subject: Re: PSPP-BUG: Logistic Regression bugs
Date: Wed, 14 Nov 2012 10:00:50 +0100
User-agent: Mutt/1.5.20 (2009-06-14)
On Tue, Nov 13, 2012 at 08:28:31PM -0500, Renan Levine wrote:
Dear Mr. Darrington,
Please call me John :) - except on formal occasions, when I enjoy Dr.
Darrington.
The problem with the error message only concerns dichotomous
dependent variables, not predictor variables. Missing values on
the predictor variables do not pose any problems. Cases with
missing values on any independent variables are dropped just like
when completing OLS regressions.
Yes. Currently PSPP drops cases with missing values on any
independent variable.
I think unequivocally that what the routine needs to do is to
ignore all missing values and just focus on the non-missing
categories. For example, STATA's manual says: logit fits a
maximum-likelihood logit model. depvar=0 indicates a negative
outcome; depvar!=0 & depvar!=. (typically depvar=1) indicate a
positive outcome.
So you are suggesting dropping cases with a missing dependent variable too?
That would seem reasonable.
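For concreteness, here is a minimal sketch of the rule being proposed: code the dependent variable the way the quoted Stata text describes (0 is negative, any other non-missing value is positive) and drop every case that is missing on the dependent variable or on any predictor. The function names and the use of None/NaN to represent a missing value are assumptions made for illustration; this is not PSPP's actual code.

```python
import math


def is_missing(value):
    """Treat None and NaN as missing (an assumed representation)."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def code_outcome(depvar):
    """Return 0 for a negative outcome, 1 for a positive one, None if missing."""
    if is_missing(depvar):
        return None
    return 0 if depvar == 0 else 1


def drop_missing(rows):
    """rows is a list of (depvar, predictors); keep complete cases only."""
    kept = []
    for depvar, predictors in rows:
        outcome = code_outcome(depvar)
        if outcome is None or any(is_missing(p) for p in predictors):
            continue  # listwise deletion: the whole case is dropped
        kept.append((outcome, predictors))
    return kept
```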
The way I understand that SPSS statement (if it's not a typo) is
that the SPSS routine will generate a predicted value for any
observations with a missing value on the dependent variable,
assuming that none of the independent variables contain any
missing values for that observation. This is one way that some
use maximum likelihood techniques to impute missing values.
Yes, that is what it seems to be saying. The question which arises is,
HOW does it generate the predicted value? The only reasonable way I
can think of would be to calculate it from the coefficients of the
predictors --- but we don't know them a priori (the very purpose of
logistic regression is to find them). Of course, it is possible to
run the procedure ignoring the cases with missing dependents, then
impute the values from the calculated coefficients, and run the procedure
again, this time including the cases with imputed values.
However, that would yield exactly the same results, except slightly
better (misleadingly better) confidence values. So doing that doesn't
make much sense. Hence my confusion.
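The argument above can be checked numerically with a small sketch. It assumes the imputed value is the predicted probability itself (a fractional outcome); if the imputed values were rounded to 0 or 1 instead, the coefficients would generally move. The data, the single-predictor model, and the function name fit_logit are all made up for illustration and have nothing to do with how PSPP or SPSS implement the procedure.

```python
import math


def fit_logit(x, y, iters=50):
    """Newton-Raphson fit of logit(p) = b0 + b1*x; y may be fractional.

    Returns (b0, b1, se_b1). The standard error comes from the
    information matrix of the last iteration, i.e. at convergence.
    """
    b0 = b1 = 0.0
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            w = p * (1.0 - p)
            g0 += yi - p              # score for the intercept
            g1 += (yi - p) * xi       # score for the slope
            h00 += w                  # information matrix entries
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1, math.sqrt(h00 / det)


# Complete cases (hypothetical data, not separable).
x_obs = [-2.0, -1.0, -1.0, 0.0, 0.0, 1.0, 1.0, 2.0]
y_obs = [0, 0, 1, 0, 1, 0, 1, 1]
# Cases whose dependent variable is missing.
x_mis = [-1.5, 0.5, 1.5]

# Step 1: fit on the complete cases only.
b0, b1, se1 = fit_logit(x_obs, y_obs)

# Step 2: impute the missing outcomes from the fitted coefficients.
y_imp = [1.0 / (1.0 + math.exp(-(b0 + b1 * xi))) for xi in x_mis]

# Step 3: refit with the imputed cases included.
b0_full, b1_full, se1_full = fit_logit(x_obs + x_mis, list(y_obs) + y_imp)

print("slope, complete cases:", b1, "std. error:", se1)
print("slope, with imputed:  ", b1_full, "std. error:", se1_full)
```

Each imputed case contributes x*(y - p) = 0 to the score at the first-stage estimate, so the estimate is unchanged, while the same case still adds p*(1 - p) to the information matrix, which is why the standard error shrinks without any new information having been observed.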
If you have access to SPSS, perhaps you could try some experiments for me?
Can you see if SPSS simply drops cases with missing values on the dependent variable?
Or does it treat them all as 0, or as 1, or something else?
Thanks for your help.
John
--
PGP Public key ID: 1024D/2DE827B3
fingerprint = 8797 A26D 0854 2EAB 0285 A290 8A67 719C 2DE8 27B3
See http://keys.gnupg.net or any PGP keyserver for public key.