Re: PSPP-BUG: Logistic Regression bugs
John Darrington
Tue, 13 Nov 2012 21:29:10 +0100
Mutt/1.5.20 (2009-06-14)
You are right.
As you may have noticed, the logistic regression feature is still in
development.
In the version you mentioned, the /CATEGORICAL subcommand was not yet
implemented. It has been added since, so try the most recent HEAD.
I think there may still be a problem with missing values in the categorical
variables, and I am working on that. Missing values in the non-categorical
predictor variables should be OK, though.
I'm unsure exactly how it should behave in response to missing values on the
dependent variable. The SPSS documentation says:
"For a case with a missing value on the dependent variable, predicted values
are calculated if it has non-missing values on all independent variables."
This statement doesn't make sense to me. Why would a predicted value be
calculated, and from what? I'm still thinking about that. Have you any
idea?
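
One possible reading of that sentence, as a minimal Python sketch (this is
not PSPP's code; the data, the function names and the simple gradient-ascent
fit are all made up for illustration): the coefficients are estimated only
from cases with a non-missing dependent variable, and the fitted model is
then applied to every case whose predictors are all present, including cases
whose dependent value is missing.

```python
import math

def predict(b0, b1, x):
    """Predicted probability P(y = 1 | x) under the logistic model."""
    return 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))

def fit_logistic(xs, ys, steps=5000, lr=0.1):
    """Fit intercept b0 and slope b1 by plain gradient ascent on the
    log-likelihood (a stand-in for whatever estimator is really used)."""
    b0 = b1 = 0.0
    n = len(xs)
    for _ in range(steps):
        g0 = g1 = 0.0
        for x, y in zip(xs, ys):
            residual = y - predict(b0, b1, x)
            g0 += residual
            g1 += residual * x
        b0 += lr * g0 / n
        b1 += lr * g1 / n
    return b0, b1

# Hypothetical cases as (predictor, dependent); None marks a missing value.
cases = [
    (0.5, 0), (1.0, 0), (1.5, 1), (2.0, 0), (2.5, 1), (3.0, 1),
    (1.2, None),  # dependent missing, predictor present
    (None, 1),    # predictor missing
]

# Estimation uses only the complete cases.
complete = [(x, y) for x, y in cases if x is not None and y is not None]
b0, b1 = fit_logistic([x for x, _ in complete], [y for _, y in complete])

# Prediction needs only the predictors, so the case with a missing
# dependent value still gets a predicted probability.
predictions = [None if x is None else predict(b0, b1, x) for x, _ in cases]
```

Under this reading the case with the missing dependent value contributes
nothing to the estimates; it only receives a prediction afterwards.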
So thanks for your feedback. I appreciate you taking the time to test and
report these things.
If you can do some similar tests with a very recent version
(3cd65292e3cc6bd6532214dcc8c8ddc65bdc2972 or later), I would appreciate it.
Particularly, more tests with missing values and with weighted values would
be great.
Regards,
John
On Tue, Nov 13, 2012 at 01:08:19PM -0500, Renan Levine wrote:
Dear PSPP users and programmers,
Thank you for your hard work in developing new capabilities for
PSPP. I'm using the most recent version of PSPP, psppire.exe
0.7.9-gaef7f5.
I recently encountered some problems while running logistic regressions:
1) There appears to be a bug in the logistic regression routine
that causes it to recognize missing values in the dependent
variable as a value category. So, even when a variable is coded
0, 1 and [system] missing (common in public opinion data), PSPP
gives an error message: "Dependent variable's values are not
dichotomous."
I've run logit analyses on three different .por and one .sav
datasets, tried to see if user-missing is treated differently
than system-missing, and if declaring missing values works any
differently than a recode statement. The only way I manage to run
a logistic regression is if I recode the dependent variable to be
two integers with no missing values.
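
The behaviour I expected can be sketched in Python (an illustration only,
not PSPP's implementation; the function names and the use of None for
system-missing are assumptions): missing values are dropped before the
distinct values of the dependent variable are counted.

```python
def distinct_valid_values(values, user_missing=()):
    """Return the set of values that are neither system-missing (None)
    nor declared user-missing."""
    valid = set()
    for v in values:
        if v is None or v in user_missing:
            continue
        valid.add(v)
    return valid

def is_dichotomous(values, user_missing=()):
    """True if exactly two distinct non-missing values remain."""
    return len(distinct_valid_values(values, user_missing)) == 2

# 0, 1 and system-missing: should count as dichotomous.
vote = [0, 1, None, 1, 0]

# 0, 1 and 9, where 9 is declared user-missing: also dichotomous.
approve = [0, 1, 9, 1, 0]
```

With this check, is_dichotomous(vote) and is_dichotomous(approve,
user_missing=(9,)) are both true, while is_dichotomous(approve) without the
missing-value declaration is false because 9 counts as a third category.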
2) Less critically, I'm not sure the syntax /CATEGORICAL=var is
working correctly. When I include that line, letting the computer
know that an independent variable is dichotomous, I get an error
message: .3-13: error: Syntax error at 'categorical'. HOWEVER,
just including the variable on the initial line with the other
independent variables seems to work (I can't be certain because I
did not cross-reference my results with another statistics
program).
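
A small Python sketch of why the two approaches should agree for a 0/1
predictor (illustrative only; the helper name is made up, and which
reference category PSPP picks is not something I know): indicator coding of
a dichotomous variable yields a single column that is either the original
variable or its complement, so the fit is the same and only the sign of the
coefficient, and the intercept, can differ.

```python
def indicator_code(values, reference):
    """Return one 0/1 column per category other than the reference."""
    levels = sorted(set(values))
    return {
        level: [1 if v == level else 0 for v in values]
        for level in levels
        if level != reference
    }

x = [0, 1, 1, 0, 1]

# Reference category 0: the single indicator column equals x itself.
ref_first = indicator_code(x, reference=0)

# Reference category 1: the single indicator column is 1 - x.
ref_last = indicator_code(x, reference=1)
```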
Yours,
Renan
