Re: transformation question
From: Jason Stover
Subject: Re: transformation question
Date: Wed, 26 Apr 2006 22:12:27 -0400
User-agent: Mutt/1.5.10i
On Wed, Apr 26, 2006 at 02:45:32PM -0700, Ben Pfaff wrote:
> > Users should be able to save many different variables that result from
> > the REGRESSION procedure. I just checked in some code that lets
> > them save residuals and predicted values. The problem was that
> > the first transformation to execute needs to look in the dictionary
> > and scan cases, which haven't been completely filled in. I checked in a
> > fix for it after I sent my last message. I hope it isn't too offensive.
...
> This is what the code appears to do now:
>
> Each time the transformation procedure is passed a case:
>     Look at each variable in default_dict. If it's one we
>     recognize, do something with its value in the case. Do
>     some calculations on those values and produce some output
>     in the current case.
That's right.
> If that's correct, here's the way I would have expected it to be
> done:
>
> While assembling the transformation:
>     Make a list of the variables involved.
>
> Each time the transformation procedure is passed a case:
>     Iterate through the list of variables involved. We know
>     they're involved, because they're in the list, so we do
>     something with its value in the case. Do some
>     calculations on those values and produce some output in
>     the current case.
That sounds better. Pointers to the relevant variables are already
stored in pspp_linreg_coeff.
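
A minimal sketch of the approach described above, in C: the list of
variables is fixed when the transformation is assembled, and the
per-case procedure only walks that list. Every name here (sketch_case,
sketch_trns, sketch_trns_resid_proc) is a hypothetical stand-in for
illustration, not PSPP's actual types or functions, and a case is
assumed to be a flat array of doubles.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a case: one row of numeric values. */
struct sketch_case
  {
    double *values;
  };

/* Hypothetical transformation state, built once while the
   transformation is assembled.  It records only the variables
   that are in the model. */
struct sketch_trns
  {
    size_t n_vars;          /* Number of predictors in the model. */
    const size_t *var_idx;  /* Case index of each predictor. */
    const double *coeff;    /* Coefficient of each predictor. */
    double intercept;       /* Fitted intercept. */
    size_t dep_idx;         /* Case index of the dependent variable. */
    size_t resid_idx;       /* Case index where the residual is saved. */
  };

/* Called once per case.  Reads only the listed predictors and the
   dependent variable, then writes the residual into the case. */
static void
sketch_trns_resid_proc (const struct sketch_trns *t, struct sketch_case *c)
{
  double predicted;
  size_t i;

  assert (t != NULL && c != NULL);

  predicted = t->intercept;
  for (i = 0; i < t->n_vars; i++)
    predicted += t->coeff[i] * c->values[t->var_idx[i]];

  c->values[t->resid_idx] = c->values[t->dep_idx] - predicted;
}
```

For example, with predictors at indexes 0 and 2, coefficients 2 and -1,
an intercept of 1, and a case holding {1, 99, 3, 5, 0} (dependent
variable at index 3, residual saved at index 4), the predicted value is
1 + 2*1 - 1*3 = 0 and the stored residual is 5. The value at index 1 is
never read, so whether it is initialized does not matter.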
> You're concerned about what variables are initialized, but I
> don't know whether that's relevant. If I understand correctly,
> the code only needs to look at the values of variables that it
> examined during the procedure, and those variables are definitely
> initialized. I don't think it looks at the values of other
> variables at all (right?).
When I asked about the variables being initialized, my thinking was
something like, "if I can ask for only those variables that are
initialized, then I can be stupid and the program will still run."
Maybe that motivation shouldn't apply here. I will request only the
appropriate variables.
> Do I misunderstand what is going on?
You understand. I did it that way for two reasons:
1. It was the first thing that occurred to me since I haven't done
this before. Asking for all the variables was obvious.
2. I wanted the functions in predict.c to accept bad input and still
do something sensible, so originally I wrote regression_trns_proc()
(now called regression_trns_resid_proc()) to pass all the variables to
pspp_linreg_residual(). Now that I know pspp_linreg_residual() takes
bad input and doesn't die, I will change the transformations to ask
only for the variables in the model.
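
As an illustration of "takes bad input and doesn't die", here is a
rough C sketch of a residual function that checks its arguments and
returns NaN instead of crashing. The name sketch_residual, its
signature, and the choice of NaN as the failure value are all
assumptions made for this example; the real pspp_linreg_residual()
may behave differently.

```c
#include <math.h>
#include <stddef.h>

/* Hypothetical residual function.  Returns OBSERVED minus the fitted
   value, or NAN if the inputs are unusable (null pointers, or a
   coefficient count that does not match the number of values). */
static double
sketch_residual (const double *coeff, size_t n_coeff, double intercept,
                 const double *x, size_t n_x, double observed)
{
  double predicted = intercept;
  size_t i;

  /* Reject bad input up front rather than dereferencing it. */
  if (coeff == NULL || x == NULL || n_coeff != n_x)
    return NAN;

  for (i = 0; i < n_coeff; i++)
    predicted += coeff[i] * x[i];

  return observed - predicted;
}
```

A caller that passes only the model's variables never hits the NaN
branch, so the check costs little and remains a safety net for
mistakes elsewhere.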
-Jason