Re: PSPP-BUG: Inconsistency with SPSS when setting missing strings value

From: John Darrington
Subject: Re: PSPP-BUG: Inconsistency with SPSS when setting missing strings values with a number
Date: Thu, 17 Sep 2020 11:34:16 +0200
On Wed, Sep 16, 2020 at 05:52:22PM -0700, Dana Williams wrote:
     I'm not sure if I've discovered a small bug or have simply made an
     unusual user error.

It sounds to me like a user error, but I'm not sure that I've fully
understood the problem.

     The problem: In the PSPPire GUI, we marked missing values for string
     variables with a zero (i.e., a "0" without the quote-marks) and then
     declared them to be missing as discrete values via the Variable View. I
     presumed this would work fine. When running frequencies in PSPP, all
     cases marked with a 0 successfully showed up as missing. When using this
     dataset--wholly created in PSPP during class time--in SPSS, however,
     students found that the cases that should have been missing were not
     marked as such, and just appeared in output tables as zeroes. I borrowed
     my partner's laptop and opened our dataset in SPSS and found the same
     problems the SPSS-using students had. SPSS showed missing values of 0 for
     those variables in the Variable View, so I was confused why they were not
     marked as missing when doing a frequency test.

If I had to hazard a guess, perhaps you inadvertently set the missing value to
the wrong thing.  It's worth remembering that all strings in SPSS/PSPP have
a fixed length (the default is 8).  So for a string variable of length 8, there
can never be a value of "0", but there can be a value "0       ".  Normally,
PSPP should right-pad shorter values, so that if you enter "0" the value which
actually gets entered is "0       ", but if you inadvertently type " 0", then
you will get " 0      ".
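
To make the padding point concrete, here is a rough sketch in Python.  It is
not PSPP code: the pad() helper and the sample data are made up for
illustration, and it only models the right-padding behaviour described above,
assuming a width of 8.

```python
# Hypothetical model of fixed-width string values; not taken from PSPP.
WIDTH = 8  # assumed string width (the default mentioned above)

def pad(value: str, width: int = WIDTH) -> str:
    """Right-pad a value with spaces up to the fixed width."""
    return value.ljust(width)

missing_ok = pad("0")     # "0" followed by 7 spaces
missing_typo = pad(" 0")  # a space, "0", then 6 spaces

# Stored data values, padded the same way on entry.
data = [pad("0"), pad("1"), pad("0")]

# Only an exact match on the padded value counts as missing.
print([v == missing_ok for v in data])    # [True, False, True]
print([v == missing_typo for v in data])  # [False, False, False]
```

A stray leading space in the missing-value definition is enough to stop it
from matching any of the data values.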

     While in SPSS, I used the following syntax to try to reset the missing
     values to 0:
     missing values [VARIABLENAME] (0).

I would have expected to see an error here, if VARIABLENAME is a string
variable.

     This didn't change any of the frequency results. But, I wondered if SPSS
     might have thought that the missing value was set to the numeral 0, as
     opposed to a string of 0, so I used:
     missing values [VARIABLENAME] ("0").
     Instantly the values were caught as missing. In other words, I think that
     our use of the number 0 in PSPP didn't appropriately set itself as a
     character/string of "0"... or SPSS couldn't recognize this with the
     PSPP-created dataset. Or maybe, despite these variables being strings, we
     were somehow inappropriately allowed to enter a number value for missing
     in the Variable View.

After you set the missing value, try typing "DISPLAY DICTIONARY."   This will
show you what the system's idea of a missing value is.

     For myself, I was using PSPPire 1.0.1 and SPSS 27.
     I can share our PSPP-created and SPSS-saved/fixed dataset if this would
     help figure out the problem.

Please do.  And please run "DISPLAY DICTIONARY." on both datasets, and post
the output you get.


