Re: PSPP-BUG: Failure to handle an antique SPSS file containing umlauts
From: Ben Pfaff
Subject: Re: PSPP-BUG: Failure to handle an antique SPSS file containing umlauts in variable names
Date: Mon, 10 Feb 2014 10:52:46 -0800
Here's the approach I'm trying so far, in case you (or anyone) have ideas:
* Extract all the raw string data from the .sav file, without trying to
determine its encoding.
* Try converting all of the raw string data from every significant encoding
to UTF-8. Discard any encodings for which the conversion fails.
* Group the remaining encodings into equivalence classes: two encodings
belong to the same class if they convert every string to identical UTF-8.
* For each equivalence class, present the user with the strings that
are not all the same, along with the meaning of the string. Allow the
user to choose one of the encodings.
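
The first three steps could be sketched in Python roughly as follows. This is only an illustration under stated assumptions, not PSPP's code: the function name `group_encodings` and the short `CANDIDATES` list are hypothetical, and the raw strings are assumed to have already been extracted from the .sav file as byte strings.

```python
from collections import defaultdict

# Hypothetical candidate list; a real tool would try many more encodings.
CANDIDATES = ["utf-8", "cp1252", "iso-8859-15", "cp437", "cp850"]

def group_encodings(raw_strings, candidates=CANDIDATES):
    """Return a list of (encodings, decoded_strings) equivalence classes."""
    classes = defaultdict(list)
    for enc in candidates:
        try:
            # Decode every raw string with this candidate encoding.
            decoded = tuple(s.decode(enc) for s in raw_strings)
        except UnicodeDecodeError:
            continue  # conversion failed, so discard this encoding
        # Encodings that produce identical text land in the same class.
        classes[decoded].append(enc)
    return [(encs, list(decoded)) for decoded, encs in classes.items()]
```

For example, `group_encodings([b"GR\xd6SSE", b"ALTER"])` drops `utf-8` (the byte sequence is not valid UTF-8) and puts `cp1252` and `iso-8859-15` into one class, because both decode the data to the same strings.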
So what you end up with is a table. Along the y axis are string meanings,
e.g. "Variable Name 1", "Variable Name 2", ..., "Value Label 1". Along
the x axis are encodings. The entries are the strings for those encodings.
The user should be able to figure out which set of variable names
(etc.) makes the most sense and choose that encoding.
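
A rough Python sketch of such a table, for illustration only. The helper names `differing_rows` and `format_table` are hypothetical, and the sketch assumes the listed encodings have already been checked to decode all of the data (for instance, one representative per equivalence class). Rows whose strings come out identical in every encoding are left out, since they do not help the user choose.

```python
def differing_rows(labels, raw_strings, encodings):
    """Return (label, cells) rows where the encodings disagree."""
    rows = []
    for label, raw in zip(labels, raw_strings):
        cells = [raw.decode(enc) for enc in encodings]
        if len(set(cells)) > 1:  # keep only strings that differ
            rows.append((label, cells))
    return rows

def format_table(rows, encodings):
    """Render rows as plain text: meanings down, encodings across."""
    table = [["Meaning"] + list(encodings)]
    table += [[label] + cells for label, cells in rows]
    widths = [max(len(r[i]) for r in table) for i in range(len(table[0]))]
    return "\n".join(
        "  ".join(cell.ljust(w) for cell, w in zip(r, widths)).rstrip()
        for r in table
    )

labels = ["Variable Name 1", "Variable Name 2"]
raw = [b"GR\xd6SSE", b"ALTER"]
encodings = ["cp1252", "cp850"]
print(format_table(differing_rows(labels, raw, encodings), encodings))
```

Here only "Variable Name 1" appears in the table, because the pure-ASCII name decodes identically under both encodings.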