Re: PSPP-BUG: Failure to handle an antique SPSS file containing umlauts

bug-gnu-pspp

From:	Ben Pfaff
Subject:	Re: PSPP-BUG: Failure to handle an antique SPSS file containing umlauts in variable names
Date:	Mon, 10 Feb 2014 10:52:46 -0800

Here's the approach I'm trying so far, in case you (or anyone) have ideas:

   * Extract all the raw string data from the .sav file, without trying to
     determine its encoding.

   * Try converting all of the raw string data from every significant encoding
     to UTF-8.  Discard any encodings that actually fail.

   * Group the remaining encodings into equivalence classes, where two
     encodings are equivalent if every string converts to identical UTF-8
     under both.

   * For each equivalence class, present the user with the strings that
     are not all the same, along with the meaning of the string.  Allow the
     user to choose one of the encodings.
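A minimal sketch of the first three steps, in Python rather than PSPP's own code, might look like the following.  The candidate list and the name classify_encodings are hypothetical, chosen only for illustration; which encodings count as "significant" is not settled here.  Comparing the decoded Python strings is equivalent to comparing their UTF-8 forms.

```python
from collections import defaultdict

# Hypothetical candidate list, for illustration only.
CANDIDATES = ["utf-8", "cp1252", "iso-8859-15", "cp437", "cp850"]

def classify_encodings(raw_strings, candidates=CANDIDATES):
    """Map each distinct decoding of raw_strings to the encodings producing it."""
    classes = defaultdict(list)
    for enc in candidates:
        try:
            decoded = tuple(raw.decode(enc) for raw in raw_strings)
        except UnicodeDecodeError:
            continue  # conversion actually failed, so discard this encoding
        # Encodings that yield identical text land in the same class.
        classes[decoded].append(enc)
    return dict(classes)

# Raw bytes as they might be extracted from a .sav file (made-up sample).
raw = [b"GR\xd6SSE", b"ALTER"]
for decoded, encodings in classify_encodings(raw).items():
    print(encodings, decoded)
```

With this sample, UTF-8 is discarded because the byte 0xD6 is not valid there, and cp1252 and iso-8859-15 fall into one class because they agree on every string.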

So what you end up with is a table.  Along the y axis are string meanings,
e.g. "Variable Name 1", "Variable Name 2", ..., "Value Label 1".  Along
the x axis are encodings.  The entries are the strings for those encodings.
The user should be able to figure out which set of variable names
(etc.) makes the most sense and choose that encoding.
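To make the table concrete, here is a rough Python sketch that keeps only the rows on which the encodings disagree and lays them out as plain text.  The function names and the sample data are assumptions for illustration, not anything that exists in PSPP, and the input is taken to be already-decoded strings, one list per encoding.

```python
def differing_rows(meanings, decoded_by_encoding):
    """Return (meaning, {encoding: string}) for rows where encodings disagree."""
    rows = []
    for i, meaning in enumerate(meanings):
        cells = {enc: strings[i] for enc, strings in decoded_by_encoding.items()}
        if len(set(cells.values())) > 1:
            rows.append((meaning, cells))
    return rows

def format_table(meanings, decoded_by_encoding):
    """Render string meanings down the side and encodings across the top."""
    encodings = list(decoded_by_encoding)
    lines = [["Meaning"] + encodings]
    for meaning, cells in differing_rows(meanings, decoded_by_encoding):
        lines.append([meaning] + [cells[enc] for enc in encodings])
    # Pad every column to the width of its longest cell.
    widths = [max(len(line[col]) for line in lines) for col in range(len(lines[0]))]
    return "\n".join(
        "  ".join(cell.ljust(w) for cell, w in zip(line, widths)).rstrip()
        for line in lines
    )

# Made-up sample: one representative encoding per equivalence class.
meanings = ["Variable Name 1", "Variable Name 2"]
decoded = {
    "cp1252": ["GR\u00d6SSE", "ALTER"],
    "cp850": ["GR\u00cdSSE", "ALTER"],
}
print(format_table(meanings, decoded))
```

Only "Variable Name 1" appears in the output here, since "ALTER" is pure ASCII and reads the same under both encodings.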
