bug-gnu-pspp

Re: PSPP-BUG: Failure to handle an antique SPSS file containing umlauts


From: Müller, Andre
Subject: Re: PSPP-BUG: Failure to handle an antique SPSS file containing umlauts in variable names
Date: Mon, 10 Feb 2014 16:29:11 +0000

> -----Original Message-----
> From: Ben Pfaff [mailto:address@hidden]
> Sent: Monday, February 10, 2014 17:16
> To: Müller, Andre
> Subject: Re: PSPP-BUG: Failure to handle an antique SPSS file containing
> umlauts in variable names
> 
> On Mon, Feb 10, 2014 at 12:26:36PM +0000, Müller, Andre wrote:
> > So I learn the .sav-file has no internal markers for the codepage used --
> > which in turn explains a lot of the codepage woes I have seen.
> > Thus, I will have to add a codepage-heuristic to my export-tool.
> 
> It's only the very old SPSS files that lack an indication of codepage.
> This causes problems for a surprising number of PSPP users, so I'm
> working to add some codepage analysis to PSPP as well.

Oh dear, that's work I'd hate to do for the general case.
I do have the advantage of a limited set of failure cases (~2k by my current
estimate) and a strong tendency for them to be from Western Europe,
so I can check what "file -bi" reports for the output and check for umlaut
presence.
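
To make the heuristic above concrete, here is a minimal sketch of how such a
check might look. It is only an illustration under stated assumptions, not the
actual export tool: the function name guess_codepage and the candidate list
(cp1252, cp850, cp437) are hypothetical choices for Western European data. It
first tests for plain ASCII and valid UTF-8, then picks the candidate codepage
that yields the most German umlauts.

```python
# Hypothetical sketch of a codepage heuristic for Western European text.
# The candidate list and the scoring rule are assumptions, not part of
# PSPP or of any existing export tool.

GERMAN_CHARS = set("äöüÄÖÜß")
CANDIDATES = ("cp1252", "cp850", "cp437")  # assumed legacy codepages


def guess_codepage(raw: bytes) -> str:
    """Return a best-guess encoding name for the given raw bytes."""
    # Pure 7-bit data needs no guessing.
    if all(b < 0x80 for b in raw):
        return "ascii"

    # Valid UTF-8 with high bytes is unlikely to occur by accident.
    try:
        raw.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        pass

    # Otherwise score each candidate by how many umlauts it produces.
    # Ties go to the candidate listed first.
    best, best_score = CANDIDATES[0], -1
    for codepage in CANDIDATES:
        text = raw.decode(codepage, errors="replace")
        score = sum(ch in GERMAN_CHARS for ch in text)
        if score > best_score:
            best, best_score = codepage, score
    return best


if __name__ == "__main__":
    print(guess_codepage("Müller".encode("cp1252")))
    print(guess_codepage("Müller".encode("cp850")))
```

Counting umlauts only works because the data is known to be mostly German;
for the general case a scheme this simple would misfire often.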

Most of the errors will go rather unnoticed, as the non-US-ASCII chars are
not in the "functional" parts but only in the labels. There I find
non-US-ASCII chars replaced by "?".
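
As a rough illustration of how such damaged labels could be flagged, the
sketch below looks for one or more "?" sitting between two letters, which is
where a replaced umlaut usually ends up. The function name suspicious_labels
and the pattern are assumptions for illustration only; the pattern will miss
a replaced character at the start or end of a word (e.g. "?bung").

```python
import re

# Hypothetical pattern: one or more "?" between two ASCII letters, as in
# "M?ller". This is an assumed heuristic, not something PSPP provides.
REPLACED_CHAR = re.compile(r"[A-Za-z]\?+[A-Za-z]")


def suspicious_labels(labels):
    """Return the labels that look like they lost a non-ASCII character."""
    return [label for label in labels if REPLACED_CHAR.search(label)]


if __name__ == "__main__":
    sample = ["M?ller", "Age?", "Gr??e", "Why? Because"]
    print(suspicious_labels(sample))
```

A genuine question mark at the end of a label is left alone, since it is not
followed directly by a letter.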

Nevertheless: that work is much appreciated, and I'm looking forward to being
able to throw my lousy heuristics away.

Don't hesitate to ask me to test on our big pile of antique files when the
time comes; I will be glad to help.

Regards,
Andre


