
Re: [Gnumed-devel] Encoding (viewing) on Mac OS


From: Karsten Hilbert
Subject: Re: [Gnumed-devel] Encoding (viewing) on Mac OS
Date: Tue, 15 Nov 2011 13:39:28 +0100
User-agent: Mutt/1.5.21 (2010-09-15)

On Tue, Nov 15, 2011 at 04:50:15AM +0000, Jim Busser wrote:

> Judging from my favourite Mac text processor TextWrangler
> -- a free version of BBEdit -- I think I figured out a Mac
> vulnerability when processing a file encoded as
> 
>       Latin1
> 
> because TextWrangler (perhaps with a dependency on the OS)
> has trouble auto-detecting which form of Latin 1
> encoding…

Latin1 is Latin1, there's no two ways about it that I can see

        http://en.wikipedia.org/wiki/ISO/IEC_8859-1

regardless of what a Mac may think.

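The problem is likely rather that reliably "auto-detecting"
Latin1 is impossible because it overlaps with many other
single-byte encodings: the same bytes are valid in all of them
and only the characters they map to differ. If a file only
contains bytes that are legal in several encodings, no
auto-disambiguation is conceptually possible.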
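
To illustrate the point, here is a minimal Python sketch. It assumes
that Python's "latin-1", "cp1252" and "mac_roman" codecs correspond to
the editor's "Western (ISO Latin 1)", "Western (Windows Latin 1)" and
"Western (Mac OS Roman)" choices. The sample string is taken from the
SQL file quoted below.

```python
# Bytes for "Québec" as they would be stored in a Latin1 file.
raw = "Québec".encode("latin-1")  # b'Qu\xe9bec'

# Each of these single-byte codecs accepts the bytes without error,
# so a detector has no hard evidence for preferring one over another.
decoded = {
    name: raw.decode(name)
    for name in ("latin-1", "cp1252", "mac_roman")
}

for name, text in decoded.items():
    print(f"{name:10} -> {text}")
```

Only the "mac_roman" reading shows a wrong character, and nothing in
the bytes themselves marks that reading as wrong.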

> it tends to select
> 
>       Western (Mac OS Roman)
> 
> even when this results in incorrect characters

That's worse yet.

> e.g. in the server sql country-specific file
> 
>       gmDemographics-Data.ca.sql
> 
> it yields
> 
>       <snip>
>       select i18n.upd_tx('fr_CA', 'Nova Scotia', 'Nouvelle-…cosse');
>       select i18n.upd_tx('fr_CA', 'Prince Edward Island', 'Œle-du-Prince-…douard');
>       select i18n.upd_tx('fr_CA', 'Quebec', 'QuÈbec');
>       <snip>
> 
> whereas
> 
>       Western (Windows Latin 1)
>       Western (ISO Latin 1)

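They are identical for the characters at issue here. Windows
Latin 1 (cp1252) differs from ISO Latin 1 only in the byte range
0x80-0x9F.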
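
How far the two agree can be checked mechanically. The sketch below
assumes that "Western (Windows Latin 1)" corresponds to Python's
"cp1252" codec and "Western (ISO Latin 1)" to "latin-1". It compares
the two byte by byte.

```python
# Collect every byte value that the two codecs decode differently.
differing = []
for b in range(256):
    byte = bytes([b])
    iso = byte.decode("latin-1")
    try:
        win = byte.decode("cp1252")
    except UnicodeDecodeError:
        win = None  # byte is unassigned in Python's cp1252 table
    if iso != win:
        differing.append(b)

print(f"differences: 0x{min(differing):02X}-0x{max(differing):02X}")

# The accented letters from the SQL file sit outside that range.
accents = "ÉÎé"
accents_agree = accents.encode("latin-1") == accents.encode("cp1252")
print("accented letters agree:", accents_agree)
```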

> yield
> 
>       <snip>
>       select i18n.upd_tx('fr_CA', 'Nova Scotia', 'Nouvelle-Écosse');
>       select i18n.upd_tx('fr_CA', 'Prince Edward Island', 'Île-du-Prince-Édouard');
>       select i18n.upd_tx('fr_CA', 'Quebec', 'Québec');
>       <snip>
> 
> If necessary, I can open such files manually choosing one of the other Latin1 
> encodings, change the selection to UTF8 and save it.

Yes, that would (IMO) be the correct way of going about it.
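
The same conversion can be scripted rather than done in an editor.
What follows is only a sketch: the helper name latin1_to_utf8 and the
sample file names are hypothetical, and it assumes the input really is
Latin1.

```python
import tempfile
from pathlib import Path


def latin1_to_utf8(src, dst):
    """Re-encode the file at src from Latin1 to UTF-8, writing to dst."""
    data = Path(src).read_bytes()
    # Working on bytes keeps the line endings exactly as they were.
    Path(dst).write_bytes(data.decode("latin-1").encode("utf-8"))


# Small demonstration on a throwaway file (names are made up).
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "sample.latin1.sql"
    dst = Path(tmp) / "sample.utf8.sql"
    src.write_bytes("select 'Québec';\n".encode("latin-1"))
    latin1_to_utf8(src, dst)
    converted = dst.read_bytes()

print(converted)
```

Decoding as Latin1 never fails, so a file that is in fact in some
other encoding would be converted silently and wrongly. The result is
worth checking by eye.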

> I wonder however whether in future -- in spite of the Canadian source having 
> been Latin 1 -- there is any reason why the sql files cannot be saved in UTF8?

They would already *be* UTF8 only if they were pure ASCII --
utf8 and latin1 overlap in the ASCII range only, so the accented
characters above (É, Î, é) do need converting.
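
Whether the overlap covers the relevant characters can be checked at
the byte level. This is a small sketch using strings from the SQL file
quoted above. It tests only these samples, not the whole file.

```python
samples = ["Quebec", "Québec", "Nouvelle-Écosse"]

# True where the Latin1 and UTF-8 byte sequences are the same.
same_bytes = {s: s.encode("latin-1") == s.encode("utf-8") for s in samples}

for s in samples:
    print(s, s.encode("latin-1"), s.encode("utf-8"), same_bytes[s])

# Check whether Latin1 bytes containing an accent decode as UTF-8.
try:
    "Québec".encode("latin-1").decode("utf-8")
    latin1_is_valid_utf8 = True
except UnicodeDecodeError:
    latin1_is_valid_utf8 = False

print("Latin1 'Québec' is valid UTF-8:", latin1_is_valid_utf8)
```

Pure ASCII strings come out byte-identical in both encodings. The
accented ones do not: é is the single byte 0xE9 in Latin1 and the two
bytes 0xC3 0xA9 in UTF-8.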

Karsten
-- 
GPG key ID E4071346 @ gpg-keyserver.de
E167 67FD A291 2BEA 73BD  4537 78B9 A9F9 E407 1346


