From: Bruno Haible
Subject: Re: [bug-gnu-libiconv] UTF-8 replacement character cannot be transliterated into ISO-8859-1
Date: Mon, 22 Oct 2007 01:19:35 +0200
User-agent: KMail/1.5.4

Hello Vincent,

Vincent Lefevre wrote:
> As shown by the attached script, the UTF-8 replacement character
> cannot be transliterated into ISO-8859-1 (tested under Mac OS X).
> This problem doesn't occur under Linux with iconv from glibc 2.6.1.
> 
> Under Mac OS X (with libiconv built via MacPorts), the attached script
> gives:
> 
> iconv (GNU libiconv 1.11)
> Copyright (C) 2000-2006 Free Software Foundation, Inc.
> This is free software; see the source for copying conditions.  There is NO
> warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
> Written by Bruno Haible.
> éè
> EUR
> iconv: (stdin):3:0: cannot convert
> 
> while under Linux, I get:
> 
> iconv (GNU libc) 2.6.1
> Copyright (C) 2007 Free Software Foundation, Inc.
> This is free software; see the source for copying conditions.  There is NO
> warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
> Written by Ulrich Drepper.
> éè
> EUR
> ?
> ...
> 
> as expected.

This can be reproduced with libiconv on Linux as well.

How much transliteration is done is at the discretion of the implementation.
glibc has traditionally transliterated more characters; libiconv transliterates
only when the transliteration is well understood by everyone and culturally
neutral.

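For U+FFFD (REPLACEMENT CHARACTER), European users prefer a question mark '?',
whereas CJK users prefer U+3013 (GETA MARK) for this purpose. Since neither
choice is culturally neutral, libiconv does not transliterate this character.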
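
If a particular substitute is wanted regardless of the iconv implementation,
one option is to replace U+FFFD explicitly before converting. The following
is only a minimal sketch in Python, not something libiconv or glibc provides;
the helper name to_latin1 and its substitute parameter are made up for
illustration.

```python
# Hypothetical helper, not part of libiconv or glibc: the caller picks the
# substitute for U+FFFD, and only then converts to the target encoding.
REPLACEMENT_CHARACTER = "\ufffd"


def to_latin1(text, substitute="?"):
    """Encode text as ISO-8859-1, mapping U+FFFD to `substitute` first."""
    cleaned = text.replace(REPLACEMENT_CHARACTER, substitute)
    # Raises UnicodeEncodeError if any other character has no
    # ISO-8859-1 equivalent; this sketch handles U+FFFD only.
    return cleaned.encode("iso-8859-1")


print(to_latin1("éè\ufffd"))  # prints b'\xe9\xe8?'
```

This keeps the choice of substitute with the caller. A CJK-oriented caller
converting to an encoding that contains U+3013 could pass that character
instead of '?'.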

Bruno