Re: Nonstandard implementation problems in iconvdata

From: Ben Hochstedler
Subject: Re: Nonstandard implementation problems in iconvdata
Date: Tue, 05 Dec 2000 08:29:34 -0600


> The correct behaviour is not defined by Solaris but by Standards like
> Unix98 which has:
>   If a sequence of input bytes does not form a valid character in the
>   specified codeset, conversion stops after the previous successfully
>   converted character.  If the input buffer ends with an incomplete
>   character or shift sequence, conversion stops after the previous
>   successfully converted bytes.

What this means is, if an invalid character in the *input* charset
is encountered, conversion stops.  It does not say that conversion
stops when a character can't be converted to an identical character
in the resulting charset.

Here's what the Unix98 documentation says regarding this issue:

   If iconv() encounters a character in the input buffer that is valid,
   but for which an identical character does not exist in the target
   codeset, iconv() performs a conversion on this character that may
   vary among systems that conform to the Single UNIX Specification,
   Version 2. 

And under Return values, it says:

   The iconv() function returns the following:
      Number of non-identical conversions 

This means that iconv() is supposed to count the number of characters
that couldn't be identically converted, not that it should stop at
the first valid character it can't convert!  I'm sorry, but I just
don't think that the glibc implementation of iconv() conforms to the
intent of the authors of Unix98.  Both Solaris's and Digital Unix's
iconv() will completely convert a string, replacing characters that
don't map into the resulting character set with an implementation-
defined character.

If you want to provide a function that stops converting when an
unconvertible character is found, that's fine.  Just don't let that
function be iconv().  Differences between systems' iconv()
implementations shouldn't force anyone using glibc to write a
wrapper around iconv() just to make it behave as it does on every
other system.
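
To make the point concrete, here is a minimal sketch of the kind of
wrapper one would otherwise have to write.  The name iconv_lossy() is
hypothetical, not part of any library.  It assumes UTF-8 input (so one
rejected character can be skipped by stepping over its lead byte and
continuation bytes), a stateless ASCII-compatible target codeset, and
'?' as the substitute; all three are assumptions for illustration,
and it treats any EILSEQ as "unconvertible", which may not be right
for genuinely invalid input.

```c
#include <errno.h>
#include <iconv.h>
#include <stddef.h>

/* Hypothetical wrapper: convert all of `in`, writing '?' for each
 * character that iconv() rejects with EILSEQ.
 * Assumes UTF-8 input and an ASCII-compatible, stateless target.
 * Returns the number of non-identical conversions, or (size_t)-1 with
 * errno set on any other failure (E2BIG, EINVAL). */
size_t iconv_lossy(iconv_t cd, char *in, size_t inleft,
                   char *out, size_t *outleft)
{
    size_t count = 0;

    while (inleft > 0) {
        size_t r = iconv(cd, &in, &inleft, &out, outleft);
        if (r != (size_t)-1) {
            /* Success: the rest of the input was consumed.  Systems
             * that substitute on their own report that count here. */
            count += r;
            break;
        }
        if (errno != EILSEQ)
            return (size_t)-1;      /* E2BIG or EINVAL: give up */
        if (*outleft == 0) {
            errno = E2BIG;          /* no room for the substitute */
            return (size_t)-1;
        }
        /* Skip one UTF-8 character: the lead byte plus any
         * continuation bytes of the form 10xxxxxx. */
        do {
            in++;
            inleft--;
        } while (inleft > 0 && ((unsigned char)*in & 0xC0) == 0x80);

        *out++ = '?';
        (*outleft)--;
        count++;
    }
    return count;
}
```

The caller opens the descriptor with iconv_open(), passes the input
length in bytes, and can work out how much output was written from
the change in *outleft.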


Ben Hochstedler         GE Medical Systems Information Technologies
address@hidden         http://www.gemedicalsystems.com/
Phone: 414-362-3317      Fax: 414-362-3389      Dial-comm: 401-3317
