Re: Meta-Characters, Special Characters
From: David Kastrup
Subject: Re: Meta-Characters, Special Characters
Date: Sat, 02 Jun 2007 09:45:54 +0200
User-agent: Gnus/5.11 (Gnus v5.11) Emacs/22.1.50 (gnu/linux)
Gernot Hassenpflug <gernot@yahoo.com> writes:
> Miles Bader <miles@gnu.org> writes:
>
>> Gernot Hassenpflug <gernot@nict.go.jp> writes:
>>> I am happy to note that Windows too stores its information in UTF-8
>>> internally, no matter what the user's settings for a particular
>>> program may be.
>>
>> I thought windows used something a bit more annoying and ad-hoc, UCS-16
>> or something like that.
>
> Oh, you may be right there, I should have qualified my statement: as
> opposed to a Windows-specific charset I think Windows uses a
> universal charset. I am not sure why UCS-16 is more ad-hoc than
> UTF-8, but I would be more than happy if linux instead of UTF-8
> moved to UTF-16 or UTF-32, in view of the many charsets I need in my
> work. I am not nearly educated enough on this topic to hold a
> coherent conversation however, still reading. -- Grrr!! ...Pick a
> reason...
As soon as you leave the Basic Multilingual Plane, UTF-16 requires
you to deal with surrogate pairs. The issues are pretty much the same
as when dealing with UTF-8, and you get the additional complications
of wide characters, far more conspicuous byte order marks, endianness
portability problems and so on.
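To make the surrogate and byte-order points concrete, here is a small Python sketch. The helper name utf16_code_units and the sample character (U+1D11E, a musical G clef) are my own choices for illustration; the arithmetic is the standard UTF-16 surrogate computation.

```python
def utf16_code_units(ch):
    """Return the UTF-16 code units for one code point, computed by hand."""
    cp = ord(ch)
    if cp < 0x10000:
        # Inside the Basic Multilingual Plane: one 16-bit unit suffices.
        return [cp]
    # Outside the BMP: split the 20-bit offset into a surrogate pair.
    offset = cp - 0x10000
    high = 0xD800 | (offset >> 10)    # top 10 bits
    low = 0xDC00 | (offset & 0x3FF)   # bottom 10 bits
    return [high, low]

clef = "\U0001D11E"  # illustrative non-BMP character

# One code point, but two 16-bit units in UTF-16.
print([hex(u) for u in utf16_code_units(clef)])

# The same text yields different byte sequences depending on byte order,
# which is why UTF-16 streams need a BOM or an out-of-band convention.
print(clef.encode("utf-16-le").hex())
print(clef.encode("utf-16-be").hex())

# UTF-8 is also variable-width here, but it has no byte-order variants.
print(clef.encode("utf-8").hex())
```

Code that indexes UTF-16 by 16-bit unit therefore has to check for surrogates in much the same way that UTF-8 code has to check for continuation bytes.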
In short: this buys you nothing unless you restrict yourself to the
16-bit Basic Multilingual Plane (and that restriction makes UTF-16
infeasible for a number of tasks). And even then, the advantages do
not really outweigh the disadvantages.
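As a rough illustration of that restriction, the following Python sketch checks whether a string stays inside the 16-bit subset and compares its code point count with its UTF-16 unit count. The function names and sample strings are hypothetical, chosen only for this example.

```python
def is_bmp_only(text):
    """True if every code point fits in a single 16-bit UTF-16 unit."""
    return all(ord(ch) <= 0xFFFF for ch in text)

def utf16_unit_count(text):
    """Number of 16-bit units the text occupies in UTF-16."""
    return len(text.encode("utf-16-le")) // 2

# Illustrative samples: two BMP-only strings and one with a non-BMP character.
samples = ["Gr\u00fc\u00dfe", "\u65e5\u672c\u8a9e", "\U0001D11E clef"]

for s in samples:
    # Only when is_bmp_only() holds do code points and code units line up,
    # so only then can UTF-16 be treated as a fixed-width encoding.
    print(is_bmp_only(s), len(s), utf16_unit_count(s))
```

The two counts agree only for the BMP-only samples; once a single character falls outside that range, fixed-width assumptions about UTF-16 no longer hold.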
--
David Kastrup, Kriemhildstr. 15, 44793 Bochum