help-gnu-emacs

Re: UTF-8 character question


From: David Kastrup
Subject: Re: UTF-8 character question
Date: Mon, 12 May 2008 10:35:07 +0200
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/23.0.60 (gnu/linux)

Harald Hanche-Olsen <hanche@math.ntnu.no> writes:

> + horatio@gmail.com:
>
>> My guess is there's some basic option or package that I'm missing
>> that will make the problem go away.  Can you (or anyone else) copy
>> and paste that character into an Emacs buffer?  If it works, can you
>> think of anything in your setup that I might not have done?  I'll
>> take a look myself in the meantime.
>
> I can copy and paste it just fine.  However, you said you're running
> emacs 22 on windows, right? I am running various versions of emacs 23
> (the development version) on unix, so I very much doubt that you can
> learn anything useful from my setup. I don't do anything out of the
> ordinary with font setup anyway (other than using the Vera Sans Mono
> font, which will affect only the latin characters). I think some other
> users of emacs on windows will have to step in.

If he is using Chinese or other CJK stuff a lot, he might want to bite
the bullet and switch to Emacs 23.

Almost all Emacs implementations currently in use have MULE as their
internal encoding.  Emacs>=23, and XEmacs starting from some rather
unstable 21.5 version, use utf-8 as the internal encoding.

The "problem" with MULE is that it represents characters as a
charset/character pair, and characters from different charsets are
basically different.  But character sets are coupled with encodings, and
so some characters exist in quite a number of charsets (like the basic
accented letters).  This necessitated functions for "charset
unification" which do a better or worse job depending on what they are
working with, and how much code has been written for the charsets.

Now Emacs 23 loses this information and keeps around only the Unicode
codepoint.  That means that you can't represent as much information as
previously, but usually the information you lose is that which you would
want to have disregarded, anyway.
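
To make the difference concrete, here is a rough sketch in Python (not
Emacs code, and not how MULE is actually implemented).  The CharsetChar
type and the to_codepoint function are made up for illustration, and
Python's codecs stand in for the unification tables: a character kept as
a charset/code pair stays distinct across charsets, while mapping it to
its Unicode codepoint throws the charset away.

```python
from typing import NamedTuple


class CharsetChar(NamedTuple):
    """Hypothetical MULE-like character: a charset name plus a code in it."""
    charset: str
    code: int


def to_codepoint(ch: CharsetChar) -> int:
    """Map a charset/code pair to its Unicode codepoint.

    Python's codecs stand in for the unification tables here; only
    single-byte charsets are handled in this sketch.
    """
    return ord(bytes([ch.code]).decode(ch.charset))


# "e with acute" as found in Latin-1 and in Latin-2.
e_latin1 = CharsetChar("iso8859-1", 0xE9)
e_latin2 = CharsetChar("iso8859-2", 0xE9)

# The euro sign sits at different codes in Latin-9 and in Windows-1252.
euro_latin9 = CharsetChar("iso8859-15", 0xA4)
euro_cp1252 = CharsetChar("cp1252", 0x80)

# As pairs the characters differ, because the charset is part of the value.
pairs_differ = e_latin1 != e_latin2

# As codepoints they coincide, and the originating charset is gone.
same_e = to_codepoint(e_latin1) == to_codepoint(e_latin2)
same_euro = to_codepoint(euro_latin9) == to_codepoint(euro_cp1252)
```

With pairs, a search for one "e with acute" misses the other unless
something unifies them first; with codepoints there is nothing left to
unify, at the price of no longer knowing which charset the character
came from.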

-- 
David Kastrup, Kriemhildstr. 15, 44793 Bochum

