emacs thinks UTF-8 can't encode Japanese text?
From: James Ralston
Subject: emacs thinks UTF-8 can't encode Japanese text?
Date: Thu, 13 Jan 2005 17:56:28 -0500
User-agent: Pan/0.14.2 (This is not a psychotic episode. It's a cleansing moment of clarity.)

I posted the following to gnu.emacs.help:
On 2005-01-12 at 01:32-05, James Ralston wrote:
> I'm trying to use Emacs 21.3 on Fedora Core 3 to edit files
> containing Japanese text encoded with UTF-8.
>
> I've used the same version of Emacs on Fedora Core 2 with no
> problems. Everything just works. My locale is the same on both
> systems: en_US.UTF-8.
>
> But on my FC3 system, if I visit a UTF-8 encoded file, the Japanese
> characters display as empty boxes. Also, if I paste Japanese text
> into an Emacs window, and try to save the buffer, I receive this
> message:
>
>> These default coding systems were tried:
>> utf-8-unix
>> However, none of them safely encodes the target text.
>
> This message makes no sense, because UTF-8 can encode every Unicode
> character, including these.
>
> On my FC2 system, here's what "C-u C-x =" says:
>
>> character: い (0151044, 53796, 0xd224)
>> charset: japanese-jisx0208 (JISX0208.1983/1990 Japanese Kanji: ISO-IR-87)
>> code point: 36 36
>> syntax: word
>> category: H:Japanese Hiragana characters of 2-byte character sets
>> j:Japanese
>> |:While filling, we can break a line at this character.
>> buffer code: 0x92 0xA4 0xA4
>> file code: 0xE3 0x81 0x84 (encoded by coding system utf-8-unix)
>> font:
>> -mplus-gothic-medium-R-normal--12-120-75-75-C-120-jisx0208.1990-0
>
> On my FC3 system, here's what "C-u C-x =" on the same character says:
>
>> character: い (0151044, 53796, 0xd224)
>> charset: japanese-jisx0208 (JISX0208.1983/1990 Japanese Kanji: ISO-IR-87)
>> code point: 36 36
>> syntax: word
>> category: H:Japanese Hiragana characters of 2-byte character sets
>> j:Japanese
>> |:While filling, we can break a line at this character.
>> buffer code: 0x92 0xA4 0xA4
>> file code: not encodable by coding system utf-8-unix
>> font:
>> -mplus-gothic-medium-R-normal--12-120-75-75-C-120-jisx0208.1990-0
>
> The only difference is the "file code:" line. But I don't
> understand why Emacs 21.3 on FC3 doesn't think that UTF-8 encodes
> that character, because it absolutely does.
>
> The FC3 packager claims that he has no problems:
>
> https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=144707
>
> Does anyone have any ideas?
The more I think about this, the more I suspect that it is actually a
bug in Emacs that I've somehow managed to trigger. The claim that
UTF-8 cannot encode い is simply wrong: as the FC2 output above shows,
its UTF-8 encoding is 0xE3 0x81 0x84.
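
As a sanity check outside Emacs, here is a minimal sketch in Python
(chosen only for illustration; it is not part of the Emacs setup
described above, and the variable names are made up). It encodes い
(U+3044) as UTF-8 and, assuming Python's standard euc_jp codec,
recovers the JIS X 0208 code point by clearing the high bit of each
EUC-JP byte. The results should line up with the "file code:" and
"code point:" lines that "C-u C-x =" reports on FC2.

```python
# Illustrative check only: confirm that UTF-8 can encode the character
# Emacs on FC3 reports as "not encodable".
ch = "\u3044"  # HIRAGANA LETTER I (い)

# UTF-8 encoding; compare with the "file code:" line from FC2.
utf8_bytes = ch.encode("utf-8")
print(" ".join(f"0x{b:02X}" for b in utf8_bytes))  # 0xE3 0x81 0x84

# EUC-JP stores each JIS X 0208 byte with the high bit set, so
# subtracting 0x80 gives the decimal pair shown as "code point:".
euc_bytes = ch.encode("euc_jp")
jis_code_point = [b - 0x80 for b in euc_bytes]
print(jis_code_point)  # [36, 36]
```

So the character itself is perfectly encodable; whatever is going
wrong is presumably in how Emacs maps its internal japanese-jisx0208
representation to Unicode on the FC3 system, not in UTF-8.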
I've even gone so far as to trace Emacs while I open a file that
contains Japanese characters, but I didn't detect any glaring
differences between Emacs on FC2 (which works) and Emacs on FC3 (which
doesn't work).
I'm just about out of ideas. Does anyone else have any?
Thanks,
James