help-gnu-emacs

Re: utf8 char display in buffer


From: ken
Subject: Re: utf8 char display in buffer
Date: Fri, 12 Jun 2009 10:54:23 -0400
User-agent: Thunderbird 2.0.0.0 (X11/20070326)

Ed,

Thanks for distributing.


Everyone responding to this thread,

Please either CC me when posting about this issue or else edit the "To"
field so that your response comes to the whole list.  I'd like to get
everyone's input.  Thanks.


Lewis,

Thanks for posting.  It's lonely out there when you're the only one with
a particular problem.  To make sure we're suffering the same
cyber-indignity, here's the scenario as I see it (from an older version
of emacs running on Linux):

0) Some others and I want to include some non-English characters in
a file being edited in emacs. Problems arise, however:

1) In a buffer which is already utf-8 encoded, I set the appropriate
input method and type in the desired characters. They display just
peachy and there is happiness in EmacsLand.

2) I save the buffer to a file, then close the buffer.

3) I visit the same file (i.e., load it again into emacs). Because it
has <!-- -*- coding: utf-8; -*- --> as the first line, it opens
utf-8 encoded. This is confirmed by the presence of a 'u' as the second
character in the status bar.

4) The text in the buffer displays fine, except that each of those
non-English characters is replaced by a little empty box. With the
cursor on one of those boxes (which should be an 'a' with a horizontal
bar above it), doing "C-x =" makes emacs return
"Char: ā (01210041, 331809, 0x51021, file ...)".
(In emacs the character after "Char:" is a little box, but if I load
this same file into Firefox, that same character appears as it should,
as an 'a' with a horizontal bar above it. How it appears in your email
client will depend upon your email client.)
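
To spell out the numbers in (4): the three values that "C-x =" prints
are one number written in octal, decimal and hex, and that number is
not U+0101, the Unicode code point of an 'a' with a bar above it. The
sketch below only checks that arithmetic. The name `my-reported-code`
is made up for illustration, and my reading that the number is emacs's
own internal code for the character (rather than its Unicode value) is
an assumption, not something I have confirmed.

```elisp
;; Value reported by "C-x =" in step (4); the name is hypothetical.
(defconst my-reported-code #x51021)

(list (= my-reported-code #o1210041)   ; same value, octal notation
      (= my-reported-code 331809)      ; same value, decimal notation
      (= my-reported-code #x0101))     ; U+0101, Unicode code point of ā
;; evaluates to (t t nil)
```

If I have it right, "C-u C-x =" on the same box should also report
which charset and font emacs associates with the character.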

A) The fact that, as described in (4), the characters display correctly
in Firefox, but not in emacs indicates that emacs is not drawing on the
needed character set. Yet, the fact that in (1) the characters initially
display correctly (when first input) indicates that the needed character
set is present on the system, and that emacs can find it and has
permission to access it. Further, we would think that emacs would throw
out an error message if either of these conditions were not met... and
it doesn't. We can only assume that, when visiting and then decoding a
file and pulling it into a buffer for display, emacs is not even asking
for the proper character set when encountering a non-English character.
This is where I
would start to look for the error.

B) It would be helpful if the code which decodes a file and renders it
into the buffer display would throw an error message when it encounters
a character it doesn't know how to display, i.e., when a little box
character is displayed. After all,
isn't it an error when a little box is displayed in lieu of the correct
character? Possible error messages would be something like: "decoding
process can't find /path/to/charset.file" or "decoding process doesn't
have requisite permission to read /path/to/charset.file" or "invalid
character: [hex/decimal value]" or other.
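
As a rough sketch of the kind of check I mean in (B), something like
the following could be run by hand after visiting the file. It relies
on `char-displayable-p', which is a real emacs function; the two `my-'
function names are made up for illustration. This is not how the
decoding code works, and I am assuming that `char-displayable-p'
agrees with what actually ends up drawn as a little box, which may not
hold in every setup.

```elisp
(defun my-undisplayable-chars (string)
  "Return the distinct characters in STRING that cannot be displayed.
A character counts as undisplayable when `char-displayable-p'
returns nil for it on the current display."
  (let ((missing '()))
    (dolist (ch (string-to-list string))
      ;; Record each offending character once, in order of appearance.
      (unless (or (memq ch missing) (char-displayable-p ch))
        (push ch missing)))
    (nreverse missing)))

(defun my-report-undisplayable-chars ()
  "Report characters in the current buffer that cannot be displayed."
  (interactive)
  (let ((missing (my-undisplayable-chars (buffer-string))))
    (if missing
        (message "Cannot display: %s"
                 (mapconcat (lambda (ch) (format "%c (#x%X)" ch ch))
                            missing ", "))
      (message "Every character in this buffer is displayable"))))
```

After evaluating both definitions, "M-x my-report-undisplayable-chars"
in the problem buffer prints either the offending characters with
their hex values or a note that none were found.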


On 06/10/2009 11:21 PM B. T. Raven wrote:
> Lewis Perin wrote:
>> I've been following this thread closely because I have the original
>> poster's problem, only the characters that give me trouble are some -
>> not many, actually - Chinese characters, e.g. ni3, the normal second
>> person pronoun.  And, as with the original poster, the troublesome
>> characters, when copied and pasted to other applications from Emacs,
>> display perfectly.
>>
>> "B. T. Raven" <nihil@nihilo.net> writes:
>>
>>> [...]
>>>    (set-language-environment               'UTF-8)
>>>    (set-default-coding-systems             'utf-8)
>>>    (setq file-name-coding-system           'utf-8)
>>>    (setq default-buffer-file-coding-system 'utf-8)
>>>    (setq coding-system-for-write           'utf-8)
>>>    (set-keyboard-coding-system             'utf-8)
>>>    (set-terminal-coding-system             'utf-8)
>>>    (set-clipboard-coding-system            'utf-8)
>>>    (set-selection-coding-system            'utf-8)
>>>    (prefer-coding-system                   'utf-8)
>>>    (modify-coding-system-alist 'process
>>>      "[cC][mM][dD][pP][rR][oO][xX][yY]" 'utf-8-dos)
>>>
>>>
>>> and try C-x RET c utf-8
>>> C-x C-f
>>>
>>> to open the file.
>>
>> I tried this, but it didn't help.  Emacs 22.3 / Win32.
> 
> Even on Emacs 23 although I see the characters in the buffer, I can't
> save the following as utf-8:
> 
> nǐ hǎo 你 好
> U+4F60 and U+597D
> 
> Or at least not so as to be readable with 22.3. Both versions are using
> Arial Unicode MS.
> 
> Why is that?
> 
> 
>>
>> /Lew
>> ---
>> Lew Perin / perin@acm.org
>> http://www.panix.com/~perin/babelcarp.html



