Re: Display of characters #xa0 and #xad in unibyte buffers
From: Kenichi Handa
Subject: Re: Display of characters #xa0 and #xad in unibyte buffers
Date: Mon, 28 Sep 2009 20:24:24 +0900
In article <address@hidden>, Eli Zaretskii <address@hidden> writes:
> > In article <address@hidden>, Eli Zaretskii <address@hidden> writes:
> >
> > > > >> $ emacs -Q
> > > > >> M-x toggle-enable-multibyte-characters RET C-q 240 RET C-q 255 RET
> > > > >>
> > > > >> The characters are displayed as "_-" (approximately).
> > > > >>
> > > > >> Shouldn't they be displayed as "\240\255", considering that these are
> > > > >> raw bytes with no specific meaning?
> > > >
> > > > > There are no ``raw bytes'' in a unibyte buffer. Every byte there is
> > > > > interpreted as a character, and shown as such. This is the main
> > > > > feature of unibyte buffers; otherwise, who'd want them?
> >
> > I think the main feature of unibyte buffers is to handle
> > raw-bytes as is.
> How do we even know that they are raw bytes, and how do we
> distinguish, in a unibyte buffer, ü from \374, say? Just because they
> were inserted by C-q NNN or by some other mechanism?
They are not distinguished.
> > For those who want to see a raw-byte as a character of their locale
> > (language environment), we have
> > unibyte-display-via-language-environment.
> I thought bytes in unibyte buffers are always interpreted as
> characters of the locale, as Emacs 19 did.
Not really, because we no longer perform automatic
unibyte<->multibyte decoding/encoding. So, if we cut #xC0
from a unibyte buffer and yank it into a multibyte buffer,
an eight-bit character is inserted instead of U+00C0.
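To make the difference concrete, here is a minimal sketch in Python (not Emacs code). It assumes that a raw byte in the range #x80..#xFF is carried into a multibyte buffer as an eight-bit character in the range #x3FFF80..#x3FFFFF; the function names are hypothetical and exist only for this illustration.

```python
# Illustrative model only; this is not how Emacs is implemented.
# Assumption: raw bytes 0x80..0xFF map to eight-bit characters
# 0x3FFF80..0x3FFFFF when carried into a multibyte buffer.
EIGHT_BIT_BASE = 0x3FFF00


def yank_as_eight_bit(byte: int) -> int:
    """Character code a byte gets when no decoding is performed."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("not a byte")
    # ASCII bytes keep their value; high bytes become eight-bit characters.
    return byte if byte < 0x80 else EIGHT_BIT_BASE + byte


def decode_as_latin1(byte: int) -> int:
    """Code point the same byte would get if decoded as ISO-8859-1."""
    return ord(bytes([byte]).decode("latin-1"))


print(hex(yank_as_eight_bit(0xC0)))  # eight-bit character, not U+00C0
print(hex(decode_as_latin1(0xC0)))   # what automatic decoding would give
```

The two functions give different results for #xC0, which is the point: without automatic decoding, the byte keeps its identity as a byte.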
> Are you saying that they
> are by default always interpreted as raw bytes, unless
> unibyte-display-via-language-environment is set?
unibyte-display-via-language-environment just controls how
to display them, and it doesn't affect how they are
interpreted.
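As a rough illustration of "display only, not interpretation", the Python sketch below shows the same stored byte rendered in two ways: as an octal escape, or as a character of an assumed Latin-1 locale. The function names and the choice of Latin-1 are assumptions made for this example; it does not reproduce the Emacs display code.

```python
import unicodedata


def display_as_escape(byte: int) -> str:
    """Render a non-ASCII byte as a backslash octal escape such as \\240."""
    return chr(byte) if byte < 0x80 else "\\%03o" % byte


def display_via_charset(byte: int, charset: str = "latin-1") -> str:
    """Render the byte as the character it denotes in the given charset."""
    return bytes([byte]).decode(charset)


# The stored bytes never change; only the rendering differs.
for b in (0xA0, 0xAD):
    shown = display_via_charset(b)
    print(display_as_escape(b), "or", unicodedata.name(shown))
```

For #xA0 and #xAD the Latin-1 rendering is NO-BREAK SPACE and SOFT HYPHEN, which matches the "_-" appearance reported at the start of the thread.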
Actually, the interpretation of characters in a unibyte
buffer is still inconsistent. For instance,
skip-syntax-forward treats #x80..#xFF as characters
U+0080..U+00FF. Thus #xC0 is a word-constituent and #xD7 is
a symbol. We must fix it somehow, but how? We currently
don't have a suitable syntax code for eight-bit chars.
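The Python sketch below shows why reading those bytes as U+0080..U+00FF yields a letter for #xC0 and a symbol for #xD7. It uses Unicode general categories as a stand-in; these are not Emacs syntax classes, and the helper name is hypothetical.

```python
import unicodedata


def describe_as_code_point(byte: int) -> tuple[str, str]:
    """Treat a byte value as code point U+00xx; return (name, category)."""
    ch = chr(byte)
    return unicodedata.name(ch), unicodedata.category(ch)


for byte in (0xC0, 0xD7):
    name, category = describe_as_code_point(byte)
    print(f"#x{byte:02X} -> U+{byte:04X} {name} ({category})")
```

Here "Lu" is an uppercase letter and "Sm" is a math symbol, so a byte with no intended meaning ends up classified by whatever Latin-1 character happens to share its value.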
---
Kenichi Handa
address@hidden