bug#27270: display-raw-bytes-as-hex generates ambiguous output for Emacs strings
From: Lars Ingebrigtsen
Subject: bug#27270: display-raw-bytes-as-hex generates ambiguous output for Emacs strings
Date: Sat, 23 Apr 2022 16:00:31 +0200
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/29.0.50 (gnu/linux)
Paul Eggert <eggert@cs.ucla.edu> writes:
> The idea is to add a new \X escape for character constants and
> strings. This escape would allow at most two hexadecimal digits,
> rather than the unlimited number of digits that \x does. For example,
> the Lisp string "\XABC" would be equivalent to the Lisp string
> "\xAB\ C", that is, it would be a two-character string containing the
> character U+00AB LEFT POINTING GUILLEMET followed by the character
> U+0043 LATIN CAPITAL LETTER C.
This was four years ago, but I don't think any steps were taken in this
direction, beyond marking the raw bytes more clearly:

Even in *scratch*, where font-locking overrode those, I think?
The issue still remains -- if you do this in emacs -nw:
(format "%c5" 128)
"5"
And cut and paste that to a different Emacs, you get the string
"\x805"
=> "ࠅ"
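To make the ambiguity concrete, here's a small sketch that can be evaluated in *scratch*. The function name is made up for illustration and isn't part of Emacs. It compares the pasted form with the same text written using the existing "\ " (backslash-space) escape, which ends a hex escape and is itself ignored by the reader:

```elisp
;; `my-hex-escape-demo' is a hypothetical name, used only for illustration.
(defun my-hex-escape-demo ()
  "Return lengths and characters for two spellings of a hex escape."
  (let ((ambiguous "\x805")      ; \x consumes all three hex digits
        (terminated "\x80\ 5"))  ; "\ " ends the hex escape, then a literal 5
    (list (length ambiguous)     ; one character
          (aref ambiguous 0)     ; #x805
          (length terminated)    ; two elements
          (aref terminated 1)))) ; ?5

(my-hex-escape-demo)
;; => (1 2053 2 53)
```

So the reader can already tell the two apart if the terminator is present; the problem is only that the displayed form doesn't include it.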
But... we've had this format for half a decade now, and this doesn't
really seem to be a problem in practice, so while the format is somewhat
ambiguous, I tend to think that introducing a new syntax just to fix it
isn't worth it. Especially a syntax like \x{80}, which was one of the
suggestions -- the idea, after all, is to make display prettier and more
readable.
Any further opinions?
--
(domestic pets only, the antidote for overdose, milk.)
bloggy blog: http://lars.ingebrigtsen.no