bug-gnu-emacs
bug#50865: 28.0.50; Emoji with emoji modifier in Linux console garbles e


From: Aura Kelloniemi
Subject: bug#50865: 28.0.50; Emoji with emoji modifier in Linux console garbles emacs display
Date: Mon, 04 Oct 2021 15:25:23 +0300

Hi,

On 2021-10-02 at 13:58 +0300, Eli Zaretskii <eliz@gnu.org> wrote:
 > > Are you sure they don't? what do the developers say about that?

I am actually a bit confused that the Linux console doesn't seem to be well
known on this list. I am not blaming anyone, just wondering. I would think it
would be very easy for any GNU/Linux user to reproduce this bug at any time.

Anyhow, here is proof that the Linux console really does not understand
two-column characters.

This is again a Bash session in a bare Linux console:

$ echo $'ab\U0001F64Fxy\rabc'
abcxy

This prints the letters a and b followed by a wide emoji, followed by the
letters x and y. Then it moves the cursor back to the beginning of the line
with \r and writes the letters a, b and c. These should overwrite the first
two letters and the first half of the emoji, leaving the letters x and y
intact.

But as you can see, the letter c here overwrites the whole emoji. If the emoji
were really treated as wide, the output would be

$ echo $'ab\U0001F64Fxy\rabc'
abc xy

Here the space represents the right half of the broken emoji. This latter
example was run in a VTE-based terminal that supports Unicode properly.
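
The two outcomes can also be reproduced without any terminal by modelling the
screen line as an array of cells. What follows is only a rough sketch under
stated assumptions, not how any real terminal is implemented: the function
name render is made up, the letter E stands in for the emoji U+1F64F so the
script does not depend on the locale, and ^ stands in for \r.

```shell
#!/usr/bin/env bash
# render INPUT EMOJI_WIDTH
# Toy model of one terminal line (hypothetical helper, for illustration only).
#   E  stands in for the emoji U+1F64F
#   ^  stands in for a carriage return (\r)
# EMOJI_WIDTH is the number of cells the terminal gives the emoji (1 or 2).
render() {
    local input=$1 emoji_width=$2
    local -a cells=() cont=()   # cont[i]=1 marks the right half of a wide char
    local col=0 i ch out=''
    for (( i = 0; i < ${#input}; i++ )); do
        ch=${input:i:1}
        if [[ $ch == '^' ]]; then
            col=0
            continue
        fi
        # Overwriting the right half of a wide character blanks its left half.
        if [[ ${cont[col]:-0} == 1 ]]; then
            cells[col-1]=' '
            cont[col]=0
        fi
        # Overwriting the left half of a wide character blanks its right half.
        if [[ ${cont[col+1]:-0} == 1 ]]; then
            cells[col+1]=' '
            cont[col+1]=0
        fi
        cells[col]=$ch
        if [[ $ch == E ]] && (( emoji_width == 2 )); then
            cells[col+1]=''     # right half: takes a cell, has no glyph of its own
            cont[col+1]=1
            (( col += 2 ))
        else
            (( col += 1 ))
        fi
    done
    for (( i = 0; i < ${#cells[@]}; i++ )); do
        out+=${cells[i]}
    done
    printf '%s\n' "$out"
}

render 'abExy^abc' 1   # emoji takes one cell:  abcxy
render 'abExy^abc' 2   # emoji takes two cells: abc xy
```

With a width of 1 the model gives the Linux console result, and with a width
of 2 it gives the VTE result, including the blank left behind by the broken
right half.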

 > If indeed the Linux console doesn't support double-width characters,
 > or at least enough of them to cause trouble with Emacs display, my
 > suggestion would be to use this setting:

 >   M-x set-terminal-coding-system RET latin-1 RET

As Andreas pointed out, this would not work. Using only ASCII would be a
horrible regression. My native language uses many letters outside the ASCII
range. Nowadays even programming becomes difficult without Unicode. This is
not a feasible solution.

 > This will display characters outside the Latin-1 range as \uNNNN or
 > \U0nnnnn (depending on the codepoint), with an underline attribute to
 > make it easier to tell where the character's code ends and the
 > following text begins (in case it begins with a digit).

The Linux console does not support the underline attribute. See man 4
console_codes, which talks about simulating the attributes.
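
For anyone who wants to check how their own terminal treats underlining, a
minimal sketch is to emit the attribute by hand. The helper name underline is
made up for this example; the escape sequences are the standard SGR codes 4
(underline on) and 24 (underline off), both of which are listed in
console_codes(4).

```shell
#!/usr/bin/env bash
# underline TEXT
# Wrap TEXT in SGR 4 (underline on) and SGR 24 (underline off).
# The helper name is hypothetical; the codes are standard ECMA-48 SGR codes.
underline() {
    printf '\033[4m%s\033[24m\n' "$1"
}

underline 'is this text underlined?'
```

If the text comes out in a different colour rather than with a line under it,
the terminal is simulating the attribute instead of drawing it.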

 > This should allow you to read the rest of the text without messing up the
 > display. I don't really see a better solution for such problematic
 > terminals.

The solution of modifying char-width-table at least worked very well for me.
Of course I am interested in what will break if I use it, but most likely
those will be smaller annoyances than a garbled display.

I can document this hack on the Emacs wiki if nothing else can be done.

 > Emacs relies on the terminal to display characters correctly, using 2
 > columns (with padding by empty space) when the character is
 > double-width.  If the terminal doesn't live up to these expectations,
 > the display will become garbled.

Couldn't Emacs add a padding space after every two-column character? This
would fix the alignment/garbling issues altogether. The behaviour could be
controlled by a terminal-local variable, and it could be set automatically for
terminals that don't support multi-column characters.

Emacs already kind of adds a padding space if I type characters one at a time
(because it repositions the cursor after every command), but this does not
happen if the text is sent to the terminal in a batch (e.g. when drawing the
contents of a buffer, or when doing a redraw).

-- 
Aura




