From: Michael Schroeder
Subject: Re: [screen-devel] [bug #60030] Screen segfaults by displaying some UTF-8 character combination
Date: Thu, 11 Feb 2021 16:54:46 +0000
User-agent: Mutt/1.10.1 (2018-07-13)

On Thu, Feb 11, 2021 at 04:54:50PM +0100, Axel Beckert wrote:
> Michael Schroeder schrieb am Thu, Feb 11, 2021 at 02:20:33PM +0000:
> > Two years ago, the line
> > 
> >     c = (c & 255) | (unsigned char)D_rend.font << 8;
> > 
> > was deleted from RAW_PUTCHAR, but the utf8_handle_comb function in
> > encoding.c was not adapted.
> 
> Sounds like a possible cause for this issue.

Actually it's not. The screen-v4 branch does not include the commit.
It's pretty weird that master and screen-v4 are so different.

> > Since then, combining characters cannot have worked.
> 
> I haven't sent my mail about that yet, but I've tested especially
> combining and stacked combining diacriticals (two in a row) with
> screen 4.8.0 in Debian Unstable:
> 
> And they worked well _before_ any patching of this issue. With my as
> well as with Taviso's patch, stacked combining diacriticals no longer
> work. One of my (randomly chosen) examples was ASCII letter "e" plus
> U+0324 COMBINING DIAERESIS BELOW + U+0312 COMBINING TURNED COMMA
> ABOVE. Looks like this (if mail transport didn't kill it :-): "e̤̒"
> 
> This works fine with screen 4.8.0-3 in Debian Testing (the one which
> crashes, without any of the patches in this thread), but no longer with
> 4.8.0-4, which sports my (flawed, but at least no longer crashing) patch.
> 
> Will try to send that half-finished mail maybe in a ¾-finished state
> after this one. :-)

I've dug a bit more into this, and the root cause of all the evil we
see seems to be that utf8_isdouble() does not return true for the
0xdf00-0xdfff character range. This seems to be a mistake made
in commit b8fd0c833bbd910a525d270ebc8f7e87ee00cb0a in the year 2008!

This breaks the logic in the combining character handling: for example,
the "double width" information is lost if two diacriticals are
applied, which is the problem you already patched in display.c, and so on.
Sigh.

While adding the 0xdf00-0xdfff range to utf8_isdouble() already fixes
the segfault, I think we should still apply the changes I sent earlier
for extra safety.
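
To make the role of that range a bit more concrete, here is a small
sketch of the slot numbering as it can be read off the patch below. It
is not code from screen: the helper names are made up, and the reading
that slots 0x700 and up belong to double-width base characters is an
assumption based on the root selection in the patched comb_tofront().

```c
/* Hypothetical helpers; only the constants come from the patch. */
#define COMB_BASE     0xd800  /* pseudo code point of slot 0 */
#define COMB_DW_FIRST 0x700   /* assumed first slot for double-width bases */

/* Map a combchars slot index to the pseudo code point (i + 0xd800). */
static int slot_to_code(int slot)
{
  return slot + COMB_BASE;
}

/* High byte of the pseudo code point, as in mc->font = (i >> 8) + 0xd8. */
static int slot_font(int slot)
{
  return (slot >> 8) + 0xd8;
}

/* List root chosen the same way as in the patched comb_tofront(). */
static int slot_root(int slot)
{
  return slot >= COMB_DW_FIRST ? 0x801 : 0x800;
}

/* The range check the patch adds at the top of utf8_isdouble(). */
static int is_dw_comb(int c)
{
  return c >= 0xdf00 && c <= 0xdfff;
}
```

With these definitions, slot 0x700 maps to 0xdf00 and slot 0x7ff to
0xdfff, so the new check covers exactly the slots that use root 0x801.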

So we get the following patch for the screen-v4 branch:

diff --git a/encoding.c b/encoding.c
index e5db3e7..c044801 100644
--- a/encoding.c
+++ b/encoding.c
@@ -43,7 +43,7 @@ static int  encmatch __P((char *, char *));
 # ifdef UTF8
 static int   recode_char __P((int, int, int));
 static int   recode_char_to_encoding __P((int, int));
-static void  comb_tofront __P((int, int));
+static void  comb_tofront __P((int));
 #  ifdef DW_CHARS
 static int   recode_char_dw __P((int, int *, int, int));
 static int   recode_char_dw_to_encoding __P((int, int *, int));
@@ -1263,6 +1263,8 @@ int c;
     {0x30000, 0x3FFFD},
   };
 
+  if (c >= 0xdf00 && c <= 0xdfff)
+    return 1;          /* dw combining sequence */
   return ((bisearch(c, wide, sizeof(wide) / sizeof(struct interval) - 1)) ||
           (cjkwidth &&
            bisearch(c, ambiguous,
@@ -1330,11 +1332,12 @@ int c;
 }
 
 static void
-comb_tofront(root, i)
-int root, i;
+comb_tofront(i)
+int i;
 {
   for (;;)
     {
+      int root = i >= 0x700 ? 0x801 : 0x800;
       debug1("bring to front: %x\n", i);
       combchars[combchars[i]->prev]->next = combchars[i]->next;
       combchars[combchars[i]->next]->prev = combchars[i]->prev;
@@ -1396,9 +1399,9 @@ struct mchar *mc;
     {
       /* full, recycle old entry */
       if (c1 >= 0xd800 && c1 < 0xe000)
-        comb_tofront(root, c1 - 0xd800);
+        comb_tofront(c1 - 0xd800);
       i = combchars[root]->prev;
-      if (c1 == i + 0xd800)
+      if (i == 0x800 || i == 0x801 || c1 == i + 0xd800)
        {
          /* completely full, can't recycle */
          debug("utf8_handle_comp: completely full!\n");
@@ -1422,7 +1425,7 @@ struct mchar *mc;
   mc->font  = (i >> 8) + 0xd8;
   mc->fontx = 0;
   debug3("combinig char %x %x -> %x\n", c1, c, i + 0xd800);
-  comb_tofront(root, i);
+  comb_tofront(i);
 }
 
 #else /* !UTF8 */


I deliberately didn't add the 0xdf00-0xdfff range to the bisearch
table, to ensure that it does not get lost the next time the table is
updated.
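
As a standalone illustration of that design choice (a sketch only, not
the code in encoding.c): the explicit range check sits in front of the
table lookup, so regenerating the table cannot drop it. The names
in_table(), wide_demo and is_double_demo() are hypothetical, and the
table is a three-entry stand-in, not the real width data.

```c
#include <stddef.h>

struct interval_demo { int first, last; };

/* Binary search over sorted, non-overlapping intervals. */
static int in_table(int c, const struct interval_demo *t, size_t n)
{
  size_t lo = 0, hi = n;

  while (lo < hi)
    {
      size_t mid = lo + (hi - lo) / 2;
      if (c > t[mid].last)
        lo = mid + 1;
      else if (c < t[mid].first)
        hi = mid;
      else
        return 1;
    }
  return 0;
}

/* Stand-in for a generated table; illustrative excerpt only. */
static const struct interval_demo wide_demo[] = {
  {0x1100, 0x115f},
  {0xac00, 0xd7a3},
  {0x30000, 0x3fffd},
};

static int is_double_demo(int c)
{
  /* Hand-written check, kept out of the generated data on purpose. */
  if (c >= 0xdf00 && c <= 0xdfff)
    return 1;
  return in_table(c, wide_demo, sizeof(wide_demo) / sizeof(wide_demo[0]));
}
```

Replacing wide_demo with a freshly generated table leaves the
0xdf00-0xdfff handling untouched.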

Cheers,
  Michael.

-- 
Michael Schroeder          SUSE Software Solutions Germany GmbH
mls@suse.de      GF: Felix Imendoerffer HRB 36809, AG Nuernberg
main(_){while(_=~getchar())putchar(~_-1/(~(_|32)/13*2-11)*13);}