From: Patrice Dumas
Subject: [bug-libunistring] possible memory issue with u8_strconv_to_encoding after u8_toupper
Date: Mon, 4 Sep 2023 10:46:10 +0200

Hello,

I have a segfault that seems to come from my use of libunistring. I
cannot reproduce the segfault itself with a small reproducer, but
valgrind still shows the same "Invalid read" after realloc as it does
for the code that segfaults. It happens with code like

  u8_result = u8_toupper (u8_text, u8_strlen (u8_text),
                          NULL, NULL, NULL, &lengthp);
  result = u8_strconv_to_encoding (u8_result, "UTF-8",
                                   iconveh_question_mark);

I attached a file that can be used to reproduce the valgrind messages
(but not the segfault). It can be compiled with

  gcc -g -O0 -Wall -fno-stack-protector invalid_read_libunistring.c -lunistring -o invalid_read_libunistring

Valgrind then reports 3 "Invalid read" errors (for each call to the
function). Here is the first one:

  valgrind ./invalid_read_libunistring
==1668190== Invalid read of size 1
==1668190==    at 0x4846794: strlen (vg_replace_strmem.c:494)
==1668190==    by 0x48BB1C4: u8_strlen (u8-strlen.c:28)
==1668190==    by 0x4871076: u8_strconv_to_encoding (u8-strconv-to-enc.c:51)
==1668190==    by 0x109210: to_upper_multibyte (invalid_read_libunistring.c:26)
==1668190==    by 0x109246: main (invalid_read_libunistring.c:38)
==1668190==  Address 0x4c24115 is 0 bytes after a block of size 5 alloc'd
==1668190==    at 0x484582F: realloc (vg_replace_malloc.c:1437)
==1668190==    by 0x486E35C: libunistring_u8_casemap (u-casemap.h:408)
==1668190==    by 0x486FCC4: u8_toupper (u8-toupper.c:41)
==1668190==    by 0x1091E5: to_upper_multibyte (invalid_read_libunistring.c:23)
==1668190==    by 0x109246: main (invalid_read_libunistring.c:38)
==1668190==
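
One reading that would be consistent with this trace (an assumption,
not something verified in this report): u8_toupper works on an array
of exactly the given length and does not append a terminating NUL,
while u8_strconv_to_encoding calls u8_strlen and so expects a
NUL-terminated string. The read "0 bytes after a block of size 5"
would then be strlen running off the end of the case-mapped array.
The sketch below illustrates that distinction with the standard C
library only; upper_n and upper_string are hypothetical stand-ins, not
libunistring functions. In libunistring terms the analogous change
would presumably be to pass u8_strlen (u8_text) + 1 as the length, or
to use a length-taking conversion such as u8_conv_to_encoding; neither
has been tested here.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a length-counted case mapping: uppercases
   exactly n bytes of s into a freshly allocated buffer of n bytes and
   stores n in *lengthp.  It adds no terminating NUL of its own; the
   result ends with a NUL only if the n input bytes did.  */
unsigned char *
upper_n (const unsigned char *s, size_t n, size_t *lengthp)
{
  unsigned char *out = malloc (n > 0 ? n : 1);
  if (out == NULL)
    return NULL;
  for (size_t i = 0; i < n; i++)
    out[i] = (unsigned char) toupper (s[i]);
  *lengthp = n;
  return out;
}

/* Returns a malloc'd, NUL-terminated uppercase copy of text, or NULL
   on allocation failure.  The caller frees the result.  */
char *
upper_string (const char *text)
{
  size_t length;
  /* strlen (text) + 1: the terminating NUL is included in the mapped
     range, so the result can be handed to functions that call strlen.
     With plain strlen (text) the buffer would have no terminator.  */
  unsigned char *result
    = upper_n ((const unsigned char *) text, strlen (text) + 1, &length);
  if (result == NULL)
    return NULL;
  assert (length > 0 && result[length - 1] == '\0');
  return (char *) result;
}
```

With this, upper_string ("abc") yields a buffer of 4 bytes that can be
passed safely to strlen-based code, whereas mapping only strlen (text)
bytes would yield 3 bytes with nothing after them.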

I tested with the git source and get the same result. The comments in
the attached file describe the steps I took to reproduce this from
within a directory added to the libunistring top-level source directory.

It may be a false positive. It may also be an error on my side, but if
that is the case, the documentation could probably be improved.

The segfault happens in code I am working on in a private texinfo
branch, which I am waiting for the official release to merge. I have no
problem with sharing it, but it is a very big change with many commits,
so I am not sure that it is easy to use. Tell me if you would like me
to post the changes as a diff somewhere. Also note that the segfault
does not happen directly in the equivalent function, but there are
issues with the string returned by u8_strconv_to_encoding. When I print
it out, sometimes it is fine, but sometimes there are additional
characters, which suggests some memory corruption, and in some cases
there is a segfault.

PS: I am not subscribed
--
Pat
invalid_read_libunistring.c
Description: Text Data