bug-gnulib

Re: fnmatch: Overcome wchar_t limitations

From:	Adhemerval Zanella Netto
Subject:	Re: fnmatch: Overcome wchar_t limitations
Date:	Tue, 25 Jul 2023 12:29:08 -0300
User-agent:	Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:102.0) Gecko/20100101 Thunderbird/102.13.0


On 24/07/23 21:46, Bruno Haible wrote:
> This patch fixes the remaining problems of fnmatch() on Cygwin,
> native Windows, and AIX in 32-bit mode.
> 
> I think the code changes are in glibc style, because
>   - they use upper-case names for macros,
>   - the macro names are aligned to the type and function names used
>     by glibc,
>   - the #ifdef _LIBC alternative comes first.
> 
> The fnmatch replacement uses char32_t only when needed, i.e. for
> the three platforms listed above. When char32_t is not needed, we
> can just continue to use wchar_t, because
>   - the wchar_t-based code is needed for glibc source code compatibility,
>   - fewer dependencies need to be compiled into the binaries if we can
>     use libc built-in functions instead of gnulib c32_* and u32_* functions.
> 
> Packages that use the 'fnmatch' or 'fnmatch-gnu' module don't need to
> update their *_LDADD variables, if they don't use the 'libunistring-optional'
> module.
> (Like in 
> <https://lists.gnu.org/archive/html/bug-gnulib/2023-07/msg00000.html>.)

Since you are working on fixing fnmatch issues, I wonder whether this patch
and the recent wide-character work might improve the long-standing issues we
have in the glibc bugzilla [1][2][3].  They might be related to collating
elements, which the generic gnulib code does not support, so I am not sure
they are related to the shared fnmatch implementation.

Also, in glibc we have fixed two issues that you might want to evaluate for
syncing back to gnulib:

  * a79328c7452, which fixes BZ#14185 [4]. As discussed in the bug report, we
    would need to call mbrtowc for both the pattern and the string, so it
    would be at least two calls per iteration. I am not sure whether
    real-world use cases justify this optimization; some cases with an early
    bailout (where there is no need to check the whole pattern and/or string)
    might be faster when converting character by character.  Maybe this is
    something that your recent patches for wide-character support improve.

  * 4e32c8f568200 where we removed alloca usage in favor of dynarray.
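
To make the character-by-character idea from the first item concrete, here
is a rough sketch. It is not the glibc or gnulib code: the helper names
(avail, common_prefix_chars) are hypothetical, and it only compares two
multibyte strings instead of doing real pattern matching. It shows the two
mbrtowc calls per iteration and the early bailout at the first difference.

```c
#include <limits.h>
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: number of bytes mbrtowc may inspect starting at S,
   capped at MB_LEN_MAX + 1 and never reaching past the terminating NUL.  */
static size_t
avail (const char *s)
{
  size_t n = 0;
  while (n < MB_LEN_MAX && s[n] != '\0')
    n++;
  return n + 1;
}

/* Hypothetical helper: count the leading characters that the multibyte
   strings A and B have in common, decoding one character at a time.
   Returns (size_t) -1 on an invalid or incomplete multibyte sequence.  */
size_t
common_prefix_chars (const char *a, const char *b)
{
  mbstate_t sa, sb;
  size_t n = 0;

  memset (&sa, 0, sizeof sa);
  memset (&sb, 0, sizeof sb);

  for (;;)
    {
      wchar_t wa, wb;
      /* Two conversions per iteration, one for each input.  */
      size_t la = mbrtowc (&wa, a, avail (a), &sa);
      size_t lb = mbrtowc (&wb, b, avail (b), &sb);

      if (la == (size_t) -1 || la == (size_t) -2
          || lb == (size_t) -1 || lb == (size_t) -2)
        return (size_t) -1;

      /* Early bailout: the rest of both strings is never converted.  */
      if (wa != wb)
        return n;

      /* A return value of 0 means the NUL was decoded; since WA == WB,
         both strings end here.  */
      if (la == 0)
        return n;

      a += la;
      b += lb;
      n++;
    }
}
```

With an early mismatch, only the first few characters of each input are
converted, which is the case where this approach could beat converting both
inputs up front.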

[1] https://sourceware.org/bugzilla/show_bug.cgi?id=26628
[2] https://sourceware.org/bugzilla/show_bug.cgi?id=23393
[3] https://sourceware.org/bugzilla/show_bug.cgi?id=30483
[4] https://sourceware.org/bugzilla/show_bug.cgi?id=14185
