bug-glibc

Regarding strcoll() in glibc 2.2


From: Dmitry Yu. Bolkhovityanov
Subject: Regarding strcoll() in glibc 2.2
Date: Fri, 13 Jul 2001 20:15:13 +0700

    Hi!

    There's a problem (in fact, two problems) with strcoll() in glibc 2.2.

    First, in many locales (at least en_US, de_DE, ru_RU and uk_UA; the only
exception is "C") strcoll() ignores non-alphanumeric characters such as ".".
This completely breaks directory listings, as both "ls" and "mc" show
(Red Hat 7.1):

goofy:~% ls -a1
.bashrc
bbb
.cshrc

whereas every other Unix (and glibc 2.1 too) would list these as ".bashrc,
.cshrc, bbb".
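
    A minimal sketch for reproducing the ordering outside of "ls": sort the
three names with qsort() using strcoll() as the comparator.  The helper name
sort_names_in_locale() is hypothetical, and which locale names are installed
depends on the system.

```c
#include <locale.h>
#include <stdlib.h>
#include <string.h>

/* qsort() comparator: order two C strings by the current LC_COLLATE rules. */
int cmp_coll(const void *a, const void *b)
{
    return strcoll(*(const char *const *)a, *(const char *const *)b);
}

/*
 * Hypothetical helper (not a libc function): switch LC_COLLATE to `locale`
 * and sort `names` in place with strcoll().
 * Returns 0 on success, -1 if the requested locale is not available.
 */
int sort_names_in_locale(const char *locale, const char **names, size_t n)
{
    if (setlocale(LC_COLLATE, locale) == NULL)
        return -1;
    qsort(names, n, sizeof names[0], cmp_coll);
    return 0;
}
```

    Called with "C" on the array { ".bashrc", "bbb", ".cshrc" }, this yields
.bashrc, .cshrc, bbb.  Calling it with "en_US" (if that locale is installed)
shows what the running libc does; on glibc 2.2 the listing above suggests the
dot-files end up interleaved with the other names.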

    And, of course, this breaks any indexing software that tries to obey a
language's native sorting order and hence uses strcoll() instead of
strcasecmp().  For example, if somebody builds an index of C keywords, the
preprocessor directives will be scattered throughout the index instead of
being grouped together.

    Second, since 2.2 strcoll() no longer distinguishes between uppercase and
lowercase letters.  This also breaks directory listings -- there is a
long-established (and well-documented) practice of giving "important" files
names that start with an uppercase letter, so that they sort ahead of the
rest -- README, Changelog, Makefile etc.

    As I understand it, this was done in order to follow national traditions
-- most dictionaries are "case-insensitive".  But there is also a need for
comparison that is case-sensitive *and* locale-sensitive.  Are there any plans
to implement something like "strnocasecoll()/strnocasexfrm()"?  Or should the
Austin Group be pushed in this direction first?
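
    Until something like that exists, one partial workaround can be sketched
in user code: compare with strcoll() first and fall back to strcmp() only
when the locale reports the two strings as equal.  The name coll_then_bytes()
is hypothetical and not part of glibc; this is an illustration under that
assumption, not a replacement for a proper libc interface.

```c
#include <string.h>

/*
 * Hypothetical comparator (not a glibc function): locale-sensitive order
 * from strcoll(), with strcmp() as a tie-breaker so that two strings which
 * collate as equal but differ in their bytes still get a stable order.
 */
int coll_then_bytes(const char *a, const char *b)
{
    int r = strcoll(a, b);
    return r != 0 ? r : strcmp(a, b);
}
```

    Note that this only separates strings that strcoll() treats as equal.  It
does not move README or Makefile ahead of lowercase names the way the "C"
locale does.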


(BTW, '(libc.info.gz)Collation Functions' says:

    The `strcoll' function is similar to `strcmp' but uses the collating
    sequence of the current locale for collation.

-- in my understanding, this implies that strcoll() should distinguish upper-
and lowercase, as strcmp() does.)
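
    To check how far the two functions actually agree in a given locale, a
small probe such as the following could be used.  The helper names sign_of()
and same_order() are hypothetical; the caller is assumed to have selected the
locale with setlocale() beforehand.

```c
#include <locale.h>
#include <string.h>

/* Reduce a comparison result to -1, 0 or +1 so two results can be compared. */
int sign_of(int v)
{
    return (v > 0) - (v < 0);
}

/*
 * Hypothetical helper: returns 1 if strcoll() and strcmp() put `a` and `b`
 * in the same relative order under the current locale, 0 otherwise.
 */
int same_order(const char *a, const char *b)
{
    return sign_of(strcoll(a, b)) == sign_of(strcmp(a, b));
}
```

    In the "C" locale same_order() returns 1 for pairs such as "README" /
"readme" or ".bashrc" / "bbb"; running it after setlocale(LC_ALL, "") shows
whether the user's locale orders them differently.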


       ___________________________________________________________________
       Dmitry Yu. Bolkhovityanov  |  Novosibirsk, RUSSIA
       phone (383-2)-39-49-56     |  The Budker Institute of Nuclear Physics
                                  |  Lab. 5-13


