emacs-devel
Re: case-insensitive string comparison


From: Sam Steingold
Subject: Re: case-insensitive string comparison
Date: Mon, 25 Jul 2022 15:39:34 -0400
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/29.0.50 (darwin)

> * Eli Zaretskii <ryvm@tah.bet> [2022-07-25 18:58:19 +0300]:
>
>> From: Sam Steingold <sds@gnu.org>
>> Date: Mon, 25 Jul 2022 10:23:30 -0400
>> 
>> >> Hmm... `string-collate-equalp`?
>> >
>> > (string-collate-equalp "a" "A" current-locale-environment t)
>> > ==> nil
>> > current-locale-environment
>> > ==> "en_US.UTF-8"
>
> I cannot reproduce this:
>
>   (string-collate-equalp "a" "A" current-locale-environment t)
>     => t
>   current-locale-environment
>     => "en_US.UTF-8"
>
> What OS is this, and which Emacs version?

GNU Emacs 29.0.50 (build 5, x86_64-apple-darwin21.5.0, NS appkit-2113.50 Version 12.4 (Build 21F79))
 of 2022-07-25
Repository revision: ffe12ff2503917e47c0356195b31430996c148f9
Repository branch: master
Windowing system distributor 'Apple', version 10.3.2113
System Description:  macOS 12.4

>> So, how do we do case-insensitive string comparison in Emacs?
>
> If you want locale-specific collation, as Stefan said, above.

Do I?
Is it really true that "UTF-8" without "en_US" does _not_ define case
conversion?
But https://docs.python.org/3/library/stdtypes.html#str.casefold says:

>>>>> The casefolding algorithm is described in section 3.13 of the Unicode 
>>>>> Standard.

This seems to imply that the user's locale setting is not relevant.
(Locale _is_ mentioned in
https://www.unicode.org/versions/Unicode14.0.0/ch03.pdf, but there it looks
like part of the _specification_ of the algorithm, not a _modification_ of it.)
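
To illustrate the point, here is a minimal Python sketch of a comparison
built on `str.casefold`, which takes no locale argument. The helper name
`equal_ignore_case` is made up for this example and is not part of any
library; it shows only what Python does, not what Emacs does or should do.

```python
def equal_ignore_case(a: str, b: str) -> bool:
    """Hypothetical helper: compare two strings after Unicode case folding.

    str.casefold() accepts no locale parameter, so the result does not
    depend on the user's locale settings.
    """
    return a.casefold() == b.casefold()


if __name__ == "__main__":
    print(equal_ignore_case("a", "A"))    # True
    print(equal_ignore_case("SS", "ß"))   # True: both fold to "ss"
    print(equal_ignore_case("a", "b"))    # False
```

A two-argument predicate of this shape is what one would want to pass as a
:test argument.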

>> Is it okay to add a `string-equal-ignore-case' based on `compare-strings'?
>> (even though it does not recognize "SS" and "ß" as equal)
>
> What's wrong with calling compare-strings directly?

I want to be able to use `string-equal-ignore-case' as a :test argument
to things like `cl-find'.
And I don't want to have to think about encodings and locales.
So I want the core Emacs maintainers who know about these things to
provide me with something that works. Thanks in advance! ;-)

The fact that there are ***TWO*** core functions that compare strings -
`string-collate-equalp' and `compare-strings' - does not look right to me.
_I_ should not have to decide which function to use.

>> Or should we first implement something like casefold in Python?
>
> Ha! we already have that:
>
>   (get-char-code-property ?ß 'special-uppercase)
>     => "SS"

Nice, but how does it help me if
--8<---------------cut here---------------start------------->8---
(compare-strings "SS" 0 nil "ß" 0 nil t)
==> -1
(string-collate-equalp "SS" "ß" "en_US.UTF-8" t)
==> nil
--8<---------------cut here---------------end--------------->8---
instead of `t'?

> Give us some credit, yes?

Sure, and I am very grateful!

-- 
Sam Steingold (http://sds.podval.org/) on darwin Ns 10.3.2113
http://childpsy.net http://calmchildstories.com http://steingoldpsychology.com
https://fairforall.org https://camera.org https://thereligionofpeace.com
He who laughs last did not get the joke.


