emacs-devel

Re: case-insensitive string comparison


From: Sam Steingold
Subject: Re: case-insensitive string comparison
Date: Tue, 26 Jul 2022 10:28:01 -0400
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/29.0.50 (darwin)

> * Richard Stallman <ezf@tah.bet> [2022-07-25 23:24:43 -0400]:
>
>   > Is it okay to add a `string-equal-ignore-case' based on `compare-strings'?
>   > (even though it does not recognize "SS" and "ß" as equal)
>
> A function `string-equal-ignore-case' would make sense.  My question is,
> is it worth the cost in complexity, or is it better to urge users to call
> `compare-strings' directly?

1. We already have `string-prefix-p' and `string-suffix-p', which are
thin wrappers around `compare-strings'.

> That depends on how often programs will do case-insensitive string comparison.
> If frequently, that gives a bigger upside to `string-equal-ignore-case'.

2. There are dozens of places in the Emacs core with code like

--8<---------------cut here---------------start------------->8---
          (eq t (compare-strings (sgml-tag-name tag-info) nil nil
                                 (car stack) nil nil t))
--8<---------------cut here---------------end--------------->8---

3. Some Emacs packages already have to define their own versions of
`string-equal-ignore-case', e.g., `bbdb-string='.
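
For concreteness, a minimal sketch of such a wrapper.  The name
`my-string-equal-ignore-case' is hypothetical, chosen only for
illustration.  The `(eq t ...)' test matters because `compare-strings'
returns an integer, which is non-nil, when the strings differ.

```elisp
;; Hypothetical name, for illustration only.
(defun my-string-equal-ignore-case (string1 string2)
  "Return t if STRING1 and STRING2 are equal, ignoring case.
Thin wrapper around `compare-strings'; like it, this does not
treat \"SS\" and \"ß\" as equal."
  ;; `compare-strings' returns t on a match and an integer otherwise,
  ;; so compare against t rather than testing for non-nil.
  (eq t (compare-strings string1 nil nil string2 nil nil t)))

(my-string-equal-ignore-case "Emacs" "EMACS")  ; strings match
(my-string-equal-ignore-case "Emacs" "Emacs!") ; strings differ
```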

>   > Or should we first implement something like casefold in Python?
>   > https://docs.python.org/3/library/stdtypes.html#str.casefold
>
> That casefold operation is not the same thing as ignoring case in
> Emacs.

Normally, case-insensitive comparison means something like

--8<---------------cut here---------------start------------->8---
(string= (casefold A) (casefold B))
--8<---------------cut here---------------end--------------->8---

`compare-strings' does

--8<---------------cut here---------------start------------->8---
(string= (upcase A) (upcase B))
--8<---------------cut here---------------end--------------->8---

(except that it works character by character, without allocating new
strings for `upcase').
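
A rough Lisp rendering of that character-by-character comparison, as a
sketch only: `my-upcase-equal-p' is a hypothetical helper, not the actual
C implementation, and it covers only the equality case, not the ordering
information that `compare-strings' also returns.

```elisp
;; Hypothetical helper, for illustration only.
(defun my-upcase-equal-p (a b)
  "Return t if strings A and B are equal after upcasing each character.
Compares character by character, without building upcased copies."
  (and (= (length a) (length b))
       (let ((i 0)
             (n (length a))
             (same t))
         ;; Stop at the first pair of characters whose upcased forms differ.
         (while (and same (< i n))
           (setq same (= (upcase (aref a i)) (upcase (aref b i)))
                 i (1+ i)))
         same)))

(my-upcase-equal-p "Hello" "hELLO") ; strings match
(my-upcase-equal-p "abc" "abd")     ; strings differ
```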

> How to integrate something like that into Emacs, and in
> general how to handle `ß' properly in case conversion, calls for more
> thought.

Bruno Haible replied in this thread, suggesting libunistring via gnulib.
I think this is the easiest way to handle the issue.

-- 
Sam Steingold (http://sds.podval.org/) on darwin Ns 10.3.2113
http://childpsy.net http://calmchildstories.com http://steingoldpsychology.com
https://memri.org https://honestreporting.com https://ffii.org
The program isn't debugged until the last user is dead.


