bug#51292: 27.2; Reversing strings with unicode combining characters
From: Lars Ingebrigtsen
Subject: bug#51292: 27.2; Reversing strings with unicode combining characters
Date: Tue, 19 Oct 2021 21:26:31 +0200
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/29.0.50 (gnu/linux)
Howard Melman <hmelman@gmail.com> writes:
> Reversing a string fails to account for unicode combining characters
>
> (reverse "nai\u0308ve")
> "ev̈ian"
>
> Note the diaeresis is now on the v and not the i. s-reverse gets it right:
>
> (s-reverse "nai\u0308ve")
> "evïan"
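For illustration, here is a small sketch of why the diaeresis moves. It relies only on `string-to-list` and `reverse`, and assumes Emacs 25 or later, where `reverse` accepts strings. `reverse` works on individual characters, so U+0308 (decimal 776), which follows the i in the input, ends up following the v in the output:

```elisp
;; Characters of the original string: 776 (U+0308) comes right after 105 (i).
(string-to-list "nai\u0308ve")
;; => (110 97 105 776 118 101)

;; After a character-wise reverse, 776 comes right after 118 (v) instead.
(string-to-list (reverse "nai\u0308ve"))
;; => (101 118 776 105 97 110)
```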
So I wondered what s-reverse did, and indeed:
(defun s-reverse (s)
  "Return the reverse of S."
  (declare (pure t) (side-effect-free t))
  (save-match-data
    (if (multibyte-string-p s)
        (let ((input (string-to-list s))
              output)
          (require 'ucs-normalize)
          (while input
            ;; Handle entire grapheme cluster as a single unit
            (let ((grapheme (list (pop input))))
              (while (memql (car input) ucs-normalize-combining-chars)
                (push (pop input) grapheme))
              (setq output (nconc (nreverse grapheme) output))))
          (concat output))
      (concat (nreverse (string-to-list s))))))
Emacs has string-reverse, obsolete since 25.1. Perhaps we should
reintroduce it and use the definition from s?
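As a rough sketch of the same idea without the dependency on ucs-normalize, something like the following might work. The names `my/combining-char-p` and `my/reverse-by-cluster` are hypothetical, and treating the Unicode general categories Mn, Mc and Me as "combining" is an assumption made for this example, not what s.el does. It is also not full grapheme cluster segmentation: sequences such as ZWJ emoji would still be split.

```elisp
;; Hypothetical helper: non-nil if CH is a combining mark, judged by its
;; Unicode general category (Mn, Mc or Me).  This is an assumption of the
;; sketch, not the test s-reverse uses.
(defun my/combining-char-p (ch)
  "Return non-nil if character CH has general category Mn, Mc or Me."
  (memq (get-char-code-property ch 'general-category) '(Mn Mc Me)))

;; Hypothetical function: reverse S while keeping each base character
;; together with the combining marks that follow it.
(defun my/reverse-by-cluster (s)
  "Return S reversed, keeping combining marks attached to their base."
  (let ((input (string-to-list s))
        output)
    (while input
      ;; Collect one base character plus any trailing combining marks.
      (let ((cluster (list (pop input))))
        (while (and input (my/combining-char-p (car input)))
          (push (pop input) cluster))
        ;; CLUSTER was built back to front; restore its order and
        ;; prepend it to the result, which reverses the cluster order.
        (setq output (nconc (nreverse cluster) output))))
    (concat output)))

(my/reverse-by-cluster "nai\u0308ve")
;; => "evi\u0308an", i.e. the diaeresis stays on the i
```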
--
(domestic pets only, the antidote for overdose, milk.)
bloggy blog: http://lars.ingebrigtsen.no