bug#51292: 27.2; Reversing strings with unicode combining characters


From: Lars Ingebrigtsen
Subject: bug#51292: 27.2; Reversing strings with unicode combining characters
Date: Tue, 19 Oct 2021 21:26:31 +0200
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/29.0.50 (gnu/linux)

Howard Melman <hmelman@gmail.com> writes:

> Reversing a string fails to account for unicode combining characters
>
>     (reverse "nai\u0308ve")
>     "ev̈ian"
>
> Note the diaeresis is now on the v and not the i.  s-reverse gets it right:
>
>     (s-reverse "nai\u0308ve")
>     "evïan"

So I wondered what s-reverse did, and indeed:

(defun s-reverse (s)
  "Return the reverse of S."
  (declare (pure t) (side-effect-free t))
  (save-match-data
    (if (multibyte-string-p s)
        (let ((input (string-to-list s))
              output)
          (require 'ucs-normalize)
          (while input
            ;; Handle entire grapheme cluster as a single unit
            (let ((grapheme (list (pop input))))
              (while (memql (car input) ucs-normalize-combining-chars)
                (push (pop input) grapheme))
              (setq output (nconc (nreverse grapheme) output))))
          (concat output))
      (concat (nreverse (string-to-list s))))))

Emacs has string-reverse, which has been an obsolete alias for reverse
since 25.1.  Perhaps we should reintroduce it as a function of its own
and use the definition from s?
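
For illustration, here is a minimal sketch of the same idea that does not
depend on ucs-normalize.  It assumes that a "combining character" can be
approximated as any character whose Unicode general category is a mark
(Mn, Mc or Me), looked up with get-char-code-property.  Both function
names are hypothetical; neither exists in Emacs or in s.el.

```elisp
;; Hypothetical helpers, not part of Emacs or s.el.
(defun my-combining-char-p (ch)
  "Return non-nil if CH has a Unicode mark category (Mn, Mc or Me).
This is an assumed approximation of \"combining character\"."
  (memq (get-char-code-property ch 'general-category) '(Mn Mc Me)))

(defun my-reverse-keeping-combining (s)
  "Return S reversed, keeping each mark after the character it follows."
  (let (output cluster)
    (dolist (ch (string-to-list s))
      (if (and cluster (my-combining-char-p ch))
          ;; A mark: attach it to the cluster being collected.
          ;; CLUSTER is built in reverse order.
          (push ch cluster)
        ;; A new base character: flush the previous cluster to the
        ;; front of OUTPUT in its original order, then start a new one.
        (when cluster
          (setq output (nconc (nreverse cluster) output)))
        (setq cluster (list ch))))
    ;; Flush the final cluster.
    (when cluster
      (setq output (nconc (nreverse cluster) output)))
    (concat output)))

;; (my-reverse-keeping-combining "nai\u0308ve") returns "evi\u0308an",
;; with the diaeresis still on the i.
```

This only keeps marks attached to the preceding character.  It is not a
full grapheme-cluster segmentation, so sequences joined in other ways
(for example with ZERO WIDTH JOINER) would still be split.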

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no




