help-gnu-emacs

Re: Alternative to string< that works "well" with unicode


From: Pascal J. Bourguignon
Subject: Re: Alternative to string< that works "well" with unicode
Date: Fri, 28 Nov 2014 00:52:31 +0100
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/24.3 (gnu/linux)

Rasmus <rasmus@gmx.us> writes:

> Hi,
>
> I want to sort a list of strings, including accented strings, in a
> "meaningful way".  E.g. with this list (É E T A À Z) the sorted list
> should be (A À E É T Z).
>
> (sort '(É E T A À Z) 'string<)
>       => (A E T Z À É) ; expected (A À E É T Z)

You might expect that order, but users writing in other languages will
expect something else: collation rules differ from one language and
locale to another.

Handling this is part of localization.


> I tried all the versions of 'string< that I could find with apropos.
>
> Is there a function that will support my preferred sorting in Emacs?

AFAICS, there's nothing yet.

You could try to implement the UCA (Unicode Collation Algorithm):
http://www.unicode.org/reports/tr10/
http://en.wikipedia.org/wiki/Unicode_collation_algorithm

Alternatively, you could send the data to the Unix sort command, with
the LC_ALL environment variable set to the locale whose ordering you want.
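
A minimal sketch of that second approach, assuming a POSIX shell and a
sort that honours LC_ALL. The function name sort_with_locale is
hypothetical, and fr_FR.UTF-8 is only an example locale that may not be
installed on your system (locale -a lists the ones that are):

```shell
# sort_with_locale LOCALE
# Read lines on stdin and print them sorted by LOCALE's collation rules.
# (Hypothetical helper name; it only sets LC_ALL for this one sort call.)
sort_with_locale() {
    LC_ALL="$1" sort
}

# The C locale compares raw bytes, so the accented letters, which are
# multi-byte in UTF-8, end up after Z, just like string< in the question:
printf '%s\n' É E T A À Z | sort_with_locale C

# With an installed locale for the language you care about, the accented
# letters should instead be placed according to that language's rules:
#   printf '%s\n' É E T A À Z | sort_with_locale fr_FR.UTF-8
```

From Emacs, the same kind of command can be run over a region with
shell-command-on-region (M-|); with a prefix argument the region is
replaced by the sorted output.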

-- 
__Pascal Bourguignon__                 http://www.informatimago.com/
“The factory of the future will have only two employees, a man and a
dog. The man will be there to feed the dog. The dog will be there to
keep the man from touching the equipment.” -- Carl Bass CEO Autodesk

