Re: Alternative to string< that works "well" with unicode
From: Pascal J. Bourguignon
Subject: Re: Alternative to string< that works "well" with unicode
Date: Fri, 28 Nov 2014 00:52:31 +0100
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/24.3 (gnu/linux)
Rasmus <rasmus@gmx.us> writes:
> Hi,
>
> I want to sort a list of strings, including accented strings, in a
> "meaningful way". E.g. with this list (É E T A À Z) the sorted list
> should be (A À E É T Z).
>
> (sort '(É E T A À Z) 'string<)
> => (A E T Z À É) ; expected (A À E É T Z)
You might have expected that, but users writing in other languages would
expect something else: collation order depends on the language.
Handling this is part of localization.
> I tried all the versions of 'string< that I could find with apropos.
>
> Is there a function that will support my preferred sorting in Emacs?
AFAICS, there's nothing yet.
You could try to implement the UCA (Unicode Collation Algorithm):
http://en.wikipedia.org/wiki/Unicode_collation_algorithm
http://www.unicode.org/reports/tr10/
Alternatively, you could send the data to the Unix sort command, with
the LC_ALL environment variable set to the locale you want.
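A minimal sketch of that second approach, in case it helps. The function
name my-sort-strings-with-locale is made up for this example; it assumes
a Unix sort program on your PATH, that the locale you pass (for instance
"fr_FR.UTF-8") is actually installed on the system, and that none of the
strings contains a newline. The resulting order is whatever that locale
defines, so it may vary between systems.

```elisp
(defun my-sort-strings-with-locale (strings locale)
  "Return STRINGS sorted by the external `sort' command under LOCALE.
Hypothetical helper: assumes a Unix `sort' on PATH, an installed
LOCALE such as \"fr_FR.UTF-8\", and strings without newlines."
  (let ((process-environment
         ;; Earlier entries take precedence, so this overrides any LC_ALL
         ;; already present in the inherited environment.
         (cons (concat "LC_ALL=" locale) process-environment))
        (coding-system-for-read 'utf-8)
        (coding-system-for-write 'utf-8))
    (with-temp-buffer
      ;; One string per line, with a trailing newline for `sort'.
      (insert (mapconcat #'identity strings "\n") "\n")
      ;; Send the buffer to `sort' and replace it with the output.
      (call-process-region (point-min) (point-max) "sort" t t nil)
      ;; Split the output back into a list, dropping empty strings.
      (split-string (buffer-string) "\n" t))))

;; Possible usage; note that the elements are strings, not symbols:
;; (my-sort-strings-with-locale '("É" "E" "T" "A" "À" "Z") "fr_FR.UTF-8")
```

This starts one external process per call, so it is better suited to
sorting a whole list at once than to serving as a predicate for `sort'.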
--
__Pascal Bourguignon__ http://www.informatimago.com/
“The factory of the future will have only two employees, a man and a
dog. The man will be there to feed the dog. The dog will be there to
keep the man from touching the equipment.” -- Carl Bass CEO Autodesk