help-gnu-emacs

Re: Sorting on compound keys?


From: Tim Landscheidt
Subject: Re: Sorting on compound keys?
Date: Fri, 10 Jun 2011 00:27:37 +0000
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/23.1 (gnu/linux)

Mark Tilford <ralphmerridew@gmail.com> wrote:

>> sometimes I want to sort unified diffs of CSV files (separated
>> by tabs (here: \t)):

>> | +A 1\t1\tx
>> | +A 1\t2\ty
>> | +B 2\t3\tz
>> | -A 1\t1\tx
>> | -B 2\t2\ty
>> | -B 2\t3\tz

>> by the second column, then the first column, then "+" vs.
>> "-". Unfortunately, it seems that sort-regexp-fields doesn't
>> allow more than one match field as a key. sort-fields
>> doesn't work either as it requires the fields to be surrounded
>> by white space (no "+" vs. "-") and doesn't allow
>> white space inside the fields.

>>  Is there any function in vanilla Emacs (23.1.1) that I
>> missed? I looked at pimping sort-regexp-fields, but it seems
>> to me that sort-subr would have to be rewritten from scratch
>> to achieve sorting on compound keys.

> Is there an option to do a stable sort, such as mergesort?

Eureka! Of course! All Emacs sort functions are stable, so
99% of my use cases can be dealt with by multiple calls to
sort-regexp-fields, sorting by the least significant key
first and by the most significant key last (the only
exception being numeric sorting and the like).
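
As a minimal sketch of the multi-pass idea, the following sorts the
sample lines quoted above with three calls to sort-regexp-fields. The
function name example-sort-diff-sample and the record regexp are
assumptions made for this illustration; the regexp puts the sign in
\1, the first column in \2 and the second column in \3.

```elisp
(defun example-sort-diff-sample ()
  "Sort the sample diff lines with three stable passes; return the text.
Hypothetical helper for illustration only."
  (with-temp-buffer
    (insert "+A 1\t1\tx\n"
            "+A 1\t2\ty\n"
            "+B 2\t3\tz\n"
            "-A 1\t1\tx\n"
            "-B 2\t2\ty\n"
            "-B 2\t3\tz\n")
    ;; Group 1: "+" or "-"; group 2: first column; group 3: second column.
    (let ((record "^\\([-+]\\)\\([^\t\n]*\\)\t\\([^\t\n]*\\)\t.*$"))
      ;; Least significant key first, most significant key last.
      (dolist (key '("\\1" "\\2" "\\3"))
        (sort-regexp-fields nil record key (point-min) (point-max))))
    (buffer-string)))
```

Each later pass keeps the relative order that the earlier passes
established among records with equal keys, which is why the last pass
(the second column) ends up as the primary key.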

  Unfortunately, those multiple calls can be tedious when
done interactively, so voilà:

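| (defun tl-sort-regexp-fields (reverse record-regexp key-regexp beg end)
|   "Like `sort-regexp-fields', but KEY-REGEXP may list several keys.
| A compound key such as \"\\2-\\3\\1\" sorts by field 2, then by field 3
| in reverse order, then by field 1.  Any other KEY-REGEXP is passed on
| to `sort-regexp-fields' unchanged."
|   (interactive "P\nsRegexp specifying records to sort: \nsRegexp specifying key within record: \nr")
|   (if (string-match "\\`\\(?:-\\\\[1-9]\\|\\(?:-?\\\\[1-9]\\)\\{2,\\}\\)\\'"
|                     key-regexp)
|       ;; Walk the keys from last to first: because the sort is stable,
|       ;; the key sorted last ends up as the most significant one.
|       (let ((i (length key-regexp)))
|         (while (> i 0)
|           (let ((key-reverse (and (> i 2) (= (aref key-regexp (- i 3)) ?-)))
|                 (key (substring key-regexp (- i 2) i)))
|             ;; Exclusive or of REVERSE and KEY-REVERSE.
|             (sort-regexp-fields (if reverse (not key-reverse) key-reverse)
|                                 record-regexp key beg end)
|             ;; Step back over "\N", plus the "-" prefix if present.
|             (setq i (- i (if key-reverse 3 2))))))
|     (sort-regexp-fields reverse record-regexp key-regexp beg end)))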

A key-regexp of "\2\3\1" will yield the region sorted by the
second field, then the third, then the first. Each field can
be prefixed with "-" to reverse the sort order for that
field, e.g. "\2-\3\1" will sort by the second field in
ascending order, then the third in descending order, then
the first in ascending order.
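
To make the order of the passes explicit, here is a small sketch that
splits such a compound key into the individual sorts. The function
name example-compound-key-passes is hypothetical and not part of the
code above; it only mirrors how the key string is read, and it
assumes the key has already been checked to be a valid compound key.

```elisp
(defun example-compound-key-passes (key-regexp)
  "Return the sort passes for KEY-REGEXP as a list of (KEY . REVERSE).
The passes are listed in the order they are applied, i.e. the last
key of KEY-REGEXP comes first.  Hypothetical helper for illustration."
  (let ((pos 0)
        passes)
    ;; Each key is "\N" with an optional "-" prefix.
    (while (string-match "\\(-\\)?\\(\\\\[1-9]\\)" key-regexp pos)
      ;; `push' reverses the order, so the last key ends up first.
      (push (cons (match-string 2 key-regexp)
                  (and (match-beginning 1) t))
            passes)
      (setq pos (match-end 0)))
    passes))
```

For the key "\2-\3\1" (written "\\2-\\3\\1" as a Lisp string) this
yields \1 ascending, then \3 reversed, then \2 ascending, which is the
order in which the region is sorted.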

  With regard to performance, the region is sorted once for
every key, so it may not be suitable for larger datasets,
but up to a few thousand lines it's fast enough for me. If
someone wants to integrate this into Emacs, please go ahead.
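
For larger datasets, one alternative (a different technique from the
function above, sketched here only under assumptions) is to sort once
with a predicate that compares all keys in a single comparison. The
names example-compound-less-p, example-line-keys and
example-sort-lines-once are hypothetical, the sketch works on a list
of strings rather than on a buffer region, and it hard-codes the
tab-separated layout from the original question.

```elisp
(defun example-compound-less-p (a b)
  "Return non-nil if key list A sorts before key list B.
A and B are lists of strings, most significant key first."
  (let ((result nil)
        (decided nil))
    (while (and a b (not decided))
      (cond ((string< (car a) (car b)) (setq result t decided t))
            ((string< (car b) (car a)) (setq decided t)))
      (setq a (cdr a)
            b (cdr b)))
    result))

(defun example-line-keys (line)
  "Return (SECOND-COLUMN FIRST-COLUMN SIGN) for a tab-separated diff LINE."
  (let ((fields (split-string (substring line 1) "\t")))
    (list (nth 1 fields) (nth 0 fields) (substring line 0 1))))

(defun example-sort-lines-once (lines)
  "Return a sorted copy of LINES, comparing all keys in a single pass."
  (sort (copy-sequence lines)
        (lambda (a b)
          (example-compound-less-p (example-line-keys a)
                                   (example-line-keys b)))))
```

This sorts only once, but it extracts the keys again on every
comparison, so whether it is actually faster than several stable
passes would have to be measured.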

Thanks, also to Andreas,
Tim

P. S.: Is there really no xor in elisp?



