octave-maintainers
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: benchmarks - sort


From: Paul Kienzle
Subject: Re: benchmarks - sort
Date: Thu, 22 Jan 2004 00:13:20 -0500


On Jan 21, 2004, at 5:09 PM, David Bateman wrote:

This is due to the fact that the calls in the class to sort the
elements are of the form "tmp = *a; *a = *b; *b = tmp;". This imposes
quite a penalty on the mergesort. I can really see another way to do
this without writing specific versions of the code for all
instantiations of the mergesort class for with and without indexing
(Ugly)... Any suggestions are welcome.

For the indexed case, maybe you could create a vector
of pointers to doubles.  That leads to an extra dereference
in the code which will surely slow things down a bit, but it
keeps the size of data you are shifting down to a word.
You can recover the index by pointer subtraction.

sortrows would be faster if it were implemented directly
rather than stable sorts on each column from right to left.

The basic result is that the radix sort compares well against Matlab
for randomly distributed values (1.10 times the speed of Matlab), but
as this sort takes no advantage of partial ordering, it breaks down in
these cases. The mergesort with double values is faster than Matlab
for partailly ordered lists, but about 2.5 to 3 times slower than
Matlab for random lists. Interestingly, although merge_double is
significantly slower with indexing, it maintains the same relationship
with Matlab.

Is it possible that matlab is using quicksort when
you aren't indexing?  Does sort need to be stable if you
are not preserving the index?  If not, then choose
the fastest sort that works reasonably well on mostly
ordered data when you aren't keeping the index.  IIRC,
basic quicksort doesn't perform very well on mostly
ordered data.


The clear winner for me is the mergesort with integer comparison of
IEEE754 values. This is roughly two times faster than Matlab for
partially ordered lists, while being only slightly slower (1.29
relative to Matlab) than the radix sort (1.10 relative to Matlab).
This algorithm is roughly 6 times faster than the current Octave
sort code. The performance degrades slightly with indexing (roughly
2.0 relative to Matlab).

Note that my test code, checked the correctness of the sorting
algorithms, that it sorted NaN, Inf and -Inf correctly and that
it was stable.

Did you happen to try on both big and little endian machines?
Don't you have to reverse the order of the integer comparisons
in the two cases?

Thanks for all your efforts.  Sort is used pretty often in some
unexpected ways to avoid loops in the script, so it's nice that
it will be fast.

Paul Kienzle
address@hidden



reply via email to

[Prev in Thread] Current Thread [Next in Thread]