[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: benchmarks - sort
From: |
David Bateman |
Subject: |
Re: benchmarks - sort |
Date: |
Thu, 22 Jan 2004 11:43:47 +0100 |
User-agent: |
Mutt/1.4.1i |
According to Paul Kienzle <address@hidden> (on 01/22/04):
>
> For the indexed case, maybe you could create a vector
> of pointers to doubles. That leads to an extra dereference
> in the code which will surely slow things down a bit, but it
> keeps the size of data you are shifting down to a word.
> You can recover the index by pointer subtraction.
Not sure I see what you mean. The code as it stands treats "*a = *b",
where at the moment "a" is "double *a" for example. For the indexed
case are you suggesting I have a pointer to a pointer to a structure
containing the value and the index? So that only the pointers are
moved around. For example
OCTAVE_LOCAL_BUFFER (*vec_index<unsigned EIGHT_BYTE_INT>,
vi, elements);
/* Flip the data in the vector so that int compares on
* IEE754 give the correct ordering
*/
for (int i = 0; i < elements; i++)
{
vi[i]->vec = FloatFlip (p[i]);
vi[i]->indx = i + 1;
}
Yeah this might be faster. I should try it and see. I've started but getting
seg-faults....
> sortrows would be faster if it were implemented directly
> rather than stable sorts on each column from right to left.
As I've implemented the sort as a generic class, it should be quite
easy it instantiate it for strings in sortrows and sort them directly
as you suggest.
> >The basic result is that the radix sort compares well against Matlab
> >for randomly distributed values (1.10 times the speed of Matlab), but
> >as this sort takes no advantage of partial ordering, it breaks down in
> >these cases. The mergesort with double values is faster than Matlab
> >for partailly ordered lists, but about 2.5 to 3 times slower than
> >Matlab for random lists. Interestingly, although merge_double is
> >significantly slower with indexing, it maintains the same relationship
> >with Matlab.
>
> Is it possible that matlab is using quicksort when
> you aren't indexing? Does sort need to be stable if you
> are not preserving the index? If not, then choose
> the fastest sort that works reasonably well on mostly
> ordered data when you aren't keeping the index. IIRC,
> basic quicksort doesn't perform very well on mostly
> ordered data.
This would explain the speed differences with Matlab I observed. Do we
care enough to gain another 20% to 40% in speed at the cost of lots of
code duplciation?
> Did you happen to try on both big and little endian machines?
> Don't you have to reverse the order of the integer comparisons
> in the two cases?
Isn't IEEE754 independent of big and little endian issues? An integer
compare on the bytes of a big endian number gives the same results as
on a little endian machine. IE.
short a = 1; 0100 on little endian and 0001 on big endian
short b = 2; 0200 on little endian and 0002 on big endian
if (b > a)
printf("This test is true on both big and little endian\n");
So it isn't necessary to check big/little endian issues. It is however,
necessary to check at configure whether the machine respects the IEEE754
or IEEE854 formats. How to do this?
> Thanks for all your efforts. Sort is used pretty often in some
> unexpected ways to avoid loops in the script, so it's nice that
> it will be fast.
It would be nice to have it in octave-forge this weekend though :-)....
D.
--
David Bateman address@hidden
Motorola CRM +33 1 69 35 48 04 (Ph)
Parc Les Algorithmes, Commune de St Aubin +33 1 69 35 77 01 (Fax)
91193 Gif-Sur-Yvette FRANCE
The information contained in this communication has been classified as:
[x] General Business Information
[ ] Motorola Internal Use Only
[ ] Motorola Confidential Proprietary
RE: benchmarks - sort, THOMAS Paul Richard, 2004/01/22
Re: benchmarks - sort, Paul Thomas, 2004/01/22