octave-maintainers

Re: FYI: optimizing certain matrix arithmetic


From: Michael Creel
Subject: Re: FYI: optimizing certain matrix arithmetic
Date: Tue, 29 Sep 2009 13:34:39 +0200

On Tue, Sep 29, 2009 at 12:47 PM, Jaroslav Hajek <address@hidden> wrote:
> On Tue, Sep 29, 2009 at 12:26 PM, Michael Creel <address@hidden> wrote:
>> On Tue, Sep 29, 2009 at 10:44 AM, Jaroslav Hajek <address@hidden> wrote:
>>> On Tue, Sep 29, 2009 at 10:38 AM, Michael Creel <address@hidden> wrote:
>>>> Hi all,
>>>>
>>>> On an Apple Macbook Pro running Ubuntu Jaunty amd64, using the benchmark
>>>>
>>>> %%%%%%%%%%%%%%%%%%%%%%
>>>> n = 500;
>>>> R = triu (rand (n));
>>>> u = rand (n, 1);
>>>>
>>>> tic; for i = 1:1000; R \ u; endfor; toc
>>>> tic; for i = 1:1000; u' / R; endfor; toc
>>>> tic; for i = 1:1000; R' \ u; endfor; toc
>>>>
>>>> R = tril (rand (n));
>>>> u = rand (n, 1);
>>>>
>>>> tic; for i = 1:1000; R \ u; endfor; toc
>>>> tic; for i = 1:1000; u' / R; endfor; toc
>>>> tic; for i = 1:1000; R' \ u; endfor; toc
>>>>
>>>> u = u + I*rand (n, 1);
>>>> tic; for i = 1:1000; R \ u; endfor; toc
>>>> tic; for i = 1:1000; R' \ u; endfor; toc
>>>>
>>>>
>>>> n = 800;
>>>> a = rand (n);
>>>> b = rand (n) + i*rand (n);
>>>> tic; a * b; toc
>>>> tic; b * a; toc
>>>> tic; a' * b; toc
>>>> tic; b * a'; toc
>>>> tic; a \ b; toc
>>>> tic; b / a; toc
>>>> %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
>>>>
>>>> With Octave 3.0.1, which comes with Ubuntu Jaunty amd64, I get
>>>>
>>>>
>>>> octave:4> bench
>>>> Elapsed time is 0.20216 seconds.
>>>> Elapsed time is 1.93894 seconds.
>>>> Elapsed time is 2.33824 seconds.
>>>> Elapsed time is 0.188448 seconds.
>>>> Elapsed time is 1.95657 seconds.
>>>> Elapsed time is 2.43552 seconds.
>>>> Elapsed time is 4.08299 seconds.
>>>> Elapsed time is 7.84752 seconds.
>>>> Elapsed time is 0.213021 seconds.
>>>> Elapsed time is 0.21117 seconds.
>>>> Elapsed time is 0.218387 seconds.
>>>> Elapsed time is 0.217174 seconds.
>>>> Elapsed time is 0.452714 seconds.
>>>> Elapsed time is 0.391383 seconds.
>>>> octave:5>
>>>>
>>>>
>>>>
>>>> Matlab 2008b gives
>>>> >> bench
>>>> Elapsed time is 0.289161 seconds.
>>>> Elapsed time is 0.566446 seconds.
>>>> Elapsed time is 0.562623 seconds.
>>>> Elapsed time is 0.253456 seconds.
>>>> Elapsed time is 0.574304 seconds.
>>>> Elapsed time is 0.570281 seconds.
>>>> Elapsed time is 0.253070 seconds.
>>>> Elapsed time is 0.572601 seconds.
>>>> Elapsed time is 0.102086 seconds.
>>>> Elapsed time is 0.102677 seconds.
>>>> Elapsed time is 0.103080 seconds.
>>>> Elapsed time is 0.103759 seconds.
>>>> Elapsed time is 0.165608 seconds.
>>>> Elapsed time is 0.181704 seconds.
>>>> >>
>>>>
>>>>
>>>> Octave 3.2.3+ from today, self compiled, gives
>>>> octave:1> bench
>>>> Elapsed time is 0.208794 seconds.
>>>> Elapsed time is 0.189178 seconds.
>>>> Elapsed time is 0.186724 seconds.
>>>> Elapsed time is 0.188649 seconds.
>>>> Elapsed time is 0.192915 seconds.
>>>> Elapsed time is 0.19166 seconds.
>>>> Elapsed time is 0.186277 seconds.
>>>> Elapsed time is 0.19102 seconds.
>>>> Elapsed time is 0.212707 seconds.
>>>> Elapsed time is 0.211013 seconds.
>>>> Elapsed time is 0.210491 seconds.
>>>> Elapsed time is 0.210447 seconds.
>>>> Elapsed time is 0.431791 seconds.
>>>> Elapsed time is 0.367412 seconds.
>>>> octave:2>
>>>>
>>>> Congratulations!
>>>> Michael
>>>>
>>>
>>> It's interesting you didn't get any speed-up in the second part of the
>>> benchmark, compared to 3.0.1...
>>> What BLAS and LAPACK are you using? What's your compiler configuration?
>>> Also, what exactly is your tip? The "3.2.3+" is a bit unclear, did you
>>> mean "3.3.50+", i.e. the development version?
>>>
>>> thanks
>>>
>>> --
>>> RNDr. Jaroslav Hajek
>>> computing expert & GNU Octave developer
>>> Aeronautical Research and Test Institute (VZLU)
>>> Prague, Czech Republic
>>> url: www.highegg.matfyz.cz
>>>
>>
>> Oops, sorry, it's 3.3.50+, updated this morning.
>>
>> I build using
>> make -j2 CFLAGS="-O3 -march=native -funroll-loops" \
>>   FFLAGS="-O3 -march=native -funroll-loops" \
>>   XTRA_CFLAGS="-O3 -march=native -funroll-loops" \
>>   XTRA_CXXFLAGS="-O3 -march=native -funroll-loops"
>>
>
> In general, if you're using a newer gcc on a 64-bit architecture, I
> advise against -funroll-loops. For me, it usually gains about 1% in
> speed for some operations, at the cost of increasing the binaries'
> size by more than 50%. That seems like a bad tradeoff.
>
>> ./configure reports
>>  BLAS libraries:       -llapack -lcblas -lf77blas -latlas
>>
>> so I assume that Octave is using Atlas (the atlas dev package that
>> comes with Kubuntu Jaunty amd64).
>>
>> Michael
>>
>
> Apparently, yes. Hmm. It's really weird you got almost exactly the
> same figures.
> If you apply the attached patch, rebuild and re-run the benchmark,
> what do you get?
>
>

With that patch applied, I get
octave:1> bench
Elapsed time is 0.194493 seconds.
Elapsed time is 0.192309 seconds.
Elapsed time is 0.189026 seconds.
Elapsed time is 0.188679 seconds.
Elapsed time is 0.195958 seconds.
Elapsed time is 0.193521 seconds.
Elapsed time is 0.187596 seconds.
Elapsed time is 0.193254 seconds.
Elapsed time is 0.215135 seconds.
Elapsed time is 0.213705 seconds.
Elapsed time is 0.21341 seconds.
Elapsed time is 0.212501 seconds.
Elapsed time is 0.363992 seconds.
Elapsed time is 0.368094 seconds.

So there is an improvement in the second-to-last number (a \ b): about
0.43 s before the patch, about 0.36 s with it.
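
Each of the last six timings comes from a single tic/toc pair, so a
difference of this size could partly be run-to-run noise. Below is a
minimal sketch of how the a \ b case could be timed several times,
reporting the minimum and median. The repeat count reps and the names
t, k, id and x are my own choices and are not part of the original
benchmark; 1i is written instead of i so the result does not depend on
any earlier variable called i.

```octave
n = 800;
a = rand (n);
b = rand (n) + 1i*rand (n);   % 1i is always the imaginary unit

reps = 5;                     % assumed repeat count, adjust to taste
t = zeros (reps, 1);
for k = 1:reps
  id = tic;                   % start a timer for this run only
  x = a \ b;                  % the operation being measured
  t(k) = toc (id);            % elapsed seconds for this run
endfor

printf ("a \\ b: min %g s, median %g s over %d runs\n", ...
        min (t), median (t), reps);
```

Comparing the minimum (or median) before and after the patch should be
less sensitive to whatever else the machine happens to be doing.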

Cheers, M.
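
P.S. One thing that may be worth checking in the benchmark: if it is
run as a single script, the name i is used as the loop counter before
the line that builds b, so i would hold 1000 at that point and b would
come out real rather than complex. I have not verified that this is
what happened in the runs above. The small sketch below only shows the
effect; the size n = 4 and the names b_shadowed and b_complex are made
up for illustration.

```octave
for i = 1:1000; endfor                 % same loop-counter name as in the benchmark

n = 4;                                 % small size, for illustration only
b_shadowed = rand (n) + i*rand (n);    % i is now the number 1000
b_complex  = rand (n) + 1i*rand (n);   % 1i is always sqrt (-1)

disp (isreal (b_shadowed))             % prints 1: the matrix is real
disp (iscomplex (b_complex))           % prints 1: the matrix is complex
```

If that applies here, writing 1i*rand (n) in the benchmark would make b
complex regardless of any earlier loops.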


