From: David Bateman
Subject: Re: More efficient MEX or MEX-like interface
Date: Tue, 30 Sep 2008 15:08:49 +0200
User-agent: Thunderbird 2.0.0.17 (X11/20080926)
Fredrik Lingvall wrote:
> David Bateman wrote:
>> Fredrik Lingvall wrote:
>>> Does this mean that BLAS/LAPACK expects complex data stored in the
>>> same way as complex data is currently stored in Octave?
>>
>> Yes, Octave calls LAPACK/BLAS functions like [cz]GEMM for complex
>> arguments. Grepping the matlab libraries with
>>
>>   nm -C --dynamic libmwnumerics.so | grep [gG][eE][mM][mM]
>>
>> gives
>>
>>   U dgemm_
>>   U sgemm_
>>   U zgemm_
>>
>> So it appears that the double precision complex matrix multiply is
>> linked in but the single precision version isn't. I'd think this means
>> that ZGEMM is in fact used by the SuiteSparse code (which doesn't have
>> single precision), and that matlab internally uses four calls to DGEMM
>> on the real and imaginary parts of the matrix to simulate ZGEMM, since
>> that way it never has to explicitly form the complex matrix. Given the
>> speed differences (or lack thereof) for such operations between Octave
>> and Matlab, I believe that pretty much confirms what they do.
>
> How did you test this? I did a quick test (Pentium-M laptop):
>
> 1) Octave 3.0.2
>
> octave:1> n=2000; B = randn(n,n) + i*randn(n,n);
> octave:2> n=2000; A = randn(n,n) + i*randn(n,n);
> octave:3> tic, C=A*B; toc
> Elapsed time is 37.4009 seconds.
> octave:4> tic, C=A*B; toc
> Elapsed time is 37.3448 seconds.
>
> 2) Matlab 7.3
>
> >> n=2000; A = randn(n,n) + i*randn(n,n);
> >> n=2000; B = randn(n,n) + i*randn(n,n);
> >> tic, C=A*B; toc
> Elapsed time is 36.725418 seconds.
> >> tic, C=A*B; toc
> Elapsed time is 36.786914 seconds.
>
> Both Matlab and Octave use Goto BLAS v1.19. It looks like there is no
> speed penalty for storing the real and imaginary parts separately (at
> least not for the size of problem I tested).
>
> /F

This is exactly my point. If matlab had to form a C99 complex matrix and then call ZGEMM, rather than use four calls to DGEMM and do the additions, it would be slower, as creating a C99 complex array and copying the data into it comes at a cost. In fact xGEMM computes the operation

  C = alpha * A * B + beta * C

and so you might, for example, call DGEMM once passing the two imaginary parts as A and B with beta of zero, then have a second call with that result as C, beta of -1, and A and B being the two real parts, thus giving you the real part of the complex matrix multiply with two calls to DGEMM on the real and imaginary parts of the matrices. The same goes for the calculation of the imaginary part of the matrix multiply. The underlying code of ZGEMM has to do something similar in any case; it just does the four multiplications element by element instead, so it is no surprise that it runs at much the same speed as what matlab does.
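The accumulation scheme above can be sketched in NumPy (a hypothetical illustration of the technique, not matlab's actual code): a small `gemm` helper stands in for a BLAS xGEMM call, and each of the real and imaginary parts of the product is built with two such calls on the split real/imaginary storage.

```python
import numpy as np

def gemm(alpha, A, B, beta, C):
    """Emulate BLAS xGEMM: return alpha*A@B + beta*C (C may be None when beta == 0)."""
    out = alpha * (A @ B)
    if beta != 0.0:
        out += beta * C
    return out

rng = np.random.default_rng(0)
n = 4
Ar, Ai = rng.standard_normal((n, n)), rng.standard_normal((n, n))
Br, Bi = rng.standard_normal((n, n)), rng.standard_normal((n, n))

# Real part, Cr = Ar*Br - Ai*Bi: first multiply the imaginary parts
# (beta = 0), then a second call with the real parts and beta = -1,
# which negates the stored intermediate while accumulating.
Cr = gemm(1.0, Ai, Bi, 0.0, None)
Cr = gemm(1.0, Ar, Br, -1.0, Cr)

# Imaginary part, Ci = Ar*Bi + Ai*Br: two more calls, accumulating with beta = 1.
Ci = gemm(1.0, Ar, Bi, 0.0, None)
Ci = gemm(1.0, Ai, Br, 1.0, Ci)

# Cross-check against forming the interleaved complex matrices
# and multiplying them directly (what a ZGEMM call would do).
C = (Ar + 1j * Ai) @ (Br + 1j * Bi)
assert np.allclose(Cr, C.real) and np.allclose(Ci, C.imag)
```

Four real matrix multiplies of size n give the same flop count as one complex multiply, so the only saving is in never allocating and filling the interleaved complex arrays.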
The absence of cGEMM from the symbols of the numerics library is a pretty good indication that the above is exactly what the Mathworks does, as I see no reason they would handle matrix multiplies differently between double and single precision values.
D.

--
David Bateman                               address@hidden
Motorola Labs - Paris                       +33 1 69 35 48 04 (Ph)
Parc Les Algorithmes, Commune de St Aubin   +33 6 72 01 06 33 (Mob)
91193 Gif-Sur-Yvette FRANCE                 +33 1 69 35 77 01 (Fax)