compound operators
From: Jaroslav Hajek
Subject: compound operators
Date: Wed, 7 May 2008 16:53:44 +0200
hello,
I've made a first attempt to elaborate on the idea John W. Eaton gave on
this list a while ago: having Octave's parser recognize expressions
like expr1' * expr2 as special, to allow more efficient mapping of
operations onto BLAS routines.
The initial commit is in my public repo:
https://tw-math.de/highegg
on my laptop (x86/Linux/gcc-4.2.1) this compiles and passes "make check"
without a problem. Unfortunately, our server with Intel C++ is down,
so I can't test anywhere else.
In this initial commit, the following expressions are recognized as special:
A.'*B (trans_mul)
A*B.' (mul_trans)
A'*B (herm_mul)
A*B' (mul_herm)
these are contracted into compound operators provided by octave_value,
which are mapped directly to xGEMM/xGEMV/xDOTx BLAS calls if possible
(double-double or complex-complex), otherwise "decomposed" to the
original evaluation sequence.
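
As a rough sanity check of the four forms above, the following sketch compares each compound expression with its "decomposed" sequence (explicit transpose stored in a variable, then a plain multiply). The variable names and sizes are only illustrative assumptions, and the script cannot tell whether a given Octave build actually contracts the expressions; it only checks that both paths agree to within rounding.

```octave
% Complex operands, so that .' (transpose) and ' (Hermitian transpose) differ.
A = rand(4,3) + 1i*rand(4,3);
B = rand(4,2) + 1i*rand(4,2);
C = rand(5,3) + 1i*rand(5,3);

% Decomposed sequences: form the transposed operand explicitly first.
At = A.';  Ah = A';
Ct = C.';  Ch = C';

% Absolute tolerance: the two paths may sum the products in a different order.
tol = 1e-12;
assert(A.'*B, At*B, tol);   % trans_mul
assert(A*C.', A*Ct, tol);   % mul_trans
assert(A'*B,  Ah*B, tol);   % herm_mul
assert(A*C',  A*Ch, tol);   % mul_herm
```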
for certain matrix sizes, significant speedups can be achieved, as can
be demonstrated with this simple benchmark:
n = 50; m = 505000; a = rand(m,n); b = rand(m,n); tic; c = a'*b; toc ;
clear a b c
which can be done directly via DGEMM('T','N',...) instead of
transposing, then doing DGEMM('N','N',...)
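
A hedged way to reproduce the comparison on a single build is to time the compound expression against a version that stores the transpose in a variable first, which forces the two-step evaluation. The sizes below are an assumption chosen to be smaller than in the benchmark above so the script fits in modest memory; timings will vary with the machine and the BLAS in use, so none are quoted here.

```octave
m = 100000; n = 50;              % assumed sizes, smaller than the benchmark above
a = rand(m,n); b = rand(m,n);

% Compound form: a candidate for a single DGEMM('T','N',...) call.
tic; c1 = a'*b; t_compound = toc;

% Explicit transpose stored in a variable, then DGEMM('N','N',...).
tic; at = a'; c2 = at*b; t_explicit = toc;

printf("compound: %g s, explicit transpose: %g s\n", t_compound, t_explicit);

% Negative tolerance means relative tolerance in Octave's assert.
assert(c1, c2, -1e-10);
clear a b at
```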
on my laptop (Octave linked with GotoBLAS, no multithreading), with
current Octave I get
Elapsed time is 4.19386 seconds.
and with the new changeset employed
Elapsed time is 1.79197 seconds.
which is roughly a 2.3x speedup (4.19386 / 1.79197 is about 2.34; of
course, it took me a few tries to find dims giving such impressive results :)
obviously, one also saves memory by not forming the explicit transpose.
I also want to map A.'*A, A*A', etc. to xSYRK and xHERK (this is going
to be simple), and possibly more.
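
The reason xSYRK can help is that A.'*A is symmetric, so only one triangle has to be computed. The loop below is just an illustration of that idea in plain Octave under assumed sizes; it is not how the BLAS routine is implemented and would be far slower than a real xSYRK call.

```octave
A = rand(6,4);                   % assumed small real matrix
n = size(A, 2);
S = zeros(n);

% Compute only the upper triangle (including the diagonal) of A.'*A.
for j = 1:n
  for i = 1:j
    S(i,j) = A(:,i).' * A(:,j);
  end
end

% Mirror the strict upper triangle into the lower one.
S = S + triu(S, 1).';

assert(S, A.'*A, 1e-12);         % agrees with the full product up to rounding
```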
regards,
--
RNDr. Jaroslav Hajek
computing expert
Aeronautical Research and Test Institute (VZLU)
Prague, Czech Republic
url: www.highegg.matfyz.cz