[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Backslash inversion vs inv-function
From: |
John W. Eaton |
Subject: |
Backslash inversion vs inv-function |
Date: |
Sat, 19 Jan 2008 12:11:54 -0500 |
On 18-Jan-2008, Rolf Fabian wrote:
| Because of its underlying algorithm, I expected that
| inversion of a (nonsingular) square matrix should
| be faster using backslash operator 'y2 = x\eye(n)'
| instead of Octave's builtin 'y1 = inv(x)' function.
|
| Thus I conducted dimension n - depending tests
| using random matrices input in order to sched
| some light on this issue.
|
| octave-3.0.0.exe:48> n=5; x=randn(n); tic; y1 = inv(x); toc,
| tic; y2 = x\eye(n); toc,
| max(max(abs(y1-y2)))
|
| Elapsed time is sec.
| n inv(x) x\eye(n) Speed Gain Fac max(max(abs(result
| difference)))
| 5 0.05531 0.000258 214 1.1102e-016
| 10 0.06063 0.000313 194 4.4409e-016
| 25 0.05705 0.000705 81 6.9944e-015
| 50 0.05889 0.002197 27 1.2212e-015
| 100 0.07195 0.01153 6.2 1.8952e-015
| 150 0.08721 0.03425 2.6 2.2649e-014
| 200 0.1271 0.06117 2.1 3.9413e-015
| 300 0.2892 0.1991 1.5 8.5820e-014
| 400 0.6255 0.609 1.03 8.9484e-014
| 500 1.176 1.263 0.93 2.2163e-014
| 600 2.05 2.197 0.93 7.0499e-015
| 700 3.228 3.458 0.93 2.4092e-014
| 800 4.855 5.137 0.95 2.2243e-014
| 900 6.873 7.272 0.95 3.0781e-014
| 1000 9.506 9.988 0.95 2.8283e-014
|
| I've done this on a relative old laptop
| (Windows XP, SP2, Intel III Mobile CPU 933 MHz 512 MB RAM)
| so do not wonder about absolute performance times.
|
| >From the the table above it can be seen that the results confirm my
| expectations.
| Up to about 400x400 sized matrices the speed gain (column 4) can be
| enormous.
| Consequently replacing 'inv(x)' in existing code by 'x\eye(n)' might be an
| option and should be considered.
Are you using ATLAS or some other tuned BLAS? If so, is it
appropriate for your hardware?
For timing purposes, you should use cputime, not tic/toc, since tic/toc
measures wall-clock time and can be affected by other processes on
your system. Also, it's probably best to run timing tests a few times
and do some averaging.
In any case, for the following script:
puts (" n inv(x) x\\I err\n");
ntest = 5;
for n = [100,150,200,300,400,500,600,700,800,900,1000,1500,2000]
x = randn (n);
I = eye (n);
tinv = tldiv = 0;
for i = 1:ntest
t = cputime ();
y1 = inv (x);
tinv += cputime () - t;
t = cputime ();
y2 = x\I;
tldiv += cputime () - t;
endfor
tinv /= ntest;
tldiv /= ntest;
err = max (abs (y1 - y2)(:));
printf ("%4d %9.5f %9.5f %e\n", n, tinv, tldiv, err);
endfor
here are the results I see:
n inv(x) x\I err
100 0.00400 0.00480 1.110223e-14
150 0.01040 0.01040 4.378442e-15
200 0.01680 0.02160 1.043610e-14
300 0.04480 0.05600 8.604228e-15
400 0.09361 0.11041 8.160139e-14
500 0.16081 0.18881 4.285461e-14
600 0.25522 0.30162 2.622902e-14
700 0.38242 0.45283 1.847134e-14
800 0.53203 0.64564 2.242651e-13
900 0.72405 0.87205 6.783463e-14
1000 0.96006 1.13767 5.022718e-14
1500 2.80978 3.38981 2.424727e-13
2000 6.29719 7.45967 6.985107e-13
Note that I've also taken the calculation of eye(n) out of the loop
and x\I is still generally slower than inv(x) on my system (AMD64,
and Debian's atlas3-base package, version 3.6.0-20.6).
I guess I don't see a problem here.
jwe