Re: isreal benchmarking
From: John W. Eaton
Subject: Re: isreal benchmarking
Date: Tue, 11 Sep 2012 17:26:30 -0400
On 11-Sep-2012, Rik wrote:
| Maybe processor architecture makes a difference? I don't know what sort of
| awesome 8 cores you have and maybe there are CPU pinning effects going on.
| I only have two cores, but I turned off the second core with
|
| echo 0 >> /sys/devices/system/cpu/cpu1/online
| cat /proc/cpuinfo # just to verify that the cpu is no longer available to
| the kernel
|
| Still no real change in the behavior.
|
| I also tried compiling with just '-O2' and the difference persists. I
| compiled again with '-g -O0' and the difference is still there. The
| '-msse' option helped reduce the imag(x) runtimes for me. Without it the
| difference I am seeing is close to ~40% rather than just 25%.
Some of this discussion was off the list but I'm bringing it back
because I have a few comments that might be of general interest and
I'd like to know whether people think we should attempt compatibility
for the two cases I describe at the end of this message.
Using -g should not slow things down. I just asked if you are using
-O2 or -g because I usually compile the dev sources without
optimization to make debugging easier, and without optimization
performance is noticeably worse.
Also, I see significantly different results when compiling without
-O2:
octave-cli:1> x = complex (1e4, 1e4);
octave-cli:2> x = complex (zeros (1e4), zeros (1e4));
octave-cli:3> t = cputime (); isreal (x); cputime () - t
ans = 0
octave-cli:4> t = cputime (); isreal ([x]); cputime () - t
ans = 1.8841
octave-cli:5> t = cputime (); isreal (x(:)); cputime () - t
ans = 1.9081
octave-cli:6> t = cputime (); all (all (imag (x) == 0)); cputime () - t
ans = 2.4922
compared to with -O2 (a different version of Octave: the above was dev,
this is 3.6.3, but things in this regard should not have changed much):
octave:1> x = complex (zeros (1e4), zeros (1e4));
octave:2> t = cputime (); isreal (x); cputime () - t
ans = 0.0040000
octave:3> t = cputime (); isreal ([x]); cputime () - t
ans = 1.0801
octave:4> t = cputime (); isreal (x(:)); cputime () - t
ans = 1.0681
octave:5> t = cputime (); all (all (imag (x) == 0)); cputime () - t
ans = 1.0881
So one thing to notice is that compiler optimization is critical if
you want to obtain reasonably good performance for Octave.
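For anyone who wants to repeat the comparison on their own build, something along these lines should do. It is only a sketch: the size n and the variable names (checks, labels, elapsed) are my own choices for illustration, and n is smaller than the 1e4 used in the sessions above to keep memory use down.

```octave
% Sketch: time the four checks from the sessions above on one matrix.
% n, checks, labels and elapsed are illustrative names, not from the
% original sessions (which used zeros (1e4)).
n = 1000;
x = complex (zeros (n), zeros (n));   % complex storage, zero imaginary part

% Define the handles one by one so that whitespace inside a cell
% literal cannot split an expression into two elements.
f1 = @(z) isreal (z);
f2 = @(z) isreal ([z]);
f3 = @(z) isreal (z(:));
f4 = @(z) all (all (imag (z) == 0));
checks = {f1, f2, f3, f4};
labels = {"isreal (x)", "isreal ([x])", "isreal (x(:))", "all (all (imag (x) == 0))"};

elapsed = zeros (1, numel (checks));
for k = 1:numel (checks)
  f = checks{k};
  t = cputime ();
  f (x);                              % result discarded, only the time matters
  elapsed(k) = cputime () - t;
  printf ("%-28s %g s\n", labels{k}, elapsed(k));
end
```

Running the same script on an -O2 build and an unoptimized build gives the two sets of numbers to compare. cputime has coarse resolution, so very short runs may simply report 0.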
I doubt that multiple CPUs have anything to do with this. Octave is
not multithreaded and I don't think these operations rely on the
BLAS where threading might be enabled. These are just loops in the
interpreter/array classes.
I also didn't realize that we narrowed from complex to real when
indexing with (:). I'm not sure that is the correct behavior. It
looks like Matlab does not narrow to real in this case, or for [x].
So should we make Octave behave the same way in these cases? It looks
like it does narrow in other cases. For example, x+2 or x*2 or other
arithmetic operations will result in real values, not complex values
with zero imaginary part. Even x+x narrows, so it is a general
property of the result of an arithmetic operation. It doesn't appear
that there is a special check for adding two complex values that
have all-zero imaginary parts, or some other trickery like that.
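A quick way to see which of these operations narrow on a given build is to ask iscomplex about each result. This is just a sketch; the variable names are made up for illustration, and I am deliberately not listing expected output since the answers depend on the program and version being tested.

```octave
% Sketch: report whether each expression still holds a complex value.
% Results depend on the Octave (or Matlab) version, so none are assumed.
x = complex (zeros (3), zeros (3));   % stored as complex, imaginary part all zero

y_index  = x(:);     % indexing with (:)
y_concat = [x];      % concatenation
y_add    = x + 2;    % arithmetic with a real scalar
y_mul    = x * 2;
y_self   = x + x;    % arithmetic on two complex operands

printf ("x      iscomplex: %d\n", iscomplex (x));
printf ("x(:)   iscomplex: %d\n", iscomplex (y_index));
printf ("[x]    iscomplex: %d\n", iscomplex (y_concat));
printf ("x + 2  iscomplex: %d\n", iscomplex (y_add));
printf ("x * 2  iscomplex: %d\n", iscomplex (y_mul));
printf ("x + x  iscomplex: %d\n", iscomplex (y_self));
```

A 0 on any line after the first means the result was narrowed to real for that operation.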
jwe