Re: slow LU decomposiotion? for octave 4 for windows
From: Tatsuro MATSUOKA
Subject: Re: slow LU decomposiotion? for octave 4 for windows
Date: Thu, 2 Jun 2016 14:26:55 +0900 (JST)
----- Original Message -----
> From: Tatsuro MATSUOKA
> To: tmacchant siko1056 "octave-maintainers
> Cc:
> Date: 2016/6/1, Wed 17:40
> Subject: Re: slow LU decomposiotion? for octave 4 for windows
>
> ----- Original Message -----
>
>> From: Tatsuro MATSUOKA
>> To: siko1056 "octave-maintainers
>> Cc:
>> Date: 2016/6/1, Wed 07:50
>> Subject: Re: slow LU decomposiotion? for octave 4 for windows
>>
>> ----- Original Message -----
>>
>>> From: siko1056
>>> To: octave-maintainers
>>> Cc:
>>> Date: 2016/5/31, Tue 21:07
>>> Subject: Re: slow LU decomposiotion? for octave 4 for windows
>>>
>>> Dear Tatsuro MATSUOKA,
>>>
>>> Do you also see this performance issue with the vectorized test script in
>>> all your versions? I see similarly good results on my PC without for-loops,
>>> on both Octave 4.0.1 and dev.
>>>
>>> ------------------
>>> more off
>>>
>>> Num=1000;
>>> rand('seed',1);
>>> A=rand(Num)-0.5;
>>> rand('seed',2);
>>> B=rand(Num,ItNum)-0.5;
>>> [L U P]=lu(A);
>>> %
>>> disp('Simple left division');
>>> tic;
>>> x=A\B; % vectorized
>>> toc;
>>> x1=x(:,end);
>>> %
>>> disp('LU decomposition');
>>> tic;
>>> c=P*B; y=L\c; x=U\y; % vectorized
>>> toc;
>>> x2=x(:,end);
>>> id=1:Num;
>>> plot(id,x1, 'o1',id,x2, '+2');
>>> ------------------
>>>
>>> This would make me struggle. For-loops are slow (Rik gave an excellent talk
>>> about this at OctConf 2015, http://wiki.octave.org/OctConf_2015), and
>>> maybe older versions of Octave handled loops in another way?!
>>>
>>> Regards,
>>> Kai
>>>
>> Kai
>>
>> Thank you for the vectorized test script.
>>
>> One correction
>>
>> Num=1000;
>> |
>> V
>> Num=1000; ItNum=10;
>>
>>
>>
>> ************************************
>> Octave-3.2.4 mingw
>>>> lutest_v
>>
>> Simple left division
>> Elapsed time is 0.114076 seconds.
>> LU decomposition
>> Elapsed time is 0.00600397 seconds.
>> ************************************
>> Octave-3.6.4 msvc
>>
>>
>>>> lutest_v
>> Simple left division
>> Elapsed time is 0.11 seconds.
>> LU decomposition
>> Elapsed time is 0.00399995 seconds.
>> ************************************
>> Octave-3.6.4 mingw
>>
>>
>>>> lutest_v
>> Simple left division
>> Elapsed time is 0.0790549 seconds.
>> LU decomposition
>> Elapsed time is 0.00300312 seconds.
>> ************************************
>> Octave-3.8.2 mingw
>>
>>
>>>> lutest_v
>> Simple left division
>> Elapsed time is 0.096066 seconds.
>> LU decomposition
>> Elapsed time is 0.00300384 seconds.
>> ************************************
>> Octave-4.0.0 mingw (32 bit)
>>
>>
>>>> lutest_v
>> Simple left division
>> Elapsed time is 0.07305 seconds.
>> LU decomposition
>> Elapsed time is 0.01701 seconds.
>> ************************************
>> Octave-4.0.2 mingw (32 bit)
>>
>> Simple left division
>> Elapsed time is 0.060039 seconds.
>> LU decomposition
>> Elapsed time is 0.0180109 seconds.
>> ************************************
>> Octave-4.0.2 mingw (64 bit)
>>
>>>> lutest
>>
>> Simple left division
>> Elapsed time is 0.0500329 seconds.
>> LU decomposition
>> Elapsed time is 0.00900698 seconds.
>> ************************************
>> octave-4.1.0+ mingw (64bit)
>> (hg clone on May 28, 2016)
>>
>>>> lutest
>>
>> Simple left division
>> Elapsed time is 0.213143 seconds.
>> LU decomposition
>> Elapsed time is 0.0230169 seconds.
>> ************************************
>>
>>
>> As you said, there is an issue with slow loops.
>> However, even when vectorized, LU decomposition on version 4 on Windows is
>> slower than on version 3.
>> Surprisingly, the results on 4.1.0+ are the worst. This is a bad situation
>> and should be improved.
>>
>> Would it be better to file this as a bug?
>>
>> Tatsuro
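
To put numbers on "slower than version 3", the reported Windows timings can be turned into slowdown factors. The sketch below only does arithmetic on the figures listed above; the choice of 3.8.2 mingw as the baseline and all variable names are assumptions made for illustration, not part of the original test.

```octave
% Timings in seconds, copied from the Windows results reported above.
labels = {'3.2.4 mingw', '3.6.4 msvc', '3.6.4 mingw', '3.8.2 mingw', ...
          '4.0.0 mingw 32', '4.0.2 mingw 32', '4.0.2 mingw 64', ...
          '4.1.0+ mingw 64'};
t_div = [0.114076 0.11 0.0790549 0.096066 0.07305 0.060039 0.0500329 0.213143];
t_lu  = [0.00600397 0.00399995 0.00300312 0.00300384 0.01701 0.0180109 ...
         0.00900698 0.0230169];

ref = 4;                        % baseline: 3.8.2 mingw (arbitrary assumption)
ratio_div = t_div / t_div(ref); % > 1 means slower than the baseline
ratio_lu  = t_lu  / t_lu(ref);

for k = 1:numel(labels)
  printf('%-16s left division x%5.2f   LU x%5.2f\n', ...
         labels{k}, ratio_div(k), ratio_lu(k));
end
```

Each figure is a single tic/toc measurement of a few milliseconds, so the factors should be read as rough indications only.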
>
>
> Before filing this in the bug tracker, here are test results on Ubuntu 14.04
> 64 bit (Athlon X2, not a very fast machine).
>
>
> % lutest_v.m
> more off
>
> Num=1000; ItNum=10;
> rand('seed',1);
> A=rand(Num)-0.5;
> rand('seed',2);
> B=rand(Num,ItNum)-0.5;
> [L U P]=lu(A);
> %
> disp('Simple left division');
> tic;
> x=A\B; % vectorized
> toc;
> x1=x(:,end);
> %
> disp('LU decomposition');
> tic;
> c=P*B; y=L\c; x=U\y; % vectorized
> toc;
> x2=x(:,end);
> id=1:Num;
> plot(id,x1, 'o1',id,x2, '+2');
>
> % end of lutest_v.m
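
Note that in lutest_v.m the call to lu(A) comes before tic, so the "LU decomposition" figure covers only P*B and the two triangular solves, while the "Simple left division" figure includes the factorization. Because each measurement lasts only a few milliseconds, a repeated variant may give steadier numbers. The following is a minimal sketch under that assumption; the repeat count nrep and the names t_div, t_lu, and relerr are hypothetical and not part of the original script. It also compares the two solutions numerically instead of by plotting.

```octave
more off
Num = 1000; ItNum = 10;
rand('seed', 1); A = rand(Num) - 0.5;
rand('seed', 2); B = rand(Num, ItNum) - 0.5;
[L, U, P] = lu(A);              % factorization stays outside the timed region

nrep  = 10;                     % hypothetical repeat count
t_div = zeros(nrep, 1);
t_lu  = zeros(nrep, 1);
for r = 1:nrep
  tic; x_div = A\B;          t_div(r) = toc;   % factorization + solve
  tic; x_lu  = U\(L\(P*B));  t_lu(r)  = toc;   % solves with precomputed factors
end

printf('left division: min %g s, median %g s\n', min(t_div), median(t_div));
printf('LU factors   : min %g s, median %g s\n', min(t_lu),  median(t_lu));

% Both paths should give the same solution up to rounding error.
relerr = norm(x_div - x_lu, 'fro') / norm(x_div, 'fro');
printf('relative difference: %g\n', relerr);
```

Reporting the minimum alongside the median makes it easier to see whether one slow run is distorting a comparison between versions.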
>
>
> ************************************
>
> 3.8.1 from Ubuntu repository
>>> lutest_v
> Simple left division
> Elapsed time is 0.396069 seconds.
> LU decomposition
> Elapsed time is 0.0427001 seconds.
> ************************************
> 4.0.0 built myself (gcc 4.8.4)
>>> lutest_v
> Simple left division
> Elapsed time is 0.378708 seconds.
> LU decomposition
> Elapsed time is 0.071897 seconds.
> ************************************
>
> 4.0.2 built myself (gcc 4.8.4)
>>> lutest_v
> Simple left division
> Elapsed time is 0.377592 seconds.
> LU decomposition
> Elapsed time is 0.0599658 seconds.
> ************************************
>
> 4.1.0+ built myself (gcc 4.8.4) (cloned 2016-06-01 JST)
>>> lutest_v
> Simple left division
> Elapsed time is 0.378928 seconds.
> LU decomposition
> Elapsed time is 0.0592132 seconds.
> ************************************
>
>
> For simple left division, the differences are within tolerance.
> The LU decomposition is slowest on 4.0.0 and fastest on 3.8.1,
> but the differences are small compared with those on Windows.
> In addition, the slowness of simple left division observed on 4.1.0+ on
> Windows does not appear on Ubuntu 14.04 64 bit.
>
> I think the slowness of LU decomposition on Octave 4 for Windows is not
> acceptable.
>
> I would like to see tests and opinions from other people.
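
For anyone repeating the test, it may help to time the three steps of the LU path separately, so that a slowdown can be attributed to the permutation product or to one of the triangular solves. This is only a sketch: nrep and the layout of the timing matrix t are hypothetical, and the matrix_type calls are included on the assumption that the type Octave detects for L and U is relevant to which solver is chosen.

```octave
Num = 1000; ItNum = 10;
rand('seed', 1); A = rand(Num) - 0.5;
rand('seed', 2); B = rand(Num, ItNum) - 0.5;
[L, U, P] = lu(A);

nrep = 20;                      % hypothetical repeat count
t = zeros(nrep, 3);             % columns: P*B, L solve, U solve
for r = 1:nrep
  tic; c = P*B;  t(r,1) = toc;
  tic; y = L\c;  t(r,2) = toc;
  tic; x = U\y;  t(r,3) = toc;
end
tmin = min(t, [], 1);           % best time per step over all repeats

printf('Octave %s on %s\n', version(), computer());
printf('P*B: %g s   L solve: %g s   U solve: %g s\n', tmin);
printf('matrix_type(L) = %s, matrix_type(U) = %s\n', ...
       matrix_type(L), matrix_type(U));
```

Posting this output together with the version and platform line would make results from different machines easier to compare.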
>
> Regards
>
> ******************************************
> Tatsuro MATSUOKA
>
> Department of Chemical Engineering
> Nagoya University, Japan
> ******************************************
I have filed this in the bug tracker.
Further discussion will take place there.
http://savannah.gnu.org/bugs/?48085
Tatsuro