Re: cellfun vs. parcellfun: speed
From: c.
Subject: Re: cellfun vs. parcellfun: speed
Date: Tue, 4 Sep 2012 13:41:19 +0200
On 4 Sep 2012, at 13:29, Martin Helm wrote:
> You create way too many jobs; it is not efficient to create 100000 jobs
> on a 4-core machine (or whatever you have):
>
> tic ; parcellfun (4, @isempty, cell(1,100000)) ; toc
> parcellfun: 100000/100000 jobs done
> Elapsed time is 30.6851 seconds.
>
>
> tic ; parcellfun (4, @isempty, cell(1,100000), "ChunksPerProc", 1) ; toc
> parcellfun: 4/4 jobs done
> Elapsed time is 0.0850289 seconds.
>
> tic ; parcellfun (2, @isempty, cell(1,100000), "ChunksPerProc", 1) ; toc
> parcellfun: 2/2 jobs done
> Elapsed time is 0.0729971 seconds.
>
> tic ; cellfun (@isempty, cell(1,100000)) ; toc
> Elapsed time is 0.10386 seconds.
>
> so you see that even with this trivial example I get a slight speedup
> compared to cellfun when limiting the number of jobs to something
> reasonable (i3 notebook, dual core with hyperthreading)
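In other words, "ChunksPerProc", 1 makes parcellfun split the cell array into
one chunk per process and dispatch each chunk as a single job (hence the
"4/4 jobs done" above), instead of dispatching 100000 one-element jobs.
A purely serial sketch of the same chunking idea (the chunk count and the
reassembly step are only for illustration):

nprocs = 4;
c = cell (1, 100000);
sz = repmat (numel (c) / nprocs, 1, nprocs);  % assumes numel (c) divides evenly
chunks = mat2cell (c, 1, sz);                 % one chunk per worker
r = cellfun (@(ch) cellfun (@isempty, ch), chunks, "UniformOutput", false);
r = [r{:}];                                   % reassemble the full result vector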
My experience is that using more processes than the number of physical cores
on your machine is usually almost useless; whether you have hyperthreading or
not makes little difference.
Results also improve when the individual tasks you run in parallel are less
trivial.
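A quick sketch of capping the worker count at the physical core count
(this assumes a recent Octave, where nproc () reports logical CPUs; the
halving below assumes hyperthreading doubles that count, so adjust it, or
simply hard-code the number of physical cores for your machine):

ncores = max (1, floor (nproc () / 2));  % rough guess at the physical core count
tic;
parcellfun (ncores, @isempty, cell (1, 100000), "ChunksPerProc", 1);
toc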
The attached script demonstrates solving a set of nonlinear differential
equations; on my system (Core 2 Duo) I get:
>> prova_par
parallel version with cellfun:
parcellfun: 200/200 jobs done
Elapsed time is 32 seconds.
serial version with for cycle: Elapsed time is 65 seconds.
serial version with cellfun: Elapsed time is 64 seconds.
speedup is exactly 2 ;)
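In case the attachment does not make it through the list: what follows is not
the actual prova_par.m, only a minimal sketch of that kind of comparison (the
ODE, the parameter sweep, and all names are my own illustration). It integrates
the Van der Pol oscillator with lsode for 200 parameter values and times the
parallel and serial versions:

pkg load parallel                                % provides parcellfun
% Van der Pol oscillator: a simple nonlinear ODE, solved for 200
% different values of the damping parameter mu.
f = @(mu) @(x, t) [x(2); mu * (1 - x(1)^2) * x(2) - x(1)];
mus = num2cell (linspace (0.5, 5, 200));
t = linspace (0, 20, 200);
solve_one = @(mu) lsode (f (mu), [2; 0], t);     % one 200x2 solution matrix per mu

tic;                                             % parallel version with parcellfun
sols_par = parcellfun (2, solve_one, mus, "UniformOutput", false);
toc

tic;                                             % serial version with a for loop
sols_for = cell (size (mus));
for k = 1:numel (mus)
  sols_for{k} = solve_one (mus{k});
endfor
toc

tic;                                             % serial version with cellfun
sols_ser = cellfun (solve_one, mus, "UniformOutput", false);
toc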
HTH,
c.
Attachment: prova_par.m