[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: map-par slower than map
From: |
Damien Mattei |
Subject: |
Re: map-par slower than map |
Date: |
Thu, 13 Oct 2022 10:20:19 +0200 |
ok 'parallelization code and hash table' in google show no real solution, i
have another idea,instead of function-unify-two-minterms-and-tag ,just
unify the minterms and tag them (which use hash table) later out of the //
region, after if it still sucks i will drop list for vectors, making it
step by step will show where is the bottleneck, let me just a few time and
i will go back to the mailing lists with interesting results i hope....
On Thu, Oct 13, 2022 at 9:40 AM Damien Mattei <damien.mattei@gmail.com>
wrote:
> ok , i think the problem comes both from my code and from guile parmap so.
> Obviously parmap could be slower on other codes because of the nature of
> list i think, it is hard to split a list in sublist and send them to thread
> and after redo a single list, i better use vector.
> As mentioned and forget by me, i apologize, i use an hash table which is a
> global variable and mutex can be set on it , no dead lock , here but it
> slow down the code than it is dead for speed , but not locked.
>
> The examples given are good but i do not want to write long and specific
> code for //, for // must a ssimple as OpenMP directives (on arrays) not be
> a pain in the ass like things i already did at work with GPUs in C or
> Fortran,that's nightmare and i do not want to have this in functional
> programming.
>
> ok i will try to modify the code but i do not think there is easy solution
> to replace the hash table, any structure i would use would be global and
> the problem (this function is not pure, it does a side-effect),at the end i
> do not think this code could be // but i'm not a specialist of //. I think
> this is a parallelization problem, how can we deal with a global variable
> such as an hash table?
>
> On Wed, Oct 12, 2022 at 11:55 PM Olivier Dion <olivier.dion@polymtl.ca>
> wrote:
>
>> On Wed, 12 Oct 2022, Damien Mattei <damien.mattei@gmail.com> wrote:
>> > Hello,
>> > all is in the title, i test on a approximately 30000 element list , i
>> got
>> > 9s with map and 3min 30s with par-map on exactly the same piece of
>> > code!?
>>
>> I can only speculate here. But trying with a very simple example here:
>> --8<---------------cut here---------------start------------->8---
>> (use-modules (statprof))
>> (statprof (lambda () (par-map 1+ (iota 300000))))
>> --8<---------------cut here---------------end--------------->8---
>>
>> Performance are terrible. I don't know how par-map is implemented, but
>> if it does 1 element static scheduling -- which it probably does because
>> you pass a linked list and not a vector -- then yeah you can assure that
>> thing will be very slow.
>>
>> You're probably better off with dynamic scheduling with vectors. Here's
>> a quick snippet I made for static scheduling but with vectors. Feel
>> free to roll your own.
>>
>> --8<---------------cut here---------------start------------->8---
>> (use-modules
>> (srfi srfi-1)
>> (ice-9 threads))
>>
>> (define* (par-map-vector proc input
>> #:optional
>> (max-thread (current-processor-count)))
>>
>> (let* ((block-size (quotient (vector-length input) max-thread))
>> (rest (remainder (vector-length input) max-thread))
>> (output (make-vector (vector-length input) #f)))
>> (when (not (zero? block-size))
>> (let ((mtx (make-mutex))
>> (cnd (make-condition-variable))
>> (n 0))
>> (fold
>> (lambda (scale output)
>> (begin-thread
>> (let lp ((i 0))
>> (when (< i block-size)
>> (let ((i (+ i (* scale block-size))))
>> (vector-set! output i (proc (vector-ref input i))))
>> (lp (1+ i))))
>> (with-mutex mtx
>> (set! n (1+ n))
>> (signal-condition-variable cnd)))
>> output)
>> output
>> (iota max-thread))
>> (with-mutex mtx
>> (while (not (< n max-thread))
>> (wait-condition-variable cnd mtx))))
>> (let ((base (- (vector-length input) rest)))
>> (let lp ((i 0))
>> (when (< i rest)
>> (let ((i (+ i base)))
>> (vector-set! output i (proc (vector-ref input i))))
>> (lp (1+ i)))))
>> output))
>> --8<---------------cut here---------------end--------------->8---
>>
>> --
>> Olivier Dion
>> oldiob.dev
>>
>
- Re: map-par slower than map, (continued)
- Re: map-par slower than map, Damien Mattei, 2022/10/24
- Re: map-par slower than map, Mikael Djurfeldt, 2022/10/25
- Re: map-par slower than map, Mikael Djurfeldt, 2022/10/25
- Re: map-par slower than map, Damien Mattei, 2022/10/25
- Re: map-par slower than map, Maxime Devos, 2022/10/12
Re: map-par slower than map, Olivier Dion, 2022/10/12
- Re: map-par slower than map, Damien Mattei, 2022/10/13
- Re: map-par slower than map, Damien Mattei, 2022/10/13
- Re: map-par slower than map, Olivier Dion, 2022/10/13
- Message not available
- Fwd: map-par slower than map, Damien Mattei, 2022/10/13
- Re: map-par slower than map, Damien Mattei, 2022/10/13
- Re: map-par slower than map, Olivier Dion, 2022/10/13
- Re: map-par slower than map, Damien Mattei, 2022/10/13
- Re: map-par slower than map, Olivier Dion, 2022/10/13
- Re: map-par slower than map, Damien Mattei, 2022/10/13
- Re: map-par slower than map, Damien Mattei, 2022/10/13