I know little about the actual caching strategy and how the workload is
partitioned, so thanks, Jon, for this insightful mail. A similarly short
explanation of the caching strategy would be much appreciated.
Maybe this is a stupid question, but why do we bother with the cache and
sequential positions so much at all? My benchmarks with 4 cores showed that
the cache brings about a 10% improvement, but look at the huge number of bugs
it has caused recently, and how harmful it is to threading. Shouldn't the
threading strategy be to have stateless positions and throw all available
cores at finishing one single position as fast as possible? That would allow
a number of other improvements, such as displaying analysis results move by
move rather than waiting for batches to complete before they can be
displayed.
I feel the design is strongly biased towards interactive GUI play and
sequential match batch analysis, but my use case is using gnubg for a bot,
and thus rather stateless, random positions.
Ingo
Philippe Michel wrote:
>
> On the other hand, analysis doesn't seem to scale well above 8 threads.
> It looks like moves are analysed one by one and after the opening and
> early middle game, even with a large move filter, some, then most of the
> threads are starved. 8 threads was about 7 times faster than 1, but 16
> threads was merely 9 times faster.
This is almost certainly caused by the fact that I wrote the code to analyse
each game separately, i.e. all the moves in one game are dished out and then
waited for before the next game is analysed. This means that towards the end
of each game lots of threads will wait for the last moves to finish.
The answer would be to split out all the moves of the entire match in one
go. That shouldn't be too hard, as long as it doesn't break anything. It
would be worthwhile, as it would speed things up slightly even for small
numbers of cores.
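A rough sketch of the difference, not the actual gnubg code: it models each move as a task with a made-up duration and hands tasks to whichever worker frees up first. The names `makespan`, `per_game` and `whole_match`, and the 3-game, 10-move match, are all hypothetical and only there to illustrate the idea.

```python
import heapq


def makespan(durations, workers):
    """Finish time when each task goes to the worker that frees up first."""
    free_at = [0.0] * workers          # time at which each worker is idle
    heapq.heapify(free_at)
    for d in durations:
        start = heapq.heappop(free_at)  # earliest available worker
        heapq.heappush(free_at, start + d)
    return max(free_at)


def per_game(games, workers):
    """Current behaviour as described: wait for a whole game before the next."""
    return sum(makespan(game, workers) for game in games)


def whole_match(games, workers):
    """Proposed behaviour: dish out every move of the match in one go."""
    all_moves = [d for game in games for d in game]
    return makespan(all_moves, workers)


# Hypothetical match: 3 games of 10 moves, each move taking 1 time unit.
games = [[1.0] * 10 for _ in range(3)]

for n in (1, 8, 16):
    print(n, per_game(games, n), whole_match(games, n))
```

In this toy case, 8 workers need 6 time units with a wait after every game but only 4 when the whole match is queued at once, because the idle slots at the end of each game get filled with moves from the next one. Real move analysis times vary a lot, so the actual gain would differ.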
Jon