
Re: ||ism


From: Theodore C. Belding
Subject: Re: ||ism
Date: Thu, 8 May 1997 18:31:36 -0400

Hi Glen-
I think relying on shared memory to reduce communication costs relative to
message passing is not a very good idea.  Shared memory hides a lot of the
complexity of parallel computing but can also trick you into doing more
costly inter-processor communication than you would if you used explicit
message-passing.

A few years ago, Michigan had KSR1 and KSR2 cache-based shared memory
(NUMA) parallel computers.  The computers had 2 rings of 32 processor nodes
each. Each node had two levels of cache: a 256 KB data and a 256 KB
instruction subcache, plus a 32 MB local cache.  Subcache access took 2
cycles and could be performed at a rate of one per clock.  If an item was
not stored in a node's subcache, but in its local cache, access time
increased to 23.4-49.2 cycles.  If the item was stored in the local cache
of another node on the same local ring, access time typically ranged from
135-175 cycles.  Memory accesses on another ring took 470-600 cycles
(Windheiser et al., 1993).  So if you were unlucky or used a bad parallel
algorithm, you could have a big hit each time you accessed memory.

I took a parallel computing course with Quentin Stout here, and programmed
a distributed GA in C on the KSR, using POSIX threads (Belding, 1995).
Stout's rule of thumb was that you should always program as if you were on
a message-passing architecture and keep operations local to the processor
wherever possible; you shouldn't rely on shared memory to make
communication "free", since there are always hidden communication costs.
Instead you should assume that each non-local operation is costly.  (Of
course, if your program is coarse-grained and communication is infrequent
relative to local computation, you don't have to worry about this as much.)
And if you're just simulating shared memory using message passing (is that
what you mean by "virtual shared memory"?), then there isn't any real
performance difference between the two.

Currently, my impression is that distributed-memory architectures such as
the IBM SP2 have the upper hand relative to shared-memory architectures,
due to the increasing gap between cheap microprocessor power and bus
bandwidth (Hennessy and Patterson, 1996).  But it's been a while since I've
worked in this area.

Finally, synchrony vs. asynchrony and fine-grained vs. coarse-grained
parallel computation are somewhat orthogonal issues and should be
considered individually.  Synchrony is not the same as communication.

More generally, I think it's a bad idea for Swarm to assume
machine-specific details like the cost of inter-processor communication, or
the presence of shared memory.  If a user has a fine-grained, synchronous
algorithm, the communication cost would likely be higher when using MPI on a
network of workstations than on a parallel supercomputer with shared memory
and a high-bandwidth inter-processor bus.  But I don't think Swarm should
care.  MPI is supposed to help you avoid having to think about details like
that; it can be run on computers with or without shared memory.  Swarm
should avoid re-inventing the wheel.

At the risk of repeating myself, documentation is still more important than
any of this parallel stuff. :)
-Ted

Belding, T. C. (1995). The distributed genetic algorithm revisited. In
Eshelman, L. J. (Ed.), Proceedings of the Sixth International Conference on
Genetic Algorithms, pp. 114-121. San Francisco, CA: Morgan Kaufmann.
http://www-personal.engin.umich.edu/~streak/dga.html

Hennessy, J. L., and D. A. Patterson. (1996). Computer Architecture: A
Quantitative Approach. 2nd edition. San Francisco, CA: Morgan Kaufmann. pp.
637-644.

Windheiser, D., Boyd, E.L., Hao, E., Abraham, S.G., and E.S. Davidson.
(1993). KSR1 Multiprocessor: Analysis of latency hiding techniques in a
sparse solver. In Proc. 7th International Parallel Processing Symposium,
April 13-16, 1993, Newport Beach, CA. pp. 454-461.


At 1:09 PM -0600 5/8/97, glen e. p. ropella wrote:
>Hey!
>
>I was reading an article in Comp. Science & Eng.  and noticed
>some issues that were raised by Berlin and Gabriel in the theme
>article "Distributed MEMS: New Challenges for Computation,"
>[CS&E Jan-March 1997].
>
>I'm going to quote a paragraph that seems relevant to the coming
> || version of Swarm  (I apologize for quoting so much... but,
>though Berlin and Gabriel are talking about smart dust, they could
>as easily be talking about modelling decisions in Swarm):
>
>   "Parallel computing has traditionally involved relatively close
>temporal synchronization among processes, which tend to be running on
>a set of processors located quite near one another.  Distributed
>computing, on the other hand, has typically been associated with
>processes that may be operating in distant locations, synchronizing
>and exchanging results far less frequently.  Distributed MEMS will
>require new paradigms that support a high degree of synchronization
>over fairly large distances in order to enable applications, such as
>sound localization, that require tight synchronizations to correlate
>data about events in the environment in real time.
>
>   "Smart dust raises many other questions that pose serious
>challenges for computational science.  Should all the smart dust
>particles run the same program, as in a SIMD machine, or should
>particles specialize and diverge from one another?  How do the
>particles synchronize with one another?  How can particles be
>dynamically recruited into collaborative groups?  How can
>communication be established and maintained in a system where the
>physical topology varies over time?  How do we build a global view of
>a situation using many small pieces of information that are collected
>in different places at different times?  In an energy-limited
>programming environment, when is it better to compute (interpolate) a
>result, when is it better to communicate with a neighboring particle
>to obtain the result, and when is it better to sense the result in the
>environment?  How can spatially distinct data streams collected at
>different times be combined in an energy-efficient manner?  What are
>the abstraction mechanisms that allow easy programming of these
>systems?"
>
>
>1.  I think we need a || methodology that will support *both*
>distributed computing, where synchronization may be seldom required,
>and || computing, where synchronization may occur very often.  This
>seems to imply to me that message passing will become a serious
>hindrance in the latter.  And that makes me think that we might not
>want to dive glibly into using a system based on MPI.
>
>2.  It's possible that we could use some hybrid of MPI and a virtual
>shared memory system.  That might make things a bit easier, since
>we could rely on the shared memory for all the kernel data trading
>(which would include the synchronization of Swarms and schedules),
>and still use MPI for other object-to-object communication across
>hardware.
>
>3.  From the modelling perspective, do any of you *know* for certain
>how much synchronization your apps will require?  I.e. do we have
>an estimate of how badly the highly synchronous apps will hit us
>and how often we're likely to see apps like that?
>
>4.  Does anybody out there have any experience with pure
>message passing versus distributed shared memory methods?  I.e.
>am I fooling myself into thinking that the shared memory method
>will be any more efficient than a pure message passing interface?
>
>*Any* opinions on where Swarm should go with this are wanted.
>
>glen
>--
>{glen e. p. ropella <address@hidden> |                                  }
>{Hive Drone, SFI Swarm Project         |            Hail Eris!            }
>{http://www.trail.com/~gepr/home.html  |               =><=               }
>
>
>                  ==================================
>   Swarm-Modelling is for discussion of Simulation and Modelling techniques
>   esp. using Swarm.  For list administration needs (esp. [un]subscribing),
>   please send a message to <address@hidden> with "help" in the
>   body of the message.
>                  ==================================


--
Ted Belding                      <mailto:address@hidden>
University of Michigan Program for the Study of Complex Systems
<http://www-personal.engin.umich.edu/~streak/>



