
Re: [Swarm-Modelling] Floating point arithmetic


From: James Marshall
Subject: Re: [Swarm-Modelling] Floating point arithmetic
Date: Fri, 06 May 2005 10:13:49 +0100
User-agent: Mozilla Thunderbird 1.0.2 (X11/20050317)

http://aspen.ucs.indiana.edu/pss/HPJava/mpiJava.html

As I say, I haven't used it. It's a JNI interface to the MPI libs though.
        James

Russell Standish wrote:
On Thu, May 05, 2005 at 10:13:20PM -0600, Marcus G. Daniels wrote:

Russell Standish wrote:


What's that got to do with no one using parallel Java? Plenty of people
do parallel Fortran or parallel C/C++.



The distinction is between parallel application code running in threads vs. code that runs in parallel using distributed objects. The former has low latency (threads), while the latter goes over the network (typically much higher latency). Java RMI aims to be a user-friendly, high-level interface, while MPI (C/Fortran) is a more performance-friendly programming model because it does less on the programmer's behalf.
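To make the contrast concrete, here is a minimal MPI sketch in C++ (purely illustrative, not code from this thread): the programmer moves raw bytes explicitly between ranks, which is exactly what keeps the model performance-friendly.

// Minimal illustrative MPI example: rank 0 sends four doubles to rank 1.
#include <mpi.h>
#include <cstdio>

int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    double payload[4] = {1.0, 2.0, 3.0, 4.0};
    if (rank == 0) {
        // No marshalling layer: the bytes go out as-is.
        MPI_Send(payload, 4, MPI_DOUBLE, 1, /*tag=*/0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        MPI_Recv(payload, 4, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);
        std::printf("rank 1 got %g ... %g\n", payload[0], payload[3]);
    }
    MPI_Finalize();
    return 0;
}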


You said that Java has an MPI interface too - perhaps that doesn't
work too well?

I do work with parallel distributed objects in C++ using a technology
called ClassdescMP (and also Graphcode). Alas, as ClassdescMP was
developed by me, with some help from one or two of my staff, it is not
as well known: I don't have the marketing clout of someone like Sun.

However, it turns out to be fairly easy to get ClassdescMP and
Graphcode codes to scale well in parallel (and they literally get out
of the way in single-processor situations). As an example, I once
achieved a 40-times speedup on 100 processors with a million-agent
simulation. (I haven't got around to formally publishing this work
yet, though, and 100-processor systems aren't the easiest things to
lay my hands on.)
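As a rough Amdahl's-law sanity check (back-of-envelope arithmetic, not a published analysis): a 40-times speedup on n = 100 processors pins the serial fraction s of the code at about 1.5%:

    S(n) = 1 / (s + (1 - s)/n)
    40   = 1 / (s + (1 - s)/100)  =>  99 s = 1.5  =>  s ~= 0.015

In other words, roughly 98.5% of the work parallelises.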

I concentrated on C++ because it is easier to get fantastic
single-processor performance, and I architected the whole remote-object
communication layer in such a way as to minimise overheads. (The
compiler does have to do a lot of template crunching, and each new gcc
release gives me a headache, but compile times are still perfectly
reasonable.)

ClassdescMP in theory introduces overheads; in practice it often
seems to remove them, thanks to message concatenation. However, it
deliberately does _not_ hide MPI: it is always possible to recode a
bottleneck in pure MPI as an optimisation step.
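To illustrate the concatenation point with plain MPI (just the general idea, not ClassdescMP's actual internals): many small sends each pay the per-message latency, whereas packing them into one buffer pays it once.

#include <mpi.h>

// Hypothetical sketch: pack two payloads into a single message so the
// per-message latency is paid once instead of twice.
void send_batched(const int* a, int na, const double* b, int nb,
                  int dest, MPI_Comm comm) {
    char buf[1024];   // assume the packed data fits in 1 KB
    int pos = 0;
    MPI_Pack(a, na, MPI_INT,    buf, (int)sizeof buf, &pos, comm);
    MPI_Pack(b, nb, MPI_DOUBLE, buf, (int)sizeof buf, &pos, comm);
    MPI_Send(buf, pos, MPI_PACKED, dest, /*tag=*/0, comm);
}

The receiver does the mirror image with MPI_Recv and MPI_Unpack, extracting the payloads in the same order.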

And with C++, you can always extract fine-grained parallelism through
the use of OpenMP!
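For instance, something along these lines (an illustrative sketch, nothing ClassdescMP-specific) spreads an agent-update loop across the threads of a single node, complementing MPI's coarse-grained distribution across nodes:

#include <vector>

// Illustrative OpenMP sketch: fine-grained parallelism within one node.
void update_agents(std::vector<double>& state) {
    // Iterations are independent, so they can be split across threads.
    #pragma omp parallel for
    for (int i = 0; i < (int)state.size(); ++i)
        state[i] = 0.5 * state[i] + 1.0;   // stand-in for the real update rule
}

Compile with the compiler's OpenMP flag (e.g. -fopenmp for gcc) to enable the pragma; without it the loop simply runs serially.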

Cheers


Amdahl's law is exacerbated with RMI (relative to MPI) by the software overhead of marshalling calls and the increased latency. I don't believe there are any Java language dialects with compiler implementations of the quality of OpenMP (e.g. the C/Fortran compilers by Sun, Portland Group, Cray, etc.) that can extract fine-grained parallelism from loops and the like (using kernel threads to schedule the concurrent parts of loops, futures, etc.).
--
Dr James A. R. Marshall
Department of Computer Science
University of Bristol
http://www.cs.bris.ac.uk/home/marshall

