Re: [certi-dev] Re: CERTI-Devel Digest, Vol 28, Issue 5
From: Christian Stenzel
Subject: Re: [certi-dev] Re: CERTI-Devel Digest, Vol 28, Issue 5
Date: Fri, 25 Apr 2008 14:02:11 +0200
User-agent: IceDove 1.5.0.14pre (X11/20080305)
Hello,
a longer post follows now, because this discussion is very interesting
for me. I will try to formulate some of my own views on the subject.
@Erk
OK, I see. For such computation MPI seems more appropriate.
Are you porting the code from MPI, or do you "usually" have
such high-volume data exchange?
If I may be even more curious: at which frequency
do you send the matrix?
Here are some more details:
The principle concept is introduced and discussed in that paper:
http://www.mb.hs-wismar.de/~stenzel/publications/sne_16_2_paper_p51_p56_hla_military_ship_design.pdf
In short:
We have worked, and are still working, in the area of ship design
processes. One aim is to have something like a virtual ship in an early
design phase.
Furthermore, we have a model to simulate more or less realistic seaways.
There is a federate computing a sea height matrix as fast as possible.
This matrix is transferred to a federate visualizing the seaway and a
vessel in that seaway.
So the time constraints are clear: at minimum we have to be as fast as
real time.
For the animation we need about 20 frames per second (better would be
>=25). This means that we have to compute and transfer one sea height
matrix in 1/20 s. The computation time depends mainly on the matrix
dimensions.
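To make that budget concrete, here is a back-of-the-envelope sketch; the 1000x1000 matrix dimension and 8-byte height values are hypothetical example figures, not taken from the real model:

```python
# Back-of-the-envelope budget for the seaway federate.
# NOTE: rows, cols and bytes_per_value are assumed example values.
fps = 20                        # required animation rate
budget_s = 1.0 / fps            # time to compute AND transfer one matrix
rows, cols = 1000, 1000         # hypothetical sea height matrix dimensions
bytes_per_value = 8             # double-precision height value
matrix_bytes = rows * cols * bytes_per_value
throughput = matrix_bytes * fps # sustained bytes/s the link must carry

print(budget_s)       # 0.05 s per frame
print(matrix_bytes)   # 8000000 bytes per matrix
print(throughput)     # 160000000 bytes/s, i.e. ~160 MB/s
```

Even at these modest dimensions the link has to sustain on the order of 160 MB/s, which makes the middleware overhead per update quite visible.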
Obviously this has nothing to do with classical distributed discrete
event simulation; here HLA is used only as a "data-exchange" middleware.
Anyhow, there are different reasons why we use HLA here:
- the NATO STANAG for Virtual Ships intends the usage of HLA
- in principle, HLA should be capable of such communication (nothing
in the standard says that HLA cannot do it)
- adding more federates with different time advancing strategies
(TAR, NER) should be easily achieved
I think we may collaborate on this if you can create
a small "test case" like a HugeUAV_federate which may
be launched easily like
rtig
HugeUAV_federate -size 100000
HugeUAV_federate
etc...
The HugeUAV_federate will try to exchange a value of size -size.
The first federate creates the federation; the others automatically
become "subscribers" if "FederationAlreadyExists".
Using this we may investigate the problem easily, and
with a dtest script added we will put this in our "regression test cases" box.
Yes, I will do that.
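The launch pattern Eric describes can be sketched in a few lines; all names here (FakeRTIAmbassador, start_federate) are hypothetical stand-ins, not the real CERTI API:

```python
# Sketch of the HugeUAV_federate launch pattern: the first federate
# creates the federation, later ones catch FederationAlreadyExists and
# join as subscribers. The "RTI" below is a fake, single-process stand-in.
class FederationAlreadyExists(Exception):
    pass

class FakeRTIAmbassador:
    """Minimal stand-in for an RTI ambassador, just enough for the pattern."""
    def __init__(self):
        self.federations = set()

    def create_federation_execution(self, name):
        if name in self.federations:
            raise FederationAlreadyExists(name)
        self.federations.add(name)

def start_federate(rtia, federation="HugeUAV"):
    """Return the role this federate takes: 'publisher' or 'subscriber'."""
    try:
        rtia.create_federation_execution(federation)
        return "publisher"      # we created the federation: publish the value
    except FederationAlreadyExists:
        return "subscriber"     # someone was first: subscribe instead

rtia = FakeRTIAmbassador()
roles = [start_federate(rtia) for _ in range(3)]
print(roles)   # ['publisher', 'subscriber', 'subscriber']
```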
@Pierre
Hello Eric,
my dear neighbour from the other side of the corridor,
These are the new possibilities of digital communication :).
Discussing problems publicly on a mailing list instead of walking
to the other end of the corridor ;).
But this discussion is very interesting, so I will try to formulate
my position:
I think that the discussion is more open between MPI and HLA.
I have directed an internship on the subject of scientific computation
with HLA (parallel resolution of a linear system, Kathrin Quince, 2006).
We also had the problem of a huge matrix transfer; our solution was to
transmit the matrix by blocks. It was not very efficient, but after that
the subsequent computations were correct.
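The block-wise transfer can be sketched like this; plain Python lists stand in for the attribute updates an RTI would actually carry:

```python
# Split a big matrix into fixed-size row blocks, "send" each block as a
# separate update, and reassemble the matrix on the receiver side.
def split_into_blocks(matrix, block_rows):
    """Yield consecutive groups of block_rows rows."""
    for i in range(0, len(matrix), block_rows):
        yield matrix[i:i + block_rows]

def reassemble(blocks):
    """Concatenate received row blocks back into one matrix."""
    matrix = []
    for block in blocks:
        matrix.extend(block)
    return matrix

matrix = [[r * 10 + c for c in range(4)] for r in range(6)]  # 6x4 matrix
blocks = list(split_into_blocks(matrix, 2))  # three updates of 2 rows each
received = reassemble(blocks)

print(len(blocks))          # 3 separate transfers
print(received == matrix)   # True: the receiver sees the full matrix
```

Correctness is preserved, but each block pays the full per-update overhead of the middleware, which matches the observation that the approach was not very efficient.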
Are there written results of that internship?
Why do scientific computation with HLA?
To avoid a gateway (overhead) between various execution environments
(the burden of mastering and deploying many tools).
Do we have a lot of applications which integrate scientific computation
into a distributed event-based simulation?
I am thinking of the simulation of avionics systems, which could require
more elaborate models of the plane and of the environment physics.
Could Christian add more examples here?
The general term "scientific computation" covers a wide field. Wikipedia says:
"Computational science (or scientific computing) is the field of study
concerned with constructing mathematical models and numerical solution
techniques and using computers to analyze and solve scientific, social
scientific and engineering problems. In practical use, it is typically
the application of computer simulation and other forms of computation
to problems in various scientific disciplines."
My opinion is that many event-based simulations also belong to the field
of scientific computing. Event-based simulations are often a very high
abstraction of real systems. In some parts of the engineering community,
modeling dynamic systems via ordinary or partial differential equations
is more common.
There are two common ways to combine continuous and discrete models into
so-called hybrid models.
One way is to detect events in a continuous simulation; a good standard
example is "the bouncing ball": each time the ball hits the ground an
event occurs and the differential equation describing the ball's
trajectory is changed.
The other way is to integrate ODE resp. PDE solvers into discrete event
simulations: a special event triggers the computation of, e.g., some
trajectories. This approach is more general than the first one.
Again the example of the bouncing ball: if I am interested in the
trajectory, I can model the behaviour through a differential equation,
or more simply through a straight line without any influence of the
acceleration. Then I have to compute the position of the ball for each
time step and check whether it hits the ground; after that I can invert
the direction. This is a hybrid continuous simulation with event
detection.
But if I am only interested in the collisions with the ground, I can
advance my simulation time exactly to the time at which the collision
takes place. This corresponds to the processing in event-oriented
simulations.
Additionally, if I would like to know more about the deformation of the
ball when it hits the ground, the event "ground hit" can be used to
initiate the computation of a continuous "damage model". This
corresponds to hybrid simulation based on event simulation.
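The two processing styles can be put side by side for a ball dropped from rest; the numbers are purely illustrative (no air drag, no damping):

```python
# Continuous simulation with event detection vs. event-oriented jump
# to the known event time, for a ball dropped from height H0.
G = 9.81    # gravitational acceleration, m/s^2
H0 = 10.0   # initial height, m

def impact_time_stepped(dt):
    """Hybrid continuous style: step the trajectory in increments of dt
    and watch for the 'ground hit' event."""
    t, h, v = 0.0, H0, 0.0
    while h > 0.0:
        t += dt
        v += G * dt       # update velocity
        h -= v * dt       # update height
    return t              # first step at which the ball reaches the ground

def impact_time_event_oriented():
    """Event-oriented style: advance simulation time directly to the
    known event time t = sqrt(2*H0/G) instead of stepping towards it."""
    return (2.0 * H0 / G) ** 0.5

exact = impact_time_event_oriented()     # ~1.428 s
stepped = impact_time_stepped(dt=1e-4)
print(abs(stepped - exact) < 1e-2)       # True: both styles agree
```

The stepped version does thousands of small updates and position checks; the event-oriented version does one computation, which is exactly why event-oriented processing pays off when only the collision times matter.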
I suppose both simulations can be assigned to the field of "scientific
computation".
Before HLA, DIS and ALSP, the only reason to compute such problems in
parallel was to become faster than the sequential solution; this was the
only motivation for parallel computing. Research in this area shows that
a problem has to fulfill some preconditions to achieve something like a
speedup.
Besides Amdahl's law, the ratio between communication effort and
processing effort is very important. This ratio is called granularity,
and it is always application and implementation specific. E.g. parameter
studies like a Monte Carlo study have a good ratio: the communication
effort is low and the processing effort high. In contrast, population
simulations (predator-prey models) need a lot of communication.
Typically such problems do not reach any speedup; often the speedup is
below 1.
All problems with a small granularity value are appropriate for
distributed computation (in PDES terms: LPs with a big lookahead), and
we will probably get a speedup. This means that such an application can
run on top of nearly any middleware (see today's grid applications).
When the communication effort increases, the efficiency of the
underlying communication infrastructure becomes more and more important
(as Eric already mentioned).
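Amdahl's law, the other classical precondition mentioned above, is worth spelling out: the serial fraction s of a program bounds the achievable speedup by 1/s, no matter how many processors work on the parallel part.

```python
# Amdahl's law: predicted speedup for a program whose serial (non-
# parallelizable) fraction is serial_fraction, run on n_procs processors.
def amdahl_speedup(serial_fraction, n_procs):
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / n_procs)

# A program that is 10% serial:
print(round(amdahl_speedup(0.10, 4), 2))      # 3.08 on 4 processors
print(round(amdahl_speedup(0.10, 1000), 2))   # 9.91 even on 1000 processors
```

With only 10% serial code, a thousand processors still give less than a 10x speedup, which is why the communication/processing ratio of the specific application matters so much.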
PDES can also be regarded as, or better, is an application of parallel
computing (and of scientific computing). The motivation here is likewise
to become faster than the sequential solution. Main application areas of
PDES are e.g. network or circuit simulations. Typically the whole
sequential DES model is partitioned into different LPs. In contrast to a
parameter study, the distribution takes place on the model layer instead
of on the experiment layer.
I think PDES, and therefore also scientific computing, can be done with
HLA; that is why HLA supports TM services. Here the synchronization is
done through the middleware.
But historically, with HLA as the successor of DIS and ALSP, the main
application area lies in the field of so-called Distributed Virtual
Training Environments. Interoperability of existing simulations,
reusability of existing simulation code and scalability are the main
focus of this middleware. TM is used here as a means to ensure causal
order.
My opinion is that HLA can be used for parallel processing and in
particular for PDES. The communication efficiency depends on the RTI
used and therefore on implementation details.
Are the HLA services appropriate to write parallel programs?
A first answer: we can write such programs with HLA.
Are the data management services appropriate?
We can express point-to-point communication (a single publisher
and a single subscriber).
We can express one-to-many communication (data distribution).
We can express many-to-one communication (a reduction operation,
but without the power of a binary-tree communication scheme).
The DDM services can be used to indicate the receiver of some data.
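The point about the missing binary-tree scheme can be made concrete by counting communication steps: gathering one value from each of n federates at a single root takes n-1 sequential receives, while pairwise tree combining needs only ceil(log2 n) rounds.

```python
import math

def linear_reduction_steps(n):
    """Root receives from every other federate one after another."""
    return n - 1

def tree_reduction_steps(n):
    """Binary-tree reduction: half the participants drop out each round."""
    return math.ceil(math.log2(n)) if n > 1 else 0

for n in (2, 8, 64):
    print(n, linear_reduction_steps(n), tree_reduction_steps(n))
# With 64 federates: 63 sequential steps vs only 6 rounds
```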
Are the time management services appropriate?
A general parallel application requires a complex synchronization
of tasks; these services can be useful.
Even a fork-join mechanism can be easily (though not intuitively
for a Fortran programmer) written.
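The fork-join mechanism mentioned here is, in the abstract, just this pattern (sketched with a thread pool; the worker task is a dummy partial sum, not anything from the actual internship):

```python
# Fork-join: fork worker tasks on chunks of the data, join by waiting
# for all partial results, then combine them into the final answer.
from concurrent.futures import ThreadPoolExecutor

def worker(chunk):
    """Hypothetical partial computation on one chunk of the data."""
    return sum(chunk)

data = list(range(100))
chunks = [data[i:i + 25] for i in range(0, 100, 25)]

with ThreadPoolExecutor(max_workers=4) as pool:   # fork
    partials = list(pool.map(worker, chunks))     # join (map blocks)

total = sum(partials)                             # combine
print(total)   # 4950, same as the sequential sum(range(100))
```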
How can we explain the superiority of MPI for data transfers?
a) Lower overhead of the MPI layer (latency)?
b) Execution of MPI applications on efficient architectures
(processors and networks)?
c) A lot of data transfer optimizations?
(even in the case of MPI above TCP?)
What are the MPI optimizations that cannot be included
in an RTI implementation?
In the case of CERTI, we could study the direct connection
between RTIAs for some objects (with a new transport attribute).
When you analyse existing approaches to parallel computing, you will
mainly find two paradigms for realizing parallel applications.
On the one hand you can use message-passing systems, on the other hand
SHM (shared-memory) based systems. Historically, message-passing
approaches were applied on distributed memory architectures (e.g.
Beowulf clusters), whereas SHM was used on closely coupled memory
architectures. Today, the usage of a specific programming scheme for a
specific architecture is no longer required (e.g. VSHM based on
message-passing systems is thinkable).
All message-passing approaches have in common that the communication
between the sender and the receiver is explicit: I have to specify the
receiver of my message, e.g. PVM does it by TIDs. The synchronization is
always implicit, e.g. through blocking receive operations.
On the other side, SHM approaches have primitives for explicit
synchronization (mutex, semaphore), while the communication is implicit
(changing the value of a shared variable).
In my opinion, HLA does not really fit into either of these categories.
The communication scheme in HLA is implicit, mainly because of the
declaration management services: a sender or publisher of information
does not know who will receive that information, and a subscriber does
not know who produced it. From the view of an HLA application using TM
services, the synchronization is also implicit, because sync primitives
are provided through the RTI.
I would not say that we have a new programming paradigm for parallel
applications, but HLA is a little bit special :).
Yes we do.
But my point is _efficiency_: the current HLA services (at least HLA 1.3
and IEEE 1516, as far as I am currently aware) do not offer the
efficient data exchange services usually needed by scientific computation:
- periodic exchange
- efficient broadcast in a group/sub-group
- barrier
- reduction
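One of the missing services in that list, the barrier, can be sketched with plain threads: no participant passes the barrier before all have arrived.

```python
# Barrier synchronization: every thread finishes phase 1 before any
# thread starts phase 2.
import threading

N = 4
barrier = threading.Barrier(N)
order = []
lock = threading.Lock()

def phase_worker(i):
    with lock:
        order.append(("before", i))   # phase 1 work
    barrier.wait()                    # nobody enters phase 2 early
    with lock:
        order.append(("after", i))    # phase 2 work

threads = [threading.Thread(target=phase_worker, args=(i,)) for i in range(N)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Every "before" entry precedes every "after" entry in the log.
befores = [k for k, (tag, _) in enumerate(order) if tag == "before"]
afters = [k for k, (tag, _) in enumerate(order) if tag == "after"]
print(max(befores) < min(afters))   # True
```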
I absolutely agree with Eric: MPI is a very fast "message-exchanging"
middleware.
Eric mentioned barrier synchronizations. In the field of PDES, barriers
are also used in some second-generation conservative sync algorithms
(synchronous sync algorithms).
I suppose that these algorithms can also be applied in an HLA
implementation. At the moment, CERTI uses the CMB algorithm (null
messages) for conservative synchronization. A synchronous sync algorithm
can natively handle zero lookahead; it could probably handle
small-lookahead federates much faster than CMB.
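The lookahead sensitivity of CMB can be illustrated with a toy: with lookahead L, an LP at local time t promises to send nothing earlier than t + L, and exchanging just these promises ("null messages") lets two otherwise idle LPs drag each other forward. The serialized ping-pong below is only a sketch of that idea, not the real distributed protocol.

```python
# Count the null messages two idle LPs need so that both pass `horizon`.
def cmb_null_messages(lookahead, horizon):
    t_a = t_b = 0
    messages = 0
    while t_b < horizon:
        t_a = t_b + lookahead   # A advances to B's promised lower bound
        t_b = t_a + lookahead   # B advances to A's promised lower bound
        messages += 2
    return messages

print(cmb_null_messages(lookahead=5, horizon=10))   # 2: big lookahead, little traffic
print(cmb_null_messages(lookahead=1, horizon=10))   # 10: small lookahead, many messages
```

Halving the lookahead roughly doubles the null-message traffic, which is exactly why small-lookahead federates are the painful case for CMB.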
I hope that someone finds this discussion useful. For me it is a good
opportunity to order my own thoughts :).