certi-devel

Re: [certi-dev] FW: RE: using CERTI as a fgms replacement?


From: Eric Noulard
Subject: Re: [certi-dev] FW: RE: using CERTI as a fgms replacement?
Date: Mon, 17 Nov 2008 16:06:41 +0100

2008/11/17 Gotthard, Petr <address@hidden>:
> FYI, here is a test-report concerning running the FlightGear-plugin with 
> various CERTI versions.
>
> I am very glad that these tests have happened.

So am I. Thank you to everyone who took the time to test.

> I believe the following CERTI issues can be drawn from this report:
>
> (1) protocol compatibility checks
> The CERTI protocol changes and RTIA/RTIG/federates running different CERTI 
> versions are very often incompatible. To avoid incompatibility problems in a 
> geographically distributed environment, introducing CERTI protocol versioning 
> and compatibility checks is critical.
> https://savannah.nongnu.org/bugs/index.php?24854

I agree, and I commented on this. My suggestions are:
   1) in the short term, enforce strict version equality with an extra
      first message
   2) in the long run, design and implement a compatibility mode.
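
To make the short-term option concrete, here is a minimal sketch of a strict-equality check on an extra first message. It is not existing CERTI code: HelloMessage, sameVersion and checkPeerVersion are hypothetical names, and how the message travels between RTIA and RTIG is left out.

```cpp
#include <cstdint>
#include <stdexcept>
#include <string>

// Hypothetical first message exchanged right after the connection opens.
// Not part of the actual CERTI protocol; names are illustrative only.
struct HelloMessage {
    std::uint32_t majorVersion;
    std::uint32_t minorVersion;
    std::uint32_t patchVersion;
};

// Format a version as "major.minor.patch" for error messages.
inline std::string versionToString(const HelloMessage& v) {
    return std::to_string(v.majorVersion) + "." +
           std::to_string(v.minorVersion) + "." +
           std::to_string(v.patchVersion);
}

// Strict equality: every component must match.
inline bool sameVersion(const HelloMessage& a, const HelloMessage& b) {
    return a.majorVersion == b.majorVersion &&
           a.minorVersion == b.minorVersion &&
           a.patchVersion == b.patchVersion;
}

// Refuse the connection early instead of failing later on a
// message the peer cannot decode.
inline void checkPeerVersion(const HelloMessage& local,
                             const HelloMessage& peer) {
    if (!sameVersion(local, peer)) {
        throw std::runtime_error("version mismatch: local " +
                                 versionToString(local) + ", peer " +
                                 versionToString(peer));
    }
}
```

A later compatibility mode could replace sameVersion with a looser rule without changing the callers.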

> (2) federation problems when a federate is joining/leaving/failing
> The need to restart the server is related to the inability of RTIG to detect 
> an RTIA crash.
> https://savannah.nongnu.org/bugs/?23746
> I will do more testing once I fix all FlightGear-plugin bugs.

OK, I will wait for that.
I think RTIG should be rock-solid with respect to RTIA crashes. An RTIG
crash, however, should be handled by the user on the federate side, perhaps
by terminating RTIA and re-creating/re-joining automatically.

Maybe we could add a new libHLA API to handle this kind of
"high-level" RTIA management; ideally the API would be built on the
libRTI API, perhaps with an extra thread.

Another solution would be to add fault tolerance inside the RTIA executable
for "loss of communication" with RTIG, but I don't think that would
be wise (this is open to discussion).

More comments below.

> -----Original Message-----
> From: address@hidden
> To: "'Oliver Schroeder'" <address@hidden>
> Sent: 17.11.2008 01:04
> Subject: RE: Re: using CERTI as a fgms replacement?
>
> Oliver,
>
> Thank you for forwarding this. Anders Gidenstam, Jon Stockill, Csaba Halász, 
> and I have been carrying out some testing over the last few days with mixed 
> results. This is a short report on our findings.
>
> I was running Windows XP (SP2) with MSVC9 (SP1). The remainder, various 
> flavours of Linux. We all downloaded and
> installed the FG patch without difficulty. I downloaded the pre-cooked MSVC 
> version Certi 3.3.0. This installs and runs. It
> seems to run "Billards", but since nothing actually happens, it is hard to 
> tell if this is actually correct. Anders ran rtig, and I
> attempted to connect. This appeared to happen correctly. The setting was 
> --hla=10,NAVY400,VirtualAir. The VirtualAir FOM
> appears to be a good basis for developing our own federation. We noted that 
> the frame rate at KSFO was around 9 using the
> Seahawk. This is about half that which we would expect with MultiPlayer (MP).

Maybe the plugin work should be asynchronous with respect to the
FlightGear main loop.
I'm speaking without actually knowing much about the FlightGear plugin
internals.
We did not experience such a slowdown with the XPlane plugin; however,
that HLA plugin did not run at the frame rate. That is, the HLA XPlane
plugin may choose to receive (or send) updates at a slower rate than the
graphics loop, e.g. 1 Hz, with a frame rate higher than 20 fps.
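
As an illustration of updating at a slower rate than the graphics loop, here is a minimal sketch of a throttle that the frame loop could consult each frame. UpdateThrottle is a hypothetical helper, not code from the XPlane or FlightGear plugin; the caller supplies the current time so the logic stays independent of any particular clock.

```cpp
#include <chrono>

// Hypothetical helper: lets a per-frame loop send HLA updates at a
// fixed, slower period (for example 1000 ms for 1 Hz).
class UpdateThrottle {
public:
    explicit UpdateThrottle(std::chrono::milliseconds period)
        : period_(period), last_(0), hasSent_(false) {}

    // Returns true when at least one period has elapsed since the last
    // accepted update (the very first call always returns true).
    bool shouldSend(std::chrono::milliseconds now) {
        if (hasSent_ && now - last_ < period_) {
            return false;  // too early: skip HLA work this frame
        }
        last_ = now;
        hasSent_ = true;
        return true;
    }

private:
    std::chrono::milliseconds period_;
    std::chrono::milliseconds last_;
    bool hasSent_;
};
```

With a 1000 ms period and a 20 fps loop, only about one frame in twenty would pay the cost of an HLA update.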

> So far so good. However when any other client attempted to join, the Windows 
> client throws an error and RTIA stops in
> SocketUN.cc:
>
>        line 460 - throw NetworkError("Error while receiving UN message.");
>
> This stops the RTIG server, and this in turn stops all clients. Linux clients 
> do not seem
> to experience the first stop, but the Windows client stop brings down all 
> other clients.
> This effect is also seen when a Linux client leaves the network which stops 
> all other clients.
> This requires that the server is restarted.

OK, right. As I said,
a federate crash should never cause another federate, or even the RTIG,
to crash.
An RTIG crash should make the federate's RTIA stop; however, each
federate may implement its own "restart" strategy (this may be done
inside VirtualAir).

[...]


> In summary:
>
> - The frame rate hit is excessive and unacceptable.

The solution should be inside the VirtualAir plugin (or a better use
of tick).

> - Certi Windows client fails if any client (Windows or Linux) joins the net.
> - Linux clients fail if any client leaves the net
> - Any exit from the net stops RTIG which stops all clients.

     This will definitely be handled.

> - HLA seems to introduce stagger into the FG frame rate.
>
> Recommendations:
>
> - To be usable in FG HLA must not have a greater effect on frame rate than 
> MP, and preferably less.
> - HLA must not introduce stagger into FG.

>
> - The RTIG server must not be affected by clients joining or leaving the net, 
> or failing.

    The RTIG should be rock-solid; we should improve this.

>
> - Clients must not be affected by the failure of the RTIG server.

     An RTIG crash should be handled specifically by VirtualAir
     or by a CERTI extension (libHLA).
     The "right" way to handle an RTI crash depends on the federate's
     objectives.

> - Consideration should be given to running HLA in its own thread.

     That would be a good idea; communication between HLA and the
     FG main loop should be asynchronous.
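
A minimal sketch of what asynchronous communication could look like: a mutex-protected mailbox that an HLA thread fills and the main loop empties once per frame without waiting for the network. Mailbox is a hypothetical class, not part of CERTI or FlightGear.

```cpp
#include <mutex>
#include <queue>
#include <utility>
#include <vector>

// Hypothetical mailbox between an HLA thread (producer) and the
// FG main loop (consumer).
template <typename T>
class Mailbox {
public:
    // Called from the HLA thread whenever an update arrives.
    void push(T value) {
        std::lock_guard<std::mutex> lock(mutex_);
        queue_.push(std::move(value));
    }

    // Called from the main loop once per frame: takes everything that
    // is pending, in arrival order, and returns immediately if empty.
    std::vector<T> drain() {
        std::vector<T> out;
        std::lock_guard<std::mutex> lock(mutex_);
        while (!queue_.empty()) {
            out.push_back(std::move(queue_.front()));
            queue_.pop();
        }
        return out;
    }

private:
    std::mutex mutex_;
    std::queue<T> queue_;
};
```

The main loop only ever holds the lock for the time it takes to move the pending items, so a slow RTI call in the HLA thread would not stall a frame.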

>
> - Certi must be multi-platform, and proved to be so.

      Agreed. CERTI has handled heterogeneity since 3.3.0, so this is
      fairly new.
      Since then we have never experienced multi-platform trouble
      when compiling CERTI from source.
      However, we do have an issue on Windows with the prebuilt version.
      Windows is not the primary development platform of CERTI;
      we should find people who want to set up a Windows Dashboard
      entry for CERTI and HLA_TestsSuite:
     http://lists.nongnu.org/archive/html/certi-devel/2008-10/msg00028.html


> Conclusions:
>
> - HLA shows good promise, but in its current state cannot be considered to be 
> fit for inclusion in FG.
>
> Finally:
>
> - We wonder if Certi has actually been tested under Windows?

      Yes, CERTI is used on Windows; maybe Windows users can step in and
      tell us the context of their CERTI usage.

      However, it is true that the primary and most intensively used
      platform is Unix/Linux.
      We do not neglect Windows users, but we need more systematic testing
      on the Windows platform.
      CERTI is an Open Source project; maybe the Windows users can help here.

> We look forward to testing HLA again once at least some of the shortcomings 
> have been addressed.

    I will be pleased to hear from you again.
    Thank you for this thorough testing.

-- 
Erk



