gpsd-dev

Re: [gpsd-dev] Parallel build broken?


From: Eric S. Raymond
Subject: Re: [gpsd-dev] Parallel build broken?
Date: Sat, 23 Nov 2013 03:22:32 -0500
User-agent: Mutt/1.5.21 (2010-09-15)

Gary E. Miller <address@hidden>:
> Ideally -j3 should just work.  It seems to me if the dependencies are
> correct then leapfetch=off should not be needed.  An acceptable
> fallback might be to auto set leapfetch=off when -j1 is not in effect.

Here's the problem:

We want to build with the freshest possible version of the leap-second offset.
It's wanted for time reporting before the device loads an ephemeris. Without
it we don't have leap-second correction before that point.

(After the ephemeris is loaded, we keep the leap offset around until
it's superseded by another report.  So even if the device's normal fix report
only ships GPS time we can correct what client programs see.)
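
To make the correction concrete, here is a minimal sketch of what the leap
offset buys us. It is not gpsd's actual code: the function name and the
constant are hypothetical, and the 16-second value is assumed to be the
GPS-UTC offset in effect in late 2013.

```python
from datetime import datetime, timedelta

# Assumption for illustration: GPS-UTC offset of 16 s, the value in effect
# in late 2013. A real build would take this from leapseconds.cache.
LEAP_OFFSET_SECONDS = 16


def gps_to_utc(gps_time, leap_offset=LEAP_OFFSET_SECONDS):
    """GPS time runs ahead of UTC by the accumulated leap seconds."""
    return gps_time - timedelta(seconds=leap_offset)


# A device that only ships GPS time reports this...
uncorrected = datetime(2013, 11, 23, 8, 22, 48)
# ...and this is what a client should see once the offset is applied.
corrected = gps_to_utc(uncorrected)
```

If the offset baked in at build time is stale by one leap second, every
time reported before the ephemeris load is off by that one second.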

We have a file, leapseconds.cache, that is part of the repository content,
containing this information.

Early in every build there's a step where a Python script,
leapsecond.py, goes out to the USNO and other relevant sources and
fetches current offset information in order to ensure that
leapseconds.cache is up to date. This is the step leapfetch=no
suppresses.

If leapfetch=yes, the built installation's leapsecond error will be determined
by the time of build - the best we can do, since the build system isn't
precognitive and can't know when in the future IERS will issue corrections.

If leapfetch=no, the leapsecond error will be determined by the last time
leapseconds.cache was refreshed.  That's probably its state as of the last
git pull before the build.

If your build engine is inside a firewall that requires an
authenticating proxy, the attempt to refresh leapseconds.cache can hang
indefinitely.  This is Christian's case and it's why I implemented 
leapfetch=no.

In order to not hang the build indefinitely if the network environment is
hostile, the fetch attempt has to time out.  The timeout is implemented
with signal(SIGALRM ...) - which doesn't work in a non-main thread,
because Python only lets the main thread install signal handlers.
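
The failure is easy to reproduce outside the build. This is a minimal
sketch, assuming a Unix host (SIGALRM does not exist on Windows) and
assuming the parallel build runs the fetch step in a worker thread; the
helper names are made up for the demonstration.

```python
import signal
import threading


def install_alarm_handler():
    """Try to install a SIGALRM handler; return the error text on failure."""
    try:
        signal.signal(signal.SIGALRM, lambda signum, frame: None)
        return None
    except ValueError as err:
        # CPython refuses to install handlers outside the main thread.
        return str(err)


result = {}


def worker():
    result["error"] = install_alarm_handler()


# Run the attempt in a non-main thread, as a parallel build job would.
t = threading.Thread(target=worker)
t.start()
t.join()
```

After the join, result["error"] holds the ValueError text rather than
None, so the timeout never gets armed.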

Thus, leapfetch=yes is incompatible with parallel build. No amount of
tweaking dependencies can fix this.  Here are some ways we could fix it:

1. Find a way to interrupt a network fetch attempt from a timer that will
work in a thread.

2. Force leapfetch off if -j is set. 

3. Give up on fetching this information every build - probably means
having a special production we fire before each release.  Has the
disadvantage that the expected wrongness of time before an ephemeris
load will go up.
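
Alternatives 1 and 2 might look roughly like the sketch below. Everything
in it is hypothetical: run_with_deadline and effective_leapfetch are
illustrative names, not anything in the gpsd tree, and the fetch itself
is passed in as a plain callable. The deadline uses Thread.join(timeout)
instead of SIGALRM, so it does not care which thread calls it.

```python
import threading


def run_with_deadline(func, seconds, default=None):
    """Alternative 1: run func() in a helper thread, give up after `seconds`.

    The helper is a daemon thread, so an overrunning fetch is abandoned
    (not killed) and cannot keep the process alive at exit.
    """
    box = {}

    def target():
        try:
            box["value"] = func()
        except Exception as err:
            box["error"] = err

    helper = threading.Thread(target=target, daemon=True)
    helper.start()
    helper.join(seconds)
    if helper.is_alive() or "error" in box:
        return default      # timed out or failed: keep the cached data
    return box["value"]


def effective_leapfetch(requested, num_jobs):
    """Alternative 2: only honor leapfetch on a serial (-j1) build."""
    return requested and num_jobs == 1
```

A call such as run_with_deadline(fetch, 30, default=None) would let the
build fall back to the existing leapseconds.cache when the network is
hostile. In an SCons build the job count for effective_leapfetch would
presumably come from GetOption('num_jobs'), but that wiring is an
assumption here. Passing a timeout to the network call itself (for
example urllib's urlopen timeout argument) is another thread-safe route,
though it bounds each blocking socket operation rather than the whole
fetch.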

The only good news in this mess is that we're no longer using Evermore's
binary reporting mode.  That was the only device type that both (a) reported
uncorrected GPS time, and (b) never reported the leap offset.  Every other
device either reports UTC (after first ephemeris) or reports leap offset.

So with Evermore binary gone, alternative 3 (abolishing the automatic fetch)
becomes at least thinkable.

Those are the tradeoffs.  Discuss.
-- 
                Eric S. Raymond <http://www.catb.org/~esr/>


