[gpsd-dev] Mysteriously vanishing bugs don't make me happy


From: Eric S. Raymond
Subject: [gpsd-dev] Mysteriously vanishing bugs don't make me happy
Date: Mon, 4 Nov 2013 14:14:06 -0500 (EST)

The most serious bug I had remaining on my list was this one:

   The end-of-hunt guard at libgpsd_core.c:1234 needs to be fixed so it
   both doesn't choke on a SiRF-III and works on TCP/IP messages with 
   write boundaries in the middle of packets. (Savannah tracker bug  #36409)

My first fix for the TCP/IP bug worked, but Gary reported that it 
broke live-testing on a SiRF-III. I was able to reproduce this, so
I backed out the fix.  But that isn't good enough for a production
release; both cases need to work.

To attack this, I spent nearly two days upgrading the test machinery
so that I could reproduce Savannah bug #36409: "GPSD fails to start
get GPS data from tcp://location:port" inside our regression-test
environment.

This involved (a) making the test framework able to simulate a TCP
source (it could already do fake TTYs and UDP sources) and (b) adding
the ability to insert synthetic delays in logfiles, which create write
boundaries in the input.

In the process, I discovered a fun hack.  Bear with me, C programmers; 
there's nothing Python-specific about this example:

import select
import socket

# FakeGPS is the test framework's base class for fake sources.
class FakeTCP(FakeGPS):
    "A TCP serverlet with a test log ready to be cycled to it."
    def __init__(self, testload,
                 host, port,
                 progress=None):
        FakeGPS.__init__(self, testload, progress)
        self.host = host
        self.port = int(port)
        self.byname = "tcp://" + host + ":" + str(port)
        self.dispatcher = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        # This magic prevents "Address already in use" errors after
        # we release the socket.
        self.dispatcher.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        self.dispatcher.bind((self.host, self.port))
        self.dispatcher.listen(5)
        self.readables = [self.dispatcher]

    def read(self):
        "Handle connection requests and data."
        readable, _writable, _errored = select.select(self.readables, [], [], 0)
        for s in readable:
            if s == self.dispatcher:    # Connection request
                client_socket, _address = s.accept()
                self.readables = [client_socket]
                self.dispatcher.close()
            else:                       # Incoming data
                data = s.recv(1024)
                if not data:
                    s.close()
                    self.readables.remove(s)
                # Incoming-data handler goes here

    def write(self, line):
        "Send the next log packet to everybody connected."
        for s in self.readables:
            if s != self.dispatcher:
                s.send(line)

This is a serverlet class that *never blocks*. The trick is not doing
the (blocking) accept() call during the initialization stage.  Instead
you do it in your polling read() call. The read watches for your 
dispatcher socket becoming readable, which means somebody wants
to connect.  Only *then* does it create a connection socket with accept().
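
For anyone who wants to poke at the trick outside the test framework,
here is a minimal standalone sketch. PollingServerlet, poll() and demo()
are names made up for this illustration and are not part of gpsd. Unlike
FakeTCP above, this version keeps the listening socket open after the
first connection.

```python
import select
import socket
import time


class PollingServerlet:
    "Hypothetical stand-in: accept() runs only when select() reports a pending peer."

    def __init__(self, host="127.0.0.1", port=0):
        self.listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        self.listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        self.listener.bind((host, port))    # port 0 lets the OS pick a free port
        self.listener.listen(5)
        self.port = self.listener.getsockname()[1]
        self.clients = []

    def poll(self):
        "Accept a pending connection if there is one; return the client count."
        # A timeout of 0 makes select() return at once instead of waiting.
        readable, _writable, _errored = select.select([self.listener], [], [], 0)
        if self.listener in readable:
            client, _address = self.listener.accept()
            self.clients.append(client)
        return len(self.clients)

    def write(self, data):
        "Send data to every connected client."
        for client in self.clients:
            client.sendall(data)

    def close(self):
        "Close all client sockets and the listener."
        for client in self.clients:
            client.close()
        self.listener.close()


def demo():
    "Poll with nobody connected, then connect a peer and send it one sentence."
    server = PollingServerlet()
    before = server.poll()                  # no peer yet; comes straight back
    peer = socket.create_connection(("127.0.0.1", server.port), timeout=2)
    for _ in range(100):                    # poll until the connection shows up
        if server.poll():
            break
        time.sleep(0.01)
    server.write(b"$GPZDA,193223.00,23,12,2007,00,00*69\r\n")
    received = peer.recv(1024)
    peer.close()
    server.close()
    return before, received
```

Calling demo() polls once with no peer, then connects a local client and
sends it a single sentence. The point to notice is that the first poll()
returns immediately rather than sitting in accept().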

In the most general case, my test framework can now surround a gpsd
instance with any number of fake GPS sources, each cycling a specified
logfile until it's told to stop, with any desired mix of pseudo-TTYs,
TCP sources and UDP sources. The TCP fake sources are all instances
of this little serverlet thingy.

Using the new machinery, I wrote this test:

# Name: Generic NMEA
# Submitter: Eric S. Raymond <address@hidden>
# Date: 2013-11-03
# Transport: TCP
# Notes: A variant of the TCP testload intended to capture the error case
#        reported in Savannah tracker bug  #36409: "GPSD fails to start
#        get GPS data from tcp://location:port".  The delay cookies are
#        inserted to produce write boundaries that will be visible to the
#        packetizer.
# Delay-Cookie: | 0.01
,1.7,-30.40,M,-13.9,M,,*7D
$GPGGA,|19322|1.00,|2037.|72792,N,08704.08478,W,1,04,1.7,-30.40,M,-13.9,M,,*7D
$GPGSA,A,3,10,28,09,13,,,,,,,,,03.4,01.7,03.0*00
$GPGSV,3,1,12,28,14,150,41,09,15,254,41,10,43,192,47,13,06,081,36*7A
$GPGSV,3,2,12,02,56,323,,04,41,024,,12,31,317,,17,31,085,*72
$GPGSV,3,3,12,05,15,318,,24,02,246,,33,08,096,,35,45,118,*7D
$GPRMC,193221.00,A,2037.7279,N,08704.0848,W,00.1,201.8,231207,01,W,A*2D
$GPZDA,193223.00,23,12,2007,00,00*69
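
To make the delay cookies concrete, here is a rough sketch of how a
feeder might treat them: split the sentence at each cookie and pause
between the pieces. This is an assumption about the mechanics, not the
framework's actual code; split_on_cookie() and feed() are hypothetical
names, and the cookie and delay defaults are copied from the
Delay-Cookie header above. Whether each pause really surfaces as a
separate read on the receiving side depends on the network stack.

```python
import time


def split_on_cookie(line, cookie="|"):
    "Break a logged sentence into the chunks that get written separately."
    return line.split(cookie)


def feed(line, write, cookie="|", delay=0.01, sleep=time.sleep):
    "Write each chunk in turn, pausing between chunks to force a write boundary."
    chunks = split_on_cookie(line, cookie)
    for index, chunk in enumerate(chunks):
        if index:
            sleep(delay)        # pause before every chunk except the first
        write(chunk)
    return len(chunks)


# The GPGGA sentence from the test load, cookies included.
sentence = ("$GPGGA,|19322|1.00,|2037.|72792,N,08704.08478,W,"
            "1,04,1.7,-30.40,M,-13.9,M,,*7D\r\n")
writes = []
feed(sentence, writes.append)   # collect the chunks instead of sending them
```

Here the four cookies yield five writes, and joining them gives back the
sentence with the cookies stripped out.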

That reproduced the bug, all right.  So I reinstated the fix.  Then
I live-tested on a SiRF-III. And it *worked*.

I don't like it when previously reproducible bugs vanish without
reason.

Gary, please live-test on your SiRF-III.  If you get a hang we're
going to need to look seriously at the new hunt_failure() predicate
function and figure out what the right test is.
--
                                                >>esr>>


