Re: [gpsd-dev] gpsd unnoticed of a remote IP gps brutally unplugged


From: Bruno Coudoin
Subject: Re: [gpsd-dev] gpsd unnoticed of a remote IP gps brutally unplugged
Date: Thu, 09 Apr 2015 02:55:06 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.6.0


On 09/04/2015 02:20, Eric S. Raymond wrote:
> Bruno Coudoin <address@hidden>:
>> Hi,
>>
>> I am using gpsd with a remote TCP/IP GPS server through 'gpsd://' (a
>> CradlePoint IBR1100 router). In this case, if the remote GPS is abruptly
>> unplugged from its power supply, gpsd does not notice. It does try to
>> reconnect, but only long after the GPS server is back up (I saw one
>> 5-hour occurrence).
>>
>> In the traces I have this error, which shows the 5-hour delay:
>> GPS on gpsd://192.168.0.1:8889 returned error -1 (16332.187656 sec since
>> data)
>>
>> As it is, since we never talk to the device but only receive from it,
>> and since the server cannot close or shut down the socket, it is normal
>> that this goes unnoticed at the TCP level. One solution could be to add
>> a TCP keepalive. Another option is to add a timeout when we don't get
>> any traffic from a networked GPS device.
>>
>> Am I doing something wrong, or is there an option to handle that case?
>>
>> Bruno.
>>
>>
> I don't understand what problem you are trying to solve.
>
> GPSD is designed to recover from devices dropping out and then returning.
> In gpsd.c these definitions appear:
>
> /*
>  * Timeout policy.  We can't rely on clients closing connections
>  * correctly, so we need timeouts to tell us when it's OK to
>  * reclaim client fds.  COMMAND_TIMEOUT fends off programs
>  * that open connections and just sit there, not issuing a WATCH or
>  * doing anything else that triggers a device assignment.  Clients
>  * in watcher or raw mode that don't read their data will get dropped
>  * when throttled_write() fills up the outbound buffers and the
>  * NOREAD_TIMEOUT expires.
>  *
>  * RELEASE_TIMEOUT sets the amount of time we hold a device
>  * open after the last subscriber closes it; this is nonzero so a
>  * client that does open/query/close will have time to come back and
>  * do another single-shot query, if it wants to, before the device is
>  * actually closed.  The reason this matters is because some Bluetooth
>  * GPSes not only shut down the GPS receiver on close to save battery
>  * power, they actually shut down the Bluetooth RF stage as well and
>  * only re-wake it periodically to see if an attempt to raise the
>  * device is in progress.  The result is that if you close the device
>  * when it's powered up, a re-open can fail with EIO and needs to be
>  * tried repeatedly.  Better to avoid this...
>  *
>  * DEVICE_REAWAKE says how long to wait before repolling after a zero-length
>  * read. It's there so we avoid spinning forever on an EOF condition.
>  *
>  * DEVICE_RECONNECT sets interval on retries when (re)connecting to
>  * a device.
>  */
> #define COMMAND_TIMEOUT               60*15
> #define NOREAD_TIMEOUT                60*3
> #define RELEASE_TIMEOUT               60
> #define DEVICE_REAWAKE                0.01
> #define DEVICE_RECONNECT      2
>
> Is your issue that your local GPSD is waiting too long to try to reconnect
> with the remote?  Be aware that this won't happen at all unless someone is
> subscribed to the remote, e.g. actually has a client session open.
Hi,

I have reread the definitions and I think the problem I have is not
currently handled by any of these timeouts.

I know that GPSD needs to have a client connected to poll my TCP/IP GPS
and this is the case. More precisely, I have a client application
running on the same PC as GPSD. It can be represented by this diagram:

PC(GPSD CLIENT, GPSD)<----TCP/IP---->ROUTER(GPS SERVER)

What happens is that the ROUTER loses power but the PC does not. In this
case GPSD sits in its select() and nothing ever happens on the GPS
SERVER file descriptor: GPSD never writes to it, and the GPS SERVER
never got a chance to close the socket. So GPSD is never told that the
GPS SERVER is gone.
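
To make the other option from my first mail concrete (a timeout when no
traffic arrives), here is a minimal sketch. It is not gpsd code: the
helper name read_with_timeout() and the READ_TIMED_OUT value are made up
for the example, and it only assumes POSIX select() and read(). Without
a timeout, select() just keeps blocking on a peer that lost power,
because no FIN or RST ever arrives.

```c
#include <assert.h>
#include <sys/select.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical sentinel meaning "the peer stayed silent". */
#define READ_TIMED_OUT (-2)

/*
 * Hypothetical helper, not part of gpsd.
 * Waits up to timeout_ms for fd to become readable, then reads.
 * Returns: >0 bytes read, 0 on EOF (peer closed cleanly),
 *          READ_TIMED_OUT if nothing arrived in time, -1 on error.
 */
static ssize_t read_with_timeout(int fd, void *buf, size_t len,
                                 int timeout_ms)
{
    fd_set rfds;
    struct timeval tv;
    int n;

    assert(fd >= 0 && fd < FD_SETSIZE);
    assert(timeout_ms >= 0);

    FD_ZERO(&rfds);
    FD_SET(fd, &rfds);
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    n = select(fd + 1, &rfds, NULL, NULL, &tv);
    if (n < 0)
        return -1;              /* select() itself failed */
    if (n == 0)
        return READ_TIMED_OUT;  /* silent peer: dead or just idle */
    return read(fd, buf, len);  /* data, EOF (0), or error (-1) */
}
```

A caller would treat READ_TIMED_OUT as a reason to close and reconnect.
The drawback is that the timeout has to be longer than the longest
legitimate gap between reports from the device.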

I made the patch below, which fixes the issue by enabling TCP keepalive
on the device socket. I am not sure everybody wants it, and it should be
made configurable, but it gives the idea. I configured it to send a
keepalive probe after 60 seconds of inactivity and then to retry 3 times
at 10-second intervals.
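
As a rough check of those numbers: with 60 seconds idle and 3 probes 10
seconds apart, a dead peer should be detected after about
60 + 3 * 10 = 90 seconds. The sketch below shows that arithmetic and a
variant of the same setsockopt() calls that checks return values. The
helper names keepalive_detect_seconds() and enable_keepalive() are
hypothetical and are not in the patch. The #if guard is there on the
assumption that TCP_KEEPIDLE, TCP_KEEPINTVL and TCP_KEEPCNT are not
defined on every platform gpsd builds on.

```c
#include <assert.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Hypothetical helper: approximate seconds of silence before the
 * kernel gives up on the connection (idle wait plus all probes). */
static int keepalive_detect_seconds(int idle, int intvl, int cnt)
{
    return idle + intvl * cnt;
}

/*
 * Hypothetical helper, not the patch itself.
 * Enables keepalive on fd; where the per-socket tuning options exist,
 * also sets idle time, probe interval and probe count (seconds/count).
 * Returns 0 on success, -1 as soon as one setsockopt() call fails.
 */
static int enable_keepalive(int fd, int idle, int intvl, int cnt)
{
    int on = 1;

    assert(idle > 0 && intvl > 0 && cnt > 0);

    if (setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on)) != 0)
        return -1;
#if defined(TCP_KEEPIDLE) && defined(TCP_KEEPINTVL) && defined(TCP_KEEPCNT)
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE, &idle, sizeof(idle)) != 0)
        return -1;
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPINTVL, &intvl, sizeof(intvl)) != 0)
        return -1;
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPCNT, &cnt, sizeof(cnt)) != 0)
        return -1;
#else
    /* No per-socket tuning here: system-wide keepalive defaults apply. */
    (void)idle;
    (void)intvl;
    (void)cnt;
#endif
    return 0;
}
```

With the values from the patch, enable_keepalive(s, 60, 10, 3) would be
called right after the socket is switched to non-blocking mode.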

Bruno.

---
 netlib.c | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/netlib.c b/netlib.c
index 21bb6db..d57d8a8 100644
--- a/netlib.c
+++ b/netlib.c
@@ -15,6 +15,7 @@
 #ifndef INADDR_ANY
 #include <netinet/in.h>
 #endif /* INADDR_ANY */
+#include <netinet/tcp.h> /* for TCP keepalive */
 #include <arpa/inet.h>     /* for htons() and friends */
 #include <unistd.h>
 #endif /* S_SPLINT_S */
@@ -129,6 +130,21 @@ socket_t netlib_connectsock(int af, const char *host, const char *service,
     /* set socket to noblocking */
     (void)fcntl(s, F_SETFL, fcntl(s, F_GETFL) | O_NONBLOCK);
 
+    {
+      /* Set the keepalive option active */
+      int optval = 1;
+      setsockopt(s, SOL_SOCKET, SO_KEEPALIVE, &optval, sizeof(optval));
+
+      optval = 60;
+      setsockopt(s, IPPROTO_TCP, TCP_KEEPIDLE, &optval, sizeof(optval));
+
+      optval = 10;
+      setsockopt(s, IPPROTO_TCP, TCP_KEEPINTVL, &optval, sizeof(optval));
+
+      optval = 3;
+      setsockopt(s, IPPROTO_TCP, TCP_KEEPCNT, &optval, sizeof(optval));
+    }
+
     return s;
     /*@ +type +mustfreefresh @*/
 }
-- 
1.9.1





