Re: [lwip-users] Debugging a hang in an lwIP-based application


From: Freddie Chopin
Subject: Re: [lwip-users] Debugging a hang in an lwIP-based application
Date: Tue, 26 Feb 2019 08:33:31 +0100
User-agent: Evolution 3.30.5

On Fri, 2019-02-22 at 10:23 -0300, Sergio R. Caprile wrote:
> Unfortunately I only know the RAW API and can't help you further. Did
> you check the basics?
> - Core lwIP runs in a single thread. If your Ethernet is handled in
>   another thread, you don't call any lwIP functions from there, except
>   for the pbuf allocation/free functions; you queue your packets and
>   the core lwIP thread (tcpip_thread) will take them out of the queue
>   and handle them later.
> - "One thread per socket", quote: "Netconn or Socket API functions are
>   thread safe against the core thread but they are not reentrant at
>   the control block granularity level. That is, a UDP or TCP control
>   block must not be shared among multiple threads without proper
>   locking."
>   https://www.nongnu.org/lwip/2_1_x/multithreading.html
> 
> This is as far as I can go.

Hello Sergio!

Thanks for your input. I checked your suggestions last week, and it
seems I do follow these rules. The Ethernet input thread only calls
netif_set_link_down()/netif_set_link_up(), netif.input() and
pbuf_alloc()/pbuf_free(). All multithreaded uses of netconns are
properly synchronized with mutexes.

In the end it turned out that, as usual, it was an operator error ;)
My application was using one lwIP timer to notify the watchdog manager
that lwIP's main thread is still running. As I did not change the
default number of timers (MEMP_NUM_SYS_TIMEOUT), there were not enough
of them and tcp_poll() was _NOT_ running as it should. I guess that
fits the symptoms I observed.
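
To illustrate the mechanism, here is a minimal, self-contained sketch of
a fixed-size timeout pool. It is only a model and NOT lwIP source code:
the names timeout_pool, pool_add_timeout() and tcp_timer_starts() are
hypothetical, and the registration order (one stack timer first, then
the application's watchdog timer, then a TCP timer that is started only
when it is first needed) is an assumption chosen to match the symptoms
described above.

```c
/* Simplified model of a fixed-size timeout pool. NOT lwIP source code. */
#include <stdbool.h>
#include <stddef.h>

#define POOL_CAPACITY_MAX 8

typedef void (*timeout_fn)(void *arg);

struct timeout_pool {
    size_t capacity;                       /* plays the role of MEMP_NUM_SYS_TIMEOUT */
    size_t used;                           /* number of slots currently taken */
    timeout_fn handlers[POOL_CAPACITY_MAX];
};

void pool_init(struct timeout_pool *pool, size_t capacity)
{
    /* Clamp to the size of the backing array. */
    pool->capacity = capacity > POOL_CAPACITY_MAX ? POOL_CAPACITY_MAX : capacity;
    pool->used = 0;
}

/* Returns false when the pool has no free slot left. */
bool pool_add_timeout(struct timeout_pool *pool, timeout_fn fn)
{
    if (pool->used >= pool->capacity) {
        return false;                      /* pool exhausted: the timer never runs */
    }
    pool->handlers[pool->used++] = fn;
    return true;
}

/* Hypothetical timer handlers; their bodies do not matter for the model. */
void stack_arp_timer(void *arg)    { (void)arg; }
void app_watchdog_timer(void *arg) { (void)arg; }
void stack_tcp_timer(void *arg)    { (void)arg; }

/* Returns true if the lazily started TCP timer still gets a slot. */
bool tcp_timer_starts(size_t capacity)
{
    struct timeout_pool pool;
    pool_init(&pool, capacity);
    (void)pool_add_timeout(&pool, stack_arp_timer);    /* stack's own timer */
    (void)pool_add_timeout(&pool, app_watchdog_timer); /* extra application timer */
    return pool_add_timeout(&pool, stack_tcp_timer);   /* started last, on demand */
}
```

In this model, a pool sized only for the stack's own two timers leaves
no slot for the TCP timer once the application adds its watchdog timer,
while one extra slot is enough. In a real project the corresponding
setting is MEMP_NUM_SYS_TIMEOUT in lwipopts.h, which has to account for
every timeout the application itself creates.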

Finding the cause was as easy as enabling lwIP's asserts. What a pity I
did not do that much earlier, but well... you learn every day...
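
For reference, a configuration sketch of what "enabling asserts" can
look like. LWIP_NOASSERT and LWIP_PLATFORM_ASSERT are lwIP's own
configuration macros; the file layout and the printf()/abort() handler
body are assumptions about a typical port, not a copy of my actual
setup.

```c
/* lwipopts.h: leave LWIP_NOASSERT undefined so that LWIP_ASSERT()
 * checks stay compiled in. */
/* #define LWIP_NOASSERT 1 */

/* arch/cc.h of the port: report the failed assertion and stop.
 * Assumes printf() and abort() are usable on the target. */
#include <stdio.h>
#include <stdlib.h>

#define LWIP_PLATFORM_ASSERT(x) do { \
    printf("lwIP assertion \"%s\" failed at %s:%d\n", (x), __FILE__, __LINE__); \
    abort(); \
} while (0)
```

On a bare-metal target the handler would more likely log over a debug
UART and halt or reset instead of calling abort().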

While we're at it, I have one more question. After the fix described
above, the application runs much better and the connection looks a lot
more stable and reliable. However, I still see rare TCP retransmissions
and duplicate ACKs in Wireshark (maybe once or twice an hour). The
hardware side of the connection is pretty simple in my case: my PC is
connected with a cable to a router, which is connected with a cable to
a switch, which is in turn connected with cables to the devices (there
are four of them). I have very little experience with debugging network
connections, so I would like to ask: is this (low) number of
retransmissions and duplicate ACKs normal and expected, or is any
number higher than zero a symptom of some problem that I should worry
about?

Again - thanks in advance for all the help!

Regards,
FCh



