qemu-devel

Re: [PATCH 3/4] block/nbd: fix reconnect-delay


From: Eric Blake
Subject: Re: [PATCH 3/4] block/nbd: fix reconnect-delay
Date: Wed, 23 Sep 2020 10:15:43 -0500

On 9/3/20 2:03 PM, Vladimir Sementsov-Ogievskiy wrote:
reconnect-delay has a design flaw: we handle it in the same loop in
which we make the connection attempt. So reconnect-delay may be
exceeded by the unpredictable duration of a connection attempt.

Let's use a separate timer instead.
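
To illustrate the difference, here is a toy model, not code from the
patch. It assumes every connection attempt fails and blocks for a known
duration; the function names and the 120-second attempt are made up for
the example, and only the 15-second reconnect-delay comes from the
reproduction case.

```python
def give_up_time_same_loop(attempt_durations, reconnect_delay):
    """Flawed scheme: the delay is checked only between attempts."""
    now = 0.0
    for duration in attempt_durations:
        if now >= reconnect_delay:
            break
        # A blocking attempt runs to completion before the next check.
        now += duration
    return now


def give_up_time_separate_timer(attempt_durations, reconnect_delay):
    """Separate timer: it fires at reconnect_delay even mid-attempt."""
    now = 0.0
    for duration in attempt_durations:
        if now + duration >= reconnect_delay:
            # The timer expires while this attempt is still pending.
            return reconnect_delay
        now += duration
    # The simulated attempts ran out before the timer fired.
    return now


# Hypothetical durations: one quick failure, then a connect() that
# hangs for 120 seconds against an unreachable host.
attempts = [1.0, 120.0]
same_loop = give_up_time_same_loop(attempts, 15.0)
separate = give_up_time_separate_timer(attempts, 15.0)
```

In this model the same-loop scheme gives up only once the hung attempt
returns, while the separate timer gives up at reconnect-delay.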

How to reproduce the bug:



This will make the connect() call of qemu-io at node2 take a long time.

And you'll see that the read command in qemu-io hangs for a long time,
more than the 15 seconds specified by the reconnect-delay parameter.
That is the bug.

8. Don't forget to drop the iptables rule on node1:

    sudo iptables -D INPUT -p tcp --dport 10809 -j DROP

Important note: Step [5] is necessary to reproduce _this_ bug. If we
skip step [5], the read command (step 6) will still hang for a long
time, but this commit doesn't help: the hang is then caused not by a
long connect() to an unreachable host but by a long sendmsg() to an
unreachable host. That should be fixed by enabling and tuning
keep-alive on the socket, which is a matter for a further patch set.
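
For reference, enabling and tuning keep-alive on a TCP socket might look
roughly like the sketch below. This is not QEMU code; the helper name
and the idle/interval/probe values are assumptions chosen for
illustration, and the fine-grained options are platform-specific, so
they are set only where the platform exposes them.

```python
import socket


def enable_keepalive(sock, idle=10, interval=5, probes=3):
    """Turn on TCP keep-alive and tune it where the platform allows.

    idle, interval and probes are illustrative values, not ones
    proposed in this thread.
    """
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    tunables = (
        ("TCP_KEEPIDLE", idle),       # seconds idle before first probe
        ("TCP_KEEPINTVL", interval),  # seconds between probes
        ("TCP_KEEPCNT", probes),      # failed probes before giving up
    )
    for name, value in tunables:
        option = getattr(socket, name, None)
        if option is not None:
            sock.setsockopt(socket.IPPROTO_TCP, option, value)
```

With such settings the kernel probes an idle peer and reports an error
on the socket after the probes go unanswered, so an unreachable host is
detected without waiting for the default timeouts.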

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
  block/nbd.c | 59 +++++++++++++++++++++++++++++++++++++++++++++--------
  1 file changed, 50 insertions(+), 9 deletions(-)

Reviewed-by: Eric Blake <eblake@redhat.com>

--
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org



