Re: [PATCH 6/7] block/nbd: decouple reconnect from drain
From: Vladimir Sementsov-Ogievskiy
Subject: Re: [PATCH 6/7] block/nbd: decouple reconnect from drain
Date: Mon, 15 Mar 2021 23:10:14 +0300
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.8.1
15.03.2021 09:06, Roman Kagan wrote:
> The reconnection logic doesn't need to stop while in a drained section.
> Moreover it has to be active during the drained section, as the requests
> that were caught in-flight with the connection to the server broken can
> only usefully get drained if the connection is restored. Otherwise such
> requests can only either stall resulting in a deadlock (before
> 8c517de24a), or be aborted defeating the purpose of the reconnection
> machinery (after 8c517de24a).
>
> Since the pieces of the reconnection logic are now properly migrated
> from one aio_context to another, it appears safe to just stop messing
> with the drained section in the reconnection code.
>
> Fixes: 5ad81b4946 ("nbd: Restrict connection_co reentrance")
I wouldn't say that it "fixes" it. The behavior changes, but 5ad81b4946 didn't
introduce any bugs.
> Fixes: 8c517de24a ("block/nbd: fix drain dead-lock because of nbd
> reconnect-delay")
And the same goes for this one:
1. There is an existing problem in QEMU (unrelated to nbd): a long I/O request
that we have to wait for at drained_begin may trigger a deadlock
(https://lists.gnu.org/archive/html/qemu-devel/2020-09/msg01339.html)
2. So, when nbd reconnect is in play (and therefore long I/O requests), we simply
trigger this deadlock. That's why I decided to cancel the requests (assuming
they would most probably fail anyway).
I agree that the nbd driver is the wrong place to fix the problem described in
(https://lists.gnu.org/archive/html/qemu-devel/2020-09/msg01339.html), but if
you just revert 8c517de24a, you'll see the deadlock again.
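
To make the trade-off concrete, here is a toy model of the two behaviors. It is a sketch under loud assumptions, not QEMU code: `ToyNbdClient`, `poll`, `abort_on_drain` and `max_polls` are hypothetical names, and the model does not reproduce the actual deadlock from the linked thread. It only shows that a drain which waits for in-flight requests lasts as long as the reconnect does, while cancelling the requests lets the drain finish immediately at the cost of failing them.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ToyNbdClient:
    """Hypothetical stand-in for an NBD client whose connection is broken."""
    in_flight: int        # requests waiting for the connection to come back
    reconnect_in: int     # event-loop iterations until the server is reachable
    abort_on_drain: bool  # True: cancel pending requests when a drain starts

    def poll(self) -> None:
        """Advance one event-loop iteration."""
        if self.reconnect_in > 0:
            self.reconnect_in -= 1
        if self.reconnect_in == 0:
            # Connection restored: every pending request can now complete.
            self.in_flight = 0


def drained_begin(client: ToyNbdClient, max_polls: int) -> Optional[int]:
    """Poll until no request is in flight.

    Returns the number of polls used, or None when the poll budget runs
    out, which stands in here for a drain that never gets to finish.
    """
    if client.abort_on_drain:
        # Fail the pending requests right away instead of waiting.
        client.in_flight = 0
    polls = 0
    while client.in_flight > 0:
        if polls == max_polls:
            return None
        client.poll()
        polls += 1
    return polls


# Waiting for the reconnect: the drain takes as long as the reconnect does.
print(drained_begin(ToyNbdClient(3, 5, abort_on_drain=False), max_polls=100))
# Same situation with a budget shorter than the reconnect delay.
print(drained_begin(ToyNbdClient(3, 5, abort_on_drain=False), max_polls=3))
# Cancelling on drain: the drain completes at once, but the requests fail.
print(drained_begin(ToyNbdClient(3, 5, abort_on_drain=True), max_polls=3))
```

In this model the second call is the case that a plain revert would bring back, and the third is the behavior I chose.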
--
Best regards,
Vladimir