qemu-devel

Re: [PATCH 1/2] migration: Fix rdma migration failed


From: Fabiano Rosas
Subject: Re: [PATCH 1/2] migration: Fix rdma migration failed
Date: Wed, 20 Sep 2023 09:46:14 -0300

Li Zhijian <lizhijian@fujitsu.com> writes:

> From: Li Zhijian <lizhijian@cn.fujitsu.com>
>
> Destination will fail with:
> qemu-system-x86_64: rdma: Too many requests in this message (3638950032).Bailing.
>
> Migration with RDMA is different from TCP. RDMA has its own control
> messages, and all traffic between RDMA_CONTROL_REGISTER_REQUEST and
> RDMA_CONTROL_REGISTER_FINISHED should not be disturbed.

Yeah, this is really fragile. We need a long-term solution to this. Any
other change to the multifd protocol, or to the migration RAM handling,
might hit this issue again.

Perhaps commit 294e5a4034 ("multifd: Only flush once each full round of
memory") should simply not have touched the stream at that point, but we
don't have any explicit safeguards to avoid interleaving flags from
different layers like that (assuming multifd is at another logical layer
than the ram handling).
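
To illustrate what an explicit safeguard against interleaving might look
like, here is a minimal, self-contained sketch. It is not QEMU code: the
SketchStream type, the sketch_layer enum and sketch_stream_put() are all
hypothetical names, and the rule that only the RDMA control layer may write
inside the registration window is an assumption made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical layers that may write to the migration stream. */
enum sketch_layer {
    SKETCH_LAYER_RAM,           /* ordinary RAM handling */
    SKETCH_LAYER_MULTIFD,       /* multifd flags such as a flush marker */
    SKETCH_LAYER_RDMA_CONTROL,  /* RDMA control messages */
};

/* Hypothetical per-stream state; not a QEMU structure. */
typedef struct {
    bool in_window;  /* true between REGISTER_REQUEST and REGISTER_FINISHED */
} SketchStream;

/* Open the registration window. Fails if it is already open. */
static bool sketch_window_begin(SketchStream *s)
{
    assert(s != NULL);
    if (s->in_window) {
        return false;
    }
    s->in_window = true;
    return true;
}

/* Close the registration window. Fails if it is not open. */
static bool sketch_window_end(SketchStream *s)
{
    assert(s != NULL);
    if (!s->in_window) {
        return false;
    }
    s->in_window = false;
    return true;
}

/*
 * Decide whether a layer may write to the stream right now.
 * Assumption: inside the window only the RDMA control layer may write;
 * outside the window every layer may write.
 */
static bool sketch_stream_put(const SketchStream *s, enum sketch_layer layer)
{
    assert(s != NULL);
    if (s->in_window) {
        return layer == SKETCH_LAYER_RDMA_CONTROL;
    }
    return true;
}
```

With a check of this kind, a multifd flag written inside the window would
be rejected on the source side instead of surfacing as a confusing error on
the destination.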

I don't have any good suggestions at this moment, so for now:

Reviewed-by: Fabiano Rosas <farosas@suse.de>


