From: Hao Xiang
Subject: Re: [External] Re: [PATCH 01/16] Cherry pick a set of patches that enables multifd zero page feature.
Date: Fri, 27 Oct 2023 18:06:05 -0700

On Fri, Oct 27, 2023 at 5:30 AM Fabiano Rosas <farosas@suse.de> wrote:
>
> Hao Xiang <hao.xiang@bytedance.com> writes:
>
> > Juan Quintela had a patchset enabling zero page checking in multifd
> > threads.
> >
> > https://lore.kernel.org/all/20220802063907.18882-13-quintela@redhat.com/
>
> Hmm, risky to base your series on code more than a year old. We should
> bother Juan so he sends an updated version for review.
>
> I have concerns about that series. First is why are we doing payload
> processing (i.e. zero page detection) in the multifd thread. And that
> affects your series directly, because AFAICS we're now doing more
> processing still.
>

I am pretty new to QEMU, so my take could be wrong; we can wait for Juan
to comment here. My understanding is that the migration main loop was
originally designed around a single sender thread (before the multifd
feature). Zero page checking is a CPU-intensive operation, so with
multifd we scaled up the number of sender threads in order to saturate
the network. Doing zero page checking in the main loop is not going to
scale with this new design. In fact, we (Bytedance) have merged Juan's
change into our internal QEMU and have been using this feature since
last year; I was told it improved performance quite significantly.
Ideally, I would love to see zero page checking done in a separate
thread pool so we can scale it independently from the sender threads,
but doing it in the sender threads is an inexpensive way to scale.
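
To illustrate why the check is CPU intensive and why it parallelizes
well: it is a full byte scan of every page. Below is a minimal,
self-contained sketch of the per-page work that moves into the sender
threads. The struct, function names, and fixed page count are
illustrative only, and the naive scan stands in for QEMU's vectorized
buffer_is_zero() from util/bufferiszero.c:

    #include <stdbool.h>
    #include <stddef.h>
    #include <stdint.h>

    #define PAGE_SIZE        4096
    #define PACKET_MAX_PAGES 128

    /* Naive stand-in for QEMU's vectorized buffer_is_zero()
     * (util/bufferiszero.c); same result, much slower. */
    static bool page_is_zero(const uint8_t *buf, size_t len)
    {
        for (size_t i = 0; i < len; i++) {
            if (buf[i]) {
                return false;
            }
        }
        return true;
    }

    /* Hypothetical per-packet bookkeeping, for illustration only. */
    typedef struct {
        uint32_t normal_num;                       /* pages with payload */
        uint32_t zero_num;                         /* offset-only pages  */
        uint64_t normal_offsets[PACKET_MAX_PAGES];
        uint64_t zero_offsets[PACKET_MAX_PAGES];
    } SketchPacket;

    /* The per-page work that moves off the migration main loop: each
     * sender thread scans its own batch of pages, so the memory scan
     * parallelizes across threads. */
    static void classify_pages(SketchPacket *p, const uint8_t *ram_base,
                               const uint64_t *offsets, uint32_t num_pages)
    {
        p->normal_num = 0;
        p->zero_num = 0;
        for (uint32_t i = 0; i < num_pages; i++) {
            const uint8_t *page = ram_base + offsets[i];
            if (page_is_zero(page, PAGE_SIZE)) {
                /* Zero page: record only the offset; the destination
                 * clears the page locally, no payload on the wire. */
                p->zero_offsets[p->zero_num++] = offsets[i];
            } else {
                p->normal_offsets[p->normal_num++] = offsets[i];
            }
        }
    }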


> Second is more abstract but the multifd packet header is becoming just
> about small details about pages. We should probably take the time now
> and split that into a multifd header and a payload specific header. With
> some versioning stuck to them for migration compatibility.
>
> Now, I don't want to block this series due to my idealistic views on the
> code base, so I'll keep those aside while reviewing this, but I
> definitely think we should look at the big picture before we get too
> tangled up.
>

Totally agree. I actually have a local implementation that does exactly
that. The problem I see is that we use a fixed page count per payload,
while the payload size varies depending on how many zero pages are
actually detected. The sender/receiver pair runs a synchronous loop on
payload transfer, and with a long fat pipe the current behavior is not
optimal for network bandwidth utilization. We could instead make sure
we accumulate enough normal pages before sending a large packet, and
for zero pages, accumulate them until we have a very large page count
and send them all at once.
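
To make the header split concrete, one possible shape for it is
sketched below as C structs. This is my own illustration of the idea,
not a proposed wire format; every name, field, and size here is
hypothetical:

    #include <stdint.h>

    /* Common transport header, versioned independently of any payload. */
    typedef struct {
        uint32_t magic;           /* identifies a multifd stream        */
        uint32_t header_version;  /* bump on incompatible header change */
        uint16_t payload_type;    /* e.g. RAM pages, device state, ...  */
        uint16_t payload_version; /* versioned per payload type         */
        uint32_t flags;
        uint32_t payload_len;     /* bytes following this header        */
    } MultiFDTransportHeader;

    /* RAM-specific header: page counts are carried per packet, so the
     * number of normal and zero pages can vary from packet to packet
     * instead of being pinned to a fixed page count. */
    typedef struct {
        uint64_t packet_num;
        uint32_t normal_num;      /* pages with data in the payload */
        uint32_t zero_num;        /* pages sent as offsets only     */
        char     ramblock_name[256];
        /* followed by (normal_num + zero_num) page offsets,
         * then normal_num pages of data */
    } MultiFDRamHeader;

With the page counts carried in the payload-specific header rather
than fixed, a sender could batch normal pages into large packets and
accumulate zero-page offsets separately, which fits the accumulation
scheme above.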


