From: Juan Quintela
Subject: Re: [PATCH v3 8/9] tests/qtest: make more migration pre-copy scenarios run non-live
Date: Fri, 02 Jun 2023 00:55:59 +0200
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/28.2 (gnu/linux)

Daniel P. Berrangé <berrange@redhat.com> wrote:
> On Thu, Jun 01, 2023 at 11:53:17AM -0400, Peter Xu wrote:
>> On Thu, Jun 01, 2023 at 04:39:48PM +0100, Daniel P. Berrangé wrote:
>> > On Thu, Jun 01, 2023 at 11:30:10AM -0400, Peter Xu wrote:
>> > > Thanks for looking into this.. definitely worthwhile.
>> > > 
>> > > On Wed, May 31, 2023 at 02:23:59PM +0100, Daniel P. Berrangé wrote:
>> > > > There are 27 pre-copy live migration scenarios being tested. In all of
>> > > > these we force non-convergence and run for one iteration, then let it
>> > > > converge and wait for completion during the second (or following)
>> > > > iterations. At 3 mbps bandwidth limit the first iteration takes a very
>> > > > long time (~30 seconds).
>> > > > 
>> > > > While it is important to test the migration passes and convergence
>> > > > logic, it is overkill to do this for all 27 pre-copy scenarios. The
>> > > > TLS migration scenarios in particular are merely exercising different
>> > > > code paths during connection establishment.
>> > > > 
>> > > > To optimize time taken, switch most of the test scenarios to run
>> > > > non-live (ie guest CPUs paused) with no bandwidth limits. This gives
>> > > > a massive speed up for most of the test scenarios.
>> > > > 
>> > > > For test coverage the following scenarios are unchanged
>> > > 
>> > > Curious how the ones below were chosen?  I assume..
>> > 
>> > Chosen based on whether they exercise code paths that are unique
>> > and interesting during the RAM transfer phase.
>> > 
>> > Essentially the goal is that if we have N% code coverage before this
>> > patch, then we should still have the same N% code coverage after this
>> > patch.
>> > 
>> > The TLS tests exercise code paths that are unique during the migration
>> > establishment phase. Once established they don't exercise anything
>> > "interesting" during RAM transfer phase. Thus we don't lose code coverage
>> > by running TLS tests non-live.
>> > 
>> > > 
>> > > > 
>> > > >  * Precopy with UNIX sockets
>> > > 
>> > > this one verifies dirty log.
>> > > 
>> > > >  * Precopy with UNIX sockets and dirty ring tracking
>> > > 
>> > > ... dirty ring...
>> > > 
>> > > >  * Precopy with XBZRLE
>> > > 
>> > > ... xbzrle I think needs a diff on old/new, makes sense.
>> > > 
>> > > >  * Precopy with UNIX compress
>> > > >  * Precopy with UNIX compress (nowait)
>> > > >  * Precopy with multifd
>> > > 
>> > > What about the remaining three?  Especially the two compression tests.
>> > 
>> > The compress thread logic is unique/interesting during RAM transfer
>> > so benefits from running live. The wait vs non-wait scenario tests
>> > a distinct codepath/logic.
>> 
>> I assume you mean e.g. when compressing with guest page being modified and
>> we should survive that rather than crashing the compressor?
>
> No, I mean the compression code has a significant behaviour difference
> between its two tests, because they toggle:
>
>  @compress-wait-thread: Controls behavior when all compression
>      threads are currently busy.  If true (default), wait for a free
>      compression thread to become available; otherwise, send the page
>      uncompressed.  (Since 3.1)
>
> so we need to exercise the code path that falls back to sending
> uncompressed, as well as the code path that waits for free threads.
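
For reference, the two paths can be modelled with a toy sketch in Python.
This is not the QEMU code: send_page, idle_threads and the returned labels
are made up for illustration, and zlib only stands in for whatever
compressor is configured. The real "wait" path blocks until a thread is
free; here it is reduced to a label.

```python
import zlib


def send_page(page: bytes, idle_threads: int, compress_wait_thread: bool = True):
    """Toy model of the compress-wait-thread decision (hypothetical helper).

    Returns a (path, payload) tuple describing what would go on the wire.
    """
    if idle_threads > 0:
        # A compression thread is free: hand the page to it.
        return "compressed", zlib.compress(page)
    if compress_wait_thread:
        # All threads busy and waiting is enabled: the real code would block
        # here until a thread frees up, then compress as usual.
        return "waited-then-compressed", zlib.compress(page)
    # All threads busy and waiting is disabled: fall back to the raw page.
    return "uncompressed", page


if __name__ == "__main__":
    page = bytes(4096)  # one zero-filled 4 KiB page
    for idle, wait in [(1, True), (0, True), (0, False)]:
        path, payload = send_page(page, idle, wait)
        print(idle, wait, path, len(payload))
```

A test that never drives all threads busy only ever hits the first branch,
which is why the wait and nowait scenarios are listed separately.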

It doesn't work.
I think that I am going to just drop it for this iteration.

I tried 2 or 3 years ago to get a compression test running and was not
able to get it to work.

I moved compression on top of multifd, which is much, much faster and much
cleaner (each compression method is around 50 lines of code).

Lukas tried this time and he was not able to get it working either.

So I have no hope at all for this code.

To add insult to injury, it copies things so many times that it is just not
worth it.

Later, Juan.