qemu-devel


From: Daniel P. Berrangé
Subject: Re: [PATCH v3 8/9] tests/qtest: make more migration pre-copy scenarios run non-live
Date: Thu, 1 Jun 2023 17:35:07 +0100
User-agent: Mutt/2.2.9 (2022-11-12)

On Thu, Jun 01, 2023 at 12:17:38PM -0400, Peter Xu wrote:
> On Thu, Jun 01, 2023 at 04:55:25PM +0100, Daniel P. Berrangé wrote:
> > On Thu, Jun 01, 2023 at 11:53:17AM -0400, Peter Xu wrote:
> > > On Thu, Jun 01, 2023 at 04:39:48PM +0100, Daniel P. Berrangé wrote:
> > > > On Thu, Jun 01, 2023 at 11:30:10AM -0400, Peter Xu wrote:
> > > > > Thanks for looking into this.. definitely worthwhile.
> > > > > 
> > > > > On Wed, May 31, 2023 at 02:23:59PM +0100, Daniel P. Berrangé wrote:
> > > > > > There are 27 pre-copy live migration scenarios being tested. In all of
> > > > > > these we force non-convergence and run for one iteration, then let it
> > > > > > converge and wait for completion during the second (or following)
> > > > > > iterations. At a 3 Mbps bandwidth limit the first iteration takes a
> > > > > > very long time (~30 seconds).
> > > > > > 
> > > > > > While it is important to test the migration passes and convergence
> > > > > > logic, it is overkill to do this for all 27 pre-copy scenarios. The
> > > > > > TLS migration scenarios in particular are merely exercising different
> > > > > > code paths during connection establishment.
> > > > > > 
> > > > > > To optimize time taken, switch most of the test scenarios to run
> > > > > > non-live (ie guest CPUs paused) with no bandwidth limits. This gives
> > > > > > a massive speed up for most of the test scenarios.
> > > > > > 
> > > > > > For test coverage the following scenarios are unchanged:
> > > > > 
> > > > > Curious how the ones below were chosen?  I assume..
> > > > 
> > > > Chosen based on whether they exercise code paths that are unique
> > > > and interesting during the RAM transfer phase.
> > > > 
> > > > Essentially the goal is that if we have N% code coverage before this
> > > > patch, then we should still have the same N% code coverage after this
> > > > patch.
> > > > 
> > > > The TLS tests exercise code paths that are unique during the migration
> > > > establishment phase. Once established they don't exercise anything
> > > > "interesting" during the RAM transfer phase. Thus we don't lose code
> > > > coverage by running TLS tests non-live.
> > > > 
> > > > > 
> > > > > > 
> > > > > >  * Precopy with UNIX sockets
> > > > > 
> > > > > this one verifies dirty log.
> > > > > 
> > > > > >  * Precopy with UNIX sockets and dirty ring tracking
> > > > > 
> > > > > ... dirty ring...
> > > > > 
> > > > > >  * Precopy with XBZRLE
> > > > > 
> > > > > ... xbzrle I think needs a diff on old/new, makes sense.
> > > > > 
> > > > > >  * Precopy with UNIX compress
> > > > > >  * Precopy with UNIX compress (nowait)
> > > > > >  * Precopy with multifd
> > > > > 
> > > > > What about the other three?  Especially the two compression tests.
> > > > 
> > > > The compress thread logic is unique/interesting during RAM transfer
> > > > so benefits from running live. The wait vs non-wait scenario tests
> > > > a distinct codepath/logic.
> > > 
> > > I assume you mean e.g. compressing while a guest page is being modified,
> > > and we should survive that rather than crash the compressor?
> > 
> > No, I mean the compression code has a significant behaviour difference
> > between its two tests, because they toggle:
> > 
> >  @compress-wait-thread: Controls behavior when all compression
> >      threads are currently busy.  If true (default), wait for a free
> >      compression thread to become available; otherwise, send the page
> >      uncompressed.  (Since 3.1)
> > 
> > so we need to exercise the code path that falls back to sending
> > uncompressed, as well as the code path that waits for free threads.
> 
> But then the question is why live is needed?
>
> IIUC whether the wait behaviour triggers has nothing directly to do with
> whether the VM is live, only with whether all compress threads are busy.
> IOW, IIUC all compress paths will be tested even non-live, as long as we
> feed enough pages to the compressor threads.

If non-live the guest won't have dirtied any pages, so I wasn't
confident there would be a sufficient amount of dirty RAM to send
to trigger this. Possibly that's being too paranoid, though.

With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|
