Re: [PULL 00/18] migration queue
From: Peter Xu
Subject: Re: [PULL 00/18] migration queue
Date: Tue, 15 Mar 2022 10:41:12 +0800
On Mon, Mar 14, 2022 at 06:53:29PM +0000, Daniel P. Berrangé wrote:
> On Mon, Mar 14, 2022 at 06:20:54PM +0000, Dr. David Alan Gilbert wrote:
> > * Peter Maydell (peter.maydell@linaro.org) wrote:
> > > On Mon, 14 Mar 2022 at 17:55, Dr. David Alan Gilbert
> > > <dgilbert@redhat.com> wrote:
> > > >
> > > > Peter Maydell (peter.maydell@linaro.org) wrote:
> > > > > One thing that makes this bug investigation trickier, incidentally,
> > > > > is that the migration-test code seems to depend on userfaultfd.
> > > > > That means you can't run it under 'rr'.
> > > >
> > > > That should only be the postcopy tests; the others shouldn't use that.
> > >
> > > tests/qtest/migration-test.c:main() exits immediately without adding
> > > any of the test cases if ufd_version_check() fails, so no userfaultfd
> > > means no tests run at all, currently.
> >
> > Ouch! I could swear we had a fix for that.
https://lore.kernel.org/qemu-devel/20210615175523.439830-2-peterx@redhat.com/
As far as I remember, the pull containing this patch ran into problems
when it was applied, and the patch was then forgotten.
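To illustrate the intent (this is not the patch itself), here is a minimal
sketch in which the tests that don't need userfaultfd are always registered
and only the postcopy case is gated on the check. The registry,
register_migration_tests() and the has_uffd flag are hypothetical stand-ins
for the real harness, and the test paths are illustrative only.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_TESTS 16

/* Hypothetical registry standing in for the qtest harness. */
static const char *registered[MAX_TESTS];
static size_t n_registered;

static void register_test(const char *path)
{
    if (n_registered < MAX_TESTS) {
        registered[n_registered++] = path;
    }
}

/*
 * Register the tests that do not depend on userfaultfd unconditionally;
 * add the postcopy test only when userfaultfd is usable.
 * Returns the number of tests registered.
 */
static size_t register_migration_tests(bool has_uffd)
{
    n_registered = 0;

    register_test("/migration/precopy/unix");      /* illustrative path */
    register_test("/migration/multifd/tcp/zlib");  /* illustrative path */

    if (has_uffd) {
        register_test("/migration/postcopy/unix"); /* illustrative path */
    }
    return n_registered;
}

/* Helper for callers that want to check whether a test was registered. */
static bool is_registered(const char *path)
{
    for (size_t i = 0; i < n_registered; i++) {
        if (strcmp(registered[i], path) == 0) {
            return true;
        }
    }
    return false;
}
```

With has_uffd == false the precopy and multifd cases still get registered,
so something like 'rr' would at least have tests to run.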
> >
> > Anyway, it would be really good to see what migrate-query was returning;
> > if it's stuck in running or cancelling then it's a problem with multifd
> > that needs to learn to let go if someone is trying to cancel.
> > If it's failed or similar then the test needs fixing to not lockup.
>
> This patch of mine may well be helpful:
>
> https://lists.gnu.org/archive/html/qemu-devel/2022-03/msg03192.html
>
> when debugging my TLS tests various mistakes meant I ended up with
> a failed session, but the test was spinning forever on 'query-migrate'.
> It was waiting for it to finish one iteration, and never bothering to
> validate that the reported status == active.
>
> If that patch was merged, it might well cause the test to abort in an
> assertion rather than spinning forever, if status == failed.
>
> Of course someone would still need to find out why it failed, but
> nonetheless, I think assert is nicer than spin forever.
Agreed.
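
For illustration only, a minimal sketch of that idea: check the reported
status on every poll and stop as soon as it is not "active", instead of
waiting for a pass that will never complete. MigInfo, QueryFn,
wait_for_first_pass() and the two fake query functions are hypothetical and
not taken from migration-test.c; the sketch returns an error code where a
real test would more likely assert, and it also bounds the number of polls.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical, simplified view of a 'query-migrate' reply. */
typedef struct {
    const char *status;    /* e.g. "active", "failed" */
    int dirty_sync_count;  /* completed passes (illustrative field) */
} MigInfo;

/* Hypothetical callback that performs one query. */
typedef MigInfo (*QueryFn)(void *opaque);

typedef enum { WAIT_OK, WAIT_BAD_STATUS, WAIT_TIMEOUT } WaitResult;

/*
 * Poll until one pass has completed. Bail out as soon as the status is
 * anything other than "active", and give up after max_polls queries.
 */
static WaitResult wait_for_first_pass(QueryFn query, void *opaque,
                                      int max_polls)
{
    for (int i = 0; i < max_polls; i++) {
        MigInfo info = query(opaque);

        if (strcmp(info.status, "active") != 0) {
            return WAIT_BAD_STATUS;   /* a real test might assert here */
        }
        if (info.dirty_sync_count >= 1) {
            return WAIT_OK;
        }
    }
    return WAIT_TIMEOUT;
}

/* Fake query: the migration has already failed. */
static MigInfo query_failed(void *opaque)
{
    (void)opaque;
    return (MigInfo){ "failed", 0 };
}

/* Fake query: stays active and completes a pass on the third poll. */
static MigInfo query_progressing(void *opaque)
{
    int *polls = opaque;

    (*polls)++;
    return (MigInfo){ "active", *polls >= 3 ? 1 : 0 };
}
```

With query_failed the loop returns on the first poll rather than spinning,
which is the behaviour the assertion is meant to give us.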
--
Peter Xu