[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: QEMU migration-test CI intermittent failure
From: |
Peter Xu |
Subject: |
Re: QEMU migration-test CI intermittent failure |
Date: |
Thu, 14 Sep 2023 11:35:47 -0400 |
On Thu, Sep 14, 2023 at 12:10:04PM -0300, Fabiano Rosas wrote:
> Peter Xu <peterx@redhat.com> writes:
>
> > On Wed, Sep 13, 2023 at 04:42:31PM -0300, Fabiano Rosas wrote:
> >> Stefan Hajnoczi <stefanha@redhat.com> writes:
> >>
> >> > Hi,
> >> > The following intermittent failure occurred in the CI and I have filed
> >> > an Issue for it:
> >> > https://gitlab.com/qemu-project/qemu/-/issues/1886
> >> >
> >> > Output:
> >> >
> >> > >>> QTEST_QEMU_IMG=./qemu-img MALLOC_PERTURB_=116
> >> > QTEST_QEMU_STORAGE_DAEMON_BINARY=./storage-daemon/qemu-storage-daemon
> >> > G_TEST_DBUS_DAEMON=/builds/qemu-project/qemu/tests/dbus-vmstate-daemon.sh
> >> > QTEST_QEMU_BINARY=./qemu-system-x86_64
> >> > /builds/qemu-project/qemu/build/tests/qtest/migration-test --tap -k
> >> > ――――――――――――――――――――――――――――――――――――― ✀
> >> > ―――――――――――――――――――――――――――――――――――――
> >> > stderr:
> >> > qemu-system-x86_64: Unable to read from socket: Connection reset by
> >> > peer
> >> > Memory content inconsistency at 5b43000 first_byte = bd last_byte = bc
> >> > current = 4f hit_edge = 1
> >> > **
> >> > ERROR:../tests/qtest/migration-test.c:300:check_guests_ram: assertion
> >> > failed: (bad == 0)
> >> > (test program exited with status code -6)
> >> >
> >> > You can find the full output here:
> >> > https://gitlab.com/qemu-project/qemu/-/jobs/5080200417
> >>
> >> This is the postcopy return path issue that I'm addressing here:
> >>
> >> 20230911171320.24372-1-farosas@suse.de">https://lore.kernel.org/r/20230911171320.24372-1-farosas@suse.de
> >> Subject: [PATCH v6 00/10] Fix segfault on migration return path
> >> Message-ID: <20230911171320.24372-1-farosas@suse.de>
> >
> > Hmm I just noticed one thing, that Stefan's failure is a ram check issue
> > only, which means qemu won't crash?
> >
>
> The source could have crashed and left the migration at an inconsistent
> state and then the destination saw corrupted memory?
>
> > Fabiano, are you sure it's the same issue on your return-path fix?
> >
>
> I've been running the preempt tests on my branch for thousands of
> iterations and didn't see any other errors. Since there's no code going
> into the migration tree recently I assume it's the same error.
>
> I run the tests with GDB attached to QEMU, so I'll always see a crash
> before any memory corruption.
Okay, maybe that stops you from seeing the above check_guests_ram() error?
Worth checking whether it fails differently always if you just don't attach
gdb to it; I had a feeling that it'll always fail in the other way (I think
migration-test will say something like "qemu killed" etc. in most cases),
further to identify the issues.
>
> > I'm also trying to reproduce either of them with some loads. I think I hit
> > some but it's very hard to reproduce solidly.
>
> Well, if you find anything else let me know and we'll fix it.
I think Stefan's issue is the one I triggered once, but only once; I did
see check_guests_ram() lines.
I ran concurrently 10 migration-tests (on 8 cores; just to make scheduler
start to really work), each looping over preempt/plain for 500 times and
hit nothing.. I'm trying again with a larger host with more instances, so
far I've run 200 loops over 40 instances running together, I hit
nothing.. but I'm keeping trying.
--
Peter Xu
- QEMU migration-test CI intermittent failure, Stefan Hajnoczi, 2023/09/13
- Re: QEMU migration-test CI intermittent failure, Fabiano Rosas, 2023/09/13
- Re: QEMU migration-test CI intermittent failure, Stefan Hajnoczi, 2023/09/13
- Re: QEMU migration-test CI intermittent failure, Peter Xu, 2023/09/14
- Re: QEMU migration-test CI intermittent failure, Fabiano Rosas, 2023/09/14
- Re: QEMU migration-test CI intermittent failure,
Peter Xu <=
- Re: QEMU migration-test CI intermittent failure, Fabiano Rosas, 2023/09/14
- Re: QEMU migration-test CI intermittent failure, Peter Xu, 2023/09/14
- Re: QEMU migration-test CI intermittent failure, Fabiano Rosas, 2023/09/14
- Re: QEMU migration-test CI intermittent failure, Fabiano Rosas, 2023/09/14
- Re: QEMU migration-test CI intermittent failure, Peter Xu, 2023/09/14
- Re: QEMU migration-test CI intermittent failure, Fabiano Rosas, 2023/09/14
- Re: QEMU migration-test CI intermittent failure, Peter Xu, 2023/09/15
- Re: QEMU migration-test CI intermittent failure, Peter Xu, 2023/09/15
- Re: QEMU migration-test CI intermittent failure, Fabiano Rosas, 2023/09/18
- Re: QEMU migration-test CI intermittent failure, Peter Xu, 2023/09/18