Re: QEMU migration-test CI intermittent failure


From: Fabiano Rosas
Subject: Re: QEMU migration-test CI intermittent failure
Date: Thu, 14 Sep 2023 12:10:04 -0300

Peter Xu <peterx@redhat.com> writes:

> On Wed, Sep 13, 2023 at 04:42:31PM -0300, Fabiano Rosas wrote:
>> Stefan Hajnoczi <stefanha@redhat.com> writes:
>> 
>> > Hi,
>> > The following intermittent failure occurred in the CI and I have filed
>> > an Issue for it:
>> > https://gitlab.com/qemu-project/qemu/-/issues/1886
>> >
>> > Output:
>> >
>> >   >>> QTEST_QEMU_IMG=./qemu-img MALLOC_PERTURB_=116 QTEST_QEMU_STORAGE_DAEMON_BINARY=./storage-daemon/qemu-storage-daemon G_TEST_DBUS_DAEMON=/builds/qemu-project/qemu/tests/dbus-vmstate-daemon.sh QTEST_QEMU_BINARY=./qemu-system-x86_64 /builds/qemu-project/qemu/build/tests/qtest/migration-test --tap -k
>> >   ――――――――――――――――――――――――――――――――――――― ✀  ―――――――――――――――――――――――――――――――――――――
>> >   stderr:
>> >   qemu-system-x86_64: Unable to read from socket: Connection reset by peer
>> >   Memory content inconsistency at 5b43000 first_byte = bd last_byte = bc current = 4f hit_edge = 1
>> >   **
>> >   ERROR:../tests/qtest/migration-test.c:300:check_guests_ram: assertion failed: (bad == 0)
>> >   (test program exited with status code -6)
>> >
>> > You can find the full output here:
>> > https://gitlab.com/qemu-project/qemu/-/jobs/5080200417
>> 
>> This is the postcopy return path issue that I'm addressing here:
>> 
>> https://lore.kernel.org/r/20230911171320.24372-1-farosas@suse.de
>> Subject: [PATCH v6 00/10] Fix segfault on migration return path
>> Message-ID: <20230911171320.24372-1-farosas@suse.de>
>
> Hmm I just noticed one thing, that Stefan's failure is a ram check issue
> only, which means qemu won't crash?
>

The source could have crashed and left the migration in an inconsistent
state, and then the destination saw corrupted memory?
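
For anyone reading the "Memory content inconsistency" line in the log
above, here is a minimal standalone sketch of the kind of invariant that
the check_guests_ram assertion enforces. It assumes the test guest
increments one byte per page, in page order, so that with the guest
stopped every page holds the same value except for at most one step down
by one (the "edge" where the incrementer was interrupted). The function
name count_inconsistent_pages and the array-based input are hypothetical
and only for illustration; this is not the code in
tests/qtest/migration-test.c.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper, for illustration only.
 * page_bytes[i] is the sampled byte of page i, read with the guest stopped.
 * Returns the number of pages that violate the invariant.
 */
static int count_inconsistent_pages(const uint8_t *page_bytes, size_t n_pages)
{
    bool hit_edge = false;
    int bad = 0;
    uint8_t first_byte, last_byte;

    if (n_pages == 0) {
        return 0;
    }
    first_byte = page_bytes[0];
    last_byte = first_byte;

    for (size_t i = 1; i < n_pages; i++) {
        uint8_t b = page_bytes[i];

        if (b == last_byte) {
            continue;
        }
        if ((uint8_t)(b + 1) == last_byte && !hit_edge) {
            /* The incrementer stopped before reaching this page: allowed once. */
            hit_edge = true;
            last_byte = b;
        } else {
            bad++;
            fprintf(stderr,
                    "inconsistency at page %zu first_byte = %x last_byte = %x"
                    " current = %x hit_edge = %x\n",
                    i, (unsigned)first_byte, (unsigned)last_byte,
                    (unsigned)b, (unsigned)hit_edge);
        }
    }
    return bad;
}
```

With that model, the values in the log (first_byte = bd, last_byte = bc,
hit_edge = 1, current = 4f) would mean the one permitted edge had already
been seen and a later page then held an unrelated value.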

> Fabiano, are you sure it's the same issue on your return-path fix?
>

I've been running the preempt tests on my branch for thousands of
iterations and haven't seen any other errors. Since no code has gone
into the migration tree recently, I assume it's the same error.

I run the tests with GDB attached to QEMU, so I'll always see a crash
before any memory corruption.

> I'm also trying to reproduce either of them with some loads.  I think I hit
> some but it's very hard to reproduce solidly.

Well, if you find anything else let me know and we'll fix it.


