qemu-devel
Re: [PATCH v2 0/1] Qemu crashes on VM migration after an handled memory


From: Peter Xu
Subject: Re: [PATCH v2 0/1] Qemu crashes on VM migration after an handled memory error
Date: Thu, 14 Sep 2023 17:50:03 -0400

On Thu, Sep 14, 2023 at 08:20:53PM +0000, William Roche wrote:
> From: William Roche <william.roche@oracle.com>
> 
> A Qemu VM can survive a memory error, as qemu can relay the error to the
> VM kernel which could also deal with it -- poisoning/off-lining the impacted
> page.
> This situation creates a hole in the VM memory address space that the VM
> kernel knows about (an unreadable page or set of pages).
> 
> But the migration of this VM (live migration through the network or
> pseudo-migration with the creation of a state file) will crash Qemu when
> it sequentially reads the memory address space and stumbles on the
> existing hole.
> 
> In order to correct this problem, I suggest treating the poisoned pages as if
> they were zero-pages for the migration copy.
> This fix also works with underlying large pages, taking into account the
> RAMBlock segment "page-size".
> This fix is scripts/checkpatch.pl clean.
> 
> v2:
>   - adding compressed transfer handling of poisoned pages
>  
> Testing: I could verify that migration now works with a poisoned page
> through standard and compressed migration with 4k and large (2M) pages.
> 
> The RDMA transfer is not considered by this patch.
> 
> William Roche (1):
>   migration: skip poisoned memory pages on "ram saving" phase
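
For anyone reading along in the archive, here is a rough standalone sketch of
the idea in the cover letter quoted above: check for poison before touching
the memory, and do the lookup at the granularity of the block's backing page
size so that 2M pages are covered too. This is not the code from the patch.
DemoBlock, demo_offset_is_poisoned() and demo_send_as_zero() are hypothetical
names made up for illustration, and the per-page flag array is an assumed
stand-in for however the poisoned ranges are actually tracked.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a RAM block; NOT QEMU's real RAMBlock. */
typedef struct {
    const uint8_t *host;   /* start of the block's memory */
    size_t length;         /* block length in bytes */
    size_t page_size;      /* backing page size (4k, 2M, ...), power of two */
    const bool *poisoned;  /* assumed bookkeeping: one flag per backing page */
} DemoBlock;

/* True if the backing page containing 'offset' has been marked poisoned. */
static bool demo_offset_is_poisoned(const DemoBlock *b, size_t offset)
{
    assert(b->page_size && (b->page_size & (b->page_size - 1)) == 0);
    assert(offset < b->length);
    return b->poisoned[offset / b->page_size];
}

/*
 * Decide whether the target-sized page at 'offset' can be sent as a zero
 * page. The poison check comes first, so poisoned memory is never read.
 */
static bool demo_send_as_zero(const DemoBlock *b, size_t offset,
                              size_t target_page)
{
    assert(offset + target_page <= b->length);

    if (demo_offset_is_poisoned(b, offset)) {
        return true;    /* treat the hole as if it were all zeroes */
    }

    const uint8_t *p = b->host + offset;
    for (size_t i = 0; i < target_page; i++) {
        if (p[i] != 0) {
            return false;
        }
    }
    return true;
}
```

With a backing page size larger than the target page size, every target page
that falls inside a poisoned backing page is reported as zero, whatever the
memory actually holds, which is the "page-size" point made in the cover
letter.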

If there's a new version, please consider adding a TODO above
control_save_page() noting that poisoned page handling is probably broken
there, so that we don't forget about it.

Reviewed-by: Peter Xu <peterx@redhat.com>

Copy:

lizhijian@fujitsu.com, lidongchen@tencent.com

Thanks,

-- 
Peter Xu
