[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [PATCH 1/1] migration: skip poisoned memory pages on "ram saving" ph
From: |
Peter Xu |
Subject: |
Re: [PATCH 1/1] migration: skip poisoned memory pages on "ram saving" phase |
Date: |
Mon, 11 Sep 2023 15:48:38 -0400 |
On Sat, Sep 09, 2023 at 03:57:44PM +0100, Joao Martins wrote:
> > Should I continue to treat them as zero pages written with
> > save_zero_page_to_file ?
>
> MCE had already been forward to the guest, so guest is supposed to not be
> using
> the page (nor rely on its contents). Hence destination ought to just see a
> zero
> page. So what you said seems like the best course of action.
>
> > Or should I consider the case of an ongoing compression
> > use and create a new code compressing an empty page with
> > save_compress_page() ?
> >
> The compress code looks to be a tentative compression (not guaranteed IIUC),
> so
> I am not sure it needs any more logic that just adding at the top of
> ram_save_target_page_legacy() as Peter suggested?
>
> > And what about an RDMA memory region impacted by a memory error ?
> > This is an important aspect.
> > Does anyone know how this situation is dealt with ? And how it should be
> > handled
> > in Qemu ?
> >
>
> If you refer to guest RDMA MRs that is just guest RAM, not sure we are even
> aware of those from qemu. But if you refer to the RDMA transport that sits
> below
> the Qemu file (or rather acts as an implementation of QemuFile), so handling
> in
> ram_save_target_page_legacy() already seems to cover it.
I'm also not familiar enough with RDMA, but it looks tricky indeed. AFAIU
it's leveraging RDMA_CONTROL_COMPRESS for zero pages for now (with
RDMACompress.value==0), so it doesn't seem to be using generic migration
protocols.
If we want to fix all places well, one way to consider is to introduce
migration_buffer_is_zero(), which can be a wrapper for buffer_is_zero() by
default, but also returns true for poisoned pages before reading the
buffer. Then we use it in all three places:
- For compression, in do_compress_ram_page()
- For RDMA, in qemu_rdma_write_one()
- For generic migration, in save_zero_page_to_file() (your current patch)
I suppose then all cases will be fixed. We need to make sure we'll always
use migration_buffer_is_zero() as the 1st thing to call when QEMU wants to
migrate a target page. Maybe it'll worth a comment above that function.
Thanks,
--
Peter Xu
- [PATCH 0/1] Qemu crashes on VM migration after an handled memory error, “William Roche, 2023/09/06
- [PATCH 1/1] migration: skip poisoned memory pages on "ram saving" phase, “William Roche, 2023/09/06
- Re: [PATCH 1/1] migration: skip poisoned memory pages on "ram saving" phase, Joao Martins, 2023/09/06
- Re: [PATCH 1/1] migration: skip poisoned memory pages on "ram saving" phase, Peter Xu, 2023/09/06
- Re: [PATCH 1/1] migration: skip poisoned memory pages on "ram saving" phase, William Roche, 2023/09/06
- Re: [PATCH 1/1] migration: skip poisoned memory pages on "ram saving" phase, Joao Martins, 2023/09/09
- Re: [PATCH 1/1] migration: skip poisoned memory pages on "ram saving" phase,
Peter Xu <=
- Re: [PATCH 1/1] migration: skip poisoned memory pages on "ram saving" phase, Peter Xu, 2023/09/12
- [PATCH v2 0/1] Qemu crashes on VM migration after an handled memory error, “William Roche, 2023/09/14
- [PATCH v2 1/1] migration: skip poisoned memory pages on "ram saving" phase, “William Roche, 2023/09/14
- Re: [PATCH v2 1/1] migration: skip poisoned memory pages on "ram saving" phase, Zhijian Li (Fujitsu), 2023/09/14
- Re: [PATCH v2 1/1] migration: skip poisoned memory pages on "ram saving" phase, William Roche, 2023/09/15
- Re: [PATCH v2 1/1] migration: skip poisoned memory pages on "ram saving" phase, Zhijian Li (Fujitsu), 2023/09/17
- Re: [PATCH v2 1/1] migration: skip poisoned memory pages on "ram saving" phase, Zhijian Li (Fujitsu), 2023/09/20
- Re: [PATCH v2 1/1] migration: skip poisoned memory pages on "ram saving" phase, William Roche, 2023/09/20
- [PATCH v3 0/1] Qemu crashes on VM migration after an handled memory error, “William Roche, 2023/09/20
- [PATCH v3 1/1] migration: skip poisoned memory pages on "ram saving" phase, “William Roche, 2023/09/20