qemu-devel



From: Shiyang Ruan
Subject: Re: [RFC PATCH v2 3/6] cxl/core: add report option for cxl_mem_get_poison()
Date: Wed, 3 Apr 2024 22:56:58 +0800
User-agent: Mozilla Thunderbird



On 2024/3/30 9:50, Dan Williams wrote:
Shiyang Ruan wrote:
The GMER only has a "Physical Address" field; no field indicates the length.
So, when a poison event is received, we can use the GET_POISON_LIST command
to get the poison list.  The driver already has cxl_mem_get_poison(), so
reuse it and add a 'bool report' parameter that reports the poison records
to MCE when set to true.

I am not sure I agree with the rationale here because there is no
correlation between the event being signaled and the current state of
the poison list. It also establishes a race between multiple GMER events,
i.e. imagine the hardware sends 4 GMER events to communicate a 256B
poison discovery event. Does the driver need logic to support GMER events
2, 3, and 4 if it already saw all 256B of poison after processing GMER
event 1?

Yes, I hadn't thought about that.


I think the best the driver can do is assume at least 64B of poison
per-event and depend on multiple notifications to handle larger poison
lengths.

Agree.  This also makes things easier.
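
To make the "assume at least 64B per event" idea concrete, here is a minimal sketch, not actual driver code. The names gmer_to_poison_range(), struct poison_range, GMER_DPA_FLAGS_MASK and GMER_ASSUMED_LEN are hypothetical; the mask assumes the low six bits of the GMER "Physical Address" field carry flags rather than address bits.

```c
#include <stdint.h>

/* Assumption: bits [5:0] of the GMER Physical Address field are flags. */
#define GMER_DPA_FLAGS_MASK 0x3FULL
/* Minimum poison length assumed per event, as suggested in the thread. */
#define GMER_ASSUMED_LEN    64ULL

/* Hypothetical helper type; not a kernel structure. */
struct poison_range {
    uint64_t start;
    uint64_t len;
};

/*
 * Map one GMER to the poisoned range the driver would act on:
 * strip the flag bits and assume exactly one 64B granule.
 */
static struct poison_range gmer_to_poison_range(uint64_t phys_addr_field)
{
    struct poison_range r = {
        .start = phys_addr_field & ~GMER_DPA_FLAGS_MASK,
        .len   = GMER_ASSUMED_LEN,
    };
    return r;
}
```

With this shape, a larger poison region is simply covered by several events, each handled independently, so no cross-event state is needed.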

And for qemu, I'm thinking of making a patch to limit the length of a poison record when injecting. The length should be between 64B and 4KiB per GMER, and multiple GMERs would be emitted if the length is > 4KiB.
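
Roughly, the splitting could look like the sketch below. This is only an illustration of the idea, not the planned QEMU patch: split_poison_range(), struct poison_chunk and the two macros are hypothetical names, and it assumes the injected range is already aligned to 64B.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define POISON_GRANULE   64u    /* minimum record length, from the thread */
#define POISON_MAX_CHUNK 4096u  /* proposed per-GMER cap, from the thread */

/* Hypothetical record type; not a QEMU structure. */
struct poison_chunk {
    uint64_t start;
    uint64_t len;
};

/*
 * Split [start, start + len) into chunks of at most POISON_MAX_CHUNK bytes,
 * one chunk per GMER to emit.  start and len must be multiples of
 * POISON_GRANULE and len must be nonzero.  At most max_out chunks are
 * written to out; the return value is the number of chunks required.
 */
static size_t split_poison_range(uint64_t start, uint64_t len,
                                 struct poison_chunk *out, size_t max_out)
{
    size_t n = 0;

    assert(len != 0);
    assert(start % POISON_GRANULE == 0 && len % POISON_GRANULE == 0);

    while (len > 0) {
        uint64_t this_len = len < POISON_MAX_CHUNK ? len : POISON_MAX_CHUNK;

        if (n < max_out) {
            out[n].start = start;
            out[n].len = this_len;
        }
        n++;
        start += this_len;
        len -= this_len;
    }
    return n;
}
```

For example, injecting 10KiB at 0x1000 would yield three chunks (4KiB, 4KiB, 2KiB), i.e. three GMERs.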


Otherwise, the poison list is really only useful for pre-populating
pages to offline after a reboot, i.e. to catch the kernel up with the
state of poison pages after a reboot.

Got it.


--
Thanks,
Ruan.


