Re: [PATCH v2 0/2] Fix MCE handling on AMD hosts

qemu-devel
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [PATCH v2 0/2] Fix MCE handling on AMD hosts

From:	John Allen
Subject:	Re: [PATCH v2 0/2] Fix MCE handling on AMD hosts
Date:	Tue, 5 Sep 2023 10:15:26 -0500
On Thu, Aug 31, 2023 at 11:40:08PM +0200, William Roche wrote:
> Hello John,
> 
> I could test your fixes and I can confirm that the BUS_MCEERR_AR is now
> working on AMD:
> 
> Before the fix, the VM panics with:
> 
> qemu-system-x86_64: Guest MCE Memory Error at QEMU addr 0x7f89573ce000 and
> GUEST addr 0x10b5ce000 of type BUS_MCEERR_AR injected
> [   83.562579] mce: [Hardware Error]: CPU 0: Machine Check Exception: 5 Bank
> 1: a000000000000000
> [   83.562585] mce: [Hardware Error]: RIP !INEXACT! 10:<ffffffff81e8f6ff>
> {pv_native_safe_halt+0xf/0x20}
> [   83.562592] mce: [Hardware Error]: TSC 3d39402bdc
> [   83.562593] mce: [Hardware Error]: PROCESSOR 2:800f12 TIME 1693515449
> SOCKET 0 APIC 0 microcode 800126e
> [   83.562596] mce: [Hardware Error]: Machine check: Uncorrected error
> without MCA Recovery
> [   83.562597] Kernel panic - not syncing: Fatal local machine check
> [   83.563401] Kernel Offset: disabled
> 
> With the fix, the same error injection doesn't kill the VM, but generates
> the following console messages:
> 
> qemu-system-x86_64: Guest MCE Memory Error at QEMU addr 0x7fa430ab9000 and
> GUEST addr 0x118cb9000 of type BUS_MCEERR_AR injected
> [  250.851996] Disabling lock debugging due to kernel taint
> [  250.852928] mce: Uncorrected hardware memory error in user-access at
> 118cb9000
> [  250.853261] Memory failure: 0x118cb9: Sending SIGBUS to
> mce_process_rea:1227 due to hardware memory corruption
> [  250.854933] mce: [Hardware Error]: Machine check events logged
> [  250.855800] Memory failure: 0x118cb9: recovery action for dirty LRU page:
> Recovered
> [  250.856661] mce: [Hardware Error]: CPU 2: Machine Check Exception: 7 Bank
> 9: bc00000000000000
> [  250.860552] mce: [Hardware Error]: RIP 33:<00007f56b9ecbee5>
> [  250.861405] mce: [Hardware Error]: TSC 8c2c664410 ADDR 118cb9000 MISC 8c
> [  250.862679] mce: [Hardware Error]: PROCESSOR 2:800f12 TIME 1693508937
> SOCKET 0 APIC 2 microcode 800126e
> 
> 
> But a problem still exists with BUS_MCEERR_AO that kills the VM with:
> 
> qemu-system-x86_64: warning: Guest MCE Memory Error at QEMU addr
> 0x7f1d108e5000 and GUEST addr 0x114ae5000 of type BUS_MCEERR_AO injected
> [  157.392905] mce: [Hardware Error]: CPU 0: Machine Check Exception: 7 Bank
> 9: bc00000000000000
> [  157.392912] mce: [Hardware Error]: RIP 10:<ffffffff81e8f6ff>
> {pv_native_safe_halt+0xf/0x20}
> [  157.392919] mce: [Hardware Error]: TSC 60b92a54d0 ADDR 114ae5000 MISC 8c
> [  157.392921] mce: [Hardware Error]: PROCESSOR 2:800f12 TIME 1693500765
> SOCKET 0 APIC 0 microcode 800126e
> [  157.392924] mce: [Hardware Error]: Machine check: Uncorrected
> unrecoverable error in kernel context
> [  157.392925] Kernel panic - not syncing: Fatal local machine check
> [  157.402582] Kernel Offset: disabled
> 
> As AMD guests can't currently deal with BUS_MCEERR_AO MCE injection,
> according to me the fix is not complete, the 'AO' case must be handled. The
> simplest way is probably to filter it at the qemu level, to only inject the
> 'AR' case -- and it also gives the possibility to let qemu provide a message
> about an ignored 'AO' error.
> 
> I would suggest to add a 3rd patch implementing this AMD specific filter:
> 
> 
> commit bf8cc74df3fcc7bf958a7c42b876e9c059fe4d06
> Author: William Roche <william.roche@oracle.com>
> Date:   Thu Aug 31 18:54:57 2023 +0000
> 
>     i386: Explicitly ignore unsupported BUS_MCEERR_AO MCE on AMD guest
> 
>     AMD guests can't currently deal with BUS_MCEERR_AO MCE injection
>     as it panics the VM kernel. We filter this event and provide a
>     warning message.
> 
>     Signed-off-by: William Roche <william.roche@oracle.com>
> 
> diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
> index 9ca7187628..bd60d5697b 100644
> --- a/target/i386/kvm/kvm.c
> +++ b/target/i386/kvm/kvm.c
> @@ -606,6 +606,10 @@ static void kvm_mce_inject(X86CPU *cpu, hwaddr paddr,
> int code)
>              mcg_status |= MCG_STATUS_RIPV;
>          }
>      } else {
> +        if (code == BUS_MCEERR_AO) {
> +            /* XXX we don't support BUS_MCEERR_AO injection on AMD yet */
> +            return;
> +        }
>          mcg_status |= MCG_STATUS_EIPV | MCG_STATUS_RIPV;
>      }
> 
> @@ -657,7 +661,8 @@ void kvm_arch_on_sigbus_vcpu(CPUState *c, int code, void
> *addr)
>          if (ram_addr != RAM_ADDR_INVALID &&
>              kvm_physical_memory_addr_from_host(c->kvm_state, addr, &paddr))
> {
>              kvm_hwpoison_page_add(ram_addr);
> -            kvm_mce_inject(cpu, paddr, code);
> +            if (!IS_AMD_CPU(env) || code != BUS_MCEERR_AO)
> +                kvm_mce_inject(cpu, paddr, code);
> 
>              /*
>               * Use different logging severity based on error type.
> @@ -670,8 +675,9 @@ void kvm_arch_on_sigbus_vcpu(CPUState *c, int code, void
> *addr)
>                      addr, paddr, "BUS_MCEERR_AR");
>              } else {
>                   warn_report("Guest MCE Memory Error at QEMU addr %p and "
> -                     "GUEST addr 0x%" HWADDR_PRIx " of type %s injected",
> -                     addr, paddr, "BUS_MCEERR_AO");
> +                     "GUEST addr 0x%" HWADDR_PRIx " of type %s %s",
> +                     addr, paddr, "BUS_MCEERR_AO",
> +                     IS_AMD_CPU(env) ? "ignored on AMD guest" :
> "injected");
>              }
> 
>              return;
> ---

Thanks, I think this will be a good solution for now while we can't
fully support AO errors. I will test this and include in the next
version of the series.

Thanks,
John

> 
> 
> I hope this can help.
> 
> William.
> 
> 
> On 7/26/23 22:41, John Allen wrote:
> > In the event that a guest process attempts to access memory that has
> > been poisoned in response to a deferred uncorrected MCE, an AMD system
> > will currently generate a SIGBUS error which will result in the entire
> > guest being shutdown. Ideally, we only want to kill the guest process
> > that accessed poisoned memory in this case.
> > 
> > This support has been included in qemu for Intel hosts for a long time,
> > but there are a couple of changes needed for AMD hosts. First, we will
> > need to expose the SUCCOR cpuid bit to guests. Second, we need to modify
> > the MCE injection code to avoid Intel specific behavior when we are
> > running on an AMD host.
> > 
> > v2:
> >    - Add "succor" feature word.
> >    - Add case to kvm_arch_get_supported_cpuid for the SUCCOR feature.
> > 
> > John Allen (2):
> >    i386: Add support for SUCCOR feature
> >    i386: Fix MCE support for AMD hosts
> > 
> >   target/i386/cpu.c     | 18 +++++++++++++++++-
> >   target/i386/cpu.h     |  4 ++++
> >   target/i386/helper.c  |  4 ++++
> >   target/i386/kvm/kvm.c | 19 +++++++++++++------
> >   4 files changed, 38 insertions(+), 7 deletions(-)
> >
[Prev in Thread]
Current Thread
[Next in Thread]
Re: [PATCH v2 0/2] Fix MCE handling on AMD hosts, John Allen <=
Prev by Date: Re: mips system emulation failure with virtio
Next by Date: Re: PCI Hotplug ACPI device names only 3 characters long
Previous by thread: PCI Hotplug ACPI device names only 3 characters long
Next by thread: Re: [PATCH 17/21] block: Take graph rdlock in bdrv_drop_intermediate()
Index(es):
- Date
- Thread