
Re: [Qemu-devel] [PATCH v10 2/6] ppc: spapr: Introduce FWNMI capability


From: Aravinda Prasad
Subject: Re: [Qemu-devel] [PATCH v10 2/6] ppc: spapr: Introduce FWNMI capability
Date: Wed, 3 Jul 2019 14:58:24 +0530
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.6.0


On Wednesday 03 July 2019 08:33 AM, David Gibson wrote:
> On Tue, Jul 02, 2019 at 11:54:26AM +0530, Aravinda Prasad wrote:
>>
>>
>> On Tuesday 02 July 2019 09:21 AM, David Gibson wrote:
>>> On Wed, Jun 12, 2019 at 02:51:04PM +0530, Aravinda Prasad wrote:
>>>> Introduce the KVM capability KVM_CAP_PPC_FWNMI so that
>>>> KVM exits the guest with NMI as the exit reason when
>>>> it encounters a machine check exception on an address
>>>> belonging to the guest. Without this capability
>>>> enabled, KVM redirects machine check exceptions to
>>>> the guest's 0x200 vector.
>>>>
>>>> This patch also introduces the fwnmi-mce capability to
>>>> handle the case where a guest with the
>>>> KVM_CAP_PPC_FWNMI capability enabled is migrated
>>>> to a host that does not support this
>>>> capability.
>>>>
>>>> Signed-off-by: Aravinda Prasad <address@hidden>
>>>> ---
>>>>  hw/ppc/spapr.c         |    1 +
>>>>  hw/ppc/spapr_caps.c    |   26 ++++++++++++++++++++++++++
>>>>  include/hw/ppc/spapr.h |    4 +++-
>>>>  target/ppc/kvm.c       |   19 +++++++++++++++++++
>>>>  target/ppc/kvm_ppc.h   |   12 ++++++++++++
>>>>  5 files changed, 61 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
>>>> index 6dd8aaa..2ef86aa 100644
>>>> --- a/hw/ppc/spapr.c
>>>> +++ b/hw/ppc/spapr.c
>>>> @@ -4360,6 +4360,7 @@ static void spapr_machine_class_init(ObjectClass *oc, void *data)
>>>>      smc->default_caps.caps[SPAPR_CAP_NESTED_KVM_HV] = SPAPR_CAP_OFF;
>>>>      smc->default_caps.caps[SPAPR_CAP_LARGE_DECREMENTER] = SPAPR_CAP_ON;
>>>>      smc->default_caps.caps[SPAPR_CAP_CCF_ASSIST] = SPAPR_CAP_OFF;
>>>> +    smc->default_caps.caps[SPAPR_CAP_FWNMI_MCE] = SPAPR_CAP_OFF;
>>>>      spapr_caps_add_properties(smc, &error_abort);
>>>>      smc->irq = &spapr_irq_dual;
>>>>      smc->dr_phb_enabled = true;
>>>> diff --git a/hw/ppc/spapr_caps.c b/hw/ppc/spapr_caps.c
>>>> index 31b4661..2e92eb6 100644
>>>> --- a/hw/ppc/spapr_caps.c
>>>> +++ b/hw/ppc/spapr_caps.c
>>>> @@ -479,6 +479,22 @@ static void cap_ccf_assist_apply(SpaprMachineState *spapr, uint8_t val,
>>>>      }
>>>>  }
>>>>  
>>>> +static void cap_fwnmi_mce_apply(SpaprMachineState *spapr, uint8_t val,
>>>> +                                Error **errp)
>>>> +{
>>>> +    if (!val) {
>>>> +        return; /* Disabled by default */
>>>> +    }
>>>> +
>>>> +    if (tcg_enabled()) {
>>>> +        error_setg(errp,
>>>> +"No Firmware Assisted Non-Maskable Interrupts support in TCG, try cap-fwnmi-mce=off");
>>>
>>> Not allowing this for TCG creates an awkward incompatibility between
>>> KVM and TCG guests.  I can't actually see any reason to ban it for TCG
>>> - with the current code TCG won't ever generate NMIs, but I don't see
>>> that anything will actually break.
>>>
>>> In fact, we do have an nmi monitor command, currently wired to the
>>> spapr_nmi() function which resets each cpu, but it probably makes
>>> sense to wire it up to the fwnmi stuff when present.
>>
>> Yes, but that nmi support is not enough to inject a synchronous error
>> into the guest kernel. For example, we should provide the faulting address
>> along with other information such as the type of error (SLB multi-hit,
>> memory error, TLB multi-hit), whether the error occurred on a load or a
>> store, and whether the error was completely recovered or not. Without such
>> information we cannot build the error log and pass it on to the guest
>> kernel. Right now the nmi monitor command takes the CPU number as its only
>> argument.
> 
> Obviously we can't inject an arbitrary MCE event with that monitor
> command.  But isn't there some sort of catch-all / unknown type of MCE
> event which we could inject?

We do have an "unknown" error type, but we would still have to pass an
address in the MCE event log. Strictly speaking, this address should be
valid in the current CPU context, because MCEs are synchronous errors
triggered when we touch a bad address.

We could pass a default address with every nmi, but I am not sure that
would be of any practical use.
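
To make the concern concrete, here is a minimal sketch (not QEMU code) of
what a catch-all event injected from the monitor would have to carry. All
names below (mce_event, MCE_TYPE_UNKNOWN, MCE_ADDR_PLACEHOLDER,
build_monitor_mce) are hypothetical, and the layout is not the RTAS error
log format. The point is only that the address field ends up holding a
placeholder rather than an address the CPU actually touched.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical error classes; this is not the PAPR/RTAS encoding. */
enum mce_type {
    MCE_TYPE_UNKNOWN = 0,
    MCE_TYPE_SLB_MULTIHIT,
    MCE_TYPE_TLB_MULTIHIT,
    MCE_TYPE_MEMORY,
};

/* Hypothetical placeholder used when no faulting address is known. */
#define MCE_ADDR_PLACEHOLDER UINT64_MAX

/* Information needed to build an error log for the guest (assumed set). */
struct mce_event {
    int cpu;             /* the only thing the nmi command supplies today */
    enum mce_type type;  /* SLB multi-hit, TLB multi-hit, memory, unknown */
    bool is_store;       /* whether the access was a store (else a load) */
    bool recovered;      /* whether the error was fully recovered */
    uint64_t fault_addr; /* should be valid in the current CPU context */
    bool addr_valid;     /* false when fault_addr is only a placeholder */
};

/* Fill in the event a monitor-triggered nmi could provide: CPU only. */
static void build_monitor_mce(int cpu, struct mce_event *ev)
{
    assert(ev != NULL);
    ev->cpu = cpu;
    ev->type = MCE_TYPE_UNKNOWN;
    ev->is_store = false;
    ev->recovered = false;
    ev->fault_addr = MCE_ADDR_PLACEHOLDER;
    ev->addr_valid = false;
}
```

Everything except the cpu field is a default here, which is why I doubt
such an event would be of much use to the guest.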

> 
> It seems very confusing to me to have 2 totally separate "nmi"
> mechanisms.
> 
>> So I think TCG support should be a separate patch by itself.
> 
> Even if we don't wire up the monitor command, I still don't see
> anything that this patch breaks - we can support the nmi-register and
> nmi-interlock calls without ever actually creating MCE events.

If we support the nmi-register and nmi-interlock calls without wiring up
the monitor command, we will be falsely advertising nmi support to the
guest when it is not actually supported.
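
A stand-alone sketch of the rule I have in mind, assuming a simplified
model rather than the real spapr caps API: the capability is accepted
only when the backend can actually deliver a machine check to the guest.
The names fwnmi_backend, can_deliver_mce and fwnmi_cap_apply are
hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical description of what the accelerator can do. */
struct fwnmi_backend {
    bool can_deliver_mce; /* true if MCE events can reach the guest */
};

/*
 * Returns 0 if the requested setting is acceptable, -1 otherwise.
 * On failure *errmsg points to a static string; on success it is NULL.
 */
static int fwnmi_cap_apply(bool cap_on, const struct fwnmi_backend *be,
                           const char **errmsg)
{
    assert(be != NULL && errmsg != NULL);
    *errmsg = NULL;

    if (!cap_on) {
        return 0; /* capability disabled: nothing to advertise */
    }

    if (!be->can_deliver_mce) {
        /* Refuse rather than advertise support we cannot honour. */
        *errmsg = "FWNMI requested but MCE delivery is unavailable";
        return -1;
    }

    return 0;
}
```

With this rule a guest never sees nmi-register succeed on a backend that
will never raise an MCE, which is the behaviour the TCG check in the
patch is trying to preserve.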

Regards,
Aravinda


