Re: [Qemu-ppc] [Qemu-devel] [PATCH 4/4] target-ppc: Handle NMI guest exit

From: Thomas Huth
Subject: Re: [Qemu-ppc] [Qemu-devel] [PATCH 4/4] target-ppc: Handle NMI guest exit
Date: Mon, 16 Nov 2015 11:41:46 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.3.0
On 16/11/15 11:07, Aravinda Prasad wrote:
>
>
> On Monday 16 November 2015 01:22 PM, Thomas Huth wrote:
>> On 12/11/15 19:49, Aravinda Prasad wrote:
>>>
>>> On Thursday 12 November 2015 03:10 PM, Thomas Huth wrote:
>> ...
>>>> Also LoPAPR talks about 'subsequent processors report "fatal error
>>>> previously reported"', so maybe the other processors should report that
>>>> condition in this case?
>>>
>>> I feel the guest kernel is responsible for that. Or does it mean that
>>> QEMU should report the same error the first processor encountered for
>>> the subsequent processors as well? In that case, what if the error
>>> encountered by the first processor was recovered?
>>
>> I simply referred to this text in LoPAPR:
>>
>> Multiple processors of the same OS image may experience fatal events
>> at, or about, the same time. The first processor to enter the machine
>> check handling firmware reports the fatal error. Subsequent processors
>> serialize waiting for the first processor to issue the
>> ibm,nmi-interlock call. These subsequent processors report "fatal
>> error previously reported".
>
> Yes, I asked this because I am not clear what "fatal error previously
> reported" means as described in PAPR.
Looking at "Table 137. RTAS Event Return Format (Fixed Part)" in
LoPAPR, there is an "ALREADY_REPORTED" severity - I assume this is what
is meant by the cited paragraph?
>> Is there code in the host kernel already that takes care of this (I
>> haven't checked)? If so, how does the host kernel know that the event
>> happened "at or about the same time" since you're checking at the QEMU
>> side for the mutex condition?
>
> I don't think the host kernel takes care of this; it simply forwards
> such errors to QEMU via NMI exit. I feel the time referred to by "at or
> about the same time" is the duration between when the registered machine
> check handler is invoked and when the corresponding interlock call is
> issued by the guest - a window which QEMU knows and protects with a mutex.
I agree, that makes sense.
>>>> And of course you've also got to check that the same CPU is not getting
>>>> multiple NMIs before the interlock function has been called again.
>>>
>>> I think it is good to check that. However, shouldn't the guest keep ME
>>> disabled until it calls the interlock function?
>>
>> First, the hypervisor should never trust the guest to do the right
>> things. Second, LoPAPR says "the OS permanently relinquishes to firmware
>> the Machine State Register's Machine Check Enable bit", and Paul also
>> said something similar in another mail to this thread, so I think you
>> really have to check this in QEMU instead.
>
> Hmm, ok. Since ME is always set when running in the guest (assuming the
> guest is not disabling it), we cannot check the ME bit to figure out
> whether the same CPU is getting UEs before the interlock is called. One
> way is to record the CPU ID upon such an error and, before invoking the
> registered machine check handler, check whether that CPU has a pending
> interlock call. Terminate the guest if there is a pending interlock call
> for that CPU, rather than letting the guest trigger recursive machine
> check errors.
Do we have some kind of checkstop state emulation in QEMU (sorry, I
haven't checked yet)? If yes, it might be nicer to use that and set the
guest state to PANIC instead of exiting QEMU directly - i.e. to do
something similar to the guest_panicked() function in
target-s390x/kvm.c. That way the management layer (libvirt) can decide
on its own whether to terminate the guest, reboot it, or keep it in the
crashed state for further analysis.
Thomas