qemu-devel


From: Laszlo Ersek
Subject: Re: [Qemu-devel] [edk2-rfc] [edk2-devel] CPU hotplug using SMM with QEMU+OVMF
Date: Thu, 29 Aug 2019 19:01:35 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.9.1

On 08/27/19 20:31, Igor Mammedov wrote:
> On Sat, 24 Aug 2019 01:48:09 +0000
> "Yao, Jiewen" <address@hidden> wrote:

>> (05) Host CPU: (OS) Port 0xB2 write, all CPUs enter SMM (NOTE: New CPU
>>      will not enter SMM because SMI is disabled)
> I think only CPU that does the write will enter SMM

That used to be the case (and it is still the default QEMU behavior, if
broadcast SMI is not negotiated). However, OVMF does negotiate broadcast
SMI whenever QEMU offers the feature. Broadcast SMI is important for the
stability of the edk2 SMM infrastructure on QEMU/KVM, we've found.

https://bugzilla.redhat.com/show_bug.cgi?id=1412313
https://bugzilla.redhat.com/show_bug.cgi?id=1412327
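
To make the difference concrete, here is a minimal Python model of the
behavior described above. It is a sketch, not QEMU or OVMF code: the
names negotiate(), cpus_entering_smm() and SMI_F_BROADCAST are invented
for illustration. The only assumptions taken from this thread are that
the firmware accepts broadcast SMI whenever the platform offers it, that
without it only the CPU writing to IO port 0xB2 enters SMM, and that a
CPU with SMI disabled enters SMM in neither case.

```python
SMI_F_BROADCAST = 1 << 0  # hypothetical feature bit, for illustration only


def negotiate(offered, wanted=SMI_F_BROADCAST):
    """Return the features both sides agree on (firmware takes what is offered)."""
    return offered & wanted


def cpus_entering_smm(writer, cpus, features, smi_disabled=frozenset()):
    """Model which CPUs enter SMM when `writer` writes to IO port 0xB2."""
    if features & SMI_F_BROADCAST:
        targets = set(cpus)   # broadcast: every present CPU receives the SMI
    else:
        targets = {writer}    # default: only the writing CPU receives it
    # CPUs that have SMI disabled (e.g. a just-hot-added CPU) stay out of SMM
    return sorted(targets - set(smi_disabled))


cpus = [0, 1, 2, 3]
print(cpus_entering_smm(0, cpus, negotiate(offered=0)))                # [0]
print(cpus_entering_smm(0, cpus, negotiate(offered=SMI_F_BROADCAST)))  # [0, 1, 2, 3]
print(cpus_entering_smm(0, cpus + [4], negotiate(SMI_F_BROADCAST),
                        smi_disabled={4}))                             # [0, 1, 2, 3]
```

The last call corresponds to step (05) as quoted: with broadcast SMI
negotiated, all existing CPUs enter SMM, while the new CPU (number 4 in
this model) does not.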

> and we might not need to pull in all already initialized CPUs into SMM.

That, on the other hand, could be a valid idea. But then the CPU should
use a different method for raising a synchronous SMI for itself (not a
write to IO port 0xB2). Is a "directed SMI for self" possible?

> [...]

I've tried to read through the procedure with your suggested changes,
but I'm failing at composing a coherent mental image, in this email
response format.

If you have the time, can you write up the suggested list of steps in a
"flat" format? (I believe you are suggesting to eliminate some steps
completely.)

... jumping to another point:

>> 2) Let trusted software (SMM and init code) guarantee SMREBASE one by one 
>> (include any code runs before SMREBASE)
> that would mean pulling all present CPUs into SMM mode so no attack
> code could be executing before doing hotplug. With a lot of present CPUs
> it could be quite expensive and unlike physical hardware, guest's CPUs
> could be preempted arbitrarily long causing long delays.

I agree with your analysis, but I slightly disagree about the impact:

- CPU hotplug is not a frequent administrative action, so the CPU load
should be temporary (it should be a spike). I don't worry that it would
trip up OS kernel code. (SMI handling is known to take long on physical
platforms too.) In practice, all "normal" SMIs are broadcast already (for
example when calling the runtime UEFI variable services from the OS kernel).

- The fact that QEMU/KVM introduces some jitter into the execution of
multi-core code (including SMM code) has proved useful in the past, for
catching edk2 regressions.
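
The cost argument in the quoted text can be sketched as a toy model. It
is not a measurement: it assumes that the SMI handler cannot proceed
until every present CPU has arrived in SMM, so the wait is governed by
the slowest (possibly preempted) vCPU. The function name and the delay
values are invented for illustration.

```python
def rendezvous_wait(arrival_delays_us):
    """Time until all CPUs have arrived in SMM: the slowest arrival dominates."""
    return max(arrival_delays_us)


# Hypothetical per-vCPU arrival delays in microseconds; vCPU 2 was preempted.
delays = [5, 7, 4000, 6]
print(rendezvous_wait(delays))     # 4000
print(rendezvous_wait([5, 7, 6]))  # 7
```

Under this assumption a single preempted vCPU sets the length of the
spike, regardless of how many CPUs are present.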

Again, this is not a strong disagreement from my side. I'm open to
better ways for synching CPUs during multi-CPU-hotplug.

(Digression:

I expect someone could be curious why (a) I find it acceptable (even
beneficial) that "some jitter" injected by the QEMU/KVM scheduling
exposes multi-core regressions in edk2, but at the same time (b) I found
it really important to add broadcast SMI to QEMU and OVMF. After all,
both "jitter" and "unicast SMIs" are QEMU/KVM platform specifics, so why
the different treatment?

The reason is that the "jitter" does not interfere with normal
operation, and it has been good for catching *regressions*. IOW, there
is a working edk2 state, someone posts a patch, works on physical
hardware, but breaks on QEMU/KVM --> then we can still reject or rework
or revert the patch. And we're back to a working state again (in the
best case, with a fixed feature patch).

With the unicast SMIs however, it was impossible to enable the SMM stack
reliably in the first place. There was no functional state to return to.

Digression ends.)

> let's first see if we can ignore race

Makes me uncomfortable, but if this is the consensus, I'll go along.

> and if it's not then
> we probably end up with implementing some form of #1

OK.

Thanks!
Laszlo


