|
From: | Daniel Henrique Barboza |
Subject: | Re: [PATCH 20/20] target/riscv: add 'kvm_supported' class property |
Date: | Fri, 1 Sep 2023 17:57:46 -0300 |
User-agent: | Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.13.0 |
On 8/31/23 09:47, Andrew Jones wrote:
On Fri, Aug 25, 2023 at 10:08:53AM -0300, Daniel Henrique Barboza wrote:This follows the same idea of 'tcg_support' property added in the previous patch. Note that we're now implementing the 'cpu_realizefn' for the KVMAccel class since this verification is done in realize() time. Supporting vendor CPUs with KVM is not possible. We rely on the extension support of the KVM module running in the host, making it impossible to guarantee that a vendor CPU will have all the required extensions available. The only way to guarantee that a vendor CPU is KVM compatible is running KVM in a host that has the same vendor CPU, andOr to attempt to enable each extension which the vendor CPU expects and to attempt to disable everything else. If all those actions succeed, then we can override the ID registers with those of the CPU we want to model and go for it. There's still risk, though, that the guest kernel will see the ID registers of the model and attempt to apply some errata workaround which may or may not work and/or crash the guest.
This can also happen when migrating the guest from a host that happens to have an errata to one that doesn't have, regardless of the CPU type the guest is using (host CPU vs vendor CPU). The guest would need a power cycle to identify the current model ID. We don't have the tooling needed to mitigate this risk in QEMU I'm afraid. Upper layers like libvirt are more able to deal with it.
for this case we already have the 'host' CPU type. We're better of declaring that all vendors CPUs are not KVM capable. After this patch, running KVM accel with a vendor CPU will produce an error like the following: $ ./qemu-system-riscv64 -M virt,accel=kvm -cpu veyron-v1 qemu-system-riscv64: 'veyron-v1' CPU is not compatible with KVM accelerationShouldn't we at least check if the host matches the requested CPU first? So, if we happen to be on a veyron-v1, then the veyron-v1 model should be equivalent to 'host'. (They may not be 100% equivalent in practice, but theoretically they should be, which means trying it and debugging the bugs should improve the CPU models on both sides.)
If we're really going this route we would need to match host and vendor CPU in the extension level, matching each vendor CPU extension with what the CPU can provide, failing if the host can't provide all extensions the vendor CPU requires. I wouldn't even bother checking for things like machine ID since they can be easily impersonated (e.g. use a rv64 emulated host, edit mvendorid) and can't be trusted. TBH I am not thrilled with the idea of supporting vendor CPUs with KVM. The user can pick the 'host' CPU to have the most capable KVM CPU available in the host, and that is already not trivial to support in cases like live migration and so on. Vendor CPU KVM support will promote things like: "I tried to use a veyron-v2 KVM CPU in a veyron-v1 host, why is that not possible it should be it's not fair" "why can't I use a vendor X KVM CPU A into a vendor Y CPU B host it surely should work since CPU A is older than CPU B right" And then, even if we decide to support vendor CPUs with KVM in a feasible way, with a lot of conditions and training wheels, we'll be so restrictive that the user will be better of using the 'host' CPU anyway. All this said, there's a lot going on in this series already and this vendor CPU + KVM discussion might deserve its own RFC/thread. I'll drop this patch from the series to give us time to discuss this properly. Let's leave it as is for now. Thanks, Daniel
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com> --- target/riscv/cpu-qom.h | 1 + target/riscv/cpu.c | 1 + target/riscv/kvm/kvm-cpu.c | 24 ++++++++++++++++++++++++ 3 files changed, 26 insertions(+) diff --git a/target/riscv/cpu-qom.h b/target/riscv/cpu-qom.h index e86b76f9fe..32d9bb07b4 100644 --- a/target/riscv/cpu-qom.h +++ b/target/riscv/cpu-qom.h @@ -72,5 +72,6 @@ struct RISCVCPUClass {bool user_extension_properties;bool tcg_supported; + bool kvm_supported; }; #endif /* RISCV_CPU_QOM_H */ diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c index f749ea2a2e..73302bb72a 100644 --- a/target/riscv/cpu.c +++ b/target/riscv/cpu.c @@ -1646,6 +1646,7 @@ static void riscv_dynamic_cpu_class_init(ObjectClass *c, void *data)rcc->user_extension_properties = true;rcc->tcg_supported = true; + rcc->kvm_supported = true; }static void riscv_vendor_cpu_class_init(ObjectClass *c, void *data)diff --git a/target/riscv/kvm/kvm-cpu.c b/target/riscv/kvm/kvm-cpu.c index 501384924b..85f3b8c80e 100644 --- a/target/riscv/kvm/kvm-cpu.c +++ b/target/riscv/kvm/kvm-cpu.c @@ -1289,6 +1289,7 @@ static void riscv_kvm_cpu_class_init(ObjectClass *c, void *data) RISCVCPUClass *rcc = RISCV_CPU_CLASS(c);rcc->user_extension_properties = true;+ rcc->kvm_supported = true; }static const TypeInfo riscv_kvm_cpu_type_infos[] = {@@ -1302,6 +1303,28 @@ static const TypeInfo riscv_kvm_cpu_type_infos[] = {DEFINE_TYPES(riscv_kvm_cpu_type_infos) +/*+ * We'll get here via the following path: + * + * riscv_cpu_realize() + * -> cpu_exec_realizefn() + * -> kvm_cpu_realizefn() (via accel_cpu_realizefn()) + */ +static bool kvm_cpu_realizefn(CPUState *cs, Error **errp) +{ + RISCVCPU *cpu = RISCV_CPU(cs); + RISCVCPUClass *rcc = RISCV_CPU_GET_CLASS(cpu); + + if (!rcc->kvm_supported) { + g_autofree char *name = riscv_cpu_get_name(rcc); + error_setg(errp, "'%s' CPU is not compatible with KVM acceleration", + name); + return false; + } + + return true; +} + static void kvm_cpu_instance_init(CPUState *cs) { Object *obj = OBJECT(RISCV_CPU(cs)); @@ -1328,6 +1351,7 @@ static void kvm_cpu_accel_class_init(ObjectClass *oc, void *data) AccelCPUClass *acc = ACCEL_CPU_CLASS(oc);acc->cpu_instance_init = kvm_cpu_instance_init;+ acc->cpu_realizefn = kvm_cpu_realizefn; }static const TypeInfo kvm_cpu_accel_type_info = {-- 2.41.0I don't think we want kvm_supported nor tcg_supported as they necessarily bring accelerator knowledge into target/riscv/cpu.c, where the booleans have to be set. It would be better if the decisions as to what is supported or not are made in the accelerator's respective files. So, we try to realize some model with some accel and let that accel do some checks and error out when it can't. If riscv kvm only supports 'host', then it's check would simply be "is the model host?" and the inverse for tcg. Thanks, drew
[Prev in Thread] | Current Thread | [Next in Thread] |