qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH] Revert "intel_iommu: Fix irqchip / X2APIC configuration chec


From: Peter Xu
Subject: Re: [PATCH] Revert "intel_iommu: Fix irqchip / X2APIC configuration checks"
Date: Thu, 22 Sep 2022 12:40:01 -0400

On Thu, Sep 22, 2022 at 03:46:17PM +0200, Igor Mammedov wrote:
> On Wed, 21 Sep 2022 12:12:27 -0400
> Peter Xu <peterx@redhat.com> wrote:
> 
> > It's true that when vcpus<=255 we don't require the length of 32bit APIC
> > IDs.  However here since we already have EIM=ON it means the hypervisor
> > will declare the VM as x2apic supported (e.g. VT-d ECAP register will have
> > EIM bit 4 set), so the guest should assume the APIC IDs are 32bits width
> > even if vcpus<=255.  In short, commit 77250171bdc breaks any simple cmdline
> > that wants to boot a VM with >=9 but <=255 vcpus with:
> > 
> >   -device intel-iommu,intremap=on
> > 
> > For anyone who does not want to enable x2apic, we can use eim=off in the
> > intel-iommu parameters to skip enabling KVM x2apic.
> > 
> > This partly reverts commit 77250171bdc02aee106083fd2a068147befa1a38, while
> > keeping the valid bit on checking split irqchip, but revert the other 
> > change.
> > 
> > Cc: David Woodhouse <dwmw2@infradead.org>
> > Cc: Claudio Fontana <cfontana@suse.de>
> > Cc: Igor Mammedov <imammedo@redhat.com>
> > Signed-off-by: Peter Xu <peterx@redhat.com>
> > ---
> >  hw/i386/intel_iommu.c | 5 +++++
> >  1 file changed, 5 insertions(+)
> > 
> > diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
> > index 05d53a1aa9..6524c2ee32 100644
> > --- a/hw/i386/intel_iommu.c
> > +++ b/hw/i386/intel_iommu.c
> > @@ -3818,6 +3818,11 @@ static bool vtd_decide_config(IntelIOMMUState *s, 
> > Error **errp)
> >              error_setg(errp, "eim=on requires 
> > accel=kvm,kernel-irqchip=split");
> >              return false;
> >          }
> > +        if (!kvm_enable_x2apic()) {
> 
> above 'check' has side-effects
> if it's supposed to be a check it would be better to use kvm_has_x2apic_api()
> instead.

A check is not what I wanted.

As stated in the commit message, since for some reason EIM is enabled on
the VT-d device already, we need to enable x2apic for the whole guest
(including KVM) to match with EIM=on being declared even if vcpus<255.

> 
> Also 77250171bdc says:
> "
>     The check on kvm_enable_x2apic() needs to happen *anyway* in order to
>     allow CPUs above 254 even without an IOMMU, so allow that to happen
>     elsewhere.
> "
> 
> Looking for that elsewhere, it looks like commit dc89f32d92b was supposed
> to take care of removed chunk, but that is not reachable because of > 255 
> vCPUs"
> 
> Likely 77250171bdc just exposed a bug in dc89f32d92b, where
> the later removed kvm_enable_x2apic() always called (with split irqchip)
> and made it called only when > 255 vCPUs.
> 
> So migration wise it looks like all version with it and less than 255 cpus
> are broken.
> 
> Wait earlier c1bb5418e3, introduced that
>    kvm_irqchip_is_split() && kvm_enable_x2apic()
> 'condition', also without any compat machinery to keep old behavior.

There's actually some attempt to be compatible, with:

 GlobalProperty pc_compat_5_1[] = {
     { "ICH9-LPC", "x-smi-cpu-hotplug", "off" },
+    { TYPE_X86_CPU, "kvm-msi-ext-dest-id", "off" },
 };

But I don't think that's strictly correct, because afaict that only
controls whether guest enables it or not (I can only see Linux does it that
way; no idea whether that's detected to other OSes from the PV interfaces).
The KVM x2apic will still be enabled even on old machines I think, as you
mentioned.

> 
> And before that kvm_enable_x2apic() was affecting only configuration
> with intel_iommu (fb506e701e9b).

Right, afaict that's what we "officially" support.

> 
> I'm not sure if anything could be salvaged here migration wise

This whole thing is indeed very unfortunate.  For easier reference of
future, here are the list of commits that are relevant in time order:

fb506e701e ("intel_iommu: reject broken EIM", 2016-10-17)
c1bb5418e3 ("target/i386: Support up to 32768 CPUs without IRQ remapping", 
2020-12-10)
77250171bd ("intel_iommu: Fix irqchip / X2APIC configuration checks", 
2022-05-16)
dc89f32d92 ("target/i386: Fix sanity check on max APIC ID / X2APIC enablement", 
2022-05-16)

So regarding compatibility I'm wondering whether we should loose it for
this case, depending on whether vendors (like RH, or QEMU community in
general) should support "allowing 32K vcpu without vIOMMU" as an "official"
feature, or treat it as "experimental only".

IMO it's more important to always make the officially supported bits
compatible and work as expected.  Here IIUC the (only) official way to
support >255 vcpus should still be using vIOMMU with EIM enabled so far.
But I'm happy to be corrected..

If so, I would still suggest having such a patch because it should fix one
of the basic use cases with vIOMMU and currently upstream is broken on it.
This patch will definitely change the behavior again, but the old was
simply broken and we don't really have much choice out of it, IMHO.

Thanks,

> 
> PS:
> I'd keep kvm_enable_x2apic() only in corrected x86_cpus_init()
> and use kvm_has_x2apic_api() elsewhere for checks and bailing out.
> 
> 
> > +            error_setg(errp, "eim=on requires support on the KVM side"
> > +                             "(X2APIC_API, first shipped in v4.7)");
> > +            return false;
> > +        }
> 
> 
> 
> >      }
> >  
> >      /* Currently only address widths supported are 39 and 48 bits */
> 

-- 
Peter Xu




reply via email to

[Prev in Thread] Current Thread [Next in Thread]