qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH] Revert "intel_iommu: Fix irqchip / X2APIC configuration chec


From: Igor Mammedov
Subject: Re: [PATCH] Revert "intel_iommu: Fix irqchip / X2APIC configuration checks"
Date: Fri, 23 Sep 2022 10:20:34 +0200

On Thu, 22 Sep 2022 12:40:01 -0400
Peter Xu <peterx@redhat.com> wrote:

> On Thu, Sep 22, 2022 at 03:46:17PM +0200, Igor Mammedov wrote:
> > On Wed, 21 Sep 2022 12:12:27 -0400
> > Peter Xu <peterx@redhat.com> wrote:
> >   
> > > It's true that when vcpus<=255 we don't require the length of 32bit APIC
> > > IDs.  However here since we already have EIM=ON it means the hypervisor
> > > will declare the VM as x2apic supported (e.g. VT-d ECAP register will have
> > > EIM bit 4 set), so the guest should assume the APIC IDs are 32bits width
> > > even if vcpus<=255.  In short, commit 77250171bdc breaks any simple 
> > > cmdline
> > > that wants to boot a VM with >=9 but <=255 vcpus with:
> > > 
> > >   -device intel-iommu,intremap=on
> > > 
> > > For anyone who does not want to enable x2apic, we can use eim=off in the
> > > intel-iommu parameters to skip enabling KVM x2apic.
> > > 
> > > This partly reverts commit 77250171bdc02aee106083fd2a068147befa1a38, while
> > > keeping the valid bit on checking split irqchip, but revert the other 
> > > change.
> > > 
> > > Cc: David Woodhouse <dwmw2@infradead.org>
> > > Cc: Claudio Fontana <cfontana@suse.de>
> > > Cc: Igor Mammedov <imammedo@redhat.com>
> > > Signed-off-by: Peter Xu <peterx@redhat.com>
> > > ---
> > >  hw/i386/intel_iommu.c | 5 +++++
> > >  1 file changed, 5 insertions(+)
> > > 
> > > diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
> > > index 05d53a1aa9..6524c2ee32 100644
> > > --- a/hw/i386/intel_iommu.c
> > > +++ b/hw/i386/intel_iommu.c
> > > @@ -3818,6 +3818,11 @@ static bool vtd_decide_config(IntelIOMMUState *s, 
> > > Error **errp)
> > >              error_setg(errp, "eim=on requires 
> > > accel=kvm,kernel-irqchip=split");
> > >              return false;
> > >          }
> > > +        if (!kvm_enable_x2apic()) {  
> > 
> > above 'check' has side-effects
> > if it's supposed to be a check it would be better to use 
> > kvm_has_x2apic_api()
> > instead.  
> 
> A check is not what I wanted.
> 
> As stated in the commit message, since for some reason EIM is enabled on
> the VT-d device already, we need to enable x2apic for the whole guest
> (including KVM) to match with EIM=on being declared even if vcpus<255.
> 
> > 
> > Also 77250171bdc says:
> > "
> >     The check on kvm_enable_x2apic() needs to happen *anyway* in order to
> >     allow CPUs above 254 even without an IOMMU, so allow that to happen
> >     elsewhere.
> > "
> > 
> > Looking for that elsewhere, it looks like commit dc89f32d92b was supposed
> > to take care of removed chunk, but that is not reachable because of > 255 
> > vCPUs"
> > 
> > Likely 77250171bdc just exposed a bug in dc89f32d92b, where
> > the later removed kvm_enable_x2apic() always called (with split irqchip)
> > and made it called only when > 255 vCPUs.
> > 
> > So migration wise it looks like all version with it and less than 255 cpus
> > are broken.
> > 
> > Wait earlier c1bb5418e3, introduced that
> >    kvm_irqchip_is_split() && kvm_enable_x2apic()
> > 'condition', also without any compat machinery to keep old behavior.  
> 
> There's actually some attempt to be compatible, with:
> 
>  GlobalProperty pc_compat_5_1[] = {
>      { "ICH9-LPC", "x-smi-cpu-hotplug", "off" },
> +    { TYPE_X86_CPU, "kvm-msi-ext-dest-id", "off" },
>  };
> 
> But I don't think that's strictly correct, because afaict that only
> controls whether guest enables it or not (I can only see Linux does it that
> way; no idea whether that's detected to other OSes from the PV interfaces).
> The KVM x2apic will still be enabled even on old machines I think, as you
> mentioned.

yep, that compat affects only kvm-msi-ext-dest-id, the kvm_enable_x2apic()
was still called regardless of that.

> > And before that kvm_enable_x2apic() was affecting only configuration
> > with intel_iommu (fb506e701e9b).  
> 
> Right, afaict that's what we "officially" support.
> 
> > 
> > I'm not sure if anything could be salvaged here migration wise  
> 
> This whole thing is indeed very unfortunate.  For easier reference of
> future, here are the list of commits that are relevant in time order:
> 
> fb506e701e ("intel_iommu: reject broken EIM", 2016-10-17)
> c1bb5418e3 ("target/i386: Support up to 32768 CPUs without IRQ remapping", 
> 2020-12-10)
> 77250171bd ("intel_iommu: Fix irqchip / X2APIC configuration checks", 
> 2022-05-16)
> dc89f32d92 ("target/i386: Fix sanity check on max APIC ID / X2APIC 
> enablement", 2022-05-16)
> 
> So regarding compatibility I'm wondering whether we should loose it for
> this case, depending on whether vendors (like RH, or QEMU community in
> general) should support "allowing 32K vcpu without vIOMMU" as an "official"
> feature, or treat it as "experimental only".

If I recall correctly that's PV only feature that also requires special
tailored guest. i.e. it's possible to deliver IPIs in such configuration
but devices could interact only with a fraction of CPUs (irq-wise) or
something else should take care of IRQ remapping.
I don't think mainstream distributions care about this case much.

> IMO it's more important to always make the officially supported bits
> compatible and work as expected.  Here IIUC the (only) official way to
> support >255 vcpus should still be using vIOMMU with EIM enabled so far.
> But I'm happy to be corrected..
> 
> If so, I would still suggest having such a patch because it should fix one
> of the basic use cases with vIOMMU and currently upstream is broken on it.
> This patch will definitely change the behavior again, but the old was
> simply broken and we don't really have much choice out of it, IMHO.

The thing I worry about is calling kvm_enable_x2apic() from multiple places.
It would be better to fix x86_cpus_init() part of dc89f32d92 and have a check
in IOMMU. But if that's impossible/too ugly, I won't object to this patch.


> Thanks,
> 
> > 
> > PS:
> > I'd keep kvm_enable_x2apic() only in corrected x86_cpus_init()
> > and use kvm_has_x2apic_api() elsewhere for checks and bailing out.
> > 
> >   
> > > +            error_setg(errp, "eim=on requires support on the KVM side"
> > > +                             "(X2APIC_API, first shipped in v4.7)");
> > > +            return false;
> > > +        }  
> > 
> > 
> >   
> > >      }
> > >  
> > >      /* Currently only address widths supported are 39 and 48 bits */  
> >   
> 




reply via email to

[Prev in Thread] Current Thread [Next in Thread]