qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH 8/8] x86-iommu: Fail early if vIOMMU specified after vfio-pci


From: Peter Xu
Subject: Re: [PATCH 8/8] x86-iommu: Fail early if vIOMMU specified after vfio-pci
Date: Fri, 22 Oct 2021 10:14:29 +0800

Hi, Alex,

On Thu, Oct 21, 2021 at 04:30:39PM -0600, Alex Williamson wrote:
> On Thu, 21 Oct 2021 18:42:59 +0800
> Peter Xu <peterx@redhat.com> wrote:
> 
> > Scan the pci bus to make sure there's no vfio-pci device attached before 
> > vIOMMU
> > is realized.
> 
> Sorry, I'm not onboard with this solution at all.
> 
> It would be really useful though if this commit log or a code comment
> described exactly the incompatibility for which vfio-pci devices are
> being called out here.  Otherwise I see this as a bit of magic voodoo
> that gets lost in lore and copied elsewhere and we're constantly trying
> to figure out specific incompatibilities when vfio-pci devices are
> trying really hard to be "just another device".

Sure, I can enrich the commit message.

> 
> I infer from the link of the previous alternate solution that this is
> to do with the fact that vfio devices attach a memory listener to the
> device address space.

IMHO it's not about the memory listeners, I think that' after vfio detected
some vIOMMU memory regions already, which must be based on an vIOMMU address
space being available.  I think the problem is that when realize() vfio-pci we
fetch the dma address space specifically for getting the vfio group, while that
could happen too early, even before vIOMMU is created.

> Interestingly that previous cover letter also discusses how vdpa devices
> might have a similar issue, which makes it confusing again that we're calling
> out vfio-pci devices by name rather than for a behavior.

Yes I'll need to see whether this approach will be accepted first.  I think
similar thing could help VDPA but it's not required there because VDPA has
already worked around using pci_device_iommu_address_space().  So potentially
the only one to "fix" is the vfio-pci device using along with vIOMMU, when the
device ordering is specified in the wrong order.  I'll leave the VDPA problem
to Jason to see whether he prefers keeping current code, or switch to a simpler
one.  That should be after this one.

> 
> If the behavior here is that vfio-pci devices attach a listener to the
> device address space, then that provides a couple possible options.  We
> could look for devices that have recorded an interest in their address
> space, such as by setting a flag on PCIDevice when someone calls
> pci_device_iommu_address_space(), where we could walk all devices using
> the code in this series to find a device with such a flag.

Right, we can set a flag for all the pci devices that needs to consolidate
pci_device_iommu_address_space() result, however then it'll be vfio-pci only so
far.  Btw, I actually proposed similar things two months ago, and I think Igor
showed concern on that flag being vague on meaning:

https://lore.kernel.org/qemu-devel/20210906104915.7dd5c934@redhat.com/

  > > Does it need to be a pre_plug hook?  I thought we might just need a flag 
in the
  > > pci device classes showing that it should be after vIOMMUs, then in vIOMMU
  > > realize functions we walk pci bus to make sure no such device exist?
  > > 
  > > We could have a base vIOMMU class, then that could be in the realize() of 
the
  > > common class.
  > 
  > We basically don't know if device needs IOMMU or not and can work
  > with/without it just fine. In this case I'd think about IOMMU as board
  > feature that morphs PCI buses (some of them) (address space, bus numers, 
...).
  > So I don't perceive any iommu flag as a device property at all.
  > 
  > As for realize vs pre_plug, the later is the part of abstract realize
  > (see: device_set_realized) and is already used by some PCI infrastructure:
  >   ex: pcie_cap_slot_pre_plug_cb/spapr_pci_pre_plug

I still think that flag will work, that flag should only shows "whether this
device needs to be specified earlier than vIOMMU", but I can get the point from
Igor that it's at least confusing on what does the flag mean.  Meanwhile I
don't think that flag will be required, as this is not the first time we name a
special device in the code, e.g. pc_machine_device_pre_plug_cb().
intel_iommu.c has it too upon vfio-pci already on making sure caching-mode=on
in vtd_machine_done_notify_one().

If Igor is okay with adding such a flag for PCIDevice class, I can do that in
the new version.  I don't have a strong opinion on this.

Thanks,

-- 
Peter Xu




reply via email to

[Prev in Thread] Current Thread [Next in Thread]