From: Alex Williamson
Subject: Re: [Qemu-devel] [PATCH RFC 0/4] intel_iommu: Do sanity check of vfio-pci earlier
Date: Mon, 12 Aug 2019 10:24:53 -0600

On Mon, 12 Aug 2019 09:45:27 +0200
Peter Xu <address@hidden> wrote:

> This is an RFC series.
> 
> The VT-d code has some defects; one of them is that we cannot detect
> the misuse of vIOMMU and vfio-pci early enough.
> 
> For example, logically this is not allowed:
> 
>   -device intel-iommu,caching-mode=off \
>   -device vfio-pci,host=05:00.0

Do we require intel-iommu with intremap=on in order to get x2apic for
large vCPU count guests?  If so, wouldn't it be a valid configuration
for the user to specify:

   -device intel-iommu,caching-mode=off,intremap=on \
   -device vfio-pci,host=05:00.0

so long as they never have any intention of the guest enabling DMA
translation?  Would there be any advantage to this config versus
caching-mode=on?  I suspect the overhead of CM=1 when only using
interrupt remapping is small to non-existent, but are there other
reasons for running with CM=0, perhaps guest drivers not supporting it?
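
For what it's worth, the only guest-visible difference caching-mode should
make is the CM bit in the VT-d capability register; a minimal sketch of that
wiring (vtd_build_cap() is an assumed name, not the actual intel_iommu.c
helper):

    /* CM (Caching Mode) is bit 7 of the VT-d Capability Register.  With
     * CM=1 the guest must invalidate the IOTLB even when it creates new
     * (not-present -> present) mappings, which is what lets QEMU see
     * those mappings and relay them to vfio. */
    #define VTD_CAP_CM      (1ULL << 7)

    /* Illustrative helper, not the real QEMU code. */
    static uint64_t vtd_build_cap(bool caching_mode)
    {
        uint64_t cap = 0;       /* other capability bits elided */

        if (caching_mode) {
            cap |= VTD_CAP_CM;  /* -device intel-iommu,caching-mode=on */
        }
        return cap;
    }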

I like the idea of being able to nak an incompatible hot-add rather
than kill the VM, we could narrow that even further to look at not only
whether caching-mode support is enabled, but also whether translation
is enabled on the vIOMMU.  Ideally we might disallow the guest from
enabling translation in such a configuration, but the Linux code is not
written with the expectation that the hardware can refuse to enable
translation and there are no capability bits to remove the DMA
translation capability of the IOMMU.  Still, we might want to think
about which is the better user experience, to have the guest panic when
DMA_GSTS_TES never becomes set (as it seems Linux would do) or to have
QEMU exit, or as proposed here, prevent all configurations where this
might occur.  Thanks,

Alex
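
To make that narrowing concrete, something along these lines on the vIOMMU
side; the function and its use are illustrative, not taken from the posted
series:

    /* Refuse a vfio-pci hot-add only when it can no longer work: caching
     * mode is off and the guest has already turned DMA translation on.
     * Assumes the IntelIOMMUState fields from hw/i386/intel_iommu.h. */
    static bool vtd_vfio_hotplug_allowed(IntelIOMMUState *s, Error **errp)
    {
        bool dma_translation_on = s->gcmd & VTD_GCMD_TE;

        if (!s->caching_mode && dma_translation_on) {
            error_setg(errp, "vfio-pci requires intel-iommu caching-mode=on "
                       "once the guest has enabled DMA translation");
            return false;
        }
        return true;
    }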

> Because caching mode is required to make vfio-pci devices
> functional.
> 
> Previously we did this sanity check in vtd_iommu_notify_flag_changed(),
> that is, when the memory regions change their attributes.  However,
> that is too late in most cases, because the memory region layouts only
> change after the IOMMU is enabled, which normally happens while the
> guest OS boots.  So when the configuration is wrong, we only bail out
> while the guest is booting rather than simply telling the user before
> QEMU starts.
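
For reference, the pre-series check has roughly this shape (paraphrased, not
the exact hw/i386/intel_iommu.c code):

    static void vtd_iommu_notify_flag_changed(IOMMUMemoryRegion *iommu,
                                              IOMMUNotifierFlag old,
                                              IOMMUNotifierFlag new)
    {
        VTDAddressSpace *vtd_as = container_of(iommu, VTDAddressSpace, iommu);
        IntelIOMMUState *s = vtd_as->iommu_state;

        /* Only reached once the guest enables DMA remapping and the
         * memory region layout changes, i.e. in the middle of boot. */
        if (!s->caching_mode && new & IOMMU_NOTIFIER_MAP) {
            error_report("We need to set caching-mode=on for intel-iommu "
                         "to enable device assignment with IOMMU protection.");
            exit(1);
        }
        /* ... remaining notifier bookkeeping elided ... */
    }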
> 
> The same problem happens on device hotplug, say, when we have this:
> 
>   -device intel-iommu,caching-mode=off
> 
> Then we do something like:
> 
>   (HMP) device_add vfio-pci,host=05:00.0,bus=pcie.1
> 
> If the vIOMMU is already enabled in the guest at that time, the QEMU
> process will simply quit due to this hotplug event.  This is
> a bit insane...
> 
> This series tries to solve above two problems by introducing two
> sanity checks upon these places separately:
> 
>   - machine done
>   - hotplug device
> 
> This is a bit awkward, but I hope it is still better than before.
> There are of course other solutions, such as hard-coding the check
> into vfio-pci, but that seems even less pretty.  I couldn't think of
> a better way to do this; if there is one, please kindly shout out.
> 
> Please have a look to see whether this would be acceptable, thanks.
> 
> Peter Xu (4):
>   intel_iommu: Sanity check vfio-pci config on machine init done
>   qdev/machine: Introduce hotplug_allowed hook
>   pc/q35: Disallow vfio-pci hotplug without VT-d caching mode
>   intel_iommu: Remove the caching-mode check during flag change
> 
>  hw/core/qdev.c         | 17 +++++++++++++++++
>  hw/i386/intel_iommu.c  | 40 ++++++++++++++++++++++++++++++++++------
>  hw/i386/pc.c           | 21 +++++++++++++++++++++
>  include/hw/boards.h    |  9 +++++++++
>  include/hw/qdev-core.h |  1 +
>  qdev-monitor.c         |  7 +++++++
>  6 files changed, 89 insertions(+), 6 deletions(-)
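
From the patch titles, the hotplug half presumably becomes a machine-level
hook consulted from device_add; one plausible shape, inferred rather than
taken from the diff:

    /* include/hw/boards.h (assumed): let the machine veto a hot-add. */
    struct MachineClass {
        /* ... existing fields ... */
        bool (*hotplug_allowed)(MachineState *state, DeviceState *dev,
                                Error **errp);
    };

    /* Called from device_add before the new device is realized (assumed
     * placement); defaults to allowing the hotplug when no hook is set. */
    static bool qdev_hotplug_allowed(DeviceState *dev, Error **errp)
    {
        Object *m_obj = qdev_get_machine();
        MachineClass *mc;

        if (object_dynamic_cast(m_obj, TYPE_MACHINE)) {
            mc = MACHINE_GET_CLASS(MACHINE(m_obj));
            if (mc->hotplug_allowed) {
                return mc->hotplug_allowed(MACHINE(m_obj), dev, errp);
            }
        }
        return true;
    }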
> 