qemu-devel

Re: [Qemu-devel] QEMU and vIOMMU support for emulated VF passthrough to nested (L2) VM


From: Tian, Kevin
Subject: Re: [Qemu-devel] QEMU and vIOMMU support for emulated VF passthrough to nested (L2) VM
Date: Mon, 8 Apr 2019 00:32:12 +0000

> From: Elijah Shakkour [mailto:address@hidden
> Sent: Sunday, April 7, 2019 9:47 PM
> 
> 
> > -----Original Message-----
> > From: Tian, Kevin <address@hidden>
> > Sent: Thursday, April 4, 2019 10:58 AM
> > To: Peter Xu <address@hidden>; Elijah Shakkour
> > <address@hidden>
> > Cc: Knut Omang <address@hidden>; Michael S. Tsirkin
> > <address@hidden>; Alex Williamson <address@hidden>;
> > Marcel Apfelbaum <address@hidden>; Stefan Hajnoczi
> > <address@hidden>; address@hidden
> > Subject: RE: QEMU and vIOMMU support for emulated VF passthrough to
> > nested (L2) VM
> >
> > > From: Peter Xu [mailto:address@hidden
> > > Sent: Thursday, April 4, 2019 3:00 PM
> > >
> > > On Wed, Apr 03, 2019 at 10:10:35PM +0000, Elijah Shakkour wrote:
> > >
> > > [...]
> > >
> > > > > > > > > > You can also try to enable VT-d device log by appending:
> > > > > > > > > >
> > > > > > > > > >   -trace enable="vtd_*"
> > > > > > > > > >
> > > > > > > > > > In case it dumps anything useful for you.
> > > > > > >
> > > > > > > Here is the relevant dump (dev 01:00.01 is my VF):
> > > > > > > "
> > > > > > > vtd_inv_desc_cc_device context invalidate device 01:00.01
> > > > > > > vtd_ce_not_present Context entry bus 1 devfn 1 not present
> > > > > > > > vtd_switch_address_space Device 01:00.1 switching address
> > > > > > > > space (iommu enabled=1)
> > > > > > > > vtd_ce_not_present Context entry bus 1 devfn 1 not present
> > > > > > > > vtd_err Detected invalid context entry when trying to sync
> > > > > > > > shadow page table
> > > > > >
> > > > > > These lines mean that the guest sent a device invalidation to
> > > > > > > your VF but the IOMMU found that the device context entry is
> > > > > > > missing.
> > > > > >
> > > > > > > > vtd_iotlb_cc_update IOTLB context update bus 0x1 devfn 0x1
> > > > > > > > high 0x102 low 0x2d007003 gen 0 -> gen 2
> > > > > > > > vtd_err_dmar_slpte_resv_error iova 0xf08e7000 level 2 slpte
> > > > > > > > 0x2a54008f7
> > > > > >
> > > > > > This line should not exist in latest QEMU.  Are you sure you're
> > > > > > using the latest QEMU?
> > > > >
> > > > > I moved now to QEMU 4.0 RC2.
> > > > > > This is what I get now:
> > > > > > vtd_iotlb_cc_update IOTLB context update bus 0x1 devfn 0x1 high
> > > > > > 0x102 low 0x2f007003 gen 0 -> gen 1
> > > > > > qemu-system-x86_64: vtd_iova_to_slpte: detected splte reserve
> > > > > > non-zero iova=0xf0d29000, level=0x2slpte=0x29f6008f7)
> > > > > > vtd_fault_disabled Fault processing disabled for context entry
> > > > > > qemu-system-x86_64: vtd_iommu_translate: detected translation
> > > > > > failure (dev=01:00:01, iova=0xf0d29000)
> > > > > > Unassigned mem read 00000000f0d29000
> > > > >
> > > > > > I'm not familiar with vIOMMU registers, but I noticed that I must
> > > > > > report snoop control support to Hyper-V (i.e. bit 7 in the
> > > > > > extended capability register of vIOMMU) in order to satisfy
> > > > > > IOMMU support for SRIOV.
> > > > > vIOMMU.ecap before    0xf00f5e
> > > > > vIOMMU.ecap after       0xf00fde
> > > > > But I see that vIOMMU doesn't really support snoop control.
> > > > > Could this be the problem that fails IOVA range check in this
> > > > > function vtd_iova_range_check()?
> > > >
> > > > > Sorry, I meant the SLPTE reserved non-zero check failure in
> > > > > vtd_slpte_nonzero_rsvd(),
> > > > > and NOT the IOVA range check failure (since the range check didn't fail).
> > >
> > > > Probably.  Currently VT-d emulation does not support snooping control,
> > > > and if you modify only that ecap bit you will probably hit this
> > > > problem: the guest kernel will then set the SNP bit in the
> > > > IOMMU page table entries, which violates the reserved-bit check in the
> > > > emulation code, and so you see these errors.
> > >
> > > > Now talking about implementing the Snoop Control for Intel IOMMU for
> > > > real (which corresponds to vt-d ecap bit 7) - I confess I'm not 100%
> > > > clear on what "snooping" means and what we need to do as an
> > > > emulator. I'm quoting from the spec:
> > >
> > >   "Snoop behavior for a memory access (to a translation structure
> > >   entry or access to the mapped page) specifies if the access is
> > >   coherent (snoops the processor caches) or not."
> > >
> > > > If it is only a capability showing whether the hardware is
> > > > capable of snooping processor caches, then I don't think we need to do
> > > > much here as an emulator of VT-d, simply because when we access the
> > > > data we are still on the processor's side (we're emulating
> > > > the IOMMU behavior only), so the cache should always be coherent from
> > > > the POV of guest vCPUs, just like how the processors provide cache
> > > > coherence between two cores (so IMHO the VT-d emulation code can
> > > > be run on one core/thread, and the vcpu which runs the guest iommu
> > > > driver can be run on another core/thread).  If so, maybe we can simply
> > > > declare support of that, but we at least also need to remove the SNP
> > > > bit from the vtd_paging_entry_rsvd_field[] array to reflect that we
> > > > understand that bit.
> > >
> > > CCing Alex and Kevin to see whether I'm misunderstanding or in case of
> > > any further input on the snooping support.
> > >
> >
> > For software DMA, yes, snoop is guaranteed since it's just CPU access.
> >
> > However, for a VFIO device, i.e. hardware DMA, snoop should be reported
> > based on physical IOMMU capability. It's fine to report no snoop control on
> > vIOMMU (current state) even when it's physically supported. It just means
> > that the L1 VMM must favor guest cache attributes instead of forcing WB in
> > L1 EPT when doing nested passthrough. However, it's incorrect to report
> > snoop control on vIOMMU when it's not physically supported; otherwise the
> > L1 VMM may force WB in L1 EPT and enable the snoop field in the vIOMMU 2nd
> > level PTE on the assumption that hardware snoop is guaranteed (when it
> > isn't). Then it becomes a correctness issue.
> >
> 
> If my device is fully emulated, can I ignore the SNP bit in the SLPTE? What is
> the cost of ignoring it in such a case? What could go wrong?
> (I tried to ignore it and it seems that translations work for me now).
> 

I'm not sure what you meant by 'ignore' here. But as Peter pointed
out earlier, for emulated devices you don't need to do anything special
here. You can just report the snoop capability and then remove the SNP
bit from the reserved bit check on the SLPTE.
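
To make the two steps concrete, here is a minimal C sketch. It is not
QEMU's code: the names ECAP_SC, SLPTE_SNP, slpte_rsvd_mask() and
slpte_has_rsvd_bits() are hypothetical, and the mask is reduced to the
single SNP bit for illustration (the real check goes through
vtd_paging_entry_rsvd_field[] and covers more bits). The bit positions
are the ones visible in this thread: ecap bit 7 is the difference
between 0xf00f5e and 0xf00fde, and bit 11 is set in the SLPTE values
from the logs.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit positions as discussed in this thread (illustrative names). */
#define ECAP_SC   (1ULL << 7)   /* extended capability: snoop control */
#define SLPTE_SNP (1ULL << 11)  /* second-level PTE: snoop bit */

/*
 * Hypothetical, heavily simplified reserved mask: SNP counts as
 * reserved only while snoop control is not advertised in ecap.
 */
static uint64_t slpte_rsvd_mask(uint64_t ecap)
{
    return (ecap & ECAP_SC) ? 0 : SLPTE_SNP;
}

/* Returns true when the SLPTE sets a bit the emulation treats as reserved. */
static bool slpte_has_rsvd_bits(uint64_t slpte, uint64_t ecap)
{
    return (slpte & slpte_rsvd_mask(ecap)) != 0;
}
```

With the SLPTE from the log above (0x29f6008f7), the check reports a
reserved bit for ecap 0xf00f5e and passes for ecap 0xf00fde. Tying the
mask to the advertised capability is only one possible design; the point
is that the capability bit and the reserved-bit check have to agree.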

Thanks
Kevin
