Re: [PATCH v9 Kernel 2/5] vfio iommu: Add ioctl defination to get dirty
From: |
Yan Zhao |
Subject: |
Re: [PATCH v9 Kernel 2/5] vfio iommu: Add ioctl defination to get dirty pages bitmap. |
Date: |
Thu, 5 Dec 2019 00:47:39 -0500 |
User-agent: |
Mutt/1.9.4 (2018-02-28) |
On Thu, Dec 05, 2019 at 01:42:23PM +0800, Kirti Wankhede wrote:
>
>
> On 12/5/2019 6:58 AM, Yan Zhao wrote:
> > On Thu, Dec 05, 2019 at 02:34:57AM +0800, Alex Williamson wrote:
> >> On Wed, 4 Dec 2019 23:40:25 +0530
> >> Kirti Wankhede <address@hidden> wrote:
> >>
> >>> On 12/3/2019 11:34 PM, Alex Williamson wrote:
> >>>> On Mon, 25 Nov 2019 19:57:39 -0500
> >>>> Yan Zhao <address@hidden> wrote:
> >>>>
> >>>>> On Fri, Nov 15, 2019 at 05:06:25AM +0800, Alex Williamson wrote:
> >>>>>> On Fri, 15 Nov 2019 00:26:07 +0530
> >>>>>> Kirti Wankhede <address@hidden> wrote:
> >>>>>>
> >>>>>>> On 11/14/2019 1:37 AM, Alex Williamson wrote:
> >>>>>>>> On Thu, 14 Nov 2019 01:07:21 +0530
> >>>>>>>> Kirti Wankhede <address@hidden> wrote:
> >>>>>>>>
> >>>>>>>>> On 11/13/2019 4:00 AM, Alex Williamson wrote:
> >>>>>>>>>> On Tue, 12 Nov 2019 22:33:37 +0530
> >>>>>>>>>> Kirti Wankhede <address@hidden> wrote:
> >>>>>>>>>>
> >>>>>>>>>>> All pages pinned by vendor driver through vfio_pin_pages API
> >>>>>>>>>>> should be
> >>>>>>>>>>> considered as dirty during migration. IOMMU container maintains a
> >>>>>>>>>>> list of
> >>>>>>>>>>> all such pinned pages. Added an ioctl defination to get bitmap of
> >>>>>>>>>>> such
> >>>>>>>>>>
> >>>>>>>>>> definition
> >>>>>>>>>>
> >>>>>>>>>>> pinned pages for requested IO virtual address range.
> >>>>>>>>>>
> >>>>>>>>>> Additionally, all mapped pages are considered dirty when physically
> >>>>>>>>>> mapped through to an IOMMU, modulo we discussed devices opting in
> >>>>>>>>>> to
> >>>>>>>>>> per page pinning to indicate finer granularity with a TBD
> >>>>>>>>>> mechanism to
> >>>>>>>>>> figure out if any non-opt-in devices remain.
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> You mean, in case of device direct assignment (device pass through)?
> >>>>>>>>
> >>>>>>>> Yes, or IOMMU backed mdevs. If vfio_dmas in the container are fully
> >>>>>>>> pinned and mapped, then the correct dirty page set is all mapped
> >>>>>>>> pages.
> >>>>>>>> We discussed using the vpfn list as a mechanism for vendor drivers to
> >>>>>>>> reduce their migration footprint, but we also discussed that we would
> >>>>>>>> need a way to determine that all participants in the container have
> >>>>>>>> explicitly pinned their working pages or else we must consider the
> >>>>>>>> entire potential working set as dirty.
> >>>>>>>>
> >>>>>>>
> >>>>>>> How can vendor driver tell this capability to iommu module? Any
> >>>>>>> suggestions?
> >>>>>>
> >>>>>> I think it does so by pinning pages. Is it acceptable that if the
> >>>>>> vendor driver pins any pages, then from that point forward we consider
> >>>>>> the IOMMU group dirty page scope to be limited to pinned pages? There
> >>>>> we should also be aware of that dirty page scope is pinned pages +
> >>>>> unpinned pages,
> >>>>> which means ever since a page is pinned, it should be regarded as dirty
> >>>>> no matter whether it's unpinned later. only after log_sync is called and
> >>>>> dirty info retrieved, its dirty state should be cleared.
> >>>>
> >>>> Yes, good point. We can't just remove a vpfn when a page is unpinned
> >>>> or else we'd lose information that the page potentially had been
> >>>> dirtied while it was pinned. Maybe that vpfn needs to move to a dirty
> >>>> list and both the currently pinned vpfns and the dirty vpfns are walked
> >>>> on a log_sync. The dirty vpfns list would be cleared after a log_sync.
> >>>> The container would need to know that dirty tracking is enabled and
> >>>> only manage the dirty vpfns list when necessary. Thanks,
> >>>>
> >>>
> >>> If page is unpinned, then that page is available in free page pool for
> >>> others to use, then how can we say that unpinned page has valid data?
> >>>
> >>> If suppose, one driver A unpins a page and when driver B of some other
> >>> device gets that page and he pins it, uses it, and then unpins it, then
> >>> how can we say that page has valid data for driver A?
> >>>
> >>> Can you give one example where unpinned page data is considered reliable
> >>> and valid?
> >>
> >> We can only pin pages that the user has already allocated* and mapped
> >> through the vfio DMA API. The pinning of the page simply locks the
> >> page for the vendor driver to access it and unpinning that page only
> >> indicates that access is complete. Pages are not freed when a vendor
> >> driver unpins them, they still exist and at this point we're now
> >> assuming the device dirtied the page while it was pinned. Thanks,
> >>
> >> Alex
> >>
> >> * An exception here is that the page might be demand allocated and the
> >> act of pinning the page could actually allocate the backing page for
> >> the user if they have not faulted the page to trigger that allocation
> >> previously. That page remains mapped for the user's virtual address
> >> space even after the unpinning though.
> >>
> >
> > Yes, I can give an example in GVT.
> > when a gem_object is allocated in guest, before submitting it to guest
> > vGPU, gfx cmds in its ring buffer need to be pinned into GGTT to get a
> > global graphics address for hardware access. At that time, we shadow
> > those cmds and pin pages through vfio pin_pages(), and submit the shadow
> > gem_object to physical hardware.
> > After guest driver thinks the submitted gem_object has completed hardware
> > DMA, it unpins those pinned GGTT graphics memory addresses. Then in
> > host, we unpin the shadow pages through vfio unpin_pages.
> > But, at this point, guest driver is still free to access the gem_object
> > through vCPUs, and guest user space is probably still mapping an object
> > into the gem_object in guest driver.
> > So, missing the dirty page tracking for unpinned pages would cause
> > data inconsistency.
> >
>
> If pages are accessed by guest through vCPUs, then RAM module in QEMU
> will take care of tracking those pages as dirty.
>
> All unpinned pages might not be used, so tracking all unpinned pages
> during VM or application life time would also lead to tracking lots of
> stale pages, even though they are not being used. An increasing number of
> unneeded pages could also increase the migration data, leading to an
> increase in migration downtime.
>
> Thanks,
> Kirti
Those pages are dirtied by the vGPU while they are pinned, i.e. between
pin and unpin. They are not dirtied by vCPUs, so the RAM module in QEMU
has no knowledge of them.
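
To make the semantics discussed above concrete, here is a minimal sketch of
the "currently pinned + unpinned-but-dirty" bookkeeping Alex described. It is
not the vfio implementation: the struct, the function names and the 64-page
bitmap are all hypothetical, chosen only to show that an unpinned page stays
reported as dirty until the next log_sync.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical: track a range of 64 pages, one bit per page. */
#define NPAGES 64

struct dirty_tracker {
    uint64_t pinned;         /* bit n set: page n is currently pinned */
    uint64_t unpinned_dirty; /* bit n set: page n was unpinned since the last sync */
};

/* Pinning gives the device access, so the page is a dirty candidate. */
static void tracker_pin(struct dirty_tracker *t, unsigned int pfn)
{
    assert(pfn < NPAGES);
    t->pinned |= UINT64_C(1) << pfn;
}

/*
 * Unpinning only means device access is complete. The page may have been
 * written while pinned, so it moves to the dirty set instead of vanishing.
 */
static void tracker_unpin(struct dirty_tracker *t, unsigned int pfn)
{
    assert(pfn < NPAGES);
    t->pinned &= ~(UINT64_C(1) << pfn);
    t->unpinned_dirty |= UINT64_C(1) << pfn;
}

/*
 * log_sync reports both sets, then clears only the unpinned-dirty set.
 * Pages that are still pinned keep being reported on every sync.
 */
static uint64_t tracker_log_sync(struct dirty_tracker *t)
{
    uint64_t report = t->pinned | t->unpinned_dirty;

    t->unpinned_dirty = 0;
    return report;
}
```

With this bookkeeping, a page that is pinned and then unpinned before the
first log_sync is still reported once, and disappears from later syncs unless
it is pinned again.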