qemu-devel

Re: [RFC PATCH 0/3] vfio/migration: Support manual clear vfio dirty log


From: Kunkun Jiang
Subject: Re: [RFC PATCH 0/3] vfio/migration: Support manual clear vfio dirty log
Date: Thu, 18 Mar 2021 20:28:33 +0800
User-agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:78.0) Gecko/20100101 Thunderbird/78.8.1

Hi Kevin,

On 2021/3/18 17:04, Tian, Kevin wrote:
From: Kunkun Jiang <jiangkunkun@huawei.com>
Sent: Thursday, March 18, 2021 3:59 PM

Hi Kevin,

On 2021/3/18 14:28, Tian, Kevin wrote:
From: Kunkun Jiang
Sent: Wednesday, March 10, 2021 5:41 PM

Hi all,

In the past, we cleared the dirty log immediately after syncing it to
userspace. This may cause redundant dirty handling if userspace
handles the dirty log iteratively:

After vfio clears the dirty log, new dirty entries start to be generated.
These new entries will be reported to userspace in the next round even if
the corresponding writes happened before userspace handled (and therefore
already transmitted) the same dirty pages.
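
To make the window concrete, here is a minimal sketch of one old-style
iteration, with hypothetical helper names (sync_and_clear_dirty_log(),
send_page()) rather than the real VFIO interfaces:

/* Hypothetical stand-ins, not the real VFIO UAPI. */
void sync_and_clear_dirty_log(unsigned long *bitmap);
void send_page(unsigned long pfn);

void old_style_round(unsigned long *bitmap, unsigned long pfn_p)
{
    sync_and_clear_dirty_log(bitmap);   /* T0: sync also clears the log */

    /* T1 (> T0): device DMA rewrites page P -> P is logged dirty again. */

    send_page(pfn_p);                   /* T2 (> T1): the copy sent already
                                         * contains the T1 write           */

    /* Next round reports P dirty and re-sends it, although its latest
     * contents went out at T2 -- the redundant handling described above. */
}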

Since a new dirty log tracking method for vfio based on iommu hwdbm [1]
has been introduced in the kernel, together with a new capability named
VFIO_DIRTY_LOG_MANUAL_CLEAR, we can eliminate some of this redundant
dirty handling by supporting it.
Is there any performance data showing the benefit of this new method?

The current dirty log tracking methods for VFIO are:
(1) all pages are marked dirty if not all iommu_groups have pinned_scope;
(2) only the pages pinned by the various vendor drivers are marked dirty
if all iommu_groups have pinned_scope.

Both methods are coarse-grained and cannot determine which pages are
really dirty. Each round may mark pages that are not really dirty as
dirty and send them to the destination. (It might be better if the range
of the pinned_scope were smaller.) This results in a waste of resources.

HWDBM is short for Hardware Dirty Bit Management
(e.g. SMMUv3 HTTU, Hardware Translation Table Update).

About SMMU HTTU:
HTTU is a feature of the Arm SMMUv3 that allows the hardware to update
the access flag and/or the dirty state of a TTD (Translation Table Descriptor).

With HTTU, a stage-1 TTD is classified into 3 types:

                      DBM bit    AP[2] (read-only bit)
1. writable_clean        1                1
2. writable_dirty        1                0
3. readonly              0                1

If HTTU_HD (hardware management of the dirty state) is enabled, the SMMU
changes a TTD from writable_clean to writable_dirty when the device writes
through it. Software can then scan the TTDs to sync the dirty state into a
dirty bitmap. With this feature, we can track the dirty log of DMA
continuously and precisely.
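
As a rough sketch of how software might scan for this (assuming the
standard VMSAv8-A leaf-descriptor layout, AP[2] at bit 7 and DBM at
bit 51; the real driver operates on the io-pgtable structures, so the
names here are illustrative only):

#include <stdbool.h>
#include <stdint.h>

#define TTD_AP2_RDONLY  (1ULL << 7)    /* AP[2]: 1 = read-only        */
#define TTD_DBM         (1ULL << 51)   /* DBM: dirty bit modifier     */

/* writable_dirty: DBM set, AP[2] cleared by hardware on a DMA write. */
static bool ttd_is_writable_dirty(uint64_t ttd)
{
    return (ttd & TTD_DBM) && !(ttd & TTD_AP2_RDONLY);
}

/* Sync one descriptor: report whether it was dirty, and set AP[2] back
 * to 1 (writable_clean) so the next DMA write is caught again.
 */
static uint64_t ttd_sync_dirty(uint64_t ttd, bool *dirty)
{
    *dirty = ttd_is_writable_dirty(ttd);
    return ttd | TTD_AP2_RDONLY;
}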

The VFIO_DIRTY_LOG_MANUAL_CLEAR capability is similar to the one on
the KVM side. We add the new log_clear() interface only to split the old
log_sync() into two separate procedures:

- use log_sync() to collect the dirty bitmap only, and
- use log_clear() to clear the dirty bitmap, as sketched below.
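
A minimal sketch of the resulting migration loop, with hypothetical
wrappers standing in for the two callbacks (vfio_log_sync() and
vfio_log_clear() are illustrative names, not the actual QEMU code):

#include <stdint.h>

/* Hypothetical wrappers around the log_sync()/log_clear() callbacks. */
void vfio_log_sync(unsigned long *bitmap, uint64_t start, uint64_t size);
void vfio_log_clear(uint64_t start, uint64_t size);
void send_dirty_pages(unsigned long *bitmap, uint64_t start, uint64_t size);

void migration_round(unsigned long *bitmap, uint64_t ram_size)
{
    const uint64_t chunk = 1ULL << 30;           /* 1 GiB per step */

    vfio_log_sync(bitmap, 0, ram_size);          /* collect only, no clear */

    for (uint64_t off = 0; off < ram_size; off += chunk) {
        uint64_t len = ram_size - off < chunk ? ram_size - off : chunk;

        /* Clear right before handling this range: writes landing after
         * this point are logged for the next round, while earlier writes
         * are carried by the send below -- shrinking the redundancy
         * window from [sync, send] to [clear, send].
         */
        vfio_log_clear(off, len);
        send_dirty_pages(bitmap, off, len);
    }
}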

If you're interested in this new method, you can take a look at our set of
patches.
[1] https://lore.kernel.org/linux-iommu/20210310090614.26668-1-zhukeqian1@huawei.com/

I know what you are doing. Intel is also working on VT-d dirty bit support
based on the above link. What I'm curious about is the actual performance
gain from this optimization. KVM doing this is one good reference, but an
IOMMU has different characteristics (e.g. longer invalidation latency)
compared to a CPU MMU. It's always good to understand what a so-called
optimization can actually optimize in a context different from where it was
originally proven. 😊

Thanks
Kevin

My understanding is that this is a new method, which is quite different from
the previous two. So could you explain in more detail what performance data
you want? 😁

Thanks,
Kunkun Jiang



