From: Chenyi Qiang
Subject: Re: [PATCH v4 07/13] ram-block-attribute: Introduce RamBlockAttribute to manage RAMBLock with guest_memfd
Date: Fri, 9 May 2025 15:55:41 +0800
User-agent: Mozilla Thunderbird
Thanks, Baolu, for your review!
On 5/9/2025 2:41 PM, Baolu Lu wrote:
> On 4/7/25 15:49, Chenyi Qiang wrote:
>> Commit 852f0048f3 ("RAMBlock: make guest_memfd require uncoordinated
>> discard") highlighted that subsystems like VFIO may disable RAM block
>> discard. However, guest_memfd relies on discard operations for page
>> conversion between private and shared memory, potentially leading to
>> stale IOMMU mappings when hardware devices are assigned to
>> confidential VMs via shared memory. To address this, it is crucial to
>> ensure that systems like VFIO refresh their IOMMU mappings.
>>
>> PrivateSharedManager is introduced to manage private and shared states in
>> confidential VMs, similar to RamDiscardManager, which supports
>> coordinated RAM discard in VFIO. Integrating PrivateSharedManager with
>> guest_memfd can facilitate the adjustment of VFIO mappings in response
>> to page conversion events.
>>
>> Since guest_memfd is not an object, it cannot directly implement the
>> PrivateSharedManager interface. Implementing it in HostMemoryBackend is
>> not appropriate because guest_memfd is per RAMBlock, and some RAMBlocks
>> have a memory backend while others do not. Notably, virtual BIOS
>> RAMBlocks using memory_region_init_ram_guest_memfd() do not have a
>> backend.
>>
>> To manage RAMBlocks with guest_memfd, define a new object named
>> RamBlockAttribute to implement the PrivateSharedManager interface. This
>> object stores guest_memfd information such as the shared_bitmap and
>> handles page conversion notifications. The memory state is tracked at
>> host page size granularity, as the minimum memory conversion size can
>> be one page per request. Additionally, VFIO expects the DMA mapping for
>> a specific IOVA range to be mapped and unmapped with the same
>> granularity. Confidential VMs may perform partial conversions, such as
>> conversions on small regions within larger ones. To prevent invalid
>> cases, and until support for the cut_mapping operation is available,
>> all operations are performed at 4K granularity.
>
> Just for your information, IOMMUFD plans to introduce support for the
> cut operation. The kickoff patch series is under discussion here:
>
> https://lore.kernel.org/linux-iommu/0-v2-5c26bde5c22d+58b-iommu_pt_jgg@nvidia.com/
Thanks for the info. I see that a new version has just come out.
>
> This new cut support is expected to be exclusive to IOMMUFD and not
> directly available in the VFIO container context. The VFIO uAPI for
> map/unmap is being superseded by IOMMUFD, and all new features will
> only be available in IOMMUFD.
Yeah. I will suggest using iommufd in the test steps in my cover
letter, since this is the direction things are moving.
>
>>
>> Signed-off-by: Chenyi Qiang <chenyi.qiang@intel.com>
>
> <...>
>
>> +
>> +int ram_block_attribute_realize(RamBlockAttribute *attr, MemoryRegion *mr)
>> +{
>> +    uint64_t shared_bitmap_size;
>> +    const int block_size = qemu_real_host_page_size();
>> +    int ret;
>> +
>> +    shared_bitmap_size = ROUND_UP(mr->size, block_size) / block_size;
>> +
>> +    attr->mr = mr;
>> +    ret = memory_region_set_generic_state_manager(mr,
>> +                                                  GENERIC_STATE_MANAGER(attr));
>> +    if (ret) {
>> +        return ret;
>> +    }
>> +    attr->shared_bitmap_size = shared_bitmap_size;
>> +    attr->shared_bitmap = bitmap_new(shared_bitmap_size);
>
> Above introduces a bitmap to track the private/shared state of each 4KB
> page. While functional, for large RAM blocks managed by guest_memfd,
> this could lead to significant memory consumption.
>
> Have you considered an alternative like a Maple Tree or a generic
> interval tree? Both are often more memory-efficient for tracking ranges
> of contiguous states.
Maybe not necessary. The memory overhead is 1 bit per 4 KiB page
(1/(4096*8) ≈ 0.003% of the tracked memory), so I think it is not too
much.
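
For concreteness, here is a standalone sketch of that arithmetic. The
4 KiB page size and the 1 TiB guest size are example assumptions, not
values taken from the patch:

#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

int main(void)
{
    const uint64_t page_size = 4096;        /* assumed host page size */
    const uint64_t ram_size = 1ULL << 40;   /* example: 1 TiB of guest RAM */
    uint64_t nbits = ram_size / page_size;  /* one bit per tracked page */
    uint64_t bytes = nbits / 8;             /* bitmap footprint in bytes */

    /* Prints 33554432, i.e. 32 MiB -- about 0.003% of the tracked RAM. */
    printf("bitmap size: %" PRIu64 " bytes\n", bytes);
    return 0;
}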
>
>> +
>> +    return ret;
>> +}
>> +
>> +void ram_block_attribute_unrealize(RamBlockAttribute *attr)
>> +{
>> +    g_free(attr->shared_bitmap);
>> +    memory_region_set_generic_state_manager(attr->mr, NULL);
>> +}
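
As an aside, to illustrate how the shared_bitmap allocated above is
meant to be consumed at host-page granularity, here is a minimal sketch
of an update helper built on QEMU's bitmap API. The function name and
its exact shape are hypothetical, not part of this patch:

/* Hypothetical helper, assuming the surrounding file includes
 * "qemu/osdep.h" and "qemu/bitmap.h": mark [offset, offset + size)
 * as shared or private at host-page granularity. */
static void ram_block_attribute_update(RamBlockAttribute *attr,
                                       uint64_t offset, uint64_t size,
                                       bool shared)
{
    const int block_size = qemu_real_host_page_size();
    uint64_t first_bit = offset / block_size;
    uint64_t nbits = size / block_size;

    /* Conversion requests are expected at block_size granularity. */
    assert(QEMU_IS_ALIGNED(offset, block_size));
    assert(QEMU_IS_ALIGNED(size, block_size));

    if (shared) {
        bitmap_set(attr->shared_bitmap, first_bit, nbits);
    } else {
        bitmap_clear(attr->shared_bitmap, first_bit, nbits);
    }
}

A page-conversion notifier would then use the same first_bit/nbits
range when asking VFIO to unmap and remap the affected IOVAs.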
>
> Thanks,
> baolu