qemu-devel
Re: [PATCH v1] vfio/common: Separate vfio-pci ranges


From: Joao Martins
Subject: Re: [PATCH v1] vfio/common: Separate vfio-pci ranges
Date: Mon, 11 Sep 2023 11:12:55 +0100

On 11/09/2023 10:48, Duan, Zhenzhong wrote:
>> -----Original Message-----
>> From: Joao Martins <joao.m.martins@oracle.com>
>> Sent: Monday, September 11, 2023 5:07 PM
>> Subject: Re: [PATCH v1] vfio/common: Separate vfio-pci ranges
>>
>> On 11/09/2023 09:57, Duan, Zhenzhong wrote:
>>>> -----Original Message-----
>>>> From: qemu-devel-bounces+zhenzhong.duan=intel.com@nongnu.org <qemu-
>>>> devel-bounces+zhenzhong.duan=intel.com@nongnu.org> On Behalf Of Joao
>>>> Martins
>>>> Sent: Friday, September 8, 2023 5:30 PM
>>>> Subject: [PATCH v1] vfio/common: Separate vfio-pci ranges
>>>>
>>>> QEMU computes the DMA logging ranges for two predefined ranges: 32-bit
>>>> and 64-bit. In the OVMF case, when the dynamic MMIO window is enabled,
>>>> QEMU includes in the 64-bit range both the RAM regions at the lower part
>>>> of the address space and the vfio-pci device RAM regions at the top of
>>>> it. This range contains a large gap, and its size can exceed the dirty
>>>> tracking HW limits of some devices (MLX5 has a 2^42 limit).
>>>>
>>>> To avoid such large ranges, introduce a new PCI range covering the
>>>> vfio-pci device RAM regions, but only when the addresses are above 4GB,
>>>> to avoid breaking potential SeaBIOS guests.
>>>>
>>>> Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
>>>> [ clg: - wrote commit log
>>>>       - fixed overlapping 32-bit and PCI ranges when using SeaBIOS ]
>>>> Signed-off-by: Cédric Le Goater <clg@redhat.com>
>>>> ---
>>>> v2:
>>>> * s/minpci/minpci64/
>>>> * s/maxpci/maxpci64/
>>>> * Expand comment to cover the pci-hole64 and why we don't do special
>>>>  handling of pci-hole32.
>>>> ---
>>>> hw/vfio/common.c     | 71 +++++++++++++++++++++++++++++++++++++-------
>>>> hw/vfio/trace-events |  2 +-
>>>> 2 files changed, 61 insertions(+), 12 deletions(-)
>>>>
>>>> diff --git a/hw/vfio/common.c b/hw/vfio/common.c
>>>> index 237101d03844..134649226d43 100644
>>>> --- a/hw/vfio/common.c
>>>> +++ b/hw/vfio/common.c
>>>> @@ -27,6 +27,7 @@
>>>>
>>>> #include "hw/vfio/vfio-common.h"
>>>> #include "hw/vfio/vfio.h"
>>>> +#include "hw/vfio/pci.h"
>>>> #include "exec/address-spaces.h"
>>>> #include "exec/memory.h"
>>>> #include "exec/ram_addr.h"
>>>> @@ -1400,6 +1401,8 @@ typedef struct VFIODirtyRanges {
>>>>     hwaddr max32;
>>>>     hwaddr min64;
>>>>     hwaddr max64;
>>>> +    hwaddr minpci64;
>>>> +    hwaddr maxpci64;
>>>> } VFIODirtyRanges;
>>>>
>>>> typedef struct VFIODirtyRangesListener {
>>>> @@ -1408,6 +1411,31 @@ typedef struct VFIODirtyRangesListener {
>>>>     MemoryListener listener;
>>>> } VFIODirtyRangesListener;
>>>>
>>>> +static bool vfio_section_is_vfio_pci(MemoryRegionSection *section,
>>>> +                                     VFIOContainer *container)
>>>> +{
>>>> +    VFIOPCIDevice *pcidev;
>>>> +    VFIODevice *vbasedev;
>>>> +    VFIOGroup *group;
>>>> +    Object *owner;
>>>> +
>>>> +    owner = memory_region_owner(section->mr);
>>>> +
>>>> +    QLIST_FOREACH(group, &container->group_list, container_next) {
>>>> +        QLIST_FOREACH(vbasedev, &group->device_list, next) {
>>>> +            if (vbasedev->type != VFIO_DEVICE_TYPE_PCI) {
>>>> +                continue;
>>>> +            }
>>>> +            pcidev = container_of(vbasedev, VFIOPCIDevice, vbasedev);
>>>> +            if (OBJECT(pcidev) == owner) {
>>>> +                return true;
>>>> +            }
>>>> +        }
>>>> +    }
>>>> +
>>>> +    return false;
>>>> +}
>>>
>>> What about simplifying it with memory_region_is_ram_device()?
>>> That way vdpa devices could also be included.
>>>
>>
>> Note that the check is not interested in RAM (or any other kind of memory,
>> like VGA); that's covered by the 32-bit and 64-bit ranges. It is interested
>> in any PCI device RAM that would fall in the 64-bit PCI hole. Would
>> memory_region_is_ram_device() really cover that? If so, I am all for the
>> simplification.
> 
> A ram_device region is used not only by vfio-pci BARs but also by the host
> notifiers of vdpa and vhost-user.
> 
> I have another question: I previously thought vfio-pci BARs were device state,
> saved/restored through the VFIO migration protocol, so we would not need to
> dirty-track them. Do I understand that wrong?

The general thinking with device dirty tracking is to track all addressable IOVAs.
But you do raise a good question. My understanding is that migrating the BARs
*as is* might be device-migration specific (not a guarantee?); the save file and
precopy interface are the only places we transfer the data from/to, and it's just
opaque data, not BARs or anything formatted specifically -- so if we migrate
BARs, it is hidden in whatever the device f/w wants to move. It might be that
BARs aren't even needed, as they are a sort of scratch space from the h/w side.
Ultimately, the dirty tracker is the one reporting the values, and if the device
h/w chooses not to report those IOVAs as dirty then nothing changes.
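
For reference, a minimal sketch of the simplification discussed above, assuming
a ram_device check is an acceptable substitute for matching the section owner
against the container's vfio-pci devices; the helper name below is illustrative
and not part of the patch:

/*
 * Illustrative sketch only (not part of the patch): classify the section
 * by its memory region type instead of walking the container's group and
 * device lists.  memory_region_is_ram_device() matches regions created
 * with memory_region_init_ram_device_ptr(), which covers mmap'd vfio-pci
 * BARs but, as noted above, also other device RAM such as the vdpa and
 * vhost-user host notifiers.
 */
static bool vfio_section_is_device_ram(MemoryRegionSection *section)
{
    return memory_region_is_ram_device(section->mr);
}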


