Re: [PATCH v5 03/18] pci: isolated address space for PCI bus


From: Jag Raman
Subject: Re: [PATCH v5 03/18] pci: isolated address space for PCI bus
Date: Wed, 26 Jan 2022 05:27:32 +0000


> On Jan 25, 2022, at 1:38 PM, Dr. David Alan Gilbert <dgilbert@redhat.com> wrote:
> 
> * Jag Raman (jag.raman@oracle.com) wrote:
>> 
>> 
>>> On Jan 19, 2022, at 7:12 PM, Michael S. Tsirkin <mst@redhat.com> wrote:
>>> 
>>> On Wed, Jan 19, 2022 at 04:41:52PM -0500, Jagannathan Raman wrote:
>>>> Allow PCI buses to be part of isolated CPU address spaces. This has a
>>>> niche usage.
>>>> 
>>>> TYPE_REMOTE_MACHINE allows multiple VMs to house their PCI devices in
>>>> the same machine/server. This would cause address space collision as
>>>> well as be a security vulnerability. Having separate address spaces for
>>>> each PCI bus would solve this problem.
>>> 
>>> Fascinating, but I am not sure I understand. Any examples?
>> 
>> Hi Michael!
>> 
>> Multiprocess QEMU and vfio-user implement a client-server model to allow
>> out-of-process emulation of devices. The client QEMU, which makes ioctls
>> to the kernel and runs VCPUs, could attach devices running in a server
>> QEMU. The server QEMU needs access to parts of the client’s RAM to
>> perform DMA.
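
(To make that flow concrete: below is a minimal sketch of how the server side
could wrap a chunk of client RAM received over a file descriptor so that the
emulated device can DMA into it. The helper name remote_map_client_ram() and
its fd/offset parameters are illustrative assumptions, not actual vfio-user
code.)

    #include "qemu/osdep.h"
    #include "exec/memory.h"
    #include <sys/mman.h>

    /*
     * Hypothetical helper: wrap a client-shared RAM fd in a MemoryRegion so
     * the server-side device model can perform DMA into client memory.
     */
    static MemoryRegion *remote_map_client_ram(Object *owner, int fd,
                                               uint64_t size, off_t offset)
    {
        void *ptr = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED,
                         fd, offset);
        MemoryRegion *mr;

        if (ptr == MAP_FAILED) {
            return NULL;
        }

        mr = g_new0(MemoryRegion, 1);
        /* memory_region_init_ram_ptr() wraps existing host memory as RAM */
        memory_region_init_ram_ptr(mr, owner, "client-ram", size, ptr);
        return mr;
    }
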
> 
> Do you ever have the opposite problem? i.e. when an emulated PCI device
> exposes a chunk of RAM-like space (frame buffer, or maybe a mapped file)
> that the client can see.  What happens if two emulated devices need to
> access each other's emulated address space?

That’s an interesting question.

In this case, the kernel driver would map the destination device’s chunk of
internal RAM into the DMA space of the source device. The source device could
then write to that mapped address range, and the IOMMU should direct those
writes to the destination device.

I would like to take a closer look at the IOMMU implementation to see how to
achieve this, and get back to you. I believe the IOMMU would handle it. Could
you please point me to the IOMMU implementation you have in mind?
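
As a rough sketch of the kind of redirection I have in mind (dest_device_as
and DEST_WINDOW_BASE are placeholders for the destination device’s address
space and the window a kernel driver would program; this is not code from
this series):

    #include "qemu/osdep.h"
    #include "exec/memory.h"

    /* Placeholders for illustration only */
    static AddressSpace *dest_device_as;
    #define DEST_WINDOW_BASE 0x100000000ULL

    /*
     * Sketch of an IOMMU translate hook that redirects a 4K-aligned page of
     * the source device's DMA window into the destination device's RAM.
     */
    static IOMMUTLBEntry demo_iommu_translate(IOMMUMemoryRegion *iommu,
                                              hwaddr addr,
                                              IOMMUAccessFlags flag,
                                              int iommu_idx)
    {
        IOMMUTLBEntry entry = {
            .target_as = dest_device_as,
            .iova = addr & ~0xfffULL,
            .translated_addr = DEST_WINDOW_BASE + (addr & ~0xfffULL),
            .addr_mask = 0xfffULL,      /* 4K pages */
            .perm = IOMMU_RW,
        };

        return entry;
    }
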

Thank you!
--
Jag

> 
> Dave
> 
>> In the case where multiple clients attach devices that are running on the
>> same server, we need to ensure that each device has isolated memory
>> ranges. This ensures that the memory space of one device is not visible
>> to other devices on the same server.
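
(For illustration, a minimal sketch of what "isolated memory ranges" could
look like on the server side; remote_bus_init_isolated_as() is a hypothetical
helper, not part of this patch, though it uses the isol_as_mem field the
patch adds:)

    #include "qemu/osdep.h"
    #include "exec/memory.h"
    #include "hw/pci/pci_bus.h"

    /*
     * Hypothetical helper: give a remote PCI bus its own root MemoryRegion
     * and AddressSpace, so one client's memory is never visible to devices
     * owned by another client on the same server.
     */
    static void remote_bus_init_isolated_as(PCIBus *bus, Object *owner)
    {
        MemoryRegion *root = g_new0(MemoryRegion, 1);
        AddressSpace *as = g_new0(AddressSpace, 1);

        memory_region_init(root, owner, "isolated-bus-mem", UINT64_MAX);
        address_space_init(as, root, "isolated-bus-mem-as");

        bus->isol_as_mem = as;    /* field added by this patch */
    }
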
>> 
>>> 
>>> I also wonder whether this special type could be modelled like a special
>>> kind of iommu internally.
>> 
>> Could you please provide some more details on the design?
>> 
>>> 
>>>> Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com>
>>>> Signed-off-by: John G Johnson <john.g.johnson@oracle.com>
>>>> Signed-off-by: Jagannathan Raman <jag.raman@oracle.com>
>>>> ---
>>>> include/hw/pci/pci.h     |  2 ++
>>>> include/hw/pci/pci_bus.h | 17 +++++++++++++++++
>>>> hw/pci/pci.c             | 17 +++++++++++++++++
>>>> hw/pci/pci_bridge.c      |  5 +++++
>>>> 4 files changed, 41 insertions(+)
>>>> 
>>>> diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h
>>>> index 023abc0f79..9bb4472abc 100644
>>>> --- a/include/hw/pci/pci.h
>>>> +++ b/include/hw/pci/pci.h
>>>> @@ -387,6 +387,8 @@ void pci_device_save(PCIDevice *s, QEMUFile *f);
>>>> int pci_device_load(PCIDevice *s, QEMUFile *f);
>>>> MemoryRegion *pci_address_space(PCIDevice *dev);
>>>> MemoryRegion *pci_address_space_io(PCIDevice *dev);
>>>> +AddressSpace *pci_isol_as_mem(PCIDevice *dev);
>>>> +AddressSpace *pci_isol_as_io(PCIDevice *dev);
>>>> 
>>>> /*
>>>> * Should not normally be used by devices. For use by sPAPR target
>>>> diff --git a/include/hw/pci/pci_bus.h b/include/hw/pci/pci_bus.h
>>>> index 347440d42c..d78258e79e 100644
>>>> --- a/include/hw/pci/pci_bus.h
>>>> +++ b/include/hw/pci/pci_bus.h
>>>> @@ -39,9 +39,26 @@ struct PCIBus {
>>>>    void *irq_opaque;
>>>>    PCIDevice *devices[PCI_SLOT_MAX * PCI_FUNC_MAX];
>>>>    PCIDevice *parent_dev;
>>>> +
>>>>    MemoryRegion *address_space_mem;
>>>>    MemoryRegion *address_space_io;
>>>> 
>>>> +    /**
>>>> +     * Isolated address spaces - these allow the PCI bus to be part
>>>> +     * of an isolated address space as opposed to the global
>>>> +     * address_space_memory & address_space_io.
>>> 
>>> Are you sure address_space_memory & address_space_io are
>>> always global? even in the case of an iommu?
>> 
>> On the CPU side of the Root Complex, I believe address_space_memory
>> & address_space_io are global.
>> 
>> In the vfio-user case, devices on the same machine (TYPE_REMOTE_MACHINE)
>> could be attached to different client VMs. Each client would have its own
>> address space for its CPUs. With isolated address spaces, we ensure that the
>> devices see the address space of the CPUs they’re attached to.
>> 
>> Not sure if it’s OK to share weblinks on this mailing list, please let me
>> know if that’s not preferred. But I’m referring to the terminology used in
>> the following block diagram:
>> https://en.wikipedia.org/wiki/Root_complex#/media/File:Example_PCI_Express_Topology.svg
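
(To show what a device ends up seeing, a small sketch of a DMA write that
picks up the isolated address space through pci_device_iommu_address_space()
as modified by this patch; demo_dma_write() is just an example name:)

    #include "qemu/osdep.h"
    #include "hw/pci/pci.h"

    /*
     * Example: a device's DMA goes through pci_device_iommu_address_space(),
     * which (with this patch) falls back to the bus's isolated address space
     * instead of the global address_space_memory.
     */
    static MemTxResult demo_dma_write(PCIDevice *dev, hwaddr addr,
                                      const void *buf, hwaddr len)
    {
        AddressSpace *as = pci_device_iommu_address_space(dev);

        return address_space_write(as, addr, MEMTXATTRS_UNSPECIFIED, buf, len);
    }
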
>> 
>>> 
>>>> This allows the
>>>> +     * bus to be attached to CPUs from different machines. The
>>>> +     * following is not used commonly.
>>>> +     *
>>>> +     * TYPE_REMOTE_MACHINE allows emulating devices from multiple
>>>> +     * VM clients,
>>> 
>>> what are VM clients?
>> 
>> It’s the client in the client-server model explained above.
>> 
>> Thank you!
>> --
>> Jag
>> 
>>> 
>>>> as such it needs the PCI buses in the same machine
>>>> +     * to be part of different CPU address spaces. The following is
>>>> +     * useful in that scenario.
>>>> +     *
>>>> +     */
>>>> +    AddressSpace *isol_as_mem;
>>>> +    AddressSpace *isol_as_io;
>>>> +
>>>>    QLIST_HEAD(, PCIBus) child; /* this will be replaced by qdev later */
>>>>    QLIST_ENTRY(PCIBus) sibling;/* this will be replaced by qdev later */
>>>> 
>>>> diff --git a/hw/pci/pci.c b/hw/pci/pci.c
>>>> index 5d30f9ca60..d5f1c6c421 100644
>>>> --- a/hw/pci/pci.c
>>>> +++ b/hw/pci/pci.c
>>>> @@ -442,6 +442,8 @@ static void pci_root_bus_internal_init(PCIBus *bus, DeviceState *parent,
>>>>    bus->slot_reserved_mask = 0x0;
>>>>    bus->address_space_mem = address_space_mem;
>>>>    bus->address_space_io = address_space_io;
>>>> +    bus->isol_as_mem = NULL;
>>>> +    bus->isol_as_io = NULL;
>>>>    bus->flags |= PCI_BUS_IS_ROOT;
>>>> 
>>>>    /* host bridge */
>>>> @@ -2676,6 +2678,16 @@ MemoryRegion *pci_address_space_io(PCIDevice *dev)
>>>>    return pci_get_bus(dev)->address_space_io;
>>>> }
>>>> 
>>>> +AddressSpace *pci_isol_as_mem(PCIDevice *dev)
>>>> +{
>>>> +    return pci_get_bus(dev)->isol_as_mem;
>>>> +}
>>>> +
>>>> +AddressSpace *pci_isol_as_io(PCIDevice *dev)
>>>> +{
>>>> +    return pci_get_bus(dev)->isol_as_io;
>>>> +}
>>>> +
>>>> static void pci_device_class_init(ObjectClass *klass, void *data)
>>>> {
>>>>    DeviceClass *k = DEVICE_CLASS(klass);
>>>> @@ -2699,6 +2711,7 @@ static void pci_device_class_base_init(ObjectClass *klass, void *data)
>>>> 
>>>> AddressSpace *pci_device_iommu_address_space(PCIDevice *dev)
>>>> {
>>>> +    AddressSpace *iommu_as = NULL;
>>>>    PCIBus *bus = pci_get_bus(dev);
>>>>    PCIBus *iommu_bus = bus;
>>>>    uint8_t devfn = dev->devfn;
>>>> @@ -2745,6 +2758,10 @@ AddressSpace *pci_device_iommu_address_space(PCIDevice *dev)
>>>>    if (!pci_bus_bypass_iommu(bus) && iommu_bus && iommu_bus->iommu_fn) {
>>>>        return iommu_bus->iommu_fn(bus, iommu_bus->iommu_opaque, devfn);
>>>>    }
>>>> +    iommu_as = pci_isol_as_mem(dev);
>>>> +    if (iommu_as) {
>>>> +        return iommu_as;
>>>> +    }
>>>>    return &address_space_memory;
>>>> }
>>>> 
>>>> diff --git a/hw/pci/pci_bridge.c b/hw/pci/pci_bridge.c
>>>> index da34c8ebcd..98366768d2 100644
>>>> --- a/hw/pci/pci_bridge.c
>>>> +++ b/hw/pci/pci_bridge.c
>>>> @@ -383,6 +383,11 @@ void pci_bridge_initfn(PCIDevice *dev, const char *typename)
>>>>    sec_bus->address_space_io = &br->address_space_io;
>>>>    memory_region_init(&br->address_space_io, OBJECT(br), "pci_bridge_io",
>>>>                       4 * GiB);
>>>> +
>>>> +    /* This PCI bridge puts the sec_bus in its parent's address space */
>>>> +    sec_bus->isol_as_mem = pci_isol_as_mem(dev);
>>>> +    sec_bus->isol_as_io = pci_isol_as_io(dev);
>>>> +
>>>>    br->windows = pci_bridge_region_init(br);
>>>>    QLIST_INIT(&sec_bus->child);
>>>>    QLIST_INSERT_HEAD(&parent->child, sec_bus, sibling);
>>>> -- 
>>>> 2.20.1
>> 
> -- 
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
> 

