Re: [Qemu-ppc] Proposal PCI/PCIe device placement on PAPR guests
From: Alexey Kardashevskiy
Subject: Re: [Qemu-ppc] Proposal PCI/PCIe device placement on PAPR guests
Date: Thu, 12 Jan 2017 17:19:40 +1100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.4.0

On 12/01/17 14:52, David Gibson wrote:
> On Fri, Jan 06, 2017 at 12:57:58PM +0100, Greg Kurz wrote:
>> On Thu, 5 Jan 2017 16:46:18 +1100
>> David Gibson <address@hidden> wrote:
>>
>>> There was a discussion back in November on the qemu list which spilled
>>> onto the libvirt list about how to add support for PCIe devices to
>>> POWER VMs, specifically 'pseries' machine type PAPR guests.
>>>
>>> Here's a more concrete proposal for how to handle part of this in
>>> future from the libvirt side. Strictly speaking what I'm suggesting
>>> here isn't intrinsically linked to PCIe: it will make adding PCIe
>>> support sanely easier, as well as having a number of advantages for
>>> both PCIe and plain-PCI devices on PAPR guests.
>>>
>>> Background:
>>>
>>> * Currently the pseries machine type only supports vanilla PCI
>>> buses.
>>> * This is a qemu limitation, not something inherent - PAPR guests
>>> running under PowerVM (the IBM hypervisor) can use passthrough
>>> PCIe devices (PowerVM doesn't emulate devices though).
>>> * In fact the way PCI access is para-virtualized in PAPR makes the
>>> usual distinctions between PCI and PCIe largely disappear
>>> * Presentation of PCIe devices to PAPR guests is unusual
>>> * Unlike x86 and other "bare metal" platforms, root ports are
>>> not made visible to the guest. i.e. all devices (typically)
>>> appear as though they were integrated devices on x86
>>> * In terms of topology all devices will appear in a way similar to
>>> a vanilla PCI bus, even PCIe devices
>>> * However PCIe extended config space is accessible
>>> * This means libvirt's usual placement of PCIe devices is not
>>> suitable for PAPR guests
>>> * PAPR has its own hotplug mechanism
>>> * This is used instead of standard PCIe hotplug
>>> * This mechanism works for both PCIe and vanilla-PCI devices
>>> * This can hotplug/unplug devices even without a root port P2P
>>> bridge between it and the root "bus"
>>> * Multiple independent host bridges are routine on PAPR
>>> * Unlike PC (where all host bridges have multiplexed access to
>>> configuration space) PCI host bridges (PHBs) are truly
>>> independent for PAPR guests (disjoint MMIO regions in system
>>> address space)
>>> * PowerVM typically presents a separate PHB to the guest for each
>>> host slot passed through
>>>
>>> The Proposal:
>>>
>>> I suggest that libvirt implement a new default algorithm for placing
>>> (i.e. assigning addresses to) both PCI and PCIe devices for (only)
>>> PAPR guests.
>>>
>>> The short summary is that by default it should assign each device to a
>>> separate vPHB, creating vPHBs as necessary.
>>>
>>> * For passthrough sometimes a group of host devices can't be safely
>>> isolated from each other - this is known as a (host) Partitionable
>>> Endpoint (PE). In this case, if any device in the PE is passed
>>> through to a guest, the whole PE must be passed through to the
>>> same vPHB in the guest. From the guest POV, each vPHB has exactly
>>> one (guest) PE.
>>> * To allow for hotplugged devices, libvirt should also add a number
>>> of additional, empty vPHBs (the PAPR spec allows for hotplug of
>>> PHBs, but this is not yet implemented in qemu). When hotplugging
>>> a new device (or PE) libvirt should locate a vPHB which doesn't
>>> currently contain anything.
>>> * libvirt should only (automatically) add PHBs - never root ports or
>>> other PCI to PCI bridges
>>>
>>> In order to handle migration, the vPHBs will need to be represented in
>>> the domain XML, which will also allow the user to override this
>>> topology if they want.
>>>
>>> Advantages:
>>>
>>> There are still some details I need to figure out w.r.t. handling PCIe
>>> devices (on both the qemu and libvirt sides). However the fact that
>>
>> One such detail may be that PCIe devices should have the
>> "ibm,pci-config-space-type" property set to 1 in the DT,
>> for the driver to be able to access the extended config
>> space.
>
> So, we have a bit of an oddity here. It looks like we currently set
> 'ibm,pci-config-space-type' to 1 in the PHB, rather than individual
> device nodes. Which, AFAICT, is simply incorrect in terms of PAPR.

I asked Paul how to read the spec. Setting type=1 on the PHB is correct but
not sufficient: it means that extended access requests can pass through the
PHB, but the devices and bridges underneath it still need type=1 themselves
if they support extended config space. Setting type to 0 (or omitting the
property altogether) on a PHB means that extended config space is not
available on anything under that PHB.
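
To make that reading concrete, here is a minimal sketch of the rule in
Python. It is only an illustration, not qemu or guest kernel code: the Node
class and both function names are hypothetical, and it assumes a node's
ancestry can be walked up to its PHB. It checks only the PHB and the node
itself, since that is all the rule above states; whether intermediate
bridges also gate access is left out as an open assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """Hypothetical stand-in for a device tree node (PHB, bridge or device)."""
    name: str
    # Value of "ibm,pci-config-space-type"; None means the property is absent.
    config_space_type: Optional[int] = None
    parent: Optional["Node"] = None


def find_phb(node: Node) -> Node:
    """Walk up the parent links; the topmost node is taken to be the PHB."""
    while node.parent is not None:
        node = node.parent
    return node


def extended_config_available(node: Node) -> bool:
    """Apply the rule as read above: PHB type=1 AND the node's own type=1.

    Assumption: intermediate bridges are not consulted here.
    """
    phb = find_phb(node)
    return phb.config_space_type == 1 and node.config_space_type == 1


phb = Node("phb", config_space_type=1)
pcie_dev = Node("pcie-dev", config_space_type=1, parent=phb)
pci_dev = Node("pci-dev", parent=phb)  # property absent on the device

print(extended_config_available(pcie_dev))
print(extended_config_available(pci_dev))
```

With the PHB marked type=1, only the device that also carries type=1 is
reported as having extended config space; clearing the property on the PHB
would turn it off for everything below.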
--
Alexey