[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [qemu-s390x] [PATCH v1 2/2] s390x/pci: Fix hotplugging of PCI bridge
From: |
David Hildenbrand |
Subject: |
Re: [qemu-s390x] [PATCH v1 2/2] s390x/pci: Fix hotplugging of PCI bridges |
Date: |
Wed, 23 Jan 2019 15:26:04 +0100 |
User-agent: |
Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.3.1 |
On 23.01.19 14:42, Thomas Huth wrote:
> On 2019-01-23 12:42, David Hildenbrand wrote:
>> On 23.01.19 12:16, Thomas Huth wrote:
>>> On 2019-01-22 13:51, David Hildenbrand wrote:
>>>> When hotplugging a PCI bridge right now to the root port, we resolve
>>>> pci_get_bus(pdev)->parent_dev, which results in a SEGFAULT. Hotplugging
>>>> really only works right now when hotplugging to another bridge.
>>>>
>>>> Instead, we have to properly check if we are already at the root.
>>>>
>>>> Let's cleanup the code while at it a bit and factor out updating the
>>>> subordiante bus number into a separate function. The check for
>>>> "old_nr < nr" is right now not strictly necessary, but makes it more
>>>> obvious what is actually going on.
>>>>
>>>> Signed-off-by: David Hildenbrand <address@hidden>
>>>> ---
>>>> hw/s390x/s390-pci-bus.c | 28 +++++++++++++++++++---------
>>>> 1 file changed, 19 insertions(+), 9 deletions(-)
>>>>
>>>> diff --git a/hw/s390x/s390-pci-bus.c b/hw/s390x/s390-pci-bus.c
>>>> index b7c4613fde..9b5c5fff60 100644
>>>> --- a/hw/s390x/s390-pci-bus.c
>>>> +++ b/hw/s390x/s390-pci-bus.c
>>>> @@ -843,6 +843,21 @@ static void s390_pcihost_pre_plug(HotplugHandler
>>>> *hotplug_dev, DeviceState *dev,
>>>> }
>>>> }
>>>>
>>>> +static void s390_pci_update_subordinate(PCIDevice *dev, uint32_t nr)
>>>> +{
>>>> + uint32_t old_nr;
>>>> +
>>>> + pci_default_write_config(dev, PCI_SUBORDINATE_BUS, nr, 1);
>>>> + while (!pci_bus_is_root(pci_get_bus(dev))) {
>>>> + dev = pci_get_bus(dev)->parent_dev;
>>>> +
>>>> + old_nr = pci_default_read_config(dev, PCI_SUBORDINATE_BUS, 1);
>>>> + if (old_nr < nr) {
>>>> + pci_default_write_config(dev, PCI_SUBORDINATE_BUS, nr, 1);
>>>> + }
>>>> + }
>>>> +}
>>>> +
>>>> static void s390_pcihost_plug(HotplugHandler *hotplug_dev, DeviceState
>>>> *dev,
>>>> Error **errp)
>>>> {
>>>> @@ -851,26 +866,21 @@ static void s390_pcihost_plug(HotplugHandler
>>>> *hotplug_dev, DeviceState *dev,
>>>> S390PCIBusDevice *pbdev = NULL;
>>>>
>>>> if (object_dynamic_cast(OBJECT(dev), TYPE_PCI_BRIDGE)) {
>>>> - BusState *bus;
>>>> PCIBridge *pb = PCI_BRIDGE(dev);
>>>> - PCIDevice *pdev = PCI_DEVICE(dev);
>>>>
>>>> + pdev = PCI_DEVICE(dev);
>>>
>>> Why not keep the direct assignment?
>>
>> We have the same local variable in that function already. I'm just
>> reusing it.
>
> D'oh, right. Please disregard my comment.
>
>>>
>>>> pci_bridge_map_irq(pb, dev->id, s390_pci_map_irq);
>>>> pci_setup_iommu(&pb->sec_bus, s390_pci_dma_iommu, s);
>>>>
>>>> - bus = BUS(&pb->sec_bus);
>>>> - qbus_set_hotplug_handler(bus, DEVICE(s), errp);
>>>> + qbus_set_hotplug_handler(BUS(&pb->sec_bus), DEVICE(s), errp);
>>>>
>>>> if (dev->hotplugged) {
>>>> pci_default_write_config(pdev, PCI_PRIMARY_BUS,
>>>> pci_dev_bus_num(pdev), 1);
>>>> s->bus_no += 1;
>>>> pci_default_write_config(pdev, PCI_SECONDARY_BUS, s->bus_no,
>>>> 1);
>>>> - do {
>>>> - pdev = pci_get_bus(pdev)->parent_dev;
>>>> - pci_default_write_config(pdev, PCI_SUBORDINATE_BUS,
>>>> - s->bus_no, 1);
>>>> - } while (pci_get_bus(pdev) && pci_dev_bus_num(pdev));
>>>> +
>>>> + s390_pci_update_subordinate(pdev, s->bus_no);
>>>
>>>
>>> While your changes look OK at a first glance, and certainly fix the
>>> crash, I wonder whether this is the right thing to do here at all. Why
>>> does the hypervisor set up the all the bridge registers here? I'd rather
>>> expect that this is the job of the guest after it detects a hot-plugged
>>> device?
>>
>> I honestly have no idea who is responsible for that. I can see that
>> spapr does not seem to handle hotplugging the way s390x does. I was
>> hoping that Colin maybe know why we have to do that on s390x this way.
>> At least this patch does not change the behavior but only fixes it.
>
> Right, you're patch certainly fixes a crash, and does not make it any
> worse than it was before, so:
>
> Reviewed-by: Thomas Huth <address@hidden>
>
>>>
>>> Also I'm not sure whether the numbering is right in every case. Let's
>>> use the example from the URL that you've mentioned in the cover letter
>>> (http://www.science.unitn.it/~fiorella/guidelinux/tlk/node76.html step 4):
>>>
>>> Assume that you hot-plug another bridge "5" to "Bridge 2". If I get the
>>> code right, the setup will now look like this:
>>
>> Yes, this is what I would expect! This implies that any bus in the
>> hierarchy lies between Sec and Sub. But that there might be multiple
>> candidates, which feels wrong as you correctly say.
>>
>>>
>>> CPU
>>> |
>>> --------------+-------------- Bus 0
>>> |
>>> | Pri=0
>>> BR1 Sec=1
>>> | Sub=5
>>> |
>>> ------+-------+----------+--- Bus 1
>>> | |
>>> | Pri=1 | Pri=1
>>> BR3 Sec=3 BR2 Sec=2
>>> | Sub=4 | Sub=5
>>> | |
>>> --+---+--- Bus 3 --+-+- Bus 2
>>> | |
>>> | Pri=3 | Pri=2
>>> BR4 Sec=4 BR5 Sec=5
>>> | Sub=4 | Sub=5
>>> | |
>>> --+---- Bus 4 --+-- Bus 5
>>>
>>> Requests to bus 3 and 4 are now also routed through to bus 2 and 5. That
>>> sounds wrong to me. Or is this how hot-plugging of PCI bridges is
>>> supposed to work??
>>>
>>
>> Learning more about PCI
>>
>> Quoting from [2]
>>
>> "PCIe enumeration is generally done twice. First, your BIOS (UEFI or
>> otherwise) will do it, to figure out who's present and how much memory
>> they need. This data can then be passed on to the host OS who can take
>> it as-is, but Linux and Windows often perform their own enumeration
>> procedure ..."
>>
>> "The custom software part comes in during this enumeration process and
>> that is you must reserve ahead of time PCI Bus numbers, and memory
>> segments for potential future devices -- this is sometimes called 'bus
>> padding'. This avoids the need to re-enumerate the bus in the future
>> which can often not be done without disruption to the system ..."
>>
>> So if I understand correctly, the BIOS/Firmware/QEMU only provides the
>> initial state. After that, the guest is responsible to manage it,
>> especially to manage hotplug.
>>
>> However, I wonder if it would also be up to BIOS/Firmware/QEMU to do
>> this bus padding or if it is completely on the OS side. And if
>> hotplugged devices simply have all 0's.
>
> Anyway, the current behavior sounds pretty wrong to me. I am sure that
> we can not simply traverse the bridge hierarchy up to the top and change
> the subordinate bus setting everywhere without informing the guest about
> that change. What if the guest OS has cached the values from the
> bridges? It then might use the wrong value for future calculations...
Yes, it also sounds wrong to me. Maybe this is what Firmware manages on
s390x - maybe they reserve bus numbers. Or maybe hotplug has to really
be handled by the guests. Only IBM people can clarify that.
Thanks!
--
Thanks,
David / dhildenb
Re: [qemu-s390x] [PATCH v1 0/2] s390x/pci: PCI bridge plugging fixes, Cornelia Huck, 2019/01/22
Re: [qemu-s390x] [PATCH v1 0/2] s390x/pci: PCI bridge plugging fixes, Cornelia Huck, 2019/01/28