qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v6 5/5] hw/pci: ensure PCIE devices are plugged into only slo


From: Ani Sinha
Subject: Re: [PATCH v6 5/5] hw/pci: ensure PCIE devices are plugged into only slot 0 of PCIE port
Date: Fri, 30 Jun 2023 13:11:33 +0530


> On 30-Jun-2023, at 8:13 AM, Akihiko Odaki <akihiko.odaki@daynix.com> wrote:
> 
> On 2023/06/29 23:18, Ani Sinha wrote:
>>> On 29-Jun-2023, at 2:19 PM, Akihiko Odaki <akihiko.odaki@daynix.com> wrote:
>>> 
>>> On 2023/06/29 17:05, Ani Sinha wrote:
>>>> On Thu, 29 Jun, 2023, 12:17 pm Akihiko Odaki, <akihiko.odaki@daynix.com 
>>>> <mailto:akihiko.odaki@daynix.com>> wrote:
>>>>    On 2023/06/29 13:07, Ani Sinha wrote:
>>>>     > PCI Express ports only have one slot, so PCI Express devices can
>>>>    only be
>>>>     > plugged into slot 0 on a PCIE port. Enforce it.
>>>>     >
>>>>     > The change has been tested to not break ARI by instantiating
>>>>    seven vfs on an
>>>>     > emulated igb device (the maximum number of vfs the linux igb
>>>>    driver supports).
>>>>     > The vfs are seen to have non-zero device/slot numbers in the
>>>>    conventional
>>>>     > PCI BDF representation.
>>>>     >
>>>>     > CC: jusual@redhat.com <mailto:jusual@redhat.com>
>>>>     > CC: imammedo@redhat.com <mailto:imammedo@redhat.com>
>>>>     > CC: akihiko.odaki@daynix.com <mailto:akihiko.odaki@daynix.com>
>>>>     >
>>>>     > Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2128929
>>>>    <https://bugzilla.redhat.com/show_bug.cgi?id=2128929>
>>>>     > Signed-off-by: Ani Sinha <anisinha@redhat.com
>>>>    <mailto:anisinha@redhat.com>>
>>>>     > Reviewed-by: Julia Suvorova <jusual@redhat.com
>>>>    <mailto:jusual@redhat.com>>
>>>>     > ---
>>>>     >   hw/pci/pci.c | 15 +++++++++++++++
>>>>     >   1 file changed, 15 insertions(+)
>>>>     >
>>>>     > diff --git a/hw/pci/pci.c b/hw/pci/pci.c
>>>>     > index e2eb4c3b4a..0320ac2bb3 100644
>>>>     > --- a/hw/pci/pci.c
>>>>     > +++ b/hw/pci/pci.c
>>>>     > @@ -65,6 +65,7 @@ bool pci_available = true;
>>>>     >   static char *pcibus_get_dev_path(DeviceState *dev);
>>>>     >   static char *pcibus_get_fw_dev_path(DeviceState *dev);
>>>>     >   static void pcibus_reset(BusState *qbus);
>>>>     > +static bool pcie_has_upstream_port(PCIDevice *dev);
>>>>     >
>>>>     >   static Property pci_props[] = {
>>>>     >       DEFINE_PROP_PCI_DEVFN("addr", PCIDevice, devfn, -1),
>>>>     > @@ -1190,6 +1191,20 @@ static PCIDevice
>>>>    *do_pci_register_device(PCIDevice *pci_dev,
>>>>     >                      name);
>>>>     >
>>>>     >          return NULL;
>>>>     > +    } /*
>>>>     > +       * With SRIOV and ARI, vfs can have non-zero slot in the
>>>>    conventional
>>>>     > +       * PCI interpretation as all five bits reserved for slot
>>>>    addresses are
>>>>     > +       * also used for function bits for the various vfs. Ignore
>>>>    that case.
>>>>     > +       * It is too early here to check for ARI capabilities in
>>>>    the PCI config
>>>>     > +       * space. Hence, we check for a vf device instead.
>>>>     > +       */
>>>>    Why don't just perform this check after the capabilities are set?
>>>> We don't want to allocate resources for wrong device parameters. We want 
>>>> to error out early. Other checks also are performed at the same place .
>>> 
>>> It is indeed better to raise an error as early as possible so that we can 
>>> avoid allocation and other operations that will be reverted and may go 
>>> wrong due to the invalid condition. That should be the reason why other 
>>> checks for the address are performed here.
>>> 
>>> However, in this particular case, we cannot confidently perform the check 
>>> here because it is unknown if the ARI capability will be advertised until 
>>> the device realization code runs. This can justify delaying the check after 
>>> the device realization, unlike the other checks.
>> Ok so are you proposing that the check we have right before (the check for 
>> unoccupied function 0) be also moved? It also uses the same vf approximation 
>> for seemingly to support ARI.
> 
> No, I don't think the check for function 0 is required to be disabled because 
> of the change of addressing caused by ARI, but it is required because SR-IOV 
> VF can be added and removed while the PF (function 0) remains. I think this 
> check should be performed also when SR-IOV is disabled and ARI is enabled.

OK I am a little confused with your explanation here. I understand the non-ARI 
case - to detect the PF we need to have function 0 available. Looking at the 
comment in pci_init_multifunction() it seems in ARI case, we can simply have a 
vf in function 0 (conventional interpretation) as well. Hence we need to ignore 
vfs both in ARI and non-ARI cases. This is what I understood.

> 
> Thus the check for unoccupied function 0 needs to use pci_is_vf() instead of 
> checking ARI capability, and that can happen in do_pci_register_device().
> 
>> Also where do you propose we move the check?
> 
> In pci_qdev_realize(), somewhere after pc->realize() and before option ROM 
> loading.

Hmm, I tried this. The issue here is something like this would be now allowed 
since the PF has ARI capability:

-device pcie-root-port,id=p -device igb,bus=p,addr=0x2.0x0

The above should not be allowed and when used, we do not see the igb ethernet 
device from the guest OS.

> See the check for failover pair as an example. I guess it's also placed in 
> this region because it needs capability information.
> 
>>> 
>>>> Show quoted text
>>>>    Regards,
>>>>    Akihiko Odaki
>>>>     > +    else if (!pci_is_vf(pci_dev) &&
>>>>     > +             pcie_has_upstream_port(pci_dev) &&
>>>>     > +             PCI_SLOT(devfn)) {
>>>>     > +        error_setg(errp, "PCI: slot %d is not valid for %s,"
>>>>     > +                   " parent device only allows plugging into
>>>>    slot 0.",
>>>>     > +                   PCI_SLOT(devfn), name);
>>>>     > +        return NULL;
>>>>     >       }
>>>>     >
>>>>     >       pci_dev->devfn = devfn;




reply via email to

[Prev in Thread] Current Thread [Next in Thread]