qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v5 02/11] pci: add option for net failover


From: Alex Williamson
Subject: Re: [PATCH v5 02/11] pci: add option for net failover
Date: Thu, 24 Oct 2019 11:06:34 -0600

On Thu, 24 Oct 2019 16:34:01 +0000
Parav Pandit <address@hidden> wrote:

> > -----Original Message-----
> > From: Jens Freimann <address@hidden>
> > Sent: Thursday, October 24, 2019 4:38 AM
> > To: Parav Pandit <address@hidden>
> > Cc: address@hidden; address@hidden; address@hidden;
> > address@hidden; address@hidden; address@hidden;
> > address@hidden; address@hidden; address@hidden;
> > address@hidden
> > Subject: Re: [PATCH v5 02/11] pci: add option for net failover
> > 
> > On Thu, Oct 24, 2019 at 05:03:46AM +0000, Parav Pandit wrote:  
> > >> @@ -2101,6 +2104,20 @@ static void pci_qdev_realize(DeviceState
> > >> *qdev, Error **errp)
> > >>          }
> > >>      }
> > >>
> > >> +    if (pci_dev->net_failover_pair_id) {
> > >> +        if (!pci_is_express(pci_dev)) {  
> > >
> > >I am testing and integrating this piece with mlx5 devices.
> > >I see that pci_is_express() return true only for first PCI function.
> > >Didn't yet dig the API.
> > >Commenting out this check and below class check progresses further.  
> > 
> > First of all, thanks for testing this!
> > Could you share your commandline please? I can't reproduce it.  
> > >  
> I added debug prints to get the difference between VF1 and VF2 behavior.
> What I see is, vfio_populate_device() below code is activated for VF2 where 
> qemu claims that its not a PCIe device.
> 
>     vdev->config_size = reg_info->size;
>     if (vdev->config_size == PCI_CONFIG_SPACE_SIZE) {
>         vdev->pdev.cap_present &= ~QEMU_PCI_CAP_EXPRESS;
>         printf("%s clearing QEMU PCI bits\n", __func__);
>     }
> 
> Command line:
> /usr/local/bin/qemu-system-x86_64 -enable-kvm -m 3072 -smp 3 \
>                -machine q35,usb=off,vmport=off,dump-guest-core=off -cpu 
> Haswell-noTSX-IBRS \
>            -net none \
>                -qmp unix:/tmp/qmp.socket,server,nowait \
>         -monitor telnet:127.0.0.1:5556,server,nowait \
>         -device pcie-root-port,id=root0,multifunction=on,chassis=0,addr=0xa \
>         -device pcie-root-port,id=root1,bus=pcie.0,chassis=1 \
>         -device pcie-root-port,id=root2,bus=pcie.0,chassis=2 \
>         -netdev tap,id=hostnet1,fd=4 4<>/dev/tap49\
>         -device 
> virtio-net-pci,netdev=hostnet1,id=net1,mac=52:54:00:02:02:02,bus=root2,failover=on
>  \
>         -device 
> vfio-pci,id=hostdev0,host=05:00.2,bus=root1,net_failover_pair_id=net1 \
>         /var/lib/libvirt/images/sriov-lm-02.qcow2
> 
> > >While reviewing, I realized that we shouldn't have this check for below  
> > reasons.  
> > >
> > >1. It is user's responsibility to pass networking device.
> > >But its ok to check the class, if PCI Device is passed.
> > >So class comparison should be inside the pci_check().  
> > 
> > I'm not sure I understand this point, could you please elaborate?
> > You're suggesting to move the check for the class into the check for
> > pci_is_express?
> >   
> No. Below is the suggestion.
> 
> diff --git a/hw/pci/pci.c b/hw/pci/pci.c
> index 8fbf32d68c..8004309973 100644
> --- a/hw/pci/pci.c
> +++ b/hw/pci/pci.c
> @@ -2105,17 +2105,14 @@ static void pci_qdev_realize(DeviceState *qdev, Error 
> **errp)
>      }
> 
>      if (pci_dev->net_failover_pair_id) {
> -        if (!pci_is_express(pci_dev)) {
> -            error_setg(errp, "failover device is not a PCIExpress device");
> -            error_propagate(errp, local_err);
> -            return;
> -        }
> -        class_id = pci_get_word(pci_dev->config + PCI_CLASS_DEVICE);
> -        if (class_id != PCI_CLASS_NETWORK_ETHERNET) {
> -            error_setg(errp, "failover device is not an Ethernet device");
> -            error_propagate(errp, local_err);
> -            return;
> -        }
> +        if (pci_is_express(pci_dev)) {
> +            class_id = pci_get_word(pci_dev->config + PCI_CLASS_DEVICE);
> +            if (class_id != PCI_CLASS_NETWORK_ETHERNET) {
> +                error_setg(errp, "failover device is not an Ethernet 
> device");
> +                error_propagate(errp, local_err);
> +                return;
> +            }
> +       }
> 
> This will allow to map non PCI device as failover too.

As in previous email, the point of the check was to exclude devices
when the hotplug controller is known not to support the feature.  It's
a topology check masked as a device check, it only exists because
support at the hotplug controller is not ubiquitous.  Thanks,

Alex

> After writing above hunk I think that when code reaches to check for 
> If (pci_dev->net_failover_pair_id),... it is already gone gone through 
> do_pci_register_device().
> There should not be any check needed again for pci_is_express().
> Isn't it?
> 
> 
> > >2. It is limiting to only consider PCI devices.
> > >Automated and regression tests should be able validate this feature 
> > >without  
> > PCI Device.  
> > >This will enhance the stability of feature in long run.
> > >
> > >3. net failover driver doesn't limit it to have it over only PCI device.
> > >So similarly hypervisor should be limiting.  
> > 
> > I agree that we don't have to limit it to PCI(e) forever. But for this 
> > first shot I
> > think we should and then extend it continually. There are more things we can
> > support in the future like other hotplug types etc.
> >   
> o.k. But probably net_failover_pair_id field should be in DeviceState instead 
> of PCIDevice at minimum?
> Or you want to refactor it later?




reply via email to

[Prev in Thread] Current Thread [Next in Thread]