qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v6 0/11] add failover feature for assigned network devices


From: Roman Kagan
Subject: Re: [PATCH v6 0/11] add failover feature for assigned network devices
Date: Fri, 25 Oct 2019 15:23:55 +0000
User-agent: Mutt/1.12.1 (2019-06-15)

On Fri, Oct 25, 2019 at 02:19:19PM +0200, Jens Freimann wrote:
> This is implementing the host side of the net_failover concept
> (https://www.kernel.org/doc/html/latest/networking/net_failover.html)
> 
> Changes since v5:
> * rename net_failover_pair_id parameter/property to failover_pair_id
> * in PCI code use pci_bus_is_express(). This won't fail on functions > 0
> * make sure primary and standby can't be added to same PCI slot
> * add documentation file in docs/ to virtio-net patch, add file to
>    MAINTAINERS (added to networking devices section)
> * add comment to QAPI event for failover negotiation, try to improve
>    commit message 
> 
> The general idea is that we have a pair of devices, a vfio-pci and a
> virtio-net device. Before migration the vfio device is unplugged and data
> flows to the virtio-net device, on the target side another vfio-pci device
> is plugged in to take over the data-path. In the guest the net_failover
> module will pair net devices with the same MAC address.
> 
> * Patch 1 adds the infrastructure to hide the device for the qbus and qdev 
> APIs
> 
> * Patch 2 adds checks to PCIDevice for only allowing ethernet devices as
>   failover primary and only PCIExpress capable devices
> 
> * Patch 3 sets a new flag for PCIDevice 'partially_hotplugged' which we
>   use to skip the unrealize code path when doing a unplug of the primary
>   device
> 
> * Patch 4 sets the pending_deleted_event before triggering the guest
>   unplug request
> 
> * Patch 5 and 6 add new qmp events, one sends the device id of a device
>   that was just requested to be unplugged from the guest and another one
>   to let libvirt know if VIRTIO_NET_F_STANDBY was negotiated
> 
> * Patch 7 make sure that we can unplug the vfio-device before
>   migration starts
> 
> * Patch 8 adds a new migration state that is entered while we wait for
>   devices to be unplugged by guest OS
> 
> * Patch 9 just adds the new migration state to a check in libqos code
> 
> * Patch 10 In the second patch the virtio-net uses the API to defer adding 
> the vfio
>   device until the VIRTIO_NET_F_STANDBY feature is acked. It also
>   implements the migration handler to unplug the device from the guest and
>   re-plug in case of migration failure
> 
> * Patch 11 allows migration for failover vfio-pci devices
> 
> Previous discussion:
>   RFC v1 https://patchwork.ozlabs.org/cover/989098/
>   RFC v2 https://www.mail-archive.com/address@hidden/msg606906.html
>   v1: https://lists.gnu.org/archive/html/qemu-devel/2019-05/msg03968.html
>   v2: https://www.mail-archive.com/address@hidden/msg635214.html
>   v3: https://patchew.org/QEMU/address@hidden/
>   v4: https://patchew.org/QEMU/address@hidden/
>   v5: https://patchew.org/QEMU/address@hidden/
> 
> To summarize concerns/feedback from previous discussion:
> 1.- guest OS can reject or worse _delay_ unplug by any amount of time.
>   Migration might get stuck for unpredictable time with unclear reason.
>   This approach combines two tricky things, hot/unplug and migration.
>   -> We need to let libvirt know what's happening. Add new qmp events
>      and a new migration state. When a primary device is (partially)
>      unplugged (only from guest) we send a qmp event with the device id. When
>      it is unplugged from the guest the DEVICE_DELETED event is sent.
>      Migration will enter the wait-unplug state while waiting for the guest
>      os to unplug all primary devices and then move on with migration.
> 2. PCI devices are a precious ressource. The primary device should never
>   be added to QEMU if it won't be used by guest instead of hiding it in
>   QEMU.
>   -> We only hotplug the device when the standby feature bit was
>      negotiated. We save the device cmdline options until we need it for
>      qdev_device_add()

The status of the feature support in the guest can change.  E.g. a guest
reboot will clear it for certain, and the guest may boot into another OS
that doesn't support pv-pt failover, and will become confused by two
network devices with the same MAC.  AFAICS from a brief skimming, the
patchset doesn't appear to address this scenario (which is probably not
so uncommon).

>      Hiding a device can be a useful concept to model. For example a
>      pci device in a powered-off slot could be marked as hidden until the 
> slot is
>      powered on (mst).
> 3. Management layer software should handle this. Open Stack already has
>   components/code to handle unplug/replug VFIO devices and metadata to
>   provide to the guest for detecting which devices should be paired.
>   -> An approach that includes all software from firmware to
>      higher-level management software wasn't tried in the last years. This is
>      an attempt to keep it simple and contained in QEMU as much as possible.
>      One of the problems that stopped management software and libvirt from
>      implementing this idea is that it can't be sure that it's possible to
>      re-plug the primary device. By not freeing the devices resources in QEMU
>      and only asking the guest OS to unplug it is possible to re-plug the
>      device in case of a migration failure.

Frankly I'm failing to see the point in requiring 100%-reliable re-plug
on migration rollback.  The whole idea of this failover is to allow
temporary QOS degradation; if this isn't allowed you don't even consider
migrating.  So if the migration fails, you can leave the guest in the
degraded state on the source host until a better migraion target is
found or the conditions on the source host allow the re-plug to succeed.

Thanks,
Roman.



reply via email to

[Prev in Thread] Current Thread [Next in Thread]