qemu-devel

Re: [PATCH] failover: allow to pause the VM during the migration


From: Juan Quintela
Subject: Re: [PATCH] failover: allow to pause the VM during the migration
Date: Fri, 29 Oct 2021 15:49:38 +0200
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/27.2 (gnu/linux)

"Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote:
> * Laurent Vivier (lvivier@redhat.com) wrote:
>> If we want to save a snapshot of a VM to a file, we used to follow the
>> following steps:
>> 
>> 1- stop the VM:
>>    (qemu) stop
>> 
>> 2- migrate the VM to a file:
>>    (qemu) migrate "exec:cat > snapshot"
>> 
>> 3- resume the VM:
>>    (qemu) cont
>> 
>> After that we can restore the snapshot with:
>>   qemu-system-x86_64 ... -incoming "exec:cat snapshot"
>>   (qemu) cont
>> 
>> But when failover is configured, it doesn't work anymore.
>> 
>> As the failover needs to ask the guest OS to unplug the card
>> the machine cannot be paused.
>> 
>> This patch introduces a new migration parameter, "pause-vm", that
>> asks the migration to pause the VM during the migration startup
>> phase after the card is unplugged.
>> 
>> Once the migration is done, we only need to resume the VM with
>> "cont" and the card is plugged back:
>> 
>> 1- set the parameter:
>>    (qemu) migrate_set_parameter pause-vm on
>> 
>> 2- migrate the VM to a file:
>>    (qemu) migrate "exec:cat > snapshot"
>> 
>>    The primary failover card (VFIO) is unplugged and the VM is paused.
>> 
>> 3- resume the VM:
>>    (qemu) cont
>> 
>>    The VM restarts and the primary failover card is plugged back
>> 
>> The VM state sent in the migration stream is "paused", it means
>> when the snapshot is loaded or if the stream is sent to a destination
>> QEMU, the VM needs to be resumed manually.
>> 
>> Signed-off-by: Laurent Vivier <lvivier@redhat.com>
>
> A mix of comments:
>   a) As a boolean, this should be a MigrationCapability rather than a
> parameter
>   b) We already have a pause-before-switchover capability for a pause
> that happens later in the flow; so this would be something like
> pause-after-unplug
>   c) Is this really the right answer?  Could this be done a different
> way by doing the unplugs using (a possibly new) qmp command - so
> that you can explicitly trigger the unplug prior to the migration?

Not if you want the wait to be minimal.
What managedsave wants is to do the migration with the guest stopped,
and to wait until the last moment before stopping it.

Doing this in qemu is "relatively" simple.  Doing that in libvirt is
extremely complex, because you basically have to:
- unplug the device
- wait for unplug to finish
- stop the guest
- migrate paused
- (restart the guest)

If you do it in libvirt, you are increasing the time between "wait for
unplug to finish" and "stop the guest".  But the biggest problem is what
happens if the migration (or anything else) fails.
The qemu failover code already knows how to handle the stop/continuation
of the vfio device.  It is what happens on a normal run.  If you do this
in libvirt, it needs to be able to recover in all scenarios, which is
much more complex in my humble opinion.
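
To make the recovery problem concrete, here is a minimal Python sketch of
what a management layer would have to do if it drove the steps itself.
FakeVM and managed_save are hypothetical names made up for illustration;
nothing here is libvirt or qemu API, and the "unplug" step stands in for
both the unplug request and the wait for the guest to finish it.

```python
class FakeVM:
    """Hypothetical stand-in for a guest; not a libvirt or QEMU API."""

    def __init__(self, fail_at=None):
        self.running = True
        self.device_plugged = True
        self.fail_at = fail_at  # name of the step that should fail, if any
        self.log = []

    def _step(self, name):
        # Record the step, then fail before any state change if requested.
        self.log.append(name)
        if name == self.fail_at:
            raise RuntimeError(f"{name} failed")

    def unplug_device(self):
        self._step("unplug")
        self.device_plugged = False

    def plug_device(self):
        self._step("plug")
        self.device_plugged = True

    def stop(self):
        self._step("stop")
        self.running = False

    def cont(self):
        self._step("cont")
        self.running = True

    def migrate_paused(self):
        self._step("migrate")


def managed_save(vm):
    """Run the steps in order; undo the completed ones in reverse order."""
    undo = []
    try:
        vm.unplug_device()
        undo.append(vm.plug_device)
        vm.stop()
        undo.append(vm.cont)
        vm.migrate_paused()
        ok = True
    except RuntimeError:
        ok = False
    # Restart the guest and replug the card whether or not the save worked.
    for action in reversed(undo):
        action()
    return ok
```

Even this toy version has to track which steps completed so it can undo
only those, and it does not handle a failure inside the undo actions
themselves.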

Later, Juan.



