qemu-devel

Re: [PATCH v0 0/2] virtio-blk and vhost-user-blk cross-device migration


From: Dr. David Alan Gilbert
Subject: Re: [PATCH v0 0/2] virtio-blk and vhost-user-blk cross-device migration
Date: Wed, 6 Oct 2021 09:28:50 +0100
User-agent: Mutt/2.0.7 (2021-05-04)

* Michael S. Tsirkin (mst@redhat.com) wrote:
> On Wed, Oct 06, 2021 at 09:09:30AM +0100, Dr. David Alan Gilbert wrote:
> > * Michael S. Tsirkin (mst@redhat.com) wrote:
> > > On Tue, Oct 05, 2021 at 12:10:08PM -0400, Eduardo Habkost wrote:
> > > > On Tue, Oct 05, 2021 at 03:01:05PM +0100, Dr. David Alan Gilbert wrote:
> > > > > * Michael S. Tsirkin (mst@redhat.com) wrote:
> > > > > > On Tue, Oct 05, 2021 at 02:18:40AM +0300, Roman Kagan wrote:
> > > > > > > On Mon, Oct 04, 2021 at 11:11:00AM -0400, Michael S. Tsirkin 
> > > > > > > wrote:
> > > > > > > > On Mon, Oct 04, 2021 at 06:07:29PM +0300, Denis Plotnikov wrote:
> > > > > > > > > It might be useful for cases when a slow block layer
> > > > > > > > > should be replaced with a more performant one on a
> > > > > > > > > running VM without stopping it, i.e. with very low
> > > > > > > > > downtime, comparable to that of migration.
> > > > > > > > > 
> > > > > > > > > It's possible to achieve that for two reasons:
> > > > > > > > > 
> > > > > > > > > 1.The VMStates of "virtio-blk" and "vhost-user-blk" are
> > > > > > > > > almost the same.
> > > > > > > > >   They consist of the identical VMSTATE_VIRTIO_DEVICE and
> > > > > > > > > differ from
> > > > > > > > >   each other only in the values of migration service fields.
> > > > > > > > > 2.The device driver used in the guest is the same: virtio-blk
> > > > > > > > > 
> > > > > > > > > In the series cross-migration is achieved by adding a new 
> > > > > > > > > type.
> > > > > > > > > The new type uses the virtio-blk VMState instead of the
> > > > > > > > > vhost-user-blk-specific
> > > > > > > > > VMState; it also implements migration save/load callbacks to
> > > > > > > > > be compatible
> > > > > > > > > with the migration stream produced by the "virtio-blk" device.
> > > > > > > > > 
> > > > > > > > > Adding a new type instead of modifying the existing one is
> > > > > > > > > convenient.
> > > > > > > > > It makes it easy to distinguish the new virtio-blk-compatible
> > > > > > > > > vhost-user-blk device from the existing non-compatible one
> > > > > > > > > using qemu machinery, without any
> > > > > > > > > other modifications. That gives all the variety of qemu
> > > > > > > > > device-related
> > > > > > > > > constraints out of the box.
> > > > > > > > 
> > > > > > > > Hmm I'm not sure I understand. What is the advantage for the 
> > > > > > > > user?
> > > > > > > > What if vhost-user-blk became an alias for 
> > > > > > > > vhost-user-virtio-blk?
> > > > > > > > We could add some hacks to make it compatible for old machine 
> > > > > > > > types.
> > > > > > > 
> > > > > > > The point is that virtio-blk and vhost-user-blk are not
> > > > > > > migration-compatible ATM.  OTOH they are the same device from the 
> > > > > > > guest
> > > > > > > POV so there's nothing fundamentally preventing the migration 
> > > > > > > between
> > > > > > > the two.  In particular, we see it as a means to switch between 
> > > > > > > the
> > > > > > > storage backend transports via live migration without disrupting 
> > > > > > > the
> > > > > > > guest.
> > > > > > > 
> > > > > > > Migration-wise virtio-blk and vhost-user-blk have in common
> > > > > > > 
> > > > > > > - the content of the VMState -- VMSTATE_VIRTIO_DEVICE
> > > > > > > 
> > > > > > > The two differ in
> > > > > > > 
> > > > > > > - the name and the version of the VMStateDescription
> > > > > > > 
> > > > > > > - virtio-blk has an extra migration section (via .save/.load 
> > > > > > > callbacks
> > > > > > >   on VirtioDeviceClass) containing requests in flight
> > > > > > > 
> > > > > > > It looks like to become migration-compatible with virtio-blk,
> > > > > > > vhost-user-blk has to start using VMStateDescription of 
> > > > > > > virtio-blk and
> > > > > > > provide compatible .save/.load callbacks.  It isn't entirely 
> > > > > > > obvious how
> > > > > > > to make this machine-type-dependent, so we came up with a simpler 
> > > > > > > idea
> > > > > > > of defining a new device that shares most of the implementation 
> > > > > > > with the
> > > > > > > original vhost-user-blk except for the migration stuff.  We're 
> > > > > > > certainly
> > > > > > > open to suggestions on how to reconcile this under a single
> > > > > > > vhost-user-blk device, as this would be more user-friendly indeed.
> > > > > > > 
> > > > > > > We considered using a class property for this and defining the
> > > > > > > respective compat clause, but IIUC the class constructors (where 
> > > > > > > .vmsd
> > > > > > > and .save/.load are defined) are not supposed to depend on class
> > > > > > > properties.
> > > > > > > 
> > > > > > > Thanks,
> > > > > > > Roman.
> > > > > > 
> > > > > > So the question is how to make vmsd depend on machine type.
> > > > > > CC Eduardo who poked at this kind of compat stuff recently,
> > > > > > paolo who looked at qom things most recently and dgilbert
> > > > > > for advice on migration.
> > > > > 
> > > > > I don't think I've seen anyone change vmsd name dependent on machine
> > > > > type; making fields appear/disappear is easy - that just ends up as a
> > > > > property on the device that's checked;  I guess if that property is
> > > > > global (rather than per instance) then you can check it in
> > > > > vhost_user_blk_class_init and swing the dc->vmsd pointer?
> > > > 
> > > > class_init can be called very early during QEMU initialization,
> > > > so it's too early to make decisions based on machine type.
> > > > 
> > > > Making a specific vmsd appear/disappear based on machine
> > > > configuration or state is "easy", by implementing
> > > > VMStateDescription.needed.  But this would require registering
> > > > both vmsds (one of them would need to be registered manually
> > > > instead of using DeviceClass.vmsd).
> > > > 
> > > > I don't remember the consequences of not using
> > > > DeviceClass.vmsd to register a vmsd; I only remember they were
> > > > subtle.  See commit b170fce3dd06 ("cpu: Register
> > > > VMStateDescription through CPUState") and related threads.  CCing
> > > > Philippe, who might remember the details here.
> > > > 
> > > > If that's an important use case, I would suggest allowing devices
> > > > to implement a DeviceClass.get_vmsd method, which would override
> > > > DeviceClass.vmsd if necessary.  Is the problem we're trying to
> > > > address worth the additional complexity?
> > > 
> > > The tricky part is that we generally don't support migration when
> > > command line is different on source and destination ...
> > 
> > The reality has always been a bit more subtle than that.
> > For example, it's fine if the path to a block device is different on the
> > source and destination; or if it's accessed by iSCSI on the destination
> > say.  As long as what the guest sees, and the migration stream carries
> > are the same, then in principal it's OK - but that does start getting
> > trickier; also it would prboably get interesting to let libvirt know
> > that this combo is OK.
> 
> I agree, but that's not the same as specifying a different
> device. Yes, internally they are compatible, but
> this is a detail users/tools generally won't be able to
> figure out.

Yeh.

> > > So maybe the actual answer is that vhost-user-blk should really
> > > be a drive supplied to a virtio blk device, not a device
> > > itself?
> > > This way it's sane, and also matches what we do e.g. for net.
> > 
> > Hmm a bit of a fudge; it's not quite the same as a drive is it; there's
> > almost another layer split in there.
> > 
> > Dave
> 
> We can make it something else, not "drive=". Maybe simply "vhost-user=" ?
> Point is if we promise it looks the same to guest it should be the
> same -device.

To me it feels the same as the distinction between vhost-kernel and
qemu-backed virtio that we get in net and others - in principle it's just
another implementation.
A tricky part is guaranteeing the set of visible virtio features between
implementations; we have that problem when we use vhost-kernel and run
on a newer/older kernel and gain virtio features; the same will be true
with vhost-user implementations.

But this would make the structure of a vhost-user implementation quite
different.

Dave

> 
> > > -- 
> > > MST
> > > 
> > -- 
> > Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
> 
-- 
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK



