qemu-devel

Re: [RFC PATCH 00/27] vDPA software assisted live migration


From: Jason Wang
Subject: Re: [RFC PATCH 00/27] vDPA software assisted live migration
Date: Thu, 10 Dec 2020 17:12:29 +0800
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.10.0


On 2020/12/9 11:57 PM, Stefan Hajnoczi wrote:
> On Wed, Dec 09, 2020 at 04:26:50AM -0500, Jason Wang wrote:
>> ----- Original Message -----
>>> On Fri, Nov 20, 2020 at 07:50:38PM +0100, Eugenio Pérez wrote:
>>>> This series enables vDPA software assisted live migration for
>>>> vhost-net devices. This is a new method of vhost device migration:
>>>> instead of relying on the vDPA device's dirty logging capability,
>>>> SW assisted LM intercepts the dataplane, forwarding the descriptors
>>>> between VM and device.
>>>>
>>>> Pros:
>>>> + vhost/vDPA devices don't need to implement dirty memory logging
>>>> + Obsoletes ioctl(VHOST_SET_LOG_BASE) and friends
>>>>
>>>> Cons:
>>>> - Not generic, relies on vhost-net-specific ioctls
>>>> - Doesn't support VIRTIO Shared Memory Regions
>>>>   https://github.com/oasis-tcs/virtio-spec/blob/master/shared-mem.tex
>> I may miss something, but my understanding is that it's the
>> responsibility of the device to migrate this part?
> Good point. You're right.
>
>>>> - Performance (see below)

>>> I think performance will be significantly lower when the shadow vq is
>>> enabled. Imagine a vDPA device with hardware vq doorbell registers
>>> mapped into the guest so the guest driver can directly kick the device.
>>> When the shadow vq is enabled, a vmexit is needed to write to the shadow
>>> vq ioeventfd, then the host kernel scheduler switches to a QEMU thread
>>> to read the ioeventfd, the descriptors are translated, QEMU writes to
>>> the vhost hdev kick fd, the host kernel scheduler switches to the vhost
>>> worker thread, vhost/vDPA notifies the virtqueue, and finally the
>>> vDPA driver writes to the hardware vq doorbell register. That is a lot
>>> of overhead compared to writing to an exitless MMIO register!
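
To make that chain concrete, here is a minimal sketch of the forwarding
step as it might look in a userspace shadow virtqueue; every name
(ShadowVirtQueue, shadow_vq_translate_and_add, the fd fields) is
illustrative, not the actual code from this series:

#include <stdint.h>
#include <unistd.h>
#include <linux/virtio_ring.h>

/* Sketch only: a trimmed-down shadow virtqueue. Endianness handling
 * and error paths are elided. */
typedef struct ShadowVirtQueue {
    struct vring vring;       /* the shadow ring exposed to the device */
    uint16_t last_avail_idx;
    int guest_kick_fd;        /* ioeventfd hit by the guest's doorbell write */
    int device_kick_fd;       /* vhost hdev kick fd */
} ShadowVirtQueue;

/* Forward one guest kick through the shadow virtqueue: this is the
 * chain described above, in code. */
static void shadow_vq_handle_guest_kick(ShadowVirtQueue *svq)
{
    uint64_t n;

    /* The guest's doorbell write caused a vmexit; consume the signal. */
    if (read(svq->guest_kick_fd, &n, sizeof(n)) != sizeof(n)) {
        return;
    }

    /* Translate available descriptors from guest addresses to addresses
     * the device can use, and publish them on the shadow ring
     * (shadow_vq_translate_and_add() left abstract here). */
    while (svq->vring.avail->idx != svq->last_avail_idx) {
        shadow_vq_translate_and_add(svq);
    }

    /* Kick the real device: vhost/vDPA wakes its worker thread and
     * writes the hardware doorbell. */
    n = 1;
    write(svq->device_kick_fd, &n, sizeof(n));
}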
>> I think it's a balance. E.g. we can poll the virtqueue to have an
>> exitless doorbell.
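
One way to read that: suppress guest notifications and busy-poll the
avail index for a while, falling back to the ioeventfd only when the
ring stays idle. Reusing the hypothetical ShadowVirtQueue above:

#include <stdbool.h>
#include <linux/virtio_ring.h>

/* Sketch: bounded busy-poll so the guest can run without doorbell
 * vmexits. VRING_USED_F_NO_NOTIFY is the standard device-side
 * notification suppression flag; the helpers are illustrative. */
static bool shadow_vq_try_poll(ShadowVirtQueue *svq, unsigned budget)
{
    /* Tell the guest driver not to kick us while we poll. */
    svq->vring.used->flags |= VRING_USED_F_NO_NOTIFY;

    while (budget--) {
        /* avail->idx is written by the guest; read it directly. */
        if (svq->vring.avail->idx != svq->last_avail_idx) {
            shadow_vq_translate_and_add(svq);  /* same path as above */
            return true;
        }
        /* a cpu_relax()-style pause would go here */
    }

    /* Ring stayed idle: re-enable notifications and fall back to the
     * ioeventfd path. */
    svq->vring.used->flags &= ~VRING_USED_F_NO_NOTIFY;
    return false;
}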

>>> If the shadow vq was implemented in drivers/vhost/ and QEMU used the
>>> existing ioctl(VHOST_SET_LOG_BASE) approach, then the overhead would be
>>> reduced to just one set of ioeventfd/irqfd. In other words, the QEMU
>>> dirty memory logging happens asynchronously and isn't in the dataplane.
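
For reference, enabling that existing interface from userspace looks
roughly like this; VHOST_SET_FEATURES, VHOST_F_LOG_ALL and
VHOST_SET_LOG_BASE are the real uapi, but the bitmap sizing and error
handling are simplified:

#include <stdint.h>
#include <stdlib.h>
#include <sys/ioctl.h>
#include <linux/vhost.h>

#define LOG_PAGE_SIZE 4096  /* one bit per 4 KiB page; matches QEMU's
                             * VHOST_LOG_PAGE */

/* Sketch: turn on kernel-side dirty logging for a vhost device. The
 * kernel then sets a bit in the bitmap for every page the device
 * writes, and QEMU scans it asynchronously, off the dataplane. */
static void *vhost_enable_dirty_log(int vhost_fd, uint64_t features,
                                    size_t guest_mem_bytes)
{
    size_t bits = guest_mem_bytes / LOG_PAGE_SIZE;
    uint64_t *bitmap = calloc(bits / 64 + 1, sizeof(uint64_t));

    features |= 1ULL << VHOST_F_LOG_ALL;          /* ask the device to log */
    if (!bitmap || ioctl(vhost_fd, VHOST_SET_FEATURES, &features) < 0) {
        free(bitmap);
        return NULL;
    }

    uint64_t base = (uint64_t)(uintptr_t)bitmap;  /* userspace log address */
    if (ioctl(vhost_fd, VHOST_SET_LOG_BASE, &base) < 0) {
        free(bitmap);
        return NULL;
    }
    return bitmap;
}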

>>> In addition, hardware that supports dirty memory logging as well as
>>> software vDPA devices could completely eliminate the shadow vq for even
>>> better performance.
>> Yes. That's our plan. But the interface might require more thought.

>> E.g. is the bitmap a good approach? To me, reporting dirty pages via a
>> virtqueue is better since it has a smaller footprint and is
>> self-throttling.
>>
>> And we need an address space other than the one used by the guest for
>> either the bitmap or the virtqueue.
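
No such interface exists yet, so purely as a strawman, a single element
on such a reporting virtqueue might look like this (entirely
hypothetical layout):

#include <stdint.h>

/* Strawman only: one element on a hypothetical dirty-page reporting
 * virtqueue. The device fills buffers of these and pushes them to the
 * used ring; the ring's finite depth provides the backpressure that
 * makes the mechanism self-throttling, unlike an always-writable
 * bitmap. */
struct dirty_page_report {
    uint64_t gpa;        /* guest-physical address of first dirty page */
    uint32_t num_pages;  /* length of the dirty run, in pages */
    uint32_t flags;      /* reserved */
};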

>>> But performance is a question of "is it good enough?". Maybe this
>>> approach is okay and users don't expect good performance while dirty
>>> memory logging is enabled.
>> Yes, and actually such a slowdown may help the migration converge.
>>
>> Note that the whole idea is to try to have a generic solution for all
>> types of devices. It's good to consider the performance, but for the
>> first stage it should be sufficient to make it work and to optimize on
>> top.
> Moving the shadow vq to the kernel later would be quite a big change
> requiring rewriting much of the code. That's why I mentioned this now,
> before a lot of effort is invested in a QEMU implementation.


Right.



>>> I just wanted to share the idea of moving the shadow vq into the
>>> kernel in case you like that approach better.
>> My understanding is to keep the kernel as simple as possible and leave
>> the policies to userspace as much as possible. E.g. it requires us to
>> disable doorbell mapping and irq offloading, all of which were under
>> the control of userspace.
> If the performance is acceptable with the QEMU approach then I think
> that's the best place to implement it. It looks high-overhead, though,
> so maybe one of the first things to do is to run benchmarks to collect
> data on how it performs?


Yes, I agree.

Thanks



> Stefan



