From: Si-Wei Liu
Subject: Re: Reducing vdpa migration downtime because of memory pin / maps
Date: Tue, 6 Jun 2023 15:44:29 -0700
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.11.0

On 4/5/23 04:37, Eugenio Perez Martin wrote:
> Hi! As mentioned in the last upstream virtio-networking meeting, one of the factors that adds more downtime to migration is the handling of the guest memory (pin, map, etc). At this moment this handling is bound to the virtio life cycle (DRIVER_OK, RESET). In that sense, the destination device waits until all the guest memory / state is migrated to start pinning all the memory. The proposal is to bind it to the char device life cycle (open vs close),
Hmmm, really? If it's tied to the life cycle of the char device, the next guest / QEMU launch on the same vhost-vdpa device node won't work.
> so all the guest memory can be pinned for all the guest / qemu lifecycle.
I think tying pinning to the guest / QEMU process life cycle makes more sense. Essentially, this pinning part needs to be decoupled from the iotlb mapping abstraction layer, and can / should work as a standalone uAPI, so that QEMU at the destination may launch and pin all of the guest's memory as needed without having to start the device, while awaiting any incoming migration request. The problem, though, is that there's no existing vhost uAPI that could properly serve as the vehicle for that. SET_OWNER / SET_MEM_TABLE / RESET_OWNER seem a remote fit. Any objection to introducing a new but clean vhost uAPI for pinning guest pages, subject to the guest's life cycle?
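To make the life cycle difference concrete, here is a toy userspace model in C. It is only a sketch: every `toy_*` name is made up for illustration, none of it is an existing or proposed vhost uAPI, and "pinning" is reduced to a flag plus a counter. It contrasts pinning bound to the virtio life cycle (DRIVER_OK / RESET) with pinning requested once per QEMU process.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: none of these names exist in the vhost uAPI. */
enum toy_pin_scope {
    TOY_PIN_SCOPE_VIRTIO,   /* pin at DRIVER_OK, unpin at RESET */
    TOY_PIN_SCOPE_PROCESS,  /* pin on explicit request, unpin at process exit */
};

struct toy_vdpa {
    enum toy_pin_scope scope;
    bool pinned;     /* guest memory currently pinned? */
    bool driver_ok;  /* device started? */
    size_t pin_ops;  /* how many times the (expensive) pin had to run */
};

/* Pin guest memory if it is not pinned yet, counting each real pin. */
static void toy_pin(struct toy_vdpa *d)
{
    if (!d->pinned) {
        d->pinned = true;
        d->pin_ops++;
    }
}

void toy_init(struct toy_vdpa *d, enum toy_pin_scope scope)
{
    d->scope = scope;
    d->pinned = false;
    d->driver_ok = false;
    d->pin_ops = 0;
}

/* Hypothetical standalone request: pin without starting the device. */
void toy_pin_guest_memory(struct toy_vdpa *d)
{
    assert(d->scope == TOY_PIN_SCOPE_PROCESS);
    toy_pin(d);
}

/* DRIVER_OK: pins only if nothing pinned the memory earlier. */
void toy_set_driver_ok(struct toy_vdpa *d)
{
    toy_pin(d);
    d->driver_ok = true;
}

/* RESET: drops the pin only when pinning follows the virtio life cycle. */
void toy_reset(struct toy_vdpa *d)
{
    d->driver_ok = false;
    if (d->scope == TOY_PIN_SCOPE_VIRTIO)
        d->pinned = false;
}

/* QEMU process exit: the pin always goes away. */
void toy_process_exit(struct toy_vdpa *d)
{
    d->driver_ok = false;
    d->pinned = false;
}
```

In this model, a destination using `TOY_PIN_SCOPE_PROCESS` can call `toy_pin_guest_memory()` while waiting for the incoming migration, so the later `toy_set_driver_ok()` does no pinning work, and a `toy_reset()` in between does not force a re-pin.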
Another concern is the use_va stuff. Originally it was tagged at the device level and made static at the time of device instantiation, which is fine. But others to come may just find a new home in a per-group or per-vq level struct. It is hard to tell whether pinning is actually needed for those latter use_va friends, as they are essentially tied to the virtio life cycle or feature negotiation, while the guest / QEMU starts way earlier than that. Perhaps just ignore those sub-device level use_va usages? Presumably !use_va at the device level is sufficient to infer the need for pinning for the device?
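Put as code, the inference suggested above would look roughly like the following C sketch. The struct and function are hypothetical and simplified for illustration; in particular the per-group and per-vq fields stand in for the "others to come" and are assumptions, not existing kernel fields.

```c
#include <stdbool.h>

/* Hypothetical, simplified view of where use_va could live. */
struct toy_vdpa_caps {
    bool dev_use_va;    /* device level, static after device instantiation */
    bool group_use_va;  /* assumed per-group flag, known only later */
    bool vq_use_va;     /* assumed per-vq flag, known only later */
};

/*
 * Decide at QEMU start whether guest memory must be pinned.
 * Only the device-level flag is consulted; the sub-device flags are
 * deliberately ignored because they depend on the virtio life cycle
 * or feature negotiation, which happen after this decision.
 */
bool toy_needs_pinning(const struct toy_vdpa_caps *caps)
{
    return !caps->dev_use_va;
}
```

The trade-off is that a device with !use_va at the device level gets its memory pinned even if a later per-group or per-vq setting would not have required it.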
Regards, -Siwei
> This has two main problems:
> * At this moment the reset semantics forces the vdpa device to unmap all the memory. So this change needs a vhost vdpa feature flag.
> * This may increase the initialization time. Maybe we can delay it if qemu is not the destination of a LM. Anyway I think this should be done as an optimization on top.
>
> Any ideas or comments in this regard? Thanks!