Re: [PATCH v2 07/11] vfio/migration: Implement VFIO migration protocol v2
From: Jason Gunthorpe
Subject: Re: [PATCH v2 07/11] vfio/migration: Implement VFIO migration protocol v2
Date: Mon, 18 Jul 2022 12:12:19 -0300
On Mon, May 30, 2022 at 08:07:35PM +0300, Avihai Horon wrote:
> +/* Returns 1 if end-of-stream is reached, 0 if more data and -1 if error */
> +static int vfio_save_block(QEMUFile *f, VFIOMigration *migration)
> +{
> + ssize_t data_size;
> +
> + data_size = read(migration->data_fd, migration->data_buffer,
> + migration->data_buffer_size);
> + if (data_size < 0) {
> + return -1;
> + }
> + if (data_size == 0) {
> + return 1;
> + }
> +
> + qemu_put_be64(f, VFIO_MIG_FLAG_DEV_DATA_STATE);
> + qemu_put_be64(f, data_size);
> + qemu_put_buffer_async(f, migration->data_buffer, data_size, false);
> + qemu_fflush(f);
> + bytes_transferred += data_size;
> +
> + trace_vfio_save_block(migration->vbasedev->name, data_size);
> +
> + return qemu_file_get_error(f);
> +}
We looked at this with an eye to "how much data is transferred" per
callback.
The above function is the basic data mover, and
'migration->data_buffer_size' is set to 1MB at the moment.
So, we produce VFIO_MIG_FLAG_DEV_DATA_STATE sections of up to 1MB.
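To make the size of one such section concrete, here is a minimal standalone sketch of the framing that vfio_save_block() puts on the stream: an 8-byte big-endian flag, an 8-byte big-endian length, then the payload. The names put_be64() and frame_data_section() and the DEMO_FLAG_DEV_DATA_STATE value are placeholders invented for this illustration, not taken from the patch, and the real code writes through QEMUFile rather than into a flat buffer.

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

/* Placeholder flag value for illustration only; not taken from the patch. */
#define DEMO_FLAG_DEV_DATA_STATE 0xffffffffef100004ULL

/* Store a 64-bit value in big-endian byte order. */
static void put_be64(uint8_t *dst, uint64_t v)
{
    for (int i = 0; i < 8; i++) {
        dst[i] = (uint8_t)(v >> (56 - 8 * i));
    }
}

/*
 * Frame one data section: 8-byte flag, 8-byte length, then the payload.
 * Returns the number of bytes written, or 0 if 'out' is too small.
 */
static size_t frame_data_section(uint8_t *out, size_t out_size,
                                 const uint8_t *data, size_t data_size)
{
    size_t total = 16 + data_size;

    if (out_size < total) {
        return 0;
    }
    put_be64(out, DEMO_FLAG_DEV_DATA_STATE);
    put_be64(out + 8, (uint64_t)data_size);
    memcpy(out + 16, data, data_size);
    return total;
}
```

Under these assumptions each section costs 16 bytes of header on top of the data, so a full 1MB read produces a section of 1MB + 16 bytes.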
This series does not include the precopy support, but that will
include a precopy 'save_live_iterate' function like this:
static int vfio_save_iterate(QEMUFile *f, void *opaque)
{
    VFIODevice *vbasedev = opaque;
    VFIOMigration *migration = vbasedev->migration;
    int ret;

    ret = vfio_save_block(f, migration);
    if (ret < 0) {
        return ret;
    }
    if (ret == 1) {
        return 1;
    }

    qemu_put_be64(f, VFIO_MIG_FLAG_END_OF_STATE);
    return 0;
}
Thus, during precopy this will never do more than 1MB per callback.
> +static int vfio_save_complete_precopy(QEMUFile *f, void *opaque)
> +{
> + VFIODevice *vbasedev = opaque;
> + enum vfio_device_mig_state recover_state;
> + int ret;
> +
> + /* We reach here with device state STOP or STOP_COPY only */
> + recover_state = VFIO_DEVICE_STATE_STOP;
> + ret = vfio_migration_set_state(vbasedev, VFIO_DEVICE_STATE_STOP_COPY,
> + recover_state);
> + if (ret) {
> + return ret;
> + }
> +
> + do {
> + ret = vfio_save_block(f, vbasedev->migration);
> + if (ret < 0) {
> + return ret;
> + }
> + } while (!ret);
This seems to be the main problem: we chain together 1MB blocks
until all of the remaining device data has been sent. The above is
hooked to 'save_live_complete_precopy'.
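As a rough illustration of that chaining, the sketch below models the do/while with a stand-in data source that only counts bytes. Everything prefixed demo_ is hypothetical and exists only for this example. It assumes every read fills the 1MB buffer until the data runs out, which the real data_fd does not guarantee, and it leaves out the error (-1) path.

```c
#include <stddef.h>

#define DEMO_BUFFER_SIZE (1024 * 1024)  /* 1MB, matching data_buffer_size */

/* Hypothetical stand-in for the device data_fd: just a count of bytes left. */
struct demo_source {
    size_t remaining;
};

/* Mimics read(): hands out up to 'size' bytes, 0 at end of stream. */
static size_t demo_read(struct demo_source *s, size_t size)
{
    size_t n = s->remaining < size ? s->remaining : size;

    s->remaining -= n;
    return n;
}

/* Same return contract as vfio_save_block(): 1 at end of stream, 0 if more. */
static int demo_save_block(struct demo_source *s, size_t *sent)
{
    size_t n = demo_read(s, DEMO_BUFFER_SIZE);

    if (n == 0) {
        return 1;
    }
    *sent += n;
    return 0;
}

/* Drain loop in the shape of the do/while above; returns blocks emitted. */
static unsigned demo_complete_precopy(struct demo_source *s, size_t *sent)
{
    unsigned blocks = 0;
    int ret;

    do {
        ret = demo_save_block(s, sent);
        if (ret == 0) {
            blocks++;
        }
    } while (!ret);
    return blocks;
}
```

With 10MB of pending device state this model emits 10 blocks back to back inside a single callback, plus one final read that returns 0 and ends the loop.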
So, if we want to break the above up into some 'save_iterate'-like
function, do you have some advice on how to do it? The above do/while
must happen after the transition to VFIO_DEVICE_STATE_STOP_COPY.
For mlx5 the above loop will often move ~10MBs for small VMs and
100MBs for big VMs (big meaning making extensive use of RDMA
functionality), and this will not change whether or not pre-copy
support is present.
Is it still a problem?
For other devices, like a GPU, I would imagine pre-copy support is
implemented and this will be a smaller post-precopy residual.
Jason