qemu-block
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: QEMU RBD is slow with QCOW2 images


From: Stefano Garzarella
Subject: Re: QEMU RBD is slow with QCOW2 images
Date: Thu, 4 Mar 2021 15:08:29 +0100

On Thu, Mar 04, 2021 at 01:05:02PM +0100, Kevin Wolf wrote:
Am 03.03.2021 um 18:40 hat Stefano Garzarella geschrieben:
Hi Jason,
as reported in this BZ [1], when qemu-img creates a QCOW2 image on RBD
writing data is very slow compared to a raw file.

Comparing raw vs QCOW2 image creation with RBD I found that we use a
different object size, for the raw file I see '4 MiB objects', for QCOW2 I
see '64 KiB objects' as reported on comment 14 [2].
This should be the main issue of slowness, indeed forcing in the code 4 MiB
object size also for QCOW2 increased the speed a lot.

Looking better I discovered that for raw files, we call rbd_create() with
obj_order = 0 (if 'cluster_size' options is not defined), so the default
object size is used.
Instead for QCOW2, we use obj_order = 16, since the default 'cluster_size'
defined for QCOW2, is 64 KiB.

Hm, the QemuOpts-based image creation is messy, but why does the rbd
driver even see the cluster_size option?

The first thing qcow2_co_create_opts() does is splitting the passed
QemuOpts into options it will process on the qcow2 layer and options
that are passed to the protocol layer. So if you pass a cluster_size
option, qcow2 should take it for itself and not pass it to rbd.

If it is passed to rbd, I think that's a bug in the qcow2 driver.

IIUC qcow2 properyl remove it, but when rbd uses qemu_opt_get_size_del(opts, BLOCK_OPT_CLUSTER_SIZE, 0) the default value of qcow2 format is returned.

Going in depth in qemu_opt_get_size_helper(), I found that qemu_opt_find() properly returns a NULL pointer, but then we call find_default_by_name() that returns the default value of qcow2 format (64k).


Using '-o cluster_size=2M' with qemu-img changed only the qcow2 cluster
size, since in qcow2_co_create_opts() we remove the 'cluster_size' from
QemuOpts calling qemu_opts_to_qdict_filtered().
For some reason that I have yet to understand, after this deletion, however
remains in QemuOpts the default value of 'cluster_size' for qcow2 (64 KiB),
that it's used in qemu_rbd_co_create_opts()

So it seems you came to a similar conclusion. We need to find out where
the 64k come from and just fix that so that rbd uses its default.

Yes, I tried debugging above, but I'm not sure how to fix it.

Maybe a new parameter in qemu_opt_get_size_helper() to prevent it from looking for the default value. Or we should prevent the default value from being added to the opts->list->desc, but that part is still not very clear to me.


At this point my doubts are:
Does it make sense to use the same cluster_size as qcow2 as object_size in
RBD?
If we want to keep the 2 options separated, how can it be done? Should we
rename the option in block/rbd.c?

My lazy answer is that you could just use QMP blockdev-create, where you
create layer by layer separately.

What could possibly be done for the QemuOpts is using the dotted syntax
like for opening, so you could specify file.cluster_size=... for the
protocol layer (or data_file.cluster_size=... for the external data
file etc.)


This would be cool :-)

Thanks,
Stefano




reply via email to

[Prev in Thread] Current Thread [Next in Thread]