On Thu, Mar 04, 2021 at 03:59:17PM +0100, Kevin Wolf wrote:
> Am 04.03.2021 um 15:08 hat Stefano Garzarella geschrieben:
> > On Thu, Mar 04, 2021 at 01:05:02PM +0100, Kevin Wolf wrote:
> > > Am 03.03.2021 um 18:40 hat Stefano Garzarella geschrieben:
> > > > Hi Jason,
> > > > as reported in this BZ [1], when qemu-img creates a QCOW2 image on RBD
> > > > writing data is very slow compared to a raw file.
> > > >
> > > > Comparing raw vs QCOW2 image creation with RBD I found that we use a
> > > > different object size, for the raw file I see '4 MiB objects', for
> > > > QCOW2 I
> > > > see '64 KiB objects' as reported on comment 14 [2].
> > > > This should be the main issue of slowness, indeed forcing in the code 4
MiB
> > > > object size also for QCOW2 increased the speed a lot.
> > > >
> > > > Looking better I discovered that for raw files, we call rbd_create()
with
> > > > obj_order = 0 (if 'cluster_size' options is not defined), so the default
> > > > object size is used.
> > > > Instead for QCOW2, we use obj_order = 16, since the default
'cluster_size'
> > > > defined for QCOW2, is 64 KiB.
> > >
> > > Hm, the QemuOpts-based image creation is messy, but why does the rbd
> > > driver even see the cluster_size option?
> > >
> > > The first thing qcow2_co_create_opts() does is splitting the passed
> > > QemuOpts into options it will process on the qcow2 layer and options
> > > that are passed to the protocol layer. So if you pass a cluster_size
> > > option, qcow2 should take it for itself and not pass it to rbd.
> > >
> > > If it is passed to rbd, I think that's a bug in the qcow2 driver.
> >
> > IIUC qcow2 properyl remove it, but when rbd uses qemu_opt_get_size_del(opts,
> > BLOCK_OPT_CLUSTER_SIZE, 0) the default value of qcow2 format is returned.
> >
> > Going in depth in qemu_opt_get_size_helper(), I found that qemu_opt_find()
> > properly returns a NULL pointer, but then we call find_default_by_name()
> > that returns the default value of qcow2 format (64k).
>
> Ugh, I see why. We're passing the protocol driver a QemuOpts that was
> created for a QemuOptsList with the qcow2 default, not for its own
> QemuOptsList. This is wrong.
>
> Note that the QemuOptsList is not qcow2_create_opts itself, but a list
> that is created with qemu_opts_append() to combine qcow2 and rbd options
> into a new QemuOptsList. For overlapping options, the format wins.
>
> I don't think you can change the QemuOptsList of an existing QemuOpts,
> nor is there a clone operation that could just copy all options into a
> new QemuOpts created for the rbd QemuOptsList, so maybe the easiest
> hack^Wsolution would be converting to QDict and back...
Do you mean something like this? (I'll send a proper patch when everything
is a little clearer to me :-)
diff --git a/block.c b/block.c
index a1f3cecd75..74b02b32dc 100644
--- a/block.c
+++ b/block.c
@@ -671,13 +671,33 @@ out:
int bdrv_create_file(const char *filename, QemuOpts *opts, Error **errp)
{
BlockDriver *drv;
+ QemuOpts *new_opts;
+ QDict *qdict;
+ int ret;
drv = bdrv_find_protocol(filename, true, errp);
if (drv == NULL) {
return -ENOENT;
}
- return bdrv_create(drv, filename, opts, errp);
+ if (!drv->create_opts) {
+ error_setg(errp, "Driver '%s' does not support image creation",
+ drv->format_name);
+ return -ENOTSUP;
+ }
+
+ qdict = qemu_opts_to_qdict(opts, NULL);
+ new_opts = qemu_opts_from_qdict(drv->create_opts, qdict, errp);
+ if (new_opts == NULL) {
+ ret = -EINVAL;
+ goto out;
+ }
+
+ ret = bdrv_create(drv, filename, new_opts, errp);
+out:
+ qemu_opts_del(new_opts);
+ qobject_unref(qdict);
+ return ret;
}