QEMU RBD is slow with QCOW2 images

From: Stefano Garzarella @ 2021-03-03 17:40 UTC
  To: Jason Dillaman; +Cc: Peter Lieven, qemu-devel, qemu-block

Hi Jason,
As reported in this BZ [1], when qemu-img creates a QCOW2 image on RBD,
writing data is very slow compared to a raw file.

Comparing raw vs QCOW2 image creation with RBD, I found that we use a
different object size: for the raw file I see '4 MiB objects', for QCOW2
I see '64 KiB objects', as reported in comment 14 [2].
This is probably the main cause of the slowness; indeed, forcing a 4 MiB
object size in the code for QCOW2 as well increased the speed a lot.
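
To put the two object sizes in perspective, here is a minimal sketch of
the arithmetic. It assumes a hypothetical 10 GiB image (the size is not
taken from the BZ), and rbd_object_count() is an illustrative helper,
not a librbd function.

```c
#include <stdint.h>

/*
 * Illustrative helper (not part of librbd): number of objects needed to
 * back an image of image_size bytes when each object holds object_size
 * bytes, rounded up. object_size must be non-zero.
 */
static inline uint64_t rbd_object_count(uint64_t image_size,
                                        uint64_t object_size)
{
    return (image_size + object_size - 1) / object_size;
}
```

For the assumed 10 GiB image this gives 163840 objects at 64 KiB and
2560 objects at 4 MiB, i.e. 64 times more objects with the smaller size.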

Looking closer, I discovered that for raw files we call rbd_create()
with obj_order = 0 (if the 'cluster_size' option is not defined), so the
default object size is used.
For QCOW2, instead, we use obj_order = 16, since the default
'cluster_size' defined for QCOW2 is 64 KiB.
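
The relation between an object size and its order is a base-2 logarithm
(object size = 2^order bytes). The sketch below illustrates it;
size_to_obj_order() is a hypothetical helper written for this example,
not the code in block/rbd.c.

```c
#include <stdint.h>

/*
 * Hypothetical helper: return log2(object_size), or -1 if object_size
 * is zero or not a power of two. Note that an order of 0 passed to
 * rbd_create() means "use the default object size", as described
 * above, so it is not a size of 1 byte in practice.
 */
static inline int size_to_obj_order(uint64_t object_size)
{
    int order = 0;

    if (object_size == 0 || (object_size & (object_size - 1)) != 0) {
        return -1;
    }
    /* Count how many times the size can be halved before reaching 1. */
    while ((object_size >>= 1) != 0) {
        order++;
    }
    return order;
}
```

With this helper, 64 KiB maps to order 16 and 4 MiB maps to order 22.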

Using '-o cluster_size=2M' with qemu-img changed only the qcow2 cluster
size, since in qcow2_co_create_opts() we remove 'cluster_size' from
QemuOpts by calling qemu_opts_to_qdict_filtered().
For some reason that I have yet to understand, after this removal the
default value of 'cluster_size' for qcow2 (64 KiB) still remains in
QemuOpts, and it is then used in qemu_rbd_co_create_opts().
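
The observed behaviour can be reproduced with a toy model. This is NOT
QEMU code, and it only illustrates one possible explanation, stated here
as an assumption: that looking up an option with no explicit value falls
back to a per-option default before the caller's fallback. All names
(toy_opt, toy_opt_del, toy_opt_get_size) are hypothetical.

```c
#include <stdint.h>

/* Toy option: an explicit value (if set) plus an optional default. */
struct toy_opt {
    const char *name;
    int is_set;          /* non-zero if the user gave an explicit value */
    uint64_t value;      /* explicit value, valid only if is_set */
    int has_default;     /* non-zero if the option declares a default */
    uint64_t def_value;  /* declared default, valid only if has_default */
};

/* Drop the explicit value; the declared default is left untouched. */
static inline void toy_opt_del(struct toy_opt *opt)
{
    opt->is_set = 0;
}

/*
 * Assumed lookup order: explicit value, then declared default, then the
 * caller's fallback.
 */
static inline uint64_t toy_opt_get_size(const struct toy_opt *opt,
                                        uint64_t fallback)
{
    if (opt->is_set) {
        return opt->value;
    }
    if (opt->has_default) {
        return opt->def_value;
    }
    return fallback;
}
```

In this model, an option set to 2 MiB with a declared default of 64 KiB
returns 64 KiB after toy_opt_del(), even when the caller passes 0 as the
fallback, which matches the obj_order = 16 seen above.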

At this point my questions are:
Does it make sense to use the qcow2 cluster_size as the object size in
RBD?
If we want to keep the two options separate, how can it be done? Should
we rename the option in block/rbd.c?

Thanks,
Stefano

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1744525
[2] https://bugzilla.redhat.com/show_bug.cgi?id=1744525#c14
