qemu-devel.nongnu.org archive mirror
* [RFC patch v1 0/3] qemu-file writing performance improving
@ 2020-04-13 11:12 Denis Plotnikov
  2020-04-13 11:12 ` [RFC patch v1 1/3] qemu-file: introduce current buffer Denis Plotnikov
                   ` (5 more replies)
  0 siblings, 6 replies; 19+ messages in thread
From: Denis Plotnikov @ 2020-04-13 11:12 UTC (permalink / raw)
  To: qemu-devel; +Cc: den, dgilbert, quintela

Problem description: qcow2 internal snapshot saving takes too long on HDD: ~25 sec

When a qcow2 image is placed on a regular HDD and opened with
O_DIRECT, the snapshot saving time is around 26 sec.
The snapshot saving time can be made 4 times shorter.
This patch series proposes a way to achieve that.

Why does saving take ~25 sec?

There are three reasons:
1. the qemu-file iov limit (currently 64)
2. direct qemu_fflush calls, inducing disk writes
3. ram data copying and synchronous disk writes

While 1 and 2 are quite clear, the 3rd needs some explanation:

Internal snapshots use qemu-file as an interface to store the data with
stream semantics.
qemu-file avoids data copying when possible (mostly for ram data)
and uses iovectors to propagate the data to the underlying block driver state.
When qcow2 is opened with O_DIRECT, this is suboptimal.

This is what happens on writing: when the iovector request goes from qemu-file
to bdrv (here and below, by bdrv I mean qcow2 with a POSIX O_DIRECT-opened backend),
bdrv checks that all iovectors are base- and size-aligned. If that is not the case,
the data is copied to an internal buffer and a synchronous pwrite is called.
If the iovectors are aligned, io_submit is called.

In our case, snapshotting almost always induces pwrite, since we never have all
the iovectors in the request aligned: a short iovector, an 8-byte ram-page
delimiter, is added after each ram-page iovector.
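To illustrate why the 8-byte delimiters matter, the alignment check that forces the copying pwrite fallback can be sketched like this (a simplified stand-in for the real bdrv logic; the 512-byte requirement and the function name are assumptions for illustration):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <sys/uio.h>

/* Assumed O_DIRECT alignment requirement of the backing storage. */
#define ALIGNMENT 512

/*
 * Simplified stand-in for the bdrv alignment check: a request can go
 * through io_submit only if every iovec base and length is aligned;
 * a single unaligned element (e.g. an 8-byte ram-page delimiter)
 * demotes the whole request to the copy + synchronous pwrite path.
 */
static bool iov_is_aligned(const struct iovec *iov, unsigned int iovcnt)
{
    for (unsigned int i = 0; i < iovcnt; i++) {
        if ((uintptr_t)iov[i].iov_base % ALIGNMENT != 0 ||
            iov[i].iov_len % ALIGNMENT != 0) {
            return false;
        }
    }
    return true;
}
```

So a request of perfectly aligned ram pages is still sent down the slow path because of the short delimiter iovec that follows each page.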

So, in this case, the qemu-file code:
1. doesn't avoid ram copying
2. works fully synchronously

How to improve the snapshot time:

1. The easy way: increase the iov limit to IOV_MAX (1024).
This reduces the frequency of synchronous writes.
My test showed that with iov limit = IOV_MAX the snapshot time is *~12 sec*.

2. The complex way: do the writes asynchronously.
Introduce a base- and size-aligned buffer, write the data only when
the buffer is full, write the buffer asynchronously, and meanwhile fill
another buffer with snapshot data.
My test showed that this complex way brings the snapshot time down to *~6 sec*,
2 times better than just increasing the iov limit.
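The double-buffering scheme described above can be sketched roughly as follows (a toy model with illustrative names and a tiny buffer size; the real series uses large aligned buffers and submits each full buffer as an AioTask instead of merely counting flushed bytes):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BUF_SIZE 8           /* tiny for illustration; the patch uses 1 MiB */
#define BUF_NUM  2

typedef struct {
    unsigned char data[BUF_SIZE];
    size_t used;
} Buffer;

typedef struct {
    Buffer bufs[BUF_NUM];
    int cur;                 /* index of the buffer being filled */
    size_t flushed;          /* bytes handed to the (would-be async) writer */
} Writer;

/* Stand-in for starting an asynchronous write of a full buffer.
 * In the real series this submits an AioTask; here we only count bytes. */
static void start_async_write(Writer *w, Buffer *b)
{
    w->flushed += b->used;
    b->used = 0;
}

/* Copy data into the current buffer, flushing and switching buffers
 * whenever the current one fills up. */
static void buffered_put(Writer *w, const unsigned char *p, size_t len)
{
    while (len > 0) {
        Buffer *b = &w->bufs[w->cur];
        size_t n = BUF_SIZE - b->used;
        if (n == 0) {
            start_async_write(w, b);
            w->cur = (w->cur + 1) % BUF_NUM;  /* fill the other buffer */
            continue;
        }
        if (n > len) {
            n = len;
        }
        memcpy(b->data + b->used, p, n);
        b->used += n;
        p += n;
        len -= n;
    }
}
```

The point of the scheme is that the storage write of one buffer can overlap with filling the next, instead of stalling the snapshot stream on every synchronous pwrite.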

This series implements the complex way, allowing asynchronous
writes to be used when needed.

This is an RFC series, as I'm not confident that I fully understand all
qemu-file use cases. I tried to write the series in a safe way so as not
to break anything related to qemu-file usage in other places, like migration.

All comments are *VERY* appreciated!

Thanks,
Denis

Denis Plotnikov (3):
  qemu-file: introduce current buffer
  qemu-file: add buffered mode
  migration/savevm: use qemu-file buffered mode for non-cached bdrv

 include/qemu/typedefs.h |   2 +
 migration/qemu-file.c   | 479 +++++++++++++++++++++++++++++++++++++++++-------
 migration/qemu-file.h   |   9 +
 migration/savevm.c      |  38 +++-
 4 files changed, 456 insertions(+), 72 deletions(-)

-- 
1.8.3.1



^ permalink raw reply	[flat|nested] 19+ messages in thread

* [RFC patch v1 1/3] qemu-file: introduce current buffer
  2020-04-13 11:12 [RFC patch v1 0/3] qemu-file writing performance improving Denis Plotnikov
@ 2020-04-13 11:12 ` Denis Plotnikov
  2020-04-24 17:47   ` Vladimir Sementsov-Ogievskiy
  2020-04-24 21:12   ` Eric Blake
  2020-04-13 11:12 ` [RFC patch v1 2/3] qemu-file: add buffered mode Denis Plotnikov
                   ` (4 subsequent siblings)
  5 siblings, 2 replies; 19+ messages in thread
From: Denis Plotnikov @ 2020-04-13 11:12 UTC (permalink / raw)
  To: qemu-devel; +Cc: den, dgilbert, quintela

To prepare for async writing in further commits, the buffer
allocated in the QEMUFile struct is replaced with a link to the
current buffer. We're going to use many buffers to write the
qemu-file stream to the underlying storage asynchronously. The
current buffer points to the buffer currently being filled
with data.

This patch doesn't add any features to qemu-file and doesn't
change any qemu-file behavior.

Signed-off-by: Denis Plotnikov <dplotnikov@virtuozzo.com>
---
 include/qemu/typedefs.h |   1 +
 migration/qemu-file.c   | 156 +++++++++++++++++++++++++++++-------------------
 2 files changed, 95 insertions(+), 62 deletions(-)

diff --git a/include/qemu/typedefs.h b/include/qemu/typedefs.h
index 375770a..88dce54 100644
--- a/include/qemu/typedefs.h
+++ b/include/qemu/typedefs.h
@@ -97,6 +97,7 @@ typedef struct QDict QDict;
 typedef struct QEMUBH QEMUBH;
 typedef struct QemuConsole QemuConsole;
 typedef struct QEMUFile QEMUFile;
+typedef struct QEMUFileBuffer QEMUFileBuffer;
 typedef struct QemuLockable QemuLockable;
 typedef struct QemuMutex QemuMutex;
 typedef struct QemuOpt QemuOpt;
diff --git a/migration/qemu-file.c b/migration/qemu-file.c
index 1c3a358..285c6ef 100644
--- a/migration/qemu-file.c
+++ b/migration/qemu-file.c
@@ -33,6 +33,17 @@
 #define IO_BUF_SIZE 32768
 #define MAX_IOV_SIZE MIN(IOV_MAX, 64)
 
+QEMU_BUILD_BUG_ON(!QEMU_IS_ALIGNED(IO_BUF_SIZE, 512));
+
+struct QEMUFileBuffer {
+    int buf_index;
+    int buf_size; /* 0 when writing */
+    uint8_t *buf;
+    unsigned long *may_free;
+    struct iovec *iov;
+    unsigned int iovcnt;
+};
+
 struct QEMUFile {
     const QEMUFileOps *ops;
     const QEMUFileHooks *hooks;
@@ -43,18 +54,12 @@ struct QEMUFile {
 
     int64_t pos; /* start of buffer when writing, end of buffer
                     when reading */
-    int buf_index;
-    int buf_size; /* 0 when writing */
-    uint8_t buf[IO_BUF_SIZE];
-
-    DECLARE_BITMAP(may_free, MAX_IOV_SIZE);
-    struct iovec iov[MAX_IOV_SIZE];
-    unsigned int iovcnt;
-
     int last_error;
     Error *last_error_obj;
     /* has the file has been shutdown */
     bool shutdown;
+    /* currently used buffer */
+    QEMUFileBuffer *current_buf;
 };
 
 /*
@@ -109,6 +114,12 @@ QEMUFile *qemu_fopen_ops(void *opaque, const QEMUFileOps *ops)
 
     f->opaque = opaque;
     f->ops = ops;
+
+    f->current_buf = g_new0(QEMUFileBuffer, 1);
+    f->current_buf->buf = g_malloc(IO_BUF_SIZE);
+    f->current_buf->iov = g_new0(struct iovec, MAX_IOV_SIZE);
+    f->current_buf->may_free = bitmap_new(MAX_IOV_SIZE);
+
     return f;
 }
 
@@ -177,35 +188,37 @@ static void qemu_iovec_release_ram(QEMUFile *f)
 {
     struct iovec iov;
     unsigned long idx;
+    QEMUFileBuffer *fb = f->current_buf;
 
     /* Find and release all the contiguous memory ranges marked as may_free. */
-    idx = find_next_bit(f->may_free, f->iovcnt, 0);
-    if (idx >= f->iovcnt) {
+    idx = find_next_bit(fb->may_free, fb->iovcnt, 0);
+    if (idx >= fb->iovcnt) {
         return;
     }
-    iov = f->iov[idx];
+    iov = fb->iov[idx];
 
     /* The madvise() in the loop is called for iov within a continuous range and
      * then reinitialize the iov. And in the end, madvise() is called for the
      * last iov.
      */
-    while ((idx = find_next_bit(f->may_free, f->iovcnt, idx + 1)) < f->iovcnt) {
+    while ((idx = find_next_bit(fb->may_free,
+                                fb->iovcnt, idx + 1)) < fb->iovcnt) {
         /* check for adjacent buffer and coalesce them */
-        if (iov.iov_base + iov.iov_len == f->iov[idx].iov_base) {
-            iov.iov_len += f->iov[idx].iov_len;
+        if (iov.iov_base + iov.iov_len == fb->iov[idx].iov_base) {
+            iov.iov_len += fb->iov[idx].iov_len;
             continue;
         }
         if (qemu_madvise(iov.iov_base, iov.iov_len, QEMU_MADV_DONTNEED) < 0) {
             error_report("migrate: madvise DONTNEED failed %p %zd: %s",
                          iov.iov_base, iov.iov_len, strerror(errno));
         }
-        iov = f->iov[idx];
+        iov = fb->iov[idx];
     }
     if (qemu_madvise(iov.iov_base, iov.iov_len, QEMU_MADV_DONTNEED) < 0) {
             error_report("migrate: madvise DONTNEED failed %p %zd: %s",
                          iov.iov_base, iov.iov_len, strerror(errno));
     }
-    memset(f->may_free, 0, sizeof(f->may_free));
+    bitmap_zero(fb->may_free, MAX_IOV_SIZE);
 }
 
 /**
@@ -219,6 +232,7 @@ void qemu_fflush(QEMUFile *f)
     ssize_t ret = 0;
     ssize_t expect = 0;
     Error *local_error = NULL;
+    QEMUFileBuffer *fb = f->current_buf;
 
     if (!qemu_file_is_writable(f)) {
         return;
@@ -227,9 +241,9 @@ void qemu_fflush(QEMUFile *f)
     if (f->shutdown) {
         return;
     }
-    if (f->iovcnt > 0) {
-        expect = iov_size(f->iov, f->iovcnt);
-        ret = f->ops->writev_buffer(f->opaque, f->iov, f->iovcnt, f->pos,
+    if (fb->iovcnt > 0) {
+        expect = iov_size(fb->iov, fb->iovcnt);
+        ret = f->ops->writev_buffer(f->opaque, fb->iov, fb->iovcnt, f->pos,
                                     &local_error);
 
         qemu_iovec_release_ram(f);
@@ -244,8 +258,8 @@ void qemu_fflush(QEMUFile *f)
     if (ret != expect) {
         qemu_file_set_error_obj(f, ret < 0 ? ret : -EIO, local_error);
     }
-    f->buf_index = 0;
-    f->iovcnt = 0;
+    fb->buf_index = 0;
+    fb->iovcnt = 0;
 }
 
 void ram_control_before_iterate(QEMUFile *f, uint64_t flags)
@@ -331,24 +345,25 @@ static ssize_t qemu_fill_buffer(QEMUFile *f)
     int len;
     int pending;
     Error *local_error = NULL;
+    QEMUFileBuffer *fb = f->current_buf;
 
     assert(!qemu_file_is_writable(f));
 
-    pending = f->buf_size - f->buf_index;
+    pending = fb->buf_size - fb->buf_index;
     if (pending > 0) {
-        memmove(f->buf, f->buf + f->buf_index, pending);
+        memmove(fb->buf, fb->buf + fb->buf_index, pending);
     }
-    f->buf_index = 0;
-    f->buf_size = pending;
+    fb->buf_index = 0;
+    fb->buf_size = pending;
 
     if (f->shutdown) {
         return 0;
     }
 
-    len = f->ops->get_buffer(f->opaque, f->buf + pending, f->pos,
+    len = f->ops->get_buffer(f->opaque, fb->buf + pending, f->pos,
                              IO_BUF_SIZE - pending, &local_error);
     if (len > 0) {
-        f->buf_size += len;
+        fb->buf_size += len;
         f->pos += len;
     } else if (len == 0) {
         qemu_file_set_error_obj(f, -EIO, local_error);
@@ -393,6 +408,10 @@ int qemu_fclose(QEMUFile *f)
         ret = f->last_error;
     }
     error_free(f->last_error_obj);
+    g_free(f->current_buf->buf);
+    g_free(f->current_buf->iov);
+    g_free(f->current_buf->may_free);
+    g_free(f->current_buf);
     g_free(f);
     trace_qemu_file_fclose();
     return ret;
@@ -409,21 +428,22 @@ int qemu_fclose(QEMUFile *f)
 static int add_to_iovec(QEMUFile *f, const uint8_t *buf, size_t size,
                         bool may_free)
 {
+    QEMUFileBuffer *fb = f->current_buf;
     /* check for adjacent buffer and coalesce them */
-    if (f->iovcnt > 0 && buf == f->iov[f->iovcnt - 1].iov_base +
-        f->iov[f->iovcnt - 1].iov_len &&
-        may_free == test_bit(f->iovcnt - 1, f->may_free))
+    if (fb->iovcnt > 0 && buf == fb->iov[fb->iovcnt - 1].iov_base +
+        fb->iov[fb->iovcnt - 1].iov_len &&
+        may_free == test_bit(fb->iovcnt - 1, fb->may_free))
     {
-        f->iov[f->iovcnt - 1].iov_len += size;
+        fb->iov[fb->iovcnt - 1].iov_len += size;
     } else {
         if (may_free) {
-            set_bit(f->iovcnt, f->may_free);
+            set_bit(fb->iovcnt, fb->may_free);
         }
-        f->iov[f->iovcnt].iov_base = (uint8_t *)buf;
-        f->iov[f->iovcnt++].iov_len = size;
+        fb->iov[fb->iovcnt].iov_base = (uint8_t *)buf;
+        fb->iov[fb->iovcnt++].iov_len = size;
     }
 
-    if (f->iovcnt >= MAX_IOV_SIZE) {
+    if (fb->iovcnt >= MAX_IOV_SIZE) {
         qemu_fflush(f);
         return 1;
     }
@@ -433,9 +453,10 @@ static int add_to_iovec(QEMUFile *f, const uint8_t *buf, size_t size,
 
 static void add_buf_to_iovec(QEMUFile *f, size_t len)
 {
-    if (!add_to_iovec(f, f->buf + f->buf_index, len, false)) {
-        f->buf_index += len;
-        if (f->buf_index == IO_BUF_SIZE) {
+    QEMUFileBuffer *fb = f->current_buf;
+    if (!add_to_iovec(f, fb->buf + fb->buf_index, len, false)) {
+        fb->buf_index += len;
+        if (fb->buf_index == IO_BUF_SIZE) {
             qemu_fflush(f);
         }
     }
@@ -455,17 +476,18 @@ void qemu_put_buffer_async(QEMUFile *f, const uint8_t *buf, size_t size,
 void qemu_put_buffer(QEMUFile *f, const uint8_t *buf, size_t size)
 {
     size_t l;
+    QEMUFileBuffer *fb = f->current_buf;
 
     if (f->last_error) {
         return;
     }
 
     while (size > 0) {
-        l = IO_BUF_SIZE - f->buf_index;
+        l = IO_BUF_SIZE - fb->buf_index;
         if (l > size) {
             l = size;
         }
-        memcpy(f->buf + f->buf_index, buf, l);
+        memcpy(fb->buf + fb->buf_index, buf, l);
         f->bytes_xfer += l;
         add_buf_to_iovec(f, l);
         if (qemu_file_get_error(f)) {
@@ -478,19 +500,23 @@ void qemu_put_buffer(QEMUFile *f, const uint8_t *buf, size_t size)
 
 void qemu_put_byte(QEMUFile *f, int v)
 {
+    QEMUFileBuffer *fb = f->current_buf;
+
     if (f->last_error) {
         return;
     }
 
-    f->buf[f->buf_index] = v;
+    fb->buf[fb->buf_index] = v;
     f->bytes_xfer++;
     add_buf_to_iovec(f, 1);
 }
 
 void qemu_file_skip(QEMUFile *f, int size)
 {
-    if (f->buf_index + size <= f->buf_size) {
-        f->buf_index += size;
+    QEMUFileBuffer *fb = f->current_buf;
+
+    if (fb->buf_index + size <= fb->buf_size) {
+        fb->buf_index += size;
     }
 }
 
@@ -506,15 +532,16 @@ size_t qemu_peek_buffer(QEMUFile *f, uint8_t **buf, size_t size, size_t offset)
 {
     ssize_t pending;
     size_t index;
+    QEMUFileBuffer *fb = f->current_buf;
 
     assert(!qemu_file_is_writable(f));
     assert(offset < IO_BUF_SIZE);
     assert(size <= IO_BUF_SIZE - offset);
 
     /* The 1st byte to read from */
-    index = f->buf_index + offset;
+    index = fb->buf_index + offset;
     /* The number of available bytes starting at index */
-    pending = f->buf_size - index;
+    pending = fb->buf_size - index;
 
     /*
      * qemu_fill_buffer might return just a few bytes, even when there isn't
@@ -527,8 +554,8 @@ size_t qemu_peek_buffer(QEMUFile *f, uint8_t **buf, size_t size, size_t offset)
             break;
         }
 
-        index = f->buf_index + offset;
-        pending = f->buf_size - index;
+        index = fb->buf_index + offset;
+        pending = fb->buf_size - index;
     }
 
     if (pending <= 0) {
@@ -538,7 +565,7 @@ size_t qemu_peek_buffer(QEMUFile *f, uint8_t **buf, size_t size, size_t offset)
         size = pending;
     }
 
-    *buf = f->buf + index;
+    *buf = fb->buf + index;
     return size;
 }
 
@@ -615,19 +642,21 @@ size_t qemu_get_buffer_in_place(QEMUFile *f, uint8_t **buf, size_t size)
  */
 int qemu_peek_byte(QEMUFile *f, int offset)
 {
-    int index = f->buf_index + offset;
+    QEMUFileBuffer *fb = f->current_buf;
+
+    int index = fb->buf_index + offset;
 
     assert(!qemu_file_is_writable(f));
     assert(offset < IO_BUF_SIZE);
 
-    if (index >= f->buf_size) {
+    if (index >= fb->buf_size) {
         qemu_fill_buffer(f);
-        index = f->buf_index + offset;
-        if (index >= f->buf_size) {
+        index = fb->buf_index + offset;
+        if (index >= fb->buf_size) {
             return 0;
         }
     }
-    return f->buf[index];
+    return fb->buf[index];
 }
 
 int qemu_get_byte(QEMUFile *f)
@@ -643,9 +672,10 @@ int64_t qemu_ftell_fast(QEMUFile *f)
 {
     int64_t ret = f->pos;
     int i;
+    QEMUFileBuffer *fb = f->current_buf;
 
-    for (i = 0; i < f->iovcnt; i++) {
-        ret += f->iov[i].iov_len;
+    for (i = 0; i < fb->iovcnt; i++) {
+        ret += fb->iov[i].iov_len;
     }
 
     return ret;
@@ -770,13 +800,14 @@ static int qemu_compress_data(z_stream *stream, uint8_t *dest, size_t dest_len,
 ssize_t qemu_put_compression_data(QEMUFile *f, z_stream *stream,
                                   const uint8_t *p, size_t size)
 {
-    ssize_t blen = IO_BUF_SIZE - f->buf_index - sizeof(int32_t);
+    QEMUFileBuffer *fb = f->current_buf;
+    ssize_t blen = IO_BUF_SIZE - fb->buf_index - sizeof(int32_t);
 
     if (blen < compressBound(size)) {
         return -1;
     }
 
-    blen = qemu_compress_data(stream, f->buf + f->buf_index + sizeof(int32_t),
+    blen = qemu_compress_data(stream, fb->buf + fb->buf_index + sizeof(int32_t),
                               blen, p, size);
     if (blen < 0) {
         return -1;
@@ -794,12 +825,13 @@ ssize_t qemu_put_compression_data(QEMUFile *f, z_stream *stream,
 int qemu_put_qemu_file(QEMUFile *f_des, QEMUFile *f_src)
 {
     int len = 0;
+    QEMUFileBuffer *fb_src = f_src->current_buf;
 
-    if (f_src->buf_index > 0) {
-        len = f_src->buf_index;
-        qemu_put_buffer(f_des, f_src->buf, f_src->buf_index);
-        f_src->buf_index = 0;
-        f_src->iovcnt = 0;
+    if (fb_src->buf_index > 0) {
+        len = fb_src->buf_index;
+        qemu_put_buffer(f_des, fb_src->buf, fb_src->buf_index);
+        fb_src->buf_index = 0;
+        fb_src->iovcnt = 0;
     }
     return len;
 }
-- 
1.8.3.1



^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [RFC patch v1 2/3] qemu-file: add buffered mode
  2020-04-13 11:12 [RFC patch v1 0/3] qemu-file writing performance improving Denis Plotnikov
  2020-04-13 11:12 ` [RFC patch v1 1/3] qemu-file: introduce current buffer Denis Plotnikov
@ 2020-04-13 11:12 ` Denis Plotnikov
  2020-04-24 21:25   ` Eric Blake
                     ` (2 more replies)
  2020-04-13 11:12 ` [RFC patch v1 3/3] migration/savevm: use qemu-file buffered mode for non-cached bdrv Denis Plotnikov
                   ` (3 subsequent siblings)
  5 siblings, 3 replies; 19+ messages in thread
From: Denis Plotnikov @ 2020-04-13 11:12 UTC (permalink / raw)
  To: qemu-devel; +Cc: den, dgilbert, quintela

This patch adds the ability for qemu-file to write data
asynchronously to improve write performance.
Before, only synchronous writing was supported.

Enabling the asynchronous mode is managed by the new
"enable_buffered" callback.
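Roughly, the hook is consulted once at open time. A minimal model of that decision, with stub types standing in for the real QEMUFile structures (names here are illustrative, not the actual QEMU API), might look like:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stub modeled on the new QEMUFileOps hook; only enable_buffered is kept. */
typedef bool (QEMUFileEnableBufferedFunc)(void *opaque);

typedef struct {
    QEMUFileEnableBufferedFunc *enable_buffered;
} QEMUFileOpsStub;

/* A hypothetical backend that opts into buffered mode only for O_DIRECT. */
typedef struct {
    bool o_direct;
} BackendStub;

static bool backend_enable_buffered(void *opaque)
{
    return ((BackendStub *)opaque)->o_direct;
}

/* Mirrors the check done at open time: the hook is optional, and the
 * file stays in the old non-buffered mode when it is absent. */
static bool would_use_buffered_mode(const QEMUFileOpsStub *ops, void *opaque)
{
    return ops->enable_buffered && ops->enable_buffered(opaque);
}
```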

Signed-off-by: Denis Plotnikov <dplotnikov@virtuozzo.com>
---
 include/qemu/typedefs.h |   1 +
 migration/qemu-file.c   | 351 +++++++++++++++++++++++++++++++++++++++++++++---
 migration/qemu-file.h   |   9 ++
 3 files changed, 339 insertions(+), 22 deletions(-)

diff --git a/include/qemu/typedefs.h b/include/qemu/typedefs.h
index 88dce54..9b388c8 100644
--- a/include/qemu/typedefs.h
+++ b/include/qemu/typedefs.h
@@ -98,6 +98,7 @@ typedef struct QEMUBH QEMUBH;
 typedef struct QemuConsole QemuConsole;
 typedef struct QEMUFile QEMUFile;
 typedef struct QEMUFileBuffer QEMUFileBuffer;
+typedef struct QEMUFileAioTask QEMUFileAioTask;
 typedef struct QemuLockable QemuLockable;
 typedef struct QemuMutex QemuMutex;
 typedef struct QemuOpt QemuOpt;
diff --git a/migration/qemu-file.c b/migration/qemu-file.c
index 285c6ef..f42f949 100644
--- a/migration/qemu-file.c
+++ b/migration/qemu-file.c
@@ -29,19 +29,25 @@
 #include "qemu-file.h"
 #include "trace.h"
 #include "qapi/error.h"
+#include "block/aio_task.h"
 
-#define IO_BUF_SIZE 32768
+#define IO_BUF_SIZE (1024 * 1024)
 #define MAX_IOV_SIZE MIN(IOV_MAX, 64)
+#define IO_BUF_NUM 2
+#define IO_BUF_ALIGNMENT 512
 
-QEMU_BUILD_BUG_ON(!QEMU_IS_ALIGNED(IO_BUF_SIZE, 512));
+QEMU_BUILD_BUG_ON(!QEMU_IS_ALIGNED(IO_BUF_SIZE, IO_BUF_ALIGNMENT));
+QEMU_BUILD_BUG_ON(IO_BUF_SIZE > INT_MAX);
+QEMU_BUILD_BUG_ON(IO_BUF_NUM <= 0);
 
 struct QEMUFileBuffer {
     int buf_index;
-    int buf_size; /* 0 when writing */
+    int buf_size; /* 0 when non-buffered writing */
     uint8_t *buf;
     unsigned long *may_free;
     struct iovec *iov;
     unsigned int iovcnt;
+    QLIST_ENTRY(QEMUFileBuffer) link;
 };
 
 struct QEMUFile {
@@ -60,6 +66,22 @@ struct QEMUFile {
     bool shutdown;
     /* currently used buffer */
     QEMUFileBuffer *current_buf;
+    /*
+     * with buffered_mode enabled, all the data is copied to a 512-byte
+     * aligned buffer, including iov data. Then the buffer is passed
+     * to the writev_buffer callback.
+     */
+    bool buffered_mode;
+    /* for async buffer writing */
+    AioTaskPool *pool;
+    /* the list of free buffers; the currently used one is NOT there */
+    QLIST_HEAD(, QEMUFileBuffer) free_buffers;
+};
+
+struct QEMUFileAioTask {
+    AioTask task;
+    QEMUFile *f;
+    QEMUFileBuffer *fb;
 };
 
 /*
@@ -115,10 +137,42 @@ QEMUFile *qemu_fopen_ops(void *opaque, const QEMUFileOps *ops)
     f->opaque = opaque;
     f->ops = ops;
 
-    f->current_buf = g_new0(QEMUFileBuffer, 1);
-    f->current_buf->buf = g_malloc(IO_BUF_SIZE);
-    f->current_buf->iov = g_new0(struct iovec, MAX_IOV_SIZE);
-    f->current_buf->may_free = bitmap_new(MAX_IOV_SIZE);
+    if (f->ops->enable_buffered) {
+        f->buffered_mode = f->ops->enable_buffered(f->opaque);
+    }
+
+    if (f->buffered_mode && qemu_file_is_writable(f)) {
+        int i;
+        /*
+         * in buffered_mode we don't use the internal io vectors
+         * and the may_free bitmap, because we copy the data to be
+         * written straight into the buffer
+         */
+        f->pool = aio_task_pool_new(IO_BUF_NUM);
+
+        /* allocate io buffers */
+        for (i = 0; i < IO_BUF_NUM; i++) {
+            QEMUFileBuffer *fb = g_new0(QEMUFileBuffer, 1);
+
+            fb->buf = qemu_memalign(IO_BUF_ALIGNMENT, IO_BUF_SIZE);
+            fb->buf_size = IO_BUF_SIZE;
+
+            /*
+             * make the first buffer the current one and put the rest
+             * on the list of free buffers
+             */
+            if (i == 0) {
+                f->current_buf = fb;
+            } else {
+                QLIST_INSERT_HEAD(&f->free_buffers, fb, link);
+            }
+        }
+    } else {
+        f->current_buf = g_new0(QEMUFileBuffer, 1);
+        f->current_buf->buf = g_malloc(IO_BUF_SIZE);
+        f->current_buf->iov = g_new0(struct iovec, MAX_IOV_SIZE);
+        f->current_buf->may_free = bitmap_new(MAX_IOV_SIZE);
+    }
 
     return f;
 }
@@ -190,6 +244,8 @@ static void qemu_iovec_release_ram(QEMUFile *f)
     unsigned long idx;
     QEMUFileBuffer *fb = f->current_buf;
 
+    assert(!f->buffered_mode);
+
     /* Find and release all the contiguous memory ranges marked as may_free. */
     idx = find_next_bit(fb->may_free, fb->iovcnt, 0);
     if (idx >= fb->iovcnt) {
@@ -221,6 +277,147 @@ static void qemu_iovec_release_ram(QEMUFile *f)
     bitmap_zero(fb->may_free, MAX_IOV_SIZE);
 }
 
+static void advance_buf_ptr(QEMUFile *f, size_t size)
+{
+    QEMUFileBuffer *fb = f->current_buf;
+    /* must not advance by 0 */
+    assert(size);
+    /* must not overflow buf_index (int) */
+    assert(fb->buf_index + size <= INT_MAX);
+    /* must not exceed buf_size */
+    assert(fb->buf_index + size <= fb->buf_size);
+
+    fb->buf_index += size;
+}
+
+static size_t get_buf_free_size(QEMUFile *f)
+{
+    QEMUFileBuffer *fb = f->current_buf;
+    /* buf_index can't be greater than buf_size */
+    assert(fb->buf_size >= fb->buf_index);
+    return fb->buf_size - fb->buf_index;
+}
+
+static size_t get_buf_used_size(QEMUFile *f)
+{
+    QEMUFileBuffer *fb = f->current_buf;
+    return fb->buf_index;
+}
+
+static uint8_t *get_buf_ptr(QEMUFile *f)
+{
+    QEMUFileBuffer *fb = f->current_buf;
+    /* protects against out-of-bounds reading */
+    assert(fb->buf_index <= IO_BUF_SIZE);
+    return fb->buf + fb->buf_index;
+}
+
+static bool buf_is_full(QEMUFile *f)
+{
+    return get_buf_free_size(f) == 0;
+}
+
+static void reset_buf(QEMUFile *f)
+{
+    QEMUFileBuffer *fb = f->current_buf;
+    fb->buf_index = 0;
+}
+
+static int write_task_fn(AioTask *task)
+{
+    int ret;
+    Error *local_error = NULL;
+    QEMUFileAioTask *t = (QEMUFileAioTask *) task;
+    QEMUFile *f = t->f;
+    QEMUFileBuffer *fb = t->fb;
+    uint64_t pos = f->pos;
+    struct iovec v = (struct iovec) {
+        .iov_base = fb->buf,
+        .iov_len = fb->buf_index,
+    };
+
+    assert(f->buffered_mode);
+
+    /*
+     * Increment the file position.
+     * This needs to happen before calling writev_buffer, because
+     * writev_buffer is asynchronous and there could be more than one
+     * writev_buffer started simultaneously. Each writev_buffer should
+     * use its own file pos to write to. writev_buffer may write less
+     * than buf_index bytes, but we treat that situation as an error.
+     * If an error appears, further use of the file is meaningless.
+     * We expect that most of the time the full buffer is written
+     * (when buf_size == buf_index). The only case when a non-full
+     * buffer is written (buf_size != buf_index) is file close,
+     * when we need to flush the rest of the buffer content.
+     */
+    f->pos += fb->buf_index;
+
+    ret = f->ops->writev_buffer(f->opaque, &v, 1, pos, &local_error);
+
+    /* return the just written buffer to the free list */
+    QLIST_INSERT_HEAD(&f->free_buffers, fb, link);
+
+    /* check that we have written everything */
+    if (ret != fb->buf_index) {
+        qemu_file_set_error_obj(f, ret < 0 ? ret : -EIO, local_error);
+    }
+
+    /*
+     * always return 0 - don't use task error handling, rely on
+     * qemu-file error handling
+     */
+    return 0;
+}
+
+static void qemu_file_switch_current_buf(QEMUFile *f)
+{
+    /*
+     * if the list is empty, wait until some task returns a buffer
+     * to the list of free buffers.
+     */
+    if (QLIST_EMPTY(&f->free_buffers)) {
+        aio_task_pool_wait_slot(f->pool);
+    }
+
+    /*
+     * sanity check that the list isn't empty:
+     * if the free list was empty, we waited for a task completion,
+     * and the completed task must return a buffer to the free list
+     */
+    assert(!QLIST_EMPTY(&f->free_buffers));
+
+    /* take the next current buffer from the free list */
+    f->current_buf = QLIST_FIRST(&f->free_buffers);
+    reset_buf(f);
+
+    QLIST_REMOVE(f->current_buf, link);
+}
+
+/**
+ *  Asynchronously flushes QEMUFile buffer
+ *
+ * This will flush all pending data. If data was only partially flushed, it
+ * will set an error state. The function may return before the data is
+ * actually written.
+ */
+static void flush_buffer(QEMUFile *f)
+{
+    QEMUFileAioTask *t = g_new(QEMUFileAioTask, 1);
+
+    *t = (QEMUFileAioTask) {
+        .task.func = &write_task_fn,
+        .f = f,
+        .fb = f->current_buf,
+    };
+
+    /* aio_task_pool should free t for us */
+    aio_task_pool_start_task(f->pool, (AioTask *) t);
+
+    /* if there were no errors, this switches the buffer */
+    qemu_file_switch_current_buf(f);
+}
+
 /**
  * Flushes QEMUFile buffer
  *
@@ -241,7 +438,13 @@ void qemu_fflush(QEMUFile *f)
     if (f->shutdown) {
         return;
     }
+
+    if (f->buffered_mode) {
+        return;
+    }
+
     if (fb->iovcnt > 0) {
+        /* this is non-buffered mode */
         expect = iov_size(fb->iov, fb->iovcnt);
         ret = f->ops->writev_buffer(f->opaque, fb->iov, fb->iovcnt, f->pos,
                                     &local_error);
@@ -378,6 +581,7 @@ static ssize_t qemu_fill_buffer(QEMUFile *f)
 
 void qemu_update_position(QEMUFile *f, size_t size)
 {
+    assert(!f->buffered_mode);
     f->pos += size;
 }
 
@@ -392,7 +596,18 @@ void qemu_update_position(QEMUFile *f, size_t size)
 int qemu_fclose(QEMUFile *f)
 {
     int ret;
-    qemu_fflush(f);
+
+    if (qemu_file_is_writable(f) && f->buffered_mode) {
+        ret = qemu_file_get_error(f);
+        if (!ret) {
+            flush_buffer(f);
+        }
+        /* wait until all tasks are done */
+        aio_task_pool_wait_all(f->pool);
+    } else {
+        qemu_fflush(f);
+    }
+
     ret = qemu_file_get_error(f);
 
     if (f->ops->close) {
@@ -408,16 +623,77 @@ int qemu_fclose(QEMUFile *f)
         ret = f->last_error;
     }
     error_free(f->last_error_obj);
-    g_free(f->current_buf->buf);
-    g_free(f->current_buf->iov);
-    g_free(f->current_buf->may_free);
-    g_free(f->current_buf);
+
+    if (f->buffered_mode) {
+        QEMUFileBuffer *fb, *next;
+        /*
+         * put the current back to the free buffers list
+         * to destroy all the buffers in one loop
+         */
+        QLIST_INSERT_HEAD(&f->free_buffers, f->current_buf, link);
+
+        /* destroy all the buffers */
+        QLIST_FOREACH_SAFE(fb, &f->free_buffers, link, next) {
+            QLIST_REMOVE(fb, link);
+            /* looks like qemu_vfree pairs with qemu_memalign */
+            qemu_vfree(fb->buf);
+            g_free(fb);
+        }
+        g_free(f->pool);
+    } else {
+        g_free(f->current_buf->buf);
+        g_free(f->current_buf->iov);
+        g_free(f->current_buf->may_free);
+        g_free(f->current_buf);
+    }
+
     g_free(f);
     trace_qemu_file_fclose();
     return ret;
 }
 
 /*
+ * Copy an external buffer to the internal current buffer.
+ */
+static void copy_buf(QEMUFile *f, const uint8_t *buf, size_t size,
+                     bool may_free)
+{
+    size_t data_size = size;
+    const uint8_t *src_ptr = buf;
+
+    assert(f->buffered_mode);
+    assert(size <= INT_MAX);
+
+    while (data_size > 0) {
+        size_t chunk_size;
+
+        if (buf_is_full(f)) {
+            flush_buffer(f);
+            if (qemu_file_get_error(f)) {
+                return;
+            }
+        }
+
+        chunk_size = MIN(get_buf_free_size(f), data_size);
+
+        memcpy(get_buf_ptr(f), src_ptr, chunk_size);
+
+        advance_buf_ptr(f, chunk_size);
+
+        src_ptr += chunk_size;
+        data_size -= chunk_size;
+        f->bytes_xfer += chunk_size;
+    }
+
+    if (may_free) {
+        if (qemu_madvise((void *) buf, size, QEMU_MADV_DONTNEED) < 0) {
+            error_report("migrate: madvise DONTNEED failed %p %zd: %s",
+                         buf, size, strerror(errno));
+        }
+    }
+}
+
+/*
  * Add buf to iovec. Do flush if iovec is full.
  *
  * Return values:
@@ -454,6 +730,9 @@ static int add_to_iovec(QEMUFile *f, const uint8_t *buf, size_t size,
 static void add_buf_to_iovec(QEMUFile *f, size_t len)
 {
     QEMUFileBuffer *fb = f->current_buf;
+
+    assert(!f->buffered_mode);
+
     if (!add_to_iovec(f, fb->buf + fb->buf_index, len, false)) {
         fb->buf_index += len;
         if (fb->buf_index == IO_BUF_SIZE) {
@@ -469,8 +748,12 @@ void qemu_put_buffer_async(QEMUFile *f, const uint8_t *buf, size_t size,
         return;
     }
 
-    f->bytes_xfer += size;
-    add_to_iovec(f, buf, size, may_free);
+    if (f->buffered_mode) {
+        copy_buf(f, buf, size, may_free);
+    } else {
+        f->bytes_xfer += size;
+        add_to_iovec(f, buf, size, may_free);
+    }
 }
 
 void qemu_put_buffer(QEMUFile *f, const uint8_t *buf, size_t size)
@@ -482,6 +765,11 @@ void qemu_put_buffer(QEMUFile *f, const uint8_t *buf, size_t size)
         return;
     }
 
+    if (f->buffered_mode) {
+        copy_buf(f, buf, size, false);
+        return;
+    }
+
     while (size > 0) {
         l = IO_BUF_SIZE - fb->buf_index;
         if (l > size) {
@@ -506,15 +794,21 @@ void qemu_put_byte(QEMUFile *f, int v)
         return;
     }
 
-    fb->buf[fb->buf_index] = v;
-    f->bytes_xfer++;
-    add_buf_to_iovec(f, 1);
+    if (f->buffered_mode) {
+        copy_buf(f, (const uint8_t *) &v, 1, false);
+    } else {
+        fb->buf[fb->buf_index] = v;
+        add_buf_to_iovec(f, 1);
+        f->bytes_xfer++;
+    }
 }
 
 void qemu_file_skip(QEMUFile *f, int size)
 {
     QEMUFileBuffer *fb = f->current_buf;
 
+    assert(!f->buffered_mode);
+
     if (fb->buf_index + size <= fb->buf_size) {
         fb->buf_index += size;
     }
@@ -672,10 +966,14 @@ int64_t qemu_ftell_fast(QEMUFile *f)
 {
     int64_t ret = f->pos;
     int i;
-    QEMUFileBuffer *fb = f->current_buf;
 
-    for (i = 0; i < fb->iovcnt; i++) {
-        ret += fb->iov[i].iov_len;
+    if (f->buffered_mode) {
+        ret += get_buf_used_size(f);
+    } else {
+        QEMUFileBuffer *fb = f->current_buf;
+        for (i = 0; i < fb->iovcnt; i++) {
+            ret += fb->iov[i].iov_len;
+        }
     }
 
     return ret;
@@ -683,8 +981,12 @@ int64_t qemu_ftell_fast(QEMUFile *f)
 
 int64_t qemu_ftell(QEMUFile *f)
 {
-    qemu_fflush(f);
-    return f->pos;
+    if (f->buffered_mode) {
+        return qemu_ftell_fast(f);
+    } else {
+        qemu_fflush(f);
+        return f->pos;
+    }
 }
 
 int qemu_file_rate_limit(QEMUFile *f)
@@ -803,6 +1105,8 @@ ssize_t qemu_put_compression_data(QEMUFile *f, z_stream *stream,
     QEMUFileBuffer *fb = f->current_buf;
     ssize_t blen = IO_BUF_SIZE - fb->buf_index - sizeof(int32_t);
 
+    assert(!f->buffered_mode);
+
     if (blen < compressBound(size)) {
         return -1;
     }
@@ -827,6 +1131,9 @@ int qemu_put_qemu_file(QEMUFile *f_des, QEMUFile *f_src)
     int len = 0;
     QEMUFileBuffer *fb_src = f_src->current_buf;
 
+    assert(!f_des->buffered_mode);
+    assert(!f_src->buffered_mode);
+
     if (fb_src->buf_index > 0) {
         len = fb_src->buf_index;
         qemu_put_buffer(f_des, fb_src->buf, fb_src->buf_index);
diff --git a/migration/qemu-file.h b/migration/qemu-file.h
index a9b6d6c..08655d2 100644
--- a/migration/qemu-file.h
+++ b/migration/qemu-file.h
@@ -103,6 +103,14 @@ typedef QEMUFile *(QEMURetPathFunc)(void *opaque);
 typedef int (QEMUFileShutdownFunc)(void *opaque, bool rd, bool wr,
                                    Error **errp);
 
+/*
+ * Enables or disables the buffered mode
+ * Existing blocking reads/writes must be woken
+ * Returns true if the buffered mode has to be enabled,
+ * false if it has to be disabled.
+ */
+typedef bool (QEMUFileEnableBufferedFunc)(void *opaque);
+
 typedef struct QEMUFileOps {
     QEMUFileGetBufferFunc *get_buffer;
     QEMUFileCloseFunc *close;
@@ -110,6 +118,7 @@ typedef struct QEMUFileOps {
     QEMUFileWritevBufferFunc *writev_buffer;
     QEMURetPathFunc *get_return_path;
     QEMUFileShutdownFunc *shut_down;
+    QEMUFileEnableBufferedFunc *enable_buffered;
 } QEMUFileOps;
 
 typedef struct QEMUFileHooks {
-- 
1.8.3.1



^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [RFC patch v1 3/3] migration/savevm: use qemu-file buffered mode for non-cached bdrv
  2020-04-13 11:12 [RFC patch v1 0/3] qemu-file writing performance improving Denis Plotnikov
  2020-04-13 11:12 ` [RFC patch v1 1/3] qemu-file: introduce current buffer Denis Plotnikov
  2020-04-13 11:12 ` [RFC patch v1 2/3] qemu-file: add buffered mode Denis Plotnikov
@ 2020-04-13 11:12 ` Denis Plotnikov
  2020-04-13 12:10 ` [RFC patch v1 0/3] qemu-file writing performance improving Denis V. Lunev
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 19+ messages in thread
From: Denis Plotnikov @ 2020-04-13 11:12 UTC (permalink / raw)
  To: qemu-devel; +Cc: den, dgilbert, quintela

This makes internal snapshots of HDD-placed qcow2 images opened with
the O_DIRECT flag 4 times faster.

The test:
   creates a 500M internal snapshot of a qcow2 image placed on an HDD
Result times:
   with the patch: ~6 sec
   without the patch: ~24 sec

Snapshot saving is slow because it produces a lot of synchronous
pwrites: the internal buffers are flushed with non-aligned io vectors
and qemu_fflush is called directly.

To fix this, we introduce internal pointer- and size-aligned buffers.
Most of the time a buffer is flushed only when it is full, regardless
of direct qemu_fflush calls. When a buffer is full, it is written
asynchronously.

This gives us a couple of advantages leading to the performance
improvement:

1. because of pointer- and size-aligned buffers we can use an
   asynchronous OS write syscall, like io_submit
2. while one buffer is being written, another buffer is filled with
   data.
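The double-buffering scheme described above can be sketched roughly as
follows (a standalone illustration, not the actual QEMUFile code; the
buffer size, structure names, and the synchronous stand-in for the
asynchronous write are all made up for the example):

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define BUF_SIZE 512   /* illustrative; the patch uses 1 MiB buffers */
#define BUF_NUM  2

typedef struct Buffer {
    unsigned char data[BUF_SIZE];
    size_t index;           /* bytes filled so far */
} Buffer;

typedef struct Writer {
    Buffer bufs[BUF_NUM];
    int current;            /* buffer currently being filled */
    size_t flushed;         /* total bytes handed to the backend */
} Writer;

/* Stand-in for the asynchronous write: here it just accounts the bytes
 * and immediately returns the buffer to the pool. */
static void submit_async_write(Writer *w, Buffer *b)
{
    w->flushed += b->index;
    b->index = 0;
}

/* Copy data into the current buffer; swap buffers when one fills up,
 * so the previous buffer can be written out while this one fills. */
static void writer_put(Writer *w, const void *buf, size_t size)
{
    const unsigned char *p = buf;
    while (size > 0) {
        Buffer *cur = &w->bufs[w->current];
        size_t room = BUF_SIZE - cur->index;
        size_t len = size < room ? size : room;

        memcpy(cur->data + cur->index, p, len);
        cur->index += len;
        p += len;
        size -= len;

        if (cur->index == BUF_SIZE) {
            submit_async_write(w, cur);
            w->current = (w->current + 1) % BUF_NUM;
        }
    }
}
```

Only full, aligned buffers ever reach the (stand-in) write path; a
partial buffer survives until close, matching the flush-on-close case.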

Signed-off-by: Denis Plotnikov <dplotnikov@virtuozzo.com>
---
 migration/savevm.c | 38 ++++++++++++++++++++++++++++++++++++--
 1 file changed, 36 insertions(+), 2 deletions(-)

diff --git a/migration/savevm.c b/migration/savevm.c
index c00a680..db0cac9 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -63,6 +63,7 @@
 #include "migration/colo.h"
 #include "qemu/bitmap.h"
 #include "net/announce.h"
+#include "block/block_int.h"
 
 const unsigned int postcopy_ram_discard_version = 0;
 
@@ -153,6 +154,12 @@ static int bdrv_fclose(void *opaque, Error **errp)
     return bdrv_flush(opaque);
 }
 
+static bool qemu_file_is_buffered(void *opaque)
+{
+    BlockDriverState *bs = (BlockDriverState *) opaque;
+    return !!(bs->open_flags & BDRV_O_NOCACHE);
+}
+
 static const QEMUFileOps bdrv_read_ops = {
     .get_buffer = block_get_buffer,
     .close =      bdrv_fclose
@@ -160,7 +167,8 @@ static const QEMUFileOps bdrv_read_ops = {
 
 static const QEMUFileOps bdrv_write_ops = {
     .writev_buffer  = block_writev_buffer,
-    .close          = bdrv_fclose
+    .close          = bdrv_fclose,
+    .enable_buffered = qemu_file_is_buffered
 };
 
 static QEMUFile *qemu_fopen_bdrv(BlockDriverState *bs, int is_writable)
@@ -2624,7 +2632,7 @@ int qemu_load_device_state(QEMUFile *f)
     return 0;
 }
 
-int save_snapshot(const char *name, Error **errp)
+static int coroutine_fn save_snapshot_fn(const char *name, Error **errp)
 {
     BlockDriverState *bs, *bs1;
     QEMUSnapshotInfo sn1, *sn = &sn1, old_sn1, *old_sn = &old_sn1;
@@ -2747,6 +2755,32 @@ int save_snapshot(const char *name, Error **errp)
     return ret;
 }
 
+typedef struct SaveVMParams {
+    const char *name;
+    Error **errp;
+    int ret;
+} SaveVMParams;
+
+static void coroutine_fn save_snapshot_entry(void *opaque)
+{
+    SaveVMParams *p = (SaveVMParams *) opaque;
+    p->ret = save_snapshot_fn(p->name, p->errp);
+}
+
+int save_snapshot(const char *name, Error **errp)
+{
+    SaveVMParams p = (SaveVMParams) {
+        .name = name,
+        .errp = errp,
+        .ret = -EINPROGRESS,
+    };
+
+    Coroutine *co = qemu_coroutine_create(save_snapshot_entry, &p);
+    aio_co_enter(qemu_get_aio_context(), co);
+    AIO_WAIT_WHILE(qemu_get_aio_context(), p.ret == -EINPROGRESS);
+    return p.ret;
+}
+
 void qmp_xen_save_devices_state(const char *filename, bool has_live, bool live,
                                 Error **errp)
 {
-- 
1.8.3.1




* Re: [RFC patch v1 0/3] qemu-file writing performance improving
  2020-04-13 11:12 [RFC patch v1 0/3] qemu-file writing performance improving Denis Plotnikov
                   ` (2 preceding siblings ...)
  2020-04-13 11:12 ` [RFC patch v1 3/3] migration/savevm: use qemu-file buffered mode for non-cached bdrv Denis Plotnikov
@ 2020-04-13 12:10 ` Denis V. Lunev
  2020-04-13 13:51 ` no-reply
  2020-04-21  8:13 ` Denis Plotnikov
  5 siblings, 0 replies; 19+ messages in thread
From: Denis V. Lunev @ 2020-04-13 12:10 UTC (permalink / raw)
  To: Denis Plotnikov, qemu-devel; +Cc: dgilbert, quintela

On 4/13/20 2:12 PM, Denis Plotnikov wrote:
> Problem description: qcow2 internal snapshot saving time is too big on HDD ~ 25 sec
>
> When a qcow2 image is placed on a regular HDD and the image is openned with
> O_DIRECT the snapshot saving time is around 26 sec.
> The snapshot saving time can be 4 times sorter.
> The patch series propose the way to achive that. 
>
> Why is the saving time = ~25 sec?
>
> There are three things:
> 1. qemu-file iov limit (currently 64)
> 2. direct qemu_fflush calls, inducing disk writings
in a non-aligned way, which further results in READ-MODIFY-WRITE
operations at the beginning and at the end of the written data.
With synchronous operations this slows down the process a lot!
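The alignment check the block layer performs before choosing between
io_submit and the bounce-buffer pwrite path can be illustrated with a
small standalone helper (a sketch assuming a 512-byte request
alignment; this is not the actual QEMU code):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/uio.h>

#define REQ_ALIGN 512  /* typical request alignment for O_DIRECT */

/* Returns true only if every iovec base address and length is
 * REQ_ALIGN-aligned; one short element spoils the whole vector. */
static bool qiov_is_aligned(const struct iovec *iov, int iovcnt)
{
    for (int i = 0; i < iovcnt; i++) {
        if ((uintptr_t)iov[i].iov_base % REQ_ALIGN != 0 ||
            iov[i].iov_len % REQ_ALIGN != 0) {
            return false;
        }
    }
    return true;
}
```

An 8-byte delimiter appended after each aligned ram page is enough to
make the check fail, which is exactly the slow path described above.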

> 3. ram data copying and synchronous disk wrtings
>
> When 1, 2 are quite clear, the 3rd needs some explaination:
>
> Internal snapshot uses qemu-file as an interface to store the data with
> stream semantics.
> qemu-file avoids data coping when possible (mostly for ram data)
> and use iovectors to propagate the data to an undelying block driver state.
> In the case when qcow2 openned with O_DIRECT it is suboptimal.
>
> This is what happens: on writing, when the iovectors query goes from qemu-file
> to bdrv (here and further by brdv I mean qcow2 with posix O_DIRECT openned backend),
> the brdv checks all iovectors to be base and size aligned, if it's not the case,
> the data copied to an internal buffer and synchronous pwrite is called.
> If the iovectors are aligned, io_submit is called.
>
> In our case, snapshot almost always induces pwrite, since we never have all
> the iovectors aligned in the query, because of frequent adding a short iovector:
> 8 byte ram-page delimiters, after adding each ram page iovector.
>
> So the qemu-file code in this case:
> 1. doesn't aviod ram copying
> 2. works fully synchronously
>
> How to improve the snapshot time:
>
> 1. easy way: to increase iov limit to IOV_MAX (1024).
> This will reduce synchronous writing frequency.
> My test revealed that with iov limit = IOV_MAX the snapshot time *~12 sec*.
>
> 2. complex way: do writings asynchronously.
> Introduce both base- and size-aligned buffer, write the data only when
> the buffer is full, write the buffer asynchronously, meanwhile filling another
> buffer with snapshot data.
> My test revealed that this complex way provides the snapshot time *~6 sec*,
> 2 times better than just iov limit increasing.

We also align the written data, as flush operations to the disk
are not mandatory.

> The patch proposes how to improve the snapshot performance in the complex way,
> allowing to use the asyncronous writings when needed.
>
> This is an RFC series, as I didn't confident that I fully understand all
> qemu-file use cases. I tried to make the series in a safe way to not break
> anything related to qemu-file using in other places, like migration.
>
> All comments are *VERY* appriciated!
>
> Thanks,
> Denis
>
> Denis Plotnikov (3):
>   qemu-file: introduce current buffer
>   qemu-file: add buffered mode
>   migration/savevm: use qemu-file buffered mode for non-cached bdrv
>
>  include/qemu/typedefs.h |   2 +
>  migration/qemu-file.c   | 479 +++++++++++++++++++++++++++++++++++++++++-------
>  migration/qemu-file.h   |   9 +
>  migration/savevm.c      |  38 +++-
>  4 files changed, 456 insertions(+), 72 deletions(-)
>




* Re: [RFC patch v1 0/3] qemu-file writing performance improving
  2020-04-13 11:12 [RFC patch v1 0/3] qemu-file writing performance improving Denis Plotnikov
                   ` (3 preceding siblings ...)
  2020-04-13 12:10 ` [RFC patch v1 0/3] qemu-file writing performance improving Denis V. Lunev
@ 2020-04-13 13:51 ` no-reply
  2020-04-21  8:13 ` Denis Plotnikov
  5 siblings, 0 replies; 19+ messages in thread
From: no-reply @ 2020-04-13 13:51 UTC (permalink / raw)
  To: dplotnikov; +Cc: den, quintela, qemu-devel, dgilbert

Patchew URL: https://patchew.org/QEMU/1586776334-641239-1-git-send-email-dplotnikov@virtuozzo.com/



Hi,

This series failed the asan build test. Please find the testing commands and
their output below. If you have Docker installed, you can probably reproduce it
locally.

=== TEST SCRIPT BEGIN ===
#!/bin/bash
export ARCH=x86_64
make docker-image-fedora V=1 NETWORK=1
time make docker-test-debug@fedora TARGET_LIST=x86_64-softmmu J=14 NETWORK=1
=== TEST SCRIPT END ===

/tmp/qemu-test/src/migration/qemu-file.c:415: undefined reference to `aio_task_pool_start_task'
/usr/bin/ld: migration/qemu-file.o: in function `qemu_file_switch_current_buf':
/tmp/qemu-test/src/migration/qemu-file.c:380: undefined reference to `aio_task_pool_wait_slot'
clang-8: error: linker command failed with exit code 1 (use -v to see invocation)
make: *** [/tmp/qemu-test/src/rules.mak:124: tests/test-vmstate] Error 1
make: *** Waiting for unfinished jobs....
Traceback (most recent call last):
  File "./tests/docker/docker.py", line 664, in <module>
---
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['sudo', '-n', 'docker', 'run', '--label', 'com.qemu.instance.uuid=305548e9be434ddeb3a06c706e9dab1a', '-u', '1001', '--security-opt', 'seccomp=unconfined', '--rm', '-e', 'TARGET_LIST=x86_64-softmmu', '-e', 'EXTRA_CONFIGURE_OPTS=', '-e', 'V=', '-e', 'J=14', '-e', 'DEBUG=', '-e', 'SHOW_ENV=', '-e', 'CCACHE_DIR=/var/tmp/ccache', '-v', '/home/patchew/.cache/qemu-docker-ccache:/var/tmp/ccache:z', '-v', '/var/tmp/patchew-tester-tmp-aounzkok/src/docker-src.2020-04-13-09.44.50.27035:/var/tmp/qemu:z,ro', 'qemu:fedora', '/var/tmp/qemu/run', 'test-debug']' returned non-zero exit status 2.
filter=--filter=label=com.qemu.instance.uuid=305548e9be434ddeb3a06c706e9dab1a
make[1]: *** [docker-run] Error 1
make[1]: Leaving directory `/var/tmp/patchew-tester-tmp-aounzkok/src'
make: *** [docker-run-test-debug@fedora] Error 2

real    6m39.403s
user    0m9.069s


The full log is available at
http://patchew.org/logs/1586776334-641239-1-git-send-email-dplotnikov@virtuozzo.com/testing.asan/?type=message.
---
Email generated automatically by Patchew [https://patchew.org/].
Please send your feedback to patchew-devel@redhat.com


* Re: [RFC patch v1 0/3] qemu-file writing performance improving
  2020-04-13 11:12 [RFC patch v1 0/3] qemu-file writing performance improving Denis Plotnikov
                   ` (4 preceding siblings ...)
  2020-04-13 13:51 ` no-reply
@ 2020-04-21  8:13 ` Denis Plotnikov
  5 siblings, 0 replies; 19+ messages in thread
From: Denis Plotnikov @ 2020-04-21  8:13 UTC (permalink / raw)
  To: qemu-devel; +Cc: den, dgilbert, quintela

Ping!

On 13.04.2020 14:12, Denis Plotnikov wrote:
> Problem description: qcow2 internal snapshot saving time is too big on HDD ~ 25 sec
>
> When a qcow2 image is placed on a regular HDD and the image is openned with
> O_DIRECT the snapshot saving time is around 26 sec.
> The snapshot saving time can be 4 times sorter.
> The patch series propose the way to achive that.
>
> Why is the saving time = ~25 sec?
>
> There are three things:
> 1. qemu-file iov limit (currently 64)
> 2. direct qemu_fflush calls, inducing disk writings
> 3. ram data copying and synchronous disk wrtings
>
> When 1, 2 are quite clear, the 3rd needs some explaination:
>
> Internal snapshot uses qemu-file as an interface to store the data with
> stream semantics.
> qemu-file avoids data coping when possible (mostly for ram data)
> and use iovectors to propagate the data to an undelying block driver state.
> In the case when qcow2 openned with O_DIRECT it is suboptimal.
>
> This is what happens: on writing, when the iovectors query goes from qemu-file
> to bdrv (here and further by brdv I mean qcow2 with posix O_DIRECT openned backend),
> the brdv checks all iovectors to be base and size aligned, if it's not the case,
> the data copied to an internal buffer and synchronous pwrite is called.
> If the iovectors are aligned, io_submit is called.
>
> In our case, snapshot almost always induces pwrite, since we never have all
> the iovectors aligned in the query, because of frequent adding a short iovector:
> 8 byte ram-page delimiters, after adding each ram page iovector.
>
> So the qemu-file code in this case:
> 1. doesn't aviod ram copying
> 2. works fully synchronously
>
> How to improve the snapshot time:
>
> 1. easy way: to increase iov limit to IOV_MAX (1024).
> This will reduce synchronous writing frequency.
> My test revealed that with iov limit = IOV_MAX the snapshot time *~12 sec*.
>
> 2. complex way: do writings asynchronously.
> Introduce both base- and size-aligned buffer, write the data only when
> the buffer is full, write the buffer asynchronously, meanwhile filling another
> buffer with snapshot data.
> My test revealed that this complex way provides the snapshot time *~6 sec*,
> 2 times better than just iov limit increasing.
>
> The patch proposes how to improve the snapshot performance in the complex way,
> allowing to use the asyncronous writings when needed.
>
> This is an RFC series, as I didn't confident that I fully understand all
> qemu-file use cases. I tried to make the series in a safe way to not break
> anything related to qemu-file using in other places, like migration.
>
> All comments are *VERY* appriciated!
>
> Thanks,
> Denis
>
> Denis Plotnikov (3):
>    qemu-file: introduce current buffer
>    qemu-file: add buffered mode
>    migration/savevm: use qemu-file buffered mode for non-cached bdrv
>
>   include/qemu/typedefs.h |   2 +
>   migration/qemu-file.c   | 479 +++++++++++++++++++++++++++++++++++++++++-------
>   migration/qemu-file.h   |   9 +
>   migration/savevm.c      |  38 +++-
>   4 files changed, 456 insertions(+), 72 deletions(-)
>




* Re: [RFC patch v1 1/3] qemu-file: introduce current buffer
  2020-04-13 11:12 ` [RFC patch v1 1/3] qemu-file: introduce current buffer Denis Plotnikov
@ 2020-04-24 17:47   ` Vladimir Sementsov-Ogievskiy
  2020-04-24 21:12   ` Eric Blake
  1 sibling, 0 replies; 19+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-04-24 17:47 UTC (permalink / raw)
  To: Denis Plotnikov, qemu-devel; +Cc: den, dgilbert, quintela

13.04.2020 14:12, Denis Plotnikov wrote:
> To approach async wrtiting in the further commits, the buffer
> allocated in QEMUFile struct is replaced with the link to the
> current buffer. We're going to use many buffers to write the
> qemu file stream to the unerlying storage asynchronously. The
> current buffer points out to the buffer is currently filled
> with data.
> 
> This patch doesn't add any features to qemu-file and doesn't
> change any qemu-file behavior.
> 
> Signed-off-by: Denis Plotnikov<dplotnikov@virtuozzo.com>

Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>

-- 
Best regards,
Vladimir



* Re: [RFC patch v1 1/3] qemu-file: introduce current buffer
  2020-04-13 11:12 ` [RFC patch v1 1/3] qemu-file: introduce current buffer Denis Plotnikov
  2020-04-24 17:47   ` Vladimir Sementsov-Ogievskiy
@ 2020-04-24 21:12   ` Eric Blake
  1 sibling, 0 replies; 19+ messages in thread
From: Eric Blake @ 2020-04-24 21:12 UTC (permalink / raw)
  To: Denis Plotnikov, qemu-devel; +Cc: den, dgilbert, quintela

On 4/13/20 6:12 AM, Denis Plotnikov wrote:
> To approach async wrtiting in the further commits, the buffer

writing

> allocated in QEMUFile struct is replaced with the link to the
> current buffer. We're going to use many buffers to write the
> qemu file stream to the unerlying storage asynchronously. The

underlying

> current buffer points out to the buffer is currently filled
> with data.

This sentence is confusing.  I'd suggest: The current_buf pointer tracks 
which buffer is currently filled with data.

> 
> This patch doesn't add any features to qemu-file and doesn't
> change any qemu-file behavior.
> 
> Signed-off-by: Denis Plotnikov <dplotnikov@virtuozzo.com>
> ---
>   include/qemu/typedefs.h |   1 +
>   migration/qemu-file.c   | 156 +++++++++++++++++++++++++++++-------------------
>   2 files changed, 95 insertions(+), 62 deletions(-)
> 
> diff --git a/include/qemu/typedefs.h b/include/qemu/typedefs.h
> index 375770a..88dce54 100644
> --- a/include/qemu/typedefs.h
> +++ b/include/qemu/typedefs.h
> @@ -97,6 +97,7 @@ typedef struct QDict QDict;
>   typedef struct QEMUBH QEMUBH;
>   typedef struct QemuConsole QemuConsole;
>   typedef struct QEMUFile QEMUFile;
> +typedef struct QEMUFileBuffer QEMUFileBuffer;
>   typedef struct QemuLockable QemuLockable;
>   typedef struct QemuMutex QemuMutex;
>   typedef struct QemuOpt QemuOpt;
> diff --git a/migration/qemu-file.c b/migration/qemu-file.c
> index 1c3a358..285c6ef 100644
> --- a/migration/qemu-file.c
> +++ b/migration/qemu-file.c
> @@ -33,6 +33,17 @@
>   #define IO_BUF_SIZE 32768
>   #define MAX_IOV_SIZE MIN(IOV_MAX, 64)
>   
> +QEMU_BUILD_BUG_ON(!QEMU_IS_ALIGNED(IO_BUF_SIZE, 512));
> +
> +struct QEMUFileBuffer {
> +    int buf_index;
> +    int buf_size; /* 0 when writing */
> +    uint8_t *buf;
> +    unsigned long *may_free;

Do we really want something that compiles differently on 32- vs. 64-bit?
/me reads ahead...

> +    struct iovec *iov;
> +    unsigned int iovcnt;
> +};
> +
>   struct QEMUFile {
>       const QEMUFileOps *ops;
>       const QEMUFileHooks *hooks;
> @@ -43,18 +54,12 @@ struct QEMUFile {
>   
>       int64_t pos; /* start of buffer when writing, end of buffer
>                       when reading */
> -    int buf_index;
> -    int buf_size; /* 0 when writing */
> -    uint8_t buf[IO_BUF_SIZE];
> -
> -    DECLARE_BITMAP(may_free, MAX_IOV_SIZE);

...ah, you're replacing a bitmap, and our bitmap code _does_ use 'long' 
as its core for optimum speed (which overcomes the fact that it does 
mean annoying differences on 32- vs. 64-bit).

> -    struct iovec iov[MAX_IOV_SIZE];
> -    unsigned int iovcnt;
> -
>       int last_error;
>       Error *last_error_obj;
>       /* has the file has been shutdown */
>       bool shutdown;
> +    /* currently used buffer */
> +    QEMUFileBuffer *current_buf;
>   };
>   

Most of the patch is mechanical conversion to the rearranged struct.

Reviewed-by: Eric Blake <eblake@redhat.com>

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org




* Re: [RFC patch v1 2/3] qemu-file: add buffered mode
  2020-04-13 11:12 ` [RFC patch v1 2/3] qemu-file: add buffered mode Denis Plotnikov
@ 2020-04-24 21:25   ` Eric Blake
  2020-04-27  8:21     ` Denis Plotnikov
  2020-04-25  9:10   ` Vladimir Sementsov-Ogievskiy
  2020-04-27 12:14   ` Dr. David Alan Gilbert
  2 siblings, 1 reply; 19+ messages in thread
From: Eric Blake @ 2020-04-24 21:25 UTC (permalink / raw)
  To: Denis Plotnikov, qemu-devel; +Cc: den, dgilbert, quintela

On 4/13/20 6:12 AM, Denis Plotnikov wrote:
> The patch adds ability to qemu-file to write the data
> asynchronously to improve the performance on writing.
> Before, only synchronous writing was supported.
> 
> Enabling of the asyncronous mode is managed by new

asynchronous

> "enabled_buffered" callback.

The term "enabled_buffered" does not appear in the patch.  Did you mean...

> 
> Signed-off-by: Denis Plotnikov <dplotnikov@virtuozzo.com>
> ---
>   include/qemu/typedefs.h |   1 +
>   migration/qemu-file.c   | 351 +++++++++++++++++++++++++++++++++++++++++++++---
>   migration/qemu-file.h   |   9 ++
>   3 files changed, 339 insertions(+), 22 deletions(-)
> 

> @@ -60,6 +66,22 @@ struct QEMUFile {
>       bool shutdown;
>       /* currently used buffer */
>       QEMUFileBuffer *current_buf;
> +    /*
> +     * with buffered_mode enabled all the data copied to 512 byte
> +     * aligned buffer, including iov data. Then the buffer is passed
> +     * to writev_buffer callback.
> +     */
> +    bool buffered_mode;

..."Asynchronous mode is managed by setting the new buffered_mode flag"? 
  ...


> +    /* for async buffer writing */
> +    AioTaskPool *pool;
> +    /* the list of free buffers, currently used on is NOT there */

s/on/one/

> +    QLIST_HEAD(, QEMUFileBuffer) free_buffers;
> +};
> +
> +struct QEMUFileAioTask {
> +    AioTask task;
> +    QEMUFile *f;
> +    QEMUFileBuffer *fb;
>   };
>   
>   /*
> @@ -115,10 +137,42 @@ QEMUFile *qemu_fopen_ops(void *opaque, const QEMUFileOps *ops)
>       f->opaque = opaque;
>       f->ops = ops;
>   
> -    f->current_buf = g_new0(QEMUFileBuffer, 1);
> -    f->current_buf->buf = g_malloc(IO_BUF_SIZE);
> -    f->current_buf->iov = g_new0(struct iovec, MAX_IOV_SIZE);
> -    f->current_buf->may_free = bitmap_new(MAX_IOV_SIZE);
> +    if (f->ops->enable_buffered) {
> +        f->buffered_mode = f->ops->enable_buffered(f->opaque);

...ah, you meant 'enable_buffered'.  But still, why do we need a 
callback function?  Is it not sufficient to just have a bool flag?


> +static size_t get_buf_free_size(QEMUFile *f)
> +{
> +    QEMUFileBuffer *fb = f->current_buf;
> +    /* buf_index can't be greated than buf_size */

greater

> +    assert(fb->buf_size >= fb->buf_index);
> +    return fb->buf_size - fb->buf_index;
> +}
> +

> +static int write_task_fn(AioTask *task)
> +{

> +    /*
> +     * Increment file position.
> +     * This needs to be here before calling writev_buffer, because
> +     * writev_buffer is asynchronous and there could be more than one
> +     * writev_buffer started simultaniously. Each writev_buffer should

simultaneously

> +     * use its own file pos to write to. writev_buffer may write less
> +     * than buf_index bytes but we treat this situation as an error.
> +     * If error appeared, further file using is meaningless.

s/using/use/

> +     * We expect that, the most of the time the full buffer is written,
> +     * (when buf_size == buf_index). The only case when the non-full
> +     * buffer is written (buf_size != buf_index) is file close,
> +     * when we need to flush the rest of the buffer content.

We expect that most of the time, the full buffer will be written 
(buf_size == buf_index), with the exception at file close where we need 
to flush the final partial buffer.

> +     */
> +    f->pos += fb->buf_index;
> +
> +    ret = f->ops->writev_buffer(f->opaque, &v, 1, pos, &local_error);
> +
> +    /* return the just written buffer to the free list */
> +    QLIST_INSERT_HEAD(&f->free_buffers, fb, link);
> +
> +    /* check that we have written everything */
> +    if (ret != fb->buf_index) {
> +        qemu_file_set_error_obj(f, ret < 0 ? ret : -EIO, local_error);
> +    }
> +
> +    /*
> +     * always return 0 - don't use task error handling, relay on

rely

> +     * qemu file error handling
> +     */
> +    return 0;
> +}
> +
> +static void qemu_file_switch_current_buf(QEMUFile *f)
> +{
> +    /*
> +     * if the list is empty, wait until some task returns a buffer
> +     * to the list of free buffers.
> +     */
> +    if (QLIST_EMPTY(&f->free_buffers)) {
> +        aio_task_pool_wait_slot(f->pool);
> +    }
> +
> +    /*
> +     * sanity check that the list isn't empty
> +     * if the free list was empty, we waited for a task complition,

completion

> +     * and the pompleted task must return a buffer to a list of free buffers

completed

> +     */
> +    assert(!QLIST_EMPTY(&f->free_buffers));
> +
> +    /* set the current buffer for using from the free list */
> +    f->current_buf = QLIST_FIRST(&f->free_buffers);
> +    reset_buf(f);
> +
> +    QLIST_REMOVE(f->current_buf, link);
> +}
> +

>   
>   /*
> + * Copy an external buffer to the intenal current buffer.

internal

> + */
> +static void copy_buf(QEMUFile *f, const uint8_t *buf, size_t size,
> +                     bool may_free)
> +{

> +++ b/migration/qemu-file.h
> @@ -103,6 +103,14 @@ typedef QEMUFile *(QEMURetPathFunc)(void *opaque);
>   typedef int (QEMUFileShutdownFunc)(void *opaque, bool rd, bool wr,
>                                      Error **errp);
>   
> +/*
> + * Enables or disables the buffered mode
> + * Existing blocking reads/writes must be woken
> + * Returns true if the buffered mode has to be enabled,
> + * false if it has to be disabled.
> + */
> +typedef bool (QEMUFileEnableBufferedFunc)(void *opaque);

If this never gets called outside of initial creation of the QemuFile 
(that is, it is not dynamic), then making it a straight flag instead of 
a callback function is simpler.

> +
>   typedef struct QEMUFileOps {
>       QEMUFileGetBufferFunc *get_buffer;
>       QEMUFileCloseFunc *close;
> @@ -110,6 +118,7 @@ typedef struct QEMUFileOps {
>       QEMUFileWritevBufferFunc *writev_buffer;
>       QEMURetPathFunc *get_return_path;
>       QEMUFileShutdownFunc *shut_down;
> +    QEMUFileEnableBufferedFunc *enable_buffered;
>   } QEMUFileOps;
>   
>   typedef struct QEMUFileHooks {
> 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org




* Re: [RFC patch v1 2/3] qemu-file: add buffered mode
  2020-04-13 11:12 ` [RFC patch v1 2/3] qemu-file: add buffered mode Denis Plotnikov
  2020-04-24 21:25   ` Eric Blake
@ 2020-04-25  9:10   ` Vladimir Sementsov-Ogievskiy
  2020-04-27  8:19     ` Denis Plotnikov
  2020-04-27 12:14   ` Dr. David Alan Gilbert
  2 siblings, 1 reply; 19+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-04-25  9:10 UTC (permalink / raw)
  To: Denis Plotnikov, qemu-devel; +Cc: den, dgilbert, quintela

13.04.2020 14:12, Denis Plotnikov wrote:
> The patch adds ability to qemu-file to write the data
> asynchronously to improve the performance on writing.
> Before, only synchronous writing was supported.
> 
> Enabling of the asyncronous mode is managed by new
> "enabled_buffered" callback.

Hmm.

I don't like the resulting architecture very much:

1. Function naming is not clean: how can I tell that copy_buf is for buffered mode, when add_to_iovec does more or less the same thing for non-buffered mode?

Hmm, actually, you alter several significant functions of QEMUFile - open, close, put, flush. In the old mode we do one thing, in the new mode - absolutely another. This looks like a driver. So maybe we want to add a QEMUFileDriver struct, define these functions as callbacks, move the old implementations to a default driver, and add the new functionality as a new driver. What do you think?

2. Terminology: you say you add a buffered mode, but qemu-file actually already works in a buffered mode, so this should be clarified somehow..
You also add asynchronicity, but the old implementation already has qemu_put_buffer_async..
You use the aio task pool, but don't say that it may be used only from a coroutine.
Maybe we'd better call it "coroutine mode"?


Also, am I correct that the new mode can't be used for reads? Should we document/assert that somehow? Or maybe support reading? Would snapshot loading benefit if we implemented reading in the new mode?
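A minimal sketch of what such a driver split could look like (the
QEMUFileDriver name comes from the review suggestion; everything else
here - the File struct, counters, and callback names - is hypothetical,
purely to illustrate dispatching per mode instead of branching on a
flag in every function):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback table: one set of ops per mode. */
typedef struct QEMUFileDriver {
    void (*put_buffer)(void *file, const unsigned char *buf, size_t size);
    void (*flush)(void *file);
} QEMUFileDriver;

typedef struct File {
    const QEMUFileDriver *drv;
    size_t iovec_bytes;     /* bytes taken by the default (iovec) path */
    size_t copied_bytes;    /* bytes taken by the buffered path */
} File;

static void default_put(void *file, const unsigned char *buf, size_t size)
{
    (void)buf;
    ((File *)file)->iovec_bytes += size;   /* would call add_to_iovec() */
}

static void buffered_put(void *file, const unsigned char *buf, size_t size)
{
    (void)buf;
    ((File *)file)->copied_bytes += size;  /* would call copy_buf() */
}

static void noop_flush(void *file)
{
    (void)file;
}

static const QEMUFileDriver default_drv = { default_put, noop_flush };
static const QEMUFileDriver buffered_drv = { buffered_put, noop_flush };

/* Callers dispatch through the table; no buffered_mode checks. */
static void file_put(File *f, const unsigned char *buf, size_t size)
{
    f->drv->put_buffer(f, buf, size);
}
```

The mode is then chosen once at open time by picking the driver, and
qemu_put_buffer/qemu_fflush stay branch-free.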

> 
> Signed-off-by: Denis Plotnikov <dplotnikov@virtuozzo.com>
> ---
>   include/qemu/typedefs.h |   1 +
>   migration/qemu-file.c   | 351 +++++++++++++++++++++++++++++++++++++++++++++---
>   migration/qemu-file.h   |   9 ++
>   3 files changed, 339 insertions(+), 22 deletions(-)
> 
> diff --git a/include/qemu/typedefs.h b/include/qemu/typedefs.h
> index 88dce54..9b388c8 100644
> --- a/include/qemu/typedefs.h
> +++ b/include/qemu/typedefs.h
> @@ -98,6 +98,7 @@ typedef struct QEMUBH QEMUBH;
>   typedef struct QemuConsole QemuConsole;
>   typedef struct QEMUFile QEMUFile;
>   typedef struct QEMUFileBuffer QEMUFileBuffer;
> +typedef struct QEMUFileAioTask QEMUFileAioTask;
>   typedef struct QemuLockable QemuLockable;
>   typedef struct QemuMutex QemuMutex;
>   typedef struct QemuOpt QemuOpt;
> diff --git a/migration/qemu-file.c b/migration/qemu-file.c
> index 285c6ef..f42f949 100644
> --- a/migration/qemu-file.c
> +++ b/migration/qemu-file.c
> @@ -29,19 +29,25 @@
>   #include "qemu-file.h"
>   #include "trace.h"
>   #include "qapi/error.h"
> +#include "block/aio_task.h"
>   
> -#define IO_BUF_SIZE 32768
> +#define IO_BUF_SIZE (1024 * 1024)
>   #define MAX_IOV_SIZE MIN(IOV_MAX, 64)
> +#define IO_BUF_NUM 2

Interesting - how much better is this than setting it to 1, which would limit the influence of the series to the alignment of written chunks?

> +#define IO_BUF_ALIGNMENT 512
>   
> -QEMU_BUILD_BUG_ON(!QEMU_IS_ALIGNED(IO_BUF_SIZE, 512));
> +QEMU_BUILD_BUG_ON(!QEMU_IS_ALIGNED(IO_BUF_SIZE, IO_BUF_ALIGNMENT));
> +QEMU_BUILD_BUG_ON(IO_BUF_SIZE > INT_MAX);
> +QEMU_BUILD_BUG_ON(IO_BUF_NUM <= 0);
>   
>   struct QEMUFileBuffer {
>       int buf_index;
> -    int buf_size; /* 0 when writing */
> +    int buf_size; /* 0 when non-buffered writing */
>       uint8_t *buf;
>       unsigned long *may_free;
>       struct iovec *iov;
>       unsigned int iovcnt;
> +    QLIST_ENTRY(QEMUFileBuffer) link;
>   };
>   
>   struct QEMUFile {
> @@ -60,6 +66,22 @@ struct QEMUFile {
>       bool shutdown;
>       /* currently used buffer */
>       QEMUFileBuffer *current_buf;
> +    /*
> +     * with buffered_mode enabled all the data copied to 512 byte
> +     * aligned buffer, including iov data. Then the buffer is passed
> +     * to writev_buffer callback.
> +     */
> +    bool buffered_mode;
> +    /* for async buffer writing */
> +    AioTaskPool *pool;
> +    /* the list of free buffers, currently used on is NOT there */
> +    QLIST_HEAD(, QEMUFileBuffer) free_buffers;
> +};
> +
> +struct QEMUFileAioTask {
> +    AioTask task;
> +    QEMUFile *f;
> +    QEMUFileBuffer *fb;
>   };
>   
>   /*
> @@ -115,10 +137,42 @@ QEMUFile *qemu_fopen_ops(void *opaque, const QEMUFileOps *ops)
>       f->opaque = opaque;
>       f->ops = ops;
>   
> -    f->current_buf = g_new0(QEMUFileBuffer, 1);
> -    f->current_buf->buf = g_malloc(IO_BUF_SIZE);
> -    f->current_buf->iov = g_new0(struct iovec, MAX_IOV_SIZE);
> -    f->current_buf->may_free = bitmap_new(MAX_IOV_SIZE);
> +    if (f->ops->enable_buffered) {
> +        f->buffered_mode = f->ops->enable_buffered(f->opaque);
> +    }
> +
> +    if (f->buffered_mode && qemu_file_is_writable(f)) {

So we actually go to buffered_mode only if the file is writable. Shouldn't we set buffered_mode back to false otherwise?

> +        int i;
> +        /*
> +         * in buffered_mode we don't use internal io vectors
> +         * and may_free bitmap, because we copy the data to be
> +         * written right away to the buffer
> +         */
> +        f->pool = aio_task_pool_new(IO_BUF_NUM);
> +
> +        /* allocate io buffers */
> +        for (i = 0; i < IO_BUF_NUM; i++) {
> +            QEMUFileBuffer *fb = g_new0(QEMUFileBuffer, 1);
> +
> +            fb->buf = qemu_memalign(IO_BUF_ALIGNMENT, IO_BUF_SIZE);
> +            fb->buf_size = IO_BUF_SIZE;
> +
> +            /*
> +             * put the first buffer to the current buf and the rest
> +             * to the list of free buffers
> +             */
> +            if (i == 0) {
> +                f->current_buf = fb;
> +            } else {
> +                QLIST_INSERT_HEAD(&f->free_buffers, fb, link);
> +            }
> +        }
> +    } else {
> +        f->current_buf = g_new0(QEMUFileBuffer, 1);
> +        f->current_buf->buf = g_malloc(IO_BUF_SIZE);
> +        f->current_buf->iov = g_new0(struct iovec, MAX_IOV_SIZE);
> +        f->current_buf->may_free = bitmap_new(MAX_IOV_SIZE);
> +    }
>   
>       return f;
>   }
> @@ -190,6 +244,8 @@ static void qemu_iovec_release_ram(QEMUFile *f)
>       unsigned long idx;
>       QEMUFileBuffer *fb = f->current_buf;
>   
> +    assert(!f->buffered_mode);
> +
>       /* Find and release all the contiguous memory ranges marked as may_free. */
>       idx = find_next_bit(fb->may_free, fb->iovcnt, 0);
>       if (idx >= fb->iovcnt) {
> @@ -221,6 +277,147 @@ static void qemu_iovec_release_ram(QEMUFile *f)
>       bitmap_zero(fb->may_free, MAX_IOV_SIZE);
>   }
>   
> +static void advance_buf_ptr(QEMUFile *f, size_t size)
> +{
> +    QEMUFileBuffer *fb = f->current_buf;
> +    /* must not advance to 0 */
> +    assert(size);
> +    /* must not overflow buf_index (int) */
> +    assert(fb->buf_index + size <= INT_MAX);

To avoid overflow in the check itself: assert(fb->buf_index <= INT_MAX - size)

> +    /* must not exceed buf_size */
> +    assert(fb->buf_index + size <= fb->buf_size);
> +
> +    fb->buf_index += size;
> +}
> +
> +static size_t get_buf_free_size(QEMUFile *f)
> +{
> +    QEMUFileBuffer *fb = f->current_buf;
> +    /* buf_index can't be greated than buf_size */
> +    assert(fb->buf_size >= fb->buf_index);
> +    return fb->buf_size - fb->buf_index;
> +}
> +
> +static size_t get_buf_used_size(QEMUFile *f)
> +{
> +    QEMUFileBuffer *fb = f->current_buf;
> +    return fb->buf_index;
> +}
> +
> +static uint8_t *get_buf_ptr(QEMUFile *f)
> +{
> +    QEMUFileBuffer *fb = f->current_buf;
> +    /* protects from out of bound reading */
> +    assert(fb->buf_index <= IO_BUF_SIZE);
> +    return fb->buf + fb->buf_index;
> +}
> +
> +static bool buf_is_full(QEMUFile *f)
> +{
> +    return get_buf_free_size(f) == 0;
> +}
> +
> +static void reset_buf(QEMUFile *f)
> +{
> +    QEMUFileBuffer *fb = f->current_buf;
> +    fb->buf_index = 0;
> +}
> +
> +static int write_task_fn(AioTask *task)
> +{
> +    int ret;
> +    Error *local_error = NULL;
> +    QEMUFileAioTask *t = (QEMUFileAioTask *) task;
> +    QEMUFile *f = t->f;
> +    QEMUFileBuffer *fb = t->fb;
> +    uint64_t pos = f->pos;
> +    struct iovec v = (struct iovec) {
> +        .iov_base = fb->buf,
> +        .iov_len = fb->buf_index,
> +    };
> +
> +    assert(f->buffered_mode);
> +
> +    /*
> +     * Increment file position.
> +     * This needs to be here before calling writev_buffer, because
> +     * writev_buffer is asynchronous and there could be more than one
> +     * writev_buffer started simultaniously. Each writev_buffer should
> +     * use its own file pos to write to. writev_buffer may write less
> +     * than buf_index bytes but we treat this situation as an error.
> +     * If error appeared, further file using is meaningless.
> +     * We expect that, the most of the time the full buffer is written,
> +     * (when buf_size == buf_index). The only case when the non-full
> +     * buffer is written (buf_size != buf_index) is file close,
> +     * when we need to flush the rest of the buffer content.
> +     */
> +    f->pos += fb->buf_index;

It seems safer to add pos to QEMUFileAioTask instead, and to manage the global f->pos
in the main coroutine, not in the tasks.

> +
> +    ret = f->ops->writev_buffer(f->opaque, &v, 1, pos, &local_error);
> +
> +    /* return the just written buffer to the free list */
> +    QLIST_INSERT_HEAD(&f->free_buffers, fb, link);
> +
> +    /* check that we have written everything */
> +    if (ret != fb->buf_index) {
> +        qemu_file_set_error_obj(f, ret < 0 ? ret : -EIO, local_error);
> +    }
> +
> +    /*
> +     * always return 0 - don't use task error handling, relay on
> +     * qemu file error handling
> +     */
> +    return 0;
> +}
> +
> +static void qemu_file_switch_current_buf(QEMUFile *f)
> +{
> +    /*
> +     * if the list is empty, wait until some task returns a buffer
> +     * to the list of free buffers.
> +     */
> +    if (QLIST_EMPTY(&f->free_buffers)) {
> +        aio_task_pool_wait_slot(f->pool);
> +    }
> +
> +    /*
> +     * sanity check that the list isn't empty
> +     * if the free list was empty, we waited for a task complition,
> +     * and the pompleted task must return a buffer to a list of free buffers
> +     */
> +    assert(!QLIST_EMPTY(&f->free_buffers));
> +
> +    /* set the current buffer for using from the free list */
> +    f->current_buf = QLIST_FIRST(&f->free_buffers);
> +    reset_buf(f);
> +
> +    QLIST_REMOVE(f->current_buf, link);
> +}
> +
> +/**
> + *  Asynchronously flushes QEMUFile buffer
> + *
> + * This will flush all pending data. If data was only partially flushed, it
> + * will set an error state. The function may return before the data actually
> + * written.
> + */
> +static void flush_buffer(QEMUFile *f)
> +{
> +    QEMUFileAioTask *t = g_new(QEMUFileAioTask, 1);
> +
> +    *t = (QEMUFileAioTask) {
> +        .task.func = &write_task_fn,
> +        .f = f,
> +        .fb = f->current_buf,
> +    };
> +
> +    /* aio_task_pool should free t for us */
> +    aio_task_pool_start_task(f->pool, (AioTask *) t);
> +
> +    /* if no errors this will switch the buffer */
> +    qemu_file_switch_current_buf(f);
> +}
> +
>   /**
>    * Flushes QEMUFile buffer
>    *
> @@ -241,7 +438,13 @@ void qemu_fflush(QEMUFile *f)
>       if (f->shutdown) {
>           return;
>       }
> +
> +    if (f->buffered_mode) {
> +        return;

I don't think this is correct. qemu_fflush is a public interface of QEMUFile, and it is assumed to write all in-flight data.

> +    }
> +
>       if (fb->iovcnt > 0) {
> +        /* this is non-buffered mode */
>           expect = iov_size(fb->iov, fb->iovcnt);
>           ret = f->ops->writev_buffer(f->opaque, fb->iov, fb->iovcnt, f->pos,
>                                       &local_error);
> @@ -378,6 +581,7 @@ static ssize_t qemu_fill_buffer(QEMUFile *f)
>   
>   void qemu_update_position(QEMUFile *f, size_t size)
>   {
> +    assert(!f->buffered_mode);

This is a public interface. Why are you sure it's not used for snapshots? It is used in migration/ram.c ...

>       f->pos += size;

And it shouldn't be a problem for the new mode: just add a pos parameter to QEMUFileAioTask.

>   }
>   
> @@ -392,7 +596,18 @@ void qemu_update_position(QEMUFile *f, size_t size)
>   int qemu_fclose(QEMUFile *f)
>   {
>       int ret;
> -    qemu_fflush(f);
> +
> +    if (qemu_file_is_writable(f) && f->buffered_mode) {
> +        ret = qemu_file_get_error(f);
> +        if (!ret) {
> +            flush_buffer(f);
> +        }
> +        /* wait until all tasks are done */
> +        aio_task_pool_wait_all(f->pool);
> +    } else {
> +        qemu_fflush(f);
> +    }

Again, not a clean thing: for buffered mode we use flush_buffer, for non-buffered we use qemu_fflush. The function naming doesn't make this distinction clear.

> +
>       ret = qemu_file_get_error(f);
>   
>       if (f->ops->close) {
> @@ -408,16 +623,77 @@ int qemu_fclose(QEMUFile *f)
>           ret = f->last_error;
>       }
>       error_free(f->last_error_obj);
> -    g_free(f->current_buf->buf);
> -    g_free(f->current_buf->iov);
> -    g_free(f->current_buf->may_free);
> -    g_free(f->current_buf);
> +
> +    if (f->buffered_mode) {
> +        QEMUFileBuffer *fb, *next;
> +        /*
> +         * put the current back to the free buffers list
> +         * to destroy all the buffers in one loop
> +         */
> +        QLIST_INSERT_HEAD(&f->free_buffers, f->current_buf, link);
> +
> +        /* destroy all the buffers */
> +        QLIST_FOREACH_SAFE(fb, &f->free_buffers, link, next) {
> +            QLIST_REMOVE(fb, link);
> +            /* looks like qemu_vfree pairs with qemu_memalign */
> +            qemu_vfree(fb->buf);
> +            g_free(fb);
> +        }
> +        g_free(f->pool);
> +    } else {
> +        g_free(f->current_buf->buf);
> +        g_free(f->current_buf->iov);
> +        g_free(f->current_buf->may_free);
> +        g_free(f->current_buf);
> +    }
> +
>       g_free(f);
>       trace_qemu_file_fclose();
>       return ret;
>   }
>   
>   /*
> + * Copy an external buffer to the intenal current buffer.
> + */
> +static void copy_buf(QEMUFile *f, const uint8_t *buf, size_t size,
> +                     bool may_free)
> +{
> +    size_t data_size = size;
> +    const uint8_t *src_ptr = buf;
> +
> +    assert(f->buffered_mode);
> +    assert(size <= INT_MAX);
> +
> +    while (data_size > 0) {
> +        size_t chunk_size;
> +
> +        if (buf_is_full(f)) {
> +            flush_buffer(f);
> +            if (qemu_file_get_error(f)) {
> +                return;
> +            }
> +        }
> +
> +        chunk_size = MIN(get_buf_free_size(f), data_size);
> +
> +        memcpy(get_buf_ptr(f), src_ptr, chunk_size);
> +
> +        advance_buf_ptr(f, chunk_size);
> +
> +        src_ptr += chunk_size;
> +        data_size -= chunk_size;
> +        f->bytes_xfer += chunk_size;
> +    }
> +
> +    if (may_free) {
> +        if (qemu_madvise((void *) buf, size, QEMU_MADV_DONTNEED) < 0) {
> +            error_report("migrate: madvise DONTNEED failed %p %zd: %s",
> +                         buf, size, strerror(errno));
> +        }
> +    }
> +}
> +
> +/*
>    * Add buf to iovec. Do flush if iovec is full.
>    *
>    * Return values:
> @@ -454,6 +730,9 @@ static int add_to_iovec(QEMUFile *f, const uint8_t *buf, size_t size,
>   static void add_buf_to_iovec(QEMUFile *f, size_t len)
>   {
>       QEMUFileBuffer *fb = f->current_buf;
> +
> +    assert(!f->buffered_mode);
> +
>       if (!add_to_iovec(f, fb->buf + fb->buf_index, len, false)) {
>           fb->buf_index += len;
>           if (fb->buf_index == IO_BUF_SIZE) {
> @@ -469,8 +748,12 @@ void qemu_put_buffer_async(QEMUFile *f, const uint8_t *buf, size_t size,
>           return;
>       }
>   
> -    f->bytes_xfer += size;
> -    add_to_iovec(f, buf, size, may_free);
> +    if (f->buffered_mode) {
> +        copy_buf(f, buf, size, may_free);
> +    } else {
> +        f->bytes_xfer += size;
> +        add_to_iovec(f, buf, size, may_free);
> +    }
>   }
>   
>   void qemu_put_buffer(QEMUFile *f, const uint8_t *buf, size_t size)
> @@ -482,6 +765,11 @@ void qemu_put_buffer(QEMUFile *f, const uint8_t *buf, size_t size)
>           return;
>       }
>   
> +    if (f->buffered_mode) {
> +        copy_buf(f, buf, size, false);
> +        return;
> +    }
> +
>       while (size > 0) {
>           l = IO_BUF_SIZE - fb->buf_index;
>           if (l > size) {
> @@ -506,15 +794,21 @@ void qemu_put_byte(QEMUFile *f, int v)
>           return;
>       }
>   
> -    fb->buf[fb->buf_index] = v;
> -    f->bytes_xfer++;
> -    add_buf_to_iovec(f, 1);
> +    if (f->buffered_mode) {
> +        copy_buf(f, (const uint8_t *) &v, 1, false);
> +    } else {
> +        fb->buf[fb->buf_index] = v;
> +        add_buf_to_iovec(f, 1);
> +        f->bytes_xfer++;
> +    }
>   }
>   
>   void qemu_file_skip(QEMUFile *f, int size)
>   {
>       QEMUFileBuffer *fb = f->current_buf;
>   
> +    assert(!f->buffered_mode);
> +
>       if (fb->buf_index + size <= fb->buf_size) {
>           fb->buf_index += size;
>       }
> @@ -672,10 +966,14 @@ int64_t qemu_ftell_fast(QEMUFile *f)
>   {
>       int64_t ret = f->pos;
>       int i;
> -    QEMUFileBuffer *fb = f->current_buf;
>   
> -    for (i = 0; i < fb->iovcnt; i++) {
> -        ret += fb->iov[i].iov_len;
> +    if (f->buffered_mode) {
> +        ret += get_buf_used_size(f);
> +    } else {
> +        QEMUFileBuffer *fb = f->current_buf;
> +        for (i = 0; i < fb->iovcnt; i++) {
> +            ret += fb->iov[i].iov_len;
> +        }
>       }
>   
>       return ret;
> @@ -683,8 +981,12 @@ int64_t qemu_ftell_fast(QEMUFile *f)
>   
>   int64_t qemu_ftell(QEMUFile *f)
>   {
> -    qemu_fflush(f);
> -    return f->pos;
> +    if (f->buffered_mode) {
> +        return qemu_ftell_fast(f);
> +    } else {
> +        qemu_fflush(f);
> +        return f->pos;
> +    }
>   }
>   
>   int qemu_file_rate_limit(QEMUFile *f)
> @@ -803,6 +1105,8 @@ ssize_t qemu_put_compression_data(QEMUFile *f, z_stream *stream,
>       QEMUFileBuffer *fb = f->current_buf;
>       ssize_t blen = IO_BUF_SIZE - fb->buf_index - sizeof(int32_t);
>   
> +    assert(!f->buffered_mode);
> +
>       if (blen < compressBound(size)) {
>           return -1;
>       }
> @@ -827,6 +1131,9 @@ int qemu_put_qemu_file(QEMUFile *f_des, QEMUFile *f_src)
>       int len = 0;
>       QEMUFileBuffer *fb_src = f_src->current_buf;
>   
> +    assert(!f_des->buffered_mode);
> +    assert(!f_src->buffered_mode);
> +
>       if (fb_src->buf_index > 0) {
>           len = fb_src->buf_index;
>           qemu_put_buffer(f_des, fb_src->buf, fb_src->buf_index);
> diff --git a/migration/qemu-file.h b/migration/qemu-file.h
> index a9b6d6c..08655d2 100644
> --- a/migration/qemu-file.h
> +++ b/migration/qemu-file.h
> @@ -103,6 +103,14 @@ typedef QEMUFile *(QEMURetPathFunc)(void *opaque);
>   typedef int (QEMUFileShutdownFunc)(void *opaque, bool rd, bool wr,
>                                      Error **errp);
>   
> +/*
> + * Enables or disables the buffered mode
> + * Existing blocking reads/writes must be woken
> + * Returns true if the buffered mode has to be enabled,
> + * false if it has to be disabled.
> + */
> +typedef bool (QEMUFileEnableBufferedFunc)(void *opaque);
> +
>   typedef struct QEMUFileOps {
>       QEMUFileGetBufferFunc *get_buffer;
>       QEMUFileCloseFunc *close;
> @@ -110,6 +118,7 @@ typedef struct QEMUFileOps {
>       QEMUFileWritevBufferFunc *writev_buffer;
>       QEMURetPathFunc *get_return_path;
>       QEMUFileShutdownFunc *shut_down;
> +    QEMUFileEnableBufferedFunc *enable_buffered;
>   } QEMUFileOps;
>   
>   typedef struct QEMUFileHooks {
> 


-- 
Best regards,
Vladimir



* Re: [RFC patch v1 2/3] qemu-file: add buffered mode
  2020-04-25  9:10   ` Vladimir Sementsov-Ogievskiy
@ 2020-04-27  8:19     ` Denis Plotnikov
  2020-04-27 11:04       ` Vladimir Sementsov-Ogievskiy
  0 siblings, 1 reply; 19+ messages in thread
From: Denis Plotnikov @ 2020-04-27  8:19 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-devel; +Cc: den, dgilbert, quintela



On 25.04.2020 12:10, Vladimir Sementsov-Ogievskiy wrote:
> 13.04.2020 14:12, Denis Plotnikov wrote:
>> The patch adds ability to qemu-file to write the data
>> asynchronously to improve the performance on writing.
>> Before, only synchronous writing was supported.
>>
>> Enabling of the asyncronous mode is managed by new
>> "enabled_buffered" callback.
>
> Hmm.
>
> I don't like resulting architecture very much:
>
> 1. Function naming is not clean: how can I understand that copy_buf is 
> for buffered mode when add_to_iovec is +- same thing for non-buffered 
> mode?
Yes, this is to be changed in the next patches
>
> Hmm actually, you just alter several significant functions of QEMUFile 
> - open, close, put, flush. In old mode we do one thing, in a new mode 
> - absolutely another. This looks like a driver. So may be we want to 
> add QEMUFileDriver struct, to define these functions as callbacks, 
> move old realizations to default driver, and add new functionality as 
> a new driver, what do you think?
Yes, it looks like that, but on the other hand I can't imagine another 
driver being added, and converting the code to a driver scheme would 
involve adding more code. So should we really do that? Anyway, your 
suggestion looks cleaner.
>
> 2. Terminology: you say you add buffered mode, but actually qemu file 
> is already work in buffered mode, so it should be clarified somehow..
The initial implementation is mixed: it has a buffer 
and an iovec array. The buffer is used to store *some* parts (excluding 
RAM) of a vm state plus service information. Everything written to the 
buffer is added to the iovec array as a separate entry. RAM pages aren't 
added to the buffer; instead they are added to the iovec array directly, 
without copying to the buffer. This is why we almost always get an iovec 
array consisting of size- and pointer-unaligned iovecs, and why we have 
the performance issues (described in more detail in patch 0000 of this 
series).
So, I can't say that qemu-file has a "real" buffered mode.
> You also add asynchronisity, but old implementation has already 
> qemu_put_buffer_async..
In my opinion, this function name is kind of confusing. What the 
function does is add the buffer pointer to the internal iovec array 
without copying the buffer. It's not related to any asynchronous operation.
> You use aio task pool, but don't say that it may be used only from 
> coroutine.
> May be, we'd better call it "coroutine-mode" ?
I don't think that mentioning any implementation-specific name is a good 
idea. aio_task_pool is a good option for implementing the async operation, 
but it could be any other interface.
I'd rather just assert qemu_in_coroutine() when in "buffered mode".
>
>
> Also, am I correct that new mode can't be used for read? Should we 
> document/assert it somehow? 
It can't, for now, since buffered reading isn't implemented. 
Enabling buffered mode for reading changes nothing - qemu-file works as 
before.
The same qemu-file instance can't be opened for reading and writing at 
the same time. I'd add an assert for that on qemu-file open.
> Or may be support reading? Will switch to snapshot benefit if we 
> implement reading in a new mode?
I thought about that. I think it could benefit from some kind of 
read-ahead into a number of buffers, while the "current" buffer is used to 
fill the vm state.
I didn't want to focus on the reading improvements because reading isn't 
as big a problem as writing: the qemu-file reading path already uses a 
buffer and fills that buffer with a single io operation.

>
>>
>> Signed-off-by: Denis Plotnikov <dplotnikov@virtuozzo.com>
>> ---
>>   include/qemu/typedefs.h |   1 +
>>   migration/qemu-file.c   | 351 
>> +++++++++++++++++++++++++++++++++++++++++++++---
>>   migration/qemu-file.h   |   9 ++
>>   3 files changed, 339 insertions(+), 22 deletions(-)
>>
>> diff --git a/include/qemu/typedefs.h b/include/qemu/typedefs.h
>> index 88dce54..9b388c8 100644
>> --- a/include/qemu/typedefs.h
>> +++ b/include/qemu/typedefs.h
>> @@ -98,6 +98,7 @@ typedef struct QEMUBH QEMUBH;
>>   typedef struct QemuConsole QemuConsole;
>>   typedef struct QEMUFile QEMUFile;
>>   typedef struct QEMUFileBuffer QEMUFileBuffer;
>> +typedef struct QEMUFileAioTask QEMUFileAioTask;
>>   typedef struct QemuLockable QemuLockable;
>>   typedef struct QemuMutex QemuMutex;
>>   typedef struct QemuOpt QemuOpt;
>> diff --git a/migration/qemu-file.c b/migration/qemu-file.c
>> index 285c6ef..f42f949 100644
>> --- a/migration/qemu-file.c
>> +++ b/migration/qemu-file.c
>> @@ -29,19 +29,25 @@
>>   #include "qemu-file.h"
>>   #include "trace.h"
>>   #include "qapi/error.h"
>> +#include "block/aio_task.h"
>>   -#define IO_BUF_SIZE 32768
>> +#define IO_BUF_SIZE (1024 * 1024)
>>   #define MAX_IOV_SIZE MIN(IOV_MAX, 64)
>> +#define IO_BUF_NUM 2
>
> Interesting, how much is it better than if we set to 1, limiting the 
> influence of the series to alignment of written chunks?
>> +#define IO_BUF_ALIGNMENT 512
>>   -QEMU_BUILD_BUG_ON(!QEMU_IS_ALIGNED(IO_BUF_SIZE, 512));
>> +QEMU_BUILD_BUG_ON(!QEMU_IS_ALIGNED(IO_BUF_SIZE, IO_BUF_ALIGNMENT));
>> +QEMU_BUILD_BUG_ON(IO_BUF_SIZE > INT_MAX);
>> +QEMU_BUILD_BUG_ON(IO_BUF_NUM <= 0);
>>     struct QEMUFileBuffer {
>>       int buf_index;
>> -    int buf_size; /* 0 when writing */
>> +    int buf_size; /* 0 when non-buffered writing */
>>       uint8_t *buf;
>>       unsigned long *may_free;
>>       struct iovec *iov;
>>       unsigned int iovcnt;
>> +    QLIST_ENTRY(QEMUFileBuffer) link;
>>   };
>>     struct QEMUFile {
>> @@ -60,6 +66,22 @@ struct QEMUFile {
>>       bool shutdown;
>>       /* currently used buffer */
>>       QEMUFileBuffer *current_buf;
>> +    /*
>> +     * with buffered_mode enabled all the data copied to 512 byte
>> +     * aligned buffer, including iov data. Then the buffer is passed
>> +     * to writev_buffer callback.
>> +     */
>> +    bool buffered_mode;
>> +    /* for async buffer writing */
>> +    AioTaskPool *pool;
>> +    /* the list of free buffers, currently used on is NOT there */
>> +    QLIST_HEAD(, QEMUFileBuffer) free_buffers;
>> +};
>> +
>> +struct QEMUFileAioTask {
>> +    AioTask task;
>> +    QEMUFile *f;
>> +    QEMUFileBuffer *fb;
>>   };
>>     /*
>> @@ -115,10 +137,42 @@ QEMUFile *qemu_fopen_ops(void *opaque, const 
>> QEMUFileOps *ops)
>>       f->opaque = opaque;
>>       f->ops = ops;
>>   -    f->current_buf = g_new0(QEMUFileBuffer, 1);
>> -    f->current_buf->buf = g_malloc(IO_BUF_SIZE);
>> -    f->current_buf->iov = g_new0(struct iovec, MAX_IOV_SIZE);
>> -    f->current_buf->may_free = bitmap_new(MAX_IOV_SIZE);
>> +    if (f->ops->enable_buffered) {
>> +        f->buffered_mode = f->ops->enable_buffered(f->opaque);
>> +    }
>> +
>> +    if (f->buffered_mode && qemu_file_is_writable(f)) {
>
> So we actually go to buffered_mode if file is writable. Then, 
> shouldn't we otherwise set buffered_mode to false otherwise?
buffered_mode is initialized with false
>
>> +        int i;
>> +        /*
>> +         * in buffered_mode we don't use internal io vectors
>> +         * and may_free bitmap, because we copy the data to be
>> +         * written right away to the buffer
>> +         */
>> +        f->pool = aio_task_pool_new(IO_BUF_NUM);
>> +
>> +        /* allocate io buffers */
>> +        for (i = 0; i < IO_BUF_NUM; i++) {
>> +            QEMUFileBuffer *fb = g_new0(QEMUFileBuffer, 1);
>> +
>> +            fb->buf = qemu_memalign(IO_BUF_ALIGNMENT, IO_BUF_SIZE);
>> +            fb->buf_size = IO_BUF_SIZE;
>> +
>> +            /*
>> +             * put the first buffer to the current buf and the rest
>> +             * to the list of free buffers
>> +             */
>> +            if (i == 0) {
>> +                f->current_buf = fb;
>> +            } else {
>> +                QLIST_INSERT_HEAD(&f->free_buffers, fb, link);
>> +            }
>> +        }
>> +    } else {
>> +        f->current_buf = g_new0(QEMUFileBuffer, 1);
>> +        f->current_buf->buf = g_malloc(IO_BUF_SIZE);
>> +        f->current_buf->iov = g_new0(struct iovec, MAX_IOV_SIZE);
>> +        f->current_buf->may_free = bitmap_new(MAX_IOV_SIZE);
>> +    }
>>         return f;
>>   }
>> @@ -190,6 +244,8 @@ static void qemu_iovec_release_ram(QEMUFile *f)
>>       unsigned long idx;
>>       QEMUFileBuffer *fb = f->current_buf;
>>   +    assert(!f->buffered_mode);
>> +
>>       /* Find and release all the contiguous memory ranges marked as 
>> may_free. */
>>       idx = find_next_bit(fb->may_free, fb->iovcnt, 0);
>>       if (idx >= fb->iovcnt) {
>> @@ -221,6 +277,147 @@ static void qemu_iovec_release_ram(QEMUFile *f)
>>       bitmap_zero(fb->may_free, MAX_IOV_SIZE);
>>   }
>>   +static void advance_buf_ptr(QEMUFile *f, size_t size)
>> +{
>> +    QEMUFileBuffer *fb = f->current_buf;
>> +    /* must not advance to 0 */
>> +    assert(size);
>> +    /* must not overflow buf_index (int) */
>> +    assert(fb->buf_index + size <= INT_MAX);
>
> to not overflow in check: assert(fb->buf_index <= INT_MAX - size)
good catch
>
>> +    /* must not exceed buf_size */
>> +    assert(fb->buf_index + size <= fb->buf_size);
>> +
>> +    fb->buf_index += size;
>> +}
>> +
>> +static size_t get_buf_free_size(QEMUFile *f)
>> +{
>> +    QEMUFileBuffer *fb = f->current_buf;
>> +    /* buf_index can't be greated than buf_size */
>> +    assert(fb->buf_size >= fb->buf_index);
>> +    return fb->buf_size - fb->buf_index;
>> +}
>> +
>> +static size_t get_buf_used_size(QEMUFile *f)
>> +{
>> +    QEMUFileBuffer *fb = f->current_buf;
>> +    return fb->buf_index;
>> +}
>> +
>> +static uint8_t *get_buf_ptr(QEMUFile *f)
>> +{
>> +    QEMUFileBuffer *fb = f->current_buf;
>> +    /* protects from out of bound reading */
>> +    assert(fb->buf_index <= IO_BUF_SIZE);
>> +    return fb->buf + fb->buf_index;
>> +}
>> +
>> +static bool buf_is_full(QEMUFile *f)
>> +{
>> +    return get_buf_free_size(f) == 0;
>> +}
>> +
>> +static void reset_buf(QEMUFile *f)
>> +{
>> +    QEMUFileBuffer *fb = f->current_buf;
>> +    fb->buf_index = 0;
>> +}
>> +
>> +static int write_task_fn(AioTask *task)
>> +{
>> +    int ret;
>> +    Error *local_error = NULL;
>> +    QEMUFileAioTask *t = (QEMUFileAioTask *) task;
>> +    QEMUFile *f = t->f;
>> +    QEMUFileBuffer *fb = t->fb;
>> +    uint64_t pos = f->pos;
>> +    struct iovec v = (struct iovec) {
>> +        .iov_base = fb->buf,
>> +        .iov_len = fb->buf_index,
>> +    };
>> +
>> +    assert(f->buffered_mode);
>> +
>> +    /*
>> +     * Increment file position.
>> +     * This needs to be here before calling writev_buffer, because
>> +     * writev_buffer is asynchronous and there could be more than one
>> +     * writev_buffer started simultaniously. Each writev_buffer should
>> +     * use its own file pos to write to. writev_buffer may write less
>> +     * than buf_index bytes but we treat this situation as an error.
>> +     * If error appeared, further file using is meaningless.
>> +     * We expect that, the most of the time the full buffer is written,
>> +     * (when buf_size == buf_index). The only case when the non-full
>> +     * buffer is written (buf_size != buf_index) is file close,
>> +     * when we need to flush the rest of the buffer content.
>> +     */
>> +    f->pos += fb->buf_index;
>
> Seems safer to add pos to QEMUFileAioTask instead, and manage global 
> f->pos
> in main coroutine, not in tasks.
I also thought about that, but I didn't find any benefit, because with 
coroutines we shouldn't get race problems.
>
>> +
>> +    ret = f->ops->writev_buffer(f->opaque, &v, 1, pos, &local_error);
>> +
>> +    /* return the just written buffer to the free list */
>> +    QLIST_INSERT_HEAD(&f->free_buffers, fb, link);
>> +
>> +    /* check that we have written everything */
>> +    if (ret != fb->buf_index) {
>> +        qemu_file_set_error_obj(f, ret < 0 ? ret : -EIO, local_error);
>> +    }
>> +
>> +    /*
>> +     * always return 0 - don't use task error handling, relay on
>> +     * qemu file error handling
>> +     */
>> +    return 0;
>> +}
>> +
>> +static void qemu_file_switch_current_buf(QEMUFile *f)
>> +{
>> +    /*
>> +     * if the list is empty, wait until some task returns a buffer
>> +     * to the list of free buffers.
>> +     */
>> +    if (QLIST_EMPTY(&f->free_buffers)) {
>> +        aio_task_pool_wait_slot(f->pool);
>> +    }
>> +
>> +    /*
>> +     * sanity check that the list isn't empty
>> +     * if the free list was empty, we waited for a task complition,
>> +     * and the pompleted task must return a buffer to a list of free 
>> buffers
>> +     */
>> +    assert(!QLIST_EMPTY(&f->free_buffers));
>> +
>> +    /* set the current buffer for using from the free list */
>> +    f->current_buf = QLIST_FIRST(&f->free_buffers);
>> +    reset_buf(f);
>> +
>> +    QLIST_REMOVE(f->current_buf, link);
>> +}
>> +
>> +/**
>> + *  Asynchronously flushes QEMUFile buffer
>> + *
>> + * This will flush all pending data. If data was only partially 
>> flushed, it
>> + * will set an error state. The function may return before the data 
>> actually
>> + * written.
>> + */
>> +static void flush_buffer(QEMUFile *f)
>> +{
>> +    QEMUFileAioTask *t = g_new(QEMUFileAioTask, 1);
>> +
>> +    *t = (QEMUFileAioTask) {
>> +        .task.func = &write_task_fn,
>> +        .f = f,
>> +        .fb = f->current_buf,
>> +    };
>> +
>> +    /* aio_task_pool should free t for us */
>> +    aio_task_pool_start_task(f->pool, (AioTask *) t);
>> +
>> +    /* if no errors this will switch the buffer */
>> +    qemu_file_switch_current_buf(f);
>> +}
>> +
>>   /**
>>    * Flushes QEMUFile buffer
>>    *
>> @@ -241,7 +438,13 @@ void qemu_fflush(QEMUFile *f)
>>       if (f->shutdown) {
>>           return;
>>       }
>> +
>> +    if (f->buffered_mode) {
>> +        return;
>
> I don't think it's correct. qemu_fflush is public interface of 
> QEMUFile and it's assumed to write all in-flight data.
Yes, it should when you use sockets and you want to make sure that 
you've sent what your peer is waiting for.
But that isn't the case when writing an internal snapshot, especially 
when you explicitly asked to open the qemu-file in "buffered_mode".
>
>> +    }
>> +
>>       if (fb->iovcnt > 0) {
>> +        /* this is non-buffered mode */
>>           expect = iov_size(fb->iov, fb->iovcnt);
>>           ret = f->ops->writev_buffer(f->opaque, fb->iov, fb->iovcnt, 
>> f->pos,
>>                                       &local_error);
>> @@ -378,6 +581,7 @@ static ssize_t qemu_fill_buffer(QEMUFile *f)
>>     void qemu_update_position(QEMUFile *f, size_t size)
>>   {
>> +    assert(!f->buffered_mode);
>
> Public interface. Why are you sure that it's not used in snapshot? It 
> is used in migration/ram.c ...
Looks like that's on the reading path... I didn't hit that assert on the 
writing path when testing. I added this protection just in case, because ...
>
>>       f->pos += size;
>
> And it shouldn't be the problem for the new mode, just ass pos 
> parameter to QEMUFileAioTask,
...It could be a problem. In the new mode we want our buffers 
size-aligned; if we update pos, we would need to fill the rest of the 
buffer with zeros, submit the buffer for writing, and get another free 
buffer to use.
But I didn't see qemu_update_position() used in the writing path.
>
>>   }
>>   @@ -392,7 +596,18 @@ void qemu_update_position(QEMUFile *f, size_t 
>> size)
>>   int qemu_fclose(QEMUFile *f)
>>   {
>>       int ret;
>> -    qemu_fflush(f);
>> +
>> +    if (qemu_file_is_writable(f) && f->buffered_mode) {
>> +        ret = qemu_file_get_error(f);
>> +        if (!ret) {
>> +            flush_buffer(f);
>> +        }
>> +        /* wait until all tasks are done */
>> +        aio_task_pool_wait_all(f->pool);
>> +    } else {
>> +        qemu_fflush(f);
>> +    }
>
> Again, not a clean thing: for buffered mode we use flush_buffer, for 
> non-buffered we use qemu_fflush. It's not clear from the function naming.
I will improve the naming.
>
>> +
>>       ret = qemu_file_get_error(f);
>>         if (f->ops->close) {
>> @@ -408,16 +623,77 @@ int qemu_fclose(QEMUFile *f)
>>           ret = f->last_error;
>>       }
>>       error_free(f->last_error_obj);
>> -    g_free(f->current_buf->buf);
>> -    g_free(f->current_buf->iov);
>> -    g_free(f->current_buf->may_free);
>> -    g_free(f->current_buf);
>> +
>> +    if (f->buffered_mode) {
>> +        QEMUFileBuffer *fb, *next;
>> +        /*
>> +         * put the current back to the free buffers list
>> +         * to destroy all the buffers in one loop
>> +         */
>> +        QLIST_INSERT_HEAD(&f->free_buffers, f->current_buf, link);
>> +
>> +        /* destroy all the buffers */
>> +        QLIST_FOREACH_SAFE(fb, &f->free_buffers, link, next) {
>> +            QLIST_REMOVE(fb, link);
>> +            /* looks like qemu_vfree pairs with qemu_memalign */
>> +            qemu_vfree(fb->buf);
>> +            g_free(fb);
>> +        }
>> +        g_free(f->pool);
>> +    } else {
>> +        g_free(f->current_buf->buf);
>> +        g_free(f->current_buf->iov);
>> +        g_free(f->current_buf->may_free);
>> +        g_free(f->current_buf);
>> +    }
>> +
>>       g_free(f);
>>       trace_qemu_file_fclose();
>>       return ret;
>>   }
>>     /*
>> + * Copy an external buffer to the intenal current buffer.
>> + */
>> +static void copy_buf(QEMUFile *f, const uint8_t *buf, size_t size,
>> +                     bool may_free)
>> +{
>> +    size_t data_size = size;
>> +    const uint8_t *src_ptr = buf;
>> +
>> +    assert(f->buffered_mode);
>> +    assert(size <= INT_MAX);
>> +
>> +    while (data_size > 0) {
>> +        size_t chunk_size;
>> +
>> +        if (buf_is_full(f)) {
>> +            flush_buffer(f);
>> +            if (qemu_file_get_error(f)) {
>> +                return;
>> +            }
>> +        }
>> +
>> +        chunk_size = MIN(get_buf_free_size(f), data_size);
>> +
>> +        memcpy(get_buf_ptr(f), src_ptr, chunk_size);
>> +
>> +        advance_buf_ptr(f, chunk_size);
>> +
>> +        src_ptr += chunk_size;
>> +        data_size -= chunk_size;
>> +        f->bytes_xfer += chunk_size;
>> +    }
>> +
>> +    if (may_free) {
>> +        if (qemu_madvise((void *) buf, size, QEMU_MADV_DONTNEED) < 0) {
>> +            error_report("migrate: madvise DONTNEED failed %p %zd: %s",
>> +                         buf, size, strerror(errno));
>> +        }
>> +    }
>> +}
>> +
>> +/*
>>    * Add buf to iovec. Do flush if iovec is full.
>>    *
>>    * Return values:
>> @@ -454,6 +730,9 @@ static int add_to_iovec(QEMUFile *f, const 
>> uint8_t *buf, size_t size,
>>   static void add_buf_to_iovec(QEMUFile *f, size_t len)
>>   {
>>       QEMUFileBuffer *fb = f->current_buf;
>> +
>> +    assert(!f->buffered_mode);
>> +
>>       if (!add_to_iovec(f, fb->buf + fb->buf_index, len, false)) {
>>           fb->buf_index += len;
>>           if (fb->buf_index == IO_BUF_SIZE) {
>> @@ -469,8 +748,12 @@ void qemu_put_buffer_async(QEMUFile *f, const 
>> uint8_t *buf, size_t size,
>>           return;
>>       }
>>   -    f->bytes_xfer += size;
>> -    add_to_iovec(f, buf, size, may_free);
>> +    if (f->buffered_mode) {
>> +        copy_buf(f, buf, size, may_free);
>> +    } else {
>> +        f->bytes_xfer += size;
>> +        add_to_iovec(f, buf, size, may_free);
>> +    }
>>   }
>>     void qemu_put_buffer(QEMUFile *f, const uint8_t *buf, size_t size)
>> @@ -482,6 +765,11 @@ void qemu_put_buffer(QEMUFile *f, const uint8_t 
>> *buf, size_t size)
>>           return;
>>       }
>>   +    if (f->buffered_mode) {
>> +        copy_buf(f, buf, size, false);
>> +        return;
>> +    }
>> +
>>       while (size > 0) {
>>           l = IO_BUF_SIZE - fb->buf_index;
>>           if (l > size) {
>> @@ -506,15 +794,21 @@ void qemu_put_byte(QEMUFile *f, int v)
>>           return;
>>       }
>>   -    fb->buf[fb->buf_index] = v;
>> -    f->bytes_xfer++;
>> -    add_buf_to_iovec(f, 1);
>> +    if (f->buffered_mode) {
>> +        copy_buf(f, (const uint8_t *) &v, 1, false);
>> +    } else {
>> +        fb->buf[fb->buf_index] = v;
>> +        add_buf_to_iovec(f, 1);
>> +        f->bytes_xfer++;
>> +    }
>>   }
>>     void qemu_file_skip(QEMUFile *f, int size)
>>   {
>>       QEMUFileBuffer *fb = f->current_buf;
>>   +    assert(!f->buffered_mode);
>> +
>>       if (fb->buf_index + size <= fb->buf_size) {
>>           fb->buf_index += size;
>>       }
>> @@ -672,10 +966,14 @@ int64_t qemu_ftell_fast(QEMUFile *f)
>>   {
>>       int64_t ret = f->pos;
>>       int i;
>> -    QEMUFileBuffer *fb = f->current_buf;
>>   -    for (i = 0; i < fb->iovcnt; i++) {
>> -        ret += fb->iov[i].iov_len;
>> +    if (f->buffered_mode) {
>> +        ret += get_buf_used_size(f);
>> +    } else {
>> +        QEMUFileBuffer *fb = f->current_buf;
>> +        for (i = 0; i < fb->iovcnt; i++) {
>> +            ret += fb->iov[i].iov_len;
>> +        }
>>       }
>>         return ret;
>> @@ -683,8 +981,12 @@ int64_t qemu_ftell_fast(QEMUFile *f)
>>     int64_t qemu_ftell(QEMUFile *f)
>>   {
>> -    qemu_fflush(f);
>> -    return f->pos;
>> +    if (f->buffered_mode) {
>> +        return qemu_ftell_fast(f);
>> +    } else {
>> +        qemu_fflush(f);
>> +        return f->pos;
>> +    }
>>   }
>>     int qemu_file_rate_limit(QEMUFile *f)
>> @@ -803,6 +1105,8 @@ ssize_t qemu_put_compression_data(QEMUFile *f, 
>> z_stream *stream,
>>       QEMUFileBuffer *fb = f->current_buf;
>>       ssize_t blen = IO_BUF_SIZE - fb->buf_index - sizeof(int32_t);
>>   +    assert(!f->buffered_mode);
>> +
>>       if (blen < compressBound(size)) {
>>           return -1;
>>       }
>> @@ -827,6 +1131,9 @@ int qemu_put_qemu_file(QEMUFile *f_des, QEMUFile 
>> *f_src)
>>       int len = 0;
>>       QEMUFileBuffer *fb_src = f_src->current_buf;
>>   +    assert(!f_des->buffered_mode);
>> +    assert(!f_src->buffered_mode);
>> +
>>       if (fb_src->buf_index > 0) {
>>           len = fb_src->buf_index;
>>           qemu_put_buffer(f_des, fb_src->buf, fb_src->buf_index);
>> diff --git a/migration/qemu-file.h b/migration/qemu-file.h
>> index a9b6d6c..08655d2 100644
>> --- a/migration/qemu-file.h
>> +++ b/migration/qemu-file.h
>> @@ -103,6 +103,14 @@ typedef QEMUFile *(QEMURetPathFunc)(void *opaque);
>>   typedef int (QEMUFileShutdownFunc)(void *opaque, bool rd, bool wr,
>>                                      Error **errp);
>>   +/*
>> + * Enables or disables the buffered mode
>> + * Existing blocking reads/writes must be woken
>> + * Returns true if the buffered mode has to be enabled,
>> + * false if it has to be disabled.
>> + */
>> +typedef bool (QEMUFileEnableBufferedFunc)(void *opaque);
>> +
>>   typedef struct QEMUFileOps {
>>       QEMUFileGetBufferFunc *get_buffer;
>>       QEMUFileCloseFunc *close;
>> @@ -110,6 +118,7 @@ typedef struct QEMUFileOps {
>>       QEMUFileWritevBufferFunc *writev_buffer;
>>       QEMURetPathFunc *get_return_path;
>>       QEMUFileShutdownFunc *shut_down;
>> +    QEMUFileEnableBufferedFunc *enable_buffered;
>>   } QEMUFileOps;
>>     typedef struct QEMUFileHooks {
>>
>
>




* Re: [RFC patch v1 2/3] qemu-file: add buffered mode
  2020-04-24 21:25   ` Eric Blake
@ 2020-04-27  8:21     ` Denis Plotnikov
  0 siblings, 0 replies; 19+ messages in thread
From: Denis Plotnikov @ 2020-04-27  8:21 UTC (permalink / raw)
  To: Eric Blake, qemu-devel; +Cc: den, dgilbert, quintela



On 25.04.2020 00:25, Eric Blake wrote:
> On 4/13/20 6:12 AM, Denis Plotnikov wrote:
>> The patch adds ability to qemu-file to write the data
>> asynchronously to improve the performance on writing.
>> Before, only synchronous writing was supported.
>>
>> Enabling of the asyncronous mode is managed by new
>
> asynchronous
>
>> "enabled_buffered" callback.
>
> The term "enabled_buffered" does not appear in the patch.  Did you 
> mean...
>
>>
>> Signed-off-by: Denis Plotnikov <dplotnikov@virtuozzo.com>
>> ---
>>   include/qemu/typedefs.h |   1 +
>>   migration/qemu-file.c   | 351 
>> +++++++++++++++++++++++++++++++++++++++++++++---
>>   migration/qemu-file.h   |   9 ++
>>   3 files changed, 339 insertions(+), 22 deletions(-)
>>
>
>> @@ -60,6 +66,22 @@ struct QEMUFile {
>>       bool shutdown;
>>       /* currently used buffer */
>>       QEMUFileBuffer *current_buf;
>> +    /*
>> +     * with buffered_mode enabled all the data copied to 512 byte
>> +     * aligned buffer, including iov data. Then the buffer is passed
>> +     * to writev_buffer callback.
>> +     */
>> +    bool buffered_mode;
>
> ..."Asynchronous mode is managed by setting the new buffered_mode 
> flag"?  ...
>
>
>> +    /* for async buffer writing */
>> +    AioTaskPool *pool;
>> +    /* the list of free buffers, currently used on is NOT there */
>
> s/on/one/
>
>> +    QLIST_HEAD(, QEMUFileBuffer) free_buffers;
>> +};
>> +
>> +struct QEMUFileAioTask {
>> +    AioTask task;
>> +    QEMUFile *f;
>> +    QEMUFileBuffer *fb;
>>   };
>>     /*
>> @@ -115,10 +137,42 @@ QEMUFile *qemu_fopen_ops(void *opaque, const 
>> QEMUFileOps *ops)
>>       f->opaque = opaque;
>>       f->ops = ops;
>>   -    f->current_buf = g_new0(QEMUFileBuffer, 1);
>> -    f->current_buf->buf = g_malloc(IO_BUF_SIZE);
>> -    f->current_buf->iov = g_new0(struct iovec, MAX_IOV_SIZE);
>> -    f->current_buf->may_free = bitmap_new(MAX_IOV_SIZE);
>> +    if (f->ops->enable_buffered) {
>> +        f->buffered_mode = f->ops->enable_buffered(f->opaque);
>
> ...ah, you meant 'enable_buffered'.  But still, why do we need a 
> callback function?  Is it not sufficient to just have a bool flag?
>
>
>> +static size_t get_buf_free_size(QEMUFile *f)
>> +{
>> +    QEMUFileBuffer *fb = f->current_buf;
>> +    /* buf_index can't be greated than buf_size */
>
> greater
>
>> +    assert(fb->buf_size >= fb->buf_index);
>> +    return fb->buf_size - fb->buf_index;
>> +}
>> +
>
>> +static int write_task_fn(AioTask *task)
>> +{
>
>> +    /*
>> +     * Increment file position.
>> +     * This needs to be here before calling writev_buffer, because
>> +     * writev_buffer is asynchronous and there could be more than one
>> +     * writev_buffer started simultaniously. Each writev_buffer should
>
> simultaneously
>
>> +     * use its own file pos to write to. writev_buffer may write less
>> +     * than buf_index bytes but we treat this situation as an error.
>> +     * If error appeared, further file using is meaningless.
>
> s/using/use/
>
>> +     * We expect that, the most of the time the full buffer is written,
>> +     * (when buf_size == buf_index). The only case when the non-full
>> +     * buffer is written (buf_size != buf_index) is file close,
>> +     * when we need to flush the rest of the buffer content.
>
> We expect that most of the time, the full buffer will be written 
> (buf_size == buf_index), with the exception at file close where we 
> need to flush the final partial buffer.
>
>> +     */
>> +    f->pos += fb->buf_index;
>> +
>> +    ret = f->ops->writev_buffer(f->opaque, &v, 1, pos, &local_error);
>> +
>> +    /* return the just written buffer to the free list */
>> +    QLIST_INSERT_HEAD(&f->free_buffers, fb, link);
>> +
>> +    /* check that we have written everything */
>> +    if (ret != fb->buf_index) {
>> +        qemu_file_set_error_obj(f, ret < 0 ? ret : -EIO, local_error);
>> +    }
>> +
>> +    /*
>> +     * always return 0 - don't use task error handling, relay on
>
> rely
>
>> +     * qemu file error handling
>> +     */
>> +    return 0;
>> +}
>> +
>> +static void qemu_file_switch_current_buf(QEMUFile *f)
>> +{
>> +    /*
>> +     * if the list is empty, wait until some task returns a buffer
>> +     * to the list of free buffers.
>> +     */
>> +    if (QLIST_EMPTY(&f->free_buffers)) {
>> +        aio_task_pool_wait_slot(f->pool);
>> +    }
>> +
>> +    /*
>> +     * sanity check that the list isn't empty
>> +     * if the free list was empty, we waited for a task complition,
>
> completion
>
>> +     * and the pompleted task must return a buffer to a list of free 
>> buffers
>
> completed
>
>> +     */
>> +    assert(!QLIST_EMPTY(&f->free_buffers));
>> +
>> +    /* set the current buffer for using from the free list */
>> +    f->current_buf = QLIST_FIRST(&f->free_buffers);
>> +    reset_buf(f);
>> +
>> +    QLIST_REMOVE(f->current_buf, link);
>> +}
>> +
>
>>     /*
>> + * Copy an external buffer to the intenal current buffer.
>
> internal
>
>> + */
>> +static void copy_buf(QEMUFile *f, const uint8_t *buf, size_t size,
>> +                     bool may_free)
>> +{
>
>> +++ b/migration/qemu-file.h
>> @@ -103,6 +103,14 @@ typedef QEMUFile *(QEMURetPathFunc)(void *opaque);
>>   typedef int (QEMUFileShutdownFunc)(void *opaque, bool rd, bool wr,
>>                                      Error **errp);
>>   +/*
>> + * Enables or disables the buffered mode
>> + * Existing blocking reads/writes must be woken
>> + * Returns true if the buffered mode has to be enabled,
>> + * false if it has to be disabled.
>> + */
>> +typedef bool (QEMUFileEnableBufferedFunc)(void *opaque);
>
> If this never gets called outside of initial creation of the QemuFile 
> (that is, it is not dynamic), then making it a straight flag instead 
> of a callback function is simpler.
Yes, I agree.

Thanks for the review and all the grammar fixes!
>
>
>> +
>>   typedef struct QEMUFileOps {
>>       QEMUFileGetBufferFunc *get_buffer;
>>       QEMUFileCloseFunc *close;
>> @@ -110,6 +118,7 @@ typedef struct QEMUFileOps {
>>       QEMUFileWritevBufferFunc *writev_buffer;
>>       QEMURetPathFunc *get_return_path;
>>       QEMUFileShutdownFunc *shut_down;
>> +    QEMUFileEnableBufferedFunc *enable_buffered;
>>   } QEMUFileOps;
>>     typedef struct QEMUFileHooks {
>>
>




* Re: [RFC patch v1 2/3] qemu-file: add buffered mode
  2020-04-27  8:19     ` Denis Plotnikov
@ 2020-04-27 11:04       ` Vladimir Sementsov-Ogievskiy
  0 siblings, 0 replies; 19+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-04-27 11:04 UTC (permalink / raw)
  To: Denis Plotnikov, qemu-devel; +Cc: den, dgilbert, quintela

27.04.2020 11:19, Denis Plotnikov wrote:
> 
> 
> On 25.04.2020 12:10, Vladimir Sementsov-Ogievskiy wrote:
>> 13.04.2020 14:12, Denis Plotnikov wrote:
>>> The patch adds ability to qemu-file to write the data
>>> asynchronously to improve the performance on writing.
>>> Before, only synchronous writing was supported.
>>>
>>> Enabling of the asyncronous mode is managed by new
>>> "enabled_buffered" callback.
>>
>> Hmm.
>>
>> I don't like the resulting architecture very much:
>>
>> 1. Function naming is not clean: how can I understand that copy_buf is for buffered mode when add_to_iovec is more or less the same thing for non-buffered mode?
> Yes, this is to be changed in the next patches
>>
>> Hmm actually, you just alter several significant functions of QEMUFile - open, close, put, flush. In the old mode we do one thing, in the new mode - something absolutely different. This looks like a driver. So maybe we want to add a QEMUFileDriver struct, define these functions as callbacks, move the old implementations to a default driver, and add the new functionality as a new driver; what do you think?
> Yes, it looks like that, but on the other hand I can't imagine another driver being added, and converting the code to the driver scheme would mean adding more code. So, should we really do that? Anyway, your suggestion looks cleaner.
>>
>> 2. Terminology: you say you add buffered mode, but actually qemu-file already works in a buffered mode, so this should be clarified somehow.
> The initial implementation is mixed: it has both a buffer and an iovec array. The buffer is used to store *some* parts of a vm state (excluding RAM) plus service information. Each chunk written to the buffer is added to the iovec array as a separate entry. RAM pages aren't added to the buffer; instead, they are added to the iovec array directly, without copying to the buffer. This is why we almost always get an iovec array consisting of size- and pointer-unaligned iovecs, and why we have the performance issues (detailed in the cover letter of this series).
> So, I can't say that qemu-file has a "real" buffered mode.
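A minimal standalone sketch of the alignment check behind the slow path described above (hypothetical, simplified; the real logic lives in the O_DIRECT block backend): only when every base and length in the vector is aligned can the backend submit it as-is, otherwise it bounces the data through an aligned buffer and does a synchronous write.

```c
#include <stdbool.h>
#include <stdint.h>
#include <sys/uio.h>

#define MEM_ALIGN 512

/*
 * True iff every iovec base and length is aligned; only then can an
 * O_DIRECT backend submit the vector directly.  A single 8-byte
 * ram-page delimiter entry makes the whole vector take the slow path.
 */
static bool qiov_is_aligned(const struct iovec *iov, int cnt)
{
    for (int i = 0; i < cnt; i++) {
        if ((uintptr_t)iov[i].iov_base % MEM_ALIGN != 0 ||
            iov[i].iov_len % MEM_ALIGN != 0) {
            return false;
        }
    }
    return true;
}

/* demo storage, aligned the way a bounce buffer would be */
static unsigned char page[4096] __attribute__((aligned(512)));
```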

But if someone comes and tries to understand the code, it will be difficult because of such naming. If we keep it, we should add your description as a comment next to that name.

>> You also add asynchronicity, but the old implementation already has qemu_put_buffer_async..
> In my opinion, that function name is kind of confusing. What the function does is add the buffer pointer to the internal iovec array without copying the buffer. It's not related to any asynchronous operation.
>> You use the aio task pool, but don't say that it may be used only from a coroutine.
>> Maybe we'd better call it "coroutine-mode"?
> I don't think that mentioning any implementation-specific name is a good idea. aio_task_pool is a good option to implement the async operation, but it could be any other interface.
> I'd rather just assert qemu_in_coroutine() when in "buffered-mode".
>>
>>
>> Also, am I correct that the new mode can't be used for reads? Should we document/assert it somehow?
> It can't, just for now, since read-buffered mode isn't implemented. Enabling buffered-mode for reading affects nothing: qemu-file works as before.
> The same qemu-file instance can't be opened for reading and writing at the same time. I'd add that assert on qemu-file open.

Ah, OK. I saw it, but somehow thought that enabling writing doesn't protect against reading.

>> Or maybe support reading? Would switching to a snapshot benefit if we implement reading in the new mode?
> I thought about that. I think it could benefit from some kind of read-ahead into a number of buffers, while the "current" buffer is used to fill the vm state.
> I didn't want to focus on the reading improvements because it's not that big a problem in comparison to writing. The qemu-file reading uses a buffer and fills that buffer with a single io operation.

OK

> 
>>
>>>
>>> Signed-off-by: Denis Plotnikov <dplotnikov@virtuozzo.com>
>>> ---
>>>   include/qemu/typedefs.h |   1 +
>>>   migration/qemu-file.c   | 351 +++++++++++++++++++++++++++++++++++++++++++++---
>>>   migration/qemu-file.h   |   9 ++
>>>   3 files changed, 339 insertions(+), 22 deletions(-)
>>>
>>> diff --git a/include/qemu/typedefs.h b/include/qemu/typedefs.h
>>> index 88dce54..9b388c8 100644
>>> --- a/include/qemu/typedefs.h
>>> +++ b/include/qemu/typedefs.h
>>> @@ -98,6 +98,7 @@ typedef struct QEMUBH QEMUBH;
>>>   typedef struct QemuConsole QemuConsole;
>>>   typedef struct QEMUFile QEMUFile;
>>>   typedef struct QEMUFileBuffer QEMUFileBuffer;
>>> +typedef struct QEMUFileAioTask QEMUFileAioTask;
>>>   typedef struct QemuLockable QemuLockable;
>>>   typedef struct QemuMutex QemuMutex;
>>>   typedef struct QemuOpt QemuOpt;
>>> diff --git a/migration/qemu-file.c b/migration/qemu-file.c
>>> index 285c6ef..f42f949 100644
>>> --- a/migration/qemu-file.c
>>> +++ b/migration/qemu-file.c
>>> @@ -29,19 +29,25 @@
>>>   #include "qemu-file.h"
>>>   #include "trace.h"
>>>   #include "qapi/error.h"
>>> +#include "block/aio_task.h"
>>>   -#define IO_BUF_SIZE 32768
>>> +#define IO_BUF_SIZE (1024 * 1024)
>>>   #define MAX_IOV_SIZE MIN(IOV_MAX, 64)
>>> +#define IO_BUF_NUM 2
>>
>> Interesting, how much better is it than if we set it to 1, limiting the influence of the series to the alignment of written chunks?
>>> +#define IO_BUF_ALIGNMENT 512
>>>   -QEMU_BUILD_BUG_ON(!QEMU_IS_ALIGNED(IO_BUF_SIZE, 512));
>>> +QEMU_BUILD_BUG_ON(!QEMU_IS_ALIGNED(IO_BUF_SIZE, IO_BUF_ALIGNMENT));
>>> +QEMU_BUILD_BUG_ON(IO_BUF_SIZE > INT_MAX);
>>> +QEMU_BUILD_BUG_ON(IO_BUF_NUM <= 0);
>>>     struct QEMUFileBuffer {
>>>       int buf_index;
>>> -    int buf_size; /* 0 when writing */
>>> +    int buf_size; /* 0 when non-buffered writing */
>>>       uint8_t *buf;
>>>       unsigned long *may_free;
>>>       struct iovec *iov;
>>>       unsigned int iovcnt;
>>> +    QLIST_ENTRY(QEMUFileBuffer) link;
>>>   };
>>>     struct QEMUFile {
>>> @@ -60,6 +66,22 @@ struct QEMUFile {
>>>       bool shutdown;
>>>       /* currently used buffer */
>>>       QEMUFileBuffer *current_buf;
>>> +    /*
>>> +     * with buffered_mode enabled all the data copied to 512 byte
>>> +     * aligned buffer, including iov data. Then the buffer is passed
>>> +     * to writev_buffer callback.
>>> +     */
>>> +    bool buffered_mode;
>>> +    /* for async buffer writing */
>>> +    AioTaskPool *pool;
>>> +    /* the list of free buffers, currently used on is NOT there */
>>> +    QLIST_HEAD(, QEMUFileBuffer) free_buffers;
>>> +};
>>> +
>>> +struct QEMUFileAioTask {
>>> +    AioTask task;
>>> +    QEMUFile *f;
>>> +    QEMUFileBuffer *fb;
>>>   };
>>>     /*
>>> @@ -115,10 +137,42 @@ QEMUFile *qemu_fopen_ops(void *opaque, const QEMUFileOps *ops)
>>>       f->opaque = opaque;
>>>       f->ops = ops;
>>>   -    f->current_buf = g_new0(QEMUFileBuffer, 1);
>>> -    f->current_buf->buf = g_malloc(IO_BUF_SIZE);
>>> -    f->current_buf->iov = g_new0(struct iovec, MAX_IOV_SIZE);
>>> -    f->current_buf->may_free = bitmap_new(MAX_IOV_SIZE);
>>> +    if (f->ops->enable_buffered) {
>>> +        f->buffered_mode = f->ops->enable_buffered(f->opaque);
>>> +    }
>>> +
>>> +    if (f->buffered_mode && qemu_file_is_writable(f)) {
>>
>> So we actually go to buffered_mode only if the file is writable. Then, shouldn't we set buffered_mode to false otherwise?
> buffered_mode is initialized with false

I mean, if enable_buffered() returned true but qemu_file_is_writable() returned false... I now think it should be an error.

>>
>>> +        int i;
>>> +        /*
>>> +         * in buffered_mode we don't use internal io vectors
>>> +         * and may_free bitmap, because we copy the data to be
>>> +         * written right away to the buffer
>>> +         */
>>> +        f->pool = aio_task_pool_new(IO_BUF_NUM);
>>> +
>>> +        /* allocate io buffers */
>>> +        for (i = 0; i < IO_BUF_NUM; i++) {
>>> +            QEMUFileBuffer *fb = g_new0(QEMUFileBuffer, 1);
>>> +
>>> +            fb->buf = qemu_memalign(IO_BUF_ALIGNMENT, IO_BUF_SIZE);
>>> +            fb->buf_size = IO_BUF_SIZE;
>>> +
>>> +            /*
>>> +             * put the first buffer to the current buf and the rest
>>> +             * to the list of free buffers
>>> +             */
>>> +            if (i == 0) {
>>> +                f->current_buf = fb;
>>> +            } else {
>>> +                QLIST_INSERT_HEAD(&f->free_buffers, fb, link);
>>> +            }
>>> +        }
>>> +    } else {
>>> +        f->current_buf = g_new0(QEMUFileBuffer, 1);
>>> +        f->current_buf->buf = g_malloc(IO_BUF_SIZE);
>>> +        f->current_buf->iov = g_new0(struct iovec, MAX_IOV_SIZE);
>>> +        f->current_buf->may_free = bitmap_new(MAX_IOV_SIZE);
>>> +    }
>>>         return f;
>>>   }
>>> @@ -190,6 +244,8 @@ static void qemu_iovec_release_ram(QEMUFile *f)
>>>       unsigned long idx;
>>>       QEMUFileBuffer *fb = f->current_buf;
>>>   +    assert(!f->buffered_mode);
>>> +
>>>       /* Find and release all the contiguous memory ranges marked as may_free. */
>>>       idx = find_next_bit(fb->may_free, fb->iovcnt, 0);
>>>       if (idx >= fb->iovcnt) {
>>> @@ -221,6 +277,147 @@ static void qemu_iovec_release_ram(QEMUFile *f)
>>>       bitmap_zero(fb->may_free, MAX_IOV_SIZE);
>>>   }
>>>   +static void advance_buf_ptr(QEMUFile *f, size_t size)
>>> +{
>>> +    QEMUFileBuffer *fb = f->current_buf;
>>> +    /* must not advance to 0 */
>>> +    assert(size);
>>> +    /* must not overflow buf_index (int) */
>>> +    assert(fb->buf_index + size <= INT_MAX);
>>
>> to not overflow in check: assert(fb->buf_index <= INT_MAX - size)
> good catch
>>
>>> +    /* must not exceed buf_size */
>>> +    assert(fb->buf_index + size <= fb->buf_size);
>>> +
>>> +    fb->buf_index += size;
>>> +}
>>> +
>>> +static size_t get_buf_free_size(QEMUFile *f)
>>> +{
>>> +    QEMUFileBuffer *fb = f->current_buf;
>>> +    /* buf_index can't be greated than buf_size */
>>> +    assert(fb->buf_size >= fb->buf_index);
>>> +    return fb->buf_size - fb->buf_index;
>>> +}
>>> +
>>> +static size_t get_buf_used_size(QEMUFile *f)
>>> +{
>>> +    QEMUFileBuffer *fb = f->current_buf;
>>> +    return fb->buf_index;
>>> +}
>>> +
>>> +static uint8_t *get_buf_ptr(QEMUFile *f)
>>> +{
>>> +    QEMUFileBuffer *fb = f->current_buf;
>>> +    /* protects from out of bound reading */
>>> +    assert(fb->buf_index <= IO_BUF_SIZE);
>>> +    return fb->buf + fb->buf_index;
>>> +}
>>> +
>>> +static bool buf_is_full(QEMUFile *f)
>>> +{
>>> +    return get_buf_free_size(f) == 0;
>>> +}
>>> +
>>> +static void reset_buf(QEMUFile *f)
>>> +{
>>> +    QEMUFileBuffer *fb = f->current_buf;
>>> +    fb->buf_index = 0;
>>> +}
>>> +
>>> +static int write_task_fn(AioTask *task)
>>> +{
>>> +    int ret;
>>> +    Error *local_error = NULL;
>>> +    QEMUFileAioTask *t = (QEMUFileAioTask *) task;
>>> +    QEMUFile *f = t->f;
>>> +    QEMUFileBuffer *fb = t->fb;
>>> +    uint64_t pos = f->pos;
>>> +    struct iovec v = (struct iovec) {
>>> +        .iov_base = fb->buf,
>>> +        .iov_len = fb->buf_index,
>>> +    };
>>> +
>>> +    assert(f->buffered_mode);
>>> +
>>> +    /*
>>> +     * Increment file position.
>>> +     * This needs to be here before calling writev_buffer, because
>>> +     * writev_buffer is asynchronous and there could be more than one
>>> +     * writev_buffer started simultaniously. Each writev_buffer should
>>> +     * use its own file pos to write to. writev_buffer may write less
>>> +     * than buf_index bytes but we treat this situation as an error.
>>> +     * If error appeared, further file using is meaningless.
>>> +     * We expect that, the most of the time the full buffer is written,
>>> +     * (when buf_size == buf_index). The only case when the non-full
>>> +     * buffer is written (buf_size != buf_index) is file close,
>>> +     * when we need to flush the rest of the buffer content.
>>> +     */
>>> +    f->pos += fb->buf_index;
>>
>> Seems safer to add pos to QEMUFileAioTask instead, and manage global f->pos
>> in main coroutine, not in tasks.
> I also thought about that, but I didn't find any benefit, because with coroutines we shouldn't get race problems

Still, it would be a more obvious code pattern IMHO: a task is an offset plus bytes, instead of just bytes plus a global offset variable which we have to update with care.
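A minimal sketch of that pattern (hypothetical names, not the actual patch code): the shared cursor is advanced once, at submission time, and each task keeps its own offset, so tasks may complete in any order without touching shared state.

```c
#include <stdint.h>
#include <stddef.h>

/* Each in-flight write carries the file offset it was assigned. */
typedef struct WriteTask {
    uint64_t offset;  /* where this buffer lands in the file */
    size_t len;       /* number of bytes to write */
} WriteTask;

/* Assign the task its slice of the file and advance the shared cursor;
 * after this, the task never reads or updates the global position. */
static WriteTask submit_write(uint64_t *file_pos, size_t len)
{
    WriteTask t = { .offset = *file_pos, .len = len };

    *file_pos += len;
    return t;
}
```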

>>
>>> +
>>> +    ret = f->ops->writev_buffer(f->opaque, &v, 1, pos, &local_error);
>>> +
>>> +    /* return the just written buffer to the free list */
>>> +    QLIST_INSERT_HEAD(&f->free_buffers, fb, link);
>>> +
>>> +    /* check that we have written everything */
>>> +    if (ret != fb->buf_index) {
>>> +        qemu_file_set_error_obj(f, ret < 0 ? ret : -EIO, local_error);
>>> +    }
>>> +
>>> +    /*
>>> +     * always return 0 - don't use task error handling, relay on
>>> +     * qemu file error handling
>>> +     */
>>> +    return 0;
>>> +}
>>> +
>>> +static void qemu_file_switch_current_buf(QEMUFile *f)
>>> +{
>>> +    /*
>>> +     * if the list is empty, wait until some task returns a buffer
>>> +     * to the list of free buffers.
>>> +     */
>>> +    if (QLIST_EMPTY(&f->free_buffers)) {
>>> +        aio_task_pool_wait_slot(f->pool);
>>> +    }
>>> +
>>> +    /*
>>> +     * sanity check that the list isn't empty
>>> +     * if the free list was empty, we waited for a task complition,
>>> +     * and the pompleted task must return a buffer to a list of free buffers
>>> +     */
>>> +    assert(!QLIST_EMPTY(&f->free_buffers));
>>> +
>>> +    /* set the current buffer for using from the free list */
>>> +    f->current_buf = QLIST_FIRST(&f->free_buffers);
>>> +    reset_buf(f);
>>> +
>>> +    QLIST_REMOVE(f->current_buf, link);
>>> +}
>>> +
>>> +/**
>>> + *  Asynchronously flushes QEMUFile buffer
>>> + *
>>> + * This will flush all pending data. If data was only partially flushed, it
>>> + * will set an error state. The function may return before the data actually
>>> + * written.
>>> + */
>>> +static void flush_buffer(QEMUFile *f)
>>> +{
>>> +    QEMUFileAioTask *t = g_new(QEMUFileAioTask, 1);
>>> +
>>> +    *t = (QEMUFileAioTask) {
>>> +        .task.func = &write_task_fn,
>>> +        .f = f,
>>> +        .fb = f->current_buf,
>>> +    };
>>> +
>>> +    /* aio_task_pool should free t for us */
>>> +    aio_task_pool_start_task(f->pool, (AioTask *) t);
>>> +
>>> +    /* if no errors this will switch the buffer */
>>> +    qemu_file_switch_current_buf(f);
>>> +}
>>> +
>>>   /**
>>>    * Flushes QEMUFile buffer
>>>    *
>>> @@ -241,7 +438,13 @@ void qemu_fflush(QEMUFile *f)
>>>       if (f->shutdown) {
>>>           return;
>>>       }
>>> +
>>> +    if (f->buffered_mode) {
>>> +        return;
>>
>> I don't think it's correct. qemu_fflush is a public interface of QEMUFile and it's assumed to write all in-flight data.
> Yes, it should when you use sockets and you want to make sure that you've sent what your peer is waiting for.
> But this isn't the case when you write an internal snapshot, especially when you explicitly ask to open the qemu-file in "buffered_mode".

Hmm, it would be good to add a comment about that in the code.

>>
>>> +    }
>>> +
>>>       if (fb->iovcnt > 0) {
>>> +        /* this is non-buffered mode */
>>>           expect = iov_size(fb->iov, fb->iovcnt);
>>>           ret = f->ops->writev_buffer(f->opaque, fb->iov, fb->iovcnt, f->pos,
>>>                                       &local_error);
>>> @@ -378,6 +581,7 @@ static ssize_t qemu_fill_buffer(QEMUFile *f)
>>>     void qemu_update_position(QEMUFile *f, size_t size)
>>>   {

[..]


-- 
Best regards,
Vladimir



* Re: [RFC patch v1 2/3] qemu-file: add buffered mode
  2020-04-13 11:12 ` [RFC patch v1 2/3] qemu-file: add buffered mode Denis Plotnikov
  2020-04-24 21:25   ` Eric Blake
  2020-04-25  9:10   ` Vladimir Sementsov-Ogievskiy
@ 2020-04-27 12:14   ` Dr. David Alan Gilbert
  2020-04-28  8:06     ` Denis Plotnikov
  2020-05-04  9:08     ` Daniel P. Berrangé
  2 siblings, 2 replies; 19+ messages in thread
From: Dr. David Alan Gilbert @ 2020-04-27 12:14 UTC (permalink / raw)
  To: Denis Plotnikov; +Cc: den, qemu-devel, quintela

* Denis Plotnikov (dplotnikov@virtuozzo.com) wrote:
> The patch adds ability to qemu-file to write the data
> asynchronously to improve the performance on writing.
> Before, only synchronous writing was supported.
> 
> Enabling of the asyncronous mode is managed by new
> "enabled_buffered" callback.

It's a bit invasive, isn't it - it changes a lot of functions in a lot of
places!
The multifd code separated the control headers from the data on separate
fd's - but that doesn't help your case.

Is there any chance you could do this by using the existing 'save_page'
hook (the one that RDMA uses)?

In the cover letter you mention direct qemu_fflush calls - have we got a
few too many in some places that you think we can clean out?

Dave

> Signed-off-by: Denis Plotnikov <dplotnikov@virtuozzo.com>
> ---
>  include/qemu/typedefs.h |   1 +
>  migration/qemu-file.c   | 351 +++++++++++++++++++++++++++++++++++++++++++++---
>  migration/qemu-file.h   |   9 ++
>  3 files changed, 339 insertions(+), 22 deletions(-)
> 
> diff --git a/include/qemu/typedefs.h b/include/qemu/typedefs.h
> index 88dce54..9b388c8 100644
> --- a/include/qemu/typedefs.h
> +++ b/include/qemu/typedefs.h
> @@ -98,6 +98,7 @@ typedef struct QEMUBH QEMUBH;
>  typedef struct QemuConsole QemuConsole;
>  typedef struct QEMUFile QEMUFile;
>  typedef struct QEMUFileBuffer QEMUFileBuffer;
> +typedef struct QEMUFileAioTask QEMUFileAioTask;
>  typedef struct QemuLockable QemuLockable;
>  typedef struct QemuMutex QemuMutex;
>  typedef struct QemuOpt QemuOpt;
> diff --git a/migration/qemu-file.c b/migration/qemu-file.c
> index 285c6ef..f42f949 100644
> --- a/migration/qemu-file.c
> +++ b/migration/qemu-file.c
> @@ -29,19 +29,25 @@
>  #include "qemu-file.h"
>  #include "trace.h"
>  #include "qapi/error.h"
> +#include "block/aio_task.h"
>  
> -#define IO_BUF_SIZE 32768
> +#define IO_BUF_SIZE (1024 * 1024)
>  #define MAX_IOV_SIZE MIN(IOV_MAX, 64)
> +#define IO_BUF_NUM 2
> +#define IO_BUF_ALIGNMENT 512
>  
> -QEMU_BUILD_BUG_ON(!QEMU_IS_ALIGNED(IO_BUF_SIZE, 512));
> +QEMU_BUILD_BUG_ON(!QEMU_IS_ALIGNED(IO_BUF_SIZE, IO_BUF_ALIGNMENT));
> +QEMU_BUILD_BUG_ON(IO_BUF_SIZE > INT_MAX);
> +QEMU_BUILD_BUG_ON(IO_BUF_NUM <= 0);
>  
>  struct QEMUFileBuffer {
>      int buf_index;
> -    int buf_size; /* 0 when writing */
> +    int buf_size; /* 0 when non-buffered writing */
>      uint8_t *buf;
>      unsigned long *may_free;
>      struct iovec *iov;
>      unsigned int iovcnt;
> +    QLIST_ENTRY(QEMUFileBuffer) link;
>  };
>  
>  struct QEMUFile {
> @@ -60,6 +66,22 @@ struct QEMUFile {
>      bool shutdown;
>      /* currently used buffer */
>      QEMUFileBuffer *current_buf;
> +    /*
> +     * with buffered_mode enabled, all the data is copied to a 512-byte
> +     * aligned buffer, including iov data. Then the buffer is passed
> +     * to writev_buffer callback.
> +     */
> +    bool buffered_mode;
> +    /* for async buffer writing */
> +    AioTaskPool *pool;
> +    /* the list of free buffers; the currently used one is NOT there */
> +    QLIST_HEAD(, QEMUFileBuffer) free_buffers;
> +};
> +
> +struct QEMUFileAioTask {
> +    AioTask task;
> +    QEMUFile *f;
> +    QEMUFileBuffer *fb;
>  };
>  
>  /*
> @@ -115,10 +137,42 @@ QEMUFile *qemu_fopen_ops(void *opaque, const QEMUFileOps *ops)
>      f->opaque = opaque;
>      f->ops = ops;
>  
> -    f->current_buf = g_new0(QEMUFileBuffer, 1);
> -    f->current_buf->buf = g_malloc(IO_BUF_SIZE);
> -    f->current_buf->iov = g_new0(struct iovec, MAX_IOV_SIZE);
> -    f->current_buf->may_free = bitmap_new(MAX_IOV_SIZE);
> +    if (f->ops->enable_buffered) {
> +        f->buffered_mode = f->ops->enable_buffered(f->opaque);
> +    }
> +
> +    if (f->buffered_mode && qemu_file_is_writable(f)) {
> +        int i;
> +        /*
> +         * in buffered_mode we don't use internal io vectors
> +         * and may_free bitmap, because we copy the data to be
> +         * written right away to the buffer
> +         */
> +        f->pool = aio_task_pool_new(IO_BUF_NUM);
> +
> +        /* allocate io buffers */
> +        for (i = 0; i < IO_BUF_NUM; i++) {
> +            QEMUFileBuffer *fb = g_new0(QEMUFileBuffer, 1);
> +
> +            fb->buf = qemu_memalign(IO_BUF_ALIGNMENT, IO_BUF_SIZE);
> +            fb->buf_size = IO_BUF_SIZE;
> +
> +            /*
> +             * put the first buffer to the current buf and the rest
> +             * to the list of free buffers
> +             */
> +            if (i == 0) {
> +                f->current_buf = fb;
> +            } else {
> +                QLIST_INSERT_HEAD(&f->free_buffers, fb, link);
> +            }
> +        }
> +    } else {
> +        f->current_buf = g_new0(QEMUFileBuffer, 1);
> +        f->current_buf->buf = g_malloc(IO_BUF_SIZE);
> +        f->current_buf->iov = g_new0(struct iovec, MAX_IOV_SIZE);
> +        f->current_buf->may_free = bitmap_new(MAX_IOV_SIZE);
> +    }
>  
>      return f;
>  }
> @@ -190,6 +244,8 @@ static void qemu_iovec_release_ram(QEMUFile *f)
>      unsigned long idx;
>      QEMUFileBuffer *fb = f->current_buf;
>  
> +    assert(!f->buffered_mode);
> +
>      /* Find and release all the contiguous memory ranges marked as may_free. */
>      idx = find_next_bit(fb->may_free, fb->iovcnt, 0);
>      if (idx >= fb->iovcnt) {
> @@ -221,6 +277,147 @@ static void qemu_iovec_release_ram(QEMUFile *f)
>      bitmap_zero(fb->may_free, MAX_IOV_SIZE);
>  }
>  
> +static void advance_buf_ptr(QEMUFile *f, size_t size)
> +{
> +    QEMUFileBuffer *fb = f->current_buf;
> +    /* must not advance to 0 */
> +    assert(size);
> +    /* must not overflow buf_index (int) */
> +    assert(fb->buf_index + size <= INT_MAX);
> +    /* must not exceed buf_size */
> +    assert(fb->buf_index + size <= fb->buf_size);
> +
> +    fb->buf_index += size;
> +}
> +
> +static size_t get_buf_free_size(QEMUFile *f)
> +{
> +    QEMUFileBuffer *fb = f->current_buf;
> +    /* buf_index can't be greater than buf_size */
> +    assert(fb->buf_size >= fb->buf_index);
> +    return fb->buf_size - fb->buf_index;
> +}
> +
> +static size_t get_buf_used_size(QEMUFile *f)
> +{
> +    QEMUFileBuffer *fb = f->current_buf;
> +    return fb->buf_index;
> +}
> +
> +static uint8_t *get_buf_ptr(QEMUFile *f)
> +{
> +    QEMUFileBuffer *fb = f->current_buf;
> +    /* protects from out of bound reading */
> +    assert(fb->buf_index <= IO_BUF_SIZE);
> +    return fb->buf + fb->buf_index;
> +}
> +
> +static bool buf_is_full(QEMUFile *f)
> +{
> +    return get_buf_free_size(f) == 0;
> +}
> +
> +static void reset_buf(QEMUFile *f)
> +{
> +    QEMUFileBuffer *fb = f->current_buf;
> +    fb->buf_index = 0;
> +}
> +
> +static int write_task_fn(AioTask *task)
> +{
> +    int ret;
> +    Error *local_error = NULL;
> +    QEMUFileAioTask *t = (QEMUFileAioTask *) task;
> +    QEMUFile *f = t->f;
> +    QEMUFileBuffer *fb = t->fb;
> +    uint64_t pos = f->pos;
> +    struct iovec v = (struct iovec) {
> +        .iov_base = fb->buf,
> +        .iov_len = fb->buf_index,
> +    };
> +
> +    assert(f->buffered_mode);
> +
> +    /*
> +     * Increment file position.
> +     * This needs to be here before calling writev_buffer, because
> +     * writev_buffer is asynchronous and there could be more than one
> +     * writev_buffer started simultaneously. Each writev_buffer should
> +     * use its own file pos to write to. writev_buffer may write less
> +     * than buf_index bytes, but we treat this situation as an error.
> +     * If an error appears, further use of the file is meaningless.
> +     * We expect that most of the time the full buffer is written
> +     * (when buf_size == buf_index). The only case when a non-full
> +     * buffer is written (buf_size != buf_index) is file close,
> +     * when we need to flush the rest of the buffer content.
> +     */
> +    f->pos += fb->buf_index;
> +
> +    ret = f->ops->writev_buffer(f->opaque, &v, 1, pos, &local_error);
> +
> +    /* return the just written buffer to the free list */
> +    QLIST_INSERT_HEAD(&f->free_buffers, fb, link);
> +
> +    /* check that we have written everything */
> +    if (ret != fb->buf_index) {
> +        qemu_file_set_error_obj(f, ret < 0 ? ret : -EIO, local_error);
> +    }
> +
> +    /*
> +     * always return 0 - don't use task error handling, rely on
> +     * qemu file error handling
> +     */
> +    return 0;
> +}
> +
> +static void qemu_file_switch_current_buf(QEMUFile *f)
> +{
> +    /*
> +     * if the list is empty, wait until some task returns a buffer
> +     * to the list of free buffers.
> +     */
> +    if (QLIST_EMPTY(&f->free_buffers)) {
> +        aio_task_pool_wait_slot(f->pool);
> +    }
> +
> +    /*
> +     * sanity check that the list isn't empty
> +     * if the free list was empty, we waited for a task completion,
> +     * and the completed task must return a buffer to the list of free buffers
> +     */
> +    assert(!QLIST_EMPTY(&f->free_buffers));
> +
> +    /* set the current buffer for using from the free list */
> +    f->current_buf = QLIST_FIRST(&f->free_buffers);
> +    reset_buf(f);
> +
> +    QLIST_REMOVE(f->current_buf, link);
> +}
> +
> +/**
> + *  Asynchronously flushes QEMUFile buffer
> + *
> + * This will flush all pending data. If data was only partially flushed, it
> + * will set an error state. The function may return before the data is
> + * actually written.
> + */
> +static void flush_buffer(QEMUFile *f)
> +{
> +    QEMUFileAioTask *t = g_new(QEMUFileAioTask, 1);
> +
> +    *t = (QEMUFileAioTask) {
> +        .task.func = &write_task_fn,
> +        .f = f,
> +        .fb = f->current_buf,
> +    };
> +
> +    /* aio_task_pool should free t for us */
> +    aio_task_pool_start_task(f->pool, (AioTask *) t);
> +
> +    /* if no errors this will switch the buffer */
> +    qemu_file_switch_current_buf(f);
> +}
> +
>  /**
>   * Flushes QEMUFile buffer
>   *
> @@ -241,7 +438,13 @@ void qemu_fflush(QEMUFile *f)
>      if (f->shutdown) {
>          return;
>      }
> +
> +    if (f->buffered_mode) {
> +        return;
> +    }
> +
>      if (fb->iovcnt > 0) {
> +        /* this is non-buffered mode */
>          expect = iov_size(fb->iov, fb->iovcnt);
>          ret = f->ops->writev_buffer(f->opaque, fb->iov, fb->iovcnt, f->pos,
>                                      &local_error);
> @@ -378,6 +581,7 @@ static ssize_t qemu_fill_buffer(QEMUFile *f)
>  
>  void qemu_update_position(QEMUFile *f, size_t size)
>  {
> +    assert(!f->buffered_mode);
>      f->pos += size;
>  }
>  
> @@ -392,7 +596,18 @@ void qemu_update_position(QEMUFile *f, size_t size)
>  int qemu_fclose(QEMUFile *f)
>  {
>      int ret;
> -    qemu_fflush(f);
> +
> +    if (qemu_file_is_writable(f) && f->buffered_mode) {
> +        ret = qemu_file_get_error(f);
> +        if (!ret) {
> +            flush_buffer(f);
> +        }
> +        /* wait until all tasks are done */
> +        aio_task_pool_wait_all(f->pool);
> +    } else {
> +        qemu_fflush(f);
> +    }
> +
>      ret = qemu_file_get_error(f);
>  
>      if (f->ops->close) {
> @@ -408,16 +623,77 @@ int qemu_fclose(QEMUFile *f)
>          ret = f->last_error;
>      }
>      error_free(f->last_error_obj);
> -    g_free(f->current_buf->buf);
> -    g_free(f->current_buf->iov);
> -    g_free(f->current_buf->may_free);
> -    g_free(f->current_buf);
> +
> +    if (f->buffered_mode) {
> +        QEMUFileBuffer *fb, *next;
> +        /*
> +         * put the current back to the free buffers list
> +         * to destroy all the buffers in one loop
> +         */
> +        QLIST_INSERT_HEAD(&f->free_buffers, f->current_buf, link);
> +
> +        /* destroy all the buffers */
> +        QLIST_FOREACH_SAFE(fb, &f->free_buffers, link, next) {
> +            QLIST_REMOVE(fb, link);
> +            /* looks like qemu_vfree pairs with qemu_memalign */
> +            qemu_vfree(fb->buf);
> +            g_free(fb);
> +        }
> +        g_free(f->pool);
> +    } else {
> +        g_free(f->current_buf->buf);
> +        g_free(f->current_buf->iov);
> +        g_free(f->current_buf->may_free);
> +        g_free(f->current_buf);
> +    }
> +
>      g_free(f);
>      trace_qemu_file_fclose();
>      return ret;
>  }
>  
>  /*
> + * Copy an external buffer to the internal current buffer.
> + */
> +static void copy_buf(QEMUFile *f, const uint8_t *buf, size_t size,
> +                     bool may_free)
> +{
> +    size_t data_size = size;
> +    const uint8_t *src_ptr = buf;
> +
> +    assert(f->buffered_mode);
> +    assert(size <= INT_MAX);
> +
> +    while (data_size > 0) {
> +        size_t chunk_size;
> +
> +        if (buf_is_full(f)) {
> +            flush_buffer(f);
> +            if (qemu_file_get_error(f)) {
> +                return;
> +            }
> +        }
> +
> +        chunk_size = MIN(get_buf_free_size(f), data_size);
> +
> +        memcpy(get_buf_ptr(f), src_ptr, chunk_size);
> +
> +        advance_buf_ptr(f, chunk_size);
> +
> +        src_ptr += chunk_size;
> +        data_size -= chunk_size;
> +        f->bytes_xfer += chunk_size;
> +    }
> +
> +    if (may_free) {
> +        if (qemu_madvise((void *) buf, size, QEMU_MADV_DONTNEED) < 0) {
> +            error_report("migrate: madvise DONTNEED failed %p %zd: %s",
> +                         buf, size, strerror(errno));
> +        }
> +    }
> +}
> +
> +/*
>   * Add buf to iovec. Do flush if iovec is full.
>   *
>   * Return values:
> @@ -454,6 +730,9 @@ static int add_to_iovec(QEMUFile *f, const uint8_t *buf, size_t size,
>  static void add_buf_to_iovec(QEMUFile *f, size_t len)
>  {
>      QEMUFileBuffer *fb = f->current_buf;
> +
> +    assert(!f->buffered_mode);
> +
>      if (!add_to_iovec(f, fb->buf + fb->buf_index, len, false)) {
>          fb->buf_index += len;
>          if (fb->buf_index == IO_BUF_SIZE) {
> @@ -469,8 +748,12 @@ void qemu_put_buffer_async(QEMUFile *f, const uint8_t *buf, size_t size,
>          return;
>      }
>  
> -    f->bytes_xfer += size;
> -    add_to_iovec(f, buf, size, may_free);
> +    if (f->buffered_mode) {
> +        copy_buf(f, buf, size, may_free);
> +    } else {
> +        f->bytes_xfer += size;
> +        add_to_iovec(f, buf, size, may_free);
> +    }
>  }
>  
>  void qemu_put_buffer(QEMUFile *f, const uint8_t *buf, size_t size)
> @@ -482,6 +765,11 @@ void qemu_put_buffer(QEMUFile *f, const uint8_t *buf, size_t size)
>          return;
>      }
>  
> +    if (f->buffered_mode) {
> +        copy_buf(f, buf, size, false);
> +        return;
> +    }
> +
>      while (size > 0) {
>          l = IO_BUF_SIZE - fb->buf_index;
>          if (l > size) {
> @@ -506,15 +794,21 @@ void qemu_put_byte(QEMUFile *f, int v)
>          return;
>      }
>  
> -    fb->buf[fb->buf_index] = v;
> -    f->bytes_xfer++;
> -    add_buf_to_iovec(f, 1);
> +    if (f->buffered_mode) {
> +        copy_buf(f, (const uint8_t *) &v, 1, false);
> +    } else {
> +        fb->buf[fb->buf_index] = v;
> +        add_buf_to_iovec(f, 1);
> +        f->bytes_xfer++;
> +    }
>  }
>  
>  void qemu_file_skip(QEMUFile *f, int size)
>  {
>      QEMUFileBuffer *fb = f->current_buf;
>  
> +    assert(!f->buffered_mode);
> +
>      if (fb->buf_index + size <= fb->buf_size) {
>          fb->buf_index += size;
>      }
> @@ -672,10 +966,14 @@ int64_t qemu_ftell_fast(QEMUFile *f)
>  {
>      int64_t ret = f->pos;
>      int i;
> -    QEMUFileBuffer *fb = f->current_buf;
>  
> -    for (i = 0; i < fb->iovcnt; i++) {
> -        ret += fb->iov[i].iov_len;
> +    if (f->buffered_mode) {
> +        ret += get_buf_used_size(f);
> +    } else {
> +        QEMUFileBuffer *fb = f->current_buf;
> +        for (i = 0; i < fb->iovcnt; i++) {
> +            ret += fb->iov[i].iov_len;
> +        }
>      }
>  
>      return ret;
> @@ -683,8 +981,12 @@ int64_t qemu_ftell_fast(QEMUFile *f)
>  
>  int64_t qemu_ftell(QEMUFile *f)
>  {
> -    qemu_fflush(f);
> -    return f->pos;
> +    if (f->buffered_mode) {
> +        return qemu_ftell_fast(f);
> +    } else {
> +        qemu_fflush(f);
> +        return f->pos;
> +    }
>  }
>  
>  int qemu_file_rate_limit(QEMUFile *f)
> @@ -803,6 +1105,8 @@ ssize_t qemu_put_compression_data(QEMUFile *f, z_stream *stream,
>      QEMUFileBuffer *fb = f->current_buf;
>      ssize_t blen = IO_BUF_SIZE - fb->buf_index - sizeof(int32_t);
>  
> +    assert(!f->buffered_mode);
> +
>      if (blen < compressBound(size)) {
>          return -1;
>      }
> @@ -827,6 +1131,9 @@ int qemu_put_qemu_file(QEMUFile *f_des, QEMUFile *f_src)
>      int len = 0;
>      QEMUFileBuffer *fb_src = f_src->current_buf;
>  
> +    assert(!f_des->buffered_mode);
> +    assert(!f_src->buffered_mode);
> +
>      if (fb_src->buf_index > 0) {
>          len = fb_src->buf_index;
>          qemu_put_buffer(f_des, fb_src->buf, fb_src->buf_index);
> diff --git a/migration/qemu-file.h b/migration/qemu-file.h
> index a9b6d6c..08655d2 100644
> --- a/migration/qemu-file.h
> +++ b/migration/qemu-file.h
> @@ -103,6 +103,14 @@ typedef QEMUFile *(QEMURetPathFunc)(void *opaque);
>  typedef int (QEMUFileShutdownFunc)(void *opaque, bool rd, bool wr,
>                                     Error **errp);
>  
> +/*
> + * Enables or disables the buffered mode
> + * Existing blocking reads/writes must be woken
> + * Returns true if the buffered mode has to be enabled,
> + * false if it has to be disabled.
> + */
> +typedef bool (QEMUFileEnableBufferedFunc)(void *opaque);
> +
>  typedef struct QEMUFileOps {
>      QEMUFileGetBufferFunc *get_buffer;
>      QEMUFileCloseFunc *close;
> @@ -110,6 +118,7 @@ typedef struct QEMUFileOps {
>      QEMUFileWritevBufferFunc *writev_buffer;
>      QEMURetPathFunc *get_return_path;
>      QEMUFileShutdownFunc *shut_down;
> +    QEMUFileEnableBufferedFunc *enable_buffered;
>  } QEMUFileOps;
>  
>  typedef struct QEMUFileHooks {
> -- 
> 1.8.3.1
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK



^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [RFC patch v1 2/3] qemu-file: add buffered mode
  2020-04-27 12:14   ` Dr. David Alan Gilbert
@ 2020-04-28  8:06     ` Denis Plotnikov
  2020-04-28 17:54       ` Dr. David Alan Gilbert
  2020-05-04  9:08     ` Daniel P. Berrangé
  1 sibling, 1 reply; 19+ messages in thread
From: Denis Plotnikov @ 2020-04-28  8:06 UTC (permalink / raw)
  To: Dr. David Alan Gilbert; +Cc: den, qemu-devel, quintela



On 27.04.2020 15:14, Dr. David Alan Gilbert wrote:
> * Denis Plotnikov (dplotnikov@virtuozzo.com) wrote:
>> The patch adds the ability for qemu-file to write data
>> asynchronously, to improve write performance.
>> Before, only synchronous writing was supported.
>>
>> Enabling the asynchronous mode is managed by the new
>> "enable_buffered" callback.
> It's a bit invasive isn't it - changes a lot of functions in a lot of
> places!

If you mean changing the qemu-file code - yes, it is.

If you mean changing the qemu-file usage in the code - no.
The only place that changes is the snapshot code, where the buffered mode
is enabled via a callback.
The change is in patch 03 of the series.

> The multifd code separated the control headers from the data on separate
> fd's - but that doesn't help your case.

yes, that doesn't help
>
> Is there any chance you could do this by using the existing 'save_page'
> hook (that RDMA uses)?

I don't think so. My goal is to improve the writing performance of
the internal snapshot to a qcow2 image. The snapshot is saved in qcow2 as
a continuous stream placed at the end of the address space.
To achieve the best writing speed I need a size- and base-aligned buffer
containing the vm state (with ram), which (for the ram part) looks like this:

... | ram page header | ram page | ram page header | ram page | ... and so on

so that the buffer can be stored in qcow2 with a single operation.

'save_page' would allow me not to store the 'ram page' in qemu-file's
internal structures and to write my own ram page storing logic.
I think that wouldn't help me a lot, because:
1. I need each page together with its ram page header
2. I want to reduce the number of io operations
3. I want to save other parts of vm state as fast as possible

Maybe I can't see a better way of using the 'save_page' callback.
Could you suggest anything?

Denis
> In the cover letter you mention direct qemu_fflush calls - have we got a
> few too many in some places that you think we can clean out?

I'm not sure that any of them are excessive. To the best of my knowledge,
qemu-file is used for source-destination communication during migration,
and removing some qemu_fflush-es may break the communication logic.

Snapshot saving is just a special case (if not the only one) where we know
that we can do buffered (cached) writes. Do you know of any other cases
where the buffered (cached) mode could be useful?

>
> Dave
>
>> Signed-off-by: Denis Plotnikov <dplotnikov@virtuozzo.com>
>> ---
>>   include/qemu/typedefs.h |   1 +
>>   migration/qemu-file.c   | 351 +++++++++++++++++++++++++++++++++++++++++++++---
>>   migration/qemu-file.h   |   9 ++
>>   3 files changed, 339 insertions(+), 22 deletions(-)
>>
>> diff --git a/include/qemu/typedefs.h b/include/qemu/typedefs.h
>> index 88dce54..9b388c8 100644
>> --- a/include/qemu/typedefs.h
>> +++ b/include/qemu/typedefs.h
>> @@ -98,6 +98,7 @@ typedef struct QEMUBH QEMUBH;
>>   typedef struct QemuConsole QemuConsole;
>>   typedef struct QEMUFile QEMUFile;
>>   typedef struct QEMUFileBuffer QEMUFileBuffer;
>> +typedef struct QEMUFileAioTask QEMUFileAioTask;
>>   typedef struct QemuLockable QemuLockable;
>>   typedef struct QemuMutex QemuMutex;
>>   typedef struct QemuOpt QemuOpt;
>> diff --git a/migration/qemu-file.c b/migration/qemu-file.c
>> index 285c6ef..f42f949 100644
>> --- a/migration/qemu-file.c
>> +++ b/migration/qemu-file.c
>> @@ -29,19 +29,25 @@
>>   #include "qemu-file.h"
>>   #include "trace.h"
>>   #include "qapi/error.h"
>> +#include "block/aio_task.h"
>>   
>> -#define IO_BUF_SIZE 32768
>> +#define IO_BUF_SIZE (1024 * 1024)
>>   #define MAX_IOV_SIZE MIN(IOV_MAX, 64)
>> +#define IO_BUF_NUM 2
>> +#define IO_BUF_ALIGNMENT 512
>>   
>> -QEMU_BUILD_BUG_ON(!QEMU_IS_ALIGNED(IO_BUF_SIZE, 512));
>> +QEMU_BUILD_BUG_ON(!QEMU_IS_ALIGNED(IO_BUF_SIZE, IO_BUF_ALIGNMENT));
>> +QEMU_BUILD_BUG_ON(IO_BUF_SIZE > INT_MAX);
>> +QEMU_BUILD_BUG_ON(IO_BUF_NUM <= 0);
>>   
>>   struct QEMUFileBuffer {
>>       int buf_index;
>> -    int buf_size; /* 0 when writing */
>> +    int buf_size; /* 0 when non-buffered writing */
>>       uint8_t *buf;
>>       unsigned long *may_free;
>>       struct iovec *iov;
>>       unsigned int iovcnt;
>> +    QLIST_ENTRY(QEMUFileBuffer) link;
>>   };
>>   
>>   struct QEMUFile {
>> @@ -60,6 +66,22 @@ struct QEMUFile {
>>       bool shutdown;
>>       /* currently used buffer */
>>       QEMUFileBuffer *current_buf;
>> +    /*
>> +     * with buffered_mode enabled, all the data is copied to a 512-byte
>> +     * aligned buffer, including iov data. Then the buffer is passed
>> +     * to writev_buffer callback.
>> +     */
>> +    bool buffered_mode;
>> +    /* for async buffer writing */
>> +    AioTaskPool *pool;
>> +    /* the list of free buffers; the currently used one is NOT there */
>> +    QLIST_HEAD(, QEMUFileBuffer) free_buffers;
>> +};
>> +
>> +struct QEMUFileAioTask {
>> +    AioTask task;
>> +    QEMUFile *f;
>> +    QEMUFileBuffer *fb;
>>   };
>>   
>>   /*
>> @@ -115,10 +137,42 @@ QEMUFile *qemu_fopen_ops(void *opaque, const QEMUFileOps *ops)
>>       f->opaque = opaque;
>>       f->ops = ops;
>>   
>> -    f->current_buf = g_new0(QEMUFileBuffer, 1);
>> -    f->current_buf->buf = g_malloc(IO_BUF_SIZE);
>> -    f->current_buf->iov = g_new0(struct iovec, MAX_IOV_SIZE);
>> -    f->current_buf->may_free = bitmap_new(MAX_IOV_SIZE);
>> +    if (f->ops->enable_buffered) {
>> +        f->buffered_mode = f->ops->enable_buffered(f->opaque);
>> +    }
>> +
>> +    if (f->buffered_mode && qemu_file_is_writable(f)) {
>> +        int i;
>> +        /*
>> +         * in buffered_mode we don't use internal io vectors
>> +         * and may_free bitmap, because we copy the data to be
>> +         * written right away to the buffer
>> +         */
>> +        f->pool = aio_task_pool_new(IO_BUF_NUM);
>> +
>> +        /* allocate io buffers */
>> +        for (i = 0; i < IO_BUF_NUM; i++) {
>> +            QEMUFileBuffer *fb = g_new0(QEMUFileBuffer, 1);
>> +
>> +            fb->buf = qemu_memalign(IO_BUF_ALIGNMENT, IO_BUF_SIZE);
>> +            fb->buf_size = IO_BUF_SIZE;
>> +
>> +            /*
>> +             * put the first buffer to the current buf and the rest
>> +             * to the list of free buffers
>> +             */
>> +            if (i == 0) {
>> +                f->current_buf = fb;
>> +            } else {
>> +                QLIST_INSERT_HEAD(&f->free_buffers, fb, link);
>> +            }
>> +        }
>> +    } else {
>> +        f->current_buf = g_new0(QEMUFileBuffer, 1);
>> +        f->current_buf->buf = g_malloc(IO_BUF_SIZE);
>> +        f->current_buf->iov = g_new0(struct iovec, MAX_IOV_SIZE);
>> +        f->current_buf->may_free = bitmap_new(MAX_IOV_SIZE);
>> +    }
>>   
>>       return f;
>>   }
>> @@ -190,6 +244,8 @@ static void qemu_iovec_release_ram(QEMUFile *f)
>>       unsigned long idx;
>>       QEMUFileBuffer *fb = f->current_buf;
>>   
>> +    assert(!f->buffered_mode);
>> +
>>       /* Find and release all the contiguous memory ranges marked as may_free. */
>>       idx = find_next_bit(fb->may_free, fb->iovcnt, 0);
>>       if (idx >= fb->iovcnt) {
>> @@ -221,6 +277,147 @@ static void qemu_iovec_release_ram(QEMUFile *f)
>>       bitmap_zero(fb->may_free, MAX_IOV_SIZE);
>>   }
>>   
>> +static void advance_buf_ptr(QEMUFile *f, size_t size)
>> +{
>> +    QEMUFileBuffer *fb = f->current_buf;
>> +    /* must not advance to 0 */
>> +    assert(size);
>> +    /* must not overflow buf_index (int) */
>> +    assert(fb->buf_index + size <= INT_MAX);
>> +    /* must not exceed buf_size */
>> +    assert(fb->buf_index + size <= fb->buf_size);
>> +
>> +    fb->buf_index += size;
>> +}
>> +
>> +static size_t get_buf_free_size(QEMUFile *f)
>> +{
>> +    QEMUFileBuffer *fb = f->current_buf;
>> +    /* buf_index can't be greater than buf_size */
>> +    assert(fb->buf_size >= fb->buf_index);
>> +    return fb->buf_size - fb->buf_index;
>> +}
>> +
>> +static size_t get_buf_used_size(QEMUFile *f)
>> +{
>> +    QEMUFileBuffer *fb = f->current_buf;
>> +    return fb->buf_index;
>> +}
>> +
>> +static uint8_t *get_buf_ptr(QEMUFile *f)
>> +{
>> +    QEMUFileBuffer *fb = f->current_buf;
>> +    /* protects from out of bound reading */
>> +    assert(fb->buf_index <= IO_BUF_SIZE);
>> +    return fb->buf + fb->buf_index;
>> +}
>> +
>> +static bool buf_is_full(QEMUFile *f)
>> +{
>> +    return get_buf_free_size(f) == 0;
>> +}
>> +
>> +static void reset_buf(QEMUFile *f)
>> +{
>> +    QEMUFileBuffer *fb = f->current_buf;
>> +    fb->buf_index = 0;
>> +}
>> +
>> +static int write_task_fn(AioTask *task)
>> +{
>> +    int ret;
>> +    Error *local_error = NULL;
>> +    QEMUFileAioTask *t = (QEMUFileAioTask *) task;
>> +    QEMUFile *f = t->f;
>> +    QEMUFileBuffer *fb = t->fb;
>> +    uint64_t pos = f->pos;
>> +    struct iovec v = (struct iovec) {
>> +        .iov_base = fb->buf,
>> +        .iov_len = fb->buf_index,
>> +    };
>> +
>> +    assert(f->buffered_mode);
>> +
>> +    /*
>> +     * Increment file position.
>> +     * This needs to be here before calling writev_buffer, because
>> +     * writev_buffer is asynchronous and there could be more than one
>> +     * writev_buffer started simultaneously. Each writev_buffer should
>> +     * use its own file pos to write to. writev_buffer may write less
>> +     * than buf_index bytes, but we treat this situation as an error.
>> +     * If an error appears, further use of the file is meaningless.
>> +     * We expect that most of the time the full buffer is written
>> +     * (when buf_size == buf_index). The only case when a non-full
>> +     * buffer is written (buf_size != buf_index) is file close,
>> +     * when we need to flush the rest of the buffer content.
>> +     */
>> +    f->pos += fb->buf_index;
>> +
>> +    ret = f->ops->writev_buffer(f->opaque, &v, 1, pos, &local_error);
>> +
>> +    /* return the just written buffer to the free list */
>> +    QLIST_INSERT_HEAD(&f->free_buffers, fb, link);
>> +
>> +    /* check that we have written everything */
>> +    if (ret != fb->buf_index) {
>> +        qemu_file_set_error_obj(f, ret < 0 ? ret : -EIO, local_error);
>> +    }
>> +
>> +    /*
>> +     * always return 0 - don't use task error handling, rely on
>> +     * qemu file error handling
>> +     */
>> +    return 0;
>> +}
>> +
>> +static void qemu_file_switch_current_buf(QEMUFile *f)
>> +{
>> +    /*
>> +     * if the list is empty, wait until some task returns a buffer
>> +     * to the list of free buffers.
>> +     */
>> +    if (QLIST_EMPTY(&f->free_buffers)) {
>> +        aio_task_pool_wait_slot(f->pool);
>> +    }
>> +
>> +    /*
>> +     * sanity check that the list isn't empty
>> +     * if the free list was empty, we waited for a task completion,
>> +     * and the completed task must return a buffer to the list of free buffers
>> +     */
>> +    assert(!QLIST_EMPTY(&f->free_buffers));
>> +
>> +    /* set the current buffer for using from the free list */
>> +    f->current_buf = QLIST_FIRST(&f->free_buffers);
>> +    reset_buf(f);
>> +
>> +    QLIST_REMOVE(f->current_buf, link);
>> +}
>> +
>> +/**
>> + *  Asynchronously flushes QEMUFile buffer
>> + *
>> + * This will flush all pending data. If data was only partially flushed, it
>> + * will set an error state. The function may return before the data is
>> + * actually written.
>> + */
>> +static void flush_buffer(QEMUFile *f)
>> +{
>> +    QEMUFileAioTask *t = g_new(QEMUFileAioTask, 1);
>> +
>> +    *t = (QEMUFileAioTask) {
>> +        .task.func = &write_task_fn,
>> +        .f = f,
>> +        .fb = f->current_buf,
>> +    };
>> +
>> +    /* aio_task_pool should free t for us */
>> +    aio_task_pool_start_task(f->pool, (AioTask *) t);
>> +
>> +    /* if no errors this will switch the buffer */
>> +    qemu_file_switch_current_buf(f);
>> +}
>> +
>>   /**
>>    * Flushes QEMUFile buffer
>>    *
>> @@ -241,7 +438,13 @@ void qemu_fflush(QEMUFile *f)
>>       if (f->shutdown) {
>>           return;
>>       }
>> +
>> +    if (f->buffered_mode) {
>> +        return;
>> +    }
>> +
>>       if (fb->iovcnt > 0) {
>> +        /* this is non-buffered mode */
>>           expect = iov_size(fb->iov, fb->iovcnt);
>>           ret = f->ops->writev_buffer(f->opaque, fb->iov, fb->iovcnt, f->pos,
>>                                       &local_error);
>> @@ -378,6 +581,7 @@ static ssize_t qemu_fill_buffer(QEMUFile *f)
>>   
>>   void qemu_update_position(QEMUFile *f, size_t size)
>>   {
>> +    assert(!f->buffered_mode);
>>       f->pos += size;
>>   }
>>   
>> @@ -392,7 +596,18 @@ void qemu_update_position(QEMUFile *f, size_t size)
>>   int qemu_fclose(QEMUFile *f)
>>   {
>>       int ret;
>> -    qemu_fflush(f);
>> +
>> +    if (qemu_file_is_writable(f) && f->buffered_mode) {
>> +        ret = qemu_file_get_error(f);
>> +        if (!ret) {
>> +            flush_buffer(f);
>> +        }
>> +        /* wait until all tasks are done */
>> +        aio_task_pool_wait_all(f->pool);
>> +    } else {
>> +        qemu_fflush(f);
>> +    }
>> +
>>       ret = qemu_file_get_error(f);
>>   
>>       if (f->ops->close) {
>> @@ -408,16 +623,77 @@ int qemu_fclose(QEMUFile *f)
>>           ret = f->last_error;
>>       }
>>       error_free(f->last_error_obj);
>> -    g_free(f->current_buf->buf);
>> -    g_free(f->current_buf->iov);
>> -    g_free(f->current_buf->may_free);
>> -    g_free(f->current_buf);
>> +
>> +    if (f->buffered_mode) {
>> +        QEMUFileBuffer *fb, *next;
>> +        /*
>> +         * put the current back to the free buffers list
>> +         * to destroy all the buffers in one loop
>> +         */
>> +        QLIST_INSERT_HEAD(&f->free_buffers, f->current_buf, link);
>> +
>> +        /* destroy all the buffers */
>> +        QLIST_FOREACH_SAFE(fb, &f->free_buffers, link, next) {
>> +            QLIST_REMOVE(fb, link);
>> +            /* looks like qemu_vfree pairs with qemu_memalign */
>> +            qemu_vfree(fb->buf);
>> +            g_free(fb);
>> +        }
>> +        g_free(f->pool);
>> +    } else {
>> +        g_free(f->current_buf->buf);
>> +        g_free(f->current_buf->iov);
>> +        g_free(f->current_buf->may_free);
>> +        g_free(f->current_buf);
>> +    }
>> +
>>       g_free(f);
>>       trace_qemu_file_fclose();
>>       return ret;
>>   }
>>   
>>   /*
>> + * Copy an external buffer to the internal current buffer.
>> + */
>> +static void copy_buf(QEMUFile *f, const uint8_t *buf, size_t size,
>> +                     bool may_free)
>> +{
>> +    size_t data_size = size;
>> +    const uint8_t *src_ptr = buf;
>> +
>> +    assert(f->buffered_mode);
>> +    assert(size <= INT_MAX);
>> +
>> +    while (data_size > 0) {
>> +        size_t chunk_size;
>> +
>> +        if (buf_is_full(f)) {
>> +            flush_buffer(f);
>> +            if (qemu_file_get_error(f)) {
>> +                return;
>> +            }
>> +        }
>> +
>> +        chunk_size = MIN(get_buf_free_size(f), data_size);
>> +
>> +        memcpy(get_buf_ptr(f), src_ptr, chunk_size);
>> +
>> +        advance_buf_ptr(f, chunk_size);
>> +
>> +        src_ptr += chunk_size;
>> +        data_size -= chunk_size;
>> +        f->bytes_xfer += chunk_size;
>> +    }
>> +
>> +    if (may_free) {
>> +        if (qemu_madvise((void *) buf, size, QEMU_MADV_DONTNEED) < 0) {
>> +            error_report("migrate: madvise DONTNEED failed %p %zd: %s",
>> +                         buf, size, strerror(errno));
>> +        }
>> +    }
>> +}
>> +
>> +/*
>>    * Add buf to iovec. Do flush if iovec is full.
>>    *
>>    * Return values:
>> @@ -454,6 +730,9 @@ static int add_to_iovec(QEMUFile *f, const uint8_t *buf, size_t size,
>>   static void add_buf_to_iovec(QEMUFile *f, size_t len)
>>   {
>>       QEMUFileBuffer *fb = f->current_buf;
>> +
>> +    assert(!f->buffered_mode);
>> +
>>       if (!add_to_iovec(f, fb->buf + fb->buf_index, len, false)) {
>>           fb->buf_index += len;
>>           if (fb->buf_index == IO_BUF_SIZE) {
>> @@ -469,8 +748,12 @@ void qemu_put_buffer_async(QEMUFile *f, const uint8_t *buf, size_t size,
>>           return;
>>       }
>>   
>> -    f->bytes_xfer += size;
>> -    add_to_iovec(f, buf, size, may_free);
>> +    if (f->buffered_mode) {
>> +        copy_buf(f, buf, size, may_free);
>> +    } else {
>> +        f->bytes_xfer += size;
>> +        add_to_iovec(f, buf, size, may_free);
>> +    }
>>   }
>>   
>>   void qemu_put_buffer(QEMUFile *f, const uint8_t *buf, size_t size)
>> @@ -482,6 +765,11 @@ void qemu_put_buffer(QEMUFile *f, const uint8_t *buf, size_t size)
>>           return;
>>       }
>>   
>> +    if (f->buffered_mode) {
>> +        copy_buf(f, buf, size, false);
>> +        return;
>> +    }
>> +
>>       while (size > 0) {
>>           l = IO_BUF_SIZE - fb->buf_index;
>>           if (l > size) {
>> @@ -506,15 +794,21 @@ void qemu_put_byte(QEMUFile *f, int v)
>>           return;
>>       }
>>   
>> -    fb->buf[fb->buf_index] = v;
>> -    f->bytes_xfer++;
>> -    add_buf_to_iovec(f, 1);
>> +    if (f->buffered_mode) {
>> +        copy_buf(f, (const uint8_t *) &v, 1, false);
>> +    } else {
>> +        fb->buf[fb->buf_index] = v;
>> +        add_buf_to_iovec(f, 1);
>> +        f->bytes_xfer++;
>> +    }
>>   }
>>   
>>   void qemu_file_skip(QEMUFile *f, int size)
>>   {
>>       QEMUFileBuffer *fb = f->current_buf;
>>   
>> +    assert(!f->buffered_mode);
>> +
>>       if (fb->buf_index + size <= fb->buf_size) {
>>           fb->buf_index += size;
>>       }
>> @@ -672,10 +966,14 @@ int64_t qemu_ftell_fast(QEMUFile *f)
>>   {
>>       int64_t ret = f->pos;
>>       int i;
>> -    QEMUFileBuffer *fb = f->current_buf;
>>   
>> -    for (i = 0; i < fb->iovcnt; i++) {
>> -        ret += fb->iov[i].iov_len;
>> +    if (f->buffered_mode) {
>> +        ret += get_buf_used_size(f);
>> +    } else {
>> +        QEMUFileBuffer *fb = f->current_buf;
>> +        for (i = 0; i < fb->iovcnt; i++) {
>> +            ret += fb->iov[i].iov_len;
>> +        }
>>       }
>>   
>>       return ret;
>> @@ -683,8 +981,12 @@ int64_t qemu_ftell_fast(QEMUFile *f)
>>   
>>   int64_t qemu_ftell(QEMUFile *f)
>>   {
>> -    qemu_fflush(f);
>> -    return f->pos;
>> +    if (f->buffered_mode) {
>> +        return qemu_ftell_fast(f);
>> +    } else {
>> +        qemu_fflush(f);
>> +        return f->pos;
>> +    }
>>   }
>>   
>>   int qemu_file_rate_limit(QEMUFile *f)
>> @@ -803,6 +1105,8 @@ ssize_t qemu_put_compression_data(QEMUFile *f, z_stream *stream,
>>       QEMUFileBuffer *fb = f->current_buf;
>>       ssize_t blen = IO_BUF_SIZE - fb->buf_index - sizeof(int32_t);
>>   
>> +    assert(!f->buffered_mode);
>> +
>>       if (blen < compressBound(size)) {
>>           return -1;
>>       }
>> @@ -827,6 +1131,9 @@ int qemu_put_qemu_file(QEMUFile *f_des, QEMUFile *f_src)
>>       int len = 0;
>>       QEMUFileBuffer *fb_src = f_src->current_buf;
>>   
>> +    assert(!f_des->buffered_mode);
>> +    assert(!f_src->buffered_mode);
>> +
>>       if (fb_src->buf_index > 0) {
>>           len = fb_src->buf_index;
>>           qemu_put_buffer(f_des, fb_src->buf, fb_src->buf_index);
>> diff --git a/migration/qemu-file.h b/migration/qemu-file.h
>> index a9b6d6c..08655d2 100644
>> --- a/migration/qemu-file.h
>> +++ b/migration/qemu-file.h
>> @@ -103,6 +103,14 @@ typedef QEMUFile *(QEMURetPathFunc)(void *opaque);
>>   typedef int (QEMUFileShutdownFunc)(void *opaque, bool rd, bool wr,
>>                                      Error **errp);
>>   
>> +/*
>> + * Enables or disables the buffered mode
>> + * Existing blocking reads/writes must be woken
>> + * Returns true if the buffered mode has to be enabled,
>> + * false if it has to be disabled.
>> + */
>> +typedef bool (QEMUFileEnableBufferedFunc)(void *opaque);
>> +
>>   typedef struct QEMUFileOps {
>>       QEMUFileGetBufferFunc *get_buffer;
>>       QEMUFileCloseFunc *close;
>> @@ -110,6 +118,7 @@ typedef struct QEMUFileOps {
>>       QEMUFileWritevBufferFunc *writev_buffer;
>>       QEMURetPathFunc *get_return_path;
>>       QEMUFileShutdownFunc *shut_down;
>> +    QEMUFileEnableBufferedFunc *enable_buffered;
>>   } QEMUFileOps;
>>   
>>   typedef struct QEMUFileHooks {
>> -- 
>> 1.8.3.1
>>
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
>



^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [RFC patch v1 2/3] qemu-file: add buffered mode
  2020-04-28  8:06     ` Denis Plotnikov
@ 2020-04-28 17:54       ` Dr. David Alan Gilbert
  2020-04-28 20:25         ` Denis Plotnikov
  0 siblings, 1 reply; 19+ messages in thread
From: Dr. David Alan Gilbert @ 2020-04-28 17:54 UTC (permalink / raw)
  To: Denis Plotnikov; +Cc: den, qemu-devel, quintela

* Denis Plotnikov (dplotnikov@virtuozzo.com) wrote:
> 
> 
> On 27.04.2020 15:14, Dr. David Alan Gilbert wrote:
> > * Denis Plotnikov (dplotnikov@virtuozzo.com) wrote:
> > > The patch adds the ability for qemu-file to write data
> > > asynchronously to improve write performance.
> > > Before, only synchronous writing was supported.
> > > 
> > > Enabling the asynchronous mode is managed by the new
> > > "enable_buffered" callback.
> > It's a bit invasive isn't it - changes a lot of functions in a lot of
> > places!
> 
> If you mean changing the qemu-file code - yes, it is.

Yeh that's what I worry about; qemu-file is pretty complex as it is.
Especially when it then passes it to the channel code etc

> If you mean changing the qemu-file usage in the code - no.
> The only place to change is the snapshot code when the buffered mode is
> enabled with a callback.
> The change is in 03 patch of the series.

That's fine - that's easy.

> > The multifd code separated the control headers from the data on separate
> > fd's - but that doesn't help your case.
> 
> yes, that doesn't help
> > 
> > Is there any chance you could do this by using the existing 'save_page'
> > hook (that RDMA uses).
> 
> I don't think so. My goal is to improve writing performance of
> the internal snapshot to a qcow2 image. The snapshot is saved in qcow2 as a
> continuous stream placed at the end of the address space.
> To achieve the best writing speed I need a size and base-aligned buffer
> containing the vm state (with ram) which looks like that (related to ram):
> 
> ... | ram page header | ram page | ram page header | ram page | ... and so
> on
> 
> to store the buffer in qcow2 with a single operation.
> 
> 'save_page' would allow me not to store 'ram page' in the qemu-file internal
> structures,
> and write my own ram page storing logic. I think that wouldn't help me a lot
> because:
> 1. I need a page with the ram page header
> 2. I want to reduce the number of io operations
> 3. I want to save other parts of vm state as fast as possible
> 
> Maybe I can't see a better way of using the 'save_page' callback.
> Could you suggest anything?

I guess it depends if we care about keeping the format of the snapshot
the same here;  if we were open to changing it, then we could use
the save_page hook to delay the writes, so we'd have a pile of headers
followed by a pile of pages.
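The layout Denis describes, small 8-byte page headers interleaved with ram pages, is exactly what defeats the alignment check in the block layer. Below is a minimal, hypothetical C sketch (invented names and sizes, not QEMU code) of the bounce-buffer approach the series takes: copy every chunk, aligned or not, into one base- and size-aligned buffer and issue a single large write whenever it fills.

```c
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical sizes for illustration only; the patch uses
 * IO_BUF_SIZE = 1 MiB and IO_BUF_ALIGNMENT = 512. */
#define BUF_ALIGN 512
#define BUF_SIZE  (4 * BUF_ALIGN)
#define PAGE_SIZE 1024

typedef struct {
    uint8_t *buf;     /* base- and size-aligned bounce buffer */
    size_t   index;   /* bytes currently used */
    int      flushes; /* number of (simulated) writes issued */
} BounceBuf;

static void bounce_init(BounceBuf *b)
{
    /* aligned allocation so O_DIRECT-style writes stay aligned */
    assert(posix_memalign((void **)&b->buf, BUF_ALIGN, BUF_SIZE) == 0);
    b->index = 0;
    b->flushes = 0;
}

/* stand-in for the (asynchronous) writev_buffer call in the patch */
static void bounce_flush(BounceBuf *b)
{
    if (b->index == 0) {
        return;
    }
    b->flushes++;   /* a real implementation would submit the I/O here */
    b->index = 0;
}

/* copy arbitrary-sized chunks, flushing whenever the buffer fills,
 * mirroring the copy_buf() loop in the patch */
static void bounce_put(BounceBuf *b, const void *data, size_t size)
{
    const uint8_t *p = data;

    while (size > 0) {
        size_t chunk = BUF_SIZE - b->index;
        if (chunk == 0) {
            bounce_flush(b);
            chunk = BUF_SIZE;
        }
        if (chunk > size) {
            chunk = size;
        }
        memcpy(b->buf + b->index, p, chunk);
        b->index += chunk;
        p += chunk;
        size -= chunk;
    }
}
```

With these toy sizes, writing four (header, page) records produces only three large writes instead of eight small, mostly unaligned ones, which is the whole point of buffered mode.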

> Denis
> > In the cover letter you mention direct qemu_fflush calls - have we got a
> > few too many in some places that you think we can clean out?
> 
> I'm not sure that some of them are excessive. To the best of my knowledge,
> qemu-file is used for the source-destination communication on migration
> and removing some qemu_fflush-es may break communication logic.

I can't see any obvious places where it's called during the ram
migration; can you try and give me a hint as to where you're seeing it?

> Snapshot is just a special case (if not the only one) where we know that
> we can do buffered (cached) writes. Do you know any other cases where the
> buffered (cached) mode could be useful?

The RDMA code does it because it's really not good at small transfers,
but maybe generally it would be a good idea to do larger writes if
possible - something that multifd manages.

Dave

> 
> > 
> > Dave
> > 
> > > Signed-off-by: Denis Plotnikov <dplotnikov@virtuozzo.com>
> > > ---
> > >   include/qemu/typedefs.h |   1 +
> > >   migration/qemu-file.c   | 351 +++++++++++++++++++++++++++++++++++++++++++++---
> > >   migration/qemu-file.h   |   9 ++
> > >   3 files changed, 339 insertions(+), 22 deletions(-)
> > > 
> > > diff --git a/include/qemu/typedefs.h b/include/qemu/typedefs.h
> > > index 88dce54..9b388c8 100644
> > > --- a/include/qemu/typedefs.h
> > > +++ b/include/qemu/typedefs.h
> > > @@ -98,6 +98,7 @@ typedef struct QEMUBH QEMUBH;
> > >   typedef struct QemuConsole QemuConsole;
> > >   typedef struct QEMUFile QEMUFile;
> > >   typedef struct QEMUFileBuffer QEMUFileBuffer;
> > > +typedef struct QEMUFileAioTask QEMUFileAioTask;
> > >   typedef struct QemuLockable QemuLockable;
> > >   typedef struct QemuMutex QemuMutex;
> > >   typedef struct QemuOpt QemuOpt;
> > > diff --git a/migration/qemu-file.c b/migration/qemu-file.c
> > > index 285c6ef..f42f949 100644
> > > --- a/migration/qemu-file.c
> > > +++ b/migration/qemu-file.c
> > > @@ -29,19 +29,25 @@
> > >   #include "qemu-file.h"
> > >   #include "trace.h"
> > >   #include "qapi/error.h"
> > > +#include "block/aio_task.h"
> > > -#define IO_BUF_SIZE 32768
> > > +#define IO_BUF_SIZE (1024 * 1024)
> > >   #define MAX_IOV_SIZE MIN(IOV_MAX, 64)
> > > +#define IO_BUF_NUM 2
> > > +#define IO_BUF_ALIGNMENT 512
> > > -QEMU_BUILD_BUG_ON(!QEMU_IS_ALIGNED(IO_BUF_SIZE, 512));
> > > +QEMU_BUILD_BUG_ON(!QEMU_IS_ALIGNED(IO_BUF_SIZE, IO_BUF_ALIGNMENT));
> > > +QEMU_BUILD_BUG_ON(IO_BUF_SIZE > INT_MAX);
> > > +QEMU_BUILD_BUG_ON(IO_BUF_NUM <= 0);
> > >   struct QEMUFileBuffer {
> > >       int buf_index;
> > > -    int buf_size; /* 0 when writing */
> > > +    int buf_size; /* 0 when non-buffered writing */
> > >       uint8_t *buf;
> > >       unsigned long *may_free;
> > >       struct iovec *iov;
> > >       unsigned int iovcnt;
> > > +    QLIST_ENTRY(QEMUFileBuffer) link;
> > >   };
> > >   struct QEMUFile {
> > > @@ -60,6 +66,22 @@ struct QEMUFile {
> > >       bool shutdown;
> > >       /* currently used buffer */
> > >       QEMUFileBuffer *current_buf;
> > > +    /*
> > > +     * with buffered_mode enabled, all the data is copied to a 512 byte
> > > +     * aligned buffer, including iov data. Then the buffer is passed
> > > +     * to the writev_buffer callback.
> > > +     */
> > > +    bool buffered_mode;
> > > +    /* for async buffer writing */
> > > +    AioTaskPool *pool;
> > > +    /* the list of free buffers; the currently used one is NOT there */
> > > +    QLIST_HEAD(, QEMUFileBuffer) free_buffers;
> > > +};
> > > +
> > > +struct QEMUFileAioTask {
> > > +    AioTask task;
> > > +    QEMUFile *f;
> > > +    QEMUFileBuffer *fb;
> > >   };
> > >   /*
> > > @@ -115,10 +137,42 @@ QEMUFile *qemu_fopen_ops(void *opaque, const QEMUFileOps *ops)
> > >       f->opaque = opaque;
> > >       f->ops = ops;
> > > -    f->current_buf = g_new0(QEMUFileBuffer, 1);
> > > -    f->current_buf->buf = g_malloc(IO_BUF_SIZE);
> > > -    f->current_buf->iov = g_new0(struct iovec, MAX_IOV_SIZE);
> > > -    f->current_buf->may_free = bitmap_new(MAX_IOV_SIZE);
> > > +    if (f->ops->enable_buffered) {
> > > +        f->buffered_mode = f->ops->enable_buffered(f->opaque);
> > > +    }
> > > +
> > > +    if (f->buffered_mode && qemu_file_is_writable(f)) {
> > > +        int i;
> > > +        /*
> > > +         * in buffered_mode we don't use internal io vectors
> > > +         * and may_free bitmap, because we copy the data to be
> > > +         * written right away to the buffer
> > > +         */
> > > +        f->pool = aio_task_pool_new(IO_BUF_NUM);
> > > +
> > > +        /* allocate io buffers */
> > > +        for (i = 0; i < IO_BUF_NUM; i++) {
> > > +            QEMUFileBuffer *fb = g_new0(QEMUFileBuffer, 1);
> > > +
> > > +            fb->buf = qemu_memalign(IO_BUF_ALIGNMENT, IO_BUF_SIZE);
> > > +            fb->buf_size = IO_BUF_SIZE;
> > > +
> > > +            /*
> > > +             * put the first buffer to the current buf and the rest
> > > +             * to the list of free buffers
> > > +             */
> > > +            if (i == 0) {
> > > +                f->current_buf = fb;
> > > +            } else {
> > > +                QLIST_INSERT_HEAD(&f->free_buffers, fb, link);
> > > +            }
> > > +        }
> > > +    } else {
> > > +        f->current_buf = g_new0(QEMUFileBuffer, 1);
> > > +        f->current_buf->buf = g_malloc(IO_BUF_SIZE);
> > > +        f->current_buf->iov = g_new0(struct iovec, MAX_IOV_SIZE);
> > > +        f->current_buf->may_free = bitmap_new(MAX_IOV_SIZE);
> > > +    }
> > >       return f;
> > >   }
> > > @@ -190,6 +244,8 @@ static void qemu_iovec_release_ram(QEMUFile *f)
> > >       unsigned long idx;
> > >       QEMUFileBuffer *fb = f->current_buf;
> > > +    assert(!f->buffered_mode);
> > > +
> > >       /* Find and release all the contiguous memory ranges marked as may_free. */
> > >       idx = find_next_bit(fb->may_free, fb->iovcnt, 0);
> > >       if (idx >= fb->iovcnt) {
> > > @@ -221,6 +277,147 @@ static void qemu_iovec_release_ram(QEMUFile *f)
> > >       bitmap_zero(fb->may_free, MAX_IOV_SIZE);
> > >   }
> > > +static void advance_buf_ptr(QEMUFile *f, size_t size)
> > > +{
> > > +    QEMUFileBuffer *fb = f->current_buf;
> > > +    /* must not advance by 0 */
> > > +    assert(size);
> > > +    /* must not overflow buf_index (int) */
> > > +    assert(fb->buf_index + size <= INT_MAX);
> > > +    /* must not exceed buf_size */
> > > +    assert(fb->buf_index + size <= fb->buf_size);
> > > +
> > > +    fb->buf_index += size;
> > > +}
> > > +
> > > +static size_t get_buf_free_size(QEMUFile *f)
> > > +{
> > > +    QEMUFileBuffer *fb = f->current_buf;
> > > +    /* buf_index can't be greater than buf_size */
> > > +    assert(fb->buf_size >= fb->buf_index);
> > > +    return fb->buf_size - fb->buf_index;
> > > +}
> > > +
> > > +static size_t get_buf_used_size(QEMUFile *f)
> > > +{
> > > +    QEMUFileBuffer *fb = f->current_buf;
> > > +    return fb->buf_index;
> > > +}
> > > +
> > > +static uint8_t *get_buf_ptr(QEMUFile *f)
> > > +{
> > > +    QEMUFileBuffer *fb = f->current_buf;
> > > +    /* protects from out-of-bounds reading */
> > > +    assert(fb->buf_index <= IO_BUF_SIZE);
> > > +    return fb->buf + fb->buf_index;
> > > +}
> > > +
> > > +static bool buf_is_full(QEMUFile *f)
> > > +{
> > > +    return get_buf_free_size(f) == 0;
> > > +}
> > > +
> > > +static void reset_buf(QEMUFile *f)
> > > +{
> > > +    QEMUFileBuffer *fb = f->current_buf;
> > > +    fb->buf_index = 0;
> > > +}
> > > +
> > > +static int write_task_fn(AioTask *task)
> > > +{
> > > +    int ret;
> > > +    Error *local_error = NULL;
> > > +    QEMUFileAioTask *t = (QEMUFileAioTask *) task;
> > > +    QEMUFile *f = t->f;
> > > +    QEMUFileBuffer *fb = t->fb;
> > > +    uint64_t pos = f->pos;
> > > +    struct iovec v = (struct iovec) {
> > > +        .iov_base = fb->buf,
> > > +        .iov_len = fb->buf_index,
> > > +    };
> > > +
> > > +    assert(f->buffered_mode);
> > > +
> > > +    /*
> > > +     * Increment file position.
> > > +     * This needs to be here before calling writev_buffer, because
> > > +     * writev_buffer is asynchronous and there could be more than one
> > > +     * writev_buffer started simultaneously. Each writev_buffer should
> > > +     * use its own file pos to write to. writev_buffer may write less
> > > +     * than buf_index bytes but we treat this situation as an error.
> > > +     * If an error occurs, further use of the file is meaningless.
> > > +     * We expect that most of the time the full buffer is written
> > > +     * (when buf_size == buf_index). The only case when a non-full
> > > +     * buffer is written (buf_size != buf_index) is file close,
> > > +     * when we need to flush the rest of the buffer content.
> > > +     */
> > > +    f->pos += fb->buf_index;
> > > +
> > > +    ret = f->ops->writev_buffer(f->opaque, &v, 1, pos, &local_error);
> > > +
> > > +    /* return the just written buffer to the free list */
> > > +    QLIST_INSERT_HEAD(&f->free_buffers, fb, link);
> > > +
> > > +    /* check that we have written everything */
> > > +    if (ret != fb->buf_index) {
> > > +        qemu_file_set_error_obj(f, ret < 0 ? ret : -EIO, local_error);
> > > +    }
> > > +
> > > +    /*
> > > +     * always return 0 - don't use task error handling, rely on
> > > +     * qemu file error handling
> > > +     */
> > > +    return 0;
> > > +}
> > > +
> > > +static void qemu_file_switch_current_buf(QEMUFile *f)
> > > +{
> > > +    /*
> > > +     * if the list is empty, wait until some task returns a buffer
> > > +     * to the list of free buffers.
> > > +     */
> > > +    if (QLIST_EMPTY(&f->free_buffers)) {
> > > +        aio_task_pool_wait_slot(f->pool);
> > > +    }
> > > +
> > > +    /*
> > > +     * sanity check that the list isn't empty:
> > > +     * if the free list was empty, we waited for a task completion,
> > > +     * and the completed task must return a buffer to the list of free buffers
> > > +     */
> > > +    assert(!QLIST_EMPTY(&f->free_buffers));
> > > +
> > > +    /* set the current buffer for using from the free list */
> > > +    f->current_buf = QLIST_FIRST(&f->free_buffers);
> > > +    reset_buf(f);
> > > +
> > > +    QLIST_REMOVE(f->current_buf, link);
> > > +}
> > > +
> > > +/**
> > > + *  Asynchronously flushes QEMUFile buffer
> > > + *
> > > + * This will flush all pending data. If data was only partially flushed, it
> > > + * will set an error state. The function may return before the data is
> > > + * actually written.
> > > + */
> > > +static void flush_buffer(QEMUFile *f)
> > > +{
> > > +    QEMUFileAioTask *t = g_new(QEMUFileAioTask, 1);
> > > +
> > > +    *t = (QEMUFileAioTask) {
> > > +        .task.func = &write_task_fn,
> > > +        .f = f,
> > > +        .fb = f->current_buf,
> > > +    };
> > > +
> > > +    /* aio_task_pool should free t for us */
> > > +    aio_task_pool_start_task(f->pool, (AioTask *) t);
> > > +
> > > +    /* if no errors this will switch the buffer */
> > > +    qemu_file_switch_current_buf(f);
> > > +}
> > > +
> > >   /**
> > >    * Flushes QEMUFile buffer
> > >    *
> > > @@ -241,7 +438,13 @@ void qemu_fflush(QEMUFile *f)
> > >       if (f->shutdown) {
> > >           return;
> > >       }
> > > +
> > > +    if (f->buffered_mode) {
> > > +        return;
> > > +    }
> > > +
> > >       if (fb->iovcnt > 0) {
> > > +        /* this is non-buffered mode */
> > >           expect = iov_size(fb->iov, fb->iovcnt);
> > >           ret = f->ops->writev_buffer(f->opaque, fb->iov, fb->iovcnt, f->pos,
> > >                                       &local_error);
> > > @@ -378,6 +581,7 @@ static ssize_t qemu_fill_buffer(QEMUFile *f)
> > >   void qemu_update_position(QEMUFile *f, size_t size)
> > >   {
> > > +    assert(!f->buffered_mode);
> > >       f->pos += size;
> > >   }
> > > @@ -392,7 +596,18 @@ void qemu_update_position(QEMUFile *f, size_t size)
> > >   int qemu_fclose(QEMUFile *f)
> > >   {
> > >       int ret;
> > > -    qemu_fflush(f);
> > > +
> > > +    if (qemu_file_is_writable(f) && f->buffered_mode) {
> > > +        ret = qemu_file_get_error(f);
> > > +        if (!ret) {
> > > +            flush_buffer(f);
> > > +        }
> > > +        /* wait until all tasks are done */
> > > +        aio_task_pool_wait_all(f->pool);
> > > +    } else {
> > > +        qemu_fflush(f);
> > > +    }
> > > +
> > >       ret = qemu_file_get_error(f);
> > >       if (f->ops->close) {
> > > @@ -408,16 +623,77 @@ int qemu_fclose(QEMUFile *f)
> > >           ret = f->last_error;
> > >       }
> > >       error_free(f->last_error_obj);
> > > -    g_free(f->current_buf->buf);
> > > -    g_free(f->current_buf->iov);
> > > -    g_free(f->current_buf->may_free);
> > > -    g_free(f->current_buf);
> > > +
> > > +    if (f->buffered_mode) {
> > > +        QEMUFileBuffer *fb, *next;
> > > +        /*
> > > +         * put the current buffer back to the free buffers list
> > > +         * to destroy all the buffers in one loop
> > > +         */
> > > +        QLIST_INSERT_HEAD(&f->free_buffers, f->current_buf, link);
> > > +
> > > +        /* destroy all the buffers */
> > > +        QLIST_FOREACH_SAFE(fb, &f->free_buffers, link, next) {
> > > +            QLIST_REMOVE(fb, link);
> > > +            /* looks like qemu_vfree pairs with qemu_memalign */
> > > +            qemu_vfree(fb->buf);
> > > +            g_free(fb);
> > > +        }
> > > +        g_free(f->pool);
> > > +    } else {
> > > +        g_free(f->current_buf->buf);
> > > +        g_free(f->current_buf->iov);
> > > +        g_free(f->current_buf->may_free);
> > > +        g_free(f->current_buf);
> > > +    }
> > > +
> > >       g_free(f);
> > >       trace_qemu_file_fclose();
> > >       return ret;
> > >   }
> > >   /*
> > > + * Copy an external buffer to the internal current buffer.
> > > + */
> > > +static void copy_buf(QEMUFile *f, const uint8_t *buf, size_t size,
> > > +                     bool may_free)
> > > +{
> > > +    size_t data_size = size;
> > > +    const uint8_t *src_ptr = buf;
> > > +
> > > +    assert(f->buffered_mode);
> > > +    assert(size <= INT_MAX);
> > > +
> > > +    while (data_size > 0) {
> > > +        size_t chunk_size;
> > > +
> > > +        if (buf_is_full(f)) {
> > > +            flush_buffer(f);
> > > +            if (qemu_file_get_error(f)) {
> > > +                return;
> > > +            }
> > > +        }
> > > +
> > > +        chunk_size = MIN(get_buf_free_size(f), data_size);
> > > +
> > > +        memcpy(get_buf_ptr(f), src_ptr, chunk_size);
> > > +
> > > +        advance_buf_ptr(f, chunk_size);
> > > +
> > > +        src_ptr += chunk_size;
> > > +        data_size -= chunk_size;
> > > +        f->bytes_xfer += chunk_size;
> > > +    }
> > > +
> > > +    if (may_free) {
> > > +        if (qemu_madvise((void *) buf, size, QEMU_MADV_DONTNEED) < 0) {
> > > +            error_report("migrate: madvise DONTNEED failed %p %zd: %s",
> > > +                         buf, size, strerror(errno));
> > > +        }
> > > +    }
> > > +}
> > > +
> > > +/*
> > >    * Add buf to iovec. Do flush if iovec is full.
> > >    *
> > >    * Return values:
> > > @@ -454,6 +730,9 @@ static int add_to_iovec(QEMUFile *f, const uint8_t *buf, size_t size,
> > >   static void add_buf_to_iovec(QEMUFile *f, size_t len)
> > >   {
> > >       QEMUFileBuffer *fb = f->current_buf;
> > > +
> > > +    assert(!f->buffered_mode);
> > > +
> > >       if (!add_to_iovec(f, fb->buf + fb->buf_index, len, false)) {
> > >           fb->buf_index += len;
> > >           if (fb->buf_index == IO_BUF_SIZE) {
> > > @@ -469,8 +748,12 @@ void qemu_put_buffer_async(QEMUFile *f, const uint8_t *buf, size_t size,
> > >           return;
> > >       }
> > > -    f->bytes_xfer += size;
> > > -    add_to_iovec(f, buf, size, may_free);
> > > +    if (f->buffered_mode) {
> > > +        copy_buf(f, buf, size, may_free);
> > > +    } else {
> > > +        f->bytes_xfer += size;
> > > +        add_to_iovec(f, buf, size, may_free);
> > > +    }
> > >   }
> > >   void qemu_put_buffer(QEMUFile *f, const uint8_t *buf, size_t size)
> > > @@ -482,6 +765,11 @@ void qemu_put_buffer(QEMUFile *f, const uint8_t *buf, size_t size)
> > >           return;
> > >       }
> > > +    if (f->buffered_mode) {
> > > +        copy_buf(f, buf, size, false);
> > > +        return;
> > > +    }
> > > +
> > >       while (size > 0) {
> > >           l = IO_BUF_SIZE - fb->buf_index;
> > >           if (l > size) {
> > > @@ -506,15 +794,21 @@ void qemu_put_byte(QEMUFile *f, int v)
> > >           return;
> > >       }
> > > -    fb->buf[fb->buf_index] = v;
> > > -    f->bytes_xfer++;
> > > -    add_buf_to_iovec(f, 1);
> > > +    if (f->buffered_mode) {
> > > +        copy_buf(f, (const uint8_t *) &v, 1, false);
> > > +    } else {
> > > +        fb->buf[fb->buf_index] = v;
> > > +        add_buf_to_iovec(f, 1);
> > > +        f->bytes_xfer++;
> > > +    }
> > >   }
> > >   void qemu_file_skip(QEMUFile *f, int size)
> > >   {
> > >       QEMUFileBuffer *fb = f->current_buf;
> > > +    assert(!f->buffered_mode);
> > > +
> > >       if (fb->buf_index + size <= fb->buf_size) {
> > >           fb->buf_index += size;
> > >       }
> > > @@ -672,10 +966,14 @@ int64_t qemu_ftell_fast(QEMUFile *f)
> > >   {
> > >       int64_t ret = f->pos;
> > >       int i;
> > > -    QEMUFileBuffer *fb = f->current_buf;
> > > -    for (i = 0; i < fb->iovcnt; i++) {
> > > -        ret += fb->iov[i].iov_len;
> > > +    if (f->buffered_mode) {
> > > +        ret += get_buf_used_size(f);
> > > +    } else {
> > > +        QEMUFileBuffer *fb = f->current_buf;
> > > +        for (i = 0; i < fb->iovcnt; i++) {
> > > +            ret += fb->iov[i].iov_len;
> > > +        }
> > >       }
> > >       return ret;
> > > @@ -683,8 +981,12 @@ int64_t qemu_ftell_fast(QEMUFile *f)
> > >   int64_t qemu_ftell(QEMUFile *f)
> > >   {
> > > -    qemu_fflush(f);
> > > -    return f->pos;
> > > +    if (f->buffered_mode) {
> > > +        return qemu_ftell_fast(f);
> > > +    } else {
> > > +        qemu_fflush(f);
> > > +        return f->pos;
> > > +    }
> > >   }
> > >   int qemu_file_rate_limit(QEMUFile *f)
> > > @@ -803,6 +1105,8 @@ ssize_t qemu_put_compression_data(QEMUFile *f, z_stream *stream,
> > >       QEMUFileBuffer *fb = f->current_buf;
> > >       ssize_t blen = IO_BUF_SIZE - fb->buf_index - sizeof(int32_t);
> > > +    assert(!f->buffered_mode);
> > > +
> > >       if (blen < compressBound(size)) {
> > >           return -1;
> > >       }
> > > @@ -827,6 +1131,9 @@ int qemu_put_qemu_file(QEMUFile *f_des, QEMUFile *f_src)
> > >       int len = 0;
> > >       QEMUFileBuffer *fb_src = f_src->current_buf;
> > > +    assert(!f_des->buffered_mode);
> > > +    assert(!f_src->buffered_mode);
> > > +
> > >       if (fb_src->buf_index > 0) {
> > >           len = fb_src->buf_index;
> > >           qemu_put_buffer(f_des, fb_src->buf, fb_src->buf_index);
> > > diff --git a/migration/qemu-file.h b/migration/qemu-file.h
> > > index a9b6d6c..08655d2 100644
> > > --- a/migration/qemu-file.h
> > > +++ b/migration/qemu-file.h
> > > @@ -103,6 +103,14 @@ typedef QEMUFile *(QEMURetPathFunc)(void *opaque);
> > >   typedef int (QEMUFileShutdownFunc)(void *opaque, bool rd, bool wr,
> > >                                      Error **errp);
> > > +/*
> > > + * Enables or disables the buffered mode
> > > + * Existing blocking reads/writes must be woken
> > > + * Returns true if the buffered mode has to be enabled,
> > > + * false if it has to be disabled.
> > > + */
> > > +typedef bool (QEMUFileEnableBufferedFunc)(void *opaque);
> > > +
> > >   typedef struct QEMUFileOps {
> > >       QEMUFileGetBufferFunc *get_buffer;
> > >       QEMUFileCloseFunc *close;
> > > @@ -110,6 +118,7 @@ typedef struct QEMUFileOps {
> > >       QEMUFileWritevBufferFunc *writev_buffer;
> > >       QEMURetPathFunc *get_return_path;
> > >       QEMUFileShutdownFunc *shut_down;
> > > +    QEMUFileEnableBufferedFunc *enable_buffered;
> > >   } QEMUFileOps;
> > >   typedef struct QEMUFileHooks {
> > > -- 
> > > 1.8.3.1
> > > 
> > --
> > Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
> > 
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK



^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [RFC patch v1 2/3] qemu-file: add buffered mode
  2020-04-28 17:54       ` Dr. David Alan Gilbert
@ 2020-04-28 20:25         ` Denis Plotnikov
  0 siblings, 0 replies; 19+ messages in thread
From: Denis Plotnikov @ 2020-04-28 20:25 UTC (permalink / raw)
  To: Dr. David Alan Gilbert; +Cc: den, qemu-devel, quintela



On 28.04.2020 20:54, Dr. David Alan Gilbert wrote:
> * Denis Plotnikov (dplotnikov@virtuozzo.com) wrote:
>>
>> On 27.04.2020 15:14, Dr. David Alan Gilbert wrote:
>>> * Denis Plotnikov (dplotnikov@virtuozzo.com) wrote:
>>>> The patch adds the ability for qemu-file to write data
>>>> asynchronously to improve write performance.
>>>> Before, only synchronous writing was supported.
>>>>
>>>> Enabling the asynchronous mode is managed by the new
>>>> "enable_buffered" callback.
>>> It's a bit invasive isn't it - changes a lot of functions in a lot of
>>> places!
>> If you mean changing the qemu-file code - yes, it is.
> Yeh that's what I worry about; qemu-file is pretty complex as it is.
> Especially when it then passes it to the channel code etc
>
>> If you mean changing the qemu-file usage in the code - no.
>> The only place to change is the snapshot code when the buffered mode is
>> enabled with a callback.
>> The change is in 03 patch of the series.
> That's fine - that's easy.
>
>>> The multifd code separated the control headers from the data on separate
>>> fd's - but that doesn't help your case.
>> yes, that doesn't help
>>> Is there any chance you could do this by using the existing 'save_page'
>>> hook (that RDMA uses).
>> I don't think so. My goal is to improve writing performance of
>> the internal snapshot to a qcow2 image. The snapshot is saved in qcow2 as a
>> continuous stream placed at the end of the address space.
>> To achieve the best writing speed I need a size and base-aligned buffer
>> containing the vm state (with ram) which looks like that (related to ram):
>>
>> ... | ram page header | ram page | ram page header | ram page | ... and so
>> on
>>
>> to store the buffer in qcow2 with a single operation.
>>
>> 'save_page' would allow me not to store 'ram page' in the qemu-file internal
>> structures,
>> and write my own ram page storing logic. I think that wouldn't help me a lot
>> because:
>> 1. I need a page with the ram page header
>> 2. I want to reduce the number of io operations
>> 3. I want to save other parts of vm state as fast as possible
>>
>> Maybe I can't see a better way of using the 'save_page' callback.
>> Could you suggest anything?
> I guess it depends if we care about keeping the format of the snapshot
> the same here;  if we were open to changing it, then we could use
> the save_page hook to delay the writes, so we'd have a pile of headers
> followed by a pile of pages.

I think we have to care about keeping the format: many users already
have internal snapshots saved in their qcow2 images. If we change the
format, we can't load the snapshots from those images, and new snapshots
become unreadable for older QEMUs; otherwise we'd need to support two
versions of the format, which I think is too complicated.

>
>> Denis
>>> In the cover letter you mention direct qemu_fflush calls - have we got a
>>> few too many in some places that you think we can clean out?
>> I'm not sure that some of them are excessive. To the best of my knowledge,
>> qemu-file is used for the source-destination communication on migration
>> and removing some qemu_fflush-es may break communication logic.
> I can't see any obvious places where it's called during the ram
> migration; can you try and give me a hint as to where you're seeing it?

I think those qemu_fflush-es aren't in the ram migration but in the
other vm state parts. Although those parts are quite small in comparison
to ram, I saw quite a lot of qemu_fflush-es while debugging.
Still, we could benefit from saving them with a smaller number of io
operations if we're going to use buffered mode.

Denis

>
>> Snapshot is just a special case (if not the only one) where we know that we
>> can do buffered (cached) writes. Do you know any other cases where the
>> buffered (cached) mode could be useful?
> The RDMA code does it because it's really not good at small transfers,
> but maybe generally it would be a good idea to do larger writes if
> possible - something that multifd manages.
>
> Dave
>
>>> Dave
>>>
>>>> Signed-off-by: Denis Plotnikov <dplotnikov@virtuozzo.com>
>>>> ---
>>>>    include/qemu/typedefs.h |   1 +
>>>>    migration/qemu-file.c   | 351 +++++++++++++++++++++++++++++++++++++++++++++---
>>>>    migration/qemu-file.h   |   9 ++
>>>>    3 files changed, 339 insertions(+), 22 deletions(-)
>>>>
>>>> diff --git a/include/qemu/typedefs.h b/include/qemu/typedefs.h
>>>> index 88dce54..9b388c8 100644
>>>> --- a/include/qemu/typedefs.h
>>>> +++ b/include/qemu/typedefs.h
>>>> @@ -98,6 +98,7 @@ typedef struct QEMUBH QEMUBH;
>>>>    typedef struct QemuConsole QemuConsole;
>>>>    typedef struct QEMUFile QEMUFile;
>>>>    typedef struct QEMUFileBuffer QEMUFileBuffer;
>>>> +typedef struct QEMUFileAioTask QEMUFileAioTask;
>>>>    typedef struct QemuLockable QemuLockable;
>>>>    typedef struct QemuMutex QemuMutex;
>>>>    typedef struct QemuOpt QemuOpt;
>>>> diff --git a/migration/qemu-file.c b/migration/qemu-file.c
>>>> index 285c6ef..f42f949 100644
>>>> --- a/migration/qemu-file.c
>>>> +++ b/migration/qemu-file.c
>>>> @@ -29,19 +29,25 @@
>>>>    #include "qemu-file.h"
>>>>    #include "trace.h"
>>>>    #include "qapi/error.h"
>>>> +#include "block/aio_task.h"
>>>> -#define IO_BUF_SIZE 32768
>>>> +#define IO_BUF_SIZE (1024 * 1024)
>>>>    #define MAX_IOV_SIZE MIN(IOV_MAX, 64)
>>>> +#define IO_BUF_NUM 2
>>>> +#define IO_BUF_ALIGNMENT 512
>>>> -QEMU_BUILD_BUG_ON(!QEMU_IS_ALIGNED(IO_BUF_SIZE, 512));
>>>> +QEMU_BUILD_BUG_ON(!QEMU_IS_ALIGNED(IO_BUF_SIZE, IO_BUF_ALIGNMENT));
>>>> +QEMU_BUILD_BUG_ON(IO_BUF_SIZE > INT_MAX);
>>>> +QEMU_BUILD_BUG_ON(IO_BUF_NUM <= 0);
>>>>    struct QEMUFileBuffer {
>>>>        int buf_index;
>>>> -    int buf_size; /* 0 when writing */
>>>> +    int buf_size; /* 0 when non-buffered writing */
>>>>        uint8_t *buf;
>>>>        unsigned long *may_free;
>>>>        struct iovec *iov;
>>>>        unsigned int iovcnt;
>>>> +    QLIST_ENTRY(QEMUFileBuffer) link;
>>>>    };
>>>>    struct QEMUFile {
>>>> @@ -60,6 +66,22 @@ struct QEMUFile {
>>>>        bool shutdown;
>>>>        /* currently used buffer */
>>>>        QEMUFileBuffer *current_buf;
>>>> +    /*
>>>> +     * with buffered_mode enabled, all the data is copied to a 512-byte
>>>> +     * aligned buffer, including iov data. Then the buffer is passed
>>>> +     * to the writev_buffer callback.
>>>> +     */
>>>> +    bool buffered_mode;
>>>> +    /* for async buffer writing */
>>>> +    AioTaskPool *pool;
>>>> +    /* the list of free buffers; the currently used one is NOT there */
>>>> +    QLIST_HEAD(, QEMUFileBuffer) free_buffers;
>>>> +};
>>>> +
>>>> +struct QEMUFileAioTask {
>>>> +    AioTask task;
>>>> +    QEMUFile *f;
>>>> +    QEMUFileBuffer *fb;
>>>>    };
>>>>    /*
>>>> @@ -115,10 +137,42 @@ QEMUFile *qemu_fopen_ops(void *opaque, const QEMUFileOps *ops)
>>>>        f->opaque = opaque;
>>>>        f->ops = ops;
>>>> -    f->current_buf = g_new0(QEMUFileBuffer, 1);
>>>> -    f->current_buf->buf = g_malloc(IO_BUF_SIZE);
>>>> -    f->current_buf->iov = g_new0(struct iovec, MAX_IOV_SIZE);
>>>> -    f->current_buf->may_free = bitmap_new(MAX_IOV_SIZE);
>>>> +    if (f->ops->enable_buffered) {
>>>> +        f->buffered_mode = f->ops->enable_buffered(f->opaque);
>>>> +    }
>>>> +
>>>> +    if (f->buffered_mode && qemu_file_is_writable(f)) {
>>>> +        int i;
>>>> +        /*
>>>> +         * in buffered_mode we don't use internal io vectors
>>>> +         * and may_free bitmap, because we copy the data to be
>>>> +         * written right away to the buffer
>>>> +         */
>>>> +        f->pool = aio_task_pool_new(IO_BUF_NUM);
>>>> +
>>>> +        /* allocate io buffers */
>>>> +        for (i = 0; i < IO_BUF_NUM; i++) {
>>>> +            QEMUFileBuffer *fb = g_new0(QEMUFileBuffer, 1);
>>>> +
>>>> +            fb->buf = qemu_memalign(IO_BUF_ALIGNMENT, IO_BUF_SIZE);
>>>> +            fb->buf_size = IO_BUF_SIZE;
>>>> +
>>>> +            /*
>>>> +             * put the first buffer to the current buf and the rest
>>>> +             * to the list of free buffers
>>>> +             */
>>>> +            if (i == 0) {
>>>> +                f->current_buf = fb;
>>>> +            } else {
>>>> +                QLIST_INSERT_HEAD(&f->free_buffers, fb, link);
>>>> +            }
>>>> +        }
>>>> +    } else {
>>>> +        f->current_buf = g_new0(QEMUFileBuffer, 1);
>>>> +        f->current_buf->buf = g_malloc(IO_BUF_SIZE);
>>>> +        f->current_buf->iov = g_new0(struct iovec, MAX_IOV_SIZE);
>>>> +        f->current_buf->may_free = bitmap_new(MAX_IOV_SIZE);
>>>> +    }
>>>>        return f;
>>>>    }
>>>> @@ -190,6 +244,8 @@ static void qemu_iovec_release_ram(QEMUFile *f)
>>>>        unsigned long idx;
>>>>        QEMUFileBuffer *fb = f->current_buf;
>>>> +    assert(!f->buffered_mode);
>>>> +
>>>>        /* Find and release all the contiguous memory ranges marked as may_free. */
>>>>        idx = find_next_bit(fb->may_free, fb->iovcnt, 0);
>>>>        if (idx >= fb->iovcnt) {
>>>> @@ -221,6 +277,147 @@ static void qemu_iovec_release_ram(QEMUFile *f)
>>>>        bitmap_zero(fb->may_free, MAX_IOV_SIZE);
>>>>    }
>>>> +static void advance_buf_ptr(QEMUFile *f, size_t size)
>>>> +{
>>>> +    QEMUFileBuffer *fb = f->current_buf;
>>>> +    /* must not advance by 0 */
>>>> +    assert(size);
>>>> +    /* must not overflow buf_index (int) */
>>>> +    assert(fb->buf_index + size <= INT_MAX);
>>>> +    /* must not exceed buf_size */
>>>> +    assert(fb->buf_index + size <= fb->buf_size);
>>>> +
>>>> +    fb->buf_index += size;
>>>> +}
>>>> +
>>>> +static size_t get_buf_free_size(QEMUFile *f)
>>>> +{
>>>> +    QEMUFileBuffer *fb = f->current_buf;
>>>> +    /* buf_index can't be greater than buf_size */
>>>> +    assert(fb->buf_size >= fb->buf_index);
>>>> +    return fb->buf_size - fb->buf_index;
>>>> +}
>>>> +
>>>> +static size_t get_buf_used_size(QEMUFile *f)
>>>> +{
>>>> +    QEMUFileBuffer *fb = f->current_buf;
>>>> +    return fb->buf_index;
>>>> +}
>>>> +
>>>> +static uint8_t *get_buf_ptr(QEMUFile *f)
>>>> +{
>>>> +    QEMUFileBuffer *fb = f->current_buf;
>>>> +    /* protects from out-of-bounds reading */
>>>> +    assert(fb->buf_index <= IO_BUF_SIZE);
>>>> +    return fb->buf + fb->buf_index;
>>>> +}
>>>> +
>>>> +static bool buf_is_full(QEMUFile *f)
>>>> +{
>>>> +    return get_buf_free_size(f) == 0;
>>>> +}
>>>> +
>>>> +static void reset_buf(QEMUFile *f)
>>>> +{
>>>> +    QEMUFileBuffer *fb = f->current_buf;
>>>> +    fb->buf_index = 0;
>>>> +}
>>>> +
>>>> +static int write_task_fn(AioTask *task)
>>>> +{
>>>> +    int ret;
>>>> +    Error *local_error = NULL;
>>>> +    QEMUFileAioTask *t = (QEMUFileAioTask *) task;
>>>> +    QEMUFile *f = t->f;
>>>> +    QEMUFileBuffer *fb = t->fb;
>>>> +    uint64_t pos = f->pos;
>>>> +    struct iovec v = (struct iovec) {
>>>> +        .iov_base = fb->buf,
>>>> +        .iov_len = fb->buf_index,
>>>> +    };
>>>> +
>>>> +    assert(f->buffered_mode);
>>>> +
>>>> +    /*
>>>> +     * Increment file position.
>>>> +     * This needs to be here before calling writev_buffer, because
>>>> +     * writev_buffer is asynchronous and there could be more than one
>>>> +     * writev_buffer started simultaneously. Each writev_buffer should
>>>> +     * use its own file pos to write to. writev_buffer may write less
>>>> +     * than buf_index bytes, but we treat this situation as an error.
>>>> +     * If an error appears, further use of the file is meaningless.
>>>> +     * We expect that most of the time the full buffer is written
>>>> +     * (when buf_size == buf_index). The only case when the non-full
>>>> +     * buffer is written (buf_size != buf_index) is file close,
>>>> +     * when we need to flush the rest of the buffer content.
>>>> +     */
>>>> +    f->pos += fb->buf_index;
>>>> +
>>>> +    ret = f->ops->writev_buffer(f->opaque, &v, 1, pos, &local_error);
>>>> +
>>>> +    /* return the just written buffer to the free list */
>>>> +    QLIST_INSERT_HEAD(&f->free_buffers, fb, link);
>>>> +
>>>> +    /* check that we have written everything */
>>>> +    if (ret != fb->buf_index) {
>>>> +        qemu_file_set_error_obj(f, ret < 0 ? ret : -EIO, local_error);
>>>> +    }
>>>> +
>>>> +    /*
>>>> +     * always return 0 - don't use task error handling, rely on
>>>> +     * qemu-file error handling
>>>> +     */
>>>> +    return 0;
>>>> +}
>>>> +
>>>> +static void qemu_file_switch_current_buf(QEMUFile *f)
>>>> +{
>>>> +    /*
>>>> +     * if the list is empty, wait until some task returns a buffer
>>>> +     * to the list of free buffers.
>>>> +     */
>>>> +    if (QLIST_EMPTY(&f->free_buffers)) {
>>>> +        aio_task_pool_wait_slot(f->pool);
>>>> +    }
>>>> +
>>>> +    /*
>>>> +     * sanity check that the list isn't empty
>>>> +     * if the free list was empty, we waited for a task completion,
>>>> +     * and the completed task must return a buffer to the list of free buffers
>>>> +     */
>>>> +    assert(!QLIST_EMPTY(&f->free_buffers));
>>>> +
>>>> +    /* take the next buffer to use from the free list */
>>>> +    f->current_buf = QLIST_FIRST(&f->free_buffers);
>>>> +    reset_buf(f);
>>>> +
>>>> +    QLIST_REMOVE(f->current_buf, link);
>>>> +}
>>>> +
>>>> +/**
>>>> + * Asynchronously flushes the QEMUFile buffer
>>>> + *
>>>> + * This will flush all pending data. If data was only partially flushed, it
>>>> + * will set an error state. The function may return before the data is
>>>> + * actually written.
>>>> + */
>>>> +static void flush_buffer(QEMUFile *f)
>>>> +{
>>>> +    QEMUFileAioTask *t = g_new(QEMUFileAioTask, 1);
>>>> +
>>>> +    *t = (QEMUFileAioTask) {
>>>> +        .task.func = &write_task_fn,
>>>> +        .f = f,
>>>> +        .fb = f->current_buf,
>>>> +    };
>>>> +
>>>> +    /* aio_task_pool should free t for us */
>>>> +    aio_task_pool_start_task(f->pool, (AioTask *) t);
>>>> +
>>>> +    /* if there are no errors, this will switch the buffer */
>>>> +    qemu_file_switch_current_buf(f);
>>>> +}
>>>> +
>>>>    /**
>>>>     * Flushes QEMUFile buffer
>>>>     *
>>>> @@ -241,7 +438,13 @@ void qemu_fflush(QEMUFile *f)
>>>>        if (f->shutdown) {
>>>>            return;
>>>>        }
>>>> +
>>>> +    if (f->buffered_mode) {
>>>> +        return;
>>>> +    }
>>>> +
>>>>        if (fb->iovcnt > 0) {
>>>> +        /* this is non-buffered mode */
>>>>            expect = iov_size(fb->iov, fb->iovcnt);
>>>>            ret = f->ops->writev_buffer(f->opaque, fb->iov, fb->iovcnt, f->pos,
>>>>                                        &local_error);
>>>> @@ -378,6 +581,7 @@ static ssize_t qemu_fill_buffer(QEMUFile *f)
>>>>    void qemu_update_position(QEMUFile *f, size_t size)
>>>>    {
>>>> +    assert(!f->buffered_mode);
>>>>        f->pos += size;
>>>>    }
>>>> @@ -392,7 +596,18 @@ void qemu_update_position(QEMUFile *f, size_t size)
>>>>    int qemu_fclose(QEMUFile *f)
>>>>    {
>>>>        int ret;
>>>> -    qemu_fflush(f);
>>>> +
>>>> +    if (qemu_file_is_writable(f) && f->buffered_mode) {
>>>> +        ret = qemu_file_get_error(f);
>>>> +        if (!ret) {
>>>> +            flush_buffer(f);
>>>> +        }
>>>> +        /* wait until all tasks are done */
>>>> +        aio_task_pool_wait_all(f->pool);
>>>> +    } else {
>>>> +        qemu_fflush(f);
>>>> +    }
>>>> +
>>>>        ret = qemu_file_get_error(f);
>>>>        if (f->ops->close) {
>>>> @@ -408,16 +623,77 @@ int qemu_fclose(QEMUFile *f)
>>>>            ret = f->last_error;
>>>>        }
>>>>        error_free(f->last_error_obj);
>>>> -    g_free(f->current_buf->buf);
>>>> -    g_free(f->current_buf->iov);
>>>> -    g_free(f->current_buf->may_free);
>>>> -    g_free(f->current_buf);
>>>> +
>>>> +    if (f->buffered_mode) {
>>>> +        QEMUFileBuffer *fb, *next;
>>>> +        /*
>>>> +         * put the current buffer back on the free list
>>>> +         * to destroy all the buffers in one loop
>>>> +         */
>>>> +        QLIST_INSERT_HEAD(&f->free_buffers, f->current_buf, link);
>>>> +
>>>> +        /* destroy all the buffers */
>>>> +        QLIST_FOREACH_SAFE(fb, &f->free_buffers, link, next) {
>>>> +            QLIST_REMOVE(fb, link);
>>>> +            /* looks like qemu_vfree pairs with qemu_memalign */
>>>> +            qemu_vfree(fb->buf);
>>>> +            g_free(fb);
>>>> +        }
>>>> +        g_free(f->pool);
>>>> +    } else {
>>>> +        g_free(f->current_buf->buf);
>>>> +        g_free(f->current_buf->iov);
>>>> +        g_free(f->current_buf->may_free);
>>>> +        g_free(f->current_buf);
>>>> +    }
>>>> +
>>>>        g_free(f);
>>>>        trace_qemu_file_fclose();
>>>>        return ret;
>>>>    }
>>>>    /*
>>>> + * Copy an external buffer to the internal current buffer.
>>>> + */
>>>> +static void copy_buf(QEMUFile *f, const uint8_t *buf, size_t size,
>>>> +                     bool may_free)
>>>> +{
>>>> +    size_t data_size = size;
>>>> +    const uint8_t *src_ptr = buf;
>>>> +
>>>> +    assert(f->buffered_mode);
>>>> +    assert(size <= INT_MAX);
>>>> +
>>>> +    while (data_size > 0) {
>>>> +        size_t chunk_size;
>>>> +
>>>> +        if (buf_is_full(f)) {
>>>> +            flush_buffer(f);
>>>> +            if (qemu_file_get_error(f)) {
>>>> +                return;
>>>> +            }
>>>> +        }
>>>> +
>>>> +        chunk_size = MIN(get_buf_free_size(f), data_size);
>>>> +
>>>> +        memcpy(get_buf_ptr(f), src_ptr, chunk_size);
>>>> +
>>>> +        advance_buf_ptr(f, chunk_size);
>>>> +
>>>> +        src_ptr += chunk_size;
>>>> +        data_size -= chunk_size;
>>>> +        f->bytes_xfer += chunk_size;
>>>> +    }
>>>> +
>>>> +    if (may_free) {
>>>> +        if (qemu_madvise((void *) buf, size, QEMU_MADV_DONTNEED) < 0) {
>>>> +            error_report("migrate: madvise DONTNEED failed %p %zd: %s",
>>>> +                         buf, size, strerror(errno));
>>>> +        }
>>>> +    }
>>>> +}
>>>> +
>>>> +/*
>>>>     * Add buf to iovec. Do flush if iovec is full.
>>>>     *
>>>>     * Return values:
>>>> @@ -454,6 +730,9 @@ static int add_to_iovec(QEMUFile *f, const uint8_t *buf, size_t size,
>>>>    static void add_buf_to_iovec(QEMUFile *f, size_t len)
>>>>    {
>>>>        QEMUFileBuffer *fb = f->current_buf;
>>>> +
>>>> +    assert(!f->buffered_mode);
>>>> +
>>>>        if (!add_to_iovec(f, fb->buf + fb->buf_index, len, false)) {
>>>>            fb->buf_index += len;
>>>>            if (fb->buf_index == IO_BUF_SIZE) {
>>>> @@ -469,8 +748,12 @@ void qemu_put_buffer_async(QEMUFile *f, const uint8_t *buf, size_t size,
>>>>            return;
>>>>        }
>>>> -    f->bytes_xfer += size;
>>>> -    add_to_iovec(f, buf, size, may_free);
>>>> +    if (f->buffered_mode) {
>>>> +        copy_buf(f, buf, size, may_free);
>>>> +    } else {
>>>> +        f->bytes_xfer += size;
>>>> +        add_to_iovec(f, buf, size, may_free);
>>>> +    }
>>>>    }
>>>>    void qemu_put_buffer(QEMUFile *f, const uint8_t *buf, size_t size)
>>>> @@ -482,6 +765,11 @@ void qemu_put_buffer(QEMUFile *f, const uint8_t *buf, size_t size)
>>>>            return;
>>>>        }
>>>> +    if (f->buffered_mode) {
>>>> +        copy_buf(f, buf, size, false);
>>>> +        return;
>>>> +    }
>>>> +
>>>>        while (size > 0) {
>>>>            l = IO_BUF_SIZE - fb->buf_index;
>>>>            if (l > size) {
>>>> @@ -506,15 +794,21 @@ void qemu_put_byte(QEMUFile *f, int v)
>>>>            return;
>>>>        }
>>>> -    fb->buf[fb->buf_index] = v;
>>>> -    f->bytes_xfer++;
>>>> -    add_buf_to_iovec(f, 1);
>>>> +    if (f->buffered_mode) {
>>>> +        copy_buf(f, (const uint8_t *) &v, 1, false);
>>>> +    } else {
>>>> +        fb->buf[fb->buf_index] = v;
>>>> +        add_buf_to_iovec(f, 1);
>>>> +        f->bytes_xfer++;
>>>> +    }
>>>>    }
>>>>    void qemu_file_skip(QEMUFile *f, int size)
>>>>    {
>>>>        QEMUFileBuffer *fb = f->current_buf;
>>>> +    assert(!f->buffered_mode);
>>>> +
>>>>        if (fb->buf_index + size <= fb->buf_size) {
>>>>            fb->buf_index += size;
>>>>        }
>>>> @@ -672,10 +966,14 @@ int64_t qemu_ftell_fast(QEMUFile *f)
>>>>    {
>>>>        int64_t ret = f->pos;
>>>>        int i;
>>>> -    QEMUFileBuffer *fb = f->current_buf;
>>>> -    for (i = 0; i < fb->iovcnt; i++) {
>>>> -        ret += fb->iov[i].iov_len;
>>>> +    if (f->buffered_mode) {
>>>> +        ret += get_buf_used_size(f);
>>>> +    } else {
>>>> +        QEMUFileBuffer *fb = f->current_buf;
>>>> +        for (i = 0; i < fb->iovcnt; i++) {
>>>> +            ret += fb->iov[i].iov_len;
>>>> +        }
>>>>        }
>>>>        return ret;
>>>> @@ -683,8 +981,12 @@ int64_t qemu_ftell_fast(QEMUFile *f)
>>>>    int64_t qemu_ftell(QEMUFile *f)
>>>>    {
>>>> -    qemu_fflush(f);
>>>> -    return f->pos;
>>>> +    if (f->buffered_mode) {
>>>> +        return qemu_ftell_fast(f);
>>>> +    } else {
>>>> +        qemu_fflush(f);
>>>> +        return f->pos;
>>>> +    }
>>>>    }
>>>>    int qemu_file_rate_limit(QEMUFile *f)
>>>> @@ -803,6 +1105,8 @@ ssize_t qemu_put_compression_data(QEMUFile *f, z_stream *stream,
>>>>        QEMUFileBuffer *fb = f->current_buf;
>>>>        ssize_t blen = IO_BUF_SIZE - fb->buf_index - sizeof(int32_t);
>>>> +    assert(!f->buffered_mode);
>>>> +
>>>>        if (blen < compressBound(size)) {
>>>>            return -1;
>>>>        }
>>>> @@ -827,6 +1131,9 @@ int qemu_put_qemu_file(QEMUFile *f_des, QEMUFile *f_src)
>>>>        int len = 0;
>>>>        QEMUFileBuffer *fb_src = f_src->current_buf;
>>>> +    assert(!f_des->buffered_mode);
>>>> +    assert(!f_src->buffered_mode);
>>>> +
>>>>        if (fb_src->buf_index > 0) {
>>>>            len = fb_src->buf_index;
>>>>            qemu_put_buffer(f_des, fb_src->buf, fb_src->buf_index);
>>>> diff --git a/migration/qemu-file.h b/migration/qemu-file.h
>>>> index a9b6d6c..08655d2 100644
>>>> --- a/migration/qemu-file.h
>>>> +++ b/migration/qemu-file.h
>>>> @@ -103,6 +103,14 @@ typedef QEMUFile *(QEMURetPathFunc)(void *opaque);
>>>>    typedef int (QEMUFileShutdownFunc)(void *opaque, bool rd, bool wr,
>>>>                                       Error **errp);
>>>> +/*
>>>> + * Enables or disables the buffered mode
>>>> + * Existing blocking reads/writes must be woken
>>>> + * Returns true if the buffered mode has to be enabled,
>>>> + * false if it has to be disabled.
>>>> + */
>>>> +typedef bool (QEMUFileEnableBufferedFunc)(void *opaque);
>>>> +
>>>>    typedef struct QEMUFileOps {
>>>>        QEMUFileGetBufferFunc *get_buffer;
>>>>        QEMUFileCloseFunc *close;
>>>> @@ -110,6 +118,7 @@ typedef struct QEMUFileOps {
>>>>        QEMUFileWritevBufferFunc *writev_buffer;
>>>>        QEMURetPathFunc *get_return_path;
>>>>        QEMUFileShutdownFunc *shut_down;
>>>> +    QEMUFileEnableBufferedFunc *enable_buffered;
>>>>    } QEMUFileOps;
>>>>    typedef struct QEMUFileHooks {
>>>> -- 
>>>> 1.8.3.1
>>>>
>>> --
>>> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
>>>
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
>




* Re: [RFC patch v1 2/3] qemu-file: add buffered mode
  2020-04-27 12:14   ` Dr. David Alan Gilbert
  2020-04-28  8:06     ` Denis Plotnikov
@ 2020-05-04  9:08     ` Daniel P. Berrangé
  1 sibling, 0 replies; 19+ messages in thread
From: Daniel P. Berrangé @ 2020-05-04  9:08 UTC (permalink / raw)
  To: Dr. David Alan Gilbert; +Cc: den, Denis Plotnikov, qemu-devel, quintela

On Mon, Apr 27, 2020 at 01:14:33PM +0100, Dr. David Alan Gilbert wrote:
> * Denis Plotnikov (dplotnikov@virtuozzo.com) wrote:
> > The patch adds the ability for qemu-file to write data
> > asynchronously to improve write performance.
> > Before, only synchronous writing was supported.
> > 
> > Enabling the asynchronous mode is managed by the new
> > "enable_buffered" callback.
> 
> It's a bit invasive isn't it - changes a lot of functions in a lot of
> places!
> The multifd code separated the control headers from the data on separate
> fd's - but that doesn't help your case.
> 
> Is there any chance you could do this by using the existing 'save_page'
> hook (that RDMA uses).
> 
> In the cover letter you mention direct qemu_fflush calls - have we got a
> few too many in some places that you think we can clean out?

When I first introduced the QIOChannel framework, I hoped that we could
largely eliminate QEMUFile as a concept.  Thus I'm a bit suspicious of
the idea of introducing more functionality to QEMUFile, especially as the
notion of buffering I/O is rather generic. Is there scope for having a
QIOChannelBuffered object for doing buffering? Would that provide better
isolation from the migration code and thus be less invasive/complex to
maintain?


Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|




end of thread, other threads:[~2020-05-04  9:09 UTC | newest]

Thread overview: 19+ messages
2020-04-13 11:12 [RFC patch v1 0/3] qemu-file writing performance improving Denis Plotnikov
2020-04-13 11:12 ` [RFC patch v1 1/3] qemu-file: introduce current buffer Denis Plotnikov
2020-04-24 17:47   ` Vladimir Sementsov-Ogievskiy
2020-04-24 21:12   ` Eric Blake
2020-04-13 11:12 ` [RFC patch v1 2/3] qemu-file: add buffered mode Denis Plotnikov
2020-04-24 21:25   ` Eric Blake
2020-04-27  8:21     ` Denis Plotnikov
2020-04-25  9:10   ` Vladimir Sementsov-Ogievskiy
2020-04-27  8:19     ` Denis Plotnikov
2020-04-27 11:04       ` Vladimir Sementsov-Ogievskiy
2020-04-27 12:14   ` Dr. David Alan Gilbert
2020-04-28  8:06     ` Denis Plotnikov
2020-04-28 17:54       ` Dr. David Alan Gilbert
2020-04-28 20:25         ` Denis Plotnikov
2020-05-04  9:08     ` Daniel P. Berrangé
2020-04-13 11:12 ` [RFC patch v1 3/3] migration/savevm: use qemu-file buffered mode for non-cached bdrv Denis Plotnikov
2020-04-13 12:10 ` [RFC patch v1 0/3] qemu-file writing performance improving Denis V. Lunev
2020-04-13 13:51 ` no-reply
2020-04-21  8:13 ` Denis Plotnikov
