On 16.05.19 16:27, Anton Nefedov wrote:
> ..apparently v13 ended up in a weird base64 that would not easily git-am.
> Resending.
>
> ----
>
> hi,
>
> this used to be a large 10-patch series, but now, thanks to Kevin's
> BDRV_REQ_NO_FALLBACK series
> (http://lists.nongnu.org/archive/html/qemu-devel/2019-03/msg06389.html),
> the core block layer functionality is already in place: the existing flag
> can be reused.
>
> A few accompanying patches were also dropped from the series, as those can
> be processed separately.
>
> So,
>
> new in v13:
>   - patch 1 (was patch 9) rebased to use the s->data_file pointer
>   - comments fixed/added
>   - other patches dropped ;)
>
> v12: http://lists.nongnu.org/archive/html/qemu-devel/2019-01/msg02761.html
> v11: http://lists.nongnu.org/archive/html/qemu-devel/2018-12/msg04342.html
> v10: http://lists.nongnu.org/archive/html/qemu-devel/2018-12/msg00121.html
>
> ----
>
> This series makes a start on improving a few performance points of the
> qcow2 format:
>
> 1. Non-cluster-aligned write requests (to unallocated clusters) explicitly
>    pad the data with zeroes if there is no backing data.
>    The resulting increase in the number of ops and the potential cluster
>    fragmentation (in the host file) is already solved by:
>      ee22a9d qcow2: Merge the writing of the COW regions with the guest data
>    However, in the case of zero COW regions, that can be avoided entirely:
>    the whole cluster is preallocated and zeroed in a single efficient
>    write_zeroes() operation.
>
> 2. Moreover, the efficient write_zeroes() operation can be used to
>    preallocate space megabytes (a configurable number) ahead, which gives
>    a noticeable improvement on some storage types (e.g. distributed
>    storage, where the space allocation operation might be expensive).
>    (Not included in this patchset since v6.)
>
> 3. This will also allow enabling simultaneous writes to the same
>    unallocated cluster after the space has been allocated & zeroed but
>    before the first data is written and the cluster is linked to L2.
>    (Not included in this patchset.)
>
> An efficient write_zeroes usually implies that the blocks are not actually
> written to but only reserved and marked as zeroed by the storage.
> The existing bdrv_write_zeroes() falls back to writing zero buffers if
> write_zeroes is not supported by the driver, so BDRV_REQ_NO_FALLBACK is
> used.
>
> Simple perf test:
>
>   qemu-img create -f qcow2 test.img 4G && \
>   qemu-img bench -c $((1024*1024)) -f qcow2 -n -s 4k -t none -w test.img
>
> Test results (seconds):
>
>   +------+---------+---------+------+
>   | file | before  | after   | gain |
>   +------+---------+---------+------+
>   | ssd  |  61.153 |  36.313 |  41% |
>   | hdd  | 112.676 | 122.056 |  -8% |
>   +------+---------+---------+------+

I've done a few more tests, and I've seen more slowdown on an HDD (like
30 % when doing 64 kB requests that are not aligned to clusters).  On the
other hand, the SSD gain is generally in the same ballpark (38 % when
issuing the same kind of requests).

On the basis that

(1) I believe that SSD performance is more important than HDD
    performance,

(2) I can't think of a simple way to automatically decide whether the
    new or the old code path is more efficient [1], and

(3) users probably would not use a switch if we introduced one,

I have applied this patch to my block branch:

https://git.xanclic.moe/XanClic/qemu/commits/branch/block

Thanks!

[1] Hm.  We can probably investigate whether the file is stored on a
    rotational medium or not.
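    A minimal sketch of that kind of check, just to make the idea
    concrete (Linux-only; the helper name and the sysfs lookup are my
    own, nothing from this series):

    #include <stdbool.h>
    #include <stdio.h>
    #include <sys/stat.h>
    #include <sys/sysmacros.h>

    /*
     * Hypothetical helper (not part of this series): guess whether
     * @path lives on a rotational (spinning) disk by asking the Linux
     * block layer.  Returns true for "rotational", false for
     * non-rotational or unknown.  Partition-to-disk mapping and
     * stacked devices (LVM, dm) are ignored to keep the sketch short,
     * so the /sys lookup may simply fail there and we fall back to
     * "not rotational".
     */
    static bool file_on_rotational_disk(const char *path)
    {
        struct stat st;
        char sysfs_path[128];
        FILE *f;
        int rotational = 0;

        if (stat(path, &st) < 0) {
            return false;
        }

        snprintf(sysfs_path, sizeof(sysfs_path),
                 "/sys/dev/block/%u:%u/queue/rotational",
                 major(st.st_dev), minor(st.st_dev));

        f = fopen(sysfs_path, "r");
        if (f) {
            if (fscanf(f, "%d", &rotational) != 1) {
                rotational = 0;
            }
            fclose(f);
        }

        return rotational == 1;
    }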
Is there a fundamental reason why this patch seems to degrade performance
on an HDD but improve it on an SSD?  If so, we can probably make a choice
based on that.

Max
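P.S.: For anyone skimming the thread and wondering what the HDD actually
has to do differently, the change the quoted cover letter describes boils
down to roughly the sketch below.  It is not the actual qcow2 code path;
the function name and structure are illustrative only, and only
bdrv_co_pwrite_zeroes() and BDRV_REQ_NO_FALLBACK are real interfaces.

    #include "qemu/osdep.h"
    #include "block/block_int.h"

    /*
     * When a write allocates a new cluster and there is no backing
     * data to copy, the COW regions around the guest data only need
     * to contain zeroes.  Instead of writing zero buffers for them,
     * zero the whole cluster with a single request.
     *
     * BDRV_REQ_NO_FALLBACK makes bdrv_co_pwrite_zeroes() return
     * -ENOTSUP rather than emulate the zeroing with buffer writes, so
     * this path is only taken when the protocol driver can really just
     * reserve the blocks and mark them as zeroed.
     */
    static int coroutine_fn
    zero_new_cluster(BdrvChild *data_file, uint64_t cluster_offset,
                     unsigned int cluster_size)
    {
        int ret;

        ret = bdrv_co_pwrite_zeroes(data_file, cluster_offset,
                                    cluster_size, BDRV_REQ_NO_FALLBACK);
        /*
         * On -ENOTSUP the caller keeps using the pre-existing buffered
         * COW path; any other error is propagated as usual.
         */
        return ret;
    }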