Subject: [Qemu-devel] [PATCH v5 0/5] qcow2: async handling of fragmented io
Date: 2019-09-16 17:53 UTC
From: Vladimir Sementsov-Ogievskiy
  To: qemu-block; +Cc: kwolf, den, vsementsov, qemu-devel, mreitz

Hi all!

Here is an asynchronous scheme for handling fragmented qcow2
reads and writes. Both the qcow2 read and write functions loop through
sequential portions of data; this series aims to parallelize those
loop iterations. It improves performance for fragmented qcow2 images,
as shown by the testing described below.
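
To give a rough idea of the scheme, here is an illustrative Python 3 sketch
(not the C code from the series; the helper names and the limit on in-flight
subrequests are made up): the portions of one fragmented request are
dispatched concurrently, up to a fixed limit, instead of strictly one after
another:

    import asyncio

    MAX_TASKS = 12  # illustrative bound on in-flight subrequests

    async def handle_portion(offset, length):
        # stands in for reading or writing one contiguous portion
        await asyncio.sleep(0.01)

    async def fragmented_request(portions):
        sem = asyncio.Semaphore(MAX_TASKS)

        async def one(portion):
            async with sem:
                await handle_portion(*portion)

        # all portions are in flight at once (bounded by the semaphore)
        # instead of being handled strictly one after another
        await asyncio.gather(*(one(p) for p in portions))

    portions = [(off, 64 * 1024) for off in range(0, 2 * 1024 * 1024, 64 * 1024)]
    asyncio.run(fragmented_request(portions))

In the series itself this bounded-concurrency role is played by the aio task
pool introduced in patch 2.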

v5: fix 026 and rebase on Max's block branch [perf results not updated]:

01: new, prepare 026 to not fail
03: - drop read_encrypted blkdbg event [Kevin]
    - assert((x & (BDRV_SECTOR_SIZE - 1)) == 0) -> assert(QEMU_IS_ALIGNED(x, BDRV_SECTOR_SIZE)) [rebase]
    - full host offset in argument of qcow2_co_decrypt [rebase]
04: - substitute the remaining qcow2_co_do_pwritev with qcow2_co_pwritev_task in a comment [Max]
    - full host offset in argument of qcow2_co_encrypt [rebase]
05: - the patch no longer affects iotest 026, so its output is unchanged

The rebase changes seem trivial, so I've kept the R-b tags.

Based-on: https://github.com/XanClic/qemu.git block

About testing:

I have four 4G qcow2 images (with the default 64k cluster size) on my ssd disk:
t-seq.qcow2 - sequentially written qcow2 image
t-reverse.qcow2 - filled by writing 64k portions from the end to the start
t-rand.qcow2 - filled by writing 64k portions (aligned) in random order
t-part-rand.qcow2 - filled by shuffling the order of 64k writes within each 1m region
(see the image-generation source code at the end for details)

and I've done several runs like the following (sequential I/O in 1m chunks):

    out=/tmp/block; echo > $out; cat /tmp/files | while read file; do for wr in {"","-w"}; do echo "$file" $wr; ./qemu-img bench -c 4096 -d 1 -f qcow2 -n -s 1m -t none $wr "$file" | grep 'Run completed in' | awk '{print $4}' >> $out; done; done


short info about the parameters:
  -w - do writes (otherwise do reads)
  -c - number of blocks
  -s - block size
  -t none - disable cache
  -n - native aio
  -d 1 - don't use parallel requests provided by qemu-img bench itself
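
For reference, the times appended to $out can be paired back with the
image/mode combinations roughly like this (a sketch, not part of the original
runs, assuming /tmp/files simply lists the image paths one per line in the
order used above):

    import itertools

    files = [l.strip() for l in open('/tmp/files') if l.strip()]
    times = [l.strip() for l in open('/tmp/block') if l.strip()]

    # the loop above runs each image first for reads, then with -w, appending
    # one 'Run completed in' time per run to the output file
    for (f, wr), t in zip(itertools.product(files, ('', '-w')), times):
        print('{:30} {:3} {}'.format(f, wr, t))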

results:

    +---------------------------+---------+---------+
    | file                      | master  | async   |
    +---------------------------+---------+---------+
    | /ssd/t-part-rand.qcow2    | 14.671  | 9.193   |
    +---------------------------+---------+---------+
    | /ssd/t-part-rand.qcow2 -w | 11.434  | 8.621   |
    +---------------------------+---------+---------+
    | /ssd/t-rand.qcow2         | 20.421  | 10.05   |
    +---------------------------+---------+---------+
    | /ssd/t-rand.qcow2 -w      | 11.097  | 8.915   |
    +---------------------------+---------+---------+
    | /ssd/t-reverse.qcow2      | 17.515  | 9.407   |
    +---------------------------+---------+---------+
    | /ssd/t-reverse.qcow2 -w   | 11.255  | 8.649   |
    +---------------------------+---------+---------+
    | /ssd/t-seq.qcow2          | 9.081   | 9.072   |
    +---------------------------+---------+---------+
    | /ssd/t-seq.qcow2 -w       | 8.761   | 8.747   |
    +---------------------------+---------+---------+
    | /tmp/t-part-rand.qcow2    | 41.179  | 41.37   |
    +---------------------------+---------+---------+
    | /tmp/t-part-rand.qcow2 -w | 54.097  | 55.323  |
    +---------------------------+---------+---------+
    | /tmp/t-rand.qcow2         | 711.899 | 514.339 |
    +---------------------------+---------+---------+
    | /tmp/t-rand.qcow2 -w      | 546.259 | 642.114 |
    +---------------------------+---------+---------+
    | /tmp/t-reverse.qcow2      | 86.065  | 96.522  |
    +---------------------------+---------+---------+
    | /tmp/t-reverse.qcow2 -w   | 46.557  | 48.499  |
    +---------------------------+---------+---------+
    | /tmp/t-seq.qcow2          | 33.804  | 33.862  |
    +---------------------------+---------+---------+
    | /tmp/t-seq.qcow2 -w       | 34.299  | 34.233  |
    +---------------------------+---------+---------+


The performance gain is obvious, especially for reads and especially on ssd.
On hdd there is a degradation in the reverse case, but that is the least
likely access pattern and seems not critical.

How images are generated:

    ==== gen-writes ======
    #!/usr/bin/env python
    import random
    import sys

    size = 4 * 1024 * 1024 * 1024
    block = 64 * 1024
    block2 = 1024 * 1024

    arg = sys.argv[1]

    if arg in ('rand', 'reverse', 'seq'):
        writes = list(range(0, size, block))

    if arg == 'rand':
        random.shuffle(writes)
    elif arg == 'reverse':
        writes.reverse()
    elif arg == 'part-rand':
        writes = []
        for off in range(0, size, block2):
            wr = list(range(off, off + block2, block))
            random.shuffle(wr)
            writes.extend(wr)
    elif arg != 'seq':
        sys.exit(1)

    for w in writes:
        print('write -P 0xff {} {}'.format(w, block))

    print('q')
    ==========================

    ===== gen-test-images.sh =====
    #!/bin/bash

    IMG_PATH=/ssd

    for name in seq reverse rand part-rand; do
        IMG=$IMG_PATH/t-$name.qcow2
        echo creating $IMG ...
        rm -f $IMG
        qemu-img create -f qcow2 $IMG 4G
        gen-writes $name | qemu-io $IMG
    done
    ==============================

Vladimir Sementsov-Ogievskiy (5):
  qemu-iotests: ignore leaks on failure paths in 026
  block: introduce aio task pool
  block/qcow2: refactor qcow2_co_preadv_part
  block/qcow2: refactor qcow2_co_pwritev_part
  block/qcow2: introduce parallel subrequest handling in read and write

 block/qcow2.h                      |   3 +
 include/block/aio_task.h           |  54 ++++
 block/aio_task.c                   | 124 ++++++++
 block/qcow2.c                      | 466 +++++++++++++++++++----------
 block/Makefile.objs                |   2 +
 block/trace-events                 |   1 +
 tests/qemu-iotests/026             |   6 +-
 tests/qemu-iotests/026.out         |  80 ++---
 tests/qemu-iotests/026.out.nocache |  80 ++---
 tests/qemu-iotests/common.rc       |  17 ++
 10 files changed, 549 insertions(+), 284 deletions(-)
 create mode 100644 include/block/aio_task.h
 create mode 100644 block/aio_task.c

-- 
2.21.0


