* [Qemu-devel] [PATCH v3 0/2] Optimize alignment probing
From: Nir Soffer @ 2019-08-27 1:05 UTC
To: qemu-devel; +Cc: Kevin Wolf, Nir Soffer, qemu-block, Max Reitz

When probing an unallocated area on a remote XFS filesystem we cannot detect
the request alignment, and we fall back to a safe value which may not be
optimal. Avoid this fallback by always allocating the first block when
creating a new image or resizing an empty image.

Tested with all formats:

    for fmt in raw bochs cloop parallels qcow qcow2 qed vdi vpc vhdx vmdk luks dmg; do
        ./check -$fmt
    done

Changes in v3:
- Allocating first block works now when 512 <= size < 4096, storage sector size
  is 512 bytes, and using block_resize with O_DIRECT (Max)
- Fix return value on errors if qemu_vfree() modified errno (Eric)
- Improve comment about using allocate_first_block in FALLOC mode (Max)
- Remove unneeded $(()) in _filter_block (Max)
- Add _default_cache_mode and _supported_cache_mode to new test (Max)
- Fix disk size in vmdk tests

v2 was here:
https://lists.nongnu.org/archive/html/qemu-block/2019-08/msg01265.html

Changes in v2:
- Support file descriptor opened with O_DIRECT (e.g. in block_resize) (Max)
- Remove unneeded change in 160 (Max)
- Fix block filter in 175 on filesystem allocating extra blocks (Max)
- Comment why we ignore errors in allocate_first_block() (Max)
- Comment why allocate_first_block() is needed in FALLOC mode (Max)
- Clarify commit message about user visible changes (Maxim)
- Fix 178.out.qcow2
- Fix 150.out with -qcow2 by splitting to 150.out.{raw,qcow2}
- Add test for allocate_first_block() with block_resize (Max)
- Drop provisioning tests results since I ran them only once

v1 was here:
https://lists.nongnu.org/archive/html/qemu-block/2019-08/msg00821.html

Nir Soffer (2):
  block: posix: Always allocate the first block
  iotests: Test allocate_first_block() with O_DIRECT

 block/file-posix.c                            | 51 +++++++++++++++++++
 tests/qemu-iotests/059.out                    |  2 +-
 tests/qemu-iotests/{150.out => 150.out.qcow2} |  0
 tests/qemu-iotests/150.out.raw                | 12 +++++
 tests/qemu-iotests/175                        | 47 ++++++++++++++---
 tests/qemu-iotests/175.out                    | 16 ++++--
 tests/qemu-iotests/178.out.qcow2              |  4 +-
 tests/qemu-iotests/221.out                    | 12 +++--
 tests/qemu-iotests/253.out                    | 12 +++--
 9 files changed, 135 insertions(+), 21 deletions(-)
 rename tests/qemu-iotests/{150.out => 150.out.qcow2} (100%)
 create mode 100644 tests/qemu-iotests/150.out.raw

-- 
2.20.1
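For readers following the thread, the fallback described in the cover letter
can be sketched roughly as below. This is a simplified illustration and not
the code in block/file-posix.c: probe_request_alignment(), read_probe_fn and
the two fake readers are hypothetical names introduced here, and the candidate
sizes and the 4096-byte safe value are assumptions taken from the numbers
mentioned in this series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical callback: returns true if a direct I/O read of `len` bytes
 * at offset 0 would succeed on the storage being probed.
 */
typedef bool (*read_probe_fn)(size_t len);

#define SAFE_ALIGNMENT 4096 /* assumed safe fallback value */

/*
 * Try increasing request sizes and return the first one that succeeds.
 * If even a 1-byte read succeeds, the storage reveals nothing about its
 * real alignment, so fall back to the safe value.
 */
static size_t probe_request_alignment(read_probe_fn can_read)
{
    static const size_t candidates[] = {1, 512, 1024, 2048, 4096};
    size_t i;

    for (i = 0; i < sizeof(candidates) / sizeof(candidates[0]); i++) {
        if (can_read(candidates[i])) {
            return candidates[i] != 1 ? candidates[i] : SAFE_ALIGNMENT;
        }
    }
    return SAFE_ALIGNMENT; /* nothing worked; stay safe */
}

/* Fake storage with 512-byte sectors and an allocated first block. */
static bool allocated_512(size_t len)
{
    return len % 512 == 0;
}

/* Fake unallocated area where a read of any length succeeds. */
static bool unallocated(size_t len)
{
    (void)len;
    return true;
}
```

With the first fake reader the probe settles on 512; with the second it cannot
tell and returns the safe value, which is the case this series tries to avoid
by making sure the first block is allocated.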
* [Qemu-devel] [PATCH v3 1/2] block: posix: Always allocate the first block
From: Nir Soffer @ 2019-08-27 1:05 UTC
To: qemu-devel; +Cc: Kevin Wolf, Nir Soffer, qemu-block, Max Reitz

When creating an image with preallocation "off" or "falloc", the first
block of the image is typically not allocated. When using Gluster
storage backed by XFS filesystem, reading this block using direct I/O
succeeds regardless of request length, fooling alignment detection.

In this case we fall back to a safe value (4096) instead of the optimal
value (512), which may lead to unneeded data copying when aligning
requests. Allocating the first block avoids the fallback.

Since we allocate the first block even with preallocation=off, we no
longer create images with zero disk size:

    $ ./qemu-img create -f raw test.raw 1g
    Formatting 'test.raw', fmt=raw size=1073741824

    $ ls -lhs test.raw
    4.0K -rw-r--r--. 1 nsoffer nsoffer 1.0G Aug 16 23:48 test.raw

And converting the image requires an additional cluster:

    $ ./qemu-img measure -f raw -O qcow2 test.raw
    required size: 458752
    fully allocated size: 1074135040

When using a format like vmdk with multiple files per image, we allocate
one block per file:

    $ ./qemu-img create -f vmdk -o subformat=twoGbMaxExtentFlat test.vmdk 4g
    Formatting 'test.vmdk', fmt=vmdk size=4294967296 compat6=off hwversion=undefined subformat=twoGbMaxExtentFlat

    $ ls -lhs test*.vmdk
    4.0K -rw-r--r--. 1 nsoffer nsoffer 2.0G Aug 27 03:23 test-f001.vmdk
    4.0K -rw-r--r--. 1 nsoffer nsoffer 2.0G Aug 27 03:23 test-f002.vmdk
    4.0K -rw-r--r--. 1 nsoffer nsoffer  353 Aug 27 03:23 test.vmdk

I did a quick performance test for copying disks with qemu-img convert to
a new raw target image on Gluster storage with a sector size of 512 bytes:

    for i in $(seq 10); do
        rm -f dst.raw
        sleep 10
        time ./qemu-img convert -f raw -O raw -t none -T none src.raw dst.raw
    done

Here is a table comparing the total time spent:

Type    Before(s)   After(s)    Diff(%)
---------------------------------------
real      530.028    469.123      -11.4
user       17.204     10.768      -37.4
sys        17.881      7.011      -60.7

We can see a very clear improvement in CPU usage.

Signed-off-by: Nir Soffer <nsoffer@redhat.com>
---
 block/file-posix.c                            | 51 +++++++++++++++++++
 tests/qemu-iotests/059.out                    |  2 +-
 tests/qemu-iotests/{150.out => 150.out.qcow2} |  0
 tests/qemu-iotests/150.out.raw                | 12 +++++
 tests/qemu-iotests/175                        | 19 ++++---
 tests/qemu-iotests/175.out                    |  8 +--
 tests/qemu-iotests/178.out.qcow2              |  4 +-
 tests/qemu-iotests/221.out                    | 12 +++--
 tests/qemu-iotests/253.out                    | 12 +++--
 9 files changed, 99 insertions(+), 21 deletions(-)
 rename tests/qemu-iotests/{150.out => 150.out.qcow2} (100%)
 create mode 100644 tests/qemu-iotests/150.out.raw

diff --git a/block/file-posix.c b/block/file-posix.c
index fbeb0068db..447f937aa1 100644
--- a/block/file-posix.c
+++ b/block/file-posix.c
@@ -1749,6 +1749,43 @@ static int handle_aiocb_discard(void *opaque)
     return ret;
 }
 
+/*
+ * Help alignment probing by allocating the first block.
+ *
+ * When reading with direct I/O from unallocated area on Gluster backed by XFS,
+ * reading succeeds regardless of request length. In this case we fallback to
+ * safe alignment which is not optimal. Allocating the first block avoids this
+ * fallback.
+ *
+ * fd may be opened with O_DIRECT, but we don't know the buffer alignment or
+ * request alignment, so we use safe values.
+ *
+ * Returns: 0 on success, -errno on failure. Since this is an optimization,
+ * caller may ignore failures.
+ */
+static int allocate_first_block(int fd, size_t max_size)
+{
+    size_t write_size = (max_size < MAX_BLOCKSIZE)
+        ? BDRV_SECTOR_SIZE
+        : MAX_BLOCKSIZE;
+    size_t max_align = MAX(MAX_BLOCKSIZE, getpagesize());
+    void *buf;
+    ssize_t n;
+    int ret;
+
+    buf = qemu_memalign(max_align, write_size);
+    memset(buf, 0, write_size);
+
+    do {
+        n = pwrite(fd, buf, write_size, 0);
+    } while (n == -1 && errno == EINTR);
+
+    ret = (n == -1) ? -errno : 0;
+
+    qemu_vfree(buf);
+    return ret;
+}
+
 static int handle_aiocb_truncate(void *opaque)
 {
     RawPosixAIOData *aiocb = opaque;
@@ -1788,6 +1825,17 @@ static int handle_aiocb_truncate(void *opaque)
                 /* posix_fallocate() doesn't set errno. */
                 error_setg_errno(errp, -result,
                                  "Could not preallocate new data");
+            } else if (current_length == 0) {
+                /*
+                 * posix_fallocate() uses fallocate() if the filesystem
+                 * supports it, or fallback to manually writing zeroes. If
+                 * fallocate() was used, unaligned reads from the fallocated
+                 * area in raw_probe_alignment() will succeed, hence we need to
+                 * allocate the first block.
+                 *
+                 * Optimize future alignment probing; ignore failures.
+                 */
+                allocate_first_block(fd, offset);
             }
         } else {
             result = 0;
@@ -1849,6 +1897,9 @@ static int handle_aiocb_truncate(void *opaque)
         if (ftruncate(fd, offset) != 0) {
             result = -errno;
             error_setg_errno(errp, -result, "Could not resize file");
+        } else if (current_length == 0 && offset > current_length) {
+            /* Optimize future alignment probing; ignore failures. */
+            allocate_first_block(fd, offset);
         }
         return result;
     default:
diff --git a/tests/qemu-iotests/059.out b/tests/qemu-iotests/059.out
index 4fab42a28c..fe3f861f3c 100644
--- a/tests/qemu-iotests/059.out
+++ b/tests/qemu-iotests/059.out
@@ -27,7 +27,7 @@ Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=1073741824000 subformat=twoGbMax
 image: TEST_DIR/t.vmdk
 file format: vmdk
 virtual size: 0.977 TiB (1073741824000 bytes)
-disk size: 16 KiB
+disk size: 1.97 MiB
 Format specific information:
     cid: XXXXXXXX
     parent cid: XXXXXXXX
diff --git a/tests/qemu-iotests/150.out b/tests/qemu-iotests/150.out.qcow2
similarity index 100%
rename from tests/qemu-iotests/150.out
rename to tests/qemu-iotests/150.out.qcow2
diff --git a/tests/qemu-iotests/150.out.raw b/tests/qemu-iotests/150.out.raw
new file mode 100644
index 0000000000..3cdc7727a5
--- /dev/null
+++ b/tests/qemu-iotests/150.out.raw
@@ -0,0 +1,12 @@
+QA output created by 150
+
+=== Mapping sparse conversion ===
+
+Offset          Length          File
+0               0x1000          TEST_DIR/t.IMGFMT
+
+=== Mapping non-sparse conversion ===
+
+Offset          Length          File
+0               0x100000        TEST_DIR/t.IMGFMT
+*** done
diff --git a/tests/qemu-iotests/175 b/tests/qemu-iotests/175
index 51e62c8276..7ba28b3c1b 100755
--- a/tests/qemu-iotests/175
+++ b/tests/qemu-iotests/175
@@ -37,14 +37,16 @@ trap "_cleanup; exit \$status" 0 1 2 3 15
 # the file size. This function hides the resulting difference in the
 # stat -c '%b' output.
 # Parameter 1: Number of blocks an empty file occupies
-# Parameter 2: Image size in bytes
+# Parameter 2: Minimal number of blocks in an image
+# Parameter 3: Image size in bytes
 _filter_blocks()
 {
     extra_blocks=$1
-    img_size=$2
+    min_blocks=$2
+    img_size=$3
 
-    sed -e "s/blocks=$extra_blocks\\(\$\\|[^0-9]\\)/nothing allocated/" \
-        -e "s/blocks=$((extra_blocks + img_size / 512))\\(\$\\|[^0-9]\\)/everything allocated/"
+    sed -e "s/blocks=$min_blocks\\(\$\\|[^0-9]\\)/min allocation/" \
+        -e "s/blocks=$((extra_blocks + img_size / 512))\\(\$\\|[^0-9]\\)/max allocation/"
 }
 
 # get standard environment, filters and checks
@@ -60,16 +62,21 @@ size=$((1 * 1024 * 1024))
 touch "$TEST_DIR/empty"
 extra_blocks=$(stat -c '%b' "$TEST_DIR/empty")
 
+# We always write the first byte; check how many blocks this filesystem
+# allocates to match empty image allocation.
+printf "\0" > "$TEST_DIR/empty"
+min_blocks=$(stat -c '%b' "$TEST_DIR/empty")
+
 echo
 echo "== creating image with default preallocation =="
 _make_test_img $size | _filter_imgfmt
-stat -c "size=%s, blocks=%b" $TEST_IMG | _filter_blocks $extra_blocks $size
+stat -c "size=%s, blocks=%b" $TEST_IMG | _filter_blocks $extra_blocks $min_blocks $size
 
 for mode in off full falloc; do
     echo
     echo "== creating image with preallocation $mode =="
     IMGOPTS=preallocation=$mode _make_test_img $size | _filter_imgfmt
-    stat -c "size=%s, blocks=%b" $TEST_IMG | _filter_blocks $extra_blocks $size
+    stat -c "size=%s, blocks=%b" $TEST_IMG | _filter_blocks $extra_blocks $min_blocks $size
 done
 
 # success, all done
diff --git a/tests/qemu-iotests/175.out b/tests/qemu-iotests/175.out
index 6d9a5ed84e..263e521262 100644
--- a/tests/qemu-iotests/175.out
+++ b/tests/qemu-iotests/175.out
@@ -2,17 +2,17 @@ QA output created by 175
 
 == creating image with default preallocation ==
 Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=1048576
-size=1048576, nothing allocated
+size=1048576, min allocation
 
 == creating image with preallocation off ==
 Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=1048576 preallocation=off
-size=1048576, nothing allocated
+size=1048576, min allocation
 
 == creating image with preallocation full ==
 Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=1048576 preallocation=full
-size=1048576, everything allocated
+size=1048576, max allocation
 
 == creating image with preallocation falloc ==
 Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=1048576 preallocation=falloc
-size=1048576, everything allocated
+size=1048576, max allocation
  *** done
diff --git a/tests/qemu-iotests/178.out.qcow2 b/tests/qemu-iotests/178.out.qcow2
index 55a8dc926f..9e7d8c44df 100644
--- a/tests/qemu-iotests/178.out.qcow2
+++ b/tests/qemu-iotests/178.out.qcow2
@@ -101,7 +101,7 @@ converted image file size in bytes: 196608
 == raw input image with data (human) ==
 
 Formatting 'TEST_DIR/t.qcow2', fmt=IMGFMT size=1073741824
-required size: 393216
+required size: 458752
 fully allocated size: 1074135040
 wrote 512/512 bytes at offset 512
 512 bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
@@ -257,7 +257,7 @@ converted image file size in bytes: 196608
 
 Formatting 'TEST_DIR/t.qcow2', fmt=IMGFMT size=1073741824
 {
-    "required": 393216,
+    "required": 458752,
     "fully-allocated": 1074135040
 }
 wrote 512/512 bytes at offset 512
diff --git a/tests/qemu-iotests/221.out b/tests/qemu-iotests/221.out
index 9f9dd52bb0..dca024a0c3 100644
--- a/tests/qemu-iotests/221.out
+++ b/tests/qemu-iotests/221.out
@@ -3,14 +3,18 @@ QA output created by 221
 === Check mapping of unaligned raw image ===
 
 Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=65537
-[{ "start": 0, "length": 66048, "depth": 0, "zero": true, "data": false, "offset": OFFSET}]
-[{ "start": 0, "length": 66048, "depth": 0, "zero": true, "data": false, "offset": OFFSET}]
+[{ "start": 0, "length": 4096, "depth": 0, "zero": false, "data": true, "offset": OFFSET},
+{ "start": 4096, "length": 61952, "depth": 0, "zero": true, "data": false, "offset": OFFSET}]
+[{ "start": 0, "length": 4096, "depth": 0, "zero": false, "data": true, "offset": OFFSET},
+{ "start": 4096, "length": 61952, "depth": 0, "zero": true, "data": false, "offset": OFFSET}]
 wrote 1/1 bytes at offset 65536
 1 bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
-[{ "start": 0, "length": 65536, "depth": 0, "zero": true, "data": false, "offset": OFFSET},
+[{ "start": 0, "length": 4096, "depth": 0, "zero": false, "data": true, "offset": OFFSET},
+{ "start": 4096, "length": 61440, "depth": 0, "zero": true, "data": false, "offset": OFFSET},
 { "start": 65536, "length": 1, "depth": 0, "zero": false, "data": true, "offset": OFFSET},
 { "start": 65537, "length": 511, "depth": 0, "zero": true, "data": false, "offset": OFFSET}]
-[{ "start": 0, "length": 65536, "depth": 0, "zero": true, "data": false, "offset": OFFSET},
+[{ "start": 0, "length": 4096, "depth": 0, "zero": false, "data": true, "offset": OFFSET},
+{ "start": 4096, "length": 61440, "depth": 0, "zero": true, "data": false, "offset": OFFSET},
 { "start": 65536, "length": 1, "depth": 0, "zero": false, "data": true, "offset": OFFSET},
 { "start": 65537, "length": 511, "depth": 0, "zero": true, "data": false, "offset": OFFSET}]
 *** done
diff --git a/tests/qemu-iotests/253.out b/tests/qemu-iotests/253.out
index 607c0baa0b..3d08b305d7 100644
--- a/tests/qemu-iotests/253.out
+++ b/tests/qemu-iotests/253.out
@@ -3,12 +3,16 @@ QA output created by 253
 === Check mapping of unaligned raw image ===
 
 Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=1048575
-[{ "start": 0, "length": 1048576, "depth": 0, "zero": true, "data": false, "offset": OFFSET}]
-[{ "start": 0, "length": 1048576, "depth": 0, "zero": true, "data": false, "offset": OFFSET}]
+[{ "start": 0, "length": 4096, "depth": 0, "zero": false, "data": true, "offset": OFFSET},
+{ "start": 4096, "length": 1044480, "depth": 0, "zero": true, "data": false, "offset": OFFSET}]
+[{ "start": 0, "length": 4096, "depth": 0, "zero": false, "data": true, "offset": OFFSET},
+{ "start": 4096, "length": 1044480, "depth": 0, "zero": true, "data": false, "offset": OFFSET}]
 wrote 65535/65535 bytes at offset 983040
 63.999 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
-[{ "start": 0, "length": 983040, "depth": 0, "zero": true, "data": false, "offset": OFFSET},
+[{ "start": 0, "length": 4096, "depth": 0, "zero": false, "data": true, "offset": OFFSET},
+{ "start": 4096, "length": 978944, "depth": 0, "zero": true, "data": false, "offset": OFFSET},
 { "start": 983040, "length": 65536, "depth": 0, "zero": false, "data": true, "offset": OFFSET}]
-[{ "start": 0, "length": 983040, "depth": 0, "zero": true, "data": false, "offset": OFFSET},
+[{ "start": 0, "length": 4096, "depth": 0, "zero": false, "data": true, "offset": OFFSET},
+{ "start": 4096, "length": 978944, "depth": 0, "zero": true, "data": false, "offset": OFFSET},
 { "start": 983040, "length": 65536, "depth": 0, "zero": false, "data": true, "offset": OFFSET}]
 *** done
-- 
2.20.1
* Re: [Qemu-devel] [PATCH v3 1/2] block: posix: Always allocate the first block
From: Max Reitz @ 2019-08-27 16:58 UTC
To: Nir Soffer, qemu-devel; +Cc: Kevin Wolf, Nir Soffer, qemu-block

On 27.08.19 03:05, Nir Soffer wrote:
> [...]

Reviewed-by: Max Reitz <mreitz@redhat.com>

Maybe it’ll break the vmdk iotests when using a non-default subformat;
but currently running the iotests for non-default VMDK subformats is
broken anyway, so it doesn’t matter.

Max
* Re: [Qemu-devel] [PATCH v3 1/2] block: posix: Always allocate the first block
From: Max Reitz @ 2019-08-27 17:10 UTC
To: Nir Soffer, qemu-devel; +Cc: Kevin Wolf, Nir Soffer, qemu-block

On 27.08.19 18:58, Max Reitz wrote:
> On 27.08.19 03:05, Nir Soffer wrote:
>> [...]
>
> Reviewed-by: Max Reitz <mreitz@redhat.com>
>
> Maybe it’ll break the vmdk iotests when using a non-default subformat;
> but currently running the iotests for non-default VMDK subformats is
> broken anyway, so it doesn’t matter.

(Good news, 059 really was the only issue for VMDK.)
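The effect of patch 1/2 can be reproduced outside QEMU with a small standalone
sketch: truncate a new file to the image size, then write one zeroed block at
offset 0. This is only an illustration under stated assumptions, not the
patch itself: create_image(), write_first_block() and first_block_write_size()
are hypothetical names, the file is opened without O_DIRECT (so a plain
calloc() buffer is enough, where the patch uses an aligned buffer), and the
512/4096 constants stand in for BDRV_SECTOR_SIZE and MAX_BLOCKSIZE.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* One sector for images smaller than 4096 bytes, otherwise 4096 bytes. */
static size_t first_block_write_size(size_t max_size)
{
    return max_size < 4096 ? 512 : 4096;
}

/*
 * Write `write_size` zero bytes at offset 0, retrying on EINTR.
 * Returns 0 on success, -errno on failure.
 */
static int write_first_block(int fd, size_t write_size)
{
    char *buf = calloc(1, write_size);
    ssize_t n;
    int ret;

    if (!buf) {
        return -ENOMEM;
    }
    do {
        n = pwrite(fd, buf, write_size, 0);
    } while (n == -1 && errno == EINTR);

    ret = (n == -1) ? -errno : 0; /* save errno before free() can change it */
    free(buf);
    return ret;
}

/*
 * Create a sparse file of `size` bytes (assumed >= 512), allocate its
 * first block, and report the resulting stat data in `st`.
 */
static int create_image(const char *path, off_t size, struct stat *st)
{
    int fd;
    int ret;

    if (size < 512) {
        return -EINVAL; /* writing a sector would grow the file */
    }
    fd = open(path, O_CREAT | O_TRUNC | O_RDWR, 0644);
    if (fd == -1) {
        return -errno;
    }
    if (ftruncate(fd, size) != 0) {
        ret = -errno;
    } else {
        ret = write_first_block(fd, first_block_write_size((size_t)size));
        if (ret == 0 && fstat(fd, st) != 0) {
            ret = -errno;
        }
    }
    close(fd);
    return ret;
}
```

After create_image() succeeds, st->st_size is still the requested size while
st->st_blocks reflects whatever the filesystem allocated for the first block,
which is the number that `ls -s` and `stat -c %b` report in the examples
above. The exact block count depends on the filesystem.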
* [Qemu-devel] [PATCH v3 2/2] iotests: Test allocate_first_block() with O_DIRECT
From: Nir Soffer @ 2019-08-27 1:05 UTC
To: qemu-devel; +Cc: Kevin Wolf, Nir Soffer, qemu-block, Max Reitz

Using block_resize we can test allocate_first_block() with file
descriptor opened with O_DIRECT, ensuring that it works for any size
larger than 4096 bytes.

Testing smaller sizes is tricky as the result depends on the filesystem
used for testing. For example on NFS any size will work since O_DIRECT
does not require any alignment.

Signed-off-by: Nir Soffer <nsoffer@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
---
 tests/qemu-iotests/175     | 28 ++++++++++++++++++++++++++++
 tests/qemu-iotests/175.out |  8 ++++++++
 2 files changed, 36 insertions(+)

diff --git a/tests/qemu-iotests/175 b/tests/qemu-iotests/175
index 7ba28b3c1b..55db2803ed 100755
--- a/tests/qemu-iotests/175
+++ b/tests/qemu-iotests/175
@@ -49,6 +49,23 @@ _filter_blocks()
         -e "s/blocks=$((extra_blocks + img_size / 512))\\(\$\\|[^0-9]\\)/max allocation/"
 }
 
+# Resize image using block_resize.
+# Parameter 1: image path
+# Parameter 2: new size
+_block_resize()
+{
+    local path=$1
+    local size=$2
+
+    $QEMU -qmp stdio -nographic -nodefaults \
+        -blockdev file,node-name=file,filename=$path,cache.direct=on \
+        <<EOF
+{'execute': 'qmp_capabilities'}
+{'execute': 'block_resize', 'arguments': {'node-name': 'file', 'size': $size}}
+{'execute': 'quit'}
+EOF
+}
+
 # get standard environment, filters and checks
 . ./common.rc
 . ./common.filter
@@ -57,6 +74,9 @@ _supported_fmt raw
 _supported_proto file
 _supported_os Linux
 
+_default_cache_mode none
+_supported_cache_modes none directsync
+
 size=$((1 * 1024 * 1024))
 
 touch "$TEST_DIR/empty"
@@ -79,6 +99,14 @@ for mode in off full falloc; do
     stat -c "size=%s, blocks=%b" $TEST_IMG | _filter_blocks $extra_blocks $min_blocks $size
 done
 
+for new_size in 4096 1048576; do
+    echo
+    echo "== resize empty image with block_resize =="
+    _make_test_img 0 | _filter_imgfmt
+    _block_resize $TEST_IMG $new_size >/dev/null
+    stat -c "size=%s, blocks=%b" $TEST_IMG | _filter_blocks $extra_blocks $min_blocks $new_size
+done
+
 # success, all done
 echo "*** done"
 rm -f $seq.full
diff --git a/tests/qemu-iotests/175.out b/tests/qemu-iotests/175.out
index 263e521262..39c2ee0f62 100644
--- a/tests/qemu-iotests/175.out
+++ b/tests/qemu-iotests/175.out
@@ -15,4 +15,12 @@ size=1048576, max allocation
 == creating image with preallocation falloc ==
 Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=1048576 preallocation=falloc
 size=1048576, max allocation
+
+== resize empty image with block_resize ==
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=0
+size=4096, min allocation
+
+== resize empty image with block_resize ==
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=0
+size=1048576, min allocation
  *** done
-- 
2.20.1
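The way test 175 labels the stat output can be restated as a small pure
function, which may make the expected 175.out easier to read. This is a
hedged sketch: classify_blocks() is a hypothetical helper, not part of the
test suite, and the "other" label is invented here for the case where the sed
filter would leave the "blocks=N" text untouched.

```c
#include <assert.h>
#include <string.h>

/*
 * Mirror of the two sed substitutions in _filter_blocks():
 *   blocks == min_blocks                      -> "min allocation"
 *   blocks == extra_blocks + img_size / 512   -> "max allocation"
 * Block counts are in 512-byte units, as reported by stat -c '%b'.
 */
static const char *classify_blocks(long blocks, long extra_blocks,
                                   long min_blocks, long img_size)
{
    if (blocks == min_blocks) {
        return "min allocation";
    }
    if (blocks == extra_blocks + img_size / 512) {
        return "max allocation";
    }
    return "other"; /* assumption: sed would print the raw block count */
}
```

For example, on a filesystem where an empty file occupies 0 blocks and a
one-byte file occupies 8 (values assumed for illustration), a 1 MiB image
reporting 8 blocks is "min allocation" and one reporting 2048 blocks is
"max allocation".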
* Re: [Qemu-devel] [PATCH v3 0/2] Optimize alignment probing
From: Nir Soffer @ 2019-08-27 15:06 UTC
To: Nir Soffer; +Cc: Kevin Wolf, QEMU Developers, qemu-block, Max Reitz

Adding Eric

On Tue, Aug 27, 2019 at 4:05 AM Nir Soffer <nirsof@gmail.com> wrote:
> [...]
* Re: [Qemu-devel] [PATCH v3 0/2] Optimize alignment probing
From: Max Reitz @ 2019-08-27 17:00 UTC
To: Nir Soffer, qemu-devel; +Cc: Kevin Wolf, Nir Soffer, qemu-block

On 27.08.19 03:05, Nir Soffer wrote:
> When probing unallocated area on remote XFS filesystem we cannot detect request
> alignment and we fallback to safe value which may not be optimal. Avoid this
> fallback by always allocating the first block when creating a new image or
> resizing empty image.
>
> Tested with all formats:
>
>     for fmt in raw bochs cloop parallels qcow qcow2 qed vdi vpc vhdx vmdk luks dmg; do
>         ./check -$fmt
>     done

Thanks, applied to my block branch:

https://git.xanclic.moe/XanClic/qemu/commits/branch/block

Max