qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Stefan Hajnoczi <stefanha@redhat.com>
To: qemu-devel@nongnu.org, Peter Maydell <peter.maydell@linaro.org>
Cc: "Kevin Wolf" <kwolf@redhat.com>, "Fam Zheng" <fam@euphon.net>,
	"Vladimir Sementsov-Ogievskiy" <vsementsov@virtuozzo.com>,
	"Daniel P. Berrangé" <berrange@redhat.com>,
	"Eduardo Habkost" <ehabkost@redhat.com>,
	qemu-block@nongnu.org,
	"Dr. David Alan Gilbert" <dgilbert@redhat.com>,
	"Coiby Xu" <Coiby.Xu@gmail.com>,
	"Markus Armbruster" <armbru@redhat.com>,
	"Alberto Garcia" <berto@igalia.com>,
	"Stefan Hajnoczi" <stefanha@redhat.com>,
	"Paolo Bonzini" <pbonzini@redhat.com>,
	"Max Reitz" <mreitz@redhat.com>
Subject: [PULL v2 24/28] block/io: fix bdrv_co_block_status_above
Date: Thu, 22 Oct 2020 12:27:22 +0100	[thread overview]
Message-ID: <20201022112726.736757-25-stefanha@redhat.com> (raw)
In-Reply-To: <20201022112726.736757-1-stefanha@redhat.com>

From: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>

bdrv_co_block_status_above has several design problems with handling
short backing files:

1. With want_zeros=true, it may return ret with BDRV_BLOCK_ZERO but
without BDRV_BLOCK_ALLOCATED flag, when actually short backing file
which produces these after-EOF zeros is inside requested backing
sequence.

2. With want_zero=false, it may return pnum=0 prior to actual EOF,
because of EOF of short backing file.

Fix these things, making logic about short backing files clearer.

With fixed bdrv_block_status_above we also have to improve is_zero in
qcow2 code, otherwise iotest 154 will fail, because with this patch we
stop to merge zeros of different types (produced by fully unallocated
in the whole backing chain regions vs produced by short backing files).

Note also, that this patch leaves for another day the general problem
around block-status: misuse of BDRV_BLOCK_ALLOCATED as is-fs-allocated
vs go-to-backing.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 20200924194003.22080-2-vsementsov@virtuozzo.com
[Fix s/comes/come/ as suggested by Eric Blake
--Stefan]
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 block/io.c    | 68 ++++++++++++++++++++++++++++++++++++++++-----------
 block/qcow2.c | 16 ++++++++++--
 2 files changed, 68 insertions(+), 16 deletions(-)

diff --git a/block/io.c b/block/io.c
index 54f0968aee..a718d50ca2 100644
--- a/block/io.c
+++ b/block/io.c
@@ -2350,34 +2350,74 @@ bdrv_co_common_block_status_above(BlockDriverState *bs,
                                   int64_t *map,
                                   BlockDriverState **file)
 {
+    int ret;
     BlockDriverState *p;
-    int ret = 0;
-    bool first = true;
+    int64_t eof = 0;
 
     assert(bs != base);
-    for (p = bs; p != base; p = bdrv_filter_or_cow_bs(p)) {
+
+    ret = bdrv_co_block_status(bs, want_zero, offset, bytes, pnum, map, file);
+    if (ret < 0 || *pnum == 0 || ret & BDRV_BLOCK_ALLOCATED) {
+        return ret;
+    }
+
+    if (ret & BDRV_BLOCK_EOF) {
+        eof = offset + *pnum;
+    }
+
+    assert(*pnum <= bytes);
+    bytes = *pnum;
+
+    for (p = bdrv_filter_or_cow_bs(bs); p != base;
+         p = bdrv_filter_or_cow_bs(p))
+    {
         ret = bdrv_co_block_status(p, want_zero, offset, bytes, pnum, map,
                                    file);
         if (ret < 0) {
-            break;
+            return ret;
         }
-        if (ret & BDRV_BLOCK_ZERO && ret & BDRV_BLOCK_EOF && !first) {
+        if (*pnum == 0) {
             /*
-             * Reading beyond the end of the file continues to read
-             * zeroes, but we can only widen the result to the
-             * unallocated length we learned from an earlier
-             * iteration.
+             * The top layer deferred to this layer, and because this layer is
+             * short, any zeroes that we synthesize beyond EOF behave as if they
+             * were allocated at this layer.
+             *
+             * We don't include BDRV_BLOCK_EOF into ret, as upper layer may be
+             * larger. We'll add BDRV_BLOCK_EOF if needed at function end, see
+             * below.
              */
+            assert(ret & BDRV_BLOCK_EOF);
             *pnum = bytes;
+            if (file) {
+                *file = p;
+            }
+            ret = BDRV_BLOCK_ZERO | BDRV_BLOCK_ALLOCATED;
+            break;
         }
-        if (ret & (BDRV_BLOCK_ZERO | BDRV_BLOCK_DATA)) {
+        if (ret & BDRV_BLOCK_ALLOCATED) {
+            /*
+             * We've found the node and the status, we must break.
+             *
+             * Drop BDRV_BLOCK_EOF, as it's not for upper layer, which may be
+             * larger. We'll add BDRV_BLOCK_EOF if needed at function end, see
+             * below.
+             */
+            ret &= ~BDRV_BLOCK_EOF;
             break;
         }
-        /* [offset, pnum] unallocated on this layer, which could be only
-         * the first part of [offset, bytes].  */
-        bytes = MIN(bytes, *pnum);
-        first = false;
+
+        /*
+         * OK, [offset, offset + *pnum) region is unallocated on this layer,
+         * let's continue the diving.
+         */
+        assert(*pnum <= bytes);
+        bytes = *pnum;
+    }
+
+    if (offset + *pnum == eof) {
+        ret |= BDRV_BLOCK_EOF;
     }
+
     return ret;
 }
 
diff --git a/block/qcow2.c b/block/qcow2.c
index b05512718c..b6cb4db8bb 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -3860,8 +3860,20 @@ static bool is_zero(BlockDriverState *bs, int64_t offset, int64_t bytes)
     if (!bytes) {
         return true;
     }
-    res = bdrv_block_status_above(bs, NULL, offset, bytes, &nr, NULL, NULL);
-    return res >= 0 && (res & BDRV_BLOCK_ZERO) && nr == bytes;
+
+    /*
+     * bdrv_block_status_above doesn't merge different types of zeros, for
+     * example, zeros which come from the region which is unallocated in
+     * the whole backing chain, and zeros which come because of a short
+     * backing file. So, we need a loop.
+     */
+    do {
+        res = bdrv_block_status_above(bs, NULL, offset, bytes, &nr, NULL, NULL);
+        offset += nr;
+        bytes -= nr;
+    } while (res >= 0 && (res & BDRV_BLOCK_ZERO) && nr && bytes);
+
+    return res >= 0 && (res & BDRV_BLOCK_ZERO) && bytes == 0;
 }
 
 static coroutine_fn int qcow2_co_pwrite_zeroes(BlockDriverState *bs,
-- 
2.26.2


  parent reply	other threads:[~2020-10-22 11:38 UTC|newest]

Thread overview: 31+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-10-22 11:26 [PULL v2 00/28] Block patches Stefan Hajnoczi
2020-10-22 11:26 ` [PULL v2 01/28] block/nvme: Add driver statistics for access alignment and hw errors Stefan Hajnoczi
2020-10-22 11:27 ` [PULL v2 02/28] libvhost-user: Allow vu_message_read to be replaced Stefan Hajnoczi
2020-10-22 11:27 ` [PULL v2 03/28] libvhost-user: remove watch for kick_fd when de-initialize vu-dev Stefan Hajnoczi
2020-10-22 11:27 ` [PULL v2 04/28] util/vhost-user-server: generic vhost user server Stefan Hajnoczi
2020-10-22 11:27 ` [PULL v2 05/28] block: move logical block size check function to a common utility function Stefan Hajnoczi
2020-10-22 11:27 ` [PULL v2 06/28] block/export: vhost-user block device backend server Stefan Hajnoczi
2020-10-22 11:27 ` [PULL v2 07/28] MAINTAINERS: Add vhost-user block device backend server maintainer Stefan Hajnoczi
2020-10-22 11:27 ` [PULL v2 08/28] util/vhost-user-server: s/fileds/fields/ typo fix Stefan Hajnoczi
2020-10-22 11:27 ` [PULL v2 09/28] util/vhost-user-server: drop unnecessary QOM cast Stefan Hajnoczi
2020-10-22 11:27 ` [PULL v2 10/28] util/vhost-user-server: drop unnecessary watch deletion Stefan Hajnoczi
2020-10-22 11:27 ` [PULL v2 11/28] block/export: consolidate request structs into VuBlockReq Stefan Hajnoczi
2020-10-22 11:27 ` [PULL v2 12/28] util/vhost-user-server: drop unused DevicePanicNotifier Stefan Hajnoczi
2020-10-22 11:27 ` [PULL v2 13/28] util/vhost-user-server: fix memory leak in vu_message_read() Stefan Hajnoczi
2020-10-22 11:27 ` [PULL v2 14/28] util/vhost-user-server: check EOF when reading payload Stefan Hajnoczi
2020-10-22 11:27 ` [PULL v2 15/28] util/vhost-user-server: rework vu_client_trip() coroutine lifecycle Stefan Hajnoczi
2020-10-22 11:27 ` [PULL v2 16/28] block/export: report flush errors Stefan Hajnoczi
2020-10-22 11:27 ` [PULL v2 17/28] block/export: convert vhost-user-blk server to block export API Stefan Hajnoczi
2020-10-22 11:27 ` [PULL v2 18/28] util/vhost-user-server: move header to include/ Stefan Hajnoczi
2020-10-22 11:27 ` [PULL v2 19/28] util/vhost-user-server: use static library in meson.build Stefan Hajnoczi
2020-10-22 11:27 ` [PULL v2 20/28] qemu-storage-daemon: avoid compiling blockdev_ss twice Stefan Hajnoczi
2020-10-22 11:27 ` [PULL v2 21/28] block: move block exports to libblockdev Stefan Hajnoczi
2020-10-22 11:27 ` [PULL v2 22/28] block/export: add iothread and fixed-iothread options Stefan Hajnoczi
2020-10-22 11:27 ` [PULL v2 23/28] block/export: add vhost-user-blk multi-queue support Stefan Hajnoczi
2020-10-22 11:27 ` Stefan Hajnoczi [this message]
2020-10-22 11:27 ` [PULL v2 25/28] block/io: bdrv_common_block_status_above: support include_base Stefan Hajnoczi
2020-10-22 11:27 ` [PULL v2 26/28] block/io: bdrv_common_block_status_above: support bs == base Stefan Hajnoczi
2020-10-22 11:27 ` [PULL v2 27/28] block/io: fix bdrv_is_allocated_above Stefan Hajnoczi
2020-10-22 11:27 ` [PULL v2 28/28] iotests: add commit top->base cases to 274 Stefan Hajnoczi
2020-10-22 12:12 ` [PULL v2 00/28] Block patches no-reply
2020-10-22 14:09 ` Peter Maydell

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20201022112726.736757-25-stefanha@redhat.com \
    --to=stefanha@redhat.com \
    --cc=Coiby.Xu@gmail.com \
    --cc=armbru@redhat.com \
    --cc=berrange@redhat.com \
    --cc=berto@igalia.com \
    --cc=dgilbert@redhat.com \
    --cc=ehabkost@redhat.com \
    --cc=fam@euphon.net \
    --cc=kwolf@redhat.com \
    --cc=mreitz@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=peter.maydell@linaro.org \
    --cc=qemu-block@nongnu.org \
    --cc=qemu-devel@nongnu.org \
    --cc=vsementsov@virtuozzo.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).