All of lore.kernel.org
 help / color / mirror / Atom feed
From: Eric Blake <eblake@redhat.com>
To: qemu-devel@nongnu.org
Cc: jsnow@redhat.com, vsementsov@virtuozzo.com, qemu-block@nongnu.org
Subject: [Qemu-devel] [PATCH for-4.1 7/7] nbd/server: Avoid unaligned dirty-bitmap status
Date: Tue,  2 Apr 2019 22:05:26 -0500	[thread overview]
Message-ID: <20190403030526.12258-8-eblake@redhat.com> (raw)
In-Reply-To: <20190403030526.12258-1-eblake@redhat.com>

The NBD spec requires that responses to NBD_CMD_BLOCK_STATUS be
aligned to the server's advertised min_block alignment, if the client
agreed to abide by alignments. In general, since dirty bitmap
granularity cannot go below 512, and it is hard to provoke qcow2 to go
above an alignment of 512, this is not an issue. And until the block
layer can see through filters, the moment you use blkdebug, you can no
longer export dirty bitmaps over NBD.  But once we fix the traversal
through filters, then blkdebug can create a scenario where the server
is provoked into supplying a non-compliant reply.

Thus, it is time to fix the dirty bitmap code to round requests to
aligned boundaries.

This hack patch lets blkdebug be used; although note that Max has
proposed a more complete solution
https://lists.gnu.org/archive/html/qemu-devel/2019-02/msg03534.html

| diff --git a/nbd/server.c b/nbd/server.c
| index 576deb931c8..1d370b5b2c7 100644
| --- a/nbd/server.c
| +++ b/nbd/server.c
| @@ -1507,6 +1507,9 @@ NBDExport *nbd_export_new(BlockDriverState *bs, uint64_t dev_offset,
|          BdrvDirtyBitmap *bm = NULL;
|
|          while (true) {
| +            while (bs->drv && bs->drv->is_filter && bs->file) {
| +                bs = bs->file->bs;
| +            }
|              bm = bdrv_find_dirty_bitmap(bs, bitmap);
|              if (bm != NULL || bs->backing == NULL) {
|                  break;

Then the following sequence creates a dirty bitmap with granularity
512, exposed through a blkdebug interface with minimum block size 4k;
correct behavior is to see a map showing the first 4k as "data":false
(corresponding to the dirty bit in the 2nd of 8 sectors, rounded
up). Without this patch, the code instead showed the entire image as
"data":true (due to the client working around the server
non-compliance by coalescing regions, but assuming that data:true
takes precedence, which is the opposite of the behavior we want during
x-dirty-bitmap).

$ printf %01000d 0 > img
$ qemu-img create -f qcow2 -F raw -b img img.wrap 64k
$ qemu-system-x86_64 -nodefaults -nographic -qmp stdio
{'execute':'qmp_capabilities'}
{'execute':'blockdev-add','arguments':{'node-name':'n','driver':'qcow2','file':{'driver':'file','filename':'img.wrap'}}}
{'execute':'nbd-server-start','arguments':{'addr':{'type':'inet','data':{'host':'localhost','port':'10809'}}}}
{'execute':'block-dirty-bitmap-add','arguments':{'node':'n','name':'b','persistent':true,'granularity':512}}
{'execute':'nbd-server-add','arguments':{'device':'n','bitmap':'b','writable':true}}
^z
$ bg
$ qemu-io -f raw -c 'w 513 1' nbd://localhost:10809
$ fg
{'execute':'nbd-server-remove','arguments':{'name':'n'}}
{'execute':'block-dirty-bitmap-disable','arguments':{'node':'n','name':'b'}}
{'execute':'blockdev-add','arguments':{'driver':'blkdebug','align':4096,'image':'n','node-name':'wrap'}}
{'execute':'nbd-server-add','arguments':{'device':'wrap','bitmap':'b'}}
^z
$ bg
$ qemu-img map --output=json --image-opts driver=nbd,x-dirty-bitmap=qemu:dirty-bitmap:b,server.type=inet,server.host=localhost,server.port=10809,export=wrap
$ fg
{'execute':'quit'}

Signed-off-by: Eric Blake <eblake@redhat.com>

---
Not for 4.0; once Max's patches land to see through filters, then this
commit message will need to be rewritten, and the steps outlined above
instead turned into an addition for iotest 241.
---
 nbd/server.c | 40 ++++++++++++++++++++++++++++++++++------
 1 file changed, 34 insertions(+), 6 deletions(-)

diff --git a/nbd/server.c b/nbd/server.c
index 576deb931c8..82da0bae9ba 100644
--- a/nbd/server.c
+++ b/nbd/server.c
@@ -1982,7 +1982,8 @@ static int nbd_co_send_block_status(NBDClient *client, uint64_t handle,
  * byte length encoded (which may be smaller or larger than the
  * original), and return the number of extents used.
  */
-static unsigned int bitmap_to_extents(BdrvDirtyBitmap *bitmap, uint64_t offset,
+static unsigned int bitmap_to_extents(uint32_t align,
+                                      BdrvDirtyBitmap *bitmap, uint64_t offset,
                                       uint64_t *length, NBDExtent *extents,
                                       unsigned int nb_extents,
                                       bool dont_fragment)
@@ -2008,13 +2009,30 @@ static unsigned int bitmap_to_extents(BdrvDirtyBitmap *bitmap, uint64_t offset,
             bdrv_set_dirty_iter(it, begin);
             end = bdrv_dirty_iter_next(it);
         }
-        if (end == -1 || end - begin > UINT32_MAX) {
+        if (end == -1 || end - begin > UINT32_MAX - align) {
             /* Cap to an aligned value < 4G beyond begin. */
             end = MIN(bdrv_dirty_bitmap_size(bitmap),
                       begin + UINT32_MAX + 1 -
-                      bdrv_dirty_bitmap_granularity(bitmap));
+                      MAX(align, bdrv_dirty_bitmap_granularity(bitmap)));
             next_dirty = dirty;
         }
+        /*
+         * Round unaligned bits: any transition mid-alignment makes
+         * that entire aligned region appear dirty
+         */
+        if (!QEMU_IS_ALIGNED(end, align)) {
+            if (dirty) {
+                end = QEMU_ALIGN_UP(end, align);
+            } else if (end - begin < align) {
+                end = begin + align;
+                dirty = true;
+            } else {
+                end = QEMU_ALIGN_DOWN(end, align);
+            }
+            if (end < bdrv_dirty_bitmap_size(bitmap)) {
+                next_dirty = bdrv_get_dirty_locked(NULL, bitmap, end);
+            }
+        }
         if (dont_fragment && end > overall_end) {
             end = overall_end;
         }
@@ -2042,11 +2060,21 @@ static int nbd_co_send_bitmap(NBDClient *client, uint64_t handle,
 {
     int ret;
     unsigned int nb_extents = dont_fragment ? 1 : NBD_MAX_BITMAP_EXTENTS;
-    NBDExtent *extents = g_new(NBDExtent, nb_extents);
+    NBDExtent *extents;
     uint64_t final_length = length;

-    nb_extents = bitmap_to_extents(bitmap, offset, &final_length, extents,
-                                   nb_extents, dont_fragment);
+    /* Easiest to just refuse to answer an unaligned query */
+    if (client->check_align &&
+        !QEMU_IS_ALIGNED(offset | length, client->check_align)) {
+        return nbd_co_send_structured_error(client, handle, -EINVAL,
+                                            "unaligned dirty bitmap request",
+                                            errp);
+    }
+
+    extents = g_new(NBDExtent, nb_extents);
+    nb_extents = bitmap_to_extents(client->check_align ?: 1, bitmap, offset,
+                                   &final_length, extents, nb_extents,
+                                   dont_fragment);

     ret = nbd_co_send_extents(client, handle, extents, nb_extents,
                               final_length, last, context_id, errp);
-- 
2.20.1

  parent reply	other threads:[~2019-04-03  3:05 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-04-03  3:05 [Qemu-devel] [PATCH for-4.0? 0/7] Final round of NBD alignment fixes Eric Blake
2019-04-03  3:05 ` [Qemu-devel] [PATCH 1/7] nbd/server: Fix blockstatus trace Eric Blake
2019-04-05 14:13   ` Vladimir Sementsov-Ogievskiy
2019-04-03  3:05 ` [Qemu-devel] [PATCH 2/7] nbd/server: Trace server noncompliance on unaligned requests Eric Blake
2019-04-05 14:39   ` Vladimir Sementsov-Ogievskiy
2019-04-05 20:04     ` Eric Blake
2019-04-08 12:14       ` Vladimir Sementsov-Ogievskiy
2019-04-08 14:32         ` Eric Blake
2019-04-03  3:05 ` [Qemu-devel] [PATCH 3/7] nbd/server: Don't fail NBD_OPT_INFO for byte-aligned sources Eric Blake
2019-04-05 15:34   ` Vladimir Sementsov-Ogievskiy
2019-04-03  3:05 ` [Qemu-devel] [PATCH 4/7] iotests: Update 241 to expose backing layer fragmentation Eric Blake
2019-04-08 13:51   ` Vladimir Sementsov-Ogievskiy
2019-04-08 13:51     ` Vladimir Sementsov-Ogievskiy
2019-04-10 20:44     ` Eric Blake
2019-04-10 20:44       ` Eric Blake
2019-04-03  3:05 ` [Qemu-devel] [PATCH 5/7] block: Fix BDRV_BLOCK_RAW status to honor alignment Eric Blake
2019-04-03 13:03   ` Kevin Wolf
2019-04-03 14:02     ` Eric Blake
2019-04-03  3:05 ` [Qemu-devel] [PATCH 6/7] nbd/server: Avoid unaligned read/block_status from backing Eric Blake
2019-04-03  3:05 ` Eric Blake [this message]
2019-04-04 14:52 ` [Qemu-devel] [PATCH for-4.0? 8/7] nbd/client: Fix error message for server with unusable sizing Eric Blake
2019-04-04 15:22   ` [Qemu-devel] [Qemu-block] " Kevin Wolf

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190403030526.12258-8-eblake@redhat.com \
    --to=eblake@redhat.com \
    --cc=jsnow@redhat.com \
    --cc=qemu-block@nongnu.org \
    --cc=qemu-devel@nongnu.org \
    --cc=vsementsov@virtuozzo.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.