All of lore.kernel.org
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH for-4.0? 0/7] Final round of NBD alignment fixes
@ 2019-04-03  3:05 Eric Blake
  2019-04-03  3:05 ` [Qemu-devel] [PATCH 1/7] nbd/server: Fix blockstatus trace Eric Blake
                   ` (7 more replies)
  0 siblings, 8 replies; 22+ messages in thread
From: Eric Blake @ 2019-04-03  3:05 UTC (permalink / raw)
  To: qemu-devel; +Cc: jsnow, vsementsov, qemu-block

My recent pull request for -rc2 documented several places where our
server still does not comply with the NBD specification with regards
to alignment. This series finishes the work in making everything
compliant, insofar as I could test.

The first three patches are rather trivial - they don't drive -rc3 on
their own, but are worth having if there is another patch that we want
in 4.0 (oh, and I need to fix the missing space in nbd/client's error
message in commit 3add3ab7, also in that same category but not posted
here).

Patch 4 is as close as I can get to demonstrating the remaining
alignment bugs in existing qemu (the test passes on current master,
but fails if you revert the client workarounds, and then passes again
once you apply the tail of this series to fix the server
noncompliance). (Viewing traces is a lot more informative than running
iotests when it comes to diagnosing protocol compliance issues,
although it is also a lot slower and more verbose). Again, not
something to drive -rc3 on its own, but worth having if we want the
rest.

Patches 5-6 fix the most common cases of bad server alignments, but
does so by changing a fundamental algorithm in
io.c:bdrv_co_block_status() - any time we grab block status by
deferral with BDRV_BLOCK_RAW, we want to ensure that the lower layer
status is aligned to the requirements of the upper layer.  While it
passed iotests for me, I'm very much on the fence as to whether this
is 4.0 material as a bug fix, or whether we declare that because the
alignment bugs are not a regression over 3.1, we save the patch until
4.1.

Patch 7 is definitely not 4.0 material - it fixes a bug that I could
not provoke in practice, without applying other patches to let NBD
exports see through blkdebug (Max's series to fix filter handling will
presumably be applied for 4.1, at which point patch 7 should be
rebased on top of those patches).

Eric Blake (7):
  nbd/server: Fix blockstatus trace
  nbd/server: Trace server noncompliance on unaligned requests
  nbd/server: Don't fail NBD_OPT_INFO for byte-aligned sources
  iotests: Update 241 to expose backing layer fragmentation
  block: Fix BDRV_BLOCK_RAW status to honor alignment
  nbd/server: Avoid unaligned read/block_status from backing
  nbd/server: Avoid unaligned dirty-bitmap status

 include/block/block.h      |   2 +
 block/io.c                 | 155 +++++++++++++++++++++++++++++++++++--
 nbd/server.c               |  88 +++++++++++++++------
 nbd/trace-events           |   1 +
 tests/qemu-iotests/221     |  10 +++
 tests/qemu-iotests/221.out |   6 ++
 tests/qemu-iotests/241     |  20 ++++-
 tests/qemu-iotests/241.out |  12 ++-
 8 files changed, 258 insertions(+), 36 deletions(-)

-- 
2.20.1

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [Qemu-devel] [PATCH 1/7] nbd/server: Fix blockstatus trace
  2019-04-03  3:05 [Qemu-devel] [PATCH for-4.0? 0/7] Final round of NBD alignment fixes Eric Blake
@ 2019-04-03  3:05 ` Eric Blake
  2019-04-05 14:13   ` Vladimir Sementsov-Ogievskiy
  2019-04-03  3:05 ` [Qemu-devel] [PATCH 2/7] nbd/server: Trace server noncompliance on unaligned requests Eric Blake
                   ` (6 subsequent siblings)
  7 siblings, 1 reply; 22+ messages in thread
From: Eric Blake @ 2019-04-03  3:05 UTC (permalink / raw)
  To: qemu-devel; +Cc: jsnow, vsementsov, qemu-block

Don't increment remaining_bytes until we know that we will actually be
including the current block status extent in the reply; otherwise, the
value traced will include a bytes value that is oversized by the
length of the next block status extent which did not get sent because
it instead ended the loop.

Fixes: fb7afc79
Signed-off-by: Eric Blake <eblake@redhat.com>
---
 nbd/server.c | 9 +++------
 1 file changed, 3 insertions(+), 6 deletions(-)

diff --git a/nbd/server.c b/nbd/server.c
index 218a2aa5e65..1b8c8619896 100644
--- a/nbd/server.c
+++ b/nbd/server.c
@@ -1880,17 +1880,12 @@ static int blockstatus_to_extents(BlockDriverState *bs, uint64_t offset,

         flags = (ret & BDRV_BLOCK_ALLOCATED ? 0 : NBD_STATE_HOLE) |
                 (ret & BDRV_BLOCK_ZERO      ? NBD_STATE_ZERO : 0);
-        offset += num;
-        remaining_bytes -= num;

         if (first_extent) {
             extent->flags = flags;
             extent->length = num;
             first_extent = false;
-            continue;
-        }
-
-        if (flags == extent->flags) {
+        } else if (flags == extent->flags) {
             /* extend current extent */
             extent->length += num;
         } else {
@@ -1903,6 +1898,8 @@ static int blockstatus_to_extents(BlockDriverState *bs, uint64_t offset,
             extent->flags = flags;
             extent->length = num;
         }
+        offset += num;
+        remaining_bytes -= num;
     }

     extents_end = extent + 1;
-- 
2.20.1

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [Qemu-devel] [PATCH 2/7] nbd/server: Trace server noncompliance on unaligned requests
  2019-04-03  3:05 [Qemu-devel] [PATCH for-4.0? 0/7] Final round of NBD alignment fixes Eric Blake
  2019-04-03  3:05 ` [Qemu-devel] [PATCH 1/7] nbd/server: Fix blockstatus trace Eric Blake
@ 2019-04-03  3:05 ` Eric Blake
  2019-04-05 14:39   ` Vladimir Sementsov-Ogievskiy
  2019-04-03  3:05 ` [Qemu-devel] [PATCH 3/7] nbd/server: Don't fail NBD_OPT_INFO for byte-aligned sources Eric Blake
                   ` (5 subsequent siblings)
  7 siblings, 1 reply; 22+ messages in thread
From: Eric Blake @ 2019-04-03  3:05 UTC (permalink / raw)
  To: qemu-devel; +Cc: jsnow, vsementsov, qemu-block

We've recently added traces for clients to flag server non-compliance;
let's do the same for servers to flag client non-compliance. According
to the spec, if the client requests NBD_INFO_BLOCK_SIZE, it is
promising to send all requests aligned to those boundaries.  Of
course, if the client does not request NBD_INFO_BLOCK_SIZE, then it
made no promises so we shouldn't flag anything; and because we are
willing to handle clients that made no promises (the spec allows us to
use NBD_REP_ERR_BLOCK_SIZE_REQD if we had been unwilling), we already
have to handle unaligned requests (which the block layer already does
on our behalf).  So even though the spec allows us to return EINVAL
for clients that promised to behave, it's easier to always answer
unaligned requests.  Still, flagging non-compliance can be useful in
debugging a client that is trying to be maximally portable.

Qemu as client used to have one spot where it sent non-compliant
requests: if the server sends an unaligned reply to
NBD_CMD_BLOCK_STATUS, and the client was iterating over the entire
disk, the next request would start at that unaligned point; this was
fixed in commit a39286dd when the client was taught to work around
server non-compliance; but is equally fixed if the server is patched
to not send unaligned replies in the first place (the next few patches
will address that). Fortunately, I did not find any more spots where
qemu as client was non-compliant. I was able to test the patch by
using the following hack to convince qemu-io to run various unaligned
commands, coupled with serving 512-byte alignment by intentionally
omitting '-f raw' on the server while viewing server traces.

| diff --git i/nbd/client.c w/nbd/client.c
| index 427980bdd22..1858b2aac35 100644
| --- i/nbd/client.c
| +++ w/nbd/client.c
| @@ -449,6 +449,7 @@ static int nbd_opt_info_or_go(QIOChannel *ioc, uint32_t opt,
|                  nbd_send_opt_abort(ioc);
|                  return -1;
|              }
| +            info->min_block = 1;//hack
|              if (!is_power_of_2(info->min_block)) {
|                  error_setg(errp, "server minimum block size %" PRIu32
|                             " is not a power of two", info->min_block);

Signed-off-by: Eric Blake <eblake@redhat.com>
---
 nbd/server.c     | 12 ++++++++++++
 nbd/trace-events |  1 +
 2 files changed, 13 insertions(+)

diff --git a/nbd/server.c b/nbd/server.c
index 1b8c8619896..3040ceb5606 100644
--- a/nbd/server.c
+++ b/nbd/server.c
@@ -124,6 +124,8 @@ struct NBDClient {
     int nb_requests;
     bool closing;

+    uint32_t check_align; /* If non-zero, check for aligned client requests */
+
     bool structured_reply;
     NBDExportMetaContexts export_meta;

@@ -660,6 +662,8 @@ static int nbd_negotiate_handle_info(NBDClient *client, uint16_t myflags,

     if (client->opt == NBD_OPT_GO) {
         client->exp = exp;
+        client->check_align = blocksize ?
+            blk_get_request_alignment(exp->blk) : 0;
         QTAILQ_INSERT_TAIL(&client->exp->clients, client, next);
         nbd_export_get(client->exp);
         nbd_check_meta_export(client);
@@ -2126,6 +2130,14 @@ static int nbd_co_receive_request(NBDRequestData *req, NBDRequest *request,
         return (request->type == NBD_CMD_WRITE ||
                 request->type == NBD_CMD_WRITE_ZEROES) ? -ENOSPC : -EINVAL;
     }
+    if (client->check_align && !QEMU_IS_ALIGNED(request->from | request->len,
+                                                client->check_align)) {
+        /*
+         * The block layer gracefully handles unaligned requests, but
+         * it's still worth tracing client non-compliance
+         */
+        trace_nbd_co_receive_align_compliance(nbd_cmd_lookup(request->type));
+    }
     valid_flags = NBD_CMD_FLAG_FUA;
     if (request->type == NBD_CMD_READ && client->structured_reply) {
         valid_flags |= NBD_CMD_FLAG_DF;
diff --git a/nbd/trace-events b/nbd/trace-events
index a6cca8fdf83..87a2b896c35 100644
--- a/nbd/trace-events
+++ b/nbd/trace-events
@@ -71,4 +71,5 @@ nbd_co_send_extents(uint64_t handle, unsigned int extents, uint32_t id, uint64_t
 nbd_co_send_structured_error(uint64_t handle, int err, const char *errname, const char *msg) "Send structured error reply: handle = %" PRIu64 ", error = %d (%s), msg = '%s'"
 nbd_co_receive_request_decode_type(uint64_t handle, uint16_t type, const char *name) "Decoding type: handle = %" PRIu64 ", type = %" PRIu16 " (%s)"
 nbd_co_receive_request_payload_received(uint64_t handle, uint32_t len) "Payload received: handle = %" PRIu64 ", len = %" PRIu32
+nbd_co_receive_align_compliance(const char *op) "client sent non-compliant unaligned %s request"
 nbd_trip(void) "Reading request"
-- 
2.20.1

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [Qemu-devel] [PATCH 3/7] nbd/server: Don't fail NBD_OPT_INFO for byte-aligned sources
  2019-04-03  3:05 [Qemu-devel] [PATCH for-4.0? 0/7] Final round of NBD alignment fixes Eric Blake
  2019-04-03  3:05 ` [Qemu-devel] [PATCH 1/7] nbd/server: Fix blockstatus trace Eric Blake
  2019-04-03  3:05 ` [Qemu-devel] [PATCH 2/7] nbd/server: Trace server noncompliance on unaligned requests Eric Blake
@ 2019-04-03  3:05 ` Eric Blake
  2019-04-05 15:34   ` Vladimir Sementsov-Ogievskiy
  2019-04-03  3:05 ` [Qemu-devel] [PATCH 4/7] iotests: Update 241 to expose backing layer fragmentation Eric Blake
                   ` (4 subsequent siblings)
  7 siblings, 1 reply; 22+ messages in thread
From: Eric Blake @ 2019-04-03  3:05 UTC (permalink / raw)
  To: qemu-devel; +Cc: jsnow, vsementsov, qemu-block

In commit 0c1d50bd, I added a couple of TODO comments about whether we
consult bl.request_alignment when responding to NBD_OPT_INFO. At the
time, qemu as server was hard-coding an advertised alignment of 512 to
clients that promised to obey constraints, and there was no function
for getting at a device's preferred alignment. But in hindsight,
advertising 512 when the block device prefers 1 caused other
compliance problems, and commit b0245d64 changed one of the two TODO
comments to advertise a more accurate alignment. Time to fix the other
TODO.  Doesn't really impact qemu as client (our normal client doesn't
use NBD_OPT_INFO, and qemu-nbd --list promises to obey block sizes),
but it might prove useful to other clients.

Fixes: b0245d64
Signed-off-by: Eric Blake <eblake@redhat.com>
---
 nbd/server.c | 13 ++++++++-----
 1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/nbd/server.c b/nbd/server.c
index 3040ceb5606..faa3c7879fd 100644
--- a/nbd/server.c
+++ b/nbd/server.c
@@ -642,11 +642,14 @@ static int nbd_negotiate_handle_info(NBDClient *client, uint16_t myflags,
         return rc;
     }

-    /* If the client is just asking for NBD_OPT_INFO, but forgot to
-     * request block sizes, return an error.
-     * TODO: consult blk_bs(blk)->request_align, and only error if it
-     * is not 1? */
-    if (client->opt == NBD_OPT_INFO && !blocksize) {
+    /*
+     * If the client is just asking for NBD_OPT_INFO, but forgot to
+     * request block sizes in a situation that would impact
+     * performance, then return an error. But for NBD_OPT_GO, we
+     * tolerate all clients, regardless of alignments.
+     */
+    if (client->opt == NBD_OPT_INFO && !blocksize &&
+        blk_get_request_alignment(exp->blk) > 1) {
         return nbd_negotiate_send_rep_err(client,
                                           NBD_REP_ERR_BLOCK_SIZE_REQD,
                                           errp,
-- 
2.20.1

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [Qemu-devel] [PATCH 4/7] iotests: Update 241 to expose backing layer fragmentation
  2019-04-03  3:05 [Qemu-devel] [PATCH for-4.0? 0/7] Final round of NBD alignment fixes Eric Blake
                   ` (2 preceding siblings ...)
  2019-04-03  3:05 ` [Qemu-devel] [PATCH 3/7] nbd/server: Don't fail NBD_OPT_INFO for byte-aligned sources Eric Blake
@ 2019-04-03  3:05 ` Eric Blake
  2019-04-08 13:51     ` Vladimir Sementsov-Ogievskiy
  2019-04-03  3:05 ` [Qemu-devel] [PATCH 5/7] block: Fix BDRV_BLOCK_RAW status to honor alignment Eric Blake
                   ` (3 subsequent siblings)
  7 siblings, 1 reply; 22+ messages in thread
From: Eric Blake @ 2019-04-03  3:05 UTC (permalink / raw)
  To: qemu-devel; +Cc: jsnow, vsementsov, qemu-block, Kevin Wolf, Max Reitz

Previous commits have mentioned that our NBD server still sends
unaligned fragments when an active layer with large advertised minimum
block size is backed by another layer with a smaller block
size. Expand the test to actually cover this scenario, by using qcow2
encryption (which forces 512-byte alignment) with an unaligned raw
backing file.

The test passes, but only because the client side works around the
server's non-compliance; if you repeat the test manually with tracing
turned on, you will see the server sending a status for 1000 bytes
data then 1048 bytes hole, which is not aligned. But reverting commit
737d3f5244 shows that it is indeed the client working around the bug
in the server.

Signed-off-by: Eric Blake <eblake@redhat.com>
---
 tests/qemu-iotests/241     | 20 +++++++++++++++++++-
 tests/qemu-iotests/241.out |  9 +++++++++
 2 files changed, 28 insertions(+), 1 deletion(-)

diff --git a/tests/qemu-iotests/241 b/tests/qemu-iotests/241
index 4b196857387..c1fa6980dc8 100755
--- a/tests/qemu-iotests/241
+++ b/tests/qemu-iotests/241
@@ -28,6 +28,7 @@ nbd_unix_socket=$TEST_DIR/test_qemu_nbd_socket
 _cleanup()
 {
     _cleanup_test_img
+    rm -f "$TEST_IMAGE_FILE.qcow2"
     nbd_server_stop
 }
 trap "_cleanup; exit \$status" 0 1 2 3 15
@@ -37,7 +38,7 @@ trap "_cleanup; exit \$status" 0 1 2 3 15
 . ./common.filter
 . ./common.nbd

-_supported_fmt raw
+_supported_fmt raw # although the test also requires use of qcow2
 _supported_proto nbd
 _supported_os Linux
 _require_command QEMU_NBD
@@ -88,6 +89,23 @@ $QEMU_IMG map --output=json "$TEST_IMG" | _filter_qemu_img_map
 $QEMU_IO -c map "$TEST_IMG"
 nbd_server_stop

+echo
+echo "=== Encrypted qcow2 file backed by unaligned raw image  ==="
+echo
+
+# Enabling encryption in qcow2 forces 512-alignment
+SECRET=secret,id=sec0,data=12345
+$QEMU_IMG create -f qcow2 -b "$TEST_IMG_FILE" -F raw --object "$SECRET" \
+  -o encrypt.format=luks,encrypt.key-secret=sec0,encrypt.iter-time=10 \
+  "$TEST_IMG_FILE.qcow2" 2k | _filter_img_create
+nbd_server_start_unix_socket --object "$SECRET" --image-opts \
+  driver=qcow2,file.filename="$TEST_IMG_FILE.qcow2",encrypt.key-secret=sec0
+
+$QEMU_NBD_PROG --list -k $nbd_unix_socket | grep '\(size\|min\)'
+$QEMU_IMG map --output=json "$TEST_IMG" | _filter_qemu_img_map
+$QEMU_IO -c map "$TEST_IMG"
+nbd_server_stop
+
 # Not tested yet: we also want to ensure that qemu as NBD client does
 # not access beyond the end of a server's advertised unaligned size:
 #  nbdkit -U - memory size=513 --run 'qemu-io -f raw -c "r 512 512" $nbd'
diff --git a/tests/qemu-iotests/241.out b/tests/qemu-iotests/241.out
index f481074a02e..ef7de1205d2 100644
--- a/tests/qemu-iotests/241.out
+++ b/tests/qemu-iotests/241.out
@@ -25,4 +25,13 @@ WARNING: Image format was not specified for '/home/eblake/qemu/tests/qemu-iotest
 [{ "start": 0, "length": 1000, "depth": 0, "zero": false, "data": true, "offset": OFFSET},
 { "start": 1000, "length": 24, "depth": 0, "zero": true, "data": true, "offset": OFFSET}]
 1 KiB (0x400) bytes     allocated at offset 0 bytes (0x0)
+
+=== Encrypted qcow2 file backed by unaligned raw image  ===
+
+Formatting 'TEST_DIR/t.IMGFMT.qcow2', fmt=qcow2 size=2048 backing_file=TEST_DIR/t.IMGFMT backing_fmt=IMGFMT encrypt.format=luks encrypt.key-secret=sec0 encrypt.iter-time=10
+  size:  2048
+  min block: 512
+[{ "start": 0, "length": 1024, "depth": 0, "zero": false, "data": true, "offset": OFFSET},
+{ "start": 1024, "length": 1024, "depth": 0, "zero": true, "data": false, "offset": OFFSET}]
+2 KiB (0x800) bytes     allocated at offset 0 bytes (0x0)
 *** done
-- 
2.20.1

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [Qemu-devel] [PATCH 5/7] block: Fix BDRV_BLOCK_RAW status to honor alignment
  2019-04-03  3:05 [Qemu-devel] [PATCH for-4.0? 0/7] Final round of NBD alignment fixes Eric Blake
                   ` (3 preceding siblings ...)
  2019-04-03  3:05 ` [Qemu-devel] [PATCH 4/7] iotests: Update 241 to expose backing layer fragmentation Eric Blake
@ 2019-04-03  3:05 ` Eric Blake
  2019-04-03 13:03   ` Kevin Wolf
  2019-04-03  3:05 ` [Qemu-devel] [PATCH 6/7] nbd/server: Avoid unaligned read/block_status from backing Eric Blake
                   ` (2 subsequent siblings)
  7 siblings, 1 reply; 22+ messages in thread
From: Eric Blake @ 2019-04-03  3:05 UTC (permalink / raw)
  To: qemu-devel
  Cc: jsnow, vsementsov, qemu-block, Stefan Hajnoczi, Fam Zheng,
	Kevin Wolf, Max Reitz

Previous patches mentioned how the blkdebug filter driver demonstrates
a bug present in our NBD server; the bug is also present with the raw
format driver when probing occurs. Basically, if we specify a
particular alignment > 1, but defer the actual block status to the
underlying file, and the underlying file has a smaller alignment, then
the use of BDRV_BLOCK_RAW to defer to the underlying file can end up
with status split at an alignment unacceptable to our layer.  Many
callers don't care, but NBD does - it is a violation of the NBD
protocol to send status or read results split on an unaligned boundary
(we've taught our client to be tolerant of such violations because of
qemu 3.1; but we still need to fix our server for the sake of other
stricter clients).

This patch lays the groundwork - it adjusts bdrv_block_status to
ensure that any time one layer defers to another via BDRV_BLOCK_RAW,
the deferral is either truncated down to an aligned boundary, or
multiple sub-aligned requests are coalesced into a single
representative answer (using an implicit hole beyond EOF as
needed). Iotest 241 exposes the effect (when format probing occurred,
we don't want block status to subdivide the initial sector, and thus
not any other sector either). Similarly, iotest 221 is a good
candidate to amend to specifically track the effects; a change to a
hole at EOF is not visible if an upper layer does not want alignment
smaller than 512. However, this patch alone is not a complete fix - it
is still possible to have an active layer with large alignment
constraints backed by another layer with smaller constraints; so the
next patch will complete the task.

Signed-off-by: Eric Blake <eblake@redhat.com>
---
 block/io.c                 | 108 +++++++++++++++++++++++++++++++++++--
 tests/qemu-iotests/221     |  10 ++++
 tests/qemu-iotests/221.out |   6 +++
 tests/qemu-iotests/241.out |   3 +-
 4 files changed, 122 insertions(+), 5 deletions(-)

diff --git a/block/io.c b/block/io.c
index dfc153b8d82..936877d3745 100644
--- a/block/io.c
+++ b/block/io.c
@@ -2021,6 +2021,97 @@ int coroutine_fn bdrv_co_block_status_from_backing(BlockDriverState *bs,
     return BDRV_BLOCK_RAW | BDRV_BLOCK_OFFSET_VALID;
 }

+static int coroutine_fn bdrv_co_block_status(BlockDriverState *bs,
+                                             bool want_zero,
+                                             int64_t offset, int64_t bytes,
+                                             int64_t *pnum, int64_t *map,
+                                             BlockDriverState **file);
+
+/*
+ * Returns an aligned allocation status of the specified sectors.
+ * Wrapper around bdrv_co_block_status() which requires the initial
+ * offset and count to be aligned, and guarantees the result will also
+ * be aligned (even if it requires piecing together multiple
+ * sub-aligned queries into an appropriate coalesced answer).
+ */
+static int coroutine_fn bdrv_co_block_status_aligned(BlockDriverState *bs,
+                                                     uint32_t align,
+                                                     bool want_zero,
+                                                     int64_t offset,
+                                                     int64_t bytes,
+                                                     int64_t *pnum,
+                                                     int64_t *map,
+                                                     BlockDriverState **file)
+{
+    int ret;
+
+    assert(is_power_of_2(align) && QEMU_IS_ALIGNED(offset | bytes, align));
+    ret = bdrv_co_block_status(bs, want_zero, offset, bytes, pnum, map, file);
+    if (ret < 0) {
+        return ret;
+    }
+    if (!*pnum) {
+        assert(!bytes || ret & BDRV_BLOCK_EOF);
+        return ret;
+    }
+    if (*pnum >= align) {
+        if (!QEMU_IS_ALIGNED(*pnum, align)) {
+            ret &= ~BDRV_BLOCK_EOF;
+            *pnum = QEMU_ALIGN_DOWN(*pnum, align);
+        }
+        return ret;
+    }
+    /* Merge in status for the rest of align */
+    while (*pnum < align) {
+        int ret2;
+        int64_t pnum2;
+        int64_t map2;
+        BlockDriverState *file2;
+
+        ret2 = bdrv_co_block_status(bs, want_zero, offset + *pnum,
+                                    align - *pnum, &pnum2,
+                                    map ? &map2 : NULL, file ? &file2 : NULL);
+        if (ret2 < 0) {
+            return ret2;
+        }
+        if (ret2 & BDRV_BLOCK_EOF) {
+            ret |= BDRV_BLOCK_EOF;
+            if (!pnum2) {
+                pnum2 = align - *pnum;
+                ret2 |= BDRV_BLOCK_ZERO;
+            }
+        } else {
+            assert(pnum2);
+        }
+        if (ret2 & BDRV_BLOCK_DATA) {
+            ret |= BDRV_BLOCK_DATA;
+        }
+        if (!(ret2 & BDRV_BLOCK_ZERO)) {
+            ret &= ~BDRV_BLOCK_ZERO;
+        }
+        if ((ret & BDRV_BLOCK_OFFSET_VALID) &&
+            (!(ret2 & BDRV_BLOCK_OFFSET_VALID) ||
+             (map && *map + *pnum != map2) || (file && *file != file2))) {
+            ret &= ~BDRV_BLOCK_OFFSET_VALID;
+            if (map) {
+                *map = 0;
+            }
+            if (file) {
+                *file = NULL;
+            }
+        }
+        if (ret2 & BDRV_BLOCK_ALLOCATED) {
+            ret |= BDRV_BLOCK_ALLOCATED;
+        }
+        if (ret2 & BDRV_BLOCK_RAW) {
+            assert(ret & BDRV_BLOCK_RAW);
+            assert(ret & BDRV_BLOCK_OFFSET_VALID);
+        }
+        *pnum += pnum2;
+    }
+    return ret;
+}
+
 /*
  * Returns the allocation status of the specified sectors.
  * Drivers not implementing the functionality are assumed to not support
@@ -2121,6 +2212,19 @@ static int coroutine_fn bdrv_co_block_status(BlockDriverState *bs,
      */
     assert(*pnum && QEMU_IS_ALIGNED(*pnum, align) &&
            align > offset - aligned_offset);
+
+    if (ret & BDRV_BLOCK_RAW) {
+        assert(ret & BDRV_BLOCK_OFFSET_VALID && local_file);
+        ret = bdrv_co_block_status_aligned(local_file, align, want_zero,
+                                           local_map, *pnum, pnum, &local_map,
+                                           &local_file);
+        if (ret < 0) {
+            goto out;
+        }
+        assert(!(ret & BDRV_BLOCK_RAW));
+        ret |= BDRV_BLOCK_RAW;
+    }
+
     *pnum -= offset - aligned_offset;
     if (*pnum > bytes) {
         *pnum = bytes;
@@ -2130,9 +2234,7 @@ static int coroutine_fn bdrv_co_block_status(BlockDriverState *bs,
     }

     if (ret & BDRV_BLOCK_RAW) {
-        assert(ret & BDRV_BLOCK_OFFSET_VALID && local_file);
-        ret = bdrv_co_block_status(local_file, want_zero, local_map,
-                                   *pnum, pnum, &local_map, &local_file);
+        ret &= ~BDRV_BLOCK_RAW;
         goto out;
     }

diff --git a/tests/qemu-iotests/221 b/tests/qemu-iotests/221
index 808cd9a289c..3f60fd0fdc8 100755
--- a/tests/qemu-iotests/221
+++ b/tests/qemu-iotests/221
@@ -41,17 +41,27 @@ echo
 echo "=== Check mapping of unaligned raw image ==="
 echo

+# Note that when we enable format probing by omitting -f, the raw
+# layer forces 512-byte alignment and the bytes past EOF take on the
+# same status as the rest of the sector; otherwise, we can see the
+# implicit hole visible past EOF thanks to the block layer rounding
+# sizes up.
+
 _make_test_img 43009 # qemu-img create rounds size up
 $QEMU_IMG map --output=json "$TEST_IMG" | _filter_qemu_img_map
+$QEMU_IMG map -f $IMGFMT --output=json "$TEST_IMG" | _filter_qemu_img_map

 truncate --size=43009 "$TEST_IMG" # so we resize it and check again
 $QEMU_IMG map --output=json "$TEST_IMG" | _filter_qemu_img_map
+$QEMU_IMG map -f $IMGFMT --output=json "$TEST_IMG" | _filter_qemu_img_map

 $QEMU_IO -c 'w 43008 1' "$TEST_IMG" | _filter_qemu_io # writing also rounds up
 $QEMU_IMG map --output=json "$TEST_IMG" | _filter_qemu_img_map
+$QEMU_IMG map -f $IMGFMT --output=json "$TEST_IMG" | _filter_qemu_img_map

 truncate --size=43009 "$TEST_IMG" # so we resize it and check again
 $QEMU_IMG map --output=json "$TEST_IMG" | _filter_qemu_img_map
+$QEMU_IMG map -f $IMGFMT --output=json "$TEST_IMG" | _filter_qemu_img_map

 # success, all done
 echo '*** done'
diff --git a/tests/qemu-iotests/221.out b/tests/qemu-iotests/221.out
index a9c0190aadc..723e94037fe 100644
--- a/tests/qemu-iotests/221.out
+++ b/tests/qemu-iotests/221.out
@@ -5,12 +5,18 @@ QA output created by 221
 Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=43009
 [{ "start": 0, "length": 43520, "depth": 0, "zero": true, "data": false, "offset": OFFSET}]
 [{ "start": 0, "length": 43520, "depth": 0, "zero": true, "data": false, "offset": OFFSET}]
+[{ "start": 0, "length": 43520, "depth": 0, "zero": true, "data": false, "offset": OFFSET}]
+[{ "start": 0, "length": 43520, "depth": 0, "zero": true, "data": false, "offset": OFFSET}]
 wrote 1/1 bytes at offset 43008
 1 bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
 [{ "start": 0, "length": 40960, "depth": 0, "zero": true, "data": false, "offset": OFFSET},
+{ "start": 40960, "length": 2560, "depth": 0, "zero": false, "data": true, "offset": OFFSET}]
+[{ "start": 0, "length": 40960, "depth": 0, "zero": true, "data": false, "offset": OFFSET},
 { "start": 40960, "length": 2049, "depth": 0, "zero": false, "data": true, "offset": OFFSET},
 { "start": 43009, "length": 511, "depth": 0, "zero": true, "data": false, "offset": OFFSET}]
 [{ "start": 0, "length": 40960, "depth": 0, "zero": true, "data": false, "offset": OFFSET},
+{ "start": 40960, "length": 2560, "depth": 0, "zero": false, "data": true, "offset": OFFSET}]
+[{ "start": 0, "length": 40960, "depth": 0, "zero": true, "data": false, "offset": OFFSET},
 { "start": 40960, "length": 2049, "depth": 0, "zero": false, "data": true, "offset": OFFSET},
 { "start": 43009, "length": 511, "depth": 0, "zero": true, "data": false, "offset": OFFSET}]
 *** done
diff --git a/tests/qemu-iotests/241.out b/tests/qemu-iotests/241.out
index ef7de1205d2..fd856bb0c8d 100644
--- a/tests/qemu-iotests/241.out
+++ b/tests/qemu-iotests/241.out
@@ -22,8 +22,7 @@ WARNING: Image format was not specified for '/home/eblake/qemu/tests/qemu-iotest

   size:  1024
   min block: 1
-[{ "start": 0, "length": 1000, "depth": 0, "zero": false, "data": true, "offset": OFFSET},
-{ "start": 1000, "length": 24, "depth": 0, "zero": true, "data": true, "offset": OFFSET}]
+[{ "start": 0, "length": 1024, "depth": 0, "zero": false, "data": true, "offset": OFFSET}]
 1 KiB (0x400) bytes     allocated at offset 0 bytes (0x0)

 === Encrypted qcow2 file backed by unaligned raw image  ===
-- 
2.20.1

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [Qemu-devel] [PATCH 6/7] nbd/server: Avoid unaligned read/block_status from backing
  2019-04-03  3:05 [Qemu-devel] [PATCH for-4.0? 0/7] Final round of NBD alignment fixes Eric Blake
                   ` (4 preceding siblings ...)
  2019-04-03  3:05 ` [Qemu-devel] [PATCH 5/7] block: Fix BDRV_BLOCK_RAW status to honor alignment Eric Blake
@ 2019-04-03  3:05 ` Eric Blake
  2019-04-03  3:05 ` [Qemu-devel] [PATCH for-4.1 7/7] nbd/server: Avoid unaligned dirty-bitmap status Eric Blake
  2019-04-04 14:52 ` [Qemu-devel] [PATCH for-4.0? 8/7] nbd/client: Fix error message for server with unusable sizing Eric Blake
  7 siblings, 0 replies; 22+ messages in thread
From: Eric Blake @ 2019-04-03  3:05 UTC (permalink / raw)
  To: qemu-devel
  Cc: jsnow, vsementsov, qemu-block, Stefan Hajnoczi, Fam Zheng,
	Kevin Wolf, Max Reitz

The NBD server code used bdrv_block_status_above() to determine where
to fragment structured read and block status replies.  However, the
protocol can only advertise the active layer's minimum block size; if
the active layer is backed by another file with smaller alignment,
then we can end up leaking unaligned results back through to the
client, in violation of the spec.

Fix this by exposing a new bdrv_block_status_aligned() function around
the recently-added internal bdrv_co_block_status_aligned, to guarantee
that all block status answers from backing layers are rounded up to
the alignment of the current layer.  Note that the underlying function
still requires aligned boundaries, but the public function copes with
unaligned inputs.

iotest 241 does not change in output, but running it manually with
traces shows the improved behavior; furthermore, reverting 737d3f5244
but leaving this patch in place lets the test pass (whereas before the
test would fail because the client had to work around the server's
non-compliance).

Note that while this fixes NBD_CMD_READ and NBD_CMD_BLOCK_STATUS for
base:allocation (because those rely on bdrv_block_status*), we still
have a theoretical compliance problem with NBD_CMD_BLOCK_STATUS for
qemu:dirty-bitmap:NN when visiting a bitmap created at a smaller
granularity than what we advertised. On the other hand, dirty bitmap
granularity cannot go below 512, and without the use of blkdebug
(which in turn needs patches before it can properly find dirty bitmaps
or track status from qcow2 backing chains), I can't provoke an NBD
server to advertise an alignment larger than 512.

Signed-off-by: Eric Blake <eblake@redhat.com>
---
 include/block/block.h |  2 ++
 block/io.c            | 47 ++++++++++++++++++++++++++++++++++++++-----
 nbd/server.c          | 14 ++++++-------
 3 files changed, 50 insertions(+), 13 deletions(-)

diff --git a/include/block/block.h b/include/block/block.h
index c7a26199aa0..3f38fa57f2d 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -444,6 +444,8 @@ int bdrv_block_status(BlockDriverState *bs, int64_t offset,
 int bdrv_block_status_above(BlockDriverState *bs, BlockDriverState *base,
                             int64_t offset, int64_t bytes, int64_t *pnum,
                             int64_t *map, BlockDriverState **file);
+int bdrv_block_status_aligned(BlockDriverState *bs, int64_t offset,
+                              int64_t bytes, int64_t *pnum);
 int bdrv_is_allocated(BlockDriverState *bs, int64_t offset, int64_t bytes,
                       int64_t *pnum);
 int bdrv_is_allocated_above(BlockDriverState *top, BlockDriverState *base,
diff --git a/block/io.c b/block/io.c
index 936877d3745..5e6c0d06018 100644
--- a/block/io.c
+++ b/block/io.c
@@ -1981,6 +1981,7 @@ int bdrv_flush_all(void)
 typedef struct BdrvCoBlockStatusData {
     BlockDriverState *bs;
     BlockDriverState *base;
+    uint32_t align;
     bool want_zero;
     int64_t offset;
     int64_t bytes;
@@ -2298,6 +2299,7 @@ early_out:

 static int coroutine_fn bdrv_co_block_status_above(BlockDriverState *bs,
                                                    BlockDriverState *base,
+                                                   uint32_t align,
                                                    bool want_zero,
                                                    int64_t offset,
                                                    int64_t bytes,
@@ -2311,8 +2313,8 @@ static int coroutine_fn bdrv_co_block_status_above(BlockDriverState *bs,

     assert(bs != base);
     for (p = bs; p != base; p = backing_bs(p)) {
-        ret = bdrv_co_block_status(p, want_zero, offset, bytes, pnum, map,
-                                   file);
+        ret = bdrv_co_block_status_aligned(p, align, want_zero, offset, bytes,
+                                           pnum, map, file);
         if (ret < 0) {
             break;
         }
@@ -2342,7 +2344,7 @@ static void coroutine_fn bdrv_block_status_above_co_entry(void *opaque)
     BdrvCoBlockStatusData *data = opaque;

     data->ret = bdrv_co_block_status_above(data->bs, data->base,
-                                           data->want_zero,
+                                           data->align, data->want_zero,
                                            data->offset, data->bytes,
                                            data->pnum, data->map, data->file);
     data->done = true;
@@ -2356,6 +2358,7 @@ static void coroutine_fn bdrv_block_status_above_co_entry(void *opaque)
  */
 static int bdrv_common_block_status_above(BlockDriverState *bs,
                                           BlockDriverState *base,
+                                          uint32_t align,
                                           bool want_zero, int64_t offset,
                                           int64_t bytes, int64_t *pnum,
                                           int64_t *map,
@@ -2365,6 +2368,7 @@ static int bdrv_common_block_status_above(BlockDriverState *bs,
     BdrvCoBlockStatusData data = {
         .bs = bs,
         .base = base,
+        .align = align,
         .want_zero = want_zero,
         .offset = offset,
         .bytes = bytes,
@@ -2389,7 +2393,7 @@ int bdrv_block_status_above(BlockDriverState *bs, BlockDriverState *base,
                             int64_t offset, int64_t bytes, int64_t *pnum,
                             int64_t *map, BlockDriverState **file)
 {
-    return bdrv_common_block_status_above(bs, base, true, offset, bytes,
+    return bdrv_common_block_status_above(bs, base, 1, true, offset, bytes,
                                           pnum, map, file);
 }

@@ -2400,13 +2404,46 @@ int bdrv_block_status(BlockDriverState *bs, int64_t offset, int64_t bytes,
                                    offset, bytes, pnum, map, file);
 }

+/*
+ * Similar to bdrv_block_status_above(bs, NULL, ...), but ensures that
+ * the answer matches the minimum alignment of bs (smaller alignments
+ * in layers above will not leak through to the active layer). It is
+ * assumed that callers do not care about the resulting mapping of
+ * offsets to an underlying BDS.
+ */
+int bdrv_block_status_aligned(BlockDriverState *bs, int64_t offset,
+                              int64_t bytes, int64_t *pnum)
+{
+    /* Widen the request to aligned boundaries */
+    int64_t aligned_offset, aligned_bytes;
+    uint32_t align = bs->bl.request_alignment;
+    int ret;
+
+    assert(pnum);
+    aligned_offset = QEMU_ALIGN_DOWN(offset, align);
+    aligned_bytes = ROUND_UP(offset + bytes, align) - aligned_offset;
+    ret = bdrv_common_block_status_above(bs, NULL, align, true, aligned_offset,
+                                         aligned_bytes, pnum, NULL, NULL);
+    if (ret < 0) {
+        *pnum = 0;
+        return ret;
+    }
+    assert(*pnum && QEMU_IS_ALIGNED(*pnum, align) &&
+           align > offset - aligned_offset);
+    *pnum -= offset - aligned_offset;
+    if (*pnum > bytes) {
+        *pnum = bytes;
+    }
+    return ret;
+}
+
 int coroutine_fn bdrv_is_allocated(BlockDriverState *bs, int64_t offset,
                                    int64_t bytes, int64_t *pnum)
 {
     int ret;
     int64_t dummy;

-    ret = bdrv_common_block_status_above(bs, backing_bs(bs), false, offset,
+    ret = bdrv_common_block_status_above(bs, backing_bs(bs), 1, false, offset,
                                          bytes, pnum ? pnum : &dummy, NULL,
                                          NULL);
     if (ret < 0) {
diff --git a/nbd/server.c b/nbd/server.c
index faa3c7879fd..576deb931c8 100644
--- a/nbd/server.c
+++ b/nbd/server.c
@@ -1792,7 +1792,7 @@ static int coroutine_fn nbd_co_send_structured_error(NBDClient *client,
 }

 /* Do a sparse read and send the structured reply to the client.
- * Returns -errno if sending fails. bdrv_block_status_above() failure is
+ * Returns -errno if sending fails. bdrv_block_status_aligned() failure is
  * reported to the client, at which point this function succeeds.
  */
 static int coroutine_fn nbd_co_send_sparse_read(NBDClient *client,
@@ -1808,10 +1808,9 @@ static int coroutine_fn nbd_co_send_sparse_read(NBDClient *client,

     while (progress < size) {
         int64_t pnum;
-        int status = bdrv_block_status_above(blk_bs(exp->blk), NULL,
-                                             offset + progress,
-                                             size - progress, &pnum, NULL,
-                                             NULL);
+        int status = bdrv_block_status_aligned(blk_bs(exp->blk),
+                                               offset + progress,
+                                               size - progress, &pnum);
         bool final;

         if (status < 0) {
@@ -1864,7 +1863,7 @@ static int coroutine_fn nbd_co_send_sparse_read(NBDClient *client,
  * length encoded (which may be smaller than the original), and update
  * @nb_extents to the number of extents used.
  *
- * Returns zero on success and -errno on bdrv_block_status_above failure.
+ * Returns zero on success and -errno on bdrv_block_status_aligned failure.
  */
 static int blockstatus_to_extents(BlockDriverState *bs, uint64_t offset,
                                   uint64_t *bytes, NBDExtent *extents,
@@ -1878,8 +1877,7 @@ static int blockstatus_to_extents(BlockDriverState *bs, uint64_t offset,
     while (remaining_bytes) {
         uint32_t flags;
         int64_t num;
-        int ret = bdrv_block_status_above(bs, NULL, offset, remaining_bytes,
-                                          &num, NULL, NULL);
+        int ret = bdrv_block_status_aligned(bs, offset, remaining_bytes, &num);

         if (ret < 0) {
             return ret;
-- 
2.20.1

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [Qemu-devel] [PATCH for-4.1 7/7] nbd/server: Avoid unaligned dirty-bitmap status
  2019-04-03  3:05 [Qemu-devel] [PATCH for-4.0? 0/7] Final round of NBD alignment fixes Eric Blake
                   ` (5 preceding siblings ...)
  2019-04-03  3:05 ` [Qemu-devel] [PATCH 6/7] nbd/server: Avoid unaligned read/block_status from backing Eric Blake
@ 2019-04-03  3:05 ` Eric Blake
  2019-04-04 14:52 ` [Qemu-devel] [PATCH for-4.0? 8/7] nbd/client: Fix error message for server with unusable sizing Eric Blake
  7 siblings, 0 replies; 22+ messages in thread
From: Eric Blake @ 2019-04-03  3:05 UTC (permalink / raw)
  To: qemu-devel; +Cc: jsnow, vsementsov, qemu-block

The NBD spec requires that responses to NBD_CMD_BLOCK_STATUS be
aligned to the server's advertised min_block alignment, if the client
agreed to abide by alignments. In general, since dirty bitmap
granularity cannot go below 512, and it is hard to provoke qcow2 to go
above an alignment of 512, this is not an issue. And until the block
layer can see through filters, the moment you use blkdebug, you can no
longer export dirty bitmaps over NBD.  But once we fix the traversal
through filters, then blkdebug can create a scenario where the server
is provoked into supplying a non-compliant reply.

Thus, it is time to fix the dirty bitmap code to round requests to
aligned boundaries.

This hack patch lets blkdebug be used; although note that Max has
proposed a more complete solution
https://lists.gnu.org/archive/html/qemu-devel/2019-02/msg03534.html

| diff --git a/nbd/server.c b/nbd/server.c
| index 576deb931c8..1d370b5b2c7 100644
| --- a/nbd/server.c
| +++ b/nbd/server.c
| @@ -1507,6 +1507,9 @@ NBDExport *nbd_export_new(BlockDriverState *bs, uint64_t dev_offset,
|          BdrvDirtyBitmap *bm = NULL;
|
|          while (true) {
| +            while (bs->drv && bs->drv->is_filter && bs->file) {
| +                bs = bs->file->bs;
| +            }
|              bm = bdrv_find_dirty_bitmap(bs, bitmap);
|              if (bm != NULL || bs->backing == NULL) {
|                  break;

Then the following sequence creates a dirty bitmap with granularity
512, exposed through a blkdebug interface with minimum block size 4k;
correct behavior is to see a map showing the first 4k as "data":false
(corresponding to the dirty bit in the 2nd of 8 sectors, rounded
up). Without this patch, the code instead showed the entire image as
"data":true (due to the client working around the server
non-compliance by coalescing regions, but assuming that data:true
takes precedence, which is the opposite of the behavior we want during
x-dirty-bitmap).

$ printf %01000d 0 > img
$ qemu-img create -f qcow2 -F raw -b img img.wrap 64k
$ qemu-system-x86_64 -nodefaults -nographic -qmp stdio
{'execute':'qmp_capabilities'}
{'execute':'blockdev-add','arguments':{'node-name':'n','driver':'qcow2','file':{'driver':'file','filename':'img.wrap'}}}
{'execute':'nbd-server-start','arguments':{'addr':{'type':'inet','data':{'host':'localhost','port':'10809'}}}}
{'execute':'block-dirty-bitmap-add','arguments':{'node':'n','name':'b','persistent':true,'granularity':512}}
{'execute':'nbd-server-add','arguments':{'device':'n','bitmap':'b','writable':true}}
^z
$ bg
$ qemu-io -f raw -c 'w 513 1' nbd://localhost:10809
$ fg
{'execute':'nbd-server-remove','arguments':{'name':'n'}}
{'execute':'block-dirty-bitmap-disable','arguments':{'node':'n','name':'b'}}
{'execute':'blockdev-add','arguments':{'driver':'blkdebug','align':4096,'image':'n','node-name':'wrap'}}
{'execute':'nbd-server-add','arguments':{'device':'wrap','bitmap':'b'}}
^z
$ bg
$ qemu-img map --output=json --image-opts driver=nbd,x-dirty-bitmap=qemu:dirty-bitmap:b,server.type=inet,server.host=localhost,server.port=10809,export=wrap
$ fg
{'execute':'quit'}

Signed-off-by: Eric Blake <eblake@redhat.com>

---
Not for 4.0; once Max's patches land to see through filters, then this
commit message will need to be rewritten, and the steps outlined above
instead turned into an addition for iotest 241.
---
 nbd/server.c | 40 ++++++++++++++++++++++++++++++++++------
 1 file changed, 34 insertions(+), 6 deletions(-)

diff --git a/nbd/server.c b/nbd/server.c
index 576deb931c8..82da0bae9ba 100644
--- a/nbd/server.c
+++ b/nbd/server.c
@@ -1982,7 +1982,8 @@ static int nbd_co_send_block_status(NBDClient *client, uint64_t handle,
  * byte length encoded (which may be smaller or larger than the
  * original), and return the number of extents used.
  */
-static unsigned int bitmap_to_extents(BdrvDirtyBitmap *bitmap, uint64_t offset,
+static unsigned int bitmap_to_extents(uint32_t align,
+                                      BdrvDirtyBitmap *bitmap, uint64_t offset,
                                       uint64_t *length, NBDExtent *extents,
                                       unsigned int nb_extents,
                                       bool dont_fragment)
@@ -2008,13 +2009,30 @@ static unsigned int bitmap_to_extents(BdrvDirtyBitmap *bitmap, uint64_t offset,
             bdrv_set_dirty_iter(it, begin);
             end = bdrv_dirty_iter_next(it);
         }
-        if (end == -1 || end - begin > UINT32_MAX) {
+        if (end == -1 || end - begin > UINT32_MAX - align) {
             /* Cap to an aligned value < 4G beyond begin. */
             end = MIN(bdrv_dirty_bitmap_size(bitmap),
                       begin + UINT32_MAX + 1 -
-                      bdrv_dirty_bitmap_granularity(bitmap));
+                      MAX(align, bdrv_dirty_bitmap_granularity(bitmap)));
             next_dirty = dirty;
         }
+        /*
+         * Round unaligned bits: any transition mid-alignment makes
+         * that entire aligned region appear dirty
+         */
+        if (!QEMU_IS_ALIGNED(end, align)) {
+            if (dirty) {
+                end = QEMU_ALIGN_UP(end, align);
+            } else if (end - begin < align) {
+                end = begin + align;
+                dirty = true;
+            } else {
+                end = QEMU_ALIGN_DOWN(end, align);
+            }
+            if (end < bdrv_dirty_bitmap_size(bitmap)) {
+                next_dirty = bdrv_get_dirty_locked(NULL, bitmap, end);
+            }
+        }
         if (dont_fragment && end > overall_end) {
             end = overall_end;
         }
@@ -2042,11 +2060,21 @@ static int nbd_co_send_bitmap(NBDClient *client, uint64_t handle,
 {
     int ret;
     unsigned int nb_extents = dont_fragment ? 1 : NBD_MAX_BITMAP_EXTENTS;
-    NBDExtent *extents = g_new(NBDExtent, nb_extents);
+    NBDExtent *extents;
     uint64_t final_length = length;

-    nb_extents = bitmap_to_extents(bitmap, offset, &final_length, extents,
-                                   nb_extents, dont_fragment);
+    /* Easiest to just refuse to answer an unaligned query */
+    if (client->check_align &&
+        !QEMU_IS_ALIGNED(offset | length, client->check_align)) {
+        return nbd_co_send_structured_error(client, handle, -EINVAL,
+                                            "unaligned dirty bitmap request",
+                                            errp);
+    }
+
+    extents = g_new(NBDExtent, nb_extents);
+    nb_extents = bitmap_to_extents(client->check_align ?: 1, bitmap, offset,
+                                   &final_length, extents, nb_extents,
+                                   dont_fragment);

     ret = nbd_co_send_extents(client, handle, extents, nb_extents,
                               final_length, last, context_id, errp);
-- 
2.20.1

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [Qemu-devel] [PATCH 5/7] block: Fix BDRV_BLOCK_RAW status to honor alignment
  2019-04-03  3:05 ` [Qemu-devel] [PATCH 5/7] block: Fix BDRV_BLOCK_RAW status to honor alignment Eric Blake
@ 2019-04-03 13:03   ` Kevin Wolf
  2019-04-03 14:02     ` Eric Blake
  0 siblings, 1 reply; 22+ messages in thread
From: Kevin Wolf @ 2019-04-03 13:03 UTC (permalink / raw)
  To: Eric Blake
  Cc: qemu-devel, jsnow, vsementsov, qemu-block, Stefan Hajnoczi,
	Fam Zheng, Max Reitz

Am 03.04.2019 um 05:05 hat Eric Blake geschrieben:
> Previous patches mentioned how the blkdebug filter driver demonstrates
> a bug present in our NBD server; the bug is also present with the raw
> format driver when probing occurs. Basically, if we specify a
> particular alignment > 1, but defer the actual block status to the
> underlying file, and the underlying file has a smaller alignment, then
> the use of BDRV_BLOCK_RAW to defer to the underlying file can end up
> with status split at an alignment unacceptable to our layer.  Many
> callers don't care, but NBD does - it is a violation of the NBD
> protocol to send status or read results split on an unaligned boundary
> (we've taught our client to be tolerant of such violations because of
> qemu 3.1; but we still need to fix our server for the sake of other
> stricter clients).
> 
> This patch lays the groundwork - it adjusts bdrv_block_status to
> ensure that any time one layer defers to another via BDRV_BLOCK_RAW,
> the deferral is either truncated down to an aligned boundary, or
> multiple sub-aligned requests are coalesced into a single
> representative answer (using an implicit hole beyond EOF as
> needed). Iotest 241 exposes the effect (when format probing occurred,
> we don't want block status to subdivide the initial sector, and thus
> not any other sector either). Similarly, iotest 221 is a good
> candidate to amend to specifically track the effects; a change to a
> hole at EOF is not visible if an upper layer does not want alignment
> smaller than 512. However, this patch alone is not a complete fix - it
> is still possible to have an active layer with large alignment
> constraints backed by another layer with smaller constraints; so the
> next patch will complete the task.
> 
> Signed-off-by: Eric Blake <eblake@redhat.com>
> ---
>  block/io.c                 | 108 +++++++++++++++++++++++++++++++++++--
>  tests/qemu-iotests/221     |  10 ++++
>  tests/qemu-iotests/221.out |   6 +++
>  tests/qemu-iotests/241.out |   3 +-
>  4 files changed, 122 insertions(+), 5 deletions(-)
> 
> diff --git a/block/io.c b/block/io.c
> index dfc153b8d82..936877d3745 100644
> --- a/block/io.c
> +++ b/block/io.c
> @@ -2021,6 +2021,97 @@ int coroutine_fn bdrv_co_block_status_from_backing(BlockDriverState *bs,
>      return BDRV_BLOCK_RAW | BDRV_BLOCK_OFFSET_VALID;
>  }
> 
> +static int coroutine_fn bdrv_co_block_status(BlockDriverState *bs,
> +                                             bool want_zero,
> +                                             int64_t offset, int64_t bytes,
> +                                             int64_t *pnum, int64_t *map,
> +                                             BlockDriverState **file);
> +
> +/*
> + * Returns an aligned allocation status of the specified sectors.
> + * Wrapper around bdrv_co_block_status() which requires the initial
> + * offset and count to be aligned, and guarantees the result will also
> + * be aligned (even if it requires piecing together multiple
> + * sub-aligned queries into an appropriate coalesced answer).
> + */

I think this comment should specify the result of the operation when the
aligned region consists of multiple subregions with different status.
Probably something like this:

- BDRV_BLOCK_DATA is set if the flag is set for at least one subregion
- BDRV_BLOCK_ZERO is set if the flag is set for all subregions
- BDRV_BLOCK_OFFSET_VALID is set if the flag is set for all subregions,
  the provided offsets are contiguous and file is the same for all
  subregions.
- BDRV_BLOCK_ALLOCATED is never set here (the caller sets it)
- BDRV_BLOCK_EOF is set if the last subregion sets it; assert that it's
  not set for any other subregion
- BDRV_BLOCK_RAW is never set

- *map is only set if BDRV_BLOCK_OFFSET_VALID is set. It contains
  the offset of the first subregion then.

- *file is also only set if BDRV_BLOCK_OFFSET_VALID is set. It contains
  the *file of the subregions, which must be the same for all of them
  (otherwise, we wouldn't have set BDRV_BLOCK_OFFSET_VALID).

- *pnum: The sum of all subregions

> +static int coroutine_fn bdrv_co_block_status_aligned(BlockDriverState *bs,
> +                                                     uint32_t align,
> +                                                     bool want_zero,
> +                                                     int64_t offset,
> +                                                     int64_t bytes,
> +                                                     int64_t *pnum,
> +                                                     int64_t *map,
> +                                                     BlockDriverState **file)
> +{
> +    int ret;
> +
> +    assert(is_power_of_2(align) && QEMU_IS_ALIGNED(offset | bytes, align));
> +    ret = bdrv_co_block_status(bs, want_zero, offset, bytes, pnum, map, file);
> +    if (ret < 0) {
> +        return ret;
> +    }
> +    if (!*pnum) {
> +        assert(!bytes || ret & BDRV_BLOCK_EOF);
> +        return ret;
> +    }
> +    if (*pnum >= align) {
> +        if (!QEMU_IS_ALIGNED(*pnum, align)) {
> +            ret &= ~BDRV_BLOCK_EOF;
> +            *pnum = QEMU_ALIGN_DOWN(*pnum, align);
> +        }
> +        return ret;
> +    }

So we decide to just shorten the function if we can return at least some
*pnum > 0. This means that is most cases, this function will have to be
called twice, once for the aligned part, and again for the misaligned
rest.

We do save a little if the caller isn't actually interested in mapping
the whole image, but just in a specific range before the misaligned
part.

So it depends on the use case whether this is an optimisation or a
pessimisation.

> +    /* Merge in status for the rest of align */
> +    while (*pnum < align) {

Okay, in any case, I can see it's a simplification. :-)

> +        int ret2;
> +        int64_t pnum2;
> +        int64_t map2;
> +        BlockDriverState *file2;
> +
> +        ret2 = bdrv_co_block_status(bs, want_zero, offset + *pnum,
> +                                    align - *pnum, &pnum2,
> +                                    map ? &map2 : NULL, file ? &file2 : NULL);
> +        if (ret2 < 0) {
> +            return ret2;
> +        }
> +        if (ret2 & BDRV_BLOCK_EOF) {
> +            ret |= BDRV_BLOCK_EOF;
> +            if (!pnum2) {
> +                pnum2 = align - *pnum;
> +                ret2 |= BDRV_BLOCK_ZERO;
> +            }
> +        } else {
> +            assert(pnum2);
> +        }
> +        if (ret2 & BDRV_BLOCK_DATA) {
> +            ret |= BDRV_BLOCK_DATA;
> +        }
> +        if (!(ret2 & BDRV_BLOCK_ZERO)) {
> +            ret &= ~BDRV_BLOCK_ZERO;
> +        }
> +        if ((ret & BDRV_BLOCK_OFFSET_VALID) &&
> +            (!(ret2 & BDRV_BLOCK_OFFSET_VALID) ||
> +             (map && *map + *pnum != map2) || (file && *file != file2))) {
> +            ret &= ~BDRV_BLOCK_OFFSET_VALID;
> +            if (map) {
> +                *map = 0;
> +            }
> +            if (file) {
> +                *file = NULL;
> +            }
> +        }
> +        if (ret2 & BDRV_BLOCK_ALLOCATED) {
> +            ret |= BDRV_BLOCK_ALLOCATED;
> +        }

Hm, so if we have a region that consists of two subregion, one is
unallocated and the other one is a zero cluster. The result will be
DATA == false, ZERO == false, ALLOCATED == true. This looks a bit odd.

Did you check this case and come to the conclusion that it's okay? If
so, I think a comment would be good.

> +        if (ret2 & BDRV_BLOCK_RAW) {
> +            assert(ret & BDRV_BLOCK_RAW);
> +            assert(ret & BDRV_BLOCK_OFFSET_VALID);
> +        }

Doesn't RAW mean that we need to recurse rather than returning the flag?
Or actually, bdrv_co_block_status() shouldn't ever return the flag, so
can't we assert that it's clear?

> +        *pnum += pnum2;
> +    }
> +    return ret;
> +}

Kevin

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Qemu-devel] [PATCH 5/7] block: Fix BDRV_BLOCK_RAW status to honor alignment
  2019-04-03 13:03   ` Kevin Wolf
@ 2019-04-03 14:02     ` Eric Blake
  0 siblings, 0 replies; 22+ messages in thread
From: Eric Blake @ 2019-04-03 14:02 UTC (permalink / raw)
  To: Kevin Wolf
  Cc: qemu-devel, jsnow, vsementsov, qemu-block, Stefan Hajnoczi,
	Fam Zheng, Max Reitz

[-- Attachment #1: Type: text/plain, Size: 11091 bytes --]

On 4/3/19 8:03 AM, Kevin Wolf wrote:
> Am 03.04.2019 um 05:05 hat Eric Blake geschrieben:
>> Previous patches mentioned how the blkdebug filter driver demonstrates
>> a bug present in our NBD server; the bug is also present with the raw
>> format driver when probing occurs. Basically, if we specify a
>> particular alignment > 1, but defer the actual block status to the
>> underlying file, and the underlying file has a smaller alignment, then
>> the use of BDRV_BLOCK_RAW to defer to the underlying file can end up
>> with status split at an alignment unacceptable to our layer.  Many
>> callers don't care, but NBD does - it is a violation of the NBD
>> protocol to send status or read results split on an unaligned boundary
>> (we've taught our client to be tolerant of such violations because of
>> qemu 3.1; but we still need to fix our server for the sake of other
>> stricter clients).
>>
>> This patch lays the groundwork - it adjusts bdrv_block_status to
>> ensure that any time one layer defers to another via BDRV_BLOCK_RAW,
>> the deferral is either truncated down to an aligned boundary, or
>> multiple sub-aligned requests are coalesced into a single
>> representative answer (using an implicit hole beyond EOF as
>> needed). Iotest 241 exposes the effect (when format probing occurred,
>> we don't want block status to subdivide the initial sector, and thus
>> not any other sector either). Similarly, iotest 221 is a good
>> candidate to amend to specifically track the effects; a change to a
>> hole at EOF is not visible if an upper layer does not want alignment
>> smaller than 512. However, this patch alone is not a complete fix - it
>> is still possible to have an active layer with large alignment
>> constraints backed by another layer with smaller constraints; so the
>> next patch will complete the task.

I should probably update this text to call out that the next patch
introduces some additional mutual recursion (that is, this patch in
isolation may appear to have dead paths, as the caller
bdrv_co_block_status always passes non-NULL 'map' and 'file'; but the
next patch adjusts bdrv_co_block_status_above to call this instead of
directly into bdrv_co_block_status, hence exposing those paths).

>>
>> Signed-off-by: Eric Blake <eblake@redhat.com>
>> ---
>>  block/io.c                 | 108 +++++++++++++++++++++++++++++++++++--
>>  tests/qemu-iotests/221     |  10 ++++
>>  tests/qemu-iotests/221.out |   6 +++
>>  tests/qemu-iotests/241.out |   3 +-
>>  4 files changed, 122 insertions(+), 5 deletions(-)
>>

>> +/*
>> + * Returns an aligned allocation status of the specified sectors.
>> + * Wrapper around bdrv_co_block_status() which requires the initial
>> + * offset and count to be aligned, and guarantees the result will also
>> + * be aligned (even if it requires piecing together multiple
>> + * sub-aligned queries into an appropriate coalesced answer).
>> + */
> 
> I think this comment should specify the result of the operation when the
> aligned region consists of multiple subregions with different status.

Good idea.

> Probably something like this:
> 
> - BDRV_BLOCK_DATA is set if the flag is set for at least one subregion
> - BDRV_BLOCK_ZERO is set if the flag is set for all subregions
> - BDRV_BLOCK_OFFSET_VALID is set if the flag is set for all subregions,
>   the provided offsets are contiguous and file is the same for all
>   subregions.

Correct.

> - BDRV_BLOCK_ALLOCATED is never set here (the caller sets it)

Not true, bdrv_co_block_status can set BDRV_BLOCK_ALLOCATED.

> - BDRV_BLOCK_EOF is set if the last subregion sets it; assert that it's
>   not set for any other subregion

With the additional caveat that status beyond BDRV_BLOCK_EOF up to the
alignment boundary is treated as an implicit hole.

> - BDRV_BLOCK_RAW is never set

That should be true.

> 
> - *map is only set if BDRV_BLOCK_OFFSET_VALID is set. It contains
>   the offset of the first subregion then.

Correct.  Since the subregions had to be contiguous, it is the correct
offset for the entire aligned region.

> 
> - *file is also only set if BDRV_BLOCK_OFFSET_VALID is set. It contains
>   the *file of the subregions, which must be the same for all of them
>   (otherwise, we wouldn't have set BDRV_BLOCK_OFFSET_VALID).
> 
> - *pnum: The sum of all subregions

And is guaranteed to be aligned, as well as being non-zero unless
'bytes' was 0 on input or if the entire status request is beyond EOF.

> 
>> +static int coroutine_fn bdrv_co_block_status_aligned(BlockDriverState *bs,
>> +                                                     uint32_t align,
>> +                                                     bool want_zero,
>> +                                                     int64_t offset,
>> +                                                     int64_t bytes,
>> +                                                     int64_t *pnum,
>> +                                                     int64_t *map,
>> +                                                     BlockDriverState **file)
>> +{
>> +    int ret;
>> +
>> +    assert(is_power_of_2(align) && QEMU_IS_ALIGNED(offset | bytes, align));
>> +    ret = bdrv_co_block_status(bs, want_zero, offset, bytes, pnum, map, file);
>> +    if (ret < 0) {
>> +        return ret;
>> +    }
>> +    if (!*pnum) {
>> +        assert(!bytes || ret & BDRV_BLOCK_EOF);
>> +        return ret;
>> +    }
>> +    if (*pnum >= align) {
>> +        if (!QEMU_IS_ALIGNED(*pnum, align)) {
>> +            ret &= ~BDRV_BLOCK_EOF;
>> +            *pnum = QEMU_ALIGN_DOWN(*pnum, align);
>> +        }
>> +        return ret;
>> +    }
> 
> So we decide to just shorten the function if we can return at least some
> *pnum > 0. This means that is most cases, this function will have to be
> called twice, once for the aligned part, and again for the misaligned
> rest.

Yes, that was my choice for ease of implementation this time around.

> 
> We do save a little if the caller isn't actually interested in mapping
> the whole image, but just in a specific range before the misaligned
> part.
> 
> So it depends on the use case whether this is an optimisation or a
> pessimisation.

Yeah, it looks like I should try harder for the BDRV_BLOCK_EOF case -
there, we KNOW that there is only one more subregion, and it will be a
hole; we also know that coalescing a hole with a hole is still a hole,
and coalescing a hole with data is data.  So special-casing
BDRV_BLOCK_EOF to widen out to alignment is probably better than splitting.

For cases where BDRV_BLOCK_EOF is not set, we have no easy guarantees
about how many subregions the next BDS will provide, nor if those
subsequent subregions will always coalesce into the type of the first
subregion, so splitting early is still probably the smartest approach
for keeping that code sane.

Sounds like I should also try harder to capture some of these nuances in
iotests (I did capture the mid-sector hole due to EOF, but capturing a
blkdebug at 4k alignment wrapping a qcow2 with 512 alignment will cover
more of the interesting coalescing code).


> 
>> +    /* Merge in status for the rest of align */
>> +    while (*pnum < align) {
> 
> Okay, in any case, I can see it's a simplification. :-)
> 
>> +        int ret2;
>> +        int64_t pnum2;
>> +        int64_t map2;
>> +        BlockDriverState *file2;
>> +
>> +        ret2 = bdrv_co_block_status(bs, want_zero, offset + *pnum,
>> +                                    align - *pnum, &pnum2,
>> +                                    map ? &map2 : NULL, file ? &file2 : NULL);
>> +        if (ret2 < 0) {
>> +            return ret2;
>> +        }
>> +        if (ret2 & BDRV_BLOCK_EOF) {
>> +            ret |= BDRV_BLOCK_EOF;
>> +            if (!pnum2) {
>> +                pnum2 = align - *pnum;
>> +                ret2 |= BDRV_BLOCK_ZERO;
>> +            }
>> +        } else {
>> +            assert(pnum2);
>> +        }
>> +        if (ret2 & BDRV_BLOCK_DATA) {
>> +            ret |= BDRV_BLOCK_DATA;
>> +        }
>> +        if (!(ret2 & BDRV_BLOCK_ZERO)) {
>> +            ret &= ~BDRV_BLOCK_ZERO;
>> +        }
>> +        if ((ret & BDRV_BLOCK_OFFSET_VALID) &&
>> +            (!(ret2 & BDRV_BLOCK_OFFSET_VALID) ||
>> +             (map && *map + *pnum != map2) || (file && *file != file2))) {
>> +            ret &= ~BDRV_BLOCK_OFFSET_VALID;
>> +            if (map) {
>> +                *map = 0;
>> +            }
>> +            if (file) {
>> +                *file = NULL;
>> +            }
>> +        }
>> +        if (ret2 & BDRV_BLOCK_ALLOCATED) {
>> +            ret |= BDRV_BLOCK_ALLOCATED;
>> +        }
> 
> Hm, so if we have a region that consists of two subregion, one is
> unallocated and the other one is a zero cluster. The result will be
> DATA == false, ZERO == false, ALLOCATED == true. This looks a bit odd.

In the common case of 'DATA|ALLOCATED|EOF, ZERO' for the hole past EOF,
I think we want to prefer keeping ALLOCATED set. But you are asking
about '0, ZERO|ALLOCATED', where we cannot guarantee that the entire
region reads as zeroes, and we also know that the entire region is not
allocated but some of it is. Maybe this should pay attention to the
'want_zero' flag? We already state elsewhere that:

 * If 'want_zero' is true, the caller is querying for mapping
 * purposes, with a focus on valid BDRV_BLOCK_OFFSET_VALID, _DATA, and
 * _ZERO where possible; otherwise, the result favors larger 'pnum',
 * with a focus on accurate BDRV_BLOCK_ALLOCATED.

so that would mean that if want_zero is true, we treat the entire region
as unallocated if any subregion prior to EOF is unallocated (since we
can't report ZERO, and the caller didn't care about ALLOCATED); if
want_zero is false, we treat the entire region as allocated if any
subregion is allocated (since we know the caller cared about ALLOCATED,
and less about ZERO)?

Or does it mean we just have to audit all callers to see if any of them
would misbehave in one way or another depending on which way we set the
flag?

> Did you check this case and come to the conclusion that it's okay? If
> so, I think a comment would be good.
> 
>> +        if (ret2 & BDRV_BLOCK_RAW) {
>> +            assert(ret & BDRV_BLOCK_RAW);
>> +            assert(ret & BDRV_BLOCK_OFFSET_VALID);
>> +        }
> 
> Doesn't RAW mean that we need to recurse rather than returning the flag?
> Or actually, bdrv_co_block_status() shouldn't ever return the flag, so
> can't we assert that it's clear?

Yes, I think you are right that we can assert RAW is clear by this point.

> 
>> +        *pnum += pnum2;
>> +    }
>> +    return ret;
>> +}
> 
> Kevin
> 

So since I have to do a v2, this is definitely slipping into 4.1.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [Qemu-devel] [PATCH for-4.0? 8/7] nbd/client: Fix error message for server with unusable sizing
  2019-04-03  3:05 [Qemu-devel] [PATCH for-4.0? 0/7] Final round of NBD alignment fixes Eric Blake
                   ` (6 preceding siblings ...)
  2019-04-03  3:05 ` [Qemu-devel] [PATCH for-4.1 7/7] nbd/server: Avoid unaligned dirty-bitmap status Eric Blake
@ 2019-04-04 14:52 ` Eric Blake
  2019-04-04 15:22   ` [Qemu-devel] [Qemu-block] " Kevin Wolf
  7 siblings, 1 reply; 22+ messages in thread
From: Eric Blake @ 2019-04-04 14:52 UTC (permalink / raw)
  To: qemu-devel; +Cc: vsementsov, jsnow, qemu-block

Add a missing space to the error message used when giving up on a
server that insists on an alignment which renders the last few bytes
of the export unreadable.

Fixes: 3add3ab78
Signed-off-by: Eric Blake <eblake@redhat.com>
---

We've already concluded that 5-7 are too risky for 4.0 (so qemu 4.0
NBD servers will have non-compliance issues in alignment corner cases,
although in fewer places than where 3.1 had similar bugs; and the 4.0
client is able to work around those non-compliance cases).  But I'd
still like a review on 1-3 and this patch 8, as trivial improvements
worth having in 4.0. I'm borderline on whether having patch 4 in 4.0
adds any value.

 nbd/client.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/nbd/client.c b/nbd/client.c
index 427980bdd22..4de30630c73 100644
--- a/nbd/client.c
+++ b/nbd/client.c
@@ -428,7 +428,7 @@ static int nbd_opt_info_or_go(QIOChannel *ioc, uint32_t opt,
             }
             if (info->min_block &&
                 !QEMU_IS_ALIGNED(info->size, info->min_block)) {
-                error_setg(errp, "export size %" PRIu64 "is not multiple of "
+                error_setg(errp, "export size %" PRIu64 " is not multiple of "
                            "minimum block size %" PRIu32, info->size,
                            info->min_block);
                 nbd_send_opt_abort(ioc);
-- 
2.20.1

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [Qemu-devel] [Qemu-block] [PATCH for-4.0? 8/7] nbd/client: Fix error message for server with unusable sizing
  2019-04-04 14:52 ` [Qemu-devel] [PATCH for-4.0? 8/7] nbd/client: Fix error message for server with unusable sizing Eric Blake
@ 2019-04-04 15:22   ` Kevin Wolf
  0 siblings, 0 replies; 22+ messages in thread
From: Kevin Wolf @ 2019-04-04 15:22 UTC (permalink / raw)
  To: Eric Blake; +Cc: qemu-devel, vsementsov, qemu-block

Am 04.04.2019 um 16:52 hat Eric Blake geschrieben:
> Add a missing space to the error message used when giving up on a
> server that insists on an alignment which renders the last few bytes
> of the export unreadable.
> 
> Fixes: 3add3ab78
> Signed-off-by: Eric Blake <eblake@redhat.com>

Reviewed-by: Kevin Wolf <kwolf@redhat.com>

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Qemu-devel] [PATCH 1/7] nbd/server: Fix blockstatus trace
  2019-04-03  3:05 ` [Qemu-devel] [PATCH 1/7] nbd/server: Fix blockstatus trace Eric Blake
@ 2019-04-05 14:13   ` Vladimir Sementsov-Ogievskiy
  0 siblings, 0 replies; 22+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2019-04-05 14:13 UTC (permalink / raw)
  To: Eric Blake, qemu-devel; +Cc: jsnow, qemu-block

03.04.2019 6:05, Eric Blake wrote:
> Don't increment remaining_bytes until we know that we will actually be
> including the current block status extent in the reply; otherwise, the
> value traced will include a bytes value that is oversized by the
> length of the next block status extent which did not get sent because
> it instead ended the loop.
> 
> Fixes: fb7afc79
> Signed-off-by: Eric Blake <eblake@redhat.com>

Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>


-- 
Best regards,
Vladimir

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Qemu-devel] [PATCH 2/7] nbd/server: Trace server noncompliance on unaligned requests
  2019-04-03  3:05 ` [Qemu-devel] [PATCH 2/7] nbd/server: Trace server noncompliance on unaligned requests Eric Blake
@ 2019-04-05 14:39   ` Vladimir Sementsov-Ogievskiy
  2019-04-05 20:04     ` Eric Blake
  0 siblings, 1 reply; 22+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2019-04-05 14:39 UTC (permalink / raw)
  To: Eric Blake, qemu-devel; +Cc: jsnow, qemu-block

03.04.2019 6:05, Eric Blake wrote:
> We've recently added traces for clients to flag server non-compliance;
> let's do the same for servers to flag client non-compliance. According
> to the spec, if the client requests NBD_INFO_BLOCK_SIZE, it is
> promising to send all requests aligned to those boundaries.  Of
> course, if the client does not request NBD_INFO_BLOCK_SIZE, then it
> made no promises so we shouldn't flag anything; and because we are
> willing to handle clients that made no promises (the spec allows us to
> use NBD_REP_ERR_BLOCK_SIZE_REQD if we had been unwilling), we already
> have to handle unaligned requests (which the block layer already does
> on our behalf).  So even though the spec allows us to return EINVAL
> for clients that promised to behave, it's easier to always answer
> unaligned requests.  Still, flagging non-compliance can be useful in
> debugging a client that is trying to be maximally portable.
> 
> Qemu as client used to have one spot where it sent non-compliant
> requests: if the server sends an unaligned reply to
> NBD_CMD_BLOCK_STATUS, and the client was iterating over the entire
> disk, the next request would start at that unaligned point; this was
> fixed in commit a39286dd when the client was taught to work around
> server non-compliance; but is equally fixed if the server is patched
> to not send unaligned replies in the first place (the next few patches
> will address that). Fortunately, I did not find any more spots where
> qemu as client was non-compliant. I was able to test the patch by
> using the following hack to convince qemu-io to run various unaligned
> commands, coupled with serving 512-byte alignment by intentionally
> omitting '-f raw' on the server while viewing server traces.
> 
> | diff --git i/nbd/client.c w/nbd/client.c
> | index 427980bdd22..1858b2aac35 100644
> | --- i/nbd/client.c
> | +++ w/nbd/client.c
> | @@ -449,6 +449,7 @@ static int nbd_opt_info_or_go(QIOChannel *ioc, uint32_t opt,
> |                  nbd_send_opt_abort(ioc);
> |                  return -1;
> |              }
> | +            info->min_block = 1;//hack
> |              if (!is_power_of_2(info->min_block)) {
> |                  error_setg(errp, "server minimum block size %" PRIu32
> |                             " is not a power of two", info->min_block);
> 
> Signed-off-by: Eric Blake <eblake@redhat.com>
> ---
>   nbd/server.c     | 12 ++++++++++++
>   nbd/trace-events |  1 +
>   2 files changed, 13 insertions(+)
> 
> diff --git a/nbd/server.c b/nbd/server.c
> index 1b8c8619896..3040ceb5606 100644
> --- a/nbd/server.c
> +++ b/nbd/server.c
> @@ -124,6 +124,8 @@ struct NBDClient {
>       int nb_requests;
>       bool closing;
> 
> +    uint32_t check_align; /* If non-zero, check for aligned client requests */
> +
>       bool structured_reply;
>       NBDExportMetaContexts export_meta;
> 
> @@ -660,6 +662,8 @@ static int nbd_negotiate_handle_info(NBDClient *client, uint16_t myflags,
> 
>       if (client->opt == NBD_OPT_GO) {
>           client->exp = exp;
> +        client->check_align = blocksize ?
> +            blk_get_request_alignment(exp->blk) : 0;

I think better set in same place, where sizes[0] set, or use sizes[0] here or add separate local
varibale for it, so that, if "sizes[0] =" changes somehow, we will not forget to fix this place too.

>           QTAILQ_INSERT_TAIL(&client->exp->clients, client, next);
>           nbd_export_get(client->exp);
>           nbd_check_meta_export(client);
> @@ -2126,6 +2130,14 @@ static int nbd_co_receive_request(NBDRequestData *req, NBDRequest *request,
>           return (request->type == NBD_CMD_WRITE ||
>                   request->type == NBD_CMD_WRITE_ZEROES) ? -ENOSPC : -EINVAL;
>       }
> +    if (client->check_align && !QEMU_IS_ALIGNED(request->from | request->len,
> +                                                client->check_align)) {
> +        /*
> +         * The block layer gracefully handles unaligned requests, but
> +         * it's still worth tracing client non-compliance
> +         */
> +        trace_nbd_co_receive_align_compliance(nbd_cmd_lookup(request->type));
> +    }
>       valid_flags = NBD_CMD_FLAG_FUA;
>       if (request->type == NBD_CMD_READ && client->structured_reply) {
>           valid_flags |= NBD_CMD_FLAG_DF;
> diff --git a/nbd/trace-events b/nbd/trace-events
> index a6cca8fdf83..87a2b896c35 100644
> --- a/nbd/trace-events
> +++ b/nbd/trace-events
> @@ -71,4 +71,5 @@ nbd_co_send_extents(uint64_t handle, unsigned int extents, uint32_t id, uint64_t
>   nbd_co_send_structured_error(uint64_t handle, int err, const char *errname, const char *msg) "Send structured error reply: handle = %" PRIu64 ", error = %d (%s), msg = '%s'"
>   nbd_co_receive_request_decode_type(uint64_t handle, uint16_t type, const char *name) "Decoding type: handle = %" PRIu64 ", type = %" PRIu16 " (%s)"
>   nbd_co_receive_request_payload_received(uint64_t handle, uint32_t len) "Payload received: handle = %" PRIu64 ", len = %" PRIu32
> +nbd_co_receive_align_compliance(const char *op) "client sent non-compliant unaligned %s request"

Don't you want print request->from, request->len and client->check_align as well?

>   nbd_trip(void) "Reading request"
> 

Patch seems correct anyway, so if you are in a hurry, it's OK as is:

Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>

-- 
Best regards,
Vladimir

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Qemu-devel] [PATCH 3/7] nbd/server: Don't fail NBD_OPT_INFO for byte-aligned sources
  2019-04-03  3:05 ` [Qemu-devel] [PATCH 3/7] nbd/server: Don't fail NBD_OPT_INFO for byte-aligned sources Eric Blake
@ 2019-04-05 15:34   ` Vladimir Sementsov-Ogievskiy
  0 siblings, 0 replies; 22+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2019-04-05 15:34 UTC (permalink / raw)
  To: Eric Blake, qemu-devel; +Cc: jsnow, qemu-block

03.04.2019 6:05, Eric Blake wrote:
> In commit 0c1d50bd, I added a couple of TODO comments about whether we
> consult bl.request_alignment when responding to NBD_OPT_INFO. At the
> time, qemu as server was hard-coding an advertised alignment of 512 to
> clients that promised to obey constraints, and there was no function
> for getting at a device's preferred alignment. But in hindsight,
> advertising 512 when the block device prefers 1 caused other
> compliance problems, and commit b0245d64 changed one of the two TODO
> comments to advertise a more accurate alignment. Time to fix the other
> TODO.  Doesn't really impact qemu as client (our normal client doesn't
> use NBD_OPT_INFO, and qemu-nbd --list promises to obey block sizes),
> but it might prove useful to other clients.
> 
> Fixes: b0245d64
> Signed-off-by: Eric Blake <eblake@redhat.com>


Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>


-- 
Best regards,
Vladimir

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Qemu-devel] [PATCH 2/7] nbd/server: Trace server noncompliance on unaligned requests
  2019-04-05 14:39   ` Vladimir Sementsov-Ogievskiy
@ 2019-04-05 20:04     ` Eric Blake
  2019-04-08 12:14       ` Vladimir Sementsov-Ogievskiy
  0 siblings, 1 reply; 22+ messages in thread
From: Eric Blake @ 2019-04-05 20:04 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-devel; +Cc: jsnow, qemu-block

[-- Attachment #1: Type: text/plain, Size: 5970 bytes --]

On 4/5/19 9:39 AM, Vladimir Sementsov-Ogievskiy wrote:
> 03.04.2019 6:05, Eric Blake wrote:
>> We've recently added traces for clients to flag server non-compliance;
>> let's do the same for servers to flag client non-compliance. According

Thus, s/Trace server/Trace client/ in the subject line.

>>
>> Qemu as client used to have one spot where it sent non-compliant
>> requests: if the server sends an unaligned reply to
>> NBD_CMD_BLOCK_STATUS, and the client was iterating over the entire
>> disk, the next request would start at that unaligned point; this was
>> fixed in commit a39286dd when the client was taught to work around
>> server non-compliance; but is equally fixed if the server is patched
>> to not send unaligned replies in the first place (the next few patches
>> will address that).

I'll have to reword this now that we know 4.0 will still have some of
those server bugs in place (as the tail of this series is deferred to 4.1).


>> @@ -660,6 +662,8 @@ static int nbd_negotiate_handle_info(NBDClient *client, uint16_t myflags,
>>
>>       if (client->opt == NBD_OPT_GO) {
>>           client->exp = exp;
>> +        client->check_align = blocksize ?
>> +            blk_get_request_alignment(exp->blk) : 0;
> 
> I think better set in same place, where sizes[0] set, or use sizes[0] here or add separate local
> varibale for it, so that, if "sizes[0] =" changes somehow, we will not forget to fix this place too.

I don't want to set client->check_align for NBD_OPT_INFO; but I can
indeed use a temporary variable or hoist the computation so that it is
all in one spot.


>> +++ b/nbd/trace-events
>> @@ -71,4 +71,5 @@ nbd_co_send_extents(uint64_t handle, unsigned int extents, uint32_t id, uint64_t
>>   nbd_co_send_structured_error(uint64_t handle, int err, const char *errname, const char *msg) "Send structured error reply: handle = %" PRIu64 ", error = %d (%s), msg = '%s'"
>>   nbd_co_receive_request_decode_type(uint64_t handle, uint16_t type, const char *name) "Decoding type: handle = %" PRIu64 ", type = %" PRIu16 " (%s)"
>>   nbd_co_receive_request_payload_received(uint64_t handle, uint32_t len) "Payload received: handle = %" PRIu64 ", len = %" PRIu32
>> +nbd_co_receive_align_compliance(const char *op) "client sent non-compliant unaligned %s request"
> 
> Don't you want print request->from, request->len and client->check_align as well?

Wouldn't hurt.

> 
>>   nbd_trip(void) "Reading request"
>>
> 
> Patch seems correct anyway, so if you are in a hurry, it's OK as is:
> 
> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> 

Here's what I'll squash in; I think it is obvious enough to still keep
your R-b, if I send the pull request before you reply back.

diff --git i/nbd/server.c w/nbd/server.c
index 3040ceb5606..cb38dfe6902 100644
--- i/nbd/server.c
+++ w/nbd/server.c
@@ -535,6 +535,7 @@ static int nbd_negotiate_handle_info(NBDClient
*client, uint16_t myflags,
     bool blocksize = false;
     uint32_t sizes[3];
     char buf[sizeof(uint64_t) + sizeof(uint16_t)];
+    uint32_t check_align = 0;

     /* Client sends:
         4 bytes: L, name length (can be 0)
@@ -611,7 +612,7 @@ static int nbd_negotiate_handle_info(NBDClient
*client, uint16_t myflags,
      * whether this is OPT_INFO or OPT_GO. */
     /* minimum - 1 for back-compat, or actual if client will obey it. */
     if (client->opt == NBD_OPT_INFO || blocksize) {
-        sizes[0] = blk_get_request_alignment(exp->blk);
+        check_align = sizes[0] = blk_get_request_alignment(exp->blk);
     } else {
         sizes[0] = 1;
     }
@@ -662,8 +663,7 @@ static int nbd_negotiate_handle_info(NBDClient
*client, uint16_t myflags,

     if (client->opt == NBD_OPT_GO) {
         client->exp = exp;
-        client->check_align = blocksize ?
-            blk_get_request_alignment(exp->blk) : 0;
+        client->check_align = check_align;
         QTAILQ_INSERT_TAIL(&client->exp->clients, client, next);
         nbd_export_get(client->exp);
         nbd_check_meta_export(client);
@@ -2136,7 +2136,10 @@ static int nbd_co_receive_request(NBDRequestData
*req, NBDRequest *request,
          * The block layer gracefully handles unaligned requests, but
          * it's still worth tracing client non-compliance
          */
-
trace_nbd_co_receive_align_compliance(nbd_cmd_lookup(request->type));
+        trace_nbd_co_receive_align_compliance(nbd_cmd_lookup(request->type,
+                                                             request->from,
+                                                             request->len,
+
client->check_align));
     }
     valid_flags = NBD_CMD_FLAG_FUA;
     if (request->type == NBD_CMD_READ && client->structured_reply) {
diff --git i/nbd/trace-events w/nbd/trace-events
index 87a2b896c35..ec2d46c9247 100644
--- i/nbd/trace-events
+++ w/nbd/trace-events
@@ -71,5 +71,5 @@ nbd_co_send_extents(uint64_t handle, unsigned int
extents, uint32_t id, uint64_t
 nbd_co_send_structured_error(uint64_t handle, int err, const char
*errname, const char *msg) "Send structured error reply: handle = %"
PRIu64 ", error = %d (%s), msg = '%s'"
 nbd_co_receive_request_decode_type(uint64_t handle, uint16_t type,
const char *name) "Decoding type: handle = %" PRIu64 ", type = %" PRIu16
" (%s)"
 nbd_co_receive_request_payload_received(uint64_t handle, uint32_t len)
"Payload received: handle = %" PRIu64 ", len = %" PRIu32
-nbd_co_receive_align_compliance(const char *op) "client sent
non-compliant unaligned %s request"
+nbd_co_receive_align_compliance(const char *op, uint64_t from, uint32_t
len, uint32_t align) "client sent non-compliant unaligned %s request:
from=0x%" PRIx64 ", len=0x%x, align=0x%x"
 nbd_trip(void) "Reading request"



-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [Qemu-devel] [PATCH 2/7] nbd/server: Trace server noncompliance on unaligned requests
  2019-04-05 20:04     ` Eric Blake
@ 2019-04-08 12:14       ` Vladimir Sementsov-Ogievskiy
  2019-04-08 14:32         ` Eric Blake
  0 siblings, 1 reply; 22+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2019-04-08 12:14 UTC (permalink / raw)
  To: Eric Blake, qemu-devel; +Cc: jsnow, qemu-block

05.04.2019 23:04, Eric Blake wrote:
> On 4/5/19 9:39 AM, Vladimir Sementsov-Ogievskiy wrote:
>> 03.04.2019 6:05, Eric Blake wrote:
>>> We've recently added traces for clients to flag server non-compliance;
>>> let's do the same for servers to flag client non-compliance. According
> 
> Thus, s/Trace server/Trace client/ in the subject line.
> 
>>>
>>> Qemu as client used to have one spot where it sent non-compliant
>>> requests: if the server sends an unaligned reply to
>>> NBD_CMD_BLOCK_STATUS, and the client was iterating over the entire
>>> disk, the next request would start at that unaligned point; this was
>>> fixed in commit a39286dd when the client was taught to work around
>>> server non-compliance; but is equally fixed if the server is patched
>>> to not send unaligned replies in the first place (the next few patches
>>> will address that).
> 
> I'll have to reword this now that we know 4.0 will still have some of
> those server bugs in place (as the tail of this series is deferred to 4.1).
> 
> 
>>> @@ -660,6 +662,8 @@ static int nbd_negotiate_handle_info(NBDClient *client, uint16_t myflags,
>>>
>>>        if (client->opt == NBD_OPT_GO) {
>>>            client->exp = exp;
>>> +        client->check_align = blocksize ?
>>> +            blk_get_request_alignment(exp->blk) : 0;
>>
>> I think better set in same place, where sizes[0] set, or use sizes[0] here or add separate local
>> varibale for it, so that, if "sizes[0] =" changes somehow, we will not forget to fix this place too.
> 
> I don't want to set client->check_align for NBD_OPT_INFO; but I can
> indeed use a temporary variable or hoist the computation so that it is
> all in one spot.
> 
> 
>>> +++ b/nbd/trace-events
>>> @@ -71,4 +71,5 @@ nbd_co_send_extents(uint64_t handle, unsigned int extents, uint32_t id, uint64_t
>>>    nbd_co_send_structured_error(uint64_t handle, int err, const char *errname, const char *msg) "Send structured error reply: handle = %" PRIu64 ", error = %d (%s), msg = '%s'"
>>>    nbd_co_receive_request_decode_type(uint64_t handle, uint16_t type, const char *name) "Decoding type: handle = %" PRIu64 ", type = %" PRIu16 " (%s)"
>>>    nbd_co_receive_request_payload_received(uint64_t handle, uint32_t len) "Payload received: handle = %" PRIu64 ", len = %" PRIu32
>>> +nbd_co_receive_align_compliance(const char *op) "client sent non-compliant unaligned %s request"
>>
>> Don't you want print request->from, request->len and client->check_align as well?
> 
> Wouldn't hurt.
> 
>>
>>>    nbd_trip(void) "Reading request"
>>>
>>
>> Patch seems correct anyway, so if you are in a hurry, it's OK as is:
>>
>> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>>
> 
> Here's what I'll squash in; I think it is obvious enough to still keep
> your R-b, if I send the pull request before you reply back.
> 
> diff --git i/nbd/server.c w/nbd/server.c
> index 3040ceb5606..cb38dfe6902 100644
> --- i/nbd/server.c
> +++ w/nbd/server.c
> @@ -535,6 +535,7 @@ static int nbd_negotiate_handle_info(NBDClient
> *client, uint16_t myflags,
>       bool blocksize = false;
>       uint32_t sizes[3];
>       char buf[sizeof(uint64_t) + sizeof(uint16_t)];
> +    uint32_t check_align = 0;
> 
>       /* Client sends:
>           4 bytes: L, name length (can be 0)
> @@ -611,7 +612,7 @@ static int nbd_negotiate_handle_info(NBDClient
> *client, uint16_t myflags,
>        * whether this is OPT_INFO or OPT_GO. */
>       /* minimum - 1 for back-compat, or actual if client will obey it. */
>       if (client->opt == NBD_OPT_INFO || blocksize) {
> -        sizes[0] = blk_get_request_alignment(exp->blk);
> +        check_align = sizes[0] = blk_get_request_alignment(exp->blk);
>       } else {
>           sizes[0] = 1;
>       }
> @@ -662,8 +663,7 @@ static int nbd_negotiate_handle_info(NBDClient
> *client, uint16_t myflags,
> 
>       if (client->opt == NBD_OPT_GO) {
>           client->exp = exp;
> -        client->check_align = blocksize ?
> -            blk_get_request_alignment(exp->blk) : 0;
> +        client->check_align = check_align;
>           QTAILQ_INSERT_TAIL(&client->exp->clients, client, next);
>           nbd_export_get(client->exp);
>           nbd_check_meta_export(client);
> @@ -2136,7 +2136,10 @@ static int nbd_co_receive_request(NBDRequestData
> *req, NBDRequest *request,
>            * The block layer gracefully handles unaligned requests, but
>            * it's still worth tracing client non-compliance
>            */
> -
> trace_nbd_co_receive_align_compliance(nbd_cmd_lookup(request->type));
> +        trace_nbd_co_receive_align_compliance(nbd_cmd_lookup(request->type,
> +                                                             request->from,
> +                                                             request->len,
> +
> client->check_align));

something strange here with brackets.

>       }
>       valid_flags = NBD_CMD_FLAG_FUA;
>       if (request->type == NBD_CMD_READ && client->structured_reply) {
> diff --git i/nbd/trace-events w/nbd/trace-events
> index 87a2b896c35..ec2d46c9247 100644
> --- i/nbd/trace-events
> +++ w/nbd/trace-events
> @@ -71,5 +71,5 @@ nbd_co_send_extents(uint64_t handle, unsigned int
> extents, uint32_t id, uint64_t
>   nbd_co_send_structured_error(uint64_t handle, int err, const char
> *errname, const char *msg) "Send structured error reply: handle = %"
> PRIu64 ", error = %d (%s), msg = '%s'"
>   nbd_co_receive_request_decode_type(uint64_t handle, uint16_t type,
> const char *name) "Decoding type: handle = %" PRIu64 ", type = %" PRIu16
> " (%s)"
>   nbd_co_receive_request_payload_received(uint64_t handle, uint32_t len)
> "Payload received: handle = %" PRIu64 ", len = %" PRIu32
> -nbd_co_receive_align_compliance(const char *op) "client sent
> non-compliant unaligned %s request"
> +nbd_co_receive_align_compliance(const char *op, uint64_t from, uint32_t
> len, uint32_t align) "client sent non-compliant unaligned %s request:
> from=0x%" PRIx64 ", len=0x%x, align=0x%x"

%x or % PRIx32 - doesn't matter?

>   nbd_trip(void) "Reading request"
> 
> 
> 

OK for me, thanks.


-- 
Best regards,
Vladimir

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Qemu-devel] [PATCH 4/7] iotests: Update 241 to expose backing layer fragmentation
@ 2019-04-08 13:51     ` Vladimir Sementsov-Ogievskiy
  0 siblings, 0 replies; 22+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2019-04-08 13:51 UTC (permalink / raw)
  To: Eric Blake, qemu-devel; +Cc: jsnow, qemu-block, Kevin Wolf, Max Reitz

03.04.2019 6:05, Eric Blake wrote:
> Previous commits have mentioned that our NBD server still sends
> unaligned fragments when an active layer with large advertised minimum
> block size is backed by another layer with a smaller block
> size. Expand the test to actually cover this scenario, by using qcow2
> encryption (which forces 512-byte alignment) with an unaligned raw
> backing file.
> 
> The test passes, but only because the client side works around the
> server's non-compliance; if you repeat the test manually with tracing
> turned on, you will see the server sending a status for 1000 bytes
> data then 1048 bytes hole, which is not aligned. But reverting commit
> 737d3f5244 shows that it is indeed the client working around the bug
> in the server.
> 
> Signed-off-by: Eric Blake <eblake@redhat.com>

Oops, 241 fails for me:

-WARNING: Image format was not specified for '/home/eblake/qemu/tests/qemu-iotests/scratch/t.raw' and probing guessed raw.
+WARNING: Image format was not specified for '/work/src/qemu/eric/tests/qemu-iotests/scratch/t.raw' and probing guessed raw.

We forget to filter output :(

Tested-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>

> ---
>   tests/qemu-iotests/241     | 20 +++++++++++++++++++-
>   tests/qemu-iotests/241.out |  9 +++++++++
>   2 files changed, 28 insertions(+), 1 deletion(-)
> 
> diff --git a/tests/qemu-iotests/241 b/tests/qemu-iotests/241
> index 4b196857387..c1fa6980dc8 100755
> --- a/tests/qemu-iotests/241
> +++ b/tests/qemu-iotests/241
> @@ -28,6 +28,7 @@ nbd_unix_socket=$TEST_DIR/test_qemu_nbd_socket
>   _cleanup()
>   {
>       _cleanup_test_img
> +    rm -f "$TEST_IMAGE_FILE.qcow2"

you mean _IMG_, not _IMAGE_.

Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>

>       nbd_server_stop
>   }
>   trap "_cleanup; exit \$status" 0 1 2 3 15
> @@ -37,7 +38,7 @@ trap "_cleanup; exit \$status" 0 1 2 3 15
>   . ./common.filter
>   . ./common.nbd
> 
> -_supported_fmt raw
> +_supported_fmt raw # although the test also requires use of qcow2
>   _supported_proto nbd
>   _supported_os Linux
>   _require_command QEMU_NBD
> @@ -88,6 +89,23 @@ $QEMU_IMG map --output=json "$TEST_IMG" | _filter_qemu_img_map
>   $QEMU_IO -c map "$TEST_IMG"
>   nbd_server_stop
> 
> +echo
> +echo "=== Encrypted qcow2 file backed by unaligned raw image  ==="
> +echo
> +
> +# Enabling encryption in qcow2 forces 512-alignment
> +SECRET=secret,id=sec0,data=12345
> +$QEMU_IMG create -f qcow2 -b "$TEST_IMG_FILE" -F raw --object "$SECRET" \
> +  -o encrypt.format=luks,encrypt.key-secret=sec0,encrypt.iter-time=10 \
> +  "$TEST_IMG_FILE.qcow2" 2k | _filter_img_create
> +nbd_server_start_unix_socket --object "$SECRET" --image-opts \
> +  driver=qcow2,file.filename="$TEST_IMG_FILE.qcow2",encrypt.key-secret=sec0
> +
> +$QEMU_NBD_PROG --list -k $nbd_unix_socket | grep '\(size\|min\)'
> +$QEMU_IMG map --output=json "$TEST_IMG" | _filter_qemu_img_map
> +$QEMU_IO -c map "$TEST_IMG"
> +nbd_server_stop
> +
>   # Not tested yet: we also want to ensure that qemu as NBD client does
>   # not access beyond the end of a server's advertised unaligned size:
>   #  nbdkit -U - memory size=513 --run 'qemu-io -f raw -c "r 512 512" $nbd'
> diff --git a/tests/qemu-iotests/241.out b/tests/qemu-iotests/241.out
> index f481074a02e..ef7de1205d2 100644
> --- a/tests/qemu-iotests/241.out
> +++ b/tests/qemu-iotests/241.out
> @@ -25,4 +25,13 @@ WARNING: Image format was not specified for '/home/eblake/qemu/tests/qemu-iotest
>   [{ "start": 0, "length": 1000, "depth": 0, "zero": false, "data": true, "offset": OFFSET},
>   { "start": 1000, "length": 24, "depth": 0, "zero": true, "data": true, "offset": OFFSET}]
>   1 KiB (0x400) bytes     allocated at offset 0 bytes (0x0)
> +
> +=== Encrypted qcow2 file backed by unaligned raw image  ===
> +
> +Formatting 'TEST_DIR/t.IMGFMT.qcow2', fmt=qcow2 size=2048 backing_file=TEST_DIR/t.IMGFMT backing_fmt=IMGFMT encrypt.format=luks encrypt.key-secret=sec0 encrypt.iter-time=10
> +  size:  2048
> +  min block: 512
> +[{ "start": 0, "length": 1024, "depth": 0, "zero": false, "data": true, "offset": OFFSET},
> +{ "start": 1024, "length": 1024, "depth": 0, "zero": true, "data": false, "offset": OFFSET}]
> +2 KiB (0x800) bytes     allocated at offset 0 bytes (0x0)
>   *** done
> 


-- 
Best regards,
Vladimir

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Qemu-devel] [PATCH 4/7] iotests: Update 241 to expose backing layer fragmentation
@ 2019-04-08 13:51     ` Vladimir Sementsov-Ogievskiy
  0 siblings, 0 replies; 22+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2019-04-08 13:51 UTC (permalink / raw)
  To: Eric Blake, qemu-devel; +Cc: Kevin Wolf, jsnow, qemu-block, Max Reitz

03.04.2019 6:05, Eric Blake wrote:
> Previous commits have mentioned that our NBD server still sends
> unaligned fragments when an active layer with large advertised minimum
> block size is backed by another layer with a smaller block
> size. Expand the test to actually cover this scenario, by using qcow2
> encryption (which forces 512-byte alignment) with an unaligned raw
> backing file.
> 
> The test passes, but only because the client side works around the
> server's non-compliance; if you repeat the test manually with tracing
> turned on, you will see the server sending a status for 1000 bytes
> data then 1048 bytes hole, which is not aligned. But reverting commit
> 737d3f5244 shows that it is indeed the client working around the bug
> in the server.
> 
> Signed-off-by: Eric Blake <eblake@redhat.com>

Oops, 241 fails for me:

-WARNING: Image format was not specified for '/home/eblake/qemu/tests/qemu-iotests/scratch/t.raw' and probing guessed raw.
+WARNING: Image format was not specified for '/work/src/qemu/eric/tests/qemu-iotests/scratch/t.raw' and probing guessed raw.

We forget to filter output :(

Tested-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>

> ---
>   tests/qemu-iotests/241     | 20 +++++++++++++++++++-
>   tests/qemu-iotests/241.out |  9 +++++++++
>   2 files changed, 28 insertions(+), 1 deletion(-)
> 
> diff --git a/tests/qemu-iotests/241 b/tests/qemu-iotests/241
> index 4b196857387..c1fa6980dc8 100755
> --- a/tests/qemu-iotests/241
> +++ b/tests/qemu-iotests/241
> @@ -28,6 +28,7 @@ nbd_unix_socket=$TEST_DIR/test_qemu_nbd_socket
>   _cleanup()
>   {
>       _cleanup_test_img
> +    rm -f "$TEST_IMAGE_FILE.qcow2"

you mean _IMG_, not _IMAGE_.

Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>

>       nbd_server_stop
>   }
>   trap "_cleanup; exit \$status" 0 1 2 3 15
> @@ -37,7 +38,7 @@ trap "_cleanup; exit \$status" 0 1 2 3 15
>   . ./common.filter
>   . ./common.nbd
> 
> -_supported_fmt raw
> +_supported_fmt raw # although the test also requires use of qcow2
>   _supported_proto nbd
>   _supported_os Linux
>   _require_command QEMU_NBD
> @@ -88,6 +89,23 @@ $QEMU_IMG map --output=json "$TEST_IMG" | _filter_qemu_img_map
>   $QEMU_IO -c map "$TEST_IMG"
>   nbd_server_stop
> 
> +echo
> +echo "=== Encrypted qcow2 file backed by unaligned raw image  ==="
> +echo
> +
> +# Enabling encryption in qcow2 forces 512-alignment
> +SECRET=secret,id=sec0,data=12345
> +$QEMU_IMG create -f qcow2 -b "$TEST_IMG_FILE" -F raw --object "$SECRET" \
> +  -o encrypt.format=luks,encrypt.key-secret=sec0,encrypt.iter-time=10 \
> +  "$TEST_IMG_FILE.qcow2" 2k | _filter_img_create
> +nbd_server_start_unix_socket --object "$SECRET" --image-opts \
> +  driver=qcow2,file.filename="$TEST_IMG_FILE.qcow2",encrypt.key-secret=sec0
> +
> +$QEMU_NBD_PROG --list -k $nbd_unix_socket | grep '\(size\|min\)'
> +$QEMU_IMG map --output=json "$TEST_IMG" | _filter_qemu_img_map
> +$QEMU_IO -c map "$TEST_IMG"
> +nbd_server_stop
> +
>   # Not tested yet: we also want to ensure that qemu as NBD client does
>   # not access beyond the end of a server's advertised unaligned size:
>   #  nbdkit -U - memory size=513 --run 'qemu-io -f raw -c "r 512 512" $nbd'
> diff --git a/tests/qemu-iotests/241.out b/tests/qemu-iotests/241.out
> index f481074a02e..ef7de1205d2 100644
> --- a/tests/qemu-iotests/241.out
> +++ b/tests/qemu-iotests/241.out
> @@ -25,4 +25,13 @@ WARNING: Image format was not specified for '/home/eblake/qemu/tests/qemu-iotest
>   [{ "start": 0, "length": 1000, "depth": 0, "zero": false, "data": true, "offset": OFFSET},
>   { "start": 1000, "length": 24, "depth": 0, "zero": true, "data": true, "offset": OFFSET}]
>   1 KiB (0x400) bytes     allocated at offset 0 bytes (0x0)
> +
> +=== Encrypted qcow2 file backed by unaligned raw image  ===
> +
> +Formatting 'TEST_DIR/t.IMGFMT.qcow2', fmt=qcow2 size=2048 backing_file=TEST_DIR/t.IMGFMT backing_fmt=IMGFMT encrypt.format=luks encrypt.key-secret=sec0 encrypt.iter-time=10
> +  size:  2048
> +  min block: 512
> +[{ "start": 0, "length": 1024, "depth": 0, "zero": false, "data": true, "offset": OFFSET},
> +{ "start": 1024, "length": 1024, "depth": 0, "zero": true, "data": false, "offset": OFFSET}]
> +2 KiB (0x800) bytes     allocated at offset 0 bytes (0x0)
>   *** done
> 


-- 
Best regards,
Vladimir

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Qemu-devel] [PATCH 2/7] nbd/server: Trace server noncompliance on unaligned requests
  2019-04-08 12:14       ` Vladimir Sementsov-Ogievskiy
@ 2019-04-08 14:32         ` Eric Blake
  0 siblings, 0 replies; 22+ messages in thread
From: Eric Blake @ 2019-04-08 14:32 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-devel; +Cc: jsnow, qemu-block

[-- Attachment #1: Type: text/plain, Size: 2772 bytes --]

On 4/8/19 7:14 AM, Vladimir Sementsov-Ogievskiy wrote:
> 05.04.2019 23:04, Eric Blake wrote:
>> On 4/5/19 9:39 AM, Vladimir Sementsov-Ogievskiy wrote:
>>> 03.04.2019 6:05, Eric Blake wrote:
>>>> We've recently added traces for clients to flag server non-compliance;
>>>> let's do the same for servers to flag client non-compliance. According
>>
>> Thus, s/Trace server/Trace client/ in the subject line.
>>

>>> Patch seems correct anyway, so if you are in a hurry, it's OK as is:
>>>
>>> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>>>
>>
>> Here's what I'll squash in; I think it is obvious enough to still keep
>> your R-b, if I send the pull request before you reply back.
>>

>> @@ -2136,7 +2136,10 @@ static int nbd_co_receive_request(NBDRequestData
>> *req, NBDRequest *request,
>>            * The block layer gracefully handles unaligned requests, but
>>            * it's still worth tracing client non-compliance
>>            */
>> -
>> trace_nbd_co_receive_align_compliance(nbd_cmd_lookup(request->type));
>> +        trace_nbd_co_receive_align_compliance(nbd_cmd_lookup(request->type,
>> +                                                             request->from,
>> +                                                             request->len,
>> +
>> client->check_align));
> 
> something strange here with brackets.

Yeah, I grabbed the diff before compile-testing, but fixed it before
pushing to my branch.


>> +++ w/nbd/trace-events
>> @@ -71,5 +71,5 @@ nbd_co_send_extents(uint64_t handle, unsigned int
>> extents, uint32_t id, uint64_t
>>   nbd_co_send_structured_error(uint64_t handle, int err, const char
>> *errname, const char *msg) "Send structured error reply: handle = %"
>> PRIu64 ", error = %d (%s), msg = '%s'"
>>   nbd_co_receive_request_decode_type(uint64_t handle, uint16_t type,
>> const char *name) "Decoding type: handle = %" PRIu64 ", type = %" PRIu16
>> " (%s)"
>>   nbd_co_receive_request_payload_received(uint64_t handle, uint32_t len)
>> "Payload received: handle = %" PRIu64 ", len = %" PRIu32
>> -nbd_co_receive_align_compliance(const char *op) "client sent
>> non-compliant unaligned %s request"
>> +nbd_co_receive_align_compliance(const char *op, uint64_t from, uint32_t
>> len, uint32_t align) "client sent non-compliant unaligned %s request:
>> from=0x%" PRIx64 ", len=0x%x, align=0x%x"
> 
> %x or % PRIx32 - doesn't matter?

Will use PRIx32 for consistency.

> 
>>   nbd_trip(void) "Reading request"
>>
>>
>>
> 
> OK for me, thanks.

Thanks for a second look. Pull request will be out later today.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Qemu-devel] [PATCH 4/7] iotests: Update 241 to expose backing layer fragmentation
@ 2019-04-10 20:44       ` Eric Blake
  0 siblings, 0 replies; 22+ messages in thread
From: Eric Blake @ 2019-04-10 20:44 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-devel
  Cc: jsnow, qemu-block, Kevin Wolf, Max Reitz

[-- Attachment #1: Type: text/plain, Size: 1769 bytes --]

On 4/8/19 8:51 AM, Vladimir Sementsov-Ogievskiy wrote:
> 03.04.2019 6:05, Eric Blake wrote:
>> Previous commits have mentioned that our NBD server still sends
>> unaligned fragments when an active layer with large advertised minimum
>> block size is backed by another layer with a smaller block
>> size. Expand the test to actually cover this scenario, by using qcow2
>> encryption (which forces 512-byte alignment) with an unaligned raw
>> backing file.
>>
>> The test passes, but only because the client side works around the
>> server's non-compliance; if you repeat the test manually with tracing
>> turned on, you will see the server sending a status for 1000 bytes
>> data then 1048 bytes hole, which is not aligned. But reverting commit
>> 737d3f5244 shows that it is indeed the client working around the bug
>> in the server.
>>
>> Signed-off-by: Eric Blake <eblake@redhat.com>
> 
> Oops, 241 fails for me:
> 
> -WARNING: Image format was not specified for '/home/eblake/qemu/tests/qemu-iotests/scratch/t.raw' and probing guessed raw.
> +WARNING: Image format was not specified for '/work/src/qemu/eric/tests/qemu-iotests/scratch/t.raw' and probing guessed raw.
> 
> We forget to filter output :(

The filter bug is pre-existing; separate patch for that is now posted.

> 
> Tested-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>

I saw this email before -rc3, but thought I had only introduced the
problem in this patch which I omitted from my rc3 pull request. Oh well,
the actual break in iotests came earlier; if there's an -rc4 for other
reasons, we'll get the test fixed then.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Qemu-devel] [PATCH 4/7] iotests: Update 241 to expose backing layer fragmentation
@ 2019-04-10 20:44       ` Eric Blake
  0 siblings, 0 replies; 22+ messages in thread
From: Eric Blake @ 2019-04-10 20:44 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-devel
  Cc: Kevin Wolf, jsnow, qemu-block, Max Reitz

[-- Attachment #1: Type: text/plain, Size: 1769 bytes --]

On 4/8/19 8:51 AM, Vladimir Sementsov-Ogievskiy wrote:
> 03.04.2019 6:05, Eric Blake wrote:
>> Previous commits have mentioned that our NBD server still sends
>> unaligned fragments when an active layer with large advertised minimum
>> block size is backed by another layer with a smaller block
>> size. Expand the test to actually cover this scenario, by using qcow2
>> encryption (which forces 512-byte alignment) with an unaligned raw
>> backing file.
>>
>> The test passes, but only because the client side works around the
>> server's non-compliance; if you repeat the test manually with tracing
>> turned on, you will see the server sending a status for 1000 bytes
>> data then 1048 bytes hole, which is not aligned. But reverting commit
>> 737d3f5244 shows that it is indeed the client working around the bug
>> in the server.
>>
>> Signed-off-by: Eric Blake <eblake@redhat.com>
> 
> Oops, 241 fails for me:
> 
> -WARNING: Image format was not specified for '/home/eblake/qemu/tests/qemu-iotests/scratch/t.raw' and probing guessed raw.
> +WARNING: Image format was not specified for '/work/src/qemu/eric/tests/qemu-iotests/scratch/t.raw' and probing guessed raw.
> 
> We forget to filter output :(

The filter bug is pre-existing; separate patch for that is now posted.

> 
> Tested-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>

I saw this email before -rc3, but thought I had only introduced the
problem in this patch which I omitted from my rc3 pull request. Oh well,
the actual break in iotests came earlier; if there's an -rc4 for other
reasons, we'll get the test fixed then.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2019-04-10 20:46 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-04-03  3:05 [Qemu-devel] [PATCH for-4.0? 0/7] Final round of NBD alignment fixes Eric Blake
2019-04-03  3:05 ` [Qemu-devel] [PATCH 1/7] nbd/server: Fix blockstatus trace Eric Blake
2019-04-05 14:13   ` Vladimir Sementsov-Ogievskiy
2019-04-03  3:05 ` [Qemu-devel] [PATCH 2/7] nbd/server: Trace server noncompliance on unaligned requests Eric Blake
2019-04-05 14:39   ` Vladimir Sementsov-Ogievskiy
2019-04-05 20:04     ` Eric Blake
2019-04-08 12:14       ` Vladimir Sementsov-Ogievskiy
2019-04-08 14:32         ` Eric Blake
2019-04-03  3:05 ` [Qemu-devel] [PATCH 3/7] nbd/server: Don't fail NBD_OPT_INFO for byte-aligned sources Eric Blake
2019-04-05 15:34   ` Vladimir Sementsov-Ogievskiy
2019-04-03  3:05 ` [Qemu-devel] [PATCH 4/7] iotests: Update 241 to expose backing layer fragmentation Eric Blake
2019-04-08 13:51   ` Vladimir Sementsov-Ogievskiy
2019-04-08 13:51     ` Vladimir Sementsov-Ogievskiy
2019-04-10 20:44     ` Eric Blake
2019-04-10 20:44       ` Eric Blake
2019-04-03  3:05 ` [Qemu-devel] [PATCH 5/7] block: Fix BDRV_BLOCK_RAW status to honor alignment Eric Blake
2019-04-03 13:03   ` Kevin Wolf
2019-04-03 14:02     ` Eric Blake
2019-04-03  3:05 ` [Qemu-devel] [PATCH 6/7] nbd/server: Avoid unaligned read/block_status from backing Eric Blake
2019-04-03  3:05 ` [Qemu-devel] [PATCH for-4.1 7/7] nbd/server: Avoid unaligned dirty-bitmap status Eric Blake
2019-04-04 14:52 ` [Qemu-devel] [PATCH for-4.0? 8/7] nbd/client: Fix error message for server with unusable sizing Eric Blake
2019-04-04 15:22   ` [Qemu-devel] [Qemu-block] " Kevin Wolf

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.