All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/3] Exposing backing-chain allocation over NBD
@ 2020-09-25 20:32 Eric Blake
  2020-09-25 20:32 ` [PATCH 1/3] nbd: Simplify meta-context parsing Eric Blake
                   ` (2 more replies)
  0 siblings, 3 replies; 14+ messages in thread
From: Eric Blake @ 2020-09-25 20:32 UTC (permalink / raw)
  To: qemu-devel; +Cc: pkrempa, vsementsov, rjones, qemu-block

I'm working on preparing my KVM Forum presentation on NBD, and ran
into a situation where I really wanted to do the equivalent of
'qemu-img map' over NBD for determining which portions of an overlay
image are changed from the backing layer.  So after less than 24 hours
hacking, I'm pretty pleased with the results.

Known caveats:

- Probably has lots of conflicts with Kevin's pending work on
  refactoring NBD for nicer use in qemu-storage-daemon

- Not yet tested with Vladimir's patches to fix bdrv_block_status bugs
  when the backing file is short (and therefore, applying this series
  without that is likely to make it possible to expose the same bugs
  of wrong information)

- I _still_ want to get QMP 'block-dirty-bitmap-populate' in qemu 5.2;
  with that in place, you could avoid the need for this series by
  instead populating a temporary bitmap and exposing that bitmap over
  NBD instead.  But that requires more work, both in coding (Peter
  Krempa and I still need to make sure we have the ideal QMP
  interface) and in usage (managing temporary bitmaps is more effort
  than a new bool toggle).

- And if we _did_ use block-dirty-bitmap-populate, I find myself
  wanting to be able to expose more than one bitmap at a time over
  NBD, which in turn means revisiting our current QAPI for
  nbd-server-add of '*bitmap':'str' to instead allow an alternate type
  that permits either "'str'" or "['str']", except that our QAPI
  generator does not yet support arrays in alternates, and is
  undergoing changes from John Snow for python cleanups...

- I am aware of long-standing qemu bugs where when we advertise a
  large minimum block size (say 4k) but the backing file has smaller
  granularity (such as 512), then we can violate NBD protocol by
  sending a reply to NBD_CMD_BLOCK_STATUS with misaligned data.  This
  adds yet another instance of being able to tickle that (rare) bug.
  I really need to revisit my patches to add
  bdrv_block_status_aligned...

Also available at:
https://repo.or.cz/qemu/ericb.git/shortlog/refs/tags/nbd-alloc-depth-v1

Eric Blake (3):
  nbd: Simplify meta-context parsing
  nbd: Add new qemu:allocation-depth metacontext
  nbd: Add 'qemu-nbd -A' to expose allocation depth

 docs/interop/nbd.txt       |  22 ++-
 docs/tools/qemu-nbd.rst    |   6 +
 qapi/block-core.json       |  13 +-
 include/block/nbd.h        |  15 +-
 blockdev-nbd.c             |   3 +-
 nbd/server.c               | 294 ++++++++++++++++++++-----------------
 qemu-nbd.c                 |  16 +-
 tests/qemu-iotests/309     |  73 +++++++++
 tests/qemu-iotests/309.out |  22 +++
 tests/qemu-iotests/group   |   1 +
 10 files changed, 310 insertions(+), 155 deletions(-)
 create mode 100755 tests/qemu-iotests/309
 create mode 100644 tests/qemu-iotests/309.out

-- 
2.28.0



^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH 1/3] nbd: Simplify meta-context parsing
  2020-09-25 20:32 [PATCH 0/3] Exposing backing-chain allocation over NBD Eric Blake
@ 2020-09-25 20:32 ` Eric Blake
  2020-09-26 12:58   ` Vladimir Sementsov-Ogievskiy
  2020-09-25 20:32 ` [PATCH 2/3] nbd: Add new qemu:allocation-depth metacontext Eric Blake
  2020-09-25 20:32 ` [PATCH 3/3] nbd: Add 'qemu-nbd -A' to expose allocation depth Eric Blake
  2 siblings, 1 reply; 14+ messages in thread
From: Eric Blake @ 2020-09-25 20:32 UTC (permalink / raw)
  To: qemu-devel; +Cc: pkrempa, vsementsov, rjones, qemu-block

We had a premature optimization of trying to read as little from the
wire as possible while handling NBD_OPT_SET_META_CONTEXT in phases.
But in reality, we HAVE to read the entire string from the client
before we can get to the next command, and it is easier to just read
it all at once than it is to read it in pieces.  And once we do that,
several functions end up no longer performing I/O, and no longer need
to return a value.

While simplifying this, take advantage of g_str_has_prefix for less
repetition of boilerplate string length computation.

Our iotests still pass; I also checked that libnbd's testsuite (which
covers more corner cases of odd meta context requests) still passes.

Signed-off-by: Eric Blake <eblake@redhat.com>
---
 nbd/server.c | 172 ++++++++++++++-------------------------------------
 1 file changed, 47 insertions(+), 125 deletions(-)

diff --git a/nbd/server.c b/nbd/server.c
index 982de67816a7..0d2d7e52058f 100644
--- a/nbd/server.c
+++ b/nbd/server.c
@@ -1,5 +1,5 @@
 /*
- *  Copyright (C) 2016-2018 Red Hat, Inc.
+ *  Copyright (C) 2016-2020 Red Hat, Inc.
  *  Copyright (C) 2005  Anthony Liguori <anthony@codemonkey.ws>
  *
  *  Network Block Device Server Side
@@ -792,135 +792,64 @@ static int nbd_negotiate_send_meta_context(NBDClient *client,
     return qio_channel_writev_all(client->ioc, iov, 2, errp) < 0 ? -EIO : 0;
 }

-/* Read strlen(@pattern) bytes, and set @match to true if they match @pattern.
- * @match is never set to false.
- *
- * Return -errno on I/O error, 0 if option was completely handled by
- * sending a reply about inconsistent lengths, or 1 on success.
- *
- * Note: return code = 1 doesn't mean that we've read exactly @pattern.
- * It only means that there are no errors.
+
+/*
+ * Check @ns with @len bytes, and set @match to true if it matches @pattern,
+ * or if @len is 0 and the client is performing _LIST_. @match is never set
+ * to false.
  */
-static int nbd_meta_pattern(NBDClient *client, const char *pattern, bool *match,
-                            Error **errp)
+static void nbd_meta_empty_or_pattern(NBDClient *client, const char *pattern,
+                                      const char *ns, uint32_t len,
+                                      bool *match, Error **errp)
 {
-    int ret;
-    char *query;
-    size_t len = strlen(pattern);
-
-    assert(len);
-
-    query = g_malloc(len);
-    ret = nbd_opt_read(client, query, len, errp);
-    if (ret <= 0) {
-        g_free(query);
-        return ret;
-    }
-
-    if (strncmp(query, pattern, len) == 0) {
+    if (len == 0) {
+        if (client->opt == NBD_OPT_LIST_META_CONTEXT) {
+            *match = true;
+        }
+        trace_nbd_negotiate_meta_query_parse("empty");
+    } else if (strcmp(pattern, ns) == 0) {
         trace_nbd_negotiate_meta_query_parse(pattern);
         *match = true;
     } else {
         trace_nbd_negotiate_meta_query_skip("pattern not matched");
     }
-    g_free(query);
-
-    return 1;
-}
-
-/*
- * Read @len bytes, and set @match to true if they match @pattern, or if @len
- * is 0 and the client is performing _LIST_. @match is never set to false.
- *
- * Return -errno on I/O error, 0 if option was completely handled by
- * sending a reply about inconsistent lengths, or 1 on success.
- *
- * Note: return code = 1 doesn't mean that we've read exactly @pattern.
- * It only means that there are no errors.
- */
-static int nbd_meta_empty_or_pattern(NBDClient *client, const char *pattern,
-                                     uint32_t len, bool *match, Error **errp)
-{
-    if (len == 0) {
-        if (client->opt == NBD_OPT_LIST_META_CONTEXT) {
-            *match = true;
-        }
-        trace_nbd_negotiate_meta_query_parse("empty");
-        return 1;
-    }
-
-    if (len != strlen(pattern)) {
-        trace_nbd_negotiate_meta_query_skip("different lengths");
-        return nbd_opt_skip(client, len, errp);
-    }
-
-    return nbd_meta_pattern(client, pattern, match, errp);
 }

 /* nbd_meta_base_query
  *
  * Handle queries to 'base' namespace. For now, only the base:allocation
- * context is available.  'len' is the amount of text remaining to be read from
- * the current name, after the 'base:' portion has been stripped.
- *
- * Return -errno on I/O error, 0 if option was completely handled by
- * sending a reply about inconsistent lengths, or 1 on success.
+ * context is available.  @len is the length of @ns, including the 'base:'
+ * prefix.
  */
-static int nbd_meta_base_query(NBDClient *client, NBDExportMetaContexts *meta,
-                               uint32_t len, Error **errp)
+static void nbd_meta_base_query(NBDClient *client, NBDExportMetaContexts *meta,
+                                const char *ns, uint32_t len, Error **errp)
 {
-    return nbd_meta_empty_or_pattern(client, "allocation", len,
-                                     &meta->base_allocation, errp);
+    nbd_meta_empty_or_pattern(client, "allocation", ns + 5, len - 5,
+                              &meta->base_allocation, errp);
 }

 /* nbd_meta_bitmap_query
  *
  * Handle query to 'qemu:' namespace.
- * @len is the amount of text remaining to be read from the current name, after
- * the 'qemu:' portion has been stripped.
- *
- * Return -errno on I/O error, 0 if option was completely handled by
- * sending a reply about inconsistent lengths, or 1 on success. */
-static int nbd_meta_qemu_query(NBDClient *client, NBDExportMetaContexts *meta,
-                               uint32_t len, Error **errp)
+ * @len is the length of @ns, including the `qemu:' prefix.
+ */
+static void nbd_meta_qemu_query(NBDClient *client, NBDExportMetaContexts *meta,
+                                const char *ns, uint32_t len, Error **errp)
 {
-    bool dirty_bitmap = false;
-    size_t dirty_bitmap_len = strlen("dirty-bitmap:");
-    int ret;
-
     if (!meta->exp->export_bitmap) {
         trace_nbd_negotiate_meta_query_skip("no dirty-bitmap exported");
-        return nbd_opt_skip(client, len, errp);
-    }
-
-    if (len == 0) {
+    } else if (len == 0) {
         if (client->opt == NBD_OPT_LIST_META_CONTEXT) {
             meta->bitmap = true;
         }
         trace_nbd_negotiate_meta_query_parse("empty");
-        return 1;
-    }
-
-    if (len < dirty_bitmap_len) {
-        trace_nbd_negotiate_meta_query_skip("not dirty-bitmap:");
-        return nbd_opt_skip(client, len, errp);
-    }
-
-    len -= dirty_bitmap_len;
-    ret = nbd_meta_pattern(client, "dirty-bitmap:", &dirty_bitmap, errp);
-    if (ret <= 0) {
-        return ret;
-    }
-    if (!dirty_bitmap) {
+    } else if (!g_str_has_prefix(ns + 5, "dirty-bitmap:")) {
         trace_nbd_negotiate_meta_query_skip("not dirty-bitmap:");
-        return nbd_opt_skip(client, len, errp);
+    } else {
+        trace_nbd_negotiate_meta_query_parse("dirty-bitmap:");
+        nbd_meta_empty_or_pattern(client, meta->exp->export_bitmap_context,
+                                  ns, len, &meta->bitmap, errp);
     }
-
-    trace_nbd_negotiate_meta_query_parse("dirty-bitmap:");
-
-    return nbd_meta_empty_or_pattern(
-            client, meta->exp->export_bitmap_context +
-            strlen("qemu:dirty_bitmap:"), len, &meta->bitmap, errp);
 }

 /* nbd_negotiate_meta_query
@@ -930,22 +859,13 @@ static int nbd_meta_qemu_query(NBDClient *client, NBDExportMetaContexts *meta,
  *
  * The only supported namespaces are 'base' and 'qemu'.
  *
- * The function aims not wasting time and memory to read long unknown namespace
- * names.
- *
  * Return -errno on I/O error, 0 if option was completely handled by
  * sending a reply about inconsistent lengths, or 1 on success. */
 static int nbd_negotiate_meta_query(NBDClient *client,
                                     NBDExportMetaContexts *meta, Error **errp)
 {
-    /*
-     * Both 'qemu' and 'base' namespaces have length = 5 including a
-     * colon. If another length namespace is later introduced, this
-     * should certainly be refactored.
-     */
     int ret;
-    size_t ns_len = 5;
-    char ns[5];
+    g_autofree char *ns = NULL;
     uint32_t len;

     ret = nbd_opt_read(client, &len, sizeof(len), errp);
@@ -958,27 +878,29 @@ static int nbd_negotiate_meta_query(NBDClient *client,
         trace_nbd_negotiate_meta_query_skip("length too long");
         return nbd_opt_skip(client, len, errp);
     }
-    if (len < ns_len) {
-        trace_nbd_negotiate_meta_query_skip("length too short");
-        return nbd_opt_skip(client, len, errp);
-    }

-    len -= ns_len;
-    ret = nbd_opt_read(client, ns, ns_len, errp);
+    ns = g_malloc(len + 1);
+    ret = nbd_opt_read(client, ns, len, errp);
     if (ret <= 0) {
         return ret;
     }
+    ns[len] = '\0';
+    if (strlen(ns) != len) {
+        return nbd_opt_invalid(client, errp,
+                               "Embedded NUL in query for option %s",
+                               nbd_opt_lookup(client->opt));
+    }

-    if (!strncmp(ns, "base:", ns_len)) {
+    if (g_str_has_prefix(ns, "base:")) {
         trace_nbd_negotiate_meta_query_parse("base:");
-        return nbd_meta_base_query(client, meta, len, errp);
-    } else if (!strncmp(ns, "qemu:", ns_len)) {
+        nbd_meta_base_query(client, meta, ns, len, errp);
+    } else if (g_str_has_prefix(ns, "qemu:")) {
         trace_nbd_negotiate_meta_query_parse("qemu:");
-        return nbd_meta_qemu_query(client, meta, len, errp);
+        nbd_meta_qemu_query(client, meta, ns, len, errp);
+    } else {
+        trace_nbd_negotiate_meta_query_skip("unknown namespace");
     }
-
-    trace_nbd_negotiate_meta_query_skip("unknown namespace");
-    return nbd_opt_skip(client, len, errp);
+    return 1;
 }

 /* nbd_negotiate_meta_queries
-- 
2.28.0



^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 2/3] nbd: Add new qemu:allocation-depth metacontext
  2020-09-25 20:32 [PATCH 0/3] Exposing backing-chain allocation over NBD Eric Blake
  2020-09-25 20:32 ` [PATCH 1/3] nbd: Simplify meta-context parsing Eric Blake
@ 2020-09-25 20:32 ` Eric Blake
  2020-09-26  7:33   ` Richard W.M. Jones
  2020-09-26 13:15   ` Vladimir Sementsov-Ogievskiy
  2020-09-25 20:32 ` [PATCH 3/3] nbd: Add 'qemu-nbd -A' to expose allocation depth Eric Blake
  2 siblings, 2 replies; 14+ messages in thread
From: Eric Blake @ 2020-09-25 20:32 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, pkrempa, qemu-block, Markus Armbruster, rjones,
	Max Reitz, vsementsov

'qemu-img map' provides a way to determine which extents of an image
come from the top layer vs. inherited from a backing chain.  This is
useful information worth exposing over NBD.  There is a proposal to
add a QMP command block-dirty-bitmap-populate which can create a dirty
bitmap that reflects allocation information, at which point
qemu:dirty-bitmap:NAME can expose that information via the creation of
a temporary bitmap, but we can shorten the effort by adding a new
qemu:allocation-depth context that does the same thing without an
intermediate bitmap (this patch does not eliminate the need for that
proposal, as it will have other uses as well).

For this patch, I just encoded a tri-state value (unallocated, from
this layer, from any of the backing layers); we could instead or in
addition report an actual depth count per extent, if that proves more
useful.

Note that this patch does not actually enable any way to request a
server to enable this context; that will come in the next patch.

Signed-off-by: Eric Blake <eblake@redhat.com>
---
 docs/interop/nbd.txt |  22 +++++++--
 qapi/block-core.json |   6 ++-
 include/block/nbd.h  |   8 +++-
 nbd/server.c         | 110 +++++++++++++++++++++++++++++++++++++++----
 4 files changed, 132 insertions(+), 14 deletions(-)

diff --git a/docs/interop/nbd.txt b/docs/interop/nbd.txt
index f3b3cacc9621..56efec7aee12 100644
--- a/docs/interop/nbd.txt
+++ b/docs/interop/nbd.txt
@@ -17,9 +17,9 @@ namespace "qemu".

 == "qemu" namespace ==

-The "qemu" namespace currently contains only one type of context,
-related to exposing the contents of a dirty bitmap alongside the
-associated disk contents.  That context has the following form:
+The "qemu" namespace currently contains two types of context.  The
+first is related to exposing the contents of a dirty bitmap alongside
+the associated disk contents.  That context has the following form:

     qemu:dirty-bitmap:<dirty-bitmap-export-name>

@@ -28,8 +28,21 @@ in reply for NBD_CMD_BLOCK_STATUS:

     bit 0: NBD_STATE_DIRTY, means that the extent is "dirty"

+The second is related to exposing the source of various extents within
+the image, with a single context named:
+
+    qemu:allocation-depth
+
+In the allocation depth context, bits 0 and 1 form a tri-state value:
+
+    bits 0-1 clear: NBD_STATE_DEPTH_UNALLOC, means the extent is unallocated
+    bit 0 set: NBD_STATE_DEPTH_LOCAL, the extent is allocated in this image
+    bit 1 set: NBD_STATE_DEPTH_BACKING, the extent is inherited from a
+               backing layer
+
 For NBD_OPT_LIST_META_CONTEXT the following queries are supported
-in addition to "qemu:dirty-bitmap:<dirty-bitmap-export-name>":
+in addition to the specific "qemu:allocation-depth" and
+"qemu:dirty-bitmap:<dirty-bitmap-export-name>":

 * "qemu:" - returns list of all available metadata contexts in the
             namespace.
@@ -55,3 +68,4 @@ the operation of that feature.
 NBD_CMD_BLOCK_STATUS for "qemu:dirty-bitmap:", NBD_CMD_CACHE
 * 4.2: NBD_FLAG_CAN_MULTI_CONN for shareable read-only exports,
 NBD_CMD_FLAG_FAST_ZERO
+* 5.2: NBD_CMD_BLOCK_STATUS for "qemu:allocation-depth"
diff --git a/qapi/block-core.json b/qapi/block-core.json
index 3c16f1e11d6b..b3ec30a83cd7 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -5261,11 +5261,15 @@
 #          NBD client can use NBD_OPT_SET_META_CONTEXT with
 #          "qemu:dirty-bitmap:NAME" to inspect the bitmap. (since 4.0)
 #
+# @alloc: Also export the allocation map for @device, so the NBD client
+#         can use NBD_OPT_SET_META_CONTEXT with "qemu:allocation-depth"
+#         to inspect allocation details. (since 5.2)
+#
 # Since: 5.0
 ##
 { 'struct': 'BlockExportNbd',
   'data': {'device': 'str', '*name': 'str', '*description': 'str',
-           '*writable': 'bool', '*bitmap': 'str' } }
+           '*writable': 'bool', '*bitmap': 'str', '*alloc': 'bool' } }

 ##
 # @nbd-server-add:
diff --git a/include/block/nbd.h b/include/block/nbd.h
index 9bc3bfaeecf8..71a2623f7842 100644
--- a/include/block/nbd.h
+++ b/include/block/nbd.h
@@ -1,5 +1,5 @@
 /*
- *  Copyright (C) 2016-2019 Red Hat, Inc.
+ *  Copyright (C) 2016-2020 Red Hat, Inc.
  *  Copyright (C) 2005  Anthony Liguori <anthony@codemonkey.ws>
  *
  *  Network Block Device
@@ -257,6 +257,12 @@ enum {
 /* Extent flags for qemu:dirty-bitmap in NBD_REPLY_TYPE_BLOCK_STATUS */
 #define NBD_STATE_DIRTY (1 << 0)

+/* Extent flags for qemu:allocation-depth in NBD_REPLY_TYPE_BLOCK_STATUS */
+#define NBD_STATE_DEPTH_UNALLOC (0 << 0)
+#define NBD_STATE_DEPTH_LOCAL   (1 << 0)
+#define NBD_STATE_DEPTH_BACKING (2 << 0)
+#define NBD_STATE_DEPTH_MASK    (0x3)
+
 static inline bool nbd_reply_type_is_error(int type)
 {
     return type & (1 << 15);
diff --git a/nbd/server.c b/nbd/server.c
index 0d2d7e52058f..02d5fb375b24 100644
--- a/nbd/server.c
+++ b/nbd/server.c
@@ -25,7 +25,8 @@
 #include "qemu/units.h"

 #define NBD_META_ID_BASE_ALLOCATION 0
-#define NBD_META_ID_DIRTY_BITMAP 1
+#define NBD_META_ID_ALLOCATION_DEPTH 1
+#define NBD_META_ID_DIRTY_BITMAP 2

 /*
  * NBD_MAX_BLOCK_STATUS_EXTENTS: 1 MiB of extents data. An empirical
@@ -97,6 +98,7 @@ struct NBDExport {
     BlockBackend *eject_notifier_blk;
     Notifier eject_notifier;

+    bool alloc_context;
     BdrvDirtyBitmap *export_bitmap;
     char *export_bitmap_context;
 };
@@ -113,6 +115,7 @@ typedef struct NBDExportMetaContexts {
     bool valid; /* means that negotiation of the option finished without
                    errors */
     bool base_allocation; /* export base:allocation context (block status) */
+    bool allocation_depth; /* export qemu:allocation-depth */
     bool bitmap; /* export qemu:dirty-bitmap:<export bitmap name> */
 } NBDExportMetaContexts;

@@ -836,15 +839,24 @@ static void nbd_meta_base_query(NBDClient *client, NBDExportMetaContexts *meta,
 static void nbd_meta_qemu_query(NBDClient *client, NBDExportMetaContexts *meta,
                                 const char *ns, uint32_t len, Error **errp)
 {
-    if (!meta->exp->export_bitmap) {
-        trace_nbd_negotiate_meta_query_skip("no dirty-bitmap exported");
-    } else if (len == 0) {
+    if (len == 0) {
         if (client->opt == NBD_OPT_LIST_META_CONTEXT) {
-            meta->bitmap = true;
+            meta->allocation_depth = meta->exp->alloc_context;
+            meta->bitmap = !!meta->exp->export_bitmap;
         }
         trace_nbd_negotiate_meta_query_parse("empty");
+    } else if (strcmp(ns + 5, "allocation-depth") == 0) {
+        if (meta->exp->alloc_context) {
+            trace_nbd_negotiate_meta_query_parse("allocation-depth");
+            meta->allocation_depth = true;
+        } else {
+            trace_nbd_negotiate_meta_query_skip("allocation-depth not "
+                                                "available");
+        }
     } else if (!g_str_has_prefix(ns + 5, "dirty-bitmap:")) {
         trace_nbd_negotiate_meta_query_skip("not dirty-bitmap:");
+    } else if (!meta->exp->export_bitmap) {
+        trace_nbd_negotiate_meta_query_skip("no dirty-bitmap exported");
     } else {
         trace_nbd_negotiate_meta_query_parse("dirty-bitmap:");
         nbd_meta_empty_or_pattern(client, meta->exp->export_bitmap_context,
@@ -954,6 +966,7 @@ static int nbd_negotiate_meta_queries(NBDClient *client,
     if (client->opt == NBD_OPT_LIST_META_CONTEXT && !nb_queries) {
         /* enable all known contexts */
         meta->base_allocation = true;
+        meta->allocation_depth = meta->exp->alloc_context;
         meta->bitmap = !!meta->exp->export_bitmap;
     } else {
         for (i = 0; i < nb_queries; ++i) {
@@ -973,6 +986,15 @@ static int nbd_negotiate_meta_queries(NBDClient *client,
         }
     }

+    if (meta->allocation_depth) {
+        ret = nbd_negotiate_send_meta_context(client, "qemu:allocation-depth",
+                                              NBD_META_ID_ALLOCATION_DEPTH,
+                                              errp);
+        if (ret < 0) {
+            return ret;
+        }
+    }
+
     if (meta->bitmap) {
         ret = nbd_negotiate_send_meta_context(client,
                                               meta->exp->export_bitmap_context,
@@ -1972,6 +1994,40 @@ static int blockstatus_to_extents(BlockDriverState *bs, uint64_t offset,
     return 0;
 }

+static int blockalloc_to_extents(BlockDriverState *bs, uint64_t offset,
+                                 uint64_t bytes, NBDExtentArray *ea)
+{
+    while (bytes) {
+        uint32_t flags;
+        int64_t num;
+        int ret = bdrv_is_allocated(bs, offset, bytes, &num);
+
+        if (ret < 0) {
+            return ret;
+        }
+
+        if (ret == 1) {
+            flags = NBD_STATE_DEPTH_LOCAL;
+        } else {
+            ret = bdrv_is_allocated_above(bs, NULL, false, offset, num,
+                                          &num);
+            if (ret < 0) {
+                return ret;
+            }
+            flags = ret ? NBD_STATE_DEPTH_BACKING : NBD_STATE_DEPTH_UNALLOC;
+        }
+
+        if (nbd_extent_array_add(ea, num, flags) < 0) {
+            return 0;
+        }
+
+        offset += num;
+        bytes -= num;
+    }
+
+    return 0;
+}
+
 /*
  * nbd_co_send_extents
  *
@@ -2020,6 +2076,26 @@ static int nbd_co_send_block_status(NBDClient *client, uint64_t handle,
     return nbd_co_send_extents(client, handle, ea, last, context_id, errp);
 }

+/* Get allocation depth from the exported device and send it to the client */
+static int nbd_co_send_block_alloc(NBDClient *client, uint64_t handle,
+                                   BlockDriverState *bs, uint64_t offset,
+                                   uint32_t length, bool dont_fragment,
+                                   bool last, uint32_t context_id,
+                                   Error **errp)
+{
+    int ret;
+    unsigned int nb_extents = dont_fragment ? 1 : NBD_MAX_BLOCK_STATUS_EXTENTS;
+    g_autoptr(NBDExtentArray) ea = nbd_extent_array_new(nb_extents);
+
+    ret = blockalloc_to_extents(bs, offset, length, ea);
+    if (ret < 0) {
+        return nbd_co_send_structured_error(
+                client, handle, -ret, "can't get block status", errp);
+    }
+
+    return nbd_co_send_extents(client, handle, ea, last, context_id, errp);
+}
+
 /* Populate @ea from a dirty bitmap. */
 static void bitmap_to_extents(BdrvDirtyBitmap *bitmap,
                               uint64_t offset, uint64_t length,
@@ -2348,15 +2424,19 @@ static coroutine_fn int nbd_handle_request(NBDClient *client,
         }
         if (client->export_meta.valid &&
             (client->export_meta.base_allocation ||
+             client->export_meta.allocation_depth ||
              client->export_meta.bitmap))
         {
             bool dont_fragment = request->flags & NBD_CMD_FLAG_REQ_ONE;
+            int contexts_remaining = client->export_meta.base_allocation +
+                client->export_meta.allocation_depth +
+                client->export_meta.bitmap;

             if (client->export_meta.base_allocation) {
                 ret = nbd_co_send_block_status(client, request->handle,
                                                blk_bs(exp->blk), request->from,
                                                request->len, dont_fragment,
-                                               !client->export_meta.bitmap,
+                                               !--contexts_remaining,
                                                NBD_META_ID_BASE_ALLOCATION,
                                                errp);
                 if (ret < 0) {
@@ -2364,17 +2444,31 @@ static coroutine_fn int nbd_handle_request(NBDClient *client,
                 }
             }

+            if (client->export_meta.allocation_depth) {
+                ret = nbd_co_send_block_alloc(client, request->handle,
+                                              blk_bs(exp->blk), request->from,
+                                              request->len, dont_fragment,
+                                              !--contexts_remaining,
+                                              NBD_META_ID_ALLOCATION_DEPTH,
+                                              errp);
+                if (ret < 0) {
+                    return ret;
+                }
+            }
+
             if (client->export_meta.bitmap) {
                 ret = nbd_co_send_bitmap(client, request->handle,
                                          client->exp->export_bitmap,
                                          request->from, request->len,
-                                         dont_fragment,
-                                         true, NBD_META_ID_DIRTY_BITMAP, errp);
+                                         dont_fragment, !--contexts_remaining,
+                                         NBD_META_ID_DIRTY_BITMAP, errp);
                 if (ret < 0) {
                     return ret;
                 }
             }

+            assert(!contexts_remaining);
+
             return 0;
         } else {
             return nbd_send_generic_reply(client, request->handle, -EINVAL,
-- 
2.28.0



^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 3/3] nbd: Add 'qemu-nbd -A' to expose allocation depth
  2020-09-25 20:32 [PATCH 0/3] Exposing backing-chain allocation over NBD Eric Blake
  2020-09-25 20:32 ` [PATCH 1/3] nbd: Simplify meta-context parsing Eric Blake
  2020-09-25 20:32 ` [PATCH 2/3] nbd: Add new qemu:allocation-depth metacontext Eric Blake
@ 2020-09-25 20:32 ` Eric Blake
  2020-09-26  7:34   ` Richard W.M. Jones
  2020-09-26 13:32   ` Vladimir Sementsov-Ogievskiy
  2 siblings, 2 replies; 14+ messages in thread
From: Eric Blake @ 2020-09-25 20:32 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, pkrempa, qemu-block, Markus Armbruster, rjones,
	Max Reitz, vsementsov

Allow the server to expose an additional metacontext to be requested
by savvy clients.  qemu-nbd adds a new option -A to expose the
qemu:allocation-depth metacontext through NBD_CMD_BLOCK_STATUS; this
can also be set via QMP when using nbd-server-add.

qemu as client can be hacked into viewing this new context by using
the now-misnamed x-dirty-bitmap option when creating an NBD blockdev;
although it is worth noting the decoding of how such context
information will appear in 'qemu-img map --output=json':

NBD_STATE_DEPTH_UNALLOC => "zero":false, "data":true
NBD_STATE_DEPTH_LOCAL => "zero":false, "data":false
NBD_STATE_DEPTH_BACKING => "zero":true, "data":true

libnbd as client is probably a nicer way to get at the information
without having to decipher such hacks in qemu as client. ;)

Signed-off-by: Eric Blake <eblake@redhat.com>
---
 docs/tools/qemu-nbd.rst    |  6 ++++
 qapi/block-core.json       |  7 ++--
 include/block/nbd.h        |  7 ++--
 blockdev-nbd.c             |  3 +-
 nbd/server.c               |  8 +++--
 qemu-nbd.c                 | 16 ++++++---
 tests/qemu-iotests/309     | 73 ++++++++++++++++++++++++++++++++++++++
 tests/qemu-iotests/309.out | 22 ++++++++++++
 tests/qemu-iotests/group   |  1 +
 9 files changed, 129 insertions(+), 14 deletions(-)
 create mode 100755 tests/qemu-iotests/309
 create mode 100644 tests/qemu-iotests/309.out

diff --git a/docs/tools/qemu-nbd.rst b/docs/tools/qemu-nbd.rst
index 667861cb22e9..0e545a97cfa3 100644
--- a/docs/tools/qemu-nbd.rst
+++ b/docs/tools/qemu-nbd.rst
@@ -72,6 +72,12 @@ driver options if ``--image-opts`` is specified.

   Export the disk as read-only.

+.. option:: -A, --allocation-depth
+
+  Expose allocation depth information via the
+  ``qemu:allocation-depth`` context accessible through
+  NBD_OPT_SET_META_CONTEXT.
+
 .. option:: -B, --bitmap=NAME

   If *filename* has a qcow2 persistent bitmap *NAME*, expose
diff --git a/qapi/block-core.json b/qapi/block-core.json
index b3ec30a83cd7..84f7a7a34dc1 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -3874,9 +3874,12 @@
 #
 # @tls-creds: TLS credentials ID
 #
-# @x-dirty-bitmap: A "qemu:dirty-bitmap:NAME" string to query in place of
+# @x-dirty-bitmap: A metacontext name such as "qemu:dirty-bitmap:NAME" or
+#                  "qemu:allocation-depth" to query in place of the
 #                  traditional "base:allocation" block status (see
-#                  NBD_OPT_LIST_META_CONTEXT in the NBD protocol) (since 3.0)
+#                  NBD_OPT_LIST_META_CONTEXT in the NBD protocol; and
+#                  yes, naming this option x-context would have made
+#                  more sense) (since 3.0)
 #
 # @reconnect-delay: On an unexpected disconnect, the nbd client tries to
 #                   connect again until succeeding or encountering a serious
diff --git a/include/block/nbd.h b/include/block/nbd.h
index 71a2623f7842..f660e0274ce3 100644
--- a/include/block/nbd.h
+++ b/include/block/nbd.h
@@ -336,9 +336,10 @@ typedef struct NBDClient NBDClient;

 NBDExport *nbd_export_new(BlockDriverState *bs, uint64_t dev_offset,
                           uint64_t size, const char *name, const char *desc,
-                          const char *bitmap, bool readonly, bool shared,
-                          void (*close)(NBDExport *), bool writethrough,
-                          BlockBackend *on_eject_blk, Error **errp);
+                          bool alloc_depth, const char *bitmap, bool readonly,
+                          bool shared, void (*close)(NBDExport *),
+                          bool writethrough, BlockBackend *on_eject_blk,
+                          Error **errp);
 void nbd_export_close(NBDExport *exp);
 void nbd_export_remove(NBDExport *exp, NbdServerRemoveMode mode, Error **errp);
 void nbd_export_get(NBDExport *exp);
diff --git a/blockdev-nbd.c b/blockdev-nbd.c
index 1a95d89f0096..562ea3751b85 100644
--- a/blockdev-nbd.c
+++ b/blockdev-nbd.c
@@ -203,7 +203,8 @@ void qmp_nbd_server_add(BlockExportNbd *arg, Error **errp)
         arg->writable = false;
     }

-    exp = nbd_export_new(bs, 0, len, arg->name, arg->description, arg->bitmap,
+    exp = nbd_export_new(bs, 0, len, arg->name, arg->description,
+                         arg->alloc, arg->bitmap,
                          !arg->writable, !arg->writable,
                          NULL, false, on_eject_blk, errp);
     if (!exp) {
diff --git a/nbd/server.c b/nbd/server.c
index 02d5fb375b24..f5e0e115b703 100644
--- a/nbd/server.c
+++ b/nbd/server.c
@@ -1450,9 +1450,10 @@ static void nbd_eject_notifier(Notifier *n, void *data)

 NBDExport *nbd_export_new(BlockDriverState *bs, uint64_t dev_offset,
                           uint64_t size, const char *name, const char *desc,
-                          const char *bitmap, bool readonly, bool shared,
-                          void (*close)(NBDExport *), bool writethrough,
-                          BlockBackend *on_eject_blk, Error **errp)
+                          bool alloc_depth, const char *bitmap, bool readonly,
+                          bool shared, void (*close)(NBDExport *),
+                          bool writethrough, BlockBackend *on_eject_blk,
+                          Error **errp)
 {
     AioContext *ctx;
     BlockBackend *blk;
@@ -1545,6 +1546,7 @@ NBDExport *nbd_export_new(BlockDriverState *bs, uint64_t dev_offset,
         assert(strlen(exp->export_bitmap_context) < NBD_MAX_STRING_SIZE);
     }

+    exp->alloc_context = alloc_depth;
     exp->close = close;
     exp->ctx = ctx;
     blk_add_aio_context_notifier(blk, blk_aio_attached, blk_aio_detach, exp);
diff --git a/qemu-nbd.c b/qemu-nbd.c
index e39d3b23c121..5aeef60b0a5a 100644
--- a/qemu-nbd.c
+++ b/qemu-nbd.c
@@ -100,6 +100,7 @@ static void usage(const char *name)
 "\n"
 "Exposing part of the image:\n"
 "  -o, --offset=OFFSET       offset into the image\n"
+"  -A, --allocation-depth    expose the allocation depth\n"
 "  -B, --bitmap=NAME         expose a persistent dirty bitmap\n"
 "\n"
 "General purpose options:\n"
@@ -527,7 +528,7 @@ int main(int argc, char **argv)
     int64_t fd_size;
     QemuOpts *sn_opts = NULL;
     const char *sn_id_or_name = NULL;
-    const char *sopt = "hVb:o:p:rsnc:dvk:e:f:tl:x:T:D:B:L";
+    const char *sopt = "hVb:o:p:rsnc:dvk:e:f:tl:x:T:D:AB:L";
     struct option lopt[] = {
         { "help", no_argument, NULL, 'h' },
         { "version", no_argument, NULL, 'V' },
@@ -536,6 +537,7 @@ int main(int argc, char **argv)
         { "socket", required_argument, NULL, 'k' },
         { "offset", required_argument, NULL, 'o' },
         { "read-only", no_argument, NULL, 'r' },
+        { "allocation-depth", no_argument, NULL, 'A' },
         { "bitmap", required_argument, NULL, 'B' },
         { "connect", required_argument, NULL, 'c' },
         { "disconnect", no_argument, NULL, 'd' },
@@ -577,6 +579,7 @@ int main(int argc, char **argv)
     QDict *options = NULL;
     const char *export_name = NULL; /* defaults to "" later for server mode */
     const char *export_description = NULL;
+    bool alloc_depth = false;
     const char *bitmap = NULL;
     const char *tlscredsid = NULL;
     bool imageOpts = false;
@@ -700,6 +703,9 @@ int main(int argc, char **argv)
             readonly = true;
             flags &= ~BDRV_O_RDWR;
             break;
+        case 'A':
+            alloc_depth = true;
+            break;
         case 'B':
             bitmap = optarg;
             break;
@@ -797,8 +803,8 @@ int main(int argc, char **argv)
             exit(EXIT_FAILURE);
         }
         if (export_name || export_description || dev_offset ||
-            device || disconnect || fmt || sn_id_or_name || bitmap ||
-            seen_aio || seen_discard || seen_cache) {
+            device || disconnect || fmt || sn_id_or_name || alloc_depth ||
+            bitmap || seen_aio || seen_discard || seen_cache) {
             error_report("List mode is incompatible with per-device settings");
             exit(EXIT_FAILURE);
         }
@@ -1069,8 +1075,8 @@ int main(int argc, char **argv)
     fd_size -= dev_offset;

     export = nbd_export_new(bs, dev_offset, fd_size, export_name,
-                            export_description, bitmap, readonly, shared > 1,
-                            nbd_export_closed, writethrough, NULL,
+                            export_description, alloc_depth, bitmap, readonly,
+                            shared > 1, nbd_export_closed, writethrough, NULL,
                             &error_fatal);

     if (device) {
diff --git a/tests/qemu-iotests/309 b/tests/qemu-iotests/309
new file mode 100755
index 000000000000..fda3be9abbd2
--- /dev/null
+++ b/tests/qemu-iotests/309
@@ -0,0 +1,73 @@
+#!/usr/bin/env bash
+#
+# Test qemu-nbd -A
+#
+# Copyright (C) 2018-2020 Red Hat, Inc.
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+#
+
+seq="$(basename $0)"
+echo "QA output created by $seq"
+
+status=1 # failure is the default!
+
+_cleanup()
+{
+    _cleanup_test_img
+    nbd_server_stop
+}
+trap "_cleanup; exit \$status" 0 1 2 3 15
+
+# get standard environment, filters and checks
+. ./common.rc
+. ./common.filter
+. ./common.nbd
+
+_supported_fmt qcow2
+_supported_proto file
+_supported_os Linux
+_require_command QEMU_NBD
+
+echo
+echo "=== Initial image setup ==="
+echo
+
+TEST_IMG="$TEST_IMG.base" _make_test_img 4M
+$QEMU_IO -c 'w 0 2M' -f $IMGFMT "$TEST_IMG.base" | _filter_qemu_io
+_make_test_img -b "$TEST_IMG.base" -F $IMGFMT 4M
+$QEMU_IO -c 'w 1M 2M' -f $IMGFMT "$TEST_IMG" | _filter_qemu_io
+
+echo
+echo "=== Check allocation over NBD ==="
+echo
+
+# Normal -f raw NBD block status loses access to allocation information
+$QEMU_IMG map --output=json -f qcow2 "$TEST_IMG"
+IMG="driver=nbd,server.type=unix,server.path=$nbd_unix_socket"
+nbd_server_start_unix_socket -r -f qcow2 -A "$TEST_IMG"
+$QEMU_IMG map --output=json --image-opts \
+    "$IMG" | _filter_qemu_img_map
+# But since we used -A, and use x-dirty-bitmap as a hack for reading bitmaps,
+# we can reconstruct it, by abusing block status to report:
+#    NBD_STATE_DEPTH_UNALLOC => "zero":false, "data":true
+#    NBD_STATE_DEPTH_LOCAL => "zero":false, "data":false
+#    NBD_STATE_DEPTH_BACKING => "zero":true, "data":true
+$QEMU_IMG map --output=json --image-opts \
+    "$IMG,x-dirty-bitmap=qemu:allocation-depth" | _filter_qemu_img_map
+
+# success, all done
+echo '*** done'
+rm -f $seq.full
+status=0
diff --git a/tests/qemu-iotests/309.out b/tests/qemu-iotests/309.out
new file mode 100644
index 000000000000..db75bb6b0df9
--- /dev/null
+++ b/tests/qemu-iotests/309.out
@@ -0,0 +1,22 @@
+QA output created by 309
+
+=== Initial image setup ===
+
+Formatting 'TEST_DIR/t.IMGFMT.base', fmt=IMGFMT size=4194304
+wrote 2097152/2097152 bytes at offset 0
+2 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=4194304 backing_file=TEST_DIR/t.IMGFMT.base backing_fmt=IMGFMT
+wrote 2097152/2097152 bytes at offset 1048576
+2 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+
+=== Check allocation over NBD ===
+
+[{ "start": 0, "length": 1048576, "depth": 1, "zero": false, "data": true, "offset": 327680},
+{ "start": 1048576, "length": 2097152, "depth": 0, "zero": false, "data": true, "offset": 327680},
+{ "start": 3145728, "length": 1048576, "depth": 1, "zero": true, "data": false}]
+[{ "start": 0, "length": 3145728, "depth": 0, "zero": false, "data": true, "offset": OFFSET},
+{ "start": 3145728, "length": 1048576, "depth": 0, "zero": true, "data": false, "offset": OFFSET}]
+[{ "start": 0, "length": 1048576, "depth": 0, "zero": true, "data": true, "offset": OFFSET},
+{ "start": 1048576, "length": 2097152, "depth": 0, "zero": false, "data": false},
+{ "start": 3145728, "length": 1048576, "depth": 0, "zero": false, "data": true, "offset": OFFSET}]
+*** done
diff --git a/tests/qemu-iotests/group b/tests/qemu-iotests/group
index ff59cfd2d4c4..57182e4b3169 100644
--- a/tests/qemu-iotests/group
+++ b/tests/qemu-iotests/group
@@ -314,3 +314,4 @@
 303 rw quick
 304 rw quick
 305 rw quick
+309 rw auto quick
-- 
2.28.0



^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH 2/3] nbd: Add new qemu:allocation-depth metacontext
  2020-09-25 20:32 ` [PATCH 2/3] nbd: Add new qemu:allocation-depth metacontext Eric Blake
@ 2020-09-26  7:33   ` Richard W.M. Jones
  2020-09-26 13:19     ` Vladimir Sementsov-Ogievskiy
  2020-09-28 14:33     ` Eric Blake
  2020-09-26 13:15   ` Vladimir Sementsov-Ogievskiy
  1 sibling, 2 replies; 14+ messages in thread
From: Richard W.M. Jones @ 2020-09-26  7:33 UTC (permalink / raw)
  To: Eric Blake
  Cc: Kevin Wolf, pkrempa, qemu-block, Markus Armbruster, qemu-devel,
	Max Reitz, vsementsov

On Fri, Sep 25, 2020 at 03:32:48PM -0500, Eric Blake wrote:
> +The second is related to exposing the source of various extents within
> +the image, with a single context named:
> +
> +    qemu:allocation-depth
> +
> +In the allocation depth context, bits 0 and 1 form a tri-state value:
> +
> +    bits 0-1 clear: NBD_STATE_DEPTH_UNALLOC, means the extent is unallocated
> +    bit 0 set: NBD_STATE_DEPTH_LOCAL, the extent is allocated in this image
> +    bit 1 set: NBD_STATE_DEPTH_BACKING, the extent is inherited from a
> +               backing layer

From the cover description I imagined it would show the actual depth, ie:

         top -> backing -> backing -> backing
 depth:   1        2         3   ....          (0 = unallocated)

I wonder if that is possible?  (Perhaps there's something I don't
understand here.)

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
libguestfs lets you edit virtual machines.  Supports shell scripting,
bindings from many languages.  http://libguestfs.org



^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 3/3] nbd: Add 'qemu-nbd -A' to expose allocation depth
  2020-09-25 20:32 ` [PATCH 3/3] nbd: Add 'qemu-nbd -A' to expose allocation depth Eric Blake
@ 2020-09-26  7:34   ` Richard W.M. Jones
  2020-09-26 13:32   ` Vladimir Sementsov-Ogievskiy
  1 sibling, 0 replies; 14+ messages in thread
From: Richard W.M. Jones @ 2020-09-26  7:34 UTC (permalink / raw)
  To: Eric Blake
  Cc: Kevin Wolf, pkrempa, qemu-block, Markus Armbruster, qemu-devel,
	Max Reitz, vsementsov

On Fri, Sep 25, 2020 at 03:32:49PM -0500, Eric Blake wrote:
> Allow the server to expose an additional metacontext to be requested
> by savvy clients.  qemu-nbd adds a new option -A to expose the
> qemu:allocation-depth metacontext through NBD_CMD_BLOCK_STATUS; this
> can also be set via QMP when using nbd-server-add.
> 
> qemu as client can be hacked into viewing this new context by using
> the now-misnamed x-dirty-bitmap option when creating an NBD blockdev;
> although it is worth noting the decoding of how such context
> information will appear in 'qemu-img map --output=json':
> 
> NBD_STATE_DEPTH_UNALLOC => "zero":false, "data":true
> NBD_STATE_DEPTH_LOCAL => "zero":false, "data":false
> NBD_STATE_DEPTH_BACKING => "zero":true, "data":true
> 
> libnbd as client is probably a nicer way to get at the information
> without having to decipher such hacks in qemu as client. ;)

I've been meaning to add extents information to nbdinfo, or
perhaps a new tool ("nbdmap").

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-builder quickly builds VMs from scratch
http://libguestfs.org/virt-builder.1.html



^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/3] nbd: Simplify meta-context parsing
  2020-09-25 20:32 ` [PATCH 1/3] nbd: Simplify meta-context parsing Eric Blake
@ 2020-09-26 12:58   ` Vladimir Sementsov-Ogievskiy
  2020-09-28 14:05     ` Eric Blake
  0 siblings, 1 reply; 14+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-09-26 12:58 UTC (permalink / raw)
  To: Eric Blake, qemu-devel; +Cc: qemu-block, pkrempa, rjones

25.09.2020 23:32, Eric Blake wrote:
> We had a premature optimization of trying to read as little from the
> wire as possible while handling NBD_OPT_SET_META_CONTEXT in phases.
> But in reality, we HAVE to read the entire string from the client
> before we can get to the next command, and it is easier to just read
> it all at once than it is to read it in pieces.  And once we do that,
> several functions end up no longer performing I/O, and no longer need
> to return a value.
> 
> While simplifying this, take advantage of g_str_has_prefix for less
> repetition of boilerplate string length computation.
> 
> Our iotests still pass; I also checked that libnbd's testsuite (which
> covers more corner cases of odd meta context requests) still passes.
> 
> Signed-off-by: Eric Blake <eblake@redhat.com>
> ---
>   nbd/server.c | 172 ++++++++++++++-------------------------------------
>   1 file changed, 47 insertions(+), 125 deletions(-)
> 
> diff --git a/nbd/server.c b/nbd/server.c
> index 982de67816a7..0d2d7e52058f 100644
> --- a/nbd/server.c
> +++ b/nbd/server.c
> @@ -1,5 +1,5 @@
>   /*
> - *  Copyright (C) 2016-2018 Red Hat, Inc.
> + *  Copyright (C) 2016-2020 Red Hat, Inc.
>    *  Copyright (C) 2005  Anthony Liguori <anthony@codemonkey.ws>
>    *
>    *  Network Block Device Server Side
> @@ -792,135 +792,64 @@ static int nbd_negotiate_send_meta_context(NBDClient *client,
>       return qio_channel_writev_all(client->ioc, iov, 2, errp) < 0 ? -EIO : 0;
>   }
> 
> -/* Read strlen(@pattern) bytes, and set @match to true if they match @pattern.
> - * @match is never set to false.
> - *
> - * Return -errno on I/O error, 0 if option was completely handled by
> - * sending a reply about inconsistent lengths, or 1 on success.
> - *
> - * Note: return code = 1 doesn't mean that we've read exactly @pattern.
> - * It only means that there are no errors.
> +
> +/*
> + * Check @ns with @len bytes, and set @match to true if it matches @pattern,
> + * or if @len is 0 and the client is performing _LIST_. @match is never set
> + * to false.
>    */
> -static int nbd_meta_pattern(NBDClient *client, const char *pattern, bool *match,
> -                            Error **errp)
> +static void nbd_meta_empty_or_pattern(NBDClient *client, const char *pattern,
> +                                      const char *ns, uint32_t len,

ns changed its meaning, it's not just a namespace, but the whole query. I think, better to rename it.

Also, it's unusual to pass length together with nul-terminated string, it seems redundant.
And, it's used only to compare with zero, strlen(ns) == 0 or ns[0] == 0 is not slower.

Also, errp is unused argument. And it violate Error API recommendation to not create void functions with errp.

Also we can use bool return instead of return through pointer.

> +                                      bool *match, Error **errp)
>   {
> -    int ret;
> -    char *query;
> -    size_t len = strlen(pattern);
> -
> -    assert(len);
> -
> -    query = g_malloc(len);
> -    ret = nbd_opt_read(client, query, len, errp);
> -    if (ret <= 0) {
> -        g_free(query);
> -        return ret;
> -    }
> -
> -    if (strncmp(query, pattern, len) == 0) {
> +    if (len == 0) {
> +        if (client->opt == NBD_OPT_LIST_META_CONTEXT) {
> +            *match = true;
> +        }
> +        trace_nbd_negotiate_meta_query_parse("empty");
> +    } else if (strcmp(pattern, ns) == 0) {
>           trace_nbd_negotiate_meta_query_parse(pattern);
>           *match = true;
>       } else {
>           trace_nbd_negotiate_meta_query_skip("pattern not matched");
>       }
> -    g_free(query);
> -
> -    return 1;
> -}
> -
> -/*
> - * Read @len bytes, and set @match to true if they match @pattern, or if @len
> - * is 0 and the client is performing _LIST_. @match is never set to false.
> - *
> - * Return -errno on I/O error, 0 if option was completely handled by
> - * sending a reply about inconsistent lengths, or 1 on success.
> - *
> - * Note: return code = 1 doesn't mean that we've read exactly @pattern.
> - * It only means that there are no errors.
> - */
> -static int nbd_meta_empty_or_pattern(NBDClient *client, const char *pattern,
> -                                     uint32_t len, bool *match, Error **errp)
> -{
> -    if (len == 0) {
> -        if (client->opt == NBD_OPT_LIST_META_CONTEXT) {
> -            *match = true;
> -        }
> -        trace_nbd_negotiate_meta_query_parse("empty");
> -        return 1;
> -    }
> -
> -    if (len != strlen(pattern)) {
> -        trace_nbd_negotiate_meta_query_skip("different lengths");
> -        return nbd_opt_skip(client, len, errp);
> -    }
> -
> -    return nbd_meta_pattern(client, pattern, match, errp);
>   }
> 
>   /* nbd_meta_base_query
>    *
>    * Handle queries to 'base' namespace. For now, only the base:allocation
> - * context is available.  'len' is the amount of text remaining to be read from
> - * the current name, after the 'base:' portion has been stripped.
> - *
> - * Return -errno on I/O error, 0 if option was completely handled by
> - * sending a reply about inconsistent lengths, or 1 on success.
> + * context is available.  @len is the length of @ns, including the 'base:'
> + * prefix.
>    */
> -static int nbd_meta_base_query(NBDClient *client, NBDExportMetaContexts *meta,
> -                               uint32_t len, Error **errp)
> +static void nbd_meta_base_query(NBDClient *client, NBDExportMetaContexts *meta,
> +                                const char *ns, uint32_t len, Error **errp)
>   {
> -    return nbd_meta_empty_or_pattern(client, "allocation", len,
> -                                     &meta->base_allocation, errp);
> +    nbd_meta_empty_or_pattern(client, "allocation", ns + 5, len - 5,

This one is correct, but would be simpler, if only subquery (after colon) passed here.
(Really, no reason to pass 'base:' to _base_ func)

Hmm, I see that it gives a possibility to use meta->exp->export_bitmap_context directly.

> +                              &meta->base_allocation, errp);
>   }
> 
>   /* nbd_meta_bitmap_query
>    *
>    * Handle query to 'qemu:' namespace.
> - * @len is the amount of text remaining to be read from the current name, after
> - * the 'qemu:' portion has been stripped.
> - *
> - * Return -errno on I/O error, 0 if option was completely handled by
> - * sending a reply about inconsistent lengths, or 1 on success. */
> -static int nbd_meta_qemu_query(NBDClient *client, NBDExportMetaContexts *meta,
> -                               uint32_t len, Error **errp)
> + * @len is the length of @ns, including the `qemu:' prefix.
> + */
> +static void nbd_meta_qemu_query(NBDClient *client, NBDExportMetaContexts *meta,
> +                                const char *ns, uint32_t len, Error **errp)
>   {
> -    bool dirty_bitmap = false;
> -    size_t dirty_bitmap_len = strlen("dirty-bitmap:");
> -    int ret;
> -
>       if (!meta->exp->export_bitmap) {
>           trace_nbd_negotiate_meta_query_skip("no dirty-bitmap exported");
> -        return nbd_opt_skip(client, len, errp);
> -    }
> -
> -    if (len == 0) {
> +    } else if (len == 0) {

s/0/5/ ?

>           if (client->opt == NBD_OPT_LIST_META_CONTEXT) {
>               meta->bitmap = true;
>           }
>           trace_nbd_negotiate_meta_query_parse("empty");
> -        return 1;
> -    }
> -
> -    if (len < dirty_bitmap_len) {
> -        trace_nbd_negotiate_meta_query_skip("not dirty-bitmap:");
> -        return nbd_opt_skip(client, len, errp);
> -    }
> -
> -    len -= dirty_bitmap_len;
> -    ret = nbd_meta_pattern(client, "dirty-bitmap:", &dirty_bitmap, errp);
> -    if (ret <= 0) {
> -        return ret;
> -    }
> -    if (!dirty_bitmap) {
> +    } else if (!g_str_has_prefix(ns + 5, "dirty-bitmap:")) {
>           trace_nbd_negotiate_meta_query_skip("not dirty-bitmap:");
> -        return nbd_opt_skip(client, len, errp);
> +    } else {
> +        trace_nbd_negotiate_meta_query_parse("dirty-bitmap:");
> +        nbd_meta_empty_or_pattern(client, meta->exp->export_bitmap_context,
> +                                  ns, len, &meta->bitmap, errp);

hmm. _empty_ is impossible here, as ns includes "qemu:dirty-bitmap"..

>       }
> -
> -    trace_nbd_negotiate_meta_query_parse("dirty-bitmap:");
> -
> -    return nbd_meta_empty_or_pattern(
> -            client, meta->exp->export_bitmap_context +
> -            strlen("qemu:dirty_bitmap:"), len, &meta->bitmap, errp);
>   }
> 
>   /* nbd_negotiate_meta_query
> @@ -930,22 +859,13 @@ static int nbd_meta_qemu_query(NBDClient *client, NBDExportMetaContexts *meta,
>    *
>    * The only supported namespaces are 'base' and 'qemu'.
>    *
> - * The function aims not wasting time and memory to read long unknown namespace
> - * names.
> - *
>    * Return -errno on I/O error, 0 if option was completely handled by
>    * sending a reply about inconsistent lengths, or 1 on success. */
>   static int nbd_negotiate_meta_query(NBDClient *client,
>                                       NBDExportMetaContexts *meta, Error **errp)
>   {
> -    /*
> -     * Both 'qemu' and 'base' namespaces have length = 5 including a
> -     * colon. If another length namespace is later introduced, this
> -     * should certainly be refactored.
> -     */
>       int ret;
> -    size_t ns_len = 5;
> -    char ns[5];
> +    g_autofree char *ns = NULL;
>       uint32_t len;
> 
>       ret = nbd_opt_read(client, &len, sizeof(len), errp);
> @@ -958,27 +878,29 @@ static int nbd_negotiate_meta_query(NBDClient *client,
>           trace_nbd_negotiate_meta_query_skip("length too long");
>           return nbd_opt_skip(client, len, errp);
>       }
> -    if (len < ns_len) {
> -        trace_nbd_negotiate_meta_query_skip("length too short");
> -        return nbd_opt_skip(client, len, errp);
> -    }
> 
> -    len -= ns_len;
> -    ret = nbd_opt_read(client, ns, ns_len, errp);
> +    ns = g_malloc(len + 1);
> +    ret = nbd_opt_read(client, ns, len, errp);

Now ns is not a namespace but the whole query. I'd rename a variable.

>       if (ret <= 0) {
>           return ret;
>       }
> +    ns[len] = '\0';
> +    if (strlen(ns) != len) {
> +        return nbd_opt_invalid(client, errp,
> +                               "Embedded NUL in query for option %s",
> +                               nbd_opt_lookup(client->opt));
> +    }

Hmm, that's a new good check. Intersting, is it specified in NBD protocol, that
NUL shouldn't be in a string.. Probably it can be a separate patch, with
new nbd_opt_string_read, which will check this thing. But I'm OK with it as is
in this patch.

> 
> -    if (!strncmp(ns, "base:", ns_len)) {
> +    if (g_str_has_prefix(ns, "base:")) {
>           trace_nbd_negotiate_meta_query_parse("base:");
> -        return nbd_meta_base_query(client, meta, len, errp);
> -    } else if (!strncmp(ns, "qemu:", ns_len)) {
> +        nbd_meta_base_query(client, meta, ns, len, errp);
> +    } else if (g_str_has_prefix(ns, "qemu:")) {
>           trace_nbd_negotiate_meta_query_parse("qemu:");
> -        return nbd_meta_qemu_query(client, meta, len, errp);
> +        nbd_meta_qemu_query(client, meta, ns, len, errp);

passing length of string together with string seems redundant (and error prone).

> +    } else {
> +        trace_nbd_negotiate_meta_query_skip("unknown namespace");
>       }
> -

Seems you don't like my new-line before final return statement.

> -    trace_nbd_negotiate_meta_query_skip("unknown namespace");
> -    return nbd_opt_skip(client, len, errp);
> +    return 1;
>   }
> 
>   /* nbd_negotiate_meta_queries
> 


To avoid all this pointer arithmetic, what about something like this (I didn't check, just an idea):

/*
  * Return true if @query matches @pattern of if @query is empty and the client
  * is performing _LIST_. Otherwise return false.
  */
static bool nbd_meta_empty_or_pattern(NBDClient *client, const char *pattern,
                                       const char *query)
{
     if (!strcmp(query, "")) {
         trace_nbd_negotiate_meta_query_parse("empty");
         return client->opt == NBD_OPT_LIST_META_CONTEXT;
     } else if (!strcmp(query, pattern)) {
         trace_nbd_negotiate_meta_query_parse(pattern);
         return true;
     } else {
         trace_nbd_negotiate_meta_query_skip("pattern not matched");
         return false;
     }
}

bool strshift(const char **str, const char *prefix)
{
     if (g_str_has_prefix(*str, prefix)) {
         *str += strlen(prefix);
         return true;
     }

     return false;
}

/*
  * nbd_meta_base_query
  *
  * If it's 'base' namespace parse the query and return true. If not return
  * false.
  */
static bool nbd_meta_base_query(NBDClient *client, NBDExportMetaContexts *meta,
                                 const char *query)
{
     if (!strshift(&query, "base:")) {
         return false;
     }
     trace_nbd_negotiate_meta_query_parse("base:");

     meta->base_allocation =
             nbd_meta_empty_or_pattern(client, "allocation", query);

     return true;
}

/*
  * nbd_meta_bitmap_query
  *
  * If it's 'qemu' namespace parse the query and return true. If not return
  * false.
  */
static bool nbd_meta_qemu_query(NBDClient *client, NBDExportMetaContexts *meta,
                                 const char *query)
{
     if (!strshift(&query, "qemu:")) {
         return false;
     }
     trace_nbd_negotiate_meta_query_parse("qemu:");

     if (!meta->exp->export_bitmap) {
         trace_nbd_negotiate_meta_query_skip("no dirty-bitmap exported");
         return true;
     }

     if (!query[0]) {
         if (client->opt == NBD_OPT_LIST_META_CONTEXT) {
             meta->bitmap = true;
         }
         trace_nbd_negotiate_meta_query_parse("empty");
         return true;
     }

     if (!strshift(&query, "dirty-bitmap:")) {
         trace_nbd_negotiate_meta_query_skip("not dirty-bitmap:");
         return true;
     }

     trace_nbd_negotiate_meta_query_parse("dirty-bitmap:");
     meta->bitmap = nbd_meta_empty_or_pattern(
             client, meta->exp->export_bitmap_context +
             strlen("qemu:dirty-bitmap:"), query);

     return true;
}

/*
  * nbd_negotiate_meta_query
  *
  * Parse namespace name and call corresponding function to parse body of the
  * query.
  *
  * The only supported namespaces are 'base' and 'qemu'.
  *
  * Return -errno on I/O error, 0 if option was completely handled by
  * sending a reply about inconsistent lengths, or 1 on success.
  */
static int nbd_negotiate_meta_query(NBDClient *client,
                                     NBDExportMetaContexts *meta, Error **errp)
{
     int ret;
     g_autofree char *query = NULL;
     uint32_t len;

     ret = nbd_opt_read(client, &len, sizeof(len), errp);
     if (ret <= 0) {
         return ret;
     }
     len = cpu_to_be32(len);

     if (len > NBD_MAX_STRING_SIZE) {
         trace_nbd_negotiate_meta_query_skip("length too long");
         return nbd_opt_skip(client, len, errp);
     }

     query = g_malloc(len + 1);
     ret = nbd_opt_read(client, query, len, errp);
     if (ret <= 0) {
         return ret;
     }
     query[len] = '\0';
     if (strlen(query) != len) {
         return nbd_opt_invalid(client, errp,
                                "Embedded NUL in query for option %s",
                                nbd_opt_lookup(client->opt));
     }

     if (nbd_meta_base_query(client, meta, query)) {
         return 1;

     if (nbd_meta_qemu_query(client, meta, query)) {
         return 1;
     }

     trace_nbd_negotiate_meta_query_skip("unknown namespace");
     return 1;
}



-- 
Best regards,
Vladimir


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 2/3] nbd: Add new qemu:allocation-depth metacontext
  2020-09-25 20:32 ` [PATCH 2/3] nbd: Add new qemu:allocation-depth metacontext Eric Blake
  2020-09-26  7:33   ` Richard W.M. Jones
@ 2020-09-26 13:15   ` Vladimir Sementsov-Ogievskiy
  1 sibling, 0 replies; 14+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-09-26 13:15 UTC (permalink / raw)
  To: Eric Blake, qemu-devel
  Cc: qemu-block, pkrempa, rjones, Kevin Wolf, Max Reitz, Markus Armbruster

25.09.2020 23:32, Eric Blake wrote:
> 'qemu-img map' provides a way to determine which extents of an image
> come from the top layer vs. inherited from a backing chain.  This is
> useful information worth exposing over NBD.  There is a proposal to
> add a QMP command block-dirty-bitmap-populate which can create a dirty
> bitmap that reflects allocation information, at which point
> qemu:dirty-bitmap:NAME can expose that information via the creation of
> a temporary bitmap, but we can shorten the effort by adding a new
> qemu:allocation-depth context that does the same thing without an
> intermediate bitmap (this patch does not eliminate the need for that
> proposal, as it will have other uses as well).
> 
> For this patch, I just encoded a tri-state value (unallocated, from
> this layer, from any of the backing layers); we could instead or in
> addition report an actual depth count per extent, if that proves more
> useful.
> 
> Note that this patch does not actually enable any way to request a
> server to enable this context; that will come in the next patch.
> 
> Signed-off-by: Eric Blake <eblake@redhat.com>

Looks good to me overall, need to rebase if patch 01 changed (as I propose or in some better way).

-- 
Best regards,
Vladimir


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 2/3] nbd: Add new qemu:allocation-depth metacontext
  2020-09-26  7:33   ` Richard W.M. Jones
@ 2020-09-26 13:19     ` Vladimir Sementsov-Ogievskiy
  2020-09-28 14:33     ` Eric Blake
  1 sibling, 0 replies; 14+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-09-26 13:19 UTC (permalink / raw)
  To: Richard W.M. Jones, Eric Blake
  Cc: qemu-devel, qemu-block, pkrempa, Kevin Wolf, Max Reitz,
	Markus Armbruster

26.09.2020 10:33, Richard W.M. Jones wrote:
> On Fri, Sep 25, 2020 at 03:32:48PM -0500, Eric Blake wrote:
>> +The second is related to exposing the source of various extents within
>> +the image, with a single context named:
>> +
>> +    qemu:allocation-depth
>> +
>> +In the allocation depth context, bits 0 and 1 form a tri-state value:
>> +
>> +    bits 0-1 clear: NBD_STATE_DEPTH_UNALLOC, means the extent is unallocated
>> +    bit 0 set: NBD_STATE_DEPTH_LOCAL, the extent is allocated in this image
>> +    bit 1 set: NBD_STATE_DEPTH_BACKING, the extent is inherited from a
>> +               backing layer
> 
>  From the cover description I imagined it would show the actual depth, ie:
> 
>           top -> backing -> backing -> backing
>   depth:   1        2         3   ....          (0 = unallocated)
> 
> I wonder if that is possible?  (Perhaps there's something I don't
> understand here.)
> 
That's possible if we want it. Probably the best way is to add *depth parameter to bdrv_is_allocated_above (and better on top of my "[PATCH v7 0/5] fix & merge block_status_above and is_allocated_above")


-- 
Best regards,
Vladimir


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 3/3] nbd: Add 'qemu-nbd -A' to expose allocation depth
  2020-09-25 20:32 ` [PATCH 3/3] nbd: Add 'qemu-nbd -A' to expose allocation depth Eric Blake
  2020-09-26  7:34   ` Richard W.M. Jones
@ 2020-09-26 13:32   ` Vladimir Sementsov-Ogievskiy
  2020-09-28 14:35     ` Eric Blake
  1 sibling, 1 reply; 14+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-09-26 13:32 UTC (permalink / raw)
  To: Eric Blake, qemu-devel
  Cc: qemu-block, pkrempa, rjones, Kevin Wolf, Max Reitz, Markus Armbruster

25.09.2020 23:32, Eric Blake wrote:
> Allow the server to expose an additional metacontext to be requested
> by savvy clients.  qemu-nbd adds a new option -A to expose the
> qemu:allocation-depth metacontext through NBD_CMD_BLOCK_STATUS; this
> can also be set via QMP when using nbd-server-add.
> 
> qemu as client can be hacked into viewing this new context by using
> the now-misnamed x-dirty-bitmap option when creating an NBD blockdev;

may be rename it to x-block-status ?

> although it is worth noting the decoding of how such context
> information will appear in 'qemu-img map --output=json':
> 
> NBD_STATE_DEPTH_UNALLOC => "zero":false, "data":true
> NBD_STATE_DEPTH_LOCAL => "zero":false, "data":false
> NBD_STATE_DEPTH_BACKING => "zero":true, "data":true

It wouldn't be so simple if we decide to export exact depth number..


-- 
Best regards,
Vladimir


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/3] nbd: Simplify meta-context parsing
  2020-09-26 12:58   ` Vladimir Sementsov-Ogievskiy
@ 2020-09-28 14:05     ` Eric Blake
  0 siblings, 0 replies; 14+ messages in thread
From: Eric Blake @ 2020-09-28 14:05 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-devel; +Cc: pkrempa, rjones, qemu-block

On 9/26/20 7:58 AM, Vladimir Sementsov-Ogievskiy wrote:
> 25.09.2020 23:32, Eric Blake wrote:
>> We had a premature optimization of trying to read as little from the
>> wire as possible while handling NBD_OPT_SET_META_CONTEXT in phases.
>> But in reality, we HAVE to read the entire string from the client
>> before we can get to the next command, and it is easier to just read
>> it all at once than it is to read it in pieces.  And once we do that,
>> several functions end up no longer performing I/O, and no longer need
>> to return a value.
>>
>> While simplifying this, take advantage of g_str_has_prefix for less
>> repetition of boilerplate string length computation.
>>
>> Our iotests still pass; I also checked that libnbd's testsuite (which
>> covers more corner cases of odd meta context requests) still passes.
>>
>> Signed-off-by: Eric Blake <eblake@redhat.com>
>> ---

>>
>> -/* Read strlen(@pattern) bytes, and set @match to true if they match 
>> @pattern.
>> - * @match is never set to false.
>> - *
>> - * Return -errno on I/O error, 0 if option was completely handled by
>> - * sending a reply about inconsistent lengths, or 1 on success.
>> - *
>> - * Note: return code = 1 doesn't mean that we've read exactly @pattern.
>> - * It only means that there are no errors.
>> +
>> +/*
>> + * Check @ns with @len bytes, and set @match to true if it matches 
>> @pattern,
>> + * or if @len is 0 and the client is performing _LIST_. @match is 
>> never set
>> + * to false.
>>    */
>> -static int nbd_meta_pattern(NBDClient *client, const char *pattern, 
>> bool *match,
>> -                            Error **errp)
>> +static void nbd_meta_empty_or_pattern(NBDClient *client, const char 
>> *pattern,
>> +                                      const char *ns, uint32_t len,
> 
> ns changed its meaning, it's not just a namespace, but the whole query. 
> I think, better to rename it.

Sure, that makes sense.

> 
> Also, it's unusual to pass length together with nul-terminated string, 
> it seems redundant.
> And, it's used only to compare with zero, strlen(ns) == 0 or ns[0] == 0 
> is not slower.

Good point.

> 
> Also, errp is unused argument. And it violate Error API recommendation 
> to not create void functions with errp.

Another simplification.  Looks like I'll be spinning v2.

> 
> Also we can use bool return instead of return through pointer.

That changes the callers, but it's not necessarily a bad thing; whereas 
we were doing:

if (cond) {
     nbd_meta_pattern(..., &match, ...);
}

which either leaves match unchanged or sets it to true, we could instead do:

if (nbd_meta_pattern(...)) {
     match = true;
}

My only worry is that the more changes I make, the harder the patch is 
to read.  I don't know if it's worth breaking it into steps, or just 
doing all the simplifications in one blow.


>> +static void nbd_meta_base_query(NBDClient *client, 
>> NBDExportMetaContexts *meta,
>> +                                const char *ns, uint32_t len, Error 
>> **errp)
>>   {
>> -    return nbd_meta_empty_or_pattern(client, "allocation", len,
>> -                                     &meta->base_allocation, errp);
>> +    nbd_meta_empty_or_pattern(client, "allocation", ns + 5, len - 5,
> 
> This one is correct, but would be simpler, if only subquery (after 
> colon) passed here.
> (Really, no reason to pass 'base:' to _base_ func)
> 
> Hmm, I see that it gives a possibility to use 
> meta->exp->export_bitmap_context directly.

Yeah, that's where it got interesting.  A direct strcmp() is nice, but 
if we strip a prefix in one place, we have to strip it in the other.  I 
could go either way.


>> + * @len is the length of @ns, including the `qemu:' prefix.
>> + */
>> +static void nbd_meta_qemu_query(NBDClient *client, 
>> NBDExportMetaContexts *meta,
>> +                                const char *ns, uint32_t len, Error 
>> **errp)
>>   {
>> -    bool dirty_bitmap = false;
>> -    size_t dirty_bitmap_len = strlen("dirty-bitmap:");
>> -    int ret;
>> -
>>       if (!meta->exp->export_bitmap) {
>>           trace_nbd_negotiate_meta_query_skip("no dirty-bitmap 
>> exported");
>> -        return nbd_opt_skip(client, len, errp);
>> -    }
>> -
>> -    if (len == 0) {
>> +    } else if (len == 0) {
> 
> s/0/5/ ?

Oh, good catch.  Using 0 is an unintended logic change, but none of our 
iotests caught it, so we're also missing test coverage.

I'm working on adding 'nbd_opt_list_meta_context()' to libnbd, which 
will add testsuite coverage of this.


>> +    } else if (!g_str_has_prefix(ns + 5, "dirty-bitmap:")) {
>>           trace_nbd_negotiate_meta_query_skip("not dirty-bitmap:");
>> -        return nbd_opt_skip(client, len, errp);
>> +    } else {
>> +        trace_nbd_negotiate_meta_query_parse("dirty-bitmap:");
>> +        nbd_meta_empty_or_pattern(client, 
>> meta->exp->export_bitmap_context,
>> +                                  ns, len, &meta->bitmap, errp);
> 
> hmm. _empty_ is impossible here, as ns includes "qemu:dirty-bitmap"..

Ultimately, we want something like:

if starts with "base:":
   if nothing else:
     if list mode:
       return "base:allocation"
     else:
       return empty list
   else if "base:allocation":
     return "base:allocation"
   else
     return empty list
else if starts with "qemu:":
   if nothing else:
     if list mode:
       return all possible sub-qemu contexts
     else:
       return empty list
   else if starts with "qemu:dirty-bitmap:":
     if nothing else:
       if list mode:
         return all possible dirty-bitmap contexts (right now, just one,
             but I want to allow an array of bitmaps in the future)
       else:
         return empty list
     else if exact match:
       return that bitmap
     else:
       return empty list
   else if "qemu:allocation-depth" (from next patch):
     return "allocation-depth"
   else:
     return empty list
else:
   return empty list

Maybe some function renames will help.

>>       if (ret <= 0) {
>>           return ret;
>>       }
>> +    ns[len] = '\0';
>> +    if (strlen(ns) != len) {
>> +        return nbd_opt_invalid(client, errp,
>> +                               "Embedded NUL in query for option %s",
>> +                               nbd_opt_lookup(client->opt));
>> +    }
> 
> Hmm, that's a new good check. Intersting, is it specified in NBD 
> protocol, that
> NUL shouldn't be in a string.. Probably it can be a separate patch, with
> new nbd_opt_string_read, which will check this thing. But I'm OK with it 
> as is
> in this patch.

I'll separate it.

> To avoid all this pointer arithmetic, what about something like this (I 
> didn't check, just an idea):
> 

I'll see what parts of that I can use in v2.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org



^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 2/3] nbd: Add new qemu:allocation-depth metacontext
  2020-09-26  7:33   ` Richard W.M. Jones
  2020-09-26 13:19     ` Vladimir Sementsov-Ogievskiy
@ 2020-09-28 14:33     ` Eric Blake
  2020-09-28 16:17       ` Richard W.M. Jones
  1 sibling, 1 reply; 14+ messages in thread
From: Eric Blake @ 2020-09-28 14:33 UTC (permalink / raw)
  To: Richard W.M. Jones
  Cc: Kevin Wolf, pkrempa, qemu-block, Markus Armbruster, qemu-devel,
	Max Reitz, vsementsov

On 9/26/20 2:33 AM, Richard W.M. Jones wrote:
> On Fri, Sep 25, 2020 at 03:32:48PM -0500, Eric Blake wrote:
>> +The second is related to exposing the source of various extents within
>> +the image, with a single context named:
>> +
>> +    qemu:allocation-depth
>> +
>> +In the allocation depth context, bits 0 and 1 form a tri-state value:
>> +
>> +    bits 0-1 clear: NBD_STATE_DEPTH_UNALLOC, means the extent is unallocated
>> +    bit 0 set: NBD_STATE_DEPTH_LOCAL, the extent is allocated in this image
>> +    bit 1 set: NBD_STATE_DEPTH_BACKING, the extent is inherited from a
>> +               backing layer
> 
>>From the cover description I imagined it would show the actual depth, ie:
> 
>           top -> backing -> backing -> backing
>   depth:   1        2         3   ....          (0 = unallocated)
> 
> I wonder if that is possible?  (Perhaps there's something I don't
> understand here.)

The real reason I don't want to do a straight depth number is that 
'qemu-img map' combined with x-dirty-bitmap is still a very convenient 
way to get at bits 0 and 1 (even if it requires decoding).  But if we 
plumb in a way for bdrv_get_status to return depth counts (rather than 
reimplementing the depth count ourselves), I would have no problem with 
returning a struct:

bits 31-4: the depth of the chain
bits 3-2: reserved (to make reading hex values easier...)
bits 1-0: tri-state of unalloc, local, or backing

where it would look like:

0x0000 -> unallocated
0x0011 -> depth 1, local
0x0022 -> depth 2, from the first backing layer
0x0032 -> depth 3, from the second backing layer
0x0042 ...

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org



^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 3/3] nbd: Add 'qemu-nbd -A' to expose allocation depth
  2020-09-26 13:32   ` Vladimir Sementsov-Ogievskiy
@ 2020-09-28 14:35     ` Eric Blake
  0 siblings, 0 replies; 14+ messages in thread
From: Eric Blake @ 2020-09-28 14:35 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-devel
  Cc: Kevin Wolf, pkrempa, qemu-block, Markus Armbruster, rjones, Max Reitz

On 9/26/20 8:32 AM, Vladimir Sementsov-Ogievskiy wrote:
> 25.09.2020 23:32, Eric Blake wrote:
>> Allow the server to expose an additional metacontext to be requested
>> by savvy clients.  qemu-nbd adds a new option -A to expose the
>> qemu:allocation-depth metacontext through NBD_CMD_BLOCK_STATUS; this
>> can also be set via QMP when using nbd-server-add.
>>
>> qemu as client can be hacked into viewing this new context by using
>> the now-misnamed x-dirty-bitmap option when creating an NBD blockdev;
> 
> may be rename it to x-block-status ?

I thought about it.  We don't promise any back-compat with x- prefix, 
but at the same time, if there's not a strong reason to change it, 
scripts written against older qemu will not break as long as we still 
have the hack.

> 
>> although it is worth noting the decoding of how such context
>> information will appear in 'qemu-img map --output=json':
>>
>> NBD_STATE_DEPTH_UNALLOC => "zero":false, "data":true
>> NBD_STATE_DEPTH_LOCAL => "zero":false, "data":false
>> NBD_STATE_DEPTH_BACKING => "zero":true, "data":true
> 
> It wouldn't be so simple if we decide to export exact depth number..

It's still simple to do tri-state results if we separate the exact depth 
number to bits 31-4, leaving bits 1-0 as the tri-state summary...

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org



^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 2/3] nbd: Add new qemu:allocation-depth metacontext
  2020-09-28 14:33     ` Eric Blake
@ 2020-09-28 16:17       ` Richard W.M. Jones
  0 siblings, 0 replies; 14+ messages in thread
From: Richard W.M. Jones @ 2020-09-28 16:17 UTC (permalink / raw)
  To: Eric Blake
  Cc: Kevin Wolf, pkrempa, qemu-block, Markus Armbruster, qemu-devel,
	Max Reitz, vsementsov

On Mon, Sep 28, 2020 at 09:33:22AM -0500, Eric Blake wrote:
> On 9/26/20 2:33 AM, Richard W.M. Jones wrote:
> >On Fri, Sep 25, 2020 at 03:32:48PM -0500, Eric Blake wrote:
> >>+The second is related to exposing the source of various extents within
> >>+the image, with a single context named:
> >>+
> >>+    qemu:allocation-depth
> >>+
> >>+In the allocation depth context, bits 0 and 1 form a tri-state value:
> >>+
> >>+    bits 0-1 clear: NBD_STATE_DEPTH_UNALLOC, means the extent is unallocated
> >>+    bit 0 set: NBD_STATE_DEPTH_LOCAL, the extent is allocated in this image
> >>+    bit 1 set: NBD_STATE_DEPTH_BACKING, the extent is inherited from a
> >>+               backing layer
> >
> >>From the cover description I imagined it would show the actual depth, ie:
> >
> >          top -> backing -> backing -> backing
> >  depth:   1        2         3   ....          (0 = unallocated)
> >
> >I wonder if that is possible?  (Perhaps there's something I don't
> >understand here.)
> 
> The real reason I don't want to do a straight depth number is that
> 'qemu-img map' combined with x-dirty-bitmap is still a very
> convenient way to get at bits 0 and 1 (even if it requires
> decoding).  But if we plumb in a way for bdrv_get_status to return
> depth counts (rather than reimplementing the depth count ourselves),
> I would have no problem with returning a struct:
> 
> bits 31-4: the depth of the chain
> bits 3-2: reserved (to make reading hex values easier...)
> bits 1-0: tri-state of unalloc, local, or backing
> 
> where it would look like:
> 
> 0x0000 -> unallocated
> 0x0011 -> depth 1, local
> 0x0022 -> depth 2, from the first backing layer
> 0x0032 -> depth 3, from the second backing layer
> 0x0042 ...

This looks nice too.  However I was only bikeshedding so if any of
this is hard to do then don't worry too much.

Would like to add support for this map to nbdinfo too :-)

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-p2v converts physical machines to virtual machines.  Boot with a
live CD or over the network (PXE) and turn machines into KVM guests.
http://libguestfs.org/virt-v2v



^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2020-09-28 16:19 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-09-25 20:32 [PATCH 0/3] Exposing backing-chain allocation over NBD Eric Blake
2020-09-25 20:32 ` [PATCH 1/3] nbd: Simplify meta-context parsing Eric Blake
2020-09-26 12:58   ` Vladimir Sementsov-Ogievskiy
2020-09-28 14:05     ` Eric Blake
2020-09-25 20:32 ` [PATCH 2/3] nbd: Add new qemu:allocation-depth metacontext Eric Blake
2020-09-26  7:33   ` Richard W.M. Jones
2020-09-26 13:19     ` Vladimir Sementsov-Ogievskiy
2020-09-28 14:33     ` Eric Blake
2020-09-28 16:17       ` Richard W.M. Jones
2020-09-26 13:15   ` Vladimir Sementsov-Ogievskiy
2020-09-25 20:32 ` [PATCH 3/3] nbd: Add 'qemu-nbd -A' to expose allocation depth Eric Blake
2020-09-26  7:34   ` Richard W.M. Jones
2020-09-26 13:32   ` Vladimir Sementsov-Ogievskiy
2020-09-28 14:35     ` Eric Blake

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.