All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2 0/5] Exposing backing-chain allocation over NBD
@ 2020-09-30 12:11 Eric Blake
  2020-09-30 12:11 ` [PATCH v2 1/5] qemu-nbd: Honor SIGINT and SIGHUP Eric Blake
                   ` (4 more replies)
  0 siblings, 5 replies; 16+ messages in thread
From: Eric Blake @ 2020-09-30 12:11 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, pkrempa, qemu-block, rjones, vsementsov, stefanha

v1 was here: https://lists.gnu.org/archive/html/qemu-devel/2020-09/msg09623.htm
l

Based-on: <20200924152717.287415-1-kwolf@redhat.com>
(block/export: Add infrastructure and QAPI for block exports)

Also available at:
https://repo.or.cz/qemu/ericb.git/shortlog/refs/tags/nbd-alloc-depth-v2

Since then:
- rebase on Kevin's work
- add new patch to fix qemu-nbd SIGINT (conflicts with Stefan's work,
we can either rebase his on mine or drop mine if his goes in first)
- split out fix for handling NUL bytes from client [Vladimir]
- further cleanups of query parsing [Vladimir]
- more documentation of how we could also expose actual depth in
remaining bits of the context reply [Rich]

001/5:[down] 'qemu-nbd: Honor SIGINT and SIGHUP'
002/5:[down] 'nbd/server: Reject embedded NUL in NBD strings'
003/5:[0139] [FC] 'nbd: Simplify meta-context parsing'
004/5:[0035] [FC] 'nbd: Add new qemu:allocation-depth metacontext'
005/5:[0038] [FC] 'nbd: Add 'qemu-nbd -A' to expose allocation depth'

Eric Blake (5):
  qemu-nbd: Honor SIGINT and SIGHUP
  nbd/server: Reject embedded NUL in NBD strings
  nbd: Simplify meta-context parsing
  nbd: Add new qemu:allocation-depth metacontext
  nbd: Add 'qemu-nbd -A' to expose allocation depth

 docs/interop/nbd.txt       |  22 ++-
 docs/tools/qemu-nbd.rst    |   6 +
 qapi/block-core.json       |   7 +-
 qapi/block-export.json     |   6 +-
 include/block/nbd.h        |   8 +-
 blockdev-nbd.c             |   2 +
 nbd/server.c               | 324 +++++++++++++++++++++----------------
 qemu-nbd.c                 |  20 ++-
 tests/qemu-iotests/309     |  73 +++++++++
 tests/qemu-iotests/309.out |  22 +++
 tests/qemu-iotests/group   |   1 +
 11 files changed, 342 insertions(+), 149 deletions(-)
 create mode 100755 tests/qemu-iotests/309
 create mode 100644 tests/qemu-iotests/309.out

-- 
2.28.0



^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH v2 1/5] qemu-nbd: Honor SIGINT and SIGHUP
  2020-09-30 12:11 [PATCH v2 0/5] Exposing backing-chain allocation over NBD Eric Blake
@ 2020-09-30 12:11 ` Eric Blake
  2020-10-07 10:32   ` Vladimir Sementsov-Ogievskiy
  2020-09-30 12:11 ` [PATCH v2 2/5] nbd/server: Reject embedded NUL in NBD strings Eric Blake
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 16+ messages in thread
From: Eric Blake @ 2020-09-30 12:11 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, pkrempa, qemu-block, rjones, vsementsov, stefanha

Honoring just SIGTERM on Linux is too weak; we also want to handle
other common signals, and do so even on BSD.  Why?  Because at least
'qemu-nbd -B bitmap' needs a chance to clean up the in-use bit on
bitmaps when the server is shut down via a signal.

See also: http://bugzilla.redhat.com/1883608

Signed-off-by: Eric Blake <eblake@redhat.com>
---
 qemu-nbd.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/qemu-nbd.c b/qemu-nbd.c
index bacb69b0898b..e7520261134f 100644
--- a/qemu-nbd.c
+++ b/qemu-nbd.c
@@ -581,7 +581,7 @@ int main(int argc, char **argv)
     const char *pid_file_name = NULL;
     BlockExportOptions *export_opts;

-#if HAVE_NBD_DEVICE
+#ifdef CONFIG_POSIX
     /* The client thread uses SIGTERM to interrupt the server.  A signal
      * handler ensures that "qemu-nbd -v -c" exits with a nice status code.
      */
@@ -589,9 +589,9 @@ int main(int argc, char **argv)
     memset(&sa_sigterm, 0, sizeof(sa_sigterm));
     sa_sigterm.sa_handler = termsig_handler;
     sigaction(SIGTERM, &sa_sigterm, NULL);
-#endif /* HAVE_NBD_DEVICE */
+    sigaction(SIGINT, &sa_sigterm, NULL);
+    sigaction(SIGHUP, &sa_sigterm, NULL);

-#ifdef CONFIG_POSIX
     signal(SIGPIPE, SIG_IGN);
 #endif

-- 
2.28.0



^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v2 2/5] nbd/server: Reject embedded NUL in NBD strings
  2020-09-30 12:11 [PATCH v2 0/5] Exposing backing-chain allocation over NBD Eric Blake
  2020-09-30 12:11 ` [PATCH v2 1/5] qemu-nbd: Honor SIGINT and SIGHUP Eric Blake
@ 2020-09-30 12:11 ` Eric Blake
  2020-10-07 10:39   ` Vladimir Sementsov-Ogievskiy
  2020-09-30 12:11 ` [PATCH v2 3/5] nbd: Simplify meta-context parsing Eric Blake
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 16+ messages in thread
From: Eric Blake @ 2020-09-30 12:11 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, pkrempa, qemu-block, rjones, vsementsov, stefanha

The NBD spec is clear that any string sent from the client must not
contain embedded NUL characters.  If the client passes "a\0", we
should reject that option request rather than act on "a".

Testing this is not possible with a compliant client, but I was able
to use gdb to coerce libnbd into temporarily behaving as such a
client.

Signed-off-by: Eric Blake <eblake@redhat.com>
---
 nbd/server.c | 30 ++++++++++++++++++++----------
 1 file changed, 20 insertions(+), 10 deletions(-)

diff --git a/nbd/server.c b/nbd/server.c
index f74766add7b7..809f88ce6607 100644
--- a/nbd/server.c
+++ b/nbd/server.c
@@ -301,10 +301,11 @@ nbd_opt_invalid(NBDClient *client, Error **errp, const char *fmt, ...)
 }

 /* Read size bytes from the unparsed payload of the current option.
+ * If @check_nul, require that no NUL bytes appear in buffer.
  * Return -errno on I/O error, 0 if option was completely handled by
  * sending a reply about inconsistent lengths, or 1 on success. */
 static int nbd_opt_read(NBDClient *client, void *buffer, size_t size,
-                        Error **errp)
+                        bool check_nul, Error **errp)
 {
     if (size > client->optlen) {
         return nbd_opt_invalid(client, errp,
@@ -312,7 +313,16 @@ static int nbd_opt_read(NBDClient *client, void *buffer, size_t size,
                                nbd_opt_lookup(client->opt));
     }
     client->optlen -= size;
-    return qio_channel_read_all(client->ioc, buffer, size, errp) < 0 ? -EIO : 1;
+    if (qio_channel_read_all(client->ioc, buffer, size, errp) < 0) {
+        return -EIO;
+    }
+
+    if (check_nul && strnlen(buffer, size) != size) {
+        return nbd_opt_invalid(client, errp,
+                               "Unexpected embedded NUL in option %s",
+                               nbd_opt_lookup(client->opt));
+    }
+    return 1;
 }

 /* Drop size bytes from the unparsed payload of the current option.
@@ -349,7 +359,7 @@ static int nbd_opt_read_name(NBDClient *client, char **name, uint32_t *length,
     g_autofree char *local_name = NULL;

     *name = NULL;
-    ret = nbd_opt_read(client, &len, sizeof(len), errp);
+    ret = nbd_opt_read(client, &len, sizeof(len), false, errp);
     if (ret <= 0) {
         return ret;
     }
@@ -361,7 +371,7 @@ static int nbd_opt_read_name(NBDClient *client, char **name, uint32_t *length,
     }

     local_name = g_malloc(len + 1);
-    ret = nbd_opt_read(client, local_name, len, errp);
+    ret = nbd_opt_read(client, local_name, len, true, errp);
     if (ret <= 0) {
         return ret;
     }
@@ -576,14 +586,14 @@ static int nbd_negotiate_handle_info(NBDClient *client, Error **errp)
     }
     trace_nbd_negotiate_handle_export_name_request(name);

-    rc = nbd_opt_read(client, &requests, sizeof(requests), errp);
+    rc = nbd_opt_read(client, &requests, sizeof(requests), false, errp);
     if (rc <= 0) {
         return rc;
     }
     requests = be16_to_cpu(requests);
     trace_nbd_negotiate_handle_info_requests(requests);
     while (requests--) {
-        rc = nbd_opt_read(client, &request, sizeof(request), errp);
+        rc = nbd_opt_read(client, &request, sizeof(request), false, errp);
         if (rc <= 0) {
             return rc;
         }
@@ -806,7 +816,7 @@ static int nbd_meta_pattern(NBDClient *client, const char *pattern, bool *match,
     assert(len);

     query = g_malloc(len);
-    ret = nbd_opt_read(client, query, len, errp);
+    ret = nbd_opt_read(client, query, len, true, errp);
     if (ret <= 0) {
         g_free(query);
         return ret;
@@ -943,7 +953,7 @@ static int nbd_negotiate_meta_query(NBDClient *client,
     char ns[5];
     uint32_t len;

-    ret = nbd_opt_read(client, &len, sizeof(len), errp);
+    ret = nbd_opt_read(client, &len, sizeof(len), false, errp);
     if (ret <= 0) {
         return ret;
     }
@@ -959,7 +969,7 @@ static int nbd_negotiate_meta_query(NBDClient *client,
     }

     len -= ns_len;
-    ret = nbd_opt_read(client, ns, ns_len, errp);
+    ret = nbd_opt_read(client, ns, ns_len, true, errp);
     if (ret <= 0) {
         return ret;
     }
@@ -1016,7 +1026,7 @@ static int nbd_negotiate_meta_queries(NBDClient *client,
                             "export '%s' not present", sane_name);
     }

-    ret = nbd_opt_read(client, &nb_queries, sizeof(nb_queries), errp);
+    ret = nbd_opt_read(client, &nb_queries, sizeof(nb_queries), false, errp);
     if (ret <= 0) {
         return ret;
     }
-- 
2.28.0



^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v2 3/5] nbd: Simplify meta-context parsing
  2020-09-30 12:11 [PATCH v2 0/5] Exposing backing-chain allocation over NBD Eric Blake
  2020-09-30 12:11 ` [PATCH v2 1/5] qemu-nbd: Honor SIGINT and SIGHUP Eric Blake
  2020-09-30 12:11 ` [PATCH v2 2/5] nbd/server: Reject embedded NUL in NBD strings Eric Blake
@ 2020-09-30 12:11 ` Eric Blake
  2020-10-07 11:51   ` Vladimir Sementsov-Ogievskiy
  2020-09-30 12:11 ` [PATCH v2 4/5] nbd: Add new qemu:allocation-depth metacontext Eric Blake
  2020-09-30 12:11 ` [PATCH v2 5/5] nbd: Add 'qemu-nbd -A' to expose allocation depth Eric Blake
  4 siblings, 1 reply; 16+ messages in thread
From: Eric Blake @ 2020-09-30 12:11 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, pkrempa, qemu-block, rjones, vsementsov, stefanha

We had a premature optimization of trying to read as little from the
wire as possible while handling NBD_OPT_SET_META_CONTEXT in phases.
But in reality, we HAVE to read the entire string from the client
before we can get to the next command, and it is easier to just read
it all at once than it is to read it in pieces.  And once we do that,
several functions end up no longer performing I/O, so they can drop
length and errp parameters, and just return a bool instead of
modifying through a pointer.

Our iotests still pass; I also checked that libnbd's testsuite (which
covers more corner cases of odd meta context requests) still passes.

Signed-off-by: Eric Blake <eblake@redhat.com>
---
 nbd/server.c | 193 +++++++++++++++++++--------------------------------
 1 file changed, 70 insertions(+), 123 deletions(-)

diff --git a/nbd/server.c b/nbd/server.c
index 809f88ce6607..7271a09b5c2b 100644
--- a/nbd/server.c
+++ b/nbd/server.c
@@ -1,5 +1,5 @@
 /*
- *  Copyright (C) 2016-2018 Red Hat, Inc.
+ *  Copyright (C) 2016-2020 Red Hat, Inc.
  *  Copyright (C) 2005  Anthony Liguori <anthony@codemonkey.ws>
  *
  *  Network Block Device Server Side
@@ -797,135 +797,95 @@ static int nbd_negotiate_send_meta_context(NBDClient *client,
     return qio_channel_writev_all(client->ioc, iov, 2, errp) < 0 ? -EIO : 0;
 }

-/* Read strlen(@pattern) bytes, and set @match to true if they match @pattern.
- * @match is never set to false.
- *
- * Return -errno on I/O error, 0 if option was completely handled by
- * sending a reply about inconsistent lengths, or 1 on success.
- *
- * Note: return code = 1 doesn't mean that we've read exactly @pattern.
- * It only means that there are no errors.
+/*
+ * Return true if @query matches @pattern, or if @query is empty when
+ * the @client is performing _LIST_.
  */
-static int nbd_meta_pattern(NBDClient *client, const char *pattern, bool *match,
-                            Error **errp)
+static bool nbd_meta_empty_or_pattern(NBDClient *client, const char *pattern,
+                                      const char *query)
 {
-    int ret;
-    char *query;
-    size_t len = strlen(pattern);
-
-    assert(len);
-
-    query = g_malloc(len);
-    ret = nbd_opt_read(client, query, len, true, errp);
-    if (ret <= 0) {
-        g_free(query);
-        return ret;
+    if (!*query) {
+        trace_nbd_negotiate_meta_query_parse("empty");
+        return client->opt == NBD_OPT_LIST_META_CONTEXT;
     }
-
-    if (strncmp(query, pattern, len) == 0) {
+    if (strcmp(query, pattern) == 0) {
         trace_nbd_negotiate_meta_query_parse(pattern);
-        *match = true;
-    } else {
-        trace_nbd_negotiate_meta_query_skip("pattern not matched");
+        return true;
     }
-    g_free(query);
-
-    return 1;
+    trace_nbd_negotiate_meta_query_skip("pattern not matched");
+    return false;
 }

 /*
- * Read @len bytes, and set @match to true if they match @pattern, or if @len
- * is 0 and the client is performing _LIST_. @match is never set to false.
- *
- * Return -errno on I/O error, 0 if option was completely handled by
- * sending a reply about inconsistent lengths, or 1 on success.
- *
- * Note: return code = 1 doesn't mean that we've read exactly @pattern.
- * It only means that there are no errors.
+ * Return true and adjust @str in place if it begins with @prefix.
  */
-static int nbd_meta_empty_or_pattern(NBDClient *client, const char *pattern,
-                                     uint32_t len, bool *match, Error **errp)
+static bool nbd_strshift(const char **str, const char *prefix)
 {
-    if (len == 0) {
-        if (client->opt == NBD_OPT_LIST_META_CONTEXT) {
-            *match = true;
-        }
-        trace_nbd_negotiate_meta_query_parse("empty");
-        return 1;
-    }
+    size_t len = strlen(prefix);

-    if (len != strlen(pattern)) {
-        trace_nbd_negotiate_meta_query_skip("different lengths");
-        return nbd_opt_skip(client, len, errp);
+    if (strncmp(*str, prefix, len) == 0) {
+        *str += len;
+        return true;
     }
-
-    return nbd_meta_pattern(client, pattern, match, errp);
+    return false;
 }

 /* nbd_meta_base_query
  *
  * Handle queries to 'base' namespace. For now, only the base:allocation
- * context is available.  'len' is the amount of text remaining to be read from
- * the current name, after the 'base:' portion has been stripped.
- *
- * Return -errno on I/O error, 0 if option was completely handled by
- * sending a reply about inconsistent lengths, or 1 on success.
+ * context is available.  Return true if @query has been handled.
  */
-static int nbd_meta_base_query(NBDClient *client, NBDExportMetaContexts *meta,
-                               uint32_t len, Error **errp)
+static bool nbd_meta_base_query(NBDClient *client, NBDExportMetaContexts *meta,
+                                const char *query)
 {
-    return nbd_meta_empty_or_pattern(client, "allocation", len,
-                                     &meta->base_allocation, errp);
+    if (!nbd_strshift(&query, "base:")) {
+        return false;
+    }
+    trace_nbd_negotiate_meta_query_parse("base:");
+
+    if (nbd_meta_empty_or_pattern(client, "allocation", query)) {
+        meta->base_allocation = true;
+    }
+    return true;
 }

-/* nbd_meta_bitmap_query
+/* nbd_meta_qemu_query
  *
- * Handle query to 'qemu:' namespace.
- * @len is the amount of text remaining to be read from the current name, after
- * the 'qemu:' portion has been stripped.
- *
- * Return -errno on I/O error, 0 if option was completely handled by
- * sending a reply about inconsistent lengths, or 1 on success. */
-static int nbd_meta_qemu_query(NBDClient *client, NBDExportMetaContexts *meta,
-                               uint32_t len, Error **errp)
+ * Handle queries to 'qemu' namespace. For now, only the qemu:dirty-bitmap:
+ * context is available.  Return true if @query has been handled.
+ */
+static bool nbd_meta_qemu_query(NBDClient *client, NBDExportMetaContexts *meta,
+                                const char *query)
 {
-    bool dirty_bitmap = false;
-    size_t dirty_bitmap_len = strlen("dirty-bitmap:");
-    int ret;
-
-    if (!meta->exp->export_bitmap) {
-        trace_nbd_negotiate_meta_query_skip("no dirty-bitmap exported");
-        return nbd_opt_skip(client, len, errp);
+    if (!nbd_strshift(&query, "qemu:")) {
+        return false;
     }
+    trace_nbd_negotiate_meta_query_parse("qemu:");

-    if (len == 0) {
+    if (!*query) {
         if (client->opt == NBD_OPT_LIST_META_CONTEXT) {
+            meta->bitmap = !!meta->exp->export_bitmap;
+        }
+        trace_nbd_negotiate_meta_query_parse("empty");
+        return true;
+    }
+
+    if (nbd_strshift(&query, "dirty-bitmap:")) {
+        trace_nbd_negotiate_meta_query_parse("dirty-bitmap:");
+        if (!meta->exp->export_bitmap) {
+            trace_nbd_negotiate_meta_query_skip("no dirty-bitmap exported");
+            return true;
+        }
+        if (nbd_meta_empty_or_pattern(client,
+                                      meta->exp->export_bitmap_context +
+                                      strlen("qemu:dirty-bitmap:"), query)) {
             meta->bitmap = true;
         }
-        trace_nbd_negotiate_meta_query_parse("empty");
-        return 1;
+        return true;
     }

-    if (len < dirty_bitmap_len) {
-        trace_nbd_negotiate_meta_query_skip("not dirty-bitmap:");
-        return nbd_opt_skip(client, len, errp);
-    }
-
-    len -= dirty_bitmap_len;
-    ret = nbd_meta_pattern(client, "dirty-bitmap:", &dirty_bitmap, errp);
-    if (ret <= 0) {
-        return ret;
-    }
-    if (!dirty_bitmap) {
-        trace_nbd_negotiate_meta_query_skip("not dirty-bitmap:");
-        return nbd_opt_skip(client, len, errp);
-    }
-
-    trace_nbd_negotiate_meta_query_parse("dirty-bitmap:");
-
-    return nbd_meta_empty_or_pattern(
-            client, meta->exp->export_bitmap_context +
-            strlen("qemu:dirty_bitmap:"), len, &meta->bitmap, errp);
+    trace_nbd_negotiate_meta_query_skip("not dirty-bitmap");
+    return true;
 }

 /* nbd_negotiate_meta_query
@@ -935,22 +895,13 @@ static int nbd_meta_qemu_query(NBDClient *client, NBDExportMetaContexts *meta,
  *
  * The only supported namespaces are 'base' and 'qemu'.
  *
- * The function aims not wasting time and memory to read long unknown namespace
- * names.
- *
  * Return -errno on I/O error, 0 if option was completely handled by
  * sending a reply about inconsistent lengths, or 1 on success. */
 static int nbd_negotiate_meta_query(NBDClient *client,
                                     NBDExportMetaContexts *meta, Error **errp)
 {
-    /*
-     * Both 'qemu' and 'base' namespaces have length = 5 including a
-     * colon. If another length namespace is later introduced, this
-     * should certainly be refactored.
-     */
     int ret;
-    size_t ns_len = 5;
-    char ns[5];
+    g_autofree char *query = NULL;
     uint32_t len;

     ret = nbd_opt_read(client, &len, sizeof(len), false, errp);
@@ -963,27 +914,23 @@ static int nbd_negotiate_meta_query(NBDClient *client,
         trace_nbd_negotiate_meta_query_skip("length too long");
         return nbd_opt_skip(client, len, errp);
     }
-    if (len < ns_len) {
-        trace_nbd_negotiate_meta_query_skip("length too short");
-        return nbd_opt_skip(client, len, errp);
-    }

-    len -= ns_len;
-    ret = nbd_opt_read(client, ns, ns_len, true, errp);
+    query = g_malloc(len + 1);
+    ret = nbd_opt_read(client, query, len, true, errp);
     if (ret <= 0) {
         return ret;
     }
+    query[len] = '\0';

-    if (!strncmp(ns, "base:", ns_len)) {
-        trace_nbd_negotiate_meta_query_parse("base:");
-        return nbd_meta_base_query(client, meta, len, errp);
-    } else if (!strncmp(ns, "qemu:", ns_len)) {
-        trace_nbd_negotiate_meta_query_parse("qemu:");
-        return nbd_meta_qemu_query(client, meta, len, errp);
+    if (nbd_meta_base_query(client, meta, query)) {
+        return 1;
+    }
+    if (nbd_meta_qemu_query(client, meta, query)) {
+        return 1;
     }

     trace_nbd_negotiate_meta_query_skip("unknown namespace");
-    return nbd_opt_skip(client, len, errp);
+    return 1;
 }

 /* nbd_negotiate_meta_queries
-- 
2.28.0



^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v2 4/5] nbd: Add new qemu:allocation-depth metacontext
  2020-09-30 12:11 [PATCH v2 0/5] Exposing backing-chain allocation over NBD Eric Blake
                   ` (2 preceding siblings ...)
  2020-09-30 12:11 ` [PATCH v2 3/5] nbd: Simplify meta-context parsing Eric Blake
@ 2020-09-30 12:11 ` Eric Blake
  2020-10-07 13:40   ` Vladimir Sementsov-Ogievskiy
  2020-09-30 12:11 ` [PATCH v2 5/5] nbd: Add 'qemu-nbd -A' to expose allocation depth Eric Blake
  4 siblings, 1 reply; 16+ messages in thread
From: Eric Blake @ 2020-09-30 12:11 UTC (permalink / raw)
  To: qemu-devel
  Cc: kwolf, pkrempa, qemu-block, rjones, Max Reitz, vsementsov, stefanha

'qemu-img map' provides a way to determine which extents of an image
come from the top layer vs. inherited from a backing chain.  This is
useful information worth exposing over NBD.  There is a proposal to
add a QMP command block-dirty-bitmap-populate which can create a dirty
bitmap that reflects allocation information, at which point
qemu:dirty-bitmap:NAME can expose that information via the creation of
a temporary bitmap, but we can shorten the effort by adding a new
qemu:allocation-depth context that does the same thing without an
intermediate bitmap (this patch does not eliminate the need for that
proposal, as it will have other uses as well).

For this patch, I just encoded a tri-state value (unallocated, from
this layer, from any of the backing layers); an obvious extension
would be to provide the actual depth in bits 31-4 while keeping bits
1-0 as a tri-state (leaving bits 3-2 unused, for ease of reading depth
from a hex number).  But this extension would require
bdrv_is_allocated_above to return a depth number.

Note that this patch does not actually enable any way to request a
server to enable this context; that will come in the next patch.

Signed-off-by: Eric Blake <eblake@redhat.com>
---
 docs/interop/nbd.txt |  22 +++++++--
 include/block/nbd.h  |   8 +++-
 nbd/server.c         | 105 ++++++++++++++++++++++++++++++++++++++++---
 3 files changed, 125 insertions(+), 10 deletions(-)

diff --git a/docs/interop/nbd.txt b/docs/interop/nbd.txt
index f3b3cacc9621..56efec7aee12 100644
--- a/docs/interop/nbd.txt
+++ b/docs/interop/nbd.txt
@@ -17,9 +17,9 @@ namespace "qemu".

 == "qemu" namespace ==

-The "qemu" namespace currently contains only one type of context,
-related to exposing the contents of a dirty bitmap alongside the
-associated disk contents.  That context has the following form:
+The "qemu" namespace currently contains two types of context.  The
+first is related to exposing the contents of a dirty bitmap alongside
+the associated disk contents.  That context has the following form:

     qemu:dirty-bitmap:<dirty-bitmap-export-name>

@@ -28,8 +28,21 @@ in reply for NBD_CMD_BLOCK_STATUS:

     bit 0: NBD_STATE_DIRTY, means that the extent is "dirty"

+The second is related to exposing the source of various extents within
+the image, with a single context named:
+
+    qemu:allocation-depth
+
+In the allocation depth context, bits 0 and 1 form a tri-state value:
+
+    bits 0-1 clear: NBD_STATE_DEPTH_UNALLOC, means the extent is unallocated
+    bit 0 set: NBD_STATE_DEPTH_LOCAL, the extent is allocated in this image
+    bit 1 set: NBD_STATE_DEPTH_BACKING, the extent is inherited from a
+               backing layer
+
 For NBD_OPT_LIST_META_CONTEXT the following queries are supported
-in addition to "qemu:dirty-bitmap:<dirty-bitmap-export-name>":
+in addition to the specific "qemu:allocation-depth" and
+"qemu:dirty-bitmap:<dirty-bitmap-export-name>":

 * "qemu:" - returns list of all available metadata contexts in the
             namespace.
@@ -55,3 +68,4 @@ the operation of that feature.
 NBD_CMD_BLOCK_STATUS for "qemu:dirty-bitmap:", NBD_CMD_CACHE
 * 4.2: NBD_FLAG_CAN_MULTI_CONN for shareable read-only exports,
 NBD_CMD_FLAG_FAST_ZERO
+* 5.2: NBD_CMD_BLOCK_STATUS for "qemu:allocation-depth"
diff --git a/include/block/nbd.h b/include/block/nbd.h
index 3dd9a04546ec..06208bc25027 100644
--- a/include/block/nbd.h
+++ b/include/block/nbd.h
@@ -1,5 +1,5 @@
 /*
- *  Copyright (C) 2016-2019 Red Hat, Inc.
+ *  Copyright (C) 2016-2020 Red Hat, Inc.
  *  Copyright (C) 2005  Anthony Liguori <anthony@codemonkey.ws>
  *
  *  Network Block Device
@@ -259,6 +259,12 @@ enum {
 /* Extent flags for qemu:dirty-bitmap in NBD_REPLY_TYPE_BLOCK_STATUS */
 #define NBD_STATE_DIRTY (1 << 0)

+/* Extent flags for qemu:allocation-depth in NBD_REPLY_TYPE_BLOCK_STATUS */
+#define NBD_STATE_DEPTH_UNALLOC (0 << 0)
+#define NBD_STATE_DEPTH_LOCAL   (1 << 0)
+#define NBD_STATE_DEPTH_BACKING (2 << 0)
+#define NBD_STATE_DEPTH_MASK    (0x3)
+
 static inline bool nbd_reply_type_is_error(int type)
 {
     return type & (1 << 15);
diff --git a/nbd/server.c b/nbd/server.c
index 7271a09b5c2b..830b21000be3 100644
--- a/nbd/server.c
+++ b/nbd/server.c
@@ -27,7 +27,8 @@
 #include "qemu/units.h"

 #define NBD_META_ID_BASE_ALLOCATION 0
-#define NBD_META_ID_DIRTY_BITMAP 1
+#define NBD_META_ID_ALLOCATION_DEPTH 1
+#define NBD_META_ID_DIRTY_BITMAP 2

 /*
  * NBD_MAX_BLOCK_STATUS_EXTENTS: 1 MiB of extents data. An empirical
@@ -94,6 +95,7 @@ struct NBDExport {
     BlockBackend *eject_notifier_blk;
     Notifier eject_notifier;

+    bool alloc_context;
     BdrvDirtyBitmap *export_bitmap;
     char *export_bitmap_context;
 };
@@ -108,6 +110,7 @@ typedef struct NBDExportMetaContexts {
     bool valid; /* means that negotiation of the option finished without
                    errors */
     bool base_allocation; /* export base:allocation context (block status) */
+    bool allocation_depth; /* export qemu:allocation-depth */
     bool bitmap; /* export qemu:dirty-bitmap:<export bitmap name> */
 } NBDExportMetaContexts;

@@ -806,7 +809,7 @@ static bool nbd_meta_empty_or_pattern(NBDClient *client, const char *pattern,
 {
     if (!*query) {
         trace_nbd_negotiate_meta_query_parse("empty");
-        return client->opt == NBD_OPT_LIST_META_CONTEXT;
+        return !*pattern || client->opt == NBD_OPT_LIST_META_CONTEXT;
     }
     if (strcmp(query, pattern) == 0) {
         trace_nbd_negotiate_meta_query_parse(pattern);
@@ -864,12 +867,21 @@ static bool nbd_meta_qemu_query(NBDClient *client, NBDExportMetaContexts *meta,

     if (!*query) {
         if (client->opt == NBD_OPT_LIST_META_CONTEXT) {
+            meta->allocation_depth = meta->exp->alloc_context;
             meta->bitmap = !!meta->exp->export_bitmap;
         }
         trace_nbd_negotiate_meta_query_parse("empty");
         return true;
     }

+    if (nbd_strshift(&query, "allocation-depth")) {
+        trace_nbd_negotiate_meta_query_parse("allocation-depth");
+        if (nbd_meta_empty_or_pattern(client, "", query)) {
+            meta->allocation_depth = meta->exp->alloc_context;
+        }
+        return true;
+    }
+
     if (nbd_strshift(&query, "dirty-bitmap:")) {
         trace_nbd_negotiate_meta_query_parse("dirty-bitmap:");
         if (!meta->exp->export_bitmap) {
@@ -984,6 +996,7 @@ static int nbd_negotiate_meta_queries(NBDClient *client,
     if (client->opt == NBD_OPT_LIST_META_CONTEXT && !nb_queries) {
         /* enable all known contexts */
         meta->base_allocation = true;
+        meta->allocation_depth = meta->exp->alloc_context;
         meta->bitmap = !!meta->exp->export_bitmap;
     } else {
         for (i = 0; i < nb_queries; ++i) {
@@ -1003,6 +1016,15 @@ static int nbd_negotiate_meta_queries(NBDClient *client,
         }
     }

+    if (meta->allocation_depth) {
+        ret = nbd_negotiate_send_meta_context(client, "qemu:allocation-depth",
+                                              NBD_META_ID_ALLOCATION_DEPTH,
+                                              errp);
+        if (ret < 0) {
+            return ret;
+        }
+    }
+
     if (meta->bitmap) {
         ret = nbd_negotiate_send_meta_context(client,
                                               meta->exp->export_bitmap_context,
@@ -1961,6 +1983,40 @@ static int blockstatus_to_extents(BlockDriverState *bs, uint64_t offset,
     return 0;
 }

+static int blockalloc_to_extents(BlockDriverState *bs, uint64_t offset,
+                                 uint64_t bytes, NBDExtentArray *ea)
+{
+    while (bytes) {
+        uint32_t flags;
+        int64_t num;
+        int ret = bdrv_is_allocated(bs, offset, bytes, &num);
+
+        if (ret < 0) {
+            return ret;
+        }
+
+        if (ret == 1) {
+            flags = NBD_STATE_DEPTH_LOCAL;
+        } else {
+            ret = bdrv_is_allocated_above(bs, NULL, false, offset, num,
+                                          &num);
+            if (ret < 0) {
+                return ret;
+            }
+            flags = ret ? NBD_STATE_DEPTH_BACKING : NBD_STATE_DEPTH_UNALLOC;
+        }
+
+        if (nbd_extent_array_add(ea, num, flags) < 0) {
+            return 0;
+        }
+
+        offset += num;
+        bytes -= num;
+    }
+
+    return 0;
+}
+
 /*
  * nbd_co_send_extents
  *
@@ -2009,6 +2065,26 @@ static int nbd_co_send_block_status(NBDClient *client, uint64_t handle,
     return nbd_co_send_extents(client, handle, ea, last, context_id, errp);
 }

+/* Get allocation depth from the exported device and send it to the client */
+static int nbd_co_send_block_alloc(NBDClient *client, uint64_t handle,
+                                   BlockDriverState *bs, uint64_t offset,
+                                   uint32_t length, bool dont_fragment,
+                                   bool last, uint32_t context_id,
+                                   Error **errp)
+{
+    int ret;
+    unsigned int nb_extents = dont_fragment ? 1 : NBD_MAX_BLOCK_STATUS_EXTENTS;
+    g_autoptr(NBDExtentArray) ea = nbd_extent_array_new(nb_extents);
+
+    ret = blockalloc_to_extents(bs, offset, length, ea);
+    if (ret < 0) {
+        return nbd_co_send_structured_error(
+                client, handle, -ret, "can't get block status", errp);
+    }
+
+    return nbd_co_send_extents(client, handle, ea, last, context_id, errp);
+}
+
 /* Populate @ea from a dirty bitmap. */
 static void bitmap_to_extents(BdrvDirtyBitmap *bitmap,
                               uint64_t offset, uint64_t length,
@@ -2335,16 +2411,20 @@ static coroutine_fn int nbd_handle_request(NBDClient *client,
         }
         if (client->export_meta.valid &&
             (client->export_meta.base_allocation ||
+             client->export_meta.allocation_depth ||
              client->export_meta.bitmap))
         {
             bool dont_fragment = request->flags & NBD_CMD_FLAG_REQ_ONE;
+            int contexts_remaining = client->export_meta.base_allocation +
+                client->export_meta.allocation_depth +
+                client->export_meta.bitmap;

             if (client->export_meta.base_allocation) {
                 ret = nbd_co_send_block_status(client, request->handle,
                                                blk_bs(exp->common.blk),
                                                request->from,
                                                request->len, dont_fragment,
-                                               !client->export_meta.bitmap,
+                                               !--contexts_remaining,
                                                NBD_META_ID_BASE_ALLOCATION,
                                                errp);
                 if (ret < 0) {
@@ -2352,17 +2432,32 @@ static coroutine_fn int nbd_handle_request(NBDClient *client,
                 }
             }

+            if (client->export_meta.allocation_depth) {
+                ret = nbd_co_send_block_alloc(client, request->handle,
+                                              blk_bs(exp->common.blk),
+                                              request->from, request->len,
+                                              dont_fragment,
+                                              !--contexts_remaining,
+                                              NBD_META_ID_ALLOCATION_DEPTH,
+                                              errp);
+                if (ret < 0) {
+                    return ret;
+                }
+            }
+
             if (client->export_meta.bitmap) {
                 ret = nbd_co_send_bitmap(client, request->handle,
                                          client->exp->export_bitmap,
                                          request->from, request->len,
-                                         dont_fragment,
-                                         true, NBD_META_ID_DIRTY_BITMAP, errp);
+                                         dont_fragment, !--contexts_remaining,
+                                         NBD_META_ID_DIRTY_BITMAP, errp);
                 if (ret < 0) {
                     return ret;
                 }
             }

+            assert(!contexts_remaining);
+
             return 0;
         } else {
             return nbd_send_generic_reply(client, request->handle, -EINVAL,
-- 
2.28.0



^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v2 5/5] nbd: Add 'qemu-nbd -A' to expose allocation depth
  2020-09-30 12:11 [PATCH v2 0/5] Exposing backing-chain allocation over NBD Eric Blake
                   ` (3 preceding siblings ...)
  2020-09-30 12:11 ` [PATCH v2 4/5] nbd: Add new qemu:allocation-depth metacontext Eric Blake
@ 2020-09-30 12:11 ` Eric Blake
  2020-10-07 14:03   ` Vladimir Sementsov-Ogievskiy
  4 siblings, 1 reply; 16+ messages in thread
From: Eric Blake @ 2020-09-30 12:11 UTC (permalink / raw)
  To: qemu-devel
  Cc: kwolf, pkrempa, qemu-block, Markus Armbruster, rjones, Max Reitz,
	vsementsov, stefanha

Allow the server to expose an additional metacontext to be requested
by savvy clients.  qemu-nbd adds a new option -A to expose the
qemu:allocation-depth metacontext through NBD_CMD_BLOCK_STATUS; this
can also be set via QMP when using block-export-add.

qemu as client can be hacked into viewing this new context by using
the now-misnamed x-dirty-bitmap option when creating an NBD blockdev
(even though our x- naming means we could rename it, I did not think
it worth breaking back-compat of tools that have been using it while
waiting for a better solution).  It is worth noting the decoding of
how such context information will appear in 'qemu-img map
--output=json':

NBD_STATE_DEPTH_UNALLOC => "zero":false, "data":true
NBD_STATE_DEPTH_LOCAL   => "zero":false, "data":false
NBD_STATE_DEPTH_BACKING => "zero":true,  "data":true

libnbd as client is probably a nicer way to get at the information
without having to decipher such hacks in qemu as client. ;)

Signed-off-by: Eric Blake <eblake@redhat.com>
---
 docs/tools/qemu-nbd.rst    |  6 ++++
 qapi/block-core.json       |  7 ++--
 qapi/block-export.json     |  6 +++-
 blockdev-nbd.c             |  2 ++
 nbd/server.c               |  2 ++
 qemu-nbd.c                 | 14 ++++++--
 tests/qemu-iotests/309     | 73 ++++++++++++++++++++++++++++++++++++++
 tests/qemu-iotests/309.out | 22 ++++++++++++
 tests/qemu-iotests/group   |  1 +
 9 files changed, 127 insertions(+), 6 deletions(-)
 create mode 100755 tests/qemu-iotests/309
 create mode 100644 tests/qemu-iotests/309.out

diff --git a/docs/tools/qemu-nbd.rst b/docs/tools/qemu-nbd.rst
index 667861cb22e9..0e545a97cfa3 100644
--- a/docs/tools/qemu-nbd.rst
+++ b/docs/tools/qemu-nbd.rst
@@ -72,6 +72,12 @@ driver options if ``--image-opts`` is specified.

   Export the disk as read-only.

+.. option:: -A, --allocation-depth
+
+  Expose allocation depth information via the
+  ``qemu:allocation-depth`` context accessible through
+  NBD_OPT_SET_META_CONTEXT.
+
 .. option:: -B, --bitmap=NAME

   If *filename* has a qcow2 persistent bitmap *NAME*, expose
diff --git a/qapi/block-core.json b/qapi/block-core.json
index d620bd1302b2..0379eb992db8 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -3874,9 +3874,12 @@
 #
 # @tls-creds: TLS credentials ID
 #
-# @x-dirty-bitmap: A "qemu:dirty-bitmap:NAME" string to query in place of
+# @x-dirty-bitmap: A metacontext name such as "qemu:dirty-bitmap:NAME" or
+#                  "qemu:allocation-depth" to query in place of the
 #                  traditional "base:allocation" block status (see
-#                  NBD_OPT_LIST_META_CONTEXT in the NBD protocol) (since 3.0)
+#                  NBD_OPT_LIST_META_CONTEXT in the NBD protocol; and
+#                  yes, naming this option x-context would have made
+#                  more sense) (since 3.0)
 #
 # @reconnect-delay: On an unexpected disconnect, the nbd client tries to
 #                   connect again until succeeding or encountering a serious
diff --git a/qapi/block-export.json b/qapi/block-export.json
index 2291d6cb0cbc..45e43984b11d 100644
--- a/qapi/block-export.json
+++ b/qapi/block-export.json
@@ -78,11 +78,15 @@
 #          NBD client can use NBD_OPT_SET_META_CONTEXT with
 #          "qemu:dirty-bitmap:NAME" to inspect the bitmap. (since 4.0)
 #
+# @alloc: Also export the allocation map for @device, so the NBD client
+#         can use NBD_OPT_SET_META_CONTEXT with "qemu:allocation-depth"
+#         to inspect allocation details. (since 5.2)
+#
 # Since: 5.0
 ##
 { 'struct': 'BlockExportOptionsNbd',
   'data': { '*name': 'str', '*description': 'str',
-            '*bitmap': 'str' } }
+            '*bitmap': 'str', '*alloc': 'bool' } }

 ##
 # @NbdServerAddOptions:
diff --git a/blockdev-nbd.c b/blockdev-nbd.c
index 8174023e5c47..f9012f93e2bb 100644
--- a/blockdev-nbd.c
+++ b/blockdev-nbd.c
@@ -212,6 +212,8 @@ void qmp_nbd_server_add(NbdServerAddOptions *arg, Error **errp)
             .description        = g_strdup(arg->description),
             .has_bitmap         = arg->has_bitmap,
             .bitmap             = g_strdup(arg->bitmap),
+            .has_alloc          = arg->alloc,
+            .alloc              = arg->alloc,
         },
     };

diff --git a/nbd/server.c b/nbd/server.c
index 830b21000be3..11cdc2eab0b3 100644
--- a/nbd/server.c
+++ b/nbd/server.c
@@ -1598,6 +1598,8 @@ static int nbd_export_create(BlockExport *blk_exp, BlockExportOptions *exp_args,
         assert(strlen(exp->export_bitmap_context) < NBD_MAX_STRING_SIZE);
     }

+    exp->alloc_context = arg->alloc;
+
     blk_add_aio_context_notifier(blk, blk_aio_attached, blk_aio_detach, exp);

     QTAILQ_INSERT_TAIL(&exports, exp, next);
diff --git a/qemu-nbd.c b/qemu-nbd.c
index e7520261134f..3a92e00464de 100644
--- a/qemu-nbd.c
+++ b/qemu-nbd.c
@@ -99,6 +99,7 @@ static void usage(const char *name)
 "\n"
 "Exposing part of the image:\n"
 "  -o, --offset=OFFSET       offset into the image\n"
+"  -A, --allocation-depth    expose the allocation depth\n"
 "  -B, --bitmap=NAME         expose a persistent dirty bitmap\n"
 "\n"
 "General purpose options:\n"
@@ -519,7 +520,7 @@ int main(int argc, char **argv)
     char *device = NULL;
     QemuOpts *sn_opts = NULL;
     const char *sn_id_or_name = NULL;
-    const char *sopt = "hVb:o:p:rsnc:dvk:e:f:tl:x:T:D:B:L";
+    const char *sopt = "hVb:o:p:rsnc:dvk:e:f:tl:x:T:D:AB:L";
     struct option lopt[] = {
         { "help", no_argument, NULL, 'h' },
         { "version", no_argument, NULL, 'V' },
@@ -528,6 +529,7 @@ int main(int argc, char **argv)
         { "socket", required_argument, NULL, 'k' },
         { "offset", required_argument, NULL, 'o' },
         { "read-only", no_argument, NULL, 'r' },
+        { "allocation-depth", no_argument, NULL, 'A' },
         { "bitmap", required_argument, NULL, 'B' },
         { "connect", required_argument, NULL, 'c' },
         { "disconnect", no_argument, NULL, 'd' },
@@ -569,6 +571,7 @@ int main(int argc, char **argv)
     QDict *options = NULL;
     const char *export_name = NULL; /* defaults to "" later for server mode */
     const char *export_description = NULL;
+    bool alloc_depth = false;
     const char *bitmap = NULL;
     const char *tlscredsid = NULL;
     bool imageOpts = false;
@@ -693,6 +696,9 @@ int main(int argc, char **argv)
             readonly = true;
             flags &= ~BDRV_O_RDWR;
             break;
+        case 'A':
+            alloc_depth = true;
+            break;
         case 'B':
             bitmap = optarg;
             break;
@@ -790,8 +796,8 @@ int main(int argc, char **argv)
             exit(EXIT_FAILURE);
         }
         if (export_name || export_description || dev_offset ||
-            device || disconnect || fmt || sn_id_or_name || bitmap ||
-            seen_aio || seen_discard || seen_cache) {
+            device || disconnect || fmt || sn_id_or_name || alloc_depth ||
+            bitmap || seen_aio || seen_discard || seen_cache) {
             error_report("List mode is incompatible with per-device settings");
             exit(EXIT_FAILURE);
         }
@@ -1077,6 +1083,8 @@ int main(int argc, char **argv)
             .description        = g_strdup(export_description),
             .has_bitmap         = !!bitmap,
             .bitmap             = g_strdup(bitmap),
+            .has_alloc          = alloc_depth,
+            .alloc              = alloc_depth,
         },
     };
     blk_exp_add(export_opts, &error_fatal);
diff --git a/tests/qemu-iotests/309 b/tests/qemu-iotests/309
new file mode 100755
index 000000000000..16bd269ad2c1
--- /dev/null
+++ b/tests/qemu-iotests/309
@@ -0,0 +1,73 @@
+#!/usr/bin/env bash
+#
+# Test qemu-nbd -A
+#
+# Copyright (C) 2018-2020 Red Hat, Inc.
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+#
+
+seq="$(basename $0)"
+echo "QA output created by $seq"
+
+status=1 # failure is the default!
+
+_cleanup()
+{
+    _cleanup_test_img
+    nbd_server_stop
+}
+trap "_cleanup; exit \$status" 0 1 2 3 15
+
+# get standard environment, filters and checks
+. ./common.rc
+. ./common.filter
+. ./common.nbd
+
+_supported_fmt qcow2
+_supported_proto file
+_supported_os Linux
+_require_command QEMU_NBD
+
+echo
+echo "=== Initial image setup ==="
+echo
+
+TEST_IMG="$TEST_IMG.base" _make_test_img 4M
+$QEMU_IO -c 'w 0 2M' -f $IMGFMT "$TEST_IMG.base" | _filter_qemu_io
+_make_test_img -b "$TEST_IMG.base" -F $IMGFMT 4M
+$QEMU_IO -c 'w 1M 2M' -f $IMGFMT "$TEST_IMG" | _filter_qemu_io
+
+echo
+echo "=== Check allocation over NBD ==="
+echo
+
+# Normal -f raw NBD block status loses access to allocation information
+$QEMU_IMG map --output=json -f qcow2 "$TEST_IMG"
+IMG="driver=nbd,server.type=unix,server.path=$nbd_unix_socket"
+nbd_server_start_unix_socket -r -f qcow2 -A "$TEST_IMG"
+$QEMU_IMG map --output=json --image-opts \
+    "$IMG" | _filter_qemu_img_map
+# But since we used -A, and use x-dirty-bitmap as a hack for reading bitmaps,
+# we can reconstruct it, by abusing block status to report:
+#    NBD_STATE_DEPTH_UNALLOC => "zero":false, "data":true
+#    NBD_STATE_DEPTH_LOCAL   => "zero":false, "data":false
+#    NBD_STATE_DEPTH_BACKING => "zero":true,  "data":true
+$QEMU_IMG map --output=json --image-opts \
+    "$IMG,x-dirty-bitmap=qemu:allocation-depth" | _filter_qemu_img_map
+
+# success, all done
+echo '*** done'
+rm -f $seq.full
+status=0
diff --git a/tests/qemu-iotests/309.out b/tests/qemu-iotests/309.out
new file mode 100644
index 000000000000..db75bb6b0df9
--- /dev/null
+++ b/tests/qemu-iotests/309.out
@@ -0,0 +1,22 @@
+QA output created by 309
+
+=== Initial image setup ===
+
+Formatting 'TEST_DIR/t.IMGFMT.base', fmt=IMGFMT size=4194304
+wrote 2097152/2097152 bytes at offset 0
+2 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=4194304 backing_file=TEST_DIR/t.IMGFMT.base backing_fmt=IMGFMT
+wrote 2097152/2097152 bytes at offset 1048576
+2 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+
+=== Check allocation over NBD ===
+
+[{ "start": 0, "length": 1048576, "depth": 1, "zero": false, "data": true, "offset": 327680},
+{ "start": 1048576, "length": 2097152, "depth": 0, "zero": false, "data": true, "offset": 327680},
+{ "start": 3145728, "length": 1048576, "depth": 1, "zero": true, "data": false}]
+[{ "start": 0, "length": 3145728, "depth": 0, "zero": false, "data": true, "offset": OFFSET},
+{ "start": 3145728, "length": 1048576, "depth": 0, "zero": true, "data": false, "offset": OFFSET}]
+[{ "start": 0, "length": 1048576, "depth": 0, "zero": true, "data": true, "offset": OFFSET},
+{ "start": 1048576, "length": 2097152, "depth": 0, "zero": false, "data": false},
+{ "start": 3145728, "length": 1048576, "depth": 0, "zero": false, "data": true, "offset": OFFSET}]
+*** done
diff --git a/tests/qemu-iotests/group b/tests/qemu-iotests/group
index 9e4f7c01530d..a567fa97d7e5 100644
--- a/tests/qemu-iotests/group
+++ b/tests/qemu-iotests/group
@@ -315,3 +315,4 @@
 304 rw quick
 305 rw quick
 307 rw quick export
+309 rw auto quick
-- 
2.28.0



^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH v2 1/5] qemu-nbd: Honor SIGINT and SIGHUP
  2020-09-30 12:11 ` [PATCH v2 1/5] qemu-nbd: Honor SIGINT and SIGHUP Eric Blake
@ 2020-10-07 10:32   ` Vladimir Sementsov-Ogievskiy
  2020-10-07 21:13     ` Eric Blake
  0 siblings, 1 reply; 16+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-10-07 10:32 UTC (permalink / raw)
  To: Eric Blake, qemu-devel; +Cc: kwolf, stefanha, pkrempa, rjones, qemu-block

30.09.2020 15:11, Eric Blake wrote:
> Honoring just SIGTERM on Linux is too weak; we also want to handle
> other common signals, and do so even on BSD.  Why?  Because at least
> 'qemu-nbd -B bitmap' needs a chance to clean up the in-use bit on
> bitmaps when the server is shut down via a signal.

Probably not bad to update a comment [*] if you have a good wording in mind.

> 
> See also: http://bugzilla.redhat.com/1883608
> 
> Signed-off-by: Eric Blake <eblake@redhat.com>

Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>

> ---
>   qemu-nbd.c | 6 +++---
>   1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/qemu-nbd.c b/qemu-nbd.c
> index bacb69b0898b..e7520261134f 100644
> --- a/qemu-nbd.c
> +++ b/qemu-nbd.c
> @@ -581,7 +581,7 @@ int main(int argc, char **argv)
>       const char *pid_file_name = NULL;
>       BlockExportOptions *export_opts;
> 
> -#if HAVE_NBD_DEVICE
> +#ifdef CONFIG_POSIX
>       /* The client thread uses SIGTERM to interrupt the server.  A signal
>        * handler ensures that "qemu-nbd -v -c" exits with a nice status code.

[*]

>        */
> @@ -589,9 +589,9 @@ int main(int argc, char **argv)
>       memset(&sa_sigterm, 0, sizeof(sa_sigterm));
>       sa_sigterm.sa_handler = termsig_handler;
>       sigaction(SIGTERM, &sa_sigterm, NULL);
> -#endif /* HAVE_NBD_DEVICE */
> +    sigaction(SIGINT, &sa_sigterm, NULL);
> +    sigaction(SIGHUP, &sa_sigterm, NULL);
> 
> -#ifdef CONFIG_POSIX
>       signal(SIGPIPE, SIG_IGN);
>   #endif
> 


-- 
Best regards,
Vladimir


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v2 2/5] nbd/server: Reject embedded NUL in NBD strings
  2020-09-30 12:11 ` [PATCH v2 2/5] nbd/server: Reject embedded NUL in NBD strings Eric Blake
@ 2020-10-07 10:39   ` Vladimir Sementsov-Ogievskiy
  0 siblings, 0 replies; 16+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-10-07 10:39 UTC (permalink / raw)
  To: Eric Blake, qemu-devel; +Cc: kwolf, stefanha, pkrempa, rjones, qemu-block

30.09.2020 15:11, Eric Blake wrote:
> The NBD spec is clear that any string sent from the client must not
> contain embedded NUL characters.  If the client passes "a\0", we
> should reject that option request rather than act on "a".
> 
> Testing this is not possible with a compliant client, but I was able
> to use gdb to coerce libnbd into temporarily behaving as such a
> client.
> 
> Signed-off-by: Eric Blake<eblake@redhat.com>


Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>

-- 
Best regards,
Vladimir


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v2 3/5] nbd: Simplify meta-context parsing
  2020-09-30 12:11 ` [PATCH v2 3/5] nbd: Simplify meta-context parsing Eric Blake
@ 2020-10-07 11:51   ` Vladimir Sementsov-Ogievskiy
  2020-10-07 21:28     ` Eric Blake
  0 siblings, 1 reply; 16+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-10-07 11:51 UTC (permalink / raw)
  To: Eric Blake, qemu-devel; +Cc: kwolf, stefanha, pkrempa, rjones, qemu-block

30.09.2020 15:11, Eric Blake wrote:
> We had a premature optimization of trying to read as little from the
> wire as possible while handling NBD_OPT_SET_META_CONTEXT in phases.
> But in reality, we HAVE to read the entire string from the client
> before we can get to the next command, and it is easier to just read
> it all at once than it is to read it in pieces.  And once we do that,
> several functions end up no longer performing I/O, so they can drop
> length and errp parameters, and just return a bool instead of
> modifying through a pointer.
> 
> Our iotests still pass; I also checked that libnbd's testsuite (which
> covers more corner cases of odd meta context requests) still passes.
> 

Also, do not advertise bitmaps meta context when bitmap export is not set.

> Signed-off-by: Eric Blake <eblake@redhat.com>

Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>

-- 
Best regards,
Vladimir


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v2 4/5] nbd: Add new qemu:allocation-depth metacontext
  2020-09-30 12:11 ` [PATCH v2 4/5] nbd: Add new qemu:allocation-depth metacontext Eric Blake
@ 2020-10-07 13:40   ` Vladimir Sementsov-Ogievskiy
  2020-10-07 21:54     ` Eric Blake
  0 siblings, 1 reply; 16+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-10-07 13:40 UTC (permalink / raw)
  To: Eric Blake, qemu-devel
  Cc: kwolf, stefanha, pkrempa, rjones, qemu-block, Max Reitz

30.09.2020 15:11, Eric Blake wrote:
> 'qemu-img map' provides a way to determine which extents of an image
> come from the top layer vs. inherited from a backing chain.  This is
> useful information worth exposing over NBD.  There is a proposal to
> add a QMP command block-dirty-bitmap-populate which can create a dirty
> bitmap that reflects allocation information, at which point
> qemu:dirty-bitmap:NAME can expose that information via the creation of
> a temporary bitmap, but we can shorten the effort by adding a new
> qemu:allocation-depth context that does the same thing without an
> intermediate bitmap (this patch does not eliminate the need for that
> proposal, as it will have other uses as well).
> 
> For this patch, I just encoded a tri-state value (unallocated, from
> this layer, from any of the backing layers); an obvious extension
> would be to provide the actual depth in bits 31-4 while keeping bits
> 1-0 as a tri-state (leaving bits 3-2 unused, for ease of reading depth
> from a hex number).  But this extension would require
> bdrv_is_allocated_above to return a depth number.
> 
> Note that this patch does not actually enable any way to request a
> server to enable this context; that will come in the next patch.
> 
> Signed-off-by: Eric Blake <eblake@redhat.com>
> ---
>   docs/interop/nbd.txt |  22 +++++++--
>   include/block/nbd.h  |   8 +++-
>   nbd/server.c         | 105 ++++++++++++++++++++++++++++++++++++++++---
>   3 files changed, 125 insertions(+), 10 deletions(-)
> 
> diff --git a/docs/interop/nbd.txt b/docs/interop/nbd.txt
> index f3b3cacc9621..56efec7aee12 100644
> --- a/docs/interop/nbd.txt
> +++ b/docs/interop/nbd.txt
> @@ -17,9 +17,9 @@ namespace "qemu".
> 
>   == "qemu" namespace ==
> 
> -The "qemu" namespace currently contains only one type of context,
> -related to exposing the contents of a dirty bitmap alongside the
> -associated disk contents.  That context has the following form:
> +The "qemu" namespace currently contains two types of context.  The
> +first is related to exposing the contents of a dirty bitmap alongside
> +the associated disk contents.  That context has the following form:
> 
>       qemu:dirty-bitmap:<dirty-bitmap-export-name>
> 
> @@ -28,8 +28,21 @@ in reply for NBD_CMD_BLOCK_STATUS:
> 
>       bit 0: NBD_STATE_DIRTY, means that the extent is "dirty"
> 
> +The second is related to exposing the source of various extents within
> +the image, with a single context named:
> +
> +    qemu:allocation-depth
> +
> +In the allocation depth context, bits 0 and 1 form a tri-state value:
> +
> +    bits 0-1 clear: NBD_STATE_DEPTH_UNALLOC, means the extent is unallocated
> +    bit 0 set: NBD_STATE_DEPTH_LOCAL, the extent is allocated in this image

Maybe, s/this/the top level/, as, what is "this" for NBD client?

> +    bit 1 set: NBD_STATE_DEPTH_BACKING, the extent is inherited from a
> +               backing layer
> +
>   For NBD_OPT_LIST_META_CONTEXT the following queries are supported
> -in addition to "qemu:dirty-bitmap:<dirty-bitmap-export-name>":
> +in addition to the specific "qemu:allocation-depth" and
> +"qemu:dirty-bitmap:<dirty-bitmap-export-name>":
> 
>   * "qemu:" - returns list of all available metadata contexts in the
>               namespace.
> @@ -55,3 +68,4 @@ the operation of that feature.
>   NBD_CMD_BLOCK_STATUS for "qemu:dirty-bitmap:", NBD_CMD_CACHE
>   * 4.2: NBD_FLAG_CAN_MULTI_CONN for shareable read-only exports,
>   NBD_CMD_FLAG_FAST_ZERO
> +* 5.2: NBD_CMD_BLOCK_STATUS for "qemu:allocation-depth"
> diff --git a/include/block/nbd.h b/include/block/nbd.h
> index 3dd9a04546ec..06208bc25027 100644
> --- a/include/block/nbd.h
> +++ b/include/block/nbd.h
> @@ -1,5 +1,5 @@
>   /*
> - *  Copyright (C) 2016-2019 Red Hat, Inc.
> + *  Copyright (C) 2016-2020 Red Hat, Inc.
>    *  Copyright (C) 2005  Anthony Liguori <anthony@codemonkey.ws>
>    *
>    *  Network Block Device
> @@ -259,6 +259,12 @@ enum {
>   /* Extent flags for qemu:dirty-bitmap in NBD_REPLY_TYPE_BLOCK_STATUS */
>   #define NBD_STATE_DIRTY (1 << 0)
> 
> +/* Extent flags for qemu:allocation-depth in NBD_REPLY_TYPE_BLOCK_STATUS */
> +#define NBD_STATE_DEPTH_UNALLOC (0 << 0)
> +#define NBD_STATE_DEPTH_LOCAL   (1 << 0)
> +#define NBD_STATE_DEPTH_BACKING (2 << 0)
> +#define NBD_STATE_DEPTH_MASK    (0x3)
> +
>   static inline bool nbd_reply_type_is_error(int type)
>   {
>       return type & (1 << 15);
> diff --git a/nbd/server.c b/nbd/server.c
> index 7271a09b5c2b..830b21000be3 100644
> --- a/nbd/server.c
> +++ b/nbd/server.c
> @@ -27,7 +27,8 @@
>   #include "qemu/units.h"
> 
>   #define NBD_META_ID_BASE_ALLOCATION 0
> -#define NBD_META_ID_DIRTY_BITMAP 1
> +#define NBD_META_ID_ALLOCATION_DEPTH 1
> +#define NBD_META_ID_DIRTY_BITMAP 2
> 
>   /*
>    * NBD_MAX_BLOCK_STATUS_EXTENTS: 1 MiB of extents data. An empirical
> @@ -94,6 +95,7 @@ struct NBDExport {
>       BlockBackend *eject_notifier_blk;
>       Notifier eject_notifier;
> 
> +    bool alloc_context;
>       BdrvDirtyBitmap *export_bitmap;
>       char *export_bitmap_context;
>   };
> @@ -108,6 +110,7 @@ typedef struct NBDExportMetaContexts {
>       bool valid; /* means that negotiation of the option finished without
>                      errors */
>       bool base_allocation; /* export base:allocation context (block status) */
> +    bool allocation_depth; /* export qemu:allocation-depth */
>       bool bitmap; /* export qemu:dirty-bitmap:<export bitmap name> */
>   } NBDExportMetaContexts;
> 
> @@ -806,7 +809,7 @@ static bool nbd_meta_empty_or_pattern(NBDClient *client, const char *pattern,
>   {
>       if (!*query) {
>           trace_nbd_negotiate_meta_query_parse("empty");
> -        return client->opt == NBD_OPT_LIST_META_CONTEXT;
> +        return !*pattern || client->opt == NBD_OPT_LIST_META_CONTEXT;
>       }
>       if (strcmp(query, pattern) == 0) {
>           trace_nbd_negotiate_meta_query_parse(pattern);
> @@ -864,12 +867,21 @@ static bool nbd_meta_qemu_query(NBDClient *client, NBDExportMetaContexts *meta,
> 
>       if (!*query) {
>           if (client->opt == NBD_OPT_LIST_META_CONTEXT) {
> +            meta->allocation_depth = meta->exp->alloc_context;
>               meta->bitmap = !!meta->exp->export_bitmap;
>           }
>           trace_nbd_negotiate_meta_query_parse("empty");
>           return true;
>       }
> 
> +    if (nbd_strshift(&query, "allocation-depth")) {
> +        trace_nbd_negotiate_meta_query_parse("allocation-depth");
> +        if (nbd_meta_empty_or_pattern(client, "", query)) {

How much it differs from "if (!*query) {", I don't see...

> +            meta->allocation_depth = meta->exp->alloc_context;
> +        }
> +        return true;
> +    }

why not just

  if (!strcmp(query, "allocation-depth")) {
     meta->allocation_depth = meta->exp->alloc_context;
     return true;
  }

?

With your code you also parse something like "allocation-depth-extended".. Is it on purpose?

> +
>       if (nbd_strshift(&query, "dirty-bitmap:")) {
>           trace_nbd_negotiate_meta_query_parse("dirty-bitmap:");
>           if (!meta->exp->export_bitmap) {


Also, "trace_nbd_negotiate_meta_query_skip("not dirty-bitmap"); " at function end should
now be something like trace_nbd_negotiate_meta_query_skip("unknown context in qemu: namespace");

> @@ -984,6 +996,7 @@ static int nbd_negotiate_meta_queries(NBDClient *client,
>       if (client->opt == NBD_OPT_LIST_META_CONTEXT && !nb_queries) {
>           /* enable all known contexts */
>           meta->base_allocation = true;
> +        meta->allocation_depth = meta->exp->alloc_context;
>           meta->bitmap = !!meta->exp->export_bitmap;
>       } else {
>           for (i = 0; i < nb_queries; ++i) {
> @@ -1003,6 +1016,15 @@ static int nbd_negotiate_meta_queries(NBDClient *client,
>           }
>       }
> 
> +    if (meta->allocation_depth) {
> +        ret = nbd_negotiate_send_meta_context(client, "qemu:allocation-depth",
> +                                              NBD_META_ID_ALLOCATION_DEPTH,
> +                                              errp);
> +        if (ret < 0) {
> +            return ret;
> +        }
> +    }
> +
>       if (meta->bitmap) {
>           ret = nbd_negotiate_send_meta_context(client,
>                                                 meta->exp->export_bitmap_context,
> @@ -1961,6 +1983,40 @@ static int blockstatus_to_extents(BlockDriverState *bs, uint64_t offset,
>       return 0;
>   }
> 
> +static int blockalloc_to_extents(BlockDriverState *bs, uint64_t offset,
> +                                 uint64_t bytes, NBDExtentArray *ea)
> +{
> +    while (bytes) {
> +        uint32_t flags;
> +        int64_t num;
> +        int ret = bdrv_is_allocated(bs, offset, bytes, &num);
> +
> +        if (ret < 0) {
> +            return ret;
> +        }
> +
> +        if (ret == 1) {
> +            flags = NBD_STATE_DEPTH_LOCAL;
> +        } else {
> +            ret = bdrv_is_allocated_above(bs, NULL, false, offset, num,
> +                                          &num);
> +            if (ret < 0) {
> +                return ret;
> +            }
> +            flags = ret ? NBD_STATE_DEPTH_BACKING : NBD_STATE_DEPTH_UNALLOC;
> +        }
> +
> +        if (nbd_extent_array_add(ea, num, flags) < 0) {
> +            return 0;
> +        }
> +
> +        offset += num;
> +        bytes -= num;
> +    }
> +
> +    return 0;
> +}
> +
>   /*
>    * nbd_co_send_extents
>    *
> @@ -2009,6 +2065,26 @@ static int nbd_co_send_block_status(NBDClient *client, uint64_t handle,
>       return nbd_co_send_extents(client, handle, ea, last, context_id, errp);
>   }
> 
> +/* Get allocation depth from the exported device and send it to the client */
> +static int nbd_co_send_block_alloc(NBDClient *client, uint64_t handle,
> +                                   BlockDriverState *bs, uint64_t offset,
> +                                   uint32_t length, bool dont_fragment,
> +                                   bool last, uint32_t context_id,
> +                                   Error **errp)
> +{
> +    int ret;
> +    unsigned int nb_extents = dont_fragment ? 1 : NBD_MAX_BLOCK_STATUS_EXTENTS;
> +    g_autoptr(NBDExtentArray) ea = nbd_extent_array_new(nb_extents);
> +
> +    ret = blockalloc_to_extents(bs, offset, length, ea);
> +    if (ret < 0) {
> +        return nbd_co_send_structured_error(
> +                client, handle, -ret, "can't get block status", errp);
> +    }
> +
> +    return nbd_co_send_extents(client, handle, ea, last, context_id, errp);
> +}


The function is a copy of nbd_co_send_block_status()..

Probably, we can reuse nbd_co_send_block_status(), and just call blockstatus_to_extents()
or blockalloc_to_extents() depending on context_id parameter.

Still, I'm OK with it as is.

> +
>   /* Populate @ea from a dirty bitmap. */
>   static void bitmap_to_extents(BdrvDirtyBitmap *bitmap,
>                                 uint64_t offset, uint64_t length,
> @@ -2335,16 +2411,20 @@ static coroutine_fn int nbd_handle_request(NBDClient *client,
>           }
>           if (client->export_meta.valid &&
>               (client->export_meta.base_allocation ||
> +             client->export_meta.allocation_depth ||
>                client->export_meta.bitmap))
>           {
>               bool dont_fragment = request->flags & NBD_CMD_FLAG_REQ_ONE;
> +            int contexts_remaining = client->export_meta.base_allocation +
> +                client->export_meta.allocation_depth +
> +                client->export_meta.bitmap;
> 
>               if (client->export_meta.base_allocation) {
>                   ret = nbd_co_send_block_status(client, request->handle,
>                                                  blk_bs(exp->common.blk),
>                                                  request->from,
>                                                  request->len, dont_fragment,
> -                                               !client->export_meta.bitmap,
> +                                               !--contexts_remaining,
>                                                  NBD_META_ID_BASE_ALLOCATION,
>                                                  errp);
>                   if (ret < 0) {
> @@ -2352,17 +2432,32 @@ static coroutine_fn int nbd_handle_request(NBDClient *client,
>                   }
>               }
> 
> +            if (client->export_meta.allocation_depth) {
> +                ret = nbd_co_send_block_alloc(client, request->handle,
> +                                              blk_bs(exp->common.blk),
> +                                              request->from, request->len,
> +                                              dont_fragment,
> +                                              !--contexts_remaining,
> +                                              NBD_META_ID_ALLOCATION_DEPTH,
> +                                              errp);
> +                if (ret < 0) {
> +                    return ret;
> +                }
> +            }
> +
>               if (client->export_meta.bitmap) {
>                   ret = nbd_co_send_bitmap(client, request->handle,
>                                            client->exp->export_bitmap,
>                                            request->from, request->len,
> -                                         dont_fragment,
> -                                         true, NBD_META_ID_DIRTY_BITMAP, errp);
> +                                         dont_fragment, !--contexts_remaining,
> +                                         NBD_META_ID_DIRTY_BITMAP, errp);
>                   if (ret < 0) {
>                       return ret;
>                   }
>               }
> 
> +            assert(!contexts_remaining);
> +
>               return 0;
>           } else {
>               return nbd_send_generic_reply(client, request->handle, -EINVAL,
> 


-- 
Best regards,
Vladimir


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v2 5/5] nbd: Add 'qemu-nbd -A' to expose allocation depth
  2020-09-30 12:11 ` [PATCH v2 5/5] nbd: Add 'qemu-nbd -A' to expose allocation depth Eric Blake
@ 2020-10-07 14:03   ` Vladimir Sementsov-Ogievskiy
  0 siblings, 0 replies; 16+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-10-07 14:03 UTC (permalink / raw)
  To: Eric Blake, qemu-devel
  Cc: kwolf, stefanha, pkrempa, rjones, qemu-block, Max Reitz,
	Markus Armbruster

30.09.2020 15:11, Eric Blake wrote:
> Allow the server to expose an additional metacontext to be requested
> by savvy clients.  qemu-nbd adds a new option -A to expose the
> qemu:allocation-depth metacontext through NBD_CMD_BLOCK_STATUS; this
> can also be set via QMP when using block-export-add.
> 
> qemu as client can be hacked into viewing this new context by using
> the now-misnamed x-dirty-bitmap option when creating an NBD blockdev
> (even though our x- naming means we could rename it, I did not think
> it worth breaking back-compat of tools that have been using it while
> waiting for a better solution).  It is worth noting the decoding of
> how such context information will appear in 'qemu-img map
> --output=json':
> 
> NBD_STATE_DEPTH_UNALLOC => "zero":false, "data":true
> NBD_STATE_DEPTH_LOCAL   => "zero":false, "data":false
> NBD_STATE_DEPTH_BACKING => "zero":true,  "data":true
> 
> libnbd as client is probably a nicer way to get at the information
> without having to decipher such hacks in qemu as client. ;)
> 
> Signed-off-by: Eric Blake <eblake@redhat.com>
> ---
>   docs/tools/qemu-nbd.rst    |  6 ++++
>   qapi/block-core.json       |  7 ++--
>   qapi/block-export.json     |  6 +++-
>   blockdev-nbd.c             |  2 ++
>   nbd/server.c               |  2 ++
>   qemu-nbd.c                 | 14 ++++++--
>   tests/qemu-iotests/309     | 73 ++++++++++++++++++++++++++++++++++++++
>   tests/qemu-iotests/309.out | 22 ++++++++++++
>   tests/qemu-iotests/group   |  1 +
>   9 files changed, 127 insertions(+), 6 deletions(-)
>   create mode 100755 tests/qemu-iotests/309
>   create mode 100644 tests/qemu-iotests/309.out
> 
> diff --git a/docs/tools/qemu-nbd.rst b/docs/tools/qemu-nbd.rst
> index 667861cb22e9..0e545a97cfa3 100644
> --- a/docs/tools/qemu-nbd.rst
> +++ b/docs/tools/qemu-nbd.rst
> @@ -72,6 +72,12 @@ driver options if ``--image-opts`` is specified.
> 
>     Export the disk as read-only.
> 
> +.. option:: -A, --allocation-depth
> +
> +  Expose allocation depth information via the
> +  ``qemu:allocation-depth`` context accessible through
> +  NBD_OPT_SET_META_CONTEXT.
> +
>   .. option:: -B, --bitmap=NAME
> 
>     If *filename* has a qcow2 persistent bitmap *NAME*, expose
> diff --git a/qapi/block-core.json b/qapi/block-core.json
> index d620bd1302b2..0379eb992db8 100644


[..]

> +echo
> +echo "=== Check allocation over NBD ==="
> +echo
> +
> +# Normal -f raw NBD block status loses access to allocation information

^this comment is not for the next line, but for further lines ...

> +$QEMU_IMG map --output=json -f qcow2 "$TEST_IMG"
> +IMG="driver=nbd,server.type=unix,server.path=$nbd_unix_socket"
> +nbd_server_start_unix_socket -r -f qcow2 -A "$TEST_IMG"

... should it be here?

> +$QEMU_IMG map --output=json --image-opts \
> +    "$IMG" | _filter_qemu_img_map
> +# But since we used -A, and use x-dirty-bitmap as a hack for reading bitmaps,
> +# we can reconstruct it, by abusing block status to report:
> +#    NBD_STATE_DEPTH_UNALLOC => "zero":false, "data":true
> +#    NBD_STATE_DEPTH_LOCAL   => "zero":false, "data":false
> +#    NBD_STATE_DEPTH_BACKING => "zero":true,  "data":true
> +$QEMU_IMG map --output=json --image-opts \
> +    "$IMG,x-dirty-bitmap=qemu:allocation-depth" | _filter_qemu_img_map
> +
> +# success, all done
> +echo '*** done'


Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>

-- 
Best regards,
Vladimir


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v2 1/5] qemu-nbd: Honor SIGINT and SIGHUP
  2020-10-07 10:32   ` Vladimir Sementsov-Ogievskiy
@ 2020-10-07 21:13     ` Eric Blake
  2020-10-08  8:27       ` Vladimir Sementsov-Ogievskiy
  0 siblings, 1 reply; 16+ messages in thread
From: Eric Blake @ 2020-10-07 21:13 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-devel
  Cc: kwolf, qemu-block, pkrempa, rjones, stefanha


[-- Attachment #1.1: Type: text/plain, Size: 2018 bytes --]

On 10/7/20 5:32 AM, Vladimir Sementsov-Ogievskiy wrote:
> 30.09.2020 15:11, Eric Blake wrote:
>> Honoring just SIGTERM on Linux is too weak; we also want to handle
>> other common signals, and do so even on BSD.  Why?  Because at least
>> 'qemu-nbd -B bitmap' needs a chance to clean up the in-use bit on
>> bitmaps when the server is shut down via a signal.
> 
> Probably not bad to update a comment [*] if you have a good wording in
> mind.
> 
>>
>> See also: http://bugzilla.redhat.com/1883608
>>
>> Signed-off-by: Eric Blake <eblake@redhat.com>
> 
> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> 
>> ---
>>   qemu-nbd.c | 6 +++---
>>   1 file changed, 3 insertions(+), 3 deletions(-)
>>
>> diff --git a/qemu-nbd.c b/qemu-nbd.c
>> index bacb69b0898b..e7520261134f 100644
>> --- a/qemu-nbd.c
>> +++ b/qemu-nbd.c
>> @@ -581,7 +581,7 @@ int main(int argc, char **argv)
>>       const char *pid_file_name = NULL;
>>       BlockExportOptions *export_opts;
>>
>> -#if HAVE_NBD_DEVICE
>> +#ifdef CONFIG_POSIX
>>       /* The client thread uses SIGTERM to interrupt the server.  A
>> signal
>>        * handler ensures that "qemu-nbd -v -c" exits with a nice
>> status code.
> 
> [*]
> 

Sure, I went with:

diff --git i/qemu-nbd.c w/qemu-nbd.c
index e7520261134f..c731dda04ec0 100644
--- i/qemu-nbd.c
+++ w/qemu-nbd.c
@@ -582,8 +582,9 @@ int main(int argc, char **argv)
     BlockExportOptions *export_opts;

 #ifdef CONFIG_POSIX
-    /* The client thread uses SIGTERM to interrupt the server.  A signal
-     * handler ensures that "qemu-nbd -v -c" exits with a nice status code.
+    /*
+     * Exit gracefully on various signals, which includes SIGTERM used
+     * by 'qemu-nbd -v -c'.
      */
     struct sigaction sa_sigterm;
     memset(&sa_sigterm, 0, sizeof(sa_sigterm));

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH v2 3/5] nbd: Simplify meta-context parsing
  2020-10-07 11:51   ` Vladimir Sementsov-Ogievskiy
@ 2020-10-07 21:28     ` Eric Blake
  0 siblings, 0 replies; 16+ messages in thread
From: Eric Blake @ 2020-10-07 21:28 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-devel
  Cc: kwolf, qemu-block, pkrempa, rjones, stefanha


[-- Attachment #1.1: Type: text/plain, Size: 2209 bytes --]

On 10/7/20 6:51 AM, Vladimir Sementsov-Ogievskiy wrote:
> 30.09.2020 15:11, Eric Blake wrote:
>> We had a premature optimization of trying to read as little from the
>> wire as possible while handling NBD_OPT_SET_META_CONTEXT in phases.
>> But in reality, we HAVE to read the entire string from the client
>> before we can get to the next command, and it is easier to just read
>> it all at once than it is to read it in pieces.  And once we do that,
>> several functions end up no longer performing I/O, so they can drop
>> length and errp parameters, and just return a bool instead of
>> modifying through a pointer.
>>
>> Our iotests still pass; I also checked that libnbd's testsuite (which
>> covers more corner cases of odd meta context requests) still passes.
>>
> 
> Also, do not advertise bitmaps meta context when bitmap export is not set.

That was already there, although seeing the logic change is tricky and
the trace messages change:

> +static bool nbd_meta_qemu_query(NBDClient *client, NBDExportMetaContexts *meta,
> +                                const char *query)
>  {
> -    bool dirty_bitmap = false;
> -    size_t dirty_bitmap_len = strlen("dirty-bitmap:");
> -    int ret;
> -
> -    if (!meta->exp->export_bitmap) {
> -        trace_nbd_negotiate_meta_query_skip("no dirty-bitmap exported");
> -        return nbd_opt_skip(client, len, errp);

Old code returned early if there was no bitmap export set

> +    if (!nbd_strshift(&query, "qemu:")) {
> +        return false;
>      }
> +    trace_nbd_negotiate_meta_query_parse("qemu:");
> 
> -    if (len == 0) {
> +    if (!*query) {
>          if (client->opt == NBD_OPT_LIST_META_CONTEXT) {
> +            meta->bitmap = !!meta->exp->export_bitmap;

while the new code has to handle it specifically.  I'll tweak the commit
message to mention the change in trace messages, even when the end
behavior is the same.


> 
>> Signed-off-by: Eric Blake <eblake@redhat.com>
> 
> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v2 4/5] nbd: Add new qemu:allocation-depth metacontext
  2020-10-07 13:40   ` Vladimir Sementsov-Ogievskiy
@ 2020-10-07 21:54     ` Eric Blake
  2020-10-08 13:04       ` Vladimir Sementsov-Ogievskiy
  0 siblings, 1 reply; 16+ messages in thread
From: Eric Blake @ 2020-10-07 21:54 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-devel
  Cc: kwolf, pkrempa, qemu-block, rjones, Max Reitz, stefanha


[-- Attachment #1.1: Type: text/plain, Size: 5528 bytes --]

On 10/7/20 8:40 AM, Vladimir Sementsov-Ogievskiy wrote:
> 30.09.2020 15:11, Eric Blake wrote:
>> 'qemu-img map' provides a way to determine which extents of an image
>> come from the top layer vs. inherited from a backing chain.  This is
>> useful information worth exposing over NBD.  There is a proposal to
>> add a QMP command block-dirty-bitmap-populate which can create a dirty
>> bitmap that reflects allocation information, at which point
>> qemu:dirty-bitmap:NAME can expose that information via the creation of
>> a temporary bitmap, but we can shorten the effort by adding a new
>> qemu:allocation-depth context that does the same thing without an
>> intermediate bitmap (this patch does not eliminate the need for that
>> proposal, as it will have other uses as well).
>>
>> For this patch, I just encoded a tri-state value (unallocated, from
>> this layer, from any of the backing layers); an obvious extension
>> would be to provide the actual depth in bits 31-4 while keeping bits
>> 1-0 as a tri-state (leaving bits 3-2 unused, for ease of reading depth
>> from a hex number).  But this extension would require
>> bdrv_is_allocated_above to return a depth number.
>>
>> Note that this patch does not actually enable any way to request a
>> server to enable this context; that will come in the next patch.
>>

>> +In the allocation depth context, bits 0 and 1 form a tri-state value:
>> +
>> +    bits 0-1 clear: NBD_STATE_DEPTH_UNALLOC, means the extent is
>> unallocated
>> +    bit 0 set: NBD_STATE_DEPTH_LOCAL, the extent is allocated in this
>> image
> 
> Maybe, s/this/the top level/, as, what is "this" for NBD client?

Sure.


>> @@ -864,12 +867,21 @@ static bool nbd_meta_qemu_query(NBDClient
>> *client, NBDExportMetaContexts *meta,
>>
>>       if (!*query) {

We get here for "qemu:".

>>           if (client->opt == NBD_OPT_LIST_META_CONTEXT) {
>> +            meta->allocation_depth = meta->exp->alloc_context;
>>               meta->bitmap = !!meta->exp->export_bitmap;
>>           }
>>           trace_nbd_negotiate_meta_query_parse("empty");
>>           return true;
>>       }
>>
>> +    if (nbd_strshift(&query, "allocation-depth")) {

We get here for "qemu:allocation-depth", and as you pointed out,
"qemu:allocation-depth-extended".

>> +        trace_nbd_negotiate_meta_query_parse("allocation-depth");
>> +        if (nbd_meta_empty_or_pattern(client, "", query)) {
> 
> How much it differs from "if (!*query) {", I don't see...

The nbd_meta_empty_or_pattern returns false for
"qemu:allocation-depth-extended"; it only returns true for
"qemu:allocation-depth".  But, as you pointed out,

> 
>> +            meta->allocation_depth = meta->exp->alloc_context;
>> +        }
>> +        return true;
>> +    }
> 
> why not just
> 
>  if (!strcmp(query, "allocation-depth")) {
>     meta->allocation_depth = meta->exp->alloc_context;
>     return true;
>  }
> 
> ?

that does seem shorter.

> 
> With your code you also parse something like
> "allocation-depth-extended".. Is it on purpose?

The string is parsed, but does not affect meta->XXX, which means nothing
gets advertised to the client.  The trace messages might differ, but
behavior is correct.

> 
>> +
>>       if (nbd_strshift(&query, "dirty-bitmap:")) {
>>           trace_nbd_negotiate_meta_query_parse("dirty-bitmap:");
>>           if (!meta->exp->export_bitmap) {
> 
> 
> Also, "trace_nbd_negotiate_meta_query_skip("not dirty-bitmap"); " at
> function end should
> now be something like trace_nbd_negotiate_meta_query_skip("unknown
> context in qemu: namespace");

Good idea.


>> +/* Get allocation depth from the exported device and send it to the
>> client */
>> +static int nbd_co_send_block_alloc(NBDClient *client, uint64_t handle,
>> +                                   BlockDriverState *bs, uint64_t
>> offset,
>> +                                   uint32_t length, bool dont_fragment,
>> +                                   bool last, uint32_t context_id,
>> +                                   Error **errp)
>> +{
>> +    int ret;
>> +    unsigned int nb_extents = dont_fragment ? 1 :
>> NBD_MAX_BLOCK_STATUS_EXTENTS;
>> +    g_autoptr(NBDExtentArray) ea = nbd_extent_array_new(nb_extents);
>> +
>> +    ret = blockalloc_to_extents(bs, offset, length, ea);
>> +    if (ret < 0) {
>> +        return nbd_co_send_structured_error(
>> +                client, handle, -ret, "can't get block status", errp);
>> +    }
>> +
>> +    return nbd_co_send_extents(client, handle, ea, last, context_id,
>> errp);
>> +}
> 
> 
> The function is a copy of nbd_co_send_block_status()..
> 
> Probably, we can reuse nbd_co_send_block_status(), and just call
> blockstatus_to_extents()
> or blockalloc_to_extents() depending on context_id parameter.

Nice idea to reduce the duplication.

> 
> Still, I'm OK with it as is.
> 

So is that Reviewed-by:, or do I need to post v3 with my tweaks in
response to your comments?

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v2 1/5] qemu-nbd: Honor SIGINT and SIGHUP
  2020-10-07 21:13     ` Eric Blake
@ 2020-10-08  8:27       ` Vladimir Sementsov-Ogievskiy
  0 siblings, 0 replies; 16+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-10-08  8:27 UTC (permalink / raw)
  To: Eric Blake, qemu-devel; +Cc: kwolf, stefanha, pkrempa, rjones, qemu-block

08.10.2020 00:13, Eric Blake wrote:
> On 10/7/20 5:32 AM, Vladimir Sementsov-Ogievskiy wrote:
>> 30.09.2020 15:11, Eric Blake wrote:
>>> Honoring just SIGTERM on Linux is too weak; we also want to handle
>>> other common signals, and do so even on BSD.  Why?  Because at least
>>> 'qemu-nbd -B bitmap' needs a chance to clean up the in-use bit on
>>> bitmaps when the server is shut down via a signal.
>>
>> Probably not bad to update a comment [*] if you have a good wording in
>> mind.
>>
>>>
>>> See also: http://bugzilla.redhat.com/1883608
>>>
>>> Signed-off-by: Eric Blake <eblake@redhat.com>
>>
>> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>>
>>> ---
>>>    qemu-nbd.c | 6 +++---
>>>    1 file changed, 3 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/qemu-nbd.c b/qemu-nbd.c
>>> index bacb69b0898b..e7520261134f 100644
>>> --- a/qemu-nbd.c
>>> +++ b/qemu-nbd.c
>>> @@ -581,7 +581,7 @@ int main(int argc, char **argv)
>>>        const char *pid_file_name = NULL;
>>>        BlockExportOptions *export_opts;
>>>
>>> -#if HAVE_NBD_DEVICE
>>> +#ifdef CONFIG_POSIX
>>>        /* The client thread uses SIGTERM to interrupt the server.  A
>>> signal
>>>         * handler ensures that "qemu-nbd -v -c" exits with a nice
>>> status code.
>>
>> [*]
>>
> 
> Sure, I went with:
> 
> diff --git i/qemu-nbd.c w/qemu-nbd.c
> index e7520261134f..c731dda04ec0 100644
> --- i/qemu-nbd.c
> +++ w/qemu-nbd.c
> @@ -582,8 +582,9 @@ int main(int argc, char **argv)
>       BlockExportOptions *export_opts;
> 
>   #ifdef CONFIG_POSIX
> -    /* The client thread uses SIGTERM to interrupt the server.  A signal
> -     * handler ensures that "qemu-nbd -v -c" exits with a nice status code.
> +    /*
> +     * Exit gracefully on various signals, which includes SIGTERM used
> +     * by 'qemu-nbd -v -c'.
>        */
>       struct sigaction sa_sigterm;
>       memset(&sa_sigterm, 0, sizeof(sa_sigterm));
> 

Thanks!

-- 
Best regards,
Vladimir


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v2 4/5] nbd: Add new qemu:allocation-depth metacontext
  2020-10-07 21:54     ` Eric Blake
@ 2020-10-08 13:04       ` Vladimir Sementsov-Ogievskiy
  0 siblings, 0 replies; 16+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-10-08 13:04 UTC (permalink / raw)
  To: Eric Blake, qemu-devel
  Cc: kwolf, stefanha, pkrempa, rjones, qemu-block, Max Reitz

08.10.2020 00:54, Eric Blake wrote:
> On 10/7/20 8:40 AM, Vladimir Sementsov-Ogievskiy wrote:
>> 30.09.2020 15:11, Eric Blake wrote:
>>> 'qemu-img map' provides a way to determine which extents of an image
>>> come from the top layer vs. inherited from a backing chain.  This is
>>> useful information worth exposing over NBD.  There is a proposal to
>>> add a QMP command block-dirty-bitmap-populate which can create a dirty
>>> bitmap that reflects allocation information, at which point
>>> qemu:dirty-bitmap:NAME can expose that information via the creation of
>>> a temporary bitmap, but we can shorten the effort by adding a new
>>> qemu:allocation-depth context that does the same thing without an
>>> intermediate bitmap (this patch does not eliminate the need for that
>>> proposal, as it will have other uses as well).
>>>
>>> For this patch, I just encoded a tri-state value (unallocated, from
>>> this layer, from any of the backing layers); an obvious extension
>>> would be to provide the actual depth in bits 31-4 while keeping bits
>>> 1-0 as a tri-state (leaving bits 3-2 unused, for ease of reading depth
>>> from a hex number).  But this extension would require
>>> bdrv_is_allocated_above to return a depth number.
>>>
>>> Note that this patch does not actually enable any way to request a
>>> server to enable this context; that will come in the next patch.
>>>
> 
>>> +In the allocation depth context, bits 0 and 1 form a tri-state value:
>>> +
>>> +    bits 0-1 clear: NBD_STATE_DEPTH_UNALLOC, means the extent is
>>> unallocated
>>> +    bit 0 set: NBD_STATE_DEPTH_LOCAL, the extent is allocated in this
>>> image
>>
>> Maybe, s/this/the top level/, as, what is "this" for NBD client?
> 
> Sure.
> 
> 
>>> @@ -864,12 +867,21 @@ static bool nbd_meta_qemu_query(NBDClient
>>> *client, NBDExportMetaContexts *meta,
>>>
>>>        if (!*query) {
> 
> We get here for "qemu:".
> 
>>>            if (client->opt == NBD_OPT_LIST_META_CONTEXT) {
>>> +            meta->allocation_depth = meta->exp->alloc_context;
>>>                meta->bitmap = !!meta->exp->export_bitmap;
>>>            }
>>>            trace_nbd_negotiate_meta_query_parse("empty");
>>>            return true;
>>>        }
>>>
>>> +    if (nbd_strshift(&query, "allocation-depth")) {
> 
> We get here for "qemu:allocation-depth", and as you pointed out,
> "qemu:allocation-depth-extended".
> 
>>> +        trace_nbd_negotiate_meta_query_parse("allocation-depth");
>>> +        if (nbd_meta_empty_or_pattern(client, "", query)) {
>>
>> How much it differs from "if (!*query) {", I don't see...
> 
> The nbd_meta_empty_or_pattern returns false for
> "qemu:allocation-depth-extended"; it only returns true for
> "qemu:allocation-depth".  But, as you pointed out,
> 
>>
>>> +            meta->allocation_depth = meta->exp->alloc_context;
>>> +        }
>>> +        return true;
>>> +    }
>>
>> why not just
>>
>>   if (!strcmp(query, "allocation-depth")) {
>>      meta->allocation_depth = meta->exp->alloc_context;
>>      return true;
>>   }
>>
>> ?
> 
> that does seem shorter.
> 
>>
>> With your code you also parse something like
>> "allocation-depth-extended".. Is it on purpose?
> 
> The string is parsed, but does not affect meta->XXX, which means nothing
> gets advertised to the client.  The trace messages might differ, but
> behavior is correct.
> 
>>
>>> +
>>>        if (nbd_strshift(&query, "dirty-bitmap:")) {
>>>            trace_nbd_negotiate_meta_query_parse("dirty-bitmap:");
>>>            if (!meta->exp->export_bitmap) {
>>
>>
>> Also, "trace_nbd_negotiate_meta_query_skip("not dirty-bitmap"); " at
>> function end should
>> now be something like trace_nbd_negotiate_meta_query_skip("unknown
>> context in qemu: namespace");
> 
> Good idea.
> 
> 
>>> +/* Get allocation depth from the exported device and send it to the
>>> client */
>>> +static int nbd_co_send_block_alloc(NBDClient *client, uint64_t handle,
>>> +                                   BlockDriverState *bs, uint64_t
>>> offset,
>>> +                                   uint32_t length, bool dont_fragment,
>>> +                                   bool last, uint32_t context_id,
>>> +                                   Error **errp)
>>> +{
>>> +    int ret;
>>> +    unsigned int nb_extents = dont_fragment ? 1 :
>>> NBD_MAX_BLOCK_STATUS_EXTENTS;
>>> +    g_autoptr(NBDExtentArray) ea = nbd_extent_array_new(nb_extents);
>>> +
>>> +    ret = blockalloc_to_extents(bs, offset, length, ea);
>>> +    if (ret < 0) {
>>> +        return nbd_co_send_structured_error(
>>> +                client, handle, -ret, "can't get block status", errp);
>>> +    }
>>> +
>>> +    return nbd_co_send_extents(client, handle, ea, last, context_id,
>>> errp);
>>> +}
>>
>>
>> The function is a copy of nbd_co_send_block_status()..
>>
>> Probably, we can reuse nbd_co_send_block_status(), and just call
>> blockstatus_to_extents()
>> or blockalloc_to_extents() depending on context_id parameter.
> 
> Nice idea to reduce the duplication.
> 
>>
>> Still, I'm OK with it as is.
>>

it was only about the duplicated function

> 
> So is that Reviewed-by:, or do I need to post v3 with my tweaks in
> response to your comments?
> 

Honestly it wasn't.. So, I found some things to discuss, and it's possible that I've looked through with less attention. Hm. So, with v3 or without, I've have to lookthrough it again..

My suggested diff:

diff --git a/docs/interop/nbd.txt b/docs/interop/nbd.txt
index 56efec7aee..a0d91ff08b 100644
--- a/docs/interop/nbd.txt
+++ b/docs/interop/nbd.txt
@@ -36,7 +36,8 @@ the image, with a single context named:
  In the allocation depth context, bits 0 and 1 form a tri-state value:
  
      bits 0-1 clear: NBD_STATE_DEPTH_UNALLOC, means the extent is unallocated
-    bit 0 set: NBD_STATE_DEPTH_LOCAL, the extent is allocated in this image
+    bit 0 set: NBD_STATE_DEPTH_LOCAL, the extent is allocated in the top
+               level image
      bit 1 set: NBD_STATE_DEPTH_BACKING, the extent is inherited from a
                 backing layer
  
diff --git a/include/block/nbd.h b/include/block/nbd.h
index 06208bc250..2d7165960d 100644
--- a/include/block/nbd.h
+++ b/include/block/nbd.h
@@ -262,7 +262,7 @@ enum {
  /* Extent flags for qemu:allocation-depth in NBD_REPLY_TYPE_BLOCK_STATUS */
  #define NBD_STATE_DEPTH_UNALLOC (0 << 0)
  #define NBD_STATE_DEPTH_LOCAL   (1 << 0)
-#define NBD_STATE_DEPTH_BACKING (2 << 0)
+#define NBD_STATE_DEPTH_BACKING (1 << 1)
  #define NBD_STATE_DEPTH_MASK    (0x3)
  
  static inline bool nbd_reply_type_is_error(int type)
diff --git a/nbd/server.c b/nbd/server.c
index 830b21000b..3761ac180f 100644
--- a/nbd/server.c
+++ b/nbd/server.c
@@ -809,7 +809,7 @@ static bool nbd_meta_empty_or_pattern(NBDClient *client, const char *pattern,
  {
      if (!*query) {
          trace_nbd_negotiate_meta_query_parse("empty");
-        return !*pattern || client->opt == NBD_OPT_LIST_META_CONTEXT;
+        return client->opt == NBD_OPT_LIST_META_CONTEXT;
      }
      if (strcmp(query, pattern) == 0) {
          trace_nbd_negotiate_meta_query_parse(pattern);
@@ -874,11 +874,9 @@ static bool nbd_meta_qemu_query(NBDClient *client, NBDExportMetaContexts *meta,
          return true;
      }
  
-    if (nbd_strshift(&query, "allocation-depth")) {
+    if (!strcmp(query, "allocation-depth")) {
          trace_nbd_negotiate_meta_query_parse("allocation-depth");
-        if (nbd_meta_empty_or_pattern(client, "", query)) {
-            meta->allocation_depth = meta->exp->alloc_context;
-        }
+        meta->allocation_depth = meta->exp->alloc_context;
          return true;
      }
  
@@ -896,7 +894,7 @@ static bool nbd_meta_qemu_query(NBDClient *client, NBDExportMetaContexts *meta,
          return true;
      }
  
-    trace_nbd_negotiate_meta_query_skip("not dirty-bitmap");
+    trace_nbd_negotiate_meta_query_skip("unknown context in qemu: namespace");
      return true;
  }
  
@@ -2045,7 +2043,11 @@ static int nbd_co_send_extents(NBDClient *client, uint64_t handle,
      return nbd_co_send_iov(client, iov, 2, errp);
  }
  
-/* Get block status from the exported device and send it to the client */
+/*
+ * Get block status from the exported device and send it to the client,
+ * handling base:allocation or qemu:allocation-depth meta-context, based
+ * on @context_id value.
+ */
  static int nbd_co_send_block_status(NBDClient *client, uint64_t handle,
                                      BlockDriverState *bs, uint64_t offset,
                                      uint32_t length, bool dont_fragment,
@@ -2056,27 +2058,12 @@ static int nbd_co_send_block_status(NBDClient *client, uint64_t handle,
      unsigned int nb_extents = dont_fragment ? 1 : NBD_MAX_BLOCK_STATUS_EXTENTS;
      g_autoptr(NBDExtentArray) ea = nbd_extent_array_new(nb_extents);
  
-    ret = blockstatus_to_extents(bs, offset, length, ea);
-    if (ret < 0) {
-        return nbd_co_send_structured_error(
-                client, handle, -ret, "can't get block status", errp);
+    if (context_id == NBD_META_ID_BASE_ALLOCATION) {
+        ret = blockstatus_to_extents(bs, offset, length, ea);
+    } else {
+        assert(context_id == NBD_META_ID_ALLOCATION_DEPTH);
+        ret = blockalloc_to_extents(bs, offset, length, ea);
      }
-
-    return nbd_co_send_extents(client, handle, ea, last, context_id, errp);
-}
-
-/* Get allocation depth from the exported device and send it to the client */
-static int nbd_co_send_block_alloc(NBDClient *client, uint64_t handle,
-                                   BlockDriverState *bs, uint64_t offset,
-                                   uint32_t length, bool dont_fragment,
-                                   bool last, uint32_t context_id,
-                                   Error **errp)
-{
-    int ret;
-    unsigned int nb_extents = dont_fragment ? 1 : NBD_MAX_BLOCK_STATUS_EXTENTS;
-    g_autoptr(NBDExtentArray) ea = nbd_extent_array_new(nb_extents);
-
-    ret = blockalloc_to_extents(bs, offset, length, ea);
      if (ret < 0) {
          return nbd_co_send_structured_error(
                  client, handle, -ret, "can't get block status", errp);
@@ -2433,13 +2420,13 @@ static coroutine_fn int nbd_handle_request(NBDClient *client,
              }
  
              if (client->export_meta.allocation_depth) {
-                ret = nbd_co_send_block_alloc(client, request->handle,
-                                              blk_bs(exp->common.blk),
-                                              request->from, request->len,
-                                              dont_fragment,
-                                              !--contexts_remaining,
-                                              NBD_META_ID_ALLOCATION_DEPTH,
-                                              errp);
+                ret = nbd_co_send_block_status(client, request->handle,
+                                               blk_bs(exp->common.blk),
+                                               request->from, request->len,
+                                               dont_fragment,
+                                               !--contexts_remaining,
+                                               NBD_META_ID_ALLOCATION_DEPTH,
+                                               errp);
                  if (ret < 0) {
                      return ret;
                  }



with it:
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>

-- 
Best regards,
Vladimir


^ permalink raw reply related	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2020-10-08 13:10 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-09-30 12:11 [PATCH v2 0/5] Exposing backing-chain allocation over NBD Eric Blake
2020-09-30 12:11 ` [PATCH v2 1/5] qemu-nbd: Honor SIGINT and SIGHUP Eric Blake
2020-10-07 10:32   ` Vladimir Sementsov-Ogievskiy
2020-10-07 21:13     ` Eric Blake
2020-10-08  8:27       ` Vladimir Sementsov-Ogievskiy
2020-09-30 12:11 ` [PATCH v2 2/5] nbd/server: Reject embedded NUL in NBD strings Eric Blake
2020-10-07 10:39   ` Vladimir Sementsov-Ogievskiy
2020-09-30 12:11 ` [PATCH v2 3/5] nbd: Simplify meta-context parsing Eric Blake
2020-10-07 11:51   ` Vladimir Sementsov-Ogievskiy
2020-10-07 21:28     ` Eric Blake
2020-09-30 12:11 ` [PATCH v2 4/5] nbd: Add new qemu:allocation-depth metacontext Eric Blake
2020-10-07 13:40   ` Vladimir Sementsov-Ogievskiy
2020-10-07 21:54     ` Eric Blake
2020-10-08 13:04       ` Vladimir Sementsov-Ogievskiy
2020-09-30 12:11 ` [PATCH v2 5/5] nbd: Add 'qemu-nbd -A' to expose allocation depth Eric Blake
2020-10-07 14:03   ` Vladimir Sementsov-Ogievskiy

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.