All of lore.kernel.org
 help / color / mirror / Atom feed
* [Qemu-devel] [PULL 0/9] NBD patches through 2019-09-05
@ 2019-09-05 18:21 Eric Blake
  2019-09-05 18:21 ` [Qemu-devel] [PULL 1/9] nbd: Advertise multi-conn for shared read-only connections Eric Blake
                   ` (9 more replies)
  0 siblings, 10 replies; 12+ messages in thread
From: Eric Blake @ 2019-09-05 18:21 UTC (permalink / raw)
  To: qemu-devel

The following changes since commit eac2f39602e0423adf56be410c9a22c31fec9a81:

  target/arm: Inline gen_bx_im into callers (2019-09-05 13:23:04 +0100)

are available in the Git repository at:

  https://repo.or.cz/qemu/ericb.git tags/pull-nbd-2019-09-05

for you to fetch changes up to 73bddca33cb1749ddcbcc1e9972a77d93000553b:

  nbd: Implement server use of NBD FAST_ZERO (2019-09-05 10:48:46 -0500)

----------------------------------------------------------------
nbd patches for 2019-09-05

- Advertise NBD_FLAG_CAN_MULTI_CONN on readonly images
- Tolerate larger set of server error responses during handshake
- More precision on handling fallocate() failures due to alignment
- Better documentation of NBD connection URIs
- Implement new extension NBD_CMD_FLAG_FAST_ZERO to benefit qemu-img convert

----------------------------------------------------------------
Andrey Shinkevich (1):
      block: workaround for unaligned byte range in fallocate()

Eric Blake (8):
      nbd: Advertise multi-conn for shared read-only connections
      nbd: Use g_autofree in a few places
      nbd: Tolerate more errors to structured reply request
      docs: Update preferred NBD device syntax
      nbd: Improve per-export flag handling in server
      nbd: Prepare for NBD_CMD_FLAG_FAST_ZERO
      nbd: Implement client use of NBD FAST_ZERO
      nbd: Implement server use of NBD FAST_ZERO

 docs/interop/nbd.txt |  2 ++
 qemu-doc.texi        | 11 +++++--
 include/block/nbd.h  |  6 +++-
 block/io.c           |  2 +-
 block/file-posix.c   |  7 +++++
 block/nbd.c          | 18 ++++++-----
 blockdev-nbd.c       |  3 +-
 nbd/client.c         | 85 +++++++++++++++++++++++++---------------------------
 nbd/common.c         |  5 ++++
 nbd/server.c         | 83 ++++++++++++++++++++++++++++----------------------
 qemu-nbd.c           |  7 +++--
 nbd/trace-events     |  2 +-
 12 files changed, 134 insertions(+), 97 deletions(-)

-- 
2.21.0



^ permalink raw reply	[flat|nested] 12+ messages in thread

* [Qemu-devel] [PULL 1/9] nbd: Advertise multi-conn for shared read-only connections
  2019-09-05 18:21 [Qemu-devel] [PULL 0/9] NBD patches through 2019-09-05 Eric Blake
@ 2019-09-05 18:21 ` Eric Blake
  2019-09-05 18:21 ` [Qemu-devel] [PULL 2/9] nbd: Use g_autofree in a few places Eric Blake
                   ` (8 subsequent siblings)
  9 siblings, 0 replies; 12+ messages in thread
From: Eric Blake @ 2019-09-05 18:21 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, John Snow, open list:Network Block Dev..., Max Reitz

The NBD specification defines NBD_FLAG_CAN_MULTI_CONN, which can be
advertised when the server promises cache consistency between
simultaneous clients (basically, rules that determine what FUA and
flush from one client are able to guarantee for reads from another
client).  When we don't permit simultaneous clients (such as qemu-nbd
without -e), the bit makes no sense; and for writable images, we
probably have a lot more work before we can declare that actions from
one client are cache-consistent with actions from another.  But for
read-only images, where flush isn't changing any data, we might as
well advertise multi-conn support.  What's more, advertisement of the
bit makes it easier for clients to determine if 'qemu-nbd -e' was in
use, where a second connection will succeed rather than hang until the
first client goes away.

This patch affects qemu as server in advertising the bit.  We may want
to consider patches to qemu as client to attempt parallel connections
for higher throughput by spreading the load over those connections
when a server advertises multi-conn, but for now sticking to one
connection per nbd:// BDS is okay.

See also: https://bugzilla.redhat.com/1708300
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20190815185024.7010-1-eblake@redhat.com>
[eblake: tweak blockdev-nbd.c to not request shared when writable]
Reviewed-by: John Snow <jsnow@redhat.com>
---
 docs/interop/nbd.txt | 1 +
 include/block/nbd.h  | 2 +-
 blockdev-nbd.c       | 2 +-
 nbd/server.c         | 4 +++-
 qemu-nbd.c           | 2 +-
 5 files changed, 7 insertions(+), 4 deletions(-)

diff --git a/docs/interop/nbd.txt b/docs/interop/nbd.txt
index fc64473e02b2..6dfec7f47647 100644
--- a/docs/interop/nbd.txt
+++ b/docs/interop/nbd.txt
@@ -53,3 +53,4 @@ the operation of that feature.
 * 2.12: NBD_CMD_BLOCK_STATUS for "base:allocation"
 * 3.0: NBD_OPT_STARTTLS with TLS Pre-Shared Keys (PSK),
 NBD_CMD_BLOCK_STATUS for "qemu:dirty-bitmap:", NBD_CMD_CACHE
+* 4.2: NBD_FLAG_CAN_MULTI_CONN for sharable read-only exports
diff --git a/include/block/nbd.h b/include/block/nbd.h
index 7b36d672f046..991fd52a5134 100644
--- a/include/block/nbd.h
+++ b/include/block/nbd.h
@@ -326,7 +326,7 @@ typedef struct NBDClient NBDClient;

 NBDExport *nbd_export_new(BlockDriverState *bs, uint64_t dev_offset,
                           uint64_t size, const char *name, const char *desc,
-                          const char *bitmap, uint16_t nbdflags,
+                          const char *bitmap, uint16_t nbdflags, bool shared,
                           void (*close)(NBDExport *), bool writethrough,
                           BlockBackend *on_eject_blk, Error **errp);
 void nbd_export_close(NBDExport *exp);
diff --git a/blockdev-nbd.c b/blockdev-nbd.c
index c621686131fd..1fcfdb0997c6 100644
--- a/blockdev-nbd.c
+++ b/blockdev-nbd.c
@@ -188,7 +188,7 @@ void qmp_nbd_server_add(const char *device, bool has_name, const char *name,
     }

     exp = nbd_export_new(bs, 0, len, name, NULL, bitmap,
-                         writable ? 0 : NBD_FLAG_READ_ONLY,
+                         writable ? 0 : NBD_FLAG_READ_ONLY, !writable,
                          NULL, false, on_eject_blk, errp);
     if (!exp) {
         return;
diff --git a/nbd/server.c b/nbd/server.c
index f55ccf8edfde..0fb41c6c50ea 100644
--- a/nbd/server.c
+++ b/nbd/server.c
@@ -1461,7 +1461,7 @@ static void nbd_eject_notifier(Notifier *n, void *data)

 NBDExport *nbd_export_new(BlockDriverState *bs, uint64_t dev_offset,
                           uint64_t size, const char *name, const char *desc,
-                          const char *bitmap, uint16_t nbdflags,
+                          const char *bitmap, uint16_t nbdflags, bool shared,
                           void (*close)(NBDExport *), bool writethrough,
                           BlockBackend *on_eject_blk, Error **errp)
 {
@@ -1487,6 +1487,8 @@ NBDExport *nbd_export_new(BlockDriverState *bs, uint64_t dev_offset,
     perm = BLK_PERM_CONSISTENT_READ;
     if ((nbdflags & NBD_FLAG_READ_ONLY) == 0) {
         perm |= BLK_PERM_WRITE;
+    } else if (shared) {
+        nbdflags |= NBD_FLAG_CAN_MULTI_CONN;
     }
     blk = blk_new(bdrv_get_aio_context(bs), perm,
                   BLK_PERM_CONSISTENT_READ | BLK_PERM_WRITE_UNCHANGED |
diff --git a/qemu-nbd.c b/qemu-nbd.c
index 83b6c32d73aa..2403ef3d0f9f 100644
--- a/qemu-nbd.c
+++ b/qemu-nbd.c
@@ -1173,7 +1173,7 @@ int main(int argc, char **argv)
     }

     export = nbd_export_new(bs, dev_offset, fd_size, export_name,
-                            export_description, bitmap, nbdflags,
+                            export_description, bitmap, nbdflags, shared > 1,
                             nbd_export_closed, writethrough, NULL,
                             &error_fatal);

-- 
2.21.0



^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [Qemu-devel] [PULL 2/9] nbd: Use g_autofree in a few places
  2019-09-05 18:21 [Qemu-devel] [PULL 0/9] NBD patches through 2019-09-05 Eric Blake
  2019-09-05 18:21 ` [Qemu-devel] [PULL 1/9] nbd: Advertise multi-conn for shared read-only connections Eric Blake
@ 2019-09-05 18:21 ` Eric Blake
  2019-09-05 18:21 ` [Qemu-devel] [PULL 3/9] nbd: Tolerate more errors to structured reply request Eric Blake
                   ` (7 subsequent siblings)
  9 siblings, 0 replies; 12+ messages in thread
From: Eric Blake @ 2019-09-05 18:21 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Vladimir Sementsov-Ogievskiy,
	Daniel P . Berrangé, open list:Network Block Dev...,
	Max Reitz

Thanks to our recent move to use glib's g_autofree, I can join the
bandwagon.  Getting rid of gotos is fun ;)

There are probably more places where we could register cleanup
functions and get rid of more gotos; this patch just focuses on the
labels that existed merely to call g_free.

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20190824172813.29720-2-eblake@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 block/nbd.c  | 11 ++++-------
 nbd/client.c | 22 +++++++---------------
 nbd/server.c | 12 ++++--------
 3 files changed, 15 insertions(+), 30 deletions(-)

diff --git a/block/nbd.c b/block/nbd.c
index beed46fb3414..c4c91a158602 100644
--- a/block/nbd.c
+++ b/block/nbd.c
@@ -1374,7 +1374,7 @@ static bool nbd_has_filename_options_conflict(QDict *options, Error **errp)
 static void nbd_parse_filename(const char *filename, QDict *options,
                                Error **errp)
 {
-    char *file;
+    g_autofree char *file = NULL;
     char *export_name;
     const char *host_spec;
     const char *unixpath;
@@ -1396,7 +1396,7 @@ static void nbd_parse_filename(const char *filename, QDict *options,
     export_name = strstr(file, EN_OPTSTR);
     if (export_name) {
         if (export_name[strlen(EN_OPTSTR)] == 0) {
-            goto out;
+            return;
         }
         export_name[0] = 0; /* truncate 'file' */
         export_name += strlen(EN_OPTSTR);
@@ -1407,11 +1407,11 @@ static void nbd_parse_filename(const char *filename, QDict *options,
     /* extract the host_spec - fail if it's not nbd:... */
     if (!strstart(file, "nbd:", &host_spec)) {
         error_setg(errp, "File name string for NBD must start with 'nbd:'");
-        goto out;
+        return;
     }

     if (!*host_spec) {
-        goto out;
+        return;
     }

     /* are we a UNIX or TCP socket? */
@@ -1431,9 +1431,6 @@ static void nbd_parse_filename(const char *filename, QDict *options,
     out_inet:
         qapi_free_InetSocketAddress(addr);
     }
-
-out:
-    g_free(file);
 }

 static bool nbd_process_legacy_socket_options(QDict *output_options,
diff --git a/nbd/client.c b/nbd/client.c
index 49bf9906f94b..a9d8d32feff7 100644
--- a/nbd/client.c
+++ b/nbd/client.c
@@ -247,12 +247,11 @@ static int nbd_handle_reply_err(QIOChannel *ioc, NBDOptionReply *reply,
 static int nbd_receive_list(QIOChannel *ioc, char **name, char **description,
                             Error **errp)
 {
-    int ret = -1;
     NBDOptionReply reply;
     uint32_t len;
     uint32_t namelen;
-    char *local_name = NULL;
-    char *local_desc = NULL;
+    g_autofree char *local_name = NULL;
+    g_autofree char *local_desc = NULL;
     int error;

     if (nbd_receive_option_reply(ioc, NBD_OPT_LIST, &reply, errp) < 0) {
@@ -298,7 +297,7 @@ static int nbd_receive_list(QIOChannel *ioc, char **name, char **description,
     local_name = g_malloc(namelen + 1);
     if (nbd_read(ioc, local_name, namelen, "export name", errp) < 0) {
         nbd_send_opt_abort(ioc);
-        goto out;
+        return -1;
     }
     local_name[namelen] = '\0';
     len -= namelen;
@@ -306,24 +305,17 @@ static int nbd_receive_list(QIOChannel *ioc, char **name, char **description,
         local_desc = g_malloc(len + 1);
         if (nbd_read(ioc, local_desc, len, "export description", errp) < 0) {
             nbd_send_opt_abort(ioc);
-            goto out;
+            return -1;
         }
         local_desc[len] = '\0';
     }

     trace_nbd_receive_list(local_name, local_desc ?: "");
-    *name = local_name;
-    local_name = NULL;
+    *name = g_steal_pointer(&local_name);
     if (description) {
-        *description = local_desc;
-        local_desc = NULL;
+        *description = g_steal_pointer(&local_desc);
     }
-    ret = 1;
-
- out:
-    g_free(local_name);
-    g_free(local_desc);
-    return ret;
+    return 1;
 }


diff --git a/nbd/server.c b/nbd/server.c
index 0fb41c6c50ea..74d205812fee 100644
--- a/nbd/server.c
+++ b/nbd/server.c
@@ -206,7 +206,7 @@ static int GCC_FMT_ATTR(4, 0)
 nbd_negotiate_send_rep_verr(NBDClient *client, uint32_t type,
                             Error **errp, const char *fmt, va_list va)
 {
-    char *msg;
+    g_autofree char *msg = NULL;
     int ret;
     size_t len;

@@ -216,18 +216,14 @@ nbd_negotiate_send_rep_verr(NBDClient *client, uint32_t type,
     trace_nbd_negotiate_send_rep_err(msg);
     ret = nbd_negotiate_send_rep_len(client, type, len, errp);
     if (ret < 0) {
-        goto out;
+        return ret;
     }
     if (nbd_write(client->ioc, msg, len, errp) < 0) {
         error_prepend(errp, "write failed (error message): ");
-        ret = -EIO;
-    } else {
-        ret = 0;
+        return -EIO;
     }

-out:
-    g_free(msg);
-    return ret;
+    return 0;
 }

 /* Send an error reply.
-- 
2.21.0



^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [Qemu-devel] [PULL 3/9] nbd: Tolerate more errors to structured reply request
  2019-09-05 18:21 [Qemu-devel] [PULL 0/9] NBD patches through 2019-09-05 Eric Blake
  2019-09-05 18:21 ` [Qemu-devel] [PULL 1/9] nbd: Advertise multi-conn for shared read-only connections Eric Blake
  2019-09-05 18:21 ` [Qemu-devel] [PULL 2/9] nbd: Use g_autofree in a few places Eric Blake
@ 2019-09-05 18:21 ` Eric Blake
  2019-09-05 18:21 ` [Qemu-devel] [PULL 4/9] block: workaround for unaligned byte range in fallocate() Eric Blake
                   ` (6 subsequent siblings)
  9 siblings, 0 replies; 12+ messages in thread
From: Eric Blake @ 2019-09-05 18:21 UTC (permalink / raw)
  To: qemu-devel; +Cc: Daniel P . Berrangé, open list:Network Block Dev...

A server may have a reason to reject a request for structured replies,
beyond just not recognizing them as a valid request; similarly, it may
have a reason for rejecting a request for a meta context.  It doesn't
hurt us to continue talking to such a server; otherwise 'qemu-nbd
--list' of such a server fails to display all available details about
the export.

Encountered when temporarily tweaking nbdkit to reply with
NBD_REP_ERR_POLICY.  Present since structured reply support was first
added (commit d795299b reused starttls handling, but starttls is
different in that we can't fall back to other behavior on any error).

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20190824172813.29720-3-eblake@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
---
 nbd/client.c     | 63 +++++++++++++++++++++++++-----------------------
 nbd/trace-events |  2 +-
 2 files changed, 34 insertions(+), 31 deletions(-)

diff --git a/nbd/client.c b/nbd/client.c
index a9d8d32feff7..b9dc829175f9 100644
--- a/nbd/client.c
+++ b/nbd/client.c
@@ -1,5 +1,5 @@
 /*
- *  Copyright (C) 2016-2018 Red Hat, Inc.
+ *  Copyright (C) 2016-2019 Red Hat, Inc.
  *  Copyright (C) 2005  Anthony Liguori <anthony@codemonkey.ws>
  *
  *  Network Block Device Client Side
@@ -142,17 +142,18 @@ static int nbd_receive_option_reply(QIOChannel *ioc, uint32_t opt,
     return 0;
 }

-/* If reply represents success, return 1 without further action.
- * If reply represents an error, consume the optional payload of
- * the packet on ioc.  Then return 0 for unsupported (so the client
- * can fall back to other approaches), or -1 with errp set for other
- * errors.
+/*
+ * If reply represents success, return 1 without further action.  If
+ * reply represents an error, consume the optional payload of the
+ * packet on ioc.  Then return 0 for unsupported (so the client can
+ * fall back to other approaches), where @strict determines if only
+ * ERR_UNSUP or all errors fit that category, or -1 with errp set for
+ * other errors.
  */
 static int nbd_handle_reply_err(QIOChannel *ioc, NBDOptionReply *reply,
-                                Error **errp)
+                                bool strict, Error **errp)
 {
-    char *msg = NULL;
-    int result = -1;
+    g_autofree char *msg = NULL;

     if (!(reply->type & (1 << 31))) {
         return 1;
@@ -163,26 +164,28 @@ static int nbd_handle_reply_err(QIOChannel *ioc, NBDOptionReply *reply,
             error_setg(errp, "server error %" PRIu32
                        " (%s) message is too long",
                        reply->type, nbd_rep_lookup(reply->type));
-            goto cleanup;
+            goto err;
         }
         msg = g_malloc(reply->length + 1);
         if (nbd_read(ioc, msg, reply->length, NULL, errp) < 0) {
             error_prepend(errp, "Failed to read option error %" PRIu32
                           " (%s) message: ",
                           reply->type, nbd_rep_lookup(reply->type));
-            goto cleanup;
+            goto err;
         }
         msg[reply->length] = '\0';
         trace_nbd_server_error_msg(reply->type,
                                    nbd_reply_type_lookup(reply->type), msg);
     }

+    if (reply->type == NBD_REP_ERR_UNSUP || !strict) {
+        trace_nbd_reply_err_ignored(reply->option,
+                                    nbd_opt_lookup(reply->option),
+                                    reply->type, nbd_rep_lookup(reply->type));
+        return 0;
+    }
+
     switch (reply->type) {
-    case NBD_REP_ERR_UNSUP:
-        trace_nbd_reply_err_unsup(reply->option, nbd_opt_lookup(reply->option));
-        result = 0;
-        goto cleanup;
-
     case NBD_REP_ERR_POLICY:
         error_setg(errp, "Denied by server for option %" PRIu32 " (%s)",
                    reply->option, nbd_opt_lookup(reply->option));
@@ -227,12 +230,9 @@ static int nbd_handle_reply_err(QIOChannel *ioc, NBDOptionReply *reply,
         error_append_hint(errp, "server reported: %s\n", msg);
     }

- cleanup:
-    g_free(msg);
-    if (result < 0) {
-        nbd_send_opt_abort(ioc);
-    }
-    return result;
+ err:
+    nbd_send_opt_abort(ioc);
+    return -1;
 }

 /* nbd_receive_list:
@@ -257,7 +257,7 @@ static int nbd_receive_list(QIOChannel *ioc, char **name, char **description,
     if (nbd_receive_option_reply(ioc, NBD_OPT_LIST, &reply, errp) < 0) {
         return -1;
     }
-    error = nbd_handle_reply_err(ioc, &reply, errp);
+    error = nbd_handle_reply_err(ioc, &reply, true, errp);
     if (error <= 0) {
         return error;
     }
@@ -363,7 +363,7 @@ static int nbd_opt_info_or_go(QIOChannel *ioc, uint32_t opt,
         if (nbd_receive_option_reply(ioc, opt, &reply, errp) < 0) {
             return -1;
         }
-        error = nbd_handle_reply_err(ioc, &reply, errp);
+        error = nbd_handle_reply_err(ioc, &reply, true, errp);
         if (error <= 0) {
             return error;
         }
@@ -538,12 +538,15 @@ static int nbd_receive_query_exports(QIOChannel *ioc,
     }
 }

-/* nbd_request_simple_option: Send an option request, and parse the reply
+/*
+ * nbd_request_simple_option: Send an option request, and parse the reply.
+ * @strict controls whether ERR_UNSUP or all errors produce 0 status.
  * return 1 for successful negotiation,
  *        0 if operation is unsupported,
  *        -1 with errp set for any other error
  */
-static int nbd_request_simple_option(QIOChannel *ioc, int opt, Error **errp)
+static int nbd_request_simple_option(QIOChannel *ioc, int opt, bool strict,
+                                     Error **errp)
 {
     NBDOptionReply reply;
     int error;
@@ -555,7 +558,7 @@ static int nbd_request_simple_option(QIOChannel *ioc, int opt, Error **errp)
     if (nbd_receive_option_reply(ioc, opt, &reply, errp) < 0) {
         return -1;
     }
-    error = nbd_handle_reply_err(ioc, &reply, errp);
+    error = nbd_handle_reply_err(ioc, &reply, strict, errp);
     if (error <= 0) {
         return error;
     }
@@ -587,7 +590,7 @@ static QIOChannel *nbd_receive_starttls(QIOChannel *ioc,
     QIOChannelTLS *tioc;
     struct NBDTLSHandshakeData data = { 0 };

-    ret = nbd_request_simple_option(ioc, NBD_OPT_STARTTLS, errp);
+    ret = nbd_request_simple_option(ioc, NBD_OPT_STARTTLS, true, errp);
     if (ret <= 0) {
         if (ret == 0) {
             error_setg(errp, "Server don't support STARTTLS option");
@@ -687,7 +690,7 @@ static int nbd_receive_one_meta_context(QIOChannel *ioc,
         return -1;
     }

-    ret = nbd_handle_reply_err(ioc, &reply, errp);
+    ret = nbd_handle_reply_err(ioc, &reply, false, errp);
     if (ret <= 0) {
         return ret;
     }
@@ -943,7 +946,7 @@ static int nbd_start_negotiate(AioContext *aio_context, QIOChannel *ioc,
             if (structured_reply) {
                 result = nbd_request_simple_option(ioc,
                                                    NBD_OPT_STRUCTURED_REPLY,
-                                                   errp);
+                                                   false, errp);
                 if (result < 0) {
                     return -EINVAL;
                 }
diff --git a/nbd/trace-events b/nbd/trace-events
index 7ab6b3788cb2..f6cde967903a 100644
--- a/nbd/trace-events
+++ b/nbd/trace-events
@@ -4,7 +4,7 @@
 nbd_send_option_request(uint32_t opt, const char *name, uint32_t len) "Sending option request %" PRIu32" (%s), len %" PRIu32
 nbd_receive_option_reply(uint32_t option, const char *optname, uint32_t type, const char *typename, uint32_t length) "Received option reply %" PRIu32" (%s), type %" PRIu32" (%s), len %" PRIu32
 nbd_server_error_msg(uint32_t err, const char *type, const char *msg) "server reported error 0x%" PRIx32 " (%s) with additional message: %s"
-nbd_reply_err_unsup(uint32_t option, const char *name) "server doesn't understand request %" PRIu32 " (%s), attempting fallback"
+nbd_reply_err_ignored(uint32_t option, const char *name, uint32_t reply, const char *reply_name) "server failed request %" PRIu32 " (%s) with error 0x%" PRIx32 " (%s), attempting fallback"
 nbd_receive_list(const char *name, const char *desc) "export list includes '%s', description '%s'"
 nbd_opt_info_go_start(const char *opt, const char *name) "Attempting %s for export '%s'"
 nbd_opt_info_go_success(const char *opt) "Export is ready after %s request"
-- 
2.21.0



^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [Qemu-devel] [PULL 4/9] block: workaround for unaligned byte range in fallocate()
  2019-09-05 18:21 [Qemu-devel] [PULL 0/9] NBD patches through 2019-09-05 Eric Blake
                   ` (2 preceding siblings ...)
  2019-09-05 18:21 ` [Qemu-devel] [PULL 3/9] nbd: Tolerate more errors to structured reply request Eric Blake
@ 2019-09-05 18:21 ` Eric Blake
  2019-09-06 12:51   ` Andrey Shinkevich
  2019-09-05 18:21 ` [Qemu-devel] [PULL 5/9] docs: Update preferred NBD device syntax Eric Blake
                   ` (5 subsequent siblings)
  9 siblings, 1 reply; 12+ messages in thread
From: Eric Blake @ 2019-09-05 18:21 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Fam Zheng, open list:raw, Max Reitz, Stefan Hajnoczi,
	Denis V . Lunev, Andrey Shinkevich

From: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>

Revert the commit 118f99442d 'block/io.c: fix for the allocation failure'
and use better error handling for file systems that do not support
fallocate() for an unaligned byte range. Allow falling back to pwrite
in case fallocate() returns EINVAL.

Suggested-by: Kevin Wolf <kwolf@redhat.com>
Suggested-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Denis V. Lunev <den@openvz.org>
Message-Id: <1566913973-15490-1-git-send-email-andrey.shinkevich@virtuozzo.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
---
 block/io.c         | 2 +-
 block/file-posix.c | 7 +++++++
 2 files changed, 8 insertions(+), 1 deletion(-)

diff --git a/block/io.c b/block/io.c
index 0fa10831edb7..16a598fd0857 100644
--- a/block/io.c
+++ b/block/io.c
@@ -1746,7 +1746,7 @@ static int coroutine_fn bdrv_co_do_pwrite_zeroes(BlockDriverState *bs,
             assert(!bs->supported_zero_flags);
         }

-        if (ret < 0 && !(flags & BDRV_REQ_NO_FALLBACK)) {
+        if (ret == -ENOTSUP && !(flags & BDRV_REQ_NO_FALLBACK)) {
             /* Fall back to bounce buffer if write zeroes is unsupported */
             BdrvRequestFlags write_flags = flags & ~BDRV_REQ_ZERO_WRITE;

diff --git a/block/file-posix.c b/block/file-posix.c
index 71f168ee2f13..87c5a4ccbdc8 100644
--- a/block/file-posix.c
+++ b/block/file-posix.c
@@ -1588,6 +1588,13 @@ static int handle_aiocb_write_zeroes(void *opaque)
     if (s->has_write_zeroes) {
         int ret = do_fallocate(s->fd, FALLOC_FL_ZERO_RANGE,
                                aiocb->aio_offset, aiocb->aio_nbytes);
+        if (ret == -EINVAL) {
+            /*
+             * Allow falling back to pwrite for file systems that
+             * do not support fallocate() for an unaligned byte range.
+             */
+            return -ENOTSUP;
+        }
         if (ret == 0 || ret != -ENOTSUP) {
             return ret;
         }
-- 
2.21.0



^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [Qemu-devel] [PULL 5/9] docs: Update preferred NBD device syntax
  2019-09-05 18:21 [Qemu-devel] [PULL 0/9] NBD patches through 2019-09-05 Eric Blake
                   ` (3 preceding siblings ...)
  2019-09-05 18:21 ` [Qemu-devel] [PULL 4/9] block: workaround for unaligned byte range in fallocate() Eric Blake
@ 2019-09-05 18:21 ` Eric Blake
  2019-09-05 18:21 ` [Qemu-devel] [PULL 6/9] nbd: Improve per-export flag handling in server Eric Blake
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 12+ messages in thread
From: Eric Blake @ 2019-09-05 18:21 UTC (permalink / raw)
  To: qemu-devel; +Cc: John Snow

Mention the preferred URI form, especially since NBD is trying to
standardize that form: https://lists.debian.org/nbd/2019/06/msg00012.html

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20190903145634.20237-1-eblake@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
---
 qemu-doc.texi | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/qemu-doc.texi b/qemu-doc.texi
index 577d1e837640..ec460f1d7af5 100644
--- a/qemu-doc.texi
+++ b/qemu-doc.texi
@@ -297,9 +297,16 @@ qemu-system-i386 -drive file=iscsi://192.0.2.1/iqn.2001-04.com.example/1

 @item NBD
 QEMU supports NBD (Network Block Devices) both using TCP protocol as well
-as Unix Domain Sockets.
+as Unix Domain Sockets.  With TCP, the default port is 10809.

-Syntax for specifying a NBD device using TCP
+Syntax for specifying a NBD device using TCP, in preferred URI form:
+``nbd://<server-ip>[:<port>]/[<export>]''
+
+Syntax for specifying a NBD device using Unix Domain Sockets; remember
+that '?' is a shell glob character and may need quoting:
+``nbd+unix:///[<export>]?socket=<domain-socket>''
+
+Older syntax that is also recognized:
 ``nbd:<server-ip>:<port>[:exportname=<export>]''

 Syntax for specifying a NBD device using Unix Domain Sockets
-- 
2.21.0



^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [Qemu-devel] [PULL 6/9] nbd: Improve per-export flag handling in server
  2019-09-05 18:21 [Qemu-devel] [PULL 0/9] NBD patches through 2019-09-05 Eric Blake
                   ` (4 preceding siblings ...)
  2019-09-05 18:21 ` [Qemu-devel] [PULL 5/9] docs: Update preferred NBD device syntax Eric Blake
@ 2019-09-05 18:21 ` Eric Blake
  2019-09-05 18:21 ` [Qemu-devel] [PULL 7/9] nbd: Prepare for NBD_CMD_FLAG_FAST_ZERO Eric Blake
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 12+ messages in thread
From: Eric Blake @ 2019-09-05 18:21 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Vladimir Sementsov-Ogievskiy,
	open list:Block layer core, Max Reitz

When creating a read-only image, we are still advertising support for
TRIM and WRITE_ZEROES to the client, even though the client should not
be issuing those commands.  But seeing this requires looking across
multiple functions:

All callers to nbd_export_new() passed a single flag based solely on
whether the export allows writes.  Later, we then pass a constant set
of flags to nbd_negotiate_options() (namely, the set of flags which we
always support, at least for writable images), which is then further
dynamically modified with NBD_FLAG_SEND_DF based on client requests
for structured options.  Finally, when processing NBD_OPT_EXPORT_NAME
or NBD_OPT_EXPORT_GO we bitwise-or the original caller's flag with the
runtime set of flags we've built up over several functions.

Let's refactor things to instead compute a baseline of flags as soon
as possible which gets shared between multiple clients, in
nbd_export_new(), and changing the signature for the callers to pass
in a simpler bool rather than having to figure out flags.  We can then
get rid of the 'myflags' parameter to various functions, and instead
refer to client for everything we need (we still have to perform a
bitwise-OR for NBD_FLAG_SEND_DF during NBD_OPT_EXPORT_NAME and
NBD_OPT_EXPORT_GO, but it's easier to see what is being computed).
This lets us quit advertising senseless flags for read-only images, as
well as making the next patch for exposing FAST_ZERO support easier to
write.

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20190823143726.27062-2-eblake@redhat.com>
[eblake: improve commit message]
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 include/block/nbd.h |  2 +-
 blockdev-nbd.c      |  3 +--
 nbd/server.c        | 62 +++++++++++++++++++++++++--------------------
 qemu-nbd.c          |  6 ++---
 4 files changed, 39 insertions(+), 34 deletions(-)

diff --git a/include/block/nbd.h b/include/block/nbd.h
index 991fd52a5134..2c87b42dfd48 100644
--- a/include/block/nbd.h
+++ b/include/block/nbd.h
@@ -326,7 +326,7 @@ typedef struct NBDClient NBDClient;

 NBDExport *nbd_export_new(BlockDriverState *bs, uint64_t dev_offset,
                           uint64_t size, const char *name, const char *desc,
-                          const char *bitmap, uint16_t nbdflags, bool shared,
+                          const char *bitmap, bool readonly, bool shared,
                           void (*close)(NBDExport *), bool writethrough,
                           BlockBackend *on_eject_blk, Error **errp);
 void nbd_export_close(NBDExport *exp);
diff --git a/blockdev-nbd.c b/blockdev-nbd.c
index 1fcfdb0997c6..213f226ac1c4 100644
--- a/blockdev-nbd.c
+++ b/blockdev-nbd.c
@@ -187,8 +187,7 @@ void qmp_nbd_server_add(const char *device, bool has_name, const char *name,
         writable = false;
     }

-    exp = nbd_export_new(bs, 0, len, name, NULL, bitmap,
-                         writable ? 0 : NBD_FLAG_READ_ONLY, !writable,
+    exp = nbd_export_new(bs, 0, len, name, NULL, bitmap, !writable, !writable,
                          NULL, false, on_eject_blk, errp);
     if (!exp) {
         return;
diff --git a/nbd/server.c b/nbd/server.c
index 74d205812fee..d5078f7468af 100644
--- a/nbd/server.c
+++ b/nbd/server.c
@@ -419,14 +419,14 @@ static void nbd_check_meta_export(NBDClient *client)

 /* Send a reply to NBD_OPT_EXPORT_NAME.
  * Return -errno on error, 0 on success. */
-static int nbd_negotiate_handle_export_name(NBDClient *client,
-                                            uint16_t myflags, bool no_zeroes,
+static int nbd_negotiate_handle_export_name(NBDClient *client, bool no_zeroes,
                                             Error **errp)
 {
     char name[NBD_MAX_NAME_SIZE + 1];
     char buf[NBD_REPLY_EXPORT_NAME_SIZE] = "";
     size_t len;
     int ret;
+    uint16_t myflags;

     /* Client sends:
         [20 ..  xx]   export name (length bytes)
@@ -454,10 +454,13 @@ static int nbd_negotiate_handle_export_name(NBDClient *client,
         return -EINVAL;
     }

-    trace_nbd_negotiate_new_style_size_flags(client->exp->size,
-                                             client->exp->nbdflags | myflags);
+    myflags = client->exp->nbdflags;
+    if (client->structured_reply) {
+        myflags |= NBD_FLAG_SEND_DF;
+    }
+    trace_nbd_negotiate_new_style_size_flags(client->exp->size, myflags);
     stq_be_p(buf, client->exp->size);
-    stw_be_p(buf + 8, client->exp->nbdflags | myflags);
+    stw_be_p(buf + 8, myflags);
     len = no_zeroes ? 10 : sizeof(buf);
     ret = nbd_write(client->ioc, buf, len, errp);
     if (ret < 0) {
@@ -522,8 +525,7 @@ static int nbd_reject_length(NBDClient *client, bool fatal, Error **errp)
 /* Handle NBD_OPT_INFO and NBD_OPT_GO.
  * Return -errno on error, 0 if ready for next option, and 1 to move
  * into transmission phase.  */
-static int nbd_negotiate_handle_info(NBDClient *client, uint16_t myflags,
-                                     Error **errp)
+static int nbd_negotiate_handle_info(NBDClient *client, Error **errp)
 {
     int rc;
     char name[NBD_MAX_NAME_SIZE + 1];
@@ -536,6 +538,7 @@ static int nbd_negotiate_handle_info(NBDClient *client, uint16_t myflags,
     uint32_t sizes[3];
     char buf[sizeof(uint64_t) + sizeof(uint16_t)];
     uint32_t check_align = 0;
+    uint16_t myflags;

     /* Client sends:
         4 bytes: L, name length (can be 0)
@@ -633,10 +636,13 @@ static int nbd_negotiate_handle_info(NBDClient *client, uint16_t myflags,
     }

     /* Send NBD_INFO_EXPORT always */
-    trace_nbd_negotiate_new_style_size_flags(exp->size,
-                                             exp->nbdflags | myflags);
+    myflags = exp->nbdflags;
+    if (client->structured_reply) {
+        myflags |= NBD_FLAG_SEND_DF;
+    }
+    trace_nbd_negotiate_new_style_size_flags(exp->size, myflags);
     stq_be_p(buf, exp->size);
-    stw_be_p(buf + 8, exp->nbdflags | myflags);
+    stw_be_p(buf + 8, myflags);
     rc = nbd_negotiate_send_info(client, NBD_INFO_EXPORT,
                                  sizeof(buf), buf, errp);
     if (rc < 0) {
@@ -1033,8 +1039,7 @@ static int nbd_negotiate_meta_queries(NBDClient *client,
  * 1       if client sent NBD_OPT_ABORT, i.e. on valid disconnect,
  *         errp is not set
  */
-static int nbd_negotiate_options(NBDClient *client, uint16_t myflags,
-                                 Error **errp)
+static int nbd_negotiate_options(NBDClient *client, Error **errp)
 {
     uint32_t flags;
     bool fixedNewstyle = false;
@@ -1168,13 +1173,12 @@ static int nbd_negotiate_options(NBDClient *client, uint16_t myflags,
                 return 1;

             case NBD_OPT_EXPORT_NAME:
-                return nbd_negotiate_handle_export_name(client,
-                                                        myflags, no_zeroes,
+                return nbd_negotiate_handle_export_name(client, no_zeroes,
                                                         errp);

             case NBD_OPT_INFO:
             case NBD_OPT_GO:
-                ret = nbd_negotiate_handle_info(client, myflags, errp);
+                ret = nbd_negotiate_handle_info(client, errp);
                 if (ret == 1) {
                     assert(option == NBD_OPT_GO);
                     return 0;
@@ -1205,7 +1209,6 @@ static int nbd_negotiate_options(NBDClient *client, uint16_t myflags,
                 } else {
                     ret = nbd_negotiate_send_rep(client, NBD_REP_ACK, errp);
                     client->structured_reply = true;
-                    myflags |= NBD_FLAG_SEND_DF;
                 }
                 break;

@@ -1228,8 +1231,7 @@ static int nbd_negotiate_options(NBDClient *client, uint16_t myflags,
              */
             switch (option) {
             case NBD_OPT_EXPORT_NAME:
-                return nbd_negotiate_handle_export_name(client,
-                                                        myflags, no_zeroes,
+                return nbd_negotiate_handle_export_name(client, no_zeroes,
                                                         errp);

             default:
@@ -1255,9 +1257,6 @@ static coroutine_fn int nbd_negotiate(NBDClient *client, Error **errp)
 {
     char buf[NBD_OLDSTYLE_NEGOTIATE_SIZE] = "";
     int ret;
-    const uint16_t myflags = (NBD_FLAG_HAS_FLAGS | NBD_FLAG_SEND_TRIM |
-                              NBD_FLAG_SEND_FLUSH | NBD_FLAG_SEND_FUA |
-                              NBD_FLAG_SEND_WRITE_ZEROES | NBD_FLAG_SEND_CACHE);

     /* Old style negotiation header, no room for options
         [ 0 ..   7]   passwd       ("NBDMAGIC")
@@ -1285,7 +1284,7 @@ static coroutine_fn int nbd_negotiate(NBDClient *client, Error **errp)
         error_prepend(errp, "write failed: ");
         return -EINVAL;
     }
-    ret = nbd_negotiate_options(client, myflags, errp);
+    ret = nbd_negotiate_options(client, errp);
     if (ret != 0) {
         if (ret < 0) {
             error_prepend(errp, "option negotiation failed: ");
@@ -1457,7 +1456,7 @@ static void nbd_eject_notifier(Notifier *n, void *data)

 NBDExport *nbd_export_new(BlockDriverState *bs, uint64_t dev_offset,
                           uint64_t size, const char *name, const char *desc,
-                          const char *bitmap, uint16_t nbdflags, bool shared,
+                          const char *bitmap, bool readonly, bool shared,
                           void (*close)(NBDExport *), bool writethrough,
                           BlockBackend *on_eject_blk, Error **errp)
 {
@@ -1481,10 +1480,8 @@ NBDExport *nbd_export_new(BlockDriverState *bs, uint64_t dev_offset,
     /* Don't allow resize while the NBD server is running, otherwise we don't
      * care what happens with the node. */
     perm = BLK_PERM_CONSISTENT_READ;
-    if ((nbdflags & NBD_FLAG_READ_ONLY) == 0) {
+    if (!readonly) {
         perm |= BLK_PERM_WRITE;
-    } else if (shared) {
-        nbdflags |= NBD_FLAG_CAN_MULTI_CONN;
     }
     blk = blk_new(bdrv_get_aio_context(bs), perm,
                   BLK_PERM_CONSISTENT_READ | BLK_PERM_WRITE_UNCHANGED |
@@ -1503,7 +1500,16 @@ NBDExport *nbd_export_new(BlockDriverState *bs, uint64_t dev_offset,
     exp->dev_offset = dev_offset;
     exp->name = g_strdup(name);
     exp->description = g_strdup(desc);
-    exp->nbdflags = nbdflags;
+    exp->nbdflags = (NBD_FLAG_HAS_FLAGS | NBD_FLAG_SEND_FLUSH |
+                     NBD_FLAG_SEND_FUA | NBD_FLAG_SEND_CACHE);
+    if (readonly) {
+        exp->nbdflags |= NBD_FLAG_READ_ONLY;
+        if (shared) {
+            exp->nbdflags |= NBD_FLAG_CAN_MULTI_CONN;
+        }
+    } else {
+        exp->nbdflags |= NBD_FLAG_SEND_TRIM | NBD_FLAG_SEND_WRITE_ZEROES;
+    }
     assert(size <= INT64_MAX - dev_offset);
     exp->size = QEMU_ALIGN_DOWN(size, BDRV_SECTOR_SIZE);

@@ -1528,7 +1534,7 @@ NBDExport *nbd_export_new(BlockDriverState *bs, uint64_t dev_offset,
             goto fail;
         }

-        if ((nbdflags & NBD_FLAG_READ_ONLY) && bdrv_is_writable(bs) &&
+        if (readonly && bdrv_is_writable(bs) &&
             bdrv_dirty_bitmap_enabled(bm)) {
             error_setg(errp,
                        "Enabled bitmap '%s' incompatible with readonly export",
diff --git a/qemu-nbd.c b/qemu-nbd.c
index 2403ef3d0f9f..ae841150760e 100644
--- a/qemu-nbd.c
+++ b/qemu-nbd.c
@@ -600,7 +600,7 @@ int main(int argc, char **argv)
     BlockBackend *blk;
     BlockDriverState *bs;
     uint64_t dev_offset = 0;
-    uint16_t nbdflags = 0;
+    bool readonly = false;
     bool disconnect = false;
     const char *bindto = NULL;
     const char *port = NULL;
@@ -782,7 +782,7 @@ int main(int argc, char **argv)
             }
             /* fall through */
         case 'r':
-            nbdflags |= NBD_FLAG_READ_ONLY;
+            readonly = true;
             flags &= ~BDRV_O_RDWR;
             break;
         case 'P':
@@ -1173,7 +1173,7 @@ int main(int argc, char **argv)
     }

     export = nbd_export_new(bs, dev_offset, fd_size, export_name,
-                            export_description, bitmap, nbdflags, shared > 1,
+                            export_description, bitmap, readonly, shared > 1,
                             nbd_export_closed, writethrough, NULL,
                             &error_fatal);

-- 
2.21.0



^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [Qemu-devel] [PULL 7/9] nbd: Prepare for NBD_CMD_FLAG_FAST_ZERO
  2019-09-05 18:21 [Qemu-devel] [PULL 0/9] NBD patches through 2019-09-05 Eric Blake
                   ` (5 preceding siblings ...)
  2019-09-05 18:21 ` [Qemu-devel] [PULL 6/9] nbd: Improve per-export flag handling in server Eric Blake
@ 2019-09-05 18:21 ` Eric Blake
  2019-09-05 18:21 ` [Qemu-devel] [PULL 8/9] nbd: Implement client use of NBD FAST_ZERO Eric Blake
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 12+ messages in thread
From: Eric Blake @ 2019-09-05 18:21 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Vladimir Sementsov-Ogievskiy,
	open list:Network Block Dev...,
	Max Reitz

Commit fe0480d6 and friends added BDRV_REQ_NO_FALLBACK as a way to
avoid wasting time on a preliminary write-zero request that will later
be rewritten by actual data, if it is known that the write-zero
request will use a slow fallback; but in doing so, could not optimize
for NBD.  The NBD specification is now considering an extension that
will allow passing on those semantics; this patch updates the new
protocol bits and 'qemu-nbd --list' output to recognize the bit, as
well as the new errno value possible when using the new flag; while
upcoming patches will improve the client to use the feature when
present, and the server to advertise support for it.

The NBD spec recommends (but not requires) that ENOTSUP be avoided for
all but failures of a fast zero (the only time it is mandatory to
avoid an ENOTSUP failure is when fast zero is supported but not
requested during write zeroes; the questionable use is for ENOTSUP to
other actions like a normal write request).  However, clients that get
an unexpected ENOTSUP will either already be treating it the same as
EINVAL, or may appreciate the extra bit of information.  We were
equally loose for returning EOVERFLOW in more situations than
recommended by the spec, so if it turns out to be a problem in
practice, a later patch can tighten handling for both error codes.

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20190823143726.27062-3-eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
[eblake: tweak commit message, also handle EOPNOTSUPP]
---
 docs/interop/nbd.txt | 3 ++-
 include/block/nbd.h  | 4 ++++
 nbd/common.c         | 5 +++++
 nbd/server.c         | 5 +++++
 qemu-nbd.c           | 1 +
 5 files changed, 17 insertions(+), 1 deletion(-)

diff --git a/docs/interop/nbd.txt b/docs/interop/nbd.txt
index 6dfec7f47647..45118809618e 100644
--- a/docs/interop/nbd.txt
+++ b/docs/interop/nbd.txt
@@ -53,4 +53,5 @@ the operation of that feature.
 * 2.12: NBD_CMD_BLOCK_STATUS for "base:allocation"
 * 3.0: NBD_OPT_STARTTLS with TLS Pre-Shared Keys (PSK),
 NBD_CMD_BLOCK_STATUS for "qemu:dirty-bitmap:", NBD_CMD_CACHE
-* 4.2: NBD_FLAG_CAN_MULTI_CONN for sharable read-only exports
+* 4.2: NBD_FLAG_CAN_MULTI_CONN for sharable read-only exports,
+NBD_CMD_FLAG_FAST_ZERO
diff --git a/include/block/nbd.h b/include/block/nbd.h
index 2c87b42dfd48..21550747cf35 100644
--- a/include/block/nbd.h
+++ b/include/block/nbd.h
@@ -140,6 +140,7 @@ enum {
     NBD_FLAG_CAN_MULTI_CONN_BIT     =  8, /* Multi-client cache consistent */
     NBD_FLAG_SEND_RESIZE_BIT        =  9, /* Send resize */
     NBD_FLAG_SEND_CACHE_BIT         = 10, /* Send CACHE (prefetch) */
+    NBD_FLAG_SEND_FAST_ZERO_BIT     = 11, /* FAST_ZERO flag for WRITE_ZEROES */
 };

 #define NBD_FLAG_HAS_FLAGS         (1 << NBD_FLAG_HAS_FLAGS_BIT)
@@ -153,6 +154,7 @@ enum {
 #define NBD_FLAG_CAN_MULTI_CONN    (1 << NBD_FLAG_CAN_MULTI_CONN_BIT)
 #define NBD_FLAG_SEND_RESIZE       (1 << NBD_FLAG_SEND_RESIZE_BIT)
 #define NBD_FLAG_SEND_CACHE        (1 << NBD_FLAG_SEND_CACHE_BIT)
+#define NBD_FLAG_SEND_FAST_ZERO    (1 << NBD_FLAG_SEND_FAST_ZERO_BIT)

 /* New-style handshake (global) flags, sent from server to client, and
    control what will happen during handshake phase. */
@@ -205,6 +207,7 @@ enum {
 #define NBD_CMD_FLAG_DF         (1 << 2) /* don't fragment structured read */
 #define NBD_CMD_FLAG_REQ_ONE    (1 << 3) /* only one extent in BLOCK_STATUS
                                           * reply chunk */
+#define NBD_CMD_FLAG_FAST_ZERO  (1 << 4) /* fail if WRITE_ZEROES is not fast */

 /* Supported request types */
 enum {
@@ -270,6 +273,7 @@ static inline bool nbd_reply_type_is_error(int type)
 #define NBD_EINVAL     22
 #define NBD_ENOSPC     28
 #define NBD_EOVERFLOW  75
+#define NBD_ENOTSUP    95
 #define NBD_ESHUTDOWN  108

 /* Details collected by NBD_OPT_EXPORT_NAME and NBD_OPT_GO */
diff --git a/nbd/common.c b/nbd/common.c
index cc8b278e541d..ddfe7d118371 100644
--- a/nbd/common.c
+++ b/nbd/common.c
@@ -201,6 +201,8 @@ const char *nbd_err_lookup(int err)
         return "ENOSPC";
     case NBD_EOVERFLOW:
         return "EOVERFLOW";
+    case NBD_ENOTSUP:
+        return "ENOTSUP";
     case NBD_ESHUTDOWN:
         return "ESHUTDOWN";
     default:
@@ -231,6 +233,9 @@ int nbd_errno_to_system_errno(int err)
     case NBD_EOVERFLOW:
         ret = EOVERFLOW;
         break;
+    case NBD_ENOTSUP:
+        ret = ENOTSUP;
+        break;
     case NBD_ESHUTDOWN:
         ret = ESHUTDOWN;
         break;
diff --git a/nbd/server.c b/nbd/server.c
index d5078f7468af..4992148de1c4 100644
--- a/nbd/server.c
+++ b/nbd/server.c
@@ -55,6 +55,11 @@ static int system_errno_to_nbd_errno(int err)
         return NBD_ENOSPC;
     case EOVERFLOW:
         return NBD_EOVERFLOW;
+    case ENOTSUP:
+#if ENOTSUP != EOPNOTSUPP
+    case EOPNOTSUPP:
+#endif
+        return NBD_ENOTSUP;
     case ESHUTDOWN:
         return NBD_ESHUTDOWN;
     case EINVAL:
diff --git a/qemu-nbd.c b/qemu-nbd.c
index ae841150760e..9032b6de2ace 100644
--- a/qemu-nbd.c
+++ b/qemu-nbd.c
@@ -294,6 +294,7 @@ static int qemu_nbd_client_list(SocketAddress *saddr, QCryptoTLSCreds *tls,
                 [NBD_FLAG_CAN_MULTI_CONN_BIT]       = "multi",
                 [NBD_FLAG_SEND_RESIZE_BIT]          = "resize",
                 [NBD_FLAG_SEND_CACHE_BIT]           = "cache",
+                [NBD_FLAG_SEND_FAST_ZERO_BIT]       = "fast-zero",
             };

             printf("  size:  %" PRIu64 "\n", list[i].size);
-- 
2.21.0



^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [Qemu-devel] [PULL 8/9] nbd: Implement client use of NBD FAST_ZERO
  2019-09-05 18:21 [Qemu-devel] [PULL 0/9] NBD patches through 2019-09-05 Eric Blake
                   ` (6 preceding siblings ...)
  2019-09-05 18:21 ` [Qemu-devel] [PULL 7/9] nbd: Prepare for NBD_CMD_FLAG_FAST_ZERO Eric Blake
@ 2019-09-05 18:21 ` Eric Blake
  2019-09-05 18:21 ` [Qemu-devel] [PULL 9/9] nbd: Implement server " Eric Blake
  2019-09-05 20:47 ` [Qemu-devel] [PULL 0/9] NBD patches through 2019-09-05 Eric Blake
  9 siblings, 0 replies; 12+ messages in thread
From: Eric Blake @ 2019-09-05 18:21 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Vladimir Sementsov-Ogievskiy,
	open list:Network Block Dev...,
	Max Reitz

The client side is fairly straightforward: if the server advertised
fast zero support, then we can map that to BDRV_REQ_NO_FALLBACK
support.  A server that advertises FAST_ZERO but not WRITE_ZEROES
is technically broken, but we can ignore that situation as it does
not change our behavior.

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20190823143726.27062-4-eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 block/nbd.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/block/nbd.c b/block/nbd.c
index c4c91a158602..813c40d8f067 100644
--- a/block/nbd.c
+++ b/block/nbd.c
@@ -1044,6 +1044,10 @@ static int nbd_client_co_pwrite_zeroes(BlockDriverState *bs, int64_t offset,
     if (!(flags & BDRV_REQ_MAY_UNMAP)) {
         request.flags |= NBD_CMD_FLAG_NO_HOLE;
     }
+    if (flags & BDRV_REQ_NO_FALLBACK) {
+        assert(s->info.flags & NBD_FLAG_SEND_FAST_ZERO);
+        request.flags |= NBD_CMD_FLAG_FAST_ZERO;
+    }

     if (!bytes) {
         return 0;
@@ -1239,6 +1243,9 @@ static int nbd_client_connect(BlockDriverState *bs, Error **errp)
     }
     if (s->info.flags & NBD_FLAG_SEND_WRITE_ZEROES) {
         bs->supported_zero_flags |= BDRV_REQ_MAY_UNMAP;
+        if (s->info.flags & NBD_FLAG_SEND_FAST_ZERO) {
+            bs->supported_zero_flags |= BDRV_REQ_NO_FALLBACK;
+        }
     }

     s->sioc = sioc;
-- 
2.21.0



^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [Qemu-devel] [PULL 9/9] nbd: Implement server use of NBD FAST_ZERO
  2019-09-05 18:21 [Qemu-devel] [PULL 0/9] NBD patches through 2019-09-05 Eric Blake
                   ` (7 preceding siblings ...)
  2019-09-05 18:21 ` [Qemu-devel] [PULL 8/9] nbd: Implement client use of NBD FAST_ZERO Eric Blake
@ 2019-09-05 18:21 ` Eric Blake
  2019-09-05 20:47 ` [Qemu-devel] [PULL 0/9] NBD patches through 2019-09-05 Eric Blake
  9 siblings, 0 replies; 12+ messages in thread
From: Eric Blake @ 2019-09-05 18:21 UTC (permalink / raw)
  To: qemu-devel; +Cc: Vladimir Sementsov-Ogievskiy, open list:Network Block Dev...

The server side is fairly straightforward: we can always advertise
support for detection of fast zero, and implement it by mapping the
request to the block layer BDRV_REQ_NO_FALLBACK.

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20190823143726.27062-5-eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 nbd/server.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/nbd/server.c b/nbd/server.c
index 4992148de1c4..28c3c8be854c 100644
--- a/nbd/server.c
+++ b/nbd/server.c
@@ -1513,7 +1513,8 @@ NBDExport *nbd_export_new(BlockDriverState *bs, uint64_t dev_offset,
             exp->nbdflags |= NBD_FLAG_CAN_MULTI_CONN;
         }
     } else {
-        exp->nbdflags |= NBD_FLAG_SEND_TRIM | NBD_FLAG_SEND_WRITE_ZEROES;
+        exp->nbdflags |= (NBD_FLAG_SEND_TRIM | NBD_FLAG_SEND_WRITE_ZEROES |
+                          NBD_FLAG_SEND_FAST_ZERO);
     }
     assert(size <= INT64_MAX - dev_offset);
     exp->size = QEMU_ALIGN_DOWN(size, BDRV_SECTOR_SIZE);
@@ -2166,7 +2167,7 @@ static int nbd_co_receive_request(NBDRequestData *req, NBDRequest *request,
     if (request->type == NBD_CMD_READ && client->structured_reply) {
         valid_flags |= NBD_CMD_FLAG_DF;
     } else if (request->type == NBD_CMD_WRITE_ZEROES) {
-        valid_flags |= NBD_CMD_FLAG_NO_HOLE;
+        valid_flags |= NBD_CMD_FLAG_NO_HOLE | NBD_CMD_FLAG_FAST_ZERO;
     } else if (request->type == NBD_CMD_BLOCK_STATUS) {
         valid_flags |= NBD_CMD_FLAG_REQ_ONE;
     }
@@ -2305,6 +2306,9 @@ static coroutine_fn int nbd_handle_request(NBDClient *client,
         if (!(request->flags & NBD_CMD_FLAG_NO_HOLE)) {
             flags |= BDRV_REQ_MAY_UNMAP;
         }
+        if (request->flags & NBD_CMD_FLAG_FAST_ZERO) {
+            flags |= BDRV_REQ_NO_FALLBACK;
+        }
         ret = blk_pwrite_zeroes(exp->blk, request->from + exp->dev_offset,
                                 request->len, flags);
         return nbd_send_generic_reply(client, request->handle, ret,
-- 
2.21.0



^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [Qemu-devel] [PULL 0/9] NBD patches through 2019-09-05
  2019-09-05 18:21 [Qemu-devel] [PULL 0/9] NBD patches through 2019-09-05 Eric Blake
                   ` (8 preceding siblings ...)
  2019-09-05 18:21 ` [Qemu-devel] [PULL 9/9] nbd: Implement server " Eric Blake
@ 2019-09-05 20:47 ` Eric Blake
  9 siblings, 0 replies; 12+ messages in thread
From: Eric Blake @ 2019-09-05 20:47 UTC (permalink / raw)
  To: qemu-devel


[-- Attachment #1.1: Type: text/plain, Size: 860 bytes --]

n 9/5/19 1:21 PM, Eric Blake wrote:
> The following changes since commit eac2f39602e0423adf56be410c9a22c31fec9a81:
> 
>   target/arm: Inline gen_bx_im into callers (2019-09-05 13:23:04 +0100)
> 
> are available in the Git repository at:
> 
>   https://repo.or.cz/qemu/ericb.git tags/pull-nbd-2019-09-05
> 
> for you to fetch changes up to 73bddca33cb1749ddcbcc1e9972a77d93000553b:
> 
>   nbd: Implement server use of NBD FAST_ZERO (2019-09-05 10:48:46 -0500)
> 

>       nbd: Prepare for NBD_CMD_FLAG_FAST_ZERO
>       nbd: Implement client use of NBD FAST_ZERO
>       nbd: Implement server use of NBD FAST_ZERO

These cause failures in iotests 223 and 233; I'll post a v2 pull request
shortly fixing that.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [Qemu-devel] [PULL 4/9] block: workaround for unaligned byte range in fallocate()
  2019-09-05 18:21 ` [Qemu-devel] [PULL 4/9] block: workaround for unaligned byte range in fallocate() Eric Blake
@ 2019-09-06 12:51   ` Andrey Shinkevich
  0 siblings, 0 replies; 12+ messages in thread
From: Andrey Shinkevich @ 2019-09-06 12:51 UTC (permalink / raw)
  To: Eric Blake, qemu-devel
  Cc: Kevin Wolf, Fam Zheng, Denis Lunev, open list:raw, Max Reitz,
	Stefan Hajnoczi

Many thanks

Andrey

On 05/09/2019 21:21, Eric Blake wrote:
> From: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
> 
> Revert the commit 118f99442d 'block/io.c: fix for the allocation failure'
> and use better error handling for file systems that do not support
> fallocate() for an unaligned byte range. Allow falling back to pwrite
> in case fallocate() returns EINVAL.
> 
> Suggested-by: Kevin Wolf <kwolf@redhat.com>
> Suggested-by: Eric Blake <eblake@redhat.com>
> Signed-off-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
> Reviewed-by: Eric Blake <eblake@redhat.com>
> Reviewed-by: Denis V. Lunev <den@openvz.org>
> Message-Id: <1566913973-15490-1-git-send-email-andrey.shinkevich@virtuozzo.com>
> Signed-off-by: Eric Blake <eblake@redhat.com>
> ---
>   block/io.c         | 2 +-
>   block/file-posix.c | 7 +++++++
>   2 files changed, 8 insertions(+), 1 deletion(-)
> 
> diff --git a/block/io.c b/block/io.c
> index 0fa10831edb7..16a598fd0857 100644
> --- a/block/io.c
> +++ b/block/io.c
> @@ -1746,7 +1746,7 @@ static int coroutine_fn bdrv_co_do_pwrite_zeroes(BlockDriverState *bs,
>               assert(!bs->supported_zero_flags);
>           }
> 
> -        if (ret < 0 && !(flags & BDRV_REQ_NO_FALLBACK)) {
> +        if (ret == -ENOTSUP && !(flags & BDRV_REQ_NO_FALLBACK)) {
>               /* Fall back to bounce buffer if write zeroes is unsupported */
>               BdrvRequestFlags write_flags = flags & ~BDRV_REQ_ZERO_WRITE;
> 
> diff --git a/block/file-posix.c b/block/file-posix.c
> index 71f168ee2f13..87c5a4ccbdc8 100644
> --- a/block/file-posix.c
> +++ b/block/file-posix.c
> @@ -1588,6 +1588,13 @@ static int handle_aiocb_write_zeroes(void *opaque)
>       if (s->has_write_zeroes) {
>           int ret = do_fallocate(s->fd, FALLOC_FL_ZERO_RANGE,
>                                  aiocb->aio_offset, aiocb->aio_nbytes);
> +        if (ret == -EINVAL) {
> +            /*
> +             * Allow falling back to pwrite for file systems that
> +             * do not support fallocate() for an unaligned byte range.
> +             */
> +            return -ENOTSUP;
> +        }
>           if (ret == 0 || ret != -ENOTSUP) {
>               return ret;
>           }
> 

-- 
With the best regards,
Andrey Shinkevich

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2019-09-06 12:56 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-09-05 18:21 [Qemu-devel] [PULL 0/9] NBD patches through 2019-09-05 Eric Blake
2019-09-05 18:21 ` [Qemu-devel] [PULL 1/9] nbd: Advertise multi-conn for shared read-only connections Eric Blake
2019-09-05 18:21 ` [Qemu-devel] [PULL 2/9] nbd: Use g_autofree in a few places Eric Blake
2019-09-05 18:21 ` [Qemu-devel] [PULL 3/9] nbd: Tolerate more errors to structured reply request Eric Blake
2019-09-05 18:21 ` [Qemu-devel] [PULL 4/9] block: workaround for unaligned byte range in fallocate() Eric Blake
2019-09-06 12:51   ` Andrey Shinkevich
2019-09-05 18:21 ` [Qemu-devel] [PULL 5/9] docs: Update preferred NBD device syntax Eric Blake
2019-09-05 18:21 ` [Qemu-devel] [PULL 6/9] nbd: Improve per-export flag handling in server Eric Blake
2019-09-05 18:21 ` [Qemu-devel] [PULL 7/9] nbd: Prepare for NBD_CMD_FLAG_FAST_ZERO Eric Blake
2019-09-05 18:21 ` [Qemu-devel] [PULL 8/9] nbd: Implement client use of NBD FAST_ZERO Eric Blake
2019-09-05 18:21 ` [Qemu-devel] [PULL 9/9] nbd: Implement server " Eric Blake
2019-09-05 20:47 ` [Qemu-devel] [PULL 0/9] NBD patches through 2019-09-05 Eric Blake

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.