qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 for-6.0 0/8] nbd reconnect on open
@ 2020-11-30 13:40 Vladimir Sementsov-Ogievskiy
  2020-11-30 13:40 ` [PATCH v2 1/8] block/nbd: move initial connect to coroutine Vladimir Sementsov-Ogievskiy
                   ` (9 more replies)
  0 siblings, 10 replies; 17+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-11-30 13:40 UTC (permalink / raw)
  To: qemu-block; +Cc: qemu-devel, mreitz, kwolf, eblake, vsementsov, den

Hi all! There is a new feature: reconnect on open. It is useful when
start of vm and start of nbd server are not simple to sync.

v2: rebase on master.
Also, I've discovered that I've sent downstream version of test which
doesn't work here. So, the test is rewritten (with appropriate
improvements of iotests.py)

Vladimir Sementsov-Ogievskiy (8):
  block/nbd: move initial connect to coroutine
  nbd: allow reconnect on open, with corresponding new options
  iotests.py: fix qemu_tool_pipe_and_status()
  iotests.py: qemu_io(): reuse qemu_tool_pipe_and_status()
  iotests.py: add qemu_tool_popen()
  iotests.py: add and use qemu_io_wrap_args()
  iotests.py: add qemu_io_popen()
  iotests: add 306 to test reconnect on nbd open

 block/nbd.c                   | 173 +++++++++++++++++++++++++---------
 tests/qemu-iotests/306        |  71 ++++++++++++++
 tests/qemu-iotests/306.out    |  11 +++
 tests/qemu-iotests/group      |   1 +
 tests/qemu-iotests/iotests.py |  56 +++++++----
 5 files changed, 246 insertions(+), 66 deletions(-)
 create mode 100755 tests/qemu-iotests/306
 create mode 100644 tests/qemu-iotests/306.out

-- 
2.21.3



^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH v2 1/8] block/nbd: move initial connect to coroutine
  2020-11-30 13:40 [PATCH v2 for-6.0 0/8] nbd reconnect on open Vladimir Sementsov-Ogievskiy
@ 2020-11-30 13:40 ` Vladimir Sementsov-Ogievskiy
  2021-01-20 22:24   ` Eric Blake
  2020-11-30 13:40 ` [PATCH v2 2/8] nbd: allow reconnect on open, with corresponding new options Vladimir Sementsov-Ogievskiy
                   ` (8 subsequent siblings)
  9 siblings, 1 reply; 17+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-11-30 13:40 UTC (permalink / raw)
  To: qemu-block; +Cc: qemu-devel, mreitz, kwolf, eblake, vsementsov, den

We are going to implement reconnect-on-open. Let's reuse existing
reconnect loop. For this, do initial connect in connection coroutine.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 block/nbd.c | 94 ++++++++++++++++++++++++++++++-----------------------
 1 file changed, 53 insertions(+), 41 deletions(-)

diff --git a/block/nbd.c b/block/nbd.c
index 42536702b6..3e1d6c2b17 100644
--- a/block/nbd.c
+++ b/block/nbd.c
@@ -57,6 +57,7 @@ typedef struct {
 } NBDClientRequest;
 
 typedef enum NBDClientState {
+    NBD_CLIENT_OPENING,
     NBD_CLIENT_CONNECTING_WAIT,
     NBD_CLIENT_CONNECTING_NOWAIT,
     NBD_CLIENT_CONNECTED,
@@ -113,6 +114,7 @@ typedef struct BDRVNBDState {
     CoQueue free_sema;
     Coroutine *connection_co;
     Coroutine *teardown_co;
+    Coroutine *open_co;
     QemuCoSleepState *connection_co_sleep_ns_state;
     bool drained;
     bool wait_drained_end;
@@ -141,8 +143,6 @@ typedef struct BDRVNBDState {
     NBDConnectThread *connect_thread;
 } BDRVNBDState;
 
-static QIOChannelSocket *nbd_establish_connection(SocketAddress *saddr,
-                                                  Error **errp);
 static QIOChannelSocket *nbd_co_establish_connection(BlockDriverState *bs,
                                                      Error **errp);
 static void nbd_co_establish_connection_cancel(BlockDriverState *bs,
@@ -339,7 +339,8 @@ static void nbd_teardown_connection(BlockDriverState *bs)
 static bool nbd_client_connecting(BDRVNBDState *s)
 {
     return s->state == NBD_CLIENT_CONNECTING_WAIT ||
-        s->state == NBD_CLIENT_CONNECTING_NOWAIT;
+        s->state == NBD_CLIENT_CONNECTING_NOWAIT ||
+        s->state == NBD_CLIENT_OPENING;
 }
 
 static bool nbd_client_connecting_wait(BDRVNBDState *s)
@@ -639,6 +640,7 @@ static coroutine_fn void nbd_co_reconnect_loop(BDRVNBDState *s)
 {
     uint64_t timeout = 1 * NANOSECONDS_PER_SECOND;
     uint64_t max_timeout = 16 * NANOSECONDS_PER_SECOND;
+    bool initial_connect = s->state == NBD_CLIENT_OPENING;
 
     if (s->state == NBD_CLIENT_CONNECTING_WAIT) {
         reconnect_delay_timer_init(s, qemu_clock_get_ns(QEMU_CLOCK_REALTIME) +
@@ -647,6 +649,25 @@ static coroutine_fn void nbd_co_reconnect_loop(BDRVNBDState *s)
 
     nbd_reconnect_attempt(s);
 
+    if (initial_connect) {
+        if (s->state == NBD_CLIENT_CONNECTED) {
+            /* All good. Just kick nbd_open() to successfully return */
+            if (s->open_co) {
+                aio_co_wake(s->open_co);
+                s->open_co = NULL;
+            }
+            aio_wait_kick();
+            return;
+        } else {
+            /*
+             * Failed. Currently, reconnect on open is not allowed, so quit.
+             * nbd_open() will be kicked in the end of nbd_connection_entry()
+             */
+            s->state = NBD_CLIENT_QUIT;
+            return;
+        }
+    }
+
     while (nbd_client_connecting(s)) {
         if (s->drained) {
             bdrv_dec_in_flight(s->bs);
@@ -759,6 +780,11 @@ static coroutine_fn void nbd_connection_entry(void *opaque)
         s->ioc = NULL;
     }
 
+    if (s->open_co) {
+        aio_co_wake(s->open_co);
+        s->open_co = NULL;
+    }
+
     if (s->teardown_co) {
         aio_co_wake(s->teardown_co);
     }
@@ -1757,26 +1783,6 @@ static void nbd_client_close(BlockDriverState *bs)
     nbd_teardown_connection(bs);
 }
 
-static QIOChannelSocket *nbd_establish_connection(SocketAddress *saddr,
-                                                  Error **errp)
-{
-    ERRP_GUARD();
-    QIOChannelSocket *sioc;
-
-    sioc = qio_channel_socket_new();
-    qio_channel_set_name(QIO_CHANNEL(sioc), "nbd-client");
-
-    qio_channel_socket_connect_sync(sioc, saddr, errp);
-    if (*errp) {
-        object_unref(OBJECT(sioc));
-        return NULL;
-    }
-
-    qio_channel_set_delay(QIO_CHANNEL(sioc), false);
-
-    return sioc;
-}
-
 /* nbd_client_handshake takes ownership on sioc. On failure it is unref'ed. */
 static int nbd_client_handshake(BlockDriverState *bs, QIOChannelSocket *sioc,
                                 Error **errp)
@@ -2245,7 +2251,6 @@ static int nbd_open(BlockDriverState *bs, QDict *options, int flags,
 {
     int ret;
     BDRVNBDState *s = (BDRVNBDState *)bs->opaque;
-    QIOChannelSocket *sioc;
 
     ret = nbd_process_options(bs, options, errp);
     if (ret < 0) {
@@ -2255,23 +2260,7 @@ static int nbd_open(BlockDriverState *bs, QDict *options, int flags,
     s->bs = bs;
     qemu_co_mutex_init(&s->send_mutex);
     qemu_co_queue_init(&s->free_sema);
-
-    /*
-     * establish TCP connection, return error if it fails
-     * TODO: Configurable retry-until-timeout behaviour.
-     */
-    sioc = nbd_establish_connection(s->saddr, errp);
-    if (!sioc) {
-        return -ECONNREFUSED;
-    }
-
-    ret = nbd_client_handshake(bs, sioc, errp);
-    if (ret < 0) {
-        nbd_clear_bdrvstate(s);
-        return ret;
-    }
-    /* successfully connected */
-    s->state = NBD_CLIENT_CONNECTED;
+    s->state = NBD_CLIENT_OPENING;
 
     nbd_init_connect_thread(s);
 
@@ -2279,6 +2268,29 @@ static int nbd_open(BlockDriverState *bs, QDict *options, int flags,
     bdrv_inc_in_flight(bs);
     aio_co_schedule(bdrv_get_aio_context(bs), s->connection_co);
 
+    if (qemu_in_coroutine()) {
+        s->open_co = qemu_coroutine_self();
+        qemu_coroutine_yield();
+    } else {
+        BDRV_POLL_WHILE(bs, s->state == NBD_CLIENT_OPENING);
+    }
+
+    if (s->state != NBD_CLIENT_CONNECTED && s->connect_status < 0) {
+        /*
+         * It's possible that state != NBD_CLIENT_CONNECTED, but connect_status
+         * is 0. This means that initial connecting succeed, but failed later
+         * (during BDRV_POLL_WHILE). It's a rare case, but it happen in iotest
+         * 83. Let's don't care and just report success in this case: it not
+         * much differs from the case when connection failed immediately after
+         * succeeded open.
+         */
+        assert(s->connect_err);
+        error_propagate(errp, s->connect_err);
+        s->connect_err = NULL;
+        nbd_clear_bdrvstate(s);
+        return s->connect_status;
+    }
+
     return 0;
 }
 
-- 
2.21.3



^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v2 2/8] nbd: allow reconnect on open, with corresponding new options
  2020-11-30 13:40 [PATCH v2 for-6.0 0/8] nbd reconnect on open Vladimir Sementsov-Ogievskiy
  2020-11-30 13:40 ` [PATCH v2 1/8] block/nbd: move initial connect to coroutine Vladimir Sementsov-Ogievskiy
@ 2020-11-30 13:40 ` Vladimir Sementsov-Ogievskiy
  2021-01-21  1:44   ` Eric Blake
  2020-11-30 13:40 ` [PATCH v2 3/8] iotests.py: fix qemu_tool_pipe_and_status() Vladimir Sementsov-Ogievskiy
                   ` (7 subsequent siblings)
  9 siblings, 1 reply; 17+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-11-30 13:40 UTC (permalink / raw)
  To: qemu-block; +Cc: qemu-devel, mreitz, kwolf, eblake, vsementsov, den

Note: currently, using new option with long timeout in qmp command
blockdev-add is not good idea, as qmp interface is blocking, so,
don't add it now, let's add it later after
"monitor: Optionally run handlers in coroutines" series merged.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 block/nbd.c | 115 +++++++++++++++++++++++++++++++++++++++++-----------
 1 file changed, 92 insertions(+), 23 deletions(-)

diff --git a/block/nbd.c b/block/nbd.c
index 3e1d6c2b17..d25acafaad 100644
--- a/block/nbd.c
+++ b/block/nbd.c
@@ -125,12 +125,14 @@ typedef struct BDRVNBDState {
     bool wait_in_flight;
 
     QEMUTimer *reconnect_delay_timer;
+    QEMUTimer *open_timer;
 
     NBDClientRequest requests[MAX_NBD_REQUESTS];
     NBDReply reply;
     BlockDriverState *bs;
 
     /* Connection parameters */
+    uint64_t open_timeout;
     uint32_t reconnect_delay;
     SocketAddress *saddr;
     char *export, *tlscredsid;
@@ -305,7 +307,7 @@ static void coroutine_fn nbd_client_co_drain_end(BlockDriverState *bs)
 }
 
 
-static void nbd_teardown_connection(BlockDriverState *bs)
+static void nbd_teardown_connection_async(BlockDriverState *bs)
 {
     BDRVNBDState *s = (BDRVNBDState *)bs->opaque;
 
@@ -325,6 +327,14 @@ static void nbd_teardown_connection(BlockDriverState *bs)
         }
         nbd_co_establish_connection_cancel(bs, true);
     }
+}
+
+static void nbd_teardown_connection(BlockDriverState *bs)
+{
+    BDRVNBDState *s = (BDRVNBDState *)bs->opaque;
+
+    nbd_teardown_connection_async(bs);
+
     if (qemu_in_coroutine()) {
         s->teardown_co = qemu_coroutine_self();
         /* connection_co resumes us when it terminates */
@@ -474,6 +484,11 @@ nbd_co_establish_connection(BlockDriverState *bs, Error **errp)
     s->wait_connect = true;
     qemu_coroutine_yield();
 
+    if (!s->connect_thread) {
+        error_setg(errp, "Connection attempt cancelled by other operation");
+        return NULL;
+    }
+
     qemu_mutex_lock(&thr->mutex);
 
     switch (thr->state) {
@@ -529,6 +544,12 @@ static void nbd_co_establish_connection_cancel(BlockDriverState *bs,
     bool wake = false;
     bool do_free = false;
 
+    if (!thr) {
+        /* already detached or finished */
+        assert(!s->wait_connect);
+        return;
+    }
+
     qemu_mutex_lock(&thr->mutex);
 
     if (thr->state == CONNECT_THREAD_RUNNING) {
@@ -624,10 +645,15 @@ static coroutine_fn void nbd_reconnect_attempt(BDRVNBDState *s)
     bdrv_inc_in_flight(s->bs);
 
 out:
-    s->connect_status = ret;
-    error_free(s->connect_err);
-    s->connect_err = NULL;
-    error_propagate(&s->connect_err, local_err);
+    if (s->connect_status == -ETIMEDOUT) {
+        /* Don't rewrite timeout error by following cancel-provoked error */
+        error_free(local_err);
+    } else {
+        s->connect_status = ret;
+        error_free(s->connect_err);
+        s->connect_err = NULL;
+        error_propagate(&s->connect_err, local_err);
+    }
 
     if (ret >= 0) {
         /* successfully connected */
@@ -636,11 +662,44 @@ out:
     }
 }
 
+static void open_timer_del(BDRVNBDState *s)
+{
+    if (s->open_timer) {
+        timer_del(s->open_timer);
+        timer_free(s->open_timer);
+        s->open_timer = NULL;
+    }
+}
+
+static void open_timer_cb(void *opaque)
+{
+    BDRVNBDState *s = opaque;
+
+    if (!s->connect_status) {
+        /* First attempt was not finished. We should set an error */
+        s->connect_status = -ETIMEDOUT;
+        error_setg(&s->connect_err, "First connection attempt is cancelled by "
+                   "timeout");
+    }
+
+    nbd_teardown_connection_async(s->bs);
+    open_timer_del(s);
+}
+
+static void open_timer_init(BDRVNBDState *s, uint64_t expire_time_ns)
+{
+    assert(!s->open_timer && s->state == NBD_CLIENT_OPENING);
+    s->open_timer = aio_timer_new(bdrv_get_aio_context(s->bs),
+                                  QEMU_CLOCK_REALTIME,
+                                  SCALE_NS,
+                                  open_timer_cb, s);
+    timer_mod(s->open_timer, expire_time_ns);
+}
+
 static coroutine_fn void nbd_co_reconnect_loop(BDRVNBDState *s)
 {
     uint64_t timeout = 1 * NANOSECONDS_PER_SECOND;
     uint64_t max_timeout = 16 * NANOSECONDS_PER_SECOND;
-    bool initial_connect = s->state == NBD_CLIENT_OPENING;
 
     if (s->state == NBD_CLIENT_CONNECTING_WAIT) {
         reconnect_delay_timer_init(s, qemu_clock_get_ns(QEMU_CLOCK_REALTIME) +
@@ -649,23 +708,9 @@ static coroutine_fn void nbd_co_reconnect_loop(BDRVNBDState *s)
 
     nbd_reconnect_attempt(s);
 
-    if (initial_connect) {
-        if (s->state == NBD_CLIENT_CONNECTED) {
-            /* All good. Just kick nbd_open() to successfully return */
-            if (s->open_co) {
-                aio_co_wake(s->open_co);
-                s->open_co = NULL;
-            }
-            aio_wait_kick();
-            return;
-        } else {
-            /*
-             * Failed. Currently, reconnect on open is not allowed, so quit.
-             * nbd_open() will be kicked in the end of nbd_connection_entry()
-             */
-            s->state = NBD_CLIENT_QUIT;
-            return;
-        }
+    if (s->state == NBD_CLIENT_OPENING && !s->open_timeout) {
+        s->state = NBD_CLIENT_QUIT;
+        return;
     }
 
     while (nbd_client_connecting(s)) {
@@ -695,6 +740,16 @@ static coroutine_fn void nbd_co_reconnect_loop(BDRVNBDState *s)
     }
 
     reconnect_delay_timer_del(s);
+    open_timer_del(s);
+
+    if (s->state == NBD_CLIENT_CONNECTED) {
+        /* All good. Just kick nbd_open() to successfully return */
+        if (s->open_co) {
+            aio_co_wake(s->open_co);
+            s->open_co = NULL;
+        }
+        aio_wait_kick();
+    }
 }
 
 static coroutine_fn void nbd_connection_entry(void *opaque)
@@ -2180,6 +2235,14 @@ static QemuOptsList nbd_runtime_opts = {
                     "future requests before a successful reconnect will "
                     "immediately fail. Default 0",
         },
+        {
+            .name = "open-timeout",
+            .type = QEMU_OPT_NUMBER,
+            .help = "In seconds. If zero, nbd driver tries to establish "
+                    "connection only once, on fail open fails. If non-zero, "
+                    "nbd driver may do several attempts until success or "
+                    "@open-timeout seconds passed. Default 0",
+        },
         { /* end of list */ }
     },
 };
@@ -2235,6 +2298,7 @@ static int nbd_process_options(BlockDriverState *bs, QDict *options,
     }
 
     s->reconnect_delay = qemu_opt_get_number(opts, "reconnect-delay", 0);
+    s->open_timeout = qemu_opt_get_number(opts, "open-timeout", 0);
 
     ret = 0;
 
@@ -2268,6 +2332,11 @@ static int nbd_open(BlockDriverState *bs, QDict *options, int flags,
     bdrv_inc_in_flight(bs);
     aio_co_schedule(bdrv_get_aio_context(bs), s->connection_co);
 
+    if (s->open_timeout) {
+        open_timer_init(s, qemu_clock_get_ns(QEMU_CLOCK_REALTIME) +
+                        s->open_timeout * NANOSECONDS_PER_SECOND);
+    }
+
     if (qemu_in_coroutine()) {
         s->open_co = qemu_coroutine_self();
         qemu_coroutine_yield();
-- 
2.21.3



^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v2 3/8] iotests.py: fix qemu_tool_pipe_and_status()
  2020-11-30 13:40 [PATCH v2 for-6.0 0/8] nbd reconnect on open Vladimir Sementsov-Ogievskiy
  2020-11-30 13:40 ` [PATCH v2 1/8] block/nbd: move initial connect to coroutine Vladimir Sementsov-Ogievskiy
  2020-11-30 13:40 ` [PATCH v2 2/8] nbd: allow reconnect on open, with corresponding new options Vladimir Sementsov-Ogievskiy
@ 2020-11-30 13:40 ` Vladimir Sementsov-Ogievskiy
  2021-01-21  1:58   ` Eric Blake
  2020-11-30 13:40 ` [PATCH v2 4/8] iotests.py: qemu_io(): reuse qemu_tool_pipe_and_status() Vladimir Sementsov-Ogievskiy
                   ` (6 subsequent siblings)
  9 siblings, 1 reply; 17+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-11-30 13:40 UTC (permalink / raw)
  To: qemu-block; +Cc: qemu-devel, mreitz, kwolf, eblake, vsementsov, den

qemu_img_args variable is unrelated here. We should print just args.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 tests/qemu-iotests/iotests.py | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/tests/qemu-iotests/iotests.py b/tests/qemu-iotests/iotests.py
index bcd4fe5b6f..5ebe25e063 100644
--- a/tests/qemu-iotests/iotests.py
+++ b/tests/qemu-iotests/iotests.py
@@ -101,9 +101,8 @@ def qemu_tool_pipe_and_status(tool: str, args: Sequence[str],
                             universal_newlines=True)
     output = subp.communicate()[0]
     if subp.returncode < 0:
-        sys.stderr.write('%s received signal %i: %s\n'
-                         % (tool, -subp.returncode,
-                            ' '.join(qemu_img_args + list(args))))
+        cmd = ' '.join(args)
+        sys.stderr.write(f'{tool} received signal {-subp.returncode}: {cmd}\n')
     return (output, subp.returncode)
 
 def qemu_img_pipe_and_status(*args: str) -> Tuple[str, int]:
-- 
2.21.3



^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v2 4/8] iotests.py: qemu_io(): reuse qemu_tool_pipe_and_status()
  2020-11-30 13:40 [PATCH v2 for-6.0 0/8] nbd reconnect on open Vladimir Sementsov-Ogievskiy
                   ` (2 preceding siblings ...)
  2020-11-30 13:40 ` [PATCH v2 3/8] iotests.py: fix qemu_tool_pipe_and_status() Vladimir Sementsov-Ogievskiy
@ 2020-11-30 13:40 ` Vladimir Sementsov-Ogievskiy
  2021-01-21  2:13   ` Eric Blake
  2020-11-30 13:40 ` [PATCH v2 5/8] iotests.py: add qemu_tool_popen() Vladimir Sementsov-Ogievskiy
                   ` (5 subsequent siblings)
  9 siblings, 1 reply; 17+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-11-30 13:40 UTC (permalink / raw)
  To: qemu-block; +Cc: qemu-devel, mreitz, kwolf, eblake, vsementsov, den

Just drop code duplication.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 tests/qemu-iotests/iotests.py | 9 +--------
 1 file changed, 1 insertion(+), 8 deletions(-)

diff --git a/tests/qemu-iotests/iotests.py b/tests/qemu-iotests/iotests.py
index 5ebe25e063..6c9bcf9042 100644
--- a/tests/qemu-iotests/iotests.py
+++ b/tests/qemu-iotests/iotests.py
@@ -188,14 +188,7 @@ def img_info_log(filename, filter_path=None, imgopts=False, extra_args=()):
 def qemu_io(*args):
     '''Run qemu-io and return the stdout data'''
     args = qemu_io_args + list(args)
-    subp = subprocess.Popen(args, stdout=subprocess.PIPE,
-                            stderr=subprocess.STDOUT,
-                            universal_newlines=True)
-    output = subp.communicate()[0]
-    if subp.returncode < 0:
-        sys.stderr.write('qemu-io received signal %i: %s\n'
-                         % (-subp.returncode, ' '.join(args)))
-    return output
+    return qemu_tool_pipe_and_status('qemu-io', args)[0]
 
 def qemu_io_log(*args):
     result = qemu_io(*args)
-- 
2.21.3



^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v2 5/8] iotests.py: add qemu_tool_popen()
  2020-11-30 13:40 [PATCH v2 for-6.0 0/8] nbd reconnect on open Vladimir Sementsov-Ogievskiy
                   ` (3 preceding siblings ...)
  2020-11-30 13:40 ` [PATCH v2 4/8] iotests.py: qemu_io(): reuse qemu_tool_pipe_and_status() Vladimir Sementsov-Ogievskiy
@ 2020-11-30 13:40 ` Vladimir Sementsov-Ogievskiy
  2020-11-30 13:40 ` [PATCH v2 6/8] iotests.py: add and use qemu_io_wrap_args() Vladimir Sementsov-Ogievskiy
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 17+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-11-30 13:40 UTC (permalink / raw)
  To: qemu-block; +Cc: qemu-devel, mreitz, kwolf, eblake, vsementsov, den

Split qemu_tool_popen() from qemu_tool_pipe_and_status() to be used
separately.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 tests/qemu-iotests/iotests.py | 15 ++++++++++-----
 1 file changed, 10 insertions(+), 5 deletions(-)

diff --git a/tests/qemu-iotests/iotests.py b/tests/qemu-iotests/iotests.py
index 6c9bcf9042..df9834e43a 100644
--- a/tests/qemu-iotests/iotests.py
+++ b/tests/qemu-iotests/iotests.py
@@ -89,16 +89,21 @@ luks_default_secret_object = 'secret,id=keysec0,data=' + \
 luks_default_key_secret_opt = 'key-secret=keysec0'
 
 
+def qemu_tool_popen(tool: str, args: Sequence[str],
+                    connect_stderr: bool = True) -> subprocess.Popen:
+    stderr = subprocess.STDOUT if connect_stderr else None
+    return subprocess.Popen(args,
+                            stdout=subprocess.PIPE,
+                            stderr=stderr,
+                            universal_newlines=True)
+
+
 def qemu_tool_pipe_and_status(tool: str, args: Sequence[str],
                               connect_stderr: bool = True) -> Tuple[str, int]:
     """
     Run a tool and return both its output and its exit code
     """
-    stderr = subprocess.STDOUT if connect_stderr else None
-    subp = subprocess.Popen(args,
-                            stdout=subprocess.PIPE,
-                            stderr=stderr,
-                            universal_newlines=True)
+    subp = qemu_tool_popen(tool, args, connect_stderr)
     output = subp.communicate()[0]
     if subp.returncode < 0:
         cmd = ' '.join(args)
-- 
2.21.3



^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v2 6/8] iotests.py: add and use qemu_io_wrap_args()
  2020-11-30 13:40 [PATCH v2 for-6.0 0/8] nbd reconnect on open Vladimir Sementsov-Ogievskiy
                   ` (4 preceding siblings ...)
  2020-11-30 13:40 ` [PATCH v2 5/8] iotests.py: add qemu_tool_popen() Vladimir Sementsov-Ogievskiy
@ 2020-11-30 13:40 ` Vladimir Sementsov-Ogievskiy
  2020-11-30 13:40 ` [PATCH v2 7/8] iotests.py: add qemu_io_popen() Vladimir Sementsov-Ogievskiy
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 17+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-11-30 13:40 UTC (permalink / raw)
  To: qemu-block; +Cc: qemu-devel, mreitz, kwolf, eblake, vsementsov, den

For qemu_io* functions support --image-opts argument, which conflicts
with -f argument from qemu_io_args.

For QemuIoInteractive use new wrapper as well, which allows relying on
default format.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 tests/qemu-iotests/iotests.py | 15 ++++++++++-----
 1 file changed, 10 insertions(+), 5 deletions(-)

diff --git a/tests/qemu-iotests/iotests.py b/tests/qemu-iotests/iotests.py
index df9834e43a..393028eb9a 100644
--- a/tests/qemu-iotests/iotests.py
+++ b/tests/qemu-iotests/iotests.py
@@ -190,10 +190,15 @@ def img_info_log(filename, filter_path=None, imgopts=False, extra_args=()):
         filter_path = filename
     log(filter_img_info(output, filter_path))
 
+def qemu_io_wrap_args(args: Sequence[str]):
+    if '-f' in args or '--image-opts' in args:
+        return qemu_io_args_no_fmt + list(args)
+    else:
+        return qemu_io_args + list(args)
+
 def qemu_io(*args):
     '''Run qemu-io and return the stdout data'''
-    args = qemu_io_args + list(args)
-    return qemu_tool_pipe_and_status('qemu-io', args)[0]
+    return qemu_tool_pipe_and_status('qemu-io', qemu_io_wrap_args(args))[0]
 
 def qemu_io_log(*args):
     result = qemu_io(*args)
@@ -202,7 +207,7 @@ def qemu_io_log(*args):
 
 def qemu_io_silent(*args):
     '''Run qemu-io and return the exit code, suppressing stdout'''
-    args = qemu_io_args + list(args)
+    args = qemu_io_wrap_args(args)
     exitcode = subprocess.call(args, stdout=open('/dev/null', 'w'))
     if exitcode < 0:
         sys.stderr.write('qemu-io received signal %i: %s\n' %
@@ -211,7 +216,7 @@ def qemu_io_silent(*args):
 
 def qemu_io_silent_check(*args):
     '''Run qemu-io and return the true if subprocess returned 0'''
-    args = qemu_io_args + list(args)
+    args = qemu_io_wrap_args(args)
     exitcode = subprocess.call(args, stdout=open('/dev/null', 'w'),
                                stderr=subprocess.STDOUT)
     return exitcode == 0
@@ -223,7 +228,7 @@ def get_virtio_scsi_device():
 
 class QemuIoInteractive:
     def __init__(self, *args):
-        self.args = qemu_io_args_no_fmt + list(args)
+        self.args = qemu_io_wrap_args(args)
         self._p = subprocess.Popen(self.args, stdin=subprocess.PIPE,
                                    stdout=subprocess.PIPE,
                                    stderr=subprocess.STDOUT,
-- 
2.21.3



^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v2 7/8] iotests.py: add qemu_io_popen()
  2020-11-30 13:40 [PATCH v2 for-6.0 0/8] nbd reconnect on open Vladimir Sementsov-Ogievskiy
                   ` (5 preceding siblings ...)
  2020-11-30 13:40 ` [PATCH v2 6/8] iotests.py: add and use qemu_io_wrap_args() Vladimir Sementsov-Ogievskiy
@ 2020-11-30 13:40 ` Vladimir Sementsov-Ogievskiy
  2020-11-30 13:40 ` [PATCH v2 8/8] iotests: add 306 to test reconnect on nbd open Vladimir Sementsov-Ogievskiy
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 17+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-11-30 13:40 UTC (permalink / raw)
  To: qemu-block; +Cc: qemu-devel, mreitz, kwolf, eblake, vsementsov, den

Add qemu-io Popen constructor wrapper. To be used in the following new
test commit.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 tests/qemu-iotests/iotests.py | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/tests/qemu-iotests/iotests.py b/tests/qemu-iotests/iotests.py
index 393028eb9a..d7b488f7ee 100644
--- a/tests/qemu-iotests/iotests.py
+++ b/tests/qemu-iotests/iotests.py
@@ -196,6 +196,9 @@ def qemu_io_wrap_args(args: Sequence[str]):
     else:
         return qemu_io_args + list(args)
 
+def qemu_io_popen(*args):
+    return qemu_tool_popen('qemu-io', qemu_io_wrap_args(args))
+
 def qemu_io(*args):
     '''Run qemu-io and return the stdout data'''
     return qemu_tool_pipe_and_status('qemu-io', qemu_io_wrap_args(args))[0]
-- 
2.21.3



^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v2 8/8] iotests: add 306 to test reconnect on nbd open
  2020-11-30 13:40 [PATCH v2 for-6.0 0/8] nbd reconnect on open Vladimir Sementsov-Ogievskiy
                   ` (6 preceding siblings ...)
  2020-11-30 13:40 ` [PATCH v2 7/8] iotests.py: add qemu_io_popen() Vladimir Sementsov-Ogievskiy
@ 2020-11-30 13:40 ` Vladimir Sementsov-Ogievskiy
  2020-12-18 10:57 ` [PATCH v2 for-6.0 0/8] nbd reconnect on open Vladimir Sementsov-Ogievskiy
  2021-01-21  2:17 ` Eric Blake
  9 siblings, 0 replies; 17+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-11-30 13:40 UTC (permalink / raw)
  To: qemu-block; +Cc: qemu-devel, mreitz, kwolf, eblake, vsementsov, den

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: Denis V. Lunev <den@openvz.org>
---
 tests/qemu-iotests/306        | 71 +++++++++++++++++++++++++++++++++++
 tests/qemu-iotests/306.out    | 11 ++++++
 tests/qemu-iotests/group      |  1 +
 tests/qemu-iotests/iotests.py | 11 ++++++
 4 files changed, 94 insertions(+)
 create mode 100755 tests/qemu-iotests/306
 create mode 100644 tests/qemu-iotests/306.out

diff --git a/tests/qemu-iotests/306 b/tests/qemu-iotests/306
new file mode 100755
index 0000000000..85216cf088
--- /dev/null
+++ b/tests/qemu-iotests/306
@@ -0,0 +1,71 @@
+#!/usr/bin/env python3
+#
+# Test nbd reconnect on open
+#
+# Copyright (c) 2020 Virtuozzo International GmbH
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+#
+
+import time
+
+import iotests
+from iotests import qemu_img_create, file_path, qemu_io_popen, qemu_nbd, \
+    qemu_io_log, log
+
+iotests.script_initialize(supported_fmts=['qcow2'])
+
+disk, nbd_sock = file_path('disk', 'nbd-sock')
+
+
+def create_args(open_timeout):
+    return ['--image-opts', '-c', 'read 0 1M',
+            f'driver=nbd,open-timeout={open_timeout},'
+            f'server.type=unix,server.path={nbd_sock}']
+
+
+def check_fail_to_connect(open_timeout):
+    log(f'Check fail to connect with {open_timeout} seconds of timeout')
+
+    start_t = time.time()
+    qemu_io_log(*create_args(open_timeout))
+    delta_t = time.time() - start_t
+
+    max_delta = open_timeout + 0.2
+    if delta_t >= open_timeout and delta_t < max_delta:
+        log(f'qemu_io finished in {open_timeout}..{max_delta} seconds, OK')
+    else:
+        note = 'too early' if delta_t < open_timeout else 'too long'
+        log(f'qemu_io finished in {delta_t:.1f} seconds, {note}')
+
+
+qemu_img_create('-f', iotests.imgfmt, disk, '1M')
+
+# Start NBD client when NBD server is not yet running. It should not fail, but
+# wait for 5 seconds for the server to be available.
+client = qemu_io_popen(*create_args(5))
+
+time.sleep(1)
+qemu_nbd('-k', nbd_sock, '-f', iotests.imgfmt, disk)
+
+# client should succeed
+log(client.communicate()[0], filters=[iotests.filter_qemu_io])
+
+# Server was started without --persistent flag, so it should be off now. Let's
+# check it and it the same time check that with open-timeout=0 client fails
+# immediately.
+check_fail_to_connect(0)
+
+# Check that we will fail after non-zero timeout if server is still unavailable
+check_fail_to_connect(1)
diff --git a/tests/qemu-iotests/306.out b/tests/qemu-iotests/306.out
new file mode 100644
index 0000000000..a35ae30ea4
--- /dev/null
+++ b/tests/qemu-iotests/306.out
@@ -0,0 +1,11 @@
+read 1048576/1048576 bytes at offset 0
+1 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+
+Check fail to connect with 0 seconds of timeout
+qemu-io: can't open: Failed to connect to 'TEST_DIR/PID-nbd-sock': No such file or directory
+
+qemu_io finished in 0..0.2 seconds, OK
+Check fail to connect with 1 seconds of timeout
+qemu-io: can't open: Failed to connect to 'TEST_DIR/PID-nbd-sock': No such file or directory
+
+qemu_io finished in 1..1.2 seconds, OK
diff --git a/tests/qemu-iotests/group b/tests/qemu-iotests/group
index 2960dff728..6ea7b83c1e 100644
--- a/tests/qemu-iotests/group
+++ b/tests/qemu-iotests/group
@@ -314,5 +314,6 @@
 303 rw quick
 304 rw quick
 305 rw quick
+306 rw auto quick
 307 rw quick export
 309 rw auto quick
diff --git a/tests/qemu-iotests/iotests.py b/tests/qemu-iotests/iotests.py
index d7b488f7ee..2530185220 100644
--- a/tests/qemu-iotests/iotests.py
+++ b/tests/qemu-iotests/iotests.py
@@ -208,6 +208,17 @@ def qemu_io_log(*args):
     log(result, filters=[filter_testfiles, filter_qemu_io])
     return result
 
+def qemu_io_popen(*args):
+    '''Run qemu-nbd in daemon mode and return the parent's exit code'''
+    default_args = qemu_io_args[:]
+
+    if ('-f' in args or '--image-opts' in args) and '-f' in default_args:
+        ind = default_args.index('-f')
+        del default_args[ind:ind+2]
+
+    return subprocess.Popen(default_args + list(args), stdout=subprocess.PIPE,
+                            stderr=subprocess.STDOUT, universal_newlines=True)
+
 def qemu_io_silent(*args):
     '''Run qemu-io and return the exit code, suppressing stdout'''
     args = qemu_io_wrap_args(args)
-- 
2.21.3



^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH v2 for-6.0 0/8] nbd reconnect on open
  2020-11-30 13:40 [PATCH v2 for-6.0 0/8] nbd reconnect on open Vladimir Sementsov-Ogievskiy
                   ` (7 preceding siblings ...)
  2020-11-30 13:40 ` [PATCH v2 8/8] iotests: add 306 to test reconnect on nbd open Vladimir Sementsov-Ogievskiy
@ 2020-12-18 10:57 ` Vladimir Sementsov-Ogievskiy
  2021-01-09 10:11   ` Vladimir Sementsov-Ogievskiy
  2021-01-21  2:17 ` Eric Blake
  9 siblings, 1 reply; 17+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-12-18 10:57 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, den, qemu-devel, mreitz

ping :)

30.11.2020 16:40, Vladimir Sementsov-Ogievskiy wrote:
> Hi all! There is a new feature: reconnect on open. It is useful when
> start of vm and start of nbd server are not simple to sync.
> 
> v2: rebase on master.
> Also, I've discovered that I've sent downstream version of test which
> doesn't work here. So, the test is rewritten (with appropriate
> improvements of iotests.py)
> 
> Vladimir Sementsov-Ogievskiy (8):
>    block/nbd: move initial connect to coroutine
>    nbd: allow reconnect on open, with corresponding new options
>    iotests.py: fix qemu_tool_pipe_and_status()
>    iotests.py: qemu_io(): reuse qemu_tool_pipe_and_status()
>    iotests.py: add qemu_tool_popen()
>    iotests.py: add and use qemu_io_wrap_args()
>    iotests.py: add qemu_io_popen()
>    iotests: add 306 to test reconnect on nbd open
> 
>   block/nbd.c                   | 173 +++++++++++++++++++++++++---------
>   tests/qemu-iotests/306        |  71 ++++++++++++++
>   tests/qemu-iotests/306.out    |  11 +++
>   tests/qemu-iotests/group      |   1 +
>   tests/qemu-iotests/iotests.py |  56 +++++++----
>   5 files changed, 246 insertions(+), 66 deletions(-)
>   create mode 100755 tests/qemu-iotests/306
>   create mode 100644 tests/qemu-iotests/306.out
> 


-- 
Best regards,
Vladimir


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v2 for-6.0 0/8] nbd reconnect on open
  2020-12-18 10:57 ` [PATCH v2 for-6.0 0/8] nbd reconnect on open Vladimir Sementsov-Ogievskiy
@ 2021-01-09 10:11   ` Vladimir Sementsov-Ogievskiy
  0 siblings, 0 replies; 17+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2021-01-09 10:11 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, den, qemu-devel, mreitz

ping ping

18.12.2020 13:57, Vladimir Sementsov-Ogievskiy wrote:
> ping :)
> 
> 30.11.2020 16:40, Vladimir Sementsov-Ogievskiy wrote:
>> Hi all! There is a new feature: reconnect on open. It is useful when
>> start of vm and start of nbd server are not simple to sync.
>>
>> v2: rebase on master.
>> Also, I've discovered that I've sent downstream version of test which
>> doesn't work here. So, the test is rewritten (with appropriate
>> improvements of iotests.py)
>>
>> Vladimir Sementsov-Ogievskiy (8):
>>    block/nbd: move initial connect to coroutine
>>    nbd: allow reconnect on open, with corresponding new options
>>    iotests.py: fix qemu_tool_pipe_and_status()
>>    iotests.py: qemu_io(): reuse qemu_tool_pipe_and_status()
>>    iotests.py: add qemu_tool_popen()
>>    iotests.py: add and use qemu_io_wrap_args()
>>    iotests.py: add qemu_io_popen()
>>    iotests: add 306 to test reconnect on nbd open
>>
>>   block/nbd.c                   | 173 +++++++++++++++++++++++++---------
>>   tests/qemu-iotests/306        |  71 ++++++++++++++
>>   tests/qemu-iotests/306.out    |  11 +++
>>   tests/qemu-iotests/group      |   1 +
>>   tests/qemu-iotests/iotests.py |  56 +++++++----
>>   5 files changed, 246 insertions(+), 66 deletions(-)
>>   create mode 100755 tests/qemu-iotests/306
>>   create mode 100644 tests/qemu-iotests/306.out
>>
> 
> 


-- 
Best regards,
Vladimir


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v2 1/8] block/nbd: move initial connect to coroutine
  2020-11-30 13:40 ` [PATCH v2 1/8] block/nbd: move initial connect to coroutine Vladimir Sementsov-Ogievskiy
@ 2021-01-20 22:24   ` Eric Blake
  0 siblings, 0 replies; 17+ messages in thread
From: Eric Blake @ 2021-01-20 22:24 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-block; +Cc: kwolf, den, qemu-devel, mreitz

On 11/30/20 7:40 AM, Vladimir Sementsov-Ogievskiy wrote:
> We are going to implement reconnect-on-open. Let's reuse existing
> reconnect loop. For this, do initial connect in connection coroutine.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> ---
>  block/nbd.c | 94 ++++++++++++++++++++++++++++++-----------------------
>  1 file changed, 53 insertions(+), 41 deletions(-)
> 

> @@ -2279,6 +2268,29 @@ static int nbd_open(BlockDriverState *bs, QDict *options, int flags,
>      bdrv_inc_in_flight(bs);
>      aio_co_schedule(bdrv_get_aio_context(bs), s->connection_co);
>  
> +    if (qemu_in_coroutine()) {
> +        s->open_co = qemu_coroutine_self();
> +        qemu_coroutine_yield();
> +    } else {
> +        BDRV_POLL_WHILE(bs, s->state == NBD_CLIENT_OPENING);
> +    }
> +
> +    if (s->state != NBD_CLIENT_CONNECTED && s->connect_status < 0) {
> +        /*
> +         * It's possible that state != NBD_CLIENT_CONNECTED, but connect_status
> +         * is 0. This means that initial connecting succeed, but failed later
> +         * (during BDRV_POLL_WHILE). It's a rare case, but it happen in iotest

This means that starting the initial connection succeeded, but we failed
later (during BDRV_POLL_WHILE).

happens

> +         * 83. Let's don't care and just report success in this case: it not
> +         * much differs from the case when connection failed immediately after
> +         * succeeded open.

We don't care, and just report success in this case, as it does not
change our behavior from the case when the connection fails right after
open succeeds.


> +         */
> +        assert(s->connect_err);
> +        error_propagate(errp, s->connect_err);
> +        s->connect_err = NULL;
> +        nbd_clear_bdrvstate(s);
> +        return s->connect_status;
> +    }
> +
>      return 0;
>  }
>  
> 

Reviewed-by: Eric Blake <eblake@redhat.com>

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org



^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v2 2/8] nbd: allow reconnect on open, with corresponding new options
  2020-11-30 13:40 ` [PATCH v2 2/8] nbd: allow reconnect on open, with corresponding new options Vladimir Sementsov-Ogievskiy
@ 2021-01-21  1:44   ` Eric Blake
  2021-01-22 10:56     ` Vladimir Sementsov-Ogievskiy
  0 siblings, 1 reply; 17+ messages in thread
From: Eric Blake @ 2021-01-21  1:44 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-block; +Cc: kwolf, den, qemu-devel, mreitz

On 11/30/20 7:40 AM, Vladimir Sementsov-Ogievskiy wrote:
> Note: currently, using new option with long timeout in qmp command
> blockdev-add is not good idea, as qmp interface is blocking, so,
> don't add it now, let's add it later after
> "monitor: Optionally run handlers in coroutines" series merged.

If I'm not mistaken, that landed as of eb94b81a94.  Is it just the
commit message that needs an update, or does this patch need a respin?

> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> ---
>  block/nbd.c | 115 +++++++++++++++++++++++++++++++++++++++++-----------
>  1 file changed, 92 insertions(+), 23 deletions(-)
> 

> @@ -474,6 +484,11 @@ nbd_co_establish_connection(BlockDriverState *bs, Error **errp)
>      s->wait_connect = true;
>      qemu_coroutine_yield();
>  
> +    if (!s->connect_thread) {
> +        error_setg(errp, "Connection attempt cancelled by other operation");
> +        return NULL;
> +    }

Does this need to use atomics for proper access to s->connect_thread
across threads?  Or are all the operations done by other coroutines but
within the same thread, so we are safe?


> @@ -624,10 +645,15 @@ static coroutine_fn void nbd_reconnect_attempt(BDRVNBDState *s)
>      bdrv_inc_in_flight(s->bs);
>  
>  out:
> -    s->connect_status = ret;
> -    error_free(s->connect_err);
> -    s->connect_err = NULL;
> -    error_propagate(&s->connect_err, local_err);
> +    if (s->connect_status == -ETIMEDOUT) {
> +        /* Don't rewrite timeout error by following cancel-provoked error */

Maybe:

/* Don't propagate a timeout error caused by a job cancellation. */


> +static void open_timer_cb(void *opaque)
> +{
> +    BDRVNBDState *s = opaque;
> +
> +    if (!s->connect_status) {
> +        /* First attempt was not finished. We should set an error */
> +        s->connect_status = -ETIMEDOUT;
> +        error_setg(&s->connect_err, "First connection attempt is cancelled by "
> +                   "timeout");
> +    }
> +
> +    nbd_teardown_connection_async(s->bs);
> +    open_timer_del(s);
> +}
> +
> +static void open_timer_init(BDRVNBDState *s, uint64_t expire_time_ns)
> +{
> +    assert(!s->open_timer && s->state == NBD_CLIENT_OPENING);
> +    s->open_timer = aio_timer_new(bdrv_get_aio_context(s->bs),
> +                                  QEMU_CLOCK_REALTIME,
> +                                  SCALE_NS,
> +                                  open_timer_cb, s);
> +    timer_mod(s->open_timer, expire_time_ns);
> +}
> +


> @@ -2180,6 +2235,14 @@ static QemuOptsList nbd_runtime_opts = {
>                      "future requests before a successful reconnect will "
>                      "immediately fail. Default 0",
>          },
> +        {
> +            .name = "open-timeout",
> +            .type = QEMU_OPT_NUMBER,
> +            .help = "In seconds. If zero, nbd driver tries to establish "
> +                    "connection only once, on fail open fails. If non-zero, "

If zero, the nbd driver tries the connection only once, and fails to
open if the connection fails.

> +                    "nbd driver may do several attempts until success or "
> +                    "@open-timeout seconds passed. Default 0",

If non-zero, the nbd driver will repeat connection attempts until
successful or until @open-timeout seconds have elapsed.

> +        },

Where is the QMP counterpart for setting this option?

>          { /* end of list */ }
>      },
>  };
> @@ -2235,6 +2298,7 @@ static int nbd_process_options(BlockDriverState *bs, QDict *options,
>      }
>  
>      s->reconnect_delay = qemu_opt_get_number(opts, "reconnect-delay", 0);
> +    s->open_timeout = qemu_opt_get_number(opts, "open-timeout", 0);
>  
>      ret = 0;
>  
> @@ -2268,6 +2332,11 @@ static int nbd_open(BlockDriverState *bs, QDict *options, int flags,
>      bdrv_inc_in_flight(bs);
>      aio_co_schedule(bdrv_get_aio_context(bs), s->connection_co);
>  
> +    if (s->open_timeout) {
> +        open_timer_init(s, qemu_clock_get_ns(QEMU_CLOCK_REALTIME) +
> +                        s->open_timeout * NANOSECONDS_PER_SECOND);
> +    }
> +
>      if (qemu_in_coroutine()) {
>          s->open_co = qemu_coroutine_self();
>          qemu_coroutine_yield();
> 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org



^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v2 3/8] iotests.py: fix qemu_tool_pipe_and_status()
  2020-11-30 13:40 ` [PATCH v2 3/8] iotests.py: fix qemu_tool_pipe_and_status() Vladimir Sementsov-Ogievskiy
@ 2021-01-21  1:58   ` Eric Blake
  0 siblings, 0 replies; 17+ messages in thread
From: Eric Blake @ 2021-01-21  1:58 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-block; +Cc: kwolf, den, qemu-devel, mreitz

On 11/30/20 7:40 AM, Vladimir Sementsov-Ogievskiy wrote:
> qemu_img_args variable is unrelated here. We should print just args.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> ---
>  tests/qemu-iotests/iotests.py | 5 ++---
>  1 file changed, 2 insertions(+), 3 deletions(-)

Reviewed-by: Eric Blake <eblake@redhat.com>

> 
> diff --git a/tests/qemu-iotests/iotests.py b/tests/qemu-iotests/iotests.py
> index bcd4fe5b6f..5ebe25e063 100644
> --- a/tests/qemu-iotests/iotests.py
> +++ b/tests/qemu-iotests/iotests.py
> @@ -101,9 +101,8 @@ def qemu_tool_pipe_and_status(tool: str, args: Sequence[str],
>                              universal_newlines=True)
>      output = subp.communicate()[0]
>      if subp.returncode < 0:
> -        sys.stderr.write('%s received signal %i: %s\n'
> -                         % (tool, -subp.returncode,
> -                            ' '.join(qemu_img_args + list(args))))
> +        cmd = ' '.join(args)
> +        sys.stderr.write(f'{tool} received signal {-subp.returncode}: {cmd}\n')
>      return (output, subp.returncode)
>  
>  def qemu_img_pipe_and_status(*args: str) -> Tuple[str, int]:
> 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org



^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v2 4/8] iotests.py: qemu_io(): reuse qemu_tool_pipe_and_status()
  2020-11-30 13:40 ` [PATCH v2 4/8] iotests.py: qemu_io(): reuse qemu_tool_pipe_and_status() Vladimir Sementsov-Ogievskiy
@ 2021-01-21  2:13   ` Eric Blake
  0 siblings, 0 replies; 17+ messages in thread
From: Eric Blake @ 2021-01-21  2:13 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-block; +Cc: kwolf, den, qemu-devel, mreitz

On 11/30/20 7:40 AM, Vladimir Sementsov-Ogievskiy wrote:
> Just drop code duplication.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> ---
>  tests/qemu-iotests/iotests.py | 9 +--------
>  1 file changed, 1 insertion(+), 8 deletions(-)

Reviewed-by: Eric Blake <eblake@redhat.com>

> 
> diff --git a/tests/qemu-iotests/iotests.py b/tests/qemu-iotests/iotests.py
> index 5ebe25e063..6c9bcf9042 100644
> --- a/tests/qemu-iotests/iotests.py
> +++ b/tests/qemu-iotests/iotests.py
> @@ -188,14 +188,7 @@ def img_info_log(filename, filter_path=None, imgopts=False, extra_args=()):
>  def qemu_io(*args):
>      '''Run qemu-io and return the stdout data'''
>      args = qemu_io_args + list(args)
> -    subp = subprocess.Popen(args, stdout=subprocess.PIPE,
> -                            stderr=subprocess.STDOUT,
> -                            universal_newlines=True)
> -    output = subp.communicate()[0]
> -    if subp.returncode < 0:
> -        sys.stderr.write('qemu-io received signal %i: %s\n'
> -                         % (-subp.returncode, ' '.join(args)))
> -    return output
> +    return qemu_tool_pipe_and_status('qemu-io', args)[0]
>  
>  def qemu_io_log(*args):
>      result = qemu_io(*args)
> 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org



^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v2 for-6.0 0/8] nbd reconnect on open
  2020-11-30 13:40 [PATCH v2 for-6.0 0/8] nbd reconnect on open Vladimir Sementsov-Ogievskiy
                   ` (8 preceding siblings ...)
  2020-12-18 10:57 ` [PATCH v2 for-6.0 0/8] nbd reconnect on open Vladimir Sementsov-Ogievskiy
@ 2021-01-21  2:17 ` Eric Blake
  9 siblings, 0 replies; 17+ messages in thread
From: Eric Blake @ 2021-01-21  2:17 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-block; +Cc: kwolf, den, qemu-devel, mreitz

On 11/30/20 7:40 AM, Vladimir Sementsov-Ogievskiy wrote:
> Hi all! There is a new feature: reconnect on open. It is useful when
> start of vm and start of nbd server are not simple to sync.
> 
> v2: rebase on master.
> Also, I've discovered that I've sent downstream version of test which
> doesn't work here. So, the test is rewritten (with appropriate
> improvements of iotests.py)

Because of my questions on 2, I haven't queued the series yet; but 3 and
4 were independent enough to take now.

> 
> Vladimir Sementsov-Ogievskiy (8):
>   block/nbd: move initial connect to coroutine
>   nbd: allow reconnect on open, with corresponding new options
>   iotests.py: fix qemu_tool_pipe_and_status()
>   iotests.py: qemu_io(): reuse qemu_tool_pipe_and_status()
>   iotests.py: add qemu_tool_popen()
>   iotests.py: add and use qemu_io_wrap_args()
>   iotests.py: add qemu_io_popen()
>   iotests: add 306 to test reconnect on nbd open
> 
>  block/nbd.c                   | 173 +++++++++++++++++++++++++---------
>  tests/qemu-iotests/306        |  71 ++++++++++++++
>  tests/qemu-iotests/306.out    |  11 +++
>  tests/qemu-iotests/group      |   1 +
>  tests/qemu-iotests/iotests.py |  56 +++++++----
>  5 files changed, 246 insertions(+), 66 deletions(-)
>  create mode 100755 tests/qemu-iotests/306
>  create mode 100644 tests/qemu-iotests/306.out
> 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org



^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v2 2/8] nbd: allow reconnect on open, with corresponding new options
  2021-01-21  1:44   ` Eric Blake
@ 2021-01-22 10:56     ` Vladimir Sementsov-Ogievskiy
  0 siblings, 0 replies; 17+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2021-01-22 10:56 UTC (permalink / raw)
  To: Eric Blake, qemu-block; +Cc: kwolf, den, qemu-devel, mreitz

21.01.2021 04:44, Eric Blake wrote:
> On 11/30/20 7:40 AM, Vladimir Sementsov-Ogievskiy wrote:
>> Note: currently, using new option with long timeout in qmp command
>> blockdev-add is not good idea, as qmp interface is blocking, so,
>> don't add it now, let's add it later after
>> "monitor: Optionally run handlers in coroutines" series merged.
> 
> If I'm not mistaken, that landed as of eb94b81a94.  Is it just the
> commit message that needs an update, or does this patch need a respin?

Oh yes, you are right. I think the most reasonable thing is to keep this patch
in separate (for simple backporting to downstream without Kevin's series), and
add qmp support for the feature as additional new patch. Will do it on respin.

> 
>>
>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>> ---
>>   block/nbd.c | 115 +++++++++++++++++++++++++++++++++++++++++-----------
>>   1 file changed, 92 insertions(+), 23 deletions(-)
>>
> 
>> @@ -474,6 +484,11 @@ nbd_co_establish_connection(BlockDriverState *bs, Error **errp)
>>       s->wait_connect = true;
>>       qemu_coroutine_yield();
>>   
>> +    if (!s->connect_thread) {
>> +        error_setg(errp, "Connection attempt cancelled by other operation");
>> +        return NULL;
>> +    }
> 
> Does this need to use atomics for proper access to s->connect_thread
> across threads?  Or are all the operations done by other coroutines but
> within the same thread, so we are safe?

s->connect_thread is not accessed from connect_thread_func, so in this way we are safe. And variables shared between connect_thread_func and other driver code are protected by mutex.

What about accessing nbd bds from different threads.. In my observation, all the code is written in assumption that everything inside block-driver may be called from different coroutines but from one thread.. And we have a lot of s->* variables that are not atomic and not protected by mutexes, and all this works somehow:)

I remember Paolo answered me somewhere in mailing list, that actually, everything in block drivers and block/io must be thread-safe.. But I don't see this thread-safety in current code, so don't introduce it for new variables.

> 
> 
>> @@ -624,10 +645,15 @@ static coroutine_fn void nbd_reconnect_attempt(BDRVNBDState *s)
>>       bdrv_inc_in_flight(s->bs);
>>   
>>   out:
>> -    s->connect_status = ret;
>> -    error_free(s->connect_err);
>> -    s->connect_err = NULL;
>> -    error_propagate(&s->connect_err, local_err);
>> +    if (s->connect_status == -ETIMEDOUT) {
>> +        /* Don't rewrite timeout error by following cancel-provoked error */
> 
> Maybe:
> 
> /* Don't propagate a timeout error caused by a job cancellation. */

No, we want to keep ETIMEOUT

> 
> 
>> +static void open_timer_cb(void *opaque)
>> +{
>> +    BDRVNBDState *s = opaque;
>> +
>> +    if (!s->connect_status) {
>> +        /* First attempt was not finished. We should set an error */
>> +        s->connect_status = -ETIMEDOUT;
>> +        error_setg(&s->connect_err, "First connection attempt is cancelled by "
>> +                   "timeout");
>> +    }
>> +
>> +    nbd_teardown_connection_async(s->bs);
>> +    open_timer_del(s);
>> +}
>> +
>> +static void open_timer_init(BDRVNBDState *s, uint64_t expire_time_ns)
>> +{
>> +    assert(!s->open_timer && s->state == NBD_CLIENT_OPENING);
>> +    s->open_timer = aio_timer_new(bdrv_get_aio_context(s->bs),
>> +                                  QEMU_CLOCK_REALTIME,
>> +                                  SCALE_NS,
>> +                                  open_timer_cb, s);
>> +    timer_mod(s->open_timer, expire_time_ns);
>> +}
>> +
> 
> 
>> @@ -2180,6 +2235,14 @@ static QemuOptsList nbd_runtime_opts = {
>>                       "future requests before a successful reconnect will "
>>                       "immediately fail. Default 0",
>>           },
>> +        {
>> +            .name = "open-timeout",
>> +            .type = QEMU_OPT_NUMBER,
>> +            .help = "In seconds. If zero, nbd driver tries to establish "
>> +                    "connection only once, on fail open fails. If non-zero, "
> 
> If zero, the nbd driver tries the connection only once, and fails to
> open if the connection fails.
> 
>> +                    "nbd driver may do several attempts until success or "
>> +                    "@open-timeout seconds passed. Default 0",
> 
> If non-zero, the nbd driver will repeat connection attempts until
> successful or until @open-timeout seconds have elapsed.
> 
>> +        },
> 
> Where is the QMP counterpart for setting this option?

Absent (as described in commit msg). Will do in a separate patch.

> 
>>           { /* end of list */ }
>>       },
>>   };
>> @@ -2235,6 +2298,7 @@ static int nbd_process_options(BlockDriverState *bs, QDict *options,
>>       }
>>   
>>       s->reconnect_delay = qemu_opt_get_number(opts, "reconnect-delay", 0);
>> +    s->open_timeout = qemu_opt_get_number(opts, "open-timeout", 0);
>>   
>>       ret = 0;
>>   
>> @@ -2268,6 +2332,11 @@ static int nbd_open(BlockDriverState *bs, QDict *options, int flags,
>>       bdrv_inc_in_flight(bs);
>>       aio_co_schedule(bdrv_get_aio_context(bs), s->connection_co);
>>   
>> +    if (s->open_timeout) {
>> +        open_timer_init(s, qemu_clock_get_ns(QEMU_CLOCK_REALTIME) +
>> +                        s->open_timeout * NANOSECONDS_PER_SECOND);
>> +    }
>> +
>>       if (qemu_in_coroutine()) {
>>           s->open_co = qemu_coroutine_self();
>>           qemu_coroutine_yield();
>>
> 


-- 
Best regards,
Vladimir


^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2021-01-22 10:58 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-11-30 13:40 [PATCH v2 for-6.0 0/8] nbd reconnect on open Vladimir Sementsov-Ogievskiy
2020-11-30 13:40 ` [PATCH v2 1/8] block/nbd: move initial connect to coroutine Vladimir Sementsov-Ogievskiy
2021-01-20 22:24   ` Eric Blake
2020-11-30 13:40 ` [PATCH v2 2/8] nbd: allow reconnect on open, with corresponding new options Vladimir Sementsov-Ogievskiy
2021-01-21  1:44   ` Eric Blake
2021-01-22 10:56     ` Vladimir Sementsov-Ogievskiy
2020-11-30 13:40 ` [PATCH v2 3/8] iotests.py: fix qemu_tool_pipe_and_status() Vladimir Sementsov-Ogievskiy
2021-01-21  1:58   ` Eric Blake
2020-11-30 13:40 ` [PATCH v2 4/8] iotests.py: qemu_io(): reuse qemu_tool_pipe_and_status() Vladimir Sementsov-Ogievskiy
2021-01-21  2:13   ` Eric Blake
2020-11-30 13:40 ` [PATCH v2 5/8] iotests.py: add qemu_tool_popen() Vladimir Sementsov-Ogievskiy
2020-11-30 13:40 ` [PATCH v2 6/8] iotests.py: add and use qemu_io_wrap_args() Vladimir Sementsov-Ogievskiy
2020-11-30 13:40 ` [PATCH v2 7/8] iotests.py: add qemu_io_popen() Vladimir Sementsov-Ogievskiy
2020-11-30 13:40 ` [PATCH v2 8/8] iotests: add 306 to test reconnect on nbd open Vladimir Sementsov-Ogievskiy
2020-12-18 10:57 ` [PATCH v2 for-6.0 0/8] nbd reconnect on open Vladimir Sementsov-Ogievskiy
2021-01-09 10:11   ` Vladimir Sementsov-Ogievskiy
2021-01-21  2:17 ` Eric Blake

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).