qemu-devel.nongnu.org archive mirror
* [PATCH v8 0/8] Introduce 'yank' oob qmp command to recover from hanging qemu
@ 2020-09-01  9:14 Lukas Straub
  2020-09-01  9:15 ` [PATCH v8 1/8] Introduce yank feature Lukas Straub
                   ` (7 more replies)
  0 siblings, 8 replies; 16+ messages in thread
From: Lukas Straub @ 2020-09-01  9:14 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Daniel P. Berrangé,
	qemu-block, Juan Quintela, Markus Armbruster,
	Dr. David Alan Gilbert, Max Reitz, Paolo Bonzini,
	Marc-André Lureau

[-- Attachment #1: Type: text/plain, Size: 3397 bytes --]

Hello Everyone,
So here is v8. We still need ACKs from NBD and chardev maintainers.

Changes:

v8:
 -add Reviewed-by and Acked-by tags
 -rebase onto master
  -minor change to migration
  -convert to meson
 -change "Since:" to 5.2
 -various code style fixes (Markus Armbruster)
 -point to oob restrictions in comment to yank_register_function
  (Markus Armbruster)
 -improve qmp documentation (Markus Armbruster)
 -document oob suitability of qio_channel and io_shutdown (Markus Armbruster)

v7:
 -yank_register_instance now returns error via Error **errp instead of aborting
 -dropped "chardev/char.c: Check for duplicate id before  creating chardev"

v6:
 -add Reviewed-by and Acked-by tags
 -rebase on master
 -lots of changes in nbd due to rebase
 -only take maintainership of util/yank.c and include/qemu/yank.h (Daniel P. Berrangé)
 -fix a crash discovered by the newly added chardev test
 -fix the test itself

v5:
 -move yank.c to util/
 -move yank.h to include/qemu/
 -add license to yank.h
 -use const char*
 -nbd: use atomic_store_release and atomic_load_acquire
 -io-channel: ensure thread-safety and document it
 -add myself as maintainer for yank

v4:
 -fix build errors...

v3:
 -don't touch softmmu/vl.c, use __constructor__ attribute instead (Paolo Bonzini)
 -fix build errors
 -rewrite migration patch so it actually passes all tests

v2:
 -don't touch io/ code anymore
 -always register yank functions
 -'yank' now takes a list of instances to yank
 -'query-yank' returns a list of yankable instances

Overview:
In many cases, if qemu has a network connection (qmp, migration, chardev, etc.)
to some other server and that server dies or hangs, qemu hangs too.
These patches introduce the new 'yank' out-of-band qmp command to recover from
these kinds of hangs. The different subsystems register callbacks, which the
yank command executes. For example, a callback can shutdown() a socket. This is
intended for the COLO use case, but it can be used for other things too.
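
To illustrate the shutdown() idea outside of qemu, here is a minimal
standalone sketch. It is not code from this series: DemoYankFn,
demo_yank_socket and demo_read_after_yank are hypothetical names, and a local
socketpair() stands in for the connection to the remote server. The point is
only that a read() on a socket that has been shut down returns at once
instead of blocking.

```c
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical callback type, modelled on the YankFn typedef in patch 1. */
typedef void (DemoYankFn)(void *opaque);

/* Hypothetical yank callback: shut down the socket whose fd is in *opaque. */
static void demo_yank_socket(void *opaque)
{
    int fd = *(int *)opaque;

    shutdown(fd, SHUT_RDWR);
}

/*
 * Create a connected socket pair, "yank" one end, and return what read()
 * reports on that end afterwards (0 means end-of-file, -1 means error).
 */
static long demo_read_after_yank(void)
{
    int fds[2];
    char buf[16];
    DemoYankFn *fn = demo_yank_socket;
    ssize_t n;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, fds) < 0) {
        return -1;
    }

    fn(&fds[0]);                        /* what a yank would trigger */
    n = read(fds[0], buf, sizeof(buf)); /* returns at once, no hang */

    close(fds[0]);
    close(fds[1]);
    return (long)n;
}
```

Calling demo_read_after_yank() from a small main() should report
end-of-file rather than block, which is the behaviour the yank callbacks in
this series rely on.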

Regards,
Lukas Straub

Lukas Straub (8):
  Introduce yank feature
  block/nbd.c: Add yank feature
  chardev/char-socket.c: Add yank feature
  migration: Add yank feature
  io/channel-tls.c: make qio_channel_tls_shutdown thread-safe
  io: Document qmp oob suitability of qio_channel_shutdown and
    io_shutdown
  MAINTAINERS: Add myself as maintainer for yank feature
  tests/test-char.c: Wait for the chardev to connect in
    char_socket_client_dupid_test

 MAINTAINERS                   |   6 ++
 block/nbd.c                   | 129 ++++++++++++++---------
 chardev/char-socket.c         |  31 ++++++
 include/io/channel.h          |   5 +-
 include/qemu/yank.h           |  81 +++++++++++++++
 io/channel-tls.c              |   6 +-
 migration/channel.c           |  12 +++
 migration/migration.c         |  25 +++++
 migration/multifd.c           |  10 ++
 migration/qemu-file-channel.c |   6 ++
 migration/savevm.c            |   6 ++
 qapi/misc.json                |  62 +++++++++++
 tests/test-char.c             |   1 +
 util/meson.build              |   1 +
 util/yank.c                   | 187 ++++++++++++++++++++++++++++++++++
 15 files changed, 516 insertions(+), 52 deletions(-)
 create mode 100644 include/qemu/yank.h
 create mode 100644 util/yank.c

--
2.20.1


* [PATCH v8 1/8] Introduce yank feature
  2020-09-01  9:14 [PATCH v8 0/8] Introduce 'yank' oob qmp command to recover from hanging qemu Lukas Straub
@ 2020-09-01  9:15 ` Lukas Straub
  2020-09-01  9:45   ` Daniel P. Berrangé
                     ` (2 more replies)
  2020-09-01  9:15 ` [PATCH v8 2/8] block/nbd.c: Add " Lukas Straub
                   ` (6 subsequent siblings)
  7 siblings, 3 replies; 16+ messages in thread
From: Lukas Straub @ 2020-09-01  9:15 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Daniel P. Berrangé,
	qemu-block, Juan Quintela, Markus Armbruster,
	Dr. David Alan Gilbert, Max Reitz, Paolo Bonzini,
	Marc-André Lureau

[-- Attachment #1: Type: text/plain, Size: 10870 bytes --]

The yank feature allows recovering from a hanging qemu by "yanking"
at various parts. Other qemu subsystems can register themselves as yank
instances, each with multiple yank functions. All yank functions of the
selected instances can then be called by the 'yank' out-of-band qmp command.
Available instances can be queried with the 'query-yank' oob command.
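
As a rough standalone model of the registry described above (not the code in
util/yank.c), the sketch below uses plain pthreads and singly linked lists in
place of QemuMutex and QLIST. All model_* names are hypothetical. It returns
error codes where the real code asserts or uses Error **errp, and it mirrors
the two-pass yank: every requested name is validated first, and only then are
the callbacks run.

```c
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

typedef void (ModelYankFn)(void *opaque);

struct model_fn {
    ModelYankFn *func;
    void *opaque;
    struct model_fn *next;
};

struct model_instance {
    char *name;
    struct model_fn *fns;
    struct model_instance *next;
};

/* One lock protects the whole instance list, as in the patch. */
static pthread_mutex_t model_lock = PTHREAD_MUTEX_INITIALIZER;
static struct model_instance *model_instances;

/* Look up an instance by name. Caller must hold model_lock. */
static struct model_instance *model_find(const char *name)
{
    struct model_instance *i;

    for (i = model_instances; i; i = i->next) {
        if (!strcmp(i->name, name)) {
            return i;
        }
    }
    return NULL;
}

/* Returns 0 on success, -1 on a duplicate name or allocation failure. */
static int model_register_instance(const char *name)
{
    struct model_instance *i = NULL;
    int ret = -1;

    pthread_mutex_lock(&model_lock);
    if (!model_find(name)) {
        i = calloc(1, sizeof(*i));
    }
    if (i) {
        i->name = malloc(strlen(name) + 1);
        if (i->name) {
            strcpy(i->name, name);
            i->next = model_instances;
            model_instances = i;
            ret = 0;
        } else {
            free(i);
        }
    }
    pthread_mutex_unlock(&model_lock);
    return ret;
}

/* Returns 0 on success, -1 if the instance is unknown or on alloc failure. */
static int model_register_function(const char *name, ModelYankFn *func,
                                   void *opaque)
{
    struct model_instance *i;
    struct model_fn *f;
    int ret = -1;

    pthread_mutex_lock(&model_lock);
    i = model_find(name);
    f = i ? calloc(1, sizeof(*f)) : NULL;
    if (f) {
        f->func = func;
        f->opaque = opaque;
        f->next = i->fns;
        i->fns = f;
        ret = 0;
    }
    pthread_mutex_unlock(&model_lock);
    return ret;
}

/* Pass 1 validates every name; pass 2 runs the callbacks. */
static int model_yank(const char *const *names, size_t n)
{
    struct model_fn *f;
    size_t k;
    int ret = 0;

    pthread_mutex_lock(&model_lock);
    for (k = 0; k < n && ret == 0; k++) {
        ret = model_find(names[k]) ? 0 : -1;
    }
    for (k = 0; ret == 0 && k < n; k++) {
        for (f = model_find(names[k])->fns; f; f = f->next) {
            f->func(f->opaque);
        }
    }
    pthread_mutex_unlock(&model_lock);
    return ret;
}

/* Example callback: count how often it has been yanked. */
static void model_count_cb(void *opaque)
{
    ++*(int *)opaque;
}
```

Because the callbacks run with the lock held, they must not call back into
the registry; in the real code the restrictions on oob command handlers apply
to them as well.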

Signed-off-by: Lukas Straub <lukasstraub2@web.de>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 include/qemu/yank.h |  81 +++++++++++++++++++
 qapi/misc.json      |  62 +++++++++++++++
 util/meson.build    |   1 +
 util/yank.c         | 187 ++++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 331 insertions(+)
 create mode 100644 include/qemu/yank.h
 create mode 100644 util/yank.c

diff --git a/include/qemu/yank.h b/include/qemu/yank.h
new file mode 100644
index 0000000000..c5ab53965a
--- /dev/null
+++ b/include/qemu/yank.h
@@ -0,0 +1,81 @@
+/*
+ * QEMU yank feature
+ *
+ * Copyright (c) Lukas Straub <lukasstraub2@web.de>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+
+#ifndef YANK_H
+#define YANK_H
+
+typedef void (YankFn)(void *opaque);
+
+/**
+ * yank_register_instance: Register a new instance.
+ *
+ * This registers a new instance for yanking. Must be called before any yank
+ * function is registered for this instance.
+ *
+ * This function is thread-safe.
+ *
+ * @instance_name: The globally unique name of the instance.
+ * @errp: Error object.
+ */
+void yank_register_instance(const char *instance_name, Error **errp);
+
+/**
+ * yank_unregister_instance: Unregister an instance.
+ *
+ * This unregisters an instance. Must be called only after every yank function
+ * of the instance has been unregistered.
+ *
+ * This function is thread-safe.
+ *
+ * @instance_name: The name of the instance.
+ */
+void yank_unregister_instance(const char *instance_name);
+
+/**
+ * yank_register_function: Register a yank function
+ *
+ * This registers a yank function. All limitations of qmp oob commands apply
+ * to the yank function as well. See docs/devel/qapi-code-gen.txt under
+ * "An OOB-capable command handler must satisfy the following conditions".
+ *
+ * This function is thread-safe.
+ *
+ * @instance_name: The name of the instance
+ * @func: The yank function
+ * @opaque: Will be passed to the yank function
+ */
+void yank_register_function(const char *instance_name,
+                            YankFn *func,
+                            void *opaque);
+
+/**
+ * yank_unregister_function: Unregister a yank function
+ *
+ * This unregisters a yank function.
+ *
+ * This function is thread-safe.
+ *
+ * @instance_name: The name of the instance
+ * @func: func that was passed to yank_register_function
+ * @opaque: opaque that was passed to yank_register_function
+ */
+void yank_unregister_function(const char *instance_name,
+                              YankFn *func,
+                              void *opaque);
+
+/**
+ * yank_generic_iochannel: Generic yank function for iochannel
+ *
+ * This is a generic yank function which will call qio_channel_shutdown on the
+ * provided QIOChannel.
+ *
+ * @opaque: QIOChannel to shutdown
+ */
+void yank_generic_iochannel(void *opaque);
+#endif
diff --git a/qapi/misc.json b/qapi/misc.json
index 9d32820dc1..7de330416a 100644
--- a/qapi/misc.json
+++ b/qapi/misc.json
@@ -1615,3 +1615,65 @@
 ##
 { 'command': 'query-vm-generation-id', 'returns': 'GuidInfo' }

+##
+# @YankInstances:
+#
+# @instances: List of yank instances.
+#
+# A yank instance can be yanked with the "yank" qmp command to recover from a
+# hanging qemu.
+#
+# Yank instances are named after the following schema:
+# "blockdev:<node-name>" refers to a block device. Currently only nbd block
+# devices are implemented.
+# "chardev:<chardev-name>" refers to a chardev. Currently only socket chardevs
+# are implemented.
+# "migration" refers to the migration currently in progress.
+#
+# Currently implemented yank instances:
+#  -nbd block device:
+#   Yanking it will shutdown the connection to the nbd server without
+#   attempting to reconnect.
+#  -socket chardev:
+#   Yanking it will shutdown the connected socket.
+#  -migration:
+#   Yanking it will shutdown all migration connections.
+#
+# Since: 5.2
+##
+{ 'struct': 'YankInstances', 'data': {'instances': ['str'] } }
+
+##
+# @yank:
+#
+# Recover from hanging qemu by yanking the specified instances. See
+# "YankInstances" for more information.
+#
+# Takes @YankInstances as argument.
+#
+# Returns: nothing.
+#
+# Example:
+#
+# -> { "execute": "yank", "arguments": { "instances": ["blockdev:nbd0"] } }
+# <- { "return": {} }
+#
+# Since: 5.2
+##
+{ 'command': 'yank', 'data': 'YankInstances', 'allow-oob': true }
+
+##
+# @query-yank:
+#
+# Query yank instances. See "YankInstances" for more information.
+#
+# Returns: @YankInstances
+#
+# Example:
+#
+# -> { "execute": "query-yank" }
+# <- { "return": { "instances": ["blockdev:nbd0"] } }
+#
+# Since: 5.2
+##
+{ 'command': 'query-yank', 'returns': 'YankInstances', 'allow-oob': true }
diff --git a/util/meson.build b/util/meson.build
index e6b207a99e..f3989a1869 100644
--- a/util/meson.build
+++ b/util/meson.build
@@ -50,6 +50,7 @@ endif

 if have_system
   util_ss.add(when: 'CONFIG_GIO', if_true: [files('dbus.c'), gio])
+  util_ss.add(files('yank.c'))
 endif

 if have_block
diff --git a/util/yank.c b/util/yank.c
new file mode 100644
index 0000000000..f63bfdca50
--- /dev/null
+++ b/util/yank.c
@@ -0,0 +1,187 @@
+/*
+ * QEMU yank feature
+ *
+ * Copyright (c) Lukas Straub <lukasstraub2@web.de>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+
+#include "qemu/osdep.h"
+#include "qapi/error.h"
+#include "qemu/thread.h"
+#include "qemu/queue.h"
+#include "qapi/qapi-commands-misc.h"
+#include "io/channel.h"
+#include "qemu/yank.h"
+
+struct YankFuncAndParam {
+    YankFn *func;
+    void *opaque;
+    QLIST_ENTRY(YankFuncAndParam) next;
+};
+
+struct YankInstance {
+    char *name;
+    QLIST_HEAD(, YankFuncAndParam) yankfns;
+    QLIST_ENTRY(YankInstance) next;
+};
+
+/*
+ * This lock protects the yank_instance_list below.
+ */
+static QemuMutex yank_lock;
+
+static QLIST_HEAD(, YankInstance) yank_instance_list
+    = QLIST_HEAD_INITIALIZER(yank_instance_list);
+
+static struct YankInstance *yank_find_instance(const char *name)
+{
+    struct YankInstance *instance;
+
+    QLIST_FOREACH(instance, &yank_instance_list, next) {
+        if (!strcmp(instance->name, name)) {
+            return instance;
+        }
+    }
+    return NULL;
+}
+
+void yank_register_instance(const char *instance_name, Error **errp)
+{
+    struct YankInstance *instance;
+
+    qemu_mutex_lock(&yank_lock);
+
+    if (yank_find_instance(instance_name)) {
+        error_setg(errp, "duplicate yank instance name: '%s'",
+                   instance_name);
+        qemu_mutex_unlock(&yank_lock);
+        return;
+    }
+
+    instance = g_slice_new(struct YankInstance);
+    instance->name = g_strdup(instance_name);
+    QLIST_INIT(&instance->yankfns);
+    QLIST_INSERT_HEAD(&yank_instance_list, instance, next);
+
+    qemu_mutex_unlock(&yank_lock);
+}
+
+void yank_unregister_instance(const char *instance_name)
+{
+    struct YankInstance *instance;
+
+    qemu_mutex_lock(&yank_lock);
+    instance = yank_find_instance(instance_name);
+    assert(instance);
+
+    assert(QLIST_EMPTY(&instance->yankfns));
+    QLIST_REMOVE(instance, next);
+    g_free(instance->name);
+    g_slice_free(struct YankInstance, instance);
+
+    qemu_mutex_unlock(&yank_lock);
+}
+
+void yank_register_function(const char *instance_name,
+                            YankFn *func,
+                            void *opaque)
+{
+    struct YankInstance *instance;
+    struct YankFuncAndParam *entry;
+
+    qemu_mutex_lock(&yank_lock);
+    instance = yank_find_instance(instance_name);
+    assert(instance);
+
+    entry = g_slice_new(struct YankFuncAndParam);
+    entry->func = func;
+    entry->opaque = opaque;
+
+    QLIST_INSERT_HEAD(&instance->yankfns, entry, next);
+    qemu_mutex_unlock(&yank_lock);
+}
+
+void yank_unregister_function(const char *instance_name,
+                              YankFn *func,
+                              void *opaque)
+{
+    struct YankInstance *instance;
+    struct YankFuncAndParam *entry;
+
+    qemu_mutex_lock(&yank_lock);
+    instance = yank_find_instance(instance_name);
+    assert(instance);
+
+    QLIST_FOREACH(entry, &instance->yankfns, next) {
+        if (entry->func == func && entry->opaque == opaque) {
+            QLIST_REMOVE(entry, next);
+            g_slice_free(struct YankFuncAndParam, entry);
+            qemu_mutex_unlock(&yank_lock);
+            return;
+        }
+    }
+
+    abort();
+}
+
+void yank_generic_iochannel(void *opaque)
+{
+    QIOChannel *ioc = QIO_CHANNEL(opaque);
+
+    qio_channel_shutdown(ioc, QIO_CHANNEL_SHUTDOWN_BOTH, NULL);
+}
+
+void qmp_yank(strList *instances,
+              Error **errp)
+{
+    strList *tail;
+    struct YankInstance *instance;
+    struct YankFuncAndParam *entry;
+
+    qemu_mutex_lock(&yank_lock);
+    for (tail = instances; tail; tail = tail->next) {
+        instance = yank_find_instance(tail->value);
+        if (!instance) {
+            error_set(errp, ERROR_CLASS_DEVICE_NOT_FOUND,
+                      "Instance '%s' not found", tail->value);
+            qemu_mutex_unlock(&yank_lock);
+            return;
+        }
+    }
+    for (tail = instances; tail; tail = tail->next) {
+        instance = yank_find_instance(tail->value);
+        assert(instance);
+        QLIST_FOREACH(entry, &instance->yankfns, next) {
+            entry->func(entry->opaque);
+        }
+    }
+    qemu_mutex_unlock(&yank_lock);
+}
+
+YankInstances *qmp_query_yank(Error **errp)
+{
+    struct YankInstance *instance;
+    YankInstances *ret;
+
+    ret = g_new0(YankInstances, 1);
+    ret->instances = NULL;
+
+    qemu_mutex_lock(&yank_lock);
+    QLIST_FOREACH(instance, &yank_instance_list, next) {
+        strList *entry;
+        entry = g_new0(strList, 1);
+        entry->value = g_strdup(instance->name);
+        entry->next = ret->instances;
+        ret->instances = entry;
+    }
+    qemu_mutex_unlock(&yank_lock);
+
+    return ret;
+}
+
+static void __attribute__((__constructor__)) yank_init(void)
+{
+    qemu_mutex_init(&yank_lock);
+}
--
2.20.1



* [PATCH v8 2/8] block/nbd.c: Add yank feature
  2020-09-01  9:14 [PATCH v8 0/8] Introduce 'yank' oob qmp command to recover from hanging qemu Lukas Straub
  2020-09-01  9:15 ` [PATCH v8 1/8] Introduce yank feature Lukas Straub
@ 2020-09-01  9:15 ` Lukas Straub
  2020-09-01  9:15 ` [PATCH v8 3/8] chardev/char-socket.c: " Lukas Straub
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 16+ messages in thread
From: Lukas Straub @ 2020-09-01  9:15 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Daniel P. Berrangé,
	qemu-block, Juan Quintela, Markus Armbruster,
	Dr. David Alan Gilbert, Max Reitz, Paolo Bonzini,
	Marc-André Lureau

[-- Attachment #1: Type: text/plain, Size: 13959 bytes --]

Register a yank function which shuts down the socket and sets
s->state = NBD_CLIENT_QUIT. This is the same behaviour as if an
error occurred.
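
Since the yank function may run in a different thread than the code that
reads s->state, the state has to be accessed atomically. The following
standalone sketch shows the pattern with C11 <stdatomic.h> standing in for
qemu's atomic_store_release and atomic_load_acquire. The demo_* names and the
struct layout are hypothetical and much simpler than BDRVNBDState.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <sys/socket.h>

enum demo_state { DEMO_CONNECTED, DEMO_QUIT };

struct demo_client {
    atomic_int state;   /* one of enum demo_state */
    int fd;             /* connected socket, or -1 if there is none */
};

/*
 * Yank function: publish the QUIT state first, then shut down the socket so
 * that any reader blocked on it wakes up and observes the new state.
 */
static void demo_nbd_yank(void *opaque)
{
    struct demo_client *c = opaque;

    atomic_store_explicit(&c->state, DEMO_QUIT, memory_order_release);
    if (c->fd >= 0) {
        shutdown(c->fd, SHUT_RDWR);
    }
}

/* Loop condition of the connection code: keep going until told to quit. */
static bool demo_should_continue(struct demo_client *c)
{
    return atomic_load_explicit(&c->state, memory_order_acquire) != DEMO_QUIT;
}
```

A connection loop written as "while (demo_should_continue(c)) { ... }" would
then exit after a yank, in the same way it would after a fatal error.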

Signed-off-by: Lukas Straub <lukasstraub2@web.de>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
---
 block/nbd.c | 129 ++++++++++++++++++++++++++++++++--------------------
 1 file changed, 80 insertions(+), 49 deletions(-)

diff --git a/block/nbd.c b/block/nbd.c
index 7bb881fef4..8632cf5340 100644
--- a/block/nbd.c
+++ b/block/nbd.c
@@ -35,6 +35,7 @@
 #include "qemu/option.h"
 #include "qemu/cutils.h"
 #include "qemu/main-loop.h"
+#include "qemu/atomic.h"

 #include "qapi/qapi-visit-sockets.h"
 #include "qapi/qmp/qstring.h"
@@ -43,6 +44,8 @@
 #include "block/nbd.h"
 #include "block/block_int.h"

+#include "qemu/yank.h"
+
 #define EN_OPTSTR ":exportname="
 #define MAX_NBD_REQUESTS    16

@@ -84,6 +87,8 @@ typedef struct BDRVNBDState {
     NBDReply reply;
     BlockDriverState *bs;

+    char *yank_name;
+
     /* Connection parameters */
     uint32_t reconnect_delay;
     SocketAddress *saddr;
@@ -93,10 +98,10 @@ typedef struct BDRVNBDState {
     char *x_dirty_bitmap;
 } BDRVNBDState;

-static QIOChannelSocket *nbd_establish_connection(SocketAddress *saddr,
-                                                  Error **errp);
-static int nbd_client_handshake(BlockDriverState *bs, QIOChannelSocket *sioc,
-                                Error **errp);
+static int nbd_establish_connection(BlockDriverState *bs, SocketAddress *saddr,
+                                    Error **errp);
+static int nbd_client_handshake(BlockDriverState *bs, Error **errp);
+static void nbd_yank(void *opaque);

 static void nbd_clear_bdrvstate(BDRVNBDState *s)
 {
@@ -109,17 +114,19 @@ static void nbd_clear_bdrvstate(BDRVNBDState *s)
     s->tlscredsid = NULL;
     g_free(s->x_dirty_bitmap);
     s->x_dirty_bitmap = NULL;
+    g_free(s->yank_name);
+    s->yank_name = NULL;
 }

 static void nbd_channel_error(BDRVNBDState *s, int ret)
 {
     if (ret == -EIO) {
-        if (s->state == NBD_CLIENT_CONNECTED) {
+        if (atomic_load_acquire(&s->state) == NBD_CLIENT_CONNECTED) {
             s->state = s->reconnect_delay ? NBD_CLIENT_CONNECTING_WAIT :
                                             NBD_CLIENT_CONNECTING_NOWAIT;
         }
     } else {
-        if (s->state == NBD_CLIENT_CONNECTED) {
+        if (atomic_load_acquire(&s->state) == NBD_CLIENT_CONNECTED) {
             qio_channel_shutdown(s->ioc, QIO_CHANNEL_SHUTDOWN_BOTH, NULL);
         }
         s->state = NBD_CLIENT_QUIT;
@@ -170,7 +177,7 @@ static void nbd_client_attach_aio_context(BlockDriverState *bs,
      * s->connection_co is either yielded from nbd_receive_reply or from
      * nbd_co_reconnect_loop()
      */
-    if (s->state == NBD_CLIENT_CONNECTED) {
+    if (atomic_load_acquire(&s->state) == NBD_CLIENT_CONNECTED) {
         qio_channel_attach_aio_context(QIO_CHANNEL(s->ioc), new_context);
     }

@@ -237,20 +244,20 @@ static void nbd_teardown_connection(BlockDriverState *bs)

 static bool nbd_client_connecting(BDRVNBDState *s)
 {
-    return s->state == NBD_CLIENT_CONNECTING_WAIT ||
-        s->state == NBD_CLIENT_CONNECTING_NOWAIT;
+    NBDClientState state = atomic_load_acquire(&s->state);
+    return state == NBD_CLIENT_CONNECTING_WAIT ||
+        state == NBD_CLIENT_CONNECTING_NOWAIT;
 }

 static bool nbd_client_connecting_wait(BDRVNBDState *s)
 {
-    return s->state == NBD_CLIENT_CONNECTING_WAIT;
+    return atomic_load_acquire(&s->state) == NBD_CLIENT_CONNECTING_WAIT;
 }

 static coroutine_fn void nbd_reconnect_attempt(BDRVNBDState *s)
 {
     int ret;
     Error *local_err = NULL;
-    QIOChannelSocket *sioc;

     if (!nbd_client_connecting(s)) {
         return;
@@ -283,21 +290,21 @@ static coroutine_fn void nbd_reconnect_attempt(BDRVNBDState *s)
     /* Finalize previous connection if any */
     if (s->ioc) {
         nbd_client_detach_aio_context(s->bs);
+        yank_unregister_function(s->yank_name, nbd_yank, s->bs);
         object_unref(OBJECT(s->sioc));
         s->sioc = NULL;
         object_unref(OBJECT(s->ioc));
         s->ioc = NULL;
     }

-    sioc = nbd_establish_connection(s->saddr, &local_err);
-    if (!sioc) {
+    if (nbd_establish_connection(s->bs, s->saddr, &local_err) < 0) {
         ret = -ECONNREFUSED;
         goto out;
     }

     bdrv_dec_in_flight(s->bs);

-    ret = nbd_client_handshake(s->bs, sioc, &local_err);
+    ret = nbd_client_handshake(s->bs, &local_err);

     if (s->drained) {
         s->wait_drained_end = true;
@@ -334,7 +341,7 @@ static coroutine_fn void nbd_co_reconnect_loop(BDRVNBDState *s)
     nbd_reconnect_attempt(s);

     while (nbd_client_connecting(s)) {
-        if (s->state == NBD_CLIENT_CONNECTING_WAIT &&
+        if (atomic_load_acquire(&s->state) == NBD_CLIENT_CONNECTING_WAIT &&
             qemu_clock_get_ns(QEMU_CLOCK_REALTIME) - start_time_ns > delay_ns)
         {
             s->state = NBD_CLIENT_CONNECTING_NOWAIT;
@@ -371,7 +378,7 @@ static coroutine_fn void nbd_connection_entry(void *opaque)
     int ret = 0;
     Error *local_err = NULL;

-    while (s->state != NBD_CLIENT_QUIT) {
+    while (atomic_load_acquire(&s->state) != NBD_CLIENT_QUIT) {
         /*
          * The NBD client can only really be considered idle when it has
          * yielded from qio_channel_readv_all_eof(), waiting for data. This is
@@ -386,7 +393,7 @@ static coroutine_fn void nbd_connection_entry(void *opaque)
             nbd_co_reconnect_loop(s);
         }

-        if (s->state != NBD_CLIENT_CONNECTED) {
+        if (atomic_load_acquire(&s->state) != NBD_CLIENT_CONNECTED) {
             continue;
         }

@@ -441,6 +448,7 @@ static coroutine_fn void nbd_connection_entry(void *opaque)
     s->connection_co = NULL;
     if (s->ioc) {
         nbd_client_detach_aio_context(s->bs);
+        yank_unregister_function(s->yank_name, nbd_yank, s->bs);
         object_unref(OBJECT(s->sioc));
         s->sioc = NULL;
         object_unref(OBJECT(s->ioc));
@@ -465,7 +473,7 @@ static int nbd_co_send_request(BlockDriverState *bs,
         qemu_co_queue_wait(&s->free_sema, &s->send_mutex);
     }

-    if (s->state != NBD_CLIENT_CONNECTED) {
+    if (atomic_load_acquire(&s->state) != NBD_CLIENT_CONNECTED) {
         rc = -EIO;
         goto err;
     }
@@ -492,7 +500,7 @@ static int nbd_co_send_request(BlockDriverState *bs,
     if (qiov) {
         qio_channel_set_cork(s->ioc, true);
         rc = nbd_send_request(s->ioc, request);
-        if (rc >= 0 && s->state == NBD_CLIENT_CONNECTED) {
+        if (rc >= 0 && atomic_load_acquire(&s->state) == NBD_CLIENT_CONNECTED) {
             if (qio_channel_writev_all(s->ioc, qiov->iov, qiov->niov,
                                        NULL) < 0) {
                 rc = -EIO;
@@ -807,7 +815,7 @@ static coroutine_fn int nbd_co_do_receive_one_chunk(
     s->requests[i].receiving = true;
     qemu_coroutine_yield();
     s->requests[i].receiving = false;
-    if (s->state != NBD_CLIENT_CONNECTED) {
+    if (atomic_load_acquire(&s->state) != NBD_CLIENT_CONNECTED) {
         error_setg(errp, "Connection closed");
         return -EIO;
     }
@@ -966,7 +974,7 @@ static bool nbd_reply_chunk_iter_receive(BDRVNBDState *s,
     NBDReply local_reply;
     NBDStructuredReplyChunk *chunk;
     Error *local_err = NULL;
-    if (s->state != NBD_CLIENT_CONNECTED) {
+    if (atomic_load_acquire(&s->state) != NBD_CLIENT_CONNECTED) {
         error_setg(&local_err, "Connection closed");
         nbd_iter_channel_error(iter, -EIO, &local_err);
         goto break_loop;
@@ -991,7 +999,8 @@ static bool nbd_reply_chunk_iter_receive(BDRVNBDState *s,
     }

     /* Do not execute the body of NBD_FOREACH_REPLY_CHUNK for simple reply. */
-    if (nbd_reply_is_simple(reply) || s->state != NBD_CLIENT_CONNECTED) {
+    if (nbd_reply_is_simple(reply) ||
+        atomic_load_acquire(&s->state) != NBD_CLIENT_CONNECTED) {
         goto break_loop;
     }

@@ -1423,6 +1432,15 @@ static int nbd_client_reopen_prepare(BDRVReopenState *state,
     return 0;
 }

+static void nbd_yank(void *opaque)
+{
+    BlockDriverState *bs = opaque;
+    BDRVNBDState *s = (BDRVNBDState *)bs->opaque;
+
+    atomic_store_release(&s->state, NBD_CLIENT_QUIT);
+    qio_channel_shutdown(QIO_CHANNEL(s->sioc), QIO_CHANNEL_SHUTDOWN_BOTH, NULL);
+}
+
 static void nbd_client_close(BlockDriverState *bs)
 {
     BDRVNBDState *s = (BDRVNBDState *)bs->opaque;
@@ -1435,52 +1453,53 @@ static void nbd_client_close(BlockDriverState *bs)
     nbd_teardown_connection(bs);
 }

-static QIOChannelSocket *nbd_establish_connection(SocketAddress *saddr,
-                                                  Error **errp)
+static int nbd_establish_connection(BlockDriverState *bs,
+                                    SocketAddress *saddr,
+                                    Error **errp)
 {
     ERRP_GUARD();
-    QIOChannelSocket *sioc;
+    BDRVNBDState *s = (BDRVNBDState *)bs->opaque;

-    sioc = qio_channel_socket_new();
-    qio_channel_set_name(QIO_CHANNEL(sioc), "nbd-client");
+    s->sioc = qio_channel_socket_new();
+    qio_channel_set_name(QIO_CHANNEL(s->sioc), "nbd-client");
+    yank_register_function(s->yank_name, nbd_yank, bs);

-    qio_channel_socket_connect_sync(sioc, saddr, errp);
+    qio_channel_socket_connect_sync(s->sioc, saddr, errp);
     if (*errp) {
-        object_unref(OBJECT(sioc));
-        return NULL;
+        yank_unregister_function(s->yank_name, nbd_yank, bs);
+        object_unref(OBJECT(s->sioc));
+        s->sioc = NULL;
+        return -1;
     }

-    qio_channel_set_delay(QIO_CHANNEL(sioc), false);
+    qio_channel_set_delay(QIO_CHANNEL(s->sioc), false);

-    return sioc;
+    return 0;
 }

-/* nbd_client_handshake takes ownership on sioc. On failure it is unref'ed. */
-static int nbd_client_handshake(BlockDriverState *bs, QIOChannelSocket *sioc,
-                                Error **errp)
+/* nbd_client_handshake takes ownership on s->sioc. On failure it's unref'ed. */
+static int nbd_client_handshake(BlockDriverState *bs, Error **errp)
 {
     BDRVNBDState *s = (BDRVNBDState *)bs->opaque;
     AioContext *aio_context = bdrv_get_aio_context(bs);
     int ret;

     trace_nbd_client_handshake(s->export);
-
-    s->sioc = sioc;
-
-    qio_channel_set_blocking(QIO_CHANNEL(sioc), false, NULL);
-    qio_channel_attach_aio_context(QIO_CHANNEL(sioc), aio_context);
+    qio_channel_set_blocking(QIO_CHANNEL(s->sioc), false, NULL);
+    qio_channel_attach_aio_context(QIO_CHANNEL(s->sioc), aio_context);

     s->info.request_sizes = true;
     s->info.structured_reply = true;
     s->info.base_allocation = true;
     s->info.x_dirty_bitmap = g_strdup(s->x_dirty_bitmap);
     s->info.name = g_strdup(s->export ?: "");
-    ret = nbd_receive_negotiate(aio_context, QIO_CHANNEL(sioc), s->tlscreds,
+    ret = nbd_receive_negotiate(aio_context, QIO_CHANNEL(s->sioc), s->tlscreds,
                                 s->hostname, &s->ioc, &s->info, errp);
     g_free(s->info.x_dirty_bitmap);
     g_free(s->info.name);
     if (ret < 0) {
-        object_unref(OBJECT(sioc));
+        yank_unregister_function(s->yank_name, nbd_yank, bs);
+        object_unref(OBJECT(s->sioc));
         s->sioc = NULL;
         return ret;
     }
@@ -1508,7 +1527,7 @@ static int nbd_client_handshake(BlockDriverState *bs, QIOChannelSocket *sioc,
     }

     if (!s->ioc) {
-        s->ioc = QIO_CHANNEL(sioc);
+        s->ioc = QIO_CHANNEL(s->sioc);
         object_ref(OBJECT(s->ioc));
     }

@@ -1524,9 +1543,10 @@ static int nbd_client_handshake(BlockDriverState *bs, QIOChannelSocket *sioc,
     {
         NBDRequest request = { .type = NBD_CMD_DISC };

-        nbd_send_request(s->ioc ?: QIO_CHANNEL(sioc), &request);
+        nbd_send_request(s->ioc ?: QIO_CHANNEL(s->sioc), &request);

-        object_unref(OBJECT(sioc));
+        yank_unregister_function(s->yank_name, nbd_yank, bs);
+        object_unref(OBJECT(s->sioc));
         s->sioc = NULL;

         return ret;
@@ -1918,7 +1938,6 @@ static int nbd_open(BlockDriverState *bs, QDict *options, int flags,
 {
     int ret;
     BDRVNBDState *s = (BDRVNBDState *)bs->opaque;
-    QIOChannelSocket *sioc;

     ret = nbd_process_options(bs, options, errp);
     if (ret < 0) {
@@ -1929,17 +1948,28 @@ static int nbd_open(BlockDriverState *bs, QDict *options, int flags,
     qemu_co_mutex_init(&s->send_mutex);
     qemu_co_queue_init(&s->free_sema);

+    s->yank_name = g_strconcat("blockdev:", bs->node_name, NULL);
+    yank_register_instance(s->yank_name, errp);
+    if (*errp) {
+        g_free(s->yank_name);
+        s->yank_name = NULL;
+        return -EEXIST;
+    }
+
     /*
      * establish TCP connection, return error if it fails
      * TODO: Configurable retry-until-timeout behaviour.
      */
-    sioc = nbd_establish_connection(s->saddr, errp);
-    if (!sioc) {
+    if (nbd_establish_connection(bs, s->saddr, errp) < 0) {
+        yank_unregister_instance(s->yank_name);
+        g_free(s->yank_name);
+        s->yank_name = NULL;
         return -ECONNREFUSED;
     }

-    ret = nbd_client_handshake(bs, sioc, errp);
+    ret = nbd_client_handshake(bs, errp);
     if (ret < 0) {
+        yank_unregister_instance(s->yank_name);
         nbd_clear_bdrvstate(s);
         return ret;
     }
@@ -1997,6 +2027,7 @@ static void nbd_close(BlockDriverState *bs)
     BDRVNBDState *s = bs->opaque;

     nbd_client_close(bs);
+    yank_unregister_instance(s->yank_name);
     nbd_clear_bdrvstate(s);
 }

--
2.20.1



* [PATCH v8 3/8] chardev/char-socket.c: Add yank feature
  2020-09-01  9:14 [PATCH v8 0/8] Introduce 'yank' oob qmp command to recover from hanging qemu Lukas Straub
  2020-09-01  9:15 ` [PATCH v8 1/8] Introduce yank feature Lukas Straub
  2020-09-01  9:15 ` [PATCH v8 2/8] block/nbd.c: Add " Lukas Straub
@ 2020-09-01  9:15 ` Lukas Straub
  2020-09-01  9:15 ` [PATCH v8 4/8] migration: " Lukas Straub
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 16+ messages in thread
From: Lukas Straub @ 2020-09-01  9:15 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Daniel P. Berrangé,
	qemu-block, Juan Quintela, Markus Armbruster,
	Dr. David Alan Gilbert, Max Reitz, Paolo Bonzini,
	Marc-André Lureau

[-- Attachment #1: Type: text/plain, Size: 4553 bytes --]

Register a yank function to shut down the socket on yank.

Signed-off-by: Lukas Straub <lukasstraub2@web.de>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
---
 chardev/char-socket.c | 31 +++++++++++++++++++++++++++++++
 1 file changed, 31 insertions(+)

diff --git a/chardev/char-socket.c b/chardev/char-socket.c
index ef62dbf3d7..8e2865ca83 100644
--- a/chardev/char-socket.c
+++ b/chardev/char-socket.c
@@ -34,6 +34,7 @@
 #include "qapi/error.h"
 #include "qapi/clone-visitor.h"
 #include "qapi/qapi-visit-sockets.h"
+#include "qemu/yank.h"

 #include "chardev/char-io.h"

@@ -69,6 +70,7 @@ typedef struct {
     size_t read_msgfds_num;
     int *write_msgfds;
     size_t write_msgfds_num;
+    char *yank_name;

     SocketAddress *addr;
     bool is_listen;
@@ -413,6 +415,11 @@ static void tcp_chr_free_connection(Chardev *chr)

     tcp_set_msgfds(chr, NULL, 0);
     remove_fd_in_watch(chr);
+    if (s->state == TCP_CHARDEV_STATE_CONNECTING
+        || s->state == TCP_CHARDEV_STATE_CONNECTED) {
+        yank_unregister_function(s->yank_name, yank_generic_iochannel,
+                                 QIO_CHANNEL(s->sioc));
+    }
     object_unref(OBJECT(s->sioc));
     s->sioc = NULL;
     object_unref(OBJECT(s->ioc));
@@ -916,6 +923,8 @@ static int tcp_chr_add_client(Chardev *chr, int fd)
     }
     tcp_chr_change_state(s, TCP_CHARDEV_STATE_CONNECTING);
     tcp_chr_set_client_ioc_name(chr, sioc);
+    yank_register_function(s->yank_name, yank_generic_iochannel,
+                           QIO_CHANNEL(sioc));
     ret = tcp_chr_new_client(chr, sioc);
     object_unref(OBJECT(sioc));
     return ret;
@@ -930,6 +939,8 @@ static void tcp_chr_accept(QIONetListener *listener,

     tcp_chr_change_state(s, TCP_CHARDEV_STATE_CONNECTING);
     tcp_chr_set_client_ioc_name(chr, cioc);
+    yank_register_function(s->yank_name, yank_generic_iochannel,
+                           QIO_CHANNEL(cioc));
     tcp_chr_new_client(chr, cioc);
 }

@@ -945,6 +956,8 @@ static int tcp_chr_connect_client_sync(Chardev *chr, Error **errp)
         object_unref(OBJECT(sioc));
         return -1;
     }
+    yank_register_function(s->yank_name, yank_generic_iochannel,
+                           QIO_CHANNEL(sioc));
     tcp_chr_new_client(chr, sioc);
     object_unref(OBJECT(sioc));
     return 0;
@@ -960,6 +973,8 @@ static void tcp_chr_accept_server_sync(Chardev *chr)
     tcp_chr_change_state(s, TCP_CHARDEV_STATE_CONNECTING);
     sioc = qio_net_listener_wait_client(s->listener);
     tcp_chr_set_client_ioc_name(chr, sioc);
+    yank_register_function(s->yank_name, yank_generic_iochannel,
+                           QIO_CHANNEL(sioc));
     tcp_chr_new_client(chr, sioc);
     object_unref(OBJECT(sioc));
 }
@@ -1070,6 +1085,10 @@ static void char_socket_finalize(Object *obj)
         object_unref(OBJECT(s->tls_creds));
     }
     g_free(s->tls_authz);
+    if (s->yank_name) {
+        yank_unregister_instance(s->yank_name);
+        g_free(s->yank_name);
+    }

     qemu_chr_be_event(chr, CHR_EVENT_CLOSED);
 }
@@ -1085,6 +1104,8 @@ static void qemu_chr_socket_connected(QIOTask *task, void *opaque)

     if (qio_task_propagate_error(task, &err)) {
         tcp_chr_change_state(s, TCP_CHARDEV_STATE_DISCONNECTED);
+        yank_unregister_function(s->yank_name, yank_generic_iochannel,
+                                 QIO_CHANNEL(sioc));
         check_report_connect_error(chr, err);
         goto cleanup;
     }
@@ -1118,6 +1139,8 @@ static void tcp_chr_connect_client_async(Chardev *chr)
     tcp_chr_change_state(s, TCP_CHARDEV_STATE_CONNECTING);
     sioc = qio_channel_socket_new();
     tcp_chr_set_client_ioc_name(chr, sioc);
+    yank_register_function(s->yank_name, yank_generic_iochannel,
+                           QIO_CHANNEL(sioc));
     /*
      * Normally code would use the qio_channel_socket_connect_async
      * method which uses a QIOTask + qio_task_set_error internally
@@ -1360,6 +1383,14 @@ static void qmp_chardev_open_socket(Chardev *chr,
         qemu_chr_set_feature(chr, QEMU_CHAR_FEATURE_FD_PASS);
     }

+    s->yank_name = g_strconcat("chardev:", chr->label, NULL);
+    yank_register_instance(s->yank_name, errp);
+    if (*errp) {
+        g_free(s->yank_name);
+        s->yank_name = NULL;
+        return;
+    }
+
     /* be isn't opened until we get a connection */
     *be_opened = false;

--
2.20.1



* [PATCH v8 4/8] migration: Add yank feature
  2020-09-01  9:14 [PATCH v8 0/8] Introduce 'yank' oob qmp command to recover from hanging qemu Lukas Straub
                   ` (2 preceding siblings ...)
  2020-09-01  9:15 ` [PATCH v8 3/8] chardev/char-socket.c: " Lukas Straub
@ 2020-09-01  9:15 ` Lukas Straub
  2020-09-01  9:46   ` Daniel P. Berrangé
  2020-09-01  9:15 ` [PATCH v8 5/8] io/channel-tls.c: make qio_channel_tls_shutdown thread-safe Lukas Straub
                   ` (3 subsequent siblings)
  7 siblings, 1 reply; 16+ messages in thread
From: Lukas Straub @ 2020-09-01  9:15 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Daniel P. Berrangé,
	qemu-block, Juan Quintela, Markus Armbruster,
	Dr. David Alan Gilbert, Max Reitz, Paolo Bonzini,
	Marc-André Lureau


Register yank functions on sockets to shut them down.

Signed-off-by: Lukas Straub <lukasstraub2@web.de>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Acked-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
---
 migration/channel.c           | 12 ++++++++++++
 migration/migration.c         | 25 +++++++++++++++++++++++++
 migration/multifd.c           | 10 ++++++++++
 migration/qemu-file-channel.c |  6 ++++++
 migration/savevm.c            |  6 ++++++
 5 files changed, 59 insertions(+)

diff --git a/migration/channel.c b/migration/channel.c
index 20e4c8e2dc..21fc8046b9 100644
--- a/migration/channel.c
+++ b/migration/channel.c
@@ -18,6 +18,8 @@
 #include "trace.h"
 #include "qapi/error.h"
 #include "io/channel-tls.h"
+#include "io/channel-socket.h"
+#include "qemu/yank.h"

 /**
  * @migration_channel_process_incoming - Create new incoming migration channel
@@ -35,6 +37,11 @@ void migration_channel_process_incoming(QIOChannel *ioc)
     trace_migration_set_incoming_channel(
         ioc, object_get_typename(OBJECT(ioc)));

+    if (object_dynamic_cast(OBJECT(ioc), TYPE_QIO_CHANNEL_SOCKET)) {
+        yank_register_function("migration", yank_generic_iochannel,
+                               QIO_CHANNEL(ioc));
+    }
+
     if (s->parameters.tls_creds &&
         *s->parameters.tls_creds &&
         !object_dynamic_cast(OBJECT(ioc),
@@ -67,6 +74,11 @@ void migration_channel_connect(MigrationState *s,
         ioc, object_get_typename(OBJECT(ioc)), hostname, error);

     if (!error) {
+        if (object_dynamic_cast(OBJECT(ioc), TYPE_QIO_CHANNEL_SOCKET)) {
+            yank_register_function("migration", yank_generic_iochannel,
+                                   QIO_CHANNEL(ioc));
+        }
+
         if (s->parameters.tls_creds &&
             *s->parameters.tls_creds &&
             !object_dynamic_cast(OBJECT(ioc),
diff --git a/migration/migration.c b/migration/migration.c
index 58a5452471..61a9e87d22 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -56,6 +56,7 @@
 #include "net/announce.h"
 #include "qemu/queue.h"
 #include "multifd.h"
+#include "qemu/yank.h"

 #define MAX_THROTTLE  (32 << 20)      /* Migration transfer speed throttling */

@@ -244,6 +245,8 @@ void migration_incoming_state_destroy(void)
         qapi_free_SocketAddressList(mis->socket_address_list);
         mis->socket_address_list = NULL;
     }
+
+    yank_unregister_instance("migration");
 }

 static void migrate_generate_event(int new_state)
@@ -380,8 +383,14 @@ void qemu_start_incoming_migration(const char *uri, Error **errp)
 {
     const char *p = NULL;

+    yank_register_instance("migration", errp);
+    if (*errp) {
+        return;
+    }
+
     qapi_event_send_migration(MIGRATION_STATUS_SETUP);
     if (!strcmp(uri, "defer")) {
+        yank_unregister_instance("migration");
         deferred_incoming_migration(errp);
     } else if (strstart(uri, "tcp:", &p) ||
                strstart(uri, "unix:", NULL) ||
@@ -396,6 +405,7 @@ void qemu_start_incoming_migration(const char *uri, Error **errp)
     } else if (strstart(uri, "fd:", &p)) {
         fd_start_incoming_migration(p, errp);
     } else {
+        yank_unregister_instance("migration");
         error_setg(errp, "unknown migration protocol: %s", uri);
     }
 }
@@ -1692,6 +1702,7 @@ static void migrate_fd_cleanup(MigrationState *s)
     }
     notifier_list_notify(&migration_state_notifiers, s);
     block_cleanup_parameters(s);
+    yank_unregister_instance("migration");
 }

 static void migrate_fd_cleanup_schedule(MigrationState *s)
@@ -1965,6 +1976,7 @@ void qmp_migrate_recover(const char *uri, Error **errp)
      * only re-setup the migration stream and poke existing migration
      * to continue using that newly established channel.
      */
+    yank_unregister_instance("migration");
     qemu_start_incoming_migration(uri, errp);
 }

@@ -2102,6 +2114,13 @@ void qmp_migrate(const char *uri, bool has_blk, bool blk,
         return;
     }

+    if (!(has_resume && resume)) {
+        yank_register_instance("migration", errp);
+        if (*errp) {
+            return;
+        }
+    }
+
     if (strstart(uri, "tcp:", &p) ||
         strstart(uri, "unix:", NULL) ||
         strstart(uri, "vsock:", NULL)) {
@@ -2115,6 +2134,9 @@ void qmp_migrate(const char *uri, bool has_blk, bool blk,
     } else if (strstart(uri, "fd:", &p)) {
         fd_start_outgoing_migration(s, p, &local_err);
     } else {
+        if (!(has_resume && resume)) {
+            yank_unregister_instance("migration");
+        }
         error_setg(errp, QERR_INVALID_PARAMETER_VALUE, "uri",
                    "a valid migration protocol");
         migrate_set_state(&s->state, MIGRATION_STATUS_SETUP,
@@ -2124,6 +2146,9 @@ void qmp_migrate(const char *uri, bool has_blk, bool blk,
     }

     if (local_err) {
+        if (!(has_resume && resume)) {
+            yank_unregister_instance("migration");
+        }
         migrate_fd_error(s, local_err);
         error_propagate(errp, local_err);
         return;
diff --git a/migration/multifd.c b/migration/multifd.c
index d0441202aa..2c9863e770 100644
--- a/migration/multifd.c
+++ b/migration/multifd.c
@@ -24,6 +24,9 @@
 #include "trace.h"
 #include "multifd.h"

+#include "qemu/yank.h"
+#include "io/channel-socket.h"
+
 /* Multiple fd's */

 #define MULTIFD_MAGIC 0x11223344U
@@ -866,6 +869,13 @@ int multifd_load_cleanup(Error **errp)
     for (i = 0; i < migrate_multifd_channels(); i++) {
         MultiFDRecvParams *p = &multifd_recv_state->params[i];

+        if (object_dynamic_cast(OBJECT(p->c), TYPE_QIO_CHANNEL_SOCKET)
+            && OBJECT(p->c)->ref == 1) {
+            yank_unregister_function("migration",
+                                     yank_generic_iochannel,
+                                     QIO_CHANNEL(p->c));
+        }
+
         object_unref(OBJECT(p->c));
         p->c = NULL;
         qemu_mutex_destroy(&p->mutex);
diff --git a/migration/qemu-file-channel.c b/migration/qemu-file-channel.c
index d2ce32f4b9..d8f8384fea 100644
--- a/migration/qemu-file-channel.c
+++ b/migration/qemu-file-channel.c
@@ -27,6 +27,7 @@
 #include "qemu-file.h"
 #include "io/channel-socket.h"
 #include "qemu/iov.h"
+#include "qemu/yank.h"


 static ssize_t channel_writev_buffer(void *opaque,
@@ -104,6 +105,11 @@ static int channel_close(void *opaque, Error **errp)
     int ret;
     QIOChannel *ioc = QIO_CHANNEL(opaque);
     ret = qio_channel_close(ioc, errp);
+    if (object_dynamic_cast(OBJECT(ioc), TYPE_QIO_CHANNEL_SOCKET)
+        && OBJECT(ioc)->ref == 1) {
+        yank_unregister_function("migration", yank_generic_iochannel,
+                                 QIO_CHANNEL(ioc));
+    }
     object_unref(OBJECT(ioc));
     return ret;
 }
diff --git a/migration/savevm.c b/migration/savevm.c
index 304d98ff78..39d5ba9a2e 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -63,6 +63,7 @@
 #include "migration/colo.h"
 #include "qemu/bitmap.h"
 #include "net/announce.h"
+#include "qemu/yank.h"

 const unsigned int postcopy_ram_discard_version = 0;

@@ -2935,6 +2936,11 @@ int load_snapshot(const char *name, Error **errp)
     qemu_system_reset(SHUTDOWN_CAUSE_NONE);
     mis->from_src_file = f;

+    yank_register_instance("migration", errp);
+    if (*errp) {
+        ret = -EINVAL;
+        goto err_drain;
+    }
     aio_context_acquire(aio_context);
     ret = qemu_loadvm_state(f);
     migration_incoming_state_destroy();
--
2.20.1



* [PATCH v8 5/8] io/channel-tls.c: make qio_channel_tls_shutdown thread-safe
  2020-09-01  9:14 [PATCH v8 0/8] Introduce 'yank' oob qmp command to recover from hanging qemu Lukas Straub
                   ` (3 preceding siblings ...)
  2020-09-01  9:15 ` [PATCH v8 4/8] migration: " Lukas Straub
@ 2020-09-01  9:15 ` Lukas Straub
  2020-09-01  9:15 ` [PATCH v8 6/8] io: Document qmp oob suitability of qio_channel_shutdown and io_shutdown Lukas Straub
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 16+ messages in thread
From: Lukas Straub @ 2020-09-01  9:15 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Daniel P. Berrangé,
	qemu-block, Juan Quintela, Markus Armbruster,
	Dr. David Alan Gilbert, Max Reitz, Paolo Bonzini,
	Marc-André Lureau


Make qio_channel_tls_shutdown thread-safe by using atomics when
accessing tioc->shutdown.

Signed-off-by: Lukas Straub <lukasstraub2@web.de>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
---
 io/channel-tls.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/io/channel-tls.c b/io/channel-tls.c
index 7ec8ceff2f..b350c84640 100644
--- a/io/channel-tls.c
+++ b/io/channel-tls.c
@@ -23,6 +23,7 @@
 #include "qemu/module.h"
 #include "io/channel-tls.h"
 #include "trace.h"
+#include "qemu/atomic.h"


 static ssize_t qio_channel_tls_write_handler(const char *buf,
@@ -277,7 +278,8 @@ static ssize_t qio_channel_tls_readv(QIOChannel *ioc,
                     return QIO_CHANNEL_ERR_BLOCK;
                 }
             } else if (errno == ECONNABORTED &&
-                       (tioc->shutdown & QIO_CHANNEL_SHUTDOWN_READ)) {
+                       (atomic_load_acquire(&tioc->shutdown) &
+                        QIO_CHANNEL_SHUTDOWN_READ)) {
                 return 0;
             }

@@ -361,7 +363,7 @@ static int qio_channel_tls_shutdown(QIOChannel *ioc,
 {
     QIOChannelTLS *tioc = QIO_CHANNEL_TLS(ioc);

-    tioc->shutdown |= how;
+    atomic_or(&tioc->shutdown, how);

     return qio_channel_shutdown(tioc->master, how, errp);
 }
--
2.20.1



* [PATCH v8 6/8] io: Document qmp oob suitability of qio_channel_shutdown and io_shutdown
  2020-09-01  9:14 [PATCH v8 0/8] Introduce 'yank' oob qmp command to recover from hanging qemu Lukas Straub
                   ` (4 preceding siblings ...)
  2020-09-01  9:15 ` [PATCH v8 5/8] io/channel-tls.c: make qio_channel_tls_shutdown thread-safe Lukas Straub
@ 2020-09-01  9:15 ` Lukas Straub
  2020-09-01  9:46   ` Daniel P. Berrangé
  2020-09-01  9:15 ` [PATCH v8 7/8] MAINTAINERS: Add myself as maintainer for yank feature Lukas Straub
  2020-09-01  9:15 ` [PATCH v8 8/8] tests/test-char.c: Wait for the chardev to connect in char_socket_client_dupid_test Lukas Straub
  7 siblings, 1 reply; 16+ messages in thread
From: Lukas Straub @ 2020-09-01  9:15 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Daniel P. Berrangé,
	qemu-block, Juan Quintela, Markus Armbruster,
	Dr. David Alan Gilbert, Max Reitz, Paolo Bonzini,
	Marc-André Lureau


Migration and yank code assume that qio_channel_shutdown is
thread-safe and can be called from a QMP OOB handler. Document this
after checking the code.

Signed-off-by: Lukas Straub <lukasstraub2@web.de>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 include/io/channel.h | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/include/io/channel.h b/include/io/channel.h
index d4557f0930..46156d84f5 100644
--- a/include/io/channel.h
+++ b/include/io/channel.h
@@ -98,7 +98,8 @@ struct QIOChannel {
  * provide additional optional features.
  *
  * Consult the corresponding public API docs for a description
- * of the semantics of each callback
+ * of the semantics of each callback. io_shutdown in particular
+ * must be thread-safe, terminate quickly and must not block.
  */
 struct QIOChannelClass {
     ObjectClass parent;
@@ -516,6 +517,8 @@ int qio_channel_close(QIOChannel *ioc,
  * QIO_CHANNEL_FEATURE_SHUTDOWN prior to calling
  * this method.
  *
+ * This function is thread-safe, terminates quickly and does not block.
+ *
  * Returns: 0 on success, -1 on error
  */
 int qio_channel_shutdown(QIOChannel *ioc,
--
2.20.1
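
To illustrate why a socket shutdown fits the "terminates quickly and does
not block" requirement documented above, here is a small POSIX sketch. It
does not use QEMU's QIOChannel API; demo_read_after_shutdown is a
hypothetical helper, and the behaviour shown (read() returning right away
on a locally shut-down stream socket) is what I would expect on Linux.

```c
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical demo helper, not QEMU code: shut down one end of a
 * connected socket pair, then read from that same end.
 * Returns the result of read(), or -2 if setup failed.
 */
static int demo_read_after_shutdown(void)
{
    int fds[2];
    char buf[16];
    ssize_t n;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, fds) < 0) {
        return -2;
    }
    /* shutdown() only flags the socket; it does not wait for the peer. */
    if (shutdown(fds[0], SHUT_RDWR) < 0) {
        close(fds[0]);
        close(fds[1]);
        return -2;
    }
    /* With the read side shut down, this returns instead of blocking. */
    n = read(fds[0], buf, sizeof(buf));
    close(fds[0]);
    close(fds[1]);
    return (int)n;
}
```

The same property is what lets a yank function wake up a thread that is
stuck in a read on a dead connection.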



* [PATCH v8 7/8] MAINTAINERS: Add myself as maintainer for yank feature
  2020-09-01  9:14 [PATCH v8 0/8] Introduce 'yank' oob qmp command to recover from hanging qemu Lukas Straub
                   ` (5 preceding siblings ...)
  2020-09-01  9:15 ` [PATCH v8 6/8] io: Document qmp oob suitability of qio_channel_shutdown and io_shutdown Lukas Straub
@ 2020-09-01  9:15 ` Lukas Straub
  2020-09-01  9:15 ` [PATCH v8 8/8] tests/test-char.c: Wait for the chardev to connect in char_socket_client_dupid_test Lukas Straub
  7 siblings, 0 replies; 16+ messages in thread
From: Lukas Straub @ 2020-09-01  9:15 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Daniel P. Berrangé,
	qemu-block, Juan Quintela, Markus Armbruster,
	Dr. David Alan Gilbert, Max Reitz, Paolo Bonzini,
	Marc-André Lureau


I'll maintain this for now as the COLO use case is the first user
of this functionality.

Signed-off-by: Lukas Straub <lukasstraub2@web.de>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
---
 MAINTAINERS | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 5a22c8be42..c1d450e25a 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -2615,6 +2615,12 @@ F: util/uuid.c
 F: include/qemu/uuid.h
 F: tests/test-uuid.c

+Yank feature
+M: Lukas Straub <lukasstraub2@web.de>
+S: Odd fixes
+F: util/yank.c
+F: include/qemu/yank.h
+
 COLO Framework
 M: zhanghailiang <zhang.zhanghailiang@huawei.com>
 S: Maintained
--
2.20.1



* [PATCH v8 8/8] tests/test-char.c: Wait for the chardev to connect in char_socket_client_dupid_test
  2020-09-01  9:14 [PATCH v8 0/8] Introduce 'yank' oob qmp command to recover from hanging qemu Lukas Straub
                   ` (6 preceding siblings ...)
  2020-09-01  9:15 ` [PATCH v8 7/8] MAINTAINERS: Add myself as maintainer for yank feature Lukas Straub
@ 2020-09-01  9:15 ` Lukas Straub
  7 siblings, 0 replies; 16+ messages in thread
From: Lukas Straub @ 2020-09-01  9:15 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Daniel P. Berrangé,
	qemu-block, Juan Quintela, Markus Armbruster,
	Dr. David Alan Gilbert, Max Reitz, Paolo Bonzini,
	Marc-André Lureau


A connecting chardev object has an additional reference by the connecting
thread, so if the chardev is still connecting by the end of the test,
then the chardev object won't be freed. This in turn means that the yank
instance won't be unregistered and when running the next test-case
yank_register_instance will abort, because the yank instance is
already/still registered.

Signed-off-by: Lukas Straub <lukasstraub2@web.de>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
---
 tests/test-char.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/tests/test-char.c b/tests/test-char.c
index d35cc839bc..2ced07de69 100644
--- a/tests/test-char.c
+++ b/tests/test-char.c
@@ -937,6 +937,7 @@ static void char_socket_client_dupid_test(gconstpointer opaque)
     g_assert_nonnull(opts);
     chr1 = qemu_chr_new_from_opts(opts, NULL, &error_abort);
     g_assert_nonnull(chr1);
+    qemu_chr_wait_connected(chr1, &error_abort);

     chr2 = qemu_chr_new_from_opts(opts, NULL, &local_err);
     g_assert_null(chr2);
--
2.20.1
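
The failure mode described in the commit message can be sketched with a
toy registry. This is not util/yank.c; the fixed-size table and the
demo_* names are assumptions made for illustration. The point is only
that registering a name that is still registered fails until the previous
owner has unregistered it.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define DEMO_MAX_INSTANCES 8
#define DEMO_MAX_NAME 64

/* Hypothetical registry: one slot per registered instance name. */
static char demo_names[DEMO_MAX_INSTANCES][DEMO_MAX_NAME];
static bool demo_in_use[DEMO_MAX_INSTANCES];

/* Returns false if the name is already registered or the table is full. */
static bool demo_register_instance(const char *name)
{
    int free_slot = -1;

    assert(name != NULL && strlen(name) < DEMO_MAX_NAME);
    for (int i = 0; i < DEMO_MAX_INSTANCES; i++) {
        if (demo_in_use[i] && strcmp(demo_names[i], name) == 0) {
            return false;           /* duplicate: still registered */
        }
        if (!demo_in_use[i] && free_slot < 0) {
            free_slot = i;
        }
    }
    if (free_slot < 0) {
        return false;
    }
    strcpy(demo_names[free_slot], name);
    demo_in_use[free_slot] = true;
    return true;
}

/* Frees the slot so the same name can be registered again. */
static void demo_unregister_instance(const char *name)
{
    assert(name != NULL);
    for (int i = 0; i < DEMO_MAX_INSTANCES; i++) {
        if (demo_in_use[i] && strcmp(demo_names[i], name) == 0) {
            demo_in_use[i] = false;
            return;
        }
    }
}
```

In the test fixed by this patch, the unregister step never ran because
the chardev object was still referenced by the connecting thread.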


* Re: [PATCH v8 1/8] Introduce yank feature
  2020-09-01  9:15 ` [PATCH v8 1/8] Introduce yank feature Lukas Straub
@ 2020-09-01  9:45   ` Daniel P. Berrangé
  2020-09-01 11:30   ` Markus Armbruster
  2020-09-01 14:38   ` Markus Armbruster
  2 siblings, 0 replies; 16+ messages in thread
From: Daniel P. Berrangé @ 2020-09-01  9:45 UTC (permalink / raw)
  To: Lukas Straub
  Cc: Kevin Wolf, qemu-block, Juan Quintela, Markus Armbruster,
	qemu-devel, Max Reitz, Paolo Bonzini, Marc-André Lureau,
	Dr. David Alan Gilbert

On Tue, Sep 01, 2020 at 11:15:07AM +0200, Lukas Straub wrote:
> The yank feature allows to recover from hanging qemu by "yanking"
> at various parts. Other qemu systems can register themselves and
> multiple yank functions. Then all yank functions for selected
> instances can be called by the 'yank' out-of-band qmp command.
> Available instances can be queried by a 'query-yank' oob command.
> 
> Signed-off-by: Lukas Straub <lukasstraub2@web.de>
> Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
> ---
>  include/qemu/yank.h |  81 +++++++++++++++++++
>  qapi/misc.json      |  62 +++++++++++++++
>  util/meson.build    |   1 +
>  util/yank.c         | 187 ++++++++++++++++++++++++++++++++++++++++++++
>  4 files changed, 331 insertions(+)
>  create mode 100644 include/qemu/yank.h
>  create mode 100644 util/yank.c

Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>


Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|




* Re: [PATCH v8 4/8] migration: Add yank feature
  2020-09-01  9:15 ` [PATCH v8 4/8] migration: " Lukas Straub
@ 2020-09-01  9:46   ` Daniel P. Berrangé
  0 siblings, 0 replies; 16+ messages in thread
From: Daniel P. Berrangé @ 2020-09-01  9:46 UTC (permalink / raw)
  To: Lukas Straub
  Cc: Kevin Wolf, qemu-block, Juan Quintela, Markus Armbruster,
	qemu-devel, Max Reitz, Paolo Bonzini, Marc-André Lureau,
	Dr. David Alan Gilbert

On Tue, Sep 01, 2020 at 11:15:18AM +0200, Lukas Straub wrote:
> Register yank functions on sockets to shut them down.
> 
> Signed-off-by: Lukas Straub <lukasstraub2@web.de>
> Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
> Acked-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> ---
>  migration/channel.c           | 12 ++++++++++++
>  migration/migration.c         | 25 +++++++++++++++++++++++++
>  migration/multifd.c           | 10 ++++++++++
>  migration/qemu-file-channel.c |  6 ++++++
>  migration/savevm.c            |  6 ++++++
>  5 files changed, 59 insertions(+)

Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>


Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|




* Re: [PATCH v8 6/8] io: Document qmp oob suitability of qio_channel_shutdown and io_shutdown
  2020-09-01  9:15 ` [PATCH v8 6/8] io: Document qmp oob suitability of qio_channel_shutdown and io_shutdown Lukas Straub
@ 2020-09-01  9:46   ` Daniel P. Berrangé
  0 siblings, 0 replies; 16+ messages in thread
From: Daniel P. Berrangé @ 2020-09-01  9:46 UTC (permalink / raw)
  To: Lukas Straub
  Cc: Kevin Wolf, qemu-block, Juan Quintela, Markus Armbruster,
	qemu-devel, Max Reitz, Paolo Bonzini, Marc-André Lureau,
	Dr. David Alan Gilbert

On Tue, Sep 01, 2020 at 11:15:25AM +0200, Lukas Straub wrote:
> Migration and yank code assume that qio_channel_shutdown is
> thread-safe and can be called from a QMP OOB handler. Document this
> after checking the code.
> 
> Signed-off-by: Lukas Straub <lukasstraub2@web.de>
> Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
> ---
>  include/io/channel.h | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)

Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>


Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|




* Re: [PATCH v8 1/8] Introduce yank feature
  2020-09-01  9:15 ` [PATCH v8 1/8] Introduce yank feature Lukas Straub
  2020-09-01  9:45   ` Daniel P. Berrangé
@ 2020-09-01 11:30   ` Markus Armbruster
  2020-09-01 14:38   ` Markus Armbruster
  2 siblings, 0 replies; 16+ messages in thread
From: Markus Armbruster @ 2020-09-01 11:30 UTC (permalink / raw)
  To: Lukas Straub
  Cc: Kevin Wolf, Daniel P. Berrangé,
	qemu-block, Juan Quintela, Dr. David Alan Gilbert, qemu-devel,
	Marc-André Lureau, Paolo Bonzini, Max Reitz

Lukas Straub <lukasstraub2@web.de> writes:

> The yank feature allows to recover from hanging qemu by "yanking"
> at various parts. Other qemu systems can register themselves and
> multiple yank functions. Then all yank functions for selected
> instances can be called by the 'yank' out-of-band qmp command.
> Available instances can be queried by a 'query-yank' oob command.
>
> Signed-off-by: Lukas Straub <lukasstraub2@web.de>
> Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
> ---
[...]
> diff --git a/qapi/misc.json b/qapi/misc.json
> index 9d32820dc1..7de330416a 100644
> --- a/qapi/misc.json
> +++ b/qapi/misc.json
> @@ -1615,3 +1615,65 @@
>  ##
>  { 'command': 'query-vm-generation-id', 'returns': 'GuidInfo' }
>
> +##
> +# @YankInstances:
> +#
> +# @instances: List of yank instances.
> +#
> +# A yank instance can be yanked with the "yank" qmp command to recover from a
> +# hanging qemu.
> +#
> +# Yank instances are named after the following schema:
> +# "blockdev:<node-name>" refers to a block device. Currently only nbd block
> +# devices are implemented.
> +# "chardev:<chardev-name>" refers to a chardev. Currently only socket chardevs
> +# are implemented.

The two "Currently only ... are implemented" are redundant with ...

> +# "migration" refers to the migration currently in progress.
> +#
> +# Currently implemented yank instances:
> +#  -nbd block device:
> +#   Yanking it will shutdown the connection to the nbd server without
> +#   attempting to reconnect.
> +#  -socket chardev:
> +#   Yanking it will shutdown the connected socket.
> +#  -migration:
> +#   Yanking it will shutdown all migration connections.

... this list.  Not a blocker, but if you have to respin for some other
reason, consider deleting them.

> +#
> +# Since: 5.2
> +##
> +{ 'struct': 'YankInstances', 'data': {'instances': ['str'] } }
> +
> +##
> +# @yank:
> +#
> +# Recover from hanging qemu by yanking the specified instances. See
> +# "YankInstances" for more information.
> +#
> +# Takes @YankInstances as argument.
> +#
> +# Returns: nothing.
> +#
> +# Example:
> +#
> +# -> { "execute": "yank", "arguments": { "instances": ["blockdev:nbd0"] } }
> +# <- { "return": {} }
> +#
> +# Since: 5.2
> +##
> +{ 'command': 'yank', 'data': 'YankInstances', 'allow-oob': true }
> +
> +##
> +# @query-yank:
> +#
> +# Query yank instances. See "YankInstances" for more information.
> +#
> +# Returns: @YankInstances
> +#
> +# Example:
> +#
> +# -> { "execute": "query-yank" }
> +# <- { "return": { "instances": ["blockdev:nbd0"] } }
> +#
> +# Since: 5.2
> +##
> +{ 'command': 'query-yank', 'returns': 'YankInstances', 'allow-oob': true }

You addressed all my review comments nicely, except for the one
questioning the wisdom of encoding structured data into the instance
name.  Daniel and you pointed me to the spot where this was discussed
previously, in review of your v1.  I read the thread, and replied to it.
I'd like us to put this series on hold to give us time to discuss and
decide.

Thanks!

[...]
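
To make the concern about structured data in the instance name concrete,
here is a rough sketch of what a management client would have to do to
take such a name apart again. The DemoYankKind enum and
demo_parse_instance are hypothetical and not part of the series; only the
three name forms ("blockdev:<node-name>", "chardev:<chardev-name>",
"migration") come from the schema documentation quoted above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical classification of the documented name forms. */
typedef enum {
    DEMO_KIND_BLOCKDEV,
    DEMO_KIND_CHARDEV,
    DEMO_KIND_MIGRATION,
    DEMO_KIND_UNKNOWN,
} DemoYankKind;

/*
 * Split an instance name into its kind and, for blockdev and chardev,
 * the identifier that follows the prefix. *id points into name, or is
 * NULL when the kind carries no identifier.
 */
static DemoYankKind demo_parse_instance(const char *name, const char **id)
{
    static const char blockdev[] = "blockdev:";
    static const char chardev[] = "chardev:";

    assert(name != NULL && id != NULL);
    *id = NULL;
    if (strncmp(name, blockdev, sizeof(blockdev) - 1) == 0) {
        *id = name + sizeof(blockdev) - 1;
        return DEMO_KIND_BLOCKDEV;
    }
    if (strncmp(name, chardev, sizeof(chardev) - 1) == 0) {
        *id = name + sizeof(chardev) - 1;
        return DEMO_KIND_CHARDEV;
    }
    if (strcmp(name, "migration") == 0) {
        return DEMO_KIND_MIGRATION;
    }
    return DEMO_KIND_UNKNOWN;
}
```

With a structured argument the kind and the identifier would arrive as
separate fields and no string parsing of this sort would be needed.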




* Re: [PATCH v8 1/8] Introduce yank feature
  2020-09-01  9:15 ` [PATCH v8 1/8] Introduce yank feature Lukas Straub
  2020-09-01  9:45   ` Daniel P. Berrangé
  2020-09-01 11:30   ` Markus Armbruster
@ 2020-09-01 14:38   ` Markus Armbruster
  2020-09-01 15:22     ` Daniel P. Berrangé
  2 siblings, 1 reply; 16+ messages in thread
From: Markus Armbruster @ 2020-09-01 14:38 UTC (permalink / raw)
  To: Lukas Straub
  Cc: Kevin Wolf, Daniel P. Berrangé,
	qemu-block, Juan Quintela, Markus Armbruster, qemu-devel,
	Marc-André Lureau, Paolo Bonzini, Max Reitz,
	Dr. David Alan Gilbert

One more question...

Lukas Straub <lukasstraub2@web.de> writes:

> The yank feature allows to recover from hanging qemu by "yanking"
> at various parts. Other qemu systems can register themselves and
> multiple yank functions. Then all yank functions for selected
> instances can be called by the 'yank' out-of-band qmp command.
> Available instances can be queried by a 'query-yank' oob command.
>
> Signed-off-by: Lukas Straub <lukasstraub2@web.de>
> Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
[...]
> diff --git a/qapi/misc.json b/qapi/misc.json
> index 9d32820dc1..7de330416a 100644
> --- a/qapi/misc.json
> +++ b/qapi/misc.json
> @@ -1615,3 +1615,65 @@
>  ##
>  { 'command': 'query-vm-generation-id', 'returns': 'GuidInfo' }
>
> +##
> +# @YankInstances:
> +#
> +# @instances: List of yank instances.
> +#
> +# A yank instance can be yanked with the "yank" qmp command to recover from a
> +# hanging qemu.
> +#
> +# Yank instances are named after the following schema:
> +# "blockdev:<node-name>" refers to a block device. Currently only nbd block
> +# devices are implemented.
> +# "chardev:<chardev-name>" refers to a chardev. Currently only socket chardevs
> +# are implemented.
> +# "migration" refers to the migration currently in progress.
> +#
> +# Currently implemented yank instances:
> +#  -nbd block device:
> +#   Yanking it will shutdown the connection to the nbd server without
> +#   attempting to reconnect.
> +#  -socket chardev:
> +#   Yanking it will shutdown the connected socket.
> +#  -migration:
> +#   Yanking it will shutdown all migration connections.

How is yanking migration related to command migrate_cancel?

> +#
> +# Since: 5.2
> +##
> +{ 'struct': 'YankInstances', 'data': {'instances': ['str'] } }
> +
> +##
> +# @yank:
> +#
> +# Recover from hanging qemu by yanking the specified instances. See
> +# "YankInstances" for more information.
> +#
> +# Takes @YankInstances as argument.
> +#
> +# Returns: nothing.
> +#
> +# Example:
> +#
> +# -> { "execute": "yank", "arguments": { "instances": ["blockdev:nbd0"] } }
> +# <- { "return": {} }
> +#
> +# Since: 5.2
> +##
> +{ 'command': 'yank', 'data': 'YankInstances', 'allow-oob': true }
> +
> +##
> +# @query-yank:
> +#
> +# Query yank instances. See "YankInstances" for more information.
> +#
> +# Returns: @YankInstances
> +#
> +# Example:
> +#
> +# -> { "execute": "query-yank" }
> +# <- { "return": { "instances": ["blockdev:nbd0"] } }
> +#
> +# Since: 5.2
> +##
> +{ 'command': 'query-yank', 'returns': 'YankInstances', 'allow-oob': true }
[...]




* Re: [PATCH v8 1/8] Introduce yank feature
  2020-09-01 14:38   ` Markus Armbruster
@ 2020-09-01 15:22     ` Daniel P. Berrangé
  2020-09-02  7:55       ` Markus Armbruster
  0 siblings, 1 reply; 16+ messages in thread
From: Daniel P. Berrangé @ 2020-09-01 15:22 UTC (permalink / raw)
  To: Markus Armbruster
  Cc: Kevin Wolf, Lukas Straub, qemu-block, Juan Quintela, qemu-devel,
	Max Reitz, Paolo Bonzini, Marc-André Lureau,
	Dr. David Alan Gilbert

On Tue, Sep 01, 2020 at 04:38:46PM +0200, Markus Armbruster wrote:
> One more question...
> 
> Lukas Straub <lukasstraub2@web.de> writes:
> 
> > The yank feature allows to recover from hanging qemu by "yanking"
> > at various parts. Other qemu systems can register themselves and
> > multiple yank functions. Then all yank functions for selected
> > instances can be called by the 'yank' out-of-band qmp command.
> > Available instances can be queried by a 'query-yank' oob command.
> >
> > Signed-off-by: Lukas Straub <lukasstraub2@web.de>
> > Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
> [...]
> > diff --git a/qapi/misc.json b/qapi/misc.json
> > index 9d32820dc1..7de330416a 100644
> > --- a/qapi/misc.json
> > +++ b/qapi/misc.json
> > @@ -1615,3 +1615,65 @@
> >  ##
> >  { 'command': 'query-vm-generation-id', 'returns': 'GuidInfo' }
> >
> > +##
> > +# @YankInstances:
> > +#
> > +# @instances: List of yank instances.
> > +#
> > +# A yank instance can be yanked with the "yank" qmp command to recover from a
> > +# hanging qemu.
> > +#
> > +# Yank instances are named after the following schema:
> > +# "blockdev:<node-name>" refers to a block device. Currently only nbd block
> > +# devices are implemented.
> > +# "chardev:<chardev-name>" refers to a chardev. Currently only socket chardevs
> > +# are implemented.
> > +# "migration" refers to the migration currently in progress.
> > +#
> > +# Currently implemented yank instances:
> > +#  -nbd block device:
> > +#   Yanking it will shutdown the connection to the nbd server without
> > +#   attempting to reconnect.
> > +#  -socket chardev:
> > +#   Yanking it will shutdown the connected socket.
> > +#  -migration:
> > +#   Yanking it will shutdown all migration connections.
> 
> How is yanking migration related to command migrate_cancel?

migrate_cancel will do a shutdown() on the primary migration socket only.
In addition it will toggle the migration state.

Yanking will do a shutdown on all migration sockets (important for
multifd), but won't touch migration state or any other aspect of QEMU
code.

Overall yanking has less potential for things to go wrong than the
migrate_cancel method, as it doesn't try to do any kind of cleanup
or migration.
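
To make the difference concrete, here is a toy model (plain Python sockets, not QEMU code) of the two behaviours described above. The class `ToyMigration`, its method names and the state strings are all hypothetical; the only things taken from the description are "primary socket only plus state change" for cancel and "all sockets, nothing else" for yank.

```python
import socket


class ToyMigration:
    """Toy stand-in for a migration with several channels (not QEMU code)."""

    def __init__(self, channels=3):
        # Each pair is (local end, peer end); channel 0 plays the primary socket.
        self.pairs = [socket.socketpair() for _ in range(channels)]
        for _, peer in self.pairs:
            peer.setblocking(False)
        self.state = "active"  # illustrative state name

    def cancel(self):
        # As described: shut down the primary socket only, and change state.
        self.pairs[0][0].shutdown(socket.SHUT_RDWR)
        self.state = "cancelling"  # illustrative state name

    def yank(self):
        # As described: shut down every socket and touch nothing else.
        for local, _ in self.pairs:
            local.shutdown(socket.SHUT_RDWR)

    def peer_sees_eof(self, index):
        """True if the peer of the given channel observes end-of-stream."""
        try:
            return self.pairs[index][1].recv(1) == b""
        except BlockingIOError:
            # No data and no EOF: the channel is still open.
            return False

    def close(self):
        for local, peer in self.pairs:
            local.close()
            peer.close()
```

With three channels, `cancel()` leaves channels 1 and 2 open from the peer's point of view, which is the multifd case mentioned above, while `yank()` ends all three and leaves `state` untouched.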


Regards,
Daniel




* Re: [PATCH v8 1/8] Introduce yank feature
From: Markus Armbruster @ 2020-09-02  7:55 UTC (permalink / raw)
  To: Daniel P. Berrangé
  Cc: Kevin Wolf, Lukas Straub, qemu-block, Juan Quintela, qemu-devel,
	Max Reitz, Marc-André Lureau, Paolo Bonzini,
	Dr. David Alan Gilbert

Daniel P. Berrangé <berrange@redhat.com> writes:

> On Tue, Sep 01, 2020 at 04:38:46PM +0200, Markus Armbruster wrote:
>> One more question...
>> 
>> [...]
>> 
>> How is yanking migration related to command migrate_cancel?
>
> migrate_cancel will do a shutdown() on the primary migration socket only.
> In addition it will toggle the migration state.
>
> Yanking will do a shutdown on all migration sockets (important for
> multifd), but won't touch migration state or any other aspect of QEMU
> code.
>
> Overall yanking has less potential for things to go wrong than the
> migrate_cancel method, as it doesn't try to do any kind of cleanup
> or migration.

Thanks.  Would it make sense to work this into the documentation?





Thread overview: 16+ messages
2020-09-01  9:14 [PATCH v8 0/8] Introduce 'yank' oob qmp command to recover from hanging qemu Lukas Straub
2020-09-01  9:15 ` [PATCH v8 1/8] Introduce yank feature Lukas Straub
2020-09-01  9:45   ` Daniel P. Berrangé
2020-09-01 11:30   ` Markus Armbruster
2020-09-01 14:38   ` Markus Armbruster
2020-09-01 15:22     ` Daniel P. Berrangé
2020-09-02  7:55       ` Markus Armbruster
2020-09-01  9:15 ` [PATCH v8 2/8] block/nbd.c: Add " Lukas Straub
2020-09-01  9:15 ` [PATCH v8 3/8] chardev/char-socket.c: " Lukas Straub
2020-09-01  9:15 ` [PATCH v8 4/8] migration: " Lukas Straub
2020-09-01  9:46   ` Daniel P. Berrangé
2020-09-01  9:15 ` [PATCH v8 5/8] io/channel-tls.c: make qio_channel_tls_shutdown thread-safe Lukas Straub
2020-09-01  9:15 ` [PATCH v8 6/8] io: Document qmp oob suitability of qio_channel_shutdown and io_shutdown Lukas Straub
2020-09-01  9:46   ` Daniel P. Berrangé
2020-09-01  9:15 ` [PATCH v8 7/8] MAINTAINERS: Add myself as maintainer for yank feature Lukas Straub
2020-09-01  9:15 ` [PATCH v8 8/8] tests/test-char.c: Wait for the chardev to connect in char_socket_client_dupid_test Lukas Straub
