All of lore.kernel.org
 help / color / mirror / Atom feed
* [Qemu-devel] [RFC 0/6] migration: re-use migrate_incoming for postcopy recovery
@ 2017-08-15  6:17 Peter Xu
  2017-08-15  6:17 ` [Qemu-devel] [RFC 1/6] migration: free SocketAddress where allocated Peter Xu
                   ` (5 more replies)
  0 siblings, 6 replies; 14+ messages in thread
From: Peter Xu @ 2017-08-15  6:17 UTC (permalink / raw)
  To: qemu-devel
  Cc: Daniel P . Berrange, Laurent Vivier, Juan Quintela,
	Dr . David Alan Gilbert, peterx

This series is based on the postcopy failure recovery series. It
sololy tries to provide a new way to allow the destination to have a
new incoming channel.

One use case is when we are doing postcopy migration using a fd on
destination side. When network failure is detected, destination QEMU
will switch to postcopy-pause state for a recovery. However since the
old fd is disconnected and not valid any more, there is no way to do a
reconnection without a new reconfiguration.

With this series, we can specify the new listening channel by using
"migrate_incoming xxx:xxx" command. It was used only for "-incoming
defer" to defer an incoming migration. This series extended its usage
for paused postcopy as well.

Please review, thanks.

Peter Xu (6):
  migration: free SocketAddress where allocated
  migration: return incoming task tag for sockets
  migration: return incoming task tag for exec
  migration: return incoming task tag for fd
  migration: store listen task tag
  migration: allow migrate_incoming for paused VM

 migration/exec.c      | 18 ++++++++++-------
 migration/exec.h      |  2 +-
 migration/fd.c        | 18 ++++++++++-------
 migration/fd.h        |  2 +-
 migration/migration.c | 56 +++++++++++++++++++++++++++++++++++++++++----------
 migration/migration.h |  2 ++
 migration/socket.c    | 40 +++++++++++++++++++++++-------------
 migration/socket.h    |  4 ++--
 8 files changed, 99 insertions(+), 43 deletions(-)

-- 
2.7.4

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [Qemu-devel] [RFC 1/6] migration: free SocketAddress where allocated
  2017-08-15  6:17 [Qemu-devel] [RFC 0/6] migration: re-use migrate_incoming for postcopy recovery Peter Xu
@ 2017-08-15  6:17 ` Peter Xu
  2017-08-15  6:17 ` [Qemu-devel] [RFC 2/6] migration: return incoming task tag for sockets Peter Xu
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 14+ messages in thread
From: Peter Xu @ 2017-08-15  6:17 UTC (permalink / raw)
  To: qemu-devel
  Cc: Daniel P . Berrange, Laurent Vivier, Juan Quintela,
	Dr . David Alan Gilbert, peterx

Freeing the SocketAddress struct in socket_start_incoming_migration is
slightly confusing. Let's free the address in the same context where we
allocated it.

Signed-off-by: Peter Xu <peterx@redhat.com>
---
 migration/socket.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/migration/socket.c b/migration/socket.c
index 757d382..9fc6cb3 100644
--- a/migration/socket.c
+++ b/migration/socket.c
@@ -168,7 +168,6 @@ static void socket_start_incoming_migration(SocketAddress *saddr,
 
     if (qio_channel_socket_listen_sync(listen_ioc, saddr, errp) < 0) {
         object_unref(OBJECT(listen_ioc));
-        qapi_free_SocketAddress(saddr);
         return;
     }
 
@@ -177,7 +176,6 @@ static void socket_start_incoming_migration(SocketAddress *saddr,
                           socket_accept_incoming_migration,
                           listen_ioc,
                           (GDestroyNotify)object_unref);
-    qapi_free_SocketAddress(saddr);
 }
 
 void tcp_start_incoming_migration(const char *host_port, Error **errp)
@@ -188,10 +186,12 @@ void tcp_start_incoming_migration(const char *host_port, Error **errp)
         socket_start_incoming_migration(saddr, &err);
     }
     error_propagate(errp, err);
+    qapi_free_SocketAddress(saddr);
 }
 
 void unix_start_incoming_migration(const char *path, Error **errp)
 {
     SocketAddress *saddr = unix_build_address(path);
     socket_start_incoming_migration(saddr, errp);
+    qapi_free_SocketAddress(saddr);
 }
-- 
2.7.4

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [Qemu-devel] [RFC 2/6] migration: return incoming task tag for sockets
  2017-08-15  6:17 [Qemu-devel] [RFC 0/6] migration: re-use migrate_incoming for postcopy recovery Peter Xu
  2017-08-15  6:17 ` [Qemu-devel] [RFC 1/6] migration: free SocketAddress where allocated Peter Xu
@ 2017-08-15  6:17 ` Peter Xu
  2017-08-15  6:17 ` [Qemu-devel] [RFC 3/6] migration: return incoming task tag for exec Peter Xu
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 14+ messages in thread
From: Peter Xu @ 2017-08-15  6:17 UTC (permalink / raw)
  To: qemu-devel
  Cc: Daniel P . Berrange, Laurent Vivier, Juan Quintela,
	Dr . David Alan Gilbert, peterx

For socket based incoming migration, we attached a background task onto
main loop to handle the acception of connections. We never had a way to
destroy it before, only if we finished the migration.

Let's allow socket_start_incoming_migration() to return the source tag
of the listening async work, so that we may be able to clean it up in
the future.

Signed-off-by: Peter Xu <peterx@redhat.com>
---
 migration/socket.c | 36 ++++++++++++++++++++++++------------
 migration/socket.h |  4 ++--
 2 files changed, 26 insertions(+), 14 deletions(-)

diff --git a/migration/socket.c b/migration/socket.c
index 9fc6cb3..6ee51ef 100644
--- a/migration/socket.c
+++ b/migration/socket.c
@@ -158,8 +158,12 @@ out:
 }
 
 
-static void socket_start_incoming_migration(SocketAddress *saddr,
-                                            Error **errp)
+/*
+ * Returns the tag ID of the watch that is attached to global main
+ * loop (>0), or zero if failure detected.
+ */
+static guint socket_start_incoming_migration(SocketAddress *saddr,
+                                             Error **errp)
 {
     QIOChannelSocket *listen_ioc = qio_channel_socket_new();
 
@@ -168,30 +172,38 @@ static void socket_start_incoming_migration(SocketAddress *saddr,
 
     if (qio_channel_socket_listen_sync(listen_ioc, saddr, errp) < 0) {
         object_unref(OBJECT(listen_ioc));
-        return;
+        return 0;
     }
 
-    qio_channel_add_watch(QIO_CHANNEL(listen_ioc),
-                          G_IO_IN,
-                          socket_accept_incoming_migration,
-                          listen_ioc,
-                          (GDestroyNotify)object_unref);
+    return qio_channel_add_watch(QIO_CHANNEL(listen_ioc),
+                                 G_IO_IN,
+                                 socket_accept_incoming_migration,
+                                 listen_ioc,
+                                 (GDestroyNotify)object_unref);
 }
 
-void tcp_start_incoming_migration(const char *host_port, Error **errp)
+guint tcp_start_incoming_migration(const char *host_port, Error **errp)
 {
     Error *err = NULL;
     SocketAddress *saddr = tcp_build_address(host_port, &err);
+    guint tag;
+
     if (!err) {
-        socket_start_incoming_migration(saddr, &err);
+        tag = socket_start_incoming_migration(saddr, &err);
     }
     error_propagate(errp, err);
     qapi_free_SocketAddress(saddr);
+
+    return tag;
 }
 
-void unix_start_incoming_migration(const char *path, Error **errp)
+guint unix_start_incoming_migration(const char *path, Error **errp)
 {
     SocketAddress *saddr = unix_build_address(path);
-    socket_start_incoming_migration(saddr, errp);
+    guint tag;
+
+    tag = socket_start_incoming_migration(saddr, errp);
     qapi_free_SocketAddress(saddr);
+
+    return tag;
 }
diff --git a/migration/socket.h b/migration/socket.h
index 6b91e9d..bc8a59a 100644
--- a/migration/socket.h
+++ b/migration/socket.h
@@ -16,12 +16,12 @@
 
 #ifndef QEMU_MIGRATION_SOCKET_H
 #define QEMU_MIGRATION_SOCKET_H
-void tcp_start_incoming_migration(const char *host_port, Error **errp);
+guint tcp_start_incoming_migration(const char *host_port, Error **errp);
 
 void tcp_start_outgoing_migration(MigrationState *s, const char *host_port,
                                   Error **errp);
 
-void unix_start_incoming_migration(const char *path, Error **errp);
+guint unix_start_incoming_migration(const char *path, Error **errp);
 
 void unix_start_outgoing_migration(MigrationState *s, const char *path,
                                    Error **errp);
-- 
2.7.4

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [Qemu-devel] [RFC 3/6] migration: return incoming task tag for exec
  2017-08-15  6:17 [Qemu-devel] [RFC 0/6] migration: re-use migrate_incoming for postcopy recovery Peter Xu
  2017-08-15  6:17 ` [Qemu-devel] [RFC 1/6] migration: free SocketAddress where allocated Peter Xu
  2017-08-15  6:17 ` [Qemu-devel] [RFC 2/6] migration: return incoming task tag for sockets Peter Xu
@ 2017-08-15  6:17 ` Peter Xu
  2017-08-15  6:17 ` [Qemu-devel] [RFC 4/6] migration: return incoming task tag for fd Peter Xu
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 14+ messages in thread
From: Peter Xu @ 2017-08-15  6:17 UTC (permalink / raw)
  To: qemu-devel
  Cc: Daniel P . Berrange, Laurent Vivier, Juan Quintela,
	Dr . David Alan Gilbert, peterx

Return the async task tag for exec typed incoming migration in
exec_start_incoming_migration().

Signed-off-by: Peter Xu <peterx@redhat.com>
---
 migration/exec.c | 18 +++++++++++-------
 migration/exec.h |  2 +-
 2 files changed, 12 insertions(+), 8 deletions(-)

diff --git a/migration/exec.c b/migration/exec.c
index 08b599e..ef1fb4c 100644
--- a/migration/exec.c
+++ b/migration/exec.c
@@ -52,7 +52,11 @@ static gboolean exec_accept_incoming_migration(QIOChannel *ioc,
     return FALSE; /* unregister */
 }
 
-void exec_start_incoming_migration(const char *command, Error **errp)
+/*
+ * Returns the tag ID of the watch that is attached to global main
+ * loop (>0), or zero if failure detected.
+ */
+guint exec_start_incoming_migration(const char *command, Error **errp)
 {
     QIOChannel *ioc;
     const char *argv[] = { "/bin/sh", "-c", command, NULL };
@@ -62,13 +66,13 @@ void exec_start_incoming_migration(const char *command, Error **errp)
                                                     O_RDWR,
                                                     errp));
     if (!ioc) {
-        return;
+        return 0;
     }
 
     qio_channel_set_name(ioc, "migration-exec-incoming");
-    qio_channel_add_watch(ioc,
-                          G_IO_IN,
-                          exec_accept_incoming_migration,
-                          NULL,
-                          NULL);
+    return qio_channel_add_watch(ioc,
+                                 G_IO_IN,
+                                 exec_accept_incoming_migration,
+                                 NULL,
+                                 NULL);
 }
diff --git a/migration/exec.h b/migration/exec.h
index b210ffd..0a7aada 100644
--- a/migration/exec.h
+++ b/migration/exec.h
@@ -19,7 +19,7 @@
 
 #ifndef QEMU_MIGRATION_EXEC_H
 #define QEMU_MIGRATION_EXEC_H
-void exec_start_incoming_migration(const char *host_port, Error **errp);
+guint exec_start_incoming_migration(const char *host_port, Error **errp);
 
 void exec_start_outgoing_migration(MigrationState *s, const char *host_port,
                                    Error **errp);
-- 
2.7.4

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [Qemu-devel] [RFC 4/6] migration: return incoming task tag for fd
  2017-08-15  6:17 [Qemu-devel] [RFC 0/6] migration: re-use migrate_incoming for postcopy recovery Peter Xu
                   ` (2 preceding siblings ...)
  2017-08-15  6:17 ` [Qemu-devel] [RFC 3/6] migration: return incoming task tag for exec Peter Xu
@ 2017-08-15  6:17 ` Peter Xu
  2017-08-15  6:17 ` [Qemu-devel] [RFC 5/6] migration: store listen task tag Peter Xu
  2017-08-15  6:17 ` [Qemu-devel] [RFC 6/6] migration: allow migrate_incoming for paused VM Peter Xu
  5 siblings, 0 replies; 14+ messages in thread
From: Peter Xu @ 2017-08-15  6:17 UTC (permalink / raw)
  To: qemu-devel
  Cc: Daniel P . Berrange, Laurent Vivier, Juan Quintela,
	Dr . David Alan Gilbert, peterx

Allow to return the task tag in fd_start_incoming_migration().

Signed-off-by: Peter Xu <peterx@redhat.com>
---
 migration/fd.c | 18 +++++++++++-------
 migration/fd.h |  2 +-
 2 files changed, 12 insertions(+), 8 deletions(-)

diff --git a/migration/fd.c b/migration/fd.c
index 30f5258..e9a548c 100644
--- a/migration/fd.c
+++ b/migration/fd.c
@@ -52,7 +52,11 @@ static gboolean fd_accept_incoming_migration(QIOChannel *ioc,
     return FALSE; /* unregister */
 }
 
-void fd_start_incoming_migration(const char *infd, Error **errp)
+/*
+ * Returns the tag ID of the watch that is attached to global main
+ * loop (>0), or zero if failure detected.
+ */
+guint fd_start_incoming_migration(const char *infd, Error **errp)
 {
     QIOChannel *ioc;
     int fd;
@@ -63,13 +67,13 @@ void fd_start_incoming_migration(const char *infd, Error **errp)
     ioc = qio_channel_new_fd(fd, errp);
     if (!ioc) {
         close(fd);
-        return;
+        return 0;
     }
 
     qio_channel_set_name(QIO_CHANNEL(ioc), "migration-fd-incoming");
-    qio_channel_add_watch(ioc,
-                          G_IO_IN,
-                          fd_accept_incoming_migration,
-                          NULL,
-                          NULL);
+    return qio_channel_add_watch(ioc,
+                                 G_IO_IN,
+                                 fd_accept_incoming_migration,
+                                 NULL,
+                                 NULL);
 }
diff --git a/migration/fd.h b/migration/fd.h
index a14a63c..94cdea8 100644
--- a/migration/fd.h
+++ b/migration/fd.h
@@ -16,7 +16,7 @@
 
 #ifndef QEMU_MIGRATION_FD_H
 #define QEMU_MIGRATION_FD_H
-void fd_start_incoming_migration(const char *path, Error **errp);
+guint fd_start_incoming_migration(const char *path, Error **errp);
 
 void fd_start_outgoing_migration(MigrationState *s, const char *fdname,
                                  Error **errp);
-- 
2.7.4

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [Qemu-devel] [RFC 5/6] migration: store listen task tag
  2017-08-15  6:17 [Qemu-devel] [RFC 0/6] migration: re-use migrate_incoming for postcopy recovery Peter Xu
                   ` (3 preceding siblings ...)
  2017-08-15  6:17 ` [Qemu-devel] [RFC 4/6] migration: return incoming task tag for fd Peter Xu
@ 2017-08-15  6:17 ` Peter Xu
  2017-08-15  8:37   ` Daniel P. Berrange
  2017-08-15  6:17 ` [Qemu-devel] [RFC 6/6] migration: allow migrate_incoming for paused VM Peter Xu
  5 siblings, 1 reply; 14+ messages in thread
From: Peter Xu @ 2017-08-15  6:17 UTC (permalink / raw)
  To: qemu-devel
  Cc: Daniel P . Berrange, Laurent Vivier, Juan Quintela,
	Dr . David Alan Gilbert, peterx

Store the task tag for migration types: tcp/unix/fd/exec in current
MigrationIncomingState struct.

For defered migration, no need to store task tag since there is no task
running in the main loop at all. For RDMA, let's mark it as todo.

Signed-off-by: Peter Xu <peterx@redhat.com>
---
 migration/migration.c | 22 ++++++++++++++++++----
 migration/migration.h |  2 ++
 2 files changed, 20 insertions(+), 4 deletions(-)

diff --git a/migration/migration.c b/migration/migration.c
index c9b7085..daf356b 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -171,6 +171,7 @@ void migration_incoming_state_destroy(void)
         mis->from_src_file = NULL;
     }
 
+    mis->listen_task_tag = 0;
     qemu_event_destroy(&mis->main_thread_load_event);
 }
 
@@ -265,25 +266,31 @@ int migrate_send_rp_req_pages(MigrationIncomingState *mis, const char *rbname,
 void qemu_start_incoming_migration(const char *uri, Error **errp)
 {
     const char *p;
+    guint task_tag = 0;
+    MigrationIncomingState *mis = migration_incoming_get_current();
 
     qapi_event_send_migration(MIGRATION_STATUS_SETUP, &error_abort);
     if (!strcmp(uri, "defer")) {
         deferred_incoming_migration(errp);
     } else if (strstart(uri, "tcp:", &p)) {
-        tcp_start_incoming_migration(p, errp);
+        task_tag = tcp_start_incoming_migration(p, errp);
 #ifdef CONFIG_RDMA
     } else if (strstart(uri, "rdma:", &p)) {
+        /* TODO: store task tag for RDMA migrations */
         rdma_start_incoming_migration(p, errp);
 #endif
     } else if (strstart(uri, "exec:", &p)) {
-        exec_start_incoming_migration(p, errp);
+        task_tag = exec_start_incoming_migration(p, errp);
     } else if (strstart(uri, "unix:", &p)) {
-        unix_start_incoming_migration(p, errp);
+        task_tag = unix_start_incoming_migration(p, errp);
     } else if (strstart(uri, "fd:", &p)) {
-        fd_start_incoming_migration(p, errp);
+        task_tag = fd_start_incoming_migration(p, errp);
     } else {
         error_setg(errp, "unknown migration protocol: %s", uri);
+        return;
     }
+
+    mis->listen_task_tag = task_tag;
 }
 
 static void process_incoming_migration_bh(void *opaque)
@@ -422,6 +429,13 @@ void migration_fd_process_incoming(QEMUFile *f)
         co = qemu_coroutine_create(process_incoming_migration_co, f);
         qemu_coroutine_enter(co);
     }
+
+    /*
+     * When reach here, we should not need the listening port any
+     * more. We'll detach the listening task soon, let's reset the
+     * listen task tag.
+     */
+    mis->listen_task_tag = 0;
 }
 
 /*
diff --git a/migration/migration.h b/migration/migration.h
index d041369..1f4faef 100644
--- a/migration/migration.h
+++ b/migration/migration.h
@@ -26,6 +26,8 @@
 /* State for the incoming migration */
 struct MigrationIncomingState {
     QEMUFile *from_src_file;
+    /* Task tag for incoming listen port. Valid when >0. */
+    guint listen_task_tag;
 
     /*
      * Free at the start of the main state load, set as the main thread finishes
-- 
2.7.4

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [Qemu-devel] [RFC 6/6] migration: allow migrate_incoming for paused VM
  2017-08-15  6:17 [Qemu-devel] [RFC 0/6] migration: re-use migrate_incoming for postcopy recovery Peter Xu
                   ` (4 preceding siblings ...)
  2017-08-15  6:17 ` [Qemu-devel] [RFC 5/6] migration: store listen task tag Peter Xu
@ 2017-08-15  6:17 ` Peter Xu
  5 siblings, 0 replies; 14+ messages in thread
From: Peter Xu @ 2017-08-15  6:17 UTC (permalink / raw)
  To: qemu-devel
  Cc: Daniel P . Berrange, Laurent Vivier, Juan Quintela,
	Dr . David Alan Gilbert, peterx

migrate_incoming command is previously only used when we were providing
"-incoming defer" in the command line, to defer the incoming migration
channel creation.

However there is similar requirement when we are paused during postcopy
migration. The old incoming channel might have been destroyed already.
We may need another new channel for the recovery to happen.

This patch leveraged the same interface, but allows the user to specify
incoming migration channel even for paused postcopy.

Signed-off-by: Peter Xu <peterx@redhat.com>
---
 migration/migration.c | 34 +++++++++++++++++++++++++++-------
 1 file changed, 27 insertions(+), 7 deletions(-)

diff --git a/migration/migration.c b/migration/migration.c
index daf356b..696cc7c 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -1288,17 +1288,39 @@ void migrate_del_blocker(Error *reason)
     migration_blockers = g_slist_remove(migration_blockers, reason);
 }
 
+static bool migrate_incoming_detach_listen(MigrationIncomingState *mis)
+{
+    if (mis->listen_task_tag) {
+        /* Never fail */
+        g_source_remove(mis->listen_task_tag);
+        mis->listen_task_tag = 0;
+        return true;
+    }
+    return false;
+}
+
 void qmp_migrate_incoming(const char *uri, Error **errp)
 {
     Error *local_err = NULL;
-    static bool once = true;
+    MigrationIncomingState *mis = migration_incoming_get_current();
 
-    if (!deferred_incoming) {
-        error_setg(errp, "For use with '-incoming defer'");
+    if (!deferred_incoming &&
+        mis->state != MIGRATION_STATUS_POSTCOPY_PAUSED) {
+        error_setg(errp, "For use with '-incoming defer'"
+                   " or PAUSED postcopy migration only.");
         return;
     }
-    if (!once) {
-        error_setg(errp, "The incoming migration has already been started");
+
+    /*
+     * Destroy existing listening task if exist. Logically this should
+     * not really happen at all (for either deferred migration or
+     * postcopy migration, we should both detached the listening
+     * task). So raise an error but still we safely detach it.
+     */
+    if (migrate_incoming_detach_listen(mis)) {
+        error_report("%s: detected existing listen channel, "
+                     "while it should not exist", __func__);
+        /* Continue */
     }
 
     qemu_start_incoming_migration(uri, &local_err);
@@ -1307,8 +1329,6 @@ void qmp_migrate_incoming(const char *uri, Error **errp)
         error_propagate(errp, local_err);
         return;
     }
-
-    once = false;
 }
 
 bool migration_is_blocked(Error **errp)
-- 
2.7.4

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [Qemu-devel] [RFC 5/6] migration: store listen task tag
  2017-08-15  6:17 ` [Qemu-devel] [RFC 5/6] migration: store listen task tag Peter Xu
@ 2017-08-15  8:37   ` Daniel P. Berrange
  2017-08-15  8:50     ` Peter Xu
  0 siblings, 1 reply; 14+ messages in thread
From: Daniel P. Berrange @ 2017-08-15  8:37 UTC (permalink / raw)
  To: Peter Xu
  Cc: qemu-devel, Laurent Vivier, Juan Quintela, Dr . David Alan Gilbert

On Tue, Aug 15, 2017 at 02:17:06PM +0800, Peter Xu wrote:
> Store the task tag for migration types: tcp/unix/fd/exec in current
> MigrationIncomingState struct.
> 
> For defered migration, no need to store task tag since there is no task
> running in the main loop at all. For RDMA, let's mark it as todo.
> 
> Signed-off-by: Peter Xu <peterx@redhat.com>
> ---
>  migration/migration.c | 22 ++++++++++++++++++----
>  migration/migration.h |  2 ++
>  2 files changed, 20 insertions(+), 4 deletions(-)
> 
> diff --git a/migration/migration.c b/migration/migration.c
> index c9b7085..daf356b 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -171,6 +171,7 @@ void migration_incoming_state_destroy(void)
>          mis->from_src_file = NULL;
>      }
>  
> +    mis->listen_task_tag = 0;
>      qemu_event_destroy(&mis->main_thread_load_event);
>  }
>  
> @@ -265,25 +266,31 @@ int migrate_send_rp_req_pages(MigrationIncomingState *mis, const char *rbname,
>  void qemu_start_incoming_migration(const char *uri, Error **errp)
>  {
>      const char *p;
> +    guint task_tag = 0;
> +    MigrationIncomingState *mis = migration_incoming_get_current();
>  
>      qapi_event_send_migration(MIGRATION_STATUS_SETUP, &error_abort);
>      if (!strcmp(uri, "defer")) {
>          deferred_incoming_migration(errp);
>      } else if (strstart(uri, "tcp:", &p)) {
> -        tcp_start_incoming_migration(p, errp);
> +        task_tag = tcp_start_incoming_migration(p, errp);
>  #ifdef CONFIG_RDMA
>      } else if (strstart(uri, "rdma:", &p)) {
> +        /* TODO: store task tag for RDMA migrations */
>          rdma_start_incoming_migration(p, errp);
>  #endif
>      } else if (strstart(uri, "exec:", &p)) {
> -        exec_start_incoming_migration(p, errp);
> +        task_tag = exec_start_incoming_migration(p, errp);
>      } else if (strstart(uri, "unix:", &p)) {
> -        unix_start_incoming_migration(p, errp);
> +        task_tag = unix_start_incoming_migration(p, errp);
>      } else if (strstart(uri, "fd:", &p)) {
> -        fd_start_incoming_migration(p, errp);
> +        task_tag = fd_start_incoming_migration(p, errp);
>      } else {
>          error_setg(errp, "unknown migration protocol: %s", uri);
> +        return;
>      }
> +
> +    mis->listen_task_tag = task_tag;

This is unsafe as currently written. The main loop watches that are
associated with these tags are all unregistered when incoming migration
starts. So if you save them, and then unregister later when postcopy
is "stuck", then you're likely unregistering a tag associated with
some other random part of QEMU, because the saved value is no longer
valid.


Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [Qemu-devel] [RFC 5/6] migration: store listen task tag
  2017-08-15  8:37   ` Daniel P. Berrange
@ 2017-08-15  8:50     ` Peter Xu
  2017-08-15  9:27       ` Daniel P. Berrange
  0 siblings, 1 reply; 14+ messages in thread
From: Peter Xu @ 2017-08-15  8:50 UTC (permalink / raw)
  To: Daniel P. Berrange
  Cc: qemu-devel, Laurent Vivier, Juan Quintela, Dr . David Alan Gilbert

On Tue, Aug 15, 2017 at 09:37:14AM +0100, Daniel P. Berrange wrote:
> On Tue, Aug 15, 2017 at 02:17:06PM +0800, Peter Xu wrote:
> > Store the task tag for migration types: tcp/unix/fd/exec in current
> > MigrationIncomingState struct.
> > 
> > For defered migration, no need to store task tag since there is no task
> > running in the main loop at all. For RDMA, let's mark it as todo.
> > 
> > Signed-off-by: Peter Xu <peterx@redhat.com>
> > ---
> >  migration/migration.c | 22 ++++++++++++++++++----
> >  migration/migration.h |  2 ++
> >  2 files changed, 20 insertions(+), 4 deletions(-)
> > 
> > diff --git a/migration/migration.c b/migration/migration.c
> > index c9b7085..daf356b 100644
> > --- a/migration/migration.c
> > +++ b/migration/migration.c
> > @@ -171,6 +171,7 @@ void migration_incoming_state_destroy(void)
> >          mis->from_src_file = NULL;
> >      }
> >  
> > +    mis->listen_task_tag = 0;
> >      qemu_event_destroy(&mis->main_thread_load_event);
> >  }
> >  
> > @@ -265,25 +266,31 @@ int migrate_send_rp_req_pages(MigrationIncomingState *mis, const char *rbname,
> >  void qemu_start_incoming_migration(const char *uri, Error **errp)
> >  {
> >      const char *p;
> > +    guint task_tag = 0;
> > +    MigrationIncomingState *mis = migration_incoming_get_current();
> >  
> >      qapi_event_send_migration(MIGRATION_STATUS_SETUP, &error_abort);
> >      if (!strcmp(uri, "defer")) {
> >          deferred_incoming_migration(errp);
> >      } else if (strstart(uri, "tcp:", &p)) {
> > -        tcp_start_incoming_migration(p, errp);
> > +        task_tag = tcp_start_incoming_migration(p, errp);
> >  #ifdef CONFIG_RDMA
> >      } else if (strstart(uri, "rdma:", &p)) {
> > +        /* TODO: store task tag for RDMA migrations */
> >          rdma_start_incoming_migration(p, errp);
> >  #endif
> >      } else if (strstart(uri, "exec:", &p)) {
> > -        exec_start_incoming_migration(p, errp);
> > +        task_tag = exec_start_incoming_migration(p, errp);
> >      } else if (strstart(uri, "unix:", &p)) {
> > -        unix_start_incoming_migration(p, errp);
> > +        task_tag = unix_start_incoming_migration(p, errp);
> >      } else if (strstart(uri, "fd:", &p)) {
> > -        fd_start_incoming_migration(p, errp);
> > +        task_tag = fd_start_incoming_migration(p, errp);
> >      } else {
> >          error_setg(errp, "unknown migration protocol: %s", uri);
> > +        return;
> >      }
> > +
> > +    mis->listen_task_tag = task_tag;
> 
> This is unsafe as currently written. The main loop watches that are
> associated with these tags are all unregistered when incoming migration
> starts. So if you save them, and then unregister later when postcopy
> is "stuck", then you're likely unregistering a tag associated with
> some other random part of QEMU, because the saved value is no longer
> valid.

IIUC for incoming migration, the watch is not detached until
migration_fd_process_incoming() completes. So:

- for precopy, the watch should be detached when migration completes

- for postcopy, the watch should be detached when postcopy starts,
  then main loop thread returns, while the ram_load_thread starts to
  continue the migration.

So basically the watch is detaching itself after
migration_fd_process_incoming() completes. To handle that, this patch
has this change:

@@ -422,6 +429,13 @@ void migration_fd_process_incoming(QEMUFile *f)
         co = qemu_coroutine_create(process_incoming_migration_co, f);
         qemu_coroutine_enter(co);
     }
+
+    /*
+     * When reach here, we should not need the listening port any
+     * more. We'll detach the listening task soon, let's reset the
+     * listen task tag.
+     */
+    mis->listen_task_tag = 0;

Would this make sure the listen_task_tag is always valid if nonzero?

Thanks,

-- 
Peter Xu

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [Qemu-devel] [RFC 5/6] migration: store listen task tag
  2017-08-15  8:50     ` Peter Xu
@ 2017-08-15  9:27       ` Daniel P. Berrange
  2017-08-15  9:47         ` Peter Xu
  0 siblings, 1 reply; 14+ messages in thread
From: Daniel P. Berrange @ 2017-08-15  9:27 UTC (permalink / raw)
  To: Peter Xu
  Cc: qemu-devel, Laurent Vivier, Juan Quintela, Dr . David Alan Gilbert

On Tue, Aug 15, 2017 at 04:50:06PM +0800, Peter Xu wrote:
> On Tue, Aug 15, 2017 at 09:37:14AM +0100, Daniel P. Berrange wrote:
> > On Tue, Aug 15, 2017 at 02:17:06PM +0800, Peter Xu wrote:
> > > Store the task tag for migration types: tcp/unix/fd/exec in current
> > > MigrationIncomingState struct.
> > > 
> > > For defered migration, no need to store task tag since there is no task
> > > running in the main loop at all. For RDMA, let's mark it as todo.
> > > 
> > > Signed-off-by: Peter Xu <peterx@redhat.com>
> > > ---
> > >  migration/migration.c | 22 ++++++++++++++++++----
> > >  migration/migration.h |  2 ++
> > >  2 files changed, 20 insertions(+), 4 deletions(-)
> > > 
> > > diff --git a/migration/migration.c b/migration/migration.c
> > > index c9b7085..daf356b 100644
> > > --- a/migration/migration.c
> > > +++ b/migration/migration.c
> > > @@ -171,6 +171,7 @@ void migration_incoming_state_destroy(void)
> > >          mis->from_src_file = NULL;
> > >      }
> > >  
> > > +    mis->listen_task_tag = 0;
> > >      qemu_event_destroy(&mis->main_thread_load_event);
> > >  }
> > >  
> > > @@ -265,25 +266,31 @@ int migrate_send_rp_req_pages(MigrationIncomingState *mis, const char *rbname,
> > >  void qemu_start_incoming_migration(const char *uri, Error **errp)
> > >  {
> > >      const char *p;
> > > +    guint task_tag = 0;
> > > +    MigrationIncomingState *mis = migration_incoming_get_current();
> > >  
> > >      qapi_event_send_migration(MIGRATION_STATUS_SETUP, &error_abort);
> > >      if (!strcmp(uri, "defer")) {
> > >          deferred_incoming_migration(errp);
> > >      } else if (strstart(uri, "tcp:", &p)) {
> > > -        tcp_start_incoming_migration(p, errp);
> > > +        task_tag = tcp_start_incoming_migration(p, errp);
> > >  #ifdef CONFIG_RDMA
> > >      } else if (strstart(uri, "rdma:", &p)) {
> > > +        /* TODO: store task tag for RDMA migrations */
> > >          rdma_start_incoming_migration(p, errp);
> > >  #endif
> > >      } else if (strstart(uri, "exec:", &p)) {
> > > -        exec_start_incoming_migration(p, errp);
> > > +        task_tag = exec_start_incoming_migration(p, errp);
> > >      } else if (strstart(uri, "unix:", &p)) {
> > > -        unix_start_incoming_migration(p, errp);
> > > +        task_tag = unix_start_incoming_migration(p, errp);
> > >      } else if (strstart(uri, "fd:", &p)) {
> > > -        fd_start_incoming_migration(p, errp);
> > > +        task_tag = fd_start_incoming_migration(p, errp);
> > >      } else {
> > >          error_setg(errp, "unknown migration protocol: %s", uri);
> > > +        return;
> > >      }
> > > +
> > > +    mis->listen_task_tag = task_tag;
> > 
> > This is unsafe as currently written. The main loop watches that are
> > associated with these tags are all unregistered when incoming migration
> > starts. So if you save them, and then unregister later when postcopy
> > is "stuck", then you're likely unregistering a tag associated with
> > some other random part of QEMU, because the saved value is no longer
> > valid.
> 
> IIUC for incoming migration, the watch is not detached until
> migration_fd_process_incoming() completes. So:
> 
> - for precopy, the watch should be detached when migration completes
> 
> - for postcopy, the watch should be detached when postcopy starts,
>   then main loop thread returns, while the ram_load_thread starts to
>   continue the migration.
> 
> So basically the watch is detaching itself after
> migration_fd_process_incoming() completes. To handle that, this patch
> has this change:
> 
> @@ -422,6 +429,13 @@ void migration_fd_process_incoming(QEMUFile *f)
>          co = qemu_coroutine_create(process_incoming_migration_co, f);
>          qemu_coroutine_enter(co);
>      }
> +
> +    /*
> +     * When reach here, we should not need the listening port any
> +     * more. We'll detach the listening task soon, let's reset the
> +     * listen task tag.
> +     */
> +    mis->listen_task_tag = 0;
> 
> Would this make sure the listen_task_tag is always valid if nonzero?

Mixing 'return FALSE' from callbacks, with g_source_remove is a recipe
for hard to diagnose bugs, so should be avoided in general.

So, IMHO, you should change all the callbacks to 'return TRUE' so that
they keep the watch registered, and then explicitly call g_source_remove()
passing the listen tag, when incoming migration starts.

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [Qemu-devel] [RFC 5/6] migration: store listen task tag
  2017-08-15  9:27       ` Daniel P. Berrange
@ 2017-08-15  9:47         ` Peter Xu
  2017-08-16  9:47           ` Peter Xu
  0 siblings, 1 reply; 14+ messages in thread
From: Peter Xu @ 2017-08-15  9:47 UTC (permalink / raw)
  To: Daniel P. Berrange
  Cc: qemu-devel, Laurent Vivier, Juan Quintela, Dr . David Alan Gilbert

On Tue, Aug 15, 2017 at 10:27:07AM +0100, Daniel P. Berrange wrote:
> On Tue, Aug 15, 2017 at 04:50:06PM +0800, Peter Xu wrote:
> > On Tue, Aug 15, 2017 at 09:37:14AM +0100, Daniel P. Berrange wrote:
> > > On Tue, Aug 15, 2017 at 02:17:06PM +0800, Peter Xu wrote:
> > > > Store the task tag for migration types: tcp/unix/fd/exec in current
> > > > MigrationIncomingState struct.
> > > > 
> > > > For defered migration, no need to store task tag since there is no task
> > > > running in the main loop at all. For RDMA, let's mark it as todo.
> > > > 
> > > > Signed-off-by: Peter Xu <peterx@redhat.com>
> > > > ---
> > > >  migration/migration.c | 22 ++++++++++++++++++----
> > > >  migration/migration.h |  2 ++
> > > >  2 files changed, 20 insertions(+), 4 deletions(-)
> > > > 
> > > > diff --git a/migration/migration.c b/migration/migration.c
> > > > index c9b7085..daf356b 100644
> > > > --- a/migration/migration.c
> > > > +++ b/migration/migration.c
> > > > @@ -171,6 +171,7 @@ void migration_incoming_state_destroy(void)
> > > >          mis->from_src_file = NULL;
> > > >      }
> > > >  
> > > > +    mis->listen_task_tag = 0;
> > > >      qemu_event_destroy(&mis->main_thread_load_event);
> > > >  }
> > > >  
> > > > @@ -265,25 +266,31 @@ int migrate_send_rp_req_pages(MigrationIncomingState *mis, const char *rbname,
> > > >  void qemu_start_incoming_migration(const char *uri, Error **errp)
> > > >  {
> > > >      const char *p;
> > > > +    guint task_tag = 0;
> > > > +    MigrationIncomingState *mis = migration_incoming_get_current();
> > > >  
> > > >      qapi_event_send_migration(MIGRATION_STATUS_SETUP, &error_abort);
> > > >      if (!strcmp(uri, "defer")) {
> > > >          deferred_incoming_migration(errp);
> > > >      } else if (strstart(uri, "tcp:", &p)) {
> > > > -        tcp_start_incoming_migration(p, errp);
> > > > +        task_tag = tcp_start_incoming_migration(p, errp);
> > > >  #ifdef CONFIG_RDMA
> > > >      } else if (strstart(uri, "rdma:", &p)) {
> > > > +        /* TODO: store task tag for RDMA migrations */
> > > >          rdma_start_incoming_migration(p, errp);
> > > >  #endif
> > > >      } else if (strstart(uri, "exec:", &p)) {
> > > > -        exec_start_incoming_migration(p, errp);
> > > > +        task_tag = exec_start_incoming_migration(p, errp);
> > > >      } else if (strstart(uri, "unix:", &p)) {
> > > > -        unix_start_incoming_migration(p, errp);
> > > > +        task_tag = unix_start_incoming_migration(p, errp);
> > > >      } else if (strstart(uri, "fd:", &p)) {
> > > > -        fd_start_incoming_migration(p, errp);
> > > > +        task_tag = fd_start_incoming_migration(p, errp);
> > > >      } else {
> > > >          error_setg(errp, "unknown migration protocol: %s", uri);
> > > > +        return;
> > > >      }
> > > > +
> > > > +    mis->listen_task_tag = task_tag;
> > > 
> > > This is unsafe as currently written. The main loop watches that are
> > > associated with these tags are all unregistered when incoming migration
> > > starts. So if you save them, and then unregister later when postcopy
> > > is "stuck", then you're likely unregistering a tag associated with
> > > some other random part of QEMU, because the saved value is no longer
> > > valid.
> > 
> > IIUC for incoming migration, the watch is not detached until
> > migration_fd_process_incoming() completes. So:
> > 
> > - for precopy, the watch should be detached when migration completes
> > 
> > - for postcopy, the watch should be detached when postcopy starts,
> >   then main loop thread returns, while the ram_load_thread starts to
> >   continue the migration.
> > 
> > So basically the watch is detaching itself after
> > migration_fd_process_incoming() completes. To handle that, this patch
> > has this change:
> > 
> > @@ -422,6 +429,13 @@ void migration_fd_process_incoming(QEMUFile *f)
> >          co = qemu_coroutine_create(process_incoming_migration_co, f);
> >          qemu_coroutine_enter(co);
> >      }
> > +
> > +    /*
> > +     * When reach here, we should not need the listening port any
> > +     * more. We'll detach the listening task soon, let's reset the
> > +     * listen task tag.
> > +     */
> > +    mis->listen_task_tag = 0;
> > 
> > Would this make sure the listen_task_tag is always valid if nonzero?
> 
> Mixing 'return FALSE' from callbacks, with g_source_remove is a recipe
> for hard to diagnose bugs, so should be avoided in general.

Makes sense.

> 
> So, IMHO, you should change all the callbacks to 'return TRUE' so that
> they keep the watch registered, and then explicitly call g_source_remove()
> passing the listen tag, when incoming migration starts.

Yes this sounds better.  Thanks!

-- 
Peter Xu

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [Qemu-devel] [RFC 5/6] migration: store listen task tag
  2017-08-15  9:47         ` Peter Xu
@ 2017-08-16  9:47           ` Peter Xu
  2017-08-29 10:38             ` Daniel P. Berrange
  0 siblings, 1 reply; 14+ messages in thread
From: Peter Xu @ 2017-08-16  9:47 UTC (permalink / raw)
  To: Daniel P. Berrange
  Cc: qemu-devel, Laurent Vivier, Juan Quintela, Dr . David Alan Gilbert

On Tue, Aug 15, 2017 at 05:47:08PM +0800, Peter Xu wrote:
> On Tue, Aug 15, 2017 at 10:27:07AM +0100, Daniel P. Berrange wrote:
> > On Tue, Aug 15, 2017 at 04:50:06PM +0800, Peter Xu wrote:
> > > On Tue, Aug 15, 2017 at 09:37:14AM +0100, Daniel P. Berrange wrote:
> > > > On Tue, Aug 15, 2017 at 02:17:06PM +0800, Peter Xu wrote:
> > > > > Store the task tag for migration types: tcp/unix/fd/exec in current
> > > > > MigrationIncomingState struct.
> > > > > 
> > > > > For defered migration, no need to store task tag since there is no task
> > > > > running in the main loop at all. For RDMA, let's mark it as todo.
> > > > > 
> > > > > Signed-off-by: Peter Xu <peterx@redhat.com>
> > > > > ---
> > > > >  migration/migration.c | 22 ++++++++++++++++++----
> > > > >  migration/migration.h |  2 ++
> > > > >  2 files changed, 20 insertions(+), 4 deletions(-)
> > > > > 
> > > > > diff --git a/migration/migration.c b/migration/migration.c
> > > > > index c9b7085..daf356b 100644
> > > > > --- a/migration/migration.c
> > > > > +++ b/migration/migration.c
> > > > > @@ -171,6 +171,7 @@ void migration_incoming_state_destroy(void)
> > > > >          mis->from_src_file = NULL;
> > > > >      }
> > > > >  
> > > > > +    mis->listen_task_tag = 0;
> > > > >      qemu_event_destroy(&mis->main_thread_load_event);
> > > > >  }
> > > > >  
> > > > > @@ -265,25 +266,31 @@ int migrate_send_rp_req_pages(MigrationIncomingState *mis, const char *rbname,
> > > > >  void qemu_start_incoming_migration(const char *uri, Error **errp)
> > > > >  {
> > > > >      const char *p;
> > > > > +    guint task_tag = 0;
> > > > > +    MigrationIncomingState *mis = migration_incoming_get_current();
> > > > >  
> > > > >      qapi_event_send_migration(MIGRATION_STATUS_SETUP, &error_abort);
> > > > >      if (!strcmp(uri, "defer")) {
> > > > >          deferred_incoming_migration(errp);
> > > > >      } else if (strstart(uri, "tcp:", &p)) {
> > > > > -        tcp_start_incoming_migration(p, errp);
> > > > > +        task_tag = tcp_start_incoming_migration(p, errp);
> > > > >  #ifdef CONFIG_RDMA
> > > > >      } else if (strstart(uri, "rdma:", &p)) {
> > > > > +        /* TODO: store task tag for RDMA migrations */
> > > > >          rdma_start_incoming_migration(p, errp);
> > > > >  #endif
> > > > >      } else if (strstart(uri, "exec:", &p)) {
> > > > > -        exec_start_incoming_migration(p, errp);
> > > > > +        task_tag = exec_start_incoming_migration(p, errp);
> > > > >      } else if (strstart(uri, "unix:", &p)) {
> > > > > -        unix_start_incoming_migration(p, errp);
> > > > > +        task_tag = unix_start_incoming_migration(p, errp);
> > > > >      } else if (strstart(uri, "fd:", &p)) {
> > > > > -        fd_start_incoming_migration(p, errp);
> > > > > +        task_tag = fd_start_incoming_migration(p, errp);
> > > > >      } else {
> > > > >          error_setg(errp, "unknown migration protocol: %s", uri);
> > > > > +        return;
> > > > >      }
> > > > > +
> > > > > +    mis->listen_task_tag = task_tag;
> > > > 
> > > > This is unsafe as currently written. The main loop watches that are
> > > > associated with these tags are all unregistered when incoming migration
> > > > starts. So if you save them, and then unregister later when postcopy
> > > > is "stuck", then you're likely unregistering a tag associated with
> > > > some other random part of QEMU, because the saved value is no longer
> > > > valid.
> > > 
> > > IIUC for incoming migration, the watch is not detached until
> > > migration_fd_process_incoming() completes. So:
> > > 
> > > - for precopy, the watch should be detached when migration completes
> > > 
> > > - for postcopy, the watch should be detached when postcopy starts,
> > >   then main loop thread returns, while the ram_load_thread starts to
> > >   continue the migration.
> > > 
> > > So basically the watch is detaching itself after
> > > migration_fd_process_incoming() completes. To handle that, this patch
> > > has this change:
> > > 
> > > @@ -422,6 +429,13 @@ void migration_fd_process_incoming(QEMUFile *f)
> > >          co = qemu_coroutine_create(process_incoming_migration_co, f);
> > >          qemu_coroutine_enter(co);
> > >      }
> > > +
> > > +    /*
> > > +     * When reach here, we should not need the listening port any
> > > +     * more. We'll detach the listening task soon, let's reset the
> > > +     * listen task tag.
> > > +     */
> > > +    mis->listen_task_tag = 0;
> > > 
> > > Would this make sure the listen_task_tag is always valid if nonzero?
> > 
> > Mixing 'return FALSE' from callbacks, with g_source_remove is a recipe
> > for hard to diagnose bugs, so should be avoided in general.
> 
> Makes sense.
> 
> > 
> > So, IMHO, you should change all the callbacks to 'return TRUE' so that
> > they keep the watch registered, and then explicitly call g_source_remove()
> > passing the listen tag, when incoming migration starts.
> 
> Yes this sounds better.  Thanks!

Hmm... when incoming migration starts, we are still in the handler of
the listening gsource. I am not sure whether it's safe to remove the
gsource in its own handlers.

Would this mean I should still keep current way to do it? After all
although we may have two places to detach the gsource (either return
FALSE in its handler, or we call g_source_detach), we are safe since
both of them are in main loop context.

Thanks,

-- 
Peter Xu

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [Qemu-devel] [RFC 5/6] migration: store listen task tag
  2017-08-16  9:47           ` Peter Xu
@ 2017-08-29 10:38             ` Daniel P. Berrange
  2017-08-30  7:38               ` Peter Xu
  0 siblings, 1 reply; 14+ messages in thread
From: Daniel P. Berrange @ 2017-08-29 10:38 UTC (permalink / raw)
  To: Peter Xu
  Cc: qemu-devel, Laurent Vivier, Juan Quintela, Dr . David Alan Gilbert

On Wed, Aug 16, 2017 at 05:47:13PM +0800, Peter Xu wrote:
> On Tue, Aug 15, 2017 at 05:47:08PM +0800, Peter Xu wrote:
> > On Tue, Aug 15, 2017 at 10:27:07AM +0100, Daniel P. Berrange wrote:
> > > On Tue, Aug 15, 2017 at 04:50:06PM +0800, Peter Xu wrote:
> > > > On Tue, Aug 15, 2017 at 09:37:14AM +0100, Daniel P. Berrange wrote:
> > > > > On Tue, Aug 15, 2017 at 02:17:06PM +0800, Peter Xu wrote:
> > > > > > Store the task tag for migration types: tcp/unix/fd/exec in current
> > > > > > MigrationIncomingState struct.
> > > > > > 
> > > > > > For defered migration, no need to store task tag since there is no task
> > > > > > running in the main loop at all. For RDMA, let's mark it as todo.
> > > > > > 
> > > > > > Signed-off-by: Peter Xu <peterx@redhat.com>
> > > > > > ---
> > > > > >  migration/migration.c | 22 ++++++++++++++++++----
> > > > > >  migration/migration.h |  2 ++
> > > > > >  2 files changed, 20 insertions(+), 4 deletions(-)
> > > > > > 
> > > > > > diff --git a/migration/migration.c b/migration/migration.c
> > > > > > index c9b7085..daf356b 100644
> > > > > > --- a/migration/migration.c
> > > > > > +++ b/migration/migration.c
> > > > > > @@ -171,6 +171,7 @@ void migration_incoming_state_destroy(void)
> > > > > >          mis->from_src_file = NULL;
> > > > > >      }
> > > > > >  
> > > > > > +    mis->listen_task_tag = 0;
> > > > > >      qemu_event_destroy(&mis->main_thread_load_event);
> > > > > >  }
> > > > > >  
> > > > > > @@ -265,25 +266,31 @@ int migrate_send_rp_req_pages(MigrationIncomingState *mis, const char *rbname,
> > > > > >  void qemu_start_incoming_migration(const char *uri, Error **errp)
> > > > > >  {
> > > > > >      const char *p;
> > > > > > +    guint task_tag = 0;
> > > > > > +    MigrationIncomingState *mis = migration_incoming_get_current();
> > > > > >  
> > > > > >      qapi_event_send_migration(MIGRATION_STATUS_SETUP, &error_abort);
> > > > > >      if (!strcmp(uri, "defer")) {
> > > > > >          deferred_incoming_migration(errp);
> > > > > >      } else if (strstart(uri, "tcp:", &p)) {
> > > > > > -        tcp_start_incoming_migration(p, errp);
> > > > > > +        task_tag = tcp_start_incoming_migration(p, errp);
> > > > > >  #ifdef CONFIG_RDMA
> > > > > >      } else if (strstart(uri, "rdma:", &p)) {
> > > > > > +        /* TODO: store task tag for RDMA migrations */
> > > > > >          rdma_start_incoming_migration(p, errp);
> > > > > >  #endif
> > > > > >      } else if (strstart(uri, "exec:", &p)) {
> > > > > > -        exec_start_incoming_migration(p, errp);
> > > > > > +        task_tag = exec_start_incoming_migration(p, errp);
> > > > > >      } else if (strstart(uri, "unix:", &p)) {
> > > > > > -        unix_start_incoming_migration(p, errp);
> > > > > > +        task_tag = unix_start_incoming_migration(p, errp);
> > > > > >      } else if (strstart(uri, "fd:", &p)) {
> > > > > > -        fd_start_incoming_migration(p, errp);
> > > > > > +        task_tag = fd_start_incoming_migration(p, errp);
> > > > > >      } else {
> > > > > >          error_setg(errp, "unknown migration protocol: %s", uri);
> > > > > > +        return;
> > > > > >      }
> > > > > > +
> > > > > > +    mis->listen_task_tag = task_tag;
> > > > > 
> > > > > This is unsafe as currently written. The main loop watches that are
> > > > > associated with these tags are all unregistered when incoming migration
> > > > > starts. So if you save them, and then unregister later when postcopy
> > > > > is "stuck", then you're likely unregistering a tag associated with
> > > > > some other random part of QEMU, because the saved value is no longer
> > > > > valid.
> > > > 
> > > > IIUC for incoming migration, the watch is not detached until
> > > > migration_fd_process_incoming() completes. So:
> > > > 
> > > > - for precopy, the watch should be detached when migration completes
> > > > 
> > > > - for postcopy, the watch should be detached when postcopy starts,
> > > >   then main loop thread returns, while the ram_load_thread starts to
> > > >   continue the migration.
> > > > 
> > > > So basically the watch is detaching itself after
> > > > migration_fd_process_incoming() completes. To handle that, this patch
> > > > has this change:
> > > > 
> > > > @@ -422,6 +429,13 @@ void migration_fd_process_incoming(QEMUFile *f)
> > > >          co = qemu_coroutine_create(process_incoming_migration_co, f);
> > > >          qemu_coroutine_enter(co);
> > > >      }
> > > > +
> > > > +    /*
> > > > +     * When reach here, we should not need the listening port any
> > > > +     * more. We'll detach the listening task soon, let's reset the
> > > > +     * listen task tag.
> > > > +     */
> > > > +    mis->listen_task_tag = 0;
> > > > 
> > > > Would this make sure the listen_task_tag is always valid if nonzero?
> > > 
> > > Mixing 'return FALSE' from callbacks, with g_source_remove is a recipe
> > > for hard to diagnose bugs, so should be avoided in general.
> > 
> > Makes sense.
> > 
> > > 
> > > So, IMHO, you should change all the callbacks to 'return TRUE' so that
> > > they keep the watch registered, and then explicitly call g_source_remove()
> > > passing the listen tag, when incoming migration starts.
> > 
> > Yes this sounds better.  Thanks!
> 
> Hmm... when incoming migration starts, we are still in the handler of
> the listening gsource. I am not sure whether it's safe to remove the
> gsource in its own handlers.

Yes, it should be safe to call g_source_remove from within the handler
itself.

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [Qemu-devel] [RFC 5/6] migration: store listen task tag
  2017-08-29 10:38             ` Daniel P. Berrange
@ 2017-08-30  7:38               ` Peter Xu
  0 siblings, 0 replies; 14+ messages in thread
From: Peter Xu @ 2017-08-30  7:38 UTC (permalink / raw)
  To: Daniel P. Berrange
  Cc: qemu-devel, Laurent Vivier, Juan Quintela, Dr . David Alan Gilbert

On Tue, Aug 29, 2017 at 11:38:31AM +0100, Daniel P. Berrange wrote:
> > > > So, IMHO, you should change all the callbacks to 'return TRUE' so that
> > > > they keep the watch registered, and then explicitly call g_source_remove()
> > > > passing the listen tag, when incoming migration starts.
> > > 
> > > Yes this sounds better.  Thanks!
> > 
> > Hmm... when incoming migration starts, we are still in the handler of
> > the listening gsource. I am not sure whether it's safe to remove the
> > gsource in its own handlers.
> 
> Yes, it should be safe to call g_source_remove from within the handler
> itself.

Thanks for confirmation.  Will post a new version soon.

-- 
Peter Xu

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2017-08-30  7:38 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-08-15  6:17 [Qemu-devel] [RFC 0/6] migration: re-use migrate_incoming for postcopy recovery Peter Xu
2017-08-15  6:17 ` [Qemu-devel] [RFC 1/6] migration: free SocketAddress where allocated Peter Xu
2017-08-15  6:17 ` [Qemu-devel] [RFC 2/6] migration: return incoming task tag for sockets Peter Xu
2017-08-15  6:17 ` [Qemu-devel] [RFC 3/6] migration: return incoming task tag for exec Peter Xu
2017-08-15  6:17 ` [Qemu-devel] [RFC 4/6] migration: return incoming task tag for fd Peter Xu
2017-08-15  6:17 ` [Qemu-devel] [RFC 5/6] migration: store listen task tag Peter Xu
2017-08-15  8:37   ` Daniel P. Berrange
2017-08-15  8:50     ` Peter Xu
2017-08-15  9:27       ` Daniel P. Berrange
2017-08-15  9:47         ` Peter Xu
2017-08-16  9:47           ` Peter Xu
2017-08-29 10:38             ` Daniel P. Berrange
2017-08-30  7:38               ` Peter Xu
2017-08-15  6:17 ` [Qemu-devel] [RFC 6/6] migration: allow migrate_incoming for paused VM Peter Xu

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.