* [Qemu-devel] [PATCH v3 0/6] Enable postcopy RDMA live migration
@ 2018-05-05 14:35 Lidong Chen
  2018-05-05 14:35 ` [Qemu-devel] [PATCH v3 1/6] migration: disable RDMA WRITE after postcopy started Lidong Chen
                   ` (5 more replies)
  0 siblings, 6 replies; 13+ messages in thread
From: Lidong Chen @ 2018-05-05 14:35 UTC (permalink / raw)
  To: quintela, dgilbert, berrange
  Cc: qemu-devel, galsha, aviadye, adido, Lidong Chen

The RDMA QIOChannel does not support bi-directional communication, so when an
RDMA live migration runs with postcopy enabled, the return path on the source
qemu gets a QEMUFile error.

These patches implement bi-directional communication for the RDMA QIOChannel
and disable RDMA WRITE during the postcopy phase.

This series just makes postcopy work; performance will be improved later.

[v3]
 - add a mutex to the QEMUFile struct to avoid concurrent channel close (Daniel)
 - destroy the mutex before freeing the QEMUFile (David)
 - use rdmain and rdmaout instead of rdma->return_path (Daniel)

[v2]
 - do not update bytes_xfer when RDMA WRITE is disabled (David)
 - implement bi-directional communication for RDMA QIOChannel (Daniel)


Lidong Chen (6):
  migration: disable RDMA WRITE after postcopy started
  migration: create a dedicated connection for rdma return path
  migration: remove unnecessary variables len in QIOChannelRDMA
  migration: avoid concurrent invoke channel_close by different threads
  migration: implement bi-directional RDMA QIOChannel
  migration: Stop rdma yielding during incoming postcopy

 migration/colo.c         |   2 +
 migration/migration.c    |   2 +
 migration/postcopy-ram.c |   2 +
 migration/qemu-file.c    |  13 +-
 migration/ram.c          |   4 +
 migration/rdma.c         | 321 +++++++++++++++++++++++++++++++++++++++++------
 migration/savevm.c       |   3 +
 7 files changed, 307 insertions(+), 40 deletions(-)

-- 
1.8.3.1


* [Qemu-devel] [PATCH v3 1/6] migration: disable RDMA WRITE after postcopy started
  2018-05-05 14:35 [Qemu-devel] [PATCH v3 0/6] Enable postcopy RDMA live migration Lidong Chen
@ 2018-05-05 14:35 ` Lidong Chen
  2018-05-05 14:35 ` [Qemu-devel] [PATCH v3 2/6] migration: create a dedicated connection for rdma return path Lidong Chen
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 13+ messages in thread
From: Lidong Chen @ 2018-05-05 14:35 UTC (permalink / raw)
  To: quintela, dgilbert, berrange
  Cc: qemu-devel, galsha, aviadye, adido, Lidong Chen

RDMA WRITE operations are performed with no notification to the destination
qemu, so the destination qemu cannot wake up. This patch disables RDMA WRITE
after postcopy has started.

Signed-off-by: Lidong Chen <lidongchen@tencent.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
---
 migration/qemu-file.c |  8 ++++++--
 migration/rdma.c      | 12 ++++++++++++
 2 files changed, 18 insertions(+), 2 deletions(-)

diff --git a/migration/qemu-file.c b/migration/qemu-file.c
index 0463f4c..977b9ae 100644
--- a/migration/qemu-file.c
+++ b/migration/qemu-file.c
@@ -253,8 +253,12 @@ size_t ram_control_save_page(QEMUFile *f, ram_addr_t block_offset,
     if (f->hooks && f->hooks->save_page) {
         int ret = f->hooks->save_page(f, f->opaque, block_offset,
                                       offset, size, bytes_sent);
-        f->bytes_xfer += size;
-        if (ret != RAM_SAVE_CONTROL_DELAYED) {
+        if (ret != RAM_SAVE_CONTROL_NOT_SUPP) {
+            f->bytes_xfer += size;
+        }
+
+        if (ret != RAM_SAVE_CONTROL_DELAYED &&
+            ret != RAM_SAVE_CONTROL_NOT_SUPP) {
             if (bytes_sent && *bytes_sent > 0) {
                 qemu_update_position(f, *bytes_sent);
             } else if (ret < 0) {
diff --git a/migration/rdma.c b/migration/rdma.c
index da474fc..a22be43 100644
--- a/migration/rdma.c
+++ b/migration/rdma.c
@@ -2927,6 +2927,10 @@ static size_t qemu_rdma_save_page(QEMUFile *f, void *opaque,
 
     CHECK_ERROR_STATE();
 
+    if (migrate_get_current()->state == MIGRATION_STATUS_POSTCOPY_ACTIVE) {
+        return RAM_SAVE_CONTROL_NOT_SUPP;
+    }
+
     qemu_fflush(f);
 
     if (size > 0) {
@@ -3482,6 +3486,10 @@ static int qemu_rdma_registration_start(QEMUFile *f, void *opaque,
 
     CHECK_ERROR_STATE();
 
+    if (migrate_get_current()->state == MIGRATION_STATUS_POSTCOPY_ACTIVE) {
+        return 0;
+    }
+
     trace_qemu_rdma_registration_start(flags);
     qemu_put_be64(f, RAM_SAVE_FLAG_HOOK);
     qemu_fflush(f);
@@ -3504,6 +3512,10 @@ static int qemu_rdma_registration_stop(QEMUFile *f, void *opaque,
 
     CHECK_ERROR_STATE();
 
+    if (migrate_get_current()->state == MIGRATION_STATUS_POSTCOPY_ACTIVE) {
+        return 0;
+    }
+
     qemu_fflush(f);
     ret = qemu_rdma_drain_cq(f, rdma);
 
-- 
1.8.3.1


* [Qemu-devel] [PATCH v3 2/6] migration: create a dedicated connection for rdma return path
  2018-05-05 14:35 [Qemu-devel] [PATCH v3 0/6] Enable postcopy RDMA live migration Lidong Chen
  2018-05-05 14:35 ` [Qemu-devel] [PATCH v3 1/6] migration: disable RDMA WRITE after postcopy started Lidong Chen
@ 2018-05-05 14:35 ` Lidong Chen
  2018-05-05 14:35 ` [Qemu-devel] [PATCH v3 3/6] migration: remove unnecessary variables len in QIOChannelRDMA Lidong Chen
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 13+ messages in thread
From: Lidong Chen @ 2018-05-05 14:35 UTC (permalink / raw)
  To: quintela, dgilbert, berrange
  Cc: qemu-devel, galsha, aviadye, adido, Lidong Chen

If an RDMA migration is started with postcopy enabled, the source qemu
establishes a dedicated connection for the return path.

Signed-off-by: Lidong Chen <lidongchen@tencent.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
---
 migration/rdma.c | 94 ++++++++++++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 91 insertions(+), 3 deletions(-)

diff --git a/migration/rdma.c b/migration/rdma.c
index a22be43..c745427 100644
--- a/migration/rdma.c
+++ b/migration/rdma.c
@@ -387,6 +387,10 @@ typedef struct RDMAContext {
     uint64_t unregistrations[RDMA_SIGNALED_SEND_MAX];
 
     GHashTable *blockmap;
+
+    /* the RDMAContext for return path */
+    struct RDMAContext *return_path;
+    bool is_return_path;
 } RDMAContext;
 
 #define TYPE_QIO_CHANNEL_RDMA "qio-channel-rdma"
@@ -2329,10 +2333,22 @@ static void qemu_rdma_cleanup(RDMAContext *rdma)
         rdma_destroy_id(rdma->cm_id);
         rdma->cm_id = NULL;
     }
+
+    /* on the destination side, listen_id and channel are shared */
     if (rdma->listen_id) {
-        rdma_destroy_id(rdma->listen_id);
+        if (!rdma->is_return_path) {
+            rdma_destroy_id(rdma->listen_id);
+        }
         rdma->listen_id = NULL;
+
+        if (rdma->channel) {
+            if (!rdma->is_return_path) {
+                rdma_destroy_event_channel(rdma->channel);
+            }
+            rdma->channel = NULL;
+        }
     }
+
     if (rdma->channel) {
         rdma_destroy_event_channel(rdma->channel);
         rdma->channel = NULL;
@@ -2561,6 +2577,25 @@ err_dest_init_create_listen_id:
 
 }
 
+static void qemu_rdma_return_path_dest_init(RDMAContext *rdma_return_path,
+                                            RDMAContext *rdma)
+{
+    int idx;
+
+    for (idx = 0; idx < RDMA_WRID_MAX; idx++) {
+        rdma_return_path->wr_data[idx].control_len = 0;
+        rdma_return_path->wr_data[idx].control_curr = NULL;
+    }
+
+    /* the CM channel and CM id are shared */
+    rdma_return_path->channel = rdma->channel;
+    rdma_return_path->listen_id = rdma->listen_id;
+
+    rdma->return_path = rdma_return_path;
+    rdma_return_path->return_path = rdma;
+    rdma_return_path->is_return_path = true;
+}
+
 static void *qemu_rdma_data_init(const char *host_port, Error **errp)
 {
     RDMAContext *rdma = NULL;
@@ -3018,6 +3053,8 @@ err:
     return ret;
 }
 
+static void rdma_accept_incoming_migration(void *opaque);
+
 static int qemu_rdma_accept(RDMAContext *rdma)
 {
     RDMACapabilities cap;
@@ -3112,7 +3149,14 @@ static int qemu_rdma_accept(RDMAContext *rdma)
         }
     }
 
-    qemu_set_fd_handler(rdma->channel->fd, NULL, NULL, NULL);
+    /* Accept the second connection request for return path */
+    if (migrate_postcopy() && !rdma->is_return_path) {
+        qemu_set_fd_handler(rdma->channel->fd, rdma_accept_incoming_migration,
+                            NULL,
+                            (void *)(intptr_t)rdma->return_path);
+    } else {
+        qemu_set_fd_handler(rdma->channel->fd, NULL, NULL, NULL);
+    }
 
     ret = rdma_accept(rdma->cm_id, &conn_param);
     if (ret) {
@@ -3693,6 +3737,10 @@ static void rdma_accept_incoming_migration(void *opaque)
 
     trace_qemu_rdma_accept_incoming_migration_accepted();
 
+    if (rdma->is_return_path) {
+        return;
+    }
+
     f = qemu_fopen_rdma(rdma, "rb");
     if (f == NULL) {
         ERROR(errp, "could not qemu_fopen_rdma!");
@@ -3707,7 +3755,7 @@ static void rdma_accept_incoming_migration(void *opaque)
 void rdma_start_incoming_migration(const char *host_port, Error **errp)
 {
     int ret;
-    RDMAContext *rdma;
+    RDMAContext *rdma, *rdma_return_path;
     Error *local_err = NULL;
 
     trace_rdma_start_incoming_migration();
@@ -3734,12 +3782,24 @@ void rdma_start_incoming_migration(const char *host_port, Error **errp)
 
     trace_rdma_start_incoming_migration_after_rdma_listen();
 
+    /* initialize the RDMAContext for return path */
+    if (migrate_postcopy()) {
+        rdma_return_path = qemu_rdma_data_init(host_port, &local_err);
+
+        if (rdma_return_path == NULL) {
+            goto err;
+        }
+
+        qemu_rdma_return_path_dest_init(rdma_return_path, rdma);
+    }
+
     qemu_set_fd_handler(rdma->channel->fd, rdma_accept_incoming_migration,
                         NULL, (void *)(intptr_t)rdma);
     return;
 err:
     error_propagate(errp, local_err);
     g_free(rdma);
+    g_free(rdma_return_path);
 }
 
 void rdma_start_outgoing_migration(void *opaque,
@@ -3747,6 +3807,7 @@ void rdma_start_outgoing_migration(void *opaque,
 {
     MigrationState *s = opaque;
     RDMAContext *rdma = qemu_rdma_data_init(host_port, errp);
+    RDMAContext *rdma_return_path = NULL;
     int ret = 0;
 
     if (rdma == NULL) {
@@ -3767,6 +3828,32 @@ void rdma_start_outgoing_migration(void *opaque,
         goto err;
     }
 
+    /* RDMA postcopy needs a separate queue pair for the return path */
+    if (migrate_postcopy()) {
+        rdma_return_path = qemu_rdma_data_init(host_port, errp);
+
+        if (rdma_return_path == NULL) {
+            goto err;
+        }
+
+        ret = qemu_rdma_source_init(rdma_return_path,
+            s->enabled_capabilities[MIGRATION_CAPABILITY_RDMA_PIN_ALL], errp);
+
+        if (ret) {
+            goto err;
+        }
+
+        ret = qemu_rdma_connect(rdma_return_path, errp);
+
+        if (ret) {
+            goto err;
+        }
+
+        rdma->return_path = rdma_return_path;
+        rdma_return_path->return_path = rdma;
+        rdma_return_path->is_return_path = true;
+    }
+
     trace_rdma_start_outgoing_migration_after_rdma_connect();
 
     s->to_dst_file = qemu_fopen_rdma(rdma, "wb");
@@ -3774,4 +3861,5 @@ void rdma_start_outgoing_migration(void *opaque,
     return;
 err:
     g_free(rdma);
+    g_free(rdma_return_path);
 }
-- 
1.8.3.1


* [Qemu-devel] [PATCH v3 3/6] migration: remove unnecessary variables len in QIOChannelRDMA
  2018-05-05 14:35 [Qemu-devel] [PATCH v3 0/6] Enable postcopy RDMA live migration Lidong Chen
  2018-05-05 14:35 ` [Qemu-devel] [PATCH v3 1/6] migration: disable RDMA WRITE after postcopy started Lidong Chen
  2018-05-05 14:35 ` [Qemu-devel] [PATCH v3 2/6] migration: create a dedicated connection for rdma return path Lidong Chen
@ 2018-05-05 14:35 ` Lidong Chen
  2018-05-08 14:19   ` Dr. David Alan Gilbert
  2018-05-05 14:35 ` [Qemu-devel] [PATCH v3 4/6] migration: avoid concurrent invoke channel_close by different threads Lidong Chen
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 13+ messages in thread
From: Lidong Chen @ 2018-05-05 14:35 UTC (permalink / raw)
  To: quintela, dgilbert, berrange
  Cc: qemu-devel, galsha, aviadye, adido, Lidong Chen

Because qio_channel_rdma_writev and qio_channel_rdma_readv may be invoked
by different threads concurrently, this patch removes the unnecessary variable
len from QIOChannelRDMA and uses a local variable instead.

Signed-off-by: Lidong Chen <lidongchen@tencent.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Daniel P. Berrangéberrange@redhat.com>
---
 migration/rdma.c | 15 +++++++--------
 1 file changed, 7 insertions(+), 8 deletions(-)

diff --git a/migration/rdma.c b/migration/rdma.c
index c745427..f5c1d02 100644
--- a/migration/rdma.c
+++ b/migration/rdma.c
@@ -404,7 +404,6 @@ struct QIOChannelRDMA {
     QIOChannel parent;
     RDMAContext *rdma;
     QEMUFile *file;
-    size_t len;
     bool blocking; /* XXX we don't actually honour this yet */
 };
 
@@ -2640,6 +2639,7 @@ static ssize_t qio_channel_rdma_writev(QIOChannel *ioc,
     int ret;
     ssize_t done = 0;
     size_t i;
+    size_t len = 0;
 
     CHECK_ERROR_STATE();
 
@@ -2659,10 +2659,10 @@ static ssize_t qio_channel_rdma_writev(QIOChannel *ioc,
         while (remaining) {
             RDMAControlHeader head;
 
-            rioc->len = MIN(remaining, RDMA_SEND_INCREMENT);
-            remaining -= rioc->len;
+            len = MIN(remaining, RDMA_SEND_INCREMENT);
+            remaining -= len;
 
-            head.len = rioc->len;
+            head.len = len;
             head.type = RDMA_CONTROL_QEMU_FILE;
 
             ret = qemu_rdma_exchange_send(rdma, &head, data, NULL, NULL, NULL);
@@ -2672,8 +2672,8 @@ static ssize_t qio_channel_rdma_writev(QIOChannel *ioc,
                 return ret;
             }
 
-            data += rioc->len;
-            done += rioc->len;
+            data += len;
+            done += len;
         }
     }
 
@@ -2768,8 +2768,7 @@ static ssize_t qio_channel_rdma_readv(QIOChannel *ioc,
             }
         }
     }
-    rioc->len = done;
-    return rioc->len;
+    return done;
 }
 
 /*
-- 
1.8.3.1


* [Qemu-devel] [PATCH v3 4/6] migration: avoid concurrent invoke channel_close by different threads
  2018-05-05 14:35 [Qemu-devel] [PATCH v3 0/6] Enable postcopy RDMA live migration Lidong Chen
                   ` (2 preceding siblings ...)
  2018-05-05 14:35 ` [Qemu-devel] [PATCH v3 3/6] migration: remove unnecessary variables len in QIOChannelRDMA Lidong Chen
@ 2018-05-05 14:35 ` Lidong Chen
  2018-05-05 14:35 ` [Qemu-devel] [PATCH v3 5/6] migration: implement bi-directional RDMA QIOChannel Lidong Chen
  2018-05-05 14:35 ` [Qemu-devel] [PATCH v3 6/6] migration: Stop rdma yielding during incoming postcopy Lidong Chen
  5 siblings, 0 replies; 13+ messages in thread
From: Lidong Chen @ 2018-05-05 14:35 UTC (permalink / raw)
  To: quintela, dgilbert, berrange
  Cc: qemu-devel, galsha, aviadye, adido, Lidong Chen

channel_close may be invoked by different threads. For example, the source
qemu invokes qemu_fclose in the main thread, the migration thread and the
return path thread. The destination qemu invokes qemu_fclose in the main
thread, the listen thread and the COLO incoming thread.

Add a mutex to the QEMUFile struct to avoid concurrent invocation of
channel_close.
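
As a minimal, self-contained sketch of this locking pattern (plain pthreads
rather than QEMU's QemuMutex; the File struct and file_close_once() helper are
invented for the example, not the actual QEMUFile code):

#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>

typedef struct {
    pthread_mutex_t lock;   /* serializes the close path */
    bool closed;
} File;

static int file_close_once(File *f)
{
    pthread_mutex_lock(&f->lock);
    if (!f->closed) {
        /* the real close hook would run here, e.g. f->ops->close(f->opaque) */
        f->closed = true;
    }
    pthread_mutex_unlock(&f->lock);
    return 0;
}

int main(void)
{
    File f;

    pthread_mutex_init(&f.lock, NULL);
    f.closed = false;

    /* the main, migration and return path threads may all reach this point;
     * the mutex guarantees only one of them runs the close hook at a time */
    file_close_once(&f);
    printf("closed: %d\n", f.closed);

    pthread_mutex_destroy(&f.lock);
    return 0;
}

The real patch only takes the lock around f->ops->close(), which gives the
same guarantee: two threads never run the close hook concurrently.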

Signed-off-by: Lidong Chen <lidongchen@tencent.com>
---
 migration/qemu-file.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/migration/qemu-file.c b/migration/qemu-file.c
index 977b9ae..87d0f05 100644
--- a/migration/qemu-file.c
+++ b/migration/qemu-file.c
@@ -52,6 +52,7 @@ struct QEMUFile {
     unsigned int iovcnt;
 
     int last_error;
+    QemuMutex lock;
 };
 
 /*
@@ -96,6 +97,7 @@ QEMUFile *qemu_fopen_ops(void *opaque, const QEMUFileOps *ops)
 
     f = g_new0(QEMUFile, 1);
 
+    qemu_mutex_init(&f->lock);
     f->opaque = opaque;
     f->ops = ops;
     return f;
@@ -328,7 +330,9 @@ int qemu_fclose(QEMUFile *f)
     ret = qemu_file_get_error(f);
 
     if (f->ops->close) {
+        qemu_mutex_lock(&f->lock);
         int ret2 = f->ops->close(f->opaque);
+        qemu_mutex_unlock(&f->lock);
         if (ret >= 0) {
             ret = ret2;
         }
@@ -339,6 +343,7 @@ int qemu_fclose(QEMUFile *f)
     if (f->last_error) {
         ret = f->last_error;
     }
+    qemu_mutex_destroy(&f->lock);
     g_free(f);
     trace_qemu_file_fclose();
     return ret;
-- 
1.8.3.1


* [Qemu-devel] [PATCH v3 5/6] migration: implement bi-directional RDMA QIOChannel
  2018-05-05 14:35 [Qemu-devel] [PATCH v3 0/6] Enable postcopy RDMA live migration Lidong Chen
                   ` (3 preceding siblings ...)
  2018-05-05 14:35 ` [Qemu-devel] [PATCH v3 4/6] migration: avoid concurrent invoke channel_close by different threads Lidong Chen
@ 2018-05-05 14:35 ` Lidong Chen
  2018-05-15 14:54   ` Paolo Bonzini
  2018-05-05 14:35 ` [Qemu-devel] [PATCH v3 6/6] migration: Stop rdma yielding during incoming postcopy Lidong Chen
  5 siblings, 1 reply; 13+ messages in thread
From: Lidong Chen @ 2018-05-05 14:35 UTC (permalink / raw)
  To: quintela, dgilbert, berrange
  Cc: qemu-devel, galsha, aviadye, adido, Lidong Chen

This patch implements a bi-directional RDMA QIOChannel. Because different
threads may access the QIOChannelRDMA concurrently, this patch uses RCU to
protect it.
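
As a minimal sketch of this RCU pattern (written against userspace-RCU/liburcu
rather than QEMU's internal RCU API; the Channel and RDMAState types and the
helpers below are invented for illustration):

#include <urcu.h>        /* userspace-RCU; link with -lurcu */
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

typedef struct {
    int fd;              /* stands in for the real RDMA state */
} RDMAState;

typedef struct {
    RDMAState *in;       /* read side, like rioc->rdmain   */
    RDMAState *out;      /* write side, like rioc->rdmaout */
} Channel;

/* reader side: runs in the migration / return path threads */
static int channel_write(Channel *c)
{
    rcu_read_lock();
    RDMAState *s = rcu_dereference(c->out);
    if (!s) {
        rcu_read_unlock();
        return -EIO;     /* channel already closed */
    }
    int ret = s->fd;     /* s cannot be freed while we hold the read lock */
    rcu_read_unlock();
    return ret;
}

/* close side: detach the pointers, wait for readers, then free */
static void channel_close(Channel *c)
{
    RDMAState *in = c->in, *out = c->out;

    rcu_assign_pointer(c->in, NULL);
    rcu_assign_pointer(c->out, NULL);
    synchronize_rcu();   /* no reader can still be using in/out after this */
    free(in);
    free(out);
}

int main(void)
{
    rcu_register_thread();

    Channel c = { calloc(1, sizeof(RDMAState)), calloc(1, sizeof(RDMAState)) };
    printf("before close: %d\n", channel_write(&c));
    channel_close(&c);
    printf("after close:  %d\n", channel_write(&c));

    rcu_unregister_thread();
    return 0;
}

A reader either sees the context pointer before it is cleared, in which case
synchronize_rcu() in the close path waits for it to drop the read lock, or it
sees NULL and returns -EIO, as the patch does.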

Signed-off-by: Lidong Chen <lidongchen@tencent.com>
---
 migration/colo.c         |   2 +
 migration/migration.c    |   2 +
 migration/postcopy-ram.c |   2 +
 migration/ram.c          |   4 +
 migration/rdma.c         | 196 ++++++++++++++++++++++++++++++++++++++++-------
 migration/savevm.c       |   3 +
 6 files changed, 183 insertions(+), 26 deletions(-)

diff --git a/migration/colo.c b/migration/colo.c
index 4381067..88936f5 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -534,6 +534,7 @@ void *colo_process_incoming_thread(void *opaque)
     uint64_t value;
     Error *local_err = NULL;
 
+    rcu_register_thread();
     qemu_sem_init(&mis->colo_incoming_sem, 0);
 
     migrate_set_state(&mis->state, MIGRATION_STATUS_ACTIVE,
@@ -666,5 +667,6 @@ out:
     }
     migration_incoming_exit_colo();
 
+    rcu_unregister_thread();
     return NULL;
 }
diff --git a/migration/migration.c b/migration/migration.c
index 0bdb28e..584666b 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -1787,6 +1787,7 @@ static void *source_return_path_thread(void *opaque)
     int res;
 
     trace_source_return_path_thread_entry();
+    rcu_register_thread();
     while (!ms->rp_state.error && !qemu_file_get_error(rp) &&
            migration_is_setup_or_active(ms->state)) {
         trace_source_return_path_thread_loop_top();
@@ -1887,6 +1888,7 @@ static void *source_return_path_thread(void *opaque)
 out:
     ms->rp_state.from_dst_file = NULL;
     qemu_fclose(rp);
+    rcu_unregister_thread();
     return NULL;
 }
 
diff --git a/migration/postcopy-ram.c b/migration/postcopy-ram.c
index 8ceeaa2..4e05966 100644
--- a/migration/postcopy-ram.c
+++ b/migration/postcopy-ram.c
@@ -842,6 +842,7 @@ static void *postcopy_ram_fault_thread(void *opaque)
     RAMBlock *rb = NULL;
 
     trace_postcopy_ram_fault_thread_entry();
+    rcu_register_thread();
     mis->last_rb = NULL; /* last RAMBlock we sent part of */
     qemu_sem_post(&mis->fault_thread_sem);
 
@@ -1013,6 +1014,7 @@ static void *postcopy_ram_fault_thread(void *opaque)
             }
         }
     }
+    rcu_unregister_thread();
     trace_postcopy_ram_fault_thread_exit();
     g_free(pfd);
     return NULL;
diff --git a/migration/ram.c b/migration/ram.c
index 912810c..9bc92fc 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -491,6 +491,7 @@ static void *multifd_send_thread(void *opaque)
 {
     MultiFDSendParams *p = opaque;
 
+    rcu_register_thread();
     while (true) {
         qemu_mutex_lock(&p->mutex);
         if (p->quit) {
@@ -500,6 +501,7 @@ static void *multifd_send_thread(void *opaque)
         qemu_mutex_unlock(&p->mutex);
         qemu_sem_wait(&p->sem);
     }
+    rcu_unregister_thread();
 
     return NULL;
 }
@@ -592,6 +594,7 @@ static void *multifd_recv_thread(void *opaque)
 {
     MultiFDRecvParams *p = opaque;
 
+    rcu_register_thread();
     while (true) {
         qemu_mutex_lock(&p->mutex);
         if (p->quit) {
@@ -601,6 +604,7 @@ static void *multifd_recv_thread(void *opaque)
         qemu_mutex_unlock(&p->mutex);
         qemu_sem_wait(&p->sem);
     }
+    rcu_unregister_thread();
 
     return NULL;
 }
diff --git a/migration/rdma.c b/migration/rdma.c
index f5c1d02..854f355 100644
--- a/migration/rdma.c
+++ b/migration/rdma.c
@@ -86,6 +86,7 @@ static uint32_t known_capabilities = RDMA_CAPABILITY_PIN_ALL;
                                 " to abort!"); \
                 rdma->error_reported = 1; \
             } \
+            rcu_read_unlock(); \
             return rdma->error_state; \
         } \
     } while (0)
@@ -402,7 +403,8 @@ typedef struct QIOChannelRDMA QIOChannelRDMA;
 
 struct QIOChannelRDMA {
     QIOChannel parent;
-    RDMAContext *rdma;
+    RDMAContext *rdmain;
+    RDMAContext *rdmaout;
     QEMUFile *file;
     bool blocking; /* XXX we don't actually honour this yet */
 };
@@ -2635,12 +2637,20 @@ static ssize_t qio_channel_rdma_writev(QIOChannel *ioc,
 {
     QIOChannelRDMA *rioc = QIO_CHANNEL_RDMA(ioc);
     QEMUFile *f = rioc->file;
-    RDMAContext *rdma = rioc->rdma;
+    RDMAContext *rdma;
     int ret;
     ssize_t done = 0;
     size_t i;
     size_t len = 0;
 
+    rcu_read_lock();
+    rdma = atomic_rcu_read(&rioc->rdmaout);
+
+    if (!rdma) {
+        rcu_read_unlock();
+        return -EIO;
+    }
+
     CHECK_ERROR_STATE();
 
     /*
@@ -2650,6 +2660,7 @@ static ssize_t qio_channel_rdma_writev(QIOChannel *ioc,
     ret = qemu_rdma_write_flush(f, rdma);
     if (ret < 0) {
         rdma->error_state = ret;
+        rcu_read_unlock();
         return ret;
     }
 
@@ -2669,6 +2680,7 @@ static ssize_t qio_channel_rdma_writev(QIOChannel *ioc,
 
             if (ret < 0) {
                 rdma->error_state = ret;
+                rcu_read_unlock();
                 return ret;
             }
 
@@ -2677,6 +2689,7 @@ static ssize_t qio_channel_rdma_writev(QIOChannel *ioc,
         }
     }
 
+    rcu_read_unlock();
     return done;
 }
 
@@ -2710,12 +2723,20 @@ static ssize_t qio_channel_rdma_readv(QIOChannel *ioc,
                                       Error **errp)
 {
     QIOChannelRDMA *rioc = QIO_CHANNEL_RDMA(ioc);
-    RDMAContext *rdma = rioc->rdma;
+    RDMAContext *rdma;
     RDMAControlHeader head;
     int ret = 0;
     ssize_t i;
     size_t done = 0;
 
+    rcu_read_lock();
+    rdma = atomic_rcu_read(&rioc->rdmain);
+
+    if (!rdma) {
+        rcu_read_unlock();
+        return -EIO;
+    }
+
     CHECK_ERROR_STATE();
 
     for (i = 0; i < niov; i++) {
@@ -2727,7 +2748,7 @@ static ssize_t qio_channel_rdma_readv(QIOChannel *ioc,
          * were given and dish out the bytes until we run
          * out of bytes.
          */
-        ret = qemu_rdma_fill(rioc->rdma, data, want, 0);
+        ret = qemu_rdma_fill(rdma, data, want, 0);
         done += ret;
         want -= ret;
         /* Got what we needed, so go to next iovec */
@@ -2749,25 +2770,28 @@ static ssize_t qio_channel_rdma_readv(QIOChannel *ioc,
 
         if (ret < 0) {
             rdma->error_state = ret;
+            rcu_read_unlock();
             return ret;
         }
 
         /*
          * SEND was received with new bytes, now try again.
          */
-        ret = qemu_rdma_fill(rioc->rdma, data, want, 0);
+        ret = qemu_rdma_fill(rdma, data, want, 0);
         done += ret;
         want -= ret;
 
         /* Still didn't get enough, so lets just return */
         if (want) {
             if (done == 0) {
+                rcu_read_unlock();
                 return QIO_CHANNEL_ERR_BLOCK;
             } else {
                 break;
             }
         }
     }
+    rcu_read_unlock();
     return done;
 }
 
@@ -2819,15 +2843,29 @@ qio_channel_rdma_source_prepare(GSource *source,
                                 gint *timeout)
 {
     QIOChannelRDMASource *rsource = (QIOChannelRDMASource *)source;
-    RDMAContext *rdma = rsource->rioc->rdma;
+    RDMAContext *rdma;
     GIOCondition cond = 0;
     *timeout = -1;
 
+    rcu_read_lock();
+    if (rsource->condition == G_IO_IN) {
+        rdma = atomic_rcu_read(&rsource->rioc->rdmain);
+    } else {
+        rdma = atomic_rcu_read(&rsource->rioc->rdmaout);
+    }
+
+    if (!rdma) {
+        error_report("RDMAContext is NULL when prepare Gsource");
+        rcu_read_unlock();
+        return FALSE;
+    }
+
     if (rdma->wr_data[0].control_len) {
         cond |= G_IO_IN;
     }
     cond |= G_IO_OUT;
 
+    rcu_read_unlock();
     return cond & rsource->condition;
 }
 
@@ -2835,14 +2873,28 @@ static gboolean
 qio_channel_rdma_source_check(GSource *source)
 {
     QIOChannelRDMASource *rsource = (QIOChannelRDMASource *)source;
-    RDMAContext *rdma = rsource->rioc->rdma;
+    RDMAContext *rdma;
     GIOCondition cond = 0;
 
+    rcu_read_lock();
+    if (rsource->condition == G_IO_IN) {
+        rdma = atomic_rcu_read(&rsource->rioc->rdmain);
+    } else {
+        rdma = atomic_rcu_read(&rsource->rioc->rdmaout);
+    }
+
+    if (!rdma) {
+        error_report("RDMAContext is NULL when check Gsource");
+        rcu_read_unlock();
+        return FALSE;
+    }
+
     if (rdma->wr_data[0].control_len) {
         cond |= G_IO_IN;
     }
     cond |= G_IO_OUT;
 
+    rcu_read_unlock();
     return cond & rsource->condition;
 }
 
@@ -2853,14 +2905,28 @@ qio_channel_rdma_source_dispatch(GSource *source,
 {
     QIOChannelFunc func = (QIOChannelFunc)callback;
     QIOChannelRDMASource *rsource = (QIOChannelRDMASource *)source;
-    RDMAContext *rdma = rsource->rioc->rdma;
+    RDMAContext *rdma;
     GIOCondition cond = 0;
 
+    rcu_read_lock();
+    if (rsource->condition == G_IO_IN) {
+        rdma = atomic_rcu_read(&rsource->rioc->rdmain);
+    } else {
+        rdma = atomic_rcu_read(&rsource->rioc->rdmaout);
+    }
+
+    if (!rdma) {
+        error_report("RDMAContext is NULL when dispatch Gsource");
+        rcu_read_unlock();
+        return FALSE;
+    }
+
     if (rdma->wr_data[0].control_len) {
         cond |= G_IO_IN;
     }
     cond |= G_IO_OUT;
 
+    rcu_read_unlock();
     return (*func)(QIO_CHANNEL(rsource->rioc),
                    (cond & rsource->condition),
                    user_data);
@@ -2905,15 +2971,32 @@ static int qio_channel_rdma_close(QIOChannel *ioc,
                                   Error **errp)
 {
     QIOChannelRDMA *rioc = QIO_CHANNEL_RDMA(ioc);
+    RDMAContext *rdmain, *rdmaout;
     trace_qemu_rdma_close();
-    if (rioc->rdma) {
-        if (!rioc->rdma->error_state) {
-            rioc->rdma->error_state = qemu_file_get_error(rioc->file);
-        }
-        qemu_rdma_cleanup(rioc->rdma);
-        g_free(rioc->rdma);
-        rioc->rdma = NULL;
+
+    rdmain = rioc->rdmain;
+    if (rdmain) {
+        atomic_rcu_set(&rioc->rdmain, NULL);
+    }
+
+    rdmaout = rioc->rdmaout;
+    if (rdmaout) {
+        atomic_rcu_set(&rioc->rdmaout, NULL);
     }
+
+    synchronize_rcu();
+
+    if (rdmain) {
+        qemu_rdma_cleanup(rdmain);
+    }
+
+    if (rdmaout) {
+        qemu_rdma_cleanup(rdmaout);
+    }
+
+    g_free(rdmain);
+    g_free(rdmaout);
+
     return 0;
 }
 
@@ -2956,12 +3039,21 @@ static size_t qemu_rdma_save_page(QEMUFile *f, void *opaque,
                                   size_t size, uint64_t *bytes_sent)
 {
     QIOChannelRDMA *rioc = QIO_CHANNEL_RDMA(opaque);
-    RDMAContext *rdma = rioc->rdma;
+    RDMAContext *rdma;
     int ret;
 
+    rcu_read_lock();
+    rdma = atomic_rcu_read(&rioc->rdmaout);
+
+    if (!rdma) {
+        rcu_read_unlock();
+        return -EIO;
+    }
+
     CHECK_ERROR_STATE();
 
     if (migrate_get_current()->state == MIGRATION_STATUS_POSTCOPY_ACTIVE) {
+        rcu_read_unlock();
         return RAM_SAVE_CONTROL_NOT_SUPP;
     }
 
@@ -3046,9 +3138,11 @@ static size_t qemu_rdma_save_page(QEMUFile *f, void *opaque,
         }
     }
 
+    rcu_read_unlock();
     return RAM_SAVE_CONTROL_DELAYED;
 err:
     rdma->error_state = ret;
+    rcu_read_unlock();
     return ret;
 }
 
@@ -3224,8 +3318,8 @@ static int qemu_rdma_registration_handle(QEMUFile *f, void *opaque)
     RDMAControlHeader blocks = { .type = RDMA_CONTROL_RAM_BLOCKS_RESULT,
                                  .repeat = 1 };
     QIOChannelRDMA *rioc = QIO_CHANNEL_RDMA(opaque);
-    RDMAContext *rdma = rioc->rdma;
-    RDMALocalBlocks *local = &rdma->local_ram_blocks;
+    RDMAContext *rdma;
+    RDMALocalBlocks *local;
     RDMAControlHeader head;
     RDMARegister *reg, *registers;
     RDMACompress *comp;
@@ -3238,8 +3332,17 @@ static int qemu_rdma_registration_handle(QEMUFile *f, void *opaque)
     int count = 0;
     int i = 0;
 
+    rcu_read_lock();
+    rdma = atomic_rcu_read(&rioc->rdmain);
+
+    if (!rdma) {
+        rcu_read_unlock();
+        return -EIO;
+    }
+
     CHECK_ERROR_STATE();
 
+    local = &rdma->local_ram_blocks;
     do {
         trace_qemu_rdma_registration_handle_wait();
 
@@ -3469,6 +3572,7 @@ out:
     if (ret < 0) {
         rdma->error_state = ret;
     }
+    rcu_read_unlock();
     return ret;
 }
 
@@ -3482,10 +3586,18 @@ out:
 static int
 rdma_block_notification_handle(QIOChannelRDMA *rioc, const char *name)
 {
-    RDMAContext *rdma = rioc->rdma;
+    RDMAContext *rdma;
     int curr;
     int found = -1;
 
+    rcu_read_lock();
+    rdma = atomic_rcu_read(&rioc->rdmain);
+
+    if (!rdma) {
+        rcu_read_unlock();
+        return -EIO;
+    }
+
     /* Find the matching RAMBlock in our local list */
     for (curr = 0; curr < rdma->local_ram_blocks.nb_blocks; curr++) {
         if (!strcmp(rdma->local_ram_blocks.block[curr].block_name, name)) {
@@ -3496,6 +3608,7 @@ rdma_block_notification_handle(QIOChannelRDMA *rioc, const char *name)
 
     if (found == -1) {
         error_report("RAMBlock '%s' not found on destination", name);
+        rcu_read_unlock();
         return -ENOENT;
     }
 
@@ -3503,6 +3616,7 @@ rdma_block_notification_handle(QIOChannelRDMA *rioc, const char *name)
     trace_rdma_block_notification_handle(name, rdma->next_src_index);
     rdma->next_src_index++;
 
+    rcu_read_unlock();
     return 0;
 }
 
@@ -3525,11 +3639,19 @@ static int qemu_rdma_registration_start(QEMUFile *f, void *opaque,
                                         uint64_t flags, void *data)
 {
     QIOChannelRDMA *rioc = QIO_CHANNEL_RDMA(opaque);
-    RDMAContext *rdma = rioc->rdma;
+    RDMAContext *rdma;
+
+    rcu_read_lock();
+    rdma = atomic_rcu_read(&rioc->rdmaout);
+    if (!rdma) {
+        rcu_read_unlock();
+        return -EIO;
+    }
 
     CHECK_ERROR_STATE();
 
     if (migrate_get_current()->state == MIGRATION_STATUS_POSTCOPY_ACTIVE) {
+        rcu_read_unlock();
         return 0;
     }
 
@@ -3537,6 +3659,7 @@ static int qemu_rdma_registration_start(QEMUFile *f, void *opaque,
     qemu_put_be64(f, RAM_SAVE_FLAG_HOOK);
     qemu_fflush(f);
 
+    rcu_read_unlock();
     return 0;
 }
 
@@ -3549,13 +3672,21 @@ static int qemu_rdma_registration_stop(QEMUFile *f, void *opaque,
 {
     Error *local_err = NULL, **errp = &local_err;
     QIOChannelRDMA *rioc = QIO_CHANNEL_RDMA(opaque);
-    RDMAContext *rdma = rioc->rdma;
+    RDMAContext *rdma;
     RDMAControlHeader head = { .len = 0, .repeat = 1 };
     int ret = 0;
 
+    rcu_read_lock();
+    rdma = atomic_rcu_read(&rioc->rdmaout);
+    if (!rdma) {
+        rcu_read_unlock();
+        return -EIO;
+    }
+
     CHECK_ERROR_STATE();
 
     if (migrate_get_current()->state == MIGRATION_STATUS_POSTCOPY_ACTIVE) {
+        rcu_read_unlock();
         return 0;
     }
 
@@ -3587,6 +3718,7 @@ static int qemu_rdma_registration_stop(QEMUFile *f, void *opaque,
                     qemu_rdma_reg_whole_ram_blocks : NULL);
         if (ret < 0) {
             ERROR(errp, "receiving remote info!");
+            rcu_read_unlock();
             return ret;
         }
 
@@ -3610,6 +3742,7 @@ static int qemu_rdma_registration_stop(QEMUFile *f, void *opaque,
                         "not identical on both the source and destination.",
                         local->nb_blocks, nb_dest_blocks);
             rdma->error_state = -EINVAL;
+            rcu_read_unlock();
             return -EINVAL;
         }
 
@@ -3626,6 +3759,7 @@ static int qemu_rdma_registration_stop(QEMUFile *f, void *opaque,
                             local->block[i].length,
                             rdma->dest_blocks[i].length);
                 rdma->error_state = -EINVAL;
+                rcu_read_unlock();
                 return -EINVAL;
             }
             local->block[i].remote_host_addr =
@@ -3643,9 +3777,11 @@ static int qemu_rdma_registration_stop(QEMUFile *f, void *opaque,
         goto err;
     }
 
+    rcu_read_unlock();
     return 0;
 err:
     rdma->error_state = ret;
+    rcu_read_unlock();
     return ret;
 }
 
@@ -3663,10 +3799,15 @@ static const QEMUFileHooks rdma_write_hooks = {
 static void qio_channel_rdma_finalize(Object *obj)
 {
     QIOChannelRDMA *rioc = QIO_CHANNEL_RDMA(obj);
-    if (rioc->rdma) {
-        qemu_rdma_cleanup(rioc->rdma);
-        g_free(rioc->rdma);
-        rioc->rdma = NULL;
+    if (rioc->rdmain) {
+        qemu_rdma_cleanup(rioc->rdmain);
+        g_free(rioc->rdmain);
+        rioc->rdmain = NULL;
+    }
+    if (rioc->rdmaout) {
+        qemu_rdma_cleanup(rioc->rdmaout);
+        g_free(rioc->rdmaout);
+        rioc->rdmaout = NULL;
     }
 }
 
@@ -3706,13 +3847,16 @@ static QEMUFile *qemu_fopen_rdma(RDMAContext *rdma, const char *mode)
     }
 
     rioc = QIO_CHANNEL_RDMA(object_new(TYPE_QIO_CHANNEL_RDMA));
-    rioc->rdma = rdma;
 
     if (mode[0] == 'w') {
         rioc->file = qemu_fopen_channel_output(QIO_CHANNEL(rioc));
+        rioc->rdmaout = rdma;
+        rioc->rdmain = rdma->return_path;
         qemu_file_set_hooks(rioc->file, &rdma_write_hooks);
     } else {
         rioc->file = qemu_fopen_channel_input(QIO_CHANNEL(rioc));
+        rioc->rdmain = rdma;
+        rioc->rdmaout = rdma->return_path;
         qemu_file_set_hooks(rioc->file, &rdma_read_hooks);
     }
 
diff --git a/migration/savevm.c b/migration/savevm.c
index e2be02a..45ec809 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -1573,6 +1573,7 @@ static void *postcopy_ram_listen_thread(void *opaque)
     qemu_sem_post(&mis->listen_thread_sem);
     trace_postcopy_ram_listen_thread_start();
 
+    rcu_register_thread();
     /*
      * Because we're a thread and not a coroutine we can't yield
      * in qemu_file, and thus we must be blocking now.
@@ -1605,6 +1606,7 @@ static void *postcopy_ram_listen_thread(void *opaque)
          * to leave the guest running and fire MCEs for pages that never
          * arrived as a desperate recovery step.
          */
+        rcu_unregister_thread();
         exit(EXIT_FAILURE);
     }
 
@@ -1619,6 +1621,7 @@ static void *postcopy_ram_listen_thread(void *opaque)
     migration_incoming_state_destroy();
     qemu_loadvm_state_cleanup();
 
+    rcu_unregister_thread();
     return NULL;
 }
 
-- 
1.8.3.1


* [Qemu-devel] [PATCH v3 6/6] migration: Stop rdma yielding during incoming postcopy
  2018-05-05 14:35 [Qemu-devel] [PATCH v3 0/6] Enable postcopy RDMA live migration Lidong Chen
                   ` (4 preceding siblings ...)
  2018-05-05 14:35 ` [Qemu-devel] [PATCH v3 5/6] migration: implement bi-directional RDMA QIOChannel Lidong Chen
@ 2018-05-05 14:35 ` Lidong Chen
  5 siblings, 0 replies; 13+ messages in thread
From: Lidong Chen @ 2018-05-05 14:35 UTC (permalink / raw)
  To: quintela, dgilbert, berrange
  Cc: qemu-devel, galsha, aviadye, adido, Lidong Chen

During incoming postcopy, the destination qemu invokes
qemu_rdma_wait_comp_channel in a separate thread, so it cannot use rdma
yield and polls the completion channel fd instead.
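
A minimal sketch of that polling approach (plain poll(2) with a timeout so the
loop can re-check for cancellation; the names and the 100 ms timeout are
illustrative, not taken from qemu_rdma_wait_comp_channel):

#include <poll.h>
#include <stdbool.h>
#include <stdio.h>
#include <unistd.h>

static volatile bool migration_cancelled;   /* driven by shared migration state */

static int wait_readable(int fd)
{
    while (!migration_cancelled) {
        struct pollfd pfd = { .fd = fd, .events = POLLIN };
        int ret = poll(&pfd, 1, 100 /* ms */);

        if (ret < 0) {
            return -1;                      /* error on the fd */
        }
        if (ret > 0 && (pfd.revents & POLLIN)) {
            return 0;                       /* readable: handle the completion */
        }
        /* timeout: loop again and re-check for cancellation */
    }
    return -1;                              /* cancelled */
}

int main(void)
{
    /* stdin stands in for the RDMA completion channel fd */
    printf("waiting for input...\n");
    return wait_readable(STDIN_FILENO) == 0 ? 0 : 1;
}

Polling with a finite timeout is what lets the thread notice a cancelled
migration or an error instead of blocking forever on the fd.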

Signed-off-by: Lidong Chen <lidongchen@tencent.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
---
 migration/rdma.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/migration/rdma.c b/migration/rdma.c
index 854f355..ed9cfb1 100644
--- a/migration/rdma.c
+++ b/migration/rdma.c
@@ -1490,11 +1490,13 @@ static int qemu_rdma_wait_comp_channel(RDMAContext *rdma)
      * Coroutine doesn't start until migration_fd_process_incoming()
      * so don't yield unless we know we're running inside of a coroutine.
      */
-    if (rdma->migration_started_on_destination) {
+    if (rdma->migration_started_on_destination &&
+        migration_incoming_get_current()->state == MIGRATION_STATUS_ACTIVE) {
         yield_until_fd_readable(rdma->comp_channel->fd);
     } else {
         /* This is the source side, we're in a separate thread
          * or destination prior to migration_fd_process_incoming()
+         * after postcopy, the destination is also in a separate thread.
          * we can't yield; so we have to poll the fd.
          * But we need to be able to handle 'cancel' or an error
          * without hanging forever.
-- 
1.8.3.1


* Re: [Qemu-devel] [PATCH v3 3/6] migration: remove unnecessary variables len in QIOChannelRDMA
  2018-05-05 14:35 ` [Qemu-devel] [PATCH v3 3/6] migration: remove unnecessary variables len in QIOChannelRDMA Lidong Chen
@ 2018-05-08 14:19   ` Dr. David Alan Gilbert
  2018-05-09  1:28     ` 858585 jemmy
  0 siblings, 1 reply; 13+ messages in thread
From: Dr. David Alan Gilbert @ 2018-05-08 14:19 UTC (permalink / raw)
  To: Lidong Chen
  Cc: quintela, berrange, qemu-devel, galsha, aviadye, adido, Lidong Chen

* Lidong Chen (jemmy858585@gmail.com) wrote:
> Because qio_channel_rdma_writev and qio_channel_rdma_readv maybe invoked
> by different threads concurrently, this patch removes unnecessary variables
> len in QIOChannelRDMA and use local variable instead.
> 
> Signed-off-by: Lidong Chen <lidongchen@tencent.com>
> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> Reviewed-by: Daniel P. Berrangéberrange@redhat.com>

Note there's a ' <' missing somehow; minor fix up during commit
hopefully.

Dave

> ---
>  migration/rdma.c | 15 +++++++--------
>  1 file changed, 7 insertions(+), 8 deletions(-)
> 
> diff --git a/migration/rdma.c b/migration/rdma.c
> index c745427..f5c1d02 100644
> --- a/migration/rdma.c
> +++ b/migration/rdma.c
> @@ -404,7 +404,6 @@ struct QIOChannelRDMA {
>      QIOChannel parent;
>      RDMAContext *rdma;
>      QEMUFile *file;
> -    size_t len;
>      bool blocking; /* XXX we don't actually honour this yet */
>  };
>  
> @@ -2640,6 +2639,7 @@ static ssize_t qio_channel_rdma_writev(QIOChannel *ioc,
>      int ret;
>      ssize_t done = 0;
>      size_t i;
> +    size_t len = 0;
>  
>      CHECK_ERROR_STATE();
>  
> @@ -2659,10 +2659,10 @@ static ssize_t qio_channel_rdma_writev(QIOChannel *ioc,
>          while (remaining) {
>              RDMAControlHeader head;
>  
> -            rioc->len = MIN(remaining, RDMA_SEND_INCREMENT);
> -            remaining -= rioc->len;
> +            len = MIN(remaining, RDMA_SEND_INCREMENT);
> +            remaining -= len;
>  
> -            head.len = rioc->len;
> +            head.len = len;
>              head.type = RDMA_CONTROL_QEMU_FILE;
>  
>              ret = qemu_rdma_exchange_send(rdma, &head, data, NULL, NULL, NULL);
> @@ -2672,8 +2672,8 @@ static ssize_t qio_channel_rdma_writev(QIOChannel *ioc,
>                  return ret;
>              }
>  
> -            data += rioc->len;
> -            done += rioc->len;
> +            data += len;
> +            done += len;
>          }
>      }
>  
> @@ -2768,8 +2768,7 @@ static ssize_t qio_channel_rdma_readv(QIOChannel *ioc,
>              }
>          }
>      }
> -    rioc->len = done;
> -    return rioc->len;
> +    return done;
>  }
>  
>  /*
> -- 
> 1.8.3.1
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK


* Re: [Qemu-devel] [PATCH v3 3/6] migration: remove unnecessary variables len in QIOChannelRDMA
  2018-05-08 14:19   ` Dr. David Alan Gilbert
@ 2018-05-09  1:28     ` 858585 jemmy
  0 siblings, 0 replies; 13+ messages in thread
From: 858585 jemmy @ 2018-05-09  1:28 UTC (permalink / raw)
  To: Dr. David Alan Gilbert
  Cc: Juan Quintela, Daniel P. Berrange, qemu-devel, Gal Shachaf,
	Aviad Yehezkel, adido, Lidong Chen

On Tue, May 8, 2018 at 10:19 PM, Dr. David Alan Gilbert
<dgilbert@redhat.com> wrote:
> * Lidong Chen (jemmy858585@gmail.com) wrote:
>> Because qio_channel_rdma_writev and qio_channel_rdma_readv maybe invoked
>> by different threads concurrently, this patch removes unnecessary variables
>> len in QIOChannelRDMA and use local variable instead.
>>
>> Signed-off-by: Lidong Chen <lidongchen@tencent.com>
>> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
>> Reviewed-by: Daniel P. Berrangéberrange@redhat.com>
>
> Note there's a ' <' missing somehow; minor fix up during commit
> hopefully.
>
> Dave

Sorry for this mistake, I will check more carefully.

>
>> ---
>>  migration/rdma.c | 15 +++++++--------
>>  1 file changed, 7 insertions(+), 8 deletions(-)
>>
>> diff --git a/migration/rdma.c b/migration/rdma.c
>> index c745427..f5c1d02 100644
>> --- a/migration/rdma.c
>> +++ b/migration/rdma.c
>> @@ -404,7 +404,6 @@ struct QIOChannelRDMA {
>>      QIOChannel parent;
>>      RDMAContext *rdma;
>>      QEMUFile *file;
>> -    size_t len;
>>      bool blocking; /* XXX we don't actually honour this yet */
>>  };
>>
>> @@ -2640,6 +2639,7 @@ static ssize_t qio_channel_rdma_writev(QIOChannel *ioc,
>>      int ret;
>>      ssize_t done = 0;
>>      size_t i;
>> +    size_t len = 0;
>>
>>      CHECK_ERROR_STATE();
>>
>> @@ -2659,10 +2659,10 @@ static ssize_t qio_channel_rdma_writev(QIOChannel *ioc,
>>          while (remaining) {
>>              RDMAControlHeader head;
>>
>> -            rioc->len = MIN(remaining, RDMA_SEND_INCREMENT);
>> -            remaining -= rioc->len;
>> +            len = MIN(remaining, RDMA_SEND_INCREMENT);
>> +            remaining -= len;
>>
>> -            head.len = rioc->len;
>> +            head.len = len;
>>              head.type = RDMA_CONTROL_QEMU_FILE;
>>
>>              ret = qemu_rdma_exchange_send(rdma, &head, data, NULL, NULL, NULL);
>> @@ -2672,8 +2672,8 @@ static ssize_t qio_channel_rdma_writev(QIOChannel *ioc,
>>                  return ret;
>>              }
>>
>> -            data += rioc->len;
>> -            done += rioc->len;
>> +            data += len;
>> +            done += len;
>>          }
>>      }
>>
>> @@ -2768,8 +2768,7 @@ static ssize_t qio_channel_rdma_readv(QIOChannel *ioc,
>>              }
>>          }
>>      }
>> -    rioc->len = done;
>> -    return rioc->len;
>> +    return done;
>>  }
>>
>>  /*
>> --
>> 1.8.3.1
>>
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK


* Re: [Qemu-devel] [PATCH v3 5/6] migration: implement bi-directional RDMA QIOChannel
  2018-05-05 14:35 ` [Qemu-devel] [PATCH v3 5/6] migration: implement bi-directional RDMA QIOChannel Lidong Chen
@ 2018-05-15 14:54   ` Paolo Bonzini
  2018-05-16  9:36     ` 858585 jemmy
  0 siblings, 1 reply; 13+ messages in thread
From: Paolo Bonzini @ 2018-05-15 14:54 UTC (permalink / raw)
  To: Lidong Chen, quintela, dgilbert, berrange
  Cc: adido, galsha, aviadye, qemu-devel, Lidong Chen

On 05/05/2018 16:35, Lidong Chen wrote:
> @@ -2635,12 +2637,20 @@ static ssize_t qio_channel_rdma_writev(QIOChannel *ioc,
>  {
>      QIOChannelRDMA *rioc = QIO_CHANNEL_RDMA(ioc);
>      QEMUFile *f = rioc->file;
> -    RDMAContext *rdma = rioc->rdma;
> +    RDMAContext *rdma;
>      int ret;
>      ssize_t done = 0;
>      size_t i;
>      size_t len = 0;
>  
> +    rcu_read_lock();
> +    rdma = atomic_rcu_read(&rioc->rdmaout);
> +
> +    if (!rdma) {
> +        rcu_read_unlock();
> +        return -EIO;
> +    }
> +
>      CHECK_ERROR_STATE();
>  
>      /*

I am not sure I understand this.  It would probably be wrong to use the
output side from two threads at the same time, so why not use two mutexes?

Also, who is calling qio_channel_rdma_close in such a way that another
thread is still using it?  Would it be possible to synchronize with the
other thread *before*, for example with qemu_thread_join?

Thanks,

Paolo


* Re: [Qemu-devel] [PATCH v3 5/6] migration: implement bi-directional RDMA QIOChannel
  2018-05-15 14:54   ` Paolo Bonzini
@ 2018-05-16  9:36     ` 858585 jemmy
  2018-05-21 11:49       ` 858585 jemmy
  0 siblings, 1 reply; 13+ messages in thread
From: 858585 jemmy @ 2018-05-16  9:36 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Juan Quintela, Dave Gilbert, Daniel P. Berrange, adido,
	Gal Shachaf, Aviad Yehezkel, qemu-devel, Lidong Chen

On Tue, May 15, 2018 at 10:54 PM, Paolo Bonzini <pbonzini@redhat.com> wrote:
> On 05/05/2018 16:35, Lidong Chen wrote:
>> @@ -2635,12 +2637,20 @@ static ssize_t qio_channel_rdma_writev(QIOChannel *ioc,
>>  {
>>      QIOChannelRDMA *rioc = QIO_CHANNEL_RDMA(ioc);
>>      QEMUFile *f = rioc->file;
>> -    RDMAContext *rdma = rioc->rdma;
>> +    RDMAContext *rdma;
>>      int ret;
>>      ssize_t done = 0;
>>      size_t i;
>>      size_t len = 0;
>>
>> +    rcu_read_lock();
>> +    rdma = atomic_rcu_read(&rioc->rdmaout);
>> +
>> +    if (!rdma) {
>> +        rcu_read_unlock();
>> +        return -EIO;
>> +    }
>> +
>>      CHECK_ERROR_STATE();
>>
>>      /*
>
> I am not sure I understand this.  It would probably be wrong to use the
> output side from two threads at the same time, so why not use two mutexes?

Two threads will not invoke qio_channel_rdma_writev at the same time.
On the source qemu, the migration thread only uses writev and the return
path thread only uses readv.
The destination qemu already has a mutex, mis->rp_mutex, to make sure
writev is not used at the same time.

The rcu_read_lock is used to make sure the RDMAContext is not used after
another thread closes it.

>
> Also, who is calling qio_channel_rdma_close in such a way that another
> thread is still using it?  Would it be possible to synchronize with the
> other thread *before*, for example with qemu_thread_join?

The MigrationState structure includes the to_dst_file and from_dst_file
QEMUFiles, and the two QEMUFiles use the same QIOChannel.
For example, if the return path thread calls
qemu_fclose(ms->rp_state.from_dst_file),
it will also close the RDMAContext for ms->to_dst_file.

For live migration, the source qemu invokes qemu_fclose in different
threads, including the main thread, the migration thread and the return
path thread.

The destination qemu invokes qemu_fclose in the main thread, the listen
thread and the COLO incoming thread.

I have not found an effective way to synchronize these threads.

Thanks.

>
> Thanks,
>
> Paolo


* Re: [Qemu-devel] [PATCH v3 5/6] migration: implement bi-directional RDMA QIOChannel
  2018-05-16  9:36     ` 858585 jemmy
@ 2018-05-21 11:49       ` 858585 jemmy
  2018-05-23  2:36         ` 858585 jemmy
  0 siblings, 1 reply; 13+ messages in thread
From: 858585 jemmy @ 2018-05-21 11:49 UTC (permalink / raw)
  To: Paolo Bonzini, Daniel P. Berrange
  Cc: Juan Quintela, Dave Gilbert, Adi Dotan, Gal Shachaf,
	Aviad Yehezkel, qemu-devel, Lidong Chen

On Wed, May 16, 2018 at 5:36 PM, 858585 jemmy <jemmy858585@gmail.com> wrote:
> On Tue, May 15, 2018 at 10:54 PM, Paolo Bonzini <pbonzini@redhat.com> wrote:
>> On 05/05/2018 16:35, Lidong Chen wrote:
>>> @@ -2635,12 +2637,20 @@ static ssize_t qio_channel_rdma_writev(QIOChannel *ioc,
>>>  {
>>>      QIOChannelRDMA *rioc = QIO_CHANNEL_RDMA(ioc);
>>>      QEMUFile *f = rioc->file;
>>> -    RDMAContext *rdma = rioc->rdma;
>>> +    RDMAContext *rdma;
>>>      int ret;
>>>      ssize_t done = 0;
>>>      size_t i;
>>>      size_t len = 0;
>>>
>>> +    rcu_read_lock();
>>> +    rdma = atomic_rcu_read(&rioc->rdmaout);
>>> +
>>> +    if (!rdma) {
>>> +        rcu_read_unlock();
>>> +        return -EIO;
>>> +    }
>>> +
>>>      CHECK_ERROR_STATE();
>>>
>>>      /*
>>
>> I am not sure I understand this.  It would probably be wrong to use the
>> output side from two threads at the same time, so why not use two mutexes?
>
> Two thread will not invoke qio_channel_rdma_writev at the same time.
> The source qemu, migration thread only use writev, and the return path
> thread only
> use readv.
> The destination qemu already have a mutex mis->rp_mutex to make sure
> not use writev
> at the same time.
>
> The rcu_read_lock is used to protect not use RDMAContext when another
> thread closes it.

Any suggestion?

>
>>
>> Also, who is calling qio_channel_rdma_close in such a way that another
>> thread is still using it?  Would it be possible to synchronize with the
>> other thread *before*, for example with qemu_thread_join?
>
> The MigrationState structure includes to_dst_file and from_dst_file
> QEMUFile, the two QEMUFile use the same QIOChannel.
> For example, if the return path thread call
> qemu_fclose(ms->rp_state.from_dst_file),
> It will also close the RDMAContext for ms->to_dst_file.
>
> For live migration, the source qemu invokes qemu_fclose in different
> threads include main thread, migration thread, return path thread.
>
> The destination qemu invokes qemu_fclose in main thread, listen thread and
> COLO incoming thread.
>
> I do not find an effective way to synchronize these threads.
>
> Thanks.
>
>>
>> Thanks,
>>
>> Paolo


* Re: [Qemu-devel] [PATCH v3 5/6] migration: implement bi-directional RDMA QIOChannel
  2018-05-21 11:49       ` 858585 jemmy
@ 2018-05-23  2:36         ` 858585 jemmy
  0 siblings, 0 replies; 13+ messages in thread
From: 858585 jemmy @ 2018-05-23  2:36 UTC (permalink / raw)
  To: Paolo Bonzini, Daniel P. Berrange
  Cc: Juan Quintela, Dave Gilbert, Adi Dotan, Gal Shachaf,
	Aviad Yehezkel, qemu-devel, Lidong Chen

ping.

On Mon, May 21, 2018 at 7:49 PM, 858585 jemmy <jemmy858585@gmail.com> wrote:
> On Wed, May 16, 2018 at 5:36 PM, 858585 jemmy <jemmy858585@gmail.com> wrote:
>> On Tue, May 15, 2018 at 10:54 PM, Paolo Bonzini <pbonzini@redhat.com> wrote:
>>> On 05/05/2018 16:35, Lidong Chen wrote:
>>>> @@ -2635,12 +2637,20 @@ static ssize_t qio_channel_rdma_writev(QIOChannel *ioc,
>>>>  {
>>>>      QIOChannelRDMA *rioc = QIO_CHANNEL_RDMA(ioc);
>>>>      QEMUFile *f = rioc->file;
>>>> -    RDMAContext *rdma = rioc->rdma;
>>>> +    RDMAContext *rdma;
>>>>      int ret;
>>>>      ssize_t done = 0;
>>>>      size_t i;
>>>>      size_t len = 0;
>>>>
>>>> +    rcu_read_lock();
>>>> +    rdma = atomic_rcu_read(&rioc->rdmaout);
>>>> +
>>>> +    if (!rdma) {
>>>> +        rcu_read_unlock();
>>>> +        return -EIO;
>>>> +    }
>>>> +
>>>>      CHECK_ERROR_STATE();
>>>>
>>>>      /*
>>>
>>> I am not sure I understand this.  It would probably be wrong to use the
>>> output side from two threads at the same time, so why not use two mutexes?
>>
>> Two thread will not invoke qio_channel_rdma_writev at the same time.
>> The source qemu, migration thread only use writev, and the return path
>> thread only
>> use readv.
>> The destination qemu already have a mutex mis->rp_mutex to make sure
>> not use writev
>> at the same time.
>>
>> The rcu_read_lock is used to protect not use RDMAContext when another
>> thread closes it.
>
> Any suggestion?
>
>>
>>>
>>> Also, who is calling qio_channel_rdma_close in such a way that another
>>> thread is still using it?  Would it be possible to synchronize with the
>>> other thread *before*, for example with qemu_thread_join?
>>
>> The MigrationState structure includes to_dst_file and from_dst_file
>> QEMUFile, the two QEMUFile use the same QIOChannel.
>> For example, if the return path thread call
>> qemu_fclose(ms->rp_state.from_dst_file),
>> It will also close the RDMAContext for ms->to_dst_file.
>>
>> For live migration, the source qemu invokes qemu_fclose in different
>> threads include main thread, migration thread, return path thread.
>>
>> The destination qemu invokes qemu_fclose in main thread, listen thread and
>> COLO incoming thread.
>>
>> I do not find an effective way to synchronize these threads.
>>
>> Thanks.
>>
>>>
>>> Thanks,
>>>
>>> Paolo


Thread overview: 13+ messages
2018-05-05 14:35 [Qemu-devel] [PATCH v3 0/6] Enable postcopy RDMA live migration Lidong Chen
2018-05-05 14:35 ` [Qemu-devel] [PATCH v3 1/6] migration: disable RDMA WRITE after postcopy started Lidong Chen
2018-05-05 14:35 ` [Qemu-devel] [PATCH v3 2/6] migration: create a dedicated connection for rdma return path Lidong Chen
2018-05-05 14:35 ` [Qemu-devel] [PATCH v3 3/6] migration: remove unnecessary variables len in QIOChannelRDMA Lidong Chen
2018-05-08 14:19   ` Dr. David Alan Gilbert
2018-05-09  1:28     ` 858585 jemmy
2018-05-05 14:35 ` [Qemu-devel] [PATCH v3 4/6] migration: avoid concurrent invoke channel_close by different threads Lidong Chen
2018-05-05 14:35 ` [Qemu-devel] [PATCH v3 5/6] migration: implement bi-directional RDMA QIOChannel Lidong Chen
2018-05-15 14:54   ` Paolo Bonzini
2018-05-16  9:36     ` 858585 jemmy
2018-05-21 11:49       ` 858585 jemmy
2018-05-23  2:36         ` 858585 jemmy
2018-05-05 14:35 ` [Qemu-devel] [PATCH v3 6/6] migration: Stop rdma yielding during incoming postcopy Lidong Chen
