* [PATCH RFC 0/7] migration/multifd: quit unifications and separate sync packet
@ 2023-10-22 20:12 Peter Xu
  2023-10-22 20:12 ` [PATCH RFC 1/7] migration: Drop stale comment for multifd zero copy Peter Xu
                   ` (6 more replies)
  0 siblings, 7 replies; 20+ messages in thread
From: Peter Xu @ 2023-10-22 20:12 UTC (permalink / raw)
  To: qemu-devel; +Cc: Juan Quintela, Fabiano Rosas, peterx

This is an RFC series targeting 9.0 at the earliest, so it's not urgent
for this release; it's posted just for early reviews.

That said, patch 1 could be a bugfix; I didn't copy stable as I don't
think it's worthwhile.  It may still be worth picking up soon.

This series mainly does two things, as mentioned in the subject:

 1) Quit unification: after reading Fabiano's patch, I took it further and
    dropped p->quit; meanwhile I found some paths that may miss things here
    and there.  All of them are now cleaned/fixed up.

 2) Separate SYNC packet: the SYNC packet handling is confusing in multifd,
    as it's the only case where the main thread can modify p->flags too.
    The field "pending_job" is also confusing as an integer.  Splitting it
    helps make the multifd code more readable, turning pending_job into a
    boolean (with another boolean, pending_sync, added for SYNC); see the
    sketch right below.
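
A rough sketch of 2), for illustration only (the real fields and flow are
in the later patches of this series):

    /* MultiFDSendParams, after the split (sketch) */
    bool pending_job;       /* thread has queued pages to send */
    bool pending_sync;      /* main thread requested a SYNC packet */

    /* send thread dispatch, one iteration (sketch) */
    if (p->pending_job) {
        /* send one packet of queued pages, then clear pending_job */
    } else if (p->pending_sync) {
        p->flags |= MULTIFD_FLAG_SYNC;
        /* send an empty SYNC packet, post p->sem_sync, clear pending_sync */
    }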

I'm more confident that 1) is a good idea; less so about 2).  I put the
last patch last because I think it reduces duplication, but I'm not sure
whether that's a common flavour of how code should be written.  Let me know
your opinions.  Thanks,

Peter Xu (7):
  migration: Drop stale comment for multifd zero copy
  migration: Fix error leak in multifd_tls_outgoing_handshake()
  migration: multifd_send_kick_main()
  migration: Drop MultiFDSendParams.quit and cleanup error paths
  migration: Modulize multifd send threads with a few helpers
  migration: Split multifd pending_job into two booleans
  migration: Further unify paths for multifd normal or sync requests

 migration/multifd.h |  18 +--
 migration/multifd.c | 307 +++++++++++++++++++++++---------------------
 2 files changed, 170 insertions(+), 155 deletions(-)

-- 
2.41.0

* [PATCH RFC 1/7] migration: Drop stale comment for multifd zero copy
  2023-10-22 20:12 [PATCH RFC 0/7] migration/multifd: quit unifications and separate sync packet Peter Xu
@ 2023-10-22 20:12 ` Peter Xu
  2023-10-23 14:16   ` Fabiano Rosas
  2023-10-22 20:12 ` [PATCH RFC 2/7] migration: Fix error leak in multifd_tls_outgoing_handshake() Peter Xu
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 20+ messages in thread
From: Peter Xu @ 2023-10-22 20:12 UTC (permalink / raw)
  To: qemu-devel; +Cc: Juan Quintela, Fabiano Rosas, peterx

We've already done that with multifd_flush_after_each_section, for multifd
in general.  Drop the stale "TODO-like" comment.

Signed-off-by: Peter Xu <peterx@redhat.com>
---
 migration/multifd.c | 11 -----------
 1 file changed, 11 deletions(-)

diff --git a/migration/multifd.c b/migration/multifd.c
index 1fe53d3b98..c8bdd88041 100644
--- a/migration/multifd.c
+++ b/migration/multifd.c
@@ -600,17 +600,6 @@ int multifd_send_sync_main(QEMUFile *f)
         }
     }
 
-    /*
-     * When using zero-copy, it's necessary to flush the pages before any of
-     * the pages can be sent again, so we'll make sure the new version of the
-     * pages will always arrive _later_ than the old pages.
-     *
-     * Currently we achieve this by flushing the zero-page requested writes
-     * per ram iteration, but in the future we could potentially optimize it
-     * to be less frequent, e.g. only after we finished one whole scanning of
-     * all the dirty bitmaps.
-     */
-
     flush_zero_copy = migrate_zero_copy_send();
 
     for (i = 0; i < migrate_multifd_channels(); i++) {
-- 
2.41.0

* [PATCH RFC 2/7] migration: Fix error leak in multifd_tls_outgoing_handshake()
  2023-10-22 20:12 [PATCH RFC 0/7] migration/multifd: quit unifications and separate sync packet Peter Xu
  2023-10-22 20:12 ` [PATCH RFC 1/7] migration: Drop stale comment for multifd zero copy Peter Xu
@ 2023-10-22 20:12 ` Peter Xu
  2023-10-23 14:17   ` Fabiano Rosas
  2023-10-22 20:12 ` [PATCH RFC 3/7] migration: multifd_send_kick_main() Peter Xu
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 20+ messages in thread
From: Peter Xu @ 2023-10-22 20:12 UTC (permalink / raw)
  To: qemu-devel; +Cc: Juan Quintela, Fabiano Rosas, peterx

The Error* is leaked when an error is triggered.

It logically should have a Fixes tag, but since the code has changed a few
times, the backport won't be straightforward anyway.  Let's not bother
with that, and just stop leaking the error in the failure path for now.

Signed-off-by: Peter Xu <peterx@redhat.com>
---
 migration/multifd.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/migration/multifd.c b/migration/multifd.c
index c8bdd88041..4afdd88602 100644
--- a/migration/multifd.c
+++ b/migration/multifd.c
@@ -789,6 +789,7 @@ static void multifd_tls_outgoing_handshake(QIOTask *task,
     p->quit = true;
     qemu_sem_post(&multifd_send_state->channels_ready);
     qemu_sem_post(&p->sem_sync);
+    error_free(err);
 }
 
 static void *multifd_tls_handshake_thread(void *opaque)
-- 
2.41.0

* [PATCH RFC 3/7] migration: multifd_send_kick_main()
  2023-10-22 20:12 [PATCH RFC 0/7] migration/multifd: quit unifications and separate sync packet Peter Xu
  2023-10-22 20:12 ` [PATCH RFC 1/7] migration: Drop stale comment for multifd zero copy Peter Xu
  2023-10-22 20:12 ` [PATCH RFC 2/7] migration: Fix error leak in multifd_tls_outgoing_handshake() Peter Xu
@ 2023-10-22 20:12 ` Peter Xu
  2023-10-23 14:43   ` Fabiano Rosas
  2023-11-08 22:49   ` Fabiano Rosas
  2023-10-22 20:12 ` [PATCH RFC 4/7] migration: Drop MultiFDSendParams.quit and cleanup error paths Peter Xu
                   ` (3 subsequent siblings)
  6 siblings, 2 replies; 20+ messages in thread
From: Peter Xu @ 2023-10-22 20:12 UTC (permalink / raw)
  To: qemu-devel; +Cc: Juan Quintela, Fabiano Rosas, peterx

When a multifd sender thread hits an error, it always needs to kick the
main thread by posting all the semaphores that it may be waiting upon.

Provide a helper for it and deduplicate the code.

Signed-off-by: Peter Xu <peterx@redhat.com>
---
 migration/multifd.c | 21 +++++++++++++++------
 1 file changed, 15 insertions(+), 6 deletions(-)

diff --git a/migration/multifd.c b/migration/multifd.c
index 4afdd88602..33fb21d0e4 100644
--- a/migration/multifd.c
+++ b/migration/multifd.c
@@ -374,6 +374,18 @@ struct {
     MultiFDMethods *ops;
 } *multifd_send_state;
 
+/*
+ * The migration thread can wait on either of the two semaphores.  This
+ * function can be used to kick the main thread out of waiting on either of
+ * them.  Should mostly only be called when something wrong happened with
+ * the current multifd send thread.
+ */
+static void multifd_send_kick_main(MultiFDSendParams *p)
+{
+    qemu_sem_post(&p->sem_sync);
+    qemu_sem_post(&multifd_send_state->channels_ready);
+}
+
 /*
  * How we use multifd_send_state->pages and channel->pages?
  *
@@ -746,8 +758,7 @@ out:
         assert(local_err);
         trace_multifd_send_error(p->id);
         multifd_send_terminate_threads(local_err);
-        qemu_sem_post(&p->sem_sync);
-        qemu_sem_post(&multifd_send_state->channels_ready);
+        multifd_send_kick_main(p);
         error_free(local_err);
     }
 
@@ -787,8 +798,7 @@ static void multifd_tls_outgoing_handshake(QIOTask *task,
      * is not created, and then tell who pay attention to me.
      */
     p->quit = true;
-    qemu_sem_post(&multifd_send_state->channels_ready);
-    qemu_sem_post(&p->sem_sync);
+    multifd_send_kick_main(p);
     error_free(err);
 }
 
@@ -859,8 +869,7 @@ static void multifd_new_send_channel_cleanup(MultiFDSendParams *p,
 {
      migrate_set_error(migrate_get_current(), err);
      /* Error happen, we need to tell who pay attention to me */
-     qemu_sem_post(&multifd_send_state->channels_ready);
-     qemu_sem_post(&p->sem_sync);
+     multifd_send_kick_main(p);
      /*
       * Although multifd_send_thread is not created, but main migration
       * thread need to judge whether it is running, so we need to mark
-- 
2.41.0

* [PATCH RFC 4/7] migration: Drop MultiFDSendParams.quit and cleanup error paths
  2023-10-22 20:12 [PATCH RFC 0/7] migration/multifd: quit unifications and separate sync packet Peter Xu
                   ` (2 preceding siblings ...)
  2023-10-22 20:12 ` [PATCH RFC 3/7] migration: multifd_send_kick_main() Peter Xu
@ 2023-10-22 20:12 ` Peter Xu
  2023-10-23 14:42   ` Fabiano Rosas
  2023-10-22 20:12 ` [PATCH RFC 5/7] migration: Modulize multifd send threads with a few helpers Peter Xu
                   ` (2 subsequent siblings)
  6 siblings, 1 reply; 20+ messages in thread
From: Peter Xu @ 2023-10-22 20:12 UTC (permalink / raw)
  To: qemu-devel; +Cc: Juan Quintela, Fabiano Rosas, peterx

Multifd send side has two fields to indicate error quits:

  - MultiFDSendParams.quit
  - &multifd_send_state->exiting

Merge them into the global one.  The replacement is done by changing all
p->quit checks into the global var check.  The global check doesn't need
any lock.
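
In other words, the check changes roughly like this (a sketch, mirroring
the diff below):

    /* before: per-channel flag, protected by p->mutex */
    qemu_mutex_lock(&p->mutex);
    if (p->quit) {
        /* ... */
    }
    qemu_mutex_unlock(&p->mutex);

    /* after: one global flag, read atomically, no lock needed */
    if (qatomic_read(&multifd_send_state->exiting)) {
        return -1;
    }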

A few more things done on top of this altogether:

  - multifd_send_terminate_threads()

    Moving the xchg() of &multifd_send_state->exiting earlier, so as to cover
    the tracepoint, migrate_set_error() and migrate_set_state().

  - multifd_send_sync_main()

    In the 2nd loop, add one more check over the global var to make sure we
    don't keep looping if QEMU has already decided to quit.

  - multifd_tls_outgoing_handshake()

    Use multifd_send_terminate_threads() to set the error state.  That has
    a benefit of updating MigrationState.error to that error too, so we can
    persist that 1st error we hit in that specific channel.

  - multifd_new_send_channel_async()

    Take a similar approach as above: drop the migrate_set_error() because
    multifd_send_terminate_threads() already covers that.  Unwrap the helper
    multifd_new_send_channel_cleanup() along the way; it's not really needed.

Signed-off-by: Peter Xu <peterx@redhat.com>
---
 migration/multifd.h |  2 --
 migration/multifd.c | 82 ++++++++++++++-------------------------------
 2 files changed, 26 insertions(+), 58 deletions(-)

diff --git a/migration/multifd.h b/migration/multifd.h
index a835643b48..2acf400085 100644
--- a/migration/multifd.h
+++ b/migration/multifd.h
@@ -97,8 +97,6 @@ typedef struct {
     QemuMutex mutex;
     /* is this channel thread running */
     bool running;
-    /* should this thread finish */
-    bool quit;
     /* multifd flags for each packet */
     uint32_t flags;
     /* global number of generated multifd packets */
diff --git a/migration/multifd.c b/migration/multifd.c
index 33fb21d0e4..9d458914a9 100644
--- a/migration/multifd.c
+++ b/migration/multifd.c
@@ -411,10 +411,6 @@ static int multifd_send_pages(QEMUFile *f)
     MultiFDSendParams *p = NULL; /* make happy gcc */
     MultiFDPages_t *pages = multifd_send_state->pages;
 
-    if (qatomic_read(&multifd_send_state->exiting)) {
-        return -1;
-    }
-
     qemu_sem_wait(&multifd_send_state->channels_ready);
     /*
      * next_channel can remain from a previous migration that was
@@ -423,14 +419,11 @@ static int multifd_send_pages(QEMUFile *f)
      */
     next_channel %= migrate_multifd_channels();
     for (i = next_channel;; i = (i + 1) % migrate_multifd_channels()) {
-        p = &multifd_send_state->params[i];
-
-        qemu_mutex_lock(&p->mutex);
-        if (p->quit) {
-            error_report("%s: channel %d has already quit!", __func__, i);
-            qemu_mutex_unlock(&p->mutex);
+        if (qatomic_read(&multifd_send_state->exiting)) {
             return -1;
         }
+        p = &multifd_send_state->params[i];
+        qemu_mutex_lock(&p->mutex);
         if (!p->pending_job) {
             p->pending_job++;
             next_channel = (i + 1) % migrate_multifd_channels();
@@ -485,6 +478,16 @@ static void multifd_send_terminate_threads(Error *err)
 {
     int i;
 
+    /*
+     * We don't want to exit each threads twice.  Depending on where
+     * we get the error, or if there are two independent errors in two
+     * threads at the same time, we can end calling this function
+     * twice.
+     */
+    if (qatomic_xchg(&multifd_send_state->exiting, 1)) {
+        return;
+    }
+
     trace_multifd_send_terminate_threads(err != NULL);
 
     if (err) {
@@ -499,26 +502,13 @@ static void multifd_send_terminate_threads(Error *err)
         }
     }
 
-    /*
-     * We don't want to exit each threads twice.  Depending on where
-     * we get the error, or if there are two independent errors in two
-     * threads at the same time, we can end calling this function
-     * twice.
-     */
-    if (qatomic_xchg(&multifd_send_state->exiting, 1)) {
-        return;
-    }
-
     for (i = 0; i < migrate_multifd_channels(); i++) {
         MultiFDSendParams *p = &multifd_send_state->params[i];
 
-        qemu_mutex_lock(&p->mutex);
-        p->quit = true;
         qemu_sem_post(&p->sem);
         if (p->c) {
             qio_channel_shutdown(p->c, QIO_CHANNEL_SHUTDOWN_BOTH, NULL);
         }
-        qemu_mutex_unlock(&p->mutex);
     }
 }
 
@@ -617,16 +607,13 @@ int multifd_send_sync_main(QEMUFile *f)
     for (i = 0; i < migrate_multifd_channels(); i++) {
         MultiFDSendParams *p = &multifd_send_state->params[i];
 
-        trace_multifd_send_sync_main_signal(p->id);
-
-        qemu_mutex_lock(&p->mutex);
-
-        if (p->quit) {
-            error_report("%s: channel %d has already quit", __func__, i);
-            qemu_mutex_unlock(&p->mutex);
+        if (qatomic_read(&multifd_send_state->exiting)) {
             return -1;
         }
 
+        trace_multifd_send_sync_main_signal(p->id);
+
+        qemu_mutex_lock(&p->mutex);
         p->packet_num = multifd_send_state->packet_num++;
         p->flags |= MULTIFD_FLAG_SYNC;
         p->pending_job++;
@@ -636,6 +623,10 @@ int multifd_send_sync_main(QEMUFile *f)
     for (i = 0; i < migrate_multifd_channels(); i++) {
         MultiFDSendParams *p = &multifd_send_state->params[i];
 
+        if (qatomic_read(&multifd_send_state->exiting)) {
+            return -1;
+        }
+
         qemu_sem_wait(&multifd_send_state->channels_ready);
         trace_multifd_send_sync_main_wait(p->id);
         qemu_sem_wait(&p->sem_sync);
@@ -744,9 +735,6 @@ static void *multifd_send_thread(void *opaque)
             if (flags & MULTIFD_FLAG_SYNC) {
                 qemu_sem_post(&p->sem_sync);
             }
-        } else if (p->quit) {
-            qemu_mutex_unlock(&p->mutex);
-            break;
         } else {
             qemu_mutex_unlock(&p->mutex);
             /* sometimes there are spurious wakeups */
@@ -793,11 +781,7 @@ static void multifd_tls_outgoing_handshake(QIOTask *task,
 
     trace_multifd_tls_outgoing_handshake_error(ioc, error_get_pretty(err));
 
-    /*
-     * Error happen, mark multifd_send_thread status as 'quit' although it
-     * is not created, and then tell who pay attention to me.
-     */
-    p->quit = true;
+    multifd_send_terminate_threads(err);
     multifd_send_kick_main(p);
     error_free(err);
 }
@@ -864,22 +848,6 @@ static bool multifd_channel_connect(MultiFDSendParams *p,
     return true;
 }
 
-static void multifd_new_send_channel_cleanup(MultiFDSendParams *p,
-                                             QIOChannel *ioc, Error *err)
-{
-     migrate_set_error(migrate_get_current(), err);
-     /* Error happen, we need to tell who pay attention to me */
-     multifd_send_kick_main(p);
-     /*
-      * Although multifd_send_thread is not created, but main migration
-      * thread need to judge whether it is running, so we need to mark
-      * its status.
-      */
-     p->quit = true;
-     object_unref(OBJECT(ioc));
-     error_free(err);
-}
-
 static void multifd_new_send_channel_async(QIOTask *task, gpointer opaque)
 {
     MultiFDSendParams *p = opaque;
@@ -897,7 +865,10 @@ static void multifd_new_send_channel_async(QIOTask *task, gpointer opaque)
     }
 
     trace_multifd_new_send_channel_async_error(p->id, local_err);
-    multifd_new_send_channel_cleanup(p, ioc, local_err);
+    multifd_send_terminate_threads(local_err);
+    multifd_send_kick_main(p);
+    object_unref(OBJECT(ioc));
+    error_free(local_err);
 }
 
 static void multifd_new_send_channel_create(gpointer opaque)
@@ -929,7 +900,6 @@ int multifd_save_setup(Error **errp)
         qemu_mutex_init(&p->mutex);
         qemu_sem_init(&p->sem, 0);
         qemu_sem_init(&p->sem_sync, 0);
-        p->quit = false;
         p->pending_job = 0;
         p->id = i;
         p->pages = multifd_pages_init(page_count);
-- 
2.41.0

* [PATCH RFC 5/7] migration: Modulize multifd send threads with a few helpers
  2023-10-22 20:12 [PATCH RFC 0/7] migration/multifd: quit unifications and separate sync packet Peter Xu
                   ` (3 preceding siblings ...)
  2023-10-22 20:12 ` [PATCH RFC 4/7] migration: Drop MultiFDSendParams.quit and cleanup error paths Peter Xu
@ 2023-10-22 20:12 ` Peter Xu
  2023-10-22 20:12 ` [PATCH RFC 6/7] migration: Split multifd pending_job into two booleans Peter Xu
  2023-10-22 20:12 ` [PATCH RFC 7/7] migration: Further unify paths for multifd normal or sync requests Peter Xu
  6 siblings, 0 replies; 20+ messages in thread
From: Peter Xu @ 2023-10-22 20:12 UTC (permalink / raw)
  To: qemu-devel; +Cc: Juan Quintela, Fabiano Rosas, peterx

Abstract the multifd send packet logic into two phases:

  - multifd_send_prepare(): prepare the packet header, with the mutex held
  - multifd_do_send(): do the actual send, without the mutex

While at it, always let the send thread use Error* for detecting errors,
dropping "int ret" altogether.

One trivial change is that the send thread now kicks sem_sync within the
mutex critical section, but that shouldn't be a problem.
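
For clarity, the resulting shape of the send loop is roughly (sketch only,
mirroring the diff below):

    qemu_mutex_lock(&p->mutex);
    if (p->pending_job) {
        bool need_sync = p->flags & MULTIFD_FLAG_SYNC;

        /* phase 1: fill the packet header, with the mutex held */
        if (!multifd_send_prepare(p, &local_err)) {
            qemu_mutex_unlock(&p->mutex);
            goto out;
        }

        /* phase 2: the actual send, without the mutex */
        qemu_mutex_unlock(&p->mutex);
        if (!multifd_do_send(p, &local_err)) {
            goto out;
        }
        qemu_mutex_lock(&p->mutex);

        p->pending_job--;
        /* sem_sync is now kicked inside the critical section */
        if (need_sync) {
            qemu_sem_post(&p->sem_sync);
        }
    }
    qemu_mutex_unlock(&p->mutex);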

Signed-off-by: Peter Xu <peterx@redhat.com>
---
 migration/multifd.c | 160 ++++++++++++++++++++++++++------------------
 1 file changed, 96 insertions(+), 64 deletions(-)

diff --git a/migration/multifd.c b/migration/multifd.c
index 9d458914a9..8140520843 100644
--- a/migration/multifd.c
+++ b/migration/multifd.c
@@ -640,13 +640,89 @@ int multifd_send_sync_main(QEMUFile *f)
     return 0;
 }
 
+/*
+ * Returns true on success, false otherwise (with errp set).  Caller must
+ * hold p->mutex.
+ */
+static bool multifd_send_prepare(MultiFDSendParams *p, Error **errp)
+{
+    bool use_zero_copy_send = migrate_zero_copy_send();
+    uint64_t packet_num = p->packet_num;
+    uint32_t flags;
+    int ret;
+
+    p->normal_num = 0;
+
+    if (use_zero_copy_send) {
+        p->iovs_num = 0;
+    } else {
+        p->iovs_num = 1;
+    }
+
+    for (int i = 0; i < p->pages->num; i++) {
+        p->normal[p->normal_num] = p->pages->offset[i];
+        p->normal_num++;
+    }
+
+    if (p->normal_num) {
+        ret = multifd_send_state->ops->send_prepare(p, errp);
+        if (ret != 0) {
+            return false;
+        }
+    }
+    multifd_send_fill_packet(p);
+    flags = p->flags;
+    p->flags = 0;
+    p->num_packets++;
+    p->total_normal_pages += p->normal_num;
+    p->pages->num = 0;
+    p->pages->block = NULL;
+
+    trace_multifd_send(p->id, packet_num, p->normal_num, flags,
+                       p->next_packet_size);
+
+    return true;
+}
+
+/* Returns true on success, false otherwise (with errp set) */
+static bool multifd_do_send(MultiFDSendParams *p, Error **errp)
+{
+    bool use_zero_copy_send = migrate_zero_copy_send();
+    int ret;
+
+    if (use_zero_copy_send) {
+        /* Send header first, without zerocopy */
+        ret = qio_channel_write_all(p->c, (void *)p->packet,
+                                    p->packet_len, errp);
+        if (ret != 0) {
+            return false;
+        }
+    } else {
+        /* Send header using the same writev call */
+        p->iov[0].iov_len = p->packet_len;
+        p->iov[0].iov_base = p->packet;
+    }
+
+    ret = qio_channel_writev_full_all(p->c, p->iov, p->iovs_num, NULL,
+                                      0, p->write_flags, errp);
+    if (ret != 0) {
+        return false;
+    }
+
+    stat64_add(&mig_stats.multifd_bytes,
+               p->next_packet_size + p->packet_len);
+    stat64_add(&mig_stats.transferred,
+               p->next_packet_size + p->packet_len);
+    p->next_packet_size = 0;
+
+    return true;
+}
+
 static void *multifd_send_thread(void *opaque)
 {
     MultiFDSendParams *p = opaque;
     MigrationThread *thread = NULL;
     Error *local_err = NULL;
-    int ret = 0;
-    bool use_zero_copy_send = migrate_zero_copy_send();
 
     thread = migration_threads_add(p->name, qemu_get_thread_id());
 
@@ -654,9 +730,10 @@ static void *multifd_send_thread(void *opaque)
     rcu_register_thread();
 
     if (multifd_send_initial_packet(p, &local_err) < 0) {
-        ret = -1;
+        assert(local_err);
         goto out;
     }
+
     /* initial packet */
     p->num_packets = 1;
 
@@ -667,83 +744,38 @@ static void *multifd_send_thread(void *opaque)
         if (qatomic_read(&multifd_send_state->exiting)) {
             break;
         }
-        qemu_mutex_lock(&p->mutex);
 
+        qemu_mutex_lock(&p->mutex);
         if (p->pending_job) {
-            uint64_t packet_num = p->packet_num;
-            uint32_t flags;
-            p->normal_num = 0;
-
-            if (use_zero_copy_send) {
-                p->iovs_num = 0;
-            } else {
-                p->iovs_num = 1;
-            }
+            bool need_sync = p->flags & MULTIFD_FLAG_SYNC;
 
-            for (int i = 0; i < p->pages->num; i++) {
-                p->normal[p->normal_num] = p->pages->offset[i];
-                p->normal_num++;
+            if (!multifd_send_prepare(p, &local_err)) {
+                assert(local_err);
+                qemu_mutex_unlock(&p->mutex);
+                goto out;
             }
 
-            if (p->normal_num) {
-                ret = multifd_send_state->ops->send_prepare(p, &local_err);
-                if (ret != 0) {
-                    qemu_mutex_unlock(&p->mutex);
-                    break;
-                }
-            }
-            multifd_send_fill_packet(p);
-            flags = p->flags;
-            p->flags = 0;
-            p->num_packets++;
-            p->total_normal_pages += p->normal_num;
-            p->pages->num = 0;
-            p->pages->block = NULL;
+            /* Send the packets without mutex */
             qemu_mutex_unlock(&p->mutex);
-
-            trace_multifd_send(p->id, packet_num, p->normal_num, flags,
-                               p->next_packet_size);
-
-            if (use_zero_copy_send) {
-                /* Send header first, without zerocopy */
-                ret = qio_channel_write_all(p->c, (void *)p->packet,
-                                            p->packet_len, &local_err);
-                if (ret != 0) {
-                    break;
-                }
-            } else {
-                /* Send header using the same writev call */
-                p->iov[0].iov_len = p->packet_len;
-                p->iov[0].iov_base = p->packet;
+            if (!multifd_do_send(p, &local_err)) {
+                assert(local_err);
+                goto out;
             }
-
-            ret = qio_channel_writev_full_all(p->c, p->iov, p->iovs_num, NULL,
-                                              0, p->write_flags, &local_err);
-            if (ret != 0) {
-                break;
-            }
-
-            stat64_add(&mig_stats.multifd_bytes,
-                       p->next_packet_size + p->packet_len);
-            stat64_add(&mig_stats.transferred,
-                       p->next_packet_size + p->packet_len);
-            p->next_packet_size = 0;
             qemu_mutex_lock(&p->mutex);
+
+            /* Send successful, mark the task completed */
             p->pending_job--;
-            qemu_mutex_unlock(&p->mutex);
 
-            if (flags & MULTIFD_FLAG_SYNC) {
+            /* If this is a sync task, we need one more kick */
+            if (need_sync) {
                 qemu_sem_post(&p->sem_sync);
             }
-        } else {
-            qemu_mutex_unlock(&p->mutex);
-            /* sometimes there are spurious wakeups */
         }
+        qemu_mutex_unlock(&p->mutex);
     }
 
 out:
-    if (ret) {
-        assert(local_err);
+    if (local_err) {
         trace_multifd_send_error(p->id);
         multifd_send_terminate_threads(local_err);
         multifd_send_kick_main(p);
-- 
2.41.0

* [PATCH RFC 6/7] migration: Split multifd pending_job into two booleans
  2023-10-22 20:12 [PATCH RFC 0/7] migration/multifd: quit unifications and separate sync packet Peter Xu
                   ` (4 preceding siblings ...)
  2023-10-22 20:12 ` [PATCH RFC 5/7] migration: Modulize multifd send threads with a few helpers Peter Xu
@ 2023-10-22 20:12 ` Peter Xu
  2023-10-23 15:15   ` Fabiano Rosas
  2023-10-22 20:12 ` [PATCH RFC 7/7] migration: Further unify paths for multifd normal or sync requests Peter Xu
  6 siblings, 1 reply; 20+ messages in thread
From: Peter Xu @ 2023-10-22 20:12 UTC (permalink / raw)
  To: qemu-devel; +Cc: Juan Quintela, Fabiano Rosas, peterx

Signed-off-by: Peter Xu <peterx@redhat.com>
---
 migration/multifd.h | 16 ++++++++++------
 migration/multifd.c | 33 +++++++++++++++++++++++----------
 2 files changed, 33 insertions(+), 16 deletions(-)

diff --git a/migration/multifd.h b/migration/multifd.h
index 2acf400085..ddee7b8d8a 100644
--- a/migration/multifd.h
+++ b/migration/multifd.h
@@ -101,12 +101,16 @@ typedef struct {
     uint32_t flags;
     /* global number of generated multifd packets */
     uint64_t packet_num;
-    /* thread has work to do */
-    int pending_job;
-    /* array of pages to sent.
-     * The owner of 'pages' depends of 'pending_job' value:
-     * pending_job == 0 -> migration_thread can use it.
-     * pending_job != 0 -> multifd_channel can use it.
+    /* thread has a request to sync all data */
+    bool pending_sync;
+    /* thread has something to send */
+    bool pending_job;
+    /*
+     * Array of pages to send. The owner of 'pages' depends on
+     * 'pending_job' value:
+     *
+     *   - true -> multifd_channel owns it.
+     *   - false -> migration_thread owns it.
      */
     MultiFDPages_t *pages;
 
diff --git a/migration/multifd.c b/migration/multifd.c
index 8140520843..fe8d746ff9 100644
--- a/migration/multifd.c
+++ b/migration/multifd.c
@@ -425,7 +425,7 @@ static int multifd_send_pages(QEMUFile *f)
         p = &multifd_send_state->params[i];
         qemu_mutex_lock(&p->mutex);
         if (!p->pending_job) {
-            p->pending_job++;
+            p->pending_job = true;
             next_channel = (i + 1) % migrate_multifd_channels();
             break;
         }
@@ -615,8 +615,7 @@ int multifd_send_sync_main(QEMUFile *f)
 
         qemu_mutex_lock(&p->mutex);
         p->packet_num = multifd_send_state->packet_num++;
-        p->flags |= MULTIFD_FLAG_SYNC;
-        p->pending_job++;
+        p->pending_sync = true;
         qemu_mutex_unlock(&p->mutex);
         qemu_sem_post(&p->sem);
     }
@@ -747,8 +746,6 @@ static void *multifd_send_thread(void *opaque)
 
         qemu_mutex_lock(&p->mutex);
         if (p->pending_job) {
-            bool need_sync = p->flags & MULTIFD_FLAG_SYNC;
-
             if (!multifd_send_prepare(p, &local_err)) {
                 assert(local_err);
                 qemu_mutex_unlock(&p->mutex);
@@ -764,12 +761,27 @@ static void *multifd_send_thread(void *opaque)
             qemu_mutex_lock(&p->mutex);
 
             /* Send successful, mark the task completed */
-            p->pending_job--;
+            p->pending_job = false;
+
+        } else if (p->pending_sync) {
+            p->flags |= MULTIFD_FLAG_SYNC;
+
+            if (!multifd_send_prepare(p, &local_err)) {
+                assert(local_err);
+                qemu_mutex_unlock(&p->mutex);
+                goto out;
+            }
 
-            /* If this is a sync task, we need one more kick */
-            if (need_sync) {
-                qemu_sem_post(&p->sem_sync);
+            /* Send the packets without mutex */
+            qemu_mutex_unlock(&p->mutex);
+            if (!multifd_do_send(p, &local_err)) {
+                assert(local_err);
+                goto out;
             }
+            qemu_mutex_lock(&p->mutex);
+
+            qemu_sem_post(&p->sem_sync);
+            p->pending_sync = false;
         }
         qemu_mutex_unlock(&p->mutex);
     }
@@ -932,7 +944,8 @@ int multifd_save_setup(Error **errp)
         qemu_mutex_init(&p->mutex);
         qemu_sem_init(&p->sem, 0);
         qemu_sem_init(&p->sem_sync, 0);
-        p->pending_job = 0;
+        p->pending_job = false;
+        p->pending_sync = false;
         p->id = i;
         p->pages = multifd_pages_init(page_count);
         p->packet_len = sizeof(MultiFDPacket_t)
-- 
2.41.0

* [PATCH RFC 7/7] migration: Further unify paths for multifd normal or sync requests
  2023-10-22 20:12 [PATCH RFC 0/7] migration/multifd: quit unifications and separate sync packet Peter Xu
                   ` (5 preceding siblings ...)
  2023-10-22 20:12 ` [PATCH RFC 6/7] migration: Split multifd pending_job into two booleans Peter Xu
@ 2023-10-22 20:12 ` Peter Xu
  6 siblings, 0 replies; 20+ messages in thread
From: Peter Xu @ 2023-10-22 20:12 UTC (permalink / raw)
  To: qemu-devel; +Cc: Juan Quintela, Fabiano Rosas, peterx

Provide multifd_send_execute() to merge the duplicated code.

The trick here is that multifd_send_execute() conditionally holds the mutex
on return, depending on the retval, as sketched below.
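
The caller side then collapses to (a sketch, matching the diff below):

    qemu_mutex_lock(&p->mutex);
    if (p->pending_job) {
        if (!multifd_send_execute(p, &local_err)) {
            /* failure: the mutex was already released by the helper */
            goto out;
        }
        /* success: the mutex was re-acquired by the helper */
        p->pending_job = false;
    }
    qemu_mutex_unlock(&p->mutex);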

Signed-off-by: Peter Xu <peterx@redhat.com>
---
 migration/multifd.c | 51 ++++++++++++++++++++++-----------------------
 1 file changed, 25 insertions(+), 26 deletions(-)

diff --git a/migration/multifd.c b/migration/multifd.c
index fe8d746ff9..0052e5daee 100644
--- a/migration/multifd.c
+++ b/migration/multifd.c
@@ -717,6 +717,29 @@ static bool multifd_do_send(MultiFDSendParams *p, Error **errp)
     return true;
 }
 
+/*
+ * On success: returns true, mutex held.
+ * On failure: returns false, mutex released.
+ */
+static bool multifd_send_execute(MultiFDSendParams *p, Error **errp)
+{
+    if (!multifd_send_prepare(p, errp)) {
+        qemu_mutex_unlock(&p->mutex);
+        assert(*errp);
+        return false;
+    }
+
+    /* Send the packets without mutex */
+    qemu_mutex_unlock(&p->mutex);
+    if (!multifd_do_send(p, errp)) {
+        assert(*errp);
+        return false;
+    }
+    qemu_mutex_lock(&p->mutex);
+
+    return true;
+}
+
 static void *multifd_send_thread(void *opaque)
 {
     MultiFDSendParams *p = opaque;
@@ -746,40 +769,16 @@ static void *multifd_send_thread(void *opaque)
 
         qemu_mutex_lock(&p->mutex);
         if (p->pending_job) {
-            if (!multifd_send_prepare(p, &local_err)) {
-                assert(local_err);
-                qemu_mutex_unlock(&p->mutex);
+            if (!multifd_send_execute(p, &local_err)) {
                 goto out;
             }
-
-            /* Send the packets without mutex */
-            qemu_mutex_unlock(&p->mutex);
-            if (!multifd_do_send(p, &local_err)) {
-                assert(local_err);
-                goto out;
-            }
-            qemu_mutex_lock(&p->mutex);
-
-            /* Send successful, mark the task completed */
             p->pending_job = false;
 
         } else if (p->pending_sync) {
             p->flags |= MULTIFD_FLAG_SYNC;
-
-            if (!multifd_send_prepare(p, &local_err)) {
-                assert(local_err);
-                qemu_mutex_unlock(&p->mutex);
-                goto out;
-            }
-
-            /* Send the packets without mutex */
-            qemu_mutex_unlock(&p->mutex);
-            if (!multifd_do_send(p, &local_err)) {
-                assert(local_err);
+            if (!multifd_send_execute(p, &local_err)) {
                 goto out;
             }
-            qemu_mutex_lock(&p->mutex);
-
             qemu_sem_post(&p->sem_sync);
             p->pending_sync = false;
         }
-- 
2.41.0

* Re: [PATCH RFC 1/7] migration: Drop stale comment for multifd zero copy
  2023-10-22 20:12 ` [PATCH RFC 1/7] migration: Drop stale comment for multifd zero copy Peter Xu
@ 2023-10-23 14:16   ` Fabiano Rosas
  0 siblings, 0 replies; 20+ messages in thread
From: Fabiano Rosas @ 2023-10-23 14:16 UTC (permalink / raw)
  To: Peter Xu, qemu-devel; +Cc: Juan Quintela, peterx

Peter Xu <peterx@redhat.com> writes:

> We've already done that with multifd_flush_after_each_section, for multifd
> in general.  Drop the stale "TODO-like" comment.
>
> Signed-off-by: Peter Xu <peterx@redhat.com>

Reviewed-by: Fabiano Rosas <farosas@suse.de>

* Re: [PATCH RFC 2/7] migration: Fix error leak in multifd_tls_outgoing_handshake()
  2023-10-22 20:12 ` [PATCH RFC 2/7] migration: Fix error leak in multifd_tls_outgoing_handshake() Peter Xu
@ 2023-10-23 14:17   ` Fabiano Rosas
  0 siblings, 0 replies; 20+ messages in thread
From: Fabiano Rosas @ 2023-10-23 14:17 UTC (permalink / raw)
  To: Peter Xu, qemu-devel; +Cc: Juan Quintela, peterx

Peter Xu <peterx@redhat.com> writes:

> The Error* is leaked when an error is triggered.
>
> It logically should have a Fixes tag, but since the code has changed a few
> times, the backport won't be straightforward anyway.  Let's not bother
> with that, and just stop leaking the error in the failure path for now.
>
> Signed-off-by: Peter Xu <peterx@redhat.com>
> ---
>  migration/multifd.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/migration/multifd.c b/migration/multifd.c
> index c8bdd88041..4afdd88602 100644
> --- a/migration/multifd.c
> +++ b/migration/multifd.c
> @@ -789,6 +789,7 @@ static void multifd_tls_outgoing_handshake(QIOTask *task,
>      p->quit = true;
>      qemu_sem_post(&multifd_send_state->channels_ready);
>      qemu_sem_post(&p->sem_sync);
> +    error_free(err);
>  }
>  
>  static void *multifd_tls_handshake_thread(void *opaque)

Reviewed-by: Fabiano Rosas <farosas@suse.de>

* Re: [PATCH RFC 4/7] migration: Drop MultiFDSendParams.quit and cleanup error paths
  2023-10-22 20:12 ` [PATCH RFC 4/7] migration: Drop MultiFDSendParams.quit and cleanup error paths Peter Xu
@ 2023-10-23 14:42   ` Fabiano Rosas
  2023-10-23 14:53     ` Peter Xu
  0 siblings, 1 reply; 20+ messages in thread
From: Fabiano Rosas @ 2023-10-23 14:42 UTC (permalink / raw)
  To: Peter Xu, qemu-devel; +Cc: Juan Quintela, peterx

Peter Xu <peterx@redhat.com> writes:

> Multifd send side has two fields to indicate error quits:
>
>   - MultiFDSendParams.quit
>   - &multifd_send_state->exiting
>
> Merge them into the global one.  The replacement is done by changing all
> p->quit checks into the global var check.  The global check doesn't need
> any lock.
>
> A few more things done on top of this altogether:
>
>   - multifd_send_terminate_threads()
>
>     Moving the xchg() of &multifd_send_state->exiting earlier, so as to cover
>     the tracepoint, migrate_set_error() and migrate_set_state().
>
>   - multifd_send_sync_main()
>
>     In the 2nd loop, add one more check over the global var to make sure we
>     don't keep looping if QEMU has already decided to quit.
>
>   - multifd_tls_outgoing_handshake()
>
>     Use multifd_send_terminate_threads() to set the error state.  That has
>     a benefit of updating MigrationState.error to that error too, so we can
>     persist that 1st error we hit in that specific channel.
>
>   - multifd_new_send_channel_async()
>
>     Take a similar approach as above: drop the migrate_set_error() because
>     multifd_send_terminate_threads() already covers that.  Unwrap the helper
>     multifd_new_send_channel_cleanup() along the way; it's not really needed.

This all looks good to me. I had a very similar patch in the works. Just
one comment below.

> Signed-off-by: Peter Xu <peterx@redhat.com>
> ---
>  migration/multifd.h |  2 --
>  migration/multifd.c | 82 ++++++++++++++-------------------------------
>  2 files changed, 26 insertions(+), 58 deletions(-)
>
> diff --git a/migration/multifd.h b/migration/multifd.h
> index a835643b48..2acf400085 100644
> --- a/migration/multifd.h
> +++ b/migration/multifd.h
> @@ -97,8 +97,6 @@ typedef struct {
>      QemuMutex mutex;
>      /* is this channel thread running */
>      bool running;
> -    /* should this thread finish */
> -    bool quit;
>      /* multifd flags for each packet */
>      uint32_t flags;
>      /* global number of generated multifd packets */
> diff --git a/migration/multifd.c b/migration/multifd.c
> index 33fb21d0e4..9d458914a9 100644
> --- a/migration/multifd.c
> +++ b/migration/multifd.c
> @@ -411,10 +411,6 @@ static int multifd_send_pages(QEMUFile *f)
>      MultiFDSendParams *p = NULL; /* make happy gcc */
>      MultiFDPages_t *pages = multifd_send_state->pages;
>  
> -    if (qatomic_read(&multifd_send_state->exiting)) {
> -        return -1;
> -    }
> -

I'd keep this. This function can be called from outside of multifd code
so the channels could be completely gone already.

* Re: [PATCH RFC 3/7] migration: multifd_send_kick_main()
  2023-10-22 20:12 ` [PATCH RFC 3/7] migration: multifd_send_kick_main() Peter Xu
@ 2023-10-23 14:43   ` Fabiano Rosas
  2023-11-08 22:49   ` Fabiano Rosas
  1 sibling, 0 replies; 20+ messages in thread
From: Fabiano Rosas @ 2023-10-23 14:43 UTC (permalink / raw)
  To: Peter Xu, qemu-devel; +Cc: Juan Quintela, peterx

Peter Xu <peterx@redhat.com> writes:

> When a multifd sender thread hits an error, it always needs to kick the
> main thread by posting all the semaphores that it may be waiting upon.
>
> Provide a helper for it and deduplicate the code.
>
> Signed-off-by: Peter Xu <peterx@redhat.com>

Reviewed-by: Fabiano Rosas <farosas@suse.de>

* Re: [PATCH RFC 4/7] migration: Drop MultiFDSendParams.quit and cleanup error paths
  2023-10-23 14:42   ` Fabiano Rosas
@ 2023-10-23 14:53     ` Peter Xu
  2023-10-23 15:35       ` Fabiano Rosas
  0 siblings, 1 reply; 20+ messages in thread
From: Peter Xu @ 2023-10-23 14:53 UTC (permalink / raw)
  To: Fabiano Rosas; +Cc: qemu-devel, Juan Quintela

Fabiano,

On Mon, Oct 23, 2023 at 11:42:28AM -0300, Fabiano Rosas wrote:
> > diff --git a/migration/multifd.c b/migration/multifd.c
> > index 33fb21d0e4..9d458914a9 100644
> > --- a/migration/multifd.c
> > +++ b/migration/multifd.c
> > @@ -411,10 +411,6 @@ static int multifd_send_pages(QEMUFile *f)
> >      MultiFDSendParams *p = NULL; /* make happy gcc */
> >      MultiFDPages_t *pages = multifd_send_state->pages;
> >  
> > -    if (qatomic_read(&multifd_send_state->exiting)) {
> > -        return -1;
> > -    }
> > -
> 
> I'd keep this. This function can be called from outside of multifd code
> so the channels could be completely gone already.

I can definitely add it back; nothing hurts.  But I want to make sure I
didn't miss some point.

Do you have a specific path that could trigger what you said?

-- 
Peter Xu

* Re: [PATCH RFC 6/7] migration: Split multifd pending_job into two booleans
  2023-10-22 20:12 ` [PATCH RFC 6/7] migration: Split multifd pending_job into two booleans Peter Xu
@ 2023-10-23 15:15   ` Fabiano Rosas
  2023-10-23 15:52     ` Peter Xu
  0 siblings, 1 reply; 20+ messages in thread
From: Fabiano Rosas @ 2023-10-23 15:15 UTC (permalink / raw)
  To: Peter Xu, qemu-devel; +Cc: Juan Quintela, peterx

Peter Xu <peterx@redhat.com> writes:

> Signed-off-by: Peter Xu <peterx@redhat.com>
> ---
>  migration/multifd.h | 16 ++++++++++------
>  migration/multifd.c | 33 +++++++++++++++++++++++----------
>  2 files changed, 33 insertions(+), 16 deletions(-)
>
> diff --git a/migration/multifd.h b/migration/multifd.h
> index 2acf400085..ddee7b8d8a 100644
> --- a/migration/multifd.h
> +++ b/migration/multifd.h
> @@ -101,12 +101,16 @@ typedef struct {
>      uint32_t flags;
>      /* global number of generated multifd packets */
>      uint64_t packet_num;
> -    /* thread has work to do */
> -    int pending_job;
> -    /* array of pages to sent.
> -     * The owner of 'pages' depends of 'pending_job' value:
> -     * pending_job == 0 -> migration_thread can use it.
> -     * pending_job != 0 -> multifd_channel can use it.
> +    /* thread has a request to sync all data */
> +    bool pending_sync;
> +    /* thread has something to send */
> +    bool pending_job;
> +    /*
> +     * Array of pages to send. The owner of 'pages' depends on
> +     * 'pending_job' value:
> +     *
> +     *   - true -> multifd_channel owns it.
> +     *   - false -> migration_thread owns it.
>       */
>      MultiFDPages_t *pages;
>  
> diff --git a/migration/multifd.c b/migration/multifd.c
> index 8140520843..fe8d746ff9 100644
> --- a/migration/multifd.c
> +++ b/migration/multifd.c
> @@ -425,7 +425,7 @@ static int multifd_send_pages(QEMUFile *f)
>          p = &multifd_send_state->params[i];
>          qemu_mutex_lock(&p->mutex);
>          if (!p->pending_job) {
> -            p->pending_job++;
> +            p->pending_job = true;
>              next_channel = (i + 1) % migrate_multifd_channels();
>              break;
>          }
> @@ -615,8 +615,7 @@ int multifd_send_sync_main(QEMUFile *f)
>  
>          qemu_mutex_lock(&p->mutex);
>          p->packet_num = multifd_send_state->packet_num++;
> -        p->flags |= MULTIFD_FLAG_SYNC;
> -        p->pending_job++;
> +        p->pending_sync = true;
>          qemu_mutex_unlock(&p->mutex);
>          qemu_sem_post(&p->sem);
>      }
> @@ -747,8 +746,6 @@ static void *multifd_send_thread(void *opaque)
>  
>          qemu_mutex_lock(&p->mutex);
>          if (p->pending_job) {
> -            bool need_sync = p->flags & MULTIFD_FLAG_SYNC;
> -
>              if (!multifd_send_prepare(p, &local_err)) {
>                  assert(local_err);
>                  qemu_mutex_unlock(&p->mutex);
> @@ -764,12 +761,27 @@ static void *multifd_send_thread(void *opaque)
>              qemu_mutex_lock(&p->mutex);
>  
>              /* Send successful, mark the task completed */
> -            p->pending_job--;
> +            p->pending_job = false;
> +
> +        } else if (p->pending_sync) {

Is your intention here to stop sending the SYNC along with the pages?
This would have to loop once more to send the sync.

* Re: [PATCH RFC 4/7] migration: Drop MultiFDSendParams.quit and cleanup error paths
  2023-10-23 14:53     ` Peter Xu
@ 2023-10-23 15:35       ` Fabiano Rosas
  2023-10-23 15:54         ` Peter Xu
  0 siblings, 1 reply; 20+ messages in thread
From: Fabiano Rosas @ 2023-10-23 15:35 UTC (permalink / raw)
  To: Peter Xu; +Cc: qemu-devel, Juan Quintela

Peter Xu <peterx@redhat.com> writes:

> Fabiano,
>
> On Mon, Oct 23, 2023 at 11:42:28AM -0300, Fabiano Rosas wrote:
>> > diff --git a/migration/multifd.c b/migration/multifd.c
>> > index 33fb21d0e4..9d458914a9 100644
>> > --- a/migration/multifd.c
>> > +++ b/migration/multifd.c
>> > @@ -411,10 +411,6 @@ static int multifd_send_pages(QEMUFile *f)
>> >      MultiFDSendParams *p = NULL; /* make happy gcc */
>> >      MultiFDPages_t *pages = multifd_send_state->pages;
>> >  
>> > -    if (qatomic_read(&multifd_send_state->exiting)) {
>> > -        return -1;
>> > -    }
>> > -
>> 
>> I'd keep this. This function can be called from outside of multifd code
>> so the channels could be completely gone already.
>
> I can definitely add it back; nothing hurts.  But I want to make sure I
> didn't miss some point.
>
> Do you have a specific path that could trigger what you said?

I don't, just thought of being conservative since this is a multifd
external API (of sorts).

* Re: [PATCH RFC 6/7] migration: Split multifd pending_job into two booleans
  2023-10-23 15:15   ` Fabiano Rosas
@ 2023-10-23 15:52     ` Peter Xu
  0 siblings, 0 replies; 20+ messages in thread
From: Peter Xu @ 2023-10-23 15:52 UTC (permalink / raw)
  To: Fabiano Rosas; +Cc: qemu-devel, Juan Quintela

On Mon, Oct 23, 2023 at 12:15:49PM -0300, Fabiano Rosas wrote:
> > @@ -764,12 +761,27 @@ static void *multifd_send_thread(void *opaque)
> >              qemu_mutex_lock(&p->mutex);
> >  
> >              /* Send successful, mark the task completed */
> > -            p->pending_job--;
> > +            p->pending_job = false;
> > +
> > +        } else if (p->pending_sync) {
> 
> Is your intention here to stop sending the SYNC along with the pages?
> This would have to loop once more to send the sync.

My intention is to be clear on how we do SYNC, e.g., to avoid the main
thread touching p->flags at all.

AFAIK we'll need to loop twice to send SYNC either before or after this
patch; the old code boosts pending_job for sync too, and kicks p->sem one
more time to guarantee that 2nd loop.

The major difference after this patch is that it'll be clear we send the
pages first in the 1st packet, then another SYNC packet as the 2nd.  Also,
I hope pending_sync is more readable too.

One thing I should have mentioned but I didn't: we must handle pending_job
before pending_sync here, so that when we do SYNC we make sure all pages
will be sent.  IOW, below:

  if (p->pending_sync) {
     ...
  } else if (p->pending_job) {
     ...
  }

should be buggy, because when pending_sync is requested while pending_job
is true, we could send the SYNC before that batch of pages.

I'll add a comment block for it:

        /*
         * NOTE: we must handle pending_job before pending_sync, so as to
         * make sure SYNC packet will always cover all queued pages here.
         */
        if (p->pending_job) {

One thing I just noticed is that I forgot to write a commit message for
this patch; my apologies.  Let me attach a new version here, with the
commit message written and the comment squashed in.

Thanks,

===8<===

From c7636dffe0f58e42e5aa0028cd0a6208cc75dd46 Mon Sep 17 00:00:00 2001
From: Peter Xu <peterx@redhat.com>
Date: Sun, 22 Oct 2023 15:20:29 -0400
Subject: [PATCH] migration: Split multifd pending_job into two booleans

We used to have MultiFDSendParams.pending_job covering both sending data
and sending the SYNC message.  The SYNC message part is tricky, because it
directly modifies p->flags and boosts pending_job even if there is already
a request pending.  That makes it the only case where pending_job can be
larger than 1.

To make it clear, split the pending_job integer into two booleans:

  - pending_job:  keeps its own name, a boolean showing we have data to send
  - pending_sync: a new boolean showing QEMU requested a SYNC message to send

With that, we can remove the only place where the main thread touches
p->flags; instead it simply sets pending_sync=true.  The multifd send
thread also does not need to peek at p->flags beforehand; it can
unconditionally kick p->sem_sync as long as it's a pending_sync request.

Signed-off-by: Peter Xu <peterx@redhat.com>
---
 migration/multifd.h | 16 ++++++++++------
 migration/multifd.c | 37 +++++++++++++++++++++++++++----------
 2 files changed, 37 insertions(+), 16 deletions(-)

diff --git a/migration/multifd.h b/migration/multifd.h
index 2acf400085..ddee7b8d8a 100644
--- a/migration/multifd.h
+++ b/migration/multifd.h
@@ -101,12 +101,16 @@ typedef struct {
     uint32_t flags;
     /* global number of generated multifd packets */
     uint64_t packet_num;
-    /* thread has work to do */
-    int pending_job;
-    /* array of pages to sent.
-     * The owner of 'pages' depends of 'pending_job' value:
-     * pending_job == 0 -> migration_thread can use it.
-     * pending_job != 0 -> multifd_channel can use it.
+    /* thread has a request to sync all data */
+    bool pending_sync;
+    /* thread has something to send */
+    bool pending_job;
+    /*
+     * Array of pages to send. The owner of 'pages' depends on
+     * 'pending_job' value:
+     *
+     *   - true -> multifd_channel owns it.
+     *   - false -> migration_thread owns it.
      */
     MultiFDPages_t *pages;
 
diff --git a/migration/multifd.c b/migration/multifd.c
index 3f4fb6ad40..5d3571faa8 100644
--- a/migration/multifd.c
+++ b/migration/multifd.c
@@ -434,7 +434,7 @@ static int multifd_send_pages(QEMUFile *f)
         p = &multifd_send_state->params[i];
         qemu_mutex_lock(&p->mutex);
         if (!p->pending_job) {
-            p->pending_job++;
+            p->pending_job = true;
             next_channel = (i + 1) % migrate_multifd_channels();
             break;
         }
@@ -624,8 +624,7 @@ int multifd_send_sync_main(QEMUFile *f)
 
         qemu_mutex_lock(&p->mutex);
         p->packet_num = multifd_send_state->packet_num++;
-        p->flags |= MULTIFD_FLAG_SYNC;
-        p->pending_job++;
+        p->pending_sync = true;
         qemu_mutex_unlock(&p->mutex);
         qemu_sem_post(&p->sem);
     }
@@ -755,9 +754,11 @@ static void *multifd_send_thread(void *opaque)
         }
 
         qemu_mutex_lock(&p->mutex);
+        /*
+         * NOTE: we must handle pending_job before pending_sync, so as to
+         * make sure SYNC packet will always cover all queued pages here.
+         */
         if (p->pending_job) {
-            bool need_sync = p->flags & MULTIFD_FLAG_SYNC;
-
             if (!multifd_send_prepare(p, &local_err)) {
                 assert(local_err);
                 qemu_mutex_unlock(&p->mutex);
@@ -773,12 +774,27 @@ static void *multifd_send_thread(void *opaque)
             qemu_mutex_lock(&p->mutex);
 
             /* Send successful, mark the task completed */
-            p->pending_job--;
+            p->pending_job = false;
+
+        } else if (p->pending_sync) {
+            p->flags |= MULTIFD_FLAG_SYNC;
+
+            if (!multifd_send_prepare(p, &local_err)) {
+                assert(local_err);
+                qemu_mutex_unlock(&p->mutex);
+                goto out;
+            }
 
-            /* If this is a sync task, we need one more kick */
-            if (need_sync) {
-                qemu_sem_post(&p->sem_sync);
+            /* Send the packets without mutex */
+            qemu_mutex_unlock(&p->mutex);
+            if (!multifd_do_send(p, &local_err)) {
+                assert(local_err);
+                goto out;
             }
+            qemu_mutex_lock(&p->mutex);
+
+            qemu_sem_post(&p->sem_sync);
+            p->pending_sync = false;
         }
         qemu_mutex_unlock(&p->mutex);
     }
@@ -941,7 +957,8 @@ int multifd_save_setup(Error **errp)
         qemu_mutex_init(&p->mutex);
         qemu_sem_init(&p->sem, 0);
         qemu_sem_init(&p->sem_sync, 0);
-        p->pending_job = 0;
+        p->pending_job = false;
+        p->pending_sync = false;
         p->id = i;
         p->pages = multifd_pages_init(page_count);
         p->packet_len = sizeof(MultiFDPacket_t)
-- 
2.41.0

-- 
Peter Xu

* Re: [PATCH RFC 4/7] migration: Drop MultiFDSendParams.quit and cleanup error paths
  2023-10-23 15:35       ` Fabiano Rosas
@ 2023-10-23 15:54         ` Peter Xu
  0 siblings, 0 replies; 20+ messages in thread
From: Peter Xu @ 2023-10-23 15:54 UTC (permalink / raw)
  To: Fabiano Rosas; +Cc: qemu-devel, Juan Quintela

On Mon, Oct 23, 2023 at 12:35:50PM -0300, Fabiano Rosas wrote:
> I don't, just thought of being conservative since this is a multifd
> external API (of sorts).

No worries, let me just keep it there.  Thanks for the quick reviews!

-- 
Peter Xu

* Re: [PATCH RFC 3/7] migration: multifd_send_kick_main()
  2023-10-22 20:12 ` [PATCH RFC 3/7] migration: multifd_send_kick_main() Peter Xu
  2023-10-23 14:43   ` Fabiano Rosas
@ 2023-11-08 22:49   ` Fabiano Rosas
  2023-11-09 16:50     ` Peter Xu
  1 sibling, 1 reply; 20+ messages in thread
From: Fabiano Rosas @ 2023-11-08 22:49 UTC (permalink / raw)
  To: Peter Xu, qemu-devel; +Cc: Juan Quintela, peterx

Peter Xu <peterx@redhat.com> writes:

> When a multifd sender thread hits an error, it always needs to kick the
> main thread by posting all the semaphores that it may be waiting upon.
>
> Provide a helper for it and deduplicate the code.
>
> Signed-off-by: Peter Xu <peterx@redhat.com>
> ---
>  migration/multifd.c | 21 +++++++++++++++------
>  1 file changed, 15 insertions(+), 6 deletions(-)
>
> diff --git a/migration/multifd.c b/migration/multifd.c
> index 4afdd88602..33fb21d0e4 100644
> --- a/migration/multifd.c
> +++ b/migration/multifd.c
> @@ -374,6 +374,18 @@ struct {
>      MultiFDMethods *ops;
>  } *multifd_send_state;
>  
> +/*
> + * The migration thread can wait on either of the two semaphores.  This
> + * function can be used to kick the main thread out of waiting on either of
> + * them.  Should mostly only be called when something wrong happened with
> + * the current multifd send thread.
> + */
> +static void multifd_send_kick_main(MultiFDSendParams *p)
> +{
> +    qemu_sem_post(&p->sem_sync);
> +    qemu_sem_post(&multifd_send_state->channels_ready);
> +}
> +
>  /*
>   * How we use multifd_send_state->pages and channel->pages?
>   *
> @@ -746,8 +758,7 @@ out:
>          assert(local_err);
>          trace_multifd_send_error(p->id);
>          multifd_send_terminate_threads(local_err);
> -        qemu_sem_post(&p->sem_sync);
> -        qemu_sem_post(&multifd_send_state->channels_ready);
> +        multifd_send_kick_main(p);
>          error_free(local_err);
>      }
>  
> @@ -787,8 +798,7 @@ static void multifd_tls_outgoing_handshake(QIOTask *task,
>       * is not created, and then tell who pay attention to me.
>       */
>      p->quit = true;
> -    qemu_sem_post(&multifd_send_state->channels_ready);
> -    qemu_sem_post(&p->sem_sync);
> +    multifd_send_kick_main(p);

There's a bug here in the original code:

It's not really safe to call any of these outside of the channel lock
because multifd_save_cleanup() could execute at the same time and call
qemu_sem_destroy() -> qemu_mutex_destroy(), which can assert because we
might be holding the sem_lock.

It seems the reason we get away with this today is merely due to
timing. A subset of this problem was already encountered here:

[PATCH] migrate/multifd: fix coredump when the multifd thread cleanup
https://lore.kernel.org/r/20230621081826.3203053-1-zhangjianguo18@huawei.com
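
To illustrate, a hypothetical interleaving (not a reproduced trace):

    channel thread                        main thread
    --------------                        -----------
    error path entered
                                          multifd_save_cleanup()
                                            qemu_sem_destroy(&p->sem_sync)
    qemu_sem_post(&p->sem_sync)
      -> post on a destroyed semaphore, can assert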

We could probably release the semaphores for all channels at once inside
multifd_save_cleanup() in the main thread.  We'd still have a
multifd_send_kick_main() in each channel when it fails, plus this:

void multifd_save_cleanup(void)
{
    ...   
    for (i = 0; i < migrate_multifd_channels(); i++) {
        MultiFDSendParams *p = &multifd_send_state->params[i];

        if (p->running) {
            qemu_thread_join(&p->thread);
        } else {
            multifd_send_kick_main(p);
        }
    }
    for (i = 0; i < migrate_multifd_channels(); i++) {
        qemu_sem_destroy, etc...
    }
    ...
}

* Re: [PATCH RFC 3/7] migration: multifd_send_kick_main()
  2023-11-08 22:49   ` Fabiano Rosas
@ 2023-11-09 16:50     ` Peter Xu
  2023-11-09 17:00       ` Fabiano Rosas
  0 siblings, 1 reply; 20+ messages in thread
From: Peter Xu @ 2023-11-09 16:50 UTC (permalink / raw)
  To: Fabiano Rosas; +Cc: qemu-devel, Juan Quintela

On Wed, Nov 08, 2023 at 07:49:53PM -0300, Fabiano Rosas wrote:
> Peter Xu <peterx@redhat.com> writes:
> 
> > When a multifd sender thread hits an error, it always needs to kick the
> > main thread by posting all the semaphores that it may be waiting upon.
> >
> > Provide a helper for it and deduplicate the code.
> >
> > Signed-off-by: Peter Xu <peterx@redhat.com>
> > ---
> >  migration/multifd.c | 21 +++++++++++++++------
> >  1 file changed, 15 insertions(+), 6 deletions(-)
> >
> > diff --git a/migration/multifd.c b/migration/multifd.c
> > index 4afdd88602..33fb21d0e4 100644
> > --- a/migration/multifd.c
> > +++ b/migration/multifd.c
> > @@ -374,6 +374,18 @@ struct {
> >      MultiFDMethods *ops;
> >  } *multifd_send_state;
> >  
> > +/*
> > + * The migration thread can wait on either of the two semaphores.  This
> > + * function can be used to kick the main thread out of waiting on either of
> > + * them.  Should mostly only be called when something wrong happened with
> > + * the current multifd send thread.
> > + */
> > +static void multifd_send_kick_main(MultiFDSendParams *p)
> > +{
> > +    qemu_sem_post(&p->sem_sync);
> > +    qemu_sem_post(&multifd_send_state->channels_ready);
> > +}
> > +
> >  /*
> >   * How we use multifd_send_state->pages and channel->pages?
> >   *
> > @@ -746,8 +758,7 @@ out:
> >          assert(local_err);
> >          trace_multifd_send_error(p->id);
> >          multifd_send_terminate_threads(local_err);
> > -        qemu_sem_post(&p->sem_sync);
> > -        qemu_sem_post(&multifd_send_state->channels_ready);
> > +        multifd_send_kick_main(p);
> >          error_free(local_err);
> >      }
> >  
> > @@ -787,8 +798,7 @@ static void multifd_tls_outgoing_handshake(QIOTask *task,
> >       * is not created, and then tell who pay attention to me.
> >       */
> >      p->quit = true;
> > -    qemu_sem_post(&multifd_send_state->channels_ready);
> > -    qemu_sem_post(&p->sem_sync);
> > +    multifd_send_kick_main(p);
> 
> There's a bug here in the original code:
> 
> It's not really safe to call any of these outside of the channel lock,
> because multifd_save_cleanup() could execute at the same time and call
> qemu_sem_destroy() -> qemu_mutex_destroy(), which can assert because we
> might still be holding the semaphore's internal lock.

If by the "channel lock" you meant "p->mutex", IIUC even holding that won't
work, because it'll also be freed in multifd_save_cleanup().
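
That is, the lock cannot order its own destruction.  Reduced to bare
POSIX primitives, a minimal sketch of the same problem (illustrative
only, not code from the series; names are hypothetical):

#include <pthread.h>

static pthread_mutex_t channel_lock = PTHREAD_MUTEX_INITIALIZER;

/* Channel thread, with the hypothetical "protection" applied. */
static void *channel_thread(void *arg)
{
    pthread_mutex_lock(&channel_lock);
    /* ... post the semaphores here ... */
    pthread_mutex_unlock(&channel_lock);
    return NULL;
}

int main(void)
{
    pthread_t t;

    pthread_create(&t, NULL, channel_thread, NULL);

    /* Cleanup path: destroying a mutex that another thread may hold,
     * or is about to take, is undefined; the same lifetime problem as
     * with the semaphores, just moved onto the lock itself. */
    pthread_mutex_destroy(&channel_lock);

    pthread_join(&t, NULL);
    return 0;
}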



> 
> It seems the reason we get away with this today is merely due to
> timing. A subset of this problem was already encountered here:
> 
> [PATCH] migrate/multifd: fix coredump when the multifd thread cleanup
> https://lore.kernel.org/r/20230621081826.3203053-1-zhangjianguo18@huawei.com
> 
> We could probably release the semaphores for all channels at once inside
> multifd_save_cleanup() in the main thread. We'd keep a
> multifd_send_kick_main() call in each channel's failure path, plus
> something like this:
> 
> void multifd_save_cleanup(void)
> {
>     ...   
>     for (i = 0; i < migrate_multifd_channels(); i++) {
>         MultiFDSendParams *p = &multifd_send_state->params[i];
> 
>         if (p->running) {
>             qemu_thread_join(&p->thread);
>         } else {
>             multifd_send_kick_main(p);
>         }
>     }
>     for (i = 0; i < migrate_multifd_channels(); i++) {
>         qemu_sem_destroy, etc...
>     }
>     ...
> }    
> 

-- 
Peter Xu



^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH RFC 3/7] migration: multifd_send_kick_main()
  2023-11-09 16:50     ` Peter Xu
@ 2023-11-09 17:00       ` Fabiano Rosas
  0 siblings, 0 replies; 20+ messages in thread
From: Fabiano Rosas @ 2023-11-09 17:00 UTC (permalink / raw)
  To: Peter Xu; +Cc: qemu-devel, Juan Quintela

Peter Xu <peterx@redhat.com> writes:

> On Wed, Nov 08, 2023 at 07:49:53PM -0300, Fabiano Rosas wrote:
>> Peter Xu <peterx@redhat.com> writes:
>> 
>> > When a multifd sender thread hits an error, it always needs to kick the
>> > main thread by posting all the semaphores that the main thread may be
>> > waiting on.
>> >
>> > Provide a helper for it and deduplicate the code.
>> >
>> > Signed-off-by: Peter Xu <peterx@redhat.com>
>> > ---
>> >  migration/multifd.c | 21 +++++++++++++++------
>> >  1 file changed, 15 insertions(+), 6 deletions(-)
>> >
>> > diff --git a/migration/multifd.c b/migration/multifd.c
>> > index 4afdd88602..33fb21d0e4 100644
>> > --- a/migration/multifd.c
>> > +++ b/migration/multifd.c
>> > @@ -374,6 +374,18 @@ struct {
>> >      MultiFDMethods *ops;
>> >  } *multifd_send_state;
>> >  
>> > +/*
>> > + * The migration thread can wait on either of the two semaphores.  This
>> > + * function can be used to kick the main thread out of waiting on either of
>> > + * them.  Should mostly only be called when something wrong happened with
>> > + * the current multifd send thread.
>> > + */
>> > +static void multifd_send_kick_main(MultiFDSendParams *p)
>> > +{
>> > +    qemu_sem_post(&p->sem_sync);
>> > +    qemu_sem_post(&multifd_send_state->channels_ready);
>> > +}
>> > +
>> >  /*
>> >   * How we use multifd_send_state->pages and channel->pages?
>> >   *
>> > @@ -746,8 +758,7 @@ out:
>> >          assert(local_err);
>> >          trace_multifd_send_error(p->id);
>> >          multifd_send_terminate_threads(local_err);
>> > -        qemu_sem_post(&p->sem_sync);
>> > -        qemu_sem_post(&multifd_send_state->channels_ready);
>> > +        multifd_send_kick_main(p);
>> >          error_free(local_err);
>> >      }
>> >  
>> > @@ -787,8 +798,7 @@ static void multifd_tls_outgoing_handshake(QIOTask *task,
>> >       * is not created, and then tell who pay attention to me.
>> >       */
>> >      p->quit = true;
>> > -    qemu_sem_post(&multifd_send_state->channels_ready);
>> > -    qemu_sem_post(&p->sem_sync);
>> > +    multifd_send_kick_main(p);
>> 
>> There's a bug here in the original code:
>> 
>> It's not really safe to call any of these outside of the channel lock,
>> because multifd_save_cleanup() could execute at the same time and call
>> qemu_sem_destroy() -> qemu_mutex_destroy(), which can assert because we
>> might still be holding the semaphore's internal lock.
>
> If by the "channel lock" you meant "p->mutex", IIUC even holding that won't
> work, because it'll also be freed in multifd_save_cleanup().
>

You're right. I just sent an RFC about this; please take a look.



^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread

Thread overview: 20+ messages
-- links below jump to the message on this page --
2023-10-22 20:12 [PATCH RFC 0/7] migration/multifd: quit unitifications and separate sync packet Peter Xu
2023-10-22 20:12 ` [PATCH RFC 1/7] migration: Drop stale comment for multifd zero copy Peter Xu
2023-10-23 14:16   ` Fabiano Rosas
2023-10-22 20:12 ` [PATCH RFC 2/7] migration: Fix error leak in multifd_tls_outgoing_handshake() Peter Xu
2023-10-23 14:17   ` Fabiano Rosas
2023-10-22 20:12 ` [PATCH RFC 3/7] migration: multifd_send_kick_main() Peter Xu
2023-10-23 14:43   ` Fabiano Rosas
2023-11-08 22:49   ` Fabiano Rosas
2023-11-09 16:50     ` Peter Xu
2023-11-09 17:00       ` Fabiano Rosas
2023-10-22 20:12 ` [PATCH RFC 4/7] migration: Drop MultiFDSendParams.quit and cleanup error paths Peter Xu
2023-10-23 14:42   ` Fabiano Rosas
2023-10-23 14:53     ` Peter Xu
2023-10-23 15:35       ` Fabiano Rosas
2023-10-23 15:54         ` Peter Xu
2023-10-22 20:12 ` [PATCH RFC 5/7] migration: Modulize multifd send threads with a few helpers Peter Xu
2023-10-22 20:12 ` [PATCH RFC 6/7] migration: Split multifd pending_job into two booleans Peter Xu
2023-10-23 15:15   ` Fabiano Rosas
2023-10-23 15:52     ` Peter Xu
2023-10-22 20:12 ` [PATCH RFC 7/7] migration: Further unify paths for multifd normal or sync requests Peter Xu
