From: Juan Quintela <quintela@redhat.com>
To: Li Zhijian <lizhijian@fujitsu.com>
Cc: peterx@redhat.com,  leobras@redhat.com,  qemu-devel@nongnu.org,
	 Fabiano Rosas <farosas@suse.de>
Subject: Re: [PATCH v2 1/2] migration: Fix rdma migration failed
Date: Tue, 03 Oct 2023 20:57:07 +0200
Message-ID: <87edib5ybg.fsf@secure.mitica>
In-Reply-To: <20230926100103.201564-1-lizhijian@fujitsu.com> (Li Zhijian's message of "Tue, 26 Sep 2023 18:01:02 +0800")

Li Zhijian <lizhijian@fujitsu.com> wrote:
> Migration over RDMA has been failing since
> commit 294e5a4034 ("multifd: Only flush once each full round of memory"),
> with this error:
> qemu-system-x86_64: rdma: Too many requests in this message (3638950032).Bailing.
>
> Migration over RDMA is different from TCP: RDMA has its own control
> messages, and the traffic between RDMA_CONTROL_REGISTER_REQUEST and
> RDMA_CONTROL_REGISTER_FINISHED must not be disturbed.
>
> find_dirty_block() can be called between RDMA_CONTROL_REGISTER_REQUEST
> and RDMA_CONTROL_REGISTER_FINISHED, and when it is, it sends extra
> traffic (RAM_SAVE_FLAG_MULTIFD_FLUSH) to the destination and causes the
> migration to fail even though multifd is disabled.
>
> This change makes migrate_multifd_flush_after_each_section() return true
> when multifd is disabled, which also means RAM_SAVE_FLAG_MULTIFD_FLUSH
> will no longer be sent to the destination when multifd is disabled.
>
> Fixes: 294e5a4034 ("multifd: Only flush once each full round of memory")
> CC: Fabiano Rosas <farosas@suse.de>
> Signed-off-by: Li Zhijian <lizhijian@fujitsu.com>

Ouch.

> index 1d1e1321b0..327bcf2fbe 100644
> --- a/migration/options.c
> +++ b/migration/options.c
> @@ -368,7 +368,7 @@ bool migrate_multifd_flush_after_each_section(void)
>  {
>      MigrationState *s = migrate_get_current();
>  
> -    return s->multifd_flush_after_each_section;
> +    return !migrate_multifd() || s->multifd_flush_after_each_section;
>  }
>  
>  bool migrate_postcopy(void)

But I think this is ugly.

migrate_multifd_flush_after_each_section()

returns true

with multifd not enabled?

And we are creating a "function" that is supposed to just read a property
but now does something else.

What about this?

I know that the change is bigger, but it makes clear what is happening
here.

commit c638f66121ce30063fbf68c3eab4d7429cf2b209
Author: Juan Quintela <quintela@redhat.com>
Date:   Tue Oct 3 20:53:38 2023 +0200

    migration: Non-multifd migrations don't care about multifd flushes
    
    RDMA was having trouble because
    migrate_multifd_flush_after_each_section() can only return true or
    false, and neither case covers a migration that is not using multifd
    at all, where we do not want to send any flush.
    
    CC: Fabiano Rosas <farosas@suse.de>
    Reported-by: Li Zhijian <lizhijian@fujitsu.com>
    Signed-off-by: Juan Quintela <quintela@redhat.com>

diff --git a/migration/ram.c b/migration/ram.c
index e4bfd39f08..716cef6425 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -1387,7 +1387,8 @@ static int find_dirty_block(RAMState *rs, PageSearchStatus *pss)
         pss->page = 0;
         pss->block = QLIST_NEXT_RCU(pss->block, next);
         if (!pss->block) {
-            if (!migrate_multifd_flush_after_each_section()) {
+            if (migrate_multifd() &&
+                !migrate_multifd_flush_after_each_section()) {
                 QEMUFile *f = rs->pss[RAM_CHANNEL_PRECOPY].pss_channel;
                 int ret = multifd_send_sync_main(f);
                 if (ret < 0) {
@@ -3064,7 +3065,7 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
         return ret;
     }
 
-    if (!migrate_multifd_flush_after_each_section()) {
+    if (migrate_multifd() && !migrate_multifd_flush_after_each_section()) {
         qemu_put_be64(f, RAM_SAVE_FLAG_MULTIFD_FLUSH);
     }
 
@@ -3176,7 +3177,7 @@ static int ram_save_iterate(QEMUFile *f, void *opaque)
 out:
     if (ret >= 0
         && migration_is_setup_or_active(migrate_get_current()->state)) {
-        if (migrate_multifd_flush_after_each_section()) {
+        if (migrate_multifd() && migrate_multifd_flush_after_each_section()) {
             ret = multifd_send_sync_main(rs->pss[RAM_CHANNEL_PRECOPY].pss_channel);
             if (ret < 0) {
                 return ret;
@@ -3253,7 +3254,7 @@ static int ram_save_complete(QEMUFile *f, void *opaque)
         return ret;
     }
 
-    if (!migrate_multifd_flush_after_each_section()) {
+    if (migrate_multifd() && !migrate_multifd_flush_after_each_section()) {
         qemu_put_be64(f, RAM_SAVE_FLAG_MULTIFD_FLUSH);
     }
     qemu_put_be64(f, RAM_SAVE_FLAG_EOS);
@@ -3760,7 +3761,7 @@ int ram_load_postcopy(QEMUFile *f, int channel)
             break;
         case RAM_SAVE_FLAG_EOS:
             /* normal exit */
-            if (migrate_multifd_flush_after_each_section()) {
+            if (migrate_multifd() && migrate_multifd_flush_after_each_section()) {
                 multifd_recv_sync_main();
             }
             break;
@@ -4038,7 +4039,8 @@ static int ram_load_precopy(QEMUFile *f)
             break;
         case RAM_SAVE_FLAG_EOS:
             /* normal exit */
-            if (migrate_multifd_flush_after_each_section()) {
+            if (migrate_multifd() &&
+                migrate_multifd_flush_after_each_section()) {
                 multifd_recv_sync_main();
             }
             break;
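
Tangent: if the repeated "migrate_multifd() && ..." guard ever gets annoying,
it could be folded into a couple of tiny helpers, something along these lines
(just a sketch, the names are made up and this is not part of the patch):

static bool multifd_ram_sync_per_round(void)
{
    /* flush once per full round of RAM, and only when multifd is in use */
    return migrate_multifd() && !migrate_multifd_flush_after_each_section();
}

static bool multifd_ram_sync_per_section(void)
{
    /* flush at every RAM_SAVE_FLAG_EOS, and only when multifd is in use */
    return migrate_multifd() && migrate_multifd_flush_after_each_section();
}

The call sites in ram.c would then just test the helper, and a non-multifd
migration would never emit or expect RAM_SAVE_FLAG_MULTIFD_FLUSH at all.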


