From: Peter Xu <peterx@redhat.com>
To: Juan Quintela <quintela@redhat.com>
Cc: Li Zhijian <lizhijian@fujitsu.com>,
	leobras@redhat.com, qemu-devel@nongnu.org,
	Fabiano Rosas <farosas@suse.de>
Subject: Re: [PATCH v2 1/2] migration: Fix rdma migration failed
Date: Fri, 6 Oct 2023 13:15:41 -0400	[thread overview]
Message-ID: <ZSBAvU2PHST7/Tte@x1n> (raw)
In-Reply-To: <ZSAtKmOFkomgXyJ7@x1n>

On Fri, Oct 06, 2023 at 11:52:10AM -0400, Peter Xu wrote:
> On Tue, Oct 03, 2023 at 08:57:07PM +0200, Juan Quintela wrote:
> > commit c638f66121ce30063fbf68c3eab4d7429cf2b209
> > Author: Juan Quintela <quintela@redhat.com>
> > Date:   Tue Oct 3 20:53:38 2023 +0200
> > 
> >     migration: Non-multifd migration doesn't care about multifd flushes
> >     
> >     RDMA was having trouble because
> >     migrate_multifd_flush_after_each_section() can only be true or false,
> >     but we don't want to send any flush when we are not in multifd
> >     migration.
> >     
> >     CC: Fabiano Rosas <farosas@suse.de>
> >     Reported-by: Li Zhijian <lizhijian@fujitsu.com>
> >     Signed-off-by: Juan Quintela <quintela@redhat.com>
> > 
> > diff --git a/migration/ram.c b/migration/ram.c
> > index e4bfd39f08..716cef6425 100644
> > --- a/migration/ram.c
> > +++ b/migration/ram.c
> > @@ -1387,7 +1387,8 @@ static int find_dirty_block(RAMState *rs, PageSearchStatus *pss)
> >          pss->page = 0;
> >          pss->block = QLIST_NEXT_RCU(pss->block, next);
> >          if (!pss->block) {
> > -            if (!migrate_multifd_flush_after_each_section()) {
> > +            if (migrate_multifd() &&
> > +                !migrate_multifd_flush_after_each_section()) {
> >                  QEMUFile *f = rs->pss[RAM_CHANNEL_PRECOPY].pss_channel;
> >                  int ret = multifd_send_sync_main(f);
> >                  if (ret < 0) {
> > @@ -3064,7 +3065,7 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
> >          return ret;
> >      }
> >  
> > -    if (!migrate_multifd_flush_after_each_section()) {
> > +    if (migrate_multifd() && !migrate_multifd_flush_after_each_section()) {
> >          qemu_put_be64(f, RAM_SAVE_FLAG_MULTIFD_FLUSH);
> >      }
> >  
> > @@ -3176,7 +3177,7 @@ static int ram_save_iterate(QEMUFile *f, void *opaque)
> >  out:
> >      if (ret >= 0
> >          && migration_is_setup_or_active(migrate_get_current()->state)) {
> > -        if (migrate_multifd_flush_after_each_section()) {
> > +        if (migrate_multifd() && migrate_multifd_flush_after_each_section()) {
> >              ret = multifd_send_sync_main(rs->pss[RAM_CHANNEL_PRECOPY].pss_channel);
> >              if (ret < 0) {
> >                  return ret;
> > @@ -3253,7 +3254,7 @@ static int ram_save_complete(QEMUFile *f, void *opaque)
> >          return ret;
> >      }
> >  
> > -    if (!migrate_multifd_flush_after_each_section()) {
> > +    if (migrate_multifd() && !migrate_multifd_flush_after_each_section()) {
> >          qemu_put_be64(f, RAM_SAVE_FLAG_MULTIFD_FLUSH);
> >      }
> >      qemu_put_be64(f, RAM_SAVE_FLAG_EOS);
> > @@ -3760,7 +3761,7 @@ int ram_load_postcopy(QEMUFile *f, int channel)
> >              break;
> >          case RAM_SAVE_FLAG_EOS:
> >              /* normal exit */
> > -            if (migrate_multifd_flush_after_each_section()) {
> > +            if (migrate_multifd() && migrate_multifd_flush_after_each_section()) {
> >                  multifd_recv_sync_main();
> >              }
> >              break;
> > @@ -4038,7 +4039,8 @@ static int ram_load_precopy(QEMUFile *f)
> >              break;
> >          case RAM_SAVE_FLAG_EOS:
> >              /* normal exit */
> > -            if (migrate_multifd_flush_after_each_section()) {
> > +            if (migrate_multifd() &&
> > +                migrate_multifd_flush_after_each_section()) {
> >                  multifd_recv_sync_main();
> >              }
> >              break;
> 
> Reviewed-by: Peter Xu <peterx@redhat.com>
> 
> Did you forget to send this out formally?  Even though f1de309792d6656e
> landed (which, IMHO, it shouldn't have..), IIUC rdma is still broken..
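
(Restating the effect of the hunks above for the record: every multifd
flush/sync becomes conditional on multifd being enabled at all, so a plain or
RDMA migration never touches the multifd machinery.  A standalone sketch of
the resulting predicates, with stub declarations in place of the real
capability helpers, purely for illustration:)

#include <stdbool.h>

/* Stubs standing in for the real capability checks. */
extern bool migrate_multifd(void);
extern bool migrate_multifd_flush_after_each_section(void);

/* The per-round flush marker (RAM_SAVE_FLAG_MULTIFD_FLUSH) is emitted only
 * when multifd is on and the per-section flush behaviour is off, so a plain
 * or RDMA migration never puts the new flag on the wire. */
static inline bool want_per_round_multifd_flush(void)
{
    return migrate_multifd() && !migrate_multifd_flush_after_each_section();
}

/* The per-section sync at each RAM_SAVE_FLAG_EOS happens only with multifd
 * plus the legacy per-section behaviour. */
static inline bool want_per_section_multifd_sync(void)
{
    return migrate_multifd() && migrate_multifd_flush_after_each_section();
}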

Two more things to mention..

$ git tag --contains 294e5a4034e81b

It tells me v8.1 is also affected.. so we may want to copy stable too for
8.1, for whichever patch we want to merge (either yours or Zhijian's)..

Meanwhile, it also breaks migration as long as the user specifies the new
behavior.. for example, v8.1->v8.0 will break with this:

$ (echo "migrate exec:cat>out"; echo "quit") | ./qemu-v8.1.1 -M pc-q35-8.0 -global migration.multifd-flush-after-each-section=false -monitor stdio
QEMU 8.1.1 monitor - type 'help' for more information
VNC server running on ::1:5900
(qemu) migrate exec:cat>out
(qemu) quit

$ ./qemu-v8.0.5 -M pc-q35-8.0 -incoming "exec:cat<out"
VNC server running on ::1:5900
qemu-v8.0.5: Unknown combination of migration flags: 0x200
qemu-v8.0.5: error while loading state for instance 0x0 of device 'ram'
qemu-v8.0.5: load of migration failed: Invalid argument

IOW, besides rdma and the script, it can also break in other ways.
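
For reference, the 0x200 in that error is RAM_SAVE_FLAG_MULTIFD_FLUSH, which
only exists since 8.1, so the 8.0 destination has no case for it and bails
out.  Roughly like this (a condensed illustration, not the exact
ram_load_precopy() code):

#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

#define RAM_SAVE_FLAG_EOS            0x10
#define RAM_SAVE_FLAG_MULTIFD_FLUSH  0x200   /* new in 8.1, unknown to 8.0 */

/* Condensed illustration of how an 8.0 destination dispatches the per-chunk
 * flags it reads from the stream; unknown bits end up in the error path. */
static int load_flags_v8_0(uint64_t flags)
{
    switch (flags) {
    case RAM_SAVE_FLAG_EOS:
        return 0;                             /* normal end of section */
    /* ... the other flags 8.0 knows about ... */
    default:                                  /* 0x200 lands here on 8.0 */
        fprintf(stderr, "Unknown combination of migration flags: 0x%" PRIx64 "\n",
                flags);
        return -1;
    }
}

int main(void)
{
    /* An 8.1 source emits the new flag already at setup time when
     * multifd-flush-after-each-section=false, so the load fails right away. */
    return load_flags_v8_0(RAM_SAVE_FLAG_MULTIFD_FLUSH) ? 1 : 0;
}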

-- 
Peter Xu




Thread overview: 10+ messages
2023-09-26 10:01 [PATCH v2 1/2] migration: Fix rdma migration failed Li Zhijian
2023-09-26 10:01 ` [PATCH v2 2/2] migration/rdma: zore out head.repeat to make the error more clear Li Zhijian
2023-10-03 18:57   ` Juan Quintela
2023-09-26 17:04 ` [PATCH v2 1/2] migration: Fix rdma migration failed Peter Xu
2023-10-03 19:00   ` Juan Quintela
2023-10-03 18:57 ` Juan Quintela
2023-10-06 15:52   ` Peter Xu
2023-10-06 17:15     ` Peter Xu [this message]
2023-10-18 14:32     ` Juan Quintela
2023-10-07  6:03   ` Zhijian Li (Fujitsu)
