From: Kevin Wolf <kwolf@redhat.com>
To: Jiri Denemark <jdenemar@redhat.com>
Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>,
	qemu-devel@nongnu.org, quintela@redhat.com, famz@redhat.com,
	peterx@redhat.com
Subject: Re: [Qemu-devel] [PATCH] migration: Don't activate block devices if using -S
Date: Tue, 10 Apr 2018 10:18:48 +0200
Message-ID: <20180410081848.GA7026@localhost.localdomain>
In-Reply-To: <20180410073635.GA91107@orkuz.home>

On 10.04.2018 at 09:36, Jiri Denemark wrote:
> On Mon, Apr 09, 2018 at 15:40:03 +0200, Kevin Wolf wrote:
> > On 09.04.2018 at 12:27, Dr. David Alan Gilbert wrote:
> > > It's a fairly hairy failure case they had; if I remember correctly it's:
> > >   a) Start migration
> > >   b) Migration gets to completion point
> > >   c) Destination is still paused
> > >   d) Libvirt is restarted on the source
> > >   e) Since libvirt was restarted it fails the migration (and hence knows
> > >      the destination won't be started)
> > >   f) It now tries to resume the qemu on the source
> > > 
> > > (f) fails because (b) caused the locks to be taken on the destination;
> > > hence this patch stops doing that.  It's a case we don't really think
> > > about - i.e. that the migration has actually completed and all the data
> > > is on the destination, but libvirt decides for some other reason to
> > > abandon migration.
> > 
> > If you do remember correctly, that scenario doesn't feel tricky at all.
> > libvirt needs to quit the destination qemu, which will inactivate the
> > images on the destination and release the lock, and then it can continue
> > the source.
> > 
> > In fact, this is so straightforward that I wonder what else libvirt is
> > doing. Is the destination qemu only shut down after trying to continue
> > the source? That would be libvirt using the wrong order of steps.
> 
> There's no connection between the two libvirt daemons in the case we're
> talking about so they can't really synchronize the actions. The
> destination daemon will kill the new QEMU process and the source will
> resume the old one, but the order is completely random.

Hm, okay...
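
To make the ordering problem concrete, here is a toy model of the race in steps (a) to (f). It is only a sketch: the ImageLock class and the action names are made up for illustration and are not QEMU or libvirt interfaces. It assumes the destination took the image lock at step (b).

```python
class ImageLock:
    """Toy stand-in for the image lock (hypothetical, not a QEMU API)."""

    def __init__(self):
        self.owner = None

    def acquire(self, who):
        # Only one process may hold the lock at a time.
        if self.owner not in (None, who):
            raise RuntimeError(f"lock held by {self.owner}")
        self.owner = who

    def release(self, who):
        if self.owner == who:
            self.owner = None


def run(order):
    """Apply the two uncoordinated daemon actions in the given order."""
    lock = ImageLock()
    lock.acquire("destination")  # step (b): completion activated the images
    results = []
    for action in order:
        if action == "kill_destination":
            lock.release("destination")
            results.append("destination killed")
        elif action == "resume_source":
            try:
                lock.acquire("source")
                results.append("source resumed")
            except RuntimeError as err:
                results.append(f"resume failed: {err}")
    return results


print(run(["kill_destination", "resume_source"]))
print(run(["resume_source", "kill_destination"]))
```

In this model the first ordering resumes the source, and the second fails the resume because the destination still holds the lock, which matches the failure in (f).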

> > > Yes it was a 'block-activate' that I'd wondered about.  One complication
> > > is that if this is now under the control of the management layer then we
> > > should stop asserting when the block devices aren't in the expected
> > > state and just cleanly fail the command instead.
> > 
> > Requiring an explicit 'block-activate' on the destination would be an
> > incompatible change, so you would have to introduce a new option for
> > that. 'block-inactivate' on the source feels a bit simpler.
> 
> As I said in another email, the explicit block-activate command could
> depend on a migration capability similarly to how pre-switchover state
> works.

Yeah, that's exactly the thing that we wouldn't need if we could use
'block-inactivate' on the source instead. It feels a bit wrong to
design a more involved QEMU interface around the libvirt internals, but
as long as we implement both sides for symmetry and libvirt just happens
to pick the destination side for now, I think it's okay.
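
A rough sketch of what the symmetric pair could look like, purely as a toy model: 'block-activate' and 'block-inactivate' are only proposals in this thread, and the BlockNodes class, the explicit_activate flag and the reply dictionaries below are assumptions for illustration, not existing QEMU code. It also shows a command failing cleanly instead of asserting when the devices are not in the expected state.

```python
class BlockNodes:
    """Toy model of one QEMU process's view of its images (hypothetical)."""

    def __init__(self, active):
        self.active = active

    def activate(self):
        # Fail cleanly instead of asserting on an unexpected state.
        if self.active:
            return {"error": "block devices already active"}
        self.active = True
        return {"return": {}}

    def inactivate(self):
        if not self.active:
            return {"error": "block devices already inactive"}
        self.active = False
        return {"return": {}}


def complete_migration(source, dest, explicit_activate):
    """Hand over at the completion point; the flag models a hypothetical capability."""
    source.inactivate()
    if not explicit_activate:
        dest.activate()  # implicit activation, as in step (b) of the scenario


source, dest = BlockNodes(active=True), BlockNodes(active=False)
complete_migration(source, dest, explicit_activate=True)
print(dest.active)      # destination stays inactive until told otherwise
print(dest.activate())  # management layer activates explicitly
print(dest.activate())  # second call reports an error instead of asserting
```

With the flag set, the management layer decides when the destination takes over the images, and an abandoned migration can reactivate the source without racing against the destination.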

By the way, are block devices the only things that need to be explicitly
activated? For example, what about qemu_announce_self() for network
cards? Do we need to delay that, too?

In any case, I think this patch needs to be reverted for 2.12 because
it's wrong, and then we can create the proper solution in the 2.13
timeframe.

Kevin
