From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
To: Kevin Wolf <kwolf@redhat.com>
Cc: qemu-devel@nongnu.org, quintela@redhat.com, famz@redhat.com,
	jdenemar@redhat.com, peterx@redhat.com
Subject: Re: [Qemu-devel] [PATCH] migration: Don't activate block devices if using -S
Date: Mon, 9 Apr 2018 16:35:45 +0100
Message-ID: <20180409153545.GH2449@work-vm>
In-Reply-To: <20180409152524.GH5294@localhost.localdomain>

* Kevin Wolf (kwolf@redhat.com) wrote:
> On 09.04.2018 at 16:04, Dr. David Alan Gilbert wrote:
> > * Kevin Wolf (kwolf@redhat.com) wrote:
> > > On 09.04.2018 at 12:27, Dr. David Alan Gilbert wrote:
> > > > * Kevin Wolf (kwolf@redhat.com) wrote:
> > > > > On 03.04.2018 at 22:52, Dr. David Alan Gilbert wrote:
> > > > > > * Kevin Wolf (kwolf@redhat.com) wrote:
> > > > > > > On 28.03.2018 at 19:02, Dr. David Alan Gilbert (git) wrote:
> > > > > > > > From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
> > > > > > > > 
> > > > > > > > Activating the block devices causes the locks to be taken on
> > > > > > > > the backing file.  If we're running with -S and the destination libvirt
> > > > > > > > hasn't started the destination with 'cont', it's expecting the locks are
> > > > > > > > still untaken.
> > > > > > > > 
> > > > > > > > Don't activate the block devices if we're not going to autostart the VM;
> > > > > > > > 'cont' already will do that anyway.
> > > > > > > > 
> > > > > > > > bz: https://bugzilla.redhat.com/show_bug.cgi?id=1560854
> > > > > > > > Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> > > > > > > 
> > > > > > > I'm not sure that this is a good idea. Going back to my old writeup of
> > > > > > > the migration phases...
> > > > > > > 
> > > > > > > https://lists.gnu.org/archive/html/qemu-devel/2017-09/msg07917.html
> > > > > > > 
> > > > > > > ...the phase between migration completion and 'cont' is described like
> > > > > > > this:
> > > > > > > 
> > > > > > >     b) Migration converges:
> > > > > > >        Both VMs are stopped (assuming -S is given on the destination,
> > > > > > >        otherwise this phase is skipped), the destination is in control of
> > > > > > >        the resources
> > > > > > > 
> > > > > > > This patch changes the definition of the phase so that neither side is
> > > > > > > in control of the resources. We lose the phase where the destination is
> > > > > > > in control, but the VM isn't running yet. This feels like a problem to
> > > > > > > me.
> > > > > > 
> > > > > > But see Jiri's writeup on that bz; libvirt is hitting the opposite
> > > > > > problem; in this corner case they can't have the destination taking
> > > > > > control yet.
> > > > > 
> > > > > I wonder if they can't already grant the destination QEMU the necessary
> > > > > permission in the pre-switchover phase. Just a thought, I don't know how
> > > > > this works in detail, so it might not be possible after all.
> > > > 
> > > > It's a fairly hairy failure case they had; if I remember correctly it's:
> > > >   a) Start migration
> > > >   b) Migration gets to completion point
> > > >   c) Destination is still paused
> > > >   d) Libvirt is restarted on the source
> > > >   e) Since libvirt was restarted it fails the migration (and hence knows
> > > >      the destination won't be started)
> > > >   f) It now tries to resume the qemu on the source
> > > > 
> > > > (f) fails because (b) caused the locks to be taken on the destination;
> > > > hence this patch stops doing that.  It's a case we don't really think
> > > > about - i.e. that the migration has actually completed and all the data
> > > > is on the destination, but libvirt decides for some other reason to
> > > > abandon migration.
> > > 
> > > If you do remember correctly, that scenario doesn't feel tricky at all.
> > > libvirt needs to quit the destination qemu, which will inactivate the
> > > images on the destination and release the lock, and then it can continue
> > > the source.
> > > 
> > > In fact, this is so straightforward that I wonder what else libvirt is
> > > doing. Is the destination qemu only shut down after trying to continue
> > > the source? That would be libvirt using the wrong order of steps.
> > 
> > I'll leave Jiri to reply to this; I think this is a case of the source
> > realising libvirt has restarted, then trying to recover all of its VMs
> > without being in the position of being able to check on the destination.
> > 
> > > > > > > Consider a case where the management tool keeps a mirror job with
> > > > > > > sync=none running to expose all I/O requests to some external process.
> > > > > > > It needs to shut down the old block job on the source in the
> > > > > > > 'pre-switchover' state, and start a new block job on the destination
> > > > > > > when the destination controls the images, but the VM doesn't run yet (so
> > > > > > > that it doesn't miss an I/O request). This patch removes the migration
> > > > > > > phase that the management tool needs to implement this correctly.
> > > > > > > 
> > > > > > > If we need a "neither side has control" phase, we might need to
> > > > > > > introduce it in addition to the existing phases rather than replacing a
> > > > > > > phase that is still needed in other cases.
> > > > > > 
> > > > > > This is yet another phase to be added.
> > > > > > IMHO this needs the management tool to explicitly take control in the
> > > > > > case you're talking about.
> > > > > 
> > > > > What kind of mechanism do you have in mind there?
> > > > > 
> > > > > Maybe what could work would be separate QMP commands to inactivate (and
> > > > > possibly for symmetry activate) all block nodes. Then the management
> > > > > tool could use the pre-switchover phase to shut down its block jobs
> > > > > etc., inactivate all block nodes, transfer its own locks and then call
> > > > > migrate-continue.
> > > > 
> > > > Yes it was a 'block-activate' that I'd wondered about.  One complication
> > > > is that if this is now under the control of the management layer then we
> > > > should stop asserting when the block devices aren't in the expected
> > > > state and just cleanly fail the command instead.
> > > 
> > > Requiring an explicit 'block-activate' on the destination would be an
> > > incompatible change, so you would have to introduce a new option for
> > > that. 'block-inactivate' on the source feels a bit simpler.
> > 
> > I'd only want the 'block-activate' in the case of this new block-job
> > you're suggesting; not in the case of normal migrates - they'd still get
> > it when they do 'cont' - so the change in behaviour is only with that
> > block-job case that must start before the end of migrate.
> 
> I'm not aware of having suggested a new block job?

I'm referring to your concern in your first reply in the thread:
     Consider a case where the management tool keeps a mirror job with
     sync=none running to expose all I/O requests to some external process.
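
As a rough illustration of the ordering that case depends on, here is a
minimal sketch in Python. It only builds the QMP messages and prints them;
it does not talk to a QEMU. The job id, device and target names are
hypothetical, and the step of waiting for migration to complete between
'migrate-continue' and the new mirror job is left out.

```python
import json


def mirror_handover_plan(job_id="tap0", device="drive0", target="tap-target"):
    """Return (side, QMP command) pairs in the order the scenario needs.

    All names passed in are placeholders chosen for this sketch.
    """
    return [
        # Source sits in 'pre-switchover': stop the old job there.
        ("source", {"execute": "block-job-cancel",
                    "arguments": {"device": job_id}}),
        # Let the migration run to completion.
        ("source", {"execute": "migrate-continue",
                    "arguments": {"state": "pre-switchover"}}),
        # Destination owns the images, guest still stopped (-S):
        # this is the window the new job has to start in.
        ("destination", {"execute": "blockdev-mirror",
                         "arguments": {"job-id": job_id, "device": device,
                                       "target": target, "sync": "none"}}),
        # Only now let the guest run, so no I/O request is missed.
        ("destination", {"execute": "cont"}),
    ]


for side, cmd in mirror_handover_plan():
    print(side, json.dumps(cmd))
```

The point is the third step: it has to happen after the destination has
the images but before 'cont', which is the window under discussion.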

> > > And yes, you're probably right that we would have to be more careful to
> > > catch inactive images without crashing. On the other hand, it would
> > > become a state that is easier to test because it can be directly
> > > influenced via QMP rather than being only a side effect of migration.
> > 
> > Yes; but crashing is really bad, so we should really stop asserting
> > all over.
> 
> Are you aware of any wrong assertions currently?

Well, there's https://bugzilla.redhat.com/show_bug.cgi?id=1408653  that
I've not looked at for a while.
But we have had a few lately.

> The thing is, inactive images can only happen in a fairly restricted set
> of scenarios today - either on the source after migration completed, or
> on the destination before it completed. If you get any write I/O
> requests in these states, that's a QEMU bug, so assertions to catch
> these bugs feel right to me.

But if we add the 'block-inactivate' command you suggest, then it could
be a management screwup rather than a qemu bug, and so assertions feel
wrong.
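
To make the distinction concrete, here is a minimal sketch in Python, not
QEMU code. The Image class and its methods are hypothetical stand-ins:
inactivate() plays the role of the proposed 'block-inactivate' command,
write_asserting() the current style where an inactive image can only mean
a QEMU bug, and write_checked() the style where the caller gets an error
it can report.

```python
class InactiveImageError(Exception):
    """Raised instead of aborting when a write hits an inactive image."""


class Image:
    def __init__(self):
        self.active = True

    def inactivate(self):
        # Stand-in for the proposed (hypothetical) 'block-inactivate'.
        self.active = False

    def write_asserting(self, data):
        # Assertion style: an inactive image here is treated as a bug
        # in the program itself, so the whole process goes down.
        assert self.active, "write to inactive image"
        return len(data)

    def write_checked(self, data):
        # Error style: a management mistake becomes a failure the
        # caller can handle and report.
        if not self.active:
            raise InactiveImageError("image is inactive")
        return len(data)


img = Image()
img.inactivate()
try:
    img.write_checked(b"data")
except InactiveImageError as err:
    print("command failed cleanly:", err)
```

Once the management layer can put an image into the inactive state
directly, the second form is the one that keeps the process alive.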

Dave


> Kevin
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
