All of lore.kernel.org
 help / color / mirror / Atom feed
From: Markus Armbruster <armbru@redhat.com>
To: Peter Xu <peterx@redhat.com>
Cc: "Andrew Jones" <drjones@redhat.com>,
	"Juan Quintela" <quintela@redhat.com>,
	qemu-devel@nongnu.org,
	"Dr . David Alan Gilbert" <dgilbert@redhat.com>,
	"Leonardo Bras Soares Passos" <lsoaresp@redhat.com>,
	"David Gibson" <dgibson@redhat.com>,
	"Marc-André Lureau" <marcandre.lureau@redhat.com>
Subject: Re: [PATCH 0/2] dump-guest-memory: Add blocker for migration
Date: Wed, 01 Sep 2021 13:35:18 +0200	[thread overview]
Message-ID: <87fsuobkax.fsf@dusky.pond.sub.org> (raw)
In-Reply-To: <YSa3BAyTZJ/L0Few@t490s> (Peter Xu's message of "Wed, 25 Aug 2021 17:32:52 -0400")

Peter Xu <peterx@redhat.com> writes:

> Markus,
>
> On Wed, Aug 25, 2021 at 09:54:12AM +0200, Markus Armbruster wrote:
>> Peter Xu <peterx@redhat.com> writes:
>> 
>> > Both dump-guest-memory and live migration have vm state cached internally.
>> > Allowing them to happen together means the vm state can be messed up.  Simply
>> > block live migration for dump-guest-memory.
>> >
>> > One trivial thing to mention is we should still allow dump-guest-memory even if
>> > -only-migratable is specified, because that flag should majorly be used to
>> > guarantee not adding devices that will block migration by accident.  Dump guest
>> > memory is not like that - it'll only block for the seconds when it's dumping.
>> 
>> I recently ran into a similarly unusual use of migration blockers:
>> 
>>     Subject: -only-migrate and the two different uses of migration blockers
>>      (was: spapr_events: Sure we may ignore migrate_add_blocker() failure?)
>>     Date: Mon, 19 Jul 2021 13:00:20 +0200 (5 weeks, 1 day, 20 hours ago)
>>     Message-ID: <87sg0amuuz.fsf_-_@dusky.pond.sub.org>
>> 
>>     We appear to use migration blockers in two ways:
>> 
>>     (1) Prevent migration for an indefinite time, typically due to use of
>>     some feature that isn't compatible with migration.
>> 
>>     (2) Delay migration for a short time.
>> 
>>     Option -only-migrate is designed for (1).  It interferes with (2).
>> 
>>     Example for (1): device "x-pci-proxy-dev" doesn't support migration.  It
>>     adds a migration blocker on realize, and deletes it on unrealize.  With
>>     -only-migrate, device realize fails.  Works as designed.
>> 
>>     Example for (2): spapr_mce_req_event() makes an effort to prevent
>>     migration degrate the reporting of FWNMIs.  It adds a migration blocker
>>     when it receives one, and deletes it when it's done handling it.  This
>>     is a best effort; if migration is already in progress by the time FWNMI
>>     is received, we simply carry on, and that's okay.  However, option
>>     -only-migrate sabotages the best effort entirely.
>> 
>>     While this isn't exactly terrible, it may be a weakness in our thinking
>>     and our infrastructure.  I'm bringing it up so the people in charge are
>>     aware :)
>> 
>> https://lists.nongnu.org/archive/html/qemu-devel/2021-07/msg04723.html
>> 
>> Downthread there, Dave Gilbert opined
>> 
>>     It almost feels like they need a way to temporarily hold off
>>     'completion' of migratio - i.e. the phase where we stop the CPU and
>>     write the device data;  mind you you'd also probably want it to stop
>>     cold-migrates/snapshots?
>
> Yeah, maybe spapr_mce_req_event() can be another candidate of the internal
> version of migration_add_blocker().
>
> I can add a patch to replace it if anyone likes me to.
>
> Both cold and live snapshot should have checked migration blockers, I think.
> E.g., cold snapshot has:
>
> bool save_snapshot(const char *name, bool overwrite, const char *vmstate,
>                   bool has_devices, strList *devices, Error **errp)
> {
>     [...]
>     if (migration_is_blocked(errp)) {
>         return false;
>     }
>     [...]
> }
>
> While the live snapshot shares similar code in migrate_prepare().
>
> So looks safe that nothing wrong should happen within add/del pair of blockers.
>
> However I do see that it's possible we'll allow the add_blocker to succeed even
> if during cold snapshot, because migration_is_idle() in migration_add_blocker()
> only checks migration state, not RUN_STATE_SAVE_VM.  So I'm wondering whether
> we'd like one more patch to cover that too, like:
>
> ---8<---
>
> diff --git a/migration/migration.c b/migration/migration.c
> index 41429680c2..9c602a4ac1 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -2055,15 +2055,16 @@ void migrate_init(MigrationState *s)
>  
>  int migrate_add_blocker_internal(Error *reason, Error **errp)
>  {
> -    if (migration_is_idle()) {
> -        migration_blockers = g_slist_prepend(migration_blockers, reason);
> -        return 0;
> +    /* Snapshots are similar to migrations, so check RUN_STATE_SAVE_VM too. */
> +    if (runstate_check(RUN_STATE_SAVE_VM) || !migration_is_idle()) {
> +        error_propagate_prepend(errp, error_copy(reason),
> +                                "disallowing migration blocker "
> +                                "(migration in progress) for: ");
> +        return -EBUSY;
>      }
>  
> -    error_propagate_prepend(errp, error_copy(reason),
> -                            "disallowing migration blocker "
> -                            "(migration in progress) for: ");
> -    return -EBUSY;
> +    migration_blockers = g_slist_prepend(migration_blockers, reason);
> +    return 0;
>  }
>  
>  int migrate_add_blocker(Error *reason, Error **errp)
> ---8<---
>
> It'll by accident also cover guest dump which also sets RUN_STATE_SAVE_VM, but
> I think that's ok.
>
> Thanks,

I delayed answering this in the hope of finding the time to think.  No
luck.  This is a quick answer, hopefully better than nothing and not too
confused.

"Snapshot should honor migration blockers" feels like an independent
issue.  I can't comment on it, because I haven't done my thinking there.

I'm not sure you fully got Dave's suggestion to "temporarily hold off
'completion'".  Let me explain (Dave, please correct misunderstandings,
if any).

Migration blockers and migration are mutually exclusive: you can't
migrate while such blockers are in effect (trying to immediately fails),
and you can't add such blockers while migration is in progress.

The temporary blockers Dave suggested do not block migration, only its
*completion* phase.  This makes sense *only* for short-lived blockers.
If such temporary blockers are in effect when migration is ready to
enter its completion phase, that entry is delayed until the temporary
blockers are gone.  If you try to add a temporary blocker while
migration is in its completion phase, the add blocks until migration
exists its completion phase.



  reply	other threads:[~2021-09-03 13:09 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-08-24 15:27 [PATCH 0/2] dump-guest-memory: Add blocker for migration Peter Xu
2021-08-24 15:27 ` [PATCH 1/2] migration: Add migrate_add_blocker_internal() Peter Xu
2021-08-24 18:04   ` Marc-André Lureau
2021-08-25  8:04   ` Juan Quintela
2021-08-24 15:27 ` [PATCH 2/2] dump-guest-memory: Block live migration Peter Xu
2021-08-24 18:04   ` Marc-André Lureau
2021-08-24 19:39     ` Peter Xu
2021-08-25  7:36   ` Marc-André Lureau
2021-08-25 20:48     ` Peter Xu
2021-08-25  7:54 ` [PATCH 0/2] dump-guest-memory: Add blocker for migration Markus Armbruster
2021-08-25 21:32   ` Peter Xu
2021-09-01 11:35     ` Markus Armbruster [this message]
2021-09-03 16:08       ` Peter Xu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87fsuobkax.fsf@dusky.pond.sub.org \
    --to=armbru@redhat.com \
    --cc=dgibson@redhat.com \
    --cc=dgilbert@redhat.com \
    --cc=drjones@redhat.com \
    --cc=lsoaresp@redhat.com \
    --cc=marcandre.lureau@redhat.com \
    --cc=peterx@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=quintela@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.