All of lore.kernel.org
 help / color / mirror / Atom feed
From: David Hildenbrand <david@redhat.com>
To: Peter Xu <peterx@redhat.com>
Cc: "Eduardo Habkost" <ehabkost@redhat.com>,
	"Juan Quintela" <quintela@redhat.com>,
	"Michael S. Tsirkin" <mst@redhat.com>,
	"Richard Henderson" <richard.henderson@linaro.org>,
	qemu-devel@nongnu.org,
	"Dr . David Alan Gilbert" <dgilbert@redhat.com>,
	"Shannon Zhao" <shannon.zhao@linaro.org>,
	"Igor Mammedov" <imammedo@redhat.com>,
	"Paolo Bonzini" <pbonzini@redhat.com>,
	"Alex Bennée" <alex.bennee@linaro.org>,
	"Richard Henderson" <rth@twiddle.net>
Subject: Re: [PATCH v1 05/13] migrate/ram: Handle RAM block resizes during precopy
Date: Fri, 21 Feb 2020 11:19:58 +0100	[thread overview]
Message-ID: <81ca5783-97d5-555c-c361-9b5b0ec144b3@redhat.com> (raw)
In-Reply-To: <eacd48d3-4bce-bdd4-941d-3c098cb29034@redhat.com>

On 21.02.20 11:10, David Hildenbrand wrote:
> On 21.02.20 10:18, David Hildenbrand wrote:
>> On 20.02.20 21:17, Peter Xu wrote:
>>> On Thu, Feb 20, 2020 at 04:16:02PM +0100, David Hildenbrand wrote:
>>>> On 19.02.20 17:17, David Hildenbrand wrote:
>>>>> Resizing while migrating is dangerous and does not work as expected.
>>>>> The whole migration code works on the usable_length of ram blocks and does
>>>>> not expect this to change at random points in time.
>>>>>
>>>>> In the case of precopy, the ram block size must not change on the source,
>>>>> after syncing the RAM block list in ram_save_setup(), so as long as the
>>>>> guest is still running on the source.
>>>>>
>>>>> Resizing can be trigger *after* (but not during) a reset in
>>>>> ACPI code by the guest
>>>>> - hw/arm/virt-acpi-build.c:acpi_ram_update()
>>>>> - hw/i386/acpi-build.c:acpi_ram_update()
>>>>>
>>>>> Use the ram block notifier to get notified about resizes. Let's simply
>>>>> cancel migration and indicate the reason. We'll continue running on the
>>>>> source. No harm done.
>>>>>
>>>>> Update the documentation. Postcopy will be handled separately.
>>>>>
>>>>> Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
>>>>> Cc: Juan Quintela <quintela@redhat.com>
>>>>> Cc: Eduardo Habkost <ehabkost@redhat.com>
>>>>> Cc: Paolo Bonzini <pbonzini@redhat.com>
>>>>> Cc: Igor Mammedov <imammedo@redhat.com>
>>>>> Cc: "Michael S. Tsirkin" <mst@redhat.com>
>>>>> Cc: Richard Henderson <richard.henderson@linaro.org>
>>>>> Cc: Shannon Zhao <shannon.zhao@linaro.org>
>>>>> Cc: Alex Bennée <alex.bennee@linaro.org>
>>>>> Cc: Peter Xu <peterx@redhat.com>
>>>>> Signed-off-by: David Hildenbrand <david@redhat.com>
>>>>> ---
>>>>>  exec.c                |  5 +++--
>>>>>  include/exec/memory.h | 10 ++++++----
>>>>>  migration/migration.c |  9 +++++++--
>>>>>  migration/migration.h |  1 +
>>>>>  migration/ram.c       | 41 +++++++++++++++++++++++++++++++++++++++++
>>>>>  5 files changed, 58 insertions(+), 8 deletions(-)
>>>>>
>>>>> diff --git a/exec.c b/exec.c
>>>>> index b75250e773..8b015821d6 100644
>>>>> --- a/exec.c
>>>>> +++ b/exec.c
>>>>> @@ -2120,8 +2120,9 @@ static int memory_try_enable_merging(void *addr, size_t len)
>>>>>      return qemu_madvise(addr, len, QEMU_MADV_MERGEABLE);
>>>>>  }
>>>>>  
>>>>> -/* Only legal before guest might have detected the memory size: e.g. on
>>>>> - * incoming migration, or right after reset.
>>>>> +/*
>>>>> + * Resizing RAM while migrating can result in the migration being canceled.
>>>>> + * Care has to be taken if the guest might have already detected the memory.
>>>>>   *
>>>>>   * As memory core doesn't know how is memory accessed, it is up to
>>>>>   * resize callback to update device state and/or add assertions to detect
>>>>> diff --git a/include/exec/memory.h b/include/exec/memory.h
>>>>> index e85b7de99a..de111347e8 100644
>>>>> --- a/include/exec/memory.h
>>>>> +++ b/include/exec/memory.h
>>>>> @@ -113,7 +113,7 @@ typedef struct IOMMUNotifier IOMMUNotifier;
>>>>>  #define RAM_SHARED     (1 << 1)
>>>>>  
>>>>>  /* Only a portion of RAM (used_length) is actually used, and migrated.
>>>>> - * This used_length size can change across reboots.
>>>>> + * Resizing RAM while migrating can result in the migration being canceled.
>>>>>   */
>>>>>  #define RAM_RESIZEABLE (1 << 2)
>>>>>  
>>>>> @@ -843,7 +843,9 @@ void memory_region_init_ram_shared_nomigrate(MemoryRegion *mr,
>>>>>   *                                     RAM.  Accesses into the region will
>>>>>   *                                     modify memory directly.  Only an initial
>>>>>   *                                     portion of this RAM is actually used.
>>>>> - *                                     The used size can change across reboots.
>>>>> + *                                     Changing the size while migrating
>>>>> + *                                     can result in the migration being
>>>>> + *                                     canceled.
>>>>>   *
>>>>>   * @mr: the #MemoryRegion to be initialized.
>>>>>   * @owner: the object that tracks the region's reference count
>>>>> @@ -1464,8 +1466,8 @@ void *memory_region_get_ram_ptr(MemoryRegion *mr);
>>>>>  
>>>>>  /* memory_region_ram_resize: Resize a RAM region.
>>>>>   *
>>>>> - * Only legal before guest might have detected the memory size: e.g. on
>>>>> - * incoming migration, or right after reset.
>>>>> + * Resizing RAM while migrating can result in the migration being canceled.
>>>>> + * Care has to be taken if the guest might have already detected the memory.
>>>>>   *
>>>>>   * @mr: a memory region created with @memory_region_init_resizeable_ram.
>>>>>   * @newsize: the new size the region
>>>>> diff --git a/migration/migration.c b/migration/migration.c
>>>>> index 8fb68795dc..ac9751dbe5 100644
>>>>> --- a/migration/migration.c
>>>>> +++ b/migration/migration.c
>>>>> @@ -175,13 +175,18 @@ void migration_object_init(void)
>>>>>      }
>>>>>  }
>>>>>  
>>>>> +void migration_cancel(void)
>>>>> +{
>>>>> +    migrate_fd_cancel(current_migration);
>>>>> +}
>>>>> +
>>>>>  void migration_shutdown(void)
>>>>>  {
>>>>>      /*
>>>>>       * Cancel the current migration - that will (eventually)
>>>>>       * stop the migration using this structure
>>>>>       */
>>>>> -    migrate_fd_cancel(current_migration);
>>>>> +    migration_cancel();
>>>>>      object_unref(OBJECT(current_migration));
>>>>>  }
>>>>>  
>>>>> @@ -2019,7 +2024,7 @@ void qmp_migrate(const char *uri, bool has_blk, bool blk,
>>>>>  
>>>>>  void qmp_migrate_cancel(Error **errp)
>>>>>  {
>>>>> -    migrate_fd_cancel(migrate_get_current());
>>>>> +    migration_cancel();
>>>>>  }
>>>>>  
>>>>>  void qmp_migrate_continue(MigrationStatus state, Error **errp)
>>>>> diff --git a/migration/migration.h b/migration/migration.h
>>>>> index 8473ddfc88..79fd74afa5 100644
>>>>> --- a/migration/migration.h
>>>>> +++ b/migration/migration.h
>>>>> @@ -343,5 +343,6 @@ int foreach_not_ignored_block(RAMBlockIterFunc func, void *opaque);
>>>>>  void migration_make_urgent_request(void);
>>>>>  void migration_consume_urgent_request(void);
>>>>>  bool migration_rate_limit(void);
>>>>> +void migration_cancel(void);
>>>>>  
>>>>>  #endif
>>>>> diff --git a/migration/ram.c b/migration/ram.c
>>>>> index ed23ed1c7c..57f32011a3 100644
>>>>> --- a/migration/ram.c
>>>>> +++ b/migration/ram.c
>>>>> @@ -52,6 +52,7 @@
>>>>>  #include "migration/colo.h"
>>>>>  #include "block.h"
>>>>>  #include "sysemu/sysemu.h"
>>>>> +#include "sysemu/runstate.h"
>>>>>  #include "savevm.h"
>>>>>  #include "qemu/iov.h"
>>>>>  #include "multifd.h"
>>>>> @@ -3710,8 +3711,48 @@ static SaveVMHandlers savevm_ram_handlers = {
>>>>>      .resume_prepare = ram_resume_prepare,
>>>>>  };
>>>>>  
>>>>> +static void ram_mig_ram_block_resized(RAMBlockNotifier *n, void *host,
>>>>> +                                      size_t old_size, size_t new_size)
>>>>> +{
>>>>> +    ram_addr_t offset;
>>>>> +    Error *err = NULL;
>>>>> +    RAMBlock *rb = qemu_ram_block_from_host(host, false, &offset);
>>>>> +
>>>>> +    if (ramblock_is_ignored(rb)) {
>>>>> +        return;
>>>>> +    }
>>>>> +
>>>>> +    /*
>>>>> +     * Some resizes are triggered on the migration target by precopy code,
>>>>> +     * when synchronizing RAM block sizes. In these cases, the VM is not
>>>>> +     * running and migration is not idle. We have to ignore these resizes,
>>>>> +     * as we only care about resizes during precopy on the migration source.
>>>>> +     * This handler is always registered, so ignore when migration is idle.
>>>>> +     */
>>>>> +    if (migration_is_idle() || !runstate_is_running() ||
>>>
>>> So I noticed that I mis-misread the code after chat with Dave...
>>>
>>> migration_is_idle() should only return false if on the source and only
>>> if during migration.  Destination should still return true for that
>>> (destination VM reads state from MigrationIncomingState.state
>>> instead).
>>>
>>> With that, I think we can drop the confusing !runstate_is_running()
>>> check because migration_is_idle() will cover that (and touch up the
>>> comment too)?
>>
>> So, we want to cancel migration whenever we are on the source and we are
>> migrating (postcopy). Resizing will only happen while the VM is running
>> on the source once we're migrating.
>>
>> So you're saying that
>>
>> if (migration_is_idle() || postcopy_is_running()) {
>> 	return:
>> }
>>
>> is enough. Will migration_is_idle() always return true on the
>> destination (I remember something different, but might be *I* misread
>> the code)?
>>
>> Then this would distill down to
>>
>> if (migration_is_idle()) {
>> 	return:
>> }
>>
>> Which would be nice.
> 
> ... but looking at process_incoming_migration_co(), we'll set the state
> to MIGRATION_STATUS_ACTIVE right at the start.
> 
> 
> So dropping postcopy_is_running() would be wrong and dropping
> !runstate_is_running() would be wrong as well right now.
> 
> What we actually want is
> 
> if (!migration_outgoing()) {
> 	return;
> }
> 
> Any experts?
> 

Lol, man this code is confusing.

So, migration_is_idle() really only checks current_migration, not
current_incoming.

I didn't expect to be knees deep in migration code at this point ...

if (migration_is_idle()) {
	return:
}

works. Thanks!

-- 
Thanks,

David / dhildenb



  reply	other threads:[~2020-02-21 10:20 UTC|newest]

Thread overview: 46+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-02-19 16:17 [PATCH v1 00/13] migrate/ram: Fix resizing RAM blocks while migrating David Hildenbrand
2020-02-19 16:17 ` [PATCH v1 01/13] util: vfio-helpers: Factor out and fix processing of existing ram blocks David Hildenbrand
2020-02-19 20:48   ` Peter Xu
2020-02-19 16:17 ` [PATCH v1 02/13] stubs/ram-block: Remove stubs that are no longer needed David Hildenbrand
2020-02-19 20:48   ` Peter Xu
2020-02-19 16:17 ` [PATCH v1 03/13] numa: Teach ram block notifiers about resizeable ram blocks David Hildenbrand
2020-02-19 16:17   ` [Xen-devel] " David Hildenbrand
2020-02-19 20:48   ` Peter Xu
2020-02-19 20:48     ` [Xen-devel] " Peter Xu
2020-02-19 16:17 ` [PATCH v1 04/13] numa: Make all callbacks of ram block notifiers optional David Hildenbrand
2020-02-19 20:49   ` Peter Xu
2020-02-19 16:17 ` [PATCH v1 05/13] migrate/ram: Handle RAM block resizes during precopy David Hildenbrand
2020-02-20 15:16   ` David Hildenbrand
2020-02-20 20:17     ` Peter Xu
2020-02-21  9:18       ` David Hildenbrand
2020-02-21 10:10         ` David Hildenbrand
2020-02-21 10:19           ` David Hildenbrand [this message]
2020-02-21 16:35             ` Peter Xu
2020-02-21 15:14   ` Dr. David Alan Gilbert
2020-02-21 15:18     ` David Hildenbrand
2020-02-21 15:40       ` Dr. David Alan Gilbert
2020-02-21 15:46         ` David Hildenbrand
2020-02-19 16:17 ` [PATCH v1 06/13] migrate/ram: Discard new RAM when growing RAM blocks and the VM is stopped David Hildenbrand
2020-02-21  9:08   ` David Hildenbrand
2020-02-21 15:51     ` Dr. David Alan Gilbert
2020-02-21 15:53       ` David Hildenbrand
2020-02-19 16:17 ` [PATCH v1 07/13] migrate/ram: Get rid of "place_source" in ram_load_postcopy() David Hildenbrand
2020-02-19 20:47   ` Peter Xu
2020-02-19 20:55     ` Peter Xu
2020-02-20 13:22       ` David Hildenbrand
2020-02-20 13:48         ` Peter Xu
2020-02-20  9:24     ` David Hildenbrand
2020-02-20 18:58       ` Dr. David Alan Gilbert
2020-02-19 16:17 ` [PATCH v1 08/13] migrate/ram: Simplify host page handling " David Hildenbrand
2020-02-19 16:17 ` [PATCH v1 09/13] migrate/ram: Consolidate variable reset after placement " David Hildenbrand
2020-02-19 16:17 ` [PATCH v1 10/13] migrate/ram: Handle RAM block resizes during postcopy David Hildenbrand
2020-02-20 20:54   ` Peter Xu
2020-02-21  8:35     ` David Hildenbrand
2020-02-21  8:41       ` David Hildenbrand
2020-02-19 16:17 ` [PATCH v1 11/13] migrate/multifd: Print used_length of memory block David Hildenbrand
2020-02-19 16:17 ` [PATCH v1 12/13] migrate/ram: Use offset_in_ramblock() in range checks David Hildenbrand
2020-02-19 16:17 ` [PATCH v1 13/13] migrate/ram: Tolerate partially changed mappings in postcopy code David Hildenbrand
2020-02-20 11:24   ` David Hildenbrand
2020-02-20 21:07     ` Peter Xu
2020-02-21  8:48       ` David Hildenbrand
2020-02-21 12:13 ` [PATCH v1 00/13] migrate/ram: Fix resizing RAM blocks while migrating David Hildenbrand

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=81ca5783-97d5-555c-c361-9b5b0ec144b3@redhat.com \
    --to=david@redhat.com \
    --cc=alex.bennee@linaro.org \
    --cc=dgilbert@redhat.com \
    --cc=ehabkost@redhat.com \
    --cc=imammedo@redhat.com \
    --cc=mst@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=peterx@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=quintela@redhat.com \
    --cc=richard.henderson@linaro.org \
    --cc=rth@twiddle.net \
    --cc=shannon.zhao@linaro.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.