All of lore.kernel.org
 help / color / mirror / Atom feed
From: David Hildenbrand <david@redhat.com>
To: Peter Xu <peterx@redhat.com>
Cc: "Eduardo Habkost" <ehabkost@redhat.com>,
	"Juan Quintela" <quintela@redhat.com>,
	"Michael S. Tsirkin" <mst@redhat.com>,
	"Richard Henderson" <richard.henderson@linaro.org>,
	qemu-devel@nongnu.org,
	"Dr . David Alan Gilbert" <dgilbert@redhat.com>,
	"Shannon Zhao" <shannon.zhao@linaro.org>,
	"Igor Mammedov" <imammedo@redhat.com>,
	"Paolo Bonzini" <pbonzini@redhat.com>,
	"Alex Bennée" <alex.bennee@linaro.org>,
	"Richard Henderson" <rth@twiddle.net>
Subject: Re: [PATCH v1 10/13] migrate/ram: Handle RAM block resizes during postcopy
Date: Fri, 21 Feb 2020 09:35:04 +0100	[thread overview]
Message-ID: <9a4207fe-4b35-8b6a-6b29-3ad2e704b492@redhat.com> (raw)
In-Reply-To: <20200220205443.GD15253@xz-x1>

On 20.02.20 21:54, Peter Xu wrote:
> On Wed, Feb 19, 2020 at 05:17:22PM +0100, David Hildenbrand wrote:
>> Resizing while migrating is dangerous and does not work as expected.
>> The whole migration code works on the usable_length of ram blocks and does
>> not expect this to change at random points in time.
>>
>> In the case of postcopy, relying on used_length is racy as soon as the
>> guest is running. Also, when used_length changes we might leave the
>> uffd handler registered for some memory regions, reject valid pages
>> when migrating and fail when sending the recv bitmap to the source.
>>
>> Resizing can be trigger *after* (but not during) a reset in
>> ACPI code by the guest
>> - hw/arm/virt-acpi-build.c:acpi_ram_update()
>> - hw/i386/acpi-build.c:acpi_ram_update()
>>
>> Let's remember the original used_length in a separate variable and
>> use it in relevant postcopy code. Make sure to update it when we resize
>> during precopy, when synchronizing the RAM block sizes with the source.
>>
>> Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
>> Cc: Juan Quintela <quintela@redhat.com>
>> Cc: Eduardo Habkost <ehabkost@redhat.com>
>> Cc: Paolo Bonzini <pbonzini@redhat.com>
>> Cc: Igor Mammedov <imammedo@redhat.com>
>> Cc: "Michael S. Tsirkin" <mst@redhat.com>
>> Cc: Richard Henderson <richard.henderson@linaro.org>
>> Cc: Shannon Zhao <shannon.zhao@linaro.org>
>> Cc: Alex Bennée <alex.bennee@linaro.org>
>> Cc: Peter Xu <peterx@redhat.com>
>> Signed-off-by: David Hildenbrand <david@redhat.com>
>> ---
>>  include/exec/ramblock.h  |  9 +++++++++
>>  migration/postcopy-ram.c | 15 ++++++++++++---
>>  migration/ram.c          | 11 +++++++++--
>>  3 files changed, 30 insertions(+), 5 deletions(-)
>>
>> diff --git a/include/exec/ramblock.h b/include/exec/ramblock.h
>> index 07d50864d8..0e9e9b346b 100644
>> --- a/include/exec/ramblock.h
>> +++ b/include/exec/ramblock.h
>> @@ -59,6 +59,15 @@ struct RAMBlock {
>>       */
>>      unsigned long *clear_bmap;
>>      uint8_t clear_bmap_shift;
>> +
>> +    /*
>> +     * RAM block used_length before the guest started running while postcopy
>> +     * was active. Once the guest is running, used_length can change. Used to
>> +     * register/unregister uffd handlers and as the size of the recv bitmap.
>> +     * Receiving any page beyond this length will bail out, as it could not have
>> +     * been valid on the source.
>> +     */
>> +    ram_addr_t postcopy_length;
>>  };
>>  #endif
>>  #endif
>> diff --git a/migration/postcopy-ram.c b/migration/postcopy-ram.c
>> index a36402722b..c68caf4e42 100644
>> --- a/migration/postcopy-ram.c
>> +++ b/migration/postcopy-ram.c
>> @@ -17,6 +17,7 @@
>>   */
>>  
>>  #include "qemu/osdep.h"
>> +#include "qemu/rcu.h"
>>  #include "exec/target_page.h"
>>  #include "migration.h"
>>  #include "qemu-file.h"
>> @@ -31,6 +32,7 @@
>>  #include "qemu/error-report.h"
>>  #include "trace.h"
>>  #include "hw/boards.h"
>> +#include "exec/ramblock.h"
>>  
>>  /* Arbitrary limit on size of each discard command,
>>   * keeps them around ~200 bytes
>> @@ -456,6 +458,13 @@ static int init_range(RAMBlock *rb, void *opaque)
>>      ram_addr_t length = qemu_ram_get_used_length(rb);
>>      trace_postcopy_init_range(block_name, host_addr, offset, length);
>>  
>> +    /*
>> +     * Save the used_length before running the guest. In case we have to
>> +     * resize RAM blocks when syncing RAM block sizes from the source during
>> +     * precopy, we'll update it manually via the ram block notifier.
>> +     */
>> +    rb->postcopy_length = length;
>> +
>>      /*
>>       * We need the whole of RAM to be truly empty for postcopy, so things
>>       * like ROMs and any data tables built during init must be zero'd
>> @@ -478,7 +487,7 @@ static int cleanup_range(RAMBlock *rb, void *opaque)
>>      const char *block_name = qemu_ram_get_idstr(rb);
>>      void *host_addr = qemu_ram_get_host_addr(rb);
>>      ram_addr_t offset = qemu_ram_get_offset(rb);
>> -    ram_addr_t length = qemu_ram_get_used_length(rb);
>> +    ram_addr_t length = rb->postcopy_length;
>>      MigrationIncomingState *mis = opaque;
>>      struct uffdio_range range_struct;
>>      trace_postcopy_cleanup_range(block_name, host_addr, offset, length);
>> @@ -600,7 +609,7 @@ static int nhp_range(RAMBlock *rb, void *opaque)
>>      const char *block_name = qemu_ram_get_idstr(rb);
>>      void *host_addr = qemu_ram_get_host_addr(rb);
>>      ram_addr_t offset = qemu_ram_get_offset(rb);
>> -    ram_addr_t length = qemu_ram_get_used_length(rb);
>> +    ram_addr_t length = rb->postcopy_length;
>>      trace_postcopy_nhp_range(block_name, host_addr, offset, length);
>>  
>>      /*
>> @@ -644,7 +653,7 @@ static int ram_block_enable_notify(RAMBlock *rb, void *opaque)
>>      struct uffdio_register reg_struct;
>>  
>>      reg_struct.range.start = (uintptr_t)qemu_ram_get_host_addr(rb);
>> -    reg_struct.range.len = qemu_ram_get_used_length(rb);
>> +    reg_struct.range.len = rb->postcopy_length;
>>      reg_struct.mode = UFFDIO_REGISTER_MODE_MISSING;
>>  
>>      /* Now tell our userfault_fd that it's responsible for this area */
>> diff --git a/migration/ram.c b/migration/ram.c
>> index ab1f5534cf..6d1dcb362c 100644
>> --- a/migration/ram.c
>> +++ b/migration/ram.c
>> @@ -244,7 +244,7 @@ int64_t ramblock_recv_bitmap_send(QEMUFile *file,
>>          return -1;
>>      }
>>  
>> -    nbits = block->used_length >> TARGET_PAGE_BITS;
>> +    nbits = block->postcopy_length >> TARGET_PAGE_BITS;
>>  
>>      /*
>>       * Make sure the tmp bitmap buffer is big enough, e.g., on 32bit
>> @@ -3159,7 +3159,13 @@ static int ram_load_postcopy(QEMUFile *f)
>>                  break;
>>              }
>>  
>> -            if (!offset_in_ramblock(block, addr)) {
>> +            /*
>> +             * Relying on used_length is racy and can result in false positives.
>> +             * We might place pages beyond used_length in case RAM was shrunk
>> +             * while in postcopy, which is fine - trying to place via
>> +             * UFFDIO_COPY/UFFDIO_ZEROPAGE will never segfault.
>> +             */
>> +            if (!block->host || addr >= block->postcopy_length) {
>>                  error_report("Illegal RAM offset " RAM_ADDR_FMT, addr);
>>                  ret = -EINVAL;
>>                  break;
>> @@ -3744,6 +3750,7 @@ static void ram_mig_ram_block_resized(RAMBlockNotifier *n, void *host,
>>                               rb->idstr);
>>              }
>>          }
>> +        rb->postcopy_length = new_size;
> 
> With this change, postcopy_length will be the same as used_length?
> 
> I thought you wanted to cache that value when starting the postcopy
> phase so postcopy_length should be constant after set once.  Did I
> misunderstood?

So, my understanding on the migration target:

1. Source VM started and initialized.
- Theoretically we could have resizes here already in the future (e.g.,
  virtio-mem). Right now, not.
2. Precopy data is loaded
- RAM block sizes will be synchronized to the source.
3. Guest starts running
- Any RAM block resize will differ to the source.


For postcopy, we have to "freeze" the RAM block size after 2, but before
3. So rb->postcopy_length will always correspond to used_length on the
migration source.

Interestingly, userfaultfd handlers etc. are registered just before 3,
before any "resizes of interest" were handled.

So, postcopy_length will match used_length until we start our guest and
do resizes (shrink/grow) that are out of sync with our source.

Makes sense?

-- 
Thanks,

David / dhildenb



  reply	other threads:[~2020-02-21  8:36 UTC|newest]

Thread overview: 46+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-02-19 16:17 [PATCH v1 00/13] migrate/ram: Fix resizing RAM blocks while migrating David Hildenbrand
2020-02-19 16:17 ` [PATCH v1 01/13] util: vfio-helpers: Factor out and fix processing of existing ram blocks David Hildenbrand
2020-02-19 20:48   ` Peter Xu
2020-02-19 16:17 ` [PATCH v1 02/13] stubs/ram-block: Remove stubs that are no longer needed David Hildenbrand
2020-02-19 20:48   ` Peter Xu
2020-02-19 16:17 ` [PATCH v1 03/13] numa: Teach ram block notifiers about resizeable ram blocks David Hildenbrand
2020-02-19 16:17   ` [Xen-devel] " David Hildenbrand
2020-02-19 20:48   ` Peter Xu
2020-02-19 20:48     ` [Xen-devel] " Peter Xu
2020-02-19 16:17 ` [PATCH v1 04/13] numa: Make all callbacks of ram block notifiers optional David Hildenbrand
2020-02-19 20:49   ` Peter Xu
2020-02-19 16:17 ` [PATCH v1 05/13] migrate/ram: Handle RAM block resizes during precopy David Hildenbrand
2020-02-20 15:16   ` David Hildenbrand
2020-02-20 20:17     ` Peter Xu
2020-02-21  9:18       ` David Hildenbrand
2020-02-21 10:10         ` David Hildenbrand
2020-02-21 10:19           ` David Hildenbrand
2020-02-21 16:35             ` Peter Xu
2020-02-21 15:14   ` Dr. David Alan Gilbert
2020-02-21 15:18     ` David Hildenbrand
2020-02-21 15:40       ` Dr. David Alan Gilbert
2020-02-21 15:46         ` David Hildenbrand
2020-02-19 16:17 ` [PATCH v1 06/13] migrate/ram: Discard new RAM when growing RAM blocks and the VM is stopped David Hildenbrand
2020-02-21  9:08   ` David Hildenbrand
2020-02-21 15:51     ` Dr. David Alan Gilbert
2020-02-21 15:53       ` David Hildenbrand
2020-02-19 16:17 ` [PATCH v1 07/13] migrate/ram: Get rid of "place_source" in ram_load_postcopy() David Hildenbrand
2020-02-19 20:47   ` Peter Xu
2020-02-19 20:55     ` Peter Xu
2020-02-20 13:22       ` David Hildenbrand
2020-02-20 13:48         ` Peter Xu
2020-02-20  9:24     ` David Hildenbrand
2020-02-20 18:58       ` Dr. David Alan Gilbert
2020-02-19 16:17 ` [PATCH v1 08/13] migrate/ram: Simplify host page handling " David Hildenbrand
2020-02-19 16:17 ` [PATCH v1 09/13] migrate/ram: Consolidate variable reset after placement " David Hildenbrand
2020-02-19 16:17 ` [PATCH v1 10/13] migrate/ram: Handle RAM block resizes during postcopy David Hildenbrand
2020-02-20 20:54   ` Peter Xu
2020-02-21  8:35     ` David Hildenbrand [this message]
2020-02-21  8:41       ` David Hildenbrand
2020-02-19 16:17 ` [PATCH v1 11/13] migrate/multifd: Print used_length of memory block David Hildenbrand
2020-02-19 16:17 ` [PATCH v1 12/13] migrate/ram: Use offset_in_ramblock() in range checks David Hildenbrand
2020-02-19 16:17 ` [PATCH v1 13/13] migrate/ram: Tolerate partially changed mappings in postcopy code David Hildenbrand
2020-02-20 11:24   ` David Hildenbrand
2020-02-20 21:07     ` Peter Xu
2020-02-21  8:48       ` David Hildenbrand
2020-02-21 12:13 ` [PATCH v1 00/13] migrate/ram: Fix resizing RAM blocks while migrating David Hildenbrand

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=9a4207fe-4b35-8b6a-6b29-3ad2e704b492@redhat.com \
    --to=david@redhat.com \
    --cc=alex.bennee@linaro.org \
    --cc=dgilbert@redhat.com \
    --cc=ehabkost@redhat.com \
    --cc=imammedo@redhat.com \
    --cc=mst@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=peterx@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=quintela@redhat.com \
    --cc=richard.henderson@linaro.org \
    --cc=rth@twiddle.net \
    --cc=shannon.zhao@linaro.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.