All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v5 00/10] numa/exec/migration: Fix resizing RAM blocks while migrating
@ 2021-04-29 11:26 David Hildenbrand
  2021-04-29 11:26 ` [PATCH v5 01/10] util: vfio-helpers: Factor out and fix processing of existing ram blocks David Hildenbrand
                   ` (11 more replies)
  0 siblings, 12 replies; 16+ messages in thread
From: David Hildenbrand @ 2021-04-29 11:26 UTC (permalink / raw)
  To: qemu-devel
  Cc: Eduardo Habkost, Michael S. Tsirkin, David Hildenbrand,
	Richard Henderson, Dr. David Alan Gilbert, Peter Xu,
	Juan Quintela, Alex Williamson, Paolo Bonzini

v4 has been floating around for a while. Let's see if we can find someone
to merge this; or at least give some more feedback ... all patches have
at least one RB.


I realized that resizing RAM blocks while the guest is being migrated
(precopy: resize while still running on the source, postcopy: resize
 while already running on the target) is buggy. In case of precopy, we
can simply cancel migration. Postcopy handling is more involved. Resizing
can currently happen during a guest reboot, triggered by ACPI rebuilds.

Along with the fixes, some cleanups.

--------------------------------------------------------------------------

Example to highlight one part of the problem:

1. Start a paused VM (where a ramblock resize will trigger when booting):
  sudo build/qemu-system-x86_64 \
       --enable-kvm \
       -S \
       -machine q35,nvdimm=on \
       -smp 1 \
       -cpu host \
       -m size=20G,slots=8,maxmem=22G \
       -object memory-backend-file,id=mem0,mem-path=/tmp/nvdimm,size=256M \
       -device nvdimm,label-size=131072,memdev=mem0,id=nvdimm0,slot=1 \
       -nodefaults \
       -chardev stdio,nosignal,id=serial \
       -device isa-serial,chardev=serial \
       -chardev socket,id=monitor,path=/var/tmp/monitor,server,nowait \
       -mon chardev=monitor,mode=readline \
       -device vmgenid \
       -device intel-iommu \
       -nographic

2. Starting precopy and then starting the VM to trigger resizing during
   precopy:
  QEMU 5.2.95 monitor - type 'help' for more information
  (qemu) migrate -d "exec:gzip -c > STATEFILE.gz"
  QEMU 5.2.95 monitor - type 'help' for more information
  (qemu) cont

3a. Before this series, migration never completes:
  QEMU 5.2.95 monitor - type 'help' for more information
  (qemu) info migrate
  globals:
  store-global-state: on
  only-migratable: off
  send-configuration: on
  send-section-footer: on
  decompress-error-check: on
  clear-bitmap-shift: 18
  Migration status: active
  total time: 43826 ms
  expected downtime: 300 ms
  setup: 5 ms
  transferred ram: 65981 kbytes
  throughput: 8.27 mbps
  remaining ram: 18446744073709551612 kbytes
  total ram: 21234188 kbytes
  duplicate: 5308454 pages
  skipped: 0 pages
  normal: 93 pages
  normal bytes: 372 kbytes
  dirty sync count: 1
  page size: 4 kbytes
  multifd bytes: 0 kbytes
  pages-per-second: 0

4. With this change, migration is properly aborted:
  (qemu) info migrate
  globals:
  store-global-state: on
  only-migratable: off
  send-configuration: on
  send-section-footer: on
  decompress-error-check: on
  clear-bitmap-shift: 18
  Migration status: cancelled
  total time: 0 ms

--------------------------------------------------------------------------

Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Richard Henderson <richard.henderson@linaro.org>
Cc: Juan Quintela <quintela@redhat.com>
Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Alex Williamson <alex.williamson@redhat.com>

v4 -> v5:
- Rephrased some patch descriptions
- Dropped some patches to reduce the footprint
-- "stubs/ram-block: Remove stubs that are no longer needed"
-- "migration/ram: Tolerate partially changed mappings in postcopy code"
- Removed as already upstream now
-- "migration/ram: Consolidate variable reset after placement in
    ram_load_postcopy()"

v3 -> v4:
- Rebased and retested
- Added RBs

v2 -> v3:
- Rebased on current master
- Added RBs
- "migration/ram: Tolerate partially changed mappings in postcopy code"
-- Extended the comment for the uffdio unregister part.

v1 -> v2:
- "util: vfio-helpers: Factor out and fix processing of existing ram
   blocks"
-- Stringify error
- "migraton/ram: Handle RAM block resizes during precopy"
-- Simplified check if we're migrating on the source
- "exec: Relax range check in ram_block_discard_range()"
-- Added to make discard during resizes actually work
- "migration/ram: Discard new RAM when growing RAM blocks after
   ram_postcopy_incoming_init()"
-- Better checks if in the right postcopy mode.
-- Better patch subject/description/comments
- "migration/ram: Handle RAM block resizes during postcopy"
-- Better comments
-- Adapt to changed postcopy checks
- "migrate/ram: Get rid of "place_source" in ram_load_postcopy()"
-- Dropped, as broken
- "migration/ram: Tolerate partially changed mappings in postcopy code"
-- Better comment / description. Clarify that no implicit wakeup will
   happen
-- Warn on EINVAL (older kernels)
-- Wake up any waiter explicitly

David Hildenbrand (10):
  util: vfio-helpers: Factor out and fix processing of existing ram
    blocks
  numa: Teach ram block notifiers about resizeable ram blocks
  numa: Make all callbacks of ram block notifiers optional
  migration/ram: Handle RAM block resizes during precopy
  exec: Relax range check in ram_block_discard_range()
  migration/ram: Discard RAM when growing RAM blocks after
    ram_postcopy_incoming_init()
  migration/ram: Simplify host page handling in ram_load_postcopy()
  migration/ram: Handle RAM block resizes during postcopy
  migration/multifd: Print used_length of memory block
  migration/ram: Use offset_in_ramblock() in range checks

 hw/core/numa.c             |  41 +++++++++--
 hw/i386/xen/xen-mapcache.c |   7 +-
 include/exec/cpu-common.h  |   1 +
 include/exec/memory.h      |  10 +--
 include/exec/ramblock.h    |  10 +++
 include/exec/ramlist.h     |  13 ++--
 migration/migration.c      |   9 ++-
 migration/migration.h      |   1 +
 migration/multifd.c        |   2 +-
 migration/postcopy-ram.c   |  15 ++++-
 migration/ram.c            | 135 +++++++++++++++++++++++++++++--------
 softmmu/physmem.c          |  26 +++++--
 target/i386/hax/hax-mem.c  |   5 +-
 target/i386/sev.c          |  18 ++---
 util/vfio-helpers.c        |  41 ++++-------
 15 files changed, 241 insertions(+), 93 deletions(-)

-- 
2.30.2



^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH v5 01/10] util: vfio-helpers: Factor out and fix processing of existing ram blocks
  2021-04-29 11:26 [PATCH v5 00/10] numa/exec/migration: Fix resizing RAM blocks while migrating David Hildenbrand
@ 2021-04-29 11:26 ` David Hildenbrand
  2021-04-29 13:13   ` Philippe Mathieu-Daudé
  2021-04-29 11:27   ` David Hildenbrand
                   ` (10 subsequent siblings)
  11 siblings, 1 reply; 16+ messages in thread
From: David Hildenbrand @ 2021-04-29 11:26 UTC (permalink / raw)
  To: qemu-devel
  Cc: Eduardo Habkost, Michael S. Tsirkin, David Hildenbrand,
	Richard Henderson, Dr. David Alan Gilbert, Peter Xu,
	Juan Quintela, Alex Williamson, Paolo Bonzini

Factor it out into common code when a new notifier is registered, just
as done with the memory region notifier. This keeps logic about how to
process existing ram blocks at a central place.

Just like when adding a new ram block, we have to register the max_length.
Ram blocks are only "fake resized". All memory (max_length) is mapped.

Print the warning from inside qemu_vfio_ram_block_added().

Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 hw/core/numa.c            | 14 ++++++++++++++
 include/exec/cpu-common.h |  1 +
 softmmu/physmem.c         |  5 +++++
 util/vfio-helpers.c       | 29 ++++++++---------------------
 4 files changed, 28 insertions(+), 21 deletions(-)

diff --git a/hw/core/numa.c b/hw/core/numa.c
index 68cee65f61..7f08c27a6d 100644
--- a/hw/core/numa.c
+++ b/hw/core/numa.c
@@ -803,9 +803,23 @@ void query_numa_node_mem(NumaNodeMem node_mem[], MachineState *ms)
     }
 }
 
+static int ram_block_notify_add_single(RAMBlock *rb, void *opaque)
+{
+    const ram_addr_t max_size = qemu_ram_get_max_length(rb);
+    void *host = qemu_ram_get_host_addr(rb);
+    RAMBlockNotifier *notifier = opaque;
+
+    if (host) {
+        notifier->ram_block_added(notifier, host, max_size);
+    }
+    return 0;
+}
+
 void ram_block_notifier_add(RAMBlockNotifier *n)
 {
     QLIST_INSERT_HEAD(&ram_list.ramblock_notifiers, n, next);
+    /* Notify about all existing ram blocks. */
+    qemu_ram_foreach_block(ram_block_notify_add_single, n);
 }
 
 void ram_block_notifier_remove(RAMBlockNotifier *n)
diff --git a/include/exec/cpu-common.h b/include/exec/cpu-common.h
index 5a0a2d93e0..ccabed4003 100644
--- a/include/exec/cpu-common.h
+++ b/include/exec/cpu-common.h
@@ -57,6 +57,7 @@ const char *qemu_ram_get_idstr(RAMBlock *rb);
 void *qemu_ram_get_host_addr(RAMBlock *rb);
 ram_addr_t qemu_ram_get_offset(RAMBlock *rb);
 ram_addr_t qemu_ram_get_used_length(RAMBlock *rb);
+ram_addr_t qemu_ram_get_max_length(RAMBlock *rb);
 bool qemu_ram_is_shared(RAMBlock *rb);
 bool qemu_ram_is_uf_zeroable(RAMBlock *rb);
 void qemu_ram_set_uf_zeroable(RAMBlock *rb);
diff --git a/softmmu/physmem.c b/softmmu/physmem.c
index 85034d9c11..bd2c0dc4ec 100644
--- a/softmmu/physmem.c
+++ b/softmmu/physmem.c
@@ -1697,6 +1697,11 @@ ram_addr_t qemu_ram_get_used_length(RAMBlock *rb)
     return rb->used_length;
 }
 
+ram_addr_t qemu_ram_get_max_length(RAMBlock *rb)
+{
+    return rb->max_length;
+}
+
 bool qemu_ram_is_shared(RAMBlock *rb)
 {
     return rb->flags & RAM_SHARED;
diff --git a/util/vfio-helpers.c b/util/vfio-helpers.c
index 97dfa3fd57..92b9565797 100644
--- a/util/vfio-helpers.c
+++ b/util/vfio-helpers.c
@@ -463,8 +463,14 @@ static void qemu_vfio_ram_block_added(RAMBlockNotifier *n,
                                       void *host, size_t size)
 {
     QEMUVFIOState *s = container_of(n, QEMUVFIOState, ram_notifier);
+    int ret;
+
     trace_qemu_vfio_ram_block_added(s, host, size);
-    qemu_vfio_dma_map(s, host, size, false, NULL);
+    ret = qemu_vfio_dma_map(s, host, size, false, NULL);
+    if (ret) {
+        error_report("qemu_vfio_dma_map(%p, %zu) failed: %s", host, size,
+                     strerror(-ret));
+    }
 }
 
 static void qemu_vfio_ram_block_removed(RAMBlockNotifier *n,
@@ -477,33 +483,14 @@ static void qemu_vfio_ram_block_removed(RAMBlockNotifier *n,
     }
 }
 
-static int qemu_vfio_init_ramblock(RAMBlock *rb, void *opaque)
-{
-    void *host_addr = qemu_ram_get_host_addr(rb);
-    ram_addr_t length = qemu_ram_get_used_length(rb);
-    int ret;
-    QEMUVFIOState *s = opaque;
-
-    if (!host_addr) {
-        return 0;
-    }
-    ret = qemu_vfio_dma_map(s, host_addr, length, false, NULL);
-    if (ret) {
-        fprintf(stderr, "qemu_vfio_init_ramblock: failed %p %" PRId64 "\n",
-                host_addr, (uint64_t)length);
-    }
-    return 0;
-}
-
 static void qemu_vfio_open_common(QEMUVFIOState *s)
 {
     qemu_mutex_init(&s->lock);
     s->ram_notifier.ram_block_added = qemu_vfio_ram_block_added;
     s->ram_notifier.ram_block_removed = qemu_vfio_ram_block_removed;
-    ram_block_notifier_add(&s->ram_notifier);
     s->low_water_mark = QEMU_VFIO_IOVA_MIN;
     s->high_water_mark = QEMU_VFIO_IOVA_MAX;
-    qemu_ram_foreach_block(qemu_vfio_init_ramblock, s);
+    ram_block_notifier_add(&s->ram_notifier);
 }
 
 /**
-- 
2.30.2



^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v5 02/10] numa: Teach ram block notifiers about resizeable ram blocks
  2021-04-29 11:26 [PATCH v5 00/10] numa/exec/migration: Fix resizing RAM blocks while migrating David Hildenbrand
@ 2021-04-29 11:27   ` David Hildenbrand
  2021-04-29 11:27   ` David Hildenbrand
                     ` (10 subsequent siblings)
  11 siblings, 0 replies; 16+ messages in thread
From: David Hildenbrand @ 2021-04-29 11:27 UTC (permalink / raw)
  To: qemu-devel
  Cc: Stefano Stabellini, Eduardo Habkost, Michael S. Tsirkin,
	David Hildenbrand, Richard Henderson, Dr. David Alan Gilbert,
	Peter Xu, Paul Durrant, Juan Quintela, Alex Williamson,
	haxm-team, Wenchao Wang, xen-devel, Anthony Perard,
	Paolo Bonzini, Colin Xu

Ram block notifiers are currently not aware of resizes. To properly
handle resizes during migration, we want to teach ram block notifiers about
resizeable ram.

Introduce the basic infrastructure but keep using max_size in the
existing notifiers. Supply the max_size when adding and removing ram
blocks. Also, notify on resizes.

Acked-by: Paul Durrant <paul@xen.org>
Reviewed-by: Peter Xu <peterx@redhat.com>
Cc: xen-devel@lists.xenproject.org
Cc: haxm-team@intel.com
Cc: Paul Durrant <paul@xen.org>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Anthony Perard <anthony.perard@citrix.com>
Cc: Wenchao Wang <wenchao.wang@intel.com>
Cc: Colin Xu <colin.xu@intel.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 hw/core/numa.c             | 22 +++++++++++++++++-----
 hw/i386/xen/xen-mapcache.c |  7 ++++---
 include/exec/ramlist.h     | 13 +++++++++----
 softmmu/physmem.c          | 12 ++++++++++--
 target/i386/hax/hax-mem.c  |  5 +++--
 target/i386/sev.c          | 18 ++++++++++--------
 util/vfio-helpers.c        | 16 ++++++++--------
 7 files changed, 61 insertions(+), 32 deletions(-)

diff --git a/hw/core/numa.c b/hw/core/numa.c
index 7f08c27a6d..921bf86ab4 100644
--- a/hw/core/numa.c
+++ b/hw/core/numa.c
@@ -806,11 +806,12 @@ void query_numa_node_mem(NumaNodeMem node_mem[], MachineState *ms)
 static int ram_block_notify_add_single(RAMBlock *rb, void *opaque)
 {
     const ram_addr_t max_size = qemu_ram_get_max_length(rb);
+    const ram_addr_t size = qemu_ram_get_used_length(rb);
     void *host = qemu_ram_get_host_addr(rb);
     RAMBlockNotifier *notifier = opaque;
 
     if (host) {
-        notifier->ram_block_added(notifier, host, max_size);
+        notifier->ram_block_added(notifier, host, size, max_size);
     }
     return 0;
 }
@@ -827,20 +828,31 @@ void ram_block_notifier_remove(RAMBlockNotifier *n)
     QLIST_REMOVE(n, next);
 }
 
-void ram_block_notify_add(void *host, size_t size)
+void ram_block_notify_add(void *host, size_t size, size_t max_size)
 {
     RAMBlockNotifier *notifier;
 
     QLIST_FOREACH(notifier, &ram_list.ramblock_notifiers, next) {
-        notifier->ram_block_added(notifier, host, size);
+        notifier->ram_block_added(notifier, host, size, max_size);
     }
 }
 
-void ram_block_notify_remove(void *host, size_t size)
+void ram_block_notify_remove(void *host, size_t size, size_t max_size)
 {
     RAMBlockNotifier *notifier;
 
     QLIST_FOREACH(notifier, &ram_list.ramblock_notifiers, next) {
-        notifier->ram_block_removed(notifier, host, size);
+        notifier->ram_block_removed(notifier, host, size, max_size);
+    }
+}
+
+void ram_block_notify_resize(void *host, size_t old_size, size_t new_size)
+{
+    RAMBlockNotifier *notifier;
+
+    QLIST_FOREACH(notifier, &ram_list.ramblock_notifiers, next) {
+        if (notifier->ram_block_resized) {
+            notifier->ram_block_resized(notifier, host, old_size, new_size);
+        }
     }
 }
diff --git a/hw/i386/xen/xen-mapcache.c b/hw/i386/xen/xen-mapcache.c
index 5b120ed44b..d6dcea65d1 100644
--- a/hw/i386/xen/xen-mapcache.c
+++ b/hw/i386/xen/xen-mapcache.c
@@ -169,7 +169,8 @@ static void xen_remap_bucket(MapCacheEntry *entry,
 
     if (entry->vaddr_base != NULL) {
         if (!(entry->flags & XEN_MAPCACHE_ENTRY_DUMMY)) {
-            ram_block_notify_remove(entry->vaddr_base, entry->size);
+            ram_block_notify_remove(entry->vaddr_base, entry->size,
+                                    entry->size);
         }
         if (munmap(entry->vaddr_base, entry->size) != 0) {
             perror("unmap fails");
@@ -211,7 +212,7 @@ static void xen_remap_bucket(MapCacheEntry *entry,
     }
 
     if (!(entry->flags & XEN_MAPCACHE_ENTRY_DUMMY)) {
-        ram_block_notify_add(vaddr_base, size);
+        ram_block_notify_add(vaddr_base, size, size);
     }
 
     entry->vaddr_base = vaddr_base;
@@ -452,7 +453,7 @@ static void xen_invalidate_map_cache_entry_unlocked(uint8_t *buffer)
     }
 
     pentry->next = entry->next;
-    ram_block_notify_remove(entry->vaddr_base, entry->size);
+    ram_block_notify_remove(entry->vaddr_base, entry->size, entry->size);
     if (munmap(entry->vaddr_base, entry->size) != 0) {
         perror("unmap fails");
         exit(-1);
diff --git a/include/exec/ramlist.h b/include/exec/ramlist.h
index 26704aa3b0..ece6497ee2 100644
--- a/include/exec/ramlist.h
+++ b/include/exec/ramlist.h
@@ -65,15 +65,20 @@ void qemu_mutex_lock_ramlist(void);
 void qemu_mutex_unlock_ramlist(void);
 
 struct RAMBlockNotifier {
-    void (*ram_block_added)(RAMBlockNotifier *n, void *host, size_t size);
-    void (*ram_block_removed)(RAMBlockNotifier *n, void *host, size_t size);
+    void (*ram_block_added)(RAMBlockNotifier *n, void *host, size_t size,
+                            size_t max_size);
+    void (*ram_block_removed)(RAMBlockNotifier *n, void *host, size_t size,
+                              size_t max_size);
+    void (*ram_block_resized)(RAMBlockNotifier *n, void *host, size_t old_size,
+                              size_t new_size);
     QLIST_ENTRY(RAMBlockNotifier) next;
 };
 
 void ram_block_notifier_add(RAMBlockNotifier *n);
 void ram_block_notifier_remove(RAMBlockNotifier *n);
-void ram_block_notify_add(void *host, size_t size);
-void ram_block_notify_remove(void *host, size_t size);
+void ram_block_notify_add(void *host, size_t size, size_t max_size);
+void ram_block_notify_remove(void *host, size_t size, size_t max_size);
+void ram_block_notify_resize(void *host, size_t old_size, size_t new_size);
 
 void ram_block_dump(Monitor *mon);
 
diff --git a/softmmu/physmem.c b/softmmu/physmem.c
index bd2c0dc4ec..d6463f203d 100644
--- a/softmmu/physmem.c
+++ b/softmmu/physmem.c
@@ -1810,6 +1810,7 @@ static int memory_try_enable_merging(void *addr, size_t len)
  */
 int qemu_ram_resize(RAMBlock *block, ram_addr_t newsize, Error **errp)
 {
+    const ram_addr_t oldsize = block->used_length;
     const ram_addr_t unaligned_size = newsize;
 
     assert(block);
@@ -1846,6 +1847,11 @@ int qemu_ram_resize(RAMBlock *block, ram_addr_t newsize, Error **errp)
         return -EINVAL;
     }
 
+    /* Notify before modifying the ram block and touching the bitmaps. */
+    if (block->host) {
+        ram_block_notify_resize(block->host, oldsize, newsize);
+    }
+
     cpu_physical_memory_clear_dirty_range(block->offset, block->used_length);
     block->used_length = newsize;
     cpu_physical_memory_set_dirty_range(block->offset, block->used_length,
@@ -2013,7 +2019,8 @@ static void ram_block_add(RAMBlock *new_block, Error **errp, bool shared)
             qemu_madvise(new_block->host, new_block->max_length,
                          QEMU_MADV_DONTFORK);
         }
-        ram_block_notify_add(new_block->host, new_block->max_length);
+        ram_block_notify_add(new_block->host, new_block->used_length,
+                             new_block->max_length);
     }
 }
 
@@ -2192,7 +2199,8 @@ void qemu_ram_free(RAMBlock *block)
     }
 
     if (block->host) {
-        ram_block_notify_remove(block->host, block->max_length);
+        ram_block_notify_remove(block->host, block->used_length,
+                                block->max_length);
     }
 
     qemu_mutex_lock_ramlist();
diff --git a/target/i386/hax/hax-mem.c b/target/i386/hax/hax-mem.c
index 35495f5e82..8d44edbffd 100644
--- a/target/i386/hax/hax-mem.c
+++ b/target/i386/hax/hax-mem.c
@@ -293,7 +293,8 @@ static MemoryListener hax_memory_listener = {
     .priority = 10,
 };
 
-static void hax_ram_block_added(RAMBlockNotifier *n, void *host, size_t size)
+static void hax_ram_block_added(RAMBlockNotifier *n, void *host, size_t size,
+                                size_t max_size)
 {
     /*
      * We must register each RAM block with the HAXM kernel module, or
@@ -304,7 +305,7 @@ static void hax_ram_block_added(RAMBlockNotifier *n, void *host, size_t size)
      * host physical pages for the RAM block as part of this registration
      * process, hence the name hax_populate_ram().
      */
-    if (hax_populate_ram((uint64_t)(uintptr_t)host, size) < 0) {
+    if (hax_populate_ram((uint64_t)(uintptr_t)host, max_size) < 0) {
         fprintf(stderr, "HAX failed to populate RAM\n");
         abort();
     }
diff --git a/target/i386/sev.c b/target/i386/sev.c
index 72b9e2ab40..9cfe5afe31 100644
--- a/target/i386/sev.c
+++ b/target/i386/sev.c
@@ -181,7 +181,8 @@ sev_set_guest_state(SevGuestState *sev, SevState new_state)
 }
 
 static void
-sev_ram_block_added(RAMBlockNotifier *n, void *host, size_t size)
+sev_ram_block_added(RAMBlockNotifier *n, void *host, size_t size,
+                    size_t max_size)
 {
     int r;
     struct kvm_enc_region range;
@@ -198,19 +199,20 @@ sev_ram_block_added(RAMBlockNotifier *n, void *host, size_t size)
     }
 
     range.addr = (__u64)(unsigned long)host;
-    range.size = size;
+    range.size = max_size;
 
-    trace_kvm_memcrypt_register_region(host, size);
+    trace_kvm_memcrypt_register_region(host, max_size);
     r = kvm_vm_ioctl(kvm_state, KVM_MEMORY_ENCRYPT_REG_REGION, &range);
     if (r) {
         error_report("%s: failed to register region (%p+%#zx) error '%s'",
-                     __func__, host, size, strerror(errno));
+                     __func__, host, max_size, strerror(errno));
         exit(1);
     }
 }
 
 static void
-sev_ram_block_removed(RAMBlockNotifier *n, void *host, size_t size)
+sev_ram_block_removed(RAMBlockNotifier *n, void *host, size_t size,
+                      size_t max_size)
 {
     int r;
     struct kvm_enc_region range;
@@ -227,13 +229,13 @@ sev_ram_block_removed(RAMBlockNotifier *n, void *host, size_t size)
     }
 
     range.addr = (__u64)(unsigned long)host;
-    range.size = size;
+    range.size = max_size;
 
-    trace_kvm_memcrypt_unregister_region(host, size);
+    trace_kvm_memcrypt_unregister_region(host, max_size);
     r = kvm_vm_ioctl(kvm_state, KVM_MEMORY_ENCRYPT_UNREG_REGION, &range);
     if (r) {
         error_report("%s: failed to unregister region (%p+%#zx)",
-                     __func__, host, size);
+                     __func__, host, max_size);
     }
 }
 
diff --git a/util/vfio-helpers.c b/util/vfio-helpers.c
index 92b9565797..911115b86e 100644
--- a/util/vfio-helpers.c
+++ b/util/vfio-helpers.c
@@ -459,26 +459,26 @@ fail_container:
     return ret;
 }
 
-static void qemu_vfio_ram_block_added(RAMBlockNotifier *n,
-                                      void *host, size_t size)
+static void qemu_vfio_ram_block_added(RAMBlockNotifier *n, void *host,
+                                      size_t size, size_t max_size)
 {
     QEMUVFIOState *s = container_of(n, QEMUVFIOState, ram_notifier);
     int ret;
 
-    trace_qemu_vfio_ram_block_added(s, host, size);
-    ret = qemu_vfio_dma_map(s, host, size, false, NULL);
+    trace_qemu_vfio_ram_block_added(s, host, max_size);
+    ret = qemu_vfio_dma_map(s, host, max_size, false, NULL);
     if (ret) {
-        error_report("qemu_vfio_dma_map(%p, %zu) failed: %s", host, size,
+        error_report("qemu_vfio_dma_map(%p, %zu) failed: %s", host, max_size,
                      strerror(-ret));
     }
 }
 
-static void qemu_vfio_ram_block_removed(RAMBlockNotifier *n,
-                                        void *host, size_t size)
+static void qemu_vfio_ram_block_removed(RAMBlockNotifier *n, void *host,
+                                        size_t size, size_t max_size)
 {
     QEMUVFIOState *s = container_of(n, QEMUVFIOState, ram_notifier);
     if (host) {
-        trace_qemu_vfio_ram_block_removed(s, host, size);
+        trace_qemu_vfio_ram_block_removed(s, host, max_size);
         qemu_vfio_dma_unmap(s, host);
     }
 }
-- 
2.30.2



^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v5 02/10] numa: Teach ram block notifiers about resizeable ram blocks
@ 2021-04-29 11:27   ` David Hildenbrand
  0 siblings, 0 replies; 16+ messages in thread
From: David Hildenbrand @ 2021-04-29 11:27 UTC (permalink / raw)
  To: qemu-devel
  Cc: David Hildenbrand, Eduardo Habkost, Marcel Apfelbaum,
	Michael S. Tsirkin, Paolo Bonzini, Richard Henderson,
	Juan Quintela, Dr. David Alan Gilbert, Peter Xu, Alex Williamson,
	Paul Durrant, xen-devel, haxm-team, Stefano Stabellini,
	Anthony Perard, Wenchao Wang, Colin Xu

Ram block notifiers are currently not aware of resizes. To properly
handle resizes during migration, we want to teach ram block notifiers about
resizeable ram.

Introduce the basic infrastructure but keep using max_size in the
existing notifiers. Supply the max_size when adding and removing ram
blocks. Also, notify on resizes.

Acked-by: Paul Durrant <paul@xen.org>
Reviewed-by: Peter Xu <peterx@redhat.com>
Cc: xen-devel@lists.xenproject.org
Cc: haxm-team@intel.com
Cc: Paul Durrant <paul@xen.org>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Anthony Perard <anthony.perard@citrix.com>
Cc: Wenchao Wang <wenchao.wang@intel.com>
Cc: Colin Xu <colin.xu@intel.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 hw/core/numa.c             | 22 +++++++++++++++++-----
 hw/i386/xen/xen-mapcache.c |  7 ++++---
 include/exec/ramlist.h     | 13 +++++++++----
 softmmu/physmem.c          | 12 ++++++++++--
 target/i386/hax/hax-mem.c  |  5 +++--
 target/i386/sev.c          | 18 ++++++++++--------
 util/vfio-helpers.c        | 16 ++++++++--------
 7 files changed, 61 insertions(+), 32 deletions(-)

diff --git a/hw/core/numa.c b/hw/core/numa.c
index 7f08c27a6d..921bf86ab4 100644
--- a/hw/core/numa.c
+++ b/hw/core/numa.c
@@ -806,11 +806,12 @@ void query_numa_node_mem(NumaNodeMem node_mem[], MachineState *ms)
 static int ram_block_notify_add_single(RAMBlock *rb, void *opaque)
 {
     const ram_addr_t max_size = qemu_ram_get_max_length(rb);
+    const ram_addr_t size = qemu_ram_get_used_length(rb);
     void *host = qemu_ram_get_host_addr(rb);
     RAMBlockNotifier *notifier = opaque;
 
     if (host) {
-        notifier->ram_block_added(notifier, host, max_size);
+        notifier->ram_block_added(notifier, host, size, max_size);
     }
     return 0;
 }
@@ -827,20 +828,31 @@ void ram_block_notifier_remove(RAMBlockNotifier *n)
     QLIST_REMOVE(n, next);
 }
 
-void ram_block_notify_add(void *host, size_t size)
+void ram_block_notify_add(void *host, size_t size, size_t max_size)
 {
     RAMBlockNotifier *notifier;
 
     QLIST_FOREACH(notifier, &ram_list.ramblock_notifiers, next) {
-        notifier->ram_block_added(notifier, host, size);
+        notifier->ram_block_added(notifier, host, size, max_size);
     }
 }
 
-void ram_block_notify_remove(void *host, size_t size)
+void ram_block_notify_remove(void *host, size_t size, size_t max_size)
 {
     RAMBlockNotifier *notifier;
 
     QLIST_FOREACH(notifier, &ram_list.ramblock_notifiers, next) {
-        notifier->ram_block_removed(notifier, host, size);
+        notifier->ram_block_removed(notifier, host, size, max_size);
+    }
+}
+
+void ram_block_notify_resize(void *host, size_t old_size, size_t new_size)
+{
+    RAMBlockNotifier *notifier;
+
+    QLIST_FOREACH(notifier, &ram_list.ramblock_notifiers, next) {
+        if (notifier->ram_block_resized) {
+            notifier->ram_block_resized(notifier, host, old_size, new_size);
+        }
     }
 }
diff --git a/hw/i386/xen/xen-mapcache.c b/hw/i386/xen/xen-mapcache.c
index 5b120ed44b..d6dcea65d1 100644
--- a/hw/i386/xen/xen-mapcache.c
+++ b/hw/i386/xen/xen-mapcache.c
@@ -169,7 +169,8 @@ static void xen_remap_bucket(MapCacheEntry *entry,
 
     if (entry->vaddr_base != NULL) {
         if (!(entry->flags & XEN_MAPCACHE_ENTRY_DUMMY)) {
-            ram_block_notify_remove(entry->vaddr_base, entry->size);
+            ram_block_notify_remove(entry->vaddr_base, entry->size,
+                                    entry->size);
         }
         if (munmap(entry->vaddr_base, entry->size) != 0) {
             perror("unmap fails");
@@ -211,7 +212,7 @@ static void xen_remap_bucket(MapCacheEntry *entry,
     }
 
     if (!(entry->flags & XEN_MAPCACHE_ENTRY_DUMMY)) {
-        ram_block_notify_add(vaddr_base, size);
+        ram_block_notify_add(vaddr_base, size, size);
     }
 
     entry->vaddr_base = vaddr_base;
@@ -452,7 +453,7 @@ static void xen_invalidate_map_cache_entry_unlocked(uint8_t *buffer)
     }
 
     pentry->next = entry->next;
-    ram_block_notify_remove(entry->vaddr_base, entry->size);
+    ram_block_notify_remove(entry->vaddr_base, entry->size, entry->size);
     if (munmap(entry->vaddr_base, entry->size) != 0) {
         perror("unmap fails");
         exit(-1);
diff --git a/include/exec/ramlist.h b/include/exec/ramlist.h
index 26704aa3b0..ece6497ee2 100644
--- a/include/exec/ramlist.h
+++ b/include/exec/ramlist.h
@@ -65,15 +65,20 @@ void qemu_mutex_lock_ramlist(void);
 void qemu_mutex_unlock_ramlist(void);
 
 struct RAMBlockNotifier {
-    void (*ram_block_added)(RAMBlockNotifier *n, void *host, size_t size);
-    void (*ram_block_removed)(RAMBlockNotifier *n, void *host, size_t size);
+    void (*ram_block_added)(RAMBlockNotifier *n, void *host, size_t size,
+                            size_t max_size);
+    void (*ram_block_removed)(RAMBlockNotifier *n, void *host, size_t size,
+                              size_t max_size);
+    void (*ram_block_resized)(RAMBlockNotifier *n, void *host, size_t old_size,
+                              size_t new_size);
     QLIST_ENTRY(RAMBlockNotifier) next;
 };
 
 void ram_block_notifier_add(RAMBlockNotifier *n);
 void ram_block_notifier_remove(RAMBlockNotifier *n);
-void ram_block_notify_add(void *host, size_t size);
-void ram_block_notify_remove(void *host, size_t size);
+void ram_block_notify_add(void *host, size_t size, size_t max_size);
+void ram_block_notify_remove(void *host, size_t size, size_t max_size);
+void ram_block_notify_resize(void *host, size_t old_size, size_t new_size);
 
 void ram_block_dump(Monitor *mon);
 
diff --git a/softmmu/physmem.c b/softmmu/physmem.c
index bd2c0dc4ec..d6463f203d 100644
--- a/softmmu/physmem.c
+++ b/softmmu/physmem.c
@@ -1810,6 +1810,7 @@ static int memory_try_enable_merging(void *addr, size_t len)
  */
 int qemu_ram_resize(RAMBlock *block, ram_addr_t newsize, Error **errp)
 {
+    const ram_addr_t oldsize = block->used_length;
     const ram_addr_t unaligned_size = newsize;
 
     assert(block);
@@ -1846,6 +1847,11 @@ int qemu_ram_resize(RAMBlock *block, ram_addr_t newsize, Error **errp)
         return -EINVAL;
     }
 
+    /* Notify before modifying the ram block and touching the bitmaps. */
+    if (block->host) {
+        ram_block_notify_resize(block->host, oldsize, newsize);
+    }
+
     cpu_physical_memory_clear_dirty_range(block->offset, block->used_length);
     block->used_length = newsize;
     cpu_physical_memory_set_dirty_range(block->offset, block->used_length,
@@ -2013,7 +2019,8 @@ static void ram_block_add(RAMBlock *new_block, Error **errp, bool shared)
             qemu_madvise(new_block->host, new_block->max_length,
                          QEMU_MADV_DONTFORK);
         }
-        ram_block_notify_add(new_block->host, new_block->max_length);
+        ram_block_notify_add(new_block->host, new_block->used_length,
+                             new_block->max_length);
     }
 }
 
@@ -2192,7 +2199,8 @@ void qemu_ram_free(RAMBlock *block)
     }
 
     if (block->host) {
-        ram_block_notify_remove(block->host, block->max_length);
+        ram_block_notify_remove(block->host, block->used_length,
+                                block->max_length);
     }
 
     qemu_mutex_lock_ramlist();
diff --git a/target/i386/hax/hax-mem.c b/target/i386/hax/hax-mem.c
index 35495f5e82..8d44edbffd 100644
--- a/target/i386/hax/hax-mem.c
+++ b/target/i386/hax/hax-mem.c
@@ -293,7 +293,8 @@ static MemoryListener hax_memory_listener = {
     .priority = 10,
 };
 
-static void hax_ram_block_added(RAMBlockNotifier *n, void *host, size_t size)
+static void hax_ram_block_added(RAMBlockNotifier *n, void *host, size_t size,
+                                size_t max_size)
 {
     /*
      * We must register each RAM block with the HAXM kernel module, or
@@ -304,7 +305,7 @@ static void hax_ram_block_added(RAMBlockNotifier *n, void *host, size_t size)
      * host physical pages for the RAM block as part of this registration
      * process, hence the name hax_populate_ram().
      */
-    if (hax_populate_ram((uint64_t)(uintptr_t)host, size) < 0) {
+    if (hax_populate_ram((uint64_t)(uintptr_t)host, max_size) < 0) {
         fprintf(stderr, "HAX failed to populate RAM\n");
         abort();
     }
diff --git a/target/i386/sev.c b/target/i386/sev.c
index 72b9e2ab40..9cfe5afe31 100644
--- a/target/i386/sev.c
+++ b/target/i386/sev.c
@@ -181,7 +181,8 @@ sev_set_guest_state(SevGuestState *sev, SevState new_state)
 }
 
 static void
-sev_ram_block_added(RAMBlockNotifier *n, void *host, size_t size)
+sev_ram_block_added(RAMBlockNotifier *n, void *host, size_t size,
+                    size_t max_size)
 {
     int r;
     struct kvm_enc_region range;
@@ -198,19 +199,20 @@ sev_ram_block_added(RAMBlockNotifier *n, void *host, size_t size)
     }
 
     range.addr = (__u64)(unsigned long)host;
-    range.size = size;
+    range.size = max_size;
 
-    trace_kvm_memcrypt_register_region(host, size);
+    trace_kvm_memcrypt_register_region(host, max_size);
     r = kvm_vm_ioctl(kvm_state, KVM_MEMORY_ENCRYPT_REG_REGION, &range);
     if (r) {
         error_report("%s: failed to register region (%p+%#zx) error '%s'",
-                     __func__, host, size, strerror(errno));
+                     __func__, host, max_size, strerror(errno));
         exit(1);
     }
 }
 
 static void
-sev_ram_block_removed(RAMBlockNotifier *n, void *host, size_t size)
+sev_ram_block_removed(RAMBlockNotifier *n, void *host, size_t size,
+                      size_t max_size)
 {
     int r;
     struct kvm_enc_region range;
@@ -227,13 +229,13 @@ sev_ram_block_removed(RAMBlockNotifier *n, void *host, size_t size)
     }
 
     range.addr = (__u64)(unsigned long)host;
-    range.size = size;
+    range.size = max_size;
 
-    trace_kvm_memcrypt_unregister_region(host, size);
+    trace_kvm_memcrypt_unregister_region(host, max_size);
     r = kvm_vm_ioctl(kvm_state, KVM_MEMORY_ENCRYPT_UNREG_REGION, &range);
     if (r) {
         error_report("%s: failed to unregister region (%p+%#zx)",
-                     __func__, host, size);
+                     __func__, host, max_size);
     }
 }
 
diff --git a/util/vfio-helpers.c b/util/vfio-helpers.c
index 92b9565797..911115b86e 100644
--- a/util/vfio-helpers.c
+++ b/util/vfio-helpers.c
@@ -459,26 +459,26 @@ fail_container:
     return ret;
 }
 
-static void qemu_vfio_ram_block_added(RAMBlockNotifier *n,
-                                      void *host, size_t size)
+static void qemu_vfio_ram_block_added(RAMBlockNotifier *n, void *host,
+                                      size_t size, size_t max_size)
 {
     QEMUVFIOState *s = container_of(n, QEMUVFIOState, ram_notifier);
     int ret;
 
-    trace_qemu_vfio_ram_block_added(s, host, size);
-    ret = qemu_vfio_dma_map(s, host, size, false, NULL);
+    trace_qemu_vfio_ram_block_added(s, host, max_size);
+    ret = qemu_vfio_dma_map(s, host, max_size, false, NULL);
     if (ret) {
-        error_report("qemu_vfio_dma_map(%p, %zu) failed: %s", host, size,
+        error_report("qemu_vfio_dma_map(%p, %zu) failed: %s", host, max_size,
                      strerror(-ret));
     }
 }
 
-static void qemu_vfio_ram_block_removed(RAMBlockNotifier *n,
-                                        void *host, size_t size)
+static void qemu_vfio_ram_block_removed(RAMBlockNotifier *n, void *host,
+                                        size_t size, size_t max_size)
 {
     QEMUVFIOState *s = container_of(n, QEMUVFIOState, ram_notifier);
     if (host) {
-        trace_qemu_vfio_ram_block_removed(s, host, size);
+        trace_qemu_vfio_ram_block_removed(s, host, max_size);
         qemu_vfio_dma_unmap(s, host);
     }
 }
-- 
2.30.2



^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v5 03/10] numa: Make all callbacks of ram block notifiers optional
  2021-04-29 11:26 [PATCH v5 00/10] numa/exec/migration: Fix resizing RAM blocks while migrating David Hildenbrand
  2021-04-29 11:26 ` [PATCH v5 01/10] util: vfio-helpers: Factor out and fix processing of existing ram blocks David Hildenbrand
  2021-04-29 11:27   ` David Hildenbrand
@ 2021-04-29 11:27 ` David Hildenbrand
  2021-04-29 11:27 ` [PATCH v5 04/10] migration/ram: Handle RAM block resizes during precopy David Hildenbrand
                   ` (8 subsequent siblings)
  11 siblings, 0 replies; 16+ messages in thread
From: David Hildenbrand @ 2021-04-29 11:27 UTC (permalink / raw)
  To: qemu-devel
  Cc: Eduardo Habkost, Michael S. Tsirkin, David Hildenbrand,
	Richard Henderson, Dr. David Alan Gilbert, Peter Xu,
	Juan Quintela, Alex Williamson, Paolo Bonzini

Let's make add/remove optional. We want to introduce a RAM block
notifier for RAM migration that is only interested in resize events.

Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 hw/core/numa.c | 13 ++++++++++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/hw/core/numa.c b/hw/core/numa.c
index 921bf86ab4..64b9334050 100644
--- a/hw/core/numa.c
+++ b/hw/core/numa.c
@@ -819,8 +819,11 @@ static int ram_block_notify_add_single(RAMBlock *rb, void *opaque)
 void ram_block_notifier_add(RAMBlockNotifier *n)
 {
     QLIST_INSERT_HEAD(&ram_list.ramblock_notifiers, n, next);
+
     /* Notify about all existing ram blocks. */
-    qemu_ram_foreach_block(ram_block_notify_add_single, n);
+    if (n->ram_block_added) {
+        qemu_ram_foreach_block(ram_block_notify_add_single, n);
+    }
 }
 
 void ram_block_notifier_remove(RAMBlockNotifier *n)
@@ -833,7 +836,9 @@ void ram_block_notify_add(void *host, size_t size, size_t max_size)
     RAMBlockNotifier *notifier;
 
     QLIST_FOREACH(notifier, &ram_list.ramblock_notifiers, next) {
-        notifier->ram_block_added(notifier, host, size, max_size);
+        if (notifier->ram_block_added) {
+            notifier->ram_block_added(notifier, host, size, max_size);
+        }
     }
 }
 
@@ -842,7 +847,9 @@ void ram_block_notify_remove(void *host, size_t size, size_t max_size)
     RAMBlockNotifier *notifier;
 
     QLIST_FOREACH(notifier, &ram_list.ramblock_notifiers, next) {
-        notifier->ram_block_removed(notifier, host, size, max_size);
+        if (notifier->ram_block_removed) {
+            notifier->ram_block_removed(notifier, host, size, max_size);
+        }
     }
 }
 
-- 
2.30.2



^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v5 04/10] migration/ram: Handle RAM block resizes during precopy
  2021-04-29 11:26 [PATCH v5 00/10] numa/exec/migration: Fix resizing RAM blocks while migrating David Hildenbrand
                   ` (2 preceding siblings ...)
  2021-04-29 11:27 ` [PATCH v5 03/10] numa: Make all callbacks of ram block notifiers optional David Hildenbrand
@ 2021-04-29 11:27 ` David Hildenbrand
  2021-04-29 11:27 ` [PATCH v5 05/10] exec: Relax range check in ram_block_discard_range() David Hildenbrand
                   ` (7 subsequent siblings)
  11 siblings, 0 replies; 16+ messages in thread
From: David Hildenbrand @ 2021-04-29 11:27 UTC (permalink / raw)
  To: qemu-devel
  Cc: Eduardo Habkost, Michael S. Tsirkin, David Hildenbrand,
	Richard Henderson, Dr. David Alan Gilbert, Peter Xu,
	Juan Quintela, Alex Williamson, Paolo Bonzini

Resizing while migrating is dangerous and does not work as expected.
The whole migration code works on the usable_length of ram blocks and does
not expect this to change at random points in time.

In the case of precopy, the ram block size must not change on the source,
after syncing the RAM block list in ram_save_setup(), so as long as the
guest is still running on the source.

Resizing can be trigger *after* (but not during) a reset in
ACPI code by the guest
- hw/arm/virt-acpi-build.c:acpi_ram_update()
- hw/i386/acpi-build.c:acpi_ram_update()

Use the ram block notifier to get notified about resizes. Let's simply
cancel migration and indicate the reason. We'll continue running on the
source. No harm done.

Update the documentation. Postcopy will be handled separately.

Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 include/exec/memory.h | 10 ++++++----
 migration/migration.c |  9 +++++++--
 migration/migration.h |  1 +
 migration/ram.c       | 31 +++++++++++++++++++++++++++++++
 softmmu/physmem.c     |  5 +++--
 5 files changed, 48 insertions(+), 8 deletions(-)

diff --git a/include/exec/memory.h b/include/exec/memory.h
index 5728a681b2..c8b9088924 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -131,7 +131,7 @@ typedef struct IOMMUTLBEvent {
 #define RAM_SHARED     (1 << 1)
 
 /* Only a portion of RAM (used_length) is actually used, and migrated.
- * This used_length size can change across reboots.
+ * Resizing RAM while migrating can result in the migration being canceled.
  */
 #define RAM_RESIZEABLE (1 << 2)
 
@@ -955,7 +955,9 @@ void memory_region_init_ram_shared_nomigrate(MemoryRegion *mr,
  *                                     RAM.  Accesses into the region will
  *                                     modify memory directly.  Only an initial
  *                                     portion of this RAM is actually used.
- *                                     The used size can change across reboots.
+ *                                     Changing the size while migrating
+ *                                     can result in the migration being
+ *                                     canceled.
  *
  * @mr: the #MemoryRegion to be initialized.
  * @owner: the object that tracks the region's reference count
@@ -1586,8 +1588,8 @@ void *memory_region_get_ram_ptr(MemoryRegion *mr);
 
 /* memory_region_ram_resize: Resize a RAM region.
  *
- * Only legal before guest might have detected the memory size: e.g. on
- * incoming migration, or right after reset.
+ * Resizing RAM while migrating can result in the migration being canceled.
+ * Care has to be taken if the guest might have already detected the memory.
  *
  * @mr: a memory region created with @memory_region_init_resizeable_ram.
  * @newsize: the new size the region
diff --git a/migration/migration.c b/migration/migration.c
index 8ca034136b..2dea8e2fc6 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -223,13 +223,18 @@ void migration_object_init(void)
     dirty_bitmap_mig_init();
 }
 
+void migration_cancel(void)
+{
+    migrate_fd_cancel(current_migration);
+}
+
 void migration_shutdown(void)
 {
     /*
      * Cancel the current migration - that will (eventually)
      * stop the migration using this structure
      */
-    migrate_fd_cancel(current_migration);
+    migration_cancel();
     object_unref(OBJECT(current_migration));
 
     /*
@@ -2310,7 +2315,7 @@ void qmp_migrate(const char *uri, bool has_blk, bool blk,
 
 void qmp_migrate_cancel(Error **errp)
 {
-    migrate_fd_cancel(migrate_get_current());
+    migration_cancel();
 }
 
 void qmp_migrate_continue(MigrationStatus state, Error **errp)
diff --git a/migration/migration.h b/migration/migration.h
index db6708326b..f7b388d718 100644
--- a/migration/migration.h
+++ b/migration/migration.h
@@ -375,5 +375,6 @@ int foreach_not_ignored_block(RAMBlockIterFunc func, void *opaque);
 void migration_make_urgent_request(void);
 void migration_consume_urgent_request(void);
 bool migration_rate_limit(void);
+void migration_cancel(void);
 
 #endif
diff --git a/migration/ram.c b/migration/ram.c
index 4682f3625c..195fabbab0 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -53,6 +53,7 @@
 #include "block.h"
 #include "sysemu/sysemu.h"
 #include "sysemu/cpu-throttle.h"
+#include "sysemu/runstate.h"
 #include "savevm.h"
 #include "qemu/iov.h"
 #include "multifd.h"
@@ -4138,8 +4139,38 @@ static SaveVMHandlers savevm_ram_handlers = {
     .resume_prepare = ram_resume_prepare,
 };
 
+static void ram_mig_ram_block_resized(RAMBlockNotifier *n, void *host,
+                                      size_t old_size, size_t new_size)
+{
+    ram_addr_t offset;
+    RAMBlock *rb = qemu_ram_block_from_host(host, false, &offset);
+    Error *err = NULL;
+
+    if (ramblock_is_ignored(rb)) {
+        return;
+    }
+
+    if (!migration_is_idle()) {
+        /*
+         * Precopy code on the source cannot deal with the size of RAM blocks
+         * changing at random points in time - especially after sending the
+         * RAM block sizes in the migration stream, they must no longer change.
+         * Abort and indicate a proper reason.
+         */
+        error_setg(&err, "RAM block '%s' resized during precopy.", rb->idstr);
+        migrate_set_error(migrate_get_current(), err);
+        error_free(err);
+        migration_cancel();
+    }
+}
+
+static RAMBlockNotifier ram_mig_ram_notifier = {
+    .ram_block_resized = ram_mig_ram_block_resized,
+};
+
 void ram_mig_init(void)
 {
     qemu_mutex_init(&XBZRLE.lock);
     register_savevm_live("ram", 0, 4, &savevm_ram_handlers, &ram_state);
+    ram_block_notifier_add(&ram_mig_ram_notifier);
 }
diff --git a/softmmu/physmem.c b/softmmu/physmem.c
index d6463f203d..4834003d47 100644
--- a/softmmu/physmem.c
+++ b/softmmu/physmem.c
@@ -1801,8 +1801,9 @@ static int memory_try_enable_merging(void *addr, size_t len)
     return qemu_madvise(addr, len, QEMU_MADV_MERGEABLE);
 }
 
-/* Only legal before guest might have detected the memory size: e.g. on
- * incoming migration, or right after reset.
+/*
+ * Resizing RAM while migrating can result in the migration being canceled.
+ * Care has to be taken if the guest might have already detected the memory.
  *
  * As memory core doesn't know how is memory accessed, it is up to
  * resize callback to update device state and/or add assertions to detect
-- 
2.30.2



^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v5 05/10] exec: Relax range check in ram_block_discard_range()
  2021-04-29 11:26 [PATCH v5 00/10] numa/exec/migration: Fix resizing RAM blocks while migrating David Hildenbrand
                   ` (3 preceding siblings ...)
  2021-04-29 11:27 ` [PATCH v5 04/10] migration/ram: Handle RAM block resizes during precopy David Hildenbrand
@ 2021-04-29 11:27 ` David Hildenbrand
  2021-04-29 11:27 ` [PATCH v5 06/10] migration/ram: Discard RAM when growing RAM blocks after ram_postcopy_incoming_init() David Hildenbrand
                   ` (6 subsequent siblings)
  11 siblings, 0 replies; 16+ messages in thread
From: David Hildenbrand @ 2021-04-29 11:27 UTC (permalink / raw)
  To: qemu-devel
  Cc: Eduardo Habkost, Michael S. Tsirkin, David Hildenbrand,
	Richard Henderson, Dr. David Alan Gilbert, Peter Xu,
	Juan Quintela, Alex Williamson, Paolo Bonzini

We want to make use of ram_block_discard_range() in the RAM block resize
callback when growing a RAM block, *before* used_length is changed.
Let's relax the check. As RAM blocks always mmap the whole max_length area,
we cannot corrupt unrelated data.

Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 softmmu/physmem.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/softmmu/physmem.c b/softmmu/physmem.c
index 4834003d47..4e587de150 100644
--- a/softmmu/physmem.c
+++ b/softmmu/physmem.c
@@ -3503,7 +3503,7 @@ int ram_block_discard_range(RAMBlock *rb, uint64_t start, size_t length)
         goto err;
     }
 
-    if ((start + length) <= rb->used_length) {
+    if ((start + length) <= rb->max_length) {
         bool need_madvise, need_fallocate;
         if (!QEMU_IS_ALIGNED(length, rb->page_size)) {
             error_report("ram_block_discard_range: Unaligned length: %zx",
@@ -3570,7 +3570,7 @@ int ram_block_discard_range(RAMBlock *rb, uint64_t start, size_t length)
     } else {
         error_report("ram_block_discard_range: Overrun block '%s' (%" PRIu64
                      "/%zx/" RAM_ADDR_FMT")",
-                     rb->idstr, start, length, rb->used_length);
+                     rb->idstr, start, length, rb->max_length);
     }
 
 err:
-- 
2.30.2



^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v5 06/10] migration/ram: Discard RAM when growing RAM blocks after ram_postcopy_incoming_init()
  2021-04-29 11:26 [PATCH v5 00/10] numa/exec/migration: Fix resizing RAM blocks while migrating David Hildenbrand
                   ` (4 preceding siblings ...)
  2021-04-29 11:27 ` [PATCH v5 05/10] exec: Relax range check in ram_block_discard_range() David Hildenbrand
@ 2021-04-29 11:27 ` David Hildenbrand
  2021-04-29 11:27 ` [PATCH v5 07/10] migration/ram: Simplify host page handling in ram_load_postcopy() David Hildenbrand
                   ` (5 subsequent siblings)
  11 siblings, 0 replies; 16+ messages in thread
From: David Hildenbrand @ 2021-04-29 11:27 UTC (permalink / raw)
  To: qemu-devel
  Cc: Eduardo Habkost, Michael S. Tsirkin, David Hildenbrand,
	Richard Henderson, Dr. David Alan Gilbert, Peter Xu,
	Juan Quintela, Alex Williamson, Paolo Bonzini

In case we grow our RAM after ram_postcopy_incoming_init() (e.g., when
synchronizing the RAM block state with the migration source), the resized
part would not get discarded. Let's perform that when being notified
about a resize while postcopy has been advised, but is not listening
yet. With precopy, the process is as following:

1. VM created
- RAM blocks are created
2. Incomming migration started
- Postcopy is advised
- All pages in RAM blocks are discarded
3. Precopy starts
- RAM blocks are resized to match the size on the migration source.
- RAM pages from precopy stream are loaded
- Uffd handler is registered, postcopy starts listening
4. Guest started, postcopy running
- Pagefaults get resolved, pages get placed

Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 migration/ram.c | 30 ++++++++++++++++++++++++++++++
 1 file changed, 30 insertions(+)

diff --git a/migration/ram.c b/migration/ram.c
index 195fabbab0..693851d7a0 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -4142,6 +4142,7 @@ static SaveVMHandlers savevm_ram_handlers = {
 static void ram_mig_ram_block_resized(RAMBlockNotifier *n, void *host,
                                       size_t old_size, size_t new_size)
 {
+    PostcopyState ps = postcopy_state_get();
     ram_addr_t offset;
     RAMBlock *rb = qemu_ram_block_from_host(host, false, &offset);
     Error *err = NULL;
@@ -4162,6 +4163,35 @@ static void ram_mig_ram_block_resized(RAMBlockNotifier *n, void *host,
         error_free(err);
         migration_cancel();
     }
+
+    switch (ps) {
+    case POSTCOPY_INCOMING_ADVISE:
+        /*
+         * Update what ram_postcopy_incoming_init()->init_range() does at the
+         * time postcopy was advised. Syncing RAM blocks with the source will
+         * result in RAM resizes.
+         */
+        if (old_size < new_size) {
+            if (ram_discard_range(rb->idstr, old_size, new_size - old_size)) {
+                error_report("RAM block '%s' discard of resized RAM failed",
+                             rb->idstr);
+            }
+        }
+        break;
+    case POSTCOPY_INCOMING_NONE:
+    case POSTCOPY_INCOMING_RUNNING:
+    case POSTCOPY_INCOMING_END:
+        /*
+         * Once our guest is running, postcopy does no longer care about
+         * resizes. When growing, the new memory was not available on the
+         * source, no handler needed.
+         */
+        break;
+    default:
+        error_report("RAM block '%s' resized during postcopy state: %d",
+                     rb->idstr, ps);
+        exit(-1);
+    }
 }
 
 static RAMBlockNotifier ram_mig_ram_notifier = {
-- 
2.30.2



^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v5 07/10] migration/ram: Simplify host page handling in ram_load_postcopy()
  2021-04-29 11:26 [PATCH v5 00/10] numa/exec/migration: Fix resizing RAM blocks while migrating David Hildenbrand
                   ` (5 preceding siblings ...)
  2021-04-29 11:27 ` [PATCH v5 06/10] migration/ram: Discard RAM when growing RAM blocks after ram_postcopy_incoming_init() David Hildenbrand
@ 2021-04-29 11:27 ` David Hildenbrand
  2021-04-29 11:27 ` [PATCH v5 08/10] migration/ram: Handle RAM block resizes during postcopy David Hildenbrand
                   ` (4 subsequent siblings)
  11 siblings, 0 replies; 16+ messages in thread
From: David Hildenbrand @ 2021-04-29 11:27 UTC (permalink / raw)
  To: qemu-devel
  Cc: Eduardo Habkost, Michael S. Tsirkin, David Hildenbrand,
	Richard Henderson, Dr. David Alan Gilbert, Peter Xu,
	Juan Quintela, Alex Williamson, Paolo Bonzini

Add two new helper functions. This will come in come handy once we want to
handle ram block resizes while postcopy is active.

Note that ram_block_from_stream() will already print proper errors.

Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 migration/ram.c | 55 ++++++++++++++++++++++++++++---------------------
 1 file changed, 32 insertions(+), 23 deletions(-)

diff --git a/migration/ram.c b/migration/ram.c
index 693851d7a0..72df5228ee 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -3121,6 +3121,20 @@ static inline void *host_from_ram_block_offset(RAMBlock *block,
     return block->host + offset;
 }
 
+static void *host_page_from_ram_block_offset(RAMBlock *block,
+                                             ram_addr_t offset)
+{
+    /* Note: Explicitly no check against offset_in_ramblock(). */
+    return (void *)QEMU_ALIGN_DOWN((uintptr_t)block->host + offset,
+                                   block->page_size);
+}
+
+static ram_addr_t host_page_offset_from_ram_block_offset(RAMBlock *block,
+                                                         ram_addr_t offset)
+{
+    return ((uintptr_t)block->host + offset) & (block->page_size - 1);
+}
+
 static inline void *colo_cache_from_block_offset(RAMBlock *block,
                              ram_addr_t offset, bool record_bitmap)
 {
@@ -3524,13 +3538,12 @@ static int ram_load_postcopy(QEMUFile *f)
     MigrationIncomingState *mis = migration_incoming_get_current();
     /* Temporary page that is later 'placed' */
     void *postcopy_host_page = mis->postcopy_tmp_page;
-    void *this_host = NULL;
+    void *host_page = NULL;
     bool all_zero = true;
     int target_pages = 0;
 
     while (!ret && !(flags & RAM_SAVE_FLAG_EOS)) {
         ram_addr_t addr;
-        void *host = NULL;
         void *page_buffer = NULL;
         void *place_source = NULL;
         RAMBlock *block = NULL;
@@ -3555,9 +3568,12 @@ static int ram_load_postcopy(QEMUFile *f)
         if (flags & (RAM_SAVE_FLAG_ZERO | RAM_SAVE_FLAG_PAGE |
                      RAM_SAVE_FLAG_COMPRESS_PAGE)) {
             block = ram_block_from_stream(f, flags);
+            if (!block) {
+                ret = -EINVAL;
+                break;
+            }
 
-            host = host_from_ram_block_offset(block, addr);
-            if (!host) {
+            if (!offset_in_ramblock(block, addr)) {
                 error_report("Illegal RAM offset " RAM_ADDR_FMT, addr);
                 ret = -EINVAL;
                 break;
@@ -3575,19 +3591,17 @@ static int ram_load_postcopy(QEMUFile *f)
              * of a host page in one chunk.
              */
             page_buffer = postcopy_host_page +
-                          ((uintptr_t)host & (block->page_size - 1));
+                          host_page_offset_from_ram_block_offset(block, addr);
+            /* If all TP are zero then we can optimise the place */
             if (target_pages == 1) {
-                this_host = (void *)QEMU_ALIGN_DOWN((uintptr_t)host,
-                                                    block->page_size);
-            } else {
+                host_page = host_page_from_ram_block_offset(block, addr);
+            } else if (host_page != host_page_from_ram_block_offset(block,
+                                                                    addr)) {
                 /* not the 1st TP within the HP */
-                if (QEMU_ALIGN_DOWN((uintptr_t)host, block->page_size) !=
-                    (uintptr_t)this_host) {
-                    error_report("Non-same host page %p/%p",
-                                  host, this_host);
-                    ret = -EINVAL;
-                    break;
-                }
+                error_report("Non-same host page %p/%p", host_page,
+                             host_page_from_ram_block_offset(block, addr));
+                ret = -EINVAL;
+                break;
             }
 
             /*
@@ -3666,16 +3680,11 @@ static int ram_load_postcopy(QEMUFile *f)
         }
 
         if (!ret && place_needed) {
-            /* This gets called at the last target page in the host page */
-            void *place_dest = (void *)QEMU_ALIGN_DOWN((uintptr_t)host,
-                                                       block->page_size);
-
             if (all_zero) {
-                ret = postcopy_place_page_zero(mis, place_dest,
-                                               block);
+                ret = postcopy_place_page_zero(mis, host_page, block);
             } else {
-                ret = postcopy_place_page(mis, place_dest,
-                                          place_source, block);
+                ret = postcopy_place_page(mis, host_page, place_source,
+                                          block);
             }
             place_needed = false;
             target_pages = 0;
-- 
2.30.2



^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v5 08/10] migration/ram: Handle RAM block resizes during postcopy
  2021-04-29 11:26 [PATCH v5 00/10] numa/exec/migration: Fix resizing RAM blocks while migrating David Hildenbrand
                   ` (6 preceding siblings ...)
  2021-04-29 11:27 ` [PATCH v5 07/10] migration/ram: Simplify host page handling in ram_load_postcopy() David Hildenbrand
@ 2021-04-29 11:27 ` David Hildenbrand
  2021-04-29 11:27 ` [PATCH v5 09/10] migration/multifd: Print used_length of memory block David Hildenbrand
                   ` (3 subsequent siblings)
  11 siblings, 0 replies; 16+ messages in thread
From: David Hildenbrand @ 2021-04-29 11:27 UTC (permalink / raw)
  To: qemu-devel
  Cc: Eduardo Habkost, Michael S. Tsirkin, David Hildenbrand,
	Richard Henderson, Dr. David Alan Gilbert, Peter Xu,
	Juan Quintela, Alex Williamson, Paolo Bonzini

Resizing while migrating is dangerous and does not work as expected.
The whole migration code works with the usable_length of a ram block and
does not expect this value to change at random points in time.

In the case of postcopy, relying on used_length is racy as soon as the
guest is running. Also, when used_length changes we might leave the
uffd handler registered for some memory regions, reject valid pages
when migrating and fail when sending the recv bitmap to the source.

Resizing can be trigger *after* (but not during) a reset in
ACPI code by the guest
- hw/arm/virt-acpi-build.c:acpi_ram_update()
- hw/i386/acpi-build.c:acpi_ram_update()

Let's remember the original used_length in a separate variable and
use it in relevant postcopy code. Make sure to update it when we resize
during precopy, when synchronizing the RAM block sizes with the source.

Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 include/exec/ramblock.h  | 10 ++++++++++
 migration/postcopy-ram.c | 15 ++++++++++++---
 migration/ram.c          | 11 +++++++++--
 3 files changed, 31 insertions(+), 5 deletions(-)

diff --git a/include/exec/ramblock.h b/include/exec/ramblock.h
index 07d50864d8..664701b759 100644
--- a/include/exec/ramblock.h
+++ b/include/exec/ramblock.h
@@ -59,6 +59,16 @@ struct RAMBlock {
      */
     unsigned long *clear_bmap;
     uint8_t clear_bmap_shift;
+
+    /*
+     * RAM block length that corresponds to the used_length on the migration
+     * source (after RAM block sizes were synchronized). Especially, after
+     * starting to run the guest, used_length and postcopy_length can differ.
+     * Used to register/unregister uffd handlers and as the size of the received
+     * bitmap. Receiving any page beyond this length will bail out, as it
+     * could not have been valid on the source.
+     */
+    ram_addr_t postcopy_length;
 };
 #endif
 #endif
diff --git a/migration/postcopy-ram.c b/migration/postcopy-ram.c
index ab482adef1..2e9697bdd2 100644
--- a/migration/postcopy-ram.c
+++ b/migration/postcopy-ram.c
@@ -17,6 +17,7 @@
  */
 
 #include "qemu/osdep.h"
+#include "qemu/rcu.h"
 #include "exec/target_page.h"
 #include "migration.h"
 #include "qemu-file.h"
@@ -30,6 +31,7 @@
 #include "qemu/error-report.h"
 #include "trace.h"
 #include "hw/boards.h"
+#include "exec/ramblock.h"
 
 /* Arbitrary limit on size of each discard command,
  * keeps them around ~200 bytes
@@ -452,6 +454,13 @@ static int init_range(RAMBlock *rb, void *opaque)
     ram_addr_t length = qemu_ram_get_used_length(rb);
     trace_postcopy_init_range(block_name, host_addr, offset, length);
 
+    /*
+     * Save the used_length before running the guest. In case we have to
+     * resize RAM blocks when syncing RAM block sizes from the source during
+     * precopy, we'll update it manually via the ram block notifier.
+     */
+    rb->postcopy_length = length;
+
     /*
      * We need the whole of RAM to be truly empty for postcopy, so things
      * like ROMs and any data tables built during init must be zero'd
@@ -474,7 +483,7 @@ static int cleanup_range(RAMBlock *rb, void *opaque)
     const char *block_name = qemu_ram_get_idstr(rb);
     void *host_addr = qemu_ram_get_host_addr(rb);
     ram_addr_t offset = qemu_ram_get_offset(rb);
-    ram_addr_t length = qemu_ram_get_used_length(rb);
+    ram_addr_t length = rb->postcopy_length;
     MigrationIncomingState *mis = opaque;
     struct uffdio_range range_struct;
     trace_postcopy_cleanup_range(block_name, host_addr, offset, length);
@@ -580,7 +589,7 @@ static int nhp_range(RAMBlock *rb, void *opaque)
     const char *block_name = qemu_ram_get_idstr(rb);
     void *host_addr = qemu_ram_get_host_addr(rb);
     ram_addr_t offset = qemu_ram_get_offset(rb);
-    ram_addr_t length = qemu_ram_get_used_length(rb);
+    ram_addr_t length = rb->postcopy_length;
     trace_postcopy_nhp_range(block_name, host_addr, offset, length);
 
     /*
@@ -624,7 +633,7 @@ static int ram_block_enable_notify(RAMBlock *rb, void *opaque)
     struct uffdio_register reg_struct;
 
     reg_struct.range.start = (uintptr_t)qemu_ram_get_host_addr(rb);
-    reg_struct.range.len = qemu_ram_get_used_length(rb);
+    reg_struct.range.len = rb->postcopy_length;
     reg_struct.mode = UFFDIO_REGISTER_MODE_MISSING;
 
     /* Now tell our userfault_fd that it's responsible for this area */
diff --git a/migration/ram.c b/migration/ram.c
index 72df5228ee..6d4d13d345 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -243,7 +243,7 @@ int64_t ramblock_recv_bitmap_send(QEMUFile *file,
         return -1;
     }
 
-    nbits = block->used_length >> TARGET_PAGE_BITS;
+    nbits = block->postcopy_length >> TARGET_PAGE_BITS;
 
     /*
      * Make sure the tmp bitmap buffer is big enough, e.g., on 32bit
@@ -3573,7 +3573,13 @@ static int ram_load_postcopy(QEMUFile *f)
                 break;
             }
 
-            if (!offset_in_ramblock(block, addr)) {
+            /*
+             * Relying on used_length is racy and can result in false positives.
+             * We might place pages beyond used_length in case RAM was shrunk
+             * while in postcopy, which is fine - trying to place via
+             * UFFDIO_COPY/UFFDIO_ZEROPAGE will never segfault.
+             */
+            if (!block->host || addr >= block->postcopy_length) {
                 error_report("Illegal RAM offset " RAM_ADDR_FMT, addr);
                 ret = -EINVAL;
                 break;
@@ -4186,6 +4192,7 @@ static void ram_mig_ram_block_resized(RAMBlockNotifier *n, void *host,
                              rb->idstr);
             }
         }
+        rb->postcopy_length = new_size;
         break;
     case POSTCOPY_INCOMING_NONE:
     case POSTCOPY_INCOMING_RUNNING:
-- 
2.30.2



^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v5 09/10] migration/multifd: Print used_length of memory block
  2021-04-29 11:26 [PATCH v5 00/10] numa/exec/migration: Fix resizing RAM blocks while migrating David Hildenbrand
                   ` (7 preceding siblings ...)
  2021-04-29 11:27 ` [PATCH v5 08/10] migration/ram: Handle RAM block resizes during postcopy David Hildenbrand
@ 2021-04-29 11:27 ` David Hildenbrand
  2021-04-29 11:27 ` [PATCH v5 10/10] migration/ram: Use offset_in_ramblock() in range checks David Hildenbrand
                   ` (2 subsequent siblings)
  11 siblings, 0 replies; 16+ messages in thread
From: David Hildenbrand @ 2021-04-29 11:27 UTC (permalink / raw)
  To: qemu-devel
  Cc: Eduardo Habkost, Michael S. Tsirkin, David Hildenbrand,
	Richard Henderson, Dr. David Alan Gilbert, Peter Xu,
	Juan Quintela, Alex Williamson, Paolo Bonzini

We actually want to print the used_length, against which we check.

Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 migration/multifd.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/migration/multifd.c b/migration/multifd.c
index a6677c45c8..0a4803cfcc 100644
--- a/migration/multifd.c
+++ b/migration/multifd.c
@@ -361,7 +361,7 @@ static int multifd_recv_unfill_packet(MultiFDRecvParams *p, Error **errp)
         if (offset > (block->used_length - qemu_target_page_size())) {
             error_setg(errp, "multifd: offset too long %" PRIu64
                        " (max " RAM_ADDR_FMT ")",
-                       offset, block->max_length);
+                       offset, block->used_length);
             return -1;
         }
         p->pages->iov[i].iov_base = block->host + offset;
-- 
2.30.2



^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v5 10/10] migration/ram: Use offset_in_ramblock() in range checks
  2021-04-29 11:26 [PATCH v5 00/10] numa/exec/migration: Fix resizing RAM blocks while migrating David Hildenbrand
                   ` (8 preceding siblings ...)
  2021-04-29 11:27 ` [PATCH v5 09/10] migration/multifd: Print used_length of memory block David Hildenbrand
@ 2021-04-29 11:27 ` David Hildenbrand
  2021-05-06 17:57 ` [PATCH v5 00/10] numa/exec/migration: Fix resizing RAM blocks while migrating Eduardo Habkost
  2021-05-11  9:42 ` Dr. David Alan Gilbert
  11 siblings, 0 replies; 16+ messages in thread
From: David Hildenbrand @ 2021-04-29 11:27 UTC (permalink / raw)
  To: qemu-devel
  Cc: Eduardo Habkost, Michael S. Tsirkin, David Hildenbrand,
	Richard Henderson, Dr. David Alan Gilbert, Peter Xu,
	Juan Quintela, Alex Williamson, Paolo Bonzini

We never read or write beyond the used_length of memory blocks when
migrating. Make this clearer by using offset_in_ramblock() consistently.

Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 migration/ram.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/migration/ram.c b/migration/ram.c
index 6d4d13d345..1814c6cf03 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -1368,8 +1368,8 @@ static bool find_dirty_block(RAMState *rs, PageSearchStatus *pss, bool *again)
         *again = false;
         return false;
     }
-    if ((((ram_addr_t)pss->page) << TARGET_PAGE_BITS)
-        >= pss->block->used_length) {
+    if (!offset_in_ramblock(pss->block,
+                            ((ram_addr_t)pss->page) << TARGET_PAGE_BITS)) {
         /* Didn't find anything in this RAM Block */
         pss->page = 0;
         pss->block = QLIST_NEXT_RCU(pss->block, next);
@@ -1894,7 +1894,7 @@ int ram_save_queue_pages(const char *rbname, ram_addr_t start, ram_addr_t len)
         rs->last_req_rb = ramblock;
     }
     trace_ram_save_queue_pages(ramblock->idstr, start, len);
-    if (start + len > ramblock->used_length) {
+    if (!offset_in_ramblock(ramblock, start + len - 1)) {
         error_report("%s request overrun start=" RAM_ADDR_FMT " len="
                      RAM_ADDR_FMT " blocklen=" RAM_ADDR_FMT,
                      __func__, start, len, ramblock->used_length);
@@ -3739,8 +3739,8 @@ void colo_flush_ram_cache(void)
         while (block) {
             offset = migration_bitmap_find_dirty(ram_state, block, offset);
 
-            if (((ram_addr_t)offset) << TARGET_PAGE_BITS
-                >= block->used_length) {
+            if (!offset_in_ramblock(block,
+                                    ((ram_addr_t)offset) << TARGET_PAGE_BITS)) {
                 offset = 0;
                 block = QLIST_NEXT_RCU(block, next);
             } else {
-- 
2.30.2



^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH v5 01/10] util: vfio-helpers: Factor out and fix processing of existing ram blocks
  2021-04-29 11:26 ` [PATCH v5 01/10] util: vfio-helpers: Factor out and fix processing of existing ram blocks David Hildenbrand
@ 2021-04-29 13:13   ` Philippe Mathieu-Daudé
  0 siblings, 0 replies; 16+ messages in thread
From: Philippe Mathieu-Daudé @ 2021-04-29 13:13 UTC (permalink / raw)
  To: David Hildenbrand, qemu-devel
  Cc: Eduardo Habkost, Juan Quintela, Michael S. Tsirkin,
	Richard Henderson, Dr. David Alan Gilbert, Peter Xu,
	Alex Williamson, Paolo Bonzini

On 4/29/21 1:26 PM, David Hildenbrand wrote:
> Factor it out into common code when a new notifier is registered, just
> as done with the memory region notifier. This keeps logic about how to
> process existing ram blocks at a central place.
> 
> Just like when adding a new ram block, we have to register the max_length.
> Ram blocks are only "fake resized". All memory (max_length) is mapped.
> 
> Print the warning from inside qemu_vfio_ram_block_added().
> 
> Reviewed-by: Peter Xu <peterx@redhat.com>
> Signed-off-by: David Hildenbrand <david@redhat.com>
> ---
>  hw/core/numa.c            | 14 ++++++++++++++
>  include/exec/cpu-common.h |  1 +
>  softmmu/physmem.c         |  5 +++++
>  util/vfio-helpers.c       | 29 ++++++++---------------------
>  4 files changed, 28 insertions(+), 21 deletions(-)

Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v5 00/10] numa/exec/migration: Fix resizing RAM blocks while migrating
  2021-04-29 11:26 [PATCH v5 00/10] numa/exec/migration: Fix resizing RAM blocks while migrating David Hildenbrand
                   ` (9 preceding siblings ...)
  2021-04-29 11:27 ` [PATCH v5 10/10] migration/ram: Use offset_in_ramblock() in range checks David Hildenbrand
@ 2021-05-06 17:57 ` Eduardo Habkost
  2021-05-06 18:04   ` Dr. David Alan Gilbert
  2021-05-11  9:42 ` Dr. David Alan Gilbert
  11 siblings, 1 reply; 16+ messages in thread
From: Eduardo Habkost @ 2021-05-06 17:57 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Juan Quintela, Michael S. Tsirkin, Richard Henderson, qemu-devel,
	Peter Xu, Dr. David Alan Gilbert, Alex Williamson, Paolo Bonzini

On Thu, Apr 29, 2021 at 01:26:58PM +0200, David Hildenbrand wrote:
> v4 has been floating around for a while. Let's see if we can find someone
> to merge this; or at least give some more feedback ... all patches have
> at least one RB.

Acked-by: Eduardo Habkost <ehabkost@redhat.com>

David Gilbert: I assume this should go through your tree?

-- 
Eduardo



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v5 00/10] numa/exec/migration: Fix resizing RAM blocks while migrating
  2021-05-06 17:57 ` [PATCH v5 00/10] numa/exec/migration: Fix resizing RAM blocks while migrating Eduardo Habkost
@ 2021-05-06 18:04   ` Dr. David Alan Gilbert
  0 siblings, 0 replies; 16+ messages in thread
From: Dr. David Alan Gilbert @ 2021-05-06 18:04 UTC (permalink / raw)
  To: Eduardo Habkost
  Cc: Michael S. Tsirkin, David Hildenbrand, Richard Henderson,
	qemu-devel, Peter Xu, Juan Quintela, Alex Williamson,
	Paolo Bonzini

* Eduardo Habkost (ehabkost@redhat.com) wrote:
> On Thu, Apr 29, 2021 at 01:26:58PM +0200, David Hildenbrand wrote:
> > v4 has been floating around for a while. Let's see if we can find someone
> > to merge this; or at least give some more feedback ... all patches have
> > at least one RB.
> 
> Acked-by: Eduardo Habkost <ehabkost@redhat.com>
> 
> David Gilbert: I assume this should go through your tree?

Yep, I'm going to cook a migration pull next week.

Dave

> -- 
> Eduardo
-- 
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v5 00/10] numa/exec/migration: Fix resizing RAM blocks while migrating
  2021-04-29 11:26 [PATCH v5 00/10] numa/exec/migration: Fix resizing RAM blocks while migrating David Hildenbrand
                   ` (10 preceding siblings ...)
  2021-05-06 17:57 ` [PATCH v5 00/10] numa/exec/migration: Fix resizing RAM blocks while migrating Eduardo Habkost
@ 2021-05-11  9:42 ` Dr. David Alan Gilbert
  11 siblings, 0 replies; 16+ messages in thread
From: Dr. David Alan Gilbert @ 2021-05-11  9:42 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Eduardo Habkost, Michael S. Tsirkin, Juan Quintela,
	Richard Henderson, qemu-devel, Peter Xu, Alex Williamson,
	Paolo Bonzini

* David Hildenbrand (david@redhat.com) wrote:
> v4 has been floating around for a while. Let's see if we can find someone
> to merge this; or at least give some more feedback ... all patches have
> at least one RB.
> 
> 
> I realized that resizing RAM blocks while the guest is being migrated
> (precopy: resize while still running on the source, postcopy: resize
>  while already running on the target) is buggy. In case of precopy, we
> can simply cancel migration. Postcopy handling is more involved. Resizing
> can currently happen during a guest reboot, triggered by ACPI rebuilds.
> 
> Along with the fixes, some cleanups.

Queued

> 
> --------------------------------------------------------------------------
> 
> Example to highlight one part of the problem:
> 
> 1. Start a paused VM (where a ramblock resize will trigger when booting):
>   sudo build/qemu-system-x86_64 \
>        --enable-kvm \
>        -S \
>        -machine q35,nvdimm=on \
>        -smp 1 \
>        -cpu host \
>        -m size=20G,slots=8,maxmem=22G \
>        -object memory-backend-file,id=mem0,mem-path=/tmp/nvdimm,size=256M \
>        -device nvdimm,label-size=131072,memdev=mem0,id=nvdimm0,slot=1 \
>        -nodefaults \
>        -chardev stdio,nosignal,id=serial \
>        -device isa-serial,chardev=serial \
>        -chardev socket,id=monitor,path=/var/tmp/monitor,server,nowait \
>        -mon chardev=monitor,mode=readline \
>        -device vmgenid \
>        -device intel-iommu \
>        -nographic
> 
> 2. Starting precopy and then starting the VM to trigger resizing during
>    precopy:
>   QEMU 5.2.95 monitor - type 'help' for more information
>   (qemu) migrate -d "exec:gzip -c > STATEFILE.gz"
>   QEMU 5.2.95 monitor - type 'help' for more information
>   (qemu) cont
> 
> 3a. Before this series, migration never completes:
>   QEMU 5.2.95 monitor - type 'help' for more information
>   (qemu) info migrate
>   globals:
>   store-global-state: on
>   only-migratable: off
>   send-configuration: on
>   send-section-footer: on
>   decompress-error-check: on
>   clear-bitmap-shift: 18
>   Migration status: active
>   total time: 43826 ms
>   expected downtime: 300 ms
>   setup: 5 ms
>   transferred ram: 65981 kbytes
>   throughput: 8.27 mbps
>   remaining ram: 18446744073709551612 kbytes
>   total ram: 21234188 kbytes
>   duplicate: 5308454 pages
>   skipped: 0 pages
>   normal: 93 pages
>   normal bytes: 372 kbytes
>   dirty sync count: 1
>   page size: 4 kbytes
>   multifd bytes: 0 kbytes
>   pages-per-second: 0
> 
> 4. With this change, migration is properly aborted:
>   (qemu) info migrate
>   globals:
>   store-global-state: on
>   only-migratable: off
>   send-configuration: on
>   send-section-footer: on
>   decompress-error-check: on
>   clear-bitmap-shift: 18
>   Migration status: cancelled
>   total time: 0 ms
> 
> --------------------------------------------------------------------------
> 
> Cc: Eduardo Habkost <ehabkost@redhat.com>
> Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
> Cc: "Michael S. Tsirkin" <mst@redhat.com>
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Cc: Richard Henderson <richard.henderson@linaro.org>
> Cc: Juan Quintela <quintela@redhat.com>
> Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
> Cc: Peter Xu <peterx@redhat.com>
> Cc: Alex Williamson <alex.williamson@redhat.com>
> 
> v4 -> v5:
> - Rephrased some patch descriptions
> - Dropped some patches to reduce the footprint
> -- "stubs/ram-block: Remove stubs that are no longer needed"
> -- "migration/ram: Tolerate partially changed mappings in postcopy code"
> - Removed as already upstream now
> -- "migration/ram: Consolidate variable reset after placement in
>     ram_load_postcopy()"
> 
> v3 -> v4:
> - Rebased and retested
> - Added RBs
> 
> v2 -> v3:
> - Rebased on current master
> - Added RBs
> - "migration/ram: Tolerate partially changed mappings in postcopy code"
> -- Extended the comment for the uffdio unregister part.
> 
> v1 -> v2:
> - "util: vfio-helpers: Factor out and fix processing of existing ram
>    blocks"
> -- Stringify error
> - "migraton/ram: Handle RAM block resizes during precopy"
> -- Simplified check if we're migrating on the source
> - "exec: Relax range check in ram_block_discard_range()"
> -- Added to make discard during resizes actually work
> - "migration/ram: Discard new RAM when growing RAM blocks after
>    ram_postcopy_incoming_init()"
> -- Better checks if in the right postcopy mode.
> -- Better patch subject/description/comments
> - "migration/ram: Handle RAM block resizes during postcopy"
> -- Better comments
> -- Adapt to changed postcopy checks
> - "migrate/ram: Get rid of "place_source" in ram_load_postcopy()"
> -- Dropped, as broken
> - "migration/ram: Tolerate partially changed mappings in postcopy code"
> -- Better comment / description. Clarify that no implicit wakeup will
>    happen
> -- Warn on EINVAL (older kernels)
> -- Wake up any waiter explicitly
> 
> David Hildenbrand (10):
>   util: vfio-helpers: Factor out and fix processing of existing ram
>     blocks
>   numa: Teach ram block notifiers about resizeable ram blocks
>   numa: Make all callbacks of ram block notifiers optional
>   migration/ram: Handle RAM block resizes during precopy
>   exec: Relax range check in ram_block_discard_range()
>   migration/ram: Discard RAM when growing RAM blocks after
>     ram_postcopy_incoming_init()
>   migration/ram: Simplify host page handling in ram_load_postcopy()
>   migration/ram: Handle RAM block resizes during postcopy
>   migration/multifd: Print used_length of memory block
>   migration/ram: Use offset_in_ramblock() in range checks
> 
>  hw/core/numa.c             |  41 +++++++++--
>  hw/i386/xen/xen-mapcache.c |   7 +-
>  include/exec/cpu-common.h  |   1 +
>  include/exec/memory.h      |  10 +--
>  include/exec/ramblock.h    |  10 +++
>  include/exec/ramlist.h     |  13 ++--
>  migration/migration.c      |   9 ++-
>  migration/migration.h      |   1 +
>  migration/multifd.c        |   2 +-
>  migration/postcopy-ram.c   |  15 ++++-
>  migration/ram.c            | 135 +++++++++++++++++++++++++++++--------
>  softmmu/physmem.c          |  26 +++++--
>  target/i386/hax/hax-mem.c  |   5 +-
>  target/i386/sev.c          |  18 ++---
>  util/vfio-helpers.c        |  41 ++++-------
>  15 files changed, 241 insertions(+), 93 deletions(-)
> 
> -- 
> 2.30.2
> 
> 
-- 
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK



^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2021-05-11  9:48 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-04-29 11:26 [PATCH v5 00/10] numa/exec/migration: Fix resizing RAM blocks while migrating David Hildenbrand
2021-04-29 11:26 ` [PATCH v5 01/10] util: vfio-helpers: Factor out and fix processing of existing ram blocks David Hildenbrand
2021-04-29 13:13   ` Philippe Mathieu-Daudé
2021-04-29 11:27 ` [PATCH v5 02/10] numa: Teach ram block notifiers about resizeable " David Hildenbrand
2021-04-29 11:27   ` David Hildenbrand
2021-04-29 11:27 ` [PATCH v5 03/10] numa: Make all callbacks of ram block notifiers optional David Hildenbrand
2021-04-29 11:27 ` [PATCH v5 04/10] migration/ram: Handle RAM block resizes during precopy David Hildenbrand
2021-04-29 11:27 ` [PATCH v5 05/10] exec: Relax range check in ram_block_discard_range() David Hildenbrand
2021-04-29 11:27 ` [PATCH v5 06/10] migration/ram: Discard RAM when growing RAM blocks after ram_postcopy_incoming_init() David Hildenbrand
2021-04-29 11:27 ` [PATCH v5 07/10] migration/ram: Simplify host page handling in ram_load_postcopy() David Hildenbrand
2021-04-29 11:27 ` [PATCH v5 08/10] migration/ram: Handle RAM block resizes during postcopy David Hildenbrand
2021-04-29 11:27 ` [PATCH v5 09/10] migration/multifd: Print used_length of memory block David Hildenbrand
2021-04-29 11:27 ` [PATCH v5 10/10] migration/ram: Use offset_in_ramblock() in range checks David Hildenbrand
2021-05-06 17:57 ` [PATCH v5 00/10] numa/exec/migration: Fix resizing RAM blocks while migrating Eduardo Habkost
2021-05-06 18:04   ` Dr. David Alan Gilbert
2021-05-11  9:42 ` Dr. David Alan Gilbert

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.