[PATCH RFC] memory: Don't allow to resize RAM while migrating

* [PATCH RFC] memory: Don't allow to resize RAM while migrating
@ 2020-02-13 17:20 David Hildenbrand
  2020-02-13 18:32 ` Peter Xu
                   ` (2 more replies)
  0 siblings, 3 replies; 26+ messages in thread
From: David Hildenbrand @ 2020-02-13 17:20 UTC (permalink / raw)
  To: qemu-devel
  Cc: Eduardo Habkost, Michael S. Tsirkin, David Hildenbrand,
	Richard Henderson, Dr. David Alan Gilbert,
	Shameerali Kolothum Thodi, Juan Quintela, Shannon Zhao,
	Igor Mammedov, Paolo Bonzini, Alex Bennée

Resizing while migrating is dangerous and does not work as expected.
The whole migration code works on the usable_length of ram blocks and does
not expect this to change at random points in time.

Precopy: The ram block size must not change on the source, after
ram_save_setup(), so as long as the guest is still running on the source.

Postcopy: The ram block size must not change on the target, after
synchronizing the RAM block list (ram_load_precopy()).

AFAIKS, resizing can be trigger *after* (but not during) a reset in
ACPI code by the guest
- hw/arm/virt-acpi-build.c:acpi_ram_update()
- hw/i386/acpi-build.c:acpi_ram_update()

I see no easy way to work around this. Fail hard instead of failing
somewhere in migration code due to strange other reasons. AFAIKs, the
rebuilts will be triggered during reboot, so this should not affect
running guests, but only guests that reboot at a very bad time and
actually require size changes.

Let's further limit the impact by checking if an actual resize of the
RAM (in number of pages) is required.

Don't perform the checks in qemu_ram_resize(), as that's called during
migration when syncing the used_length. Update documentation.

Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Richard Henderson <richard.henderson@linaro.org>
Cc: Shannon Zhao <shannon.zhao@linaro.org>
Cc: Alex Bennée <alex.bennee@linaro.org>
Cc: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>
Cc: Juan Quintela <quintela@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
---

Any idea how to avoid killing the guest? Anything obvious I am missing?

---
 exec.c                |  6 ++++--
 include/exec/memory.h | 11 +++++++----
 memory.c              | 12 ++++++++++++
 3 files changed, 23 insertions(+), 6 deletions(-)

diff --git a/exec.c b/exec.c
index 67e520d18e..faa6708414 100644
--- a/exec.c
+++ b/exec.c
@@ -2116,8 +2116,10 @@ static int memory_try_enable_merging(void *addr, size_t len)
     return qemu_madvise(addr, len, QEMU_MADV_MERGEABLE);
 }
 
-/* Only legal before guest might have detected the memory size: e.g. on
- * incoming migration, or right after reset.
+/*
+ * RAM must not be resized while migration is active (except from migration
+ * code). Care has to be taken if the guest might have already detected
+ * the memory.
  *
  * As memory core doesn't know how is memory accessed, it is up to
  * resize callback to update device state and/or add assertions to detect
diff --git a/include/exec/memory.h b/include/exec/memory.h
index e85b7de99a..1e5bfbe805 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -113,7 +113,8 @@ typedef struct IOMMUNotifier IOMMUNotifier;
 #define RAM_SHARED     (1 << 1)
 
 /* Only a portion of RAM (used_length) is actually used, and migrated.
- * This used_length size can change across reboots.
+ * RAM must not be resized while migration is active (except from migration
+ * code).
  */
 #define RAM_RESIZEABLE (1 << 2)
 
@@ -843,7 +844,8 @@ void memory_region_init_ram_shared_nomigrate(MemoryRegion *mr,
  *                                     RAM.  Accesses into the region will
  *                                     modify memory directly.  Only an initial
  *                                     portion of this RAM is actually used.
- *                                     The used size can change across reboots.
+ *                                     The size must not change while migration
+ *                                     is active.
  *
  * @mr: the #MemoryRegion to be initialized.
  * @owner: the object that tracks the region's reference count
@@ -1464,8 +1466,9 @@ void *memory_region_get_ram_ptr(MemoryRegion *mr);
 
 /* memory_region_ram_resize: Resize a RAM region.
  *
- * Only legal before guest might have detected the memory size: e.g. on
- * incoming migration, or right after reset.
+ * RAM must not be resized while migration is active (except from migration
+ * code). Care has to be taken if the guest might have already detected
+ * the memory.
  *
  * @mr: a memory region created with @memory_region_init_resizeable_ram.
  * @newsize: the new size the region
diff --git a/memory.c b/memory.c
index aeaa8dcc9e..7fa048aa3a 100644
--- a/memory.c
+++ b/memory.c
@@ -34,6 +34,7 @@
 #include "sysemu/accel.h"
 #include "hw/boards.h"
 #include "migration/vmstate.h"
+#include "migration/misc.h"
 
 //#define DEBUG_UNASSIGNED
 
@@ -2204,6 +2205,17 @@ void memory_region_ram_resize(MemoryRegion *mr, ram_addr_t newsize, Error **errp
 {
     assert(mr->ram_block);
 
+    /*
+     * Resizing RAM while migrating is not possible, as the used_length of
+     * RAM blocks must neither change on the source (precopy), nor on the
+     * target (postcopy) as long as migration code is active.
+     */
+    if (HOST_PAGE_ALIGN(newsize) != mr->ram_block->used_length &&
+        !migration_is_idle()) {
+        error_setg(errp, "Cannot resize RAM while migrating.");
+        return;
+    }
+
     qemu_ram_resize(mr->ram_block, newsize, errp);
 }
 
-- 
2.24.1



^ permalink raw reply related	[flat|nested] 26+ messages in thread