* [PATCH v2 0/6] migration/ram: Optimize for virtio-mem via RamDiscardManager
@ 2021-07-21  9:27 David Hildenbrand
  2021-07-21  9:27 ` [PATCH v2 1/6] memory: Introduce replay_discarded callback for RamDiscardManager David Hildenbrand
                   ` (6 more replies)
  0 siblings, 7 replies; 40+ messages in thread
From: David Hildenbrand @ 2021-07-21  9:27 UTC (permalink / raw)
  To: qemu-devel
  Cc: Eduardo Habkost, Michael S. Tsirkin, Pankaj Gupta, Juan Quintela,
	David Hildenbrand, Dr. David Alan Gilbert, Peter Xu,
	Marek Kedzierski, Alex Williamson, teawater, Paolo Bonzini,
	Andrey Gruzdev, Wei Yang

virtio-mem exposes a dynamic amount of memory within RAMBlocks by
coordinating with the VM. Memory within a RAMBlock can either get
plugged and consequently used by the VM, or unplugged and consequently no
longer used by the VM. Logical unplug is realized by discarding the
physical memory backing for virtual memory ranges, similar to memory
ballooning.

However, important differences to virtio-balloon are:

a) A virtio-mem device only operates on its assigned memory region /
   RAMBlock ("device memory")
b) Initially, all device memory is logically unplugged
c) Virtual machines will never accidentally reuse memory that is currently
   logically unplugged. The spec defines most accesses to unplugged memory
   as "undefined behavior" -- except reading unplugged memory, which is
   currently expected to work, but that will change in the future.
d) The (un)plug granularity is in the range of megabytes -- "memory blocks"
e) The state (plugged/unplugged) of a memory block is always known and
   properly tracked.

Whenever memory blocks within the RAMBlock get (un)plugged, changes are
communicated via the RamDiscardManager to other QEMU subsystems, most
prominently vfio which updates the DMA mapping accordingly. "Unplugging"
corresponds to "discarding" and "plugging" corresponds to "populating".

While migrating (precopy/postcopy), the state of such memory blocks cannot
change. We never want to migrate such logically unplugged memory, because
it can result in unintended memory consumption both on the source (when
reading memory from some memory backends) and on the destination (when
writing memory). Further, migration time can be heavily reduced by skipping
logically unplugged blocks, and we avoid populating unnecessary page tables
in Linux.

Right now, virtio-mem reuses the free page hinting infrastructure during
precopy to exclude all logically unplugged ("discarded") parts from the
migration stream. However, there are some scenarios that are not handled
properly and need fixing. Further, there are some ugly corner cases in
postcopy code and background snapshotting code that similarly have to
handle such special RAMBlocks.

Let's reuse the RamDiscardManager infrastructure to essentially handle
precopy, postcopy and background snapshots cleanly, which means:

a) In precopy code, always clearing all dirty bits from the bitmap that
   correspond to discarded ranges whenever we update the dirty bitmap. This
   results in logically unplugged memory never getting migrated (see the
   sketch after this list).
b) In postcopy code, placing a zeropage when requested to handle a page
   falling into a discarded range -- because the source will never send it.
c) In background snapshot code, never populating discarded ranges, not even
   with the shared zeropage, to avoid unintended memory consumption,
   especially in the future with hugetlb and shmem.
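
Regarding a), the core of the approach looks roughly as follows (condensed
sketch with made-up helper names; the real code in patch 3 below additionally
counts the cleared bits to fix up the dirty page accounting):

static void clear_discarded_section(MemoryRegionSection *section, void *opaque)
{
    RAMBlock *rb = section->mr->ram_block;
    unsigned long start = section->offset_within_region >> TARGET_PAGE_BITS;
    unsigned long npages = int128_get64(section->size) >> TARGET_PAGE_BITS;

    /* Discarded ("logically unplugged") pages must never be migrated. */
    bitmap_clear(rb->bmap, start, npages);
}

static void dirty_bitmap_exclude_discarded(RAMBlock *rb)
{
    if (rb->mr && memory_region_has_ram_discard_manager(rb->mr)) {
        RamDiscardManager *rdm = memory_region_get_ram_discard_manager(rb->mr);
        MemoryRegionSection section = {
            .mr = rb->mr,
            .offset_within_region = 0,
            .size = int128_make64(qemu_ram_get_used_length(rb)),
        };

        ram_discard_manager_replay_discarded(rdm, &section,
                                             clear_discarded_section, NULL);
    }
}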

Detail: When realizing a virtio-mem device, it will register the RAM
        for migration via vmstate_register_ram(). Further, it will
        set itself as the RamDiscardManager for the corresponding memory
        region of the RAMBlock via memory_region_set_ram_discard_manager().
        Last but not least, memory device code will actually map the
        memory region into guest physical address space. So migration
        code can always properly identify such RAMBlocks.
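
In code, the relevant realize steps look roughly like this (simplified
sketch; the actual virtio_mem_device_realize() excerpt is visible in the
context of patch 4 below):

    vmstate_register_ram(&vmem->memdev->mr, DEVICE(vmem));
    /*
     * Set ourselves as RamDiscardManager before memory device (plug) code
     * maps the memory region into guest physical address space.
     */
    memory_region_set_ram_discard_manager(&vmem->memdev->mr,
                                          RAM_DISCARD_MANAGER(vmem));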

Tested with precopy/postcopy on shmem, where even reading unpopulated
memory ranges will populate actual memory and not the shared zeropage.
Tested with background snapshots on anonymous memory, because other
backends are not supported yet with upstream Linux.

Ideally, this should all go via the migration tree.

v1 -> v2:
- "migration/ram: Handle RAMBlocks with a RamDiscardManager on the
   migration source"
-- Added a note on how it interacts with the clear_bmap and what we might want
   to further optimize in the future when synchronizing bitmaps.

Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Juan Quintela <quintela@redhat.com>
Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Andrey Gruzdev <andrey.gruzdev@virtuozzo.com>
Cc: Marek Kedzierski <mkedzier@redhat.com>
Cc: Wei Yang <richard.weiyang@linux.alibaba.com>
Cc: teawater <teawaterz@linux.alibaba.com>
Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Pankaj Gupta <pankaj.gupta@cloud.ionos.com>

David Hildenbrand (6):
  memory: Introduce replay_discarded callback for RamDiscardManager
  virtio-mem: Implement replay_discarded RamDiscardManager callback
  migration/ram: Handle RAMBlocks with a RamDiscardManager on the
    migration source
  virtio-mem: Drop precopy notifier
  migration/postcopy: Handle RAMBlocks with a RamDiscardManager on the
    destination
  migration/ram: Handle RAMBlocks with a RamDiscardManager on background
    snapshots

 hw/virtio/virtio-mem.c         |  92 +++++++++++++--------
 include/exec/memory.h          |  21 +++++
 include/hw/virtio/virtio-mem.h |   3 -
 migration/postcopy-ram.c       |  25 +++++-
 migration/ram.c                | 147 ++++++++++++++++++++++++++++++---
 migration/ram.h                |   1 +
 softmmu/memory.c               |  11 +++
 7 files changed, 246 insertions(+), 54 deletions(-)

-- 
2.31.1




* [PATCH v2 1/6] memory: Introduce replay_discarded callback for RamDiscardManager
  2021-07-21  9:27 [PATCH v2 0/6] migration/ram: Optimize for virtio-mem via RamDiscardManager David Hildenbrand
@ 2021-07-21  9:27 ` David Hildenbrand
  2021-07-23 16:34   ` Peter Xu
  2021-07-21  9:27 ` [PATCH v2 2/6] virtio-mem: Implement replay_discarded RamDiscardManager callback David Hildenbrand
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 40+ messages in thread
From: David Hildenbrand @ 2021-07-21  9:27 UTC (permalink / raw)
  To: qemu-devel
  Cc: Eduardo Habkost, Michael S. Tsirkin, Pankaj Gupta, Juan Quintela,
	David Hildenbrand, Dr. David Alan Gilbert, Peter Xu,
	Marek Kedzierski, Alex Williamson, teawater, Paolo Bonzini,
	Andrey Gruzdev, Wei Yang

Introduce a replay_discarded callback similar to our existing
replay_populated callback, to be used by migration code to never migrate
discarded memory.

Signed-off-by: David Hildenbrand <david@redhat.com>
---
 include/exec/memory.h | 21 +++++++++++++++++++++
 softmmu/memory.c      | 11 +++++++++++
 2 files changed, 32 insertions(+)

diff --git a/include/exec/memory.h b/include/exec/memory.h
index c3d417d317..93e972b55a 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -537,6 +537,7 @@ static inline void ram_discard_listener_init(RamDiscardListener *rdl,
 }
 
 typedef int (*ReplayRamPopulate)(MemoryRegionSection *section, void *opaque);
+typedef void (*ReplayRamDiscard)(MemoryRegionSection *section, void *opaque);
 
 /*
  * RamDiscardManagerClass:
@@ -625,6 +626,21 @@ struct RamDiscardManagerClass {
                             MemoryRegionSection *section,
                             ReplayRamPopulate replay_fn, void *opaque);
 
+    /**
+     * @replay_discarded:
+     *
+     * Call the #ReplayRamDiscard callback for all discarded parts within the
+     * #MemoryRegionSection via the #RamDiscardManager.
+     *
+     * @rdm: the #RamDiscardManager
+     * @section: the #MemoryRegionSection
+     * @replay_fn: the #ReplayRamDiscard callback
+     * @opaque: pointer to forward to the callback
+     */
+    void (*replay_discarded)(const RamDiscardManager *rdm,
+                             MemoryRegionSection *section,
+                             ReplayRamDiscard replay_fn, void *opaque);
+
     /**
      * @register_listener:
      *
@@ -669,6 +685,11 @@ int ram_discard_manager_replay_populated(const RamDiscardManager *rdm,
                                          ReplayRamPopulate replay_fn,
                                          void *opaque);
 
+void ram_discard_manager_replay_discarded(const RamDiscardManager *rdm,
+                                          MemoryRegionSection *section,
+                                          ReplayRamDiscard replay_fn,
+                                          void *opaque);
+
 void ram_discard_manager_register_listener(RamDiscardManager *rdm,
                                            RamDiscardListener *rdl,
                                            MemoryRegionSection *section);
diff --git a/softmmu/memory.c b/softmmu/memory.c
index bfedaf9c4d..cd86205627 100644
--- a/softmmu/memory.c
+++ b/softmmu/memory.c
@@ -2076,6 +2076,17 @@ int ram_discard_manager_replay_populated(const RamDiscardManager *rdm,
     return rdmc->replay_populated(rdm, section, replay_fn, opaque);
 }
 
+void ram_discard_manager_replay_discarded(const RamDiscardManager *rdm,
+                                          MemoryRegionSection *section,
+                                          ReplayRamDiscard replay_fn,
+                                          void *opaque)
+{
+    RamDiscardManagerClass *rdmc = RAM_DISCARD_MANAGER_GET_CLASS(rdm);
+
+    g_assert(rdmc->replay_discarded);
+    rdmc->replay_discarded(rdm, section, replay_fn, opaque);
+}
+
 void ram_discard_manager_register_listener(RamDiscardManager *rdm,
                                            RamDiscardListener *rdl,
                                            MemoryRegionSection *section)
-- 
2.31.1




* [PATCH v2 2/6] virtio-mem: Implement replay_discarded RamDiscardManager callback
  2021-07-21  9:27 [PATCH v2 0/6] migration/ram: Optimize for virtio-mem via RamDiscardManager David Hildenbrand
  2021-07-21  9:27 ` [PATCH v2 1/6] memory: Introduce replay_discarded callback for RamDiscardManager David Hildenbrand
@ 2021-07-21  9:27 ` David Hildenbrand
  2021-07-23 16:34   ` Peter Xu
  2021-07-21  9:27 ` [PATCH v2 3/6] migration/ram: Handle RAMBlocks with a RamDiscardManager on the migration source David Hildenbrand
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 40+ messages in thread
From: David Hildenbrand @ 2021-07-21  9:27 UTC (permalink / raw)
  To: qemu-devel
  Cc: Eduardo Habkost, Michael S. Tsirkin, Pankaj Gupta, Juan Quintela,
	David Hildenbrand, Dr. David Alan Gilbert, Peter Xu,
	Marek Kedzierski, Alex Williamson, teawater, Paolo Bonzini,
	Andrey Gruzdev, Wei Yang

Implement it similarly to the replay_populated callback.

Signed-off-by: David Hildenbrand <david@redhat.com>
---
 hw/virtio/virtio-mem.c | 58 ++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 58 insertions(+)

diff --git a/hw/virtio/virtio-mem.c b/hw/virtio/virtio-mem.c
index df91e454b2..284096ec5f 100644
--- a/hw/virtio/virtio-mem.c
+++ b/hw/virtio/virtio-mem.c
@@ -228,6 +228,38 @@ static int virtio_mem_for_each_plugged_section(const VirtIOMEM *vmem,
     return ret;
 }
 
+static int virtio_mem_for_each_unplugged_section(const VirtIOMEM *vmem,
+                                                 MemoryRegionSection *s,
+                                                 void *arg,
+                                                 virtio_mem_section_cb cb)
+{
+    unsigned long first_bit, last_bit;
+    uint64_t offset, size;
+    int ret = 0;
+
+    first_bit = s->offset_within_region / vmem->bitmap_size;
+    first_bit = find_next_zero_bit(vmem->bitmap, vmem->bitmap_size, first_bit);
+    while (first_bit < vmem->bitmap_size) {
+        MemoryRegionSection tmp = *s;
+
+        offset = first_bit * vmem->block_size;
+        last_bit = find_next_bit(vmem->bitmap, vmem->bitmap_size,
+                                 first_bit + 1) - 1;
+        size = (last_bit - first_bit + 1) * vmem->block_size;
+
+        if (!virito_mem_intersect_memory_section(&tmp, offset, size)) {
+            break;
+        }
+        ret = cb(&tmp, arg);
+        if (ret) {
+            break;
+        }
+        first_bit = find_next_zero_bit(vmem->bitmap, vmem->bitmap_size,
+                                       last_bit + 2);
+    }
+    return ret;
+}
+
 static int virtio_mem_notify_populate_cb(MemoryRegionSection *s, void *arg)
 {
     RamDiscardListener *rdl = arg;
@@ -1170,6 +1202,31 @@ static int virtio_mem_rdm_replay_populated(const RamDiscardManager *rdm,
                                             virtio_mem_rdm_replay_populated_cb);
 }
 
+static int virtio_mem_rdm_replay_discarded_cb(MemoryRegionSection *s,
+                                              void *arg)
+{
+    struct VirtIOMEMReplayData *data = arg;
+
+    ((ReplayRamDiscard)data->fn)(s, data->opaque);
+    return 0;
+}
+
+static void virtio_mem_rdm_replay_discarded(const RamDiscardManager *rdm,
+                                            MemoryRegionSection *s,
+                                            ReplayRamDiscard replay_fn,
+                                            void *opaque)
+{
+    const VirtIOMEM *vmem = VIRTIO_MEM(rdm);
+    struct VirtIOMEMReplayData data = {
+        .fn = replay_fn,
+        .opaque = opaque,
+    };
+
+    g_assert(s->mr == &vmem->memdev->mr);
+    virtio_mem_for_each_unplugged_section(vmem, s, &data,
+                                          virtio_mem_rdm_replay_discarded_cb);
+}
+
 static void virtio_mem_rdm_register_listener(RamDiscardManager *rdm,
                                              RamDiscardListener *rdl,
                                              MemoryRegionSection *s)
@@ -1234,6 +1291,7 @@ static void virtio_mem_class_init(ObjectClass *klass, void *data)
     rdmc->get_min_granularity = virtio_mem_rdm_get_min_granularity;
     rdmc->is_populated = virtio_mem_rdm_is_populated;
     rdmc->replay_populated = virtio_mem_rdm_replay_populated;
+    rdmc->replay_discarded = virtio_mem_rdm_replay_discarded;
     rdmc->register_listener = virtio_mem_rdm_register_listener;
     rdmc->unregister_listener = virtio_mem_rdm_unregister_listener;
 }
-- 
2.31.1




* [PATCH v2 3/6] migration/ram: Handle RAMBlocks with a RamDiscardManager on the migration source
  2021-07-21  9:27 [PATCH v2 0/6] migration/ram: Optimize for virtio-mem via RamDiscardManager David Hildenbrand
  2021-07-21  9:27 ` [PATCH v2 1/6] memory: Introduce replay_discarded callback for RamDiscardManager David Hildenbrand
  2021-07-21  9:27 ` [PATCH v2 2/6] virtio-mem: Implement replay_discarded RamDiscardManager callback David Hildenbrand
@ 2021-07-21  9:27 ` David Hildenbrand
  2021-07-21  9:27 ` [PATCH v2 4/6] virtio-mem: Drop precopy notifier David Hildenbrand
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 40+ messages in thread
From: David Hildenbrand @ 2021-07-21  9:27 UTC (permalink / raw)
  To: qemu-devel
  Cc: Eduardo Habkost, Michael S. Tsirkin, Pankaj Gupta, Juan Quintela,
	David Hildenbrand, Dr. David Alan Gilbert, Peter Xu,
	Marek Kedzierski, Alex Williamson, teawater, Paolo Bonzini,
	Andrey Gruzdev, Wei Yang

We never want to migrate memory that corresponds to discarded ranges as
managed by a RamDiscardManager responsible for the mapped memory region of
the RAMBlock. The content of these pages is essentially stale and
without any guarantees for the VM ("logically unplugged").

Depending on the underlying memory type, even reading memory might populate
memory on the source, resulting in undesired memory consumption. Of
course, on the destination, even writing a zeropage consumes memory,
which we also want to avoid (similar to free page hinting).

Currently, virtio-mem tries to achieve that goal (not migrating "unplugged"
memory that was discarded) by going via qemu_guest_free_page_hint() -- but
it's hackish and incomplete.

For example, background snapshots still end up reading all memory, as
they don't do bitmap syncs. Postcopy recovery code will re-add
previously cleared bits to the dirty bitmap and migrate them.

Let's consult the RamDiscardManager whenever we add bits to the bitmap
and exclude all discarded pages.

As COLO is incompatible with discarding of RAM and inhibits it, we don't
have to bother.

Note: If discarded ranges span complete clear_bmap chunks, we'll never
clear the corresponding bits from clear_bmap and consequently never call
memory_region_clear_dirty_bitmap on the affected regions. While this is
perfectly fine, we're still synchronizing the bitmap of discarded ranges,
for example, in
ramblock_sync_dirty_bitmap()->cpu_physical_memory_sync_dirty_bitmap()
but also during memory_global_dirty_log_sync().

In the future, it might make sense to never even synchronize the dirty log
of these ranges, for example in KVM code, skipping discarded ranges
completely.

Signed-off-by: David Hildenbrand <david@redhat.com>
---
 migration/ram.c | 69 ++++++++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 65 insertions(+), 4 deletions(-)

diff --git a/migration/ram.c b/migration/ram.c
index b5fc454b2f..4bf74ae2e1 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -830,14 +830,67 @@ static inline bool migration_bitmap_clear_dirty(RAMState *rs,
     return ret;
 }
 
+static void dirty_bitmap_exclude_section(MemoryRegionSection *section,
+                                         void *opaque)
+{
+    const hwaddr offset = section->offset_within_region;
+    const hwaddr size = int128_get64(section->size);
+    const unsigned long start = offset >> TARGET_PAGE_BITS;
+    const unsigned long npages = size >> TARGET_PAGE_BITS;
+    RAMBlock *rb = section->mr->ram_block;
+    uint64_t *cleared_bits = opaque;
+
+    /* Clear the discarded part from our bitmap. */
+    *cleared_bits += bitmap_count_one_with_offset(rb->bmap, start, npages);
+    bitmap_clear(rb->bmap, start, npages);
+}
+
+/*
+ * Exclude all dirty pages that fall into a discarded range as managed by a
+ * RamDiscardManager responsible for the mapped memory region of the RAMBlock.
+ *
+ * Discarded pages ("logically unplugged") have undefined content and must
+ * never be migrated, because even reading these pages for migrating might
+ * result in undesired behavior.
+ *
+ * Returns the number of cleared bits in the dirty bitmap.
+ *
+ * Note: The result is only stable while migration (precopy/postcopy).
+ */
+static uint64_t ramblock_dirty_bitmap_exclude_discarded_pages(RAMBlock *rb)
+{
+    uint64_t cleared_bits = 0;
+
+    if (rb->mr && memory_region_has_ram_discard_manager(rb->mr)) {
+        RamDiscardManager *rdm = memory_region_get_ram_discard_manager(rb->mr);
+        MemoryRegionSection section = {
+            .mr = rb->mr,
+            .offset_within_region = 0,
+            .size = int128_make64(qemu_ram_get_used_length(rb)),
+        };
+
+        ram_discard_manager_replay_discarded(rdm, &section,
+                                             dirty_bitmap_exclude_section,
+                                             &cleared_bits);
+    }
+    return cleared_bits;
+}
+
 /* Called with RCU critical section */
 static void ramblock_sync_dirty_bitmap(RAMState *rs, RAMBlock *rb)
 {
     uint64_t new_dirty_pages =
         cpu_physical_memory_sync_dirty_bitmap(rb, 0, rb->used_length);
+    uint64_t cleared_pages = 0;
 
-    rs->migration_dirty_pages += new_dirty_pages;
-    rs->num_dirty_pages_period += new_dirty_pages;
+    if (new_dirty_pages) {
+        cleared_pages = ramblock_dirty_bitmap_exclude_discarded_pages(rb);
+    }
+
+    rs->migration_dirty_pages += new_dirty_pages - cleared_pages;
+    if (new_dirty_pages > cleared_pages) {
+        rs->num_dirty_pages_period += new_dirty_pages - cleared_pages;
+    }
 }
 
 /**
@@ -2601,7 +2654,7 @@ static int ram_state_init(RAMState **rsp)
     return 0;
 }
 
-static void ram_list_init_bitmaps(void)
+static void ram_list_init_bitmaps(RAMState *rs)
 {
     MigrationState *ms = migrate_get_current();
     RAMBlock *block;
@@ -2636,6 +2689,10 @@ static void ram_list_init_bitmaps(void)
             bitmap_set(block->bmap, 0, pages);
             block->clear_bmap_shift = shift;
             block->clear_bmap = bitmap_new(clear_bmap_size(pages, shift));
+
+            /* Exclude all discarded pages we never want to migrate. */
+            pages = ramblock_dirty_bitmap_exclude_discarded_pages(block);
+            rs->migration_dirty_pages -= pages;
         }
     }
 }
@@ -2647,7 +2704,7 @@ static void ram_init_bitmaps(RAMState *rs)
     qemu_mutex_lock_ramlist();
 
     WITH_RCU_READ_LOCK_GUARD() {
-        ram_list_init_bitmaps();
+        ram_list_init_bitmaps(rs);
         /* We don't use dirty log with background snapshots */
         if (!migrate_background_snapshot()) {
             memory_global_dirty_log_start();
@@ -4076,6 +4133,10 @@ int ram_dirty_bitmap_reload(MigrationState *s, RAMBlock *block)
      */
     bitmap_complement(block->bmap, block->bmap, nbits);
 
+    /* Exclude all discarded pages we never want to migrate. */
+    ramblock_dirty_bitmap_exclude_discarded_pages(block);
+
+    /* We'll recalculate migration_dirty_pages in ram_state_resume_prepare(). */
     trace_ram_dirty_bitmap_reload_complete(block->idstr);
 
     /*
-- 
2.31.1




* [PATCH v2 4/6] virtio-mem: Drop precopy notifier
  2021-07-21  9:27 [PATCH v2 0/6] migration/ram: Optimize for virtio-mem via RamDiscardManager David Hildenbrand
                   ` (2 preceding siblings ...)
  2021-07-21  9:27 ` [PATCH v2 3/6] migration/ram: Handle RAMBlocks with a RamDiscardManager on the migration source David Hildenbrand
@ 2021-07-21  9:27 ` David Hildenbrand
  2021-07-23 16:34   ` Peter Xu
  2021-07-21  9:27 ` [PATCH v2 5/6] migration/postcopy: Handle RAMBlocks with a RamDiscardManager on the destination David Hildenbrand
                   ` (2 subsequent siblings)
  6 siblings, 1 reply; 40+ messages in thread
From: David Hildenbrand @ 2021-07-21  9:27 UTC (permalink / raw)
  To: qemu-devel
  Cc: Eduardo Habkost, Michael S. Tsirkin, Pankaj Gupta, Juan Quintela,
	David Hildenbrand, Dr. David Alan Gilbert, Peter Xu,
	Marek Kedzierski, Alex Williamson, teawater, Paolo Bonzini,
	Andrey Gruzdev, Wei Yang

Migration code now properly handles RAMBlocks which are indirectly managed
by a RamDiscardManager. No need for manual handling via the free page
optimization interface; let's get rid of it.

Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 hw/virtio/virtio-mem.c         | 34 ----------------------------------
 include/hw/virtio/virtio-mem.h |  3 ---
 2 files changed, 37 deletions(-)

diff --git a/hw/virtio/virtio-mem.c b/hw/virtio/virtio-mem.c
index 284096ec5f..d5a578142b 100644
--- a/hw/virtio/virtio-mem.c
+++ b/hw/virtio/virtio-mem.c
@@ -776,7 +776,6 @@ static void virtio_mem_device_realize(DeviceState *dev, Error **errp)
     host_memory_backend_set_mapped(vmem->memdev, true);
     vmstate_register_ram(&vmem->memdev->mr, DEVICE(vmem));
     qemu_register_reset(virtio_mem_system_reset, vmem);
-    precopy_add_notifier(&vmem->precopy_notifier);
 
     /*
      * Set ourselves as RamDiscardManager before the plug handler maps the
@@ -796,7 +795,6 @@ static void virtio_mem_device_unrealize(DeviceState *dev)
      * found via an address space anymore. Unset ourselves.
      */
     memory_region_set_ram_discard_manager(&vmem->memdev->mr, NULL);
-    precopy_remove_notifier(&vmem->precopy_notifier);
     qemu_unregister_reset(virtio_mem_system_reset, vmem);
     vmstate_unregister_ram(&vmem->memdev->mr, DEVICE(vmem));
     host_memory_backend_set_mapped(vmem->memdev, false);
@@ -1089,43 +1087,11 @@ static void virtio_mem_set_block_size(Object *obj, Visitor *v, const char *name,
     vmem->block_size = value;
 }
 
-static int virtio_mem_precopy_exclude_range_cb(const VirtIOMEM *vmem, void *arg,
-                                               uint64_t offset, uint64_t size)
-{
-    void * const host = qemu_ram_get_host_addr(vmem->memdev->mr.ram_block);
-
-    qemu_guest_free_page_hint(host + offset, size);
-    return 0;
-}
-
-static void virtio_mem_precopy_exclude_unplugged(VirtIOMEM *vmem)
-{
-    virtio_mem_for_each_unplugged_range(vmem, NULL,
-                                        virtio_mem_precopy_exclude_range_cb);
-}
-
-static int virtio_mem_precopy_notify(NotifierWithReturn *n, void *data)
-{
-    VirtIOMEM *vmem = container_of(n, VirtIOMEM, precopy_notifier);
-    PrecopyNotifyData *pnd = data;
-
-    switch (pnd->reason) {
-    case PRECOPY_NOTIFY_AFTER_BITMAP_SYNC:
-        virtio_mem_precopy_exclude_unplugged(vmem);
-        break;
-    default:
-        break;
-    }
-
-    return 0;
-}
-
 static void virtio_mem_instance_init(Object *obj)
 {
     VirtIOMEM *vmem = VIRTIO_MEM(obj);
 
     notifier_list_init(&vmem->size_change_notifiers);
-    vmem->precopy_notifier.notify = virtio_mem_precopy_notify;
     QLIST_INIT(&vmem->rdl_list);
 
     object_property_add(obj, VIRTIO_MEM_SIZE_PROP, "size", virtio_mem_get_size,
diff --git a/include/hw/virtio/virtio-mem.h b/include/hw/virtio/virtio-mem.h
index 9a6e348fa2..a5dd6a493b 100644
--- a/include/hw/virtio/virtio-mem.h
+++ b/include/hw/virtio/virtio-mem.h
@@ -65,9 +65,6 @@ struct VirtIOMEM {
     /* notifiers to notify when "size" changes */
     NotifierList size_change_notifiers;
 
-    /* don't migrate unplugged memory */
-    NotifierWithReturn precopy_notifier;
-
     /* listeners to notify on plug/unplug activity. */
     QLIST_HEAD(, RamDiscardListener) rdl_list;
 };
-- 
2.31.1




* [PATCH v2 5/6] migration/postcopy: Handle RAMBlocks with a RamDiscardManager on the destination
  2021-07-21  9:27 [PATCH v2 0/6] migration/ram: Optimize for virtio-mem via RamDiscardManager David Hildenbrand
                   ` (3 preceding siblings ...)
  2021-07-21  9:27 ` [PATCH v2 4/6] virtio-mem: Drop precopy notifier David Hildenbrand
@ 2021-07-21  9:27 ` David Hildenbrand
  2021-07-23 16:34   ` Peter Xu
  2021-07-21  9:27 ` [PATCH v2 6/6] migration/ram: Handle RAMBlocks with a RamDiscardManager on background snapshots David Hildenbrand
  2021-07-22 11:29 ` [PATCH v2 0/6] migration/ram: Optimize for virtio-mem via RamDiscardManager Dr. David Alan Gilbert
  6 siblings, 1 reply; 40+ messages in thread
From: David Hildenbrand @ 2021-07-21  9:27 UTC (permalink / raw)
  To: qemu-devel
  Cc: Eduardo Habkost, Michael S. Tsirkin, Pankaj Gupta, Juan Quintela,
	David Hildenbrand, Dr. David Alan Gilbert, Peter Xu,
	Marek Kedzierski, Alex Williamson, teawater, Paolo Bonzini,
	Andrey Gruzdev, Wei Yang

Currently, when someone (i.e., the VM) accesses discarded parts inside a
RAMBlock with a RamDiscardManager managing the corresponding mapped memory
region, postcopy will request migration of the corresponding page from the
source. The source, however, will never answer, because it refuses to
migrate such pages with undefined content ("logically unplugged"): the
pages are never dirty, and get_queued_page() will consequently skip
processing these postcopy requests.

Especially reading discarded ("logically unplugged") ranges is supposed to
work in some setups (for example with current virtio-mem), although it
barely ever happens: still, not placing a page would currently stall the
VM, as it cannot make forward progress.

Let's check the state via the RamDiscardManager (the state e.g.,
of virtio-mem is migrated during precopy) and avoid sending a request
that will never get answered. Place a fresh zero page instead to keep
the VM working. This is the same behavior that would happen
automatically without userfaultfd being active, when accessing virtual
memory regions without populated pages -- "populate on demand".

Signed-off-by: David Hildenbrand <david@redhat.com>
---
 migration/postcopy-ram.c | 25 +++++++++++++++++++++----
 migration/ram.c          | 23 +++++++++++++++++++++++
 migration/ram.h          |  1 +
 3 files changed, 45 insertions(+), 4 deletions(-)

diff --git a/migration/postcopy-ram.c b/migration/postcopy-ram.c
index 2e9697bdd2..673f60ce2b 100644
--- a/migration/postcopy-ram.c
+++ b/migration/postcopy-ram.c
@@ -671,6 +671,23 @@ int postcopy_wake_shared(struct PostCopyFD *pcfd,
     return ret;
 }
 
+static int postcopy_request_page(MigrationIncomingState *mis, RAMBlock *rb,
+                                 ram_addr_t start, uint64_t haddr)
+{
+    /*
+     * Discarded pages (via RamDiscardManager) are never migrated. On unlikely
+     * access, place a zeropage, which will also set the relevant bits in the
+     * recv_bitmap accordingly, so we won't try placing a zeropage twice.
+     */
+    if (ramblock_page_is_discarded(rb, start)) {
+        bool received = ramblock_recv_bitmap_test_byte_offset(rb, start);
+
+        return received ? 0 : postcopy_place_page_zero(mis, (void *)haddr, rb);
+    }
+
+    return migrate_send_rp_req_pages(mis, rb, start, haddr);
+}
+
 /*
  * Callback from shared fault handlers to ask for a page,
  * the page must be specified by a RAMBlock and an offset in that rb
@@ -690,7 +707,7 @@ int postcopy_request_shared_page(struct PostCopyFD *pcfd, RAMBlock *rb,
                                         qemu_ram_get_idstr(rb), rb_offset);
         return postcopy_wake_shared(pcfd, client_addr, rb);
     }
-    migrate_send_rp_req_pages(mis, rb, aligned_rbo, client_addr);
+    postcopy_request_page(mis, rb, aligned_rbo, client_addr);
     return 0;
 }
 
@@ -984,8 +1001,8 @@ retry:
              * Send the request to the source - we want to request one
              * of our host page sizes (which is >= TPS)
              */
-            ret = migrate_send_rp_req_pages(mis, rb, rb_offset,
-                                            msg.arg.pagefault.address);
+            ret = postcopy_request_page(mis, rb, rb_offset,
+                                        msg.arg.pagefault.address);
             if (ret) {
                 /* May be network failure, try to wait for recovery */
                 if (ret == -EIO && postcopy_pause_fault_thread(mis)) {
@@ -993,7 +1010,7 @@ retry:
                     goto retry;
                 } else {
                     /* This is a unavoidable fault */
-                    error_report("%s: migrate_send_rp_req_pages() get %d",
+                    error_report("%s: postcopy_request_page() get %d",
                                  __func__, ret);
                     break;
                 }
diff --git a/migration/ram.c b/migration/ram.c
index 4bf74ae2e1..d7505f5368 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -876,6 +876,29 @@ static uint64_t ramblock_dirty_bitmap_exclude_discarded_pages(RAMBlock *rb)
     return cleared_bits;
 }
 
+/*
+ * Check if a page falls into a discarded range as managed by a
+ * RamDiscardManager responsible for the mapped memory region of the RAMBlock.
+ * Pages inside discarded ranges are never migrated and postcopy has to
+ * place zeropages instead.
+ *
+ * Note: The result is only stable while migration (precopy/postcopy).
+ */
+bool ramblock_page_is_discarded(RAMBlock *rb, ram_addr_t offset)
+{
+    if (rb->mr && memory_region_has_ram_discard_manager(rb->mr)) {
+        RamDiscardManager *rdm = memory_region_get_ram_discard_manager(rb->mr);
+        MemoryRegionSection section = {
+            .mr = rb->mr,
+            .offset_within_region = offset,
+            .size = int128_get64(TARGET_PAGE_SIZE),
+        };
+
+        return !ram_discard_manager_is_populated(rdm, &section);
+    }
+    return false;
+}
+
 /* Called with RCU critical section */
 static void ramblock_sync_dirty_bitmap(RAMState *rs, RAMBlock *rb)
 {
diff --git a/migration/ram.h b/migration/ram.h
index 4833e9fd5b..6e7f12e556 100644
--- a/migration/ram.h
+++ b/migration/ram.h
@@ -72,6 +72,7 @@ void ramblock_recv_bitmap_set_range(RAMBlock *rb, void *host_addr, size_t nr);
 int64_t ramblock_recv_bitmap_send(QEMUFile *file,
                                   const char *block_name);
 int ram_dirty_bitmap_reload(MigrationState *s, RAMBlock *rb);
+bool ramblock_page_is_discarded(RAMBlock *rb, ram_addr_t offset);
 
 /* ram cache */
 int colo_init_ram_cache(void);
-- 
2.31.1




* [PATCH v2 6/6] migration/ram: Handle RAMBlocks with a RamDiscardManager on background snapshots
  2021-07-21  9:27 [PATCH v2 0/6] migration/ram: Optimize for virtio-mem via RamDiscardManager David Hildenbrand
                   ` (4 preceding siblings ...)
  2021-07-21  9:27 ` [PATCH v2 5/6] migration/postcopy: Handle RAMBlocks with a RamDiscardManager on the destination David Hildenbrand
@ 2021-07-21  9:27 ` David Hildenbrand
  2021-07-23 16:37   ` Peter Xu
  2021-07-22 11:29 ` [PATCH v2 0/6] migration/ram: Optimize for virtio-mem via RamDiscardManager Dr. David Alan Gilbert
  6 siblings, 1 reply; 40+ messages in thread
From: David Hildenbrand @ 2021-07-21  9:27 UTC (permalink / raw)
  To: qemu-devel
  Cc: Eduardo Habkost, Michael S. Tsirkin, Pankaj Gupta, Juan Quintela,
	David Hildenbrand, Dr. David Alan Gilbert, Peter Xu,
	Marek Kedzierski, Alex Williamson, teawater, Paolo Bonzini,
	Andrey Gruzdev, Wei Yang

We already don't ever migrate memory that corresponds to discarded ranges
as managed by a RamDiscardManager responsible for the mapped memory region
of the RAMBlock.

virtio-mem uses this mechanism to logically unplug parts of a RAMBlock.
Right now, we still populate zeropages for the whole usable part of the
RAMBlock, which is undesired because:

1. Even populating the shared zeropage will result in memory getting
   consumed for page tables.
2. Memory backends without a shared zeropage (like hugetlbfs and shmem)
   will populate an actual, fresh page, resulting in an unintended
   memory consumption.

Discarded ("logically unplugged") parts have to remain discarded. As
these pages are never part of the migration stream, there is no need to
track modifications via userfaultfd WP reliably for these parts.

Further, any writes to these ranges by the VM are invalid and the
behavior is undefined.

Note that Linux only supports userfaultfd WP on private anonymous memory
for now, which usually results in the shared zeropage getting populated.
The issue will become more relevant once userfaultfd WP supports shmem
and hugetlb.

Signed-off-by: David Hildenbrand <david@redhat.com>
---
 migration/ram.c | 53 +++++++++++++++++++++++++++++++++++++++++--------
 1 file changed, 45 insertions(+), 8 deletions(-)

diff --git a/migration/ram.c b/migration/ram.c
index d7505f5368..75de936bd2 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -1612,6 +1612,28 @@ out:
     return ret;
 }
 
+static inline void populate_range(RAMBlock *block, hwaddr offset, hwaddr size)
+{
+    char *ptr = (char *) block->host;
+
+    for (; offset < size; offset += qemu_real_host_page_size) {
+        char tmp = *(ptr + offset);
+
+        /* Don't optimize the read out */
+        asm volatile("" : "+r" (tmp));
+    }
+}
+
+static inline int populate_section(MemoryRegionSection *section, void *opaque)
+{
+    const hwaddr size = int128_get64(section->size);
+    hwaddr offset = section->offset_within_region;
+    RAMBlock *block = section->mr->ram_block;
+
+    populate_range(block, offset, size);
+    return 0;
+}
+
 /*
  * ram_block_populate_pages: populate memory in the RAM block by reading
  *   an integer from the beginning of each page.
@@ -1621,16 +1643,31 @@ out:
  *
  * @block: RAM block to populate
  */
-static void ram_block_populate_pages(RAMBlock *block)
+static void ram_block_populate_pages(RAMBlock *rb)
 {
-    char *ptr = (char *) block->host;
-
-    for (ram_addr_t offset = 0; offset < block->used_length;
-            offset += qemu_real_host_page_size) {
-        char tmp = *(ptr + offset);
+    /*
+     * Skip populating all pages that fall into a discarded range as managed by
+     * a RamDiscardManager responsible for the mapped memory region of the
+     * RAMBlock. Such discarded ("logically unplugged") parts of a RAMBlock
+     * must not get populated automatically. We don't have to track
+     * modifications via userfaultfd WP reliably, because these pages will
+     * not be part of the migration stream either way -- see
+     * ramblock_dirty_bitmap_exclude_discarded_pages().
+     *
+     * Note: The result is only stable while migration (precopy/postcopy).
+     */
+    if (rb->mr && memory_region_has_ram_discard_manager(rb->mr)) {
+        RamDiscardManager *rdm = memory_region_get_ram_discard_manager(rb->mr);
+        MemoryRegionSection section = {
+            .mr = rb->mr,
+            .offset_within_region = 0,
+            .size = rb->mr->size,
+        };
 
-        /* Don't optimize the read out */
-        asm volatile("" : "+r" (tmp));
+        ram_discard_manager_replay_populated(rdm, &section,
+                                             populate_section, NULL);
+    } else {
+        populate_range(rb, 0, qemu_ram_get_used_length(rb));
     }
 }
 
-- 
2.31.1




* Re: [PATCH v2 0/6] migration/ram: Optimize for virtio-mem via RamDiscardManager
  2021-07-21  9:27 [PATCH v2 0/6] migration/ram: Optimize for virtio-mem via RamDiscardManager David Hildenbrand
                   ` (5 preceding siblings ...)
  2021-07-21  9:27 ` [PATCH v2 6/6] migration/ram: Handle RAMBlocks with a RamDiscardManager on background snapshots David Hildenbrand
@ 2021-07-22 11:29 ` Dr. David Alan Gilbert
  2021-07-22 11:43   ` David Hildenbrand
  6 siblings, 1 reply; 40+ messages in thread
From: Dr. David Alan Gilbert @ 2021-07-22 11:29 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Eduardo Habkost, Juan Quintela, Pankaj Gupta, Michael S. Tsirkin,
	qemu-devel, Peter Xu, Marek Kedzierski, Alex Williamson,
	teawater, Paolo Bonzini, Andrey Gruzdev, Wei Yang

* David Hildenbrand (david@redhat.com) wrote:
> virtio-mem exposes a dynamic amount of memory within RAMBlocks by
> coordinating with the VM. Memory within a RAMBlock can either get
> plugged and consequently used by the VM, or unplugged and consequently no
> longer used by the VM. Logical unplug is realized by discarding the
> physical memory backing for virtual memory ranges, similar to memory
> ballooning.
> 
> However, important difference to virtio-balloon are:
> 
> a) A virtio-mem device only operates on its assigned memory region /
>    RAMBlock ("device memory")
> b) Initially, all device memory is logically unplugged
> c) Virtual machines will never accidentally reuse memory that is currently
>    logically unplugged. The spec defines most accesses to unplugged memory
>    as "undefined behavior" -- except reading unplugged memory, which is
>    currently expected to work, but that will change in the future.
> d) The (un)plug granularity is in the range of megabytes -- "memory blocks"
> e) The state (plugged/unplugged) of a memory block is always known and
>    properly tracked.
> 
> Whenever memory blocks within the RAMBlock get (un)plugged, changes are
> communicated via the RamDiscardManager to other QEMU subsystems, most
> prominently vfio which updates the DMA mapping accordingly. "Unplugging"
> corresponds to "discarding" and "plugging" corresponds to "populating".
> 
> While migrating (precopy/postcopy) that state of such memory blocks cannot
> change.

So no plugging/unplugging can happen during the migration?

> We never ever want to migrate such logically unplugged memory,
> because it can result in an unintended memory consumption both, on the
> source (when reading memory from some memory backends) and on the
> destination (when writing memory). Further, migration time can be heavily
> reduced when skipping logically unplugged blocks and we avoid populating
> unnecessary page tables in Linux.
> 
> Right now, virtio-mem reuses the free page hinting infrastructure during
> precopy to exclude all logically unplugged ("discarded") parts from the
> migration stream. However, there are some scenarios that are not handled
> properly and need fixing. Further, there are some ugly corner cases in
> postcopy code and background snapshotting code that similarly have to
> handle such special RAMBlocks.
> 
> Let's reuse the RamDiscardManager infrastructure to essentially handle
> precopy, postcopy and background snapshots cleanly, which means:
> 
> a) In precopy code, always clearing all dirty bits from the bitmap that
>    correspond to discarded range, whenever we update the dirty bitmap. This
>    results in logically unplugged memory to never get migrated.

Have you seen cases where discarded areas are being marked as dirty?
That suggests something somewhere is writing to them and shouldn't be.

Dave

> b) In postcopy code, placing a zeropage when requested to handle a page
>    falling into a discarded range -- because the source will never send it.
> c) In background snapshot code, never populating discarded ranges, not even
>    with the shared zeropage, to avoid unintended memory consumption,
>    especially in the future with hugetlb and shmem.
> 
> Detail: When realizing a virtio-mem devices, it will register the RAM
>         for migration via vmstate_register_ram(). Further, it will
>         set itself as the RamDiscardManager for the corresponding memory
>         region of the RAMBlock via memory_region_set_ram_discard_manager().
>         Last but not least, memory device code will actually map the
>         memory region into guest physical address space. So migration
>         code can always properly identify such RAMBlocks.
> 
> Tested with precopy/postcopy on shmem, where even reading unpopulated
> memory ranges will populate actual memory and not the shared zeropage.
> Tested with background snapshots on anonymous memory, because other
> backends are not supported yet with upstream Linux.
> 
> Idealy, this should all go via the migration tree.
> 
> v1 -> v2:
> - "migration/ram: Handle RAMBlocks with a RamDiscardManager on the
>    migration source"
> -- Added a note how it interacts with the clear_bmap and what we might want
>    to further optimize in the future when synchronizing bitmaps.
> 
> Cc: "Michael S. Tsirkin" <mst@redhat.com>
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Cc: Juan Quintela <quintela@redhat.com>
> Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
> Cc: Eduardo Habkost <ehabkost@redhat.com>
> Cc: Peter Xu <peterx@redhat.com>
> Cc: Andrey Gruzdev <andrey.gruzdev@virtuozzo.com>
> Cc: Marek Kedzierski <mkedzier@redhat.com>
> Cc: Wei Yang <richard.weiyang@linux.alibaba.com>
> Cc: teawater <teawaterz@linux.alibaba.com>
> Cc: Alex Williamson <alex.williamson@redhat.com>
> Cc: Pankaj Gupta <pankaj.gupta@cloud.ionos.com>
> 
> David Hildenbrand (6):
>   memory: Introduce replay_discarded callback for RamDiscardManager
>   virtio-mem: Implement replay_discarded RamDiscardManager callback
>   migration/ram: Handle RAMBlocks with a RamDiscardManager on the
>     migration source
>   virtio-mem: Drop precopy notifier
>   migration/postcopy: Handle RAMBlocks with a RamDiscardManager on the
>     destination
>   migration/ram: Handle RAMBlocks with a RamDiscardManager on background
>     snapshots
> 
>  hw/virtio/virtio-mem.c         |  92 +++++++++++++--------
>  include/exec/memory.h          |  21 +++++
>  include/hw/virtio/virtio-mem.h |   3 -
>  migration/postcopy-ram.c       |  25 +++++-
>  migration/ram.c                | 147 ++++++++++++++++++++++++++++++---
>  migration/ram.h                |   1 +
>  softmmu/memory.c               |  11 +++
>  7 files changed, 246 insertions(+), 54 deletions(-)
> 
> -- 
> 2.31.1
> 
-- 
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK




* Re: [PATCH v2 0/6] migration/ram: Optimize for virtio-mem via RamDiscardManager
  2021-07-22 11:29 ` [PATCH v2 0/6] migration/ram: Optimize for virtio-mem via RamDiscardManager Dr. David Alan Gilbert
@ 2021-07-22 11:43   ` David Hildenbrand
  2021-07-23 16:12     ` Peter Xu
  0 siblings, 1 reply; 40+ messages in thread
From: David Hildenbrand @ 2021-07-22 11:43 UTC (permalink / raw)
  To: Dr. David Alan Gilbert
  Cc: Eduardo Habkost, Juan Quintela, Pankaj Gupta, Michael S. Tsirkin,
	qemu-devel, Peter Xu, Marek Kedzierski, Alex Williamson,
	teawater, Paolo Bonzini, Andrey Gruzdev, Wei Yang

On 22.07.21 13:29, Dr. David Alan Gilbert wrote:
> * David Hildenbrand (david@redhat.com) wrote:
>> virtio-mem exposes a dynamic amount of memory within RAMBlocks by
>> coordinating with the VM. Memory within a RAMBlock can either get
>> plugged and consequently used by the VM, or unplugged and consequently no
>> longer used by the VM. Logical unplug is realized by discarding the
>> physical memory backing for virtual memory ranges, similar to memory
>> ballooning.
>>
>> However, important difference to virtio-balloon are:
>>
>> a) A virtio-mem device only operates on its assigned memory region /
>>     RAMBlock ("device memory")
>> b) Initially, all device memory is logically unplugged
>> c) Virtual machines will never accidentally reuse memory that is currently
>>     logically unplugged. The spec defines most accesses to unplugged memory
>>     as "undefined behavior" -- except reading unplugged memory, which is
>>     currently expected to work, but that will change in the future.
>> d) The (un)plug granularity is in the range of megabytes -- "memory blocks"
>> e) The state (plugged/unplugged) of a memory block is always known and
>>     properly tracked.
>>
>> Whenever memory blocks within the RAMBlock get (un)plugged, changes are
>> communicated via the RamDiscardManager to other QEMU subsystems, most
>> prominently vfio which updates the DMA mapping accordingly. "Unplugging"
>> corresponds to "discarding" and "plugging" corresponds to "populating".
>>
>> While migrating (precopy/postcopy) that state of such memory blocks cannot
>> change.
> 
> So no plugging/unplugging can happen during the migration?

Exactly:

static bool virtio_mem_is_busy(void)
{
     /*
      * Postcopy cannot handle concurrent discards and we don't want to migrate
      * pages on-demand with stale content when plugging new blocks.
      *
      * For precopy, we don't want unplugged blocks in our migration stream, and
      * when plugging new blocks, the page content might differ between source
      * and destination (observable by the guest when not initializing pages
      * after plugging them) until we're running on the destination (as we didn't
      * migrate these blocks when they were unplugged).
      */
     return migration_in_incoming_postcopy() || !migration_is_idle();
}

[...]

>>
>> Let's reuse the RamDiscardManager infrastructure to essentially handle
>> precopy, postcopy and background snapshots cleanly, which means:
>>
>> a) In precopy code, always clearing all dirty bits from the bitmap that
>>     correspond to discarded range, whenever we update the dirty bitmap. This
>>     results in logically unplugged memory to never get migrated.
> 
> Have you seen cases where discarded areas are being marked as dirty?
> That suggests something somewhere is writing to them and shouldn't be.

I have, due to sub-optimal clear_bmap handling that is to be sorted out by

https://lkml.kernel.org/r/20210722083055.23352-1-wei.w.wang@intel.com

The issue there is rather that initially dirty bits don't get cleared in
lower layers and keep popping up as dirty.

I did not try reproducing the issue of postcopy recovery code setting
discarded ranges dirty in the dirty bitmap, but from looking at the
code, it's pretty clear that it would happen.

Apart from that, nothing should dirty that memory. Of course,
malicious guests could trigger it for now, in which case we wouldn't catch it
and would migrate such pages with postcopy, because the final bitmap sync in
ram_postcopy_send_discard_bitmap() is performed without calling notifiers
right now.

-- 
Thanks,

David / dhildenb




* Re: [PATCH v2 0/6] migration/ram: Optimize for virtio-mem via RamDiscardManager
  2021-07-22 11:43   ` David Hildenbrand
@ 2021-07-23 16:12     ` Peter Xu
  2021-07-23 18:41       ` David Hildenbrand
  0 siblings, 1 reply; 40+ messages in thread
From: Peter Xu @ 2021-07-23 16:12 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Eduardo Habkost, Michael S. Tsirkin, Pankaj Gupta, Juan Quintela,
	teawater, Dr. David Alan Gilbert, qemu-devel, Alex Williamson,
	Marek Kedzierski, Paolo Bonzini, Andrey Gruzdev, Wei Yang

On Thu, Jul 22, 2021 at 01:43:41PM +0200, David Hildenbrand wrote:
> > > a) In precopy code, always clearing all dirty bits from the bitmap that
> > >     correspond to discarded range, whenever we update the dirty bitmap. This
> > >     results in logically unplugged memory to never get migrated.
> > 
> > Have you seen cases where discarded areas are being marked as dirty?
> > That suggests something somewhere is writing to them and shouldn't be.
> 
> I have due to sub-optimal clear_bmap handling to be sorted out by
> 
> https://lkml.kernel.org/r/20210722083055.23352-1-wei.w.wang@intel.com
> 
> Whereby the issue is rather that initially dirty bits don't get cleared in
> lower layers and keep popping up as dirty.
> 
> The issue with postcopy recovery code setting discarded ranges dirty in
> the dirty bitmap, I did not try reproducing. But from looking at the
> code, it's pretty clear that it would happen.
> 
> Apart from that, nothing should dirty that memory. Of course,
> malicious guests could trigger it for now, in which case we wouldn't catch it
> and migrate such pages with postcopy, because the final bitmap sync in
> ram_postcopy_send_discard_bitmap() is performed without calling notifiers
> right now.

I have the same concern as Dave: does it mean that we don't need to touch at
least ramblock_sync_dirty_bitmap in patch 3?

Doing that for bitmap init and postcopy recovery looks right.

One other trivial comment: instead of touching up ram_dirty_bitmap_reload(),
IMHO it's simpler to set all 1's for discarded memory in the dst receivedmap;
imagine multiple postcopy recoveries happened, then with that we walk the
discarded memory list only once per migration.  Not a big deal, though.
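
Something along these lines (rough, untested sketch; the helper name is made
up, and it would be replayed once over all discarded parts via
ram_discard_manager_replay_discarded() on the destination):

static void set_discarded_section_received(MemoryRegionSection *section,
                                           void *opaque)
{
    RAMBlock *rb = section->mr->ram_block;
    void *host = (char *)rb->host + section->offset_within_region;
    size_t npages = int128_get64(section->size) >> TARGET_PAGE_BITS;

    /* Pretend discarded pages were already received, once and for all. */
    ramblock_recv_bitmap_set_range(rb, host, npages);
}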

-- 
Peter Xu




* Re: [PATCH v2 5/6] migration/postcopy: Handle RAMBlocks with a RamDiscardManager on the destination
  2021-07-21  9:27 ` [PATCH v2 5/6] migration/postcopy: Handle RAMBlocks with a RamDiscardManager on the destination David Hildenbrand
@ 2021-07-23 16:34   ` Peter Xu
  2021-07-23 18:36     ` David Hildenbrand
  0 siblings, 1 reply; 40+ messages in thread
From: Peter Xu @ 2021-07-23 16:34 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Eduardo Habkost, Juan Quintela, Pankaj Gupta, Michael S. Tsirkin,
	teawater, qemu-devel, Dr. David Alan Gilbert, Alex Williamson,
	Marek Kedzierski, Paolo Bonzini, Andrey Gruzdev, Wei Yang

On Wed, Jul 21, 2021 at 11:27:58AM +0200, David Hildenbrand wrote:
> Currently, when someone (i.e., the VM) accesses discarded parts inside a
> RAMBlock with a RamDiscardManager managing the corresponding mapped memory
> region, postcopy will request migration of the corresponding page from the
> source. The source, however, will never answer, because it refuses to
> migrate such pages with undefined content ("logically unplugged"): the
> pages are never dirty, and get_queued_page() will consequently skip
> processing these postcopy requests.
> 
> Especially reading discarded ("logically unplugged") ranges is supposed to
> work in some setups (for example with current virtio-mem), although it
> barely ever happens: still, not placing a page would currently stall the
> VM, as it cannot make forward progress.
> 
> Let's check the state via the RamDiscardManager (the state e.g.,
> of virtio-mem is migrated during precopy) and avoid sending a request
> that will never get answered. Place a fresh zero page instead to keep
> the VM working. This is the same behavior that would happen
> automatically without userfaultfd being active, when accessing virtual
> memory regions without populated pages -- "populate on demand".
> 
> Signed-off-by: David Hildenbrand <david@redhat.com>
> ---
>  migration/postcopy-ram.c | 25 +++++++++++++++++++++----
>  migration/ram.c          | 23 +++++++++++++++++++++++
>  migration/ram.h          |  1 +
>  3 files changed, 45 insertions(+), 4 deletions(-)
> 
> diff --git a/migration/postcopy-ram.c b/migration/postcopy-ram.c
> index 2e9697bdd2..673f60ce2b 100644
> --- a/migration/postcopy-ram.c
> +++ b/migration/postcopy-ram.c
> @@ -671,6 +671,23 @@ int postcopy_wake_shared(struct PostCopyFD *pcfd,
>      return ret;
>  }
>  
> +static int postcopy_request_page(MigrationIncomingState *mis, RAMBlock *rb,
> +                                 ram_addr_t start, uint64_t haddr)
> +{
> +    /*
> +     * Discarded pages (via RamDiscardManager) are never migrated. On unlikely
> +     * access, place a zeropage, which will also set the relevant bits in the
> +     * recv_bitmap accordingly, so we won't try placing a zeropage twice.
> +     */
> +    if (ramblock_page_is_discarded(rb, start)) {
> +        bool received = ramblock_recv_bitmap_test_byte_offset(rb, start);

Will received be set in any case with the current code base?  I thought
virtio-mem forbids plug/unplug during the whole lifecycle of migration.

> +
> +        return received ? 0 : postcopy_place_page_zero(mis, (void *)haddr, rb);

(now we can fill up pages in two threads.. but looks thread-safe)

Meanwhile, if this is really not wanted, maybe it's worth an
error_report_once() so the admin could see something?

> +    }
> +
> +    return migrate_send_rp_req_pages(mis, rb, start, haddr);
> +}
> +
>  /*
>   * Callback from shared fault handlers to ask for a page,
>   * the page must be specified by a RAMBlock and an offset in that rb
> @@ -690,7 +707,7 @@ int postcopy_request_shared_page(struct PostCopyFD *pcfd, RAMBlock *rb,
>                                          qemu_ram_get_idstr(rb), rb_offset);
>          return postcopy_wake_shared(pcfd, client_addr, rb);
>      }
> -    migrate_send_rp_req_pages(mis, rb, aligned_rbo, client_addr);
> +    postcopy_request_page(mis, rb, aligned_rbo, client_addr);
>      return 0;
>  }
>  
> @@ -984,8 +1001,8 @@ retry:
>               * Send the request to the source - we want to request one
>               * of our host page sizes (which is >= TPS)
>               */
> -            ret = migrate_send_rp_req_pages(mis, rb, rb_offset,
> -                                            msg.arg.pagefault.address);
> +            ret = postcopy_request_page(mis, rb, rb_offset,
> +                                        msg.arg.pagefault.address);
>              if (ret) {
>                  /* May be network failure, try to wait for recovery */
>                  if (ret == -EIO && postcopy_pause_fault_thread(mis)) {
> @@ -993,7 +1010,7 @@ retry:
>                      goto retry;
>                  } else {
>                      /* This is a unavoidable fault */
> -                    error_report("%s: migrate_send_rp_req_pages() get %d",
> +                    error_report("%s: postcopy_request_page() get %d",
>                                   __func__, ret);
>                      break;
>                  }
> diff --git a/migration/ram.c b/migration/ram.c
> index 4bf74ae2e1..d7505f5368 100644
> --- a/migration/ram.c
> +++ b/migration/ram.c
> @@ -876,6 +876,29 @@ static uint64_t ramblock_dirty_bitmap_exclude_discarded_pages(RAMBlock *rb)
>      return cleared_bits;
>  }
>  
> +/*
> + * Check if a page falls into a discarded range as managed by a
> + * RamDiscardManager responsible for the mapped memory region of the RAMBlock.
> + * Pages inside discarded ranges are never migrated and postcopy has to
> + * place zeropages instead.
> + *
> + * Note: The result is only stable while migration (precopy/postcopy).
> + */
> +bool ramblock_page_is_discarded(RAMBlock *rb, ram_addr_t offset)
> +{
> +    if (rb->mr && memory_region_has_ram_discard_manager(rb->mr)) {
> +        RamDiscardManager *rdm = memory_region_get_ram_discard_manager(rb->mr);
> +        MemoryRegionSection section = {
> +            .mr = rb->mr,
> +            .offset_within_region = offset,
> +            .size = int128_get64(TARGET_PAGE_SIZE),

rb->page_size?

Although I think it should also work with TARGET_PAGE_SIZE in this specific
case, maybe it's still better to use what we have.

> +        };
> +
> +        return !ram_discard_manager_is_populated(rdm, &section);
> +    }
> +    return false;
> +}
> +
>  /* Called with RCU critical section */
>  static void ramblock_sync_dirty_bitmap(RAMState *rs, RAMBlock *rb)
>  {
> diff --git a/migration/ram.h b/migration/ram.h
> index 4833e9fd5b..6e7f12e556 100644
> --- a/migration/ram.h
> +++ b/migration/ram.h
> @@ -72,6 +72,7 @@ void ramblock_recv_bitmap_set_range(RAMBlock *rb, void *host_addr, size_t nr);
>  int64_t ramblock_recv_bitmap_send(QEMUFile *file,
>                                    const char *block_name);
>  int ram_dirty_bitmap_reload(MigrationState *s, RAMBlock *rb);
> +bool ramblock_page_is_discarded(RAMBlock *rb, ram_addr_t offset);
>  
>  /* ram cache */
>  int colo_init_ram_cache(void);
> -- 
> 2.31.1
> 

-- 
Peter Xu



^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v2 1/6] memory: Introduce replay_discarded callback for RamDiscardManager
  2021-07-21  9:27 ` [PATCH v2 1/6] memory: Introduce replay_discarded callback for RamDiscardManager David Hildenbrand
@ 2021-07-23 16:34   ` Peter Xu
  0 siblings, 0 replies; 40+ messages in thread
From: Peter Xu @ 2021-07-23 16:34 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Eduardo Habkost, Juan Quintela, Pankaj Gupta, Michael S. Tsirkin,
	teawater, qemu-devel, Dr. David Alan Gilbert, Alex Williamson,
	Marek Kedzierski, Paolo Bonzini, Andrey Gruzdev, Wei Yang

On Wed, Jul 21, 2021 at 11:27:54AM +0200, David Hildenbrand wrote:
> Introduce replay_discarded callback similar to our existing
> replay_populated callback, to be used by migration code to never migrate
> discarded memory.
> 
> Signed-off-by: David Hildenbrand <david@redhat.com>

Acked-by: Peter Xu <peterx@redhat.com>

-- 
Peter Xu



^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v2 2/6] virtio-mem: Implement replay_discarded RamDiscardManager callback
  2021-07-21  9:27 ` [PATCH v2 2/6] virtio-mem: Implement replay_discarded RamDiscardManager callback David Hildenbrand
@ 2021-07-23 16:34   ` Peter Xu
  0 siblings, 0 replies; 40+ messages in thread
From: Peter Xu @ 2021-07-23 16:34 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Eduardo Habkost, Juan Quintela, Pankaj Gupta, Michael S. Tsirkin,
	teawater, qemu-devel, Dr. David Alan Gilbert, Alex Williamson,
	Marek Kedzierski, Paolo Bonzini, Andrey Gruzdev, Wei Yang

On Wed, Jul 21, 2021 at 11:27:55AM +0200, David Hildenbrand wrote:
> Implement it similarly to the replay_populated callback.
> 
> Signed-off-by: David Hildenbrand <david@redhat.com>

Acked-by: Peter Xu <peterx@redhat.com>

-- 
Peter Xu



^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v2 4/6] virtio-mem: Drop precopy notifier
  2021-07-21  9:27 ` [PATCH v2 4/6] virtio-mem: Drop precopy notifier David Hildenbrand
@ 2021-07-23 16:34   ` Peter Xu
  0 siblings, 0 replies; 40+ messages in thread
From: Peter Xu @ 2021-07-23 16:34 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Eduardo Habkost, Juan Quintela, Pankaj Gupta, Michael S. Tsirkin,
	teawater, qemu-devel, Dr. David Alan Gilbert, Alex Williamson,
	Marek Kedzierski, Paolo Bonzini, Andrey Gruzdev, Wei Yang

On Wed, Jul 21, 2021 at 11:27:57AM +0200, David Hildenbrand wrote:
> Migration code now properly handles RAMBlocks which are indirectly managed
> by a RamDiscardManager. No need for manual handling via the free page
> optimization interface, let's get rid of it.
> 
> Acked-by: Michael S. Tsirkin <mst@redhat.com>
> Signed-off-by: David Hildenbrand <david@redhat.com>

Acked-by: Peter Xu <peterx@redhat.com>

-- 
Peter Xu



^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v2 6/6] migration/ram: Handle RAMBlocks with a RamDiscardManager on background snapshots
  2021-07-21  9:27 ` [PATCH v2 6/6] migration/ram: Handle RAMBlocks with a RamDiscardManager on background snapshots David Hildenbrand
@ 2021-07-23 16:37   ` Peter Xu
  0 siblings, 0 replies; 40+ messages in thread
From: Peter Xu @ 2021-07-23 16:37 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Eduardo Habkost, Juan Quintela, Pankaj Gupta, Michael S. Tsirkin,
	teawater, qemu-devel, Dr. David Alan Gilbert, Alex Williamson,
	Marek Kedzierski, Paolo Bonzini, Andrey Gruzdev, Wei Yang

On Wed, Jul 21, 2021 at 11:27:59AM +0200, David Hildenbrand wrote:
> We already don't ever migrate memory that corresponds to discarded ranges
> as managed by a RamDiscardManager responsible for the mapped memory region
> of the RAMBlock.
> 
> virtio-mem uses this mechanism to logically unplug parts of a RAMBlock.
> Right now, we still populate zeropages for the whole usable part of the
> RAMBlock, which is undesired because:
> 
> 1. Even populating the shared zeropage will result in memory getting
>    consumed for page tables.
> 2. Memory backends without a shared zeropage (like hugetlbfs and shmem)
>    will populate an actual, fresh page, resulting in an unintended
>    memory consumption.
> 
> Discarded ("logically unplugged") parts have to remain discarded. As
> these pages are never part of the migration stream, there is no need to
> track modifications via userfaultfd WP reliably for these parts.
> 
> Further, any writes to these ranges by the VM are invalid and the
> behavior is undefined.
> 
> Note that Linux only supports userfaultfd WP on private anonymous memory
> for now, which usually results in the shared zeropage getting populated.
> The issue will become more relevant once userfaultfd WP supports shmem
> and hugetlb.
> 
> Signed-off-by: David Hildenbrand <david@redhat.com>

Acked-by: Peter Xu <peterx@redhat.com>

-- 
Peter Xu



^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v2 5/6] migration/postcopy: Handle RAMBlocks with a RamDiscardManager on the destination
  2021-07-23 16:34   ` Peter Xu
@ 2021-07-23 18:36     ` David Hildenbrand
  2021-07-23 18:52       ` Peter Xu
  0 siblings, 1 reply; 40+ messages in thread
From: David Hildenbrand @ 2021-07-23 18:36 UTC (permalink / raw)
  To: Peter Xu
  Cc: Eduardo Habkost, Juan Quintela, Pankaj Gupta, Michael S. Tsirkin,
	teawater, qemu-devel, Dr. David Alan Gilbert, Alex Williamson,
	Marek Kedzierski, Paolo Bonzini, Andrey Gruzdev, Wei Yang

>> +static int postcopy_request_page(MigrationIncomingState *mis, RAMBlock *rb,
>> +                                 ram_addr_t start, uint64_t haddr)
>> +{
>> +    /*
>> +     * Discarded pages (via RamDiscardManager) are never migrated. On unlikely
>> +     * access, place a zeropage, which will also set the relevant bits in the
>> +     * recv_bitmap accordingly, so we won't try placing a zeropage twice.
>> +     */
>> +    if (ramblock_page_is_discarded(rb, start)) {
>> +        bool received = ramblock_recv_bitmap_test_byte_offset(rb, start);
> 
> Will received be set for any case with the current code base?  As I thought
> virtio-mem forbids plug/unplug during the whole lifecycle of migration.

received would only be set if you have two CPUs faulting on the same 
address at the same time and the first one already placed a zeropage on 
this code path (as the comment said, that will implicitly set it in the 
receivedmap).

So, pretty unlikely to happen, but if the stars align ... :)

> 
>> +
>> +        return received ? 0 : postcopy_place_page_zero(mis, (void *)haddr, rb);
> 
> (now we can fill up pages in two threads.. but looks thread-safe)
> 
> Meanwhile if this is highly not wanted, maybe worth an error_report_once() so
> the admin could see something?


You mean, if postcopy_place_page_zero() fails?

[...]

>>   
>> +/*
>> + * Check if a page falls into a discarded range as managed by a
>> + * RamDiscardManager responsible for the mapped memory region of the RAMBlock.
>> + * Pages inside discarded ranges are never migrated and postcopy has to
>> + * place zeropages instead.
>> + *
>> + * Note: The result is only stable while migration (precopy/postcopy).
>> + */
>> +bool ramblock_page_is_discarded(RAMBlock *rb, ram_addr_t offset)
>> +{
>> +    if (rb->mr && memory_region_has_ram_discard_manager(rb->mr)) {
>> +        RamDiscardManager *rdm = memory_region_get_ram_discard_manager(rb->mr);
>> +        MemoryRegionSection section = {
>> +            .mr = rb->mr,
>> +            .offset_within_region = offset,
>> +            .size = int128_get64(TARGET_PAGE_SIZE),
> 
> rb->page_size?
> 
> Although I think it should also work with TARGET_PAGE_SIZE in this specific
> case, maybe it's still better to use what we have.


If the whole rb->page_size chunk is discarded, the TARGET_PAGE_SIZE chunk 
within it is certainly discarded as well (as TARGET_PAGE_SIZE <= rb->page_size).

But yes, sounds cleaner.

Thanks Peter!

-- 
Thanks,

David / dhildenb



^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v2 0/6] migration/ram: Optimize for virtio-mem via RamDiscardManager
  2021-07-23 16:12     ` Peter Xu
@ 2021-07-23 18:41       ` David Hildenbrand
  2021-07-23 22:19         ` Peter Xu
  0 siblings, 1 reply; 40+ messages in thread
From: David Hildenbrand @ 2021-07-23 18:41 UTC (permalink / raw)
  To: Peter Xu
  Cc: Eduardo Habkost, Michael S. Tsirkin, Pankaj Gupta, Juan Quintela,
	teawater, Dr. David Alan Gilbert, qemu-devel, Alex Williamson,
	Marek Kedzierski, Paolo Bonzini, Andrey Gruzdev, Wei Yang

On 23.07.21 18:12, Peter Xu wrote:
> On Thu, Jul 22, 2021 at 01:43:41PM +0200, David Hildenbrand wrote:
>>>> a) In precopy code, always clearing all dirty bits from the bitmap that
>>>>      correspond to discarded range, whenever we update the dirty bitmap. This
>>>>      results in logically unplugged memory to never get migrated.
>>>
>>> Have you seen cases where discarded areas are being marked as dirty?
>>> That suggests something somewhere is writing to them and shouldn't be.
>>
>> I have due to sub-optimal clear_bmap handling to be sorted out by
>>
>> https://lkml.kernel.org/r/20210722083055.23352-1-wei.w.wang@intel.com
>>
>> Whereby the issue is rather that initially dirty bits don't get cleared in
>> lower layers and keep popping up as dirty.
>>
>> The issue with postcopy recovery code setting discarded ranges dirty in
>> the dirty bitmap, I did not try reproducing. But from looking at the
>> code, it's pretty clear that it would happen.
>>
>> Apart from that, nothing should dirty that memory. Of course,
>> malicious guests could trigger it for now, in which case we wouldn't catch it
>> and migrate such pages with postcopy, because the final bitmap sync in
>> ram_postcopy_send_discard_bitmap() is performed without calling notifiers
>> right now.
> 
> I have the same concern with Dave: does it mean that we don't need to touch at
> least ramblock_sync_dirty_bitmap in patch 3?

Yes, see the comment in patch #3:

"
Note: If discarded ranges span complete clear_bmap chunks, we'll never
clear the corresponding bits from clear_bmap and consequently never call
memory_region_clear_dirty_bitmap on the affected regions. While this is
perfectly fine, we're still synchronizing the bitmap of discarded ranges,
for example, in
ramblock_sync_dirty_bitmap()->cpu_physical_memory_sync_dirty_bitmap()
but also during memory_global_dirty_log_sync().

In the future, it might make sense to never even synchronize the dirty 
log of these ranges, for example in KVM code, skipping discarded ranges
completely.
"

The KVM path might be even more interesting (with !dirty ring IIRC).

So that might certainly be worth looking into if we find it to be a real 
performance problem.

> 
> Doing that for bitmap init and postcopy recovery looks right.
> 
> One other trivial comment is instead of touching up ram_dirty_bitmap_reload(),
> IMHO it's simpler to set all 1's to discarded memories on dst receivedmap;
> imagine multiple postcopy recovery happened, then with that we walk the discarded
> memory list only once for each migration.  Not a big deal, though.

Right, but I decided to reuse 
ramblock_dirty_bitmap_exclude_discarded_pages() such that I can avoid 
yet another helper.

-- 
Thanks,

David / dhildenb



^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v2 5/6] migration/postcopy: Handle RAMBlocks with a RamDiscardManager on the destination
  2021-07-23 18:36     ` David Hildenbrand
@ 2021-07-23 18:52       ` Peter Xu
  2021-07-23 19:01         ` David Hildenbrand
  0 siblings, 1 reply; 40+ messages in thread
From: Peter Xu @ 2021-07-23 18:52 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Eduardo Habkost, Juan Quintela, Pankaj Gupta, Michael S. Tsirkin,
	teawater, qemu-devel, Dr. David Alan Gilbert, Alex Williamson,
	Marek Kedzierski, Paolo Bonzini, Andrey Gruzdev, Wei Yang

On Fri, Jul 23, 2021 at 08:36:32PM +0200, David Hildenbrand wrote:
> > > +static int postcopy_request_page(MigrationIncomingState *mis, RAMBlock *rb,
> > > +                                 ram_addr_t start, uint64_t haddr)
> > > +{
> > > +    /*
> > > +     * Discarded pages (via RamDiscardManager) are never migrated. On unlikely
> > > +     * access, place a zeropage, which will also set the relevant bits in the
> > > +     * recv_bitmap accordingly, so we won't try placing a zeropage twice.
> > > +     */
> > > +    if (ramblock_page_is_discarded(rb, start)) {
> > > +        bool received = ramblock_recv_bitmap_test_byte_offset(rb, start);
> > 
> > Will received be set for any case with the current code base?  As I thought
> > virtio-mem forbids plug/unplug during the whole lifecycle of migration.
> 
> received would only be set if you have two CPUs faulting on the same address
> at the same time and the first one already placed a zeropage on this code
> path (as the comment said, that will implicitly set it in the receivedmap).

Ah I see; or just ignore the error of postcopy_place_page_zero() here because
per my understanding this whole path is not expected after all.

> 
> So, pretty unlikely to happen, but if the stars align ... :)
> 
> > 
> > > +
> > > +        return received ? 0 : postcopy_place_page_zero(mis, (void *)haddr, rb);
> > 
> > (now we can fill up pages in two threads.. but looks thread-safe)
> > 
> > Meanwhile if this is highly not wanted, maybe worth an error_report_once() so
> > the admin could see something?
> 
> 
> You mean, if postcopy_place_page_zero() fails?

I meant ramblock_page_is_discarded() shouldn't really trigger for a sane guest,
right?  Because it means the guest is accessing some unplugged memory.

I wanted to give a heads-up to the admin so he/she knows there's something odd.
Also that can be a hint for debugging if it's hit for any unknown reasons.

Thanks,

-- 
Peter Xu



^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v2 5/6] migration/postcopy: Handle RAMBlocks with a RamDiscardManager on the destination
  2021-07-23 18:52       ` Peter Xu
@ 2021-07-23 19:01         ` David Hildenbrand
  2021-07-23 22:10           ` Peter Xu
  0 siblings, 1 reply; 40+ messages in thread
From: David Hildenbrand @ 2021-07-23 19:01 UTC (permalink / raw)
  To: Peter Xu
  Cc: Eduardo Habkost, Juan Quintela, Pankaj Gupta, Michael S. Tsirkin,
	teawater, qemu-devel, Dr. David Alan Gilbert, Alex Williamson,
	Marek Kedzierski, Paolo Bonzini, Andrey Gruzdev, Wei Yang

On 23.07.21 20:52, Peter Xu wrote:
> On Fri, Jul 23, 2021 at 08:36:32PM +0200, David Hildenbrand wrote:
>>>> +static int postcopy_request_page(MigrationIncomingState *mis, RAMBlock *rb,
>>>> +                                 ram_addr_t start, uint64_t haddr)
>>>> +{
>>>> +    /*
>>>> +     * Discarded pages (via RamDiscardManager) are never migrated. On unlikely
>>>> +     * access, place a zeropage, which will also set the relevant bits in the
>>>> +     * recv_bitmap accordingly, so we won't try placing a zeropage twice.
>>>> +     */
>>>> +    if (ramblock_page_is_discarded(rb, start)) {
>>>> +        bool received = ramblock_recv_bitmap_test_byte_offset(rb, start);
>>>
>>> Will received be set for any case with the current code base?  As I thought
>>> virtio-mem forbids plug/unplug during the whole lifecycle of migration.
>>
>> received would only be set if you have two CPUs faulting on the same address
>> at the same time and the first one already placed a zeropage on this code
>> path (as the comment said, that will implicitly set it in the receivedmap).
> 
> Ah I see; or just ignore the error of postcopy_place_page_zero() here because
> per my understanding this whole path is not expected after all.

See below: if placing went wrong in this corner case, I think we 
would still want to know about it instead of letting a guest thread make 
no progress because nobody would wake it up.

Does that make sense?

> 
>>
>> So, pretty unlikely to happen, but if the stars align ... :)
>>
>>>
>>>> +
>>>> +        return received ? 0 : postcopy_place_page_zero(mis, (void *)haddr, rb);
>>>
>>> (now we can fill up pages in two threads.. but looks thread-safe)
>>>
>>> Meanwhile if this is highly not wanted, maybe worth an error_report_once() so
>>> the admin could see something?
>>
>>
>> You mean, if postcopy_place_page_zero() fails?
> 
> I meant ramblock_page_is_discarded() shouldn't really trigger for a sane guest,
> right?  Because it means the guest is accessing some unplugged memory.

It can happen in corner cases and is valid: with the current virtio-mem 
spec, guests are allowed to read unplugged memory. This will, for 
example, happen on older Linux guests when reading /proc/kcore or (with 
even older guests) when dumping guest memory via kdump. These corner 
cases were the main reason why the spec allows for it -- until we have 
guests properly adjusted such that it won't happen even in corner cases.

A future feature bit will disallow it for the guest: required for 
supporting shmem/hugetlb cleanly. With that in place, I agree that we 
would want to warn in this case!

-- 
Thanks,

David / dhildenb



^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v2 5/6] migration/postcopy: Handle RAMBlocks with a RamDiscardManager on the destination
  2021-07-23 19:01         ` David Hildenbrand
@ 2021-07-23 22:10           ` Peter Xu
  2021-07-29 12:14             ` David Hildenbrand
  0 siblings, 1 reply; 40+ messages in thread
From: Peter Xu @ 2021-07-23 22:10 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Eduardo Habkost, Juan Quintela, Pankaj Gupta, Michael S. Tsirkin,
	teawater, qemu-devel, Dr. David Alan Gilbert, Alex Williamson,
	Marek Kedzierski, Paolo Bonzini, Andrey Gruzdev, Wei Yang

On Fri, Jul 23, 2021 at 09:01:42PM +0200, David Hildenbrand wrote:
> It can happen in corner cases and is valid: with the current virtio-mem
> spec, guests are allowed to read unplugged memory. This will, for example,
> happen on older Linux guests when reading /proc/kcore or (with even older
> guests) when dumping guest memory via kdump. These corner cases were the
> main reason why the spec allows for it -- until we have guests properly
> adjusted such that it won't happen even in corner cases.
> 
> A future feature bit will disallow it for the guest: required for supporting
> shmem/hugetlb cleanly. With that in place, I agree that we would want to
> warn in this case!

OK that makes sense; with the page_size change, feel free to add:

Reviewed-by: Peter Xu <peterx@redhat.com>

-- 
Peter Xu



^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v2 0/6] migration/ram: Optimize for virtio-mem via RamDiscardManager
  2021-07-23 18:41       ` David Hildenbrand
@ 2021-07-23 22:19         ` Peter Xu
  2021-07-27  9:25           ` David Hildenbrand
  0 siblings, 1 reply; 40+ messages in thread
From: Peter Xu @ 2021-07-23 22:19 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Eduardo Habkost, Michael S. Tsirkin, Pankaj Gupta, Juan Quintela,
	teawater, Dr. David Alan Gilbert, qemu-devel, Alex Williamson,
	Marek Kedzierski, Paolo Bonzini, Andrey Gruzdev, Wei Yang

On Fri, Jul 23, 2021 at 08:41:40PM +0200, David Hildenbrand wrote:
> On 23.07.21 18:12, Peter Xu wrote:
> > On Thu, Jul 22, 2021 at 01:43:41PM +0200, David Hildenbrand wrote:
> > > > > a) In precopy code, always clearing all dirty bits from the bitmap that
> > > > >      correspond to discarded range, whenever we update the dirty bitmap. This
> > > > >      results in logically unplugged memory to never get migrated.
> > > > 
> > > > Have you seen cases where discarded areas are being marked as dirty?
> > > > That suggests something somewhere is writing to them and shouldn't be.
> > > 
> > > I have due to sub-optimal clear_bmap handling to be sorted out by
> > > 
> > > https://lkml.kernel.org/r/20210722083055.23352-1-wei.w.wang@intel.com
> > > 
> > > Whereby the issue is rather that initially dirty bits don't get cleared in
> > > lower layers and keep popping up as dirty.
> > > 
> > > The issue with postcopy recovery code setting discarded ranges dirty in
> > > the dirty bitmap, I did not try reproducing. But from looking at the
> > > code, it's pretty clear that it would happen.
> > > 
> > > Apart from that, nothing should dirty that memory. Of course,
> > > malicious guests could trigger it for now, in which case we wouldn't catch it
> > > and migrate such pages with postcopy, because the final bitmap sync in
> > > ram_postcopy_send_discard_bitmap() is performed without calling notifiers
> > > right now.
> > 
> > I have the same concern with Dave: does it mean that we don't need to touch at
> > least ramblock_sync_dirty_bitmap in patch 3?
> 
> Yes, see the comment in patch #3:
> 
> "
> Note: If discarded ranges span complete clear_bmap chunks, we'll never
> clear the corresponding bits from clear_bmap and consequently never call
> memory_region_clear_dirty_bitmap on the affected regions. While this is
> perfectly fine, we're still synchronizing the bitmap of discarded ranges,
> for example, in
> ramblock_sync_dirty_bitmap()->cpu_physical_memory_sync_dirty_bitmap()
> but also during memory_global_dirty_log_sync().
> 
> In the future, it might make sense to never even synchronize the dirty log
> of these ranges, for example in KVM code, skipping discarded ranges
> completely.
> "
> 
> The KVM path might be even more interesting (with !dirty ring IIRC).
> 
> So that might certainly be worth looking into if we find it to be a real
> performance problem.

OK; hmm then I feel like what's missing is we didn't have the dirty bmap and
the clear map synced - say, what if we do memory_region_clear_dirty_bitmap()
when dropping the virtio-mem unplugged ranges too?

If discarded ranges are static during migration, the clear dirty log should
happen once for them at bitmap init time.  Then IIUC when we sync we don't need to
worry about unplugged memory anymore.

> 
> > 
> > Doing that for bitmap init and postcopy recovery looks right.
> > 
> > One other trivial comment is instead of touching up ram_dirty_bitmap_reload(),
> > IMHO it's simpler to set all 1's to discarded memories on dst receivedmap;
> > imagine multiple postcopy recovery happened, then with that we walk the discarded
> > memory list only once for each migration.  Not a big deal, though.
> 
> Right, but I decided to reuse
> ramblock_dirty_bitmap_exclude_discarded_pages() such that I can avoid yet
> another helper.

Yeah, that's okay.

Thanks,

-- 
Peter Xu



^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v2 0/6] migration/ram: Optimize for virtio-mem via RamDiscardManager
  2021-07-23 22:19         ` Peter Xu
@ 2021-07-27  9:25           ` David Hildenbrand
  2021-07-27 17:10             ` Peter Xu
  0 siblings, 1 reply; 40+ messages in thread
From: David Hildenbrand @ 2021-07-27  9:25 UTC (permalink / raw)
  To: Peter Xu
  Cc: Eduardo Habkost, Michael S. Tsirkin, Pankaj Gupta, Juan Quintela,
	teawater, Dr. David Alan Gilbert, qemu-devel, Alex Williamson,
	Marek Kedzierski, Paolo Bonzini, Andrey Gruzdev, Wei Yang

On 24.07.21 00:19, Peter Xu wrote:
> On Fri, Jul 23, 2021 at 08:41:40PM +0200, David Hildenbrand wrote:
>> On 23.07.21 18:12, Peter Xu wrote:
>>> On Thu, Jul 22, 2021 at 01:43:41PM +0200, David Hildenbrand wrote:
>>>>>> a) In precopy code, always clearing all dirty bits from the bitmap that
>>>>>>       correspond to discarded range, whenever we update the dirty bitmap. This
>>>>>>       results in logically unplugged memory to never get migrated.
>>>>>
>>>>> Have you seen cases where discarded areas are being marked as dirty?
>>>>> That suggests something somewhere is writing to them and shouldn't be.
>>>>
>>>> I have due to sub-optimal clear_bmap handling to be sorted out by
>>>>
>>>> https://lkml.kernel.org/r/20210722083055.23352-1-wei.w.wang@intel.com
>>>>
>>>> Whereby the issue is rather that initially dirty bits don't get cleared in
>>>> lower layers and keep popping up as dirty.
>>>>
>>>> The issue with postcopy recovery code setting discarded ranges dirty in
>>>> the dirty bitmap, I did not try reproducing. But from looking at the
>>>> code, it's pretty clear that it would happen.
>>>>
>>>> Apart from that, nothing should dirty that memory. Of course,
>>>> malicious guests could trigger it for now, in which case we wouldn't catch it
>>>> and migrate such pages with postcopy, because the final bitmap sync in
>>>> ram_postcopy_send_discard_bitmap() is performed without calling notifiers
>>>> right now.
>>>
>>> I have the same concern with Dave: does it mean that we don't need to touch at
>>> least ramblock_sync_dirty_bitmap in patch 3?
>>
>> Yes, see the comment in patch #3:
>>
>> "
>> Note: If discarded ranges span complete clear_bmap chunks, we'll never
>> clear the corresponding bits from clear_bmap and consequently never call
>> memory_region_clear_dirty_bitmap on the affected regions. While this is
>> perfectly fine, we're still synchronizing the bitmap of discarded ranges,
>> for example, in
>> ramblock_sync_dirty_bitmap()->cpu_physical_memory_sync_dirty_bitmap()
>> but also during memory_global_dirty_log_sync().
>>
>> In the future, it might make sense to never even synchronize the dirty log
>> of these ranges, for example in KVM code, skipping discarded ranges
>> completely.
>> "
>>
>> The KVM path might be even more interesting (with !dirty ring IIRC).
>>
>> So that might certainly be worth looking into if we find it to be a real
>> performance problem.
> 
> OK; hmm then I feel like what's missing is we didn't have the dirty bmap and
> the clear map synced - say, what if we do memory_region_clear_dirty_bitmap()
> when dropping the virtio-mem unplugged ranges too?

Is it a problem that we leave clear_bmap set and actually never clear 
some ranges? I don't think so. To me, this feels like the right thing to 
do: no need to clear something (in QEMU, in KVM) nobody cares about.

IMHO, the real optimization should be to not even sync discarded ranges 
(not from the accelerator, not from the memory region), skipping these 
ranges completely (no sync, no clear). With what you propose, we might 
end up calling into KVM to clear bitmaps of ranges we are not interested 
in, no?

> 
> If discarded ranges are static during migration, the clear dirty log should
> happen once for them at bitmap init time.  Then IIUC when we sync we don't need to
> worry about unplugged memory anymore.

Again, I'm not sure why we want to clear something we don't care about.


There are 3 cases to handle I think:

1) Initially, when the bitmap is set to 1, we want to exclude all 
discarded ranges.

2) Whenever we sync the bitmap, we don't want to get discarded ranges 
set dirty. (e.g., bits still or again dirty in KVM or the memory region)

3) When reloading the bitmap during postcopy errors.


I think for 1) and 3) we seem to agree that clearing the discarded 
ranges from the dirty bitmap is conceptually the right thing.


For 2) I see 3 options:


a) Sync everything, fixup the dirty bitmap, never clear the dirty log of 
discarded parts. It's fairly simple and straight forward, as I can 
simply reuse the existing helper. Something that's discarded will never 
be dirty, not even if a misbehaving guest touches memory it shouldn't. 
[this patch]

b) Sync only populated parts, no need to fixup the dirty bitmap, never 
clear the dirty log of discarded parts. It's a bit more complicated but 
achieves the same goal as a). [optimization I propose for the future]

c) Sync everything, don't fixup the dirty bitmap, clear the dirty log of 
discarded parts initially. There are ways we still might migrate 
discarded ranges, for example, if a misbehaving guest touches memory it 
shouldn't. [what you propose]


Is my understanding correct? Any reasons why we should choose c) over b) 
long term or c) over a) short term?
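
Just to make a) more concrete, it essentially boils down to the following
fixup after each bitmap sync (a simplified sketch only, reusing the helper
from this series; not the literal patch #3 hunk):

    /* Called with RCU critical section */
    static void ramblock_sync_dirty_bitmap(RAMState *rs, RAMBlock *rb)
    {
        uint64_t new_dirty_pages =
            cpu_physical_memory_sync_dirty_bitmap(rb, 0, rb->used_length);

        if (rb->mr && memory_region_has_ram_discard_manager(rb->mr)) {
            /*
             * Fixup: clear any dirty bits that fall into discarded ranges,
             * so logically unplugged memory never enters the migration
             * stream -- even if something wrongly dirtied it.
             */
            new_dirty_pages -=
                ramblock_dirty_bitmap_exclude_discarded_pages(rb);
        }

        rs->migration_dirty_pages += new_dirty_pages;
        rs->num_dirty_pages_period += new_dirty_pages;
    }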

Thanks!

-- 
Thanks,

David / dhildenb




^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v2 0/6] migration/ram: Optimize for virtio-mem via RamDiscardManager
  2021-07-27  9:25           ` David Hildenbrand
@ 2021-07-27 17:10             ` Peter Xu
  2021-07-28 17:39               ` David Hildenbrand
  0 siblings, 1 reply; 40+ messages in thread
From: Peter Xu @ 2021-07-27 17:10 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Eduardo Habkost, Michael S. Tsirkin, Pankaj Gupta, Juan Quintela,
	teawater, Dr. David Alan Gilbert, qemu-devel, Alex Williamson,
	Marek Kedzierski, Paolo Bonzini, Andrey Gruzdev, Wei Yang

On Tue, Jul 27, 2021 at 11:25:09AM +0200, David Hildenbrand wrote:
> For 2) I see 3 options:
> 
> a) Sync everything, fixup the dirty bitmap, never clear the dirty log of
> discarded parts. It's fairly simple and straight forward, as I can simply
> reuse the existing helper. Something that's discarded will never be dirty,
> not even if a misbehaving guest touches memory it shouldn't. [this patch]
> 
> b) Sync only populated parts, no need to fixup the dirty bitmap, never clear
> the dirty log of discarded parts. It's a bit more complicated but achieves
> the same goal as a). [optimization I propose for the future]
> 
> c) Sync everything, don't fixup the dirty bitmap, clear the dirty log of
> discarded parts initially. There are ways we still might migrate discarded
> ranges, for example, if a misbehaving guest touches memory it shouldn't.
> [what you propose]
> 
> Is my understanding correct? Any reasons why we should choose c) over b) long
> term or c) over a) short term?

My major concern is that we could do something during sync() for not a very good
reason by looping over the virtio-mem bitmap for discarded ranges - IIUC it should
be destined to be merely a no-op if the guest is well-behaved, am I right?

Meanwhile, I still have no idea how much overhead the "loop" part could bring.
For a large virtio-mem region with frequent plugged/unplugged mem interacted,
it seems possible to take a while to me..  I have no solid idea yet.

The thing is I still think this extra operation during sync() can be ignored by
simply clear dirty log during bitmap init, then.. why not? :)

Clear dirty bitmap is as simple as "reprotect the pages" functional-wise - if
they are unplugged memory ranges, and they shouldn't be written by the guest
(we still allow reads even for virtio-mem compatibility), then I don't see it
an issue to wr-protect it using clear dirty log when bitmap init.

It still makes sense to me to keep the dirty/clear bitmap in-sync, at least
before your plan b proposal; leaving the dirty bits set forever on unplugged
memory is okay but just sounds a bit weird.

Though my concern is only valid when virtio-mem is used, so I don't have a
strong opinion on it as you maintain virtio-mem. I believe you will always
have a better judgement than me on that. Especially, when/if Dave & Juan have
no problem on that. :)

Thanks,

-- 
Peter Xu



^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v2 0/6] migration/ram: Optimize for virtio-mem via RamDiscardManager
  2021-07-27 17:10             ` Peter Xu
@ 2021-07-28 17:39               ` David Hildenbrand
  2021-07-28 19:42                 ` Peter Xu
  0 siblings, 1 reply; 40+ messages in thread
From: David Hildenbrand @ 2021-07-28 17:39 UTC (permalink / raw)
  To: Peter Xu
  Cc: Eduardo Habkost, Michael S. Tsirkin, Pankaj Gupta, Juan Quintela,
	teawater, Dr. David Alan Gilbert, qemu-devel, Alex Williamson,
	Marek Kedzierski, Paolo Bonzini, Andrey Gruzdev, Wei Yang

On 27.07.21 19:10, Peter Xu wrote:
> On Tue, Jul 27, 2021 at 11:25:09AM +0200, David Hildenbrand wrote:
>> For 2) I see 3 options:
>>
>> a) Sync everything, fixup the dirty bitmap, never clear the dirty log of
>> discarded parts. It's fairly simple and straight forward, as I can simply
>> reuse the existing helper. Something that's discarded will never be dirty,
>> not even if a misbehaving guest touches memory it shouldn't. [this patch]
>>
>> b) Sync only populated parts, no need to fixup the dirty bitmap, never clear
>> the dirty log of discarded parts. It's a bit more complicated but achieves
>> the same goal as a). [optimization I propose for the future]
>>
>> c) Sync everything, don't fixup the dirty bitmap, clear the dirty log of
>> discarded parts initially. There are ways we still might migrate discarded
>> ranges, for example, if a misbehaving guest touches memory it shouldn't.
>> [what you propose]
>>
> > Is my understanding correct? Any reasons why we should choose c) over b) long
>> term or c) over a) short term?
> 
> My major concern is that we could do something during sync() for not a very good
> reason by looping over the virtio-mem bitmap for discarded ranges - IIUC it should
> be destined to be merely a no-op if the guest is well-behaved, am I right?

I think yes, as long as we have no initially-set bit stuck somewhere.

> 
> Meanwhile, I still have no idea how much overhead the "loop" part could bring.
> For a large virtio-mem region with frequent plugged/unplugged mem interacted,
> it seems possible to take a while to me..  I have no solid idea yet.

Let's do some math. Assume the worst case on a 1TiB device with a 2MiB 
block size: We have 524288 blocks == bits. That's precisely a 64k bitmap 
in virtio-mem. In the worst case, every second bit would be clear 
("discarded"). For each clear bit ("discarded"), we would have to clear 
512 bits (64 bytes) in the dirty bitmap. That's storing 32 MiB.

So scanning 64 KiB, writing 32 MiB. Certainly not perfect, but I am not 
sure if it will really matter doing that once on every bitmap sync. I 
guess the bitmap syncing itself is much more expensive -- and not 
syncing the discarded ranges (b ) above) would make a bigger impact I guess.
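
(A quick standalone sanity check of those numbers, assuming a 4 KiB target
page size:)

    #include <stdio.h>

    int main(void)
    {
        const unsigned long long dev   = 1ULL << 40; /* 1 TiB device       */
        const unsigned long long block = 2ULL << 20; /* 2 MiB vmem block   */
        const unsigned long long page  = 4096;       /* TARGET_PAGE_SIZE   */

        printf("virtio-mem blocks:    %llu\n", dev / block);              /* 524288 */
        printf("virtio-mem bitmap:    %llu KiB\n", dev / block / 8 >> 10); /* 64     */
        printf("dirty bits per block: %llu\n", block / page);             /* 512    */
        printf("whole dirty bitmap:   %llu MiB\n", dev / page / 8 >> 20);  /* 32     */
        return 0;
    }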

> 
> The thing is I still think this extra operation during sync() can be ignored by
> simply clear dirty log during bitmap init, then.. why not? :)

I guess clearing the dirty log (especially in KVM) might be more 
expensive. But, anyhow, we actually want b) long-term :)

> 
> Clear dirty bitmap is as simple as "reprotect the pages" functional-wise - if
> they are unplugged memory ranges, and they shouldn't be written by the guest
> (we still allow reads even for virtio-mem compatibility), then I don't see it
> an issue to wr-protect it using clear dirty log when bitmap init.
> 
> It still makes sense to me to keep the dirty/clear bitmap in-sync, at least
> before your plan b proposal; leaving the dirty bits set forever on unplugged
> memory is okay but just sounds a bit weird.
> 
> Though my concern is only valid when virtio-mem is used, so I don't have a
> strong opinion on it as you maintain virtio-mem. I believe you will always
> have a better judgement than me on that. Especially, when/if Dave & Juan have
> no problem on that. :)

I'd certainly sleep better at night if I can be 100% sure that a page 
not to be migrated will not get migrated. :)

I'll play with initial clearing and see how much of a difference it 
makes code wise. Thanks a lot for your feedback!

-- 
Thanks,

David / dhildenb



^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v2 0/6] migration/ram: Optimize for virtio-mem via RamDiscardManager
  2021-07-28 17:39               ` David Hildenbrand
@ 2021-07-28 19:42                 ` Peter Xu
  2021-07-28 19:46                   ` David Hildenbrand
  0 siblings, 1 reply; 40+ messages in thread
From: Peter Xu @ 2021-07-28 19:42 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Eduardo Habkost, Michael S. Tsirkin, Pankaj Gupta, Juan Quintela,
	teawater, Dr. David Alan Gilbert, qemu-devel, Alex Williamson,
	Marek Kedzierski, Paolo Bonzini, Andrey Gruzdev, Wei Yang

On Wed, Jul 28, 2021 at 07:39:39PM +0200, David Hildenbrand wrote:
> > Meanwhile, I still have no idea how much overhead the "loop" part could bring.
> > For a large virtio-mem region with frequent plugged/unplugged mem interacted,
> > it seems possible to take a while to me..  I have no solid idea yet.
> 
> Let's do some math. Assume the worst case on a 1TiB device with a 2MiB block
> size: We have 524288 blocks == bits. That's precisely a 64k bitmap in
> virtio-mem. In the worst case, every second bit would be clear
> ("discarded"). For each clear bit ("discarded"), we would have to clear 512
> bits (64 bytes) in the dirty bitmap. That's storing 32 MiB.
> 
> So scanning 64 KiB, writing 32 MiB. Certainly not perfect, but I am not sure
> if it will really matter doing that once on every bitmap sync. I guess the
> bitmap syncing itself is much more expensive -- and not syncing the
> discarded ranges (b ) above) would make a bigger impact I guess.

I'm not worried about the memory size to be accessed as bitmaps; it's more
about the loop itself.  500K blocks/bits means the cb() worst case can be
called 500K/2=250k times, no matter what the hook is doing.

But yeah that's the worst case thing and for a 1TB chunk, I agree that can also
be too harsh.  It's just that if it's very easy to be done in bitmap init then
still worth thinking about it.

> 
> > 
> > The thing is I still think this extra operation during sync() can be ignored by
> > simply clear dirty log during bitmap init, then.. why not? :)
> 
> I guess clearing the dirty log (especially in KVM) might be more expensive.

If we send one ioctl per cb that'll be expensive for sure.  I think it'll be
fine if we send one clear ioctl to kvm, summarizing the whole bitmap to clear.

The other thing is imho having overhead during bitmap init is always better
than having that during sync(). :)

> But, anyhow, we actually want b) long-term :)

Regarding the longterm plan - sorry to say that, but I still keep a skeptical
view.. :) 

You did mention that for 1tb memory we only have 32mb dirty bitmap, that's
actually not so huge.  That's why I'm not sure whether that complexity of plan
b would bring a lot of help (before I started to think about the interface of
it).  But I could be missing something.

> 
> > 
> > Clear dirty bitmap is as simple as "reprotect the pages" functional-wise - if
> > they are unplugged memory ranges, and they shouldn't be written by the guest
> > (we still allow reads even for virtio-mem compatibility), then I don't see it
> > an issue to wr-protect it using clear dirty log when bitmap init.
> > 
> > It still makes sense to me to keep the dirty/clear bitmap in-sync, at least
> > before your plan b proposal; leaving the dirty bits set forever on unplugged
> > memory is okay but just sounds a bit weird.
> > 
> > Though my concern is only valid when virtio-mem is used, so I don't have a
> > strong opinion on it as you maintain virtio-mem. I believe you will always
> > have a better judgement than me on that. Especially, when/if Dave & Juan have
> > no problem on that. :)
> 
> I'd certainly sleep better at night if I can be 100% sure that a page not to
> be migrated will not get migrated. :)
> 
> I'll play with initial clearing and see how much of a difference it makes
> code wise. Thanks a lot for your feedback!

Thanks!

-- 
Peter Xu



^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v2 0/6] migration/ram: Optimize for virtio-mem via RamDiscardManager
  2021-07-28 19:42                 ` Peter Xu
@ 2021-07-28 19:46                   ` David Hildenbrand
  2021-07-28 20:19                     ` Peter Xu
  0 siblings, 1 reply; 40+ messages in thread
From: David Hildenbrand @ 2021-07-28 19:46 UTC (permalink / raw)
  To: Peter Xu
  Cc: Eduardo Habkost, Michael S. Tsirkin, Pankaj Gupta, Juan Quintela,
	teawater, Dr. David Alan Gilbert, qemu-devel, Alex Williamson,
	Marek Kedzierski, Paolo Bonzini, Andrey Gruzdev, Wei Yang

On 28.07.21 21:42, Peter Xu wrote:
> On Wed, Jul 28, 2021 at 07:39:39PM +0200, David Hildenbrand wrote:
>>> Meanwhile, I still have no idea how much overhead the "loop" part could bring.
>>> For a large virtio-mem region with frequent plugged/unplugged mem interacted,
>>> it seems possible to take a while to me..  I have no solid idea yet.
>>
>> Let's do some math. Assume the worst case on a 1TiB device with a 2MiB block
>> size: We have 524288 blocks == bits. That's precisely a 64k bitmap in
>> virtio-mem. In the worst case, every second bit would be clear
>> ("discarded"). For each clear bit ("discarded"), we would have to clear 512
>> bits (64 bytes) in the dirty bitmap. That's storing 32 MiB.
>>
>> So scanning 64 KiB, writing 32 MiB. Certainly not perfect, but I am not sure
>> if it will really matter doing that once on every bitmap sync. I guess the
>> bitmap syncing itself is much more expensive -- and not syncing the
>> discarded ranges (b ) above) would make a bigger impact I guess.
> 
> I'm not worried about the memory size to be accessed as bitmaps; it's more
> about the loop itself.  500K blocks/bits means the cb() worst case can be
> called 500K/2=250k times, no matter what the hook is doing.
> 
> But yeah that's the worst case thing and for a 1TB chunk, I agree that can also
> be too harsh.  It's just that if it's very easy to be done in bitmap init then
> still worth thinking about it.
> 
>>
>>>
>>> The thing is I still think this extra operation during sync() can be ignored by
>>> simply clear dirty log during bitmap init, then.. why not? :)
>>
>> I guess clearing the dirty log (especially in KVM) might be more expensive.
> 
> If we send one ioctl per cb that'll be expensive for sure.  I think it'll be
> fine if we send one clear ioctl to kvm, summarizing the whole bitmap to clear.
> 
> The other thing is imho having overhead during bitmap init is always better
> than having that during sync(). :)

Oh, right, so you're saying, after we set the dirty bmap to all ones and 
excluded the discarded parts, setting the respective bits to 0, we 
simply issue clearing of the whole area?

For now I assumed we would have to clear per cb.


-- 
Thanks,

David / dhildenb



^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v2 0/6] migration/ram: Optimize for virtio-mem via RamDiscardManager
  2021-07-28 19:46                   ` David Hildenbrand
@ 2021-07-28 20:19                     ` Peter Xu
  2021-07-29  8:14                       ` David Hildenbrand
  0 siblings, 1 reply; 40+ messages in thread
From: Peter Xu @ 2021-07-28 20:19 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Eduardo Habkost, Michael S. Tsirkin, Pankaj Gupta, Juan Quintela,
	teawater, Dr. David Alan Gilbert, qemu-devel, Alex Williamson,
	Marek Kedzierski, Paolo Bonzini, Andrey Gruzdev, Wei Yang

On Wed, Jul 28, 2021 at 09:46:09PM +0200, David Hildenbrand wrote:
> On 28.07.21 21:42, Peter Xu wrote:
> > On Wed, Jul 28, 2021 at 07:39:39PM +0200, David Hildenbrand wrote:
> > > > Meanwhile, I still have no idea how much overhead the "loop" part could bring.
> > > > For a large virtio-mem region with frequent plugged/unplugged mem interacted,
> > > > it seems possible to take a while to me..  I have no solid idea yet.
> > > 
> > > Let's do some math. Assume the worst case on a 1TiB device with a 2MiB block
> > > size: We have 524288 blocks == bits. That's precisely a 64k bitmap in
> > > virtio-mem. In the worst case, every second bit would be clear
> > > ("discarded"). For each clear bit ("discarded"), we would have to clear 512
> > > bits (64 bytes) in the dirty bitmap. That's storing 32 MiB.
> > > 
> > > So scanning 64 KiB, writing 32 MiB. Certainly not perfect, but I am not sure
> > > if it will really matter doing that once on every bitmap sync. I guess the
> > > bitmap syncing itself is much more expensive -- and not syncing the
> > > discarded ranges (b ) above) would make a bigger impact I guess.
> > 
> > I'm not worried about the memory size to be accessed as bitmaps; it's more
> > about the loop itself.  500K blocks/bits means the cb() worst case can be
> > called 500K/2=250k times, no matter what the hook is doing.
> > 
> > But yeah that's the worst case thing and for a 1TB chunk, I agree that can also
> > be too harsh.  It's just that if it's very easy to be done in bitmap init then
> > still worth thinking about it.
> > 
> > > 
> > > > 
> > > > The thing is I still think this extra operation during sync() can be ignored by
> > > > simply clear dirty log during bitmap init, then.. why not? :)
> > > 
> > > I guess clearing the dirty log (especially in KVM) might be more expensive.
> > 
> > If we send one ioctl per cb that'll be expensive for sure.  I think it'll be
> > fine if we send one clear ioctl to kvm, summarizing the whole bitmap to clear.
> > 
> > The other thing is imho having overhead during bitmap init is always better
> > than having that during sync(). :)
> 
> Oh, right, so you're saying, after we set the dirty bmap to all ones and
> excluded the discarded parts, setting the respective bits to 0, we simply
> issue clearing of the whole area?
> 
> For now I assumed we would have to clear per cb.

Hmm when I replied I thought we can pass in a bitmap to ->log_clear() but I
just remembered memory API actually hides the bitmap interface..

Reset the whole region works, but it'll slow down migration starts, more
importantly that'll be with mmu write lock so we will lose most clear-log
benefit for the initial round of migration and stuck the guest #pf at the
meantime...

Let's try do that in cb()s as you mentioned; I think that'll still be okay,
because so far the clear log block size is much larger (1gb), 1tb is worst case
1000 ioctls during bitmap init, slightly better than 250k calls during sync(),
maybe? :)

-- 
Peter Xu



^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v2 0/6] migration/ram: Optimize for virtio-mem via RamDiscardManager
  2021-07-28 20:19                     ` Peter Xu
@ 2021-07-29  8:14                       ` David Hildenbrand
  2021-07-29 16:12                         ` Peter Xu
  0 siblings, 1 reply; 40+ messages in thread
From: David Hildenbrand @ 2021-07-29  8:14 UTC (permalink / raw)
  To: Peter Xu
  Cc: Eduardo Habkost, Michael S. Tsirkin, Pankaj Gupta, Juan Quintela,
	teawater, Dr. David Alan Gilbert, qemu-devel, Alex Williamson,
	Marek Kedzierski, Paolo Bonzini, Andrey Gruzdev, Wei Yang

>>>>> The thing is I still think this extra operation during sync() can be ignored by
>>>>> simply clear dirty log during bitmap init, then.. why not? :)
>>>>
>>>> I guess clearing the dirty log (especially in KVM) might be more expensive.
>>>
>>> If we send one ioctl per cb that'll be expensive for sure.  I think it'll be
>>> fine if we send one clear ioctl to kvm, summarizing the whole bitmap to clear.
>>>
>>> The other thing is imho having overhead during bitmap init is always better
>>> than having that during sync(). :)
>>
>> Oh, right, so you're saying, after we set the dirty bmap to all ones and
>> excluded the discarded parts, setting the respective bits to 0, we simply
>> issue clearing of the whole area?
>>
>> For now I assumed we would have to clear per cb.
> 
> Hmm when I replied I thought we can pass in a bitmap to ->log_clear() but I
> just remembered memory API actually hides the bitmap interface..
> 
> Reset the whole region works, but it'll slow down migration starts, more
> importantly that'll be with mmu write lock so we will lose most clear-log
> benefit for the initial round of migration and stuck the guest #pf at the
> meantime...
> 
> Let's try do that in cb()s as you mentioned; I think that'll still be okay,
> because so far the clear log block size is much larger (1gb), 1tb is worst case
> 1000 ioctls during bitmap init, slightly better than 250k calls during sync(),
> maybe? :)

Just to get it right, what you propose is calling 
migration_clear_memory_region_dirty_bitmap_range() from each cb(). Due 
to the clear_bmap, we will end up clearing each chunk (e.g., 1GB) at 
most once.
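
Roughly like this inside the discard-replay cb(), I assume (a simplified
sketch only, ranges in TARGET_PAGE_SIZE units; not an actual patch):

    static void dirty_bitmap_clear_section(MemoryRegionSection *section,
                                           void *opaque)
    {
        const unsigned long start = section->offset_within_region >>
                                    TARGET_PAGE_BITS;
        const unsigned long npages = int128_get64(section->size) >>
                                     TARGET_PAGE_BITS;
        RAMBlock *rb = section->mr->ram_block;

        /*
         * Clear the dirty log (clear_bmap -> KVM) for the discarded range;
         * due to the clear_bmap granularity, each chunk (e.g., 1GB) gets
         * cleared at most once.
         */
        migration_clear_memory_region_dirty_bitmap_range(rb, start, npages);
        /* ... and drop the corresponding bits from the dirty bitmap. */
        bitmap_clear(rb->bmap, start, npages);
    }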

But if our layout is fragmented, we can actually end up clearing all 
chunks (1024 ioctls for 1TB), resulting in a slower migration start.

Any gut feeling how much slower migration start could be with largish 
(e.g., 1 TiB) regions?

-- 
Thanks,

David / dhildenb



^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v2 5/6] migration/postcopy: Handle RAMBlocks with a RamDiscardManager on the destination
  2021-07-23 22:10           ` Peter Xu
@ 2021-07-29 12:14             ` David Hildenbrand
  2021-07-29 15:52               ` Peter Xu
  0 siblings, 1 reply; 40+ messages in thread
From: David Hildenbrand @ 2021-07-29 12:14 UTC (permalink / raw)
  To: Peter Xu
  Cc: Eduardo Habkost, Juan Quintela, Pankaj Gupta, Michael S. Tsirkin,
	teawater, qemu-devel, Dr. David Alan Gilbert, Alex Williamson,
	Marek Kedzierski, Paolo Bonzini, Andrey Gruzdev, Wei Yang

On 24.07.21 00:10, Peter Xu wrote:
> On Fri, Jul 23, 2021 at 09:01:42PM +0200, David Hildenbrand wrote:
>> It can happen in corner cases and is valid: with the current virtio-mem
>> spec, guests are allowed to read unplugged memory. This will, for example,
>> happen on older Linux guests when reading /proc/kcore or (with even older
>> guests) when dumping guest memory via kdump. These corner cases were the
>> main reason why the spec allows for it -- until we have guests properly
>> adjusted such that it won't happen even in corner cases.
>>
>> A future feature bit will disallow it for the guest: required for supporting
>> shmem/hugetlb cleanly. With that in place, I agree that we would want to
>> warn in this case!
> 
> OK that makes sense; with the page_size change, feel free to add:

I just realized that relying on the page_size would be wrong.

We migrate TARGET_PAGE_SIZE chunks and the offset might not be page_size 
aligned. So if we were to replace TARGET_PAGE_SIZE by rb->page_size, we 
might accidentally cover a "too big" range.

Makes sense?

> 
> Reviewed-by: Peter Xu <peterx@redhat.com>
> 


-- 
Thanks,

David / dhildenb



^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v2 5/6] migration/postcopy: Handle RAMBlocks with a RamDiscardManager on the destination
  2021-07-29 12:14             ` David Hildenbrand
@ 2021-07-29 15:52               ` Peter Xu
  2021-07-29 16:15                 ` David Hildenbrand
  0 siblings, 1 reply; 40+ messages in thread
From: Peter Xu @ 2021-07-29 15:52 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Eduardo Habkost, Juan Quintela, Pankaj Gupta, Michael S. Tsirkin,
	teawater, qemu-devel, Dr. David Alan Gilbert, Alex Williamson,
	Marek Kedzierski, Paolo Bonzini, Andrey Gruzdev, Wei Yang

On Thu, Jul 29, 2021 at 02:14:41PM +0200, David Hildenbrand wrote:
> On 24.07.21 00:10, Peter Xu wrote:
> > On Fri, Jul 23, 2021 at 09:01:42PM +0200, David Hildenbrand wrote:
> > > It can happen in corner cases and is valid: with the current virtio-mem
> > > spec, guests are allowed to read unplugged memory. This will, for example,
> > > happen on older Linux guests when reading /proc/kcore or (with even older
> > > guests) when dumping guest memory via kdump. These corner cases were the
> > > main reason why the spec allows for it -- until we have guests properly
> > > adjusted such that it won't happen even in corner cases.
> > > 
> > > A future feature bit will disallow it for the guest: required for supporting
> > > shmem/hugetlb cleanly. With that in place, I agree that we would want to
> > > warn in this case!
> > 
> > OK that makes sense; with the page_size change, feel free to add:
> 
> I just realized that relying on the page_size would be wrong.
> 
> We migrate TARGET_PAGE_SIZE chunks and the offset might not be page_size
> aligned. So if we were to replace TARGET_PAGE_SIZE by rb->page_size, we
> might accidentally cover a "too big" range.

I'm wondering whether we should make the offset page size aligned instead.  For
example, note that postcopy_place_page_zero() should only take page_size
aligned host addr or UFFDIO_COPY could fail (hugetlb doesn't support
UFFDIO_ZEROPAGE yet).
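
Something like the below is what I had in mind (hypothetical sketch only,
not tested):

    /*
     * Hypothetical: operate on the full RAMBlock page that contains the
     * faulting address, so that hugetlb-backed blocks are handled at
     * their native granularity.
     */
    ram_addr_t aligned_start = QEMU_ALIGN_DOWN(start, qemu_ram_pagesize(rb));
    uint64_t aligned_haddr = QEMU_ALIGN_DOWN(haddr, qemu_ram_pagesize(rb));

    if (ramblock_page_is_discarded(rb, aligned_start)) {
        bool received = ramblock_recv_bitmap_test_byte_offset(rb, aligned_start);

        return received ? 0 : postcopy_place_page_zero(mis,
                                                       (void *)aligned_haddr,
                                                       rb);
    }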

Btw, does virtio-mem support hugetlbfs now?  When used with it, the smallest unit
to plug/unplug would be the huge page size (e.g., for 1g huge pages, it sounds not
helpful to unplug 2M of memory), am I right?

-- 
Peter Xu



^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v2 0/6] migration/ram: Optimize for virtio-mem via RamDiscardManager
  2021-07-29  8:14                       ` David Hildenbrand
@ 2021-07-29 16:12                         ` Peter Xu
  2021-07-29 16:19                           ` David Hildenbrand
  0 siblings, 1 reply; 40+ messages in thread
From: Peter Xu @ 2021-07-29 16:12 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Eduardo Habkost, Michael S. Tsirkin, Pankaj Gupta, Juan Quintela,
	teawater, Dr. David Alan Gilbert, qemu-devel, Alex Williamson,
	Marek Kedzierski, Paolo Bonzini, Andrey Gruzdev, Wei Yang

On Thu, Jul 29, 2021 at 10:14:47AM +0200, David Hildenbrand wrote:
> > > > > > The thing is I still think this extra operation during sync() can be ignored by
> > > > > > simply clear dirty log during bitmap init, then.. why not? :)
> > > > > 
> > > > > I guess clearing the dirty log (especially in KVM) might be more expensive.
> > > > 
> > > > If we send one ioctl per cb that'll be expensive for sure.  I think it'll be
> > > > fine if we send one clear ioctl to kvm, summarizing the whole bitmap to clear.
> > > > 
> > > > The other thing is imho having overhead during bitmap init is always better
> > > > than having that during sync(). :)
> > > 
> > > Oh, right, so you're saying, after we set the dirty bmap to all ones and
> > > excluded the discarded parts, setting the respective bits to 0, we simply
> > > issue clearing of the whole area?
> > > 
> > > For now I assumed we would have to clear per cb.
> > 
> > Hmm when I replied I thought we can pass in a bitmap to ->log_clear() but I
> > just remembered memory API actually hides the bitmap interface..
> > 
> > Reset the whole region works, but it'll slow down migration starts, more
> > importantly that'll be with mmu write lock so we will lose most clear-log
> > benefit for the initial round of migration and stuck the guest #pf at the
> > meantime...
> > 
> > Let's try do that in cb()s as you mentioned; I think that'll still be okay,
> > because so far the clear log block size is much larger (1gb), 1tb is worst case
> > 1000 ioctls during bitmap init, slightly better than 250k calls during sync(),
> > maybe? :)
> 
> Just to get it right, what you propose is calling
> migration_clear_memory_region_dirty_bitmap_range() from each cb().

Right.  We can provide a more complicated memory api for passing in a bitmap, but
I think that would be overkill and tricky.

> Due to the clear_bmap, we will end up clearing each chunk (e.g., 1GB) at most
> once.
>
> But if our layout is fragmented, we can actually end up clearing all chunks
> (1024 ioctls for 1TB), resulting in a slower migration start.
> 
> Any gut feeling how much slower migration start could be with largish (e.g.,
> 1 TiB) regions?

I have a vague memory of measuring KVM_GET_DIRTY_LOG once, which took
~10ms for 1g of guest mem, supposing that time is mostly spent protecting the
pages or clearing dirty bits in the EPT pgtables.  Then the worst case is
~1 second for 1tb.

But note that it's still during the setup phase, so we should expect to see a
somewhat longer setup time and a longer period during which migration stays in
the SETUP state, but I think it's fine.  Reasons:

  - We don't care too much about guest dirtying pages during the setup process
    because we haven't migrated anything yet, meanwhile we should not block any
    other thread either (e.g., we don't hold BQL).

  - We don't block guest execution either.  Unlike KVM_GET_DIRTY_LOG without
    CLEAR, we won't hold the mmu lock for a very long time but only in 1g
    chunks, so guest page faults can still be serviced.  They'll be affected
    somewhat since we'll still run the mmu write lock critical sections for
    each single ioctl(), but we do that for 1gb at a time so we frequently
    yield the lock.

Compared to the other solution, it's different if we spend the time during
sync(): during migration, guest vcpu threads keep dirtying pages that we
already migrated, so the dirtying accumulates and a slower sync() makes the
migration harder to converge.

Thanks,

-- 
Peter Xu



^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v2 5/6] migration/postcopy: Handle RAMBlocks with a RamDiscardManager on the destination
  2021-07-29 15:52               ` Peter Xu
@ 2021-07-29 16:15                 ` David Hildenbrand
  2021-07-29 19:20                   ` Peter Xu
  0 siblings, 1 reply; 40+ messages in thread
From: David Hildenbrand @ 2021-07-29 16:15 UTC (permalink / raw)
  To: Peter Xu
  Cc: Eduardo Habkost, Juan Quintela, Pankaj Gupta, Michael S. Tsirkin,
	teawater, qemu-devel, Dr. David Alan Gilbert, Alex Williamson,
	Marek Kedzierski, Paolo Bonzini, Andrey Gruzdev, Wei Yang

On 29.07.21 17:52, Peter Xu wrote:
> On Thu, Jul 29, 2021 at 02:14:41PM +0200, David Hildenbrand wrote:
>> On 24.07.21 00:10, Peter Xu wrote:
>>> On Fri, Jul 23, 2021 at 09:01:42PM +0200, David Hildenbrand wrote:
>>>> It can happen in corner cases and is valid: with the current virtio-mem
>>>> spec, guests are allowed to read unplugged memory. This will, for example,
>>>> happen on older Linux guests when reading /proc/kcore or (with even older
>>>> guests) when dumping guest memory via kdump. These corner cases were the
>>>> main reason why the spec allows for it -- until we have guests properly
>>>> adjusted such that it won't happen even in corner cases.
>>>>
>>>> A future feature bit will disallow it for the guest: required for supporting
>>>> shmem/hugetlb cleanly. With that in place, I agree that we would want to
>>>> warn in this case!
>>>
>>> OK that makes sense; with the page_size change, feel free to add:
>>
>> I just realized that relying on the page_size would be wrong.
>>
>> We migrate TARGET_PAGE_SIZE chunks and the offset might not be page_size
>> aligned. So if we were to replace TARGET_PAGE_SIZE by rb->page_size, we
>> might accidentally cover a "too big" range.
> 
> I'm wondering whether we should make the offset page size aligned instead.  For
> example, note that postcopy_place_page_zero() should only take page_size
> aligned host addr or UFFDIO_COPY could fail (hugetlb doesn't support
> UFFDIO_ZEROPAGE yet).

That is true indeed. I'd assume in that case that we would get called 
with the proper offset already, right? Because uffd would only report 
properly aligned pages IIRC.

> 
> Btw, does virtio-mem supports hugetlbfs now?  When with it, the smallest unit
> to plug/unplug would the huge page size (e.g., for 1g huge page, sounds not
> helpful to unplug 2M memory), am I right?

It supports it to 99.9 % I'd say (especially with the dump/tpm/migration 
fixes upstream). The units are handled properly: the unit is at least as 
big as the page size.

So with 1 GiB pages, you have a unit of 1 GiB.
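
For example (illustrative only; paths and sizes are made up):

  -object memory-backend-file,id=mem0,mem-path=/dev/hugepages-1G/vm0,size=16G \
  -device virtio-mem-pci,id=vmem0,memdev=mem0,node=0,requested-size=4G,block-size=1G

Here, block-size (the plug/unplug unit) cannot be smaller than the 1 GiB
backing page size.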


-- 
Thanks,

David / dhildenb



^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v2 0/6] migration/ram: Optimize for virtio-mem via RamDiscardManager
  2021-07-29 16:12                         ` Peter Xu
@ 2021-07-29 16:19                           ` David Hildenbrand
  2021-07-29 19:32                             ` Peter Xu
  0 siblings, 1 reply; 40+ messages in thread
From: David Hildenbrand @ 2021-07-29 16:19 UTC (permalink / raw)
  To: Peter Xu
  Cc: Eduardo Habkost, Michael S. Tsirkin, Pankaj Gupta, Juan Quintela,
	teawater, Dr. David Alan Gilbert, qemu-devel, Alex Williamson,
	Marek Kedzierski, Paolo Bonzini, Andrey Gruzdev, Wei Yang

On 29.07.21 18:12, Peter Xu wrote:
> On Thu, Jul 29, 2021 at 10:14:47AM +0200, David Hildenbrand wrote:
>>>>>>> The thing is I still think this extra operation during sync() can be ignored by
>>>>>>> simply clear dirty log during bitmap init, then.. why not? :)
>>>>>>
>>>>>> I guess clearing the dirty log (especially in KVM) might be more expensive.
>>>>>
>>>>> If we send one ioctl per cb that'll be expensive for sure.  I think it'll be
>>>>> fine if we send one clear ioctl to kvm, summarizing the whole bitmap to clear.
>>>>>
>>>>> The other thing is imho having overhead during bitmap init is always better
>>>>> than having that during sync(). :)
>>>>
>>>> Oh, right, so you're saying, after we set the dirty bmap to all ones and
>>>> excluded the discarded parts, setting the respective bits to 0, we simply
>>>> issue clearing of the whole area?
>>>>
>>>> For now I assumed we would have to clear per cb.
>>>
>>> Hmm when I replied I thought we can pass in a bitmap to ->log_clear() but I
>>> just remembered memory API actually hides the bitmap interface..
>>>
>>> Reset the whole region works, but it'll slow down migration starts, more
>>> importantly that'll be with mmu write lock so we will lose most clear-log
>>> benefit for the initial round of migration and stuck the guest #pf at the
>>> meantime...
>>>
>>> Let's try do that in cb()s as you mentioned; I think that'll still be okay,
>>> because so far the clear log block size is much larger (1gb), 1tb is worst case
>>> 1000 ioctls during bitmap init, slightly better than 250k calls during sync(),
>>> maybe? :)
>>
>> Just to get it right, what you propose is calling
>> migration_clear_memory_region_dirty_bitmap_range() from each cb().
> 
> Right.  We can provide a more complicated memory api for passing in bitmap but
> I think that can be an overkill and tricky.
> 
>> Due to the clear_bmap, we will end up clearing each chunk (e.g., 1GB) at most
>> once.
>>
>> But if our layout is fragmented, we can actually end up clearing all chunks
>> (1024 ioctls for 1TB), resulting in a slower migration start.
>>
>> Any gut feeling how much slower migration start could be with largish (e.g.,
>> 1 TiB) regions?
> 
> I had a vague memory of KVM_GET_DIRTY_LOG that I used to measure which took
> ~10ms for 1g guest mem, supposing that's mostly used to protect the pages or
> clearing dirties in the EPT pgtables.  Then the worst case is ~1 second for
> 1tb.
> 
> But note that it's still during setup phase, so we should expect to see a
> somehow large setup time and longer period that migration stays in SETUP state,
> but I think it's fine.  Reasons:
> 
>    - We don't care too much about guest dirtying pages during the setup process
>      because we haven't migrated anything yet, meanwhile we should not block any
>      other thread either (e.g., we don't hold BQL).
> 
>    - We don't block guest execution too.  Unlike KVM_GET_DIRTY_LOG without CLEAR
>      we won't hold the mmu lock for a huge long time but do it only in 1g chunk,
>      so guest page faults can still be serviced.  It'll be affected somehow
>      since we'll still run with the mmu write lock critical sections for each
>      single ioctl(), but we do that for 1gb each time so we frequently yield it.
> 

Please note that we are holding the iothread lock while setting up the 
bitmaps + syncing the dirty log. I'll have to make sure that that code 
runs outside of the BQL, otherwise we'll block guest execution.

In the meantime I adjusted the code but it does the clearing under the 
iothread lock, which should not be what we want ... I'll have a look.
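
Roughly what I'm thinking of (just a sketch; the placement and the helper
name migration_bitmap_clear_discarded_pages() are assumptions, not final
code):

static void ram_init_bitmaps(RAMState *rs)
{
    /* For memory_global_dirty_log_start() below.  */
    qemu_mutex_lock_iothread();
    qemu_mutex_lock_ramlist();

    WITH_RCU_READ_LOCK_GUARD() {
        ram_list_init_bitmaps();
        memory_global_dirty_log_start();
        migration_bitmap_sync_precopy(rs);
    }
    qemu_mutex_unlock_ramlist();
    qemu_mutex_unlock_iothread();

    /*
     * Clear the dirty log of discarded ranges only after dropping the BQL,
     * so the per-chunk clear ioctls don't block vcpus during setup.
     */
    migration_bitmap_clear_discarded_pages(rs);
}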

-- 
Thanks,

David / dhildenb



^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v2 5/6] migration/postcopy: Handle RAMBlocks with a RamDiscardManager on the destination
  2021-07-29 16:15                 ` David Hildenbrand
@ 2021-07-29 19:20                   ` Peter Xu
  2021-07-29 19:22                     ` David Hildenbrand
  0 siblings, 1 reply; 40+ messages in thread
From: Peter Xu @ 2021-07-29 19:20 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Eduardo Habkost, Juan Quintela, Pankaj Gupta, Michael S. Tsirkin,
	teawater, qemu-devel, Dr. David Alan Gilbert, Alex Williamson,
	Marek Kedzierski, Paolo Bonzini, Andrey Gruzdev, Wei Yang

On Thu, Jul 29, 2021 at 06:15:58PM +0200, David Hildenbrand wrote:
> On 29.07.21 17:52, Peter Xu wrote:
> > On Thu, Jul 29, 2021 at 02:14:41PM +0200, David Hildenbrand wrote:
> > > On 24.07.21 00:10, Peter Xu wrote:
> > > > On Fri, Jul 23, 2021 at 09:01:42PM +0200, David Hildenbrand wrote:
> > > > > It can happen in corner cases and is valid: with the current virtio-mem
> > > > > spec, guests are allowed to read unplugged memory. This will, for example,
> > > > > happen on older Linux guests when reading /proc/kcore or (with even older
> > > > > guests) when dumping guest memory via kdump. These corner cases were the
> > > > > main reason why the spec allows for it -- until we have guests properly
> > > > > adjusted such that it won't happen even in corner cases.
> > > > > 
> > > > > A future feature bit will disallow it for the guest: required for supporting
> > > > > shmem/hugetlb cleanly. With that in place, I agree that we would want to
> > > > > warn in this case!
> > > > 
> > > > OK that makes sense; with the page_size change, feel free to add:
> > > 
> > > I just realized that relying on the page_size would be wrong.
> > > 
> > > We migrate TARGET_PAGE_SIZE chunks and the offset might not be page_size
> > > aligned. So if we were to replace TARGET_PAGE_SIZE by rb->page_size, we
> > > might accidentally cover a "too big" range.
> > 
> > I'm wondering whether we should make the offset page size aligned instead.  For
> > example, note that postcopy_place_page_zero() should only take page_size
> > aligned host addr or UFFDIO_COPY could fail (hugetlb doesn't support
> > UFFDIO_ZEROPAGE yet).
> 
> That is true indeed. I'd assume in that case that we would get called with
> the proper offset already, right? Because uffd would only report properly
> aligned pages IIRC.

Nope; it returns the faulted address.  So postcopy_request_page() may need
some alignment work, as is done in migrate_send_rp_req_pages().

> 
> > 
> > Btw, does virtio-mem supports hugetlbfs now?  When with it, the smallest unit
> > to plug/unplug would the huge page size (e.g., for 1g huge page, sounds not
> > helpful to unplug 2M memory), am I right?
> 
> It supports it to 99.9 % I'd say (especially with the dump/tpm/migration
> fixes upstream). The units are handled properly: the unit is at least as big
> as the page size.
> 
> So with 1 GiB pages, you have a unit of 1 GiB.

I see, thanks for confirming.  Then fixing up the offset seems to be the right
thing to do.

Thanks,

-- 
Peter Xu



^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v2 5/6] migration/postcopy: Handle RAMBlocks with a RamDiscardManager on the destination
  2021-07-29 19:20                   ` Peter Xu
@ 2021-07-29 19:22                     ` David Hildenbrand
  0 siblings, 0 replies; 40+ messages in thread
From: David Hildenbrand @ 2021-07-29 19:22 UTC (permalink / raw)
  To: Peter Xu
  Cc: Eduardo Habkost, Juan Quintela, Pankaj Gupta, Michael S. Tsirkin,
	teawater, qemu-devel, Dr. David Alan Gilbert, Alex Williamson,
	Marek Kedzierski, Paolo Bonzini, Andrey Gruzdev, Wei Yang

On 29.07.21 21:20, Peter Xu wrote:
> On Thu, Jul 29, 2021 at 06:15:58PM +0200, David Hildenbrand wrote:
>> On 29.07.21 17:52, Peter Xu wrote:
>>> On Thu, Jul 29, 2021 at 02:14:41PM +0200, David Hildenbrand wrote:
>>>> On 24.07.21 00:10, Peter Xu wrote:
>>>>> On Fri, Jul 23, 2021 at 09:01:42PM +0200, David Hildenbrand wrote:
>>>>>> It can happen in corner cases and is valid: with the current virtio-mem
>>>>>> spec, guests are allowed to read unplugged memory. This will, for example,
>>>>>> happen on older Linux guests when reading /proc/kcore or (with even older
>>>>>> guests) when dumping guest memory via kdump. These corner cases were the
>>>>>> main reason why the spec allows for it -- until we have guests properly
>>>>>> adjusted such that it won't happen even in corner cases.
>>>>>>
>>>>>> A future feature bit will disallow it for the guest: required for supporting
>>>>>> shmem/hugetlb cleanly. With that in place, I agree that we would want to
>>>>>> warn in this case!
>>>>>
>>>>> OK that makes sense; with the page_size change, feel free to add:
>>>>
>>>> I just realized that relying on the page_size would be wrong.
>>>>
>>>> We migrate TARGET_PAGE_SIZE chunks and the offset might not be page_size
>>>> aligned. So if we were to replace TARGET_PAGE_SIZE by rb->page_size, we
>>>> might accidentally cover a "too big" range.
>>>
>>> I'm wondering whether we should make the offset page size aligned instead.  For
>>> example, note that postcopy_place_page_zero() should only take page_size
>>> aligned host addr or UFFDIO_COPY could fail (hugetlb doesn't support
>>> UFFDIO_ZEROPAGE yet).
>>
>> That is true indeed. I'd assume in that case that we would get called with
>> the proper offset already, right? Because uffd would only report properly
>> aligned pages IIRC.
> 
> Nop; it should return the faulted address. So postcopy_request_page() may need
> some alignment work, as it was handled in migrate_send_rp_req_pages().
> 

Right, figured that out myself just now:

static int postcopy_request_page(MigrationIncomingState *mis, RAMBlock *rb,
                                  ram_addr_t start, uint64_t haddr)
{
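     /*
      * qemu_ram_pagesize() is a power of two, so masking with its negation
      * rounds haddr down to the start of the backing host page, as required
      * for userfaultfd placement (UFFDIO_COPY / UFFDIO_ZEROPAGE).
      */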
     void *aligned = (void *)(uintptr_t)(haddr & -qemu_ram_pagesize(rb));

     /*
      * Discarded pages (via RamDiscardManager) are never migrated. On unlikely
      * access, place a zeropage, which will also set the relevant bits in the
      * recv_bitmap accordingly, so we won't try placing a zeropage twice.
      *
      * Checking a single bit is sufficient to handle pagesize > TPS as either
      * all relevant bits are set or not.
      */
     assert(QEMU_IS_ALIGNED(start, qemu_ram_pagesize(rb)));
     if (ramblock_page_is_discarded(rb, start)) {
         bool received = ramblock_recv_bitmap_test_byte_offset(rb, start);

         return received ? 0 : postcopy_place_page_zero(mis, aligned, rb);
     }

     return migrate_send_rp_req_pages(mis, rb, start, haddr);
}


-- 
Thanks,

David / dhildenb



^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v2 0/6] migration/ram: Optimize for virtio-mem via RamDiscardManager
  2021-07-29 16:19                           ` David Hildenbrand
@ 2021-07-29 19:32                             ` Peter Xu
  2021-07-29 19:39                               ` David Hildenbrand
  0 siblings, 1 reply; 40+ messages in thread
From: Peter Xu @ 2021-07-29 19:32 UTC (permalink / raw)
  To: David Hildenbrand, Dr. David Alan Gilbert, Juan Quintela
  Cc: Eduardo Habkost, Michael S. Tsirkin, Pankaj Gupta, Juan Quintela,
	teawater, Dr. David Alan Gilbert, qemu-devel, Alex Williamson,
	Marek Kedzierski, Paolo Bonzini, Andrey Gruzdev, Wei Yang

On Thu, Jul 29, 2021 at 06:19:31PM +0200, David Hildenbrand wrote:
> On 29.07.21 18:12, Peter Xu wrote:
> > On Thu, Jul 29, 2021 at 10:14:47AM +0200, David Hildenbrand wrote:
> > > > > > > > The thing is I still think this extra operation during sync() can be ignored by
> > > > > > > > simply clear dirty log during bitmap init, then.. why not? :)
> > > > > > > 
> > > > > > > I guess clearing the dirty log (especially in KVM) might be more expensive.
> > > > > > 
> > > > > > If we send one ioctl per cb that'll be expensive for sure.  I think it'll be
> > > > > > fine if we send one clear ioctl to kvm, summarizing the whole bitmap to clear.
> > > > > > 
> > > > > > The other thing is imho having overhead during bitmap init is always better
> > > > > > than having that during sync(). :)
> > > > > 
> > > > > Oh, right, so you're saying, after we set the dirty bmap to all ones and
> > > > > excluded the discarded parts, setting the respective bits to 0, we simply
> > > > > issue clearing of the whole area?
> > > > > 
> > > > > For now I assumed we would have to clear per cb.
> > > > 
> > > > Hmm when I replied I thought we can pass in a bitmap to ->log_clear() but I
> > > > just remembered memory API actually hides the bitmap interface..
> > > > 
> > > > Reset the whole region works, but it'll slow down migration starts, more
> > > > importantly that'll be with mmu write lock so we will lose most clear-log
> > > > benefit for the initial round of migration and stuck the guest #pf at the
> > > > meantime...
> > > > 
> > > > Let's try do that in cb()s as you mentioned; I think that'll still be okay,
> > > > because so far the clear log block size is much larger (1gb), 1tb is worst case
> > > > 1000 ioctls during bitmap init, slightly better than 250k calls during sync(),
> > > > maybe? :)
> > > 
> > > Just to get it right, what you propose is calling
> > > migration_clear_memory_region_dirty_bitmap_range() from each cb().
> > 
> > Right.  We can provide a more complicated memory api for passing in bitmap but
> > I think that can be an overkill and tricky.
> > 
> > > Due to the clear_bmap, we will end up clearing each chunk (e.g., 1GB) at most
> > > once.
> > > 
> > > But if our layout is fragmented, we can actually end up clearing all chunks
> > > (1024 ioctls for 1TB), resulting in a slower migration start.
> > > 
> > > Any gut feeling how much slower migration start could be with largish (e.g.,
> > > 1 TiB) regions?
> > 
> > I had a vague memory of KVM_GET_DIRTY_LOG that I used to measure which took
> > ~10ms for 1g guest mem, supposing that's mostly used to protect the pages or
> > clearing dirties in the EPT pgtables.  Then the worst case is ~1 second for
> > 1tb.
> > 
> > But note that it's still during setup phase, so we should expect to see a
> > somehow large setup time and longer period that migration stays in SETUP state,
> > but I think it's fine.  Reasons:
> > 
> >    - We don't care too much about guest dirtying pages during the setup process
> >      because we haven't migrated anything yet, meanwhile we should not block any
> >      other thread either (e.g., we don't hold BQL).
> > 
> >    - We don't block guest execution too.  Unlike KVM_GET_DIRTY_LOG without CLEAR
> >      we won't hold the mmu lock for a huge long time but do it only in 1g chunk,
> >      so guest page faults can still be serviced.  It'll be affected somehow
> >      since we'll still run with the mmu write lock critical sections for each
> >      single ioctl(), but we do that for 1gb each time so we frequently yield it.
> > 
> 
> Please note that we are holding the iothread lock while setting up the
> bitmaps + syncing the dirty log. I'll have to make sure that that code runs
> outside of the BQL, otherwise we'll block guest execution.

Oh right.

> 
> In the meantime I adjusted the code but it does the clearing under the
> iothread lock, which should not be what we want ... I'll have a look.

Thanks; if it takes more changes than expected, we can still start simple,
IMHO, by taking the BQL and yielding it periodically.

In the meantime, I found two things in ram_init_bitmaps() that I'm not sure
we need or not:

  1. Do we need WITH_RCU_READ_LOCK_GUARD() if we hold both the BQL and the
     ramlist lock?  (small question)

  2. Do we need migration_bitmap_sync_precopy() even if dirty bmap is all 1's?
     (bigger question)

I feel like those can be dropped.  Dave/Juan?

-- 
Peter Xu



^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v2 0/6] migration/ram: Optimize for virtio-mem via RamDiscardManager
  2021-07-29 19:32                             ` Peter Xu
@ 2021-07-29 19:39                               ` David Hildenbrand
  2021-07-29 20:00                                 ` Peter Xu
  0 siblings, 1 reply; 40+ messages in thread
From: David Hildenbrand @ 2021-07-29 19:39 UTC (permalink / raw)
  To: Peter Xu, Dr. David Alan Gilbert, Juan Quintela
  Cc: Eduardo Habkost, Michael S. Tsirkin, Pankaj Gupta, qemu-devel,
	Andrey Gruzdev, Alex Williamson, teawater, Paolo Bonzini,
	Marek Kedzierski, Wei Yang


>> In the meantime I adjusted the code but it does the clearing under the
>> iothread lock, which should not be what we want ... I'll have a look.
> 
> Thanks; if it takes more changes than expected we can still start from simple,
> IMHO, by taking bql and timely yield it.
> 
> At the meantime, I found two things in ram_init_bitmaps() that I'm not sure we
> need them of not:
> 
>    1. Do we need WITH_RCU_READ_LOCK_GUARD() if with both bql and ramlist lock?
>       (small question)

Good question, I'm not sure if we need it.

> 
>    2. Do we need migration_bitmap_sync_precopy() even if dirty bmap is all 1's?
>       (bigger question)

IIRC, the bitmap sync will fetch the proper dirty bitmap from KVM and set the
proper bits in the clear_bmap. So when we later call
migration_clear_memory_region_dirty_bitmap_range() etc., we will actually
clear dirty bits.

Without that sync, migration_clear_memory_region_dirty_bitmap_range() is a
nop: dirty bits are never cleared, we might migrate stuff unnecessarily twice,
and it all won't work as you proposed. So I certainly need it.
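
For reference, roughly how that helper behaves (simplified from memory, so
details might differ):

static void migration_clear_memory_region_dirty_bitmap(RAMBlock *rb,
                                                        unsigned long page)
{
    uint8_t shift = rb->clear_bmap_shift;
    hwaddr size, start;

    /*
     * The clear_bmap bit for this chunk is only set during a bitmap sync;
     * without a prior sync there is nothing to clear and we return early.
     */
    if (!rb->clear_bmap || !clear_bmap_test_and_clear(rb, page)) {
        return;
    }

    size = 1ULL << (TARGET_PAGE_BITS + shift);
    start = QEMU_ALIGN_DOWN((ram_addr_t)page << TARGET_PAGE_BITS, size);
    memory_region_clear_dirty_bitmap(rb->mr, start, size);
}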

-- 
Thanks,

David / dhildenb



^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v2 0/6] migration/ram: Optimize for virtio-mem via RamDiscardManager
  2021-07-29 19:39                               ` David Hildenbrand
@ 2021-07-29 20:00                                 ` Peter Xu
  2021-07-29 20:06                                   ` David Hildenbrand
  0 siblings, 1 reply; 40+ messages in thread
From: Peter Xu @ 2021-07-29 20:00 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Eduardo Habkost, Michael S. Tsirkin, Pankaj Gupta, Juan Quintela,
	teawater, qemu-devel, Dr. David Alan Gilbert, Alex Williamson,
	Marek Kedzierski, Paolo Bonzini, Andrey Gruzdev, Wei Yang

On Thu, Jul 29, 2021 at 09:39:24PM +0200, David Hildenbrand wrote:
> 
> > > In the meantime I adjusted the code but it does the clearing under the
> > > iothread lock, which should not be what we want ... I'll have a look.
> > 
> > Thanks; if it takes more changes than expected we can still start from simple,
> > IMHO, by taking bql and timely yield it.
> > 
> > At the meantime, I found two things in ram_init_bitmaps() that I'm not sure we
> > need them of not:
> > 
> >    1. Do we need WITH_RCU_READ_LOCK_GUARD() if with both bql and ramlist lock?
> >       (small question)
> 
> Good question, I'm not sure if we need it.
> 
> > 
> >    2. Do we need migration_bitmap_sync_precopy() even if dirty bmap is all 1's?
> >       (bigger question)
> 
> IIRC, the bitmap sync will fetch the proper dirty bitmap from KVM and set
> the proper bits in the clear_bitmap. So once we call
> migration_clear_memory_region_dirty_bitmap_range() etc. later we will
> actually clear dirty bits.

Good point.  However.. then I'm wondering whether we should just init the
clear_bmap to all 1's at init time too, just like the dirty bmap. :)

Besides: let's not get sidetracked by these details, as they are better
handled separately; maybe I'll follow this up.  This is a fine place to
discuss them, but they shouldn't become a burden that blocks this series.

Thanks,

-- 
Peter Xu



^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v2 0/6] migration/ram: Optimize for virtio-mem via RamDiscardManager
  2021-07-29 20:00                                 ` Peter Xu
@ 2021-07-29 20:06                                   ` David Hildenbrand
  2021-07-29 20:28                                     ` Peter Xu
  0 siblings, 1 reply; 40+ messages in thread
From: David Hildenbrand @ 2021-07-29 20:06 UTC (permalink / raw)
  To: Peter Xu
  Cc: Eduardo Habkost, Michael S. Tsirkin, Pankaj Gupta, Juan Quintela,
	teawater, qemu-devel, Dr. David Alan Gilbert, Alex Williamson,
	Marek Kedzierski, Paolo Bonzini, Andrey Gruzdev, Wei Yang

On 29.07.21 22:00, Peter Xu wrote:
> On Thu, Jul 29, 2021 at 09:39:24PM +0200, David Hildenbrand wrote:
>>
>>>> In the meantime I adjusted the code but it does the clearing under the
>>>> iothread lock, which should not be what we want ... I'll have a look.
>>>
>>> Thanks; if it takes more changes than expected we can still start from simple,
>>> IMHO, by taking bql and timely yield it.
>>>
>>> At the meantime, I found two things in ram_init_bitmaps() that I'm not sure we
>>> need them of not:
>>>
>>>     1. Do we need WITH_RCU_READ_LOCK_GUARD() if with both bql and ramlist lock?
>>>        (small question)
>>
>> Good question, I'm not sure if we need it.
>>
>>>
>>>     2. Do we need migration_bitmap_sync_precopy() even if dirty bmap is all 1's?
>>>        (bigger question)
>>
>> IIRC, the bitmap sync will fetch the proper dirty bitmap from KVM and set
>> the proper bits in the clear_bitmap. So once we call
>> migration_clear_memory_region_dirty_bitmap_range() etc. later we will
>> actually clear dirty bits.
> 
> Good point, however.. then I'm wondering whether we should just init clear_bmap
> to all 1's too when init just like dirty bmap. :)

Yes, but ... I'm not sure if we have to get the dirty bits into 
KVMSlot->dirty_bmap as well in order to clear them.

It could work with "manual_dirty_log_protect". For 
!manual_dirty_log_protect we might have to keep it that way ... which 
means we might have to expose some ugly details up to migration/ram.c.
Might require some thought :)

> 
> Besides: let's not be affected with these details as they should be more
> suitable to be handled separately; maybe I'll follow this up.  It could be a
> place to discuss things, but shouldn't be a burden to block this series.


Agreed :)


-- 
Thanks,

David / dhildenb



^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v2 0/6] migration/ram: Optimize for virtio-mem via RamDiscardManager
  2021-07-29 20:06                                   ` David Hildenbrand
@ 2021-07-29 20:28                                     ` Peter Xu
  0 siblings, 0 replies; 40+ messages in thread
From: Peter Xu @ 2021-07-29 20:28 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Eduardo Habkost, Michael S. Tsirkin, Pankaj Gupta, Juan Quintela,
	teawater, qemu-devel, Dr. David Alan Gilbert, Alex Williamson,
	Marek Kedzierski, Paolo Bonzini, Andrey Gruzdev, Wei Yang

On Thu, Jul 29, 2021 at 10:06:16PM +0200, David Hildenbrand wrote:
> On 29.07.21 22:00, Peter Xu wrote:
> > On Thu, Jul 29, 2021 at 09:39:24PM +0200, David Hildenbrand wrote:
> > > 
> > > > > In the meantime I adjusted the code but it does the clearing under the
> > > > > iothread lock, which should not be what we want ... I'll have a look.
> > > > 
> > > > Thanks; if it takes more changes than expected we can still start from simple,
> > > > IMHO, by taking bql and timely yield it.
> > > > 
> > > > At the meantime, I found two things in ram_init_bitmaps() that I'm not sure we
> > > > need them of not:
> > > > 
> > > >     1. Do we need WITH_RCU_READ_LOCK_GUARD() if with both bql and ramlist lock?
> > > >        (small question)
> > > 
> > > Good question, I'm not sure if we need it.
> > > 
> > > > 
> > > >     2. Do we need migration_bitmap_sync_precopy() even if dirty bmap is all 1's?
> > > >        (bigger question)
> > > 
> > > IIRC, the bitmap sync will fetch the proper dirty bitmap from KVM and set
> > > the proper bits in the clear_bitmap. So once we call
> > > migration_clear_memory_region_dirty_bitmap_range() etc. later we will
> > > actually clear dirty bits.
> > 
> > Good point, however.. then I'm wondering whether we should just init clear_bmap
> > to all 1's too when init just like dirty bmap. :)
> 
> Yes, but ... I'm not sure if we have to get the dirty bits into
> KVMSlot->dirty_bmap as well in order to clear them.

Yes, so far it's closely bound to kvm's dirty_bmap, so sounds needed indeed (in
kvm_slot_init_dirty_bitmap).

> 
> It could work with "manual_dirty_log_protect". For !manual_dirty_log_protect
> we might have to keep it that way ... which means we might have to expose
> some ugly details up to migration/ram.c .
> Might require some thought :)

We should make sure the clear_log() hooks always work, so that the memory API
can call the memory region clear-log API without knowing whether it's enabled
underneath, in KVM or in other future clear_log() hooks.  KVM currently should
be fine, as kvm_physical_log_clear() checks manual protect at the entry and
returns directly otherwise.
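
I.e., something like this (paraphrasing from memory, not verbatim code):

static int kvm_physical_log_clear(KVMMemoryListener *kml,
                                  MemoryRegionSection *section)
{
    KVMState *s = kvm_state;
    int ret = 0;

    if (!s->manual_dirty_log_protect) {
        /* No need to do explicit clear */
        return ret;
    }

    /* ... otherwise, issue KVM_CLEAR_DIRTY_LOG for the overlapping slots ... */
    return ret;
}

Thanks,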

-- 
Peter Xu



^ permalink raw reply	[flat|nested] 40+ messages in thread

end of thread, other threads:[~2021-07-29 20:30 UTC | newest]

Thread overview: 40+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-07-21  9:27 [PATCH v2 0/6] migration/ram: Optimize for virtio-mem via RamDiscardManager David Hildenbrand
2021-07-21  9:27 ` [PATCH v2 1/6] memory: Introduce replay_discarded callback for RamDiscardManager David Hildenbrand
2021-07-23 16:34   ` Peter Xu
2021-07-21  9:27 ` [PATCH v2 2/6] virtio-mem: Implement replay_discarded RamDiscardManager callback David Hildenbrand
2021-07-23 16:34   ` Peter Xu
2021-07-21  9:27 ` [PATCH v2 3/6] migration/ram: Handle RAMBlocks with a RamDiscardManager on the migration source David Hildenbrand
2021-07-21  9:27 ` [PATCH v2 4/6] virtio-mem: Drop precopy notifier David Hildenbrand
2021-07-23 16:34   ` Peter Xu
2021-07-21  9:27 ` [PATCH v2 5/6] migration/postcopy: Handle RAMBlocks with a RamDiscardManager on the destination David Hildenbrand
2021-07-23 16:34   ` Peter Xu
2021-07-23 18:36     ` David Hildenbrand
2021-07-23 18:52       ` Peter Xu
2021-07-23 19:01         ` David Hildenbrand
2021-07-23 22:10           ` Peter Xu
2021-07-29 12:14             ` David Hildenbrand
2021-07-29 15:52               ` Peter Xu
2021-07-29 16:15                 ` David Hildenbrand
2021-07-29 19:20                   ` Peter Xu
2021-07-29 19:22                     ` David Hildenbrand
2021-07-21  9:27 ` [PATCH v2 6/6] migration/ram: Handle RAMBlocks with a RamDiscardManager on background snapshots David Hildenbrand
2021-07-23 16:37   ` Peter Xu
2021-07-22 11:29 ` [PATCH v2 0/6] migration/ram: Optimize for virtio-mem via RamDiscardManager Dr. David Alan Gilbert
2021-07-22 11:43   ` David Hildenbrand
2021-07-23 16:12     ` Peter Xu
2021-07-23 18:41       ` David Hildenbrand
2021-07-23 22:19         ` Peter Xu
2021-07-27  9:25           ` David Hildenbrand
2021-07-27 17:10             ` Peter Xu
2021-07-28 17:39               ` David Hildenbrand
2021-07-28 19:42                 ` Peter Xu
2021-07-28 19:46                   ` David Hildenbrand
2021-07-28 20:19                     ` Peter Xu
2021-07-29  8:14                       ` David Hildenbrand
2021-07-29 16:12                         ` Peter Xu
2021-07-29 16:19                           ` David Hildenbrand
2021-07-29 19:32                             ` Peter Xu
2021-07-29 19:39                               ` David Hildenbrand
2021-07-29 20:00                                 ` Peter Xu
2021-07-29 20:06                                   ` David Hildenbrand
2021-07-29 20:28                                     ` Peter Xu
