qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Juan Quintela <quintela@redhat.com>
To: qemu-devel@nongnu.org
Cc: "Stefano Stabellini" <sstabellini@kernel.org>,
	"Eduardo Habkost" <ehabkost@redhat.com>,
	kvm@vger.kernel.org, "David Hildenbrand" <david@redhat.com>,
	"Philippe Mathieu-Daudé" <philmd@redhat.com>,
	"Paul Durrant" <paul@xen.org>,
	"Richard Henderson" <richard.henderson@linaro.org>,
	"Markus Armbruster" <armbru@redhat.com>,
	"Peter Xu" <peterx@redhat.com>,
	"Dr. David Alan Gilbert" <dgilbert@redhat.com>,
	"Juan Quintela" <quintela@redhat.com>,
	"Paolo Bonzini" <pbonzini@redhat.com>,
	"Michael S. Tsirkin" <mst@redhat.com>,
	"Marc-André Lureau" <marcandre.lureau@redhat.com>,
	"Anthony Perard" <anthony.perard@citrix.com>,
	xen-devel@lists.xenproject.org, "Eric Blake" <eblake@redhat.com>
Subject: [PULL 13/20] migration/ram: Handle RAMBlocks with a RamDiscardManager on the migration source
Date: Mon,  1 Nov 2021 23:09:05 +0100	[thread overview]
Message-ID: <20211101220912.10039-14-quintela@redhat.com> (raw)
In-Reply-To: <20211101220912.10039-1-quintela@redhat.com>

From: David Hildenbrand <david@redhat.com>

We don't want to migrate memory that corresponds to discarded ranges as
managed by a RamDiscardManager responsible for the mapped memory region of
the RAMBlock. The content of these pages is essentially stale and
without any guarantees for the VM ("logically unplugged").

Depending on the underlying memory type, even reading memory might populate
memory on the source, resulting in an undesired memory consumption. Of
course, on the destination, even writing a zeropage consumes memory,
which we also want to avoid (similar to free page hinting).

Currently, virtio-mem tries achieving that goal (not migrating "unplugged"
memory that was discarded) by going via qemu_guest_free_page_hint() - but
it's hackish and incomplete.

For example, background snapshots still end up reading all memory, as
they don't do bitmap syncs. Postcopy recovery code will re-add
previously cleared bits to the dirty bitmap and migrate them.

Let's consult the RamDiscardManager after setting up our dirty bitmap
initially and when postcopy recovery code reinitializes it: clear
corresponding bits in the dirty bitmaps (e.g., of the RAMBlock and inside
KVM). It's important to fixup the dirty bitmap *after* our initial bitmap
sync, such that the corresponding dirty bits in KVM are actually cleared.

As colo is incompatible with discarding of RAM and inhibits it, we don't
have to bother.

Note: if a misbehaving guest would use discarded ranges after migration
started we would still migrate that memory: however, then we already
populated that memory on the migration source.

Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
---
 migration/ram.c | 77 +++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 77 insertions(+)

diff --git a/migration/ram.c b/migration/ram.c
index ae2601bf3b..e8c06f207c 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -858,6 +858,60 @@ static inline bool migration_bitmap_clear_dirty(RAMState *rs,
     return ret;
 }
 
+static void dirty_bitmap_clear_section(MemoryRegionSection *section,
+                                       void *opaque)
+{
+    const hwaddr offset = section->offset_within_region;
+    const hwaddr size = int128_get64(section->size);
+    const unsigned long start = offset >> TARGET_PAGE_BITS;
+    const unsigned long npages = size >> TARGET_PAGE_BITS;
+    RAMBlock *rb = section->mr->ram_block;
+    uint64_t *cleared_bits = opaque;
+
+    /*
+     * We don't grab ram_state->bitmap_mutex because we expect to run
+     * only when starting migration or during postcopy recovery where
+     * we don't have concurrent access.
+     */
+    if (!migration_in_postcopy() && !migrate_background_snapshot()) {
+        migration_clear_memory_region_dirty_bitmap_range(rb, start, npages);
+    }
+    *cleared_bits += bitmap_count_one_with_offset(rb->bmap, start, npages);
+    bitmap_clear(rb->bmap, start, npages);
+}
+
+/*
+ * Exclude all dirty pages from migration that fall into a discarded range as
+ * managed by a RamDiscardManager responsible for the mapped memory region of
+ * the RAMBlock. Clear the corresponding bits in the dirty bitmaps.
+ *
+ * Discarded pages ("logically unplugged") have undefined content and must
+ * not get migrated, because even reading these pages for migration might
+ * result in undesired behavior.
+ *
+ * Returns the number of cleared bits in the RAMBlock dirty bitmap.
+ *
+ * Note: The result is only stable while migrating (precopy/postcopy).
+ */
+static uint64_t ramblock_dirty_bitmap_clear_discarded_pages(RAMBlock *rb)
+{
+    uint64_t cleared_bits = 0;
+
+    if (rb->mr && rb->bmap && memory_region_has_ram_discard_manager(rb->mr)) {
+        RamDiscardManager *rdm = memory_region_get_ram_discard_manager(rb->mr);
+        MemoryRegionSection section = {
+            .mr = rb->mr,
+            .offset_within_region = 0,
+            .size = int128_make64(qemu_ram_get_used_length(rb)),
+        };
+
+        ram_discard_manager_replay_discarded(rdm, &section,
+                                             dirty_bitmap_clear_section,
+                                             &cleared_bits);
+    }
+    return cleared_bits;
+}
+
 /* Called with RCU critical section */
 static void ramblock_sync_dirty_bitmap(RAMState *rs, RAMBlock *rb)
 {
@@ -2675,6 +2729,19 @@ static void ram_list_init_bitmaps(void)
     }
 }
 
+static void migration_bitmap_clear_discarded_pages(RAMState *rs)
+{
+    unsigned long pages;
+    RAMBlock *rb;
+
+    RCU_READ_LOCK_GUARD();
+
+    RAMBLOCK_FOREACH_NOT_IGNORED(rb) {
+            pages = ramblock_dirty_bitmap_clear_discarded_pages(rb);
+            rs->migration_dirty_pages -= pages;
+    }
+}
+
 static void ram_init_bitmaps(RAMState *rs)
 {
     /* For memory_global_dirty_log_start below.  */
@@ -2691,6 +2758,12 @@ static void ram_init_bitmaps(RAMState *rs)
     }
     qemu_mutex_unlock_ramlist();
     qemu_mutex_unlock_iothread();
+
+    /*
+     * After an eventual first bitmap sync, fixup the initial bitmap
+     * containing all 1s to exclude any discarded pages from migration.
+     */
+    migration_bitmap_clear_discarded_pages(rs);
 }
 
 static int ram_init_all(RAMState **rsp)
@@ -4119,6 +4192,10 @@ int ram_dirty_bitmap_reload(MigrationState *s, RAMBlock *block)
      */
     bitmap_complement(block->bmap, block->bmap, nbits);
 
+    /* Clear dirty bits of discarded ranges that we don't want to migrate. */
+    ramblock_dirty_bitmap_clear_discarded_pages(block);
+
+    /* We'll recalculate migration_dirty_pages in ram_state_resume_prepare(). */
     trace_ram_dirty_bitmap_reload_complete(block->idstr);
 
     /*
-- 
2.33.1



  parent reply	other threads:[~2021-11-01 22:28 UTC|newest]

Thread overview: 25+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-11-01 22:08 [PULL 00/20] Migration 20211031 patches Juan Quintela
2021-11-01 22:08 ` [PULL 01/20] migration/rdma: Fix out of order wrid Juan Quintela
2021-11-01 22:08 ` [PULL 02/20] KVM: introduce dirty_pages and kvm_dirty_ring_enabled Juan Quintela
2021-11-01 22:08 ` [PULL 03/20] memory: make global_dirty_tracking a bitmask Juan Quintela
2021-11-01 22:08 ` [PULL 04/20] migration/dirtyrate: introduce struct and adjust DirtyRateStat Juan Quintela
2021-11-04 20:54   ` Eric Blake
2021-11-01 22:08 ` [PULL 05/20] migration/dirtyrate: adjust order of registering thread Juan Quintela
2021-11-01 22:08 ` [PULL 06/20] migration/dirtyrate: move init step of calculation to main thread Juan Quintela
2021-11-01 22:08 ` [PULL 07/20] migration/dirtyrate: implement dirty-ring dirtyrate calculation Juan Quintela
2021-11-04 22:05   ` Philippe Mathieu-Daudé
2021-11-06 11:45     ` Juan Quintela
2021-11-01 22:09 ` [PULL 08/20] migration: Make migration blocker work for snapshots too Juan Quintela
2021-11-01 22:09 ` [PULL 09/20] migration: Add migrate_add_blocker_internal() Juan Quintela
2021-11-01 22:09 ` [PULL 10/20] dump-guest-memory: Block live migration Juan Quintela
2021-11-01 22:09 ` [PULL 11/20] memory: Introduce replay_discarded callback for RamDiscardManager Juan Quintela
2021-11-01 22:09 ` [PULL 12/20] virtio-mem: Implement replay_discarded RamDiscardManager callback Juan Quintela
2021-11-01 22:09 ` Juan Quintela [this message]
2021-11-01 22:09 ` [PULL 14/20] virtio-mem: Drop precopy notifier Juan Quintela
2021-11-01 22:09 ` [PULL 15/20] migration/postcopy: Handle RAMBlocks with a RamDiscardManager on the destination Juan Quintela
2021-11-01 22:09 ` [PULL 16/20] migration: Simplify alignment and alignment checks Juan Quintela
2021-11-01 22:09 ` [PULL 17/20] migration/ram: Factor out populating pages readable in ram_block_populate_pages() Juan Quintela
2021-11-01 22:09 ` [PULL 18/20] migration/ram: Handle RAMBlocks with a RamDiscardManager on background snapshots Juan Quintela
2021-11-01 22:09 ` [PULL 19/20] memory: introduce total_dirty_pages to stat dirty pages Juan Quintela
2021-11-01 22:09 ` [PULL 20/20] migration/dirtyrate: implement dirty-bitmap dirtyrate calculation Juan Quintela
2021-11-02 15:45 ` [PULL 00/20] Migration 20211031 patches Richard Henderson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20211101220912.10039-14-quintela@redhat.com \
    --to=quintela@redhat.com \
    --cc=anthony.perard@citrix.com \
    --cc=armbru@redhat.com \
    --cc=david@redhat.com \
    --cc=dgilbert@redhat.com \
    --cc=eblake@redhat.com \
    --cc=ehabkost@redhat.com \
    --cc=kvm@vger.kernel.org \
    --cc=marcandre.lureau@redhat.com \
    --cc=mst@redhat.com \
    --cc=paul@xen.org \
    --cc=pbonzini@redhat.com \
    --cc=peterx@redhat.com \
    --cc=philmd@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=richard.henderson@linaro.org \
    --cc=sstabellini@kernel.org \
    --cc=xen-devel@lists.xenproject.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).