* [PATCH resend v2 0/5] softmmu/memory_mapping: optimize dump/tpm for virtio-mem
@ 2021-07-20 13:02 David Hildenbrand
  2021-07-20 13:03 ` [PATCH resend v2 1/5] tpm: mark correct memory region range dirty when clearing RAM David Hildenbrand
                   ` (4 more replies)
  0 siblings, 5 replies; 22+ messages in thread
From: David Hildenbrand @ 2021-07-20 13:02 UTC (permalink / raw)
  To: qemu-devel
  Cc: Laurent Vivier, Thomas Huth, Stefan Berger, Eduardo Habkost,
	David Hildenbrand, Michael S. Tsirkin, Dr. David Alan Gilbert,
	Peter Xu, Alex Williamson, Paolo Bonzini, Claudio Fontana,
	Marc-André Lureau, Igor Mammedov, Alex Bennée,
	Stefan Berger

This is a resend of v2, now that RamDiscardManager is upstream. I retested
dumping with virtio-mem on shmem and it works as expected now.

---

Minor fixes and cleanups, followed by an optimization for virtio-mem
regarding guest dumps and tpm.

virtio-mem logically plugs/unplugs memory within a sparse memory region
and notifies via the RamDiscardManager interface when parts become
plugged (populated) or unplugged (discarded).

Currently, guest_phys_blocks_append() appends the whole (sparse)
virtio-mem-managed region, and therefore TPM code might zero the whole
region and dump code will dump the whole region. Let's only add logically
plugged (populated) parts of that region, skipping over logically
unplugged (discarded) parts by reusing the RamDiscardManager
infrastructure introduced to handle virtio-mem + VFIO properly.
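
For illustration, the heart of the optimization (see patch #5 below) is
to replay only the populated parts of a section whose memory region has a
RamDiscardManager, instead of adding the section wholesale; condensed:

    /* condensed from patch #5: visit only populated (plugged) parts */
    static int guest_phys_ram_populate_cb(MemoryRegionSection *section,
                                          void *opaque)
    {
        GuestPhysListener *g = opaque;

        guest_phys_block_add_section(g, section);
        return 0;
    }

    /* ... in guest_phys_blocks_region_add(): */
    if (memory_region_has_ram_discard_manager(section->mr)) {
        RamDiscardManager *rdm;

        rdm = memory_region_get_ram_discard_manager(section->mr);
        ram_discard_manager_replay_populated(rdm, section,
                                             guest_phys_ram_populate_cb, g);
        return;
    }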

v1 -> v2:
- "softmmu/memory_mapping: factor out adding physical memory ranges"
-- Simplify based on RamDiscardManager changes: add using a
   MemoryRegionSection
- "softmmu/memory_mapping: optimize for RamDiscardManager sections"
-- Simplify based on RamDiscardManager changes

David Hildenbrand (5):
  tpm: mark correct memory region range dirty when clearing RAM
  softmmu/memory_mapping: reuse qemu_get_guest_simple_memory_mapping()
  softmmu/memory_mapping: never merge ranges across memory regions
  softmmu/memory_mapping: factor out adding physical memory ranges
  softmmu/memory_mapping: optimize for RamDiscardManager sections

 hw/tpm/tpm_ppi.c         |  4 ++-
 softmmu/memory_mapping.c | 72 ++++++++++++++++++++++++----------------
 2 files changed, 46 insertions(+), 30 deletions(-)

-- 
2.31.1




* [PATCH resend v2 1/5] tpm: mark correct memory region range dirty when clearing RAM
  2021-07-20 13:02 [PATCH resend v2 0/5] softmmu/memory_mapping: optimize dump/tpm for virtio-mem David Hildenbrand
@ 2021-07-20 13:03 ` David Hildenbrand
  2021-07-23 14:52   ` Peter Xu
  2021-07-20 13:03 ` [PATCH resend v2 2/5] softmmu/memory_mapping: reuse qemu_get_guest_simple_memory_mapping() David Hildenbrand
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 22+ messages in thread
From: David Hildenbrand @ 2021-07-20 13:03 UTC (permalink / raw)
  To: qemu-devel
  Cc: Laurent Vivier, Thomas Huth, Stefan Berger, Eduardo Habkost,
	Michael S. Tsirkin, David Hildenbrand, Dr. David Alan Gilbert,
	Peter Xu, Alex Williamson, Claudio Fontana, Paolo Bonzini,
	Marc-André Lureau, Alex Bennée, Igor Mammedov,
	Stefan Berger

A guest physical block might not start at the beginning of the memory
region. We could also calculate the offset via the difference in the host
addresses; however, memory_region_set_dirty() also relies on
memory_region_get_ram_addr() internally, so let's just use that.

Acked-by: Stefan Berger <stefanb@linux.ibm.com>
Fixes: ffab1be70692 ("tpm: clear RAM when "memory overwrite" requested")
Cc: Marc-André Lureau <marcandre.lureau@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Claudio Fontana <cfontana@suse.de>
Cc: Thomas Huth <thuth@redhat.com>
Cc: "Alex Bennée" <alex.bennee@linaro.org>
Cc: Peter Xu <peterx@redhat.com>
Cc: Laurent Vivier <lvivier@redhat.com>
Cc: Stefan Berger <stefanb@linux.vnet.ibm.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 hw/tpm/tpm_ppi.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/hw/tpm/tpm_ppi.c b/hw/tpm/tpm_ppi.c
index 362edcc5c9..261f33431c 100644
--- a/hw/tpm/tpm_ppi.c
+++ b/hw/tpm/tpm_ppi.c
@@ -30,11 +30,13 @@ void tpm_ppi_reset(TPMPPI *tpmppi)
         guest_phys_blocks_init(&guest_phys_blocks);
         guest_phys_blocks_append(&guest_phys_blocks);
         QTAILQ_FOREACH(block, &guest_phys_blocks.head, next) {
+            ram_addr_t mr_start = memory_region_get_ram_addr(block->mr);
+
             trace_tpm_ppi_memset(block->host_addr,
                                  block->target_end - block->target_start);
             memset(block->host_addr, 0,
                    block->target_end - block->target_start);
-            memory_region_set_dirty(block->mr, 0,
+            memory_region_set_dirty(block->mr, block->target_start - mr_start,
                                     block->target_end - block->target_start);
         }
         guest_phys_blocks_free(&guest_phys_blocks);
-- 
2.31.1




* [PATCH resend v2 2/5] softmmu/memory_mapping: reuse qemu_get_guest_simple_memory_mapping()
  2021-07-20 13:02 [PATCH resend v2 0/5] softmmu/memory_mapping: optimize dump/tpm for virtio-mem David Hildenbrand
  2021-07-20 13:03 ` [PATCH resend v2 1/5] tpm: mark correct memory region range dirty when clearing RAM David Hildenbrand
@ 2021-07-20 13:03 ` David Hildenbrand
  2021-07-20 13:37   ` Stefan Berger
  2021-07-20 13:03 ` [PATCH resend v2 3/5] softmmu/memory_mapping: never merge ranges across memory regions David Hildenbrand
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 22+ messages in thread
From: David Hildenbrand @ 2021-07-20 13:03 UTC (permalink / raw)
  To: qemu-devel
  Cc: Laurent Vivier, Thomas Huth, Eduardo Habkost, Michael S. Tsirkin,
	David Hildenbrand, Dr. David Alan Gilbert, Peter Xu,
	Alex Williamson, Claudio Fontana, Paolo Bonzini,
	Marc-André Lureau, Alex Bennée, Igor Mammedov,
	Stefan Berger

Let's reuse qemu_get_guest_simple_memory_mapping(), which does exactly
what we want.

Cc: Marc-André Lureau <marcandre.lureau@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Claudio Fontana <cfontana@suse.de>
Cc: Thomas Huth <thuth@redhat.com>
Cc: "Alex Bennée" <alex.bennee@linaro.org>
Cc: Peter Xu <peterx@redhat.com>
Cc: Laurent Vivier <lvivier@redhat.com>
Cc: Stefan Berger <stefanb@linux.ibm.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 softmmu/memory_mapping.c | 8 +-------
 1 file changed, 1 insertion(+), 7 deletions(-)

diff --git a/softmmu/memory_mapping.c b/softmmu/memory_mapping.c
index e7af276546..d63f896b30 100644
--- a/softmmu/memory_mapping.c
+++ b/softmmu/memory_mapping.c
@@ -288,8 +288,6 @@ void qemu_get_guest_memory_mapping(MemoryMappingList *list,
                                    Error **errp)
 {
     CPUState *cpu, *first_paging_enabled_cpu;
-    GuestPhysBlock *block;
-    ram_addr_t offset, length;
 
     first_paging_enabled_cpu = find_paging_enabled_cpu(first_cpu);
     if (first_paging_enabled_cpu) {
@@ -309,11 +307,7 @@ void qemu_get_guest_memory_mapping(MemoryMappingList *list,
      * If the guest doesn't use paging, the virtual address is equal to physical
      * address.
      */
-    QTAILQ_FOREACH(block, &guest_phys_blocks->head, next) {
-        offset = block->target_start;
-        length = block->target_end - block->target_start;
-        create_new_memory_mapping(list, offset, offset, length);
-    }
+    qemu_get_guest_simple_memory_mapping(list, guest_phys_blocks);
 }
 
 void qemu_get_guest_simple_memory_mapping(MemoryMappingList *list,
-- 
2.31.1




* [PATCH resend v2 3/5] softmmu/memory_mapping: never merge ranges across memory regions
  2021-07-20 13:02 [PATCH resend v2 0/5] softmmu/memory_mapping: optimize dump/tpm for virtio-mem David Hildenbrand
  2021-07-20 13:03 ` [PATCH resend v2 1/5] tpm: mark correct memory region range dirty when clearing RAM David Hildenbrand
  2021-07-20 13:03 ` [PATCH resend v2 2/5] softmmu/memory_mapping: reuse qemu_get_guest_simple_memory_mapping() David Hildenbrand
@ 2021-07-20 13:03 ` David Hildenbrand
  2021-07-20 13:26   ` Stefan Berger
  2021-07-23 15:09   ` Peter Xu
  2021-07-20 13:03 ` [PATCH resend v2 4/5] softmmu/memory_mapping: factor out adding physical memory ranges David Hildenbrand
  2021-07-20 13:03 ` [PATCH resend v2 5/5] softmmu/memory_mapping: optimize for RamDiscardManager sections David Hildenbrand
  4 siblings, 2 replies; 22+ messages in thread
From: David Hildenbrand @ 2021-07-20 13:03 UTC (permalink / raw)
  To: qemu-devel
  Cc: Laurent Vivier, Thomas Huth, Eduardo Habkost, Michael S. Tsirkin,
	David Hildenbrand, Dr. David Alan Gilbert, Peter Xu,
	Alex Williamson, Claudio Fontana, Paolo Bonzini,
	Marc-André Lureau, Alex Bennée, Igor Mammedov,
	Stefan Berger

Let's make sure to not merge when different memory regions are involved.
Unlikely, but theoretically possible.

Cc: Marc-André Lureau <marcandre.lureau@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Claudio Fontana <cfontana@suse.de>
Cc: Thomas Huth <thuth@redhat.com>
Cc: "Alex Bennée" <alex.bennee@linaro.org>
Cc: Peter Xu <peterx@redhat.com>
Cc: Laurent Vivier <lvivier@redhat.com>
Cc: Stefan Berger <stefanb@linux.ibm.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 softmmu/memory_mapping.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/softmmu/memory_mapping.c b/softmmu/memory_mapping.c
index d63f896b30..c149bad44a 100644
--- a/softmmu/memory_mapping.c
+++ b/softmmu/memory_mapping.c
@@ -229,7 +229,8 @@ static void guest_phys_blocks_region_add(MemoryListener *listener,
 
         /* we want continuity in both guest-physical and host-virtual memory */
         if (predecessor->target_end < target_start ||
-            predecessor->host_addr + predecessor_size != host_addr) {
+            predecessor->host_addr + predecessor_size != host_addr ||
+            predecessor->mr != section->mr) {
             predecessor = NULL;
         }
     }
-- 
2.31.1




* [PATCH resend v2 4/5] softmmu/memory_mapping: factor out adding physical memory ranges
  2021-07-20 13:02 [PATCH resend v2 0/5] softmmu/memory_mapping: optimize dump/tpm for virtio-mem David Hildenbrand
                   ` (2 preceding siblings ...)
  2021-07-20 13:03 ` [PATCH resend v2 3/5] softmmu/memory_mapping: never merge ranges across memory regions David Hildenbrand
@ 2021-07-20 13:03 ` David Hildenbrand
  2021-07-20 13:25   ` Stefan Berger
  2021-07-23 15:09   ` Peter Xu
  2021-07-20 13:03 ` [PATCH resend v2 5/5] softmmu/memory_mapping: optimize for RamDiscardManager sections David Hildenbrand
  4 siblings, 2 replies; 22+ messages in thread
From: David Hildenbrand @ 2021-07-20 13:03 UTC (permalink / raw)
  To: qemu-devel
  Cc: Laurent Vivier, Thomas Huth, Eduardo Habkost, Michael S. Tsirkin,
	David Hildenbrand, Dr. David Alan Gilbert, Peter Xu,
	Alex Williamson, Claudio Fontana, Paolo Bonzini,
	Marc-André Lureau, Alex Bennée, Igor Mammedov,
	Stefan Berger

Let's factor out adding a MemoryRegionSection to the list, to be reused in
RamDiscardManager context next.

Cc: Marc-André Lureau <marcandre.lureau@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Claudio Fontana <cfontana@suse.de>
Cc: Thomas Huth <thuth@redhat.com>
Cc: "Alex Bennée" <alex.bennee@linaro.org>
Cc: Peter Xu <peterx@redhat.com>
Cc: Laurent Vivier <lvivier@redhat.com>
Cc: Stefan Berger <stefanb@linux.ibm.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 softmmu/memory_mapping.c | 41 ++++++++++++++++++++--------------------
 1 file changed, 20 insertions(+), 21 deletions(-)

diff --git a/softmmu/memory_mapping.c b/softmmu/memory_mapping.c
index c149bad44a..b7e4f3f788 100644
--- a/softmmu/memory_mapping.c
+++ b/softmmu/memory_mapping.c
@@ -193,29 +193,14 @@ typedef struct GuestPhysListener {
     MemoryListener listener;
 } GuestPhysListener;
 
-static void guest_phys_blocks_region_add(MemoryListener *listener,
+static void guest_phys_block_add_section(GuestPhysListener *g,
                                          MemoryRegionSection *section)
 {
-    GuestPhysListener *g;
-    uint64_t section_size;
-    hwaddr target_start, target_end;
-    uint8_t *host_addr;
-    GuestPhysBlock *predecessor;
-
-    /* we only care about RAM */
-    if (!memory_region_is_ram(section->mr) ||
-        memory_region_is_ram_device(section->mr) ||
-        memory_region_is_nonvolatile(section->mr)) {
-        return;
-    }
-
-    g            = container_of(listener, GuestPhysListener, listener);
-    section_size = int128_get64(section->size);
-    target_start = section->offset_within_address_space;
-    target_end   = target_start + section_size;
-    host_addr    = memory_region_get_ram_ptr(section->mr) +
-                   section->offset_within_region;
-    predecessor  = NULL;
+    const hwaddr target_start = section->offset_within_address_space;
+    const hwaddr target_end = target_start + int128_get64(section->size);
+    uint8_t *host_addr = memory_region_get_ram_ptr(section->mr) +
+                         section->offset_within_region;
+    GuestPhysBlock *predecessor = NULL;
 
     /* find continuity in guest physical address space */
     if (!QTAILQ_EMPTY(&g->list->head)) {
@@ -261,6 +246,20 @@ static void guest_phys_blocks_region_add(MemoryListener *listener,
 #endif
 }
 
+static void guest_phys_blocks_region_add(MemoryListener *listener,
+                                         MemoryRegionSection *section)
+{
+    GuestPhysListener *g = container_of(listener, GuestPhysListener, listener);
+
+    /* we only care about RAM */
+    if (!memory_region_is_ram(section->mr) ||
+        memory_region_is_ram_device(section->mr) ||
+        memory_region_is_nonvolatile(section->mr)) {
+        return;
+    }
+    guest_phys_block_add_section(g, section);
+}
+
 void guest_phys_blocks_append(GuestPhysBlockList *list)
 {
     GuestPhysListener g = { 0 };
-- 
2.31.1




* [PATCH resend v2 5/5] softmmu/memory_mapping: optimize for RamDiscardManager sections
  2021-07-20 13:02 [PATCH resend v2 0/5] softmmu/memory_mapping: optimize dump/tpm for virtio-mem David Hildenbrand
                   ` (3 preceding siblings ...)
  2021-07-20 13:03 ` [PATCH resend v2 4/5] softmmu/memory_mapping: factor out adding physical memory ranges David Hildenbrand
@ 2021-07-20 13:03 ` David Hildenbrand
  2021-07-23 15:28   ` Peter Xu
  2021-07-26 15:24   ` Peter Xu
  4 siblings, 2 replies; 22+ messages in thread
From: David Hildenbrand @ 2021-07-20 13:03 UTC (permalink / raw)
  To: qemu-devel
  Cc: Laurent Vivier, Thomas Huth, Eduardo Habkost, Michael S. Tsirkin,
	David Hildenbrand, Dr. David Alan Gilbert, Peter Xu,
	Alex Williamson, Claudio Fontana, Paolo Bonzini,
	Marc-André Lureau, Alex Bennée, Igor Mammedov,
	Stefan Berger

virtio-mem logically plugs/unplugs memory within a sparse memory region
and notifies via the RamDiscardManager interface when parts become
plugged (populated) or unplugged (discarded).

Currently, we end up (via the two users)
1) zeroing all logically unplugged/discarded memory during TPM resets.
2) reading all logically unplugged/discarded memory when dumping, to
   figure out the content is zero.

1) is always bad, because we assume unplugged memory stays discarded
   (and is already implicitly zero).
2) isn't that bad with anonymous memory: we end up reading the zero
   page (slow and unnecessary, though). However, once we use some
   file-backed memory (future use case), even reading will populate memory.

Let's cut out all parts marked as not-populated (discarded) via the
RamDiscardManager. As virtio-mem is the single user, this now means that
logically unplugged memory ranges will no longer be included in the
dump, which results in smaller dump files and faster dumping.

virtio-mem has a minimum granularity of 1 MiB (and the default is usually
2 MiB). Theoretically, we could see quite some fragmentation; in practice,
we won't have it completely fragmented into 1 MiB pieces. Still, we might
end up with many physical ranges.

Both the ELF format and kdump seem ready to support many individual
ranges (e.g., for ELF the limit seems to be UINT32_MAX; kdump uses a
linear bitmap).

Cc: Marc-André Lureau <marcandre.lureau@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Claudio Fontana <cfontana@suse.de>
Cc: Thomas Huth <thuth@redhat.com>
Cc: "Alex Bennée" <alex.bennee@linaro.org>
Cc: Peter Xu <peterx@redhat.com>
Cc: Laurent Vivier <lvivier@redhat.com>
Cc: Stefan Berger <stefanb@linux.ibm.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 softmmu/memory_mapping.c | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/softmmu/memory_mapping.c b/softmmu/memory_mapping.c
index b7e4f3f788..856778a109 100644
--- a/softmmu/memory_mapping.c
+++ b/softmmu/memory_mapping.c
@@ -246,6 +246,15 @@ static void guest_phys_block_add_section(GuestPhysListener *g,
 #endif
 }
 
+static int guest_phys_ram_populate_cb(MemoryRegionSection *section,
+                                      void *opaque)
+{
+    GuestPhysListener *g = opaque;
+
+    guest_phys_block_add_section(g, section);
+    return 0;
+}
+
 static void guest_phys_blocks_region_add(MemoryListener *listener,
                                          MemoryRegionSection *section)
 {
@@ -257,6 +266,17 @@ static void guest_phys_blocks_region_add(MemoryListener *listener,
         memory_region_is_nonvolatile(section->mr)) {
         return;
     }
+
+    /* for special sparse regions, only add populated parts */
+    if (memory_region_has_ram_discard_manager(section->mr)) {
+        RamDiscardManager *rdm;
+
+        rdm = memory_region_get_ram_discard_manager(section->mr);
+        ram_discard_manager_replay_populated(rdm, section,
+                                             guest_phys_ram_populate_cb, g);
+        return;
+    }
+
     guest_phys_block_add_section(g, section);
 }
 
-- 
2.31.1




* Re: [PATCH resend v2 4/5] softmmu/memory_mapping: factor out adding physical memory ranges
  2021-07-20 13:03 ` [PATCH resend v2 4/5] softmmu/memory_mapping: factor out adding physical memory ranges David Hildenbrand
@ 2021-07-20 13:25   ` Stefan Berger
  2021-07-23 15:09   ` Peter Xu
  1 sibling, 0 replies; 22+ messages in thread
From: Stefan Berger @ 2021-07-20 13:25 UTC (permalink / raw)
  To: David Hildenbrand, qemu-devel
  Cc: Laurent Vivier, Thomas Huth, Eduardo Habkost, Michael S. Tsirkin,
	Dr. David Alan Gilbert, Peter Xu, Alex Williamson,
	Claudio Fontana, Paolo Bonzini, Marc-André Lureau,
	Alex Bennée, Igor Mammedov


On 7/20/21 9:03 AM, David Hildenbrand wrote:
> Let's factor out adding a MemoryRegionSection to the list, to be reused in
> RamDiscardManager context next.
>
> Cc: Marc-André Lureau <marcandre.lureau@redhat.com>
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Cc: "Michael S. Tsirkin" <mst@redhat.com>
> Cc: Eduardo Habkost <ehabkost@redhat.com>
> Cc: Alex Williamson <alex.williamson@redhat.com>
> Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
> Cc: Igor Mammedov <imammedo@redhat.com>
> Cc: Claudio Fontana <cfontana@suse.de>
> Cc: Thomas Huth <thuth@redhat.com>
> Cc: "Alex Bennée" <alex.bennee@linaro.org>
> Cc: Peter Xu <peterx@redhat.com>
> Cc: Laurent Vivier <lvivier@redhat.com>
> Cc: Stefan Berger <stefanb@linux.ibm.com>
> Signed-off-by: David Hildenbrand <david@redhat.com>
> ---
>   softmmu/memory_mapping.c | 41 ++++++++++++++++++++--------------------
>   1 file changed, 20 insertions(+), 21 deletions(-)
>
> diff --git a/softmmu/memory_mapping.c b/softmmu/memory_mapping.c
> index c149bad44a..b7e4f3f788 100644
> --- a/softmmu/memory_mapping.c
> +++ b/softmmu/memory_mapping.c
> @@ -193,29 +193,14 @@ typedef struct GuestPhysListener {
>       MemoryListener listener;
>   } GuestPhysListener;
>   
> -static void guest_phys_blocks_region_add(MemoryListener *listener,
> +static void guest_phys_block_add_section(GuestPhysListener *g,
>                                            MemoryRegionSection *section)
>   {
> -    GuestPhysListener *g;
> -    uint64_t section_size;
> -    hwaddr target_start, target_end;
> -    uint8_t *host_addr;
> -    GuestPhysBlock *predecessor;
> -
> -    /* we only care about RAM */
> -    if (!memory_region_is_ram(section->mr) ||
> -        memory_region_is_ram_device(section->mr) ||
> -        memory_region_is_nonvolatile(section->mr)) {
> -        return;
> -    }
> -
> -    g            = container_of(listener, GuestPhysListener, listener);
> -    section_size = int128_get64(section->size);
> -    target_start = section->offset_within_address_space;
> -    target_end   = target_start + section_size;
> -    host_addr    = memory_region_get_ram_ptr(section->mr) +
> -                   section->offset_within_region;
> -    predecessor  = NULL;
> +    const hwaddr target_start = section->offset_within_address_space;
> +    const hwaddr target_end = target_start + int128_get64(section->size);
> +    uint8_t *host_addr = memory_region_get_ram_ptr(section->mr) +
> +                         section->offset_within_region;
> +    GuestPhysBlock *predecessor = NULL;
>   
>       /* find continuity in guest physical address space */
>       if (!QTAILQ_EMPTY(&g->list->head)) {
> @@ -261,6 +246,20 @@ static void guest_phys_blocks_region_add(MemoryListener *listener,
>   #endif
>   }
>   
> +static void guest_phys_blocks_region_add(MemoryListener *listener,
> +                                         MemoryRegionSection *section)
> +{
> +    GuestPhysListener *g = container_of(listener, GuestPhysListener, listener);
> +
> +    /* we only care about RAM */
> +    if (!memory_region_is_ram(section->mr) ||
> +        memory_region_is_ram_device(section->mr) ||
> +        memory_region_is_nonvolatile(section->mr)) {
> +        return;
> +    }
> +    guest_phys_block_add_section(g, section);
> +}
> +
>   void guest_phys_blocks_append(GuestPhysBlockList *list)
>   {
>       GuestPhysListener g = { 0 };


Reviewed-by: Stefan Berger <stefanb@linux.ibm.com>





* Re: [PATCH resend v2 3/5] softmmu/memory_mapping: never merge ranges across memory regions
  2021-07-20 13:03 ` [PATCH resend v2 3/5] softmmu/memory_mapping: never merge ranges across memory regions David Hildenbrand
@ 2021-07-20 13:26   ` Stefan Berger
  2021-07-23 15:09   ` Peter Xu
  1 sibling, 0 replies; 22+ messages in thread
From: Stefan Berger @ 2021-07-20 13:26 UTC (permalink / raw)
  To: David Hildenbrand, qemu-devel
  Cc: Laurent Vivier, Thomas Huth, Eduardo Habkost, Michael S. Tsirkin,
	Dr. David Alan Gilbert, Peter Xu, Alex Williamson,
	Claudio Fontana, Paolo Bonzini, Marc-André Lureau,
	Alex Bennée, Igor Mammedov


On 7/20/21 9:03 AM, David Hildenbrand wrote:
> Let's make sure to not merge when different memory regions are involved.
> Unlikely, but theoretically possible.
>
> Cc: Marc-André Lureau <marcandre.lureau@redhat.com>
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Cc: "Michael S. Tsirkin" <mst@redhat.com>
> Cc: Eduardo Habkost <ehabkost@redhat.com>
> Cc: Alex Williamson <alex.williamson@redhat.com>
> Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
> Cc: Igor Mammedov <imammedo@redhat.com>
> Cc: Claudio Fontana <cfontana@suse.de>
> Cc: Thomas Huth <thuth@redhat.com>
> Cc: "Alex Bennée" <alex.bennee@linaro.org>
> Cc: Peter Xu <peterx@redhat.com>
> Cc: Laurent Vivier <lvivier@redhat.com>
> Cc: Stefan Berger <stefanb@linux.ibm.com>
> Signed-off-by: David Hildenbrand <david@redhat.com>
> ---
>   softmmu/memory_mapping.c | 3 ++-
>   1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/softmmu/memory_mapping.c b/softmmu/memory_mapping.c
> index d63f896b30..c149bad44a 100644
> --- a/softmmu/memory_mapping.c
> +++ b/softmmu/memory_mapping.c
> @@ -229,7 +229,8 @@ static void guest_phys_blocks_region_add(MemoryListener *listener,
>   
>           /* we want continuity in both guest-physical and host-virtual memory */
>           if (predecessor->target_end < target_start ||
> -            predecessor->host_addr + predecessor_size != host_addr) {
> +            predecessor->host_addr + predecessor_size != host_addr ||
> +            predecessor->mr != section->mr) {
>               predecessor = NULL;
>           }
>       }


Acked-by: Stefan Berger <stefanb@linux.ibm.com>




* Re: [PATCH resend v2 2/5] softmmu/memory_mapping: reuse qemu_get_guest_simple_memory_mapping()
  2021-07-20 13:03 ` [PATCH resend v2 2/5] softmmu/memory_mapping: reuse qemu_get_guest_simple_memory_mapping() David Hildenbrand
@ 2021-07-20 13:37   ` Stefan Berger
  2021-07-20 13:45     ` David Hildenbrand
  0 siblings, 1 reply; 22+ messages in thread
From: Stefan Berger @ 2021-07-20 13:37 UTC (permalink / raw)
  To: David Hildenbrand, qemu-devel
  Cc: Laurent Vivier, Thomas Huth, Eduardo Habkost, Michael S. Tsirkin,
	Dr. David Alan Gilbert, Peter Xu, Alex Williamson,
	Claudio Fontana, Paolo Bonzini, Marc-André Lureau,
	Alex Bennée, Igor Mammedov


On 7/20/21 9:03 AM, David Hildenbrand wrote:
> Let's reuse qemu_get_guest_simple_memory_mapping(), which does exactly
> what we want.
>
> Cc: Marc-André Lureau <marcandre.lureau@redhat.com>
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Cc: "Michael S. Tsirkin" <mst@redhat.com>
> Cc: Eduardo Habkost <ehabkost@redhat.com>
> Cc: Alex Williamson <alex.williamson@redhat.com>
> Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
> Cc: Igor Mammedov <imammedo@redhat.com>
> Cc: Claudio Fontana <cfontana@suse.de>
> Cc: Thomas Huth <thuth@redhat.com>
> Cc: "Alex Bennée" <alex.bennee@linaro.org>
> Cc: Peter Xu <peterx@redhat.com>
> Cc: Laurent Vivier <lvivier@redhat.com>
> Cc: Stefan Berger <stefanb@linux.ibm.com>
> Signed-off-by: David Hildenbrand <david@redhat.com>
> ---
>   softmmu/memory_mapping.c | 8 +-------
>   1 file changed, 1 insertion(+), 7 deletions(-)
>
> diff --git a/softmmu/memory_mapping.c b/softmmu/memory_mapping.c
> index e7af276546..d63f896b30 100644
> --- a/softmmu/memory_mapping.c
> +++ b/softmmu/memory_mapping.c
> @@ -288,8 +288,6 @@ void qemu_get_guest_memory_mapping(MemoryMappingList *list,
>                                      Error **errp)
>   {
>       CPUState *cpu, *first_paging_enabled_cpu;
> -    GuestPhysBlock *block;
> -    ram_addr_t offset, length;
>   
>       first_paging_enabled_cpu = find_paging_enabled_cpu(first_cpu);
>       if (first_paging_enabled_cpu) {
> @@ -309,11 +307,7 @@ void qemu_get_guest_memory_mapping(MemoryMappingList *list,
>        * If the guest doesn't use paging, the virtual address is equal to physical
>        * address.
>        */
> -    QTAILQ_FOREACH(block, &guest_phys_blocks->head, next) {
> -        offset = block->target_start;
> -        length = block->target_end - block->target_start;
> -        create_new_memory_mapping(list, offset, offset, length);
> -    }
> +    qemu_get_guest_simple_memory_mapping(list, guest_phys_blocks);
>   }

I thought I'd find a 1:1 replacement for the above here:

void qemu_get_guest_simple_memory_mapping(MemoryMappingList *list,
                                          const GuestPhysBlockList *guest_phys_blocks)
{
    GuestPhysBlock *block;

    QTAILQ_FOREACH(block, &guest_phys_blocks->head, next) {
        create_new_memory_mapping(list, block->target_start, 0,
                                  block->target_end - block->target_start);
    }
}

But this is calling create_new_memory_mapping() with a different 3rd
parameter: 0 vs. offset.
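
(For context -- going from memory of softmmu/memory_mapping.c, so treat
the parameter names as approximate -- the signature is
create_new_memory_mapping(list, phys_addr, virt_addr, length):

    create_new_memory_mapping(list, offset, offset, length);         /* removed loop */
    create_new_memory_mapping(list, block->target_start, 0, length); /* simple variant */

The removed loop created identity mappings (virt == phys) for the
non-paging case, while the simple variant maps every range at virtual
address 0.)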


>   
>   void qemu_get_guest_simple_memory_mapping(MemoryMappingList *list,



* Re: [PATCH resend v2 2/5] softmmu/memory_mapping: reuse qemu_get_guest_simple_memory_mapping()
  2021-07-20 13:37   ` Stefan Berger
@ 2021-07-20 13:45     ` David Hildenbrand
  0 siblings, 0 replies; 22+ messages in thread
From: David Hildenbrand @ 2021-07-20 13:45 UTC (permalink / raw)
  To: Stefan Berger, qemu-devel
  Cc: Laurent Vivier, Thomas Huth, Eduardo Habkost, Michael S. Tsirkin,
	Dr. David Alan Gilbert, Peter Xu, Alex Williamson,
	Claudio Fontana, Paolo Bonzini, Marc-André Lureau,
	Alex Bennée, Igor Mammedov

On 20.07.21 15:37, Stefan Berger wrote:
> 
> On 7/20/21 9:03 AM, David Hildenbrand wrote:
>> Let's reuse qemu_get_guest_simple_memory_mapping(), which does exactly
>> what we want.
>>
>> Cc: Marc-André Lureau <marcandre.lureau@redhat.com>
>> Cc: Paolo Bonzini <pbonzini@redhat.com>
>> Cc: "Michael S. Tsirkin" <mst@redhat.com>
>> Cc: Eduardo Habkost <ehabkost@redhat.com>
>> Cc: Alex Williamson <alex.williamson@redhat.com>
>> Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
>> Cc: Igor Mammedov <imammedo@redhat.com>
>> Cc: Claudio Fontana <cfontana@suse.de>
>> Cc: Thomas Huth <thuth@redhat.com>
>> Cc: "Alex Bennée" <alex.bennee@linaro.org>
>> Cc: Peter Xu <peterx@redhat.com>
>> Cc: Laurent Vivier <lvivier@redhat.com>
>> Cc: Stefan Berger <stefanb@linux.ibm.com>
>> Signed-off-by: David Hildenbrand <david@redhat.com>
>> ---
>>    softmmu/memory_mapping.c | 8 +-------
>>    1 file changed, 1 insertion(+), 7 deletions(-)
>>
>> diff --git a/softmmu/memory_mapping.c b/softmmu/memory_mapping.c
>> index e7af276546..d63f896b30 100644
>> --- a/softmmu/memory_mapping.c
>> +++ b/softmmu/memory_mapping.c
>> @@ -288,8 +288,6 @@ void qemu_get_guest_memory_mapping(MemoryMappingList *list,
>>                                       Error **errp)
>>    {
>>        CPUState *cpu, *first_paging_enabled_cpu;
>> -    GuestPhysBlock *block;
>> -    ram_addr_t offset, length;
>>    
>>        first_paging_enabled_cpu = find_paging_enabled_cpu(first_cpu);
>>        if (first_paging_enabled_cpu) {
>> @@ -309,11 +307,7 @@ void qemu_get_guest_memory_mapping(MemoryMappingList *list,
>>         * If the guest doesn't use paging, the virtual address is equal to physical
>>         * address.
>>         */
>> -    QTAILQ_FOREACH(block, &guest_phys_blocks->head, next) {
>> -        offset = block->target_start;
>> -        length = block->target_end - block->target_start;
>> -        create_new_memory_mapping(list, offset, offset, length);
>> -    }
>> +    qemu_get_guest_simple_memory_mapping(list, guest_phys_blocks);
>>    }
> 
> I thought I'd find a 1:1 replacement for the above here:
> 
> void qemu_get_guest_simple_memory_mapping(MemoryMappingList *list,
>                                           const GuestPhysBlockList *guest_phys_blocks)
> {
>     GuestPhysBlock *block;
> 
>     QTAILQ_FOREACH(block, &guest_phys_blocks->head, next) {
>         create_new_memory_mapping(list, block->target_start, 0,
>                                   block->target_end - block->target_start);
>     }
> }
> 
> But this is calling create_new_memory_mapping() with a different 3rd
> parameter: 0 vs. offset.

Oh, thanks for noticing! Will drop this patch then -- thanks!


-- 
Thanks,

David / dhildenb




* Re: [PATCH resend v2 1/5] tpm: mark correct memory region range dirty when clearing RAM
  2021-07-20 13:03 ` [PATCH resend v2 1/5] tpm: mark correct memory region range dirty when clearing RAM David Hildenbrand
@ 2021-07-23 14:52   ` Peter Xu
  2021-07-23 19:15     ` David Hildenbrand
  0 siblings, 1 reply; 22+ messages in thread
From: Peter Xu @ 2021-07-23 14:52 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Laurent Vivier, Thomas Huth, Eduardo Habkost, Michael S. Tsirkin,
	Stefan Berger, qemu-devel, Dr. David Alan Gilbert,
	Alex Williamson, Claudio Fontana, Paolo Bonzini,
	Marc-André Lureau, Alex Bennée, Igor Mammedov,
	Stefan Berger

On Tue, Jul 20, 2021 at 03:03:00PM +0200, David Hildenbrand wrote:
> @@ -30,11 +30,13 @@ void tpm_ppi_reset(TPMPPI *tpmppi)
>          guest_phys_blocks_init(&guest_phys_blocks);
>          guest_phys_blocks_append(&guest_phys_blocks);
>          QTAILQ_FOREACH(block, &guest_phys_blocks.head, next) {
> +            ram_addr_t mr_start = memory_region_get_ram_addr(block->mr);
> +
>              trace_tpm_ppi_memset(block->host_addr,
>                                   block->target_end - block->target_start);
>              memset(block->host_addr, 0,
>                     block->target_end - block->target_start);
> -            memory_region_set_dirty(block->mr, 0,
> +            memory_region_set_dirty(block->mr, block->target_start - mr_start,
>                                      block->target_end - block->target_start);

target_start should fall in the GPA range, while mr_start is a ram_addr_t. I am
not sure whether this is right...

Neither do I know how to get the correct mr offset with the existing info we've got
from GuestPhysBlock.  Maybe we need to teach guest_phys_blocks_region_add() to
also record section->offset_within_region?
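
Something like the following, perhaps (purely illustrative; the new field
in GuestPhysBlock is hypothetical):

    /* guest_phys_blocks_region_add(): remember the in-region offset */
    block->offset_within_region = section->offset_within_region;

    /* tpm_ppi_reset(): dirty-track relative to the RAMBlock start */
    memory_region_set_dirty(block->mr, block->offset_within_region,
                            block->target_end - block->target_start);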

-- 
Peter Xu




* Re: [PATCH resend v2 3/5] softmmu/memory_mapping: never merge ranges across memory regions
  2021-07-20 13:03 ` [PATCH resend v2 3/5] softmmu/memory_mapping: never merge ranges across memory regions David Hildenbrand
  2021-07-20 13:26   ` Stefan Berger
@ 2021-07-23 15:09   ` Peter Xu
  1 sibling, 0 replies; 22+ messages in thread
From: Peter Xu @ 2021-07-23 15:09 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Laurent Vivier, Thomas Huth, Eduardo Habkost, Michael S. Tsirkin,
	qemu-devel, Dr. David Alan Gilbert, Alex Williamson,
	Claudio Fontana, Paolo Bonzini, Marc-André Lureau,
	Alex Bennée, Igor Mammedov, Stefan Berger

On Tue, Jul 20, 2021 at 03:03:02PM +0200, David Hildenbrand wrote:
> Let's make sure to not merge when different memory regions are involved.
> Unlikely, but theoretically possible.
> 
> Cc: Marc-André Lureau <marcandre.lureau@redhat.com>
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Cc: "Michael S. Tsirkin" <mst@redhat.com>
> Cc: Eduardo Habkost <ehabkost@redhat.com>
> Cc: Alex Williamson <alex.williamson@redhat.com>
> Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
> Cc: Igor Mammedov <imammedo@redhat.com>
> Cc: Claudio Fontana <cfontana@suse.de>
> Cc: Thomas Huth <thuth@redhat.com>
> Cc: "Alex Bennée" <alex.bennee@linaro.org>
> Cc: Peter Xu <peterx@redhat.com>
> Cc: Laurent Vivier <lvivier@redhat.com>
> Cc: Stefan Berger <stefanb@linux.ibm.com>
> Signed-off-by: David Hildenbrand <david@redhat.com>

Reviewed-by: Peter Xu <peterx@redhat.com>

-- 
Peter Xu




* Re: [PATCH resend v2 4/5] softmmu/memory_mapping: factor out adding physical memory ranges
  2021-07-20 13:03 ` [PATCH resend v2 4/5] softmmu/memory_mapping: factor out adding physical memory ranges David Hildenbrand
  2021-07-20 13:25   ` Stefan Berger
@ 2021-07-23 15:09   ` Peter Xu
  1 sibling, 0 replies; 22+ messages in thread
From: Peter Xu @ 2021-07-23 15:09 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Laurent Vivier, Thomas Huth, Eduardo Habkost, Michael S. Tsirkin,
	qemu-devel, Dr. David Alan Gilbert, Alex Williamson,
	Claudio Fontana, Paolo Bonzini, Marc-André Lureau,
	Alex Bennée, Igor Mammedov, Stefan Berger

On Tue, Jul 20, 2021 at 03:03:03PM +0200, David Hildenbrand wrote:
> Let's factor out adding a MemoryRegionSection to the list, to be reused in
> RamDiscardManager context next.
> 
> Cc: Marc-André Lureau <marcandre.lureau@redhat.com>
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Cc: "Michael S. Tsirkin" <mst@redhat.com>
> Cc: Eduardo Habkost <ehabkost@redhat.com>
> Cc: Alex Williamson <alex.williamson@redhat.com>
> Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
> Cc: Igor Mammedov <imammedo@redhat.com>
> Cc: Claudio Fontana <cfontana@suse.de>
> Cc: Thomas Huth <thuth@redhat.com>
> Cc: "Alex Bennée" <alex.bennee@linaro.org>
> Cc: Peter Xu <peterx@redhat.com>
> Cc: Laurent Vivier <lvivier@redhat.com>
> Cc: Stefan Berger <stefanb@linux.ibm.com>
> Signed-off-by: David Hildenbrand <david@redhat.com>

Reviewed-by: Peter Xu <peterx@redhat.com>

-- 
Peter Xu




* Re: [PATCH resend v2 5/5] softmmu/memory_mapping: optimize for RamDiscardManager sections
  2021-07-20 13:03 ` [PATCH resend v2 5/5] softmmu/memory_mapping: optimize for RamDiscardManager sections David Hildenbrand
@ 2021-07-23 15:28   ` Peter Xu
  2021-07-23 18:56     ` David Hildenbrand
  2021-07-26 15:24   ` Peter Xu
  1 sibling, 1 reply; 22+ messages in thread
From: Peter Xu @ 2021-07-23 15:28 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Laurent Vivier, Thomas Huth, Eduardo Habkost, Michael S. Tsirkin,
	qemu-devel, Dr. David Alan Gilbert, Alex Williamson,
	Claudio Fontana, Paolo Bonzini, Marc-André Lureau,
	Alex Bennée, Igor Mammedov, Stefan Berger

On Tue, Jul 20, 2021 at 03:03:04PM +0200, David Hildenbrand wrote:
> virtio-mem logically plugs/unplugs memory within a sparse memory region
> and notifies via the RamDiscardManager interface when parts become
> plugged (populated) or unplugged (discarded).
> 
> Currently, we end up (via the two users)
> 1) zeroing all logically unplugged/discarded memory during TPM resets.
> 2) reading all logically unplugged/discarded memory when dumping, to
>    figure out the content is zero.
> 
> 1) is always bad, because we assume unplugged memory stays discarded
>    (and is already implicitly zero).
> 2) isn't that bad with anonymous memory: we end up reading the zero
>    page (slow and unnecessary, though). However, once we use some
>    file-backed memory (future use case), even reading will populate memory.
> 
> Let's cut out all parts marked as not-populated (discarded) via the
> RamDiscardManager. As virtio-mem is the single user, this now means that
> logically unplugged memory ranges will no longer be included in the
> dump, which results in smaller dump files and faster dumping.
> 
> virtio-mem has a minimum granularity of 1 MiB (and the default is usually
> 2 MiB). Theoretically, we could see quite some fragmentation; in practice,
> we won't have it completely fragmented into 1 MiB pieces. Still, we might
> end up with many physical ranges.
> 
> Both the ELF format and kdump seem ready to support many individual
> ranges (e.g., for ELF the limit seems to be UINT32_MAX; kdump uses a
> linear bitmap).
> 
> Cc: Marc-André Lureau <marcandre.lureau@redhat.com>
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Cc: "Michael S. Tsirkin" <mst@redhat.com>
> Cc: Eduardo Habkost <ehabkost@redhat.com>
> Cc: Alex Williamson <alex.williamson@redhat.com>
> Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
> Cc: Igor Mammedov <imammedo@redhat.com>
> Cc: Claudio Fontana <cfontana@suse.de>
> Cc: Thomas Huth <thuth@redhat.com>
> Cc: "Alex Bennée" <alex.bennee@linaro.org>
> Cc: Peter Xu <peterx@redhat.com>
> Cc: Laurent Vivier <lvivier@redhat.com>
> Cc: Stefan Berger <stefanb@linux.ibm.com>
> Signed-off-by: David Hildenbrand <david@redhat.com>
> ---
>  softmmu/memory_mapping.c | 20 ++++++++++++++++++++
>  1 file changed, 20 insertions(+)
> 
> diff --git a/softmmu/memory_mapping.c b/softmmu/memory_mapping.c
> index b7e4f3f788..856778a109 100644
> --- a/softmmu/memory_mapping.c
> +++ b/softmmu/memory_mapping.c
> @@ -246,6 +246,15 @@ static void guest_phys_block_add_section(GuestPhysListener *g,
>  #endif
>  }
>  
> +static int guest_phys_ram_populate_cb(MemoryRegionSection *section,
> +                                      void *opaque)
> +{
> +    GuestPhysListener *g = opaque;
> +
> +    guest_phys_block_add_section(g, section);
> +    return 0;
> +}
> +
>  static void guest_phys_blocks_region_add(MemoryListener *listener,
>                                           MemoryRegionSection *section)
>  {
> @@ -257,6 +266,17 @@ static void guest_phys_blocks_region_add(MemoryListener *listener,
>          memory_region_is_nonvolatile(section->mr)) {
>          return;
>      }
> +
> +    /* for special sparse regions, only add populated parts */
> +    if (memory_region_has_ram_discard_manager(section->mr)) {
> +        RamDiscardManager *rdm;
> +
> +        rdm = memory_region_get_ram_discard_manager(section->mr);
> +        ram_discard_manager_replay_populated(rdm, section,
> +                                             guest_phys_ram_populate_cb, g);
> +        return;
> +    }
> +
>      guest_phys_block_add_section(g, section);
>  }

As I've asked this question previously elsewhere, it's more or less also
related to the design decision of having virtio-mem be sparsely
pluggable at such a small granularity rather than keeping the plug/unplug
contiguous within the GPA range (so we move pages when unplugging).

There are definitely reasons there, and I believe you're the expert on that
(as you mentioned once: some guest GUPed pages cannot migrate, so those
ranges cannot get offlined otherwise), but so far I'm still not sure whether
that's a kernel issue to solve on GUP, although I agree it's a complicated
one anyway!

Maybe it's a trade-off you made in the end; I don't have enough knowledge
to tell.

The patch itself looks okay to me; there's just a slight worry about how
long the list could end up being if it's chopped into 1M/2M small chunks.

Thanks,

-- 
Peter Xu




* Re: [PATCH resend v2 5/5] softmmu/memory_mapping: optimize for RamDiscardManager sections
  2021-07-23 15:28   ` Peter Xu
@ 2021-07-23 18:56     ` David Hildenbrand
  2021-07-23 22:33       ` Peter Xu
  0 siblings, 1 reply; 22+ messages in thread
From: David Hildenbrand @ 2021-07-23 18:56 UTC (permalink / raw)
  To: Peter Xu
  Cc: Laurent Vivier, Thomas Huth, Eduardo Habkost, Michael S. Tsirkin,
	qemu-devel, Dr. David Alan Gilbert, Alex Williamson,
	Claudio Fontana, Paolo Bonzini, Marc-André Lureau,
	Alex Bennée, Igor Mammedov, Stefan Berger

> 
> As I've asked this question previously elsewhere, it's more or less also
> related to the design decision of having virtio-mem be sparsely
> pluggable at such a small granularity rather than keeping the plug/unplug
> contiguous within the GPA range (so we move pages when unplugging).

Yes, in an ideal world that would be the optimal solution. Unfortunately,
we're not living in an ideal world :)

virtio-mem in Linux guests will by default try unplugging
highest-to-lowest address, and I have on my TODO list an item to shrink 
the usable region (-> later, shrinking the actual RAMBlock) once possible.

So virtio-mem is prepared for that, but it will only apply in some cases.

> 
> There are definitely reasons there, and I believe you're the expert on that
> (as you mentioned once: some guest GUPed pages cannot migrate, so those
> ranges cannot get offlined otherwise), but so far I'm still not sure whether
> that's a kernel issue to solve on GUP, although I agree it's a complicated
> one anyway!

To do something like that reliably, you have to manage hotplugged memory 
in a special way, for example, in a movable zone.

We have at least 4 cases:

a) The guest OS supports the movable zone and uses it for all hotplugged
    memory
b) The guest OS supports the movable zone and uses it for some
    hotplugged memory
c) The guest OS supports the movable zone and uses it for no hotplugged
    memory
d) The guest OS does not support the concept of movable zones


a) is the dream but only applies in some cases if Linux is properly
configured (e.g., never hotplug more than 3 times the boot memory)
b) will be possible under Linux soon (e.g., when hotplugging more than 3
times the boot memory)
c) is the default under Linux for most Linux distributions
d) is Windows

In addition, we can still have random unplug errors when using the 
movable zone, for example, if someone references a page just a little 
too long.

Maybe that helps.

> 
> Maybe it's a trade-off you made in the end; I don't have enough knowledge to tell.

That's the precise description of what virtio-mem is. It's a trade-off 
between which OSs we want to support, what the guest OS can actually do, 
how we can manage memory in the hypervisor efficiently, ...

> 
> The patch itself looks okay to me; there's just a slight worry about how
> long the list could end up being if it's chopped into 1M/2M small chunks.

I don't think that's really an issue: take a look at 
qemu_get_guest_memory_mapping(), which will create as many entries as 
necessary to express the guest physical mapping of the guest virtual (!) 
address space with such chunks. That can be a lot :)



-- 
Thanks,

David / dhildenb




* Re: [PATCH resend v2 1/5] tpm: mark correct memory region range dirty when clearing RAM
  2021-07-23 14:52   ` Peter Xu
@ 2021-07-23 19:15     ` David Hildenbrand
  2021-07-23 22:35       ` Peter Xu
  0 siblings, 1 reply; 22+ messages in thread
From: David Hildenbrand @ 2021-07-23 19:15 UTC (permalink / raw)
  To: Peter Xu
  Cc: Laurent Vivier, Thomas Huth, Eduardo Habkost, Michael S. Tsirkin,
	Stefan Berger, qemu-devel, Dr. David Alan Gilbert,
	Alex Williamson, Claudio Fontana, Paolo Bonzini,
	Marc-André Lureau, Alex Bennée, Igor Mammedov,
	Stefan Berger

On 23.07.21 16:52, Peter Xu wrote:
> On Tue, Jul 20, 2021 at 03:03:00PM +0200, David Hildenbrand wrote:
>> @@ -30,11 +30,13 @@ void tpm_ppi_reset(TPMPPI *tpmppi)
>>           guest_phys_blocks_init(&guest_phys_blocks);
>>           guest_phys_blocks_append(&guest_phys_blocks);
>>           QTAILQ_FOREACH(block, &guest_phys_blocks.head, next) {
>> +            ram_addr_t mr_start = memory_region_get_ram_addr(block->mr);
>> +
>>               trace_tpm_ppi_memset(block->host_addr,
>>                                    block->target_end - block->target_start);
>>               memset(block->host_addr, 0,
>>                      block->target_end - block->target_start);
>> -            memory_region_set_dirty(block->mr, 0,
>> +            memory_region_set_dirty(block->mr, block->target_start - mr_start,
>>                                       block->target_end - block->target_start);
> 
> target_start should fall in the GPA range, while mr_start is a ram_addr_t. I am
> not sure whether this is right...

When I wrote that code I was under the impression that 
memory_region_get_ram_addr() would give the GPA where the memory region 
starts, but ... that's not correct as you point out. "offset" confusion :)

> 
> Neither do I know how to get the correct mr offset with the existing info we've got
> from GuestPhysBlock.  Maybe we need to teach guest_phys_blocks_region_add() to
> also record section->offset_within_region?

We might actually want offset_within_address_space + 
offset_within_region, so we can calculate the GPA difference to see 
where inside the ramblock we end up.
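
Sketching what I mean (hypothetical GuestPhysBlock fields, for
illustration only):

    /* GPA distance from the start of the section to this block ... */
    hwaddr delta = block->target_start - block->offset_within_address_space;

    /* ... is the same distance inside the memory region */
    memory_region_set_dirty(block->mr,
                            block->offset_within_region + delta,
                            block->target_end - block->target_start);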

I'll have a look next week, thanks for noticing!

-- 
Thanks,

David / dhildenb




* Re: [PATCH resend v2 5/5] softmmu/memory_mapping: optimize for RamDiscardManager sections
  2021-07-23 18:56     ` David Hildenbrand
@ 2021-07-23 22:33       ` Peter Xu
  2021-07-26  7:51         ` David Hildenbrand
  0 siblings, 1 reply; 22+ messages in thread
From: Peter Xu @ 2021-07-23 22:33 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Laurent Vivier, Thomas Huth, Eduardo Habkost, Michael S. Tsirkin,
	qemu-devel, Dr. David Alan Gilbert, Alex Williamson,
	Claudio Fontana, Paolo Bonzini, Marc-André Lureau,
	Alex Bennée, Igor Mammedov, Stefan Berger

On Fri, Jul 23, 2021 at 08:56:54PM +0200, David Hildenbrand wrote:
> > 
> > As I've asked this question previously elsewhere, it's more or less also
> > related to the design decision of having virtio-mem be sparsely
> > pluggable at such a small granularity rather than keeping the plug/unplug
> > contiguous within the GPA range (so we move pages when unplugging).
> 
> Yes, in an ideal world that would be the optimal solution. Unfortunately,
> we're not living in an ideal world :)
> 
> virtio-mem in Linux guests will by default try unplugging highest-to-lowest
> address, and I have on my TODO list an item to shrink the usable region (->
> later, shrinking the actual RAMBlock) once possible.
> 
> So virtio-mem is prepared for that, but it will only apply in some cases.
> 
> > 
> > There are definitely reasons there, and I believe you're the expert on that
> > (as you mentioned once: some guest GUPed pages cannot migrate, so those
> > ranges cannot get offlined otherwise), but so far I'm still not sure whether
> > that's a kernel issue to solve on GUP, although I agree it's a complicated
> > one anyway!
> 
> To do something like that reliably, you have to manage hotplugged memory in
> a special way, for example, in a movable zone.
> 
> We have at least 4 cases:
> 
> a) The guest OS supports the movable zone and uses it for all hotplugged
>    memory
> b) The guest OS supports the movable zone and uses it for some
>    hotplugged memory
> c) The guest OS supports the movable zone and uses it for no hotplugged
>    memory
> d) The guest OS does not support the concept of movable zones
> 
> 
> a) is the dream but only applies in some cases if Linux is properly
> configured (e.g., never hotplug more than 3 times the boot memory)
> b) will be possible under Linux soon (e.g., when hotplugging more than 3
> times the boot memory)
> c) is the default under Linux for most Linux distributions
> d) is Windows
> 
> In addition, we can still have random unplug errors when using the movable
> zone, for example, if someone references a page just a little too long.
> 
> Maybe that helps.

Yes, thanks.

> 
> > 
> > Maybe it's a trade-off you made in the end; I don't have enough knowledge to tell.
> 
> That's the precise description of what virtio-mem is. It's a trade-off
> between which OSs we want to support, what the guest OS can actually do, how
> we can manage memory in the hypervisor efficiently, ...
> 
> > 
> > The patch itself looks okay to me; there's just a slight worry about how
> > long the list could end up being if it's chopped into 1M/2M small chunks.
> 
> I don't think that's really an issue: take a look at
> qemu_get_guest_memory_mapping(), which will create as many entries as
> necessary to express the guest physical mapping of the guest virtual (!)
> address space with such chunks. That can be a lot :)

I'm indeed a bit surprised by the "paging" parameter... I gave it a try; the
list grows to tens of thousands of entries.

One last question: will virtio-mem still do a best-effort move of the pages,
so as to leave as few holes as possible?

Thanks,

-- 
Peter Xu




* Re: [PATCH resend v2 1/5] tpm: mark correct memory region range dirty when clearing RAM
  2021-07-23 19:15     ` David Hildenbrand
@ 2021-07-23 22:35       ` Peter Xu
  2021-07-26  8:08         ` David Hildenbrand
  0 siblings, 1 reply; 22+ messages in thread
From: Peter Xu @ 2021-07-23 22:35 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Laurent Vivier, Thomas Huth, Eduardo Habkost, Stefan Berger,
	Michael S. Tsirkin, qemu-devel, Dr. David Alan Gilbert,
	Alex Williamson, Claudio Fontana, Marc-André Lureau,
	Paolo Bonzini, Alex Bennée, Igor Mammedov, Stefan Berger

On Fri, Jul 23, 2021 at 09:15:43PM +0200, David Hildenbrand wrote:
> On 23.07.21 16:52, Peter Xu wrote:
> > On Tue, Jul 20, 2021 at 03:03:00PM +0200, David Hildenbrand wrote:
> > > @@ -30,11 +30,13 @@ void tpm_ppi_reset(TPMPPI *tpmppi)
> > >           guest_phys_blocks_init(&guest_phys_blocks);
> > >           guest_phys_blocks_append(&guest_phys_blocks);
> > >           QTAILQ_FOREACH(block, &guest_phys_blocks.head, next) {
> > > +            ram_addr_t mr_start = memory_region_get_ram_addr(block->mr);
> > > +
> > >               trace_tpm_ppi_memset(block->host_addr,
> > >                                    block->target_end - block->target_start);
> > >               memset(block->host_addr, 0,
> > >                      block->target_end - block->target_start);
> > > -            memory_region_set_dirty(block->mr, 0,
> > > +            memory_region_set_dirty(block->mr, block->target_start - mr_start,
> > >                                       block->target_end - block->target_start);
> > 
> > target_start should fall in the GPA range, while mr_start is a ram_addr_t. I am
> > not sure whether this is right...
> 
> When I wrote that code I was under the impression that
> memory_region_get_ram_addr() would give the GPA where the memory region
> starts, but ... that's not correct as you point out. "offset" confusion :)
> 
> > 
> > Neither do I know how to get the correct mr offset with the existing info we've got
> > from GuestPhysBlock.  Maybe we need to teach guest_phys_blocks_region_add() to
> > also record section->offset_within_region?
> 
> We might actually want offset_within_address_space + offset_within_region,
> so we can calculate the GPA difference to see where inside the ramblock we
> end up.

I still think offset_within_region is exactly what we want to fill in here,
but you can double-check.

> 
> I'll have a look next week, thanks for noticing!

Sure!

-- 
Peter Xu




* Re: [PATCH resend v2 5/5] softmmu/memory_mapping: optimize for RamDiscardManager sections
  2021-07-23 22:33       ` Peter Xu
@ 2021-07-26  7:51         ` David Hildenbrand
  0 siblings, 0 replies; 22+ messages in thread
From: David Hildenbrand @ 2021-07-26  7:51 UTC (permalink / raw)
  To: Peter Xu
  Cc: Laurent Vivier, Thomas Huth, Eduardo Habkost, Michael S. Tsirkin,
	qemu-devel, Dr. David Alan Gilbert, Alex Williamson,
	Claudio Fontana, Paolo Bonzini, Marc-André Lureau,
	Alex Bennée, Igor Mammedov, Stefan Berger

On 24.07.21 00:33, Peter Xu wrote:
> On Fri, Jul 23, 2021 at 08:56:54PM +0200, David Hildenbrand wrote:
>>>
>>> As I've asked this question previously elsewhere, it's more or less also
>>> related to the design decision of having virtio-mem be sparsely
>>> pluggable at such a small granularity rather than keeping the plug/unplug
>>> contiguous within the GPA range (so we move pages when unplugging).
>>
>> Yes, in an ideal world that would be the optimal solution. Unfortunately,
>> we're not living in an ideal world :)
>>
>> virtio-mem in Linux guests will by default try unplugging highest-to-lowest
>> address, and I have on my TODO list an item to shrink the usable region (->
>> later, shrinking the actual RAMBlock) once possible.
>>
>> So virtio-mem is prepared for that, but it will only apply in some cases.
>>
>>>
>>> There are definitely reasons there, and I believe you're the expert on that
>>> (as you mentioned once: some guest GUPed pages cannot migrate, so those
>>> ranges cannot get offlined otherwise), but so far I'm still not sure whether
>>> that's a kernel issue to solve on GUP, although I agree it's a complicated
>>> one anyway!
>>
>> To do something like that reliably, you have to manage hotplugged memory in
>> a special way, for example, in a movable zone.
>>
>> We have at least 4 cases:
>>
>> a) The guest OS supports the movable zone and uses it for all hotplugged
>>     memory
>> b) The guest OS supports the movable zone and uses it for some
>>     hotplugged memory
>> c) The guest OS supports the movable zone and uses it for no hotplugged
>>     memory
>> d) The guest OS does not support the concept of movable zones
>>
>>
>> a) is the dream but only applies in some cases if Linux is properly
>> configured (e.g., never hotplug more than 3 times the boot memory)
>> b) will be possible under Linux soon (e.g., when hotplugging more than 3
>> times the boot memory)
>> c) is the default under Linux for most Linux distributions
>> d) is Windows
>>
>> In addition, we can still have random unplug errors when using the movable
>> zone, for example, if someone references a page just a little too long.
>>
>> Maybe that helps.
> 
> Yes, thanks.
> 
>>
>>>
>>> Maybe it's a trade-off you made at last, I don't have enough knowledge to tell.
>>
>> That's the precise description of what virtio-mem is. It's a trade-off
>> between which OSs we want to support, what the guest OS can actually do, how
>> we can manage memory in the hypervisor efficiently, ...
>>
>>>
>>> The patch itself looks okay to me; there's just a slight worry about how
>>> long the list could end up being if it's chopped into small 1M/2M chunks.
>>
>> I don't think that's really an issue: take a look at
>> qemu_get_guest_memory_mapping(), which will create as many entries as
>> necessary to express the guest physical mapping of the guest virtual (!)
>> address space with such chunks. That can be a lot :)
> 
> I'm indeed a bit surprised by the "paging" parameter... I gave it a try, and
> the list grows to tens of thousands of entries.

Yes, and the bigger the VM, the more entries you should get ... like 
with virtio-mem.
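
FWIW, this is roughly how the dump code builds that list in paging mode 
(a simplified sketch based on include/sysemu/memory_mapping.h; error 
handling and details omitted):

    GuestPhysBlockList blocks;
    MemoryMappingList list;
    MemoryMapping *m;
    Error *err = NULL;

    guest_phys_blocks_init(&blocks);
    guest_phys_blocks_append(&blocks);
    memory_mapping_list_init(&list);
    /* Walks the page tables of the first vCPU that has paging enabled;
     * one entry per contiguous virt->phys run, which is where the tens
     * of thousands of entries come from. */
    qemu_get_guest_memory_mapping(&list, &blocks, &err);
    QTAILQ_FOREACH(m, &list.head, next) {
        /* m->virt_addr maps to m->phys_addr for m->length bytes */
    }
    memory_mapping_list_free(&list);
    guest_phys_blocks_free(&blocks);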

> 
> One last question: will virtio-mem still make a best effort to move pages, so
> as to leave as few holes as possible?

That depends on the guest OS.

Linux guests will unplug from highest to lowest addresses. They will try 
migrating pages away (alloc_contig_range()) to minimize fragmentation. 
Further, when (un)plugging, they will try to a) unplug within already 
fragmented Linux memory blocks (e.g., 128 MiB) and b) plug within 
already fragmented Linux memory blocks first, because the goal is to 
use as few Linux memory blocks as possible, to reduce metadata 
(memmap) overhead.

I recall that the Windows prototype also tries to unplug highest-to-lowest 
using the Windows range allocator; however, I have no idea what that 
range allocator actually does (whether it only grabs free pages or can 
actually move around busy pages).

For Linux guests, there is a work item to continue defragmenting the 
layout to free up complete Linux memory blocks over time.


With a 1 TiB virtio-mem device and a 2 MiB block size (the default), we 
have 524288 blocks; in the worst case (every second block plugged) that 
means 262144 individual plugged ranges. While this is far from 
realistic, I assume we can get something comparable when dumping a huge 
VM in paging mode.

With 262144 entries, at ~48 bytes (6 * 8 bytes) per element, we'd 
consume ~12 MiB for the whole list. Not perfect, but not too bad.
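
In case it helps review, the gist of the optimization looks roughly like 
this (a simplified sketch of the approach in patch #5, using the helper 
factored out in patch #4; see the actual patches for the real code):

    static int guest_phys_ram_populate_cb(MemoryRegionSection *section,
                                          void *opaque)
    {
        GuestPhysListener *g = opaque;

        guest_phys_block_add_section(g, section);
        return 0;
    }

    /* in guest_phys_blocks_region_add(): */
    if (memory_region_has_ram_discard_manager(section->mr)) {
        RamDiscardManager *rdm =
            memory_region_get_ram_discard_manager(section->mr);

        /* for sparse RAM regions, only add the populated parts */
        ram_discard_manager_replay_populated(rdm, section,
                                             guest_phys_ram_populate_cb, g);
        return;
    }
    guest_phys_block_add_section(g, section);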

-- 
Thanks,

David / dhildenb



^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH resend v2 1/5] tpm: mark correct memory region range dirty when clearing RAM
  2021-07-23 22:35       ` Peter Xu
@ 2021-07-26  8:08         ` David Hildenbrand
  2021-07-26 14:21           ` Peter Xu
  0 siblings, 1 reply; 22+ messages in thread
From: David Hildenbrand @ 2021-07-26  8:08 UTC (permalink / raw)
  To: Peter Xu
  Cc: Laurent Vivier, Thomas Huth, Eduardo Habkost, Stefan Berger,
	Michael S. Tsirkin, qemu-devel, Dr . David Alan Gilbert,
	Alex Williamson, Claudio Fontana, Marc-André Lureau,
	Paolo Bonzini, Alex Bennée, Igor Mammedov, Stefan Berger

On 24.07.21 00:35, Peter Xu wrote:
> On Fri, Jul 23, 2021 at 09:15:43PM +0200, David Hildenbrand wrote:
>> On 23.07.21 16:52, Peter Xu wrote:
>>> On Tue, Jul 20, 2021 at 03:03:00PM +0200, David Hildenbrand wrote:
>>>> @@ -30,11 +30,13 @@ void tpm_ppi_reset(TPMPPI *tpmppi)
>>>>            guest_phys_blocks_init(&guest_phys_blocks);
>>>>            guest_phys_blocks_append(&guest_phys_blocks);
>>>>            QTAILQ_FOREACH(block, &guest_phys_blocks.head, next) {
>>>> +            ram_addr_t mr_start = memory_region_get_ram_addr(block->mr);
>>>> +
>>>>                trace_tpm_ppi_memset(block->host_addr,
>>>>                                     block->target_end - block->target_start);
>>>>                memset(block->host_addr, 0,
>>>>                       block->target_end - block->target_start);
>>>> -            memory_region_set_dirty(block->mr, 0,
>>>> +            memory_region_set_dirty(block->mr, block->target_start - mr_start,
>>>>                                        block->target_end - block->target_start);
>>>
>>> target_start should fall in the GPA range, while mr_start is a ram_addr_t.
>>> I am not sure whether this is right...
>>
>> When I wrote that code I was under the impression that
>> memory_region_get_ram_addr() would give the GPA where the memory region
>> starts, but ... that's not correct as you point out. "offset" confusion :)
>>
>>>
>>> Neither do I know how to get the correct mr offset with the existing info
>>> we've got from GuestPhysBlock.  Maybe we need to teach
>>> guest_phys_blocks_region_add() to also record section->offset_within_region?
>>
>> We might actually want offset_within_address_space + offset_within_region,
>> so we can calculate the GPA difference to see where inside the ramblock we
>> end up.
> 
> I still think offset_within_region is exactly what we want to fill in here, but
> you can double-check.

I remember that when I first looked into this months ago, I wanted to avoid
extending GuestPhysBlock. The commit message actually tells us what to do,
and where my optimization went wrong :)

"We might not start at the beginning of the memory region. We could also
  calculate via the difference in the host address; however,
  memory_region_set_dirty() also relies on memory_region_get_ram_addr()
  internally, so let's just use that."

So, avoiding the optimization, we'd be left with:


diff --git a/hw/tpm/tpm_ppi.c b/hw/tpm/tpm_ppi.c
index 362edcc5c9..fab49524d7 100644
--- a/hw/tpm/tpm_ppi.c
+++ b/hw/tpm/tpm_ppi.c
@@ -30,11 +30,14 @@ void tpm_ppi_reset(TPMPPI *tpmppi)
          guest_phys_blocks_init(&guest_phys_blocks);
          guest_phys_blocks_append(&guest_phys_blocks);
          QTAILQ_FOREACH(block, &guest_phys_blocks.head, next) {
+            hwaddr mr_offs = block->host_addr -
+                             (uint8_t *)memory_region_get_ram_ptr(block->mr);
+
              trace_tpm_ppi_memset(block->host_addr,
                                   block->target_end - block->target_start);
              memset(block->host_addr, 0,
                     block->target_end - block->target_start);
-            memory_region_set_dirty(block->mr, 0,
+            memory_region_set_dirty(block->mr, mr_offs,
                                      block->target_end - block->target_start);
          }
          guest_phys_blocks_free(&guest_phys_blocks);
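
(block->host_addr points into the host mapping of the RAMBlock backing 
the MR, and memory_region_get_ram_ptr() returns the start of that 
mapping, so the pointer difference is exactly the offset of this block 
within the memory region; no GPA arithmetic and no new GuestPhysBlock 
field needed.)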


That should make more sense :)


-- 
Thanks,

David / dhildenb



^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [PATCH resend v2 1/5] tpm: mark correct memory region range dirty when clearing RAM
  2021-07-26  8:08         ` David Hildenbrand
@ 2021-07-26 14:21           ` Peter Xu
  0 siblings, 0 replies; 22+ messages in thread
From: Peter Xu @ 2021-07-26 14:21 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Laurent Vivier, Thomas Huth, Eduardo Habkost, Stefan Berger,
	Michael S. Tsirkin, qemu-devel, Dr . David Alan Gilbert,
	Alex Williamson, Claudio Fontana, Marc-André Lureau,
	Paolo Bonzini, Alex Bennée, Igor Mammedov, Stefan Berger

On Mon, Jul 26, 2021 at 10:08:59AM +0200, David Hildenbrand wrote:
> diff --git a/hw/tpm/tpm_ppi.c b/hw/tpm/tpm_ppi.c
> index 362edcc5c9..fab49524d7 100644
> --- a/hw/tpm/tpm_ppi.c
> +++ b/hw/tpm/tpm_ppi.c
> @@ -30,11 +30,14 @@ void tpm_ppi_reset(TPMPPI *tpmppi)
>          guest_phys_blocks_init(&guest_phys_blocks);
>          guest_phys_blocks_append(&guest_phys_blocks);
>          QTAILQ_FOREACH(block, &guest_phys_blocks.head, next) {
> +            hwaddr mr_offs = block->host_addr -
> +                             (uint8_t *)memory_region_get_ram_ptr(block->mr);
> +
>              trace_tpm_ppi_memset(block->host_addr,
>                                   block->target_end - block->target_start);
>              memset(block->host_addr, 0,
>                     block->target_end - block->target_start);
> -            memory_region_set_dirty(block->mr, 0,
> +            memory_region_set_dirty(block->mr, mr_offs,
>                                      block->target_end - block->target_start);
>          }
>          guest_phys_blocks_free(&guest_phys_blocks);
> 
> 
> That should make more sense :)

Yep, looks good to me (and simpler!).

-- 
Peter Xu



^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH resend v2 5/5] softmmu/memory_mapping: optimize for RamDiscardManager sections
  2021-07-20 13:03 ` [PATCH resend v2 5/5] softmmu/memory_mapping: optimize for RamDiscardManager sections David Hildenbrand
  2021-07-23 15:28   ` Peter Xu
@ 2021-07-26 15:24   ` Peter Xu
  1 sibling, 0 replies; 22+ messages in thread
From: Peter Xu @ 2021-07-26 15:24 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Laurent Vivier, Thomas Huth, Eduardo Habkost, Michael S. Tsirkin,
	qemu-devel, Dr . David Alan Gilbert, Alex Williamson,
	Claudio Fontana, Paolo Bonzini, Marc-André Lureau,
	Alex Bennée, Igor Mammedov, Stefan Berger

On Tue, Jul 20, 2021 at 03:03:04PM +0200, David Hildenbrand wrote:
> virtio-mem logically plugs/unplugs memory within a sparse memory region
> and notifies via the RamDiscardManager interface when parts become
> plugged (populated) or unplugged (discarded).
> 
> Currently, we end up (via the two users)
> 1) zeroing all logically unplugged/discarded memory during TPM resets.
> 2) reading all logically unplugged/discarded memory when dumping, to
>    figure out that the content is zero.
> 
> 1) is always bad, because we assume unplugged memory stays discarded
>    (and is already implicitly zero).
> 2) isn't that bad with anonymous memory: we end up reading the zero
>    page (slow and unnecessary, though). However, once we use some
>    file-backed memory (future use case), even reading will populate memory.
> 
> Let's cut out all parts marked as not-populated (discarded) via the
> RamDiscardManager. As virtio-mem is the single user, this now means that
> logically unplugged memory ranges will no longer be included in the
> dump, which results in smaller dump files and faster dumping.
> 
> virtio-mem has a minimum granularity of 1 MiB (and the default is usually
> 2 MiB). Theoretically, we can see quite some fragmentation; in practice,
> we won't have it completely fragmented into 1 MiB pieces. Still, we might
> end up with many physical ranges.
> 
> Both the ELF format and kdump seem ready to support many individual
> ranges (e.g., for ELF the limit seems to be UINT32_MAX; kdump has a
> linear bitmap).
> 
> Cc: Marc-André Lureau <marcandre.lureau@redhat.com>
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Cc: "Michael S. Tsirkin" <mst@redhat.com>
> Cc: Eduardo Habkost <ehabkost@redhat.com>
> Cc: Alex Williamson <alex.williamson@redhat.com>
> Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
> Cc: Igor Mammedov <imammedo@redhat.com>
> Cc: Claudio Fontana <cfontana@suse.de>
> Cc: Thomas Huth <thuth@redhat.com>
> Cc: "Alex Bennée" <alex.bennee@linaro.org>
> Cc: Peter Xu <peterx@redhat.com>
> Cc: Laurent Vivier <lvivier@redhat.com>
> Cc: Stefan Berger <stefanb@linux.ibm.com>
> Signed-off-by: David Hildenbrand <david@redhat.com>

Reviewed-by: Peter Xu <peterx@redhat.com>

-- 
Peter Xu



^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2021-07-26 15:25 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-07-20 13:02 [PATCH resend v2 0/5] softmmu/memory_mapping: optimize dump/tpm for virtio-mem David Hildenbrand
2021-07-20 13:03 ` [PATCH resend v2 1/5] tpm: mark correct memory region range dirty when clearing RAM David Hildenbrand
2021-07-23 14:52   ` Peter Xu
2021-07-23 19:15     ` David Hildenbrand
2021-07-23 22:35       ` Peter Xu
2021-07-26  8:08         ` David Hildenbrand
2021-07-26 14:21           ` Peter Xu
2021-07-20 13:03 ` [PATCH resend v2 2/5] softmmu/memory_mapping: reuse qemu_get_guest_simple_memory_mapping() David Hildenbrand
2021-07-20 13:37   ` Stefan Berger
2021-07-20 13:45     ` David Hildenbrand
2021-07-20 13:03 ` [PATCH resend v2 3/5] softmmu/memory_mapping: never merge ranges accross memory regions David Hildenbrand
2021-07-20 13:26   ` Stefan Berger
2021-07-23 15:09   ` Peter Xu
2021-07-20 13:03 ` [PATCH resend v2 4/5] softmmu/memory_mapping: factor out adding physical memory ranges David Hildenbrand
2021-07-20 13:25   ` Stefan Berger
2021-07-23 15:09   ` Peter Xu
2021-07-20 13:03 ` [PATCH resend v2 5/5] softmmu/memory_mapping: optimize for RamDiscardManager sections David Hildenbrand
2021-07-23 15:28   ` Peter Xu
2021-07-23 18:56     ` David Hildenbrand
2021-07-23 22:33       ` Peter Xu
2021-07-26  7:51         ` David Hildenbrand
2021-07-26 15:24   ` Peter Xu
