qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH for-4.2 v3 0/2] s390: stop abusing memory_region_allocate_system_memory()
@ 2019-08-02 13:32 Igor Mammedov
  2019-08-02 13:32 ` [Qemu-devel] [PATCH for-4.2 v3 1/2] kvm: s390: split too big memory section on several memslots Igor Mammedov
                   ` (3 more replies)
  0 siblings, 4 replies; 13+ messages in thread
From: Igor Mammedov @ 2019-08-02 13:32 UTC (permalink / raw)
  To: qemu-devel; +Cc: thuth, david, cohuck, borntraeger, qemu-s390x, pbonzini


Changelog:
  since v2:
    - break migration from old QEMU (since 2.12-4.1) for guest with >8TB RAM
      and drop migratable aliases patch as was agreed during v2 review
    - drop 4.2 machines patch as it's not prerequisite anymore
  since v1:
    - include 4.2 machines patch for adding compat RAM layout on top
    - 2/4 add missing in v1 patch for splitting too big MemorySection on
          several memslots
    - 3/4 amend code path on alias destruction to ensure that RAMBlock is
          cleaned properly
    - 4/4 add compat machine code to keep old layout (migration-wise) for
          4.1 and older machines 


While looking into unifying guest RAM allocation to use hostmem backends
for initial RAM (especially when -mempath is used) and retiring
memory_region_allocate_system_memory() API, leaving only single hostmem backend,
I was inspecting how currently it is used by boards and it turns out several
boards abuse it by calling the function several times (despite documented contract
forbiding it).

s390 is one of such boards where KVM limitation on memslot size got propagated
to board design and memory_region_allocate_system_memory() was abused to satisfy
KVM requirement for max RAM chunk where memory region alias would suffice.

Unfortunately, memory_region_allocate_system_memory() usage created migration
dependency where guest RAM is transferred in migration stream as several RAMBlocks
if it's more than KVM_SLOT_MAX_BYTES. During v2 review it was agreed to ignore
migration breakage (documenting it in release notes) and leaving only KVM fix.

In order to replace these several RAM chunks with a single memdev and keep it
working with KVM memslot size limit, following was done:
   * [1/2] split too big RAM chunk inside of KVM code on several memory slots
           if necessary
   * [2/2] drop manual ram splitting in s390 code


CC: pbonzini@redhat.com
CC: qemu-s390x@nongnu.org
CC: borntraeger@de.ibm.com
CC: thuth@redhat.com
CC: david@redhat.com
CC: cohuck@redhat.com

Igor Mammedov (2):
  kvm: s390: split too big memory section on several memslots
  s390: do not call memory_region_allocate_system_memory() multiple
    times

 include/hw/s390x/s390-virtio-ccw.h | 10 ++++
 include/sysemu/kvm_int.h           |  1 +
 accel/kvm/kvm-all.c                | 79 ++++++++++++++++++------------
 hw/s390x/s390-virtio-ccw.c         | 30 ++----------
 target/s390x/kvm.c                 |  1 +
 5 files changed, 63 insertions(+), 58 deletions(-)

-- 
2.18.1



^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Qemu-devel] [PATCH for-4.2 v3 1/2] kvm: s390: split too big memory section on several memslots
  2019-08-02 13:32 [Qemu-devel] [PATCH for-4.2 v3 0/2] s390: stop abusing memory_region_allocate_system_memory() Igor Mammedov
@ 2019-08-02 13:32 ` Igor Mammedov
  2019-08-02 13:50   ` Christian Borntraeger
  2019-08-02 13:32 ` [Qemu-devel] [PATCH for-4.2 v3 2/2] s390: do not call memory_region_allocate_system_memory() multiple times Igor Mammedov
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 13+ messages in thread
From: Igor Mammedov @ 2019-08-02 13:32 UTC (permalink / raw)
  To: qemu-devel; +Cc: thuth, david, cohuck, borntraeger, qemu-s390x, pbonzini

Max memslot size supported by kvm on s390 is 8Tb,
move logic of splitting RAM in chunks upto 8T to KVM code.

This way it will hide KVM specific restrictions in KVM code
and won't affect baord level design decisions. Which would allow
us to avoid misusing memory_region_allocate_system_memory() API
and eventually use a single hostmem backend for guest RAM.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
---
patch prepares only KVM side for switching to single RAM memory region
another patch will take care of  dropping manual RAM partitioning in
s390 code.
---
 include/hw/s390x/s390-virtio-ccw.h | 10 ++++
 include/sysemu/kvm_int.h           |  1 +
 accel/kvm/kvm-all.c                | 79 ++++++++++++++++++------------
 hw/s390x/s390-virtio-ccw.c         |  9 ----
 target/s390x/kvm.c                 |  1 +
 5 files changed, 60 insertions(+), 40 deletions(-)

diff --git a/include/hw/s390x/s390-virtio-ccw.h b/include/hw/s390x/s390-virtio-ccw.h
index 8aa27199c9..00632f94b4 100644
--- a/include/hw/s390x/s390-virtio-ccw.h
+++ b/include/hw/s390x/s390-virtio-ccw.h
@@ -21,6 +21,16 @@
 #define S390_MACHINE_CLASS(klass) \
     OBJECT_CLASS_CHECK(S390CcwMachineClass, (klass), TYPE_S390_CCW_MACHINE)
 
+/*
+ * KVM does only support memory slots up to KVM_MEM_MAX_NR_PAGES pages
+ * as the dirty bitmap must be managed by bitops that take an int as
+ * position indicator. If we have a guest beyond that we will split off
+ * new subregions. The split must happen on a segment boundary (1MB).
+ */
+#define KVM_MEM_MAX_NR_PAGES ((1ULL << 31) - 1)
+#define SEG_MSK (~0xfffffULL)
+#define KVM_SLOT_MAX_BYTES ((KVM_MEM_MAX_NR_PAGES * TARGET_PAGE_SIZE) & SEG_MSK)
+
 typedef struct S390CcwMachineState {
     /*< private >*/
     MachineState parent_obj;
diff --git a/include/sysemu/kvm_int.h b/include/sysemu/kvm_int.h
index 31df465fdc..7f7520bce2 100644
--- a/include/sysemu/kvm_int.h
+++ b/include/sysemu/kvm_int.h
@@ -41,4 +41,5 @@ typedef struct KVMMemoryListener {
 void kvm_memory_listener_register(KVMState *s, KVMMemoryListener *kml,
                                   AddressSpace *as, int as_id);
 
+void kvm_set_max_memslot_size(hwaddr max_slot_size);
 #endif
diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
index f450f25295..7a6efb9612 100644
--- a/accel/kvm/kvm-all.c
+++ b/accel/kvm/kvm-all.c
@@ -138,6 +138,7 @@ bool kvm_direct_msi_allowed;
 bool kvm_ioeventfd_any_length_allowed;
 bool kvm_msi_use_devid;
 static bool kvm_immediate_exit;
+static hwaddr kvm_max_slot_size = ~0;
 
 static const KVMCapabilityInfo kvm_required_capabilites[] = {
     KVM_CAP_INFO(USER_MEMORY),
@@ -951,6 +952,14 @@ kvm_check_extension_list(KVMState *s, const KVMCapabilityInfo *list)
     return NULL;
 }
 
+void kvm_set_max_memslot_size(hwaddr max_slot_size)
+{
+    g_assert(
+        ROUND_UP(max_slot_size, qemu_real_host_page_size) == max_slot_size
+    );
+    kvm_max_slot_size = max_slot_size;
+}
+
 static void kvm_set_phys_mem(KVMMemoryListener *kml,
                              MemoryRegionSection *section, bool add)
 {
@@ -958,7 +967,7 @@ static void kvm_set_phys_mem(KVMMemoryListener *kml,
     int err;
     MemoryRegion *mr = section->mr;
     bool writeable = !mr->readonly && !mr->rom_device;
-    hwaddr start_addr, size;
+    hwaddr start_addr, size, slot_size;
     void *ram;
 
     if (!memory_region_is_ram(mr)) {
@@ -983,41 +992,49 @@ static void kvm_set_phys_mem(KVMMemoryListener *kml,
     kvm_slots_lock(kml);
 
     if (!add) {
-        mem = kvm_lookup_matching_slot(kml, start_addr, size);
-        if (!mem) {
-            goto out;
-        }
-        if (mem->flags & KVM_MEM_LOG_DIRTY_PAGES) {
-            kvm_physical_sync_dirty_bitmap(kml, section);
-        }
+        do {
+            slot_size = kvm_max_slot_size < size ? kvm_max_slot_size : size;
+            mem = kvm_lookup_matching_slot(kml, start_addr, slot_size);
+            if (!mem) {
+                goto out;
+            }
+            if (mem->flags & KVM_MEM_LOG_DIRTY_PAGES) {
+                kvm_physical_sync_dirty_bitmap(kml, section);
+            }
 
-        /* unregister the slot */
-        g_free(mem->dirty_bmap);
-        mem->dirty_bmap = NULL;
-        mem->memory_size = 0;
-        mem->flags = 0;
-        err = kvm_set_user_memory_region(kml, mem, false);
-        if (err) {
-            fprintf(stderr, "%s: error unregistering slot: %s\n",
-                    __func__, strerror(-err));
-            abort();
-        }
+            /* unregister the slot */
+            g_free(mem->dirty_bmap);
+            mem->dirty_bmap = NULL;
+            mem->memory_size = 0;
+            mem->flags = 0;
+            err = kvm_set_user_memory_region(kml, mem, false);
+            if (err) {
+                fprintf(stderr, "%s: error unregistering slot: %s\n",
+                        __func__, strerror(-err));
+                abort();
+            }
+            start_addr += slot_size;
+        } while ((size -= slot_size));
         goto out;
     }
 
     /* register the new slot */
-    mem = kvm_alloc_slot(kml);
-    mem->memory_size = size;
-    mem->start_addr = start_addr;
-    mem->ram = ram;
-    mem->flags = kvm_mem_flags(mr);
-
-    err = kvm_set_user_memory_region(kml, mem, true);
-    if (err) {
-        fprintf(stderr, "%s: error registering slot: %s\n", __func__,
-                strerror(-err));
-        abort();
-    }
+    do {
+        slot_size = kvm_max_slot_size < size ? kvm_max_slot_size : size;
+        mem = kvm_alloc_slot(kml);
+        mem->memory_size = slot_size;
+        mem->start_addr = start_addr;
+        mem->ram = ram;
+        mem->flags = kvm_mem_flags(mr);
+
+        err = kvm_set_user_memory_region(kml, mem, true);
+        if (err) {
+            fprintf(stderr, "%s: error registering slot: %s\n", __func__,
+                    strerror(-err));
+            abort();
+        }
+        start_addr += slot_size;
+    } while ((size -= slot_size));
 
 out:
     kvm_slots_unlock(kml);
diff --git a/hw/s390x/s390-virtio-ccw.c b/hw/s390x/s390-virtio-ccw.c
index 5b6a9a4e55..0c03ffb7c7 100644
--- a/hw/s390x/s390-virtio-ccw.c
+++ b/hw/s390x/s390-virtio-ccw.c
@@ -151,15 +151,6 @@ static void virtio_ccw_register_hcalls(void)
                                    virtio_ccw_hcall_early_printk);
 }
 
-/*
- * KVM does only support memory slots up to KVM_MEM_MAX_NR_PAGES pages
- * as the dirty bitmap must be managed by bitops that take an int as
- * position indicator. If we have a guest beyond that we will split off
- * new subregions. The split must happen on a segment boundary (1MB).
- */
-#define KVM_MEM_MAX_NR_PAGES ((1ULL << 31) - 1)
-#define SEG_MSK (~0xfffffULL)
-#define KVM_SLOT_MAX_BYTES ((KVM_MEM_MAX_NR_PAGES * TARGET_PAGE_SIZE) & SEG_MSK)
 static void s390_memory_init(ram_addr_t mem_size)
 {
     MemoryRegion *sysmem = get_system_memory();
diff --git a/target/s390x/kvm.c b/target/s390x/kvm.c
index 6e814c230b..d0267da8e1 100644
--- a/target/s390x/kvm.c
+++ b/target/s390x/kvm.c
@@ -347,6 +347,7 @@ int kvm_arch_init(MachineState *ms, KVMState *s)
      */
     /* kvm_vm_enable_cap(s, KVM_CAP_S390_AIS, 0); */
 
+    kvm_set_max_memslot_size(KVM_SLOT_MAX_BYTES);
     return 0;
 }
 
-- 
2.18.1



^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [Qemu-devel] [PATCH for-4.2 v3 2/2] s390: do not call memory_region_allocate_system_memory() multiple times
  2019-08-02 13:32 [Qemu-devel] [PATCH for-4.2 v3 0/2] s390: stop abusing memory_region_allocate_system_memory() Igor Mammedov
  2019-08-02 13:32 ` [Qemu-devel] [PATCH for-4.2 v3 1/2] kvm: s390: split too big memory section on several memslots Igor Mammedov
@ 2019-08-02 13:32 ` Igor Mammedov
  2019-08-02 13:36   ` David Hildenbrand
  2019-08-02 14:11 ` [Qemu-devel] [PATCH for-4.2 v3 0/2] s390: stop abusing memory_region_allocate_system_memory() no-reply
  2019-08-02 14:42 ` Christian Borntraeger
  3 siblings, 1 reply; 13+ messages in thread
From: Igor Mammedov @ 2019-08-02 13:32 UTC (permalink / raw)
  To: qemu-devel; +Cc: thuth, david, cohuck, borntraeger, qemu-s390x, pbonzini

s390 was trying to solve limited KVM memslot size issue by abusing
memory_region_allocate_system_memory(), which breaks API contract
where the function might be called only once.

Beside an invalid use of API, the approach also introduced migration
issue, since RAM chunks for each KVM_SLOT_MAX_BYTES are transferred in
migration stream as separate RAMBlocks.

After discussion [1], it was agreed to break migration from older
QEMU for guest with RAM >8Tb (as it was relatively new (since 2.12)
and considered to be not actually used downstream).
Migration should keep working for guests with less than 8TB and for
more than 8TB with QEMU 4.2 and newer binary.
In case user tries to migrate more than 8TB guest, between incompatible
QEMU versions, migration should fail gracefully due to non-exiting
RAMBlock ID or RAMBlock size mismatch.

Taking in account above and that now KVM code is able to split too
big MemorySection into several memslots, stop abusing
  memory_region_allocate_system_memory()
and use only one memory region for RAM.

1) [PATCH RFC v2 4/4] s390: do not call  memory_region_allocate_system_memory() multiple times

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
---
v3:
  - drop migration compat code

PS:
I don't have access to a suitable system to test it.
---
 hw/s390x/s390-virtio-ccw.c | 21 +++------------------
 1 file changed, 3 insertions(+), 18 deletions(-)

diff --git a/hw/s390x/s390-virtio-ccw.c b/hw/s390x/s390-virtio-ccw.c
index 0c03ffb7c7..e30058df38 100644
--- a/hw/s390x/s390-virtio-ccw.c
+++ b/hw/s390x/s390-virtio-ccw.c
@@ -154,27 +154,12 @@ static void virtio_ccw_register_hcalls(void)
 static void s390_memory_init(ram_addr_t mem_size)
 {
     MemoryRegion *sysmem = get_system_memory();
-    ram_addr_t chunk, offset = 0;
-    unsigned int number = 0;
+    MemoryRegion *ram = g_new(MemoryRegion, 1);
     Error *local_err = NULL;
-    gchar *name;
 
     /* allocate RAM for core */
-    name = g_strdup_printf("s390.ram");
-    while (mem_size) {
-        MemoryRegion *ram = g_new(MemoryRegion, 1);
-        uint64_t size = mem_size;
-
-        /* KVM does not allow memslots >= 8 TB */
-        chunk = MIN(size, KVM_SLOT_MAX_BYTES);
-        memory_region_allocate_system_memory(ram, NULL, name, chunk);
-        memory_region_add_subregion(sysmem, offset, ram);
-        mem_size -= chunk;
-        offset += chunk;
-        g_free(name);
-        name = g_strdup_printf("s390.ram.%u", ++number);
-    }
-    g_free(name);
+    memory_region_allocate_system_memory(ram, NULL, "s390.ram", mem_size);
+    memory_region_add_subregion(sysmem, 0, ram);
 
     /*
      * Configure the maximum page size. As no memory devices were created
-- 
2.18.1



^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [Qemu-devel] [PATCH for-4.2 v3 2/2] s390: do not call memory_region_allocate_system_memory() multiple times
  2019-08-02 13:32 ` [Qemu-devel] [PATCH for-4.2 v3 2/2] s390: do not call memory_region_allocate_system_memory() multiple times Igor Mammedov
@ 2019-08-02 13:36   ` David Hildenbrand
  2019-08-02 13:41     ` Christian Borntraeger
  0 siblings, 1 reply; 13+ messages in thread
From: David Hildenbrand @ 2019-08-02 13:36 UTC (permalink / raw)
  To: Igor Mammedov, qemu-devel
  Cc: pbonzini, qemu-s390x, cohuck, thuth, borntraeger

On 02.08.19 15:32, Igor Mammedov wrote:
> s390 was trying to solve limited KVM memslot size issue by abusing
> memory_region_allocate_system_memory(), which breaks API contract
> where the function might be called only once.
> 
> Beside an invalid use of API, the approach also introduced migration
> issue, since RAM chunks for each KVM_SLOT_MAX_BYTES are transferred in
> migration stream as separate RAMBlocks.
> 
> After discussion [1], it was agreed to break migration from older
> QEMU for guest with RAM >8Tb (as it was relatively new (since 2.12)
> and considered to be not actually used downstream).
> Migration should keep working for guests with less than 8TB and for
> more than 8TB with QEMU 4.2 and newer binary.
> In case user tries to migrate more than 8TB guest, between incompatible
> QEMU versions, migration should fail gracefully due to non-exiting
> RAMBlock ID or RAMBlock size mismatch.
> 
> Taking in account above and that now KVM code is able to split too
> big MemorySection into several memslots, stop abusing
>   memory_region_allocate_system_memory()
> and use only one memory region for RAM.
> 
> 1) [PATCH RFC v2 4/4] s390: do not call  memory_region_allocate_system_memory() multiple times
> 
> Signed-off-by: Igor Mammedov <imammedo@redhat.com>
> ---
> v3:
>   - drop migration compat code
> 
> PS:
> I don't have access to a suitable system to test it.
> ---
>  hw/s390x/s390-virtio-ccw.c | 21 +++------------------
>  1 file changed, 3 insertions(+), 18 deletions(-)
> 
> diff --git a/hw/s390x/s390-virtio-ccw.c b/hw/s390x/s390-virtio-ccw.c
> index 0c03ffb7c7..e30058df38 100644
> --- a/hw/s390x/s390-virtio-ccw.c
> +++ b/hw/s390x/s390-virtio-ccw.c
> @@ -154,27 +154,12 @@ static void virtio_ccw_register_hcalls(void)
>  static void s390_memory_init(ram_addr_t mem_size)
>  {
>      MemoryRegion *sysmem = get_system_memory();
> -    ram_addr_t chunk, offset = 0;
> -    unsigned int number = 0;
> +    MemoryRegion *ram = g_new(MemoryRegion, 1);
>      Error *local_err = NULL;
> -    gchar *name;
>  
>      /* allocate RAM for core */
> -    name = g_strdup_printf("s390.ram");
> -    while (mem_size) {
> -        MemoryRegion *ram = g_new(MemoryRegion, 1);
> -        uint64_t size = mem_size;
> -
> -        /* KVM does not allow memslots >= 8 TB */
> -        chunk = MIN(size, KVM_SLOT_MAX_BYTES);
> -        memory_region_allocate_system_memory(ram, NULL, name, chunk);
> -        memory_region_add_subregion(sysmem, offset, ram);
> -        mem_size -= chunk;
> -        offset += chunk;
> -        g_free(name);
> -        name = g_strdup_printf("s390.ram.%u", ++number);
> -    }
> -    g_free(name);
> +    memory_region_allocate_system_memory(ram, NULL, "s390.ram", mem_size);
> +    memory_region_add_subregion(sysmem, 0, ram);
>  
>      /*
>       * Configure the maximum page size. As no memory devices were created
> 

Reviewed-by: David Hildenbrand <david@redhat.com>

We have to remember putting it into the next release notes. (or do we
even want to get this into v4.1 ?)


-- 

Thanks,

David / dhildenb


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Qemu-devel] [PATCH for-4.2 v3 2/2] s390: do not call memory_region_allocate_system_memory() multiple times
  2019-08-02 13:36   ` David Hildenbrand
@ 2019-08-02 13:41     ` Christian Borntraeger
  2019-08-02 13:45       ` David Hildenbrand
  0 siblings, 1 reply; 13+ messages in thread
From: Christian Borntraeger @ 2019-08-02 13:41 UTC (permalink / raw)
  To: David Hildenbrand, Igor Mammedov, qemu-devel
  Cc: pbonzini, qemu-s390x, cohuck, thuth



On 02.08.19 15:36, David Hildenbrand wrote:
> On 02.08.19 15:32, Igor Mammedov wrote:
>> s390 was trying to solve limited KVM memslot size issue by abusing
>> memory_region_allocate_system_memory(), which breaks API contract
>> where the function might be called only once.
>>
>> Beside an invalid use of API, the approach also introduced migration
>> issue, since RAM chunks for each KVM_SLOT_MAX_BYTES are transferred in
>> migration stream as separate RAMBlocks.
>>
>> After discussion [1], it was agreed to break migration from older
>> QEMU for guest with RAM >8Tb (as it was relatively new (since 2.12)
>> and considered to be not actually used downstream).
>> Migration should keep working for guests with less than 8TB and for
>> more than 8TB with QEMU 4.2 and newer binary.
>> In case user tries to migrate more than 8TB guest, between incompatible
>> QEMU versions, migration should fail gracefully due to non-exiting
>> RAMBlock ID or RAMBlock size mismatch.
>>
>> Taking in account above and that now KVM code is able to split too
>> big MemorySection into several memslots, stop abusing
>>   memory_region_allocate_system_memory()
>> and use only one memory region for RAM.
>>
>> 1) [PATCH RFC v2 4/4] s390: do not call  memory_region_allocate_system_memory() multiple times
>>
>> Signed-off-by: Igor Mammedov <imammedo@redhat.com>
>> ---
>> v3:
>>   - drop migration compat code
>>
>> PS:
>> I don't have access to a suitable system to test it.
>> ---
>>  hw/s390x/s390-virtio-ccw.c | 21 +++------------------
>>  1 file changed, 3 insertions(+), 18 deletions(-)
>>
>> diff --git a/hw/s390x/s390-virtio-ccw.c b/hw/s390x/s390-virtio-ccw.c
>> index 0c03ffb7c7..e30058df38 100644
>> --- a/hw/s390x/s390-virtio-ccw.c
>> +++ b/hw/s390x/s390-virtio-ccw.c
>> @@ -154,27 +154,12 @@ static void virtio_ccw_register_hcalls(void)
>>  static void s390_memory_init(ram_addr_t mem_size)
>>  {
>>      MemoryRegion *sysmem = get_system_memory();
>> -    ram_addr_t chunk, offset = 0;
>> -    unsigned int number = 0;
>> +    MemoryRegion *ram = g_new(MemoryRegion, 1);
>>      Error *local_err = NULL;
>> -    gchar *name;
>>  
>>      /* allocate RAM for core */
>> -    name = g_strdup_printf("s390.ram");
>> -    while (mem_size) {
>> -        MemoryRegion *ram = g_new(MemoryRegion, 1);
>> -        uint64_t size = mem_size;
>> -
>> -        /* KVM does not allow memslots >= 8 TB */
>> -        chunk = MIN(size, KVM_SLOT_MAX_BYTES);
>> -        memory_region_allocate_system_memory(ram, NULL, name, chunk);
>> -        memory_region_add_subregion(sysmem, offset, ram);
>> -        mem_size -= chunk;
>> -        offset += chunk;
>> -        g_free(name);
>> -        name = g_strdup_printf("s390.ram.%u", ++number);
>> -    }
>> -    g_free(name);
>> +    memory_region_allocate_system_memory(ram, NULL, "s390.ram", mem_size);
>> +    memory_region_add_subregion(sysmem, 0, ram);
>>  
>>      /*
>>       * Configure the maximum page size. As no memory devices were created
>>
> 
> Reviewed-by: David Hildenbrand <david@redhat.com>
> 
> We have to remember putting it into the next release notes. (or do we
> even want to get this into v4.1 ?)

This is too invasive for 4.1



^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Qemu-devel] [PATCH for-4.2 v3 2/2] s390: do not call memory_region_allocate_system_memory() multiple times
  2019-08-02 13:41     ` Christian Borntraeger
@ 2019-08-02 13:45       ` David Hildenbrand
  0 siblings, 0 replies; 13+ messages in thread
From: David Hildenbrand @ 2019-08-02 13:45 UTC (permalink / raw)
  To: Christian Borntraeger, Igor Mammedov, qemu-devel
  Cc: pbonzini, qemu-s390x, cohuck, thuth

On 02.08.19 15:41, Christian Borntraeger wrote:
> 
> 
> On 02.08.19 15:36, David Hildenbrand wrote:
>> On 02.08.19 15:32, Igor Mammedov wrote:
>>> s390 was trying to solve limited KVM memslot size issue by abusing
>>> memory_region_allocate_system_memory(), which breaks API contract
>>> where the function might be called only once.
>>>
>>> Beside an invalid use of API, the approach also introduced migration
>>> issue, since RAM chunks for each KVM_SLOT_MAX_BYTES are transferred in
>>> migration stream as separate RAMBlocks.
>>>
>>> After discussion [1], it was agreed to break migration from older
>>> QEMU for guest with RAM >8Tb (as it was relatively new (since 2.12)
>>> and considered to be not actually used downstream).
>>> Migration should keep working for guests with less than 8TB and for
>>> more than 8TB with QEMU 4.2 and newer binary.
>>> In case user tries to migrate more than 8TB guest, between incompatible
>>> QEMU versions, migration should fail gracefully due to non-exiting
>>> RAMBlock ID or RAMBlock size mismatch.
>>>
>>> Taking in account above and that now KVM code is able to split too
>>> big MemorySection into several memslots, stop abusing
>>>   memory_region_allocate_system_memory()
>>> and use only one memory region for RAM.
>>>
>>> 1) [PATCH RFC v2 4/4] s390: do not call  memory_region_allocate_system_memory() multiple times
>>>
>>> Signed-off-by: Igor Mammedov <imammedo@redhat.com>
>>> ---
>>> v3:
>>>   - drop migration compat code
>>>
>>> PS:
>>> I don't have access to a suitable system to test it.
>>> ---
>>>  hw/s390x/s390-virtio-ccw.c | 21 +++------------------
>>>  1 file changed, 3 insertions(+), 18 deletions(-)
>>>
>>> diff --git a/hw/s390x/s390-virtio-ccw.c b/hw/s390x/s390-virtio-ccw.c
>>> index 0c03ffb7c7..e30058df38 100644
>>> --- a/hw/s390x/s390-virtio-ccw.c
>>> +++ b/hw/s390x/s390-virtio-ccw.c
>>> @@ -154,27 +154,12 @@ static void virtio_ccw_register_hcalls(void)
>>>  static void s390_memory_init(ram_addr_t mem_size)
>>>  {
>>>      MemoryRegion *sysmem = get_system_memory();
>>> -    ram_addr_t chunk, offset = 0;
>>> -    unsigned int number = 0;
>>> +    MemoryRegion *ram = g_new(MemoryRegion, 1);
>>>      Error *local_err = NULL;
>>> -    gchar *name;
>>>  
>>>      /* allocate RAM for core */
>>> -    name = g_strdup_printf("s390.ram");
>>> -    while (mem_size) {
>>> -        MemoryRegion *ram = g_new(MemoryRegion, 1);
>>> -        uint64_t size = mem_size;
>>> -
>>> -        /* KVM does not allow memslots >= 8 TB */
>>> -        chunk = MIN(size, KVM_SLOT_MAX_BYTES);
>>> -        memory_region_allocate_system_memory(ram, NULL, name, chunk);
>>> -        memory_region_add_subregion(sysmem, offset, ram);
>>> -        mem_size -= chunk;
>>> -        offset += chunk;
>>> -        g_free(name);
>>> -        name = g_strdup_printf("s390.ram.%u", ++number);
>>> -    }
>>> -    g_free(name);
>>> +    memory_region_allocate_system_memory(ram, NULL, "s390.ram", mem_size);
>>> +    memory_region_add_subregion(sysmem, 0, ram);
>>>  
>>>      /*
>>>       * Configure the maximum page size. As no memory devices were created
>>>
>>
>> Reviewed-by: David Hildenbrand <david@redhat.com>
>>
>> We have to remember putting it into the next release notes. (or do we
>> even want to get this into v4.1 ?)
> 
> This is too invasive for 4.1
> 

Makes sense

-- 

Thanks,

David / dhildenb


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Qemu-devel] [PATCH for-4.2 v3 1/2] kvm: s390: split too big memory section on several memslots
  2019-08-02 13:32 ` [Qemu-devel] [PATCH for-4.2 v3 1/2] kvm: s390: split too big memory section on several memslots Igor Mammedov
@ 2019-08-02 13:50   ` Christian Borntraeger
  0 siblings, 0 replies; 13+ messages in thread
From: Christian Borntraeger @ 2019-08-02 13:50 UTC (permalink / raw)
  To: Igor Mammedov, qemu-devel; +Cc: pbonzini, qemu-s390x, cohuck, thuth, david



On 02.08.19 15:32, Igor Mammedov wrote:
> Max memslot size supported by kvm on s390 is 8Tb,
> move logic of splitting RAM in chunks upto 8T to KVM code.
> 
> This way it will hide KVM specific restrictions in KVM code
> and won't affect baord level design decisions. Which would allow
> us to avoid misusing memory_region_allocate_system_memory() API
> and eventually use a single hostmem backend for guest RAM.
> 
> Signed-off-by: Igor Mammedov <imammedo@redhat.com>
> ---
> patch prepares only KVM side for switching to single RAM memory region
> another patch will take care of  dropping manual RAM partitioning in
> s390 code.
> ---


/home/cborntra/REPOS/qemu/target/s390x/kvm.c: In function ‘kvm_arch_init’:
/home/cborntra/REPOS/qemu/target/s390x/kvm.c:350:5: error: implicit declaration of function ‘kvm_set_max_memslot_size’; did you mean ‘kvm_get_max_memslots’? [-Werror=implicit-function-declaration]
  350 |     kvm_set_max_memslot_size(KVM_SLOT_MAX_BYTES);
      |     ^~~~~~~~~~~~~~~~~~~~~~~~
      |     kvm_get_max_memslots
/home/cborntra/REPOS/qemu/target/s390x/kvm.c:350:5: error: nested extern declaration of ‘kvm_set_max_memslot_size’ [-Werror=nested-externs]
cc1: all warnings being treated as errors
make[1]: *** [/home/cborntra/REPOS/qemu/rules.mak:69: target/s390x/kvm.o] Error 1
make: *** [Makefile:472: s390x-softmmu/all] Error 2


We probably need an include of sysemu/kvm_int.h in target/s390x/kvm.c
I will test with that fixup....

>  include/hw/s390x/s390-virtio-ccw.h | 10 ++++
>  include/sysemu/kvm_int.h           |  1 +
>  accel/kvm/kvm-all.c                | 79 ++++++++++++++++++------------
>  hw/s390x/s390-virtio-ccw.c         |  9 ----
>  target/s390x/kvm.c                 |  1 +
>  5 files changed, 60 insertions(+), 40 deletions(-)
> 
> diff --git a/include/hw/s390x/s390-virtio-ccw.h b/include/hw/s390x/s390-virtio-ccw.h
> index 8aa27199c9..00632f94b4 100644
> --- a/include/hw/s390x/s390-virtio-ccw.h
> +++ b/include/hw/s390x/s390-virtio-ccw.h
> @@ -21,6 +21,16 @@
>  #define S390_MACHINE_CLASS(klass) \
>      OBJECT_CLASS_CHECK(S390CcwMachineClass, (klass), TYPE_S390_CCW_MACHINE)
>  
> +/*
> + * KVM does only support memory slots up to KVM_MEM_MAX_NR_PAGES pages
> + * as the dirty bitmap must be managed by bitops that take an int as
> + * position indicator. If we have a guest beyond that we will split off
> + * new subregions. The split must happen on a segment boundary (1MB).
> + */
> +#define KVM_MEM_MAX_NR_PAGES ((1ULL << 31) - 1)
> +#define SEG_MSK (~0xfffffULL)
> +#define KVM_SLOT_MAX_BYTES ((KVM_MEM_MAX_NR_PAGES * TARGET_PAGE_SIZE) & SEG_MSK)
> +
>  typedef struct S390CcwMachineState {
>      /*< private >*/
>      MachineState parent_obj;
> diff --git a/include/sysemu/kvm_int.h b/include/sysemu/kvm_int.h
> index 31df465fdc..7f7520bce2 100644
> --- a/include/sysemu/kvm_int.h
> +++ b/include/sysemu/kvm_int.h
> @@ -41,4 +41,5 @@ typedef struct KVMMemoryListener {
>  void kvm_memory_listener_register(KVMState *s, KVMMemoryListener *kml,
>                                    AddressSpace *as, int as_id);
>  
> +void kvm_set_max_memslot_size(hwaddr max_slot_size);
>  #endif
> diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
> index f450f25295..7a6efb9612 100644
> --- a/accel/kvm/kvm-all.c
> +++ b/accel/kvm/kvm-all.c
> @@ -138,6 +138,7 @@ bool kvm_direct_msi_allowed;
>  bool kvm_ioeventfd_any_length_allowed;
>  bool kvm_msi_use_devid;
>  static bool kvm_immediate_exit;
> +static hwaddr kvm_max_slot_size = ~0;
>  
>  static const KVMCapabilityInfo kvm_required_capabilites[] = {
>      KVM_CAP_INFO(USER_MEMORY),
> @@ -951,6 +952,14 @@ kvm_check_extension_list(KVMState *s, const KVMCapabilityInfo *list)
>      return NULL;
>  }
>  
> +void kvm_set_max_memslot_size(hwaddr max_slot_size)
> +{
> +    g_assert(
> +        ROUND_UP(max_slot_size, qemu_real_host_page_size) == max_slot_size
> +    );
> +    kvm_max_slot_size = max_slot_size;
> +}
> +
>  static void kvm_set_phys_mem(KVMMemoryListener *kml,
>                               MemoryRegionSection *section, bool add)
>  {
> @@ -958,7 +967,7 @@ static void kvm_set_phys_mem(KVMMemoryListener *kml,
>      int err;
>      MemoryRegion *mr = section->mr;
>      bool writeable = !mr->readonly && !mr->rom_device;
> -    hwaddr start_addr, size;
> +    hwaddr start_addr, size, slot_size;
>      void *ram;
>  
>      if (!memory_region_is_ram(mr)) {
> @@ -983,41 +992,49 @@ static void kvm_set_phys_mem(KVMMemoryListener *kml,
>      kvm_slots_lock(kml);
>  
>      if (!add) {
> -        mem = kvm_lookup_matching_slot(kml, start_addr, size);
> -        if (!mem) {
> -            goto out;
> -        }
> -        if (mem->flags & KVM_MEM_LOG_DIRTY_PAGES) {
> -            kvm_physical_sync_dirty_bitmap(kml, section);
> -        }
> +        do {
> +            slot_size = kvm_max_slot_size < size ? kvm_max_slot_size : size;
> +            mem = kvm_lookup_matching_slot(kml, start_addr, slot_size);
> +            if (!mem) {
> +                goto out;
> +            }
> +            if (mem->flags & KVM_MEM_LOG_DIRTY_PAGES) {
> +                kvm_physical_sync_dirty_bitmap(kml, section);
> +            }
>  
> -        /* unregister the slot */
> -        g_free(mem->dirty_bmap);
> -        mem->dirty_bmap = NULL;
> -        mem->memory_size = 0;
> -        mem->flags = 0;
> -        err = kvm_set_user_memory_region(kml, mem, false);
> -        if (err) {
> -            fprintf(stderr, "%s: error unregistering slot: %s\n",
> -                    __func__, strerror(-err));
> -            abort();
> -        }
> +            /* unregister the slot */
> +            g_free(mem->dirty_bmap);
> +            mem->dirty_bmap = NULL;
> +            mem->memory_size = 0;
> +            mem->flags = 0;
> +            err = kvm_set_user_memory_region(kml, mem, false);
> +            if (err) {
> +                fprintf(stderr, "%s: error unregistering slot: %s\n",
> +                        __func__, strerror(-err));
> +                abort();
> +            }
> +            start_addr += slot_size;
> +        } while ((size -= slot_size));
>          goto out;
>      }
>  
>      /* register the new slot */
> -    mem = kvm_alloc_slot(kml);
> -    mem->memory_size = size;
> -    mem->start_addr = start_addr;
> -    mem->ram = ram;
> -    mem->flags = kvm_mem_flags(mr);
> -
> -    err = kvm_set_user_memory_region(kml, mem, true);
> -    if (err) {
> -        fprintf(stderr, "%s: error registering slot: %s\n", __func__,
> -                strerror(-err));
> -        abort();
> -    }
> +    do {
> +        slot_size = kvm_max_slot_size < size ? kvm_max_slot_size : size;
> +        mem = kvm_alloc_slot(kml);
> +        mem->memory_size = slot_size;
> +        mem->start_addr = start_addr;
> +        mem->ram = ram;
> +        mem->flags = kvm_mem_flags(mr);
> +
> +        err = kvm_set_user_memory_region(kml, mem, true);
> +        if (err) {
> +            fprintf(stderr, "%s: error registering slot: %s\n", __func__,
> +                    strerror(-err));
> +            abort();
> +        }
> +        start_addr += slot_size;
> +    } while ((size -= slot_size));
>  
>  out:
>      kvm_slots_unlock(kml);
> diff --git a/hw/s390x/s390-virtio-ccw.c b/hw/s390x/s390-virtio-ccw.c
> index 5b6a9a4e55..0c03ffb7c7 100644
> --- a/hw/s390x/s390-virtio-ccw.c
> +++ b/hw/s390x/s390-virtio-ccw.c
> @@ -151,15 +151,6 @@ static void virtio_ccw_register_hcalls(void)
>                                     virtio_ccw_hcall_early_printk);
>  }
>  
> -/*
> - * KVM does only support memory slots up to KVM_MEM_MAX_NR_PAGES pages
> - * as the dirty bitmap must be managed by bitops that take an int as
> - * position indicator. If we have a guest beyond that we will split off
> - * new subregions. The split must happen on a segment boundary (1MB).
> - */
> -#define KVM_MEM_MAX_NR_PAGES ((1ULL << 31) - 1)
> -#define SEG_MSK (~0xfffffULL)
> -#define KVM_SLOT_MAX_BYTES ((KVM_MEM_MAX_NR_PAGES * TARGET_PAGE_SIZE) & SEG_MSK)
>  static void s390_memory_init(ram_addr_t mem_size)
>  {
>      MemoryRegion *sysmem = get_system_memory();
> diff --git a/target/s390x/kvm.c b/target/s390x/kvm.c
> index 6e814c230b..d0267da8e1 100644
> --- a/target/s390x/kvm.c
> +++ b/target/s390x/kvm.c
> @@ -347,6 +347,7 @@ int kvm_arch_init(MachineState *ms, KVMState *s)
>       */
>      /* kvm_vm_enable_cap(s, KVM_CAP_S390_AIS, 0); */
>  
> +    kvm_set_max_memslot_size(KVM_SLOT_MAX_BYTES);
>      return 0;
>  }
>  
> 



^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Qemu-devel] [PATCH for-4.2 v3 0/2] s390: stop abusing memory_region_allocate_system_memory()
  2019-08-02 13:32 [Qemu-devel] [PATCH for-4.2 v3 0/2] s390: stop abusing memory_region_allocate_system_memory() Igor Mammedov
  2019-08-02 13:32 ` [Qemu-devel] [PATCH for-4.2 v3 1/2] kvm: s390: split too big memory section on several memslots Igor Mammedov
  2019-08-02 13:32 ` [Qemu-devel] [PATCH for-4.2 v3 2/2] s390: do not call memory_region_allocate_system_memory() multiple times Igor Mammedov
@ 2019-08-02 14:11 ` no-reply
  2019-08-02 14:42 ` Christian Borntraeger
  3 siblings, 0 replies; 13+ messages in thread
From: no-reply @ 2019-08-02 14:11 UTC (permalink / raw)
  To: imammedo
  Cc: thuth, david, cohuck, qemu-devel, borntraeger, qemu-s390x, pbonzini

Patchew URL: https://patchew.org/QEMU/20190802133241.29298-1-imammedo@redhat.com/



Hi,

This series failed build test on s390x host. Please find the details below.

=== TEST SCRIPT BEGIN ===
#!/bin/bash
# Testing script will be invoked under the git checkout with
# HEAD pointing to a commit that has the patches applied on top of "base"
# branch
set -e

echo
echo "=== ENV ==="
env

echo
echo "=== PACKAGES ==="
rpm -qa

echo
echo "=== UNAME ==="
uname -a

CC=$HOME/bin/cc
INSTALL=$PWD/install
BUILD=$PWD/build
mkdir -p $BUILD $INSTALL
SRC=$PWD
cd $BUILD
$SRC/configure --cc=$CC --prefix=$INSTALL
make -j4
# XXX: we need reliable clean up
# make check -j4 V=1
make install
=== TEST SCRIPT END ===

  CC      s390x-softmmu/target/s390x/sigp.o
  CC      s390x-softmmu/target/s390x/kvm.o
/var/tmp/patchew-tester-tmp-bcfxbkeg/src/target/s390x/kvm.c: In function ‘kvm_arch_init’:
/var/tmp/patchew-tester-tmp-bcfxbkeg/src/target/s390x/kvm.c:350:5: error: implicit declaration of function ‘kvm_set_max_memslot_size’; did you mean ‘kvm_get_max_memslots’? [-Werror=implicit-function-declaration]
  350 |     kvm_set_max_memslot_size(KVM_SLOT_MAX_BYTES);
      |     ^~~~~~~~~~~~~~~~~~~~~~~~
      |     kvm_get_max_memslots
/var/tmp/patchew-tester-tmp-bcfxbkeg/src/target/s390x/kvm.c:350:5: error: nested extern declaration of ‘kvm_set_max_memslot_size’ [-Werror=nested-externs]
cc1: all warnings being treated as errors
make[1]: *** [/var/tmp/patchew-tester-tmp-bcfxbkeg/src/rules.mak:69: target/s390x/kvm.o] Error 1
make: *** [Makefile:472: s390x-softmmu/all] Error 2


The full log is available at
http://patchew.org/logs/20190802133241.29298-1-imammedo@redhat.com/testing.s390x/?type=message.
---
Email generated automatically by Patchew [https://patchew.org/].
Please send your feedback to patchew-devel@redhat.com

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Qemu-devel] [PATCH for-4.2 v3 0/2] s390: stop abusing memory_region_allocate_system_memory()
  2019-08-02 13:32 [Qemu-devel] [PATCH for-4.2 v3 0/2] s390: stop abusing memory_region_allocate_system_memory() Igor Mammedov
                   ` (2 preceding siblings ...)
  2019-08-02 14:11 ` [Qemu-devel] [PATCH for-4.2 v3 0/2] s390: stop abusing memory_region_allocate_system_memory() no-reply
@ 2019-08-02 14:42 ` Christian Borntraeger
  2019-08-02 14:59   ` Christian Borntraeger
  3 siblings, 1 reply; 13+ messages in thread
From: Christian Borntraeger @ 2019-08-02 14:42 UTC (permalink / raw)
  To: Igor Mammedov, qemu-devel; +Cc: pbonzini, qemu-s390x, cohuck, thuth, david

On 02.08.19 15:32, Igor Mammedov wrote:
> Changelog:
>   since v2:
>     - break migration from old QEMU (since 2.12-4.1) for guest with >8TB RAM
>       and drop migratable aliases patch as was agreed during v2 review
>     - drop 4.2 machines patch as it's not prerequisite anymore
>   since v1:
>     - include 4.2 machines patch for adding compat RAM layout on top
>     - 2/4 add missing in v1 patch for splitting too big MemorySection on
>           several memslots
>     - 3/4 amend code path on alias destruction to ensure that RAMBlock is
>           cleaned properly
>     - 4/4 add compat machine code to keep old layout (migration-wise) for
>           4.1 and older machines 
> 
> 
> While looking into unifying guest RAM allocation to use hostmem backends
> for initial RAM (especially when -mempath is used) and retiring
> memory_region_allocate_system_memory() API, leaving only single hostmem backend,
> I was inspecting how currently it is used by boards and it turns out several
> boards abuse it by calling the function several times (despite documented contract
> forbiding it).
> 
> s390 is one of such boards where KVM limitation on memslot size got propagated
> to board design and memory_region_allocate_system_memory() was abused to satisfy
> KVM requirement for max RAM chunk where memory region alias would suffice.
> 
> Unfortunately, memory_region_allocate_system_memory() usage created migration
> dependency where guest RAM is transferred in migration stream as several RAMBlocks
> if it's more than KVM_SLOT_MAX_BYTES. During v2 review it was agreed to ignore
> migration breakage (documenting it in release notes) and leaving only KVM fix.
> 
> In order to replace these several RAM chunks with a single memdev and keep it
> working with KVM memslot size limit, following was done:
>    * [1/2] split too big RAM chunk inside of KVM code on several memory slots
>            if necessary
>    * [2/2] drop manual ram splitting in s390 code
> 
> 
> CC: pbonzini@redhat.com
> CC: qemu-s390x@nongnu.org
> CC: borntraeger@de.ibm.com
> CC: thuth@redhat.com
> CC: david@redhat.com
> CC: cohuck@redhat.com

With the fixup this patch set seems to work on s390. I can start 9TB guests and
I can migrate smaller guests between 4.1+patch and 4.0 and 3.1. I currently can
not test migration for the 9TB guest due to lack of a 2nd system. 
> 
> Igor Mammedov (2):
>   kvm: s390: split too big memory section on several memslots
>   s390: do not call memory_region_allocate_system_memory() multiple
>     times
> 
>  include/hw/s390x/s390-virtio-ccw.h | 10 ++++
>  include/sysemu/kvm_int.h           |  1 +
>  accel/kvm/kvm-all.c                | 79 ++++++++++++++++++------------
>  hw/s390x/s390-virtio-ccw.c         | 30 ++----------
>  target/s390x/kvm.c                 |  1 +
>  5 files changed, 63 insertions(+), 58 deletions(-)
> 



^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Qemu-devel] [PATCH for-4.2 v3 0/2] s390: stop abusing memory_region_allocate_system_memory()
  2019-08-02 14:42 ` Christian Borntraeger
@ 2019-08-02 14:59   ` Christian Borntraeger
  2019-08-02 15:04     ` Christian Borntraeger
  0 siblings, 1 reply; 13+ messages in thread
From: Christian Borntraeger @ 2019-08-02 14:59 UTC (permalink / raw)
  To: Igor Mammedov, qemu-devel; +Cc: pbonzini, qemu-s390x, cohuck, thuth, david



On 02.08.19 16:42, Christian Borntraeger wrote:
> On 02.08.19 15:32, Igor Mammedov wrote:
>> Changelog:
>>   since v2:
>>     - break migration from old QEMU (since 2.12-4.1) for guest with >8TB RAM
>>       and drop migratable aliases patch as was agreed during v2 review
>>     - drop 4.2 machines patch as it's not prerequisite anymore
>>   since v1:
>>     - include 4.2 machines patch for adding compat RAM layout on top
>>     - 2/4 add missing in v1 patch for splitting too big MemorySection on
>>           several memslots
>>     - 3/4 amend code path on alias destruction to ensure that RAMBlock is
>>           cleaned properly
>>     - 4/4 add compat machine code to keep old layout (migration-wise) for
>>           4.1 and older machines 
>>
>>
>> While looking into unifying guest RAM allocation to use hostmem backends
>> for initial RAM (especially when -mempath is used) and retiring
>> memory_region_allocate_system_memory() API, leaving only single hostmem backend,
>> I was inspecting how currently it is used by boards and it turns out several
>> boards abuse it by calling the function several times (despite documented contract
>> forbiding it).
>>
>> s390 is one of such boards where KVM limitation on memslot size got propagated
>> to board design and memory_region_allocate_system_memory() was abused to satisfy
>> KVM requirement for max RAM chunk where memory region alias would suffice.
>>
>> Unfortunately, memory_region_allocate_system_memory() usage created migration
>> dependency where guest RAM is transferred in migration stream as several RAMBlocks
>> if it's more than KVM_SLOT_MAX_BYTES. During v2 review it was agreed to ignore
>> migration breakage (documenting it in release notes) and leaving only KVM fix.
>>
>> In order to replace these several RAM chunks with a single memdev and keep it
>> working with KVM memslot size limit, following was done:
>>    * [1/2] split too big RAM chunk inside of KVM code on several memory slots
>>            if necessary
>>    * [2/2] drop manual ram splitting in s390 code
>>
>>
>> CC: pbonzini@redhat.com
>> CC: qemu-s390x@nongnu.org
>> CC: borntraeger@de.ibm.com
>> CC: thuth@redhat.com
>> CC: david@redhat.com
>> CC: cohuck@redhat.com
> 
> With the fixup this patch set seems to work on s390. I can start 9TB guests and
> I can migrate smaller guests between 4.1+patch and 4.0 and 3.1. I currently can
> not test migration for the 9TB guest due to lack of a 2nd system. 

I have to correct myself. The 9TB guest started up but it does not seem to do
anything useful (it hangs).

I will try to investigate. 


>>
>> Igor Mammedov (2):
>>   kvm: s390: split too big memory section on several memslots
>>   s390: do not call memory_region_allocate_system_memory() multiple
>>     times
>>
>>  include/hw/s390x/s390-virtio-ccw.h | 10 ++++
>>  include/sysemu/kvm_int.h           |  1 +
>>  accel/kvm/kvm-all.c                | 79 ++++++++++++++++++------------
>>  hw/s390x/s390-virtio-ccw.c         | 30 ++----------
>>  target/s390x/kvm.c                 |  1 +
>>  5 files changed, 63 insertions(+), 58 deletions(-)
>>



^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Qemu-devel] [PATCH for-4.2 v3 0/2] s390: stop abusing memory_region_allocate_system_memory()
  2019-08-02 14:59   ` Christian Borntraeger
@ 2019-08-02 15:04     ` Christian Borntraeger
  2019-08-05  8:54       ` Igor Mammedov
  0 siblings, 1 reply; 13+ messages in thread
From: Christian Borntraeger @ 2019-08-02 15:04 UTC (permalink / raw)
  To: Igor Mammedov, qemu-devel; +Cc: pbonzini, qemu-s390x, cohuck, thuth, david



On 02.08.19 16:59, Christian Borntraeger wrote:
> 
> 
> On 02.08.19 16:42, Christian Borntraeger wrote:
>> On 02.08.19 15:32, Igor Mammedov wrote:
>>> Changelog:
>>>   since v2:
>>>     - break migration from old QEMU (since 2.12-4.1) for guest with >8TB RAM
>>>       and drop migratable aliases patch as was agreed during v2 review
>>>     - drop 4.2 machines patch as it's not prerequisite anymore
>>>   since v1:
>>>     - include 4.2 machines patch for adding compat RAM layout on top
>>>     - 2/4 add missing in v1 patch for splitting too big MemorySection on
>>>           several memslots
>>>     - 3/4 amend code path on alias destruction to ensure that RAMBlock is
>>>           cleaned properly
>>>     - 4/4 add compat machine code to keep old layout (migration-wise) for
>>>           4.1 and older machines 
>>>
>>>
>>> While looking into unifying guest RAM allocation to use hostmem backends
>>> for initial RAM (especially when -mempath is used) and retiring
>>> memory_region_allocate_system_memory() API, leaving only single hostmem backend,
>>> I was inspecting how currently it is used by boards and it turns out several
>>> boards abuse it by calling the function several times (despite documented contract
>>> forbiding it).
>>>
>>> s390 is one of such boards where KVM limitation on memslot size got propagated
>>> to board design and memory_region_allocate_system_memory() was abused to satisfy
>>> KVM requirement for max RAM chunk where memory region alias would suffice.
>>>
>>> Unfortunately, memory_region_allocate_system_memory() usage created migration
>>> dependency where guest RAM is transferred in migration stream as several RAMBlocks
>>> if it's more than KVM_SLOT_MAX_BYTES. During v2 review it was agreed to ignore
>>> migration breakage (documenting it in release notes) and leaving only KVM fix.
>>>
>>> In order to replace these several RAM chunks with a single memdev and keep it
>>> working with KVM memslot size limit, following was done:
>>>    * [1/2] split too big RAM chunk inside of KVM code on several memory slots
>>>            if necessary
>>>    * [2/2] drop manual ram splitting in s390 code
>>>
>>>
>>> CC: pbonzini@redhat.com
>>> CC: qemu-s390x@nongnu.org
>>> CC: borntraeger@de.ibm.com
>>> CC: thuth@redhat.com
>>> CC: david@redhat.com
>>> CC: cohuck@redhat.com
>>
>> With the fixup this patch set seems to work on s390. I can start 9TB guests and
>> I can migrate smaller guests between 4.1+patch and 4.0 and 3.1. I currently can
>> not test migration for the 9TB guest due to lack of a 2nd system. 
> 
> I have to correct myself. The 9TB guest started up but it does not seem to do
> anything useful (it hangs).

Seems that the userspace addr is wrong (its the same). 
[pid 258234] ioctl(10, KVM_SET_USER_MEMORY_REGION, {slot=0, flags=0, guest_phys_addr=0, memory_size=8796091973632, userspace_addr=0x3fff7d00000}) = 0
[pid 258234] ioctl(10, KVM_SET_USER_MEMORY_REGION, {slot=1, flags=0, guest_phys_addr=0x7fffff00000, memory_size=1099512676352, userspace_addr=0x3fff7d00000}) = 0



^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Qemu-devel] [PATCH for-4.2 v3 0/2] s390: stop abusing memory_region_allocate_system_memory()
  2019-08-02 15:04     ` Christian Borntraeger
@ 2019-08-05  8:54       ` Igor Mammedov
  2019-08-05 15:18         ` Cornelia Huck
  0 siblings, 1 reply; 13+ messages in thread
From: Igor Mammedov @ 2019-08-05  8:54 UTC (permalink / raw)
  To: Christian Borntraeger
  Cc: thuth, david, cohuck, qemu-devel, qemu-s390x, pbonzini

On Fri, 2 Aug 2019 17:04:21 +0200
Christian Borntraeger <borntraeger@de.ibm.com> wrote:

> On 02.08.19 16:59, Christian Borntraeger wrote:
> > 
> > 
> > On 02.08.19 16:42, Christian Borntraeger wrote:  
> >> On 02.08.19 15:32, Igor Mammedov wrote:  
> >>> Changelog:
> >>>   since v2:
> >>>     - break migration from old QEMU (since 2.12-4.1) for guest with >8TB RAM
> >>>       and drop migratable aliases patch as was agreed during v2 review
> >>>     - drop 4.2 machines patch as it's not prerequisite anymore
> >>>   since v1:
> >>>     - include 4.2 machines patch for adding compat RAM layout on top
> >>>     - 2/4 add missing in v1 patch for splitting too big MemorySection on
> >>>           several memslots
> >>>     - 3/4 amend code path on alias destruction to ensure that RAMBlock is
> >>>           cleaned properly
> >>>     - 4/4 add compat machine code to keep old layout (migration-wise) for
> >>>           4.1 and older machines 
> >>>
> >>>
> >>> While looking into unifying guest RAM allocation to use hostmem backends
> >>> for initial RAM (especially when -mempath is used) and retiring
> >>> memory_region_allocate_system_memory() API, leaving only single hostmem backend,
> >>> I was inspecting how currently it is used by boards and it turns out several
> >>> boards abuse it by calling the function several times (despite documented contract
> >>> forbiding it).
> >>>
> >>> s390 is one of such boards where KVM limitation on memslot size got propagated
> >>> to board design and memory_region_allocate_system_memory() was abused to satisfy
> >>> KVM requirement for max RAM chunk where memory region alias would suffice.
> >>>
> >>> Unfortunately, memory_region_allocate_system_memory() usage created migration
> >>> dependency where guest RAM is transferred in migration stream as several RAMBlocks
> >>> if it's more than KVM_SLOT_MAX_BYTES. During v2 review it was agreed to ignore
> >>> migration breakage (documenting it in release notes) and leaving only KVM fix.
> >>>
> >>> In order to replace these several RAM chunks with a single memdev and keep it
> >>> working with KVM memslot size limit, following was done:
> >>>    * [1/2] split too big RAM chunk inside of KVM code on several memory slots
> >>>            if necessary
> >>>    * [2/2] drop manual ram splitting in s390 code
> >>>
> >>>
> >>> CC: pbonzini@redhat.com
> >>> CC: qemu-s390x@nongnu.org
> >>> CC: borntraeger@de.ibm.com
> >>> CC: thuth@redhat.com
> >>> CC: david@redhat.com
> >>> CC: cohuck@redhat.com  
> >>
> >> With the fixup this patch set seems to work on s390. I can start 9TB guests and
> >> I can migrate smaller guests between 4.1+patch and 4.0 and 3.1. I currently can
> >> not test migration for the 9TB guest due to lack of a 2nd system.   
> > 
> > I have to correct myself. The 9TB guest started up but it does not seem to do
> > anything useful (it hangs).  
> 
> Seems that the userspace addr is wrong (its the same). 
> [pid 258234] ioctl(10, KVM_SET_USER_MEMORY_REGION, {slot=0, flags=0, guest_phys_addr=0, memory_size=8796091973632, userspace_addr=0x3fff7d00000}) = 0
> [pid 258234] ioctl(10, KVM_SET_USER_MEMORY_REGION, {slot=1, flags=0, guest_phys_addr=0x7fffff00000, memory_size=1099512676352, userspace_addr=0x3fff7d00000}) = 0

It's a bug in 1/2, I forgot to advance mem->ram along with mem->start_addr.
Let me fix it and simulate it on small s390 host (/me sorry for messy patches)
it won't test migration properly but should be sufficient for testing KVM code patch.



^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Qemu-devel] [PATCH for-4.2 v3 0/2] s390: stop abusing memory_region_allocate_system_memory()
  2019-08-05  8:54       ` Igor Mammedov
@ 2019-08-05 15:18         ` Cornelia Huck
  0 siblings, 0 replies; 13+ messages in thread
From: Cornelia Huck @ 2019-08-05 15:18 UTC (permalink / raw)
  To: Igor Mammedov
  Cc: thuth, david, qemu-devel, Christian Borntraeger, qemu-s390x, pbonzini

On Mon, 5 Aug 2019 10:54:40 +0200
Igor Mammedov <imammedo@redhat.com> wrote:

> On Fri, 2 Aug 2019 17:04:21 +0200
> Christian Borntraeger <borntraeger@de.ibm.com> wrote:
> 
> > On 02.08.19 16:59, Christian Borntraeger wrote:  
> > > 
> > > 
> > > On 02.08.19 16:42, Christian Borntraeger wrote:    
> > >> On 02.08.19 15:32, Igor Mammedov wrote:    
> > >>> Changelog:
> > >>>   since v2:
> > >>>     - break migration from old QEMU (since 2.12-4.1) for guest with >8TB RAM
> > >>>       and drop migratable aliases patch as was agreed during v2 review

FWIW, that seems reasonable to me as well.

> > >>>     - drop 4.2 machines patch as it's not prerequisite anymore
> > >>>   since v1:
> > >>>     - include 4.2 machines patch for adding compat RAM layout on top
> > >>>     - 2/4 add missing in v1 patch for splitting too big MemorySection on
> > >>>           several memslots
> > >>>     - 3/4 amend code path on alias destruction to ensure that RAMBlock is
> > >>>           cleaned properly
> > >>>     - 4/4 add compat machine code to keep old layout (migration-wise) for
> > >>>           4.1 and older machines 
> > >>>
> > >>>
> > >>> While looking into unifying guest RAM allocation to use hostmem backends
> > >>> for initial RAM (especially when -mempath is used) and retiring
> > >>> memory_region_allocate_system_memory() API, leaving only single hostmem backend,
> > >>> I was inspecting how currently it is used by boards and it turns out several
> > >>> boards abuse it by calling the function several times (despite documented contract
> > >>> forbiding it).
> > >>>
> > >>> s390 is one of such boards where KVM limitation on memslot size got propagated
> > >>> to board design and memory_region_allocate_system_memory() was abused to satisfy
> > >>> KVM requirement for max RAM chunk where memory region alias would suffice.
> > >>>
> > >>> Unfortunately, memory_region_allocate_system_memory() usage created migration
> > >>> dependency where guest RAM is transferred in migration stream as several RAMBlocks
> > >>> if it's more than KVM_SLOT_MAX_BYTES. During v2 review it was agreed to ignore
> > >>> migration breakage (documenting it in release notes) and leaving only KVM fix.
> > >>>
> > >>> In order to replace these several RAM chunks with a single memdev and keep it
> > >>> working with KVM memslot size limit, following was done:
> > >>>    * [1/2] split too big RAM chunk inside of KVM code on several memory slots
> > >>>            if necessary
> > >>>    * [2/2] drop manual ram splitting in s390 code
> > >>>
> > >>>
> > >>> CC: pbonzini@redhat.com
> > >>> CC: qemu-s390x@nongnu.org
> > >>> CC: borntraeger@de.ibm.com
> > >>> CC: thuth@redhat.com
> > >>> CC: david@redhat.com
> > >>> CC: cohuck@redhat.com    
> > >>
> > >> With the fixup this patch set seems to work on s390. I can start 9TB guests and
> > >> I can migrate smaller guests between 4.1+patch and 4.0 and 3.1. I currently can
> > >> not test migration for the 9TB guest due to lack of a 2nd system.     
> > > 
> > > I have to correct myself. The 9TB guest started up but it does not seem to do
> > > anything useful (it hangs).    
> > 
> > Seems that the userspace addr is wrong (its the same). 
> > [pid 258234] ioctl(10, KVM_SET_USER_MEMORY_REGION, {slot=0, flags=0, guest_phys_addr=0, memory_size=8796091973632, userspace_addr=0x3fff7d00000}) = 0
> > [pid 258234] ioctl(10, KVM_SET_USER_MEMORY_REGION, {slot=1, flags=0, guest_phys_addr=0x7fffff00000, memory_size=1099512676352, userspace_addr=0x3fff7d00000}) = 0  
> 
> It's a bug in 1/2, I forgot to advance mem->ram along with mem->start_addr.
> Let me fix it and simulate it on small s390 host (/me sorry for messy patches)
> it won't test migration properly but should be sufficient for testing KVM code patch.
> 

Ok, I'll wait for a v4 before I spend any time on this :)


^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2019-08-05 15:20 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-08-02 13:32 [Qemu-devel] [PATCH for-4.2 v3 0/2] s390: stop abusing memory_region_allocate_system_memory() Igor Mammedov
2019-08-02 13:32 ` [Qemu-devel] [PATCH for-4.2 v3 1/2] kvm: s390: split too big memory section on several memslots Igor Mammedov
2019-08-02 13:50   ` Christian Borntraeger
2019-08-02 13:32 ` [Qemu-devel] [PATCH for-4.2 v3 2/2] s390: do not call memory_region_allocate_system_memory() multiple times Igor Mammedov
2019-08-02 13:36   ` David Hildenbrand
2019-08-02 13:41     ` Christian Borntraeger
2019-08-02 13:45       ` David Hildenbrand
2019-08-02 14:11 ` [Qemu-devel] [PATCH for-4.2 v3 0/2] s390: stop abusing memory_region_allocate_system_memory() no-reply
2019-08-02 14:42 ` Christian Borntraeger
2019-08-02 14:59   ` Christian Borntraeger
2019-08-02 15:04     ` Christian Borntraeger
2019-08-05  8:54       ` Igor Mammedov
2019-08-05 15:18         ` Cornelia Huck

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).