* [Qemu-devel] [PATCH v1 0/5] Fix misuses of memory_region_allocate_system_memory()
@ 2019-04-15 13:27 ` Igor Mammedov
  0 siblings, 0 replies; 38+ messages in thread
From: Igor Mammedov @ 2019-04-15 13:27 UTC (permalink / raw)
  To: qemu-devel
  Cc: Paolo Bonzini, Richard Henderson, Helge Deller,
	Hervé Poussineau, David Gibson, Cornelia Huck, Halil Pasic,
	Christian Borntraeger, David Hildenbrand, Artyom Tarasenko,
	Mark Cave-Ayland, qemu-ppc, qemu-s390x


memory_region_allocate_system_memory() is supposed to be called only once,
but due to the lack of a check some boards abused it by calling it multiple
times. This series fixes that by replacing it with memory_region_init_ram()
in most cases.
The s390 case, however, is a bit convoluted, as we have to keep cross-version
migration working and -mem-path needs to keep working there as well. To make
that happen I hacked MemoryRegion aliases to work with migration by allocating
a 'fake' RAMBlock that aliases the original RAM pointer (comments on the
approach are welcome, since I have no real idea how migration works). It is
one way to fix s390 without abusing memory_region_allocate_system_memory()
while still making it work correctly with -mem-path.
(Another, somewhat dirtier, way would be to add a machine callback to handle
-mem-path and override it in s390, but the migratable-aliases approach looks
cleaner to me API-wise and has the potential to be reused if we ever need to
fix up a RAM layout while keeping the migration stream compatible.)
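
To make that idea more concrete, below is a condensed sketch of what the last
two patches combined would let a board do. It is only a sketch: the function
and region names are made up for illustration, error handling is omitted, and
the real s390 code keeps the legacy region names for migration compatibility
(KVM_SLOT_MAX_BYTES stands for the per-memslot size limit).

static void split_initial_ram(MemoryRegion *sysmem, ram_addr_t mem_size)
{
    MemoryRegion *ram = g_new(MemoryRegion, 1);
    ram_addr_t offset = 0;
    unsigned int i = 0;

    /* allocate all of the initial RAM exactly once ... */
    memory_region_allocate_system_memory(ram, NULL, "whole.ram", mem_size);
    /* ... but don't migrate it as one big RAMBlock */
    vmstate_unregister_ram(ram, NULL);

    while (mem_size) {
        MemoryRegion *alias = g_new(MemoryRegion, 1);
        ram_addr_t chunk = MIN(mem_size, KVM_SLOT_MAX_BYTES);
        gchar *name = g_strdup_printf("ram.%u", i++);

        /* the alias now owns a 'fake' RAMBlock pointing into the original
         * allocation, so it can be registered for migration like any RAM */
        memory_region_init_alias(alias, NULL, name, ram, offset, chunk);
        vmstate_register_ram_global(alias);
        memory_region_add_subregion(sysmem, offset, alias);

        g_free(name);
        offset += chunk;
        mem_size -= chunk;
    }
}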

CC: Paolo Bonzini <pbonzini@redhat.com> 
CC: Richard Henderson <rth@twiddle.net> 
CC: Helge Deller <deller@gmx.de> 
CC: "Hervé Poussineau" <hpoussin@reactos.org> 
CC: David Gibson <david@gibson.dropbear.id.au> 
CC: Cornelia Huck <cohuck@redhat.com> 
CC: Halil Pasic <pasic@linux.ibm.com> 
CC: Christian Borntraeger <borntraeger@de.ibm.com> 
CC: David Hildenbrand <david@redhat.com> 
CC: Artyom Tarasenko <atar4qemu@gmail.com> 
CC: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> 
CC: qemu-ppc@nongnu.org
CC: qemu-s390x@nongnu.org

Igor Mammedov (5):
  sparc64: use memory_region_allocate_system_memory() only for '-m'
    specified RAM
  ppc: rs6000_mc: drop usage of memory_region_allocate_system_memory()
  hppa: drop usage of memory_region_allocate_system_memory() for ROM
  memory: make MemoryRegion alias migratable
  s390: do not call memory_region_allocate_system_memory() multiple
    times

 exec.c                     |  7 ++++---
 hw/hppa/machine.c          |  5 ++---
 hw/ppc/rs6000_mc.c         | 15 ++++++++++-----
 hw/s390x/s390-virtio-ccw.c | 20 +++++++++++++++-----
 hw/sparc64/niagara.c       | 25 +++++++++++++------------
 memory.c                   |  5 +++++
 6 files changed, 49 insertions(+), 28 deletions(-)

-- 
2.7.4

* [Qemu-devel] [PATCH v1 1/5] sparc64: use memory_region_allocate_system_memory() only for '-m' specified RAM
@ 2019-04-15 13:27   ` Igor Mammedov
  0 siblings, 0 replies; 38+ messages in thread
From: Igor Mammedov @ 2019-04-15 13:27 UTC (permalink / raw)
  To: qemu-devel
  Cc: Paolo Bonzini, Richard Henderson, Helge Deller,
	Hervé Poussineau, David Gibson, Cornelia Huck, Halil Pasic,
	Christian Borntraeger, David Hildenbrand, Artyom Tarasenko,
	Mark Cave-Ayland, qemu-ppc, qemu-s390x

memory_region_allocate_system_memory() was designed to be called for
allocating initial RAM. Using it multiple times within one board is not
supported and could fail if -mem-path with a non-hugepage path is used.

Keep using memory_region_allocate_system_memory() only for the initial
RAM and use memory_region_init_ram() for the remaining fixed-size regions.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
---
 hw/sparc64/niagara.c | 25 +++++++++++++------------
 1 file changed, 13 insertions(+), 12 deletions(-)

diff --git a/hw/sparc64/niagara.c b/hw/sparc64/niagara.c
index f8a856f..39a4522 100644
--- a/hw/sparc64/niagara.c
+++ b/hw/sparc64/niagara.c
@@ -37,6 +37,7 @@
 #include "sysemu/block-backend.h"
 #include "qemu/error-report.h"
 #include "sysemu/qtest.h"
+#include "qapi/error.h"
 
 
 typedef struct NiagaraBoardState {
@@ -108,8 +109,8 @@ static void niagara_init(MachineState *machine)
     /* init CPUs */
     sparc64_cpu_devinit(machine->cpu_type, NIAGARA_PROM_BASE);
     /* set up devices */
-    memory_region_allocate_system_memory(&s->hv_ram, NULL, "sun4v-hv.ram",
-                                         NIAGARA_HV_RAM_SIZE);
+    memory_region_init_ram(&s->hv_ram, NULL, "sun4v-hv.ram",
+                           NIAGARA_HV_RAM_SIZE, &error_fatal);
     memory_region_add_subregion(sysmem, NIAGARA_HV_RAM_BASE, &s->hv_ram);
 
     memory_region_allocate_system_memory(&s->partition_ram, NULL,
@@ -118,17 +119,17 @@ static void niagara_init(MachineState *machine)
     memory_region_add_subregion(sysmem, NIAGARA_PARTITION_RAM_BASE,
                                 &s->partition_ram);
 
-    memory_region_allocate_system_memory(&s->nvram, NULL,
-                                         "sun4v.nvram", NIAGARA_NVRAM_SIZE);
+    memory_region_init_ram(&s->nvram, NULL, "sun4v.nvram", NIAGARA_NVRAM_SIZE,
+                           &error_fatal);
     memory_region_add_subregion(sysmem, NIAGARA_NVRAM_BASE, &s->nvram);
-    memory_region_allocate_system_memory(&s->md_rom, NULL,
-                                         "sun4v-md.rom", NIAGARA_MD_ROM_SIZE);
+    memory_region_init_ram(&s->md_rom, NULL, "sun4v-md.rom",
+                           NIAGARA_MD_ROM_SIZE, &error_fatal);
     memory_region_add_subregion(sysmem, NIAGARA_MD_ROM_BASE, &s->md_rom);
-    memory_region_allocate_system_memory(&s->hv_rom, NULL,
-                                         "sun4v-hv.rom", NIAGARA_HV_ROM_SIZE);
+    memory_region_init_ram(&s->hv_rom, NULL, "sun4v-hv.rom",
+                           NIAGARA_HV_ROM_SIZE, &error_fatal);
     memory_region_add_subregion(sysmem, NIAGARA_HV_ROM_BASE, &s->hv_rom);
-    memory_region_allocate_system_memory(&s->prom, NULL,
-                                         "sun4v.prom", PROM_SIZE_MAX);
+    memory_region_init_ram(&s->prom, NULL, "sun4v.prom", PROM_SIZE_MAX,
+                           &error_fatal);
     memory_region_add_subregion(sysmem, NIAGARA_PROM_BASE, &s->prom);
 
     add_rom_or_fail("nvram1", NIAGARA_NVRAM_BASE);
@@ -145,8 +146,8 @@ static void niagara_init(MachineState *machine)
         BlockBackend *blk = blk_by_legacy_dinfo(dinfo);
         int size = blk_getlength(blk);
         if (size > 0) {
-            memory_region_allocate_system_memory(&s->vdisk_ram, NULL,
-                                                 "sun4v_vdisk.ram", size);
+            memory_region_init_ram(&s->vdisk_ram, NULL, "sun4v_vdisk.ram", size,
+                                   &error_fatal);
             memory_region_add_subregion(get_system_memory(),
                                         NIAGARA_VDISK_BASE, &s->vdisk_ram);
             dinfo->is_default = 1;
-- 
2.7.4

* [Qemu-devel] [PATCH v1 2/5] ppc: rs6000_mc: drop usage of memory_region_allocate_system_memory()
@ 2019-04-15 13:27   ` Igor Mammedov
  0 siblings, 0 replies; 38+ messages in thread
From: Igor Mammedov @ 2019-04-15 13:27 UTC (permalink / raw)
  To: qemu-devel
  Cc: Paolo Bonzini, Richard Henderson, Helge Deller,
	Hervé Poussineau, David Gibson, Cornelia Huck, Halil Pasic,
	Christian Borntraeger, David Hildenbrand, Artyom Tarasenko,
	Mark Cave-Ayland, qemu-ppc, qemu-s390x

rs6000mc_realize() violates the memory_region_allocate_system_memory()
contract by calling it multiple times, which could break -mem-path.
Replace it with plain memory_region_init_ram() instead.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
---
 hw/ppc/rs6000_mc.c | 15 ++++++++++-----
 1 file changed, 10 insertions(+), 5 deletions(-)

diff --git a/hw/ppc/rs6000_mc.c b/hw/ppc/rs6000_mc.c
index 45cb95e..ea174f1 100644
--- a/hw/ppc/rs6000_mc.c
+++ b/hw/ppc/rs6000_mc.c
@@ -142,6 +142,7 @@ static void rs6000mc_realize(DeviceState *dev, Error **errp)
     RS6000MCState *s = RS6000MC_DEVICE(dev);
     int socket = 0;
     unsigned int ram_size = s->ram_size / MiB;
+    Error *local_err = NULL;
 
     while (socket < 6) {
         if (ram_size >= 64) {
@@ -163,19 +164,21 @@ static void rs6000mc_realize(DeviceState *dev, Error **errp)
         if (s->simm_size[socket]) {
             char name[] = "simm.?";
             name[5] = socket + '0';
-            memory_region_allocate_system_memory(&s->simm[socket], OBJECT(dev),
-                                                 name,
-                                                 s->simm_size[socket] * MiB);
+            memory_region_init_ram(&s->simm[socket], OBJECT(dev), name,
+                                   s->simm_size[socket] * MiB, &local_err);
+            if (local_err) {
+                goto out;
+            }
             memory_region_add_subregion_overlap(get_system_memory(), 0,
                                                 &s->simm[socket], socket);
         }
     }
     if (ram_size) {
         /* unable to push all requested RAM in SIMMs */
-        error_setg(errp, "RAM size incompatible with this board. "
+        error_setg(&local_err, "RAM size incompatible with this board. "
                    "Try again with something else, like %" PRId64 " MB",
                    s->ram_size / MiB - ram_size);
-        return;
+        goto out;
     }
 
     if (s->autoconfigure) {
@@ -191,6 +194,8 @@ static void rs6000mc_realize(DeviceState *dev, Error **errp)
 
     isa_register_portio_list(ISA_DEVICE(dev), &s->portio, 0x0,
                              rs6000mc_port_list, s, "rs6000mc");
+out:
+    error_propagate(errp, local_err);
 }
 
 static const VMStateDescription vmstate_rs6000mc = {
-- 
2.7.4

* [Qemu-devel] [PATCH v1 3/5] hppa: drop usage of memory_region_allocate_system_memory() for ROM
@ 2019-04-15 13:27   ` Igor Mammedov
  0 siblings, 0 replies; 38+ messages in thread
From: Igor Mammedov @ 2019-04-15 13:27 UTC (permalink / raw)
  To: qemu-devel
  Cc: Paolo Bonzini, Richard Henderson, Helge Deller,
	Hervé Poussineau, David Gibson, Cornelia Huck, Halil Pasic,
	Christian Borntraeger, David Hildenbrand, Artyom Tarasenko,
	Mark Cave-Ayland, qemu-ppc, qemu-s390x

machine_hppa_init() violates the memory_region_allocate_system_memory()
contract by calling it multiple times, which could break with -mem-path.
Replace the second usage (for the 'rom' region) with memory_region_init_ram()
instead.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
---
 hw/hppa/machine.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/hw/hppa/machine.c b/hw/hppa/machine.c
index d1b1d3c..b413913 100644
--- a/hw/hppa/machine.c
+++ b/hw/hppa/machine.c
@@ -158,9 +158,8 @@ static void machine_hppa_init(MachineState *machine)
     g_free(firmware_filename);
 
     rom_region = g_new(MemoryRegion, 1);
-    memory_region_allocate_system_memory(rom_region, OBJECT(machine),
-                                         "firmware",
-                                         (FIRMWARE_END - FIRMWARE_START));
+    memory_region_init_ram(rom_region, NULL, "firmware",
+                           (FIRMWARE_END - FIRMWARE_START), &error_fatal);
     memory_region_add_subregion(addr_space, FIRMWARE_START, rom_region);
 
     /* Load kernel */
-- 
2.7.4

* [Qemu-devel] [PATCH v1 4/5] memory: make MemoryRegion alias migratable
@ 2019-04-15 13:27   ` Igor Mammedov
  0 siblings, 0 replies; 38+ messages in thread
From: Igor Mammedov @ 2019-04-15 13:27 UTC (permalink / raw)
  To: qemu-devel
  Cc: Paolo Bonzini, Richard Henderson, Helge Deller,
	Hervé Poussineau, David Gibson, Cornelia Huck, Halil Pasic,
	Christian Borntraeger, David Hildenbrand, Artyom Tarasenko,
	Mark Cave-Ayland, qemu-ppc, qemu-s390x

Use qemu_ram_alloc_from_ptr() to create an aliased RAMBlock that points
into the corresponding part of the original memory region, so that the
alias itself can be registered for migration.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
---
 exec.c   | 7 ++++---
 memory.c | 5 +++++
 2 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/exec.c b/exec.c
index 6ab62f4..0fc10d1 100644
--- a/exec.c
+++ b/exec.c
@@ -2255,7 +2255,7 @@ static void ram_block_add(RAMBlock *new_block, Error **errp, bool shared)
                                         new_block->used_length,
                                         DIRTY_CLIENTS_ALL);
 
-    if (new_block->host) {
+    if (new_block->host && !new_block->mr->alias) {
         qemu_ram_setup_dump(new_block->host, new_block->max_length);
         qemu_madvise(new_block->host, new_block->max_length, QEMU_MADV_HUGEPAGE);
         /* MADV_DONTFORK is also needed by KVM in absence of synchronous MMU */
@@ -2613,7 +2613,8 @@ RAMBlock *qemu_ram_block_from_host(void *ptr, bool round_offset,
 
     rcu_read_lock();
     block = atomic_rcu_read(&ram_list.mru_block);
-    if (block && block->host && host - block->host < block->max_length) {
+    if (block && !block->mr->alias && block->host &&
+        host - block->host < block->max_length) {
         goto found;
     }
 
@@ -2622,7 +2623,7 @@ RAMBlock *qemu_ram_block_from_host(void *ptr, bool round_offset,
         if (block->host == NULL) {
             continue;
         }
-        if (host - block->host < block->max_length) {
+        if (!block->mr->alias && host - block->host < block->max_length) {
             goto found;
         }
     }
diff --git a/memory.c b/memory.c
index 9fbca52..687a147 100644
--- a/memory.c
+++ b/memory.c
@@ -1672,6 +1672,11 @@ void memory_region_init_alias(MemoryRegion *mr,
     memory_region_init(mr, owner, name, size);
     mr->alias = orig;
     mr->alias_offset = offset;
+    if (orig->ram_block && size) {
+        mr->ram_block = qemu_ram_alloc_from_ptr(size,
+                                                orig->ram_block->host + offset,
+                                                mr, &error_fatal);
+    }
 }
 
 void memory_region_init_rom_nomigrate(MemoryRegion *mr,
-- 
2.7.4

* [Qemu-devel] [PATCH v1 5/5] s390: do not call memory_region_allocate_system_memory() multiple times
@ 2019-04-15 13:27   ` Igor Mammedov
  0 siblings, 0 replies; 38+ messages in thread
From: Igor Mammedov @ 2019-04-15 13:27 UTC (permalink / raw)
  To: qemu-devel
  Cc: Paolo Bonzini, Richard Henderson, Helge Deller,
	Hervé Poussineau, David Gibson, Cornelia Huck, Halil Pasic,
	Christian Borntraeger, David Hildenbrand, Artyom Tarasenko,
	Mark Cave-Ayland, qemu-ppc, qemu-s390x

s390 was trying to work around the limited memslot size by abusing
memory_region_allocate_system_memory(), which breaks the API contract
that the function may be called only once.

s390 should have used memory aliases to fragment the initial memory into
smaller chunks to satisfy KVM's memslot limitation. But it's a bit late
for that now, since the allocated pieces are transferred in the migration
stream separately, so it's not possible to simply replace the broken
layout with a correct one. The previous patch made MemoryRegion aliases
migratable; this patch uses them to split the big initial RAM chunk into
smaller pieces of up to KVM_SLOT_MAX_BYTES each and registers the aliases
for migration.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
---
I don't have access to a suitable system to test this, so I've simulated
it with smaller chunks on an x86 host. Ping-pong migration between old
and new QEMU worked fine. The KVM part should be fine, since memslots use
mapped MemoryRegions (in this case that would be the aliases) as far as
I know, but it would be nice if someone could test it on a big enough
host.
---
 hw/s390x/s390-virtio-ccw.c | 20 +++++++++++++++-----
 1 file changed, 15 insertions(+), 5 deletions(-)

diff --git a/hw/s390x/s390-virtio-ccw.c b/hw/s390x/s390-virtio-ccw.c
index d11069b..12ca3a9 100644
--- a/hw/s390x/s390-virtio-ccw.c
+++ b/hw/s390x/s390-virtio-ccw.c
@@ -161,20 +161,30 @@ static void virtio_ccw_register_hcalls(void)
 static void s390_memory_init(ram_addr_t mem_size)
 {
     MemoryRegion *sysmem = get_system_memory();
+    MemoryRegion *ram = g_new(MemoryRegion, 1);
     ram_addr_t chunk, offset = 0;
     unsigned int number = 0;
     gchar *name;
 
     /* allocate RAM for core */
+    memory_region_allocate_system_memory(ram, NULL, "s390.whole.ram", mem_size);
+    /*
+     * memory_region_allocate_system_memory() registers allocated RAM for
+     * migration; however, for compat reasons the RAM should be passed over
+     * as RAMBlocks of size up to KVM_SLOT_MAX_BYTES. So unregister the just
+     * allocated RAM so it won't be migrated directly. Aliases will take
+     * care of segmenting RAM into legacy chunks.
+     */
+    vmstate_unregister_ram(ram, NULL);
     name = g_strdup_printf("s390.ram");
     while (mem_size) {
-        MemoryRegion *ram = g_new(MemoryRegion, 1);
-        uint64_t size = mem_size;
+        MemoryRegion *alias = g_new(MemoryRegion, 1);
 
         /* KVM does not allow memslots >= 8 TB */
-        chunk = MIN(size, KVM_SLOT_MAX_BYTES);
-        memory_region_allocate_system_memory(ram, NULL, name, chunk);
-        memory_region_add_subregion(sysmem, offset, ram);
+        chunk = MIN(mem_size, KVM_SLOT_MAX_BYTES);
+        memory_region_init_alias(alias, NULL, name, ram, offset, chunk);
+        vmstate_register_ram_global(alias);
+        memory_region_add_subregion(sysmem, offset, alias);
         mem_size -= chunk;
         offset += chunk;
         g_free(name);
-- 
2.7.4

* Re: [Qemu-devel] [PATCH v1 1/5] sparc64: use memory_region_allocate_system_memory() only for '-m' specified RAM
@ 2019-04-15 15:07     ` Philippe Mathieu-Daudé
  0 siblings, 0 replies; 38+ messages in thread
From: Philippe Mathieu-Daudé @ 2019-04-15 15:07 UTC (permalink / raw)
  To: Igor Mammedov, qemu-devel
  Cc: qemu-ppc, David Hildenbrand, Helge Deller, Cornelia Huck,
	Mark Cave-Ayland, Halil Pasic, Christian Borntraeger, qemu-s390x,
	Hervé Poussineau, Paolo Bonzini, Richard Henderson,
	Artyom Tarasenko, David Gibson

On 4/15/19 3:27 PM, Igor Mammedov wrote:
> memory_region_allocate_system_memory() was designed to be called for
> allocating inital RAM. Using it mutiple times within one board is not
> supported and could fail if -mem-path with non hugepage path is used.
> 
> Keep using memory_region_allocate_system_memory() only for initial
> RAM and use memory_region_init_ram() for the rest fixed size regions.
> 
> Signed-off-by: Igor Mammedov <imammedo@redhat.com>
> ---
>  hw/sparc64/niagara.c | 25 +++++++++++++------------
>  1 file changed, 13 insertions(+), 12 deletions(-)
> 
> diff --git a/hw/sparc64/niagara.c b/hw/sparc64/niagara.c
> index f8a856f..39a4522 100644
> --- a/hw/sparc64/niagara.c
> +++ b/hw/sparc64/niagara.c
> @@ -37,6 +37,7 @@
>  #include "sysemu/block-backend.h"
>  #include "qemu/error-report.h"
>  #include "sysemu/qtest.h"
> +#include "qapi/error.h"
>  
>  
>  typedef struct NiagaraBoardState {
> @@ -108,8 +109,8 @@ static void niagara_init(MachineState *machine)
>      /* init CPUs */
>      sparc64_cpu_devinit(machine->cpu_type, NIAGARA_PROM_BASE);
>      /* set up devices */
> -    memory_region_allocate_system_memory(&s->hv_ram, NULL, "sun4v-hv.ram",
> -                                         NIAGARA_HV_RAM_SIZE);
> +    memory_region_init_ram(&s->hv_ram, NULL, "sun4v-hv.ram",
> +                           NIAGARA_HV_RAM_SIZE, &error_fatal);
>      memory_region_add_subregion(sysmem, NIAGARA_HV_RAM_BASE, &s->hv_ram);
>  
>      memory_region_allocate_system_memory(&s->partition_ram, NULL,

Using the partition RAM as system memory rather than the HV one seems the
correct choice.

Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>

> @@ -118,17 +119,17 @@ static void niagara_init(MachineState *machine)
>      memory_region_add_subregion(sysmem, NIAGARA_PARTITION_RAM_BASE,
>                                  &s->partition_ram);
>  
> -    memory_region_allocate_system_memory(&s->nvram, NULL,
> -                                         "sun4v.nvram", NIAGARA_NVRAM_SIZE);
> +    memory_region_init_ram(&s->nvram, NULL, "sun4v.nvram", NIAGARA_NVRAM_SIZE,
> +                           &error_fatal);
>      memory_region_add_subregion(sysmem, NIAGARA_NVRAM_BASE, &s->nvram);
> -    memory_region_allocate_system_memory(&s->md_rom, NULL,
> -                                         "sun4v-md.rom", NIAGARA_MD_ROM_SIZE);
> +    memory_region_init_ram(&s->md_rom, NULL, "sun4v-md.rom",
> +                           NIAGARA_MD_ROM_SIZE, &error_fatal);
>      memory_region_add_subregion(sysmem, NIAGARA_MD_ROM_BASE, &s->md_rom);
> -    memory_region_allocate_system_memory(&s->hv_rom, NULL,
> -                                         "sun4v-hv.rom", NIAGARA_HV_ROM_SIZE);
> +    memory_region_init_ram(&s->hv_rom, NULL, "sun4v-hv.rom",
> +                           NIAGARA_HV_ROM_SIZE, &error_fatal);
>      memory_region_add_subregion(sysmem, NIAGARA_HV_ROM_BASE, &s->hv_rom);
> -    memory_region_allocate_system_memory(&s->prom, NULL,
> -                                         "sun4v.prom", PROM_SIZE_MAX);
> +    memory_region_init_ram(&s->prom, NULL, "sun4v.prom", PROM_SIZE_MAX,
> +                           &error_fatal);
>      memory_region_add_subregion(sysmem, NIAGARA_PROM_BASE, &s->prom);
>  
>      add_rom_or_fail("nvram1", NIAGARA_NVRAM_BASE);
> @@ -145,8 +146,8 @@ static void niagara_init(MachineState *machine)
>          BlockBackend *blk = blk_by_legacy_dinfo(dinfo);
>          int size = blk_getlength(blk);
>          if (size > 0) {
> -            memory_region_allocate_system_memory(&s->vdisk_ram, NULL,
> -                                                 "sun4v_vdisk.ram", size);
> +            memory_region_init_ram(&s->vdisk_ram, NULL, "sun4v_vdisk.ram", size,
> +                                   &error_fatal);
>              memory_region_add_subregion(get_system_memory(),
>                                          NIAGARA_VDISK_BASE, &s->vdisk_ram);
>              dinfo->is_default = 1;
> 

* Re: [Qemu-devel] [PATCH v1 3/5] hppa: drop usage of memory_region_allocate_system_memory() for ROM
@ 2019-04-15 15:16     ` Philippe Mathieu-Daudé
  0 siblings, 0 replies; 38+ messages in thread
From: Philippe Mathieu-Daudé @ 2019-04-15 15:16 UTC (permalink / raw)
  To: Igor Mammedov, qemu-devel
  Cc: qemu-ppc, David Hildenbrand, Helge Deller, Cornelia Huck,
	Mark Cave-Ayland, Halil Pasic, Christian Borntraeger, qemu-s390x,
	Hervé Poussineau, Paolo Bonzini, Richard Henderson,
	Artyom Tarasenko, David Gibson

On 4/15/19 3:27 PM, Igor Mammedov wrote:
> machine_hppa_init() violates memory_region_allocate_system_memory() contract
> by calling it multiple times which could break with -mem-path. Replace
> the second usage (for 'rom') with memory_region_init_ram() instead.
> 
> Signed-off-by: Igor Mammedov <imammedo@redhat.com>
> ---
>  hw/hppa/machine.c | 5 ++---
>  1 file changed, 2 insertions(+), 3 deletions(-)
> 
> diff --git a/hw/hppa/machine.c b/hw/hppa/machine.c
> index d1b1d3c..b413913 100644
> --- a/hw/hppa/machine.c
> +++ b/hw/hppa/machine.c
> @@ -158,9 +158,8 @@ static void machine_hppa_init(MachineState *machine)
>      g_free(firmware_filename);
>  
>      rom_region = g_new(MemoryRegion, 1);
> -    memory_region_allocate_system_memory(rom_region, OBJECT(machine),
> -                                         "firmware",
> -                                         (FIRMWARE_END - FIRMWARE_START));
> +    memory_region_init_ram(rom_region, NULL, "firmware",
> +                           (FIRMWARE_END - FIRMWARE_START), &error_fatal);
>      memory_region_add_subregion(addr_space, FIRMWARE_START, rom_region);
>  
>      /* Load kernel */
> 

Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>

* Re: [Qemu-devel] [PATCH v1 2/5] ppc: rs6000_mc: drop usage of memory_region_allocate_system_memory()
@ 2019-04-16  4:13     ` David Gibson
  0 siblings, 0 replies; 38+ messages in thread
From: David Gibson @ 2019-04-16  4:13 UTC (permalink / raw)
  To: Igor Mammedov
  Cc: qemu-devel, Paolo Bonzini, Richard Henderson, Helge Deller,
	Hervé Poussineau, Cornelia Huck, Halil Pasic,
	Christian Borntraeger, David Hildenbrand, Artyom Tarasenko,
	Mark Cave-Ayland, qemu-ppc, qemu-s390x

On Mon, Apr 15, 2019 at 03:27:19PM +0200, Igor Mammedov wrote:
> rs6000mc_realize() violates memory_region_allocate_system_memory() contract
> by calling it multiple times which could break -mem-path. Replace it with
> plain memory_region_init_ram() instead.
> 
> Signed-off-by: Igor Mammedov <imammedo@redhat.com>

Acked-by: David Gibson <david@gibson.dropbear.id.au>

LGTM, though I've no idea if the rs6000 machine even works at the moment.

> ---
>  hw/ppc/rs6000_mc.c | 15 ++++++++++-----
>  1 file changed, 10 insertions(+), 5 deletions(-)
> 
> diff --git a/hw/ppc/rs6000_mc.c b/hw/ppc/rs6000_mc.c
> index 45cb95e..ea174f1 100644
> --- a/hw/ppc/rs6000_mc.c
> +++ b/hw/ppc/rs6000_mc.c
> @@ -142,6 +142,7 @@ static void rs6000mc_realize(DeviceState *dev, Error **errp)
>      RS6000MCState *s = RS6000MC_DEVICE(dev);
>      int socket = 0;
>      unsigned int ram_size = s->ram_size / MiB;
> +    Error *local_err = NULL;
>  
>      while (socket < 6) {
>          if (ram_size >= 64) {
> @@ -163,19 +164,21 @@ static void rs6000mc_realize(DeviceState *dev, Error **errp)
>          if (s->simm_size[socket]) {
>              char name[] = "simm.?";
>              name[5] = socket + '0';
> -            memory_region_allocate_system_memory(&s->simm[socket], OBJECT(dev),
> -                                                 name,
> -                                                 s->simm_size[socket] * MiB);
> +            memory_region_init_ram(&s->simm[socket], OBJECT(dev), name,
> +                                   s->simm_size[socket] * MiB, &local_err);
> +            if (local_err) {
> +                goto out;
> +            }
>              memory_region_add_subregion_overlap(get_system_memory(), 0,
>                                                  &s->simm[socket], socket);
>          }
>      }
>      if (ram_size) {
>          /* unable to push all requested RAM in SIMMs */
> -        error_setg(errp, "RAM size incompatible with this board. "
> +        error_setg(&local_err, "RAM size incompatible with this board. "
>                     "Try again with something else, like %" PRId64 " MB",
>                     s->ram_size / MiB - ram_size);
> -        return;
> +        goto out;
>      }
>  
>      if (s->autoconfigure) {
> @@ -191,6 +194,8 @@ static void rs6000mc_realize(DeviceState *dev, Error **errp)
>  
>      isa_register_portio_list(ISA_DEVICE(dev), &s->portio, 0x0,
>                               rs6000mc_port_list, s, "rs6000mc");
> +out:
> +    error_propagate(errp, local_err);
>  }
>  
>  static const VMStateDescription vmstate_rs6000mc = {

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

* Re: [Qemu-devel] [PATCH v1 5/5] s390: do not call memory_region_allocate_system_memory() multiple times
@ 2019-04-16 11:01     ` Christian Borntraeger
  0 siblings, 0 replies; 38+ messages in thread
From: Christian Borntraeger @ 2019-04-16 11:01 UTC (permalink / raw)
  To: Igor Mammedov, qemu-devel
  Cc: Paolo Bonzini, Richard Henderson, Helge Deller,
	Hervé Poussineau, David Gibson, Cornelia Huck, Halil Pasic,
	David Hildenbrand, Artyom Tarasenko, Mark Cave-Ayland, qemu-ppc,
	qemu-s390x

This crashes a simple -kernel -initrd example on s390x.
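
(For reference, the reproducer is roughly of this shape; the kernel and
initrd paths are placeholders:)

  qemu-system-s390x -nographic -kernel /path/to/vmlinuz -initrd /path/to/initrd.img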

#0  0x000003ff94e3e47c in __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1  0x000003ff94e23d18 in __GI_abort () at abort.c:79
#2  0x000003ff94e365e6 in __assert_fail_base
    (fmt=0x3ff94f60ca6 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=assertion@entry=0x14aac70 "new_block", file=file@entry=0x13b1168 "/home/cborntra/REPOS/qemu/exec.c", line=line@entry=2041, function=function@entry=0x13b0c8a <__PRETTY_FUNCTION__.34656> "qemu_ram_set_idstr") at assert.c:92
#3  0x000003ff94e36664 in __GI___assert_fail
    (assertion=assertion@entry=0x14aac70 "new_block", file=file@entry=0x13b1168 "/home/cborntra/REPOS/qemu/exec.c", line=line@entry=2041, function=function@entry=0x13b0c8a <__PRETTY_FUNCTION__.34656> "qemu_ram_set_idstr") at assert.c:101
#4  0x000000000102e062 in qemu_ram_set_idstr (new_block=new_block@entry=0x0, name=<optimized out>, dev=dev@entry=0x0) at /home/cborntra/REPOS/qemu/exec.c:2041
#5  0x00000000011f5b0a in vmstate_register_ram (mr=0x2cd2dd0, mr@entry=<error reading variable: value has been optimized out>, dev=dev@entry=0x0) at /home/cborntra/REPOS/qemu/migration/savevm.c:2828
#6  0x00000000011f5b5a in vmstate_register_ram_global (mr=<error reading variable: value has been optimized out>) at /home/cborntra/REPOS/qemu/migration/savevm.c:2841
#7  0x000000000110d2ce in s390_memory_init (mem_size=<optimized out>) at /home/cborntra/REPOS/qemu/hw/s390x/s390-virtio-ccw.c:186
#8  0x000000000110d2ce in ccw_init (machine=0x2a96770) at /home/cborntra/REPOS/qemu/hw/s390x/s390-virtio-ccw.c:266
#9  0x00000000011b342c in machine_run_board_init (machine=0x2a96770) at /home/cborntra/REPOS/qemu/hw/core/machine.c:1030
#10 0x0000000001026fee in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at /home/cborntra/REPOS/qemu/vl.c:4479



On 15.04.19 15:27, Igor Mammedov wrote:
> s390 was trying to solve the limited memslot size issue by abusing
> memory_region_allocate_system_memory(), which breaks the API contract
> that the function may be called only once.
> 
> s390 should have used memory aliases to fragment initial memory into
> smaller chunks to satisfy KVM's memslot limitation. But it's a bit
> late now, since the allocated pieces are transferred in the migration
> stream separately, so it's not possible to just replace the broken
> layout with the correct one. The previous patch made MemoryRegion
> aliases migratable, and this patch switches to using them to split the
> big initial RAM chunk into smaller pieces of up to KVM_SLOT_MAX_BYTES
> each and registers the aliases for migration.
> 
> Signed-off-by: Igor Mammedov <imammedo@redhat.com>
> ---
> I don't have access to a suitable system to test it, so I've simulated
> it with smaller chunks on an x86 host. Ping-pong migration between old
> and new QEMU worked fine. The KVM part should be fine, as memslots
> use mapped MemoryRegions (in this case they would be aliases) as
> far as I know, but if someone could test it on a big enough host it
> would be nice.
> ---
>  hw/s390x/s390-virtio-ccw.c | 20 +++++++++++++++-----
>  1 file changed, 15 insertions(+), 5 deletions(-)
> 
> diff --git a/hw/s390x/s390-virtio-ccw.c b/hw/s390x/s390-virtio-ccw.c
> index d11069b..12ca3a9 100644
> --- a/hw/s390x/s390-virtio-ccw.c
> +++ b/hw/s390x/s390-virtio-ccw.c
> @@ -161,20 +161,30 @@ static void virtio_ccw_register_hcalls(void)
>  static void s390_memory_init(ram_addr_t mem_size)
>  {
>      MemoryRegion *sysmem = get_system_memory();
> +    MemoryRegion *ram = g_new(MemoryRegion, 1);
>      ram_addr_t chunk, offset = 0;
>      unsigned int number = 0;
>      gchar *name;
>  
>      /* allocate RAM for core */
> +    memory_region_allocate_system_memory(ram, NULL, "s390.whole.ram", mem_size);
> +    /*
> +     * memory_region_allocate_system_memory() registers allocated RAM for
> +     * migration, however for compat reasons the RAM should be passed over
> +     * as RAMBlocks of the size upto KVM_SLOT_MAX_BYTES. So unregister just
> +     * allocated RAM so it won't be migrated directly. Aliases will take
> +     * of segmenting RAM into legacy chunks.
> +     */
> +    vmstate_unregister_ram(ram, NULL);
>      name = g_strdup_printf("s390.ram");
>      while (mem_size) {
> -        MemoryRegion *ram = g_new(MemoryRegion, 1);
> -        uint64_t size = mem_size;
> +        MemoryRegion *alias = g_new(MemoryRegion, 1);
>  
>          /* KVM does not allow memslots >= 8 TB */
> -        chunk = MIN(size, KVM_SLOT_MAX_BYTES);
> -        memory_region_allocate_system_memory(ram, NULL, name, chunk);
> -        memory_region_add_subregion(sysmem, offset, ram);
> +        chunk = MIN(mem_size, KVM_SLOT_MAX_BYTES);
> +        memory_region_init_alias(alias, NULL, name, ram, offset, chunk);
> +        vmstate_register_ram_global(alias);
> +        memory_region_add_subregion(sysmem, offset, alias);
>          mem_size -= chunk;
>          offset += chunk;
>          g_free(name);
> 

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [Qemu-devel] [PATCH v1 5/5] s390: do not call memory_region_allocate_system_memory() multiple times
@ 2019-04-16 11:02       ` Christian Borntraeger
  0 siblings, 0 replies; 38+ messages in thread
From: Christian Borntraeger @ 2019-04-16 11:02 UTC (permalink / raw)
  To: Igor Mammedov, qemu-devel
  Cc: Paolo Bonzini, Richard Henderson, Helge Deller,
	Hervé Poussineau, David Gibson, Cornelia Huck, Halil Pasic,
	David Hildenbrand, Artyom Tarasenko, Mark Cave-Ayland, qemu-ppc,
	qemu-s390x

On 16.04.19 13:01, Christian Borntraeger wrote:
> This crashes
> 
> a simple -kernel -initrd example on s390x.
> 
> #0  0x000003ff94e3e47c in __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
> #1  0x000003ff94e23d18 in __GI_abort () at abort.c:79
> #2  0x000003ff94e365e6 in __assert_fail_base
>     (fmt=0x3ff94f60ca6 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=assertion@entry=0x14aac70 "new_block", file=file@entry=0x13b1168 "/home/cborntra/REPOS/qemu/exec.c", line=line@entry=2041, function=function@entry=0x13b0c8a <__PRETTY_FUNCTION__.34656> "qemu_ram_set_idstr") at assert.c:92
> #3  0x000003ff94e36664 in __GI___assert_fail
>     (assertion=assertion@entry=0x14aac70 "new_block", file=file@entry=0x13b1168 "/home/cborntra/REPOS/qemu/exec.c", line=line@entry=2041, function=function@entry=0x13b0c8a <__PRETTY_FUNCTION__.34656> "qemu_ram_set_idstr") at assert.c:101
> #4  0x000000000102e062 in qemu_ram_set_idstr (new_block=new_block@entry=0x0, name=<optimized out>, dev=dev@entry=0x0) at /home/cborntra/REPOS/qemu/exec.c:2041
> #5  0x00000000011f5b0a in vmstate_register_ram (mr=0x2cd2dd0, mr@entry=<error reading variable: value has been optimized out>, dev=dev@entry=0x0) at /home/cborntra/REPOS/qemu/migration/savevm.c:2828
> #6  0x00000000011f5b5a in vmstate_register_ram_global (mr=<error reading variable: value has been optimized out>) at /home/cborntra/REPOS/qemu/migration/savevm.c:2841
> #7  0x000000000110d2ce in s390_memory_init (mem_size=<optimized out>) at /home/cborntra/REPOS/qemu/hw/s390x/s390-virtio-ccw.c:186
> #8  0x000000000110d2ce in ccw_init (machine=0x2a96770) at /home/cborntra/REPOS/qemu/hw/s390x/s390-virtio-ccw.c:266
> #9  0x00000000011b342c in machine_run_board_init (machine=0x2a96770) at /home/cborntra/REPOS/qemu/hw/core/machine.c:1030
> #10 0x0000000001026fee in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at /home/cborntra/REPOS/qemu/vl.c:4479

Sorry, I forgot to also apply patch 4. With that, the crash is gone. I will have a look at the patch.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [Qemu-devel] [PATCH v1 5/5] s390: do not call memory_region_allocate_system_memory() multiple times
@ 2019-04-16 11:09     ` Christian Borntraeger
  0 siblings, 0 replies; 38+ messages in thread
From: Christian Borntraeger @ 2019-04-16 11:09 UTC (permalink / raw)
  To: Igor Mammedov, qemu-devel
  Cc: Paolo Bonzini, Richard Henderson, Helge Deller,
	Hervé Poussineau, David Gibson, Cornelia Huck, Halil Pasic,
	David Hildenbrand, Artyom Tarasenko, Mark Cave-Ayland, qemu-ppc,
	qemu-s390x

This fails with more than 8TB, e.g.  "-m 9T "

[pid 231065] ioctl(10, KVM_SET_USER_MEMORY_REGION, {slot=0, flags=0, guest_phys_addr=0, memory_size=0, userspace_addr=0x3ffc8500000}) = 0
[pid 231065] ioctl(10, KVM_SET_USER_MEMORY_REGION, {slot=0, flags=0, guest_phys_addr=0, memory_size=9895604649984, userspace_addr=0x3ffc8500000}) = -1 EINVAL (Invalid argument)

seems that the 2nd memslot gets the full size (and not 9TB-size of first slot).


On 15.04.19 15:27, Igor Mammedov wrote:
> s390 was trying to solve the limited memslot size issue by abusing
> memory_region_allocate_system_memory(), which breaks the API contract
> that the function may be called only once.
> 
> s390 should have used memory aliases to fragment initial memory into
> smaller chunks to satisfy KVM's memslot limitation. But it's a bit
> late now, since the allocated pieces are transferred in the migration
> stream separately, so it's not possible to just replace the broken
> layout with the correct one. The previous patch made MemoryRegion
> aliases migratable, and this patch switches to using them to split the
> big initial RAM chunk into smaller pieces of up to KVM_SLOT_MAX_BYTES
> each and registers the aliases for migration.
> 
> Signed-off-by: Igor Mammedov <imammedo@redhat.com>
> ---
> I don't have access to a suitable system to test it, so I've simulated
> it with smaller chunks on an x86 host. Ping-pong migration between old
> and new QEMU worked fine. The KVM part should be fine, as memslots
> use mapped MemoryRegions (in this case they would be aliases) as
> far as I know, but if someone could test it on a big enough host it
> would be nice.
> ---
>  hw/s390x/s390-virtio-ccw.c | 20 +++++++++++++++-----
>  1 file changed, 15 insertions(+), 5 deletions(-)
> 
> diff --git a/hw/s390x/s390-virtio-ccw.c b/hw/s390x/s390-virtio-ccw.c
> index d11069b..12ca3a9 100644
> --- a/hw/s390x/s390-virtio-ccw.c
> +++ b/hw/s390x/s390-virtio-ccw.c
> @@ -161,20 +161,30 @@ static void virtio_ccw_register_hcalls(void)
>  static void s390_memory_init(ram_addr_t mem_size)
>  {
>      MemoryRegion *sysmem = get_system_memory();
> +    MemoryRegion *ram = g_new(MemoryRegion, 1);
>      ram_addr_t chunk, offset = 0;
>      unsigned int number = 0;
>      gchar *name;
>  
>      /* allocate RAM for core */
> +    memory_region_allocate_system_memory(ram, NULL, "s390.whole.ram", mem_size);
> +    /*
> +     * memory_region_allocate_system_memory() registers allocated RAM for
> +     * migration, however for compat reasons the RAM should be passed over
> +     * as RAMBlocks of the size upto KVM_SLOT_MAX_BYTES. So unregister just
> +     * allocated RAM so it won't be migrated directly. Aliases will take
> +     * of segmenting RAM into legacy chunks.
> +     */
> +    vmstate_unregister_ram(ram, NULL);
>      name = g_strdup_printf("s390.ram");
>      while (mem_size) {
> -        MemoryRegion *ram = g_new(MemoryRegion, 1);
> -        uint64_t size = mem_size;
> +        MemoryRegion *alias = g_new(MemoryRegion, 1);
>  
>          /* KVM does not allow memslots >= 8 TB */
> -        chunk = MIN(size, KVM_SLOT_MAX_BYTES);
> -        memory_region_allocate_system_memory(ram, NULL, name, chunk);
> -        memory_region_add_subregion(sysmem, offset, ram);
> +        chunk = MIN(mem_size, KVM_SLOT_MAX_BYTES);
> +        memory_region_init_alias(alias, NULL, name, ram, offset, chunk);
> +        vmstate_register_ram_global(alias);
> +        memory_region_add_subregion(sysmem, offset, alias);
>          mem_size -= chunk;
>          offset += chunk;
>          g_free(name);
> 

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [Qemu-devel] [PATCH v1 5/5] s390: do not call memory_region_allocate_system_memory() multiple times
@ 2019-04-17 14:30       ` Igor Mammedov
  0 siblings, 0 replies; 38+ messages in thread
From: Igor Mammedov @ 2019-04-17 14:30 UTC (permalink / raw)
  To: Christian Borntraeger
  Cc: qemu-devel, qemu-ppc, David Hildenbrand, Helge Deller,
	Cornelia Huck, Mark Cave-Ayland, Halil Pasic, qemu-s390x,
	Hervé Poussineau, Paolo Bonzini, Richard Henderson,
	Artyom Tarasenko, David Gibson

On Tue, 16 Apr 2019 13:09:08 +0200
Christian Borntraeger <borntraeger@de.ibm.com> wrote:

> This fails with more than 8TB, e.g.  "-m 9T "
> 
> [pid 231065] ioctl(10, KVM_SET_USER_MEMORY_REGION, {slot=0, flags=0, guest_phys_addr=0, memory_size=0, userspace_addr=0x3ffc8500000}) = 0
> [pid 231065] ioctl(10, KVM_SET_USER_MEMORY_REGION, {slot=0, flags=0, guest_phys_addr=0, memory_size=9895604649984, userspace_addr=0x3ffc8500000}) = -1 EINVAL (Invalid argument)
> 
> seems that the 2nd memslot gets the full size (and not 9TB-size of first slot).

I'm able to simulate the issue on an s390 host with KVM enabled; it looks like
memory region aliases are broken on the s390 host (aliasing works as expected on
x86, where it is used for splitting RAM into low and high mem).
I'll try to debug and find out where it goes off on a tangent.
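
For context, the alias pattern referred to here looks roughly like the following on x86 (an illustrative fragment, not the actual pc.c code; the "example.*" names and the ram_size/lowmem_size variables are assumptions): one backing RAM region is allocated once and then mapped into guest physical address space through two aliases, one below and one above the 4 GiB hole.

MemoryRegion *ram = g_new(MemoryRegion, 1);
MemoryRegion *ram_below_4g = g_new(MemoryRegion, 1);
MemoryRegion *ram_above_4g = g_new(MemoryRegion, 1);

/* single backing allocation, registered for migration once */
memory_region_allocate_system_memory(ram, NULL, "example.ram", ram_size);

/* first slice of the backing RAM, mapped at guest physical 0 */
memory_region_init_alias(ram_below_4g, NULL, "example.ram-below-4g",
                         ram, 0, lowmem_size);
memory_region_add_subregion(get_system_memory(), 0, ram_below_4g);

/* the remainder, mapped above the hole at 4 GiB */
memory_region_init_alias(ram_above_4g, NULL, "example.ram-above-4g",
                         ram, lowmem_size, ram_size - lowmem_size);
memory_region_add_subregion(get_system_memory(), 4 * GiB, ram_above_4g);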

> 
> 
> On 15.04.19 15:27, Igor Mammedov wrote:
> > s390 was trying to solve the limited memslot size issue by abusing
> > memory_region_allocate_system_memory(), which breaks the API contract
> > that the function may be called only once.
> > 
> > s390 should have used memory aliases to fragment initial memory into
> > smaller chunks to satisfy KVM's memslot limitation. But it's a bit
> > late now, since the allocated pieces are transferred in the migration
> > stream separately, so it's not possible to just replace the broken
> > layout with the correct one. The previous patch made MemoryRegion
> > aliases migratable, and this patch switches to using them to split the
> > big initial RAM chunk into smaller pieces of up to KVM_SLOT_MAX_BYTES
> > each and registers the aliases for migration.
> > 
> > Signed-off-by: Igor Mammedov <imammedo@redhat.com>
> > ---
> > I don't have access to a suitable system to test it, so I've simulated
> > it with smaller chunks on an x86 host. Ping-pong migration between old
> > and new QEMU worked fine. The KVM part should be fine, as memslots
> > use mapped MemoryRegions (in this case they would be aliases) as
> > far as I know, but if someone could test it on a big enough host it
> > would be nice.
> > ---
> >  hw/s390x/s390-virtio-ccw.c | 20 +++++++++++++++-----
> >  1 file changed, 15 insertions(+), 5 deletions(-)
> > 
> > diff --git a/hw/s390x/s390-virtio-ccw.c b/hw/s390x/s390-virtio-ccw.c
> > index d11069b..12ca3a9 100644
> > --- a/hw/s390x/s390-virtio-ccw.c
> > +++ b/hw/s390x/s390-virtio-ccw.c
> > @@ -161,20 +161,30 @@ static void virtio_ccw_register_hcalls(void)
> >  static void s390_memory_init(ram_addr_t mem_size)
> >  {
> >      MemoryRegion *sysmem = get_system_memory();
> > +    MemoryRegion *ram = g_new(MemoryRegion, 1);
> >      ram_addr_t chunk, offset = 0;
> >      unsigned int number = 0;
> >      gchar *name;
> >  
> >      /* allocate RAM for core */
> > +    memory_region_allocate_system_memory(ram, NULL, "s390.whole.ram", mem_size);
> > +    /*
> > +     * memory_region_allocate_system_memory() registers allocated RAM for
> > +     * migration, however for compat reasons the RAM should be passed over
> > +     * as RAMBlocks of the size upto KVM_SLOT_MAX_BYTES. So unregister just
> > +     * allocated RAM so it won't be migrated directly. Aliases will take
> > +     * of segmenting RAM into legacy chunks.
> > +     */
> > +    vmstate_unregister_ram(ram, NULL);
> >      name = g_strdup_printf("s390.ram");
> >      while (mem_size) {
> > -        MemoryRegion *ram = g_new(MemoryRegion, 1);
> > -        uint64_t size = mem_size;
> > +        MemoryRegion *alias = g_new(MemoryRegion, 1);
> >  
> >          /* KVM does not allow memslots >= 8 TB */
> > -        chunk = MIN(size, KVM_SLOT_MAX_BYTES);
> > -        memory_region_allocate_system_memory(ram, NULL, name, chunk);
> > -        memory_region_add_subregion(sysmem, offset, ram);
> > +        chunk = MIN(mem_size, KVM_SLOT_MAX_BYTES);
> > +        memory_region_init_alias(alias, NULL, name, ram, offset, chunk);
> > +        vmstate_register_ram_global(alias);
> > +        memory_region_add_subregion(sysmem, offset, alias);
> >          mem_size -= chunk;
> >          offset += chunk;
> >          g_free(name);
> >   
> 
> 

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [Qemu-devel] [PATCH v1 5/5] s390: do not call memory_region_allocate_system_memory() multiple times
@ 2019-04-18  9:38       ` Igor Mammedov
  0 siblings, 0 replies; 38+ messages in thread
From: Igor Mammedov @ 2019-04-18  9:38 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: qemu-devel, David Hildenbrand, Cornelia Huck, Halil Pasic, qemu-s390x

On Tue, 16 Apr 2019 13:09:08 +0200
Christian Borntraeger <borntraeger@de.ibm.com> wrote:

> This fails with more than 8TB, e.g.  "-m 9T "
> 
> [pid 231065] ioctl(10, KVM_SET_USER_MEMORY_REGION, {slot=0, flags=0, guest_phys_addr=0, memory_size=0, userspace_addr=0x3ffc8500000}) = 0
> [pid 231065] ioctl(10, KVM_SET_USER_MEMORY_REGION, {slot=0, flags=0, guest_phys_addr=0, memory_size=9895604649984, userspace_addr=0x3ffc8500000}) = -1 EINVAL (Invalid argument)
> 
> seems that the 2nd memslot gets the full size (and not 9TB-size of first slot).

it turns out the MemoryRegion is rendered correctly into 2 parts (one per
alias), but a follow-up flatview_simplify() collapses the adjacent ranges
back into one big range.

I see 2 ways to approach it:
 1. 'improve' the memory region API so we could disable merging for
     a specific memory region (i.e. the RAM-providing memory region)
     (I don't particularly like the idea of twisting this API to serve a KVM-specific purpose)

 2. hide the KVM-ism in KVM code: move KVM_SLOT_MAX_BYTES out of the s390
    machine code and handle splitting a big chunk into several pieces
    (up to KVM_SLOT_MAX_BYTES each) in kvm_set_phys_mem().
    We could add KVMState::max_slot_size, which is set only by s390,
    so it won't affect other targets; see the sketch below.
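
A rough sketch of what option 2 could look like (hypothetical, not an actual patch; kvm_register_one_memslot() is a stand-in for the existing per-slot registration logic inside kvm_set_phys_mem()):

/* default: no limit; the s390 machine code would call
 * kvm_set_max_memslot_size(KVM_SLOT_MAX_BYTES) during init */
static hwaddr kvm_max_slot_size = ~0;

void kvm_set_max_memslot_size(hwaddr max_slot_size)
{
    kvm_max_slot_size = max_slot_size;
}

/* called from kvm_set_phys_mem() for one MemoryRegionSection:
 * registers it as several memslots of at most kvm_max_slot_size each */
static void kvm_register_section_in_chunks(hwaddr start_addr, hwaddr size,
                                           void *ram)
{
    while (size) {
        hwaddr slot_size = MIN(size, kvm_max_slot_size);

        kvm_register_one_memslot(start_addr, slot_size, ram); /* placeholder */

        size -= slot_size;
        start_addr += slot_size;
        ram = (uint8_t *)ram + slot_size;
    }
}

That would keep the board's memory layout (and the migratable aliases from patch 4) unchanged and confine the 8 TB limit to the KVM accel code.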

Paolo,
 I'd like to get your opinion/suggestion on which direction I should look into.
     

 
> On 15.04.19 15:27, Igor Mammedov wrote:
> > s390 was trying to solve the limited memslot size issue by abusing
> > memory_region_allocate_system_memory(), which breaks the API contract
> > that the function may be called only once.
> > 
> > s390 should have used memory aliases to fragment initial memory into
> > smaller chunks to satisfy KVM's memslot limitation. But it's a bit
> > late now, since the allocated pieces are transferred in the migration
> > stream separately, so it's not possible to just replace the broken
> > layout with the correct one. The previous patch made MemoryRegion
> > aliases migratable, and this patch switches to using them to split the
> > big initial RAM chunk into smaller pieces of up to KVM_SLOT_MAX_BYTES
> > each and registers the aliases for migration.
> > 
> > Signed-off-by: Igor Mammedov <imammedo@redhat.com>
> > ---
> > I don't have access to a suitable system to test it, so I've simulated
> > it with smaller chunks on an x86 host. Ping-pong migration between old
> > and new QEMU worked fine. The KVM part should be fine, as memslots
> > use mapped MemoryRegions (in this case they would be aliases) as
> > far as I know, but if someone could test it on a big enough host it
> > would be nice.
> > ---
> >  hw/s390x/s390-virtio-ccw.c | 20 +++++++++++++++-----
> >  1 file changed, 15 insertions(+), 5 deletions(-)
> > 
> > diff --git a/hw/s390x/s390-virtio-ccw.c b/hw/s390x/s390-virtio-ccw.c
> > index d11069b..12ca3a9 100644
> > --- a/hw/s390x/s390-virtio-ccw.c
> > +++ b/hw/s390x/s390-virtio-ccw.c
> > @@ -161,20 +161,30 @@ static void virtio_ccw_register_hcalls(void)
> >  static void s390_memory_init(ram_addr_t mem_size)
> >  {
> >      MemoryRegion *sysmem = get_system_memory();
> > +    MemoryRegion *ram = g_new(MemoryRegion, 1);
> >      ram_addr_t chunk, offset = 0;
> >      unsigned int number = 0;
> >      gchar *name;
> >  
> >      /* allocate RAM for core */
> > +    memory_region_allocate_system_memory(ram, NULL, "s390.whole.ram", mem_size);
> > +    /*
> > +     * memory_region_allocate_system_memory() registers allocated RAM for
> > +     * migration, however for compat reasons the RAM should be passed over
> > +     * as RAMBlocks of the size upto KVM_SLOT_MAX_BYTES. So unregister just
> > +     * allocated RAM so it won't be migrated directly. Aliases will take
> > +     * of segmenting RAM into legacy chunks.
> > +     */
> > +    vmstate_unregister_ram(ram, NULL);
> >      name = g_strdup_printf("s390.ram");
> >      while (mem_size) {
> > -        MemoryRegion *ram = g_new(MemoryRegion, 1);
> > -        uint64_t size = mem_size;
> > +        MemoryRegion *alias = g_new(MemoryRegion, 1);
> >  
> >          /* KVM does not allow memslots >= 8 TB */
> > -        chunk = MIN(size, KVM_SLOT_MAX_BYTES);
> > -        memory_region_allocate_system_memory(ram, NULL, name, chunk);
> > -        memory_region_add_subregion(sysmem, offset, ram);
> > +        chunk = MIN(mem_size, KVM_SLOT_MAX_BYTES);
> > +        memory_region_init_alias(alias, NULL, name, ram, offset, chunk);
> > +        vmstate_register_ram_global(alias);
> > +        memory_region_add_subregion(sysmem, offset, alias);
> >          mem_size -= chunk;
> >          offset += chunk;
> >          g_free(name);
> >   
> 
> 

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [Qemu-devel] [PATCH v1 5/5] s390: do not call memory_region_allocate_system_memory() multiple times
@ 2019-04-18 11:24         ` David Hildenbrand
  0 siblings, 0 replies; 38+ messages in thread
From: David Hildenbrand @ 2019-04-18 11:24 UTC (permalink / raw)
  To: Igor Mammedov, Paolo Bonzini
  Cc: qemu-devel, Cornelia Huck, Halil Pasic, qemu-s390x

On 18.04.19 11:38, Igor Mammedov wrote:
> On Tue, 16 Apr 2019 13:09:08 +0200
> Christian Borntraeger <borntraeger@de.ibm.com> wrote:
> 
>> This fails with more than 8TB, e.g.  "-m 9T "
>>
>> [pid 231065] ioctl(10, KVM_SET_USER_MEMORY_REGION, {slot=0, flags=0, guest_phys_addr=0, memory_size=0, userspace_addr=0x3ffc8500000}) = 0
>> [pid 231065] ioctl(10, KVM_SET_USER_MEMORY_REGION, {slot=0, flags=0, guest_phys_addr=0, memory_size=9895604649984, userspace_addr=0x3ffc8500000}) = -1 EINVAL (Invalid argument)
>>
>> seems that the 2nd memslot gets the full size (and not 9TB-size of first slot).
> 
> it turns out MemoryRegions is rendered correctly in to 2 parts (one per alias),
> but follow up flatview_simplify() collapses adjacent ranges back
> into big one.

That sounds dangerous. Imagine doing that at runtime (e.g. hotplugging a
DIMM): the kvm memory slot would temporarily be deleted to insert the
new, bigger one, and the guest would crash. This could happen if the backing
memory of two DIMMs happened, by pure luck, to be allocated side by side in
user space.
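
To spell the concern out at the KVM level (an illustrative fragment only, not code from QEMU; vm_fd, slot_id, gpa, hva and merged_size are placeholder variables, and <linux/kvm.h> plus <sys/ioctl.h> are assumed): a slot cannot be grown in place, so a merge implies a delete followed by a re-insert, with a window in between where the guest has no mapping for that range. It is the same delete-then-recreate sequence visible in the strace earlier in the thread.

struct kvm_userspace_memory_region del = {
    .slot            = slot_id,
    .guest_phys_addr = gpa,
    .memory_size     = 0,              /* size 0 deletes the slot */
    .userspace_addr  = hva,
};
struct kvm_userspace_memory_region add = {
    .slot            = slot_id,
    .guest_phys_addr = gpa,
    .memory_size     = merged_size,    /* re-created with the larger size */
    .userspace_addr  = hva,
};

ioctl(vm_fd, KVM_SET_USER_MEMORY_REGION, &del);
/* window: any guest access to [gpa, gpa + merged_size) is unmapped here */
ioctl(vm_fd, KVM_SET_USER_MEMORY_REGION, &add);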

-- 

Thanks,

David / dhildenb

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [Qemu-devel] [PATCH v1 5/5] s390: do not call memory_region_allocate_system_memory() multiple times
@ 2019-04-18 12:01           ` Igor Mammedov
  0 siblings, 0 replies; 38+ messages in thread
From: Igor Mammedov @ 2019-04-18 12:01 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Paolo Bonzini, qemu-devel, Cornelia Huck, Halil Pasic, qemu-s390x

On Thu, 18 Apr 2019 13:24:43 +0200
David Hildenbrand <david@redhat.com> wrote:

> On 18.04.19 11:38, Igor Mammedov wrote:
> > On Tue, 16 Apr 2019 13:09:08 +0200
> > Christian Borntraeger <borntraeger@de.ibm.com> wrote:
> >   
> >> This fails with more than 8TB, e.g.  "-m 9T "
> >>
> >> [pid 231065] ioctl(10, KVM_SET_USER_MEMORY_REGION, {slot=0, flags=0, guest_phys_addr=0, memory_size=0, userspace_addr=0x3ffc8500000}) = 0
> >> [pid 231065] ioctl(10, KVM_SET_USER_MEMORY_REGION, {slot=0, flags=0, guest_phys_addr=0, memory_size=9895604649984, userspace_addr=0x3ffc8500000}) = -1 EINVAL (Invalid argument)
> >>
> >> seems that the 2nd memslot gets the full size (and not 9TB-size of first slot).  
> > 
> > it turns out MemoryRegions is rendered correctly in to 2 parts (one per alias),
> > but follow up flatview_simplify() collapses adjacent ranges back
> > into big one.  
> 
> That sounds dangerous. Imagine doing that at runtime (e.g. hotplugging a
> DIMM), the kvm memory slot would temporarily be deleted to insert the
> new, bigger one. Guest would crash. This could happen if backing memory
> of two DIMMs would by pure luck be allocated side by side in user space.
> 

I'm not sure I fully get your concerns, but if you look at can_merge(),
it ensures that merged ranges belong to the same MemoryRegion.

It's hard for me to say whether flatview_simplify() works as designed;
the MemoryRegion code is quite complicated, so I'd defer to Paolo's
opinion.
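
For reference, the check being referred to is roughly the following (a simplified paraphrase, not a verbatim copy of memory.c; the struct is a made-up stand-in for FlatRange and the field names may not match exactly): two flat ranges are only collapsed when they are adjacent in guest physical address space, back the same MemoryRegion at contiguous offsets, and have identical attributes.

#include <stdbool.h>
#include <stdint.h>

typedef struct {
    uint64_t addr;              /* start in guest physical address space */
    uint64_t size;
    uint64_t offset_in_region;  /* offset into the backing MemoryRegion */
    const void *mr;             /* identity of the backing MemoryRegion */
    uint8_t dirty_log_mask;
    bool readonly;
} FlatRangeSketch;

static bool can_merge_sketch(const FlatRangeSketch *r1,
                             const FlatRangeSketch *r2)
{
    return r1->addr + r1->size == r2->addr                         /* adjacent */
        && r1->mr == r2->mr                                        /* same region */
        && r1->offset_in_region + r1->size == r2->offset_in_region /* contiguous */
        && r1->dirty_log_mask == r2->dirty_log_mask
        && r1->readonly == r2->readonly;
}

On that reading, the two aliases of the single s390.whole.ram region merge because they resolve to the same backing region at contiguous offsets, while two separate DIMM regions would not.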

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [Qemu-devel] [PATCH v1 5/5] s390: do not call memory_region_allocate_system_memory() multiple times
@ 2019-04-18 12:06             ` David Hildenbrand
  0 siblings, 0 replies; 38+ messages in thread
From: David Hildenbrand @ 2019-04-18 12:06 UTC (permalink / raw)
  To: Igor Mammedov
  Cc: Paolo Bonzini, qemu-devel, Cornelia Huck, Halil Pasic, qemu-s390x

On 18.04.19 14:01, Igor Mammedov wrote:
> On Thu, 18 Apr 2019 13:24:43 +0200
> David Hildenbrand <david@redhat.com> wrote:
> 
>> On 18.04.19 11:38, Igor Mammedov wrote:
>>> On Tue, 16 Apr 2019 13:09:08 +0200
>>> Christian Borntraeger <borntraeger@de.ibm.com> wrote:
>>>   
>>>> This fails with more than 8TB, e.g.  "-m 9T "
>>>>
>>>> [pid 231065] ioctl(10, KVM_SET_USER_MEMORY_REGION, {slot=0, flags=0, guest_phys_addr=0, memory_size=0, userspace_addr=0x3ffc8500000}) = 0
>>>> [pid 231065] ioctl(10, KVM_SET_USER_MEMORY_REGION, {slot=0, flags=0, guest_phys_addr=0, memory_size=9895604649984, userspace_addr=0x3ffc8500000}) = -1 EINVAL (Invalid argument)
>>>>
>>>> seems that the 2nd memslot gets the full size (and not 9TB-size of first slot).  
>>>
>>> it turns out MemoryRegions is rendered correctly in to 2 parts (one per alias),
>>> but follow up flatview_simplify() collapses adjacent ranges back
>>> into big one.  
>>
>> That sounds dangerous. Imagine doing that at runtime (e.g. hotplugging a
>> DIMM), the kvm memory slot would temporarily be deleted to insert the
>> new, bigger one. Guest would crash. This could happen if backing memory
>> of two DIMMs would by pure luck be allocated side by side in user space.
>>
> 
> not sure I fully get your concerns, but if you look at can_merge()
> it ensures that ranges belong to the same MemoryRegion.
> 
> It's hard for me to say if flatview_simplify() works as designed,
> MemoryRegion code is quite complicated so I'd deffer to Paolo's
> opinion.
> 

What I had in mind:

We have the MemoryRegion for memory devices (m->device_memory).

Assume the first DIMM is created, allocating memory in the user space
process at:

[0x100000000 .. 0x200000000]. It is placed at offset 0 in m->device_memory.

The guest starts to run, and a second DIMM is hotplugged. Memory in the user
space process is allocated (by pure luck) at:

[0x200000000 .. 0x300000000]. It is placed at offset 0x100000000 in
m->device_memory.

Without looking at the code, I could imagine that both might be merged
into a single memory slot. That is my concern. Maybe it is not valid.

-- 

Thanks,

David / dhildenb

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [Qemu-devel] [PATCH v1 5/5] s390: do not call memory_region_allocate_system_memory() multiple times
@ 2019-04-18 14:56               ` Igor Mammedov
  0 siblings, 0 replies; 38+ messages in thread
From: Igor Mammedov @ 2019-04-18 14:56 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Paolo Bonzini, qemu-devel, Cornelia Huck, Halil Pasic, qemu-s390x

On Thu, 18 Apr 2019 14:06:25 +0200
David Hildenbrand <david@redhat.com> wrote:

> On 18.04.19 14:01, Igor Mammedov wrote:
> > On Thu, 18 Apr 2019 13:24:43 +0200
> > David Hildenbrand <david@redhat.com> wrote:
> >   
> >> On 18.04.19 11:38, Igor Mammedov wrote:  
> >>> On Tue, 16 Apr 2019 13:09:08 +0200
> >>> Christian Borntraeger <borntraeger@de.ibm.com> wrote:
> >>>     
> >>>> This fails with more than 8TB, e.g.  "-m 9T "
> >>>>
> >>>> [pid 231065] ioctl(10, KVM_SET_USER_MEMORY_REGION, {slot=0, flags=0, guest_phys_addr=0, memory_size=0, userspace_addr=0x3ffc8500000}) = 0
> >>>> [pid 231065] ioctl(10, KVM_SET_USER_MEMORY_REGION, {slot=0, flags=0, guest_phys_addr=0, memory_size=9895604649984, userspace_addr=0x3ffc8500000}) = -1 EINVAL (Invalid argument)
> >>>>
> >>>> seems that the 2nd memslot gets the full size (and not 9TB-size of first slot).    
> >>>
> >>> it turns out MemoryRegions is rendered correctly in to 2 parts (one per alias),
> >>> but follow up flatview_simplify() collapses adjacent ranges back
> >>> into big one.    
> >>
> >> That sounds dangerous. Imagine doing that at runtime (e.g. hotplugging a
> >> DIMM), the kvm memory slot would temporarily be deleted to insert the
> >> new, bigger one. Guest would crash. This could happen if backing memory
> >> of two DIMMs would by pure luck be allocated side by side in user space.
> >>  
> > 
> > not sure I fully get your concerns, but if you look at can_merge()
> > it ensures that ranges belong to the same MemoryRegion.
> > 
> > It's hard for me to say if flatview_simplify() works as designed,
> > MemoryRegion code is quite complicated so I'd deffer to Paolo's
> > opinion.
> >   
> 
> What I had in mind:
> 
> We have the Memory Region for memory devices (m->device_memory).
> 
> Assume The first DIMM is created, allocating memory in the user space
> process:
> 
> [0x100000000 .. 0x20000000]. It is placed at offset 0 in m->device_memory.
> 
> Guests starts to run, a second DIMM is hotplugged. Memory in user space
> process is allocated (by pure luck) at:
> 
> [0x200000000 .. 0x30000000]. It is placed at offset 0x100000000 in
> m->device_memory.
> 
> Without looking at the code, I could imagine that both might be merged
> into a single memory slot. That is my concern. Maybe it is not valid.
It's not. As far as I can see, ranges are merged only if they belong to
the same 'mr', so two DIMMs will result in 2 memory sections -> 2 KVMSlots.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [Qemu-devel] [PATCH v1 5/5] s390: do not call memory_region_allocate_system_memory() multiple times
@ 2019-04-18 15:01                 ` David Hildenbrand
  0 siblings, 0 replies; 38+ messages in thread
From: David Hildenbrand @ 2019-04-18 15:01 UTC (permalink / raw)
  To: Igor Mammedov
  Cc: Paolo Bonzini, qemu-devel, Cornelia Huck, Halil Pasic, qemu-s390x

On 18.04.19 16:56, Igor Mammedov wrote:
> On Thu, 18 Apr 2019 14:06:25 +0200
> David Hildenbrand <david@redhat.com> wrote:
> 
>> On 18.04.19 14:01, Igor Mammedov wrote:
>>> On Thu, 18 Apr 2019 13:24:43 +0200
>>> David Hildenbrand <david@redhat.com> wrote:
>>>   
>>>> On 18.04.19 11:38, Igor Mammedov wrote:  
>>>>> On Tue, 16 Apr 2019 13:09:08 +0200
>>>>> Christian Borntraeger <borntraeger@de.ibm.com> wrote:
>>>>>     
>>>>>> This fails with more than 8TB, e.g.  "-m 9T "
>>>>>>
>>>>>> [pid 231065] ioctl(10, KVM_SET_USER_MEMORY_REGION, {slot=0, flags=0, guest_phys_addr=0, memory_size=0, userspace_addr=0x3ffc8500000}) = 0
>>>>>> [pid 231065] ioctl(10, KVM_SET_USER_MEMORY_REGION, {slot=0, flags=0, guest_phys_addr=0, memory_size=9895604649984, userspace_addr=0x3ffc8500000}) = -1 EINVAL (Invalid argument)
>>>>>>
>>>>>> seems that the 2nd memslot gets the full size (and not 9TB minus the size of the first slot).
>>>>>
>>>>> it turns out the MemoryRegion is rendered correctly into 2 parts (one per alias),
>>>>> but a follow-up flatview_simplify() collapses the adjacent ranges back
>>>>> into one big one.
>>>>
>>>> That sounds dangerous. Imagine doing that at runtime (e.g. hotplugging a
>>>> DIMM), the kvm memory slot would temporarily be deleted to insert the
>>>> new, bigger one. Guest would crash. This could happen if backing memory
>>>> of two DIMMs would by pure luck be allocated side by side in user space.
>>>>  
>>>
>>> not sure I fully get your concerns, but if you look at can_merge()
>>> it ensures that ranges belong to the same MemoryRegion.
>>>
>>> It's hard for me to say if flatview_simplify() works as designed,
>>> the MemoryRegion code is quite complicated, so I'd defer to Paolo's
>>> opinion.
>>>   
>>
>> What I had in mind:
>>
>> We have the Memory Region for memory devices (m->device_memory).
>>
>> Assume the first DIMM is created, allocating memory in the user space
>> process:
>>
>> [0x100000000 .. 0x200000000]. It is placed at offset 0 in m->device_memory.
>>
>> The guest starts to run, a second DIMM is hotplugged. Memory in the user space
>> process is allocated (by pure luck) at:
>>
>> [0x200000000 .. 0x300000000]. It is placed at offset 0x100000000 in
>> m->device_memory.
>>
>> Without looking at the code, I could imagine that both might be merged
>> into a single memory slot. That is my concern. Maybe it is not valid.
> it's not. As far as I can see, ranges are merged only if they belong to
> the same 'mr'. So two DIMMs will result in 2 memory sections -> 2 KVMSlots.

Okay, so a shared "parent memory region" is not enough to result in a
merge; only aliases of the same underlying region are.
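
[Editor's note: a minimal sketch of the alias-based split being discussed, to
show why that merge matters. This is hypothetical illustration code, not the
actual patch: the chunk size, names and placement are assumptions. Each alias
is a distinct MemoryRegion object, but every alias resolves to the same
underlying RAM region, so the adjacent FlatRanges it produces satisfy the merge
condition sketched earlier and flatview_simplify() can fold them back into one
range that is then too large for a single KVM memslot.

#include "qemu/osdep.h"
#include "exec/memory.h"

/* Map 'ram' into 'sysmem' as a series of aliases no larger than 'chunk'
 * bytes each.  The 8 TiB value is an assumed per-memslot maximum used
 * only for this example. */
static void split_ram_into_aliases(MemoryRegion *sysmem, MemoryRegion *ram,
                                   uint64_t total_size)
{
    const uint64_t chunk = 8ULL << 40;   /* 8 TiB, assumed memslot limit */
    uint64_t offset = 0;
    unsigned int i = 0;

    while (offset < total_size) {
        uint64_t size = MIN(chunk, total_size - offset);
        MemoryRegion *alias = g_new(MemoryRegion, 1);
        char name[32];

        snprintf(name, sizeof(name), "ram-alias-%u", i++);
        /* The alias points back into 'ram' at 'offset'... */
        memory_region_init_alias(alias, NULL, name, ram, offset, size);
        /* ...and is mapped at the same guest-physical offset. */
        memory_region_add_subregion(sysmem, offset, alias);
        offset += size;
    }
}
]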

-- 

Thanks,

David / dhildenb

^ permalink raw reply	[flat|nested] 38+ messages in thread

end of thread, other threads:[~2019-04-18 15:12 UTC | newest]

Thread overview: 38+ messages
-- links below jump to the message on this page --
2019-04-15 13:27 [Qemu-devel] [PATCH v1 0/5] Fix misuses of memory_region_allocate_system_memory() Igor Mammedov
2019-04-15 13:27 ` [Qemu-devel] [PATCH v1 1/5] sparc64: use memory_region_allocate_system_memory() only for '-m' specified RAM Igor Mammedov
2019-04-15 15:07   ` Philippe Mathieu-Daudé
2019-04-15 13:27 ` [Qemu-devel] [PATCH v1 2/5] ppc: rs6000_mc: drop usage of memory_region_allocate_system_memory() Igor Mammedov
2019-04-16  4:13   ` David Gibson
2019-04-15 13:27 ` [Qemu-devel] [PATCH v1 3/5] hppa: drop usage of memory_region_allocate_system_memory() for ROM Igor Mammedov
2019-04-15 15:16   ` Philippe Mathieu-Daudé
2019-04-15 13:27 ` [Qemu-devel] [PATCH v1 4/5] memory: make MemoryRegion alias migratable Igor Mammedov
2019-04-15 13:27 ` [Qemu-devel] [PATCH v1 5/5] s390: do not call memory_region_allocate_system_memory() multiple times Igor Mammedov
2019-04-16 11:01   ` Christian Borntraeger
2019-04-16 11:02     ` Christian Borntraeger
2019-04-16 11:09   ` Christian Borntraeger
2019-04-17 14:30     ` Igor Mammedov
2019-04-18  9:38     ` Igor Mammedov
2019-04-18 11:24       ` David Hildenbrand
2019-04-18 12:01         ` Igor Mammedov
2019-04-18 12:06           ` David Hildenbrand
2019-04-18 14:56             ` Igor Mammedov
2019-04-18 15:01               ` David Hildenbrand
