* [Qemu-devel] [RFC v4 00/16] ARM virt: PCDIMM/NVDIMM at 2TB
@ 2018-10-18 14:30 Eric Auger
  2018-10-18 14:30 ` [Qemu-devel] [RFC v4 01/16] hw/arm/boot: introduce fdt_add_memory_node helper Eric Auger
                   ` (15 more replies)
  0 siblings, 16 replies; 26+ messages in thread
From: Eric Auger @ 2018-10-18 14:30 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, qemu-devel, qemu-arm, peter.maydell,
	shameerali.kolothum.thodi, kwangwoo.lee, imammedo, david
  Cc: drjones, dgilbert, Suzuki.Poulose

This series aims at supporting PCDIMM/NVDIMM instantiation in
mach-virt at 2TB guest physical address.

This is not material for 3.1.

Although there is another implementation alternative under
discussion on the ML, consisting in having a single device memory
region with a floating base, I continue to maintain this series
until all dependencies get resolved and this other solution gets
fully prototyped.

Here we consider that the top of the address map is the top of
the device memory starting at 2TB and whose size is defined
by maxram_size - ram_size.

We create a KVM machine with an IPA range including the top of
the device memory. Suzuki's kernel series [0] allows that setting.
It can be found on the kvmarm/next branch, which was used for
testing.

Then the series adds the device memory framework to mach-virt
and brings support for PC-DIMM and NV-DIMM frontend devices.

This series reuses/rebases patches initially submitted by Shameer
in [1] and Kwangwoo in [2] for the PC-DIMM and NV-DIMM parts.

Notes:
- The EDK2 FW still hardcodes the max PA size to 40 bits
- The TCG machine limits the PA according to the id_aa64mmfr0 register
  PARange field. At the moment the PARange is hardcoded to 40bits for
  the A53 CPU and to 44 bits for the A57. So only an A57 based machine
  would be able to expose device memory. This consistency check is not
  yet handled. We could think about adding the support for the phys-bit
  CPU option to set the id_aa64mmfr0.parange.

The series includes "hw/arm/boot: introduce fdt_add_memory_node
helper" which was sent separately.

Best Regards

Eric

References:

[0] [PATCH v6 00/18] arm64: Dynamic & 52bit IPA support
https://lkml.org/lkml/2018/9/26/936
kvmarm next branch can be used for testing

[1] [RFC v2 0/6] hw/arm: Add support for non-contiguous iova regions
http://patchwork.ozlabs.org/cover/914694/

[2] [RFC PATCH 0/3] add nvdimm support on AArch64 virt platform
https://lists.gnu.org/archive/html/qemu-devel/2016-07/msg04599.html

Tests:
- On Cavium Gigabyte, a 48b KVM accelerated VM was created.
- PC-DIMM was tested with memtester
- NV-DIMM was tested with ndctl and an ext4 DAX FS was installed on
  guest

This series can be found at:
https://github.com/eauger/qemu/tree/v3.0.0-dimm-2tb-v4

History:

v3 -> v4:
- rebase on David's "pc-dimm: next bunch of cleanups" and
  "pc-dimm: pre_plug "slot" and "addr" assignment"
- kvm-type option not used anymore. We directly use
  maxram_size and ram_size machine fields to compute the
  MAX IPA range. Migration is naturally handled as CLI
  options are kept between source and destination. This was
  suggested by David.
- device_memory_start and device_memory_size not stored
  anymore in vms->bootinfo
- I did not take into account two of Igor's comments: the one
  related to the refactoring of arm_load_dtb and the one
  related to the generation of the dtb after system_reset
  which would contain nodes of hotplugged devices (we do
  not support hotplug at this stage)
- check the end-user does not attempt to hotplug a device
- addition of "vl: Set machine ram_size, maxram_size and
  ram_slots earlier"

v2 -> v3:
- fix pc_q35 and pc_piix compilation error
- Kwangwoo's email is no longer valid; removed his address

v1 -> v2:
- kvm_get_max_vm_phys_shift moved in arch specific file
- addition of NVDIMM part
- single series
- rebase on David's refactoring

v1:
- was "[RFC 0/6] KVM/ARM: Dynamic and larger GPA size"
- was "[RFC 0/5] ARM virt: Support PC-DIMM at 2TB"

Best Regards

Eric

Eric Auger (10):
  linux-headers: header update for KVM/ARM KVM_ARM_GET_MAX_VM_PHYS_SHIFT
  hw/boards: Add a MachineState parameter to kvm_type callback
  kvm: add kvm_arm_get_max_vm_phys_shift
  vl: Set machine ram_size, maxram_size and ram_slots earlier
  hw/arm/virt: Add virt-3.2 machine type
  hw/arm/virt: Implement kvm_type function for 3.2 machine
  hw/arm/virt: Allocate device_memory
  acpi: move build_srat_hotpluggable_memory to generic ACPI source
  hw/arm/boot: Expose the pmem nodes in the DT
  hw/arm/virt: Add nvdimm and nvdimm-persistence options

Kwangwoo Lee (2):
  nvdimm: use configurable ACPI IO base and size
  hw/arm/virt: Add nvdimm hot-plug infrastructure

Shameer Kolothum (4):
  hw/arm/boot: introduce fdt_add_memory_node helper
  hw/arm/virt: Add memory hotplug framework
  hw/arm/boot: Expose the PC-DIMM nodes in the DT
  hw/arm/virt-acpi-build: Add PC-DIMM in SRAT

 accel/kvm/kvm-all.c             |   2 +-
 default-configs/arm-softmmu.mak |   4 +
 hw/acpi/aml-build.c             |  51 ++++++
 hw/acpi/nvdimm.c                |  28 ++-
 hw/arm/boot.c                   | 120 ++++++++++---
 hw/arm/virt-acpi-build.c        |  10 ++
 hw/arm/virt.c                   | 302 ++++++++++++++++++++++++++++----
 hw/i386/pc_piix.c               |   8 +-
 hw/i386/pc_q35.c                |   8 +-
 hw/ppc/mac_newworld.c           |   3 +-
 hw/ppc/mac_oldworld.c           |   2 +-
 hw/ppc/spapr.c                  |   2 +-
 include/hw/acpi/aml-build.h     |   3 +
 include/hw/arm/virt.h           |   5 +
 include/hw/boards.h             |   2 +-
 include/hw/mem/nvdimm.h         |  12 ++
 linux-headers/linux/kvm.h       |  10 ++
 target/arm/kvm.c                |   8 +
 target/arm/kvm_arm.h            |  16 ++
 vl.c                            |   6 +-
 20 files changed, 530 insertions(+), 72 deletions(-)

-- 
2.17.1


* [Qemu-devel] [RFC v4 01/16] hw/arm/boot: introduce fdt_add_memory_node helper
  2018-10-18 14:30 [Qemu-devel] [RFC v4 00/16] ARM virt: PCDIMM/NVDIMM at 2TB Eric Auger
@ 2018-10-18 14:30 ` Eric Auger
  2018-10-18 14:30 ` [Qemu-devel] [RFC v4 02/16] linux-headers: header update for KVM/ARM KVM_ARM_GET_MAX_VM_PHYS_SHIFT Eric Auger
                   ` (14 subsequent siblings)
  15 siblings, 0 replies; 26+ messages in thread
From: Eric Auger @ 2018-10-18 14:30 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, qemu-devel, qemu-arm, peter.maydell,
	shameerali.kolothum.thodi, kwangwoo.lee, imammedo, david
  Cc: drjones, dgilbert, Suzuki.Poulose

From: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>

Introduce a helper to create a memory node.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
---
 hw/arm/boot.c | 54 ++++++++++++++++++++++++++++++++-------------------
 1 file changed, 34 insertions(+), 20 deletions(-)

diff --git a/hw/arm/boot.c b/hw/arm/boot.c
index 20c71d7d96..ba2004da5c 100644
--- a/hw/arm/boot.c
+++ b/hw/arm/boot.c
@@ -413,6 +413,36 @@ static void set_kernel_args_old(const struct arm_boot_info *info,
     }
 }
 
+static int fdt_add_memory_node(void *fdt, uint32_t acells, hwaddr mem_base,
+                               uint32_t scells, hwaddr mem_len,
+                               int numa_node_id)
+{
+    char *nodename = NULL;
+    int ret;
+
+    nodename = g_strdup_printf("/memory@%" PRIx64, mem_base);
+    qemu_fdt_add_subnode(fdt, nodename);
+    qemu_fdt_setprop_string(fdt, nodename, "device_type", "memory");
+    ret = qemu_fdt_setprop_sized_cells(fdt, nodename, "reg", acells, mem_base,
+                                       scells, mem_len);
+    if (ret < 0) {
+        fprintf(stderr, "couldn't set %s/reg\n", nodename);
+        goto out;
+    }
+    if (numa_node_id < 0) {
+        goto out;
+    }
+
+    ret = qemu_fdt_setprop_cell(fdt, nodename, "numa-node-id", numa_node_id);
+    if (ret < 0) {
+        fprintf(stderr, "couldn't set %s/numa-node-id\n", nodename);
+    }
+
+out:
+    g_free(nodename);
+    return ret;
+}
+
 static void fdt_add_psci_node(void *fdt)
 {
     uint32_t cpu_suspend_fn;
@@ -492,7 +522,6 @@ int arm_load_dtb(hwaddr addr, const struct arm_boot_info *binfo,
     void *fdt = NULL;
     int size, rc, n = 0;
     uint32_t acells, scells;
-    char *nodename;
     unsigned int i;
     hwaddr mem_base, mem_len;
     char **node_path;
@@ -566,35 +595,20 @@ int arm_load_dtb(hwaddr addr, const struct arm_boot_info *binfo,
         mem_base = binfo->loader_start;
         for (i = 0; i < nb_numa_nodes; i++) {
             mem_len = numa_info[i].node_mem;
-            nodename = g_strdup_printf("/memory@%" PRIx64, mem_base);
-            qemu_fdt_add_subnode(fdt, nodename);
-            qemu_fdt_setprop_string(fdt, nodename, "device_type", "memory");
-            rc = qemu_fdt_setprop_sized_cells(fdt, nodename, "reg",
-                                              acells, mem_base,
-                                              scells, mem_len);
+            rc = fdt_add_memory_node(fdt, acells, mem_base,
+                                     scells, mem_len, i);
             if (rc < 0) {
-                fprintf(stderr, "couldn't set %s/reg for node %d\n", nodename,
-                        i);
                 goto fail;
             }
 
-            qemu_fdt_setprop_cell(fdt, nodename, "numa-node-id", i);
             mem_base += mem_len;
-            g_free(nodename);
         }
     } else {
-        nodename = g_strdup_printf("/memory@%" PRIx64, binfo->loader_start);
-        qemu_fdt_add_subnode(fdt, nodename);
-        qemu_fdt_setprop_string(fdt, nodename, "device_type", "memory");
-
-        rc = qemu_fdt_setprop_sized_cells(fdt, nodename, "reg",
-                                          acells, binfo->loader_start,
-                                          scells, binfo->ram_size);
+        rc = fdt_add_memory_node(fdt, acells, binfo->loader_start,
+                                 scells, binfo->ram_size, -1);
         if (rc < 0) {
-            fprintf(stderr, "couldn't set %s reg\n", nodename);
             goto fail;
         }
-        g_free(nodename);
     }
 
     rc = fdt_path_offset(fdt, "/chosen");
-- 
2.17.1


* [Qemu-devel] [RFC v4 02/16] linux-headers: header update for KVM/ARM KVM_ARM_GET_MAX_VM_PHYS_SHIFT
  2018-10-18 14:30 [Qemu-devel] [RFC v4 00/16] ARM virt: PCDIMM/NVDIMM at 2TB Eric Auger
  2018-10-18 14:30 ` [Qemu-devel] [RFC v4 01/16] hw/arm/boot: introduce fdt_add_memory_node helper Eric Auger
@ 2018-10-18 14:30 ` Eric Auger
  2018-10-19  8:49   ` Suzuki K Poulose
  2018-10-18 14:30 ` [Qemu-devel] [RFC v4 03/16] hw/boards: Add a MachineState parameter to kvm_type callback Eric Auger
                   ` (13 subsequent siblings)
  15 siblings, 1 reply; 26+ messages in thread
From: Eric Auger @ 2018-10-18 14:30 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, qemu-devel, qemu-arm, peter.maydell,
	shameerali.kolothum.thodi, kwangwoo.lee, imammedo, david
  Cc: drjones, dgilbert, Suzuki.Poulose

This is a header update against the kvmarm next branch

git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm kvmarm/next

to get the KVM_CAP_ARM_VM_IPA_SIZE capability and the
KVM_VM_TYPE_ARM_IPA_SIZE() macro. These allow retrieving the IPA
range KVM supports and requesting an IPA size at VM creation.

Signed-off-by: Eric Auger <eric.auger@redhat.com>

---

v3 -> v4:
- update against kvmarm next
---
 linux-headers/linux/kvm.h | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/linux-headers/linux/kvm.h b/linux-headers/linux/kvm.h
index 83ba4eb571..9647ce4fcb 100644
--- a/linux-headers/linux/kvm.h
+++ b/linux-headers/linux/kvm.h
@@ -750,6 +750,15 @@ struct kvm_ppc_resize_hpt {
 
 #define KVM_S390_SIE_PAGE_OFFSET 1
 
+/*
+ * On arm64, machine type can be used to request the physical
+ * address size for the VM. Bits[7-0] are reserved for the guest
+ * PA size shift (i.e, log2(PA_Size)). For backward compatibility,
+ * value 0 implies the default IPA size, 40bits.
+ */
+#define KVM_VM_TYPE_ARM_IPA_SIZE_MASK	0xffULL
+#define KVM_VM_TYPE_ARM_IPA_SIZE(x)		\
+	((x) & KVM_VM_TYPE_ARM_IPA_SIZE_MASK)
 /*
  * ioctls for /dev/kvm fds:
  */
@@ -953,6 +962,7 @@ struct kvm_ppc_resize_hpt {
 #define KVM_CAP_NESTED_STATE 157
 #define KVM_CAP_ARM_INJECT_SERROR_ESR 158
 #define KVM_CAP_MSR_PLATFORM_INFO 159
+#define KVM_CAP_ARM_VM_IPA_SIZE 160 /* returns maximum IPA bits for a VM */
 
 #ifdef KVM_CAP_IRQ_ROUTING
 
-- 
2.17.1


* [Qemu-devel] [RFC v4 03/16] hw/boards: Add a MachineState parameter to kvm_type callback
  2018-10-18 14:30 [Qemu-devel] [RFC v4 00/16] ARM virt: PCDIMM/NVDIMM at 2TB Eric Auger
  2018-10-18 14:30 ` [Qemu-devel] [RFC v4 01/16] hw/arm/boot: introduce fdt_add_memory_node helper Eric Auger
  2018-10-18 14:30 ` [Qemu-devel] [RFC v4 02/16] linux-headers: header update for KVM/ARM KVM_ARM_GET_MAX_VM_PHYS_SHIFT Eric Auger
@ 2018-10-18 14:30 ` Eric Auger
  2018-10-18 14:30 ` [Qemu-devel] [RFC v4 04/16] kvm: add kvm_arm_get_max_vm_phys_shift Eric Auger
                   ` (12 subsequent siblings)
  15 siblings, 0 replies; 26+ messages in thread
From: Eric Auger @ 2018-10-18 14:30 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, qemu-devel, qemu-arm, peter.maydell,
	shameerali.kolothum.thodi, kwangwoo.lee, imammedo, david
  Cc: drjones, dgilbert, Suzuki.Poulose

On ARM, the kvm_type will be resolved by querying the KVMState.
Let's add the MachineState handle to the callback so that we
can retrieve the KVMState handle. In kvm_init, when the callback
is called, the kvm_state variable is not yet set.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
[ppc parts]
---
 accel/kvm/kvm-all.c   | 2 +-
 hw/ppc/mac_newworld.c | 3 +--
 hw/ppc/mac_oldworld.c | 2 +-
 hw/ppc/spapr.c        | 2 +-
 include/hw/boards.h   | 2 +-
 5 files changed, 5 insertions(+), 6 deletions(-)

diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
index de12f78eb8..1505342ec5 100644
--- a/accel/kvm/kvm-all.c
+++ b/accel/kvm/kvm-all.c
@@ -1550,7 +1550,7 @@ static int kvm_init(MachineState *ms)
 
     kvm_type = qemu_opt_get(qemu_get_machine_opts(), "kvm-type");
     if (mc->kvm_type) {
-        type = mc->kvm_type(kvm_type);
+        type = mc->kvm_type(ms, kvm_type);
     } else if (kvm_type) {
         ret = -EINVAL;
         fprintf(stderr, "Invalid argument kvm-type=%s\n", kvm_type);
diff --git a/hw/ppc/mac_newworld.c b/hw/ppc/mac_newworld.c
index a630cb81cd..5b897011db 100644
--- a/hw/ppc/mac_newworld.c
+++ b/hw/ppc/mac_newworld.c
@@ -569,8 +569,7 @@ static char *core99_fw_dev_path(FWPathProvider *p, BusState *bus,
 
     return NULL;
 }
-
-static int core99_kvm_type(const char *arg)
+static int core99_kvm_type(MachineState *ms, const char *arg)
 {
     /* Always force PR KVM */
     return 2;
diff --git a/hw/ppc/mac_oldworld.c b/hw/ppc/mac_oldworld.c
index 9891c325a9..67cbd06b0f 100644
--- a/hw/ppc/mac_oldworld.c
+++ b/hw/ppc/mac_oldworld.c
@@ -422,7 +422,7 @@ static char *heathrow_fw_dev_path(FWPathProvider *p, BusState *bus,
     return NULL;
 }
 
-static int heathrow_kvm_type(const char *arg)
+static int heathrow_kvm_type(MachineState *ms, const char *arg)
 {
     /* Always force PR KVM */
     return 2;
diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 98868d893a..18a9d2cf03 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -2821,7 +2821,7 @@ static void spapr_machine_init(MachineState *machine)
     }
 }
 
-static int spapr_kvm_type(const char *vm_type)
+static int spapr_kvm_type(MachineState *ms, const char *vm_type)
 {
     if (!vm_type) {
         return 0;
diff --git a/include/hw/boards.h b/include/hw/boards.h
index f82f28468b..8bc015fb7c 100644
--- a/include/hw/boards.h
+++ b/include/hw/boards.h
@@ -172,7 +172,7 @@ struct MachineClass {
     void (*init)(MachineState *state);
     void (*reset)(void);
     void (*hot_add_cpu)(const int64_t id, Error **errp);
-    int (*kvm_type)(const char *arg);
+    int (*kvm_type)(MachineState *ms, const char *arg);
 
     BlockInterfaceType block_default_type;
     int units_per_default_bus;
-- 
2.17.1


* [Qemu-devel] [RFC v4 04/16] kvm: add kvm_arm_get_max_vm_phys_shift
  2018-10-18 14:30 [Qemu-devel] [RFC v4 00/16] ARM virt: PCDIMM/NVDIMM at 2TB Eric Auger
                   ` (2 preceding siblings ...)
  2018-10-18 14:30 ` [Qemu-devel] [RFC v4 03/16] hw/boards: Add a MachineState parameter to kvm_type callback Eric Auger
@ 2018-10-18 14:30 ` Eric Auger
  2018-10-18 14:30 ` [Qemu-devel] [RFC v4 05/16] vl: Set machine ram_size, maxram_size and ram_slots earlier Eric Auger
                   ` (11 subsequent siblings)
  15 siblings, 0 replies; 26+ messages in thread
From: Eric Auger @ 2018-10-18 14:30 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, qemu-devel, qemu-arm, peter.maydell,
	shameerali.kolothum.thodi, kwangwoo.lee, imammedo, david
  Cc: drjones, dgilbert, Suzuki.Poulose

Add the kvm_arm_get_max_vm_phys_shift() helper that returns the
log2 of the maximum IPA size supported by KVM. This capability
needs to be known to create the VM with a specific IPA max size
(kvm_type passed along with the KVM_CREATE_VM ioctl).

Signed-off-by: Eric Auger <eric.auger@redhat.com>

---
v3 -> v4:
- s/s/ms in kvm_arm_get_max_vm_phys_shift function comment
- check KVM_CAP_ARM_VM_IPA_SIZE extension

v1 -> v2:
- put this in ARM specific code
---
 target/arm/kvm.c     |  8 ++++++++
 target/arm/kvm_arm.h | 16 ++++++++++++++++
 2 files changed, 24 insertions(+)

diff --git a/target/arm/kvm.c b/target/arm/kvm.c
index 54ef5f711b..485e3291ae 100644
--- a/target/arm/kvm.c
+++ b/target/arm/kvm.c
@@ -18,6 +18,7 @@
 #include "qemu/error-report.h"
 #include "sysemu/sysemu.h"
 #include "sysemu/kvm.h"
+#include "sysemu/kvm_int.h"
 #include "kvm_arm.h"
 #include "cpu.h"
 #include "trace.h"
@@ -154,6 +155,13 @@ void kvm_arm_set_cpu_features_from_host(ARMCPU *cpu)
     env->features = arm_host_cpu_features.features;
 }
 
+int kvm_arm_get_max_vm_phys_shift(MachineState *ms)
+{
+    KVMState *s = KVM_STATE(ms->accelerator);
+
+    return kvm_check_extension(s, KVM_CAP_ARM_VM_IPA_SIZE);
+}
+
 int kvm_arch_init(MachineState *ms, KVMState *s)
 {
     /* For ARM interrupt delivery is always asynchronous,
diff --git a/target/arm/kvm_arm.h b/target/arm/kvm_arm.h
index 5948e8b560..749c38cb35 100644
--- a/target/arm/kvm_arm.h
+++ b/target/arm/kvm_arm.h
@@ -182,6 +182,17 @@ bool kvm_arm_get_host_cpu_features(ARMHostCPUFeatures *ahcf);
  */
 void kvm_arm_set_cpu_features_from_host(ARMCPU *cpu);
 
+/**
+ * kvm_arm_get_max_vm_phys_shift - Returns log2 of the max IPA size
+ * supported by KVM
+ *
+ * @ms: Machine state handle
+ *
+ * Return the max number of IPA bits or a negative value if
+ * the host kernel does not expose this value.
+ */
+int kvm_arm_get_max_vm_phys_shift(MachineState *ms);
+
 /**
  * kvm_arm_sync_mpstate_to_kvm
  * @cpu: ARMCPU
@@ -214,6 +225,11 @@ static inline void kvm_arm_set_cpu_features_from_host(ARMCPU *cpu)
     cpu->host_cpu_probe_failed = true;
 }
 
+static inline int kvm_arm_get_max_vm_phys_shift(MachineState *ms)
+{
+    return -ENOENT;
+}
+
 static inline int kvm_arm_vgic_probe(void)
 {
     return 0;
-- 
2.17.1


* [Qemu-devel] [RFC v4 05/16] vl: Set machine ram_size, maxram_size and ram_slots earlier
  2018-10-18 14:30 [Qemu-devel] [RFC v4 00/16] ARM virt: PCDIMM/NVDIMM at 2TB Eric Auger
                   ` (3 preceding siblings ...)
  2018-10-18 14:30 ` [Qemu-devel] [RFC v4 04/16] kvm: add kvm_arm_get_max_vm_phys_shift Eric Auger
@ 2018-10-18 14:30 ` Eric Auger
  2018-10-18 14:30 ` [Qemu-devel] [RFC v4 06/16] hw/arm/virt: Add virt-3.2 machine type Eric Auger
                   ` (10 subsequent siblings)
  15 siblings, 0 replies; 26+ messages in thread
From: Eric Auger @ 2018-10-18 14:30 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, qemu-devel, qemu-arm, peter.maydell,
	shameerali.kolothum.thodi, kwangwoo.lee, imammedo, david
  Cc: drjones, dgilbert, Suzuki.Poulose

The machine RAM attributes will need to be analyzed during the
configure_accelerator() process. In particular, the arm64 machine
kvm_type() callback will use them to know how many IPA/GPA bits
are needed to model the whole RAM range. So let's assign those
machine state fields before calling configure_accelerator().

Signed-off-by: Eric Auger <eric.auger@redhat.com>

---

v4: new
---
 vl.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/vl.c b/vl.c
index 4e25c78bff..35b426debd 100644
--- a/vl.c
+++ b/vl.c
@@ -4266,6 +4266,9 @@ int main(int argc, char **argv, char **envp)
         object_unref(OBJECT(current_machine));
         exit(1);
     }
+    current_machine->ram_size = ram_size;
+    current_machine->maxram_size = maxram_size;
+    current_machine->ram_slots = ram_slots;
 
     configure_accelerator(current_machine);
 
@@ -4470,9 +4473,6 @@ int main(int argc, char **argv, char **envp)
     replay_checkpoint(CHECKPOINT_INIT);
     qdev_machine_init();
 
-    current_machine->ram_size = ram_size;
-    current_machine->maxram_size = maxram_size;
-    current_machine->ram_slots = ram_slots;
     current_machine->boot_order = boot_order;
 
     /* parse features once if machine provides default cpu_type */
-- 
2.17.1


* [Qemu-devel] [RFC v4 06/16] hw/arm/virt: Add virt-3.2 machine type
  2018-10-18 14:30 [Qemu-devel] [RFC v4 00/16] ARM virt: PCDIMM/NVDIMM at 2TB Eric Auger
                   ` (4 preceding siblings ...)
  2018-10-18 14:30 ` [Qemu-devel] [RFC v4 05/16] vl: Set machine ram_size, maxram_size and ram_slots earlier Eric Auger
@ 2018-10-18 14:30 ` Eric Auger
  2018-10-18 14:30 ` [Qemu-devel] [RFC v4 07/16] hw/arm/virt: Implement kvm_type function for 3.2 machine Eric Auger
                   ` (9 subsequent siblings)
  15 siblings, 0 replies; 26+ messages in thread
From: Eric Auger @ 2018-10-18 14:30 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, qemu-devel, qemu-arm, peter.maydell,
	shameerali.kolothum.thodi, kwangwoo.lee, imammedo, david
  Cc: drjones, dgilbert, Suzuki.Poulose

Add virt-3.2 machine type.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
---
 hw/arm/virt.c | 17 ++++++++++++++---
 1 file changed, 14 insertions(+), 3 deletions(-)

diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 9f677825f9..f920ef247b 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -1796,7 +1796,7 @@ static void machvirt_machine_init(void)
 }
 type_init(machvirt_machine_init);
 
-static void virt_3_1_instance_init(Object *obj)
+static void virt_3_2_instance_init(Object *obj)
 {
     VirtMachineState *vms = VIRT_MACHINE(obj);
     VirtMachineClass *vmc = VIRT_MACHINE_GET_CLASS(vms);
@@ -1866,10 +1866,21 @@ static void virt_3_1_instance_init(Object *obj)
     vms->irqmap = a15irqmap;
 }
 
-static void virt_machine_3_1_options(MachineClass *mc)
+static void virt_machine_3_2_options(MachineClass *mc)
 {
 }
-DEFINE_VIRT_MACHINE_AS_LATEST(3, 1)
+DEFINE_VIRT_MACHINE_AS_LATEST(3, 2)
+
+static void virt_3_1_instance_init(Object *obj)
+{
+    virt_3_2_instance_init(obj);
+}
+
+static void virt_machine_3_1_options(MachineClass *mc)
+{
+    virt_machine_3_2_options(mc);
+}
+DEFINE_VIRT_MACHINE(3, 1)
 
 static void virt_3_0_instance_init(Object *obj)
 {
-- 
2.17.1


* [Qemu-devel] [RFC v4 07/16] hw/arm/virt: Implement kvm_type function for 3.2 machine
  2018-10-18 14:30 [Qemu-devel] [RFC v4 00/16] ARM virt: PCDIMM/NVDIMM at 2TB Eric Auger
                   ` (5 preceding siblings ...)
  2018-10-18 14:30 ` [Qemu-devel] [RFC v4 06/16] hw/arm/virt: Add virt-3.2 machine type Eric Auger
@ 2018-10-18 14:30 ` Eric Auger
  2018-10-19  2:58   ` Richard Henderson
  2018-10-18 14:30 ` [Qemu-devel] [RFC v4 08/16] hw/arm/virt: Allocate device_memory Eric Auger
                   ` (8 subsequent siblings)
  15 siblings, 1 reply; 26+ messages in thread
From: Eric Auger @ 2018-10-18 14:30 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, qemu-devel, qemu-arm, peter.maydell,
	shameerali.kolothum.thodi, kwangwoo.lee, imammedo, david
  Cc: drjones, dgilbert, Suzuki.Poulose

This patch computes the requested IPA bits according to
the requested maxram value.

The machine class kvm_type() callback is implemented and
fills the kvm_type[7-0] bits with the computed max IPA shift
(0 default value corresponds to 40b IPA). The kvm_type is
passed to the KVM_CREATE_VM ioctl.

The max IPA address shift is computed assuming the top of the
address space is occupied by device memory starting at 2TB and
of size maxram_size - ram_size.

Signed-off-by: Eric Auger <eric.auger@redhat.com>

---

The approach of having an IPA range that depends on the machine
memory attributes is preferred here over having an IPA range based
on the max capability of the host (which would be simpler). The
latter would sometimes lead to additional, useless translation
levels at stage 2, which may degrade guest performance.
---
 hw/arm/virt.c         | 48 ++++++++++++++++++++++++++++++++++++++++++-
 include/hw/arm/virt.h |  1 +
 2 files changed, 48 insertions(+), 1 deletion(-)

diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index f920ef247b..21718c250e 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -108,8 +108,12 @@
  * of a terabyte of RAM will be doing it on a host with more than a
  * terabyte of physical address space.)
  */
+#define SZ_1G (1024ULL * 1024 * 1024)
 #define RAMLIMIT_GB 255
-#define RAMLIMIT_BYTES (RAMLIMIT_GB * 1024ULL * 1024 * 1024)
+#define RAMLIMIT_BYTES (RAMLIMIT_GB * SZ_1G)
+
+/* device memory starts at 2TB */
+#define DEVICE_MEM_BASE (2048 * SZ_1G)
 
 /* Addresses and sizes of our components.
  * 0..128MB is space for a flash device so we can run bootrom code such as UEFI.
@@ -1748,6 +1752,38 @@ static HotplugHandler *virt_machine_get_hotplug_handler(MachineState *machine,
     return NULL;
 }
 
+/*
+ * for arm64 kvm_type [7-0] encodes the IPA size shift
+ */
+static inline int virt_kvm_type(MachineState *ms, const char *type_str)
+{
+    int max_vm_phys_shift = kvm_arm_get_max_vm_phys_shift(ms);
+    ram_addr_t device_mem_size = ms->maxram_size - ms->ram_size;
+    uint8_t requested_vm_phys_shift;
+
+    if (!device_mem_size) {
+        return 0; /* default 40b IPA */
+    }
+
+    /* we need at least 42b IPA to fit device memory at 2TB*/
+    if (max_vm_phys_shift < 42) {
+        error_report("This host does not support 42b IPA: "
+                     "maxram/slots options not usable");
+        exit(1);
+    }
+
+    requested_vm_phys_shift = 64 - clz64(DEVICE_MEM_BASE + device_mem_size);
+
+    if (requested_vm_phys_shift > max_vm_phys_shift) {
+        error_report("maxmem option value too large. Max supported value "
+                     "for this host is 0x%"PRIx64,
+                     (ram_addr_t)((1ULL << max_vm_phys_shift) - DEVICE_MEM_BASE));
+       exit(1);
+    }
+
+    return requested_vm_phys_shift;
+}
+
 static void virt_machine_class_init(ObjectClass *oc, void *data)
 {
     MachineClass *mc = MACHINE_CLASS(oc);
@@ -1772,6 +1808,7 @@ static void virt_machine_class_init(ObjectClass *oc, void *data)
     mc->cpu_index_to_instance_props = virt_cpu_index_to_props;
     mc->default_cpu_type = ARM_CPU_TYPE_NAME("cortex-a15");
     mc->get_default_cpu_node_id = virt_get_default_cpu_node_id;
+    mc->kvm_type = virt_kvm_type;
     assert(!mc->get_hotplug_handler);
     mc->get_hotplug_handler = virt_machine_get_hotplug_handler;
     hc->plug = virt_machine_device_plug_cb;
@@ -1878,7 +1915,16 @@ static void virt_3_1_instance_init(Object *obj)
 
 static void virt_machine_3_1_options(MachineClass *mc)
 {
+    VirtMachineClass *vmc = VIRT_MACHINE_CLASS(OBJECT_CLASS(mc));
+
     virt_machine_3_2_options(mc);
+
+    /*
+     * Device memory and capability to set the max IPA address shift
+     * are enabled from 3.2 onwards
+     */
+    vmc->no_device_memory = true;
+    mc->kvm_type = NULL;
 }
 DEFINE_VIRT_MACHINE(3, 1)
 
diff --git a/include/hw/arm/virt.h b/include/hw/arm/virt.h
index 4cc57a7ef6..f57e4c1890 100644
--- a/include/hw/arm/virt.h
+++ b/include/hw/arm/virt.h
@@ -101,6 +101,7 @@ typedef struct {
     bool claim_edge_triggered_timers;
     bool smbios_old_sys_ver;
     bool no_highmem_ecam;
+    bool no_device_memory;
 } VirtMachineClass;
 
 typedef struct {
-- 
2.17.1


* [Qemu-devel] [RFC v4 08/16] hw/arm/virt: Allocate device_memory
  2018-10-18 14:30 [Qemu-devel] [RFC v4 00/16] ARM virt: PCDIMM/NVDIMM at 2TB Eric Auger
                   ` (6 preceding siblings ...)
  2018-10-18 14:30 ` [Qemu-devel] [RFC v4 07/16] hw/arm/virt: Implement kvm_type function for 3.2 machine Eric Auger
@ 2018-10-18 14:30 ` Eric Auger
  2018-10-18 14:30 ` [Qemu-devel] [RFC v4 09/16] hw/arm/virt: Add memory hotplug framework Eric Auger
                   ` (7 subsequent siblings)
  15 siblings, 0 replies; 26+ messages in thread
From: Eric Auger @ 2018-10-18 14:30 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, qemu-devel, qemu-arm, peter.maydell,
	shameerali.kolothum.thodi, kwangwoo.lee, imammedo, david
  Cc: drjones, dgilbert, Suzuki.Poulose

We define a device memory region starting at 2TB with a maximum
size of 4TB. This requires support of more than 40b IPA on the host
(CPU, kernel config and FW). IPA needs are adjusted according to
the maxram_size - ram_size value.

This is largely inspired by the device memory initialization in
the pc machine code.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Kwangwoo Lee <kwangwoo.lee@sk.com>

---

v3 -> v4:
- remove bootinfo.device_memory_start/device_memory_size
- rename VIRT_HOTPLUG_MEM into VIRT_DEVICE_MEM
---
 hw/arm/virt.c         | 98 ++++++++++++++++++++++++++++++-------------
 include/hw/arm/virt.h |  1 +
 2 files changed, 71 insertions(+), 28 deletions(-)

diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 21718c250e..9b06797090 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -59,6 +59,7 @@
 #include "qapi/visitor.h"
 #include "standard-headers/linux/input.h"
 #include "hw/arm/smmuv3.h"
+#include "hw/acpi/acpi.h"
 
 #define DEFINE_VIRT_MACHINE_LATEST(major, minor, latest) \
     static void virt_##major##_##minor##_class_init(ObjectClass *oc, \
@@ -94,38 +95,29 @@
 
 #define PLATFORM_BUS_NUM_IRQS 64
 
-/* RAM limit in GB. Since VIRT_MEM starts at the 1GB mark, this means
- * RAM can go up to the 256GB mark, leaving 256GB of the physical
- * address space unallocated and free for future use between 256G and 512G.
- * If we need to provide more RAM to VMs in the future then we need to:
- *  * allocate a second bank of RAM starting at 2TB and working up
- *  * fix the DT and ACPI table generation code in QEMU to correctly
- *    report two split lumps of RAM to the guest
- *  * fix KVM in the host kernel to allow guests with >40 bit address spaces
- * (We don't want to fill all the way up to 512GB with RAM because
- * we might want it for non-RAM purposes later. Conversely it seems
- * reasonable to assume that anybody configuring a VM with a quarter
- * of a terabyte of RAM will be doing it on a host with more than a
- * terabyte of physical address space.)
- */
 #define SZ_1G (1024ULL * 1024 * 1024)
-#define RAMLIMIT_GB 255
-#define RAMLIMIT_BYTES (RAMLIMIT_GB * SZ_1G)
+#define SZ_64K 0x10000
 
 /* device memory starts at 2TB */
 #define DEVICE_MEM_BASE (2048 * SZ_1G)
+#define DEVICE_MEM_SIZE (4096 * SZ_1G)
 
 /* Addresses and sizes of our components.
- * 0..128MB is space for a flash device so we can run bootrom code such as UEFI.
- * 128MB..256MB is used for miscellaneous device I/O.
- * 256MB..1GB is reserved for possible future PCI support (ie where the
- * PCI memory window will go if we add a PCI host controller).
- * 1GB and up is RAM (which may happily spill over into the
- * high memory region beyond 4GB).
- * This represents a compromise between how much RAM can be given to
- * a 32 bit VM and leaving space for expansion and in particular for PCI.
- * Note that devices should generally be placed at multiples of 0x10000,
+ * 0..128MB is space for a flash device so we can run bootrom code such as UEFI,
+ * 128MB..256MB is used for miscellaneous device I/O,
+ * 256MB..1GB is used for the PCI host controller,
+ * 1GB..256GB is RAM (not hotpluggable),
+ * 256GB..512GB is left for device I/O (non-RAM purposes),
+ * 512GB..1TB is the high mem PCI MMIO region,
+ * 2TB..6TB is used for device memory (assumes dynamic IPA size in the kernel).
+ *
+ * Note that IO devices should generally be placed at multiples of 0x10000,
  * to accommodate guests using 64K pages.
+ *
+ * It seems reasonable to assume that anybody configuring a VM
+ * with a quarter of a terabyte of RAM will be doing it on a host with more
+ * than a terabyte of physical address space.
+ *
  */
 static const MemMapEntry a15memmap[] = {
     /* Space up to 0x8000000 is reserved for a boot ROM */
@@ -154,12 +146,14 @@ static const MemMapEntry a15memmap[] = {
     [VIRT_PCIE_MMIO] =          { 0x10000000, 0x2eff0000 },
     [VIRT_PCIE_PIO] =           { 0x3eff0000, 0x00010000 },
     [VIRT_PCIE_ECAM] =          { 0x3f000000, 0x01000000 },
-    [VIRT_MEM] =                { 0x40000000, RAMLIMIT_BYTES },
+    [VIRT_MEM] =                { SZ_1G , 255 * SZ_1G },
     /* Additional 64 MB redist region (can contain up to 512 redistributors) */
     [VIRT_GIC_REDIST2] =        { 0x4000000000ULL, 0x4000000 },
     [VIRT_PCIE_ECAM_HIGH] =     { 0x4010000000ULL, 0x10000000 },
     /* Second PCIe window, 512GB wide at the 512GB boundary */
-    [VIRT_PCIE_MMIO_HIGH] =   { 0x8000000000ULL, 0x8000000000ULL },
+    [VIRT_PCIE_MMIO_HIGH] =     { 512 * SZ_1G, 512 * SZ_1G },
+    /* device memory beyond 2TB */
+    [VIRT_DEVICE_MEM] =         { DEVICE_MEM_BASE, DEVICE_MEM_SIZE },
 };
 
 static const int a15irqmap[] = {
@@ -1265,6 +1259,51 @@ static void create_secure_ram(VirtMachineState *vms,
     g_free(nodename);
 }
 
+static void create_device_memory(VirtMachineState *vms, MemoryRegion *sysmem)
+{
+    VirtMachineClass *vmc = VIRT_MACHINE_GET_CLASS(vms);
+    MachineClass *mc = MACHINE_GET_CLASS(vms);
+    MachineState *ms = MACHINE(vms);
+    uint64_t device_memory_size = ms->maxram_size - ms->ram_size;
+    uint64_t align = SZ_64K;
+
+    if (!device_memory_size) {
+        return;
+    }
+
+    if (vmc->no_device_memory) {
+        error_report("Machine %s does not support device memory: "
+                     "maxmem option not supported", mc->name);
+        exit(EXIT_FAILURE);
+    }
+
+    if (ms->ram_slots > ACPI_MAX_RAM_SLOTS) {
+        error_report("unsupported number of memory slots: %"PRIu64,
+                     ms->ram_slots);
+        exit(EXIT_FAILURE);
+    }
+
+    if (QEMU_ALIGN_UP(ms->maxram_size, align) != ms->maxram_size) {
+        error_report("maximum memory size must be aligned to multiple of 0x%"
+                     PRIx64, align);
+        exit(EXIT_FAILURE);
+    }
+
+    if (device_memory_size > vms->memmap[VIRT_DEVICE_MEM].size) {
+        error_report("unsupported amount of maximum memory: " RAM_ADDR_FMT,
+                         ms->maxram_size);
+        exit(EXIT_FAILURE);
+    }
+
+    ms->device_memory = g_malloc0(sizeof(*ms->device_memory));
+    ms->device_memory->base = vms->memmap[VIRT_DEVICE_MEM].base;
+
+    memory_region_init(&ms->device_memory->mr, OBJECT(vms),
+                       "device-memory", device_memory_size);
+    memory_region_add_subregion(sysmem, ms->device_memory->base,
+                                &ms->device_memory->mr);
+}
+
 static void *machvirt_dtb(const struct arm_boot_info *binfo, int *fdt_size)
 {
     const VirtMachineState *board = container_of(binfo, VirtMachineState,
@@ -1438,7 +1477,8 @@ static void machvirt_init(MachineState *machine)
     vms->smp_cpus = smp_cpus;
 
     if (machine->ram_size > vms->memmap[VIRT_MEM].size) {
-        error_report("mach-virt: cannot model more than %dGB RAM", RAMLIMIT_GB);
+        error_report("mach-virt: cannot model more than %dGB RAM",
+                     (int)(vms->memmap[VIRT_MEM].size / SZ_1G));
         exit(1);
     }
 
@@ -1533,6 +1573,8 @@ static void machvirt_init(MachineState *machine)
                                          machine->ram_size);
     memory_region_add_subregion(sysmem, vms->memmap[VIRT_MEM].base, ram);
 
+    create_device_memory(vms, sysmem);
+
     create_flash(vms, sysmem, secure_sysmem ? secure_sysmem : sysmem);
 
     create_gic(vms, pic);
diff --git a/include/hw/arm/virt.h b/include/hw/arm/virt.h
index f57e4c1890..032d88f4c4 100644
--- a/include/hw/arm/virt.h
+++ b/include/hw/arm/virt.h
@@ -80,6 +80,7 @@ enum {
     VIRT_GPIO,
     VIRT_SECURE_UART,
     VIRT_SECURE_MEM,
+    VIRT_DEVICE_MEM,
 };
 
 typedef enum VirtIOMMUType {
-- 
2.17.1


* [Qemu-devel] [RFC v4 09/16] hw/arm/virt: Add memory hotplug framework
  2018-10-18 14:30 [Qemu-devel] [RFC v4 00/16] ARM virt: PCDIMM/NVDIMM at 2TB Eric Auger
                   ` (7 preceding siblings ...)
  2018-10-18 14:30 ` [Qemu-devel] [RFC v4 08/16] hw/arm/virt: Allocate device_memory Eric Auger
@ 2018-10-18 14:30 ` Eric Auger
  2018-10-18 14:30 ` [Qemu-devel] [RFC v4 10/16] hw/arm/boot: Expose the PC-DIMM nodes in the DT Eric Auger
                   ` (6 subsequent siblings)
  15 siblings, 0 replies; 26+ messages in thread
From: Eric Auger @ 2018-10-18 14:30 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, qemu-devel, qemu-arm, peter.maydell,
	shameerali.kolothum.thodi, kwangwoo.lee, imammedo, david
  Cc: drjones, dgilbert, Suzuki.Poulose

From: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>

This patch adds the memory hot-plug/hot-unplug infrastructure
in machvirt.
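
As a rough illustration of the pre-plug policy at this stage of the
series (cold-plugged PC-DIMMs are accepted, hotplugged devices and
NVDIMMs are rejected), here is a standalone model. The MemDev type and
pre_plug_check() are hypothetical names used only for this sketch; they
are not QEMU APIs.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the properties the pre-plug handler inspects. */
typedef struct MemDev {
    bool hotplugged;   /* device was added after machine init */
    bool is_nvdimm;    /* device is an NVDIMM rather than a plain PC-DIMM */
} MemDev;

/*
 * Return NULL when the device may be plugged, otherwise the reason it is
 * rejected. Each check returns at once, so only one error is ever reported.
 */
static const char *pre_plug_check(const MemDev *dev)
{
    if (dev->hotplugged) {
        return "memory hotplug is not supported";
    }
    if (dev->is_nvdimm) {
        return "nvdimm is not yet supported";
    }
    return NULL;
}
```

Only a device that passes both checks would go on to the generic
PC-DIMM pre-plug step.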

Signed-off-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Signed-off-by: Kwangwoo Lee <kwangwoo.lee@sk.com>

---
v3 -> v4:
- check the memory device is not hotplugged

v2 -> v3:
- change in pc_dimm_plug()'s signature
- add pc_dimm_pre_plug call

v1 -> v2:
- s/virt_dimm_plug|unplug/virt_memory_plug|unplug
- s/pc_dimm_memory_plug/pc_dimm_plug
- reworded title and commit message
- added pre_plug cb
- don't handle get_memory_region failure anymore
---
 default-configs/arm-softmmu.mak |  2 ++
 hw/arm/virt.c                   | 64 ++++++++++++++++++++++++++++++++-
 2 files changed, 65 insertions(+), 1 deletion(-)

diff --git a/default-configs/arm-softmmu.mak b/default-configs/arm-softmmu.mak
index 2420491aac..1c110cfe7f 100644
--- a/default-configs/arm-softmmu.mak
+++ b/default-configs/arm-softmmu.mak
@@ -159,3 +159,5 @@ CONFIG_PCI_DESIGNWARE=y
 CONFIG_STRONGARM=y
 CONFIG_HIGHBANK=y
 CONFIG_MUSICPAL=y
+CONFIG_MEM_HOTPLUG=y
+
diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 9b06797090..1548e9480a 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -60,6 +60,8 @@
 #include "standard-headers/linux/input.h"
 #include "hw/arm/smmuv3.h"
 #include "hw/acpi/acpi.h"
+#include "hw/mem/pc-dimm.h"
+#include "hw/mem/nvdimm.h"
 
 #define DEFINE_VIRT_MACHINE_LATEST(major, minor, latest) \
     static void virt_##major##_##minor##_class_init(ObjectClass *oc, \
@@ -1771,6 +1773,49 @@ static const CPUArchIdList *virt_possible_cpu_arch_ids(MachineState *ms)
     return ms->possible_cpus;
 }
 
+static void virt_memory_pre_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
+                                 Error **errp)
+{
+    const bool is_nvdimm = object_dynamic_cast(OBJECT(dev), TYPE_NVDIMM);
+
+    if (dev->hotplugged) {
+        error_setg(errp, "memory hotplug is not supported");
+        return;
+    }
+    if (is_nvdimm) {
+        error_setg(errp, "nvdimm is not yet supported");
+        return;
+    }
+
+    pc_dimm_pre_plug(dev, MACHINE(hotplug_dev), NULL, errp);
+}
+
+static void virt_memory_plug(HotplugHandler *hotplug_dev,
+                             DeviceState *dev, Error **errp)
+{
+    VirtMachineState *vms = VIRT_MACHINE(hotplug_dev);
+    Error *local_err = NULL;
+
+    pc_dimm_plug(dev, MACHINE(vms), &local_err);
+
+    error_propagate(errp, local_err);
+}
+
+static void virt_memory_unplug(HotplugHandler *hotplug_dev,
+                               DeviceState *dev, Error **errp)
+{
+    pc_dimm_unplug(dev, MACHINE(hotplug_dev));
+    object_unparent(OBJECT(dev));
+}
+
+static void virt_machine_device_pre_plug_cb(HotplugHandler *hotplug_dev,
+                                            DeviceState *dev, Error **errp)
+{
+    if (object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM)) {
+        virt_memory_pre_plug(hotplug_dev, dev, errp);
+    }
+}
+
 static void virt_machine_device_plug_cb(HotplugHandler *hotplug_dev,
                                         DeviceState *dev, Error **errp)
 {
@@ -1782,12 +1827,27 @@ static void virt_machine_device_plug_cb(HotplugHandler *hotplug_dev,
                                      SYS_BUS_DEVICE(dev));
         }
     }
+    if (object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM)) {
+        virt_memory_plug(hotplug_dev, dev, errp);
+    }
+}
+
+static void virt_machine_device_unplug_cb(HotplugHandler *hotplug_dev,
+                                          DeviceState *dev, Error **errp)
+{
+    if (object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM)) {
+        virt_memory_unplug(hotplug_dev, dev, errp);
+    } else {
+        error_setg(errp, "device unplug request for unsupported device"
+                   " type: %s", object_get_typename(OBJECT(dev)));
+    }
 }
 
 static HotplugHandler *virt_machine_get_hotplug_handler(MachineState *machine,
                                                         DeviceState *dev)
 {
-    if (object_dynamic_cast(OBJECT(dev), TYPE_SYS_BUS_DEVICE)) {
+    if (object_dynamic_cast(OBJECT(dev), TYPE_SYS_BUS_DEVICE) ||
+       (object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM))) {
         return HOTPLUG_HANDLER(machine);
     }
 
@@ -1853,7 +1913,9 @@ static void virt_machine_class_init(ObjectClass *oc, void *data)
     mc->kvm_type = virt_kvm_type;
     assert(!mc->get_hotplug_handler);
     mc->get_hotplug_handler = virt_machine_get_hotplug_handler;
+    hc->pre_plug = virt_machine_device_pre_plug_cb;
     hc->plug = virt_machine_device_plug_cb;
+    hc->unplug = virt_machine_device_unplug_cb;
 }
 
 static const TypeInfo virt_machine_info = {
-- 
2.17.1


* [Qemu-devel] [RFC v4 10/16] hw/arm/boot: Expose the PC-DIMM nodes in the DT
  2018-10-18 14:30 [Qemu-devel] [RFC v4 00/16] ARM virt: PCDIMM/NVDIMM at 2TB Eric Auger
                   ` (8 preceding siblings ...)
  2018-10-18 14:30 ` [Qemu-devel] [RFC v4 09/16] hw/arm/virt: Add memory hotplug framework Eric Auger
@ 2018-10-18 14:30 ` Eric Auger
  2018-10-18 14:30 ` [Qemu-devel] [RFC v4 11/16] acpi: move build_srat_hotpluggable_memory to generic ACPI source Eric Auger
                   ` (5 subsequent siblings)
  15 siblings, 0 replies; 26+ messages in thread
From: Eric Auger @ 2018-10-18 14:30 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, qemu-devel, qemu-arm, peter.maydell,
	shameerali.kolothum.thodi, kwangwoo.lee, imammedo, david
  Cc: drjones, dgilbert, Suzuki.Poulose

From: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>

This patch adds memory nodes corresponding to PC-DIMM regions.

NV_DIMM and ACPI_NVDIMM configs are not yet set for ARM so we
don't need to care about NV-DIMM at this stage.
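
For readers less familiar with DT memory nodes, the sketch below shows
how one DIMM's address and size could be packed into "reg" cells for a
given #address-cells/#size-cells pair, and how a node name could be
derived from the base address. The helper names (put_cells,
dimm_reg_cells, dimm_node_name) and the "/memory@<addr>" naming are
assumptions made for illustration; the series itself relies on
fdt_add_memory_node(). Cell values are shown in host order, whereas the
flattened tree stores each cell big-endian.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Split a 64-bit value into ncells 32-bit cells, most significant first.
 * Returns the number of cells written, or -1 if the value does not fit
 * or the cell count is unsupported.
 */
static int put_cells(uint32_t *out, uint32_t ncells, uint64_t val)
{
    if (ncells == 1) {
        if (val > UINT32_MAX) {
            return -1;
        }
        out[0] = (uint32_t)val;
        return 1;
    }
    if (ncells == 2) {
        out[0] = (uint32_t)(val >> 32);
        out[1] = (uint32_t)val;
        return 2;
    }
    return -1;
}

/* Build the "reg" cells <addr size> for one DIMM; returns cell count or -1. */
static int dimm_reg_cells(uint32_t *reg, uint32_t acells, uint32_t scells,
                          uint64_t addr, uint64_t size)
{
    int n = put_cells(reg, acells, addr);
    if (n < 0) {
        return -1;
    }
    int m = put_cells(reg + n, scells, size);
    if (m < 0) {
        return -1;
    }
    return n + m;
}

/* Format a node name from the DIMM base address (assumed naming scheme). */
static int dimm_node_name(char *buf, size_t len, uint64_t addr)
{
    return snprintf(buf, len, "/memory@%" PRIx64, addr);
}
```

With 2 address cells and 2 size cells, a 1GB DIMM at the 2TB mark
yields the four cells <0x200 0x0 0x0 0x40000000>.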

Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Signed-off-by: Eric Auger <eric.auger@redhat.com>

---
v3 -> v4:
- get rid of @base and @len in fdt_add_hotpluggable_memory_nodes

v1 -> v2:
- added qapi_free_MemoryDeviceInfoList and simplify the loop
---
 hw/arm/boot.c | 35 +++++++++++++++++++++++++++++++++++
 1 file changed, 35 insertions(+)

diff --git a/hw/arm/boot.c b/hw/arm/boot.c
index ba2004da5c..81d621ce14 100644
--- a/hw/arm/boot.c
+++ b/hw/arm/boot.c
@@ -19,6 +19,7 @@
 #include "sysemu/numa.h"
 #include "hw/boards.h"
 #include "hw/loader.h"
+#include "hw/mem/memory-device.h"
 #include "elf.h"
 #include "sysemu/device_tree.h"
 #include "qemu/config-file.h"
@@ -516,6 +517,34 @@ static void fdt_add_psci_node(void *fdt)
     qemu_fdt_setprop_cell(fdt, "/psci", "migrate", migrate_fn);
 }
 
+static int fdt_add_hotpluggable_memory_nodes(void *fdt,
+                                             uint32_t acells, uint32_t scells) {
+    MemoryDeviceInfoList *info, *info_list = qmp_memory_device_list();
+    MemoryDeviceInfo *mi;
+    PCDIMMDeviceInfo *di;
+    bool is_nvdimm;
+    int ret = 0;
+
+    for (info = info_list; info != NULL; info = info->next) {
+        mi = info->value;
+        is_nvdimm = (mi->type == MEMORY_DEVICE_INFO_KIND_NVDIMM);
+        di = !is_nvdimm ? mi->u.dimm.data : mi->u.nvdimm.data;
+
+        if (is_nvdimm) {
+            ret = -ENOENT; /* NV-DIMM not yet supported */
+        } else {
+            ret = fdt_add_memory_node(fdt, acells, di->addr,
+                                      scells, di->size, di->node);
+        }
+        if (ret < 0) {
+            goto out;
+        }
+    }
+out:
+    qapi_free_MemoryDeviceInfoList(info_list);
+    return ret;
+}
+
 int arm_load_dtb(hwaddr addr, const struct arm_boot_info *binfo,
                  hwaddr addr_limit, AddressSpace *as)
 {
@@ -611,6 +640,12 @@ int arm_load_dtb(hwaddr addr, const struct arm_boot_info *binfo,
         }
     }
 
+    rc = fdt_add_hotpluggable_memory_nodes(fdt, acells, scells);
+    if (rc < 0) {
+        fprintf(stderr, "couldn't add hotpluggable memory nodes\n");
+        goto fail;
+    }
+
     rc = fdt_path_offset(fdt, "/chosen");
     if (rc < 0) {
         qemu_fdt_add_subnode(fdt, "/chosen");
-- 
2.17.1


* [Qemu-devel] [RFC v4 11/16] acpi: move build_srat_hotpluggable_memory to generic ACPI source
  2018-10-18 14:30 [Qemu-devel] [RFC v4 00/16] ARM virt: PCDIMM/NVDIMM at 2TB Eric Auger
                   ` (9 preceding siblings ...)
  2018-10-18 14:30 ` [Qemu-devel] [RFC v4 10/16] hw/arm/boot: Expose the PC-DIMM nodes in the DT Eric Auger
@ 2018-10-18 14:30 ` Eric Auger
  2019-01-21 11:44   ` Shameerali Kolothum Thodi
  2018-10-18 14:30 ` [Qemu-devel] [RFC v4 12/16] hw/arm/virt-acpi-build: Add PC-DIMM in SRAT Eric Auger
                   ` (4 subsequent siblings)
  15 siblings, 1 reply; 26+ messages in thread
From: Eric Auger @ 2018-10-18 14:30 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, qemu-devel, qemu-arm, peter.maydell,
	shameerali.kolothum.thodi, kwangwoo.lee, imammedo, david
  Cc: drjones, dgilbert, Suzuki.Poulose

We plan to reuse build_srat_hotpluggable_memory() for ARM so
let's move the function to aml-build.
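
For reference, the idea behind the function is to cover the whole
device memory window with SRAT memory affinity entries: one entry per
plugged DIMM, plus a hotpluggable filler entry for every gap. The
sketch below is a simplified standalone model of that idea, not the
function's exact loop. Dimm, SratRange and split_device_memory() are
hypothetical names, and the DIMM list is assumed to be sorted by
address, non-overlapping and contained in the window.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct Dimm {
    uint64_t addr;
    uint64_t size;
    bool hotpluggable;
} Dimm;

typedef struct SratRange {
    uint64_t base;
    uint64_t len;
    bool hotpluggable;
    bool populated;    /* true for a DIMM, false for a filler gap */
} SratRange;

/*
 * Cover [base, base + len) with one range per DIMM and one hotpluggable
 * filler range per gap. Writes at most 2 * n + 1 entries to out and
 * returns how many were written.
 */
static size_t split_device_memory(uint64_t base, uint64_t len,
                                  const Dimm *dimms, size_t n,
                                  SratRange *out)
{
    uint64_t cur = base;
    uint64_t end = base + len;
    size_t count = 0;

    for (size_t i = 0; i < n; i++) {
        /* Sorted, non-overlapping and inside the window (assumption). */
        assert(dimms[i].addr >= cur);
        assert(dimms[i].addr + dimms[i].size <= end);

        if (cur < dimms[i].addr) {
            out[count++] = (SratRange){ cur, dimms[i].addr - cur,
                                        true, false };
        }
        out[count++] = (SratRange){ dimms[i].addr, dimms[i].size,
                                    dimms[i].hotpluggable, true };
        cur = dimms[i].addr + dimms[i].size;
    }
    if (cur < end) {
        out[count++] = (SratRange){ cur, end - cur, true, false };
    }
    return count;
}
```

For a 4GB window at 2TB holding a single 1GB DIMM at 2TB + 1GB, this
produces three ranges: a 1GB gap, the DIMM, and a 2GB gap.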

Signed-off-by: Eric Auger <eric.auger@redhat.com>
---
 hw/acpi/aml-build.c         | 51 +++++++++++++++++++++++++++++++++++++
 include/hw/acpi/aml-build.h |  3 +++
 2 files changed, 54 insertions(+)

diff --git a/hw/acpi/aml-build.c b/hw/acpi/aml-build.c
index 1e43cd736d..167fb6bf3e 100644
--- a/hw/acpi/aml-build.c
+++ b/hw/acpi/aml-build.c
@@ -22,6 +22,7 @@
 #include "qemu/osdep.h"
 #include <glib/gprintf.h>
 #include "hw/acpi/aml-build.h"
+#include "hw/mem/memory-device.h"
 #include "qemu/bswap.h"
 #include "qemu/bitops.h"
 #include "sysemu/numa.h"
@@ -1802,3 +1803,53 @@ build_hdr:
     build_header(linker, tbl, (void *)(tbl->data + fadt_start),
                  "FACP", tbl->len - fadt_start, f->rev, oem_id, oem_table_id);
 }
+
+void build_srat_hotpluggable_memory(GArray *table_data, uint64_t base,
+                                    uint64_t len, int default_node)
+{
+    MemoryDeviceInfoList *info_list = qmp_memory_device_list();
+    MemoryDeviceInfoList *info;
+    MemoryDeviceInfo *mi;
+    PCDIMMDeviceInfo *di;
+    uint64_t end = base + len, cur, size;
+    bool is_nvdimm;
+    AcpiSratMemoryAffinity *numamem;
+    MemoryAffinityFlags flags;
+
+    for (cur = base, info = info_list;
+         cur < end;
+         cur += size, info = info->next) {
+        numamem = acpi_data_push(table_data, sizeof *numamem);
+
+        if (!info) {
+            build_srat_memory(numamem, cur, end - cur, default_node,
+                              MEM_AFFINITY_HOTPLUGGABLE | MEM_AFFINITY_ENABLED);
+            break;
+        }
+
+        mi = info->value;
+        is_nvdimm = (mi->type == MEMORY_DEVICE_INFO_KIND_NVDIMM);
+        di = !is_nvdimm ? mi->u.dimm.data : mi->u.nvdimm.data;
+
+        if (cur < di->addr) {
+            build_srat_memory(numamem, cur, di->addr - cur, default_node,
+                              MEM_AFFINITY_HOTPLUGGABLE | MEM_AFFINITY_ENABLED);
+            numamem = acpi_data_push(table_data, sizeof *numamem);
+        }
+
+        size = di->size;
+
+        flags = MEM_AFFINITY_ENABLED;
+        if (di->hotpluggable) {
+            flags |= MEM_AFFINITY_HOTPLUGGABLE;
+        }
+        if (is_nvdimm) {
+            flags |= MEM_AFFINITY_NON_VOLATILE;
+        }
+
+        build_srat_memory(numamem, di->addr, size, di->node, flags);
+    }
+
+    qapi_free_MemoryDeviceInfoList(info_list);
+}
+
diff --git a/include/hw/acpi/aml-build.h b/include/hw/acpi/aml-build.h
index 6c36903c0a..4c2ca134ee 100644
--- a/include/hw/acpi/aml-build.h
+++ b/include/hw/acpi/aml-build.h
@@ -416,4 +416,7 @@ void build_slit(GArray *table_data, BIOSLinker *linker);
 
 void build_fadt(GArray *tbl, BIOSLinker *linker, const AcpiFadtData *f,
                 const char *oem_id, const char *oem_table_id);
+
+void build_srat_hotpluggable_memory(GArray *table_data, uint64_t base,
+                                    uint64_t len, int default_node);
 #endif
-- 
2.17.1


* [Qemu-devel] [RFC v4 12/16] hw/arm/virt-acpi-build: Add PC-DIMM in SRAT
  2018-10-18 14:30 [Qemu-devel] [RFC v4 00/16] ARM virt: PCDIMM/NVDIMM at 2TB Eric Auger
                   ` (10 preceding siblings ...)
  2018-10-18 14:30 ` [Qemu-devel] [RFC v4 11/16] acpi: move build_srat_hotpluggable_memory to generic ACPI source Eric Auger
@ 2018-10-18 14:30 ` Eric Auger
  2018-10-22 13:40   ` Igor Mammedov
  2018-10-18 14:30 ` [Qemu-devel] [RFC v4 13/16] nvdimm: use configurable ACPI IO base and size Eric Auger
                   ` (3 subsequent siblings)
  15 siblings, 1 reply; 26+ messages in thread
From: Eric Auger @ 2018-10-18 14:30 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, qemu-devel, qemu-arm, peter.maydell,
	shameerali.kolothum.thodi, kwangwoo.lee, imammedo, david
  Cc: drjones, dgilbert, Suzuki.Poulose

From: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>

Generate Memory Affinity Structures for PC-DIMM ranges.

Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Signed-off-by: Eric Auger <eric.auger@redhat.com>

---
v3 -> v4:
- do not use vms->bootinfo.device_memory_start/device_memory_size anymore

v1 -> v2:
- build_srat_hotpluggable_memory moved to aml-build
---
 hw/arm/virt-acpi-build.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
index 5785fb697c..8818bbf5ec 100644
--- a/hw/arm/virt-acpi-build.c
+++ b/hw/arm/virt-acpi-build.c
@@ -545,6 +545,7 @@ build_srat(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
     int i, srat_start;
     uint64_t mem_base;
     MachineClass *mc = MACHINE_GET_CLASS(vms);
+    MachineState *ms = MACHINE(vms);
     const CPUArchIdList *cpu_list = mc->possible_cpu_arch_ids(MACHINE(vms));
 
     srat_start = table_data->len;
@@ -570,6 +571,9 @@ build_srat(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
         }
     }
 
+    build_srat_hotpluggable_memory(table_data, ms->device_memory->base,
+                                   ms->device_memory->mr.size, 0);
+
     build_header(linker, table_data, (void *)(table_data->data + srat_start),
                  "SRAT", table_data->len - srat_start, 3, NULL, NULL);
 }
-- 
2.17.1


* [Qemu-devel] [RFC v4 13/16] nvdimm: use configurable ACPI IO base and size
  2018-10-18 14:30 [Qemu-devel] [RFC v4 00/16] ARM virt: PCDIMM/NVDIMM at 2TB Eric Auger
                   ` (11 preceding siblings ...)
  2018-10-18 14:30 ` [Qemu-devel] [RFC v4 12/16] hw/arm/virt-acpi-build: Add PC-DIMM in SRAT Eric Auger
@ 2018-10-18 14:30 ` Eric Auger
  2018-10-18 14:30 ` [Qemu-devel] [RFC v4 14/16] hw/arm/virt: Add nvdimm hot-plug infrastructure Eric Auger
                   ` (2 subsequent siblings)
  15 siblings, 0 replies; 26+ messages in thread
From: Eric Auger @ 2018-10-18 14:30 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, qemu-devel, qemu-arm, peter.maydell,
	shameerali.kolothum.thodi, kwangwoo.lee, imammedo, david
  Cc: drjones, dgilbert, Suzuki.Poulose

From: Kwangwoo Lee <kwangwoo.lee@sk.com>

This patch uses a configurable IO base and size to create the NPIO AML
for the ACPI NFIT. Architectures such as AArch64 do not use port-mapped
IO, so a configurable IO base is required to map the ACPI IO address
and size correctly.
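
A minimal standalone sketch of the selection logic follows. The types
mirror the AcpiNVDIMMIOType/AcpiNVDIMMIOEntry pair introduced by this
patch, but every name in the sketch is hypothetical, and the
DSM_IO_OFFSET/DSM_IO_LEN values are assumptions standing in for
NVDIMM_ACPI_IO_BASE and NVDIMM_ACPI_IO_LEN.

```c
#include <stdint.h>

#define DSM_IO_OFFSET 0x0a18  /* assumed value of NVDIMM_ACPI_IO_BASE */
#define DSM_IO_LEN    4       /* assumed value of NVDIMM_ACPI_IO_LEN */

typedef enum { IO_PORT, IO_MEMORY } DsmIoType;

typedef struct DsmIo {
    DsmIoType type;
    uint64_t base;
    uint64_t len;
} DsmIo;

typedef enum { SPACE_SYSTEM_MEMORY, SPACE_SYSTEM_IO } RegionSpace;

/* Pick the ACPI operation region space from the configured IO type. */
static RegionSpace dsm_region_space(const DsmIo *io)
{
    return io->type == IO_PORT ? SPACE_SYSTEM_IO : SPACE_SYSTEM_MEMORY;
}

/* Port-mapped setup, as on the x86 machines. */
static DsmIo port_dsm_io(void)
{
    return (DsmIo){ IO_PORT, DSM_IO_OFFSET, DSM_IO_LEN };
}

/* Memory-mapped setup: the same offset inside a platform MMIO window. */
static DsmIo mmio_dsm_io(uint64_t window_base)
{
    return (DsmIo){ IO_MEMORY, window_base + DSM_IO_OFFSET, DSM_IO_LEN };
}
```

A machine without port IO would call something like
mmio_dsm_io(<its ACPI IO window base>) and obtain a SystemMemory
operation region, while the x86 machines keep the SystemIO region.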

Signed-off-by: Kwangwoo Lee <kwangwoo.lee@sk.com>
Signed-off-by: Eric Auger <eric.auger@redhat.com>

---

v2 -> v3:
- s/size/len in pc_piix.c and pc_q35.c
---
 hw/acpi/nvdimm.c        | 28 +++++++++++++++++++---------
 hw/i386/pc_piix.c       |  8 +++++++-
 hw/i386/pc_q35.c        |  8 +++++++-
 include/hw/mem/nvdimm.h | 12 ++++++++++++
 4 files changed, 45 insertions(+), 11 deletions(-)

diff --git a/hw/acpi/nvdimm.c b/hw/acpi/nvdimm.c
index 27eeb6609f..17d71469be 100644
--- a/hw/acpi/nvdimm.c
+++ b/hw/acpi/nvdimm.c
@@ -929,8 +929,8 @@ void nvdimm_init_acpi_state(AcpiNVDIMMState *state, MemoryRegion *io,
                             FWCfgState *fw_cfg, Object *owner)
 {
     memory_region_init_io(&state->io_mr, owner, &nvdimm_dsm_ops, state,
-                          "nvdimm-acpi-io", NVDIMM_ACPI_IO_LEN);
-    memory_region_add_subregion(io, NVDIMM_ACPI_IO_BASE, &state->io_mr);
+                          "nvdimm-acpi-io", state->dsm_io.len);
+    memory_region_add_subregion(io, state->dsm_io.base, &state->io_mr);
 
     state->dsm_mem = g_array_new(false, true /* clear */, 1);
     acpi_data_push(state->dsm_mem, sizeof(NvdimmDsmIn));
@@ -959,12 +959,14 @@ void nvdimm_init_acpi_state(AcpiNVDIMMState *state, MemoryRegion *io,
 
 #define NVDIMM_QEMU_RSVD_UUID   "648B9CF2-CDA1-4312-8AD9-49C4AF32BD62"
 
-static void nvdimm_build_common_dsm(Aml *dev)
+static void nvdimm_build_common_dsm(Aml *dev,
+                                    AcpiNVDIMMState *acpi_nvdimm_state)
 {
     Aml *method, *ifctx, *function, *handle, *uuid, *dsm_mem, *elsectx2;
     Aml *elsectx, *unsupport, *unpatched, *expected_uuid, *uuid_invalid;
     Aml *pckg, *pckg_index, *pckg_buf, *field, *dsm_out_buf, *dsm_out_buf_size;
     uint8_t byte_list[1];
+    AmlRegionSpace rs;
 
     method = aml_method(NVDIMM_COMMON_DSM, 5, AML_SERIALIZED);
     uuid = aml_arg(0);
@@ -975,9 +977,16 @@ static void nvdimm_build_common_dsm(Aml *dev)
 
     aml_append(method, aml_store(aml_name(NVDIMM_ACPI_MEM_ADDR), dsm_mem));
 
+    if (acpi_nvdimm_state->dsm_io.type == NVDIMM_ACPI_IO_PORT) {
+        rs = AML_SYSTEM_IO;
+    } else {
+        rs = AML_SYSTEM_MEMORY;
+    }
+
     /* map DSM memory and IO into ACPI namespace. */
-    aml_append(method, aml_operation_region(NVDIMM_DSM_IOPORT, AML_SYSTEM_IO,
-               aml_int(NVDIMM_ACPI_IO_BASE), NVDIMM_ACPI_IO_LEN));
+    aml_append(method, aml_operation_region(NVDIMM_DSM_IOPORT, rs,
+               aml_int(acpi_nvdimm_state->dsm_io.base),
+               acpi_nvdimm_state->dsm_io.len));
     aml_append(method, aml_operation_region(NVDIMM_DSM_MEMORY,
                AML_SYSTEM_MEMORY, dsm_mem, sizeof(NvdimmDsmIn)));
 
@@ -1260,7 +1269,8 @@ static void nvdimm_build_nvdimm_devices(Aml *root_dev, uint32_t ram_slots)
 }
 
 static void nvdimm_build_ssdt(GArray *table_offsets, GArray *table_data,
-                              BIOSLinker *linker, GArray *dsm_dma_arrea,
+                              BIOSLinker *linker,
+                              AcpiNVDIMMState *acpi_nvdimm_state,
                               uint32_t ram_slots)
 {
     Aml *ssdt, *sb_scope, *dev;
@@ -1288,7 +1298,7 @@ static void nvdimm_build_ssdt(GArray *table_offsets, GArray *table_data,
      */
     aml_append(dev, aml_name_decl("_HID", aml_string("ACPI0012")));
 
-    nvdimm_build_common_dsm(dev);
+    nvdimm_build_common_dsm(dev, acpi_nvdimm_state);
 
     /* 0 is reserved for root device. */
     nvdimm_build_device_dsm(dev, 0);
@@ -1307,7 +1317,7 @@ static void nvdimm_build_ssdt(GArray *table_offsets, GArray *table_data,
                                                NVDIMM_ACPI_MEM_ADDR);
 
     bios_linker_loader_alloc(linker,
-                             NVDIMM_DSM_MEM_FILE, dsm_dma_arrea,
+                             NVDIMM_DSM_MEM_FILE, acpi_nvdimm_state->dsm_mem,
                              sizeof(NvdimmDsmIn), false /* high memory */);
     bios_linker_loader_add_pointer(linker,
         ACPI_BUILD_TABLE_FILE, mem_addr_offset, sizeof(uint32_t),
@@ -1329,7 +1339,7 @@ void nvdimm_build_acpi(GArray *table_offsets, GArray *table_data,
         return;
     }
 
-    nvdimm_build_ssdt(table_offsets, table_data, linker, state->dsm_mem,
+    nvdimm_build_ssdt(table_offsets, table_data, linker, state,
                       ram_slots);
 
     device_list = nvdimm_get_device_list();
diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
index dc09466b3e..c569a19663 100644
--- a/hw/i386/pc_piix.c
+++ b/hw/i386/pc_piix.c
@@ -297,7 +297,13 @@ static void pc_init1(MachineState *machine,
     }
 
     if (pcms->acpi_nvdimm_state.is_enabled) {
-        nvdimm_init_acpi_state(&pcms->acpi_nvdimm_state, system_io,
+        AcpiNVDIMMState *acpi_nvdimm_state = &pcms->acpi_nvdimm_state;
+
+        acpi_nvdimm_state->dsm_io.type = NVDIMM_ACPI_IO_PORT;
+        acpi_nvdimm_state->dsm_io.base = NVDIMM_ACPI_IO_BASE;
+        acpi_nvdimm_state->dsm_io.len = NVDIMM_ACPI_IO_LEN;
+
+        nvdimm_init_acpi_state(acpi_nvdimm_state, system_io,
                                pcms->fw_cfg, OBJECT(pcms));
     }
 }
diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c
index 532241e3f8..a693feb589 100644
--- a/hw/i386/pc_q35.c
+++ b/hw/i386/pc_q35.c
@@ -277,7 +277,13 @@ static void pc_q35_init(MachineState *machine)
     pc_nic_init(pcmc, isa_bus, host_bus);
 
     if (pcms->acpi_nvdimm_state.is_enabled) {
-        nvdimm_init_acpi_state(&pcms->acpi_nvdimm_state, system_io,
+        AcpiNVDIMMState *acpi_nvdimm_state = &pcms->acpi_nvdimm_state;
+
+        acpi_nvdimm_state->dsm_io.type = NVDIMM_ACPI_IO_PORT;
+        acpi_nvdimm_state->dsm_io.base = NVDIMM_ACPI_IO_BASE;
+        acpi_nvdimm_state->dsm_io.len = NVDIMM_ACPI_IO_LEN;
+
+        nvdimm_init_acpi_state(acpi_nvdimm_state, system_io,
                                pcms->fw_cfg, OBJECT(pcms));
     }
 }
diff --git a/include/hw/mem/nvdimm.h b/include/hw/mem/nvdimm.h
index c5c9b3c7f8..af8a5fd034 100644
--- a/include/hw/mem/nvdimm.h
+++ b/include/hw/mem/nvdimm.h
@@ -123,6 +123,17 @@ struct NvdimmFitBuffer {
 };
 typedef struct NvdimmFitBuffer NvdimmFitBuffer;
 
+typedef enum {
+    NVDIMM_ACPI_IO_PORT,
+    NVDIMM_ACPI_IO_MEMORY,
+} AcpiNVDIMMIOType;
+
+typedef struct AcpiNVDIMMIOEntry {
+    AcpiNVDIMMIOType type;
+    hwaddr base;
+    hwaddr len;
+} AcpiNVDIMMIOEntry;
+
 struct AcpiNVDIMMState {
     /* detect if NVDIMM support is enabled. */
     bool is_enabled;
@@ -140,6 +151,7 @@ struct AcpiNVDIMMState {
      */
     int32_t persistence;
     char    *persistence_string;
+    AcpiNVDIMMIOEntry dsm_io;
 };
 typedef struct AcpiNVDIMMState AcpiNVDIMMState;
 
-- 
2.17.1


* [Qemu-devel] [RFC v4 14/16] hw/arm/virt: Add nvdimm hot-plug infrastructure
  2018-10-18 14:30 [Qemu-devel] [RFC v4 00/16] ARM virt: PCDIMM/NVDIMM at 2TB Eric Auger
                   ` (12 preceding siblings ...)
  2018-10-18 14:30 ` [Qemu-devel] [RFC v4 13/16] nvdimm: use configurable ACPI IO base and size Eric Auger
@ 2018-10-18 14:30 ` Eric Auger
  2018-10-18 14:30 ` [Qemu-devel] [RFC v4 15/16] hw/arm/boot: Expose the pmem nodes in the DT Eric Auger
  2018-10-18 14:30 ` [Qemu-devel] [RFC v4 16/16] hw/arm/virt: Add nvdimm and nvdimm-persistence options Eric Auger
  15 siblings, 0 replies; 26+ messages in thread
From: Eric Auger @ 2018-10-18 14:30 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, qemu-devel, qemu-arm, peter.maydell,
	shameerali.kolothum.thodi, kwangwoo.lee, imammedo, david
  Cc: drjones, dgilbert, Suzuki.Poulose

From: Kwangwoo Lee <kwangwoo.lee@sk.com>

Pre-plug and plug handlers are prepared for NVDIMM support.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Kwangwoo Lee <kwangwoo.lee@sk.com>
---
 default-configs/arm-softmmu.mak |  2 ++
 hw/arm/virt-acpi-build.c        |  6 ++++++
 hw/arm/virt.c                   | 22 ++++++++++++++++++++++
 include/hw/arm/virt.h           |  3 +++
 4 files changed, 33 insertions(+)

diff --git a/default-configs/arm-softmmu.mak b/default-configs/arm-softmmu.mak
index 1c110cfe7f..466b3d2c39 100644
--- a/default-configs/arm-softmmu.mak
+++ b/default-configs/arm-softmmu.mak
@@ -160,4 +160,6 @@ CONFIG_STRONGARM=y
 CONFIG_HIGHBANK=y
 CONFIG_MUSICPAL=y
 CONFIG_MEM_HOTPLUG=y
+CONFIG_NVDIMM=y
+CONFIG_ACPI_NVDIMM=y
 
diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
index 8818bbf5ec..dc100dd4c0 100644
--- a/hw/arm/virt-acpi-build.c
+++ b/hw/arm/virt-acpi-build.c
@@ -808,6 +808,7 @@ static
 void virt_acpi_build(VirtMachineState *vms, AcpiBuildTables *tables)
 {
     VirtMachineClass *vmc = VIRT_MACHINE_GET_CLASS(vms);
+    MachineState *ms = MACHINE(vms);
     GArray *table_offsets;
     unsigned dsdt, xsdt;
     GArray *tables_blob = tables->table_data;
@@ -848,6 +849,11 @@ void virt_acpi_build(VirtMachineState *vms, AcpiBuildTables *tables)
         }
     }
 
+    if (vms->acpi_nvdimm_state.is_enabled) {
+        nvdimm_build_acpi(table_offsets, tables_blob, tables->linker,
+                          &vms->acpi_nvdimm_state, ms->ram_slots);
+    }
+
     if (its_class_name() && !vmc->no_its) {
         acpi_add_table(table_offsets, tables_blob);
         build_iort(tables_blob, tables->linker, vms);
diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 1548e9480a..d84d5a5841 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -141,6 +141,7 @@ static const MemMapEntry a15memmap[] = {
     [VIRT_GPIO] =               { 0x09030000, 0x00001000 },
     [VIRT_SECURE_UART] =        { 0x09040000, 0x00001000 },
     [VIRT_SMMU] =               { 0x09050000, 0x00020000 },
+    [VIRT_ACPI_IO] =            { 0x09070000, 0x00010000 },
     [VIRT_MMIO] =               { 0x0a000000, 0x00000200 },
     /* ...repeating for a total of NUM_VIRTIO_TRANSPORTS, each of that size */
     [VIRT_PLATFORM_BUS] =       { 0x0c000000, 0x02000000 },
@@ -1609,6 +1610,18 @@ static void machvirt_init(MachineState *machine)
 
     create_platform_bus(vms, pic);
 
+    if (vms->acpi_nvdimm_state.is_enabled) {
+        AcpiNVDIMMState *acpi_nvdimm_state = &vms->acpi_nvdimm_state;
+
+        acpi_nvdimm_state->dsm_io.type = NVDIMM_ACPI_IO_MEMORY;
+        acpi_nvdimm_state->dsm_io.base =
+                vms->memmap[VIRT_ACPI_IO].base + NVDIMM_ACPI_IO_BASE;
+        acpi_nvdimm_state->dsm_io.len = NVDIMM_ACPI_IO_LEN;
+
+        nvdimm_init_acpi_state(acpi_nvdimm_state, sysmem,
+                               vms->fw_cfg, OBJECT(vms));
+    }
+
     vms->bootinfo.ram_size = machine->ram_size;
     vms->bootinfo.kernel_filename = machine->kernel_filename;
     vms->bootinfo.kernel_cmdline = machine->kernel_cmdline;
@@ -1794,10 +1807,19 @@ static void virt_memory_plug(HotplugHandler *hotplug_dev,
                              DeviceState *dev, Error **errp)
 {
     VirtMachineState *vms = VIRT_MACHINE(hotplug_dev);
+    bool is_nvdimm = object_dynamic_cast(OBJECT(dev), TYPE_NVDIMM);
     Error *local_err = NULL;
 
     pc_dimm_plug(dev, MACHINE(vms), &local_err);
+    if (local_err) {
+        goto out;
+    }
 
+    if (is_nvdimm) {
+        nvdimm_plug(&vms->acpi_nvdimm_state);
+    }
+
+out:
     error_propagate(errp, local_err);
 }
 
diff --git a/include/hw/arm/virt.h b/include/hw/arm/virt.h
index 032d88f4c4..974a110a38 100644
--- a/include/hw/arm/virt.h
+++ b/include/hw/arm/virt.h
@@ -37,6 +37,7 @@
 #include "hw/arm/arm.h"
 #include "sysemu/kvm.h"
 #include "hw/intc/arm_gicv3_common.h"
+#include "hw/mem/nvdimm.h"
 
 #define NUM_GICV2M_SPIS       64
 #define NUM_VIRTIO_TRANSPORTS 32
@@ -81,6 +82,7 @@ enum {
     VIRT_SECURE_UART,
     VIRT_SECURE_MEM,
     VIRT_DEVICE_MEM,
+    VIRT_ACPI_IO,
 };
 
 typedef enum VirtIOMMUType {
@@ -128,6 +130,7 @@ typedef struct {
     uint32_t msi_phandle;
     uint32_t iommu_phandle;
     int psci_conduit;
+    AcpiNVDIMMState acpi_nvdimm_state;
 } VirtMachineState;
 
 #define VIRT_ECAM_ID(high) (high ? VIRT_PCIE_ECAM_HIGH : VIRT_PCIE_ECAM)
-- 
2.17.1
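
For illustration only (not part of the patch): a minimal standalone sketch that checks that the new VIRT_ACPI_IO window fits between its neighbours in a15memmap. The base/size values are copied from the hunk above; the MemMapEntry stand-in and the helper names region_end() and regions_overlap() are assumptions made for this sketch, not QEMU APIs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for QEMU's MemMapEntry: a base address and a size in bytes. */
typedef struct {
    uint64_t base;
    uint64_t size;
} MemMapEntry;

/* Values copied from the a15memmap hunk above. */
static const MemMapEntry virt_smmu    = { 0x09050000, 0x00020000 };
static const MemMapEntry virt_acpi_io = { 0x09070000, 0x00010000 };
static const MemMapEntry virt_mmio    = { 0x0a000000, 0x00000200 };

/* First address past the end of a region. */
static uint64_t region_end(MemMapEntry e)
{
    return e.base + e.size;
}

/* True when the half-open ranges [base, base + size) intersect. */
static bool regions_overlap(MemMapEntry a, MemMapEntry b)
{
    return a.base < region_end(b) && b.base < region_end(a);
}
```

With these values the SMMU region ends exactly where VIRT_ACPI_IO starts, and VIRT_ACPI_IO ends well below the first virtio-mmio transport.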

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [Qemu-devel] [RFC v4 15/16] hw/arm/boot: Expose the pmem nodes in the DT
  2018-10-18 14:30 [Qemu-devel] [RFC v4 00/16] ARM virt: PCDIMM/NVDIMM at 2TB Eric Auger
                   ` (13 preceding siblings ...)
  2018-10-18 14:30 ` [Qemu-devel] [RFC v4 14/16] hw/arm/virt: Add nvdimm hot-plug infrastructure Eric Auger
@ 2018-10-18 14:30 ` Eric Auger
  2018-10-18 14:30 ` [Qemu-devel] [RFC v4 16/16] hw/arm/virt: Add nvdimm and nvdimm-persistence options Eric Auger
  15 siblings, 0 replies; 26+ messages in thread
From: Eric Auger @ 2018-10-18 14:30 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, qemu-devel, qemu-arm, peter.maydell,
	shameerali.kolothum.thodi, kwangwoo.lee, imammedo, david
  Cc: drjones, dgilbert, Suzuki.Poulose

In case of NV-DIMM slots, let's add /pmem DT nodes.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
---
 hw/arm/boot.c | 33 ++++++++++++++++++++++++++++++++-
 1 file changed, 32 insertions(+), 1 deletion(-)

diff --git a/hw/arm/boot.c b/hw/arm/boot.c
index 81d621ce14..50acf0abd4 100644
--- a/hw/arm/boot.c
+++ b/hw/arm/boot.c
@@ -444,6 +444,36 @@ out:
     return ret;
 }
 
+static int fdt_add_pmem_node(void *fdt, uint32_t acells, hwaddr mem_base,
+                             uint32_t scells, hwaddr mem_len,
+                             int numa_node_id)
+{
+    char *nodename = NULL;
+    int ret;
+
+    nodename = g_strdup_printf("/pmem@%" PRIx64, mem_base);
+    qemu_fdt_add_subnode(fdt, nodename);
+    qemu_fdt_setprop_string(fdt, nodename, "compatible", "pmem-region");
+    ret = qemu_fdt_setprop_sized_cells(fdt, nodename, "reg", acells, mem_base,
+                                       scells, mem_len);
+    if (ret < 0) {
+        fprintf(stderr, "couldn't set %s/reg\n", nodename);
+        goto out;
+    }
+    if (numa_node_id < 0) {
+        goto out;
+    }
+
+    ret = qemu_fdt_setprop_cell(fdt, nodename, "numa-node-id", numa_node_id);
+    if (ret < 0) {
+        fprintf(stderr, "couldn't set %s/numa-node-id\n", nodename);
+    }
+
+out:
+    g_free(nodename);
+    return ret;
+}
+
 static void fdt_add_psci_node(void *fdt)
 {
     uint32_t cpu_suspend_fn;
@@ -531,7 +561,8 @@ static int fdt_add_hotpluggable_memory_nodes(void *fdt,
         di = !is_nvdimm ? mi->u.dimm.data : mi->u.nvdimm.data;
 
         if (is_nvdimm) {
-            ret = -ENOENT; /* NV-DIMM not yet supported */
+            ret = fdt_add_pmem_node(fdt, acells, di->addr,
+                                    scells, di->size, di->node);
         } else {
             ret = fdt_add_memory_node(fdt, acells, di->addr,
                                       scells, di->size, di->node);
-- 
2.17.1
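
For illustration only (not part of the patch): a minimal standalone sketch of how the /pmem node name and the "reg" cells come out for a region at 2TB. It assumes #address-cells = 2 and #size-cells = 2; pmem_node_name() and split_cells() are hypothetical helpers for this sketch, and byte-order conversion of the cells is left to the real FDT helpers.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Format the node name the same way the patch does: "/pmem@<hex base>". */
static void pmem_node_name(char *buf, size_t len, uint64_t mem_base)
{
    snprintf(buf, len, "/pmem@%" PRIx64, mem_base);
}

/* Split a 64-bit value into two 32-bit cells, most significant first. */
static void split_cells(uint64_t value, uint32_t cells[2])
{
    cells[0] = (uint32_t)(value >> 32);
    cells[1] = (uint32_t)(value & 0xffffffffu);
}
```

For a base of 2TB (2 << 40) the node is named /pmem@20000000000 and the address is carried as the cell pair <0x200 0x0>.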

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [Qemu-devel] [RFC v4 16/16] hw/arm/virt: Add nvdimm and nvdimm-persistence options
  2018-10-18 14:30 [Qemu-devel] [RFC v4 00/16] ARM virt: PCDIMM/NVDIMM at 2TB Eric Auger
                   ` (14 preceding siblings ...)
  2018-10-18 14:30 ` [Qemu-devel] [RFC v4 15/16] hw/arm/boot: Expose the pmem nodes in the DT Eric Auger
@ 2018-10-18 14:30 ` Eric Auger
  15 siblings, 0 replies; 26+ messages in thread
From: Eric Auger @ 2018-10-18 14:30 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, qemu-devel, qemu-arm, peter.maydell,
	shameerali.kolothum.thodi, kwangwoo.lee, imammedo, david
  Cc: drjones, dgilbert, Suzuki.Poulose

The nvdimm machine option turns NVDIMM support on. The
nvdimm-persistence machine option accepts cpu or mem-ctrl.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
---
 hw/arm/virt.c | 59 +++++++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 57 insertions(+), 2 deletions(-)

diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index d84d5a5841..02fb62595e 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -1748,6 +1748,47 @@ static void virt_set_iommu(Object *obj, const char *value, Error **errp)
     }
 }
 
+static bool virt_get_nvdimm(Object *obj, Error **errp)
+{
+    VirtMachineState *vms = VIRT_MACHINE(obj);
+
+    return vms->acpi_nvdimm_state.is_enabled;
+}
+
+static void virt_set_nvdimm(Object *obj, bool value, Error **errp)
+{
+    VirtMachineState *vms = VIRT_MACHINE(obj);
+
+    vms->acpi_nvdimm_state.is_enabled = value;
+}
+
+static char *virt_get_nvdimm_persistence(Object *obj, Error **errp)
+{
+    VirtMachineState *vms = VIRT_MACHINE(obj);
+
+    return g_strdup(vms->acpi_nvdimm_state.persistence_string);
+}
+
+static void virt_set_nvdimm_persistence(Object *obj, const char *value,
+                                        Error **errp)
+{
+    VirtMachineState *vms = VIRT_MACHINE(obj);
+    AcpiNVDIMMState *nvdimm_state = &vms->acpi_nvdimm_state;
+
+    if (strcmp(value, "cpu") == 0) {
+        nvdimm_state->persistence = 3;
+    } else if (strcmp(value, "mem-ctrl") == 0) {
+        nvdimm_state->persistence = 2;
+    } else {
+        error_report("-machine nvdimm-persistence=%s: unsupported option",
+                     value);
+        exit(EXIT_FAILURE);
+    }
+
+    g_free(nvdimm_state->persistence_string);
+    nvdimm_state->persistence_string = g_strdup(value);
+}
+
 static CpuInstanceProperties
 virt_cpu_index_to_props(MachineState *ms, unsigned cpu_index)
 {
@@ -1790,13 +1831,14 @@ static void virt_memory_pre_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
                                  Error **errp)
 {
     const bool is_nvdimm = object_dynamic_cast(OBJECT(dev), TYPE_NVDIMM);
+    VirtMachineState *vms = VIRT_MACHINE(hotplug_dev);
 
     if (dev->hotplugged) {
         error_setg(errp, "memory hotplug is not supported");
     }
 
-    if (is_nvdimm) {
-        error_setg(errp, "nvdimm is not yet supported");
+    if (is_nvdimm && !vms->acpi_nvdimm_state.is_enabled) {
+        error_setg(errp, "nvdimm is not enabled: missing 'nvdimm' in '-M'");
         return;
     }
 
@@ -2025,6 +2067,19 @@ static void virt_3_2_instance_init(Object *obj)
                                     "Valid values are none and smmuv3",
                                     NULL);
 
+    object_property_add_bool(obj, "nvdimm",
+                             virt_get_nvdimm, virt_set_nvdimm, NULL);
+    object_property_set_description(obj, "nvdimm",
+                                         "Set on/off to enable/disable NVDIMM "
+                                         "instantiation", NULL);
+
+    object_property_add_str(obj, "nvdimm-persistence",
+                            virt_get_nvdimm_persistence,
+                            virt_set_nvdimm_persistence, NULL);
+    object_property_set_description(obj, "nvdimm-persistence",
+                                    "Set NVDIMM persistence. "
+                                    "Valid values are cpu and mem-ctrl", NULL);
+
     vms->memmap = a15memmap;
     vms->irqmap = a15irqmap;
 }
-- 
2.17.1
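
For illustration only (not part of the patch): a minimal standalone sketch of the string-to-value mapping done by virt_set_nvdimm_persistence(). The helper name parse_nvdimm_persistence() and the -1 error return are assumptions made for this sketch; the patch itself reports the error and exits.

```c
#include <assert.h>
#include <string.h>

/*
 * Map the option string to the numeric value stored by the patch:
 * "cpu" -> 3, "mem-ctrl" -> 2. Returns -1 for anything else, where the
 * patch instead reports an error and exits.
 */
static int parse_nvdimm_persistence(const char *value)
{
    if (strcmp(value, "cpu") == 0) {
        return 3;
    }
    if (strcmp(value, "mem-ctrl") == 0) {
        return 2;
    }
    return -1;
}
```

Returning an error code instead of exiting would let a caller decide how to report an unsupported value, for instance through the Error **errp argument the setter already receives.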

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* Re: [Qemu-devel] [RFC v4 07/16] hw/arm/virt: Implement kvm_type function for 3.2 machine
  2018-10-18 14:30 ` [Qemu-devel] [RFC v4 07/16] hw/arm/virt: Implement kvm_type function for 3.2 machine Eric Auger
@ 2018-10-19  2:58   ` Richard Henderson
  2018-10-19 13:59     ` Auger Eric
  0 siblings, 1 reply; 26+ messages in thread
From: Richard Henderson @ 2018-10-19  2:58 UTC (permalink / raw)
  To: Eric Auger, eric.auger.pro, qemu-devel, qemu-arm, peter.maydell,
	shameerali.kolothum.thodi, kwangwoo.lee, imammedo, david
  Cc: drjones, dgilbert, Suzuki.Poulose

On 10/18/18 7:30 AM, Eric Auger wrote:
> +#define SZ_1G (1024ULL * 1024 * 1024)

<qemu/units.h> already defines GiB.


r~
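
For illustration only: a minimal standalone sketch of why the local macro is redundant. SKETCH_GiB is a hypothetical stand-in, assuming GiB in <qemu/units.h> is a 1 << 30 constant; gib_to_bytes() is likewise just a helper for this sketch.

```c
#include <assert.h>
#include <stdint.h>

/* The macro proposed in the patch. */
#define SZ_1G (1024ULL * 1024 * 1024)

/* Hypothetical local stand-in for GiB from <qemu/units.h>. */
#define SKETCH_GiB (INT64_C(1) << 30)

/* Both spell the same value, so the local definition is redundant. */
_Static_assert(SZ_1G == (unsigned long long)SKETCH_GiB,
               "SZ_1G and GiB must agree");

/* Example use: a size expressed in GiB units, converted to bytes. */
static uint64_t gib_to_bytes(uint64_t n)
{
    return n * (uint64_t)SKETCH_GiB;
}
```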

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Qemu-devel] [RFC v4 02/16] linux-headers: header update for KVM/ARM KVM_ARM_GET_MAX_VM_PHYS_SHIFT
  2018-10-18 14:30 ` [Qemu-devel] [RFC v4 02/16] linux-headers: header update for KVM/ARM KVM_ARM_GET_MAX_VM_PHYS_SHIFT Eric Auger
@ 2018-10-19  8:49   ` Suzuki K Poulose
  2018-10-19 14:02     ` Auger Eric
  0 siblings, 1 reply; 26+ messages in thread
From: Suzuki K Poulose @ 2018-10-19  8:49 UTC (permalink / raw)
  To: Eric Auger, eric.auger.pro, qemu-devel, qemu-arm, peter.maydell,
	shameerali.kolothum.thodi, kwangwoo.lee, imammedo, david
  Cc: drjones, dgilbert

Hi Eric,

On 10/18/2018 03:30 PM, Eric Auger wrote:
> This is a header update against kvmarm next branch
> 
> git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm kvmarm/next
> 
> to get the KVM_ARM_GET_MAX_VM_PHYS_SHIFT ioctl. This allows retrieving
> the IPA address range KVM supports.
> 
> Signed-off-by: Eric Auger <eric.auger@redhat.com>
> 
> ---
> 
> v3 -> v4:
> - update against kvmarm next
> ---
>   linux-headers/linux/kvm.h | 10 ++++++++++
>   1 file changed, 10 insertions(+)
> 
> diff --git a/linux-headers/linux/kvm.h b/linux-headers/linux/kvm.h
> index 83ba4eb571..9647ce4fcb 100644
> --- a/linux-headers/linux/kvm.h
> +++ b/linux-headers/linux/kvm.h
> @@ -750,6 +750,15 @@ struct kvm_ppc_resize_hpt {
>   
>   #define KVM_S390_SIE_PAGE_OFFSET 1
>   
> +/*
> + * On arm64, machine type can be used to request the physical
> + * address size for the VM. Bits[7-0] are reserved for the guest
> + * PA size shift (i.e, log2(PA_Size)). For backward compatibility,
> + * value 0 implies the default IPA size, 40bits.
> + */
> +#define KVM_VM_TYPE_ARM_IPA_SIZE_MASK	0xffULL
> +#define KVM_VM_TYPE_ARM_IPA_SIZE(x)		\
> +	((x) & KVM_VM_TYPE_ARM_IPA_SIZE_MASK)
>   /*
>    * ioctls for /dev/kvm fds:
>    */
> @@ -953,6 +962,7 @@ struct kvm_ppc_resize_hpt {
>   #define KVM_CAP_NESTED_STATE 157
>   #define KVM_CAP_ARM_INJECT_SERROR_ESR 158
>   #define KVM_CAP_MSR_PLATFORM_INFO 159
> +#define KVM_CAP_ARM_VM_IPA_SIZE 160 /* returns maximum IPA bits for a VM */

Please be aware that there have been multiple merge conflicts with
the kvmarm-tree onto kvm tree upstream and the numbers have changed.
I assume that you will be rebasing this to mainline anyways.

Cheers
Suzuki
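
For illustration only: a minimal standalone sketch of how the IPA size could be derived from the top of the address map and encoded in the machine type. The two KVM_VM_TYPE_ARM_IPA_SIZE macros are copied from the quoted hunk; DEFAULT_IPA_BITS, ipa_bits_for_top() and effective_ipa_bits() are hypothetical names introduced for this sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Copied from the quoted linux-headers hunk. */
#define KVM_VM_TYPE_ARM_IPA_SIZE_MASK 0xffULL
#define KVM_VM_TYPE_ARM_IPA_SIZE(x) ((x) & KVM_VM_TYPE_ARM_IPA_SIZE_MASK)

/* Default IPA size implied by a type value of 0, per the header comment. */
#define DEFAULT_IPA_BITS 40

/* Smallest number of address bits able to address every byte below 'top'. */
static unsigned ipa_bits_for_top(uint64_t top)
{
    unsigned bits = 0;
    uint64_t last = top ? top - 1 : 0;

    while (last) {
        bits++;
        last >>= 1;
    }
    return bits;
}

/* Effective IPA size requested by a machine type value. */
static unsigned effective_ipa_bits(uint64_t type)
{
    unsigned bits = (unsigned)KVM_VM_TYPE_ARM_IPA_SIZE(type);

    return bits ? bits : DEFAULT_IPA_BITS;
}
```

With device memory starting at 2TB and 3GiB of it (for example -m 1G,maxmem=4G), the top of the map needs 42 bits, which is above the 40-bit default selected by a type value of 0.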

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Qemu-devel] [RFC v4 07/16] hw/arm/virt: Implement kvm_type function for 3.2 machine
  2018-10-19  2:58   ` Richard Henderson
@ 2018-10-19 13:59     ` Auger Eric
  0 siblings, 0 replies; 26+ messages in thread
From: Auger Eric @ 2018-10-19 13:59 UTC (permalink / raw)
  To: Richard Henderson, eric.auger.pro, qemu-devel, qemu-arm,
	peter.maydell, shameerali.kolothum.thodi, kwangwoo.lee, imammedo,
	david
  Cc: drjones, dgilbert, Suzuki.Poulose

Hi Richard,

On 10/19/18 4:58 AM, Richard Henderson wrote:
> On 10/18/18 7:30 AM, Eric Auger wrote:
>> +#define SZ_1G (1024ULL * 1024 * 1024)
> 
> <qemu/units.h> already defines GiB.
noted

thanks!

Eric
> 
> 
> r~
> 

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Qemu-devel] [RFC v4 02/16] linux-headers: header update for KVM/ARM KVM_ARM_GET_MAX_VM_PHYS_SHIFT
  2018-10-19  8:49   ` Suzuki K Poulose
@ 2018-10-19 14:02     ` Auger Eric
  0 siblings, 0 replies; 26+ messages in thread
From: Auger Eric @ 2018-10-19 14:02 UTC (permalink / raw)
  To: Suzuki K Poulose, eric.auger.pro, qemu-devel, qemu-arm,
	peter.maydell, shameerali.kolothum.thodi, kwangwoo.lee, imammedo,
	david
  Cc: drjones, dgilbert

Hi Suzuki,

On 10/19/18 10:49 AM, Suzuki K Poulose wrote:
> Hi Eric,
> 
> On 10/18/2018 03:30 PM, Eric Auger wrote:
>> This is a header update against kvmarm next branch
>>
>> git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm kvmarm/next
>>
>> to get the KVM_ARM_GET_MAX_VM_PHYS_SHIFT ioctl. This allows retrieving
>> the IPA address range KVM supports.
>>
>> Signed-off-by: Eric Auger <eric.auger@redhat.com>
>>
>> ---
>>
>> v3 -> v4:
>> - update against kvmarm next
>> ---
>>   linux-headers/linux/kvm.h | 10 ++++++++++
>>   1 file changed, 10 insertions(+)
>>
>> diff --git a/linux-headers/linux/kvm.h b/linux-headers/linux/kvm.h
>> index 83ba4eb571..9647ce4fcb 100644
>> --- a/linux-headers/linux/kvm.h
>> +++ b/linux-headers/linux/kvm.h
>> @@ -750,6 +750,15 @@ struct kvm_ppc_resize_hpt {
>>
>>   #define KVM_S390_SIE_PAGE_OFFSET 1
>>
>> +/*
>> + * On arm64, machine type can be used to request the physical
>> + * address size for the VM. Bits[7-0] are reserved for the guest
>> + * PA size shift (i.e, log2(PA_Size)). For backward compatibility,
>> + * value 0 implies the default IPA size, 40bits.
>> + */
>> +#define KVM_VM_TYPE_ARM_IPA_SIZE_MASK    0xffULL
>> +#define KVM_VM_TYPE_ARM_IPA_SIZE(x)        \
>> +    ((x) & KVM_VM_TYPE_ARM_IPA_SIZE_MASK)
>>   /*
>>    * ioctls for /dev/kvm fds:
>>    */
>> @@ -953,6 +962,7 @@ struct kvm_ppc_resize_hpt {
>>   #define KVM_CAP_NESTED_STATE 157
>>   #define KVM_CAP_ARM_INJECT_SERROR_ESR 158
>>   #define KVM_CAP_MSR_PLATFORM_INFO 159
>> +#define KVM_CAP_ARM_VM_IPA_SIZE 160 /* returns maximum IPA bits for a VM */
> 
> Please be aware that there have been multiple merge conflicts with
> the kvmarm-tree onto kvm tree upstream and the numbers have changed.
> I assume that you will be rebasing this to mainline anyways.

Thank you for the heads up. Yes I will respin this header update against
the master branch later on.

Thanks

Eric
> 
> Cheers
> Suzuki
> 

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Qemu-devel] [RFC v4 12/16] hw/arm/virt-acpi-build: Add PC-DIMM in SRAT
  2018-10-18 14:30 ` [Qemu-devel] [RFC v4 12/16] hw/arm/virt-acpi-build: Add PC-DIMM in SRAT Eric Auger
@ 2018-10-22 13:40   ` Igor Mammedov
  2018-11-05 13:27     ` Auger Eric
  0 siblings, 1 reply; 26+ messages in thread
From: Igor Mammedov @ 2018-10-22 13:40 UTC (permalink / raw)
  To: Eric Auger
  Cc: eric.auger.pro, qemu-devel, qemu-arm, peter.maydell,
	shameerali.kolothum.thodi, kwangwoo.lee, david, drjones,
	dgilbert, Suzuki.Poulose

On Thu, 18 Oct 2018 16:30:38 +0200
Eric Auger <eric.auger@redhat.com> wrote:

> From: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
> 
> Generate Memory Affinity Structures for PC-DIMM ranges.
> 
> Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
> Signed-off-by: Eric Auger <eric.auger@redhat.com>
> 
> ---
> v3 -> v4:
> - do not use vms->bootinfo.device_memory_start/device_memory_size anymore
> 
> v1 -> v2:
> - build_srat_hotpluggable_memory moved to aml-build
> ---
>  hw/arm/virt-acpi-build.c | 4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
> index 5785fb697c..8818bbf5ec 100644
> --- a/hw/arm/virt-acpi-build.c
> +++ b/hw/arm/virt-acpi-build.c
> @@ -545,6 +545,7 @@ build_srat(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
>      int i, srat_start;
>      uint64_t mem_base;
>      MachineClass *mc = MACHINE_GET_CLASS(vms);
> +    MachineState *ms = MACHINE(vms);
>      const CPUArchIdList *cpu_list = mc->possible_cpu_arch_ids(MACHINE(vms));
>  
>      srat_start = table_data->len;
> @@ -570,6 +571,9 @@ build_srat(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
>          }
>      }
>  
> +    build_srat_hotpluggable_memory(table_data, ms->device_memory->base,
> +                                   ms->device_memory->mr.size, 0);
On x86, we use the last node here to make Windows happy. I'd use the same value here.

> +
>      build_header(linker, table_data, (void *)(table_data->data + srat_start),
>                   "SRAT", table_data->len - srat_start, 3, NULL, NULL);
>  }

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Qemu-devel] [RFC v4 12/16] hw/arm/virt-acpi-build: Add PC-DIMM in SRAT
  2018-10-22 13:40   ` Igor Mammedov
@ 2018-11-05 13:27     ` Auger Eric
  0 siblings, 0 replies; 26+ messages in thread
From: Auger Eric @ 2018-11-05 13:27 UTC (permalink / raw)
  To: Igor Mammedov
  Cc: peter.maydell, drjones, david, Suzuki.Poulose, qemu-devel,
	shameerali.kolothum.thodi, dgilbert, qemu-arm, kwangwoo.lee,
	eric.auger.pro

Hi Igor,

On 10/22/18 3:40 PM, Igor Mammedov wrote:
> On Thu, 18 Oct 2018 16:30:38 +0200
> Eric Auger <eric.auger@redhat.com> wrote:
> 
>> From: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
>>
>> Generate Memory Affinity Structures for PC-DIMM ranges.
>>
>> Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
>> Signed-off-by: Eric Auger <eric.auger@redhat.com>
>>
>> ---
>> v3 -> v4:
>> - do not use vms->bootinfo.device_memory_start/device_memory_size anymore
>>
>> v1 -> v2:
>> - build_srat_hotpluggable_memory moved to aml-build
>> ---
>>  hw/arm/virt-acpi-build.c | 4 ++++
>>  1 file changed, 4 insertions(+)
>>
>> diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
>> index 5785fb697c..8818bbf5ec 100644
>> --- a/hw/arm/virt-acpi-build.c
>> +++ b/hw/arm/virt-acpi-build.c
>> @@ -545,6 +545,7 @@ build_srat(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
>>      int i, srat_start;
>>      uint64_t mem_base;
>>      MachineClass *mc = MACHINE_GET_CLASS(vms);
>> +    MachineState *ms = MACHINE(vms);
>>      const CPUArchIdList *cpu_list = mc->possible_cpu_arch_ids(MACHINE(vms));
>>  
>>      srat_start = table_data->len;
>> @@ -570,6 +571,9 @@ build_srat(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
>>          }
>>      }
>>  
>> +    build_srat_hotpluggable_memory(table_data, ms->device_memory->base,
>> +                                   ms->device_memory->mr.size, 0);
> On x86, we use the last node here to make Windows happy. I'd use the same value here.
OK thank you for the information.

Regards

Eric
> 
>> +
>>      build_header(linker, table_data, (void *)(table_data->data + srat_start),
>>                   "SRAT", table_data->len - srat_start, 3, NULL, NULL);
>>  }
> 
> 

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Qemu-devel] [RFC v4 11/16] acpi: move build_srat_hotpluggable_memory to generic ACPI source
  2018-10-18 14:30 ` [Qemu-devel] [RFC v4 11/16] acpi: move build_srat_hotpluggable_memory to generic ACPI source Eric Auger
@ 2019-01-21 11:44   ` Shameerali Kolothum Thodi
  2019-01-21 13:58     ` Auger Eric
  0 siblings, 1 reply; 26+ messages in thread
From: Shameerali Kolothum Thodi @ 2019-01-21 11:44 UTC (permalink / raw)
  To: Eric Auger, eric.auger.pro, qemu-devel, qemu-arm, peter.maydell,
	kwangwoo.lee, imammedo, david
  Cc: drjones, dgilbert, Suzuki.Poulose, Linuxarm

Hi Eric,

As I informed you earlier, I am working on adding hot-add support for pc-dimm
and nvdimm on top of this series, using the pl061 GPIO pins as a way to inject
memory hot-add events into the guest. I came across a problem while testing
my patches which looks related to the way SRAT entries are populated
in this patch.

If I boot the guest without NUMA and hot-add DIMMs, everything seems to
work fine. But when I enable NUMA (-numa node,nodeid=0), hot-add
DIMMs and perform a reboot, it fails.

Something like this:

./qemu-system-aarch64 \
-machine virt,kernel_irqchip=on,gic-version=3,nvdimm \
-m 1G,maxmem=4G,slots=5 \
-cpu host \
-kernel Image \
-bios QEMU_EFI.fd \
-initrd rootfs-iperf.cpio \
-numa node,nodeid=0 \
-net none \
-nographic -enable-kvm \
-append "console=ttyAMA0 root=/dev/vda rw acpi=force"

Qemu Monitor:
(qemu) object_add memory-backend-ram,id=mem1,size=1G
(qemu) device_add pc-dimm,id=dimm1,memdev=mem1

Exit the QEMU monitor and reboot the guest.

EFI stub: Booting Linux Kernel...
EFI stub: Generating empty DTB
EFI stub: Exiting boot services and installing virtual address map...
[    0.000000] Booting Linux on physical CPU 0x0000000000 [0x480fd010]
[    0.000000] Linux version 4.20.0-rc3-dirty (shameer@shameer-ubuntu) (gcc version 7.3.1 20180425 [linaro-7.3-2018.05 revision d29120a424ecfbc167ef90065c0eeb7f91977701] (Linaro GCC 7.3-2018.05)) #21 SMP PREEMPT Wed Jan 16 14:45:58 GMT 2019
[    0.000000] earlycon: pl11 at MMIO 0x0000000009000000 (options '')
[    0.000000] printk: bootconsole [pl11] enabled
[    0.000000] efi: Getting EFI parameters from FDT:
[    0.000000] efi: EFI v2.70 by EDK II
[    0.000000] efi:  SMBIOS 3.0=0x7f810000  MEMATTR=0x7c63ca98  MEMRESERVE=0x7bfeb018 
[    0.000000] cma: Reserved 32 MiB at 0x000000007d000000
[    0.000000] ACPI: Early table checksum verification disabled
[    0.000000] ACPI: System description tables not found
[    0.000000] ACPI: Failed to init ACPI tables
[    0.000000] ACPI: NUMA: Failed to initialise from firmware

[...]

Debugging shows that EDK2 has a check [1] to verify the table size, and it fails on reboot when
NUMA is enabled (i.e., SRAT is present):

ProcessCmdAddPointer: invalid pointer location or size in "etc/acpi/tables"

This seems to be related to the code below, where the SRAT is built:

> -----Original Message-----
> From: Eric Auger [mailto:eric.auger@redhat.com]
> Sent: 18 October 2018 15:31
> To: eric.auger.pro@gmail.com; eric.auger@redhat.com;
> qemu-devel@nongnu.org; qemu-arm@nongnu.org; peter.maydell@linaro.org;
> Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>;
> kwangwoo.lee@sk.com; imammedo@redhat.com; david@redhat.com
> Cc: drjones@redhat.com; dgilbert@redhat.com; Suzuki.Poulose@arm.com
> Subject: [RFC v4 11/16] acpi: move build_srat_hotpluggable_memory to generic
> ACPI source
> 
> We plan to reuse build_srat_hotpluggable_memory() for ARM so
> let's move the function to aml-build.
> 
> Signed-off-by: Eric Auger <eric.auger@redhat.com>
> ---
>  hw/acpi/aml-build.c         | 51 +++++++++++++++++++++++++++++++++++++
>  include/hw/acpi/aml-build.h |  3 +++
>  2 files changed, 54 insertions(+)
> 
> diff --git a/hw/acpi/aml-build.c b/hw/acpi/aml-build.c
> index 1e43cd736d..167fb6bf3e 100644
> --- a/hw/acpi/aml-build.c
> +++ b/hw/acpi/aml-build.c
> @@ -22,6 +22,7 @@
>  #include "qemu/osdep.h"
>  #include <glib/gprintf.h>
>  #include "hw/acpi/aml-build.h"
> +#include "hw/mem/memory-device.h"
>  #include "qemu/bswap.h"
>  #include "qemu/bitops.h"
>  #include "sysemu/numa.h"
> @@ -1802,3 +1803,53 @@ build_hdr:
>      build_header(linker, tbl, (void *)(tbl->data + fadt_start),
>                   "FACP", tbl->len - fadt_start, f->rev, oem_id, oem_table_id);
>  }
> +
> +void build_srat_hotpluggable_memory(GArray *table_data, uint64_t base,
> +                                    uint64_t len, int default_node)
> +{
> +    MemoryDeviceInfoList *info_list = qmp_memory_device_list();
> +    MemoryDeviceInfoList *info;
> +    MemoryDeviceInfo *mi;
> +    PCDIMMDeviceInfo *di;
> +    uint64_t end = base + len, cur, size;
> +    bool is_nvdimm;
> +    AcpiSratMemoryAffinity *numamem;
> +    MemoryAffinityFlags flags;
> +
> +    for (cur = base, info = info_list;
> +         cur < end;
> +         cur += size, info = info->next) {
> +        numamem = acpi_data_push(table_data, sizeof *numamem);
> +
> +        if (!info) {
> +            build_srat_memory(numamem, cur, end - cur, default_node,
> +                              MEM_AFFINITY_HOTPLUGGABLE | MEM_AFFINITY_ENABLED);
> +            break;
> +        }
> +
> +        mi = info->value;
> +        is_nvdimm = (mi->type == MEMORY_DEVICE_INFO_KIND_NVDIMM);
> +        di = !is_nvdimm ? mi->u.dimm.data : mi->u.nvdimm.data;
> +
> +        if (cur < di->addr) {
> +            build_srat_memory(numamem, cur, di->addr - cur, default_node,
> +                              MEM_AFFINITY_HOTPLUGGABLE | MEM_AFFINITY_ENABLED);
> +            numamem = acpi_data_push(table_data, sizeof *numamem);
> +        }
> +
> +        size = di->size;
> +
> +        flags = MEM_AFFINITY_ENABLED;
> +        if (di->hotpluggable) {
> +            flags |= MEM_AFFINITY_HOTPLUGGABLE;
> +        }
> +        if (is_nvdimm) {
> +            flags |= MEM_AFFINITY_NON_VOLATILE;
> +        }
> +
> +        build_srat_memory(numamem, di->addr, size, di->node, flags);
> +    }
> +
> +    qapi_free_MemoryDeviceInfoList(info_list);
> +}

The above logic changes the SRAT entries if a DIMM is hot-added and a reboot is
performed and, as mentioned above, EDK2 does not seem to be happy with this.

I had a look at the pc code and it looks like it only has one SRAT entry for the
whole hot-pluggable address space [2], which probably means the SRAT remains the same
across reboots even after memory is hot-added. Not sure why we are doing this differently
for ARM.

Please take a look and let me know your thoughts.

Thanks,
Shameer

[1] https://github.com/tianocore/edk2/blob/master/OvmfPkg/AcpiPlatformDxe/QemuFwCfgAcpi.c#L459
[2] https://github.com/eauger/qemu/blob/v3.0.0-dimm-2tb-v4/hw/i386/acpi-build.c#L2367
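
For illustration only: a minimal standalone sketch of how the number of Memory Affinity entries for the hotpluggable window depends on which DIMMs are plugged when the table is built per DIMM. It is a simplified model of the quoted loop, not the QEMU code; DimmRange and srat_entries_per_dimm() are hypothetical names, and the DIMM list is assumed sorted and non-overlapping.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the address and size of one plugged DIMM. */
typedef struct {
    uint64_t addr;
    uint64_t size;
} DimmRange;

/*
 * Count the Memory Affinity entries a per-DIMM layout produces for the
 * hotpluggable window [base, base + len): one entry per gap, one per
 * DIMM, and one for the free tail. 'dimms' must be sorted by address
 * and must lie inside the window.
 */
static size_t srat_entries_per_dimm(uint64_t base, uint64_t len,
                                    const DimmRange *dimms, size_t n)
{
    uint64_t end = base + len;
    uint64_t cur = base;
    size_t count = 0;

    for (size_t i = 0; i < n; i++) {
        if (cur < dimms[i].addr) {
            count++;                /* gap before this DIMM */
        }
        count++;                    /* the DIMM itself */
        cur = dimms[i].addr + dimms[i].size;
    }
    if (cur < end) {
        count++;                    /* remaining free space */
    }
    return count;
}
```

In this model a 3GiB window yields one entry with no DIMM plugged and two entries once a 1GiB DIMM sits at the start of the window, so a table built after the hot-add is larger than the one built at first boot.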


> +
> diff --git a/include/hw/acpi/aml-build.h b/include/hw/acpi/aml-build.h
> index 6c36903c0a..4c2ca134ee 100644
> --- a/include/hw/acpi/aml-build.h
> +++ b/include/hw/acpi/aml-build.h
> @@ -416,4 +416,7 @@ void build_slit(GArray *table_data, BIOSLinker *linker);
> 
>  void build_fadt(GArray *tbl, BIOSLinker *linker, const AcpiFadtData *f,
>                  const char *oem_id, const char *oem_table_id);
> +
> +void build_srat_hotpluggable_memory(GArray *table_data, uint64_t base,
> +                                    uint64_t len, int default_node);
>  #endif
> --
> 2.17.1

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Qemu-devel] [RFC v4 11/16] acpi: move build_srat_hotpluggable_memory to generic ACPI source
  2019-01-21 11:44   ` Shameerali Kolothum Thodi
@ 2019-01-21 13:58     ` Auger Eric
  2019-01-21 17:09       ` Shameerali Kolothum Thodi
  0 siblings, 1 reply; 26+ messages in thread
From: Auger Eric @ 2019-01-21 13:58 UTC (permalink / raw)
  To: Shameerali Kolothum Thodi, eric.auger.pro, qemu-devel, qemu-arm,
	peter.maydell, kwangwoo.lee, imammedo, david
  Cc: drjones, Linuxarm, dgilbert, Suzuki.Poulose

Hi Shameer,

On 1/21/19 12:44 PM, Shameerali Kolothum Thodi wrote:
> Hi Eric,
> 
> As I informed you earlier, I am working on adding hot-add support for pc-dimm
> and nvdimm on top of this series, using the pl061 GPIO pins as a way to inject
> memory hot-add events into the guest. I came across a problem while testing
> my patches which looks related to the way SRAT entries are populated
> in this patch.
> 
> If I boot the guest without NUMA and hot-add DIMMs, everything seems to
> work fine. But when I enable NUMA (-numa node,nodeid=0), hot-add
> DIMMs and perform a reboot, it fails.
> 
> Something like this:
> 
> ./qemu-system-aarch64 \
> -machine virt,kernel_irqchip=on,gic-version=3,nvdimm \
> -m 1G,maxmem=4G,slots=5 \
> -cpu host \
> -kernel Image \
> -bios QEMU_EFI.fd \
> -initrd rootfs-iperf.cpio \
> -numa node,nodeid=0 \
> -net none \
> -nographic -enable-kvm \
> -append "console=ttyAMA0 root=/dev/vda rw acpi=force"
> 
> Qemu Monitor:
> (qemu) object_add memory-backend-ram,id=mem1,size=1G
> (qemu) device_add pc-dimm,id=dimm1,memdev=mem1
> 
> Exit the QEMU monitor and reboot the guest.
> 
> EFI stub: Booting Linux Kernel...
> EFI stub: Generating empty DTB
> EFI stub: Exiting boot services and installing virtual address map...
> [    0.000000] Booting Linux on physical CPU 0x0000000000 [0x480fd010]
> [    0.000000] Linux version 4.20.0-rc3-dirty (shameer@shameer-ubuntu) (gcc version 7.3.1 20180425 [linaro-7.3-2018.05 revision d29120a424ecfbc167ef90065c0eeb7f91977701] (Linaro GCC 7.3-2018.05)) #21 SMP PREEMPT Wed Jan 16 14:45:58 GMT 2019
> [    0.000000] earlycon: pl11 at MMIO 0x0000000009000000 (options '')
> [    0.000000] printk: bootconsole [pl11] enabled
> [    0.000000] efi: Getting EFI parameters from FDT:
> [    0.000000] efi: EFI v2.70 by EDK II
> [    0.000000] efi:  SMBIOS 3.0=0x7f810000  MEMATTR=0x7c63ca98  MEMRESERVE=0x7bfeb018 
> [    0.000000] cma: Reserved 32 MiB at 0x000000007d000000
> [    0.000000] ACPI: Early table checksum verification disabled
> [    0.000000] ACPI: System description tables not found
> [    0.000000] ACPI: Failed to init ACPI tables
> [    0.000000] ACPI: NUMA: Failed to initialise from firmware
> 
> [...]
> 
> Debugging shows that EDK2 has a check [1] to verify the table size, and it fails on reboot when
> NUMA is enabled (i.e., SRAT is present):
> 
> ProcessCmdAddPointer: invalid pointer location or size in "etc/acpi/tables"
> 
> This seems to be related to the code below, where the SRAT is built:
> 
>> -----Original Message-----
>> From: Eric Auger [mailto:eric.auger@redhat.com]
>> Sent: 18 October 2018 15:31
>> To: eric.auger.pro@gmail.com; eric.auger@redhat.com;
>> qemu-devel@nongnu.org; qemu-arm@nongnu.org; peter.maydell@linaro.org;
>> Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>;
>> kwangwoo.lee@sk.com; imammedo@redhat.com; david@redhat.com
>> Cc: drjones@redhat.com; dgilbert@redhat.com; Suzuki.Poulose@arm.com
>> Subject: [RFC v4 11/16] acpi: move build_srat_hotpluggable_memory to generic
>> ACPI source
>>
>> We plan to reuse build_srat_hotpluggable_memory() for ARM so
>> let's move the function to aml-build.
>>
>> Signed-off-by: Eric Auger <eric.auger@redhat.com>
>> ---
>>  hw/acpi/aml-build.c         | 51 +++++++++++++++++++++++++++++++++++++
>>  include/hw/acpi/aml-build.h |  3 +++
>>  2 files changed, 54 insertions(+)
>>
>> diff --git a/hw/acpi/aml-build.c b/hw/acpi/aml-build.c
>> index 1e43cd736d..167fb6bf3e 100644
>> --- a/hw/acpi/aml-build.c
>> +++ b/hw/acpi/aml-build.c
>> @@ -22,6 +22,7 @@
>>  #include "qemu/osdep.h"
>>  #include <glib/gprintf.h>
>>  #include "hw/acpi/aml-build.h"
>> +#include "hw/mem/memory-device.h"
>>  #include "qemu/bswap.h"
>>  #include "qemu/bitops.h"
>>  #include "sysemu/numa.h"
>> @@ -1802,3 +1803,53 @@ build_hdr:
>>      build_header(linker, tbl, (void *)(tbl->data + fadt_start),
>>                   "FACP", tbl->len - fadt_start, f->rev, oem_id, oem_table_id);
>>  }
>> +
>> +void build_srat_hotpluggable_memory(GArray *table_data, uint64_t base,
>> +                                    uint64_t len, int default_node)
>> +{
>> +    MemoryDeviceInfoList *info_list = qmp_memory_device_list();
>> +    MemoryDeviceInfoList *info;
>> +    MemoryDeviceInfo *mi;
>> +    PCDIMMDeviceInfo *di;
>> +    uint64_t end = base + len, cur, size;
>> +    bool is_nvdimm;
>> +    AcpiSratMemoryAffinity *numamem;
>> +    MemoryAffinityFlags flags;
>> +
>> +    for (cur = base, info = info_list;
>> +         cur < end;
>> +         cur += size, info = info->next) {
>> +        numamem = acpi_data_push(table_data, sizeof *numamem);
>> +
>> +        if (!info) {
>> +            build_srat_memory(numamem, cur, end - cur, default_node,
>> +                              MEM_AFFINITY_HOTPLUGGABLE | MEM_AFFINITY_ENABLED);
>> +            break;
>> +        }
>> +
>> +        mi = info->value;
>> +        is_nvdimm = (mi->type == MEMORY_DEVICE_INFO_KIND_NVDIMM);
>> +        di = !is_nvdimm ? mi->u.dimm.data : mi->u.nvdimm.data;
>> +
>> +        if (cur < di->addr) {
>> +            build_srat_memory(numamem, cur, di->addr - cur, default_node,
>> +                              MEM_AFFINITY_HOTPLUGGABLE | MEM_AFFINITY_ENABLED);
>> +            numamem = acpi_data_push(table_data, sizeof *numamem);
>> +        }
>> +
>> +        size = di->size;
>> +
>> +        flags = MEM_AFFINITY_ENABLED;
>> +        if (di->hotpluggable) {
>> +            flags |= MEM_AFFINITY_HOTPLUGGABLE;
>> +        }
>> +        if (is_nvdimm) {
>> +            flags |= MEM_AFFINITY_NON_VOLATILE;
>> +        }
>> +
>> +        build_srat_memory(numamem, di->addr, size, di->node, flags);
>> +    }
>> +
>> +    qapi_free_MemoryDeviceInfoList(info_list);
>> +}
> 
> The above logic changes the SRAT entries if a DIMM is hot-added and a reboot is
> performed and, as mentioned above, EDK2 does not seem to be happy with this.
> 
> I had a look at the pc code and it looks like it only has one SRAT entry for the
> whole hot-pluggable address space [2], which probably means the SRAT remains the
> same across reboots even after memory is hot-added. I am not sure why we are doing
> this differently for ARM.

Initially this code was inspired by the x86 code. Then Igor fixed it on x86
with:
dbb6da8ba7e02105bdbb33b527e088249c9843c8 ("pc: acpi: revert back to 1
SRAT entry for hotpluggable area")

So it looks like the same change must be done on ARM.
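
To illustrate the difference, here is a rough standalone sketch (not the
actual QEMU code). All sketch_* and Sketch* names are hypothetical
stand-ins for build_srat_memory(), AcpiSratMemoryAffinity and the DIMM
info list; NVDIMM flags are left out and the DIMM list is assumed to be
sorted by address. The per-DIMM variant follows the spirit of the quoted
loop, so its entry count depends on what is plugged at boot time, while
the single-entry variant always describes the whole hotpluggable window.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for the SRAT flags and entry layout. */
enum {
    SKETCH_MEM_ENABLED      = 1 << 0,
    SKETCH_MEM_HOTPLUGGABLE = 1 << 1,
};

typedef struct {
    uint64_t base;
    uint64_t len;
    int node;
    unsigned flags;
} SketchSratEntry;

typedef struct {
    uint64_t addr;
    uint64_t size;
    int node;
} SketchDimm;

/*
 * Per-DIMM layout: one entry per free gap and one per plugged DIMM.
 * Assumes dimms[] is sorted by address and lies inside [base, base + len).
 * Returns the number of entries written (at most max).
 */
size_t sketch_srat_per_dimm(SketchSratEntry *out, size_t max,
                            uint64_t base, uint64_t len,
                            const SketchDimm *dimms, size_t ndimms,
                            int default_node)
{
    const unsigned hp = SKETCH_MEM_ENABLED | SKETCH_MEM_HOTPLUGGABLE;
    uint64_t cur = base, end = base + len;
    size_t n = 0, i = 0;

    assert(out != NULL);
    while (cur < end && n < max) {
        if (i == ndimms) {
            /* No DIMM left: one entry for the remaining free range. */
            out[n++] = (SketchSratEntry){ cur, end - cur, default_node, hp };
            break;
        }
        if (cur < dimms[i].addr) {
            /* Free gap in front of the next DIMM. */
            out[n++] = (SketchSratEntry){ cur, dimms[i].addr - cur,
                                          default_node, hp };
            if (n == max) {
                break;
            }
        }
        /* The DIMM itself, attached to its own NUMA node. */
        out[n++] = (SketchSratEntry){ dimms[i].addr, dimms[i].size,
                                      dimms[i].node, hp };
        cur = dimms[i].addr + dimms[i].size;
        i++;
    }
    return n;
}

/*
 * Single-entry layout: the whole hotpluggable window is described once,
 * independently of which DIMMs are currently plugged.
 */
size_t sketch_srat_single(SketchSratEntry *out, uint64_t base, uint64_t len,
                          int default_node)
{
    assert(out != NULL);
    if (len == 0) {
        return 0;
    }
    out[0] = (SketchSratEntry){ base, len, default_node,
                                SKETCH_MEM_ENABLED | SKETCH_MEM_HOTPLUGGABLE };
    return 1;
}
```

With an empty window the per-DIMM variant emits one entry; after plugging
one DIMM in the middle of the window and rebooting it emits three, whereas
the single-entry variant emits the same entry in both cases.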

I am currently busy with this respin. As suggested by David and Igor I
implemented the initial RAM as device memory. I am now looking at NUMA
impacts of that change.

Thanks

Eric
> 
> Please take a look and let me know your thoughts.
> 
> Thanks,
> Shameer
> 
> [1] https://github.com/tianocore/edk2/blob/master/OvmfPkg/AcpiPlatformDxe/QemuFwCfgAcpi.c#L459
> [2] https://github.com/eauger/qemu/blob/v3.0.0-dimm-2tb-v4/hw/i386/acpi-build.c#L2367
> 
> 
>> +
>> diff --git a/include/hw/acpi/aml-build.h b/include/hw/acpi/aml-build.h
>> index 6c36903c0a..4c2ca134ee 100644
>> --- a/include/hw/acpi/aml-build.h
>> +++ b/include/hw/acpi/aml-build.h
>> @@ -416,4 +416,7 @@ void build_slit(GArray *table_data, BIOSLinker *linker);
>>
>>  void build_fadt(GArray *tbl, BIOSLinker *linker, const AcpiFadtData *f,
>>                  const char *oem_id, const char *oem_table_id);
>> +
>> +void build_srat_hotpluggable_memory(GArray *table_data, uint64_t base,
>> +                                    uint64_t len, int default_node);
>>  #endif
>> --
>> 2.17.1
> 
> 


* Re: [Qemu-devel] [RFC v4 11/16] acpi: move build_srat_hotpluggable_memory to generic ACPI source
  2019-01-21 13:58     ` Auger Eric
@ 2019-01-21 17:09       ` Shameerali Kolothum Thodi
  0 siblings, 0 replies; 26+ messages in thread
From: Shameerali Kolothum Thodi @ 2019-01-21 17:09 UTC (permalink / raw)
  To: Auger Eric, eric.auger.pro, qemu-devel, qemu-arm, peter.maydell,
	kwangwoo.lee, imammedo, david
  Cc: drjones, Linuxarm, dgilbert, Suzuki.Poulose

Hi Eric,

> -----Original Message-----
> From: Auger Eric [mailto:eric.auger@redhat.com]
> Sent: 21 January 2019 13:59
> To: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>;
> eric.auger.pro@gmail.com; qemu-devel@nongnu.org; qemu-arm@nongnu.org;
> peter.maydell@linaro.org; kwangwoo.lee@sk.com; imammedo@redhat.com;
> david@redhat.com
> Cc: drjones@redhat.com; Linuxarm <linuxarm@huawei.com>;
> dgilbert@redhat.com; Suzuki.Poulose@arm.com
> Subject: Re: [Qemu-devel] [RFC v4 11/16] acpi: move
> build_srat_hotpluggable_memory to generic ACPI source

[...]

> >> +void build_srat_hotpluggable_memory(GArray *table_data, uint64_t base,
> >> +                                    uint64_t len, int default_node)
> >> +{
> >> +    MemoryDeviceInfoList *info_list = qmp_memory_device_list();
> >> +    MemoryDeviceInfoList *info;
> >> +    MemoryDeviceInfo *mi;
> >> +    PCDIMMDeviceInfo *di;
> >> +    uint64_t end = base + len, cur, size;
> >> +    bool is_nvdimm;
> >> +    AcpiSratMemoryAffinity *numamem;
> >> +    MemoryAffinityFlags flags;
> >> +
> >> +    for (cur = base, info = info_list;
> >> +         cur < end;
> >> +         cur += size, info = info->next) {
> >> +        numamem = acpi_data_push(table_data, sizeof *numamem);
> >> +
> >> +        if (!info) {
> >> +            build_srat_memory(numamem, cur, end - cur, default_node,
> >> +                              MEM_AFFINITY_HOTPLUGGABLE | MEM_AFFINITY_ENABLED);
> >> +            break;
> >> +        }
> >> +
> >> +        mi = info->value;
> >> +        is_nvdimm = (mi->type == MEMORY_DEVICE_INFO_KIND_NVDIMM);
> >> +        di = !is_nvdimm ? mi->u.dimm.data : mi->u.nvdimm.data;
> >> +
> >> +        if (cur < di->addr) {
> >> +            build_srat_memory(numamem, cur, di->addr - cur, default_node,
> >> +                              MEM_AFFINITY_HOTPLUGGABLE | MEM_AFFINITY_ENABLED);
> >> +            numamem = acpi_data_push(table_data, sizeof *numamem);
> >> +        }
> >> +
> >> +        size = di->size;
> >> +
> >> +        flags = MEM_AFFINITY_ENABLED;
> >> +        if (di->hotpluggable) {
> >> +            flags |= MEM_AFFINITY_HOTPLUGGABLE;
> >> +        }
> >> +        if (is_nvdimm) {
> >> +            flags |= MEM_AFFINITY_NON_VOLATILE;
> >> +        }
> >> +
> >> +        build_srat_memory(numamem, di->addr, size, di->node, flags);
> >> +    }
> >> +
> >> +    qapi_free_MemoryDeviceInfoList(info_list);
> >> +}
> >
> > The above logic changes the SRAT entries if a DIMM is hot-added and a reboot is
> > performed and, as mentioned above, EDK2 does not seem to be happy with this.
> >
> > I had a look at the pc code and it looks like it only has one SRAT entry for the
> > whole hot-pluggable address space [2], which probably means the SRAT remains the
> > same across reboots even after memory is hot-added. I am not sure why we are doing
> > this differently for ARM.
> 
> Initially this code was inspired by the x86 code. Then Igor fixed it on x86
> with:
> dbb6da8ba7e02105bdbb33b527e088249c9843c8 ("pc: acpi: revert back to 1
> SRAT entry for hotpluggable area")

Ok. That explains it. Thanks for the info.

> So it looks like the same change must be done on ARM.

Ok. 
 
> I am currently busy with this respin. As suggested by David and Igor I
> implemented the initial RAM as device memory. I am now looking at NUMA
> impacts of that change.

Sure. In the meantime I will continue with my testing and will send out the RFC for
the hot-add feature on top of v5 of this series.

Thanks,
Shameer.


Thread overview: 26+ messages
2018-10-18 14:30 [Qemu-devel] [RFC v4 00/16] ARM virt: PCDIMM/NVDIMM at 2TB Eric Auger
2018-10-18 14:30 ` [Qemu-devel] [RFC v4 01/16] hw/arm/boot: introduce fdt_add_memory_node helper Eric Auger
2018-10-18 14:30 ` [Qemu-devel] [RFC v4 02/16] linux-headers: header update for KVM/ARM KVM_ARM_GET_MAX_VM_PHYS_SHIFT Eric Auger
2018-10-19  8:49   ` Suzuki K Poulose
2018-10-19 14:02     ` Auger Eric
2018-10-18 14:30 ` [Qemu-devel] [RFC v4 03/16] hw/boards: Add a MachineState parameter to kvm_type callback Eric Auger
2018-10-18 14:30 ` [Qemu-devel] [RFC v4 04/16] kvm: add kvm_arm_get_max_vm_phys_shift Eric Auger
2018-10-18 14:30 ` [Qemu-devel] [RFC v4 05/16] vl: Set machine ram_size, maxram_size and ram_slots earlier Eric Auger
2018-10-18 14:30 ` [Qemu-devel] [RFC v4 06/16] hw/arm/virt: Add virt-3.2 machine type Eric Auger
2018-10-18 14:30 ` [Qemu-devel] [RFC v4 07/16] hw/arm/virt: Implement kvm_type function for 3.2 machine Eric Auger
2018-10-19  2:58   ` Richard Henderson
2018-10-19 13:59     ` Auger Eric
2018-10-18 14:30 ` [Qemu-devel] [RFC v4 08/16] hw/arm/virt: Allocate device_memory Eric Auger
2018-10-18 14:30 ` [Qemu-devel] [RFC v4 09/16] hw/arm/virt: Add memory hotplug framework Eric Auger
2018-10-18 14:30 ` [Qemu-devel] [RFC v4 10/16] hw/arm/boot: Expose the PC-DIMM nodes in the DT Eric Auger
2018-10-18 14:30 ` [Qemu-devel] [RFC v4 11/16] acpi: move build_srat_hotpluggable_memory to generic ACPI source Eric Auger
2019-01-21 11:44   ` Shameerali Kolothum Thodi
2019-01-21 13:58     ` Auger Eric
2019-01-21 17:09       ` Shameerali Kolothum Thodi
2018-10-18 14:30 ` [Qemu-devel] [RFC v4 12/16] hw/arm/virt-acpi-build: Add PC-DIMM in SRAT Eric Auger
2018-10-22 13:40   ` Igor Mammedov
2018-11-05 13:27     ` Auger Eric
2018-10-18 14:30 ` [Qemu-devel] [RFC v4 13/16] nvdimm: use configurable ACPI IO base and size Eric Auger
2018-10-18 14:30 ` [Qemu-devel] [RFC v4 14/16] hw/arm/virt: Add nvdimm hot-plug infrastructure Eric Auger
2018-10-18 14:30 ` [Qemu-devel] [RFC v4 15/16] hw/arm/boot: Expose the pmem nodes in the DT Eric Auger
2018-10-18 14:30 ` [Qemu-devel] [RFC v4 16/16] hw/arm/virt: Add nvdimm and nvdimm-persistence options Eric Auger