* [Qemu-devel] [RFC PATCH v1 00/13] CPU and Memory hotplug for PowerPC guests
@ 2015-01-08  6:10 Bharata B Rao
  2015-01-08  6:10 ` [Qemu-devel] [RFC PATCH v1 01/13] spapr: enable PHB/CPU/LMB hotplug for pseries-2.3 Bharata B Rao
                   ` (14 more replies)
  0 siblings, 15 replies; 50+ messages in thread
From: Bharata B Rao @ 2015-01-08  6:10 UTC (permalink / raw)
  To: qemu-devel; +Cc: imammedo, Bharata B Rao, mdroth, agraf

This patchset enables CPU and memory hotplug support for PowerPC guests.

Changes in this patchset (v1):

- Based on top of Michael Roth's tree
  (https://github.com/mdroth/qemu/commits/spapr-hotplug-core) which serves
  as base for his PCI hotplug patches too.
- Switched to device_add/del semantics instead of cpu-add.
- Supporting CPU hot unplug now.
- Added patches to enable memory hotplug.
- Added ibm,dynamic-reconfiguration-memory support which is needed for
  memory hotplug.

v0 - http://lists.nongnu.org/archive/html/qemu-devel/2014-09/msg00752.html

CPU hotplug
-----------
- Works with BE guest, has issues with LE guest. Has been tested on BE host
  only.
- Adding a core (and all its threads) in response to device_add command.
  Similarly removing a core via device_del will remove all the threads.
- Using Gu Zheng's "reclaim vCPU objects" patch to remove and reuse
  the vCPU objects after CPUs removal.
  (Gu Zheng's original patch:
   http://lists.gnu.org/archive/html/qemu-devel/2014-12/msg01829.html)

Memory hotplug
--------------
- Able to get an LMB added with the current patchset, but there are issues
  which I am still debugging.
- Re-using pc-dimm infrastructure (hw/mem/pc-dimm.c) to support memory
  hotplug on PowerPC.
- Tested with Nathan Fontenot's memory hotplug kernel patches (with additions
  to drive memory hotplug from EPOW interrupt path)
  (https://www.marc.info/?l=linuxppc-embedded&m=141626066317143&w=2)

Bharata B Rao (11):
  spapr: Add DRC dt entries for CPUs
  spapr: Consider max_cpus during xics initialization
  spapr: Factor out CPU initialization code into realizefn
  spapr: Support ibm,lrdr-capacity device tree property
  spapr: CPU hotplug support
  spapr: Start all the threads of CPU core when core is hotplugged
  spapr: Enable CPU hotplug for POWER8 CPU family
  spapr: CPU hot unplug support
  spapr: Initialize hotplug memory address space
  spapr: Support ibm,dynamic-reconfiguration-memory
  spapr: Memory hotplug support

Gu Zheng (1):
  cpus, spapr: reclaim allocated vCPU objects

Michael Roth (1):
  spapr: enable PHB/CPU/LMB hotplug for pseries-2.3

 cpus.c                            |  44 +++
 default-configs/ppc64-softmmu.mak |   1 +
 hw/ppc/spapr.c                    | 744 ++++++++++++++++++++++++++++++++++----
 hw/ppc/spapr_events.c             |  11 +-
 hw/ppc/spapr_hcall.c              |  51 ++-
 hw/ppc/spapr_rtas.c               |  28 +-
 include/hw/ppc/spapr.h            |  27 +-
 include/qom/cpu.h                 |  11 +
 include/sysemu/kvm.h              |   1 +
 kvm-all.c                         |  57 ++-
 target-ppc/translate_init.c       |  50 ++-
 11 files changed, 918 insertions(+), 107 deletions(-)

-- 
2.1.0


* [Qemu-devel] [RFC PATCH v1 01/13] spapr: enable PHB/CPU/LMB hotplug for pseries-2.3
  2015-01-08  6:10 [Qemu-devel] [RFC PATCH v1 00/13] CPU and Memory hotplug for PowerPC guests Bharata B Rao
@ 2015-01-08  6:10 ` Bharata B Rao
  2015-01-22 21:08   ` Michael Roth
  2015-01-29  1:04   ` David Gibson
  2015-01-08  6:10 ` [Qemu-devel] [RFC PATCH v1 02/13] spapr: Add DRC dt entries for CPUs Bharata B Rao
                   ` (13 subsequent siblings)
  14 siblings, 2 replies; 50+ messages in thread
From: Bharata B Rao @ 2015-01-08  6:10 UTC (permalink / raw)
  To: qemu-devel; +Cc: imammedo, Bharata B Rao, mdroth, agraf

From: Michael Roth <mdroth@linux.vnet.ibm.com>

Introduce an sPAPRMachineClass sub-class of MachineClass to
handle sPAPR-specific machine configuration properties.

The 'dr_{phb,cpu,lmb}_enabled' fields of that class can be set as
part of machine-specific init code, and are then propagated
to sPAPREnvironment to conditionally enable creation of DRC
objects and device-tree description to facilitate hotplug
of PHBs/CPUs/LMBs.

Since we can't migrate this state to older machine types,
default the option to false and only enable it for new
machine types.
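
As a rough illustration of the flow described above, the following
stand-alone sketch models the class-level defaults, the override in the
new machine type, and the copy into the per-machine environment. The
Model* types and model_* functions are hypothetical stand-ins, not the
QOM types or functions used in the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for sPAPRMachineClass (class-level flags). */
typedef struct {
    bool dr_phb_enabled;
    bool dr_cpu_enabled;
    bool dr_lmb_enabled;
} ModelMachineClass;

/* Hypothetical stand-in for sPAPREnvironment (per-machine state). */
typedef struct {
    bool dr_phb_enabled;
    bool dr_cpu_enabled;
    bool dr_lmb_enabled;
} ModelEnvironment;

/* Base class init: hotplug is off unless a machine type opts in. */
static void model_base_class_init(ModelMachineClass *mc)
{
    mc->dr_phb_enabled = false;
    mc->dr_cpu_enabled = false;
    mc->dr_lmb_enabled = false;
}

/* New machine type: inherit the base defaults, then enable hotplug. */
static void model_new_class_init(ModelMachineClass *mc)
{
    model_base_class_init(mc);
    mc->dr_phb_enabled = true;
    mc->dr_cpu_enabled = true;
    mc->dr_lmb_enabled = true;
}

/* Machine init: propagate the class flags into the environment. */
static void model_machine_init(const ModelMachineClass *mc,
                               ModelEnvironment *env)
{
    assert(mc && env);
    env->dr_phb_enabled = mc->dr_phb_enabled;
    env->dr_cpu_enabled = mc->dr_cpu_enabled;
    env->dr_lmb_enabled = mc->dr_lmb_enabled;
}
```

An older machine type that only runs the base init keeps all three
flags false, so no new state needs to be migrated for it.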

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
              [Added CPU and LMB bits]
---
 hw/ppc/spapr.c         | 32 ++++++++++++++++++++++++++++++++
 include/hw/ppc/spapr.h |  3 +++
 2 files changed, 35 insertions(+)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 9eb0a94..71e7052 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -89,11 +89,29 @@
 
 #define HTAB_SIZE(spapr)        (1ULL << ((spapr)->htab_shift))
 
+typedef struct sPAPRMachineClass sPAPRMachineClass;
 typedef struct sPAPRMachineState sPAPRMachineState;
 
 #define TYPE_SPAPR_MACHINE      "spapr-machine"
 #define SPAPR_MACHINE(obj) \
     OBJECT_CHECK(sPAPRMachineState, (obj), TYPE_SPAPR_MACHINE)
+#define SPAPR_MACHINE_GET_CLASS(obj) \
+        OBJECT_GET_CLASS(sPAPRMachineClass, obj, TYPE_SPAPR_MACHINE)
+#define SPAPR_MACHINE_CLASS(klass) \
+        OBJECT_CLASS_CHECK(sPAPRMachineClass, klass, TYPE_SPAPR_MACHINE)
+
+/**
+ * sPAPRMachineClass:
+ */
+struct sPAPRMachineClass {
+    /*< private >*/
+    MachineClass parent_class;
+
+    /*< public >*/
+    bool dr_phb_enabled; /* enable dynamic-reconfig/hotplug of PHBs */
+    bool dr_cpu_enabled; /* enable dynamic-reconfig/hotplug of CPUs */
+    bool dr_lmb_enabled; /* enable dynamic-reconfig/hotplug of LMBs */
+};
 
 /**
  * sPAPRMachineState:
@@ -1343,6 +1361,7 @@ static SaveVMHandlers savevm_htab_handlers = {
 /* pSeries LPAR / sPAPR hardware init */
 static void ppc_spapr_init(MachineState *machine)
 {
+    sPAPRMachineClass *smc = SPAPR_MACHINE_GET_CLASS(machine);
     ram_addr_t ram_size = machine->ram_size;
     const char *cpu_model = machine->cpu_model;
     const char *kernel_filename = machine->kernel_filename;
@@ -1503,6 +1522,10 @@ static void ppc_spapr_init(MachineState *machine)
     /* We always have at least the nvram device on VIO */
     spapr_create_nvram(spapr);
 
+    spapr->dr_phb_enabled = smc->dr_phb_enabled;
+    spapr->dr_cpu_enabled = smc->dr_cpu_enabled;
+    spapr->dr_lmb_enabled = smc->dr_lmb_enabled;
+
     /* Set up PCI */
     spapr_pci_rtas_init();
 
@@ -1722,6 +1745,7 @@ static void spapr_nmi(NMIState *n, int cpu_index, Error **errp)
 static void spapr_machine_class_init(ObjectClass *oc, void *data)
 {
     MachineClass *mc = MACHINE_CLASS(oc);
+    sPAPRMachineClass *smc = SPAPR_MACHINE_CLASS(oc);
     FWPathProviderClass *fwc = FW_PATH_PROVIDER_CLASS(oc);
     NMIClass *nc = NMI_CLASS(oc);
 
@@ -1733,6 +1757,9 @@ static void spapr_machine_class_init(ObjectClass *oc, void *data)
     mc->default_boot_order = NULL;
     mc->kvm_type = spapr_kvm_type;
     mc->has_dynamic_sysbus = true;
+    smc->dr_phb_enabled = false;
+    smc->dr_cpu_enabled = false;
+    smc->dr_lmb_enabled = false;
 
     fwc->get_dev_path = spapr_get_fw_dev_path;
     nc->nmi_monitor_handler = spapr_nmi;
@@ -1744,6 +1771,7 @@ static const TypeInfo spapr_machine_info = {
     .abstract      = true,
     .instance_size = sizeof(sPAPRMachineState),
     .instance_init = spapr_machine_initfn,
+    .class_size    = sizeof(sPAPRMachineClass),
     .class_init    = spapr_machine_class_init,
     .interfaces = (InterfaceInfo[]) {
         { TYPE_FW_PATH_PROVIDER },
@@ -1788,11 +1816,15 @@ static const TypeInfo spapr_machine_2_2_info = {
 static void spapr_machine_2_3_class_init(ObjectClass *oc, void *data)
 {
     MachineClass *mc = MACHINE_CLASS(oc);
+    sPAPRMachineClass *smc = SPAPR_MACHINE_CLASS(oc);
 
     mc->name = "pseries-2.3";
     mc->desc = "pSeries Logical Partition (PAPR compliant) v2.3";
     mc->alias = "pseries";
     mc->is_default = 1;
+    smc->dr_phb_enabled = true;
+    smc->dr_cpu_enabled = true;
+    smc->dr_lmb_enabled = true;
 }
 
 static const TypeInfo spapr_machine_2_3_info = {
diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
index 973193d..b1a0838 100644
--- a/include/hw/ppc/spapr.h
+++ b/include/hw/ppc/spapr.h
@@ -30,6 +30,9 @@ typedef struct sPAPREnvironment {
     uint64_t rtc_offset;
     struct PPCTimebase tb;
     bool has_graphics;
+    bool dr_phb_enabled; /* hotplug / dynamic-reconfiguration of PHBs */
+    bool dr_cpu_enabled; /* hotplug / dynamic-reconfiguration of CPUs */
+    bool dr_lmb_enabled; /* hotplug / dynamic-reconfiguration of LMBs */
 
     uint32_t check_exception_irq;
     Notifier epow_notifier;
-- 
2.1.0


* [Qemu-devel] [RFC PATCH v1 02/13] spapr: Add DRC dt entries for CPUs
  2015-01-08  6:10 [Qemu-devel] [RFC PATCH v1 00/13] CPU and Memory hotplug for PowerPC guests Bharata B Rao
  2015-01-08  6:10 ` [Qemu-devel] [RFC PATCH v1 01/13] spapr: enable PHB/CPU/LMB hotplug for pseries-2.3 Bharata B Rao
@ 2015-01-08  6:10 ` Bharata B Rao
  2015-01-22 21:21   ` Michael Roth
  2015-01-29  1:04   ` David Gibson
  2015-01-08  6:10 ` [Qemu-devel] [RFC PATCH v1 03/13] spapr: Consider max_cpus during xics initialization Bharata B Rao
                   ` (12 subsequent siblings)
  14 siblings, 2 replies; 50+ messages in thread
From: Bharata B Rao @ 2015-01-08  6:10 UTC (permalink / raw)
  To: qemu-devel; +Cc: imammedo, Bharata B Rao, mdroth, agraf

Advertise CPU DR-capability to the guest via device tree.
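
The connector numbering used by this patch (one DR connector per
possible core, with id i * smt) can be sketched in isolation as below.
model_cpu_drc_ids() is a hypothetical helper written for illustration
only; the patch itself calls spapr_dr_connector_new() in a loop.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Fill ids[] with one connector id per possible core and return the
 * number of ids written (at most cap).
 *
 * max_cpus    - maximum number of vCPU threads the guest may have
 * smp_threads - guest threads per core
 * smt         - host SMT threads per core, used as the id stride
 */
static size_t model_cpu_drc_ids(int max_cpus, int smp_threads, int smt,
                                int *ids, size_t cap)
{
    size_t cores, n, i;

    assert(max_cpus >= 0 && smp_threads > 0 && smt > 0);
    cores = (size_t)(max_cpus / smp_threads);
    n = cores < cap ? cores : cap;

    for (i = 0; i < n; i++) {
        ids[i] = (int)i * smt;
    }
    return n;
}
```

For example, with max_cpus = 16, smp_threads = 4 and smt = 8, this
yields four connectors with ids 0, 8, 16 and 24.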

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
               [spapr_drc_reset implementation]
---
 hw/ppc/spapr.c | 28 ++++++++++++++++++++++++++++
 1 file changed, 28 insertions(+)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 71e7052..98a32d0 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -807,6 +807,14 @@ static void spapr_finalize_fdt(sPAPREnvironment *spapr,
         spapr_populate_chosen_stdout(fdt, spapr->vio_bus);
     }
 
+    if (spapr->dr_cpu_enabled) {
+        int offset = fdt_path_offset(fdt, "/cpus");
+        ret = spapr_drc_populate_dt(fdt, offset, SPAPR_DR_CONNECTOR_TYPE_CPU);
+        if (ret < 0) {
+            fprintf(stderr, "Couldn't set up CPU DR device tree properties\n");
+        }
+    }
+
     _FDT((fdt_pack(fdt)));
 
     if (fdt_totalsize(fdt) > FDT_MAX_SIZE) {
@@ -1358,6 +1366,16 @@ static SaveVMHandlers savevm_htab_handlers = {
     .load_state = htab_load,
 };
 
+static void spapr_drc_reset(void *opaque)
+{
+    sPAPRDRConnector *drc = opaque;
+    DeviceState *d = DEVICE(drc);
+
+    if (d) {
+        device_reset(d);
+    }
+}
+
 /* pSeries LPAR / sPAPR hardware init */
 static void ppc_spapr_init(MachineState *machine)
 {
@@ -1383,6 +1401,7 @@ static void ppc_spapr_init(MachineState *machine)
     long load_limit, fw_size;
     bool kernel_le = false;
     char *filename;
+    int smt = kvmppc_smt_threads();
 
     msi_supported = true;
 
@@ -1526,6 +1545,15 @@ static void ppc_spapr_init(MachineState *machine)
     spapr->dr_cpu_enabled = smc->dr_cpu_enabled;
     spapr->dr_lmb_enabled = smc->dr_lmb_enabled;
 
+    if (spapr->dr_cpu_enabled) {
+        for (i = 0; i < max_cpus/smp_threads; i++) {
+            sPAPRDRConnector *drc =
+                spapr_dr_connector_new(OBJECT(machine),
+                                       SPAPR_DR_CONNECTOR_TYPE_CPU, i * smt);
+            qemu_register_reset(spapr_drc_reset, drc);
+        }
+    }
+
     /* Set up PCI */
     spapr_pci_rtas_init();
 
-- 
2.1.0


* [Qemu-devel] [RFC PATCH v1 03/13] spapr: Consider max_cpus during xics initialization
  2015-01-08  6:10 [Qemu-devel] [RFC PATCH v1 00/13] CPU and Memory hotplug for PowerPC guests Bharata B Rao
  2015-01-08  6:10 ` [Qemu-devel] [RFC PATCH v1 01/13] spapr: enable PHB/CPU/LMB hotplug for pseries-2.3 Bharata B Rao
  2015-01-08  6:10 ` [Qemu-devel] [RFC PATCH v1 02/13] spapr: Add DRC dt entries for CPUs Bharata B Rao
@ 2015-01-08  6:10 ` Bharata B Rao
  2015-01-29  1:05   ` David Gibson
  2015-01-08  6:10 ` [Qemu-devel] [RFC PATCH v1 04/13] spapr: Factor out CPU initialization code into realizefn Bharata B Rao
                   ` (11 subsequent siblings)
  14 siblings, 1 reply; 50+ messages in thread
From: Bharata B Rao @ 2015-01-08  6:10 UTC (permalink / raw)
  To: qemu-devel; +Cc: imammedo, Bharata B Rao, mdroth, agraf

Use max_cpus instead of smp_cpus when initializing the xics system. Also
report max_cpus in the ibm,interrupt-server-ranges device tree property of
the interrupt controller node.
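
To make the sizing difference concrete, here is a small worked sketch
of the server-count expression passed to xics_system_init().
model_xics_nr_servers() is a hypothetical name; only the arithmetic is
taken from the patch.

```c
#include <assert.h>

/*
 * Number of interrupt servers to allocate:
 *   cpus * host_smt / guest_threads_per_core
 * Passing max_cpus (rather than smp_cpus) for cpus reserves servers
 * for CPUs that may be hotplugged later.
 */
static int model_xics_nr_servers(int cpus, int host_smt,
                                 int guest_threads_per_core)
{
    assert(cpus >= 0 && host_smt > 0 && guest_threads_per_core > 0);
    return cpus * host_smt / guest_threads_per_core;
}
```

With smp_cpus = 4, max_cpus = 16, 4 guest threads per core and a host
SMT of 8, sizing by smp_cpus gives 8 servers, while sizing by max_cpus
gives 32.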

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
---
 hw/ppc/spapr.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 98a32d0..779d364 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -347,7 +347,7 @@ static void *spapr_create_fdt_skel(hwaddr initrd_base,
     GString *hypertas = g_string_sized_new(256);
     GString *qemu_hypertas = g_string_sized_new(256);
     uint32_t refpoints[] = {cpu_to_be32(0x4), cpu_to_be32(0x4)};
-    uint32_t interrupt_server_ranges_prop[] = {0, cpu_to_be32(smp_cpus)};
+    uint32_t interrupt_server_ranges_prop[] = {0, cpu_to_be32(max_cpus)};
     int smt = kvmppc_smt_threads();
     unsigned char vec5[] = {0x0, 0x0, 0x0, 0x0, 0x0, 0x80};
     QemuOpts *opts = qemu_opts_find(qemu_find_opts("smp-opts"), NULL);
@@ -1459,7 +1459,7 @@ static void ppc_spapr_init(MachineState *machine)
     }
 
     /* Set up Interrupt Controller before we create the VCPUs */
-    spapr->icp = xics_system_init(smp_cpus * kvmppc_smt_threads() / smp_threads,
+    spapr->icp = xics_system_init(max_cpus * kvmppc_smt_threads() / smp_threads,
                                   XICS_IRQS);
 
     /* init CPUs */
-- 
2.1.0


* [Qemu-devel] [RFC PATCH v1 04/13] spapr: Factor out CPU initialization code into realizefn
  2015-01-08  6:10 [Qemu-devel] [RFC PATCH v1 00/13] CPU and Memory hotplug for PowerPC guests Bharata B Rao
                   ` (2 preceding siblings ...)
  2015-01-08  6:10 ` [Qemu-devel] [RFC PATCH v1 03/13] spapr: Consider max_cpus during xics initialization Bharata B Rao
@ 2015-01-08  6:10 ` Bharata B Rao
  2015-01-29  1:07   ` David Gibson
  2015-01-08  6:10 ` [Qemu-devel] [RFC PATCH v1 05/13] spapr: Support ibm,lrdr-capacity device tree property Bharata B Rao
                   ` (10 subsequent siblings)
  14 siblings, 1 reply; 50+ messages in thread
From: Bharata B Rao @ 2015-01-08  6:10 UTC (permalink / raw)
  To: qemu-devel; +Cc: imammedo, Bharata B Rao, mdroth, agraf

Move some CPU initialization code from machine init function to
CPU realizefn so that it can be used from CPU hotplug path too.

With the inclusion of ppc.h in translate_init.c, the explicit *irq_init()
function definitions are no longer required, so remove them.

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
---
 hw/ppc/spapr.c              | 29 +----------------------------
 include/hw/ppc/spapr.h      |  3 +++
 target-ppc/translate_init.c | 43 ++++++++++++++++++++++++++-----------------
 3 files changed, 30 insertions(+), 45 deletions(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 779d364..f49b0fa 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -81,8 +81,6 @@
 
 #define MIN_RMA_SLOF            128UL
 
-#define TIMEBASE_FREQ           512000000ULL
-
 #define MAX_CPUS                255
 
 #define PHANDLE_XICP            0x00001111
@@ -971,7 +969,7 @@ static void ppc_spapr_reset(void)
 
 }
 
-static void spapr_cpu_reset(void *opaque)
+void spapr_cpu_reset(void *opaque)
 {
     PowerPCCPU *cpu = opaque;
     CPUState *cs = CPU(cpu);
@@ -1387,7 +1385,6 @@ static void ppc_spapr_init(MachineState *machine)
     const char *initrd_filename = machine->initrd_filename;
     const char *boot_device = machine->boot_order;
     PowerPCCPU *cpu;
-    CPUPPCState *env;
     PCIHostState *phb;
     int i;
     MemoryRegion *sysmem = get_system_memory();
@@ -1472,30 +1469,6 @@ static void ppc_spapr_init(MachineState *machine)
             fprintf(stderr, "Unable to find PowerPC CPU definition\n");
             exit(1);
         }
-        env = &cpu->env;
-
-        /* Set time-base frequency to 512 MHz */
-        cpu_ppc_tb_init(env, TIMEBASE_FREQ);
-
-        /* PAPR always has exception vectors in RAM not ROM. To ensure this,
-         * MSR[IP] should never be set.
-         */
-        env->msr_mask &= ~(1 << 6);
-
-        /* Tell KVM that we're in PAPR mode */
-        if (kvm_enabled()) {
-            kvmppc_set_papr(cpu);
-        }
-
-        if (cpu->max_compat) {
-            if (ppc_set_compat(cpu, cpu->max_compat) < 0) {
-                exit(1);
-            }
-        }
-
-        xics_cpu_setup(spapr->icp, cpu);
-
-        qemu_register_reset(spapr_cpu_reset, cpu);
     }
 
     /* allocate RAM */
diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
index b1a0838..831db6b 100644
--- a/include/hw/ppc/spapr.h
+++ b/include/hw/ppc/spapr.h
@@ -478,6 +478,8 @@ struct sPAPRTCETable {
     QLIST_ENTRY(sPAPRTCETable) list;
 };
 
+#define TIMEBASE_FREQ           512000000ULL
+
 void spapr_events_init(sPAPREnvironment *spapr);
 void spapr_events_fdt_skel(void *fdt, uint32_t epow_irq);
 int spapr_h_cas_compose_response(target_ulong addr, target_ulong size);
@@ -494,5 +496,6 @@ int spapr_tcet_dma_dt(void *fdt, int node_off, const char *propname,
                       sPAPRTCETable *tcet);
 void spapr_hotplug_req_add_event(sPAPRDRConnector *drc);
 void spapr_hotplug_req_remove_event(sPAPRDRConnector *drc);
+void spapr_cpu_reset(void *opaque);
 
 #endif /* !defined (__HW_SPAPR_H__) */
diff --git a/target-ppc/translate_init.c b/target-ppc/translate_init.c
index 72cc9d0..9c642a5 100644
--- a/target-ppc/translate_init.c
+++ b/target-ppc/translate_init.c
@@ -30,29 +30,14 @@
 #include "qemu/error-report.h"
 #include "qapi/visitor.h"
 #include "hw/qdev-properties.h"
+#include "hw/ppc/spapr.h"
+#include "hw/ppc/ppc.h"
 
 //#define PPC_DUMP_CPU
 //#define PPC_DEBUG_SPR
 //#define PPC_DUMP_SPR_ACCESSES
 /* #define USE_APPLE_GDB */
 
-/* For user-mode emulation, we don't emulate any IRQ controller */
-#if defined(CONFIG_USER_ONLY)
-#define PPC_IRQ_INIT_FN(name)                                                 \
-static inline void glue(glue(ppc, name),_irq_init) (CPUPPCState *env)         \
-{                                                                             \
-}
-#else
-#define PPC_IRQ_INIT_FN(name)                                                 \
-void glue(glue(ppc, name),_irq_init) (CPUPPCState *env);
-#endif
-
-PPC_IRQ_INIT_FN(40x);
-PPC_IRQ_INIT_FN(6xx);
-PPC_IRQ_INIT_FN(970);
-PPC_IRQ_INIT_FN(POWER7);
-PPC_IRQ_INIT_FN(e500);
-
 /* Generic callbacks:
  * do nothing but store/retrieve spr value
  */
@@ -8905,6 +8890,7 @@ static void ppc_cpu_realizefn(DeviceState *dev, Error **errp)
     CPUState *cs = CPU(dev);
     PowerPCCPU *cpu = POWERPC_CPU(dev);
     PowerPCCPUClass *pcc = POWERPC_CPU_GET_CLASS(cpu);
+    CPUPPCState *env = &cpu->env;
     Error *local_err = NULL;
 #if !defined(CONFIG_USER_ONLY)
     int max_smt = kvm_enabled() ? kvmppc_smt_threads() : 1;
@@ -8965,6 +8951,29 @@ static void ppc_cpu_realizefn(DeviceState *dev, Error **errp)
 
     qemu_init_vcpu(cs);
 
+    /* Set time-base frequency to 512 MHz */
+    cpu_ppc_tb_init(env, TIMEBASE_FREQ);
+
+    /* PAPR always has exception vectors in RAM not ROM. To ensure this,
+     * MSR[IP] should never be set.
+     */
+    env->msr_mask &= ~(1 << 6);
+
+    /* Tell KVM that we're in PAPR mode */
+    if (kvm_enabled()) {
+        kvmppc_set_papr(cpu);
+    }
+
+    if (cpu->max_compat) {
+        if (ppc_set_compat(cpu, cpu->max_compat) < 0) {
+            exit(1);
+        }
+    }
+
+    xics_cpu_setup(spapr->icp, cpu);
+
+    qemu_register_reset(spapr_cpu_reset, cpu);
+
     pcc->parent_realize(dev, errp);
 
 #if defined(PPC_DUMP_CPU)
-- 
2.1.0


* [Qemu-devel] [RFC PATCH v1 05/13] spapr: Support ibm,lrdr-capacity device tree property
  2015-01-08  6:10 [Qemu-devel] [RFC PATCH v1 00/13] CPU and Memory hotplug for PowerPC guests Bharata B Rao
                   ` (3 preceding siblings ...)
  2015-01-08  6:10 ` [Qemu-devel] [RFC PATCH v1 04/13] spapr: Factor out CPU initialization code into realizefn Bharata B Rao
@ 2015-01-08  6:10 ` Bharata B Rao
  2015-01-22 21:55   ` Michael Roth
  2015-01-29  1:16   ` David Gibson
  2015-01-08  6:10 ` [Qemu-devel] [RFC PATCH v1 06/13] spapr: CPU hotplug support Bharata B Rao
                   ` (9 subsequent siblings)
  14 siblings, 2 replies; 50+ messages in thread
From: Bharata B Rao @ 2015-01-08  6:10 UTC (permalink / raw)
  To: qemu-devel; +Cc: imammedo, Bharata B Rao, mdroth, agraf

Add support for ibm,lrdr-capacity since this is needed by the guest
kernel to know about the possible hot-pluggable CPUs and Memory.

Define minimum hotpluggable memory size as 256MB and start storing maximum
possible memory for the guest in sPAPREnvironment.
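
The five-cell layout written by this patch (maximum memory as two
32-bit cells, minimum block size as two cells, then the maximum number
of cores) can be sketched stand-alone as follows. The model_* names
and MODEL_MIN_MEMORY_BLOCK_SIZE are hypothetical; the byte-swap helper
is a local stand-in for cpu_to_be32() so that the example does not
depend on QEMU headers.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MODEL_MIN_MEMORY_BLOCK_SIZE (1ULL << 28) /* 256MB */

/* Return v laid out as big-endian bytes, whatever the host endianness. */
static uint32_t model_cpu_to_be32(uint32_t v)
{
    uint8_t b[4];
    uint32_t out;

    b[0] = (uint8_t)(v >> 24);
    b[1] = (uint8_t)(v >> 16);
    b[2] = (uint8_t)(v >> 8);
    b[3] = (uint8_t)v;
    memcpy(&out, b, sizeof(out));
    return out;
}

/* Pack the property cells: max RAM (hi, lo), block size (hi, lo), cores. */
static void model_lrdr_capacity(uint64_t maxram, uint32_t max_cores,
                                uint32_t cells[5])
{
    assert(cells != NULL);
    cells[0] = model_cpu_to_be32((uint32_t)(maxram >> 32));
    cells[1] = model_cpu_to_be32((uint32_t)(maxram & 0xffffffffULL));
    cells[2] = model_cpu_to_be32(
        (uint32_t)(MODEL_MIN_MEMORY_BLOCK_SIZE >> 32));
    cells[3] = model_cpu_to_be32(
        (uint32_t)(MODEL_MIN_MEMORY_BLOCK_SIZE & 0xffffffffULL));
    cells[4] = model_cpu_to_be32(max_cores);
}
```

The patch hard-codes the third cell to 0, which matches this sketch as
long as the block size fits in 32 bits.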

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
---
 hw/ppc/spapr.c         |  3 ++-
 hw/ppc/spapr_rtas.c    | 28 ++++++++++++++++++++++++++--
 include/hw/ppc/spapr.h |  6 ++++--
 3 files changed, 32 insertions(+), 5 deletions(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index f49b0fa..515d770 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -775,7 +775,7 @@ static void spapr_finalize_fdt(sPAPREnvironment *spapr,
     }
 
     /* RTAS */
-    ret = spapr_rtas_device_tree_setup(fdt, rtas_addr, rtas_size);
+    ret = spapr_rtas_device_tree_setup(spapr, fdt, rtas_addr, rtas_size);
     if (ret < 0) {
         fprintf(stderr, "Couldn't set up RTAS device tree properties\n");
     }
@@ -1473,6 +1473,7 @@ static void ppc_spapr_init(MachineState *machine)
 
     /* allocate RAM */
     spapr->ram_limit = ram_size;
+    spapr->maxram_limit = machine->maxram_size;
     memory_region_allocate_system_memory(ram, NULL, "ppc_spapr.ram",
                                          spapr->ram_limit);
     memory_region_add_subregion(sysmem, 0, ram);
diff --git a/hw/ppc/spapr_rtas.c b/hw/ppc/spapr_rtas.c
index d847f45..e8a0f21 100644
--- a/hw/ppc/spapr_rtas.c
+++ b/hw/ppc/spapr_rtas.c
@@ -29,6 +29,7 @@
 #include "sysemu/char.h"
 #include "hw/qdev.h"
 #include "sysemu/device_tree.h"
+#include "sysemu/cpus.h"
 
 #include "hw/ppc/spapr.h"
 #include "hw/ppc/spapr_vio.h"
@@ -551,11 +552,12 @@ void spapr_rtas_register(int token, const char *name, spapr_rtas_fn fn)
     rtas_table[token].fn = fn;
 }
 
-int spapr_rtas_device_tree_setup(void *fdt, hwaddr rtas_addr,
-                                 hwaddr rtas_size)
+int spapr_rtas_device_tree_setup(sPAPREnvironment *spapr, void *fdt,
+                                 hwaddr rtas_addr, hwaddr rtas_size)
 {
     int ret;
     int i;
+    uint32_t lrdr_capacity[5];
 
     ret = fdt_add_mem_rsv(fdt, rtas_addr, rtas_size);
     if (ret < 0) {
@@ -604,6 +606,28 @@ int spapr_rtas_device_tree_setup(void *fdt, hwaddr rtas_addr,
         }
 
     }
+
+    ret = qemu_fdt_setprop_cell(fdt, "/rtas", "#address-cells", 0x2);
+    if (ret < 0) {
+        fprintf(stderr, "Couldn't add #address-cells rtas property\n");
+    }
+
+    ret = qemu_fdt_setprop_cell(fdt, "/rtas", "#size-cells", 0x2);
+    if (ret < 0) {
+        fprintf(stderr, "Couldn't add #size-cells rtas property\n");
+    }
+
+    lrdr_capacity[0] = cpu_to_be32(spapr->maxram_limit >> 32);
+    lrdr_capacity[1] = cpu_to_be32(spapr->maxram_limit & 0xffffffff);
+    lrdr_capacity[2] = 0;
+    lrdr_capacity[3] = cpu_to_be32(SPAPR_MIN_MEMORY_BLOCK_SIZE);
+    lrdr_capacity[4] = cpu_to_be32(max_cpus/smp_threads);
+    ret = qemu_fdt_setprop(fdt, "/rtas", "ibm,lrdr-capacity", lrdr_capacity,
+                     sizeof(lrdr_capacity));
+    if (ret < 0) {
+        fprintf(stderr, "Couldn't add ibm,lrdr-capacity rtas property\n");
+    }
+
     return 0;
 }
 
diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
index 831db6b..ae8b4e1 100644
--- a/include/hw/ppc/spapr.h
+++ b/include/hw/ppc/spapr.h
@@ -18,6 +18,7 @@ typedef struct sPAPREnvironment {
     XICSState *icp;
 
     hwaddr ram_limit;
+    hwaddr maxram_limit;
     void *htab;
     uint32_t htab_shift;
     hwaddr rma_size;
@@ -444,8 +445,8 @@ void spapr_rtas_register(int token, const char *name, spapr_rtas_fn fn);
 target_ulong spapr_rtas_call(PowerPCCPU *cpu, sPAPREnvironment *spapr,
                              uint32_t token, uint32_t nargs, target_ulong args,
                              uint32_t nret, target_ulong rets);
-int spapr_rtas_device_tree_setup(void *fdt, hwaddr rtas_addr,
-                                 hwaddr rtas_size);
+int spapr_rtas_device_tree_setup(sPAPREnvironment *spapr, void *fdt,
+                                 hwaddr rtas_addr, hwaddr rtas_size);
 
 #define SPAPR_TCE_PAGE_SHIFT   12
 #define SPAPR_TCE_PAGE_SIZE    (1ULL << SPAPR_TCE_PAGE_SHIFT)
@@ -479,6 +480,7 @@ struct sPAPRTCETable {
 };
 
 #define TIMEBASE_FREQ           512000000ULL
+#define SPAPR_MIN_MEMORY_BLOCK_SIZE (1 << 28) /* 256MB */
 
 void spapr_events_init(sPAPREnvironment *spapr);
 void spapr_events_fdt_skel(void *fdt, uint32_t epow_irq);
-- 
2.1.0


* [Qemu-devel] [RFC PATCH v1 06/13] spapr: CPU hotplug support
  2015-01-08  6:10 [Qemu-devel] [RFC PATCH v1 00/13] CPU and Memory hotplug for PowerPC guests Bharata B Rao
                   ` (4 preceding siblings ...)
  2015-01-08  6:10 ` [Qemu-devel] [RFC PATCH v1 05/13] spapr: Support ibm,lrdr-capacity device tree property Bharata B Rao
@ 2015-01-08  6:10 ` Bharata B Rao
  2015-01-22 22:16   ` Michael Roth
                     ` (2 more replies)
  2015-01-08  6:10 ` [Qemu-devel] [RFC PATCH v1 07/13] spapr: Start all the threads of CPU core when core is hotplugged Bharata B Rao
                   ` (8 subsequent siblings)
  14 siblings, 3 replies; 50+ messages in thread
From: Bharata B Rao @ 2015-01-08  6:10 UTC (permalink / raw)
  To: qemu-devel; +Cc: imammedo, Bharata B Rao, mdroth, agraf

Support CPU hotplug via the device_add command. Use the existing EPOW event
infrastructure to send CPU hotplug notification to the guest.

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
---
 hw/ppc/spapr.c              | 205 +++++++++++++++++++++++++++++++++++++++++++-
 hw/ppc/spapr_events.c       |   8 +-
 target-ppc/translate_init.c |   6 ++
 3 files changed, 215 insertions(+), 4 deletions(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 515d770..a293a59 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -330,6 +330,8 @@ static void add_str(GString *s, const gchar *s1)
     g_string_append_len(s, s1, strlen(s1) + 1);
 }
 
+uint32_t cpus_per_socket;
+
 static void *spapr_create_fdt_skel(hwaddr initrd_base,
                                    hwaddr initrd_size,
                                    hwaddr kernel_size,
@@ -350,9 +352,9 @@ static void *spapr_create_fdt_skel(hwaddr initrd_base,
     unsigned char vec5[] = {0x0, 0x0, 0x0, 0x0, 0x0, 0x80};
     QemuOpts *opts = qemu_opts_find(qemu_find_opts("smp-opts"), NULL);
     unsigned sockets = opts ? qemu_opt_get_number(opts, "sockets", 0) : 0;
-    uint32_t cpus_per_socket = sockets ? (smp_cpus / sockets) : 1;
     char *buf;
 
+    cpus_per_socket = sockets ? (smp_cpus / sockets) : 1;
     add_str(hypertas, "hcall-pft");
     add_str(hypertas, "hcall-term");
     add_str(hypertas, "hcall-dabr");
@@ -1744,12 +1746,209 @@ static void spapr_nmi(NMIState *n, int cpu_index, Error **errp)
     }
 }
 
+/* TODO: Duplicates code from spapr_create_fdt_skel(), Fix this */
+static void spapr_populate_cpu_dt(CPUState *cs, void *fdt, int offset,
+            int drc_index)
+{
+    PowerPCCPU *cpu = POWERPC_CPU(cs);
+    CPUPPCState *env = &cpu->env;
+    PowerPCCPUClass *pcc = POWERPC_CPU_GET_CLASS(cs);
+    int index = ppc_get_vcpu_dt_id(cpu);
+    uint32_t segs[] = {cpu_to_be32(28), cpu_to_be32(40),
+                       0xffffffff, 0xffffffff};
+    uint32_t tbfreq = kvm_enabled() ? kvmppc_get_tbfreq() : TIMEBASE_FREQ;
+    uint32_t cpufreq = kvm_enabled() ? kvmppc_get_clockfreq() : 1000000000;
+    uint32_t page_sizes_prop[64];
+    size_t page_sizes_prop_size;
+    int smpt = ppc_get_compat_smt_threads(cpu);
+    uint32_t servers_prop[smpt];
+    uint32_t gservers_prop[smpt * 2];
+    int i;
+    uint32_t pft_size_prop[] = {0, cpu_to_be32(spapr->htab_shift)};
+    uint32_t associativity[] = {cpu_to_be32(0x5),
+                                cpu_to_be32(0x0),
+                                cpu_to_be32(0x0),
+                                cpu_to_be32(0x0),
+                                cpu_to_be32(cs->numa_node),
+                                cpu_to_be32(index)};
+
+    _FDT((fdt_setprop_cell(fdt, offset, "reg", index)));
+    _FDT((fdt_setprop_string(fdt, offset, "device_type", "cpu")));
+
+    _FDT((fdt_setprop_cell(fdt, offset, "cpu-version", env->spr[SPR_PVR])));
+    _FDT((fdt_setprop_cell(fdt, offset, "d-cache-block-size",
+                        env->dcache_line_size)));
+    _FDT((fdt_setprop_cell(fdt, offset, "d-cache-line-size",
+                        env->dcache_line_size)));
+    _FDT((fdt_setprop_cell(fdt, offset, "i-cache-block-size",
+                        env->icache_line_size)));
+    _FDT((fdt_setprop_cell(fdt, offset, "i-cache-line-size",
+                            env->icache_line_size)));
+
+    if (pcc->l1_dcache_size) {
+        _FDT((fdt_setprop_cell(fdt, offset, "d-cache-size",
+            pcc->l1_dcache_size)));
+    } else {
+        fprintf(stderr, "Warning: Unknown L1 dcache size for cpu\n");
+    }
+    if (pcc->l1_icache_size) {
+        _FDT((fdt_setprop_cell(fdt, offset, "i-cache-size",
+            pcc->l1_icache_size)));
+    } else {
+        fprintf(stderr, "Warning: Unknown L1 icache size for cpu\n");
+    }
+
+    _FDT((fdt_setprop_cell(fdt, offset, "timebase-frequency", tbfreq)));
+    _FDT((fdt_setprop_cell(fdt, offset, "clock-frequency", cpufreq)));
+    _FDT((fdt_setprop_cell(fdt, offset, "ibm,slb-size", env->slb_nr)));
+    _FDT((fdt_setprop_string(fdt, offset, "status", "okay")));
+    _FDT((fdt_setprop(fdt, offset, "64-bit", NULL, 0)));
+
+    if (env->spr_cb[SPR_PURR].oea_read) {
+        _FDT((fdt_setprop(fdt, offset, "ibm,purr", NULL, 0)));
+    }
+
+    if (env->mmu_model & POWERPC_MMU_1TSEG) {
+        _FDT((fdt_setprop(fdt, offset, "ibm,processor-segment-sizes",
+                           segs, sizeof(segs))));
+    }
+
+    /* Advertise VMX/VSX (vector extensions) if available
+     *   0 / no property == no vector extensions
+     *   1               == VMX / Altivec available
+     *   2               == VSX available */
+    if (env->insns_flags & PPC_ALTIVEC) {
+        uint32_t vmx = (env->insns_flags2 & PPC2_VSX) ? 2 : 1;
+
+        _FDT((fdt_setprop_cell(fdt, offset, "ibm,vmx", vmx)));
+    }
+
+    /* Advertise DFP (Decimal Floating Point) if available
+     *   0 / no property == no DFP
+     *   1               == DFP available */
+    if (env->insns_flags2 & PPC2_DFP) {
+        _FDT((fdt_setprop_cell(fdt, offset, "ibm,dfp", 1)));
+    }
+
+    page_sizes_prop_size = create_page_sizes_prop(env, page_sizes_prop,
+                                                  sizeof(page_sizes_prop));
+    if (page_sizes_prop_size) {
+        _FDT((fdt_setprop(fdt, offset, "ibm,segment-page-sizes",
+                           page_sizes_prop, page_sizes_prop_size)));
+    }
+
+    _FDT((fdt_setprop_cell(fdt, offset, "ibm,chip-id",
+                                cs->cpu_index / cpus_per_socket)));
+
+    _FDT((fdt_setprop_cell(fdt, offset, "ibm,my-drc-index", drc_index)));
+
+    /* Build interrupt servers and gservers properties */
+    for (i = 0; i < smpt; i++) {
+        servers_prop[i] = cpu_to_be32(index + i);
+        /* Hack, direct the group queues back to cpu 0 */
+        gservers_prop[i*2] = cpu_to_be32(index + i);
+        gservers_prop[i*2 + 1] = 0;
+    }
+    _FDT(fdt_setprop(fdt, offset, "ibm,ppc-interrupt-server#s",
+                       servers_prop, sizeof(servers_prop)));
+    _FDT(fdt_setprop(fdt, offset, "ibm,ppc-interrupt-gserver#s",
+                      gservers_prop, sizeof(gservers_prop)));
+    _FDT(fdt_setprop(fdt, offset, "ibm,pft-size",
+                          pft_size_prop, sizeof(pft_size_prop)));
+
+    if (nb_numa_nodes > 1) {
+        _FDT(fdt_setprop(fdt, offset, "ibm,associativity", associativity,
+                          sizeof(associativity)));
+    }
+}
+
+static void spapr_cpu_hotplug_add(DeviceState *dev, CPUState *cs)
+{
+    PowerPCCPU *cpu = POWERPC_CPU(cs);
+    DeviceClass *dc = DEVICE_GET_CLASS(cs);
+    int id = ppc_get_vcpu_dt_id(cpu);
+    sPAPRDRConnector *drc =
+        spapr_dr_connector_by_id(SPAPR_DR_CONNECTOR_TYPE_CPU, id);
+    sPAPRDRConnectorClass *drck = SPAPR_DR_CONNECTOR_GET_CLASS(drc);
+    int drc_index = drck->get_index(drc);
+    void *fdt, *fdt_orig;
+    int offset, i;
+    char *nodename;
+
+    /* add OF node for CPU and required OF DT properties */
+    fdt_orig = g_malloc0(FDT_MAX_SIZE);
+    offset = fdt_create(fdt_orig, FDT_MAX_SIZE);
+    fdt_begin_node(fdt_orig, "");
+    fdt_end_node(fdt_orig);
+    fdt_finish(fdt_orig);
+
+    fdt = g_malloc0(FDT_MAX_SIZE);
+    fdt_open_into(fdt_orig, fdt, FDT_MAX_SIZE);
+
+    nodename = g_strdup_printf("%s@%x", dc->fw_name, id);
+
+    offset = fdt_add_subnode(fdt, offset, nodename);
+
+    /* Set NUMA node for the added CPU */
+    for (i = 0; i < nb_numa_nodes; i++) {
+        if (test_bit(cs->cpu_index, numa_info[i].node_cpu)) {
+            cs->numa_node = i;
+            break;
+        }
+    }
+
+    spapr_populate_cpu_dt(cs, fdt, offset, drc_index);
+    g_free(fdt_orig);
+    g_free(nodename);
+
+    drck->attach(drc, dev, fdt, offset, false);
+}
+
+static void spapr_cpu_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
+                            Error **errp)
+{
+    Error *local_err = NULL;
+    CPUState *cs = CPU(dev);
+    PowerPCCPU *cpu = POWERPC_CPU(cs);
+    sPAPRDRConnector *drc =
+        spapr_dr_connector_by_id(SPAPR_DR_CONNECTOR_TYPE_CPU, cpu->cpu_dt_id);
+
+    /* TODO: Check if DR is enabled ? */
+    g_assert(drc);
+
+    spapr_cpu_reset(POWERPC_CPU(CPU(dev)));
+    spapr_cpu_hotplug_add(dev, cs);
+    spapr_hotplug_req_add_event(drc);
+    error_propagate(errp, local_err);
+    return;
+}
+
+static void spapr_machine_device_plug(HotplugHandler *hotplug_dev,
+                                      DeviceState *dev, Error **errp)
+{
+    if (object_dynamic_cast(OBJECT(dev), TYPE_CPU)) {
+        if (dev->hotplugged) {
+            spapr_cpu_plug(hotplug_dev, dev, errp);
+        }
+    }
+}
+
+static HotplugHandler *spapr_get_hotpug_handler(MachineState *machine,
+                                             DeviceState *dev)
+{
+    if (object_dynamic_cast(OBJECT(dev), TYPE_CPU)) {
+        return HOTPLUG_HANDLER(machine);
+    }
+    return NULL;
+}
+
 static void spapr_machine_class_init(ObjectClass *oc, void *data)
 {
     MachineClass *mc = MACHINE_CLASS(oc);
     sPAPRMachineClass *smc = SPAPR_MACHINE_CLASS(oc);
     FWPathProviderClass *fwc = FW_PATH_PROVIDER_CLASS(oc);
     NMIClass *nc = NMI_CLASS(oc);
+    HotplugHandlerClass *hc = HOTPLUG_HANDLER_CLASS(oc);
 
     mc->init = ppc_spapr_init;
     mc->reset = ppc_spapr_reset;
@@ -1759,6 +1958,9 @@ static void spapr_machine_class_init(ObjectClass *oc, void *data)
     mc->default_boot_order = NULL;
     mc->kvm_type = spapr_kvm_type;
     mc->has_dynamic_sysbus = true;
+    mc->get_hotplug_handler = spapr_get_hotpug_handler;
+
+    hc->plug = spapr_machine_device_plug;
     smc->dr_phb_enabled = false;
     smc->dr_cpu_enabled = false;
     smc->dr_lmb_enabled = false;
@@ -1778,6 +1980,7 @@ static const TypeInfo spapr_machine_info = {
     .interfaces = (InterfaceInfo[]) {
         { TYPE_FW_PATH_PROVIDER },
         { TYPE_NMI },
+        { TYPE_HOTPLUG_HANDLER },
         { }
     },
 };
diff --git a/hw/ppc/spapr_events.c b/hw/ppc/spapr_events.c
index 434a75d..035d8c9 100644
--- a/hw/ppc/spapr_events.c
+++ b/hw/ppc/spapr_events.c
@@ -364,14 +364,16 @@ static void spapr_hotplug_req_event(sPAPRDRConnector *drc, uint8_t hp_action)
     hp->hdr.section_length = cpu_to_be16(sizeof(*hp));
     hp->hdr.section_version = 1; /* includes extended modifier */
     hp->hotplug_action = hp_action;
-
+    hp->drc.index = cpu_to_be32(drck->get_index(drc));
+    hp->hotplug_identifier = RTAS_LOG_V6_HP_ID_DRC_INDEX;
 
     switch (drc_type) {
     case SPAPR_DR_CONNECTOR_TYPE_PCI:
-        hp->drc.index = cpu_to_be32(drck->get_index(drc));
-        hp->hotplug_identifier = RTAS_LOG_V6_HP_ID_DRC_INDEX;
         hp->hotplug_type = RTAS_LOG_V6_HP_TYPE_PCI;
         break;
+    case SPAPR_DR_CONNECTOR_TYPE_CPU:
+        hp->hotplug_type = RTAS_LOG_V6_HP_TYPE_CPU;
+        break;
     default:
         /* skip notification for unknown connector types */
         g_free(new_hp);
diff --git a/target-ppc/translate_init.c b/target-ppc/translate_init.c
index 9c642a5..cf9d8d3 100644
--- a/target-ppc/translate_init.c
+++ b/target-ppc/translate_init.c
@@ -32,6 +32,7 @@
 #include "hw/qdev-properties.h"
 #include "hw/ppc/spapr.h"
 #include "hw/ppc/ppc.h"
+#include "sysemu/sysemu.h"
 
 //#define PPC_DUMP_CPU
 //#define PPC_DEBUG_SPR
@@ -8909,6 +8910,11 @@ static void ppc_cpu_realizefn(DeviceState *dev, Error **errp)
         return;
     }
 
+    if (cs->cpu_index >= max_cpus) {
+        error_setg(errp, "Can't have more CPUs, maxcpus limit reached");
+        return;
+    }
+
     cpu->cpu_dt_id = (cs->cpu_index / smp_threads) * max_smt
         + (cs->cpu_index % smp_threads);
 #endif
-- 
2.1.0

* [Qemu-devel] [RFC PATCH v1 07/13] spapr: Start all the threads of CPU core when core is hotplugged
  2015-01-08  6:10 [Qemu-devel] [RFC PATCH v1 00/13] CPU and Memory hotplug for PowerPC guests Bharata B Rao
                   ` (5 preceding siblings ...)
  2015-01-08  6:10 ` [Qemu-devel] [RFC PATCH v1 06/13] spapr: CPU hotplug support Bharata B Rao
@ 2015-01-08  6:10 ` Bharata B Rao
  2015-01-29  1:36   ` David Gibson
  2015-01-08  6:10 ` [Qemu-devel] [RFC PATCH v1 08/13] spapr: Enable CPU hotplug for POWER8 CPU family Bharata B Rao
                   ` (7 subsequent siblings)
  14 siblings, 1 reply; 50+ messages in thread
From: Bharata B Rao @ 2015-01-08  6:10 UTC (permalink / raw)
  To: qemu-devel; +Cc: imammedo, Bharata B Rao, mdroth, agraf

The PowerPC kernel adds and removes CPUs at core granularity, and hence
onlines/offlines all the SMT threads of a core during hot plug/unplug.
Support this by starting all the SMT threads of a core when the core
is hotplugged.

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
---
 hw/ppc/spapr.c | 25 +++++++++++++++++++++++++
 1 file changed, 25 insertions(+)
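
For readers following the core/thread split, here is a minimal standalone
sketch (not part of the patch) of the id arithmetic this change relies on.
It assumes the device-tree id formula used in ppc_cpu_realizefn() (visible
as context at the end of patch 06) and the (id % smt) test added in the hunk
below. The helper names vcpu_dt_id() and is_core_leader() are made up for
illustration.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Device-tree id of a vCPU: cores are spaced max_smt ids apart and the
 * threads of one core get consecutive ids. Hypothetical helper that
 * mirrors the expression in ppc_cpu_realizefn().
 */
static int vcpu_dt_id(int cpu_index, int smp_threads, int max_smt)
{
    assert(smp_threads > 0 && max_smt >= smp_threads);
    return (cpu_index / smp_threads) * max_smt + (cpu_index % smp_threads);
}

/*
 * Only the first thread of a core has a dt id that is a multiple of smt;
 * this is the thread that goes on to create its siblings.
 */
static bool is_core_leader(int dt_id, int smt)
{
    assert(smt > 0);
    return (dt_id % smt) == 0;
}
```

With smp_threads = 4 and max_smt = 8, cpu_index 4 maps to dt id 8 and is the
thread that continues in spapr_cpu_plug(); cpu_index 5 maps to dt id 9 and
returns early.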

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index a293a59..4347471 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -1376,6 +1376,8 @@ static void spapr_drc_reset(void *opaque)
     }
 }
 
+static const char *current_cpu_model;
+
 /* pSeries LPAR / sPAPR hardware init */
 static void ppc_spapr_init(MachineState *machine)
 {
@@ -1473,6 +1475,8 @@ static void ppc_spapr_init(MachineState *machine)
         }
     }
 
+    current_cpu_model = cpu_model;
+
     /* allocate RAM */
     spapr->ram_limit = ram_size;
     spapr->maxram_limit = machine->maxram_size;
@@ -1912,10 +1916,31 @@ static void spapr_cpu_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
     PowerPCCPU *cpu = POWERPC_CPU(cs);
     sPAPRDRConnector *drc =
         spapr_dr_connector_by_id(SPAPR_DR_CONNECTOR_TYPE_CPU, cpu->cpu_dt_id);
+    int id = ppc_get_vcpu_dt_id(cpu);
+    int smt = kvmppc_smt_threads();
+    int i;
+
+    /*
+     * Secondary SMT threads return from here; only the first thread of the
+     * core continues, creates the rest and signals the hotplug event.
+     */
+    if ((id % smt) != 0) {
+        return;
+    }
 
     /* TODO: Check if DR is enabled ? */
     g_assert(drc);
 
+    /* Start the rest of the SMT threads of the hotplugged core */
+    for (i = 1; i < smp_threads; i++) {
+        cpu = cpu_ppc_init(current_cpu_model);
+        if (cpu == NULL) {
+            fprintf(stderr, "Unable to find PowerPC CPU definition\n");
+            exit(1);
+        }
+        spapr_cpu_reset(cpu);
+    }
+
     spapr_cpu_reset(POWERPC_CPU(CPU(dev)));
     spapr_cpu_hotplug_add(dev, cs);
     spapr_hotplug_req_add_event(drc);
-- 
2.1.0

* [Qemu-devel] [RFC PATCH v1 08/13] spapr: Enable CPU hotplug for POWER8 CPU family
  2015-01-08  6:10 [Qemu-devel] [RFC PATCH v1 00/13] CPU and Memory hotplug for PowerPC guests Bharata B Rao
                   ` (6 preceding siblings ...)
  2015-01-08  6:10 ` [Qemu-devel] [RFC PATCH v1 07/13] spapr: Start all the threads of CPU core when core is hotplugged Bharata B Rao
@ 2015-01-08  6:10 ` Bharata B Rao
  2015-01-08  6:10 ` [Qemu-devel] [RFC PATCH v1 09/13] spapr: CPU hot unplug support Bharata B Rao
                   ` (6 subsequent siblings)
  14 siblings, 0 replies; 50+ messages in thread
From: Bharata B Rao @ 2015-01-08  6:10 UTC (permalink / raw)
  To: qemu-devel; +Cc: imammedo, Bharata B Rao, mdroth, agraf

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
---
 target-ppc/translate_init.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/target-ppc/translate_init.c b/target-ppc/translate_init.c
index cf9d8d3..cda706b 100644
--- a/target-ppc/translate_init.c
+++ b/target-ppc/translate_init.c
@@ -8184,6 +8184,7 @@ POWERPC_FAMILY(POWER8)(ObjectClass *oc, void *data)
     dc->fw_name = "PowerPC,POWER8";
     dc->desc = "POWER8";
     dc->props = powerpc_servercpu_properties;
+    dc->cannot_instantiate_with_device_add_yet = false;
     pcc->pvr_match = ppc_pvr_match_power8;
     pcc->pcr_mask = PCR_COMPAT_2_05 | PCR_COMPAT_2_06;
     pcc->init_proc = init_proc_POWER8;
-- 
2.1.0

* [Qemu-devel] [RFC PATCH v1 09/13] spapr: CPU hot unplug support
  2015-01-08  6:10 [Qemu-devel] [RFC PATCH v1 00/13] CPU and Memory hotplug for PowerPC guests Bharata B Rao
                   ` (7 preceding siblings ...)
  2015-01-08  6:10 ` [Qemu-devel] [RFC PATCH v1 08/13] spapr: Enable CPU hotplug for POWER8 CPU family Bharata B Rao
@ 2015-01-08  6:10 ` Bharata B Rao
  2015-01-29  1:39   ` David Gibson
  2015-01-08  6:10 ` [Qemu-devel] [RFC PATCH v1 10/13] cpus, spapr: reclaim allocated vCPU objects Bharata B Rao
                   ` (5 subsequent siblings)
  14 siblings, 1 reply; 50+ messages in thread
From: Bharata B Rao @ 2015-01-08  6:10 UTC (permalink / raw)
  To: qemu-devel; +Cc: imammedo, Bharata B Rao, mdroth, agraf

Support hot removal of CPUs for sPAPR guests.

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
---
 hw/ppc/spapr.c | 43 +++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 43 insertions(+)
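
The unplug path hands a release callback to drck->detach() instead of
tearing the CPU down directly. A rough sketch of that callback pattern
follows. It assumes the connector defers the release until the guest has
given up the CPU; that timing is an assumption drawn from the callback
argument, not something this patch states. struct connector and the
connector_* helpers are hypothetical and are not the sPAPRDRConnector API.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*release_fn)(void *dev, void *opaque);

/* Hypothetical stand-in for a DR connector */
struct connector {
    void *dev;           /* attached device, NULL once released */
    release_fn release;  /* pending release callback, if any */
    void *opaque;
};

/* Record the release callback; nothing is torn down yet */
static void connector_detach(struct connector *c, release_fn fn, void *opaque)
{
    c->release = fn;
    c->opaque = opaque;
}

/*
 * Assumed to run once the guest has let go of the resource.
 * Returns 1 if the callback ran, 0 if there was nothing to release.
 */
static int connector_guest_released(struct connector *c)
{
    if (c->dev == NULL || c->release == NULL) {
        return 0;
    }
    c->release(c->dev, c->opaque);
    c->dev = NULL;
    c->release = NULL;
    return 1;
}

/* Example callback: counts how many times a release happened */
static void count_release(void *dev, void *opaque)
{
    (void)dev;
    ++*(int *)opaque;
}
```

In this series the callback is spapr_cpu_release(), which is an empty stub
here and is filled in by patch 10.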

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 4347471..ec793b1 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -1908,6 +1908,22 @@ static void spapr_cpu_hotplug_add(DeviceState *dev, CPUState *cs)
     drck->attach(drc, dev, fdt, offset, false);
 }
 
+static void spapr_cpu_release(DeviceState *dev, void *opaque)
+{
+    /* Release vCPU */
+}
+
+static void spapr_cpu_hotplug_remove(DeviceState *dev, CPUState *cs)
+{
+    PowerPCCPU *cpu = POWERPC_CPU(cs);
+    int id = ppc_get_vcpu_dt_id(cpu);
+    sPAPRDRConnector *drc =
+        spapr_dr_connector_by_id(SPAPR_DR_CONNECTOR_TYPE_CPU, id);
+    sPAPRDRConnectorClass *drck = SPAPR_DR_CONNECTOR_GET_CLASS(drc);
+
+    drck->detach(drc, dev, spapr_cpu_release, NULL);
+}
+
 static void spapr_cpu_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
                             Error **errp)
 {
@@ -1948,6 +1964,21 @@ static void spapr_cpu_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
     return;
 }
 
+static void spapr_cpu_unplug(HotplugHandler *hotplug_dev, DeviceState *dev,
+                            Error **errp)
+{
+    Error *local_err = NULL;
+    CPUState *cs = CPU(dev);
+    PowerPCCPU *cpu = POWERPC_CPU(cs);
+    sPAPRDRConnector *drc =
+        spapr_dr_connector_by_id(SPAPR_DR_CONNECTOR_TYPE_CPU, cpu->cpu_dt_id);
+
+    spapr_cpu_hotplug_remove(dev, cs);
+    spapr_hotplug_req_remove_event(drc);
+    error_propagate(errp, local_err);
+    return;
+}
+
 static void spapr_machine_device_plug(HotplugHandler *hotplug_dev,
                                       DeviceState *dev, Error **errp)
 {
@@ -1958,6 +1989,16 @@ static void spapr_machine_device_plug(HotplugHandler *hotplug_dev,
     }
 }
 
+static void spapr_machine_device_unplug(HotplugHandler *hotplug_dev,
+                                      DeviceState *dev, Error **errp)
+{
+    if (object_dynamic_cast(OBJECT(dev), TYPE_CPU)) {
+        if (dev->hotplugged) {
+            spapr_cpu_unplug(hotplug_dev, dev, errp);
+        }
+    }
+}
+
 static HotplugHandler *spapr_get_hotpug_handler(MachineState *machine,
                                              DeviceState *dev)
 {
@@ -1986,6 +2027,8 @@ static void spapr_machine_class_init(ObjectClass *oc, void *data)
     mc->get_hotplug_handler = spapr_get_hotpug_handler;
 
     hc->plug = spapr_machine_device_plug;
+    hc->unplug = spapr_machine_device_unplug;
+
     smc->dr_phb_enabled = false;
     smc->dr_cpu_enabled = false;
     smc->dr_lmb_enabled = false;
-- 
2.1.0

* [Qemu-devel] [RFC PATCH v1 10/13] cpus, spapr: reclaim allocated vCPU objects
  2015-01-08  6:10 [Qemu-devel] [RFC PATCH v1 00/13] CPU and Memory hotplug for PowerPC guests Bharata B Rao
                   ` (8 preceding siblings ...)
  2015-01-08  6:10 ` [Qemu-devel] [RFC PATCH v1 09/13] spapr: CPU hot unplug support Bharata B Rao
@ 2015-01-08  6:10 ` Bharata B Rao
  2015-01-29  1:48   ` David Gibson
  2015-01-08  6:10 ` [Qemu-devel] [RFC PATCH v1 11/13] spapr: Initialize hotplug memory address space Bharata B Rao
                   ` (4 subsequent siblings)
  14 siblings, 1 reply; 50+ messages in thread
From: Bharata B Rao @ 2015-01-08  6:10 UTC (permalink / raw)
  To: qemu-devel; +Cc: Gu Zheng, imammedo, Bharata B Rao, mdroth, agraf

From: Gu Zheng <guz.fnst@cn.fujitsu.com>

Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
               (added spapr bits)
---
 cpus.c               | 44 ++++++++++++++++++++++++++++++++++++++++
 hw/ppc/spapr.c       | 14 ++++++++++++-
 include/qom/cpu.h    | 11 ++++++++++
 include/sysemu/kvm.h |  1 +
 kvm-all.c            | 57 +++++++++++++++++++++++++++++++++++++++++++++++++++-
 5 files changed, 125 insertions(+), 2 deletions(-)
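
The kvm-all.c part parks the KVM fd of a removed vCPU and hands it back when
a vCPU with the same id is created again. Below is a minimal sketch of that
park-and-reuse lookup in plain C, using a hand-rolled singly linked list
instead of QLIST. park_vcpu() and take_parked_vcpu() are illustrative names,
and the -1 return stands in for "nothing parked, fall back to
KVM_CREATE_VCPU".

```c
#include <assert.h>
#include <stdlib.h>

struct parked_vcpu {
    unsigned long vcpu_id;
    int fd;
    struct parked_vcpu *next;
};

/* Head of the parked list (lives in KVMState in the patch) */
static struct parked_vcpu *parked;

/* Remember the fd of a removed vCPU so that it can be reused later */
static void park_vcpu(unsigned long vcpu_id, int fd)
{
    struct parked_vcpu *p = malloc(sizeof(*p));

    assert(p != NULL);
    p->vcpu_id = vcpu_id;
    p->fd = fd;
    p->next = parked;
    parked = p;
}

/*
 * Return the parked fd for vcpu_id and drop the entry, or -1 if no such
 * vCPU is parked (the caller would then create a fresh one).
 */
static int take_parked_vcpu(unsigned long vcpu_id)
{
    struct parked_vcpu **pp;

    for (pp = &parked; *pp != NULL; pp = &(*pp)->next) {
        if ((*pp)->vcpu_id == vcpu_id) {
            struct parked_vcpu *p = *pp;
            int fd = p->fd;

            *pp = p->next;
            free(p);
            return fd;
        }
    }
    return -1;
}
```

Each parked entry is consumed at most once: a second lookup for the same id
finds nothing and falls through to the create path.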

diff --git a/cpus.c b/cpus.c
index 1b5168a..98b7199 100644
--- a/cpus.c
+++ b/cpus.c
@@ -871,6 +871,24 @@ void async_run_on_cpu(CPUState *cpu, void (*func)(void *data), void *data)
     qemu_cpu_kick(cpu);
 }
 
+static void qemu_kvm_destroy_vcpu(CPUState *cpu)
+{
+    CPU_REMOVE(cpu);
+
+    if (kvm_destroy_vcpu(cpu) < 0) {
+        fprintf(stderr, "kvm_destroy_vcpu failed.\n");
+        exit(1);
+    }
+
+    object_unparent(OBJECT(cpu));
+}
+
+static void qemu_tcg_destroy_vcpu(CPUState *cpu)
+{
+    CPU_REMOVE(cpu);
+    object_unparent(OBJECT(cpu));
+}
+
 static void flush_queued_work(CPUState *cpu)
 {
     struct qemu_work_item *wi;
@@ -964,6 +982,11 @@ static void *qemu_kvm_cpu_thread_fn(void *arg)
             }
         }
         qemu_kvm_wait_io_event(cpu);
+        if (cpu->exit && !cpu_can_run(cpu)) {
+            qemu_kvm_destroy_vcpu(cpu);
+            qemu_mutex_unlock(&qemu_global_mutex);
+            return NULL;
+        }
     }
 
     return NULL;
@@ -1018,6 +1041,7 @@ static void tcg_exec_all(void);
 static void *qemu_tcg_cpu_thread_fn(void *arg)
 {
     CPUState *cpu = arg;
+    CPUState *remove_cpu = NULL;
 
     qemu_tcg_init_cpu_signals();
     qemu_thread_get_self(cpu->thread);
@@ -1052,6 +1076,16 @@ static void *qemu_tcg_cpu_thread_fn(void *arg)
             }
         }
         qemu_tcg_wait_io_event();
+        CPU_FOREACH(cpu) {
+            if (cpu->exit && !cpu_can_run(cpu)) {
+                remove_cpu = cpu;
+                break;
+            }
+        }
+        if (remove_cpu) {
+            qemu_tcg_destroy_vcpu(remove_cpu);
+            remove_cpu = NULL;
+        }
     }
 
     return NULL;
@@ -1208,6 +1242,13 @@ void resume_all_vcpus(void)
     }
 }
 
+void cpu_remove(CPUState *cpu)
+{
+    cpu->stop = true;
+    cpu->exit = true;
+    qemu_cpu_kick(cpu);
+}
+
 /* For temporary buffers for forming a name */
 #define VCPU_THREAD_NAME_SIZE 16
 
@@ -1402,6 +1443,9 @@ static void tcg_exec_all(void)
                 break;
             }
         } else if (cpu->stop || cpu->stopped) {
+            if (cpu->exit) {
+                next_cpu = CPU_NEXT(cpu);
+            }
             break;
         }
     }
diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index ec793b1..44405b2 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -1910,7 +1910,19 @@ static void spapr_cpu_hotplug_add(DeviceState *dev, CPUState *cs)
 
 static void spapr_cpu_release(DeviceState *dev, void *opaque)
 {
-    /* Release vCPU */
+    CPUState *cs;
+    int i;
+    int id = ppc_get_vcpu_dt_id(POWERPC_CPU(CPU(dev)));
+
+    for (i = id; i < id + smp_threads; i++) {
+        CPU_FOREACH(cs) {
+            PowerPCCPU *cpu = POWERPC_CPU(cs);
+
+            if (i == ppc_get_vcpu_dt_id(cpu)) {
+                cpu_remove(cs);
+            }
+        }
+    }
 }
 
 static void spapr_cpu_hotplug_remove(DeviceState *dev, CPUState *cs)
diff --git a/include/qom/cpu.h b/include/qom/cpu.h
index 2098f1c..30fd0cd 100644
--- a/include/qom/cpu.h
+++ b/include/qom/cpu.h
@@ -206,6 +206,7 @@ struct kvm_run;
  * @halted: Nonzero if the CPU is in suspended state.
  * @stop: Indicates a pending stop request.
  * @stopped: Indicates the CPU has been artificially stopped.
+ * @exit: Indicates the CPU has exited due to an unplug operation.
  * @tcg_exit_req: Set to force TCG to stop executing linked TBs for this
  *           CPU and return to its top level loop.
  * @singlestep_enabled: Flags for single-stepping.
@@ -249,6 +250,7 @@ struct CPUState {
     bool created;
     bool stop;
     bool stopped;
+    bool exit;
     volatile sig_atomic_t exit_request;
     uint32_t interrupt_request;
     int singlestep_enabled;
@@ -305,6 +307,7 @@ struct CPUState {
 QTAILQ_HEAD(CPUTailQ, CPUState);
 extern struct CPUTailQ cpus;
 #define CPU_NEXT(cpu) QTAILQ_NEXT(cpu, node)
+#define CPU_REMOVE(cpu) QTAILQ_REMOVE(&cpus, cpu, node)
 #define CPU_FOREACH(cpu) QTAILQ_FOREACH(cpu, &cpus, node)
 #define CPU_FOREACH_SAFE(cpu, next_cpu) \
     QTAILQ_FOREACH_SAFE(cpu, &cpus, node, next_cpu)
@@ -610,6 +613,14 @@ void cpu_exit(CPUState *cpu);
 void cpu_resume(CPUState *cpu);
 
 /**
+ * cpu_remove:
+ * @cpu: The CPU to remove.
+ *
+ * Requests the CPU to be removed.
+ */
+void cpu_remove(CPUState *cpu);
+
+/**
  * qemu_init_vcpu:
  * @cpu: The vCPU to initialize.
  *
diff --git a/include/sysemu/kvm.h b/include/sysemu/kvm.h
index 104cf35..da064c1 100644
--- a/include/sysemu/kvm.h
+++ b/include/sysemu/kvm.h
@@ -186,6 +186,7 @@ int kvm_has_gsi_routing(void);
 int kvm_has_intx_set_mask(void);
 
 int kvm_init_vcpu(CPUState *cpu);
+int kvm_destroy_vcpu(CPUState *cpu);
 int kvm_cpu_exec(CPUState *cpu);
 
 #ifdef NEED_CPU_H
diff --git a/kvm-all.c b/kvm-all.c
index 18cc6b4..6f543ce 100644
--- a/kvm-all.c
+++ b/kvm-all.c
@@ -71,6 +71,12 @@ typedef struct KVMSlot
 
 typedef struct kvm_dirty_log KVMDirtyLog;
 
+struct KVMParkedVcpu {
+    unsigned long vcpu_id;
+    int kvm_fd;
+    QLIST_ENTRY(KVMParkedVcpu) node;
+};
+
 struct KVMState
 {
     AccelState parent_obj;
@@ -107,6 +113,7 @@ struct KVMState
     QTAILQ_HEAD(msi_hashtab, KVMMSIRoute) msi_hashtab[KVM_MSI_HASHTAB_SIZE];
     bool direct_msi;
 #endif
+    QLIST_HEAD(, KVMParkedVcpu) kvm_parked_vcpus;
 };
 
 #define TYPE_KVM_ACCEL ACCEL_CLASS_NAME("kvm")
@@ -247,6 +254,53 @@ static int kvm_set_user_memory_region(KVMState *s, KVMSlot *slot)
     return kvm_vm_ioctl(s, KVM_SET_USER_MEMORY_REGION, &mem);
 }
 
+int kvm_destroy_vcpu(CPUState *cpu)
+{
+    KVMState *s = kvm_state;
+    long mmap_size;
+    struct KVMParkedVcpu *vcpu = NULL;
+    int ret = 0;
+
+    DPRINTF("kvm_destroy_vcpu\n");
+
+    mmap_size = kvm_ioctl(s, KVM_GET_VCPU_MMAP_SIZE, 0);
+    if (mmap_size < 0) {
+        ret = mmap_size;
+        DPRINTF("kvm_destroy_vcpu failed\n");
+        goto err;
+    }
+
+    ret = munmap(cpu->kvm_run, mmap_size);
+    if (ret < 0) {
+        goto err;
+    }
+
+    vcpu = g_malloc0(sizeof(*vcpu));
+    vcpu->vcpu_id = kvm_arch_vcpu_id(cpu);
+    vcpu->kvm_fd = cpu->kvm_fd;
+    QLIST_INSERT_HEAD(&kvm_state->kvm_parked_vcpus, vcpu, node);
+
+err:
+    return ret;
+}
+
+static int kvm_get_vcpu(KVMState *s, unsigned long vcpu_id)
+{
+    struct KVMParkedVcpu *cpu;
+
+    QLIST_FOREACH(cpu, &s->kvm_parked_vcpus, node) {
+        if (cpu->vcpu_id == vcpu_id) {
+            int kvm_fd;
+
+            QLIST_REMOVE(cpu, node);
+            kvm_fd = cpu->kvm_fd;
+            g_free(cpu);
+            return kvm_fd;
+        }
+    }
+
+    return kvm_vm_ioctl(s, KVM_CREATE_VCPU, (void *)vcpu_id);
+}
 int kvm_init_vcpu(CPUState *cpu)
 {
     KVMState *s = kvm_state;
@@ -255,7 +309,7 @@ int kvm_init_vcpu(CPUState *cpu)
 
     DPRINTF("kvm_init_vcpu\n");
 
-    ret = kvm_vm_ioctl(s, KVM_CREATE_VCPU, (void *)kvm_arch_vcpu_id(cpu));
+    ret = kvm_get_vcpu(s, kvm_arch_vcpu_id(cpu));
     if (ret < 0) {
         DPRINTF("kvm_create_vcpu failed\n");
         goto err;
@@ -1441,6 +1495,7 @@ static int kvm_init(MachineState *ms)
 #ifdef KVM_CAP_SET_GUEST_DEBUG
     QTAILQ_INIT(&s->kvm_sw_breakpoints);
 #endif
+    QLIST_INIT(&s->kvm_parked_vcpus);
     s->vmfd = -1;
     s->fd = qemu_open("/dev/kvm", O_RDWR);
     if (s->fd == -1) {
-- 
2.1.0

* [Qemu-devel] [RFC PATCH v1 11/13] spapr: Initialize hotplug memory address space
  2015-01-08  6:10 [Qemu-devel] [RFC PATCH v1 00/13] CPU and Memory hotplug for PowerPC guests Bharata B Rao
                   ` (9 preceding siblings ...)
  2015-01-08  6:10 ` [Qemu-devel] [RFC PATCH v1 10/13] cpus, spapr: reclaim allocated vCPU objects Bharata B Rao
@ 2015-01-08  6:10 ` Bharata B Rao
  2015-02-12  5:19   ` David Gibson
  2015-01-08  6:10 ` [Qemu-devel] [RFC PATCH v1 12/13] spapr: Support ibm,dynamic-reconfiguration-memory Bharata B Rao
                   ` (3 subsequent siblings)
  14 siblings, 1 reply; 50+ messages in thread
From: Bharata B Rao @ 2015-01-08  6:10 UTC (permalink / raw)
  To: qemu-devel; +Cc: imammedo, Bharata B Rao, mdroth, agraf

Initialize a hotplug memory region under which all the hotplugged
memory is accommodated. Also enable memory hotplug by setting
CONFIG_MEM_HOTPLUG.

Modelled on i386 memory hotplug.

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
---
 default-configs/ppc64-softmmu.mak |  1 +
 hw/ppc/spapr.c                    | 26 ++++++++++++++++++++++++++
 include/hw/ppc/spapr.h            |  3 +++
 3 files changed, 30 insertions(+)
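
The placement of the hotplug region can be sketched as follows, assuming the
same 1GB alignment and wrap-around check as the ppc_spapr_init() hunk below.
hotplug_region() and round_up_pow2() are hypothetical helpers for
illustration, not QEMU functions.

```c
#include <assert.h>
#include <stdint.h>

#define GiB (1ULL << 30)

/* Round n up to a multiple of d; d must be a power of two */
static uint64_t round_up_pow2(uint64_t n, uint64_t d)
{
    return (n + d - 1) & ~(d - 1);
}

/*
 * Compute where the hotplug memory region would go.
 * Returns 0 on success, 1 if no region is needed (ram_size >= maxram_size)
 * and -1 if base + size wraps around the 64-bit address space.
 */
static int hotplug_region(uint64_t ram_size, uint64_t maxram_size,
                          uint64_t *base, uint64_t *size)
{
    if (ram_size >= maxram_size) {
        return 1;
    }
    *size = maxram_size - ram_size;
    *base = round_up_pow2(ram_size, GiB);
    if (*base + *size < *size) {
        return -1;
    }
    return 0;
}
```

For example, 1.5GB of boot RAM with maxmem = 4GB gives a 2.5GB region based
at 2GB, so there is a 0.5GB hole between the end of RAM and the region.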

diff --git a/default-configs/ppc64-softmmu.mak b/default-configs/ppc64-softmmu.mak
index bd30d69..03210de 100644
--- a/default-configs/ppc64-softmmu.mak
+++ b/default-configs/ppc64-softmmu.mak
@@ -60,3 +60,4 @@ CONFIG_I82374=y
 CONFIG_I8257=y
 CONFIG_MC146818RTC=y
 CONFIG_ISA_TESTDEV=y
+CONFIG_MEM_HOTPLUG=y
diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 44405b2..9ff08ff 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -120,6 +120,8 @@ struct sPAPRMachineState {
 
     /*< public >*/
     char *kvm_type;
+    ram_addr_t hotplug_memory_base;
+    MemoryRegion hotplug_memory;
 };
 
 sPAPREnvironment *spapr;
@@ -1403,6 +1405,7 @@ static void ppc_spapr_init(MachineState *machine)
     bool kernel_le = false;
     char *filename;
     int smt = kvmppc_smt_threads();
+    sPAPRMachineState *ms = SPAPR_MACHINE(machine);
 
     msi_supported = true;
 
@@ -1492,6 +1495,29 @@ static void ppc_spapr_init(MachineState *machine)
         memory_region_add_subregion(sysmem, 0, rma_region);
     }
 
+    if (machine->ram_size < machine->maxram_size) {
+        ram_addr_t hotplug_mem_size = machine->maxram_size - machine->ram_size;
+
+        if (machine->ram_slots > SPAPR_MAX_RAM_SLOTS) {
+            error_report("unsupported amount of memory slots: %"PRIu64,
+                         machine->ram_slots);
+            exit(EXIT_FAILURE);
+        }
+
+        ms->hotplug_memory_base = ROUND_UP(machine->ram_size, 1ULL << 30);
+
+        if ((ms->hotplug_memory_base + hotplug_mem_size) < hotplug_mem_size) {
+            error_report("unsupported amount of maximum memory: " RAM_ADDR_FMT,
+                         machine->maxram_size);
+            exit(EXIT_FAILURE);
+        }
+
+        memory_region_init(&ms->hotplug_memory, OBJECT(ms),
+                           "hotplug-memory", hotplug_mem_size);
+        memory_region_add_subregion(sysmem, ms->hotplug_memory_base,
+                                    &ms->hotplug_memory);
+    }
+
     filename = qemu_find_file(QEMU_FILE_TYPE_BIOS, "spapr-rtas.bin");
     spapr->rtas_size = get_image_size(filename);
     spapr->rtas_blob = g_malloc(spapr->rtas_size);
diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
index ae8b4e1..64681c4 100644
--- a/include/hw/ppc/spapr.h
+++ b/include/hw/ppc/spapr.h
@@ -482,6 +482,9 @@ struct sPAPRTCETable {
 #define TIMEBASE_FREQ           512000000ULL
 #define SPAPR_MIN_MEMORY_BLOCK_SIZE (1 << 28) /* 256MB */
 
+/* Support a min of 1TB hotplug memory assuming 256MB per slot */
+#define SPAPR_MAX_RAM_SLOTS     (1ULL << 12)
+
 void spapr_events_init(sPAPREnvironment *spapr);
 void spapr_events_fdt_skel(void *fdt, uint32_t epow_irq);
 int spapr_h_cas_compose_response(target_ulong addr, target_ulong size);
-- 
2.1.0

* [Qemu-devel] [RFC PATCH v1 12/13] spapr: Support ibm,dynamic-reconfiguration-memory
  2015-01-08  6:10 [Qemu-devel] [RFC PATCH v1 00/13] CPU and Memory hotplug for PowerPC guests Bharata B Rao
                   ` (10 preceding siblings ...)
  2015-01-08  6:10 ` [Qemu-devel] [RFC PATCH v1 11/13] spapr: Initialize hotplug memory address space Bharata B Rao
@ 2015-01-08  6:10 ` Bharata B Rao
  2015-02-12  6:02   ` David Gibson
  2015-01-08  6:10 ` [Qemu-devel] [RFC PATCH v1 13/13] spapr: Memory hotplug support Bharata B Rao
                   ` (2 subsequent siblings)
  14 siblings, 1 reply; 50+ messages in thread
From: Bharata B Rao @ 2015-01-08  6:10 UTC (permalink / raw)
  To: qemu-devel; +Cc: imammedo, Bharata B Rao, mdroth, agraf

Parse the ibm,architecture.vec table obtained from the guest and enable
memory node configuration via ibm,dynamic-reconfiguration-memory if the
guest supports it. This is in preparation for supporting memory hotplug
for sPAPR guests.

This changes the way memory node configuration is done. Currently all
memory nodes are built upfront. After this patch, only the memory@0 node
for the RMA is built upfront. The guest kernel boots with just that, and
the rest of the memory nodes (via memory@XXX or
ibm,dynamic-reconfiguration-memory) are built when the guest makes the
ibm,client-architecture-support call.

Note: This patch was tested with an enhancement to SLOF that supports the
addition of device tree nodes from the ibm,client-architecture-support call.

TODO: Enforce lmb-size alignment for node memory.

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
---
 hw/ppc/spapr.c         | 232 ++++++++++++++++++++++++++++++++++++++++---------
 hw/ppc/spapr_hcall.c   |  51 +++++++++--
 include/hw/ppc/spapr.h |  12 ++-
 3 files changed, 246 insertions(+), 49 deletions(-)
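
The LMB numbering used for ibm,dynamic-memory can be sketched as below. This
is a standalone illustration of the address calculation in
spapr_populate_drconf_memory(), assuming a 256MB LMB size and a hotplug
region base as set up by patch 11. struct lmb_layout and the helper names
are hypothetical. The sketch uses 64-bit arithmetic throughout so that
addresses at or above 4GB do not wrap.

```c
#include <assert.h>
#include <stdint.h>

#define LMB_SIZE (1ULL << 28) /* 256MB, as SPAPR_MIN_MEMORY_BLOCK_SIZE */

/* Hypothetical summary of the memory layout inputs */
struct lmb_layout {
    uint64_t rma_size;     /* covered by memory@0, not listed as LMBs */
    uint64_t ram_size;     /* memory present at boot */
    uint64_t maxram_size;  /* boot memory plus hotpluggable memory */
    uint64_t hotplug_base; /* start of the hotplug memory region */
};

/* Number of entries in ibm,dynamic-memory */
static uint64_t nr_lmbs(const struct lmb_layout *l)
{
    return l->maxram_size / LMB_SIZE - l->rma_size / LMB_SIZE;
}

/*
 * Address of the i-th LMB: boot-time LMBs follow the RMA, the remaining
 * ones are laid out from the hotplug region base.
 */
static uint64_t lmb_addr(const struct lmb_layout *l, uint64_t i)
{
    uint64_t nr_rma = l->rma_size / LMB_SIZE;
    uint64_t nr_assigned = l->ram_size / LMB_SIZE - nr_rma;

    assert(i < nr_lmbs(l));
    if (i < nr_assigned) {
        return (i + nr_rma) * LMB_SIZE;
    }
    return (i - nr_assigned) * LMB_SIZE + l->hotplug_base;
}

/* An LMB is flagged as assigned if it lies below the boot RAM limit */
static int lmb_is_assigned(const struct lmb_layout *l, uint64_t i)
{
    return lmb_addr(l, i) < l->ram_size;
}
```

With a 256MB RMA, 1GB of boot RAM, maxmem = 2GB and the hotplug region at
1GB, this yields 7 LMBs: entries 0..2 at 256MB, 512MB and 768MB are assigned,
and entries 3..6 start at 1GB and are unassigned.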

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 9ff08ff..6964b06 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -631,42 +631,6 @@ static void *spapr_create_fdt_skel(hwaddr initrd_base,
     return fdt;
 }
 
-int spapr_h_cas_compose_response(target_ulong addr, target_ulong size)
-{
-    void *fdt, *fdt_skel;
-    sPAPRDeviceTreeUpdateHeader hdr = { .version_id = 1 };
-
-    size -= sizeof(hdr);
-
-    /* Create sceleton */
-    fdt_skel = g_malloc0(size);
-    _FDT((fdt_create(fdt_skel, size)));
-    _FDT((fdt_begin_node(fdt_skel, "")));
-    _FDT((fdt_end_node(fdt_skel)));
-    _FDT((fdt_finish(fdt_skel)));
-    fdt = g_malloc0(size);
-    _FDT((fdt_open_into(fdt_skel, fdt, size)));
-    g_free(fdt_skel);
-
-    /* Fix skeleton up */
-    _FDT((spapr_fixup_cpu_dt(fdt, spapr)));
-
-    /* Pack resulting tree */
-    _FDT((fdt_pack(fdt)));
-
-    if (fdt_totalsize(fdt) + sizeof(hdr) > size) {
-        trace_spapr_cas_failed(size);
-        return -1;
-    }
-
-    cpu_physical_memory_write(addr, &hdr, sizeof(hdr));
-    cpu_physical_memory_write(addr + sizeof(hdr), fdt, fdt_totalsize(fdt));
-    trace_spapr_cas_continue(fdt_totalsize(fdt) + sizeof(hdr));
-    g_free(fdt);
-
-    return 0;
-}
-
 static void spapr_populate_memory_node(void *fdt, int nodeid, hwaddr start,
                                        hwaddr size)
 {
@@ -720,7 +684,6 @@ static int spapr_populate_memory(sPAPREnvironment *spapr, void *fdt)
         }
         if (!mem_start) {
             /* ppc_spapr_init() checks for rma_size <= node0_size already */
-            spapr_populate_memory_node(fdt, i, 0, spapr->rma_size);
             mem_start += spapr->rma_size;
             node_size -= spapr->rma_size;
         }
@@ -741,6 +704,190 @@ static int spapr_populate_memory(sPAPREnvironment *spapr, void *fdt)
     return 0;
 }
 
+/*
+ * TODO: Take care of sparsemem configuration ?
+ */
+static uint64_t numa_node_end(uint32_t nodeid)
+{
+    uint32_t i = 0;
+    uint64_t addr = 0;
+
+    do {
+        addr += numa_info[i].node_mem;
+    } while (++i <= nodeid);
+
+    return addr;
+}
+
+static uint64_t numa_node_start(uint32_t nodeid)
+{
+    if (!nodeid) {
+        return 0;
+    } else {
+        return numa_node_end(nodeid - 1);
+    }
+}
+
+/*
+ * Given the addr, return the NUMA node to which the address belongs.
+ */
+static uint32_t get_numa_node(uint64_t addr)
+{
+    uint32_t i;
+
+    for (i = 0; i < nb_numa_nodes; i++) {
+        if ((addr >= numa_node_start(i)) && (addr < numa_node_end(i))) {
+            return i;
+        }
+    }
+
+    /* Unassigned memory goes to node 0 by default */
+    return 0;
+}
+
+/* Adds ibm,dynamic-reconfiguration-memory node */
+static int spapr_populate_drconf_memory(sPAPREnvironment *spapr, void *fdt)
+{
+    int root_offset, ret, i, offset;
+    uint32_t lmb_size = SPAPR_MIN_MEMORY_BLOCK_SIZE;
+    uint32_t prop_lmb_size[] = {0, cpu_to_be32(lmb_size)};
+    uint32_t dynamic_memory[DR_LMB_LIST_ENTRY_SIZE];
+    uint32_t nr_rma_lmbs = spapr->rma_size/lmb_size;
+    uint32_t nr_lmbs = spapr->maxram_limit/lmb_size - nr_rma_lmbs;
+    uint32_t nr_assigned_lmbs = spapr->ram_limit/lmb_size - nr_rma_lmbs;
+    uint32_t *int_buf, *cur_index, buf_len;
+
+    /* Allocate enough buffer size to fit in ibm,dynamic-memory */
+    buf_len = nr_lmbs * DR_LMB_LIST_ENTRY_SIZE * sizeof(uint32_t) +
+                sizeof(uint32_t);
+    cur_index = int_buf = g_malloc0(buf_len);
+    root_offset = fdt_path_offset(fdt, "/");
+
+
+    offset = fdt_add_subnode(fdt, root_offset,
+                   "ibm,dynamic-reconfiguration-memory");
+
+    ret = fdt_setprop(fdt, offset, "ibm,lmb-size", prop_lmb_size,
+            sizeof(prop_lmb_size));
+    if (ret < 0) {
+        goto out;
+    }
+
+    ret = fdt_setprop_cell(fdt, offset, "ibm,memory-flags-mask",
+                            0xff); /* fdt_setprop_cell() converts to BE */
+    if (ret < 0) {
+        goto out;
+    }
+
+    ret = fdt_setprop_cell(fdt, offset, "ibm,memory-preservation-time",
+                            0x0);
+    if (ret < 0) {
+        goto out;
+    }
+
+    /* ibm,dynamic-memory */
+    int_buf[0] = cpu_to_be32(nr_lmbs);
+    cur_index++;
+    for (i = 0; i < nr_lmbs; i++) {
+        sPAPRDRConnector *drc;
+        sPAPRDRConnectorClass *drck;
+        uint64_t addr;
+
+        if (i < nr_assigned_lmbs) {
+            addr = (uint64_t)(i + nr_rma_lmbs) * lmb_size;
+        } else {
+            addr = (uint64_t)(i - nr_assigned_lmbs) * lmb_size +
+                SPAPR_MACHINE(qdev_get_machine())->hotplug_memory_base;
+        }
+        drc = spapr_dr_connector_new(qdev_get_machine(),
+                SPAPR_DR_CONNECTOR_TYPE_LMB, addr/lmb_size);
+        drck = SPAPR_DR_CONNECTOR_GET_CLASS(drc);
+
+        dynamic_memory[0] = cpu_to_be32(addr >> 32);
+        dynamic_memory[1] = cpu_to_be32(addr & 0xffffffff);
+        dynamic_memory[2] = cpu_to_be32(drck->get_index(drc));
+        dynamic_memory[3] = cpu_to_be32(0); /* reserved */
+        dynamic_memory[4] = cpu_to_be32(get_numa_node(addr));
+        dynamic_memory[5] = (addr < spapr->ram_limit) ?
+                            cpu_to_be32(LMB_FLAGS_ASSIGNED) :
+                            cpu_to_be32(0);
+
+        memcpy(cur_index, dynamic_memory, sizeof(dynamic_memory));
+        cur_index += DR_LMB_LIST_ENTRY_SIZE;
+    }
+    ret = fdt_setprop(fdt, offset, "ibm,dynamic-memory", int_buf, buf_len);
+    if (ret < 0) {
+        goto out;
+    }
+
+    /* ibm,associativity-lookup-arrays */
+    cur_index = int_buf;
+    int_buf[0] = cpu_to_be32(nb_numa_nodes);
+    int_buf[1] = cpu_to_be32(4);
+    cur_index += 2;
+    for (i = 0; i < nb_numa_nodes; i++) {
+        uint32_t associativity[] = {
+            cpu_to_be32(0x0),
+            cpu_to_be32(0x0),
+            cpu_to_be32(0x0),
+            cpu_to_be32(i)
+        };
+        memcpy(cur_index, associativity, sizeof(associativity));
+        cur_index += 4;
+    }
+    ret = fdt_setprop(fdt, offset, "ibm,associativity-lookup-arrays", int_buf,
+            (cur_index - int_buf) * sizeof(uint32_t));
+out:
+    g_free(int_buf);
+    return ret;
+}
+
+int spapr_h_cas_compose_response(target_ulong addr, target_ulong size,
+                                bool cpu_update, bool memory_update)
+{
+    void *fdt, *fdt_skel;
+    sPAPRDeviceTreeUpdateHeader hdr = { .version_id = 1 };
+
+    size -= sizeof(hdr);
+
+    /* Create skeleton */
+    fdt_skel = g_malloc0(size);
+    _FDT((fdt_create(fdt_skel, size)));
+    _FDT((fdt_begin_node(fdt_skel, "")));
+    _FDT((fdt_end_node(fdt_skel)));
+    _FDT((fdt_finish(fdt_skel)));
+    fdt = g_malloc0(size);
+    _FDT((fdt_open_into(fdt_skel, fdt, size)));
+    g_free(fdt_skel);
+
+    /* Fixup cpu nodes */
+    if (cpu_update) {
+        _FDT((spapr_fixup_cpu_dt(fdt, spapr)));
+    }
+
+    /* Generate memory nodes or ibm,dynamic-reconfiguration-memory node */
+    if (memory_update) {
+        _FDT((spapr_populate_drconf_memory(spapr, fdt)));
+    } else {
+        _FDT((spapr_populate_memory(spapr, fdt)));
+    }
+
+    /* Pack resulting tree */
+    _FDT((fdt_pack(fdt)));
+
+    if (fdt_totalsize(fdt) + sizeof(hdr) > size) {
+        trace_spapr_cas_failed(size);
+        return -1;
+    }
+
+    cpu_physical_memory_write(addr, &hdr, sizeof(hdr));
+    cpu_physical_memory_write(addr + sizeof(hdr), fdt, fdt_totalsize(fdt));
+    trace_spapr_cas_continue(fdt_totalsize(fdt) + sizeof(hdr));
+    g_free(fdt);
+
+    return 0;
+}
+
 static void spapr_finalize_fdt(sPAPREnvironment *spapr,
                                hwaddr fdt_addr,
                                hwaddr rtas_addr,
@@ -757,11 +904,12 @@ static void spapr_finalize_fdt(sPAPREnvironment *spapr,
     /* open out the base tree into a temp buffer for the final tweaks */
     _FDT((fdt_open_into(spapr->fdt_skel, fdt, FDT_MAX_SIZE)));
 
-    ret = spapr_populate_memory(spapr, fdt);
-    if (ret < 0) {
-        fprintf(stderr, "couldn't setup memory nodes in fdt\n");
-        exit(1);
-    }
+    /*
+     * Add memory@0 node to represent RMA. Rest of the memory is either
+     * represented by memory nodes or ibm,dynamic-reconfiguration-memory
+     * node later during ibm,client-architecture-support call.
+     */
+    spapr_populate_memory_node(fdt, 0, 0, spapr->rma_size);
 
     ret = spapr_populate_vdevice(spapr->vio_bus, fdt);
     if (ret < 0) {
diff --git a/hw/ppc/spapr_hcall.c b/hw/ppc/spapr_hcall.c
index 8651447..10f05f4 100644
--- a/hw/ppc/spapr_hcall.c
+++ b/hw/ppc/spapr_hcall.c
@@ -805,6 +805,32 @@ static target_ulong h_set_mode(PowerPCCPU *cpu, sPAPREnvironment *spapr,
     return ret;
 }
 
+/*
+ * Return the offset to the requested option vector @vector in the
+ * option vector table @table.
+ */
+static target_ulong cas_get_option_vector(int vector, target_ulong table)
+{
+    int i;
+    char nr_vectors, nr_entries;
+
+    if (!table) {
+        return 0;
+    }
+
+    nr_vectors = (rtas_ld(table, 0) >> 24) + 1;
+    if (!vector || vector > nr_vectors) {
+        return 0;
+    }
+    table++; /* skip nr option vectors */
+
+    for (i = 0; i < vector - 1; i++) {
+        nr_entries = rtas_ld(table, 0) >> 24;
+        table += nr_entries + 2;
+    }
+    return table;
+}
+
 typedef struct {
     PowerPCCPU *cpu;
     uint32_t cpu_version;
@@ -825,19 +851,22 @@ static void do_set_compat(void *arg)
     ((cpuver) == CPU_POWERPC_LOGICAL_2_06_PLUS) ? 2061 : \
     ((cpuver) == CPU_POWERPC_LOGICAL_2_07) ? 2070 : 0)
 
+#define OV5_DRCONF_MEMORY 0x20
+
 static target_ulong h_client_architecture_support(PowerPCCPU *cpu_,
                                                   sPAPREnvironment *spapr,
                                                   target_ulong opcode,
                                                   target_ulong *args)
 {
-    target_ulong list = args[0];
+    target_ulong list = args[0], ov_table;
     PowerPCCPUClass *pcc_ = POWERPC_CPU_GET_CLASS(cpu_);
     CPUState *cs;
-    bool cpu_match = false;
+    bool cpu_match = false, cpu_update = true, memory_update = false;
     unsigned old_cpu_version = cpu_->cpu_version;
     unsigned compat_lvl = 0, cpu_version = 0;
     unsigned max_lvl = get_compat_level(cpu_->max_compat);
     int counter;
+    char ov5_byte2;
 
     /* Parse PVR list */
     for (counter = 0; counter < 512; ++counter) {
@@ -887,8 +916,6 @@ static target_ulong h_client_architecture_support(PowerPCCPU *cpu_,
         }
     }
 
-    /* For the future use: here @list points to the first capability */
-
     /* Parsing finished */
     trace_spapr_cas_pvr(cpu_->cpu_version, cpu_match,
                         cpu_version, pcc_->pcr_mask);
@@ -912,14 +939,26 @@ static target_ulong h_client_architecture_support(PowerPCCPU *cpu_,
     }
 
     if (!cpu_version) {
-        return H_SUCCESS;
+        cpu_update = false;
     }
 
+    /* For the future use: here @ov_table points to the first option vector */
+    ov_table = list;
+
+    list = cas_get_option_vector(5, ov_table);
     if (!list) {
         return H_SUCCESS;
     }
 
-    if (spapr_h_cas_compose_response(args[1], args[2])) {
+    /* @list now points to OV 5 */
+    list += 2;
+    ov5_byte2 = rtas_ld(list, 0) >> 24;
+    if (ov5_byte2 & OV5_DRCONF_MEMORY) {
+        memory_update = true;
+    }
+
+    if (spapr_h_cas_compose_response(args[1], args[2], cpu_update,
+                                    memory_update)) {
         qemu_system_reset_request();
     }
 
diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
index 64681c4..10283f9 100644
--- a/include/hw/ppc/spapr.h
+++ b/include/hw/ppc/spapr.h
@@ -485,9 +485,19 @@ struct sPAPRTCETable {
 /* Support a min of 1TB hotplug memory assuming 256MB per slot */
 #define SPAPR_MAX_RAM_SLOTS     (1ULL << 12)
 
+/*
+ * Number of 32 bit words in each LMB list entry in ibm,dynamic-memory
+ * property under ibm,dynamic-reconfiguration-memory node.
+ */
+#define DR_LMB_LIST_ENTRY_SIZE 6
+
+/* Flags field values for LMB list entries in ibm,dynamic-memory. */
+#define LMB_FLAGS_ASSIGNED 0x00000008
+
 void spapr_events_init(sPAPREnvironment *spapr);
 void spapr_events_fdt_skel(void *fdt, uint32_t epow_irq);
-int spapr_h_cas_compose_response(target_ulong addr, target_ulong size);
+int spapr_h_cas_compose_response(target_ulong addr, target_ulong size,
+                                bool cpu_update, bool memory_update);
 sPAPRTCETable *spapr_tce_new_table(DeviceState *owner, uint32_t liobn,
                                    uint64_t bus_offset,
                                    uint32_t page_shift,
-- 
2.1.0
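
For readers following the ibm,dynamic-memory encoding in the patch above: the
property is one count cell followed by DR_LMB_LIST_ENTRY_SIZE (6) cells per
LMB, namely address high, address low, DRC index, a reserved cell, NUMA node
and flags. The standalone sketch below illustrates only that layout. The names
build_dynamic_memory() and put_be32() are hypothetical and do not exist in
QEMU, and the DRC index and NUMA node are placeholders rather than the values
the patch obtains from the DR connector and get_numa_node().

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define DR_LMB_LIST_ENTRY_SIZE 6          /* cells per LMB entry, as in the patch */
#define LMB_FLAGS_ASSIGNED     0x00000008 /* as in the patch */

/* Store a 32-bit cell big-endian regardless of host byte order. */
static void put_be32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v >> 24);
    p[1] = (uint8_t)(v >> 16);
    p[2] = (uint8_t)(v >> 8);
    p[3] = (uint8_t)v;
}

/*
 * Hypothetical helper: build a count cell followed by nr_lmbs entries.
 * LMB i starts at base + i * lmb_size; the first nr_assigned entries are
 * flagged as assigned. Returns a malloc'd buffer, its byte length in *len.
 */
static uint8_t *build_dynamic_memory(uint64_t base, uint32_t lmb_size,
                                     uint32_t nr_lmbs, uint32_t nr_assigned,
                                     size_t *len)
{
    size_t cells = 1 + (size_t)nr_lmbs * DR_LMB_LIST_ENTRY_SIZE;
    uint8_t *buf = calloc(cells, 4);
    uint8_t *p = buf;
    uint32_t i;

    assert(buf != NULL);
    put_be32(p, nr_lmbs);
    p += 4;
    for (i = 0; i < nr_lmbs; i++) {
        /* 64-bit arithmetic so addresses above 4GB do not wrap */
        uint64_t addr = base + (uint64_t)i * lmb_size;

        put_be32(p + 0, (uint32_t)(addr >> 32));
        put_be32(p + 4, (uint32_t)(addr & 0xffffffff));
        put_be32(p + 8, i);   /* placeholder DRC index */
        put_be32(p + 12, 0);  /* reserved */
        put_be32(p + 16, 0);  /* placeholder NUMA node */
        put_be32(p + 20, i < nr_assigned ? LMB_FLAGS_ASSIGNED : 0);
        p += DR_LMB_LIST_ENTRY_SIZE * 4;
    }
    *len = cells * 4;
    return buf;
}
```

Writing the cells byte by byte keeps the result identical on BE and LE hosts,
which is the same effect the patch gets from cpu_to_be32() before handing the
buffer to fdt_setprop().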


* [Qemu-devel] [RFC PATCH v1 13/13] spapr: Memory hotplug support
  2015-01-08  6:10 [Qemu-devel] [RFC PATCH v1 00/13] CPU and Memory hotplug for PowerPC guests Bharata B Rao
                   ` (11 preceding siblings ...)
  2015-01-08  6:10 ` [Qemu-devel] [RFC PATCH v1 12/13] spapr: Support ibm, dynamic-reconfiguration-memory Bharata B Rao
@ 2015-01-08  6:10 ` Bharata B Rao
  2015-02-24  6:26   ` David Gibson
  2015-01-29 17:46 ` [Qemu-devel] [RFC PATCH v1 00/13] CPU and Memory hotplug for PowerPC guests Andreas Färber
  2015-01-29 22:14 ` Tyrel Datwyler
  14 siblings, 1 reply; 50+ messages in thread
From: Bharata B Rao @ 2015-01-08  6:10 UTC (permalink / raw)
  To: qemu-devel; +Cc: imammedo, Bharata B Rao, mdroth, agraf

Make use of pc-dimm infrastructure to support memory hotplug
for PowerPC.

Modelled on i386 memory hotplug.

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
---
 hw/ppc/spapr.c        | 107 +++++++++++++++++++++++++++++++++++++++++++++++++-
 hw/ppc/spapr_events.c |   3 ++
 2 files changed, 108 insertions(+), 2 deletions(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 6964b06..1ffff39 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -58,7 +58,8 @@
 #include "hw/nmi.h"
 
 #include "hw/compat.h"
-
+#include "hw/mem/pc-dimm.h"
+#include "qapi/qmp/qerror.h"
 #include <libfdt.h>
 
 /* SLOF memory layout:
@@ -2165,6 +2166,103 @@ static void spapr_cpu_unplug(HotplugHandler *hotplug_dev, DeviceState *dev,
     return;
 }
 
+static int spapr_dimms_capacity(Object *obj, void *opaque)
+{
+    Error *local_err = NULL;
+    uint64_t *size = opaque;
+
+    if (object_dynamic_cast(obj, TYPE_PC_DIMM)) {
+        (*size) += object_property_get_int(obj, PC_DIMM_SIZE_PROP, &local_err);
+
+        if (local_err) {
+            qerror_report_err(local_err);
+            error_free(local_err);
+            return 1;
+        }
+    }
+
+    object_child_foreach(obj, spapr_dimms_capacity, opaque);
+    return 0;
+}
+
+static void spapr_memory_plug(HotplugHandler *hotplug_dev,
+                         DeviceState *dev, Error **errp)
+{
+    int slot;
+    Error *local_err = NULL;
+    sPAPRMachineState *ms = SPAPR_MACHINE(hotplug_dev);
+    MachineState *machine = MACHINE(hotplug_dev);
+    PCDIMMDevice *dimm = PC_DIMM(dev);
+    PCDIMMDeviceClass *ddc = PC_DIMM_GET_CLASS(dimm);
+    MemoryRegion *mr = ddc->get_memory_region(dimm);
+    uint64_t dimms_capacity = 0;
+    uint64_t align = TARGET_PAGE_SIZE; /* TODO: enforce alignment */
+    uint64_t addr;
+    sPAPRDRConnector *drc;
+
+    addr = object_property_get_int(OBJECT(dimm), PC_DIMM_ADDR_PROP, &local_err);
+    if (local_err) {
+        goto out;
+    }
+
+    addr = pc_dimm_get_free_addr(ms->hotplug_memory_base,
+                                 memory_region_size(&ms->hotplug_memory),
+                                 !addr ? NULL : &addr, align,
+                                 memory_region_size(mr), &local_err);
+    if (local_err) {
+        goto out;
+    }
+
+    if (spapr_dimms_capacity(OBJECT(machine), &dimms_capacity)) {
+        error_setg(&local_err, "failed to get total size of existing DIMMs");
+        goto out;
+    }
+
+    if (dimms_capacity > machine->maxram_size - machine->ram_size) {
+        error_setg(&local_err, "not enough space, proposed use of 0x%" PRIx64
+                   " from total of 0x" RAM_ADDR_FMT,
+                   dimms_capacity, machine->maxram_size);
+        goto out;
+    }
+
+    object_property_set_int(OBJECT(dev), addr, PC_DIMM_ADDR_PROP, &local_err);
+    if (local_err) {
+        goto out;
+    }
+
+    slot = object_property_get_int(OBJECT(dev), PC_DIMM_SLOT_PROP, &local_err);
+    if (local_err) {
+        goto out;
+    }
+
+    slot = pc_dimm_get_free_slot(slot == PC_DIMM_UNASSIGNED_SLOT ? NULL : &slot,
+                                 machine->ram_slots, &local_err);
+    if (local_err) {
+        goto out;
+    }
+    object_property_set_int(OBJECT(dev), slot, PC_DIMM_SLOT_PROP, &local_err);
+    if (local_err) {
+        goto out;
+    }
+
+    if (kvm_enabled() && !kvm_has_free_slot(machine)) {
+        error_setg(&local_err, "hypervisor has no free memory slots left");
+        goto out;
+    }
+
+    memory_region_add_subregion(&ms->hotplug_memory,
+                                addr - ms->hotplug_memory_base, mr);
+    vmstate_register_ram(mr, dev);
+
+    drc = spapr_dr_connector_by_id(SPAPR_DR_CONNECTOR_TYPE_LMB,
+            addr/SPAPR_MIN_MEMORY_BLOCK_SIZE);
+    g_assert(drc);
+    spapr_hotplug_req_add_event(drc);
+
+out:
+    error_propagate(errp, local_err);
+}
+
 static void spapr_machine_device_plug(HotplugHandler *hotplug_dev,
                                       DeviceState *dev, Error **errp)
 {
@@ -2172,6 +2270,8 @@ static void spapr_machine_device_plug(HotplugHandler *hotplug_dev,
         if (dev->hotplugged) {
             spapr_cpu_plug(hotplug_dev, dev, errp);
         }
+    } else if (object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM)) {
+        spapr_memory_plug(hotplug_dev, dev, errp);
     }
 }
 
@@ -2182,13 +2282,16 @@ static void spapr_machine_device_unplug(HotplugHandler *hotplug_dev,
         if (dev->hotplugged) {
             spapr_cpu_unplug(hotplug_dev, dev, errp);
         }
+    } else if (object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM)) {
+        error_setg(errp, "Memory hot unplug is not yet supported");
     }
 }
 
 static HotplugHandler *spapr_get_hotpug_handler(MachineState *machine,
                                              DeviceState *dev)
 {
-    if (object_dynamic_cast(OBJECT(dev), TYPE_CPU)) {
+    if (object_dynamic_cast(OBJECT(dev), TYPE_CPU) ||
+            object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM)) {
         return HOTPLUG_HANDLER(machine);
     }
     return NULL;
diff --git a/hw/ppc/spapr_events.c b/hw/ppc/spapr_events.c
index 035d8c9..f559661 100644
--- a/hw/ppc/spapr_events.c
+++ b/hw/ppc/spapr_events.c
@@ -374,6 +374,9 @@ static void spapr_hotplug_req_event(sPAPRDRConnector *drc, uint8_t hp_action)
     case SPAPR_DR_CONNECTOR_TYPE_CPU:
         hp->hotplug_type = RTAS_LOG_V6_HP_TYPE_CPU;
         break;
+    case SPAPR_DR_CONNECTOR_TYPE_LMB:
+        hp->hotplug_type = RTAS_LOG_V6_HP_TYPE_MEMORY;
+        break;
     default:
         /* skip notification for unknown connector types */
         g_free(new_hp);
-- 
2.1.0
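
A small aside on the connector lookup at the end of spapr_memory_plug(): the
LMB connector id is the DIMM address divided by SPAPR_MIN_MEMORY_BLOCK_SIZE,
and the patch as posted raises a single add event for that one connector. The
sketch below only shows the arithmetic, including how many 256MB LMBs a larger
DIMM would span. Both helper names are hypothetical, and whether one event per
LMB is needed is an open question that this sketch does not answer.

```c
#include <assert.h>
#include <stdint.h>

/* 256MB, as in the series (written as 1ULL here to keep the math 64-bit) */
#define SPAPR_MIN_MEMORY_BLOCK_SIZE (1ULL << 28)

/* Hypothetical helper: id of the LMB connector that contains @addr. */
static uint32_t lmb_id_for_addr(uint64_t addr)
{
    return (uint32_t)(addr / SPAPR_MIN_MEMORY_BLOCK_SIZE);
}

/*
 * Hypothetical helper: number of LMBs covered by a DIMM of @size bytes
 * starting at @addr. Both values are assumed to be LMB-aligned.
 */
static uint32_t lmbs_in_dimm(uint64_t addr, uint64_t size)
{
    assert(addr % SPAPR_MIN_MEMORY_BLOCK_SIZE == 0);
    assert(size % SPAPR_MIN_MEMORY_BLOCK_SIZE == 0);
    return (uint32_t)(size / SPAPR_MIN_MEMORY_BLOCK_SIZE);
}
```

For example, a 1GB DIMM placed at 4GB starts at LMB id 16 and spans four LMBs.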


* Re: [Qemu-devel] [RFC PATCH v1 01/13] spapr: enable PHB/CPU/LMB hotplug for pseries-2.3
  2015-01-08  6:10 ` [Qemu-devel] [RFC PATCH v1 01/13] spapr: enable PHB/CPU/LMB hotplug for pseries-2.3 Bharata B Rao
@ 2015-01-22 21:08   ` Michael Roth
  2015-01-29  1:04   ` David Gibson
  1 sibling, 0 replies; 50+ messages in thread
From: Michael Roth @ 2015-01-22 21:08 UTC (permalink / raw)
  To: Bharata B Rao, qemu-devel; +Cc: imammedo, agraf

Quoting Bharata B Rao (2015-01-08 00:10:08)
> From: Michael Roth <mdroth@linux.vnet.ibm.com>
> 
> Introduce an sPAPRMachineClass sub-class of MachineClass to
> handle sPAPR-specific machine configuration properties.
> 
> The 'dr_phb[cpu,lmb]_enabled' field of that class can be set as
> part of machine-specific init code, and is then propagated
> to sPAPREnvironment to conditionally enable creation of DRC
> objects and device-tree description to facilitate hotplug
> of PHBs/CPUs/LMBs.
> 
> Since we can't migrate this state to older machine types,
> default the option to false and only enable it for new
> machine types.
> 
> Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
> Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
>               [Added CPU and LMB bits]

Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>

> ---
>  hw/ppc/spapr.c         | 32 ++++++++++++++++++++++++++++++++
>  include/hw/ppc/spapr.h |  3 +++
>  2 files changed, 35 insertions(+)
> 
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index 9eb0a94..71e7052 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -89,11 +89,29 @@
> 
>  #define HTAB_SIZE(spapr)        (1ULL << ((spapr)->htab_shift))
> 
> +typedef struct sPAPRMachineClass sPAPRMachineClass;
>  typedef struct sPAPRMachineState sPAPRMachineState;
> 
>  #define TYPE_SPAPR_MACHINE      "spapr-machine"
>  #define SPAPR_MACHINE(obj) \
>      OBJECT_CHECK(sPAPRMachineState, (obj), TYPE_SPAPR_MACHINE)
> +#define SPAPR_MACHINE_GET_CLASS(obj) \
> +        OBJECT_GET_CLASS(sPAPRMachineClass, obj, TYPE_SPAPR_MACHINE)
> +#define SPAPR_MACHINE_CLASS(klass) \
> +        OBJECT_CLASS_CHECK(sPAPRMachineClass, klass, TYPE_SPAPR_MACHINE)
> +
> +/**
> + * sPAPRMachineClass:
> + */
> +struct sPAPRMachineClass {
> +    /*< private >*/
> +    MachineClass parent_class;
> +
> +    /*< public >*/
> +    bool dr_phb_enabled; /* enable dynamic-reconfig/hotplug of PHBs */
> +    bool dr_cpu_enabled; /* enable dynamic-reconfig/hotplug of CPUs */
> +    bool dr_lmb_enabled; /* enable dynamic-reconfig/hotplug of LMBs */
> +};
> 
>  /**
>   * sPAPRMachineState:
> @@ -1343,6 +1361,7 @@ static SaveVMHandlers savevm_htab_handlers = {
>  /* pSeries LPAR / sPAPR hardware init */
>  static void ppc_spapr_init(MachineState *machine)
>  {
> +    sPAPRMachineClass *smc = SPAPR_MACHINE_GET_CLASS(machine);
>      ram_addr_t ram_size = machine->ram_size;
>      const char *cpu_model = machine->cpu_model;
>      const char *kernel_filename = machine->kernel_filename;
> @@ -1503,6 +1522,10 @@ static void ppc_spapr_init(MachineState *machine)
>      /* We always have at least the nvram device on VIO */
>      spapr_create_nvram(spapr);
> 
> +    spapr->dr_phb_enabled = smc->dr_phb_enabled;
> +    spapr->dr_cpu_enabled = smc->dr_cpu_enabled;
> +    spapr->dr_lmb_enabled = smc->dr_lmb_enabled;
> +
>      /* Set up PCI */
>      spapr_pci_rtas_init();
> 
> @@ -1722,6 +1745,7 @@ static void spapr_nmi(NMIState *n, int cpu_index, Error **errp)
>  static void spapr_machine_class_init(ObjectClass *oc, void *data)
>  {
>      MachineClass *mc = MACHINE_CLASS(oc);
> +    sPAPRMachineClass *smc = SPAPR_MACHINE_CLASS(oc);
>      FWPathProviderClass *fwc = FW_PATH_PROVIDER_CLASS(oc);
>      NMIClass *nc = NMI_CLASS(oc);
> 
> @@ -1733,6 +1757,9 @@ static void spapr_machine_class_init(ObjectClass *oc, void *data)
>      mc->default_boot_order = NULL;
>      mc->kvm_type = spapr_kvm_type;
>      mc->has_dynamic_sysbus = true;
> +    smc->dr_phb_enabled = false;
> +    smc->dr_cpu_enabled = false;
> +    smc->dr_lmb_enabled = false;
> 
>      fwc->get_dev_path = spapr_get_fw_dev_path;
>      nc->nmi_monitor_handler = spapr_nmi;
> @@ -1744,6 +1771,7 @@ static const TypeInfo spapr_machine_info = {
>      .abstract      = true,
>      .instance_size = sizeof(sPAPRMachineState),
>      .instance_init = spapr_machine_initfn,
> +    .class_size    = sizeof(sPAPRMachineClass),
>      .class_init    = spapr_machine_class_init,
>      .interfaces = (InterfaceInfo[]) {
>          { TYPE_FW_PATH_PROVIDER },
> @@ -1788,11 +1816,15 @@ static const TypeInfo spapr_machine_2_2_info = {
>  static void spapr_machine_2_3_class_init(ObjectClass *oc, void *data)
>  {
>      MachineClass *mc = MACHINE_CLASS(oc);
> +    sPAPRMachineClass *smc = SPAPR_MACHINE_CLASS(oc);
> 
>      mc->name = "pseries-2.3";
>      mc->desc = "pSeries Logical Partition (PAPR compliant) v2.3";
>      mc->alias = "pseries";
>      mc->is_default = 1;
> +    smc->dr_phb_enabled = true;
> +    smc->dr_cpu_enabled = true;
> +    smc->dr_lmb_enabled = true;
>  }
> 
>  static const TypeInfo spapr_machine_2_3_info = {
> diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
> index 973193d..b1a0838 100644
> --- a/include/hw/ppc/spapr.h
> +++ b/include/hw/ppc/spapr.h
> @@ -30,6 +30,9 @@ typedef struct sPAPREnvironment {
>      uint64_t rtc_offset;
>      struct PPCTimebase tb;
>      bool has_graphics;
> +    bool dr_phb_enabled; /* hotplug / dynamic-reconfiguration of PHBs */
> +    bool dr_cpu_enabled; /* hotplug / dynamic-reconfiguration of CPUs */
> +    bool dr_lmb_enabled; /* hotplug / dynamic-reconfiguration of LMBs */
> 
>      uint32_t check_exception_irq;
>      Notifier epow_notifier;
> -- 
> 2.1.0


* Re: [Qemu-devel] [RFC PATCH v1 02/13] spapr: Add DRC dt entries for CPUs
  2015-01-08  6:10 ` [Qemu-devel] [RFC PATCH v1 02/13] spapr: Add DRC dt entries for CPUs Bharata B Rao
@ 2015-01-22 21:21   ` Michael Roth
  2015-01-29  1:04   ` David Gibson
  1 sibling, 0 replies; 50+ messages in thread
From: Michael Roth @ 2015-01-22 21:21 UTC (permalink / raw)
  To: Bharata B Rao, qemu-devel; +Cc: imammedo, agraf

Quoting Bharata B Rao (2015-01-08 00:10:09)
> Advertise CPU DR-capability to the guest via device tree.
> 
> Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
> Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
>                [spapr_drc_reset implementation]
> ---
>  hw/ppc/spapr.c | 28 ++++++++++++++++++++++++++++
>  1 file changed, 28 insertions(+)
> 
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index 71e7052..98a32d0 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -807,6 +807,14 @@ static void spapr_finalize_fdt(sPAPREnvironment *spapr,
>          spapr_populate_chosen_stdout(fdt, spapr->vio_bus);
>      }
> 
> +    if (spapr->dr_cpu_enabled) {
> +        int offset = fdt_path_offset(fdt, "/cpus");
> +        ret = spapr_drc_populate_dt(fdt, offset, SPAPR_DR_CONNECTOR_TYPE_CPU);
> +        if (ret < 0) {
> +            fprintf(stderr, "Couldn't set up CPU DR device tree properties\n");
> +        }
> +    }
> +

Doesn't hurt to add the check, but spapr_drc_populate_dt() tries to be smart
enough to no-op if no DRCs have been created for resources of the specified
type, so the spapr_dr_connector_new() calls *should* be the only spots where
it's necessary to check spapr->dr_cpu_enabled.

>      _FDT((fdt_pack(fdt)));
> 
>      if (fdt_totalsize(fdt) > FDT_MAX_SIZE) {
> @@ -1358,6 +1366,16 @@ static SaveVMHandlers savevm_htab_handlers = {
>      .load_state = htab_load,
>  };
> 
> +static void spapr_drc_reset(void *opaque)
> +{
> +    sPAPRDRConnector *drc = opaque;
> +    DeviceState *d = DEVICE(drc);
> +
> +    if (d) {
> +        device_reset(d);
> +    }
> +}
> +
>  /* pSeries LPAR / sPAPR hardware init */
>  static void ppc_spapr_init(MachineState *machine)
>  {
> @@ -1383,6 +1401,7 @@ static void ppc_spapr_init(MachineState *machine)
>      long load_limit, fw_size;
>      bool kernel_le = false;
>      char *filename;
> +    int smt = kvmppc_smt_threads();
> 
>      msi_supported = true;
> 
> @@ -1526,6 +1545,15 @@ static void ppc_spapr_init(MachineState *machine)
>      spapr->dr_cpu_enabled = smc->dr_cpu_enabled;
>      spapr->dr_lmb_enabled = smc->dr_lmb_enabled;
> 
> +    if (spapr->dr_cpu_enabled) {
> +        for (i = 0; i < max_cpus/smp_threads; i++) {
> +            sPAPRDRConnector *drc =
> +                spapr_dr_connector_new(OBJECT(machine),
> +                                       SPAPR_DR_CONNECTOR_TYPE_CPU, i * smt);
> +            qemu_register_reset(spapr_drc_reset, drc);
> +        }
> +    }
> +
>      /* Set up PCI */
>      spapr_pci_rtas_init();
> 
> -- 
> 2.1.0


* Re: [Qemu-devel] [RFC PATCH v1 05/13] spapr: Support ibm, lrdr-capacity device tree property
  2015-01-08  6:10 ` [Qemu-devel] [RFC PATCH v1 05/13] spapr: Support ibm, lrdr-capacity device tree property Bharata B Rao
@ 2015-01-22 21:55   ` Michael Roth
  2015-01-30  8:51     ` Bharata B Rao
  2015-01-29  1:16   ` David Gibson
  1 sibling, 1 reply; 50+ messages in thread
From: Michael Roth @ 2015-01-22 21:55 UTC (permalink / raw)
  To: Bharata B Rao, qemu-devel; +Cc: imammedo, agraf

Quoting Bharata B Rao (2015-01-08 00:10:12)
> Add support for ibm,lrdr-capacity since this is needed by the guest
> kernel to know about the possible hot-pluggable CPUs and Memory.
> 
> Define minimum hotpluggable memory size as 256MB and start storing maximum
> possible memory for the guest in sPAPREnvironment.
> 
> Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
> ---
>  hw/ppc/spapr.c         |  3 ++-
>  hw/ppc/spapr_rtas.c    | 28 ++++++++++++++++++++++++++--
>  include/hw/ppc/spapr.h |  6 ++++--
>  3 files changed, 32 insertions(+), 5 deletions(-)
> 
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index f49b0fa..515d770 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -775,7 +775,7 @@ static void spapr_finalize_fdt(sPAPREnvironment *spapr,
>      }
> 
>      /* RTAS */
> -    ret = spapr_rtas_device_tree_setup(fdt, rtas_addr, rtas_size);
> +    ret = spapr_rtas_device_tree_setup(spapr, fdt, rtas_addr, rtas_size);
>      if (ret < 0) {
>          fprintf(stderr, "Couldn't set up RTAS device tree properties\n");
>      }
> @@ -1473,6 +1473,7 @@ static void ppc_spapr_init(MachineState *machine)
> 
>      /* allocate RAM */
>      spapr->ram_limit = ram_size;
> +    spapr->maxram_limit = machine->maxram_size;
>      memory_region_allocate_system_memory(ram, NULL, "ppc_spapr.ram",
>                                           spapr->ram_limit);
>      memory_region_add_subregion(sysmem, 0, ram);
> diff --git a/hw/ppc/spapr_rtas.c b/hw/ppc/spapr_rtas.c
> index d847f45..e8a0f21 100644
> --- a/hw/ppc/spapr_rtas.c
> +++ b/hw/ppc/spapr_rtas.c
> @@ -29,6 +29,7 @@
>  #include "sysemu/char.h"
>  #include "hw/qdev.h"
>  #include "sysemu/device_tree.h"
> +#include "sysemu/cpus.h"
> 
>  #include "hw/ppc/spapr.h"
>  #include "hw/ppc/spapr_vio.h"
> @@ -551,11 +552,12 @@ void spapr_rtas_register(int token, const char *name, spapr_rtas_fn fn)
>      rtas_table[token].fn = fn;
>  }
> 
> -int spapr_rtas_device_tree_setup(void *fdt, hwaddr rtas_addr,
> -                                 hwaddr rtas_size)
> +int spapr_rtas_device_tree_setup(sPAPREnvironment *spapr, void *fdt,
> +                                 hwaddr rtas_addr, hwaddr rtas_size)
>  {
>      int ret;
>      int i;
> +    uint32_t lrdr_capacity[5];
> 
>      ret = fdt_add_mem_rsv(fdt, rtas_addr, rtas_size);
>      if (ret < 0) {
> @@ -604,6 +606,28 @@ int spapr_rtas_device_tree_setup(void *fdt, hwaddr rtas_addr,
>          }
> 
>      }
> +
> +    ret = qemu_fdt_setprop_cell(fdt, "/rtas", "#address-cells", 0x2);
> +    if (ret < 0) {
> +        fprintf(stderr, "Couldn't add #address-cells rtas property\n");
> +    }
> +
> +    ret = qemu_fdt_setprop_cell(fdt, "/rtas", "#size-cells", 0x2);
> +    if (ret < 0) {
> +        fprintf(stderr, "Couldn't add #size-cells rtas property\n");
> +    }
> +
> +    lrdr_capacity[0] = cpu_to_be32(spapr->maxram_limit >> 32);
> +    lrdr_capacity[1] = cpu_to_be32(spapr->maxram_limit & 0xffffffff);
> +    lrdr_capacity[2] = 0;
> +    lrdr_capacity[3] = cpu_to_be32(SPAPR_MIN_MEMORY_BLOCK_SIZE);
> +    lrdr_capacity[4] = cpu_to_be32(max_cpus/smp_threads);
> +    ret = qemu_fdt_setprop(fdt, "/rtas", "ibm,lrdr-capacity", lrdr_capacity,
> +                     sizeof(lrdr_capacity));
> +    if (ret < 0) {
> +        fprintf(stderr, "Couldn't add ibm,lrdr-capacity rtas property\n");
> +    }
> +

The property seems simple enough, but it would be worthwhile to add a
description of how/when it's used in a new section of
docs/specs/ppc-spapr-hotplug.txt to keep the documentation complete.
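
As a starting point for such a description, here is a standalone sketch of the
five cells as the hunk above fills them in: maximum RAM address (high, low), a
zero cell followed by the LMB size, and the maximum number of cores. The names
encode_lrdr_capacity() and put_be32() are made up for illustration and are not
QEMU functions; the field meanings are taken from the patch, not checked
against PAPR here.

```c
#include <assert.h>
#include <stdint.h>

/* Store a 32-bit cell big-endian regardless of host byte order. */
static void put_be32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v >> 24);
    p[1] = (uint8_t)(v >> 16);
    p[2] = (uint8_t)(v >> 8);
    p[3] = (uint8_t)v;
}

/*
 * Hypothetical helper mirroring the five cells written by the patch:
 *   [0..1] maximum RAM address, high then low word
 *   [2..3] zero, then the LMB size
 *   [4]    max_cpus / smp_threads, i.e. the number of cores
 */
static void encode_lrdr_capacity(uint8_t out[20], uint64_t maxram,
                                 uint32_t lmb_size, uint32_t max_cores)
{
    assert(lmb_size != 0);
    put_be32(out + 0, (uint32_t)(maxram >> 32));
    put_be32(out + 4, (uint32_t)(maxram & 0xffffffff));
    put_be32(out + 8, 0);
    put_be32(out + 12, lmb_size);
    put_be32(out + 16, max_cores);
}
```

With maxram = 6GB, a 256MB LMB size and 8 cores, the cells come out as
0x1, 0x80000000, 0x0, 0x10000000, 0x8.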

>      return 0;
>  }
> 
> diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
> index 831db6b..ae8b4e1 100644
> --- a/include/hw/ppc/spapr.h
> +++ b/include/hw/ppc/spapr.h
> @@ -18,6 +18,7 @@ typedef struct sPAPREnvironment {
>      XICSState *icp;
> 
>      hwaddr ram_limit;
> +    hwaddr maxram_limit;
>      void *htab;
>      uint32_t htab_shift;
>      hwaddr rma_size;
> @@ -444,8 +445,8 @@ void spapr_rtas_register(int token, const char *name, spapr_rtas_fn fn);
>  target_ulong spapr_rtas_call(PowerPCCPU *cpu, sPAPREnvironment *spapr,
>                               uint32_t token, uint32_t nargs, target_ulong args,
>                               uint32_t nret, target_ulong rets);
> -int spapr_rtas_device_tree_setup(void *fdt, hwaddr rtas_addr,
> -                                 hwaddr rtas_size);
> +int spapr_rtas_device_tree_setup(sPAPREnvironment *spapr, void *fdt,
> +                                 hwaddr rtas_addr, hwaddr rtas_size);
> 
>  #define SPAPR_TCE_PAGE_SHIFT   12
>  #define SPAPR_TCE_PAGE_SIZE    (1ULL << SPAPR_TCE_PAGE_SHIFT)
> @@ -479,6 +480,7 @@ struct sPAPRTCETable {
>  };
> 
>  #define TIMEBASE_FREQ           512000000ULL
> +#define SPAPR_MIN_MEMORY_BLOCK_SIZE (1 << 28) /* 256MB */

Is this actually the min, or a set increment size? Documentation suggests
the latter, in which case the naming is a little confusing.
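
To make the distinction concrete, a minimal sketch of the two readings follows.
The helper names are hypothetical, and which reading matches the intended
semantics is exactly the open question.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* 256MB, as in the patch (1ULL here so 64-bit sizes compare cleanly) */
#define SPAPR_MIN_MEMORY_BLOCK_SIZE (1ULL << 28)

/* "Minimum" reading: any size at or above 256MB is acceptable. */
static bool size_ok_as_minimum(uint64_t size)
{
    return size >= SPAPR_MIN_MEMORY_BLOCK_SIZE;
}

/* "Increment" reading: size must be a non-zero multiple of 256MB. */
static bool size_ok_as_increment(uint64_t size)
{
    return size != 0 && size % SPAPR_MIN_MEMORY_BLOCK_SIZE == 0;
}

/* Number of LMBs for @size under the "increment" reading. */
static uint64_t lmb_count(uint64_t size)
{
    assert(size_ok_as_increment(size));
    return size / SPAPR_MIN_MEMORY_BLOCK_SIZE;
}
```

A 384MB request passes the first check but not the second, which is where the
current name could mislead.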

> 
>  void spapr_events_init(sPAPREnvironment *spapr);
>  void spapr_events_fdt_skel(void *fdt, uint32_t epow_irq);
> -- 
> 2.1.0


* Re: [Qemu-devel] [RFC PATCH v1 06/13] spapr: CPU hotplug support
  2015-01-08  6:10 ` [Qemu-devel] [RFC PATCH v1 06/13] spapr: CPU hotplug support Bharata B Rao
@ 2015-01-22 22:16   ` Michael Roth
  2015-01-28  4:19     ` Bharata B Rao
  2015-01-23 12:41   ` Igor Mammedov
  2015-01-29  1:31   ` David Gibson
  2 siblings, 1 reply; 50+ messages in thread
From: Michael Roth @ 2015-01-22 22:16 UTC (permalink / raw)
  To: Bharata B Rao, qemu-devel; +Cc: imammedo, agraf

Quoting Bharata B Rao (2015-01-08 00:10:13)
> Support CPU hotplug via device-add command. Use the existing EPOW event
> infrastructure to send CPU hotplug notification to the guest.
> 
> Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
> ---
>  hw/ppc/spapr.c              | 205 +++++++++++++++++++++++++++++++++++++++++++-
>  hw/ppc/spapr_events.c       |   8 +-
>  target-ppc/translate_init.c |   6 ++
>  3 files changed, 215 insertions(+), 4 deletions(-)
> 
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index 515d770..a293a59 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -330,6 +330,8 @@ static void add_str(GString *s, const gchar *s1)
>      g_string_append_len(s, s1, strlen(s1) + 1);
>  }
> 
> +uint32_t cpus_per_socket;
> +
>  static void *spapr_create_fdt_skel(hwaddr initrd_base,
>                                     hwaddr initrd_size,
>                                     hwaddr kernel_size,
> @@ -350,9 +352,9 @@ static void *spapr_create_fdt_skel(hwaddr initrd_base,
>      unsigned char vec5[] = {0x0, 0x0, 0x0, 0x0, 0x0, 0x80};
>      QemuOpts *opts = qemu_opts_find(qemu_find_opts("smp-opts"), NULL);
>      unsigned sockets = opts ? qemu_opt_get_number(opts, "sockets", 0) : 0;
> -    uint32_t cpus_per_socket = sockets ? (smp_cpus / sockets) : 1;
>      char *buf;
> 
> +    cpus_per_socket = sockets ? (smp_cpus / sockets) : 1;
>      add_str(hypertas, "hcall-pft");
>      add_str(hypertas, "hcall-term");
>      add_str(hypertas, "hcall-dabr");
> @@ -1744,12 +1746,209 @@ static void spapr_nmi(NMIState *n, int cpu_index, Error **errp)
>      }
>  }
> 
> +/* TODO: Duplicates code from spapr_create_fdt_skel(), Fix this */
> +static void spapr_populate_cpu_dt(CPUState *cs, void *fdt, int offset,
> +            int drc_index)
> +{
> +    PowerPCCPU *cpu = POWERPC_CPU(cs);
> +    CPUPPCState *env = &cpu->env;
> +    PowerPCCPUClass *pcc = POWERPC_CPU_GET_CLASS(cs);
> +    int index = ppc_get_vcpu_dt_id(cpu);
> +    uint32_t segs[] = {cpu_to_be32(28), cpu_to_be32(40),
> +                       0xffffffff, 0xffffffff};
> +    uint32_t tbfreq = kvm_enabled() ? kvmppc_get_tbfreq() : TIMEBASE_FREQ;
> +    uint32_t cpufreq = kvm_enabled() ? kvmppc_get_clockfreq() : 1000000000;
> +    uint32_t page_sizes_prop[64];
> +    size_t page_sizes_prop_size;
> +    int smpt = ppc_get_compat_smt_threads(cpu);
> +    uint32_t servers_prop[smpt];
> +    uint32_t gservers_prop[smpt * 2];
> +    int i;
> +    uint32_t pft_size_prop[] = {0, cpu_to_be32(spapr->htab_shift)};
> +    uint32_t associativity[] = {cpu_to_be32(0x5),
> +                                cpu_to_be32(0x0),
> +                                cpu_to_be32(0x0),
> +                                cpu_to_be32(0x0),
> +                                cpu_to_be32(cs->numa_node),
> +                                cpu_to_be32(index)};
> +
> +    _FDT((fdt_setprop_cell(fdt, offset, "reg", index)));
> +    _FDT((fdt_setprop_string(fdt, offset, "device_type", "cpu")));
> +
> +    _FDT((fdt_setprop_cell(fdt, offset, "cpu-version", env->spr[SPR_PVR])));
> +    _FDT((fdt_setprop_cell(fdt, offset, "d-cache-block-size",
> +                        env->dcache_line_size)));
> +    _FDT((fdt_setprop_cell(fdt, offset, "d-cache-line-size",
> +                        env->dcache_line_size)));
> +    _FDT((fdt_setprop_cell(fdt, offset, "i-cache-block-size",
> +                        env->icache_line_size)));
> +    _FDT((fdt_setprop_cell(fdt, offset, "i-cache-line-size",
> +                            env->icache_line_size)));
> +
> +    if (pcc->l1_dcache_size) {
> +        _FDT((fdt_setprop_cell(fdt, offset, "d-cache-size",
> +            pcc->l1_dcache_size)));
> +    } else {
> +        fprintf(stderr, "Warning: Unknown L1 dcache size for cpu\n");
> +    }
> +    if (pcc->l1_icache_size) {
> +        _FDT((fdt_setprop_cell(fdt, offset, "i-cache-size",
> +            pcc->l1_icache_size)));
> +    } else {
> +        fprintf(stderr, "Warning: Unknown L1 icache size for cpu\n");
> +    }
> +
> +    _FDT((fdt_setprop_cell(fdt, offset, "timebase-frequency", tbfreq)));
> +    _FDT((fdt_setprop_cell(fdt, offset, "clock-frequency", cpufreq)));
> +    _FDT((fdt_setprop_cell(fdt, offset, "ibm,slb-size", env->slb_nr)));
> +    _FDT((fdt_setprop_string(fdt, offset, "status", "okay")));
> +    _FDT((fdt_setprop(fdt, offset, "64-bit", NULL, 0)));
> +
> +    if (env->spr_cb[SPR_PURR].oea_read) {
> +        _FDT((fdt_setprop(fdt, offset, "ibm,purr", NULL, 0)));
> +    }
> +
> +    if (env->mmu_model & POWERPC_MMU_1TSEG) {
> +        _FDT((fdt_setprop(fdt, offset, "ibm,processor-segment-sizes",
> +                           segs, sizeof(segs))));
> +    }
> +
> +    /* Advertise VMX/VSX (vector extensions) if available
> +     *   0 / no property == no vector extensions
> +     *   1               == VMX / Altivec available
> +     *   2               == VSX available */
> +    if (env->insns_flags & PPC_ALTIVEC) {
> +        uint32_t vmx = (env->insns_flags2 & PPC2_VSX) ? 2 : 1;
> +
> +        _FDT((fdt_setprop_cell(fdt, offset, "ibm,vmx", vmx)));
> +    }
> +
> +    /* Advertise DFP (Decimal Floating Point) if available
> +     *   0 / no property == no DFP
> +     *   1               == DFP available */
> +    if (env->insns_flags2 & PPC2_DFP) {
> +        _FDT((fdt_setprop_cell(fdt, offset, "ibm,dfp", 1)));
> +    }
> +
> +    page_sizes_prop_size = create_page_sizes_prop(env, page_sizes_prop,
> +                                                  sizeof(page_sizes_prop));
> +    if (page_sizes_prop_size) {
> +        _FDT((fdt_setprop(fdt, offset, "ibm,segment-page-sizes",
> +                           page_sizes_prop, page_sizes_prop_size)));
> +    }
> +
> +    _FDT((fdt_setprop_cell(fdt, offset, "ibm,chip-id",
> +                                cs->cpu_index / cpus_per_socket)));
> +
> +    _FDT((fdt_setprop_cell(fdt, offset, "ibm,my-drc-index", drc_index)));
> +
> +    /* Build interrupt servers and gservers properties */
> +    for (i = 0; i < smpt; i++) {
> +        servers_prop[i] = cpu_to_be32(index + i);
> +        /* Hack, direct the group queues back to cpu 0 */
> +        gservers_prop[i*2] = cpu_to_be32(index + i);
> +        gservers_prop[i*2 + 1] = 0;
> +    }
> +    _FDT(fdt_setprop(fdt, offset, "ibm,ppc-interrupt-server#s",
> +                       servers_prop, sizeof(servers_prop)));
> +    _FDT(fdt_setprop(fdt, offset, "ibm,ppc-interrupt-gserver#s",
> +                      gservers_prop, sizeof(gservers_prop)));
> +    _FDT(fdt_setprop(fdt, offset, "ibm,pft-size",
> +                          pft_size_prop, sizeof(pft_size_prop)));
> +
> +    if (nb_numa_nodes > 1) {
> +        _FDT(fdt_setprop(fdt, offset, "ibm,associativity", associativity,
> +                          sizeof(associativity)));
> +    }
> +}
> +
> +static void spapr_cpu_hotplug_add(DeviceState *dev, CPUState *cs)
> +{
> +    PowerPCCPU *cpu = POWERPC_CPU(cs);
> +    DeviceClass *dc = DEVICE_GET_CLASS(cs);
> +    int id = ppc_get_vcpu_dt_id(cpu);
> +    sPAPRDRConnector *drc =
> +        spapr_dr_connector_by_id(SPAPR_DR_CONNECTOR_TYPE_CPU, id);
> +    sPAPRDRConnectorClass *drck = SPAPR_DR_CONNECTOR_GET_CLASS(drc);
> +    int drc_index = drck->get_index(drc);
> +    void *fdt, *fdt_orig;
> +    int offset, i;
> +    char *nodename;
> +
> +    /* add OF node for CPU and required OF DT properties */
> +    fdt_orig = g_malloc0(FDT_MAX_SIZE);
> +    offset = fdt_create(fdt_orig, FDT_MAX_SIZE);
> +    fdt_begin_node(fdt_orig, "");
> +    fdt_end_node(fdt_orig);
> +    fdt_finish(fdt_orig);
> +
> +    fdt = g_malloc0(FDT_MAX_SIZE);
> +    fdt_open_into(fdt_orig, fdt, FDT_MAX_SIZE);
> +
> +    nodename = g_strdup_printf("%s@%x", dc->fw_name, id);
> +
> +    offset = fdt_add_subnode(fdt, offset, nodename);
> +
> +    /* Set NUMA node for the added CPU */
> +    for (i = 0; i < nb_numa_nodes; i++) {
> +        if (test_bit(cs->cpu_index, numa_info[i].node_cpu)) {
> +            cs->numa_node = i;
> +            break;
> +        }
> +    }
> +
> +    spapr_populate_cpu_dt(cs, fdt, offset, drc_index);
> +    g_free(fdt_orig);
> +    g_free(nodename);
> +
> +    drck->attach(drc, dev, fdt, offset, false);
> +}
> +
> +static void spapr_cpu_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
> +                            Error **errp)
> +{
> +    Error *local_err = NULL;
> +    CPUState *cs = CPU(dev);
> +    PowerPCCPU *cpu = POWERPC_CPU(cs);
> +    sPAPRDRConnector *drc =
> +        spapr_dr_connector_by_id(SPAPR_DR_CONNECTOR_TYPE_CPU, cpu->cpu_dt_id);
> +
> +    /* TODO: Check if DR is enabled ? */

Er, I take back my comment from earlier, you do also need to check for
spapr->dr_cpu_enabled below, since in the case of PCI it's enabled on a per-PHB
basis, whereas here it's a machine-wide option...

> +    g_assert(drc);
> +
> +    spapr_cpu_reset(POWERPC_CPU(CPU(dev)));
> +    spapr_cpu_hotplug_add(dev, cs);
> +    spapr_hotplug_req_add_event(drc);
> +    error_propagate(errp, local_err);
> +    return;
> +}
> +
> +static void spapr_machine_device_plug(HotplugHandler *hotplug_dev,
> +                                      DeviceState *dev, Error **errp)
> +{
> +    if (object_dynamic_cast(OBJECT(dev), TYPE_CPU)) {
> +        if (dev->hotplugged) {

Maybe just

           if (dev->hotplugged && spapr->dr_cpu_enabled) {
               ...

Would do it

> +            spapr_cpu_plug(hotplug_dev, dev, errp);
> +        }
> +    }
> +}
> +
> +static HotplugHandler *spapr_get_hotpug_handler(MachineState *machine,
> +                                             DeviceState *dev)
> +{
> +    if (object_dynamic_cast(OBJECT(dev), TYPE_CPU)) {
> +        return HOTPLUG_HANDLER(machine);
> +    }
> +    return NULL;
> +}
> +
>  static void spapr_machine_class_init(ObjectClass *oc, void *data)
>  {
>      MachineClass *mc = MACHINE_CLASS(oc);
>      sPAPRMachineClass *smc = SPAPR_MACHINE_CLASS(oc);
>      FWPathProviderClass *fwc = FW_PATH_PROVIDER_CLASS(oc);
>      NMIClass *nc = NMI_CLASS(oc);
> +    HotplugHandlerClass *hc = HOTPLUG_HANDLER_CLASS(oc);
> 
>      mc->init = ppc_spapr_init;
>      mc->reset = ppc_spapr_reset;
> @@ -1759,6 +1958,9 @@ static void spapr_machine_class_init(ObjectClass *oc, void *data)
>      mc->default_boot_order = NULL;
>      mc->kvm_type = spapr_kvm_type;
>      mc->has_dynamic_sysbus = true;
> +    mc->get_hotplug_handler = spapr_get_hotpug_handler;
> +
> +    hc->plug = spapr_machine_device_plug;
>      smc->dr_phb_enabled = false;
>      smc->dr_cpu_enabled = false;
>      smc->dr_lmb_enabled = false;
> @@ -1778,6 +1980,7 @@ static const TypeInfo spapr_machine_info = {
>      .interfaces = (InterfaceInfo[]) {
>          { TYPE_FW_PATH_PROVIDER },
>          { TYPE_NMI },
> +        { TYPE_HOTPLUG_HANDLER },
>          { }
>      },
>  };
> diff --git a/hw/ppc/spapr_events.c b/hw/ppc/spapr_events.c
> index 434a75d..035d8c9 100644
> --- a/hw/ppc/spapr_events.c
> +++ b/hw/ppc/spapr_events.c
> @@ -364,14 +364,16 @@ static void spapr_hotplug_req_event(sPAPRDRConnector *drc, uint8_t hp_action)
>      hp->hdr.section_length = cpu_to_be16(sizeof(*hp));
>      hp->hdr.section_version = 1; /* includes extended modifier */
>      hp->hotplug_action = hp_action;
> -
> +    hp->drc.index = cpu_to_be32(drck->get_index(drc));
> +    hp->hotplug_identifier = RTAS_LOG_V6_HP_ID_DRC_INDEX;
> 
>      switch (drc_type) {
>      case SPAPR_DR_CONNECTOR_TYPE_PCI:
> -        hp->drc.index = cpu_to_be32(drck->get_index(drc));
> -        hp->hotplug_identifier = RTAS_LOG_V6_HP_ID_DRC_INDEX;
>          hp->hotplug_type = RTAS_LOG_V6_HP_TYPE_PCI;
>          break;
> +    case SPAPR_DR_CONNECTOR_TYPE_CPU:
> +        hp->hotplug_type = RTAS_LOG_V6_HP_TYPE_CPU;
> +        break;
>      default:
>          /* skip notification for unknown connector types */
>          g_free(new_hp);
> diff --git a/target-ppc/translate_init.c b/target-ppc/translate_init.c
> index 9c642a5..cf9d8d3 100644
> --- a/target-ppc/translate_init.c
> +++ b/target-ppc/translate_init.c
> @@ -32,6 +32,7 @@
>  #include "hw/qdev-properties.h"
>  #include "hw/ppc/spapr.h"
>  #include "hw/ppc/ppc.h"
> +#include "sysemu/sysemu.h"
> 
>  //#define PPC_DUMP_CPU
>  //#define PPC_DEBUG_SPR
> @@ -8909,6 +8910,11 @@ static void ppc_cpu_realizefn(DeviceState *dev, Error **errp)
>          return;
>      }
> 
> +    if (cs->cpu_index >= max_cpus) {
> +        error_setg(errp, "Can't have more CPUs, maxcpus limit reached");
> +        return;
> +    }
> +
>      cpu->cpu_dt_id = (cs->cpu_index / smp_threads) * max_smt
>          + (cs->cpu_index % smp_threads);
>  #endif
> -- 
> 2.1.0


* Re: [Qemu-devel] [RFC PATCH v1 06/13] spapr: CPU hotplug support
  2015-01-08  6:10 ` [Qemu-devel] [RFC PATCH v1 06/13] spapr: CPU hotplug support Bharata B Rao
  2015-01-22 22:16   ` Michael Roth
@ 2015-01-23 12:41   ` Igor Mammedov
  2015-01-30  6:59     ` Bharata B Rao
  2015-01-29  1:31   ` David Gibson
  2 siblings, 1 reply; 50+ messages in thread
From: Igor Mammedov @ 2015-01-23 12:41 UTC (permalink / raw)
  To: Bharata B Rao; +Cc: agraf, qemu-devel, mdroth

On Thu,  8 Jan 2015 11:40:13 +0530
Bharata B Rao <bharata@linux.vnet.ibm.com> wrote:

> Support CPU hotplug via device-add command. Use the existing EPOW event
> infrastructure to send CPU hotplug notification to the guest.
> 
> Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
> ---
>  hw/ppc/spapr.c              | 205 +++++++++++++++++++++++++++++++++++++++++++-
>  hw/ppc/spapr_events.c       |   8 +-
>  target-ppc/translate_init.c |   6 ++
>  3 files changed, 215 insertions(+), 4 deletions(-)
> 
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index 515d770..a293a59 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -330,6 +330,8 @@ static void add_str(GString *s, const gchar *s1)
>      g_string_append_len(s, s1, strlen(s1) + 1);
>  }
>  
> +uint32_t cpus_per_socket;
Should this be static?

> +
>  static void *spapr_create_fdt_skel(hwaddr initrd_base,
>                                     hwaddr initrd_size,
>                                     hwaddr kernel_size,
> @@ -350,9 +352,9 @@ static void *spapr_create_fdt_skel(hwaddr initrd_base,
>      unsigned char vec5[] = {0x0, 0x0, 0x0, 0x0, 0x0, 0x80};
>      QemuOpts *opts = qemu_opts_find(qemu_find_opts("smp-opts"), NULL);
>      unsigned sockets = opts ? qemu_opt_get_number(opts, "sockets", 0) : 0;
> -    uint32_t cpus_per_socket = sockets ? (smp_cpus / sockets) : 1;
>      char *buf;
>  
> +    cpus_per_socket = sockets ? (smp_cpus / sockets) : 1;
>      add_str(hypertas, "hcall-pft");
>      add_str(hypertas, "hcall-term");
>      add_str(hypertas, "hcall-dabr");
> @@ -1744,12 +1746,209 @@ static void spapr_nmi(NMIState *n, int cpu_index, Error **errp)
>      }
>  }
>  
> +/* TODO: Duplicates code from spapr_create_fdt_skel(), Fix this */
> +static void spapr_populate_cpu_dt(CPUState *cs, void *fdt, int offset,
> +            int drc_index)
> +{
> +    PowerPCCPU *cpu = POWERPC_CPU(cs);
> +    CPUPPCState *env = &cpu->env;
> +    PowerPCCPUClass *pcc = POWERPC_CPU_GET_CLASS(cs);
> +    int index = ppc_get_vcpu_dt_id(cpu);
> +    uint32_t segs[] = {cpu_to_be32(28), cpu_to_be32(40),
> +                       0xffffffff, 0xffffffff};
> +    uint32_t tbfreq = kvm_enabled() ? kvmppc_get_tbfreq() : TIMEBASE_FREQ;
> +    uint32_t cpufreq = kvm_enabled() ? kvmppc_get_clockfreq() : 1000000000;
> +    uint32_t page_sizes_prop[64];
> +    size_t page_sizes_prop_size;
> +    int smpt = ppc_get_compat_smt_threads(cpu);
> +    uint32_t servers_prop[smpt];
> +    uint32_t gservers_prop[smpt * 2];
> +    int i;
> +    uint32_t pft_size_prop[] = {0, cpu_to_be32(spapr->htab_shift)};
> +    uint32_t associativity[] = {cpu_to_be32(0x5),
> +                                cpu_to_be32(0x0),
> +                                cpu_to_be32(0x0),
> +                                cpu_to_be32(0x0),
> +                                cpu_to_be32(cs->numa_node),
> +                                cpu_to_be32(index)};
> +
> +    _FDT((fdt_setprop_cell(fdt, offset, "reg", index)));
> +    _FDT((fdt_setprop_string(fdt, offset, "device_type", "cpu")));
> +
> +    _FDT((fdt_setprop_cell(fdt, offset, "cpu-version", env->spr[SPR_PVR])));
> +    _FDT((fdt_setprop_cell(fdt, offset, "d-cache-block-size",
> +                        env->dcache_line_size)));
> +    _FDT((fdt_setprop_cell(fdt, offset, "d-cache-line-size",
> +                        env->dcache_line_size)));
> +    _FDT((fdt_setprop_cell(fdt, offset, "i-cache-block-size",
> +                        env->icache_line_size)));
> +    _FDT((fdt_setprop_cell(fdt, offset, "i-cache-line-size",
> +                            env->icache_line_size)));
> +
> +    if (pcc->l1_dcache_size) {
> +        _FDT((fdt_setprop_cell(fdt, offset, "d-cache-size",
> +            pcc->l1_dcache_size)));
> +    } else {
> +        fprintf(stderr, "Warning: Unknown L1 dcache size for cpu\n");
> +    }
> +    if (pcc->l1_icache_size) {
> +        _FDT((fdt_setprop_cell(fdt, offset, "i-cache-size",
> +            pcc->l1_icache_size)));
> +    } else {
> +        fprintf(stderr, "Warning: Unknown L1 icache size for cpu\n");
> +    }
> +
> +    _FDT((fdt_setprop_cell(fdt, offset, "timebase-frequency", tbfreq)));
> +    _FDT((fdt_setprop_cell(fdt, offset, "clock-frequency", cpufreq)));
> +    _FDT((fdt_setprop_cell(fdt, offset, "ibm,slb-size", env->slb_nr)));
> +    _FDT((fdt_setprop_string(fdt, offset, "status", "okay")));
> +    _FDT((fdt_setprop(fdt, offset, "64-bit", NULL, 0)));
> +
> +    if (env->spr_cb[SPR_PURR].oea_read) {
> +        _FDT((fdt_setprop(fdt, offset, "ibm,purr", NULL, 0)));
> +    }
> +
> +    if (env->mmu_model & POWERPC_MMU_1TSEG) {
> +        _FDT((fdt_setprop(fdt, offset, "ibm,processor-segment-sizes",
> +                           segs, sizeof(segs))));
> +    }
> +
> +    /* Advertise VMX/VSX (vector extensions) if available
> +     *   0 / no property == no vector extensions
> +     *   1               == VMX / Altivec available
> +     *   2               == VSX available */
> +    if (env->insns_flags & PPC_ALTIVEC) {
> +        uint32_t vmx = (env->insns_flags2 & PPC2_VSX) ? 2 : 1;
> +
> +        _FDT((fdt_setprop_cell(fdt, offset, "ibm,vmx", vmx)));
> +    }
> +
> +    /* Advertise DFP (Decimal Floating Point) if available
> +     *   0 / no property == no DFP
> +     *   1               == DFP available */
> +    if (env->insns_flags2 & PPC2_DFP) {
> +        _FDT((fdt_setprop_cell(fdt, offset, "ibm,dfp", 1)));
> +    }
> +
> +    page_sizes_prop_size = create_page_sizes_prop(env, page_sizes_prop,
> +                                                  sizeof(page_sizes_prop));
> +    if (page_sizes_prop_size) {
> +        _FDT((fdt_setprop(fdt, offset, "ibm,segment-page-sizes",
> +                           page_sizes_prop, page_sizes_prop_size)));
> +    }
> +
> +    _FDT((fdt_setprop_cell(fdt, offset, "ibm,chip-id",
> +                                cs->cpu_index / cpus_per_socket)));
> +
> +    _FDT((fdt_setprop_cell(fdt, offset, "ibm,my-drc-index", drc_index)));
> +
> +    /* Build interrupt servers and gservers properties */
> +    for (i = 0; i < smpt; i++) {
> +        servers_prop[i] = cpu_to_be32(index + i);
> +        /* Hack, direct the group queues back to cpu 0 */
> +        gservers_prop[i*2] = cpu_to_be32(index + i);
> +        gservers_prop[i*2 + 1] = 0;
> +    }
> +    _FDT(fdt_setprop(fdt, offset, "ibm,ppc-interrupt-server#s",
> +                       servers_prop, sizeof(servers_prop)));
> +    _FDT(fdt_setprop(fdt, offset, "ibm,ppc-interrupt-gserver#s",
> +                      gservers_prop, sizeof(gservers_prop)));
> +    _FDT(fdt_setprop(fdt, offset, "ibm,pft-size",
> +                          pft_size_prop, sizeof(pft_size_prop)));
> +
> +    if (nb_numa_nodes > 1) {
> +        _FDT(fdt_setprop(fdt, offset, "ibm,associativity", associativity,
> +                          sizeof(associativity)));
> +    }
> +}
> +
> +static void spapr_cpu_hotplug_add(DeviceState *dev, CPUState *cs)
> +{
> +    PowerPCCPU *cpu = POWERPC_CPU(cs);
> +    DeviceClass *dc = DEVICE_GET_CLASS(cs);
> +    int id = ppc_get_vcpu_dt_id(cpu);
> +    sPAPRDRConnector *drc =
> +        spapr_dr_connector_by_id(SPAPR_DR_CONNECTOR_TYPE_CPU, id);
> +    sPAPRDRConnectorClass *drck = SPAPR_DR_CONNECTOR_GET_CLASS(drc);
> +    int drc_index = drck->get_index(drc);
> +    void *fdt, *fdt_orig;
> +    int offset, i;
> +    char *nodename;
> +
> +    /* add OF node for CPU and required OF DT properties */
> +    fdt_orig = g_malloc0(FDT_MAX_SIZE);
> +    offset = fdt_create(fdt_orig, FDT_MAX_SIZE);
> +    fdt_begin_node(fdt_orig, "");
> +    fdt_end_node(fdt_orig);
> +    fdt_finish(fdt_orig);
> +
> +    fdt = g_malloc0(FDT_MAX_SIZE);
> +    fdt_open_into(fdt_orig, fdt, FDT_MAX_SIZE);
> +
> +    nodename = g_strdup_printf("%s@%x", dc->fw_name, id);
> +
> +    offset = fdt_add_subnode(fdt, offset, nodename);
> +
> +    /* Set NUMA node for the added CPU */
> +    for (i = 0; i < nb_numa_nodes; i++) {
> +        if (test_bit(cs->cpu_index, numa_info[i].node_cpu)) {
> +            cs->numa_node = i;
> +            break;
> +        }
> +    }
> +
> +    spapr_populate_cpu_dt(cs, fdt, offset, drc_index);
> +    g_free(fdt_orig);
> +    g_free(nodename);
> +
> +    drck->attach(drc, dev, fdt, offset, false);
> +}
> +
> +static void spapr_cpu_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
> +                            Error **errp)
> +{
> +    Error *local_err = NULL;
> +    CPUState *cs = CPU(dev);
> +    PowerPCCPU *cpu = POWERPC_CPU(cs);
> +    sPAPRDRConnector *drc =
> +        spapr_dr_connector_by_id(SPAPR_DR_CONNECTOR_TYPE_CPU, cpu->cpu_dt_id);
Just a rant: does this have any relation to hotplug_dev? The point here is to get
this data from the hotplug_dev object (or some child of it) rather than via a
direct ad hoc call.

> +
> +    /* TODO: Check if DR is enabled ? */
> +    g_assert(drc);
> +
> +    spapr_cpu_reset(POWERPC_CPU(CPU(dev)));
The reset should probably be done at realize time; see x86_cpu_realizefn() for example.

> +    spapr_cpu_hotplug_add(dev, cs);
> +    spapr_hotplug_req_add_event(drc);
> +    error_propagate(errp, local_err);
> +    return;
> +}
> +
> +static void spapr_machine_device_plug(HotplugHandler *hotplug_dev,
> +                                      DeviceState *dev, Error **errp)
> +{
> +    if (object_dynamic_cast(OBJECT(dev), TYPE_CPU)) {
> +        if (dev->hotplugged) {
> +            spapr_cpu_plug(hotplug_dev, dev, errp);
It would be nicer if this could handle the wiring of cold-plugged CPUs too.

> +        }
> +    }
> +}
> +
> +static HotplugHandler *spapr_get_hotpug_handler(MachineState *machine,
> +                                             DeviceState *dev)
> +{
> +    if (object_dynamic_cast(OBJECT(dev), TYPE_CPU)) {
> +        return HOTPLUG_HANDLER(machine);
> +    }
> +    return NULL;
> +}
> +
>  static void spapr_machine_class_init(ObjectClass *oc, void *data)
>  {
>      MachineClass *mc = MACHINE_CLASS(oc);
>      sPAPRMachineClass *smc = SPAPR_MACHINE_CLASS(oc);
>      FWPathProviderClass *fwc = FW_PATH_PROVIDER_CLASS(oc);
>      NMIClass *nc = NMI_CLASS(oc);
> +    HotplugHandlerClass *hc = HOTPLUG_HANDLER_CLASS(oc);
>  
>      mc->init = ppc_spapr_init;
>      mc->reset = ppc_spapr_reset;
> @@ -1759,6 +1958,9 @@ static void spapr_machine_class_init(ObjectClass *oc, void *data)
>      mc->default_boot_order = NULL;
>      mc->kvm_type = spapr_kvm_type;
>      mc->has_dynamic_sysbus = true;
> +    mc->get_hotplug_handler = spapr_get_hotpug_handler;
> +
> +    hc->plug = spapr_machine_device_plug;
>      smc->dr_phb_enabled = false;
>      smc->dr_cpu_enabled = false;
>      smc->dr_lmb_enabled = false;
> @@ -1778,6 +1980,7 @@ static const TypeInfo spapr_machine_info = {
>      .interfaces = (InterfaceInfo[]) {
>          { TYPE_FW_PATH_PROVIDER },
>          { TYPE_NMI },
> +        { TYPE_HOTPLUG_HANDLER },
>          { }
>      },
>  };
> diff --git a/hw/ppc/spapr_events.c b/hw/ppc/spapr_events.c
> index 434a75d..035d8c9 100644
> --- a/hw/ppc/spapr_events.c
> +++ b/hw/ppc/spapr_events.c
> @@ -364,14 +364,16 @@ static void spapr_hotplug_req_event(sPAPRDRConnector *drc, uint8_t hp_action)
>      hp->hdr.section_length = cpu_to_be16(sizeof(*hp));
>      hp->hdr.section_version = 1; /* includes extended modifier */
>      hp->hotplug_action = hp_action;
> -
> +    hp->drc.index = cpu_to_be32(drck->get_index(drc));
> +    hp->hotplug_identifier = RTAS_LOG_V6_HP_ID_DRC_INDEX;
>  
>      switch (drc_type) {
>      case SPAPR_DR_CONNECTOR_TYPE_PCI:
> -        hp->drc.index = cpu_to_be32(drck->get_index(drc));
> -        hp->hotplug_identifier = RTAS_LOG_V6_HP_ID_DRC_INDEX;
>          hp->hotplug_type = RTAS_LOG_V6_HP_TYPE_PCI;
>          break;
> +    case SPAPR_DR_CONNECTOR_TYPE_CPU:
> +        hp->hotplug_type = RTAS_LOG_V6_HP_TYPE_CPU;
> +        break;
>      default:
>          /* skip notification for unknown connector types */
>          g_free(new_hp);
> diff --git a/target-ppc/translate_init.c b/target-ppc/translate_init.c
> index 9c642a5..cf9d8d3 100644
> --- a/target-ppc/translate_init.c
> +++ b/target-ppc/translate_init.c
> @@ -32,6 +32,7 @@
>  #include "hw/qdev-properties.h"
>  #include "hw/ppc/spapr.h"
>  #include "hw/ppc/ppc.h"
> +#include "sysemu/sysemu.h"
>  
>  //#define PPC_DUMP_CPU
>  //#define PPC_DEBUG_SPR
> @@ -8909,6 +8910,11 @@ static void ppc_cpu_realizefn(DeviceState *dev, Error **errp)
>          return;
>      }
>  
> +    if (cs->cpu_index >= max_cpus) {
Please note that cpu_index increases monotonically, so when you unplug a CPU
and then plug one again, it will go above max_cpus. The same will happen if
one device_add fails in the middle: the next CPU add will fail because
cs->cpu_index has gone past the limit.

I'd suggest not relying on cpu_index for any purpose and using other means
to identify where a CPU is plugged in. On x86 we are slowly getting rid of this
dependency in favor of apic_id (topology information); eventually it could
become:
  -device cpu_foo,socket=X,core=Y[,thread=Z][,node=N]

You could probably do the same.
It doesn't have to be in this series, just be aware of potential issues.
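
A minimal standalone sketch of the problem described above (not QEMU code;
MAX_CPUS, plug_by_counter(), plug_by_slot() and unplug_slot() are hypothetical
names used only for illustration). It contrasts an index that only ever grows
with an allocator that releases slots on unplug:

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_CPUS 4            /* hypothetical stand-in for max_cpus */

/* Model 1: a counter that only ever grows, like cpu_index in the patch. */
static int next_index;

static int plug_by_counter(void)
{
    int idx = next_index++;   /* the index is consumed on every attempt */

    return idx < MAX_CPUS ? idx : -1;
}

/* Model 2: explicit slots that are released again on unplug. */
static bool slot_used[MAX_CPUS];

static int plug_by_slot(void)
{
    for (int i = 0; i < MAX_CPUS; i++) {
        if (!slot_used[i]) {
            slot_used[i] = true;
            return i;
        }
    }
    return -1;                /* every slot is really occupied */
}

static void unplug_slot(int i)
{
    assert(i >= 0 && i < MAX_CPUS);
    slot_used[i] = false;
}
```

If both models are filled up to MAX_CPUS and one CPU is then unplugged, the
next plug_by_counter() call returns -1 even though a CPU was removed, while
plug_by_slot() hands back the freed slot.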


> +        error_setg(errp, "Can't have more CPUs, maxcpus limit reached");
> +        return;
> +    }
> +
>      cpu->cpu_dt_id = (cs->cpu_index / smp_threads) * max_smt
>          + (cs->cpu_index % smp_threads);
>  #endif
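
As a worked example of the cpu_dt_id arithmetic quoted above, here is a small
sketch. The helper name cpu_dt_id() and the sample values (smp_threads = 2,
max_smt = 8) are assumptions chosen for illustration, not values taken from
the patch:

```c
/*
 * Same expression as in ppc_cpu_realizefn(): the core number
 * (cpu_index / smp_threads) is spread out in steps of max_smt, and the
 * thread number within the core (cpu_index % smp_threads) is added on top.
 *
 * With smp_threads = 2 and max_smt = 8:
 *   cpu_index 0 -> 0
 *   cpu_index 1 -> 1
 *   cpu_index 2 -> 8
 *   cpu_index 3 -> 9
 *   cpu_index 5 -> 17
 */
static int cpu_dt_id(int cpu_index, int smp_threads, int max_smt)
{
    return (cpu_index / smp_threads) * max_smt + (cpu_index % smp_threads);
}
```

Because the result is derived purely from cpu_index, any drift in cpu_index
after an unplug carries straight through to the device tree id.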


* Re: [Qemu-devel] [RFC PATCH v1 06/13] spapr: CPU hotplug support
  2015-01-22 22:16   ` Michael Roth
@ 2015-01-28  4:19     ` Bharata B Rao
  2015-01-28  5:41       ` Michael Roth
  0 siblings, 1 reply; 50+ messages in thread
From: Bharata B Rao @ 2015-01-28  4:19 UTC (permalink / raw)
  To: Michael Roth; +Cc: imammedo, qemu-devel, agraf

On Thu, Jan 22, 2015 at 04:16:01PM -0600, Michael Roth wrote:
> Quoting Bharata B Rao (2015-01-08 00:10:13)
> > +static void spapr_machine_device_plug(HotplugHandler *hotplug_dev,
> > +                                      DeviceState *dev, Error **errp)
> > +{
> > +    if (object_dynamic_cast(OBJECT(dev), TYPE_CPU)) {
> > +        if (dev->hotplugged) {
> 
> Maybe just
> 
>            if (dev->hotplugged && spapr->dr_cpu_enabled) {
>                ...
> 
> Would do it

This is a common ->plug() handler and would be used for memory too. Hence
there is a need to identify the type of object (CPU or memory) and handle it
differently.
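
The dispatch-by-type idea can be sketched in isolation as follows. This is a
standalone illustration, not the QEMU object model: DevKind, Device,
PlugResult and machine_device_plug() are hypothetical stand-ins for the
object_dynamic_cast() checks against TYPE_CPU and a memory device type.

```c
#include <stdbool.h>

/* Hypothetical device kinds; the real code would test QOM types instead. */
typedef enum { DEV_CPU, DEV_MEMORY, DEV_OTHER } DevKind;

typedef struct {
    DevKind kind;
    bool hotplugged;
} Device;

typedef enum { HANDLED_CPU, HANDLED_MEMORY, IGNORED } PlugResult;

/* One common plug handler that branches on what kind of device arrived. */
static PlugResult machine_device_plug(const Device *dev)
{
    if (!dev->hotplugged) {
        return IGNORED;       /* cold-plugged devices are not handled here */
    }

    switch (dev->kind) {
    case DEV_CPU:
        return HANDLED_CPU;     /* would call the CPU plug path */
    case DEV_MEMORY:
        return HANDLED_MEMORY;  /* would call the memory plug path */
    default:
        return IGNORED;
    }
}
```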

Regards,
Bharata.


* Re: [Qemu-devel] [RFC PATCH v1 06/13] spapr: CPU hotplug support
  2015-01-28  4:19     ` Bharata B Rao
@ 2015-01-28  5:41       ` Michael Roth
  0 siblings, 0 replies; 50+ messages in thread
From: Michael Roth @ 2015-01-28  5:41 UTC (permalink / raw)
  To: bharata; +Cc: imammedo, qemu-devel, agraf

Quoting Bharata B Rao (2015-01-27 22:19:56)
> On Thu, Jan 22, 2015 at 04:16:01PM -0600, Michael Roth wrote:
> > Quoting Bharata B Rao (2015-01-08 00:10:13)
> > > +static void spapr_machine_device_plug(HotplugHandler *hotplug_dev,
> > > +                                      DeviceState *dev, Error **errp)
> > > +{
> > > +    if (object_dynamic_cast(OBJECT(dev), TYPE_CPU)) {
> > > +        if (dev->hotplugged) {
> > 
> > Maybe just
> > 
> >            if (dev->hotplugged && spapr->dr_cpu_enabled) {
> >                ...
> > 
> > Would do it
> 
> This is a common ->plug() handler and would be used for memory too. Hence
> there is a need to identify the type of object (CPU or memory) and handle it
> differently.

I mean in terms of the "/* TODO: Check if DR is enabled ? */". Adding this
check here, as well as during spapr_dr_connector_new(), should cover all
the cases where the value needs to be checked AFAICT.
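
A rough standalone sketch of that arrangement, with the machine-wide flag
consulted both when connectors are created and in the plug path. Machine,
Connector, connector_new() and cpu_plug() are hypothetical names; only
dr_cpu_enabled comes from the patch:

```c
#include <stdbool.h>
#include <stddef.h>

typedef struct {
    bool dr_cpu_enabled;      /* machine-wide option, as in the patch */
} Machine;

typedef struct {
    int id;
} Connector;

/* Creation site: hand out a connector only when CPU DR is enabled. */
static Connector *connector_new(const Machine *m, Connector *storage, int id)
{
    if (!m->dr_cpu_enabled) {
        return NULL;
    }
    storage->id = id;
    return storage;
}

/*
 * Plug path: act only on hotplugged CPUs, and only when DR is enabled.
 * Returns true when a hotplug event would be raised for the guest.
 */
static bool cpu_plug(const Machine *m, bool hotplugged, const Connector *drc)
{
    if (!hotplugged || !m->dr_cpu_enabled) {
        return false;
    }
    return drc != NULL;
}
```

With the flag off, no connector exists and the plug path returns early, so
the two checks agree with each other.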

> 
> Regards,
> Bharata.


* Re: [Qemu-devel] [RFC PATCH v1 01/13] spapr: enable PHB/CPU/LMB hotplug for pseries-2.3
  2015-01-08  6:10 ` [Qemu-devel] [RFC PATCH v1 01/13] spapr: enable PHB/CPU/LMB hotplug for pseries-2.3 Bharata B Rao
  2015-01-22 21:08   ` Michael Roth
@ 2015-01-29  1:04   ` David Gibson
  1 sibling, 0 replies; 50+ messages in thread
From: David Gibson @ 2015-01-29  1:04 UTC (permalink / raw)
  To: Bharata B Rao; +Cc: imammedo, agraf, qemu-devel, mdroth

On Thu, Jan 08, 2015 at 11:40:08AM +0530, Bharata B Rao wrote:
> From: Michael Roth <mdroth@linux.vnet.ibm.com>
> 
> Introduce an sPAPRMachineClass sub-class of MachineClass to
> handle sPAPR-specific machine configuration properties.
> 
> The 'dr_phb[cpu,lmb]_enabled' field of that class can be set as
> part of machine-specific init code, and is then propagated
> to sPAPREnvironment to conditionally enable creation of DRC
> objects and device-tree description to facilitate hotplug
> of PHBs/CPUs/LMBs.
> 
> Since we can't migrate this state to older machine types,
> default the option to false and only enable it for new
> machine types.
> 
> Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
> Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
>               [Added CPU and LMB bits]

Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
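
The default-off, enable-for-new-machine-types pattern from the commit message
can be sketched roughly as below. The struct and function names are
hypothetical, and creating one connector per possible CPU is an assumption
made for the example; only the dr_*_enabled field names are taken from the
series:

```c
#include <stdbool.h>

typedef struct {
    bool dr_phb_enabled;
    bool dr_cpu_enabled;
    bool dr_lmb_enabled;
} MachineClassOpts;

/* Base class: everything off, so older machine types keep their behavior. */
static void base_class_init(MachineClassOpts *mc)
{
    mc->dr_phb_enabled = false;
    mc->dr_cpu_enabled = false;
    mc->dr_lmb_enabled = false;
}

/* A newer machine type opts in on top of the base defaults. */
static void new_machine_class_init(MachineClassOpts *mc)
{
    base_class_init(mc);
    mc->dr_phb_enabled = true;
    mc->dr_cpu_enabled = true;
    mc->dr_lmb_enabled = true;
}

/* Number of CPU connectors to create: none unless CPU DR is enabled. */
static int cpu_connectors_to_create(const MachineClassOpts *mc, int max_cpus)
{
    return mc->dr_cpu_enabled ? max_cpus : 0;
}
```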

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson


* Re: [Qemu-devel] [RFC PATCH v1 02/13] spapr: Add DRC dt entries for CPUs
  2015-01-08  6:10 ` [Qemu-devel] [RFC PATCH v1 02/13] spapr: Add DRC dt entries for CPUs Bharata B Rao
  2015-01-22 21:21   ` Michael Roth
@ 2015-01-29  1:04   ` David Gibson
  1 sibling, 0 replies; 50+ messages in thread
From: David Gibson @ 2015-01-29  1:04 UTC (permalink / raw)
  To: Bharata B Rao; +Cc: imammedo, agraf, qemu-devel, mdroth

On Thu, Jan 08, 2015 at 11:40:09AM +0530, Bharata B Rao wrote:
> Advertise CPU DR-capability to the guest via device tree.
> 
> Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
> Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
>                [spapr_drc_reset implementation]

Reviewed-by: David Gibson <david@gibson.dropbear.id.au>

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson


* Re: [Qemu-devel] [RFC PATCH v1 03/13] spapr: Consider max_cpus during xics initialization
  2015-01-08  6:10 ` [Qemu-devel] [RFC PATCH v1 03/13] spapr: Consider max_cpus during xics initialization Bharata B Rao
@ 2015-01-29  1:05   ` David Gibson
  0 siblings, 0 replies; 50+ messages in thread
From: David Gibson @ 2015-01-29  1:05 UTC (permalink / raw)
  To: Bharata B Rao; +Cc: imammedo, agraf, qemu-devel, mdroth

On Thu, Jan 08, 2015 at 11:40:10AM +0530, Bharata B Rao wrote:
> Use max_cpus instead of smp_cpus when initializing the xics system. Also
> report max_cpus in ibm,interrupt-server-ranges device tree property of
> interrupt controller node.
> 
> Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>

Reviewed-by: David Gibson <david@gibson.dropbear.id.au>

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson


* Re: [Qemu-devel] [RFC PATCH v1 04/13] spapr: Factor out CPU initialization code into realizefn
  2015-01-08  6:10 ` [Qemu-devel] [RFC PATCH v1 04/13] spapr: Factor out CPU initialization code into realizefn Bharata B Rao
@ 2015-01-29  1:07   ` David Gibson
  2015-01-30  7:49     ` Bharata B Rao
  0 siblings, 1 reply; 50+ messages in thread
From: David Gibson @ 2015-01-29  1:07 UTC (permalink / raw)
  To: Bharata B Rao; +Cc: imammedo, agraf, qemu-devel, mdroth

On Thu, Jan 08, 2015 at 11:40:11AM +0530, Bharata B Rao wrote:
> Move some CPU initialization code from machine init function to
> CPU realizefn so that it can be used from CPU hotplug path too.
> 
> With the inclusion of ppc.h in translate_init.c, explicit *irq_init()
> function definitions aren't required, remove them.
> 
> Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
> ---
>  hw/ppc/spapr.c              | 29 +----------------------------
>  include/hw/ppc/spapr.h      |  3 +++
>  target-ppc/translate_init.c | 43 ++++++++++++++++++++++++++-----------------
>  3 files changed, 30 insertions(+), 45 deletions(-)
> 
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index 779d364..f49b0fa 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -81,8 +81,6 @@
>  
>  #define MIN_RMA_SLOF            128UL
>  
> -#define TIMEBASE_FREQ           512000000ULL
> -
>  #define MAX_CPUS                255
>  
>  #define PHANDLE_XICP            0x00001111
> @@ -971,7 +969,7 @@ static void ppc_spapr_reset(void)
>  
>  }
>  
> -static void spapr_cpu_reset(void *opaque)
> +void spapr_cpu_reset(void *opaque)
>  {
>      PowerPCCPU *cpu = opaque;
>      CPUState *cs = CPU(cpu);
> @@ -1387,7 +1385,6 @@ static void ppc_spapr_init(MachineState *machine)
>      const char *initrd_filename = machine->initrd_filename;
>      const char *boot_device = machine->boot_order;
>      PowerPCCPU *cpu;
> -    CPUPPCState *env;
>      PCIHostState *phb;
>      int i;
>      MemoryRegion *sysmem = get_system_memory();
> @@ -1472,30 +1469,6 @@ static void ppc_spapr_init(MachineState *machine)
>              fprintf(stderr, "Unable to find PowerPC CPU definition\n");
>              exit(1);
>          }
> -        env = &cpu->env;
> -
> -        /* Set time-base frequency to 512 MHz */
> -        cpu_ppc_tb_init(env, TIMEBASE_FREQ);
> -
> -        /* PAPR always has exception vectors in RAM not ROM. To ensure this,
> -         * MSR[IP] should never be set.
> -         */
> -        env->msr_mask &= ~(1 << 6);
> -
> -        /* Tell KVM that we're in PAPR mode */
> -        if (kvm_enabled()) {
> -            kvmppc_set_papr(cpu);
> -        }
> -
> -        if (cpu->max_compat) {
> -            if (ppc_set_compat(cpu, cpu->max_compat) < 0) {
> -                exit(1);
> -            }
> -        }
> -
> -        xics_cpu_setup(spapr->icp, cpu);
> -
> -        qemu_register_reset(spapr_cpu_reset, cpu);
>      }
>  
>      /* allocate RAM */
> diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
> index b1a0838..831db6b 100644
> --- a/include/hw/ppc/spapr.h
> +++ b/include/hw/ppc/spapr.h
> @@ -478,6 +478,8 @@ struct sPAPRTCETable {
>      QLIST_ENTRY(sPAPRTCETable) list;
>  };
>  
> +#define TIMEBASE_FREQ           512000000ULL
> +
>  void spapr_events_init(sPAPREnvironment *spapr);
>  void spapr_events_fdt_skel(void *fdt, uint32_t epow_irq);
>  int spapr_h_cas_compose_response(target_ulong addr, target_ulong size);
> @@ -494,5 +496,6 @@ int spapr_tcet_dma_dt(void *fdt, int node_off, const char *propname,
>                        sPAPRTCETable *tcet);
>  void spapr_hotplug_req_add_event(sPAPRDRConnector *drc);
>  void spapr_hotplug_req_remove_event(sPAPRDRConnector *drc);
> +void spapr_cpu_reset(void *opaque);
>  
>  #endif /* !defined (__HW_SPAPR_H__) */
> diff --git a/target-ppc/translate_init.c b/target-ppc/translate_init.c
> index 72cc9d0..9c642a5 100644
> --- a/target-ppc/translate_init.c
> +++ b/target-ppc/translate_init.c
> @@ -30,29 +30,14 @@
>  #include "qemu/error-report.h"
>  #include "qapi/visitor.h"
>  #include "hw/qdev-properties.h"
> +#include "hw/ppc/spapr.h"
> +#include "hw/ppc/ppc.h"
>  
>  //#define PPC_DUMP_CPU
>  //#define PPC_DEBUG_SPR
>  //#define PPC_DUMP_SPR_ACCESSES
>  /* #define USE_APPLE_GDB */
>  
> -/* For user-mode emulation, we don't emulate any IRQ controller */
> -#if defined(CONFIG_USER_ONLY)
> -#define PPC_IRQ_INIT_FN(name)                                                 \
> -static inline void glue(glue(ppc, name),_irq_init) (CPUPPCState *env)         \
> -{                                                                             \
> -}
> -#else
> -#define PPC_IRQ_INIT_FN(name)                                                 \
> -void glue(glue(ppc, name),_irq_init) (CPUPPCState *env);
> -#endif
> -
> -PPC_IRQ_INIT_FN(40x);
> -PPC_IRQ_INIT_FN(6xx);
> -PPC_IRQ_INIT_FN(970);
> -PPC_IRQ_INIT_FN(POWER7);
> -PPC_IRQ_INIT_FN(e500);
> -
>  /* Generic callbacks:
>   * do nothing but store/retrieve spr value
>   */
> @@ -8905,6 +8890,7 @@ static void ppc_cpu_realizefn(DeviceState *dev, Error **errp)
>      CPUState *cs = CPU(dev);
>      PowerPCCPU *cpu = POWERPC_CPU(dev);
>      PowerPCCPUClass *pcc = POWERPC_CPU_GET_CLASS(cpu);
> +    CPUPPCState *env = &cpu->env;
>      Error *local_err = NULL;
>  #if !defined(CONFIG_USER_ONLY)
>      int max_smt = kvm_enabled() ? kvmppc_smt_threads() : 1;
> @@ -8965,6 +8951,29 @@ static void ppc_cpu_realizefn(DeviceState *dev, Error **errp)
>  
>      qemu_init_vcpu(cs);
>  
> +    /* Set time-base frequency to 512 MHz */
> +    cpu_ppc_tb_init(env, TIMEBASE_FREQ);
> +
> +    /* PAPR always has exception vectors in RAM not ROM. To ensure this,
> +     * MSR[IP] should never be set.
> +     */
> +    env->msr_mask &= ~(1 << 6);
> +
> +    /* Tell KVM that we're in PAPR mode */
> +    if (kvm_enabled()) {
> +        kvmppc_set_papr(cpu);
> +    }
> +
> +    if (cpu->max_compat) {
> +        if (ppc_set_compat(cpu, cpu->max_compat) < 0) {
> +            exit(1);
> +        }
> +    }
> +
> +    xics_cpu_setup(spapr->icp, cpu);
> +
> +    qemu_register_reset(spapr_cpu_reset, cpu);
> +
>      pcc->parent_realize(dev, errp);

This doesn't look right.  Several of these are clearly PAPR specific
operations, but you're now doing them from code that isn't PAPR specific.
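
One way to keep such steps out of the generic realize path is a
machine-supplied callback.  The following is only a sketch under that
assumption: SketchCPU, SketchMachine and cpu_post_realize are
hypothetical names for illustration, not existing QEMU interfaces.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration only; these are not QEMU types. */
typedef struct SketchCPU {
    uint64_t msr_mask;   /* bit 6 models MSR[IP] */
    int papr_mode;       /* set once the PAPR-specific setup has run */
} SketchCPU;

typedef struct SketchMachine {
    /* Optional machine-specific CPU setup; NULL when the machine needs none. */
    void (*cpu_post_realize)(SketchCPU *cpu);
} SketchMachine;

/* PAPR-only steps stay with the PAPR machine code. */
static void sketch_papr_cpu_setup(SketchCPU *cpu)
{
    cpu->msr_mask &= ~(1ULL << 6);  /* MSR[IP] must never be set under PAPR */
    cpu->papr_mode = 1;
}

/* Generic realize: knows nothing about PAPR, only calls the hook if present. */
static void sketch_cpu_realize(const SketchMachine *machine, SketchCPU *cpu)
{
    if (machine->cpu_post_realize != NULL) {
        machine->cpu_post_realize(cpu);
    }
}
```

With this shape, a non-PAPR machine leaves the hook NULL and its CPUs
are untouched, while the hotplug path still gets the PAPR setup because
it goes through the same realize call.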

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

* Re: [Qemu-devel] [RFC PATCH v1 05/13] spapr: Support ibm,lrdr-capacity device tree property
  2015-01-08  6:10 ` [Qemu-devel] [RFC PATCH v1 05/13] spapr: Support ibm,lrdr-capacity device tree property Bharata B Rao
  2015-01-22 21:55   ` Michael Roth
@ 2015-01-29  1:16   ` David Gibson
  2015-01-30  7:50     ` Bharata B Rao
  1 sibling, 1 reply; 50+ messages in thread
From: David Gibson @ 2015-01-29  1:16 UTC (permalink / raw)
  To: Bharata B Rao; +Cc: imammedo, agraf, qemu-devel, mdroth

On Thu, Jan 08, 2015 at 11:40:12AM +0530, Bharata B Rao wrote:
> Add support for ibm,lrdr-capacity since this is needed by the guest
> kernel to know about the possible hot-pluggable CPUs and Memory.
> 
> Define minimum hotpluggable memory size as 256MB and start storing maximum
> possible memory for the guest in sPAPREnvironment.
> 
> Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
> ---
>  hw/ppc/spapr.c         |  3 ++-
>  hw/ppc/spapr_rtas.c    | 28 ++++++++++++++++++++++++++--
>  include/hw/ppc/spapr.h |  6 ++++--
>  3 files changed, 32 insertions(+), 5 deletions(-)
> 
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index f49b0fa..515d770 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -775,7 +775,7 @@ static void spapr_finalize_fdt(sPAPREnvironment *spapr,
>      }
>  
>      /* RTAS */
> -    ret = spapr_rtas_device_tree_setup(fdt, rtas_addr, rtas_size);
> +    ret = spapr_rtas_device_tree_setup(spapr, fdt, rtas_addr, rtas_size);
>      if (ret < 0) {
>          fprintf(stderr, "Couldn't set up RTAS device tree properties\n");
>      }
> @@ -1473,6 +1473,7 @@ static void ppc_spapr_init(MachineState *machine)
>  
>      /* allocate RAM */
>      spapr->ram_limit = ram_size;
> +    spapr->maxram_limit = machine->maxram_size;

There's now a bunch of duplication between sPAPREnvironment and
MachineState - it looks like we should really fold sPAPREnvironment
into a subclass of MachineState, but that's probably a project for
another day.

>      memory_region_allocate_system_memory(ram, NULL, "ppc_spapr.ram",
>                                           spapr->ram_limit);
>      memory_region_add_subregion(sysmem, 0, ram);
> diff --git a/hw/ppc/spapr_rtas.c b/hw/ppc/spapr_rtas.c
> index d847f45..e8a0f21 100644
> --- a/hw/ppc/spapr_rtas.c
> +++ b/hw/ppc/spapr_rtas.c
> @@ -29,6 +29,7 @@
>  #include "sysemu/char.h"
>  #include "hw/qdev.h"
>  #include "sysemu/device_tree.h"
> +#include "sysemu/cpus.h"
>  
>  #include "hw/ppc/spapr.h"
>  #include "hw/ppc/spapr_vio.h"
> @@ -551,11 +552,12 @@ void spapr_rtas_register(int token, const char *name, spapr_rtas_fn fn)
>      rtas_table[token].fn = fn;
>  }
>  
> -int spapr_rtas_device_tree_setup(void *fdt, hwaddr rtas_addr,
> -                                 hwaddr rtas_size)
> +int spapr_rtas_device_tree_setup(sPAPREnvironment *spapr, void *fdt,
> +                                 hwaddr rtas_addr, hwaddr rtas_size)
>  {
>      int ret;
>      int i;
> +    uint32_t lrdr_capacity[5];
>  
>      ret = fdt_add_mem_rsv(fdt, rtas_addr, rtas_size);
>      if (ret < 0) {
> @@ -604,6 +606,28 @@ int spapr_rtas_device_tree_setup(void *fdt, hwaddr rtas_addr,
>          }
>  
>      }
> +
> +    ret = qemu_fdt_setprop_cell(fdt, "/rtas", "#address-cells", 0x2);
> +    if (ret < 0) {
> +        fprintf(stderr, "Couldn't add #address-cells rtas property\n");
> +    }
> +
> +    ret = qemu_fdt_setprop_cell(fdt, "/rtas", "#size-cells", 0x2);
> +    if (ret < 0) {
> +        fprintf(stderr, "Couldn't add #size-cells rtas property\n");
> +    }

It's not clear what adding #address-cells and #size-cells has to do
with this, and these properties generally don't make sense on a node
without children.

> +    lrdr_capacity[0] = cpu_to_be32(spapr->maxram_limit >> 32);
> +    lrdr_capacity[1] = cpu_to_be32(spapr->maxram_limit & 0xffffffff);
> +    lrdr_capacity[2] = 0;
> +    lrdr_capacity[3] = cpu_to_be32(SPAPR_MIN_MEMORY_BLOCK_SIZE);
> +    lrdr_capacity[4] = cpu_to_be32(max_cpus/smp_threads);
> +    ret = qemu_fdt_setprop(fdt, "/rtas", "ibm,lrdr-capacity", lrdr_capacity,
> +                     sizeof(lrdr_capacity));
> +    if (ret < 0) {
> +        fprintf(stderr, "Couldn't add ibm,lrdr-capacity rtas property\n");
> +    }
> +
>      return 0;
>  }
>  
> diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
> index 831db6b..ae8b4e1 100644
> --- a/include/hw/ppc/spapr.h
> +++ b/include/hw/ppc/spapr.h
> @@ -18,6 +18,7 @@ typedef struct sPAPREnvironment {
>      XICSState *icp;
>  
>      hwaddr ram_limit;
> +    hwaddr maxram_limit;
>      void *htab;
>      uint32_t htab_shift;
>      hwaddr rma_size;
> @@ -444,8 +445,8 @@ void spapr_rtas_register(int token, const char *name, spapr_rtas_fn fn);
>  target_ulong spapr_rtas_call(PowerPCCPU *cpu, sPAPREnvironment *spapr,
>                               uint32_t token, uint32_t nargs, target_ulong args,
>                               uint32_t nret, target_ulong rets);
> -int spapr_rtas_device_tree_setup(void *fdt, hwaddr rtas_addr,
> -                                 hwaddr rtas_size);
> +int spapr_rtas_device_tree_setup(sPAPREnvironment *spapr, void *fdt,
> +                                 hwaddr rtas_addr, hwaddr rtas_size);
>  
>  #define SPAPR_TCE_PAGE_SHIFT   12
>  #define SPAPR_TCE_PAGE_SIZE    (1ULL << SPAPR_TCE_PAGE_SHIFT)
> @@ -479,6 +480,7 @@ struct sPAPRTCETable {
>  };
>  
>  #define TIMEBASE_FREQ           512000000ULL
> +#define SPAPR_MIN_MEMORY_BLOCK_SIZE (1 << 28) /* 256MB */
>  
>  void spapr_events_init(sPAPREnvironment *spapr);
>  void spapr_events_fdt_skel(void *fdt, uint32_t epow_irq);
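
For reference, the five-cell layout that the spapr_rtas.c hunk writes
into ibm,lrdr-capacity can be sketched standalone.  This is an
illustration under stated assumptions, not the patch's code: the
sketch_* names are made up, bytes are stored explicitly instead of
going through cpu_to_be32(), and cell 2 is taken from the high half of
the block size where the patch hard-codes 0 (the two agree for a 256MB
block).

```c
#include <stdint.h>

#define SKETCH_LRDR_CELLS 5

/* Store one 32-bit cell big-endian, independent of host byte order. */
static void sketch_put_be32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v >> 24);
    p[1] = (uint8_t)(v >> 16);
    p[2] = (uint8_t)(v >> 8);
    p[3] = (uint8_t)v;
}

/*
 * Cells 0-1: maximum RAM (high word, low word)
 * Cells 2-3: minimum hotpluggable block size (high word, low word)
 * Cell  4:   number of cores, i.e. max_cpus / threads_per_core
 */
static void sketch_build_lrdr_capacity(uint8_t out[SKETCH_LRDR_CELLS * 4],
                                       uint64_t maxram, uint64_t block_size,
                                       uint32_t max_cpus,
                                       uint32_t threads_per_core)
{
    sketch_put_be32(out + 0,  (uint32_t)(maxram >> 32));
    sketch_put_be32(out + 4,  (uint32_t)(maxram & 0xffffffffu));
    sketch_put_be32(out + 8,  (uint32_t)(block_size >> 32));
    sketch_put_be32(out + 12, (uint32_t)(block_size & 0xffffffffu));
    sketch_put_be32(out + 16, max_cpus / threads_per_core);
}
```

As a worked case, an 8GB maxram with a 256MB block, 16 CPUs and 8
threads per core gives the cells 0x2, 0x0, 0x0, 0x10000000, 0x2.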

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

* Re: [Qemu-devel] [RFC PATCH v1 06/13] spapr: CPU hotplug support
  2015-01-08  6:10 ` [Qemu-devel] [RFC PATCH v1 06/13] spapr: CPU hotplug support Bharata B Rao
  2015-01-22 22:16   ` Michael Roth
  2015-01-23 12:41   ` Igor Mammedov
@ 2015-01-29  1:31   ` David Gibson
  2 siblings, 0 replies; 50+ messages in thread
From: David Gibson @ 2015-01-29  1:31 UTC (permalink / raw)
  To: Bharata B Rao; +Cc: imammedo, agraf, qemu-devel, mdroth

On Thu, Jan 08, 2015 at 11:40:13AM +0530, Bharata B Rao wrote:
> Support CPU hotplug via device-add command. Use the existing EPOW event
> infrastructure to send CPU hotplug notification to the guest.
> 
> Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
> ---
>  hw/ppc/spapr.c              | 205 +++++++++++++++++++++++++++++++++++++++++++-
>  hw/ppc/spapr_events.c       |   8 +-
>  target-ppc/translate_init.c |   6 ++
>  3 files changed, 215 insertions(+), 4 deletions(-)
> 
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index 515d770..a293a59 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -330,6 +330,8 @@ static void add_str(GString *s, const gchar *s1)
>      g_string_append_len(s, s1, strlen(s1) + 1);
>  }
>  
> +uint32_t cpus_per_socket;
> +
>  static void *spapr_create_fdt_skel(hwaddr initrd_base,
>                                     hwaddr initrd_size,
>                                     hwaddr kernel_size,
> @@ -350,9 +352,9 @@ static void *spapr_create_fdt_skel(hwaddr initrd_base,
>      unsigned char vec5[] = {0x0, 0x0, 0x0, 0x0, 0x0, 0x80};
>      QemuOpts *opts = qemu_opts_find(qemu_find_opts("smp-opts"), NULL);
>      unsigned sockets = opts ? qemu_opt_get_number(opts, "sockets", 0) : 0;
> -    uint32_t cpus_per_socket = sockets ? (smp_cpus / sockets) : 1;
>      char *buf;
>  
> +    cpus_per_socket = sockets ? (smp_cpus / sockets) : 1;
>      add_str(hypertas, "hcall-pft");
>      add_str(hypertas, "hcall-term");
>      add_str(hypertas, "hcall-dabr");
> @@ -1744,12 +1746,209 @@ static void spapr_nmi(NMIState *n, int cpu_index, Error **errp)
>      }
>  }
>  
> +/* TODO: Duplicates code from spapr_create_fdt_skel(), Fix this */

Uh, yeah, you should fix this.  I think you probably want to move the
filling out of the cpu dt information from create_fdt_skel() to
finalize_fdt(), then you should be able to use a common helper function.

> +static void spapr_populate_cpu_dt(CPUState *cs, void *fdt, int offset,
> +            int drc_index)
> +{
> +    PowerPCCPU *cpu = POWERPC_CPU(cs);
> +    CPUPPCState *env = &cpu->env;
> +    PowerPCCPUClass *pcc = POWERPC_CPU_GET_CLASS(cs);
> +    int index = ppc_get_vcpu_dt_id(cpu);
> +    uint32_t segs[] = {cpu_to_be32(28), cpu_to_be32(40),
> +                       0xffffffff, 0xffffffff};
> +    uint32_t tbfreq = kvm_enabled() ? kvmppc_get_tbfreq() : TIMEBASE_FREQ;
> +    uint32_t cpufreq = kvm_enabled() ? kvmppc_get_clockfreq() : 1000000000;
> +    uint32_t page_sizes_prop[64];
> +    size_t page_sizes_prop_size;
> +    int smpt = ppc_get_compat_smt_threads(cpu);
> +    uint32_t servers_prop[smpt];
> +    uint32_t gservers_prop[smpt * 2];
> +    int i;
> +    uint32_t pft_size_prop[] = {0, cpu_to_be32(spapr->htab_shift)};
> +    uint32_t associativity[] = {cpu_to_be32(0x5),
> +                                cpu_to_be32(0x0),
> +                                cpu_to_be32(0x0),
> +                                cpu_to_be32(0x0),
> +                                cpu_to_be32(cs->numa_node),
> +                                cpu_to_be32(index)};
> +
> +    _FDT((fdt_setprop_cell(fdt, offset, "reg", index)));
> +    _FDT((fdt_setprop_string(fdt, offset, "device_type", "cpu")));
> +
> +    _FDT((fdt_setprop_cell(fdt, offset, "cpu-version", env->spr[SPR_PVR])));
> +    _FDT((fdt_setprop_cell(fdt, offset, "d-cache-block-size",
> +                        env->dcache_line_size)));
> +    _FDT((fdt_setprop_cell(fdt, offset, "d-cache-line-size",
> +                        env->dcache_line_size)));
> +    _FDT((fdt_setprop_cell(fdt, offset, "i-cache-block-size",
> +                        env->icache_line_size)));
> +    _FDT((fdt_setprop_cell(fdt, offset, "i-cache-line-size",
> +                            env->icache_line_size)));
> +
> +    if (pcc->l1_dcache_size) {
> +        _FDT((fdt_setprop_cell(fdt, offset, "d-cache-size",
> +            pcc->l1_dcache_size)));
> +    } else {
> +        fprintf(stderr, "Warning: Unknown L1 dcache size for cpu\n");
> +    }
> +    if (pcc->l1_icache_size) {
> +        _FDT((fdt_setprop_cell(fdt, offset, "i-cache-size",
> +            pcc->l1_icache_size)));
> +    } else {
> +        fprintf(stderr, "Warning: Unknown L1 icache size for cpu\n");
> +    }
> +
> +    _FDT((fdt_setprop_cell(fdt, offset, "timebase-frequency", tbfreq)));
> +    _FDT((fdt_setprop_cell(fdt, offset, "clock-frequency", cpufreq)));
> +    _FDT((fdt_setprop_cell(fdt, offset, "ibm,slb-size", env->slb_nr)));
> +    _FDT((fdt_setprop_string(fdt, offset, "status", "okay")));
> +    _FDT((fdt_setprop(fdt, offset, "64-bit", NULL, 0)));
> +
> +    if (env->spr_cb[SPR_PURR].oea_read) {
> +        _FDT((fdt_setprop(fdt, offset, "ibm,purr", NULL, 0)));
> +    }
> +
> +    if (env->mmu_model & POWERPC_MMU_1TSEG) {
> +        _FDT((fdt_setprop(fdt, offset, "ibm,processor-segment-sizes",
> +                           segs, sizeof(segs))));
> +    }
> +
> +    /* Advertise VMX/VSX (vector extensions) if available
> +     *   0 / no property == no vector extensions
> +     *   1               == VMX / Altivec available
> +     *   2               == VSX available */
> +    if (env->insns_flags & PPC_ALTIVEC) {
> +        uint32_t vmx = (env->insns_flags2 & PPC2_VSX) ? 2 : 1;
> +
> +        _FDT((fdt_setprop_cell(fdt, offset, "ibm,vmx", vmx)));
> +    }
> +
> +    /* Advertise DFP (Decimal Floating Point) if available
> +     *   0 / no property == no DFP
> +     *   1               == DFP available */
> +    if (env->insns_flags2 & PPC2_DFP) {
> +        _FDT((fdt_setprop_cell(fdt, offset, "ibm,dfp", 1)));
> +    }
> +
> +    page_sizes_prop_size = create_page_sizes_prop(env, page_sizes_prop,
> +                                                  sizeof(page_sizes_prop));
> +    if (page_sizes_prop_size) {
> +        _FDT((fdt_setprop(fdt, offset, "ibm,segment-page-sizes",
> +                           page_sizes_prop, page_sizes_prop_size)));
> +    }
> +
> +    _FDT((fdt_setprop_cell(fdt, offset, "ibm,chip-id",
> +                                cs->cpu_index / cpus_per_socket)));
> +
> +    _FDT((fdt_setprop_cell(fdt, offset, "ibm,my-drc-index", drc_index)));
> +
> +    /* Build interrupt servers and gservers properties */
> +    for (i = 0; i < smpt; i++) {
> +        servers_prop[i] = cpu_to_be32(index + i);
> +        /* Hack, direct the group queues back to cpu 0 */
> +        gservers_prop[i*2] = cpu_to_be32(index + i);
> +        gservers_prop[i*2 + 1] = 0;
> +    }
> +    _FDT(fdt_setprop(fdt, offset, "ibm,ppc-interrupt-server#s",
> +                       servers_prop, sizeof(servers_prop)));
> +    _FDT(fdt_setprop(fdt, offset, "ibm,ppc-interrupt-gserver#s",
> +                      gservers_prop, sizeof(gservers_prop)));
> +    _FDT(fdt_setprop(fdt, offset, "ibm,pft-size",
> +                          pft_size_prop, sizeof(pft_size_prop)));
> +
> +    if (nb_numa_nodes > 1) {
> +        _FDT(fdt_setprop(fdt, offset, "ibm,associativity", associativity,
> +                          sizeof(associativity)));
> +    }
> +}
> +
> +static void spapr_cpu_hotplug_add(DeviceState *dev, CPUState *cs)
> +{
> +    PowerPCCPU *cpu = POWERPC_CPU(cs);
> +    DeviceClass *dc = DEVICE_GET_CLASS(cs);
> +    int id = ppc_get_vcpu_dt_id(cpu);
> +    sPAPRDRConnector *drc =
> +        spapr_dr_connector_by_id(SPAPR_DR_CONNECTOR_TYPE_CPU, id);
> +    sPAPRDRConnectorClass *drck = SPAPR_DR_CONNECTOR_GET_CLASS(drc);
> +    int drc_index = drck->get_index(drc);
> +    void *fdt, *fdt_orig;
> +    int offset, i;
> +    char *nodename;
> +
> +    /* add OF node for CPU and required OF DT properties */

Misleading comment, it's not really adding a node to anything but
rather creating an fdt fragment which the DR code can send to the
guest.

> +    fdt_orig = g_malloc0(FDT_MAX_SIZE);
> +    offset = fdt_create(fdt_orig, FDT_MAX_SIZE);
> +    fdt_begin_node(fdt_orig, "");
> +    fdt_end_node(fdt_orig);
> +    fdt_finish(fdt_orig);

So both the PCI and CPU hotplug code now have this idiom.  I think we
want a common helper that allocates some memory and populates it with
an empty fdt fragment.  That will mean there's only one place to fix
up when we pull in a newer libfdt which has a built-in function to do
most of that.

> +    fdt = g_malloc0(FDT_MAX_SIZE);
> +    fdt_open_into(fdt_orig, fdt, FDT_MAX_SIZE);

And like the PCI version, you don't need the second allocation:
fdt_open_into() will work in place.

> +    nodename = g_strdup_printf("%s@%x", dc->fw_name, id);
> +
> +    offset = fdt_add_subnode(fdt, offset, nodename);
> +
> +    /* Set NUMA node for the added CPU */
> +    for (i = 0; i < nb_numa_nodes; i++) {
> +        if (test_bit(cs->cpu_index, numa_info[i].node_cpu)) {
> +            cs->numa_node = i;
> +            break;
> +        }
> +    }
> +
> +    spapr_populate_cpu_dt(cs, fdt, offset, drc_index);
> +    g_free(fdt_orig);
> +    g_free(nodename);
> +
> +    drck->attach(drc, dev, fdt, offset, false);
> +}
> +
> +static void spapr_cpu_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
> +                            Error **errp)
> +{
> +    Error *local_err = NULL;
> +    CPUState *cs = CPU(dev);
> +    PowerPCCPU *cpu = POWERPC_CPU(cs);
> +    sPAPRDRConnector *drc =
> +        spapr_dr_connector_by_id(SPAPR_DR_CONNECTOR_TYPE_CPU, cpu->cpu_dt_id);
> +
> +    /* TODO: Check if DR is enabled ? */

Again, that's a TODO that shouldn't be postponed.

> +    g_assert(drc);

Especially since I think this could cause an assertion crash if DRC is disabled.
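
A rough sketch of the check I have in mind, failing the plug request
instead of asserting.  Everything here is hypothetical (SketchDRC,
sketch_cpu_plug_check, the message strings); in QEMU the message would
be reported through errp with error_setg(), as the realizefn hunk
further down already does for the maxcpus case.

```c
#include <stddef.h>

/* Hypothetical stand-in for a DR connector; not the QEMU type. */
typedef struct SketchDRC {
    int index;
} SketchDRC;

/*
 * Returns 0 when the plug may proceed.  On failure returns -1 and
 * points *errmsg at a static description instead of aborting.
 */
static int sketch_cpu_plug_check(int dr_cpu_enabled, const SketchDRC *drc,
                                 const char **errmsg)
{
    if (!dr_cpu_enabled) {
        *errmsg = "CPU dynamic reconfiguration is not enabled for this machine";
        return -1;
    }
    if (drc == NULL) {
        *errmsg = "no DR connector found for this CPU";
        return -1;
    }
    *errmsg = NULL;
    return 0;
}
```

The caller would bail out of the plug handler on a non-zero return, so
a machine type with CPU DR turned off rejects device_add cleanly rather
than crashing.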

> +
> +    spapr_cpu_reset(POWERPC_CPU(CPU(dev)));
> +    spapr_cpu_hotplug_add(dev, cs);
> +    spapr_hotplug_req_add_event(drc);
> +    error_propagate(errp, local_err);

Nothing in the code uses local_err, so I think this is redundant.

> +    return;
> +}
> +
> +static void spapr_machine_device_plug(HotplugHandler *hotplug_dev,
> +                                      DeviceState *dev, Error **errp)
> +{
> +    if (object_dynamic_cast(OBJECT(dev), TYPE_CPU)) {
> +        if (dev->hotplugged) {
> +            spapr_cpu_plug(hotplug_dev, dev, errp);
> +        }
> +    }
> +}
> +
> +static HotplugHandler *spapr_get_hotpug_handler(MachineState *machine,
> +                                             DeviceState *dev)
> +{
> +    if (object_dynamic_cast(OBJECT(dev), TYPE_CPU)) {
> +        return HOTPLUG_HANDLER(machine);
> +    }
> +    return NULL;
> +}
> +
>  static void spapr_machine_class_init(ObjectClass *oc, void *data)
>  {
>      MachineClass *mc = MACHINE_CLASS(oc);
>      sPAPRMachineClass *smc = SPAPR_MACHINE_CLASS(oc);
>      FWPathProviderClass *fwc = FW_PATH_PROVIDER_CLASS(oc);
>      NMIClass *nc = NMI_CLASS(oc);
> +    HotplugHandlerClass *hc = HOTPLUG_HANDLER_CLASS(oc);
>  
>      mc->init = ppc_spapr_init;
>      mc->reset = ppc_spapr_reset;
> @@ -1759,6 +1958,9 @@ static void spapr_machine_class_init(ObjectClass *oc, void *data)
>      mc->default_boot_order = NULL;
>      mc->kvm_type = spapr_kvm_type;
>      mc->has_dynamic_sysbus = true;
> +    mc->get_hotplug_handler = spapr_get_hotpug_handler;
> +
> +    hc->plug = spapr_machine_device_plug;
>      smc->dr_phb_enabled = false;
>      smc->dr_cpu_enabled = false;
>      smc->dr_lmb_enabled = false;
> @@ -1778,6 +1980,7 @@ static const TypeInfo spapr_machine_info = {
>      .interfaces = (InterfaceInfo[]) {
>          { TYPE_FW_PATH_PROVIDER },
>          { TYPE_NMI },
> +        { TYPE_HOTPLUG_HANDLER },
>          { }
>      },
>  };
> diff --git a/hw/ppc/spapr_events.c b/hw/ppc/spapr_events.c
> index 434a75d..035d8c9 100644
> --- a/hw/ppc/spapr_events.c
> +++ b/hw/ppc/spapr_events.c
> @@ -364,14 +364,16 @@ static void spapr_hotplug_req_event(sPAPRDRConnector *drc, uint8_t hp_action)
>      hp->hdr.section_length = cpu_to_be16(sizeof(*hp));
>      hp->hdr.section_version = 1; /* includes extended modifier */
>      hp->hotplug_action = hp_action;
> -
> +    hp->drc.index = cpu_to_be32(drck->get_index(drc));
> +    hp->hotplug_identifier = RTAS_LOG_V6_HP_ID_DRC_INDEX;
>  
>      switch (drc_type) {
>      case SPAPR_DR_CONNECTOR_TYPE_PCI:
> -        hp->drc.index = cpu_to_be32(drck->get_index(drc));
> -        hp->hotplug_identifier = RTAS_LOG_V6_HP_ID_DRC_INDEX;
>          hp->hotplug_type = RTAS_LOG_V6_HP_TYPE_PCI;
>          break;
> +    case SPAPR_DR_CONNECTOR_TYPE_CPU:
> +        hp->hotplug_type = RTAS_LOG_V6_HP_TYPE_CPU;
> +        break;
>      default:
>          /* skip notification for unknown connector types */
>          g_free(new_hp);
> diff --git a/target-ppc/translate_init.c b/target-ppc/translate_init.c
> index 9c642a5..cf9d8d3 100644
> --- a/target-ppc/translate_init.c
> +++ b/target-ppc/translate_init.c
> @@ -32,6 +32,7 @@
>  #include "hw/qdev-properties.h"
>  #include "hw/ppc/spapr.h"
>  #include "hw/ppc/ppc.h"
> +#include "sysemu/sysemu.h"
>  
>  //#define PPC_DUMP_CPU
>  //#define PPC_DEBUG_SPR
> @@ -8909,6 +8910,11 @@ static void ppc_cpu_realizefn(DeviceState *dev, Error **errp)
>          return;
>      }
>  
> +    if (cs->cpu_index >= max_cpus) {
> +        error_setg(errp, "Can't have more CPUs, maxcpus limit reached");
> +        return;
> +    }
> +
>      cpu->cpu_dt_id = (cs->cpu_index / smp_threads) * max_smt
>          + (cs->cpu_index % smp_threads);
>  #endif
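
As an aside, the cpu_dt_id arithmetic in the last hunk is easier to
follow with numbers.  A standalone sketch (the sketch_* function names
are made up; the formula is the one quoted above):

```c
/* Device-tree id as computed in the ppc_cpu_realizefn() hunk. */
static int sketch_cpu_dt_id(int cpu_index, int smp_threads, int max_smt)
{
    return (cpu_index / smp_threads) * max_smt + (cpu_index % smp_threads);
}

/* True for the first thread of a core: ids that are a multiple of max_smt. */
static int sketch_is_core_lead_thread(int dt_id, int max_smt)
{
    return (dt_id % max_smt) == 0;
}
```

With smp_threads = 4 and max_smt = 8, cpu_index 0..3 map to ids 0..3
and cpu_index 4..7 map to ids 8..11, so each core starts at a multiple
of max_smt and ids 4..7 are never used.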

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

* Re: [Qemu-devel] [RFC PATCH v1 07/13] spapr: Start all the threads of CPU core when core is hotplugged
  2015-01-08  6:10 ` [Qemu-devel] [RFC PATCH v1 07/13] spapr: Start all the threads of CPU core when core is hotplugged Bharata B Rao
@ 2015-01-29  1:36   ` David Gibson
  2015-01-30  8:12     ` Bharata B Rao
  0 siblings, 1 reply; 50+ messages in thread
From: David Gibson @ 2015-01-29  1:36 UTC (permalink / raw)
  To: Bharata B Rao; +Cc: imammedo, agraf, qemu-devel, mdroth

On Thu, Jan 08, 2015 at 11:40:14AM +0530, Bharata B Rao wrote:
> PowerPC kernel adds or removes CPUs in core granularity and hence
> onlines/offlines all the SMT threads of a core during hot plug/unplug.
> Support this notion by starting all SMT threads of a core when a core
> is hotplugged.
> 
> Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
> ---
>  hw/ppc/spapr.c | 25 +++++++++++++++++++++++++
>  1 file changed, 25 insertions(+)
> 
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index a293a59..4347471 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -1376,6 +1376,8 @@ static void spapr_drc_reset(void *opaque)
>      }
>  }
>  
> +static const char *current_cpu_model;

More new global variables?  Please don't.
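
The usual alternative is to carry the value in per-machine state and
pass that state to whoever needs it.  A minimal sketch, assuming a
hypothetical SketchMachineState (not the real MachineState or
sPAPREnvironment layout):

```c
#include <stddef.h>

/* Hypothetical per-machine state; stands in for wherever this would live. */
typedef struct SketchMachineState {
    const char *cpu_model;  /* remembered at init, read again at hotplug time */
} SketchMachineState;

/* Called once from machine init. */
static void sketch_machine_init(SketchMachineState *ms, const char *cpu_model)
{
    ms->cpu_model = cpu_model;
}

/* Called from the plug handler, which already has the machine in hand. */
static const char *sketch_hotplug_cpu_model(const SketchMachineState *ms)
{
    return ms->cpu_model;
}
```

The plug handler is handed the machine as its hotplug_dev argument, so
it has a natural place to look the model up without a file-scope
variable.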

>  /* pSeries LPAR / sPAPR hardware init */
>  static void ppc_spapr_init(MachineState *machine)
>  {
> @@ -1473,6 +1475,8 @@ static void ppc_spapr_init(MachineState *machine)
>          }
>      }
>  
> +    current_cpu_model = cpu_model;
> +
>      /* allocate RAM */
>      spapr->ram_limit = ram_size;
>      spapr->maxram_limit = machine->maxram_size;
> @@ -1912,10 +1916,31 @@ static void spapr_cpu_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
>      PowerPCCPU *cpu = POWERPC_CPU(cs);
>      sPAPRDRConnector *drc =
>          spapr_dr_connector_by_id(SPAPR_DR_CONNECTOR_TYPE_CPU, cpu->cpu_dt_id);
> +    int id = ppc_get_vcpu_dt_id(cpu);
> +    int smt = kvmppc_smt_threads();
> +    int i;
> +
> +    /*
> +     * SMT threads return from here, only main thread (core) will
> +     * continue, create threads and signal hotplug event to the guest.
> +     */
> +    if ((id % smt) != 0) {
> +        return;
> +    }
>  
>      /* TODO: Check if DR is enabled ? */
>      g_assert(drc);
>  
> +    /* Start rest of the SMT threads of the hot plugged core */
> +    for (i = 1; i < smp_threads; i++) {
> +        cpu = cpu_ppc_init(current_cpu_model);
> +        if (cpu == NULL) {
> +            fprintf(stderr, "Unable to find PowerPC CPU definition\n");
> +            exit(1);
> +        }
> +        spapr_cpu_reset(cpu);
> +    }
> +
>      spapr_cpu_reset(POWERPC_CPU(CPU(dev)));
>      spapr_cpu_hotplug_add(dev, cs);
>      spapr_hotplug_req_add_event(drc);

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

* Re: [Qemu-devel] [RFC PATCH v1 09/13] spapr: CPU hot unplug support
  2015-01-08  6:10 ` [Qemu-devel] [RFC PATCH v1 09/13] spapr: CPU hot unplug support Bharata B Rao
@ 2015-01-29  1:39   ` David Gibson
  2015-01-30  8:15     ` Bharata B Rao
  0 siblings, 1 reply; 50+ messages in thread
From: David Gibson @ 2015-01-29  1:39 UTC (permalink / raw)
  To: Bharata B Rao; +Cc: imammedo, agraf, qemu-devel, mdroth

On Thu, Jan 08, 2015 at 11:40:16AM +0530, Bharata B Rao wrote:
> Support hot removal of CPU for sPAPR guests.
> 
> Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
> ---
>  hw/ppc/spapr.c | 43 +++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 43 insertions(+)
> 
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index 4347471..ec793b1 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -1908,6 +1908,22 @@ static void spapr_cpu_hotplug_add(DeviceState *dev, CPUState *cs)
>      drck->attach(drc, dev, fdt, offset, false);
>  }
>  
> +static void spapr_cpu_release(DeviceState *dev, void *opaque)
> +{
> +    /* Release vCPU */

Um.. should this actually do something?

> +}
> +
> +static void spapr_cpu_hotplug_remove(DeviceState *dev, CPUState *cs)
> +{
> +    PowerPCCPU *cpu = POWERPC_CPU(cs);
> +    int id = ppc_get_vcpu_dt_id(cpu);
> +    sPAPRDRConnector *drc =
> +        spapr_dr_connector_by_id(SPAPR_DR_CONNECTOR_TYPE_CPU, id);
> +    sPAPRDRConnectorClass *drck = SPAPR_DR_CONNECTOR_GET_CLASS(drc);
> +
> +    drck->detach(drc, dev, spapr_cpu_release, NULL);
> +}
> +
>  static void spapr_cpu_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
>                              Error **errp)
>  {
> @@ -1948,6 +1964,21 @@ static void spapr_cpu_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
>      return;
>  }
>  
> +static void spapr_cpu_unplug(HotplugHandler *hotplug_dev, DeviceState *dev,
> +                            Error **errp)
> +{
> +    Error *local_err = NULL;

Unused variable.

> +    CPUState *cs = CPU(dev);
> +    PowerPCCPU *cpu = POWERPC_CPU(cs);
> +    sPAPRDRConnector *drc =
> +        spapr_dr_connector_by_id(SPAPR_DR_CONNECTOR_TYPE_CPU, cpu->cpu_dt_id);
> +
> +    spapr_cpu_hotplug_remove(dev, cs);
> +    spapr_hotplug_req_remove_event(drc);
> +    error_propagate(errp, local_err);
> +    return;
> +}
> +
>  static void spapr_machine_device_plug(HotplugHandler *hotplug_dev,
>                                        DeviceState *dev, Error **errp)
>  {
> @@ -1958,6 +1989,16 @@ static void spapr_machine_device_plug(HotplugHandler *hotplug_dev,
>      }
>  }
>  
> +static void spapr_machine_device_unplug(HotplugHandler *hotplug_dev,
> +                                      DeviceState *dev, Error **errp)
> +{
> +    if (object_dynamic_cast(OBJECT(dev), TYPE_CPU)) {
> +        if (dev->hotplugged) {
> +            spapr_cpu_unplug(hotplug_dev, dev, errp);
> +        }
> +    }
> +}
> +
>  static HotplugHandler *spapr_get_hotpug_handler(MachineState *machine,
>                                               DeviceState *dev)
>  {
> @@ -1986,6 +2027,8 @@ static void spapr_machine_class_init(ObjectClass *oc, void *data)
>      mc->get_hotplug_handler = spapr_get_hotpug_handler;
>  
>      hc->plug = spapr_machine_device_plug;
> +    hc->unplug = spapr_machine_device_unplug;
> +
>      smc->dr_phb_enabled = false;
>      smc->dr_cpu_enabled = false;
>      smc->dr_lmb_enabled = false;

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson



* Re: [Qemu-devel] [RFC PATCH v1 10/13] cpus, spapr: reclaim allocated vCPU objects
  2015-01-08  6:10 ` [Qemu-devel] [RFC PATCH v1 10/13] cpus, spapr: reclaim allocated vCPU objects Bharata B Rao
@ 2015-01-29  1:48   ` David Gibson
  2015-01-30  8:23     ` Bharata B Rao
  0 siblings, 1 reply; 50+ messages in thread
From: David Gibson @ 2015-01-29  1:48 UTC (permalink / raw)
  To: Bharata B Rao; +Cc: Gu Zheng, imammedo, agraf, qemu-devel, mdroth

[-- Attachment #1: Type: text/plain, Size: 9079 bytes --]

On Thu, Jan 08, 2015 at 11:40:17AM +0530, Bharata B Rao wrote:
> From: Gu Zheng <guz.fnst@cn.fujitsu.com>

This needs a commit message; the change is not at all clear from the 1-line description.

> 
> Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
> Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
>                (added spapr bits)
> ---
>  cpus.c               | 44 ++++++++++++++++++++++++++++++++++++++++
>  hw/ppc/spapr.c       | 14 ++++++++++++-
>  include/qom/cpu.h    | 11 ++++++++++
>  include/sysemu/kvm.h |  1 +
>  kvm-all.c            | 57 +++++++++++++++++++++++++++++++++++++++++++++++++++-
>  5 files changed, 125 insertions(+), 2 deletions(-)

The generic and PAPR specific parts should probably be divided into
different patches, since they'll want to go via different trees.

> diff --git a/cpus.c b/cpus.c
> index 1b5168a..98b7199 100644
> --- a/cpus.c
> +++ b/cpus.c
> @@ -871,6 +871,24 @@ void async_run_on_cpu(CPUState *cpu, void (*func)(void *data), void *data)
>      qemu_cpu_kick(cpu);
>  }
>  
> +static void qemu_kvm_destroy_vcpu(CPUState *cpu)
> +{
> +    CPU_REMOVE(cpu);
> +
> +    if (kvm_destroy_vcpu(cpu) < 0) {
> +        fprintf(stderr, "kvm_destroy_vcpu failed.\n");
> +        exit(1);
> +    }
> +
> +    object_unparent(OBJECT(cpu));
> +}
> +
> +static void qemu_tcg_destroy_vcpu(CPUState *cpu)
> +{
> +    CPU_REMOVE(cpu);
> +    object_unparent(OBJECT(cpu));
> +}
> +
>  static void flush_queued_work(CPUState *cpu)
>  {
>      struct qemu_work_item *wi;
> @@ -964,6 +982,11 @@ static void *qemu_kvm_cpu_thread_fn(void *arg)
>              }
>          }
>          qemu_kvm_wait_io_event(cpu);
> +        if (cpu->exit && !cpu_can_run(cpu)) {
> +            qemu_kvm_destroy_vcpu(cpu);
> +            qemu_mutex_unlock(&qemu_global_mutex);
> +            return NULL;
> +        }
>      }
>  
>      return NULL;
> @@ -1018,6 +1041,7 @@ static void tcg_exec_all(void);
>  static void *qemu_tcg_cpu_thread_fn(void *arg)
>  {
>      CPUState *cpu = arg;
> +    CPUState *remove_cpu = NULL;
>  
>      qemu_tcg_init_cpu_signals();
>      qemu_thread_get_self(cpu->thread);
> @@ -1052,6 +1076,16 @@ static void *qemu_tcg_cpu_thread_fn(void *arg)
>              }
>          }
>          qemu_tcg_wait_io_event();
> +        CPU_FOREACH(cpu) {
> +            if (cpu->exit && !cpu_can_run(cpu)) {
> +                remove_cpu = cpu;
> +                break;
> +            }
> +        }
> +        if (remove_cpu) {
> +            qemu_tcg_destroy_vcpu(remove_cpu);
> +            remove_cpu = NULL;
> +        }
>      }
>  
>      return NULL;
> @@ -1208,6 +1242,13 @@ void resume_all_vcpus(void)
>      }
>  }
>  
> +void cpu_remove(CPUState *cpu)
> +{
> +    cpu->stop = true;
> +    cpu->exit = true;
> +    qemu_cpu_kick(cpu);
> +}
> +
>  /* For temporary buffers for forming a name */
>  #define VCPU_THREAD_NAME_SIZE 16
>  
> @@ -1402,6 +1443,9 @@ static void tcg_exec_all(void)
>                  break;
>              }
>          } else if (cpu->stop || cpu->stopped) {
> +            if (cpu->exit) {
> +                next_cpu = CPU_NEXT(cpu);
> +            }
>              break;
>          }
>      }
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index ec793b1..44405b2 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -1910,7 +1910,19 @@ static void spapr_cpu_hotplug_add(DeviceState *dev, CPUState *cs)
>  
>  static void spapr_cpu_release(DeviceState *dev, void *opaque)
>  {
> -    /* Release vCPU */
> +    CPUState *cs;
> +    int i;
> +    int id = ppc_get_vcpu_dt_id(POWERPC_CPU(CPU(dev)));
> +
> +    for (i = id; i < id + smp_threads; i++) {
> +        CPU_FOREACH(cs) {
> +            PowerPCCPU *cpu = POWERPC_CPU(cs);
> +
> +            if (i == ppc_get_vcpu_dt_id(cpu)) {
> +                cpu_remove(cs);
> +            }
> +        }
> +    }
>  }
>  
>  static void spapr_cpu_hotplug_remove(DeviceState *dev, CPUState *cs)
> diff --git a/include/qom/cpu.h b/include/qom/cpu.h
> index 2098f1c..30fd0cd 100644
> --- a/include/qom/cpu.h
> +++ b/include/qom/cpu.h
> @@ -206,6 +206,7 @@ struct kvm_run;
>   * @halted: Nonzero if the CPU is in suspended state.
>   * @stop: Indicates a pending stop request.
>   * @stopped: Indicates the CPU has been artificially stopped.
> + * @exit: Indicates the CPU has exited due to an unplug operation.
>   * @tcg_exit_req: Set to force TCG to stop executing linked TBs for this
>   *           CPU and return to its top level loop.
>   * @singlestep_enabled: Flags for single-stepping.
> @@ -249,6 +250,7 @@ struct CPUState {
>      bool created;
>      bool stop;
>      bool stopped;
> +    bool exit;
>      volatile sig_atomic_t exit_request;
>      uint32_t interrupt_request;
>      int singlestep_enabled;
> @@ -305,6 +307,7 @@ struct CPUState {
>  QTAILQ_HEAD(CPUTailQ, CPUState);
>  extern struct CPUTailQ cpus;
>  #define CPU_NEXT(cpu) QTAILQ_NEXT(cpu, node)
> +#define CPU_REMOVE(cpu) QTAILQ_REMOVE(&cpus, cpu, node)
>  #define CPU_FOREACH(cpu) QTAILQ_FOREACH(cpu, &cpus, node)
>  #define CPU_FOREACH_SAFE(cpu, next_cpu) \
>      QTAILQ_FOREACH_SAFE(cpu, &cpus, node, next_cpu)
> @@ -610,6 +613,14 @@ void cpu_exit(CPUState *cpu);
>  void cpu_resume(CPUState *cpu);
>  
>  /**
> + * cpu_remove:
> + * @cpu: The CPU to remove.
> + *
> + * Requests the CPU to be removed.
> + */
> +void cpu_remove(CPUState *cpu);
> +
> +/**
>   * qemu_init_vcpu:
>   * @cpu: The vCPU to initialize.
>   *
> diff --git a/include/sysemu/kvm.h b/include/sysemu/kvm.h
> index 104cf35..da064c1 100644
> --- a/include/sysemu/kvm.h
> +++ b/include/sysemu/kvm.h
> @@ -186,6 +186,7 @@ int kvm_has_gsi_routing(void);
>  int kvm_has_intx_set_mask(void);
>  
>  int kvm_init_vcpu(CPUState *cpu);
> +int kvm_destroy_vcpu(CPUState *cpu);
>  int kvm_cpu_exec(CPUState *cpu);
>  
>  #ifdef NEED_CPU_H
> diff --git a/kvm-all.c b/kvm-all.c
> index 18cc6b4..6f543ce 100644
> --- a/kvm-all.c
> +++ b/kvm-all.c
> @@ -71,6 +71,12 @@ typedef struct KVMSlot
>  
>  typedef struct kvm_dirty_log KVMDirtyLog;
>  
> +struct KVMParkedVcpu {
> +    unsigned long vcpu_id;
> +    int kvm_fd;
> +    QLIST_ENTRY(KVMParkedVcpu) node;
> +};
> +
>  struct KVMState
>  {
>      AccelState parent_obj;
> @@ -107,6 +113,7 @@ struct KVMState
>      QTAILQ_HEAD(msi_hashtab, KVMMSIRoute) msi_hashtab[KVM_MSI_HASHTAB_SIZE];
>      bool direct_msi;
>  #endif
> +    QLIST_HEAD(, KVMParkedVcpu) kvm_parked_vcpus;
>  };
>  
>  #define TYPE_KVM_ACCEL ACCEL_CLASS_NAME("kvm")
> @@ -247,6 +254,53 @@ static int kvm_set_user_memory_region(KVMState *s, KVMSlot *slot)
>      return kvm_vm_ioctl(s, KVM_SET_USER_MEMORY_REGION, &mem);
>  }
>  
> +int kvm_destroy_vcpu(CPUState *cpu)
> +{
> +    KVMState *s = kvm_state;
> +    long mmap_size;
> +    struct KVMParkedVcpu *vcpu = NULL;
> +    int ret = 0;
> +
> +    DPRINTF("kvm_destroy_vcpu\n");
> +
> +    mmap_size = kvm_ioctl(s, KVM_GET_VCPU_MMAP_SIZE, 0);
> +    if (mmap_size < 0) {
> +        ret = mmap_size;
> +        DPRINTF("kvm_destroy_vcpu failed\n");
> +        goto err;
> +    }
> +
> +    ret = munmap(cpu->kvm_run, mmap_size);
> +    if (ret < 0) {
> +        goto err;
> +    }
> +
> +    vcpu = g_malloc0(sizeof(*vcpu));
> +    vcpu->vcpu_id = kvm_arch_vcpu_id(cpu);
> +    vcpu->kvm_fd = cpu->kvm_fd;
> +    QLIST_INSERT_HEAD(&kvm_state->kvm_parked_vcpus, vcpu, node);

What's the reason for parking vcpus rather than removing / recreating
them at the kvm level?
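
For readers following the hunks around this comment, the park-and-reuse
lookup can be sketched in isolation as below. This is only a minimal
sketch, not the QEMU code: struct parked_vcpu, park_vcpu() and
unpark_vcpu() are hypothetical names, a hand-rolled singly linked list
stands in for the QLIST macros, and plain ints stand in for the KVM file
descriptors. It shows the mechanism only and does not answer the question
of why parking was chosen.

```c
#include <stdlib.h>

/* Hypothetical stand-in for the patch's KVMParkedVcpu entry. */
struct parked_vcpu {
    unsigned long vcpu_id;      /* id the vCPU was created with */
    int fd;                     /* fd kept open while the vCPU is parked */
    struct parked_vcpu *next;
};

static struct parked_vcpu *parked_head;

/* On unplug: remember the fd under its vcpu_id instead of closing it. */
static int park_vcpu(unsigned long vcpu_id, int fd)
{
    struct parked_vcpu *p = malloc(sizeof(*p));

    if (!p) {
        return -1;
    }
    p->vcpu_id = vcpu_id;
    p->fd = fd;
    p->next = parked_head;
    parked_head = p;
    return 0;
}

/*
 * On plug: hand back a parked fd for this id if one exists and drop the
 * list entry. Returns -1 when nothing is parked, in which case the
 * caller would fall back to creating a fresh vCPU.
 */
static int unpark_vcpu(unsigned long vcpu_id)
{
    struct parked_vcpu **pp;

    for (pp = &parked_head; *pp; pp = &(*pp)->next) {
        if ((*pp)->vcpu_id == vcpu_id) {
            struct parked_vcpu *p = *pp;
            int fd = p->fd;

            *pp = p->next;
            free(p);
            return fd;
        }
    }
    return -1;
}
```

In this sketch a second unpark_vcpu() call for the same id finds nothing,
which mirrors the fall-through to KVM_CREATE_VCPU in kvm_get_vcpu() in the
quoted patch.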

> +
> +err:
> +    return ret;
> +}
> +
> +static int kvm_get_vcpu(KVMState *s, unsigned long vcpu_id)
> +{
> +    struct KVMParkedVcpu *cpu;
> +
> +    QLIST_FOREACH(cpu, &s->kvm_parked_vcpus, node) {
> +        if (cpu->vcpu_id == vcpu_id) {
> +            int kvm_fd;
> +
> +            QLIST_REMOVE(cpu, node);
> +            kvm_fd = cpu->kvm_fd;
> +            g_free(cpu);
> +            return kvm_fd;
> +        }
> +    }
> +
> +    return kvm_vm_ioctl(s, KVM_CREATE_VCPU, (void *)vcpu_id);
> +}
>  int kvm_init_vcpu(CPUState *cpu)
>  {
>      KVMState *s = kvm_state;
> @@ -255,7 +309,7 @@ int kvm_init_vcpu(CPUState *cpu)
>  
>      DPRINTF("kvm_init_vcpu\n");
>  
> -    ret = kvm_vm_ioctl(s, KVM_CREATE_VCPU, (void *)kvm_arch_vcpu_id(cpu));
> +    ret = kvm_get_vcpu(s, kvm_arch_vcpu_id(cpu));
>      if (ret < 0) {
>          DPRINTF("kvm_create_vcpu failed\n");
>          goto err;
> @@ -1441,6 +1495,7 @@ static int kvm_init(MachineState *ms)
>  #ifdef KVM_CAP_SET_GUEST_DEBUG
>      QTAILQ_INIT(&s->kvm_sw_breakpoints);
>  #endif
> +    QLIST_INIT(&s->kvm_parked_vcpus);
>      s->vmfd = -1;
>      s->fd = qemu_open("/dev/kvm", O_RDWR);
>      if (s->fd == -1) {

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson



* Re: [Qemu-devel] [RFC PATCH v1 00/13] CPU and Memory hotplug for PowerPC guests
  2015-01-08  6:10 [Qemu-devel] [RFC PATCH v1 00/13] CPU and Memory hotplug for PowerPC guests Bharata B Rao
                   ` (12 preceding siblings ...)
  2015-01-08  6:10 ` [Qemu-devel] [RFC PATCH v1 13/13] spapr: Memory hotplug support Bharata B Rao
@ 2015-01-29 17:46 ` Andreas Färber
  2015-02-02  9:00   ` Bharata B Rao
  2015-01-29 22:14 ` Tyrel Datwyler
  14 siblings, 1 reply; 50+ messages in thread
From: Andreas Färber @ 2015-01-29 17:46 UTC (permalink / raw)
  To: Bharata B Rao, qemu-devel; +Cc: imammedo, mdroth, agraf

[-- Attachment #1: Type: text/plain, Size: 2312 bytes --]

Hi,

Am 08.01.2015 um 07:10 schrieb Bharata B Rao:
> This patchset enables CPU and memory hotplug support for PowerPC guests.
> 
> Changes in this patchset (v1):
> 
> - Based on top of Michael Roth's tree
>   (https://github.com/mdroth/qemu/commits/spapr-hotplug-core) which serves
>   as base for his PCI hotplug patches too.
> - Switched to device_add/del semantics instead of cpu-add.

Please don't forget to CC me on this. As previously discussed with Jason
and Christian for s390x, there are certain topology modeling questions
still unsolved for device-based QOM CPU hot-plug. I have an RFC in the
works (again) that hopefully gets us a step closer.

> - Supporting CPU hot unplug now.
> - Added patches to enable memory hotplug.
> - Added ibm,dynamic-reconfiguration-memory support which is needed for
>   memory hotplug.
> 
> v0 - http://lists.nongnu.org/archive/html/qemu-devel/2014-09/msg00752.html
> 
> CPU hotplug
> -----------
> - Works with BE guest, has issues with LE guest. Has been tested on BE host
>   only.
> - Adding a core (and all its threads) in response to device_add command.
>   Similarly removing a core via device_del will remove all the threads.

Earlier discussions concluded that hot-plug needs to happen on a socket
level, not core. If you're assuming the socket to have one core (as we
were planning for s390x), that doesn't make much of a difference
number-wise, but it does in modeling terms. Think of what you can
physically plug onto the mainboard; that's the granularity that
realized=true/false is going to operate on. A virtual socket may well
correspond to a thread on some socket/node of the host, but you cannot
add threads to a core or cores to a chip at runtime.

On x86 this may become a functional limitation of what is possible
through cpu-add, so better avoid that mistake for ppc from the start.

Regards,
Andreas

> - Using Gu Zheng's "reclaim vCPU objects" patch to remove and reuse
>   the vCPU objects after CPUs removal.
>   (Gu Zheng's original patch:
>    http://lists.gnu.org/archive/html/qemu-devel/2014-12/msg01829.html)
[snip]

-- 
SUSE Linux GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Felix Imendörffer, Jane Smithard, Jennifer Guild, Dilip Upmanyu,
Graham Norton; HRB 21284 (AG Nürnberg)




* Re: [Qemu-devel] [RFC PATCH v1 00/13] CPU and Memory hotplug for PowerPC guests
  2015-01-08  6:10 [Qemu-devel] [RFC PATCH v1 00/13] CPU and Memory hotplug for PowerPC guests Bharata B Rao
                   ` (13 preceding siblings ...)
  2015-01-29 17:46 ` [Qemu-devel] [RFC PATCH v1 00/13] CPU and Memory hotplug for PowerPC guests Andreas Färber
@ 2015-01-29 22:14 ` Tyrel Datwyler
  14 siblings, 0 replies; 50+ messages in thread
From: Tyrel Datwyler @ 2015-01-29 22:14 UTC (permalink / raw)
  To: Bharata B Rao, qemu-devel; +Cc: imammedo, mdroth, Nathan Fontenot, agraf

On 01/07/2015 10:10 PM, Bharata B Rao wrote:
> This patchset enables CPU and memory hotplug support for PowerPC guests.

I missed seeing this as the qemu-ppc list was not included. Can you
please add Nathan and me on Cc in the future as well?

Tyrel

> 
> Changes in this patchset (v1):
> 
> - Based on top of Michael Roth's tree
>   (https://github.com/mdroth/qemu/commits/spapr-hotplug-core) which serves
>   as base for his PCI hotplug patches too.
> - Switched to device_add/del semantics instead of cpu-add.
> - Supporting CPU hot unplug now.
> - Added patches to enable memory hotplug.
> - Added ibm,dynamic-reconfiguration-memory support which is needed for
>   memory hotplug.
> 
> v0 - http://lists.nongnu.org/archive/html/qemu-devel/2014-09/msg00752.html
> 
> CPU hotplug
> -----------
> - Works with BE guest, has issues with LE guest. Has been tested on BE host
>   only.
> - Adding a core (and all its threads) in response to device_add command.
>   Similarly removing a core via device_del will remove all the threads.
> - Using Gu Zheng's "reclaim vCPU objects" patch to remove and reuse
>   the vCPU objects after CPUs removal.
>   (Gu Zheng's original patch:
>    http://lists.gnu.org/archive/html/qemu-devel/2014-12/msg01829.html)
> 
> Memory hotplug
> --------------
> - Able to get an LMB added with the current patchset, but there are issues
>   which I am still debugging.
> - Re-using pc-dimm infrastructure (hw/mem/pc-dimm.c) to support memory
>   hotplug on PowerPC.
> - Tested with Nathan Fontenot's memory hotplug kernel patches (with additions
>   to drive memory hotplug from EPOW interrupt path)
>   (https://www.marc.info/?l=linuxppc-embedded&m=141626066317143&w=2)
> 
> Bharata B Rao (11):
>   spapr: Add DRC dt entries for CPUs
>   spapr: Consider max_cpus during xics initialization
>   spapr: Factor out CPU initialization code into realizefn
>   spapr: Support ibm,lrdr-capacity device tree property
>   spapr: CPU hotplug support
>   spapr: Start all the threads of CPU core when core is hotplugged
>   spapr: Enable CPU hotplug for POWER8 CPU family
>   spapr: CPU hot unplug support
>   spapr: Initialize hotplug memory address space
>   spapr: Support ibm,dynamic-reconfiguration-memory
>   spapr: Memory hotplug support
> 
> Gu Zheng (1):
>   cpus, spapr: reclaim allocated vCPU objects
> 
> Michael Roth (1):
>   spapr: enable PHB/CPU/LMB hotplug for pseries-2.3
> 
>  cpus.c                            |  44 +++
>  default-configs/ppc64-softmmu.mak |   1 +
>  hw/ppc/spapr.c                    | 744 ++++++++++++++++++++++++++++++++++----
>  hw/ppc/spapr_events.c             |  11 +-
>  hw/ppc/spapr_hcall.c              |  51 ++-
>  hw/ppc/spapr_rtas.c               |  28 +-
>  include/hw/ppc/spapr.h            |  27 +-
>  include/qom/cpu.h                 |  11 +
>  include/sysemu/kvm.h              |   1 +
>  kvm-all.c                         |  57 ++-
>  target-ppc/translate_init.c       |  50 ++-
>  11 files changed, 918 insertions(+), 107 deletions(-)
> 


* Re: [Qemu-devel] [RFC PATCH v1 06/13] spapr: CPU hotplug support
  2015-01-23 12:41   ` Igor Mammedov
@ 2015-01-30  6:59     ` Bharata B Rao
  0 siblings, 0 replies; 50+ messages in thread
From: Bharata B Rao @ 2015-01-30  6:59 UTC (permalink / raw)
  To: Igor Mammedov; +Cc: agraf, qemu-devel, mdroth

On Fri, Jan 23, 2015 at 01:41:38PM +0100, Igor Mammedov wrote:
> On Thu,  8 Jan 2015 11:40:13 +0530
> Bharata B Rao <bharata@linux.vnet.ibm.com> wrote:
> 
> > Support CPU hotplug via device-add command. Use the existing EPOW event
> > infrastructure to send CPU hotplug notification to the guest.
> > 
> > Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
> > ---
> >  hw/ppc/spapr.c              | 205 +++++++++++++++++++++++++++++++++++++++++++-
> >  hw/ppc/spapr_events.c       |   8 +-
> >  target-ppc/translate_init.c |   6 ++
> >  3 files changed, 215 insertions(+), 4 deletions(-)
> > 
> > diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> > index 515d770..a293a59 100644
> > --- a/hw/ppc/spapr.c
> > +++ b/hw/ppc/spapr.c
> > @@ -330,6 +330,8 @@ static void add_str(GString *s, const gchar *s1)
> >      g_string_append_len(s, s1, strlen(s1) + 1);
> >  }
> >  
> > +uint32_t cpus_per_socket;
> static ???

Sure.

> > +
> > +static void spapr_cpu_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
> > +                            Error **errp)
> > +{
> > +    Error *local_err = NULL;
> > +    CPUState *cs = CPU(dev);
> > +    PowerPCCPU *cpu = POWERPC_CPU(cs);
> > +    sPAPRDRConnector *drc =
> > +        spapr_dr_connector_by_id(SPAPR_DR_CONNECTOR_TYPE_CPU, cpu->cpu_dt_id);
> just a rant: does this have any relation to hotplug_dev? The point here is to get
> these data from the hotplug_dev object or some child of it rather than via a direct ad hoc call.

I see how hotplug_dev is being used to pass on the plug request to ACPI, but
have to check how hotplug_dev can be used more meaningfully here.

> 
> > +
> > +    /* TODO: Check if DR is enabled ? */
> > +    g_assert(drc);
> > +
> > +    spapr_cpu_reset(POWERPC_CPU(CPU(dev)));
> reset probably should be done at realize time, see x86_cpu_realizefn() for example.

Yes, can be done.

> 
> > +    spapr_cpu_hotplug_add(dev, cs);
> > +    spapr_hotplug_req_add_event(drc);
> > +    error_propagate(errp, local_err);
> > +    return;
> > +}
> > +
> > +static void spapr_machine_device_plug(HotplugHandler *hotplug_dev,
> > +                                      DeviceState *dev, Error **errp)
> > +{
> > +    if (object_dynamic_cast(OBJECT(dev), TYPE_CPU)) {
> > +        if (dev->hotplugged) {
> > +            spapr_cpu_plug(hotplug_dev, dev, errp);
> Would be nicer if this could do cold-plugged CPUs wiring too.

Yes, will check and see how intrusive change that would be.

> > diff --git a/target-ppc/translate_init.c b/target-ppc/translate_init.c
> > index 9c642a5..cf9d8d3 100644
> > --- a/target-ppc/translate_init.c
> > +++ b/target-ppc/translate_init.c
> > @@ -32,6 +32,7 @@
> >  #include "hw/qdev-properties.h"
> >  #include "hw/ppc/spapr.h"
> >  #include "hw/ppc/ppc.h"
> > +#include "sysemu/sysemu.h"
> >  
> >  //#define PPC_DUMP_CPU
> >  //#define PPC_DEBUG_SPR
> > @@ -8909,6 +8910,11 @@ static void ppc_cpu_realizefn(DeviceState *dev, Error **errp)
> >          return;
> >      }
> >  
> > +    if (cs->cpu_index >= max_cpus) {
> pls note that cpu_index increases monotonically, so when you do an unplug
> and then a plug it will go above max_cpus; the same will happen if
> one device_add fails in the middle: the next CPU add will fail because
> cs->cpu_index goes overboard.
> 
> I'd suggest not relying on cpu_index for any purpose and using other means
> to identify where the cpu is plugged in. On x86 we are slowly getting rid of this
> dependency in favor of apic_id (topology information); eventually it could
> become:
>   -device cpu_foo,socket=X,core=Y[,thread=Z][,node=N]
> 
> you probably could do the same.
> It doesn't have to be in this series, just be aware of potential issues.

I see your point and this needs to be fixed, as I see this causing problems
with CPU removal (from the middle) and subsequent addition (which makes
use of the "vcpu fd parking and reuse" mechanism).
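
To make the failure mode concrete, here is a minimal sketch under stated
assumptions; it is not QEMU code. MAX_CPUS, plug_monotonic(), alloc_slot()
and free_slot() are hypothetical names. The first scheme models an index
that only ever grows and is checked against the limit, as in the realizefn
hunk above. The second is one possible "other means" of identification, a
reusable slot table; the topology-based ids (socket/core/thread) suggested
above would behave similarly with respect to reuse.

```c
#include <stdbool.h>

#define MAX_CPUS 4  /* illustrative limit, standing in for max_cpus */

/* Scheme A: a counter that only grows, checked against the limit. */
static int next_index;

/* Returns the assigned index, or -1 once the counter has passed the limit. */
static int plug_monotonic(void)
{
    int index = next_index++;   /* consumed even if the plug is rejected */

    return index < MAX_CPUS ? index : -1;
}

/* Scheme B: a slot table, so an unplugged position can be handed out again. */
static bool slot_used[MAX_CPUS];

/* Returns the lowest free slot, or -1 if all slots are occupied. */
static int alloc_slot(void)
{
    for (int i = 0; i < MAX_CPUS; i++) {
        if (!slot_used[i]) {
            slot_used[i] = true;
            return i;
        }
    }
    return -1;
}

/* Marks a slot as free again after an unplug. */
static void free_slot(int slot)
{
    if (slot >= 0 && slot < MAX_CPUS) {
        slot_used[slot] = false;
    }
}
```

With four CPUs plugged and one then unplugged, plug_monotonic() rejects
the next plug because the counter is already at the limit, while
alloc_slot() returns the slot that free_slot() released.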

Thanks for your review.

Regards,
Bharata.


* Re: [Qemu-devel] [RFC PATCH v1 04/13] spapr: Factor out CPU initialization code into realizefn
  2015-01-29  1:07   ` David Gibson
@ 2015-01-30  7:49     ` Bharata B Rao
  2015-02-23  7:36       ` Bharata B Rao
  0 siblings, 1 reply; 50+ messages in thread
From: Bharata B Rao @ 2015-01-30  7:49 UTC (permalink / raw)
  To: David Gibson; +Cc: imammedo, agraf, qemu-devel, mdroth

On Thu, Jan 29, 2015 at 12:07:42PM +1100, David Gibson wrote:
> On Thu, Jan 08, 2015 at 11:40:11AM +0530, Bharata B Rao wrote:
> > Move some CPU initialization code from machine init function to
> > CPU realizefn so that it can be used from CPU hotplug path too.
> > 
> > With the inclusion of ppc.h in translate_init.c, explicit *irq_init()
> > function definitions aren't required, remove them.
> > 
> > Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
> > ---
> >  hw/ppc/spapr.c              | 29 +----------------------------
> >  include/hw/ppc/spapr.h      |  3 +++
> >  target-ppc/translate_init.c | 43 ++++++++++++++++++++++++++-----------------
> >  3 files changed, 30 insertions(+), 45 deletions(-)
> > 
> > diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> > index 779d364..f49b0fa 100644
> > --- a/hw/ppc/spapr.c
> > +++ b/hw/ppc/spapr.c
> > @@ -81,8 +81,6 @@
> >  
> >  #define MIN_RMA_SLOF            128UL
> >  
> > -#define TIMEBASE_FREQ           512000000ULL
> > -
> >  #define MAX_CPUS                255
> >  
> >  #define PHANDLE_XICP            0x00001111
> > @@ -971,7 +969,7 @@ static void ppc_spapr_reset(void)
> >  
> >  }
> >  
> > -static void spapr_cpu_reset(void *opaque)
> > +void spapr_cpu_reset(void *opaque)
> >  {
> >      PowerPCCPU *cpu = opaque;
> >      CPUState *cs = CPU(cpu);
> > @@ -1387,7 +1385,6 @@ static void ppc_spapr_init(MachineState *machine)
> >      const char *initrd_filename = machine->initrd_filename;
> >      const char *boot_device = machine->boot_order;
> >      PowerPCCPU *cpu;
> > -    CPUPPCState *env;
> >      PCIHostState *phb;
> >      int i;
> >      MemoryRegion *sysmem = get_system_memory();
> > @@ -1472,30 +1469,6 @@ static void ppc_spapr_init(MachineState *machine)
> >              fprintf(stderr, "Unable to find PowerPC CPU definition\n");
> >              exit(1);
> >          }
> > -        env = &cpu->env;
> > -
> > -        /* Set time-base frequency to 512 MHz */
> > -        cpu_ppc_tb_init(env, TIMEBASE_FREQ);
> > -
> > -        /* PAPR always has exception vectors in RAM not ROM. To ensure this,
> > -         * MSR[IP] should never be set.
> > -         */
> > -        env->msr_mask &= ~(1 << 6);
> > -
> > -        /* Tell KVM that we're in PAPR mode */
> > -        if (kvm_enabled()) {
> > -            kvmppc_set_papr(cpu);
> > -        }
> > -
> > -        if (cpu->max_compat) {
> > -            if (ppc_set_compat(cpu, cpu->max_compat) < 0) {
> > -                exit(1);
> > -            }
> > -        }
> > -
> > -        xics_cpu_setup(spapr->icp, cpu);
> > -
> > -        qemu_register_reset(spapr_cpu_reset, cpu);
> >      }
> >  
> >      /* allocate RAM */
> > diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
> > index b1a0838..831db6b 100644
> > --- a/include/hw/ppc/spapr.h
> > +++ b/include/hw/ppc/spapr.h
> > @@ -478,6 +478,8 @@ struct sPAPRTCETable {
> >      QLIST_ENTRY(sPAPRTCETable) list;
> >  };
> >  
> > +#define TIMEBASE_FREQ           512000000ULL
> > +
> >  void spapr_events_init(sPAPREnvironment *spapr);
> >  void spapr_events_fdt_skel(void *fdt, uint32_t epow_irq);
> >  int spapr_h_cas_compose_response(target_ulong addr, target_ulong size);
> > @@ -494,5 +496,6 @@ int spapr_tcet_dma_dt(void *fdt, int node_off, const char *propname,
> >                        sPAPRTCETable *tcet);
> >  void spapr_hotplug_req_add_event(sPAPRDRConnector *drc);
> >  void spapr_hotplug_req_remove_event(sPAPRDRConnector *drc);
> > +void spapr_cpu_reset(void *opaque);
> >  
> >  #endif /* !defined (__HW_SPAPR_H__) */
> > diff --git a/target-ppc/translate_init.c b/target-ppc/translate_init.c
> > index 72cc9d0..9c642a5 100644
> > --- a/target-ppc/translate_init.c
> > +++ b/target-ppc/translate_init.c
> > @@ -30,29 +30,14 @@
> >  #include "qemu/error-report.h"
> >  #include "qapi/visitor.h"
> >  #include "hw/qdev-properties.h"
> > +#include "hw/ppc/spapr.h"
> > +#include "hw/ppc/ppc.h"
> >  
> >  //#define PPC_DUMP_CPU
> >  //#define PPC_DEBUG_SPR
> >  //#define PPC_DUMP_SPR_ACCESSES
> >  /* #define USE_APPLE_GDB */
> >  
> > -/* For user-mode emulation, we don't emulate any IRQ controller */
> > -#if defined(CONFIG_USER_ONLY)
> > -#define PPC_IRQ_INIT_FN(name)                                                 \
> > -static inline void glue(glue(ppc, name),_irq_init) (CPUPPCState *env)         \
> > -{                                                                             \
> > -}
> > -#else
> > -#define PPC_IRQ_INIT_FN(name)                                                 \
> > -void glue(glue(ppc, name),_irq_init) (CPUPPCState *env);
> > -#endif
> > -
> > -PPC_IRQ_INIT_FN(40x);
> > -PPC_IRQ_INIT_FN(6xx);
> > -PPC_IRQ_INIT_FN(970);
> > -PPC_IRQ_INIT_FN(POWER7);
> > -PPC_IRQ_INIT_FN(e500);
> > -
> >  /* Generic callbacks:
> >   * do nothing but store/retrieve spr value
> >   */
> > @@ -8905,6 +8890,7 @@ static void ppc_cpu_realizefn(DeviceState *dev, Error **errp)
> >      CPUState *cs = CPU(dev);
> >      PowerPCCPU *cpu = POWERPC_CPU(dev);
> >      PowerPCCPUClass *pcc = POWERPC_CPU_GET_CLASS(cpu);
> > +    CPUPPCState *env = &cpu->env;
> >      Error *local_err = NULL;
> >  #if !defined(CONFIG_USER_ONLY)
> >      int max_smt = kvm_enabled() ? kvmppc_smt_threads() : 1;
> > @@ -8965,6 +8951,29 @@ static void ppc_cpu_realizefn(DeviceState *dev, Error **errp)
> >  
> >      qemu_init_vcpu(cs);
> >  
> > +    /* Set time-base frequency to 512 MHz */
> > +    cpu_ppc_tb_init(env, TIMEBASE_FREQ);
> > +
> > +    /* PAPR always has exception vectors in RAM not ROM. To ensure this,
> > +     * MSR[IP] should never be set.
> > +     */
> > +    env->msr_mask &= ~(1 << 6);
> > +
> > +    /* Tell KVM that we're in PAPR mode */
> > +    if (kvm_enabled()) {
> > +        kvmppc_set_papr(cpu);
> > +    }
> > +
> > +    if (cpu->max_compat) {
> > +        if (ppc_set_compat(cpu, cpu->max_compat) < 0) {
> > +            exit(1);
> > +        }
> > +    }
> > +
> > +    xics_cpu_setup(spapr->icp, cpu);
> > +
> > +    qemu_register_reset(spapr_cpu_reset, cpu);
> > +
> >      pcc->parent_realize(dev, errp);
> 
> This doesn't look right.  Several of these are clearly PAPR specific
> operations, but you're now doing them from code that isn't PAPR specific.

OK, will rework this patch.
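
One possible shape for that rework, sketched here only under assumptions:
keep the generic realize step free of machine knowledge and put the PAPR
setup in a machine-side helper that both the boot path and the hotplug
path call. struct cpu, cpu_realize_generic(), papr_cpu_init() and
papr_machine_add_cpu() are hypothetical names for illustration, not
existing QEMU functions.

```c
#include <stdbool.h>

/* Hypothetical, heavily reduced CPU state for illustration only. */
struct cpu {
    bool realized;
    bool papr_mode;
    unsigned long long tb_freq;
};

/* Generic realize step: nothing machine-specific happens here. */
static void cpu_realize_generic(struct cpu *c)
{
    c->realized = true;
}

/* PAPR-only setup, kept on the machine side rather than in CPU code. */
static void papr_cpu_init(struct cpu *c)
{
    c->tb_freq = 512000000ULL;  /* time-base frequency used by the patch */
    c->papr_mode = true;
}

/* Both machine init and the hotplug handler would go through this helper. */
static void papr_machine_add_cpu(struct cpu *c)
{
    cpu_realize_generic(c);
    papr_cpu_init(c);
}
```

A CPU realized for some other machine type would call only
cpu_realize_generic() and never pick up the PAPR settings.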

Thanks for your review.

Regards,
Bharata.


* Re: [Qemu-devel] [RFC PATCH v1 05/13] spapr: Support ibm, lrdr-capacity device tree property
  2015-01-29  1:16   ` David Gibson
@ 2015-01-30  7:50     ` Bharata B Rao
  0 siblings, 0 replies; 50+ messages in thread
From: Bharata B Rao @ 2015-01-30  7:50 UTC (permalink / raw)
  To: David Gibson; +Cc: imammedo, agraf, qemu-devel, mdroth

On Thu, Jan 29, 2015 at 12:16:09PM +1100, David Gibson wrote:
> On Thu, Jan 08, 2015 at 11:40:12AM +0530, Bharata B Rao wrote:
> > -int spapr_rtas_device_tree_setup(void *fdt, hwaddr rtas_addr,
> > -                                 hwaddr rtas_size)
> > +int spapr_rtas_device_tree_setup(sPAPREnvironment *spapr, void *fdt,
> > +                                 hwaddr rtas_addr, hwaddr rtas_size)
> >  {
> >      int ret;
> >      int i;
> > +    uint32_t lrdr_capacity[5];
> >  
> >      ret = fdt_add_mem_rsv(fdt, rtas_addr, rtas_size);
> >      if (ret < 0) {
> > @@ -604,6 +606,28 @@ int spapr_rtas_device_tree_setup(void *fdt, hwaddr rtas_addr,
> >          }
> >  
> >      }
> > +
> > +    ret = qemu_fdt_setprop_cell(fdt, "/rtas", "#address-cells", 0x2);
> > +    if (ret < 0) {
> > +        fprintf(stderr, "Couldn't add #address-cells rtas property\n");
> > +    }
> > +
> > +    ret = qemu_fdt_setprop_cell(fdt, "/rtas", "#size-cells", 0x2);
> > +    if (ret < 0) {
> > +        fprintf(stderr, "Couldn't add #size-cells rtas property\n");
> > +    }
> 
> It's not clear what adding #address-cells and #size-cells has to do
> with this, and these properties generally don't make sense on a node
> without children.

Yes, those aren't needed, will remove them.

Regards,
Bharata.


* Re: [Qemu-devel] [RFC PATCH v1 07/13] spapr: Start all the threads of CPU core when core is hotplugged
  2015-01-29  1:36   ` David Gibson
@ 2015-01-30  8:12     ` Bharata B Rao
  0 siblings, 0 replies; 50+ messages in thread
From: Bharata B Rao @ 2015-01-30  8:12 UTC (permalink / raw)
  To: David Gibson; +Cc: imammedo, agraf, qemu-devel, mdroth

On Thu, Jan 29, 2015 at 12:36:55PM +1100, David Gibson wrote:
> On Thu, Jan 08, 2015 at 11:40:14AM +0530, Bharata B Rao wrote:
> > PowerPC kernel adds or removes CPUs in core granularity and hence
> > onlines/offlines all the SMT threads of a core during hot plug/unplug.
> > Support this notion by starting all SMT threads of a core when a core
> > is hotplugged.
> > 
> > Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
> > ---
> >  hw/ppc/spapr.c | 25 +++++++++++++++++++++++++
> >  1 file changed, 25 insertions(+)
> > 
> > diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> > index a293a59..4347471 100644
> > --- a/hw/ppc/spapr.c
> > +++ b/hw/ppc/spapr.c
> > @@ -1376,6 +1376,8 @@ static void spapr_drc_reset(void *opaque)
> >      }
> >  }
> >  
> > +static const char *current_cpu_model;
> 
> More new global variables?  Please don't.

Sure, I should be able to get the model name from the type name
of the object as Igor suggested earlier.
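
A rough sketch of that idea, with assumptions stated: the helper below
recovers a model name by stripping a fixed suffix from a type name string.
model_from_typename() is a hypothetical helper, and both the example type
name "POWER8-powerpc64-cpu" and the suffix "-powerpc64-cpu" are
illustrative guesses at the naming scheme rather than values taken from
this series.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Returns a newly allocated copy of type_name with suffix removed, or
 * NULL if type_name does not end with suffix or allocation fails.
 * The caller frees the result.
 */
static char *model_from_typename(const char *type_name, const char *suffix)
{
    size_t tlen = strlen(type_name);
    size_t slen = strlen(suffix);
    char *model;

    if (slen > tlen || strcmp(type_name + tlen - slen, suffix) != 0) {
        return NULL;
    }

    model = malloc(tlen - slen + 1);
    if (!model) {
        return NULL;
    }
    memcpy(model, type_name, tlen - slen);
    model[tlen - slen] = '\0';
    return model;
}
```

Deriving the name on demand this way would avoid keeping a file-scope
current_cpu_model pointer around.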

Regards,
Bharata.


* Re: [Qemu-devel] [RFC PATCH v1 09/13] spapr: CPU hot unplug support
  2015-01-29  1:39   ` David Gibson
@ 2015-01-30  8:15     ` Bharata B Rao
  0 siblings, 0 replies; 50+ messages in thread
From: Bharata B Rao @ 2015-01-30  8:15 UTC (permalink / raw)
  To: David Gibson; +Cc: imammedo, agraf, qemu-devel, mdroth

On Thu, Jan 29, 2015 at 12:39:58PM +1100, David Gibson wrote:
> On Thu, Jan 08, 2015 at 11:40:16AM +0530, Bharata B Rao wrote:
> > Support hot removal of CPU for sPAPR guests.
> > 
> > Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
> > ---
> >  hw/ppc/spapr.c | 43 +++++++++++++++++++++++++++++++++++++++++++
> >  1 file changed, 43 insertions(+)
> > 
> > diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> > index 4347471..ec793b1 100644
> > --- a/hw/ppc/spapr.c
> > +++ b/hw/ppc/spapr.c
> > @@ -1908,6 +1908,22 @@ static void spapr_cpu_hotplug_add(DeviceState *dev, CPUState *cs)
> >      drck->attach(drc, dev, fdt, offset, false);
> >  }
> >  
> > +static void spapr_cpu_release(DeviceState *dev, void *opaque)
> > +{
> > +    /* Release vCPU */
> 
> Um.. should this actually do something?

The actual vCPU removal code is in the next patch, but as you commented on
that patch, I will split the generic and ppc parts into different patches
next time.

Regards,
Bharata.


* Re: [Qemu-devel] [RFC PATCH v1 10/13] cpus, spapr: reclaim allocated vCPU objects
  2015-01-29  1:48   ` David Gibson
@ 2015-01-30  8:23     ` Bharata B Rao
  2015-01-31  0:21       ` David Gibson
  0 siblings, 1 reply; 50+ messages in thread
From: Bharata B Rao @ 2015-01-30  8:23 UTC (permalink / raw)
  To: David Gibson; +Cc: Gu Zheng, imammedo, agraf, qemu-devel, mdroth

On Thu, Jan 29, 2015 at 12:48:39PM +1100, David Gibson wrote:
> On Thu, Jan 08, 2015 at 11:40:17AM +0530, Bharata B Rao wrote:
> > From: Gu Zheng <guz.fnst@cn.fujitsu.com>
> 
> This needs a commit message, it's not at all clear from the 1-line description.

Borrowed patch, but I should have put in a description.

> > diff --git a/kvm-all.c b/kvm-all.c
> > index 18cc6b4..6f543ce 100644
> > --- a/kvm-all.c
> > +++ b/kvm-all.c
> > @@ -71,6 +71,12 @@ typedef struct KVMSlot
> >  
> >  typedef struct kvm_dirty_log KVMDirtyLog;
> >  
> > +struct KVMParkedVcpu {
> > +    unsigned long vcpu_id;
> > +    int kvm_fd;
> > +    QLIST_ENTRY(KVMParkedVcpu) node;
> > +};
> > +
> >  struct KVMState
> >  {
> >      AccelState parent_obj;
> > @@ -107,6 +113,7 @@ struct KVMState
> >      QTAILQ_HEAD(msi_hashtab, KVMMSIRoute) msi_hashtab[KVM_MSI_HASHTAB_SIZE];
> >      bool direct_msi;
> >  #endif
> > +    QLIST_HEAD(, KVMParkedVcpu) kvm_parked_vcpus;
> >  };
> >  
> >  #define TYPE_KVM_ACCEL ACCEL_CLASS_NAME("kvm")
> > @@ -247,6 +254,53 @@ static int kvm_set_user_memory_region(KVMState *s, KVMSlot *slot)
> >      return kvm_vm_ioctl(s, KVM_SET_USER_MEMORY_REGION, &mem);
> >  }
> >  
> > +int kvm_destroy_vcpu(CPUState *cpu)
> > +{
> > +    KVMState *s = kvm_state;
> > +    long mmap_size;
> > +    struct KVMParkedVcpu *vcpu = NULL;
> > +    int ret = 0;
> > +
> > +    DPRINTF("kvm_destroy_vcpu\n");
> > +
> > +    mmap_size = kvm_ioctl(s, KVM_GET_VCPU_MMAP_SIZE, 0);
> > +    if (mmap_size < 0) {
> > +        ret = mmap_size;
> > +        DPRINTF("kvm_destroy_vcpu failed\n");
> > +        goto err;
> > +    }
> > +
> > +    ret = munmap(cpu->kvm_run, mmap_size);
> > +    if (ret < 0) {
> > +        goto err;
> > +    }
> > +
> > +    vcpu = g_malloc0(sizeof(*vcpu));
> > +    vcpu->vcpu_id = kvm_arch_vcpu_id(cpu);
> > +    vcpu->kvm_fd = cpu->kvm_fd;
> > +    QLIST_INSERT_HEAD(&kvm_state->kvm_parked_vcpus, vcpu, node);
> 
> What's the reason for parking vcpus rather than removing / recreating
> them at the kvm level?

Since KVM isn't equipped to handle closure of a vcpu fd from userspace (QEMU)
correctly, certain workarounds have to be employed to allow reuse of the
vcpu array slot in KVM during cpu hotplug/unplug from the guest. One such
proposed workaround is to park the vcpu fd in userspace during cpu unplug
and reuse it during the next hotplug.

Some details can be found here:
KVM: https://www.mail-archive.com/kvm@vger.kernel.org/msg102839.html
QEMU: http://lists.gnu.org/archive/html/qemu-devel/2014-12/msg00859.html
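
The parking scheme can be sketched roughly as follows. This is only a
minimal, self-contained illustration and not the code from the borrowed
patch: ParkedVcpu, park_vcpu() and unpark_vcpu() are hypothetical names, and
a plain singly linked list stands in for QEMU's QLIST macros. On unplug the
fd is kept on a list instead of being closed; on a later hotplug of the same
vcpu id the fd is taken back from the list, and a new vcpu would only have to
be created in KVM when nothing is found.

```c
#include <assert.h>
#include <stdlib.h>

/* One parked vCPU: its id and the still-open KVM vcpu fd. */
struct ParkedVcpu {
    unsigned long vcpu_id;
    int kvm_fd;
    struct ParkedVcpu *next;
};

/* Head of the parked list (hypothetical stand-in for a KVMState field). */
static struct ParkedVcpu *parked_vcpus;

/* Unplug path: remember the fd instead of closing it. Returns 0 on success. */
int park_vcpu(unsigned long vcpu_id, int kvm_fd)
{
    struct ParkedVcpu *p;

    assert(kvm_fd >= 0);
    p = malloc(sizeof(*p));
    if (!p) {
        return -1;
    }
    p->vcpu_id = vcpu_id;
    p->kvm_fd = kvm_fd;
    p->next = parked_vcpus;
    parked_vcpus = p;
    return 0;
}

/*
 * Hotplug path: if this vcpu id was parked earlier, remove it from the
 * list and hand back its fd. Returns -1 when no parked fd exists, in
 * which case the caller would have to create a fresh vcpu.
 */
int unpark_vcpu(unsigned long vcpu_id)
{
    struct ParkedVcpu **pp;

    for (pp = &parked_vcpus; *pp; pp = &(*pp)->next) {
        if ((*pp)->vcpu_id == vcpu_id) {
            struct ParkedVcpu *p = *pp;
            int fd = p->kvm_fd;

            *pp = p->next;
            free(p);
            return fd;
        }
    }
    return -1;
}
```

The list is keyed on the vcpu id so that a later hotplug of the same id gets
the same fd back.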

Regards,
Bharata.


* Re: [Qemu-devel] [RFC PATCH v1 05/13] spapr: Support ibm,lrdr-capacity device tree property
  2015-01-22 21:55   ` Michael Roth
@ 2015-01-30  8:51     ` Bharata B Rao
  0 siblings, 0 replies; 50+ messages in thread
From: Bharata B Rao @ 2015-01-30  8:51 UTC (permalink / raw)
  To: Michael Roth; +Cc: imammedo, qemu-devel, agraf

On Thu, Jan 22, 2015 at 03:55:40PM -0600, Michael Roth wrote:
> Quoting Bharata B Rao (2015-01-08 00:10:12)
> > Add support for ibm,lrdr-capacity since this is needed by the guest
> > kernel to know about the possible hot-pluggable CPUs and Memory.
> > 
> > Define minimum hotpluggable memory size as 256MB and start storing maximum
> > possible memory for the guest in sPAPREnvironment.
> > 
> > Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
> > ---
> >  hw/ppc/spapr.c         |  3 ++-
> >  hw/ppc/spapr_rtas.c    | 28 ++++++++++++++++++++++++++--
> >  include/hw/ppc/spapr.h |  6 ++++--
> >  3 files changed, 32 insertions(+), 5 deletions(-)
> > 
> > diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> > index f49b0fa..515d770 100644
> > --- a/hw/ppc/spapr.c
> > +++ b/hw/ppc/spapr.c
> > @@ -775,7 +775,7 @@ static void spapr_finalize_fdt(sPAPREnvironment *spapr,
> >      }
> > 
> >      /* RTAS */
> > -    ret = spapr_rtas_device_tree_setup(fdt, rtas_addr, rtas_size);
> > +    ret = spapr_rtas_device_tree_setup(spapr, fdt, rtas_addr, rtas_size);
> >      if (ret < 0) {
> >          fprintf(stderr, "Couldn't set up RTAS device tree properties\n");
> >      }
> > @@ -1473,6 +1473,7 @@ static void ppc_spapr_init(MachineState *machine)
> > 
> >      /* allocate RAM */
> >      spapr->ram_limit = ram_size;
> > +    spapr->maxram_limit = machine->maxram_size;
> >      memory_region_allocate_system_memory(ram, NULL, "ppc_spapr.ram",
> >                                           spapr->ram_limit);
> >      memory_region_add_subregion(sysmem, 0, ram);
> > diff --git a/hw/ppc/spapr_rtas.c b/hw/ppc/spapr_rtas.c
> > index d847f45..e8a0f21 100644
> > --- a/hw/ppc/spapr_rtas.c
> > +++ b/hw/ppc/spapr_rtas.c
> > @@ -29,6 +29,7 @@
> >  #include "sysemu/char.h"
> >  #include "hw/qdev.h"
> >  #include "sysemu/device_tree.h"
> > +#include "sysemu/cpus.h"
> > 
> >  #include "hw/ppc/spapr.h"
> >  #include "hw/ppc/spapr_vio.h"
> > @@ -551,11 +552,12 @@ void spapr_rtas_register(int token, const char *name, spapr_rtas_fn fn)
> >      rtas_table[token].fn = fn;
> >  }
> > 
> > -int spapr_rtas_device_tree_setup(void *fdt, hwaddr rtas_addr,
> > -                                 hwaddr rtas_size)
> > +int spapr_rtas_device_tree_setup(sPAPREnvironment *spapr, void *fdt,
> > +                                 hwaddr rtas_addr, hwaddr rtas_size)
> >  {
> >      int ret;
> >      int i;
> > +    uint32_t lrdr_capacity[5];
> > 
> >      ret = fdt_add_mem_rsv(fdt, rtas_addr, rtas_size);
> >      if (ret < 0) {
> > @@ -604,6 +606,28 @@ int spapr_rtas_device_tree_setup(void *fdt, hwaddr rtas_addr,
> >          }
> > 
> >      }
> > +
> > +    ret = qemu_fdt_setprop_cell(fdt, "/rtas", "#address-cells", 0x2);
> > +    if (ret < 0) {
> > +        fprintf(stderr, "Couldn't add #address-cells rtas property\n");
> > +    }
> > +
> > +    ret = qemu_fdt_setprop_cell(fdt, "/rtas", "#size-cells", 0x2);
> > +    if (ret < 0) {
> > +        fprintf(stderr, "Couldn't add #size-cells rtas property\n");
> > +    }
> > +
> > +    lrdr_capacity[0] = cpu_to_be32(spapr->maxram_limit >> 32);
> > +    lrdr_capacity[1] = cpu_to_be32(spapr->maxram_limit & 0xffffffff);
> > +    lrdr_capacity[2] = 0;
> > +    lrdr_capacity[3] = cpu_to_be32(SPAPR_MIN_MEMORY_BLOCK_SIZE);
> > +    lrdr_capacity[4] = cpu_to_be32(max_cpus/smp_threads);
> > +    ret = qemu_fdt_setprop(fdt, "/rtas", "ibm,lrdr-capacity", lrdr_capacity,
> > +                     sizeof(lrdr_capacity));
> > +    if (ret < 0) {
> > +        fprintf(stderr, "Couldn't add ibm,lrdr-capacity rtas property\n");
> > +    }
> > +
> 
> The property seems simple enough, but it would be worthwhile to add a
> description of how/when it's used in a new section of
> docs/specs/ppc-spapr-hotplug.txt to keep the documentation complete.

Sure.
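
As a starting point for that description, the layout the patch emits can be
sketched as below: five 32-bit big-endian cells holding the maximum RAM (high
and low words), the memory block size (a zero high word followed by the low
word) and the maximum number of cores. This is only an illustration of what
the patch above writes, not a statement of what PAPR mandates.
encode_lrdr_capacity() and put_be32() are hypothetical helpers, and the cells
are written byte by byte so that the result does not depend on host
endianness.

```c
#include <assert.h>
#include <stdint.h>

#define LRDR_CAPACITY_CELLS 5

/* Store a 32-bit value as one big-endian device tree cell. */
static void put_be32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v >> 24);
    p[1] = (uint8_t)(v >> 16);
    p[2] = (uint8_t)(v >> 8);
    p[3] = (uint8_t)v;
}

/* Fill the five cells in the order used by the patch. */
void encode_lrdr_capacity(uint8_t out[LRDR_CAPACITY_CELLS * 4],
                          uint64_t max_ram, uint32_t lmb_size,
                          uint32_t max_cpus, uint32_t threads_per_core)
{
    assert(threads_per_core != 0);

    put_be32(out + 0, (uint32_t)(max_ram >> 32));         /* max RAM, high */
    put_be32(out + 4, (uint32_t)(max_ram & 0xffffffff));  /* max RAM, low */
    put_be32(out + 8, 0);                                  /* block size, high */
    put_be32(out + 12, lmb_size);                          /* block size, low */
    put_be32(out + 16, max_cpus / threads_per_core);       /* max cores */
}
```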

> 
> >      return 0;
> >  }
> > 
> > diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
> > index 831db6b..ae8b4e1 100644
> > --- a/include/hw/ppc/spapr.h
> > +++ b/include/hw/ppc/spapr.h
> > @@ -18,6 +18,7 @@ typedef struct sPAPREnvironment {
> >      XICSState *icp;
> > 
> >      hwaddr ram_limit;
> > +    hwaddr maxram_limit;
> >      void *htab;
> >      uint32_t htab_shift;
> >      hwaddr rma_size;
> > @@ -444,8 +445,8 @@ void spapr_rtas_register(int token, const char *name, spapr_rtas_fn fn);
> >  target_ulong spapr_rtas_call(PowerPCCPU *cpu, sPAPREnvironment *spapr,
> >                               uint32_t token, uint32_t nargs, target_ulong args,
> >                               uint32_t nret, target_ulong rets);
> > -int spapr_rtas_device_tree_setup(void *fdt, hwaddr rtas_addr,
> > -                                 hwaddr rtas_size);
> > +int spapr_rtas_device_tree_setup(sPAPREnvironment *spapr, void *fdt,
> > +                                 hwaddr rtas_addr, hwaddr rtas_size);
> > 
> >  #define SPAPR_TCE_PAGE_SHIFT   12
> >  #define SPAPR_TCE_PAGE_SIZE    (1ULL << SPAPR_TCE_PAGE_SHIFT)
> > @@ -479,6 +480,7 @@ struct sPAPRTCETable {
> >  };
> > 
> >  #define TIMEBASE_FREQ           512000000ULL
> > +#define SPAPR_MIN_MEMORY_BLOCK_SIZE (1 << 28) /* 256MB */
> 
> Is this actually the min, or a set increment size? Documentation suggests
> the latter, in which case the naming is a little confusing.

I should just call it SPAPR_MEMORY_BLOCK_SIZE.

Thanks for your review.

Regards,
Bharata.


* Re: [Qemu-devel] [RFC PATCH v1 10/13] cpus, spapr: reclaim allocated vCPU objects
  2015-01-30  8:23     ` Bharata B Rao
@ 2015-01-31  0:21       ` David Gibson
  0 siblings, 0 replies; 50+ messages in thread
From: David Gibson @ 2015-01-31  0:21 UTC (permalink / raw)
  To: Bharata B Rao; +Cc: Gu Zheng, imammedo, agraf, qemu-devel, mdroth


On Fri, Jan 30, 2015 at 01:53:47PM +0530, Bharata B Rao wrote:
> On Thu, Jan 29, 2015 at 12:48:39PM +1100, David Gibson wrote:
> > On Thu, Jan 08, 2015 at 11:40:17AM +0530, Bharata B Rao wrote:
> > > From: Gu Zheng <guz.fnst@cn.fujitsu.com>
> > 
> > This needs a commit message, it's not at all clear from the 1-line description.
> 
> Borrowed patch, but I should have put in a description.
> 
> > > diff --git a/kvm-all.c b/kvm-all.c
> > > index 18cc6b4..6f543ce 100644
> > > --- a/kvm-all.c
> > > +++ b/kvm-all.c
> > > @@ -71,6 +71,12 @@ typedef struct KVMSlot
> > >  
> > >  typedef struct kvm_dirty_log KVMDirtyLog;
> > >  
> > > +struct KVMParkedVcpu {
> > > +    unsigned long vcpu_id;
> > > +    int kvm_fd;
> > > +    QLIST_ENTRY(KVMParkedVcpu) node;
> > > +};
> > > +
> > >  struct KVMState
> > >  {
> > >      AccelState parent_obj;
> > > @@ -107,6 +113,7 @@ struct KVMState
> > >      QTAILQ_HEAD(msi_hashtab, KVMMSIRoute) msi_hashtab[KVM_MSI_HASHTAB_SIZE];
> > >      bool direct_msi;
> > >  #endif
> > > +    QLIST_HEAD(, KVMParkedVcpu) kvm_parked_vcpus;
> > >  };
> > >  
> > >  #define TYPE_KVM_ACCEL ACCEL_CLASS_NAME("kvm")
> > > @@ -247,6 +254,53 @@ static int kvm_set_user_memory_region(KVMState *s, KVMSlot *slot)
> > >      return kvm_vm_ioctl(s, KVM_SET_USER_MEMORY_REGION, &mem);
> > >  }
> > >  
> > > +int kvm_destroy_vcpu(CPUState *cpu)
> > > +{
> > > +    KVMState *s = kvm_state;
> > > +    long mmap_size;
> > > +    struct KVMParkedVcpu *vcpu = NULL;
> > > +    int ret = 0;
> > > +
> > > +    DPRINTF("kvm_destroy_vcpu\n");
> > > +
> > > +    mmap_size = kvm_ioctl(s, KVM_GET_VCPU_MMAP_SIZE, 0);
> > > +    if (mmap_size < 0) {
> > > +        ret = mmap_size;
> > > +        DPRINTF("kvm_destroy_vcpu failed\n");
> > > +        goto err;
> > > +    }
> > > +
> > > +    ret = munmap(cpu->kvm_run, mmap_size);
> > > +    if (ret < 0) {
> > > +        goto err;
> > > +    }
> > > +
> > > +    vcpu = g_malloc0(sizeof(*vcpu));
> > > +    vcpu->vcpu_id = kvm_arch_vcpu_id(cpu);
> > > +    vcpu->kvm_fd = cpu->kvm_fd;
> > > +    QLIST_INSERT_HEAD(&kvm_state->kvm_parked_vcpus, vcpu, node);
> > 
> > What's the reason for parking vcpus rather than removing / recreating
> > them at the kvm level?
> 
> Since KVM isn't equipped to handle closure of a vcpu fd from userspace (QEMU)
> correctly, certain workarounds have to be employed to allow reuse of the
> vcpu array slot in KVM during cpu hotplug/unplug from the guest. One such
> proposed workaround is to park the vcpu fd in userspace during cpu unplug
> and reuse it during the next hotplug.
> 
> Some details can be found here:
> KVM: https://www.mail-archive.com/kvm@vger.kernel.org/msg102839.html
> QEMU: http://lists.gnu.org/archive/html/qemu-devel/2014-12/msg00859.html

Ok, that makes sense, but it definitely needs a comment, both in the code
and in the commit message.

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson



* Re: [Qemu-devel] [RFC PATCH v1 00/13] CPU and Memory hotplug for PowerPC guests
  2015-01-29 17:46 ` [Qemu-devel] [RFC PATCH v1 00/13] CPU and Memory hotplug for PowerPC guests Andreas Färber
@ 2015-02-02  9:00   ` Bharata B Rao
  0 siblings, 0 replies; 50+ messages in thread
From: Bharata B Rao @ 2015-02-02  9:00 UTC (permalink / raw)
  To: Andreas Färber; +Cc: imammedo, agraf, qemu-devel, mdroth

On Thu, Jan 29, 2015 at 06:46:30PM +0100, Andreas Färber wrote:
> Hi,
> 
> Am 08.01.2015 um 07:10 schrieb Bharata B Rao:
> > This patchset enables CPU and memory hotplug support for PowerPC guests.
> > 
> > Changes in this patchset (v1):
> > 
> > - Based on top of Michael Roth's tree
> >   (https://github.com/mdroth/qemu/commits/spapr-hotplug-core) which serves
> >   as base for his PCI hotplug patches too.
> > - Switched to device_add/del semantics instead of cpu-add.
> 
> Please don't forget to CC me on this. As previously discussed with Jason
> and Christian for s390x, there are certain topology modeling questions
> still unsolved for device-based QOM CPU hot-plug. I have an RFC in the
> works (again) that hopefully gets us a step closer.

I think you are referring to this discussion:
http://lists.gnu.org/archive/html/qemu-devel/2013-09/msg00778.html

Looking forward to your above-mentioned RFC.

> 
> > - Supporting CPU hot unplug now.
> > - Added patches to enable memory hotplug.
> > - Added ibm,dynamic-reconfiguration-memory support which is needed for
> >   memory hotplug.
> > 
> > v0 - http://lists.nongnu.org/archive/html/qemu-devel/2014-09/msg00752.html
> > 
> > CPU hotplug
> > -----------
> > - Works with BE guest, has issues with LE guest. Has been tested on BE host
> >   only.
> > - Adding a core (and all its threads) in response to device_add command.
> >   Similarly removing a core via device_del will remove all the threads.
> 
> Earlier discussions concluded that hot-plug needs to happen on a socket
> level, not core. If you're assuming the socket to have one core (as we
> were planning for s390x), that doesn't make much of a difference
> number-wise, but it does in modeling terms. Think what you can
> physically plug onto the mainboard, that's the granularity that
> realized=true/false is going to operate on. A virtual socket may well
> correspond to a thread on some socket/node of the host, but you cannot
> add threads to a core or cores to a chip at runtime.

PowerPC guests conform to the PAPR specification, so the hotplug behaviour
of real hardware does not matter here. For pseries guests we add CPUs at
core granularity.

By default, if we don't specify sockets= explicitly, the guest comes up with
as many sockets as there are cores, with 1 core per socket. However, we can
also have a guest that has multiple cores in a socket.

Hence, if you have a CPU hotplug model where a socket is added at a time,
I can't see any issues with supporting that with PowerPC hotplug. Based on
the existing topology, we could hotplug as many cores in the socket as needed.
However, I would like to hear from others who understand the PowerPC
architecture to see if there are any caveats with this model.

Regards,
Bharata.


* Re: [Qemu-devel] [RFC PATCH v1 11/13] spapr: Initialize hotplug memory address space
  2015-01-08  6:10 ` [Qemu-devel] [RFC PATCH v1 11/13] spapr: Initialize hotplug memory address space Bharata B Rao
@ 2015-02-12  5:19   ` David Gibson
  2015-02-12  5:39     ` Bharata B Rao
  0 siblings, 1 reply; 50+ messages in thread
From: David Gibson @ 2015-02-12  5:19 UTC (permalink / raw)
  To: Bharata B Rao; +Cc: imammedo, agraf, qemu-devel, mdroth


On Thu, Jan 08, 2015 at 11:40:18AM +0530, Bharata B Rao wrote:
> Initialize a hotplug memory region under which all the hotplugged
> memory is accommodated. Also enable memory hotplug by setting
> CONFIG_MEM_HOTPLUG.
> 
> Modelled on i386 memory hotplug.
> 
> Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
> ---
>  default-configs/ppc64-softmmu.mak |  1 +
>  hw/ppc/spapr.c                    | 26 ++++++++++++++++++++++++++
>  include/hw/ppc/spapr.h            |  3 +++
>  3 files changed, 30 insertions(+)
> 
> diff --git a/default-configs/ppc64-softmmu.mak b/default-configs/ppc64-softmmu.mak
> index bd30d69..03210de 100644
> --- a/default-configs/ppc64-softmmu.mak
> +++ b/default-configs/ppc64-softmmu.mak
> @@ -60,3 +60,4 @@ CONFIG_I82374=y
>  CONFIG_I8257=y
>  CONFIG_MC146818RTC=y
>  CONFIG_ISA_TESTDEV=y
> +CONFIG_MEM_HOTPLUG=y
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index 44405b2..9ff08ff 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -120,6 +120,8 @@ struct sPAPRMachineState {
>  
>      /*< public >*/
>      char *kvm_type;
> +    ram_addr_t hotplug_memory_base;
> +    MemoryRegion hotplug_memory;

We should really unify sPAPRMachineState with sPAPREnvironment at some
point (I realise that doesn't reasonably fit within the scope of this
series).

>  };
>  
>  sPAPREnvironment *spapr;
> @@ -1403,6 +1405,7 @@ static void ppc_spapr_init(MachineState *machine)
>      bool kernel_le = false;
>      char *filename;
>      int smt = kvmppc_smt_threads();
> +    sPAPRMachineState *ms = SPAPR_MACHINE(machine);
>  
>      msi_supported = true;
>  
> @@ -1492,6 +1495,29 @@ static void ppc_spapr_init(MachineState *machine)
>          memory_region_add_subregion(sysmem, 0, rma_region);
>      }
>  
> +    if (machine->ram_size < machine->maxram_size) {
> +        ram_addr_t hotplug_mem_size = machine->maxram_size - machine->ram_size;
> +
> +        if (machine->ram_slots > SPAPR_MAX_RAM_SLOTS) {
> +            error_report("unsupported amount of memory slots: %"PRIu64,
> +                         machine->ram_slots);
> +            exit(EXIT_FAILURE);
> +        }
> +
> +        ms->hotplug_memory_base = ROUND_UP(machine->ram_size, 1ULL << 30);

Is there a particular significance to the 1GiB alignment?  Is it just
a conveniently large alignment, or is that value specified in PAPR
somewhere?  Using a named constant would probably help to clarify that.

> +        if ((ms->hotplug_memory_base + hotplug_mem_size) < hotplug_mem_size) {
> +            error_report("unsupported amount of maximum memory: " RAM_ADDR_FMT,
> +                         machine->maxram_size);
> +            exit(EXIT_FAILURE);
> +        }
> +
> +        memory_region_init(&ms->hotplug_memory, OBJECT(ms),
> +                           "hotplug-memory", hotplug_mem_size);
> +        memory_region_add_subregion(sysmem, ms->hotplug_memory_base,
> +                                    &ms->hotplug_memory);
> +    }
> +
>      filename = qemu_find_file(QEMU_FILE_TYPE_BIOS, "spapr-rtas.bin");
>      spapr->rtas_size = get_image_size(filename);
>      spapr->rtas_blob = g_malloc(spapr->rtas_size);
> diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
> index ae8b4e1..64681c4 100644
> --- a/include/hw/ppc/spapr.h
> +++ b/include/hw/ppc/spapr.h
> @@ -482,6 +482,9 @@ struct sPAPRTCETable {
>  #define TIMEBASE_FREQ           512000000ULL
>  #define SPAPR_MIN_MEMORY_BLOCK_SIZE (1 << 28) /* 256MB */
>  
> +/* Support a min of 1TB hotplug memory assuming 256MB per slot */
> +#define SPAPR_MAX_RAM_SLOTS     (1ULL << 12)

Is this constraint arbitrary, or does it come from something in PAPR+?

>  void spapr_events_init(sPAPREnvironment *spapr);
>  void spapr_events_fdt_skel(void *fdt, uint32_t epow_irq);
>  int spapr_h_cas_compose_response(target_ulong addr, target_ulong size);

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson



* Re: [Qemu-devel] [RFC PATCH v1 11/13] spapr: Initialize hotplug memory address space
  2015-02-12  5:19   ` David Gibson
@ 2015-02-12  5:39     ` Bharata B Rao
  2015-02-16  4:56       ` David Gibson
  0 siblings, 1 reply; 50+ messages in thread
From: Bharata B Rao @ 2015-02-12  5:39 UTC (permalink / raw)
  To: David Gibson; +Cc: imammedo, agraf, qemu-devel, mdroth

On Thu, Feb 12, 2015 at 04:19:36PM +1100, David Gibson wrote:
> > diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> > index 44405b2..9ff08ff 100644
> > --- a/hw/ppc/spapr.c
> > +++ b/hw/ppc/spapr.c
> > @@ -120,6 +120,8 @@ struct sPAPRMachineState {
> >  
> >      /*< public >*/
> >      char *kvm_type;
> > +    ram_addr_t hotplug_memory_base;
> > +    MemoryRegion hotplug_memory;
> 
> We should really unify sPAPRMachineState with sPAPREnvironment at some
> point (I realise that doesn't reasonably fit within the scope of this
> series).

ok.

> 
> >  };
> >  
> >  sPAPREnvironment *spapr;
> > @@ -1403,6 +1405,7 @@ static void ppc_spapr_init(MachineState *machine)
> >      bool kernel_le = false;
> >      char *filename;
> >      int smt = kvmppc_smt_threads();
> > +    sPAPRMachineState *ms = SPAPR_MACHINE(machine);
> >  
> >      msi_supported = true;
> >  
> > @@ -1492,6 +1495,29 @@ static void ppc_spapr_init(MachineState *machine)
> >          memory_region_add_subregion(sysmem, 0, rma_region);
> >      }
> >  
> > +    if (machine->ram_size < machine->maxram_size) {
> > +        ram_addr_t hotplug_mem_size = machine->maxram_size - machine->ram_size;
> > +
> > +        if (machine->ram_slots > SPAPR_MAX_RAM_SLOTS) {
> > +            error_report("unsupported amount of memory slots: %"PRIu64,
> > +                         machine->ram_slots);
> > +            exit(EXIT_FAILURE);
> > +        }
> > +
> > +        ms->hotplug_memory_base = ROUND_UP(machine->ram_size, 1ULL << 30);
> 
> Is there a particular significance to the 1GiB alignment?  Is it just
> a conveniently large alignment, or is that value specified in PAPR
> somewhere?  Using a named constant would probably help to clarify that.

I am basing this on x86 memory hotplug, which is where the 1GB comes from.
It is not specified by PAPR.
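
A minimal sketch of how the alignment could be expressed with a named
constant, together with the wrap-around check the patch performs.
SPAPR_HOTPLUG_MEM_ALIGN, round_up_pow2(), hotplug_memory_base() and
hotplug_region_overflows() are hypothetical names used only for this
illustration; the 1GiB value simply mirrors the x86 choice.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical name; 1GiB is borrowed from x86, it is not a PAPR value. */
#define SPAPR_HOTPLUG_MEM_ALIGN (1ULL << 30)

/* Round x up to a multiple of align, which must be a power of two. */
static uint64_t round_up_pow2(uint64_t x, uint64_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (x + align - 1) & ~(align - 1);
}

/* Hotplug memory starts at the first aligned address at or above ram_size. */
uint64_t hotplug_memory_base(uint64_t ram_size)
{
    return round_up_pow2(ram_size, SPAPR_HOTPLUG_MEM_ALIGN);
}

/*
 * Same test as in the patch: if base + size wraps around the 64-bit
 * address space, the unsigned sum ends up smaller than size.
 */
bool hotplug_region_overflows(uint64_t base, uint64_t size)
{
    return base + size < size;
}
```

The rounding helper itself does not guard against ram_size values within one
alignment unit of the top of the 64-bit range.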

> 
> > +        if ((ms->hotplug_memory_base + hotplug_mem_size) < hotplug_mem_size) {
> > +            error_report("unsupported amount of maximum memory: " RAM_ADDR_FMT,
> > +                         machine->maxram_size);
> > +            exit(EXIT_FAILURE);
> > +        }
> > +
> > +        memory_region_init(&ms->hotplug_memory, OBJECT(ms),
> > +                           "hotplug-memory", hotplug_mem_size);
> > +        memory_region_add_subregion(sysmem, ms->hotplug_memory_base,
> > +                                    &ms->hotplug_memory);
> > +    }
> > +
> >      filename = qemu_find_file(QEMU_FILE_TYPE_BIOS, "spapr-rtas.bin");
> >      spapr->rtas_size = get_image_size(filename);
> >      spapr->rtas_blob = g_malloc(spapr->rtas_size);
> > diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
> > index ae8b4e1..64681c4 100644
> > --- a/include/hw/ppc/spapr.h
> > +++ b/include/hw/ppc/spapr.h
> > @@ -482,6 +482,9 @@ struct sPAPRTCETable {
> >  #define TIMEBASE_FREQ           512000000ULL
> >  #define SPAPR_MIN_MEMORY_BLOCK_SIZE (1 << 28) /* 256MB */
> >  
> > +/* Support a min of 1TB hotplug memory assuming 256MB per slot */
> > +#define SPAPR_MAX_RAM_SLOTS     (1ULL << 12)
> 
> Is this constraint arbitrary, or does it come from something in PAPR+?

Arbitrary max, not defined by PAPR.
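
For reference, the arithmetic behind the comment in the patch works out as
4096 slots of 256MB each, which is 1TB. A small sketch follows;
max_hotplug_memory() is a hypothetical helper, and the two constants are
redefined locally as 64-bit values for the purpose of the example.

```c
#include <assert.h>
#include <stdint.h>

/* Local 64-bit copies of the values used in the patch. */
#define SPAPR_MIN_MEMORY_BLOCK_SIZE (1ULL << 28) /* 256MB */
#define SPAPR_MAX_RAM_SLOTS         (1ULL << 12) /* 4096 slots */

/* Total memory reachable when every slot holds one block of block_size. */
uint64_t max_hotplug_memory(uint64_t slots, uint64_t block_size)
{
    /* Refuse inputs whose product would not fit in 64 bits. */
    assert(block_size == 0 || slots <= UINT64_MAX / block_size);
    return slots * block_size;
}
```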

Regards,
Bharata.


* Re: [Qemu-devel] [RFC PATCH v1 12/13] spapr: Support ibm,dynamic-reconfiguration-memory
  2015-01-08  6:10 ` [Qemu-devel] [RFC PATCH v1 12/13] spapr: Support ibm,dynamic-reconfiguration-memory Bharata B Rao
@ 2015-02-12  6:02   ` David Gibson
  0 siblings, 0 replies; 50+ messages in thread
From: David Gibson @ 2015-02-12  6:02 UTC (permalink / raw)
  To: Bharata B Rao; +Cc: imammedo, agraf, qemu-devel, mdroth


On Thu, Jan 08, 2015 at 11:40:19AM +0530, Bharata B Rao wrote:
> Parse ibm,architecture.vec table obtained from the guest and enable
> memory node configuration via ibm,dynamic-reconfiguration-memory if guest
> supports it. This is in preparation to support memory hotplug for
> sPAPR guests.
> 
> This changes the way memory node configuration is done. Currently all
> memory nodes are built upfront. But after this patch, only memory@0 node
> for RMA is built upfront. Guest kernel boots with just that and rest of
> the memory nodes (via memory@XXX or ibm,dynamic-reconfiguration-memory)
> are built when guest does ibm,client-architecture-support call.
> 
> Note: This patch was tested with an enhancement to SLOF that supports
> addition of device tree nodes from ibm,client-architecture-support call.
> 
> TODO: Enforce lmb-size alignment for node memory.

I think this todo needs to be done before you're ready to merge.

> Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
> ---
>  hw/ppc/spapr.c         | 232 ++++++++++++++++++++++++++++++++++++++++---------
>  hw/ppc/spapr_hcall.c   |  51 +++++++++--
>  include/hw/ppc/spapr.h |  12 ++-
>  3 files changed, 246 insertions(+), 49 deletions(-)
> 
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index 9ff08ff..6964b06 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -631,42 +631,6 @@ static void *spapr_create_fdt_skel(hwaddr initrd_base,
>      return fdt;
>  }
>  
> -int spapr_h_cas_compose_response(target_ulong addr, target_ulong size)
> -{
> -    void *fdt, *fdt_skel;
> -    sPAPRDeviceTreeUpdateHeader hdr = { .version_id = 1 };
> -
> -    size -= sizeof(hdr);
> -
> -    /* Create sceleton */
> -    fdt_skel = g_malloc0(size);
> -    _FDT((fdt_create(fdt_skel, size)));
> -    _FDT((fdt_begin_node(fdt_skel, "")));
> -    _FDT((fdt_end_node(fdt_skel)));
> -    _FDT((fdt_finish(fdt_skel)));
> -    fdt = g_malloc0(size);
> -    _FDT((fdt_open_into(fdt_skel, fdt, size)));
> -    g_free(fdt_skel);
> -
> -    /* Fix skeleton up */
> -    _FDT((spapr_fixup_cpu_dt(fdt, spapr)));
> -
> -    /* Pack resulting tree */
> -    _FDT((fdt_pack(fdt)));
> -
> -    if (fdt_totalsize(fdt) + sizeof(hdr) > size) {
> -        trace_spapr_cas_failed(size);
> -        return -1;
> -    }
> -
> -    cpu_physical_memory_write(addr, &hdr, sizeof(hdr));
> -    cpu_physical_memory_write(addr + sizeof(hdr), fdt, fdt_totalsize(fdt));
> -    trace_spapr_cas_continue(fdt_totalsize(fdt) + sizeof(hdr));
> -    g_free(fdt);
> -
> -    return 0;
> -}
> -
>  static void spapr_populate_memory_node(void *fdt, int nodeid, hwaddr start,
>                                         hwaddr size)
>  {
> @@ -720,7 +684,6 @@ static int spapr_populate_memory(sPAPREnvironment *spapr, void *fdt)
>          }
>          if (!mem_start) {
>              /* ppc_spapr_init() checks for rma_size <= node0_size already */
> -            spapr_populate_memory_node(fdt, i, 0, spapr->rma_size);
>              mem_start += spapr->rma_size;
>              node_size -= spapr->rma_size;
>          }
> @@ -741,6 +704,190 @@ static int spapr_populate_memory(sPAPREnvironment *spapr, void *fdt)
>      return 0;
>  }
>  
> +/*
> + * TODO: Take care of sparsemem configuration ?
> + */
> +static uint64_t numa_node_end(uint32_t nodeid)
> +{
> +    uint32_t i = 0;
> +    uint64_t addr = 0;
> +
> +    do {
> +        addr += numa_info[i].node_mem;
> +    } while (++i <= nodeid);
> +
> +    return addr;
> +}
> +
> +static uint64_t numa_node_start(uint32_t nodeid)
> +{
> +    if (!nodeid) {
> +        return 0;
> +    } else {
> +        return numa_node_end(nodeid - 1);
> +    }
> +}

There's really no better generic way of finding the addresses of numa nodes?

> +/*
> + * Given the addr, return the NUMA node to which the address belongs to.
> + */
> +static uint32_t get_numa_node(uint64_t addr)
> +{
> +    uint32_t i;
> +
> +    for (i = 0; i < nb_numa_nodes; i++) {
> +        if ((addr >= numa_node_start(i)) && (addr < numa_node_end(i))) {

This is O(nb_numa_nodes^2) which is kind of nasty, although that's
unlikely to be large enough to be a real problem.
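
One way to make the lookup linear is to keep a running end address while
walking the nodes once. The sketch below is an assumption-laden illustration
rather than a drop-in replacement: numa_node_for_addr() is a hypothetical
name, and the node sizes are passed in as a plain array instead of being read
from numa_info[]. Like the helpers in the patch, it assumes the nodes are
laid out back to back starting at address 0.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * node_mem[i] is the size in bytes of NUMA node i. Because the nodes are
 * contiguous, the first node whose end lies above addr is the owner.
 */
uint32_t numa_node_for_addr(const uint64_t *node_mem, uint32_t nb_nodes,
                            uint64_t addr)
{
    uint64_t end = 0;
    uint32_t i;

    assert(node_mem != NULL || nb_nodes == 0);

    for (i = 0; i < nb_nodes; i++) {
        end += node_mem[i];
        if (addr < end) {
            return i;
        }
    }

    /* Unassigned memory goes to node 0 by default, as in the patch. */
    return 0;
}
```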

> +            return i;
> +        }
> +    }
> +
> +    /* Unassigned memory goes to node 0 by default */
> +    return 0;
> +}
> +
> +/* Adds ibm,dynamic-reconfiguration-memory node */
> +static int spapr_populate_drconf_memory(sPAPREnvironment *spapr, void *fdt)
> +{
> +    int root_offset, ret, i, offset;
> +    uint32_t lmb_size = SPAPR_MIN_MEMORY_BLOCK_SIZE;
> +    uint32_t prop_lmb_size[] = {0, cpu_to_be32(lmb_size)};
> +    uint32_t dynamic_memory[DR_LMB_LIST_ENTRY_SIZE];
> +    uint32_t nr_rma_lmbs = spapr->rma_size/lmb_size;
> +    uint32_t nr_lmbs = spapr->maxram_limit/lmb_size - nr_rma_lmbs;
> +    uint32_t nr_assigned_lmbs = spapr->ram_limit/lmb_size - nr_rma_lmbs;
> +    uint32_t *int_buf, *cur_index, buf_len;
> +
> +    /* Allocate enough buffer size to fit in ibm,dynamic-memory */
> +    buf_len = nr_lmbs * DR_LMB_LIST_ENTRY_SIZE * sizeof(uint32_t) +
> +                sizeof(uint32_t);
> +    cur_index = int_buf = g_malloc0(buf_len);
> +    root_offset = fdt_path_offset(fdt, "/");

You don't need this, the fdt offset of / is always 0 and you're
allowed to count on that.

> +
> +
> +    offset = fdt_add_subnode(fdt, root_offset,
> +                   "ibm,dynamic-reconfiguration-memory");
> +
> +    ret = fdt_setprop(fdt, offset, "ibm,lmb-size", prop_lmb_size,
> +            sizeof(prop_lmb_size));
> +    if (ret < 0) {
> +        goto out;
> +    }

Current versions of libfdt have fdt_setprop_u64 which you could use
for this.

> +    ret = fdt_setprop_cell(fdt, offset, "ibm,memory-flags-mask",
> +                            cpu_to_be32(0xff));

fdt_setprop_cell has the byteswap built-in, so adding your own as well
will make it incorrect for an LE host.
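
To illustrate the double swap without pulling in libfdt, here is a small
model. store_cell_be() is a hypothetical stand-in for a setter that takes a
host-order value and stores it as a big-endian cell, and swap32() plays the
role that cpu_to_be32() has on a little-endian host. Passing an already
swapped value through the setter reverses the bytes a second time.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Unconditional 32-bit byte swap: what cpu_to_be32() does on an LE host. */
uint32_t swap32(uint32_t v)
{
    return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
           ((v << 8) & 0x00ff0000u) | (v << 24);
}

/*
 * Model of a cell setter with the conversion built in: it expects a
 * host-order value and always stores the most significant byte first.
 */
void store_cell_be(uint8_t out[4], uint32_t host_value)
{
    assert(out != NULL);
    out[0] = (uint8_t)(host_value >> 24);
    out[1] = (uint8_t)(host_value >> 16);
    out[2] = (uint8_t)(host_value >> 8);
    out[3] = (uint8_t)host_value;
}
```

With this model, store_cell_be(cell, 0xff) produces the bytes 00 00 00 ff,
while store_cell_be(cell, swap32(0xff)) produces ff 00 00 00, which is what a
little-endian host would end up writing with the extra conversion.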

> +    if (ret < 0) {
> +        goto out;
> +    }
> +
> +    ret = fdt_setprop_cell(fdt, offset, "ibm,memory-preservation-time",
> +                            cpu_to_be32(0x0));
> +    if (ret < 0) {
> +        goto out;
> +    }
> +
> +    /* ibm,dynamic-memory */
> +    int_buf[0] = cpu_to_be32(nr_lmbs);
> +    cur_index++;
> +    for (i = 0; i < nr_lmbs; i++) {
> +        sPAPRDRConnector *drc;
> +        sPAPRDRConnectorClass *drck;
> +        uint64_t addr;
> +
> +        if (i < nr_assigned_lmbs) {
> +            addr = (i + nr_rma_lmbs) * lmb_size;
> +        } else {
> +            addr = (i - nr_assigned_lmbs) * lmb_size +
> +                SPAPR_MACHINE(qdev_get_machine())->hotplug_memory_base;
> +        }
> +        drc = spapr_dr_connector_new(qdev_get_machine(),
> +                SPAPR_DR_CONNECTOR_TYPE_LMB, addr/lmb_size);
> +        drck = SPAPR_DR_CONNECTOR_GET_CLASS(drc);
> +
> +        dynamic_memory[0] = cpu_to_be32(addr >> 32);
> +        dynamic_memory[1] = cpu_to_be32(addr & 0xffffffff);
> +        dynamic_memory[2] = cpu_to_be32(drck->get_index(drc));
> +        dynamic_memory[3] = cpu_to_be32(0); /* reserved */
> +        dynamic_memory[4] = cpu_to_be32(get_numa_node(addr));

As noted above your current get_numa_node() implementation is O(N^2)
making this routine O(N^3).

> +        dynamic_memory[5] = (addr < spapr->ram_limit) ?
> +                            cpu_to_be32(LMB_FLAGS_ASSIGNED) :
> +                            cpu_to_be32(0);
> +
> +        memcpy(cur_index, dynamic_memory, sizeof(dynamic_memory));

You could just use uint32_t *dynamic_memory = cur_index at the
beginning of the loop block to avoid this memcpy().

> +        cur_index += DR_LMB_LIST_ENTRY_SIZE;
> +    }
> +    ret = fdt_setprop(fdt, offset, "ibm,dynamic-memory", int_buf, buf_len);
> +    if (ret < 0) {
> +        goto out;
> +    }
> +
> +    /* ibm,associativity-lookup-arrays */
> +    cur_index = int_buf;
> +    int_buf[0] = cpu_to_be32(nb_numa_nodes);
> +    int_buf[1] = cpu_to_be32(4);

Something explaining the significance of this 4 for those of us that
don't have access to PAPR+ would be nice.
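
For readers without the spec: as far as I understand it (this matches how
the Linux pseries NUMA code parses the property, but treat it as an
assumption rather than a quote from PAPR+), the first cell of
ibm,associativity-lookup-arrays is the number of associativity arrays and
the second cell is the number of cells in each array, followed by the
arrays themselves. The 4 is therefore the per-array length. A standalone
sketch with the constant named; ASSOC_ARRAY_CELLS, to_be32() and
build_assoc_lookup() are hypothetical names, not QEMU identifiers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ASSOC_ARRAY_CELLS 4   /* hypothetical name: cells per associativity array */

/* Stand-in for cpu_to_be32(): builds the big-endian byte pattern portably. */
static uint32_t to_be32(uint32_t v)
{
    uint8_t b[4] = {
        (uint8_t)(v >> 24), (uint8_t)(v >> 16), (uint8_t)(v >> 8), (uint8_t)v
    };
    uint32_t out;

    memcpy(&out, b, sizeof(out));
    return out;
}

/*
 * Layout: [nr_arrays][cells_per_array][array 0]...[array nr_arrays - 1].
 * Each array here is {0, 0, 0, node}, as in the quoted patch.
 * Returns the number of cells written.
 */
static size_t build_assoc_lookup(uint32_t *buf, size_t buf_cells,
                                 uint32_t nr_nodes)
{
    size_t needed = 2 + (size_t)nr_nodes * ASSOC_ARRAY_CELLS;
    size_t pos = 0;

    assert(buf != NULL && buf_cells >= needed);
    buf[pos++] = to_be32(nr_nodes);
    buf[pos++] = to_be32(ASSOC_ARRAY_CELLS);
    for (uint32_t node = 0; node < nr_nodes; node++) {
        buf[pos++] = to_be32(0);
        buf[pos++] = to_be32(0);
        buf[pos++] = to_be32(0);
        buf[pos++] = to_be32(node);
    }
    return pos;
}
```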

> +    cur_index += 2;
> +    for (i = 0; i < nb_numa_nodes; i++) {
> +        uint32_t associativity[] = {
> +            cpu_to_be32(0x0),
> +            cpu_to_be32(0x0),
> +            cpu_to_be32(0x0),
> +            cpu_to_be32(i)
> +        };
> +        memcpy(cur_index, associativity, sizeof(associativity));
> +        cur_index += 4;
> +    }
> +    ret = fdt_setprop(fdt, offset, "ibm,associativity-lookup-arrays", int_buf,
> +            (cur_index - int_buf) * sizeof(uint32_t));
> +out:
> +    g_free(int_buf);
> +    return ret;
> +}
> +
> +int spapr_h_cas_compose_response(target_ulong addr, target_ulong size,
> +                                bool cpu_update, bool memory_update)
> +{
> +    void *fdt, *fdt_skel;
> +    sPAPRDeviceTreeUpdateHeader hdr = { .version_id = 1 };
> +
> +    size -= sizeof(hdr);
> +
> +    /* Create skeleton */
> +    fdt_skel = g_malloc0(size);
> +    _FDT((fdt_create(fdt_skel, size)));
> +    _FDT((fdt_begin_node(fdt_skel, "")));
> +    _FDT((fdt_end_node(fdt_skel)));
> +    _FDT((fdt_finish(fdt_skel)));
> +    fdt = g_malloc0(size);
> +    _FDT((fdt_open_into(fdt_skel, fdt, size)));
> +    g_free(fdt_skel);
> +
> +    /* Fixup cpu nodes */
> +    if (cpu_update) {
> +        _FDT((spapr_fixup_cpu_dt(fdt, spapr)));
> +    }
> +
> +    /* Generate memory nodes or ibm,dynamic-reconfiguration-memory node */
> +    if (memory_update) {
> +        _FDT((spapr_populate_drconf_memory(spapr, fdt)));
> +    } else {
> +        _FDT((spapr_populate_memory(spapr, fdt)));
> +    }
> +
> +    /* Pack resulting tree */
> +    _FDT((fdt_pack(fdt)));
> +
> +    if (fdt_totalsize(fdt) + sizeof(hdr) > size) {

You could just pass the correct size (size - sizeof(hdr)) to
fdt_create() and fdt_open_into() and avoid this failure case.

> +        trace_spapr_cas_failed(size);
> +        return -1;
> +    }
> +
> +    cpu_physical_memory_write(addr, &hdr, sizeof(hdr));
> +    cpu_physical_memory_write(addr + sizeof(hdr), fdt, fdt_totalsize(fdt));
> +    trace_spapr_cas_continue(fdt_totalsize(fdt) + sizeof(hdr));
> +    g_free(fdt);
> +
> +    return 0;
> +}
> +
>  static void spapr_finalize_fdt(sPAPREnvironment *spapr,
>                                 hwaddr fdt_addr,
>                                 hwaddr rtas_addr,
> @@ -757,11 +904,12 @@ static void spapr_finalize_fdt(sPAPREnvironment *spapr,
>      /* open out the base tree into a temp buffer for the final tweaks */
>      _FDT((fdt_open_into(spapr->fdt_skel, fdt, FDT_MAX_SIZE)));
>  
> -    ret = spapr_populate_memory(spapr, fdt);
> -    if (ret < 0) {
> -        fprintf(stderr, "couldn't setup memory nodes in fdt\n");
> -        exit(1);
> -    }
> +    /*
> +     * Add memory@0 node to represent RMA. Rest of the memory is either
> +     * represented by memory nodes or ibm,dynamic-reconfiguration-memory
> +     * node later during ibm,client-architecture-support call.
> +     */
> +    spapr_populate_memory_node(fdt, 0, 0, spapr->rma_size);
>  
>      ret = spapr_populate_vdevice(spapr->vio_bus, fdt);
>      if (ret < 0) {
> diff --git a/hw/ppc/spapr_hcall.c b/hw/ppc/spapr_hcall.c
> index 8651447..10f05f4 100644
> --- a/hw/ppc/spapr_hcall.c
> +++ b/hw/ppc/spapr_hcall.c
> @@ -805,6 +805,32 @@ static target_ulong h_set_mode(PowerPCCPU *cpu, sPAPREnvironment *spapr,
>      return ret;
>  }
>  
> +/*
> + * Return the offset to the requested option vector @vector in the
> + * option vector table @table.
> + */
> +static target_ulong cas_get_option_vector(int vector, target_ulong table)
> +{
> +    int i;
> +    char nr_vectors, nr_entries;
> +
> +    if (!table) {
> +        return 0;
> +    }
> +
> +    nr_vectors = (rtas_ld(table, 0) >> 24) + 1;

This is kind of abusing the rtas_ld() function.  It's really only
intended for accessing rtas arguments, not an arbitrary array of u32s.
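
One alternative is a small helper that walks the table a byte at a time.
The sketch below does that over a plain host buffer; in QEMU the bytes
would have to be read from guest memory instead. The layout is assumed
from the quoted patch: the first byte holds the number of vectors minus
one, and each vector is a length byte n followed by n + 1 data bytes.
find_option_vector() is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Returns the offset of the length byte of option vector `vector`
 * (1-based) inside `table`, or 0 if the vector is absent or the table
 * is too short. Offset 0 is never a valid result because byte 0 holds
 * the vector count.
 */
static size_t find_option_vector(const uint8_t *table, size_t len,
                                 unsigned vector)
{
    unsigned nr_vectors;
    size_t off = 1;                     /* skip the vector-count byte */

    assert(table != NULL);
    if (len == 0) {
        return 0;
    }
    nr_vectors = (unsigned)table[0] + 1;
    if (vector == 0 || vector > nr_vectors) {
        return 0;
    }
    for (unsigned i = 1; i < vector; i++) {
        if (off >= len) {
            return 0;
        }
        off += (size_t)table[off] + 2;  /* length byte + (n + 1) data bytes */
    }
    return off < len ? off : 0;
}
```

Using unsigned bytes also avoids the sign problems that a plain char
counter can have for values above 127.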

> +    if (!vector || vector > nr_vectors) {
> +        return 0;
> +    }
> +    table++; /* skip nr option vectors */
> +
> +    for (i = 0; i < vector - 1; i++) {
> +        nr_entries = rtas_ld(table, 0) >> 24;
> +        table += nr_entries + 2;
> +    }
> +    return table;
> +}
> +
>  typedef struct {
>      PowerPCCPU *cpu;
>      uint32_t cpu_version;
> @@ -825,19 +851,22 @@ static void do_set_compat(void *arg)
>      ((cpuver) == CPU_POWERPC_LOGICAL_2_06_PLUS) ? 2061 : \
>      ((cpuver) == CPU_POWERPC_LOGICAL_2_07) ? 2070 : 0)
>  
> +#define OV5_DRCONF_MEMORY 0x20
> +
>  static target_ulong h_client_architecture_support(PowerPCCPU *cpu_,
>                                                    sPAPREnvironment *spapr,
>                                                    target_ulong opcode,
>                                                    target_ulong *args)
>  {
> -    target_ulong list = args[0];
> +    target_ulong list = args[0], ov_table;
>      PowerPCCPUClass *pcc_ = POWERPC_CPU_GET_CLASS(cpu_);
>      CPUState *cs;
> -    bool cpu_match = false;
> +    bool cpu_match = false, cpu_update = true, memory_update = false;
>      unsigned old_cpu_version = cpu_->cpu_version;
>      unsigned compat_lvl = 0, cpu_version = 0;
>      unsigned max_lvl = get_compat_level(cpu_->max_compat);
>      int counter;
> +    char ov5_byte2;
>  
>      /* Parse PVR list */
>      for (counter = 0; counter < 512; ++counter) {
> @@ -887,8 +916,6 @@ static target_ulong h_client_architecture_support(PowerPCCPU *cpu_,
>          }
>      }
>  
> -    /* For the future use: here @list points to the first capability */
> -
>      /* Parsing finished */
>      trace_spapr_cas_pvr(cpu_->cpu_version, cpu_match,
>                          cpu_version, pcc_->pcr_mask);
> @@ -912,14 +939,26 @@ static target_ulong h_client_architecture_support(PowerPCCPU *cpu_,
>      }
>  
>      if (!cpu_version) {
> -        return H_SUCCESS;
> +        cpu_update = false;
>      }
>  
> +    /* For the future use: here @ov_table points to the first option vector */
> +    ov_table = list;
> +
> +    list = cas_get_option_vector(5, ov_table);

What does the literal 5 mean?

>      if (!list) {
>          return H_SUCCESS;
>      }
>  
> -    if (spapr_h_cas_compose_response(args[1], args[2])) {
> +    /* @list now points to OV 5 */
> +    list += 2;
> +    ov5_byte2 = rtas_ld(list, 0) >> 24;
> +    if (ov5_byte2 & OV5_DRCONF_MEMORY) {
> +        memory_update = true;
> +    }
> +
> +    if (spapr_h_cas_compose_response(args[1], args[2], cpu_update,
> +                                    memory_update)) {
>          qemu_system_reset_request();
>      }
>  
> diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
> index 64681c4..10283f9 100644
> --- a/include/hw/ppc/spapr.h
> +++ b/include/hw/ppc/spapr.h
> @@ -485,9 +485,19 @@ struct sPAPRTCETable {
>  /* Support a min of 1TB hotplug memory assuming 256MB per slot */
>  #define SPAPR_MAX_RAM_SLOTS     (1ULL << 12)
>  
> +/*
> + * Number of 32 bit words in each LMB list entry in ibm,dynamic-memory
> + * property under ibm,dynamic-reconfiguration-memory node.
> + */
> +#define DR_LMB_LIST_ENTRY_SIZE 6
> +
> +/* Flag values in Option Vector 5 ibm architecture vector table. */
> +#define LMB_FLAGS_ASSIGNED 0x00000008

Things declared in a public header should have some sort of
namespacing: SPAPR_DR_LMB_LIST_ENTRY_SIZE, for example.

>  void spapr_events_init(sPAPREnvironment *spapr);
>  void spapr_events_fdt_skel(void *fdt, uint32_t epow_irq);
> -int spapr_h_cas_compose_response(target_ulong addr, target_ulong size);
> +int spapr_h_cas_compose_response(target_ulong addr, target_ulong size,
> +                                bool cpu_update, bool memory_update);
>  sPAPRTCETable *spapr_tce_new_table(DeviceState *owner, uint32_t liobn,
>                                     uint64_t bus_offset,
>                                     uint32_t page_shift,

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson


* Re: [Qemu-devel] [RFC PATCH v1 11/13] spapr: Initialize hotplug memory address space
  2015-02-12  5:39     ` Bharata B Rao
@ 2015-02-16  4:56       ` David Gibson
  2015-02-17  4:00         ` Bharata B Rao
  0 siblings, 1 reply; 50+ messages in thread
From: David Gibson @ 2015-02-16  4:56 UTC (permalink / raw)
  To: Bharata B Rao; +Cc: imammedo, agraf, qemu-devel, mdroth


On Thu, Feb 12, 2015 at 11:09:14AM +0530, Bharata B Rao wrote:
> On Thu, Feb 12, 2015 at 04:19:36PM +1100, David Gibson wrote:
> > > diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> > > index 44405b2..9ff08ff 100644
> > > --- a/hw/ppc/spapr.c
> > > +++ b/hw/ppc/spapr.c
> > > @@ -120,6 +120,8 @@ struct sPAPRMachineState {
> > >  
> > >      /*< public >*/
> > >      char *kvm_type;
> > > +    ram_addr_t hotplug_memory_base;
> > > +    MemoryRegion hotplug_memory;
> > 
> > We should really unify sPAPRMachineState with sPAPREnvironment at some
> > point (I realise that doesn't reasonably fit within the scope of this
> > series).
> 
> ok.
> 
> > 
> > >  };
> > >  
> > >  sPAPREnvironment *spapr;
> > > @@ -1403,6 +1405,7 @@ static void ppc_spapr_init(MachineState *machine)
> > >      bool kernel_le = false;
> > >      char *filename;
> > >      int smt = kvmppc_smt_threads();
> > > +    sPAPRMachineState *ms = SPAPR_MACHINE(machine);
> > >  
> > >      msi_supported = true;
> > >  
> > > @@ -1492,6 +1495,29 @@ static void ppc_spapr_init(MachineState *machine)
> > >          memory_region_add_subregion(sysmem, 0, rma_region);
> > >      }
> > >  
> > > +    if (machine->ram_size < machine->maxram_size) {
> > > +        ram_addr_t hotplug_mem_size = machine->maxram_size - machine->ram_size;
> > > +
> > > +        if (machine->ram_slots > SPAPR_MAX_RAM_SLOTS) {
> > > +            error_report("unsupported amount of memory slots: %"PRIu64,
> > > +                         machine->ram_slots);
> > > +            exit(EXIT_FAILURE);
> > > +        }
> > > +
> > > +        ms->hotplug_memory_base = ROUND_UP(machine->ram_size, 1ULL << 30);
> > 
> > Is there a particular significance to the 1GiB alignment?  Is it just
> > a conveniently large alignment, or is that value specified in PAPR
> > somewhere?  Using a named constant would probably help to clarify that.
> 
> I am basing this on x86 memory hotplug, which is where the 1GB
> alignment comes from. It is not specified by PAPR.

Ok, it should definitely be a #define.
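
Something along these lines, as a standalone sketch. HOTPLUG_MEM_ALIGN,
align_up() and hotplug_memory_base() are hypothetical names; QEMU's own
ROUND_UP() macro would take the place of align_up() in the real code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical name: alignment of the hotplug memory base (1GiB, as on x86). */
#define HOTPLUG_MEM_ALIGN (1ULL << 30)

/* Rounds value up to the next multiple of align (align must be a power of 2). */
static uint64_t align_up(uint64_t value, uint64_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (value + align - 1) & ~(align - 1);
}

/* The hotplug region starts at the first aligned address after boot RAM. */
static uint64_t hotplug_memory_base(uint64_t ram_size)
{
    return align_up(ram_size, HOTPLUG_MEM_ALIGN);
}
```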

> > > +        if ((ms->hotplug_memory_base + hotplug_mem_size) < hotplug_mem_size) {
> > > +            error_report("unsupported amount of maximum memory: " RAM_ADDR_FMT,
> > > +                         machine->maxram_size);
> > > +            exit(EXIT_FAILURE);
> > > +        }
> > > +
> > > +        memory_region_init(&ms->hotplug_memory, OBJECT(ms),
> > > +                           "hotplug-memory", hotplug_mem_size);
> > > +        memory_region_add_subregion(sysmem, ms->hotplug_memory_base,
> > > +                                    &ms->hotplug_memory);
> > > +    }
> > > +
> > >      filename = qemu_find_file(QEMU_FILE_TYPE_BIOS, "spapr-rtas.bin");
> > >      spapr->rtas_size = get_image_size(filename);
> > >      spapr->rtas_blob = g_malloc(spapr->rtas_size);
> > > diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
> > > index ae8b4e1..64681c4 100644
> > > --- a/include/hw/ppc/spapr.h
> > > +++ b/include/hw/ppc/spapr.h
> > > @@ -482,6 +482,9 @@ struct sPAPRTCETable {
> > >  #define TIMEBASE_FREQ           512000000ULL
> > >  #define SPAPR_MIN_MEMORY_BLOCK_SIZE (1 << 28) /* 256MB */
> > >  
> > > +/* Support a min of 1TB hotplug memory assuming 256MB per slot */
> > > +#define SPAPR_MAX_RAM_SLOTS     (1ULL << 12)
> > 
> > Is this constraint arbitrary, or does it come from something in PAPR+?
> 
> Arbitrary max, not defined by PAPR.

Ok.  Why do we need a fixed maximum value?  I don't see this used to
size any arrays.

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson


* Re: [Qemu-devel] [RFC PATCH v1 11/13] spapr: Initialize hotplug memory address space
  2015-02-16  4:56       ` David Gibson
@ 2015-02-17  4:00         ` Bharata B Rao
  0 siblings, 0 replies; 50+ messages in thread
From: Bharata B Rao @ 2015-02-17  4:00 UTC (permalink / raw)
  To: David Gibson; +Cc: imammedo, agraf, qemu-devel, mdroth

On Mon, Feb 16, 2015 at 03:56:24PM +1100, David Gibson wrote:
> > > > diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
> > > > index ae8b4e1..64681c4 100644
> > > > --- a/include/hw/ppc/spapr.h
> > > > +++ b/include/hw/ppc/spapr.h
> > > > @@ -482,6 +482,9 @@ struct sPAPRTCETable {
> > > >  #define TIMEBASE_FREQ           512000000ULL
> > > >  #define SPAPR_MIN_MEMORY_BLOCK_SIZE (1 << 28) /* 256MB */
> > > >  
> > > > +/* Support a min of 1TB hotplug memory assuming 256MB per slot */
> > > > +#define SPAPR_MAX_RAM_SLOTS     (1ULL << 12)
> > > 
> > > Is this constraint arbitrary, or does it come from something in PAPR+?
> > 
> > Arbitrary max, not defined by PAPR.
> 
> > Ok.  Why do we need a fixed maximum value?  I don't see this used to
> > size any arrays.

The max slots limit comes from x86, where ACPI limits the number of DIMM
devices to 256. Since the pc-dimm implementation requires such a maximum,
I came up with the above value for ppc.
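
For reference, the arithmetic behind the "min of 1TB" comment in the
quoted header, as a standalone check. The two macro names below are
shortened, hypothetical stand-ins for SPAPR_MIN_MEMORY_BLOCK_SIZE and
SPAPR_MAX_RAM_SLOTS.

```c
#include <assert.h>
#include <stdint.h>

#define MIN_MEMORY_BLOCK_SIZE (1ULL << 28)   /* 256MB per slot */
#define MAX_RAM_SLOTS         (1ULL << 12)   /* 4096 slots */

/* 4096 slots * 256MB = 1TB; checked at compile time (C11). */
static_assert(MAX_RAM_SLOTS * MIN_MEMORY_BLOCK_SIZE == (1ULL << 40),
              "slot count times minimum block size should equal 1TB");

/* Smallest total capacity reachable when every slot holds a minimum-size DIMM. */
static uint64_t min_hotplug_capacity(void)
{
    return MAX_RAM_SLOTS * MIN_MEMORY_BLOCK_SIZE;
}
```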

Regards,
Bharata.


* Re: [Qemu-devel] [RFC PATCH v1 04/13] spapr: Factor out CPU initialization code into realizefn
  2015-01-30  7:49     ` Bharata B Rao
@ 2015-02-23  7:36       ` Bharata B Rao
  2015-02-23 15:19         ` Alexander Graf
  0 siblings, 1 reply; 50+ messages in thread
From: Bharata B Rao @ 2015-02-23  7:36 UTC (permalink / raw)
  To: David Gibson; +Cc: imammedo, agraf, qemu-devel, mdroth

On Fri, Jan 30, 2015 at 01:19:39PM +0530, Bharata B Rao wrote:
> On Thu, Jan 29, 2015 at 12:07:42PM +1100, David Gibson wrote:
> > On Thu, Jan 08, 2015 at 11:40:11AM +0530, Bharata B Rao wrote:
> > > Move some CPU initialization code from machine init function to
> > > CPU realizefn so that it can be used from CPU hotplug path too.
> > > 
> > > With the inclusion of ppc.h in translate_init.c, explicit *irq_init()
> > > function definitions aren't required, remove them.
> > > 
> > > Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
> > > ---
> > >  hw/ppc/spapr.c              | 29 +----------------------------
> > >  include/hw/ppc/spapr.h      |  3 +++
> > >  target-ppc/translate_init.c | 43 ++++++++++++++++++++++++++-----------------
> > >  3 files changed, 30 insertions(+), 45 deletions(-)
> > > 
> > > diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> > > index 779d364..f49b0fa 100644
> > > --- a/hw/ppc/spapr.c
> > > +++ b/hw/ppc/spapr.c
> > > @@ -81,8 +81,6 @@
> > >  
> > >  #define MIN_RMA_SLOF            128UL
> > >  
> > > -#define TIMEBASE_FREQ           512000000ULL
> > > -
> > >  #define MAX_CPUS                255
> > >  
> > >  #define PHANDLE_XICP            0x00001111
> > > @@ -971,7 +969,7 @@ static void ppc_spapr_reset(void)
> > >  
> > >  }
> > >  
> > > -static void spapr_cpu_reset(void *opaque)
> > > +void spapr_cpu_reset(void *opaque)
> > >  {
> > >      PowerPCCPU *cpu = opaque;
> > >      CPUState *cs = CPU(cpu);
> > > @@ -1387,7 +1385,6 @@ static void ppc_spapr_init(MachineState *machine)
> > >      const char *initrd_filename = machine->initrd_filename;
> > >      const char *boot_device = machine->boot_order;
> > >      PowerPCCPU *cpu;
> > > -    CPUPPCState *env;
> > >      PCIHostState *phb;
> > >      int i;
> > >      MemoryRegion *sysmem = get_system_memory();
> > > @@ -1472,30 +1469,6 @@ static void ppc_spapr_init(MachineState *machine)
> > >              fprintf(stderr, "Unable to find PowerPC CPU definition\n");
> > >              exit(1);
> > >          }
> > > -        env = &cpu->env;
> > > -
> > > -        /* Set time-base frequency to 512 MHz */
> > > -        cpu_ppc_tb_init(env, TIMEBASE_FREQ);
> > > -
> > > -        /* PAPR always has exception vectors in RAM not ROM. To ensure this,
> > > -         * MSR[IP] should never be set.
> > > -         */
> > > -        env->msr_mask &= ~(1 << 6);
> > > -
> > > -        /* Tell KVM that we're in PAPR mode */
> > > -        if (kvm_enabled()) {
> > > -            kvmppc_set_papr(cpu);
> > > -        }
> > > -
> > > -        if (cpu->max_compat) {
> > > -            if (ppc_set_compat(cpu, cpu->max_compat) < 0) {
> > > -                exit(1);
> > > -            }
> > > -        }
> > > -
> > > -        xics_cpu_setup(spapr->icp, cpu);
> > > -
> > > -        qemu_register_reset(spapr_cpu_reset, cpu);
> > >      }
> > >  
> > >      /* allocate RAM */
> > > diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
> > > index b1a0838..831db6b 100644
> > > --- a/include/hw/ppc/spapr.h
> > > +++ b/include/hw/ppc/spapr.h
> > > @@ -478,6 +478,8 @@ struct sPAPRTCETable {
> > >      QLIST_ENTRY(sPAPRTCETable) list;
> > >  };
> > >  
> > > +#define TIMEBASE_FREQ           512000000ULL
> > > +
> > >  void spapr_events_init(sPAPREnvironment *spapr);
> > >  void spapr_events_fdt_skel(void *fdt, uint32_t epow_irq);
> > >  int spapr_h_cas_compose_response(target_ulong addr, target_ulong size);
> > > @@ -494,5 +496,6 @@ int spapr_tcet_dma_dt(void *fdt, int node_off, const char *propname,
> > >                        sPAPRTCETable *tcet);
> > >  void spapr_hotplug_req_add_event(sPAPRDRConnector *drc);
> > >  void spapr_hotplug_req_remove_event(sPAPRDRConnector *drc);
> > > +void spapr_cpu_reset(void *opaque);
> > >  
> > >  #endif /* !defined (__HW_SPAPR_H__) */
> > > diff --git a/target-ppc/translate_init.c b/target-ppc/translate_init.c
> > > index 72cc9d0..9c642a5 100644
> > > --- a/target-ppc/translate_init.c
> > > +++ b/target-ppc/translate_init.c
> > > @@ -30,29 +30,14 @@
> > >  #include "qemu/error-report.h"
> > >  #include "qapi/visitor.h"
> > >  #include "hw/qdev-properties.h"
> > > +#include "hw/ppc/spapr.h"
> > > +#include "hw/ppc/ppc.h"
> > >  
> > >  //#define PPC_DUMP_CPU
> > >  //#define PPC_DEBUG_SPR
> > >  //#define PPC_DUMP_SPR_ACCESSES
> > >  /* #define USE_APPLE_GDB */
> > >  
> > > -/* For user-mode emulation, we don't emulate any IRQ controller */
> > > -#if defined(CONFIG_USER_ONLY)
> > > -#define PPC_IRQ_INIT_FN(name)                                                 \
> > > -static inline void glue(glue(ppc, name),_irq_init) (CPUPPCState *env)         \
> > > -{                                                                             \
> > > -}
> > > -#else
> > > -#define PPC_IRQ_INIT_FN(name)                                                 \
> > > -void glue(glue(ppc, name),_irq_init) (CPUPPCState *env);
> > > -#endif
> > > -
> > > -PPC_IRQ_INIT_FN(40x);
> > > -PPC_IRQ_INIT_FN(6xx);
> > > -PPC_IRQ_INIT_FN(970);
> > > -PPC_IRQ_INIT_FN(POWER7);
> > > -PPC_IRQ_INIT_FN(e500);
> > > -
> > >  /* Generic callbacks:
> > >   * do nothing but store/retrieve spr value
> > >   */
> > > @@ -8905,6 +8890,7 @@ static void ppc_cpu_realizefn(DeviceState *dev, Error **errp)
> > >      CPUState *cs = CPU(dev);
> > >      PowerPCCPU *cpu = POWERPC_CPU(dev);
> > >      PowerPCCPUClass *pcc = POWERPC_CPU_GET_CLASS(cpu);
> > > +    CPUPPCState *env = &cpu->env;
> > >      Error *local_err = NULL;
> > >  #if !defined(CONFIG_USER_ONLY)
> > >      int max_smt = kvm_enabled() ? kvmppc_smt_threads() : 1;
> > > @@ -8965,6 +8951,29 @@ static void ppc_cpu_realizefn(DeviceState *dev, Error **errp)
> > >  
> > >      qemu_init_vcpu(cs);
> > >  
> > > +    /* Set time-base frequency to 512 MHz */
> > > +    cpu_ppc_tb_init(env, TIMEBASE_FREQ);
> > > +
> > > +    /* PAPR always has exception vectors in RAM not ROM. To ensure this,
> > > +     * MSR[IP] should never be set.
> > > +     */
> > > +    env->msr_mask &= ~(1 << 6);
> > > +
> > > +    /* Tell KVM that we're in PAPR mode */
> > > +    if (kvm_enabled()) {
> > > +        kvmppc_set_papr(cpu);
> > > +    }
> > > +
> > > +    if (cpu->max_compat) {
> > > +        if (ppc_set_compat(cpu, cpu->max_compat) < 0) {
> > > +            exit(1);
> > > +        }
> > > +    }
> > > +
> > > +    xics_cpu_setup(spapr->icp, cpu);
> > > +
> > > +    qemu_register_reset(spapr_cpu_reset, cpu);
> > > +
> > >      pcc->parent_realize(dev, errp);
> > 
> > This doesn't look right.  Several of these are clearly PAPR specific
> > operations, but you're now doing them from code that isn't PAPR specific.
> 
> > Ok, will rework this patch.

There is only PowerPCCPU and no sPAPR-specific CPU class. So where should
operations like the above, which ideally belong in the realizefn path but
are sPAPR-specific, live? Under TARGET_PPC64?

Regards,
Bharata.


* Re: [Qemu-devel] [RFC PATCH v1 04/13] spapr: Factor out CPU initialization code into realizefn
  2015-02-23  7:36       ` Bharata B Rao
@ 2015-02-23 15:19         ` Alexander Graf
  0 siblings, 0 replies; 50+ messages in thread
From: Alexander Graf @ 2015-02-23 15:19 UTC (permalink / raw)
  To: bharata, David Gibson; +Cc: imammedo, qemu-devel, mdroth



On 23.02.15 08:36, Bharata B Rao wrote:
> On Fri, Jan 30, 2015 at 01:19:39PM +0530, Bharata B Rao wrote:
>> On Thu, Jan 29, 2015 at 12:07:42PM +1100, David Gibson wrote:
>>> On Thu, Jan 08, 2015 at 11:40:11AM +0530, Bharata B Rao wrote:
>>>> Move some CPU initialization code from machine init function to
>>>> CPU realizefn so that it can be used from CPU hotplug path too.
>>>>
>>>> With the inclusion of ppc.h in translate_init.c, explicit *irq_init()
>>>> function definitions aren't required, remove them.
>>>>
>>>> Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
>>>> ---
>>>>  hw/ppc/spapr.c              | 29 +----------------------------
>>>>  include/hw/ppc/spapr.h      |  3 +++
>>>>  target-ppc/translate_init.c | 43 ++++++++++++++++++++++++++-----------------
>>>>  3 files changed, 30 insertions(+), 45 deletions(-)
>>>>
>>>> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
>>>> index 779d364..f49b0fa 100644
>>>> --- a/hw/ppc/spapr.c
>>>> +++ b/hw/ppc/spapr.c
>>>> @@ -81,8 +81,6 @@
>>>>  
>>>>  #define MIN_RMA_SLOF            128UL
>>>>  
>>>> -#define TIMEBASE_FREQ           512000000ULL
>>>> -
>>>>  #define MAX_CPUS                255
>>>>  
>>>>  #define PHANDLE_XICP            0x00001111
>>>> @@ -971,7 +969,7 @@ static void ppc_spapr_reset(void)
>>>>  
>>>>  }
>>>>  
>>>> -static void spapr_cpu_reset(void *opaque)
>>>> +void spapr_cpu_reset(void *opaque)
>>>>  {
>>>>      PowerPCCPU *cpu = opaque;
>>>>      CPUState *cs = CPU(cpu);
>>>> @@ -1387,7 +1385,6 @@ static void ppc_spapr_init(MachineState *machine)
>>>>      const char *initrd_filename = machine->initrd_filename;
>>>>      const char *boot_device = machine->boot_order;
>>>>      PowerPCCPU *cpu;
>>>> -    CPUPPCState *env;
>>>>      PCIHostState *phb;
>>>>      int i;
>>>>      MemoryRegion *sysmem = get_system_memory();
>>>> @@ -1472,30 +1469,6 @@ static void ppc_spapr_init(MachineState *machine)
>>>>              fprintf(stderr, "Unable to find PowerPC CPU definition\n");
>>>>              exit(1);
>>>>          }
>>>> -        env = &cpu->env;
>>>> -
>>>> -        /* Set time-base frequency to 512 MHz */
>>>> -        cpu_ppc_tb_init(env, TIMEBASE_FREQ);
>>>> -
>>>> -        /* PAPR always has exception vectors in RAM not ROM. To ensure this,
>>>> -         * MSR[IP] should never be set.
>>>> -         */
>>>> -        env->msr_mask &= ~(1 << 6);
>>>> -
>>>> -        /* Tell KVM that we're in PAPR mode */
>>>> -        if (kvm_enabled()) {
>>>> -            kvmppc_set_papr(cpu);
>>>> -        }
>>>> -
>>>> -        if (cpu->max_compat) {
>>>> -            if (ppc_set_compat(cpu, cpu->max_compat) < 0) {
>>>> -                exit(1);
>>>> -            }
>>>> -        }
>>>> -
>>>> -        xics_cpu_setup(spapr->icp, cpu);
>>>> -
>>>> -        qemu_register_reset(spapr_cpu_reset, cpu);
>>>>      }
>>>>  
>>>>      /* allocate RAM */
>>>> diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
>>>> index b1a0838..831db6b 100644
>>>> --- a/include/hw/ppc/spapr.h
>>>> +++ b/include/hw/ppc/spapr.h
>>>> @@ -478,6 +478,8 @@ struct sPAPRTCETable {
>>>>      QLIST_ENTRY(sPAPRTCETable) list;
>>>>  };
>>>>  
>>>> +#define TIMEBASE_FREQ           512000000ULL
>>>> +
>>>>  void spapr_events_init(sPAPREnvironment *spapr);
>>>>  void spapr_events_fdt_skel(void *fdt, uint32_t epow_irq);
>>>>  int spapr_h_cas_compose_response(target_ulong addr, target_ulong size);
>>>> @@ -494,5 +496,6 @@ int spapr_tcet_dma_dt(void *fdt, int node_off, const char *propname,
>>>>                        sPAPRTCETable *tcet);
>>>>  void spapr_hotplug_req_add_event(sPAPRDRConnector *drc);
>>>>  void spapr_hotplug_req_remove_event(sPAPRDRConnector *drc);
>>>> +void spapr_cpu_reset(void *opaque);
>>>>  
>>>>  #endif /* !defined (__HW_SPAPR_H__) */
>>>> diff --git a/target-ppc/translate_init.c b/target-ppc/translate_init.c
>>>> index 72cc9d0..9c642a5 100644
>>>> --- a/target-ppc/translate_init.c
>>>> +++ b/target-ppc/translate_init.c
>>>> @@ -30,29 +30,14 @@
>>>>  #include "qemu/error-report.h"
>>>>  #include "qapi/visitor.h"
>>>>  #include "hw/qdev-properties.h"
>>>> +#include "hw/ppc/spapr.h"
>>>> +#include "hw/ppc/ppc.h"
>>>>  
>>>>  //#define PPC_DUMP_CPU
>>>>  //#define PPC_DEBUG_SPR
>>>>  //#define PPC_DUMP_SPR_ACCESSES
>>>>  /* #define USE_APPLE_GDB */
>>>>  
>>>> -/* For user-mode emulation, we don't emulate any IRQ controller */
>>>> -#if defined(CONFIG_USER_ONLY)
>>>> -#define PPC_IRQ_INIT_FN(name)                                                 \
>>>> -static inline void glue(glue(ppc, name),_irq_init) (CPUPPCState *env)         \
>>>> -{                                                                             \
>>>> -}
>>>> -#else
>>>> -#define PPC_IRQ_INIT_FN(name)                                                 \
>>>> -void glue(glue(ppc, name),_irq_init) (CPUPPCState *env);
>>>> -#endif
>>>> -
>>>> -PPC_IRQ_INIT_FN(40x);
>>>> -PPC_IRQ_INIT_FN(6xx);
>>>> -PPC_IRQ_INIT_FN(970);
>>>> -PPC_IRQ_INIT_FN(POWER7);
>>>> -PPC_IRQ_INIT_FN(e500);
>>>> -
>>>>  /* Generic callbacks:
>>>>   * do nothing but store/retrieve spr value
>>>>   */
>>>> @@ -8905,6 +8890,7 @@ static void ppc_cpu_realizefn(DeviceState *dev, Error **errp)
>>>>      CPUState *cs = CPU(dev);
>>>>      PowerPCCPU *cpu = POWERPC_CPU(dev);
>>>>      PowerPCCPUClass *pcc = POWERPC_CPU_GET_CLASS(cpu);
>>>> +    CPUPPCState *env = &cpu->env;
>>>>      Error *local_err = NULL;
>>>>  #if !defined(CONFIG_USER_ONLY)
>>>>      int max_smt = kvm_enabled() ? kvmppc_smt_threads() : 1;
>>>> @@ -8965,6 +8951,29 @@ static void ppc_cpu_realizefn(DeviceState *dev, Error **errp)
>>>>  
>>>>      qemu_init_vcpu(cs);
>>>>  
>>>> +    /* Set time-base frequency to 512 MHz */
>>>> +    cpu_ppc_tb_init(env, TIMEBASE_FREQ);
>>>> +
>>>> +    /* PAPR always has exception vectors in RAM not ROM. To ensure this,
>>>> +     * MSR[IP] should never be set.
>>>> +     */
>>>> +    env->msr_mask &= ~(1 << 6);
>>>> +
>>>> +    /* Tell KVM that we're in PAPR mode */
>>>> +    if (kvm_enabled()) {
>>>> +        kvmppc_set_papr(cpu);
>>>> +    }
>>>> +
>>>> +    if (cpu->max_compat) {
>>>> +        if (ppc_set_compat(cpu, cpu->max_compat) < 0) {
>>>> +            exit(1);
>>>> +        }
>>>> +    }
>>>> +
>>>> +    xics_cpu_setup(spapr->icp, cpu);
>>>> +
>>>> +    qemu_register_reset(spapr_cpu_reset, cpu);
>>>> +
>>>>      pcc->parent_realize(dev, errp);
>>>
>>> This doesn't look right.  Several of these are clearly PAPR specific
>>> operations, but you're now doing them from code that isn't PAPR specific.
>>
>> Ok, will rework this patch.
> 
> There is only PowerPCCPU and no sPAPR-specific CPU class. So where should
> operations like the above, which ideally belong in the realizefn path but
> are sPAPR-specific, live? Under TARGET_PPC64?

They should be in the board init function and in the board's cpu hotplug
functions.
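
Roughly this shape, as a standalone sketch with made-up types (none of
the Demo* or demo_* names exist in QEMU): the generic realize step knows
nothing about the board, and one board-level helper is called both from
the boot-time init loop and from the hotplug handler.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a CPU object. */
typedef struct DemoCPU {
    bool papr_mode;
    bool reset_registered;
} DemoCPU;

/* Generic realize: no board-specific knowledge in here. */
static void demo_cpu_realize(DemoCPU *cpu)
{
    assert(cpu != NULL);
    cpu->papr_mode = false;
    cpu->reset_registered = false;
}

/* Board-specific setup, shared by the boot and hotplug paths. */
static void demo_board_cpu_setup(DemoCPU *cpu)
{
    assert(cpu != NULL);
    cpu->papr_mode = true;          /* e.g. tell the hypervisor about PAPR mode */
    cpu->reset_registered = true;   /* e.g. register the board's reset hook */
}

/* Boot path: the board init function realizes and sets up each CPU. */
static void demo_board_init(DemoCPU *cpus, size_t nr)
{
    for (size_t i = 0; i < nr; i++) {
        demo_cpu_realize(&cpus[i]);
        demo_board_cpu_setup(&cpus[i]);
    }
}

/* Hotplug path: called for a CPU that has already been realized. */
static void demo_board_cpu_plug(DemoCPU *cpu)
{
    demo_board_cpu_setup(cpu);
}
```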


Alex


* Re: [Qemu-devel] [RFC PATCH v1 13/13] spapr: Memory hotplug support
  2015-01-08  6:10 ` [Qemu-devel] [RFC PATCH v1 13/13] spapr: Memory hotplug support Bharata B Rao
@ 2015-02-24  6:26   ` David Gibson
  2015-02-24  8:12     ` Bharata B Rao
  0 siblings, 1 reply; 50+ messages in thread
From: David Gibson @ 2015-02-24  6:26 UTC (permalink / raw)
  To: Bharata B Rao; +Cc: imammedo, agraf, qemu-devel, mdroth


On Thu, Jan 08, 2015 at 11:40:20AM +0530, Bharata B Rao wrote:
> Make use of pc-dimm infrastructure to support memory hotplug
> for PowerPC.
> 
> Modelled on i386 memory hotplug.
> 
> Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
> ---
>  hw/ppc/spapr.c        | 107 +++++++++++++++++++++++++++++++++++++++++++++++++-
>  hw/ppc/spapr_events.c |   3 ++
>  2 files changed, 108 insertions(+), 2 deletions(-)
> 
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index 6964b06..1ffff39 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -58,7 +58,8 @@
>  #include "hw/nmi.h"
>  
>  #include "hw/compat.h"
> -
> +#include "hw/mem/pc-dimm.h"
> +#include "qapi/qmp/qerror.h"
>  #include <libfdt.h>
>  
>  /* SLOF memory layout:
> @@ -2165,6 +2166,103 @@ static void spapr_cpu_unplug(HotplugHandler *hotplug_dev, DeviceState *dev,
>      return;
>  }
>  
> +static int spapr_dimms_capacity(Object *obj, void *opaque)
> +{
> +    Error *local_err = NULL;
> +    uint64_t *size = opaque;
> +
> +    if (object_dynamic_cast(obj, TYPE_PC_DIMM)) {
> +        (*size) += object_property_get_int(obj, PC_DIMM_SIZE_PROP, &local_err);
> +
> +        if (local_err) {
> +            qerror_report_err(local_err);
> +            error_free(local_err);
> +            return 1;
> +        }
> +    }
> +
> +    object_child_foreach(obj, spapr_dimms_capacity, opaque);
> +    return 0;
> +}

I don't see any reason you can't use pc_existing_dimms_capacity()
rather than duplicating it.

> +static void spapr_memory_plug(HotplugHandler *hotplug_dev,
> +                         DeviceState *dev, Error **errp)
> +{
> +    int slot;
> +    Error *local_err = NULL;
> +    sPAPRMachineState *ms = SPAPR_MACHINE(hotplug_dev);
> +    MachineState *machine = MACHINE(hotplug_dev);
> +    PCDIMMDevice *dimm = PC_DIMM(dev);
> +    PCDIMMDeviceClass *ddc = PC_DIMM_GET_CLASS(dimm);
> +    MemoryRegion *mr = ddc->get_memory_region(dimm);
> +    uint64_t dimms_capacity = 0;
> +    uint64_t align = TARGET_PAGE_SIZE; /* TODO: enforce alignment */
> +    uint64_t addr;
> +    sPAPRDRConnector *drc;
> +
> +    addr = object_property_get_int(OBJECT(dimm), PC_DIMM_ADDR_PROP, &local_err);
> +    if (local_err) {
> +        goto out;
> +    }
> +
> +    addr = pc_dimm_get_free_addr(ms->hotplug_memory_base,
> +                                 memory_region_size(&ms->hotplug_memory),
> +                                 !addr ? NULL : &addr, align,
> +                                 memory_region_size(mr), &local_err);
> +    if (local_err) {
> +        goto out;
> +    }
> +
> +    if (spapr_dimms_capacity(OBJECT(machine), &dimms_capacity)) {
> +        error_setg(&local_err, "failed to get total size of existing DIMMs");
> +        goto out;
> +    }
> +
> +    if (dimms_capacity > machine->maxram_size - machine->ram_size) {
> +        error_setg(&local_err, "not enough space, proposed use of 0x%" PRIx64
> +                   " from total of 0x" RAM_ADDR_FMT,
> +                   dimms_capacity, machine->maxram_size);
> +        goto out;
> +    }
> +
> +    object_property_set_int(OBJECT(dev), addr, PC_DIMM_ADDR_PROP, &local_err);
> +    if (local_err) {
> +        goto out;
> +    }
> +
> +    slot = object_property_get_int(OBJECT(dev), PC_DIMM_SLOT_PROP, &local_err);
> +    if (local_err) {
> +        goto out;
> +    }
> +
> +    slot = pc_dimm_get_free_slot(slot == PC_DIMM_UNASSIGNED_SLOT ? NULL : &slot,
> +                                 machine->ram_slots, &local_err);
> +    if (local_err) {
> +        goto out;
> +    }
> +    object_property_set_int(OBJECT(dev), slot, PC_DIMM_SLOT_PROP, &local_err);
> +    if (local_err) {
> +        goto out;
> +    }
> +
> +    if (kvm_enabled() && !kvm_has_free_slot(machine)) {
> +        error_setg(&local_err, "hypervisor has no free memory slots left");
> +        goto out;
> +    }
> +
> +    memory_region_add_subregion(&ms->hotplug_memory,
> +                                addr - ms->hotplug_memory_base, mr);
> +    vmstate_register_ram(mr, dev);
> +
> +    drc = spapr_dr_connector_by_id(SPAPR_DR_CONNECTOR_TYPE_LMB,
> +            addr/SPAPR_MIN_MEMORY_BLOCK_SIZE);
> +    g_assert(drc);
> +    spapr_hotplug_req_add_event(drc);
> +
> +out:
> +    error_propagate(errp, local_err);
> +}

It looks like this is basically the same as pc_dimm_plug() except for
a couple of checks and the last section which actually notifies the
guest.  Could this be made into common code in pc-dimm.c with hooks
for the platform specific notification?

Maybe PC and sPAPR subclasses of a common dimm object?

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson


* Re: [Qemu-devel] [RFC PATCH v1 13/13] spapr: Memory hotplug support
  2015-02-24  6:26   ` David Gibson
@ 2015-02-24  8:12     ` Bharata B Rao
  0 siblings, 0 replies; 50+ messages in thread
From: Bharata B Rao @ 2015-02-24  8:12 UTC (permalink / raw)
  To: David Gibson; +Cc: imammedo, agraf, qemu-devel, mdroth

On Tue, Feb 24, 2015 at 05:26:12PM +1100, David Gibson wrote:
> On Thu, Jan 08, 2015 at 11:40:20AM +0530, Bharata B Rao wrote:
> > Make use of pc-dimm infrastructure to support memory hotplug
> > for PowerPC.
> > 
> > Modelled on i386 memory hotplug.
> > 
> > Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
> > ---
> >  hw/ppc/spapr.c        | 107 +++++++++++++++++++++++++++++++++++++++++++++++++-
> >  hw/ppc/spapr_events.c |   3 ++
> >  2 files changed, 108 insertions(+), 2 deletions(-)
> > 
> > diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> > index 6964b06..1ffff39 100644
> > --- a/hw/ppc/spapr.c
> > +++ b/hw/ppc/spapr.c
> > @@ -58,7 +58,8 @@
> >  #include "hw/nmi.h"
> >  
> >  #include "hw/compat.h"
> > -
> > +#include "hw/mem/pc-dimm.h"
> > +#include "qapi/qmp/qerror.h"
> >  #include <libfdt.h>
> >  
> >  /* SLOF memory layout:
> > @@ -2165,6 +2166,103 @@ static void spapr_cpu_unplug(HotplugHandler *hotplug_dev, DeviceState *dev,
> >      return;
> >  }
> >  
> > +static int spapr_dimms_capacity(Object *obj, void *opaque)
> > +{
> > +    Error *local_err = NULL;
> > +    uint64_t *size = opaque;
> > +
> > +    if (object_dynamic_cast(obj, TYPE_PC_DIMM)) {
> > +        (*size) += object_property_get_int(obj, PC_DIMM_SIZE_PROP, &local_err);
> > +
> > +        if (local_err) {
> > +            qerror_report_err(local_err);
> > +            error_free(local_err);
> > +            return 1;
> > +        }
> > +    }
> > +
> > +    object_child_foreach(obj, spapr_dimms_capacity, opaque);
> > +    return 0;
> > +}
> 
> I don't see any reason you can't use pc_existing_dimms_capacity()
> rather than duplicating it.

I have already done that, and the required changes have been pushed upstream:
http://lists.gnu.org/archive/html/qemu-devel/2015-01/msg03589.html
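
For reference, here is a minimal sketch of the kind of recursive capacity
walk being discussed. It is not the QEMU helper: NodeSketch and
dimms_capacity_sketch() are hypothetical stand-ins for the QOM object tree
and the capacity callback. The one design point it illustrates is that a
failure in a child is passed back to the caller, whereas the quoted
spapr_dimms_capacity() discards the return value of object_child_foreach().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an object in the device tree; not a QEMU type. */
typedef struct NodeSketch {
    int is_dimm;                   /* non-zero if this node models a DIMM */
    int size_ok;                   /* zero simulates a failed size lookup */
    uint64_t size;                 /* DIMM size in bytes */
    struct NodeSketch **children;  /* child nodes, may be NULL */
    size_t nchildren;
} NodeSketch;

/*
 * Add the size of every DIMM at or below 'node' to '*total'.
 * Returns 0 on success and 1 as soon as any lookup fails, so an error
 * deep in the tree reaches the top-level caller.
 */
static int dimms_capacity_sketch(const NodeSketch *node, uint64_t *total)
{
    size_t i;

    assert(node != NULL && total != NULL);

    if (node->is_dimm) {
        if (!node->size_ok) {
            return 1;
        }
        *total += node->size;
    }

    for (i = 0; i < node->nchildren; i++) {
        if (dimms_capacity_sketch(node->children[i], total)) {
            return 1;   /* propagate the child's failure */
        }
    }
    return 0;
}
```

A caller would initialise the total to 0, run the walk from the machine
object, and treat a non-zero return as "capacity unknown" rather than
comparing a partial sum against the hotplug limit.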

> 
> > +static void spapr_memory_plug(HotplugHandler *hotplug_dev,
> > +                         DeviceState *dev, Error **errp)
> > +{
> > +    int slot;
> > +    Error *local_err = NULL;
> > +    sPAPRMachineState *ms = SPAPR_MACHINE(hotplug_dev);
> > +    MachineState *machine = MACHINE(hotplug_dev);
> > +    PCDIMMDevice *dimm = PC_DIMM(dev);
> > +    PCDIMMDeviceClass *ddc = PC_DIMM_GET_CLASS(dimm);
> > +    MemoryRegion *mr = ddc->get_memory_region(dimm);
> > +    uint64_t dimms_capacity = 0;
> > +    uint64_t align = TARGET_PAGE_SIZE; /* TODO: enforce alignment */
> > +    uint64_t addr;
> > +    sPAPRDRConnector *drc;
> > +
> > +    addr = object_property_get_int(OBJECT(dimm), PC_DIMM_ADDR_PROP, &local_err);
> > +    if (local_err) {
> > +        goto out;
> > +    }
> > +
> > +    addr = pc_dimm_get_free_addr(ms->hotplug_memory_base,
> > +                                 memory_region_size(&ms->hotplug_memory),
> > +                                 !addr ? NULL : &addr, align,
> > +                                 memory_region_size(mr), &local_err);
> > +    if (local_err) {
> > +        goto out;
> > +    }
> > +
> > +    if (spapr_dimms_capacity(OBJECT(machine), &dimms_capacity)) {
> > +        error_setg(&local_err, "failed to get total size of existing DIMMs");
> > +        goto out;
> > +    }
> > +
> > +    if (dimms_capacity > machine->maxram_size - machine->ram_size) {
> > +        error_setg(&local_err, "not enough space, proposed use of 0x%" PRIx64
> > +                   " from total of 0x" RAM_ADDR_FMT,
> > +                   dimms_capacity, machine->maxram_size);
> > +        goto out;
> > +    }
> > +
> > +    object_property_set_int(OBJECT(dev), addr, PC_DIMM_ADDR_PROP, &local_err);
> > +    if (local_err) {
> > +        goto out;
> > +    }
> > +
> > +    slot = object_property_get_int(OBJECT(dev), PC_DIMM_SLOT_PROP, &local_err);
> > +    if (local_err) {
> > +        goto out;
> > +    }
> > +
> > +    slot = pc_dimm_get_free_slot(slot == PC_DIMM_UNASSIGNED_SLOT ? NULL : &slot,
> > +                                 machine->ram_slots, &local_err);
> > +    if (local_err) {
> > +        goto out;
> > +    }
> > +    object_property_set_int(OBJECT(dev), slot, PC_DIMM_SLOT_PROP, &local_err);
> > +    if (local_err) {
> > +        goto out;
> > +    }
> > +
> > +    if (kvm_enabled() && !kvm_has_free_slot(machine)) {
> > +        error_setg(&local_err, "hypervisor has no free memory slots left");
> > +        goto out;
> > +    }
> > +
> > +    memory_region_add_subregion(&ms->hotplug_memory,
> > +                                addr - ms->hotplug_memory_base, mr);
> > +    vmstate_register_ram(mr, dev);
> > +
> > +    drc = spapr_dr_connector_by_id(SPAPR_DR_CONNECTOR_TYPE_LMB,
> > +            addr/SPAPR_MIN_MEMORY_BLOCK_SIZE);
> > +    g_assert(drc);
> > +    spapr_hotplug_req_add_event(drc);
> > +
> > +out:
> > +    error_propagate(errp, local_err);
> > +}
> 
> It looks like this is basically the same as pc_dimm_plug() except for
> a couple of checks and the last section which actually notifies the
> guest.  Could this be made into common code in pc-dimm.c with hooks
> for the platform specific notification?

Yes, I could do that.

> 
> Maybe PC and sPAPR subclasses of a common dimm object?

Will take a look and see how best this can be done.
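
One possible shape for that split, purely as a sketch under stated
assumptions: the placement logic lives in a common function and the
platform supplies only the guest notification through a hook. Every name
below (DimmSketch, DimmPlugOps, dimm_plug_common, spapr_notify_sketch,
LMB_SIZE_SKETCH) is hypothetical and not an existing QEMU interface, and
the 256MB block size is an assumed value used for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a DIMM device; not a QEMU type. */
typedef struct DimmSketch {
    uint64_t addr;   /* guest physical address chosen for the DIMM */
    uint64_t size;   /* size of the backing memory region in bytes */
} DimmSketch;

/* Hook table filled in by the platform (PC, sPAPR, ...). */
typedef struct DimmPlugOps {
    /* Platform-specific guest notification; may be NULL. */
    void (*notify_guest)(const DimmSketch *dimm, void *opaque);
} DimmPlugOps;

/*
 * Common part of the plug path: check that the DIMM fits in the hotplug
 * region, pick an address, then call the platform hook.
 * Returns 0 on success, -1 if there is not enough space.
 */
static int dimm_plug_common(DimmSketch *dimm, uint64_t base,
                            uint64_t region_size, uint64_t used,
                            const DimmPlugOps *ops, void *opaque)
{
    if (used > region_size || dimm->size > region_size - used) {
        return -1;
    }
    dimm->addr = base + used;

    if (ops && ops->notify_guest) {
        ops->notify_guest(dimm, opaque);
    }
    return 0;
}

/* Assumed memory block size for this sketch: 256MB. */
#define LMB_SIZE_SKETCH (256ULL << 20)

/*
 * sPAPR-flavoured hook: derive a block index from the address, which is
 * where the real code would look up the connector and raise the event.
 */
static void spapr_notify_sketch(const DimmSketch *dimm, void *opaque)
{
    uint64_t *lmb_index = opaque;

    assert(lmb_index != NULL);
    *lmb_index = dimm->addr / LMB_SIZE_SKETCH;
}
```

With this layout a PC machine would pass a hook that raises its own
notification, and neither platform would need to duplicate the address,
slot and capacity checks.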

Regards,
Bharata.


Thread overview: 50+ messages
2015-01-08  6:10 [Qemu-devel] [RFC PATCH v1 00/13] CPU and Memory hotplug for PowerPC guests Bharata B Rao
2015-01-08  6:10 ` [Qemu-devel] [RFC PATCH v1 01/13] spapr: enable PHB/CPU/LMB hotplug for pseries-2.3 Bharata B Rao
2015-01-22 21:08   ` Michael Roth
2015-01-29  1:04   ` David Gibson
2015-01-08  6:10 ` [Qemu-devel] [RFC PATCH v1 02/13] spapr: Add DRC dt entries for CPUs Bharata B Rao
2015-01-22 21:21   ` Michael Roth
2015-01-29  1:04   ` David Gibson
2015-01-08  6:10 ` [Qemu-devel] [RFC PATCH v1 03/13] spapr: Consider max_cpus during xics initialization Bharata B Rao
2015-01-29  1:05   ` David Gibson
2015-01-08  6:10 ` [Qemu-devel] [RFC PATCH v1 04/13] spapr: Factor out CPU initialization code into realizefn Bharata B Rao
2015-01-29  1:07   ` David Gibson
2015-01-30  7:49     ` Bharata B Rao
2015-02-23  7:36       ` Bharata B Rao
2015-02-23 15:19         ` Alexander Graf
2015-01-08  6:10 ` [Qemu-devel] [RFC PATCH v1 05/13] spapr: Support ibm,lrdr-capacity device tree property Bharata B Rao
2015-01-22 21:55   ` Michael Roth
2015-01-30  8:51     ` Bharata B Rao
2015-01-29  1:16   ` David Gibson
2015-01-30  7:50     ` Bharata B Rao
2015-01-08  6:10 ` [Qemu-devel] [RFC PATCH v1 06/13] spapr: CPU hotplug support Bharata B Rao
2015-01-22 22:16   ` Michael Roth
2015-01-28  4:19     ` Bharata B Rao
2015-01-28  5:41       ` Michael Roth
2015-01-23 12:41   ` Igor Mammedov
2015-01-30  6:59     ` Bharata B Rao
2015-01-29  1:31   ` David Gibson
2015-01-08  6:10 ` [Qemu-devel] [RFC PATCH v1 07/13] spapr: Start all the threads of CPU core when core is hotplugged Bharata B Rao
2015-01-29  1:36   ` David Gibson
2015-01-30  8:12     ` Bharata B Rao
2015-01-08  6:10 ` [Qemu-devel] [RFC PATCH v1 08/13] spapr: Enable CPU hotplug for POWER8 CPU family Bharata B Rao
2015-01-08  6:10 ` [Qemu-devel] [RFC PATCH v1 09/13] spapr: CPU hot unplug support Bharata B Rao
2015-01-29  1:39   ` David Gibson
2015-01-30  8:15     ` Bharata B Rao
2015-01-08  6:10 ` [Qemu-devel] [RFC PATCH v1 10/13] cpus, spapr: reclaim allocated vCPU objects Bharata B Rao
2015-01-29  1:48   ` David Gibson
2015-01-30  8:23     ` Bharata B Rao
2015-01-31  0:21       ` David Gibson
2015-01-08  6:10 ` [Qemu-devel] [RFC PATCH v1 11/13] spapr: Initialize hotplug memory address space Bharata B Rao
2015-02-12  5:19   ` David Gibson
2015-02-12  5:39     ` Bharata B Rao
2015-02-16  4:56       ` David Gibson
2015-02-17  4:00         ` Bharata B Rao
2015-01-08  6:10 ` [Qemu-devel] [RFC PATCH v1 12/13] spapr: Support ibm,dynamic-reconfiguration-memory Bharata B Rao
2015-02-12  6:02   ` David Gibson
2015-01-08  6:10 ` [Qemu-devel] [RFC PATCH v1 13/13] spapr: Memory hotplug support Bharata B Rao
2015-02-24  6:26   ` David Gibson
2015-02-24  8:12     ` Bharata B Rao
2015-01-29 17:46 ` [Qemu-devel] [RFC PATCH v1 00/13] CPU and Memory hotplug for PowerPC guests Andreas Färber
2015-02-02  9:00   ` Bharata B Rao
2015-01-29 22:14 ` Tyrel Datwyler
