* [Qemu-devel] [RFC PATCH v2 00/23] CPU and Memory hotplug for PowerPC sPAPR guests
@ 2015-03-23 13:35 Bharata B Rao
  2015-03-23 13:35 ` [Qemu-devel] [RFC PATCH v2 01/23] spapr: enable PHB/CPU/LMB hotplug for pseries-2.3 Bharata B Rao
                   ` (24 more replies)
  0 siblings, 25 replies; 74+ messages in thread
From: Bharata B Rao @ 2015-03-23 13:35 UTC
  To: qemu-devel
  Cc: Bharata B Rao, mdroth, agraf, qemu-ppc, tyreld, nfont, imammedo,
	afaerber, david

Hi,

This is version 2 of the patchset that provides CPU and memory hotplug
support for PowerPC sPAPR guests.

These patches apply against spapr-hotplug-pci-v7 branch of Michael Roth's
PCI hotplug tree (git://github.com/mdroth/qemu). I am basing against
Michael's tree because that has the DR connector base code that is required
to do any hotplug in sPAPR.

I have switched to socket level semantics as suggested by Andreas Färber
(http://lists.gnu.org/archive/html/qemu-devel/2015-01/msg04410.html). What
this means is that I now add full sockets (consisting of cores and threads)
at once in response to device_add command for CPU hotplug. This is enabled
by borrowing one patch from Andreas' WIP tree and building PowerPC parts
on top of it.
(https://github.com/afaerber/qemu-cpu/commit/d33c7caa6471507266d02208ff98f72d4990092c)
Ideally I would have posted this v2 after Andreas formally posts his
socket level abstraction patchset, but I thought I could get some more
review while waiting for his post.

I don't expect anyone to try this out yet, but here is my git tree
for the bravehearts :)

spapr-hotplug branch at https://github.com/bharata/qemu/

Major changes in this patchset (v2)
-----------------------------------
- Switched to the socket level semantics that are being proposed by Andreas.
- Reorganized CPU device tree generation code for sPAPR so that the same
  code is used in the normal and hotplug paths.
- Common CPU init code is now shared between the boot path and the hotplug path.
- Added documentation about new device tree nodes that are being
  added for hotplug.
- CPU hotplug on LE guest now works.
- Hotplugging more than the minimum size (256MB) of memory at once now works.
- Enforced alignment requirements for memory hotplug.
- Fixed generic CPU enumeration code to enable proper hot removal
  of CPUs.
- Fixed a crash that happened when a VM that had undergone CPU
  removal was rebooted.
- Not mixing sPAPR code with generic ppc code now.
- Addressed most of the review comments from v1 except a few.

v1: http://lists.gnu.org/archive/html/qemu-devel/2015-01/msg00611.html
v0: http://lists.nongnu.org/archive/html/qemu-devel/2014-09/msg00752.html

TODOs
-----
- Share code between pc_dimm_plug() and spapr_memory_plug().
- Make the algorithm that looks up the NUMA node given the physical address
  more efficient.
- Test/enable migration after hotplug.
- Address a few object reference leaks.
- And of course, much wider testing.

Example Usage
-------------
CPU hotplug:

Cmdline: -smp 16,maxcpus=32,sockets=4,cores=2,threads=2
Monitor: (qemu) device_add powerpc64-cpu-socket,id=sock5

Memory hotplug: (same semantics as x86)

Cmdline: -m 2G,slots=4,maxmem=4G
Monitor: (qemu) object_add memory-backend-ram,id=ram1,size=512M
         (qemu) device_add pc-dimm,id=dimm1,memdev=ram1

After the above steps, the hotplug action needs to be completed by
using rtas_errd and drmgr utilities (part of powerpc-utils package).
With some changes, I am able to get both the CPU and memory hotplug parts
working in powerpc-utils, but I expect Nathan Fontenot to take care
of these parts more properly. Nathan has RFC patches to the guest
kernel that complete the hotplug action for both CPU and memory
completely within the guest kernel. When that is available, these user
space tools will not be needed.
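
The socket arithmetic behind the CPU hotplug example above can be sketched
as follows. This is only an illustration: the struct and helper names are
hypothetical and not part of QEMU, and it assumes that every socket holds
cores * threads vCPUs, as implied by the -smp options shown.

```c
#include <assert.h>

/* Hypothetical stand-in for the topology given on the -smp command line. */
struct smp_topology {
    unsigned cpus;     /* vCPU threads present at boot (-smp N) */
    unsigned maxcpus;  /* upper bound including hot-plugged vCPUs */
    unsigned cores;    /* cores per socket */
    unsigned threads;  /* threads per core */
};

/* Number of vCPU threads contained in one socket. */
static unsigned threads_per_socket(const struct smp_topology *t)
{
    return t->cores * t->threads;
}

/* Sockets that are populated when the guest boots. */
static unsigned boot_sockets(const struct smp_topology *t)
{
    assert(threads_per_socket(t) > 0);
    return t->cpus / threads_per_socket(t);
}

/* Sockets that can still be added with device_add. */
static unsigned hotpluggable_sockets(const struct smp_topology *t)
{
    assert(threads_per_socket(t) > 0);
    return (t->maxcpus - t->cpus) / threads_per_socket(t);
}
```

With -smp 16,maxcpus=32,sockets=4,cores=2,threads=2 this gives four sockets
at boot and room for four more, each device_add bringing in two cores of two
threads.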

Andreas Färber (1):
  cpu: Prepare Socket container type

Bharata B Rao (20):
  spapr: Add DRC dt entries for CPUs
  spapr: Consider max_cpus during xics initialization
  spapr: Support ibm,lrdr-capacity device tree property
  spapr: Reorganize CPU dt generation code
  spapr: Consolidate cpu init code into a routine
  ppc: Prepare CPU socket/core abstraction
  spapr: Add CPU hotplug handler
  ppc: Update cpu_model in MachineState
  ppc: Create sockets and cores for CPUs
  spapr: CPU hotplug support
  cpus: Add Error argument to cpu_exec_init()
  cpus: Convert cpu_index into a bitmap
  ppc: Move cpu_exec_init() call to realize function
  xics_kvm: Don't enable KVM_CAP_IRQ_XICS if already enabled
  xics_kvm: Add cpu_destroy method to XICS
  spapr: CPU hot unplug support
  spapr: Remove vCPU objects after CPU hot unplug
  spapr: Initialize hotplug memory address space
  spapr: Support ibm,dynamic-reconfiguration-memory
  spapr: Memory hotplug support

Gu Zheng (1):
  cpus: Reclaim vCPU objects

Michael Roth (1):
  spapr: enable PHB/CPU/LMB hotplug for pseries-2.3

 cpus.c                            |   44 ++
 default-configs/ppc64-softmmu.mak |    1 +
 docs/specs/ppc-spapr-hotplug.txt  |   66 +++
 exec.c                            |   39 +-
 hw/cpu/Makefile.objs              |    2 +-
 hw/cpu/socket.c                   |   21 +
 hw/intc/xics.c                    |   12 +
 hw/intc/xics_kvm.c                |   19 +
 hw/ppc/Makefile.objs              |    1 +
 hw/ppc/cpu-core.c                 |   63 +++
 hw/ppc/cpu-socket.c               |   62 +++
 hw/ppc/mac_newworld.c             |   10 +-
 hw/ppc/mac_oldworld.c             |    7 +-
 hw/ppc/ppc440_bamboo.c            |    7 +-
 hw/ppc/prep.c                     |    7 +-
 hw/ppc/spapr.c                    | 1014 +++++++++++++++++++++++++++++--------
 hw/ppc/spapr_events.c             |   11 +-
 hw/ppc/spapr_hcall.c              |   51 +-
 hw/ppc/spapr_rtas.c               |   29 +-
 hw/ppc/virtex_ml507.c             |    7 +-
 include/exec/exec-all.h           |    2 +-
 include/hw/cpu/socket.h           |   14 +
 include/hw/ppc/cpu-core.h         |   32 ++
 include/hw/ppc/cpu-socket.h       |   32 ++
 include/hw/ppc/spapr.h            |   37 +-
 include/hw/ppc/xics.h             |    3 +
 include/qom/cpu.h                 |   19 +
 include/sysemu/kvm.h              |    1 +
 kvm-all.c                         |   57 ++-
 kvm-stub.c                        |    5 +
 linux-headers/linux/kvm.h         |    1 +
 target-alpha/cpu.c                |    2 +-
 target-arm/cpu.c                  |    2 +-
 target-cris/cpu.c                 |    2 +-
 target-i386/cpu.c                 |    2 +-
 target-lm32/cpu.c                 |    2 +-
 target-m68k/cpu.c                 |    2 +-
 target-microblaze/cpu.c           |    2 +-
 target-mips/cpu.c                 |    2 +-
 target-moxie/cpu.c                |    2 +-
 target-openrisc/cpu.c             |    2 +-
 target-ppc/cpu.h                  |    1 +
 target-ppc/kvm.c                  |    7 +
 target-ppc/kvm_ppc.h              |    6 +
 target-ppc/translate_init.c       |   55 +-
 target-s390x/cpu.c                |    2 +-
 target-sh4/cpu.c                  |    2 +-
 target-sparc/cpu.c                |    2 +-
 target-tricore/cpu.c              |    2 +-
 target-unicore32/cpu.c            |    2 +-
 target-xtensa/cpu.c               |    2 +-
 51 files changed, 1503 insertions(+), 274 deletions(-)
 create mode 100644 hw/cpu/socket.c
 create mode 100644 hw/ppc/cpu-core.c
 create mode 100644 hw/ppc/cpu-socket.c
 create mode 100644 include/hw/cpu/socket.h
 create mode 100644 include/hw/ppc/cpu-core.h
 create mode 100644 include/hw/ppc/cpu-socket.h

-- 
2.1.0


* [Qemu-devel] [RFC PATCH v2 01/23] spapr: enable PHB/CPU/LMB hotplug for pseries-2.3
  2015-03-23 13:35 [Qemu-devel] [RFC PATCH v2 00/23] CPU and Memory hotplug for PowerPC sPAPR guests Bharata B Rao
@ 2015-03-23 13:35 ` Bharata B Rao
  2015-03-25  0:04   ` David Gibson
  2015-03-23 13:35 ` [Qemu-devel] [RFC PATCH v2 02/23] spapr: Add DRC dt entries for CPUs Bharata B Rao
                   ` (23 subsequent siblings)
  24 siblings, 1 reply; 74+ messages in thread
From: Bharata B Rao @ 2015-03-23 13:35 UTC
  To: qemu-devel
  Cc: Bharata B Rao, mdroth, agraf, qemu-ppc, tyreld, nfont, imammedo,
	afaerber, david

From: Michael Roth <mdroth@linux.vnet.ibm.com>

Introduce an sPAPRMachineClass sub-class of MachineClass to
handle sPAPR-specific machine configuration properties.

The 'dr_phb_enabled', 'dr_cpu_enabled' and 'dr_lmb_enabled' fields of that
class can be set as part of machine-specific init code, and are then
propagated to sPAPREnvironment to conditionally enable creation of DRC
objects and device-tree descriptions to facilitate hotplug of
PHBs/CPUs/LMBs.

Since we can't migrate this state to older machine types,
default the option to false and only enable it for new
machine types.
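
The default-off, opt-in-per-machine-type pattern described above can be
sketched outside of QOM as follows. This is a simplified illustration, not
the code in this patch: the struct and function names are hypothetical
stand-ins for sPAPRMachineClass, sPAPREnvironment and the class_init hooks.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-machine-type class flags. */
struct machine_class_flags {
    bool dr_phb_enabled;
    bool dr_cpu_enabled;
    bool dr_lmb_enabled;
};

/* Hypothetical stand-in for the per-instance environment. */
struct env_flags {
    bool dr_phb_enabled;
    bool dr_cpu_enabled;
    bool dr_lmb_enabled;
};

/* Base class: hotplug is off, so older machine types stay migratable. */
static void base_class_init(struct machine_class_flags *mc)
{
    mc->dr_phb_enabled = false;
    mc->dr_cpu_enabled = false;
    mc->dr_lmb_enabled = false;
}

/* Newest machine type: start from the base defaults, then opt in. */
static void v2_3_class_init(struct machine_class_flags *mc)
{
    base_class_init(mc);
    mc->dr_phb_enabled = true;
    mc->dr_cpu_enabled = true;
    mc->dr_lmb_enabled = true;
}

/* Machine init: copy the class settings into the instance state. */
static void machine_init(const struct machine_class_flags *mc,
                         struct env_flags *env)
{
    assert(mc && env);
    env->dr_phb_enabled = mc->dr_phb_enabled;
    env->dr_cpu_enabled = mc->dr_cpu_enabled;
    env->dr_lmb_enabled = mc->dr_lmb_enabled;
}
```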

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
              [Added CPU and LMB bits]
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
---
 hw/ppc/spapr.c         | 32 ++++++++++++++++++++++++++++++++
 include/hw/ppc/spapr.h |  3 +++
 2 files changed, 35 insertions(+)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 74ee277..a782e28 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -90,11 +90,29 @@
 
 #define HTAB_SIZE(spapr)        (1ULL << ((spapr)->htab_shift))
 
+typedef struct sPAPRMachineClass sPAPRMachineClass;
 typedef struct sPAPRMachineState sPAPRMachineState;
 
 #define TYPE_SPAPR_MACHINE      "spapr-machine"
 #define SPAPR_MACHINE(obj) \
     OBJECT_CHECK(sPAPRMachineState, (obj), TYPE_SPAPR_MACHINE)
+#define SPAPR_MACHINE_GET_CLASS(obj) \
+        OBJECT_GET_CLASS(sPAPRMachineClass, obj, TYPE_SPAPR_MACHINE)
+#define SPAPR_MACHINE_CLASS(klass) \
+        OBJECT_CLASS_CHECK(sPAPRMachineClass, klass, TYPE_SPAPR_MACHINE)
+
+/**
+ * sPAPRMachineClass:
+ */
+struct sPAPRMachineClass {
+    /*< private >*/
+    MachineClass parent_class;
+
+    /*< public >*/
+    bool dr_phb_enabled; /* enable dynamic-reconfig/hotplug of PHBs */
+    bool dr_cpu_enabled; /* enable dynamic-reconfig/hotplug of CPUs */
+    bool dr_lmb_enabled; /* enable dynamic-reconfig/hotplug of LMBs */
+};
 
 /**
  * sPAPRMachineState:
@@ -1378,6 +1396,7 @@ static SaveVMHandlers savevm_htab_handlers = {
 /* pSeries LPAR / sPAPR hardware init */
 static void ppc_spapr_init(MachineState *machine)
 {
+    sPAPRMachineClass *smc = SPAPR_MACHINE_GET_CLASS(machine);
     ram_addr_t ram_size = machine->ram_size;
     const char *cpu_model = machine->cpu_model;
     const char *kernel_filename = machine->kernel_filename;
@@ -1541,6 +1560,10 @@ static void ppc_spapr_init(MachineState *machine)
     /* We always have at least the nvram device on VIO */
     spapr_create_nvram(spapr);
 
+    spapr->dr_phb_enabled = smc->dr_phb_enabled;
+    spapr->dr_cpu_enabled = smc->dr_cpu_enabled;
+    spapr->dr_lmb_enabled = smc->dr_lmb_enabled;
+
     /* Set up PCI */
     spapr_pci_rtas_init();
 
@@ -1767,6 +1790,7 @@ static void spapr_nmi(NMIState *n, int cpu_index, Error **errp)
 static void spapr_machine_class_init(ObjectClass *oc, void *data)
 {
     MachineClass *mc = MACHINE_CLASS(oc);
+    sPAPRMachineClass *smc = SPAPR_MACHINE_CLASS(oc);
     FWPathProviderClass *fwc = FW_PATH_PROVIDER_CLASS(oc);
     NMIClass *nc = NMI_CLASS(oc);
 
@@ -1778,6 +1802,9 @@ static void spapr_machine_class_init(ObjectClass *oc, void *data)
     mc->default_boot_order = NULL;
     mc->kvm_type = spapr_kvm_type;
     mc->has_dynamic_sysbus = true;
+    smc->dr_phb_enabled = false;
+    smc->dr_cpu_enabled = false;
+    smc->dr_lmb_enabled = false;
 
     fwc->get_dev_path = spapr_get_fw_dev_path;
     nc->nmi_monitor_handler = spapr_nmi;
@@ -1789,6 +1816,7 @@ static const TypeInfo spapr_machine_info = {
     .abstract      = true,
     .instance_size = sizeof(sPAPRMachineState),
     .instance_init = spapr_machine_initfn,
+    .class_size    = sizeof(sPAPRMachineClass),
     .class_init    = spapr_machine_class_init,
     .interfaces = (InterfaceInfo[]) {
         { TYPE_FW_PATH_PROVIDER },
@@ -1854,11 +1882,15 @@ static const TypeInfo spapr_machine_2_2_info = {
 static void spapr_machine_2_3_class_init(ObjectClass *oc, void *data)
 {
     MachineClass *mc = MACHINE_CLASS(oc);
+    sPAPRMachineClass *smc = SPAPR_MACHINE_CLASS(oc);
 
     mc->name = "pseries-2.3";
     mc->desc = "pSeries Logical Partition (PAPR compliant) v2.3";
     mc->alias = "pseries";
     mc->is_default = 1;
+    smc->dr_phb_enabled = true;
+    smc->dr_cpu_enabled = true;
+    smc->dr_lmb_enabled = true;
 }
 
 static const TypeInfo spapr_machine_2_3_info = {
diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
index 4ab289b..4578433 100644
--- a/include/hw/ppc/spapr.h
+++ b/include/hw/ppc/spapr.h
@@ -32,6 +32,9 @@ typedef struct sPAPREnvironment {
     uint64_t rtc_offset; /* Now used only during incoming migration */
     struct PPCTimebase tb;
     bool has_graphics;
+    bool dr_phb_enabled; /* hotplug / dynamic-reconfiguration of PHBs */
+    bool dr_cpu_enabled; /* hotplug / dynamic-reconfiguration of CPUs */
+    bool dr_lmb_enabled; /* hotplug / dynamic-reconfiguration of LMBs */
 
     uint32_t check_exception_irq;
     Notifier epow_notifier;
-- 
2.1.0


* [Qemu-devel] [RFC PATCH v2 02/23] spapr: Add DRC dt entries for CPUs
  2015-03-23 13:35 [Qemu-devel] [RFC PATCH v2 00/23] CPU and Memory hotplug for PowerPC sPAPR guests Bharata B Rao
  2015-03-23 13:35 ` [Qemu-devel] [RFC PATCH v2 01/23] spapr: enable PHB/CPU/LMB hotplug for pseries-2.3 Bharata B Rao
@ 2015-03-23 13:35 ` Bharata B Rao
  2015-03-25  0:07   ` David Gibson
  2015-03-23 13:35 ` [Qemu-devel] [RFC PATCH v2 03/23] spapr: Consider max_cpus during xics initialization Bharata B Rao
                   ` (22 subsequent siblings)
  24 siblings, 1 reply; 74+ messages in thread
From: Bharata B Rao @ 2015-03-23 13:35 UTC
  To: qemu-devel
  Cc: Bharata B Rao, mdroth, agraf, qemu-ppc, tyreld, nfont, imammedo,
	afaerber, david

Advertise CPU DR-capability to the guest via device tree.
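
The patch creates one DR connector per possible core and numbers them in
steps of the SMT width. A minimal sketch of that numbering, using a
hypothetical helper name and taking smt as a plain parameter in place of
the value returned by kvmppc_smt_threads():

```c
#include <assert.h>
#include <stddef.h>

/*
 * Fill ids[] with one connector id per possible core and return how many
 * were written. Core i gets id i * smt, mirroring the loop in the patch.
 */
static size_t cpu_drc_ids(unsigned max_cpus, unsigned smp_threads,
                          unsigned smt, unsigned *ids, size_t cap)
{
    size_t n = 0;

    assert(smp_threads > 0);
    for (unsigned i = 0; i < max_cpus / smp_threads && n < cap; i++) {
        ids[n++] = i * smt;
    }
    return n;
}
```

For example, with max_cpus=8, two threads per core and an assumed smt of 8,
the connector ids would be 0, 8, 16 and 24.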

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
               [spapr_drc_reset implementation]
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
---
 hw/ppc/spapr.c | 29 +++++++++++++++++++++++++++++
 1 file changed, 29 insertions(+)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index a782e28..920e650 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -807,6 +807,15 @@ static void spapr_finalize_fdt(sPAPREnvironment *spapr,
         spapr_populate_chosen_stdout(fdt, spapr->vio_bus);
     }
 
+    if (spapr->dr_cpu_enabled) {
+        int offset = fdt_path_offset(fdt, "/cpus");
+        ret = spapr_drc_populate_dt(fdt, offset, NULL,
+                                    SPAPR_DR_CONNECTOR_TYPE_CPU);
+        if (ret < 0) {
+            fprintf(stderr, "Couldn't set up CPU DR device tree properties\n");
+        }
+    }
+
     _FDT((fdt_pack(fdt)));
 
     if (fdt_totalsize(fdt) > FDT_MAX_SIZE) {
@@ -1393,6 +1402,16 @@ static SaveVMHandlers savevm_htab_handlers = {
     .load_state = htab_load,
 };
 
+static void spapr_drc_reset(void *opaque)
+{
+    sPAPRDRConnector *drc = opaque;
+    DeviceState *d = DEVICE(drc);
+
+    if (d) {
+        device_reset(d);
+    }
+}
+
 /* pSeries LPAR / sPAPR hardware init */
 static void ppc_spapr_init(MachineState *machine)
 {
@@ -1418,6 +1437,7 @@ static void ppc_spapr_init(MachineState *machine)
     long load_limit, fw_size;
     bool kernel_le = false;
     char *filename;
+    int smt = kvmppc_smt_threads();
 
     msi_supported = true;
 
@@ -1564,6 +1584,15 @@ static void ppc_spapr_init(MachineState *machine)
     spapr->dr_cpu_enabled = smc->dr_cpu_enabled;
     spapr->dr_lmb_enabled = smc->dr_lmb_enabled;
 
+    if (spapr->dr_cpu_enabled) {
+        for (i = 0; i < max_cpus/smp_threads; i++) {
+            sPAPRDRConnector *drc =
+                spapr_dr_connector_new(OBJECT(machine),
+                                       SPAPR_DR_CONNECTOR_TYPE_CPU, i * smt);
+            qemu_register_reset(spapr_drc_reset, drc);
+        }
+    }
+
     /* Set up PCI */
     spapr_pci_rtas_init();
 
-- 
2.1.0


* [Qemu-devel] [RFC PATCH v2 03/23] spapr: Consider max_cpus during xics initialization
  2015-03-23 13:35 [Qemu-devel] [RFC PATCH v2 00/23] CPU and Memory hotplug for PowerPC sPAPR guests Bharata B Rao
  2015-03-23 13:35 ` [Qemu-devel] [RFC PATCH v2 01/23] spapr: enable PHB/CPU/LMB hotplug for pseries-2.3 Bharata B Rao
  2015-03-23 13:35 ` [Qemu-devel] [RFC PATCH v2 02/23] spapr: Add DRC dt entries for CPUs Bharata B Rao
@ 2015-03-23 13:35 ` Bharata B Rao
  2015-03-23 13:35 ` [Qemu-devel] [RFC PATCH v2 04/23] spapr: Support ibm,lrdr-capacity device tree property Bharata B Rao
                   ` (21 subsequent siblings)
  24 siblings, 0 replies; 74+ messages in thread
From: Bharata B Rao @ 2015-03-23 13:35 UTC
  To: qemu-devel
  Cc: Bharata B Rao, mdroth, agraf, qemu-ppc, tyreld, nfont, imammedo,
	afaerber, david

Use max_cpus instead of smp_cpus when initializing the XICS system. Also
report max_cpus in the ibm,interrupt-server-ranges device tree property of
the interrupt controller node.
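
The effect of the change on the number of interrupt servers can be sketched
as below. The helper name is hypothetical, and smt again stands in for the
value returned by kvmppc_smt_threads(); the expression itself is the one
passed to xics_system_init() in the diff.

```c
#include <assert.h>

/* Interrupt servers needed for a given number of vCPU threads. */
static unsigned xics_nr_servers(unsigned cpus, unsigned smt,
                                unsigned smp_threads)
{
    assert(smp_threads > 0);
    return cpus * smt / smp_threads;
}
```

Assuming smt is 8, a guest started with -smp 16,maxcpus=32,threads=2 would
previously size XICS for 64 servers (from smp_cpus) and now sizes it for 128
(from max_cpus), leaving room for hot-plugged CPUs.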

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
---
 hw/ppc/spapr.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 920e650..3b26ede 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -347,7 +347,7 @@ static void *spapr_create_fdt_skel(hwaddr initrd_base,
     GString *hypertas = g_string_sized_new(256);
     GString *qemu_hypertas = g_string_sized_new(256);
     uint32_t refpoints[] = {cpu_to_be32(0x4), cpu_to_be32(0x4)};
-    uint32_t interrupt_server_ranges_prop[] = {0, cpu_to_be32(smp_cpus)};
+    uint32_t interrupt_server_ranges_prop[] = {0, cpu_to_be32(max_cpus)};
     int smt = kvmppc_smt_threads();
     unsigned char vec5[] = {0x0, 0x0, 0x0, 0x0, 0x0, 0x80};
     QemuOpts *opts = qemu_opts_find(qemu_find_opts("smp-opts"), NULL);
@@ -1495,7 +1495,7 @@ static void ppc_spapr_init(MachineState *machine)
     }
 
     /* Set up Interrupt Controller before we create the VCPUs */
-    spapr->icp = xics_system_init(smp_cpus * kvmppc_smt_threads() / smp_threads,
+    spapr->icp = xics_system_init(max_cpus * kvmppc_smt_threads() / smp_threads,
                                   XICS_IRQS);
 
     /* init CPUs */
-- 
2.1.0


* [Qemu-devel] [RFC PATCH v2 04/23] spapr: Support ibm,lrdr-capacity device tree property
  2015-03-23 13:35 [Qemu-devel] [RFC PATCH v2 00/23] CPU and Memory hotplug for PowerPC sPAPR guests Bharata B Rao
                   ` (2 preceding siblings ...)
  2015-03-23 13:35 ` [Qemu-devel] [RFC PATCH v2 03/23] spapr: Consider max_cpus during xics initialization Bharata B Rao
@ 2015-03-23 13:35 ` Bharata B Rao
  2015-03-25  0:15   ` David Gibson
  2015-03-23 13:35 ` [Qemu-devel] [RFC PATCH v2 05/23] spapr: Reorganize CPU dt generation code Bharata B Rao
                   ` (20 subsequent siblings)
  24 siblings, 1 reply; 74+ messages in thread
From: Bharata B Rao @ 2015-03-23 13:35 UTC
  To: qemu-devel
  Cc: Bharata B Rao, mdroth, agraf, qemu-ppc, tyreld, nfont, imammedo,
	afaerber, david

Add support for ibm,lrdr-capacity since this is needed by the guest
kernel to know about the possible hot-pluggable CPUs and Memory. With
this, pseries kernels will start reporting correct maxcpus in
/sys/devices/system/cpu/possible.

Define minimum hotpluggable memory size as 256MB and start storing maximum
possible memory for the guest in sPAPREnvironment.
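
The layout of the property can be sketched as a standalone encoder. This is
an illustration under stated assumptions rather than the QEMU code: the
helper names and the byte-buffer interface are hypothetical, while the five
32-bit big-endian cells and the max_cpus/smp_threads value follow the diff
in this patch.

```c
#include <assert.h>
#include <stdint.h>

#define MEMORY_BLOCK_SIZE (1u << 28) /* 256MB, as in the patch */

/* Store v at p in big-endian byte order. */
static void put_be32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v >> 24);
    p[1] = (uint8_t)(v >> 16);
    p[2] = (uint8_t)(v >> 8);
    p[3] = (uint8_t)v;
}

/*
 * Encode <phys> (two cells), <size> (two cells) and <maxcpus> (one cell)
 * into a 20-byte buffer.
 */
static void encode_lrdr_capacity(uint8_t out[20], uint64_t maxram,
                                 unsigned max_cpus, unsigned smp_threads)
{
    assert(smp_threads > 0);
    put_be32(out + 0, (uint32_t)(maxram >> 32));          /* phys, high */
    put_be32(out + 4, (uint32_t)(maxram & 0xffffffffu));  /* phys, low */
    put_be32(out + 8, 0);                                 /* size, high */
    put_be32(out + 12, MEMORY_BLOCK_SIZE);                /* size, low */
    put_be32(out + 16, max_cpus / smp_threads);           /* maxcpus */
}
```

For maxmem=4G and maxcpus=32 with two threads per core, the cells would be
0x00000001, 0x00000000, 0x00000000, 0x10000000 and 0x00000010.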

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
---
 docs/specs/ppc-spapr-hotplug.txt | 18 ++++++++++++++++++
 hw/ppc/spapr.c                   |  3 ++-
 hw/ppc/spapr_rtas.c              | 18 ++++++++++++++++--
 include/hw/ppc/spapr.h           |  7 +++++--
 4 files changed, 41 insertions(+), 5 deletions(-)

diff --git a/docs/specs/ppc-spapr-hotplug.txt b/docs/specs/ppc-spapr-hotplug.txt
index d35771c..46e0719 100644
--- a/docs/specs/ppc-spapr-hotplug.txt
+++ b/docs/specs/ppc-spapr-hotplug.txt
@@ -284,4 +284,22 @@ struct rtas_event_log_v6_hp {
     } drc;
 } QEMU_PACKED;
 
+== ibm,lrdr-capacity ==
+
+ibm,lrdr-capacity is a property in the /rtas device tree node that identifies
+the dynamic reconfiguration capabilities of the guest. It consists of a triple
+consisting of <phys>, <size> and <maxcpus>.
+
+  <phys>, encoded in BE format represents the maximum address in bytes and
+  hence the maximum memory that can be allocated to the guest.
+
+  <size>, encoded in BE format represents the size increments in which
+  memory can be hot-plugged to the guest.
+
+  <maxcpus>, a BE-encoded integer, represents the maximum number of
+  processors that the guest can have.
+
+pseries guests use this property to note the maximum allowed CPUs for the
+guest.
+
 [1] http://thread.gmane.org/gmane.linux.ports.ppc.embedded/75350/focus=106867
diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 3b26ede..36ff754 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -777,7 +777,7 @@ static void spapr_finalize_fdt(sPAPREnvironment *spapr,
     }
 
     /* RTAS */
-    ret = spapr_rtas_device_tree_setup(fdt, rtas_addr, rtas_size);
+    ret = spapr_rtas_device_tree_setup(spapr, fdt, rtas_addr, rtas_size);
     if (ret < 0) {
         fprintf(stderr, "Couldn't set up RTAS device tree properties\n");
     }
@@ -1536,6 +1536,7 @@ static void ppc_spapr_init(MachineState *machine)
 
     /* allocate RAM */
     spapr->ram_limit = ram_size;
+    spapr->maxram_limit = machine->maxram_size;
     memory_region_allocate_system_memory(ram, NULL, "ppc_spapr.ram",
                                          spapr->ram_limit);
     memory_region_add_subregion(sysmem, 0, ram);
diff --git a/hw/ppc/spapr_rtas.c b/hw/ppc/spapr_rtas.c
index b4047af..57ec97a 100644
--- a/hw/ppc/spapr_rtas.c
+++ b/hw/ppc/spapr_rtas.c
@@ -29,6 +29,7 @@
 #include "sysemu/char.h"
 #include "hw/qdev.h"
 #include "sysemu/device_tree.h"
+#include "sysemu/cpus.h"
 
 #include "hw/ppc/spapr.h"
 #include "hw/ppc/spapr_vio.h"
@@ -613,11 +614,12 @@ void spapr_rtas_register(int token, const char *name, spapr_rtas_fn fn)
     rtas_table[token].fn = fn;
 }
 
-int spapr_rtas_device_tree_setup(void *fdt, hwaddr rtas_addr,
-                                 hwaddr rtas_size)
+int spapr_rtas_device_tree_setup(sPAPREnvironment *spapr, void *fdt,
+                                 hwaddr rtas_addr, hwaddr rtas_size)
 {
     int ret;
     int i;
+    uint32_t lrdr_capacity[5];
 
     ret = fdt_add_mem_rsv(fdt, rtas_addr, rtas_size);
     if (ret < 0) {
@@ -666,6 +668,18 @@ int spapr_rtas_device_tree_setup(void *fdt, hwaddr rtas_addr,
         }
 
     }
+
+    lrdr_capacity[0] = cpu_to_be32(spapr->maxram_limit >> 32);
+    lrdr_capacity[1] = cpu_to_be32(spapr->maxram_limit & 0xffffffff);
+    lrdr_capacity[2] = 0;
+    lrdr_capacity[3] = cpu_to_be32(SPAPR_MEMORY_BLOCK_SIZE);
+    lrdr_capacity[4] = cpu_to_be32(max_cpus/smp_threads);
+    ret = qemu_fdt_setprop(fdt, "/rtas", "ibm,lrdr-capacity", lrdr_capacity,
+                     sizeof(lrdr_capacity));
+    if (ret < 0) {
+        fprintf(stderr, "Couldn't add ibm,lrdr-capacity rtas property\n");
+    }
+
     return 0;
 }
 
diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
index 4578433..ecac6e3 100644
--- a/include/hw/ppc/spapr.h
+++ b/include/hw/ppc/spapr.h
@@ -20,6 +20,7 @@ typedef struct sPAPREnvironment {
     DeviceState *rtc;
 
     hwaddr ram_limit;
+    hwaddr maxram_limit;
     void *htab;
     uint32_t htab_shift;
     hwaddr rma_size;
@@ -497,8 +498,8 @@ void spapr_rtas_register(int token, const char *name, spapr_rtas_fn fn);
 target_ulong spapr_rtas_call(PowerPCCPU *cpu, sPAPREnvironment *spapr,
                              uint32_t token, uint32_t nargs, target_ulong args,
                              uint32_t nret, target_ulong rets);
-int spapr_rtas_device_tree_setup(void *fdt, hwaddr rtas_addr,
-                                 hwaddr rtas_size);
+int spapr_rtas_device_tree_setup(sPAPREnvironment *spapr, void *fdt,
+                                 hwaddr rtas_addr, hwaddr rtas_size);
 
 #define SPAPR_TCE_PAGE_SHIFT   12
 #define SPAPR_TCE_PAGE_SIZE    (1ULL << SPAPR_TCE_PAGE_SHIFT)
@@ -539,6 +540,8 @@ struct sPAPREventLogEntry {
     QTAILQ_ENTRY(sPAPREventLogEntry) next;
 };
 
+#define SPAPR_MEMORY_BLOCK_SIZE (1 << 28) /* 256MB */
+
 void spapr_events_init(sPAPREnvironment *spapr);
 void spapr_events_fdt_skel(void *fdt, uint32_t epow_irq);
 int spapr_h_cas_compose_response(target_ulong addr, target_ulong size);
-- 
2.1.0


* [Qemu-devel] [RFC PATCH v2 05/23] spapr: Reorganize CPU dt generation code
  2015-03-23 13:35 [Qemu-devel] [RFC PATCH v2 00/23] CPU and Memory hotplug for PowerPC sPAPR guests Bharata B Rao
                   ` (3 preceding siblings ...)
  2015-03-23 13:35 ` [Qemu-devel] [RFC PATCH v2 04/23] spapr: Support ibm,lrdr-capacity device tree property Bharata B Rao
@ 2015-03-23 13:35 ` Bharata B Rao
  2015-03-25  1:36   ` David Gibson
  2015-03-23 13:35 ` [Qemu-devel] [RFC PATCH v2 06/23] spapr: Consolidate cpu init code into a routine Bharata B Rao
                   ` (19 subsequent siblings)
  24 siblings, 1 reply; 74+ messages in thread
From: Bharata B Rao @ 2015-03-23 13:35 UTC
  To: qemu-devel
  Cc: Bharata B Rao, mdroth, agraf, qemu-ppc, tyreld, nfont, imammedo,
	afaerber, david

Reorganize CPU device tree generation code so that it can be reused from
the hotplug path. CPU dt entries are now generated from spapr_finalize_fdt()
instead of spapr_create_fdt_skel().

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
---
 hw/ppc/spapr.c | 288 ++++++++++++++++++++++++++++++---------------------------
 1 file changed, 154 insertions(+), 134 deletions(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 36ff754..1a25cc0 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -206,24 +206,50 @@ static int spapr_fixup_cpu_smt_dt(void *fdt, int offset, PowerPCCPU *cpu,
     return ret;
 }
 
+static int spapr_fixup_cpu_numa_smt_dt(void *fdt, int offset, CPUState *cs,
+                                        sPAPREnvironment *spapr)
+{
+    int ret;
+    PowerPCCPU *cpu = POWERPC_CPU(cs);
+    int index = ppc_get_vcpu_dt_id(cpu);
+    uint32_t pft_size_prop[] = {0, cpu_to_be32(spapr->htab_shift)};
+    uint32_t associativity[] = {cpu_to_be32(0x5),
+                                cpu_to_be32(0x0),
+                                cpu_to_be32(0x0),
+                                cpu_to_be32(0x0),
+                                cpu_to_be32(cs->numa_node),
+                                cpu_to_be32(index)};
+
+    /* Advertise NUMA via ibm,associativity */
+    if (nb_numa_nodes > 1) {
+        ret = fdt_setprop(fdt, offset, "ibm,associativity", associativity,
+                          sizeof(associativity));
+        if (ret < 0) {
+            return ret;
+        }
+    }
+
+    ret = fdt_setprop(fdt, offset, "ibm,pft-size",
+                      pft_size_prop, sizeof(pft_size_prop));
+    if (ret < 0) {
+        return ret;
+    }
+
+    return spapr_fixup_cpu_smt_dt(fdt, offset, cpu,
+                                 ppc_get_compat_smt_threads(cpu));
+}
+
 static int spapr_fixup_cpu_dt(void *fdt, sPAPREnvironment *spapr)
 {
     int ret = 0, offset, cpus_offset;
     CPUState *cs;
     char cpu_model[32];
     int smt = kvmppc_smt_threads();
-    uint32_t pft_size_prop[] = {0, cpu_to_be32(spapr->htab_shift)};
 
     CPU_FOREACH(cs) {
         PowerPCCPU *cpu = POWERPC_CPU(cs);
         DeviceClass *dc = DEVICE_GET_CLASS(cs);
         int index = ppc_get_vcpu_dt_id(cpu);
-        uint32_t associativity[] = {cpu_to_be32(0x5),
-                                    cpu_to_be32(0x0),
-                                    cpu_to_be32(0x0),
-                                    cpu_to_be32(0x0),
-                                    cpu_to_be32(cs->numa_node),
-                                    cpu_to_be32(index)};
 
         if ((index % smt) != 0) {
             continue;
@@ -247,22 +273,7 @@ static int spapr_fixup_cpu_dt(void *fdt, sPAPREnvironment *spapr)
             }
         }
 
-        if (nb_numa_nodes > 1) {
-            ret = fdt_setprop(fdt, offset, "ibm,associativity", associativity,
-                              sizeof(associativity));
-            if (ret < 0) {
-                return ret;
-            }
-        }
-
-        ret = fdt_setprop(fdt, offset, "ibm,pft-size",
-                          pft_size_prop, sizeof(pft_size_prop));
-        if (ret < 0) {
-            return ret;
-        }
-
-        ret = spapr_fixup_cpu_smt_dt(fdt, offset, cpu,
-                                     ppc_get_compat_smt_threads(cpu));
+        ret = spapr_fixup_cpu_numa_smt_dt(fdt, offset, cs, spapr);
         if (ret < 0) {
             return ret;
         }
@@ -341,18 +352,13 @@ static void *spapr_create_fdt_skel(hwaddr initrd_base,
                                    uint32_t epow_irq)
 {
     void *fdt;
-    CPUState *cs;
     uint32_t start_prop = cpu_to_be32(initrd_base);
     uint32_t end_prop = cpu_to_be32(initrd_base + initrd_size);
     GString *hypertas = g_string_sized_new(256);
     GString *qemu_hypertas = g_string_sized_new(256);
     uint32_t refpoints[] = {cpu_to_be32(0x4), cpu_to_be32(0x4)};
     uint32_t interrupt_server_ranges_prop[] = {0, cpu_to_be32(max_cpus)};
-    int smt = kvmppc_smt_threads();
     unsigned char vec5[] = {0x0, 0x0, 0x0, 0x0, 0x0, 0x80};
-    QemuOpts *opts = qemu_opts_find(qemu_find_opts("smp-opts"), NULL);
-    unsigned sockets = opts ? qemu_opt_get_number(opts, "sockets", 0) : 0;
-    uint32_t cpus_per_socket = sockets ? (smp_cpus / sockets) : 1;
     char *buf;
 
     add_str(hypertas, "hcall-pft");
@@ -441,107 +447,6 @@ static void *spapr_create_fdt_skel(hwaddr initrd_base,
 
     _FDT((fdt_end_node(fdt)));
 
-    /* cpus */
-    _FDT((fdt_begin_node(fdt, "cpus")));
-
-    _FDT((fdt_property_cell(fdt, "#address-cells", 0x1)));
-    _FDT((fdt_property_cell(fdt, "#size-cells", 0x0)));
-
-    CPU_FOREACH(cs) {
-        PowerPCCPU *cpu = POWERPC_CPU(cs);
-        CPUPPCState *env = &cpu->env;
-        DeviceClass *dc = DEVICE_GET_CLASS(cs);
-        PowerPCCPUClass *pcc = POWERPC_CPU_GET_CLASS(cs);
-        int index = ppc_get_vcpu_dt_id(cpu);
-        char *nodename;
-        uint32_t segs[] = {cpu_to_be32(28), cpu_to_be32(40),
-                           0xffffffff, 0xffffffff};
-        uint32_t tbfreq = kvm_enabled() ? kvmppc_get_tbfreq() : TIMEBASE_FREQ;
-        uint32_t cpufreq = kvm_enabled() ? kvmppc_get_clockfreq() : 1000000000;
-        uint32_t page_sizes_prop[64];
-        size_t page_sizes_prop_size;
-
-        if ((index % smt) != 0) {
-            continue;
-        }
-
-        nodename = g_strdup_printf("%s@%x", dc->fw_name, index);
-
-        _FDT((fdt_begin_node(fdt, nodename)));
-
-        g_free(nodename);
-
-        _FDT((fdt_property_cell(fdt, "reg", index)));
-        _FDT((fdt_property_string(fdt, "device_type", "cpu")));
-
-        _FDT((fdt_property_cell(fdt, "cpu-version", env->spr[SPR_PVR])));
-        _FDT((fdt_property_cell(fdt, "d-cache-block-size",
-                                env->dcache_line_size)));
-        _FDT((fdt_property_cell(fdt, "d-cache-line-size",
-                                env->dcache_line_size)));
-        _FDT((fdt_property_cell(fdt, "i-cache-block-size",
-                                env->icache_line_size)));
-        _FDT((fdt_property_cell(fdt, "i-cache-line-size",
-                                env->icache_line_size)));
-
-        if (pcc->l1_dcache_size) {
-            _FDT((fdt_property_cell(fdt, "d-cache-size", pcc->l1_dcache_size)));
-        } else {
-            fprintf(stderr, "Warning: Unknown L1 dcache size for cpu\n");
-        }
-        if (pcc->l1_icache_size) {
-            _FDT((fdt_property_cell(fdt, "i-cache-size", pcc->l1_icache_size)));
-        } else {
-            fprintf(stderr, "Warning: Unknown L1 icache size for cpu\n");
-        }
-
-        _FDT((fdt_property_cell(fdt, "timebase-frequency", tbfreq)));
-        _FDT((fdt_property_cell(fdt, "clock-frequency", cpufreq)));
-        _FDT((fdt_property_cell(fdt, "ibm,slb-size", env->slb_nr)));
-        _FDT((fdt_property_string(fdt, "status", "okay")));
-        _FDT((fdt_property(fdt, "64-bit", NULL, 0)));
-
-        if (env->spr_cb[SPR_PURR].oea_read) {
-            _FDT((fdt_property(fdt, "ibm,purr", NULL, 0)));
-        }
-
-        if (env->mmu_model & POWERPC_MMU_1TSEG) {
-            _FDT((fdt_property(fdt, "ibm,processor-segment-sizes",
-                               segs, sizeof(segs))));
-        }
-
-        /* Advertise VMX/VSX (vector extensions) if available
-         *   0 / no property == no vector extensions
-         *   1               == VMX / Altivec available
-         *   2               == VSX available */
-        if (env->insns_flags & PPC_ALTIVEC) {
-            uint32_t vmx = (env->insns_flags2 & PPC2_VSX) ? 2 : 1;
-
-            _FDT((fdt_property_cell(fdt, "ibm,vmx", vmx)));
-        }
-
-        /* Advertise DFP (Decimal Floating Point) if available
-         *   0 / no property == no DFP
-         *   1               == DFP available */
-        if (env->insns_flags2 & PPC2_DFP) {
-            _FDT((fdt_property_cell(fdt, "ibm,dfp", 1)));
-        }
-
-        page_sizes_prop_size = create_page_sizes_prop(env, page_sizes_prop,
-                                                      sizeof(page_sizes_prop));
-        if (page_sizes_prop_size) {
-            _FDT((fdt_property(fdt, "ibm,segment-page-sizes",
-                               page_sizes_prop, page_sizes_prop_size)));
-        }
-
-        _FDT((fdt_property_cell(fdt, "ibm,chip-id",
-                                cs->cpu_index / cpus_per_socket)));
-
-        _FDT((fdt_end_node(fdt)));
-    }
-
-    _FDT((fdt_end_node(fdt)));
-
     /* RTAS */
     _FDT((fdt_begin_node(fdt, "rtas")));
 
@@ -739,6 +644,124 @@ static int spapr_populate_memory(sPAPREnvironment *spapr, void *fdt)
     return 0;
 }
 
+static void spapr_populate_cpu_dt(CPUState *cs, void *fdt, int offset)
+{
+    PowerPCCPU *cpu = POWERPC_CPU(cs);
+    CPUPPCState *env = &cpu->env;
+    PowerPCCPUClass *pcc = POWERPC_CPU_GET_CLASS(cs);
+    int index = ppc_get_vcpu_dt_id(cpu);
+    uint32_t segs[] = {cpu_to_be32(28), cpu_to_be32(40),
+                       0xffffffff, 0xffffffff};
+    uint32_t tbfreq = kvm_enabled() ? kvmppc_get_tbfreq() : TIMEBASE_FREQ;
+    uint32_t cpufreq = kvm_enabled() ? kvmppc_get_clockfreq() : 1000000000;
+    uint32_t page_sizes_prop[64];
+    size_t page_sizes_prop_size;
+    QemuOpts *opts = qemu_opts_find(qemu_find_opts("smp-opts"), NULL);
+    unsigned sockets = opts ? qemu_opt_get_number(opts, "sockets", 0) : 0;
+    uint32_t cpus_per_socket = sockets ? (smp_cpus / sockets) : 1;
+
+    _FDT((fdt_setprop_cell(fdt, offset, "reg", index)));
+    _FDT((fdt_setprop_string(fdt, offset, "device_type", "cpu")));
+
+    _FDT((fdt_setprop_cell(fdt, offset, "cpu-version", env->spr[SPR_PVR])));
+    _FDT((fdt_setprop_cell(fdt, offset, "d-cache-block-size",
+                            env->dcache_line_size)));
+    _FDT((fdt_setprop_cell(fdt, offset, "d-cache-line-size",
+                            env->dcache_line_size)));
+    _FDT((fdt_setprop_cell(fdt, offset, "i-cache-block-size",
+                            env->icache_line_size)));
+    _FDT((fdt_setprop_cell(fdt, offset, "i-cache-line-size",
+                            env->icache_line_size)));
+
+    if (pcc->l1_dcache_size) {
+        _FDT((fdt_setprop_cell(fdt, offset, "d-cache-size",
+                             pcc->l1_dcache_size)));
+    } else {
+        fprintf(stderr, "Warning: Unknown L1 dcache size for cpu\n");
+    }
+    if (pcc->l1_icache_size) {
+        _FDT((fdt_setprop_cell(fdt, offset, "i-cache-size",
+                             pcc->l1_icache_size)));
+    } else {
+        fprintf(stderr, "Warning: Unknown L1 icache size for cpu\n");
+    }
+
+    _FDT((fdt_setprop_cell(fdt, offset, "timebase-frequency", tbfreq)));
+    _FDT((fdt_setprop_cell(fdt, offset, "clock-frequency", cpufreq)));
+    _FDT((fdt_setprop_cell(fdt, offset, "ibm,slb-size", env->slb_nr)));
+    _FDT((fdt_setprop_string(fdt, offset, "status", "okay")));
+    _FDT((fdt_setprop(fdt, offset, "64-bit", NULL, 0)));
+
+    if (env->spr_cb[SPR_PURR].oea_read) {
+        _FDT((fdt_setprop(fdt, offset, "ibm,purr", NULL, 0)));
+    }
+
+    if (env->mmu_model & POWERPC_MMU_1TSEG) {
+        _FDT((fdt_setprop(fdt, offset, "ibm,processor-segment-sizes",
+                           segs, sizeof(segs))));
+    }
+
+    /* Advertise VMX/VSX (vector extensions) if available
+     *   0 / no property == no vector extensions
+     *   1               == VMX / Altivec available
+     *   2               == VSX available */
+    if (env->insns_flags & PPC_ALTIVEC) {
+        uint32_t vmx = (env->insns_flags2 & PPC2_VSX) ? 2 : 1;
+
+        _FDT((fdt_setprop_cell(fdt, offset, "ibm,vmx", vmx)));
+    }
+
+    /* Advertise DFP (Decimal Floating Point) if available
+     *   0 / no property == no DFP
+     *   1               == DFP available */
+    if (env->insns_flags2 & PPC2_DFP) {
+        _FDT((fdt_setprop_cell(fdt, offset, "ibm,dfp", 1)));
+    }
+
+    page_sizes_prop_size = create_page_sizes_prop(env, page_sizes_prop,
+                                                  sizeof(page_sizes_prop));
+    if (page_sizes_prop_size) {
+        _FDT((fdt_setprop(fdt, offset, "ibm,segment-page-sizes",
+                           page_sizes_prop, page_sizes_prop_size)));
+    }
+
+    _FDT((fdt_setprop_cell(fdt, offset, "ibm,chip-id",
+                            cs->cpu_index / cpus_per_socket)));
+
+    _FDT(spapr_fixup_cpu_numa_smt_dt(fdt, offset, cs, spapr));
+}
+
+static void spapr_populate_cpu_dt_node(void *fdt, sPAPREnvironment *spapr)
+{
+    CPUState *cs;
+    int cpus_offset;
+    char *nodename;
+    int smt = kvmppc_smt_threads();
+
+    cpus_offset = fdt_add_subnode(fdt, 0, "cpus");
+    _FDT(cpus_offset);
+    _FDT((fdt_setprop_cell(fdt, cpus_offset, "#address-cells", 0x1)));
+    _FDT((fdt_setprop_cell(fdt, cpus_offset, "#size-cells", 0x0)));
+
+    CPU_FOREACH(cs) {
+        PowerPCCPU *cpu = POWERPC_CPU(cs);
+        int index = ppc_get_vcpu_dt_id(cpu);
+        DeviceClass *dc = DEVICE_GET_CLASS(cs);
+        int offset;
+
+        if ((index % smt) != 0) {
+            continue;
+        }
+
+        nodename = g_strdup_printf("%s@%x", dc->fw_name, index);
+        offset = fdt_add_subnode(fdt, cpus_offset, nodename);
+        g_free(nodename);
+        _FDT(offset);
+        spapr_populate_cpu_dt(cs, fdt, offset);
+    }
+
+}
+
 static void spapr_finalize_fdt(sPAPREnvironment *spapr,
                                hwaddr fdt_addr,
                                hwaddr rtas_addr,
@@ -782,11 +805,8 @@ static void spapr_finalize_fdt(sPAPREnvironment *spapr,
         fprintf(stderr, "Couldn't set up RTAS device tree properties\n");
     }
 
-    /* Advertise NUMA via ibm,associativity */
-    ret = spapr_fixup_cpu_dt(fdt, spapr);
-    if (ret < 0) {
-        fprintf(stderr, "Couldn't finalize CPU device tree properties\n");
-    }
+    /* cpus */
+    spapr_populate_cpu_dt_node(fdt, spapr);
 
     bootlist = get_boot_devices_list(&cb, true);
     if (cb && bootlist) {
-- 
2.1.0
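
Editorial note on the "ibm,chip-id" and SMT arithmetic in the patch
above. This is a minimal standalone sketch, not code from the series:
the helper names are hypothetical, and only the expressions
(smp_cpus / sockets, cpu_index / cpus_per_socket, index % smt) are taken
from the diff. Like the patch, it does not guard against a socket count
larger than the CPU count.

```c
#include <stdint.h>

/* Hypothetical helper mirroring the patch's arithmetic: how many vCPUs
 * share one socket. Falls back to 1 when no socket count was given. */
static uint32_t cpus_per_socket(uint32_t smp_cpus, uint32_t sockets)
{
    return sockets ? smp_cpus / sockets : 1;
}

/* Value the patch would store in "ibm,chip-id" for a given cpu_index. */
static uint32_t chip_id(uint32_t cpu_index, uint32_t smp_cpus,
                        uint32_t sockets)
{
    return cpu_index / cpus_per_socket(smp_cpus, sockets);
}

/* Only the first thread of each core gets its own device tree node;
 * the patch skips every index that is not a multiple of smt. */
static int is_core_primary_thread(int dt_id, int smt)
{
    return (dt_id % smt) == 0;
}
```

With 8 vCPUs in 2 sockets, cpu_index 0..3 map to chip 0 and 4..7 to
chip 1. With no "sockets" option, every vCPU gets its own chip id.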


* [Qemu-devel] [RFC PATCH v2 06/23] spapr: Consolidate cpu init code into a routine
  2015-03-23 13:35 [Qemu-devel] [RFC PATCH v2 00/23] CPU and Memory hotplug for PowerPC sPAPR guests Bharata B Rao
                   ` (4 preceding siblings ...)
  2015-03-23 13:35 ` [Qemu-devel] [RFC PATCH v2 05/23] spapr: Reorganize CPU dt generation code Bharata B Rao
@ 2015-03-23 13:35 ` Bharata B Rao
  2015-03-25  1:37   ` David Gibson
  2015-03-23 13:35 ` [Qemu-devel] [RFC PATCH v2 07/23] cpu: Prepare Socket container type Bharata B Rao
                   ` (18 subsequent siblings)
  24 siblings, 1 reply; 74+ messages in thread
From: Bharata B Rao @ 2015-03-23 13:35 UTC (permalink / raw)
  To: qemu-devel
  Cc: Bharata B Rao, mdroth, agraf, qemu-ppc, tyreld, nfont, imammedo,
	afaerber, david

Factor out the sPAPR-specific CPU initialization code into a separate
routine so that it can also be called from the CPU hotplug path.

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
---
 hw/ppc/spapr.c | 54 +++++++++++++++++++++++++++++-------------------------
 1 file changed, 29 insertions(+), 25 deletions(-)
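
Editorial note: the consolidated routine clears bit 6 of msr_mask so
that MSR[IP] can never be set. The standalone sketch below shows that
the expression used in the patch clears only that bit. It assumes a
64-bit mask; the function and macro names are hypothetical.

```c
#include <stdint.h>

#define MSR_IP_BIT 6  /* bit position the patch uses for MSR[IP] */

/* Return msr_mask with the MSR[IP] bit cleared and all other bits kept. */
static uint64_t clear_msr_ip(uint64_t msr_mask)
{
    /* ~(1 << 6) is evaluated as an int (-65). When combined with a
     * 64-bit unsigned operand it converts to 0xFFFFFFFFFFFFFFBF, so the
     * upper 32 bits of the mask are preserved. */
    return msr_mask & ~(1 << MSR_IP_BIT);
}
```

An all-ones mask comes back as 0xFFFFFFFFFFFFFFBF, and a mask that has
only bit 6 set comes back as 0.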

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 1a25cc0..200dd75 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -1432,6 +1432,34 @@ static void spapr_drc_reset(void *opaque)
     }
 }
 
+static void spapr_cpu_init(PowerPCCPU *cpu)
+{
+    CPUPPCState *env = &cpu->env;
+
+    /* Set time-base frequency to 512 MHz */
+    cpu_ppc_tb_init(env, TIMEBASE_FREQ);
+
+    /* PAPR always has exception vectors in RAM not ROM. To ensure this,
+     * MSR[IP] should never be set.
+     */
+    env->msr_mask &= ~(1 << 6);
+
+    /* Tell KVM that we're in PAPR mode */
+    if (kvm_enabled()) {
+        kvmppc_set_papr(cpu);
+    }
+
+    if (cpu->max_compat) {
+        if (ppc_set_compat(cpu, cpu->max_compat) < 0) {
+            exit(1);
+        }
+    }
+
+    xics_cpu_setup(spapr->icp, cpu);
+
+    qemu_register_reset(spapr_cpu_reset, cpu);
+}
+
 /* pSeries LPAR / sPAPR hardware init */
 static void ppc_spapr_init(MachineState *machine)
 {
@@ -1443,7 +1471,6 @@ static void ppc_spapr_init(MachineState *machine)
     const char *initrd_filename = machine->initrd_filename;
     const char *boot_device = machine->boot_order;
     PowerPCCPU *cpu;
-    CPUPPCState *env;
     PCIHostState *phb;
     int i;
     MemoryRegion *sysmem = get_system_memory();
@@ -1528,30 +1555,7 @@ static void ppc_spapr_init(MachineState *machine)
             fprintf(stderr, "Unable to find PowerPC CPU definition\n");
             exit(1);
         }
-        env = &cpu->env;
-
-        /* Set time-base frequency to 512 MHz */
-        cpu_ppc_tb_init(env, TIMEBASE_FREQ);
-
-        /* PAPR always has exception vectors in RAM not ROM. To ensure this,
-         * MSR[IP] should never be set.
-         */
-        env->msr_mask &= ~(1 << 6);
-
-        /* Tell KVM that we're in PAPR mode */
-        if (kvm_enabled()) {
-            kvmppc_set_papr(cpu);
-        }
-
-        if (cpu->max_compat) {
-            if (ppc_set_compat(cpu, cpu->max_compat) < 0) {
-                exit(1);
-            }
-        }
-
-        xics_cpu_setup(spapr->icp, cpu);
-
-        qemu_register_reset(spapr_cpu_reset, cpu);
+        spapr_cpu_init(cpu);
     }
 
     /* allocate RAM */
-- 
2.1.0


* [Qemu-devel] [RFC PATCH v2 07/23] cpu: Prepare Socket container type
  2015-03-23 13:35 [Qemu-devel] [RFC PATCH v2 00/23] CPU and Memory hotplug for PowerPC sPAPR guests Bharata B Rao
                   ` (5 preceding siblings ...)
  2015-03-23 13:35 ` [Qemu-devel] [RFC PATCH v2 06/23] spapr: Consolidate cpu init code into a routine Bharata B Rao
@ 2015-03-23 13:35 ` Bharata B Rao
  2015-03-25  2:03   ` David Gibson
  2015-03-23 13:35 ` [Qemu-devel] [RFC PATCH v2 08/23] ppc: Prepare CPU socket/core abstraction Bharata B Rao
                   ` (17 subsequent siblings)
  24 siblings, 1 reply; 74+ messages in thread
From: Bharata B Rao @ 2015-03-23 13:35 UTC (permalink / raw)
  To: qemu-devel
  Cc: Bharata B Rao, mdroth, agraf, qemu-ppc, tyreld, nfont, imammedo,
	afaerber, david

From: Andreas Färber <afaerber@suse.de>

Signed-off-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
---
 hw/cpu/Makefile.objs    |  2 +-
 hw/cpu/socket.c         | 21 +++++++++++++++++++++
 include/hw/cpu/socket.h | 14 ++++++++++++++
 3 files changed, 36 insertions(+), 1 deletion(-)
 create mode 100644 hw/cpu/socket.c
 create mode 100644 include/hw/cpu/socket.h

diff --git a/hw/cpu/Makefile.objs b/hw/cpu/Makefile.objs
index 6381238..e6890cf 100644
--- a/hw/cpu/Makefile.objs
+++ b/hw/cpu/Makefile.objs
@@ -3,4 +3,4 @@ obj-$(CONFIG_REALVIEW) += realview_mpcore.o
 obj-$(CONFIG_A9MPCORE) += a9mpcore.o
 obj-$(CONFIG_A15MPCORE) += a15mpcore.o
 obj-$(CONFIG_ICC_BUS) += icc_bus.o
-
+obj-y += socket.o
diff --git a/hw/cpu/socket.c b/hw/cpu/socket.c
new file mode 100644
index 0000000..5ca47e9
--- /dev/null
+++ b/hw/cpu/socket.c
@@ -0,0 +1,21 @@
+/*
+ * CPU socket abstraction
+ *
+ * Copyright (c) 2013-2014 SUSE LINUX Products GmbH
+ * Copyright (c) 2015 SUSE Linux GmbH
+ */
+
+#include "hw/cpu/socket.h"
+
+static const TypeInfo cpu_socket_type_info = {
+    .name = TYPE_CPU_SOCKET,
+    .parent = TYPE_DEVICE,
+    .abstract = true,
+};
+
+static void cpu_socket_register_types(void)
+{
+    type_register_static(&cpu_socket_type_info);
+}
+
+type_init(cpu_socket_register_types)
diff --git a/include/hw/cpu/socket.h b/include/hw/cpu/socket.h
new file mode 100644
index 0000000..c8e0c18
--- /dev/null
+++ b/include/hw/cpu/socket.h
@@ -0,0 +1,14 @@
+/*
+ * CPU socket abstraction
+ *
+ * Copyright (c) 2013-2014 SUSE LINUX Products GmbH
+ * Copyright (c) 2015 SUSE Linux GmbH
+ */
+#ifndef HW_CPU_SOCKET_H
+#define HW_CPU_SOCKET_H
+
+#include "hw/qdev.h"
+
+#define TYPE_CPU_SOCKET "cpu-socket"
+
+#endif
-- 
2.1.0


* [Qemu-devel] [RFC PATCH v2 08/23] ppc: Prepare CPU socket/core abstraction
  2015-03-23 13:35 [Qemu-devel] [RFC PATCH v2 00/23] CPU and Memory hotplug for PowerPC sPAPR guests Bharata B Rao
                   ` (6 preceding siblings ...)
  2015-03-23 13:35 ` [Qemu-devel] [RFC PATCH v2 07/23] cpu: Prepare Socket container type Bharata B Rao
@ 2015-03-23 13:35 ` Bharata B Rao
  2015-03-25  2:06   ` David Gibson
  2015-03-23 13:35 ` [Qemu-devel] [RFC PATCH v2 09/23] spapr: Add CPU hotplug handler Bharata B Rao
                   ` (16 subsequent siblings)
  24 siblings, 1 reply; 74+ messages in thread
From: Bharata B Rao @ 2015-03-23 13:35 UTC (permalink / raw)
  To: qemu-devel
  Cc: Bharata B Rao, mdroth, agraf, qemu-ppc, tyreld, nfont, imammedo,
	afaerber, david

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
---
 hw/ppc/Makefile.objs        |  1 +
 hw/ppc/cpu-core.c           | 46 ++++++++++++++++++++++++++++++++++++++++++++
 hw/ppc/cpu-socket.c         | 47 +++++++++++++++++++++++++++++++++++++++++++++
 include/hw/ppc/cpu-core.h   | 32 ++++++++++++++++++++++++++++++
 include/hw/ppc/cpu-socket.h | 32 ++++++++++++++++++++++++++++++
 5 files changed, 158 insertions(+)
 create mode 100644 hw/ppc/cpu-core.c
 create mode 100644 hw/ppc/cpu-socket.c
 create mode 100644 include/hw/ppc/cpu-core.h
 create mode 100644 include/hw/ppc/cpu-socket.h

diff --git a/hw/ppc/Makefile.objs b/hw/ppc/Makefile.objs
index c8ab06e..a35cac5 100644
--- a/hw/ppc/Makefile.objs
+++ b/hw/ppc/Makefile.objs
@@ -1,5 +1,6 @@
 # shared objects
 obj-y += ppc.o ppc_booke.o
+obj-y += cpu-socket.o cpu-core.o
 # IBM pSeries (sPAPR)
 obj-$(CONFIG_PSERIES) += spapr.o spapr_vio.o spapr_events.o
 obj-$(CONFIG_PSERIES) += spapr_hcall.o spapr_iommu.o spapr_rtas.o
diff --git a/hw/ppc/cpu-core.c b/hw/ppc/cpu-core.c
new file mode 100644
index 0000000..ed0481f
--- /dev/null
+++ b/hw/ppc/cpu-core.c
@@ -0,0 +1,46 @@
+/*
+ * ppc CPU core abstraction
+ *
+ * Copyright (c) 2015 SUSE Linux GmbH
+ * Copyright (C) 2015 Bharata B Rao <bharata@linux.vnet.ibm.com>
+ */
+
+#include "hw/qdev.h"
+#include "hw/ppc/cpu-core.h"
+
+static int ppc_cpu_core_realize_child(Object *child, void *opaque)
+{
+    Error **errp = opaque;
+
+    object_property_set_bool(child, true, "realized", errp);
+    if (*errp) {
+        return 1;
+    }
+
+    return 0;
+}
+
+static void ppc_cpu_core_realize(DeviceState *dev, Error **errp)
+{
+    object_child_foreach(OBJECT(dev), ppc_cpu_core_realize_child, errp);
+}
+
+static void ppc_cpu_core_class_init(ObjectClass *oc, void *data)
+{
+    DeviceClass *dc = DEVICE_CLASS(oc);
+
+    dc->realize = ppc_cpu_core_realize;
+}
+
+static const TypeInfo ppc_cpu_core_type_info = {
+    .name = TYPE_POWERPC_CPU_CORE,
+    .parent = TYPE_DEVICE,
+    .class_init = ppc_cpu_core_class_init,
+};
+
+static void ppc_cpu_core_register_types(void)
+{
+    type_register_static(&ppc_cpu_core_type_info);
+}
+
+type_init(ppc_cpu_core_register_types)
diff --git a/hw/ppc/cpu-socket.c b/hw/ppc/cpu-socket.c
new file mode 100644
index 0000000..602a060
--- /dev/null
+++ b/hw/ppc/cpu-socket.c
@@ -0,0 +1,47 @@
+/*
+ * PPC CPU socket abstraction
+ *
+ * Copyright (c) 2015 SUSE Linux GmbH
+ * Copyright (C) 2015 Bharata B Rao <bharata@linux.vnet.ibm.com>
+ */
+
+#include "hw/qdev.h"
+#include "hw/ppc/cpu-socket.h"
+#include "sysemu/cpus.h"
+
+static int ppc_cpu_socket_realize_child(Object *child, void *opaque)
+{
+    Error **errp = opaque;
+
+    object_property_set_bool(child, true, "realized", errp);
+    if (*errp) {
+        return 1;
+    } else {
+        return 0;
+    }
+}
+
+static void ppc_cpu_socket_realize(DeviceState *dev, Error **errp)
+{
+    object_child_foreach(OBJECT(dev), ppc_cpu_socket_realize_child, errp);
+}
+
+static void ppc_cpu_socket_class_init(ObjectClass *oc, void *data)
+{
+    DeviceClass *dc = DEVICE_CLASS(oc);
+
+    dc->realize = ppc_cpu_socket_realize;
+}
+
+static const TypeInfo ppc_cpu_socket_type_info = {
+    .name = TYPE_POWERPC_CPU_SOCKET,
+    .parent = TYPE_CPU_SOCKET,
+    .class_init = ppc_cpu_socket_class_init,
+};
+
+static void ppc_cpu_socket_register_types(void)
+{
+    type_register_static(&ppc_cpu_socket_type_info);
+}
+
+type_init(ppc_cpu_socket_register_types)
diff --git a/include/hw/ppc/cpu-core.h b/include/hw/ppc/cpu-core.h
new file mode 100644
index 0000000..95f1c28
--- /dev/null
+++ b/include/hw/ppc/cpu-core.h
@@ -0,0 +1,32 @@
+/*
+ * PowerPC CPU core abstraction
+ *
+ * Copyright (c) 2015 SUSE Linux GmbH
+ * Copyright (C) 2015 Bharata B Rao <bharata@linux.vnet.ibm.com>
+ */
+#ifndef HW_PPC_CPU_CORE_H
+#define HW_PPC_CPU_CORE_H
+
+#include "hw/qdev.h"
+#include "cpu.h"
+
+#ifdef TARGET_PPC64
+#define TYPE_POWERPC_CPU_CORE "powerpc64-cpu-core"
+#elif defined(TARGET_PPCEMB)
+#define TYPE_POWERPC_CPU_CORE "embedded-powerpc-cpu-core"
+#else
+#define TYPE_POWERPC_CPU_CORE "powerpc-cpu-core"
+#endif
+
+#define POWERPC_CPU_CORE(obj) \
+    OBJECT_CHECK(PowerPCCPUCore, (obj), TYPE_POWERPC_CPU_CORE)
+
+typedef struct PowerPCCPUCore {
+    /*< private >*/
+    DeviceState parent_obj;
+    /*< public >*/
+
+    PowerPCCPU thread[0];
+} PowerPCCPUCore;
+
+#endif
diff --git a/include/hw/ppc/cpu-socket.h b/include/hw/ppc/cpu-socket.h
new file mode 100644
index 0000000..5ae19d0
--- /dev/null
+++ b/include/hw/ppc/cpu-socket.h
@@ -0,0 +1,32 @@
+/*
+ * PowerPC CPU socket abstraction
+ *
+ * Copyright (c) 2015 SUSE Linux GmbH
+ * Copyright (C) 2015 Bharata B Rao <bharata@linux.vnet.ibm.com>
+ */
+#ifndef HW_PPC_CPU_SOCKET_H
+#define HW_PPC_CPU_SOCKET_H
+
+#include "hw/cpu/socket.h"
+#include "cpu-core.h"
+
+#ifdef TARGET_PPC64
+#define TYPE_POWERPC_CPU_SOCKET "powerpc64-cpu-socket"
+#elif defined(TARGET_PPCEMB)
+#define TYPE_POWERPC_CPU_SOCKET "embedded-powerpc-cpu-socket"
+#else
+#define TYPE_POWERPC_CPU_SOCKET "powerpc-cpu-socket"
+#endif
+
+#define POWERPC_CPU_SOCKET(obj) \
+    OBJECT_CHECK(PowerPCCPUSocket, (obj), TYPE_POWERPC_CPU_SOCKET)
+
+typedef struct PowerPCCPUSocket {
+    /*< private >*/
+    DeviceState parent_obj;
+    /*< public >*/
+
+    PowerPCCPUCore core[0];
+} PowerPCCPUSocket;
+
+#endif
-- 
2.1.0


* [Qemu-devel] [RFC PATCH v2 09/23] spapr: Add CPU hotplug handler
  2015-03-23 13:35 [Qemu-devel] [RFC PATCH v2 00/23] CPU and Memory hotplug for PowerPC sPAPR guests Bharata B Rao
                   ` (7 preceding siblings ...)
  2015-03-23 13:35 ` [Qemu-devel] [RFC PATCH v2 08/23] ppc: Prepare CPU socket/core abstraction Bharata B Rao
@ 2015-03-23 13:35 ` Bharata B Rao
  2015-03-25  2:08   ` David Gibson
  2015-03-23 13:35 ` [Qemu-devel] [RFC PATCH v2 10/23] ppc: Update cpu_model in MachineState Bharata B Rao
                   ` (15 subsequent siblings)
  24 siblings, 1 reply; 74+ messages in thread
From: Bharata B Rao @ 2015-03-23 13:35 UTC (permalink / raw)
  To: qemu-devel
  Cc: Bharata B Rao, mdroth, agraf, qemu-ppc, tyreld, nfont, imammedo,
	afaerber, david

Add a CPU hotplug handler to the spapr machine class and let its plug
handler perform the sPAPR-specific initialization of a realized CPU.
This lets the CPU boot path and the hotplug path share as much code as
possible.

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
---
 hw/ppc/spapr.c | 25 ++++++++++++++++++++++++-
 1 file changed, 24 insertions(+), 1 deletion(-)
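
Editorial note: the dispatch this patch sets up can be modelled in a few
lines of standalone C. This is a simplified sketch with hypothetical
types (Device, Machine, device_realized), not the QOM or HotplugHandler
API. It assumes the plug callback is reached whenever a device is
realized, which is why the same init code can serve boot and hotplug.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins for QEMU's device and machine. */
typedef enum { DEV_CPU, DEV_OTHER } DevKind;

typedef struct Device {
    DevKind kind;
    int initialized;   /* set once machine-specific init has run */
} Device;

typedef struct Machine Machine;
typedef void (*PlugFn)(Machine *m, Device *dev);

struct Machine {
    PlugFn plug;       /* plays the role of hc->plug */
};

/* Shared init, standing in for spapr_cpu_init(). */
static void machine_cpu_init(Device *dev)
{
    dev->initialized = 1;
}

/* Plays the role of spapr_machine_device_plug(): act only on CPUs. */
static void machine_plug(Machine *m, Device *dev)
{
    (void)m;
    if (dev->kind == DEV_CPU) {
        machine_cpu_init(dev);
    }
}

/* Plays the role of the get_hotplug_handler hook: claim CPUs only. */
static Machine *get_hotplug_handler(Machine *m, Device *dev)
{
    return dev->kind == DEV_CPU ? m : NULL;
}

/* Runs when a device is realized, whether at boot or via device_add. */
static void device_realized(Machine *m, Device *dev)
{
    Machine *handler = get_hotplug_handler(m, dev);

    if (handler) {
        handler->plug(handler, dev);
    }
}
```

Realizing a DEV_CPU device through device_realized() marks it
initialized; a DEV_OTHER device is left untouched because no handler is
returned for it.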

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 200dd75..6650f82 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -1555,7 +1555,6 @@ static void ppc_spapr_init(MachineState *machine)
             fprintf(stderr, "Unable to find PowerPC CPU definition\n");
             exit(1);
         }
-        spapr_cpu_init(cpu);
     }
 
     /* allocate RAM */
@@ -1841,12 +1840,33 @@ static void spapr_nmi(NMIState *n, int cpu_index, Error **errp)
     }
 }
 
+static void spapr_machine_device_plug(HotplugHandler *hotplug_dev,
+                                      DeviceState *dev, Error **errp)
+{
+    if (object_dynamic_cast(OBJECT(dev), TYPE_CPU)) {
+        CPUState *cs = CPU(dev);
+        PowerPCCPU *cpu = POWERPC_CPU(cs);
+
+        spapr_cpu_init(cpu);
+    }
+}
+
+static HotplugHandler *spapr_get_hotpug_handler(MachineState *machine,
+                                             DeviceState *dev)
+{
+    if (object_dynamic_cast(OBJECT(dev), TYPE_CPU)) {
+        return HOTPLUG_HANDLER(machine);
+    }
+    return NULL;
+}
+
 static void spapr_machine_class_init(ObjectClass *oc, void *data)
 {
     MachineClass *mc = MACHINE_CLASS(oc);
     sPAPRMachineClass *smc = SPAPR_MACHINE_CLASS(oc);
     FWPathProviderClass *fwc = FW_PATH_PROVIDER_CLASS(oc);
     NMIClass *nc = NMI_CLASS(oc);
+    HotplugHandlerClass *hc = HOTPLUG_HANDLER_CLASS(oc);
 
     mc->init = ppc_spapr_init;
     mc->reset = ppc_spapr_reset;
@@ -1856,6 +1876,8 @@ static void spapr_machine_class_init(ObjectClass *oc, void *data)
     mc->default_boot_order = NULL;
     mc->kvm_type = spapr_kvm_type;
     mc->has_dynamic_sysbus = true;
+    mc->get_hotplug_handler = spapr_get_hotpug_handler;
+    hc->plug = spapr_machine_device_plug;
     smc->dr_phb_enabled = false;
     smc->dr_cpu_enabled = false;
     smc->dr_lmb_enabled = false;
@@ -1875,6 +1897,7 @@ static const TypeInfo spapr_machine_info = {
     .interfaces = (InterfaceInfo[]) {
         { TYPE_FW_PATH_PROVIDER },
         { TYPE_NMI },
+        { TYPE_HOTPLUG_HANDLER },
         { }
     },
 };
-- 
2.1.0


* [Qemu-devel] [RFC PATCH v2 10/23] ppc: Update cpu_model in MachineState
  2015-03-23 13:35 [Qemu-devel] [RFC PATCH v2 00/23] CPU and Memory hotplug for PowerPC sPAPR guests Bharata B Rao
                   ` (8 preceding siblings ...)
  2015-03-23 13:35 ` [Qemu-devel] [RFC PATCH v2 09/23] spapr: Add CPU hotplug handler Bharata B Rao
@ 2015-03-23 13:35 ` Bharata B Rao
  2015-03-25  2:30   ` David Gibson
  2015-03-23 13:35 ` [Qemu-devel] [RFC PATCH v2 11/23] ppc: Create sockets and cores for CPUs Bharata B Rao
                   ` (14 subsequent siblings)
  24 siblings, 1 reply; 74+ messages in thread
From: Bharata B Rao @ 2015-03-23 13:35 UTC (permalink / raw)
  To: qemu-devel
  Cc: Bharata B Rao, mdroth, agraf, qemu-ppc, tyreld, nfont, imammedo,
	afaerber, david

Keep the cpu_model field in MachineState up to date so that it can be
used from the CPU hotplug path.

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
---
 hw/ppc/mac_newworld.c  | 10 +++++-----
 hw/ppc/mac_oldworld.c  |  7 +++----
 hw/ppc/ppc440_bamboo.c |  7 +++----
 hw/ppc/prep.c          |  7 +++----
 hw/ppc/spapr.c         |  7 +++----
 hw/ppc/virtex_ml507.c  |  7 +++----
 6 files changed, 20 insertions(+), 25 deletions(-)
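
Editorial note: the point of writing the default back into MachineState
rather than into a local variable can be shown with a small standalone
sketch. The struct and function names are hypothetical; only the
"default when NULL" logic mirrors the diffs below.

```c
#include <stddef.h>

/* Hypothetical stand-in for the relevant part of MachineState. */
typedef struct Machine {
    const char *cpu_model;   /* NULL when no CPU model was requested */
} Machine;

/* Boot path: store the default in the machine, not in a local copy. */
static void machine_init(Machine *m, const char *default_model)
{
    if (m->cpu_model == NULL) {
        m->cpu_model = default_model;
    }
}

/* Hotplug path: runs later and only has the machine to consult. */
static const char *hotplug_cpu_model(const Machine *m)
{
    return m->cpu_model;
}
```

After machine_init() the hotplug path sees the same model the boot path
used, whether it came from the user or from the board default.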

diff --git a/hw/ppc/mac_newworld.c b/hw/ppc/mac_newworld.c
index 624b4ab..fe18bce 100644
--- a/hw/ppc/mac_newworld.c
+++ b/hw/ppc/mac_newworld.c
@@ -145,7 +145,6 @@ static void ppc_core99_reset(void *opaque)
 static void ppc_core99_init(MachineState *machine)
 {
     ram_addr_t ram_size = machine->ram_size;
-    const char *cpu_model = machine->cpu_model;
     const char *kernel_filename = machine->kernel_filename;
     const char *kernel_cmdline = machine->kernel_cmdline;
     const char *initrd_filename = machine->initrd_filename;
@@ -182,14 +181,15 @@ static void ppc_core99_init(MachineState *machine)
     linux_boot = (kernel_filename != NULL);
 
     /* init CPUs */
-    if (cpu_model == NULL)
+    if (machine->cpu_model == NULL) {
 #ifdef TARGET_PPC64
-        cpu_model = "970fx";
+        machine->cpu_model = "970fx";
 #else
-        cpu_model = "G4";
+        machine->cpu_model = "G4";
 #endif
+    }
     for (i = 0; i < smp_cpus; i++) {
-        cpu = cpu_ppc_init(cpu_model);
+        cpu = cpu_ppc_init(machine->cpu_model);
         if (cpu == NULL) {
             fprintf(stderr, "Unable to find PowerPC CPU definition\n");
             exit(1);
diff --git a/hw/ppc/mac_oldworld.c b/hw/ppc/mac_oldworld.c
index 3079510..2732319 100644
--- a/hw/ppc/mac_oldworld.c
+++ b/hw/ppc/mac_oldworld.c
@@ -75,7 +75,6 @@ static void ppc_heathrow_reset(void *opaque)
 static void ppc_heathrow_init(MachineState *machine)
 {
     ram_addr_t ram_size = machine->ram_size;
-    const char *cpu_model = machine->cpu_model;
     const char *kernel_filename = machine->kernel_filename;
     const char *kernel_cmdline = machine->kernel_cmdline;
     const char *initrd_filename = machine->initrd_filename;
@@ -107,10 +106,10 @@ static void ppc_heathrow_init(MachineState *machine)
     linux_boot = (kernel_filename != NULL);
 
     /* init CPUs */
-    if (cpu_model == NULL)
-        cpu_model = "G3";
+    if (machine->cpu_model == NULL)
+        machine->cpu_model = "G3";
     for (i = 0; i < smp_cpus; i++) {
-        cpu = cpu_ppc_init(cpu_model);
+        cpu = cpu_ppc_init(machine->cpu_model);
         if (cpu == NULL) {
             fprintf(stderr, "Unable to find PowerPC CPU definition\n");
             exit(1);
diff --git a/hw/ppc/ppc440_bamboo.c b/hw/ppc/ppc440_bamboo.c
index 778970a..032fa80 100644
--- a/hw/ppc/ppc440_bamboo.c
+++ b/hw/ppc/ppc440_bamboo.c
@@ -159,7 +159,6 @@ static void main_cpu_reset(void *opaque)
 static void bamboo_init(MachineState *machine)
 {
     ram_addr_t ram_size = machine->ram_size;
-    const char *cpu_model = machine->cpu_model;
     const char *kernel_filename = machine->kernel_filename;
     const char *kernel_cmdline = machine->kernel_cmdline;
     const char *initrd_filename = machine->initrd_filename;
@@ -184,10 +183,10 @@ static void bamboo_init(MachineState *machine)
     int i;
 
     /* Setup CPU. */
-    if (cpu_model == NULL) {
-        cpu_model = "440EP";
+    if (machine->cpu_model == NULL) {
+        machine->cpu_model = "440EP";
     }
-    cpu = cpu_ppc_init(cpu_model);
+    cpu = cpu_ppc_init(machine->cpu_model);
     if (cpu == NULL) {
         fprintf(stderr, "Unable to initialize CPU!\n");
         exit(1);
diff --git a/hw/ppc/prep.c b/hw/ppc/prep.c
index 15df7f3..55e9643 100644
--- a/hw/ppc/prep.c
+++ b/hw/ppc/prep.c
@@ -364,7 +364,6 @@ static PortioList prep_port_list;
 static void ppc_prep_init(MachineState *machine)
 {
     ram_addr_t ram_size = machine->ram_size;
-    const char *cpu_model = machine->cpu_model;
     const char *kernel_filename = machine->kernel_filename;
     const char *kernel_cmdline = machine->kernel_cmdline;
     const char *initrd_filename = machine->initrd_filename;
@@ -396,10 +395,10 @@ static void ppc_prep_init(MachineState *machine)
     linux_boot = (kernel_filename != NULL);
 
     /* init CPUs */
-    if (cpu_model == NULL)
-        cpu_model = "602";
+    if (machine->cpu_model == NULL)
+        machine->cpu_model = "602";
     for (i = 0; i < smp_cpus; i++) {
-        cpu = cpu_ppc_init(cpu_model);
+        cpu = cpu_ppc_init(machine->cpu_model);
         if (cpu == NULL) {
             fprintf(stderr, "Unable to find PowerPC CPU definition\n");
             exit(1);
diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 6650f82..16b67f4 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -1465,7 +1465,6 @@ static void ppc_spapr_init(MachineState *machine)
 {
     sPAPRMachineClass *smc = SPAPR_MACHINE_GET_CLASS(machine);
     ram_addr_t ram_size = machine->ram_size;
-    const char *cpu_model = machine->cpu_model;
     const char *kernel_filename = machine->kernel_filename;
     const char *kernel_cmdline = machine->kernel_cmdline;
     const char *initrd_filename = machine->initrd_filename;
@@ -1546,11 +1545,11 @@ static void ppc_spapr_init(MachineState *machine)
                                   XICS_IRQS);
 
     /* init CPUs */
-    if (cpu_model == NULL) {
-        cpu_model = kvm_enabled() ? "host" : "POWER7";
+    if (machine->cpu_model == NULL) {
+        machine->cpu_model = kvm_enabled() ? "host" : "POWER7";
     }
     for (i = 0; i < smp_cpus; i++) {
-        cpu = cpu_ppc_init(cpu_model);
+        cpu = cpu_ppc_init(machine->cpu_model);
         if (cpu == NULL) {
             fprintf(stderr, "Unable to find PowerPC CPU definition\n");
             exit(1);
diff --git a/hw/ppc/virtex_ml507.c b/hw/ppc/virtex_ml507.c
index 6ebd5be..f33d398 100644
--- a/hw/ppc/virtex_ml507.c
+++ b/hw/ppc/virtex_ml507.c
@@ -197,7 +197,6 @@ static int xilinx_load_device_tree(hwaddr addr,
 static void virtex_init(MachineState *machine)
 {
     ram_addr_t ram_size = machine->ram_size;
-    const char *cpu_model = machine->cpu_model;
     const char *kernel_filename = machine->kernel_filename;
     const char *kernel_cmdline = machine->kernel_cmdline;
     hwaddr initrd_base = 0;
@@ -214,11 +213,11 @@ static void virtex_init(MachineState *machine)
     int i;
 
     /* init CPUs */
-    if (cpu_model == NULL) {
-        cpu_model = "440-Xilinx";
+    if (machine->cpu_model == NULL) {
+        machine->cpu_model = "440-Xilinx";
     }
 
-    cpu = ppc440_init_xilinx(&ram_size, 1, cpu_model, 400000000);
+    cpu = ppc440_init_xilinx(&ram_size, 1, machine->cpu_model, 400000000);
     env = &cpu->env;
     qemu_register_reset(main_cpu_reset, cpu);
 
-- 
2.1.0


* [Qemu-devel] [RFC PATCH v2 11/23] ppc: Create sockets and cores for CPUs
  2015-03-23 13:35 [Qemu-devel] [RFC PATCH v2 00/23] CPU and Memory hotplug for PowerPC sPAPR guests Bharata B Rao
                   ` (9 preceding siblings ...)
  2015-03-23 13:35 ` [Qemu-devel] [RFC PATCH v2 10/23] ppc: Update cpu_model in MachineState Bharata B Rao
@ 2015-03-23 13:35 ` Bharata B Rao
  2015-03-25  2:39   ` David Gibson
  2015-03-23 13:35 ` [Qemu-devel] [RFC PATCH v2 12/23] spapr: CPU hotplug support Bharata B Rao
                   ` (13 subsequent siblings)
  24 siblings, 1 reply; 74+ messages in thread
From: Bharata B Rao @ 2015-03-23 13:35 UTC (permalink / raw)
  To: qemu-devel
  Cc: Bharata B Rao, mdroth, agraf, qemu-ppc, tyreld, nfont, imammedo,
	afaerber, david

The ppc machine init functions create individual CPU threads. Change
this for sPAPR by switching to socket creation: the socket and core
instance init routines then create the CPUs recursively.

TODO: Socket-level CPU creation is currently implemented only for the
sPAPR target.

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
---
 hw/ppc/cpu-core.c           | 17 +++++++++++++++++
 hw/ppc/cpu-socket.c         | 15 +++++++++++++++
 hw/ppc/spapr.c              | 15 ++++++++-------
 target-ppc/cpu.h            |  1 +
 target-ppc/translate_init.c | 46 +++++++++++++++++++++++++++++++++++++++++++++
 5 files changed, 87 insertions(+), 7 deletions(-)

diff --git a/hw/ppc/cpu-core.c b/hw/ppc/cpu-core.c
index ed0481f..f60646d 100644
--- a/hw/ppc/cpu-core.c
+++ b/hw/ppc/cpu-core.c
@@ -7,6 +7,8 @@
 
 #include "hw/qdev.h"
 #include "hw/ppc/cpu-core.h"
+#include "hw/boards.h"
+#include "sysemu/cpus.h"
 
 static int ppc_cpu_core_realize_child(Object *child, void *opaque)
 {
@@ -32,10 +34,25 @@ static void ppc_cpu_core_class_init(ObjectClass *oc, void *data)
     dc->realize = ppc_cpu_core_realize;
 }
 
+static void ppc_cpu_core_instance_init(Object *obj)
+{
+    int i;
+    PowerPCCPU *cpu = NULL;
+    MachineState *machine = MACHINE(qdev_get_machine());
+
+    for (i = 0; i < smp_threads; i++) {
+        cpu = POWERPC_CPU(cpu_ppc_create(TYPE_POWERPC_CPU, machine->cpu_model));
+        object_property_add_child(obj, "thread[*]", OBJECT(cpu), &error_abort);
+        object_unref(OBJECT(cpu));
+    }
+}
+
 static const TypeInfo ppc_cpu_core_type_info = {
     .name = TYPE_POWERPC_CPU_CORE,
     .parent = TYPE_DEVICE,
     .class_init = ppc_cpu_core_class_init,
+    .instance_init = ppc_cpu_core_instance_init,
+    .instance_size = sizeof(PowerPCCPUCore),
 };
 
 static void ppc_cpu_core_register_types(void)
diff --git a/hw/ppc/cpu-socket.c b/hw/ppc/cpu-socket.c
index 602a060..f901336 100644
--- a/hw/ppc/cpu-socket.c
+++ b/hw/ppc/cpu-socket.c
@@ -8,6 +8,7 @@
 #include "hw/qdev.h"
 #include "hw/ppc/cpu-socket.h"
 #include "sysemu/cpus.h"
+#include "cpu.h"
 
 static int ppc_cpu_socket_realize_child(Object *child, void *opaque)
 {
@@ -33,10 +34,24 @@ static void ppc_cpu_socket_class_init(ObjectClass *oc, void *data)
     dc->realize = ppc_cpu_socket_realize;
 }
 
+static void ppc_cpu_socket_instance_init(Object *obj)
+{
+    int i;
+    Object *core;
+
+    for (i = 0; i < smp_cores; i++) {
+        core = object_new(TYPE_POWERPC_CPU_CORE);
+        object_property_add_child(obj, "core[*]", core, &error_abort);
+        object_unref(core);
+    }
+}
+
 static const TypeInfo ppc_cpu_socket_type_info = {
     .name = TYPE_POWERPC_CPU_SOCKET,
     .parent = TYPE_CPU_SOCKET,
     .class_init = ppc_cpu_socket_class_init,
+    .instance_init = ppc_cpu_socket_instance_init,
+    .instance_size = sizeof(PowerPCCPUSocket),
 };
 
 static void ppc_cpu_socket_register_types(void)
diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 16b67f4..f52d38f 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -37,6 +37,7 @@
 #include "mmu-hash64.h"
 #include "qom/cpu.h"
 
+#include "hw/ppc/cpu-socket.h"
 #include "hw/boards.h"
 #include "hw/ppc/ppc.h"
 #include "hw/loader.h"
@@ -1469,7 +1470,6 @@ static void ppc_spapr_init(MachineState *machine)
     const char *kernel_cmdline = machine->kernel_cmdline;
     const char *initrd_filename = machine->initrd_filename;
     const char *boot_device = machine->boot_order;
-    PowerPCCPU *cpu;
     PCIHostState *phb;
     int i;
     MemoryRegion *sysmem = get_system_memory();
@@ -1484,6 +1484,8 @@ static void ppc_spapr_init(MachineState *machine)
     bool kernel_le = false;
     char *filename;
     int smt = kvmppc_smt_threads();
+    Object *socket;
+    int sockets;
 
     msi_supported = true;
 
@@ -1548,12 +1550,11 @@ static void ppc_spapr_init(MachineState *machine)
     if (machine->cpu_model == NULL) {
         machine->cpu_model = kvm_enabled() ? "host" : "POWER7";
     }
-    for (i = 0; i < smp_cpus; i++) {
-        cpu = cpu_ppc_init(machine->cpu_model);
-        if (cpu == NULL) {
-            fprintf(stderr, "Unable to find PowerPC CPU definition\n");
-            exit(1);
-        }
+
+    sockets = smp_cpus / smp_cores / smp_threads;
+    for (i = 0; i < sockets; i++) {
+        socket = object_new(TYPE_POWERPC_CPU_SOCKET);
+        object_property_set_bool(socket, true, "realized", &error_abort);
     }
 
     /* allocate RAM */
diff --git a/target-ppc/cpu.h b/target-ppc/cpu.h
index abc3545..f15cc2c 100644
--- a/target-ppc/cpu.h
+++ b/target-ppc/cpu.h
@@ -1162,6 +1162,7 @@ do {                                            \
 
 /*****************************************************************************/
 PowerPCCPU *cpu_ppc_init(const char *cpu_model);
+CPUState *cpu_ppc_create(const char *typename, const char *cpu_model);
 void ppc_translate_init(void);
 void gen_update_current_nip(void *opaque);
 int cpu_ppc_exec (CPUPPCState *s);
diff --git a/target-ppc/translate_init.c b/target-ppc/translate_init.c
index d74f4f0..a8716cf 100644
--- a/target-ppc/translate_init.c
+++ b/target-ppc/translate_init.c
@@ -9365,6 +9365,52 @@ static ObjectClass *ppc_cpu_class_by_name(const char *name)
     return NULL;
 }
 
+/*
+ * This is essentially same as cpu_generic_init() but without a set
+ * realize call.
+ */
+CPUState *cpu_ppc_create(const char *typename, const char *cpu_model)
+{
+    char *str, *name, *featurestr;
+    CPUState *cpu;
+    ObjectClass *oc;
+    CPUClass *cc;
+    Error *err = NULL;
+
+    str = g_strdup(cpu_model);
+    name = strtok(str, ",");
+
+    oc = cpu_class_by_name(typename, name);
+    if (oc == NULL) {
+        g_free(str);
+        return NULL;
+    }
+
+    cpu = CPU(object_new(object_class_get_name(oc)));
+    cc = CPU_GET_CLASS(cpu);
+
+    featurestr = strtok(NULL, ",");
+    cc->parse_features(cpu, featurestr, &err);
+    g_free(str);
+    if (err != NULL) {
+        goto out;
+    }
+
+out:
+    if (err != NULL) {
+        error_report("%s", error_get_pretty(err));
+        error_free(err);
+        object_unref(OBJECT(cpu));
+        return NULL;
+    }
+
+    return cpu;
+}
+
+/*
+ * TODO: This can be removed when all powerpc targets are converted to
+ * socket level CPU realization.
+ */
 PowerPCCPU *cpu_ppc_init(const char *cpu_model)
 {
     return POWERPC_CPU(cpu_generic_init(TYPE_POWERPC_CPU, cpu_model));
-- 
2.1.0


* [Qemu-devel] [RFC PATCH v2 12/23] spapr: CPU hotplug support
  2015-03-23 13:35 [Qemu-devel] [RFC PATCH v2 00/23] CPU and Memory hotplug for PowerPC sPAPR guests Bharata B Rao
                   ` (10 preceding siblings ...)
  2015-03-23 13:35 ` [Qemu-devel] [RFC PATCH v2 11/23] ppc: Create sockets and cores for CPUs Bharata B Rao
@ 2015-03-23 13:35 ` Bharata B Rao
  2015-03-25  3:03   ` David Gibson
  2015-03-23 13:35 ` [Qemu-devel] [RFC PATCH v2 13/23] cpus: Add Error argument to cpu_exec_init() Bharata B Rao
                   ` (12 subsequent siblings)
  24 siblings, 1 reply; 74+ messages in thread
From: Bharata B Rao @ 2015-03-23 13:35 UTC (permalink / raw)
  To: qemu-devel
  Cc: Bharata B Rao, mdroth, agraf, qemu-ppc, tyreld, nfont, imammedo,
	afaerber, david

Support CPU hotplug via the device_add command. Set up device tree
entries for the hotplugged CPU core and use the existing EPOW event
infrastructure to send a CPU hotplug notification to the guest.

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
---
 hw/ppc/spapr.c        | 75 +++++++++++++++++++++++++++++++++++++++++++++++++++
 hw/ppc/spapr_events.c |  8 +++---
 hw/ppc/spapr_rtas.c   | 11 ++++++++
 3 files changed, 91 insertions(+), 3 deletions(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index f52d38f..b48994b 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -33,6 +33,7 @@
 #include "sysemu/block-backend.h"
 #include "sysemu/cpus.h"
 #include "sysemu/kvm.h"
+#include "sysemu/device_tree.h"
 #include "kvm_ppc.h"
 #include "mmu-hash64.h"
 #include "qom/cpu.h"
@@ -660,6 +661,10 @@ static void spapr_populate_cpu_dt(CPUState *cs, void *fdt, int offset)
     QemuOpts *opts = qemu_opts_find(qemu_find_opts("smp-opts"), NULL);
     unsigned sockets = opts ? qemu_opt_get_number(opts, "sockets", 0) : 0;
     uint32_t cpus_per_socket = sockets ? (smp_cpus / sockets) : 1;
+    sPAPRDRConnector *drc =
+        spapr_dr_connector_by_id(SPAPR_DR_CONNECTOR_TYPE_CPU, index);
+    sPAPRDRConnectorClass *drck = SPAPR_DR_CONNECTOR_GET_CLASS(drc);
+    int drc_index = drck->get_index(drc);
 
     _FDT((fdt_setprop_cell(fdt, offset, "reg", index)));
     _FDT((fdt_setprop_string(fdt, offset, "device_type", "cpu")));
@@ -728,6 +733,8 @@ static void spapr_populate_cpu_dt(CPUState *cs, void *fdt, int offset)
 
     _FDT((fdt_setprop_cell(fdt, offset, "ibm,chip-id",
                             cs->cpu_index / cpus_per_socket)));
+    _FDT((fdt_setprop_cell(fdt, offset, "ibm,my-drc-index", drc_index)));
+
 
     _FDT(spapr_fixup_cpu_numa_smt_dt(fdt, offset, cs, spapr));
 }
@@ -1840,6 +1847,70 @@ static void spapr_nmi(NMIState *n, int cpu_index, Error **errp)
     }
 }
 
+static void spapr_cpu_hotplug_add(DeviceState *dev, CPUState *cs, Error **errp)
+{
+    PowerPCCPU *cpu = POWERPC_CPU(cs);
+    DeviceClass *dc = DEVICE_GET_CLASS(cs);
+    int id = ppc_get_vcpu_dt_id(cpu);
+    sPAPRDRConnector *drc =
+        spapr_dr_connector_by_id(SPAPR_DR_CONNECTOR_TYPE_CPU, id);
+    sPAPRDRConnectorClass *drck = SPAPR_DR_CONNECTOR_GET_CLASS(drc);
+    void *fdt;
+    int offset, i, fdt_size;
+    char *nodename;
+
+    fdt = create_device_tree(&fdt_size);
+    nodename = g_strdup_printf("%s@%x", dc->fw_name, id);
+    offset = fdt_add_subnode(fdt, 0, nodename);
+
+    /* Set NUMA node for the added CPU core */
+    for (i = 0; i < nb_numa_nodes; i++) {
+        if (test_bit(cs->cpu_index, numa_info[i].node_cpu)) {
+            cs->numa_node = i;
+            break;
+        }
+    }
+
+    spapr_populate_cpu_dt(cs, fdt, offset);
+    g_free(nodename);
+
+    drck->attach(drc, dev, fdt, offset, !dev->hotplugged, errp);
+    if (*errp) {
+        g_free(fdt);
+    }
+}
+
+static void spapr_cpu_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
+                            Error **errp)
+{
+    CPUState *cs = CPU(dev);
+    PowerPCCPU *cpu = POWERPC_CPU(cs);
+    int id = ppc_get_vcpu_dt_id(cpu);
+    sPAPRDRConnector *drc =
+        spapr_dr_connector_by_id(SPAPR_DR_CONNECTOR_TYPE_CPU, id);
+    int smt = kvmppc_smt_threads();
+    Error *local_err = NULL;
+
+    /*
+     * SMT threads return from here, only main thread (core) will
+     * continue and signal hotplug event to the guest.
+     */
+    if ((id % smt) != 0) {
+        return;
+    }
+
+    g_assert(drc);
+
+    spapr_cpu_hotplug_add(dev, cs, &local_err);
+    if (local_err) {
+        error_propagate(errp, local_err);
+        return;
+    }
+    spapr_hotplug_req_add_event(drc);
+
+    return;
+}
+
 static void spapr_machine_device_plug(HotplugHandler *hotplug_dev,
                                       DeviceState *dev, Error **errp)
 {
@@ -1848,6 +1919,10 @@ static void spapr_machine_device_plug(HotplugHandler *hotplug_dev,
         PowerPCCPU *cpu = POWERPC_CPU(cs);
 
         spapr_cpu_init(cpu);
+        spapr_cpu_reset(cpu);
+        if (dev->hotplugged && spapr->dr_cpu_enabled) {
+            spapr_cpu_plug(hotplug_dev, dev, errp);
+        }
     }
 }
 
diff --git a/hw/ppc/spapr_events.c b/hw/ppc/spapr_events.c
index be82815..4ae818a 100644
--- a/hw/ppc/spapr_events.c
+++ b/hw/ppc/spapr_events.c
@@ -421,14 +421,16 @@ static void spapr_hotplug_req_event(sPAPRDRConnector *drc, uint8_t hp_action)
     hp->hdr.section_length = cpu_to_be16(sizeof(*hp));
     hp->hdr.section_version = 1; /* includes extended modifier */
     hp->hotplug_action = hp_action;
-
+    hp->drc.index = cpu_to_be32(drck->get_index(drc));
+    hp->hotplug_identifier = RTAS_LOG_V6_HP_ID_DRC_INDEX;
 
     switch (drc_type) {
     case SPAPR_DR_CONNECTOR_TYPE_PCI:
-        hp->drc.index = cpu_to_be32(drck->get_index(drc));
-        hp->hotplug_identifier = RTAS_LOG_V6_HP_ID_DRC_INDEX;
         hp->hotplug_type = RTAS_LOG_V6_HP_TYPE_PCI;
         break;
+    case SPAPR_DR_CONNECTOR_TYPE_CPU:
+        hp->hotplug_type = RTAS_LOG_V6_HP_TYPE_CPU;
+        break;
     default:
         /* we shouldn't be signaling hotplug events for resources
          * that don't support them
diff --git a/hw/ppc/spapr_rtas.c b/hw/ppc/spapr_rtas.c
index 57ec97a..48aeb86 100644
--- a/hw/ppc/spapr_rtas.c
+++ b/hw/ppc/spapr_rtas.c
@@ -121,6 +121,16 @@ static void rtas_query_cpu_stopped_state(PowerPCCPU *cpu_,
     rtas_st(rets, 0, RTAS_OUT_PARAM_ERROR);
 }
 
+static void spapr_cpu_set_endianness(PowerPCCPU *cpu)
+{
+    PowerPCCPU *fcpu = POWERPC_CPU(first_cpu);
+    PowerPCCPUClass *pcc = POWERPC_CPU_GET_CLASS(fcpu);
+
+    if (!(*pcc->interrupts_big_endian)(fcpu)) {
+        cpu->env.spr[SPR_LPCR] |= LPCR_ILE;
+    }
+}
+
 static void rtas_start_cpu(PowerPCCPU *cpu_, sPAPREnvironment *spapr,
                            uint32_t token, uint32_t nargs,
                            target_ulong args,
@@ -157,6 +167,7 @@ static void rtas_start_cpu(PowerPCCPU *cpu_, sPAPREnvironment *spapr,
         env->nip = start;
         env->gpr[3] = r3;
         cs->halted = 0;
+        spapr_cpu_set_endianness(cpu);
 
         qemu_cpu_kick(cs);
 
-- 
2.1.0


* [Qemu-devel] [RFC PATCH v2 13/23] cpus: Add Error argument to cpu_exec_init()
  2015-03-23 13:35 [Qemu-devel] [RFC PATCH v2 00/23] CPU and Memory hotplug for PowerPC sPAPR guests Bharata B Rao
                   ` (11 preceding siblings ...)
  2015-03-23 13:35 ` [Qemu-devel] [RFC PATCH v2 12/23] spapr: CPU hotplug support Bharata B Rao
@ 2015-03-23 13:35 ` Bharata B Rao
  2015-03-25  3:12   ` David Gibson
  2015-03-23 13:35 ` [Qemu-devel] [RFC PATCH v2 14/23] cpus: Convert cpu_index into a bitmap Bharata B Rao
                   ` (11 subsequent siblings)
  24 siblings, 1 reply; 74+ messages in thread
From: Bharata B Rao @ 2015-03-23 13:35 UTC (permalink / raw)
  To: qemu-devel
  Cc: Bharata B Rao, mdroth, agraf, qemu-ppc, tyreld, nfont, imammedo,
	afaerber, david

Add an Error argument to cpu_exec_init() so that callers can collect the
error. For now, all callers pass NULL for this argument. This change
is needed for the following reasons:

- A subsequent commit changes the CPU enumeration logic in cpu_exec_init()
  so that cpu_exec_init() fails if all cpu_index values up to max_cpus
  have already been handed out.
- There is a view that cpu_exec_init() should be called from realize
  rather than instance_init. With this change, those architectures
  that can move this call into their realize function can do so in a
  phased manner.
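
The calling convention can be sketched in isolation as follows. This is
not QEMU's Error API: SketchError and sketch_exec_init() are hypothetical,
and unlike cpu_exec_init() the sketch also returns a value. It only shows
how a caller that passes NULL ignores the failure details while a caller
that passes a pointer can collect them.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical minimal error object; QEMU's Error type is richer. */
typedef struct SketchError {
    const char *msg;
} SketchError;

static SketchError sketch_err_too_many = { "no free CPU index" };
static int sketch_next_index;

/*
 * Hand out the next index, or report failure through errp.
 * A NULL errp means the caller does not want the error details.
 */
static int sketch_exec_init(int max_cpus, SketchError **errp)
{
    if (sketch_next_index >= max_cpus) {
        if (errp != NULL) {
            *errp = &sketch_err_too_many;
        }
        return -1;
    }
    return sketch_next_index++;
}
```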

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
---
 exec.c                      | 2 +-
 include/exec/exec-all.h     | 2 +-
 target-alpha/cpu.c          | 2 +-
 target-arm/cpu.c            | 2 +-
 target-cris/cpu.c           | 2 +-
 target-i386/cpu.c           | 2 +-
 target-lm32/cpu.c           | 2 +-
 target-m68k/cpu.c           | 2 +-
 target-microblaze/cpu.c     | 2 +-
 target-mips/cpu.c           | 2 +-
 target-moxie/cpu.c          | 2 +-
 target-openrisc/cpu.c       | 2 +-
 target-ppc/translate_init.c | 2 +-
 target-s390x/cpu.c          | 2 +-
 target-sh4/cpu.c            | 2 +-
 target-sparc/cpu.c          | 2 +-
 target-tricore/cpu.c        | 2 +-
 target-unicore32/cpu.c      | 2 +-
 target-xtensa/cpu.c         | 2 +-
 19 files changed, 19 insertions(+), 19 deletions(-)

diff --git a/exec.c b/exec.c
index c85321a..e1ff6b0 100644
--- a/exec.c
+++ b/exec.c
@@ -527,7 +527,7 @@ void tcg_cpu_address_space_init(CPUState *cpu, AddressSpace *as)
 }
 #endif
 
-void cpu_exec_init(CPUArchState *env)
+void cpu_exec_init(CPUArchState *env, Error **errp)
 {
     CPUState *cpu = ENV_GET_CPU(env);
     CPUClass *cc = CPU_GET_CLASS(cpu);
diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h
index 8eb0db3..41a9393 100644
--- a/include/exec/exec-all.h
+++ b/include/exec/exec-all.h
@@ -88,7 +88,7 @@ void QEMU_NORETURN cpu_io_recompile(CPUState *cpu, uintptr_t retaddr);
 TranslationBlock *tb_gen_code(CPUState *cpu,
                               target_ulong pc, target_ulong cs_base, int flags,
                               int cflags);
-void cpu_exec_init(CPUArchState *env);
+void cpu_exec_init(CPUArchState *env, Error **errp);
 void QEMU_NORETURN cpu_loop_exit(CPUState *cpu);
 int page_unprotect(target_ulong address, uintptr_t pc, void *puc);
 void tb_invalidate_phys_page_range(tb_page_addr_t start, tb_page_addr_t end,
diff --git a/target-alpha/cpu.c b/target-alpha/cpu.c
index a98b7d8..0a0c21e 100644
--- a/target-alpha/cpu.c
+++ b/target-alpha/cpu.c
@@ -257,7 +257,7 @@ static void alpha_cpu_initfn(Object *obj)
     CPUAlphaState *env = &cpu->env;
 
     cs->env_ptr = env;
-    cpu_exec_init(env);
+    cpu_exec_init(env, NULL);
     tlb_flush(cs, 1);
 
     alpha_translate_init();
diff --git a/target-arm/cpu.c b/target-arm/cpu.c
index 986f04c..86edaab 100644
--- a/target-arm/cpu.c
+++ b/target-arm/cpu.c
@@ -369,7 +369,7 @@ static void arm_cpu_initfn(Object *obj)
     static bool inited;
 
     cs->env_ptr = &cpu->env;
-    cpu_exec_init(&cpu->env);
+    cpu_exec_init(&cpu->env, NULL);
     cpu->cp_regs = g_hash_table_new_full(g_int_hash, g_int_equal,
                                          g_free, g_free);
 
diff --git a/target-cris/cpu.c b/target-cris/cpu.c
index 16cfba9..8b589ec 100644
--- a/target-cris/cpu.c
+++ b/target-cris/cpu.c
@@ -170,7 +170,7 @@ static void cris_cpu_initfn(Object *obj)
     static bool tcg_initialized;
 
     cs->env_ptr = env;
-    cpu_exec_init(env);
+    cpu_exec_init(env, NULL);
 
     env->pregs[PR_VR] = ccc->vr;
 
diff --git a/target-i386/cpu.c b/target-i386/cpu.c
index d543e2b..daccf4f 100644
--- a/target-i386/cpu.c
+++ b/target-i386/cpu.c
@@ -2886,7 +2886,7 @@ static void x86_cpu_initfn(Object *obj)
     static int inited;
 
     cs->env_ptr = env;
-    cpu_exec_init(env);
+    cpu_exec_init(env, NULL);
 
     object_property_add(obj, "family", "int",
                         x86_cpuid_version_get_family,
diff --git a/target-lm32/cpu.c b/target-lm32/cpu.c
index f8081f5..89b6631 100644
--- a/target-lm32/cpu.c
+++ b/target-lm32/cpu.c
@@ -151,7 +151,7 @@ static void lm32_cpu_initfn(Object *obj)
     static bool tcg_initialized;
 
     cs->env_ptr = env;
-    cpu_exec_init(env);
+    cpu_exec_init(env, NULL);
 
     env->flags = 0;
 
diff --git a/target-m68k/cpu.c b/target-m68k/cpu.c
index 4cfb725..6a41551 100644
--- a/target-m68k/cpu.c
+++ b/target-m68k/cpu.c
@@ -168,7 +168,7 @@ static void m68k_cpu_initfn(Object *obj)
     static bool inited;
 
     cs->env_ptr = env;
-    cpu_exec_init(env);
+    cpu_exec_init(env, NULL);
 
     if (tcg_enabled() && !inited) {
         inited = true;
diff --git a/target-microblaze/cpu.c b/target-microblaze/cpu.c
index 67e3182..6b3732d 100644
--- a/target-microblaze/cpu.c
+++ b/target-microblaze/cpu.c
@@ -130,7 +130,7 @@ static void mb_cpu_initfn(Object *obj)
     static bool tcg_initialized;
 
     cs->env_ptr = env;
-    cpu_exec_init(env);
+    cpu_exec_init(env, NULL);
 
     set_float_rounding_mode(float_round_nearest_even, &env->fp_status);
 
diff --git a/target-mips/cpu.c b/target-mips/cpu.c
index 98dc94e..02f1d32 100644
--- a/target-mips/cpu.c
+++ b/target-mips/cpu.c
@@ -115,7 +115,7 @@ static void mips_cpu_initfn(Object *obj)
     CPUMIPSState *env = &cpu->env;
 
     cs->env_ptr = env;
-    cpu_exec_init(env);
+    cpu_exec_init(env, NULL);
 
     if (tcg_enabled()) {
         mips_tcg_init();
diff --git a/target-moxie/cpu.c b/target-moxie/cpu.c
index 47b617f..f815fb3 100644
--- a/target-moxie/cpu.c
+++ b/target-moxie/cpu.c
@@ -66,7 +66,7 @@ static void moxie_cpu_initfn(Object *obj)
     static int inited;
 
     cs->env_ptr = &cpu->env;
-    cpu_exec_init(&cpu->env);
+    cpu_exec_init(&cpu->env, NULL);
 
     if (tcg_enabled() && !inited) {
         inited = 1;
diff --git a/target-openrisc/cpu.c b/target-openrisc/cpu.c
index 39bedc1..87b2f80 100644
--- a/target-openrisc/cpu.c
+++ b/target-openrisc/cpu.c
@@ -92,7 +92,7 @@ static void openrisc_cpu_initfn(Object *obj)
     static int inited;
 
     cs->env_ptr = &cpu->env;
-    cpu_exec_init(&cpu->env);
+    cpu_exec_init(&cpu->env, NULL);
 
 #ifndef CONFIG_USER_ONLY
     cpu_openrisc_mmu_init(cpu);
diff --git a/target-ppc/translate_init.c b/target-ppc/translate_init.c
index a8716cf..9f4f172 100644
--- a/target-ppc/translate_init.c
+++ b/target-ppc/translate_init.c
@@ -9679,7 +9679,7 @@ static void ppc_cpu_initfn(Object *obj)
     CPUPPCState *env = &cpu->env;
 
     cs->env_ptr = env;
-    cpu_exec_init(env);
+    cpu_exec_init(env, NULL);
     cpu->cpu_dt_id = cs->cpu_index;
 
     env->msr_mask = pcc->msr_mask;
diff --git a/target-s390x/cpu.c b/target-s390x/cpu.c
index d2f6312..28717bd 100644
--- a/target-s390x/cpu.c
+++ b/target-s390x/cpu.c
@@ -185,7 +185,7 @@ static void s390_cpu_initfn(Object *obj)
 #endif
 
     cs->env_ptr = env;
-    cpu_exec_init(env);
+    cpu_exec_init(env, NULL);
 #if !defined(CONFIG_USER_ONLY)
     qemu_register_reset(s390_cpu_machine_reset_cb, cpu);
     qemu_get_timedate(&tm, 0);
diff --git a/target-sh4/cpu.c b/target-sh4/cpu.c
index d187a2b..ffb635e 100644
--- a/target-sh4/cpu.c
+++ b/target-sh4/cpu.c
@@ -247,7 +247,7 @@ static void superh_cpu_initfn(Object *obj)
     CPUSH4State *env = &cpu->env;
 
     cs->env_ptr = env;
-    cpu_exec_init(env);
+    cpu_exec_init(env, NULL);
 
     env->movcal_backup_tail = &(env->movcal_backup);
 
diff --git a/target-sparc/cpu.c b/target-sparc/cpu.c
index a952097..d857aae 100644
--- a/target-sparc/cpu.c
+++ b/target-sparc/cpu.c
@@ -802,7 +802,7 @@ static void sparc_cpu_initfn(Object *obj)
     CPUSPARCState *env = &cpu->env;
 
     cs->env_ptr = env;
-    cpu_exec_init(env);
+    cpu_exec_init(env, NULL);
 
     if (tcg_enabled()) {
         gen_intermediate_code_init(env);
diff --git a/target-tricore/cpu.c b/target-tricore/cpu.c
index 2ba0cf4..53b117b 100644
--- a/target-tricore/cpu.c
+++ b/target-tricore/cpu.c
@@ -88,7 +88,7 @@ static void tricore_cpu_initfn(Object *obj)
     CPUTriCoreState *env = &cpu->env;
 
     cs->env_ptr = env;
-    cpu_exec_init(env);
+    cpu_exec_init(env, NULL);
 
     if (tcg_enabled()) {
         tricore_tcg_init();
diff --git a/target-unicore32/cpu.c b/target-unicore32/cpu.c
index 5b32987..d56d78a 100644
--- a/target-unicore32/cpu.c
+++ b/target-unicore32/cpu.c
@@ -111,7 +111,7 @@ static void uc32_cpu_initfn(Object *obj)
     static bool inited;
 
     cs->env_ptr = env;
-    cpu_exec_init(env);
+    cpu_exec_init(env, NULL);
 
 #ifdef CONFIG_USER_ONLY
     env->uncached_asr = ASR_MODE_USER;
diff --git a/target-xtensa/cpu.c b/target-xtensa/cpu.c
index 6a5414f..dd23d32 100644
--- a/target-xtensa/cpu.c
+++ b/target-xtensa/cpu.c
@@ -114,7 +114,7 @@ static void xtensa_cpu_initfn(Object *obj)
 
     cs->env_ptr = env;
     env->config = xcc->config;
-    cpu_exec_init(env);
+    cpu_exec_init(env, NULL);
 
     if (tcg_enabled() && !tcg_inited) {
         tcg_inited = true;
-- 
2.1.0


* [Qemu-devel] [RFC PATCH v2 14/23] cpus: Convert cpu_index into a bitmap
  2015-03-23 13:35 [Qemu-devel] [RFC PATCH v2 00/23] CPU and Memory hotplug for PowerPC sPAPR guests Bharata B Rao
                   ` (12 preceding siblings ...)
  2015-03-23 13:35 ` [Qemu-devel] [RFC PATCH v2 13/23] cpus: Add Error argument to cpu_exec_init() Bharata B Rao
@ 2015-03-23 13:35 ` Bharata B Rao
  2015-03-25  3:23   ` David Gibson
  2015-03-23 13:35 ` [Qemu-devel] [RFC PATCH v2 15/23] ppc: Move cpu_exec_init() call to realize function Bharata B Rao
                   ` (10 subsequent siblings)
  24 siblings, 1 reply; 74+ messages in thread
From: Bharata B Rao @ 2015-03-23 13:35 UTC (permalink / raw)
  To: qemu-devel
  Cc: Bharata B Rao, mdroth, agraf, qemu-ppc, tyreld, nfont, imammedo,
	afaerber, david

Currently CPUState.cpu_index is monotonically increasing and a newly
created CPU always gets the next higher index. The next available
index is calculated by counting the existing number of CPUs. This is
fine as long as we only add CPUs, but there are architectures which
are starting to support CPU removal too. For an architecture like PowerPC
which derives its CPU identifier (device tree ID) from cpu_index, the
existing logic of generating cpu_index values causes problems.

With the currently proposed method of handling vCPU removal by parking
the vCPU fd in QEMU
(Ref: http://lists.gnu.org/archive/html/qemu-devel/2015-02/msg02604.html),
generating cpu_index this way will not work for PowerPC.

This patch changes the way cpu_index is handed out by maintaining
a bitmap of the CPUs that tracks both addition and removal of CPUs.
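
A minimal standalone sketch of the idea, assuming a fixed small limit and
using a single unsigned long as the map (the sketch_* names and
SKETCH_MAX_CPUS are hypothetical; the patch itself uses QEMU's bitmap
helpers): the lowest clear bit is handed out, and a freed index becomes
available again.

```c
#include <assert.h>

#define SKETCH_MAX_CPUS 8  /* hypothetical limit, stands in for max_cpus */

static unsigned long sketch_index_map;  /* bit i set => index i in use */

/* Return the lowest free index and mark it used, or -1 if none is free. */
static int sketch_get_free_index(void)
{
    for (int i = 0; i < SKETCH_MAX_CPUS; i++) {
        if (!(sketch_index_map & (1UL << i))) {
            sketch_index_map |= 1UL << i;
            return i;
        }
    }
    return -1;
}

/* Release an index so that a later allocation can reuse it. */
static void sketch_put_index(int index)
{
    assert(index >= 0 && index < SKETCH_MAX_CPUS);
    sketch_index_map &= ~(1UL << index);
}
```

With a count-based scheme, removing index 1 out of {0, 1, 2} and adding a
CPU would hand out index 2 a second time; with the map, the new CPU gets
index 1 back.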

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
---
 exec.c            | 37 ++++++++++++++++++++++++++++++++++---
 include/qom/cpu.h |  8 ++++++++
 2 files changed, 42 insertions(+), 3 deletions(-)

diff --git a/exec.c b/exec.c
index e1ff6b0..9bbab02 100644
--- a/exec.c
+++ b/exec.c
@@ -527,21 +527,52 @@ void tcg_cpu_address_space_init(CPUState *cpu, AddressSpace *as)
 }
 #endif
 
+#ifndef CONFIG_USER_ONLY
+static DECLARE_BITMAP(cpu_index_map, MAX_CPUMASK_BITS);
+
+static int cpu_get_free_index(Error **errp)
+{
+    int cpu = find_first_zero_bit(cpu_index_map, max_cpus);
+
+    if (cpu == max_cpus) {
+        error_setg(errp, "Trying to use more CPUs than allowed max of %d",
+                    max_cpus);
+        return max_cpus;
+    } else {
+        bitmap_set(cpu_index_map, cpu, 1);
+        return cpu;
+    }
+}
+
+void cpu_exec_exit(CPUState *cpu)
+{
+    bitmap_clear(cpu_index_map, cpu->cpu_index, 1);
+}
+#endif
+
 void cpu_exec_init(CPUArchState *env, Error **errp)
 {
     CPUState *cpu = ENV_GET_CPU(env);
     CPUClass *cc = CPU_GET_CLASS(cpu);
-    CPUState *some_cpu;
     int cpu_index;
-
 #if defined(CONFIG_USER_ONLY)
+    CPUState *some_cpu;
+
     cpu_list_lock();
-#endif
     cpu_index = 0;
     CPU_FOREACH(some_cpu) {
         cpu_index++;
     }
     cpu->cpu_index = cpu_index;
+#else
+    Error *local_err = NULL;
+
+    cpu_index = cpu->cpu_index = cpu_get_free_index(&local_err);
+    if (local_err) {
+        error_propagate(errp, local_err);
+        return;
+    }
+#endif
     cpu->numa_node = 0;
     QTAILQ_INIT(&cpu->breakpoints);
     QTAILQ_INIT(&cpu->watchpoints);
diff --git a/include/qom/cpu.h b/include/qom/cpu.h
index 48fd6fb..5241cf4 100644
--- a/include/qom/cpu.h
+++ b/include/qom/cpu.h
@@ -659,6 +659,14 @@ void cpu_watchpoint_remove_all(CPUState *cpu, int mask);
 void QEMU_NORETURN cpu_abort(CPUState *cpu, const char *fmt, ...)
     GCC_FMT_ATTR(2, 3);
 
+#ifndef CONFIG_USER_ONLY
+void cpu_exec_exit(CPUState *cpu);
+#else
+static inline void cpu_exec_exit(CPUState *cpu)
+{
+}
+#endif
+
 #ifdef CONFIG_SOFTMMU
 extern const struct VMStateDescription vmstate_cpu_common;
 #else
-- 
2.1.0


* [Qemu-devel] [RFC PATCH v2 15/23] ppc: Move cpu_exec_init() call to realize function
  2015-03-23 13:35 [Qemu-devel] [RFC PATCH v2 00/23] CPU and Memory hotplug for PowerPC sPAPR guests Bharata B Rao
                   ` (13 preceding siblings ...)
  2015-03-23 13:35 ` [Qemu-devel] [RFC PATCH v2 14/23] cpus: Convert cpu_index into a bitmap Bharata B Rao
@ 2015-03-23 13:35 ` Bharata B Rao
  2015-03-25  3:25   ` David Gibson
  2015-03-23 13:35 ` [Qemu-devel] [RFC PATCH v2 16/23] cpus: Reclaim vCPU objects Bharata B Rao
                   ` (9 subsequent siblings)
  24 siblings, 1 reply; 74+ messages in thread
From: Bharata B Rao @ 2015-03-23 13:35 UTC (permalink / raw)
  To: qemu-devel
  Cc: Bharata B Rao, mdroth, agraf, qemu-ppc, tyreld, nfont, imammedo,
	afaerber, david

Move cpu_exec_init() call from instance_init to realize. This allows
any failures from cpu_exec_init() to be handled appropriately.

Also call cpu_exec_exit() from unrealize.

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
---
 target-ppc/translate_init.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/target-ppc/translate_init.c b/target-ppc/translate_init.c
index 9f4f172..fccee82 100644
--- a/target-ppc/translate_init.c
+++ b/target-ppc/translate_init.c
@@ -8928,6 +8928,11 @@ static void ppc_cpu_realizefn(DeviceState *dev, Error **errp)
         return;
     }
 
+    cpu_exec_init(&cpu->env, &local_err);
+    if (local_err != NULL) {
+        error_propagate(errp, local_err);
+        return;
+    }
     cpu->cpu_dt_id = (cs->cpu_index / smp_threads) * max_smt
         + (cs->cpu_index % smp_threads);
 #endif
@@ -9141,6 +9146,8 @@ static void ppc_cpu_unrealizefn(DeviceState *dev, Error **errp)
     opc_handler_t **table;
     int i, j;
 
+    cpu_exec_exit(CPU(dev));
+
     for (i = 0; i < PPC_CPU_OPCODES_LEN; i++) {
         if (env->opcodes[i] == &invalid_handler) {
             continue;
@@ -9679,8 +9686,6 @@ static void ppc_cpu_initfn(Object *obj)
     CPUPPCState *env = &cpu->env;
 
     cs->env_ptr = env;
-    cpu_exec_init(env, NULL);
-    cpu->cpu_dt_id = cs->cpu_index;
 
     env->msr_mask = pcc->msr_mask;
     env->mmu_model = pcc->mmu_model;
-- 
2.1.0


* [Qemu-devel] [RFC PATCH v2 16/23] cpus: Reclaim vCPU objects
  2015-03-23 13:35 [Qemu-devel] [RFC PATCH v2 00/23] CPU and Memory hotplug for PowerPC sPAPR guests Bharata B Rao
                   ` (14 preceding siblings ...)
  2015-03-23 13:35 ` [Qemu-devel] [RFC PATCH v2 15/23] ppc: Move cpu_exec_init() call to realize function Bharata B Rao
@ 2015-03-23 13:35 ` Bharata B Rao
  2015-03-25  5:22   ` David Gibson
  2015-03-23 13:35 ` [Qemu-devel] [RFC PATCH v2 17/23] xics_kvm: Don't enable KVM_CAP_IRQ_XICS if already enabled Bharata B Rao
                   ` (8 subsequent siblings)
  24 siblings, 1 reply; 74+ messages in thread
From: Bharata B Rao @ 2015-03-23 13:35 UTC (permalink / raw)
  To: qemu-devel
  Cc: Zhu Guihua, Bharata B Rao, mdroth, agraf, Chen Fan, qemu-ppc,
	tyreld, nfont, Gu Zheng, imammedo, afaerber, david

From: Gu Zheng <guz.fnst@cn.fujitsu.com>

KVM vCPUs cannot be removed without any protection, so we do not close
the KVM vCPU fd when a CPU is removed. Instead, we record the fd in a
list and mark it as stopped, so that it can be reused for a subsequent
CPU hot-add request if possible. This is also the approach that the KVM
developers suggested:
https://www.mail-archive.com/kvm@vger.kernel.org/msg102839.html
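
The parking scheme can be sketched on its own as a small table keyed by
vCPU id. Everything here is hypothetical (the names, the fixed capacity,
the plain int fds) and it is not the code in kvm-all.c; it only shows the
park-then-reuse flow described above.

```c
#include <assert.h>

#define SKETCH_MAX_PARKED 16  /* hypothetical capacity */

struct sketch_parked_vcpu {
    int in_use;
    int vcpu_id;
    int fd;
};

static struct sketch_parked_vcpu sketch_parked_list[SKETCH_MAX_PARKED];

/* Park an fd instead of closing it. Returns 0, or -1 if the list is full. */
static int sketch_park_vcpu(int vcpu_id, int fd)
{
    for (int i = 0; i < SKETCH_MAX_PARKED; i++) {
        if (!sketch_parked_list[i].in_use) {
            sketch_parked_list[i].in_use = 1;
            sketch_parked_list[i].vcpu_id = vcpu_id;
            sketch_parked_list[i].fd = fd;
            return 0;
        }
    }
    return -1;
}

/* Take back a parked fd for this id, or return -1 if none is parked. */
static int sketch_unpark_vcpu(int vcpu_id)
{
    for (int i = 0; i < SKETCH_MAX_PARKED; i++) {
        if (sketch_parked_list[i].in_use &&
            sketch_parked_list[i].vcpu_id == vcpu_id) {
            sketch_parked_list[i].in_use = 0;
            return sketch_parked_list[i].fd;
        }
    }
    return -1;
}
```

A hot-add path would first try sketch_unpark_vcpu() and only ask KVM for
a new vCPU when it returns -1.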

Signed-off-by: Chen Fan <chen.fan.fnst@cn.fujitsu.com>
Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
Signed-off-by: Zhu Guihua <zhugh.fnst@cn.fujitsu.com>
Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
---
 cpus.c               | 44 ++++++++++++++++++++++++++++++++++++++++
 include/qom/cpu.h    | 11 ++++++++++
 include/sysemu/kvm.h |  1 +
 kvm-all.c            | 57 +++++++++++++++++++++++++++++++++++++++++++++++++++-
 kvm-stub.c           |  5 +++++
 5 files changed, 117 insertions(+), 1 deletion(-)

diff --git a/cpus.c b/cpus.c
index 0fac143..33906cd 100644
--- a/cpus.c
+++ b/cpus.c
@@ -858,6 +858,24 @@ void async_run_on_cpu(CPUState *cpu, void (*func)(void *data), void *data)
     qemu_cpu_kick(cpu);
 }
 
+static void qemu_kvm_destroy_vcpu(CPUState *cpu)
+{
+    CPU_REMOVE(cpu);
+
+    if (kvm_destroy_vcpu(cpu) < 0) {
+        error_report("kvm_destroy_vcpu failed.\n");
+        exit(EXIT_FAILURE);
+    }
+
+    object_unparent(OBJECT(cpu));
+}
+
+static void qemu_tcg_destroy_vcpu(CPUState *cpu)
+{
+    CPU_REMOVE(cpu);
+    object_unparent(OBJECT(cpu));
+}
+
 static void flush_queued_work(CPUState *cpu)
 {
     struct qemu_work_item *wi;
@@ -950,6 +968,11 @@ static void *qemu_kvm_cpu_thread_fn(void *arg)
             }
         }
         qemu_kvm_wait_io_event(cpu);
+        if (cpu->exit && !cpu_can_run(cpu)) {
+            qemu_kvm_destroy_vcpu(cpu);
+            qemu_mutex_unlock(&qemu_global_mutex);
+            return NULL;
+        }
     }
 
     return NULL;
@@ -1003,6 +1026,7 @@ static void tcg_exec_all(void);
 static void *qemu_tcg_cpu_thread_fn(void *arg)
 {
     CPUState *cpu = arg;
+    CPUState *remove_cpu = NULL;
 
     qemu_tcg_init_cpu_signals();
     qemu_thread_get_self(cpu->thread);
@@ -1039,6 +1063,16 @@ static void *qemu_tcg_cpu_thread_fn(void *arg)
             }
         }
         qemu_tcg_wait_io_event();
+        CPU_FOREACH(cpu) {
+            if (cpu->exit && !cpu_can_run(cpu)) {
+                remove_cpu = cpu;
+                break;
+            }
+        }
+        if (remove_cpu) {
+            qemu_tcg_destroy_vcpu(remove_cpu);
+            remove_cpu = NULL;
+        }
     }
 
     return NULL;
@@ -1196,6 +1230,13 @@ void resume_all_vcpus(void)
     }
 }
 
+void cpu_remove(CPUState *cpu)
+{
+    cpu->stop = true;
+    cpu->exit = true;
+    qemu_cpu_kick(cpu);
+}
+
 /* For temporary buffers for forming a name */
 #define VCPU_THREAD_NAME_SIZE 16
 
@@ -1390,6 +1431,9 @@ static void tcg_exec_all(void)
                 break;
             }
         } else if (cpu->stop || cpu->stopped) {
+            if (cpu->exit) {
+                next_cpu = CPU_NEXT(cpu);
+            }
             break;
         }
     }
diff --git a/include/qom/cpu.h b/include/qom/cpu.h
index 5241cf4..1bfc3d4 100644
--- a/include/qom/cpu.h
+++ b/include/qom/cpu.h
@@ -206,6 +206,7 @@ struct kvm_run;
  * @halted: Nonzero if the CPU is in suspended state.
  * @stop: Indicates a pending stop request.
  * @stopped: Indicates the CPU has been artificially stopped.
+ * @exit: Indicates the CPU has exited due to an unplug operation.
  * @tcg_exit_req: Set to force TCG to stop executing linked TBs for this
  *           CPU and return to its top level loop.
  * @singlestep_enabled: Flags for single-stepping.
@@ -249,6 +250,7 @@ struct CPUState {
     bool created;
     bool stop;
     bool stopped;
+    bool exit;
     volatile sig_atomic_t exit_request;
     uint32_t interrupt_request;
     int singlestep_enabled;
@@ -306,6 +308,7 @@ struct CPUState {
 QTAILQ_HEAD(CPUTailQ, CPUState);
 extern struct CPUTailQ cpus;
 #define CPU_NEXT(cpu) QTAILQ_NEXT(cpu, node)
+#define CPU_REMOVE(cpu) QTAILQ_REMOVE(&cpus, cpu, node)
 #define CPU_FOREACH(cpu) QTAILQ_FOREACH(cpu, &cpus, node)
 #define CPU_FOREACH_SAFE(cpu, next_cpu) \
     QTAILQ_FOREACH_SAFE(cpu, &cpus, node, next_cpu)
@@ -610,6 +613,14 @@ void cpu_exit(CPUState *cpu);
  */
 void cpu_resume(CPUState *cpu);
 
+/**
+ * cpu_remove:
+ * @cpu: The CPU to remove.
+ *
+ * Requests the CPU to be removed.
+ */
+void cpu_remove(CPUState *cpu);
+
 /**
  * qemu_init_vcpu:
  * @cpu: The vCPU to initialize.
diff --git a/include/sysemu/kvm.h b/include/sysemu/kvm.h
index 30cb84d..560caef 100644
--- a/include/sysemu/kvm.h
+++ b/include/sysemu/kvm.h
@@ -188,6 +188,7 @@ int kvm_has_intx_set_mask(void);
 
 int kvm_init_vcpu(CPUState *cpu);
 int kvm_cpu_exec(CPUState *cpu);
+int kvm_destroy_vcpu(CPUState *cpu);
 
 #ifdef NEED_CPU_H
 
diff --git a/kvm-all.c b/kvm-all.c
index 05a79c2..46e7853 100644
--- a/kvm-all.c
+++ b/kvm-all.c
@@ -71,6 +71,12 @@ typedef struct KVMSlot
 
 typedef struct kvm_dirty_log KVMDirtyLog;
 
+struct KVMParkedVcpu {
+    unsigned long vcpu_id;
+    int kvm_fd;
+    QLIST_ENTRY(KVMParkedVcpu) node;
+};
+
 struct KVMState
 {
     AccelState parent_obj;
@@ -107,6 +113,7 @@ struct KVMState
     QTAILQ_HEAD(msi_hashtab, KVMMSIRoute) msi_hashtab[KVM_MSI_HASHTAB_SIZE];
     bool direct_msi;
 #endif
+    QLIST_HEAD(, KVMParkedVcpu) kvm_parked_vcpus;
 };
 
 #define TYPE_KVM_ACCEL ACCEL_CLASS_NAME("kvm")
@@ -247,6 +254,53 @@ static int kvm_set_user_memory_region(KVMState *s, KVMSlot *slot)
     return kvm_vm_ioctl(s, KVM_SET_USER_MEMORY_REGION, &mem);
 }
 
+int kvm_destroy_vcpu(CPUState *cpu)
+{
+    KVMState *s = kvm_state;
+    long mmap_size;
+    struct KVMParkedVcpu *vcpu = NULL;
+    int ret = 0;
+
+    DPRINTF("kvm_destroy_vcpu\n");
+
+    mmap_size = kvm_ioctl(s, KVM_GET_VCPU_MMAP_SIZE, 0);
+    if (mmap_size < 0) {
+        ret = mmap_size;
+        DPRINTF("KVM_GET_VCPU_MMAP_SIZE failed\n");
+        goto err;
+    }
+
+    ret = munmap(cpu->kvm_run, mmap_size);
+    if (ret < 0) {
+        goto err;
+    }
+
+    vcpu = g_malloc0(sizeof(*vcpu));
+    vcpu->vcpu_id = kvm_arch_vcpu_id(cpu);
+    vcpu->kvm_fd = cpu->kvm_fd;
+    QLIST_INSERT_HEAD(&kvm_state->kvm_parked_vcpus, vcpu, node);
+err:
+    return ret;
+}
+
+static int kvm_get_vcpu(KVMState *s, unsigned long vcpu_id)
+{
+    struct KVMParkedVcpu *cpu;
+
+    QLIST_FOREACH(cpu, &s->kvm_parked_vcpus, node) {
+        if (cpu->vcpu_id == vcpu_id) {
+            int kvm_fd;
+
+            QLIST_REMOVE(cpu, node);
+            kvm_fd = cpu->kvm_fd;
+            g_free(cpu);
+            return kvm_fd;
+        }
+    }
+
+    return kvm_vm_ioctl(s, KVM_CREATE_VCPU, (void *)vcpu_id);
+}
+
 int kvm_init_vcpu(CPUState *cpu)
 {
     KVMState *s = kvm_state;
@@ -255,7 +309,7 @@ int kvm_init_vcpu(CPUState *cpu)
 
     DPRINTF("kvm_init_vcpu\n");
 
-    ret = kvm_vm_ioctl(s, KVM_CREATE_VCPU, (void *)kvm_arch_vcpu_id(cpu));
+    ret = kvm_get_vcpu(s, kvm_arch_vcpu_id(cpu));
     if (ret < 0) {
         DPRINTF("kvm_create_vcpu failed\n");
         goto err;
@@ -1448,6 +1502,7 @@ static int kvm_init(MachineState *ms)
 #ifdef KVM_CAP_SET_GUEST_DEBUG
     QTAILQ_INIT(&s->kvm_sw_breakpoints);
 #endif
+    QLIST_INIT(&s->kvm_parked_vcpus);
     s->vmfd = -1;
     s->fd = qemu_open("/dev/kvm", O_RDWR);
     if (s->fd == -1) {
diff --git a/kvm-stub.c b/kvm-stub.c
index 7ba90c5..79ac626 100644
--- a/kvm-stub.c
+++ b/kvm-stub.c
@@ -30,6 +30,11 @@ bool kvm_gsi_direct_mapping;
 bool kvm_allowed;
 bool kvm_readonly_mem_allowed;
 
+int kvm_destroy_vcpu(CPUState *cpu)
+{
+    return -ENOSYS;
+}
+
 int kvm_init_vcpu(CPUState *cpu)
 {
     return -ENOSYS;
-- 
2.1.0


* [Qemu-devel] [RFC PATCH v2 17/23] xics_kvm: Don't enable KVM_CAP_IRQ_XICS if already enabled
  2015-03-23 13:35 [Qemu-devel] [RFC PATCH v2 00/23] CPU and Memory hotplug for PowerPC sPAPR guests Bharata B Rao
                   ` (15 preceding siblings ...)
  2015-03-23 13:35 ` [Qemu-devel] [RFC PATCH v2 16/23] cpus: Reclaim vCPU objects Bharata B Rao
@ 2015-03-23 13:35 ` Bharata B Rao
  2015-03-25  5:24   ` David Gibson
  2015-03-23 13:35 ` [Qemu-devel] [RFC PATCH v2 18/23] xics_kvm: Add cpu_destroy method to XICS Bharata B Rao
                   ` (7 subsequent siblings)
  24 siblings, 1 reply; 74+ messages in thread
From: Bharata B Rao @ 2015-03-23 13:35 UTC (permalink / raw)
  To: qemu-devel
  Cc: Bharata B Rao, mdroth, agraf, qemu-ppc, tyreld, nfont, imammedo,
	afaerber, david

When CPU hot removal is supported by parking the vCPU fd and reusing it
on a later hotplug, we can end up trying to re-enable the KVM_CAP_IRQ_XICS
capability for a vCPU on which it is already enabled. Introduce a boolean
member in ICPState to track this, and don't re-enable the capability if
it was already enabled earlier.

This change allows CPU hot removal to work for sPAPR.
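
The guard amounts to making the setup step idempotent. A minimal standalone
sketch, assuming hypothetical names throughout (icp_state_sketch,
fake_enable_cap() and cpu_setup_sketch() are illustrative and do not exist in
QEMU; fake_enable_cap() stands in for enabling the capability on the vCPU):

```c
#include <stdbool.h>

/* Hypothetical per-CPU state carrying the "already enabled" flag. */
struct icp_state_sketch {
    bool cap_enabled;
};

static int enable_calls;

/* Assumption: stands in for the one-time capability enable on the vCPU. */
static int fake_enable_cap(void)
{
    enable_calls++;
    return 0;
}

/* Safe to call again when a parked vCPU fd is reused on a later hotplug. */
static int cpu_setup_sketch(struct icp_state_sketch *ss)
{
    if (ss->cap_enabled) {
        return 0;            /* nothing to do, capability is already on */
    }
    if (fake_enable_cap() < 0) {
        return -1;
    }
    ss->cap_enabled = true;  /* remember it across unplug/replug */
    return 0;
}
```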

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
---
 hw/intc/xics_kvm.c    | 10 ++++++++++
 include/hw/ppc/xics.h |  1 +
 2 files changed, 11 insertions(+)

diff --git a/hw/intc/xics_kvm.c b/hw/intc/xics_kvm.c
index c15453f..5b27bf8 100644
--- a/hw/intc/xics_kvm.c
+++ b/hw/intc/xics_kvm.c
@@ -331,6 +331,15 @@ static void xics_kvm_cpu_setup(XICSState *icp, PowerPCCPU *cpu)
         abort();
     }
 
+    /*
+     * If we are reusing a parked vCPU fd corresponding to the CPU
+     * which was hot-removed earlier, we don't have to re-enable the
+     * KVM_CAP_IRQ_XICS capability again.
+     */
+    if (ss->cap_irq_xics_enabled) {
+        return;
+    }
+
     if (icpkvm->kernel_xics_fd != -1) {
         int ret;
 
@@ -343,6 +352,7 @@ static void xics_kvm_cpu_setup(XICSState *icp, PowerPCCPU *cpu)
                     kvm_arch_vcpu_id(cs), strerror(errno));
             exit(1);
         }
+        ss->cap_irq_xics_enabled = true;
     }
 }
 
diff --git a/include/hw/ppc/xics.h b/include/hw/ppc/xics.h
index a214dd7..355a966 100644
--- a/include/hw/ppc/xics.h
+++ b/include/hw/ppc/xics.h
@@ -109,6 +109,7 @@ struct ICPState {
     uint8_t pending_priority;
     uint8_t mfrr;
     qemu_irq output;
+    bool cap_irq_xics_enabled;
 };
 
 #define TYPE_ICS "ics"
-- 
2.1.0


* [Qemu-devel] [RFC PATCH v2 18/23] xics_kvm: Add cpu_destroy method to XICS
  2015-03-23 13:35 [Qemu-devel] [RFC PATCH v2 00/23] CPU and Memory hotplug for PowerPC sPAPR guests Bharata B Rao
                   ` (16 preceding siblings ...)
  2015-03-23 13:35 ` [Qemu-devel] [RFC PATCH v2 17/23] xics_kvm: Don't enable KVM_CAP_IRQ_XICS if already enabled Bharata B Rao
@ 2015-03-23 13:35 ` Bharata B Rao
  2015-03-25  5:26   ` David Gibson
  2015-03-23 13:36 ` [Qemu-devel] [RFC PATCH v2 19/23] spapr: CPU hot unplug support Bharata B Rao
                   ` (6 subsequent siblings)
  24 siblings, 1 reply; 74+ messages in thread
From: Bharata B Rao @ 2015-03-23 13:35 UTC (permalink / raw)
  To: qemu-devel
  Cc: Bharata B Rao, mdroth, agraf, qemu-ppc, tyreld, nfont, imammedo,
	afaerber, david

XICS is set up for each CPU during initialization. Provide a routine
to undo this when a CPU is unplugged.

This allows reboot of a VM that has undergone CPU hotplug and unplug
to work correctly.

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
---
 hw/intc/xics.c        | 12 ++++++++++++
 hw/intc/xics_kvm.c    |  9 +++++++++
 include/hw/ppc/xics.h |  2 ++
 3 files changed, 23 insertions(+)

diff --git a/hw/intc/xics.c b/hw/intc/xics.c
index 0fd2a84..406697d 100644
--- a/hw/intc/xics.c
+++ b/hw/intc/xics.c
@@ -44,6 +44,18 @@ static int get_cpu_index_by_dt_id(int cpu_dt_id)
     return -1;
 }
 
+void xics_cpu_destroy(XICSState *icp, PowerPCCPU *cpu)
+{
+    CPUState *cs = CPU(cpu);
+    XICSStateClass *info = XICS_COMMON_GET_CLASS(icp);
+
+    assert(cs->cpu_index < icp->nr_servers);
+
+    if (info->cpu_destroy) {
+        info->cpu_destroy(icp, cpu);
+    }
+}
+
 void xics_cpu_setup(XICSState *icp, PowerPCCPU *cpu)
 {
     CPUState *cs = CPU(cpu);
diff --git a/hw/intc/xics_kvm.c b/hw/intc/xics_kvm.c
index 5b27bf8..a7c6226 100644
--- a/hw/intc/xics_kvm.c
+++ b/hw/intc/xics_kvm.c
@@ -356,6 +356,14 @@ static void xics_kvm_cpu_setup(XICSState *icp, PowerPCCPU *cpu)
     }
 }
 
+static void xics_kvm_cpu_destroy(XICSState *icp, PowerPCCPU *cpu)
+{
+    CPUState *cs = CPU(cpu);
+    ICPState *ss = &icp->ss[cs->cpu_index];
+
+    ss->cs = NULL;
+}
+
 static void xics_kvm_set_nr_irqs(XICSState *icp, uint32_t nr_irqs, Error **errp)
 {
     icp->nr_irqs = icp->ics->nr_irqs = nr_irqs;
@@ -486,6 +494,7 @@ static void xics_kvm_class_init(ObjectClass *oc, void *data)
 
     dc->realize = xics_kvm_realize;
     xsc->cpu_setup = xics_kvm_cpu_setup;
+    xsc->cpu_destroy = xics_kvm_cpu_destroy;
     xsc->set_nr_irqs = xics_kvm_set_nr_irqs;
     xsc->set_nr_servers = xics_kvm_set_nr_servers;
 }
diff --git a/include/hw/ppc/xics.h b/include/hw/ppc/xics.h
index 355a966..2faad48 100644
--- a/include/hw/ppc/xics.h
+++ b/include/hw/ppc/xics.h
@@ -68,6 +68,7 @@ struct XICSStateClass {
     DeviceClass parent_class;
 
     void (*cpu_setup)(XICSState *icp, PowerPCCPU *cpu);
+    void (*cpu_destroy)(XICSState *icp, PowerPCCPU *cpu);
     void (*set_nr_irqs)(XICSState *icp, uint32_t nr_irqs, Error **errp);
     void (*set_nr_servers)(XICSState *icp, uint32_t nr_servers, Error **errp);
 };
@@ -166,5 +167,6 @@ int xics_alloc_block(XICSState *icp, int src, int num, bool lsi, bool align);
 void xics_free(XICSState *icp, int irq, int num);
 
 void xics_cpu_setup(XICSState *icp, PowerPCCPU *cpu);
+void xics_cpu_destroy(XICSState *icp, PowerPCCPU *cpu);
 
 #endif /* __XICS_H__ */
-- 
2.1.0


* [Qemu-devel] [RFC PATCH v2 19/23] spapr: CPU hot unplug support
  2015-03-23 13:35 [Qemu-devel] [RFC PATCH v2 00/23] CPU and Memory hotplug for PowerPC sPAPR guests Bharata B Rao
                   ` (17 preceding siblings ...)
  2015-03-23 13:35 ` [Qemu-devel] [RFC PATCH v2 18/23] xics_kvm: Add cpu_destroy method to XICS Bharata B Rao
@ 2015-03-23 13:36 ` Bharata B Rao
  2015-03-25  5:44   ` David Gibson
  2015-04-07  6:45   ` [Qemu-devel] [Qemu-ppc] " Alexey Kardashevskiy
  2015-03-23 13:36 ` [Qemu-devel] [RFC PATCH v2 20/23] spapr: Remove vCPU objects after CPU hot unplug Bharata B Rao
                   ` (5 subsequent siblings)
  24 siblings, 2 replies; 74+ messages in thread
From: Bharata B Rao @ 2015-03-23 13:36 UTC (permalink / raw)
  To: qemu-devel
  Cc: Bharata B Rao, mdroth, agraf, qemu-ppc, tyreld, nfont, imammedo,
	afaerber, david

Support hot removal of CPUs for sPAPR guests by sending the hot
unplug notification to the guest via the EPOW interrupt.

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
---
 hw/ppc/spapr.c            | 78 ++++++++++++++++++++++++++++++++++++++++++++++-
 linux-headers/linux/kvm.h |  1 +
 target-ppc/kvm.c          |  7 +++++
 target-ppc/kvm_ppc.h      |  6 ++++
 4 files changed, 91 insertions(+), 1 deletion(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index b48994b..7b8784d 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -1468,6 +1468,12 @@ static void spapr_cpu_init(PowerPCCPU *cpu)
     qemu_register_reset(spapr_cpu_reset, cpu);
 }
 
+static void spapr_cpu_destroy(PowerPCCPU *cpu)
+{
+    xics_cpu_destroy(spapr->icp, cpu);
+    qemu_unregister_reset(spapr_cpu_reset, cpu);
+}
+
 /* pSeries LPAR / sPAPR hardware init */
 static void ppc_spapr_init(MachineState *machine)
 {
@@ -1880,6 +1886,18 @@ static void spapr_cpu_hotplug_add(DeviceState *dev, CPUState *cs, Error **errp)
     }
 }
 
+static void spapr_cpu_hotplug_remove(DeviceState *dev, CPUState *cs,
+                                     Error **errp)
+{
+    PowerPCCPU *cpu = POWERPC_CPU(cs);
+    int id = ppc_get_vcpu_dt_id(cpu);
+    sPAPRDRConnector *drc =
+        spapr_dr_connector_by_id(SPAPR_DR_CONNECTOR_TYPE_CPU, id);
+    sPAPRDRConnectorClass *drck = SPAPR_DR_CONNECTOR_GET_CLASS(drc);
+
+    drck->detach(drc, dev, NULL, NULL, errp);
+}
+
 static void spapr_cpu_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
                             Error **errp)
 {
@@ -1911,6 +1929,51 @@ static void spapr_cpu_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
     return;
 }
 
+static int spapr_cpu_unplug(Object *obj, void *opaque)
+{
+    Error **errp = opaque;
+    DeviceState *dev = DEVICE(obj);
+    CPUState *cs = CPU(dev);
+    PowerPCCPU *cpu = POWERPC_CPU(cs);
+    int id = ppc_get_vcpu_dt_id(cpu);
+    int smt = kvmppc_smt_threads();
+    sPAPRDRConnector *drc =
+        spapr_dr_connector_by_id(SPAPR_DR_CONNECTOR_TYPE_CPU, id);
+
+    spapr_cpu_destroy(cpu);
+
+    /*
+     * SMT threads return from here, only main thread (core) will
+     * continue and signal hot unplug event to the guest.
+     */
+    if ((id % smt) != 0) {
+        return 0;
+    }
+    g_assert(drc);
+
+    spapr_cpu_hotplug_remove(dev, cs, errp);
+    if (*errp) {
+        return -1;
+    }
+    spapr_hotplug_req_remove_event(drc);
+
+    return 0;
+}
+
+static int spapr_cpu_core_unplug(Object *obj, void *opaque)
+{
+    Error **errp = opaque;
+
+    object_child_foreach(obj, spapr_cpu_unplug, errp);
+    return 0;
+}
+
+static void spapr_cpu_socket_unplug(HotplugHandler *hotplug_dev,
+                            DeviceState *dev, Error **errp)
+{
+    object_child_foreach(OBJECT(dev), spapr_cpu_core_unplug, errp);
+}
+
 static void spapr_machine_device_plug(HotplugHandler *hotplug_dev,
                                       DeviceState *dev, Error **errp)
 {
@@ -1926,10 +1989,21 @@ static void spapr_machine_device_plug(HotplugHandler *hotplug_dev,
     }
 }
 
+static void spapr_machine_device_unplug(HotplugHandler *hotplug_dev,
+                                      DeviceState *dev, Error **errp)
+{
+    if (object_dynamic_cast(OBJECT(dev), TYPE_CPU_SOCKET)) {
+        if (dev->hotplugged && spapr->dr_cpu_enabled) {
+            spapr_cpu_socket_unplug(hotplug_dev, dev, errp);
+        }
+    }
+}
+
 static HotplugHandler *spapr_get_hotpug_handler(MachineState *machine,
                                              DeviceState *dev)
 {
-    if (object_dynamic_cast(OBJECT(dev), TYPE_CPU)) {
+    if (object_dynamic_cast(OBJECT(dev), TYPE_CPU) ||
+        object_dynamic_cast(OBJECT(dev), TYPE_CPU_SOCKET)) {
         return HOTPLUG_HANDLER(machine);
     }
     return NULL;
@@ -1953,6 +2027,8 @@ static void spapr_machine_class_init(ObjectClass *oc, void *data)
     mc->has_dynamic_sysbus = true;
     mc->get_hotplug_handler = spapr_get_hotpug_handler;
     hc->plug = spapr_machine_device_plug;
+    hc->unplug = spapr_machine_device_unplug;
+
     smc->dr_phb_enabled = false;
     smc->dr_cpu_enabled = false;
     smc->dr_lmb_enabled = false;
diff --git a/linux-headers/linux/kvm.h b/linux-headers/linux/kvm.h
index 12045a1..0c1be07 100644
--- a/linux-headers/linux/kvm.h
+++ b/linux-headers/linux/kvm.h
@@ -761,6 +761,7 @@ struct kvm_ppc_smmu_info {
 #define KVM_CAP_PPC_FIXUP_HCALL 103
 #define KVM_CAP_PPC_ENABLE_HCALL 104
 #define KVM_CAP_CHECK_EXTENSION_VM 105
+#define KVM_CAP_SPAPR_REUSE_VCPU 107
 
 #ifdef KVM_CAP_IRQ_ROUTING
 
diff --git a/target-ppc/kvm.c b/target-ppc/kvm.c
index 1edf2b5..ee23bf6 100644
--- a/target-ppc/kvm.c
+++ b/target-ppc/kvm.c
@@ -72,6 +72,7 @@ static int cap_ppc_watchdog;
 static int cap_papr;
 static int cap_htab_fd;
 static int cap_fixup_hcalls;
+static int cap_spapr_reuse_vcpu;
 
 static uint32_t debug_inst_opcode;
 
@@ -114,6 +115,7 @@ int kvm_arch_init(KVMState *s)
      * only activated after this by kvmppc_set_papr() */
     cap_htab_fd = kvm_check_extension(s, KVM_CAP_PPC_HTAB_FD);
     cap_fixup_hcalls = kvm_check_extension(s, KVM_CAP_PPC_FIXUP_HCALL);
+    cap_spapr_reuse_vcpu = kvm_check_extension(s, KVM_CAP_SPAPR_REUSE_VCPU);
 
     if (!cap_interrupt_level) {
         fprintf(stderr, "KVM: Couldn't find level irq capability. Expect the "
@@ -2408,3 +2410,8 @@ int kvm_arch_fixup_msi_route(struct kvm_irq_routing_entry *route,
 {
     return 0;
 }
+
+bool kvmppc_spapr_reuse_vcpu(void)
+{
+    return cap_spapr_reuse_vcpu;
+}
diff --git a/target-ppc/kvm_ppc.h b/target-ppc/kvm_ppc.h
index 2e0224c..c831229 100644
--- a/target-ppc/kvm_ppc.h
+++ b/target-ppc/kvm_ppc.h
@@ -40,6 +40,7 @@ void *kvmppc_create_spapr_tce(uint32_t liobn, uint32_t window_size, int *pfd,
 int kvmppc_remove_spapr_tce(void *table, int pfd, uint32_t window_size);
 int kvmppc_reset_htab(int shift_hint);
 uint64_t kvmppc_rma_size(uint64_t current_size, unsigned int hash_shift);
+bool kvmppc_spapr_reuse_vcpu(void);
 #endif /* !CONFIG_USER_ONLY */
 bool kvmppc_has_cap_epr(void);
 int kvmppc_define_rtas_kernel_token(uint32_t token, const char *function);
@@ -185,6 +186,11 @@ static inline int kvmppc_update_sdr1(CPUPPCState *env)
     return 0;
 }
 
+static inline bool kvmppc_spapr_reuse_vcpu(void)
+{
+    return false;
+}
+
 #endif /* !CONFIG_USER_ONLY */
 
 static inline bool kvmppc_has_cap_epr(void)
-- 
2.1.0


* [Qemu-devel] [RFC PATCH v2 20/23] spapr: Remove vCPU objects after CPU hot unplug
  2015-03-23 13:35 [Qemu-devel] [RFC PATCH v2 00/23] CPU and Memory hotplug for PowerPC sPAPR guests Bharata B Rao
                   ` (18 preceding siblings ...)
  2015-03-23 13:36 ` [Qemu-devel] [RFC PATCH v2 19/23] spapr: CPU hot unplug support Bharata B Rao
@ 2015-03-23 13:36 ` Bharata B Rao
  2015-03-25  5:46   ` David Gibson
  2015-03-23 13:36 ` [Qemu-devel] [RFC PATCH v2 21/23] spapr: Initialize hotplug memory address space Bharata B Rao
                   ` (4 subsequent siblings)
  24 siblings, 1 reply; 74+ messages in thread
From: Bharata B Rao @ 2015-03-23 13:36 UTC (permalink / raw)
  To: qemu-devel
  Cc: Bharata B Rao, mdroth, agraf, qemu-ppc, tyreld, nfont, imammedo,
	afaerber, david

Release the vCPU objects after CPU hot unplug so that the vCPU fd
can be parked and reused.

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
---
 hw/ppc/spapr.c | 19 ++++++++++++++++++-
 1 file changed, 18 insertions(+), 1 deletion(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 7b8784d..3e56d9e 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -1886,6 +1886,23 @@ static void spapr_cpu_hotplug_add(DeviceState *dev, CPUState *cs, Error **errp)
     }
 }
 
+static void spapr_cpu_release(DeviceState *dev, void *opaque)
+{
+    CPUState *cs;
+    int i;
+    int id = ppc_get_vcpu_dt_id(POWERPC_CPU(CPU(dev)));
+
+    for (i = id; i < id + smp_threads; i++) {
+        CPU_FOREACH(cs) {
+            PowerPCCPU *cpu = POWERPC_CPU(cs);
+
+            if (i == ppc_get_vcpu_dt_id(cpu)) {
+                cpu_remove(cs);
+            }
+        }
+    }
+}
+
 static void spapr_cpu_hotplug_remove(DeviceState *dev, CPUState *cs,
                                      Error **errp)
 {
@@ -1895,7 +1912,7 @@ static void spapr_cpu_hotplug_remove(DeviceState *dev, CPUState *cs,
         spapr_dr_connector_by_id(SPAPR_DR_CONNECTOR_TYPE_CPU, id);
     sPAPRDRConnectorClass *drck = SPAPR_DR_CONNECTOR_GET_CLASS(drc);
 
-    drck->detach(drc, dev, NULL, NULL, errp);
+    drck->detach(drc, dev, spapr_cpu_release, NULL, errp);
 }
 
 static void spapr_cpu_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
-- 
2.1.0


* [Qemu-devel] [RFC PATCH v2 21/23] spapr: Initialize hotplug memory address space
  2015-03-23 13:35 [Qemu-devel] [RFC PATCH v2 00/23] CPU and Memory hotplug for PowerPC sPAPR guests Bharata B Rao
                   ` (19 preceding siblings ...)
  2015-03-23 13:36 ` [Qemu-devel] [RFC PATCH v2 20/23] spapr: Remove vCPU objects after CPU hot unplug Bharata B Rao
@ 2015-03-23 13:36 ` Bharata B Rao
  2015-03-25  5:58   ` David Gibson
  2015-03-23 13:36 ` [Qemu-devel] [RFC PATCH v2 22/23] spapr: Support ibm, dynamic-reconfiguration-memory Bharata B Rao
                   ` (3 subsequent siblings)
  24 siblings, 1 reply; 74+ messages in thread
From: Bharata B Rao @ 2015-03-23 13:36 UTC (permalink / raw)
  To: qemu-devel
  Cc: Bharata B Rao, mdroth, agraf, qemu-ppc, tyreld, nfont, imammedo,
	afaerber, david

Initialize a hotplug memory region under which all the hotplugged
memory is accommodated. Also enable memory hotplug by setting
CONFIG_MEM_HOTPLUG.

Modelled on i386 memory hotplug.
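
The placement arithmetic for the hotplug region can be sketched on its own as
follows. This is an illustration under assumptions, not the patch code:
hotplug_layout, hotplug_layout_compute() and ALIGN_UP_POW2() are hypothetical
names, and the 1GB alignment mirrors SPAPR_HOTPLUG_MEM_ALIGN from this patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* 1GB alignment, mirroring SPAPR_HOTPLUG_MEM_ALIGN in the patch. */
#define HOTPLUG_MEM_ALIGN (1ULL << 30)

/* Round n up to a multiple of d; d must be a power of two. */
#define ALIGN_UP_POW2(n, d) (((n) + (d) - 1) & ~((d) - 1))

/* Hypothetical result type: where the region starts and how big it is. */
struct hotplug_layout {
    uint64_t base;
    uint64_t size;
};

/*
 * Returns false when there is no hotpluggable memory (ram_size >= maxram_size)
 * or when base + size would wrap around the 64-bit address space.
 */
static bool hotplug_layout_compute(uint64_t ram_size, uint64_t maxram_size,
                                   uint64_t ram_slots, bool enforce_aligned,
                                   struct hotplug_layout *out)
{
    uint64_t base, size;

    if (ram_size >= maxram_size) {
        return false;
    }
    size = maxram_size - ram_size;
    base = ALIGN_UP_POW2(ram_size, HOTPLUG_MEM_ALIGN);

    /* Leave room so that every slot can be aligned individually. */
    if (enforce_aligned) {
        size += HOTPLUG_MEM_ALIGN * ram_slots;
    }
    if (base + size < size) {
        return false;        /* unsigned wrap-around */
    }
    out->base = base;
    out->size = size;
    return true;
}
```

For example, with 4.25GB of boot RAM, maxmem of 8GB and 4 slots, the region
starts at the next 1GB boundary (5GB) and spans 3.75GB plus 4GB of alignment
slack.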

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
---
 default-configs/ppc64-softmmu.mak |  1 +
 hw/ppc/spapr.c                    | 50 +++++++++++++++++++++++++++++++++++++++
 include/hw/ppc/spapr.h            | 12 ++++++++++
 3 files changed, 63 insertions(+)

diff --git a/default-configs/ppc64-softmmu.mak b/default-configs/ppc64-softmmu.mak
index 22ef132..16b3011 100644
--- a/default-configs/ppc64-softmmu.mak
+++ b/default-configs/ppc64-softmmu.mak
@@ -51,3 +51,4 @@ CONFIG_XICS_KVM=$(and $(CONFIG_PSERIES),$(CONFIG_KVM))
 # For PReP
 CONFIG_MC146818RTC=y
 CONFIG_ISA_TESTDEV=y
+CONFIG_MEM_HOTPLUG=y
diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 3e56d9e..e43bb49 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -125,8 +125,13 @@ struct sPAPRMachineState {
 
     /*< public >*/
     char *kvm_type;
+    ram_addr_t hotplug_memory_base;
+    MemoryRegion hotplug_memory;
+    bool enforce_aligned_dimm;
 };
 
+#define SPAPR_MACHINE_ENFORCE_ALIGNED_DIMM "enforce-aligned-dimm"
+
 sPAPREnvironment *spapr;
 
 static XICSState *try_create_xics(const char *type, int nr_servers,
@@ -1499,6 +1504,7 @@ static void ppc_spapr_init(MachineState *machine)
     int smt = kvmppc_smt_threads();
     Object *socket;
     int sockets;
+    sPAPRMachineState *ms = SPAPR_MACHINE(machine);
 
     msi_supported = true;
 
@@ -1585,6 +1591,36 @@ static void ppc_spapr_init(MachineState *machine)
         memory_region_add_subregion(sysmem, 0, rma_region);
     }
 
+    /* initialize hotplug memory address space */
+    if (machine->ram_size < machine->maxram_size) {
+        ram_addr_t hotplug_mem_size =
+            machine->maxram_size - machine->ram_size;
+
+        if (machine->ram_slots > SPAPR_MAX_RAM_SLOTS) {
+            error_report("unsupported amount of memory slots: %"PRIu64,
+                         machine->ram_slots);
+            exit(EXIT_FAILURE);
+        }
+
+        ms->hotplug_memory_base = ROUND_UP(machine->ram_size,
+                                    SPAPR_HOTPLUG_MEM_ALIGN);
+
+        if (ms->enforce_aligned_dimm) {
+            hotplug_mem_size += SPAPR_HOTPLUG_MEM_ALIGN * machine->ram_slots;
+        }
+
+        if ((ms->hotplug_memory_base + hotplug_mem_size) < hotplug_mem_size) {
+            error_report("unsupported amount of maximum memory: " RAM_ADDR_FMT,
+                         machine->maxram_size);
+            exit(EXIT_FAILURE);
+        }
+
+        memory_region_init(&ms->hotplug_memory, OBJECT(ms),
+                           "hotplug-memory", hotplug_mem_size);
+        memory_region_add_subregion(sysmem, ms->hotplug_memory_base,
+                                    &ms->hotplug_memory);
+    }
+
     filename = qemu_find_file(QEMU_FILE_TYPE_BIOS, "spapr-rtas.bin");
     spapr->rtas_size = get_image_size(filename);
     spapr->rtas_blob = g_malloc(spapr->rtas_size);
@@ -1827,13 +1863,27 @@ static void spapr_set_kvm_type(Object *obj, const char *value, Error **errp)
     sm->kvm_type = g_strdup(value);
 }
 
+static bool spapr_machine_get_aligned_dimm(Object *obj, Error **errp)
+{
+    sPAPRMachineState *ms = SPAPR_MACHINE(obj);
+
+    return ms->enforce_aligned_dimm;
+}
+
 static void spapr_machine_initfn(Object *obj)
 {
+    sPAPRMachineState *ms = SPAPR_MACHINE(obj);
+
     object_property_add_str(obj, "kvm-type",
                             spapr_get_kvm_type, spapr_set_kvm_type, NULL);
     object_property_set_description(obj, "kvm-type",
                                     "Specifies the KVM virtualization mode (HV, PR)",
                                     NULL);
+
+    ms->enforce_aligned_dimm = true;
+    object_property_add_bool(obj, SPAPR_MACHINE_ENFORCE_ALIGNED_DIMM,
+                             spapr_machine_get_aligned_dimm,
+                             NULL, NULL);
 }
 
 static void ppc_cpu_do_nmi_on_cpu(void *arg)
diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
index ecac6e3..53560e9 100644
--- a/include/hw/ppc/spapr.h
+++ b/include/hw/ppc/spapr.h
@@ -542,6 +542,18 @@ struct sPAPREventLogEntry {
 
 #define SPAPR_MEMORY_BLOCK_SIZE (1 << 28) /* 256MB */
 
+/*
+ * This defines the maximum number of DIMM slots we can have for an sPAPR
+ * guest. This is not defined by sPAPR, but we set it to 4096 slots
+ * here. With the worst case addition of SPAPR_MEMORY_BLOCK_SIZE
+ * (256MB) memory per slot, we should be able to support 1TB of guest
+ * hotpluggable memory.
+ */
+#define SPAPR_MAX_RAM_SLOTS     (1ULL << 12)
+
+/* 1GB alignment for hotplug memory region */
+#define SPAPR_HOTPLUG_MEM_ALIGN (1ULL << 30)
+
 void spapr_events_init(sPAPREnvironment *spapr);
 void spapr_events_fdt_skel(void *fdt, uint32_t epow_irq);
 int spapr_h_cas_compose_response(target_ulong addr, target_ulong size);
-- 
2.1.0


* [Qemu-devel] [RFC PATCH v2 22/23] spapr: Support ibm, dynamic-reconfiguration-memory
  2015-03-23 13:35 [Qemu-devel] [RFC PATCH v2 00/23] CPU and Memory hotplug for PowerPC sPAPR guests Bharata B Rao
                   ` (20 preceding siblings ...)
  2015-03-23 13:36 ` [Qemu-devel] [RFC PATCH v2 21/23] spapr: Initialize hotplug memory address space Bharata B Rao
@ 2015-03-23 13:36 ` Bharata B Rao
  2015-03-26  3:44   ` David Gibson
  2015-03-23 13:36 ` [Qemu-devel] [RFC PATCH v2 23/23] spapr: Memory hotplug support Bharata B Rao
                   ` (2 subsequent siblings)
  24 siblings, 1 reply; 74+ messages in thread
From: Bharata B Rao @ 2015-03-23 13:36 UTC (permalink / raw)
  To: qemu-devel
  Cc: Bharata B Rao, mdroth, agraf, qemu-ppc, tyreld, nfont, imammedo,
	afaerber, david

Parse the ibm,architecture.vec table obtained from the guest and enable
memory node configuration via ibm,dynamic-reconfiguration-memory if the
guest supports it. This is in preparation for supporting memory hotplug
for sPAPR guests.

This changes the way memory node configuration is done. Currently all
memory nodes are built upfront. After this patch, only the memory@0 node
for the RMA is built upfront. The guest kernel boots with just that, and
the rest of the memory nodes (via memory@XXX or
ibm,dynamic-reconfiguration-memory) are built when the guest makes the
ibm,client-architecture-support call.

Note: This patch needs a SLOF enhancement which is already part of
upstream SLOF.
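
The ibm,dynamic-memory entry layout that this patch documents in
docs/specs/ppc-spapr-hotplug.txt can be sketched as a small encoder. This is
a rough illustration, not the patch code: encode_lmb_entry() and put_be32()
are hypothetical helpers, and the sketch assumes the DRC index and the
associativity list index are 32-bit cells stored big-endian like other device
tree property data.

```c
#include <stddef.h>
#include <stdint.h>

/* Flag bit from the documentation: LMB is assigned to the partition. */
#define LMB_FLAG_ASSIGNED 0x00000008u

/* Assumed entry size: 8 (address) + 4 (DRC) + 4 (reserved) + 4 + 4 bytes. */
#define LMB_ENTRY_SIZE 24

/* Store a 32-bit value in big-endian byte order. */
static void put_be32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v >> 24);
    p[1] = (uint8_t)(v >> 16);
    p[2] = (uint8_t)(v >> 8);
    p[3] = (uint8_t)v;
}

/* Encode one LMB list entry into buf; returns the number of bytes written. */
static size_t encode_lmb_entry(uint8_t *buf, uint64_t addr, uint32_t drc_index,
                               uint32_t assoc_index, uint32_t flags)
{
    put_be32(buf, (uint32_t)(addr >> 32));  /* logical address, high word */
    put_be32(buf + 4, (uint32_t)addr);      /* logical address, low word */
    put_be32(buf + 8, drc_index);           /* DRC index of the LMB */
    put_be32(buf + 12, 0);                  /* reserved for expansion */
    put_be32(buf + 16, assoc_index);        /* index into the lookup arrays */
    put_be32(buf + 20, flags);              /* flags word */
    return LMB_ENTRY_SIZE;
}
```

A full property would be a 32-bit count N followed by N such entries.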

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
---
 docs/specs/ppc-spapr-hotplug.txt |  48 +++++++++
 hw/ppc/spapr.c                   | 228 +++++++++++++++++++++++++++++++--------
 hw/ppc/spapr_hcall.c             |  51 +++++++--
 include/hw/ppc/spapr.h           |  15 ++-
 4 files changed, 293 insertions(+), 49 deletions(-)

diff --git a/docs/specs/ppc-spapr-hotplug.txt b/docs/specs/ppc-spapr-hotplug.txt
index 46e0719..9d574b5 100644
--- a/docs/specs/ppc-spapr-hotplug.txt
+++ b/docs/specs/ppc-spapr-hotplug.txt
@@ -302,4 +302,52 @@ consisting of <phys>, <size> and <maxcpus>.
 pseries guests use this property to note the maximum allowed CPUs for the
 guest.
 
+== ibm,dynamic-reconfiguration-memory ==
+
+ibm,dynamic-reconfiguration-memory is a device tree node that represents
+dynamically reconfigurable logical memory blocks (LMB). This node
+is generated only when the guest advertises the support for it via
+ibm,client-architecture-support call. Memory that is not dynamically
+reconfigurable is represented by /memory nodes. The properties of this
+node that are of interest to the sPAPR memory hotplug implementation
+in QEMU are described here.
+
+ibm,lmb-size
+
+This 64bit integer defines the size of each dynamically reconfigurable LMB.
+
+ibm,associativity-lookup-arrays
+
+This property defines a lookup array in which the NUMA associativity
+information for each LMB can be found. It is a property encoded array
+that begins with an integer M, the number of associativity lists followed
+by an integer N, the number of entries per associativity list and terminated
+by M associativity lists each of length N integers.
+
+This property provides the same information as given by ibm,associativity
+property in a /memory node. Each assigned LMB has an index value between
+0 and M-1 which is used as an index into this table to select which
+associativity list to use for the LMB. This index value for each LMB
+is defined in ibm,dynamic-memory property.
+
+ibm,dynamic-memory
+
+This property describes the dynamically reconfigurable memory. It is a
+property encoded array that has an integer N, the number of LMBs, followed
+by N LMB list entries.
+
+Each LMB list entry consists of the following elements:
+
+- Logical address of the start of the LMB encoded as a 64bit integer. This
+  corresponds to reg property in /memory node.
+- DRC index of the LMB that corresponds to ibm,my-drc-index property
+  in a /memory node.
+- Four bytes reserved for expansion.
+- Associativity list index for the LMB that is used as an index into
+  ibm,associativity-lookup-arrays property described earlier. This
+  is used to retrieve the right associativity list to be used for this
+  LMB.
+- A 32bit flags word. The bit at bit position 0x00000008 defines whether
+  the LMB is assigned to the partition as of boot time.
+
 [1] http://thread.gmane.org/gmane.linux.ports.ppc.embedded/75350/focus=106867
diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index e43bb49..4e844ab 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -541,42 +541,6 @@ static void *spapr_create_fdt_skel(hwaddr initrd_base,
     return fdt;
 }
 
-int spapr_h_cas_compose_response(target_ulong addr, target_ulong size)
-{
-    void *fdt, *fdt_skel;
-    sPAPRDeviceTreeUpdateHeader hdr = { .version_id = 1 };
-
-    size -= sizeof(hdr);
-
-    /* Create sceleton */
-    fdt_skel = g_malloc0(size);
-    _FDT((fdt_create(fdt_skel, size)));
-    _FDT((fdt_begin_node(fdt_skel, "")));
-    _FDT((fdt_end_node(fdt_skel)));
-    _FDT((fdt_finish(fdt_skel)));
-    fdt = g_malloc0(size);
-    _FDT((fdt_open_into(fdt_skel, fdt, size)));
-    g_free(fdt_skel);
-
-    /* Fix skeleton up */
-    _FDT((spapr_fixup_cpu_dt(fdt, spapr)));
-
-    /* Pack resulting tree */
-    _FDT((fdt_pack(fdt)));
-
-    if (fdt_totalsize(fdt) + sizeof(hdr) > size) {
-        trace_spapr_cas_failed(size);
-        return -1;
-    }
-
-    cpu_physical_memory_write(addr, &hdr, sizeof(hdr));
-    cpu_physical_memory_write(addr + sizeof(hdr), fdt, fdt_totalsize(fdt));
-    trace_spapr_cas_continue(fdt_totalsize(fdt) + sizeof(hdr));
-    g_free(fdt);
-
-    return 0;
-}
-
 static void spapr_populate_memory_node(void *fdt, int nodeid, hwaddr start,
                                        hwaddr size)
 {
@@ -630,7 +594,6 @@ static int spapr_populate_memory(sPAPREnvironment *spapr, void *fdt)
         }
         if (!mem_start) {
             /* ppc_spapr_init() checks for rma_size <= node0_size already */
-            spapr_populate_memory_node(fdt, i, 0, spapr->rma_size);
             mem_start += spapr->rma_size;
             node_size -= spapr->rma_size;
         }
@@ -775,6 +738,186 @@ static void spapr_populate_cpu_dt_node(void *fdt, sPAPREnvironment *spapr)
 
 }
 
+/*
+ * TODO: Take care of sparsemem configuration ?
+ */
+static uint64_t numa_node_end(uint32_t nodeid)
+{
+    uint32_t i = 0;
+    uint64_t addr = 0;
+
+    do {
+        addr += numa_info[i].node_mem;
+    } while (++i <= nodeid);
+
+    return addr;
+}
+
+static uint64_t numa_node_start(uint32_t nodeid)
+{
+    if (!nodeid) {
+        return 0;
+    } else {
+        return numa_node_end(nodeid - 1);
+    }
+}
+
+/*
+ * Given the addr, return the NUMA node to which the address belongs.
+ */
+static uint32_t get_numa_node(uint64_t addr)
+{
+    uint32_t i;
+
+    for (i = 0; i < nb_numa_nodes; i++) {
+        if ((addr >= numa_node_start(i)) && (addr < numa_node_end(i))) {
+            return i;
+        }
+    }
+
+    /* Unassigned memory goes to node 0 by default */
+    return 0;
+}
+
+/*
+ * Adds ibm,dynamic-reconfiguration-memory node.
+ * Refer to docs/specs/ppc-spapr-hotplug.txt for the documentation
+ * of this device tree node.
+ */
+static int spapr_populate_drconf_memory(sPAPREnvironment *spapr, void *fdt)
+{
+    int ret, i, offset;
+    uint32_t lmb_size = SPAPR_MEMORY_BLOCK_SIZE;
+    uint32_t nr_rma_lmbs = spapr->rma_size/lmb_size;
+    uint32_t nr_lmbs = spapr->maxram_limit/lmb_size - nr_rma_lmbs;
+    uint32_t nr_assigned_lmbs = spapr->ram_limit/lmb_size - nr_rma_lmbs;
+    uint32_t *int_buf, *cur_index, buf_len;
+
+    /* Allocate enough buffer size to fit in ibm,dynamic-memory */
+    buf_len = nr_lmbs * SPAPR_DR_LMB_LIST_ENTRY_SIZE * sizeof(uint32_t) +
+                sizeof(uint32_t);
+    cur_index = int_buf = g_malloc0(buf_len);
+
+    offset = fdt_add_subnode(fdt, 0, "ibm,dynamic-reconfiguration-memory");
+
+    ret = fdt_setprop_u64(fdt, offset, "ibm,lmb-size", lmb_size);
+    if (ret < 0) {
+        goto out;
+    }
+
+    ret = fdt_setprop_cell(fdt, offset, "ibm,memory-flags-mask", 0xff);
+    if (ret < 0) {
+        goto out;
+    }
+
+    ret = fdt_setprop_cell(fdt, offset, "ibm,memory-preservation-time", 0x0);
+    if (ret < 0) {
+        goto out;
+    }
+
+    /* ibm,dynamic-memory */
+    int_buf[0] = cpu_to_be32(nr_lmbs);
+    cur_index++;
+    for (i = 0; i < nr_lmbs; i++) {
+        sPAPRDRConnector *drc;
+        sPAPRDRConnectorClass *drck;
+        uint64_t addr;
+        uint32_t *dynamic_memory = cur_index;
+
+        if (i < nr_assigned_lmbs) {
+            addr = (uint64_t)(i + nr_rma_lmbs) * lmb_size;
+        } else {
+            addr = (uint64_t)(i - nr_assigned_lmbs) * lmb_size +
+                SPAPR_MACHINE(qdev_get_machine())->hotplug_memory_base;
+        }
+        drc = spapr_dr_connector_new(qdev_get_machine(),
+                SPAPR_DR_CONNECTOR_TYPE_LMB, addr/lmb_size);
+        drck = SPAPR_DR_CONNECTOR_GET_CLASS(drc);
+
+        dynamic_memory[0] = cpu_to_be32(addr >> 32);
+        dynamic_memory[1] = cpu_to_be32(addr & 0xffffffff);
+        dynamic_memory[2] = cpu_to_be32(drck->get_index(drc));
+        dynamic_memory[3] = cpu_to_be32(0); /* reserved */
+        dynamic_memory[4] = cpu_to_be32(get_numa_node(addr));
+        dynamic_memory[5] = (addr < spapr->ram_limit) ?
+                            cpu_to_be32(SPAPR_LMB_FLAGS_ASSIGNED) :
+                            cpu_to_be32(0);
+
+        cur_index += SPAPR_DR_LMB_LIST_ENTRY_SIZE;
+    }
+    ret = fdt_setprop(fdt, offset, "ibm,dynamic-memory", int_buf, buf_len);
+    if (ret < 0) {
+        goto out;
+    }
+
+    /* ibm,associativity-lookup-arrays */
+    cur_index = int_buf;
+    int_buf[0] = cpu_to_be32(nb_numa_nodes);
+    int_buf[1] = cpu_to_be32(4); /* Number of entries per associativity list */
+    cur_index += 2;
+    for (i = 0; i < nb_numa_nodes; i++) {
+        uint32_t associativity[] = {
+            cpu_to_be32(0x0),
+            cpu_to_be32(0x0),
+            cpu_to_be32(0x0),
+            cpu_to_be32(i)
+        };
+        memcpy(cur_index, associativity, sizeof(associativity));
+        cur_index += 4;
+    }
+    ret = fdt_setprop(fdt, offset, "ibm,associativity-lookup-arrays", int_buf,
+            (cur_index - int_buf) * sizeof(uint32_t));
+out:
+    g_free(int_buf);
+    return ret;
+}
+
+int spapr_h_cas_compose_response(target_ulong addr, target_ulong size,
+                                bool cpu_update, bool memory_update)
+{
+    void *fdt, *fdt_skel;
+    sPAPRDeviceTreeUpdateHeader hdr = { .version_id = 1 };
+
+    size -= sizeof(hdr);
+
+    /* Create skeleton */
+    fdt_skel = g_malloc0(size);
+    _FDT((fdt_create(fdt_skel, size)));
+    _FDT((fdt_begin_node(fdt_skel, "")));
+    _FDT((fdt_end_node(fdt_skel)));
+    _FDT((fdt_finish(fdt_skel)));
+    fdt = g_malloc0(size);
+    _FDT((fdt_open_into(fdt_skel, fdt, size)));
+    g_free(fdt_skel);
+
+    /* Fixup cpu nodes */
+    if (cpu_update) {
+        _FDT((spapr_fixup_cpu_dt(fdt, spapr)));
+    }
+
+    /* Generate memory nodes or ibm,dynamic-reconfiguration-memory node */
+    if (memory_update) {
+        _FDT((spapr_populate_drconf_memory(spapr, fdt)));
+    } else {
+        _FDT((spapr_populate_memory(spapr, fdt)));
+    }
+
+    /* Pack resulting tree */
+    _FDT((fdt_pack(fdt)));
+
+    if (fdt_totalsize(fdt) + sizeof(hdr) > size) {
+        trace_spapr_cas_failed(size);
+        return -1;
+    }
+
+    cpu_physical_memory_write(addr, &hdr, sizeof(hdr));
+    cpu_physical_memory_write(addr + sizeof(hdr), fdt, fdt_totalsize(fdt));
+    trace_spapr_cas_continue(fdt_totalsize(fdt) + sizeof(hdr));
+    g_free(fdt);
+
+    return 0;
+}
+
 static void spapr_finalize_fdt(sPAPREnvironment *spapr,
                                hwaddr fdt_addr,
                                hwaddr rtas_addr,
@@ -791,11 +934,12 @@ static void spapr_finalize_fdt(sPAPREnvironment *spapr,
     /* open out the base tree into a temp buffer for the final tweaks */
     _FDT((fdt_open_into(spapr->fdt_skel, fdt, FDT_MAX_SIZE)));
 
-    ret = spapr_populate_memory(spapr, fdt);
-    if (ret < 0) {
-        fprintf(stderr, "couldn't setup memory nodes in fdt\n");
-        exit(1);
-    }
+    /*
+     * Add memory@0 node to represent RMA. Rest of the memory is either
+     * represented by memory nodes or ibm,dynamic-reconfiguration-memory
+     * node later during ibm,client-architecture-support call.
+     */
+    spapr_populate_memory_node(fdt, 0, 0, spapr->rma_size);
 
     ret = spapr_populate_vdevice(spapr->vio_bus, fdt);
     if (ret < 0) {
diff --git a/hw/ppc/spapr_hcall.c b/hw/ppc/spapr_hcall.c
index 4f76f1c..20507c6 100644
--- a/hw/ppc/spapr_hcall.c
+++ b/hw/ppc/spapr_hcall.c
@@ -807,6 +807,32 @@ static target_ulong h_set_mode(PowerPCCPU *cpu, sPAPREnvironment *spapr,
     return ret;
 }
 
+/*
+ * Return the offset to the requested option vector @vector in the
+ * option vector table @table.
+ */
+static target_ulong cas_get_option_vector(int vector, target_ulong table)
+{
+    int i;
+    char nr_vectors, nr_entries;
+
+    if (!table) {
+        return 0;
+    }
+
+    nr_vectors = (rtas_ld(table, 0) >> 24) + 1;
+    if (!vector || vector > nr_vectors) {
+        return 0;
+    }
+    table++; /* skip nr option vectors */
+
+    for (i = 0; i < vector - 1; i++) {
+        nr_entries = rtas_ld(table, 0) >> 24;
+        table += nr_entries + 2;
+    }
+    return table;
+}
+
 typedef struct {
     PowerPCCPU *cpu;
     uint32_t cpu_version;
@@ -827,19 +853,22 @@ static void do_set_compat(void *arg)
     ((cpuver) == CPU_POWERPC_LOGICAL_2_06_PLUS) ? 2061 : \
     ((cpuver) == CPU_POWERPC_LOGICAL_2_07) ? 2070 : 0)
 
+#define OV5_DRCONF_MEMORY 0x20
+
 static target_ulong h_client_architecture_support(PowerPCCPU *cpu_,
                                                   sPAPREnvironment *spapr,
                                                   target_ulong opcode,
                                                   target_ulong *args)
 {
-    target_ulong list = args[0];
+    target_ulong list = args[0], ov_table;
     PowerPCCPUClass *pcc_ = POWERPC_CPU_GET_CLASS(cpu_);
     CPUState *cs;
-    bool cpu_match = false;
+    bool cpu_match = false, cpu_update = true, memory_update = false;
     unsigned old_cpu_version = cpu_->cpu_version;
     unsigned compat_lvl = 0, cpu_version = 0;
     unsigned max_lvl = get_compat_level(cpu_->max_compat);
     int counter;
+    char ov5_byte2;
 
     /* Parse PVR list */
     for (counter = 0; counter < 512; ++counter) {
@@ -889,8 +918,6 @@ static target_ulong h_client_architecture_support(PowerPCCPU *cpu_,
         }
     }
 
-    /* For the future use: here @list points to the first capability */
-
     /* Parsing finished */
     trace_spapr_cas_pvr(cpu_->cpu_version, cpu_match,
                         cpu_version, pcc_->pcr_mask);
@@ -914,14 +941,26 @@ static target_ulong h_client_architecture_support(PowerPCCPU *cpu_,
     }
 
     if (!cpu_version) {
-        return H_SUCCESS;
+        cpu_update = false;
     }
 
+    /* For the future use: here @ov_table points to the first option vector */
+    ov_table = list;
+
+    list = cas_get_option_vector(5, ov_table);
     if (!list) {
         return H_SUCCESS;
     }
 
-    if (spapr_h_cas_compose_response(args[1], args[2])) {
+    /* @list now points to OV 5 */
+    list += 2;
+    ov5_byte2 = rtas_ld(list, 0) >> 24;
+    if (ov5_byte2 & OV5_DRCONF_MEMORY) {
+        memory_update = true;
+    }
+
+    if (spapr_h_cas_compose_response(args[1], args[2], cpu_update,
+                                    memory_update)) {
         qemu_system_reset_request();
     }
 
diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
index 53560e9..a286fe7 100644
--- a/include/hw/ppc/spapr.h
+++ b/include/hw/ppc/spapr.h
@@ -554,9 +554,22 @@ struct sPAPREventLogEntry {
 /* 1GB alignment for hotplug memory region */
 #define SPAPR_HOTPLUG_MEM_ALIGN (1ULL << 30)
 
+/*
+ * Number of 32 bit words in each LMB list entry in ibm,dynamic-memory
+ * property under ibm,dynamic-reconfiguration-memory node.
+ */
+#define SPAPR_DR_LMB_LIST_ENTRY_SIZE 6
+
+/*
+ * This flag value defines the LMB as assigned in ibm,dynamic-memory
+ * property under ibm,dynamic-reconfiguration-memory node.
+ */
+#define SPAPR_LMB_FLAGS_ASSIGNED 0x00000008
+
 void spapr_events_init(sPAPREnvironment *spapr);
 void spapr_events_fdt_skel(void *fdt, uint32_t epow_irq);
-int spapr_h_cas_compose_response(target_ulong addr, target_ulong size);
+int spapr_h_cas_compose_response(target_ulong addr, target_ulong size,
+                                bool cpu_update, bool memory_update);
 sPAPRTCETable *spapr_tce_new_table(DeviceState *owner, uint32_t liobn,
                                    uint64_t bus_offset,
                                    uint32_t page_shift,
-- 
2.1.0
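
For readers following the ibm,dynamic-memory layout documented in this patch, here is a minimal standalone sketch of packing one LMB list entry into six big-endian 32-bit cells. It does not use QEMU or libfdt. put_be32(), get_be32(), lmb_entry_pack() and the LMB_* macros are hypothetical names made up for illustration; only the cell order and the 0x00000008 "assigned" flag are taken from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LMB_ENTRY_CELLS    6           /* mirrors SPAPR_DR_LMB_LIST_ENTRY_SIZE */
#define LMB_FLAGS_ASSIGNED 0x00000008u /* mirrors SPAPR_LMB_FLAGS_ASSIGNED */

/* Store a 32-bit value in big-endian byte order, independent of the host. */
static void put_be32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v >> 24);
    p[1] = (uint8_t)(v >> 16);
    p[2] = (uint8_t)(v >> 8);
    p[3] = (uint8_t)v;
}

/* Read back a big-endian 32-bit value. */
static uint32_t get_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/*
 * Pack one LMB list entry: 64-bit address (two cells), DRC index,
 * reserved cell, associativity list index, flags.
 */
static void lmb_entry_pack(uint8_t out[LMB_ENTRY_CELLS * 4], uint64_t addr,
                           uint32_t drc_index, uint32_t assoc_index,
                           int assigned)
{
    assert(out != NULL);
    put_be32(out + 0,  (uint32_t)(addr >> 32));
    put_be32(out + 4,  (uint32_t)(addr & 0xffffffffu));
    put_be32(out + 8,  drc_index);
    put_be32(out + 12, 0); /* reserved for expansion */
    put_be32(out + 16, assoc_index);
    put_be32(out + 20, assigned ? LMB_FLAGS_ASSIGNED : 0);
}
```

A guest-side parser would read the leading count N and then step through the property in strides of six cells, using the fifth cell of each entry as the index into ibm,associativity-lookup-arrays.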


* [Qemu-devel] [RFC PATCH v2 23/23] spapr: Memory hotplug support
  2015-03-23 13:35 [Qemu-devel] [RFC PATCH v2 00/23] CPU and Memory hotplug for PowerPC sPAPR guests Bharata B Rao
                   ` (21 preceding siblings ...)
  2015-03-23 13:36 ` [Qemu-devel] [RFC PATCH v2 22/23] spapr: Support ibm, dynamic-reconfiguration-memory Bharata B Rao
@ 2015-03-23 13:36 ` Bharata B Rao
  2015-03-26  3:57   ` David Gibson
  2015-03-26  3:58 ` [Qemu-devel] [RFC PATCH v2 00/23] CPU and Memory hotplug for PowerPC sPAPR guests David Gibson
  2015-04-06 10:19 ` Bharata B Rao
  24 siblings, 1 reply; 74+ messages in thread
From: Bharata B Rao @ 2015-03-23 13:36 UTC (permalink / raw)
  To: qemu-devel
  Cc: Bharata B Rao, mdroth, agraf, qemu-ppc, tyreld, nfont, imammedo,
	afaerber, david

Make use of pc-dimm infrastructure to support memory hotplug
for PowerPC.

Modelled on i386 memory hotplug.

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
---
 hw/ppc/spapr.c        | 119 +++++++++++++++++++++++++++++++++++++++++++++++++-
 hw/ppc/spapr_events.c |   3 ++
 2 files changed, 120 insertions(+), 2 deletions(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 4e844ab..bc46acd 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -61,7 +61,8 @@
 #include "hw/nmi.h"
 
 #include "hw/compat.h"
-
+#include "hw/mem/pc-dimm.h"
+#include "qapi/qmp/qerror.h"
 #include <libfdt.h>
 
 /* SLOF memory layout:
@@ -902,6 +903,10 @@ int spapr_h_cas_compose_response(target_ulong addr, target_ulong size,
         _FDT((spapr_populate_memory(spapr, fdt)));
     }
 
+    if (spapr->dr_lmb_enabled) {
+        _FDT(spapr_drc_populate_dt(fdt, 0, NULL, SPAPR_DR_CONNECTOR_TYPE_LMB));
+    }
+
     /* Pack resulting tree */
     _FDT((fdt_pack(fdt)));
 
@@ -2185,6 +2190,109 @@ static void spapr_cpu_socket_unplug(HotplugHandler *hotplug_dev,
     object_child_foreach(OBJECT(dev), spapr_cpu_core_unplug, errp);
 }
 
+static void spapr_add_lmbs(uint64_t addr, uint64_t size, Error **errp)
+{
+    sPAPRDRConnector *drc;
+    uint32_t nr_lmbs = size/SPAPR_MEMORY_BLOCK_SIZE;
+    int i;
+
+    if (size % SPAPR_MEMORY_BLOCK_SIZE) {
+        error_setg(errp, "Hotplugged memory size must be a multiple of "
+                      "%d MB", SPAPR_MEMORY_BLOCK_SIZE/(1024 * 1024));
+        return;
+    }
+
+    for (i = 0; i < nr_lmbs; i++) {
+        drc = spapr_dr_connector_by_id(SPAPR_DR_CONNECTOR_TYPE_LMB,
+                addr/SPAPR_MEMORY_BLOCK_SIZE);
+        g_assert(drc);
+        spapr_hotplug_req_add_event(drc);
+        addr += SPAPR_MEMORY_BLOCK_SIZE;
+    }
+}
+
+static void spapr_memory_plug(HotplugHandler *hotplug_dev,
+                         DeviceState *dev, Error **errp)
+{
+    int slot;
+    Error *local_err = NULL;
+    sPAPRMachineState *ms = SPAPR_MACHINE(hotplug_dev);
+    MachineState *machine = MACHINE(hotplug_dev);
+    PCDIMMDevice *dimm = PC_DIMM(dev);
+    PCDIMMDeviceClass *ddc = PC_DIMM_GET_CLASS(dimm);
+    MemoryRegion *mr = ddc->get_memory_region(dimm);
+    uint64_t existing_dimms_capacity = 0;
+    uint64_t align = TARGET_PAGE_SIZE;
+    uint64_t addr;
+
+    addr = object_property_get_int(OBJECT(dimm), PC_DIMM_ADDR_PROP, &local_err);
+    if (local_err) {
+        goto out;
+    }
+
+    if (memory_region_get_alignment(mr) && ms->enforce_aligned_dimm) {
+        align = memory_region_get_alignment(mr);
+    }
+
+    addr = pc_dimm_get_free_addr(ms->hotplug_memory_base,
+                                 memory_region_size(&ms->hotplug_memory),
+                                 !addr ? NULL : &addr, align,
+                                 memory_region_size(mr), &local_err);
+    if (local_err) {
+        goto out;
+    }
+
+    existing_dimms_capacity = pc_existing_dimms_capacity(&local_err);
+    if (local_err) {
+        goto out;
+    }
+
+    if (existing_dimms_capacity + memory_region_size(mr) >
+        machine->maxram_size - machine->ram_size) {
+        error_setg(&local_err, "not enough space, currently 0x%" PRIx64
+                   " in use of total hot pluggable 0x" RAM_ADDR_FMT,
+                   existing_dimms_capacity,
+                   machine->maxram_size - machine->ram_size);
+        goto out;
+    }
+
+    object_property_set_int(OBJECT(dev), addr, PC_DIMM_ADDR_PROP, &local_err);
+    if (local_err) {
+        goto out;
+    }
+    trace_mhp_pc_dimm_assigned_address(addr);
+
+    slot = object_property_get_int(OBJECT(dev), PC_DIMM_SLOT_PROP, &local_err);
+    if (local_err) {
+        goto out;
+    }
+
+    slot = pc_dimm_get_free_slot(slot == PC_DIMM_UNASSIGNED_SLOT ? NULL : &slot,
+                                 machine->ram_slots, &local_err);
+    if (local_err) {
+        goto out;
+    }
+    object_property_set_int(OBJECT(dev), slot, PC_DIMM_SLOT_PROP, &local_err);
+    if (local_err) {
+        goto out;
+    }
+    trace_mhp_pc_dimm_assigned_slot(slot);
+
+    if (kvm_enabled() && !kvm_has_free_slot(machine)) {
+        error_setg(&local_err, "hypervisor has no free memory slots left");
+        goto out;
+    }
+
+    memory_region_add_subregion(&ms->hotplug_memory,
+                                addr - ms->hotplug_memory_base, mr);
+    vmstate_register_ram(mr, dev);
+
+    spapr_add_lmbs(addr, memory_region_size(mr), &local_err);
+
+out:
+    error_propagate(errp, local_err);
+}
+
 static void spapr_machine_device_plug(HotplugHandler *hotplug_dev,
                                       DeviceState *dev, Error **errp)
 {
@@ -2197,6 +2305,9 @@ static void spapr_machine_device_plug(HotplugHandler *hotplug_dev,
         if (dev->hotplugged && spapr->dr_cpu_enabled) {
             spapr_cpu_plug(hotplug_dev, dev, errp);
         }
+    } else if (object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM) &&
+                spapr->dr_lmb_enabled) {
+        spapr_memory_plug(hotplug_dev, dev, errp);
     }
 }
 
@@ -2207,6 +2318,9 @@ static void spapr_machine_device_unplug(HotplugHandler *hotplug_dev,
         if (dev->hotplugged && spapr->dr_cpu_enabled) {
             spapr_cpu_socket_unplug(hotplug_dev, dev, errp);
         }
+    } else if (object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM) &&
+                spapr->dr_lmb_enabled) {
+        error_setg(errp, "Memory hot unplug not supported by sPAPR");
     }
 }
 
@@ -2214,7 +2328,8 @@ static HotplugHandler *spapr_get_hotpug_handler(MachineState *machine,
                                              DeviceState *dev)
 {
     if (object_dynamic_cast(OBJECT(dev), TYPE_CPU) ||
-        object_dynamic_cast(OBJECT(dev), TYPE_CPU_SOCKET)) {
+        object_dynamic_cast(OBJECT(dev), TYPE_CPU_SOCKET) ||
+        object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM)) {
         return HOTPLUG_HANDLER(machine);
     }
     return NULL;
diff --git a/hw/ppc/spapr_events.c b/hw/ppc/spapr_events.c
index 4ae818a..e2a22d2 100644
--- a/hw/ppc/spapr_events.c
+++ b/hw/ppc/spapr_events.c
@@ -431,6 +431,9 @@ static void spapr_hotplug_req_event(sPAPRDRConnector *drc, uint8_t hp_action)
     case SPAPR_DR_CONNECTOR_TYPE_CPU:
         hp->hotplug_type = RTAS_LOG_V6_HP_TYPE_CPU;
         break;
+    case SPAPR_DR_CONNECTOR_TYPE_LMB:
+        hp->hotplug_type = RTAS_LOG_V6_HP_TYPE_MEMORY;
+        break;
     default:
         /* we shouldn't be signaling hotplug events for resources
          * that don't support them
-- 
2.1.0
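
As a rough illustration of what spapr_add_lmbs() does with a hotplugged DIMM, the sketch below splits an address range into LMB-sized blocks and computes the connector id of each block as addr / block size. It is standalone C, not QEMU code. lmb_ids_for_range() and LMB_SIZE are hypothetical names, and the 256 MB block size is an assumption based on the description of patch 04 in this series.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LMB_SIZE (256ULL * 1024 * 1024) /* assumed block size, 256 MB */

/*
 * Fill ids[] with the connector id of each LMB covered by [addr, addr + size).
 * Returns the number of LMBs, or -1 if size is not a multiple of LMB_SIZE
 * or ids[] is too small.
 */
static int lmb_ids_for_range(uint64_t addr, uint64_t size,
                             uint32_t *ids, size_t max_ids)
{
    uint64_t nr_lmbs = size / LMB_SIZE;
    uint64_t i;

    assert(ids != NULL);
    if (size % LMB_SIZE || nr_lmbs > max_ids) {
        return -1;
    }
    for (i = 0; i < nr_lmbs; i++) {
        ids[i] = (uint32_t)(addr / LMB_SIZE);
        addr += LMB_SIZE;
    }
    return (int)nr_lmbs;
}
```

In the patch each id is then looked up with spapr_dr_connector_by_id() and a hotplug event is queued per block; the sketch stops at computing the ids.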


* Re: [Qemu-devel] [RFC PATCH v2 01/23] spapr: enable PHB/CPU/LMB hotplug for pseries-2.3
  2015-03-23 13:35 ` [Qemu-devel] [RFC PATCH v2 01/23] spapr: enable PHB/CPU/LMB hotplug for pseries-2.3 Bharata B Rao
@ 2015-03-25  0:04   ` David Gibson
  0 siblings, 0 replies; 74+ messages in thread
From: David Gibson @ 2015-03-25  0:04 UTC (permalink / raw)
  To: Bharata B Rao
  Cc: mdroth, agraf, qemu-devel, qemu-ppc, tyreld, nfont, imammedo, afaerber

On Mon, Mar 23, 2015 at 07:05:42PM +0530, Bharata B Rao wrote:
> From: Michael Roth <mdroth@linux.vnet.ibm.com>
> 
> Introduce an sPAPRMachineClass sub-class of MachineClass to
> handle sPAPR-specific machine configuration properties.
> 
> The 'dr_phb[cpu,lmb]_enabled' field of that class can be set as
> part of machine-specific init code, and is then propagated
> to sPAPREnvironment to conditionally enable creation of DRC
> objects and device-tree description to facilitate hotplug
> of PHBs/CPUs/LMBs.
> 
> Since we can't migrate this state to older machine types,
> default the option to false and only enable it for new
> machine types.
> 
> Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
> Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
>               [Added CPU and LMB bits]
> Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
> Reviewed-by: David Gibson <david@gibson.dropbear.id.au>

[snip]
> @@ -1854,11 +1882,15 @@ static const TypeInfo spapr_machine_2_2_info = {
>  static void spapr_machine_2_3_class_init(ObjectClass *oc, void *data)
>  {
>      MachineClass *mc = MACHINE_CLASS(oc);
> +    sPAPRMachineClass *smc = SPAPR_MACHINE_CLASS(oc);
>  
>      mc->name = "pseries-2.3";
>      mc->desc = "pSeries Logical Partition (PAPR compliant) v2.3";
>      mc->alias = "pseries";
>      mc->is_default = 1;
> +    smc->dr_phb_enabled = true;
> +    smc->dr_cpu_enabled = true;
> +    smc->dr_lmb_enabled = true;
>  }

Presumably this will move to pseries-2.4 before merge.

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson


* Re: [Qemu-devel] [RFC PATCH v2 02/23] spapr: Add DRC dt entries for CPUs
  2015-03-23 13:35 ` [Qemu-devel] [RFC PATCH v2 02/23] spapr: Add DRC dt entries for CPUs Bharata B Rao
@ 2015-03-25  0:07   ` David Gibson
  2015-03-25  5:02     ` Bharata B Rao
  0 siblings, 1 reply; 74+ messages in thread
From: David Gibson @ 2015-03-25  0:07 UTC (permalink / raw)
  To: Bharata B Rao
  Cc: mdroth, agraf, qemu-devel, qemu-ppc, tyreld, nfont, imammedo, afaerber

On Mon, Mar 23, 2015 at 07:05:43PM +0530, Bharata B Rao wrote:
> Advertise CPU DR-capability to the guest via device tree.
> 
> Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
> Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
>                [spapr_drc_reset implementation]
> Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
> ---
>  hw/ppc/spapr.c | 29 +++++++++++++++++++++++++++++
>  1 file changed, 29 insertions(+)
> 
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index a782e28..920e650 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -807,6 +807,15 @@ static void spapr_finalize_fdt(sPAPREnvironment *spapr,
>          spapr_populate_chosen_stdout(fdt, spapr->vio_bus);
>      }
>  
> +    if (spapr->dr_cpu_enabled) {
> +        int offset = fdt_path_offset(fdt, "/cpus");
> +        ret = spapr_drc_populate_dt(fdt, offset, NULL,
> +                                    SPAPR_DR_CONNECTOR_TYPE_CPU);
> +        if (ret < 0) {
> +            fprintf(stderr, "Couldn't set up CPU DR device tree properties\n");
> +        }
> +    }
> +
>      _FDT((fdt_pack(fdt)));
>  
>      if (fdt_totalsize(fdt) > FDT_MAX_SIZE) {
> @@ -1393,6 +1402,16 @@ static SaveVMHandlers savevm_htab_handlers = {
>      .load_state = htab_load,
>  };
>  
> +static void spapr_drc_reset(void *opaque)
> +{
> +    sPAPRDRConnector *drc = opaque;
> +    DeviceState *d = DEVICE(drc);
> +
> +    if (d) {
> +        device_reset(d);
> +    }
> +}
> +
>  /* pSeries LPAR / sPAPR hardware init */
>  static void ppc_spapr_init(MachineState *machine)
>  {
> @@ -1418,6 +1437,7 @@ static void ppc_spapr_init(MachineState *machine)
>      long load_limit, fw_size;
>      bool kernel_le = false;
>      char *filename;
> +    int smt = kvmppc_smt_threads();
>  
>      msi_supported = true;
>  
> @@ -1564,6 +1584,15 @@ static void ppc_spapr_init(MachineState *machine)
>      spapr->dr_cpu_enabled = smc->dr_cpu_enabled;
>      spapr->dr_lmb_enabled = smc->dr_lmb_enabled;
>  
> +    if (spapr->dr_cpu_enabled) {
> +        for (i = 0; i < max_cpus/smp_threads; i++) {
> +            sPAPRDRConnector *drc =
> +                spapr_dr_connector_new(OBJECT(machine),
> +                                       SPAPR_DR_CONNECTOR_TYPE_CPU, i * smt);
> +            qemu_register_reset(spapr_drc_reset, drc);

This seems to be per-core, rather than per-socket as your patch
comments suggest.
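
To make the per-core observation concrete, here is a small standalone sketch that enumerates the connector ids the quoted loop would create. cpu_drc_ids() is a hypothetical helper; the only things taken from the patch are the loop bound (max_cpus / smp_threads) and the id formula (i * smt).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Enumerate the connector ids produced by the quoted loop: one id per core,
 * spaced by the host SMT stride. Returns the number of ids written.
 */
static size_t cpu_drc_ids(uint32_t max_cpus, uint32_t threads_per_core,
                          uint32_t smt, uint32_t *ids, size_t max_ids)
{
    size_t n = 0;
    uint32_t i;

    assert(threads_per_core > 0);
    for (i = 0; i < max_cpus / threads_per_core && n < max_ids; i++) {
        ids[n++] = i * smt;
    }
    return n;
}
```

With 16 maximum CPUs, 4 threads per core and an SMT stride of 8, this yields four connectors with ids 0, 8, 16 and 24: one per core, regardless of how the cores are grouped into sockets.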

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson


* Re: [Qemu-devel] [RFC PATCH v2 04/23] spapr: Support ibm, lrdr-capacity device tree property
  2015-03-23 13:35 ` [Qemu-devel] [RFC PATCH v2 04/23] spapr: Support ibm, lrdr-capacity device tree property Bharata B Rao
@ 2015-03-25  0:15   ` David Gibson
  2015-04-01  3:59     ` Bharata B Rao
  0 siblings, 1 reply; 74+ messages in thread
From: David Gibson @ 2015-03-25  0:15 UTC (permalink / raw)
  To: Bharata B Rao
  Cc: mdroth, agraf, qemu-devel, qemu-ppc, tyreld, nfont, imammedo, afaerber

On Mon, Mar 23, 2015 at 07:05:45PM +0530, Bharata B Rao wrote:
> Add support for ibm,lrdr-capacity since this is needed by the guest
> kernel to know about the possible hot-pluggable CPUs and Memory. With
> this, pseries kernels will start reporting correct maxcpus in
> /sys/devices/system/cpu/possible.
> 
> Define minimum hotpluggable memory size as 256MB and start storing maximum
> possible memory for the guest in sPAPREnvironment.

[snip]
> @@ -666,6 +668,18 @@ int spapr_rtas_device_tree_setup(void *fdt, hwaddr rtas_addr,
>          }
>  
>      }
> +
> +    lrdr_capacity[0] = cpu_to_be32(spapr->maxram_limit >> 32);
> +    lrdr_capacity[1] = cpu_to_be32(spapr->maxram_limit & 0xffffffff);
> +    lrdr_capacity[2] = 0;
> +    lrdr_capacity[3] = cpu_to_be32(SPAPR_MEMORY_BLOCK_SIZE);
> +    lrdr_capacity[4] = cpu_to_be32(max_cpus/smp_threads);
> +    ret = qemu_fdt_setprop(fdt, "/rtas", "ibm,lrdr-capacity", lrdr_capacity,
> +                     sizeof(lrdr_capacity));
> +    if (ret < 0) {
> +        fprintf(stderr, "Couldn't add ibm,lrdr-capacity rtas property\n");

This should probably be error_report() these days.

Otherwise,

Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
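
For reference, the five cells written to ibm,lrdr-capacity in the quoted hunk can be sketched as below. This is standalone C with values kept in host byte order; the patch converts each cell with cpu_to_be32(). lrdr_capacity_cells() and LRDR_CELLS are hypothetical names. Reading cells 2 and 3 as the high and low halves of a 64-bit block size is an assumption on my part; the patch itself only shows a zero followed by SPAPR_MEMORY_BLOCK_SIZE.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LRDR_CELLS 5

/*
 * Fill the five cells in host byte order, in the order the patch uses.
 * A real device tree property would need each cell converted to big-endian.
 */
static void lrdr_capacity_cells(uint32_t cells[LRDR_CELLS], uint64_t max_ram,
                                uint32_t block_size, uint32_t max_cpus,
                                uint32_t threads_per_core)
{
    assert(cells != NULL && threads_per_core > 0);
    cells[0] = (uint32_t)(max_ram >> 32);         /* max RAM, high half */
    cells[1] = (uint32_t)(max_ram & 0xffffffffu); /* max RAM, low half */
    cells[2] = 0;                                 /* zero in the patch */
    cells[3] = block_size;                        /* memory block size */
    cells[4] = max_cpus / threads_per_core;       /* number of cores */
}
```

For a guest with a 9 GB maximum, 256 MB blocks, 32 maximum CPUs and 8 threads per core, the cells come out as 2, 0x40000000, 0, 0x10000000 and 4.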

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson


* Re: [Qemu-devel] [RFC PATCH v2 05/23] spapr: Reorganize CPU dt generation code
  2015-03-23 13:35 ` [Qemu-devel] [RFC PATCH v2 05/23] spapr: Reorganize CPU dt generation code Bharata B Rao
@ 2015-03-25  1:36   ` David Gibson
  2015-03-25  8:26     ` Bharata B Rao
  0 siblings, 1 reply; 74+ messages in thread
From: David Gibson @ 2015-03-25  1:36 UTC (permalink / raw)
  To: Bharata B Rao
  Cc: mdroth, agraf, qemu-devel, qemu-ppc, tyreld, nfont, imammedo, afaerber

On Mon, Mar 23, 2015 at 07:05:46PM +0530, Bharata B Rao wrote:
> Reorganize CPU device tree generation code so that it can be reused from
> the hotplug path. CPU dt entries are now generated from spapr_finalize_fdt()
> instead of spapr_create_fdt_skel().
> 
> Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
> ---
>  hw/ppc/spapr.c | 288 ++++++++++++++++++++++++++++++---------------------------
>  1 file changed, 154 insertions(+), 134 deletions(-)
> 
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index 36ff754..1a25cc0 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -206,24 +206,50 @@ static int spapr_fixup_cpu_smt_dt(void *fdt, int offset, PowerPCCPU *cpu,
>      return ret;
>  }
>  
> +static int spapr_fixup_cpu_numa_smt_dt(void *fdt, int offset, CPUState *cs,
> +                                        sPAPREnvironment *spapr)
> +{
> +    int ret;
> +    PowerPCCPU *cpu = POWERPC_CPU(cs);
> +    int index = ppc_get_vcpu_dt_id(cpu);
> +    uint32_t pft_size_prop[] = {0, cpu_to_be32(spapr->htab_shift)};
> +    uint32_t associativity[] = {cpu_to_be32(0x5),
> +                                cpu_to_be32(0x0),
> +                                cpu_to_be32(0x0),
> +                                cpu_to_be32(0x0),
> +                                cpu_to_be32(cs->numa_node),
> +                                cpu_to_be32(index)};
> +
> +    /* Advertise NUMA via ibm,associativity */
> +    if (nb_numa_nodes > 1) {
> +        ret = fdt_setprop(fdt, offset, "ibm,associativity", associativity,
> +                          sizeof(associativity));
> +        if (ret < 0) {
> +            return ret;
> +        }
> +    }
> +
> +    ret = fdt_setprop(fdt, offset, "ibm,pft-size",
> +                      pft_size_prop, sizeof(pft_size_prop));
> +    if (ret < 0) {
> +        return ret;
> +    }

The pft-size property isn't actually related to NUMA, so it doesn't
really belong in this function.

> +    return spapr_fixup_cpu_smt_dt(fdt, offset, cpu,
> +                                 ppc_get_compat_smt_threads(cpu));

Likewise calling the smt fixup function from the numa fixup function
just seems odd to me; just be explicit and call the two sequentially.

Overall this seems an odd way to split things.  Why not just make a
spapr_fixup_one_cpu_dt() function, or similar, which would do all the
necessary pieces?
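
A rough sketch of the shape being suggested, with the three steps called one after another from a single entry point. This is standalone C with a stand-in struct instead of a real device tree; struct cpu_node, the set_*() helpers and fixup_one_cpu() are all hypothetical names, and the real steps would be fdt_setprop() calls plus spapr_fixup_cpu_smt_dt().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a device tree CPU node; records which steps ran. */
struct cpu_node {
    int has_associativity;
    int has_pft_size;
    int has_smt_props;
};

static int set_associativity(struct cpu_node *n, int nb_numa_nodes)
{
    /* As in the patch, only advertise NUMA when there is more than one node. */
    if (nb_numa_nodes > 1) {
        n->has_associativity = 1;
    }
    return 0;
}

static int set_pft_size(struct cpu_node *n)
{
    n->has_pft_size = 1;
    return 0;
}

static int set_smt_props(struct cpu_node *n)
{
    n->has_smt_props = 1;
    return 0;
}

/* One entry point that runs each independent step in turn, stopping on error. */
static int fixup_one_cpu(struct cpu_node *n, int nb_numa_nodes)
{
    int ret;

    assert(n != NULL);
    ret = set_associativity(n, nb_numa_nodes);
    if (ret < 0) {
        return ret;
    }
    ret = set_pft_size(n);
    if (ret < 0) {
        return ret;
    }
    return set_smt_props(n);
}
```

The point of the flat structure is that no step is hidden inside another: the NUMA, pft-size and SMT pieces each stay in their own helper and the caller sees all three.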

> +}
> +
>  static int spapr_fixup_cpu_dt(void *fdt, sPAPREnvironment *spapr)
>  {
>      int ret = 0, offset, cpus_offset;
>      CPUState *cs;
>      char cpu_model[32];
>      int smt = kvmppc_smt_threads();
> -    uint32_t pft_size_prop[] = {0, cpu_to_be32(spapr->htab_shift)};
>  
>      CPU_FOREACH(cs) {
>          PowerPCCPU *cpu = POWERPC_CPU(cs);
>          DeviceClass *dc = DEVICE_GET_CLASS(cs);
>          int index = ppc_get_vcpu_dt_id(cpu);
> -        uint32_t associativity[] = {cpu_to_be32(0x5),
> -                                    cpu_to_be32(0x0),
> -                                    cpu_to_be32(0x0),
> -                                    cpu_to_be32(0x0),
> -                                    cpu_to_be32(cs->numa_node),
> -                                    cpu_to_be32(index)};
>  
>          if ((index % smt) != 0) {
>              continue;
> @@ -247,22 +273,7 @@ static int spapr_fixup_cpu_dt(void *fdt, sPAPREnvironment *spapr)
>              }
>          }
>  
> -        if (nb_numa_nodes > 1) {
> -            ret = fdt_setprop(fdt, offset, "ibm,associativity", associativity,
> -                              sizeof(associativity));
> -            if (ret < 0) {
> -                return ret;
> -            }
> -        }
> -
> -        ret = fdt_setprop(fdt, offset, "ibm,pft-size",
> -                          pft_size_prop, sizeof(pft_size_prop));
> -        if (ret < 0) {
> -            return ret;
> -        }
> -
> -        ret = spapr_fixup_cpu_smt_dt(fdt, offset, cpu,
> -                                     ppc_get_compat_smt_threads(cpu));
> +        ret = spapr_fixup_cpu_numa_smt_dt(fdt, offset, cs, spapr);
>          if (ret < 0) {
>              return ret;
>          }
> @@ -341,18 +352,13 @@ static void *spapr_create_fdt_skel(hwaddr initrd_base,
>                                     uint32_t epow_irq)
>  {
>      void *fdt;
> -    CPUState *cs;
>      uint32_t start_prop = cpu_to_be32(initrd_base);
>      uint32_t end_prop = cpu_to_be32(initrd_base + initrd_size);
>      GString *hypertas = g_string_sized_new(256);
>      GString *qemu_hypertas = g_string_sized_new(256);
>      uint32_t refpoints[] = {cpu_to_be32(0x4), cpu_to_be32(0x4)};
>      uint32_t interrupt_server_ranges_prop[] = {0, cpu_to_be32(max_cpus)};
> -    int smt = kvmppc_smt_threads();
>      unsigned char vec5[] = {0x0, 0x0, 0x0, 0x0, 0x0, 0x80};
> -    QemuOpts *opts = qemu_opts_find(qemu_find_opts("smp-opts"), NULL);
> -    unsigned sockets = opts ? qemu_opt_get_number(opts, "sockets", 0) : 0;
> -    uint32_t cpus_per_socket = sockets ? (smp_cpus / sockets) : 1;
>      char *buf;
>  
>      add_str(hypertas, "hcall-pft");
> @@ -441,107 +447,6 @@ static void *spapr_create_fdt_skel(hwaddr initrd_base,
>  
>      _FDT((fdt_end_node(fdt)));
>  
> -    /* cpus */
> -    _FDT((fdt_begin_node(fdt, "cpus")));
> -
> -    _FDT((fdt_property_cell(fdt, "#address-cells", 0x1)));
> -    _FDT((fdt_property_cell(fdt, "#size-cells", 0x0)));
> -
> -    CPU_FOREACH(cs) {
> -        PowerPCCPU *cpu = POWERPC_CPU(cs);
> -        CPUPPCState *env = &cpu->env;
> -        DeviceClass *dc = DEVICE_GET_CLASS(cs);
> -        PowerPCCPUClass *pcc = POWERPC_CPU_GET_CLASS(cs);
> -        int index = ppc_get_vcpu_dt_id(cpu);
> -        char *nodename;
> -        uint32_t segs[] = {cpu_to_be32(28), cpu_to_be32(40),
> -                           0xffffffff, 0xffffffff};
> -        uint32_t tbfreq = kvm_enabled() ? kvmppc_get_tbfreq() : TIMEBASE_FREQ;
> -        uint32_t cpufreq = kvm_enabled() ? kvmppc_get_clockfreq() : 1000000000;
> -        uint32_t page_sizes_prop[64];
> -        size_t page_sizes_prop_size;
> -
> -        if ((index % smt) != 0) {
> -            continue;
> -        }
> -
> -        nodename = g_strdup_printf("%s@%x", dc->fw_name, index);
> -
> -        _FDT((fdt_begin_node(fdt, nodename)));
> -
> -        g_free(nodename);
> -
> -        _FDT((fdt_property_cell(fdt, "reg", index)));
> -        _FDT((fdt_property_string(fdt, "device_type", "cpu")));
> -
> -        _FDT((fdt_property_cell(fdt, "cpu-version", env->spr[SPR_PVR])));
> -        _FDT((fdt_property_cell(fdt, "d-cache-block-size",
> -                                env->dcache_line_size)));
> -        _FDT((fdt_property_cell(fdt, "d-cache-line-size",
> -                                env->dcache_line_size)));
> -        _FDT((fdt_property_cell(fdt, "i-cache-block-size",
> -                                env->icache_line_size)));
> -        _FDT((fdt_property_cell(fdt, "i-cache-line-size",
> -                                env->icache_line_size)));
> -
> -        if (pcc->l1_dcache_size) {
> -            _FDT((fdt_property_cell(fdt, "d-cache-size", pcc->l1_dcache_size)));
> -        } else {
> -            fprintf(stderr, "Warning: Unknown L1 dcache size for cpu\n");
> -        }
> -        if (pcc->l1_icache_size) {
> -            _FDT((fdt_property_cell(fdt, "i-cache-size", pcc->l1_icache_size)));
> -        } else {
> -            fprintf(stderr, "Warning: Unknown L1 icache size for cpu\n");
> -        }
> -
> -        _FDT((fdt_property_cell(fdt, "timebase-frequency", tbfreq)));
> -        _FDT((fdt_property_cell(fdt, "clock-frequency", cpufreq)));
> -        _FDT((fdt_property_cell(fdt, "ibm,slb-size", env->slb_nr)));
> -        _FDT((fdt_property_string(fdt, "status", "okay")));
> -        _FDT((fdt_property(fdt, "64-bit", NULL, 0)));
> -
> -        if (env->spr_cb[SPR_PURR].oea_read) {
> -            _FDT((fdt_property(fdt, "ibm,purr", NULL, 0)));
> -        }
> -
> -        if (env->mmu_model & POWERPC_MMU_1TSEG) {
> -            _FDT((fdt_property(fdt, "ibm,processor-segment-sizes",
> -                               segs, sizeof(segs))));
> -        }
> -
> -        /* Advertise VMX/VSX (vector extensions) if available
> -         *   0 / no property == no vector extensions
> -         *   1               == VMX / Altivec available
> -         *   2               == VSX available */
> -        if (env->insns_flags & PPC_ALTIVEC) {
> -            uint32_t vmx = (env->insns_flags2 & PPC2_VSX) ? 2 : 1;
> -
> -            _FDT((fdt_property_cell(fdt, "ibm,vmx", vmx)));
> -        }
> -
> -        /* Advertise DFP (Decimal Floating Point) if available
> -         *   0 / no property == no DFP
> -         *   1               == DFP available */
> -        if (env->insns_flags2 & PPC2_DFP) {
> -            _FDT((fdt_property_cell(fdt, "ibm,dfp", 1)));
> -        }
> -
> -        page_sizes_prop_size = create_page_sizes_prop(env, page_sizes_prop,
> -                                                      sizeof(page_sizes_prop));
> -        if (page_sizes_prop_size) {
> -            _FDT((fdt_property(fdt, "ibm,segment-page-sizes",
> -                               page_sizes_prop, page_sizes_prop_size)));
> -        }
> -
> -        _FDT((fdt_property_cell(fdt, "ibm,chip-id",
> -                                cs->cpu_index / cpus_per_socket)));
> -
> -        _FDT((fdt_end_node(fdt)));
> -    }
> -
> -    _FDT((fdt_end_node(fdt)));
> -
>      /* RTAS */
>      _FDT((fdt_begin_node(fdt, "rtas")));
>  
> @@ -739,6 +644,124 @@ static int spapr_populate_memory(sPAPREnvironment *spapr, void *fdt)
>      return 0;
>  }
>  
> +static void spapr_populate_cpu_dt(CPUState *cs, void *fdt, int offset)
> +{
> +    PowerPCCPU *cpu = POWERPC_CPU(cs);
> +    CPUPPCState *env = &cpu->env;
> +    PowerPCCPUClass *pcc = POWERPC_CPU_GET_CLASS(cs);
> +    int index = ppc_get_vcpu_dt_id(cpu);
> +    uint32_t segs[] = {cpu_to_be32(28), cpu_to_be32(40),
> +                       0xffffffff, 0xffffffff};
> +    uint32_t tbfreq = kvm_enabled() ? kvmppc_get_tbfreq() : TIMEBASE_FREQ;
> +    uint32_t cpufreq = kvm_enabled() ? kvmppc_get_clockfreq() : 1000000000;
> +    uint32_t page_sizes_prop[64];
> +    size_t page_sizes_prop_size;
> +    QemuOpts *opts = qemu_opts_find(qemu_find_opts("smp-opts"), NULL);
> +    unsigned sockets = opts ? qemu_opt_get_number(opts, "sockets", 0) : 0;
> +    uint32_t cpus_per_socket = sockets ? (smp_cpus / sockets) : 1;
> +
> +    _FDT((fdt_setprop_cell(fdt, offset, "reg", index)));
> +    _FDT((fdt_setprop_string(fdt, offset, "device_type", "cpu")));
> +
> +    _FDT((fdt_setprop_cell(fdt, offset, "cpu-version", env->spr[SPR_PVR])));
> +    _FDT((fdt_setprop_cell(fdt, offset, "d-cache-block-size",
> +                            env->dcache_line_size)));
> +    _FDT((fdt_setprop_cell(fdt, offset, "d-cache-line-size",
> +                            env->dcache_line_size)));
> +    _FDT((fdt_setprop_cell(fdt, offset, "i-cache-block-size",
> +                            env->icache_line_size)));
> +    _FDT((fdt_setprop_cell(fdt, offset, "i-cache-line-size",
> +                            env->icache_line_size)));
> +
> +    if (pcc->l1_dcache_size) {
> +        _FDT((fdt_setprop_cell(fdt, offset, "d-cache-size",
> +                             pcc->l1_dcache_size)));
> +    } else {
> +        fprintf(stderr, "Warning: Unknown L1 dcache size for cpu\n");
> +    }
> +    if (pcc->l1_icache_size) {
> +        _FDT((fdt_setprop_cell(fdt, offset, "i-cache-size",
> +                             pcc->l1_icache_size)));
> +    } else {
> +        fprintf(stderr, "Warning: Unknown L1 icache size for cpu\n");
> +    }
> +
> +    _FDT((fdt_setprop_cell(fdt, offset, "timebase-frequency", tbfreq)));
> +    _FDT((fdt_setprop_cell(fdt, offset, "clock-frequency", cpufreq)));
> +    _FDT((fdt_setprop_cell(fdt, offset, "ibm,slb-size", env->slb_nr)));
> +    _FDT((fdt_setprop_string(fdt, offset, "status", "okay")));
> +    _FDT((fdt_setprop(fdt, offset, "64-bit", NULL, 0)));
> +
> +    if (env->spr_cb[SPR_PURR].oea_read) {
> +        _FDT((fdt_setprop(fdt, offset, "ibm,purr", NULL, 0)));
> +    }
> +
> +    if (env->mmu_model & POWERPC_MMU_1TSEG) {
> +        _FDT((fdt_setprop(fdt, offset, "ibm,processor-segment-sizes",
> +                           segs, sizeof(segs))));
> +    }
> +
> +    /* Advertise VMX/VSX (vector extensions) if available
> +     *   0 / no property == no vector extensions
> +     *   1               == VMX / Altivec available
> +     *   2               == VSX available */
> +    if (env->insns_flags & PPC_ALTIVEC) {
> +        uint32_t vmx = (env->insns_flags2 & PPC2_VSX) ? 2 : 1;
> +
> +        _FDT((fdt_setprop_cell(fdt, offset, "ibm,vmx", vmx)));
> +    }
> +
> +    /* Advertise DFP (Decimal Floating Point) if available
> +     *   0 / no property == no DFP
> +     *   1               == DFP available */
> +    if (env->insns_flags2 & PPC2_DFP) {
> +        _FDT((fdt_setprop_cell(fdt, offset, "ibm,dfp", 1)));
> +    }
> +
> +    page_sizes_prop_size = create_page_sizes_prop(env, page_sizes_prop,
> +                                                  sizeof(page_sizes_prop));
> +    if (page_sizes_prop_size) {
> +        _FDT((fdt_setprop(fdt, offset, "ibm,segment-page-sizes",
> +                           page_sizes_prop, page_sizes_prop_size)));
> +    }
> +
> +    _FDT((fdt_setprop_cell(fdt, offset, "ibm,chip-id",
> +                            cs->cpu_index / cpus_per_socket)));
> +
> +    _FDT(spapr_fixup_cpu_numa_smt_dt(fdt, offset, cs, spapr));
> +}
> +
> +static void spapr_populate_cpu_dt_node(void *fdt, sPAPREnvironment *spapr)


I'd suggest s/cpu/cpus/.  If anything "populate_cpu_dt_node" sounds
more like it covers a single cpu than "populate_cpu_dt".

> +{
> +    CPUState *cs;
> +    int cpus_offset;
> +    char *nodename;
> +    int smt = kvmppc_smt_threads();
> +
> +    cpus_offset = fdt_add_subnode(fdt, 0, "cpus");
> +    _FDT(cpus_offset);
> +    _FDT((fdt_setprop_cell(fdt, cpus_offset, "#address-cells", 0x1)));
> +    _FDT((fdt_setprop_cell(fdt, cpus_offset, "#size-cells", 0x0)));
> +
> +    CPU_FOREACH(cs) {
> +        PowerPCCPU *cpu = POWERPC_CPU(cs);
> +        int index = ppc_get_vcpu_dt_id(cpu);
> +        DeviceClass *dc = DEVICE_GET_CLASS(cs);
> +        int offset;
> +
> +        if ((index % smt) != 0) {
> +            continue;
> +        }
> +
> +        nodename = g_strdup_printf("%s@%x", dc->fw_name, index);
> +        offset = fdt_add_subnode(fdt, cpus_offset, nodename);
> +        g_free(nodename);
> +        _FDT(offset);
> +        spapr_populate_cpu_dt(cs, fdt, offset);
> +    }
> +
> +}
> +
>  static void spapr_finalize_fdt(sPAPREnvironment *spapr,
>                                 hwaddr fdt_addr,
>                                 hwaddr rtas_addr,
> @@ -782,11 +805,8 @@ static void spapr_finalize_fdt(sPAPREnvironment *spapr,
>          fprintf(stderr, "Couldn't set up RTAS device tree properties\n");
>      }
>  
> -    /* Advertise NUMA via ibm,associativity */
> -    ret = spapr_fixup_cpu_dt(fdt, spapr);
> -    if (ret < 0) {
> -        fprintf(stderr, "Couldn't finalize CPU device tree properties\n");
> -    }
> +    /* cpus */
> +    spapr_populate_cpu_dt_node(fdt, spapr);
>  
>      bootlist = get_boot_devices_list(&cb, true);
>      if (cb && bootlist) {
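
For readers following the refactoring, two small pieces of arithmetic in the quoted code decide what ends up under /cpus: only the first thread of each core gets a node, and ibm,chip-id is the CPU index divided by the CPUs per socket. A minimal standalone sketch of both is below. The toy_* names are hypothetical and exist only for illustration; they are not QEMU functions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true when a device-tree id names the first thread
 * of a core, which is the only thread that gets its own node under /cpus
 * (the "(index % smt) != 0 -> continue" test in the quoted loop). */
static bool toy_gets_cpu_node(int dt_id, int smt)
{
    return (dt_id % smt) == 0;
}

/* Hypothetical helper mirroring the quoted ibm,chip-id arithmetic.
 * With no explicit socket count the divisor falls back to 1, so each CPU
 * index becomes its own chip id.
 * Assumes sockets <= smp_cpus; otherwise the divisor would be 0. */
static uint32_t toy_chip_id(uint32_t cpu_index, uint32_t smp_cpus,
                            uint32_t sockets)
{
    uint32_t cpus_per_socket = sockets ? (smp_cpus / sockets) : 1;

    return cpu_index / cpus_per_socket;
}
```

With smt = 8, ids 0, 8 and 16 get nodes and ids 1 to 7 do not; with 16 CPUs in 2 sockets, CPU index 9 is on chip 1.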

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson


* Re: [Qemu-devel] [RFC PATCH v2 06/23] spapr: Consolidate cpu init code into a routine
  2015-03-23 13:35 ` [Qemu-devel] [RFC PATCH v2 06/23] spapr: Consolidate cpu init code into a routine Bharata B Rao
@ 2015-03-25  1:37   ` David Gibson
  0 siblings, 0 replies; 74+ messages in thread
From: David Gibson @ 2015-03-25  1:37 UTC (permalink / raw)
  To: Bharata B Rao
  Cc: mdroth, agraf, qemu-devel, qemu-ppc, tyreld, nfont, imammedo, afaerber

On Mon, Mar 23, 2015 at 07:05:47PM +0530, Bharata B Rao wrote:
> Factor out bits of sPAPR specific CPU initialization code into
> a separate routine so that it can be called from CPU hotplug
> path too.
> 
> Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>

Reviewed-by: David Gibson <david@gibson.dropbear.id.au>


* Re: [Qemu-devel] [RFC PATCH v2 07/23] cpu: Prepare Socket container type
  2015-03-23 13:35 ` [Qemu-devel] [RFC PATCH v2 07/23] cpu: Prepare Socket container type Bharata B Rao
@ 2015-03-25  2:03   ` David Gibson
  0 siblings, 0 replies; 74+ messages in thread
From: David Gibson @ 2015-03-25  2:03 UTC (permalink / raw)
  To: Bharata B Rao
  Cc: mdroth, agraf, qemu-devel, qemu-ppc, tyreld, nfont, imammedo, afaerber

On Mon, Mar 23, 2015 at 07:05:48PM +0530, Bharata B Rao wrote:
> From: Andreas Färber <afaerber@suse.de>

This really wants a commit message explaining the function of this new
abstraction.

> 
> Signed-off-by: Andreas Färber <afaerber@suse.de>
> Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
> ---
>  hw/cpu/Makefile.objs    |  2 +-
>  hw/cpu/socket.c         | 21 +++++++++++++++++++++
>  include/hw/cpu/socket.h | 14 ++++++++++++++
>  3 files changed, 36 insertions(+), 1 deletion(-)
>  create mode 100644 hw/cpu/socket.c
>  create mode 100644 include/hw/cpu/socket.h
> 
> diff --git a/hw/cpu/Makefile.objs b/hw/cpu/Makefile.objs
> index 6381238..e6890cf 100644
> --- a/hw/cpu/Makefile.objs
> +++ b/hw/cpu/Makefile.objs
> @@ -3,4 +3,4 @@ obj-$(CONFIG_REALVIEW) += realview_mpcore.o
>  obj-$(CONFIG_A9MPCORE) += a9mpcore.o
>  obj-$(CONFIG_A15MPCORE) += a15mpcore.o
>  obj-$(CONFIG_ICC_BUS) += icc_bus.o
> -
> +obj-y += socket.o
> diff --git a/hw/cpu/socket.c b/hw/cpu/socket.c
> new file mode 100644
> index 0000000..5ca47e9
> --- /dev/null
> +++ b/hw/cpu/socket.c
> @@ -0,0 +1,21 @@
> +/*
> + * CPU socket abstraction
> + *
> + * Copyright (c) 2013-2014 SUSE LINUX Products GmbH
> + * Copyright (c) 2015 SUSE Linux GmbH
> + */
> +
> +#include "hw/cpu/socket.h"
> +
> +static const TypeInfo cpu_socket_type_info = {
> +    .name = TYPE_CPU_SOCKET,
> +    .parent = TYPE_DEVICE,
> +    .abstract = true,
> +};
> +
> +static void cpu_socket_register_types(void)
> +{
> +    type_register_static(&cpu_socket_type_info);
> +}
> +
> +type_init(cpu_socket_register_types)
> diff --git a/include/hw/cpu/socket.h b/include/hw/cpu/socket.h
> new file mode 100644
> index 0000000..c8e0c18
> --- /dev/null
> +++ b/include/hw/cpu/socket.h
> @@ -0,0 +1,14 @@
> +/*
> + * CPU socket abstraction
> + *
> + * Copyright (c) 2013-2014 SUSE LINUX Products GmbH
> + * Copyright (c) 2015 SUSE Linux GmbH
> + */
> +#ifndef HW_CPU_SOCKET_H
> +#define HW_CPU_SOCKET_H
> +
> +#include "hw/qdev.h"
> +
> +#define TYPE_CPU_SOCKET "cpu-socket"
> +
> +#endif


* Re: [Qemu-devel] [RFC PATCH v2 08/23] ppc: Prepare CPU socket/core abstraction
  2015-03-23 13:35 ` [Qemu-devel] [RFC PATCH v2 08/23] ppc: Prepare CPU socket/core abstraction Bharata B Rao
@ 2015-03-25  2:06   ` David Gibson
  0 siblings, 0 replies; 74+ messages in thread
From: David Gibson @ 2015-03-25  2:06 UTC (permalink / raw)
  To: Bharata B Rao
  Cc: mdroth, agraf, qemu-devel, qemu-ppc, tyreld, nfont, imammedo, afaerber

On Mon, Mar 23, 2015 at 07:05:49PM +0530, Bharata B Rao wrote:

Again, this needs a commit message explaining why the new abstraction
is valuable.

> Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
> Signed-off-by: Andreas Färber <afaerber@suse.de>
> ---
>  hw/ppc/Makefile.objs        |  1 +
>  hw/ppc/cpu-core.c           | 46 ++++++++++++++++++++++++++++++++++++++++++++
>  hw/ppc/cpu-socket.c         | 47 +++++++++++++++++++++++++++++++++++++++++++++
>  include/hw/ppc/cpu-core.h   | 32 ++++++++++++++++++++++++++++++
>  include/hw/ppc/cpu-socket.h | 32 ++++++++++++++++++++++++++++++
>  5 files changed, 158 insertions(+)
>  create mode 100644 hw/ppc/cpu-core.c
>  create mode 100644 hw/ppc/cpu-socket.c
>  create mode 100644 include/hw/ppc/cpu-core.h
>  create mode 100644 include/hw/ppc/cpu-socket.h
> 
> diff --git a/hw/ppc/Makefile.objs b/hw/ppc/Makefile.objs
> index c8ab06e..a35cac5 100644
> --- a/hw/ppc/Makefile.objs
> +++ b/hw/ppc/Makefile.objs
> @@ -1,5 +1,6 @@
>  # shared objects
>  obj-y += ppc.o ppc_booke.o
> +obj-y += cpu-socket.o cpu-core.o
>  # IBM pSeries (sPAPR)
>  obj-$(CONFIG_PSERIES) += spapr.o spapr_vio.o spapr_events.o
>  obj-$(CONFIG_PSERIES) += spapr_hcall.o spapr_iommu.o spapr_rtas.o
> diff --git a/hw/ppc/cpu-core.c b/hw/ppc/cpu-core.c
> new file mode 100644
> index 0000000..ed0481f
> --- /dev/null
> +++ b/hw/ppc/cpu-core.c
> @@ -0,0 +1,46 @@
> +/*
> + * ppc CPU core abstraction
> + *
> + * Copyright (c) 2015 SUSE Linux GmbH
> + * Copyright (C) 2015 Bharata B Rao <bharata@linux.vnet.ibm.com>
> + */
> +
> +#include "hw/qdev.h"
> +#include "hw/ppc/cpu-core.h"
> +
> +static int ppc_cpu_core_realize_child(Object *child, void *opaque)
> +{
> +    Error **errp = opaque;
> +
> +    object_property_set_bool(child, true, "realized", errp);
> +    if (*errp) {
> +        return 1;
> +    }
> +
> +    return 0;
> +}
> +
> +static void ppc_cpu_core_realize(DeviceState *dev, Error **errp)
> +{
> +    object_child_foreach(OBJECT(dev), ppc_cpu_core_realize_child, errp);
> +}
> +
> +static void ppc_cpu_core_class_init(ObjectClass *oc, void *data)
> +{
> +    DeviceClass *dc = DEVICE_CLASS(oc);
> +
> +    dc->realize = ppc_cpu_core_realize;
> +}
> +
> +static const TypeInfo ppc_cpu_core_type_info = {
> +    .name = TYPE_POWERPC_CPU_CORE,
> +    .parent = TYPE_DEVICE,
> +    .class_init = ppc_cpu_core_class_init,
> +};
> +
> +static void ppc_cpu_core_register_types(void)
> +{
> +    type_register_static(&ppc_cpu_core_type_info);
> +}
> +
> +type_init(ppc_cpu_core_register_types)
> diff --git a/hw/ppc/cpu-socket.c b/hw/ppc/cpu-socket.c
> new file mode 100644
> index 0000000..602a060
> --- /dev/null
> +++ b/hw/ppc/cpu-socket.c
> @@ -0,0 +1,47 @@
> +/*
> + * PPC CPU socket abstraction
> + *
> + * Copyright (c) 2015 SUSE Linux GmbH
> + * Copyright (C) 2015 Bharata B Rao <bharata@linux.vnet.ibm.com>
> + */
> +
> +#include "hw/qdev.h"
> +#include "hw/ppc/cpu-socket.h"
> +#include "sysemu/cpus.h"
> +
> +static int ppc_cpu_socket_realize_child(Object *child, void *opaque)
> +{
> +    Error **errp = opaque;
> +
> +    object_property_set_bool(child, true, "realized", errp);
> +    if (*errp) {
> +        return 1;
> +    } else {
> +        return 0;
> +    }
> +}
> +
> +static void ppc_cpu_socket_realize(DeviceState *dev, Error **errp)
> +{
> +    object_child_foreach(OBJECT(dev), ppc_cpu_socket_realize_child, errp);
> +}
> +
> +static void ppc_cpu_socket_class_init(ObjectClass *oc, void *data)
> +{
> +    DeviceClass *dc = DEVICE_CLASS(oc);
> +
> +    dc->realize = ppc_cpu_socket_realize;
> +}
> +
> +static const TypeInfo ppc_cpu_socket_type_info = {
> +    .name = TYPE_POWERPC_CPU_SOCKET,
> +    .parent = TYPE_CPU_SOCKET,
> +    .class_init = ppc_cpu_socket_class_init,
> +};
> +
> +static void ppc_cpu_socket_register_types(void)
> +{
> +    type_register_static(&ppc_cpu_socket_type_info);
> +}
> +
> +type_init(ppc_cpu_socket_register_types)
> diff --git a/include/hw/ppc/cpu-core.h b/include/hw/ppc/cpu-core.h
> new file mode 100644
> index 0000000..95f1c28
> --- /dev/null
> +++ b/include/hw/ppc/cpu-core.h
> @@ -0,0 +1,32 @@
> +/*
> + * PowerPC CPU core abstraction
> + *
> + * Copyright (c) 2015 SUSE Linux GmbH
> + * Copyright (C) 2015 Bharata B Rao <bharata@linux.vnet.ibm.com>
> + */
> +#ifndef HW_PPC_CPU_CORE_H
> +#define HW_PPC_CPU_CORE_H
> +
> +#include "hw/qdev.h"
> +#include "cpu.h"
> +
> +#ifdef TARGET_PPC64
> +#define TYPE_POWERPC_CPU_CORE "powerpc64-cpu-core"
> +#elif defined(TARGET_PPCEMB)
> +#define TYPE_POWERPC_CPU_CORE "embedded-powerpc-cpu-core"
> +#else
> +#define TYPE_POWERPC_CPU_CORE "powerpc-cpu-core"
> +#endif
> +
> +#define POWERPC_CPU_CORE(obj) \
> +    OBJECT_CHECK(PowerPCCPUCore, (obj), TYPE_POWERPC_CPU_CORE)
> +
> +typedef struct PowerPCCPUCore {
> +    /*< private >*/
> +    DeviceState parent_obj;
> +    /*< public >*/
> +
> +    PowerPCCPU thread[0];
> +} PowerPCCPUCore;
> +
> +#endif
> diff --git a/include/hw/ppc/cpu-socket.h b/include/hw/ppc/cpu-socket.h
> new file mode 100644
> index 0000000..5ae19d0
> --- /dev/null
> +++ b/include/hw/ppc/cpu-socket.h
> @@ -0,0 +1,32 @@
> +/*
> + * PowerPC CPU socket abstraction
> + *
> + * Copyright (c) 2015 SUSE Linux GmbH
> + * Copyright (C) 2015 Bharata B Rao <bharata@linux.vnet.ibm.com>
> + */
> +#ifndef HW_PPC_CPU_SOCKET_H
> +#define HW_PPC_CPU_SOCKET_H
> +
> +#include "hw/cpu/socket.h"
> +#include "cpu-core.h"
> +
> +#ifdef TARGET_PPC64
> +#define TYPE_POWERPC_CPU_SOCKET "powerpc64-cpu-socket"
> +#elif defined(TARGET_PPCEMB)
> +#define TYPE_POWERPC_CPU_SOCKET "embedded-powerpc-cpu-socket"
> +#else
> +#define TYPE_POWERPC_CPU_SOCKET "powerpc-cpu-socket"
> +#endif
> +
> +#define POWERPC_CPU_SOCKET(obj) \
> +    OBJECT_CHECK(PowerPCCPUSocket, (obj), TYPE_POWERPC_CPU_SOCKET)
> +
> +typedef struct PowerPCCPUSocket {
> +    /*< private >*/
> +    DeviceState parent_obj;
> +    /*< public >*/
> +
> +    PowerPCCPUCore core[0];
> +} PowerPCCPUSocket;
> +
> +#endif


* Re: [Qemu-devel] [RFC PATCH v2 09/23] spapr: Add CPU hotplug handler
  2015-03-23 13:35 ` [Qemu-devel] [RFC PATCH v2 09/23] spapr: Add CPU hotplug handler Bharata B Rao
@ 2015-03-25  2:08   ` David Gibson
  0 siblings, 0 replies; 74+ messages in thread
From: David Gibson @ 2015-03-25  2:08 UTC (permalink / raw)
  To: Bharata B Rao
  Cc: mdroth, agraf, qemu-devel, qemu-ppc, tyreld, nfont, imammedo, afaerber

On Mon, Mar 23, 2015 at 07:05:50PM +0530, Bharata B Rao wrote:
> Add a CPU hotplug handler to the spapr machine class and let the plug handler
> perform the sPAPR-specific initialization for a realized CPU. This lets the
> CPU boot path and the hotplug path share as much code as possible.
> 
> Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>

Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
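
As a rough illustration of why routing initialization through the plug handler lets boot and hotplug share code, here is a minimal standalone sketch. It is not QEMU code: the ToyDev and ToyMachine types and all toy_* functions are hypothetical stand-ins for the device, machine, and handler lookup in the patch. The assumption is that every realized device asks the machine for a handler and, if one is returned, its plug hook runs.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical device: just a type name and a flag set by the machine. */
typedef struct ToyDev {
    const char *type;
    bool machine_init_done;
} ToyDev;

typedef struct ToyMachine ToyMachine;
typedef void (*ToyPlugFn)(ToyMachine *machine, ToyDev *dev);

/* Hypothetical machine that also acts as the hotplug handler. */
struct ToyMachine {
    ToyPlugFn plug;
    int cpus_initialized;
};

static bool toy_is_cpu(const ToyDev *dev)
{
    return strcmp(dev->type, "cpu") == 0;
}

/* Plug hook: machine-specific CPU setup lives here and nowhere else. */
static void toy_machine_plug(ToyMachine *machine, ToyDev *dev)
{
    if (toy_is_cpu(dev)) {
        dev->machine_init_done = true;
        machine->cpus_initialized++;
    }
}

/* The machine volunteers as handler only for CPU devices. */
static ToyMachine *toy_get_hotplug_handler(ToyMachine *machine, ToyDev *dev)
{
    return toy_is_cpu(dev) ? machine : NULL;
}

/* Common tail of realize, used for boot-time and hot-added devices alike. */
static void toy_device_realize(ToyMachine *machine, ToyDev *dev)
{
    ToyMachine *handler = toy_get_hotplug_handler(machine, dev);

    if (handler != NULL && handler->plug != NULL) {
        handler->plug(handler, dev);
    }
}
```

In this sketch a CPU created during machine init and a CPU added later both pass through toy_device_realize(), so neither path needs its own copy of the setup code, while non-CPU devices are left untouched.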


> ---
>  hw/ppc/spapr.c | 25 ++++++++++++++++++++++++-
>  1 file changed, 24 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index 200dd75..6650f82 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -1555,7 +1555,6 @@ static void ppc_spapr_init(MachineState *machine)
>              fprintf(stderr, "Unable to find PowerPC CPU definition\n");
>              exit(1);
>          }
> -        spapr_cpu_init(cpu);
>      }
>  
>      /* allocate RAM */
> @@ -1841,12 +1840,33 @@ static void spapr_nmi(NMIState *n, int cpu_index, Error **errp)
>      }
>  }
>  
> +static void spapr_machine_device_plug(HotplugHandler *hotplug_dev,
> +                                      DeviceState *dev, Error **errp)
> +{
> +    if (object_dynamic_cast(OBJECT(dev), TYPE_CPU)) {
> +        CPUState *cs = CPU(dev);
> +        PowerPCCPU *cpu = POWERPC_CPU(cs);
> +
> +        spapr_cpu_init(cpu);
> +    }
> +}
> +
> +static HotplugHandler *spapr_get_hotpug_handler(MachineState *machine,
> +                                             DeviceState *dev)
> +{
> +    if (object_dynamic_cast(OBJECT(dev), TYPE_CPU)) {
> +        return HOTPLUG_HANDLER(machine);
> +    }
> +    return NULL;
> +}
> +
>  static void spapr_machine_class_init(ObjectClass *oc, void *data)
>  {
>      MachineClass *mc = MACHINE_CLASS(oc);
>      sPAPRMachineClass *smc = SPAPR_MACHINE_CLASS(oc);
>      FWPathProviderClass *fwc = FW_PATH_PROVIDER_CLASS(oc);
>      NMIClass *nc = NMI_CLASS(oc);
> +    HotplugHandlerClass *hc = HOTPLUG_HANDLER_CLASS(oc);
>  
>      mc->init = ppc_spapr_init;
>      mc->reset = ppc_spapr_reset;
> @@ -1856,6 +1876,8 @@ static void spapr_machine_class_init(ObjectClass *oc, void *data)
>      mc->default_boot_order = NULL;
>      mc->kvm_type = spapr_kvm_type;
>      mc->has_dynamic_sysbus = true;
> +    mc->get_hotplug_handler = spapr_get_hotpug_handler;
> +    hc->plug = spapr_machine_device_plug;
>      smc->dr_phb_enabled = false;
>      smc->dr_cpu_enabled = false;
>      smc->dr_lmb_enabled = false;
> @@ -1875,6 +1897,7 @@ static const TypeInfo spapr_machine_info = {
>      .interfaces = (InterfaceInfo[]) {
>          { TYPE_FW_PATH_PROVIDER },
>          { TYPE_NMI },
> +        { TYPE_HOTPLUG_HANDLER },
>          { }
>      },
>  };


* Re: [Qemu-devel] [RFC PATCH v2 10/23] ppc: Update cpu_model in MachineState
  2015-03-23 13:35 ` [Qemu-devel] [RFC PATCH v2 10/23] ppc: Update cpu_model in MachineState Bharata B Rao
@ 2015-03-25  2:30   ` David Gibson
  0 siblings, 0 replies; 74+ messages in thread
From: David Gibson @ 2015-03-25  2:30 UTC (permalink / raw)
  To: Bharata B Rao
  Cc: mdroth, agraf, qemu-devel, qemu-ppc, tyreld, nfont, imammedo, afaerber

On Mon, Mar 23, 2015 at 07:05:51PM +0530, Bharata B Rao wrote:
> Keep the cpu_model field in MachineState up to date so that it can be used
> from the CPU hotplug path.
> 
> Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>

Reviewed-by: David Gibson <david@gibson.dropbear.id.au>


* Re: [Qemu-devel] [RFC PATCH v2 11/23] ppc: Create sockets and cores for CPUs
  2015-03-23 13:35 ` [Qemu-devel] [RFC PATCH v2 11/23] ppc: Create sockets and cores for CPUs Bharata B Rao
@ 2015-03-25  2:39   ` David Gibson
  2015-03-25  8:33     ` Bharata B Rao
  0 siblings, 1 reply; 74+ messages in thread
From: David Gibson @ 2015-03-25  2:39 UTC (permalink / raw)
  To: Bharata B Rao
  Cc: mdroth, agraf, qemu-devel, qemu-ppc, tyreld, nfont, imammedo, afaerber

On Mon, Mar 23, 2015 at 07:05:52PM +0530, Bharata B Rao wrote:
> ppc machine init functions create individual CPU threads. Change this
> for sPAPR by switching to socket creation. CPUs are created recursively
> by socket and core instance init routines.
> 
> TODO: Switching to socket level CPU creation is done only for sPAPR
> target now.
> 
> Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
> ---
>  hw/ppc/cpu-core.c           | 17 +++++++++++++++++
>  hw/ppc/cpu-socket.c         | 15 +++++++++++++++
>  hw/ppc/spapr.c              | 15 ++++++++-------
>  target-ppc/cpu.h            |  1 +
>  target-ppc/translate_init.c | 46 +++++++++++++++++++++++++++++++++++++++++++++
>  5 files changed, 87 insertions(+), 7 deletions(-)
> 
> diff --git a/hw/ppc/cpu-core.c b/hw/ppc/cpu-core.c
> index ed0481f..f60646d 100644
> --- a/hw/ppc/cpu-core.c
> +++ b/hw/ppc/cpu-core.c
> @@ -7,6 +7,8 @@
>  
>  #include "hw/qdev.h"
>  #include "hw/ppc/cpu-core.h"
> +#include "hw/boards.h"
> +#include <sysemu/cpus.h>
>  
>  static int ppc_cpu_core_realize_child(Object *child, void *opaque)
>  {
> @@ -32,10 +34,25 @@ static void ppc_cpu_core_class_init(ObjectClass *oc, void *data)
>      dc->realize = ppc_cpu_core_realize;
>  }
>  
> +static void ppc_cpu_core_instance_init(Object *obj)
> +{
> +    int i;
> +    PowerPCCPU *cpu = NULL;
> +    MachineState *machine = MACHINE(qdev_get_machine());
> +
> +    for (i = 0; i < smp_threads; i++) {
> +        cpu = POWERPC_CPU(cpu_ppc_create(TYPE_POWERPC_CPU, machine->cpu_model));
> +        object_property_add_child(obj, "thread[*]", OBJECT(cpu), &error_abort);
> +        object_unref(OBJECT(cpu));
> +    }
> +}
> +
>  static const TypeInfo ppc_cpu_core_type_info = {
>      .name = TYPE_POWERPC_CPU_CORE,
>      .parent = TYPE_DEVICE,
>      .class_init = ppc_cpu_core_class_init,
> +    .instance_init = ppc_cpu_core_instance_init,
> +    .instance_size = sizeof(PowerPCCPUCore),

The PowerPCCPUCore structure isn't defined in this patch (I assume it
already existed), which suggests that setting the instance_size should
have already been in an earlier patch.
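
To illustrate why the size belongs with the patch that introduces the structure, here is a toy type registry, sketched under the assumption that a type with no instance_size of its own inherits the parent's size. All Toy* names are hypothetical and this is not the QOM implementation.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical type descriptor; instance_size == 0 means "use the parent's". */
typedef struct ToyTypeInfo {
    const char *name;
    const struct ToyTypeInfo *parent;
    size_t instance_size;
} ToyTypeInfo;

typedef struct ToyDevice {
    int realized;
} ToyDevice;

/* Derived object: parent first, then its own state. */
typedef struct ToyCore {
    ToyDevice parent_obj;
    int nr_threads;
} ToyCore;

/* Walk up the hierarchy until some type declares a size. */
static size_t toy_type_instance_size(const ToyTypeInfo *ti)
{
    while (ti != NULL) {
        if (ti->instance_size != 0) {
            return ti->instance_size;
        }
        ti = ti->parent;
    }
    return 0;
}

/* Allocate a zeroed instance of exactly the registered size. */
static void *toy_object_new(const ToyTypeInfo *ti)
{
    size_t size = toy_type_instance_size(ti);

    return size ? calloc(1, size) : NULL;
}

static const ToyTypeInfo toy_device_info = {
    .name = "toy-device",
    .parent = NULL,
    .instance_size = sizeof(ToyDevice),
};

/* Size forgotten: instances are only as large as ToyDevice. */
static const ToyTypeInfo toy_core_unsized_info = {
    .name = "toy-core-unsized",
    .parent = &toy_device_info,
    .instance_size = 0,
};

/* Size registered together with the structure definition. */
static const ToyTypeInfo toy_core_info = {
    .name = "toy-core",
    .parent = &toy_device_info,
    .instance_size = sizeof(ToyCore),
};
```

In this sketch an object created from toy_core_unsized_info is too small to hold nr_threads, so any cast to ToyCore would touch memory past the allocation. Registering the size in the same patch as the structure avoids a window in the series where that mismatch exists.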

>  };
>  
>  static void ppc_cpu_core_register_types(void)
> diff --git a/hw/ppc/cpu-socket.c b/hw/ppc/cpu-socket.c
> index 602a060..f901336 100644
> --- a/hw/ppc/cpu-socket.c
> +++ b/hw/ppc/cpu-socket.c
> @@ -8,6 +8,7 @@
>  #include "hw/qdev.h"
>  #include "hw/ppc/cpu-socket.h"
>  #include "sysemu/cpus.h"
> +#include "cpu.h"
>  
>  static int ppc_cpu_socket_realize_child(Object *child, void *opaque)
>  {
> @@ -33,10 +34,24 @@ static void ppc_cpu_socket_class_init(ObjectClass *oc, void *data)
>      dc->realize = ppc_cpu_socket_realize;
>  }
>  
> +static void ppc_cpu_socket_instance_init(Object *obj)
> +{
> +    int i;
> +    Object *core;
> +
> +    for (i = 0; i < smp_cores; i++) {
> +        core = object_new(TYPE_POWERPC_CPU_CORE);
> +        object_property_add_child(obj, "core[*]", core, &error_abort);
> +        object_unref(core);
> +    }
> +}
> +
>  static const TypeInfo ppc_cpu_socket_type_info = {
>      .name = TYPE_POWERPC_CPU_SOCKET,
>      .parent = TYPE_CPU_SOCKET,
>      .class_init = ppc_cpu_socket_class_init,
> +    .instance_init = ppc_cpu_socket_instance_init,
> +    .instance_size = sizeof(PowerPCCPUSocket),

Likewise for PowerPCCPUSocket.
>  
> +/*
> + * This is essentially same as cpu_generic_init() but without a set
> + * realize call.
> + */

In which case it would probably make more sense to have this be a
generic function, and implement cpu_generic_init() in terms of it.
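
A rough sketch of the layering suggested here, using standalone toy types rather than QEMU's: a "create" step that parses the model string and builds an unrealized object, and an "init" step that is simply create followed by realize. The ToyCPU type and every toy_* function are hypothetical names used only for illustration.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

#define TOY_MODEL_LEN 32
#define TOY_FEATURES_LEN 64

/* Hypothetical stand-in for a CPU object. */
typedef struct ToyCPU {
    char model[TOY_MODEL_LEN];
    char features[TOY_FEATURES_LEN];
    bool realized;
} ToyCPU;

/* Create step: parse "model[,features]" and build the object, but leave
 * it unrealized so that a container (core or socket) can realize it. */
static ToyCPU *toy_cpu_generic_create(const char *cpu_model)
{
    const char *comma;
    size_t name_len;
    ToyCPU *cpu;

    if (cpu_model == NULL) {
        return NULL;
    }

    comma = strchr(cpu_model, ',');
    name_len = comma ? (size_t)(comma - cpu_model) : strlen(cpu_model);
    if (name_len == 0 || name_len >= TOY_MODEL_LEN) {
        return NULL;
    }

    cpu = calloc(1, sizeof(*cpu));
    if (cpu == NULL) {
        return NULL;
    }

    /* calloc() zeroed the buffers, so both strings stay NUL-terminated. */
    memcpy(cpu->model, cpu_model, name_len);
    if (comma != NULL) {
        strncpy(cpu->features, comma + 1, TOY_FEATURES_LEN - 1);
    }
    return cpu;
}

static void toy_cpu_realize(ToyCPU *cpu)
{
    cpu->realized = true;
}

/* Init step, expressed in terms of create: no duplicated parsing code. */
static ToyCPU *toy_cpu_generic_init(const char *cpu_model)
{
    ToyCPU *cpu = toy_cpu_generic_create(cpu_model);

    if (cpu != NULL) {
        toy_cpu_realize(cpu);
    }
    return cpu;
}
```

With this split, callers that want the old behaviour keep using the init function, while socket and core containers call the create function and defer realization to their own realize hook.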

> +CPUState *cpu_ppc_create(const char *typename, const char *cpu_model)
> +{
> +    char *str, *name, *featurestr;
> +    CPUState *cpu;
> +    ObjectClass *oc;
> +    CPUClass *cc;
> +    Error *err = NULL;
> +
> +    str = g_strdup(cpu_model);
> +    name = strtok(str, ",");
> +
> +    oc = cpu_class_by_name(typename, name);
> +    if (oc == NULL) {
> +        g_free(str);
> +        return NULL;
> +    }
> +
> +    cpu = CPU(object_new(object_class_get_name(oc)));
> +    cc = CPU_GET_CLASS(cpu);
> +
> +    featurestr = strtok(NULL, ",");
> +    cc->parse_features(cpu, featurestr, &err);
> +    g_free(str);
> +    if (err != NULL) {
> +        goto out;
> +    }
> +
> +out:
> +    if (err != NULL) {
> +        error_report("%s", error_get_pretty(err));
> +        error_free(err);
> +        object_unref(OBJECT(cpu));
> +        return NULL;
> +    }
> +
> +    return cpu;
> +}
> +
> +/*
> + * TODO: This can be removed when all powerpc targets are converted to
> + * socket level CPU realization.
> + */
>  PowerPCCPU *cpu_ppc_init(const char *cpu_model)
>  {
>      return POWERPC_CPU(cpu_generic_init(TYPE_POWERPC_CPU, cpu_model));


* Re: [Qemu-devel] [RFC PATCH v2 12/23] spapr: CPU hotplug support
  2015-03-23 13:35 ` [Qemu-devel] [RFC PATCH v2 12/23] spapr: CPU hotplug support Bharata B Rao
@ 2015-03-25  3:03   ` David Gibson
  2015-03-25  8:36     ` Bharata B Rao
  0 siblings, 1 reply; 74+ messages in thread
From: David Gibson @ 2015-03-25  3:03 UTC (permalink / raw)
  To: Bharata B Rao
  Cc: mdroth, agraf, qemu-devel, qemu-ppc, tyreld, nfont, imammedo, afaerber

[-- Attachment #1: Type: text/plain, Size: 7173 bytes --]

On Mon, Mar 23, 2015 at 07:05:53PM +0530, Bharata B Rao wrote:
> Support CPU hotplug via device-add command. Set up device tree
> entries for the hotplugged CPU core and use the existing EPOW event
> infrastructure to send CPU hotplug notification to the guest.
> 
> Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
> ---
>  hw/ppc/spapr.c        | 75 +++++++++++++++++++++++++++++++++++++++++++++++++++
>  hw/ppc/spapr_events.c |  8 +++---
>  hw/ppc/spapr_rtas.c   | 11 ++++++++
>  3 files changed, 91 insertions(+), 3 deletions(-)
> 
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index f52d38f..b48994b 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -33,6 +33,7 @@
>  #include "sysemu/block-backend.h"
>  #include "sysemu/cpus.h"
>  #include "sysemu/kvm.h"
> +#include "sysemu/device_tree.h"
>  #include "kvm_ppc.h"
>  #include "mmu-hash64.h"
>  #include "qom/cpu.h"
> @@ -660,6 +661,10 @@ static void spapr_populate_cpu_dt(CPUState *cs, void *fdt, int offset)
>      QemuOpts *opts = qemu_opts_find(qemu_find_opts("smp-opts"), NULL);
>      unsigned sockets = opts ? qemu_opt_get_number(opts, "sockets", 0) : 0;
>      uint32_t cpus_per_socket = sockets ? (smp_cpus / sockets) : 1;
> +    sPAPRDRConnector *drc =
> +        spapr_dr_connector_by_id(SPAPR_DR_CONNECTOR_TYPE_CPU, index);
> +    sPAPRDRConnectorClass *drck = SPAPR_DR_CONNECTOR_GET_CLASS(drc);
> +    int drc_index = drck->get_index(drc);
>  
>      _FDT((fdt_setprop_cell(fdt, offset, "reg", index)));
>      _FDT((fdt_setprop_string(fdt, offset, "device_type", "cpu")));
> @@ -728,6 +733,8 @@ static void spapr_populate_cpu_dt(CPUState *cs, void *fdt, int offset)
>  
>      _FDT((fdt_setprop_cell(fdt, offset, "ibm,chip-id",
>                              cs->cpu_index / cpus_per_socket)));
> +    _FDT((fdt_setprop_cell(fdt, offset, "ibm,my-drc-index", drc_index)));
> +

What effect will this have when running with DR disabled?

>      _FDT(spapr_fixup_cpu_numa_smt_dt(fdt, offset, cs, spapr));
>  }
> @@ -1840,6 +1847,70 @@ static void spapr_nmi(NMIState *n, int cpu_index, Error **errp)
>      }
>  }
>  
> +static void spapr_cpu_hotplug_add(DeviceState *dev, CPUState *cs, Error **errp)
> +{
> +    PowerPCCPU *cpu = POWERPC_CPU(cs);
> +    DeviceClass *dc = DEVICE_GET_CLASS(cs);
> +    int id = ppc_get_vcpu_dt_id(cpu);
> +    sPAPRDRConnector *drc =
> +        spapr_dr_connector_by_id(SPAPR_DR_CONNECTOR_TYPE_CPU, id);
> +    sPAPRDRConnectorClass *drck = SPAPR_DR_CONNECTOR_GET_CLASS(drc);
> +    void *fdt;
> +    int offset, i, fdt_size;
> +    char *nodename;
> +
> +    fdt = create_device_tree(&fdt_size);
> +    nodename = g_strdup_printf("%s@%x", dc->fw_name, id);
> +    offset = fdt_add_subnode(fdt, 0, nodename);
> +
> +    /* Set NUMA node for the added CPU core */
> +    for (i = 0; i < nb_numa_nodes; i++) {
> +        if (test_bit(cs->cpu_index, numa_info[i].node_cpu)) {
> +            cs->numa_node = i;
> +            break;
> +        }
> +    }
> +
> +    spapr_populate_cpu_dt(cs, fdt, offset);
> +    g_free(nodename);
> +
> +    drck->attach(drc, dev, fdt, offset, !dev->hotplugged, errp);
> +    if (*errp) {
> +        g_free(fdt);
> +    }
> +}
> +
> +static void spapr_cpu_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
> +                            Error **errp)
> +{
> +    CPUState *cs = CPU(dev);
> +    PowerPCCPU *cpu = POWERPC_CPU(cs);
> +    int id = ppc_get_vcpu_dt_id(cpu);
> +    sPAPRDRConnector *drc =
> +        spapr_dr_connector_by_id(SPAPR_DR_CONNECTOR_TYPE_CPU, id);
> +    int smt = kvmppc_smt_threads();
> +    Error *local_err = NULL;
> +
> +    /*
> +     * SMT threads return from here, only main thread (core) will
> +     * continue and signal hotplug event to the guest.
> +     */
> +    if ((id % smt) != 0) {
> +        return;
> +    }
> +
> +    g_assert(drc);
> +
> +    spapr_cpu_hotplug_add(dev, cs, &local_err);
> +    if (local_err) {
> +        error_propagate(errp, local_err);
> +        return;
> +    }
> +    spapr_hotplug_req_add_event(drc);
> +
> +    return;
> +}
> +
>  static void spapr_machine_device_plug(HotplugHandler *hotplug_dev,
>                                        DeviceState *dev, Error **errp)
>  {
> @@ -1848,6 +1919,10 @@ static void spapr_machine_device_plug(HotplugHandler *hotplug_dev,
>          PowerPCCPU *cpu = POWERPC_CPU(cs);
>  
>          spapr_cpu_init(cpu);
> +        spapr_cpu_reset(cpu);
> +        if (dev->hotplugged && spapr->dr_cpu_enabled) {
> +            spapr_cpu_plug(hotplug_dev, dev, errp);
> +        }
>      }
>  }
>  
> diff --git a/hw/ppc/spapr_events.c b/hw/ppc/spapr_events.c
> index be82815..4ae818a 100644
> --- a/hw/ppc/spapr_events.c
> +++ b/hw/ppc/spapr_events.c
> @@ -421,14 +421,16 @@ static void spapr_hotplug_req_event(sPAPRDRConnector *drc, uint8_t hp_action)
>      hp->hdr.section_length = cpu_to_be16(sizeof(*hp));
>      hp->hdr.section_version = 1; /* includes extended modifier */
>      hp->hotplug_action = hp_action;
> -
> +    hp->drc.index = cpu_to_be32(drck->get_index(drc));
> +    hp->hotplug_identifier = RTAS_LOG_V6_HP_ID_DRC_INDEX;
>  
>      switch (drc_type) {
>      case SPAPR_DR_CONNECTOR_TYPE_PCI:
> -        hp->drc.index = cpu_to_be32(drck->get_index(drc));
> -        hp->hotplug_identifier = RTAS_LOG_V6_HP_ID_DRC_INDEX;
>          hp->hotplug_type = RTAS_LOG_V6_HP_TYPE_PCI;
>          break;
> +    case SPAPR_DR_CONNECTOR_TYPE_CPU:
> +        hp->hotplug_type = RTAS_LOG_V6_HP_TYPE_CPU;
> +        break;
>      default:
>          /* we shouldn't be signaling hotplug events for resources
>           * that don't support them
> diff --git a/hw/ppc/spapr_rtas.c b/hw/ppc/spapr_rtas.c
> index 57ec97a..48aeb86 100644
> --- a/hw/ppc/spapr_rtas.c
> +++ b/hw/ppc/spapr_rtas.c
> @@ -121,6 +121,16 @@ static void rtas_query_cpu_stopped_state(PowerPCCPU *cpu_,
>      rtas_st(rets, 0, RTAS_OUT_PARAM_ERROR);
>  }
>  
> +static void spapr_cpu_set_endianness(PowerPCCPU *cpu)
> +{
> +    PowerPCCPU *fcpu = POWERPC_CPU(first_cpu);
> +    PowerPCCPUClass *pcc = POWERPC_CPU_GET_CLASS(fcpu);

In some ways it might be nicer to store the global endian mode in
sPAPRMachineState, updating it on H_SET_MODE, rather than copying the
first cpu, although I guess that works.
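
Roughly what I have in mind, as a minimal sketch only: ToyMachine,
ToyCPU, the toy_* functions and the TOY_LPCR_ILE bit value are all
hypothetical stand-ins, not the real sPAPRMachineState, PowerPCCPU or
LPCR_ILE definitions. The machine records the mode once, and any CPU
started later picks it up from there:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_LPCR_ILE (1ull << 25)    /* illustrative bit, assumed value */

/* Hypothetical machine-wide state holding the interrupt endian mode. */
typedef struct ToyMachine {
    bool interrupts_little_endian;
} ToyMachine;

/* Hypothetical per-CPU state; only the LPCR register is modelled. */
typedef struct ToyCPU {
    uint64_t lpcr;
} ToyCPU;

/* Would be called from the H_SET_MODE handler. */
static void toy_set_interrupt_endianness(ToyMachine *m, bool little_endian)
{
    assert(m != NULL);
    m->interrupts_little_endian = little_endian;
}

/* Would be called when a CPU is started, e.g. on the start-cpu path. */
static void toy_cpu_apply_endianness(const ToyMachine *m, ToyCPU *cpu)
{
    assert(m != NULL && cpu != NULL);
    if (m->interrupts_little_endian) {
        cpu->lpcr |= TOY_LPCR_ILE;
    } else {
        cpu->lpcr &= ~TOY_LPCR_ILE;
    }
}
```

That way a hotplugged CPU does not depend on first_cpu still being
around or being representative.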

> +    if (!(*pcc->interrupts_big_endian)(fcpu)) {
> +        cpu->env.spr[SPR_LPCR] |= LPCR_ILE;
> +    }
> +}
> +
>  static void rtas_start_cpu(PowerPCCPU *cpu_, sPAPREnvironment *spapr,
>                             uint32_t token, uint32_t nargs,
>                             target_ulong args,
> @@ -157,6 +167,7 @@ static void rtas_start_cpu(PowerPCCPU *cpu_, sPAPREnvironment *spapr,
>          env->nip = start;
>          env->gpr[3] = r3;
>          cs->halted = 0;
> +        spapr_cpu_set_endianness(cpu);
>  
>          qemu_cpu_kick(cs);
>  


* Re: [Qemu-devel] [RFC PATCH v2 13/23] cpus: Add Error argument to cpu_exec_init()
  2015-03-23 13:35 ` [Qemu-devel] [RFC PATCH v2 13/23] cpus: Add Error argument to cpu_exec_init() Bharata B Rao
@ 2015-03-25  3:12   ` David Gibson
  0 siblings, 0 replies; 74+ messages in thread
From: David Gibson @ 2015-03-25  3:12 UTC (permalink / raw)
  To: Bharata B Rao
  Cc: mdroth, agraf, qemu-devel, qemu-ppc, tyreld, nfont, imammedo, afaerber

[-- Attachment #1: Type: text/plain, Size: 1010 bytes --]

On Mon, Mar 23, 2015 at 07:05:54PM +0530, Bharata B Rao wrote:
> Add an Error argument to cpu_exec_init() to let users collect the
> error. Change all callers to pass a NULL error argument for now. This
> change is needed for the following reasons:
> 
> - A subsequent commit changes the CPU enumeration logic in cpu_exec_init(),
>   causing cpu_exec_init() to fail if the cpu_index values corresponding
>   to max_cpus have already been handed out.
> - There is a view that cpu_exec_init() should be called from realize
>   rather than instance_init. With this change, those architectures
>   that can move this call into the realize function can do so in a phased
>   manner.
> 
> Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>

Reviewed-by: David Gibson <david@gibson.dropbear.id.au>


* Re: [Qemu-devel] [RFC PATCH v2 14/23] cpus: Convert cpu_index into a bitmap
  2015-03-23 13:35 ` [Qemu-devel] [RFC PATCH v2 14/23] cpus: Convert cpu_index into a bitmap Bharata B Rao
@ 2015-03-25  3:23   ` David Gibson
  2015-03-25  8:52     ` Bharata B Rao
  0 siblings, 1 reply; 74+ messages in thread
From: David Gibson @ 2015-03-25  3:23 UTC (permalink / raw)
  To: Bharata B Rao
  Cc: mdroth, agraf, qemu-devel, qemu-ppc, tyreld, nfont, imammedo, afaerber

[-- Attachment #1: Type: text/plain, Size: 3627 bytes --]

On Mon, Mar 23, 2015 at 07:05:55PM +0530, Bharata B Rao wrote:
> Currently CPUState.cpu_index is monotonically increasing and a newly
> created CPU always gets the next higher index. The next available
> index is calculated by counting the existing number of CPUs. This is
> fine as long as we only add CPUs, but there are architectures which
> are starting to support CPU removal too. For an architecture like PowerPC
> which derives its CPU identifier (device tree ID) from cpu_index, the
> existing logic of generating cpu_index values causes problems.
> 
> With the currently proposed method of handling vCPU removal by parking
> the vCPU fd in QEMU
> (Ref: http://lists.gnu.org/archive/html/qemu-devel/2015-02/msg02604.html),
> generating cpu_index this way will not work for PowerPC.
> 
> This patch changes the way cpu_index is handed out by maintaining
> a bit map of the CPUs that tracks both addition and removal of CPUs.
> 
> Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
> ---
>  exec.c            | 37 ++++++++++++++++++++++++++++++++++---
>  include/qom/cpu.h |  8 ++++++++
>  2 files changed, 42 insertions(+), 3 deletions(-)
> 
> diff --git a/exec.c b/exec.c
> index e1ff6b0..9bbab02 100644
> --- a/exec.c
> +++ b/exec.c
> @@ -527,21 +527,52 @@ void tcg_cpu_address_space_init(CPUState *cpu, AddressSpace *as)
>  }
>  #endif
>  
> +#ifndef CONFIG_USER_ONLY
> +static DECLARE_BITMAP(cpu_index_map, MAX_CPUMASK_BITS);
> +
> +static int cpu_get_free_index(Error **errp)
> +{
> +    int cpu = find_first_zero_bit(cpu_index_map, max_cpus);
> +
> +    if (cpu == max_cpus) {
> +        error_setg(errp, "Trying to use more CPUs than allowed max of %d\n",
> +                    max_cpus);
> +        return max_cpus;
> +    } else {
> +        bitmap_set(cpu_index_map, cpu, 1);
> +        return cpu;
> +    }
> +}
> +
> +void cpu_exec_exit(CPUState *cpu)
> +{
> +    bitmap_clear(cpu_index_map, cpu->cpu_index, 1);
> +}

AFAICT, this function is never called, which seems like a bug.
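
To spell out why that matters, here is a minimal standalone sketch (not
the QEMU code: TOY_MAX_CPUS and the toy_* names are hypothetical, and a
plain uint32_t stands in for the bitmap). If the release side is never
invoked, the map only ever fills up and a removed CPU's index can never
be handed out again:

```c
#include <assert.h>
#include <stdint.h>

#define TOY_MAX_CPUS 4               /* assumed small limit for illustration */

static uint32_t toy_index_map;       /* bit n set => index n is in use */

/* Return the lowest free index and mark it used, or -1 if none is left. */
static int toy_get_free_index(void)
{
    for (int i = 0; i < TOY_MAX_CPUS; i++) {
        if (!(toy_index_map & (1u << i))) {
            toy_index_map |= 1u << i;
            return i;
        }
    }
    return -1;
}

/* Counterpart of the release path: give the index back to the map. */
static void toy_put_index(int index)
{
    assert(index >= 0 && index < TOY_MAX_CPUS);
    toy_index_map &= ~(1u << index);
}
```

Allocation fails once all TOY_MAX_CPUS indices are taken, and only
succeeds again after toy_put_index() has been called for one of them.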

> +#endif
> +
>  void cpu_exec_init(CPUArchState *env, Error **errp)
>  {
>      CPUState *cpu = ENV_GET_CPU(env);
>      CPUClass *cc = CPU_GET_CLASS(cpu);
> -    CPUState *some_cpu;
>      int cpu_index;
> -
>  #if defined(CONFIG_USER_ONLY)
> +    CPUState *some_cpu;
> +
>      cpu_list_lock();
> -#endif
>      cpu_index = 0;
>      CPU_FOREACH(some_cpu) {
>          cpu_index++;
>      }
>      cpu->cpu_index = cpu_index;
> +#else
> +    Error *local_err = NULL;
> +
> +    cpu_index = cpu->cpu_index = cpu_get_free_index(&local_err);
> +    if (local_err) {
> +        error_propagate(errp, local_err);
> +        return;
> +    }
> +#endif
>      cpu->numa_node = 0;
>      QTAILQ_INIT(&cpu->breakpoints);
>      QTAILQ_INIT(&cpu->watchpoints);
> diff --git a/include/qom/cpu.h b/include/qom/cpu.h
> index 48fd6fb..5241cf4 100644
> --- a/include/qom/cpu.h
> +++ b/include/qom/cpu.h
> @@ -659,6 +659,14 @@ void cpu_watchpoint_remove_all(CPUState *cpu, int mask);
>  void QEMU_NORETURN cpu_abort(CPUState *cpu, const char *fmt, ...)
>      GCC_FMT_ATTR(2, 3);
>  
> +#ifndef CONFIG_USER_ONLY
> +void cpu_exec_exit(CPUState *cpu);
> +#else
> +static inline void cpu_exec_exit(CPUState *cpu)
> +{
> +}
> +#endif
> +
>  #ifdef CONFIG_SOFTMMU
>  extern const struct VMStateDescription vmstate_cpu_common;
>  #else


* Re: [Qemu-devel] [RFC PATCH v2 15/23] ppc: Move cpu_exec_init() call to realize function
  2015-03-23 13:35 ` [Qemu-devel] [RFC PATCH v2 15/23] ppc: Move cpu_exec_init() call to realize function Bharata B Rao
@ 2015-03-25  3:25   ` David Gibson
  2015-03-25  8:56     ` Bharata B Rao
  0 siblings, 1 reply; 74+ messages in thread
From: David Gibson @ 2015-03-25  3:25 UTC (permalink / raw)
  To: Bharata B Rao
  Cc: mdroth, agraf, qemu-devel, qemu-ppc, tyreld, nfont, imammedo, afaerber

[-- Attachment #1: Type: text/plain, Size: 1888 bytes --]

On Mon, Mar 23, 2015 at 07:05:56PM +0530, Bharata B Rao wrote:
> Move cpu_exec_init() call from instance_init to realize. This allows
> any failures from cpu_exec_init() to be handled appropriately.
> 
> Also add cpu_exec_exit() call from unrealize.

This still leaves all non-ppc archs never calling cpu_exec_exit().

> 
> Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
> ---
>  target-ppc/translate_init.c | 9 +++++++--
>  1 file changed, 7 insertions(+), 2 deletions(-)
> 
> diff --git a/target-ppc/translate_init.c b/target-ppc/translate_init.c
> index 9f4f172..fccee82 100644
> --- a/target-ppc/translate_init.c
> +++ b/target-ppc/translate_init.c
> @@ -8928,6 +8928,11 @@ static void ppc_cpu_realizefn(DeviceState *dev, Error **errp)
>          return;
>      }
>  
> +    cpu_exec_init(&cpu->env, &local_err);
> +    if (local_err != NULL) {
> +        error_propagate(errp, local_err);
> +        return;
> +    }
>      cpu->cpu_dt_id = (cs->cpu_index / smp_threads) * max_smt
>          + (cs->cpu_index % smp_threads);
>  #endif
> @@ -9141,6 +9146,8 @@ static void ppc_cpu_unrealizefn(DeviceState *dev, Error **errp)
>      opc_handler_t **table;
>      int i, j;
>  
> +    cpu_exec_exit(CPU(dev));
> +
>      for (i = 0; i < PPC_CPU_OPCODES_LEN; i++) {
>          if (env->opcodes[i] == &invalid_handler) {
>              continue;
> @@ -9679,8 +9686,6 @@ static void ppc_cpu_initfn(Object *obj)
>      CPUPPCState *env = &cpu->env;
>  
>      cs->env_ptr = env;
> -    cpu_exec_init(env, NULL);
> -    cpu->cpu_dt_id = cs->cpu_index;
>  
>      env->msr_mask = pcc->msr_mask;
>      env->mmu_model = pcc->mmu_model;


* Re: [Qemu-devel] [RFC PATCH v2 02/23] spapr: Add DRC dt entries for CPUs
  2015-03-25  0:07   ` David Gibson
@ 2015-03-25  5:02     ` Bharata B Rao
  0 siblings, 0 replies; 74+ messages in thread
From: Bharata B Rao @ 2015-03-25  5:02 UTC (permalink / raw)
  To: David Gibson
  Cc: mdroth, agraf, qemu-devel, qemu-ppc, tyreld, nfont, imammedo, afaerber

On Wed, Mar 25, 2015 at 11:07:10AM +1100, David Gibson wrote:
> On Mon, Mar 23, 2015 at 07:05:43PM +0530, Bharata B Rao wrote:
> > Advertise CPU DR-capability to the guest via device tree.
> > 
> > Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
> > Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
> >                [spapr_drc_reset implementation]
> > Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
> > ---
> >  hw/ppc/spapr.c | 29 +++++++++++++++++++++++++++++
> >  1 file changed, 29 insertions(+)
> > 
> > diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> > index a782e28..920e650 100644
> > --- a/hw/ppc/spapr.c
> > +++ b/hw/ppc/spapr.c
> > @@ -807,6 +807,15 @@ static void spapr_finalize_fdt(sPAPREnvironment *spapr,
> >          spapr_populate_chosen_stdout(fdt, spapr->vio_bus);
> >      }
> >  
> > +    if (spapr->dr_cpu_enabled) {
> > +        int offset = fdt_path_offset(fdt, "/cpus");
> > +        ret = spapr_drc_populate_dt(fdt, offset, NULL,
> > +                                    SPAPR_DR_CONNECTOR_TYPE_CPU);
> > +        if (ret < 0) {
> > +            fprintf(stderr, "Couldn't set up CPU DR device tree properties\n");
> > +        }
> > +    }
> > +
> >      _FDT((fdt_pack(fdt)));
> >  
> >      if (fdt_totalsize(fdt) > FDT_MAX_SIZE) {
> > @@ -1393,6 +1402,16 @@ static SaveVMHandlers savevm_htab_handlers = {
> >      .load_state = htab_load,
> >  };
> >  
> > +static void spapr_drc_reset(void *opaque)
> > +{
> > +    sPAPRDRConnector *drc = opaque;
> > +    DeviceState *d = DEVICE(drc);
> > +
> > +    if (d) {
> > +        device_reset(d);
> > +    }
> > +}
> > +
> >  /* pSeries LPAR / sPAPR hardware init */
> >  static void ppc_spapr_init(MachineState *machine)
> >  {
> > @@ -1418,6 +1437,7 @@ static void ppc_spapr_init(MachineState *machine)
> >      long load_limit, fw_size;
> >      bool kernel_le = false;
> >      char *filename;
> > +    int smt = kvmppc_smt_threads();
> >  
> >      msi_supported = true;
> >  
> > @@ -1564,6 +1584,15 @@ static void ppc_spapr_init(MachineState *machine)
> >      spapr->dr_cpu_enabled = smc->dr_cpu_enabled;
> >      spapr->dr_lmb_enabled = smc->dr_lmb_enabled;
> >  
> > +    if (spapr->dr_cpu_enabled) {
> > +        for (i = 0; i < max_cpus/smp_threads; i++) {
> > +            sPAPRDRConnector *drc =
> > +                spapr_dr_connector_new(OBJECT(machine),
> > +                                       SPAPR_DR_CONNECTOR_TYPE_CPU, i * smt);
> > +            qemu_register_reset(spapr_drc_reset, drc);
> 
> This seems to be per-core, rather than per-socket as your patch
> comments suggest.

Though we initialize socket-wise at boot time and add one CPU socket
at a time during hot add, the DR connectors are still per-core. The
ibm,my-drc-index property is still per-core.

Also, the hotplug event that is sent to the kernel is per-core, and the
kernel will bring up one full core (including all its threads) in
response to hot-add.

Socket addition is just a higher level notion but we still do hotplug
at core-level underneath.
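
The per-core arithmetic can be sketched as follows. This is a
standalone illustration, not the patch code: the toy_* helpers are
hypothetical, "smt" stands for the value returned by
kvmppc_smt_threads(), and the dt id formula mirrors the one used in
ppc_cpu_realizefn() in this series.

```c
#include <assert.h>
#include <stdbool.h>

/* Device tree id of a thread, derived from its cpu_index. */
static int toy_cpu_dt_id(int cpu_index, int smp_threads, int smt)
{
    assert(smp_threads > 0 && smp_threads <= smt);
    return (cpu_index / smp_threads) * smt + (cpu_index % smp_threads);
}

/* Only the first thread of a core has a DR connector and sends the event. */
static bool toy_is_core_leader(int dt_id, int smt)
{
    assert(smt > 0);
    return (dt_id % smt) == 0;
}

/* Number of CPU DR connectors created at machine init: one per core. */
static int toy_num_cpu_drcs(int max_cpus, int smp_threads)
{
    assert(smp_threads > 0);
    return max_cpus / smp_threads;
}
```

For example, with max_cpus = 8, smp_threads = 2 and smt = 8 there are
four connectors, with ids 0, 8, 16 and 24, and the second thread of
each core (dt ids 1, 9, 17, 25) returns early without signalling.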

Regards,
Bharata.

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [Qemu-devel] [RFC PATCH v2 16/23] cpus: Reclaim vCPU objects
  2015-03-23 13:35 ` [Qemu-devel] [RFC PATCH v2 16/23] cpus: Reclaim vCPU objects Bharata B Rao
@ 2015-03-25  5:22   ` David Gibson
  0 siblings, 0 replies; 74+ messages in thread
From: David Gibson @ 2015-03-25  5:22 UTC (permalink / raw)
  To: Bharata B Rao
  Cc: Zhu Guihua, mdroth, agraf, qemu-devel, Chen Fan, qemu-ppc,
	tyreld, nfont, Gu Zheng, imammedo, afaerber

[-- Attachment #1: Type: text/plain, Size: 972 bytes --]

On Mon, Mar 23, 2015 at 07:05:57PM +0530, Bharata B Rao wrote:
> From: Gu Zheng <guz.fnst@cn.fujitsu.com>
> 
> KVM vCPUs cannot be removed without any protection, so we do not close
> the KVM vCPU fd. Instead, we record it in a list and mark it as stopped, so
> that we can reuse it for a subsequent CPU hot-add request if possible. This
> is also the approach that the KVM developers suggested:
> https://www.mail-archive.com/kvm@vger.kernel.org/msg102839.html
> 
> Signed-off-by: Chen Fan <chen.fan.fnst@cn.fujitsu.com>
> Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
> Signed-off-by: Zhu Guihua <zhugh.fnst@cn.fujitsu.com>
> Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>

Reviewed-by: David Gibson <david@gibson.dropbear.id.au>


* Re: [Qemu-devel] [RFC PATCH v2 17/23] xics_kvm: Don't enable KVM_CAP_IRQ_XICS if already enabled
  2015-03-23 13:35 ` [Qemu-devel] [RFC PATCH v2 17/23] xics_kvm: Don't enable KVM_CAP_IRQ_XICS if already enabled Bharata B Rao
@ 2015-03-25  5:24   ` David Gibson
  2015-03-25  9:12     ` Bharata B Rao
  0 siblings, 1 reply; 74+ messages in thread
From: David Gibson @ 2015-03-25  5:24 UTC (permalink / raw)
  To: Bharata B Rao
  Cc: mdroth, agraf, qemu-devel, qemu-ppc, tyreld, nfont, imammedo, afaerber

[-- Attachment #1: Type: text/plain, Size: 2194 bytes --]

On Mon, Mar 23, 2015 at 07:05:58PM +0530, Bharata B Rao wrote:
> When supporting CPU hot removal by parking the vCPU fd and reusing
> it during hotplug again, there can be cases where we try to reenable
> KVM_CAP_IRQ_XICS CAP for the vCPU for which it was already enabled.
> Introduce a boolean member in ICPState to track this and don't
> reenable the CAP if it was already enabled earlier.
> 
> This change allows CPU hot removal to work for sPAPR.
> 
> Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>

Why does double enabling the capability cause problems?  I would have
expected it to be unnecessary, but harmless.

> ---
>  hw/intc/xics_kvm.c    | 10 ++++++++++
>  include/hw/ppc/xics.h |  1 +
>  2 files changed, 11 insertions(+)
> 
> diff --git a/hw/intc/xics_kvm.c b/hw/intc/xics_kvm.c
> index c15453f..5b27bf8 100644
> --- a/hw/intc/xics_kvm.c
> +++ b/hw/intc/xics_kvm.c
> @@ -331,6 +331,15 @@ static void xics_kvm_cpu_setup(XICSState *icp, PowerPCCPU *cpu)
>          abort();
>      }
>  
> +    /*
> +     * If we are reusing a parked vCPU fd corresponding to the CPU
> +     * which was hot-removed earlier we don't have to re-enable
> +     * KVM_CAP_IRQ_XICS capability again.
> +     */
> +    if (ss->cap_irq_xics_enabled) {
> +        return;
> +    }
> +
>      if (icpkvm->kernel_xics_fd != -1) {
>          int ret;
>  
> @@ -343,6 +352,7 @@ static void xics_kvm_cpu_setup(XICSState *icp, PowerPCCPU *cpu)
>                      kvm_arch_vcpu_id(cs), strerror(errno));
>              exit(1);
>          }
> +        ss->cap_irq_xics_enabled = true;
>      }
>  }
>  
> diff --git a/include/hw/ppc/xics.h b/include/hw/ppc/xics.h
> index a214dd7..355a966 100644
> --- a/include/hw/ppc/xics.h
> +++ b/include/hw/ppc/xics.h
> @@ -109,6 +109,7 @@ struct ICPState {
>      uint8_t pending_priority;
>      uint8_t mfrr;
>      qemu_irq output;
> +    bool cap_irq_xics_enabled;
>  };
>  
>  #define TYPE_ICS "ics"


* Re: [Qemu-devel] [RFC PATCH v2 18/23] xics_kvm: Add cpu_destroy method to XICS
  2015-03-23 13:35 ` [Qemu-devel] [RFC PATCH v2 18/23] xics_kvm: Add cpu_destroy method to XICS Bharata B Rao
@ 2015-03-25  5:26   ` David Gibson
  0 siblings, 0 replies; 74+ messages in thread
From: David Gibson @ 2015-03-25  5:26 UTC (permalink / raw)
  To: Bharata B Rao
  Cc: mdroth, agraf, qemu-devel, qemu-ppc, tyreld, nfont, imammedo, afaerber

[-- Attachment #1: Type: text/plain, Size: 596 bytes --]

On Mon, Mar 23, 2015 at 07:05:59PM +0530, Bharata B Rao wrote:
> XICS is set up for each CPU during initialization. Provide a routine
> to undo this when a CPU is unplugged.
> 
> This allows reboot of a VM that has undergone CPU hotplug and unplug
> to work correctly.
> 
> Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>

Reviewed-by: David Gibson <david@gibson.dropbear.id.au>


* Re: [Qemu-devel] [RFC PATCH v2 19/23] spapr: CPU hot unplug support
  2015-03-23 13:36 ` [Qemu-devel] [RFC PATCH v2 19/23] spapr: CPU hot unplug support Bharata B Rao
@ 2015-03-25  5:44   ` David Gibson
  2015-03-25 16:34     ` Bharata B Rao
  2015-04-07  6:45   ` [Qemu-devel] [Qemu-ppc] " Alexey Kardashevskiy
  1 sibling, 1 reply; 74+ messages in thread
From: David Gibson @ 2015-03-25  5:44 UTC (permalink / raw)
  To: Bharata B Rao
  Cc: mdroth, agraf, qemu-devel, qemu-ppc, tyreld, nfont, imammedo, afaerber

[-- Attachment #1: Type: text/plain, Size: 7602 bytes --]

On Mon, Mar 23, 2015 at 07:06:00PM +0530, Bharata B Rao wrote:
> Support hot removal of CPU for sPAPR guests by sending the hot
> unplug notification to the guest via EPOW interrupt.
> 
> Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
> ---
>  hw/ppc/spapr.c            | 78 ++++++++++++++++++++++++++++++++++++++++++++++-
>  linux-headers/linux/kvm.h |  1 +
>  target-ppc/kvm.c          |  7 +++++
>  target-ppc/kvm_ppc.h      |  6 ++++
>  4 files changed, 91 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index b48994b..7b8784d 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -1468,6 +1468,12 @@ static void spapr_cpu_init(PowerPCCPU *cpu)
>      qemu_register_reset(spapr_cpu_reset, cpu);
>  }
>  
> +static void spapr_cpu_destroy(PowerPCCPU *cpu)
> +{
> +    xics_cpu_destroy(spapr->icp, cpu);
> +    qemu_unregister_reset(spapr_cpu_reset, cpu);
> +}
> +
>  /* pSeries LPAR / sPAPR hardware init */
>  static void ppc_spapr_init(MachineState *machine)
>  {
> @@ -1880,6 +1886,18 @@ static void spapr_cpu_hotplug_add(DeviceState *dev, CPUState *cs, Error **errp)
>      }
>  }
>  
> +static void spapr_cpu_hotplug_remove(DeviceState *dev, CPUState *cs,
> +                                     Error **errp)
> +{
> +    PowerPCCPU *cpu = POWERPC_CPU(cs);
> +    int id = ppc_get_vcpu_dt_id(cpu);
> +    sPAPRDRConnector *drc =
> +        spapr_dr_connector_by_id(SPAPR_DR_CONNECTOR_TYPE_CPU, id);
> +    sPAPRDRConnectorClass *drck = SPAPR_DR_CONNECTOR_GET_CLASS(drc);
> +
> +    drck->detach(drc, dev, NULL, NULL, errp);
> +}
> +
>  static void spapr_cpu_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
>                              Error **errp)
>  {
> @@ -1911,6 +1929,51 @@ static void spapr_cpu_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
>      return;
>  }
>  
> +static int spapr_cpu_unplug(Object *obj, void *opaque)
> +{
> +    Error **errp = opaque;
> +    DeviceState *dev = DEVICE(obj);
> +    CPUState *cs = CPU(dev);
> +    PowerPCCPU *cpu = POWERPC_CPU(cs);
> +    int id = ppc_get_vcpu_dt_id(cpu);
> +    int smt = kvmppc_smt_threads();
> +    sPAPRDRConnector *drc =
> +        spapr_dr_connector_by_id(SPAPR_DR_CONNECTOR_TYPE_CPU, id);
> +
> +    spapr_cpu_destroy(cpu);

I may be misunderstanding something here, but don't you need to signal
the removal to the guest (and get its ack) before you "physically"
remove the cpu?
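
The ordering I would have expected looks roughly like this. It is a
minimal sketch under the assumption that the guest has to release the
CPU first; the ToyCPUState enum and the toy_* functions are
hypothetical and do not correspond to existing QEMU interfaces:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical lifecycle of a hot-unpluggable CPU. */
typedef enum {
    TOY_CPU_PLUGGED,
    TOY_CPU_UNPLUG_REQUESTED,   /* event sent, guest still owns the CPU */
    TOY_CPU_REMOVED             /* guest released it, safe to tear down */
} ToyCPUState;

/* Step 1: signal the guest. Nothing is torn down yet. */
static bool toy_request_unplug(ToyCPUState *s)
{
    assert(s != NULL);
    if (*s != TOY_CPU_PLUGGED) {
        return false;
    }
    *s = TOY_CPU_UNPLUG_REQUESTED;
    return true;
}

/* Step 2: run only once the guest has acknowledged the removal. */
static bool toy_guest_released(ToyCPUState *s)
{
    assert(s != NULL);
    if (*s != TOY_CPU_UNPLUG_REQUESTED) {
        return false;           /* refuse to destroy a CPU still in use */
    }
    *s = TOY_CPU_REMOVED;       /* the destroy step would go here */
    return true;
}
```

In this model the teardown cannot happen before the request has been
made, which is the opposite of what the hunk above appears to do.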

> +    /*
> +     * SMT threads return from here, only main thread (core) will
> +     * continue and signal hot unplug event to the guest.
> +     */
> +    if ((id % smt) != 0) {
> +        return 0;
> +    }
> +    g_assert(drc);
> +
> +    spapr_cpu_hotplug_remove(dev, cs, errp);
> +    if (*errp) {
> +        return -1;
> +    }
> +    spapr_hotplug_req_remove_event(drc);
> +
> +    return 0;
> +}
> +
> +static int spapr_cpu_core_unplug(Object *obj, void *opaque)
> +{
> +    Error **errp = opaque;
> +
> +    object_child_foreach(obj, spapr_cpu_unplug, errp);
> +    return 0;
> +}
> +
> +static void spapr_cpu_socket_unplug(HotplugHandler *hotplug_dev,
> +                            DeviceState *dev, Error **errp)
> +{
> +    object_child_foreach(OBJECT(dev), spapr_cpu_core_unplug, errp);
> +}
> +
>  static void spapr_machine_device_plug(HotplugHandler *hotplug_dev,
>                                        DeviceState *dev, Error **errp)
>  {
> @@ -1926,10 +1989,21 @@ static void spapr_machine_device_plug(HotplugHandler *hotplug_dev,
>      }
>  }
>  
> +static void spapr_machine_device_unplug(HotplugHandler *hotplug_dev,
> +                                      DeviceState *dev, Error **errp)
> +{
> +    if (object_dynamic_cast(OBJECT(dev), TYPE_CPU_SOCKET)) {
> +        if (dev->hotplugged && spapr->dr_cpu_enabled) {
> +            spapr_cpu_socket_unplug(hotplug_dev, dev, errp);
> +        }
> +    }
> +}
> +
>  static HotplugHandler *spapr_get_hotpug_handler(MachineState *machine,
>                                               DeviceState *dev)
>  {
> -    if (object_dynamic_cast(OBJECT(dev), TYPE_CPU)) {
> +    if (object_dynamic_cast(OBJECT(dev), TYPE_CPU) ||
> +        object_dynamic_cast(OBJECT(dev), TYPE_CPU_SOCKET)) {
>          return HOTPLUG_HANDLER(machine);
>      }
>      return NULL;
> @@ -1953,6 +2027,8 @@ static void spapr_machine_class_init(ObjectClass *oc, void *data)
>      mc->has_dynamic_sysbus = true;
>      mc->get_hotplug_handler = spapr_get_hotpug_handler;
>      hc->plug = spapr_machine_device_plug;
> +    hc->unplug = spapr_machine_device_unplug;
> +
>      smc->dr_phb_enabled = false;
>      smc->dr_cpu_enabled = false;
>      smc->dr_lmb_enabled = false;
> diff --git a/linux-headers/linux/kvm.h b/linux-headers/linux/kvm.h
> index 12045a1..0c1be07 100644
> --- a/linux-headers/linux/kvm.h
> +++ b/linux-headers/linux/kvm.h
> @@ -761,6 +761,7 @@ struct kvm_ppc_smmu_info {
>  #define KVM_CAP_PPC_FIXUP_HCALL 103
>  #define KVM_CAP_PPC_ENABLE_HCALL 104
>  #define KVM_CAP_CHECK_EXTENSION_VM 105
> +#define KVM_CAP_SPAPR_REUSE_VCPU 107

Updates to linux-headers/ are usually put into their own patch for
safety (along with an indication of what kernel commit the additions
came from).

>  #ifdef KVM_CAP_IRQ_ROUTING
>  
> diff --git a/target-ppc/kvm.c b/target-ppc/kvm.c
> index 1edf2b5..ee23bf6 100644
> --- a/target-ppc/kvm.c
> +++ b/target-ppc/kvm.c
> @@ -72,6 +72,7 @@ static int cap_ppc_watchdog;
>  static int cap_papr;
>  static int cap_htab_fd;
>  static int cap_fixup_hcalls;
> +static int cap_spapr_reuse_vcpu;

AFAICT nothing in this patch checks this variable.  Does it belong in
this patch?

>  static uint32_t debug_inst_opcode;
>  
> @@ -114,6 +115,7 @@ int kvm_arch_init(KVMState *s)
>       * only activated after this by kvmppc_set_papr() */
>      cap_htab_fd = kvm_check_extension(s, KVM_CAP_PPC_HTAB_FD);
>      cap_fixup_hcalls = kvm_check_extension(s, KVM_CAP_PPC_FIXUP_HCALL);
> +    cap_spapr_reuse_vcpu = kvm_check_extension(s, KVM_CAP_SPAPR_REUSE_VCPU);
>  
>      if (!cap_interrupt_level) {
>          fprintf(stderr, "KVM: Couldn't find level irq capability. Expect the "
> @@ -2408,3 +2410,8 @@ int kvm_arch_fixup_msi_route(struct kvm_irq_routing_entry *route,
>  {
>      return 0;
>  }
> +
> +bool kvmppc_spapr_reuse_vcpu(void)
> +{
> +    return cap_spapr_reuse_vcpu;
> +}
> diff --git a/target-ppc/kvm_ppc.h b/target-ppc/kvm_ppc.h
> index 2e0224c..c831229 100644
> --- a/target-ppc/kvm_ppc.h
> +++ b/target-ppc/kvm_ppc.h
> @@ -40,6 +40,7 @@ void *kvmppc_create_spapr_tce(uint32_t liobn, uint32_t window_size, int *pfd,
>  int kvmppc_remove_spapr_tce(void *table, int pfd, uint32_t window_size);
>  int kvmppc_reset_htab(int shift_hint);
>  uint64_t kvmppc_rma_size(uint64_t current_size, unsigned int hash_shift);
> +bool kvmppc_spapr_reuse_vcpu(void);
>  #endif /* !CONFIG_USER_ONLY */
>  bool kvmppc_has_cap_epr(void);
>  int kvmppc_define_rtas_kernel_token(uint32_t token, const char *function);
> @@ -185,6 +186,11 @@ static inline int kvmppc_update_sdr1(CPUPPCState *env)
>      return 0;
>  }
>  
> +static inline bool kvmppc_spapr_reuse_vcpu(void)
> +{
> +    return false;
> +}
> +
>  #endif /* !CONFIG_USER_ONLY */
>  
>  static inline bool kvmppc_has_cap_epr(void)

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [Qemu-devel] [RFC PATCH v2 20/23] spapr: Remove vCPU objects after CPU hot unplug
  2015-03-23 13:36 ` [Qemu-devel] [RFC PATCH v2 20/23] spapr: Remove vCPU objects after CPU hot unplug Bharata B Rao
@ 2015-03-25  5:46   ` David Gibson
  0 siblings, 0 replies; 74+ messages in thread
From: David Gibson @ 2015-03-25  5:46 UTC (permalink / raw)
  To: Bharata B Rao
  Cc: mdroth, agraf, qemu-devel, qemu-ppc, tyreld, nfont, imammedo, afaerber

[-- Attachment #1: Type: text/plain, Size: 1905 bytes --]

On Mon, Mar 23, 2015 at 07:06:01PM +0530, Bharata B Rao wrote:
> Release the vCPU objects after CPU hot unplug so that vCPU fd
> can be parked and reused.
> 
> Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>

I think this patch is simple enough that it should just be folded in
with the previous one.

> ---
>  hw/ppc/spapr.c | 19 ++++++++++++++++++-
>  1 file changed, 18 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index 7b8784d..3e56d9e 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -1886,6 +1886,23 @@ static void spapr_cpu_hotplug_add(DeviceState *dev, CPUState *cs, Error **errp)
>      }
>  }
>  
> +static void spapr_cpu_release(DeviceState *dev, void *opaque)
> +{
> +    CPUState *cs;
> +    int i;
> +    int id = ppc_get_vcpu_dt_id(POWERPC_CPU(CPU(dev)));
> +
> +    for (i = id; i < id + smp_threads; i++) {
> +        CPU_FOREACH(cs) {
> +            PowerPCCPU *cpu = POWERPC_CPU(cs);
> +
> +            if (i == ppc_get_vcpu_dt_id(cpu)) {
> +                cpu_remove(cs);
> +            }
> +        }
> +    }
> +}
> +
>  static void spapr_cpu_hotplug_remove(DeviceState *dev, CPUState *cs,
>                                       Error **errp)
>  {
> @@ -1895,7 +1912,7 @@ static void spapr_cpu_hotplug_remove(DeviceState *dev, CPUState *cs,
>          spapr_dr_connector_by_id(SPAPR_DR_CONNECTOR_TYPE_CPU, id);
>      sPAPRDRConnectorClass *drck = SPAPR_DR_CONNECTOR_GET_CLASS(drc);
>  
> -    drck->detach(drc, dev, NULL, NULL, errp);
> +    drck->detach(drc, dev, spapr_cpu_release, NULL, errp);
>  }
>  
>  static void spapr_cpu_plug(HotplugHandler *hotplug_dev, DeviceState *dev,

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [Qemu-devel] [RFC PATCH v2 21/23] spapr: Initialize hotplug memory address space
  2015-03-23 13:36 ` [Qemu-devel] [RFC PATCH v2 21/23] spapr: Initialize hotplug memory address space Bharata B Rao
@ 2015-03-25  5:58   ` David Gibson
  2015-04-13  2:59     ` Bharata B Rao
  0 siblings, 1 reply; 74+ messages in thread
From: David Gibson @ 2015-03-25  5:58 UTC (permalink / raw)
  To: Bharata B Rao
  Cc: mdroth, agraf, qemu-devel, qemu-ppc, tyreld, nfont, imammedo, afaerber

[-- Attachment #1: Type: text/plain, Size: 5642 bytes --]

On Mon, Mar 23, 2015 at 07:06:02PM +0530, Bharata B Rao wrote:
> Initialize a hotplug memory region under which all the hotplugged
> memory is accommodated. Also enable memory hotplug by setting
> CONFIG_MEM_HOTPLUG.
> 
> Modelled on i386 memory hotplug.
> 
> Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
> ---
>  default-configs/ppc64-softmmu.mak |  1 +
>  hw/ppc/spapr.c                    | 50 +++++++++++++++++++++++++++++++++++++++
>  include/hw/ppc/spapr.h            | 12 ++++++++++
>  3 files changed, 63 insertions(+)
> 
> diff --git a/default-configs/ppc64-softmmu.mak b/default-configs/ppc64-softmmu.mak
> index 22ef132..16b3011 100644
> --- a/default-configs/ppc64-softmmu.mak
> +++ b/default-configs/ppc64-softmmu.mak
> @@ -51,3 +51,4 @@ CONFIG_XICS_KVM=$(and $(CONFIG_PSERIES),$(CONFIG_KVM))
>  # For PReP
>  CONFIG_MC146818RTC=y
>  CONFIG_ISA_TESTDEV=y
> +CONFIG_MEM_HOTPLUG=y
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index 3e56d9e..e43bb49 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -125,8 +125,13 @@ struct sPAPRMachineState {
>  
>      /*< public >*/
>      char *kvm_type;
> +    ram_addr_t hotplug_memory_base;
> +    MemoryRegion hotplug_memory;
> +    bool enforce_aligned_dimm;
>  };
>  
> +#define SPAPR_MACHINE_ENFORCE_ALIGNED_DIMM "enforce-aligned-dimm"

What's the rationale for including this option?

> +
>  sPAPREnvironment *spapr;
>  
>  static XICSState *try_create_xics(const char *type, int nr_servers,
> @@ -1499,6 +1504,7 @@ static void ppc_spapr_init(MachineState *machine)
>      int smt = kvmppc_smt_threads();
>      Object *socket;
>      int sockets;
> +    sPAPRMachineState *ms = SPAPR_MACHINE(machine);
>  
>      msi_supported = true;
>  
> @@ -1585,6 +1591,36 @@ static void ppc_spapr_init(MachineState *machine)
>          memory_region_add_subregion(sysmem, 0, rma_region);
>      }
>  
> +    /* initialize hotplug memory address space */
> +    if (machine->ram_size < machine->maxram_size) {
> +        ram_addr_t hotplug_mem_size =
> +            machine->maxram_size - machine->ram_size;
> +
> +        if (machine->ram_slots > SPAPR_MAX_RAM_SLOTS) {
> +            error_report("unsupported amount of memory slots: %"PRIu64,
> +                         machine->ram_slots);
> +            exit(EXIT_FAILURE);
> +        }
> +
> +        ms->hotplug_memory_base = ROUND_UP(machine->ram_size,
> +                                    SPAPR_HOTPLUG_MEM_ALIGN);
> +
> +        if (ms->enforce_aligned_dimm) {
> +            hotplug_mem_size += SPAPR_HOTPLUG_MEM_ALIGN * machine->ram_slots;
> +        }
> +
> +        if ((ms->hotplug_memory_base + hotplug_mem_size) < hotplug_mem_size) {
> +            error_report("unsupported amount of maximum memory: " RAM_ADDR_FMT,
> +                         machine->maxram_size);
> +            exit(EXIT_FAILURE);
> +        }
> +
> +        memory_region_init(&ms->hotplug_memory, OBJECT(ms),
> +                           "hotplug-memory", hotplug_mem_size);
> +        memory_region_add_subregion(sysmem, ms->hotplug_memory_base,
> +                                    &ms->hotplug_memory);
> +    }
> +
>      filename = qemu_find_file(QEMU_FILE_TYPE_BIOS, "spapr-rtas.bin");
>      spapr->rtas_size = get_image_size(filename);
>      spapr->rtas_blob = g_malloc(spapr->rtas_size);
> @@ -1827,13 +1863,27 @@ static void spapr_set_kvm_type(Object *obj, const char *value, Error **errp)
>      sm->kvm_type = g_strdup(value);
>  }
>  
> +static bool spapr_machine_get_aligned_dimm(Object *obj, Error **errp)
> +{
> +    sPAPRMachineState *ms = SPAPR_MACHINE(obj);
> +
> +    return ms->enforce_aligned_dimm;
> +}
> +
>  static void spapr_machine_initfn(Object *obj)
>  {
> +    sPAPRMachineState *ms = SPAPR_MACHINE(obj);
> +
>      object_property_add_str(obj, "kvm-type",
>                              spapr_get_kvm_type, spapr_set_kvm_type, NULL);
>      object_property_set_description(obj, "kvm-type",
>                                      "Specifies the KVM virtualization mode (HV, PR)",
>                                      NULL);
> +
> +    ms->enforce_aligned_dimm = true;
> +    object_property_add_bool(obj, SPAPR_MACHINE_ENFORCE_ALIGNED_DIMM,
> +                             spapr_machine_get_aligned_dimm,
> +                             NULL, NULL);
>  }
>  
>  static void ppc_cpu_do_nmi_on_cpu(void *arg)
> diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
> index ecac6e3..53560e9 100644
> --- a/include/hw/ppc/spapr.h
> +++ b/include/hw/ppc/spapr.h
> @@ -542,6 +542,18 @@ struct sPAPREventLogEntry {
>  
>  #define SPAPR_MEMORY_BLOCK_SIZE (1 << 28) /* 256MB */
>  
> +/*
> + * This defines the maximum number of DIMM slots we can have for sPAPR
> + * guest. This is not defined by sPAPR but we are defining it to 4096 slots
> + * here. With the worst case addition of SPAPR_MEMORY_BLOCK_SIZE
> + * (256MB) memory per slot, we should be able to support 1TB of guest
> + * hotpluggable memory.
> + */
> +#define SPAPR_MAX_RAM_SLOTS     (1ULL << 12)
> +
> +/* 1GB alignment for hotplug memory region */
> +#define SPAPR_HOTPLUG_MEM_ALIGN (1ULL << 30)
> +
>  void spapr_events_init(sPAPREnvironment *spapr);
>  void spapr_events_fdt_skel(void *fdt, uint32_t epow_irq);
>  int spapr_h_cas_compose_response(target_ulong addr, target_ulong size);
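
For reference, a standalone sketch of the base and size arithmetic in the
ppc_spapr_init() hunk above, showing what the enforce-aligned-dimm flag
changes: with it set, the region grows by one alignment unit per slot.
The two constants come from the patch; round_up(), struct hotplug_layout
and compute_hotplug_layout() are names made up for this example and are
not QEMU code. Unlike the patch, which exits on error, the sketch simply
returns false.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Values taken from the patch; the helper names below are made up. */
#define HOTPLUG_MEM_ALIGN (1ULL << 30)   /* SPAPR_HOTPLUG_MEM_ALIGN, 1GB */
#define MAX_RAM_SLOTS     (1ULL << 12)   /* SPAPR_MAX_RAM_SLOTS */

/* Round n up to a power-of-two alignment. */
static uint64_t round_up(uint64_t n, uint64_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (n + align - 1) & ~(align - 1);
}

struct hotplug_layout {
    uint64_t base;
    uint64_t size;
};

/*
 * Returns true and fills *out when a hotplug region is needed and fits.
 * Returns false when ram_size >= maxram_size (no region needed), when
 * there are too many slots, or when base + size wraps around 64 bits.
 */
static bool compute_hotplug_layout(uint64_t ram_size, uint64_t maxram_size,
                                   uint64_t ram_slots, bool enforce_aligned,
                                   struct hotplug_layout *out)
{
    uint64_t size, base;

    if (ram_size >= maxram_size || ram_slots > MAX_RAM_SLOTS) {
        return false;
    }
    size = maxram_size - ram_size;
    base = round_up(ram_size, HOTPLUG_MEM_ALIGN);
    if (enforce_aligned) {
        /* Reserve one alignment unit of slack per slot. */
        size += HOTPLUG_MEM_ALIGN * ram_slots;
    }
    if (base + size < size) {
        return false;                    /* 64-bit wraparound */
    }
    out->base = base;
    out->size = size;
    return true;
}
```

With 4GB of boot RAM, maxmem of 8GB and 2 slots, the region starts at 4GB
and is 6GB long when the flag is set, against 4GB when it is clear.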

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [Qemu-devel] [RFC PATCH v2 05/23] spapr: Reorganize CPU dt generation code
  2015-03-25  1:36   ` David Gibson
@ 2015-03-25  8:26     ` Bharata B Rao
  2015-03-26  1:40       ` David Gibson
  0 siblings, 1 reply; 74+ messages in thread
From: Bharata B Rao @ 2015-03-25  8:26 UTC (permalink / raw)
  To: David Gibson
  Cc: mdroth, agraf, qemu-devel, qemu-ppc, tyreld, nfont, imammedo, afaerber

On Wed, Mar 25, 2015 at 12:36:38PM +1100, David Gibson wrote:
> On Mon, Mar 23, 2015 at 07:05:46PM +0530, Bharata B Rao wrote:
> > Reorganize CPU device tree generation code so that it can be reused from
> > hotplug path. CPU dt entries are now generated from spapr_finalize_fdt()
> > instead of spapr_create_fdt_skel().
> > 
> > Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
> > ---
> >  hw/ppc/spapr.c | 288 ++++++++++++++++++++++++++++++---------------------------
> >  1 file changed, 154 insertions(+), 134 deletions(-)
> > 
> > diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> > index 36ff754..1a25cc0 100644
> > --- a/hw/ppc/spapr.c
> > +++ b/hw/ppc/spapr.c
> > @@ -206,24 +206,50 @@ static int spapr_fixup_cpu_smt_dt(void *fdt, int offset, PowerPCCPU *cpu,
> >      return ret;
> >  }
> >  
> > +static int spapr_fixup_cpu_numa_smt_dt(void *fdt, int offset, CPUState *cs,
> > +                                        sPAPREnvironment *spapr)
> > +{
> > +    int ret;
> > +    PowerPCCPU *cpu = POWERPC_CPU(cs);
> > +    int index = ppc_get_vcpu_dt_id(cpu);
> > +    uint32_t pft_size_prop[] = {0, cpu_to_be32(spapr->htab_shift)};
> > +    uint32_t associativity[] = {cpu_to_be32(0x5),
> > +                                cpu_to_be32(0x0),
> > +                                cpu_to_be32(0x0),
> > +                                cpu_to_be32(0x0),
> > +                                cpu_to_be32(cs->numa_node),
> > +                                cpu_to_be32(index)};
> > +
> > +    /* Advertise NUMA via ibm,associativity */
> > +    if (nb_numa_nodes > 1) {
> > +        ret = fdt_setprop(fdt, offset, "ibm,associativity", associativity,
> > +                          sizeof(associativity));
> > +        if (ret < 0) {
> > +            return ret;
> > +        }
> > +    }
> > +
> > +    ret = fdt_setprop(fdt, offset, "ibm,pft-size",
> > +                      pft_size_prop, sizeof(pft_size_prop));
> > +    if (ret < 0) {
> > +        return ret;
> > +    }
> 
> The pft-size property isn't actually related to NUMA, so it doesn't
> really belong in this function.

Right, let me find an appropriate place to set this.

> 
> > +    return spapr_fixup_cpu_smt_dt(fdt, offset, cpu,
> > +                                 ppc_get_compat_smt_threads(cpu));
> 
> Likewise calling the smt fixup function from the numa fixup function
> just seems odd to me; just be explicit and call the two sequentially.

It's just poor naming; I meant numa-and-smt. So essentially it fixes up
the NUMA and SMT related properties.

> 
> Overall this seems an odd way to split things.  Why not just make a
> spapr_fixup_one_cpu_dt() function, or similar, which should do all the
> necessary pieces.

A single routine that sets up everything related to the CPU DT will not
work, because some parts (the NUMA and SMT bits) can potentially be fixed
up later as part of the ibm,client-architecture-support call. That is why
I split up the reorg in this manner. Anyway, I will take another stab at
this and see if I can improve it.
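
A rough sketch of the shape being discussed, under the assumption that
the properties fall into two groups: those set once when the node is
created, and those that may be rewritten after
ibm,client-architecture-support. struct cpu_node and all four function
names are hypothetical stand-ins for this example; nothing here is
libfdt or QEMU API. pft-size sits with the static properties, and the
NUMA and SMT fixups are called one after the other from a single entry
point.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for a CPU device tree node; not a QEMU or libfdt type. */
struct cpu_node {
    uint32_t reg;           /* static: set once at creation */
    uint32_t pft_size;      /* static: hash table size shift */
    uint32_t numa_node;     /* may be rewritten later */
    uint32_t smt_threads;   /* may be rewritten later */
};

/* Properties that do not change after the node is created. */
static int populate_cpu_static(struct cpu_node *n, uint32_t index,
                               uint32_t htab_shift)
{
    assert(n != NULL);
    n->reg = index;
    n->pft_size = htab_shift;
    return 0;
}

static int fixup_cpu_numa(struct cpu_node *n, uint32_t numa_node)
{
    n->numa_node = numa_node;
    return 0;
}

static int fixup_cpu_smt(struct cpu_node *n, uint32_t smt_threads)
{
    if (smt_threads == 0) {
        return -1;          /* a core needs at least one thread */
    }
    n->smt_threads = smt_threads;
    return 0;
}

/* Single entry point for the late fixups; the two steps stay explicit. */
static int fixup_one_cpu(struct cpu_node *n, uint32_t numa_node,
                         uint32_t smt_threads)
{
    int ret = fixup_cpu_numa(n, numa_node);

    if (ret < 0) {
        return ret;
    }
    return fixup_cpu_smt(n, smt_threads);
}
```

The boot and hotplug paths would call populate_cpu_static() and then
fixup_one_cpu(); the later fixup path would call only fixup_one_cpu(),
leaving the static properties untouched.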

> 
> > +}
> > +
> >  static int spapr_fixup_cpu_dt(void *fdt, sPAPREnvironment *spapr)
> >  {
> >      int ret = 0, offset, cpus_offset;
> >      CPUState *cs;
> >      char cpu_model[32];
> >      int smt = kvmppc_smt_threads();
> > -    uint32_t pft_size_prop[] = {0, cpu_to_be32(spapr->htab_shift)};
> >  
> >      CPU_FOREACH(cs) {
> >          PowerPCCPU *cpu = POWERPC_CPU(cs);
> >          DeviceClass *dc = DEVICE_GET_CLASS(cs);
> >          int index = ppc_get_vcpu_dt_id(cpu);
> > -        uint32_t associativity[] = {cpu_to_be32(0x5),
> > -                                    cpu_to_be32(0x0),
> > -                                    cpu_to_be32(0x0),
> > -                                    cpu_to_be32(0x0),
> > -                                    cpu_to_be32(cs->numa_node),
> > -                                    cpu_to_be32(index)};
> >  
> >          if ((index % smt) != 0) {
> >              continue;
> > @@ -247,22 +273,7 @@ static int spapr_fixup_cpu_dt(void *fdt, sPAPREnvironment *spapr)
> >              }
> >          }
> >  
> > -        if (nb_numa_nodes > 1) {
> > -            ret = fdt_setprop(fdt, offset, "ibm,associativity", associativity,
> > -                              sizeof(associativity));
> > -            if (ret < 0) {
> > -                return ret;
> > -            }
> > -        }
> > -
> > -        ret = fdt_setprop(fdt, offset, "ibm,pft-size",
> > -                          pft_size_prop, sizeof(pft_size_prop));
> > -        if (ret < 0) {
> > -            return ret;
> > -        }
> > -
> > -        ret = spapr_fixup_cpu_smt_dt(fdt, offset, cpu,
> > -                                     ppc_get_compat_smt_threads(cpu));
> > +        ret = spapr_fixup_cpu_numa_smt_dt(fdt, offset, cs, spapr);
> >          if (ret < 0) {
> >              return ret;
> >          }
> > @@ -341,18 +352,13 @@ static void *spapr_create_fdt_skel(hwaddr initrd_base,
> >                                     uint32_t epow_irq)
> >  {
> >      void *fdt;
> > -    CPUState *cs;
> >      uint32_t start_prop = cpu_to_be32(initrd_base);
> >      uint32_t end_prop = cpu_to_be32(initrd_base + initrd_size);
> >      GString *hypertas = g_string_sized_new(256);
> >      GString *qemu_hypertas = g_string_sized_new(256);
> >      uint32_t refpoints[] = {cpu_to_be32(0x4), cpu_to_be32(0x4)};
> >      uint32_t interrupt_server_ranges_prop[] = {0, cpu_to_be32(max_cpus)};
> > -    int smt = kvmppc_smt_threads();
> >      unsigned char vec5[] = {0x0, 0x0, 0x0, 0x0, 0x0, 0x80};
> > -    QemuOpts *opts = qemu_opts_find(qemu_find_opts("smp-opts"), NULL);
> > -    unsigned sockets = opts ? qemu_opt_get_number(opts, "sockets", 0) : 0;
> > -    uint32_t cpus_per_socket = sockets ? (smp_cpus / sockets) : 1;
> >      char *buf;
> >  
> >      add_str(hypertas, "hcall-pft");
> > @@ -441,107 +447,6 @@ static void *spapr_create_fdt_skel(hwaddr initrd_base,
> >  
> >      _FDT((fdt_end_node(fdt)));
> >  
> > -    /* cpus */
> > -    _FDT((fdt_begin_node(fdt, "cpus")));
> > -
> > -    _FDT((fdt_property_cell(fdt, "#address-cells", 0x1)));
> > -    _FDT((fdt_property_cell(fdt, "#size-cells", 0x0)));
> > -
> > -    CPU_FOREACH(cs) {
> > -        PowerPCCPU *cpu = POWERPC_CPU(cs);
> > -        CPUPPCState *env = &cpu->env;
> > -        DeviceClass *dc = DEVICE_GET_CLASS(cs);
> > -        PowerPCCPUClass *pcc = POWERPC_CPU_GET_CLASS(cs);
> > -        int index = ppc_get_vcpu_dt_id(cpu);
> > -        char *nodename;
> > -        uint32_t segs[] = {cpu_to_be32(28), cpu_to_be32(40),
> > -                           0xffffffff, 0xffffffff};
> > -        uint32_t tbfreq = kvm_enabled() ? kvmppc_get_tbfreq() : TIMEBASE_FREQ;
> > -        uint32_t cpufreq = kvm_enabled() ? kvmppc_get_clockfreq() : 1000000000;
> > -        uint32_t page_sizes_prop[64];
> > -        size_t page_sizes_prop_size;
> > -
> > -        if ((index % smt) != 0) {
> > -            continue;
> > -        }
> > -
> > -        nodename = g_strdup_printf("%s@%x", dc->fw_name, index);
> > -
> > -        _FDT((fdt_begin_node(fdt, nodename)));
> > -
> > -        g_free(nodename);
> > -
> > -        _FDT((fdt_property_cell(fdt, "reg", index)));
> > -        _FDT((fdt_property_string(fdt, "device_type", "cpu")));
> > -
> > -        _FDT((fdt_property_cell(fdt, "cpu-version", env->spr[SPR_PVR])));
> > -        _FDT((fdt_property_cell(fdt, "d-cache-block-size",
> > -                                env->dcache_line_size)));
> > -        _FDT((fdt_property_cell(fdt, "d-cache-line-size",
> > -                                env->dcache_line_size)));
> > -        _FDT((fdt_property_cell(fdt, "i-cache-block-size",
> > -                                env->icache_line_size)));
> > -        _FDT((fdt_property_cell(fdt, "i-cache-line-size",
> > -                                env->icache_line_size)));
> > -
> > -        if (pcc->l1_dcache_size) {
> > -            _FDT((fdt_property_cell(fdt, "d-cache-size", pcc->l1_dcache_size)));
> > -        } else {
> > -            fprintf(stderr, "Warning: Unknown L1 dcache size for cpu\n");
> > -        }
> > -        if (pcc->l1_icache_size) {
> > -            _FDT((fdt_property_cell(fdt, "i-cache-size", pcc->l1_icache_size)));
> > -        } else {
> > -            fprintf(stderr, "Warning: Unknown L1 icache size for cpu\n");
> > -        }
> > -
> > -        _FDT((fdt_property_cell(fdt, "timebase-frequency", tbfreq)));
> > -        _FDT((fdt_property_cell(fdt, "clock-frequency", cpufreq)));
> > -        _FDT((fdt_property_cell(fdt, "ibm,slb-size", env->slb_nr)));
> > -        _FDT((fdt_property_string(fdt, "status", "okay")));
> > -        _FDT((fdt_property(fdt, "64-bit", NULL, 0)));
> > -
> > -        if (env->spr_cb[SPR_PURR].oea_read) {
> > -            _FDT((fdt_property(fdt, "ibm,purr", NULL, 0)));
> > -        }
> > -
> > -        if (env->mmu_model & POWERPC_MMU_1TSEG) {
> > -            _FDT((fdt_property(fdt, "ibm,processor-segment-sizes",
> > -                               segs, sizeof(segs))));
> > -        }
> > -
> > -        /* Advertise VMX/VSX (vector extensions) if available
> > -         *   0 / no property == no vector extensions
> > -         *   1               == VMX / Altivec available
> > -         *   2               == VSX available */
> > -        if (env->insns_flags & PPC_ALTIVEC) {
> > -            uint32_t vmx = (env->insns_flags2 & PPC2_VSX) ? 2 : 1;
> > -
> > -            _FDT((fdt_property_cell(fdt, "ibm,vmx", vmx)));
> > -        }
> > -
> > -        /* Advertise DFP (Decimal Floating Point) if available
> > -         *   0 / no property == no DFP
> > -         *   1               == DFP available */
> > -        if (env->insns_flags2 & PPC2_DFP) {
> > -            _FDT((fdt_property_cell(fdt, "ibm,dfp", 1)));
> > -        }
> > -
> > -        page_sizes_prop_size = create_page_sizes_prop(env, page_sizes_prop,
> > -                                                      sizeof(page_sizes_prop));
> > -        if (page_sizes_prop_size) {
> > -            _FDT((fdt_property(fdt, "ibm,segment-page-sizes",
> > -                               page_sizes_prop, page_sizes_prop_size)));
> > -        }
> > -
> > -        _FDT((fdt_property_cell(fdt, "ibm,chip-id",
> > -                                cs->cpu_index / cpus_per_socket)));
> > -
> > -        _FDT((fdt_end_node(fdt)));
> > -    }
> > -
> > -    _FDT((fdt_end_node(fdt)));
> > -
> >      /* RTAS */
> >      _FDT((fdt_begin_node(fdt, "rtas")));
> >  
> > @@ -739,6 +644,124 @@ static int spapr_populate_memory(sPAPREnvironment *spapr, void *fdt)
> >      return 0;
> >  }
> >  
> > +static void spapr_populate_cpu_dt(CPUState *cs, void *fdt, int offset)
> > +{
> > +    PowerPCCPU *cpu = POWERPC_CPU(cs);
> > +    CPUPPCState *env = &cpu->env;
> > +    PowerPCCPUClass *pcc = POWERPC_CPU_GET_CLASS(cs);
> > +    int index = ppc_get_vcpu_dt_id(cpu);
> > +    uint32_t segs[] = {cpu_to_be32(28), cpu_to_be32(40),
> > +                       0xffffffff, 0xffffffff};
> > +    uint32_t tbfreq = kvm_enabled() ? kvmppc_get_tbfreq() : TIMEBASE_FREQ;
> > +    uint32_t cpufreq = kvm_enabled() ? kvmppc_get_clockfreq() : 1000000000;
> > +    uint32_t page_sizes_prop[64];
> > +    size_t page_sizes_prop_size;
> > +    QemuOpts *opts = qemu_opts_find(qemu_find_opts("smp-opts"), NULL);
> > +    unsigned sockets = opts ? qemu_opt_get_number(opts, "sockets", 0) : 0;
> > +    uint32_t cpus_per_socket = sockets ? (smp_cpus / sockets) : 1;
> > +
> > +    _FDT((fdt_setprop_cell(fdt, offset, "reg", index)));
> > +    _FDT((fdt_setprop_string(fdt, offset, "device_type", "cpu")));
> > +
> > +    _FDT((fdt_setprop_cell(fdt, offset, "cpu-version", env->spr[SPR_PVR])));
> > +    _FDT((fdt_setprop_cell(fdt, offset, "d-cache-block-size",
> > +                            env->dcache_line_size)));
> > +    _FDT((fdt_setprop_cell(fdt, offset, "d-cache-line-size",
> > +                            env->dcache_line_size)));
> > +    _FDT((fdt_setprop_cell(fdt, offset, "i-cache-block-size",
> > +                            env->icache_line_size)));
> > +    _FDT((fdt_setprop_cell(fdt, offset, "i-cache-line-size",
> > +                            env->icache_line_size)));
> > +
> > +    if (pcc->l1_dcache_size) {
> > +        _FDT((fdt_setprop_cell(fdt, offset, "d-cache-size",
> > +                             pcc->l1_dcache_size)));
> > +    } else {
> > +        fprintf(stderr, "Warning: Unknown L1 dcache size for cpu\n");
> > +    }
> > +    if (pcc->l1_icache_size) {
> > +        _FDT((fdt_setprop_cell(fdt, offset, "i-cache-size",
> > +                             pcc->l1_icache_size)));
> > +    } else {
> > +        fprintf(stderr, "Warning: Unknown L1 icache size for cpu\n");
> > +    }
> > +
> > +    _FDT((fdt_setprop_cell(fdt, offset, "timebase-frequency", tbfreq)));
> > +    _FDT((fdt_setprop_cell(fdt, offset, "clock-frequency", cpufreq)));
> > +    _FDT((fdt_setprop_cell(fdt, offset, "ibm,slb-size", env->slb_nr)));
> > +    _FDT((fdt_setprop_string(fdt, offset, "status", "okay")));
> > +    _FDT((fdt_setprop(fdt, offset, "64-bit", NULL, 0)));
> > +
> > +    if (env->spr_cb[SPR_PURR].oea_read) {
> > +        _FDT((fdt_setprop(fdt, offset, "ibm,purr", NULL, 0)));
> > +    }
> > +
> > +    if (env->mmu_model & POWERPC_MMU_1TSEG) {
> > +        _FDT((fdt_setprop(fdt, offset, "ibm,processor-segment-sizes",
> > +                           segs, sizeof(segs))));
> > +    }
> > +
> > +    /* Advertise VMX/VSX (vector extensions) if available
> > +     *   0 / no property == no vector extensions
> > +     *   1               == VMX / Altivec available
> > +     *   2               == VSX available */
> > +    if (env->insns_flags & PPC_ALTIVEC) {
> > +        uint32_t vmx = (env->insns_flags2 & PPC2_VSX) ? 2 : 1;
> > +
> > +        _FDT((fdt_setprop_cell(fdt, offset, "ibm,vmx", vmx)));
> > +    }
> > +
> > +    /* Advertise DFP (Decimal Floating Point) if available
> > +     *   0 / no property == no DFP
> > +     *   1               == DFP available */
> > +    if (env->insns_flags2 & PPC2_DFP) {
> > +        _FDT((fdt_setprop_cell(fdt, offset, "ibm,dfp", 1)));
> > +    }
> > +
> > +    page_sizes_prop_size = create_page_sizes_prop(env, page_sizes_prop,
> > +                                                  sizeof(page_sizes_prop));
> > +    if (page_sizes_prop_size) {
> > +        _FDT((fdt_setprop(fdt, offset, "ibm,segment-page-sizes",
> > +                           page_sizes_prop, page_sizes_prop_size)));
> > +    }
> > +
> > +    _FDT((fdt_setprop_cell(fdt, offset, "ibm,chip-id",
> > +                            cs->cpu_index / cpus_per_socket)));
> > +
> > +    _FDT(spapr_fixup_cpu_numa_smt_dt(fdt, offset, cs, spapr));
> > +}
> > +
> > +static void spapr_populate_cpu_dt_node(void *fdt, sPAPREnvironment *spapr)
> 
> 
> I'd suggest s/cpu/cpus/.  If anything "populate_cpu_dt_node" sounds
> more like it covers a single cpu than "populate_cpu_dt".

Sure.

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [Qemu-devel] [RFC PATCH v2 11/23] ppc: Create sockets and cores for CPUs
  2015-03-25  2:39   ` David Gibson
@ 2015-03-25  8:33     ` Bharata B Rao
  2015-03-26  1:54       ` David Gibson
  0 siblings, 1 reply; 74+ messages in thread
From: Bharata B Rao @ 2015-03-25  8:33 UTC (permalink / raw)
  To: David Gibson
  Cc: mdroth, agraf, qemu-devel, qemu-ppc, tyreld, nfont, imammedo, afaerber

On Wed, Mar 25, 2015 at 01:39:02PM +1100, David Gibson wrote:
> On Mon, Mar 23, 2015 at 07:05:52PM +0530, Bharata B Rao wrote:
> > ppc machine init functions create individual CPU threads. Change this
> > for sPAPR by switching to socket creation. CPUs are created recursively
> > by socket and core instance init routines.
> > 
> > TODO: Switching to socket level CPU creation is done only for sPAPR
> > target now.
> > 
> > Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
> > ---
> >  hw/ppc/cpu-core.c           | 17 +++++++++++++++++
> >  hw/ppc/cpu-socket.c         | 15 +++++++++++++++
> >  hw/ppc/spapr.c              | 15 ++++++++-------
> >  target-ppc/cpu.h            |  1 +
> >  target-ppc/translate_init.c | 46 +++++++++++++++++++++++++++++++++++++++++++++
> >  5 files changed, 87 insertions(+), 7 deletions(-)
> > 
> > diff --git a/hw/ppc/cpu-core.c b/hw/ppc/cpu-core.c
> > index ed0481f..f60646d 100644
> > --- a/hw/ppc/cpu-core.c
> > +++ b/hw/ppc/cpu-core.c
> > @@ -7,6 +7,8 @@
> >  
> >  #include "hw/qdev.h"
> >  #include "hw/ppc/cpu-core.h"
> > +#include "hw/boards.h"
> > +#include <sysemu/cpus.h>
> >  
> >  static int ppc_cpu_core_realize_child(Object *child, void *opaque)
> >  {
> > @@ -32,10 +34,25 @@ static void ppc_cpu_core_class_init(ObjectClass *oc, void *data)
> >      dc->realize = ppc_cpu_core_realize;
> >  }
> >  
> > +static void ppc_cpu_core_instance_init(Object *obj)
> > +{
> > +    int i;
> > +    PowerPCCPU *cpu = NULL;
> > +    MachineState *machine = MACHINE(qdev_get_machine());
> > +
> > +    for (i = 0; i < smp_threads; i++) {
> > +        cpu = POWERPC_CPU(cpu_ppc_create(TYPE_POWERPC_CPU, machine->cpu_model));
> > +        object_property_add_child(obj, "thread[*]", OBJECT(cpu), &error_abort);
> > +        object_unref(OBJECT(cpu));
> > +    }
> > +}
> > +
> >  static const TypeInfo ppc_cpu_core_type_info = {
> >      .name = TYPE_POWERPC_CPU_CORE,
> >      .parent = TYPE_DEVICE,
> >      .class_init = ppc_cpu_core_class_init,
> > +    .instance_init = ppc_cpu_core_instance_init,
> > +    .instance_size = sizeof(PowerPCCPUCore),
> 
> The PowerPCCPUCore structure isn't defined in this patch (I assume it
> already existed), which suggests that setting the instance_size should
> have already been in an earlier patch.

PowerPCCPUCore is already defined, but I put the instance_size here
since I needed instance_init only here. I thought it better to
populate instance_init and instance_size together.

> 
> >  };
> >  
> >  static void ppc_cpu_core_register_types(void)
> > diff --git a/hw/ppc/cpu-socket.c b/hw/ppc/cpu-socket.c
> > index 602a060..f901336 100644
> > --- a/hw/ppc/cpu-socket.c
> > +++ b/hw/ppc/cpu-socket.c
> > @@ -8,6 +8,7 @@
> >  #include "hw/qdev.h"
> >  #include "hw/ppc/cpu-socket.h"
> >  #include "sysemu/cpus.h"
> > +#include "cpu.h"
> >  
> >  static int ppc_cpu_socket_realize_child(Object *child, void *opaque)
> >  {
> > @@ -33,10 +34,24 @@ static void ppc_cpu_socket_class_init(ObjectClass *oc, void *data)
> >      dc->realize = ppc_cpu_socket_realize;
> >  }
> >  
> > +static void ppc_cpu_socket_instance_init(Object *obj)
> > +{
> > +    int i;
> > +    Object *core;
> > +
> > +    for (i = 0; i < smp_cores; i++) {
> > +        core = object_new(TYPE_POWERPC_CPU_CORE);
> > +        object_property_add_child(obj, "core[*]", core, &error_abort);
> > +        object_unref(core);
> > +    }
> > +}
> > +
> >  static const TypeInfo ppc_cpu_socket_type_info = {
> >      .name = TYPE_POWERPC_CPU_SOCKET,
> >      .parent = TYPE_CPU_SOCKET,
> >      .class_init = ppc_cpu_socket_class_init,
> > +    .instance_init = ppc_cpu_socket_instance_init,
> > +    .instance_size = sizeof(PowerPCCPUSocket),
> 
> Likewise for PowerPCCPUSocket.
> >  
> > +/*
> > + * This is essentially same as cpu_generic_init() but without a set
> > + * realize call.
> > + */
> 
> In which case it would probably make more sense to have this be a
> generic function, and implement cpu_generic_init() in terms of it.

Actually, multiple people are touching that part of the code, so I
figured it would be a bit easier for now to contain the changes within ppc.
But yes, eventually we should do what you are suggesting.
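
For what it's worth, the eventual layering could look roughly like the
sketch below: one helper that creates and configures the CPU without
realizing it, and an init function that is just create followed by
realize. struct toy_cpu and the toy_cpu_*() functions are invented for
this example; they only model the split and are not the real
cpu_generic_init() or any other QEMU function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Toy CPU object; stands in for CPUState and is not a QEMU type. */
struct toy_cpu {
    char model[32];
    bool realized;
};

/* Create and configure a CPU but leave it unrealized. */
static struct toy_cpu *toy_cpu_create(const char *model)
{
    struct toy_cpu *cpu;

    if (model == NULL || model[0] == '\0' ||
        strlen(model) >= sizeof(cpu->model)) {
        return NULL;
    }
    cpu = calloc(1, sizeof(*cpu));
    if (cpu == NULL) {
        return NULL;
    }
    strcpy(cpu->model, model);
    return cpu;
}

static void toy_cpu_realize(struct toy_cpu *cpu)
{
    assert(cpu != NULL);
    cpu->realized = true;
}

/* The "generic init" is then create followed by realize. */
static struct toy_cpu *toy_cpu_init(const char *model)
{
    struct toy_cpu *cpu = toy_cpu_create(model);

    if (cpu != NULL) {
        toy_cpu_realize(cpu);
    }
    return cpu;
}
```

A caller such as the core instance_init, which wants the parent's realize
to walk the children later, would use the create variant; everyone else
keeps calling the init variant unchanged.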

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [Qemu-devel] [RFC PATCH v2 12/23] spapr: CPU hotplug support
  2015-03-25  3:03   ` David Gibson
@ 2015-03-25  8:36     ` Bharata B Rao
  2015-03-26  1:42       ` David Gibson
  0 siblings, 1 reply; 74+ messages in thread
From: Bharata B Rao @ 2015-03-25  8:36 UTC (permalink / raw)
  To: David Gibson
  Cc: mdroth, agraf, qemu-devel, qemu-ppc, tyreld, nfont, imammedo, afaerber

On Wed, Mar 25, 2015 at 02:03:45PM +1100, David Gibson wrote:
> On Mon, Mar 23, 2015 at 07:05:53PM +0530, Bharata B Rao wrote:
> > Support CPU hotplug via device-add command. Set up device tree
> > entries for the hotplugged CPU core and use the existing EPOW event
> > infrastructure to send CPU hotplug notification to the guest.
> > 
> > Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
> > ---
> >  hw/ppc/spapr.c        | 75 +++++++++++++++++++++++++++++++++++++++++++++++++++
> >  hw/ppc/spapr_events.c |  8 +++---
> >  hw/ppc/spapr_rtas.c   | 11 ++++++++
> >  3 files changed, 91 insertions(+), 3 deletions(-)
> > 
> > diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> > index f52d38f..b48994b 100644
> > --- a/hw/ppc/spapr.c
> > +++ b/hw/ppc/spapr.c
> > @@ -33,6 +33,7 @@
> >  #include "sysemu/block-backend.h"
> >  #include "sysemu/cpus.h"
> >  #include "sysemu/kvm.h"
> > +#include "sysemu/device_tree.h"
> >  #include "kvm_ppc.h"
> >  #include "mmu-hash64.h"
> >  #include "qom/cpu.h"
> > @@ -660,6 +661,10 @@ static void spapr_populate_cpu_dt(CPUState *cs, void *fdt, int offset)
> >      QemuOpts *opts = qemu_opts_find(qemu_find_opts("smp-opts"), NULL);
> >      unsigned sockets = opts ? qemu_opt_get_number(opts, "sockets", 0) : 0;
> >      uint32_t cpus_per_socket = sockets ? (smp_cpus / sockets) : 1;
> > +    sPAPRDRConnector *drc =
> > +        spapr_dr_connector_by_id(SPAPR_DR_CONNECTOR_TYPE_CPU, index);
> > +    sPAPRDRConnectorClass *drck = SPAPR_DR_CONNECTOR_GET_CLASS(drc);
> > +    int drc_index = drck->get_index(drc);
> >  
> >      _FDT((fdt_setprop_cell(fdt, offset, "reg", index)));
> >      _FDT((fdt_setprop_string(fdt, offset, "device_type", "cpu")));
> > @@ -728,6 +733,8 @@ static void spapr_populate_cpu_dt(CPUState *cs, void *fdt, int offset)
> >  
> >      _FDT((fdt_setprop_cell(fdt, offset, "ibm,chip-id",
> >                              cs->cpu_index / cpus_per_socket)));
> > +    _FDT((fdt_setprop_cell(fdt, offset, "ibm,my-drc-index", drc_index)));
> > +
> 
> What effect will this have when running with DR disabled?

I realize now that I probably shouldn't populate this at all in that case.
This routine is now common to both the boot path and the hotplug path. I will
take care of this in the next post.
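
For illustration only, here is a minimal sketch of such a guard. It does not
use libfdt or any QEMU code: the toy_* types and functions are hypothetical
stand-ins, and dr_cpu_enabled is passed in as a plain flag. The point is just
that a routine shared by the boot path and the hotplug path can skip
"ibm,my-drc-index" when CPU DR is disabled.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Toy stand-in for a device tree node: a fixed-size list of cell properties. */
#define TOY_MAX_PROPS 8

struct toy_node {
    const char *names[TOY_MAX_PROPS];
    uint32_t values[TOY_MAX_PROPS];
    int nprops;
};

/* Append one 32-bit property; returns 0 on success, -1 if the node is full. */
static int toy_setprop_cell(struct toy_node *n, const char *name, uint32_t val)
{
    if (n->nprops >= TOY_MAX_PROPS) {
        return -1;
    }
    n->names[n->nprops] = name;
    n->values[n->nprops] = val;
    n->nprops++;
    return 0;
}

/* Returns true if the node carries a property with the given name. */
static bool toy_has_prop(const struct toy_node *n, const char *name)
{
    for (int i = 0; i < n->nprops; i++) {
        if (strcmp(n->names[i], name) == 0) {
            return true;
        }
    }
    return false;
}

/*
 * Shared by the boot path and the hotplug path (hypothetical helper).
 * The DRC index is only advertised when CPU DR is enabled.
 */
static int toy_populate_cpu_node(struct toy_node *n, uint32_t reg,
                                 bool dr_cpu_enabled, uint32_t drc_index)
{
    if (toy_setprop_cell(n, "reg", reg) < 0) {
        return -1;
    }
    if (dr_cpu_enabled) {
        return toy_setprop_cell(n, "ibm,my-drc-index", drc_index);
    }
    return 0;
}
```

With dr_cpu_enabled false the node ends up with only "reg"; with it true the
DRC index property is added as well.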

> 
> >      _FDT(spapr_fixup_cpu_numa_smt_dt(fdt, offset, cs, spapr));
> >  }
> > @@ -1840,6 +1847,70 @@ static void spapr_nmi(NMIState *n, int cpu_index, Error **errp)
> >      }
> >  }
> >  
> > +static void spapr_cpu_hotplug_add(DeviceState *dev, CPUState *cs, Error **errp)
> > +{
> > +    PowerPCCPU *cpu = POWERPC_CPU(cs);
> > +    DeviceClass *dc = DEVICE_GET_CLASS(cs);
> > +    int id = ppc_get_vcpu_dt_id(cpu);
> > +    sPAPRDRConnector *drc =
> > +        spapr_dr_connector_by_id(SPAPR_DR_CONNECTOR_TYPE_CPU, id);
> > +    sPAPRDRConnectorClass *drck = SPAPR_DR_CONNECTOR_GET_CLASS(drc);
> > +    void *fdt;
> > +    int offset, i, fdt_size;
> > +    char *nodename;
> > +
> > +    fdt = create_device_tree(&fdt_size);
> > +    nodename = g_strdup_printf("%s@%x", dc->fw_name, id);
> > +    offset = fdt_add_subnode(fdt, 0, nodename);
> > +
> > +    /* Set NUMA node for the added CPU core */
> > +    for (i = 0; i < nb_numa_nodes; i++) {
> > +        if (test_bit(cs->cpu_index, numa_info[i].node_cpu)) {
> > +            cs->numa_node = i;
> > +            break;
> > +        }
> > +    }
> > +
> > +    spapr_populate_cpu_dt(cs, fdt, offset);
> > +    g_free(nodename);
> > +
> > +    drck->attach(drc, dev, fdt, offset, !dev->hotplugged, errp);
> > +    if (*errp) {
> > +        g_free(fdt);
> > +    }
> > +}
> > +
> > +static void spapr_cpu_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
> > +                            Error **errp)
> > +{
> > +    CPUState *cs = CPU(dev);
> > +    PowerPCCPU *cpu = POWERPC_CPU(cs);
> > +    int id = ppc_get_vcpu_dt_id(cpu);
> > +    sPAPRDRConnector *drc =
> > +        spapr_dr_connector_by_id(SPAPR_DR_CONNECTOR_TYPE_CPU, id);
> > +    int smt = kvmppc_smt_threads();
> > +    Error *local_err = NULL;
> > +
> > +    /*
> > +     * SMT threads return from here, only main thread (core) will
> > +     * continue and signal hotplug event to the guest.
> > +     */
> > +    if ((id % smt) != 0) {
> > +        return;
> > +    }
> > +
> > +    g_assert(drc);
> > +
> > +    spapr_cpu_hotplug_add(dev, cs, &local_err);
> > +    if (local_err) {
> > +        error_propagate(errp, local_err);
> > +        return;
> > +    }
> > +    spapr_hotplug_req_add_event(drc);
> > +
> > +    return;
> > +}
> > +
> >  static void spapr_machine_device_plug(HotplugHandler *hotplug_dev,
> >                                        DeviceState *dev, Error **errp)
> >  {
> > @@ -1848,6 +1919,10 @@ static void spapr_machine_device_plug(HotplugHandler *hotplug_dev,
> >          PowerPCCPU *cpu = POWERPC_CPU(cs);
> >  
> >          spapr_cpu_init(cpu);
> > +        spapr_cpu_reset(cpu);
> > +        if (dev->hotplugged && spapr->dr_cpu_enabled) {
> > +            spapr_cpu_plug(hotplug_dev, dev, errp);
> > +        }
> >      }
> >  }
> >  
> > diff --git a/hw/ppc/spapr_events.c b/hw/ppc/spapr_events.c
> > index be82815..4ae818a 100644
> > --- a/hw/ppc/spapr_events.c
> > +++ b/hw/ppc/spapr_events.c
> > @@ -421,14 +421,16 @@ static void spapr_hotplug_req_event(sPAPRDRConnector *drc, uint8_t hp_action)
> >      hp->hdr.section_length = cpu_to_be16(sizeof(*hp));
> >      hp->hdr.section_version = 1; /* includes extended modifier */
> >      hp->hotplug_action = hp_action;
> > -
> > +    hp->drc.index = cpu_to_be32(drck->get_index(drc));
> > +    hp->hotplug_identifier = RTAS_LOG_V6_HP_ID_DRC_INDEX;
> >  
> >      switch (drc_type) {
> >      case SPAPR_DR_CONNECTOR_TYPE_PCI:
> > -        hp->drc.index = cpu_to_be32(drck->get_index(drc));
> > -        hp->hotplug_identifier = RTAS_LOG_V6_HP_ID_DRC_INDEX;
> >          hp->hotplug_type = RTAS_LOG_V6_HP_TYPE_PCI;
> >          break;
> > +    case SPAPR_DR_CONNECTOR_TYPE_CPU:
> > +        hp->hotplug_type = RTAS_LOG_V6_HP_TYPE_CPU;
> > +        break;
> >      default:
> >          /* we shouldn't be signaling hotplug events for resources
> >           * that don't support them
> > diff --git a/hw/ppc/spapr_rtas.c b/hw/ppc/spapr_rtas.c
> > index 57ec97a..48aeb86 100644
> > --- a/hw/ppc/spapr_rtas.c
> > +++ b/hw/ppc/spapr_rtas.c
> > @@ -121,6 +121,16 @@ static void rtas_query_cpu_stopped_state(PowerPCCPU *cpu_,
> >      rtas_st(rets, 0, RTAS_OUT_PARAM_ERROR);
> >  }
> >  
> > +static void spapr_cpu_set_endianness(PowerPCCPU *cpu)
> > +{
> > +    PowerPCCPU *fcpu = POWERPC_CPU(first_cpu);
> > +    PowerPCCPUClass *pcc = POWERPC_CPU_GET_CLASS(fcpu);
> 
> In some ways it might be nicer to store the global endian mode in
> sPAPRMachineState, updating it on H_SET_MODE, rather than copying the
> first cpu, although I guess that works.

I can take a stab at that if you feel strongly about it.

Regards,
Bharata.

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [Qemu-devel] [RFC PATCH v2 14/23] cpus: Convert cpu_index into a bitmap
  2015-03-25  3:23   ` David Gibson
@ 2015-03-25  8:52     ` Bharata B Rao
  0 siblings, 0 replies; 74+ messages in thread
From: Bharata B Rao @ 2015-03-25  8:52 UTC (permalink / raw)
  To: David Gibson
  Cc: mdroth, agraf, qemu-devel, qemu-ppc, tyreld, nfont, imammedo, afaerber

On Wed, Mar 25, 2015 at 02:23:29PM +1100, David Gibson wrote:
> On Mon, Mar 23, 2015 at 07:05:55PM +0530, Bharata B Rao wrote:
> > Currently CPUState.cpu_index is monotonically increasing and a newly
> > created CPU always gets the next higher index. The next available
> > index is calculated by counting the existing number of CPUs. This is
> > fine as long as we only add CPUs, but there are architectures which
> > are starting to support CPU removal too. For an architecture like PowerPC
> > which derives its CPU identifier (device tree ID) from cpu_index, the
> > existing logic of generating cpu_index values causes problems.
> > 
> > With the currently proposed method of handling vCPU removal by parking
> > the vCPU fd in QEMU
> > (Ref: http://lists.gnu.org/archive/html/qemu-devel/2015-02/msg02604.html),
> > generating cpu_index this way will not work for PowerPC.
> > 
> > This patch changes the way cpu_index is handed out by maintaining
> > a bit map of the CPUs that tracks both addition and removal of CPUs.
> > 
> > Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
> > ---
> >  exec.c            | 37 ++++++++++++++++++++++++++++++++++---
> >  include/qom/cpu.h |  8 ++++++++
> >  2 files changed, 42 insertions(+), 3 deletions(-)
> > 
> > diff --git a/exec.c b/exec.c
> > index e1ff6b0..9bbab02 100644
> > --- a/exec.c
> > +++ b/exec.c
> > @@ -527,21 +527,52 @@ void tcg_cpu_address_space_init(CPUState *cpu, AddressSpace *as)
> >  }
> >  #endif
> >  
> > +#ifndef CONFIG_USER_ONLY
> > +static DECLARE_BITMAP(cpu_index_map, MAX_CPUMASK_BITS);
> > +
> > +static int cpu_get_free_index(Error **errp)
> > +{
> > +    int cpu = find_first_zero_bit(cpu_index_map, max_cpus);
> > +
> > +    if (cpu == max_cpus) {
> > +        error_setg(errp, "Trying to use more CPUs than allowed max of %d\n",
> > +                    max_cpus);
> > +        return max_cpus;
> > +    } else {
> > +        bitmap_set(cpu_index_map, cpu, 1);
> > +        return cpu;
> > +    }
> > +}
> > +
> > +void cpu_exec_exit(CPUState *cpu)
> > +{
> > +    bitmap_clear(cpu_index_map, cpu->cpu_index, 1);
> > +}
> 
> AFAICT, this function is never called, which seems like a bug.

It is called in a subsequent patch. If you are suggesting that a function
shouldn't be defined in a patch where it is not used, I can move this
down to the other patch.
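
For illustration only, a minimal sketch of why the release side matters for a
bitmap-based index allocator. This is not the exec.c code: it uses a single
64-bit word instead of DECLARE_BITMAP, returns -1 rather than max_cpus on
exhaustion, and the toy_* names are hypothetical.

```c
#include <stdint.h>

#define TOY_MAX_CPUS 64

/* One bit per cpu_index; a set bit means the index is in use. */
static uint64_t toy_cpu_index_map;

/* Hand out the lowest free index below max_cpus, or -1 if none is free. */
static int toy_cpu_get_free_index(int max_cpus)
{
    for (int i = 0; i < max_cpus && i < TOY_MAX_CPUS; i++) {
        if (!(toy_cpu_index_map & (UINT64_C(1) << i))) {
            toy_cpu_index_map |= UINT64_C(1) << i;
            return i;
        }
    }
    return -1;
}

/* Release an index on CPU removal so that a later hotplug can reuse it. */
static void toy_cpu_put_index(int index)
{
    if (index < 0 || index >= TOY_MAX_CPUS) {
        return;
    }
    toy_cpu_index_map &= ~(UINT64_C(1) << index);
}
```

If the release function is never called, a removed CPU's index stays marked
as used and the allocator runs out after max_cpus additions, however many
CPUs have been removed in between.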

Regards,
Bharata.

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [Qemu-devel] [RFC PATCH v2 15/23] ppc: Move cpu_exec_init() call to realize function
  2015-03-25  3:25   ` David Gibson
@ 2015-03-25  8:56     ` Bharata B Rao
  0 siblings, 0 replies; 74+ messages in thread
From: Bharata B Rao @ 2015-03-25  8:56 UTC (permalink / raw)
  To: David Gibson
  Cc: mdroth, agraf, qemu-devel, qemu-ppc, tyreld, nfont, imammedo, afaerber

On Wed, Mar 25, 2015 at 02:25:09PM +1100, David Gibson wrote:
> On Mon, Mar 23, 2015 at 07:05:56PM +0530, Bharata B Rao wrote:
> > Move cpu_exec_init() call from instance_init to realize. This allows
> > any failures from cpu_exec_init() to be handled appropriately.
> > 
> > Also add cpu_exec_exit() call from unrealize.
> 
> This still leaves all non-ppc archs not ever calling cpu_exec_exit().

Oops, I missed that. Will fix.

Regards,
Bharata.

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [Qemu-devel] [RFC PATCH v2 17/23] xics_kvm: Don't enable KVM_CAP_IRQ_XICS if already enabled
  2015-03-25  5:24   ` David Gibson
@ 2015-03-25  9:12     ` Bharata B Rao
  2015-03-26  1:46       ` David Gibson
  0 siblings, 1 reply; 74+ messages in thread
From: Bharata B Rao @ 2015-03-25  9:12 UTC (permalink / raw)
  To: David Gibson
  Cc: mdroth, agraf, qemu-devel, qemu-ppc, tyreld, nfont, imammedo, afaerber

On Wed, Mar 25, 2015 at 04:24:39PM +1100, David Gibson wrote:
> On Mon, Mar 23, 2015 at 07:05:58PM +0530, Bharata B Rao wrote:
> > When supporting CPU hot removal by parking the vCPU fd and reusing
> > it during hotplug again, there can be cases where we try to reenable
> > KVM_CAP_IRQ_XICS CAP for the vCPU for which it was already enabled.
> > Introduce a boolean member in ICPState to track this and don't
> > reenable the CAP if it was already enabled earlier.
> > 
> > This change allows CPU hot removal to work for sPAPR.
> > 
> > Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
> 
> Why does double enabling the capability cause problems?  I would have
> expected it to be unnecessary, but harmless.

We are reusing the vCPU here without closing its fd.

As things stand currently, enabling this cap again will result in the
kernel trying to create and associate an ICP with this vCPU, and that
fails since there is already an ICP associated with it.

Ref: arch/powerpc/kvm/book3s_xics.c:kvmppc_xics_connect_vcpu() kernel code.

So this patch will ensure that we don't re-enable this cap.
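
For illustration only, a minimal sketch of the guard described above. Nothing
here is the real KVM or xics_kvm code: toy_kernel_connect_icp() merely mimics
a connect call that refuses a second attempt (the -EBUSY return value is an
assumption of this sketch), and cap_irq_xics_enabled is a hypothetical name
for the boolean tracked per ICP.

```c
#include <errno.h>
#include <stdbool.h>

/* Toy model of the kernel side: a vCPU can have at most one ICP attached. */
struct toy_vcpu {
    bool kernel_icp_connected;
};

/* Mimics the behaviour described above: a second connect is refused. */
static int toy_kernel_connect_icp(struct toy_vcpu *vcpu)
{
    if (vcpu->kernel_icp_connected) {
        return -EBUSY;
    }
    vcpu->kernel_icp_connected = true;
    return 0;
}

/* Toy model of the per-ICP state kept on the QEMU side. */
struct toy_icp_state {
    bool cap_irq_xics_enabled;
};

/* Enable the capability once; later calls for a reused vCPU are no-ops. */
static int toy_icp_enable_cap(struct toy_icp_state *icp, struct toy_vcpu *vcpu)
{
    int ret;

    if (icp->cap_irq_xics_enabled) {
        return 0;
    }
    ret = toy_kernel_connect_icp(vcpu);
    if (ret == 0) {
        icp->cap_irq_xics_enabled = true;
    }
    return ret;
}
```

The flag only helps if it outlives the unplug, that is, if it is kept
alongside the parked vCPU fd rather than being reset on removal.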

Regards,
Bharata.

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [Qemu-devel] [RFC PATCH v2 19/23] spapr: CPU hot unplug support
  2015-03-25  5:44   ` David Gibson
@ 2015-03-25 16:34     ` Bharata B Rao
  0 siblings, 0 replies; 74+ messages in thread
From: Bharata B Rao @ 2015-03-25 16:34 UTC (permalink / raw)
  To: David Gibson
  Cc: mdroth, agraf, qemu-devel, qemu-ppc, tyreld, nfont, imammedo, afaerber

On Wed, Mar 25, 2015 at 04:44:48PM +1100, David Gibson wrote:
> On Mon, Mar 23, 2015 at 07:06:00PM +0530, Bharata B Rao wrote:
> > Support hot removal of CPU for sPAPR guests by sending the hot
> > unplug notification to the guest via EPOW interrupt.
> > 
> > Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
> > ---
> >  hw/ppc/spapr.c            | 78 ++++++++++++++++++++++++++++++++++++++++++++++-
> >  linux-headers/linux/kvm.h |  1 +
> >  target-ppc/kvm.c          |  7 +++++
> >  target-ppc/kvm_ppc.h      |  6 ++++
> >  4 files changed, 91 insertions(+), 1 deletion(-)
> > 
> > diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> > index b48994b..7b8784d 100644
> > --- a/hw/ppc/spapr.c
> > +++ b/hw/ppc/spapr.c
> > @@ -1468,6 +1468,12 @@ static void spapr_cpu_init(PowerPCCPU *cpu)
> >      qemu_register_reset(spapr_cpu_reset, cpu);
> >  }
> >  
> > +static void spapr_cpu_destroy(PowerPCCPU *cpu)
> > +{
> > +    xics_cpu_destroy(spapr->icp, cpu);
> > +    qemu_unregister_reset(spapr_cpu_reset, cpu);
> > +}
> > +
> >  /* pSeries LPAR / sPAPR hardware init */
> >  static void ppc_spapr_init(MachineState *machine)
> >  {
> > @@ -1880,6 +1886,18 @@ static void spapr_cpu_hotplug_add(DeviceState *dev, CPUState *cs, Error **errp)
> >      }
> >  }
> >  
> > +static void spapr_cpu_hotplug_remove(DeviceState *dev, CPUState *cs,
> > +                                     Error **errp)
> > +{
> > +    PowerPCCPU *cpu = POWERPC_CPU(cs);
> > +    int id = ppc_get_vcpu_dt_id(cpu);
> > +    sPAPRDRConnector *drc =
> > +        spapr_dr_connector_by_id(SPAPR_DR_CONNECTOR_TYPE_CPU, id);
> > +    sPAPRDRConnectorClass *drck = SPAPR_DR_CONNECTOR_GET_CLASS(drc);
> > +
> > +    drck->detach(drc, dev, NULL, NULL, errp);
> > +}
> > +
> >  static void spapr_cpu_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
> >                              Error **errp)
> >  {
> > @@ -1911,6 +1929,51 @@ static void spapr_cpu_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
> >      return;
> >  }
> >  
> > +static int spapr_cpu_unplug(Object *obj, void *opaque)
> > +{
> > +    Error **errp = opaque;
> > +    DeviceState *dev = DEVICE(obj);
> > +    CPUState *cs = CPU(dev);
> > +    PowerPCCPU *cpu = POWERPC_CPU(cs);
> > +    int id = ppc_get_vcpu_dt_id(cpu);
> > +    int smt = kvmppc_smt_threads();
> > +    sPAPRDRConnector *drc =
> > +        spapr_dr_connector_by_id(SPAPR_DR_CONNECTOR_TYPE_CPU, id);
> > +
> > +    spapr_cpu_destroy(cpu);
> 
> I may be misunderstanding something here, but don't you need to signal
> the removal to the guest (and get its ack) before you "physically"
> remove the cpu?

Oh yes, I should move this call to spapr_cpu_release().
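
For illustration only, a minimal sketch of the ordering being discussed:
notify the guest first, and tear down host-side state only from the release
step. The toy_* names and the three-state model are hypothetical and much
simpler than the DR connector state machine.

```c
#include <stdbool.h>

enum toy_cpu_state {
    TOY_CPU_PLUGGED,          /* in use by the guest */
    TOY_CPU_UNPLUG_REQUESTED, /* guest notified, waiting for it to let go */
    TOY_CPU_RELEASED          /* guest let go, host resources destroyed */
};

struct toy_cpu {
    enum toy_cpu_state state;
    bool host_resources_live; /* stands in for ICP and reset handler state */
};

/* Step 1: only notify the guest; nothing is torn down yet. */
static void toy_cpu_unplug_request(struct toy_cpu *cpu)
{
    if (cpu->state == TOY_CPU_PLUGGED) {
        cpu->state = TOY_CPU_UNPLUG_REQUESTED;
    }
}

/* Step 2: called once the guest has released the CPU; destroy here. */
static bool toy_cpu_release(struct toy_cpu *cpu)
{
    if (cpu->state != TOY_CPU_UNPLUG_REQUESTED) {
        return false; /* no pending request, refuse to tear down */
    }
    cpu->host_resources_live = false;
    cpu->state = TOY_CPU_RELEASED;
    return true;
}
```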

> 
> > +    /*
> > +     * SMT threads return from here, only main thread (core) will
> > +     * continue and signal hot unplug event to the guest.
> > +     */
> > +    if ((id % smt) != 0) {
> > +        return 0;
> > +    }
> > +    g_assert(drc);
> > +
> > +    spapr_cpu_hotplug_remove(dev, cs, errp);
> > +    if (*errp) {
> > +        return -1;
> > +    }
> > +    spapr_hotplug_req_remove_event(drc);
> > +
> > +    return 0;
> > +}
> > +
> > +static int spapr_cpu_core_unplug(Object *obj, void *opaque)
> > +{
> > +    Error **errp = opaque;
> > +
> > +    object_child_foreach(obj, spapr_cpu_unplug, errp);
> > +    return 0;
> > +}
> > +
> > +static void spapr_cpu_socket_unplug(HotplugHandler *hotplug_dev,
> > +                            DeviceState *dev, Error **errp)
> > +{
> > +    object_child_foreach(OBJECT(dev), spapr_cpu_core_unplug, errp);
> > +}
> > +
> >  static void spapr_machine_device_plug(HotplugHandler *hotplug_dev,
> >                                        DeviceState *dev, Error **errp)
> >  {
> > @@ -1926,10 +1989,21 @@ static void spapr_machine_device_plug(HotplugHandler *hotplug_dev,
> >      }
> >  }
> >  
> > +static void spapr_machine_device_unplug(HotplugHandler *hotplug_dev,
> > +                                      DeviceState *dev, Error **errp)
> > +{
> > +    if (object_dynamic_cast(OBJECT(dev), TYPE_CPU_SOCKET)) {
> > +        if (dev->hotplugged && spapr->dr_cpu_enabled) {
> > +            spapr_cpu_socket_unplug(hotplug_dev, dev, errp);
> > +        }
> > +    }
> > +}
> > +
> >  static HotplugHandler *spapr_get_hotpug_handler(MachineState *machine,
> >                                               DeviceState *dev)
> >  {
> > -    if (object_dynamic_cast(OBJECT(dev), TYPE_CPU)) {
> > +    if (object_dynamic_cast(OBJECT(dev), TYPE_CPU) ||
> > +        object_dynamic_cast(OBJECT(dev), TYPE_CPU_SOCKET)) {
> >          return HOTPLUG_HANDLER(machine);
> >      }
> >      return NULL;
> > @@ -1953,6 +2027,8 @@ static void spapr_machine_class_init(ObjectClass *oc, void *data)
> >      mc->has_dynamic_sysbus = true;
> >      mc->get_hotplug_handler = spapr_get_hotpug_handler;
> >      hc->plug = spapr_machine_device_plug;
> > +    hc->unplug = spapr_machine_device_unplug;
> > +
> >      smc->dr_phb_enabled = false;
> >      smc->dr_cpu_enabled = false;
> >      smc->dr_lmb_enabled = false;
> > diff --git a/linux-headers/linux/kvm.h b/linux-headers/linux/kvm.h
> > index 12045a1..0c1be07 100644
> > --- a/linux-headers/linux/kvm.h
> > +++ b/linux-headers/linux/kvm.h
> > @@ -761,6 +761,7 @@ struct kvm_ppc_smmu_info {
> >  #define KVM_CAP_PPC_FIXUP_HCALL 103
> >  #define KVM_CAP_PPC_ENABLE_HCALL 104
> >  #define KVM_CAP_CHECK_EXTENSION_VM 105
> > +#define KVM_CAP_SPAPR_REUSE_VCPU 107
> 
> Updates to linux-headers/ are usually put into their own patch for
> safety (along with an indication of what kernel commit the additions
> came from).
> 
> >  #ifdef KVM_CAP_IRQ_ROUTING
> >  
> > diff --git a/target-ppc/kvm.c b/target-ppc/kvm.c
> > index 1edf2b5..ee23bf6 100644
> > --- a/target-ppc/kvm.c
> > +++ b/target-ppc/kvm.c
> > @@ -72,6 +72,7 @@ static int cap_ppc_watchdog;
> >  static int cap_papr;
> >  static int cap_htab_fd;
> >  static int cap_fixup_hcalls;
> > +static int cap_spapr_reuse_vcpu;
> 
> AFAICT nothing in this patch checks this variable.  Does it belong in
> this patch?

This is a remnant of an old approach I was trying. It got left in
by mistake; sorry that you had to review this part.

I will clean this up in the next post.

Regards,
Bharata.

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [Qemu-devel] [RFC PATCH v2 05/23] spapr: Reorganize CPU dt generation code
  2015-03-25  8:26     ` Bharata B Rao
@ 2015-03-26  1:40       ` David Gibson
  0 siblings, 0 replies; 74+ messages in thread
From: David Gibson @ 2015-03-26  1:40 UTC (permalink / raw)
  To: Bharata B Rao
  Cc: mdroth, agraf, qemu-devel, qemu-ppc, tyreld, nfont, imammedo, afaerber

[-- Attachment #1: Type: text/plain, Size: 3758 bytes --]

On Wed, Mar 25, 2015 at 01:56:17PM +0530, Bharata B Rao wrote:
> On Wed, Mar 25, 2015 at 12:36:38PM +1100, David Gibson wrote:
> > On Mon, Mar 23, 2015 at 07:05:46PM +0530, Bharata B Rao wrote:
> > > > Reorganize CPU device tree generation code so that it can be reused from
> > > hotplug path. CPU dt entries are now generated from spapr_finalize_fdt()
> > > instead of spapr_create_fdt_skel().
> > > 
> > > Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
> > > ---
> > >  hw/ppc/spapr.c | 288 ++++++++++++++++++++++++++++++---------------------------
> > >  1 file changed, 154 insertions(+), 134 deletions(-)
> > > 
> > > diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> > > index 36ff754..1a25cc0 100644
> > > --- a/hw/ppc/spapr.c
> > > +++ b/hw/ppc/spapr.c
> > > @@ -206,24 +206,50 @@ static int spapr_fixup_cpu_smt_dt(void *fdt, int offset, PowerPCCPU *cpu,
> > >      return ret;
> > >  }
> > >  
> > > +static int spapr_fixup_cpu_numa_smt_dt(void *fdt, int offset, CPUState *cs,
> > > +                                        sPAPREnvironment *spapr)
> > > +{
> > > +    int ret;
> > > +    PowerPCCPU *cpu = POWERPC_CPU(cs);
> > > +    int index = ppc_get_vcpu_dt_id(cpu);
> > > +    uint32_t pft_size_prop[] = {0, cpu_to_be32(spapr->htab_shift)};
> > > +    uint32_t associativity[] = {cpu_to_be32(0x5),
> > > +                                cpu_to_be32(0x0),
> > > +                                cpu_to_be32(0x0),
> > > +                                cpu_to_be32(0x0),
> > > +                                cpu_to_be32(cs->numa_node),
> > > +                                cpu_to_be32(index)};
> > > +
> > > +    /* Advertise NUMA via ibm,associativity */
> > > +    if (nb_numa_nodes > 1) {
> > > +        ret = fdt_setprop(fdt, offset, "ibm,associativity", associativity,
> > > +                          sizeof(associativity));
> > > +        if (ret < 0) {
> > > +            return ret;
> > > +        }
> > > +    }
> > > +
> > > +    ret = fdt_setprop(fdt, offset, "ibm,pft-size",
> > > +                      pft_size_prop, sizeof(pft_size_prop));
> > > +    if (ret < 0) {
> > > +        return ret;
> > > +    }
> > 
> > The pft-size property isn't actually related to NUMA, so it doesn't
> > really belong in this function.
> 
> Right, let me find an appropriate place to set this.

The top level cpu_fixup function sounds right to me.

> 
> > 
> > > +    return spapr_fixup_cpu_smt_dt(fdt, offset, cpu,
> > > +                                 ppc_get_compat_smt_threads(cpu));
> > 
> > Likewise calling the smt fixup function from the numa fixup function
> > just seems odd to me; just be explicit and call the two sequentially.
> 
> It's just poor naming, I meant numa-and-smt. So essentially it is
> fixing up NUMA and SMT related properties.

I realise that it's NUMA and SMT properties - but "NUMA & SMT" seems a
very arbitrary distinction.  It's not clear why that's a useful subset
of things to be handled by their own function.
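
For illustration only, a minimal sketch of the "be explicit and call them
sequentially" shape. The toy_* functions are hypothetical and just record
which step ran; they are not the spapr fixup routines.

```c
#include <string.h>

/* Toy CPU node: each fixup step appends a one-letter tag to a log. */
struct toy_cpu_node {
    char log[16];
};

/* Append a tag; returns 0 on success, -1 if the log would overflow. */
static int toy_log_step(struct toy_cpu_node *n, const char *tag)
{
    if (strlen(n->log) + strlen(tag) + 1 > sizeof(n->log)) {
        return -1;
    }
    strcat(n->log, tag);
    return 0;
}

static int toy_fixup_pft_size(struct toy_cpu_node *n)
{
    return toy_log_step(n, "P");
}

static int toy_fixup_numa(struct toy_cpu_node *n)
{
    return toy_log_step(n, "N");
}

static int toy_fixup_smt(struct toy_cpu_node *n)
{
    return toy_log_step(n, "S");
}

/* Top-level fixup: one explicit call per concern, stopping on error. */
static int toy_fixup_one_cpu(struct toy_cpu_node *n)
{
    int ret;

    ret = toy_fixup_pft_size(n);
    if (ret < 0) {
        return ret;
    }
    ret = toy_fixup_numa(n);
    if (ret < 0) {
        return ret;
    }
    return toy_fixup_smt(n);
}
```

Each helper then does one thing, and the top-level function is the only
place that decides which pieces run and in what order.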

> > Overall this seems an odd way to split things.  Why not just make a
> > spapr_fixup_one_cpu_dt() function, or similar, which should do all the
> > necessary pieces.
> 
> > Just one routine to set up everything related to CPU DT will not work
> because some parts (NUMA and SMT bits) can be potentially fixed up
> later as part of ibm,client-architecture-support call. This is the
> reason why I have split up the reorg in this manner. Anyway will take
> another stab at this and see if I can improve this.

Ok.

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [Qemu-devel] [RFC PATCH v2 12/23] spapr: CPU hotplug support
  2015-03-25  8:36     ` Bharata B Rao
@ 2015-03-26  1:42       ` David Gibson
  0 siblings, 0 replies; 74+ messages in thread
From: David Gibson @ 2015-03-26  1:42 UTC (permalink / raw)
  To: Bharata B Rao
  Cc: mdroth, agraf, qemu-devel, qemu-ppc, tyreld, nfont, imammedo, afaerber

[-- Attachment #1: Type: text/plain, Size: 3191 bytes --]

On Wed, Mar 25, 2015 at 02:06:29PM +0530, Bharata B Rao wrote:
> On Wed, Mar 25, 2015 at 02:03:45PM +1100, David Gibson wrote:
> > On Mon, Mar 23, 2015 at 07:05:53PM +0530, Bharata B Rao wrote:
> > > Support CPU hotplug via device-add command. Set up device tree
> > > > entries for the hotplugged CPU core and use the existing EPOW event
> > > infrastructure to send CPU hotplug notification to the guest.
> > > 
> > > Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
> > > ---
> > >  hw/ppc/spapr.c        | 75 +++++++++++++++++++++++++++++++++++++++++++++++++++
> > >  hw/ppc/spapr_events.c |  8 +++---
> > >  hw/ppc/spapr_rtas.c   | 11 ++++++++
> > >  3 files changed, 91 insertions(+), 3 deletions(-)
> > > 
> > > diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> > > index f52d38f..b48994b 100644
> > > --- a/hw/ppc/spapr.c
> > > +++ b/hw/ppc/spapr.c
> > > @@ -33,6 +33,7 @@
> > >  #include "sysemu/block-backend.h"
> > >  #include "sysemu/cpus.h"
> > >  #include "sysemu/kvm.h"
> > > +#include "sysemu/device_tree.h"
> > >  #include "kvm_ppc.h"
> > >  #include "mmu-hash64.h"
> > >  #include "qom/cpu.h"
> > > @@ -660,6 +661,10 @@ static void spapr_populate_cpu_dt(CPUState *cs, void *fdt, int offset)
> > >      QemuOpts *opts = qemu_opts_find(qemu_find_opts("smp-opts"), NULL);
> > >      unsigned sockets = opts ? qemu_opt_get_number(opts, "sockets", 0) : 0;
> > >      uint32_t cpus_per_socket = sockets ? (smp_cpus / sockets) : 1;
> > > +    sPAPRDRConnector *drc =
> > > +        spapr_dr_connector_by_id(SPAPR_DR_CONNECTOR_TYPE_CPU, index);
> > > +    sPAPRDRConnectorClass *drck = SPAPR_DR_CONNECTOR_GET_CLASS(drc);
> > > +    int drc_index = drck->get_index(drc);
> > >  
> > >      _FDT((fdt_setprop_cell(fdt, offset, "reg", index)));
> > >      _FDT((fdt_setprop_string(fdt, offset, "device_type", "cpu")));
> > > @@ -728,6 +733,8 @@ static void spapr_populate_cpu_dt(CPUState *cs, void *fdt, int offset)
> > >  
> > >      _FDT((fdt_setprop_cell(fdt, offset, "ibm,chip-id",
> > >                              cs->cpu_index / cpus_per_socket)));
> > > +    _FDT((fdt_setprop_cell(fdt, offset, "ibm,my-drc-index", drc_index)));
> > > +
> > 
> > What effect will this have when running with DR disabled?
> 
> > I realize now that I probably shouldn't populate this at all in that case.
> > This routine is now common to both the boot path and the hotplug path. I
> > will take care of this in the next post.

Ok.

[snip]
> > > +static void spapr_cpu_set_endianness(PowerPCCPU *cpu)
> > > +{
> > > +    PowerPCCPU *fcpu = POWERPC_CPU(first_cpu);
> > > +    PowerPCCPUClass *pcc = POWERPC_CPU_GET_CLASS(fcpu);
> > 
> > In some ways it might be nicer to store the global endian mode in
> > sPAPRMachineState, updating it on H_SET_MODE, rather than copying the
> > first cpu, although I guess that works.
> 
> I can take a stab at that if you feel strongly about it.

Not that strongly.  It's a small cleanup we can consider later.
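
For illustration only, a minimal sketch of that possible cleanup: keep the
endian mode in machine-wide state, update it when the guest switches mode,
and have a hotplugged CPU read it from there. All toy_* names are
hypothetical; this is not the H_SET_MODE handler or sPAPRMachineState.

```c
#include <stdbool.h>

/* Machine-wide state holding the guest's current endian mode. */
struct toy_machine {
    bool little_endian;
};

struct toy_endian_cpu {
    bool little_endian;
};

/* Stand-in for a mode switch: record it once, then apply to existing CPUs. */
static void toy_h_set_mode_endian(struct toy_machine *m,
                                  struct toy_endian_cpu *cpus, int ncpus,
                                  bool little_endian)
{
    m->little_endian = little_endian;
    for (int i = 0; i < ncpus; i++) {
        cpus[i].little_endian = little_endian;
    }
}

/* A hotplugged CPU takes its mode from the machine, not from the first CPU. */
static void toy_cpu_set_endianness(const struct toy_machine *m,
                                   struct toy_endian_cpu *cpu)
{
    cpu->little_endian = m->little_endian;
}
```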

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [Qemu-devel] [RFC PATCH v2 17/23] xics_kvm: Don't enable KVM_CAP_IRQ_XICS if already enabled
  2015-03-25  9:12     ` Bharata B Rao
@ 2015-03-26  1:46       ` David Gibson
  0 siblings, 0 replies; 74+ messages in thread
From: David Gibson @ 2015-03-26  1:46 UTC (permalink / raw)
  To: Bharata B Rao
  Cc: mdroth, agraf, qemu-devel, qemu-ppc, tyreld, nfont, imammedo, afaerber

[-- Attachment #1: Type: text/plain, Size: 1620 bytes --]

On Wed, Mar 25, 2015 at 02:42:24PM +0530, Bharata B Rao wrote:
> On Wed, Mar 25, 2015 at 04:24:39PM +1100, David Gibson wrote:
> > On Mon, Mar 23, 2015 at 07:05:58PM +0530, Bharata B Rao wrote:
> > > When supporting CPU hot removal by parking the vCPU fd and reusing
> > > it during hotplug again, there can be cases where we try to reenable
> > > KVM_CAP_IRQ_XICS CAP for the vCPU for which it was already enabled.
> > > Introduce a boolean member in ICPState to track this and don't
> > > reenable the CAP if it was already enabled earlier.
> > > 
> > > This change allows CPU hot removal to work for sPAPR.
> > > 
> > > Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
> > 
> > Why does double enabling the capability cause problems?  I would have
> > expected it to be unnecessary, but harmless.
> 
> We are reusing the vCPU here without closing its fd.
> 
> As things stand currently, enabling this cap again will result in the
> kernel trying to create and associate an ICP with this vCPU, and that
> fails since there is already an ICP associated with it.
> 
> Ref: arch/powerpc/kvm/book3s_xics.c:kvmppc_xics_connect_vcpu() kernel code.

Ah, right.  Sounds like a kernel bug - I would expect enabling a
capability to be idempotent.  But since the bug's there, we have to
work around it now, so ok.

> So this patch will ensure that we don't re-enable this cap.
> 
> Regards,
> Bharata.
> 

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [Qemu-devel] [RFC PATCH v2 11/23] ppc: Create sockets and cores for CPUs
  2015-03-25  8:33     ` Bharata B Rao
@ 2015-03-26  1:54       ` David Gibson
  0 siblings, 0 replies; 74+ messages in thread
From: David Gibson @ 2015-03-26  1:54 UTC (permalink / raw)
  To: Bharata B Rao
  Cc: mdroth, agraf, qemu-devel, qemu-ppc, tyreld, nfont, imammedo, afaerber

[-- Attachment #1: Type: text/plain, Size: 5042 bytes --]

On Wed, Mar 25, 2015 at 02:03:10PM +0530, Bharata B Rao wrote:
> On Wed, Mar 25, 2015 at 01:39:02PM +1100, David Gibson wrote:
> > On Mon, Mar 23, 2015 at 07:05:52PM +0530, Bharata B Rao wrote:
> > > ppc machine init functions create individual CPU threads. Change this
> > > for sPAPR by switching to socket creation. CPUs are created recursively
> > > by socket and core instance init routines.
> > > 
> > > TODO: Switching to socket level CPU creation is done only for sPAPR
> > > target now.
> > > 
> > > Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
> > > ---
> > >  hw/ppc/cpu-core.c           | 17 +++++++++++++++++
> > >  hw/ppc/cpu-socket.c         | 15 +++++++++++++++
> > >  hw/ppc/spapr.c              | 15 ++++++++-------
> > >  target-ppc/cpu.h            |  1 +
> > >  target-ppc/translate_init.c | 46 +++++++++++++++++++++++++++++++++++++++++++++
> > >  5 files changed, 87 insertions(+), 7 deletions(-)
> > > 
> > > diff --git a/hw/ppc/cpu-core.c b/hw/ppc/cpu-core.c
> > > index ed0481f..f60646d 100644
> > > --- a/hw/ppc/cpu-core.c
> > > +++ b/hw/ppc/cpu-core.c
> > > @@ -7,6 +7,8 @@
> > >  
> > >  #include "hw/qdev.h"
> > >  #include "hw/ppc/cpu-core.h"
> > > +#include "hw/boards.h"
> > > +#include <sysemu/cpus.h>
> > >  
> > >  static int ppc_cpu_core_realize_child(Object *child, void *opaque)
> > >  {
> > > @@ -32,10 +34,25 @@ static void ppc_cpu_core_class_init(ObjectClass *oc, void *data)
> > >      dc->realize = ppc_cpu_core_realize;
> > >  }
> > >  
> > > +static void ppc_cpu_core_instance_init(Object *obj)
> > > +{
> > > +    int i;
> > > +    PowerPCCPU *cpu = NULL;
> > > +    MachineState *machine = MACHINE(qdev_get_machine());
> > > +
> > > +    for (i = 0; i < smp_threads; i++) {
> > > +        cpu = POWERPC_CPU(cpu_ppc_create(TYPE_POWERPC_CPU, machine->cpu_model));
> > > +        object_property_add_child(obj, "thread[*]", OBJECT(cpu), &error_abort);
> > > +        object_unref(OBJECT(cpu));
> > > +    }
> > > +}
> > > +
> > >  static const TypeInfo ppc_cpu_core_type_info = {
> > >      .name = TYPE_POWERPC_CPU_CORE,
> > >      .parent = TYPE_DEVICE,
> > >      .class_init = ppc_cpu_core_class_init,
> > > +    .instance_init = ppc_cpu_core_instance_init,
> > > +    .instance_size = sizeof(PowerPCCPUCore),
> > 
> > The PowerPCCPUCore structure isn't defined in this patch (I assume it
> > already existed), which suggests that setting the instance_size should
> > have already been in an earlier patch.
> 
> PowerPCCPUCore is already defined, but I put the instance_size here
> since I needed instance_init only here. I thought it is better to
> have instance_init and instance_size populated together.

Hm, yes, it does make sense to declare instance_init and instance_size
together.  It also makes sense to declare instance_size and the
associated type together.

Maybe folding this together with patch 8/23 would make sense?
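
For illustration only, a minimal sketch of why instance_size, instance_init
and the instance struct belong together. This is a toy model, not QOM: the
toy_type_info fields only loosely mirror the TypeInfo fields named above, and
the thread count is a made-up value.

```c
#include <stddef.h>
#include <stdlib.h>

/* Toy type descriptor with the two fields under discussion. */
struct toy_type_info {
    const char *name;
    size_t instance_size;
    void (*instance_init)(void *obj);
};

/* Allocate a zeroed instance of the declared size, then run its init hook. */
static void *toy_object_new(const struct toy_type_info *ti)
{
    void *obj = calloc(1, ti->instance_size);

    if (obj && ti->instance_init) {
        ti->instance_init(obj);
    }
    return obj;
}

/*
 * Toy core: the init hook writes to nr_threads, so instance_size must
 * describe this struct rather than a smaller parent type.
 */
struct toy_core {
    int parent_state;
    int nr_threads;
};

static void toy_core_instance_init(void *obj)
{
    struct toy_core *core = obj;

    core->nr_threads = 8; /* made-up value for the sketch */
}

static const struct toy_type_info toy_core_type_info = {
    .name = "toy-cpu-core",
    .instance_size = sizeof(struct toy_core),
    .instance_init = toy_core_instance_init,
};
```

An init hook that touches fields of the derived struct is only safe if the
allocation was sized for that struct, which is one argument for introducing
all three in the same patch.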

> > >  };
> > >  
> > >  static void ppc_cpu_core_register_types(void)
> > > diff --git a/hw/ppc/cpu-socket.c b/hw/ppc/cpu-socket.c
> > > index 602a060..f901336 100644
> > > --- a/hw/ppc/cpu-socket.c
> > > +++ b/hw/ppc/cpu-socket.c
> > > @@ -8,6 +8,7 @@
> > >  #include "hw/qdev.h"
> > >  #include "hw/ppc/cpu-socket.h"
> > >  #include "sysemu/cpus.h"
> > > +#include "cpu.h"
> > >  
> > >  static int ppc_cpu_socket_realize_child(Object *child, void *opaque)
> > >  {
> > > @@ -33,10 +34,24 @@ static void ppc_cpu_socket_class_init(ObjectClass *oc, void *data)
> > >      dc->realize = ppc_cpu_socket_realize;
> > >  }
> > >  
> > > +static void ppc_cpu_socket_instance_init(Object *obj)
> > > +{
> > > +    int i;
> > > +    Object *core;
> > > +
> > > +    for (i = 0; i < smp_cores; i++) {
> > > +        core = object_new(TYPE_POWERPC_CPU_CORE);
> > > +        object_property_add_child(obj, "core[*]", core, &error_abort);
> > > +        object_unref(core);
> > > +    }
> > > +}
> > > +
> > >  static const TypeInfo ppc_cpu_socket_type_info = {
> > >      .name = TYPE_POWERPC_CPU_SOCKET,
> > >      .parent = TYPE_CPU_SOCKET,
> > >      .class_init = ppc_cpu_socket_class_init,
> > > +    .instance_init = ppc_cpu_socket_instance_init,
> > > +    .instance_size = sizeof(PowerPCCPUSocket),
> > 
> > Likewise for PowerPCCPUSocket.
> > >  
> > > +/*
> > > + * This is essentially same as cpu_generic_init() but without a set
> > > + * realize call.
> > > + */
> > 
> > In which case it would probably make more sense to have this be a
> > generic function, and implement cpu_generic_init() in terms of it.
> 
> Actually multiple people are touching that part of the code, so I
> figured it will be a bit easier for now to contain the changes within ppc.
> But yes, eventually we should do what you are suggesting.

Ok.


-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [Qemu-devel] [RFC PATCH v2 22/23] spapr: Support ibm, dynamic-reconfiguration-memory
  2015-03-23 13:36 ` [Qemu-devel] [RFC PATCH v2 22/23] spapr: Support ibm, dynamic-reconfiguration-memory Bharata B Rao
@ 2015-03-26  3:44   ` David Gibson
  2015-03-30  9:11     ` Bharata B Rao
  0 siblings, 1 reply; 74+ messages in thread
From: David Gibson @ 2015-03-26  3:44 UTC (permalink / raw)
  To: Bharata B Rao
  Cc: mdroth, agraf, qemu-devel, qemu-ppc, tyreld, nfont, imammedo, afaerber

[-- Attachment #1: Type: text/plain, Size: 17874 bytes --]

On Mon, Mar 23, 2015 at 07:06:03PM +0530, Bharata B Rao wrote:
> Parse ibm,architecture.vec table obtained from the guest and enable
> memory node configuration via ibm,dynamic-reconfiguration-memory if guest
> supports it. This is in preparation to support memory hotplug for
> sPAPR guests.
> 
> This changes the way memory node configuration is done. Currently all
> memory nodes are built upfront. But after this patch, only memory@0 node
> for RMA is built upfront. Guest kernel boots with just that and rest of
> the memory nodes (via memory@XXX or ibm,dynamic-reconfiguration-memory)
> are built when guest does ibm,client-architecture-support call.
> 
> Note: This patch needs a SLOF enhancement which is already part of
> upstream SLOF.
> 
> Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>

> ---
>  docs/specs/ppc-spapr-hotplug.txt |  48 +++++++++
>  hw/ppc/spapr.c                   | 228 +++++++++++++++++++++++++++++++--------
>  hw/ppc/spapr_hcall.c             |  51 +++++++--
>  include/hw/ppc/spapr.h           |  15 ++-
>  4 files changed, 293 insertions(+), 49 deletions(-)
> 
> diff --git a/docs/specs/ppc-spapr-hotplug.txt b/docs/specs/ppc-spapr-hotplug.txt
> index 46e0719..9d574b5 100644
> --- a/docs/specs/ppc-spapr-hotplug.txt
> +++ b/docs/specs/ppc-spapr-hotplug.txt
> @@ -302,4 +302,52 @@ consisting of <phys>, <size> and <maxcpus>.
>  pseries guests use this property to note the maximum allowed CPUs for the
>  guest.
>  
> +== ibm,dynamic-reconfiguration-memory ==
> +
> +ibm,dynamic-reconfiguration-memory is a device tree node that represents
> +dynamically reconfigurable logical memory blocks (LMB). This node
> +is generated only when the guest advertises the support for it via
> +ibm,client-architecture-support call. Memory that is not dynamically
> +reconfigurable is represented by /memory nodes. The properties of this
> +node that are of interest to the sPAPR memory hotplug implementation
> +in QEMU are described here.
> +
> +ibm,lmb-size
> +
> +This 64bit integer defines the size of each dynamically reconfigurable LMB.
> +
> +ibm,associativity-lookup-arrays
> +
> +This property defines a lookup array in which the NUMA associativity
> +information for each LMB can be found. It is a property encoded array
> +that begins with an integer M, the number of associativity lists followed
> +by an integer N, the number of entries per associativity list and terminated
> +by M associativity lists each of length N integers.
> +
> +This property provides the same information as given by ibm,associativity
> +property in a /memory node. Each assigned LMB has an index value between
> +0 and M-1 which is used as an index into this table to select which
> +associativity list to use for the LMB. This index value for each LMB
> +is defined in ibm,dynamic-memory property.
> +
> +ibm,dynamic-memory
> +
> +This property describes the dynamically reconfigurable memory. It is a
> +property encoded array that has an integer N, the number of LMBs followed
> +by N LMB list entries.
> +
> +Each LMB list entry consists of the following elements:
> +
> +- Logical address of the start of the LMB encoded as a 64bit integer. This
> +  corresponds to reg property in /memory node.
> +- DRC index of the LMB that corresponds to ibm,my-drc-index property
> +  in a /memory node.
> +- Four bytes reserved for expansion.
> +- Associativity list index for the LMB that is used as an index into
> +  ibm,associativity-lookup-arrays property described earlier. This
> +  is used to retrieve the right associativity list to be used for this
> +  LMB.
> +- A 32bit flags word. The bit at bit position 0x00000008 defines whether
> +  the LMB is assigned to the partition as of boot time.
> +
>  [1] http://thread.gmane.org/gmane.linux.ports.ppc.embedded/75350/focus=106867
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index e43bb49..4e844ab 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -541,42 +541,6 @@ static void *spapr_create_fdt_skel(hwaddr initrd_base,
>      return fdt;
>  }
>  
> -int spapr_h_cas_compose_response(target_ulong addr, target_ulong size)
> -{
> -    void *fdt, *fdt_skel;
> -    sPAPRDeviceTreeUpdateHeader hdr = { .version_id = 1 };
> -
> -    size -= sizeof(hdr);
> -
> -    /* Create sceleton */
> -    fdt_skel = g_malloc0(size);
> -    _FDT((fdt_create(fdt_skel, size)));
> -    _FDT((fdt_begin_node(fdt_skel, "")));
> -    _FDT((fdt_end_node(fdt_skel)));
> -    _FDT((fdt_finish(fdt_skel)));
> -    fdt = g_malloc0(size);
> -    _FDT((fdt_open_into(fdt_skel, fdt, size)));
> -    g_free(fdt_skel);
> -
> -    /* Fix skeleton up */
> -    _FDT((spapr_fixup_cpu_dt(fdt, spapr)));
> -
> -    /* Pack resulting tree */
> -    _FDT((fdt_pack(fdt)));
> -
> -    if (fdt_totalsize(fdt) + sizeof(hdr) > size) {
> -        trace_spapr_cas_failed(size);
> -        return -1;
> -    }
> -
> -    cpu_physical_memory_write(addr, &hdr, sizeof(hdr));
> -    cpu_physical_memory_write(addr + sizeof(hdr), fdt, fdt_totalsize(fdt));
> -    trace_spapr_cas_continue(fdt_totalsize(fdt) + sizeof(hdr));
> -    g_free(fdt);
> -
> -    return 0;
> -}
> -
>  static void spapr_populate_memory_node(void *fdt, int nodeid, hwaddr start,
>                                         hwaddr size)
>  {
> @@ -630,7 +594,6 @@ static int spapr_populate_memory(sPAPREnvironment *spapr, void *fdt)
>          }
>          if (!mem_start) {
>              /* ppc_spapr_init() checks for rma_size <= node0_size already */
> -            spapr_populate_memory_node(fdt, i, 0, spapr->rma_size);
>              mem_start += spapr->rma_size;
>              node_size -= spapr->rma_size;
>          }
> @@ -775,6 +738,186 @@ static void spapr_populate_cpu_dt_node(void *fdt, sPAPREnvironment *spapr)
>  
>  }
>  
> +/*
> + * TODO: Take care of sparsemem configuration ?
> + */
> +static uint64_t numa_node_end(uint32_t nodeid)
> +{
> +    uint32_t i = 0;
> +    uint64_t addr = 0;
> +
> +    do {
> +        addr += numa_info[i].node_mem;
> +    } while (++i <= nodeid);
> +
> +    return addr;
> +}
> +
> +static uint64_t numa_node_start(uint32_t nodeid)
> +{
> +    if (!nodeid) {
> +        return 0;
> +    } else {
> +        return numa_node_end(nodeid - 1);
> +    }
> +}
> +
> +/*
> + * Given the addr, return the NUMA node to which the address belongs to.
> + */
> +static uint32_t get_numa_node(uint64_t addr)
> +{
> +    uint32_t i;
> +
> +    for (i = 0; i < nb_numa_nodes; i++) {
> +        if ((addr >= numa_node_start(i)) && (addr < numa_node_end(i))) {
> +            return i;
> +        }
> +    }

This function is O(N^2) in number of nodes, which is a bit hideous for
something so simple.
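
As a rough illustration of a linear alternative, the lookup can keep a
running end address instead of recomputing it for every node.  This is
only a sketch: numa_node_of() and its node_mem/nb_nodes parameters are
hypothetical stand-ins for numa_info[i].node_mem and nb_numa_nodes, and
it assumes nodes are laid out contiguously from address 0, as the
helpers in the patch do.

```c
#include <stdint.h>

/*
 * Hypothetical helper: node_mem[i] is the size in bytes of node i
 * (a stand-in for numa_info[i].node_mem), nb_nodes the node count.
 * A single pass with a running end address gives O(N) lookup.
 */
static uint32_t numa_node_of(const uint64_t *node_mem, uint32_t nb_nodes,
                             uint64_t addr)
{
    uint64_t end = 0;
    uint32_t i;

    for (i = 0; i < nb_nodes; i++) {
        end += node_mem[i];      /* end address (exclusive) of node i */
        if (addr < end) {
            return i;
        }
    }

    /* Same fallback as the patch: unassigned memory goes to node 0 */
    return 0;
}
```

Because the ranges are contiguous, the first node whose end lies above
addr is the same node the start/end comparison in the patch would pick.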

> +    /* Unassigned memory goes to node 0 by default */
> +    return 0;
> +}
> +
> +/*
> + * Adds ibm,dynamic-reconfiguration-memory node.
> + * Refer to docs/specs/ppc-spapr-hotplug.txt for the documentation
> + * of this device tree node.
> + */
> +static int spapr_populate_drconf_memory(sPAPREnvironment *spapr, void *fdt)
> +{
> +    int ret, i, offset;
> +    uint32_t lmb_size = SPAPR_MEMORY_BLOCK_SIZE;
> +    uint32_t nr_rma_lmbs = spapr->rma_size/lmb_size;
> +    uint32_t nr_lmbs = spapr->maxram_limit/lmb_size - nr_rma_lmbs;
> +    uint32_t nr_assigned_lmbs = spapr->ram_limit/lmb_size - nr_rma_lmbs;
> +    uint32_t *int_buf, *cur_index, buf_len;
> +
> +    /* Allocate enough buffer size to fit in ibm,dynamic-memory */
> +    buf_len = nr_lmbs * SPAPR_DR_LMB_LIST_ENTRY_SIZE * sizeof(uint32_t) +
> +                sizeof(uint32_t);
> +    cur_index = int_buf = g_malloc0(buf_len);
> +
> +    offset = fdt_add_subnode(fdt, 0, "ibm,dynamic-reconfiguration-memory");
> +
> +    ret = fdt_setprop_u64(fdt, offset, "ibm,lmb-size", lmb_size);
> +    if (ret < 0) {
> +        goto out;
> +    }
> +
> +    ret = fdt_setprop_cell(fdt, offset, "ibm,memory-flags-mask", 0xff);
> +    if (ret < 0) {
> +        goto out;
> +    }
> +
> +    ret = fdt_setprop_cell(fdt, offset, "ibm,memory-preservation-time", 0x0);
> +    if (ret < 0) {
> +        goto out;
> +    }
> +
> +    /* ibm,dynamic-memory */
> +    int_buf[0] = cpu_to_be32(nr_lmbs);
> +    cur_index++;
> +    for (i = 0; i < nr_lmbs; i++) {
> +        sPAPRDRConnector *drc;
> +        sPAPRDRConnectorClass *drck;
> +        uint64_t addr;
> +        uint32_t *dynamic_memory = cur_index;
> +
> +        if (i < nr_assigned_lmbs) {
> +            addr = (i + nr_rma_lmbs) * lmb_size;
> +        } else {
> +            addr = (i - nr_assigned_lmbs) * lmb_size +
> +                SPAPR_MACHINE(qdev_get_machine())->hotplug_memory_base;
> +        }
> +        drc = spapr_dr_connector_new(qdev_get_machine(),
> +                SPAPR_DR_CONNECTOR_TYPE_LMB, addr/lmb_size);
> +        drck = SPAPR_DR_CONNECTOR_GET_CLASS(drc);
> +
> +        dynamic_memory[0] = cpu_to_be32(addr >> 32);
> +        dynamic_memory[1] = cpu_to_be32(addr & 0xffffffff);
> +        dynamic_memory[2] = cpu_to_be32(drck->get_index(drc));
> +        dynamic_memory[3] = cpu_to_be32(0); /* reserved */
> +        dynamic_memory[4] = cpu_to_be32(get_numa_node(addr));
> +        dynamic_memory[5] = (addr < spapr->ram_limit) ?
> +                            cpu_to_be32(SPAPR_LMB_FLAGS_ASSIGNED) :
> +                            cpu_to_be32(0);
> +
> +        cur_index += SPAPR_DR_LMB_LIST_ENTRY_SIZE;
> +    }
> +    ret = fdt_setprop(fdt, offset, "ibm,dynamic-memory", int_buf, buf_len);
> +    if (ret < 0) {
> +        goto out;
> +    }
> +
> +    /* ibm,associativity-lookup-arrays */
> +    cur_index = int_buf;
> +    int_buf[0] = cpu_to_be32(nb_numa_nodes);
> +    int_buf[1] = cpu_to_be32(4); /* Number of entries per associativity list */
> +    cur_index += 2;
> +    for (i = 0; i < nb_numa_nodes; i++) {
> +        uint32_t associativity[] = {
> +            cpu_to_be32(0x0),
> +            cpu_to_be32(0x0),
> +            cpu_to_be32(0x0),
> +            cpu_to_be32(i)
> +        };
> +        memcpy(cur_index, associativity, sizeof(associativity));
> +        cur_index += 4;
> +    }
> +    ret = fdt_setprop(fdt, offset, "ibm,associativity-lookup-arrays", int_buf,
> +            (cur_index - int_buf) * sizeof(uint32_t));
> +out:
> +    g_free(int_buf);
> +    return ret;
> +}
> +
> +int spapr_h_cas_compose_response(target_ulong addr, target_ulong size,
> +                                bool cpu_update, bool memory_update)
> +{
> +    void *fdt, *fdt_skel;
> +    sPAPRDeviceTreeUpdateHeader hdr = { .version_id = 1 };
> +
> +    size -= sizeof(hdr);
> +
> +    /* Create sceleton */
> +    fdt_skel = g_malloc0(size);
> +    _FDT((fdt_create(fdt_skel, size)));
> +    _FDT((fdt_begin_node(fdt_skel, "")));
> +    _FDT((fdt_end_node(fdt_skel)));
> +    _FDT((fdt_finish(fdt_skel)));
> +    fdt = g_malloc0(size);
> +    _FDT((fdt_open_into(fdt_skel, fdt, size)));
> +    g_free(fdt_skel);
> +
> +    /* Fixup cpu nodes */
> +    if (cpu_update) {
> +        _FDT((spapr_fixup_cpu_dt(fdt, spapr)));
> +    }
> +
> +    /* Generate memory nodes or ibm,dynamic-reconfiguration-memory node */
> +    if (memory_update) {
> +        _FDT((spapr_populate_drconf_memory(spapr, fdt)));
> +    } else {
> +        _FDT((spapr_populate_memory(spapr, fdt)));
> +    }
> +
> +    /* Pack resulting tree */
> +    _FDT((fdt_pack(fdt)));
> +
> +    if (fdt_totalsize(fdt) + sizeof(hdr) > size) {
> +        trace_spapr_cas_failed(size);
> +        return -1;
> +    }
> +
> +    cpu_physical_memory_write(addr, &hdr, sizeof(hdr));
> +    cpu_physical_memory_write(addr + sizeof(hdr), fdt, fdt_totalsize(fdt));
> +    trace_spapr_cas_continue(fdt_totalsize(fdt) + sizeof(hdr));
> +    g_free(fdt);
> +
> +    return 0;
> +}
> +
>  static void spapr_finalize_fdt(sPAPREnvironment *spapr,
>                                 hwaddr fdt_addr,
>                                 hwaddr rtas_addr,
> @@ -791,11 +934,12 @@ static void spapr_finalize_fdt(sPAPREnvironment *spapr,
>      /* open out the base tree into a temp buffer for the final tweaks */
>      _FDT((fdt_open_into(spapr->fdt_skel, fdt, FDT_MAX_SIZE)));
>  
> -    ret = spapr_populate_memory(spapr, fdt);
> -    if (ret < 0) {
> -        fprintf(stderr, "couldn't setup memory nodes in fdt\n");
> -        exit(1);
> -    }
> +    /*
> +     * Add memory@0 node to represent RMA. Rest of the memory is either
> +     * represented by memory nodes or ibm,dynamic-reconfiguration-memory
> +     * node later during ibm,client-architecture-support call.
> +     */
> +    spapr_populate_memory_node(fdt, 0, 0, spapr->rma_size);
>  
>      ret = spapr_populate_vdevice(spapr->vio_bus, fdt);
>      if (ret < 0) {
> diff --git a/hw/ppc/spapr_hcall.c b/hw/ppc/spapr_hcall.c
> index 4f76f1c..20507c6 100644
> --- a/hw/ppc/spapr_hcall.c
> +++ b/hw/ppc/spapr_hcall.c
> @@ -807,6 +807,32 @@ static target_ulong h_set_mode(PowerPCCPU *cpu, sPAPREnvironment *spapr,
>      return ret;
>  }
>  
> +/*
> + * Return the offset to the requested option vector @vector in the
> + * option vector table @table.
> + */
> +static target_ulong cas_get_option_vector(int vector, target_ulong table)
> +{
> +    int i;
> +    char nr_vectors, nr_entries;
> +
> +    if (!table) {
> +        return 0;
> +    }
> +
> +    nr_vectors = (rtas_ld(table, 0) >> 24) + 1;

I don't think rtas_ld() should be used outside its intended function
in rtas.  Make a direct call to ldl_phys instead.
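
For illustration, here is the same walk written against a plain byte
buffer, which makes it clearer that only single bytes are needed (the
patch takes the top byte of a 32-bit load).  This is a sketch under
assumptions: ov_offset() is a hypothetical name, the buffer stands in
for guest memory, and the layout is simply the one implied by the
loop in the patch, not something checked against the spec.  In QEMU the
byte reads would go through a guest-physical load helper instead.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Layout assumed from the walk in the quoted patch:
 *   byte 0: number of option vectors minus one
 *   then, per vector: one length byte L followed by L + 1 data bytes
 *
 * Returns the offset of the length byte of @vector (1-based), or 0 if
 * the vector is not present.  Offset 0 is never a valid vector because
 * it holds the vector count.
 */
static size_t ov_offset(const uint8_t *table, size_t len, unsigned vector)
{
    size_t off = 1;              /* skip the vector-count byte */
    unsigned nr_vectors, i;

    if (!table || len == 0) {
        return 0;
    }

    nr_vectors = table[0] + 1u;
    if (vector == 0 || vector > nr_vectors) {
        return 0;
    }

    for (i = 0; i < vector - 1; i++) {
        if (off >= len) {
            return 0;            /* table shorter than it claims */
        }
        off += (size_t)table[off] + 2;
    }

    return off < len ? off : 0;
}
```

Using an unsigned byte type here also sidesteps sign problems that a
plain char could introduce for length bytes above 0x7f.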

> +    if (!vector || vector > nr_vectors) {
> +        return 0;
> +    }
> +    table++; /* skip nr option vectors */
> +
> +    for (i = 0; i < vector - 1; i++) {
> +        nr_entries = rtas_ld(table, 0) >> 24;
> +        table += nr_entries + 2;
> +    }
> +    return table;
> +}
> +
>  typedef struct {
>      PowerPCCPU *cpu;
>      uint32_t cpu_version;
> @@ -827,19 +853,22 @@ static void do_set_compat(void *arg)
>      ((cpuver) == CPU_POWERPC_LOGICAL_2_06_PLUS) ? 2061 : \
>      ((cpuver) == CPU_POWERPC_LOGICAL_2_07) ? 2070 : 0)
>  
> +#define OV5_DRCONF_MEMORY 0x20
> +
>  static target_ulong h_client_architecture_support(PowerPCCPU *cpu_,
>                                                    sPAPREnvironment *spapr,
>                                                    target_ulong opcode,
>                                                    target_ulong *args)
>  {
> -    target_ulong list = args[0];
> +    target_ulong list = args[0], ov_table;
>      PowerPCCPUClass *pcc_ = POWERPC_CPU_GET_CLASS(cpu_);
>      CPUState *cs;
> -    bool cpu_match = false;
> +    bool cpu_match = false, cpu_update = true, memory_update = false;
>      unsigned old_cpu_version = cpu_->cpu_version;
>      unsigned compat_lvl = 0, cpu_version = 0;
>      unsigned max_lvl = get_compat_level(cpu_->max_compat);
>      int counter;
> +    char ov5_byte2;
>  
>      /* Parse PVR list */
>      for (counter = 0; counter < 512; ++counter) {
> @@ -889,8 +918,6 @@ static target_ulong h_client_architecture_support(PowerPCCPU *cpu_,
>          }
>      }
>  
> -    /* For the future use: here @list points to the first capability */
> -
>      /* Parsing finished */
>      trace_spapr_cas_pvr(cpu_->cpu_version, cpu_match,
>                          cpu_version, pcc_->pcr_mask);
> @@ -914,14 +941,26 @@ static target_ulong h_client_architecture_support(PowerPCCPU *cpu_,
>      }
>  
>      if (!cpu_version) {
> -        return H_SUCCESS;
> +        cpu_update = false;
>      }
>  
> +    /* For the future use: here @ov_table points to the first option vector */
> +    ov_table = list;
> +
> +    list = cas_get_option_vector(5, ov_table);
>      if (!list) {
>          return H_SUCCESS;
>      }
>  
> -    if (spapr_h_cas_compose_response(args[1], args[2])) {
> +    /* @list now points to OV 5 */
> +    list += 2;
> +    ov5_byte2 = rtas_ld(list, 0) >> 24;
> +    if (ov5_byte2 & OV5_DRCONF_MEMORY) {
> +        memory_update = true;
> +    }
> +
> +    if (spapr_h_cas_compose_response(args[1], args[2], cpu_update,
> +                                    memory_update)) {
>          qemu_system_reset_request();
>      }
>  
> diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
> index 53560e9..a286fe7 100644
> --- a/include/hw/ppc/spapr.h
> +++ b/include/hw/ppc/spapr.h
> @@ -554,9 +554,22 @@ struct sPAPREventLogEntry {
>  /* 1GB alignment for hotplug memory region */
>  #define SPAPR_HOTPLUG_MEM_ALIGN (1ULL << 30)
>  
> +/*
> + * Number of 32 bit words in each LMB list entry in ibm,dynamic-memory
> + * property under ibm,dynamic-reconfiguration-memory node.
> + */
> +#define SPAPR_DR_LMB_LIST_ENTRY_SIZE 6
> +
> +/*
> + * This flag value defines the LMB as assigned in ibm,dynamic-memory
> + * property under ibm,dynamic-reconfiguration-memory node.
> + */
> +#define SPAPR_LMB_FLAGS_ASSIGNED 0x00000008
> +
>  void spapr_events_init(sPAPREnvironment *spapr);
>  void spapr_events_fdt_skel(void *fdt, uint32_t epow_irq);
> -int spapr_h_cas_compose_response(target_ulong addr, target_ulong size);
> +int spapr_h_cas_compose_response(target_ulong addr, target_ulong size,
> +                                bool cpu_update, bool memory_update);
>  sPAPRTCETable *spapr_tce_new_table(DeviceState *owner, uint32_t liobn,
>                                     uint64_t bus_offset,
>                                     uint32_t page_shift,

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [Qemu-devel] [RFC PATCH v2 23/23] spapr: Memory hotplug support
  2015-03-23 13:36 ` [Qemu-devel] [RFC PATCH v2 23/23] spapr: Memory hotplug support Bharata B Rao
@ 2015-03-26  3:57   ` David Gibson
  2015-04-13  3:03     ` Bharata B Rao
  0 siblings, 1 reply; 74+ messages in thread
From: David Gibson @ 2015-03-26  3:57 UTC (permalink / raw)
  To: Bharata B Rao
  Cc: mdroth, agraf, qemu-devel, qemu-ppc, tyreld, nfont, imammedo, afaerber

[-- Attachment #1: Type: text/plain, Size: 7870 bytes --]

On Mon, Mar 23, 2015 at 07:06:04PM +0530, Bharata B Rao wrote:
> Make use of pc-dimm infrastructure to support memory hotplug
> for PowerPC.
> 
> Modelled on i386 memory hotplug.
> 
> Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
> ---
>  hw/ppc/spapr.c        | 119 +++++++++++++++++++++++++++++++++++++++++++++++++-
>  hw/ppc/spapr_events.c |   3 ++
>  2 files changed, 120 insertions(+), 2 deletions(-)
> 
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index 4e844ab..bc46acd 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -61,7 +61,8 @@
>  #include "hw/nmi.h"
>  
>  #include "hw/compat.h"
> -
> +#include "hw/mem/pc-dimm.h"
> +#include "qapi/qmp/qerror.h"
>  #include <libfdt.h>
>  
>  /* SLOF memory layout:
> @@ -902,6 +903,10 @@ int spapr_h_cas_compose_response(target_ulong addr, target_ulong size,
>          _FDT((spapr_populate_memory(spapr, fdt)));
>      }
>  
> +    if (spapr->dr_lmb_enabled) {
> +        _FDT(spapr_drc_populate_dt(fdt, 0, NULL, SPAPR_DR_CONNECTOR_TYPE_LMB));
> +    }
> +
>      /* Pack resulting tree */
>      _FDT((fdt_pack(fdt)));
>  
> @@ -2185,6 +2190,109 @@ static void spapr_cpu_socket_unplug(HotplugHandler *hotplug_dev,
>      object_child_foreach(OBJECT(dev), spapr_cpu_core_unplug, errp);
>  }
>  
> +static void spapr_add_lmbs(uint64_t addr, uint64_t size, Error **errp)
> +{
> +    sPAPRDRConnector *drc;
> +    uint32_t nr_lmbs = size/SPAPR_MEMORY_BLOCK_SIZE;
> +    int i;
> +
> +    if (size % SPAPR_MEMORY_BLOCK_SIZE) {
> +        error_setg(errp, "Hotplugged memory size must be a multiple of "
> +                      "%d MB", SPAPR_MEMORY_BLOCK_SIZE/(1024 * 1024));

s/MB/MiB/  there seems to be precedent for using the mebibyte term in
qemu.

Also you can use the MiB #define instead of (1024 * 1024).
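
Roughly what I have in mind, as a standalone sketch.  The MiB
definition, EXAMPLE_LMB_SIZE and lmb_count() below are placeholders for
illustration only (the real code would use the existing define,
SPAPR_MEMORY_BLOCK_SIZE and error_setg()), and the block size chosen
here is arbitrary.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

#define MiB (1024ULL * 1024ULL)        /* placeholder for the existing define */
#define EXAMPLE_LMB_SIZE (256 * MiB)   /* hypothetical block size */

/*
 * Returns the number of LMBs covering @size, or -1 with a message in
 * @err if @size is not a multiple of the block size.
 */
static int64_t lmb_count(uint64_t size, char *err, size_t errlen)
{
    if (size % EXAMPLE_LMB_SIZE) {
        snprintf(err, errlen,
                 "Hotplugged memory size must be a multiple of %" PRIu64
                 " MiB", (uint64_t)(EXAMPLE_LMB_SIZE / MiB));
        return -1;
    }

    return (int64_t)(size / EXAMPLE_LMB_SIZE);
}
```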

> +        return;
> +    }
> +
> +    for (i = 0; i < nr_lmbs; i++) {
> +        drc = spapr_dr_connector_by_id(SPAPR_DR_CONNECTOR_TYPE_LMB,
> +                addr/SPAPR_MEMORY_BLOCK_SIZE);
> +        g_assert(drc);
> +        spapr_hotplug_req_add_event(drc);
> +        addr += SPAPR_MEMORY_BLOCK_SIZE;
> +    }
> +}
> +
> +static void spapr_memory_plug(HotplugHandler *hotplug_dev,
> +                         DeviceState *dev, Error **errp)
> +{
> +    int slot;
> +    Error *local_err = NULL;
> +    sPAPRMachineState *ms = SPAPR_MACHINE(hotplug_dev);
> +    MachineState *machine = MACHINE(hotplug_dev);
> +    PCDIMMDevice *dimm = PC_DIMM(dev);
> +    PCDIMMDeviceClass *ddc = PC_DIMM_GET_CLASS(dimm);
> +    MemoryRegion *mr = ddc->get_memory_region(dimm);
> +    uint64_t existing_dimms_capacity = 0;
> +    uint64_t align = TARGET_PAGE_SIZE;
> +    uint64_t addr;
> +
> +    addr = object_property_get_int(OBJECT(dimm), PC_DIMM_ADDR_PROP, &local_err);
> +    if (local_err) {
> +        goto out;
> +    }
> +
> +    if (memory_region_get_alignment(mr) && ms->enforce_aligned_dimm) {
> +        align = memory_region_get_alignment(mr);
> +    }
> +
> +    addr = pc_dimm_get_free_addr(ms->hotplug_memory_base,
> +                                 memory_region_size(&ms->hotplug_memory),
> +                                 !addr ? NULL : &addr, align,
> +                                 memory_region_size(mr), &local_err);
> +    if (local_err) {
> +        goto out;
> +    }
> +
> +    existing_dimms_capacity = pc_existing_dimms_capacity(&local_err);
> +    if (local_err) {
> +        goto out;
> +    }
> +
> +    if (existing_dimms_capacity + memory_region_size(mr) >
> +        machine->maxram_size - machine->ram_size) {
> +        error_setg(&local_err, "not enough space, currently 0x%" PRIx64
> +                   " in use of total hot pluggable 0x" RAM_ADDR_FMT,
> +                   existing_dimms_capacity,
> +                   machine->maxram_size - machine->ram_size);
> +        goto out;
> +    }
> +
> +    object_property_set_int(OBJECT(dev), addr, PC_DIMM_ADDR_PROP, &local_err);
> +    if (local_err) {
> +        goto out;
> +    }
> +    trace_mhp_pc_dimm_assigned_address(addr);
> +
> +    slot = object_property_get_int(OBJECT(dev), PC_DIMM_SLOT_PROP, &local_err);
> +    if (local_err) {
> +        goto out;
> +    }
> +
> +    slot = pc_dimm_get_free_slot(slot == PC_DIMM_UNASSIGNED_SLOT ? NULL : &slot,
> +                                 machine->ram_slots, &local_err);
> +    if (local_err) {
> +        goto out;
> +    }
> +    object_property_set_int(OBJECT(dev), slot, PC_DIMM_SLOT_PROP, &local_err);
> +    if (local_err) {
> +        goto out;
> +    }
> +    trace_mhp_pc_dimm_assigned_slot(slot);
> +
> +    if (kvm_enabled() && !kvm_has_free_slot(machine)) {
> +        error_setg(&local_err, "hypervisor has no free memory slots left");
> +        goto out;
> +    }
> +
> +    memory_region_add_subregion(&ms->hotplug_memory,
> +                                addr - ms->hotplug_memory_base, mr);
> +    vmstate_register_ram(mr, dev);
> +
> +    spapr_add_lmbs(addr, memory_region_size(mr), &local_err);

It really looks as if it should be possible to make a common function
covering most of this and pc_dimm_plug, with the only difference being
the final call to do the arch specific stuff.  Even that might be able
to just be a hotplug handler call if you add a parameter for the
appropriate HotplugHandler object, instead of assuming the ACPI device
on the PC side.
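
A toy model of the shape I mean, not actual QEMU code: the common
function does the generic checks and then calls whatever hook the
caller passes in.  All names here (plug_notify_fn, common_dimm_plug,
count_blocks, EXAMPLE_BLOCK) are made up for the example.

```c
#include <stdint.h>

#define EXAMPLE_BLOCK 256   /* hypothetical block size for the example hook */

/* Hypothetical arch hook: called once the common checks have passed. */
typedef int (*plug_notify_fn)(uint64_t addr, uint64_t size, void *opaque);

/*
 * Shared part of the plug path: verify the new DIMM fits in the
 * remaining hotpluggable capacity, then hand off to the arch hook.
 * Returns 0 on success, -1 if there is not enough space.
 */
static int common_dimm_plug(uint64_t capacity, uint64_t in_use,
                            uint64_t addr, uint64_t size,
                            plug_notify_fn notify, void *opaque)
{
    if (in_use > capacity || size > capacity - in_use) {
        return -1;
    }

    return notify ? notify(addr, size, opaque) : 0;
}

/* Example hook: counts how many fixed-size blocks would be signalled. */
static int count_blocks(uint64_t addr, uint64_t size, void *opaque)
{
    uint64_t *count = opaque;

    (void)addr;
    *count += size / EXAMPLE_BLOCK;
    return 0;
}
```

With that split, the sPAPR side would only supply the equivalent of
count_blocks (the per-LMB hotplug events), and the PC side its own
notification.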

> +out:
> +    error_propagate(errp, local_err);
> +}
> +
>  static void spapr_machine_device_plug(HotplugHandler *hotplug_dev,
>                                        DeviceState *dev, Error **errp)
>  {
> @@ -2197,6 +2305,9 @@ static void spapr_machine_device_plug(HotplugHandler *hotplug_dev,
>          if (dev->hotplugged && spapr->dr_cpu_enabled) {
>              spapr_cpu_plug(hotplug_dev, dev, errp);
>          }
> +    } else if (object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM) &&
> +                spapr->dr_lmb_enabled) {
> +        spapr_memory_plug(hotplug_dev, dev, errp);
>      }
>  }
>  
> @@ -2207,6 +2318,9 @@ static void spapr_machine_device_unplug(HotplugHandler *hotplug_dev,
>          if (dev->hotplugged && spapr->dr_cpu_enabled) {
>              spapr_cpu_socket_unplug(hotplug_dev, dev, errp);
>          }
> +    } else if (object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM) &&
> +                spapr->dr_lmb_enabled) {
> +        error_setg(errp, "Memory hot unplug not supported by sPAPR");
>      }
>  }
>  
> @@ -2214,7 +2328,8 @@ static HotplugHandler *spapr_get_hotpug_handler(MachineState *machine,
>                                               DeviceState *dev)
>  {
>      if (object_dynamic_cast(OBJECT(dev), TYPE_CPU) ||
> -        object_dynamic_cast(OBJECT(dev), TYPE_CPU_SOCKET)) {
> +        object_dynamic_cast(OBJECT(dev), TYPE_CPU_SOCKET) ||
> +        object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM)) {
>          return HOTPLUG_HANDLER(machine);
>      }
>      return NULL;
> diff --git a/hw/ppc/spapr_events.c b/hw/ppc/spapr_events.c
> index 4ae818a..e2a22d2 100644
> --- a/hw/ppc/spapr_events.c
> +++ b/hw/ppc/spapr_events.c
> @@ -431,6 +431,9 @@ static void spapr_hotplug_req_event(sPAPRDRConnector *drc, uint8_t hp_action)
>      case SPAPR_DR_CONNECTOR_TYPE_CPU:
>          hp->hotplug_type = RTAS_LOG_V6_HP_TYPE_CPU;
>          break;
> +    case SPAPR_DR_CONNECTOR_TYPE_LMB:
> +        hp->hotplug_type = RTAS_LOG_V6_HP_TYPE_MEMORY;
> +        break;
>      default:
>          /* we shouldn't be signaling hotplug events for resources
>           * that don't support them
> -- 
> 2.1.0
> 

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [Qemu-devel] [RFC PATCH v2 00/23] CPU and Memory hotplug for PowerPC sPAPR guests
  2015-03-23 13:35 [Qemu-devel] [RFC PATCH v2 00/23] CPU and Memory hotplug for PowerPC sPAPR guests Bharata B Rao
                   ` (22 preceding siblings ...)
  2015-03-23 13:36 ` [Qemu-devel] [RFC PATCH v2 23/23] spapr: Memory hotplug support Bharata B Rao
@ 2015-03-26  3:58 ` David Gibson
  2015-03-26  4:16   ` Bharata B Rao
  2015-04-06 10:19 ` Bharata B Rao
  24 siblings, 1 reply; 74+ messages in thread
From: David Gibson @ 2015-03-26  3:58 UTC (permalink / raw)
  To: Bharata B Rao
  Cc: mdroth, agraf, qemu-devel, qemu-ppc, tyreld, nfont, imammedo, afaerber

[-- Attachment #1: Type: text/plain, Size: 483 bytes --]

On Mon, Mar 23, 2015 at 07:05:41PM +0530, Bharata B Rao wrote:
> Hi,
> 
> This is the version 2 of the patchset that provides CPU and memory hotplug
> support for PowerPC sPAPR guests.

Thanks for this Bharata.  This looks much, much better than v1,
pretty close to being ready.

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [Qemu-devel] [RFC PATCH v2 00/23] CPU and Memory hotplug for PowerPC sPAPR guests
  2015-03-26  3:58 ` [Qemu-devel] [RFC PATCH v2 00/23] CPU and Memory hotplug for PowerPC sPAPR guests David Gibson
@ 2015-03-26  4:16   ` Bharata B Rao
  0 siblings, 0 replies; 74+ messages in thread
From: Bharata B Rao @ 2015-03-26  4:16 UTC (permalink / raw)
  To: David Gibson
  Cc: mdroth, agraf, qemu-devel, qemu-ppc, tyreld, nfont, imammedo, afaerber

On Thu, Mar 26, 2015 at 02:58:12PM +1100, David Gibson wrote:
> On Mon, Mar 23, 2015 at 07:05:41PM +0530, Bharata B Rao wrote:
> > Hi,
> > 
> > This is the version 2 of the patchset that provides CPU and memory hotplug
> > support for PowerPC sPAPR guests.
> 
> Thanks for this Bharata.  This looks much, much better than v1,
> pretty close to being ready.

David - In fact I should thank you for your timely and valuable review.

Regards,
Bharata.

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [Qemu-devel] [RFC PATCH v2 22/23] spapr: Support ibm, dynamic-reconfiguration-memory
  2015-03-26  3:44   ` David Gibson
@ 2015-03-30  9:11     ` Bharata B Rao
  2015-03-31  2:19       ` David Gibson
  0 siblings, 1 reply; 74+ messages in thread
From: Bharata B Rao @ 2015-03-30  9:11 UTC (permalink / raw)
  To: David Gibson
  Cc: mdroth, agraf, qemu-devel, qemu-ppc, tyreld, nfont, imammedo, afaerber

On Thu, Mar 26, 2015 at 02:44:17PM +1100, David Gibson wrote:
> > +/*
> > + * TODO: Take care of sparsemem configuration ?
> > + */
> > +static uint64_t numa_node_end(uint32_t nodeid)
> > +{
> > +    uint32_t i = 0;
> > +    uint64_t addr = 0;
> > +
> > +    do {
> > +        addr += numa_info[i].node_mem;
> > +    } while (++i <= nodeid);
> > +
> > +    return addr;
> > +}
> > +
> > +static uint64_t numa_node_start(uint32_t nodeid)
> > +{
> > +    if (!nodeid) {
> > +        return 0;
> > +    } else {
> > +        return numa_node_end(nodeid - 1);
> > +    }
> > +}
> > +
> > +/*
> > + * Given the addr, return the NUMA node to which the address belongs to.
> > + */
> > +static uint32_t get_numa_node(uint64_t addr)
> > +{
> > +    uint32_t i;
> > +
> > +    for (i = 0; i < nb_numa_nodes; i++) {
> > +        if ((addr >= numa_node_start(i)) && (addr < numa_node_end(i))) {
> > +            return i;
> > +        }
> > +    }
> 
> This function is O(N^2) in number of nodes, which is a bit hideous for
> something so simple.

Will something like the below work? Will all archs be OK with this?

numa: Store start and end address range of each node in numa_info

Keep track of start and end address of each NUMA node in numa_info
structure so that lookup of node by address becomes easier.

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
---
 include/sysemu/numa.h |    3 +++
 numa.c                |   28 ++++++++++++++++++++++++++++
 2 files changed, 31 insertions(+)

diff --git a/include/sysemu/numa.h b/include/sysemu/numa.h
index 5633b85..6dd6387 100644
--- a/include/sysemu/numa.h
+++ b/include/sysemu/numa.h
@@ -14,11 +14,14 @@ typedef struct node_info {
     DECLARE_BITMAP(node_cpu, MAX_CPUMASK_BITS);
     struct HostMemoryBackend *node_memdev;
     bool present;
+    uint64_t mem_start;
+    uint64_t mem_end;
 } NodeInfo;
 extern NodeInfo numa_info[MAX_NODES];
 void parse_numa_opts(void);
 void numa_post_machine_init(void);
 void query_numa_node_mem(uint64_t node_mem[]);
 extern QemuOptsList qemu_numa_opts;
+uint32_t get_numa_node(uint64_t addr, Error **errp);
 
 #endif
diff --git a/numa.c b/numa.c
index 5634bf0..d0eb647 100644
--- a/numa.c
+++ b/numa.c
@@ -53,6 +53,25 @@ static int max_numa_nodeid; /* Highest specified NUMA node ID, plus one.
 int nb_numa_nodes;
 NodeInfo numa_info[MAX_NODES];
 
+/*
+ * Given an address, return the index of the NUMA node to which the
+ * address belongs to.
+ */
+uint32_t get_numa_node(uint64_t addr, Error **errp)
+{
+    uint32_t i;
+
+    for (i = 0; i < nb_numa_nodes; i++) {
+        if (addr >= numa_info[i].mem_start && addr < numa_info[i].mem_end) {
+            return i;
+        }
+    }
+    error_setg(errp, "Address 0x" RAM_ADDR_FMT " doesn't belong to any NUMA node", addr);
+
+    /* Return Node 0 for unclaimed address */
+    return 0;
+}
+
 static void numa_node_parse(NumaNodeOptions *node, QemuOpts *opts, Error **errp)
 {
     uint16_t nodenr;
@@ -119,6 +138,15 @@ static void numa_node_parse(NumaNodeOptions *node, QemuOpts *opts, Error **errp)
         numa_info[nodenr].node_mem = object_property_get_int(o, "size", NULL);
         numa_info[nodenr].node_memdev = MEMORY_BACKEND(o);
     }
+
+    if (nodenr) {
+        numa_info[nodenr].mem_start = numa_info[nodenr-1].mem_end;
+    } else {
+        numa_info[nodenr].mem_start = 0;
+    }
+    numa_info[nodenr].mem_end = numa_info[nodenr].mem_start +
+                                   numa_info[nodenr].node_mem;
+
     numa_info[nodenr].present = true;
     max_numa_nodeid = MAX(max_numa_nodeid, nodenr + 1);
 }
> > +    /* Unassigned memory goes to node 0 by default */
> > +    return 0;
> > +}
> > +
> > +/*
> > + * Adds ibm,dynamic-reconfiguration-memory node.
> > + * Refer to docs/specs/ppc-spapr-hotplug.txt for the documentation
> > + * of this device tree node.
> > + */
> > +static int spapr_populate_drconf_memory(sPAPREnvironment *spapr, void *fdt)
> > +{
> > +    int ret, i, offset;
> > +    uint32_t lmb_size = SPAPR_MEMORY_BLOCK_SIZE;
> > +    uint32_t nr_rma_lmbs = spapr->rma_size/lmb_size;
> > +    uint32_t nr_lmbs = spapr->maxram_limit/lmb_size - nr_rma_lmbs;
> > +    uint32_t nr_assigned_lmbs = spapr->ram_limit/lmb_size - nr_rma_lmbs;
> > +    uint32_t *int_buf, *cur_index, buf_len;
> > +
> > +    /* Allocate enough buffer size to fit in ibm,dynamic-memory */
> > +    buf_len = nr_lmbs * SPAPR_DR_LMB_LIST_ENTRY_SIZE * sizeof(uint32_t) +
> > +                sizeof(uint32_t);
> > +    cur_index = int_buf = g_malloc0(buf_len);
> > +
> > +    offset = fdt_add_subnode(fdt, 0, "ibm,dynamic-reconfiguration-memory");
> > +
> > +    ret = fdt_setprop_u64(fdt, offset, "ibm,lmb-size", lmb_size);
> > +    if (ret < 0) {
> > +        goto out;
> > +    }
> > +
> > +    ret = fdt_setprop_cell(fdt, offset, "ibm,memory-flags-mask", 0xff);
> > +    if (ret < 0) {
> > +        goto out;
> > +    }
> > +
> > +    ret = fdt_setprop_cell(fdt, offset, "ibm,memory-preservation-time", 0x0);
> > +    if (ret < 0) {
> > +        goto out;
> > +    }
> > +
> > +    /* ibm,dynamic-memory */
> > +    int_buf[0] = cpu_to_be32(nr_lmbs);
> > +    cur_index++;
> > +    for (i = 0; i < nr_lmbs; i++) {
> > +        sPAPRDRConnector *drc;
> > +        sPAPRDRConnectorClass *drck;
> > +        uint64_t addr;
> > +        uint32_t *dynamic_memory = cur_index;
> > +
> > +        if (i < nr_assigned_lmbs) {
> > +            addr = (i + nr_rma_lmbs) * lmb_size;
> > +        } else {
> > +            addr = (i - nr_assigned_lmbs) * lmb_size +
> > +                SPAPR_MACHINE(qdev_get_machine())->hotplug_memory_base;
> > +        }
> > +        drc = spapr_dr_connector_new(qdev_get_machine(),
> > +                SPAPR_DR_CONNECTOR_TYPE_LMB, addr/lmb_size);
> > +        drck = SPAPR_DR_CONNECTOR_GET_CLASS(drc);
> > +
> > +        dynamic_memory[0] = cpu_to_be32(addr >> 32);
> > +        dynamic_memory[1] = cpu_to_be32(addr & 0xffffffff);
> > +        dynamic_memory[2] = cpu_to_be32(drck->get_index(drc));
> > +        dynamic_memory[3] = cpu_to_be32(0); /* reserved */
> > +        dynamic_memory[4] = cpu_to_be32(get_numa_node(addr));
> > +        dynamic_memory[5] = (addr < spapr->ram_limit) ?
> > +                            cpu_to_be32(SPAPR_LMB_FLAGS_ASSIGNED) :
> > +                            cpu_to_be32(0);
> > +
> > +        cur_index += SPAPR_DR_LMB_LIST_ENTRY_SIZE;
> > +    }
> > +    ret = fdt_setprop(fdt, offset, "ibm,dynamic-memory", int_buf, buf_len);
> > +    if (ret < 0) {
> > +        goto out;
> > +    }
> > +
> > +    /* ibm,associativity-lookup-arrays */
> > +    cur_index = int_buf;
> > +    int_buf[0] = cpu_to_be32(nb_numa_nodes);
> > +    int_buf[1] = cpu_to_be32(4); /* Number of entries per associativity list */
> > +    cur_index += 2;
> > +    for (i = 0; i < nb_numa_nodes; i++) {
> > +        uint32_t associativity[] = {
> > +            cpu_to_be32(0x0),
> > +            cpu_to_be32(0x0),
> > +            cpu_to_be32(0x0),
> > +            cpu_to_be32(i)
> > +        };
> > +        memcpy(cur_index, associativity, sizeof(associativity));
> > +        cur_index += 4;
> > +    }
> > +    ret = fdt_setprop(fdt, offset, "ibm,associativity-lookup-arrays", int_buf,
> > +            (cur_index - int_buf) * sizeof(uint32_t));
> > +out:
> > +    g_free(int_buf);
> > +    return ret;
> > +}
> > +
> > +int spapr_h_cas_compose_response(target_ulong addr, target_ulong size,
> > +                                bool cpu_update, bool memory_update)
> > +{
> > +    void *fdt, *fdt_skel;
> > +    sPAPRDeviceTreeUpdateHeader hdr = { .version_id = 1 };
> > +
> > +    size -= sizeof(hdr);
> > +
> > +    /* Create sceleton */
> > +    fdt_skel = g_malloc0(size);
> > +    _FDT((fdt_create(fdt_skel, size)));
> > +    _FDT((fdt_begin_node(fdt_skel, "")));
> > +    _FDT((fdt_end_node(fdt_skel)));
> > +    _FDT((fdt_finish(fdt_skel)));
> > +    fdt = g_malloc0(size);
> > +    _FDT((fdt_open_into(fdt_skel, fdt, size)));
> > +    g_free(fdt_skel);
> > +
> > +    /* Fixup cpu nodes */
> > +    if (cpu_update) {
> > +        _FDT((spapr_fixup_cpu_dt(fdt, spapr)));
> > +    }
> > +
> > +    /* Generate memory nodes or ibm,dynamic-reconfiguration-memory node */
> > +    if (memory_update) {
> > +        _FDT((spapr_populate_drconf_memory(spapr, fdt)));
> > +    } else {
> > +        _FDT((spapr_populate_memory(spapr, fdt)));
> > +    }
> > +
> > +    /* Pack resulting tree */
> > +    _FDT((fdt_pack(fdt)));
> > +
> > +    if (fdt_totalsize(fdt) + sizeof(hdr) > size) {
> > +        trace_spapr_cas_failed(size);
> > +        return -1;
> > +    }
> > +
> > +    cpu_physical_memory_write(addr, &hdr, sizeof(hdr));
> > +    cpu_physical_memory_write(addr + sizeof(hdr), fdt, fdt_totalsize(fdt));
> > +    trace_spapr_cas_continue(fdt_totalsize(fdt) + sizeof(hdr));
> > +    g_free(fdt);
> > +
> > +    return 0;
> > +}
> > +
> >  static void spapr_finalize_fdt(sPAPREnvironment *spapr,
> >                                 hwaddr fdt_addr,
> >                                 hwaddr rtas_addr,
> > @@ -791,11 +934,12 @@ static void spapr_finalize_fdt(sPAPREnvironment *spapr,
> >      /* open out the base tree into a temp buffer for the final tweaks */
> >      _FDT((fdt_open_into(spapr->fdt_skel, fdt, FDT_MAX_SIZE)));
> >  
> > -    ret = spapr_populate_memory(spapr, fdt);
> > -    if (ret < 0) {
> > -        fprintf(stderr, "couldn't setup memory nodes in fdt\n");
> > -        exit(1);
> > -    }
> > +    /*
> > +     * Add memory@0 node to represent RMA. Rest of the memory is either
> > +     * represented by memory nodes or ibm,dynamic-reconfiguration-memory
> > +     * node later during ibm,client-architecture-support call.
> > +     */
> > +    spapr_populate_memory_node(fdt, 0, 0, spapr->rma_size);
> >  
> >      ret = spapr_populate_vdevice(spapr->vio_bus, fdt);
> >      if (ret < 0) {
> > diff --git a/hw/ppc/spapr_hcall.c b/hw/ppc/spapr_hcall.c
> > index 4f76f1c..20507c6 100644
> > --- a/hw/ppc/spapr_hcall.c
> > +++ b/hw/ppc/spapr_hcall.c
> > @@ -807,6 +807,32 @@ static target_ulong h_set_mode(PowerPCCPU *cpu, sPAPREnvironment *spapr,
> >      return ret;
> >  }
> >  
> > +/*
> > + * Return the offset to the requested option vector @vector in the
> > + * option vector table @table.
> > + */
> > +static target_ulong cas_get_option_vector(int vector, target_ulong table)
> > +{
> > +    int i;
> > +    char nr_vectors, nr_entries;
> > +
> > +    if (!table) {
> > +        return 0;
> > +    }
> > +
> > +    nr_vectors = (rtas_ld(table, 0) >> 24) + 1;
> 
> I don't think rtas_ld() should be used outside its intended function
> in rtas.  Make a direct call to ldl_phys instead.

OK, but @table here is an RTAS arg passed from the main RTAS routine to this
function. Can rtas_ld() still not be used?

Regards,
Bharata.

^ permalink raw reply related	[flat|nested] 74+ messages in thread

* Re: [Qemu-devel] [RFC PATCH v2 22/23] spapr: Support ibm,dynamic-reconfiguration-memory
  2015-03-30  9:11     ` Bharata B Rao
@ 2015-03-31  2:19       ` David Gibson
  0 siblings, 0 replies; 74+ messages in thread
From: David Gibson @ 2015-03-31  2:19 UTC (permalink / raw)
  To: Bharata B Rao
  Cc: mdroth, agraf, qemu-devel, pbonzini, qemu-ppc, tyreld, nfont,
	imammedo, afaerber

On Mon, Mar 30, 2015 at 02:41:50PM +0530, Bharata B Rao wrote:
> On Thu, Mar 26, 2015 at 02:44:17PM +1100, David Gibson wrote:
> > > +/*
> > > + * TODO: Take care of sparsemem configuration ?
> > > + */
> > > +static uint64_t numa_node_end(uint32_t nodeid)
> > > +{
> > > +    uint32_t i = 0;
> > > +    uint64_t addr = 0;
> > > +
> > > +    do {
> > > +        addr += numa_info[i].node_mem;
> > > +    } while (++i <= nodeid);
> > > +
> > > +    return addr;
> > > +}
> > > +
> > > +static uint64_t numa_node_start(uint32_t nodeid)
> > > +{
> > > +    if (!nodeid) {
> > > +        return 0;
> > > +    } else {
> > > +        return numa_node_end(nodeid - 1);
> > > +    }
> > > +}
> > > +
> > > +/*
> > > + * Given the addr, return the NUMA node to which the address belongs to.
> > > + */
> > > +static uint32_t get_numa_node(uint64_t addr)
> > > +{
> > > +    uint32_t i;
> > > +
> > > +    for (i = 0; i < nb_numa_nodes; i++) {
> > > +        if ((addr >= numa_node_start(i)) && (addr < numa_node_end(i))) {
> > > +            return i;
> > > +        }
> > > +    }
> > 
> > This function is O(N^2) in number of nodes, which is a bit hideous for
> > something so simple.
> 
> Will something like below work ? Will all archs be ok with this ?

The below looks like a reasonable idea to me, but it's really the call
of the numa and memory subsystem people.

Paolo, do you have an opinion?  The basic problem here is to have a
function which returns which NUMA node a given address lies in.
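
A minimal standalone sketch of the lookup being discussed, assuming nodes
are laid out back to back from address 0 (no sparse layout). The names
node_range, build_node_ranges and find_node are hypothetical and are not
part of QEMU; the point is only that computing each node's range once
makes the lookup linear in the number of nodes rather than quadratic.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-node range proposed in the patch. */
struct node_range {
    uint64_t mem_start;   /* first address of the node */
    uint64_t mem_end;     /* one past the last address of the node */
};

/*
 * Fill ranges[] from the per-node sizes in a single pass, assuming the
 * nodes are contiguous and start at address 0.
 */
static void build_node_ranges(const uint64_t *node_mem, size_t nb_nodes,
                              struct node_range *ranges)
{
    uint64_t addr = 0;
    size_t i;

    assert(node_mem != NULL && ranges != NULL);
    for (i = 0; i < nb_nodes; i++) {
        ranges[i].mem_start = addr;
        addr += node_mem[i];
        ranges[i].mem_end = addr;
    }
}

/* Return the index of the node containing addr, or -1 if none claims it. */
static int find_node(const struct node_range *ranges, size_t nb_nodes,
                     uint64_t addr)
{
    size_t i;

    for (i = 0; i < nb_nodes; i++) {
        if (addr >= ranges[i].mem_start && addr < ranges[i].mem_end) {
            return (int)i;
        }
    }
    return -1;
}
```

Unlike the proposed get_numa_node(), this sketch reports an unclaimed
address with -1 instead of falling back to node 0; which behaviour is
wanted is a separate design choice.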

> numa: Store start and end address range of each node in numa_info
> 
> Keep track of start and end address of each NUMA node in numa_info
> structure so that lookup of node by address becomes easier.
> 
> Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
> ---
>  include/sysemu/numa.h |    3 +++
>  numa.c                |   28 ++++++++++++++++++++++++++++
>  2 files changed, 31 insertions(+)
> 
> diff --git a/include/sysemu/numa.h b/include/sysemu/numa.h
> index 5633b85..6dd6387 100644
> --- a/include/sysemu/numa.h
> +++ b/include/sysemu/numa.h
> @@ -14,11 +14,14 @@ typedef struct node_info {
>      DECLARE_BITMAP(node_cpu, MAX_CPUMASK_BITS);
>      struct HostMemoryBackend *node_memdev;
>      bool present;
> +    uint64_t mem_start;
> +    uint64_t mem_end;

I suspect these should be hwaddr, or possibly ram_addr_t though.

[snip]
> > I don't think rtas_ld() should be used outside its intended function
> > in rtas.  Make a direct call to ldl_phys instead.
> 
> Ok, but @table here is an RTAS arg passed from the main RTAS routine to this
> function. Still can't use rtas_ld() ?

So, rtas_ld() was never intended as a general purpose load function
for RTAS use, but specifically for loading arguments from an RTAS
style argument list.

It's already being used for more than that now in the RTAS code, which
I'm not entirely comfortable with.  I'd prefer not to extend its use
any further.
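
To illustrate the distinction, here is a standalone sketch that works on a
plain byte buffer instead of guest physical memory. load_be32 and arg_ld
are hypothetical names, not QEMU functions: the first is a general-purpose
big-endian load, the second is a thin helper that only makes sense for
indexing 32-bit cells of an argument list.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* General-purpose load of one big-endian 32-bit word from a buffer. */
static uint32_t load_be32(const uint8_t *p)
{
    assert(p != NULL);
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/*
 * Argument-list helper: return the n-th 32-bit cell of an argument
 * list. It is intentionally nothing more than an indexed load_be32().
 */
static uint32_t arg_ld(const uint8_t *args, int n)
{
    assert(n >= 0);
    return load_be32(args + 4 * (size_t)n);
}
```

Reading a vector count out of the top byte of a table, as in the quoted
patch, then becomes (load_be32(table) >> 24) + 1, with no reference to
the argument-list helper.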

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [Qemu-devel] [RFC PATCH v2 04/23] spapr: Support ibm,lrdr-capacity device tree property
  2015-03-25  0:15   ` David Gibson
@ 2015-04-01  3:59     ` Bharata B Rao
  0 siblings, 0 replies; 74+ messages in thread
From: Bharata B Rao @ 2015-04-01  3:59 UTC (permalink / raw)
  To: David Gibson
  Cc: mdroth, agraf, qemu-devel, qemu-ppc, tyreld, nfont, imammedo, afaerber

On Wed, Mar 25, 2015 at 11:15:55AM +1100, David Gibson wrote:
> On Mon, Mar 23, 2015 at 07:05:45PM +0530, Bharata B Rao wrote:
> > Add support for ibm,lrdr-capacity since this is needed by the guest
> > kernel to know about the possible hot-pluggable CPUs and Memory. With
> > this, pseries kernels will start reporting correct maxcpus in
> > /sys/devices/system/cpu/possible.
> > 
> > Define minimum hotpluggable memory size as 256MB and start storing maximum
> > possible memory for the guest in sPAPREnvironment.
> 
> [snip]
> > @@ -666,6 +668,18 @@ int spapr_rtas_device_tree_setup(void *fdt, hwaddr rtas_addr,
> >          }
> >  
> >      }
> > +
> > +    lrdr_capacity[0] = cpu_to_be32(spapr->maxram_limit >> 32);
> > +    lrdr_capacity[1] = cpu_to_be32(spapr->maxram_limit & 0xffffffff);
> > +    lrdr_capacity[2] = 0;
> > +    lrdr_capacity[3] = cpu_to_be32(SPAPR_MEMORY_BLOCK_SIZE);
> > +    lrdr_capacity[4] = cpu_to_be32(max_cpus/smp_threads);
> > +    ret = qemu_fdt_setprop(fdt, "/rtas", "ibm,lrdr-capacity", lrdr_capacity,
> > +                     sizeof(lrdr_capacity));
> > +    if (ret < 0) {
> > +        fprintf(stderr, "Couldn't add ibm,lrdr-capacity rtas property\n");
> 
> This should probably be report_error() these days.

This file (hw/ppc/spapr_rtas.c) has lots of fprintf calls; maybe changing
all of them is a task for another day.

Regards,
Bharata.

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [Qemu-devel] [RFC PATCH v2 00/23] CPU and Memory hotplug for PowerPC sPAPR guests
  2015-03-23 13:35 [Qemu-devel] [RFC PATCH v2 00/23] CPU and Memory hotplug for PowerPC sPAPR guests Bharata B Rao
                   ` (23 preceding siblings ...)
  2015-03-26  3:58 ` [Qemu-devel] [RFC PATCH v2 00/23] CPU and Memory hotplug for PowerPC sPAPR guests David Gibson
@ 2015-04-06 10:19 ` Bharata B Rao
  2015-04-07  8:57   ` Igor Mammedov
  24 siblings, 1 reply; 74+ messages in thread
From: Bharata B Rao @ 2015-04-06 10:19 UTC (permalink / raw)
  To: qemu-devel
  Cc: mdroth, agraf, qemu-ppc, tyreld, nfont, imammedo, afaerber, david

On Mon, Mar 23, 2015 at 07:05:41PM +0530, Bharata B Rao wrote:
> Hi,
> 
> This is the version 2 of the patchset that provides CPU and memory hotplug
> support for PowerPC sPAPR guests.

[snip]

> TODOs
> -----
> - Share code between pc_dimm_plug() and spapr_memory_plug().
> - Make the algorithm that looks up the NUMA node given the physical address
>   more efficient.
> - Test/enable migration after hotplug.

Although I am using a bitmap-based CPU enumeration (patch 14/23 in this
patchset) to correctly support CPU hot removal, it appears that supporting
hot removal with migration is non-trivial if we allow removal in arbitrary
order rather than strictly in remove-last-added-cpu-first order.

If there are holes in the CPU index map (like 0-15,20-23) at the source VM
due to hot removal, the CPU index map at the target VM will be 0-19 by
default.
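
A standalone sketch of this mismatch, using a hypothetical lowest-free-index
bitmap allocator (cpu_map and its helpers are illustrative names, not the
code from patch 14/23): the source ends up with indices 0-15,20-23 after
hot removal, while a target that simply creates 20 CPUs gets 0-19.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_CPUS 32   /* hypothetical limit, for the sketch only */

/* Hypothetical CPU index bitmap: used[i] is true if index i is taken. */
struct cpu_map {
    bool used[MAX_CPUS];
};

/* Claim the lowest free index, or return -1 if the map is full. */
static int cpu_map_alloc(struct cpu_map *m)
{
    int i;

    assert(m != NULL);
    for (i = 0; i < MAX_CPUS; i++) {
        if (!m->used[i]) {
            m->used[i] = true;
            return i;
        }
    }
    return -1;
}

/* Release an index so that a later allocation can reuse it. */
static void cpu_map_free(struct cpu_map *m, int idx)
{
    assert(m != NULL && idx >= 0 && idx < MAX_CPUS);
    m->used[idx] = false;
}

/* Return true if both maps mark exactly the same indices as used. */
static bool cpu_map_equal(const struct cpu_map *a, const struct cpu_map *b)
{
    int i;

    for (i = 0; i < MAX_CPUS; i++) {
        if (a->used[i] != b->used[i]) {
            return false;
        }
    }
    return true;
}
```

Both VMs hold 20 CPUs, yet cpu_map_equal() is false for the two maps, so
per-CPU state keyed by index cannot be matched up one to one.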

I see there was a patch to solve a somewhat similar problem on x86 last year
(https://lists.nongnu.org/archive/html/qemu-devel/2014-01/msg01607.html), but
I don't see it being pursued further.

What would be the ideal or recommended way to solve this problem? The easiest
would be to enforce hot removal in LIFO order, but I guess that would be
restrictive.

Regards,
Bharata.

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [Qemu-devel] [Qemu-ppc] [RFC PATCH v2 19/23] spapr: CPU hot unplug support
  2015-03-23 13:36 ` [Qemu-devel] [RFC PATCH v2 19/23] spapr: CPU hot unplug support Bharata B Rao
  2015-03-25  5:44   ` David Gibson
@ 2015-04-07  6:45   ` Alexey Kardashevskiy
  2015-04-09  3:51     ` Bharata B Rao
  1 sibling, 1 reply; 74+ messages in thread
From: Alexey Kardashevskiy @ 2015-04-07  6:45 UTC (permalink / raw)
  To: Bharata B Rao, qemu-devel
  Cc: mdroth, qemu-ppc, tyreld, imammedo, nfont, afaerber, david

On 03/24/2015 12:36 AM, Bharata B Rao wrote:
> Support hot removal of CPU for sPAPR guests by sending the hot
> unplug notification to the guest via EPOW interrupt.
>
> Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
> ---
>   hw/ppc/spapr.c            | 78 ++++++++++++++++++++++++++++++++++++++++++++++-
>   linux-headers/linux/kvm.h |  1 +
>   target-ppc/kvm.c          |  7 +++++
>   target-ppc/kvm_ppc.h      |  6 ++++
>   4 files changed, 91 insertions(+), 1 deletion(-)
>
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index b48994b..7b8784d 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -1468,6 +1468,12 @@ static void spapr_cpu_init(PowerPCCPU *cpu)
>       qemu_register_reset(spapr_cpu_reset, cpu);
>   }
>
> +static void spapr_cpu_destroy(PowerPCCPU *cpu)
> +{
> +    xics_cpu_destroy(spapr->icp, cpu);
> +    qemu_unregister_reset(spapr_cpu_reset, cpu);
> +}
> +
>   /* pSeries LPAR / sPAPR hardware init */
>   static void ppc_spapr_init(MachineState *machine)
>   {
> @@ -1880,6 +1886,18 @@ static void spapr_cpu_hotplug_add(DeviceState *dev, CPUState *cs, Error **errp)
>       }
>   }
>
> +static void spapr_cpu_hotplug_remove(DeviceState *dev, CPUState *cs,
> +                                     Error **errp)
> +{
> +    PowerPCCPU *cpu = POWERPC_CPU(cs);
> +    int id = ppc_get_vcpu_dt_id(cpu);
> +    sPAPRDRConnector *drc =
> +        spapr_dr_connector_by_id(SPAPR_DR_CONNECTOR_TYPE_CPU, id);
> +    sPAPRDRConnectorClass *drck = SPAPR_DR_CONNECTOR_GET_CLASS(drc);
> +
> +    drck->detach(drc, dev, NULL, NULL, errp);
> +}
> +
>   static void spapr_cpu_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
>                               Error **errp)
>   {
> @@ -1911,6 +1929,51 @@ static void spapr_cpu_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
>       return;
>   }
>
> +static int spapr_cpu_unplug(Object *obj, void *opaque)
> +{
> +    Error **errp = opaque;
> +    DeviceState *dev = DEVICE(obj);
> +    CPUState *cs = CPU(dev);
> +    PowerPCCPU *cpu = POWERPC_CPU(cs);
> +    int id = ppc_get_vcpu_dt_id(cpu);
> +    int smt = kvmppc_smt_threads();
> +    sPAPRDRConnector *drc =
> +        spapr_dr_connector_by_id(SPAPR_DR_CONNECTOR_TYPE_CPU, id);
> +
> +    spapr_cpu_destroy(cpu);
> +
> +    /*
> +     * SMT threads return from here, only main thread (core) will
> +     * continue and signal hot unplug event to the guest.
> +     */
> +    if ((id % smt) != 0) {
> +        return 0;
> +    }
> +    g_assert(drc);
> +
> +    spapr_cpu_hotplug_remove(dev, cs, errp);
> +    if (*errp) {
> +        return -1;
> +    }
> +    spapr_hotplug_req_remove_event(drc);
> +
> +    return 0;
> +}
> +
> +static int spapr_cpu_core_unplug(Object *obj, void *opaque)
> +{
> +    Error **errp = opaque;
> +
> +    object_child_foreach(obj, spapr_cpu_unplug, errp);
> +    return 0;
> +}
> +
> +static void spapr_cpu_socket_unplug(HotplugHandler *hotplug_dev,
> +                            DeviceState *dev, Error **errp)
> +{
> +    object_child_foreach(OBJECT(dev), spapr_cpu_core_unplug, errp);
> +}
> +
>   static void spapr_machine_device_plug(HotplugHandler *hotplug_dev,
>                                         DeviceState *dev, Error **errp)
>   {
> @@ -1926,10 +1989,21 @@ static void spapr_machine_device_plug(HotplugHandler *hotplug_dev,
>       }
>   }
>
> +static void spapr_machine_device_unplug(HotplugHandler *hotplug_dev,
> +                                      DeviceState *dev, Error **errp)
> +{
> +    if (object_dynamic_cast(OBJECT(dev), TYPE_CPU_SOCKET)) {
> +        if (dev->hotplugged && spapr->dr_cpu_enabled) {
> +            spapr_cpu_socket_unplug(hotplug_dev, dev, errp);
> +        }
> +    }
> +}
> +
>   static HotplugHandler *spapr_get_hotpug_handler(MachineState *machine,
>                                                DeviceState *dev)
>   {
> -    if (object_dynamic_cast(OBJECT(dev), TYPE_CPU)) {
> +    if (object_dynamic_cast(OBJECT(dev), TYPE_CPU) ||
> +        object_dynamic_cast(OBJECT(dev), TYPE_CPU_SOCKET)) {


What is this change for? I mean, why isn't it always socket-only? An
explanation in the commit log would not hurt here...


>           return HOTPLUG_HANDLER(machine);
>       }
>       return NULL;
> @@ -1953,6 +2027,8 @@ static void spapr_machine_class_init(ObjectClass *oc, void *data)
>       mc->has_dynamic_sysbus = true;
>       mc->get_hotplug_handler = spapr_get_hotpug_handler;
>       hc->plug = spapr_machine_device_plug;
> +    hc->unplug = spapr_machine_device_unplug;
> +
>       smc->dr_phb_enabled = false;
>       smc->dr_cpu_enabled = false;
>       smc->dr_lmb_enabled = false;
> diff --git a/linux-headers/linux/kvm.h b/linux-headers/linux/kvm.h
> index 12045a1..0c1be07 100644
> --- a/linux-headers/linux/kvm.h
> +++ b/linux-headers/linux/kvm.h
> @@ -761,6 +761,7 @@ struct kvm_ppc_smmu_info {
>   #define KVM_CAP_PPC_FIXUP_HCALL 103
>   #define KVM_CAP_PPC_ENABLE_HCALL 104
>   #define KVM_CAP_CHECK_EXTENSION_VM 105
> +#define KVM_CAP_SPAPR_REUSE_VCPU 107
>
>   #ifdef KVM_CAP_IRQ_ROUTING
>
> diff --git a/target-ppc/kvm.c b/target-ppc/kvm.c
> index 1edf2b5..ee23bf6 100644
> --- a/target-ppc/kvm.c
> +++ b/target-ppc/kvm.c
> @@ -72,6 +72,7 @@ static int cap_ppc_watchdog;
>   static int cap_papr;
>   static int cap_htab_fd;
>   static int cap_fixup_hcalls;
> +static int cap_spapr_reuse_vcpu;
>
>   static uint32_t debug_inst_opcode;
>
> @@ -114,6 +115,7 @@ int kvm_arch_init(KVMState *s)
>        * only activated after this by kvmppc_set_papr() */
>       cap_htab_fd = kvm_check_extension(s, KVM_CAP_PPC_HTAB_FD);
>       cap_fixup_hcalls = kvm_check_extension(s, KVM_CAP_PPC_FIXUP_HCALL);
> +    cap_spapr_reuse_vcpu = kvm_check_extension(s, KVM_CAP_SPAPR_REUSE_VCPU);
>
>       if (!cap_interrupt_level) {
>           fprintf(stderr, "KVM: Couldn't find level irq capability. Expect the "
> @@ -2408,3 +2410,8 @@ int kvm_arch_fixup_msi_route(struct kvm_irq_routing_entry *route,
>   {
>       return 0;
>   }
> +
> +bool kvmppc_spapr_reuse_vcpu(void)
> +{
> +    return cap_spapr_reuse_vcpu;
> +}
> diff --git a/target-ppc/kvm_ppc.h b/target-ppc/kvm_ppc.h
> index 2e0224c..c831229 100644
> --- a/target-ppc/kvm_ppc.h
> +++ b/target-ppc/kvm_ppc.h
> @@ -40,6 +40,7 @@ void *kvmppc_create_spapr_tce(uint32_t liobn, uint32_t window_size, int *pfd,
>   int kvmppc_remove_spapr_tce(void *table, int pfd, uint32_t window_size);
>   int kvmppc_reset_htab(int shift_hint);
>   uint64_t kvmppc_rma_size(uint64_t current_size, unsigned int hash_shift);
> +bool kvmppc_spapr_reuse_vcpu(void);
>   #endif /* !CONFIG_USER_ONLY */
>   bool kvmppc_has_cap_epr(void);
>   int kvmppc_define_rtas_kernel_token(uint32_t token, const char *function);
> @@ -185,6 +186,11 @@ static inline int kvmppc_update_sdr1(CPUPPCState *env)
>       return 0;
>   }
>
> +static inline bool kvmppc_spapr_reuse_vcpu(void)
> +{
> +    return false;
> +}
> +
>   #endif /* !CONFIG_USER_ONLY */
>
>   static inline bool kvmppc_has_cap_epr(void)
>


-- 
Alexey

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [Qemu-devel] [RFC PATCH v2 00/23] CPU and Memory hotplug for PowerPC sPAPR guests
  2015-04-06 10:19 ` Bharata B Rao
@ 2015-04-07  8:57   ` Igor Mammedov
  0 siblings, 0 replies; 74+ messages in thread
From: Igor Mammedov @ 2015-04-07  8:57 UTC (permalink / raw)
  To: Bharata B Rao
  Cc: mdroth, agraf, qemu-devel, qemu-ppc, tyreld, nfont, afaerber, david

On Mon, 6 Apr 2015 15:49:24 +0530
Bharata B Rao <bharata@linux.vnet.ibm.com> wrote:

> On Mon, Mar 23, 2015 at 07:05:41PM +0530, Bharata B Rao wrote:
> > Hi,
> > 
> > This is the version 2 of the patchset that provides CPU and memory hotplug
> > support for PowerPC sPAPR guests.
> 
> [snip]
> 
> > TODOs
> > -----
> > - Share code between pc_dimm_plug() and spapr_memory_plug().
> > - Make the algorithm that looks up the NUMA node given the physical address
> >   more efficient.
> > - Test/enable migration after hotplug.
> 
> While I am using a bitmap based CPU enumeration (patch 14/23 in this patchset),
> to correctly support CPU hot removal, it appears that supporting hot removal
> with migration is non trivial if we allow removal in arbitrary order and
> not necessarily remove-last-added-cpu-first order.
> 
> If there are holes in CPU index map (like 0-15,20-23) at the source VM
> due to hot removal, the CPU index map at the target VM will be 0-19 by
> default.
> 
> I see there was a patch to solve somewhat similar problem on x86 last year
> (https://lists.nongnu.org/archive/html/qemu-devel/2014-01/msg01607.html), but
> don't see it being pursued further.
The show stopper for that patch was that we do not want to expose a
target-specific way to address CPUs (and the APIC ID is exactly that), so that
users do not need to calculate that ID manually (which is not trivial in the
APIC ID case).
 
> What would be the ideal or recommended way to solve this problem ? Easiest
> would be to enforce hot removal in LIFO order, but I guess that would be
> restrictive.
That is, I think, where the sockets concept could help (at least on the
user-visible side).

For example, you start QEMU with:

QEMU -smp 2,sockets=4,cores=2,maxcpus=8

hotadd:

device_add MYCPUTYPE,socket=2,core=1,id=foo1
device_add MYCPUTYPE,socket=2,core=2,id=foo2

hotremove:

device_del id=foo1

Then on the destination host you start QEMU with:

QEMU -smp 2,sockets=4,cores=2,maxcpus=8 -device MYCPUTYPE,socket=2,core=2,id=foo2

i.e. CPUs behave the same way as any other devices wrt migration.
The initial CPUs (-smp 2) are not hotpluggable since they don't have an id
assigned, same as other devices.

> 
> Regards,
> Bharata.
> 
> 

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [Qemu-devel] [Qemu-ppc] [RFC PATCH v2 19/23] spapr: CPU hot unplug support
  2015-04-07  6:45   ` [Qemu-devel] [Qemu-ppc] " Alexey Kardashevskiy
@ 2015-04-09  3:51     ` Bharata B Rao
  0 siblings, 0 replies; 74+ messages in thread
From: Bharata B Rao @ 2015-04-09  3:51 UTC (permalink / raw)
  To: Alexey Kardashevskiy
  Cc: qemu-devel, mdroth, qemu-ppc, tyreld, imammedo, nfont, afaerber, david

On Tue, Apr 07, 2015 at 04:45:17PM +1000, Alexey Kardashevskiy wrote:
> On 03/24/2015 12:36 AM, Bharata B Rao wrote:
> >Support hot removal of CPU for sPAPR guests by sending the hot
> >unplug notification to the guest via EPOW interrupt.
> >
> >Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
> >---
> >  hw/ppc/spapr.c            | 78 ++++++++++++++++++++++++++++++++++++++++++++++-
> >  linux-headers/linux/kvm.h |  1 +
> >  target-ppc/kvm.c          |  7 +++++
> >  target-ppc/kvm_ppc.h      |  6 ++++
> >  4 files changed, 91 insertions(+), 1 deletion(-)
> >
> >diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> >index b48994b..7b8784d 100644
> >--- a/hw/ppc/spapr.c
> >+++ b/hw/ppc/spapr.c
> >@@ -1468,6 +1468,12 @@ static void spapr_cpu_init(PowerPCCPU *cpu)
> >      qemu_register_reset(spapr_cpu_reset, cpu);
> >  }
> >
> >+static void spapr_cpu_destroy(PowerPCCPU *cpu)
> >+{
> >+    xics_cpu_destroy(spapr->icp, cpu);
> >+    qemu_unregister_reset(spapr_cpu_reset, cpu);
> >+}
> >+
> >  /* pSeries LPAR / sPAPR hardware init */
> >  static void ppc_spapr_init(MachineState *machine)
> >  {
> >@@ -1880,6 +1886,18 @@ static void spapr_cpu_hotplug_add(DeviceState *dev, CPUState *cs, Error **errp)
> >      }
> >  }
> >
> >+static void spapr_cpu_hotplug_remove(DeviceState *dev, CPUState *cs,
> >+                                     Error **errp)
> >+{
> >+    PowerPCCPU *cpu = POWERPC_CPU(cs);
> >+    int id = ppc_get_vcpu_dt_id(cpu);
> >+    sPAPRDRConnector *drc =
> >+        spapr_dr_connector_by_id(SPAPR_DR_CONNECTOR_TYPE_CPU, id);
> >+    sPAPRDRConnectorClass *drck = SPAPR_DR_CONNECTOR_GET_CLASS(drc);
> >+
> >+    drck->detach(drc, dev, NULL, NULL, errp);
> >+}
> >+
> >  static void spapr_cpu_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
> >                              Error **errp)
> >  {
> >@@ -1911,6 +1929,51 @@ static void spapr_cpu_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
> >      return;
> >  }
> >
> >+static int spapr_cpu_unplug(Object *obj, void *opaque)
> >+{
> >+    Error **errp = opaque;
> >+    DeviceState *dev = DEVICE(obj);
> >+    CPUState *cs = CPU(dev);
> >+    PowerPCCPU *cpu = POWERPC_CPU(cs);
> >+    int id = ppc_get_vcpu_dt_id(cpu);
> >+    int smt = kvmppc_smt_threads();
> >+    sPAPRDRConnector *drc =
> >+        spapr_dr_connector_by_id(SPAPR_DR_CONNECTOR_TYPE_CPU, id);
> >+
> >+    spapr_cpu_destroy(cpu);
> >+
> >+    /*
> >+     * SMT threads return from here, only main thread (core) will
> >+     * continue and signal hot unplug event to the guest.
> >+     */
> >+    if ((id % smt) != 0) {
> >+        return 0;
> >+    }
> >+    g_assert(drc);
> >+
> >+    spapr_cpu_hotplug_remove(dev, cs, errp);
> >+    if (*errp) {
> >+        return -1;
> >+    }
> >+    spapr_hotplug_req_remove_event(drc);
> >+
> >+    return 0;
> >+}
> >+
> >+static int spapr_cpu_core_unplug(Object *obj, void *opaque)
> >+{
> >+    Error **errp = opaque;
> >+
> >+    object_child_foreach(obj, spapr_cpu_unplug, errp);
> >+    return 0;
> >+}
> >+
> >+static void spapr_cpu_socket_unplug(HotplugHandler *hotplug_dev,
> >+                            DeviceState *dev, Error **errp)
> >+{
> >+    object_child_foreach(OBJECT(dev), spapr_cpu_core_unplug, errp);
> >+}
> >+
> >  static void spapr_machine_device_plug(HotplugHandler *hotplug_dev,
> >                                        DeviceState *dev, Error **errp)
> >  {
> >@@ -1926,10 +1989,21 @@ static void spapr_machine_device_plug(HotplugHandler *hotplug_dev,
> >      }
> >  }
> >
> >+static void spapr_machine_device_unplug(HotplugHandler *hotplug_dev,
> >+                                      DeviceState *dev, Error **errp)
> >+{
> >+    if (object_dynamic_cast(OBJECT(dev), TYPE_CPU_SOCKET)) {
> >+        if (dev->hotplugged && spapr->dr_cpu_enabled) {
> >+            spapr_cpu_socket_unplug(hotplug_dev, dev, errp);
> >+        }
> >+    }
> >+}
> >+
> >  static HotplugHandler *spapr_get_hotpug_handler(MachineState *machine,
> >                                               DeviceState *dev)
> >  {
> >-    if (object_dynamic_cast(OBJECT(dev), TYPE_CPU)) {
> >+    if (object_dynamic_cast(OBJECT(dev), TYPE_CPU) ||
> >+        object_dynamic_cast(OBJECT(dev), TYPE_CPU_SOCKET)) {
> 
> 
> What is this change for? I mean why is not it always socket-only? Commit log
> would not hurt here...

In the hot add case (do_device_add), the CPU socket device is realized first,
which in turn realizes the CPU core devices. The core devices then realize the
CPU thread devices. So the ->plug() operation happens as part of realizing the
CPU thread devices, and hence the hotplug handler is returned only for TYPE_CPU.

However, in the hot remove case, qdev_unplug() directly calls ->unplug(), and
hence I need to return the hotplug handler for TYPE_CPU_SOCKET as well.
This ensures that ->unplug() gets called for the socket object, where I take
care of recursively walking down the core and thread objects and eventually
unplugging the CPU thread objects.
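
The recursive walk can be sketched in isolation as below. This is only an
illustration on a hypothetical topo_node tree; it does not use QOM or
object_child_foreach(), and the names are not from the patch.

```c
#include <assert.h>
#include <stddef.h>

enum node_kind { NODE_SOCKET, NODE_CORE, NODE_THREAD };

#define MAX_CHILDREN 8   /* hypothetical fan-out limit for the sketch */

/* Hypothetical topology node: a socket, a core or a thread. */
struct topo_node {
    enum node_kind kind;
    struct topo_node *children[MAX_CHILDREN];
    size_t nr_children;
    int unplugged;       /* set to 1 on threads that were unplugged */
};

/*
 * Walk down from node and unplug every thread below it. The entry
 * point is the socket; cores are only traversed, threads are the
 * objects actually unplugged. Returns the number of threads unplugged.
 */
static int unplug_walk(struct topo_node *node)
{
    size_t i;
    int count = 0;

    assert(node != NULL);
    if (node->kind == NODE_THREAD) {
        node->unplugged = 1;
        return 1;
    }
    for (i = 0; i < node->nr_children; i++) {
        count += unplug_walk(node->children[i]);
    }
    return count;
}
```

Calling unplug_walk() on a socket with two cores of two threads each
visits all four threads, mirroring how a single ->unplug() on the socket
fans out to the thread objects.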

Regards,
Bharata.

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [Qemu-devel] [RFC PATCH v2 21/23] spapr: Initialize hotplug memory address space
  2015-03-25  5:58   ` David Gibson
@ 2015-04-13  2:59     ` Bharata B Rao
  2015-04-13 14:04       ` Igor Mammedov
  0 siblings, 1 reply; 74+ messages in thread
From: Bharata B Rao @ 2015-04-13  2:59 UTC (permalink / raw)
  To: David Gibson
  Cc: mdroth, agraf, qemu-devel, qemu-ppc, tyreld, nfont, imammedo, afaerber

On Wed, Mar 25, 2015 at 04:58:18PM +1100, David Gibson wrote:
> On Mon, Mar 23, 2015 at 07:06:02PM +0530, Bharata B Rao wrote:
> > Initialize a hotplug memory region under which all the hotplugged
> > memory is accommodated. Also enable memory hotplug by setting
> > CONFIG_MEM_HOTPLUG.
> > 
> > Modelled on i386 memory hotplug.
> > 
> > Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
> > ---
> >  default-configs/ppc64-softmmu.mak |  1 +
> >  hw/ppc/spapr.c                    | 50 +++++++++++++++++++++++++++++++++++++++
> >  include/hw/ppc/spapr.h            | 12 ++++++++++
> >  3 files changed, 63 insertions(+)
> > 
> > diff --git a/default-configs/ppc64-softmmu.mak b/default-configs/ppc64-softmmu.mak
> > index 22ef132..16b3011 100644
> > --- a/default-configs/ppc64-softmmu.mak
> > +++ b/default-configs/ppc64-softmmu.mak
> > @@ -51,3 +51,4 @@ CONFIG_XICS_KVM=$(and $(CONFIG_PSERIES),$(CONFIG_KVM))
> >  # For PReP
> >  CONFIG_MC146818RTC=y
> >  CONFIG_ISA_TESTDEV=y
> > +CONFIG_MEM_HOTPLUG=y
> > diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> > index 3e56d9e..e43bb49 100644
> > --- a/hw/ppc/spapr.c
> > +++ b/hw/ppc/spapr.c
> > @@ -125,8 +125,13 @@ struct sPAPRMachineState {
> >  
> >      /*< public >*/
> >      char *kvm_type;
> > +    ram_addr_t hotplug_memory_base;
> > +    MemoryRegion hotplug_memory;
> > +    bool enforce_aligned_dimm;
> >  };
> >  
> > +#define SPAPR_MACHINE_ENFORCE_ALIGNED_DIMM "enforce-aligned-dimm"
> 
> What's the rationale for including this option?

I couldn't see any use for it, but added it to be consistent with x86. Maybe
Igor can tell us?

Regards,
Bharata.


* Re: [Qemu-devel] [RFC PATCH v2 23/23] spapr: Memory hotplug support
  2015-03-26  3:57   ` David Gibson
@ 2015-04-13  3:03     ` Bharata B Rao
  2015-04-13 14:12       ` Igor Mammedov
  0 siblings, 1 reply; 74+ messages in thread
From: Bharata B Rao @ 2015-04-13  3:03 UTC (permalink / raw)
  To: David Gibson
  Cc: mdroth, agraf, qemu-devel, qemu-ppc, tyreld, nfont, imammedo, afaerber

On Thu, Mar 26, 2015 at 02:57:45PM +1100, David Gibson wrote:
> On Mon, Mar 23, 2015 at 07:06:04PM +0530, Bharata B Rao wrote:
> > Make use of pc-dimm infrastructure to support memory hotplug
> > for PowerPC.
> > 
> > Modelled on i386 memory hotplug.
> > 
> > Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
> > ---
> >  hw/ppc/spapr.c        | 119 +++++++++++++++++++++++++++++++++++++++++++++++++-
> >  hw/ppc/spapr_events.c |   3 ++
> >  2 files changed, 120 insertions(+), 2 deletions(-)
> > 
> > diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> > index 4e844ab..bc46acd 100644
> > --- a/hw/ppc/spapr.c
> > +++ b/hw/ppc/spapr.c
> > @@ -61,7 +61,8 @@
> >  #include "hw/nmi.h"
> >  
> >  #include "hw/compat.h"
> > -
> > +#include "hw/mem/pc-dimm.h"
> > +#include "qapi/qmp/qerror.h"
> >  #include <libfdt.h>
> >  
> >  /* SLOF memory layout:
> > @@ -902,6 +903,10 @@ int spapr_h_cas_compose_response(target_ulong addr, target_ulong size,
> >          _FDT((spapr_populate_memory(spapr, fdt)));
> >      }
> >  
> > +    if (spapr->dr_lmb_enabled) {
> > +        _FDT(spapr_drc_populate_dt(fdt, 0, NULL, SPAPR_DR_CONNECTOR_TYPE_LMB));
> > +    }
> > +
> >      /* Pack resulting tree */
> >      _FDT((fdt_pack(fdt)));
> >  
> > @@ -2185,6 +2190,109 @@ static void spapr_cpu_socket_unplug(HotplugHandler *hotplug_dev,
> >      object_child_foreach(OBJECT(dev), spapr_cpu_core_unplug, errp);
> >  }
> >  
> > +static void spapr_add_lmbs(uint64_t addr, uint64_t size, Error **errp)
> > +{
> > +    sPAPRDRConnector *drc;
> > +    uint32_t nr_lmbs = size/SPAPR_MEMORY_BLOCK_SIZE;
> > +    int i;
> > +
> > +    if (size % SPAPR_MEMORY_BLOCK_SIZE) {
> > +        error_setg(errp, "Hotplugged memory size must be a multiple of "
> > +                      "%d MB", SPAPR_MEMORY_BLOCK_SIZE/(1024 * 1024));
> 
> s/MB/MiB/  there seems to be precedent for using the mebibyte term in
> qemu.
> 
> Also you can use the MiB #define instead of (1024 * 1024).
> 
> > +        return;
> > +    }
> > +
> > +    for (i = 0; i < nr_lmbs; i++) {
> > +        drc = spapr_dr_connector_by_id(SPAPR_DR_CONNECTOR_TYPE_LMB,
> > +                addr/SPAPR_MEMORY_BLOCK_SIZE);
> > +        g_assert(drc);
> > +        spapr_hotplug_req_add_event(drc);
> > +        addr += SPAPR_MEMORY_BLOCK_SIZE;
> > +    }
> > +}
> > +
> > +static void spapr_memory_plug(HotplugHandler *hotplug_dev,
> > +                         DeviceState *dev, Error **errp)
> > +{
> > +    int slot;
> > +    Error *local_err = NULL;
> > +    sPAPRMachineState *ms = SPAPR_MACHINE(hotplug_dev);
> > +    MachineState *machine = MACHINE(hotplug_dev);
> > +    PCDIMMDevice *dimm = PC_DIMM(dev);
> > +    PCDIMMDeviceClass *ddc = PC_DIMM_GET_CLASS(dimm);
> > +    MemoryRegion *mr = ddc->get_memory_region(dimm);
> > +    uint64_t existing_dimms_capacity = 0;
> > +    uint64_t align = TARGET_PAGE_SIZE;
> > +    uint64_t addr;
> > +
> > +    addr = object_property_get_int(OBJECT(dimm), PC_DIMM_ADDR_PROP, &local_err);
> > +    if (local_err) {
> > +        goto out;
> > +    }
> > +
> > +    if (memory_region_get_alignment(mr) && ms->enforce_aligned_dimm) {
> > +        align = memory_region_get_alignment(mr);
> > +    }
> > +
> > +    addr = pc_dimm_get_free_addr(ms->hotplug_memory_base,
> > +                                 memory_region_size(&ms->hotplug_memory),
> > +                                 !addr ? NULL : &addr, align,
> > +                                 memory_region_size(mr), &local_err);
> > +    if (local_err) {
> > +        goto out;
> > +    }
> > +
> > +    existing_dimms_capacity = pc_existing_dimms_capacity(&local_err);
> > +    if (local_err) {
> > +        goto out;
> > +    }
> > +
> > +    if (existing_dimms_capacity + memory_region_size(mr) >
> > +        machine->maxram_size - machine->ram_size) {
> > +        error_setg(&local_err, "not enough space, currently 0x%" PRIx64
> > +                   " in use of total hot pluggable 0x" RAM_ADDR_FMT,
> > +                   existing_dimms_capacity,
> > +                   machine->maxram_size - machine->ram_size);
> > +        goto out;
> > +    }
> > +
> > +    object_property_set_int(OBJECT(dev), addr, PC_DIMM_ADDR_PROP, &local_err);
> > +    if (local_err) {
> > +        goto out;
> > +    }
> > +    trace_mhp_pc_dimm_assigned_address(addr);
> > +
> > +    slot = object_property_get_int(OBJECT(dev), PC_DIMM_SLOT_PROP, &local_err);
> > +    if (local_err) {
> > +        goto out;
> > +    }
> > +
> > +    slot = pc_dimm_get_free_slot(slot == PC_DIMM_UNASSIGNED_SLOT ? NULL : &slot,
> > +                                 machine->ram_slots, &local_err);
> > +    if (local_err) {
> > +        goto out;
> > +    }
> > +    object_property_set_int(OBJECT(dev), slot, PC_DIMM_SLOT_PROP, &local_err);
> > +    if (local_err) {
> > +        goto out;
> > +    }
> > +    trace_mhp_pc_dimm_assigned_slot(slot);
> > +
> > +    if (kvm_enabled() && !kvm_has_free_slot(machine)) {
> > +        error_setg(&local_err, "hypervisor has no free memory slots left");
> > +        goto out;
> > +    }
> > +
> > +    memory_region_add_subregion(&ms->hotplug_memory,
> > +                                addr - ms->hotplug_memory_base, mr);
> > +    vmstate_register_ram(mr, dev);
> > +
> > +    spapr_add_lmbs(addr, memory_region_size(mr), &local_err);
> 
> It really looks as if it should be possible to make a common function
> covering most of this and pc_dimm_plug, with the only difference being
> the final call to do the arch specific stuff.  Even that might be able
> to just be a hotplug handler call if you add a parameter for the
> appropriate hotplug handler object, instead of assuming the ACPI device
> on the PC side.

This routine and the equivalent x86 implementation use hotplug_memory_base
and hotplug_memory fields from sPAPRMachineState and PCMachineState
respectively. If we move these fields to MachineState, then it will be much
easier to share the common code. Is that fine?

Regards,
Bharata.


* Re: [Qemu-devel] [RFC PATCH v2 21/23] spapr: Initialize hotplug memory address space
  2015-04-13  2:59     ` Bharata B Rao
@ 2015-04-13 14:04       ` Igor Mammedov
  2015-04-13 14:27         ` Bharata B Rao
  0 siblings, 1 reply; 74+ messages in thread
From: Igor Mammedov @ 2015-04-13 14:04 UTC (permalink / raw)
  To: Bharata B Rao
  Cc: qemu-devel, mdroth, agraf, qemu-ppc, tyreld, nfont, afaerber,
	David Gibson

On Mon, 13 Apr 2015 08:29:33 +0530
Bharata B Rao <bharata@linux.vnet.ibm.com> wrote:

> On Wed, Mar 25, 2015 at 04:58:18PM +1100, David Gibson wrote:
> > On Mon, Mar 23, 2015 at 07:06:02PM +0530, Bharata B Rao wrote:
> > > Initialize a hotplug memory region under which all the hotplugged
> > > memory is accommodated. Also enable memory hotplug by setting
> > > CONFIG_MEM_HOTPLUG.
> > > 
> > > Modelled on i386 memory hotplug.
> > > 
> > > Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
> > > ---
> > >  default-configs/ppc64-softmmu.mak |  1 +
> > >  hw/ppc/spapr.c                    | 50 +++++++++++++++++++++++++++++++++++++++
> > >  include/hw/ppc/spapr.h            | 12 ++++++++++
> > >  3 files changed, 63 insertions(+)
> > > 
> > > diff --git a/default-configs/ppc64-softmmu.mak b/default-configs/ppc64-softmmu.mak
> > > index 22ef132..16b3011 100644
> > > --- a/default-configs/ppc64-softmmu.mak
> > > +++ b/default-configs/ppc64-softmmu.mak
> > > @@ -51,3 +51,4 @@ CONFIG_XICS_KVM=$(and $(CONFIG_PSERIES),$(CONFIG_KVM))
> > >  # For PReP
> > >  CONFIG_MC146818RTC=y
> > >  CONFIG_ISA_TESTDEV=y
> > > +CONFIG_MEM_HOTPLUG=y
> > > diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> > > index 3e56d9e..e43bb49 100644
> > > --- a/hw/ppc/spapr.c
> > > +++ b/hw/ppc/spapr.c
> > > @@ -125,8 +125,13 @@ struct sPAPRMachineState {
> > >  
> > >      /*< public >*/
> > >      char *kvm_type;
> > > +    ram_addr_t hotplug_memory_base;
> > > +    MemoryRegion hotplug_memory;
> > > +    bool enforce_aligned_dimm;
> > >  };
> > >  
> > > +#define SPAPR_MACHINE_ENFORCE_ALIGNED_DIMM "enforce-aligned-dimm"
> > 
> > What's the rationale for including this option?
> 
> I couldn't see any use for it, but added it to be consistent with x86. Maybe
> Igor can tell us?
At least on x86, KVM requires the memory region to be aligned (otherwise it
would abort), and performance-wise it's also better to map regions at offsets
aligned to the backend page size (especially applicable to hugepages).

So the region base on x86 is 1 GiB aligned to support up to 1 GiB hugepages.
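
As a rough illustration of what aligning a region base to 1 GiB means, here is
a minimal sketch in plain C. The helper names `align_up()` and `is_aligned()`
and the `GIB` macro are assumptions made for this example; they are not taken
from the patch or from QEMU.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define GIB (UINT64_C(1) << 30)   /* illustrative constant: 1 GiB */

/*
 * Round addr up to the next multiple of align.
 * align must be a non-zero power of two; the caller must make sure
 * addr + align - 1 does not overflow 64 bits.
 */
static uint64_t align_up(uint64_t addr, uint64_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (addr + align - 1) & ~(align - 1);
}

/* True if addr is a multiple of the power-of-two align. */
static bool is_aligned(uint64_t addr, uint64_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (addr & (align - 1)) == 0;
}
```

With a base rounded up this way, a backend using hugepages of any size up to
1 GiB can be mapped at an offset that is a multiple of its page size.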

> 
> Regards,
> Bharata.
> 


* Re: [Qemu-devel] [RFC PATCH v2 23/23] spapr: Memory hotplug support
  2015-04-13  3:03     ` Bharata B Rao
@ 2015-04-13 14:12       ` Igor Mammedov
  0 siblings, 0 replies; 74+ messages in thread
From: Igor Mammedov @ 2015-04-13 14:12 UTC (permalink / raw)
  To: Bharata B Rao
  Cc: qemu-devel, mdroth, agraf, qemu-ppc, tyreld, nfont, afaerber,
	David Gibson

On Mon, 13 Apr 2015 08:33:16 +0530
Bharata B Rao <bharata@linux.vnet.ibm.com> wrote:

> On Thu, Mar 26, 2015 at 02:57:45PM +1100, David Gibson wrote:
> > On Mon, Mar 23, 2015 at 07:06:04PM +0530, Bharata B Rao wrote:
> > > Make use of pc-dimm infrastructure to support memory hotplug
> > > for PowerPC.
> > > 
> > > Modelled on i386 memory hotplug.
> > > 
> > > Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
> > > ---
> > >  hw/ppc/spapr.c        | 119 +++++++++++++++++++++++++++++++++++++++++++++++++-
> > >  hw/ppc/spapr_events.c |   3 ++
> > >  2 files changed, 120 insertions(+), 2 deletions(-)
> > > 
> > > diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> > > index 4e844ab..bc46acd 100644
> > > --- a/hw/ppc/spapr.c
> > > +++ b/hw/ppc/spapr.c
> > > @@ -61,7 +61,8 @@
> > >  #include "hw/nmi.h"
> > >  
> > >  #include "hw/compat.h"
> > > -
> > > +#include "hw/mem/pc-dimm.h"
> > > +#include "qapi/qmp/qerror.h"
> > >  #include <libfdt.h>
> > >  
> > >  /* SLOF memory layout:
> > > @@ -902,6 +903,10 @@ int spapr_h_cas_compose_response(target_ulong addr, target_ulong size,
> > >          _FDT((spapr_populate_memory(spapr, fdt)));
> > >      }
> > >  
> > > +    if (spapr->dr_lmb_enabled) {
> > > +        _FDT(spapr_drc_populate_dt(fdt, 0, NULL, SPAPR_DR_CONNECTOR_TYPE_LMB));
> > > +    }
> > > +
> > >      /* Pack resulting tree */
> > >      _FDT((fdt_pack(fdt)));
> > >  
> > > @@ -2185,6 +2190,109 @@ static void spapr_cpu_socket_unplug(HotplugHandler *hotplug_dev,
> > >      object_child_foreach(OBJECT(dev), spapr_cpu_core_unplug, errp);
> > >  }
> > >  
> > > +static void spapr_add_lmbs(uint64_t addr, uint64_t size, Error **errp)
> > > +{
> > > +    sPAPRDRConnector *drc;
> > > +    uint32_t nr_lmbs = size/SPAPR_MEMORY_BLOCK_SIZE;
> > > +    int i;
> > > +
> > > +    if (size % SPAPR_MEMORY_BLOCK_SIZE) {
> > > +        error_setg(errp, "Hotplugged memory size must be a multiple of "
> > > +                      "%d MB", SPAPR_MEMORY_BLOCK_SIZE/(1024 * 1024));
> > 
> > s/MB/MiB/  there seems to be precedent for using the mebibyte term in
> > qemu.
> > 
> > Also you can use the MiB #define instead of (1024 * 1024).
> > 
> > > +        return;
> > > +    }
> > > +
> > > +    for (i = 0; i < nr_lmbs; i++) {
> > > +        drc = spapr_dr_connector_by_id(SPAPR_DR_CONNECTOR_TYPE_LMB,
> > > +                addr/SPAPR_MEMORY_BLOCK_SIZE);
> > > +        g_assert(drc);
> > > +        spapr_hotplug_req_add_event(drc);
> > > +        addr += SPAPR_MEMORY_BLOCK_SIZE;
> > > +    }
> > > +}
> > > +
> > > +static void spapr_memory_plug(HotplugHandler *hotplug_dev,
> > > +                         DeviceState *dev, Error **errp)
> > > +{
> > > +    int slot;
> > > +    Error *local_err = NULL;
> > > +    sPAPRMachineState *ms = SPAPR_MACHINE(hotplug_dev);
> > > +    MachineState *machine = MACHINE(hotplug_dev);
> > > +    PCDIMMDevice *dimm = PC_DIMM(dev);
> > > +    PCDIMMDeviceClass *ddc = PC_DIMM_GET_CLASS(dimm);
> > > +    MemoryRegion *mr = ddc->get_memory_region(dimm);
> > > +    uint64_t existing_dimms_capacity = 0;
> > > +    uint64_t align = TARGET_PAGE_SIZE;
> > > +    uint64_t addr;
> > > +
> > > +    addr = object_property_get_int(OBJECT(dimm), PC_DIMM_ADDR_PROP, &local_err);
> > > +    if (local_err) {
> > > +        goto out;
> > > +    }
> > > +
> > > +    if (memory_region_get_alignment(mr) && ms->enforce_aligned_dimm) {
> > > +        align = memory_region_get_alignment(mr);
> > > +    }
> > > +
> > > +    addr = pc_dimm_get_free_addr(ms->hotplug_memory_base,
> > > +                                 memory_region_size(&ms->hotplug_memory),
> > > +                                 !addr ? NULL : &addr, align,
> > > +                                 memory_region_size(mr), &local_err);
> > > +    if (local_err) {
> > > +        goto out;
> > > +    }
> > > +
> > > +    existing_dimms_capacity = pc_existing_dimms_capacity(&local_err);
> > > +    if (local_err) {
> > > +        goto out;
> > > +    }
> > > +
> > > +    if (existing_dimms_capacity + memory_region_size(mr) >
> > > +        machine->maxram_size - machine->ram_size) {
> > > +        error_setg(&local_err, "not enough space, currently 0x%" PRIx64
> > > +                   " in use of total hot pluggable 0x" RAM_ADDR_FMT,
> > > +                   existing_dimms_capacity,
> > > +                   machine->maxram_size - machine->ram_size);
> > > +        goto out;
> > > +    }
> > > +
> > > +    object_property_set_int(OBJECT(dev), addr, PC_DIMM_ADDR_PROP, &local_err);
> > > +    if (local_err) {
> > > +        goto out;
> > > +    }
> > > +    trace_mhp_pc_dimm_assigned_address(addr);
> > > +
> > > +    slot = object_property_get_int(OBJECT(dev), PC_DIMM_SLOT_PROP, &local_err);
> > > +    if (local_err) {
> > > +        goto out;
> > > +    }
> > > +
> > > +    slot = pc_dimm_get_free_slot(slot == PC_DIMM_UNASSIGNED_SLOT ? NULL : &slot,
> > > +                                 machine->ram_slots, &local_err);
> > > +    if (local_err) {
> > > +        goto out;
> > > +    }
> > > +    object_property_set_int(OBJECT(dev), slot, PC_DIMM_SLOT_PROP, &local_err);
> > > +    if (local_err) {
> > > +        goto out;
> > > +    }
> > > +    trace_mhp_pc_dimm_assigned_slot(slot);
> > > +
> > > +    if (kvm_enabled() && !kvm_has_free_slot(machine)) {
> > > +        error_setg(&local_err, "hypervisor has no free memory slots left");
> > > +        goto out;
> > > +    }
> > > +
> > > +    memory_region_add_subregion(&ms->hotplug_memory,
> > > +                                addr - ms->hotplug_memory_base, mr);
> > > +    vmstate_register_ram(mr, dev);
> > > +
> > > +    spapr_add_lmbs(addr, memory_region_size(mr), &local_err);
> > 
> > It really looks as if it should be possible to make a common function
> > covering most of this and pc_dimm_plug, with the only difference being
> > the final call to do the arch specific stuff.  Even that might be able
> > to just be a hotplug handler call if you add a parameter for the
> > appropriate hotplug handler object, instead of assuming the ACPI device
> > on the PC side.
> 
> This routine and the equivalent x86 implementation use hotplug_memory_base
> and hotplug_memory fields from sPAPRMachineState and PCMachineState
> respectively. If we move these fields to MachineState, then it will be much
> easier to share the common code. Is that fine?
Memory hotplug is not applicable to all machines, so the answer is probably no.

But it should be possible to create a MachineHotplugMemory interface
to help us get around the single-parent-class limit of QOM and get the memory
hotplug code shared in a nice way.

> 
> Regards,
> Bharata.
> 


* Re: [Qemu-devel] [RFC PATCH v2 21/23] spapr: Initialize hotplug memory address space
  2015-04-13 14:04       ` Igor Mammedov
@ 2015-04-13 14:27         ` Bharata B Rao
  2015-04-13 14:55           ` Igor Mammedov
  0 siblings, 1 reply; 74+ messages in thread
From: Bharata B Rao @ 2015-04-13 14:27 UTC (permalink / raw)
  To: Igor Mammedov
  Cc: qemu-devel, mdroth, agraf, qemu-ppc, tyreld, nfont, afaerber,
	David Gibson

On Mon, Apr 13, 2015 at 04:04:24PM +0200, Igor Mammedov wrote:
> On Mon, 13 Apr 2015 08:29:33 +0530
> Bharata B Rao <bharata@linux.vnet.ibm.com> wrote:
> 
> > On Wed, Mar 25, 2015 at 04:58:18PM +1100, David Gibson wrote:
> > > On Mon, Mar 23, 2015 at 07:06:02PM +0530, Bharata B Rao wrote:
> > > > Initialize a hotplug memory region under which all the hotplugged
> > > > memory is accommodated. Also enable memory hotplug by setting
> > > > CONFIG_MEM_HOTPLUG.
> > > > 
> > > > Modelled on i386 memory hotplug.
> > > > 
> > > > Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
> > > > ---
> > > >  default-configs/ppc64-softmmu.mak |  1 +
> > > >  hw/ppc/spapr.c                    | 50 +++++++++++++++++++++++++++++++++++++++
> > > >  include/hw/ppc/spapr.h            | 12 ++++++++++
> > > >  3 files changed, 63 insertions(+)
> > > > 
> > > > diff --git a/default-configs/ppc64-softmmu.mak b/default-configs/ppc64-softmmu.mak
> > > > index 22ef132..16b3011 100644
> > > > --- a/default-configs/ppc64-softmmu.mak
> > > > +++ b/default-configs/ppc64-softmmu.mak
> > > > @@ -51,3 +51,4 @@ CONFIG_XICS_KVM=$(and $(CONFIG_PSERIES),$(CONFIG_KVM))
> > > >  # For PReP
> > > >  CONFIG_MC146818RTC=y
> > > >  CONFIG_ISA_TESTDEV=y
> > > > +CONFIG_MEM_HOTPLUG=y
> > > > diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> > > > index 3e56d9e..e43bb49 100644
> > > > --- a/hw/ppc/spapr.c
> > > > +++ b/hw/ppc/spapr.c
> > > > @@ -125,8 +125,13 @@ struct sPAPRMachineState {
> > > >  
> > > >      /*< public >*/
> > > >      char *kvm_type;
> > > > +    ram_addr_t hotplug_memory_base;
> > > > +    MemoryRegion hotplug_memory;
> > > > +    bool enforce_aligned_dimm;
> > > >  };
> > > >  
> > > > +#define SPAPR_MACHINE_ENFORCE_ALIGNED_DIMM "enforce-aligned-dimm"
> > > 
> > > What's the rationale for including this option?
> > 
> > I couldn't see any use for it, but added it to be consistent with x86. Maybe
> > Igor can tell us?
> At least on x86, KVM requires the memory region to be aligned (otherwise it
> would abort), and performance-wise it's also better to map regions at offsets
> aligned to the backend page size (especially applicable to hugepages).
> 
> So the region base on x86 is 1 GiB aligned to support up to 1 GiB hugepages.

My question was specifically about the enforce-aligned-dimm object property
that has a get method. I was wondering where this property is consumed. All
I could see is that PCMachineState.enforce_aligned_dimm is used/referenced
directly.

> 
> > 
> > Regards,
> > Bharata.
> > 


* Re: [Qemu-devel] [RFC PATCH v2 21/23] spapr: Initialize hotplug memory address space
  2015-04-13 14:27         ` Bharata B Rao
@ 2015-04-13 14:55           ` Igor Mammedov
  2015-04-14  7:17             ` David Gibson
  0 siblings, 1 reply; 74+ messages in thread
From: Igor Mammedov @ 2015-04-13 14:55 UTC (permalink / raw)
  To: Bharata B Rao
  Cc: qemu-devel, mdroth, agraf, qemu-ppc, tyreld, nfont, afaerber,
	David Gibson

On Mon, 13 Apr 2015 19:57:05 +0530
Bharata B Rao <bharata@linux.vnet.ibm.com> wrote:

> On Mon, Apr 13, 2015 at 04:04:24PM +0200, Igor Mammedov wrote:
> > On Mon, 13 Apr 2015 08:29:33 +0530
> > Bharata B Rao <bharata@linux.vnet.ibm.com> wrote:
> > 
> > > On Wed, Mar 25, 2015 at 04:58:18PM +1100, David Gibson wrote:
> > > > On Mon, Mar 23, 2015 at 07:06:02PM +0530, Bharata B Rao wrote:
> > > > > Initialize a hotplug memory region under which all the hotplugged
> > > > > memory is accommodated. Also enable memory hotplug by setting
> > > > > CONFIG_MEM_HOTPLUG.
> > > > > 
> > > > > Modelled on i386 memory hotplug.
> > > > > 
> > > > > Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
> > > > > ---
> > > > >  default-configs/ppc64-softmmu.mak |  1 +
> > > > >  hw/ppc/spapr.c                    | 50 +++++++++++++++++++++++++++++++++++++++
> > > > >  include/hw/ppc/spapr.h            | 12 ++++++++++
> > > > >  3 files changed, 63 insertions(+)
> > > > > 
> > > > > diff --git a/default-configs/ppc64-softmmu.mak b/default-configs/ppc64-softmmu.mak
> > > > > index 22ef132..16b3011 100644
> > > > > --- a/default-configs/ppc64-softmmu.mak
> > > > > +++ b/default-configs/ppc64-softmmu.mak
> > > > > @@ -51,3 +51,4 @@ CONFIG_XICS_KVM=$(and $(CONFIG_PSERIES),$(CONFIG_KVM))
> > > > >  # For PReP
> > > > >  CONFIG_MC146818RTC=y
> > > > >  CONFIG_ISA_TESTDEV=y
> > > > > +CONFIG_MEM_HOTPLUG=y
> > > > > diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> > > > > index 3e56d9e..e43bb49 100644
> > > > > --- a/hw/ppc/spapr.c
> > > > > +++ b/hw/ppc/spapr.c
> > > > > @@ -125,8 +125,13 @@ struct sPAPRMachineState {
> > > > >  
> > > > >      /*< public >*/
> > > > >      char *kvm_type;
> > > > > +    ram_addr_t hotplug_memory_base;
> > > > > +    MemoryRegion hotplug_memory;
> > > > > +    bool enforce_aligned_dimm;
> > > > >  };
> > > > >  
> > > > > +#define SPAPR_MACHINE_ENFORCE_ALIGNED_DIMM "enforce-aligned-dimm"
> > > > 
> > > > What's the rationale for including this option?
> > > 
> > > I couldn't see any use for it, but added it to be consistent with x86. Maybe
> > > Igor can tell us?
> > At least on x86, KVM requires the memory region to be aligned (otherwise it
> > would abort), and performance-wise it's also better to map regions at offsets
> > aligned to the backend page size (especially applicable to hugepages).
> > 
> > So the region base on x86 is 1 GiB aligned to support up to 1 GiB hugepages.
> 
> My question was specifically about the enforce-aligned-dimm object property
> that has a get method. I was wondering where this property is consumed. All
> I could see is that PCMachineState.enforce_aligned_dimm is used/referenced
> directly.
I think that you don't need it; it was introduced to keep the memory layout
compatible with old machine types that didn't have an aligned memhp region.

Since there is nothing to keep compatible in this target, you can just
drop the property and always align the region as fits spapr.
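
The difference between the two approaches can be sketched as follows. The
first helper follows the alignment selection in the quoted spapr_memory_plug()
hunk; the second shows one possible shape of the simplification suggested
here. Both function names are hypothetical and the code is a standalone
illustration, not QEMU code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Alignment choice as in the quoted patch: use the region's own alignment
 * only when it is non-zero and the compat property asks for it, otherwise
 * fall back to the target page size.
 */
static uint64_t dimm_align_with_property(uint64_t region_align,
                                         bool enforce_aligned_dimm,
                                         uint64_t page_size)
{
    uint64_t align = page_size;

    assert(page_size != 0);
    if (region_align && enforce_aligned_dimm) {
        align = region_align;
    }
    return align;
}

/*
 * Assumed simplification with the property dropped: always honour the
 * region's alignment when it has one.
 */
static uint64_t dimm_align_always(uint64_t region_align, uint64_t page_size)
{
    assert(page_size != 0);
    return region_align ? region_align : page_size;
}
```

The two helpers agree whenever the property is set, so dropping it only
changes behaviour for the legacy layout that spapr never had.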

> 
> > 
> > > 
> > > Regards,
> > > Bharata.
> > > 
> 


* Re: [Qemu-devel] [RFC PATCH v2 21/23] spapr: Initialize hotplug memory address space
  2015-04-13 14:55           ` Igor Mammedov
@ 2015-04-14  7:17             ` David Gibson
  0 siblings, 0 replies; 74+ messages in thread
From: David Gibson @ 2015-04-14  7:17 UTC (permalink / raw)
  To: Igor Mammedov
  Cc: qemu-devel, mdroth, agraf, qemu-ppc, tyreld, Bharata B Rao,
	nfont, afaerber


On Mon, Apr 13, 2015 at 04:55:52PM +0200, Igor Mammedov wrote:
> On Mon, 13 Apr 2015 19:57:05 +0530
> Bharata B Rao <bharata@linux.vnet.ibm.com> wrote:
> 
> > On Mon, Apr 13, 2015 at 04:04:24PM +0200, Igor Mammedov wrote:
> > > On Mon, 13 Apr 2015 08:29:33 +0530
> > > Bharata B Rao <bharata@linux.vnet.ibm.com> wrote:
> > > 
> > > > On Wed, Mar 25, 2015 at 04:58:18PM +1100, David Gibson wrote:
> > > > > On Mon, Mar 23, 2015 at 07:06:02PM +0530, Bharata B Rao wrote:
> > > > > > Initialize a hotplug memory region under which all the hotplugged
> > > > > > memory is accommodated. Also enable memory hotplug by setting
> > > > > > CONFIG_MEM_HOTPLUG.
> > > > > > 
> > > > > > Modelled on i386 memory hotplug.
> > > > > > 
> > > > > > Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
> > > > > > ---
> > > > > >  default-configs/ppc64-softmmu.mak |  1 +
> > > > > >  hw/ppc/spapr.c                    | 50 +++++++++++++++++++++++++++++++++++++++
> > > > > >  include/hw/ppc/spapr.h            | 12 ++++++++++
> > > > > >  3 files changed, 63 insertions(+)
> > > > > > 
> > > > > > diff --git a/default-configs/ppc64-softmmu.mak b/default-configs/ppc64-softmmu.mak
> > > > > > index 22ef132..16b3011 100644
> > > > > > --- a/default-configs/ppc64-softmmu.mak
> > > > > > +++ b/default-configs/ppc64-softmmu.mak
> > > > > > @@ -51,3 +51,4 @@ CONFIG_XICS_KVM=$(and $(CONFIG_PSERIES),$(CONFIG_KVM))
> > > > > >  # For PReP
> > > > > >  CONFIG_MC146818RTC=y
> > > > > >  CONFIG_ISA_TESTDEV=y
> > > > > > +CONFIG_MEM_HOTPLUG=y
> > > > > > diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> > > > > > index 3e56d9e..e43bb49 100644
> > > > > > --- a/hw/ppc/spapr.c
> > > > > > +++ b/hw/ppc/spapr.c
> > > > > > @@ -125,8 +125,13 @@ struct sPAPRMachineState {
> > > > > >  
> > > > > >      /*< public >*/
> > > > > >      char *kvm_type;
> > > > > > +    ram_addr_t hotplug_memory_base;
> > > > > > +    MemoryRegion hotplug_memory;
> > > > > > +    bool enforce_aligned_dimm;
> > > > > >  };
> > > > > >  
> > > > > > +#define SPAPR_MACHINE_ENFORCE_ALIGNED_DIMM "enforce-aligned-dimm"
> > > > > 
> > > > > What's the rationale for including this option?
> > > > 
> > > > I couldn't see any use for it, but added it to be consistent with x86. Maybe
> > > > Igor can tell us?
> > > At least on x86, KVM requires the memory region to be aligned (otherwise it
> > > would abort), and performance-wise it's also better to map regions at offsets
> > > aligned to the backend page size (especially applicable to hugepages).
> > > 
> > > So the region base on x86 is 1 GiB aligned to support up to 1 GiB hugepages.
> > 
> > My question was specifically about the enforce-aligned-dimm object property
> > that has a get method. I was wondering where this property is consumed. All
> > I could see is that PCMachineState.enforce_aligned_dimm is used/referenced
> > directly.
> I think that you don't need it; it was introduced to keep the memory layout
> compatible with old machine types that didn't have an aligned memhp region.
> 
> Since there is nothing to keep compatible in this target, you can just
> drop the property and always align the region as fits spapr.

That sounds like a good idea to me.

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson



end of thread, other threads:[~2015-04-14  7:19 UTC | newest]

Thread overview: 74+ messages
2015-03-23 13:35 [Qemu-devel] [RFC PATCH v2 00/23] CPU and Memory hotplug for PowerPC sPAPR guests Bharata B Rao
2015-03-23 13:35 ` [Qemu-devel] [RFC PATCH v2 01/23] spapr: enable PHB/CPU/LMB hotplug for pseries-2.3 Bharata B Rao
2015-03-25  0:04   ` David Gibson
2015-03-23 13:35 ` [Qemu-devel] [RFC PATCH v2 02/23] spapr: Add DRC dt entries for CPUs Bharata B Rao
2015-03-25  0:07   ` David Gibson
2015-03-25  5:02     ` Bharata B Rao
2015-03-23 13:35 ` [Qemu-devel] [RFC PATCH v2 03/23] spapr: Consider max_cpus during xics initialization Bharata B Rao
2015-03-23 13:35 ` [Qemu-devel] [RFC PATCH v2 04/23] spapr: Support ibm,lrdr-capacity device tree property Bharata B Rao
2015-03-25  0:15   ` David Gibson
2015-04-01  3:59     ` Bharata B Rao
2015-03-23 13:35 ` [Qemu-devel] [RFC PATCH v2 05/23] spapr: Reorganize CPU dt generation code Bharata B Rao
2015-03-25  1:36   ` David Gibson
2015-03-25  8:26     ` Bharata B Rao
2015-03-26  1:40       ` David Gibson
2015-03-23 13:35 ` [Qemu-devel] [RFC PATCH v2 06/23] spapr: Consolidate cpu init code into a routine Bharata B Rao
2015-03-25  1:37   ` David Gibson
2015-03-23 13:35 ` [Qemu-devel] [RFC PATCH v2 07/23] cpu: Prepare Socket container type Bharata B Rao
2015-03-25  2:03   ` David Gibson
2015-03-23 13:35 ` [Qemu-devel] [RFC PATCH v2 08/23] ppc: Prepare CPU socket/core abstraction Bharata B Rao
2015-03-25  2:06   ` David Gibson
2015-03-23 13:35 ` [Qemu-devel] [RFC PATCH v2 09/23] spapr: Add CPU hotplug handler Bharata B Rao
2015-03-25  2:08   ` David Gibson
2015-03-23 13:35 ` [Qemu-devel] [RFC PATCH v2 10/23] ppc: Update cpu_model in MachineState Bharata B Rao
2015-03-25  2:30   ` David Gibson
2015-03-23 13:35 ` [Qemu-devel] [RFC PATCH v2 11/23] ppc: Create sockets and cores for CPUs Bharata B Rao
2015-03-25  2:39   ` David Gibson
2015-03-25  8:33     ` Bharata B Rao
2015-03-26  1:54       ` David Gibson
2015-03-23 13:35 ` [Qemu-devel] [RFC PATCH v2 12/23] spapr: CPU hotplug support Bharata B Rao
2015-03-25  3:03   ` David Gibson
2015-03-25  8:36     ` Bharata B Rao
2015-03-26  1:42       ` David Gibson
2015-03-23 13:35 ` [Qemu-devel] [RFC PATCH v2 13/23] cpus: Add Error argument to cpu_exec_init() Bharata B Rao
2015-03-25  3:12   ` David Gibson
2015-03-23 13:35 ` [Qemu-devel] [RFC PATCH v2 14/23] cpus: Convert cpu_index into a bitmap Bharata B Rao
2015-03-25  3:23   ` David Gibson
2015-03-25  8:52     ` Bharata B Rao
2015-03-23 13:35 ` [Qemu-devel] [RFC PATCH v2 15/23] ppc: Move cpu_exec_init() call to realize function Bharata B Rao
2015-03-25  3:25   ` David Gibson
2015-03-25  8:56     ` Bharata B Rao
2015-03-23 13:35 ` [Qemu-devel] [RFC PATCH v2 16/23] cpus: Reclaim vCPU objects Bharata B Rao
2015-03-25  5:22   ` David Gibson
2015-03-23 13:35 ` [Qemu-devel] [RFC PATCH v2 17/23] xics_kvm: Don't enable KVM_CAP_IRQ_XICS if already enabled Bharata B Rao
2015-03-25  5:24   ` David Gibson
2015-03-25  9:12     ` Bharata B Rao
2015-03-26  1:46       ` David Gibson
2015-03-23 13:35 ` [Qemu-devel] [RFC PATCH v2 18/23] xics_kvm: Add cpu_destroy method to XICS Bharata B Rao
2015-03-25  5:26   ` David Gibson
2015-03-23 13:36 ` [Qemu-devel] [RFC PATCH v2 19/23] spapr: CPU hot unplug support Bharata B Rao
2015-03-25  5:44   ` David Gibson
2015-03-25 16:34     ` Bharata B Rao
2015-04-07  6:45   ` [Qemu-devel] [Qemu-ppc] " Alexey Kardashevskiy
2015-04-09  3:51     ` Bharata B Rao
2015-03-23 13:36 ` [Qemu-devel] [RFC PATCH v2 20/23] spapr: Remove vCPU objects after CPU hot unplug Bharata B Rao
2015-03-25  5:46   ` David Gibson
2015-03-23 13:36 ` [Qemu-devel] [RFC PATCH v2 21/23] spapr: Initialize hotplug memory address space Bharata B Rao
2015-03-25  5:58   ` David Gibson
2015-04-13  2:59     ` Bharata B Rao
2015-04-13 14:04       ` Igor Mammedov
2015-04-13 14:27         ` Bharata B Rao
2015-04-13 14:55           ` Igor Mammedov
2015-04-14  7:17             ` David Gibson
2015-03-23 13:36 ` [Qemu-devel] [RFC PATCH v2 22/23] spapr: Support ibm,dynamic-reconfiguration-memory Bharata B Rao
2015-03-26  3:44   ` David Gibson
2015-03-30  9:11     ` Bharata B Rao
2015-03-31  2:19       ` David Gibson
2015-03-23 13:36 ` [Qemu-devel] [RFC PATCH v2 23/23] spapr: Memory hotplug support Bharata B Rao
2015-03-26  3:57   ` David Gibson
2015-04-13  3:03     ` Bharata B Rao
2015-04-13 14:12       ` Igor Mammedov
2015-03-26  3:58 ` [Qemu-devel] [RFC PATCH v2 00/23] CPU and Memory hotplug for PowerPC sPAPR guests David Gibson
2015-03-26  4:16   ` Bharata B Rao
2015-04-06 10:19 ` Bharata B Rao
2015-04-07  8:57   ` Igor Mammedov
