* [Qemu-devel] [PATCH RFC 0/4] target-i386: PC socket/core/thread modeling, part 1
From: Andreas Färber @ 2015-03-23 17:31 UTC
  To: qemu-devel
  Cc: Peter Maydell, Eduardo Habkost, Christian Borntraeger,
	Paolo Bonzini, Bharata B Rao, Igor Mammedov, Andreas Färber

Hello,

This long-postponed series proposes a hierarchical QOM model of socket
and core objects for the x86 PC machines.

Background is that due to qdev limitations we had to introduce an ICC bus
to be able to hot-add CPUs and their APICs. By now this limitation could be
resolved via a QOM hotplug handler interface.
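
For reference, the shape of that interface (a minimal sketch; the
HotplugHandler callback and class macro are the real QOM API, the machine
wiring shown is illustrative):

    static void my_machine_cpu_plug(HotplugHandler *hotplug_dev,
                                    DeviceState *dev, Error **errp)
    {
        /* wire up the newly realized CPU: APIC, ACPI hotplug state, ... */
    }

    static void my_machine_class_init(ObjectClass *oc, void *data)
    {
        HotplugHandlerClass *hhc = HOTPLUG_HANDLER_CLASS(oc);

        /* plug() is invoked by the core after the device is realized */
        hhc->plug = my_machine_cpu_plug;
    }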

However, the QOM hotplug model is associated with having link<> properties.
Given that physically we cannot hot-add individual hyperthreads, only full CPU sockets,
this series prepares the underbelly for having those link properties be of
the new type X86CPUSocket rather than X86CPU.

As the final step in this series, the X86CPU allocation model is changed to be
initialized in-place, as part of an "atomic" socket object. A follow-up will be
to attempt the same in-place allocation model for the APIC; one difficulty
there is that several places do a NULL check on the APIC pointer as a quick
way of telling whether an APIC is present or not.
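
The pattern in question looks roughly like this (a sketch; apic_state is the
actual X86CPU field, the rest is illustrative):

    /* common idiom in target-i386 code today: */
    if (cpu->apic_state) {
        /* APIC present: deliver interrupt, dump state, ... */
    }

    /* With an APIC embedded in-place in the CPU object, the pointer
     * would always be non-NULL, so such checks would need to be
     * replaced by an explicit "APIC present" flag or similar. */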

NOT IN THIS SERIES is converting cpu-add to the same socket/core/thread model
and initializing them in-place. The current PoC implementation assumes that
CPUs get added sequentially and that the preceding CPU can be used to obtain
the core via unclean QOM parent accesses.
IIUC that must be changed so that an arbitrary thread added via cpu-add
creates the full socket and core(s). That would work best if indexed link<>
properties could be used (see the sketch below). That leaves an undecided
design question: whether we want /machine/cpu-socket[n], or whether it makes
sense to integrate NUMA modeling while at it and have /machine/node[n]/socket[m]
instead.
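
For illustration, indexed link<> properties under the first option might look
like this (a sketch only; nr_sockets and socket_ptrs are hypothetical, the QOM
calls and the flag are the existing API):

    Object *machine = qdev_get_machine();
    int n;

    for (n = 0; n < nr_sockets; n++) {
        gchar *name = g_strdup_printf("cpu-socket[%d]", n);

        object_property_add_link(machine, name, TYPE_X86_CPU_SOCKET,
                                 (Object **)&socket_ptrs[n], /* hypothetical */
                                 object_property_allow_set_link,
                                 OBJ_PROP_LINK_UNREF_ON_RELEASE,
                                 &error_abort);
        g_free(name);
    }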

Note that this socket modeling is not only PC-specific in the softmmu sense
but also in that not every X86CPU must be on a socket (e.g., Quark X1000).
Therefore it was a conscious decision to not label some things target-i386
and to place code in pc.c rather than cpu.c.

Further note that this series ignores that -smp enforces that AMD CPUs don't
have hyperthreads, i.e. AMD X86CPUs will have only one thread[n] child<>.

Context:

   qemu.git master
   "pc: Ensure non-zero CPU ref count after attaching to ICC bus"
-> this series adding socket/core objects
   cpu-add conversion
   APIC cleanups

Available for testing here:
git://github.com/afaerber/qemu-cpu.git qom-cpu-x86-sockets-1.v1
https://github.com/afaerber/qemu-cpu/commits/qom-cpu-x86-sockets-1.v1

Regards,
Andreas

Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Peter Maydell <peter.maydell@linaro.org>
Cc: Bharata B Rao <bharata@linux.vnet.ibm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>

Andreas Färber (4):
  cpu: Prepare Socket container type
  target-i386: Prepare CPU socket/core abstraction
  pc: Create sockets and cores for CPUs
  pc: Create initial CPUs in-place

 hw/cpu/Makefile.objs         |  2 +-
 hw/cpu/socket.c              | 21 ++++++++++
 hw/i386/Makefile.objs        |  1 +
 hw/i386/cpu-core.c           | 45 +++++++++++++++++++++
 hw/i386/cpu-socket.c         | 45 +++++++++++++++++++++
 hw/i386/pc.c                 | 95 ++++++++++++++++++++++++++++++++++++++++----
 include/hw/cpu/socket.h      | 14 +++++++
 include/hw/i386/cpu-core.h   | 29 ++++++++++++++
 include/hw/i386/cpu-socket.h | 29 ++++++++++++++
 9 files changed, 272 insertions(+), 9 deletions(-)
 create mode 100644 hw/cpu/socket.c
 create mode 100644 hw/i386/cpu-core.c
 create mode 100644 hw/i386/cpu-socket.c
 create mode 100644 include/hw/cpu/socket.h
 create mode 100644 include/hw/i386/cpu-core.h
 create mode 100644 include/hw/i386/cpu-socket.h

-- 
2.1.4


* [Qemu-devel] [PATCH RFC 1/4] cpu: Prepare Socket container type
From: Andreas Färber @ 2015-03-23 17:32 UTC
  To: qemu-devel; +Cc: Andreas Färber

Signed-off-by: Andreas Färber <afaerber@suse.de>
---
 hw/cpu/Makefile.objs    |  2 +-
 hw/cpu/socket.c         | 21 +++++++++++++++++++++
 include/hw/cpu/socket.h | 14 ++++++++++++++
 3 files changed, 36 insertions(+), 1 deletion(-)
 create mode 100644 hw/cpu/socket.c
 create mode 100644 include/hw/cpu/socket.h

diff --git a/hw/cpu/Makefile.objs b/hw/cpu/Makefile.objs
index 6381238..e6890cf 100644
--- a/hw/cpu/Makefile.objs
+++ b/hw/cpu/Makefile.objs
@@ -3,4 +3,4 @@ obj-$(CONFIG_REALVIEW) += realview_mpcore.o
 obj-$(CONFIG_A9MPCORE) += a9mpcore.o
 obj-$(CONFIG_A15MPCORE) += a15mpcore.o
 obj-$(CONFIG_ICC_BUS) += icc_bus.o
-
+obj-y += socket.o
diff --git a/hw/cpu/socket.c b/hw/cpu/socket.c
new file mode 100644
index 0000000..5ca47e9
--- /dev/null
+++ b/hw/cpu/socket.c
@@ -0,0 +1,21 @@
+/*
+ * CPU socket abstraction
+ *
+ * Copyright (c) 2013-2014 SUSE LINUX Products GmbH
+ * Copyright (c) 2015 SUSE Linux GmbH
+ */
+
+#include "hw/cpu/socket.h"
+
+static const TypeInfo cpu_socket_type_info = {
+    .name = TYPE_CPU_SOCKET,
+    .parent = TYPE_DEVICE,
+    .abstract = true,
+};
+
+static void cpu_socket_register_types(void)
+{
+    type_register_static(&cpu_socket_type_info);
+}
+
+type_init(cpu_socket_register_types)
diff --git a/include/hw/cpu/socket.h b/include/hw/cpu/socket.h
new file mode 100644
index 0000000..c8e0c18
--- /dev/null
+++ b/include/hw/cpu/socket.h
@@ -0,0 +1,14 @@
+/*
+ * CPU socket abstraction
+ *
+ * Copyright (c) 2013-2014 SUSE LINUX Products GmbH
+ * Copyright (c) 2015 SUSE Linux GmbH
+ */
+#ifndef HW_CPU_SOCKET_H
+#define HW_CPU_SOCKET_H
+
+#include "hw/qdev.h"
+
+#define TYPE_CPU_SOCKET "cpu-socket"
+
+#endif
-- 
2.1.4


* [Qemu-devel] [PATCH RFC 2/4] target-i386: Prepare CPU socket/core abstraction
From: Andreas Färber @ 2015-03-23 17:32 UTC
  To: qemu-devel
  Cc: Paolo Bonzini, Michael S. Tsirkin, Andreas Färber,
	Richard Henderson

Short of generic recursive device realization, realize cores and threads
recursively.

Signed-off-by: Andreas Färber <afaerber@suse.de>
---
 hw/i386/Makefile.objs        |  1 +
 hw/i386/cpu-core.c           | 45 ++++++++++++++++++++++++++++++++++++++++++++
 hw/i386/cpu-socket.c         | 45 ++++++++++++++++++++++++++++++++++++++++++++
 include/hw/i386/cpu-core.h   | 26 +++++++++++++++++++++++++
 include/hw/i386/cpu-socket.h | 29 ++++++++++++++++++++++++++++
 5 files changed, 146 insertions(+)
 create mode 100644 hw/i386/cpu-core.c
 create mode 100644 hw/i386/cpu-socket.c
 create mode 100644 include/hw/i386/cpu-core.h
 create mode 100644 include/hw/i386/cpu-socket.h

diff --git a/hw/i386/Makefile.objs b/hw/i386/Makefile.objs
index e058a39..9f76424 100644
--- a/hw/i386/Makefile.objs
+++ b/hw/i386/Makefile.objs
@@ -1,5 +1,6 @@
 obj-$(CONFIG_KVM) += kvm/
 obj-y += multiboot.o smbios.o
+obj-y += cpu-socket.o cpu-core.o
 obj-y += pc.o pc_piix.o pc_q35.o
 obj-y += pc_sysfw.o
 obj-y += intel_iommu.o
diff --git a/hw/i386/cpu-core.c b/hw/i386/cpu-core.c
new file mode 100644
index 0000000..d579025
--- /dev/null
+++ b/hw/i386/cpu-core.c
@@ -0,0 +1,45 @@
+/*
+ * x86 CPU core abstraction
+ *
+ * Copyright (c) 2015 SUSE Linux GmbH
+ */
+
+#include "hw/i386/cpu-core.h"
+
+static int x86_cpu_core_realize_child(Object *child, void *opaque)
+{
+    Error **errp = opaque;
+    Error *local_err = NULL;
+
+    object_property_set_bool(child, true, "realized", &local_err);
+    error_propagate(errp, local_err);
+
+    return local_err != NULL ? 1 : 0;
+}
+
+static void x86_cpu_core_realize(DeviceState *dev, Error **errp)
+{
+    /* XXX generic */
+    object_child_foreach(OBJECT(dev), x86_cpu_core_realize_child, errp);
+}
+
+static void x86_cpu_core_class_init(ObjectClass *oc, void *data)
+{
+    DeviceClass *dc = DEVICE_CLASS(oc);
+
+    dc->realize = x86_cpu_core_realize;
+}
+
+static const TypeInfo x86_cpu_core_type_info = {
+    .name = TYPE_X86_CPU_CORE,
+    .parent = TYPE_DEVICE,
+    .instance_size = sizeof(X86CPUCore),
+    .class_init = x86_cpu_core_class_init,
+};
+
+static void x86_cpu_core_register_types(void)
+{
+    type_register_static(&x86_cpu_core_type_info);
+}
+
+type_init(x86_cpu_core_register_types)
diff --git a/hw/i386/cpu-socket.c b/hw/i386/cpu-socket.c
new file mode 100644
index 0000000..dc27400
--- /dev/null
+++ b/hw/i386/cpu-socket.c
@@ -0,0 +1,45 @@
+/*
+ * x86 CPU socket abstraction
+ *
+ * Copyright (c) 2015 SUSE Linux GmbH
+ */
+
+#include "hw/i386/cpu-socket.h"
+
+static int x86_cpu_socket_realize_child(Object *child, void *opaque)
+{
+    Error **errp = opaque;
+    Error *local_err = NULL;
+
+    object_property_set_bool(child, true, "realized", &local_err);
+    error_propagate(errp, local_err);
+
+    return local_err != NULL ? 1 : 0;
+}
+
+static void x86_cpu_socket_realize(DeviceState *dev, Error **errp)
+{
+    /* XXX generic */
+    object_child_foreach(OBJECT(dev), x86_cpu_socket_realize_child, errp);
+}
+
+static void x86_cpu_socket_class_init(ObjectClass *oc, void *data)
+{
+    DeviceClass *dc = DEVICE_CLASS(oc);
+
+    dc->realize = x86_cpu_socket_realize;
+}
+
+static const TypeInfo x86_cpu_socket_type_info = {
+    .name = TYPE_X86_CPU_SOCKET,
+    .parent = TYPE_CPU_SOCKET,
+    .instance_size = sizeof(X86CPUSocket),
+    .class_init = x86_cpu_socket_class_init,
+};
+
+static void x86_cpu_socket_register_types(void)
+{
+    type_register_static(&x86_cpu_socket_type_info);
+}
+
+type_init(x86_cpu_socket_register_types)
diff --git a/include/hw/i386/cpu-core.h b/include/hw/i386/cpu-core.h
new file mode 100644
index 0000000..be78f95
--- /dev/null
+++ b/include/hw/i386/cpu-core.h
@@ -0,0 +1,26 @@
+/*
+ * x86 CPU core abstraction
+ *
+ * Copyright (c) 2015 SUSE Linux GmbH
+ */
+#ifndef HW_I386_CPU_CORE_H
+#define HW_I386_CPU_CORE_H
+
+#include "hw/qdev.h"
+
+#ifdef TARGET_X86_64
+#define TYPE_X86_CPU_CORE "x86_64-cpu-core"
+#else
+#define TYPE_X86_CPU_CORE "i386-cpu-core"
+#endif
+
+#define X86_CPU_CORE(obj) \
+    OBJECT_CHECK(X86CPUCore, (obj), TYPE_X86_CPU_CORE)
+
+typedef struct X86CPUCore {
+    /*< private >*/
+    DeviceState parent_obj;
+    /*< public >*/
+} X86CPUCore;
+
+#endif
diff --git a/include/hw/i386/cpu-socket.h b/include/hw/i386/cpu-socket.h
new file mode 100644
index 0000000..9fb3ef8
--- /dev/null
+++ b/include/hw/i386/cpu-socket.h
@@ -0,0 +1,29 @@
+/*
+ * x86 CPU socket abstraction
+ *
+ * Copyright (c) 2015 SUSE Linux GmbH
+ */
+#ifndef HW_I386_CPU_SOCKET_H
+#define HW_I386_CPU_SOCKET_H
+
+#include "hw/cpu/socket.h"
+#include "cpu-core.h"
+
+#ifdef TARGET_X86_64
+#define TYPE_X86_CPU_SOCKET "x86_64-cpu-socket"
+#else
+#define TYPE_X86_CPU_SOCKET "i386-cpu-socket"
+#endif
+
+#define X86_CPU_SOCKET(obj) \
+    OBJECT_CHECK(X86CPUSocket, (obj), TYPE_X86_CPU_SOCKET)
+
+typedef struct X86CPUSocket {
+    /*< private >*/
+    DeviceState parent_obj;
+    /*< public >*/
+
+    X86CPUCore core[0];
+} X86CPUSocket;
+
+#endif
-- 
2.1.4


* [Qemu-devel] [PATCH RFC 3/4] pc: Create sockets and cores for CPUs
From: Andreas Färber @ 2015-03-23 17:32 UTC
  To: qemu-devel
  Cc: Paolo Bonzini, Richard Henderson, Andreas Färber,
	Michael S. Tsirkin

Inline realized=true from pc_new_cpu() so that the realization can be
deferred, as it would otherwise create a device[n] node.

Signed-off-by: Andreas Färber <afaerber@suse.de>
---
 hw/i386/pc.c | 66 ++++++++++++++++++++++++++++++++++++++++++++++++++++--------
 1 file changed, 58 insertions(+), 8 deletions(-)

diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 2c48277..492c262 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -54,11 +54,14 @@
 #include "exec/memory.h"
 #include "exec/address-spaces.h"
 #include "sysemu/arch_init.h"
+#include "sysemu/cpus.h"
 #include "qemu/bitmap.h"
 #include "qemu/config-file.h"
 #include "hw/acpi/acpi.h"
 #include "hw/acpi/cpu_hotplug.h"
 #include "hw/cpu/icc_bus.h"
+#include "hw/i386/cpu-socket.h"
+#include "hw/i386/cpu-core.h"
 #include "hw/boards.h"
 #include "hw/pci/pci_host.h"
 #include "acpi-build.h"
@@ -990,6 +993,17 @@ void pc_acpi_smi_interrupt(void *opaque, int irq, int level)
     }
 }
 
+static inline size_t pc_cpu_core_size(void)
+{
+    return sizeof(X86CPUCore);
+}
+
+static inline X86CPUCore *pc_cpu_socket_get_core(X86CPUSocket *socket,
+                                                 unsigned int index)
+{
+    return &socket->core[index];
+}
+
 static X86CPU *pc_new_cpu(const char *cpu_model, int64_t apic_id,
                           DeviceState *icc_bridge, Error **errp)
 {
@@ -1009,7 +1023,6 @@ static X86CPU *pc_new_cpu(const char *cpu_model, int64_t apic_id,
     qdev_set_parent_bus(DEVICE(cpu), qdev_get_child_bus(icc_bridge, "icc"));
 
     object_property_set_int(OBJECT(cpu), apic_id, "apic-id", &local_err);
-    object_property_set_bool(OBJECT(cpu), true, "realized", &local_err);
 
 out:
     if (local_err) {
@@ -1060,15 +1073,19 @@ void pc_hot_add_cpu(const int64_t id, Error **errp)
         error_propagate(errp, local_err);
         return;
     }
+    object_property_set_bool(OBJECT(cpu), true, "realized", errp);
     object_unref(OBJECT(cpu));
 }
 
 void pc_cpus_init(const char *cpu_model, DeviceState *icc_bridge)
 {
-    int i;
+    int i, j, k;
+    X86CPUSocket *socket;
+    X86CPUCore *core;
     X86CPU *cpu = NULL;
     Error *error = NULL;
     unsigned long apic_id_limit;
+    int sockets, cpu_index = 0;
 
     /* init CPUs */
     if (cpu_model == NULL) {
@@ -1087,14 +1104,41 @@ void pc_cpus_init(const char *cpu_model, DeviceState *icc_bridge)
         exit(1);
     }
 
-    for (i = 0; i < smp_cpus; i++) {
-        cpu = pc_new_cpu(cpu_model, x86_cpu_apic_id_from_index(i),
-                         icc_bridge, &error);
+    sockets = smp_cpus / smp_cores / smp_threads;
+    for (i = 0; i < sockets; i++) {
+        socket = g_malloc0(sizeof(*socket) + smp_cores * pc_cpu_core_size());
+        object_initialize(socket, sizeof(*socket), TYPE_X86_CPU_SOCKET);
+        OBJECT(socket)->free = g_free;
+
+        for (j = 0; j < smp_cores; j++) {
+            core = pc_cpu_socket_get_core(socket, j);
+            object_initialize(core, sizeof(*core), TYPE_X86_CPU_CORE);
+            object_property_add_child(OBJECT(socket), "core[*]",
+                                      OBJECT(core), &error);
+            if (error) {
+                goto error;
+            }
+
+            for (k = 0; k < smp_threads; k++) {
+                cpu = pc_new_cpu(cpu_model,
+                                 x86_cpu_apic_id_from_index(cpu_index),
+                                 icc_bridge, &error);
+                if (error) {
+                    goto error;
+                }
+                object_property_add_child(OBJECT(core), "thread[*]",
+                                          OBJECT(cpu), &error);
+                object_unref(OBJECT(cpu));
+                if (error) {
+                    goto error;
+                }
+                cpu_index++;
+            }
+        }
+        object_property_set_bool(OBJECT(socket), true, "realized", &error);
         if (error) {
-            error_report_err(error);
-            exit(1);
+            goto error;
         }
-        object_unref(OBJECT(cpu));
     }
 
     /* map APIC MMIO area if CPU has APIC */
@@ -1106,6 +1150,12 @@ void pc_cpus_init(const char *cpu_model, DeviceState *icc_bridge)
 
     /* tell smbios about cpuid version and features */
     smbios_set_cpuid(cpu->env.cpuid_version, cpu->env.features[FEAT_1_EDX]);
+
+error:
+    if (error) {
+        error_report_err(error);
+        exit(1);
+    }
 }
 
 /* pci-info ROM file. Little endian format */
-- 
2.1.4


* [Qemu-devel] [PATCH RFC 4/4] pc: Create initial CPUs in-place
From: Andreas Färber @ 2015-03-23 17:32 UTC
  To: qemu-devel
  Cc: Paolo Bonzini, Michael S. Tsirkin, Andreas Färber,
	Richard Henderson

Inline pc_new_cpu() for the initial setup.

Signed-off-by: Andreas Färber <afaerber@suse.de>
---
 hw/i386/pc.c               | 39 ++++++++++++++++++++++++++++++++++-----
 include/hw/i386/cpu-core.h |  3 +++
 2 files changed, 37 insertions(+), 5 deletions(-)

diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 492c262..efc5a23 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -995,13 +995,13 @@ void pc_acpi_smi_interrupt(void *opaque, int irq, int level)
 
 static inline size_t pc_cpu_core_size(void)
 {
-    return sizeof(X86CPUCore);
+    return sizeof(X86CPUCore) + smp_threads * sizeof(X86CPU);
 }
 
 static inline X86CPUCore *pc_cpu_socket_get_core(X86CPUSocket *socket,
                                                  unsigned int index)
 {
-    return &socket->core[index];
+    return (void *)&socket->core[0] + index * pc_cpu_core_size();
 }
 
 static X86CPU *pc_new_cpu(const char *cpu_model, int64_t apic_id,
@@ -1083,7 +1083,12 @@ void pc_cpus_init(const char *cpu_model, DeviceState *icc_bridge)
     X86CPUSocket *socket;
     X86CPUCore *core;
     X86CPU *cpu = NULL;
+    X86CPUClass *xcc;
+    CPUClass *cc;
+    ObjectClass *cpu_oc;
     Error *error = NULL;
+    gchar **cpu_model_pieces;
+    char *cpu_name, *cpu_features;
     unsigned long apic_id_limit;
     int sockets, cpu_index = 0;
 
@@ -1096,6 +1101,23 @@ void pc_cpus_init(const char *cpu_model, DeviceState *icc_bridge)
 #endif
     }
     current_cpu_model = cpu_model;
+    cpu_model_pieces = g_strsplit(cpu_model, ",", 2);
+    cpu_name = cpu_model_pieces[0];
+    assert(cpu_name);
+    cpu_features = cpu_model_pieces[1];
+
+    cpu_oc = cpu_class_by_name(TYPE_X86_CPU, cpu_name);
+    if (cpu_oc == NULL) {
+        error_report("Unable to find CPU definition: %s", cpu_name);
+        exit(1);
+    }
+    cc = CPU_CLASS(cpu_oc);
+    xcc = X86_CPU_CLASS(cpu_oc);
+
+    if (xcc->kvm_required && !kvm_enabled()) {
+        error_report("CPU model '%s' requires KVM", cpu_name);
+        exit(1);
+    }
 
     apic_id_limit = pc_apic_id_limit(max_cpus);
     if (apic_id_limit > ACPI_CPU_HOTPLUG_ID_LIMIT) {
@@ -1120,12 +1142,18 @@ void pc_cpus_init(const char *cpu_model, DeviceState *icc_bridge)
             }
 
             for (k = 0; k < smp_threads; k++) {
-                cpu = pc_new_cpu(cpu_model,
-                                 x86_cpu_apic_id_from_index(cpu_index),
-                                 icc_bridge, &error);
+                cpu = &core->thread[k];
+                object_initialize(cpu, sizeof(*cpu),
+                                  object_class_get_name(cpu_oc));
+                cc->parse_features(CPU(cpu), cpu_features, &error);
                 if (error) {
                     goto error;
                 }
+                qdev_set_parent_bus(DEVICE(cpu),
+                                    qdev_get_child_bus(icc_bridge, "icc"));
+                object_property_set_int(OBJECT(cpu),
+                                        x86_cpu_apic_id_from_index(cpu_index),
+                                        "apic-id", &error);
                 object_property_add_child(OBJECT(core), "thread[*]",
                                           OBJECT(cpu), &error);
                 object_unref(OBJECT(cpu));
@@ -1152,6 +1180,7 @@ void pc_cpus_init(const char *cpu_model, DeviceState *icc_bridge)
     smbios_set_cpuid(cpu->env.cpuid_version, cpu->env.features[FEAT_1_EDX]);
 
 error:
+    g_strfreev(cpu_model_pieces);
     if (error) {
         error_report_err(error);
         exit(1);
diff --git a/include/hw/i386/cpu-core.h b/include/hw/i386/cpu-core.h
index be78f95..bb60e8e 100644
--- a/include/hw/i386/cpu-core.h
+++ b/include/hw/i386/cpu-core.h
@@ -7,6 +7,7 @@
 #define HW_I386_CPU_CORE_H
 
 #include "hw/qdev.h"
+#include "cpu.h"
 
 #ifdef TARGET_X86_64
 #define TYPE_X86_CPU_CORE "x86_64-cpu-core"
@@ -21,6 +22,8 @@ typedef struct X86CPUCore {
     /*< private >*/
     DeviceState parent_obj;
     /*< public >*/
+
+    X86CPU thread[0];
 } X86CPUCore;
 
 #endif
-- 
2.1.4


* Re: [Qemu-devel] [PATCH RFC 0/4] target-i386: PC socket/core/thread modeling, part 1
From: Christian Borntraeger @ 2015-03-24 14:33 UTC
  To: Andreas Färber, qemu-devel
  Cc: Peter Maydell, Eduardo Habkost, Paolo Bonzini, Jason J. Herne,
	Bharata B Rao, Cornelia Huck, Igor Mammedov

On 23.03.2015 at 18:31, Andreas Färber wrote:
> Hello,
> 
> This long-postponed series proposes a hierarchical QOM model of socket
> and core objects for the x86 PC machines.

Just some comments from the s390 side, as we probably want more than the
status quo in the future as well.

Traditionally all CPUs were equal from the software side, but hardware was
already organized in a node-like fashion.

Starting with z10 (2008) the CPU topology was exposed to software (not the
memory topology!), which allowed Linux to optimize scheduling based on
locality.
So on an LPAR you might see something like:
$ lscpu -e
CPU BOOK SOCKET CORE L1d:L1i:L2 ONLINE CONFIGURED POLARIZATION ADDRESS
0   0    0      0    0:0:0      ja     ja         horizontal   0
1   0    0      1    1:1:1      ja     ja         horizontal   1
2   0    0      2    2:2:2      ja     ja         horizontal   2
3   0    1      3    3:3:3      ja     ja         horizontal   3
4   0    1      4    4:4:4      ja     ja         horizontal   4
5   0    1      5    5:5:5      ja     ja         horizontal   5
6   0    2      6    6:6:6      ja     ja         horizontal   6
7   0    2      7    7:7:7      ja     ja         horizontal   7
8   0    2      8    8:8:8      ja     ja         horizontal   8
9   0    3      9    9:9:9      ja     ja         horizontal   9
10  0    3      10   10:10:10   ja     ja         horizontal   10
11  0    3      11   11:11:11   ja     ja         horizontal   11
12  1    4      12   12:12:12   ja     ja         horizontal   12
13  1    4      13   13:13:13   ja     ja         horizontal   13
14  1    4      14   14:14:14   ja     ja         horizontal   14
15  1    5      15   15:15:15   ja     ja         horizontal   15


Under z/VM the topology is flat (each CPU on a separate node):
# lscpu -e
CPU BOOK SOCKET CORE L1d:L1i:L2d:L2i ONLINE CONFIGURED POLARIZATION ADDRESS
0   0    0      0    0:0:0:0         ja     ja         horizontal   0
1   1    1      1    1:1:1:1         ja     ja         horizontal   1
2   2    2      2    2:2:2:2         ja     ja         horizontal   2

That's what we have today in KVM as well. Now...
> 
> Background is that due to qdev limitations we had to introduce an ICC bus
> to be able to hot-add CPUs and their APICs. By now this limitation could be
> resolved via a QOM hotplug handler interface.
> 
> However, the QOM hotplug model is associated with having link<> properties.
> Given that physically we cannot hot-add individual hyperthreads, only full CPU sockets,
> this series prepares the underbelly for having those link properties be of
> the new type X86CPUSocket rather than X86CPU.
> 
> As final step in this series, the X86CPU allocation model is changed to be
> init'ed in-place, as part of an "atomic" socket object. A follow-up will be
> to attempt the same in-place allocation model for the APIC; one difficulty
> there is that several places do a NULL check on the APIC pointer as a quick
> way of telling whether an APIC is present or not.
> 
> NOT IN THIS SERIES is converting cpu-add to the same socket/core/thread model
> and initializing them in-place. The current PoC implementation assumes that
> CPUs get added sequentially and that the preceding CPU can be used to obtain
> the core via unclean QOM parent accesses.
> IIUC that must be changed so that an arbitrary thread added via cpu-add
> creates the full socket and core(s). That would work best if indexed link<>
> properties could be used. That's an undecided design question of whether
> we want to have /machine/cpu-socket[n] or whether it makes sense to integrate
> NUMA modeling while at it and rather have /machine/node[n]/socket[m].

...looking into the future, we probably want to be able to model a CPU node
topology, as newer machines are getting bigger and bigger and locality
gets more important. So maybe we want to have a KVM guest with 40 vCPUs and
pin 20 vCPUs on host book (node) 0 and 20 vCPUs on host book 1.
- so nodes are of interest.


Regarding threads: z13 has SMT2. Right now with upstream code, we can fan
out at the LPAR level so that each thread becomes a host CPU, which is then
used to back guest vCPUs. The alternative, passing a full host core to
the guest and doing the fan-out in the guest, is currently not supported.

Regarding cpu hotplug: this work is currently halted due to other things,
but we plan to continue working on it.

I will have a look at your code. Thanks for working on this.

> 
> Note that this socket modeling is not only PC-specific in the softmmu sense
> but also in that not every X86CPU must be on a socket (e.g., Quark X1000).
> Therefore it was a conscious decision to not label some things target-i386
> and to place code in pc.c rather than cpu.c.
> 
> Further note that this series ignores that -smp enforces that AMD CPUs don't
> have hyperthreads, i.e. AMD X86CPUs will have only one thread[n] child<>.
> 
> Context:
> 
>    qemu.git master
>    "pc: Ensure non-zero CPU ref count after attaching to ICC bus"
> -> this series adding socket/core objects
>    cpu-add conversion
>    APIC cleanups
> 
> Available for testing here:
> git://github.com/afaerber/qemu-cpu.git qom-cpu-x86-sockets-1.v1
> https://github.com/afaerber/qemu-cpu/commits/qom-cpu-x86-sockets-1.v1
> 
> Regards,
> Andreas
> 
> Cc: Eduardo Habkost <ehabkost@redhat.com>
> Cc: Igor Mammedov <imammedo@redhat.com>
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Cc: Peter Maydell <peter.maydell@linaro.org>
> Cc: Bharata B Rao <bharata@linux.vnet.ibm.com>
> Cc: Christian Borntraeger <borntraeger@de.ibm.com>
> 
> Andreas Färber (4):
>   cpu: Prepare Socket container type
>   target-i386: Prepare CPU socket/core abstraction
>   pc: Create sockets and cores for CPUs
>   pc: Create initial CPUs in-place
> 
>  hw/cpu/Makefile.objs         |  2 +-
>  hw/cpu/socket.c              | 21 ++++++++++
>  hw/i386/Makefile.objs        |  1 +
>  hw/i386/cpu-core.c           | 45 +++++++++++++++++++++
>  hw/i386/cpu-socket.c         | 45 +++++++++++++++++++++
>  hw/i386/pc.c                 | 95 ++++++++++++++++++++++++++++++++++++++++----
>  include/hw/cpu/socket.h      | 14 +++++++
>  include/hw/i386/cpu-core.h   | 29 ++++++++++++++
>  include/hw/i386/cpu-socket.h | 29 ++++++++++++++
>  9 files changed, 272 insertions(+), 9 deletions(-)
>  create mode 100644 hw/cpu/socket.c
>  create mode 100644 hw/i386/cpu-core.c
>  create mode 100644 hw/i386/cpu-socket.c
>  create mode 100644 include/hw/cpu/socket.h
>  create mode 100644 include/hw/i386/cpu-core.h
>  create mode 100644 include/hw/i386/cpu-socket.h
> 


* Re: [Qemu-devel] [PATCH RFC 3/4] pc: Create sockets and cores for CPUs
From: Bharata B Rao @ 2015-03-25 16:55 UTC
  To: Andreas Färber
  Cc: Paolo Bonzini, Michael S. Tsirkin, qemu-devel, Richard Henderson

On Mon, Mar 23, 2015 at 11:02 PM, Andreas Färber <afaerber@suse.de> wrote:
> Inline realized=true from pc_new_cpu() so that the realization can be
> deferred, as it would otherwise create a device[n] node.
>
> Signed-off-by: Andreas Färber <afaerber@suse.de>
> ---
>  hw/i386/pc.c | 66 ++++++++++++++++++++++++++++++++++++++++++++++++++++--------
>  1 file changed, 58 insertions(+), 8 deletions(-)
>
> diff --git a/hw/i386/pc.c b/hw/i386/pc.c
> index 2c48277..492c262 100644
> --- a/hw/i386/pc.c
> +++ b/hw/i386/pc.c
> @@ -54,11 +54,14 @@
>  #include "exec/memory.h"
>  #include "exec/address-spaces.h"
>  #include "sysemu/arch_init.h"
> +#include "sysemu/cpus.h"
>  #include "qemu/bitmap.h"
>  #include "qemu/config-file.h"
>  #include "hw/acpi/acpi.h"
>  #include "hw/acpi/cpu_hotplug.h"
>  #include "hw/cpu/icc_bus.h"
> +#include "hw/i386/cpu-socket.h"
> +#include "hw/i386/cpu-core.h"
>  #include "hw/boards.h"
>  #include "hw/pci/pci_host.h"
>  #include "acpi-build.h"
> @@ -990,6 +993,17 @@ void pc_acpi_smi_interrupt(void *opaque, int irq, int level)
>      }
>  }
>
> +static inline size_t pc_cpu_core_size(void)
> +{
> +    return sizeof(X86CPUCore);
> +}
> +
> +static inline X86CPUCore *pc_cpu_socket_get_core(X86CPUSocket *socket,
> +                                                 unsigned int index)
> +{
> +    return &socket->core[index];
> +}
> +
>  static X86CPU *pc_new_cpu(const char *cpu_model, int64_t apic_id,
>                            DeviceState *icc_bridge, Error **errp)
>  {
> @@ -1009,7 +1023,6 @@ static X86CPU *pc_new_cpu(const char *cpu_model, int64_t apic_id,
>      qdev_set_parent_bus(DEVICE(cpu), qdev_get_child_bus(icc_bridge, "icc"));
>
>      object_property_set_int(OBJECT(cpu), apic_id, "apic-id", &local_err);
> -    object_property_set_bool(OBJECT(cpu), true, "realized", &local_err);
>
>  out:
>      if (local_err) {
> @@ -1060,15 +1073,19 @@ void pc_hot_add_cpu(const int64_t id, Error **errp)
>          error_propagate(errp, local_err);
>          return;
>      }
> +    object_property_set_bool(OBJECT(cpu), true, "realized", errp);
>      object_unref(OBJECT(cpu));
>  }
>
>  void pc_cpus_init(const char *cpu_model, DeviceState *icc_bridge)
>  {
> -    int i;
> +    int i, j, k;
> +    X86CPUSocket *socket;
> +    X86CPUCore *core;
>      X86CPU *cpu = NULL;
>      Error *error = NULL;
>      unsigned long apic_id_limit;
> +    int sockets, cpu_index = 0;
>
>      /* init CPUs */
>      if (cpu_model == NULL) {
> @@ -1087,14 +1104,41 @@ void pc_cpus_init(const char *cpu_model, DeviceState *icc_bridge)
>          exit(1);
>      }
>
> -    for (i = 0; i < smp_cpus; i++) {
> -        cpu = pc_new_cpu(cpu_model, x86_cpu_apic_id_from_index(i),
> -                         icc_bridge, &error);
> +    sockets = smp_cpus / smp_cores / smp_threads;
> +    for (i = 0; i < sockets; i++) {
> +        socket = g_malloc0(sizeof(*socket) + smp_cores * pc_cpu_core_size());
> +        object_initialize(socket, sizeof(*socket), TYPE_X86_CPU_SOCKET);
> +        OBJECT(socket)->free = g_free;
> +
> +        for (j = 0; j < smp_cores; j++) {
> +            core = pc_cpu_socket_get_core(socket, j);
> +            object_initialize(core, sizeof(*core), TYPE_X86_CPU_CORE);
> +            object_property_add_child(OBJECT(socket), "core[*]",
> +                                      OBJECT(core), &error);
> +            if (error) {
> +                goto error;
> +            }
> +
> +            for (k = 0; k < smp_threads; k++) {
> +                cpu = pc_new_cpu(cpu_model,
> +                                 x86_cpu_apic_id_from_index(cpu_index),
> +                                 icc_bridge, &error);
> +                if (error) {
> +                    goto error;
> +                }
> +                object_property_add_child(OBJECT(core), "thread[*]",
> +                                          OBJECT(cpu), &error);
> +                object_unref(OBJECT(cpu));
> +                if (error) {
> +                    goto error;
> +                }
> +                cpu_index++;
> +            }
> +        }
> +        object_property_set_bool(OBJECT(socket), true, "realized", &error);

So you do in-place initialization as part of an "atomic" socket object.

I am not sure why cores and threads should be allocated as part of the
socket object and initialized like above. Do you see any problem with
just instantiating the socket object and then letting the instance_init
routines of the socket and cores initialize the cores and threads, as
I am doing for sPAPR?

+    sockets = smp_cpus / smp_cores / smp_threads;
+    for (i = 0; i < sockets; i++) {
+        socket = object_new(TYPE_POWERPC_CPU_SOCKET);
+        object_property_set_bool(socket, true, "realized", &error_abort);
     }

Ref: http://lists.gnu.org/archive/html/qemu-ppc/2015-03/msg00492.html

Regards,
Bharata.


* Re: [Qemu-devel] [PATCH RFC 3/4] pc: Create sockets and cores for CPUs
From: Andreas Färber @ 2015-03-25 17:13 UTC
  To: Bharata B Rao
  Cc: Paolo Bonzini, Michael S. Tsirkin, qemu-devel, Richard Henderson

On 25.03.2015 at 17:55, Bharata B Rao wrote:
> On Mon, Mar 23, 2015 at 11:02 PM, Andreas Färber <afaerber@suse.de> wrote:
>> Inline realized=true from pc_new_cpu() so that the realization can be
>> deferred, as it would otherwise create a device[n] node.
>>
>> Signed-off-by: Andreas Färber <afaerber@suse.de>
>> ---
>>  hw/i386/pc.c | 66 ++++++++++++++++++++++++++++++++++++++++++++++++++++--------
>>  1 file changed, 58 insertions(+), 8 deletions(-)
>>
>> diff --git a/hw/i386/pc.c b/hw/i386/pc.c
>> index 2c48277..492c262 100644
>> --- a/hw/i386/pc.c
>> +++ b/hw/i386/pc.c
>> @@ -54,11 +54,14 @@
>>  #include "exec/memory.h"
>>  #include "exec/address-spaces.h"
>>  #include "sysemu/arch_init.h"
>> +#include "sysemu/cpus.h"
>>  #include "qemu/bitmap.h"
>>  #include "qemu/config-file.h"
>>  #include "hw/acpi/acpi.h"
>>  #include "hw/acpi/cpu_hotplug.h"
>>  #include "hw/cpu/icc_bus.h"
>> +#include "hw/i386/cpu-socket.h"
>> +#include "hw/i386/cpu-core.h"
>>  #include "hw/boards.h"
>>  #include "hw/pci/pci_host.h"
>>  #include "acpi-build.h"
>> @@ -990,6 +993,17 @@ void pc_acpi_smi_interrupt(void *opaque, int irq, int level)
>>      }
>>  }
>>
>> +static inline size_t pc_cpu_core_size(void)
>> +{
>> +    return sizeof(X86CPUCore);
>> +}
>> +
>> +static inline X86CPUCore *pc_cpu_socket_get_core(X86CPUSocket *socket,
>> +                                                 unsigned int index)
>> +{
>> +    return &socket->core[index];
>> +}
>> +
>>  static X86CPU *pc_new_cpu(const char *cpu_model, int64_t apic_id,
>>                            DeviceState *icc_bridge, Error **errp)
>>  {
>> @@ -1009,7 +1023,6 @@ static X86CPU *pc_new_cpu(const char *cpu_model, int64_t apic_id,
>>      qdev_set_parent_bus(DEVICE(cpu), qdev_get_child_bus(icc_bridge, "icc"));
>>
>>      object_property_set_int(OBJECT(cpu), apic_id, "apic-id", &local_err);
>> -    object_property_set_bool(OBJECT(cpu), true, "realized", &local_err);
>>
>>  out:
>>      if (local_err) {
>> @@ -1060,15 +1073,19 @@ void pc_hot_add_cpu(const int64_t id, Error **errp)
>>          error_propagate(errp, local_err);
>>          return;
>>      }
>> +    object_property_set_bool(OBJECT(cpu), true, "realized", errp);
>>      object_unref(OBJECT(cpu));
>>  }
>>
>>  void pc_cpus_init(const char *cpu_model, DeviceState *icc_bridge)
>>  {
>> -    int i;
>> +    int i, j, k;
>> +    X86CPUSocket *socket;
>> +    X86CPUCore *core;
>>      X86CPU *cpu = NULL;
>>      Error *error = NULL;
>>      unsigned long apic_id_limit;
>> +    int sockets, cpu_index = 0;
>>
>>      /* init CPUs */
>>      if (cpu_model == NULL) {
>> @@ -1087,14 +1104,41 @@ void pc_cpus_init(const char *cpu_model, DeviceState *icc_bridge)
>>          exit(1);
>>      }
>>
>> -    for (i = 0; i < smp_cpus; i++) {
>> -        cpu = pc_new_cpu(cpu_model, x86_cpu_apic_id_from_index(i),
>> -                         icc_bridge, &error);
>> +    sockets = smp_cpus / smp_cores / smp_threads;
>> +    for (i = 0; i < sockets; i++) {
>> +        socket = g_malloc0(sizeof(*socket) + smp_cores * pc_cpu_core_size());
>> +        object_initialize(socket, sizeof(*socket), TYPE_X86_CPU_SOCKET);
>> +        OBJECT(socket)->free = g_free;
>> +
>> +        for (j = 0; j < smp_cores; j++) {
>> +            core = pc_cpu_socket_get_core(socket, j);
>> +            object_initialize(core, sizeof(*core), TYPE_X86_CPU_CORE);
>> +            object_property_add_child(OBJECT(socket), "core[*]",
>> +                                      OBJECT(core), &error);
>> +            if (error) {
>> +                goto error;
>> +            }
>> +
>> +            for (k = 0; k < smp_threads; k++) {
>> +                cpu = pc_new_cpu(cpu_model,
>> +                                 x86_cpu_apic_id_from_index(cpu_index),
>> +                                 icc_bridge, &error);
>> +                if (error) {
>> +                    goto error;
>> +                }
>> +                object_property_add_child(OBJECT(core), "thread[*]",
>> +                                          OBJECT(cpu), &error);
>> +                object_unref(OBJECT(cpu));
>> +                if (error) {
>> +                    goto error;
>> +                }
>> +                cpu_index++;
>> +            }
>> +        }
>> +        object_property_set_bool(OBJECT(socket), true, "realized", &error);
> 
> So you do in-place initialization as part of an "atomic" socket object.

(indivisible might've been a better term on my part)

> 
> I am not sure why cores and threads should be allocated as part of
> socket object and initialized like above ? Do you see any problem with
> just instantiating the socket object and then let the instance_init
> routines of socket and cores to initialize the cores and threads like
> how I am doing for sPAPR ?
> 
> +    sockets = smp_cpus / smp_cores / smp_threads;
> +    for (i = 0; i < sockets; i++) {
> +        socket = object_new(TYPE_POWERPC_CPU_SOCKET);
> +        object_property_set_bool(socket, true, "realized", &error_abort);
>      }
> 
> Ref: http://lists.gnu.org/archive/html/qemu-ppc/2015-03/msg00492.html

Yes, instance_init cannot fail. By making the allocation separate, we
ensure that at this stage we have sufficient memory to perform the
initializations. This is easiest when we know how many child objects we
have: then we can use proper fields or arrays to reserve the memory
(e.g., ARM MPCore, PCI host bridges). Here, to cope with dynamic
smp_cores, I am using inline helpers to do the size/offset calculations.
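
Concretely, the arithmetic amounts to (the same calculation as in patches
3 and 4, shown standalone):

    size_t core_size = sizeof(X86CPUCore) + smp_threads * sizeof(X86CPU);
    size_t total     = sizeof(X86CPUSocket) + smp_cores * core_size;

    socket = g_malloc0(total);  /* one allocation for the whole hierarchy */
    /* core j then lives at a computed offset behind the socket header: */
    core = (void *)&socket->core[0] + j * core_size;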

Further, setting realized=true from instance_init is a no-go. It's a
two-step initialization, with realization being the second step and
potentially failing. The alternative I pointed you to was creating
objects in a property setter, as Alexey was asked to do for xics.

Regards,
Andreas

-- 
SUSE Linux GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Felix Imendörffer, Jane Smithard, Jennifer Guild, Dilip Upmanyu,
Graham Norton; HRB 21284 (AG Nürnberg)


* Re: [Qemu-devel] [PATCH RFC 3/4] pc: Create sockets and cores for CPUs
From: Bharata B Rao @ 2015-03-26  2:24 UTC
  To: Andreas Färber
  Cc: Paolo Bonzini, Michael S. Tsirkin, qemu-devel, Richard Henderson

On Wed, Mar 25, 2015 at 10:43 PM, Andreas Färber <afaerber@suse.de> wrote:
> On 25.03.2015 at 17:55, Bharata B Rao wrote:
>> On Mon, Mar 23, 2015 at 11:02 PM, Andreas Färber <afaerber@suse.de> wrote:
>>> Inline realized=true from pc_new_cpu() so that the realization can be
>>> deferred, as it would otherwise create a device[n] node.
>>>
>>> Signed-off-by: Andreas Färber <afaerber@suse.de>
>>> ---
>>>  hw/i386/pc.c | 66 ++++++++++++++++++++++++++++++++++++++++++++++++++++--------
>>>  1 file changed, 58 insertions(+), 8 deletions(-)
>>>
>>> diff --git a/hw/i386/pc.c b/hw/i386/pc.c
>>> index 2c48277..492c262 100644
>>> --- a/hw/i386/pc.c
>>> +++ b/hw/i386/pc.c
>>> @@ -54,11 +54,14 @@
>>>  #include "exec/memory.h"
>>>  #include "exec/address-spaces.h"
>>>  #include "sysemu/arch_init.h"
>>> +#include "sysemu/cpus.h"
>>>  #include "qemu/bitmap.h"
>>>  #include "qemu/config-file.h"
>>>  #include "hw/acpi/acpi.h"
>>>  #include "hw/acpi/cpu_hotplug.h"
>>>  #include "hw/cpu/icc_bus.h"
>>> +#include "hw/i386/cpu-socket.h"
>>> +#include "hw/i386/cpu-core.h"
>>>  #include "hw/boards.h"
>>>  #include "hw/pci/pci_host.h"
>>>  #include "acpi-build.h"
>>> @@ -990,6 +993,17 @@ void pc_acpi_smi_interrupt(void *opaque, int irq, int level)
>>>      }
>>>  }
>>>
>>> +static inline size_t pc_cpu_core_size(void)
>>> +{
>>> +    return sizeof(X86CPUCore);
>>> +}
>>> +
>>> +static inline X86CPUCore *pc_cpu_socket_get_core(X86CPUSocket *socket,
>>> +                                                 unsigned int index)
>>> +{
>>> +    return &socket->core[index];
>>> +}
>>> +
>>>  static X86CPU *pc_new_cpu(const char *cpu_model, int64_t apic_id,
>>>                            DeviceState *icc_bridge, Error **errp)
>>>  {
>>> @@ -1009,7 +1023,6 @@ static X86CPU *pc_new_cpu(const char *cpu_model, int64_t apic_id,
>>>      qdev_set_parent_bus(DEVICE(cpu), qdev_get_child_bus(icc_bridge, "icc"));
>>>
>>>      object_property_set_int(OBJECT(cpu), apic_id, "apic-id", &local_err);
>>> -    object_property_set_bool(OBJECT(cpu), true, "realized", &local_err);
>>>
>>>  out:
>>>      if (local_err) {
>>> @@ -1060,15 +1073,19 @@ void pc_hot_add_cpu(const int64_t id, Error **errp)
>>>          error_propagate(errp, local_err);
>>>          return;
>>>      }
>>> +    object_property_set_bool(OBJECT(cpu), true, "realized", errp);
>>>      object_unref(OBJECT(cpu));
>>>  }
>>>
>>>  void pc_cpus_init(const char *cpu_model, DeviceState *icc_bridge)
>>>  {
>>> -    int i;
>>> +    int i, j, k;
>>> +    X86CPUSocket *socket;
>>> +    X86CPUCore *core;
>>>      X86CPU *cpu = NULL;
>>>      Error *error = NULL;
>>>      unsigned long apic_id_limit;
>>> +    int sockets, cpu_index = 0;
>>>
>>>      /* init CPUs */
>>>      if (cpu_model == NULL) {
>>> @@ -1087,14 +1104,41 @@ void pc_cpus_init(const char *cpu_model, DeviceState *icc_bridge)
>>>          exit(1);
>>>      }
>>>
>>> -    for (i = 0; i < smp_cpus; i++) {
>>> -        cpu = pc_new_cpu(cpu_model, x86_cpu_apic_id_from_index(i),
>>> -                         icc_bridge, &error);
>>> +    sockets = smp_cpus / smp_cores / smp_threads;
>>> +    for (i = 0; i < sockets; i++) {
>>> +        socket = g_malloc0(sizeof(*socket) + smp_cores * pc_cpu_core_size());
>>> +        object_initialize(socket, sizeof(*socket), TYPE_X86_CPU_SOCKET);
>>> +        OBJECT(socket)->free = g_free;
>>> +
>>> +        for (j = 0; j < smp_cores; j++) {
>>> +            core = pc_cpu_socket_get_core(socket, j);
>>> +            object_initialize(core, sizeof(*core), TYPE_X86_CPU_CORE);
>>> +            object_property_add_child(OBJECT(socket), "core[*]",
>>> +                                      OBJECT(core), &error);
>>> +            if (error) {
>>> +                goto error;
>>> +            }
>>> +
>>> +            for (k = 0; k < smp_threads; k++) {
>>> +                cpu = pc_new_cpu(cpu_model,
>>> +                                 x86_cpu_apic_id_from_index(cpu_index),
>>> +                                 icc_bridge, &error);
>>> +                if (error) {
>>> +                    goto error;
>>> +                }
>>> +                object_property_add_child(OBJECT(core), "thread[*]",
>>> +                                          OBJECT(cpu), &error);
>>> +                object_unref(OBJECT(cpu));
>>> +                if (error) {
>>> +                    goto error;
>>> +                }
>>> +                cpu_index++;
>>> +            }
>>> +        }
>>> +        object_property_set_bool(OBJECT(socket), true, "realized", &error);
>>
>> So you do in-place initialization as part of an "atomic" socket object.
>
> (indivisible might've been a better term on my part)
>
>>
>> I am not sure why cores and threads should be allocated as part of
>> socket object and initialized like above ? Do you see any problem with
>> just instantiating the socket object and then let the instance_init
>> routines of socket and cores to initialize the cores and threads like
>> how I am doing for sPAPR ?
>>
>> +    sockets = smp_cpus / smp_cores / smp_threads;
>> +    for (i = 0; i < sockets; i++) {
>> +        socket = object_new(TYPE_POWERPC_CPU_SOCKET);
>> +        object_property_set_bool(socket, true, "realized", &error_abort);
>>      }
>>
>> Ref: http://lists.gnu.org/archive/html/qemu-ppc/2015-03/msg00492.html
>
> Yes, instance_init cannot fail. By making the allocation separate, we
> assure that at this stage we have sufficient memory to perform the
> initializations. This is easiest when we know how many child objects we
> have, then we can use proper fields or arrays to reserve the memory
> (e.g., ARM MPCore, PCI host bridges). Here, to cope with dynamic
> smp_cores, I am using inline helpers to do the size/offset calculations.

OK, so the difference between what you are doing and the approach I
have taken is not that big.

You allocate a big socket object (which includes memory for the core and
thread objects), then initialize them in a loop, and finally
realize the socket object, which iteratively realizes cores and
threads.

Instead of an open-coded g_malloc0, I resort to object_new for the socket
object. The instance_init of the socket object will create the core
objects (via object_new again) and the instance_init of the core object
will create the CPU threads (using object_new). At the end of this I
do a realize on the socket object which iteratively realizes cores and
threads, similar to your implementation.
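
In code, that looks roughly like the following (a sketch of the pattern
described above; TYPE_POWERPC_CPU_CORE and the loop bound are assumptions
based on this description):

    static void powerpc_cpu_socket_instance_init(Object *obj)
    {
        int i;

        for (i = 0; i < smp_cores; i++) {
            Object *core = object_new(TYPE_POWERPC_CPU_CORE);

            object_property_add_child(obj, "core[*]", core, &error_abort);
            object_unref(core);
        }
    }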

So unless I am missing the subtlety here, it is about one big malloc
vs. multiple small mallocs, and the chance of failure exists in both
cases.

>
> Further, setting realized=true from instance_init is a no-go. It's a
> two-step initialization, with realization being the second step and
> potentially failing. The alternative I pointed you to was creating
> objects in a property setter, like Alexey was asked for for xics.

Please look at my v2 patchset; I am not setting realized=true from
instance_init.

Regards,
Bharata.


* Re: [Qemu-devel] [PATCH RFC 0/4] target-i386: PC socket/core/thread modeling, part 1
From: Igor Mammedov @ 2015-03-26 17:39 UTC
  To: Andreas Färber
  Cc: Peter Maydell, Eduardo Habkost, qemu-devel,
	Christian Borntraeger, Bharata B Rao, Paolo Bonzini

On Mon, 23 Mar 2015 18:31:59 +0100
Andreas Färber <afaerber@suse.de> wrote:

> Hello,
> 
> This long-postponed series proposes a hierarchical QOM model of socket
> and core objects for the x86 PC machines.
> 
> Background is that due to qdev limitations we had to introduce an ICC bus
> to be able to hot-add CPUs and their APICs. By now this limitation could be
> resolved via a QOM hotplug handler interface.
> 
> However, the QOM hotplug model is associated with having link<> properties.
> Given that physically we cannot hot-add individual hyperthreads, only full CPU sockets,
> this series prepares the underbelly for having those link properties be of
> the new type X86CPUSocket rather than X86CPU.
Having hotplug associated with link<> properties was Anthony's suggestion
to overcome the limitation of the non-hotpluggable sysbus back then; hence the
ICC bus was introduced as an alternative solution for CPU hotplug in target-i386
code. With the current hotplug handler interface we are in the process of removing
the ICC bus and switching x86 CPUs to bus-less hotplug.
That by itself doesn't mandate/need link<> properties in hotplug or modeling
sockets at all. Also, I don't see much usefulness in limiting CPU hotplug to
socket granularity just because physical hardware can't do it.

I'm thinking about using CPU socket modeling as a way to provide
a sane topology on the CLI and other management interfaces, which is currently
represented (or not) by the -numa/-smp/-cpu/cpu-add mess.
It's not clear what the conversion in this series would give us.
It would be better for us to first figure out what the end goals
of this conversion are and how QEMU would benefit from it
(i.e., let's make a plan of what we intend to do first).

> 
> As final step in this series, the X86CPU allocation model is changed to be
> init'ed in-place, as part of an "atomic" socket object. A follow-up will be
> to attempt the same in-place allocation model for the APIC; one difficulty
> there is that several places do a NULL check on the APIC pointer as a quick
> way of telling whether an APIC is present or not.
Maybe it's obvious but what does in-place initialization give us?
(other than replacing a bunch of small malloc-s with a big one)

> NOT IN THIS SERIES is converting cpu-add to the same socket/core/thread model
> and initializing them in-place. The current PoC implementation assumes that
> CPUs get added sequentially and that the preceding CPU can be used to obtain
> the core via unclean QOM parent accesses.
> IIUC that must be changed so that an arbitrary thread added via cpu-add
I'd prefer not to use cpu-add for socket-based hotplug, i.e. deprecate it and
leave it for compatibility with old machine types, and for socket-based CPUs
use -device instead.

> creates the full socket and core(s). That would work best if indexed link<>
> properties could be used.
> That's an undecided design question of whether
> we want to have /machine/cpu-socket[n] or whether it makes sense to integrate
> NUMA modeling while at it and rather have /machine/node[n]/socket[m].

It might make sense to have both, i.e. make "node[n]" optional for the NUMA-less
case. But let's start with the problems we have with the current implementation.
I'll speak from the x86 CPU hotplug / -device X86CPU point of view:

When we plug a CPU we have to specify which CPU we want to plug.
Currently QEMU uses
  cpu-add cpu_index
as the way to specify which CPU thread to plug; if the CPU were created
via -device, then cpu_index would be used implicitly.
Problems with it are:
 1. the user doesn't really know where a given CPU will be plugged in
    /socket, core, thread/, since that depends on an arch-specific
    routine which translates cpu_index into arch-specific
    topology info [in case of x86 it's the APIC ID; see the sketch below]
 2. in the NUMA case we can specify which CPU threads belong to which
    node using the same cpu_index. The problem is the same as with #1:
    the user has no idea whether a CPU thread can be located on that node.
    It's possible to check using a target-specific hook like has been done
    for the default case https://lists.gnu.org/archive/html/qemu-devel/2015-03/msg04252.html.
 3. topology info using -smp allows creating only machines with
    homogeneous-topology CPUs; the same applies to -cpu.
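
For reference, that translation packs the topology into bit fields of the
APIC ID, roughly as follows (a sketch of the scheme in QEMU's
hw/i386/topology.h; the bit-width helpers are omitted):

    /* most significant ... least significant */
    apic_id = (pkg_id  << (core_bits + thread_bits)) |
              (core_id << thread_bits) |
              smt_id;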

So my grudge is mainly against using cpu_index in the user interface,
and that is where I see sockets could be useful.
 * To address the NUMA concern, remodel the CLI to:
   -numa node,cpu_sockets=...

 * Convert -cpu to global properties so that they could be applied
   as default properties to CPUs

 * Allow creating CPUs with -device_add cpupkg,socket=x[,cores=y[,threads=z]][,type=foo]
   in which case cpupkg could be a composite object with several cores/threads
   or even a single thread like CPUs are now; it could work in both cases.

 * Convert -smp to perform a set of device_add cpu,... operations
   to keep the CLI compatible with old versions and simple

As a result the CLI would be pretty arch-agnostic, without using vague cpu_index-es,
and with a generic cpupkg we wouldn't have to create ARCH-FOO-cpu-socket objects
for every target.
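
Put together, an invocation under this proposal might look like the
following (hypothetical syntax following the bullets above, not an existing
QEMU command line):

    -numa node,cpu_sockets=0-1 \
    -device cpupkg,socket=0,cores=2,threads=2 \
    -device cpupkg,socket=1,cores=2,threads=2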

> 
> Note that this socket modeling is not only PC-specific in the softmmu sense
> but also in that not every X86CPU must be on a socket (e.g., Quark X1000).
> Therefore it was a conscious decision to not label some things target-i386
> and to place code in pc.c rather than cpu.c.
Strictly speaking the Quark X1000 is FCBGA393-socket packaged;
it just happens that the package has a little bit more than a CPU core.
I don't like open-coding the composition of a CPU socket in board code like
it's done here; there should be a socket object that encapsulates all
that code, while the board would just use it.

So maybe socket is the wrong term to use in this model, but we need some name
to tell where a CPU is plugged in. A generic 'cpu' would be a good choice
for the composite CPU object if it were not claimed already by CPU threads,
and sockets could be links<> in a /machine/[node[x]/]cpu_socket<>[y]
topology view.

> 
> Further note that this series ignores that -smp enforces that AMD CPUs don't
> have hyperthreads, i.e. AMD X86CPUs will have only one thread[n] child<>.
> 
> Context:
> 
>    qemu.git master
>    "pc: Ensure non-zero CPU ref count after attaching to ICC bus"
> -> this series adding socket/core objects
>    cpu-add conversion
>    APIC cleanups
> 
> Available for testing here:
> git://github.com/afaerber/qemu-cpu.git qom-cpu-x86-sockets-1.v1
> https://github.com/afaerber/qemu-cpu/commits/qom-cpu-x86-sockets-1.v1
> 
> Regards,
> Andreas
> 
> Cc: Eduardo Habkost <ehabkost@redhat.com>
> Cc: Igor Mammedov <imammedo@redhat.com>
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Cc: Peter Maydell <peter.maydell@linaro.org>
> Cc: Bharata B Rao <bharata@linux.vnet.ibm.com>
> Cc: Christian Borntraeger <borntraeger@de.ibm.com>
> 
> Andreas Färber (4):
>   cpu: Prepare Socket container type
>   target-i386: Prepare CPU socket/core abstraction
>   pc: Create sockets and cores for CPUs
>   pc: Create initial CPUs in-place
> 
>  hw/cpu/Makefile.objs         |  2 +-
>  hw/cpu/socket.c              | 21 ++++++++++
>  hw/i386/Makefile.objs        |  1 +
>  hw/i386/cpu-core.c           | 45 +++++++++++++++++++++
>  hw/i386/cpu-socket.c         | 45 +++++++++++++++++++++
>  hw/i386/pc.c                 | 95 ++++++++++++++++++++++++++++++++++++++++----
>  include/hw/cpu/socket.h      | 14 +++++++
>  include/hw/i386/cpu-core.h   | 29 ++++++++++++++
>  include/hw/i386/cpu-socket.h | 29 ++++++++++++++
>  9 files changed, 272 insertions(+), 9 deletions(-)
>  create mode 100644 hw/cpu/socket.c
>  create mode 100644 hw/i386/cpu-core.c
>  create mode 100644 hw/i386/cpu-socket.c
>  create mode 100644 include/hw/cpu/socket.h
>  create mode 100644 include/hw/i386/cpu-core.h
>  create mode 100644 include/hw/i386/cpu-socket.h
> 


* [Qemu-devel] cpu modelling and hotplug (was: [PATCH RFC 0/4] target-i386: PC socket/core/thread modeling, part 1)
  2015-03-23 17:31 [Qemu-devel] [PATCH RFC 0/4] target-i386: PC socket/core/thread modeling, part 1 Andreas Färber
                   ` (5 preceding siblings ...)
  2015-03-26 17:39 ` Igor Mammedov
@ 2015-04-07 12:43 ` Christian Borntraeger
  2015-04-07 15:07   ` Igor Mammedov
                     ` (2 more replies)
  6 siblings, 3 replies; 19+ messages in thread
From: Christian Borntraeger @ 2015-04-07 12:43 UTC (permalink / raw)
  To: qemu-devel
  Cc: Peter Maydell, Eduardo Habkost, Alexander Graf, Paolo Bonzini,
	Jason J. Herne, Bharata B Rao, Cornelia Huck, Igor Mammedov,
	Andreas Färber, david

We had a call and I was asked to write a summary about our conclusion.

The more I wrote, the more I became uncertain whether we really came to a
conclusion, and the more certain I became that we want to define the
QMP/HMP/CLI interfaces first (or quite early in the process).

As discussed, I will provide an initial document as a discussion starter.

So here is my current understanding, with each piece of information on one
line, so that everybody can correct me or make additions:

current wrap-up of architecture support
-------------------
x86
- Topology possible
   - can be hierarchical
   - interfaces to query topology
- SMT: fanout in host, guest uses host threads to back guest vCPUs
- supports cpu hotplug via cpu_add

power
- Topology possible
   - interfaces to query topology?
- SMT: Power8: no threads in the host and a full core passed in due to HW design;
       may change in the future

s/390
- Topology possible
    - can be hierarchical
    - interfaces to query topology
- always virtualized via PR/SM LPAR
    - host topology from LPAR can be heterogeneous (e.g. 3 cpus in 1st socket, 4 in 2nd)
- SMT: fanout in host, guest uses host threads to back guest vCPUs


Current downsides of CPU definitions/hotplug
-----------------------------------------------
- -smp sockets=,cores=,threads= builds only a homogeneous topology
- cpu_add does not tell where to add
- artificial ICC bus construct on x86 for several reasons (link, sysbus not hotpluggable, ...)


discussions
-------------------
- we want to be able to (most important question, IMHO)
 - hotplug CPUs on power/x86/s390 and maybe others
 - define topology information
 - bind the guest topology to the host topology in some way
    - to host nodes
    - maybe also for gang scheduling of threads (might face reluctance from
      the Linux scheduler folks)
    - not really deeply outlined in this call
- QOM links must be allocated at boot time, but can be set later on
    - nothing that we want to expose to users
    - Machine provides QOM links that the device_add hotplug mechanism can use to add
      new CPUs into preallocated slots. "CPUs" can be groups of cores and/or threads. 
- hotplug and initial config should use same semantics
- cpu and memory topology might be somewhat independent
--> - define nodes
    - map CPUs to nodes
    - map memory to nodes

- hotplug per
    - socket
    - core
    - thread
    ?
Now comes the part where I am not sure if we came to a conclusion or not:
- hotplug/definition per core (but not per thread) seems to handle all cases
    - a core might have multiple threads (and thus multiple CPUState objects)
    - as a device statement (or object?)
- mapping of CPUs to nodes, or defining the topology, was not really
  outlined in this call

To be defined:
- QEMU command line for initial setup
- QEMU hmp/qmp interfaces for dynamic setup


Christian


* Re: [Qemu-devel] cpu modelling and hotplug (was: [PATCH RFC 0/4] target-i386: PC socket/core/thread modeling, part 1)
  2015-04-07 12:43 ` [Qemu-devel] cpu modelling and hotplug (was: [PATCH RFC 0/4] target-i386: PC socket/core/thread modeling, part 1) Christian Borntraeger
@ 2015-04-07 15:07   ` Igor Mammedov
  2015-04-08  7:07     ` [Qemu-devel] cpu modelling and hotplug Christian Borntraeger
  2015-04-23  7:32   ` [Qemu-devel] cpu modelling and hotplug (was: [PATCH RFC 0/4] target-i386: PC socket/core/thread modeling, part 1) David Gibson
  2015-10-22  1:27   ` [Qemu-devel] cpu modelling and hotplug Zhu Guihua
  2 siblings, 1 reply; 19+ messages in thread
From: Igor Mammedov @ 2015-04-07 15:07 UTC (permalink / raw)
  To: Christian Borntraeger
  Cc: Peter Maydell, Eduardo Habkost, Alexander Graf, qemu-devel,
	Jason J. Herne, Bharata B Rao, Cornelia Huck, Paolo Bonzini,
	Andreas Färber, david

On Tue, 07 Apr 2015 14:43:43 +0200
Christian Borntraeger <borntraeger@de.ibm.com> wrote:

> We had a call and I was asked to write a summary about our conclusion.
> 
> The more I wrote, there more I became uncertain if we really came to
> a conclusion and became more certain that we want to define the
> QMP/HMP/CLI interfaces first (or quite early in the process)
> 
> As discussed I will provide an initial document as a discussion
> starter
> 
> So here is my current understanding with each piece of information on
> one line, so that everybody can correct me or make additions:
> 
> current wrap-up of architecture support
> -------------------
> x86
> - Topology possible
>    - can be hierarchical
>    - interfaces to query topology
the topology is static, defined at startup; the interface is ACPI tables

> - SMT: fanout in host, guest uses host threads to back guest vCPUS
> - supports cpu hotplug via cpu_add
> 
> power
> - Topology possible
>    - interfaces to query topology?
?

> - SMT: Power8: no threads in host and full core passed in due to HW
> design may change in the future
> 
> s/390
> - Topology possible
>     - can be hierarchical
>     - interfaces to query topology
?

> - always virtualized via PR/SM LPAR
>     - host topology from LPAR can be heterogenous (e.g. 3 cpus in 1st
> socket, 4 in 2nd)
> - SMT: fanout in host, guest uses host threads to back guest vCPUS
> 
> 
> Current downsides of CPU definitions/hotplug
> -----------------------------------------------
> - smp, sockets=,cores=,threads= builds only homogeneous topology
> - cpu_add does not tell were to add
> - artificial icc bus construct on x86 for several reasons (link,
> sysbus not hotpluggable..)
The only reason for the ICC bus was the "sysbus not hotpluggable" part;
links had nothing to do with it. More about links later.

> 
> discussions
> -------------------
> - we want to be able to (most important question, IHMO)
>  - hotplug CPUs on power/x86/s390 and maybe others
>  - define topology information
For defining topology we currently have the following CLI options:
 -smp sockets=,cores=,threads=,maxcpus=
 -numa node,nodeid=X,cpus=<cpu_index-based list>
 -numa node,nodeid=Y,memdev=id
 legacy:
    -numa node,nodeid=Z,mem=size
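
For illustration, a two-node homogeneous machine with today's options
would be started along these lines:

    -m 4G \
    -object memory-backend-ram,id=m0,size=2G \
    -object memory-backend-ram,id=m1,size=2G \
    -smp 8,sockets=2,cores=2,threads=2,maxcpus=16 \
    -numa node,nodeid=0,cpus=0-3,memdev=m0 \
    -numa node,nodeid=1,cpus=4-7,memdev=m1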

>  - bind the guest topology to the host topology in some way
>     - to host nodes
>     - maybe also for gang scheduling of threads (might face
> reluctance from the linux scheduler folks)
>     - not really deeply outlined in this call

> - QOM links must be allocated at boot time, but can be set later on
>     - nothing that we want to expose to users
>     - Machine provides QOM links that the device_add hotplug
1.
QOM links have nothing to do with hotplug. Back then Anthony suggested
using QOM links as an alternative to the non-hotpluggable sysbus, since
it's possible to change a link's value at runtime.

The current device hotplug API supports:
 - legacy BUS hotplug
 - BUS-less device hotplug
    - it's up to a machine callback to define how to wire in the
      hotplugged object.
    - used for memory hotplug on x86
    - we are in the process of converting x86 CPU hotplug to this method
      to get rid of the ICC bus

2. What QOM links could be useful for is introspection of a running
machine.
Currently we have the HMP 'info qtree' command, which to some degree
shows which devices are connected where, wiring-wise.

Now we want to have a similar QOM tree for introspection
which helps express topology as well, like:

/machine/node[x1]/cpu_socket[y1]
        /node[x2]/cpu_socket[y2]/core[z1]/thread[m1]

but for now it's only a VIEW, since the actual QOM devices (CPUs) are
placed in the /machine/peripheral[-anon]/ QOM container.
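
Those containers can already be inspected over QMP today; an
illustrative, hand-abridged exchange might look like:

    -> { "execute": "qom-list",
         "arguments": { "path": "/machine/peripheral-anon" } }
    <- { "return": [ { "name": "type", "type": "string" },
                     { "name": "device[0]", "type": "child<qemu64-x86_64-cpu>" } ] }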


> mechanism can use to add new CPUs into preallocated slots. "CPUs" can
> be groups of cores and/or threads. 
> - hotplug and initial config should use same semantics
> - cpu and memory topology might be somewhat independent
> --> - define nodes
>     - map CPUs to nodes
>     - map memory to nodes
> 
> - hotplug per
>     - socket
>     - core
>     - thread
>     ?
> Now comes the part where I am not sure if we came to a conclusion or
> not:
> - hotplug/definition per core (but not per thread) seems to handle
> all cases
Currently, with -smp cores=2,threads=2, it is possible to hot-plug a
single CPU thread on x86. If we limit granularity to the core, we would
be able to hot-plug only 2 threads at once, i.e. allocate extra capacity.
A solution could be to use heterogeneous CPUs, i.e.
if one needs only one CPU thread, then hot-plug a 1-core/1-thread CPU
into the socket.
The problem here is that we may not be able to keep this backward
compatible with currently deployed per-thread CPU hotplug,
management- and migration-wise.
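
For reference, this is what today's per-thread x86 hotplug looks like on
the QMP wire:

    -> { "execute": "cpu-add", "arguments": { "id": 2 } }
    <- { "return": {} }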


>     - core might have multiple threads ( and thus multiple cpustates)
>     - as device statement (or object?)
> - mapping of cpus to nodes or defining the topology not really
>   outlined in this call
> 
> To be defined:
> - QEMU command line for initial setup
> - QEMU hmp/qmp interfaces for dynamic setup

Here are my suggestions for CLI modeling:
 * To address the NUMA concern, remodel the CLI to:
   -numa node,cpu_sockets=...

 * Convert -cpu to global properties so that they can be applied
   as default properties to CPUs

 * Allow creating CPUs with device_add cpupkg,socket=WHERE[,cores=y][,threads=z][,type=foo];
   in this case cpupkg could be a composite object with several cores/threads
   or even a single thread, like CPUs are now; it could work in both cases.
   Not sure how well it will map into the introspection view of /machine/node[x]/...

 * Convert -smp to perform a set of device_add cpu,... operations
   to keep the CLI compatible with old versions and simple for homogeneous setups.
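
Under that scheme, "-smp 4,sockets=2,cores=2,threads=1" might simply
expand internally to something like (cpupkg and its properties being the
hypothetical names from above):

    device_add cpupkg,socket=0,cores=2,threads=1
    device_add cpupkg,socket=1,cores=2,threads=1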

> 
> Christian
> 
> 


* Re: [Qemu-devel] cpu modelling and hotplug
  2015-04-07 15:07   ` Igor Mammedov
@ 2015-04-08  7:07     ` Christian Borntraeger
  0 siblings, 0 replies; 19+ messages in thread
From: Christian Borntraeger @ 2015-04-08  7:07 UTC (permalink / raw)
  To: Igor Mammedov
  Cc: Peter Maydell, Eduardo Habkost, Alexander Graf, qemu-devel,
	Jason J. Herne, Bharata B Rao, Cornelia Huck, Paolo Bonzini,
	Andreas Färber, david

Am 07.04.2015 um 17:07 schrieb Igor Mammedov:
>>
>> power
>> - Topology possible
>>    - interfaces to query topology?
> ?

I wanted to say: I forgot all the details about the topology.

> 
>> - SMT: Power8: no threads in host and full core passed in due to HW
>> design may change in the future
>>
>> s/390
>> - Topology possible
>>     - can be hierarchical
>>     - interfaces to query topology
> ?
There are in-guest interfaces on s390 to query the CPU topology.


* Re: [Qemu-devel] cpu modelling and hotplug (was: [PATCH RFC 0/4] target-i386: PC socket/core/thread modeling, part 1)
  2015-04-07 12:43 ` [Qemu-devel] cpu modelling and hotplug (was: [PATCH RFC 0/4] target-i386: PC socket/core/thread modeling, part 1) Christian Borntraeger
  2015-04-07 15:07   ` Igor Mammedov
@ 2015-04-23  7:32   ` David Gibson
  2015-04-23  7:37     ` David Gibson
  2015-04-23 13:17     ` Eduardo Habkost
  2015-10-22  1:27   ` [Qemu-devel] cpu modelling and hotplug Zhu Guihua
  2 siblings, 2 replies; 19+ messages in thread
From: David Gibson @ 2015-04-23  7:32 UTC (permalink / raw)
  To: Christian Borntraeger, g
  Cc: Peter Maydell, Eduardo Habkost, Bharata B Rao, qemu-devel,
	Alexander Graf, Jason J. Herne, Paolo Bonzini, Cornelia Huck,
	Igor Mammedov, Andreas Färber

On Tue, Apr 07, 2015 at 02:43:43PM +0200, Christian Borntraeger wrote:
> We had a call and I was asked to write a summary about our conclusion.
> 
> The more I wrote, there more I became uncertain if we really came to a 
> conclusion and became more certain that we want to define the QMP/HMP/CLI
> interfaces first (or quite early in the process)
> 
> As discussed I will provide an initial document as a discussion starter
> 
> So here is my current understanding with each piece of information on one line, so 
> that everybody can correct me or make additions:
> 
> current wrap-up of architecture support
> -------------------
> x86
> - Topology possible
>    - can be hierarchical
>    - interfaces to query topology
> - SMT: fanout in host, guest uses host threads to back guest vCPUS
> - supports cpu hotplug via cpu_add
> 
> power
> - Topology possible
>    - interfaces to query topology?

For power, topology information is communicated via the
"ibm,associativity" (and related) properties in the device tree.  This
can encode hierarchical topologies, but it is *not* bound to the
socket/core/thread hierarchy.  On the guest side on Power there's no
real notion of "socket", just cores with specified proximities to
various memory nodes.
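
As a rough device-tree sketch (cell values are illustrative
placeholders, not an authoritative PAPR encoding), a guest-visible CPU
node might carry:

    cpus {
            PowerPC,POWER8@0 {
                    ibm,associativity = <0x5 0x0 0x0 0x0 0x0 0x0>;
                    /* ... */
            };
    };

where the first cell gives the number of associativity entries and the
remaining cells identify the domain at each reference level.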

> - SMT: Power8: no threads in host and full core passed in due to HW design
>        may change in the future
> 
> s/390
> - Topology possible
>     - can be hierarchical
>     - interfaces to query topology
> - always virtualized via PR/SM LPAR
>     - host topology from LPAR can be heterogenous (e.g. 3 cpus in 1st socket, 4 in 2nd)
> - SMT: fanout in host, guest uses host threads to back guest vCPUS
> 
> 
> Current downsides of CPU definitions/hotplug
> -----------------------------------------------
> - smp, sockets=,cores=,threads= builds only homogeneous topology
> - cpu_add does not tell were to add
> - artificial icc bus construct on x86 for several reasons (link, sysbus not hotpluggable..)

Artificial though it may be, I think having a "cpus" pseudo-bus is not
such a bad idea

> discussions
> -------------------
> - we want to be able to (most important question, IHMO)
>  - hotplug CPUs on power/x86/s390 and maybe others
>  - define topology information
>  - bind the guest topology to the host topology in some way
>     - to host nodes
>     - maybe also for gang scheduling of threads (might face reluctance from
>       the linux scheduler folks)
>     - not really deeply outlined in this call
> - QOM links must be allocated at boot time, but can be set later on
>     - nothing that we want to expose to users
>     - Machine provides QOM links that the device_add hotplug mechanism can use to add
>       new CPUs into preallocated slots. "CPUs" can be groups of cores and/or threads. 
> - hotplug and initial config should use same semantics
> - cpu and memory topology might be somewhat independent
> --> - define nodes
>     - map CPUs to nodes
>     - map memory to nodes
> 
> - hotplug per
>     - socket
>     - core
>     - thread
>     ?
> Now comes the part where I am not sure if we came to a conclusion or not:
> - hotplug/definition per core (but not per thread) seems to handle all cases
>     - core might have multiple threads ( and thus multiple cpustates)
>     - as device statement (or object?)
> - mapping of cpus to nodes or defining the topology not really
>   outlined in this call
> 
> To be defined:
> - QEMU command line for initial setup
> - QEMU hmp/qmp interfaces for dynamic setup

So, I can't say I've entirely got my head around this, but here are my
thoughts so far.

I think the basic problem here is that the fixed socket -> core ->
thread hierarchy is something from x86 land that's become integrated
into QEMU's generic code, where it doesn't entirely make sense.

Ignoring NUMA topology (I'll come back to that in a moment), QEMU
should really only care about two things:

  a) the unit of execution scheduling (a vCPU or "thread")
  b) the unit of plug/unplug

Now, returning to NUMA topology.  What the guest, and therefore QEMU,
really needs to know is the relative proximity of each thread to each
block of memory.  That usually forms some sort of node hierarchy,
but it doesn't necessarily correspond to a socket->core->thread
hierarchy you can see in physical units.

On Power, an arbitrary NUMA node hierarchy can be described in the
device tree without reference to "cores" or "sockets", so really QEMU
has no business even talking about such units.

IIUC, on x86 the NUMA topology is bound up with the socket->core->thread
hierarchy, so it needs to have a notion of those layers, but ideally
that would be specific to the pc machine type.

So, here's what I'd propose:

1) I think we really need some better terminology to refer to the unit
of plug/unplug.  Until someone comes up with something better, I'm
going to use "CPU Module" (CM), to distinguish from the NUMA baggage
of "socket" and also to refer more clearly to the thing that goes into
the socket, rather than the socket itself.

2) A Virtual CPU Module (vCM) need not correspond to a real physical
object.  For machine types which we want to faithfully represent a
specific physical machine, it would.  For generic or pure virtual
machines, the vCMs would be as small as possible.  So for current
Power, they'd be one virtual core, for future power (maybe) or s390 a
single virtual thread.  For x86 I'm not sure what they'd be.

3) I'm thinking we'd have a "cpus" virtual bus represented in QOM,
which would contain the vCMs (also QOM objects).  Their existence
would be generic, though we'd almost certainly use arch and/or machine
specific subtypes.

4) There would be a (generic) way of finding the vCPUS (threads) in a
vCM and the vCM for a specific vCPU.

5) A vCM *might* have internal subdivisions into "cores" or "nodes" or
"chips" or "MCMs" or whatever, but that would be up to the machine
type specific code, and not represented in the QOM hierarchy.

6) Obviously we'd need some backwards-compat goo to map existing
command line options referring to cores and sockets onto the new
representation.  This will need machine type specific hooks - so for
x86 it would need to set up the right vCM subdivisions and make sure
the right NUMA topology info goes into ACPI.  For -machine pseries I'm
thinking that "-smp sockets=2,cores=1,threads=4" and "-smp
sockets=1,cores=2,threads=4" should result in exactly the same thing
internally.
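
A minimal QOM sketch of what (3) might look like; TYPE_CPU_MODULE and
everything else here are hypothetical names, not existing QEMU types:

    /* Hypothetical sketch: a generic vCM ("CPU module") QOM type.
     * Arch/machine code would subclass this with its own contents. */
    #include "qemu/module.h"
    #include "qom/object.h"
    #include "hw/qdev-core.h"

    #define TYPE_CPU_MODULE "cpu-module"

    typedef struct CPUModuleState {
        DeviceState parent_obj;
        /* subtypes add their cores and/or threads here */
    } CPUModuleState;

    static const TypeInfo cpu_module_type_info = {
        .name          = TYPE_CPU_MODULE,
        .parent        = TYPE_DEVICE,
        .instance_size = sizeof(CPUModuleState),
        .abstract      = true,
    };

    static void cpu_module_register_types(void)
    {
        type_register_static(&cpu_module_type_info);
    }

    type_init(cpu_module_register_types)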


Thoughts?


-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson


* Re: [Qemu-devel] cpu modelling and hotplug (was: [PATCH RFC 0/4] target-i386: PC socket/core/thread modeling, part 1)
  2015-04-23  7:32   ` [Qemu-devel] cpu modelling and hotplug (was: [PATCH RFC 0/4] target-i386: PC socket/core/thread modeling, part 1) David Gibson
@ 2015-04-23  7:37     ` David Gibson
  2015-04-23 13:17     ` Eduardo Habkost
  1 sibling, 0 replies; 19+ messages in thread
From: David Gibson @ 2015-04-23  7:37 UTC (permalink / raw)
  To: Christian Borntraeger, g
  Cc: Peter Maydell, Eduardo Habkost, Bharata B Rao, qemu-devel,
	Alexander Graf, Jason J. Herne, Paolo Bonzini, Cornelia Huck,
	Igor Mammedov, Andreas Färber

On Thu, Apr 23, 2015 at 05:32:33PM +1000, David Gibson wrote:
> On Tue, Apr 07, 2015 at 02:43:43PM +0200, Christian Borntraeger wrote:
> > We had a call and I was asked to write a summary about our conclusion.
> > 
> > The more I wrote, there more I became uncertain if we really came to a 
> > conclusion and became more certain that we want to define the QMP/HMP/CLI
> > interfaces first (or quite early in the process)
> > 
> > As discussed I will provide an initial document as a discussion starter
> > 
> > So here is my current understanding with each piece of information on one line, so 
> > that everybody can correct me or make additions:
> > 
> > current wrap-up of architecture support
> > -------------------
> > x86
> > - Topology possible
> >    - can be hierarchical
> >    - interfaces to query topology
> > - SMT: fanout in host, guest uses host threads to back guest vCPUS
> > - supports cpu hotplug via cpu_add
> > 
> > power
> > - Topology possible
> >    - interfaces to query topology?
> 
> For power, topology information is communicated via the
> "ibm,associativity" (and related) properties in the device tree.  This
> is can encode heirarchical topologies, but it is *not* bound to the
> socket/core/thread heirarchy.  On the guest side in Power there's no
> real notion of "socket", just cores with specified proximities to
> various memory nodes.
> 
> > - SMT: Power8: no threads in host and full core passed in due to HW design
> >        may change in the future
> > 
> > s/390
> > - Topology possible
> >     - can be hierarchical
> >     - interfaces to query topology
> > - always virtualized via PR/SM LPAR
> >     - host topology from LPAR can be heterogenous (e.g. 3 cpus in 1st socket, 4 in 2nd)
> > - SMT: fanout in host, guest uses host threads to back guest vCPUS
> > 
> > 
> > Current downsides of CPU definitions/hotplug
> > -----------------------------------------------
> > - smp, sockets=,cores=,threads= builds only homogeneous topology
> > - cpu_add does not tell were to add
> > - artificial icc bus construct on x86 for several reasons (link, sysbus not hotpluggable..)
> 
> Artificial though it may be, I think having a "cpus" pseudo-bus is not
> such a bad idea
> 
> > discussions
> > -------------------
> > - we want to be able to (most important question, IHMO)
> >  - hotplug CPUs on power/x86/s390 and maybe others
> >  - define topology information
> >  - bind the guest topology to the host topology in some way
> >     - to host nodes
> >     - maybe also for gang scheduling of threads (might face reluctance from
> >       the linux scheduler folks)
> >     - not really deeply outlined in this call
> > - QOM links must be allocated at boot time, but can be set later on
> >     - nothing that we want to expose to users
> >     - Machine provides QOM links that the device_add hotplug mechanism can use to add
> >       new CPUs into preallocated slots. "CPUs" can be groups of cores and/or threads. 
> > - hotplug and initial config should use same semantics
> > - cpu and memory topology might be somewhat independent
> > --> - define nodes
> >     - map CPUs to nodes
> >     - map memory to nodes
> > 
> > - hotplug per
> >     - socket
> >     - core
> >     - thread
> >     ?
> > Now comes the part where I am not sure if we came to a conclusion or not:
> > - hotplug/definition per core (but not per thread) seems to handle all cases
> >     - core might have multiple threads ( and thus multiple cpustates)
> >     - as device statement (or object?)
> > - mapping of cpus to nodes or defining the topology not really
> >   outlined in this call
> > 
> > To be defined:
> > - QEMU command line for initial setup
> > - QEMU hmp/qmp interfaces for dynamic setup
> 
> So, I can't say I've entirely got my head around this, but here's my
> thoughts so far.
> 
> I think the basic problem here is that the fixed socket -> core ->
> thread heirarchy is something from x86 land that's become integrated
> into qemu's generic code where it doesn't entirely make sense.
> 
> Ignoring NUMA topology (I'll come back to that in a moment) qemu
> should really only care about two things:
> 
>   a) the unit of execution scheduling (a vCPU or "thread")
>   b) the unit of plug/unplug
> 
> Now, returning to NUMA topology.  What the guest, and therefore qemu,
> really needs to know is the relative proximity of each thread to each
> block of memory.  That usually forms some sort of node heirarchy,
> but it doesn't necessarily correspond to a socket->core->thread
> heirarchy you can see in physical units.
> 
> On Power, an arbitrary NUMA node heirarchy can be described in the
> device tree without reference to "cores" or "sockets", so really qemu
> has no business even talking about such units.
> 
> IIUC, on x86 the NUMA topology is bound up to the socket->core->thread
> heirarchy so it needs to have a notion of those layers, but ideally
> that would be specific to the pc machine type.
> 
> So, here's what I'd propose:
> 
> 1) I think we really need some better terminology to refer to the unit
> of plug/unplug.  Until someone comes up with something better, I'm
> going to use "CPU Module" (CM), to distinguish from the NUMA baggage
> of "socket" and also to refer more clearly to the thing that goes into
> the socket, rather than the socket itself.
> 
> 2) A Virtual CPU Module (vCM) need not correspond to a real physical
> object.  For machine types which we want to faithfully represent a
> specific physical machine, it would.  For generic or pure virtual
> machines, the vCMs would be as small as possible.  So for current
> Power, they'd be one virtual core, for future power (maybe) or s390 a
> single virtual thread.  For x86 I'm not sure what they'd be.
> 
> 3) I'm thinking we'd have a "cpus" virtual bus represented in QOM,
> which would contain the vCMs (also QOM objects).  Their existence
> would be generic, though we'd almost certainly use arch and/or machine
> specific subtypes.
> 
> 4) There would be a (generic) way of finding the vCPUS (threads) in a
> vCM and the vCM for a specific vCPU.
> 
> 5) A vCM *might* have internal subdivisions into "cores" or "nodes" or
> "chips" or "MCMs" or whatever, but that would be up to the machine
> type specific code, and not represented in the QOM heirarchy.
> 
> 6) Obviously we'd need some backwards compat goo to sort out existing
> command line options referring to cores and sockets into the new
> representation.  This will need machine type specific hooks - so for
> x86 it would need to set up the right vCM subdivisions and make sure
> the right NUMA topology info goes into ACPI.  For -machine pseries I'm
> thinking that "-smp sockets=2,cores=1,threads=4" and "-smp
> sockets=1,cores=2,threads=4" should result in exactly the same thing
> internally.
> 
> 
> Thoughts?

I should add - the terminology's a bit different, but I think in terms
of code this should be very similar to the "socket" approach
previously proposed.

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson


* Re: [Qemu-devel] cpu modelling and hotplug (was: [PATCH RFC 0/4] target-i386: PC socket/core/thread modeling, part 1)
  2015-04-23  7:32   ` [Qemu-devel] cpu modelling and hotplug (was: [PATCH RFC 0/4] target-i386: PC socket/core/thread modeling, part 1) David Gibson
  2015-04-23  7:37     ` David Gibson
@ 2015-04-23 13:17     ` Eduardo Habkost
  2015-04-27 10:46       ` David Gibson
  1 sibling, 1 reply; 19+ messages in thread
From: Eduardo Habkost @ 2015-04-23 13:17 UTC (permalink / raw)
  To: David Gibson
  Cc: Peter Maydell, Bharata B Rao, qemu-devel, Alexander Graf,
	Christian Borntraeger, Jason J. Herne, Paolo Bonzini,
	Cornelia Huck, Igor Mammedov, Andreas Färber

On Thu, Apr 23, 2015 at 05:32:33PM +1000, David Gibson wrote:
> On Tue, Apr 07, 2015 at 02:43:43PM +0200, Christian Borntraeger wrote:
> > We had a call and I was asked to write a summary about our conclusion.
> > 
> > The more I wrote, there more I became uncertain if we really came to a 
> > conclusion and became more certain that we want to define the QMP/HMP/CLI
> > interfaces first (or quite early in the process)
> > 
> > As discussed I will provide an initial document as a discussion starter
> > 
> > So here is my current understanding with each piece of information on one line, so 
> > that everybody can correct me or make additions:
> > 
> > current wrap-up of architecture support
> > -------------------
> > x86
> > - Topology possible
> >    - can be hierarchical
> >    - interfaces to query topology
> > - SMT: fanout in host, guest uses host threads to back guest vCPUS
> > - supports cpu hotplug via cpu_add
> > 
> > power
> > - Topology possible
> >    - interfaces to query topology?
> 
> For power, topology information is communicated via the
> "ibm,associativity" (and related) properties in the device tree.  This
> is can encode heirarchical topologies, but it is *not* bound to the
> socket/core/thread heirarchy.  On the guest side in Power there's no
> real notion of "socket", just cores with specified proximities to
> various memory nodes.
> 
> > - SMT: Power8: no threads in host and full core passed in due to HW design
> >        may change in the future
> > 
> > s/390
> > - Topology possible
> >     - can be hierarchical
> >     - interfaces to query topology
> > - always virtualized via PR/SM LPAR
> >     - host topology from LPAR can be heterogenous (e.g. 3 cpus in 1st socket, 4 in 2nd)
> > - SMT: fanout in host, guest uses host threads to back guest vCPUS
> > 
> > 
> > Current downsides of CPU definitions/hotplug
> > -----------------------------------------------
> > - smp, sockets=,cores=,threads= builds only homogeneous topology
> > - cpu_add does not tell were to add
> > - artificial icc bus construct on x86 for several reasons (link, sysbus not hotpluggable..)
> 
> Artificial though it may be, I think having a "cpus" pseudo-bus is not
> such a bad idea

That was considered before[1][2]. We have use cases for adding
additional information about VCPUs to query-cpus, but we could simply
use qom-get for that. The only thing missing is a predictable QOM path
for VCPU objects.

If we provide something like "/cpus/<cpu>" links on all machines,
callers could simply use qom-get to get just the information they need,
instead of getting too much information from query-cpus (which also has
the side-effect of interrupting all running VCPUs to synchronize
register information).
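
With such links in place, a QMP exchange could be as simple as this
(the /cpus/... path is exactly the proposal here, i.e. hypothetical;
qom-get itself exists today):

    -> { "execute": "qom-get",
         "arguments": { "path": "/cpus/cpu[0]", "property": "type" } }
    <- { "return": "qemu64-x86_64-cpu" }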

Quoting part of your proposal below:
> Ignoring NUMA topology (I'll come back to that in a moment) qemu
> should really only care about two things:
> 
>   a) the unit of execution scheduling (a vCPU or "thread")
>   b) the unit of plug/unplug
>
[...]
> 3) I'm thinking we'd have a "cpus" virtual bus represented in QOM,
> which would contain the vCMs (also QOM objects).  Their existence
> would be generic, though we'd almost certainly use arch and/or machine
> specific subtypes.
> 
> 4) There would be a (generic) way of finding the vCPUS (threads) in a
> vCM and the vCM for a specific vCPU.
>

What I propose now is a bit simpler: just a mechanism for enumerating
VCPUs/threads (a), that would replace query-cpus. Later we could also
have a generic mechanism for (b), if we decide to introduce a generic
"CPU module" abstraction for plug/unplug.

A more complex mechanism for enumerating vCMs and the vCPUs inside a vCM
would be a superset of (a), so in theory we wouldn't need both. But I
believe that: 1) it will take some time to define the details of the
vCM/plug/unplug abstractions; 2) we already have use cases today[2] that
could benefit from a generic QOM path for (a).

[1] Message-ID: <20140516151641.GY3302@otherpad.lan.raisama.net>
    http://article.gmane.org/gmane.comp.emulators.qemu/273463
[2] Message-ID: <20150331131623.GG7031@thinpad.lan.raisama.net>
    http://article.gmane.org/gmane.comp.emulators.kvm.devel/134625

> 
> > discussions
> > -------------------
> > - we want to be able to (most important question, IHMO)
> >  - hotplug CPUs on power/x86/s390 and maybe others
> >  - define topology information
> >  - bind the guest topology to the host topology in some way
> >     - to host nodes
> >     - maybe also for gang scheduling of threads (might face reluctance from
> >       the linux scheduler folks)
> >     - not really deeply outlined in this call
> > - QOM links must be allocated at boot time, but can be set later on
> >     - nothing that we want to expose to users
> >     - Machine provides QOM links that the device_add hotplug mechanism can use to add
> >       new CPUs into preallocated slots. "CPUs" can be groups of cores and/or threads. 
> > - hotplug and initial config should use same semantics
> > - cpu and memory topology might be somewhat independent
> > --> - define nodes
> >     - map CPUs to nodes
> >     - map memory to nodes
> > 
> > - hotplug per
> >     - socket
> >     - core
> >     - thread
> >     ?
> > Now comes the part where I am not sure if we came to a conclusion or not:
> > - hotplug/definition per core (but not per thread) seems to handle all cases
> >     - core might have multiple threads ( and thus multiple cpustates)
> >     - as device statement (or object?)
> > - mapping of cpus to nodes or defining the topology not really
> >   outlined in this call
> > 
> > To be defined:
> > - QEMU command line for initial setup
> > - QEMU hmp/qmp interfaces for dynamic setup
> 
> So, I can't say I've entirely got my head around this, but here's my
> thoughts so far.
> 
> I think the basic problem here is that the fixed socket -> core ->
> thread heirarchy is something from x86 land that's become integrated
> into qemu's generic code where it doesn't entirely make sense.
> 
> Ignoring NUMA topology (I'll come back to that in a moment) qemu
> should really only care about two things:
> 
>   a) the unit of execution scheduling (a vCPU or "thread")
>   b) the unit of plug/unplug
> 
> Now, returning to NUMA topology.  What the guest, and therefore qemu,
> really needs to know is the relative proximity of each thread to each
> block of memory.  That usually forms some sort of node heirarchy,
> but it doesn't necessarily correspond to a socket->core->thread
> heirarchy you can see in physical units.
> 
> On Power, an arbitrary NUMA node heirarchy can be described in the
> device tree without reference to "cores" or "sockets", so really qemu
> has no business even talking about such units.
> 
> IIUC, on x86 the NUMA topology is bound up to the socket->core->thread
> heirarchy so it needs to have a notion of those layers, but ideally
> that would be specific to the pc machine type.
> 
> So, here's what I'd propose:
> 
> 1) I think we really need some better terminology to refer to the unit
> of plug/unplug.  Until someone comes up with something better, I'm
> going to use "CPU Module" (CM), to distinguish from the NUMA baggage
> of "socket" and also to refer more clearly to the thing that goes into
> the socket, rather than the socket itself.
> 
> 2) A Virtual CPU Module (vCM) need not correspond to a real physical
> object.  For machine types which we want to faithfully represent a
> specific physical machine, it would.  For generic or pure virtual
> machines, the vCMs would be as small as possible.  So for current
> Power, they'd be one virtual core, for future power (maybe) or s390 a
> single virtual thread.  For x86 I'm not sure what they'd be.
> 
> 3) I'm thinking we'd have a "cpus" virtual bus represented in QOM,
> which would contain the vCMs (also QOM objects).  Their existence
> would be generic, though we'd almost certainly use arch and/or machine
> specific subtypes.
> 
> 4) There would be a (generic) way of finding the vCPUS (threads) in a
> vCM and the vCM for a specific vCPU.
> 
> 5) A vCM *might* have internal subdivisions into "cores" or "nodes" or
> "chips" or "MCMs" or whatever, but that would be up to the machine
> type specific code, and not represented in the QOM heirarchy.
> 
> 6) Obviously we'd need some backwards compat goo to sort out existing
> command line options referring to cores and sockets into the new
> representation.  This will need machine type specific hooks - so for
> x86 it would need to set up the right vCM subdivisions and make sure
> the right NUMA topology info goes into ACPI.  For -machine pseries I'm
> thinking that "-smp sockets=2,cores=1,threads=4" and "-smp
> sockets=1,cores=2,threads=4" should result in exactly the same thing
> internally.
> 
> 
> Thoughts?
> 
> 
> -- 
> David Gibson			| I'll have my music baroque, and my code
> david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
> 				| _way_ _around_!
> http://www.ozlabs.org/~dgibson



-- 
Eduardo


* Re: [Qemu-devel] cpu modelling and hotplug (was: [PATCH RFC 0/4] target-i386: PC socket/core/thread modeling, part 1)
  2015-04-23 13:17     ` Eduardo Habkost
@ 2015-04-27 10:46       ` David Gibson
  0 siblings, 0 replies; 19+ messages in thread
From: David Gibson @ 2015-04-27 10:46 UTC (permalink / raw)
  To: Eduardo Habkost
  Cc: Peter Maydell, Bharata B Rao, qemu-devel, Alexander Graf,
	Christian Borntraeger, Jason J. Herne, Paolo Bonzini,
	Cornelia Huck, Igor Mammedov, Andreas Färber

On Thu, Apr 23, 2015 at 10:17:36AM -0300, Eduardo Habkost wrote:
> On Thu, Apr 23, 2015 at 05:32:33PM +1000, David Gibson wrote:
> > On Tue, Apr 07, 2015 at 02:43:43PM +0200, Christian Borntraeger wrote:
> > > We had a call and I was asked to write a summary about our conclusion.
> > > 
> > > The more I wrote, there more I became uncertain if we really came to a 
> > > conclusion and became more certain that we want to define the QMP/HMP/CLI
> > > interfaces first (or quite early in the process)
> > > 
> > > As discussed I will provide an initial document as a discussion starter
> > > 
> > > So here is my current understanding with each piece of information on one line, so 
> > > that everybody can correct me or make additions:
> > > 
> > > current wrap-up of architecture support
> > > -------------------
> > > x86
> > > - Topology possible
> > >    - can be hierarchical
> > >    - interfaces to query topology
> > > - SMT: fanout in host, guest uses host threads to back guest vCPUS
> > > - supports cpu hotplug via cpu_add
> > > 
> > > power
> > > - Topology possible
> > >    - interfaces to query topology?
> > 
> > For power, topology information is communicated via the
> > "ibm,associativity" (and related) properties in the device tree.  This
> > is can encode heirarchical topologies, but it is *not* bound to the
> > socket/core/thread heirarchy.  On the guest side in Power there's no
> > real notion of "socket", just cores with specified proximities to
> > various memory nodes.
> > 
> > > - SMT: Power8: no threads in host and full core passed in due to HW design
> > >        may change in the future
> > > 
> > > s/390
> > > - Topology possible
> > >     - can be hierarchical
> > >     - interfaces to query topology
> > > - always virtualized via PR/SM LPAR
> > >     - host topology from LPAR can be heterogenous (e.g. 3 cpus in 1st socket, 4 in 2nd)
> > > - SMT: fanout in host, guest uses host threads to back guest vCPUS
> > > 
> > > 
> > > Current downsides of CPU definitions/hotplug
> > > -----------------------------------------------
> > > - smp, sockets=,cores=,threads= builds only homogeneous topology
> > > - cpu_add does not tell were to add
> > > - artificial icc bus construct on x86 for several reasons (link, sysbus not hotpluggable..)
> > 
> > Artificial though it may be, I think having a "cpus" pseudo-bus is not
> > such a bad idea
> 
> That was considered before[1][2]. We have use cases for adding
> additional information about VCPUs to query-cpus, but we could simply
> use qom-get for that. The only thing missing is a predictable QOM path
> for VCPU objects.
> 
> If we provide something like "/cpus/<cpu>" links on all machines,
> callers could simply use qom-get to get just the information they need,
> instead of getting too much information from query-cpus (which also has
> the side-effect of interrupting all running VCPUs to synchronize
> register information).
> 
> Quoting part of your proposal below:
> > Ignoring NUMA topology (I'll come back to that in a moment) qemu
> > should really only care about two things:
> > 
> >   a) the unit of execution scheduling (a vCPU or "thread")
> >   b) the unit of plug/unplug
> >
> [...]
> > 3) I'm thinking we'd have a "cpus" virtual bus represented in QOM,
> > which would contain the vCMs (also QOM objects).  Their existence
> > would be generic, though we'd almost certainly use arch and/or machine
> > specific subtypes.
> > 
> > 4) There would be a (generic) way of finding the vCPUS (threads) in a
> > vCM and the vCM for a specific vCPU.
> >
> 
> What I propose now is a bit simpler: just a mechanism for enumerating
> VCPUs/threads (a), that would replace query-cpus. Later we could also
> have a generic mechanism for (b), if we decide to introduce a generic
> "CPU module" abstraction for plug/unplug.
> 
> A more complex mechanism to enumerating vCMs and the vCPUs inside a vCM
> would be a superset of (a), so in theory we wouldn't need both. But I
> believe that: 1) we will take some time to define the details of the
> vCM/plug/unplug abstractions; 2) we already have use cases today[2] that
> could benefit from a generic QOM path for (a).

That seems like a reasonable first step.  I don't think it conflicts
with any of the things I was suggesting.

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson


* Re: [Qemu-devel] cpu modelling and hotplug
  2015-04-07 12:43 ` [Qemu-devel] cpu modelling and hotplug (was: [PATCH RFC 0/4] target-i386: PC socket/core/thread modeling, part 1) Christian Borntraeger
  2015-04-07 15:07   ` Igor Mammedov
  2015-04-23  7:32   ` [Qemu-devel] cpu modelling and hotplug (was: [PATCH RFC 0/4] target-i386: PC socket/core/thread modeling, part 1) David Gibson
@ 2015-10-22  1:27   ` Zhu Guihua
  2015-10-22 16:52     ` Andreas Färber
  2 siblings, 1 reply; 19+ messages in thread
From: Zhu Guihua @ 2015-10-22  1:27 UTC (permalink / raw)
  To: Christian Borntraeger, qemu-devel
  Cc: Peter Maydell, Eduardo Habkost, Igor Mammedov, Alexander Graf,
	Jason J. Herne, Bharata B Rao, Cornelia Huck, Paolo Bonzini,
	Andreas Färber, david

Hi all,

May I know whether the discussion is still ongoing?

I checked Andreas's git tree; there were no changes about the topology.

Please let me know the schedule for this.

Thanks,
Zhu

On 04/07/2015 08:43 PM, Christian Borntraeger wrote:
> We had a call and I was asked to write a summary about our conclusion.
>
> The more I wrote, there more I became uncertain if we really came to a
> conclusion and became more certain that we want to define the QMP/HMP/CLI
> interfaces first (or quite early in the process)
>
> As discussed I will provide an initial document as a discussion starter
>
> So here is my current understanding with each piece of information on one line, so
> that everybody can correct me or make additions:
>
> current wrap-up of architecture support
> -------------------
> x86
> - Topology possible
>     - can be hierarchical
>     - interfaces to query topology
> - SMT: fanout in host, guest uses host threads to back guest vCPUS
> - supports cpu hotplug via cpu_add
>
> power
> - Topology possible
>     - interfaces to query topology?
> - SMT: Power8: no threads in host and full core passed in due to HW design
>         may change in the future
>
> s/390
> - Topology possible
>      - can be hierarchical
>      - interfaces to query topology
> - always virtualized via PR/SM LPAR
>      - host topology from LPAR can be heterogenous (e.g. 3 cpus in 1st socket, 4 in 2nd)
> - SMT: fanout in host, guest uses host threads to back guest vCPUS
>
>
> Current downsides of CPU definitions/hotplug
> -----------------------------------------------
> - smp, sockets=,cores=,threads= builds only homogeneous topology
> - cpu_add does not tell were to add
> - artificial icc bus construct on x86 for several reasons (link, sysbus not hotpluggable..)
>
>
> discussions
> -------------------
> - we want to be able to (most important question, IHMO)
>   - hotplug CPUs on power/x86/s390 and maybe others
>   - define topology information
>   - bind the guest topology to the host topology in some way
>      - to host nodes
>      - maybe also for gang scheduling of threads (might face reluctance from
>        the linux scheduler folks)
>      - not really deeply outlined in this call
> - QOM links must be allocated at boot time, but can be set later on
>      - nothing that we want to expose to users
>      - Machine provides QOM links that the device_add hotplug mechanism can use to add
>        new CPUs into preallocated slots. "CPUs" can be groups of cores and/or threads.
> - hotplug and initial config should use same semantics
> - cpu and memory topology might be somewhat independent
> --> - define nodes
>      - map CPUs to nodes
>      - map memory to nodes
>
> - hotplug per
>      - socket
>      - core
>      - thread
>      ?
> Now comes the part where I am not sure if we came to a conclusion or not:
> - hotplug/definition per core (but not per thread) seems to handle all cases
>      - core might have multiple threads ( and thus multiple cpustates)
>      - as device statement (or object?)
> - mapping of cpus to nodes or defining the topology not really
>    outlined in this call
>
> To be defined:
> - QEMU command line for initial setup
> - QEMU hmp/qmp interfaces for dynamic setup
>
>
> Christian
>
>
> .
>


* Re: [Qemu-devel] cpu modelling and hotplug
  2015-10-22  1:27   ` [Qemu-devel] cpu modelling and hotplug Zhu Guihua
@ 2015-10-22 16:52     ` Andreas Färber
  0 siblings, 0 replies; 19+ messages in thread
From: Andreas Färber @ 2015-10-22 16:52 UTC (permalink / raw)
  To: Zhu Guihua, qemu-devel
  Cc: Peter Maydell, Eduardo Habkost, Igor Mammedov, Alexander Graf,
	Christian Borntraeger, Jason J. Herne, Bharata B Rao,
	Cornelia Huck, Paolo Bonzini, david

Hi,

Am 22.10.2015 um 03:27 schrieb Zhu Guihua:
> May I know whether the discussion is still ongoing?

We did have some discussions at KVM Forum; you may want to check the
video recording of my CPU hot-plug talk (the end was cut off, I think).

> I checked Andreas's git tree, there was no changes about the topology.

I've pushed the latest rebase, including the machine->cpu_model change,
but no other changes yet, and I did not re-review it at all beyond
conflict resolution and "make check" testing. The first few patches may
or may not be ready for a respin...

> Plz let me know the schedule about this.

I am currently occupied with other (mostly downstream) bugs and do not
have a clear schedule to offer. We can still get patches in during the
Soft Freeze if there is agreement on some subset.

Regards,
Andreas

-- 
SUSE Linux GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Felix Imendörffer, Jane Smithard, Graham Norton; HRB 21284 (AG Nürnberg)
