* [PATCH v8 00/12] s390x: CPU Topology
@ 2022-06-20 14:03 Pierre Morel
  2022-06-20 14:03 ` [PATCH v8 01/12] Update Linux Headers Pierre Morel
                   ` (12 more replies)
  0 siblings, 13 replies; 49+ messages in thread
From: Pierre Morel @ 2022-06-20 14:03 UTC (permalink / raw)
  To: qemu-s390x
  Cc: qemu-devel, borntraeger, pasic, richard.henderson, david, thuth,
	cohuck, mst, pbonzini, kvm, ehabkost, marcel.apfelbaum, eblake,
	armbru, seiden, nrb, frankja

Hi,

This new spin mainly brings the series in line with the latest Linux CPU
Topology patch, along with function testing and coding style modifications.

Foreword
========

The goal of this series is to implement CPU topology for S390. It
improves on the preceding series with the implementation of books and
drawers, of non-uniform CPU topology, and with documentation.

To use these patches, you will need version 10 of the Linux series.
You can find it here:
https://lkml.org/lkml/2022/6/20/590

Currently this code is for KVM only; I do not know whether a TCG patch
would be of interest. If one is written, it will be done in another series.

To get a better understanding of the S390x CPU Topology and its
implementation in QEMU, you can have a look at the documentation in the
last patch or follow the introduction below.

A short introduction
====================

CPU Topology is described in the S390 POP, essentially through the
description of two instructions:

PTF Perform Topology Function, used to poll for topology changes
    and to set the polarization; the polarization part is not part of
    this series.

STSI Store System Information, with the SYSIB 15.1.x providing the
    topology configuration.

S390 Topology is a hierarchical topology with 6 levels, including up to
    5 levels of containers. The last topology level specifies the CPU cores.

    This patch series only uses the two lower levels: sockets and cores.

    To get the information on the topology, S390 provides the STSI
    instruction, which stores a structure providing the list of the
    containers used in the machine topology: the SYSIB.
    A selector within the STSI instruction allows choosing how many topology
    levels will be provided in the SYSIB.

    Using the Topology List Entries (TLE) provided inside the SYSIB, the
    Linux kernel is able to compute the information about the cache
    distance between two cores and can use this information to make
    scheduling decisions.
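
To give an idea of what the guest receives, here is a rough sketch of how
the size of a SYSIB 15.1.x adds up. It is an illustration only, not code
from the series: sysib_151x_length() is a hypothetical helper, and the
three sizes are the ones enforced by the QEMU_BUILD_BUG_ON checks in
patch 02 (16-byte header, 8-byte container TLE, 16-byte CPU TLE).

```c
#include <assert.h>
#include <stddef.h>

/* Sizes taken from the QEMU_BUILD_BUG_ON checks in patch 02. */
enum {
    SYSIB_151X_HEADER_SIZE = 16, /* sizeof(SysIB_151x), without any TLE */
    CONTAINER_TLE_SIZE     = 8,  /* sizeof(SysIBTl_container) */
    CPU_TLE_SIZE           = 16  /* sizeof(SysIBTl_cpu) */
};

/* Hypothetical helper: bytes used by a SYSIB holding the given TLEs. */
static size_t sysib_151x_length(size_t nr_container_tle, size_t nr_cpu_tle)
{
    size_t len = SYSIB_151X_HEADER_SIZE
               + nr_container_tle * CONTAINER_TLE_SIZE
               + nr_cpu_tle * CPU_TLE_SIZE;

    /* A SysIB is a single 4096-byte block in target/s390x/cpu.h. */
    assert(len <= 4096);
    return len;
}
```

Assuming one CPU TLE per populated socket, a guest with three populated
sockets would get 16 + 3 * 8 + 3 * 16 = 88 bytes.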

The design
==========

1) To be ready for hotplug, I chose an object-oriented design
of the topology containers:
- A node is a bridge on the SYSBUS and defines a "node bus"
- A drawer is hotplugged on the "node bus"
- A book on the "drawer bus"
- A socket on the "book bus"
- And the CPU Topology List Entry (CPU-TLE) sits on the "socket bus".
These objects will be enhanced with the cache information when
NUMA is implemented.

This also allows for easy retrieval when building the different SYSIBs
for Store Topology System Information (STSI).

2) The Perform Topology Function (PTF) instruction is made available to
the guest with a new KVM capability and intercepted in QEMU, allowing the
guest to poll for topology changes.


Features
========

- There is no direct match between IDs shown by:
    - lscpu (unrelated numbered list),
    - SYSIB 15.1.x (topology ID)

- The CPU number, left column of lscpu, is used by Linux tools to
    reference a CPU, while the CPU address is used by QEMU for hotplug.

- Effect of -smp parsing on the topology with an example:
    -smp 9,sockets=4,cores=4,maxcpus=16

    We have 4 sockets, each holding 4 cores, so that we have a maximum
    of 16 CPUs, 9 of which are active on boot. (Should be obvious)

# lscpu -e
CPU NODE DRAWER BOOK SOCKET CORE L1d:L1i:L2d:L2i ONLINE CONFIGURED POLARIZATION ADDRESS
  0    0      0    0      0    0 0:0:0:0            yes yes        horizontal   0
  1    0      0    0      0    1 1:1:1:1            yes yes        horizontal   1
  2    0      0    0      0    2 2:2:2:2            yes yes        horizontal   2
  3    0      0    0      0    3 3:3:3:3            yes yes        horizontal   3
  4    0      0    0      1    4 4:4:4:4            yes yes        horizontal   4
  5    0      0    0      1    5 5:5:5:5            yes yes        horizontal   5
  6    0      0    0      1    6 6:6:6:6            yes yes        horizontal   6
  7    0      0    0      1    7 7:7:7:7            yes yes        horizontal   7
  8    0      0    0      2    8 8:8:8:8            yes yes        horizontal   8
# 


- To plug a new CPU into the topology, one can simply use the CPU
    address, as in:

(qemu) device_add host-s390x-cpu,core-id=12
# lscpu -e
CPU NODE DRAWER BOOK SOCKET CORE L1d:L1i:L2d:L2i ONLINE CONFIGURED POLARIZATION ADDRESS
  0    0      0    0      0    0 0:0:0:0            yes yes        horizontal   0
  1    0      0    0      0    1 1:1:1:1            yes yes        horizontal   1
  2    0      0    0      0    2 2:2:2:2            yes yes        horizontal   2
  3    0      0    0      0    3 3:3:3:3            yes yes        horizontal   3
  4    0      0    0      1    4 4:4:4:4            yes yes        horizontal   4
  5    0      0    0      1    5 5:5:5:5            yes yes        horizontal   5
  6    0      0    0      1    6 6:6:6:6            yes yes        horizontal   6
  7    0      0    0      1    7 7:7:7:7            yes yes        horizontal   7
  8    0      0    0      2    8 8:8:8:8            yes yes        horizontal   8
  9    -      -    -      -    - :::                 no yes        horizontal   12
# chcpu -e 9
CPU 9 enabled
# lscpu -e
CPU NODE DRAWER BOOK SOCKET CORE L1d:L1i:L2d:L2i ONLINE CONFIGURED POLARIZATION ADDRESS
  0    0      0    0      0    0 0:0:0:0            yes yes        horizontal   0
  1    0      0    0      0    1 1:1:1:1            yes yes        horizontal   1
  2    0      0    0      0    2 2:2:2:2            yes yes        horizontal   2
  3    0      0    0      0    3 3:3:3:3            yes yes        horizontal   3
  4    0      0    0      1    4 4:4:4:4            yes yes        horizontal   4
  5    0      0    0      1    5 5:5:5:5            yes yes        horizontal   5
  6    0      0    0      1    6 6:6:6:6            yes yes        horizontal   6
  7    0      0    0      1    7 7:7:7:7            yes yes        horizontal   7
  8    0      0    0      2    8 8:8:8:8            yes yes        horizontal   8
  9    0      0    0      3    9 9:9:9:9            yes yes        horizontal   12
#

It is up to the admin level, Libvirt for example, to pin the right CPU to the right
vCPU. As we can see, without NUMA, choosing separate sockets for CPUs is not easy
without hotplug: lacking any other information, the code assigns the vCPUs and fills
the sockets one after the other.
Note that this is also the default behavior on the LPAR.
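
The placement rule behind the lscpu output above can be sketched in C.
This is only an illustration, not code from the series: socket_of_core()
is a hypothetical helper, and it assumes the machine-wide core ID is
divided by the number of topology cores per socket (cores * threads),
as s390_topology_new_cpu() does in patch 02.

```c
#include <assert.h>

/*
 * Hypothetical helper: topology cores are numbered machine-wide and
 * sockets are filled in order, so the socket is a plain integer division.
 */
static int socket_of_core(int core_id, int cores, int threads)
{
    int cores_per_socket = cores * threads;

    assert(core_id >= 0 && cores_per_socket > 0);
    return core_id / cores_per_socket;
}
```

With -smp 9,sockets=4,cores=4,maxcpus=16 and one thread per core, the CPU
with address 8 lands in socket 2 and the hotplugged CPU with address 12
lands in socket 3, which matches the SOCKET column shown above.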

Conclusion
==========

This series, together with the associated KVM patch, provides CPU topology
information to the guest.
Currently, only dedicated vCPU and CPU are supported and a NUMA topology can only
be handled using CPU hotplug inside the guest.

Regards,
Pierre

Pierre Morel (12):
  Update Linux Headers
  s390x/cpu_topology: CPU topology objects and structures
  s390x/cpu_topology: implementating Store Topology System Information
  s390x/cpu_topology: Adding books to CPU topology
  s390x/cpu_topology: Adding books to STSI
  s390x/cpu_topology: Adding drawers to CPU topology
  s390x/cpu_topology: Adding drawers to STSI
  s390x/cpu_topology: implementing numa for the s390x topology
  target/s390x: interception of PTF instruction
  s390x/cpu_topology: resetting the Topology-Change-Report
  s390x/cpu_topology: CPU topology migration
  s390x/cpu_topology: activating CPU topology

 hw/core/machine-smp.c              |  48 +-
 hw/core/machine.c                  |  22 +
 hw/s390x/cpu-topology.c            | 754 +++++++++++++++++++++++++++++
 hw/s390x/meson.build               |   1 +
 hw/s390x/s390-virtio-ccw.c         |  77 ++-
 include/hw/boards.h                |   8 +
 include/hw/s390x/cpu-topology.h    |  99 ++++
 include/hw/s390x/s390-virtio-ccw.h |   6 +
 include/hw/s390x/sclp.h            |   1 +
 linux-headers/asm-s390/kvm.h       |   9 +
 linux-headers/linux/kvm.h          |   1 +
 qapi/machine.json                  |  14 +-
 qemu-options.hx                    |   6 +-
 softmmu/vl.c                       |   6 +
 target/s390x/cpu-sysemu.c          |   7 +
 target/s390x/cpu.h                 |  52 ++
 target/s390x/cpu_models.c          |   1 +
 target/s390x/cpu_topology.c        | 169 +++++++
 target/s390x/kvm/kvm.c             |  93 ++++
 target/s390x/kvm/kvm_s390x.h       |   2 +
 target/s390x/meson.build           |   1 +
 21 files changed, 1359 insertions(+), 18 deletions(-)
 create mode 100644 hw/s390x/cpu-topology.c
 create mode 100644 include/hw/s390x/cpu-topology.h
 create mode 100644 target/s390x/cpu_topology.c

-- 
2.31.1

Changelog:

- since v7

- Coherence with the Linux patch series changes for MTCR get
  (Pierre)

- check return values during new CPU creation
  (Thomas)

- Improving coding style and argument usage
  (Thomas)

- since v6

- Changes on smp args in qemu-options
  (Daniel)
  
- changed comments in machine.json
  (Daniel)
 
- Added reset
  (Janosch)

- since v5

- rebasing on newer QEMU version

- reworked most lines above 80 characters.

- since v4

- Added drawer and books to topology

- Added numa topology

- Added documentation

- since v3

- Added migration
  (Thomas)

- Separated STSI instruction from KVM to prepare TCG
  (Thomas)

- Take care of endianness to prepare TCG
  (Thomas)

- Added comments on STSI CPU container and PTF instruction
  (Thomas)

- Moved enabling the instructions as the last patch
  (Thomas)


* [PATCH v8 01/12] Update Linux Headers
  2022-06-20 14:03 [PATCH v8 00/12] s390x: CPU Topology Pierre Morel
@ 2022-06-20 14:03 ` Pierre Morel
  2022-06-20 14:03 ` [PATCH v8 02/12] s390x/cpu_topology: CPU topology objects and structures Pierre Morel
                   ` (11 subsequent siblings)
  12 siblings, 0 replies; 49+ messages in thread
From: Pierre Morel @ 2022-06-20 14:03 UTC (permalink / raw)
  To: qemu-s390x
  Cc: qemu-devel, borntraeger, pasic, richard.henderson, david, thuth,
	cohuck, mst, pbonzini, kvm, ehabkost, marcel.apfelbaum, eblake,
	armbru, seiden, nrb, frankja

This does not belong to the series.
It is only there to facilitate review with the corresponding
Linux series: [PATCH v10 0/3] s390x: KVM: CPU Topology

Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>
---
 linux-headers/asm-s390/kvm.h | 9 +++++++++
 linux-headers/linux/kvm.h    | 1 +
 2 files changed, 10 insertions(+)

diff --git a/linux-headers/asm-s390/kvm.h b/linux-headers/asm-s390/kvm.h
index f053b8304a..2bad030b8c 100644
--- a/linux-headers/asm-s390/kvm.h
+++ b/linux-headers/asm-s390/kvm.h
@@ -74,6 +74,7 @@ struct kvm_s390_io_adapter_req {
 #define KVM_S390_VM_CRYPTO		2
 #define KVM_S390_VM_CPU_MODEL		3
 #define KVM_S390_VM_MIGRATION		4
+#define KVM_S390_VM_CPU_TOPOLOGY       5
 
 /* kvm attributes for mem_ctrl */
 #define KVM_S390_VM_MEM_ENABLE_CMMA	0
@@ -171,6 +172,14 @@ struct kvm_s390_vm_cpu_subfunc {
 #define KVM_S390_VM_MIGRATION_START	1
 #define KVM_S390_VM_MIGRATION_STATUS	2
 
+/* kvm attributes for cpu topology */
+#define KVM_S390_VM_CPU_TOPO_MTR_CLEAR 0
+#define KVM_S390_VM_CPU_TOPO_MTR_SET   1
+
+struct kvm_s390_cpu_topology {
+	__u16 mtcr;
+};
+
 /* for KVM_GET_REGS and KVM_SET_REGS */
 struct kvm_regs {
 	/* general purpose regs for s390 */
diff --git a/linux-headers/linux/kvm.h b/linux-headers/linux/kvm.h
index 0d05d02ee4..83f7a49a30 100644
--- a/linux-headers/linux/kvm.h
+++ b/linux-headers/linux/kvm.h
@@ -1150,6 +1150,7 @@ struct kvm_ppc_resize_hpt {
 #define KVM_CAP_DISABLE_QUIRKS2 213
 /* #define KVM_CAP_VM_TSC_CONTROL 214 */
 #define KVM_CAP_SYSTEM_EVENT_DATA 215
+#define KVM_CAP_S390_CPU_TOPOLOGY 217
 
 #ifdef KVM_CAP_IRQ_ROUTING
 
-- 
2.31.1



* [PATCH v8 02/12] s390x/cpu_topology: CPU topology objects and structures
  2022-06-20 14:03 [PATCH v8 00/12] s390x: CPU Topology Pierre Morel
  2022-06-20 14:03 ` [PATCH v8 01/12] Update Linux Headers Pierre Morel
@ 2022-06-20 14:03 ` Pierre Morel
  2022-06-27 13:31   ` Janosch Frank
                     ` (2 more replies)
  2022-06-20 14:03 ` [PATCH v8 03/12] s390x/cpu_topology: implementating Store Topology System Information Pierre Morel
                   ` (10 subsequent siblings)
  12 siblings, 3 replies; 49+ messages in thread
From: Pierre Morel @ 2022-06-20 14:03 UTC (permalink / raw)
  To: qemu-s390x
  Cc: qemu-devel, borntraeger, pasic, richard.henderson, david, thuth,
	cohuck, mst, pbonzini, kvm, ehabkost, marcel.apfelbaum, eblake,
	armbru, seiden, nrb, frankja

We use new objects to have a dynamic administration of the CPU topology.
The highest level object in this implementation is the s390 book and
in this first implementation of CPU topology for S390 we have a single
book.
The book is built as a SYSBUS bridge during the CPU initialization.
Other objects, sockets and cores, will be built after the parsing
of the QEMU -smp argument.

Every object under this single book will be built dynamically,
if needed, immediately after a CPU has been realized.
The CPUs will fill the sockets one after the other, according to the
number of cores per socket defined during the smp parsing.

Each CPU inside a socket will be represented by a bit in a 64-bit
unsigned long, set on plug and cleared on unplug of a CPU.

For the S390 CPU topology, threads and cores are merged into
topology cores, and the number of topology cores is the number of
cores multiplied by the number of threads.
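
As a rough sketch of the bit layout described above (an illustration
only, not the code of this patch): core_mask_origin() and core_mask_bit()
are hypothetical helper names. They assume, as s390_topology_new_cpu()
does, that each mask covers 64 consecutive core IDs and that the most
significant bit stands for the CPU at the origin.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: origin of the 64-CPU mask that holds core_id. */
static int core_mask_origin(int core_id)
{
    assert(core_id >= 0);
    return 64 * (core_id / 64);
}

/* Hypothetical helper: mask with only the bit of core_id set. */
static uint64_t core_mask_bit(int core_id)
{
    /* Bit 63 (the MSB) is the CPU at the origin, bit 0 the last one. */
    int bit = 63 - (core_id - core_mask_origin(core_id));

    return UINT64_C(1) << bit;
}
```

Plugging a CPU would OR core_mask_bit() into the mask of the entry whose
origin is core_mask_origin(), and unplugging would clear that same bit.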

Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>
---
 hw/s390x/cpu-topology.c         | 391 ++++++++++++++++++++++++++++++++
 hw/s390x/meson.build            |   1 +
 hw/s390x/s390-virtio-ccw.c      |   6 +
 include/hw/s390x/cpu-topology.h |  74 ++++++
 target/s390x/cpu.h              |  47 ++++
 5 files changed, 519 insertions(+)
 create mode 100644 hw/s390x/cpu-topology.c
 create mode 100644 include/hw/s390x/cpu-topology.h

diff --git a/hw/s390x/cpu-topology.c b/hw/s390x/cpu-topology.c
new file mode 100644
index 0000000000..0fd6f08084
--- /dev/null
+++ b/hw/s390x/cpu-topology.c
@@ -0,0 +1,391 @@
+/*
+ * CPU Topology
+ *
+ * Copyright 2022 IBM Corp.
+ * Author(s): Pierre Morel <pmorel@linux.ibm.com>
+
+ * This work is licensed under the terms of the GNU GPL, version 2 or (at
+ * your option) any later version. See the COPYING file in the top-level
+ * directory.
+ */
+
+#include "qemu/osdep.h"
+#include "qapi/error.h"
+#include "qemu/error-report.h"
+#include "hw/sysbus.h"
+#include "hw/s390x/cpu-topology.h"
+#include "hw/qdev-properties.h"
+#include "hw/boards.h"
+#include "qemu/typedefs.h"
+#include "target/s390x/cpu.h"
+#include "hw/s390x/s390-virtio-ccw.h"
+
+/*
+ * s390_create_cores:
+ * @ms: Machine state
+ * @socket: the socket on which to create the core set
+ * @origin: the origin offset of the first core of the set
+ * @errp: Error pointer
+ *
+ * returns a pointer to the created S390TopologyCores structure
+ *
+ * On error: return NULL
+ */
+static S390TopologyCores *s390_create_cores(MachineState *ms,
+                                            S390TopologySocket *socket,
+                                            int origin, Error **errp)
+{
+    DeviceState *dev;
+    S390TopologyCores *cores;
+
+    if (socket->bus->num_children >= ms->smp.cores * ms->smp.threads) {
+        error_setg(errp, "Unable to create more cores.");
+        return NULL;
+    }
+
+    dev = qdev_new(TYPE_S390_TOPOLOGY_CORES);
+    qdev_realize_and_unref(dev, socket->bus, &error_fatal);
+
+    cores = S390_TOPOLOGY_CORES(dev);
+    cores->origin = origin;
+    socket->cnt += 1;
+
+    return cores;
+}
+
+/*
+ * s390_create_socket:
+ * @ms: Machine state
+ * @book: the book on which to create the socket
+ * @id: the socket id
+ * @errp: Error pointer
+ *
+ * returns a pointer to the created S390TopologySocket structure
+ *
+ * On error: return NULL
+ */
+static S390TopologySocket *s390_create_socket(MachineState *ms,
+                                              S390TopologyBook *book,
+                                              int id, Error **errp)
+{
+    DeviceState *dev;
+    S390TopologySocket *socket;
+
+    if (book->bus->num_children >= ms->smp.sockets) {
+        error_setg(errp, "Unable to create more sockets.");
+        return NULL;
+    }
+
+    dev = qdev_new(TYPE_S390_TOPOLOGY_SOCKET);
+    qdev_realize_and_unref(dev, book->bus, &error_fatal);
+
+    socket = S390_TOPOLOGY_SOCKET(dev);
+    socket->socket_id = id;
+    book->cnt++;
+
+    return socket;
+}
+
+/*
+ * s390_get_cores:
+ * @ms: Machine state
+ * @socket: the socket to search into
+ * @origin: the origin specified for the S390TopologyCores
+ * @errp: Error pointer
+ *
+ * returns a pointer to a S390TopologyCores structure within a socket having
+ * the specified origin.
+ * First search if the socket is already containing the S390TopologyCores
+ * structure and if not create one with this origin.
+ */
+static S390TopologyCores *s390_get_cores(MachineState *ms,
+                                         S390TopologySocket *socket,
+                                         int origin, Error **errp)
+{
+    S390TopologyCores *cores;
+    BusChild *kid;
+
+    QTAILQ_FOREACH(kid, &socket->bus->children, sibling) {
+        cores = S390_TOPOLOGY_CORES(kid->child);
+        if (cores->origin == origin) {
+            return cores;
+        }
+    }
+    return s390_create_cores(ms, socket, origin, errp);
+}
+
+/*
+ * s390_get_socket:
+ * @ms: Machine state
+ * @book: The book to search into
+ * @socket_id: the identifier of the socket to search for
+ * @errp: Error pointer
+ *
+ * returns a pointer to a S390TopologySocket structure within a book having
+ * the specified socket_id.
+ * First search if the book is already containing the S390TopologySocket
+ * structure and if not create one with this socket_id.
+ */
+static S390TopologySocket *s390_get_socket(MachineState *ms,
+                                           S390TopologyBook *book,
+                                           int socket_id, Error **errp)
+{
+    S390TopologySocket *socket;
+    BusChild *kid;
+
+    QTAILQ_FOREACH(kid, &book->bus->children, sibling) {
+        socket = S390_TOPOLOGY_SOCKET(kid->child);
+        if (socket->socket_id == socket_id) {
+            return socket;
+        }
+    }
+    return s390_create_socket(ms, book, socket_id, errp);
+}
+
+/*
+ * s390_topology_new_cpu:
+ * @core_id: the core ID is machine wide
+ *
+ * We have a single book returned by s390_get_topology(),
+ * then we build the hierarchy on demand.
+ * Note that we do not destroy the hierarchy on error creating
+ * an entry in the topology, we just keep it empty.
+ * We do not need to worry about not finding a topology level
+ * entry this would have been caught during smp parsing.
+ */
+bool s390_topology_new_cpu(MachineState *ms, int core_id, Error **errp)
+{
+    S390TopologyBook *book;
+    S390TopologySocket *socket;
+    S390TopologyCores *cores;
+    int nb_cores_per_socket;
+    int origin, bit;
+
+    book = s390_get_topology();
+
+    nb_cores_per_socket = ms->smp.cores * ms->smp.threads;
+
+    socket = s390_get_socket(ms, book, core_id / nb_cores_per_socket, errp);
+    if (!socket) {
+        return false;
+    }
+
+    /*
+     * At the core level, each CPU is represented by a bit in a 64-bit
+     * unsigned long, set on plug and cleared on unplug of a CPU.
+     * The firmware assumes that all CPUs in a core description have the
+     * same type and polarization and are all either dedicated or shared.
+     * If a socket contains CPUs with different types, polarizations or
+     * dedication, they are defined in different CPU containers.
+     * Currently we assume all CPUs are identical, so the only reason to
+     * have several S390TopologyCores inside a socket is to hold more than
+     * 64 CPUs. In that case the origin field, the offset of the first CPU
+     * in the CPU container, allows the socket container to represent up to
+     * the maximal number of CPUs across several CPU containers.
+     */
+    origin = 64 * (core_id / 64);
+
+    cores = s390_get_cores(ms, socket, origin, errp);
+    if (!cores) {
+        return false;
+    }
+
+    bit = 63 - (core_id - origin);
+    set_bit(bit, &cores->mask);
+    cores->origin = origin;
+
+    return true;
+}
+
+/*
+ * Setting the first topology: 1 book, 1 socket
+ * This is enough for 64 cores if the topology is flat (single socket)
+ */
+void s390_topology_setup(MachineState *ms)
+{
+    DeviceState *dev;
+
+    /* Create BOOK bridge device */
+    dev = qdev_new(TYPE_S390_TOPOLOGY_BOOK);
+    object_property_add_child(qdev_get_machine(),
+                              TYPE_S390_TOPOLOGY_BOOK, OBJECT(dev));
+    sysbus_realize_and_unref(SYS_BUS_DEVICE(dev), &error_fatal);
+}
+
+S390TopologyBook *s390_get_topology(void)
+{
+    static S390TopologyBook *book;
+
+    if (!book) {
+        book = S390_TOPOLOGY_BOOK(
+            object_resolve_path(TYPE_S390_TOPOLOGY_BOOK, NULL));
+        assert(book != NULL);
+    }
+
+    return book;
+}
+
+/* --- CORES Definitions --- */
+
+static Property s390_topology_cores_properties[] = {
+    DEFINE_PROP_BOOL("dedicated", S390TopologyCores, dedicated, false),
+    DEFINE_PROP_UINT8("polarity", S390TopologyCores, polarity,
+                      S390_TOPOLOGY_POLARITY_H),
+    DEFINE_PROP_UINT8("cputype", S390TopologyCores, cputype,
+                      S390_TOPOLOGY_CPU_TYPE),
+    DEFINE_PROP_UINT16("origin", S390TopologyCores, origin, 0),
+    DEFINE_PROP_UINT64("mask", S390TopologyCores, mask, 0),
+    DEFINE_PROP_END_OF_LIST(),
+};
+
+static void cpu_cores_class_init(ObjectClass *oc, void *data)
+{
+    DeviceClass *dc = DEVICE_CLASS(oc);
+    HotplugHandlerClass *hc = HOTPLUG_HANDLER_CLASS(oc);
+
+    device_class_set_props(dc, s390_topology_cores_properties);
+    hc->unplug = qdev_simple_device_unplug_cb;
+    dc->bus_type = TYPE_S390_TOPOLOGY_SOCKET_BUS;
+    dc->desc = "topology cpu entry";
+}
+
+static const TypeInfo cpu_cores_info = {
+    .name          = TYPE_S390_TOPOLOGY_CORES,
+    .parent        = TYPE_DEVICE,
+    .instance_size = sizeof(S390TopologyCores),
+    .class_init    = cpu_cores_class_init,
+    .interfaces = (InterfaceInfo[]) {
+        { TYPE_HOTPLUG_HANDLER },
+        { }
+    }
+};
+
+static char *socket_bus_get_dev_path(DeviceState *dev)
+{
+    S390TopologySocket *socket = S390_TOPOLOGY_SOCKET(dev);
+    DeviceState *book = dev->parent_bus->parent;
+    char *id = qdev_get_dev_path(book);
+    char *ret;
+
+    if (id) {
+        ret = g_strdup_printf("%s:%02d", id, socket->socket_id);
+        g_free(id);
+    } else {
+        ret = g_strdup_printf("_:%02d", socket->socket_id);
+    }
+
+    return ret;
+}
+
+static void socket_bus_class_init(ObjectClass *oc, void *data)
+{
+    BusClass *k = BUS_CLASS(oc);
+
+    k->get_dev_path = socket_bus_get_dev_path;
+    k->max_dev = S390_MAX_SOCKETS;
+}
+
+static const TypeInfo socket_bus_info = {
+    .name = TYPE_S390_TOPOLOGY_SOCKET_BUS,
+    .parent = TYPE_BUS,
+    .instance_size = 0,
+    .class_init = socket_bus_class_init,
+};
+
+static void s390_socket_device_realize(DeviceState *dev, Error **errp)
+{
+    S390TopologySocket *socket = S390_TOPOLOGY_SOCKET(dev);
+    BusState *bus;
+
+    bus = qbus_new(TYPE_S390_TOPOLOGY_SOCKET_BUS, dev,
+                   TYPE_S390_TOPOLOGY_SOCKET_BUS);
+    qbus_set_hotplug_handler(bus, OBJECT(dev));
+    socket->bus = bus;
+}
+
+static void socket_class_init(ObjectClass *oc, void *data)
+{
+    DeviceClass *dc = DEVICE_CLASS(oc);
+    HotplugHandlerClass *hc = HOTPLUG_HANDLER_CLASS(oc);
+
+    hc->unplug = qdev_simple_device_unplug_cb;
+    set_bit(DEVICE_CATEGORY_BRIDGE, dc->categories);
+    dc->bus_type = TYPE_S390_TOPOLOGY_BOOK_BUS;
+    dc->realize = s390_socket_device_realize;
+    dc->desc = "topology socket";
+}
+
+static const TypeInfo socket_info = {
+    .name          = TYPE_S390_TOPOLOGY_SOCKET,
+    .parent        = TYPE_DEVICE,
+    .instance_size = sizeof(S390TopologySocket),
+    .class_init    = socket_class_init,
+    .interfaces = (InterfaceInfo[]) {
+        { TYPE_HOTPLUG_HANDLER },
+        { }
+    }
+};
+
+static char *book_bus_get_dev_path(DeviceState *dev)
+{
+    return g_strdup("00");
+}
+
+static void book_bus_class_init(ObjectClass *oc, void *data)
+{
+    BusClass *k = BUS_CLASS(oc);
+
+    k->get_dev_path = book_bus_get_dev_path;
+    k->max_dev = S390_MAX_BOOKS;
+}
+
+static const TypeInfo book_bus_info = {
+    .name = TYPE_S390_TOPOLOGY_BOOK_BUS,
+    .parent = TYPE_BUS,
+    .instance_size = 0,
+    .class_init = book_bus_class_init,
+};
+
+static void s390_book_device_realize(DeviceState *dev, Error **errp)
+{
+    S390TopologyBook *book = S390_TOPOLOGY_BOOK(dev);
+    BusState *bus;
+
+    bus = qbus_new(TYPE_S390_TOPOLOGY_BOOK_BUS, dev,
+                   TYPE_S390_TOPOLOGY_BOOK_BUS);
+    qbus_set_hotplug_handler(bus, OBJECT(dev));
+    book->bus = bus;
+}
+
+static void book_class_init(ObjectClass *oc, void *data)
+{
+    DeviceClass *dc = DEVICE_CLASS(oc);
+    HotplugHandlerClass *hc = HOTPLUG_HANDLER_CLASS(oc);
+
+    hc->unplug = qdev_simple_device_unplug_cb;
+    set_bit(DEVICE_CATEGORY_BRIDGE, dc->categories);
+    dc->realize = s390_book_device_realize;
+    dc->desc = "topology book";
+}
+
+static const TypeInfo book_info = {
+    .name          = TYPE_S390_TOPOLOGY_BOOK,
+    .parent        = TYPE_SYS_BUS_DEVICE,
+    .instance_size = sizeof(S390TopologyBook),
+    .class_init    = book_class_init,
+    .interfaces = (InterfaceInfo[]) {
+        { TYPE_HOTPLUG_HANDLER },
+        { }
+    }
+};
+
+static void topology_register(void)
+{
+    type_register_static(&cpu_cores_info);
+    type_register_static(&socket_bus_info);
+    type_register_static(&socket_info);
+    type_register_static(&book_bus_info);
+    type_register_static(&book_info);
+}
+
+type_init(topology_register);
diff --git a/hw/s390x/meson.build b/hw/s390x/meson.build
index feefe0717e..3592fa952b 100644
--- a/hw/s390x/meson.build
+++ b/hw/s390x/meson.build
@@ -2,6 +2,7 @@ s390x_ss = ss.source_set()
 s390x_ss.add(files(
   'ap-bridge.c',
   'ap-device.c',
+  'cpu-topology.c',
   'ccw-device.c',
   'css-bridge.c',
   'css.c',
diff --git a/hw/s390x/s390-virtio-ccw.c b/hw/s390x/s390-virtio-ccw.c
index cc3097bfee..a586875b24 100644
--- a/hw/s390x/s390-virtio-ccw.c
+++ b/hw/s390x/s390-virtio-ccw.c
@@ -43,6 +43,7 @@
 #include "sysemu/sysemu.h"
 #include "hw/s390x/pv.h"
 #include "migration/blocker.h"
+#include "hw/s390x/cpu-topology.h"
 
 static Error *pv_mig_blocker;
 
@@ -89,6 +90,7 @@ static void s390_init_cpus(MachineState *machine)
     /* initialize possible_cpus */
     mc->possible_cpu_arch_ids(machine);
 
+    s390_topology_setup(machine);
     for (i = 0; i < machine->smp.cpus; i++) {
         s390x_new_cpu(machine->cpu_type, i, &error_fatal);
     }
@@ -306,6 +308,10 @@ static void s390_cpu_plug(HotplugHandler *hotplug_dev,
     g_assert(!ms->possible_cpus->cpus[cpu->env.core_id].cpu);
     ms->possible_cpus->cpus[cpu->env.core_id].cpu = OBJECT(dev);
 
+    if (!s390_topology_new_cpu(ms, cpu->env.core_id, errp)) {
+        return;
+    }
+
     if (dev->hotplugged) {
         raise_irq_cpu_hotplug();
     }
diff --git a/include/hw/s390x/cpu-topology.h b/include/hw/s390x/cpu-topology.h
new file mode 100644
index 0000000000..beec61706c
--- /dev/null
+++ b/include/hw/s390x/cpu-topology.h
@@ -0,0 +1,74 @@
+/*
+ * CPU Topology
+ *
+ * Copyright 2022 IBM Corp.
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or (at
+ * your option) any later version. See the COPYING file in the top-level
+ * directory.
+ */
+#ifndef HW_S390X_CPU_TOPOLOGY_H
+#define HW_S390X_CPU_TOPOLOGY_H
+
+#include "hw/qdev-core.h"
+#include "qom/object.h"
+
+#define S390_TOPOLOGY_CPU_TYPE    0x03
+
+#define S390_TOPOLOGY_POLARITY_H  0x00
+#define S390_TOPOLOGY_POLARITY_VL 0x01
+#define S390_TOPOLOGY_POLARITY_VM 0x02
+#define S390_TOPOLOGY_POLARITY_VH 0x03
+
+#define TYPE_S390_TOPOLOGY_CORES "topology cores"
+    /*
+     * Each CPU inside a socket will be represented by a bit in a 64bit
+     * unsigned long. Set on plug and clear on unplug of a CPU.
+     * All CPU inside a mask share the same dedicated, polarity and
+     * cputype values.
+     * The origin is the offset of the first CPU in a mask.
+     */
+struct S390TopologyCores {
+    DeviceState parent_obj;
+    int id;
+    bool dedicated;
+    uint8_t polarity;
+    uint8_t cputype;
+    uint16_t origin;
+    uint64_t mask;
+    int cnt;
+};
+typedef struct S390TopologyCores S390TopologyCores;
+OBJECT_DECLARE_SIMPLE_TYPE(S390TopologyCores, S390_TOPOLOGY_CORES)
+
+#define TYPE_S390_TOPOLOGY_SOCKET "topology socket"
+#define TYPE_S390_TOPOLOGY_SOCKET_BUS "socket-bus"
+struct S390TopologySocket {
+    DeviceState parent_obj;
+    BusState *bus;
+    int socket_id;
+    int cnt;
+};
+typedef struct S390TopologySocket S390TopologySocket;
+OBJECT_DECLARE_SIMPLE_TYPE(S390TopologySocket, S390_TOPOLOGY_SOCKET)
+#define S390_MAX_SOCKETS 4
+
+#define TYPE_S390_TOPOLOGY_BOOK "topology book"
+#define TYPE_S390_TOPOLOGY_BOOK_BUS "book-bus"
+struct S390TopologyBook {
+    SysBusDevice parent_obj;
+    BusState *bus;
+    int book_id;
+    int cnt;
+};
+typedef struct S390TopologyBook S390TopologyBook;
+OBJECT_DECLARE_SIMPLE_TYPE(S390TopologyBook, S390_TOPOLOGY_BOOK)
+#define S390_MAX_BOOKS 1
+
+S390TopologyBook *s390_init_topology(void);
+
+S390TopologyBook *s390_get_topology(void);
+void s390_topology_setup(MachineState *ms);
+bool s390_topology_new_cpu(MachineState *ms, int core_id, Error **errp);
+
+#endif
diff --git a/target/s390x/cpu.h b/target/s390x/cpu.h
index 7d6d01325b..216adfde26 100644
--- a/target/s390x/cpu.h
+++ b/target/s390x/cpu.h
@@ -565,6 +565,53 @@ typedef union SysIB {
 } SysIB;
 QEMU_BUILD_BUG_ON(sizeof(SysIB) != 4096);
 
+/* CPU type Topology List Entry */
+typedef struct SysIBTl_cpu {
+        uint8_t nl;
+        uint8_t reserved0[3];
+        uint8_t reserved1:5;
+        uint8_t dedicated:1;
+        uint8_t polarity:2;
+        uint8_t type;
+        uint16_t origin;
+        uint64_t mask;
+} SysIBTl_cpu;
+QEMU_BUILD_BUG_ON(sizeof(SysIBTl_cpu) != 16);
+
+/* Container type Topology List Entry */
+typedef struct SysIBTl_container {
+        uint8_t nl;
+        uint8_t reserved[6];
+        uint8_t id;
+} QEMU_PACKED SysIBTl_container;
+QEMU_BUILD_BUG_ON(sizeof(SysIBTl_container) != 8);
+
+/* Generic Topology List Entry */
+typedef union SysIBTl_entry {
+        uint8_t nl;
+        SysIBTl_container container;
+        SysIBTl_cpu cpu;
+} SysIBTl_entry;
+
+#define TOPOLOGY_NR_MAG  6
+#define TOPOLOGY_NR_MAG6 0
+#define TOPOLOGY_NR_MAG5 1
+#define TOPOLOGY_NR_MAG4 2
+#define TOPOLOGY_NR_MAG3 3
+#define TOPOLOGY_NR_MAG2 4
+#define TOPOLOGY_NR_MAG1 5
+/* Configuration topology */
+typedef struct SysIB_151x {
+    uint8_t  res0[2];
+    uint16_t length;
+    uint8_t  mag[TOPOLOGY_NR_MAG];
+    uint8_t  res1;
+    uint8_t  mnest;
+    uint32_t res2;
+    SysIBTl_entry tle[0];
+} SysIB_151x;
+QEMU_BUILD_BUG_ON(sizeof(SysIB_151x) != 16);
+
 /* MMU defines */
 #define ASCE_ORIGIN           (~0xfffULL) /* segment table origin             */
 #define ASCE_SUBSPACE         0x200       /* subspace group control           */
-- 
2.31.1



* [PATCH v8 03/12] s390x/cpu_topology: implementating Store Topology System Information
  2022-06-20 14:03 [PATCH v8 00/12] s390x: CPU Topology Pierre Morel
  2022-06-20 14:03 ` [PATCH v8 01/12] Update Linux Headers Pierre Morel
  2022-06-20 14:03 ` [PATCH v8 02/12] s390x/cpu_topology: CPU topology objects and structures Pierre Morel
@ 2022-06-20 14:03 ` Pierre Morel
  2022-06-27 14:26   ` Janosch Frank
  2022-07-20 19:34   ` Janis Schoetterl-Glausch
  2022-06-20 14:03 ` [PATCH v8 04/12] s390x/cpu_topology: Adding books to CPU topology Pierre Morel
                   ` (9 subsequent siblings)
  12 siblings, 2 replies; 49+ messages in thread
From: Pierre Morel @ 2022-06-20 14:03 UTC (permalink / raw)
  To: qemu-s390x
  Cc: qemu-devel, borntraeger, pasic, richard.henderson, david, thuth,
	cohuck, mst, pbonzini, kvm, ehabkost, marcel.apfelbaum, eblake,
	armbru, seiden, nrb, frankja

STSI handling is extended to intercept function code 15, which
stores the CPU topology.

The SYSIB 15_1_x structures are built from the objects created
when the CPUs are plugged.

With this patch the maximum MNEST level is 2. It is also the only
level allowed, so only SYSIB 15_1_2 is built.
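
As a rough illustration of the layout described above, here is a minimal
standalone sketch that builds a SYSIB 15_1_2 holding one socket container
entry and one CPU entry. The byte offsets follow the SysIB_151x,
SysIBTl_container and SysIBTl_cpu definitions from patch 02. The helper
names (put_be16, put_be64, put_container, put_cpus, build_sysib_15_1_2)
are hypothetical and not part of the series, and the dedicated/polarity
flag byte is simply left at zero.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SYSIB_HDR_LEN      16  /* sizeof(SysIB_151x) in the series */
#define TLE_CONTAINER_LEN   8  /* sizeof(SysIBTl_container) */
#define TLE_CPU_LEN        16  /* sizeof(SysIBTl_cpu) */

/* Store a 16-bit value in big-endian (guest) byte order. */
static void put_be16(uint8_t *p, uint16_t v)
{
    p[0] = (uint8_t)(v >> 8);
    p[1] = (uint8_t)(v & 0xff);
}

/* Store a 64-bit value in big-endian (guest) byte order. */
static void put_be64(uint8_t *p, uint64_t v)
{
    for (int i = 0; i < 8; i++) {
        p[i] = (uint8_t)(v >> (56 - 8 * i));
    }
}

/* Container TLE: nesting level in byte 0, container id in byte 7. */
static size_t put_container(uint8_t *p, uint8_t nl, uint8_t id)
{
    memset(p, 0, TLE_CONTAINER_LEN);
    p[0] = nl;
    p[7] = id;
    return TLE_CONTAINER_LEN;
}

/* CPU TLE: nl = 0, type in byte 5, origin in bytes 6-7, mask in bytes 8-15. */
static size_t put_cpus(uint8_t *p, uint8_t type, uint16_t origin,
                       uint64_t mask)
{
    memset(p, 0, TLE_CPU_LEN);
    p[5] = type;
    put_be16(p + 6, origin);
    put_be64(p + 8, mask);
    return TLE_CPU_LEN;
}

/*
 * Build a SYSIB 15_1_2 with one socket and one CPU entry.
 * Returns the total length, also stored big-endian at offset 2.
 */
static size_t build_sysib_15_1_2(uint8_t *buf, size_t size,
                                 uint8_t nr_sockets, uint8_t cores_per_socket,
                                 uint8_t socket_id, uint64_t mask)
{
    size_t len = SYSIB_HDR_LEN;

    assert(size >= SYSIB_HDR_LEN + TLE_CONTAINER_LEN + TLE_CPU_LEN);
    memset(buf, 0, size);
    buf[8] = nr_sockets;        /* mag[TOPOLOGY_NR_MAG2] */
    buf[9] = cores_per_socket;  /* mag[TOPOLOGY_NR_MAG1] */
    buf[11] = 2;                /* mnest */
    len += put_container(buf + len, 1, socket_id);
    len += put_cpus(buf + len, 0, 0, mask);
    put_be16(buf + 2, (uint16_t)len);
    return len;
}
```

Writing the fields through explicit byte stores keeps the sketch
independent of the host endianness, which is the role the byte-swap
helpers play in stsi_15_cpus() and setup_stsi().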

Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>
---
 target/s390x/cpu.h          |   2 +
 target/s390x/cpu_topology.c | 112 ++++++++++++++++++++++++++++++++++++
 target/s390x/kvm/kvm.c      |   5 ++
 target/s390x/meson.build    |   1 +
 4 files changed, 120 insertions(+)
 create mode 100644 target/s390x/cpu_topology.c

diff --git a/target/s390x/cpu.h b/target/s390x/cpu.h
index 216adfde26..9d48087b71 100644
--- a/target/s390x/cpu.h
+++ b/target/s390x/cpu.h
@@ -890,4 +890,6 @@ S390CPU *s390_cpu_addr2state(uint16_t cpu_addr);
 
 #include "exec/cpu-all.h"
 
+void insert_stsi_15_1_x(S390CPU *cpu, int sel2, __u64 addr, uint8_t ar);
+
 #endif
diff --git a/target/s390x/cpu_topology.c b/target/s390x/cpu_topology.c
new file mode 100644
index 0000000000..9f656d7e51
--- /dev/null
+++ b/target/s390x/cpu_topology.c
@@ -0,0 +1,112 @@
+/*
+ * QEMU S390x CPU Topology
+ *
+ * Copyright IBM Corp. 2022
+ * Author(s): Pierre Morel <pmorel@linux.ibm.com>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or (at
+ * your option) any later version. See the COPYING file in the top-level
+ * directory.
+ */
+
+#include "qemu/osdep.h"
+#include "cpu.h"
+#include "hw/s390x/pv.h"
+#include "hw/sysbus.h"
+#include "hw/s390x/cpu-topology.h"
+
+static int stsi_15_container(void *p, int nl, int id)
+{
+    SysIBTl_container *tle = (SysIBTl_container *)p;
+
+    tle->nl = nl;
+    tle->id = id;
+
+    return sizeof(*tle);
+}
+
+static int stsi_15_cpus(void *p, S390TopologyCores *cd)
+{
+    SysIBTl_cpu *tle = (SysIBTl_cpu *)p;
+
+    tle->nl = 0;
+    tle->dedicated = cd->dedicated;
+    tle->polarity = cd->polarity;
+    tle->type = cd->cputype;
+    tle->origin = cpu_to_be16(cd->origin);
+    tle->mask = cpu_to_be64(cd->mask);
+
+    return sizeof(*tle);
+}
+
+static int set_socket(const MachineState *ms, void *p,
+                      S390TopologySocket *socket)
+{
+    BusChild *kid;
+    int l, len = 0;
+
+    len += stsi_15_container(p, 1, socket->socket_id);
+    p += len;
+
+    QTAILQ_FOREACH_REVERSE(kid, &socket->bus->children, sibling) {
+        l = stsi_15_cpus(p, S390_TOPOLOGY_CORES(kid->child));
+        p += l;
+        len += l;
+    }
+    return len;
+}
+
+static void setup_stsi(const MachineState *ms, void *p, int level)
+{
+    S390TopologyBook *book;
+    SysIB_151x *sysib;
+    BusChild *kid;
+    int len, l;
+
+    sysib = (SysIB_151x *)p;
+    sysib->mnest = level;
+    sysib->mag[TOPOLOGY_NR_MAG2] = ms->smp.sockets;
+    sysib->mag[TOPOLOGY_NR_MAG1] = ms->smp.cores * ms->smp.threads;
+
+    book = s390_get_topology();
+    len = sizeof(SysIB_151x);
+    p += len;
+
+    QTAILQ_FOREACH_REVERSE(kid, &book->bus->children, sibling) {
+        l = set_socket(ms, p, S390_TOPOLOGY_SOCKET(kid->child));
+        p += l;
+        len += l;
+    }
+
+    sysib->length = cpu_to_be16(len);
+}
+
+void insert_stsi_15_1_x(S390CPU *cpu, int sel2, __u64 addr, uint8_t ar)
+{
+    const MachineState *machine = MACHINE(qdev_get_machine());
+    void *p;
+    int ret;
+
+    /*
+     * Until the SCLP STSI Facility reporting the MNEST value is used,
+     * a sel2 value of 2 is the only value allowed in STSI 15.1.x.
+     */
+    if (sel2 != 2) {
+        setcc(cpu, 3);
+        return;
+    }
+
+    p = g_malloc0(TARGET_PAGE_SIZE);
+
+    setup_stsi(machine, p, 2);
+
+    if (s390_is_pv()) {
+        ret = s390_cpu_pv_mem_write(cpu, 0, p, TARGET_PAGE_SIZE);
+    } else {
+        ret = s390_cpu_virt_mem_write(cpu, addr, ar, p, TARGET_PAGE_SIZE);
+    }
+
+    setcc(cpu, ret ? 3 : 0);
+    g_free(p);
+}
+
diff --git a/target/s390x/kvm/kvm.c b/target/s390x/kvm/kvm.c
index 7bd8db0e7b..563bf5ac60 100644
--- a/target/s390x/kvm/kvm.c
+++ b/target/s390x/kvm/kvm.c
@@ -51,6 +51,7 @@
 #include "hw/s390x/s390-virtio-ccw.h"
 #include "hw/s390x/s390-virtio-hcall.h"
 #include "hw/s390x/pv.h"
+#include "hw/s390x/cpu-topology.h"
 
 #ifndef DEBUG_KVM
 #define DEBUG_KVM  0
@@ -1918,6 +1919,10 @@ static int handle_stsi(S390CPU *cpu)
         /* Only sysib 3.2.2 needs post-handling for now. */
         insert_stsi_3_2_2(cpu, run->s390_stsi.addr, run->s390_stsi.ar);
         return 0;
+    case 15:
+        insert_stsi_15_1_x(cpu, run->s390_stsi.sel2, run->s390_stsi.addr,
+                           run->s390_stsi.ar);
+        return 0;
     default:
         return 0;
     }
diff --git a/target/s390x/meson.build b/target/s390x/meson.build
index 84c1402a6a..890ccfa789 100644
--- a/target/s390x/meson.build
+++ b/target/s390x/meson.build
@@ -29,6 +29,7 @@ s390x_softmmu_ss.add(files(
   'sigp.c',
   'cpu-sysemu.c',
   'cpu_models_sysemu.c',
+  'cpu_topology.c',
 ))
 
 s390x_user_ss = ss.source_set()
-- 
2.31.1



* [PATCH v8 04/12] s390x/cpu_topology: Adding books to CPU topology
  2022-06-20 14:03 [PATCH v8 00/12] s390x: CPU Topology Pierre Morel
                   ` (2 preceding siblings ...)
  2022-06-20 14:03 ` [PATCH v8 03/12] s390x/cpu_topology: implementing Store Topology System Information Pierre Morel
@ 2022-06-20 14:03 ` Pierre Morel
  2022-06-20 14:03 ` [PATCH v8 05/12] s390x/cpu_topology: Adding books to STSI Pierre Morel
                   ` (8 subsequent siblings)
  12 siblings, 0 replies; 49+ messages in thread
From: Pierre Morel @ 2022-06-20 14:03 UTC (permalink / raw)
  To: qemu-s390x
  Cc: qemu-devel, borntraeger, pasic, richard.henderson, david, thuth,
	cohuck, mst, pbonzini, kvm, ehabkost, marcel.apfelbaum, eblake,
	armbru, seiden, nrb, frankja

The S390 CPU topology may have up to 5 levels of topology containers.
The first container level above the cores is level 2, the sockets.
This patch introduces books, the level 3 containers, which contain
sockets.

Let's add books to the CPU topology and to the -smp option.
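
To illustrate how the extra level enters the -smp arithmetic, here is a
small standalone sketch. It is not the QEMU parser: struct smp_topo and
the three helpers are hypothetical names, and only the formulas are
taken from the hunks below (books becomes one more factor of the
product, and an omitted sockets value is derived by dividing maxcpus by
the other factors).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical mirror of the -smp hierarchy once books are added. */
struct smp_topo {
    unsigned books, sockets, dies, clusters, cores, threads;
};

/* Product of all hierarchy levels, books included. */
static unsigned smp_product(const struct smp_topo *t)
{
    return t->books * t->sockets * t->dies * t->clusters *
           t->cores * t->threads;
}

/* Derive an omitted sockets value from maxcpus and the other levels. */
static unsigned smp_derive_sockets(const struct smp_topo *t, unsigned maxcpus)
{
    unsigned div = t->books * t->dies * t->clusters * t->cores * t->threads;

    assert(div > 0);
    return maxcpus / div;
}

/* Sanity check: the product of the hierarchy must match maxcpus. */
static bool smp_is_valid(const struct smp_topo *t, unsigned maxcpus)
{
    return smp_product(t) == maxcpus;
}
```

With books=2, cores=4 and maxcpus=16 the derived value is 2 sockets per
book. With maxcpus=12 the integer division yields 1 socket and the
product (8) no longer matches, which is the case the sanity check at the
end of machine_parse_smp_config() rejects.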

Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>
---
 hw/core/machine-smp.c      | 29 ++++++++++++++++++++++-------
 hw/core/machine.c          |  2 ++
 hw/s390x/s390-virtio-ccw.c |  1 +
 include/hw/boards.h        |  4 ++++
 qapi/machine.json          |  7 ++++++-
 qemu-options.hx            |  5 +++--
 softmmu/vl.c               |  3 +++
 7 files changed, 41 insertions(+), 10 deletions(-)

diff --git a/hw/core/machine-smp.c b/hw/core/machine-smp.c
index b39ed21e65..d7aa39d540 100644
--- a/hw/core/machine-smp.c
+++ b/hw/core/machine-smp.c
@@ -31,6 +31,10 @@ static char *cpu_hierarchy_to_string(MachineState *ms)
     MachineClass *mc = MACHINE_GET_CLASS(ms);
     GString *s = g_string_new(NULL);
 
+    if (mc->smp_props.books_supported) {
+        g_string_append_printf(s, "books (%u) * ", ms->smp.books);
+    }
+
     g_string_append_printf(s, "sockets (%u)", ms->smp.sockets);
 
     if (mc->smp_props.dies_supported) {
@@ -73,6 +77,7 @@ void machine_parse_smp_config(MachineState *ms,
 {
     MachineClass *mc = MACHINE_GET_CLASS(ms);
     unsigned cpus    = config->has_cpus ? config->cpus : 0;
+    unsigned books   = config->has_books ? config->books : 0;
     unsigned sockets = config->has_sockets ? config->sockets : 0;
     unsigned dies    = config->has_dies ? config->dies : 0;
     unsigned clusters = config->has_clusters ? config->clusters : 0;
@@ -85,6 +90,7 @@ void machine_parse_smp_config(MachineState *ms,
      * explicit configuration like "cpus=0" is not allowed.
      */
     if ((config->has_cpus && config->cpus == 0) ||
+        (config->has_books && config->books == 0) ||
         (config->has_sockets && config->sockets == 0) ||
         (config->has_dies && config->dies == 0) ||
         (config->has_clusters && config->clusters == 0) ||
@@ -111,6 +117,13 @@ void machine_parse_smp_config(MachineState *ms,
     dies = dies > 0 ? dies : 1;
     clusters = clusters > 0 ? clusters : 1;
 
+    if (!mc->smp_props.books_supported && books > 1) {
+        error_setg(errp, "books not supported by this machine's CPU topology");
+        return;
+    }
+
+    books = books > 0 ? books : 1;
+
     /* compute missing values based on the provided ones */
     if (cpus == 0 && maxcpus == 0) {
         sockets = sockets > 0 ? sockets : 1;
@@ -124,33 +137,35 @@ void machine_parse_smp_config(MachineState *ms,
             if (sockets == 0) {
                 cores = cores > 0 ? cores : 1;
                 threads = threads > 0 ? threads : 1;
-                sockets = maxcpus / (dies * clusters * cores * threads);
+                sockets = maxcpus / (books * dies * clusters * cores * threads);
             } else if (cores == 0) {
                 threads = threads > 0 ? threads : 1;
-                cores = maxcpus / (sockets * dies * clusters * threads);
+                cores = maxcpus / (books * sockets * dies * clusters * threads);
             }
         } else {
             /* prefer cores over sockets since 6.2 */
             if (cores == 0) {
                 sockets = sockets > 0 ? sockets : 1;
                 threads = threads > 0 ? threads : 1;
-                cores = maxcpus / (sockets * dies * clusters * threads);
+                cores = maxcpus / (books * sockets * dies * clusters * threads);
             } else if (sockets == 0) {
                 threads = threads > 0 ? threads : 1;
-                sockets = maxcpus / (dies * clusters * cores * threads);
+                sockets = maxcpus / (books * dies * clusters * cores * threads);
             }
         }
 
         /* try to calculate omitted threads at last */
         if (threads == 0) {
-            threads = maxcpus / (sockets * dies * clusters * cores);
+            threads = maxcpus / (books * sockets * dies * clusters * cores);
         }
     }
 
-    maxcpus = maxcpus > 0 ? maxcpus : sockets * dies * clusters * cores * threads;
+    maxcpus = maxcpus > 0 ? maxcpus : books * sockets * dies *
+                                      clusters * cores * threads;
     cpus = cpus > 0 ? cpus : maxcpus;
 
     ms->smp.cpus = cpus;
+    ms->smp.books = books;
     ms->smp.sockets = sockets;
     ms->smp.dies = dies;
     ms->smp.clusters = clusters;
@@ -159,7 +174,7 @@ void machine_parse_smp_config(MachineState *ms,
     ms->smp.max_cpus = maxcpus;
 
     /* sanity-check of the computed topology */
-    if (sockets * dies * clusters * cores * threads != maxcpus) {
+    if (books * sockets * dies * clusters * cores * threads != maxcpus) {
         g_autofree char *topo_msg = cpu_hierarchy_to_string(ms);
         error_setg(errp, "Invalid CPU topology: "
                    "product of the hierarchy must match maxcpus: "
diff --git a/hw/core/machine.c b/hw/core/machine.c
index a673302cce..8861f58d23 100644
--- a/hw/core/machine.c
+++ b/hw/core/machine.c
@@ -821,6 +821,7 @@ static void machine_get_smp(Object *obj, Visitor *v, const char *name,
     MachineState *ms = MACHINE(obj);
     SMPConfiguration *config = &(SMPConfiguration){
         .has_cpus = true, .cpus = ms->smp.cpus,
+        .has_books = true, .books = ms->smp.books,
         .has_sockets = true, .sockets = ms->smp.sockets,
         .has_dies = true, .dies = ms->smp.dies,
         .has_clusters = true, .clusters = ms->smp.clusters,
@@ -1087,6 +1088,7 @@ static void machine_initfn(Object *obj)
     /* default to mc->default_cpus */
     ms->smp.cpus = mc->default_cpus;
     ms->smp.max_cpus = mc->default_cpus;
+    ms->smp.books = 1;
     ms->smp.sockets = 1;
     ms->smp.dies = 1;
     ms->smp.clusters = 1;
diff --git a/hw/s390x/s390-virtio-ccw.c b/hw/s390x/s390-virtio-ccw.c
index a586875b24..ace65164d8 100644
--- a/hw/s390x/s390-virtio-ccw.c
+++ b/hw/s390x/s390-virtio-ccw.c
@@ -619,6 +619,7 @@ static void ccw_machine_class_init(ObjectClass *oc, void *data)
     hc->unplug_request = s390_machine_device_unplug_request;
     nc->nmi_monitor_handler = s390_nmi;
     mc->default_ram_id = "s390.ram";
+    mc->smp_props.books_supported = true;
 }
 
 static inline bool machine_get_aes_key_wrap(Object *obj, Error **errp)
diff --git a/include/hw/boards.h b/include/hw/boards.h
index d94edcef28..2b44f50b6e 100644
--- a/include/hw/boards.h
+++ b/include/hw/boards.h
@@ -130,11 +130,13 @@ typedef struct {
  * @prefer_sockets - whether sockets are preferred over cores in smp parsing
  * @dies_supported - whether dies are supported by the machine
  * @clusters_supported - whether clusters are supported by the machine
+ * @books_supported - whether books are supported by the machine
  */
 typedef struct {
     bool prefer_sockets;
     bool dies_supported;
     bool clusters_supported;
+    bool books_supported;
 } SMPCompatProps;
 
 /**
@@ -299,6 +301,7 @@ typedef struct DeviceMemoryState {
 /**
  * CpuTopology:
  * @cpus: the number of present logical processors on the machine
+ * @books: the number of books on the machine
  * @sockets: the number of sockets on the machine
  * @dies: the number of dies in one socket
  * @clusters: the number of clusters in one die
@@ -308,6 +311,7 @@ typedef struct DeviceMemoryState {
  */
 typedef struct CpuTopology {
     unsigned int cpus;
+    unsigned int books;
     unsigned int sockets;
     unsigned int dies;
     unsigned int clusters;
diff --git a/qapi/machine.json b/qapi/machine.json
index 6afd1936b0..f838b0c51f 100644
--- a/qapi/machine.json
+++ b/qapi/machine.json
@@ -900,7 +900,8 @@
 # a CPU is being hotplugged.
 #
 # @node-id: NUMA node ID the CPU belongs to
-# @socket-id: socket number within node/board the CPU belongs to
+# @book-id: book number within node/board the CPU belongs to
+# @socket-id: socket number within book/node/board the CPU belongs to
 # @die-id: die number within socket the CPU belongs to (since 4.1)
 # @cluster-id: cluster number within die the CPU belongs to (since 7.1)
 # @core-id: core number within cluster the CPU belongs to
@@ -916,6 +917,7 @@
 ##
 { 'struct': 'CpuInstanceProperties',
   'data': { '*node-id': 'int',
+            '*book-id': 'int',
             '*socket-id': 'int',
             '*die-id': 'int',
             '*cluster-id': 'int',
@@ -1465,6 +1467,8 @@
 #
 # @cpus: number of virtual CPUs in the virtual machine
 #
+# @books: number of books in the CPU topology
+#
 # @sockets: number of sockets in the CPU topology
 #
 # @dies: number of dies per socket in the CPU topology
@@ -1481,6 +1485,7 @@
 ##
 { 'struct': 'SMPConfiguration', 'data': {
      '*cpus': 'int',
+     '*books': 'int',
      '*sockets': 'int',
      '*dies': 'int',
      '*clusters': 'int',
diff --git a/qemu-options.hx b/qemu-options.hx
index 377d22fbd8..9d72208f50 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -239,11 +239,12 @@ SRST
 ERST
 
 DEF("smp", HAS_ARG, QEMU_OPTION_smp,
-    "-smp [[cpus=]n][,maxcpus=maxcpus][,sockets=sockets][,dies=dies][,clusters=clusters][,cores=cores][,threads=threads]\n"
+    "-smp [[cpus=]n][,maxcpus=maxcpus][,books=books][,sockets=sockets][,dies=dies][,clusters=clusters][,cores=cores][,threads=threads]\n"
     "                set the number of initial CPUs to 'n' [default=1]\n"
     "                maxcpus= maximum number of total CPUs, including\n"
     "                offline CPUs for hotplug, etc\n"
-    "                sockets= number of sockets on the machine board\n"
+    "                books= number of books on the machine board\n"
+    "                sockets= number of sockets in one book\n"
     "                dies= number of dies in one socket\n"
     "                clusters= number of clusters in one die\n"
     "                cores= number of cores in one cluster\n"
diff --git a/softmmu/vl.c b/softmmu/vl.c
index 54e920ada1..c13edd6948 100644
--- a/softmmu/vl.c
+++ b/softmmu/vl.c
@@ -726,6 +726,9 @@ static QemuOptsList qemu_smp_opts = {
         {
             .name = "cpus",
             .type = QEMU_OPT_NUMBER,
+        }, {
+            .name = "books",
+            .type = QEMU_OPT_NUMBER,
         }, {
             .name = "sockets",
             .type = QEMU_OPT_NUMBER,
-- 
2.31.1



* [PATCH v8 05/12] s390x/cpu_topology: Adding books to STSI
  2022-06-20 14:03 [PATCH v8 00/12] s390x: CPU Topology Pierre Morel
                   ` (3 preceding siblings ...)
  2022-06-20 14:03 ` [PATCH v8 04/12] s390x/cpu_topology: Adding books to CPU topology Pierre Morel
@ 2022-06-20 14:03 ` Pierre Morel
  2022-06-20 14:03 ` [PATCH v8 06/12] s390x/cpu_topology: Adding drawers to CPU topology Pierre Morel
                   ` (7 subsequent siblings)
  12 siblings, 0 replies; 49+ messages in thread
From: Pierre Morel @ 2022-06-20 14:03 UTC (permalink / raw)
  To: qemu-s390x
  Cc: qemu-devel, borntraeger, pasic, richard.henderson, david, thuth,
	cohuck, mst, pbonzini, kvm, ehabkost, marcel.apfelbaum, eblake,
	armbru, seiden, nrb, frankja

Let's add STSI support for the container level 3, books,
and provide the information back to the guest.
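
The effect of the requested nesting level on the magnitude fields can be
sketched as follows. This is a standalone approximation of the switch
added to setup_stsi() below. fill_mag and the shortened MAG* index names
are hypothetical, and the default branch is an addition of this sketch
(the patch relies on the caller to filter sel2).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define NR_MAG 6
#define MAG3   3  /* index of TOPOLOGY_NR_MAG3 in the series */
#define MAG2   4  /* index of TOPOLOGY_NR_MAG2 */
#define MAG1   5  /* index of TOPOLOGY_NR_MAG1 */

/*
 * Fill the magnitude array for the requested nesting level.
 * At level 2 the books are folded into the socket count; at level 3
 * books and sockets are reported separately.
 * Returns false for a level this sketch does not handle.
 */
static bool fill_mag(uint8_t mag[NR_MAG], int level, unsigned books,
                     unsigned sockets, unsigned cores, unsigned threads)
{
    assert(books * sockets <= UINT8_MAX && cores * threads <= UINT8_MAX);
    memset(mag, 0, NR_MAG);

    switch (level) {
    case 2:
        mag[MAG2] = (uint8_t)(sockets * books);
        break;
    case 3:
        mag[MAG3] = (uint8_t)books;
        mag[MAG2] = (uint8_t)sockets;
        break;
    default:
        return false;
    }
    mag[MAG1] = (uint8_t)(cores * threads);
    return true;
}
```

For books=2, sockets=3, cores=4, threads=1, a level 2 request reports 6
sockets and no books, while a level 3 request reports 2 books of 3
sockets each.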

Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>
---
 hw/s390x/cpu-topology.c         | 163 +++++++++++++++++++++++++++++---
 include/hw/s390x/cpu-topology.h |  20 +++-
 include/hw/s390x/sclp.h         |   1 +
 target/s390x/cpu_topology.c     |  53 ++++++++---
 4 files changed, 210 insertions(+), 27 deletions(-)

diff --git a/hw/s390x/cpu-topology.c b/hw/s390x/cpu-topology.c
index 0fd6f08084..eba003d498 100644
--- a/hw/s390x/cpu-topology.c
+++ b/hw/s390x/cpu-topology.c
@@ -86,6 +86,38 @@ static S390TopologySocket *s390_create_socket(MachineState *ms,
     return socket;
 }
 
+/*
+ * s390_create_book:
+ * @ms: Machine state
+ * @drawer: the drawer on which to create the book
+ * @id: the book id
+ *
+ * returns a pointer to the created S390TopologyBook structure
+ *
+ * On error: return NULL
+ */
+static S390TopologyBook *s390_create_book(MachineState *ms,
+                                          S390TopologyDrawer *drawer,
+                                          int id, Error **errp)
+{
+    DeviceState *dev;
+    S390TopologyBook *book;
+
+    if (drawer->bus->num_children >= ms->smp.books) {
+        error_setg(errp, "Unable to create more books.");
+        return NULL;
+    }
+
+    dev = qdev_new(TYPE_S390_TOPOLOGY_BOOK);
+    qdev_realize_and_unref(dev, drawer->bus, &error_fatal);
+
+    book = S390_TOPOLOGY_BOOK(dev);
+    book->book_id = id;
+    drawer->cnt++;
+
+    return book;
+}
+
 /*
  * s390_get_cores:
  * @ms: Machine state
@@ -142,6 +174,34 @@ static S390TopologySocket *s390_get_socket(MachineState *ms,
     return s390_create_socket(ms, book, socket_id, errp);
 }
 
+/*
+ * s390_get_book:
+ * @ms: Machine state
+ * @drawer: The drawer to search into
+ * @book_id: the identifier of the book to search for
+ * @errp: Error pointer
+ *
+ * returns a pointer to a S390TopologyBook structure within a drawer having
+ * the specified book_id.
+ * First search if the drawer is already containing the S390TopologyBook
+ * structure and if not create one with this book_id.
+ */
+static S390TopologyBook *s390_get_book(MachineState *ms,
+                                       S390TopologyDrawer *drawer,
+                                       int book_id, Error **errp)
+{
+    S390TopologyBook *book;
+    BusChild *kid;
+
+    QTAILQ_FOREACH(kid, &drawer->bus->children, sibling) {
+        book = S390_TOPOLOGY_BOOK(kid->child);
+        if (book->book_id == book_id) {
+            return book;
+        }
+    }
+    return s390_create_book(ms, drawer, book_id, errp);
+}
+
 /*
  * s390_topology_new_cpu:
  * @core_id: the core ID is machine wide
@@ -155,16 +215,23 @@ static S390TopologySocket *s390_get_socket(MachineState *ms,
  */
 bool s390_topology_new_cpu(MachineState *ms, int core_id, Error **errp)
 {
+    S390TopologyDrawer *drawer;
     S390TopologyBook *book;
     S390TopologySocket *socket;
     S390TopologyCores *cores;
     int nb_cores_per_socket;
+    int nb_cores_per_book;
     int origin, bit;
 
-    book = s390_get_topology();
+    drawer = s390_get_topology();
 
     nb_cores_per_socket = ms->smp.cores * ms->smp.threads;
+    nb_cores_per_book = ms->smp.sockets * nb_cores_per_socket;
 
+    book = s390_get_book(ms, drawer, core_id / nb_cores_per_book, errp);
+    if (!book) {
+        return false;
+    }
     socket = s390_get_socket(ms, book, core_id / nb_cores_per_socket, errp);
     if (!socket) {
         return false;
@@ -206,23 +273,23 @@ void s390_topology_setup(MachineState *ms)
     DeviceState *dev;
 
     /* Create BOOK bridge device */
-    dev = qdev_new(TYPE_S390_TOPOLOGY_BOOK);
+    dev = qdev_new(TYPE_S390_TOPOLOGY_DRAWER);
     object_property_add_child(qdev_get_machine(),
-                              TYPE_S390_TOPOLOGY_BOOK, OBJECT(dev));
+                              TYPE_S390_TOPOLOGY_DRAWER, OBJECT(dev));
     sysbus_realize_and_unref(SYS_BUS_DEVICE(dev), &error_fatal);
 }
 
-S390TopologyBook *s390_get_topology(void)
+S390TopologyDrawer *s390_get_topology(void)
 {
-    static S390TopologyBook *book;
+    static S390TopologyDrawer *drawer;
 
-    if (!book) {
-        book = S390_TOPOLOGY_BOOK(
-            object_resolve_path(TYPE_S390_TOPOLOGY_BOOK, NULL));
-        assert(book != NULL);
+    if (!drawer) {
+        drawer = S390_TOPOLOGY_DRAWER(object_resolve_path(
+                                      TYPE_S390_TOPOLOGY_DRAWER, NULL));
+        assert(drawer != NULL);
     }
 
-    return book;
+    return drawer;
 }
 
 /* --- CORES Definitions --- */
@@ -365,12 +432,13 @@ static void book_class_init(ObjectClass *oc, void *data)
     hc->unplug = qdev_simple_device_unplug_cb;
     set_bit(DEVICE_CATEGORY_BRIDGE, dc->categories);
     dc->realize = s390_book_device_realize;
+    dc->bus_type = TYPE_S390_TOPOLOGY_DRAWER_BUS;
     dc->desc = "topology book";
 }
 
 static const TypeInfo book_info = {
     .name          = TYPE_S390_TOPOLOGY_BOOK,
-    .parent        = TYPE_SYS_BUS_DEVICE,
+    .parent        = TYPE_DEVICE,
     .instance_size = sizeof(S390TopologyBook),
     .class_init    = book_class_init,
     .interfaces = (InterfaceInfo[]) {
@@ -379,6 +447,77 @@ static const TypeInfo book_info = {
     }
 };
 
+/* --- DRAWER Definitions --- */
+static Property s390_topology_drawer_properties[] = {
+    DEFINE_PROP_UINT8("drawer_id", S390TopologyDrawer, drawer_id, 0),
+    DEFINE_PROP_END_OF_LIST(),
+};
+
+static char *drawer_bus_get_dev_path(DeviceState *dev)
+{
+    S390TopologyDrawer *drawer = S390_TOPOLOGY_DRAWER(dev);
+    DeviceState *node = dev->parent_bus->parent;
+    char *id = qdev_get_dev_path(node);
+    char *ret;
+
+    if (id) {
+        ret = g_strdup_printf("%s:%02d", id, drawer->drawer_id);
+        g_free(id);
+    } else {
+        ret = g_strdup_printf("_:%02d", drawer->drawer_id);
+    }
+
+    return ret;
+}
+
+static void drawer_bus_class_init(ObjectClass *oc, void *data)
+{
+    BusClass *k = BUS_CLASS(oc);
+
+    k->get_dev_path = drawer_bus_get_dev_path;
+    k->max_dev = S390_MAX_DRAWERS;
+}
+
+static const TypeInfo drawer_bus_info = {
+    .name = TYPE_S390_TOPOLOGY_DRAWER_BUS,
+    .parent = TYPE_BUS,
+    .instance_size = 0,
+    .class_init = drawer_bus_class_init,
+};
+
+static void s390_drawer_device_realize(DeviceState *dev, Error **errp)
+{
+    S390TopologyDrawer *drawer = S390_TOPOLOGY_DRAWER(dev);
+    BusState *bus;
+
+    bus = qbus_new(TYPE_S390_TOPOLOGY_DRAWER_BUS, dev,
+                   TYPE_S390_TOPOLOGY_DRAWER_BUS);
+    qbus_set_hotplug_handler(bus, OBJECT(dev));
+    drawer->bus = bus;
+}
+
+static void drawer_class_init(ObjectClass *oc, void *data)
+{
+    DeviceClass *dc = DEVICE_CLASS(oc);
+    HotplugHandlerClass *hc = HOTPLUG_HANDLER_CLASS(oc);
+
+    hc->unplug = qdev_simple_device_unplug_cb;
+    set_bit(DEVICE_CATEGORY_BRIDGE, dc->categories);
+    dc->realize = s390_drawer_device_realize;
+    device_class_set_props(dc, s390_topology_drawer_properties);
+    dc->desc = "topology drawer";
+}
+
+static const TypeInfo drawer_info = {
+    .name          = TYPE_S390_TOPOLOGY_DRAWER,
+    .parent        = TYPE_SYS_BUS_DEVICE,
+    .instance_size = sizeof(S390TopologyDrawer),
+    .class_init    = drawer_class_init,
+    .interfaces = (InterfaceInfo[]) {
+        { TYPE_HOTPLUG_HANDLER },
+        { }
+    }
+};
 static void topology_register(void)
 {
     type_register_static(&cpu_cores_info);
@@ -386,6 +525,8 @@ static void topology_register(void)
     type_register_static(&socket_info);
     type_register_static(&book_bus_info);
     type_register_static(&book_info);
+    type_register_static(&drawer_bus_info);
+    type_register_static(&drawer_info);
 }
 
 type_init(topology_register);
diff --git a/include/hw/s390x/cpu-topology.h b/include/hw/s390x/cpu-topology.h
index beec61706c..5ffb8cba77 100644
--- a/include/hw/s390x/cpu-topology.h
+++ b/include/hw/s390x/cpu-topology.h
@@ -56,18 +56,30 @@ OBJECT_DECLARE_SIMPLE_TYPE(S390TopologySocket, S390_TOPOLOGY_SOCKET)
 #define TYPE_S390_TOPOLOGY_BOOK "topology book"
 #define TYPE_S390_TOPOLOGY_BOOK_BUS "book-bus"
 struct S390TopologyBook {
-    SysBusDevice parent_obj;
+    DeviceState parent_obj;
     BusState *bus;
     int book_id;
     int cnt;
 };
 typedef struct S390TopologyBook S390TopologyBook;
 OBJECT_DECLARE_SIMPLE_TYPE(S390TopologyBook, S390_TOPOLOGY_BOOK)
-#define S390_MAX_BOOKS 1
+#define S390_MAX_BOOKS 4
+
+#define TYPE_S390_TOPOLOGY_DRAWER "topology drawer"
+#define TYPE_S390_TOPOLOGY_DRAWER_BUS "drawer-bus"
+struct S390TopologyDrawer {
+    SysBusDevice parent_obj;
+    BusState *bus;
+    uint8_t drawer_id;
+    int cnt;
+};
+typedef struct S390TopologyDrawer S390TopologyDrawer;
+OBJECT_DECLARE_SIMPLE_TYPE(S390TopologyDrawer, S390_TOPOLOGY_DRAWER)
+#define S390_MAX_DRAWERS 1
 
-S390TopologyBook *s390_init_topology(void);
+S390TopologyDrawer *s390_init_topology(void);
 
-S390TopologyBook *s390_get_topology(void);
+S390TopologyDrawer *s390_get_topology(void);
 void s390_topology_setup(MachineState *ms);
 bool s390_topology_new_cpu(MachineState *ms, int core_id, Error **errp);
 
diff --git a/include/hw/s390x/sclp.h b/include/hw/s390x/sclp.h
index d3ade40a5a..139d46efa4 100644
--- a/include/hw/s390x/sclp.h
+++ b/include/hw/s390x/sclp.h
@@ -111,6 +111,7 @@ typedef struct CPUEntry {
     uint8_t reserved1;
 } QEMU_PACKED CPUEntry;
 
+#define SCLP_READ_SCP_INFO_MNEST                  3
 #define SCLP_READ_SCP_INFO_FIXED_CPU_OFFSET     128
 typedef struct ReadInfo {
     SCCBHeader h;
diff --git a/target/s390x/cpu_topology.c b/target/s390x/cpu_topology.c
index 9f656d7e51..d14b2fb25c 100644
--- a/target/s390x/cpu_topology.c
+++ b/target/s390x/cpu_topology.c
@@ -14,6 +14,7 @@
 #include "hw/s390x/pv.h"
 #include "hw/sysbus.h"
 #include "hw/s390x/cpu-topology.h"
+#include "hw/s390x/sclp.h"
 
 static int stsi_15_container(void *p, int nl, int id)
 {
@@ -40,7 +41,7 @@ static int stsi_15_cpus(void *p, S390TopologyCores *cd)
 }
 
 static int set_socket(const MachineState *ms, void *p,
-                      S390TopologySocket *socket)
+                      S390TopologySocket *socket, int level)
 {
     BusChild *kid;
     int l, len = 0;
@@ -56,24 +57,56 @@ static int set_socket(const MachineState *ms, void *p,
     return len;
 }
 
+static int set_book(const MachineState *ms, void *p,
+                    S390TopologyBook *book, int level)
+{
+    BusChild *kid;
+    int l, len = 0;
+
+    if (level >= 3) {
+        len += stsi_15_container(p, 2, book->book_id);
+        p += len;
+    }
+
+    QTAILQ_FOREACH_REVERSE(kid, &book->bus->children, sibling) {
+        l = set_socket(ms, p, S390_TOPOLOGY_SOCKET(kid->child), level);
+        p += l;
+        len += l;
+    }
+
+    return len;
+}
+
 static void setup_stsi(const MachineState *ms, void *p, int level)
 {
-    S390TopologyBook *book;
+    S390TopologyDrawer *drawer;
     SysIB_151x *sysib;
     BusChild *kid;
+    int nb_sockets, nb_books;
     int len, l;
 
     sysib = (SysIB_151x *)p;
     sysib->mnest = level;
-    sysib->mag[TOPOLOGY_NR_MAG2] = ms->smp.sockets;
+    switch (level) {
+    case 2:
+        nb_books = 0;
+        nb_sockets = ms->smp.sockets * ms->smp.books;
+        break;
+    case 3:
+        nb_books = ms->smp.books;
+        nb_sockets = ms->smp.sockets;
+        break;
+    }
+    sysib->mag[TOPOLOGY_NR_MAG3] = nb_books;
+    sysib->mag[TOPOLOGY_NR_MAG2] = nb_sockets;
     sysib->mag[TOPOLOGY_NR_MAG1] = ms->smp.cores * ms->smp.threads;
 
-    book = s390_get_topology();
+    drawer = s390_get_topology();
     len = sizeof(SysIB_151x);
     p += len;
 
-    QTAILQ_FOREACH_REVERSE(kid, &book->bus->children, sibling) {
-        l = set_socket(ms, p, S390_TOPOLOGY_SOCKET(kid->child));
+    QTAILQ_FOREACH_REVERSE(kid, &drawer->bus->children, sibling) {
+        l = set_book(ms, p, S390_TOPOLOGY_BOOK(kid->child), level);
         p += l;
         len += l;
     }
@@ -87,18 +120,14 @@ void insert_stsi_15_1_x(S390CPU *cpu, int sel2, __u64 addr, uint8_t ar)
     void *p;
     int ret;
 
-    /*
-     * Until the SCLP STSI Facility reporting the MNEST value is used,
-     * a sel2 value of 2 is the only value allowed in STSI 15.1.x.
-     */
-    if (sel2 != 2) {
+    if (sel2 < 2 || sel2 > SCLP_READ_SCP_INFO_MNEST) {
         setcc(cpu, 3);
         return;
     }
 
     p = g_malloc0(TARGET_PAGE_SIZE);
 
-    setup_stsi(machine, p, 2);
+    setup_stsi(machine, p, sel2);
 
     if (s390_is_pv()) {
         ret = s390_cpu_pv_mem_write(cpu, 0, p, TARGET_PAGE_SIZE);
-- 
2.31.1



* [PATCH v8 06/12] s390x/cpu_topology: Adding drawers to CPU topology
  2022-06-20 14:03 [PATCH v8 00/12] s390x: CPU Topology Pierre Morel
                   ` (4 preceding siblings ...)
  2022-06-20 14:03 ` [PATCH v8 05/12] s390x/cpu_topology: Adding books to STSI Pierre Morel
@ 2022-06-20 14:03 ` Pierre Morel
  2022-06-20 14:03 ` [PATCH v8 07/12] s390x/cpu_topology: Adding drawers to STSI Pierre Morel
                   ` (6 subsequent siblings)
  12 siblings, 0 replies; 49+ messages in thread
From: Pierre Morel @ 2022-06-20 14:03 UTC (permalink / raw)
  To: qemu-s390x
  Cc: qemu-devel, borntraeger, pasic, richard.henderson, david, thuth,
	cohuck, mst, pbonzini, kvm, ehabkost, marcel.apfelbaum, eblake,
	armbru, seiden, nrb, frankja

The S390 CPU topology may have up to 5 levels of topology containers.
The first container level above the cores is level 2, the sockets,
and level 3, the books, contains the sockets.

This patch introduces drawers, the level 4 containers, which contain
books.

Let's add drawers to the CPU topology and to the -smp option.
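
As an illustration of how a machine-wide core ID could be placed in the
resulting hierarchy, here is a standalone sketch. The socket and book
divisions follow s390_topology_new_cpu() from the previous patch. The
drawer division is an assumption made by analogy and is not taken from
this series, and struct smp_levels, struct container_ids and
core_to_containers are hypothetical names.

```c
#include <assert.h>

/* Hypothetical subset of the -smp values relevant to s390x. */
struct smp_levels {
    unsigned drawers, books, sockets, cores, threads;
};

/* Machine-wide container identifiers for one core. */
struct container_ids {
    unsigned drawer, book, socket;
};

static struct container_ids core_to_containers(const struct smp_levels *s,
                                               unsigned core_id)
{
    unsigned per_socket = s->cores * s->threads;
    unsigned per_book = s->sockets * per_socket;
    unsigned per_drawer = s->books * per_book;
    struct container_ids ids;

    assert(per_socket > 0 && per_book > 0 && per_drawer > 0);
    assert(core_id < s->drawers * per_drawer);

    ids.socket = core_id / per_socket;  /* as in s390_topology_new_cpu() */
    ids.book = core_id / per_book;      /* as in s390_topology_new_cpu() */
    ids.drawer = core_id / per_drawer;  /* assumption, by analogy */
    return ids;
}
```

With 2 drawers, 2 books per drawer, 2 sockets per book and 4 cores per
socket, core 21 lands in socket 5, book 2 and drawer 1 under this
numbering.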

Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>
---
 hw/core/machine-smp.c      | 33 ++++++++++++++++++++++++++-------
 hw/core/machine.c          |  2 ++
 hw/s390x/s390-virtio-ccw.c |  1 +
 include/hw/boards.h        |  4 ++++
 qapi/machine.json          |  9 +++++++--
 qemu-options.hx            |  5 +++--
 softmmu/vl.c               |  3 +++
 7 files changed, 46 insertions(+), 11 deletions(-)

diff --git a/hw/core/machine-smp.c b/hw/core/machine-smp.c
index d7aa39d540..26150c748f 100644
--- a/hw/core/machine-smp.c
+++ b/hw/core/machine-smp.c
@@ -31,6 +31,10 @@ static char *cpu_hierarchy_to_string(MachineState *ms)
     MachineClass *mc = MACHINE_GET_CLASS(ms);
     GString *s = g_string_new(NULL);
 
+    if (mc->smp_props.drawers_supported) {
+        g_string_append_printf(s, "drawers (%u) * ", ms->smp.drawers);
+    }
+
     if (mc->smp_props.books_supported) {
         g_string_append_printf(s, "books (%u) * ", ms->smp.books);
     }
@@ -77,6 +81,7 @@ void machine_parse_smp_config(MachineState *ms,
 {
     MachineClass *mc = MACHINE_GET_CLASS(ms);
     unsigned cpus    = config->has_cpus ? config->cpus : 0;
+    unsigned drawers = config->has_drawers ? config->drawers : 0;
     unsigned books   = config->has_books ? config->books : 0;
     unsigned sockets = config->has_sockets ? config->sockets : 0;
     unsigned dies    = config->has_dies ? config->dies : 0;
@@ -90,6 +95,7 @@ void machine_parse_smp_config(MachineState *ms,
      * explicit configuration like "cpus=0" is not allowed.
      */
     if ((config->has_cpus && config->cpus == 0) ||
+        (config->has_drawers && config->drawers == 0) ||
         (config->has_books && config->books == 0) ||
         (config->has_sockets && config->sockets == 0) ||
         (config->has_dies && config->dies == 0) ||
@@ -124,6 +130,13 @@ void machine_parse_smp_config(MachineState *ms,
 
     books = books > 0 ? books : 1;
 
+    if (!mc->smp_props.drawers_supported && drawers > 1) {
+        error_setg(errp, "drawers not supported by this machine's CPU topology");
+        return;
+    }
+
+    drawers = drawers > 0 ? drawers : 1;
+
     /* compute missing values based on the provided ones */
     if (cpus == 0 && maxcpus == 0) {
         sockets = sockets > 0 ? sockets : 1;
@@ -137,34 +150,40 @@ void machine_parse_smp_config(MachineState *ms,
             if (sockets == 0) {
                 cores = cores > 0 ? cores : 1;
                 threads = threads > 0 ? threads : 1;
-                sockets = maxcpus / (books * dies * clusters * cores * threads);
+                sockets = maxcpus /
+                          (drawers * books * dies * clusters * cores * threads);
             } else if (cores == 0) {
                 threads = threads > 0 ? threads : 1;
-                cores = maxcpus / (books * sockets * dies * clusters * threads);
+                cores = maxcpus /
+                        (drawers * books * sockets * dies * clusters * threads);
             }
         } else {
             /* prefer cores over sockets since 6.2 */
             if (cores == 0) {
                 sockets = sockets > 0 ? sockets : 1;
                 threads = threads > 0 ? threads : 1;
-                cores = maxcpus / (books * sockets * dies * clusters * threads);
+                cores = maxcpus /
+                        (drawers * books * sockets * dies * clusters * threads);
             } else if (sockets == 0) {
                 threads = threads > 0 ? threads : 1;
-                sockets = maxcpus / (books * dies * clusters * cores * threads);
+                sockets = maxcpus /
+                         (drawers * books * dies * clusters * cores * threads);
             }
         }
 
         /* try to calculate omitted threads at last */
         if (threads == 0) {
-            threads = maxcpus / (books * sockets * dies * clusters * cores);
+            threads = maxcpus /
+                      (drawers * books * sockets * dies * clusters * cores);
         }
     }
 
-    maxcpus = maxcpus > 0 ? maxcpus : books * sockets * dies *
+    maxcpus = maxcpus > 0 ? maxcpus : drawers * books * sockets * dies *
                                       clusters * cores * threads;
     cpus = cpus > 0 ? cpus : maxcpus;
 
     ms->smp.cpus = cpus;
+    ms->smp.drawers = drawers;
     ms->smp.books = books;
     ms->smp.sockets = sockets;
     ms->smp.dies = dies;
@@ -174,7 +193,7 @@ void machine_parse_smp_config(MachineState *ms,
     ms->smp.max_cpus = maxcpus;
 
     /* sanity-check of the computed topology */
-    if (books * sockets * dies * clusters * cores * threads != maxcpus) {
+    if (drawers * books * sockets * dies * clusters * cores * threads != maxcpus) {
         g_autofree char *topo_msg = cpu_hierarchy_to_string(ms);
         error_setg(errp, "Invalid CPU topology: "
                    "product of the hierarchy must match maxcpus: "
diff --git a/hw/core/machine.c b/hw/core/machine.c
index 8861f58d23..4c5c8d1655 100644
--- a/hw/core/machine.c
+++ b/hw/core/machine.c
@@ -821,6 +821,7 @@ static void machine_get_smp(Object *obj, Visitor *v, const char *name,
     MachineState *ms = MACHINE(obj);
     SMPConfiguration *config = &(SMPConfiguration){
         .has_cpus = true, .cpus = ms->smp.cpus,
+        .has_drawers = true, .drawers = ms->smp.drawers,
         .has_books = true, .books = ms->smp.books,
         .has_sockets = true, .sockets = ms->smp.sockets,
         .has_dies = true, .dies = ms->smp.dies,
@@ -1088,6 +1089,7 @@ static void machine_initfn(Object *obj)
     /* default to mc->default_cpus */
     ms->smp.cpus = mc->default_cpus;
     ms->smp.max_cpus = mc->default_cpus;
+    ms->smp.drawers = 1;
     ms->smp.books = 1;
     ms->smp.sockets = 1;
     ms->smp.dies = 1;
diff --git a/hw/s390x/s390-virtio-ccw.c b/hw/s390x/s390-virtio-ccw.c
index ace65164d8..3b2a1f2729 100644
--- a/hw/s390x/s390-virtio-ccw.c
+++ b/hw/s390x/s390-virtio-ccw.c
@@ -620,6 +620,7 @@ static void ccw_machine_class_init(ObjectClass *oc, void *data)
     nc->nmi_monitor_handler = s390_nmi;
     mc->default_ram_id = "s390.ram";
     mc->smp_props.books_supported = true;
+    mc->smp_props.drawers_supported = true;
 }
 
 static inline bool machine_get_aes_key_wrap(Object *obj, Error **errp)
diff --git a/include/hw/boards.h b/include/hw/boards.h
index 2b44f50b6e..53014275b2 100644
--- a/include/hw/boards.h
+++ b/include/hw/boards.h
@@ -131,12 +131,14 @@ typedef struct {
  * @dies_supported - whether dies are supported by the machine
  * @clusters_supported - whether clusters are supported by the machine
  * @books_supported - whether books are supported by the machine
+ * @drawers_supported - whether drawers are supported by the machine
  */
 typedef struct {
     bool prefer_sockets;
     bool dies_supported;
     bool clusters_supported;
     bool books_supported;
+    bool drawers_supported;
 } SMPCompatProps;
 
 /**
@@ -301,6 +303,7 @@ typedef struct DeviceMemoryState {
 /**
  * CpuTopology:
  * @cpus: the number of present logical processors on the machine
+ * @drawers: the number of drawers on the machine
  * @books: the number of books on the machine
  * @sockets: the number of sockets on the machine
  * @dies: the number of dies in one socket
@@ -311,6 +314,7 @@ typedef struct DeviceMemoryState {
  */
 typedef struct CpuTopology {
     unsigned int cpus;
+    unsigned int drawers;
     unsigned int books;
     unsigned int sockets;
     unsigned int dies;
diff --git a/qapi/machine.json b/qapi/machine.json
index f838b0c51f..bdd92e3cb1 100644
--- a/qapi/machine.json
+++ b/qapi/machine.json
@@ -900,14 +900,15 @@
 # a CPU is being hotplugged.
 #
 # @node-id: NUMA node ID the CPU belongs to
-# @book-id: book number within node/board the CPU belongs to
+# @drawer-id: drawer number within node/board the CPU belongs to
+# @book-id: book number within drawer/node/board the CPU belongs to
 # @socket-id: socket number within book/node/board the CPU belongs to
 # @die-id: die number within socket the CPU belongs to (since 4.1)
 # @cluster-id: cluster number within die the CPU belongs to (since 7.1)
 # @core-id: core number within cluster the CPU belongs to
 # @thread-id: thread number within core the CPU belongs to
 #
-# Note: currently there are 6 properties that could be present
+# Note: currently there are 7 properties that could be present
 #       but management should be prepared to pass through other
 #       properties with device_add command to allow for future
 #       interface extension. This also requires the filed names to be kept in
@@ -917,6 +918,7 @@
 ##
 { 'struct': 'CpuInstanceProperties',
   'data': { '*node-id': 'int',
+            '*drawer-id': 'int',
             '*book-id': 'int',
             '*socket-id': 'int',
             '*die-id': 'int',
@@ -1467,6 +1469,8 @@
 #
 # @cpus: number of virtual CPUs in the virtual machine
 #
+# @drawers: number of drawers in the CPU topology
+#
 # @books: number of books in the CPU topology
 #
 # @sockets: number of sockets in the CPU topology
@@ -1485,6 +1489,7 @@
 ##
 { 'struct': 'SMPConfiguration', 'data': {
      '*cpus': 'int',
+     '*drawers': 'int',
      '*books': 'int',
      '*sockets': 'int',
      '*dies': 'int',
diff --git a/qemu-options.hx b/qemu-options.hx
index 9d72208f50..46aa79ee26 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -239,11 +239,12 @@ SRST
 ERST
 
 DEF("smp", HAS_ARG, QEMU_OPTION_smp,
-    "-smp [[cpus=]n][,maxcpus=maxcpus][,books=books][,sockets=sockets][,dies=dies][,clusters=clusters][,cores=cores][,threads=threads]\n"
+    "-smp [[cpus=]n][,maxcpus=maxcpus][,drawers=drawers][,books=books][,sockets=sockets][,dies=dies][,clusters=clusters][,cores=cores][,threads=threads]\n"
     "                set the number of initial CPUs to 'n' [default=1]\n"
     "                maxcpus= maximum number of total CPUs, including\n"
     "                offline CPUs for hotplug, etc\n"
-    "                books= number of books on the machine board\n"
+    "                drawers= number of drawers on the machine board\n"
+    "                books= number of books in one drawer\n"
     "                sockets= number of sockets in one book\n"
     "                dies= number of dies in one socket\n"
     "                clusters= number of clusters in one die\n"
diff --git a/softmmu/vl.c b/softmmu/vl.c
index c13edd6948..299a85a97a 100644
--- a/softmmu/vl.c
+++ b/softmmu/vl.c
@@ -726,6 +726,9 @@ static QemuOptsList qemu_smp_opts = {
         {
             .name = "cpus",
             .type = QEMU_OPT_NUMBER,
+        }, {
+            .name = "drawers",
+            .type = QEMU_OPT_NUMBER,
         }, {
             .name = "books",
             .type = QEMU_OPT_NUMBER,
-- 
2.31.1



* [PATCH v8 07/12] s390x/cpu_topology: Adding drawers to STSI
  2022-06-20 14:03 [PATCH v8 00/12] s390x: CPU Topology Pierre Morel
                   ` (5 preceding siblings ...)
  2022-06-20 14:03 ` [PATCH v8 06/12] s390x/cpu_topology: Adding drawers to CPU topology Pierre Morel
@ 2022-06-20 14:03 ` Pierre Morel
  2022-06-20 14:03 ` [PATCH v8 08/12] s390x/cpu_topology: implementing numa for the s390x topology Pierre Morel
                   ` (5 subsequent siblings)
  12 siblings, 0 replies; 49+ messages in thread
From: Pierre Morel @ 2022-06-20 14:03 UTC (permalink / raw)
  To: qemu-s390x
  Cc: qemu-devel, borntraeger, pasic, richard.henderson, david, thuth,
	cohuck, mst, pbonzini, kvm, ehabkost, marcel.apfelbaum, eblake,
	armbru, seiden, nrb, frankja

Let's add STSI support for the container level 4, drawers,
and provide the information back to the guest.
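
To make the effect on the SYSIB magnitude fields easier to follow,
here is a minimal standalone sketch of how setup_stsi() in this patch
folds the upper levels into the lower ones depending on the requested
nesting level. The struct and function names are hypothetical and not
part of QEMU; the per-level formulas are taken from the diff below.

```c
#include <assert.h>

/* Hypothetical stand-in for the relevant ms->smp fields. */
struct topo {
    unsigned drawers, books, sockets, cores, threads;
};

/* Hypothetical stand-in for sysib->mag[TOPOLOGY_NR_MAG4..MAG1]. */
struct stsi_mag {
    unsigned mag4, mag3, mag2, mag1;
};

/* Levels that are not reported are folded into the next lower level. */
static struct stsi_mag stsi_magnitudes(const struct topo *t, int level)
{
    struct stsi_mag m = { 0, 0, 0, t->cores * t->threads };

    assert(level >= 2 && level <= 4);
    switch (level) {
    case 2: /* only sockets are reported */
        m.mag2 = t->sockets * t->books * t->drawers;
        break;
    case 3: /* books and sockets are reported */
        m.mag3 = t->books * t->drawers;
        m.mag2 = t->sockets;
        break;
    case 4: /* drawers, books and sockets are reported */
        m.mag4 = t->drawers;
        m.mag3 = t->books;
        m.mag2 = t->sockets;
        break;
    }
    return m;
}
```

For any level, the product of the non-zero magnitudes stays equal to
the total number of CPUs, so the guest sees the same machine size
whatever nesting level it asks for.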

Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>
---
 hw/s390x/cpu-topology.c         | 145 +++++++++++++++++++++++++++++---
 include/hw/s390x/cpu-topology.h |  19 ++++-
 include/hw/s390x/sclp.h         |   2 +-
 target/s390x/cpu_topology.c     |  40 +++++++--
 4 files changed, 184 insertions(+), 22 deletions(-)

diff --git a/hw/s390x/cpu-topology.c b/hw/s390x/cpu-topology.c
index eba003d498..107cdbecad 100644
--- a/hw/s390x/cpu-topology.c
+++ b/hw/s390x/cpu-topology.c
@@ -118,6 +118,28 @@ static S390TopologyBook *s390_create_book(MachineState *ms,
     return book;
 }
 
+static S390TopologyDrawer *s390_create_drawer(S390TopologyNode *node,
+                                              int id, Error **errp)
+{
+    DeviceState *dev;
+    S390TopologyDrawer *drawer;
+    const MachineState *ms = MACHINE(qdev_get_machine());
+
+    if (node->bus->num_children >= ms->smp.drawers) {
+        error_setg(errp, "Unable to create more drawers.");
+        return NULL;
+    }
+
+    dev = qdev_new(TYPE_S390_TOPOLOGY_DRAWER);
+    qdev_realize_and_unref(dev, node->bus, &error_fatal);
+
+    drawer = S390_TOPOLOGY_DRAWER(dev);
+    drawer->drawer_id = id;
+    node->cnt++;
+
+    return drawer;
+}
+
 /*
  * s390_get_cores:
  * @ms: Machine state
@@ -174,6 +196,34 @@ static S390TopologySocket *s390_get_socket(MachineState *ms,
     return s390_create_socket(ms, book, socket_id, errp);
 }
 
+/*
+ * s390_get_drawer:
+ * @ms: Machine state
+ * @node: The node to search into
+ * @drawer_id: the identifier of the drawer to search for
+ * @errp: Error pointer
+ *
+ * Returns a pointer to the S390TopologyDrawer structure within a node having
+ * the specified drawer_id.
+ * First search whether the node already contains the S390TopologyDrawer
+ * structure and, if not, create one with this drawer_id.
+ */
+static S390TopologyDrawer *s390_get_drawer(MachineState *ms,
+                                           S390TopologyNode *node,
+                                           int drawer_id, Error **errp)
+{
+    S390TopologyDrawer *drawer;
+    BusChild *kid;
+
+    QTAILQ_FOREACH(kid, &node->bus->children, sibling) {
+        drawer = S390_TOPOLOGY_DRAWER(kid->child);
+        if (drawer->drawer_id == drawer_id) {
+            return drawer;
+        }
+    }
+    return s390_create_drawer(node, drawer_id, errp);
+}
+
 /*
  * s390_get_book:
  * @ms: Machine state
@@ -215,19 +265,26 @@ static S390TopologyBook *s390_get_book(MachineState *ms,
  */
 bool s390_topology_new_cpu(MachineState *ms, int core_id, Error **errp)
 {
+    S390TopologyNode *node;
     S390TopologyDrawer *drawer;
     S390TopologyBook *book;
     S390TopologySocket *socket;
     S390TopologyCores *cores;
     int nb_cores_per_socket;
     int nb_cores_per_book;
+    int nb_cores_per_drawer;
     int origin, bit;
 
-    drawer = s390_get_topology();
+    node = s390_get_topology();
 
     nb_cores_per_socket = ms->smp.cores * ms->smp.threads;
     nb_cores_per_book = ms->smp.sockets * nb_cores_per_socket;
+    nb_cores_per_drawer = ms->smp.books * nb_cores_per_book;
 
+    drawer = s390_get_drawer(ms, node, core_id / nb_cores_per_drawer, errp);
+    if (!drawer) {
+        return false;
+    }
     book = s390_get_book(ms, drawer, core_id / nb_cores_per_book, errp);
     if (!book) {
         return false;
@@ -273,23 +330,23 @@ void s390_topology_setup(MachineState *ms)
     DeviceState *dev;
 
     /* Create BOOK bridge device */
-    dev = qdev_new(TYPE_S390_TOPOLOGY_DRAWER);
+    dev = qdev_new(TYPE_S390_TOPOLOGY_NODE);
     object_property_add_child(qdev_get_machine(),
-                              TYPE_S390_TOPOLOGY_DRAWER, OBJECT(dev));
+                              TYPE_S390_TOPOLOGY_NODE, OBJECT(dev));
     sysbus_realize_and_unref(SYS_BUS_DEVICE(dev), &error_fatal);
 }
 
-S390TopologyDrawer *s390_get_topology(void)
+S390TopologyNode *s390_get_topology(void)
 {
-    static S390TopologyDrawer *drawer;
+    static S390TopologyNode *node;
 
-    if (!drawer) {
-        drawer = S390_TOPOLOGY_DRAWER(object_resolve_path(
-                                      TYPE_S390_TOPOLOGY_DRAWER, NULL));
-        assert(drawer != NULL);
+    if (!node) {
+        node = S390_TOPOLOGY_NODE(object_resolve_path(
+                                  TYPE_S390_TOPOLOGY_NODE, NULL));
+        assert(node != NULL);
     }
 
-    return drawer;
+    return node;
 }
 
 /* --- CORES Definitions --- */
@@ -503,6 +560,7 @@ static void drawer_class_init(ObjectClass *oc, void *data)
 
     hc->unplug = qdev_simple_device_unplug_cb;
     set_bit(DEVICE_CATEGORY_BRIDGE, dc->categories);
+    dc->bus_type = TYPE_S390_TOPOLOGY_NODE_BUS;
     dc->realize = s390_drawer_device_realize;
     device_class_set_props(dc, s390_topology_drawer_properties);
     dc->desc = "topology drawer";
@@ -510,7 +568,7 @@ static void drawer_class_init(ObjectClass *oc, void *data)
 
 static const TypeInfo drawer_info = {
     .name          = TYPE_S390_TOPOLOGY_DRAWER,
-    .parent        = TYPE_SYS_BUS_DEVICE,
+    .parent        = TYPE_DEVICE,
     .instance_size = sizeof(S390TopologyDrawer),
     .class_init    = drawer_class_init,
     .interfaces = (InterfaceInfo[]) {
@@ -518,6 +576,69 @@ static const TypeInfo drawer_info = {
         { }
     }
 };
+
+/* --- NODE Definitions --- */
+
+/*
+ * Nodes are the first level of the CPU topology. We support
+ * only one NODE for the moment.
+ */
+static char *node_bus_get_dev_path(DeviceState *dev)
+{
+    return g_strdup("00");
+}
+
+static void node_bus_class_init(ObjectClass *oc, void *data)
+{
+    BusClass *k = BUS_CLASS(oc);
+
+    k->get_dev_path = node_bus_get_dev_path;
+    k->max_dev = S390_MAX_NODES;
+}
+
+static const TypeInfo node_bus_info = {
+    .name = TYPE_S390_TOPOLOGY_NODE_BUS,
+    .parent = TYPE_BUS,
+    .instance_size = 0,
+    .class_init = node_bus_class_init,
+};
+
+static void s390_node_device_realize(DeviceState *dev, Error **errp)
+{
+    S390TopologyNode *node = S390_TOPOLOGY_NODE(dev);
+    BusState *bus;
+
+    /* Create NODE bus on NODE bridge device */
+    bus = qbus_new(TYPE_S390_TOPOLOGY_NODE_BUS, dev,
+                   TYPE_S390_TOPOLOGY_NODE_BUS);
+    node->bus = bus;
+
+    /* Enable hotplugging */
+    qbus_set_hotplug_handler(bus, OBJECT(dev));
+}
+
+static void node_class_init(ObjectClass *oc, void *data)
+{
+    DeviceClass *dc = DEVICE_CLASS(oc);
+    HotplugHandlerClass *hc = HOTPLUG_HANDLER_CLASS(oc);
+
+    hc->unplug = qdev_simple_device_unplug_cb;
+    set_bit(DEVICE_CATEGORY_BRIDGE, dc->categories);
+    dc->realize = s390_node_device_realize;
+    dc->desc = "topology node";
+}
+
+static const TypeInfo node_info = {
+    .name          = TYPE_S390_TOPOLOGY_NODE,
+    .parent        = TYPE_SYS_BUS_DEVICE,
+    .instance_size = sizeof(S390TopologyNode),
+    .class_init    = node_class_init,
+    .interfaces = (InterfaceInfo[]) {
+        { TYPE_HOTPLUG_HANDLER },
+        { }
+    }
+};
+
 static void topology_register(void)
 {
     type_register_static(&cpu_cores_info);
@@ -527,6 +648,8 @@ static void topology_register(void)
     type_register_static(&book_info);
     type_register_static(&drawer_bus_info);
     type_register_static(&drawer_info);
+    type_register_static(&node_bus_info);
+    type_register_static(&node_info);
 }
 
 type_init(topology_register);
diff --git a/include/hw/s390x/cpu-topology.h b/include/hw/s390x/cpu-topology.h
index 5ffb8cba77..ba0b1c1d7a 100644
--- a/include/hw/s390x/cpu-topology.h
+++ b/include/hw/s390x/cpu-topology.h
@@ -68,18 +68,29 @@ OBJECT_DECLARE_SIMPLE_TYPE(S390TopologyBook, S390_TOPOLOGY_BOOK)
 #define TYPE_S390_TOPOLOGY_DRAWER "topology drawer"
 #define TYPE_S390_TOPOLOGY_DRAWER_BUS "drawer-bus"
 struct S390TopologyDrawer {
-    SysBusDevice parent_obj;
+    DeviceState parent_obj;
     BusState *bus;
     uint8_t drawer_id;
     int cnt;
 };
 typedef struct S390TopologyDrawer S390TopologyDrawer;
 OBJECT_DECLARE_SIMPLE_TYPE(S390TopologyDrawer, S390_TOPOLOGY_DRAWER)
-#define S390_MAX_DRAWERS 1
+#define S390_MAX_DRAWERS 4
 
-S390TopologyDrawer *s390_init_topology(void);
+#define TYPE_S390_TOPOLOGY_NODE "topology node"
+#define TYPE_S390_TOPOLOGY_NODE_BUS "node-bus"
+struct S390TopologyNode {
+    SysBusDevice parent_obj;
+    BusState *bus;
+    uint8_t node_id;
+    int cnt;
+};
+typedef struct S390TopologyNode S390TopologyNode;
+OBJECT_DECLARE_SIMPLE_TYPE(S390TopologyNode, S390_TOPOLOGY_NODE)
+#define S390_MAX_NODES 1
 
-S390TopologyDrawer *s390_get_topology(void);
+S390TopologyNode *s390_init_topology(void);
+S390TopologyNode *s390_get_topology(void);
 void s390_topology_setup(MachineState *ms);
 bool s390_topology_new_cpu(MachineState *ms, int core_id, Error **errp);
 
diff --git a/include/hw/s390x/sclp.h b/include/hw/s390x/sclp.h
index 139d46efa4..7f9ff84bf8 100644
--- a/include/hw/s390x/sclp.h
+++ b/include/hw/s390x/sclp.h
@@ -111,7 +111,7 @@ typedef struct CPUEntry {
     uint8_t reserved1;
 } QEMU_PACKED CPUEntry;
 
-#define SCLP_READ_SCP_INFO_MNEST                  3
+#define SCLP_READ_SCP_INFO_MNEST                  4
 #define SCLP_READ_SCP_INFO_FIXED_CPU_OFFSET     128
 typedef struct ReadInfo {
     SCCBHeader h;
diff --git a/target/s390x/cpu_topology.c b/target/s390x/cpu_topology.c
index d14b2fb25c..bd2ca171f6 100644
--- a/target/s390x/cpu_topology.c
+++ b/target/s390x/cpu_topology.c
@@ -77,36 +77,64 @@ static int set_book(const MachineState *ms, void *p,
     return len;
 }
 
+static int set_drawer(const MachineState *ms, void *p,
+                      S390TopologyDrawer *drawer, int level)
+{
+    BusChild *kid;
+    int l, len = 0;
+
+    if (level >= 4) {
+        len += stsi_15_container(p, 3, drawer->drawer_id);
+        p += len;
+    }
+
+    QTAILQ_FOREACH_REVERSE(kid, &drawer->bus->children, sibling) {
+        l = set_book(ms, p, S390_TOPOLOGY_BOOK(kid->child), level);
+        p += l;
+        len += l;
+    }
+
+    return len;
+}
+
 static void setup_stsi(const MachineState *ms, void *p, int level)
 {
-    S390TopologyDrawer *drawer;
+    S390TopologyNode *node;
     SysIB_151x *sysib;
     BusChild *kid;
-    int nb_sockets, nb_books;
+    int nb_sockets, nb_books, nb_drawers;
     int len, l;
 
     sysib = (SysIB_151x *)p;
     sysib->mnest = level;
     switch (level) {
     case 2:
+        nb_drawers = 0;
         nb_books = 0;
-        nb_sockets = ms->smp.sockets * ms->smp.books;
+        nb_sockets = ms->smp.sockets * ms->smp.books * ms->smp.drawers;
         break;
     case 3:
+        nb_drawers = 0;
+        nb_books = ms->smp.books * ms->smp.drawers;
+        nb_sockets = ms->smp.sockets;
+        break;
+    case 4:
+        nb_drawers = ms->smp.drawers;
         nb_books = ms->smp.books;
         nb_sockets = ms->smp.sockets;
         break;
     }
+    sysib->mag[TOPOLOGY_NR_MAG4] = nb_drawers;
     sysib->mag[TOPOLOGY_NR_MAG3] = nb_books;
     sysib->mag[TOPOLOGY_NR_MAG2] = nb_sockets;
     sysib->mag[TOPOLOGY_NR_MAG1] = ms->smp.cores * ms->smp.threads;
 
-    drawer = s390_get_topology();
+    node = s390_get_topology();
     len = sizeof(SysIB_151x);
     p += len;
 
-    QTAILQ_FOREACH_REVERSE(kid, &drawer->bus->children, sibling) {
-        l = set_book(ms, p, S390_TOPOLOGY_BOOK(kid->child), level);
+    QTAILQ_FOREACH_REVERSE(kid, &node->bus->children, sibling) {
+        l = set_drawer(ms, p, S390_TOPOLOGY_DRAWER(kid->child), level);
         p += l;
         len += l;
     }
-- 
2.31.1



* [PATCH v8 08/12] s390x/cpu_topology: implementing numa for the s390x topology
  2022-06-20 14:03 [PATCH v8 00/12] s390x: CPU Topology Pierre Morel
                   ` (6 preceding siblings ...)
  2022-06-20 14:03 ` [PATCH v8 07/12] s390x/cpu_topology: Adding drawers to STSI Pierre Morel
@ 2022-06-20 14:03 ` Pierre Morel
  2022-07-14 14:57   ` Janis Schoetterl-Glausch
  2022-06-20 14:03 ` [PATCH v8 09/12] target/s390x: interception of PTF instruction Pierre Morel
                   ` (4 subsequent siblings)
  12 siblings, 1 reply; 49+ messages in thread
From: Pierre Morel @ 2022-06-20 14:03 UTC (permalink / raw)
  To: qemu-s390x
  Cc: qemu-devel, borntraeger, pasic, richard.henderson, david, thuth,
	cohuck, mst, pbonzini, kvm, ehabkost, marcel.apfelbaum, eblake,
	armbru, seiden, nrb, frankja

The S390x CPU topology allows a non-uniform distribution of CPUs
across the topology containers: sockets, books and drawers.

We use NUMA to place each CPU inside the right topology container
and report the non-uniform topology to the guest.

Note that s390x needs CPU0 to belong to the topology; consequently
every topology must include CPU0.

We accept a partial QEMU NUMA definition; in that case undefined CPUs
are added to free slots in the topology, starting with slot 0 and going
up.
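
The resulting creation order can be sketched as follows. This is a
simplified standalone model of the two loops added to s390_init_cpus()
below, not QEMU code: the function name and the numa_defined[] array
are hypothetical, the latter standing in for the "arch_id != -1" test
on each possible-CPU slot.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Fill order[] with the slot indexes in the order the CPUs would be
 * created and return how many CPUs were placed (at most cpus).
 */
static size_t cpu_creation_order(const bool *numa_defined, size_t max_cpus,
                                 size_t cpus, size_t *order)
{
    size_t n = 0;

    assert(cpus <= max_cpus);

    /* First pass: slots explicitly placed by the NUMA definition. */
    for (size_t i = 0; i < max_cpus && n < cpus; i++) {
        if (numa_defined[i]) {
            order[n++] = i;
        }
    }

    /* Second pass: remaining CPUs fill free slots from slot 0 upward. */
    for (size_t i = 0; i < max_cpus && n < cpus; i++) {
        if (!numa_defined[i]) {
            order[n++] = i;
        }
    }

    return n;
}
```

With six slots, slots 2 and 4 defined through NUMA and cpus=4, the
model creates the CPUs in slots 2, 4, 0 and 1, in that order. Without
any NUMA definition the first pass places nothing and the CPUs are
simply created in slot order.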

Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>
---
 hw/core/machine.c          | 18 ++++++++++
 hw/s390x/s390-virtio-ccw.c | 68 ++++++++++++++++++++++++++++++++++----
 2 files changed, 79 insertions(+), 7 deletions(-)

diff --git a/hw/core/machine.c b/hw/core/machine.c
index 4c5c8d1655..3bee66acc6 100644
--- a/hw/core/machine.c
+++ b/hw/core/machine.c
@@ -760,6 +760,16 @@ void machine_set_cpu_numa_node(MachineState *machine,
             return;
         }
 
+        if (props->has_book_id && !slot->props.has_book_id) {
+            error_setg(errp, "book-id is not supported");
+            return;
+        }
+
+        if (props->has_drawer_id && !slot->props.has_drawer_id) {
+            error_setg(errp, "drawer-id is not supported");
+            return;
+        }
+
         /* skip slots with explicit mismatch */
         if (props->has_thread_id && props->thread_id != slot->props.thread_id) {
                 continue;
@@ -782,6 +792,14 @@ void machine_set_cpu_numa_node(MachineState *machine,
                 continue;
         }
 
+        if (props->has_book_id && props->book_id != slot->props.book_id) {
+                continue;
+        }
+
+        if (props->has_drawer_id && props->drawer_id != slot->props.drawer_id) {
+                continue;
+        }
+
         /* reject assignment if slot is already assigned, for compatibility
          * of legacy cpu_index mapping with SPAPR core based mapping do not
          * error out if cpu thread and matched core have the same node-id */
diff --git a/hw/s390x/s390-virtio-ccw.c b/hw/s390x/s390-virtio-ccw.c
index 3b2a1f2729..5c0dbff6fd 100644
--- a/hw/s390x/s390-virtio-ccw.c
+++ b/hw/s390x/s390-virtio-ccw.c
@@ -85,14 +85,34 @@ out:
 static void s390_init_cpus(MachineState *machine)
 {
     MachineClass *mc = MACHINE_GET_CLASS(machine);
-    int i;
+    CPUArchId *slot;
+    int i, n = 0;
 
     /* initialize possible_cpus */
     mc->possible_cpu_arch_ids(machine);
 
     s390_topology_setup(machine);
-    for (i = 0; i < machine->smp.cpus; i++) {
+
+    /* For NUMA configuration create defined nodes */
+    if (machine->numa_state->num_nodes) {
+        for (i = 0; i < machine->smp.max_cpus; i++) {
+            slot = &machine->possible_cpus->cpus[i];
+            if (slot->arch_id != -1 && n < machine->smp.cpus) {
+                s390x_new_cpu(machine->cpu_type, i, &error_fatal);
+                n++;
+            }
+        }
+    }
+
+    /* create all remaining CPUs */
+    for (i = 0; n < machine->smp.cpus && i < machine->smp.max_cpus; i++) {
+        slot = &machine->possible_cpus->cpus[i];
+        /* For NUMA configuration skip defined nodes */
+        if (machine->numa_state->num_nodes && slot->arch_id != -1) {
+            continue;
+        }
         s390x_new_cpu(machine->cpu_type, i, &error_fatal);
+        n++;
     }
 }
 
@@ -275,6 +295,11 @@ static void ccw_init(MachineState *machine)
     /* register hypercalls */
     virtio_ccw_register_hcalls();
 
+    /* CPU0 must exist on S390x */
+    if (!s390_cpu_addr2state(0)) {
+        error_printf("Core_id 0 must be defined in the CPU configuration\n");
+        exit(1);
+    }
     s390_enable_css_support(s390_cpu_addr2state(0));
 
     ret = css_create_css_image(VIRTUAL_CSSID, true);
@@ -307,6 +332,7 @@ static void s390_cpu_plug(HotplugHandler *hotplug_dev,
 
     g_assert(!ms->possible_cpus->cpus[cpu->env.core_id].cpu);
     ms->possible_cpus->cpus[cpu->env.core_id].cpu = OBJECT(dev);
+    ms->possible_cpus->cpus[cpu->env.core_id].arch_id = cpu->env.core_id;
 
     if (!s390_topology_new_cpu(ms, cpu->env.core_id, errp)) {
         return;
@@ -532,7 +558,9 @@ static CpuInstanceProperties s390_cpu_index_to_props(MachineState *ms,
 static const CPUArchIdList *s390_possible_cpu_arch_ids(MachineState *ms)
 {
     int i;
+    int drawer_id, book_id, socket_id;
     unsigned int max_cpus = ms->smp.max_cpus;
+    CPUArchId *slot;
 
     if (ms->possible_cpus) {
         g_assert(ms->possible_cpus && ms->possible_cpus->len == max_cpus);
@@ -543,11 +571,25 @@ static const CPUArchIdList *s390_possible_cpu_arch_ids(MachineState *ms)
                                   sizeof(CPUArchId) * max_cpus);
     ms->possible_cpus->len = max_cpus;
     for (i = 0; i < ms->possible_cpus->len; i++) {
-        ms->possible_cpus->cpus[i].type = ms->cpu_type;
-        ms->possible_cpus->cpus[i].vcpus_count = 1;
-        ms->possible_cpus->cpus[i].arch_id = i;
-        ms->possible_cpus->cpus[i].props.has_core_id = true;
-        ms->possible_cpus->cpus[i].props.core_id = i;
+        slot = &ms->possible_cpus->cpus[i];
+
+        slot->type = ms->cpu_type;
+        slot->vcpus_count = 1;
+        slot->arch_id = i;
+        slot->props.has_core_id = true;
+        slot->props.core_id = i;
+
+        socket_id = i / ms->smp.cores;
+        slot->props.socket_id = socket_id;
+        slot->props.has_socket_id = true;
+
+        book_id = socket_id / ms->smp.sockets;
+        slot->props.book_id = book_id;
+        slot->props.has_book_id = true;
+
+        drawer_id = book_id / ms->smp.books;
+        slot->props.drawer_id = drawer_id;
+        slot->props.has_drawer_id = true;
     }
 
     return ms->possible_cpus;
@@ -589,6 +631,17 @@ static ram_addr_t s390_fixup_ram_size(ram_addr_t sz)
     return newsz;
 }
 
+/*
+ * S390 defines CPU topology level 2 as the level for which a change in topology
+ * is worth taking care of.
+ * Let's use level 2, socket, as the NUMA node.
+ */
+static int64_t s390_get_default_cpu_node_id(const MachineState *ms, int idx)
+{
+    ms->possible_cpus->cpus[idx].arch_id = -1;
+    return idx / ms->smp.cores;
+}
+
 static void ccw_machine_class_init(ObjectClass *oc, void *data)
 {
     MachineClass *mc = MACHINE_CLASS(oc);
@@ -621,6 +674,7 @@ static void ccw_machine_class_init(ObjectClass *oc, void *data)
     mc->default_ram_id = "s390.ram";
     mc->smp_props.books_supported = true;
     mc->smp_props.drawers_supported = true;
+    mc->get_default_cpu_node_id = s390_get_default_cpu_node_id;
 }
 
 static inline bool machine_get_aes_key_wrap(Object *obj, Error **errp)
-- 
2.31.1



* [PATCH v8 09/12] target/s390x: interception of PTF instruction
  2022-06-20 14:03 [PATCH v8 00/12] s390x: CPU Topology Pierre Morel
                   ` (7 preceding siblings ...)
  2022-06-20 14:03 ` [PATCH v8 08/12] s390x/cpu_topology: implementing numa for the s390x topology Pierre Morel
@ 2022-06-20 14:03 ` Pierre Morel
  2022-06-20 14:03 ` [PATCH v8 10/12] s390x/cpu_topology: resetting the Topology-Change-Report Pierre Morel
                   ` (3 subsequent siblings)
  12 siblings, 0 replies; 49+ messages in thread
From: Pierre Morel @ 2022-06-20 14:03 UTC (permalink / raw)
  To: qemu-s390x
  Cc: qemu-devel, borntraeger, pasic, richard.henderson, david, thuth,
	cohuck, mst, pbonzini, kvm, ehabkost, marcel.apfelbaum, eblake,
	armbru, seiden, nrb, frankja

When the host supports the CPU topology facility, the PTF
instruction with function code 2 is interpreted by the SIE,
provided that the userland hypervisor activates the interpretation
by using the KVM_CAP_S390_CPU_TOPOLOGY KVM extension.

The PTF instructions with function code 0 and 1 are intercepted
and must be emulated by the userland hypervisor.
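
The split of responsibilities can be sketched as a small standalone
classifier. This is an illustration only: the enum, the macro and the
function name are hypothetical, the mask value mirrors
S390_TOPO_FC_MASK from this patch, and the sketch assumes that the
interpretation of function code 2 has been activated.

```c
#include <stdint.h>

/* Hypothetical outcome of a PTF request, for illustration only. */
enum ptf_path {
    PTF_EMULATED,       /* intercepted, handled by the userland hypervisor */
    PTF_SIE,            /* interpreted by the SIE, never reaches userland */
    PTF_SPEC_EXCEPTION  /* invalid request, specification exception */
};

#define PTF_FC_MASK 0xffULL /* mirrors S390_TOPO_FC_MASK */

/* Classify a PTF request from the content of register r1. */
static enum ptf_path ptf_classify(uint64_t reg)
{
    /* Any bit set outside the function code field is invalid. */
    if (reg & ~PTF_FC_MASK) {
        return PTF_SPEC_EXCEPTION;
    }

    switch (reg & PTF_FC_MASK) {
    case 0: /* request horizontal polarization */
    case 1: /* request vertical polarization */
        return PTF_EMULATED;
    case 2: /* check for a topology change report */
        return PTF_SIE;
    default:
        return PTF_SPEC_EXCEPTION;
    }
}
```

Note that s390_handle_ptf() below is only reached on interception, so
in the real handler function code 2 falls into the default branch.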

Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>
---
 hw/s390x/cpu-topology.c            | 50 ++++++++++++++++++++++++++++++
 include/hw/s390x/s390-virtio-ccw.h |  6 ++++
 target/s390x/kvm/kvm.c             | 13 ++++++++
 3 files changed, 69 insertions(+)

diff --git a/hw/s390x/cpu-topology.c b/hw/s390x/cpu-topology.c
index 107cdbecad..2dbb42ee2b 100644
--- a/hw/s390x/cpu-topology.c
+++ b/hw/s390x/cpu-topology.c
@@ -20,6 +20,56 @@
 #include "target/s390x/cpu.h"
 #include "hw/s390x/s390-virtio-ccw.h"
 
+/*
+ * s390_handle_ptf:
+ *
+ * @register 1: contains the function code
+ *
+ * Function codes 0 and 1 handle the CPU polarization.
+ * We assume a horizontal topology, the only one currently supported
+ * by Linux. Consequently, for function code 0, which requests horizontal
+ * polarization, we answer that it is already the current polarization,
+ * and we reject vertical polarization requests without further explanation.
+ *
+ * Function code 2 is handling topology changes and is interpreted
+ * by the SIE.
+ */
+void s390_handle_ptf(S390CPU *cpu, uint8_t r1, uintptr_t ra)
+{
+    CPUS390XState *env = &cpu->env;
+    uint64_t reg = env->regs[r1];
+    uint8_t fc = reg & S390_TOPO_FC_MASK;
+
+    if (!s390_has_feat(S390_FEAT_CONFIGURATION_TOPOLOGY)) {
+        s390_program_interrupt(env, PGM_OPERATION, ra);
+        return;
+    }
+
+    if (env->psw.mask & PSW_MASK_PSTATE) {
+        s390_program_interrupt(env, PGM_PRIVILEGED, ra);
+        return;
+    }
+
+    if (reg & ~S390_TOPO_FC_MASK) {
+        s390_program_interrupt(env, PGM_SPECIFICATION, ra);
+        return;
+    }
+
+    switch (fc) {
+    case 0:    /* Horizontal polarization is already set */
+        env->regs[r1] |= S390_PTF_REASON_DONE;
+        setcc(cpu, 2);
+        break;
+    case 1:    /* Vertical polarization is not supported */
+        env->regs[r1] |= S390_PTF_REASON_NONE;
+        setcc(cpu, 2);
+        break;
+    default:
+        /* Note that fc == 2 is interpreted by the SIE */
+        s390_program_interrupt(env, PGM_SPECIFICATION, ra);
+    }
+}
+
 /*
  * s390_create_cores:
  * @ms: Machine state
diff --git a/include/hw/s390x/s390-virtio-ccw.h b/include/hw/s390x/s390-virtio-ccw.h
index 3331990e02..f2c64dbc1a 100644
--- a/include/hw/s390x/s390-virtio-ccw.h
+++ b/include/hw/s390x/s390-virtio-ccw.h
@@ -30,6 +30,12 @@ struct S390CcwMachineState {
     uint8_t loadparm[8];
 };
 
+#define S390_PTF_REASON_NONE (0x00 << 8)
+#define S390_PTF_REASON_DONE (0x01 << 8)
+#define S390_PTF_REASON_BUSY (0x02 << 8)
+#define S390_TOPO_FC_MASK 0xffUL
+void s390_handle_ptf(S390CPU *cpu, uint8_t r1, uintptr_t ra);
+
 struct S390CcwMachineClass {
     /*< private >*/
     MachineClass parent_class;
diff --git a/target/s390x/kvm/kvm.c b/target/s390x/kvm/kvm.c
index 563bf5ac60..c664c45e8a 100644
--- a/target/s390x/kvm/kvm.c
+++ b/target/s390x/kvm/kvm.c
@@ -97,6 +97,7 @@
 
 #define PRIV_B9_EQBS                    0x9c
 #define PRIV_B9_CLP                     0xa0
+#define PRIV_B9_PTF                     0xa2
 #define PRIV_B9_PCISTG                  0xd0
 #define PRIV_B9_PCILG                   0xd2
 #define PRIV_B9_RPCIT                   0xd3
@@ -1461,6 +1462,15 @@ static int kvm_mpcifc_service_call(S390CPU *cpu, struct kvm_run *run)
     }
 }
 
+static int kvm_handle_ptf(S390CPU *cpu, struct kvm_run *run)
+{
+    uint8_t r1 = (run->s390_sieic.ipb >> 20) & 0x0f;
+
+    s390_handle_ptf(cpu, r1, RA_IGNORED);
+
+    return 0;
+}
+
 static int handle_b9(S390CPU *cpu, struct kvm_run *run, uint8_t ipa1)
 {
     int r = 0;
@@ -1478,6 +1488,9 @@ static int handle_b9(S390CPU *cpu, struct kvm_run *run, uint8_t ipa1)
     case PRIV_B9_RPCIT:
         r = kvm_rpcit_service_call(cpu, run);
         break;
+    case PRIV_B9_PTF:
+        r = kvm_handle_ptf(cpu, run);
+        break;
     case PRIV_B9_EQBS:
         /* just inject exception */
         r = -1;
-- 
2.31.1


* [PATCH v8 10/12] s390x/cpu_topology: resetting the Topology-Change-Report
  2022-06-20 14:03 [PATCH v8 00/12] s390x: CPU Topology Pierre Morel
                   ` (8 preceding siblings ...)
  2022-06-20 14:03 ` [PATCH v8 09/12] target/s390x: interception of PTF instruction Pierre Morel
@ 2022-06-20 14:03 ` Pierre Morel
  2022-06-20 14:03 ` [PATCH v8 11/12] s390x/cpu_topology: CPU topology migration Pierre Morel
                   ` (2 subsequent siblings)
  12 siblings, 0 replies; 49+ messages in thread
From: Pierre Morel @ 2022-06-20 14:03 UTC (permalink / raw)
  To: qemu-s390x
  Cc: qemu-devel, borntraeger, pasic, richard.henderson, david, thuth,
	cohuck, mst, pbonzini, kvm, ehabkost, marcel.apfelbaum, eblake,
	armbru, seiden, nrb, frankja

During a subsystem reset the Topology-Change-Report is cleared.
Let's ask KVM to clear the MTCR in the case of a subsystem reset.

Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>
---
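Note: the reset path added here is a chain of guards in front of a single
request to the host. The sketch below models that chain in standalone C.
The struct and function names are hypothetical; the two boolean fields
stand in for s390_has_feat(S390_FEAT_CONFIGURATION_TOPOLOGY) and for
kvm_vm_check_attr() on KVM_S390_VM_CPU_TOPO_MTR_CLEAR.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the host state; not a QEMU or KVM type. */
struct topo_host {
    bool feature_enabled;  /* configuration topology feature is active */
    bool clear_attr_ok;    /* host advertises the "clear" attribute */
    bool mtcr;             /* pending topology change report */
    int  clear_requests;   /* number of clear requests sent to the host */
};

/* Clear the report on subsystem reset, but only if every guard passes. */
static void topo_subsystem_reset(struct topo_host *h)
{
    if (!h->feature_enabled) {
        return;            /* feature off: nothing to reset */
    }
    if (!h->clear_attr_ok) {
        return;            /* host cannot clear it: silently skip */
    }
    h->mtcr = false;
    h->clear_requests++;
}
```

In this model a missing attribute is skipped without reporting an error,
which mirrors the structure of kvm_s390_reset_mtr() in the diff.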
 hw/s390x/cpu-topology.c      |  6 ++++++
 hw/s390x/s390-virtio-ccw.c   |  1 +
 target/s390x/cpu-sysemu.c    |  7 +++++++
 target/s390x/cpu.h           |  1 +
 target/s390x/kvm/kvm.c       | 30 ++++++++++++++++++++++++++++++
 target/s390x/kvm/kvm_s390x.h |  2 ++
 6 files changed, 47 insertions(+)

diff --git a/hw/s390x/cpu-topology.c b/hw/s390x/cpu-topology.c
index 2dbb42ee2b..ba12cafaf7 100644
--- a/hw/s390x/cpu-topology.c
+++ b/hw/s390x/cpu-topology.c
@@ -667,6 +667,11 @@ static void s390_node_device_realize(DeviceState *dev, Error **errp)
     qbus_set_hotplug_handler(bus, OBJECT(dev));
 }
 
+static void s390_topology_reset(DeviceState *dev)
+{
+    s390_cpu_topology_mtr_reset();
+}
+
 static void node_class_init(ObjectClass *oc, void *data)
 {
     DeviceClass *dc = DEVICE_CLASS(oc);
@@ -676,6 +681,7 @@ static void node_class_init(ObjectClass *oc, void *data)
     set_bit(DEVICE_CATEGORY_BRIDGE, dc->categories);
     dc->realize = s390_node_device_realize;
     dc->desc = "topology node";
+    dc->reset = s390_topology_reset;
 }
 
 static const TypeInfo node_info = {
diff --git a/hw/s390x/s390-virtio-ccw.c b/hw/s390x/s390-virtio-ccw.c
index 5c0dbff6fd..0b30633ed5 100644
--- a/hw/s390x/s390-virtio-ccw.c
+++ b/hw/s390x/s390-virtio-ccw.c
@@ -122,6 +122,7 @@ static const char *const reset_dev_types[] = {
     "s390-flic",
     "diag288",
     TYPE_S390_PCI_HOST_BRIDGE,
+    TYPE_S390_TOPOLOGY_NODE,
 };
 
 static void subsystem_reset(void)
diff --git a/target/s390x/cpu-sysemu.c b/target/s390x/cpu-sysemu.c
index 948e4bd3e0..11d0d87301 100644
--- a/target/s390x/cpu-sysemu.c
+++ b/target/s390x/cpu-sysemu.c
@@ -306,3 +306,10 @@ void s390_do_cpu_set_diag318(CPUState *cs, run_on_cpu_data arg)
         kvm_s390_set_diag318(cs, arg.host_ulong);
     }
 }
+
+void s390_cpu_topology_mtr_reset(void)
+{
+    if (kvm_enabled()) {
+        kvm_s390_cpu_topology_reset();
+    }
+}
diff --git a/target/s390x/cpu.h b/target/s390x/cpu.h
index 9d48087b71..793e72c81a 100644
--- a/target/s390x/cpu.h
+++ b/target/s390x/cpu.h
@@ -826,6 +826,7 @@ void s390_enable_css_support(S390CPU *cpu);
 void s390_do_cpu_set_diag318(CPUState *cs, run_on_cpu_data arg);
 int s390_assign_subch_ioeventfd(EventNotifier *notifier, uint32_t sch_id,
                                 int vq, bool assign);
+void s390_cpu_topology_mtr_reset(void);
 #ifndef CONFIG_USER_ONLY
 unsigned int s390_cpu_set_state(uint8_t cpu_state, S390CPU *cpu);
 #else
diff --git a/target/s390x/kvm/kvm.c b/target/s390x/kvm/kvm.c
index c664c45e8a..277f8d37cf 100644
--- a/target/s390x/kvm/kvm.c
+++ b/target/s390x/kvm/kvm.c
@@ -2592,3 +2592,33 @@ bool kvm_arch_cpu_check_are_resettable(void)
 {
     return true;
 }
+
+static void kvm_s390_set_mtr(uint64_t attr)
+{
+    struct kvm_device_attr attribute = {
+        .group = KVM_S390_VM_CPU_TOPOLOGY,
+        .attr  = attr,
+    };
+    int ret = kvm_vm_ioctl(kvm_state, KVM_SET_DEVICE_ATTR, &attribute);
+
+    if (ret) {
+        error_report("Failed to set cpu topology attribute %lu: %s",
+                     attr, strerror(-ret));
+    }
+}
+
+static void kvm_s390_reset_mtr(void)
+{
+    uint64_t attr = KVM_S390_VM_CPU_TOPO_MTR_CLEAR;
+
+    if (kvm_vm_check_attr(kvm_state, KVM_S390_VM_CPU_TOPOLOGY, attr)) {
+            kvm_s390_set_mtr(attr);
+    }
+}
+
+void kvm_s390_cpu_topology_reset(void)
+{
+    if (s390_has_feat(S390_FEAT_CONFIGURATION_TOPOLOGY)) {
+        kvm_s390_reset_mtr();
+    }
+}
diff --git a/target/s390x/kvm/kvm_s390x.h b/target/s390x/kvm/kvm_s390x.h
index 05a5e1e6f4..d717c05827 100644
--- a/target/s390x/kvm/kvm_s390x.h
+++ b/target/s390x/kvm/kvm_s390x.h
@@ -46,4 +46,6 @@ void kvm_s390_restart_interrupt(S390CPU *cpu);
 void kvm_s390_stop_interrupt(S390CPU *cpu);
 void kvm_s390_set_diag318(CPUState *cs, uint64_t diag318_info);
 
+void kvm_s390_cpu_topology_reset(void);
+
 #endif /* KVM_S390X_H */
-- 
2.31.1


* [PATCH v8 11/12] s390x/cpu_topology: CPU topology migration
  2022-06-20 14:03 [PATCH v8 00/12] s390x: CPU Topology Pierre Morel
                   ` (9 preceding siblings ...)
  2022-06-20 14:03 ` [PATCH v8 10/12] s390x/cpu_topology: resetting the Topology-Change-Report Pierre Morel
@ 2022-06-20 14:03 ` Pierre Morel
  2022-06-20 14:03 ` [PATCH v8 12/12] s390x/cpu_topology: activating CPU topology Pierre Morel
  2022-07-14 18:43 ` [PATCH v8 00/12] s390x: CPU Topology Janis Schoetterl-Glausch
  12 siblings, 0 replies; 49+ messages in thread
From: Pierre Morel @ 2022-06-20 14:03 UTC (permalink / raw)
  To: qemu-s390x
  Cc: qemu-devel, borntraeger, pasic, richard.henderson, david, thuth,
	cohuck, mst, pbonzini, kvm, ehabkost, marcel.apfelbaum, eblake,
	armbru, seiden, nrb, frankja

To migrate the Multiple Topology Change Report (MTCR), we get it
from KVM and save it in the topology VM State Description during
the presave, then restore it to KVM on the destination during the
postload.

The migration state is needed whenever the CPU topology
feature is activated.

Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>
---
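Note: the save and restore steps described above can be modelled in
standalone C as in the sketch below. The struct and function names are
hypothetical, and the host MTCR is passed in as a plain bool instead of
being read from or written to KVM. The sketch assumes the usual
convention that both hooks return 0 on success and a negative errno
value on failure.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of the two migrated fields. */
struct topo_vmstate {
    bool mtcr;             /* topology change report pending on the source */
    bool topology_needed;  /* source had the topology feature enabled */
};

/* Source side: snapshot the feature flag and the host MTCR. */
static int topo_pre_save(struct topo_vmstate *s, bool has_feature,
                         bool host_mtcr)
{
    s->topology_needed = has_feature;
    s->mtcr = host_mtcr;
    return 0;
}

/* Destination side: refuse a feature mismatch, then restore the MTCR. */
static int topo_post_load(const struct topo_vmstate *s, bool has_feature,
                          bool *host_mtcr)
{
    if (s->topology_needed != has_feature) {
        return -EINVAL;
    }
    *host_mtcr = s->mtcr;
    return 0;
}
```

A pending report on the source is restored as pending on the destination;
a destination without the feature rejects the stream and leaves its own
state untouched.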
 hw/s390x/cpu-topology.c         | 43 +++++++++++++++++++++++++++++++++
 include/hw/s390x/cpu-topology.h |  2 ++
 target/s390x/cpu.h              |  2 ++
 target/s390x/cpu_models.c       |  1 +
 target/s390x/kvm/kvm.c          | 38 ++++++++++++++++++++++++++++-
 5 files changed, 85 insertions(+), 1 deletion(-)

diff --git a/hw/s390x/cpu-topology.c b/hw/s390x/cpu-topology.c
index ba12cafaf7..8fba2c8144 100644
--- a/hw/s390x/cpu-topology.c
+++ b/hw/s390x/cpu-topology.c
@@ -19,6 +19,8 @@
 #include "qemu/typedefs.h"
 #include "target/s390x/cpu.h"
 #include "hw/s390x/s390-virtio-ccw.h"
+#include "migration/vmstate.h"
+#include "qemu/error-report.h"
 
 /*
  * s390_handle_ptf:
@@ -672,6 +674,46 @@ static void s390_topology_reset(DeviceState *dev)
     s390_cpu_topology_mtr_reset();
 }
 
+static int cpu_topology_postload(void *opaque, int version_id)
+{
+    S390TopologyNode *node = opaque;
+
+    if (node->topology_needed != s390_has_feat(S390_FEAT_CONFIGURATION_TOPOLOGY)) {
+        return -EINVAL;
+    }
+
+    return s390_cpu_topology_mtcr_set(node->mtcr);
+}
+
+static int cpu_topology_presave(void *opaque)
+{
+    S390TopologyNode *node = opaque;
+
+    node->topology_needed = s390_has_feat(S390_FEAT_CONFIGURATION_TOPOLOGY);
+    node->mtcr =  s390_cpu_topology_mtcr_get();
+    return 1;
+}
+
+static bool cpu_topology_needed(void *opaque)
+{
+    return s390_has_feat(S390_FEAT_CONFIGURATION_TOPOLOGY);
+}
+
+
+const VMStateDescription vmstate_cpu_topology = {
+    .name = "cpu_topology",
+    .version_id = 1,
+    .pre_save = cpu_topology_presave,
+    .post_load = cpu_topology_postload,
+    .minimum_version_id = 1,
+    .needed = cpu_topology_needed,
+    .fields = (VMStateField[]) {
+        VMSTATE_BOOL(mtcr, S390TopologyNode),
+        VMSTATE_BOOL(topology_needed, S390TopologyNode),
+        VMSTATE_END_OF_LIST()
+    }
+};
+
 static void node_class_init(ObjectClass *oc, void *data)
 {
     DeviceClass *dc = DEVICE_CLASS(oc);
@@ -682,6 +724,7 @@ static void node_class_init(ObjectClass *oc, void *data)
     dc->realize = s390_node_device_realize;
     dc->desc = "topology node";
     dc->reset = s390_topology_reset;
+    dc->vmsd = &vmstate_cpu_topology;
 }
 
 static const TypeInfo node_info = {
diff --git a/include/hw/s390x/cpu-topology.h b/include/hw/s390x/cpu-topology.h
index ba0b1c1d7a..bd94a41135 100644
--- a/include/hw/s390x/cpu-topology.h
+++ b/include/hw/s390x/cpu-topology.h
@@ -84,6 +84,8 @@ struct S390TopologyNode {
     BusState *bus;
     uint8_t node_id;
     int cnt;
+    bool mtcr;
+    bool topology_needed;
 };
 typedef struct S390TopologyNode S390TopologyNode;
 OBJECT_DECLARE_SIMPLE_TYPE(S390TopologyNode, S390_TOPOLOGY_NODE)
diff --git a/target/s390x/cpu.h b/target/s390x/cpu.h
index 793e72c81a..0b697f3021 100644
--- a/target/s390x/cpu.h
+++ b/target/s390x/cpu.h
@@ -827,6 +827,8 @@ void s390_do_cpu_set_diag318(CPUState *cs, run_on_cpu_data arg);
 int s390_assign_subch_ioeventfd(EventNotifier *notifier, uint32_t sch_id,
                                 int vq, bool assign);
 void s390_cpu_topology_mtr_reset(void);
+int s390_cpu_topology_mtcr_set(uint16_t mtcr);
+bool s390_cpu_topology_mtcr_get(void);
 #ifndef CONFIG_USER_ONLY
 unsigned int s390_cpu_set_state(uint8_t cpu_state, S390CPU *cpu);
 #else
diff --git a/target/s390x/cpu_models.c b/target/s390x/cpu_models.c
index 1a562d2801..adf001debb 100644
--- a/target/s390x/cpu_models.c
+++ b/target/s390x/cpu_models.c
@@ -253,6 +253,7 @@ bool s390_has_feat(S390Feat feat)
         case S390_FEAT_SIE_CMMA:
         case S390_FEAT_SIE_PFMFI:
         case S390_FEAT_SIE_IBS:
+        case S390_FEAT_CONFIGURATION_TOPOLOGY:
             return false;
             break;
         default:
diff --git a/target/s390x/kvm/kvm.c b/target/s390x/kvm/kvm.c
index 277f8d37cf..e9aa689da7 100644
--- a/target/s390x/kvm/kvm.c
+++ b/target/s390x/kvm/kvm.c
@@ -63,6 +63,8 @@
     }                                         \
 } while (0)
 
+#include "qemu/error-report.h"
+
 #define kvm_vm_check_mem_attr(s, attr) \
     kvm_vm_check_attr(s, KVM_S390_VM_MEM_CTRL, attr)
 
@@ -2607,13 +2609,47 @@ static void kvm_s390_set_mtr(uint64_t attr)
     }
 }
 
-static void kvm_s390_reset_mtr(void)
+int s390_cpu_topology_mtcr_set(uint16_t mtcr)
 {
     uint64_t attr = KVM_S390_VM_CPU_TOPO_MTR_CLEAR;
 
+    attr = mtcr ? KVM_S390_VM_CPU_TOPO_MTR_SET :
+                  KVM_S390_VM_CPU_TOPO_MTR_CLEAR;
+
     if (kvm_vm_check_attr(kvm_state, KVM_S390_VM_CPU_TOPOLOGY, attr)) {
             kvm_s390_set_mtr(attr);
     }
+
+    return 0;
+}
+
+bool s390_cpu_topology_mtcr_get(void)
+{
+    struct kvm_s390_cpu_topology topology;
+    struct kvm_device_attr attribute = {
+        .group = KVM_S390_VM_CPU_TOPOLOGY,
+        .addr = (uint64_t)&topology,
+    };
+    int ret;
+
+    if (!kvm_vm_check_attr(kvm_state, KVM_S390_VM_CPU_TOPOLOGY, 0)) {
+        return -ENODEV;
+    }
+
+    ret = kvm_vm_ioctl(kvm_state, KVM_GET_DEVICE_ATTR, &attribute);
+    if (ret) {
+        error_report("Failed to get cpu topology");
+        return false;
+    }
+    return !!topology.mtcr;
+}
+
+static void kvm_s390_reset_mtr(void)
+{
+    if (kvm_vm_check_attr(kvm_state, KVM_S390_VM_CPU_TOPOLOGY,
+                          KVM_S390_VM_CPU_TOPO_MTR_CLEAR)) {
+            kvm_s390_set_mtr(KVM_S390_VM_CPU_TOPO_MTR_CLEAR);
+    }
 }
 
 void kvm_s390_cpu_topology_reset(void)
-- 
2.31.1


* [PATCH v8 12/12] s390x/cpu_topology: activating CPU topology
  2022-06-20 14:03 [PATCH v8 00/12] s390x: CPU Topology Pierre Morel
                   ` (10 preceding siblings ...)
  2022-06-20 14:03 ` [PATCH v8 11/12] s390x/cpu_topology: CPU topology migration Pierre Morel
@ 2022-06-20 14:03 ` Pierre Morel
  2022-07-14 18:43 ` [PATCH v8 00/12] s390x: CPU Topology Janis Schoetterl-Glausch
  12 siblings, 0 replies; 49+ messages in thread
From: Pierre Morel @ 2022-06-20 14:03 UTC (permalink / raw)
  To: qemu-s390x
  Cc: qemu-devel, borntraeger, pasic, richard.henderson, david, thuth,
	cohuck, mst, pbonzini, kvm, ehabkost, marcel.apfelbaum, eblake,
	armbru, seiden, nrb, frankja

The KVM capability KVM_CAP_S390_CPU_TOPOLOGY is used to
activate the S390_FEAT_CONFIGURATION_TOPOLOGY feature.

Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>
---
 target/s390x/kvm/kvm.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/target/s390x/kvm/kvm.c b/target/s390x/kvm/kvm.c
index e9aa689da7..50920bdbca 100644
--- a/target/s390x/kvm/kvm.c
+++ b/target/s390x/kvm/kvm.c
@@ -370,6 +370,7 @@ int kvm_arch_init(MachineState *ms, KVMState *s)
     kvm_vm_enable_cap(s, KVM_CAP_S390_USER_SIGP, 0);
     kvm_vm_enable_cap(s, KVM_CAP_S390_VECTOR_REGISTERS, 0);
     kvm_vm_enable_cap(s, KVM_CAP_S390_USER_STSI, 0);
+    kvm_vm_enable_cap(s, KVM_CAP_S390_CPU_TOPOLOGY, 0);
     if (ri_allowed()) {
         if (kvm_vm_enable_cap(s, KVM_CAP_S390_RI, 0) == 0) {
             cap_ri = 1;
@@ -2467,6 +2468,14 @@ void kvm_s390_get_host_cpu_model(S390CPUModel *model, Error **errp)
         set_bit(S390_FEAT_UNPACK, model->features);
     }
 
+    /*
+     * If KVM implements the CPU topology, activate the
+     * CPU topology feature.
+     */
+    if (kvm_check_extension(kvm_state, KVM_CAP_S390_CPU_TOPOLOGY)) {
+        set_bit(S390_FEAT_CONFIGURATION_TOPOLOGY, model->features);
+    }
+
     /* We emulate a zPCI bus and AEN, therefore we don't need HW support */
     set_bit(S390_FEAT_ZPCI, model->features);
     set_bit(S390_FEAT_ADAPTER_EVENT_NOTIFICATION, model->features);
-- 
2.31.1


* Re: [PATCH v8 02/12] s390x/cpu_topology: CPU topology objects and structures
  2022-06-20 14:03 ` [PATCH v8 02/12] s390x/cpu_topology: CPU topology objects and structures Pierre Morel
@ 2022-06-27 13:31   ` Janosch Frank
  2022-06-28 11:08     ` Pierre Morel
  2022-06-29 15:25     ` Pierre Morel
  2022-07-12 15:40   ` Janis Schoetterl-Glausch
  2022-08-23 13:30   ` Thomas Huth
  2 siblings, 2 replies; 49+ messages in thread
From: Janosch Frank @ 2022-06-27 13:31 UTC (permalink / raw)
  To: Pierre Morel, qemu-s390x
  Cc: qemu-devel, borntraeger, pasic, richard.henderson, david, thuth,
	cohuck, mst, pbonzini, kvm, ehabkost, marcel.apfelbaum, eblake,
	armbru, seiden, nrb

On 6/20/22 16:03, Pierre Morel wrote:
> We use new objects to have a dynamic administration of the CPU topology.
> The highest level object in this implementation is the s390 book and
> in this first implementation of CPU topology for S390 we have a single
> book.
> The book is built as a SYSBUS bridge during the CPU initialization.
> Other objects, sockets and core will be built after the parsing
> of the QEMU -smp argument.
> 
> Every object under this single book will be build dynamically
> immediately after a CPU has be realized if it is needed.
> The CPU will fill the sockets once after the other, according to the
> number of core per socket defined during the smp parsing.
> 
> Each CPU inside a socket will be represented by a bit in a 64bit
> unsigned long. Set on plug and clear on unplug of a CPU.
> 
> For the S390 CPU topology, thread and cores are merged into
> topology cores and the number of topology cores is the multiplication
> of cores by the numbers of threads.
> 
> Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>

[...]
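
For reference, the per-socket bookkeeping described in the quoted commit
message (one bit per CPU in a 64-bit word, set on plug and cleared on
unplug, with topology cores = cores * threads) can be sketched in
standalone C as follows. The names are hypothetical, and numbering the
bits from the least significant end is an assumption made for brevity;
the SYSIB format may well number them from the other end.

```c
#include <stdint.h>

/* Hypothetical socket model: one bit per topology core, at most 64. */
struct socket_model {
    uint64_t core_mask;
};

/* Threads and cores are merged into topology cores. */
static inline unsigned topology_cores(unsigned cores, unsigned threads)
{
    return cores * threads;
}

/* Set the bit when a CPU is plugged; core must be below 64. */
static inline void socket_plug(struct socket_model *s, unsigned core)
{
    s->core_mask |= UINT64_C(1) << core;
}

/* Clear the bit when a CPU is unplugged; core must be below 64. */
static inline void socket_unplug(struct socket_model *s, unsigned core)
{
    s->core_mask &= ~(UINT64_C(1) << core);
}

/* Count the CPUs currently plugged into the socket. */
static inline unsigned socket_core_count(const struct socket_model *s)
{
    unsigned n = 0;

    for (uint64_t m = s->core_mask; m; m &= m - 1) {
        n++;
    }
    return n;
}
```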

> diff --git a/target/s390x/cpu.h b/target/s390x/cpu.h
> index 7d6d01325b..216adfde26 100644
> --- a/target/s390x/cpu.h
> +++ b/target/s390x/cpu.h
> @@ -565,6 +565,53 @@ typedef union SysIB {
>   } SysIB;
>   QEMU_BUILD_BUG_ON(sizeof(SysIB) != 4096);
>   
> +/* CPU type Topology List Entry */
> +typedef struct SysIBTl_cpu {
> +        uint8_t nl;
> +        uint8_t reserved0[3];
> +        uint8_t reserved1:5;
> +        uint8_t dedicated:1;
> +        uint8_t polarity:2;
> +        uint8_t type;
> +        uint16_t origin;
> +        uint64_t mask;
> +} SysIBTl_cpu;
> +QEMU_BUILD_BUG_ON(sizeof(SysIBTl_cpu) != 16);
> +
> +/* Container type Topology List Entry */
> +typedef struct SysIBTl_container {
> +        uint8_t nl;
> +        uint8_t reserved[6];
> +        uint8_t id;
> +} QEMU_PACKED SysIBTl_container;
> +QEMU_BUILD_BUG_ON(sizeof(SysIBTl_container) != 8);
> +
> +/* Generic Topology List Entry */
> +typedef union SysIBTl_entry {
> +        uint8_t nl;

This union member is unused, isn't it?

> +        SysIBTl_container container;
> +        SysIBTl_cpu cpu;
> +} SysIBTl_entry;
> +
> +#define TOPOLOGY_NR_MAG  6

TOPOLOGY_TOTAL_NR_MAGS ?

> +#define TOPOLOGY_NR_MAG6 0

TOPOLOGY_NR_TLES_MAG6 ?

I'm open to other suggestions but we need to differentiate between the 
number of mag array entries and the number of TLEs in the MAGs.

> +#define TOPOLOGY_NR_MAG5 1
> +#define TOPOLOGY_NR_MAG4 2
> +#define TOPOLOGY_NR_MAG3 3
> +#define TOPOLOGY_NR_MAG2 4
> +#define TOPOLOGY_NR_MAG1 5

I'd appreciate a \n here.

> +/* Configuration topology */
> +typedef struct SysIB_151x {
> +    uint8_t  res0[2];

You're using "reserved" everywhere but now it's "rev"?

> +    uint16_t length;
> +    uint8_t  mag[TOPOLOGY_NR_MAG];
> +    uint8_t  res1;
> +    uint8_t  mnest;
> +    uint32_t res2;
> +    SysIBTl_entry tle[0];
> +} SysIB_151x;
> +QEMU_BUILD_BUG_ON(sizeof(SysIB_151x) != 16);
> +
>   /* MMU defines */
>   #define ASCE_ORIGIN           (~0xfffULL) /* segment table origin             */
>   #define ASCE_SUBSPACE         0x200       /* subspace group control           */


* Re: [PATCH v8 03/12] s390x/cpu_topology: implementating Store Topology System Information
  2022-06-20 14:03 ` [PATCH v8 03/12] s390x/cpu_topology: implementating Store Topology System Information Pierre Morel
@ 2022-06-27 14:26   ` Janosch Frank
  2022-06-28 11:03     ` Pierre Morel
  2022-07-20 19:34   ` Janis Schoetterl-Glausch
  1 sibling, 1 reply; 49+ messages in thread
From: Janosch Frank @ 2022-06-27 14:26 UTC (permalink / raw)
  To: Pierre Morel, qemu-s390x
  Cc: qemu-devel, borntraeger, pasic, richard.henderson, david, thuth,
	cohuck, mst, pbonzini, kvm, ehabkost, marcel.apfelbaum, eblake,
	armbru, seiden, nrb

On 6/20/22 16:03, Pierre Morel wrote:

s390x/cpu_topology: Add STSI function code 15 handling

> The handling of STSI is enhanced with the interception of the
> function code 15 for storing CPU topology.

s/interception/handling/

> 
> Using the objects built during the plugging of CPU, we build the
> SYSIB 15_1_x structures.
> 
> With this patch the maximum MNEST level is 2, this is also
> the only level allowed and only SYSIB 15_1_2 will be built.
> 
> Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>

> +void insert_stsi_15_1_x(S390CPU *cpu, int sel2, __u64 addr, uint8_t ar)
> +{
> +    const MachineState *machine = MACHINE(qdev_get_machine());
> +    void *p;
> +    int ret;
> +
> +    /*
> +     * Until the SCLP STSI Facility reporting the MNEST value is used,
> +     * a sel2 value of 2 is the only value allowed in STSI 15.1.x.
> +     */
> +    if (sel2 != 2) {
> +        setcc(cpu, 3);
> +        return;
> +    }
> +
> +    p = g_malloc0(TARGET_PAGE_SIZE);
> +
> +    setup_stsi(machine, p, 2);
> +
> +    if (s390_is_pv()) {
> +        ret = s390_cpu_pv_mem_write(cpu, 0, p, TARGET_PAGE_SIZE);
> +    } else {
> +        ret = s390_cpu_virt_mem_write(cpu, addr, ar, p, TARGET_PAGE_SIZE);
> +    }

For later reference:
FCs over 3 are rejected by SIE for PV guests via cc 3.

I currently don't know if and when that will be changed but I'll ask around.

> +
> +    setcc(cpu, ret ? 3 : 0);
> +    g_free(p);
> +}
> +
> diff --git a/target/s390x/kvm/kvm.c b/target/s390x/kvm/kvm.c
> index 7bd8db0e7b..563bf5ac60 100644
> --- a/target/s390x/kvm/kvm.c
> +++ b/target/s390x/kvm/kvm.c
> @@ -51,6 +51,7 @@
>   #include "hw/s390x/s390-virtio-ccw.h"
>   #include "hw/s390x/s390-virtio-hcall.h"
>   #include "hw/s390x/pv.h"
> +#include "hw/s390x/cpu-topology.h"
>   
>   #ifndef DEBUG_KVM
>   #define DEBUG_KVM  0
> @@ -1918,6 +1919,10 @@ static int handle_stsi(S390CPU *cpu)
>           /* Only sysib 3.2.2 needs post-handling for now. */
>           insert_stsi_3_2_2(cpu, run->s390_stsi.addr, run->s390_stsi.ar);
>           return 0;
> +    case 15:
> +        insert_stsi_15_1_x(cpu, run->s390_stsi.sel2, run->s390_stsi.addr,
> +                           run->s390_stsi.ar);
> +        return 0;
>       default:
>           return 0;
>       }
> diff --git a/target/s390x/meson.build b/target/s390x/meson.build
> index 84c1402a6a..890ccfa789 100644
> --- a/target/s390x/meson.build
> +++ b/target/s390x/meson.build
> @@ -29,6 +29,7 @@ s390x_softmmu_ss.add(files(
>     'sigp.c',
>     'cpu-sysemu.c',
>     'cpu_models_sysemu.c',
> +  'cpu_topology.c',
>   ))
>   
>   s390x_user_ss = ss.source_set()


* Re: [PATCH v8 03/12] s390x/cpu_topology: implementating Store Topology System Information
  2022-06-27 14:26   ` Janosch Frank
@ 2022-06-28 11:03     ` Pierre Morel
  0 siblings, 0 replies; 49+ messages in thread
From: Pierre Morel @ 2022-06-28 11:03 UTC (permalink / raw)
  To: Janosch Frank, qemu-s390x
  Cc: qemu-devel, borntraeger, pasic, richard.henderson, david, thuth,
	cohuck, mst, pbonzini, kvm, ehabkost, marcel.apfelbaum, eblake,
	armbru, seiden, nrb



On 6/27/22 16:26, Janosch Frank wrote:
> On 6/20/22 16:03, Pierre Morel wrote:
> 
> s390x/cpu_topology: Add STSI function code 15 handling

OK

> 
>> The handling of STSI is enhanced with the interception of the
>> function code 15 for storing CPU topology.
> 
> s/interception/handling/

OK

> 
>>
>> Using the objects built during the plugging of CPU, we build the
>> SYSIB 15_1_x structures.
>>
>> With this patch the maximum MNEST level is 2, this is also
>> the only level allowed and only SYSIB 15_1_2 will be built.
>>
>> Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>
> 
>> +void insert_stsi_15_1_x(S390CPU *cpu, int sel2, __u64 addr, uint8_t ar)
>> +{
>> +    const MachineState *machine = MACHINE(qdev_get_machine());
>> +    void *p;
>> +    int ret;
>> +
>> +    /*
>> +     * Until the SCLP STSI Facility reporting the MNEST value is used,
>> +     * a sel2 value of 2 is the only value allowed in STSI 15.1.x.
>> +     */
>> +    if (sel2 != 2) {
>> +        setcc(cpu, 3);
>> +        return;
>> +    }
>> +
>> +    p = g_malloc0(TARGET_PAGE_SIZE);
>> +
>> +    setup_stsi(machine, p, 2);
>> +
>> +    if (s390_is_pv()) {
>> +        ret = s390_cpu_pv_mem_write(cpu, 0, p, TARGET_PAGE_SIZE);
>> +    } else {
>> +        ret = s390_cpu_virt_mem_write(cpu, addr, ar, p, 
>> TARGET_PAGE_SIZE);
>> +    }
> 
> For later reference:
> FCs over 3 are rejected by SIE for PV guests via cc 3.
> 
> I currently don't know if and when that will be changed but I'll ask 
> around.

Yes, thanks, I forgot to change that, will do.


-- 
Pierre Morel
IBM Lab Boeblingen

* Re: [PATCH v8 02/12] s390x/cpu_topology: CPU topology objects and structures
  2022-06-27 13:31   ` Janosch Frank
@ 2022-06-28 11:08     ` Pierre Morel
  2022-06-29 15:25     ` Pierre Morel
  1 sibling, 0 replies; 49+ messages in thread
From: Pierre Morel @ 2022-06-28 11:08 UTC (permalink / raw)
  To: Janosch Frank, qemu-s390x
  Cc: qemu-devel, borntraeger, pasic, richard.henderson, david, thuth,
	cohuck, mst, pbonzini, kvm, ehabkost, marcel.apfelbaum, eblake,
	armbru, seiden, nrb



On 6/27/22 15:31, Janosch Frank wrote:
> On 6/20/22 16:03, Pierre Morel wrote:
>> We use new objects to have a dynamic administration of the CPU topology.
>> The highest level object in this implementation is the s390 book and
>> in this first implementation of CPU topology for S390 we have a single
>> book.
>> The book is built as a SYSBUS bridge during the CPU initialization.
>> Other objects, sockets and core will be built after the parsing
>> of the QEMU -smp argument.
>>
>> Every object under this single book will be build dynamically
>> immediately after a CPU has be realized if it is needed.
>> The CPU will fill the sockets once after the other, according to the
>> number of core per socket defined during the smp parsing.
>>
>> Each CPU inside a socket will be represented by a bit in a 64bit
>> unsigned long. Set on plug and clear on unplug of a CPU.
>>
>> For the S390 CPU topology, thread and cores are merged into
>> topology cores and the number of topology cores is the multiplication
>> of cores by the numbers of threads.
>>
>> Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>
> 
> [...]
> 
>> diff --git a/target/s390x/cpu.h b/target/s390x/cpu.h
>> index 7d6d01325b..216adfde26 100644
>> --- a/target/s390x/cpu.h
>> +++ b/target/s390x/cpu.h
>> @@ -565,6 +565,53 @@ typedef union SysIB {
>>   } SysIB;
>>   QEMU_BUILD_BUG_ON(sizeof(SysIB) != 4096);
>> +/* CPU type Topology List Entry */
>> +typedef struct SysIBTl_cpu {
>> +        uint8_t nl;
>> +        uint8_t reserved0[3];
>> +        uint8_t reserved1:5;
>> +        uint8_t dedicated:1;
>> +        uint8_t polarity:2;
>> +        uint8_t type;
>> +        uint16_t origin;
>> +        uint64_t mask;
>> +} SysIBTl_cpu;
>> +QEMU_BUILD_BUG_ON(sizeof(SysIBTl_cpu) != 16);
>> +
>> +/* Container type Topology List Entry */
>> +typedef struct SysIBTl_container {
>> +        uint8_t nl;
>> +        uint8_t reserved[6];
>> +        uint8_t id;
>> +} QEMU_PACKED SysIBTl_container;
>> +QEMU_BUILD_BUG_ON(sizeof(SysIBTl_container) != 8);
>> +
>> +/* Generic Topology List Entry */
>> +typedef union SysIBTl_entry {
>> +        uint8_t nl;
> 
> This union member is unused, isn't it?

Yes, forgot to remove it.
will do.

> 
>> +        SysIBTl_container container;
>> +        SysIBTl_cpu cpu;
>> +} SysIBTl_entry;
>> +
>> +#define TOPOLOGY_NR_MAG  6
> 
> TOPOLOGY_TOTAL_NR_MAGS ?

OK

> 
>> +#define TOPOLOGY_NR_MAG6 0
> 
> TOPOLOGY_NR_TLES_MAG6 ?
> 
> I'm open to other suggestions but we need to differentiate between the 
> number of mag array entries and the number of TLEs in the MAGs.

it is OK for me

> 
>> +#define TOPOLOGY_NR_MAG5 1
>> +#define TOPOLOGY_NR_MAG4 2
>> +#define TOPOLOGY_NR_MAG3 3
>> +#define TOPOLOGY_NR_MAG2 4
>> +#define TOPOLOGY_NR_MAG1 5
> 
> I'd appreciate a \n here.

ok

> 
>> +/* Configuration topology */
>> +typedef struct SysIB_151x {
>> +    uint8_t  res0[2];
> 
> You're using "reserved" everywhere but now it's "rev"?

OK, I will use "reserved" everywhere.

> 
>> +    uint16_t length;
>> +    uint8_t  mag[TOPOLOGY_NR_MAG];
>> +    uint8_t  res1;
>> +    uint8_t  mnest;
>> +    uint32_t res2;
>> +    SysIBTl_entry tle[0];
>> +} SysIB_151x;
>> +QEMU_BUILD_BUG_ON(sizeof(SysIB_151x) != 16);
>> +
>>   /* MMU defines */
>>   #define ASCE_ORIGIN           (~0xfffULL) /* segment table 
>> origin             */
>>   #define ASCE_SUBSPACE         0x200       /* subspace group 
>> control           */
> 

Thanks,
Pierre

-- 
Pierre Morel
IBM Lab Boeblingen

* Re: [PATCH v8 02/12] s390x/cpu_topology: CPU topology objects and structures
  2022-06-27 13:31   ` Janosch Frank
  2022-06-28 11:08     ` Pierre Morel
@ 2022-06-29 15:25     ` Pierre Morel
  2022-07-04 11:47       ` Janosch Frank
  1 sibling, 1 reply; 49+ messages in thread
From: Pierre Morel @ 2022-06-29 15:25 UTC (permalink / raw)
  To: Janosch Frank, qemu-s390x
  Cc: qemu-devel, borntraeger, pasic, richard.henderson, david, thuth,
	cohuck, mst, pbonzini, kvm, ehabkost, marcel.apfelbaum, eblake,
	armbru, seiden, nrb



On 6/27/22 15:31, Janosch Frank wrote:
> On 6/20/22 16:03, Pierre Morel wrote:
>> We use new objects to have a dynamic administration of the CPU topology.
>> The highest level object in this implementation is the s390 book and
>> in this first implementation of CPU topology for S390 we have a single
>> book.
>> The book is built as a SYSBUS bridge during the CPU initialization.
>> Other objects, sockets and core will be built after the parsing
>> of the QEMU -smp argument.
>>
>> Every object under this single book will be build dynamically
>> immediately after a CPU has be realized if it is needed.
>> The CPU will fill the sockets once after the other, according to the
>> number of core per socket defined during the smp parsing.
>>
>> Each CPU inside a socket will be represented by a bit in a 64bit
>> unsigned long. Set on plug and clear on unplug of a CPU.
>>
>> For the S390 CPU topology, thread and cores are merged into
>> topology cores and the number of topology cores is the multiplication
>> of cores by the numbers of threads.
>>
>> Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>
> 
> [...]
> 
>> diff --git a/target/s390x/cpu.h b/target/s390x/cpu.h
>> index 7d6d01325b..216adfde26 100644
>> --- a/target/s390x/cpu.h
>> +++ b/target/s390x/cpu.h
>> @@ -565,6 +565,53 @@ typedef union SysIB {
>>   } SysIB;
>>   QEMU_BUILD_BUG_ON(sizeof(SysIB) != 4096);
>> +/* CPU type Topology List Entry */
>> +typedef struct SysIBTl_cpu {
>> +        uint8_t nl;
>> +        uint8_t reserved0[3];
>> +        uint8_t reserved1:5;
>> +        uint8_t dedicated:1;
>> +        uint8_t polarity:2;
>> +        uint8_t type;
>> +        uint16_t origin;
>> +        uint64_t mask;
>> +} SysIBTl_cpu;
>> +QEMU_BUILD_BUG_ON(sizeof(SysIBTl_cpu) != 16);
>> +
>> +/* Container type Topology List Entry */
>> +typedef struct SysIBTl_container {
>> +        uint8_t nl;
>> +        uint8_t reserved[6];
>> +        uint8_t id;
>> +} QEMU_PACKED SysIBTl_container;
>> +QEMU_BUILD_BUG_ON(sizeof(SysIBTl_container) != 8);
>> +
>> +/* Generic Topology List Entry */
>> +typedef union SysIBTl_entry {
>> +        uint8_t nl;
> 
> This union member is unused, isn't it?
> 
>> +        SysIBTl_container container;
>> +        SysIBTl_cpu cpu;
>> +} SysIBTl_entry;
>> +
>> +#define TOPOLOGY_NR_MAG  6
> 
> TOPOLOGY_TOTAL_NR_MAGS ?
> 
>> +#define TOPOLOGY_NR_MAG6 0
> 
> TOPOLOGY_NR_TLES_MAG6 ?
> 
> I'm open to other suggestions but we need to differentiate between the 
> number of mag array entries and the number of TLEs in the MAGs.


typedef enum {
         TOPOLOGY_MAG6 = 0,
         TOPOLOGY_MAG5 = 1,
         TOPOLOGY_MAG4 = 2,
         TOPOLOGY_MAG3 = 3,
         TOPOLOGY_MAG2 = 4,
         TOPOLOGY_MAG1 = 5,
         TOPOLOGY_TOTAL_MAGS = 6,
};


or an enum with TOPOLOGY_NR_TLES_MAGx ?
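
For illustration, a minimal sketch of how the first variant could be used
to index the mag[] array. The ordering (MAG6 at index 0) follows the
defines quoted above; the helper topology_mag_index() is hypothetical and
not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Index into the SYSIB 15.1.x mag[] array; MAG6 is stored first. */
enum {
    TOPOLOGY_MAG6 = 0,
    TOPOLOGY_MAG5 = 1,
    TOPOLOGY_MAG4 = 2,
    TOPOLOGY_MAG3 = 3,
    TOPOLOGY_MAG2 = 4,
    TOPOLOGY_MAG1 = 5,
    TOPOLOGY_TOTAL_MAGS = 6,
};

/* Hypothetical helper: map a nesting level (1..6) to its mag[] index. */
static inline int topology_mag_index(int level)
{
    assert(level >= 1 && level <= TOPOLOGY_TOTAL_MAGS);
    return TOPOLOGY_TOTAL_MAGS - level;
}
```

With this, mag[topology_mag_index(1)] and mag[TOPOLOGY_MAG1] name the same
array slot, and TOPOLOGY_TOTAL_MAGS stays usable as the array size.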

> 
>> +#define TOPOLOGY_NR_MAG5 1
>> +#define TOPOLOGY_NR_MAG4 2
>> +#define TOPOLOGY_NR_MAG3 3
>> +#define TOPOLOGY_NR_MAG2 4
>> +#define TOPOLOGY_NR_MAG1 5
> 
> I'd appreciate a \n here.

OK

> 
>> +/* Configuration topology */
>> +typedef struct SysIB_151x {
>> +    uint8_t  res0[2];
> 
> You're using "reserved" everywhere but now it's "res"?

OK I will keep reserved

> 
>> +    uint16_t length;
>> +    uint8_t  mag[TOPOLOGY_NR_MAG];
>> +    uint8_t  res1;
>> +    uint8_t  mnest;
>> +    uint32_t res2;
>> +    SysIBTl_entry tle[0];
>> +} SysIB_151x;
>> +QEMU_BUILD_BUG_ON(sizeof(SysIB_151x) != 16);
>> +
>>   /* MMU defines */
>>   #define ASCE_ORIGIN           (~0xfffULL) /* segment table 
>> origin             */
>>   #define ASCE_SUBSPACE         0x200       /* subspace group 
>> control           */
> 
> 

-- 
Pierre Morel
IBM Lab Boeblingen


* Re: [PATCH v8 02/12] s390x/cpu_topology: CPU topology objects and structures
  2022-06-29 15:25     ` Pierre Morel
@ 2022-07-04 11:47       ` Janosch Frank
  2022-07-04 14:51         ` Pierre Morel
  0 siblings, 1 reply; 49+ messages in thread
From: Janosch Frank @ 2022-07-04 11:47 UTC (permalink / raw)
  To: Pierre Morel, qemu-s390x
  Cc: qemu-devel, borntraeger, pasic, richard.henderson, david, thuth,
	cohuck, mst, pbonzini, kvm, ehabkost, marcel.apfelbaum, eblake,
	armbru, seiden, nrb

On 6/29/22 17:25, Pierre Morel wrote:
> 
> 
> On 6/27/22 15:31, Janosch Frank wrote:
>> On 6/20/22 16:03, Pierre Morel wrote:
>>> We use new objects to have a dynamic administration of the CPU topology.
>>> The highest level object in this implementation is the s390 book and
>>> in this first implementation of CPU topology for S390 we have a single
>>> book.
>>> The book is built as a SYSBUS bridge during the CPU initialization.
>>> Other objects, sockets and core will be built after the parsing
>>> of the QEMU -smp argument.
>>>
>>> Every object under this single book will be build dynamically
>>> immediately after a CPU has be realized if it is needed.
>>> The CPU will fill the sockets once after the other, according to the
>>> number of core per socket defined during the smp parsing.
>>>
>>> Each CPU inside a socket will be represented by a bit in a 64bit
>>> unsigned long. Set on plug and clear on unplug of a CPU.
>>>
>>> For the S390 CPU topology, thread and cores are merged into
>>> topology cores and the number of topology cores is the multiplication
>>> of cores by the numbers of threads.
>>>
>>> Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>
>>
>> [...]
>>
>>> diff --git a/target/s390x/cpu.h b/target/s390x/cpu.h
>>> index 7d6d01325b..216adfde26 100644
>>> --- a/target/s390x/cpu.h
>>> +++ b/target/s390x/cpu.h
>>> @@ -565,6 +565,53 @@ typedef union SysIB {
>>>    } SysIB;
>>>    QEMU_BUILD_BUG_ON(sizeof(SysIB) != 4096);
>>> +/* CPU type Topology List Entry */
>>> +typedef struct SysIBTl_cpu {
>>> +        uint8_t nl;
>>> +        uint8_t reserved0[3];
>>> +        uint8_t reserved1:5;
>>> +        uint8_t dedicated:1;
>>> +        uint8_t polarity:2;
>>> +        uint8_t type;
>>> +        uint16_t origin;
>>> +        uint64_t mask;
>>> +} SysIBTl_cpu;
>>> +QEMU_BUILD_BUG_ON(sizeof(SysIBTl_cpu) != 16);
>>> +
>>> +/* Container type Topology List Entry */
>>> +typedef struct SysIBTl_container {
>>> +        uint8_t nl;
>>> +        uint8_t reserved[6];
>>> +        uint8_t id;
>>> +} QEMU_PACKED SysIBTl_container;
>>> +QEMU_BUILD_BUG_ON(sizeof(SysIBTl_container) != 8);
>>> +
>>> +/* Generic Topology List Entry */
>>> +typedef union SysIBTl_entry {
>>> +        uint8_t nl;
>>
>> This union member is unused, isn't it?
>>
>>> +        SysIBTl_container container;
>>> +        SysIBTl_cpu cpu;
>>> +} SysIBTl_entry;
>>> +
>>> +#define TOPOLOGY_NR_MAG  6
>>
>> TOPOLOGY_TOTAL_NR_MAGS ?
>>
>>> +#define TOPOLOGY_NR_MAG6 0
>>
>> TOPOLOGY_NR_TLES_MAG6 ?
>>
>> I'm open to other suggestions but we need to differentiate between the
>> number of mag array entries and the number of TLEs in the MAGs.
> 
> 
> typedef enum {
>           TOPOLOGY_MAG6 = 0,
>           TOPOLOGY_MAG5 = 1,
>           TOPOLOGY_MAG4 = 2,
>           TOPOLOGY_MAG3 = 3,
>           TOPOLOGY_MAG2 = 4,
>           TOPOLOGY_MAG1 = 5,
>           TOPOLOGY_TOTAL_MAGS = 6,
> };
> 
> 
> or an enum with TOPOLOGY_NR_TLES_MAGx ?

I'd stick with the shorter first variant.

> 
>>
>>> +#define TOPOLOGY_NR_MAG5 1
>>> +#define TOPOLOGY_NR_MAG4 2
>>> +#define TOPOLOGY_NR_MAG3 3
>>> +#define TOPOLOGY_NR_MAG2 4
>>> +#define TOPOLOGY_NR_MAG1 5
>>
>> I'd appreciate a \n here.
> 
> OK
> 
>>
>>> +/* Configuration topology */
>>> +typedef struct SysIB_151x {
>>> +    uint8_t  res0[2];
>>
>> You're using "reserved" everywhere but now it's "res"?
> 
> OK I will keep reserved
> 
>>
>>> +    uint16_t length;
>>> +    uint8_t  mag[TOPOLOGY_NR_MAG];
>>> +    uint8_t  res1;
>>> +    uint8_t  mnest;
>>> +    uint32_t res2;
>>> +    SysIBTl_entry tle[0];
>>> +} SysIB_151x;
>>> +QEMU_BUILD_BUG_ON(sizeof(SysIB_151x) != 16);
>>> +
>>>    /* MMU defines */
>>>    #define ASCE_ORIGIN           (~0xfffULL) /* segment table
>>> origin             */
>>>    #define ASCE_SUBSPACE         0x200       /* subspace group
>>> control           */
>>
>>
> 



* Re: [PATCH v8 02/12] s390x/cpu_topology: CPU topology objects and structures
  2022-07-04 11:47       ` Janosch Frank
@ 2022-07-04 14:51         ` Pierre Morel
  0 siblings, 0 replies; 49+ messages in thread
From: Pierre Morel @ 2022-07-04 14:51 UTC (permalink / raw)
  To: Janosch Frank, qemu-s390x
  Cc: qemu-devel, borntraeger, pasic, richard.henderson, david, thuth,
	cohuck, mst, pbonzini, kvm, ehabkost, marcel.apfelbaum, eblake,
	armbru, seiden, nrb



On 7/4/22 13:47, Janosch Frank wrote:
> On 6/29/22 17:25, Pierre Morel wrote:
>>
>>
>> On 6/27/22 15:31, Janosch Frank wrote:
>>> On 6/20/22 16:03, Pierre Morel wrote:
>>>> We use new objects to have a dynamic administration of the CPU 
>>>> topology.
>>>> The highest level object in this implementation is the s390 book and
>>>> in this first implementation of CPU topology for S390 we have a single
>>>> book.
>>>> The book is built as a SYSBUS bridge during the CPU initialization.
>>>> Other objects, sockets and core will be built after the parsing
>>>> of the QEMU -smp argument.
>>>>
>>>> Every object under this single book will be build dynamically
>>>> immediately after a CPU has be realized if it is needed.
>>>> The CPU will fill the sockets once after the other, according to the
>>>> number of core per socket defined during the smp parsing.
>>>>
>>>> Each CPU inside a socket will be represented by a bit in a 64bit
>>>> unsigned long. Set on plug and clear on unplug of a CPU.
>>>>
>>>> For the S390 CPU topology, thread and cores are merged into
>>>> topology cores and the number of topology cores is the multiplication
>>>> of cores by the numbers of threads.
>>>>
>>>> Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>
>>>
>>> [...]
>>>
>>>> diff --git a/target/s390x/cpu.h b/target/s390x/cpu.h
>>>> index 7d6d01325b..216adfde26 100644
>>>> --- a/target/s390x/cpu.h
>>>> +++ b/target/s390x/cpu.h
>>>> @@ -565,6 +565,53 @@ typedef union SysIB {
>>>>    } SysIB;
>>>>    QEMU_BUILD_BUG_ON(sizeof(SysIB) != 4096);
>>>> +/* CPU type Topology List Entry */
>>>> +typedef struct SysIBTl_cpu {
>>>> +        uint8_t nl;
>>>> +        uint8_t reserved0[3];
>>>> +        uint8_t reserved1:5;
>>>> +        uint8_t dedicated:1;
>>>> +        uint8_t polarity:2;
>>>> +        uint8_t type;
>>>> +        uint16_t origin;
>>>> +        uint64_t mask;
>>>> +} SysIBTl_cpu;
>>>> +QEMU_BUILD_BUG_ON(sizeof(SysIBTl_cpu) != 16);
>>>> +
>>>> +/* Container type Topology List Entry */
>>>> +typedef struct SysIBTl_container {
>>>> +        uint8_t nl;
>>>> +        uint8_t reserved[6];
>>>> +        uint8_t id;
>>>> +} QEMU_PACKED SysIBTl_container;
>>>> +QEMU_BUILD_BUG_ON(sizeof(SysIBTl_container) != 8);
>>>> +
>>>> +/* Generic Topology List Entry */
>>>> +typedef union SysIBTl_entry {
>>>> +        uint8_t nl;
>>>
>>> This union member is unused, isn't it?
>>>
>>>> +        SysIBTl_container container;
>>>> +        SysIBTl_cpu cpu;
>>>> +} SysIBTl_entry;
>>>> +
>>>> +#define TOPOLOGY_NR_MAG  6
>>>
>>> TOPOLOGY_TOTAL_NR_MAGS ?
>>>
>>>> +#define TOPOLOGY_NR_MAG6 0
>>>
>>> TOPOLOGY_NR_TLES_MAG6 ?
>>>
>>> I'm open to other suggestions but we need to differentiate between the
>>> number of mag array entries and the number of TLEs in the MAGs.
>>
>>
>> typedef enum {
>>           TOPOLOGY_MAG6 = 0,
>>           TOPOLOGY_MAG5 = 1,
>>           TOPOLOGY_MAG4 = 2,
>>           TOPOLOGY_MAG3 = 3,
>>           TOPOLOGY_MAG2 = 4,
>>           TOPOLOGY_MAG1 = 5,
>>           TOPOLOGY_TOTAL_MAGS = 6,
>> };
>>
>>
>> or an enum with TOPOLOGY_NR_TLES_MAGx ?
> 
> I'd stick with the shorter first variant.


OK, thanks

> 
>>
>>>
>>>> +#define TOPOLOGY_NR_MAG5 1
>>>> +#define TOPOLOGY_NR_MAG4 2
>>>> +#define TOPOLOGY_NR_MAG3 3
>>>> +#define TOPOLOGY_NR_MAG2 4
>>>> +#define TOPOLOGY_NR_MAG1 5
>>>
>>> I'd appreciate a \n here.
>>
>> OK
>>
>>>
>>>> +/* Configuration topology */
>>>> +typedef struct SysIB_151x {
>>>> +    uint8_t  res0[2];
>>>
>>> You're using "reserved" everywhere but now it's "res"?
>>
>> OK I will keep reserved
>>
>>>
>>>> +    uint16_t length;
>>>> +    uint8_t  mag[TOPOLOGY_NR_MAG];
>>>> +    uint8_t  res1;
>>>> +    uint8_t  mnest;
>>>> +    uint32_t res2;
>>>> +    SysIBTl_entry tle[0];
>>>> +} SysIB_151x;
>>>> +QEMU_BUILD_BUG_ON(sizeof(SysIB_151x) != 16);
>>>> +
>>>>    /* MMU defines */
>>>>    #define ASCE_ORIGIN           (~0xfffULL) /* segment table
>>>> origin             */
>>>>    #define ASCE_SUBSPACE         0x200       /* subspace group
>>>> control           */
>>>
>>>
>>
> 

-- 
Pierre Morel
IBM Lab Boeblingen


* Re: [PATCH v8 02/12] s390x/cpu_topology: CPU topology objects and structures
  2022-06-20 14:03 ` [PATCH v8 02/12] s390x/cpu_topology: CPU topology objects and structures Pierre Morel
  2022-06-27 13:31   ` Janosch Frank
@ 2022-07-12 15:40   ` Janis Schoetterl-Glausch
  2022-07-13 14:59     ` Pierre Morel
  2022-08-23 13:30   ` Thomas Huth
  2 siblings, 1 reply; 49+ messages in thread
From: Janis Schoetterl-Glausch @ 2022-07-12 15:40 UTC (permalink / raw)
  To: Pierre Morel, qemu-s390x
  Cc: qemu-devel, borntraeger, pasic, richard.henderson, david, thuth,
	cohuck, mst, pbonzini, kvm, ehabkost, marcel.apfelbaum, eblake,
	armbru, seiden, nrb, frankja

On 6/20/22 16:03, Pierre Morel wrote:
> We use new objects to have a dynamic administration of the CPU topology.
> The highest level object in this implementation is the s390 book and
> in this first implementation of CPU topology for S390 we have a single
> book.
> The book is built as a SYSBUS bridge during the CPU initialization.
> Other objects, sockets and core will be built after the parsing
> of the QEMU -smp argument.
> 
> Every object under this single book will be build dynamically
> immediately after a CPU has be realized if it is needed.
> The CPU will fill the sockets once after the other, according to the
> number of core per socket defined during the smp parsing.
> 
> Each CPU inside a socket will be represented by a bit in a 64bit
> unsigned long. Set on plug and clear on unplug of a CPU.
> 
> For the S390 CPU topology, thread and cores are merged into
> topology cores and the number of topology cores is the multiplication
> of cores by the numbers of threads.
> 
> Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>
> ---
>  hw/s390x/cpu-topology.c         | 391 ++++++++++++++++++++++++++++++++
>  hw/s390x/meson.build            |   1 +
>  hw/s390x/s390-virtio-ccw.c      |   6 +
>  include/hw/s390x/cpu-topology.h |  74 ++++++
>  target/s390x/cpu.h              |  47 ++++
>  5 files changed, 519 insertions(+)
>  create mode 100644 hw/s390x/cpu-topology.c
>  create mode 100644 include/hw/s390x/cpu-topology.h
> 
> diff --git a/hw/s390x/cpu-topology.c b/hw/s390x/cpu-topology.c
> new file mode 100644
> index 0000000000..0fd6f08084
> --- /dev/null
> +++ b/hw/s390x/cpu-topology.c
> @@ -0,0 +1,391 @@
> +/*
> + * CPU Topology
> + *
> + * Copyright 2022 IBM Corp.

Should be Copyright IBM Corp. 2022, and maybe even have a year range.

> + * Author(s): Pierre Morel <pmorel@linux.ibm.com>
> +
> + * This work is licensed under the terms of the GNU GPL, version 2 or (at
> + * your option) any later version. See the COPYING file in the top-level
> + * directory.
> + */
> +
> +#include "qemu/osdep.h"
> +#include "qapi/error.h"
> +#include "qemu/error-report.h"
> +#include "hw/sysbus.h"
> +#include "hw/s390x/cpu-topology.h"
> +#include "hw/qdev-properties.h"
> +#include "hw/boards.h"
> +#include "qemu/typedefs.h"
> +#include "target/s390x/cpu.h"
> +#include "hw/s390x/s390-virtio-ccw.h"
> +
> +/*
> + * s390_create_cores:
> + * @ms: Machine state
> + * @socket: the socket on which to create the core set
> + * @origin: the origin offset of the first core of the set
> + * @errp: Error pointer
> + *
> + * returns a pointer to the created S390TopologyCores structure
> + *
> + * On error: return NULL
> + */
> +static S390TopologyCores *s390_create_cores(MachineState *ms,
> +                                            S390TopologySocket *socket,
> +                                            int origin, Error **errp)
> +{
> +    DeviceState *dev;
> +    S390TopologyCores *cores;
> +
> +    if (socket->bus->num_children >= ms->smp.cores * ms->smp.threads) {
> +        error_setg(errp, "Unable to create more cores.");
> +        return NULL;
> +    }

Why/How can this happen?
The "location" of the CPU is a function of core_id and the same CPU should not be added twice.
If it's to enforce a limit on the smp arguments, that should happen earlier, in my opinion.
If it's necessary, you could also make the message more verbose and add ", maximum number reached".
> +
> +    dev = qdev_new(TYPE_S390_TOPOLOGY_CORES);
> +    qdev_realize_and_unref(dev, socket->bus, &error_fatal);

As a result of this, the order of cores in the socket bus is the creation order, correct?
So newest first and not ordered by the origin (since we can hot plug CPUs), correct?
> +
> +    cores = S390_TOPOLOGY_CORES(dev);
> +    cores->origin = origin;

I must admit that I haven't fully grokked the QEMU object model yet, but I'd be more comfortable
if you unref'ed cores after you set the origin.
Does the socket bus own the object after you unref it? Does it then make sense to return cores
after unref'ing it?
But then we don't support CPU unplug, so the object shouldn't just vanish.

> +    socket->cnt += 1;

cnt++ to be consistent with create_socket below.
> +
> +    return cores;
> +}
> +
> +/*
> + * s390_create_socket:
> + * @ms: Machine state
> + * @book: the book on which to create the socket
> + * @id: the socket id
> + * @errp: Error pointer
> + *
> + * returns a pointer to the created S390TopologySocket structure
> + *
> + * On error: return NULL
> + */
> +static S390TopologySocket *s390_create_socket(MachineState *ms,
> +                                              S390TopologyBook *book,
> +                                              int id, Error **errp)
> +{

Same questions/comments as above.

> +    DeviceState *dev;
> +    S390TopologySocket *socket;
> +
> +    if (book->bus->num_children >= ms->smp.sockets) {
> +        error_setg(errp, "Unable to create more sockets.");
> +        return NULL;
> +    }
> +
> +    dev = qdev_new(TYPE_S390_TOPOLOGY_SOCKET);
> +    qdev_realize_and_unref(dev, book->bus, &error_fatal);
> +
> +    socket = S390_TOPOLOGY_SOCKET(dev);
> +    socket->socket_id = id;
> +    book->cnt++;
> +
> +    return socket;
> +}
> +
> +/*
> + * s390_get_cores:
> + * @ms: Machine state
> + * @socket: the socket to search into
> + * @origin: the origin specified for the S390TopologyCores
> + * @errp: Error pointer
> + *
> + * returns a pointer to a S390TopologyCores structure within a socket having
> + * the specified origin.
> + * First search if the socket is already containing the S390TopologyCores
> + * structure and if not create one with this origin.
> + */
> +static S390TopologyCores *s390_get_cores(MachineState *ms,
> +                                         S390TopologySocket *socket,
> +                                         int origin, Error **errp)
> +{
> +    S390TopologyCores *cores;
> +    BusChild *kid;
> +
> +    QTAILQ_FOREACH(kid, &socket->bus->children, sibling) {
> +        cores = S390_TOPOLOGY_CORES(kid->child);
> +        if (cores->origin == origin) {
> +            return cores;
> +        }
> +    }
> +    return s390_create_cores(ms, socket, origin, errp);

I think calling create here is unintuitive.
You only use get_cores once, when creating a new CPU. I think doing

    cores = s390_get_cores(ms, socket, origin, errp);
    if (!cores) {
        cores = s390_create_cores(...);
    }
    if (!cores) {
        return false;
    }

is more straightforward and readable.
> +}
> +
> +/*
> + * s390_get_socket:
> + * @ms: Machine state
> + * @book: The book to search into
> + * @socket_id: the identifier of the socket to search for
> + * @errp: Error pointer
> + *
> + * returns a pointer to a S390TopologySocket structure within a book having
> + * the specified socket_id.
> + * First search if the book is already containing the S390TopologySocket
> + * structure and if not create one with this socket_id.
> + */
> +static S390TopologySocket *s390_get_socket(MachineState *ms,
> +                                           S390TopologyBook *book,
> +                                           int socket_id, Error **errp)
> +{
> +    S390TopologySocket *socket;
> +    BusChild *kid;
> +
> +    QTAILQ_FOREACH(kid, &book->bus->children, sibling) {
> +        socket = S390_TOPOLOGY_SOCKET(kid->child);
> +        if (socket->socket_id == socket_id) {
> +            return socket;
> +        }
> +    }
> +    return s390_create_socket(ms, book, socket_id, errp);

As above.

> +}
> +
> +/*
> + * s390_topology_new_cpu:
> + * @core_id: the core ID is machine wide
> + *
> + * We have a single book returned by s390_get_topology(),
> + * then we build the hierarchy on demand.
> + * Note that we do not destroy the hierarchy on error creating
> + * an entry in the topology, we just keep it empty.
> + * We do not need to worry about not finding a topology level
> + * entry this would have been caught during smp parsing.
> + */
> +bool s390_topology_new_cpu(MachineState *ms, int core_id, Error **errp)
> +{
> +    S390TopologyBook *book;
> +    S390TopologySocket *socket;
> +    S390TopologyCores *cores;
> +    int nb_cores_per_socket;

num_cores_per_socket instead?

> +    int origin, bit;
> +
> +    book = s390_get_topology();
> +
> +    nb_cores_per_socket = ms->smp.cores * ms->smp.threads;

We don't support the multithreading facility, do we?
So, I think we should assert smp.threads == 1 somewhere.
In any case I think the correct expression would round the threads up to the next power of 2,
because the core_id has the thread id in the lower bits, but threads per core doesn't need to be
a power of 2 according to the architecture.
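
To make the rounding concrete, here is a rough sketch, assuming the thread
id really occupies the low bits of core_id as described above. Both helper
names are made up for illustration and do not exist in QEMU.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: round n up to the next power of two (n >= 1). */
static inline uint32_t round_up_pow2(uint32_t n)
{
    uint32_t p = 1;

    assert(n >= 1 && n <= (UINT32_C(1) << 31));
    while (p < n) {
        p <<= 1;
    }
    return p;
}

/*
 * Hypothetical helper: number of core_id values spanned by one socket.
 * Assumption: the thread id sits in the low bits of core_id, so each
 * core reserves a power-of-two sized block of ids.
 */
static inline uint32_t ids_per_socket(uint32_t cores, uint32_t threads)
{
    return cores * round_up_pow2(threads);
}
```

With cores=4 and threads=3 this gives 16 ids per socket rather than 12,
so core_id / ids_per_socket(4, 3) would select the socket. With
threads == 1 it degenerates to the plain number of cores.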

> +
> +    socket = s390_get_socket(ms, book, core_id / nb_cores_per_socket, errp);
> +    if (!socket) {
> +        return false;
> +    }
> +
> +    /*
> +     * At the core level, each CPU is represented by a bit in a 64bit
> +     * unsigned long. Set on plug and clear on unplug of a CPU.
> +     * The firmware assume that all CPU in the core description have the same
> +     * type, polarization and are all dedicated or shared.
> +     * In the case a socket contains CPU with different type, polarization
> +     * or dedication then they will be defined in different CPU containers.
> +     * Currently we assume all CPU are identical and the only reason to have
> +     * several S390TopologyCores inside a socket is to have more than 64 CPUs
> +     * in that case the origin field, representing the offset of the first CPU
> +     * in the CPU container allows to represent up to the maximal number of
> +     * CPU inside several CPU containers inside the socket container.
> +     */
> +    origin = 64 * (core_id / 64);
> +
> +    cores = s390_get_cores(ms, socket, origin, errp);
> +    if (!cores) {
> +        return false;
> +    }
> +
> +    bit = 63 - (core_id - origin);
> +    set_bit(bit, &cores->mask);
> +    cores->origin = origin;

This is redundant, origin is already set.
Also I think you should generally pass the core_id and not the origin.
Then on construction you can also set the bit.
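
Roughly what I have in mind, as a standalone sketch that mirrors the
arithmetic in the patch (origin = 64 * (core_id / 64), bit = 63 - offset).
The helper names and CORES_PER_MASK are hypothetical, chosen only for
this example.

```c
#include <assert.h>
#include <stdint.h>

#define CORES_PER_MASK 64  /* one 64-bit mask per CPU-type entry */

/* Offset of the first CPU in the 64-CPU mask that holds core_id. */
static inline uint16_t core_origin(int core_id)
{
    assert(core_id >= 0);
    return (uint16_t)(CORES_PER_MASK * (core_id / CORES_PER_MASK));
}

/* Mask with only core_id's bit set; the CPU at the origin maps to bit 63. */
static inline uint64_t core_mask_bit(int core_id)
{
    assert(core_id >= 0);
    int bit = 63 - (core_id % CORES_PER_MASK);

    return UINT64_C(1) << bit;
}
```

A constructor taking core_id could then set both fields in one place,
for example cores->origin = core_origin(core_id) and
cores->mask |= core_mask_bit(core_id), so the caller never has to compute
the origin itself.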

> +
> +    return true;
> +}
> +
> +/*
> + * Setting the first topology: 1 book, 1 socket
> + * This is enough for 64 cores if the topology is flat (single socket)
> + */
> +void s390_topology_setup(MachineState *ms)
> +{
> +    DeviceState *dev;
> +
> +    /* Create BOOK bridge device */
> +    dev = qdev_new(TYPE_S390_TOPOLOGY_BOOK);
> +    object_property_add_child(qdev_get_machine(),
> +                              TYPE_S390_TOPOLOGY_BOOK, OBJECT(dev));

Why add it to the machine instead of directly using a static?
So it's visible to the user via info qtree or something?
Would that even be the appropriate location to show that?

> +    sysbus_realize_and_unref(SYS_BUS_DEVICE(dev), &error_fatal);
> +}
> +
> +S390TopologyBook *s390_get_topology(void)
> +{
> +    static S390TopologyBook *book;
> +
> +    if (!book) {
> +        book = S390_TOPOLOGY_BOOK(
> +            object_resolve_path(TYPE_S390_TOPOLOGY_BOOK, NULL));
> +        assert(book != NULL);
> +    }
> +
> +    return book;
> +}
> +
> +/* --- CORES Definitions --- */
> +
> +static Property s390_topology_cores_properties[] = {
> +    DEFINE_PROP_BOOL("dedicated", S390TopologyCores, dedicated, false),
> +    DEFINE_PROP_UINT8("polarity", S390TopologyCores, polarity,
> +                      S390_TOPOLOGY_POLARITY_H),
> +    DEFINE_PROP_UINT8("cputype", S390TopologyCores, cputype,
> +                      S390_TOPOLOGY_CPU_TYPE),
> +    DEFINE_PROP_UINT16("origin", S390TopologyCores, origin, 0),
> +    DEFINE_PROP_UINT64("mask", S390TopologyCores, mask, 0),
> +    DEFINE_PROP_END_OF_LIST(),
> +};
> +
> +static void cpu_cores_class_init(ObjectClass *oc, void *data)
> +{
> +    DeviceClass *dc = DEVICE_CLASS(oc);
> +    HotplugHandlerClass *hc = HOTPLUG_HANDLER_CLASS(oc);
> +
> +    device_class_set_props(dc, s390_topology_cores_properties);
> +    hc->unplug = qdev_simple_device_unplug_cb;
> +    dc->bus_type = TYPE_S390_TOPOLOGY_SOCKET_BUS;
> +    dc->desc = "topology cpu entry";
> +}
> +
> +static const TypeInfo cpu_cores_info = {
> +    .name          = TYPE_S390_TOPOLOGY_CORES,
> +    .parent        = TYPE_DEVICE,
> +    .instance_size = sizeof(S390TopologyCores),
> +    .class_init    = cpu_cores_class_init,
> +    .interfaces = (InterfaceInfo[]) {
> +        { TYPE_HOTPLUG_HANDLER },

Why implement the hotplug interface? That is not actually supported, is it?
> +        { }
> +    }
> +};
> +
> +static char *socket_bus_get_dev_path(DeviceState *dev)
> +{
> +    S390TopologySocket *socket = S390_TOPOLOGY_SOCKET(dev);
> +    DeviceState *book = dev->parent_bus->parent;
> +    char *id = qdev_get_dev_path(book);
> +    char *ret;
> +
> +    if (id) {
> +        ret = g_strdup_printf("%s:%02d", id, socket->socket_id);
> +        g_free(id);
> +    } else {
> +        ret = g_strdup_printf("_:%02d", socket->socket_id);

How can this case occur? Sockets get attached to the book bus immediately after creation, correct?

> +    }
> +
> +    return ret;
> +}
> +
> +static void socket_bus_class_init(ObjectClass *oc, void *data)
> +{
> +    BusClass *k = BUS_CLASS(oc);
> +
> +    k->get_dev_path = socket_bus_get_dev_path;
> +    k->max_dev = S390_MAX_SOCKETS;

This is the bus the cores are attached to, correct?
Is this constant badly named, or should this be MAX_CORES (which doesn't exist)?
How does this limit get enforced?
Why is there a limit in the first place? I don't see one defined by STSI, other than having to fit in a u8.
> +}
> +
> +static const TypeInfo socket_bus_info = {
> +    .name = TYPE_S390_TOPOLOGY_SOCKET_BUS,
> +    .parent = TYPE_BUS,
> +    .instance_size = 0,

After a bit of grepping it seems to me that omitting that field is more common than setting it to 0.

> +    .class_init = socket_bus_class_init,
> +};
> +
> +static void s390_socket_device_realize(DeviceState *dev, Error **errp)
> +{
> +    S390TopologySocket *socket = S390_TOPOLOGY_SOCKET(dev);
> +    BusState *bus;
> +
> +    bus = qbus_new(TYPE_S390_TOPOLOGY_SOCKET_BUS, dev,
> +                   TYPE_S390_TOPOLOGY_SOCKET_BUS);
> +    qbus_set_hotplug_handler(bus, OBJECT(dev));
> +    socket->bus = bus;
> +}
> +
> +static void socket_class_init(ObjectClass *oc, void *data)
> +{
> +    DeviceClass *dc = DEVICE_CLASS(oc);
> +    HotplugHandlerClass *hc = HOTPLUG_HANDLER_CLASS(oc);
> +
> +    hc->unplug = qdev_simple_device_unplug_cb;
> +    set_bit(DEVICE_CATEGORY_BRIDGE, dc->categories);
> +    dc->bus_type = TYPE_S390_TOPOLOGY_BOOK_BUS;
> +    dc->realize = s390_socket_device_realize;
> +    dc->desc = "topology socket";
> +}
> +
> +static const TypeInfo socket_info = {
> +    .name          = TYPE_S390_TOPOLOGY_SOCKET,
> +    .parent        = TYPE_DEVICE,
> +    .instance_size = sizeof(S390TopologySocket),
> +    .class_init    = socket_class_init,
> +    .interfaces = (InterfaceInfo[]) {
> +        { TYPE_HOTPLUG_HANDLER },
> +        { }
> +    }
> +};
> +
> +static char *book_bus_get_dev_path(DeviceState *dev)
> +{
> +    return g_strdup("00");
> +}
> +
> +static void book_bus_class_init(ObjectClass *oc, void *data)
> +{
> +    BusClass *k = BUS_CLASS(oc);
> +
> +    k->get_dev_path = book_bus_get_dev_path;
> +    k->max_dev = S390_MAX_BOOKS;

Same question as for socket_bus_class_init here.

> +}
> +
> +static const TypeInfo book_bus_info = {
> +    .name = TYPE_S390_TOPOLOGY_BOOK_BUS,
> +    .parent = TYPE_BUS,
> +    .instance_size = 0,
> +    .class_init = book_bus_class_init,
> +};
> +
> +static void s390_book_device_realize(DeviceState *dev, Error **errp)
> +{
> +    S390TopologyBook *book = S390_TOPOLOGY_BOOK(dev);
> +    BusState *bus;
> +
> +    bus = qbus_new(TYPE_S390_TOPOLOGY_BOOK_BUS, dev,
> +                   TYPE_S390_TOPOLOGY_BOOK_BUS);
> +    qbus_set_hotplug_handler(bus, OBJECT(dev));
> +    book->bus = bus;
> +}
> +
> +static void book_class_init(ObjectClass *oc, void *data)
> +{
> +    DeviceClass *dc = DEVICE_CLASS(oc);
> +    HotplugHandlerClass *hc = HOTPLUG_HANDLER_CLASS(oc);
> +
> +    hc->unplug = qdev_simple_device_unplug_cb;
> +    set_bit(DEVICE_CATEGORY_BRIDGE, dc->categories);
> +    dc->realize = s390_book_device_realize;
> +    dc->desc = "topology book";
> +}
> +
> +static const TypeInfo book_info = {
> +    .name          = TYPE_S390_TOPOLOGY_BOOK,
> +    .parent        = TYPE_SYS_BUS_DEVICE,
> +    .instance_size = sizeof(S390TopologyBook),
> +    .class_init    = book_class_init,
> +    .interfaces = (InterfaceInfo[]) {
> +        { TYPE_HOTPLUG_HANDLER },
> +        { }
> +    }
> +};
> +
> +static void topology_register(void)
> +{
> +    type_register_static(&cpu_cores_info);
> +    type_register_static(&socket_bus_info);
> +    type_register_static(&socket_info);
> +    type_register_static(&book_bus_info);
> +    type_register_static(&book_info);
> +}
> +
> +type_init(topology_register);
> diff --git a/hw/s390x/meson.build b/hw/s390x/meson.build
> index feefe0717e..3592fa952b 100644
> --- a/hw/s390x/meson.build
> +++ b/hw/s390x/meson.build
> @@ -2,6 +2,7 @@ s390x_ss = ss.source_set()
>  s390x_ss.add(files(
>    'ap-bridge.c',
>    'ap-device.c',
> +  'cpu-topology.c',
>    'ccw-device.c',
>    'css-bridge.c',
>    'css.c',
> diff --git a/hw/s390x/s390-virtio-ccw.c b/hw/s390x/s390-virtio-ccw.c
> index cc3097bfee..a586875b24 100644
> --- a/hw/s390x/s390-virtio-ccw.c
> +++ b/hw/s390x/s390-virtio-ccw.c
> @@ -43,6 +43,7 @@
>  #include "sysemu/sysemu.h"
>  #include "hw/s390x/pv.h"
>  #include "migration/blocker.h"
> +#include "hw/s390x/cpu-topology.h"
>  
>  static Error *pv_mig_blocker;
>  
> @@ -89,6 +90,7 @@ static void s390_init_cpus(MachineState *machine)
>      /* initialize possible_cpus */
>      mc->possible_cpu_arch_ids(machine);
>  
> +    s390_topology_setup(machine);
>      for (i = 0; i < machine->smp.cpus; i++) {
>          s390x_new_cpu(machine->cpu_type, i, &error_fatal);
>      }
> @@ -306,6 +308,10 @@ static void s390_cpu_plug(HotplugHandler *hotplug_dev,
>      g_assert(!ms->possible_cpus->cpus[cpu->env.core_id].cpu);
>      ms->possible_cpus->cpus[cpu->env.core_id].cpu = OBJECT(dev);
>  
> +    if (!s390_topology_new_cpu(ms, cpu->env.core_id, errp)) {
> +        return;
> +    }
> +
>      if (dev->hotplugged) {
>          raise_irq_cpu_hotplug();
>      }
> diff --git a/include/hw/s390x/cpu-topology.h b/include/hw/s390x/cpu-topology.h
> new file mode 100644
> index 0000000000..beec61706c
> --- /dev/null
> +++ b/include/hw/s390x/cpu-topology.h
> @@ -0,0 +1,74 @@
> +/*
> + * CPU Topology
> + *
> + * Copyright 2022 IBM Corp.

Same issue as with .c copyright notice.
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2 or (at
> + * your option) any later version. See the COPYING file in the top-level
> + * directory.
> + */
> +#ifndef HW_S390X_CPU_TOPOLOGY_H
> +#define HW_S390X_CPU_TOPOLOGY_H
> +
> +#include "hw/qdev-core.h"
> +#include "qom/object.h"
> +
> +#define S390_TOPOLOGY_CPU_TYPE    0x03

This is the IFL type, right? If so the name should reflect it.
> +
> +#define S390_TOPOLOGY_POLARITY_H  0x00
> +#define S390_TOPOLOGY_POLARITY_VL 0x01
> +#define S390_TOPOLOGY_POLARITY_VM 0x02
> +#define S390_TOPOLOGY_POLARITY_VH 0x03

Why not use an enum?
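
Something along these lines, as a sketch only. The values are the ones
from the defines above; the type name S390TopologyPolarity and the
validity helper are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical enum replacing the S390_TOPOLOGY_POLARITY_* defines. */
typedef enum S390TopologyPolarity {
    S390_TOPOLOGY_POLARITY_H  = 0x00,  /* horizontal */
    S390_TOPOLOGY_POLARITY_VL = 0x01,  /* vertical, low */
    S390_TOPOLOGY_POLARITY_VM = 0x02,  /* vertical, medium */
    S390_TOPOLOGY_POLARITY_VH = 0x03,  /* vertical, high */
} S390TopologyPolarity;

/* Hypothetical helper: check that a raw value fits the 2-bit field. */
static inline bool s390_polarity_is_valid(int value)
{
    return value >= S390_TOPOLOGY_POLARITY_H &&
           value <= S390_TOPOLOGY_POLARITY_VH;
}
```

The enum keeps the four values grouped under one type, which also gives
the compiler a chance to warn about unhandled cases in a switch.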
> +
> +#define TYPE_S390_TOPOLOGY_CORES "topology cores"

Seems to me that using a - instead of a space is the usual way of doing things.
> +    /*
> +     * Each CPU inside a socket will be represented by a bit in a 64bit
> +     * unsigned long. Set on plug and clear on unplug of a CPU.
> +     * All CPU inside a mask share the same dedicated, polarity and
> +     * cputype values.
> +     * The origin is the offset of the first CPU in a mask.
> +     */
> +struct S390TopologyCores {
> +    DeviceState parent_obj;
> +    int id;
> +    bool dedicated;
> +    uint8_t polarity;
> +    uint8_t cputype;

Why not snake_case for cpu type?

> +    uint16_t origin;
> +    uint64_t mask;
> +    int cnt;

num_cores instead ?

> +};
> +typedef struct S390TopologyCores S390TopologyCores;
> +OBJECT_DECLARE_SIMPLE_TYPE(S390TopologyCores, S390_TOPOLOGY_CORES)
> +
> +#define TYPE_S390_TOPOLOGY_SOCKET "topology socket"
> +#define TYPE_S390_TOPOLOGY_SOCKET_BUS "socket-bus"
> +struct S390TopologySocket {
> +    DeviceState parent_obj;
> +    BusState *bus;
> +    int socket_id;
> +    int cnt;
> +};
> +typedef struct S390TopologySocket S390TopologySocket;
> +OBJECT_DECLARE_SIMPLE_TYPE(S390TopologySocket, S390_TOPOLOGY_SOCKET)
> +#define S390_MAX_SOCKETS 4
> +
> +#define TYPE_S390_TOPOLOGY_BOOK "topology book"
> +#define TYPE_S390_TOPOLOGY_BOOK_BUS "book-bus"
> +struct S390TopologyBook {
> +    SysBusDevice parent_obj;
> +    BusState *bus;
> +    int book_id;
> +    int cnt;
> +};
> +typedef struct S390TopologyBook S390TopologyBook;
> +OBJECT_DECLARE_SIMPLE_TYPE(S390TopologyBook, S390_TOPOLOGY_BOOK)
> +#define S390_MAX_BOOKS 1
> +
> +S390TopologyBook *s390_init_topology(void);
> +
> +S390TopologyBook *s390_get_topology(void);
> +void s390_topology_setup(MachineState *ms);
> +bool s390_topology_new_cpu(MachineState *ms, int core_id, Error **errp);
> +
> +#endif
> diff --git a/target/s390x/cpu.h b/target/s390x/cpu.h
> index 7d6d01325b..216adfde26 100644
> --- a/target/s390x/cpu.h
> +++ b/target/s390x/cpu.h

I think these definitions should be moved to the STSI patch since they're not used in this one.

> @@ -565,6 +565,53 @@ typedef union SysIB {
>  } SysIB;
>  QEMU_BUILD_BUG_ON(sizeof(SysIB) != 4096);
>  
> +/* CPU type Topology List Entry */
> +typedef struct SysIBTl_cpu {
> +        uint8_t nl;
> +        uint8_t reserved0[3];
> +        uint8_t reserved1:5;
> +        uint8_t dedicated:1;
> +        uint8_t polarity:2;
> +        uint8_t type;
> +        uint16_t origin;
> +        uint64_t mask;
> +} SysIBTl_cpu;
> +QEMU_BUILD_BUG_ON(sizeof(SysIBTl_cpu) != 16);
> +
> +/* Container type Topology List Entry */
> +typedef struct SysIBTl_container {
> +        uint8_t nl;
> +        uint8_t reserved[6];
> +        uint8_t id;
> +} QEMU_PACKED SysIBTl_container;
> +QEMU_BUILD_BUG_ON(sizeof(SysIBTl_container) != 8);
> +
> +/* Generic Topology List Entry */
> +typedef union SysIBTl_entry {
> +        uint8_t nl;
> +        SysIBTl_container container;
> +        SysIBTl_cpu cpu;
> +} SysIBTl_entry;

I don't like this union, it's only used in SysIB_151x below and that's misleading,
because the entries are packed without padding, but the union members have different
sizes.

> +
> +#define TOPOLOGY_NR_MAG  6
> +#define TOPOLOGY_NR_MAG6 0
> +#define TOPOLOGY_NR_MAG5 1
> +#define TOPOLOGY_NR_MAG4 2
> +#define TOPOLOGY_NR_MAG3 3
> +#define TOPOLOGY_NR_MAG2 4
> +#define TOPOLOGY_NR_MAG1 5
> +/* Configuration topology */
> +typedef struct SysIB_151x {
> +    uint8_t  res0[2];
> +    uint16_t length;
> +    uint8_t  mag[TOPOLOGY_NR_MAG];
> +    uint8_t  res1;
> +    uint8_t  mnest;
> +    uint32_t res2;
> +    SysIBTl_entry tle[0];

I think this should just be a uint64_t[] or uint64_t[0], whichever is QEMU style.
> +} SysIB_151x;
> +QEMU_BUILD_BUG_ON(sizeof(SysIB_151x) != 16);
> +
>  /* MMU defines */
>  #define ASCE_ORIGIN           (~0xfffULL) /* segment table origin             */
>  #define ASCE_SUBSPACE         0x200       /* subspace group control           */



* Re: [PATCH v8 02/12] s390x/cpu_topology: CPU topology objects and structures
  2022-07-12 15:40   ` Janis Schoetterl-Glausch
@ 2022-07-13 14:59     ` Pierre Morel
  2022-07-14 10:38       ` Janis Schoetterl-Glausch
  0 siblings, 1 reply; 49+ messages in thread
From: Pierre Morel @ 2022-07-13 14:59 UTC (permalink / raw)
  To: Janis Schoetterl-Glausch, qemu-s390x
  Cc: qemu-devel, borntraeger, pasic, richard.henderson, david, thuth,
	cohuck, mst, pbonzini, kvm, ehabkost, marcel.apfelbaum, eblake,
	armbru, seiden, nrb, frankja



On 7/12/22 17:40, Janis Schoetterl-Glausch wrote:
> On 6/20/22 16:03, Pierre Morel wrote:
>> We use new objects to have a dynamic administration of the CPU topology.
>> The highest level object in this implementation is the s390 book and
>> in this first implementation of CPU topology for S390 we have a single
>> book.
>> The book is built as a SYSBUS bridge during the CPU initialization.
>> Other objects, sockets and core will be built after the parsing
>> of the QEMU -smp argument.
>>
>> Every object under this single book will be built dynamically
>> immediately after a CPU has been realized if it is needed.
>> The CPU will fill the sockets one after the other, according to the
>> number of cores per socket defined during the smp parsing.
>>
>> Each CPU inside a socket will be represented by a bit in a 64bit
>> unsigned long. Set on plug and clear on unplug of a CPU.
>>
>> For the S390 CPU topology, thread and cores are merged into
>> topology cores and the number of topology cores is the multiplication
>> of cores by the numbers of threads.
>>
>> Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>
>> ---
>>   hw/s390x/cpu-topology.c         | 391 ++++++++++++++++++++++++++++++++
>>   hw/s390x/meson.build            |   1 +
>>   hw/s390x/s390-virtio-ccw.c      |   6 +
>>   include/hw/s390x/cpu-topology.h |  74 ++++++
>>   target/s390x/cpu.h              |  47 ++++
>>   5 files changed, 519 insertions(+)
>>   create mode 100644 hw/s390x/cpu-topology.c
>>   create mode 100644 include/hw/s390x/cpu-topology.h
>>
>> diff --git a/hw/s390x/cpu-topology.c b/hw/s390x/cpu-topology.c
>> new file mode 100644
>> index 0000000000..0fd6f08084
>> --- /dev/null
>> +++ b/hw/s390x/cpu-topology.c
>> @@ -0,0 +1,391 @@
>> +/*
>> + * CPU Topology
>> + *
>> + * Copyright 2022 IBM Corp.
> 
> Should be Copyright IBM Corp. 2022, and maybe even have a year range.

OK

> 
>> + * Author(s): Pierre Morel <pmorel@linux.ibm.com>
>> +
>> + * This work is licensed under the terms of the GNU GPL, version 2 or (at
>> + * your option) any later version. See the COPYING file in the top-level
>> + * directory.
>> + */
>> +
>> +#include "qemu/osdep.h"
>> +#include "qapi/error.h"
>> +#include "qemu/error-report.h"
>> +#include "hw/sysbus.h"
>> +#include "hw/s390x/cpu-topology.h"
>> +#include "hw/qdev-properties.h"
>> +#include "hw/boards.h"
>> +#include "qemu/typedefs.h"
>> +#include "target/s390x/cpu.h"
>> +#include "hw/s390x/s390-virtio-ccw.h"
>> +
>> +/*
>> + * s390_create_cores:
>> + * @ms: Machine state
>> + * @socket: the socket on which to create the core set
>> + * @origin: the origin offset of the first core of the set
>> + * @errp: Error pointer
>> + *
>> + * returns a pointer to the created S390TopologyCores structure
>> + *
>> + * On error: return NULL
>> + */
>> +static S390TopologyCores *s390_create_cores(MachineState *ms,
>> +                                            S390TopologySocket *socket,
>> +                                            int origin, Error **errp)
>> +{
>> +    DeviceState *dev;
>> +    S390TopologyCores *cores;
>> +
>> +    if (socket->bus->num_children >= ms->smp.cores * ms->smp.threads) {
>> +        error_setg(errp, "Unable to create more cores.");
>> +        return NULL;
>> +    }
> 
> Why/How can this happen?
> The "location" of the CPU is a function of core_id and the same CPU should not be added twice.
> If it's to enforce a limit on the smp arguments that should happen earlier in my opinion.
> If it's necessary, you could also make the message more verbose and add ", maximum number reached".

It should not happen.

>> +
>> +    dev = qdev_new(TYPE_S390_TOPOLOGY_CORES);
>> +    qdev_realize_and_unref(dev, socket->bus, &error_fatal);
> 
> As a result of this, the order of cores in the socket bus is the creation order, correct?
> So newest first and not ordered by the origin (since we can hot plug CPUs), correct?

yes

>> +
>> +    cores = S390_TOPOLOGY_CORES(dev);
>> +    cores->origin = origin;
> 
> I must admit that I haven't fully grokked the qemu object model, yet, but I'd be more comfortable
> if you unref'ed cores after you set the origin.
> Does the socket bus own the object after you unref it? Does it then make sense to return cores
> after unref'ing it?

AFAIU yes the bus owns it.

> But then we don't support CPU unplug, so the object shouldn't just vanish.
> 
>> +    socket->cnt += 1;
> 
> cnt++ to be consistent with create_socket below.

OK

>> +
>> +    return cores;
>> +}
>> +
>> +/*
>> + * s390_create_socket:
>> + * @ms: Machine state
>> + * @book: the book on which to create the socket
>> + * @id: the socket id
>> + * @errp: Error pointer
>> + *
>> + * returns a pointer to the created S390TopologySocket structure
>> + *
>> + * On error: return NULL
>> + */
>> +static S390TopologySocket *s390_create_socket(MachineState *ms,
>> +                                              S390TopologyBook *book,
>> +                                              int id, Error **errp)
>> +{
> 
> Same questions/comments as above.

Same answer, should not happen.
I will remove the check.

> 
>> +    DeviceState *dev;
>> +    S390TopologySocket *socket;
>> +
>> +    if (book->bus->num_children >= ms->smp.sockets) {
>> +        error_setg(errp, "Unable to create more sockets.");
>> +        return NULL;
>> +    }
>> +
>> +    dev = qdev_new(TYPE_S390_TOPOLOGY_SOCKET);
>> +    qdev_realize_and_unref(dev, book->bus, &error_fatal);
>> +
>> +    socket = S390_TOPOLOGY_SOCKET(dev);
>> +    socket->socket_id = id;
>> +    book->cnt++;
>> +
>> +    return socket;
>> +}
>> +
>> +/*
>> + * s390_get_cores:
>> + * @ms: Machine state
>> + * @socket: the socket to search into
>> + * @origin: the origin specified for the S390TopologyCores
>> + * @errp: Error pointer
>> + *
>> + * returns a pointer to a S390TopologyCores structure within a socket having
>> + * the specified origin.
>> + * First search if the socket is already containing the S390TopologyCores
>> + * structure and if not create one with this origin.
>> + */
>> +static S390TopologyCores *s390_get_cores(MachineState *ms,
>> +                                         S390TopologySocket *socket,
>> +                                         int origin, Error **errp)
>> +{
>> +    S390TopologyCores *cores;
>> +    BusChild *kid;
>> +
>> +    QTAILQ_FOREACH(kid, &socket->bus->children, sibling) {
>> +        cores = S390_TOPOLOGY_CORES(kid->child);
>> +        if (cores->origin == origin) {
>> +            return cores;
>> +        }
>> +    }
>> +    return s390_create_cores(ms, socket, origin, errp);
> 
> I think calling create here is unintuitive.
> You only use get_cores once when creating a new cpu, I think doing
> 
>      cores = s390_get_cores(ms, socket, origin, errp);
>      if (!cores) {
>          cores = s390_create_cores(...);
>      }
>      if (!cores) {
>          return false;
>      }
> 
> is more straight forward and readable.

I will keep it so as the creation can not fail.
As it was in the first series, I made this change later, and it was not 
a good idea.

>> +}
>> +
>> +/*
>> + * s390_get_socket:
>> + * @ms: Machine state
>> + * @book: The book to search into
>> + * @socket_id: the identifier of the socket to search for
>> + * @errp: Error pointer
>> + *
>> + * returns a pointer to a S390TopologySocket structure within a book having
>> + * the specified socket_id.
>> + * First search if the book is already containing the S390TopologySocket
>> + * structure and if not create one with this socket_id.
>> + */
>> +static S390TopologySocket *s390_get_socket(MachineState *ms,
>> +                                           S390TopologyBook *book,
>> +                                           int socket_id, Error **errp)
>> +{
>> +    S390TopologySocket *socket;
>> +    BusChild *kid;
>> +
>> +    QTAILQ_FOREACH(kid, &book->bus->children, sibling) {
>> +        socket = S390_TOPOLOGY_SOCKET(kid->child);
>> +        if (socket->socket_id == socket_id) {
>> +            return socket;
>> +        }
>> +    }
>> +    return s390_create_socket(ms, book, socket_id, errp);
> 
> As above.
> 
>> +}
>> +
>> +/*
>> + * s390_topology_new_cpu:
>> + * @core_id: the core ID is machine wide
>> + *
>> + * We have a single book returned by s390_get_topology(),
>> + * then we build the hierarchy on demand.
>> + * Note that we do not destroy the hierarchy on error creating
>> + * an entry in the topology, we just keep it empty.
>> + * We do not need to worry about not finding a topology level
>> + * entry this would have been caught during smp parsing.
>> + */
>> +bool s390_topology_new_cpu(MachineState *ms, int core_id, Error **errp)
>> +{
>> +    S390TopologyBook *book;
>> +    S390TopologySocket *socket;
>> +    S390TopologyCores *cores;
>> +    int nb_cores_per_socket;
> 
> num_cores_per_socket instead?
> 
>> +    int origin, bit;
>> +
>> +    book = s390_get_topology();
>> +
>> +    nb_cores_per_socket = ms->smp.cores * ms->smp.threads;
> 
> We don't support the multithreading facility, do we?
> So, I think we should assert smp.threads == 1 somewhere.
> In any case I think the correct expression would round the threads up to the next power of 2,
> because the core_id has the thread id in the lower bits, but threads per core doesn't need to be
> a power of 2 according to the architecture.

That is right.
I will add that.
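
A minimal sketch of the rounding, assuming core_id keeps the thread id
in its low bits as described above. The helper names are hypothetical
and not part of the patch:

```c
#include <assert.h>

/* Round n up to the next power of two: 1 -> 1, 3 -> 4, 4 -> 4. */
static unsigned int round_up_pow2(unsigned int n)
{
    unsigned int p = 1;

    assert(n >= 1 && n <= (1u << 30));
    while (p < n) {
        p <<= 1;
    }
    return p;
}

/*
 * Hypothetical helper: how many core_id values one socket spans when
 * each core reserves a power-of-two sized range of ids for its threads.
 */
static unsigned int ids_per_socket(unsigned int cores, unsigned int threads)
{
    return cores * round_up_pow2(threads);
}

/* Hypothetical helper: socket index a given core_id falls into. */
static unsigned int socket_of(unsigned int core_id, unsigned int cores,
                              unsigned int threads)
{
    return core_id / ids_per_socket(cores, threads);
}
```

With threads == 1 this reduces to the cores * threads expression used in
the patch, so an assert on smp.threads and the rounding do not conflict.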


> 
>> +
>> +    socket = s390_get_socket(ms, book, core_id / nb_cores_per_socket, errp);
>> +    if (!socket) {
>> +        return false;
>> +    }
>> +
>> +    /*
>> +     * At the core level, each CPU is represented by a bit in a 64bit
>> +     * unsigned long. Set on plug and clear on unplug of a CPU.
>> +     * The firmware assume that all CPU in the core description have the same
>> +     * type, polarization and are all dedicated or shared.
>> +     * In the case a socket contains CPU with different type, polarization
>> +     * or dedication then they will be defined in different CPU containers.
>> +     * Currently we assume all CPU are identical and the only reason to have
>> +     * several S390TopologyCores inside a socket is to have more than 64 CPUs
>> +     * in that case the origin field, representing the offset of the first CPU
>> +     * in the CPU container allows to represent up to the maximal number of
>> +     * CPU inside several CPU containers inside the socket container.
>> +     */
>> +    origin = 64 * (core_id / 64);
>> +
>> +    cores = s390_get_cores(ms, socket, origin, errp);
>> +    if (!cores) {
>> +        return false;
>> +    }
>> +
>> +    bit = 63 - (core_id - origin);
>> +    set_bit(bit, &cores->mask);
>> +    cores->origin = origin;
> 
> This is redundant, origin is already set.

right

> Also I think you should generally pass the core_id and not the origin.
> Then on construction you can also set the bit.

OK
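
For reference, a standalone sketch of the arithmetic when only core_id
is passed down. It uses the same expressions as the patch (origin is
64 * (core_id / 64), bit is 63 - (core_id - origin)); the function names
are hypothetical:

```c
#include <assert.h>
#include <stdint.h>

/* Offset of the first CPU of the 64-CPU group that holds core_id. */
static inline uint16_t topo_origin(int core_id)
{
    assert(core_id >= 0);
    return (uint16_t)(64 * (core_id / 64));
}

/*
 * Mask with only the bit for core_id set. The first CPU of a group
 * (core_id == origin) maps to the most significant bit of the mask.
 */
static inline uint64_t topo_mask_bit(int core_id)
{
    int bit = 63 - (core_id - topo_origin(core_id));

    return UINT64_C(1) << bit;
}
```

The constructor could then take core_id, derive the origin itself and
OR topo_mask_bit(core_id) into the mask in one place.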

> 
>> +
>> +    return true;
>> +}
>> +
>> +/*
>> + * Setting the first topology: 1 book, 1 socket
>> + * This is enough for 64 cores if the topology is flat (single socket)
>> + */
>> +void s390_topology_setup(MachineState *ms)
>> +{
>> +    DeviceState *dev;
>> +
>> +    /* Create BOOK bridge device */
>> +    dev = qdev_new(TYPE_S390_TOPOLOGY_BOOK);
>> +    object_property_add_child(qdev_get_machine(),
>> +                              TYPE_S390_TOPOLOGY_BOOK, OBJECT(dev));
> 
> Why add it to the machine instead of directly using a static?

In my opinion, it is a characteristic of the machine.

> So it's visible to the user via info qtree or something?

It is already visible to the user on info qtree.

> Would that even be the appropriate location to show that?

That is a very good question, and I would really appreciate it if we
discussed the design before diving into details.

The idea is to expose the architecture details as objects in the qtree,
so that we can plug new drawers/books/sockets/cores and, in the future
when the infrastructure allows it, unplug them.

There is an info numa command (info cpus does not give much
information) that reports on nodes but, AFAIU, a node is more of a
theoretical construct that can be used above the virtual architecture
(sockets/cores) to specify characteristics like distance and associated
memory.

As I understand it, it can be used above sockets and, for us, above
books or drawers too, as in:

-numa cpu,node-id=0,socket-id=0

All cores in socket 0 belong to node 0

or
-numa cpu,node-id=1,drawer-id=1

all cores from all sockets of drawer 1 belong to node 1


As there is no info socket, I think that for now we do not need an info
book/drawer either; we have everything in qtree.


> 
>> +    sysbus_realize_and_unref(SYS_BUS_DEVICE(dev), &error_fatal);
>> +}
>> +
>> +S390TopologyBook *s390_get_topology(void)
>> +{
>> +    static S390TopologyBook *book;
>> +
>> +    if (!book) {
>> +        book = S390_TOPOLOGY_BOOK(
>> +            object_resolve_path(TYPE_S390_TOPOLOGY_BOOK, NULL));
>> +        assert(book != NULL);
>> +    }
>> +
>> +    return book;
>> +}
>> +
>> +/* --- CORES Definitions --- */
>> +
>> +static Property s390_topology_cores_properties[] = {
>> +    DEFINE_PROP_BOOL("dedicated", S390TopologyCores, dedicated, false),
>> +    DEFINE_PROP_UINT8("polarity", S390TopologyCores, polarity,
>> +                      S390_TOPOLOGY_POLARITY_H),
>> +    DEFINE_PROP_UINT8("cputype", S390TopologyCores, cputype,
>> +                      S390_TOPOLOGY_CPU_TYPE),
>> +    DEFINE_PROP_UINT16("origin", S390TopologyCores, origin, 0),
>> +    DEFINE_PROP_UINT64("mask", S390TopologyCores, mask, 0),
>> +    DEFINE_PROP_END_OF_LIST(),
>> +};
>> +
>> +static void cpu_cores_class_init(ObjectClass *oc, void *data)
>> +{
>> +    DeviceClass *dc = DEVICE_CLASS(oc);
>> +    HotplugHandlerClass *hc = HOTPLUG_HANDLER_CLASS(oc);
>> +
>> +    device_class_set_props(dc, s390_topology_cores_properties);
>> +    hc->unplug = qdev_simple_device_unplug_cb;
>> +    dc->bus_type = TYPE_S390_TOPOLOGY_SOCKET_BUS;
>> +    dc->desc = "topology cpu entry";
>> +}
>> +
>> +static const TypeInfo cpu_cores_info = {
>> +    .name          = TYPE_S390_TOPOLOGY_CORES,
>> +    .parent        = TYPE_DEVICE,
>> +    .instance_size = sizeof(S390TopologyCores),
>> +    .class_init    = cpu_cores_class_init,
>> +    .interfaces = (InterfaceInfo[]) {
>> +        { TYPE_HOTPLUG_HANDLER },
> 
> Why implement the hotplug interface? That is not actually supported, is it?

Exactly, no need.

>> +        { }
>> +    }
>> +};
>> +
>> +static char *socket_bus_get_dev_path(DeviceState *dev)
>> +{
>> +    S390TopologySocket *socket = S390_TOPOLOGY_SOCKET(dev);
>> +    DeviceState *book = dev->parent_bus->parent;
>> +    char *id = qdev_get_dev_path(book);
>> +    char *ret;
>> +
>> +    if (id) {
>> +        ret = g_strdup_printf("%s:%02d", id, socket->socket_id);
>> +        g_free(id);
>> +    } else {
>> +        ret = g_strdup_printf("_:%02d", socket->socket_id);
> 
> How can this case occur? Sockets get attached to the book bus immediately after creation, correct?

yes

> 
>> +    }
>> +
>> +    return ret;
>> +}
>> +
>> +static void socket_bus_class_init(ObjectClass *oc, void *data)
>> +{
>> +    BusClass *k = BUS_CLASS(oc);
>> +
>> +    k->get_dev_path = socket_bus_get_dev_path;
>> +    k->max_dev = S390_MAX_SOCKETS;
> 
> This is the bus the cores are attached to, correct?
> Is this constant badly named, or should this be MAX_CORES (which doesn't exist)?
> How does this limit get enforced?
> Why is there a limit in the first place? I don't see one defined by STSI, other than having to fit in a u8.

I will remove this. It was intended to be an architecture-dependent
value (for example, it could be 3 for Z16 and 5 for Z17), but I am not
sure it is worth it.

>> +}
>> +
>> +static const TypeInfo socket_bus_info = {
>> +    .name = TYPE_S390_TOPOLOGY_SOCKET_BUS,
>> +    .parent = TYPE_BUS,
>> +    .instance_size = 0,
> 
> After a bit of grepping it seems to me that omitting that field is more common that setting it to 0.

right

> 
>> +    .class_init = socket_bus_class_init,
>> +};
>> +
>> +static void s390_socket_device_realize(DeviceState *dev, Error **errp)
>> +{
>> +    S390TopologySocket *socket = S390_TOPOLOGY_SOCKET(dev);
>> +    BusState *bus;
>> +
>> +    bus = qbus_new(TYPE_S390_TOPOLOGY_SOCKET_BUS, dev,
>> +                   TYPE_S390_TOPOLOGY_SOCKET_BUS);
>> +    qbus_set_hotplug_handler(bus, OBJECT(dev));
>> +    socket->bus = bus;
>> +}
>> +
>> +static void socket_class_init(ObjectClass *oc, void *data)
>> +{
>> +    DeviceClass *dc = DEVICE_CLASS(oc);
>> +    HotplugHandlerClass *hc = HOTPLUG_HANDLER_CLASS(oc);
>> +
>> +    hc->unplug = qdev_simple_device_unplug_cb;
>> +    set_bit(DEVICE_CATEGORY_BRIDGE, dc->categories);
>> +    dc->bus_type = TYPE_S390_TOPOLOGY_BOOK_BUS;
>> +    dc->realize = s390_socket_device_realize;
>> +    dc->desc = "topology socket";
>> +}
>> +
>> +static const TypeInfo socket_info = {
>> +    .name          = TYPE_S390_TOPOLOGY_SOCKET,
>> +    .parent        = TYPE_DEVICE,
>> +    .instance_size = sizeof(S390TopologySocket),
>> +    .class_init    = socket_class_init,
>> +    .interfaces = (InterfaceInfo[]) {
>> +        { TYPE_HOTPLUG_HANDLER },
>> +        { }
>> +    }
>> +};
>> +
>> +static char *book_bus_get_dev_path(DeviceState *dev)
>> +{
>> +    return g_strdup("00");
>> +}
>> +
>> +static void book_bus_class_init(ObjectClass *oc, void *data)
>> +{
>> +    BusClass *k = BUS_CLASS(oc);
>> +
>> +    k->get_dev_path = book_bus_get_dev_path;
>> +    k->max_dev = S390_MAX_BOOKS;
> 
> Same question as for socket_bus_class_init here.
> 
>> +}
>> +
>> +static const TypeInfo book_bus_info = {
>> +    .name = TYPE_S390_TOPOLOGY_BOOK_BUS,
>> +    .parent = TYPE_BUS,
>> +    .instance_size = 0,
>> +    .class_init = book_bus_class_init,
>> +};
>> +
>> +static void s390_book_device_realize(DeviceState *dev, Error **errp)
>> +{
>> +    S390TopologyBook *book = S390_TOPOLOGY_BOOK(dev);
>> +    BusState *bus;
>> +
>> +    bus = qbus_new(TYPE_S390_TOPOLOGY_BOOK_BUS, dev,
>> +                   TYPE_S390_TOPOLOGY_BOOK_BUS);
>> +    qbus_set_hotplug_handler(bus, OBJECT(dev));
>> +    book->bus = bus;
>> +}
>> +
>> +static void book_class_init(ObjectClass *oc, void *data)
>> +{
>> +    DeviceClass *dc = DEVICE_CLASS(oc);
>> +    HotplugHandlerClass *hc = HOTPLUG_HANDLER_CLASS(oc);
>> +
>> +    hc->unplug = qdev_simple_device_unplug_cb;
>> +    set_bit(DEVICE_CATEGORY_BRIDGE, dc->categories);
>> +    dc->realize = s390_book_device_realize;
>> +    dc->desc = "topology book";
>> +}
>> +
>> +static const TypeInfo book_info = {
>> +    .name          = TYPE_S390_TOPOLOGY_BOOK,
>> +    .parent        = TYPE_SYS_BUS_DEVICE,
>> +    .instance_size = sizeof(S390TopologyBook),
>> +    .class_init    = book_class_init,
>> +    .interfaces = (InterfaceInfo[]) {
>> +        { TYPE_HOTPLUG_HANDLER },
>> +        { }
>> +    }
>> +};
>> +
>> +static void topology_register(void)
>> +{
>> +    type_register_static(&cpu_cores_info);
>> +    type_register_static(&socket_bus_info);
>> +    type_register_static(&socket_info);
>> +    type_register_static(&book_bus_info);
>> +    type_register_static(&book_info);
>> +}
>> +
>> +type_init(topology_register);
>> diff --git a/hw/s390x/meson.build b/hw/s390x/meson.build
>> index feefe0717e..3592fa952b 100644
>> --- a/hw/s390x/meson.build
>> +++ b/hw/s390x/meson.build
>> @@ -2,6 +2,7 @@ s390x_ss = ss.source_set()
>>   s390x_ss.add(files(
>>     'ap-bridge.c',
>>     'ap-device.c',
>> +  'cpu-topology.c',
>>     'ccw-device.c',
>>     'css-bridge.c',
>>     'css.c',
>> diff --git a/hw/s390x/s390-virtio-ccw.c b/hw/s390x/s390-virtio-ccw.c
>> index cc3097bfee..a586875b24 100644
>> --- a/hw/s390x/s390-virtio-ccw.c
>> +++ b/hw/s390x/s390-virtio-ccw.c
>> @@ -43,6 +43,7 @@
>>   #include "sysemu/sysemu.h"
>>   #include "hw/s390x/pv.h"
>>   #include "migration/blocker.h"
>> +#include "hw/s390x/cpu-topology.h"
>>   
>>   static Error *pv_mig_blocker;
>>   
>> @@ -89,6 +90,7 @@ static void s390_init_cpus(MachineState *machine)
>>       /* initialize possible_cpus */
>>       mc->possible_cpu_arch_ids(machine);
>>   
>> +    s390_topology_setup(machine);
>>       for (i = 0; i < machine->smp.cpus; i++) {
>>           s390x_new_cpu(machine->cpu_type, i, &error_fatal);
>>       }
>> @@ -306,6 +308,10 @@ static void s390_cpu_plug(HotplugHandler *hotplug_dev,
>>       g_assert(!ms->possible_cpus->cpus[cpu->env.core_id].cpu);
>>       ms->possible_cpus->cpus[cpu->env.core_id].cpu = OBJECT(dev);
>>   
>> +    if (!s390_topology_new_cpu(ms, cpu->env.core_id, errp)) {
>> +        return;
>> +    }
>> +
>>       if (dev->hotplugged) {
>>           raise_irq_cpu_hotplug();
>>       }
>> diff --git a/include/hw/s390x/cpu-topology.h b/include/hw/s390x/cpu-topology.h
>> new file mode 100644
>> index 0000000000..beec61706c
>> --- /dev/null
>> +++ b/include/hw/s390x/cpu-topology.h
>> @@ -0,0 +1,74 @@
>> +/*
>> + * CPU Topology
>> + *
>> + * Copyright 2022 IBM Corp.
> 
> Same issue as with .c copyright notice.
>> + *
>> + * This work is licensed under the terms of the GNU GPL, version 2 or (at
>> + * your option) any later version. See the COPYING file in the top-level
>> + * directory.
>> + */
>> +#ifndef HW_S390X_CPU_TOPOLOGY_H
>> +#define HW_S390X_CPU_TOPOLOGY_H
>> +
>> +#include "hw/qdev-core.h"
>> +#include "qom/object.h"
>> +
>> +#define S390_TOPOLOGY_CPU_TYPE    0x03
> 
> This is the IFL type, right? If so the name should reflect it.

OK

>> +
>> +#define S390_TOPOLOGY_POLARITY_H  0x00
>> +#define S390_TOPOLOGY_POLARITY_VL 0x01
>> +#define S390_TOPOLOGY_POLARITY_VM 0x02
>> +#define S390_TOPOLOGY_POLARITY_VH 0x03
> 
> Why not use an enum?

These are bits inside a bitfield, not a count, so I prefer a direct
definition.
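
To illustrate what I mean, a small sketch of how the values land in the
flags byte of the CPU TLE. The mask names are hypothetical, and the bit
positions are an assumption derived from the field order in SysIBTl_cpu
(reserved:5, dedicated:1, polarity:2) on a big-endian layout:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define S390_TOPOLOGY_POLARITY_H  0x00
#define S390_TOPOLOGY_POLARITY_VL 0x01
#define S390_TOPOLOGY_POLARITY_VM 0x02
#define S390_TOPOLOGY_POLARITY_VH 0x03

/* Hypothetical names, assumed bit positions (see the note above). */
#define TLE_POLARITY_MASK 0x03
#define TLE_DEDICATED     0x04

/* Build the flags byte of a CPU TLE from its two fields. */
static uint8_t tle_flags(bool dedicated, uint8_t polarity)
{
    assert(polarity <= S390_TOPOLOGY_POLARITY_VH);
    return (uint8_t)((dedicated ? TLE_DEDICATED : 0) |
                     (polarity & TLE_POLARITY_MASK));
}
```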

>> +
>> +#define TYPE_S390_TOPOLOGY_CORES "topology cores"
> 
> Seems to me that using a - instead of a space is the usual way of doing things.

right

>> +    /*
>> +     * Each CPU inside a socket will be represented by a bit in a 64bit
>> +     * unsigned long. Set on plug and clear on unplug of a CPU.
>> +     * All CPU inside a mask share the same dedicated, polarity and
>> +     * cputype values.
>> +     * The origin is the offset of the first CPU in a mask.
>> +     */
>> +struct S390TopologyCores {
>> +    DeviceState parent_obj;
>> +    int id;
>> +    bool dedicated;
>> +    uint8_t polarity;
>> +    uint8_t cputype;
> 
> Why not snake_case for cpu type?

I do not understand what you mean.

> 
>> +    uint16_t origin;
>> +    uint64_t mask;
>> +    int cnt;
> 
> num_cores instead ?

I will remove this; it is unused.

> 
>> +};
>> +typedef struct S390TopologyCores S390TopologyCores;
>> +OBJECT_DECLARE_SIMPLE_TYPE(S390TopologyCores, S390_TOPOLOGY_CORES)
>> +
>> +#define TYPE_S390_TOPOLOGY_SOCKET "topology socket"
>> +#define TYPE_S390_TOPOLOGY_SOCKET_BUS "socket-bus"
>> +struct S390TopologySocket {
>> +    DeviceState parent_obj;
>> +    BusState *bus;
>> +    int socket_id;
>> +    int cnt;
>> +};
>> +typedef struct S390TopologySocket S390TopologySocket;
>> +OBJECT_DECLARE_SIMPLE_TYPE(S390TopologySocket, S390_TOPOLOGY_SOCKET)
>> +#define S390_MAX_SOCKETS 4
>> +
>> +#define TYPE_S390_TOPOLOGY_BOOK "topology book"
>> +#define TYPE_S390_TOPOLOGY_BOOK_BUS "book-bus"
>> +struct S390TopologyBook {
>> +    SysBusDevice parent_obj;
>> +    BusState *bus;
>> +    int book_id;
>> +    int cnt;
>> +};
>> +typedef struct S390TopologyBook S390TopologyBook;
>> +OBJECT_DECLARE_SIMPLE_TYPE(S390TopologyBook, S390_TOPOLOGY_BOOK)
>> +#define S390_MAX_BOOKS 1
>> +
>> +S390TopologyBook *s390_init_topology(void);
>> +
>> +S390TopologyBook *s390_get_topology(void);
>> +void s390_topology_setup(MachineState *ms);
>> +bool s390_topology_new_cpu(MachineState *ms, int core_id, Error **errp);
>> +
>> +#endif
>> diff --git a/target/s390x/cpu.h b/target/s390x/cpu.h
>> index 7d6d01325b..216adfde26 100644
>> --- a/target/s390x/cpu.h
>> +++ b/target/s390x/cpu.h
> 
> I think these definitions should be moved to the STSI patch since they're not used in this one.

right

> 
>> @@ -565,6 +565,53 @@ typedef union SysIB {
>>   } SysIB;
>>   QEMU_BUILD_BUG_ON(sizeof(SysIB) != 4096);
>>   
>> +/* CPU type Topology List Entry */
>> +typedef struct SysIBTl_cpu {
>> +        uint8_t nl;
>> +        uint8_t reserved0[3];
>> +        uint8_t reserved1:5;
>> +        uint8_t dedicated:1;
>> +        uint8_t polarity:2;
>> +        uint8_t type;
>> +        uint16_t origin;
>> +        uint64_t mask;
>> +} SysIBTl_cpu;
>> +QEMU_BUILD_BUG_ON(sizeof(SysIBTl_cpu) != 16);
>> +
>> +/* Container type Topology List Entry */
>> +typedef struct SysIBTl_container {
>> +        uint8_t nl;
>> +        uint8_t reserved[6];
>> +        uint8_t id;
>> +} QEMU_PACKED SysIBTl_container;
>> +QEMU_BUILD_BUG_ON(sizeof(SysIBTl_container) != 8);
>> +
>> +/* Generic Topology List Entry */
>> +typedef union SysIBTl_entry {
>> +        uint8_t nl;
>> +        SysIBTl_container container;
>> +        SysIBTl_cpu cpu;
>> +} SysIBTl_entry;
> 
> I don't like this union, it's only used in SysIB_151x below and that's misleading,
> because the entries are packed without padding, but the union members have different
> sizes.

The entries have different sizes, 64 bits and 128 bits.
I do not understand why they should be padded.

However, the union here is useless. I will remove it.
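
A sketch of walking the list without the union, treating it as packed
bytes. It assumes that a nesting level (nl) of 0 marks a CPU entry and
any other value a container entry; the names are hypothetical:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum {
    TLE_CONTAINER_SIZE = 8,   /* sizeof(SysIBTl_container) */
    TLE_CPU_SIZE       = 16,  /* sizeof(SysIBTl_cpu) */
};

/* Size of the entry starting at p; its first byte is the nl field. */
static size_t tle_size(const uint8_t *p)
{
    return p[0] == 0 ? TLE_CPU_SIZE : TLE_CONTAINER_SIZE;
}

/*
 * Count the entries in a packed list of len bytes.
 * Returns -1 if the last entry does not fit in the buffer.
 */
static int tle_count(const uint8_t *buf, size_t len)
{
    size_t off = 0;
    int n = 0;

    assert(buf != NULL);
    while (off < len) {
        size_t sz = tle_size(buf + off);

        if (sz > len - off) {
            return -1;
        }
        off += sz;
        n++;
    }
    return n;
}
```

Indexing an array of the union would instead step 16 bytes per element,
which skips over whatever follows an 8-byte container entry.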

> 
>> +
>> +#define TOPOLOGY_NR_MAG  6
>> +#define TOPOLOGY_NR_MAG6 0
>> +#define TOPOLOGY_NR_MAG5 1
>> +#define TOPOLOGY_NR_MAG4 2
>> +#define TOPOLOGY_NR_MAG3 3
>> +#define TOPOLOGY_NR_MAG2 4
>> +#define TOPOLOGY_NR_MAG1 5
>> +/* Configuration topology */
>> +typedef struct SysIB_151x {
>> +    uint8_t  res0[2];
>> +    uint16_t length;
>> +    uint8_t  mag[TOPOLOGY_NR_MAG];
>> +    uint8_t  res1;
>> +    uint8_t  mnest;
>> +    uint32_t res2;
>> +    SysIBTl_entry tle[0];
> 
> I think this should just be a uint64_t[] or uint64_t[0], whichever is QEMU style.

ok

>> +} SysIB_151x;
>> +QEMU_BUILD_BUG_ON(sizeof(SysIB_151x) != 16);
>> +
>>   /* MMU defines */
>>   #define ASCE_ORIGIN           (~0xfffULL) /* segment table origin             */
>>   #define ASCE_SUBSPACE         0x200       /* subspace group control           */
> 

-- 
Pierre Morel
IBM Lab Boeblingen


* Re: [PATCH v8 02/12] s390x/cpu_topology: CPU topology objects and structures
  2022-07-13 14:59     ` Pierre Morel
@ 2022-07-14 10:38       ` Janis Schoetterl-Glausch
  2022-07-14 11:25         ` Pierre Morel
  0 siblings, 1 reply; 49+ messages in thread
From: Janis Schoetterl-Glausch @ 2022-07-14 10:38 UTC (permalink / raw)
  To: Pierre Morel, qemu-s390x
  Cc: qemu-devel, borntraeger, pasic, richard.henderson, david, thuth,
	cohuck, mst, pbonzini, kvm, ehabkost, marcel.apfelbaum, eblake,
	armbru, seiden, nrb, frankja

On 7/13/22 16:59, Pierre Morel wrote:
> 
> 
> On 7/12/22 17:40, Janis Schoetterl-Glausch wrote:
>> On 6/20/22 16:03, Pierre Morel wrote:
>>> We use new objects to have a dynamic administration of the CPU topology.
>>> The highest level object in this implementation is the s390 book and
>>> in this first implementation of CPU topology for S390 we have a single
>>> book.
>>> The book is built as a SYSBUS bridge during the CPU initialization.
>>> Other objects, sockets and core will be built after the parsing
>>> of the QEMU -smp argument.
>>>
>>> Every object under this single book will be built dynamically
>>> immediately after a CPU has been realized if it is needed.
>>> The CPU will fill the sockets one after the other, according to the
>>> number of cores per socket defined during the smp parsing.
>>>
>>> Each CPU inside a socket will be represented by a bit in a 64bit
>>> unsigned long. Set on plug and clear on unplug of a CPU.
>>>
>>> For the S390 CPU topology, thread and cores are merged into
>>> topology cores and the number of topology cores is the multiplication
>>> of cores by the numbers of threads.
>>>
>>> Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>
>>> ---
>>>   hw/s390x/cpu-topology.c         | 391 ++++++++++++++++++++++++++++++++
>>>   hw/s390x/meson.build            |   1 +
>>>   hw/s390x/s390-virtio-ccw.c      |   6 +
>>>   include/hw/s390x/cpu-topology.h |  74 ++++++
>>>   target/s390x/cpu.h              |  47 ++++
>>>   5 files changed, 519 insertions(+)
>>>   create mode 100644 hw/s390x/cpu-topology.c
>>>   create mode 100644 include/hw/s390x/cpu-topology.h
>>>

[...]

>>> +}
>>> +
>>> +/*
>>> + * s390_topology_new_cpu:
>>> + * @core_id: the core ID is machine wide
>>> + *
>>> + * We have a single book returned by s390_get_topology(),
>>> + * then we build the hierarchy on demand.
>>> + * Note that we do not destroy the hierarchy on error creating
>>> + * an entry in the topology, we just keep it empty.
>>> + * We do not need to worry about not finding a topology level
>>> + * entry this would have been caught during smp parsing.
>>> + */
>>> +bool s390_topology_new_cpu(MachineState *ms, int core_id, Error **errp)
>>> +{
>>> +    S390TopologyBook *book;
>>> +    S390TopologySocket *socket;
>>> +    S390TopologyCores *cores;
>>> +    int nb_cores_per_socket;
>>
>> num_cores_per_socket instead?
>>
>>> +    int origin, bit;
>>> +
>>> +    book = s390_get_topology();
>>> +
>>> +    nb_cores_per_socket = ms->smp.cores * ms->smp.threads;
>>
>> We don't support the multithreading facility, do we?
>> So, I think we should assert smp.threads == 1 somewhere.
>> In any case I think the correct expression would round the threads up to the next power of 2,
>> because the core_id has the thread id in the lower bits, but threads per core doesn't need to be
>> a power of 2 according to the architecture.
> 
> That is right.
> I will add that.

Add the assert?
It should probably be somewhere else.
And you can set threads > 1 today, so we'd need to handle that (increase the number of cpus instead and print a warning?).
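
To make the rounding mentioned above concrete, here is a minimal standalone sketch. It is not code from the series: the helper names are made up, and it assumes, as described above, that the core_id carries the thread id in its low bits, so that every core takes a power-of-2 sized range of core_ids.

```c
#include <assert.h>

/* Hypothetical helper: round n up to the next power of two. */
static unsigned int round_up_pow2(unsigned int n)
{
    unsigned int p = 1;

    /* The upper bound keeps the shift below from overflowing. */
    assert(n >= 1 && n <= (1u << 31));
    while (p < n) {
        p <<= 1;
    }
    return p;
}

/*
 * Hypothetical helper: number of core_id values one socket spans when
 * the thread id occupies the low bits of the core_id.
 */
static unsigned int core_ids_per_socket(unsigned int cores,
                                        unsigned int threads)
{
    return cores * round_up_pow2(threads);
}
```

With 4 cores and 3 threads per core this gives 16 core_id values per socket rather than the 12 that cores * threads yields.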

[...]

>>> +
>>> +/*
>>> + * Setting the first topology: 1 book, 1 socket
>>> + * This is enough for 64 cores if the topology is flat (single socket)
>>> + */
>>> +void s390_topology_setup(MachineState *ms)
>>> +{
>>> +    DeviceState *dev;
>>> +
>>> +    /* Create BOOK bridge device */
>>> +    dev = qdev_new(TYPE_S390_TOPOLOGY_BOOK);
>>> +    object_property_add_child(qdev_get_machine(),
>>> +                              TYPE_S390_TOPOLOGY_BOOK, OBJECT(dev));
>>
>> Why add it to the machine instead of directly using a static?
> 
> For my opinion it is a characteristic of the machine.
> 
>> So it's visible to the user via info qtree or something?
> 
> It is already visible to the user on info qtree.
> 
>> Would that even be the appropriate location to show that?
> 
> That is a very good question and I really appreciate if we discuss on the design before diving into details.
> 
> The idea is to have the architecture details being on qtree as object so we can plug new drawers/books/socket/cores and in the future when the infrastructure allows it unplug them.

Would it not be more accurate to say that we plug in new cpus only?
Since you need to specify the topology up front with -smp and it cannot change after.
So that all is static, books/sockets might be completely unpopulated, but they still exist in a way.
As far as I understand, STSI only allows for cpus to change, nothing above it.
> 
> There is a info numa (info cpus does not give a lot info) to give information on nodes but AFAIU, a node is more a theoritical that can be used above the virtual architecture, sockets/cores, to specify characteristics like distance and associated memory.

https://qemu.readthedocs.io/en/latest/interop/qemu-qmp-ref.html#qapidoc-2391
shows that the relevant information can be queried via qmp.
When I tried it on s390x it only showed the core_id, but we should be able to add the rest.


Am I correct in my understanding, that there are two reasons to have the hierarchy objects:
1. Caching the topology instead of computing it when STSI is called
2. So they show up in info qtree

?

> 
> As I understand it can be used above socket and for us above books or drawers too like in:
> 
> -numa cpu,node-id=0,socket-id=0
> 
> All cores in socket 0 belong to node 0
> 
> or
> -numa cpu,node-id=1,drawer-id=1
> 
> all cores from all sockets of drawer 1 belong to node 1
> 
> 
> As there is no info socket, I think that for now we do not need an info book/drawer we have everything in qtree.
> 
> 
>>

[...]

> 
>>> +    /*
>>> +     * Each CPU inside a socket will be represented by a bit in a 64bit
>>> +     * unsigned long. Set on plug and clear on unplug of a CPU.
>>> +     * All CPU inside a mask share the same dedicated, polarity and
>>> +     * cputype values.
>>> +     * The origin is the offset of the first CPU in a mask.
>>> +     */
>>> +struct S390TopologyCores {
>>> +    DeviceState parent_obj;
>>> +    int id;
>>> +    bool dedicated;
>>> +    uint8_t polarity;
>>> +    uint8_t cputype;
>>
>> Why not snake_case for cpu type?
> 
> I do not understand what you mean.

I'm suggesting s/cputype/cpu_type/
> 
>>
>>> +    uint16_t origin;
>>> +    uint64_t mask;
>>> +    int cnt;
>>
>> num_cores instead ?
> 
> I suppress this it is unused
> 
>>

[...]

>>> @@ -565,6 +565,53 @@ typedef union SysIB {
>>>   } SysIB;
>>>   QEMU_BUILD_BUG_ON(sizeof(SysIB) != 4096);
>>>   +/* CPU type Topology List Entry */
>>> +typedef struct SysIBTl_cpu {
>>> +        uint8_t nl;
>>> +        uint8_t reserved0[3];
>>> +        uint8_t reserved1:5;
>>> +        uint8_t dedicated:1;
>>> +        uint8_t polarity:2;
>>> +        uint8_t type;
>>> +        uint16_t origin;
>>> +        uint64_t mask;
>>> +} SysIBTl_cpu;
>>> +QEMU_BUILD_BUG_ON(sizeof(SysIBTl_cpu) != 16);
>>> +
>>> +/* Container type Topology List Entry */
>>> +typedef struct SysIBTl_container {
>>> +        uint8_t nl;
>>> +        uint8_t reserved[6];
>>> +        uint8_t id;
>>> +} QEMU_PACKED SysIBTl_container;
>>> +QEMU_BUILD_BUG_ON(sizeof(SysIBTl_container) != 8);
>>> +
>>> +/* Generic Topology List Entry */
>>> +typedef union SysIBTl_entry {
>>> +        uint8_t nl;
>>> +        SysIBTl_container container;
>>> +        SysIBTl_cpu cpu;
>>> +} SysIBTl_entry;
>>
>> I don't like this union, it's only used in SysIB_151x below and that's misleading,
>> because the entries are packed without padding, but the union members have different
>> sizes.
> 
> the entries have different sizes 64bits and 128bits.
> I do not understand why they should be padded.

I was saying that in the SYSIB there is no padding, but the size of the union is 16,
so two container entries in an array of the union would have padding between them, which is misleading.
There is no actual problem, since the array is not actually used as such.
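
A small standalone sketch of the size mismatch, using simplified stand-ins for the structures in the patch (the type names and the packed_size() helper are made up for illustration, and the bit-fields are collapsed into one flags byte):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified container entry: 8 bytes. */
typedef struct ContainerTle {
    uint8_t nl;
    uint8_t reserved[6];
    uint8_t id;
} ContainerTle;

/* Simplified CPU entry: 16 bytes. */
typedef struct CpuTle {
    uint8_t nl;
    uint8_t reserved[3];
    uint8_t flags;          /* stands in for dedicated and polarity */
    uint8_t type;
    uint16_t origin;
    uint64_t mask;
} CpuTle;

/* The union takes the size of its largest member. */
typedef union Tle {
    uint8_t nl;
    ContainerTle container;
    CpuTle cpu;
} Tle;

static_assert(sizeof(ContainerTle) == 8, "container entry is 8 bytes");
static_assert(sizeof(CpuTle) == 16, "cpu entry is 16 bytes");
static_assert(sizeof(Tle) == 16, "union is as large as the cpu entry");

/* Bytes the entries occupy when stored back to back, as in the SYSIB. */
static size_t packed_size(unsigned int containers, unsigned int cpus)
{
    return containers * sizeof(ContainerTle) + cpus * sizeof(CpuTle);
}
```

Two container entries occupy 16 bytes when packed, while a two element array of the union occupies 32, so indexing such an array would not land on the second entry.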
> 
> However, the union here is useless. will remove it.
> 
>>
>>> +
>>> +#define TOPOLOGY_NR_MAG  6
>>> +#define TOPOLOGY_NR_MAG6 0
>>> +#define TOPOLOGY_NR_MAG5 1
>>> +#define TOPOLOGY_NR_MAG4 2
>>> +#define TOPOLOGY_NR_MAG3 3
>>> +#define TOPOLOGY_NR_MAG2 4
>>> +#define TOPOLOGY_NR_MAG1 5
>>> +/* Configuration topology */
>>> +typedef struct SysIB_151x {
>>> +    uint8_t  res0[2];
>>> +    uint16_t length;
>>> +    uint8_t  mag[TOPOLOGY_NR_MAG];
>>> +    uint8_t  res1;
>>> +    uint8_t  mnest;
>>> +    uint32_t res2;
>>> +    SysIBTl_entry tle[0];
>>
>> I think this should just be a uint64_t[] or uint64_t[0], whichever is QEMU style.
> 
> ok
> 
>>> +} SysIB_151x;
>>> +QEMU_BUILD_BUG_ON(sizeof(SysIB_151x) != 16);
>>> +
>>>   /* MMU defines */
>>>   #define ASCE_ORIGIN           (~0xfffULL) /* segment table origin             */
>>>   #define ASCE_SUBSPACE         0x200       /* subspace group control           */
>>
> 


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v8 02/12] s390x/cpu_topology: CPU topology objects and structures
  2022-07-14 10:38       ` Janis Schoetterl-Glausch
@ 2022-07-14 11:25         ` Pierre Morel
  2022-07-14 12:50           ` Janis Schoetterl-Glausch
  0 siblings, 1 reply; 49+ messages in thread
From: Pierre Morel @ 2022-07-14 11:25 UTC (permalink / raw)
  To: Janis Schoetterl-Glausch, qemu-s390x
  Cc: qemu-devel, borntraeger, pasic, richard.henderson, david, thuth,
	cohuck, mst, pbonzini, kvm, ehabkost, marcel.apfelbaum, eblake,
	armbru, seiden, nrb, frankja



On 7/14/22 12:38, Janis Schoetterl-Glausch wrote:
> On 7/13/22 16:59, Pierre Morel wrote:
>>
>>
>> On 7/12/22 17:40, Janis Schoetterl-Glausch wrote:
>>> On 6/20/22 16:03, Pierre Morel wrote:

> 
> [...]
> 
>>>> +}
>>>> +
>>>> +/*
>>>> + * s390_topology_new_cpu:
>>>> + * @core_id: the core ID is machine wide
>>>> + *
>>>> + * We have a single book returned by s390_get_topology(),
>>>> + * then we build the hierarchy on demand.
>>>> + * Note that we do not destroy the hierarchy on error creating
>>>> + * an entry in the topology, we just keep it empty.
>>>> + * We do not need to worry about not finding a topology level
>>>> + * entry this would have been caught during smp parsing.
>>>> + */
>>>> +bool s390_topology_new_cpu(MachineState *ms, int core_id, Error **errp)
>>>> +{
>>>> +    S390TopologyBook *book;
>>>> +    S390TopologySocket *socket;
>>>> +    S390TopologyCores *cores;
>>>> +    int nb_cores_per_socket;
>>>
>>> num_cores_per_socket instead?
>>>
>>>> +    int origin, bit;
>>>> +
>>>> +    book = s390_get_topology();
>>>> +
>>>> +    nb_cores_per_socket = ms->smp.cores * ms->smp.threads;
>>>
>>> We don't support the multithreading facility, do we?
>>> So, I think we should assert smp.threads == 1 somewhere.
>>> In any case I think the correct expression would round the threads up to the next power of 2,
>>> because the core_id has the thread id in the lower bits, but threads per core doesn't need to be
>>> a power of 2 according to the architecture.
>>
>> That is right.
>> I will add that.
> 
> Add the assert?
> It should probably be somewhere else.

That is certain.
I thought about putting a fatal error report in the initialization, in
s390_topology_setup().

> And you can set thread > 1 today, so we'd need to handle that. (increase the number of cpus instead and print a warning?)
> 
> [...]

This would introduce arch dependencies in hw/core/.
I think that the error report for Z is enough.

So once we support multithreading in the guest we can adjust it more
easily, without involving the common code.

Or we can introduce a thread_supported in SMPCompatProps, which would be
good.
I would prefer to propose this outside of the series and remove the
fatal error once it is adopted.

> 
>>>> +
>>>> +/*
>>>> + * Setting the first topology: 1 book, 1 socket
>>>> + * This is enough for 64 cores if the topology is flat (single socket)
>>>> + */
>>>> +void s390_topology_setup(MachineState *ms)
>>>> +{
>>>> +    DeviceState *dev;
>>>> +
>>>> +    /* Create BOOK bridge device */
>>>> +    dev = qdev_new(TYPE_S390_TOPOLOGY_BOOK);
>>>> +    object_property_add_child(qdev_get_machine(),
>>>> +                              TYPE_S390_TOPOLOGY_BOOK, OBJECT(dev));
>>>
>>> Why add it to the machine instead of directly using a static?
>>
>> For my opinion it is a characteristic of the machine.
>>
>>> So it's visible to the user via info qtree or something?
>>
>> It is already visible to the user on info qtree.
>>
>>> Would that even be the appropriate location to show that?
>>
>> That is a very good question and I really appreciate if we discuss on the design before diving into details.
>>
>> The idea is to have the architecture details being on qtree as object so we can plug new drawers/books/socket/cores and in the future when the infrastructure allows it unplug them.
> 
> Would it not be more accurate to say that we plug in new cpus only?
> Since you need to specify the topology up front with -smp and it cannot change after.

-smp specifies the maximum we can have.
I thought we could dynamically add elements within this maximum.

> So that all is static, books/sockets might be completely unpopulated, but they still exist in a way.
> As far as I understand, STSI only allows for cpus to change, nothing above it.

I thought we wanted to plug in new books or drawers, but I may be wrong.

>>
>> There is a info numa (info cpus does not give a lot info) to give information on nodes but AFAIU, a node is more a theoritical that can be used above the virtual architecture, sockets/cores, to specify characteristics like distance and associated memory.
> 
> https://qemu.readthedocs.io/en/latest/interop/qemu-qmp-ref.html#qapidoc-2391
> shows that the relevant information can be queried via qmp.
> When I tried it on s390x it only showed the core_id, but we should be able to add the rest.

yes, sure.

> 
> 
> Am I correct in my understanding, that there are two reasons to have the hierarchy objects:
> 1. Caching the topology instead of computing it when STSI is called
> 2. So they show up in info qtree
> 
> ?

Yes, and to have the possibility of adding the objects dynamically.


> [...]
> 
>>
>>>> +    /*
>>>> +     * Each CPU inside a socket will be represented by a bit in a 64bit
>>>> +     * unsigned long. Set on plug and clear on unplug of a CPU.
>>>> +     * All CPU inside a mask share the same dedicated, polarity and
>>>> +     * cputype values.
>>>> +     * The origin is the offset of the first CPU in a mask.
>>>> +     */
>>>> +struct S390TopologyCores {
>>>> +    DeviceState parent_obj;
>>>> +    int id;
>>>> +    bool dedicated;
>>>> +    uint8_t polarity;
>>>> +    uint8_t cputype;
>>>
>>> Why not snake_case for cpu type?
>>
>> I do not understand what you mean.
> 
> I'm suggesting s/cputype/cpu_type/

ok


Thanks,

regards,
Pierre

-- 
Pierre Morel
IBM Lab Boeblingen

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v8 02/12] s390x/cpu_topology: CPU topology objects and structures
  2022-07-14 11:25         ` Pierre Morel
@ 2022-07-14 12:50           ` Janis Schoetterl-Glausch
  2022-07-14 19:26             ` Pierre Morel
  0 siblings, 1 reply; 49+ messages in thread
From: Janis Schoetterl-Glausch @ 2022-07-14 12:50 UTC (permalink / raw)
  To: Pierre Morel, qemu-s390x
  Cc: qemu-devel, borntraeger, pasic, richard.henderson, david, thuth,
	cohuck, mst, pbonzini, kvm, ehabkost, marcel.apfelbaum, eblake,
	armbru, seiden, nrb, frankja

On 7/14/22 13:25, Pierre Morel wrote:

[...]

> 
> That is sure.
> I thought about put a fatal error report during the initialization in the s390_topology_setup()
> 
>> And you can set thread > 1 today, so we'd need to handle that. (increase the number of cpus instead and print a warning?)
>>
>> [...]
> 
> this would introduce arch dependencies in the hw/core/
> I think that the error report for Z is enough.
> 
> So once we support Multithreading in the guest we can adjust it easier without involving the common code.
> 
> Or we can introduce a thread_supported in SMPCompatProps, which would be good.
> I would prefer to propose this outside of the series and suppress the fatal error once it is adopted.
> 

Yeah, could be a separate series, but then the question remains what you do in this one, that is,
whether you change the code so it would be correct if multithreading were supported.
>>
>>>>> +
>>>>> +/*
>>>>> + * Setting the first topology: 1 book, 1 socket
>>>>> + * This is enough for 64 cores if the topology is flat (single socket)
>>>>> + */
>>>>> +void s390_topology_setup(MachineState *ms)
>>>>> +{
>>>>> +    DeviceState *dev;
>>>>> +
>>>>> +    /* Create BOOK bridge device */
>>>>> +    dev = qdev_new(TYPE_S390_TOPOLOGY_BOOK);
>>>>> +    object_property_add_child(qdev_get_machine(),
>>>>> +                              TYPE_S390_TOPOLOGY_BOOK, OBJECT(dev));
>>>>
>>>> Why add it to the machine instead of directly using a static?
>>>
>>> For my opinion it is a characteristic of the machine.
>>>
>>>> So it's visible to the user via info qtree or something?
>>>
>>> It is already visible to the user on info qtree.
>>>
>>>> Would that even be the appropriate location to show that?
>>>
>>> That is a very good question and I really appreciate if we discuss on the design before diving into details.
>>>
>>> The idea is to have the architecture details being on qtree as object so we can plug new drawers/books/socket/cores and in the future when the infrastructure allows it unplug them.
>>
>> Would it not be more accurate to say that we plug in new cpus only?
>> Since you need to specify the topology up front with -smp and it cannot change after.
> 
> smp specify the maximum we can have.
> I thought we can add dynamically elements inside this maximum set.
> 
>> So that all is static, books/sockets might be completely unpopulated, but they still exist in a way.
>> As far as I understand, STSI only allows for cpus to change, nothing above it.
> 
> I thought we want to plug new books or drawers but I may be wrong.

So you want to be able to plug in, for example, a socket without any cpus in it?
I'm not seeing anything in the description of STSI that forbids having empty containers
or containers with a cpu entry without any cpus. But I don't know why that would be useful.
And if you don't want empty containers, then the container will just show up when plugging in the cpu.
> 
>>>
>>> There is a info numa (info cpus does not give a lot info) to give information on nodes but AFAIU, a node is more a theoritical that can be used above the virtual architecture, sockets/cores, to specify characteristics like distance and associated memory.
>>
>> https://qemu.readthedocs.io/en/latest/interop/qemu-qmp-ref.html#qapidoc-2391
>> shows that the relevant information can be queried via qmp.
>> When I tried it on s390x it only showed the core_id, but we should be able to add the rest.
> 
> yes, sure.
> 
>>
>>
>> Am I correct in my understanding, that there are two reasons to have the hierarchy objects:
>> 1. Caching the topology instead of computing it when STSI is called
>> 2. So they show up in info qtree
>>
>> ?
> 
> and have the possibility to add the objects dynamically. yes
> 
[...]

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v8 08/12] s390x/cpu_topology: implementing numa for the s390x topology
  2022-06-20 14:03 ` [PATCH v8 08/12] s390x/cpu_topology: implementing numa for the s390x topology Pierre Morel
@ 2022-07-14 14:57   ` Janis Schoetterl-Glausch
  2022-07-14 20:17     ` Pierre Morel
  0 siblings, 1 reply; 49+ messages in thread
From: Janis Schoetterl-Glausch @ 2022-07-14 14:57 UTC (permalink / raw)
  To: Pierre Morel, qemu-s390x
  Cc: qemu-devel, borntraeger, pasic, richard.henderson, david, thuth,
	cohuck, mst, pbonzini, kvm, ehabkost, marcel.apfelbaum, eblake,
	armbru, seiden, nrb, frankja

On 6/20/22 16:03, Pierre Morel wrote:
> S390x CPU Topology allows a non uniform repartition of the CPU
> inside the topology containers, sockets, books and drawers.
> 
> We use numa to place the CPU inside the right topology container
> and report the non uniform topology to the guest.
> 
> Note that s390x needs CPU0 to belong to the topology and consequently
> all topology must include CPU0.
> 
> We accept a partial QEMU numa definition, in that case undefined CPUs
> are added to free slots in the topology starting with slot 0 and going
> up.

I don't understand why doing it this way, via numa, makes sense for us.
We report the topology to the guest via STSI, which tells the guest
what the topology "tree" looks like. We don't report any numa distances to the guest.
The natural way to specify where a cpu is added to the vm seems to me to be
by specifying the socket, book, ... IDs when doing a device_add or via -device on
the command line.

[...]

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v8 00/12] s390x: CPU Topology
  2022-06-20 14:03 [PATCH v8 00/12] s390x: CPU Topology Pierre Morel
                   ` (11 preceding siblings ...)
  2022-06-20 14:03 ` [PATCH v8 12/12] s390x/cpu_topology: activating CPU topology Pierre Morel
@ 2022-07-14 18:43 ` Janis Schoetterl-Glausch
  2022-07-14 20:05   ` Pierre Morel
  12 siblings, 1 reply; 49+ messages in thread
From: Janis Schoetterl-Glausch @ 2022-07-14 18:43 UTC (permalink / raw)
  To: Pierre Morel, qemu-s390x
  Cc: qemu-devel, borntraeger, pasic, richard.henderson, david, thuth,
	cohuck, mst, pbonzini, kvm, ehabkost, marcel.apfelbaum, eblake,
	armbru, seiden, nrb, frankja

On 6/20/22 16:03, Pierre Morel wrote:
> Hi,
> 
> This new spin is essentially for coherence with the last Linux CPU
> Topology patch, function testing and coding style modifications.
> 
> Forword
> =======
> 
> The goal of this series is to implement CPU topology for S390, it
> improves the preceeding series with the implementation of books and
> drawers, of non uniform CPU topology and with documentation.
> 
> To use these patches, you will need the Linux series version 10.
> You find it there:
> https://lkml.org/lkml/2022/6/20/590
> 
> Currently this code is for KVM only, I have no idea if it is interesting
> to provide a TCG patch. If ever it will be done in another series.
> 
> To have a better understanding of the S390x CPU Topology and its
> implementation in QEMU you can have a look at the documentation in the
> last patch or follow the introduction here under.
> 
> A short introduction
> ====================
> 
> CPU Topology is described in the S390 POP with essentially the description
> of two instructions:
> 
> PTF Perform Topology function used to poll for topology change
>     and used to set the polarization but this part is not part of this item.
> 
> STSI Store System Information and the SYSIB 15.1.x providing the Topology
>     configuration.
> 
> S390 Topology is a 6 levels hierarchical topology with up to 5 level
>     of containers. The last topology level, specifying the CPU cores.
> 
>     This patch series only uses the two lower levels sockets and cores.
>     
>     To get the information on the topology, S390 provides the STSI
>     instruction, which stores a structures providing the list of the
>     containers used in the Machine topology: the SYSIB.
>     A selector within the STSI instruction allow to chose how many topology
>     levels will be provide in the SYSIB.
> 
>     Using the Topology List Entries (TLE) provided inside the SYSIB we
>     the Linux kernel is able to compute the information about the cache
>     distance between two cores and can use this information to take
>     scheduling decisions.

Do the socket, book, ... metaphors and looking at STSI from the existing
smp infrastructure even make sense?

STSI 15.1.x reports the topology to the guest and for a virtual machine,
this topology can be very dynamic. So a CPU can move from one topology
container to another, but the socket of a cpu changing while it's running seems
a bit strange. And this isn't supported by this patch series as far as I understand,
the only topology changes are on hotplug.

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v8 02/12] s390x/cpu_topology: CPU topology objects and structures
  2022-07-14 12:50           ` Janis Schoetterl-Glausch
@ 2022-07-14 19:26             ` Pierre Morel
  0 siblings, 0 replies; 49+ messages in thread
From: Pierre Morel @ 2022-07-14 19:26 UTC (permalink / raw)
  To: Janis Schoetterl-Glausch, qemu-s390x
  Cc: qemu-devel, borntraeger, pasic, richard.henderson, david, thuth,
	cohuck, mst, pbonzini, kvm, ehabkost, marcel.apfelbaum, eblake,
	armbru, seiden, nrb, frankja



On 7/14/22 14:50, Janis Schoetterl-Glausch wrote:
> On 7/14/22 13:25, Pierre Morel wrote:
> 
> [...]
> 
>>
>> That is sure.
>> I thought about put a fatal error report during the initialization in the s390_topology_setup()
>>
>>> And you can set thread > 1 today, so we'd need to handle that. (increase the number of cpus instead and print a warning?)
>>>
>>> [...]
>>
>> this would introduce arch dependencies in the hw/core/
>> I think that the error report for Z is enough.
>>
>> So once we support Multithreading in the guest we can adjust it easier without involving the common code.
>>
>> Or we can introduce a thread_supported in SMPCompatProps, which would be good.
>> I would prefer to propose this outside of the series and suppress the fatal error once it is adopted.
>>
> 
> Yeah, could be a separate series, but then the question remains what you in this one, that is
> if you change the code so it would be correct if multithreading were supported.

I would first like to not support multithreading and raise a fatal error if
threads is defined, explicitly or implicitly, as anything other than 1.

I prefer to keep multithreading for later; I have not yet looked at all
the implications.

>>>
>>>>>> +
>>>>>> +/*
>>>>>> + * Setting the first topology: 1 book, 1 socket
>>>>>> + * This is enough for 64 cores if the topology is flat (single socket)
>>>>>> + */
>>>>>> +void s390_topology_setup(MachineState *ms)
>>>>>> +{
>>>>>> +    DeviceState *dev;
>>>>>> +
>>>>>> +    /* Create BOOK bridge device */
>>>>>> +    dev = qdev_new(TYPE_S390_TOPOLOGY_BOOK);
>>>>>> +    object_property_add_child(qdev_get_machine(),
>>>>>> +                              TYPE_S390_TOPOLOGY_BOOK, OBJECT(dev));
>>>>>
>>>>> Why add it to the machine instead of directly using a static?
>>>>
>>>> For my opinion it is a characteristic of the machine.
>>>>
>>>>> So it's visible to the user via info qtree or something?
>>>>
>>>> It is already visible to the user on info qtree.
>>>>
>>>>> Would that even be the appropriate location to show that?
>>>>
>>>> That is a very good question and I really appreciate if we discuss on the design before diving into details.
>>>>
>>>> The idea is to have the architecture details being on qtree as object so we can plug new drawers/books/socket/cores and in the future when the infrastructure allows it unplug them.
>>>
>>> Would it not be more accurate to say that we plug in new cpus only?
>>> Since you need to specify the topology up front with -smp and it cannot change after.
>>
>> smp specify the maximum we can have.
>> I thought we can add dynamically elements inside this maximum set.
>>
>>> So that all is static, books/sockets might be completely unpopulated, but they still exist in a way.
>>> As far as I understand, STSI only allows for cpus to change, nothing above it.
>>
>> I thought we want to plug new books or drawers but I may be wrong.
> 
> So you want to be able to plug in, for example, a socket without any cpus in it?
> I'm not seeing anything in the description of STSI that forbids having empty containers
> or containers with a cpu entry without any cpus. But I don't know why that would be useful.
> And if you don't want empty containers, then the container will just show up when plugging in the cpu.

You already convinced me, it is nonsense and, anyway, building each
container when a cpu is added is how the current implementation
works.


-- 
Pierre Morel
IBM Lab Boeblingen

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v8 00/12] s390x: CPU Topology
  2022-07-14 18:43 ` [PATCH v8 00/12] s390x: CPU Topology Janis Schoetterl-Glausch
@ 2022-07-14 20:05   ` Pierre Morel
  2022-07-15  9:31     ` Janis Schoetterl-Glausch
  0 siblings, 1 reply; 49+ messages in thread
From: Pierre Morel @ 2022-07-14 20:05 UTC (permalink / raw)
  To: Janis Schoetterl-Glausch, qemu-s390x
  Cc: qemu-devel, borntraeger, pasic, richard.henderson, david, thuth,
	cohuck, mst, pbonzini, kvm, ehabkost, marcel.apfelbaum, eblake,
	armbru, seiden, nrb, frankja



On 7/14/22 20:43, Janis Schoetterl-Glausch wrote:
> On 6/20/22 16:03, Pierre Morel wrote:
>> Hi,
>>
>> This new spin is essentially for coherence with the last Linux CPU
>> Topology patch, function testing and coding style modifications.
>>
>> Forword
>> =======
>>
>> The goal of this series is to implement CPU topology for S390, it
>> improves the preceeding series with the implementation of books and
>> drawers, of non uniform CPU topology and with documentation.
>>
>> To use these patches, you will need the Linux series version 10.
>> You find it there:
>> https://lkml.org/lkml/2022/6/20/590
>>
>> Currently this code is for KVM only, I have no idea if it is interesting
>> to provide a TCG patch. If ever it will be done in another series.
>>
>> To have a better understanding of the S390x CPU Topology and its
>> implementation in QEMU you can have a look at the documentation in the
>> last patch or follow the introduction here under.
>>
>> A short introduction
>> ====================
>>
>> CPU Topology is described in the S390 POP with essentially the description
>> of two instructions:
>>
>> PTF Perform Topology function used to poll for topology change
>>      and used to set the polarization but this part is not part of this item.
>>
>> STSI Store System Information and the SYSIB 15.1.x providing the Topology
>>      configuration.
>>
>> S390 Topology is a 6 levels hierarchical topology with up to 5 level
>>      of containers. The last topology level, specifying the CPU cores.
>>
>>      This patch series only uses the two lower levels sockets and cores.
>>      
>>      To get the information on the topology, S390 provides the STSI
>>      instruction, which stores a structures providing the list of the
>>      containers used in the Machine topology: the SYSIB.
>>      A selector within the STSI instruction allow to chose how many topology
>>      levels will be provide in the SYSIB.
>>
>>      Using the Topology List Entries (TLE) provided inside the SYSIB we
>>      the Linux kernel is able to compute the information about the cache
>>      distance between two cores and can use this information to take
>>      scheduling decisions.
> 
> Do the socket, book, ... metaphors and looking at STSI from the existing
> smp infrastructure even make sense?

Sorry, I do not understand.
I admit the cover letter is old and I have not really rewritten it well
since the first patch series.

What we do is:
Compute the STSI from the SMP + numa + device QEMU parameters.

> 
> STSI 15.1.x reports the topology to the guest and for a virtual machine,
> this topology can be very dynamic. So a CPU can move from from one topology
> container to another, but the socket of a cpu changing while it's running seems
> a bit strange. And this isn't supported by this patch series as far as I understand,
> the only topology changes are on hotplug.

A CPU moving from one socket to another is, together with a new CPU being
plugged in, the only case in which the PTF instruction reports a change
in the topology.
It is not expected to occur often but it does occur.
The code has been removed from the kernel in spin 10 for 2 reasons:
1) we decided to first support only dedicated and pinned CPUs
2) Christian fears it may happen too often due to Linux host scheduling
and could be a performance problem

So yes, for now we only have a topology change report on vCPU plug.

> 

-- 
Pierre Morel
IBM Lab Boeblingen

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v8 08/12] s390x/cpu_topology: implementing numa for the s390x topology
  2022-07-14 14:57   ` Janis Schoetterl-Glausch
@ 2022-07-14 20:17     ` Pierre Morel
  2022-07-15  9:11       ` Janis Schoetterl-Glausch
  0 siblings, 1 reply; 49+ messages in thread
From: Pierre Morel @ 2022-07-14 20:17 UTC (permalink / raw)
  To: Janis Schoetterl-Glausch, qemu-s390x
  Cc: qemu-devel, borntraeger, pasic, richard.henderson, david, thuth,
	cohuck, mst, pbonzini, kvm, ehabkost, marcel.apfelbaum, eblake,
	armbru, seiden, nrb, frankja



On 7/14/22 16:57, Janis Schoetterl-Glausch wrote:
> On 6/20/22 16:03, Pierre Morel wrote:
>> S390x CPU Topology allows a non uniform repartition of the CPU
>> inside the topology containers, sockets, books and drawers.
>>
>> We use numa to place the CPU inside the right topology container
>> and report the non uniform topology to the guest.
>>
>> Note that s390x needs CPU0 to belong to the topology and consequently
>> all topology must include CPU0.
>>
>> We accept a partial QEMU numa definition, in that case undefined CPUs
>> are added to free slots in the topology starting with slot 0 and going
>> up.
> 
> I don't understand why doing it this way, via numa, makes sense for us.
> We report the topology to the guest via STSI, which tells the guest
> what the topology "tree" looks like. We don't report any numa distances to the guest.
> The natural way to specify where a cpu is added to the vm, seems to me to be
> by specify the socket, book, ... IDs when doing a device_add or via -device on
> the command line.
> 
> [...]
> 

The current choice is to have the core-id determine where the CPU is
situated in the topology.

But yes, we can choose to use drawer-id, book-id and socket-id, and use a
core-id starting at 0 on each socket.

It is not done in the current implementation because the core-id, together
with the smp parameters, implies the socket-id, book-id and drawer-id.
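
As a rough illustration only (this is not the code from the series), the
derivation could look like the sketch below. It assumes a uniform topology
and globally numbered containers; struct smp_layout, struct topo_pos and
topo_from_core_id() are hypothetical names, not QEMU types or functions.

```c
#include <assert.h>

/* Hypothetical holder for the -smp values; not a QEMU type. */
struct smp_layout {
    unsigned cores_per_socket;
    unsigned sockets_per_book;
    unsigned books_per_drawer;
};

/* Position of one core in the topology tree. */
struct topo_pos {
    unsigned drawer_id;
    unsigned book_id;
    unsigned socket_id;
};

/*
 * Derive the container IDs from a flat core-id by successive division.
 * IDs are global here (sockets are numbered across the whole machine),
 * which is an assumption of this sketch.
 */
static struct topo_pos topo_from_core_id(unsigned core_id,
                                         const struct smp_layout *smp)
{
    struct topo_pos pos;

    assert(smp->cores_per_socket > 0);
    assert(smp->sockets_per_book > 0);
    assert(smp->books_per_drawer > 0);

    pos.socket_id = core_id / smp->cores_per_socket;
    pos.book_id = pos.socket_id / smp->sockets_per_book;
    pos.drawer_id = pos.book_id / smp->books_per_drawer;
    return pos;
}
```

With 4 cores per socket, 2 sockets per book and 2 books per drawer,
core-id 17 lands in socket 4, book 2, drawer 1. Because the mapping is a
pure function of the core-id, the position cannot change while the VM runs.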


-- 
Pierre Morel
IBM Lab Boeblingen

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v8 08/12] s390x/cpu_topology: implementing numa for the s390x topology
  2022-07-14 20:17     ` Pierre Morel
@ 2022-07-15  9:11       ` Janis Schoetterl-Glausch
  2022-07-15 13:07         ` Pierre Morel
  0 siblings, 1 reply; 49+ messages in thread
From: Janis Schoetterl-Glausch @ 2022-07-15  9:11 UTC (permalink / raw)
  To: Pierre Morel, qemu-s390x
  Cc: qemu-devel, borntraeger, pasic, richard.henderson, david, thuth,
	cohuck, mst, pbonzini, kvm, ehabkost, marcel.apfelbaum, eblake,
	armbru, seiden, nrb, frankja

On 7/14/22 22:17, Pierre Morel wrote:
> 
> 
> On 7/14/22 16:57, Janis Schoetterl-Glausch wrote:
>> On 6/20/22 16:03, Pierre Morel wrote:
>>> S390x CPU Topology allows a non uniform repartition of the CPU
>>> inside the topology containers, sockets, books and drawers.
>>>
>>> We use numa to place the CPU inside the right topology container
>>> and report the non uniform topology to the guest.
>>>
>>> Note that s390x needs CPU0 to belong to the topology and consequently
>>> all topology must include CPU0.
>>>
>>> We accept a partial QEMU numa definition, in that case undefined CPUs
>>> are added to free slots in the topology starting with slot 0 and going
>>> up.
>>
>> I don't understand why doing it this way, via numa, makes sense for us.
>> We report the topology to the guest via STSI, which tells the guest
>> what the topology "tree" looks like. We don't report any numa distances to the guest.
>> The natural way to specify where a cpu is added to the vm, seems to me to be
>> by specify the socket, book, ... IDs when doing a device_add or via -device on
>> the command line.
>>
>> [...]
>>
> 
> It is a choice to have the core-id to determine were the CPU is situated in the topology.
> 
> But yes we can chose the use drawer-id,book-id,socket-id and use a core-id starting on 0 on each socket.
> 
> It is not done in the current implementation because the core-id implies the socket-id, book-id and drawer-id together with the smp parameters.
> 
> 
Regardless of whether the core-id or the combination of socket-id, book-id, ... is used to specify where a CPU is
located, why use the numa framework and not just device_add or -device?

That feels way more natural since it should already just work if you can do hotplug.
At least with core-id and I suspect with a subset of your changes also with socket-id, etc.

Whereas numa is an awkward fit since it's for specifying distances between nodes, which we don't do,
and you have to use a hack to get it to specify which CPUs to plug (via setting arch_id to -1).

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v8 00/12] s390x: CPU Topology
  2022-07-14 20:05   ` Pierre Morel
@ 2022-07-15  9:31     ` Janis Schoetterl-Glausch
  2022-07-15 13:47       ` Pierre Morel
  0 siblings, 1 reply; 49+ messages in thread
From: Janis Schoetterl-Glausch @ 2022-07-15  9:31 UTC (permalink / raw)
  To: Pierre Morel, qemu-s390x
  Cc: qemu-devel, borntraeger, pasic, richard.henderson, david, thuth,
	cohuck, mst, pbonzini, kvm, ehabkost, marcel.apfelbaum, eblake,
	armbru, seiden, nrb, frankja

On 7/14/22 22:05, Pierre Morel wrote:
> 
> 
> On 7/14/22 20:43, Janis Schoetterl-Glausch wrote:
>> On 6/20/22 16:03, Pierre Morel wrote:
>>> Hi,
>>>
>>> This new spin is essentially for coherence with the last Linux CPU
>>> Topology patch, function testing and coding style modifications.
>>>
>>> Forword
>>> =======
>>>
>>> The goal of this series is to implement CPU topology for S390, it
>>> improves the preceeding series with the implementation of books and
>>> drawers, of non uniform CPU topology and with documentation.
>>>
>>> To use these patches, you will need the Linux series version 10.
>>> You find it there:
>>> https://lkml.org/lkml/2022/6/20/590
>>>
>>> Currently this code is for KVM only, I have no idea if it is interesting
>>> to provide a TCG patch. If ever it will be done in another series.
>>>
>>> To have a better understanding of the S390x CPU Topology and its
>>> implementation in QEMU you can have a look at the documentation in the
>>> last patch or follow the introduction here under.
>>>
>>> A short introduction
>>> ====================
>>>
>>> CPU Topology is described in the S390 POP with essentially the description
>>> of two instructions:
>>>
>>> PTF Perform Topology function used to poll for topology change
>>>      and used to set the polarization but this part is not part of this item.
>>>
>>> STSI Store System Information and the SYSIB 15.1.x providing the Topology
>>>      configuration.
>>>
>>> S390 Topology is a 6 levels hierarchical topology with up to 5 level
>>>      of containers. The last topology level, specifying the CPU cores.
>>>
>>>      This patch series only uses the two lower levels sockets and cores.
>>>           To get the information on the topology, S390 provides the STSI
>>>      instruction, which stores a structures providing the list of the
>>>      containers used in the Machine topology: the SYSIB.
>>>      A selector within the STSI instruction allow to chose how many topology
>>>      levels will be provide in the SYSIB.
>>>
>>>      Using the Topology List Entries (TLE) provided inside the SYSIB we
>>>      the Linux kernel is able to compute the information about the cache
>>>      distance between two cores and can use this information to take
>>>      scheduling decisions.
>>
>> Do the socket, book, ... metaphors and looking at STSI from the existing
>> smp infrastructure even make sense?
> 
> Sorry, I do not understand.
> I admit the cover-letter is old and I did not rewrite it really good since the first patch series.
> 
> What we do is:
> Compute the STSI from the SMP + numa + device QEMU parameters .
> 
>>
>> STSI 15.1.x reports the topology to the guest and for a virtual machine,
>> this topology can be very dynamic. So a CPU can move from from one topology
>> container to another, but the socket of a cpu changing while it's running seems
>> a bit strange. And this isn't supported by this patch series as far as I understand,
>> the only topology changes are on hotplug.
> 
> A CPU changing from a socket to another socket is the only case the PTF instruction reports a change in the topology with the case a new CPU is plug in.

Can a CPU actually change between sockets right now?
The socket-id is computed from the core-id, so it's fixed, is it not?

> It is not expected to appear often but it does appear.
> The code has been removed from the kernel in spin 10 for 2 reasons:
> 1) we decided to first support only dedicated and pinned CPU
> 2) Christian fears it may happen too often due to Linux host scheduling and could be a performance problem

This seems sensible, but now it seems too static.
For example after migration, you cannot tell the guest which CPUs are in the same socket, book, ...,
unless I'm misunderstanding something.
And migration is rare, but something you'd want to be able to react to.
And I could imagine that the vCPUs are pinned most of the time, but the pinning changes occasionally.

> 
> So yes now we only have a topology report on vCPU plug.
> 
> 
> 
> 
> 
> 
> 
>>
> 


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v8 08/12] s390x/cpu_topology: implementing numa for the s390x topology
  2022-07-15  9:11       ` Janis Schoetterl-Glausch
@ 2022-07-15 13:07         ` Pierre Morel
  2022-07-20 17:24           ` Janis Schoetterl-Glausch
  0 siblings, 1 reply; 49+ messages in thread
From: Pierre Morel @ 2022-07-15 13:07 UTC (permalink / raw)
  To: Janis Schoetterl-Glausch, qemu-s390x
  Cc: qemu-devel, borntraeger, pasic, richard.henderson, david, thuth,
	cohuck, mst, pbonzini, kvm, ehabkost, marcel.apfelbaum, eblake,
	armbru, seiden, nrb, frankja



On 7/15/22 11:11, Janis Schoetterl-Glausch wrote:
> On 7/14/22 22:17, Pierre Morel wrote:
>>
>>
>> On 7/14/22 16:57, Janis Schoetterl-Glausch wrote:
>>> On 6/20/22 16:03, Pierre Morel wrote:
>>>> S390x CPU Topology allows a non uniform repartition of the CPU
>>>> inside the topology containers, sockets, books and drawers.
>>>>
>>>> We use numa to place the CPU inside the right topology container
>>>> and report the non uniform topology to the guest.
>>>>
>>>> Note that s390x needs CPU0 to belong to the topology and consequently
>>>> all topology must include CPU0.
>>>>
>>>> We accept a partial QEMU numa definition, in that case undefined CPUs
>>>> are added to free slots in the topology starting with slot 0 and going
>>>> up.
>>>
>>> I don't understand why doing it this way, via numa, makes sense for us.
>>> We report the topology to the guest via STSI, which tells the guest
>>> what the topology "tree" looks like. We don't report any numa distances to the guest.
>>> The natural way to specify where a cpu is added to the vm, seems to me to be
>>> by specify the socket, book, ... IDs when doing a device_add or via -device on
>>> the command line.
>>>
>>> [...]
>>>
>>
>> It is a choice to have the core-id to determine were the CPU is situated in the topology.
>>
>> But yes we can chose the use drawer-id,book-id,socket-id and use a core-id starting on 0 on each socket.
>>
>> It is not done in the current implementation because the core-id implies the socket-id, book-id and drawer-id together with the smp parameters.
>>
>>
> Regardless of whether the core-id or the combination of socket-id, book-id .. is used to specify where a CPU is
> located, why use the numa framework and not just device_add or -device ?

You are right, at least we should be able to use both.
I will work on this.

> 
> That feels way more natural since it should already just work if you can do hotplug.
> At least with core-id and I suspect with a subset of your changes also with socket-id, etc.

yes, it already works with core-id

> 
> Whereas numa is an awkward fit since it's for specifying distances between nodes, which we don't do,
> and you have to use a hack to get it to specify which CPUs to plug (via setting arch_id to -1).
> 

Is it only for this?

-- 
Pierre Morel
IBM Lab Boeblingen

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v8 00/12] s390x: CPU Topology
  2022-07-15  9:31     ` Janis Schoetterl-Glausch
@ 2022-07-15 13:47       ` Pierre Morel
  2022-07-15 18:28         ` Janis Schoetterl-Glausch
  0 siblings, 1 reply; 49+ messages in thread
From: Pierre Morel @ 2022-07-15 13:47 UTC (permalink / raw)
  To: Janis Schoetterl-Glausch, qemu-s390x
  Cc: qemu-devel, borntraeger, pasic, richard.henderson, david, thuth,
	cohuck, mst, pbonzini, kvm, ehabkost, marcel.apfelbaum, eblake,
	armbru, seiden, nrb, frankja



On 7/15/22 11:31, Janis Schoetterl-Glausch wrote:
> On 7/14/22 22:05, Pierre Morel wrote:
>>
>>
>> On 7/14/22 20:43, Janis Schoetterl-Glausch wrote:
>>> On 6/20/22 16:03, Pierre Morel wrote:
>>>> Hi,
>>>>
>>>> This new spin is essentially for coherence with the last Linux CPU
>>>> Topology patch, function testing and coding style modifications.
>>>>
>>>> Forword
>>>> =======
>>>>
>>>> The goal of this series is to implement CPU topology for S390, it
>>>> improves the preceeding series with the implementation of books and
>>>> drawers, of non uniform CPU topology and with documentation.
>>>>
>>>> To use these patches, you will need the Linux series version 10.
>>>> You find it there:
>>>> https://lkml.org/lkml/2022/6/20/590
>>>>
>>>> Currently this code is for KVM only, I have no idea if it is interesting
>>>> to provide a TCG patch. If ever it will be done in another series.
>>>>
>>>> To have a better understanding of the S390x CPU Topology and its
>>>> implementation in QEMU you can have a look at the documentation in the
>>>> last patch or follow the introduction here under.
>>>>
>>>> A short introduction
>>>> ====================
>>>>
>>>> CPU Topology is described in the S390 POP with essentially the description
>>>> of two instructions:
>>>>
>>>> PTF Perform Topology function used to poll for topology change
>>>>       and used to set the polarization but this part is not part of this item.
>>>>
>>>> STSI Store System Information and the SYSIB 15.1.x providing the Topology
>>>>       configuration.
>>>>
>>>> S390 Topology is a 6 levels hierarchical topology with up to 5 level
>>>>       of containers. The last topology level, specifying the CPU cores.
>>>>
>>>>       This patch series only uses the two lower levels sockets and cores.
>>>>            To get the information on the topology, S390 provides the STSI
>>>>       instruction, which stores a structures providing the list of the
>>>>       containers used in the Machine topology: the SYSIB.
>>>>       A selector within the STSI instruction allow to chose how many topology
>>>>       levels will be provide in the SYSIB.
>>>>
>>>>       Using the Topology List Entries (TLE) provided inside the SYSIB we
>>>>       the Linux kernel is able to compute the information about the cache
>>>>       distance between two cores and can use this information to take
>>>>       scheduling decisions.
>>>
>>> Do the socket, book, ... metaphors and looking at STSI from the existing
>>> smp infrastructure even make sense?
>>
>> Sorry, I do not understand.
>> I admit the cover-letter is old and I did not rewrite it really good since the first patch series.
>>
>> What we do is:
>> Compute the STSI from the SMP + numa + device QEMU parameters .
>>
>>>
>>> STSI 15.1.x reports the topology to the guest and for a virtual machine,
>>> this topology can be very dynamic. So a CPU can move from from one topology
>>> container to another, but the socket of a cpu changing while it's running seems
>>> a bit strange. And this isn't supported by this patch series as far as I understand,
>>> the only topology changes are on hotplug.
>>
>> A CPU changing from a socket to another socket is the only case the PTF instruction reports a change in the topology with the case a new CPU is plug in.
> 
> Can a CPU actually change between sockets right now?

To be exact, what I understand is that a shared CPU can be scheduled to 
another real CPU exactly as a guest vCPU can be scheduled by the host to 
another host CPU.

> The socket-id is computed from the core-id, so it's fixed, is it not?

the virtual socket-id is computed from the virtual core-id

> 
>> It is not expected to appear often but it does appear.
>> The code has been removed from the kernel in spin 10 for 2 reasons:
>> 1) we decided to first support only dedicated and pinned CPU
>> 2) Christian fears it may happen too often due to Linux host scheduling and could be a performance problem
> 
> This seems sensible, but now it seems too static.
> For example after migration, you cannot tell the guest which CPUs are in the same socket, book, ...,
> unless I'm misunderstanding something.

No, to do this we would need to ask the kernel about it.

> And migration is rare, but something you'd want to be able to react to.
> And I could imaging that the vCPUs are pinned most of the time, but the pinning changes occasionally.

I think on migration we should just do a kvm_set_mtcr on post_load,
as Nico suggested; everything else seems complicated for a questionable
benefit.


-- 
Pierre Morel
IBM Lab Boeblingen

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v8 00/12] s390x: CPU Topology
  2022-07-15 13:47       ` Pierre Morel
@ 2022-07-15 18:28         ` Janis Schoetterl-Glausch
  2022-07-18 12:32           ` Pierre Morel
  0 siblings, 1 reply; 49+ messages in thread
From: Janis Schoetterl-Glausch @ 2022-07-15 18:28 UTC (permalink / raw)
  To: Pierre Morel, qemu-s390x
  Cc: qemu-devel, borntraeger, pasic, richard.henderson, david, thuth,
	cohuck, mst, pbonzini, kvm, ehabkost, marcel.apfelbaum, eblake,
	armbru, seiden, nrb, frankja

On 7/15/22 15:47, Pierre Morel wrote:
> 
> 
> On 7/15/22 11:31, Janis Schoetterl-Glausch wrote:
>> On 7/14/22 22:05, Pierre Morel wrote:
>>>
>>>
>>> On 7/14/22 20:43, Janis Schoetterl-Glausch wrote:
>>>> On 6/20/22 16:03, Pierre Morel wrote:
>>>>> Hi,
>>>>>
>>>>> This new spin is essentially for coherence with the last Linux CPU
>>>>> Topology patch, function testing and coding style modifications.
>>>>>
>>>>> Forword
>>>>> =======
>>>>>
>>>>> The goal of this series is to implement CPU topology for S390, it
>>>>> improves the preceeding series with the implementation of books and
>>>>> drawers, of non uniform CPU topology and with documentation.
>>>>>
>>>>> To use these patches, you will need the Linux series version 10.
>>>>> You find it there:
>>>>> https://lkml.org/lkml/2022/6/20/590
>>>>>
>>>>> Currently this code is for KVM only, I have no idea if it is interesting
>>>>> to provide a TCG patch. If ever it will be done in another series.
>>>>>
>>>>> To have a better understanding of the S390x CPU Topology and its
>>>>> implementation in QEMU you can have a look at the documentation in the
>>>>> last patch or follow the introduction here under.
>>>>>
>>>>> A short introduction
>>>>> ====================
>>>>>
>>>>> CPU Topology is described in the S390 POP with essentially the description
>>>>> of two instructions:
>>>>>
>>>>> PTF Perform Topology function used to poll for topology change
>>>>>       and used to set the polarization but this part is not part of this item.
>>>>>
>>>>> STSI Store System Information and the SYSIB 15.1.x providing the Topology
>>>>>       configuration.
>>>>>
>>>>> S390 Topology is a 6 levels hierarchical topology with up to 5 level
>>>>>       of containers. The last topology level, specifying the CPU cores.
>>>>>
>>>>>       This patch series only uses the two lower levels sockets and cores.
>>>>>            To get the information on the topology, S390 provides the STSI
>>>>>       instruction, which stores a structures providing the list of the
>>>>>       containers used in the Machine topology: the SYSIB.
>>>>>       A selector within the STSI instruction allow to chose how many topology
>>>>>       levels will be provide in the SYSIB.
>>>>>
>>>>>       Using the Topology List Entries (TLE) provided inside the SYSIB we
>>>>>       the Linux kernel is able to compute the information about the cache
>>>>>       distance between two cores and can use this information to take
>>>>>       scheduling decisions.
>>>>
>>>> Do the socket, book, ... metaphors and looking at STSI from the existing
>>>> smp infrastructure even make sense?
>>>
>>> Sorry, I do not understand.
>>> I admit the cover-letter is old and I did not rewrite it really good since the first patch series.
>>>
>>> What we do is:
>>> Compute the STSI from the SMP + numa + device QEMU parameters .
>>>
>>>>
>>>> STSI 15.1.x reports the topology to the guest and for a virtual machine,
>>>> this topology can be very dynamic. So a CPU can move from from one topology
>>>> container to another, but the socket of a cpu changing while it's running seems
>>>> a bit strange. And this isn't supported by this patch series as far as I understand,
>>>> the only topology changes are on hotplug.
>>>
>>> A CPU changing from a socket to another socket is the only case the PTF instruction reports a change in the topology with the case a new CPU is plug in.
>>
>> Can a CPU actually change between sockets right now?
> 
> To be exact, what I understand is that a shared CPU can be scheduled to another real CPU exactly as a guest vCPU can be scheduled by the host to another host CPU.

Ah, ok, this is what I'm forgetting, and what made communication harder,
there are two ways by which the topology can change:
1. the host topology changes
2. the vCPU threads are scheduled on another host CPU

I've only been thinking about 2.
I assumed some outside entity (libvirt?) pins vCPU threads, and so it would
be the responsibility of that entity to set the topology which then is
reported to the guest. So if you pin vCPUs for the whole lifetime of the vm
then you could do that by specifying the topology up front with -device.
If you want to support migration, then the outside entity would need a way
to tell qemu the updated topology.
 
> 
>> The socket-id is computed from the core-id, so it's fixed, is it not?
> 
> the virtual socket-id is computed from the virtual core-id

Meaning cpu.env.core_id, correct? (which is the same as cpu.cpu_index which is the same as
ms->possible_cpus->cpus[core_id].props.core_id)
And a cpu's core id doesn't change during the lifetime of the vm, right?
And so its socket id doesn't either.

> 
>>
>>> It is not expected to appear often but it does appear.
>>> The code has been removed from the kernel in spin 10 for 2 reasons:
>>> 1) we decided to first support only dedicated and pinned CPU
>>> 2) Christian fears it may happen too often due to Linux host scheduling and could be a performance problem
>>
>> This seems sensible, but now it seems too static.
>> For example after migration, you cannot tell the guest which CPUs are in the same socket, book, ...,
>> unless I'm misunderstanding something.
> 
> No, to do this we would need to ask the kernel about it.

You mean polling /sys/devices/system/cpu/cpu*/topology/*_id ?
That should work if it isn't done too frequently, right?
And if it's done by the entity doing the pinning it could judge if the host topology change
is relevant to the guest and if so tell qemu how to update it.
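
A minimal sketch of such polling, assuming a userland entity simply reads
sysfs; dump_topology_ids() and its cpu_dir parameter are made up for
illustration and are not part of QEMU or libvirt. Which *_id files exist
depends on the host architecture and kernel.

```c
#include <assert.h>
#include <glob.h>
#include <stdio.h>

/*
 * Print every topology ID found below cpu_dir, for example
 * "/sys/devices/system/cpu". Returns the number of IDs read, 0 if
 * nothing matched, or -1 on error.
 */
static int dump_topology_ids(const char *cpu_dir)
{
    char pattern[512];
    glob_t g;
    int count = 0;
    int n, rc;
    size_t i;

    assert(cpu_dir != NULL);

    n = snprintf(pattern, sizeof(pattern), "%s/cpu*/topology/*_id", cpu_dir);
    if (n < 0 || (size_t)n >= sizeof(pattern)) {
        return -1;              /* path would be truncated */
    }

    rc = glob(pattern, 0, NULL, &g);
    if (rc != 0) {
        globfree(&g);
        return rc == GLOB_NOMATCH ? 0 : -1;
    }

    for (i = 0; i < g.gl_pathc; i++) {
        FILE *f = fopen(g.gl_pathv[i], "r");
        int id;

        if (!f) {
            continue;           /* file vanished, e.g. CPU went offline */
        }
        if (fscanf(f, "%d", &id) == 1) {
            printf("%s = %d\n", g.gl_pathv[i], id);
            count++;
        }
        fclose(f);
    }

    globfree(&g);
    return count;
}
```

The pinning entity could run something like this on a timer and compare
the result with the previous snapshot before deciding whether to tell qemu.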
> 
>> And migration is rare, but something you'd want to be able to react to.
>> And I could imaging that the vCPUs are pinned most of the time, but the pinning changes occasionally.
> 
> I think on migration we should just make a kvm_set_mtcr on post_load like Nico suggested everything else seems complicated for a questionable benefit.

But what is the point? The result of STSI reported to the guest doesn't actually change, does it?
Since the same CPUs with the same calculated socket-ids, ..., exist.
You cannot migrate to a vm with a different virtual topology, since the CPUs get matched via the cpu_index
as far as I can tell, which is the same as the core_id, or am I misunderstanding something?
Migrating the MTCR bit is correct: if it is 1 then there was a cpu hotplug that the guest did not yet observe,
but setting it to 1 after migration would be wrong if the STSI result would be the same.
> 
> 


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v8 00/12] s390x: CPU Topology
  2022-07-15 18:28         ` Janis Schoetterl-Glausch
@ 2022-07-18 12:32           ` Pierre Morel
  0 siblings, 0 replies; 49+ messages in thread
From: Pierre Morel @ 2022-07-18 12:32 UTC (permalink / raw)
  To: Janis Schoetterl-Glausch, qemu-s390x
  Cc: qemu-devel, borntraeger, pasic, richard.henderson, david, thuth,
	cohuck, mst, pbonzini, kvm, ehabkost, marcel.apfelbaum, eblake,
	armbru, seiden, nrb, frankja



On 7/15/22 20:28, Janis Schoetterl-Glausch wrote:
> On 7/15/22 15:47, Pierre Morel wrote:
>>
>>
>> On 7/15/22 11:31, Janis Schoetterl-Glausch wrote:
>>> On 7/14/22 22:05, Pierre Morel wrote:
>>>>
>>>>
>>>> On 7/14/22 20:43, Janis Schoetterl-Glausch wrote:
>>>>> On 6/20/22 16:03, Pierre Morel wrote:
>>>>>> Hi,
>>>>>>
>>>>>> This new spin is essentially for coherence with the last Linux CPU
>>>>>> Topology patch, function testing and coding style modifications.
>>>>>>
>>>>>> Forword
>>>>>> =======
>>>>>>
>>>>>> The goal of this series is to implement CPU topology for S390, it
>>>>>> improves the preceeding series with the implementation of books and
>>>>>> drawers, of non uniform CPU topology and with documentation.
>>>>>>
>>>>>> To use these patches, you will need the Linux series version 10.
>>>>>> You find it there:
>>>>>> https://lkml.org/lkml/2022/6/20/590
>>>>>>
>>>>>> Currently this code is for KVM only, I have no idea if it is interesting
>>>>>> to provide a TCG patch. If ever it will be done in another series.
>>>>>>
>>>>>> To have a better understanding of the S390x CPU Topology and its
>>>>>> implementation in QEMU you can have a look at the documentation in the
>>>>>> last patch or follow the introduction here under.
>>>>>>
>>>>>> A short introduction
>>>>>> ====================
>>>>>>
>>>>>> CPU Topology is described in the S390 POP with essentially the description
>>>>>> of two instructions:
>>>>>>
>>>>>> PTF Perform Topology function used to poll for topology change
>>>>>>        and used to set the polarization but this part is not part of this item.
>>>>>>
>>>>>> STSI Store System Information and the SYSIB 15.1.x providing the Topology
>>>>>>        configuration.
>>>>>>
>>>>>> S390 Topology is a 6 levels hierarchical topology with up to 5 level
>>>>>>        of containers. The last topology level, specifying the CPU cores.
>>>>>>
>>>>>>        This patch series only uses the two lower levels sockets and cores.
>>>>>>             To get the information on the topology, S390 provides the STSI
>>>>>>        instruction, which stores a structures providing the list of the
>>>>>>        containers used in the Machine topology: the SYSIB.
>>>>>>        A selector within the STSI instruction allow to chose how many topology
>>>>>>        levels will be provide in the SYSIB.
>>>>>>
>>>>>>        Using the Topology List Entries (TLE) provided inside the SYSIB we
>>>>>>        the Linux kernel is able to compute the information about the cache
>>>>>>        distance between two cores and can use this information to take
>>>>>>        scheduling decisions.
>>>>>
>>>>> Do the socket, book, ... metaphors and looking at STSI from the existing
>>>>> smp infrastructure even make sense?
>>>>
>>>> Sorry, I do not understand.
>>>> I admit the cover-letter is old and I did not rewrite it really good since the first patch series.
>>>>
>>>> What we do is:
>>>> Compute the STSI from the SMP + numa + device QEMU parameters .
>>>>
>>>>>
>>>>> STSI 15.1.x reports the topology to the guest and for a virtual machine,
>>>>> this topology can be very dynamic. So a CPU can move from from one topology
>>>>> container to another, but the socket of a cpu changing while it's running seems
>>>>> a bit strange. And this isn't supported by this patch series as far as I understand,
>>>>> the only topology changes are on hotplug.
>>>>
>>>> A CPU changing from a socket to another socket is the only case the PTF instruction reports a change in the topology with the case a new CPU is plug in.
>>>
>>> Can a CPU actually change between sockets right now?
>>
>> To be exact, what I understand is that a shared CPU can be scheduled to another real CPU exactly as a guest vCPU can be scheduled by the host to another host CPU.
> 
> Ah, ok, this is what I'm forgetting, and what made communication harder,
> there are two ways by which the topology can change:
> 1. the host topology changes
> 2. the vCPU threads are scheduled on another host CPU
> 
> I've been only thinking about the 2.
> I assumed some outside entity (libvirt?) pins vCPU threads, and so it would
> be the responsibility of that entity to set the topology which then is
> reported to the guest. So if you pin vCPUs for the whole lifetime of the vm
> then you could do that by specifying the topology up front with -devices.
> If you want to support migration, then the outside entity would need a way
> to tell qemu the updated topology.

Yes

>   
>>
>>> The socket-id is computed from the core-id, so it's fixed, is it not?
>>
>> the virtual socket-id is computed from the virtual core-id
> 
> Meaning cpu.env.core_id, correct? (which is the same as cpu.cpu_index which is the same as
> ms->possible_cpus->cpus[core_id].props.core_id)
> And a cpu's core id doesn't change during the lifetime of the vm, right?

right

> And so it's socket id doesn't either.

Yes

> 
>>
>>>
>>>> It is not expected to appear often but it does appear.
>>>> The code has been removed from the kernel in spin 10 for 2 reasons:
>>>> 1) we decided to first support only dedicated and pinned CPU
>>>> 2) Christian fears it may happen too often due to Linux host scheduling and could be a performance problem
>>>
>>> This seems sensible, but now it seems too static.
>>> For example after migration, you cannot tell the guest which CPUs are in the same socket, book, ...,
>>> unless I'm misunderstanding something.
>>
>> No, to do this we would need to ask the kernel about it.
> 
> You mean polling /sys/devices/system/cpu/cpu*/topology/*_id ?
> That should work if it isn't done to frequently, right?
> And if it's done by the entity doing the pinning it could judge if the host topology change
> is relevant to the guest and if so tell qemu how to update it.

Yes, I guess we will need to change the core-id, which may be complicated,
or find another way to link the vCPU topology with the host CPU topology.
At first I wanted to have something directly from the kernel, as we have
all the info on vCPU and host topology there.
That is why I had a struct in the UAPI.

Viktor, like you here, is in favor of something in userland only.

For the moment I would like this patch series to stay with a fixed
topology, set by the admin; afterwards we can go on to updating the topology.


>>
>>> And migration is rare, but something you'd want to be able to react to.
>>> And I could imaging that the vCPUs are pinned most of the time, but the pinning changes occasionally.
>>
>> I think on migration we should just make a kvm_set_mtcr on post_load like Nico suggested everything else seems complicated for a questionable benefit.
> 
> But what is the point? The result of STSI reported to the guest doesn't actually change, does it?
> Since the same CPUs with the same calculated socket-ids, ..., exist.
> You cannot migrate to a vm with a different virtual topology, since the CPUs get matched via the cpu_index
> as far as I can tell, which is the same as the core_id, or am I misunderstanding something?
> Migrating the MTCR bit is correct, if it is 1 than there was a cpu hotplug that the guest did not yet observe,
> but setting it to 1 after migration would we wrong if the STSI result would be the same.

That is a good point. IIUC it follows that:
- a CPU hotplug cannot be done during the migration.
- migration cannot be started while a CPU is being hot plugged.


-- 
Pierre Morel
IBM Lab Boeblingen

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v8 08/12] s390x/cpu_topology: implementing numa for the s390x topology
  2022-07-15 13:07         ` Pierre Morel
@ 2022-07-20 17:24           ` Janis Schoetterl-Glausch
  2022-07-21  7:58             ` Pierre Morel
  0 siblings, 1 reply; 49+ messages in thread
From: Janis Schoetterl-Glausch @ 2022-07-20 17:24 UTC (permalink / raw)
  To: Pierre Morel, qemu-s390x
  Cc: qemu-devel, borntraeger, pasic, richard.henderson, david, thuth,
	cohuck, mst, pbonzini, kvm, ehabkost, marcel.apfelbaum, eblake,
	armbru, seiden, nrb, frankja

On 7/15/22 15:07, Pierre Morel wrote:
> 
> 
> On 7/15/22 11:11, Janis Schoetterl-Glausch wrote:
>> On 7/14/22 22:17, Pierre Morel wrote:
>>>
>>>
>>> On 7/14/22 16:57, Janis Schoetterl-Glausch wrote:
>>>> On 6/20/22 16:03, Pierre Morel wrote:
>>>>> S390x CPU Topology allows a non uniform repartition of the CPU
>>>>> inside the topology containers, sockets, books and drawers.
>>>>>
>>>>> We use numa to place the CPU inside the right topology container
>>>>> and report the non uniform topology to the guest.
>>>>>
>>>>> Note that s390x needs CPU0 to belong to the topology and consequently
>>>>> all topology must include CPU0.
>>>>>
>>>>> We accept a partial QEMU numa definition, in that case undefined CPUs
>>>>> are added to free slots in the topology starting with slot 0 and going
>>>>> up.
>>>>
>>>> I don't understand why doing it this way, via numa, makes sense for us.
>>>> We report the topology to the guest via STSI, which tells the guest
>>>> what the topology "tree" looks like. We don't report any numa distances to the guest.
>>>> The natural way to specify where a cpu is added to the vm, seems to me to be
>>>> by specify the socket, book, ... IDs when doing a device_add or via -device on
>>>> the command line.
>>>>
>>>> [...]
>>>>
>>>
>>> It is a choice to have the core-id to determine were the CPU is situated in the topology.
>>>
>>> But yes we can chose the use drawer-id,book-id,socket-id and use a core-id starting on 0 on each socket.
>>>
>>> It is not done in the current implementation because the core-id implies the socket-id, book-id and drawer-id together with the smp parameters.
>>>
>>>
>> Regardless of whether the core-id or the combination of socket-id, book-id .. is used to specify where a CPU is
>> located, why use the numa framework and not just device_add or -device ?
> 
> You are right, at least we should be able to use both.
> I will work on this.
> 
>>
>> That feels way more natural since it should already just work if you can do hotplug.
>> At least with core-id and I suspect with a subset of your changes also with socket-id, etc.
> 
> yes, it already works with core-id
> 
>>
>> Whereas numa is an awkward fit since it's for specifying distances between nodes, which we don't do,
>> and you have to use a hack to get it to specify which CPUs to plug (via setting arch_id to -1).
>>
> 
> Is it only for this?
> 
That's what it looks like to me, but I'm not an expert by any means.
x86 reports distances and more via ACPI, riscv via device tree and power appears to
calculate hierarchy values which the linux kernel will turn into distances again.
That's maybe closest to s390x. However, as far as I can tell all of that is static
and cannot be reconfigured. If we want to have STSI dynamically reflect the topology
at some point in the future, we should have a roadmap for how to achieve that.


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v8 03/12] s390x/cpu_topology: implementating Store Topology System Information
  2022-06-20 14:03 ` [PATCH v8 03/12] s390x/cpu_topology: implementating Store Topology System Information Pierre Morel
  2022-06-27 14:26   ` Janosch Frank
@ 2022-07-20 19:34   ` Janis Schoetterl-Glausch
  2022-07-21 11:23     ` Pierre Morel
  1 sibling, 1 reply; 49+ messages in thread
From: Janis Schoetterl-Glausch @ 2022-07-20 19:34 UTC (permalink / raw)
  To: Pierre Morel, qemu-s390x
  Cc: qemu-devel, borntraeger, pasic, richard.henderson, david, thuth,
	cohuck, mst, pbonzini, kvm, ehabkost, marcel.apfelbaum, eblake,
	armbru, seiden, nrb, frankja

On 6/20/22 16:03, Pierre Morel wrote:
> The handling of STSI is enhanced with the interception of the
> function code 15 for storing CPU topology.
> 
> Using the objects built during the plugging of CPU, we build the
> SYSIB 15_1_x structures.
> 
> With this patch the maximum MNEST level is 2, this is also
> the only level allowed and only SYSIB 15_1_2 will be built.
> 
> Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>
> ---
>  target/s390x/cpu.h          |   2 +
>  target/s390x/cpu_topology.c | 112 ++++++++++++++++++++++++++++++++++++
>  target/s390x/kvm/kvm.c      |   5 ++
>  target/s390x/meson.build    |   1 +
>  4 files changed, 120 insertions(+)
>  create mode 100644 target/s390x/cpu_topology.c
> 
> diff --git a/target/s390x/cpu.h b/target/s390x/cpu.h
> index 216adfde26..9d48087b71 100644
> --- a/target/s390x/cpu.h
> +++ b/target/s390x/cpu.h
> @@ -890,4 +890,6 @@ S390CPU *s390_cpu_addr2state(uint16_t cpu_addr);
>  
>  #include "exec/cpu-all.h"
>  
> +void insert_stsi_15_1_x(S390CPU *cpu, int sel2, __u64 addr, uint8_t ar);
> +
>  #endif
> diff --git a/target/s390x/cpu_topology.c b/target/s390x/cpu_topology.c
> new file mode 100644
> index 0000000000..9f656d7e51
> --- /dev/null
> +++ b/target/s390x/cpu_topology.c
> @@ -0,0 +1,112 @@
> +/*
> + * QEMU S390x CPU Topology
> + *
> + * Copyright IBM Corp. 2022
> + * Author(s): Pierre Morel <pmorel@linux.ibm.com>
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2 or (at
> + * your option) any later version. See the COPYING file in the top-level
> + * directory.
> + */
> +
> +#include "qemu/osdep.h"
> +#include "cpu.h"
> +#include "hw/s390x/pv.h"
> +#include "hw/sysbus.h"
> +#include "hw/s390x/cpu-topology.h"
> +
> +static int stsi_15_container(void *p, int nl, int id)
> +{
> +    SysIBTl_container *tle = (SysIBTl_container *)p;
> +
> +    tle->nl = nl;
> +    tle->id = id;
> +
> +    return sizeof(*tle);
> +}
> +
> +static int stsi_15_cpus(void *p, S390TopologyCores *cd)
> +{
> +    SysIBTl_cpu *tle = (SysIBTl_cpu *)p;
> +
> +    tle->nl = 0;
> +    tle->dedicated = cd->dedicated;
> +    tle->polarity = cd->polarity;
> +    tle->type = cd->cputype;
> +    tle->origin = be16_to_cpu(cd->origin);
> +    tle->mask = be64_to_cpu(cd->mask);
> +
> +    return sizeof(*tle);
> +}
> +
> +static int set_socket(const MachineState *ms, void *p,
> +                      S390TopologySocket *socket)
> +{
> +    BusChild *kid;
> +    int l, len = 0;
> +
> +    len += stsi_15_container(p, 1, socket->socket_id);
> +    p += len;
> +

You could put a comment here, TODO: different cpu types, polarizations not supported,
or similar, since those require a specific order.
 
> +    QTAILQ_FOREACH_REVERSE(kid, &socket->bus->children, sibling) {

Is there no synchronization/RCU read section necessary to guard against a concurrent hotplug?
Since the children are ordered by creation, not core_id, the order of the entries is incorrect.
Ditto for the other equivalent loops.
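
As a rough sketch of the ordering being asked for, the CPU entries of a socket could be sorted by origin before they are emitted. The struct and function names below are hypothetical stand-ins for illustration, not the types from this patch:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for one CPU entry; not the QEMU S390TopologyCores type. */
struct cores_entry {
    uint16_t origin;   /* core id of the first CPU covered by mask */
    uint64_t mask;     /* one bit per CPU of this group */
};

/* qsort comparator: ascending origin, written to avoid integer overflow. */
static int cmp_origin(const void *a, const void *b)
{
    const struct cores_entry *x = a;
    const struct cores_entry *y = b;

    return (x->origin > y->origin) - (x->origin < y->origin);
}

/* Reorder entries so that they appear by core id, not by creation time. */
static void sort_by_origin(struct cores_entry *e, size_t n)
{
    qsort(e, n, sizeof(*e), cmp_origin);
}
```

Keeping the list sorted at insertion time would avoid the sort on every STSI, at the cost of a slightly more involved plug path.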

> +        l = stsi_15_cpus(p, S390_TOPOLOGY_CORES(kid->child));
> +        p += l;
> +        len += l;
> +    }
> +    return len;
> +}
> +
> +static void setup_stsi(const MachineState *ms, void *p, int level)

I don't love the name of this function, it's not very descriptive. fill_sysib_15_1_x ?
Why don't you pass a SysIB_151x* instead of a void*?

> +{
> +    S390TopologyBook *book;
> +    SysIB_151x *sysib;
> +    BusChild *kid;
> +    int len, l;
> +
> +    sysib = (SysIB_151x *)p;
> +    sysib->mnest = level;
> +    sysib->mag[TOPOLOGY_NR_MAG2] = ms->smp.sockets;
> +    sysib->mag[TOPOLOGY_NR_MAG1] = ms->smp.cores * ms->smp.threads;

If I understood STSI right, it doesn't care about threads, so there should not be a multiplication here.
> +
> +    book = s390_get_topology();
> +    len = sizeof(SysIB_151x);
> +    p += len;
> +
> +    QTAILQ_FOREACH_REVERSE(kid, &book->bus->children, sibling) {
> +        l = set_socket(ms, p, S390_TOPOLOGY_SOCKET(kid->child));
> +        p += l;

I'm uncomfortable with advancing the pointer without a check if the page is being overflowed.
With lots of cpus in lots of sockets and a deep hierarchy the topology list can get quite long.
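
One possible shape for such a check, as a minimal sketch: track how much of the page is used and refuse an entry that would not fit. The names (sysib_buf, sysib_append, SYSIB_PAGE_SIZE) are made up for this example, and the 4 KiB size is an assumption standing in for TARGET_PAGE_SIZE:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SYSIB_PAGE_SIZE 4096  /* assumption: the SYSIB occupies one 4 KiB page */

/* Hypothetical cursor over the SYSIB page. Invariant: used <= SYSIB_PAGE_SIZE. */
struct sysib_buf {
    unsigned char *base;
    size_t used;
};

/* Append one topology list entry; return 0 on success, -1 if it would overflow. */
static int sysib_append(struct sysib_buf *b, const void *tle, size_t len)
{
    if (len > SYSIB_PAGE_SIZE - b->used) {
        return -1;   /* caller decides how to report the failure */
    }
    memcpy(b->base + b->used, tle, len);
    b->used += len;
    return 0;
}
```

The caller would then have to decide what to do on -1, for example setting cc 3, which is a policy question this sketch does not answer.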

> +        len += l;
> +    }
> +
> +    sysib->length = be16_to_cpu(len);
> +}
> +
> +void insert_stsi_15_1_x(S390CPU *cpu, int sel2, __u64 addr, uint8_t ar)
> +{
> +    const MachineState *machine = MACHINE(qdev_get_machine());
> +    void *p;
> +    int ret;
> +
> +    /*
> +     * Until the SCLP STSI Facility reporting the MNEST value is used,
> +     * a sel2 value of 2 is the only value allowed in STSI 15.1.x.
> +     */

Do you actually implement the SCLP functionality in this series? You're changing
this check in subsequent patches, but I only see the definition of a new constant,
not that you're presenting it to the guest.

> +    if (sel2 != 2) {
> +        setcc(cpu, 3);
> +        return;
> +    }
> +
> +    p = g_malloc0(TARGET_PAGE_SIZE);

Any reason not to stack allocate the sysib?
> +
> +    setup_stsi(machine, p, 2);
> +
> +    if (s390_is_pv()) {
> +        ret = s390_cpu_pv_mem_write(cpu, 0, p, TARGET_PAGE_SIZE);
> +    } else {
> +        ret = s390_cpu_virt_mem_write(cpu, addr, ar, p, TARGET_PAGE_SIZE);
> +    }

Since we're allowed to not store the reserved space after the sysib, it seems more natural
to do so. I don't know if it makes any difference performance wise, but it doesn't harm.
> +
> +    setcc(cpu, ret ? 3 : 0);

Shouldn't this result in an exception instead? Not sure if you should call
s390_cpu_virt_mem_handle_exc thereafter.

> +    g_free(p);
> +}
> +
> diff --git a/target/s390x/kvm/kvm.c b/target/s390x/kvm/kvm.c
> index 7bd8db0e7b..563bf5ac60 100644
> --- a/target/s390x/kvm/kvm.c
> +++ b/target/s390x/kvm/kvm.c
> @@ -51,6 +51,7 @@
>  #include "hw/s390x/s390-virtio-ccw.h"
>  #include "hw/s390x/s390-virtio-hcall.h"
>  #include "hw/s390x/pv.h"
> +#include "hw/s390x/cpu-topology.h"
>  
>  #ifndef DEBUG_KVM
>  #define DEBUG_KVM  0
> @@ -1918,6 +1919,10 @@ static int handle_stsi(S390CPU *cpu)
>          /* Only sysib 3.2.2 needs post-handling for now. */
>          insert_stsi_3_2_2(cpu, run->s390_stsi.addr, run->s390_stsi.ar);
>          return 0;
> +    case 15:
> +        insert_stsi_15_1_x(cpu, run->s390_stsi.sel2, run->s390_stsi.addr,
> +                           run->s390_stsi.ar);
> +        return 0;
>      default:
>          return 0;
>      }
> diff --git a/target/s390x/meson.build b/target/s390x/meson.build
> index 84c1402a6a..890ccfa789 100644
> --- a/target/s390x/meson.build
> +++ b/target/s390x/meson.build
> @@ -29,6 +29,7 @@ s390x_softmmu_ss.add(files(
>    'sigp.c',
>    'cpu-sysemu.c',
>    'cpu_models_sysemu.c',
> +  'cpu_topology.c',
>  ))
>  
>  s390x_user_ss = ss.source_set()


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v8 08/12] s390x/cpu_topology: implementing numa for the s390x topology
  2022-07-20 17:24           ` Janis Schoetterl-Glausch
@ 2022-07-21  7:58             ` Pierre Morel
  2022-07-21  8:16               ` Janis Schoetterl-Glausch
  0 siblings, 1 reply; 49+ messages in thread
From: Pierre Morel @ 2022-07-21  7:58 UTC (permalink / raw)
  To: Janis Schoetterl-Glausch, qemu-s390x
  Cc: qemu-devel, borntraeger, pasic, richard.henderson, david, thuth,
	cohuck, mst, pbonzini, kvm, ehabkost, marcel.apfelbaum, eblake,
	armbru, seiden, nrb, frankja



On 7/20/22 19:24, Janis Schoetterl-Glausch wrote:
> On 7/15/22 15:07, Pierre Morel wrote:
>>
>>
>> On 7/15/22 11:11, Janis Schoetterl-Glausch wrote:
>>> On 7/14/22 22:17, Pierre Morel wrote:
>>>>
>>>>
>>>> On 7/14/22 16:57, Janis Schoetterl-Glausch wrote:
>>>>> On 6/20/22 16:03, Pierre Morel wrote:
>>>>>> S390x CPU Topology allows a non uniform repartition of the CPU
>>>>>> inside the topology containers, sockets, books and drawers.
>>>>>>
>>>>>> We use numa to place the CPU inside the right topology container
>>>>>> and report the non uniform topology to the guest.
>>>>>>
>>>>>> Note that s390x needs CPU0 to belong to the topology and consequently
>>>>>> all topology must include CPU0.
>>>>>>
>>>>>> We accept a partial QEMU numa definition, in that case undefined CPUs
>>>>>> are added to free slots in the topology starting with slot 0 and going
>>>>>> up.
>>>>>
>>>>> I don't understand why doing it this way, via numa, makes sense for us.
>>>>> We report the topology to the guest via STSI, which tells the guest
>>>>> what the topology "tree" looks like. We don't report any numa distances to the guest.
>>>>> The natural way to specify where a cpu is added to the vm, seems to me to be
>>>>> by specify the socket, book, ... IDs when doing a device_add or via -device on
>>>>> the command line.
>>>>>
>>>>> [...]
>>>>>
>>>>
>>>> It is a choice to have the core-id to determine were the CPU is situated in the topology.
>>>>
>>>> But yes we can chose the use drawer-id,book-id,socket-id and use a core-id starting on 0 on each socket.
>>>>
>>>> It is not done in the current implementation because the core-id implies the socket-id, book-id and drawer-id together with the smp parameters.
>>>>
>>>>
>>> Regardless of whether the core-id or the combination of socket-id, book-id .. is used to specify where a CPU is
>>> located, why use the numa framework and not just device_add or -device ?
>>
>> You are right, at least we should be able to use both.
>> I will work on this.
>>
>>>
>>> That feels way more natural since it should already just work if you can do hotplug.
>>> At least with core-id and I suspect with a subset of your changes also with socket-id, etc.
>>
>> yes, it already works with core-id
>>
>>>
>>> Whereas numa is an awkward fit since it's for specifying distances between nodes, which we don't do,
>>> and you have to use a hack to get it to specify which CPUs to plug (via setting arch_id to -1).
>>>
>>
>> Is it only for this?
>>
> That's what it looks like to me, but I'm not an expert by any means.
> x86 reports distances and more via ACPI, riscv via device tree and power appears to
> calculate hierarchy values which the linux kernel will turn into distances again.
> That's maybe closest to s390x. However, as far as I can tell all of that is static
> and cannot be reconfigured. If we want to have STSI dynamically reflect the topology
> at some point in the future, we should have a roadmap for how to achieve that.
> 
> 


You are right, numa is redundant for us as we specify the topology using 
the core-id.
The roadmap I would like to discuss is using a new:

(qemu) cpu_move src dst

where src is the current core-id and dst is the destination core-id.

I am aware that there are deep implications for the current cpu code but
I do not think it is impossible.
If it is impossible then we would need a new argument to device_add
for cpu to define the "effective_core_id".
But we will still need the new hmp command to update the topology.

-- 
Pierre Morel
IBM Lab Boeblingen

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v8 08/12] s390x/cpu_topology: implementing numa for the s390x topology
  2022-07-21  7:58             ` Pierre Morel
@ 2022-07-21  8:16               ` Janis Schoetterl-Glausch
  2022-07-21 11:41                 ` Pierre Morel
  0 siblings, 1 reply; 49+ messages in thread
From: Janis Schoetterl-Glausch @ 2022-07-21  8:16 UTC (permalink / raw)
  To: Pierre Morel, qemu-s390x
  Cc: qemu-devel, borntraeger, pasic, richard.henderson, david, thuth,
	cohuck, mst, pbonzini, kvm, ehabkost, marcel.apfelbaum, eblake,
	armbru, seiden, nrb, frankja

On 7/21/22 09:58, Pierre Morel wrote:
> 
> 
> On 7/20/22 19:24, Janis Schoetterl-Glausch wrote:
>> On 7/15/22 15:07, Pierre Morel wrote:
>>>
>>>
>>> On 7/15/22 11:11, Janis Schoetterl-Glausch wrote:
>>>> On 7/14/22 22:17, Pierre Morel wrote:
>>>>>
>>>>>
>>>>> On 7/14/22 16:57, Janis Schoetterl-Glausch wrote:
>>>>>> On 6/20/22 16:03, Pierre Morel wrote:
>>>>>>> S390x CPU Topology allows a non uniform repartition of the CPU
>>>>>>> inside the topology containers, sockets, books and drawers.
>>>>>>>
>>>>>>> We use numa to place the CPU inside the right topology container
>>>>>>> and report the non uniform topology to the guest.
>>>>>>>
>>>>>>> Note that s390x needs CPU0 to belong to the topology and consequently
>>>>>>> all topology must include CPU0.
>>>>>>>
>>>>>>> We accept a partial QEMU numa definition, in that case undefined CPUs
>>>>>>> are added to free slots in the topology starting with slot 0 and going
>>>>>>> up.
>>>>>>
>>>>>> I don't understand why doing it this way, via numa, makes sense for us.
>>>>>> We report the topology to the guest via STSI, which tells the guest
>>>>>> what the topology "tree" looks like. We don't report any numa distances to the guest.
>>>>>> The natural way to specify where a cpu is added to the vm, seems to me to be
>>>>>> by specify the socket, book, ... IDs when doing a device_add or via -device on
>>>>>> the command line.
>>>>>>
>>>>>> [...]
>>>>>>
>>>>>
>>>>> It is a choice to have the core-id to determine were the CPU is situated in the topology.
>>>>>
>>>>> But yes we can chose the use drawer-id,book-id,socket-id and use a core-id starting on 0 on each socket.
>>>>>
>>>>> It is not done in the current implementation because the core-id implies the socket-id, book-id and drawer-id together with the smp parameters.
>>>>>
>>>>>
>>>> Regardless of whether the core-id or the combination of socket-id, book-id .. is used to specify where a CPU is
>>>> located, why use the numa framework and not just device_add or -device ?
>>>
>>> You are right, at least we should be able to use both.
>>> I will work on this.
>>>
>>>>
>>>> That feels way more natural since it should already just work if you can do hotplug.
>>>> At least with core-id and I suspect with a subset of your changes also with socket-id, etc.
>>>
>>> yes, it already works with core-id
>>>
>>>>
>>>> Whereas numa is an awkward fit since it's for specifying distances between nodes, which we don't do,
>>>> and you have to use a hack to get it to specify which CPUs to plug (via setting arch_id to -1).
>>>>
>>>
>>> Is it only for this?
>>>
>> That's what it looks like to me, but I'm not an expert by any means.
>> x86 reports distances and more via ACPI, riscv via device tree and power appears to
>> calculate hierarchy values which the linux kernel will turn into distances again.
>> That's maybe closest to s390x. However, as far as I can tell all of that is static
>> and cannot be reconfigured. If we want to have STSI dynamically reflect the topology
>> at some point in the future, we should have a roadmap for how to achieve that.
>>
>>
> 
> 
> You are right, numa is redundant for us as we specify the topology using the core-id.
> The roadmap I would like to discuss is using a new:
> 
> (qemu) cpu_move src dst
> 
> where src is the current core-id and dst is the destination core-id.
> 
> I am aware that there are deep implication on current cpu code but I do not think it is not possible.
> If it is unpossible then we would need a new argument to the device_add for cpu to define the "effective_core_id"
> But we will still need the new hmp command to update the topology.
> 
I don't think core-id is the right one, that's the guest visible CPU address, isn't it?
Although it seems badly named then, since multiple threads are part of the same core (ok, we don't support threads).
Instead, socket-id and book-id could be changed dynamically rather than being computed from the core-id.

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v8 03/12] s390x/cpu_topology: implementating Store Topology System Information
  2022-07-20 19:34   ` Janis Schoetterl-Glausch
@ 2022-07-21 11:23     ` Pierre Morel
  0 siblings, 0 replies; 49+ messages in thread
From: Pierre Morel @ 2022-07-21 11:23 UTC (permalink / raw)
  To: Janis Schoetterl-Glausch, qemu-s390x
  Cc: qemu-devel, borntraeger, pasic, richard.henderson, david, thuth,
	cohuck, mst, pbonzini, kvm, ehabkost, marcel.apfelbaum, eblake,
	armbru, seiden, nrb, frankja



On 7/20/22 21:34, Janis Schoetterl-Glausch wrote:
> On 6/20/22 16:03, Pierre Morel wrote:
>> The handling of STSI is enhanced with the interception of the
>> function code 15 for storing CPU topology.
>>
>> Using the objects built during the plugging of CPU, we build the
>> SYSIB 15_1_x structures.
>>
>> With this patch the maximum MNEST level is 2, this is also
>> the only level allowed and only SYSIB 15_1_2 will be built.
>>
>> Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>
>> ---
>>   target/s390x/cpu.h          |   2 +
>>   target/s390x/cpu_topology.c | 112 ++++++++++++++++++++++++++++++++++++
>>   target/s390x/kvm/kvm.c      |   5 ++
>>   target/s390x/meson.build    |   1 +
>>   4 files changed, 120 insertions(+)
>>   create mode 100644 target/s390x/cpu_topology.c
>>
>> diff --git a/target/s390x/cpu.h b/target/s390x/cpu.h
>> index 216adfde26..9d48087b71 100644
>> --- a/target/s390x/cpu.h
>> +++ b/target/s390x/cpu.h
>> @@ -890,4 +890,6 @@ S390CPU *s390_cpu_addr2state(uint16_t cpu_addr);
>>   
>>   #include "exec/cpu-all.h"
>>   
>> +void insert_stsi_15_1_x(S390CPU *cpu, int sel2, __u64 addr, uint8_t ar);
>> +
>>   #endif
>> diff --git a/target/s390x/cpu_topology.c b/target/s390x/cpu_topology.c
>> new file mode 100644
>> index 0000000000..9f656d7e51
>> --- /dev/null
>> +++ b/target/s390x/cpu_topology.c
>> @@ -0,0 +1,112 @@
>> +/*
>> + * QEMU S390x CPU Topology
>> + *
>> + * Copyright IBM Corp. 2022
>> + * Author(s): Pierre Morel <pmorel@linux.ibm.com>
>> + *
>> + * This work is licensed under the terms of the GNU GPL, version 2 or (at
>> + * your option) any later version. See the COPYING file in the top-level
>> + * directory.
>> + */
>> +
>> +#include "qemu/osdep.h"
>> +#include "cpu.h"
>> +#include "hw/s390x/pv.h"
>> +#include "hw/sysbus.h"
>> +#include "hw/s390x/cpu-topology.h"
>> +
>> +static int stsi_15_container(void *p, int nl, int id)
>> +{
>> +    SysIBTl_container *tle = (SysIBTl_container *)p;
>> +
>> +    tle->nl = nl;
>> +    tle->id = id;
>> +
>> +    return sizeof(*tle);
>> +}
>> +
>> +static int stsi_15_cpus(void *p, S390TopologyCores *cd)
>> +{
>> +    SysIBTl_cpu *tle = (SysIBTl_cpu *)p;
>> +
>> +    tle->nl = 0;
>> +    tle->dedicated = cd->dedicated;
>> +    tle->polarity = cd->polarity;
>> +    tle->type = cd->cputype;
>> +    tle->origin = be16_to_cpu(cd->origin);
>> +    tle->mask = be64_to_cpu(cd->mask);
>> +
>> +    return sizeof(*tle);
>> +}
>> +
>> +static int set_socket(const MachineState *ms, void *p,
>> +                      S390TopologySocket *socket)
>> +{
>> +    BusChild *kid;
>> +    int l, len = 0;
>> +
>> +    len += stsi_15_container(p, 1, socket->socket_id);
>> +    p += len;
>> +
> 
> You could put a comment here, TODO: different cpu types, polarizations not supported,
> or similar, since those require a specific order.

I prefer to put that in the creation process, as there is no control
here; this code just fills the SysIB with the data.
>   
>> +    QTAILQ_FOREACH_REVERSE(kid, &socket->bus->children, sibling) {
> 
> Is there no synchronization/RCU read section necessary to guard against a concurrent hotplug?

Definitely.
I think there must be a synchronization point around the whole topology
creation and the whole SysIB creation.

> Since the children are ordered by creation, not core_id, the order of the entries is incorrect.
> Ditto for the other equivalent loops.

Right, I will change this during creation.

> 
>> +        l = stsi_15_cpus(p, S390_TOPOLOGY_CORES(kid->child));
>> +        p += l;
>> +        len += l;
>> +    }
>> +    return len;
>> +}
>> +
>> +static void setup_stsi(const MachineState *ms, void *p, int level)
> 
> I don't love the name of this function, it's not very descriptive. fill_sysib_15_1_x ?

OK and I will add some s390_ prefix too to make the function easier to find.

> Why don't you pass a SysIB_151x* instead of a void*?

OK

> 
>> +{
>> +    S390TopologyBook *book;
>> +    SysIB_151x *sysib;
>> +    BusChild *kid;
>> +    int len, l;
>> +
>> +    sysib = (SysIB_151x *)p;
>> +    sysib->mnest = level;
>> +    sysib->mag[TOPOLOGY_NR_MAG2] = ms->smp.sockets;
>> +    sysib->mag[TOPOLOGY_NR_MAG1] = ms->smp.cores * ms->smp.threads;
> 
> If I understood STSI right, it doesn't care about threads, so there should not be a multiplication here.

Right, now that threads == 1 is being forced.

>> +
>> +    book = s390_get_topology();
>> +    len = sizeof(SysIB_151x);
>> +    p += len;
>> +
>> +    QTAILQ_FOREACH_REVERSE(kid, &book->bus->children, sibling) {
>> +        l = set_socket(ms, p, S390_TOPOLOGY_SOCKET(kid->child));
>> +        p += l;
> 
> I'm uncomfortable with advancing the pointer without a check if the page is being overflowed.
> With lots of cpus in lots of sockets and a deep hierarchy the topology list can get quite long.

Right, but I had better check on creation I guess, because here I have no
way to tell that the topology is not available.

> 
>> +        len += l;
>> +    }
>> +
>> +    sysib->length = be16_to_cpu(len);
>> +}
>> +
>> +void insert_stsi_15_1_x(S390CPU *cpu, int sel2, __u64 addr, uint8_t ar)
>> +{
>> +    const MachineState *machine = MACHINE(qdev_get_machine());
>> +    void *p;
>> +    int ret;
>> +
>> +    /*
>> +     * Until the SCLP STSI Facility reporting the MNEST value is used,
>> +     * a sel2 value of 2 is the only value allowed in STSI 15.1.x.
>> +     */
> 
> Do you actually implement the SCLP functionality in this series? You're changing
> this check in subsequent patches, but I only see the definition of a new constant,
> not that you're presenting it to the guest.

Exactly!!! I lost the
s390x-SCLP-reporting-the-maximum-nested-topology-.patch
during the transition from v6 to v7!!

Strange, must be a rebase error.

> 
>> +    if (sel2 != 2) {
>> +        setcc(cpu, 3);
>> +        return;
>> +    }
>> +
>> +    p = g_malloc0(TARGET_PAGE_SIZE);
> 
> Any reason not to stack allocate the sysib?

No, this can be done.

>> +
>> +    setup_stsi(machine, p, 2);
>> +
>> +    if (s390_is_pv()) {
>> +        ret = s390_cpu_pv_mem_write(cpu, 0, p, TARGET_PAGE_SIZE);

This goes away for now, STSI(15,x,x) is not supported by PV


>> +    } else {
>> +        ret = s390_cpu_virt_mem_write(cpu, addr, ar, p, TARGET_PAGE_SIZE);
>> +    }
> 
> Since we're allowed to not store the reserved space after the sysib, it seems more natural
> to do so. I don't know if it makes any difference performance wise, but it doesn't harm.

right

>> +
>> +    setcc(cpu, ret ? 3 : 0);
> 
> Shouldn't this result in an exception instead? Not sure if you should call

right,

> s390_cpu_virt_mem_handle_exc thereafter.

I think it is for TCG only, but it should not harm.

Thanks,

regards,
Pierre

-- 
Pierre Morel
IBM Lab Boeblingen

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v8 08/12] s390x/cpu_topology: implementing numa for the s390x topology
  2022-07-21  8:16               ` Janis Schoetterl-Glausch
@ 2022-07-21 11:41                 ` Pierre Morel
  2022-07-22 12:08                   ` Janis Schoetterl-Glausch
  0 siblings, 1 reply; 49+ messages in thread
From: Pierre Morel @ 2022-07-21 11:41 UTC (permalink / raw)
  To: Janis Schoetterl-Glausch, qemu-s390x
  Cc: qemu-devel, borntraeger, pasic, richard.henderson, david, thuth,
	cohuck, mst, pbonzini, kvm, ehabkost, marcel.apfelbaum, eblake,
	armbru, seiden, nrb, frankja



On 7/21/22 10:16, Janis Schoetterl-Glausch wrote:
> On 7/21/22 09:58, Pierre Morel wrote:
>>
>>

...snip...

>>
>> You are right, numa is redundant for us as we specify the topology using the core-id.
>> The roadmap I would like to discuss is using a new:
>>
>> (qemu) cpu_move src dst
>>
>> where src is the current core-id and dst is the destination core-id.
>>
>> I am aware that there are deep implication on current cpu code but I do not think it is not possible.
>> If it is unpossible then we would need a new argument to the device_add for cpu to define the "effective_core_id"
>> But we will still need the new hmp command to update the topology.
>>
> I don't think core-id is the right one, that's the guest visible CPU address, isn't it?

Yes, the topology is the one seen by the guest.

> Although it seems badly named then, since multiple threads are part of the same core (ok, we don't support threads).

I guess that threads will always move with the core or... we do not 
support threads.

> Instead socket-id, book-id could be changed dynamically instead of being computed from the core-id.
> 

What becomes of the core-id ?




-- 
Pierre Morel
IBM Lab Boeblingen

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v8 08/12] s390x/cpu_topology: implementing numa for the s390x topology
  2022-07-21 11:41                 ` Pierre Morel
@ 2022-07-22 12:08                   ` Janis Schoetterl-Glausch
  2022-08-23 16:25                     ` Pierre Morel
  0 siblings, 1 reply; 49+ messages in thread
From: Janis Schoetterl-Glausch @ 2022-07-22 12:08 UTC (permalink / raw)
  To: Pierre Morel, qemu-s390x
  Cc: qemu-devel, borntraeger, pasic, richard.henderson, david, thuth,
	cohuck, mst, pbonzini, kvm, ehabkost, marcel.apfelbaum, eblake,
	armbru, seiden, nrb, frankja

On 7/21/22 13:41, Pierre Morel wrote:
> 
> 
> On 7/21/22 10:16, Janis Schoetterl-Glausch wrote:
>> On 7/21/22 09:58, Pierre Morel wrote:
>>>
>>>
> 
> ...snip...
> 
>>>
>>> You are right, numa is redundant for us as we specify the topology using the core-id.
>>> The roadmap I would like to discuss is using a new:
>>>
>>> (qemu) cpu_move src dst
>>>
>>> where src is the current core-id and dst is the destination core-id.
>>>
>>> I am aware that there are deep implication on current cpu code but I do not think it is not possible.
>>> If it is unpossible then we would need a new argument to the device_add for cpu to define the "effective_core_id"
>>> But we will still need the new hmp command to update the topology.

Why the requirement for a hmp command specifically? Would qom-set on a cpu property work?
>>>
>> I don't think core-id is the right one, that's the guest visible CPU address, isn't it?
> 
> Yes, the topology is the one seen by the guest.
> 
>> Although it seems badly named then, since multiple threads are part of the same core (ok, we don't support threads).
> 
> I guess that threads will always move with the core or... we do not support threads.
> 
>> Instead socket-id, book-id could be changed dynamically instead of being computed from the core-id.
>>
> 
> What becomes of the core-id ?

It would stay the same. It has to, right? Can't change the address as reported by STAP.
It would just be completely independent of the other ids.

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v8 02/12] s390x/cpu_topology: CPU topology objects and structures
  2022-06-20 14:03 ` [PATCH v8 02/12] s390x/cpu_topology: CPU topology objects and structures Pierre Morel
  2022-06-27 13:31   ` Janosch Frank
  2022-07-12 15:40   ` Janis Schoetterl-Glausch
@ 2022-08-23 13:30   ` Thomas Huth
  2022-08-23 16:30     ` Pierre Morel
  2022-08-23 17:41     ` Pierre Morel
  2 siblings, 2 replies; 49+ messages in thread
From: Thomas Huth @ 2022-08-23 13:30 UTC (permalink / raw)
  To: Pierre Morel, qemu-s390x
  Cc: qemu-devel, borntraeger, pasic, richard.henderson, david, cohuck,
	mst, pbonzini, kvm, ehabkost, marcel.apfelbaum, eblake, armbru,
	seiden, nrb, frankja

On 20/06/2022 16.03, Pierre Morel wrote:
> We use new objects to have a dynamic administration of the CPU topology.
> The highest level object in this implementation is the s390 book and
> in this first implementation of CPU topology for S390 we have a single
> book.
> The book is built as a SYSBUS bridge during the CPU initialization.
> Other objects, sockets and core will be built after the parsing
> of the QEMU -smp argument.
> 
> Every object under this single book will be build dynamically
> immediately after a CPU has be realized if it is needed.
> The CPU will fill the sockets once after the other, according to the
> number of core per socket defined during the smp parsing.
> 
> Each CPU inside a socket will be represented by a bit in a 64bit
> unsigned long. Set on plug and clear on unplug of a CPU.
> 
> For the S390 CPU topology, thread and cores are merged into
> topology cores and the number of topology cores is the multiplication
> of cores by the numbers of threads.
> 
> Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>
> ---
>   hw/s390x/cpu-topology.c         | 391 ++++++++++++++++++++++++++++++++
>   hw/s390x/meson.build            |   1 +
>   hw/s390x/s390-virtio-ccw.c      |   6 +
>   include/hw/s390x/cpu-topology.h |  74 ++++++
>   target/s390x/cpu.h              |  47 ++++
>   5 files changed, 519 insertions(+)
>   create mode 100644 hw/s390x/cpu-topology.c
>   create mode 100644 include/hw/s390x/cpu-topology.h
...
> +bool s390_topology_new_cpu(MachineState *ms, int core_id, Error **errp)
> +{
> +    S390TopologyBook *book;
> +    S390TopologySocket *socket;
> +    S390TopologyCores *cores;
> +    int nb_cores_per_socket;
> +    int origin, bit;
> +
> +    book = s390_get_topology();
> +
> +    nb_cores_per_socket = ms->smp.cores * ms->smp.threads;
> +
> +    socket = s390_get_socket(ms, book, core_id / nb_cores_per_socket, errp);
> +    if (!socket) {
> +        return false;
> +    }
> +
> +    /*
> +     * At the core level, each CPU is represented by a bit in a 64bit
> +     * unsigned long. Set on plug and clear on unplug of a CPU.
> +     * The firmware assume that all CPU in the core description have the same
> +     * type, polarization and are all dedicated or shared.
> +     * In the case a socket contains CPU with different type, polarization
> +     * or dedication then they will be defined in different CPU containers.
> +     * Currently we assume all CPU are identical and the only reason to have
> +     * several S390TopologyCores inside a socket is to have more than 64 CPUs
> +     * in that case the origin field, representing the offset of the first CPU
> +     * in the CPU container allows to represent up to the maximal number of
> +     * CPU inside several CPU containers inside the socket container.
> +     */
> +    origin = 64 * (core_id / 64);

Maybe faster:

	origin = core_id & ~63;

By the way, where is this limitation to 64 coming from? Just because we're
using an "unsigned long" for now? Or is this a limitation from the architecture?
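
For reference, the two expressions agree for every non-negative core_id:
64 is a power of two, so clearing the low six bits rounds down to a multiple
of 64. The sketch below is illustrative only. The helper names origin_div
and origin_mask are made up here and are not part of the patch. For negative
values the two forms would differ, because C integer division truncates
toward zero, hence the precondition checks.

```c
#include <assert.h>

/* Division form, as written in the patch under review. */
static inline int origin_div(int core_id)
{
    assert(core_id >= 0);          /* the forms only agree for core_id >= 0 */
    return 64 * (core_id / 64);
}

/* Mask form, as suggested in the review: clear the low six bits. */
static inline int origin_mask(int core_id)
{
    assert(core_id >= 0);
    return core_id & ~63;
}
```

Both helpers map core-ids 0..63 to origin 0, 64..127 to origin 64, and so on.
Whether the mask form is measurably faster is not established in this thread;
compilers commonly reduce a division by a constant power of two to shifts.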

> +    cores = s390_get_cores(ms, socket, origin, errp);
> +    if (!cores) {
> +        return false;
> +    }
> +
> +    bit = 63 - (core_id - origin);
> +    set_bit(bit, &cores->mask);
> +    cores->origin = origin;
> +
> +    return true;
> +}
...
> diff --git a/hw/s390x/s390-virtio-ccw.c b/hw/s390x/s390-virtio-ccw.c
> index cc3097bfee..a586875b24 100644
> --- a/hw/s390x/s390-virtio-ccw.c
> +++ b/hw/s390x/s390-virtio-ccw.c
> @@ -43,6 +43,7 @@
>   #include "sysemu/sysemu.h"
>   #include "hw/s390x/pv.h"
>   #include "migration/blocker.h"
> +#include "hw/s390x/cpu-topology.h"
>   
>   static Error *pv_mig_blocker;
>   
> @@ -89,6 +90,7 @@ static void s390_init_cpus(MachineState *machine)
>       /* initialize possible_cpus */
>       mc->possible_cpu_arch_ids(machine);
>   
> +    s390_topology_setup(machine);

Is this safe with regard to migration? Did you try a ping-pong migration
from an older QEMU to a QEMU with your modifications and back to the older
one? If it does not work, we might need to wire this setup to the machine
types...

>       for (i = 0; i < machine->smp.cpus; i++) {
>           s390x_new_cpu(machine->cpu_type, i, &error_fatal);
>       }
> @@ -306,6 +308,10 @@ static void s390_cpu_plug(HotplugHandler *hotplug_dev,
>       g_assert(!ms->possible_cpus->cpus[cpu->env.core_id].cpu);
>       ms->possible_cpus->cpus[cpu->env.core_id].cpu = OBJECT(dev);
>   
> +    if (!s390_topology_new_cpu(ms, cpu->env.core_id, errp)) {
> +        return;
> +    }
> +
>       if (dev->hotplugged) {
>           raise_irq_cpu_hotplug();
>       }

  Thomas



* Re: [PATCH v8 08/12] s390x/cpu_topology: implementing numa for the s390x topology
  2022-07-22 12:08                   ` Janis Schoetterl-Glausch
@ 2022-08-23 16:25                     ` Pierre Morel
  0 siblings, 0 replies; 49+ messages in thread
From: Pierre Morel @ 2022-08-23 16:25 UTC (permalink / raw)
  To: Janis Schoetterl-Glausch, qemu-s390x
  Cc: qemu-devel, borntraeger, pasic, richard.henderson, david, thuth,
	cohuck, mst, pbonzini, kvm, ehabkost, marcel.apfelbaum, eblake,
	armbru, seiden, nrb, frankja



On 7/22/22 14:08, Janis Schoetterl-Glausch wrote:
> On 7/21/22 13:41, Pierre Morel wrote:
>>
>>
>> On 7/21/22 10:16, Janis Schoetterl-Glausch wrote:
>>> On 7/21/22 09:58, Pierre Morel wrote:
>>>>
>>>>
>>
>> ...snip...
>>
>>>>
>>>> You are right, numa is redundant for us as we specify the topology using the core-id.
>>>> The roadmap I would like to discuss is using a new:
>>>>
>>>> (qemu) cpu_move src dst
>>>>
>>>> where src is the current core-id and dst is the destination core-id.
>>>>
>>>> I am aware that there are deep implication on current cpu code but I do not think it is not possible.
>>>> If it is unpossible then we would need a new argument to the device_add for cpu to define the "effective_core_id"
>>>> But we will still need the new hmp command to update the topology.
> 
> Why the requirement for a hmp command specifically? Would qom-set on a cpu property work?


We will work on modifying the topology in another series.
Let's discuss this at that moment.

>>>>
>>> I don't think core-id is the right one, that's the guest visible CPU address, isn't it?
>>
>> Yes, the topology is the one seen by the guest.
>>
>>> Although it seems badly named then, since multiple threads are part of the same core (ok, we don't support threads).
>>
>> I guess that threads will always move with the core or... we do not support threads.
>>
>>> Instead socket-id, book-id could be changed dynamically instead of being computed from the core-id.
>>>
>>
>> What becomes of the core-id ?
> 
> It would stay the same. It has to, right? Can't change the address as reported by STAP.
> I would just be completely independent of the other ids.
> 

We will work on modifying the topology in another series.

-- 
Pierre Morel
IBM Lab Boeblingen


* Re: [PATCH v8 02/12] s390x/cpu_topology: CPU topology objects and structures
  2022-08-23 13:30   ` Thomas Huth
@ 2022-08-23 16:30     ` Pierre Morel
  2022-08-23 17:41     ` Pierre Morel
  1 sibling, 0 replies; 49+ messages in thread
From: Pierre Morel @ 2022-08-23 16:30 UTC (permalink / raw)
  To: Thomas Huth, qemu-s390x
  Cc: qemu-devel, borntraeger, pasic, richard.henderson, david, cohuck,
	mst, pbonzini, kvm, ehabkost, marcel.apfelbaum, eblake, armbru,
	seiden, nrb, frankja



On 8/23/22 15:30, Thomas Huth wrote:
> On 20/06/2022 16.03, Pierre Morel wrote:
>> We use new objects to have a dynamic administration of the CPU topology.
>> The highest level object in this implementation is the s390 book and
>> in this first implementation of CPU topology for S390 we have a single
>> book.
>> The book is built as a SYSBUS bridge during the CPU initialization.
>> Other objects, sockets and core will be built after the parsing
>> of the QEMU -smp argument.
>>
>> Every object under this single book will be build dynamically
>> immediately after a CPU has be realized if it is needed.
>> The CPU will fill the sockets once after the other, according to the
>> number of core per socket defined during the smp parsing.
>>
>> Each CPU inside a socket will be represented by a bit in a 64bit
>> unsigned long. Set on plug and clear on unplug of a CPU.
>>
>> For the S390 CPU topology, thread and cores are merged into
>> topology cores and the number of topology cores is the multiplication
>> of cores by the numbers of threads.
>>
>> Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>
>> ---
>>   hw/s390x/cpu-topology.c         | 391 ++++++++++++++++++++++++++++++++
>>   hw/s390x/meson.build            |   1 +
>>   hw/s390x/s390-virtio-ccw.c      |   6 +
>>   include/hw/s390x/cpu-topology.h |  74 ++++++
>>   target/s390x/cpu.h              |  47 ++++
>>   5 files changed, 519 insertions(+)
>>   create mode 100644 hw/s390x/cpu-topology.c
>>   create mode 100644 include/hw/s390x/cpu-topology.h
> ...
>> +bool s390_topology_new_cpu(MachineState *ms, int core_id, Error **errp)
>> +{
>> +    S390TopologyBook *book;
>> +    S390TopologySocket *socket;
>> +    S390TopologyCores *cores;
>> +    int nb_cores_per_socket;
>> +    int origin, bit;
>> +
>> +    book = s390_get_topology();
>> +
>> +    nb_cores_per_socket = ms->smp.cores * ms->smp.threads;
>> +
>> +    socket = s390_get_socket(ms, book, core_id / nb_cores_per_socket, 
>> errp);
>> +    if (!socket) {
>> +        return false;
>> +    }
>> +
>> +    /*
>> +     * At the core level, each CPU is represented by a bit in a 64bit
>> +     * unsigned long. Set on plug and clear on unplug of a CPU.
>> +     * The firmware assume that all CPU in the core description have 
>> the same
>> +     * type, polarization and are all dedicated or shared.
>> +     * In the case a socket contains CPU with different type, 
>> polarization
>> +     * or dedication then they will be defined in different CPU 
>> containers.
>> +     * Currently we assume all CPU are identical and the only reason 
>> to have
>> +     * several S390TopologyCores inside a socket is to have more than 
>> 64 CPUs
>> +     * in that case the origin field, representing the offset of the 
>> first CPU
>> +     * in the CPU container allows to represent up to the maximal 
>> number of
>> +     * CPU inside several CPU containers inside the socket container.
>> +     */
>> +    origin = 64 * (core_id / 64);
> 
> Maybe faster:
> 
>      origin = core_id & ~63;

yes thanks

> 
> By the way, where is this limitation to 64 coming from? Just because 
> we're using a "unsigned long" for now? Or is this a limitation from the 
> architecture?
> 

It is a limitation of the architecture, which uses a 64-bit field to
represent the CPUs in a CPU TLE mask.
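
A minimal sketch of how such a 64-bit mask can be filled, following the
arithmetic in the quoted patch (bit = 63 - (core_id - origin)), so that the
CPU at the origin occupies the leftmost bit. The structure and function
names below are hypothetical stand-ins for illustration; they are not the
QEMU structures or the architected TLE layout.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for one CPU container entry. */
struct cpu_tle_sketch {
    uint64_t mask;   /* one bit per CPU, leftmost bit is the CPU at 'origin' */
    int origin;      /* core-id of the first CPU covered by this mask */
};

/* Mark a CPU as present; core_id must fall inside this entry's 64-CPU window. */
static void tle_sketch_plug(struct cpu_tle_sketch *tle, int core_id)
{
    assert(core_id >= tle->origin && core_id < tle->origin + 64);
    tle->mask |= UINT64_C(1) << (63 - (core_id - tle->origin));
}

/* Mark a CPU as absent again. */
static void tle_sketch_unplug(struct cpu_tle_sketch *tle, int core_id)
{
    assert(core_id >= tle->origin && core_id < tle->origin + 64);
    tle->mask &= ~(UINT64_C(1) << (63 - (core_id - tle->origin)));
}
```

With an entry whose origin is 64, plugging core-id 64 sets the most
significant bit and plugging core-id 127 sets the least significant one. A
socket holding more than 64 CPUs would need one such entry per origin.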

But this patch series is quite old now, and I have changed some things
according to the comments I received.
I plan to send the new version before the end of the week.


>> +    cores = s390_get_cores(ms, socket, origin, errp);
>> +    if (!cores) {
>> +        return false;
>> +    }
>> +
>> +    bit = 63 - (core_id - origin);
>> +    set_bit(bit, &cores->mask);
>> +    cores->origin = origin;
>> +
>> +    return true;
>> +}
> ...
>> diff --git a/hw/s390x/s390-virtio-ccw.c b/hw/s390x/s390-virtio-ccw.c
>> index cc3097bfee..a586875b24 100644
>> --- a/hw/s390x/s390-virtio-ccw.c
>> +++ b/hw/s390x/s390-virtio-ccw.c
>> @@ -43,6 +43,7 @@
>>   #include "sysemu/sysemu.h"
>>   #include "hw/s390x/pv.h"
>>   #include "migration/blocker.h"
>> +#include "hw/s390x/cpu-topology.h"
>>   static Error *pv_mig_blocker;
>> @@ -89,6 +90,7 @@ static void s390_init_cpus(MachineState *machine)
>>       /* initialize possible_cpus */
>>       mc->possible_cpu_arch_ids(machine);
>> +    s390_topology_setup(machine);
> 
> Is this safe with regards to migration? Did you tried a ping-pong 
> migration from an older QEMU to a QEMU with your modifications and back 
> to the older one? If it does not work, we might need to wire this setup 
> to the machine types...

I did not, I will add this test.


> 
>>       for (i = 0; i < machine->smp.cpus; i++) {
>>           s390x_new_cpu(machine->cpu_type, i, &error_fatal);
>>       }
>> @@ -306,6 +308,10 @@ static void s390_cpu_plug(HotplugHandler 
>> *hotplug_dev,
>>       g_assert(!ms->possible_cpus->cpus[cpu->env.core_id].cpu);
>>       ms->possible_cpus->cpus[cpu->env.core_id].cpu = OBJECT(dev);
>> +    if (!s390_topology_new_cpu(ms, cpu->env.core_id, errp)) {
>> +        return;
>> +    }
>> +
>>       if (dev->hotplugged) {
>>           raise_irq_cpu_hotplug();
>>       }
> 
>   Thomas
> 

Thanks Thomas,

Regards,
Pierre

-- 
Pierre Morel
IBM Lab Boeblingen


* Re: [PATCH v8 02/12] s390x/cpu_topology: CPU topology objects and structures
  2022-08-23 13:30   ` Thomas Huth
  2022-08-23 16:30     ` Pierre Morel
@ 2022-08-23 17:41     ` Pierre Morel
  2022-08-24  7:30       ` Thomas Huth
  1 sibling, 1 reply; 49+ messages in thread
From: Pierre Morel @ 2022-08-23 17:41 UTC (permalink / raw)
  To: Thomas Huth, qemu-s390x
  Cc: qemu-devel, borntraeger, pasic, richard.henderson, david, cohuck,
	mst, pbonzini, kvm, ehabkost, marcel.apfelbaum, eblake, armbru,
	seiden, nrb, frankja



On 8/23/22 15:30, Thomas Huth wrote:
> On 20/06/2022 16.03, Pierre Morel wrote:
>> We use new objects to have a dynamic administration of the CPU topology.
>> The highest level object in this implementation is the s390 book and
>> in this first implementation of CPU topology for S390 we have a single
>> book.
>> The book is built as a SYSBUS bridge during the CPU initialization.
>> Other objects, sockets and core will be built after the parsing
>> of the QEMU -smp argument.
>>
>> Every object under this single book will be build dynamically
>> immediately after a CPU has be realized if it is needed.
>> The CPU will fill the sockets once after the other, according to the
>> number of core per socket defined during the smp parsing.
>>
>> Each CPU inside a socket will be represented by a bit in a 64bit
>> unsigned long. Set on plug and clear on unplug of a CPU.
>>
>> For the S390 CPU topology, thread and cores are merged into
>> topology cores and the number of topology cores is the multiplication
>> of cores by the numbers of threads.
>>
>> Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>
>> ---
>>   hw/s390x/cpu-topology.c         | 391 ++++++++++++++++++++++++++++++++
>>   hw/s390x/meson.build            |   1 +
>>   hw/s390x/s390-virtio-ccw.c      |   6 +
>>   include/hw/s390x/cpu-topology.h |  74 ++++++
>>   target/s390x/cpu.h              |  47 ++++
>>   5 files changed, 519 insertions(+)
>>   create mode 100644 hw/s390x/cpu-topology.c
>>   create mode 100644 include/hw/s390x/cpu-topology.h
> ...
>> +bool s390_topology_new_cpu(MachineState *ms, int core_id, Error **errp)
>> +{
>> +    S390TopologyBook *book;
>> +    S390TopologySocket *socket;
>> +    S390TopologyCores *cores;
>> +    int nb_cores_per_socket;
>> +    int origin, bit;
>> +
>> +    book = s390_get_topology();
>> +
>> +    nb_cores_per_socket = ms->smp.cores * ms->smp.threads;
>> +
>> +    socket = s390_get_socket(ms, book, core_id / nb_cores_per_socket, 
>> errp);
>> +    if (!socket) {
>> +        return false;
>> +    }
>> +
>> +    /*
>> +     * At the core level, each CPU is represented by a bit in a 64bit
>> +     * unsigned long. Set on plug and clear on unplug of a CPU.
>> +     * The firmware assume that all CPU in the core description have 
>> the same
>> +     * type, polarization and are all dedicated or shared.
>> +     * In the case a socket contains CPU with different type, 
>> polarization
>> +     * or dedication then they will be defined in different CPU 
>> containers.
>> +     * Currently we assume all CPU are identical and the only reason 
>> to have
>> +     * several S390TopologyCores inside a socket is to have more than 
>> 64 CPUs
>> +     * in that case the origin field, representing the offset of the 
>> first CPU
>> +     * in the CPU container allows to represent up to the maximal 
>> number of
>> +     * CPU inside several CPU containers inside the socket container.
>> +     */
>> +    origin = 64 * (core_id / 64);
> 
> Maybe faster:
> 
>      origin = core_id & ~63;
> 
> By the way, where is this limitation to 64 coming from? Just because 
> we're using a "unsigned long" for now? Or is this a limitation from the 
> architecture?
> 
>> +    cores = s390_get_cores(ms, socket, origin, errp);
>> +    if (!cores) {
>> +        return false;
>> +    }
>> +
>> +    bit = 63 - (core_id - origin);
>> +    set_bit(bit, &cores->mask);
>> +    cores->origin = origin;
>> +
>> +    return true;
>> +}
> ...
>> diff --git a/hw/s390x/s390-virtio-ccw.c b/hw/s390x/s390-virtio-ccw.c
>> index cc3097bfee..a586875b24 100644
>> --- a/hw/s390x/s390-virtio-ccw.c
>> +++ b/hw/s390x/s390-virtio-ccw.c
>> @@ -43,6 +43,7 @@
>>   #include "sysemu/sysemu.h"
>>   #include "hw/s390x/pv.h"
>>   #include "migration/blocker.h"
>> +#include "hw/s390x/cpu-topology.h"
>>   static Error *pv_mig_blocker;
>> @@ -89,6 +90,7 @@ static void s390_init_cpus(MachineState *machine)
>>       /* initialize possible_cpus */
>>       mc->possible_cpu_arch_ids(machine);
>> +    s390_topology_setup(machine);
> 
> Is this safe with regards to migration? Did you tried a ping-pong 
> migration from an older QEMU to a QEMU with your modifications and back 
> to the older one? If it does not work, we might need to wire this setup 
> to the machine types...

I checked with the follow-up series:
OLD -> NEW -> OLD -> NEW

It is working fine. Of course we need to fence the CPU topology facility
with ctop=off on the new QEMU to avoid authorizing the new instructions,
PTF and STSI(15).

The new series will also be much simpler: 725 LOC including
documentation, against 1377 without documentation.

I dropped a lot of QEMU objects that did not really have a use, on the
advice of Janis, and also simplified the STSI handling.

I still need to add more comments, so it will grow again, but for a good
reason.

Regards,
Pierre


-- 
Pierre Morel
IBM Lab Boeblingen


* Re: [PATCH v8 02/12] s390x/cpu_topology: CPU topology objects and structures
  2022-08-23 17:41     ` Pierre Morel
@ 2022-08-24  7:30       ` Thomas Huth
  2022-08-24  8:41         ` Pierre Morel
  0 siblings, 1 reply; 49+ messages in thread
From: Thomas Huth @ 2022-08-24  7:30 UTC (permalink / raw)
  To: Pierre Morel, qemu-s390x
  Cc: qemu-devel, borntraeger, pasic, richard.henderson, david, cohuck,
	mst, pbonzini, kvm, ehabkost, marcel.apfelbaum, eblake, armbru,
	seiden, nrb, frankja

On 23/08/2022 19.41, Pierre Morel wrote:
> 
> 
> On 8/23/22 15:30, Thomas Huth wrote:
>> On 20/06/2022 16.03, Pierre Morel wrote:
>>> We use new objects to have a dynamic administration of the CPU topology.
>>> The highest level object in this implementation is the s390 book and
>>> in this first implementation of CPU topology for S390 we have a single
>>> book.
>>> The book is built as a SYSBUS bridge during the CPU initialization.
>>> Other objects, sockets and core will be built after the parsing
>>> of the QEMU -smp argument.
>>>
>>> Every object under this single book will be build dynamically
>>> immediately after a CPU has be realized if it is needed.
>>> The CPU will fill the sockets once after the other, according to the
>>> number of core per socket defined during the smp parsing.
>>>
>>> Each CPU inside a socket will be represented by a bit in a 64bit
>>> unsigned long. Set on plug and clear on unplug of a CPU.
>>>
>>> For the S390 CPU topology, thread and cores are merged into
>>> topology cores and the number of topology cores is the multiplication
>>> of cores by the numbers of threads.
>>>
>>> Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>
>>> ---
>>>   hw/s390x/cpu-topology.c         | 391 ++++++++++++++++++++++++++++++++
>>>   hw/s390x/meson.build            |   1 +
>>>   hw/s390x/s390-virtio-ccw.c      |   6 +
>>>   include/hw/s390x/cpu-topology.h |  74 ++++++
>>>   target/s390x/cpu.h              |  47 ++++
>>>   5 files changed, 519 insertions(+)
>>>   create mode 100644 hw/s390x/cpu-topology.c
>>>   create mode 100644 include/hw/s390x/cpu-topology.h
>> ...
>>> +bool s390_topology_new_cpu(MachineState *ms, int core_id, Error **errp)
>>> +{
>>> +    S390TopologyBook *book;
>>> +    S390TopologySocket *socket;
>>> +    S390TopologyCores *cores;
>>> +    int nb_cores_per_socket;
>>> +    int origin, bit;
>>> +
>>> +    book = s390_get_topology();
>>> +
>>> +    nb_cores_per_socket = ms->smp.cores * ms->smp.threads;
>>> +
>>> +    socket = s390_get_socket(ms, book, core_id / nb_cores_per_socket, 
>>> errp);
>>> +    if (!socket) {
>>> +        return false;
>>> +    }
>>> +
>>> +    /*
>>> +     * At the core level, each CPU is represented by a bit in a 64bit
>>> +     * unsigned long. Set on plug and clear on unplug of a CPU.
>>> +     * The firmware assume that all CPU in the core description have the 
>>> same
>>> +     * type, polarization and are all dedicated or shared.
>>> +     * In the case a socket contains CPU with different type, polarization
>>> +     * or dedication then they will be defined in different CPU containers.
>>> +     * Currently we assume all CPU are identical and the only reason to 
>>> have
>>> +     * several S390TopologyCores inside a socket is to have more than 64 
>>> CPUs
>>> +     * in that case the origin field, representing the offset of the 
>>> first CPU
>>> +     * in the CPU container allows to represent up to the maximal number of
>>> +     * CPU inside several CPU containers inside the socket container.
>>> +     */
>>> +    origin = 64 * (core_id / 64);
>>
>> Maybe faster:
>>
>>      origin = core_id & ~63;
>>
>> By the way, where is this limitation to 64 coming from? Just because we're 
>> using a "unsigned long" for now? Or is this a limitation from the 
>> architecture?
>>
>>> +    cores = s390_get_cores(ms, socket, origin, errp);
>>> +    if (!cores) {
>>> +        return false;
>>> +    }
>>> +
>>> +    bit = 63 - (core_id - origin);
>>> +    set_bit(bit, &cores->mask);
>>> +    cores->origin = origin;
>>> +
>>> +    return true;
>>> +}
>> ...
>>> diff --git a/hw/s390x/s390-virtio-ccw.c b/hw/s390x/s390-virtio-ccw.c
>>> index cc3097bfee..a586875b24 100644
>>> --- a/hw/s390x/s390-virtio-ccw.c
>>> +++ b/hw/s390x/s390-virtio-ccw.c
>>> @@ -43,6 +43,7 @@
>>>   #include "sysemu/sysemu.h"
>>>   #include "hw/s390x/pv.h"
>>>   #include "migration/blocker.h"
>>> +#include "hw/s390x/cpu-topology.h"
>>>   static Error *pv_mig_blocker;
>>> @@ -89,6 +90,7 @@ static void s390_init_cpus(MachineState *machine)
>>>       /* initialize possible_cpus */
>>>       mc->possible_cpu_arch_ids(machine);
>>> +    s390_topology_setup(machine);
>>
>> Is this safe with regards to migration? Did you tried a ping-pong 
>> migration from an older QEMU to a QEMU with your modifications and back to 
>> the older one? If it does not work, we might need to wire this setup to 
>> the machine types...
> 
> I checked with the follow-up series :
> OLD-> NEW -> OLD -> NEW
> 
> It is working fine, of course we need to fence the CPU topology facility 
> with ctop=off on the new QEMU to avoid authorizing the new instructions, PTF 
> and STSI(15).

When using an older machine type, the facility should be disabled by 
default, so the user does not have to know that ctop=off has to be set ... 
so I think you should only do the s390_topology_setup() by default if using 
the 7.2 machine type (or newer).
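
The idea can be sketched as a per-machine-type default that decides whether
the topology setup runs. Everything below is hypothetical and only
illustrates the gating; the structure, field and function names are
invented for this example and are not QEMU's machine class API.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-machine-type description; names are illustrative only. */
struct machine_type_sketch {
    const char *name;
    bool topology_default;   /* run the CPU topology setup by default? */
};

/* Older machine types keep the facility off so migration stays compatible. */
static const struct machine_type_sketch machine_7_1 = { "7.1", false };
/* The 7.2 machine type (or newer) would enable it by default. */
static const struct machine_type_sketch machine_7_2 = { "7.2", true };

/* Decide whether the (hypothetical) topology setup should run at CPU init. */
static bool topology_setup_wanted(const struct machine_type_sketch *mt)
{
    return mt->topology_default;
}
```

With this shape, a guest started with an older machine type never sees the
facility unless it is explicitly requested, so the user does not need to
know about ctop=off.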

  Thomas



* Re: [PATCH v8 02/12] s390x/cpu_topology: CPU topology objects and structures
  2022-08-24  7:30       ` Thomas Huth
@ 2022-08-24  8:41         ` Pierre Morel
  0 siblings, 0 replies; 49+ messages in thread
From: Pierre Morel @ 2022-08-24  8:41 UTC (permalink / raw)
  To: Thomas Huth, qemu-s390x
  Cc: qemu-devel, borntraeger, pasic, richard.henderson, david, cohuck,
	mst, pbonzini, kvm, ehabkost, marcel.apfelbaum, eblake, armbru,
	seiden, nrb, frankja



On 8/24/22 09:30, Thomas Huth wrote:
> On 23/08/2022 19.41, Pierre Morel wrote:
>>
>>
>> On 8/23/22 15:30, Thomas Huth wrote:
>>> On 20/06/2022 16.03, Pierre Morel wrote:
>>>> We use new objects to have a dynamic administration of the CPU 
>>>> topology.
>>>> The highest level object in this implementation is the s390 book and
>>>> in this first implementation of CPU topology for S390 we have a single
>>>> book.
>>>> The book is built as a SYSBUS bridge during the CPU initialization.
>>>> Other objects, sockets and core will be built after the parsing
>>>> of the QEMU -smp argument.
>>>>
>>>> Every object under this single book will be build dynamically
>>>> immediately after a CPU has be realized if it is needed.
>>>> The CPU will fill the sockets once after the other, according to the
>>>> number of core per socket defined during the smp parsing.
>>>>
>>>> Each CPU inside a socket will be represented by a bit in a 64bit
>>>> unsigned long. Set on plug and clear on unplug of a CPU.
>>>>
>>>> For the S390 CPU topology, thread and cores are merged into
>>>> topology cores and the number of topology cores is the multiplication
>>>> of cores by the numbers of threads.
>>>>
>>>> Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>
>>>> ---
>>>>   hw/s390x/cpu-topology.c         | 391 
>>>> ++++++++++++++++++++++++++++++++
>>>>   hw/s390x/meson.build            |   1 +
>>>>   hw/s390x/s390-virtio-ccw.c      |   6 +
>>>>   include/hw/s390x/cpu-topology.h |  74 ++++++
>>>>   target/s390x/cpu.h              |  47 ++++
>>>>   5 files changed, 519 insertions(+)
>>>>   create mode 100644 hw/s390x/cpu-topology.c
>>>>   create mode 100644 include/hw/s390x/cpu-topology.h
>>> ...
>>>> +bool s390_topology_new_cpu(MachineState *ms, int core_id, Error 
>>>> **errp)
>>>> +{
>>>> +    S390TopologyBook *book;
>>>> +    S390TopologySocket *socket;
>>>> +    S390TopologyCores *cores;
>>>> +    int nb_cores_per_socket;
>>>> +    int origin, bit;
>>>> +
>>>> +    book = s390_get_topology();
>>>> +
>>>> +    nb_cores_per_socket = ms->smp.cores * ms->smp.threads;
>>>> +
>>>> +    socket = s390_get_socket(ms, book, core_id / 
>>>> nb_cores_per_socket, errp);
>>>> +    if (!socket) {
>>>> +        return false;
>>>> +    }
>>>> +
>>>> +    /*
>>>> +     * At the core level, each CPU is represented by a bit in a 64bit
>>>> +     * unsigned long. Set on plug and clear on unplug of a CPU.
>>>> +     * The firmware assume that all CPU in the core description 
>>>> have the same
>>>> +     * type, polarization and are all dedicated or shared.
>>>> +     * In the case a socket contains CPU with different type, 
>>>> polarization
>>>> +     * or dedication then they will be defined in different CPU 
>>>> containers.
>>>> +     * Currently we assume all CPU are identical and the only 
>>>> reason to have
>>>> +     * several S390TopologyCores inside a socket is to have more 
>>>> than 64 CPUs
>>>> +     * in that case the origin field, representing the offset of 
>>>> the first CPU
>>>> +     * in the CPU container allows to represent up to the maximal 
>>>> number of
>>>> +     * CPU inside several CPU containers inside the socket container.
>>>> +     */
>>>> +    origin = 64 * (core_id / 64);
>>>
>>> Maybe faster:
>>>
>>>      origin = core_id & ~63;
>>>
>>> By the way, where is this limitation to 64 coming from? Just because 
>>> we're using a "unsigned long" for now? Or is this a limitation from 
>>> the architecture?
>>>
>>>> +    cores = s390_get_cores(ms, socket, origin, errp);
>>>> +    if (!cores) {
>>>> +        return false;
>>>> +    }
>>>> +
>>>> +    bit = 63 - (core_id - origin);
>>>> +    set_bit(bit, &cores->mask);
>>>> +    cores->origin = origin;
>>>> +
>>>> +    return true;
>>>> +}
>>> ...
>>>> diff --git a/hw/s390x/s390-virtio-ccw.c b/hw/s390x/s390-virtio-ccw.c
>>>> index cc3097bfee..a586875b24 100644
>>>> --- a/hw/s390x/s390-virtio-ccw.c
>>>> +++ b/hw/s390x/s390-virtio-ccw.c
>>>> @@ -43,6 +43,7 @@
>>>>   #include "sysemu/sysemu.h"
>>>>   #include "hw/s390x/pv.h"
>>>>   #include "migration/blocker.h"
>>>> +#include "hw/s390x/cpu-topology.h"
>>>>   static Error *pv_mig_blocker;
>>>> @@ -89,6 +90,7 @@ static void s390_init_cpus(MachineState *machine)
>>>>       /* initialize possible_cpus */
>>>>       mc->possible_cpu_arch_ids(machine);
>>>> +    s390_topology_setup(machine);
>>>
>>> Is this safe with regards to migration? Did you tried a ping-pong 
>>> migration from an older QEMU to a QEMU with your modifications and 
>>> back to the older one? If it does not work, we might need to wire 
>>> this setup to the machine types...
>>
>> I checked with the follow-up series :
>> OLD-> NEW -> OLD -> NEW
>>
>> It is working fine, of course we need to fence the CPU topology 
>> facility with ctop=off on the new QEMU to avoid authorizing the new 
>> instructions, PTF and STSI(15).
> 
> When using an older machine type, the facility should be disabled by 
> default, so the user does not have to know that ctop=off has to be set 
> ... so I think you should only do the s390_topology_setup() by default 
> if using the 7.2 machine type (or newer).
> 
>   Thomas
> 

Oh OK, thanks.
I will add this to the next series, of course.

Regards,
Pierre


-- 
Pierre Morel
IBM Lab Boeblingen


end of thread, other threads:[~2022-08-24  8:42 UTC | newest]

Thread overview: 49+ messages (download: mbox.gz / follow: Atom feed)
2022-06-20 14:03 [PATCH v8 00/12] s390x: CPU Topology Pierre Morel
2022-06-20 14:03 ` [PATCH v8 01/12] Update Linux Headers Pierre Morel
2022-06-20 14:03 ` [PATCH v8 02/12] s390x/cpu_topology: CPU topology objects and structures Pierre Morel
2022-06-27 13:31   ` Janosch Frank
2022-06-28 11:08     ` Pierre Morel
2022-06-29 15:25     ` Pierre Morel
2022-07-04 11:47       ` Janosch Frank
2022-07-04 14:51         ` Pierre Morel
2022-07-12 15:40   ` Janis Schoetterl-Glausch
2022-07-13 14:59     ` Pierre Morel
2022-07-14 10:38       ` Janis Schoetterl-Glausch
2022-07-14 11:25         ` Pierre Morel
2022-07-14 12:50           ` Janis Schoetterl-Glausch
2022-07-14 19:26             ` Pierre Morel
2022-08-23 13:30   ` Thomas Huth
2022-08-23 16:30     ` Pierre Morel
2022-08-23 17:41     ` Pierre Morel
2022-08-24  7:30       ` Thomas Huth
2022-08-24  8:41         ` Pierre Morel
2022-06-20 14:03 ` [PATCH v8 03/12] s390x/cpu_topology: implementating Store Topology System Information Pierre Morel
2022-06-27 14:26   ` Janosch Frank
2022-06-28 11:03     ` Pierre Morel
2022-07-20 19:34   ` Janis Schoetterl-Glausch
2022-07-21 11:23     ` Pierre Morel
2022-06-20 14:03 ` [PATCH v8 04/12] s390x/cpu_topology: Adding books to CPU topology Pierre Morel
2022-06-20 14:03 ` [PATCH v8 05/12] s390x/cpu_topology: Adding books to STSI Pierre Morel
2022-06-20 14:03 ` [PATCH v8 06/12] s390x/cpu_topology: Adding drawers to CPU topology Pierre Morel
2022-06-20 14:03 ` [PATCH v8 07/12] s390x/cpu_topology: Adding drawers to STSI Pierre Morel
2022-06-20 14:03 ` [PATCH v8 08/12] s390x/cpu_topology: implementing numa for the s390x topology Pierre Morel
2022-07-14 14:57   ` Janis Schoetterl-Glausch
2022-07-14 20:17     ` Pierre Morel
2022-07-15  9:11       ` Janis Schoetterl-Glausch
2022-07-15 13:07         ` Pierre Morel
2022-07-20 17:24           ` Janis Schoetterl-Glausch
2022-07-21  7:58             ` Pierre Morel
2022-07-21  8:16               ` Janis Schoetterl-Glausch
2022-07-21 11:41                 ` Pierre Morel
2022-07-22 12:08                   ` Janis Schoetterl-Glausch
2022-08-23 16:25                     ` Pierre Morel
2022-06-20 14:03 ` [PATCH v8 09/12] target/s390x: interception of PTF instruction Pierre Morel
2022-06-20 14:03 ` [PATCH v8 10/12] s390x/cpu_topology: resetting the Topology-Change-Report Pierre Morel
2022-06-20 14:03 ` [PATCH v8 11/12] s390x/cpu_topology: CPU topology migration Pierre Morel
2022-06-20 14:03 ` [PATCH v8 12/12] s390x/cpu_topology: activating CPU topology Pierre Morel
2022-07-14 18:43 ` [PATCH v8 00/12] s390x: CPU Topology Janis Schoetterl-Glausch
2022-07-14 20:05   ` Pierre Morel
2022-07-15  9:31     ` Janis Schoetterl-Glausch
2022-07-15 13:47       ` Pierre Morel
2022-07-15 18:28         ` Janis Schoetterl-Glausch
2022-07-18 12:32           ` Pierre Morel
