* [RFC PATCH v3 00/13] hw/arm/virt: Introduce cpu and cache topology support
@ 2020-11-09  3:04 Ying Fang
  2020-11-09  3:04 ` [RFC PATCH v3 01/13] hw/arm/virt: Spell out smp.cpus and smp.max_cpus Ying Fang
                   ` (12 more replies)
  0 siblings, 13 replies; 25+ messages in thread
From: Ying Fang @ 2020-11-09  3:04 UTC (permalink / raw)
  To: peter.maydell
  Cc: drjones, zhang.zhanghailiang, qemu-devel, shannon.zhaosl,
	qemu-arm, alistair.francis, Ying Fang, imammedo, salil.mehta

An accurate cpu topology may help improve the cpu scheduler's decision
making when dealing with multi-core systems, so a cpu topology description
helps to provide the guest with the right view. Cpu cache information may
also have a slight impact on the scheduling domains, and userspace software
may even check the cpu cache information to apply optimizations. Dario
Faggioli's talk [0] likewise shows that the virtual topology may have an
impact on scheduling performance. This patch series is therefore posted to
provide cpu and cache topology support for the arm platform.

Both fdt and ACPI are introduced to present the cpu and cache topology.
To describe the cpu topology via ACPI, a PPTT table is introduced, built
from processor hierarchy node structures. To describe the cpu cache
information, a default cache hierarchy is given and built according to the
cache type structure defined by ACPI; it can be made configurable later.
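As a rough sketch (field layout per the ACPI 6.2 PPTT definition; the
struct names here are illustrative, not the ones the patches add), the two
PPTT structures involved look like this:

```c
#include <stdint.h>

/* PPTT processor hierarchy node (type 0). A node flagged as a physical
 * package represents a socket; nested nodes represent cores and threads. */
struct processor_hierarchy_node {
    uint8_t  type;                    /* 0 */
    uint8_t  length;
    uint8_t  reserved[2];
    uint32_t flags;                   /* physical package, ACPI id valid, ... */
    uint32_t parent;                  /* offset of parent node, 0 for a root */
    uint32_t acpi_processor_id;
    uint32_t num_private_resources;
    /* followed by num_private_resources 32-bit references, e.g. to caches */
} __attribute__((packed));

/* PPTT cache type structure (type 1), used for the cache hierarchy. */
struct cache_type_structure {
    uint8_t  type;                    /* 1 */
    uint8_t  length;
    uint8_t  reserved[2];
    uint32_t flags;                   /* which of the fields below are valid */
    uint32_t next_level_of_cache;     /* offset of next-level cache, 0 if none */
    uint32_t size;                    /* in bytes */
    uint32_t number_of_sets;
    uint8_t  associativity;
    uint8_t  attributes;              /* alloc type, cache type, write policy */
    uint16_t line_size;
} __attribute__((packed));
```

The default cache hierarchy mentioned above amounts to emitting one cache
type structure per cache level and pointing each cpu's hierarchy node at
its private L1/L2 resources.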

The RFC v1 was posted at [1]. There we tried to map the MPIDR register
onto the cpu topology, but that approach was wrong: Andrew pointed out
that the Linux kernel is going to stop using MPIDR for topology
information [2]. The root cause is that the MPIDR register has been abused
by ARM OEM manufacturers; it is only an identifier for a specific cpu, not
a representation of the topology. This series is also rebased on Andrew's
latest shared branch [4].

This patch series was initially based on the patches posted by Andrew
Jones [3]. I jumped in on it since some OS vendor cooperation partners are
eager for the feature. Thanks to Andrew for his contribution.

After applying this patch series, launch a guest with the virt-5.3
machine type and a cpu topology configured as sockets:cores:threads =
2:4:2, and you will get the output below from the lscpu command.

-----------------------------------------
Architecture:                    aarch64
CPU op-mode(s):                  64-bit
Byte Order:                      Little Endian
CPU(s):                          16
On-line CPU(s) list:             0-15
Thread(s) per core:              2
Core(s) per socket:              4
Socket(s):                       2
NUMA node(s):                    2
Vendor ID:                       HiSilicon
Model:                           0
Model name:                      Kunpeng-920
Stepping:                        0x1
BogoMIPS:                        200.00
L1d cache:                       512 KiB
L1i cache:                       512 KiB
L2 cache:                        4 MiB
L3 cache:                        128 MiB
NUMA node0 CPU(s):               0-7
NUMA node1 CPU(s):               8-15
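For reference, a command line along the following lines produces such a
guest (illustrative only; the memory, NUMA and boot options are
assumptions, the relevant parts are -machine and -smp):

```shell
qemu-system-aarch64 \
    -machine virt-5.3,gic-version=3 \
    -cpu host -enable-kvm \
    -smp 16,sockets=2,cores=4,threads=2 \
    -m 4G \
    -numa node,cpus=0-7 -numa node,cpus=8-15 \
    -kernel Image -append "console=ttyAMA0 root=/dev/vda" \
    -nographic
```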


changelog
v2 -> v3:
- Make use of possible_cpus->cpus[i].cpu to check against current online cpus

v1 -> v2:
- Rebased to the latest branch shared by Andrew Jones [4]
- Stop mapping MPIDR into vcpu topology

[0] https://kvmforum2020.sched.com/event/eE1y/virtual-topology-for-virtual-machines-friend-or-foe-dario-faggioli-suse
[1] https://lists.gnu.org/archive/html/qemu-devel/2020-09/msg06027.html
[2] https://patchwork.kernel.org/project/linux-arm-kernel/patch/20200829130016.26106-1-valentin.schneider@arm.com/
[3] https://patchwork.ozlabs.org/project/qemu-devel/cover/20180704124923.32483-1-drjones@redhat.com
[4] https://github.com/rhdrjones/qemu/commits/virt-cpu-topology-refresh 

Andrew Jones (5):
  hw/arm/virt: Spell out smp.cpus and smp.max_cpus
  hw/arm/virt: Remove unused variable
  hw/arm/virt: Replace smp_parse with one that prefers cores
  device_tree: Add qemu_fdt_add_path
  hw/arm/virt: DT: add cpu-map

Ying Fang (8):
  hw: add compat machines for 5.3
  hw/arm/virt-acpi-build: distinguish possible and present cpus
  hw/acpi/aml-build: add processor hierarchy node structure
  hw/arm/virt-acpi-build: add PPTT table
  target/arm/cpu: Add cpu cache description for arm
  hw/arm/virt: add fdt cache information
  hw/acpi/aml-build: Build ACPI cpu cache hierarchy information
  hw/arm/virt-acpi-build: Enable cpu and cache topology

 device_tree.c                |  45 +++++-
 hw/acpi/aml-build.c          |  68 +++++++++
 hw/arm/virt-acpi-build.c     |  99 ++++++++++++-
 hw/arm/virt.c                | 273 +++++++++++++++++++++++++++++++----
 hw/core/machine.c            |   3 +
 hw/i386/pc.c                 |   3 +
 hw/i386/pc_piix.c            |  15 +-
 hw/i386/pc_q35.c             |  14 +-
 hw/ppc/spapr.c               |  15 +-
 hw/s390x/s390-virtio-ccw.c   |  14 +-
 include/hw/acpi/acpi-defs.h  |  14 ++
 include/hw/acpi/aml-build.h  |  11 ++
 include/hw/arm/virt.h        |   4 +-
 include/hw/boards.h          |   3 +
 include/hw/i386/pc.h         |   3 +
 include/sysemu/device_tree.h |   1 +
 target/arm/cpu.c             |  42 ++++++
 target/arm/cpu.h             |  27 ++++
 18 files changed, 609 insertions(+), 45 deletions(-)

-- 
2.23.0



^ permalink raw reply	[flat|nested] 25+ messages in thread

* [RFC PATCH v3 01/13] hw/arm/virt: Spell out smp.cpus and smp.max_cpus
  2020-11-09  3:04 [RFC PATCH v3 00/13] hw/arm/virt: Introduce cpu and cache topology support Ying Fang
@ 2020-11-09  3:04 ` Ying Fang
  2020-11-09 10:45   ` Salil Mehta
  2020-11-09  3:04 ` [RFC PATCH v3 02/13] hw/arm/virt: Remove unused variable Ying Fang
                   ` (11 subsequent siblings)
  12 siblings, 1 reply; 25+ messages in thread
From: Ying Fang @ 2020-11-09  3:04 UTC (permalink / raw)
  To: peter.maydell
  Cc: drjones, zhang.zhanghailiang, qemu-devel, shannon.zhaosl,
	qemu-arm, alistair.francis, imammedo, salil.mehta

From: Andrew Jones <drjones@redhat.com>

Prefer to spell out the smp.cpus and smp.max_cpus machine state
variables in order to make grepping easier and to avoid any
confusion as to what cpu count is being used where.

Signed-off-by: Andrew Jones <drjones@redhat.com>
---
 hw/arm/virt-acpi-build.c |  8 +++----
 hw/arm/virt.c            | 51 +++++++++++++++++++---------------------
 include/hw/arm/virt.h    |  2 +-
 3 files changed, 29 insertions(+), 32 deletions(-)

diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
index 9747a6458f..a222981737 100644
--- a/hw/arm/virt-acpi-build.c
+++ b/hw/arm/virt-acpi-build.c
@@ -57,11 +57,11 @@
 
 #define ARM_SPI_BASE 32
 
-static void acpi_dsdt_add_cpus(Aml *scope, int smp_cpus)
+static void acpi_dsdt_add_cpus(Aml *scope, int cpus)
 {
     uint16_t i;
 
-    for (i = 0; i < smp_cpus; i++) {
+    for (i = 0; i < cpus; i++) {
         Aml *dev = aml_device("C%.03X", i);
         aml_append(dev, aml_name_decl("_HID", aml_string("ACPI0007")));
         aml_append(dev, aml_name_decl("_UID", aml_int(i)));
@@ -480,7 +480,7 @@ build_madt(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
     gicd->base_address = cpu_to_le64(memmap[VIRT_GIC_DIST].base);
     gicd->version = vms->gic_version;
 
-    for (i = 0; i < vms->smp_cpus; i++) {
+    for (i = 0; i < MACHINE(vms)->smp.cpus; i++) {
         AcpiMadtGenericCpuInterface *gicc = acpi_data_push(table_data,
                                                            sizeof(*gicc));
         ARMCPU *armcpu = ARM_CPU(qemu_get_cpu(i));
@@ -599,7 +599,7 @@ build_dsdt(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
      * the RTC ACPI device at all when using UEFI.
      */
     scope = aml_scope("\\_SB");
-    acpi_dsdt_add_cpus(scope, vms->smp_cpus);
+    acpi_dsdt_add_cpus(scope, ms->smp.cpus);
     acpi_dsdt_add_uart(scope, &memmap[VIRT_UART],
                        (irqmap[VIRT_UART] + ARM_SPI_BASE));
     if (vmc->acpi_expose_flash) {
diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index e465a988d6..0069fa1298 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -322,7 +322,7 @@ static void fdt_add_timer_nodes(const VirtMachineState *vms)
     if (vms->gic_version == VIRT_GIC_VERSION_2) {
         irqflags = deposit32(irqflags, GIC_FDT_IRQ_PPI_CPU_START,
                              GIC_FDT_IRQ_PPI_CPU_WIDTH,
-                             (1 << vms->smp_cpus) - 1);
+                             (1 << MACHINE(vms)->smp.cpus) - 1);
     }
 
     qemu_fdt_add_subnode(vms->fdt, "/timer");
@@ -363,7 +363,7 @@ static void fdt_add_cpu_nodes(const VirtMachineState *vms)
      *  The simplest way to go is to examine affinity IDs of all our CPUs. If
      *  at least one of them has Aff3 populated, we set #address-cells to 2.
      */
-    for (cpu = 0; cpu < vms->smp_cpus; cpu++) {
+    for (cpu = 0; cpu < ms->smp.cpus; cpu++) {
         ARMCPU *armcpu = ARM_CPU(qemu_get_cpu(cpu));
 
         if (armcpu->mp_affinity & ARM_AFF3_MASK) {
@@ -376,7 +376,7 @@ static void fdt_add_cpu_nodes(const VirtMachineState *vms)
     qemu_fdt_setprop_cell(vms->fdt, "/cpus", "#address-cells", addr_cells);
     qemu_fdt_setprop_cell(vms->fdt, "/cpus", "#size-cells", 0x0);
 
-    for (cpu = vms->smp_cpus - 1; cpu >= 0; cpu--) {
+    for (cpu = ms->smp.cpus - 1; cpu >= 0; cpu--) {
         char *nodename = g_strdup_printf("/cpus/cpu@%d", cpu);
         ARMCPU *armcpu = ARM_CPU(qemu_get_cpu(cpu));
         CPUState *cs = CPU(armcpu);
@@ -387,7 +387,7 @@ static void fdt_add_cpu_nodes(const VirtMachineState *vms)
                                     armcpu->dtb_compatible);
 
         if (vms->psci_conduit != QEMU_PSCI_CONDUIT_DISABLED
-            && vms->smp_cpus > 1) {
+            && ms->smp.cpus > 1) {
             qemu_fdt_setprop_string(vms->fdt, nodename,
                                         "enable-method", "psci");
         }
@@ -533,7 +533,7 @@ static void fdt_add_pmu_nodes(const VirtMachineState *vms)
     if (vms->gic_version == VIRT_GIC_VERSION_2) {
         irqflags = deposit32(irqflags, GIC_FDT_IRQ_PPI_CPU_START,
                              GIC_FDT_IRQ_PPI_CPU_WIDTH,
-                             (1 << vms->smp_cpus) - 1);
+                             (1 << MACHINE(vms)->smp.cpus) - 1);
     }
 
     qemu_fdt_add_subnode(vms->fdt, "/pmu");
@@ -622,14 +622,13 @@ static void create_gic(VirtMachineState *vms)
     SysBusDevice *gicbusdev;
     const char *gictype;
     int type = vms->gic_version, i;
-    unsigned int smp_cpus = ms->smp.cpus;
     uint32_t nb_redist_regions = 0;
 
     gictype = (type == 3) ? gicv3_class_name() : gic_class_name();
 
     vms->gic = qdev_new(gictype);
     qdev_prop_set_uint32(vms->gic, "revision", type);
-    qdev_prop_set_uint32(vms->gic, "num-cpu", smp_cpus);
+    qdev_prop_set_uint32(vms->gic, "num-cpu", ms->smp.cpus);
     /* Note that the num-irq property counts both internal and external
      * interrupts; there are always 32 of the former (mandated by GIC spec).
      */
@@ -641,7 +640,7 @@ static void create_gic(VirtMachineState *vms)
     if (type == 3) {
         uint32_t redist0_capacity =
                     vms->memmap[VIRT_GIC_REDIST].size / GICV3_REDIST_SIZE;
-        uint32_t redist0_count = MIN(smp_cpus, redist0_capacity);
+        uint32_t redist0_count = MIN(ms->smp.cpus, redist0_capacity);
 
         nb_redist_regions = virt_gicv3_redist_region_count(vms);
 
@@ -654,7 +653,7 @@ static void create_gic(VirtMachineState *vms)
                     vms->memmap[VIRT_HIGH_GIC_REDIST2].size / GICV3_REDIST_SIZE;
 
             qdev_prop_set_uint32(vms->gic, "redist-region-count[1]",
-                MIN(smp_cpus - redist0_count, redist1_capacity));
+                MIN(ms->smp.cpus - redist0_count, redist1_capacity));
         }
     } else {
         if (!kvm_irqchip_in_kernel()) {
@@ -683,7 +682,7 @@ static void create_gic(VirtMachineState *vms)
      * maintenance interrupt signal to the appropriate GIC PPI inputs,
      * and the GIC's IRQ/FIQ/VIRQ/VFIQ interrupt outputs to the CPU's inputs.
      */
-    for (i = 0; i < smp_cpus; i++) {
+    for (i = 0; i < ms->smp.cpus; i++) {
         DeviceState *cpudev = DEVICE(qemu_get_cpu(i));
         int ppibase = NUM_IRQS + i * GIC_INTERNAL + GIC_NR_SGIS;
         int irq;
@@ -711,7 +710,7 @@ static void create_gic(VirtMachineState *vms)
         } else if (vms->virt) {
             qemu_irq irq = qdev_get_gpio_in(vms->gic,
                                             ppibase + ARCH_GIC_MAINT_IRQ);
-            sysbus_connect_irq(gicbusdev, i + 4 * smp_cpus, irq);
+            sysbus_connect_irq(gicbusdev, i + 4 * ms->smp.cpus, irq);
         }
 
         qdev_connect_gpio_out_named(cpudev, "pmu-interrupt", 0,
@@ -719,11 +718,11 @@ static void create_gic(VirtMachineState *vms)
                                                      + VIRTUAL_PMU_IRQ));
 
         sysbus_connect_irq(gicbusdev, i, qdev_get_gpio_in(cpudev, ARM_CPU_IRQ));
-        sysbus_connect_irq(gicbusdev, i + smp_cpus,
+        sysbus_connect_irq(gicbusdev, i + ms->smp.cpus,
                            qdev_get_gpio_in(cpudev, ARM_CPU_FIQ));
-        sysbus_connect_irq(gicbusdev, i + 2 * smp_cpus,
+        sysbus_connect_irq(gicbusdev, i + 2 * ms->smp.cpus,
                            qdev_get_gpio_in(cpudev, ARM_CPU_VIRQ));
-        sysbus_connect_irq(gicbusdev, i + 3 * smp_cpus,
+        sysbus_connect_irq(gicbusdev, i + 3 * ms->smp.cpus,
                            qdev_get_gpio_in(cpudev, ARM_CPU_VFIQ));
     }
 
@@ -1572,7 +1571,7 @@ static void virt_set_memmap(VirtMachineState *vms)
  */
 static void finalize_gic_version(VirtMachineState *vms)
 {
-    unsigned int max_cpus = MACHINE(vms)->smp.max_cpus;
+    MachineState *ms = MACHINE(vms);
 
     if (kvm_enabled()) {
         int probe_bitmap;
@@ -1613,7 +1612,8 @@ static void finalize_gic_version(VirtMachineState *vms)
             }
             return;
         case VIRT_GIC_VERSION_NOSEL:
-            if ((probe_bitmap & KVM_ARM_VGIC_V2) && max_cpus <= GIC_NCPU) {
+            if ((probe_bitmap & KVM_ARM_VGIC_V2) &&
+                ms->smp.max_cpus <= GIC_NCPU) {
                 vms->gic_version = VIRT_GIC_VERSION_2;
             } else if (probe_bitmap & KVM_ARM_VGIC_V3) {
                 /*
@@ -1622,7 +1622,7 @@ static void finalize_gic_version(VirtMachineState *vms)
                  * to v3. In any case defaulting to v2 would be broken.
                  */
                 vms->gic_version = VIRT_GIC_VERSION_3;
-            } else if (max_cpus > GIC_NCPU) {
+            } else if (ms->smp.max_cpus > GIC_NCPU) {
                 error_report("host only supports in-kernel GICv2 emulation "
                              "but more than 8 vcpus are requested");
                 exit(1);
@@ -1743,8 +1743,6 @@ static void machvirt_init(MachineState *machine)
     bool firmware_loaded;
     bool aarch64 = true;
     bool has_ged = !vmc->no_ged;
-    unsigned int smp_cpus = machine->smp.cpus;
-    unsigned int max_cpus = machine->smp.max_cpus;
 
     /*
      * In accelerated mode, the memory map is computed earlier in kvm_type()
@@ -1815,10 +1813,10 @@ static void machvirt_init(MachineState *machine)
         virt_max_cpus = GIC_NCPU;
     }
 
-    if (max_cpus > virt_max_cpus) {
+    if (machine->smp.max_cpus > virt_max_cpus) {
         error_report("Number of SMP CPUs requested (%d) exceeds max CPUs "
                      "supported by machine 'mach-virt' (%d)",
-                     max_cpus, virt_max_cpus);
+                     machine->smp.max_cpus, virt_max_cpus);
         exit(1);
     }
 
@@ -1843,7 +1841,7 @@ static void machvirt_init(MachineState *machine)
         Object *cpuobj;
         CPUState *cs;
 
-        if (n >= smp_cpus) {
+        if (n >= machine->smp.cpus) {
             break;
         }
 
@@ -2015,7 +2013,7 @@ static void machvirt_init(MachineState *machine)
     }
 
     vms->bootinfo.ram_size = machine->ram_size;
-    vms->bootinfo.nb_cpus = smp_cpus;
+    vms->bootinfo.nb_cpus = machine->smp.cpus;
     vms->bootinfo.board_id = -1;
     vms->bootinfo.loader_start = vms->memmap[VIRT_MEM].base;
     vms->bootinfo.get_dtb = machvirt_dtb;
@@ -2208,17 +2206,16 @@ static int64_t virt_get_default_cpu_node_id(const MachineState *ms, int idx)
 static const CPUArchIdList *virt_possible_cpu_arch_ids(MachineState *ms)
 {
     int n;
-    unsigned int max_cpus = ms->smp.max_cpus;
     VirtMachineState *vms = VIRT_MACHINE(ms);
 
     if (ms->possible_cpus) {
-        assert(ms->possible_cpus->len == max_cpus);
+        assert(ms->possible_cpus->len == ms->smp.max_cpus);
         return ms->possible_cpus;
     }
 
     ms->possible_cpus = g_malloc0(sizeof(CPUArchIdList) +
-                                  sizeof(CPUArchId) * max_cpus);
-    ms->possible_cpus->len = max_cpus;
+                                  sizeof(CPUArchId) * ms->smp.max_cpus);
+    ms->possible_cpus->len = ms->smp.max_cpus;
     for (n = 0; n < ms->possible_cpus->len; n++) {
         ms->possible_cpus->cpus[n].type = ms->cpu_type;
         ms->possible_cpus->cpus[n].arch_id =
diff --git a/include/hw/arm/virt.h b/include/hw/arm/virt.h
index aad6d69841..953d94acc0 100644
--- a/include/hw/arm/virt.h
+++ b/include/hw/arm/virt.h
@@ -181,7 +181,7 @@ static inline int virt_gicv3_redist_region_count(VirtMachineState *vms)
 
     assert(vms->gic_version == VIRT_GIC_VERSION_3);
 
-    return vms->smp_cpus > redist0_capacity ? 2 : 1;
+    return MACHINE(vms)->smp.cpus > redist0_capacity ? 2 : 1;
 }
 
 #endif /* QEMU_ARM_VIRT_H */
-- 
2.23.0




* [RFC PATCH v3 02/13] hw/arm/virt: Remove unused variable
  2020-11-09  3:04 [RFC PATCH v3 00/13] hw/arm/virt: Introduce cpu and cache topology support Ying Fang
  2020-11-09  3:04 ` [RFC PATCH v3 01/13] hw/arm/virt: Spell out smp.cpus and smp.max_cpus Ying Fang
@ 2020-11-09  3:04 ` Ying Fang
  2020-11-09 10:54   ` Salil Mehta
  2020-11-09  3:04 ` [RFC PATCH v3 03/13] hw/arm/virt: Replace smp_parse with one that prefers cores Ying Fang
                   ` (10 subsequent siblings)
  12 siblings, 1 reply; 25+ messages in thread
From: Ying Fang @ 2020-11-09  3:04 UTC (permalink / raw)
  To: peter.maydell
  Cc: drjones, zhang.zhanghailiang, qemu-devel, shannon.zhaosl,
	qemu-arm, alistair.francis, imammedo, salil.mehta

From: Andrew Jones <drjones@redhat.com>

We no longer use the smp_cpus virtual machine state variable.
Remove it.

Signed-off-by: Andrew Jones <drjones@redhat.com>
---
 hw/arm/virt.c         | 2 --
 include/hw/arm/virt.h | 1 -
 2 files changed, 3 deletions(-)

diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 0069fa1298..ea24b576c6 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -1820,8 +1820,6 @@ static void machvirt_init(MachineState *machine)
         exit(1);
     }
 
-    vms->smp_cpus = smp_cpus;
-
     if (vms->virt && kvm_enabled()) {
         error_report("mach-virt: KVM does not support providing "
                      "Virtualization extensions to the guest CPU");
diff --git a/include/hw/arm/virt.h b/include/hw/arm/virt.h
index 953d94acc0..010f24f580 100644
--- a/include/hw/arm/virt.h
+++ b/include/hw/arm/virt.h
@@ -151,7 +151,6 @@ struct VirtMachineState {
     MemMapEntry *memmap;
     char *pciehb_nodename;
     const int *irqmap;
-    int smp_cpus;
     void *fdt;
     int fdt_size;
     uint32_t clock_phandle;
-- 
2.23.0




* [RFC PATCH v3 03/13] hw/arm/virt: Replace smp_parse with one that prefers cores
  2020-11-09  3:04 [RFC PATCH v3 00/13] hw/arm/virt: Introduce cpu and cache topology support Ying Fang
  2020-11-09  3:04 ` [RFC PATCH v3 01/13] hw/arm/virt: Spell out smp.cpus and smp.max_cpus Ying Fang
  2020-11-09  3:04 ` [RFC PATCH v3 02/13] hw/arm/virt: Remove unused variable Ying Fang
@ 2020-11-09  3:04 ` Ying Fang
  2020-11-09 11:01   ` Salil Mehta
  2020-11-09  3:04 ` [RFC PATCH v3 04/13] device_tree: Add qemu_fdt_add_path Ying Fang
                   ` (9 subsequent siblings)
  12 siblings, 1 reply; 25+ messages in thread
From: Ying Fang @ 2020-11-09  3:04 UTC (permalink / raw)
  To: peter.maydell
  Cc: drjones, zhang.zhanghailiang, qemu-devel, shannon.zhaosl,
	qemu-arm, alistair.francis, imammedo, salil.mehta

From: Andrew Jones <drjones@redhat.com>

The virt machine type has never used the CPU topology parameters, other
than the number of online CPUs and max CPUs. When choosing how to allocate
those CPUs, the default has been to assume cores. In preparation for using
the other CPU topology parameters, let's use an smp_parse that prefers
cores over sockets. We can also enforce that the topology matches
max_cpus, because we have no legacy to preserve.

Signed-off-by: Andrew Jones <drjones@redhat.com>
---
 hw/arm/virt.c | 76 +++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 76 insertions(+)

diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index ea24b576c6..ba902b53ba 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -78,6 +78,8 @@
 #include "hw/virtio/virtio-iommu.h"
 #include "hw/char/pl011.h"
 #include "qemu/guest-random.h"
+#include "qapi/qmp/qerror.h"
+#include "sysemu/replay.h"
 
 #define DEFINE_VIRT_MACHINE_LATEST(major, minor, latest) \
     static void virt_##major##_##minor##_class_init(ObjectClass *oc, \
@@ -2444,6 +2446,79 @@ static int virt_kvm_type(MachineState *ms, const char *type_str)
     return requested_pa_size > 40 ? requested_pa_size : 0;
 }
 
+/*
+ * Unlike smp_parse() in hw/core/machine.c, we prefer cores over sockets,
+ * e.g. '-smp 8' creates 1 socket with 8 cores.  Whereas '-smp 8' with
+ * hw/core/machine.c's smp_parse() creates 8 sockets, each with 1 core.
+ * Additionally, we can enforce the topology matches max_cpus check,
+ * because we have no legacy to preserve.
+ */
+static void virt_smp_parse(MachineState *ms, QemuOpts *opts)
+{
+    if (opts) {
+        unsigned cpus    = qemu_opt_get_number(opts, "cpus", 0);
+        unsigned sockets = qemu_opt_get_number(opts, "sockets", 0);
+        unsigned cores   = qemu_opt_get_number(opts, "cores", 0);
+        unsigned threads = qemu_opt_get_number(opts, "threads", 0);
+
+        /*
+         * Compute missing values; prefer cores over sockets and
+         * sockets over threads.
+         */
+        if (cpus == 0 || cores == 0) {
+            sockets = sockets > 0 ? sockets : 1;
+            threads = threads > 0 ? threads : 1;
+            if (cpus == 0) {
+                cores = cores > 0 ? cores : 1;
+                cpus = cores * threads * sockets;
+            } else {
+                ms->smp.max_cpus = qemu_opt_get_number(opts, "maxcpus", cpus);
+                cores = ms->smp.max_cpus / (sockets * threads);
+            }
+        } else if (sockets == 0) {
+            threads = threads > 0 ? threads : 1;
+            sockets = cpus / (cores * threads);
+            sockets = sockets > 0 ? sockets : 1;
+        } else if (threads == 0) {
+            threads = cpus / (cores * sockets);
+            threads = threads > 0 ? threads : 1;
+        } else if (sockets * cores * threads < cpus) {
+            error_report("cpu topology: "
+                         "sockets (%u) * cores (%u) * threads (%u) < "
+                         "smp_cpus (%u)",
+                         sockets, cores, threads, cpus);
+            exit(1);
+        }
+
+        ms->smp.max_cpus = qemu_opt_get_number(opts, "maxcpus", cpus);
+
+        if (ms->smp.max_cpus < cpus) {
+            error_report("maxcpus must be equal to or greater than smp");
+            exit(1);
+        }
+
+        if (sockets * cores * threads != ms->smp.max_cpus) {
+            error_report("cpu topology: "
+                         "sockets (%u) * cores (%u) * threads (%u)"
+                         "!= maxcpus (%u)",
+                         sockets, cores, threads,
+                         ms->smp.max_cpus);
+            exit(1);
+        }
+
+        ms->smp.cpus = cpus;
+        ms->smp.cores = cores;
+        ms->smp.threads = threads;
+        ms->smp.sockets = sockets;
+    }
+
+    if (ms->smp.cpus > 1) {
+        Error *blocker = NULL;
+        error_setg(&blocker, QERR_REPLAY_NOT_SUPPORTED, "smp");
+        replay_add_blocker(blocker);
+    }
+}
+
 static void virt_machine_class_init(ObjectClass *oc, void *data)
 {
     MachineClass *mc = MACHINE_CLASS(oc);
@@ -2469,6 +2544,7 @@ static void virt_machine_class_init(ObjectClass *oc, void *data)
     mc->cpu_index_to_instance_props = virt_cpu_index_to_props;
     mc->default_cpu_type = ARM_CPU_TYPE_NAME("cortex-a15");
     mc->get_default_cpu_node_id = virt_get_default_cpu_node_id;
+    mc->smp_parse = virt_smp_parse;
     mc->kvm_type = virt_kvm_type;
     assert(!mc->get_hotplug_handler);
     mc->get_hotplug_handler = virt_machine_get_hotplug_handler;
-- 
2.23.0




* [RFC PATCH v3 04/13] device_tree: Add qemu_fdt_add_path
  2020-11-09  3:04 [RFC PATCH v3 00/13] hw/arm/virt: Introduce cpu and cache topology support Ying Fang
                   ` (2 preceding siblings ...)
  2020-11-09  3:04 ` [RFC PATCH v3 03/13] hw/arm/virt: Replace smp_parse with one that prefers cores Ying Fang
@ 2020-11-09  3:04 ` Ying Fang
  2020-11-09  3:04 ` [RFC PATCH v3 05/13] hw: add compat machines for 5.3 Ying Fang
                   ` (8 subsequent siblings)
  12 siblings, 0 replies; 25+ messages in thread
From: Ying Fang @ 2020-11-09  3:04 UTC (permalink / raw)
  To: peter.maydell
  Cc: drjones, zhang.zhanghailiang, qemu-devel, shannon.zhaosl,
	qemu-arm, alistair.francis, imammedo, salil.mehta

From: Andrew Jones <drjones@redhat.com>

qemu_fdt_add_path() works like qemu_fdt_add_subnode(), except
it also adds any missing parent nodes. We also tweak an error
message of qemu_fdt_add_subnode().

We'll make use of the new function in a coming patch.

Signed-off-by: Andrew Jones <drjones@redhat.com>
---
 device_tree.c                | 45 ++++++++++++++++++++++++++++++++++--
 include/sysemu/device_tree.h |  1 +
 2 files changed, 44 insertions(+), 2 deletions(-)

diff --git a/device_tree.c b/device_tree.c
index b335dae707..c080909bb9 100644
--- a/device_tree.c
+++ b/device_tree.c
@@ -515,8 +515,8 @@ int qemu_fdt_add_subnode(void *fdt, const char *name)
 
     retval = fdt_add_subnode(fdt, parent, basename);
     if (retval < 0) {
-        error_report("FDT: Failed to create subnode %s: %s", name,
-                     fdt_strerror(retval));
+        error_report("%s: Failed to create subnode %s: %s",
+                     __func__, name, fdt_strerror(retval));
         exit(1);
     }
 
@@ -524,6 +524,47 @@ int qemu_fdt_add_subnode(void *fdt, const char *name)
     return retval;
 }
 
+/*
+ * Like qemu_fdt_add_subnode(), but will add all missing
+ * subnodes in the path.
+ */
+int qemu_fdt_add_path(void *fdt, const char *path)
+{
+    char *dupname, *basename, *p;
+    int parent, retval = -1;
+
+    if (path[0] != '/') {
+        return retval;
+    }
+
+    parent = fdt_path_offset(fdt, "/");
+    p = dupname = g_strdup(path);
+
+    while (p) {
+        *p = '/';
+        basename = p + 1;
+        p = strchr(p + 1, '/');
+        if (p) {
+            *p = '\0';
+        }
+        retval = fdt_path_offset(fdt, dupname);
+        if (retval < 0 && retval != -FDT_ERR_NOTFOUND) {
+            error_report("%s: Invalid path %s: %s",
+                         __func__, path, fdt_strerror(retval));
+            exit(1);
+        } else if (retval == -FDT_ERR_NOTFOUND) {
+            retval = fdt_add_subnode(fdt, parent, basename);
+            if (retval < 0) {
+                break;
+            }
+        }
+        parent = retval;
+    }
+
+    g_free(dupname);
+    return retval;
+}
+
 void qemu_fdt_dumpdtb(void *fdt, int size)
 {
     const char *dumpdtb = qemu_opt_get(qemu_get_machine_opts(), "dumpdtb");
diff --git a/include/sysemu/device_tree.h b/include/sysemu/device_tree.h
index 982c89345f..15fb98af98 100644
--- a/include/sysemu/device_tree.h
+++ b/include/sysemu/device_tree.h
@@ -104,6 +104,7 @@ uint32_t qemu_fdt_get_phandle(void *fdt, const char *path);
 uint32_t qemu_fdt_alloc_phandle(void *fdt);
 int qemu_fdt_nop_node(void *fdt, const char *node_path);
 int qemu_fdt_add_subnode(void *fdt, const char *name);
+int qemu_fdt_add_path(void *fdt, const char *path);
 
 #define qemu_fdt_setprop_cells(fdt, node_path, property, ...)                 \
     do {                                                                      \
-- 
2.23.0




* [RFC PATCH v3 05/13] hw: add compat machines for 5.3
  2020-11-09  3:04 [RFC PATCH v3 00/13] hw/arm/virt: Introduce cpu and cache topology support Ying Fang
                   ` (3 preceding siblings ...)
  2020-11-09  3:04 ` [RFC PATCH v3 04/13] device_tree: Add qemu_fdt_add_path Ying Fang
@ 2020-11-09  3:04 ` Ying Fang
  2020-11-09  3:04 ` [RFC PATCH v3 06/13] hw/arm/virt: DT: add cpu-map Ying Fang
                   ` (7 subsequent siblings)
  12 siblings, 0 replies; 25+ messages in thread
From: Ying Fang @ 2020-11-09  3:04 UTC (permalink / raw)
  To: peter.maydell
  Cc: drjones, zhang.zhanghailiang, qemu-devel, shannon.zhaosl,
	qemu-arm, alistair.francis, Ying Fang, imammedo, salil.mehta

Add 5.3 machine types for arm/i440fx/q35/s390x/spapr.

Signed-off-by: Ying Fang <fangying1@huawei.com>
---
 hw/arm/virt.c              |  9 ++++++++-
 hw/core/machine.c          |  3 +++
 hw/i386/pc.c               |  3 +++
 hw/i386/pc_piix.c          | 15 ++++++++++++++-
 hw/i386/pc_q35.c           | 14 +++++++++++++-
 hw/ppc/spapr.c             | 15 +++++++++++++--
 hw/s390x/s390-virtio-ccw.c | 14 +++++++++++++-
 include/hw/boards.h        |  3 +++
 include/hw/i386/pc.h       |  3 +++
 9 files changed, 73 insertions(+), 6 deletions(-)

diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index ba902b53ba..ff8a14439e 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -2665,10 +2665,17 @@ static void machvirt_machine_init(void)
 }
 type_init(machvirt_machine_init);
 
+static void virt_machine_5_3_options(MachineClass *mc)
+{
+}
+DEFINE_VIRT_MACHINE_AS_LATEST(5, 3)
+
 static void virt_machine_5_2_options(MachineClass *mc)
 {
+    virt_machine_5_3_options(mc);
+    compat_props_add(mc->compat_props, hw_compat_5_2, hw_compat_5_2_len);
 }
-DEFINE_VIRT_MACHINE_AS_LATEST(5, 2)
+DEFINE_VIRT_MACHINE(5, 2)
 
 static void virt_machine_5_1_options(MachineClass *mc)
 {
diff --git a/hw/core/machine.c b/hw/core/machine.c
index 7e2f4ec08e..6dc77699a9 100644
--- a/hw/core/machine.c
+++ b/hw/core/machine.c
@@ -28,6 +28,9 @@
 #include "hw/mem/nvdimm.h"
 #include "migration/vmstate.h"
 
+GlobalProperty hw_compat_5_2[] = { };
+const size_t hw_compat_5_2_len = G_N_ELEMENTS(hw_compat_5_2);
+
 GlobalProperty hw_compat_5_1[] = {
     { "vhost-scsi", "num_queues", "1"},
     { "vhost-user-blk", "num-queues", "1"},
diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index e87be5d29a..eaa046ff5d 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -97,6 +97,9 @@
 #include "trace.h"
 #include CONFIG_DEVICES
 
+GlobalProperty pc_compat_5_2[] = { };
+const size_t pc_compat_5_2_len = G_N_ELEMENTS(pc_compat_5_2);
+
 GlobalProperty pc_compat_5_1[] = {
     { "ICH9-LPC", "x-smi-cpu-hotplug", "off" },
 };
diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
index 3c2ae0612b..01254090ce 100644
--- a/hw/i386/pc_piix.c
+++ b/hw/i386/pc_piix.c
@@ -426,7 +426,7 @@ static void pc_i440fx_machine_options(MachineClass *m)
     machine_class_allow_dynamic_sysbus_dev(m, TYPE_VMBUS_BRIDGE);
 }
 
-static void pc_i440fx_5_2_machine_options(MachineClass *m)
+static void pc_i440fx_5_3_machine_options(MachineClass *m)
 {
     PCMachineClass *pcmc = PC_MACHINE_CLASS(m);
     pc_i440fx_machine_options(m);
@@ -435,6 +435,19 @@ static void pc_i440fx_5_2_machine_options(MachineClass *m)
     pcmc->default_cpu_version = 1;
 }
 
+DEFINE_I440FX_MACHINE(v5_3, "pc-i440fx-5.3", NULL,
+                      pc_i440fx_5_3_machine_options);
+
+static void pc_i440fx_5_2_machine_options(MachineClass *m)
+{
+    PCMachineClass *pcmc = PC_MACHINE_CLASS(m);
+    pc_i440fx_machine_options(m);
+    m->alias = NULL;
+    m->is_default = false;
+    compat_props_add(m->compat_props, hw_compat_5_2, hw_compat_5_2_len);
+    compat_props_add(m->compat_props, pc_compat_5_2, pc_compat_5_2_len);
+}
+
 DEFINE_I440FX_MACHINE(v5_2, "pc-i440fx-5.2", NULL,
                       pc_i440fx_5_2_machine_options);
 
diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c
index a3f4959c43..dd14803edb 100644
--- a/hw/i386/pc_q35.c
+++ b/hw/i386/pc_q35.c
@@ -344,7 +344,7 @@ static void pc_q35_machine_options(MachineClass *m)
     m->max_cpus = 288;
 }
 
-static void pc_q35_5_2_machine_options(MachineClass *m)
+static void pc_q35_5_3_machine_options(MachineClass *m)
 {
     PCMachineClass *pcmc = PC_MACHINE_CLASS(m);
     pc_q35_machine_options(m);
@@ -352,6 +352,18 @@ static void pc_q35_5_2_machine_options(MachineClass *m)
     pcmc->default_cpu_version = 1;
 }
 
+DEFINE_Q35_MACHINE(v5_3, "pc-q35-5.3", NULL,
+                   pc_q35_5_3_machine_options);
+
+static void pc_q35_5_2_machine_options(MachineClass *m)
+{
+    PCMachineClass *pcmc = PC_MACHINE_CLASS(m);
+    pc_q35_machine_options(m);
+    m->alias = NULL;
+    compat_props_add(m->compat_props, hw_compat_5_2, hw_compat_5_2_len);
+    compat_props_add(m->compat_props, pc_compat_5_2, pc_compat_5_2_len);
+}
+
 DEFINE_Q35_MACHINE(v5_2, "pc-q35-5.2", NULL,
                    pc_q35_5_2_machine_options);
 
diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 2db810f73a..c292a3edd9 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -4511,15 +4511,26 @@ static void spapr_machine_latest_class_options(MachineClass *mc)
     }                                                                \
     type_init(spapr_machine_register_##suffix)
 
+/*
+ * pseries-5.3
+ */
+static void spapr_machine_5_3_class_options(MachineClass *mc)
+{
+    /* Defaults for the latest behaviour inherited from the base class */
+}
+
+DEFINE_SPAPR_MACHINE(5_3, "5.3", true);
+
 /*
  * pseries-5.2
  */
 static void spapr_machine_5_2_class_options(MachineClass *mc)
 {
-    /* Defaults for the latest behaviour inherited from the base class */
+    spapr_machine_5_3_class_options(mc);
+    compat_props_add(mc->compat_props, hw_compat_5_2, hw_compat_5_2_len);
 }
 
-DEFINE_SPAPR_MACHINE(5_2, "5.2", true);
+DEFINE_SPAPR_MACHINE(5_2, "5.2", false);
 
 /*
  * pseries-5.1
diff --git a/hw/s390x/s390-virtio-ccw.c b/hw/s390x/s390-virtio-ccw.c
index 28266a3a35..bde084e13d 100644
--- a/hw/s390x/s390-virtio-ccw.c
+++ b/hw/s390x/s390-virtio-ccw.c
@@ -789,14 +789,26 @@ bool css_migration_enabled(void)
     }                                                                         \
     type_init(ccw_machine_register_##suffix)
 
+static void ccw_machine_5_3_instance_options(MachineState *machine)
+{
+}
+
+static void ccw_machine_5_3_class_options(MachineClass *mc)
+{
+}
+DEFINE_CCW_MACHINE(5_3, "5.3", true);
+
 static void ccw_machine_5_2_instance_options(MachineState *machine)
 {
+    ccw_machine_5_3_instance_options(machine);
 }
 
 static void ccw_machine_5_2_class_options(MachineClass *mc)
 {
+    ccw_machine_5_3_class_options(mc);
+    compat_props_add(mc->compat_props, hw_compat_5_2, hw_compat_5_2_len);
 }
-DEFINE_CCW_MACHINE(5_2, "5.2", true);
+DEFINE_CCW_MACHINE(5_2, "5.2", false);
 
 static void ccw_machine_5_1_instance_options(MachineState *machine)
 {
diff --git a/include/hw/boards.h b/include/hw/boards.h
index bf53e8a16e..f631c1799d 100644
--- a/include/hw/boards.h
+++ b/include/hw/boards.h
@@ -311,6 +311,9 @@ struct MachineState {
     } \
     type_init(machine_initfn##_register_types)
 
+extern GlobalProperty hw_compat_5_2[];
+extern const size_t hw_compat_5_2_len;
+
 extern GlobalProperty hw_compat_5_1[];
 extern const size_t hw_compat_5_1_len;
 
diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
index 84639d0ebc..6f1531ed14 100644
--- a/include/hw/i386/pc.h
+++ b/include/hw/i386/pc.h
@@ -190,6 +190,9 @@ void pc_system_firmware_init(PCMachineState *pcms, MemoryRegion *rom_memory);
 void pc_madt_cpu_entry(AcpiDeviceIf *adev, int uid,
                        const CPUArchIdList *apic_ids, GArray *entry);
 
+extern GlobalProperty pc_compat_5_2[];
+extern const size_t pc_compat_5_2_len;
+
 extern GlobalProperty pc_compat_5_1[];
 extern const size_t pc_compat_5_1_len;
 
-- 
2.23.0



^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [RFC PATCH v3 06/13] hw/arm/virt: DT: add cpu-map
  2020-11-09  3:04 [RFC PATCH v3 00/13] hw/arm/virt: Introduce cpu and cache topology support Ying Fang
                   ` (4 preceding siblings ...)
  2020-11-09  3:04 ` [RFC PATCH v3 05/13] hw: add compat machines for 5.3 Ying Fang
@ 2020-11-09  3:04 ` Ying Fang
  2020-11-09  3:04 ` [RFC PATCH v3 07/13] hw/arm/virt-acpi-build: distinguish possible and present cpus Ying Fang
                   ` (6 subsequent siblings)
  12 siblings, 0 replies; 25+ messages in thread
From: Ying Fang @ 2020-11-09  3:04 UTC (permalink / raw)
  To: peter.maydell
  Cc: drjones, zhang.zhanghailiang, qemu-devel, shannon.zhaosl,
	qemu-arm, alistair.francis, Ying Fang, imammedo, salil.mehta

From: Andrew Jones <drjones@redhat.com>

Support devicetree CPU topology descriptions.
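The cpu-map paths in the hunk below derive cluster/core/thread indices from the flat CPU index. A minimal sketch of that arithmetic (the topo_* helper names are illustrative, not QEMU functions):

```c
#include <assert.h>

/*
 * Mirror of the index arithmetic used when building /cpus/cpu-map
 * paths: cluster = cpu / (cores * threads), core = (cpu / threads)
 * % cores, thread = cpu % threads.
 */
static int topo_cluster(int cpu, int cores, int threads)
{
    return cpu / (cores * threads);
}

static int topo_core(int cpu, int cores, int threads)
{
    return (cpu / threads) % cores;
}

static int topo_thread(int cpu, int threads)
{
    return cpu % threads;
}
```

With -smp 8,cores=2,threads=2 this places cpu@5 at /cpus/cpu-map/cluster1/core0/thread1.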

Signed-off-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Ying Fang <fangying1@huawei.com>
---
 hw/arm/virt.c         | 40 +++++++++++++++++++++++++++++++++++++++-
 include/hw/arm/virt.h |  1 +
 2 files changed, 40 insertions(+), 1 deletion(-)

diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index ff8a14439e..d23b941020 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -351,9 +351,10 @@ static void fdt_add_cpu_nodes(const VirtMachineState *vms)
     int cpu;
     int addr_cells = 1;
     const MachineState *ms = MACHINE(vms);
+    VirtMachineClass *vmc = VIRT_MACHINE_GET_CLASS(vms);
 
     /*
-     * From Documentation/devicetree/bindings/arm/cpus.txt
+     * See Linux Documentation/devicetree/bindings/arm/cpus.yaml
      *  On ARM v8 64-bit systems value should be set to 2,
      *  that corresponds to the MPIDR_EL1 register size.
      *  If MPIDR_EL1[63:32] value is equal to 0 on all CPUs
@@ -407,8 +408,42 @@ static void fdt_add_cpu_nodes(const VirtMachineState *vms)
                 ms->possible_cpus->cpus[cs->cpu_index].props.node_id);
         }
 
+        if (ms->smp.cpus > 1 && !vmc->ignore_cpu_topology) {
+            qemu_fdt_setprop_cell(vms->fdt, nodename, "phandle",
+                                  qemu_fdt_alloc_phandle(vms->fdt));
+        }
+
         g_free(nodename);
     }
+
+    if (ms->smp.cpus > 1 && !vmc->ignore_cpu_topology) {
+        /*
+         * See Linux Documentation/devicetree/bindings/cpu/cpu-topology.txt
+         */
+        qemu_fdt_add_subnode(vms->fdt, "/cpus/cpu-map");
+
+        for (cpu = ms->smp.cpus - 1; cpu >= 0; cpu--) {
+            char *cpu_path = g_strdup_printf("/cpus/cpu@%d", cpu);
+            char *map_path;
+
+            if (ms->smp.threads > 1) {
+                map_path = g_strdup_printf(
+                            "/cpus/cpu-map/%s%d/%s%d/%s%d",
+                            "cluster", cpu / (ms->smp.cores * ms->smp.threads),
+                            "core", (cpu / ms->smp.threads) % ms->smp.cores,
+                            "thread", cpu % ms->smp.threads);
+            } else {
+                map_path = g_strdup_printf(
+                            "/cpus/cpu-map/%s%d/%s%d",
+                            "cluster", cpu / ms->smp.cores,
+                            "core", cpu % ms->smp.cores);
+            }
+            qemu_fdt_add_path(vms->fdt, map_path);
+            qemu_fdt_setprop_phandle(vms->fdt, map_path, "cpu", cpu_path);
+            g_free(map_path);
+            g_free(cpu_path);
+        }
+    }
 }
 
 static void fdt_add_its_gic_node(VirtMachineState *vms)
@@ -2672,8 +2707,11 @@ DEFINE_VIRT_MACHINE_AS_LATEST(5, 3)
 
 static void virt_machine_5_2_options(MachineClass *mc)
 {
+    VirtMachineClass *vmc = VIRT_MACHINE_CLASS(OBJECT_CLASS(mc));
+
     virt_machine_5_3_options(mc);
     compat_props_add(mc->compat_props, hw_compat_5_2, hw_compat_5_2_len);
+    vmc->ignore_cpu_topology = true;
 }
 DEFINE_VIRT_MACHINE(5, 2)
 
diff --git a/include/hw/arm/virt.h b/include/hw/arm/virt.h
index 010f24f580..917bd8b645 100644
--- a/include/hw/arm/virt.h
+++ b/include/hw/arm/virt.h
@@ -118,6 +118,7 @@ typedef enum VirtGICType {
 struct VirtMachineClass {
     MachineClass parent;
     bool disallow_affinity_adjustment;
+    bool ignore_cpu_topology;
     bool no_its;
     bool no_pmu;
     bool claim_edge_triggered_timers;
-- 
2.23.0




* [RFC PATCH v3 07/13] hw/arm/virt-acpi-build: distinguish possible and present cpus
  2020-11-09  3:04 [RFC PATCH v3 00/13] hw/arm/virt: Introduce cpu and cache topology support Ying Fang
                   ` (5 preceding siblings ...)
  2020-11-09  3:04 ` [RFC PATCH v3 06/13] hw/arm/virt: DT: add cpu-map Ying Fang
@ 2020-11-09  3:04 ` Ying Fang
  2020-11-09  3:04 ` [RFC PATCH v3 08/13] hw/acpi/aml-build: add processor hierarchy node structure Ying Fang
                   ` (5 subsequent siblings)
  12 siblings, 0 replies; 25+ messages in thread
From: Ying Fang @ 2020-11-09  3:04 UTC (permalink / raw)
  To: peter.maydell
  Cc: drjones, zhang.zhanghailiang, qemu-devel, shannon.zhaosl,
	qemu-arm, alistair.francis, Ying Fang, imammedo, salil.mehta

When building ACPI tables regarding CPUs we should always build
them for the number of possible CPUs, not the number of present
CPUs. We then ensure that only the present CPUs are marked enabled
in the MADT. This is also a prerequisite for supporting CPU
hotplug in the future.
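The present/possible distinction boils down to the GICC Flags field: an entry is emitted for every possible CPU, but only present CPUs get the Enabled bit. A one-line sketch (ACPI_MADT_GICC_ENABLED is bit 0 of the GICC Flags per ACPI 6.3; gicc_flags is an illustrative helper, not a QEMU function):

```c
#include <assert.h>
#include <stdint.h>

#define ACPI_MADT_GICC_ENABLED (1u << 0)   /* GICC Flags bit 0: Enabled */

/*
 * Only CPUs that are actually present
 * (possible_cpus->cpus[i].cpu != NULL) get the Enabled flag.
 */
static uint32_t gicc_flags(int cpu_present)
{
    return cpu_present ? ACPI_MADT_GICC_ENABLED : 0;
}
```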

This patch is a rework based on Andrew Jones's contribution at
https://lists.gnu.org/archive/html/qemu-arm/2018-07/msg00076.html

Signed-off-by: Ying Fang <fangying1@huawei.com>
---
 hw/arm/virt-acpi-build.c | 17 ++++++++++++-----
 hw/arm/virt.c            |  3 +++
 2 files changed, 15 insertions(+), 5 deletions(-)

diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
index a222981737..9edd6385dc 100644
--- a/hw/arm/virt-acpi-build.c
+++ b/hw/arm/virt-acpi-build.c
@@ -57,14 +57,18 @@
 
 #define ARM_SPI_BASE 32
 
-static void acpi_dsdt_add_cpus(Aml *scope, int cpus)
+static void acpi_dsdt_add_cpus(Aml *scope, VirtMachineState *vms)
 {
     uint16_t i;
+    CPUArchIdList *possible_cpus = MACHINE(vms)->possible_cpus;
 
-    for (i = 0; i < cpus; i++) {
+    for (i = 0; i < possible_cpus->len; i++) {
         Aml *dev = aml_device("C%.03X", i);
         aml_append(dev, aml_name_decl("_HID", aml_string("ACPI0007")));
         aml_append(dev, aml_name_decl("_UID", aml_int(i)));
+        if (possible_cpus->cpus[i].cpu == NULL) {
+            aml_append(dev, aml_name_decl("_STA", aml_int(0)));
+        }
         aml_append(scope, dev);
     }
 }
@@ -470,6 +474,7 @@ build_madt(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
     const int *irqmap = vms->irqmap;
     AcpiMadtGenericDistributor *gicd;
     AcpiMadtGenericMsiFrame *gic_msi;
+    CPUArchIdList *possible_cpus = MACHINE(vms)->possible_cpus;
     int i;
 
     acpi_data_push(table_data, sizeof(AcpiMultipleApicTable));
@@ -480,7 +485,7 @@ build_madt(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
     gicd->base_address = cpu_to_le64(memmap[VIRT_GIC_DIST].base);
     gicd->version = vms->gic_version;
 
-    for (i = 0; i < MACHINE(vms)->smp.cpus; i++) {
+    for (i = 0; i < possible_cpus->len; i++) {
         AcpiMadtGenericCpuInterface *gicc = acpi_data_push(table_data,
                                                            sizeof(*gicc));
         ARMCPU *armcpu = ARM_CPU(qemu_get_cpu(i));
@@ -495,7 +500,9 @@ build_madt(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
         gicc->cpu_interface_number = cpu_to_le32(i);
         gicc->arm_mpidr = cpu_to_le64(armcpu->mp_affinity);
         gicc->uid = cpu_to_le32(i);
-        gicc->flags = cpu_to_le32(ACPI_MADT_GICC_ENABLED);
+        if (possible_cpus->cpus[i].cpu != NULL) {
+            gicc->flags = cpu_to_le32(ACPI_MADT_GICC_ENABLED);
+        }
 
         if (arm_feature(&armcpu->env, ARM_FEATURE_PMU)) {
             gicc->performance_interrupt = cpu_to_le32(PPI(VIRTUAL_PMU_IRQ));
@@ -599,7 +606,7 @@ build_dsdt(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
      * the RTC ACPI device at all when using UEFI.
      */
     scope = aml_scope("\\_SB");
-    acpi_dsdt_add_cpus(scope, ms->smp.cpus);
+    acpi_dsdt_add_cpus(scope, vms);
     acpi_dsdt_add_uart(scope, &memmap[VIRT_UART],
                        (irqmap[VIRT_UART] + ARM_SPI_BASE));
     if (vmc->acpi_expose_flash) {
diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index d23b941020..b6cebb5549 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -1977,6 +1977,9 @@ static void machvirt_init(MachineState *machine)
 
         qdev_realize(DEVICE(cpuobj), NULL, &error_fatal);
         object_unref(cpuobj);
+
+        /* Initialize cpu member here since cpu hotplug is not supported yet */
+        machine->possible_cpus->cpus[n].cpu = cpuobj;
     }
     fdt_add_timer_nodes(vms);
     fdt_add_cpu_nodes(vms);
-- 
2.23.0




* [RFC PATCH v3 08/13] hw/acpi/aml-build: add processor hierarchy node structure
  2020-11-09  3:04 [RFC PATCH v3 00/13] hw/arm/virt: Introduce cpu and cache topology support Ying Fang
                   ` (6 preceding siblings ...)
  2020-11-09  3:04 ` [RFC PATCH v3 07/13] hw/arm/virt-acpi-build: distinguish possible and present cpus Ying Fang
@ 2020-11-09  3:04 ` Ying Fang
  2020-11-09  3:04 ` [RFC PATCH v3 09/13] hw/arm/virt-acpi-build: add PPTT table Ying Fang
                   ` (4 subsequent siblings)
  12 siblings, 0 replies; 25+ messages in thread
From: Ying Fang @ 2020-11-09  3:04 UTC (permalink / raw)
  To: peter.maydell
  Cc: Henglong Fan, drjones, zhang.zhanghailiang, qemu-devel,
	shannon.zhaosl, qemu-arm, alistair.francis, Ying Fang, imammedo,
	salil.mehta

Add the processor hierarchy node structures used to build the ACPI
cpu topology information. Three helpers are introduced:

(1) build_socket_hierarchy for the socket description structure
(2) build_processor_hierarchy for the processor description structure
(3) build_smt_hierarchy for the thread (logical processor) description structure

We split the processor hierarchy node structure descriptions into
three helpers even though there are some identical code snippets
between them. The reason is that the private resources vary among
the different topology levels. This makes the ACPI PPTT table much
more readable and easier to construct.
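Each helper emits the same 20-byte ACPI 6.3 Type 0 (processor hierarchy) node, differing only in the Flags value. A byte-level sketch of that layout (pack_le32 and pptt_type0_node are illustrative names):

```c
#include <assert.h>
#include <stdint.h>

static void pack_le32(uint8_t *p, uint32_t v)
{
    p[0] = v; p[1] = v >> 8; p[2] = v >> 16; p[3] = v >> 24;
}

/* 20-byte Type 0 node with no private resources, per ACPI 6.3 5.2.29.1 */
static int pptt_type0_node(uint8_t *buf, uint32_t flags,
                           uint32_t parent, uint32_t acpi_id)
{
    buf[0] = 0;                   /* Type 0 - processor */
    buf[1] = 20;                  /* Length, no private resources */
    buf[2] = buf[3] = 0;          /* Reserved */
    pack_le32(buf + 4, flags);    /* 1 = physical package, 0x0e = thread */
    pack_le32(buf + 8, parent);   /* Offset of parent node in the PPTT */
    pack_le32(buf + 12, acpi_id); /* ACPI processor ID */
    pack_le32(buf + 16, 0);       /* Number of private resources */
    return 20;
}
```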

Cc: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Ying Fang <fangying1@huawei.com>
Signed-off-by: Henglong Fan <fanhenglong@huawei.com>
---
 hw/acpi/aml-build.c         | 37 +++++++++++++++++++++++++++++++++++++
 include/hw/acpi/aml-build.h |  7 +++++++
 2 files changed, 44 insertions(+)

diff --git a/hw/acpi/aml-build.c b/hw/acpi/aml-build.c
index 3792ba96ce..d1aa9fd716 100644
--- a/hw/acpi/aml-build.c
+++ b/hw/acpi/aml-build.c
@@ -1770,6 +1770,43 @@ void build_slit(GArray *table_data, BIOSLinker *linker, MachineState *ms)
                  table_data->len - slit_start, 1, NULL, NULL);
 }
 
+/*
+ * ACPI 6.3: 5.2.29.1 Processor hierarchy node structure (Type 0)
+ */
+void build_socket_hierarchy(GArray *tbl, uint32_t parent, uint32_t id)
+{
+    build_append_byte(tbl, 0);          /* Type 0 - processor */
+    build_append_byte(tbl, 20);         /* Length, no private resources */
+    build_append_int_noprefix(tbl, 0, 2);  /* Reserved */
+    build_append_int_noprefix(tbl, 1, 4);  /* Flags: Physical package */
+    build_append_int_noprefix(tbl, parent, 4);  /* Parent */
+    build_append_int_noprefix(tbl, id, 4);     /* ACPI processor ID */
+    build_append_int_noprefix(tbl, 0, 4);  /* Number of private resources */
+}
+
+void build_processor_hierarchy(GArray *tbl, uint32_t flags,
+                               uint32_t parent, uint32_t id)
+{
+    build_append_byte(tbl, 0);          /* Type 0 - processor */
+    build_append_byte(tbl, 20);         /* Length, no private resources */
+    build_append_int_noprefix(tbl, 0, 2);      /* Reserved */
+    build_append_int_noprefix(tbl, flags, 4);  /* Flags */
+    build_append_int_noprefix(tbl, parent, 4); /* Parent */
+    build_append_int_noprefix(tbl, id, 4);     /* ACPI processor ID */
+    build_append_int_noprefix(tbl, 0, 4);  /* Number of private resources */
+}
+
+void build_smt_hierarchy(GArray *tbl, uint32_t parent, uint32_t id)
+{
+    build_append_byte(tbl, 0);            /* Type 0 - processor */
+    build_append_byte(tbl, 20);           /* Length, no private resources */
+    build_append_int_noprefix(tbl, 0, 2); /* Reserved */
+    build_append_int_noprefix(tbl, 0x0e, 4);    /* Processor is a thread */
+    build_append_int_noprefix(tbl, parent, 4);  /* Parent */
+    build_append_int_noprefix(tbl, id, 4);      /* ACPI processor ID */
+    build_append_int_noprefix(tbl, 0, 4);       /* Num of private resources */
+}
+
 /* build rev1/rev3/rev5.1 FADT */
 void build_fadt(GArray *tbl, BIOSLinker *linker, const AcpiFadtData *f,
                 const char *oem_id, const char *oem_table_id)
diff --git a/include/hw/acpi/aml-build.h b/include/hw/acpi/aml-build.h
index fe0055fffb..56474835a7 100644
--- a/include/hw/acpi/aml-build.h
+++ b/include/hw/acpi/aml-build.h
@@ -437,6 +437,13 @@ void build_srat_memory(AcpiSratMemoryAffinity *numamem, uint64_t base,
 
 void build_slit(GArray *table_data, BIOSLinker *linker, MachineState *ms);
 
+void build_socket_hierarchy(GArray *tbl, uint32_t parent, uint32_t id);
+
+void build_processor_hierarchy(GArray *tbl, uint32_t flags,
+                               uint32_t parent, uint32_t id);
+
+void build_smt_hierarchy(GArray *tbl, uint32_t parent, uint32_t id);
+
 void build_fadt(GArray *tbl, BIOSLinker *linker, const AcpiFadtData *f,
                 const char *oem_id, const char *oem_table_id);
 
-- 
2.23.0




* [RFC PATCH v3 09/13] hw/arm/virt-acpi-build: add PPTT table
  2020-11-09  3:04 [RFC PATCH v3 00/13] hw/arm/virt: Introduce cpu and cache topology support Ying Fang
                   ` (7 preceding siblings ...)
  2020-11-09  3:04 ` [RFC PATCH v3 08/13] hw/acpi/aml-build: add processor hierarchy node structure Ying Fang
@ 2020-11-09  3:04 ` Ying Fang
  2020-11-09  3:04 ` [RFC PATCH v3 10/13] target/arm/cpu: Add cpu cache description for arm Ying Fang
                   ` (3 subsequent siblings)
  12 siblings, 0 replies; 25+ messages in thread
From: Ying Fang @ 2020-11-09  3:04 UTC (permalink / raw)
  To: peter.maydell
  Cc: drjones, zhang.zhanghailiang, qemu-devel, shannon.zhaosl,
	qemu-arm, alistair.francis, Ying Fang, imammedo, salil.mehta

Add the Processor Properties Topology Table (PPTT) to present cpu topology
information to the guest.
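Since every PPTT node emitted here is 20 bytes, the table body size follows directly from the topology. A sketch of that accounting (pptt_body_size is an illustrative name, mirroring the loops in build_pptt below):

```c
#include <assert.h>

/*
 * Body size (header excluded) of the PPTT built below: one node per
 * socket, and per core either a single leaf node (no SMT) or a core
 * container plus one node per thread.
 */
static unsigned pptt_body_size(unsigned sockets, unsigned cores,
                               unsigned threads)
{
    unsigned per_core = (threads > 1) ? 20 + threads * 20 : 20;

    return sockets * (20 + cores * per_core);
}
```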

Signed-off-by: Ying Fang <fangying1@huawei.com>
---
 hw/arm/virt-acpi-build.c | 42 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 42 insertions(+)

diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
index 9edd6385dc..5784370257 100644
--- a/hw/arm/virt-acpi-build.c
+++ b/hw/arm/virt-acpi-build.c
@@ -429,6 +429,42 @@ build_srat(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
                  "SRAT", table_data->len - srat_start, 3, NULL, NULL);
 }
 
+static void build_pptt(GArray *table_data, BIOSLinker *linker, MachineState *ms)
+{
+    int pptt_start = table_data->len;
+    int uid = 0, cpus = 0, socket;
+    unsigned int smp_cores = ms->smp.cores;
+    unsigned int smp_threads = ms->smp.threads;
+
+    acpi_data_push(table_data, sizeof(AcpiTableHeader));
+
+    for (socket = 0; cpus < ms->possible_cpus->len; socket++) {
+        uint32_t socket_offset = table_data->len - pptt_start;
+        int core;
+
+        build_socket_hierarchy(table_data, 0, socket);
+
+        for (core = 0; core < smp_cores; core++) {
+            uint32_t core_offset = table_data->len - pptt_start;
+            int thread;
+
+            if (smp_threads <= 1) {
+                build_processor_hierarchy(table_data, 2, socket_offset, uid++);
+            } else {
+                build_processor_hierarchy(table_data, 0, socket_offset, core);
+                for (thread = 0; thread < smp_threads; thread++) {
+                    build_smt_hierarchy(table_data, core_offset, uid++);
+                }
+            }
+        }
+        cpus += smp_cores * smp_threads;
+    }
+
+    build_header(linker, table_data,
+                 (void *)(table_data->data + pptt_start), "PPTT",
+                 table_data->len - pptt_start, 2, NULL, NULL);
+}
+
 /* GTDT */
 static void
 build_gtdt(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
@@ -669,6 +705,7 @@ void virt_acpi_build(VirtMachineState *vms, AcpiBuildTables *tables)
     unsigned dsdt, xsdt;
     GArray *tables_blob = tables->table_data;
     MachineState *ms = MACHINE(vms);
+    bool cpu_topology_enabled = !vmc->ignore_cpu_topology;
 
     table_offsets = g_array_new(false, true /* clear */,
                                         sizeof(uint32_t));
@@ -688,6 +725,11 @@ void virt_acpi_build(VirtMachineState *vms, AcpiBuildTables *tables)
     acpi_add_table(table_offsets, tables_blob);
     build_madt(tables_blob, tables->linker, vms);
 
+    if (cpu_topology_enabled) {
+        acpi_add_table(table_offsets, tables_blob);
+        build_pptt(tables_blob, tables->linker, ms);
+    }
+
     acpi_add_table(table_offsets, tables_blob);
     build_gtdt(tables_blob, tables->linker, vms);
 
-- 
2.23.0




* [RFC PATCH v3 10/13] target/arm/cpu: Add cpu cache description for arm
  2020-11-09  3:04 [RFC PATCH v3 00/13] hw/arm/virt: Introduce cpu and cache topology support Ying Fang
                   ` (8 preceding siblings ...)
  2020-11-09  3:04 ` [RFC PATCH v3 09/13] hw/arm/virt-acpi-build: add PPTT table Ying Fang
@ 2020-11-09  3:04 ` Ying Fang
  2020-11-09 17:46   ` Jonathan Cameron
  2020-11-30 13:00   ` Peter Maydell
  2020-11-09  3:04 ` [RFC PATCH v3 11/13] hw/arm/virt: add fdt cache information Ying Fang
                   ` (2 subsequent siblings)
  12 siblings, 2 replies; 25+ messages in thread
From: Ying Fang @ 2020-11-09  3:04 UTC (permalink / raw)
  To: peter.maydell
  Cc: drjones, zhang.zhanghailiang, qemu-devel, shannon.zhaosl,
	qemu-arm, alistair.francis, Ying Fang, imammedo, salil.mehta

Add the CPUCacheInfo structure to hold CPU cache information for ARM
CPUs. A classic three-level cache topology is used here. Default
cache capacities are provided, and userspace can override these values.
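For a set-associative cache the geometry should satisfy size = sets × line_size × associativity. A quick consistency check of the L1/L2 defaults below (cache_size is an illustrative helper; note that the L3 defaults in the hunk, 65536 KiB with 15-way associativity and 2048 sets, do not satisfy this relation):

```c
#include <assert.h>
#include <stdint.h>

#define KiB 1024u

/* Set-associative geometry: total size from sets, line size, and ways */
static uint32_t cache_size(uint32_t sets, uint32_t line_size,
                           uint32_t associativity)
{
    return sets * line_size * associativity;
}
```

The L1 defaults (256 sets, 64-byte lines, 4-way) give 64 KiB and the L2 defaults (1024 sets, 8-way) give 512 KiB, matching the sizes in the hunk.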

Signed-off-by: Ying Fang <fangying1@huawei.com>
---
 target/arm/cpu.c | 42 ++++++++++++++++++++++++++++++++++++++++++
 target/arm/cpu.h | 27 +++++++++++++++++++++++++++
 2 files changed, 69 insertions(+)

diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index 056319859f..f1bac7452c 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -27,6 +27,7 @@
 #include "qapi/visitor.h"
 #include "cpu.h"
 #include "internals.h"
+#include "qemu/units.h"
 #include "exec/exec-all.h"
 #include "hw/qdev-properties.h"
 #if !defined(CONFIG_USER_ONLY)
@@ -997,6 +998,45 @@ uint64_t arm_cpu_mp_affinity(int idx, uint8_t clustersz)
     return (Aff1 << ARM_AFF1_SHIFT) | Aff0;
 }
 
+static CPUCaches default_cache_info = {
+    .l1d_cache = &(CPUCacheInfo) {
+        .type = DATA_CACHE,
+        .level = 1,
+        .size = 64 * KiB,
+        .line_size = 64,
+        .associativity = 4,
+        .sets = 256,
+        .attributes = 0x02,
+    },
+    .l1i_cache = &(CPUCacheInfo) {
+        .type = INSTRUCTION_CACHE,
+        .level = 1,
+        .size = 64 * KiB,
+        .line_size = 64,
+        .associativity = 4,
+        .sets = 256,
+        .attributes = 0x04,
+    },
+    .l2_cache = &(CPUCacheInfo) {
+        .type = UNIFIED_CACHE,
+        .level = 2,
+        .size = 512 * KiB,
+        .line_size = 64,
+        .associativity = 8,
+        .sets = 1024,
+        .attributes = 0x0a,
+    },
+    .l3_cache = &(CPUCacheInfo) {
+        .type = UNIFIED_CACHE,
+        .level = 3,
+        .size = 65536 * KiB,
+        .line_size = 64,
+        .associativity = 15,
+        .sets = 2048,
+        .attributes = 0x0a,
+    },
+};
+
 static void cpreg_hashtable_data_destroy(gpointer data)
 {
     /*
@@ -1841,6 +1881,8 @@ static void arm_cpu_realizefn(DeviceState *dev, Error **errp)
         }
     }
 
+    cpu->caches = default_cache_info;
+
     qemu_init_vcpu(cs);
     cpu_reset(cs);
 
diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index cfff1b5c8f..dbc33a9802 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -746,6 +746,30 @@ typedef enum ARMPSCIState {
 
 typedef struct ARMISARegisters ARMISARegisters;
 
+/* Cache information type */
+enum CacheType {
+    DATA_CACHE,
+    INSTRUCTION_CACHE,
+    UNIFIED_CACHE
+};
+
+typedef struct CPUCacheInfo {
+    enum CacheType type;      /* Cache type */
+    uint8_t level;
+    uint32_t size;            /* Size in bytes */
+    uint16_t line_size;       /* Line size in bytes */
+    uint8_t associativity;    /* Cache associativity */
+    uint32_t sets;            /* Number of sets */
+    uint8_t attributes;       /* Cache attributes */
+} CPUCacheInfo;
+
+typedef struct CPUCaches {
+    CPUCacheInfo *l1d_cache;
+    CPUCacheInfo *l1i_cache;
+    CPUCacheInfo *l2_cache;
+    CPUCacheInfo *l3_cache;
+} CPUCaches;
+
 /**
  * ARMCPU:
  * @env: #CPUARMState
@@ -987,6 +1011,9 @@ struct ARMCPU {
 
     /* Generic timer counter frequency, in Hz */
     uint64_t gt_cntfrq_hz;
+
+    /* CPU cache information */
+    CPUCaches caches;
 };
 
 unsigned int gt_cntfrq_period_ns(ARMCPU *cpu);
-- 
2.23.0




* [RFC PATCH v3 11/13] hw/arm/virt: add fdt cache information
  2020-11-09  3:04 [RFC PATCH v3 00/13] hw/arm/virt: Introduce cpu and cache topology support Ying Fang
                   ` (9 preceding siblings ...)
  2020-11-09  3:04 ` [RFC PATCH v3 10/13] target/arm/cpu: Add cpu cache description for arm Ying Fang
@ 2020-11-09  3:04 ` Ying Fang
  2020-11-09  3:04 ` [RFC PATCH v3 12/13] hw/acpi/aml-build: Build ACPI cpu cache hierarchy information Ying Fang
  2020-11-09  3:04 ` [RFC PATCH v3 13/13] hw/arm/virt-acpi-build: Enable cpu and cache topology Ying Fang
  12 siblings, 0 replies; 25+ messages in thread
From: Ying Fang @ 2020-11-09  3:04 UTC (permalink / raw)
  To: peter.maydell
  Cc: drjones, zhang.zhanghailiang, qemu-devel, shannon.zhaosl,
	qemu-arm, alistair.francis, Ying Fang, imammedo, salil.mehta

Support devicetree CPU cache information descriptions.
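The fdt helpers below link each cpu node's L1 caches to a per-core /cpus/l2-cache&lt;n&gt; node, which in turn points at a per-socket /cpus/l3-cache&lt;m&gt; node. A sketch of the index mapping, assuming one thread per core so that cpu_index equals socket * cores + core (helper names are illustrative):

```c
#include <assert.h>

/* Per-core L2 node index: cores are numbered socket-major */
static int l2_node_index(int socket, int core, int cores_per_socket)
{
    return socket * cores_per_socket + core;
}

/* Per-socket L3 node index for a given flat cpu index */
static int l3_node_index(int cpu_index, int cores_per_socket)
{
    return cpu_index / cores_per_socket;
}
```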

Signed-off-by: Ying Fang <fangying1@huawei.com>
---
 hw/arm/virt.c | 92 +++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 92 insertions(+)

diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index b6cebb5549..21275e03c2 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -346,6 +346,89 @@ static void fdt_add_timer_nodes(const VirtMachineState *vms)
                        GIC_FDT_IRQ_TYPE_PPI, ARCH_TIMER_NS_EL2_IRQ, irqflags);
 }
 
+static void fdt_add_l3cache_nodes(const VirtMachineState *vms)
+{
+    int i;
+    const MachineState *ms = MACHINE(vms);
+    ARMCPU *cpu = ARM_CPU(first_cpu);
+    unsigned int smp_cores = ms->smp.cores;
+    unsigned int sockets = ms->smp.max_cpus / smp_cores;
+
+    for (i = 0; i < sockets; i++) {
+        char *nodename = g_strdup_printf("/cpus/l3-cache%d", i);
+        qemu_fdt_add_subnode(vms->fdt, nodename);
+        qemu_fdt_setprop_string(vms->fdt, nodename, "compatible", "cache");
+        qemu_fdt_setprop_string(vms->fdt, nodename, "cache-unified", "true");
+        qemu_fdt_setprop_cell(vms->fdt, nodename, "cache-level", 3);
+        qemu_fdt_setprop_cell(vms->fdt, nodename, "cache-size",
+                              cpu->caches.l3_cache->size);
+        qemu_fdt_setprop_cell(vms->fdt, nodename, "cache-line-size",
+                              cpu->caches.l3_cache->line_size);
+        qemu_fdt_setprop_cell(vms->fdt, nodename, "cache-sets",
+                              cpu->caches.l3_cache->sets);
+        qemu_fdt_setprop_cell(vms->fdt, nodename, "phandle",
+                              qemu_fdt_alloc_phandle(vms->fdt));
+        g_free(nodename);
+    }
+}
+
+static void fdt_add_l2cache_nodes(const VirtMachineState *vms)
+{
+    int i, j;
+    const MachineState *ms = MACHINE(vms);
+    unsigned int smp_cores = ms->smp.cores;
+    unsigned int sockets = ms->smp.max_cpus / smp_cores;
+    ARMCPU *cpu = ARM_CPU(first_cpu);
+
+    for (i = 0; i < sockets; i++) {
+        char *next_path = g_strdup_printf("/cpus/l3-cache%d", i);
+        for (j = 0; j < smp_cores; j++) {
+            char *nodename = g_strdup_printf("/cpus/l2-cache%d",
+                                  i * smp_cores + j);
+            qemu_fdt_add_subnode(vms->fdt, nodename);
+            qemu_fdt_setprop_string(vms->fdt, nodename, "compatible", "cache");
+            qemu_fdt_setprop_cell(vms->fdt, nodename, "cache-size",
+                                  cpu->caches.l2_cache->size);
+            qemu_fdt_setprop_cell(vms->fdt, nodename, "cache-line-size",
+                                  cpu->caches.l2_cache->line_size);
+            qemu_fdt_setprop_cell(vms->fdt, nodename, "cache-sets",
+                                  cpu->caches.l2_cache->sets);
+            qemu_fdt_setprop_phandle(vms->fdt, nodename,
+                                  "next-level-cache", next_path);
+            qemu_fdt_setprop_cell(vms->fdt, nodename, "phandle",
+                                  qemu_fdt_alloc_phandle(vms->fdt));
+            g_free(nodename);
+        }
+        g_free(next_path);
+    }
+}
+
+static void fdt_add_l1cache_prop(const VirtMachineState *vms,
+                            char *nodename, int cpu_index)
+{
+
+    ARMCPU *cpu = ARM_CPU(qemu_get_cpu(cpu_index));
+    CPUCaches caches = cpu->caches;
+
+    char *cachename = g_strdup_printf("/cpus/l2-cache%d", cpu_index);
+
+    qemu_fdt_setprop_cell(vms->fdt, nodename, "d-cache-size",
+                          caches.l1d_cache->size);
+    qemu_fdt_setprop_cell(vms->fdt, nodename, "d-cache-line-size",
+                          caches.l1d_cache->line_size);
+    qemu_fdt_setprop_cell(vms->fdt, nodename, "d-cache-sets",
+                          caches.l1d_cache->sets);
+    qemu_fdt_setprop_cell(vms->fdt, nodename, "i-cache-size",
+                          caches.l1i_cache->size);
+    qemu_fdt_setprop_cell(vms->fdt, nodename, "i-cache-line-size",
+                          caches.l1i_cache->line_size);
+    qemu_fdt_setprop_cell(vms->fdt, nodename, "i-cache-sets",
+                          caches.l1i_cache->sets);
+    qemu_fdt_setprop_phandle(vms->fdt, nodename, "next-level-cache",
+                          cachename);
+    g_free(cachename);
+}
+
 static void fdt_add_cpu_nodes(const VirtMachineState *vms)
 {
     int cpu;
@@ -379,6 +462,11 @@ static void fdt_add_cpu_nodes(const VirtMachineState *vms)
     qemu_fdt_setprop_cell(vms->fdt, "/cpus", "#address-cells", addr_cells);
     qemu_fdt_setprop_cell(vms->fdt, "/cpus", "#size-cells", 0x0);
 
+    if (!vmc->ignore_cpu_topology) {
+        fdt_add_l3cache_nodes(vms);
+        fdt_add_l2cache_nodes(vms);
+    }
+
     for (cpu = ms->smp.cpus - 1; cpu >= 0; cpu--) {
         char *nodename = g_strdup_printf("/cpus/cpu@%d", cpu);
         ARMCPU *armcpu = ARM_CPU(qemu_get_cpu(cpu));
@@ -408,6 +496,10 @@ static void fdt_add_cpu_nodes(const VirtMachineState *vms)
                 ms->possible_cpus->cpus[cs->cpu_index].props.node_id);
         }
 
+        if (!vmc->ignore_cpu_topology) {
+            fdt_add_l1cache_prop(vms, nodename, cpu);
+        }
+
         if (ms->smp.cpus > 1 && !vmc->ignore_cpu_topology) {
             qemu_fdt_setprop_cell(vms->fdt, nodename, "phandle",
                                   qemu_fdt_alloc_phandle(vms->fdt));
-- 
2.23.0



^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [RFC PATCH v3 12/13] hw/acpi/aml-build: Build ACPI cpu cache hierarchy information
  2020-11-09  3:04 [RFC PATCH v3 00/13] hw/arm/virt: Introduce cpu and cache topology support Ying Fang
                   ` (10 preceding siblings ...)
  2020-11-09  3:04 ` [RFC PATCH v3 11/13] hw/arm/virt: add fdt cache information Ying Fang
@ 2020-11-09  3:04 ` Ying Fang
  2020-11-09  3:04 ` [RFC PATCH v3 13/13] hw/arm/virt-acpi-build: Enable cpu and cache topology Ying Fang
  12 siblings, 0 replies; 25+ messages in thread
From: Ying Fang @ 2020-11-09  3:04 UTC (permalink / raw)
  To: peter.maydell
  Cc: drjones, zhang.zhanghailiang, qemu-devel, shannon.zhaosl,
	qemu-arm, alistair.francis, Ying Fang, imammedo, salil.mehta

To build the cache information, an AcpiCacheInfo structure is defined to
hold the type 1 cache structure according to ACPI spec v6.3 5.2.29.2.
A helper function build_cache_hierarchy is also introduced to encode
the cache information.
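For reference, the 24-byte layout these helpers emit can be sketched as a
standalone little-endian encoder (a simplified model for illustration only,
not QEMU's GArray-based code; field order follows ACPI 6.3 section 5.2.29.2):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Sketch of the ACPI 6.3 5.2.29.2 cache type structure (Type 1):
 * writes the 24-byte record into buf, little-endian, mirroring what
 * build_cache_head()/build_cache_tail() append to the PPTT blob.
 * Returns the number of bytes written (always 24).
 */
static size_t encode_cache_type1(uint8_t *buf, uint32_t next_level,
                                 uint32_t size, uint32_t sets,
                                 uint8_t associativity, uint8_t attributes,
                                 uint16_t line_size)
{
    size_t off = 0;
    int i;

    buf[off++] = 1;                      /* Type 1 - cache */
    buf[off++] = 24;                     /* Length */
    memset(&buf[off], 0, 2);             /* Reserved */
    off += 2;
    for (i = 0; i < 4; i++) {            /* Flags: all fields valid */
        buf[off++] = (0x7fu >> (8 * i)) & 0xff;
    }
    for (i = 0; i < 4; i++) {            /* Next level of cache */
        buf[off++] = (next_level >> (8 * i)) & 0xff;
    }
    for (i = 0; i < 4; i++) {            /* Size */
        buf[off++] = (size >> (8 * i)) & 0xff;
    }
    for (i = 0; i < 4; i++) {            /* Number of sets */
        buf[off++] = (sets >> (8 * i)) & 0xff;
    }
    buf[off++] = associativity;          /* Associativity */
    buf[off++] = attributes;             /* Attributes */
    buf[off++] = line_size & 0xff;       /* Line size */
    buf[off++] = (line_size >> 8) & 0xff;

    return off;
}
```

The flags value 0x7f marks all seven fields (size valid, number of sets
valid, associativity valid, allocation type valid, cache type valid, write
policy valid, line size valid) as valid.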

Signed-off-by: Ying Fang <fangying1@huawei.com>
---
 hw/acpi/aml-build.c         | 26 ++++++++++++++++++++++++++
 include/hw/acpi/acpi-defs.h |  8 ++++++++
 include/hw/acpi/aml-build.h |  3 +++
 3 files changed, 37 insertions(+)

diff --git a/hw/acpi/aml-build.c b/hw/acpi/aml-build.c
index d1aa9fd716..1a38110149 100644
--- a/hw/acpi/aml-build.c
+++ b/hw/acpi/aml-build.c
@@ -1770,6 +1770,32 @@ void build_slit(GArray *table_data, BIOSLinker *linker, MachineState *ms)
                  table_data->len - slit_start, 1, NULL, NULL);
 }
 
+/* ACPI 6.3: 5.2.29.2 Cache type structure (Type 1) */
+static void build_cache_head(GArray *tbl, uint32_t next_level)
+{
+    build_append_byte(tbl, 1);                     /* Type 1 - cache */
+    build_append_byte(tbl, 24);                    /* Length */
+    build_append_int_noprefix(tbl, 0, 2);          /* Reserved */
+    build_append_int_noprefix(tbl, 0x7f, 4);       /* Flags: all fields valid */
+    build_append_int_noprefix(tbl, next_level, 4); /* Next level of cache */
+}
+
+static void build_cache_tail(GArray *tbl, AcpiCacheInfo *cache_info)
+{
+    build_append_int_noprefix(tbl, cache_info->size, 4);      /* Size */
+    build_append_int_noprefix(tbl, cache_info->sets, 4);      /* Number of sets */
+    build_append_byte(tbl, cache_info->associativity);        /* Associativity */
+    build_append_byte(tbl, cache_info->attributes);           /* Attributes */
+    build_append_int_noprefix(tbl, cache_info->line_size, 2); /* Line size */
+}
+
+void build_cache_hierarchy(GArray *tbl,
+              uint32_t next_level, AcpiCacheInfo *cache_info)
+{
+    build_cache_head(tbl, next_level);
+    build_cache_tail(tbl, cache_info);
+}
+
 /*
  * ACPI 6.3: 5.2.29.1 Processor hierarchy node structure (Type 0)
  */
diff --git a/include/hw/acpi/acpi-defs.h b/include/hw/acpi/acpi-defs.h
index 38a42f409a..3df38ab449 100644
--- a/include/hw/acpi/acpi-defs.h
+++ b/include/hw/acpi/acpi-defs.h
@@ -618,4 +618,12 @@ struct AcpiIortRC {
 } QEMU_PACKED;
 typedef struct AcpiIortRC AcpiIortRC;
 
+typedef struct AcpiCacheInfo {
+    uint32_t size;
+    uint32_t sets;
+    uint8_t  associativity;
+    uint8_t  attributes;
+    uint16_t line_size;
+} AcpiCacheInfo;
+
 #endif
diff --git a/include/hw/acpi/aml-build.h b/include/hw/acpi/aml-build.h
index 56474835a7..01078753a8 100644
--- a/include/hw/acpi/aml-build.h
+++ b/include/hw/acpi/aml-build.h
@@ -437,6 +437,9 @@ void build_srat_memory(AcpiSratMemoryAffinity *numamem, uint64_t base,
 
 void build_slit(GArray *table_data, BIOSLinker *linker, MachineState *ms);
 
+void build_cache_hierarchy(GArray *tbl,
+              uint32_t next_level, AcpiCacheInfo *cache_info);
+
 void build_socket_hierarchy(GArray *tbl, uint32_t parent, uint32_t id);
 
 void build_processor_hierarchy(GArray *tbl, uint32_t flags,
-- 
2.23.0



^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [RFC PATCH v3 13/13] hw/arm/virt-acpi-build: Enable cpu and cache topology
  2020-11-09  3:04 [RFC PATCH v3 00/13] hw/arm/virt: Introduce cpu and cache topology support Ying Fang
                   ` (11 preceding siblings ...)
  2020-11-09  3:04 ` [RFC PATCH v3 12/13] hw/acpi/aml-build: Build ACPI cpu cache hierarchy information Ying Fang
@ 2020-11-09  3:04 ` Ying Fang
  2020-11-09 17:36   ` Jonathan Cameron
  12 siblings, 1 reply; 25+ messages in thread
From: Ying Fang @ 2020-11-09  3:04 UTC (permalink / raw)
  To: peter.maydell
  Cc: drjones, zhang.zhanghailiang, qemu-devel, shannon.zhaosl,
	qemu-arm, alistair.francis, Ying Fang, imammedo, salil.mehta

A helper struct AcpiCacheOffset is introduced to describe the offsets
of the three cache levels. The cache hierarchy is built according to
ACPI spec v6.3 5.2.29.2. Let's enable CPU cache topology now.
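The offset bookkeeping this patch performs in build_pptt() can be sketched
as follows (a standalone model using a plain length counter in place of
table_data->len - pptt_start; the function name and the fixed 24-byte
record length are illustrative):

```c
#include <assert.h>
#include <stdint.h>

/* Per-level offsets into the PPTT, mirroring AcpiCacheOffset */
typedef struct {
    uint32_t l1d_offset;
    uint32_t l1i_offset;
    uint32_t l2_offset;
} CacheOffset;

enum { CACHE_STRUCT_LEN = 24 };  /* Type 1 cache structure length */

/*
 * Sketch of the bookkeeping in build_pptt(): each cache structure's
 * offset is taken from the current table length (relative to the start
 * of the PPTT) *before* that structure is appended, so the processor
 * hierarchy node can later reference it as a private resource.
 * 'len' stands in for table_data->len - pptt_start.
 */
static uint32_t record_core_caches(uint32_t len, CacheOffset *off)
{
    off->l2_offset = len;            /* L2, next level of cache = 0 */
    len += CACHE_STRUCT_LEN;
    off->l1d_offset = len;           /* L1D, next level -> L2 */
    len += CACHE_STRUCT_LEN;
    off->l1i_offset = len;           /* L1I, next level -> L2 */
    len += CACHE_STRUCT_LEN;
    return len;                      /* new table length */
}
```

The three recorded offsets are then emitted as the core node's private
resource list, which is why its length grows from 20 to 32 bytes.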

Signed-off-by: Ying Fang <fangying1@huawei.com>
---
 hw/acpi/aml-build.c         | 19 +++++++++-----
 hw/arm/virt-acpi-build.c    | 52 ++++++++++++++++++++++++++++++++-----
 include/hw/acpi/acpi-defs.h |  6 +++++
 include/hw/acpi/aml-build.h |  7 ++---
 4 files changed, 68 insertions(+), 16 deletions(-)

diff --git a/hw/acpi/aml-build.c b/hw/acpi/aml-build.c
index 1a38110149..93a81fbaf5 100644
--- a/hw/acpi/aml-build.c
+++ b/hw/acpi/aml-build.c
@@ -1799,27 +1799,32 @@ void build_cache_hierarchy(GArray *tbl,
 /*
  * ACPI 6.3: 5.2.29.1 Processor hierarchy node structure (Type 0)
  */
-void build_socket_hierarchy(GArray *tbl, uint32_t parent, uint32_t id)
+void build_socket_hierarchy(GArray *tbl, uint32_t parent,
+                            uint32_t offset, uint32_t id)
 {
     build_append_byte(tbl, 0);          /* Type 0 - processor */
-    build_append_byte(tbl, 20);         /* Length, no private resources */
+    build_append_byte(tbl, 24);         /* Length, with private resources */
     build_append_int_noprefix(tbl, 0, 2);  /* Reserved */
     build_append_int_noprefix(tbl, 1, 4);  /* Flags: Physical package */
     build_append_int_noprefix(tbl, parent, 4);  /* Parent */
     build_append_int_noprefix(tbl, id, 4);     /* ACPI processor ID */
-    build_append_int_noprefix(tbl, 0, 4);  /* Number of private resources */
+    build_append_int_noprefix(tbl, 1, 4);  /* Number of private resources */
+    build_append_int_noprefix(tbl, offset, 4);  /* Private resources */
 }
 
-void build_processor_hierarchy(GArray *tbl, uint32_t flags,
-                               uint32_t parent, uint32_t id)
+void build_processor_hierarchy(GArray *tbl, uint32_t flags, uint32_t parent,
+                               AcpiCacheOffset offset, uint32_t id)
 {
     build_append_byte(tbl, 0);          /* Type 0 - processor */
-    build_append_byte(tbl, 20);         /* Length, no private resources */
+    build_append_byte(tbl, 32);         /* Length, with private resources */
     build_append_int_noprefix(tbl, 0, 2);      /* Reserved */
     build_append_int_noprefix(tbl, flags, 4);  /* Flags */
     build_append_int_noprefix(tbl, parent, 4); /* Parent */
     build_append_int_noprefix(tbl, id, 4);     /* ACPI processor ID */
-    build_append_int_noprefix(tbl, 0, 4);  /* Number of private resources */
+    build_append_int_noprefix(tbl, 3, 4);  /* Number of private resources */
+    build_append_int_noprefix(tbl, offset.l1d_offset, 4); /* L1D resource */
+    build_append_int_noprefix(tbl, offset.l1i_offset, 4); /* L1I resource */
+    build_append_int_noprefix(tbl, offset.l2_offset, 4);  /* L2 resource */
 }
 
 void build_smt_hierarchy(GArray *tbl, uint32_t parent, uint32_t id)
diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
index 5784370257..ad49006b42 100644
--- a/hw/arm/virt-acpi-build.c
+++ b/hw/arm/virt-acpi-build.c
@@ -429,29 +429,69 @@ build_srat(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
                  "SRAT", table_data->len - srat_start, 3, NULL, NULL);
 }
 
-static void build_pptt(GArray *table_data, BIOSLinker *linker, MachineState *ms)
+static inline void arm_acpi_cache_info(CPUCacheInfo *cpu_cache,
+                                       AcpiCacheInfo *acpi_cache)
 {
+    acpi_cache->size = cpu_cache->size;
+    acpi_cache->sets = cpu_cache->sets;
+    acpi_cache->associativity = cpu_cache->associativity;
+    acpi_cache->attributes = cpu_cache->attributes;
+    acpi_cache->line_size = cpu_cache->line_size;
+}
+
+static void build_pptt(GArray *table_data, BIOSLinker *linker,
+                       VirtMachineState *vms)
+{
+    MachineState *ms = MACHINE(vms);
     int pptt_start = table_data->len;
     int uid = 0, cpus = 0, socket;
     unsigned int smp_cores = ms->smp.cores;
     unsigned int smp_threads = ms->smp.threads;
+    AcpiCacheOffset offset;
+    ARMCPU *cpu = ARM_CPU(qemu_get_cpu(cpus));
+    AcpiCacheInfo cache_info;
 
     acpi_data_push(table_data, sizeof(AcpiTableHeader));
 
     for (socket = 0; cpus < ms->possible_cpus->len; socket++) {
-        uint32_t socket_offset = table_data->len - pptt_start;
+        uint32_t l3_offset = table_data->len - pptt_start;
+        uint32_t socket_offset;
         int core;
 
-        build_socket_hierarchy(table_data, 0, socket);
+        /* L3 cache type structure */
+        arm_acpi_cache_info(cpu->caches.l3_cache, &cache_info);
+        build_cache_hierarchy(table_data, 0, &cache_info);
+
+        socket_offset = table_data->len - pptt_start;
+        build_socket_hierarchy(table_data, 0, l3_offset, socket);
 
         for (core = 0; core < smp_cores; core++) {
             uint32_t core_offset = table_data->len - pptt_start;
             int thread;
 
+            /* L2 cache type structure */
+            offset.l2_offset = table_data->len - pptt_start;
+            arm_acpi_cache_info(cpu->caches.l2_cache, &cache_info);
+            build_cache_hierarchy(table_data, 0, &cache_info);
+
+            /* L1d cache type structure */
+            offset.l1d_offset = table_data->len - pptt_start;
+            arm_acpi_cache_info(cpu->caches.l1d_cache, &cache_info);
+            build_cache_hierarchy(table_data, offset.l2_offset, &cache_info);
+
+            /* L1i cache type structure */
+            offset.l1i_offset = table_data->len - pptt_start;
+            arm_acpi_cache_info(cpu->caches.l1i_cache, &cache_info);
+            build_cache_hierarchy(table_data, offset.l2_offset, &cache_info);
+
+            core_offset = table_data->len - pptt_start;
             if (smp_threads <= 1) {
-                build_processor_hierarchy(table_data, 2, socket_offset, uid++);
+                build_processor_hierarchy(table_data, 2, socket_offset,
+                                          offset, uid++);
              } else {
-                build_processor_hierarchy(table_data, 0, socket_offset, core);
+                build_processor_hierarchy(table_data, 0, socket_offset,
+                                          offset, core);
                 for (thread = 0; thread < smp_threads; thread++) {
                     build_smt_hierarchy(table_data, core_offset, uid++);
                 }
@@ -727,7 +767,7 @@ void virt_acpi_build(VirtMachineState *vms, AcpiBuildTables *tables)
 
     if (cpu_topology_enabled) {
         acpi_add_table(table_offsets, tables_blob);
-        build_pptt(tables_blob, tables->linker, ms);
+        build_pptt(tables_blob, tables->linker, vms);
     }
 
     acpi_add_table(table_offsets, tables_blob);
diff --git a/include/hw/acpi/acpi-defs.h b/include/hw/acpi/acpi-defs.h
index 3df38ab449..e48b7fa506 100644
--- a/include/hw/acpi/acpi-defs.h
+++ b/include/hw/acpi/acpi-defs.h
@@ -626,4 +626,10 @@ typedef struct AcpiCacheInfo {
     uint16_t line_size;
 } AcpiCacheInfo;
 
+typedef struct AcpiCacheOffset {
+    uint32_t l1d_offset;
+    uint32_t l1i_offset;
+    uint32_t l2_offset;
+} AcpiCacheOffset;
+
 #endif
diff --git a/include/hw/acpi/aml-build.h b/include/hw/acpi/aml-build.h
index 01078753a8..a15ccb2c91 100644
--- a/include/hw/acpi/aml-build.h
+++ b/include/hw/acpi/aml-build.h
@@ -440,10 +440,11 @@ void build_slit(GArray *table_data, BIOSLinker *linker, MachineState *ms);
 void build_cache_hierarchy(GArray *tbl,
               uint32_t next_level, AcpiCacheInfo *cache_info);
 
-void build_socket_hierarchy(GArray *tbl, uint32_t parent, uint32_t id);
+void build_socket_hierarchy(GArray *tbl, uint32_t parent,
+                            uint32_t offset, uint32_t id);
 
-void build_processor_hierarchy(GArray *tbl, uint32_t flags,
-                               uint32_t parent, uint32_t id);
+void build_processor_hierarchy(GArray *tbl, uint32_t flags, uint32_t parent,
+                               AcpiCacheOffset offset, uint32_t id);
 
 void build_smt_hierarchy(GArray *tbl, uint32_t parent, uint32_t id);
 
-- 
2.23.0



^ permalink raw reply related	[flat|nested] 25+ messages in thread

* RE: [RFC PATCH v3 01/13] hw/arm/virt: Spell out smp.cpus and smp.max_cpus
  2020-11-09  3:04 ` [RFC PATCH v3 01/13] hw/arm/virt: Spell out smp.cpus and smp.max_cpus Ying Fang
@ 2020-11-09 10:45   ` Salil Mehta
  2020-11-17 10:27     ` Ying Fang
  0 siblings, 1 reply; 25+ messages in thread
From: Salil Mehta @ 2020-11-09 10:45 UTC (permalink / raw)
  To: fangying, peter.maydell
  Cc: drjones, Zhanghailiang, qemu-devel, shannon.zhaosl, qemu-arm,
	alistair.francis, imammedo

Hi Fangying,
A trivial thing: this patch looks like a bit of noise in this patch set.
It would be better to send it as a separate patch set and get it accepted.

Thanks

> From: fangying
> Sent: Monday, November 9, 2020 3:05 AM
> To: peter.maydell@linaro.org
> Cc: qemu-devel@nongnu.org; qemu-arm@nongnu.org; drjones@redhat.com;
> imammedo@redhat.com; shannon.zhaosl@gmail.com; alistair.francis@wdc.com;
> Zhanghailiang <zhang.zhanghailiang@huawei.com>; Salil Mehta
> <salil.mehta@huawei.com>
> Subject: [RFC PATCH v3 01/13] hw/arm/virt: Spell out smp.cpus and smp.max_cpus
> 
> From: Andrew Jones <drjones@redhat.com>
> 
> Prefer to spell out the smp.cpus and smp.max_cpus machine state
> variables in order to make grepping easier and to avoid any
> confusion as to what cpu count is being used where.
> 
> Signed-off-by: Andrew Jones <drjones@redhat.com>
> ---
>  hw/arm/virt-acpi-build.c |  8 +++----
>  hw/arm/virt.c            | 51 +++++++++++++++++++---------------------
>  include/hw/arm/virt.h    |  2 +-
>  3 files changed, 29 insertions(+), 32 deletions(-)
> 
> diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
> index 9747a6458f..a222981737 100644
> --- a/hw/arm/virt-acpi-build.c
> +++ b/hw/arm/virt-acpi-build.c
> @@ -57,11 +57,11 @@
> 
>  #define ARM_SPI_BASE 32
> 
> -static void acpi_dsdt_add_cpus(Aml *scope, int smp_cpus)
> +static void acpi_dsdt_add_cpus(Aml *scope, int cpus)
>  {
>      uint16_t i;
> 
> -    for (i = 0; i < smp_cpus; i++) {
> +    for (i = 0; i < cpus; i++) {
>          Aml *dev = aml_device("C%.03X", i);
>          aml_append(dev, aml_name_decl("_HID", aml_string("ACPI0007")));
>          aml_append(dev, aml_name_decl("_UID", aml_int(i)));
> @@ -480,7 +480,7 @@ build_madt(GArray *table_data, BIOSLinker *linker,
> VirtMachineState *vms)
>      gicd->base_address = cpu_to_le64(memmap[VIRT_GIC_DIST].base);
>      gicd->version = vms->gic_version;
> 
> -    for (i = 0; i < vms->smp_cpus; i++) {
> +    for (i = 0; i < MACHINE(vms)->smp.cpus; i++) {
>          AcpiMadtGenericCpuInterface *gicc = acpi_data_push(table_data,
>                                                             sizeof(*gicc));
>          ARMCPU *armcpu = ARM_CPU(qemu_get_cpu(i));
> @@ -599,7 +599,7 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
> VirtMachineState *vms)
>       * the RTC ACPI device at all when using UEFI.
>       */
>      scope = aml_scope("\\_SB");
> -    acpi_dsdt_add_cpus(scope, vms->smp_cpus);
> +    acpi_dsdt_add_cpus(scope, ms->smp.cpus);
>      acpi_dsdt_add_uart(scope, &memmap[VIRT_UART],
>                         (irqmap[VIRT_UART] + ARM_SPI_BASE));
>      if (vmc->acpi_expose_flash) {
> diff --git a/hw/arm/virt.c b/hw/arm/virt.c
> index e465a988d6..0069fa1298 100644
> --- a/hw/arm/virt.c
> +++ b/hw/arm/virt.c
> @@ -322,7 +322,7 @@ static void fdt_add_timer_nodes(const VirtMachineState *vms)
>      if (vms->gic_version == VIRT_GIC_VERSION_2) {
>          irqflags = deposit32(irqflags, GIC_FDT_IRQ_PPI_CPU_START,
>                               GIC_FDT_IRQ_PPI_CPU_WIDTH,
> -                             (1 << vms->smp_cpus) - 1);
> +                             (1 << MACHINE(vms)->smp.cpus) - 1);
>      }
> 
>      qemu_fdt_add_subnode(vms->fdt, "/timer");
> @@ -363,7 +363,7 @@ static void fdt_add_cpu_nodes(const VirtMachineState *vms)
>       *  The simplest way to go is to examine affinity IDs of all our CPUs. If
>       *  at least one of them has Aff3 populated, we set #address-cells to 2.
>       */
> -    for (cpu = 0; cpu < vms->smp_cpus; cpu++) {
> +    for (cpu = 0; cpu < ms->smp.cpus; cpu++) {
>          ARMCPU *armcpu = ARM_CPU(qemu_get_cpu(cpu));
> 
>          if (armcpu->mp_affinity & ARM_AFF3_MASK) {
> @@ -376,7 +376,7 @@ static void fdt_add_cpu_nodes(const VirtMachineState *vms)
>      qemu_fdt_setprop_cell(vms->fdt, "/cpus", "#address-cells", addr_cells);
>      qemu_fdt_setprop_cell(vms->fdt, "/cpus", "#size-cells", 0x0);
> 
> -    for (cpu = vms->smp_cpus - 1; cpu >= 0; cpu--) {
> +    for (cpu = ms->smp.cpus - 1; cpu >= 0; cpu--) {
>          char *nodename = g_strdup_printf("/cpus/cpu@%d", cpu);
>          ARMCPU *armcpu = ARM_CPU(qemu_get_cpu(cpu));
>          CPUState *cs = CPU(armcpu);
> @@ -387,7 +387,7 @@ static void fdt_add_cpu_nodes(const VirtMachineState *vms)
>                                      armcpu->dtb_compatible);
> 
>          if (vms->psci_conduit != QEMU_PSCI_CONDUIT_DISABLED
> -            && vms->smp_cpus > 1) {
> +            && ms->smp.cpus > 1) {
>              qemu_fdt_setprop_string(vms->fdt, nodename,
>                                          "enable-method", "psci");
>          }
> @@ -533,7 +533,7 @@ static void fdt_add_pmu_nodes(const VirtMachineState *vms)
>      if (vms->gic_version == VIRT_GIC_VERSION_2) {
>          irqflags = deposit32(irqflags, GIC_FDT_IRQ_PPI_CPU_START,
>                               GIC_FDT_IRQ_PPI_CPU_WIDTH,
> -                             (1 << vms->smp_cpus) - 1);
> +                             (1 << MACHINE(vms)->smp.cpus) - 1);
>      }
> 
>      qemu_fdt_add_subnode(vms->fdt, "/pmu");
> @@ -622,14 +622,13 @@ static void create_gic(VirtMachineState *vms)
>      SysBusDevice *gicbusdev;
>      const char *gictype;
>      int type = vms->gic_version, i;
> -    unsigned int smp_cpus = ms->smp.cpus;
>      uint32_t nb_redist_regions = 0;
> 
>      gictype = (type == 3) ? gicv3_class_name() : gic_class_name();
> 
>      vms->gic = qdev_new(gictype);
>      qdev_prop_set_uint32(vms->gic, "revision", type);
> -    qdev_prop_set_uint32(vms->gic, "num-cpu", smp_cpus);
> +    qdev_prop_set_uint32(vms->gic, "num-cpu", ms->smp.cpus);
>      /* Note that the num-irq property counts both internal and external
>       * interrupts; there are always 32 of the former (mandated by GIC spec).
>       */
> @@ -641,7 +640,7 @@ static void create_gic(VirtMachineState *vms)
>      if (type == 3) {
>          uint32_t redist0_capacity =
>                      vms->memmap[VIRT_GIC_REDIST].size / GICV3_REDIST_SIZE;
> -        uint32_t redist0_count = MIN(smp_cpus, redist0_capacity);
> +        uint32_t redist0_count = MIN(ms->smp.cpus, redist0_capacity);
> 
>          nb_redist_regions = virt_gicv3_redist_region_count(vms);
> 
> @@ -654,7 +653,7 @@ static void create_gic(VirtMachineState *vms)
>                      vms->memmap[VIRT_HIGH_GIC_REDIST2].size /
> GICV3_REDIST_SIZE;
> 
>              qdev_prop_set_uint32(vms->gic, "redist-region-count[1]",
> -                MIN(smp_cpus - redist0_count, redist1_capacity));
> +                MIN(ms->smp.cpus - redist0_count, redist1_capacity));
>          }
>      } else {
>          if (!kvm_irqchip_in_kernel()) {
> @@ -683,7 +682,7 @@ static void create_gic(VirtMachineState *vms)
>       * maintenance interrupt signal to the appropriate GIC PPI inputs,
>       * and the GIC's IRQ/FIQ/VIRQ/VFIQ interrupt outputs to the CPU's inputs.
>       */
> -    for (i = 0; i < smp_cpus; i++) {
> +    for (i = 0; i < ms->smp.cpus; i++) {
>          DeviceState *cpudev = DEVICE(qemu_get_cpu(i));
>          int ppibase = NUM_IRQS + i * GIC_INTERNAL + GIC_NR_SGIS;
>          int irq;
> @@ -711,7 +710,7 @@ static void create_gic(VirtMachineState *vms)
>          } else if (vms->virt) {
>              qemu_irq irq = qdev_get_gpio_in(vms->gic,
>                                              ppibase + ARCH_GIC_MAINT_IRQ);
> -            sysbus_connect_irq(gicbusdev, i + 4 * smp_cpus, irq);
> +            sysbus_connect_irq(gicbusdev, i + 4 * ms->smp.cpus, irq);
>          }
> 
>          qdev_connect_gpio_out_named(cpudev, "pmu-interrupt", 0,
> @@ -719,11 +718,11 @@ static void create_gic(VirtMachineState *vms)
>                                                       + VIRTUAL_PMU_IRQ));
> 
>          sysbus_connect_irq(gicbusdev, i, qdev_get_gpio_in(cpudev,
> ARM_CPU_IRQ));
> -        sysbus_connect_irq(gicbusdev, i + smp_cpus,
> +        sysbus_connect_irq(gicbusdev, i + ms->smp.cpus,
>                             qdev_get_gpio_in(cpudev, ARM_CPU_FIQ));
> -        sysbus_connect_irq(gicbusdev, i + 2 * smp_cpus,
> +        sysbus_connect_irq(gicbusdev, i + 2 * ms->smp.cpus,
>                             qdev_get_gpio_in(cpudev, ARM_CPU_VIRQ));
> -        sysbus_connect_irq(gicbusdev, i + 3 * smp_cpus,
> +        sysbus_connect_irq(gicbusdev, i + 3 * ms->smp.cpus,
>                             qdev_get_gpio_in(cpudev, ARM_CPU_VFIQ));
>      }
> 
> @@ -1572,7 +1571,7 @@ static void virt_set_memmap(VirtMachineState *vms)
>   */
>  static void finalize_gic_version(VirtMachineState *vms)
>  {
> -    unsigned int max_cpus = MACHINE(vms)->smp.max_cpus;
> +    MachineState *ms = MACHINE(vms);
> 
>      if (kvm_enabled()) {
>          int probe_bitmap;
> @@ -1613,7 +1612,8 @@ static void finalize_gic_version(VirtMachineState *vms)
>              }
>              return;
>          case VIRT_GIC_VERSION_NOSEL:
> -            if ((probe_bitmap & KVM_ARM_VGIC_V2) && max_cpus <= GIC_NCPU) {
> +            if ((probe_bitmap & KVM_ARM_VGIC_V2) &&
> +                ms->smp.max_cpus <= GIC_NCPU) {
>                  vms->gic_version = VIRT_GIC_VERSION_2;
>              } else if (probe_bitmap & KVM_ARM_VGIC_V3) {
>                  /*
> @@ -1622,7 +1622,7 @@ static void finalize_gic_version(VirtMachineState *vms)
>                   * to v3. In any case defaulting to v2 would be broken.
>                   */
>                  vms->gic_version = VIRT_GIC_VERSION_3;
> -            } else if (max_cpus > GIC_NCPU) {
> +            } else if (ms->smp.max_cpus > GIC_NCPU) {
>                  error_report("host only supports in-kernel GICv2 emulation "
>                               "but more than 8 vcpus are requested");
>                  exit(1);
> @@ -1743,8 +1743,6 @@ static void machvirt_init(MachineState *machine)
>      bool firmware_loaded;
>      bool aarch64 = true;
>      bool has_ged = !vmc->no_ged;
> -    unsigned int smp_cpus = machine->smp.cpus;
> -    unsigned int max_cpus = machine->smp.max_cpus;
> 
>      /*
>       * In accelerated mode, the memory map is computed earlier in kvm_type()
> @@ -1815,10 +1813,10 @@ static void machvirt_init(MachineState *machine)
>          virt_max_cpus = GIC_NCPU;
>      }
> 
> -    if (max_cpus > virt_max_cpus) {
> +    if (machine->smp.max_cpus > virt_max_cpus) {
>          error_report("Number of SMP CPUs requested (%d) exceeds max CPUs "
>                       "supported by machine 'mach-virt' (%d)",
> -                     max_cpus, virt_max_cpus);
> +                     machine->smp.max_cpus, virt_max_cpus);
>          exit(1);
>      }
> 
> @@ -1843,7 +1841,7 @@ static void machvirt_init(MachineState *machine)
>          Object *cpuobj;
>          CPUState *cs;
> 
> -        if (n >= smp_cpus) {
> +        if (n >= machine->smp.cpus) {
>              break;
>          }
> 
> @@ -2015,7 +2013,7 @@ static void machvirt_init(MachineState *machine)
>      }
> 
>      vms->bootinfo.ram_size = machine->ram_size;
> -    vms->bootinfo.nb_cpus = smp_cpus;
> +    vms->bootinfo.nb_cpus = machine->smp.cpus;
>      vms->bootinfo.board_id = -1;
>      vms->bootinfo.loader_start = vms->memmap[VIRT_MEM].base;
>      vms->bootinfo.get_dtb = machvirt_dtb;
> @@ -2208,17 +2206,16 @@ static int64_t virt_get_default_cpu_node_id(const
> MachineState *ms, int idx)
>  static const CPUArchIdList *virt_possible_cpu_arch_ids(MachineState *ms)
>  {
>      int n;
> -    unsigned int max_cpus = ms->smp.max_cpus;
>      VirtMachineState *vms = VIRT_MACHINE(ms);
> 
>      if (ms->possible_cpus) {
> -        assert(ms->possible_cpus->len == max_cpus);
> +        assert(ms->possible_cpus->len == ms->smp.max_cpus);
>          return ms->possible_cpus;
>      }
> 
>      ms->possible_cpus = g_malloc0(sizeof(CPUArchIdList) +
> -                                  sizeof(CPUArchId) * max_cpus);
> -    ms->possible_cpus->len = max_cpus;
> +                                  sizeof(CPUArchId) * ms->smp.max_cpus);
> +    ms->possible_cpus->len = ms->smp.max_cpus;
>      for (n = 0; n < ms->possible_cpus->len; n++) {
>          ms->possible_cpus->cpus[n].type = ms->cpu_type;
>          ms->possible_cpus->cpus[n].arch_id =
> diff --git a/include/hw/arm/virt.h b/include/hw/arm/virt.h
> index aad6d69841..953d94acc0 100644
> --- a/include/hw/arm/virt.h
> +++ b/include/hw/arm/virt.h
> @@ -181,7 +181,7 @@ static inline int
> virt_gicv3_redist_region_count(VirtMachineState *vms)
> 
>      assert(vms->gic_version == VIRT_GIC_VERSION_3);
> 
> -    return vms->smp_cpus > redist0_capacity ? 2 : 1;
> +    return MACHINE(vms)->smp.cpus > redist0_capacity ? 2 : 1;
>  }
> 
>  #endif /* QEMU_ARM_VIRT_H */
> --
> 2.23.0



^ permalink raw reply	[flat|nested] 25+ messages in thread

* RE: [RFC PATCH v3 02/13] hw/arm/virt: Remove unused variable
  2020-11-09  3:04 ` [RFC PATCH v3 02/13] hw/arm/virt: Remove unused variable Ying Fang
@ 2020-11-09 10:54   ` Salil Mehta
  0 siblings, 0 replies; 25+ messages in thread
From: Salil Mehta @ 2020-11-09 10:54 UTC (permalink / raw)
  To: fangying, peter.maydell
  Cc: drjones, Zhanghailiang, qemu-devel, shannon.zhaosl, qemu-arm,
	alistair.francis, imammedo

Hi Fangying,
Same here. Why not club [01/13] and [02/13] together and send them as a separate patch set?

Thanks

> From: fangying
> Sent: Monday, November 9, 2020 3:05 AM
> To: peter.maydell@linaro.org
> Cc: qemu-devel@nongnu.org; qemu-arm@nongnu.org; drjones@redhat.com;
> imammedo@redhat.com; shannon.zhaosl@gmail.com; alistair.francis@wdc.com;
> Zhanghailiang <zhang.zhanghailiang@huawei.com>; Salil Mehta
> <salil.mehta@huawei.com>
> Subject: [RFC PATCH v3 02/13] hw/arm/virt: Remove unused variable
> 
> From: Andrew Jones <drjones@redhat.com>
> 
> We no longer use the smp_cpus virtual machine state variable.
> Remove it.
> 
> Signed-off-by: Andrew Jones <drjones@redhat.com>
> ---
>  hw/arm/virt.c         | 2 --
>  include/hw/arm/virt.h | 1 -
>  2 files changed, 3 deletions(-)
> 
> diff --git a/hw/arm/virt.c b/hw/arm/virt.c
> index 0069fa1298..ea24b576c6 100644
> --- a/hw/arm/virt.c
> +++ b/hw/arm/virt.c
> @@ -1820,8 +1820,6 @@ static void machvirt_init(MachineState *machine)
>          exit(1);
>      }
> 
> -    vms->smp_cpus = smp_cpus;
> -
>      if (vms->virt && kvm_enabled()) {
>          error_report("mach-virt: KVM does not support providing "
>                       "Virtualization extensions to the guest CPU");
> diff --git a/include/hw/arm/virt.h b/include/hw/arm/virt.h
> index 953d94acc0..010f24f580 100644
> --- a/include/hw/arm/virt.h
> +++ b/include/hw/arm/virt.h
> @@ -151,7 +151,6 @@ struct VirtMachineState {
>      MemMapEntry *memmap;
>      char *pciehb_nodename;
>      const int *irqmap;
> -    int smp_cpus;
>      void *fdt;
>      int fdt_size;
>      uint32_t clock_phandle;
> --
> 2.23.0



^ permalink raw reply	[flat|nested] 25+ messages in thread

* RE: [RFC PATCH v3 03/13] hw/arm/virt: Replace smp_parse with one that prefers cores
  2020-11-09  3:04 ` [RFC PATCH v3 03/13] hw/arm/virt: Replace smp_parse with one that prefers cores Ying Fang
@ 2020-11-09 11:01   ` Salil Mehta
  2020-11-09 11:58     ` Andrew Jones
  0 siblings, 1 reply; 25+ messages in thread
From: Salil Mehta @ 2020-11-09 11:01 UTC (permalink / raw)
  To: fangying, peter.maydell
  Cc: drjones, Zhanghailiang, qemu-devel, shannon.zhaosl, qemu-arm,
	alistair.francis, imammedo

> From: fangying
> Sent: Monday, November 9, 2020 3:05 AM
> To: peter.maydell@linaro.org
> Cc: qemu-devel@nongnu.org; qemu-arm@nongnu.org; drjones@redhat.com;
> imammedo@redhat.com; shannon.zhaosl@gmail.com; alistair.francis@wdc.com;
> Zhanghailiang <zhang.zhanghailiang@huawei.com>; Salil Mehta
> <salil.mehta@huawei.com>
> Subject: [RFC PATCH v3 03/13] hw/arm/virt: Replace smp_parse with one that
> prefers cores
> 
> From: Andrew Jones <drjones@redhat.com>
> 
> The virt machine type has never used the CPU topology parameters, other
> than number of online CPUs and max CPUs. When choosing how to allocate
> those CPUs the default has been to assume cores. In preparation for
> using the other CPU topology parameters let's use an smp_parse that
> prefers cores over sockets. We can also enforce the topology matches
> max_cpus check because we have no legacy to preserve.
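For reference, the cores-over-sockets completion described in the quoted
message can be modeled as a small standalone function (simplified: maxcpus
handling and error checks are omitted, 0 means "unspecified", and cpus is
used where the real code derives cores from maxcpus):

```c
#include <assert.h>

/*
 * Fill in missing -smp counts preferring cores, as virt_smp_parse()
 * does for the common "-smp N" case: unspecified sockets/threads
 * default to 1 and the remaining CPUs become cores.
 */
static void complete_topology(unsigned *cpus, unsigned *sockets,
                              unsigned *cores, unsigned *threads)
{
    if (*cpus == 0 || *cores == 0) {
        *sockets = *sockets ? *sockets : 1;
        *threads = *threads ? *threads : 1;
        if (*cpus == 0) {
            *cores = *cores ? *cores : 1;
            *cpus = *cores * *threads * *sockets;
        } else {
            *cores = *cpus / (*sockets * *threads);
        }
    } else if (*sockets == 0) {
        *threads = *threads ? *threads : 1;
        *sockets = *cpus / (*cores * *threads);
        *sockets = *sockets ? *sockets : 1;
    } else if (*threads == 0) {
        *threads = *cpus / (*cores * *sockets);
        *threads = *threads ? *threads : 1;
    }
}
```

So "-smp 8" completes to 1 socket, 8 cores, 1 thread, rather than the
8 sockets of 1 core produced by hw/core/machine.c's default parser.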


Hi Andrew,
I am wondering whether we also need to handle other levels of the
processor hierarchy on ARM64, such as 'clusters' or 'dies'?

Thanks


> 
> Signed-off-by: Andrew Jones <drjones@redhat.com>
> ---
>  hw/arm/virt.c | 76 +++++++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 76 insertions(+)
> 
> diff --git a/hw/arm/virt.c b/hw/arm/virt.c
> index ea24b576c6..ba902b53ba 100644
> --- a/hw/arm/virt.c
> +++ b/hw/arm/virt.c
> @@ -78,6 +78,8 @@
>  #include "hw/virtio/virtio-iommu.h"
>  #include "hw/char/pl011.h"
>  #include "qemu/guest-random.h"
> +#include "qapi/qmp/qerror.h"
> +#include "sysemu/replay.h"
> 
>  #define DEFINE_VIRT_MACHINE_LATEST(major, minor, latest) \
>      static void virt_##major##_##minor##_class_init(ObjectClass *oc, \
> @@ -2444,6 +2446,79 @@ static int virt_kvm_type(MachineState *ms, const char
> *type_str)
>      return requested_pa_size > 40 ? requested_pa_size : 0;
>  }
> 
> +/*
> + * Unlike smp_parse() in hw/core/machine.c, we prefer cores over sockets,
> + * e.g. '-smp 8' creates 1 socket with 8 cores.  Whereas '-smp 8' with
> + * hw/core/machine.c's smp_parse() creates 8 sockets, each with 1 core.
> + * Additionally, we can enforce the topology matches max_cpus check,
> + * because we have no legacy to preserve.
> + */
> +static void virt_smp_parse(MachineState *ms, QemuOpts *opts)
> +{
> +    if (opts) {
> +        unsigned cpus    = qemu_opt_get_number(opts, "cpus", 0);
> +        unsigned sockets = qemu_opt_get_number(opts, "sockets", 0);
> +        unsigned cores   = qemu_opt_get_number(opts, "cores", 0);
> +        unsigned threads = qemu_opt_get_number(opts, "threads", 0);
> +
> +        /*
> +         * Compute missing values; prefer cores over sockets and
> +         * sockets over threads.
> +         */
> +        if (cpus == 0 || cores == 0) {
> +            sockets = sockets > 0 ? sockets : 1;
> +            threads = threads > 0 ? threads : 1;
> +            if (cpus == 0) {
> +                cores = cores > 0 ? cores : 1;
> +                cpus = cores * threads * sockets;
> +            } else {
> +                ms->smp.max_cpus = qemu_opt_get_number(opts, "maxcpus", cpus);
> +                cores = ms->smp.max_cpus / (sockets * threads);
> +            }
> +        } else if (sockets == 0) {
> +            threads = threads > 0 ? threads : 1;
> +            sockets = cpus / (cores * threads);
> +            sockets = sockets > 0 ? sockets : 1;
> +        } else if (threads == 0) {
> +            threads = cpus / (cores * sockets);
> +            threads = threads > 0 ? threads : 1;
> +        } else if (sockets * cores * threads < cpus) {
> +            error_report("cpu topology: "
> +                         "sockets (%u) * cores (%u) * threads (%u) < "
> +                         "smp_cpus (%u)",
> +                         sockets, cores, threads, cpus);
> +            exit(1);
> +        }
> +
> +        ms->smp.max_cpus = qemu_opt_get_number(opts, "maxcpus", cpus);
> +
> +        if (ms->smp.max_cpus < cpus) {
> +            error_report("maxcpus must be equal to or greater than smp");
> +            exit(1);
> +        }
> +
> +        if (sockets * cores * threads != ms->smp.max_cpus) {
> +            error_report("cpu topology: "
> +                         "sockets (%u) * cores (%u) * threads (%u) "
> +                         "!= maxcpus (%u)",
> +                         sockets, cores, threads,
> +                         ms->smp.max_cpus);
> +            exit(1);
> +        }
> +
> +        ms->smp.cpus = cpus;
> +        ms->smp.cores = cores;
> +        ms->smp.threads = threads;
> +        ms->smp.sockets = sockets;
> +    }
> +
> +    if (ms->smp.cpus > 1) {
> +        Error *blocker = NULL;
> +        error_setg(&blocker, QERR_REPLAY_NOT_SUPPORTED, "smp");
> +        replay_add_blocker(blocker);
> +    }
> +}
> +
>  static void virt_machine_class_init(ObjectClass *oc, void *data)
>  {
>      MachineClass *mc = MACHINE_CLASS(oc);
> @@ -2469,6 +2544,7 @@ static void virt_machine_class_init(ObjectClass *oc, void
> *data)
>      mc->cpu_index_to_instance_props = virt_cpu_index_to_props;
>      mc->default_cpu_type = ARM_CPU_TYPE_NAME("cortex-a15");
>      mc->get_default_cpu_node_id = virt_get_default_cpu_node_id;
> +    mc->smp_parse = virt_smp_parse;
>      mc->kvm_type = virt_kvm_type;
>      assert(!mc->get_hotplug_handler);
>      mc->get_hotplug_handler = virt_machine_get_hotplug_handler;
> --
> 2.23.0



^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [RFC PATCH v3 03/13] hw/arm/virt: Replace smp_parse with one that prefers cores
  2020-11-09 11:01   ` Salil Mehta
@ 2020-11-09 11:58     ` Andrew Jones
  0 siblings, 0 replies; 25+ messages in thread
From: Andrew Jones @ 2020-11-09 11:58 UTC (permalink / raw)
  To: Salil Mehta
  Cc: peter.maydell, Zhanghailiang, qemu-devel, shannon.zhaosl,
	qemu-arm, alistair.francis, fangying, imammedo

On Mon, Nov 09, 2020 at 11:01:48AM +0000, Salil Mehta wrote:
> > From: fangying
> > Sent: Monday, November 9, 2020 3:05 AM
> > To: peter.maydell@linaro.org
> > Cc: qemu-devel@nongnu.org; qemu-arm@nongnu.org; drjones@redhat.com;
> > imammedo@redhat.com; shannon.zhaosl@gmail.com; alistair.francis@wdc.com;
> > Zhanghailiang <zhang.zhanghailiang@huawei.com>; Salil Mehta
> > <salil.mehta@huawei.com>
> > Subject: [RFC PATCH v3 03/13] hw/arm/virt: Replace smp_parse with one that
> > prefers cores
> > 
> > From: Andrew Jones <drjones@redhat.com>
> > 
> > The virt machine type has never used the CPU topology parameters, other
> > than number of online CPUs and max CPUs. When choosing how to allocate
> > those CPUs the default has been to assume cores. In preparation for
> > using the other CPU topology parameters let's use an smp_parse that
> > prefers cores over sockets. We can also enforce the topology matches
> > max_cpus check because we have no legacy to preserve.
> 
> 
> Hi Andrew,
> I am wondering if we need to take care of other levels of processor
> hierarchy as well on ARM64, like 'clusters'/'dies'?

I don't know, but so far in Linux only x86 considers dies. Also, Linux
doesn't define the concept of a cluster in its arch-neutral cpu topology
API. With that in mind, I think we should just focus on sockets, cores,
and threads until we have a guest OS that can be used to test other
levels.
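[Editorial note: the cores-over-sockets preference under discussion can be modeled in a few lines. This is a hypothetical Python sketch of the quoted virt_smp_parse() logic, not QEMU code; the function name and dict layout are invented for illustration.]

```python
def smp_parse_prefer_cores(cpus=0, sockets=0, cores=0, threads=0, maxcpus=0):
    """Hypothetical model of the quoted virt_smp_parse(); 0 means 'not given'."""
    if cpus == 0 or cores == 0:
        sockets = sockets or 1
        threads = threads or 1
        if cpus == 0:
            cores = cores or 1
            cpus = cores * threads * sockets
        else:
            # prefer cores: derive them from maxcpus / (sockets * threads)
            maxcpus = maxcpus or cpus
            cores = maxcpus // (sockets * threads)
    elif sockets == 0:
        threads = threads or 1
        sockets = max(cpus // (cores * threads), 1)
    elif threads == 0:
        threads = max(cpus // (cores * sockets), 1)
    maxcpus = maxcpus or cpus
    # enforce topology == maxcpus, as the patch does (no legacy to preserve)
    assert sockets * cores * threads == maxcpus, "topology != maxcpus"
    return {"cpus": cpus, "sockets": sockets, "cores": cores, "threads": threads}

# '-smp 8' -> 1 socket x 8 cores x 1 thread, unlike hw/core/machine.c's
# smp_parse(), which would yield 8 sockets each with 1 core.
print(smp_parse_prefer_cores(cpus=8))
```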

Thanks,
drew

> 
> Thanks
> 
> 
> > 
> > Signed-off-by: Andrew Jones <drjones@redhat.com>
> > ---
> >  hw/arm/virt.c | 76 +++++++++++++++++++++++++++++++++++++++++++++++++++
> >  1 file changed, 76 insertions(+)
> > 
> > diff --git a/hw/arm/virt.c b/hw/arm/virt.c
> > index ea24b576c6..ba902b53ba 100644
> > --- a/hw/arm/virt.c
> > +++ b/hw/arm/virt.c
> > @@ -78,6 +78,8 @@
> >  #include "hw/virtio/virtio-iommu.h"
> >  #include "hw/char/pl011.h"
> >  #include "qemu/guest-random.h"
> > +#include "qapi/qmp/qerror.h"
> > +#include "sysemu/replay.h"
> > 
> >  #define DEFINE_VIRT_MACHINE_LATEST(major, minor, latest) \
> >      static void virt_##major##_##minor##_class_init(ObjectClass *oc, \
> > @@ -2444,6 +2446,79 @@ static int virt_kvm_type(MachineState *ms, const char
> > *type_str)
> >      return requested_pa_size > 40 ? requested_pa_size : 0;
> >  }
> > 
> > +/*
> > + * Unlike smp_parse() in hw/core/machine.c, we prefer cores over sockets,
> > + * e.g. '-smp 8' creates 1 socket with 8 cores.  Whereas '-smp 8' with
> > + * hw/core/machine.c's smp_parse() creates 8 sockets, each with 1 core.
> > + * Additionally, we can enforce the topology matches max_cpus check,
> > + * because we have no legacy to preserve.
> > + */
> > +static void virt_smp_parse(MachineState *ms, QemuOpts *opts)
> > +{
> > +    if (opts) {
> > +        unsigned cpus    = qemu_opt_get_number(opts, "cpus", 0);
> > +        unsigned sockets = qemu_opt_get_number(opts, "sockets", 0);
> > +        unsigned cores   = qemu_opt_get_number(opts, "cores", 0);
> > +        unsigned threads = qemu_opt_get_number(opts, "threads", 0);
> > +
> > +        /*
> > +         * Compute missing values; prefer cores over sockets and
> > +         * sockets over threads.
> > +         */
> > +        if (cpus == 0 || cores == 0) {
> > +            sockets = sockets > 0 ? sockets : 1;
> > +            threads = threads > 0 ? threads : 1;
> > +            if (cpus == 0) {
> > +                cores = cores > 0 ? cores : 1;
> > +                cpus = cores * threads * sockets;
> > +            } else {
> > +                ms->smp.max_cpus = qemu_opt_get_number(opts, "maxcpus", cpus);
> > +                cores = ms->smp.max_cpus / (sockets * threads);
> > +            }
> > +        } else if (sockets == 0) {
> > +            threads = threads > 0 ? threads : 1;
> > +            sockets = cpus / (cores * threads);
> > +            sockets = sockets > 0 ? sockets : 1;
> > +        } else if (threads == 0) {
> > +            threads = cpus / (cores * sockets);
> > +            threads = threads > 0 ? threads : 1;
> > +        } else if (sockets * cores * threads < cpus) {
> > +            error_report("cpu topology: "
> > +                         "sockets (%u) * cores (%u) * threads (%u) < "
> > +                         "smp_cpus (%u)",
> > +                         sockets, cores, threads, cpus);
> > +            exit(1);
> > +        }
> > +
> > +        ms->smp.max_cpus = qemu_opt_get_number(opts, "maxcpus", cpus);
> > +
> > +        if (ms->smp.max_cpus < cpus) {
> > +            error_report("maxcpus must be equal to or greater than smp");
> > +            exit(1);
> > +        }
> > +
> > +        if (sockets * cores * threads != ms->smp.max_cpus) {
> > +            error_report("cpu topology: "
> > +                         "sockets (%u) * cores (%u) * threads (%u) "
> > +                         "!= maxcpus (%u)",
> > +                         sockets, cores, threads,
> > +                         ms->smp.max_cpus);
> > +            exit(1);
> > +        }
> > +
> > +        ms->smp.cpus = cpus;
> > +        ms->smp.cores = cores;
> > +        ms->smp.threads = threads;
> > +        ms->smp.sockets = sockets;
> > +    }
> > +
> > +    if (ms->smp.cpus > 1) {
> > +        Error *blocker = NULL;
> > +        error_setg(&blocker, QERR_REPLAY_NOT_SUPPORTED, "smp");
> > +        replay_add_blocker(blocker);
> > +    }
> > +}
> > +
> >  static void virt_machine_class_init(ObjectClass *oc, void *data)
> >  {
> >      MachineClass *mc = MACHINE_CLASS(oc);
> > @@ -2469,6 +2544,7 @@ static void virt_machine_class_init(ObjectClass *oc, void
> > *data)
> >      mc->cpu_index_to_instance_props = virt_cpu_index_to_props;
> >      mc->default_cpu_type = ARM_CPU_TYPE_NAME("cortex-a15");
> >      mc->get_default_cpu_node_id = virt_get_default_cpu_node_id;
> > +    mc->smp_parse = virt_smp_parse;
> >      mc->kvm_type = virt_kvm_type;
> >      assert(!mc->get_hotplug_handler);
> >      mc->get_hotplug_handler = virt_machine_get_hotplug_handler;
> > --
> > 2.23.0
> 
> 




* Re: [RFC PATCH v3 13/13] hw/arm/virt-acpi-build: Enable cpu and cache topology
  2020-11-09  3:04 ` [RFC PATCH v3 13/13] hw/arm/virt-acpi-build: Enable cpu and cache topology Ying Fang
@ 2020-11-09 17:36   ` Jonathan Cameron
  0 siblings, 0 replies; 25+ messages in thread
From: Jonathan Cameron @ 2020-11-09 17:36 UTC (permalink / raw)
  To: Ying Fang
  Cc: peter.maydell, drjones, zhang.zhanghailiang, qemu-devel,
	shannon.zhaosl, qemu-arm, alistair.francis, imammedo,
	salil.mehta

On Mon, 9 Nov 2020 11:04:52 +0800
Ying Fang <fangying1@huawei.com> wrote:

> A helper struct AcpiCacheOffset is introduced to describe the offsets
> of the three cache levels. The cache hierarchy is built according to
> ACPI spec v6.3 5.2.29.2. Let's enable CPU cache topology now.
> 
> Signed-off-by: Ying Fang <fangying1@huawei.com>
> ---
>  hw/acpi/aml-build.c         | 19 +++++++++-----
>  hw/arm/virt-acpi-build.c    | 52 ++++++++++++++++++++++++++++++++-----
>  include/hw/acpi/acpi-defs.h |  6 +++++
>  include/hw/acpi/aml-build.h |  7 ++---
>  4 files changed, 68 insertions(+), 16 deletions(-)
> 
> diff --git a/hw/acpi/aml-build.c b/hw/acpi/aml-build.c
> index 1a38110149..93a81fbaf5 100644
> --- a/hw/acpi/aml-build.c
> +++ b/hw/acpi/aml-build.c
> @@ -1799,27 +1799,32 @@ void build_cache_hierarchy(GArray *tbl,
>  /*
>   * ACPI 6.3: 5.2.29.1 Processor hierarchy node structure (Type 0)
>   */
> -void build_socket_hierarchy(GArray *tbl, uint32_t parent, uint32_t id)
> +void build_socket_hierarchy(GArray *tbl, uint32_t parent,
> +                            uint32_t offset, uint32_t id)
>  {
>      build_append_byte(tbl, 0);          /* Type 0 - processor */
> -    build_append_byte(tbl, 20);         /* Length, no private resources */
> +    build_append_byte(tbl, 24);         /* Length, with private resources */
>      build_append_int_noprefix(tbl, 0, 2);  /* Reserved */
>      build_append_int_noprefix(tbl, 1, 4);  /* Flags: Physical package */
>      build_append_int_noprefix(tbl, parent, 4);  /* Parent */
>      build_append_int_noprefix(tbl, id, 4);     /* ACPI processor ID */
> -    build_append_int_noprefix(tbl, 0, 4);  /* Number of private resources */
> +    build_append_int_noprefix(tbl, 1, 4);  /* Number of private resources */
> +    build_append_int_noprefix(tbl, offset, 4);  /* Private resources */
>  }
>  
> -void build_processor_hierarchy(GArray *tbl, uint32_t flags,
> -                               uint32_t parent, uint32_t id)
> +void build_processor_hierarchy(GArray *tbl, uint32_t flags, uint32_t parent,
> +                               AcpiCacheOffset offset, uint32_t id)
>  {
>      build_append_byte(tbl, 0);          /* Type 0 - processor */
> -    build_append_byte(tbl, 20);         /* Length, no private resources */
> +    build_append_byte(tbl, 32);         /* Length, with private resources */
>      build_append_int_noprefix(tbl, 0, 2);      /* Reserved */
>      build_append_int_noprefix(tbl, flags, 4);  /* Flags */
>      build_append_int_noprefix(tbl, parent, 4); /* Parent */
>      build_append_int_noprefix(tbl, id, 4);     /* ACPI processor ID */
> -    build_append_int_noprefix(tbl, 0, 4);  /* Number of private resources */
> +    build_append_int_noprefix(tbl, 3, 4);  /* Number of private resources */
> +    build_append_int_noprefix(tbl, offset.l1d_offset, 4);/* Private resources */
> +    build_append_int_noprefix(tbl, offset.l1i_offset, 4);/* Private resources */
> +    build_append_int_noprefix(tbl, offset.l2_offset, 4); /* Private resources */

So this is an interesting corner of the ACPI spec.  Do we need to include
the level 2 cache in the private resources or not?

My reading is that it is optional and we have at least one obscure bug in Linux
due to the ambiguity in the spec.  You won't hit that here for reasons found
in that thread though (cache types are different).

https://lore.kernel.org/linux-acpi/20201105174233.1146-2-shiju.jose@huawei.com/

So I suggest we make the only access to the l2 cache be via the l1 nodes
as per the example in ACPI 6.3 Figure 5.13 "Cache Type Structure - Type 1 Example"

i.e. Drop the l2 entry here and reduced the Length to 28.
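[Editorial note: to make the length bookkeeping concrete, here is a hypothetical byte-level sketch of a PPTT Type-0 processor node (ACPI 6.3, 5.2.29.1). The helper name and the example offsets are invented; with only the two L1 references suggested above, the node length works out to 20 + 2 * 4 = 28.]

```python
import struct

def processor_node(flags, parent, acpi_id, private_resources):
    """Sketch of an ACPI 6.3 Type-0 processor node; not QEMU's aml-build code."""
    n = len(private_resources)
    length = 20 + 4 * n  # 20-byte fixed part + one 4-byte offset per resource
    node = struct.pack("<BBHIIII",
                       0,        # Type 0 - processor
                       length,   # Length
                       0,        # Reserved
                       flags, parent, acpi_id,
                       n)        # Number of private resources
    for off in private_resources:
        node += struct.pack("<I", off)  # Private resource (cache node) offsets
    return node

# Referencing only L1D and L1I; L2 is reached via the L1 next-level pointers.
node = processor_node(flags=2, parent=0x30, acpi_id=0,
                      private_resources=[0x40, 0x58])
print(node[1], len(node))  # length byte and actual size are both 28
```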

Thanks,

Jonathan

>  }
>  
>  void build_smt_hierarchy(GArray *tbl, uint32_t parent, uint32_t id)
> diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
> index 5784370257..ad49006b42 100644
> --- a/hw/arm/virt-acpi-build.c
> +++ b/hw/arm/virt-acpi-build.c
> @@ -429,29 +429,69 @@ build_srat(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
>                   "SRAT", table_data->len - srat_start, 3, NULL, NULL);
>  }
>  
> -static void build_pptt(GArray *table_data, BIOSLinker *linker, MachineState *ms)
> +static inline void arm_acpi_cache_info(CPUCacheInfo *cpu_cache,
> +                                       AcpiCacheInfo *acpi_cache)
>  {
> +    acpi_cache->size = cpu_cache->size;
> +    acpi_cache->sets = cpu_cache->sets;
> +    acpi_cache->associativity = cpu_cache->associativity;
> +    acpi_cache->attributes = cpu_cache->attributes;
> +    acpi_cache->line_size = cpu_cache->line_size;
> +}
> +
> +static void build_pptt(GArray *table_data, BIOSLinker *linker,
> +                       VirtMachineState *vms)
> +{
> +    MachineState *ms = MACHINE(vms);
>      int pptt_start = table_data->len;
>      int uid = 0, cpus = 0, socket;
>      unsigned int smp_cores = ms->smp.cores;
>      unsigned int smp_threads = ms->smp.threads;
> +    AcpiCacheOffset offset;
> +    ARMCPU *cpu = ARM_CPU(qemu_get_cpu(cpus));
> +    AcpiCacheInfo cache_info;
>  
>      acpi_data_push(table_data, sizeof(AcpiTableHeader));
>  
>      for (socket = 0; cpus < ms->possible_cpus->len; socket++) {
> -        uint32_t socket_offset = table_data->len - pptt_start;
> +        uint32_t l3_offset = table_data->len - pptt_start;
> +        uint32_t socket_offset;
>          int core;
>  
> -        build_socket_hierarchy(table_data, 0, socket);
> +        /* L3 cache type structure */
> +        arm_acpi_cache_info(cpu->caches.l3_cache, &cache_info);
> +        build_cache_hierarchy(table_data, 0, &cache_info);
> +
> +        socket_offset = table_data->len - pptt_start;
> +        build_socket_hierarchy(table_data, 0, l3_offset, socket);
>  
>          for (core = 0; core < smp_cores; core++) {
>              uint32_t core_offset = table_data->len - pptt_start;
>              int thread;
>  
> > +            /* L2 cache type structure */
> +            offset.l2_offset = table_data->len - pptt_start;
> +            arm_acpi_cache_info(cpu->caches.l2_cache, &cache_info);
> +            build_cache_hierarchy(table_data, 0, &cache_info);
> +
> +            /* L1d cache type structure */
> +            offset.l1d_offset = table_data->len - pptt_start;
> +            arm_acpi_cache_info(cpu->caches.l1d_cache, &cache_info);
> +            build_cache_hierarchy(table_data, offset.l2_offset, &cache_info);
> +
> +            /* L1i cache type structure */
> +            offset.l1i_offset = table_data->len - pptt_start;
> +            arm_acpi_cache_info(cpu->caches.l1i_cache, &cache_info);
> +            build_cache_hierarchy(table_data, offset.l2_offset, &cache_info);
> +
> +            core_offset = table_data->len - pptt_start;
>              if (smp_threads <= 1) {
> -                build_processor_hierarchy(table_data, 2, socket_offset, uid++);
> +                build_processor_hierarchy(table_data, 2, socket_offset,
> +                                          offset, uid++);
>               } else {
> -                build_processor_hierarchy(table_data, 0, socket_offset, core);
> +
> +                build_processor_hierarchy(table_data, 0, socket_offset,
> +                                          offset, core);
>                  for (thread = 0; thread < smp_threads; thread++) {
>                      build_smt_hierarchy(table_data, core_offset, uid++);
>                  }
> @@ -727,7 +767,7 @@ void virt_acpi_build(VirtMachineState *vms, AcpiBuildTables *tables)
>  
>      if (cpu_topology_enabled) {
>          acpi_add_table(table_offsets, tables_blob);
> -        build_pptt(tables_blob, tables->linker, ms);
> +        build_pptt(tables_blob, tables->linker, vms);
>      }
>  
>      acpi_add_table(table_offsets, tables_blob);
> diff --git a/include/hw/acpi/acpi-defs.h b/include/hw/acpi/acpi-defs.h
> index 3df38ab449..e48b7fa506 100644
> --- a/include/hw/acpi/acpi-defs.h
> +++ b/include/hw/acpi/acpi-defs.h
> @@ -626,4 +626,10 @@ typedef struct AcpiCacheInfo {
>      uint16_t line_size;
>  } AcpiCacheInfo;
>  
> +typedef struct AcpiCacheOffset {
> +    uint32_t l1d_offset;
> +    uint32_t l1i_offset;
> +    uint32_t l2_offset;
> +} AcpiCacheOffset;
> +
>  #endif
> diff --git a/include/hw/acpi/aml-build.h b/include/hw/acpi/aml-build.h
> index 01078753a8..a15ccb2c91 100644
> --- a/include/hw/acpi/aml-build.h
> +++ b/include/hw/acpi/aml-build.h
> @@ -440,10 +440,11 @@ void build_slit(GArray *table_data, BIOSLinker *linker, MachineState *ms);
>  void build_cache_hierarchy(GArray *tbl,
>                uint32_t next_level, AcpiCacheInfo *cache_info);
>  
> -void build_socket_hierarchy(GArray *tbl, uint32_t parent, uint32_t id);
> +void build_socket_hierarchy(GArray *tbl, uint32_t parent,
> +                            uint32_t offset, uint32_t id);
>  
> -void build_processor_hierarchy(GArray *tbl, uint32_t flags,
> -                               uint32_t parent, uint32_t id);
> +void build_processor_hierarchy(GArray *tbl, uint32_t flags, uint32_t parent,
> +                               AcpiCacheOffset offset, uint32_t id);
>  
>  void build_smt_hierarchy(GArray *tbl, uint32_t parent, uint32_t id);
>  




* Re: [RFC PATCH v3 10/13] target/arm/cpu: Add cpu cache description for arm
  2020-11-09  3:04 ` [RFC PATCH v3 10/13] target/arm/cpu: Add cpu cache description for arm Ying Fang
@ 2020-11-09 17:46   ` Jonathan Cameron
  2020-11-30 13:00   ` Peter Maydell
  1 sibling, 0 replies; 25+ messages in thread
From: Jonathan Cameron @ 2020-11-09 17:46 UTC (permalink / raw)
  To: Ying Fang
  Cc: peter.maydell, drjones, zhang.zhanghailiang, qemu-devel,
	shannon.zhaosl, qemu-arm, alistair.francis, imammedo,
	salil.mehta

On Mon, 9 Nov 2020 11:04:49 +0800
Ying Fang <fangying1@huawei.com> wrote:

> Add the CPUCacheInfo structure to hold cpu cache information for ARM cpus.
> A classic three-level cache topology is used here. Default cache
> capacities are given and userspace can override these values.
> 
> Signed-off-by: Ying Fang <fangying1@huawei.com>

I may be missing it in another patch, but to add an L3 cache you need to
also supply a few CSR values.

CLIDR needs to reflect that there is an L3 present and you need a CCSIDR[3]
entry to describe it. 0xB200123 should work for CLIDR. I'm too lazy to figure
out a CCSIDR[3] value :)

Without those, Linux isn't going to pick up the PPTT entries etc when
building cacheinfo.  It only uses PPTT to update or supplement the
information gained from those CSRs.
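[Editorial note: as a sanity check on the suggested CLIDR value, here is a small hypothetical decoder for the Ctype fields (3 bits per cache level: 0b011 = separate I+D, 0b100 = unified) plus the Level of Coherency field. Function name and output format are invented for illustration.]

```python
def decode_clidr(clidr):
    """Hypothetical decode of CLIDR_EL1 Ctype1..Ctype7 and LoC; not QEMU code."""
    names = {0b001: "I-only", 0b010: "D-only",
             0b011: "separate I+D", 0b100: "unified"}
    levels = {}
    for level in range(1, 8):
        ctype = (clidr >> (3 * (level - 1))) & 0x7
        if ctype == 0:  # no cache at this level; higher levels must also be 0
            break
        levels[level] = names.get(ctype, "reserved")
    loc = (clidr >> 24) & 0x7  # Level of Coherency
    return levels, loc

levels, loc = decode_clidr(0xB200123)
print(levels, "LoC =", loc)
# {1: 'separate I+D', 2: 'unified', 3: 'unified'} LoC = 3
```

So 0xB200123 does advertise a unified L3 with coherency maintained to level 3, matching the suggestion above.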

Jonathan


> ---
>  target/arm/cpu.c | 42 ++++++++++++++++++++++++++++++++++++++++++
>  target/arm/cpu.h | 27 +++++++++++++++++++++++++++
>  2 files changed, 69 insertions(+)
> 
> diff --git a/target/arm/cpu.c b/target/arm/cpu.c
> index 056319859f..f1bac7452c 100644
> --- a/target/arm/cpu.c
> +++ b/target/arm/cpu.c
> @@ -27,6 +27,7 @@
>  #include "qapi/visitor.h"
>  #include "cpu.h"
>  #include "internals.h"
> +#include "qemu/units.h"
>  #include "exec/exec-all.h"
>  #include "hw/qdev-properties.h"
>  #if !defined(CONFIG_USER_ONLY)
> @@ -997,6 +998,45 @@ uint64_t arm_cpu_mp_affinity(int idx, uint8_t clustersz)
>      return (Aff1 << ARM_AFF1_SHIFT) | Aff0;
>  }
>  
> +static CPUCaches default_cache_info = {
> +    .l1d_cache = &(CPUCacheInfo) {
> > +        .type = DATA_CACHE,
> +        .level = 1,
> +        .size = 64 * KiB,
> +        .line_size = 64,
> +        .associativity = 4,
> +        .sets = 256,
> +        .attributes = 0x02,
> +    },
> +    .l1i_cache = &(CPUCacheInfo) {
> +        .type = INSTRUCTION_CACHE,
> +        .level = 1,
> +        .size = 64 * KiB,
> +        .line_size = 64,
> +        .associativity = 4,
> +        .sets = 256,
> +        .attributes = 0x04,
> +    },
> +    .l2_cache = &(CPUCacheInfo) {
> +        .type = UNIFIED_CACHE,
> +        .level = 2,
> +        .size = 512 * KiB,
> +        .line_size = 64,
> +        .associativity = 8,
> +        .sets = 1024,
> +        .attributes = 0x0a,
> +    },
> +    .l3_cache = &(CPUCacheInfo) {
> +        .type = UNIFIED_CACHE,
> +        .level = 3,
> +        .size = 65536 * KiB,
> +        .line_size = 64,
> +        .associativity = 15,
> +        .sets = 2048,
> +        .attributes = 0x0a,
> +    },
> +};
> +
>  static void cpreg_hashtable_data_destroy(gpointer data)
>  {
>      /*
> @@ -1841,6 +1881,8 @@ static void arm_cpu_realizefn(DeviceState *dev, Error **errp)
>          }
>      }
>  
> +    cpu->caches = default_cache_info;
> +
>      qemu_init_vcpu(cs);
>      cpu_reset(cs);
>  
> diff --git a/target/arm/cpu.h b/target/arm/cpu.h
> index cfff1b5c8f..dbc33a9802 100644
> --- a/target/arm/cpu.h
> +++ b/target/arm/cpu.h
> @@ -746,6 +746,30 @@ typedef enum ARMPSCIState {
>  
>  typedef struct ARMISARegisters ARMISARegisters;
>  
> +/* Cache information type */
> +enum CacheType {
> +    DATA_CACHE,
> +    INSTRUCTION_CACHE,
> +    UNIFIED_CACHE
> +};
> +
> +typedef struct CPUCacheInfo {
> > +    enum CacheType type;      /* Cache type */
> +    uint8_t level;
> +    uint32_t size;            /* Size in bytes */
> +    uint16_t line_size;       /* Line size in bytes */
> +    uint8_t associativity;    /* Cache associativity */
> +    uint32_t sets;            /* Number of sets */
> > +    uint8_t attributes;       /* Cache attributes */
> +} CPUCacheInfo;
> +
> +typedef struct CPUCaches {
> +        CPUCacheInfo *l1d_cache;
> +        CPUCacheInfo *l1i_cache;
> +        CPUCacheInfo *l2_cache;
> +        CPUCacheInfo *l3_cache;
> +} CPUCaches;
> +
>  /**
>   * ARMCPU:
>   * @env: #CPUARMState
> @@ -987,6 +1011,9 @@ struct ARMCPU {
>  
>      /* Generic timer counter frequency, in Hz */
>      uint64_t gt_cntfrq_hz;
> +
> +    /* CPU cache information */
> +    CPUCaches caches;
>  };
>  
>  unsigned int gt_cntfrq_period_ns(ARMCPU *cpu);
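[Editorial note: one quick consistency check on the quoted defaults. For a set-associative cache, size should equal sets * associativity * line_size. The L1 and L2 entries satisfy this, but the L3 entry (65536 KiB vs 2048 sets * 15 ways * 64 B) does not, which may be worth a second look. The values below are copied from the patch; the script itself is hypothetical.]

```python
KiB = 1024
# (size, line_size, associativity, sets) copied from default_cache_info
caches = {
    "l1d": (64 * KiB, 64, 4, 256),
    "l1i": (64 * KiB, 64, 4, 256),
    "l2":  (512 * KiB, 64, 8, 1024),
    "l3":  (65536 * KiB, 64, 15, 2048),
}
for name, (size, line, ways, sets) in caches.items():
    derived = sets * ways * line
    print(f"{name}: sets*ways*line_size = {derived}, size = {size}, "
          f"consistent = {derived == size}")
```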




* Re: [RFC PATCH v3 01/13] hw/arm/virt: Spell out smp.cpus and smp.max_cpus
  2020-11-09 10:45   ` Salil Mehta
@ 2020-11-17 10:27     ` Ying Fang
  2020-11-20 12:43       ` Andrew Jones
  0 siblings, 1 reply; 25+ messages in thread
From: Ying Fang @ 2020-11-17 10:27 UTC (permalink / raw)
  To: Salil Mehta, peter.maydell
  Cc: drjones, Zhanghailiang, qemu-devel, shannon.zhaosl, qemu-arm,
	alistair.francis, imammedo



On 11/9/2020 6:45 PM, Salil Mehta wrote:
> Hi Fangying,
> A trivial thing: this patch looks a bit like noise in this patch-set. Better
> to send it as a separate patch-set and get it accepted.
> 
Hmm, this patch looks like a code refactor for the somewhat confusing
*smp_cpus* variable, which will tidy up the code. Maybe Andrew could do that.

> Thanks
> 
>> From: fangying
>> Sent: Monday, November 9, 2020 3:05 AM
>> To: peter.maydell@linaro.org
>> Cc: qemu-devel@nongnu.org; qemu-arm@nongnu.org; drjones@redhat.com;
>> imammedo@redhat.com; shannon.zhaosl@gmail.com; alistair.francis@wdc.com;
>> Zhanghailiang <zhang.zhanghailiang@huawei.com>; Salil Mehta
>> <salil.mehta@huawei.com>
>> Subject: [RFC PATCH v3 01/13] hw/arm/virt: Spell out smp.cpus and smp.max_cpus
>>
>> From: Andrew Jones <drjones@redhat.com>
>>
>> Prefer to spell out the smp.cpus and smp.max_cpus machine state
>> variables in order to make grepping easier and to avoid any
>> confusion as to what cpu count is being used where.
>>
>> Signed-off-by: Andrew Jones <drjones@redhat.com>
>> ---
>>   hw/arm/virt-acpi-build.c |  8 +++----
>>   hw/arm/virt.c            | 51 +++++++++++++++++++---------------------
>>   include/hw/arm/virt.h    |  2 +-
>>   3 files changed, 29 insertions(+), 32 deletions(-)
>>
>> diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
>> index 9747a6458f..a222981737 100644
>> --- a/hw/arm/virt-acpi-build.c
>> +++ b/hw/arm/virt-acpi-build.c
>> @@ -57,11 +57,11 @@
>>
>>   #define ARM_SPI_BASE 32
>>
>> -static void acpi_dsdt_add_cpus(Aml *scope, int smp_cpus)
>> +static void acpi_dsdt_add_cpus(Aml *scope, int cpus)
>>   {
>>       uint16_t i;
>>
>> -    for (i = 0; i < smp_cpus; i++) {
>> +    for (i = 0; i < cpus; i++) {
>>           Aml *dev = aml_device("C%.03X", i);
>>           aml_append(dev, aml_name_decl("_HID", aml_string("ACPI0007")));
>>           aml_append(dev, aml_name_decl("_UID", aml_int(i)));
>> @@ -480,7 +480,7 @@ build_madt(GArray *table_data, BIOSLinker *linker,
>> VirtMachineState *vms)
>>       gicd->base_address = cpu_to_le64(memmap[VIRT_GIC_DIST].base);
>>       gicd->version = vms->gic_version;
>>
>> -    for (i = 0; i < vms->smp_cpus; i++) {
>> +    for (i = 0; i < MACHINE(vms)->smp.cpus; i++) {
>>           AcpiMadtGenericCpuInterface *gicc = acpi_data_push(table_data,
>>                                                              sizeof(*gicc));
>>           ARMCPU *armcpu = ARM_CPU(qemu_get_cpu(i));
>> @@ -599,7 +599,7 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
>> VirtMachineState *vms)
>>        * the RTC ACPI device at all when using UEFI.
>>        */
>>       scope = aml_scope("\\_SB");
>> -    acpi_dsdt_add_cpus(scope, vms->smp_cpus);
>> +    acpi_dsdt_add_cpus(scope, ms->smp.cpus);
>>       acpi_dsdt_add_uart(scope, &memmap[VIRT_UART],
>>                          (irqmap[VIRT_UART] + ARM_SPI_BASE));
>>       if (vmc->acpi_expose_flash) {
>> diff --git a/hw/arm/virt.c b/hw/arm/virt.c
>> index e465a988d6..0069fa1298 100644
>> --- a/hw/arm/virt.c
>> +++ b/hw/arm/virt.c
>> @@ -322,7 +322,7 @@ static void fdt_add_timer_nodes(const VirtMachineState *vms)
>>       if (vms->gic_version == VIRT_GIC_VERSION_2) {
>>           irqflags = deposit32(irqflags, GIC_FDT_IRQ_PPI_CPU_START,
>>                                GIC_FDT_IRQ_PPI_CPU_WIDTH,
>> -                             (1 << vms->smp_cpus) - 1);
>> +                             (1 << MACHINE(vms)->smp.cpus) - 1);
>>       }
>>
>>       qemu_fdt_add_subnode(vms->fdt, "/timer");
>> @@ -363,7 +363,7 @@ static void fdt_add_cpu_nodes(const VirtMachineState *vms)
>>        *  The simplest way to go is to examine affinity IDs of all our CPUs. If
>>        *  at least one of them has Aff3 populated, we set #address-cells to 2.
>>        */
>> -    for (cpu = 0; cpu < vms->smp_cpus; cpu++) {
>> +    for (cpu = 0; cpu < ms->smp.cpus; cpu++) {
>>           ARMCPU *armcpu = ARM_CPU(qemu_get_cpu(cpu));
>>
>>           if (armcpu->mp_affinity & ARM_AFF3_MASK) {
>> @@ -376,7 +376,7 @@ static void fdt_add_cpu_nodes(const VirtMachineState *vms)
>>       qemu_fdt_setprop_cell(vms->fdt, "/cpus", "#address-cells", addr_cells);
>>       qemu_fdt_setprop_cell(vms->fdt, "/cpus", "#size-cells", 0x0);
>>
>> -    for (cpu = vms->smp_cpus - 1; cpu >= 0; cpu--) {
>> +    for (cpu = ms->smp.cpus - 1; cpu >= 0; cpu--) {
>>           char *nodename = g_strdup_printf("/cpus/cpu@%d", cpu);
>>           ARMCPU *armcpu = ARM_CPU(qemu_get_cpu(cpu));
>>           CPUState *cs = CPU(armcpu);
>> @@ -387,7 +387,7 @@ static void fdt_add_cpu_nodes(const VirtMachineState *vms)
>>                                       armcpu->dtb_compatible);
>>
>>           if (vms->psci_conduit != QEMU_PSCI_CONDUIT_DISABLED
>> -            && vms->smp_cpus > 1) {
>> +            && ms->smp.cpus > 1) {
>>               qemu_fdt_setprop_string(vms->fdt, nodename,
>>                                           "enable-method", "psci");
>>           }
>> @@ -533,7 +533,7 @@ static void fdt_add_pmu_nodes(const VirtMachineState *vms)
>>       if (vms->gic_version == VIRT_GIC_VERSION_2) {
>>           irqflags = deposit32(irqflags, GIC_FDT_IRQ_PPI_CPU_START,
>>                                GIC_FDT_IRQ_PPI_CPU_WIDTH,
>> -                             (1 << vms->smp_cpus) - 1);
>> +                             (1 << MACHINE(vms)->smp.cpus) - 1);
>>       }
>>
>>       qemu_fdt_add_subnode(vms->fdt, "/pmu");
>> @@ -622,14 +622,13 @@ static void create_gic(VirtMachineState *vms)
>>       SysBusDevice *gicbusdev;
>>       const char *gictype;
>>       int type = vms->gic_version, i;
>> -    unsigned int smp_cpus = ms->smp.cpus;
>>       uint32_t nb_redist_regions = 0;
>>
>>       gictype = (type == 3) ? gicv3_class_name() : gic_class_name();
>>
>>       vms->gic = qdev_new(gictype);
>>       qdev_prop_set_uint32(vms->gic, "revision", type);
>> -    qdev_prop_set_uint32(vms->gic, "num-cpu", smp_cpus);
>> +    qdev_prop_set_uint32(vms->gic, "num-cpu", ms->smp.cpus);
>>       /* Note that the num-irq property counts both internal and external
>>        * interrupts; there are always 32 of the former (mandated by GIC spec).
>>        */
>> @@ -641,7 +640,7 @@ static void create_gic(VirtMachineState *vms)
>>       if (type == 3) {
>>           uint32_t redist0_capacity =
>>                       vms->memmap[VIRT_GIC_REDIST].size / GICV3_REDIST_SIZE;
>> -        uint32_t redist0_count = MIN(smp_cpus, redist0_capacity);
>> +        uint32_t redist0_count = MIN(ms->smp.cpus, redist0_capacity);
>>
>>           nb_redist_regions = virt_gicv3_redist_region_count(vms);
>>
>> @@ -654,7 +653,7 @@ static void create_gic(VirtMachineState *vms)
>>                       vms->memmap[VIRT_HIGH_GIC_REDIST2].size /
>> GICV3_REDIST_SIZE;
>>
>>               qdev_prop_set_uint32(vms->gic, "redist-region-count[1]",
>> -                MIN(smp_cpus - redist0_count, redist1_capacity));
>> +                MIN(ms->smp.cpus - redist0_count, redist1_capacity));
>>           }
>>       } else {
>>           if (!kvm_irqchip_in_kernel()) {
>> @@ -683,7 +682,7 @@ static void create_gic(VirtMachineState *vms)
>>        * maintenance interrupt signal to the appropriate GIC PPI inputs,
>>        * and the GIC's IRQ/FIQ/VIRQ/VFIQ interrupt outputs to the CPU's inputs.
>>        */
>> -    for (i = 0; i < smp_cpus; i++) {
>> +    for (i = 0; i < ms->smp.cpus; i++) {
>>           DeviceState *cpudev = DEVICE(qemu_get_cpu(i));
>>           int ppibase = NUM_IRQS + i * GIC_INTERNAL + GIC_NR_SGIS;
>>           int irq;
>> @@ -711,7 +710,7 @@ static void create_gic(VirtMachineState *vms)
>>           } else if (vms->virt) {
>>               qemu_irq irq = qdev_get_gpio_in(vms->gic,
>>                                               ppibase + ARCH_GIC_MAINT_IRQ);
>> -            sysbus_connect_irq(gicbusdev, i + 4 * smp_cpus, irq);
>> +            sysbus_connect_irq(gicbusdev, i + 4 * ms->smp.cpus, irq);
>>           }
>>
>>           qdev_connect_gpio_out_named(cpudev, "pmu-interrupt", 0,
>> @@ -719,11 +718,11 @@ static void create_gic(VirtMachineState *vms)
>>                                                        + VIRTUAL_PMU_IRQ));
>>
>>           sysbus_connect_irq(gicbusdev, i, qdev_get_gpio_in(cpudev,
>> ARM_CPU_IRQ));
>> -        sysbus_connect_irq(gicbusdev, i + smp_cpus,
>> +        sysbus_connect_irq(gicbusdev, i + ms->smp.cpus,
>>                              qdev_get_gpio_in(cpudev, ARM_CPU_FIQ));
>> -        sysbus_connect_irq(gicbusdev, i + 2 * smp_cpus,
>> +        sysbus_connect_irq(gicbusdev, i + 2 * ms->smp.cpus,
>>                              qdev_get_gpio_in(cpudev, ARM_CPU_VIRQ));
>> -        sysbus_connect_irq(gicbusdev, i + 3 * smp_cpus,
>> +        sysbus_connect_irq(gicbusdev, i + 3 * ms->smp.cpus,
>>                              qdev_get_gpio_in(cpudev, ARM_CPU_VFIQ));
>>       }
>>
>> @@ -1572,7 +1571,7 @@ static void virt_set_memmap(VirtMachineState *vms)
>>    */
>>   static void finalize_gic_version(VirtMachineState *vms)
>>   {
>> -    unsigned int max_cpus = MACHINE(vms)->smp.max_cpus;
>> +    MachineState *ms = MACHINE(vms);
>>
>>       if (kvm_enabled()) {
>>           int probe_bitmap;
>> @@ -1613,7 +1612,8 @@ static void finalize_gic_version(VirtMachineState *vms)
>>               }
>>               return;
>>           case VIRT_GIC_VERSION_NOSEL:
>> -            if ((probe_bitmap & KVM_ARM_VGIC_V2) && max_cpus <= GIC_NCPU) {
>> +            if ((probe_bitmap & KVM_ARM_VGIC_V2) &&
>> +                ms->smp.max_cpus <= GIC_NCPU) {
>>                   vms->gic_version = VIRT_GIC_VERSION_2;
>>               } else if (probe_bitmap & KVM_ARM_VGIC_V3) {
>>                   /*
>> @@ -1622,7 +1622,7 @@ static void finalize_gic_version(VirtMachineState *vms)
>>                    * to v3. In any case defaulting to v2 would be broken.
>>                    */
>>                   vms->gic_version = VIRT_GIC_VERSION_3;
>> -            } else if (max_cpus > GIC_NCPU) {
>> +            } else if (ms->smp.max_cpus > GIC_NCPU) {
>>                   error_report("host only supports in-kernel GICv2 emulation "
>>                                "but more than 8 vcpus are requested");
>>                   exit(1);
>> @@ -1743,8 +1743,6 @@ static void machvirt_init(MachineState *machine)
>>       bool firmware_loaded;
>>       bool aarch64 = true;
>>       bool has_ged = !vmc->no_ged;
>> -    unsigned int smp_cpus = machine->smp.cpus;
>> -    unsigned int max_cpus = machine->smp.max_cpus;
>>
>>       /*
>>        * In accelerated mode, the memory map is computed earlier in kvm_type()
>> @@ -1815,10 +1813,10 @@ static void machvirt_init(MachineState *machine)
>>           virt_max_cpus = GIC_NCPU;
>>       }
>>
>> -    if (max_cpus > virt_max_cpus) {
>> +    if (machine->smp.max_cpus > virt_max_cpus) {
>>           error_report("Number of SMP CPUs requested (%d) exceeds max CPUs "
>>                        "supported by machine 'mach-virt' (%d)",
>> -                     max_cpus, virt_max_cpus);
>> +                     machine->smp.max_cpus, virt_max_cpus);
>>           exit(1);
>>       }
>>
>> @@ -1843,7 +1841,7 @@ static void machvirt_init(MachineState *machine)
>>           Object *cpuobj;
>>           CPUState *cs;
>>
>> -        if (n >= smp_cpus) {
>> +        if (n >= machine->smp.cpus) {
>>               break;
>>           }
>>
>> @@ -2015,7 +2013,7 @@ static void machvirt_init(MachineState *machine)
>>       }
>>
>>       vms->bootinfo.ram_size = machine->ram_size;
>> -    vms->bootinfo.nb_cpus = smp_cpus;
>> +    vms->bootinfo.nb_cpus = machine->smp.cpus;
>>       vms->bootinfo.board_id = -1;
>>       vms->bootinfo.loader_start = vms->memmap[VIRT_MEM].base;
>>       vms->bootinfo.get_dtb = machvirt_dtb;
>> @@ -2208,17 +2206,16 @@ static int64_t virt_get_default_cpu_node_id(const MachineState *ms, int idx)
>>   static const CPUArchIdList *virt_possible_cpu_arch_ids(MachineState *ms)
>>   {
>>       int n;
>> -    unsigned int max_cpus = ms->smp.max_cpus;
>>       VirtMachineState *vms = VIRT_MACHINE(ms);
>>
>>       if (ms->possible_cpus) {
>> -        assert(ms->possible_cpus->len == max_cpus);
>> +        assert(ms->possible_cpus->len == ms->smp.max_cpus);
>>           return ms->possible_cpus;
>>       }
>>
>>       ms->possible_cpus = g_malloc0(sizeof(CPUArchIdList) +
>> -                                  sizeof(CPUArchId) * max_cpus);
>> -    ms->possible_cpus->len = max_cpus;
>> +                                  sizeof(CPUArchId) * ms->smp.max_cpus);
>> +    ms->possible_cpus->len = ms->smp.max_cpus;
>>       for (n = 0; n < ms->possible_cpus->len; n++) {
>>           ms->possible_cpus->cpus[n].type = ms->cpu_type;
>>           ms->possible_cpus->cpus[n].arch_id =
>> diff --git a/include/hw/arm/virt.h b/include/hw/arm/virt.h
>> index aad6d69841..953d94acc0 100644
>> --- a/include/hw/arm/virt.h
>> +++ b/include/hw/arm/virt.h
>> @@ -181,7 +181,7 @@ static inline int virt_gicv3_redist_region_count(VirtMachineState *vms)
>>
>>       assert(vms->gic_version == VIRT_GIC_VERSION_3);
>>
>> -    return vms->smp_cpus > redist0_capacity ? 2 : 1;
>> +    return MACHINE(vms)->smp.cpus > redist0_capacity ? 2 : 1;
>>   }
>>
>>   #endif /* QEMU_ARM_VIRT_H */
>> --
>> 2.23.0
> 
> 


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [RFC PATCH v3 01/13] hw/arm/virt: Spell out smp.cpus and smp.max_cpus
  2020-11-17 10:27     ` Ying Fang
@ 2020-11-20 12:43       ` Andrew Jones
  2020-12-15 18:02         ` Andrew Jones
  0 siblings, 1 reply; 25+ messages in thread
From: Andrew Jones @ 2020-11-20 12:43 UTC (permalink / raw)
  To: Ying Fang
  Cc: peter.maydell, Salil Mehta, Zhanghailiang, qemu-devel,
	shannon.zhaosl, qemu-arm, alistair.francis, imammedo

On Tue, Nov 17, 2020 at 06:27:54PM +0800, Ying Fang wrote:
> 
> 
> On 11/9/2020 6:45 PM, Salil Mehta wrote:
> > Hi Fangying,
> > A trivial thing. This patch looks like a bit of noise in this patch-set.
> > Better to send it as a separate patch-set and get it accepted.
> > 
> Hmm, this patch looks like a code refactor for the somewhat confusing
> *smp_cpus*, which will tidy the code. Maybe Andrew could do that.
>

Sure. I can post the first two patches of this series after the 5.2
release.

Thanks,
drew




* Re: [RFC PATCH v3 10/13] target/arm/cpu: Add cpu cache description for arm
  2020-11-09  3:04 ` [RFC PATCH v3 10/13] target/arm/cpu: Add cpu cache description for arm Ying Fang
  2020-11-09 17:46   ` Jonathan Cameron
@ 2020-11-30 13:00   ` Peter Maydell
  2021-01-12 13:25     ` Ying Fang
  1 sibling, 1 reply; 25+ messages in thread
From: Peter Maydell @ 2020-11-30 13:00 UTC (permalink / raw)
  To: Ying Fang
  Cc: Andrew Jones, zhanghailiang, QEMU Developers, Shannon Zhao,
	qemu-arm, Alistair Francis, Igor Mammedov, salil.mehta

On Mon, 9 Nov 2020 at 03:05, Ying Fang <fangying1@huawei.com> wrote:
>
> Add the CPUCacheInfo structure to hold cpu cache information for ARM cpus.
> A classic three level cache topology is used here. The default cache
> capacity is given and userspace can overwrite these values.
>
> Signed-off-by: Ying Fang <fangying1@huawei.com>
> ---
>  target/arm/cpu.c | 42 ++++++++++++++++++++++++++++++++++++++++++
>  target/arm/cpu.h | 27 +++++++++++++++++++++++++++
>  2 files changed, 69 insertions(+)
>
> diff --git a/target/arm/cpu.c b/target/arm/cpu.c
> index 056319859f..f1bac7452c 100644
> --- a/target/arm/cpu.c
> +++ b/target/arm/cpu.c
> @@ -27,6 +27,7 @@
>  #include "qapi/visitor.h"
>  #include "cpu.h"
>  #include "internals.h"
> +#include "qemu/units.h"
>  #include "exec/exec-all.h"
>  #include "hw/qdev-properties.h"
>  #if !defined(CONFIG_USER_ONLY)
> @@ -997,6 +998,45 @@ uint64_t arm_cpu_mp_affinity(int idx, uint8_t clustersz)
>      return (Aff1 << ARM_AFF1_SHIFT) | Aff0;
>  }
>
> +static CPUCaches default_cache_info = {
> +    .l1d_cache = &(CPUCacheInfo) {
> +    .type = DATA_CACHE,
> +        .level = 1,
> +        .size = 64 * KiB,
> +        .line_size = 64,
> +        .associativity = 4,
> +        .sets = 256,
> +        .attributes = 0x02,
> +    },

Would it be possible to populate this structure from the
CLIDR/CCSIDR ID register values, rather than having to
specify the same thing in two places?

thanks
-- PMM



* Re: [RFC PATCH v3 01/13] hw/arm/virt: Spell out smp.cpus and smp.max_cpus
  2020-11-20 12:43       ` Andrew Jones
@ 2020-12-15 18:02         ` Andrew Jones
  0 siblings, 0 replies; 25+ messages in thread
From: Andrew Jones @ 2020-12-15 18:02 UTC (permalink / raw)
  To: Ying Fang
  Cc: peter.maydell, Salil Mehta, Zhanghailiang, qemu-devel,
	shannon.zhaosl, qemu-arm, alistair.francis, imammedo

On Fri, Nov 20, 2020 at 01:43:19PM +0100, Andrew Jones wrote:
> On Tue, Nov 17, 2020 at 06:27:54PM +0800, Ying Fang wrote:
> > 
> > 
> > On 11/9/2020 6:45 PM, Salil Mehta wrote:
> > > Hi Fangying,
> > > A trivial thing. This patch looks like a bit of noise in this patch-set.
> > > Better to send it as a separate patch-set and get it accepted.
> > > 
> > Hmm, this patch looks like a code refactor for the somewhat confusing
> > *smp_cpus*, which will tidy the code. Maybe Andrew could do that.
> >
> 
> Sure. I can post the first two patches of this series after the 5.2
> release.
>

Just posted this

"[PATCH] hw/arm/virt: Remove virt machine state 'smp_cpus'"

Squashed in 2/13 and changed the approach a bit. I'm not quite so against
the "smp_cpus" string anymore, now that the global variable of the same
name is totally gone.

Thanks,
drew




* Re: [RFC PATCH v3 10/13] target/arm/cpu: Add cpu cache description for arm
  2020-11-30 13:00   ` Peter Maydell
@ 2021-01-12 13:25     ` Ying Fang
  0 siblings, 0 replies; 25+ messages in thread
From: Ying Fang @ 2021-01-12 13:25 UTC (permalink / raw)
  To: Peter Maydell
  Cc: Andrew Jones, zhanghailiang, QEMU Developers, Shannon Zhao,
	qemu-arm, Alistair Francis, Igor Mammedov, salil.mehta



On 11/30/2020 9:00 PM, Peter Maydell wrote:
> On Mon, 9 Nov 2020 at 03:05, Ying Fang <fangying1@huawei.com> wrote:
>>
>> Add the CPUCacheInfo structure to hold cpu cache information for ARM cpus.
>> A classic three level cache topology is used here. The default cache
>> capacity is given and userspace can overwrite these values.
>>
>> Signed-off-by: Ying Fang <fangying1@huawei.com>
>> ---
>>   target/arm/cpu.c | 42 ++++++++++++++++++++++++++++++++++++++++++
>>   target/arm/cpu.h | 27 +++++++++++++++++++++++++++
>>   2 files changed, 69 insertions(+)
>>
>> diff --git a/target/arm/cpu.c b/target/arm/cpu.c
>> index 056319859f..f1bac7452c 100644
>> --- a/target/arm/cpu.c
>> +++ b/target/arm/cpu.c
>> @@ -27,6 +27,7 @@
>>   #include "qapi/visitor.h"
>>   #include "cpu.h"
>>   #include "internals.h"
>> +#include "qemu/units.h"
>>   #include "exec/exec-all.h"
>>   #include "hw/qdev-properties.h"
>>   #if !defined(CONFIG_USER_ONLY)
>> @@ -997,6 +998,45 @@ uint64_t arm_cpu_mp_affinity(int idx, uint8_t clustersz)
>>       return (Aff1 << ARM_AFF1_SHIFT) | Aff0;
>>   }
>>
>> +static CPUCaches default_cache_info = {
>> +    .l1d_cache = &(CPUCacheInfo) {
>> +    .type = DATA_CACHE,
>> +        .level = 1,
>> +        .size = 64 * KiB,
>> +        .line_size = 64,
>> +        .associativity = 4,
>> +        .sets = 256,
>> +        .attributes = 0x02,
>> +    },
> 
> Would it be possible to populate this structure from the
> CLIDR/CCSIDR ID register values, rather than having to
> specify the same thing in two places?

Sorry I missed this reply.

I had tried to fetch the CLIDR/CCSIDR ID register values of the host
cpu from KVM; however, I did not get the values I expected. Maybe I
made some mistakes on the KVM side.

Thanks for your guidance, I'll try to populate them again.

> 
> thanks
> -- PMM
> 

Thanks.
Ying.



end of thread, other threads:[~2021-01-12 13:29 UTC | newest]

Thread overview: 25+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-11-09  3:04 [RFC PATCH v3 00/13] hw/arm/virt: Introduce cpu and cache topology support Ying Fang
2020-11-09  3:04 ` [RFC PATCH v3 01/13] hw/arm/virt: Spell out smp.cpus and smp.max_cpus Ying Fang
2020-11-09 10:45   ` Salil Mehta
2020-11-17 10:27     ` Ying Fang
2020-11-20 12:43       ` Andrew Jones
2020-12-15 18:02         ` Andrew Jones
2020-11-09  3:04 ` [RFC PATCH v3 02/13] hw/arm/virt: Remove unused variable Ying Fang
2020-11-09 10:54   ` Salil Mehta
2020-11-09  3:04 ` [RFC PATCH v3 03/13] hw/arm/virt: Replace smp_parse with one that prefers cores Ying Fang
2020-11-09 11:01   ` Salil Mehta
2020-11-09 11:58     ` Andrew Jones
2020-11-09  3:04 ` [RFC PATCH v3 04/13] device_tree: Add qemu_fdt_add_path Ying Fang
2020-11-09  3:04 ` [RFC PATCH v3 05/13] hw: add compat machines for 5.3 Ying Fang
2020-11-09  3:04 ` [RFC PATCH v3 06/13] hw/arm/virt: DT: add cpu-map Ying Fang
2020-11-09  3:04 ` [RFC PATCH v3 07/13] hw/arm/virt-acpi-build: distinguish possible and present cpus Ying Fang
2020-11-09  3:04 ` [RFC PATCH v3 08/13] hw/acpi/aml-build: add processor hierarchy node structure Ying Fang
2020-11-09  3:04 ` [RFC PATCH v3 09/13] hw/arm/virt-acpi-build: add PPTT table Ying Fang
2020-11-09  3:04 ` [RFC PATCH v3 10/13] target/arm/cpu: Add cpu cache description for arm Ying Fang
2020-11-09 17:46   ` Jonathan Cameron
2020-11-30 13:00   ` Peter Maydell
2021-01-12 13:25     ` Ying Fang
2020-11-09  3:04 ` [RFC PATCH v3 11/13] hw/arm/virt: add fdt cache information Ying Fang
2020-11-09  3:04 ` [RFC PATCH v3 12/13] hw/acpi/aml-build: Build ACPI cpu cache hierarchy information Ying Fang
2020-11-09  3:04 ` [RFC PATCH v3 13/13] hw/arm/virt-acpi-build: Enable cpu and cache topology Ying Fang
2020-11-09 17:36   ` Jonathan Cameron
