QEMU-Devel Archive on lore.kernel.org
* [RFC PATCH v2 0/6] hw/arm/virt: Introduce cpu topology support
@ 2021-04-13  8:07 Yanan Wang
  2021-04-13  8:07 ` [RFC PATCH v2 1/6] device_tree: Add qemu_fdt_add_path Yanan Wang
                   ` (7 more replies)
  0 siblings, 8 replies; 40+ messages in thread
From: Yanan Wang @ 2021-04-13  8:07 UTC (permalink / raw)
  To: qemu-devel, qemu-arm
  Cc: Peter Maydell, Andrew Jones, Michael S . Tsirkin,
	wanghaibin.wang, Yanan Wang, Shannon Zhao, Alistair Francis,
	prime.zeng, yangyicong, yuzenghui, Igor Mammedov, zhukeqian1,
	Jiajie Li, David Gibson

Hi,

This series is a new version of [0], recently posted by Ying Fang,
to introduce cpu topology support for the ARM platform. I have now
taken over this work from him; thanks for his contribution.

Description:
An accurate cpu topology may help improve the cpu scheduler's decision
making when dealing with multi-core systems, so a cpu topology
description is helpful to provide the guest with the right view. Dario
Faggioli's talk in [1] also shows that the virtual topology can have an
impact on scheduling performance. Thus this patch series introduces cpu
topology support for the ARM platform.

This series originally comes from Andrew Jones's patches [2], but with
some re-arrangement. Thanks for Andrew's contribution. In this series,
both an fdt cpu-map and an ACPI PPTT table are introduced to present
the cpu topology to the guest. A new function virt_smp_parse() is also
introduced; unlike the default smp_parse(), it prefers cores over
sockets.

[0] https://patchwork.kernel.org/project/qemu-devel/cover/20210225085627.2263-1-fangying1@huawei.com/
[1] https://kvmforum2020.sched.com/event/eE1y/virtual-topology-for-virtual-machines-friend-or-foe-dario-faggioli-suse
[2] https://github.com/rhdrjones/qemu/commit/ecfc1565f22187d2c715a99bbcd35cf3a7e428fa

Test results:
After applying this patch series, launch a guest with virt-6.0 and the
cpu topology configured with -smp 96,sockets=2,clusters=6,cores=4,threads=2;
the VM's cpu topology description then shows as below.

Architecture:        aarch64
Byte Order:          Little Endian
CPU(s):              96
On-line CPU(s) list: 0-95
Thread(s) per core:  2
Core(s) per socket:  24
Socket(s):           2
NUMA node(s):        1
Vendor ID:           0x48
Model:               0
Stepping:            0x1
BogoMIPS:            200.00
NUMA node0 CPU(s):   0-95

---

Changelogs:
v1->v2:
- Address Andrew Jones's comments
- Address Michael S. Tsirkin's comments
- Pick up one more patch (patch #6) from Andrew Jones
- Rebased on v6.0.0-rc2 release

---

Andrew Jones (3):
  device_tree: Add qemu_fdt_add_path
  hw/arm/virt: DT: Add cpu-map
  hw/arm/virt: Replace smp_parse with one that prefers cores

Yanan Wang (2):
  hw/acpi/aml-build: Add processor hierarchy node structure
  hw/arm/virt-acpi-build: Add PPTT table

Ying Fang (1):
  hw/arm/virt-acpi-build: Distinguish possible and present cpus

 hw/acpi/aml-build.c          |  27 ++++++++
 hw/arm/virt-acpi-build.c     |  77 ++++++++++++++++++++--
 hw/arm/virt.c                | 120 ++++++++++++++++++++++++++++++++++-
 include/hw/acpi/aml-build.h  |   4 ++
 include/hw/arm/virt.h        |   1 +
 include/sysemu/device_tree.h |   1 +
 softmmu/device_tree.c        |  45 ++++++++++++-
 7 files changed, 268 insertions(+), 7 deletions(-)

-- 
2.19.1



^ permalink raw reply	[flat|nested] 40+ messages in thread

* [RFC PATCH v2 1/6] device_tree: Add qemu_fdt_add_path
  2021-04-13  8:07 [RFC PATCH v2 0/6] hw/arm/virt: Introduce cpu topology support Yanan Wang
@ 2021-04-13  8:07 ` Yanan Wang
  2021-04-16  4:52   ` David Gibson
  2021-04-13  8:07 ` [RFC PATCH v2 2/6] hw/arm/virt: DT: Add cpu-map Yanan Wang
                   ` (6 subsequent siblings)
  7 siblings, 1 reply; 40+ messages in thread
From: Yanan Wang @ 2021-04-13  8:07 UTC (permalink / raw)
  To: qemu-devel, qemu-arm
  Cc: Peter Maydell, Andrew Jones, Michael S . Tsirkin,
	wanghaibin.wang, Yanan Wang, Shannon Zhao, Alistair Francis,
	prime.zeng, yangyicong, yuzenghui, Igor Mammedov, zhukeqian1,
	Jiajie Li, David Gibson

From: Andrew Jones <drjones@redhat.com>

qemu_fdt_add_path() works like qemu_fdt_add_subnode(), except
it also adds any missing subnodes in the path. We also tweak
an error message of qemu_fdt_add_subnode().

We'll make use of this new function in a coming patch.

Signed-off-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Yanan Wang <wangyanan55@huawei.com>
---
 include/sysemu/device_tree.h |  1 +
 softmmu/device_tree.c        | 45 ++++++++++++++++++++++++++++++++++--
 2 files changed, 44 insertions(+), 2 deletions(-)

diff --git a/include/sysemu/device_tree.h b/include/sysemu/device_tree.h
index 8a2fe55622..ef060a9759 100644
--- a/include/sysemu/device_tree.h
+++ b/include/sysemu/device_tree.h
@@ -121,6 +121,7 @@ uint32_t qemu_fdt_get_phandle(void *fdt, const char *path);
 uint32_t qemu_fdt_alloc_phandle(void *fdt);
 int qemu_fdt_nop_node(void *fdt, const char *node_path);
 int qemu_fdt_add_subnode(void *fdt, const char *name);
+int qemu_fdt_add_path(void *fdt, const char *path);
 
 #define qemu_fdt_setprop_cells(fdt, node_path, property, ...)                 \
     do {                                                                      \
diff --git a/softmmu/device_tree.c b/softmmu/device_tree.c
index 2691c58cf6..8592c7aa1b 100644
--- a/softmmu/device_tree.c
+++ b/softmmu/device_tree.c
@@ -541,8 +541,8 @@ int qemu_fdt_add_subnode(void *fdt, const char *name)
 
     retval = fdt_add_subnode(fdt, parent, basename);
     if (retval < 0) {
-        error_report("FDT: Failed to create subnode %s: %s", name,
-                     fdt_strerror(retval));
+        error_report("%s: Failed to create subnode %s: %s",
+                     __func__, name, fdt_strerror(retval));
         exit(1);
     }
 
@@ -550,6 +550,47 @@ int qemu_fdt_add_subnode(void *fdt, const char *name)
     return retval;
 }
 
+/*
+ * Like qemu_fdt_add_subnode(), but will add all missing
+ * subnodes in the path.
+ */
+int qemu_fdt_add_path(void *fdt, const char *path)
+{
+    char *dupname, *basename, *p;
+    int parent, retval = -1;
+
+    if (path[0] != '/') {
+        return retval;
+    }
+
+    parent = fdt_path_offset(fdt, "/");
+    p = dupname = g_strdup(path);
+
+    while (p) {
+        *p = '/';
+        basename = p + 1;
+        p = strchr(p + 1, '/');
+        if (p) {
+            *p = '\0';
+        }
+        retval = fdt_path_offset(fdt, dupname);
+        if (retval < 0 && retval != -FDT_ERR_NOTFOUND) {
+            error_report("%s: Invalid path %s: %s",
+                         __func__, path, fdt_strerror(retval));
+            exit(1);
+        } else if (retval == -FDT_ERR_NOTFOUND) {
+            retval = fdt_add_subnode(fdt, parent, basename);
+            if (retval < 0) {
+                break;
+            }
+        }
+        parent = retval;
+    }
+
+    g_free(dupname);
+    return retval;
+}
+
 void qemu_fdt_dumpdtb(void *fdt, int size)
 {
     const char *dumpdtb = current_machine->dumpdtb;
-- 
2.19.1




* [RFC PATCH v2 2/6] hw/arm/virt: DT: Add cpu-map
  2021-04-13  8:07 [RFC PATCH v2 0/6] hw/arm/virt: Introduce cpu topology support Yanan Wang
  2021-04-13  8:07 ` [RFC PATCH v2 1/6] device_tree: Add qemu_fdt_add_path Yanan Wang
@ 2021-04-13  8:07 ` Yanan Wang
  2021-04-27  9:47   ` Philippe Mathieu-Daudé
  2021-05-13  6:58   ` Andrew Jones
  2021-04-13  8:07 ` [RFC PATCH v2 3/6] hw/arm/virt-acpi-build: Distinguish possible and present cpus Yanan Wang
                   ` (5 subsequent siblings)
  7 siblings, 2 replies; 40+ messages in thread
From: Yanan Wang @ 2021-04-13  8:07 UTC (permalink / raw)
  To: qemu-devel, qemu-arm
  Cc: Peter Maydell, Andrew Jones, Michael S . Tsirkin,
	wanghaibin.wang, Yanan Wang, Shannon Zhao, Alistair Francis,
	prime.zeng, yangyicong, yuzenghui, Igor Mammedov, zhukeqian1,
	Jiajie Li, David Gibson

From: Andrew Jones <drjones@redhat.com>

Support device tree CPU topology descriptions.

Signed-off-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Yanan Wang <wangyanan55@huawei.com>
---
 hw/arm/virt.c         | 41 ++++++++++++++++++++++++++++++++++++++++-
 include/hw/arm/virt.h |  1 +
 2 files changed, 41 insertions(+), 1 deletion(-)

diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 9f01d9041b..f4ae60ded9 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -352,10 +352,11 @@ static void fdt_add_cpu_nodes(const VirtMachineState *vms)
     int cpu;
     int addr_cells = 1;
     const MachineState *ms = MACHINE(vms);
+    const VirtMachineClass *vmc = VIRT_MACHINE_GET_CLASS(vms);
     int smp_cpus = ms->smp.cpus;
 
     /*
-     * From Documentation/devicetree/bindings/arm/cpus.txt
+     *  See Linux Documentation/devicetree/bindings/arm/cpus.yaml
      *  On ARM v8 64-bit systems value should be set to 2,
      *  that corresponds to the MPIDR_EL1 register size.
      *  If MPIDR_EL1[63:32] value is equal to 0 on all CPUs
@@ -408,8 +409,45 @@ static void fdt_add_cpu_nodes(const VirtMachineState *vms)
                 ms->possible_cpus->cpus[cs->cpu_index].props.node_id);
         }
 
+        if (ms->smp.cpus > 1 && !vmc->no_cpu_topology) {
+            qemu_fdt_setprop_cell(ms->fdt, nodename, "phandle",
+                                  qemu_fdt_alloc_phandle(ms->fdt));
+        }
+
         g_free(nodename);
     }
+
+    if (ms->smp.cpus > 1 && !vmc->no_cpu_topology) {
+        /*
+         * See Linux Documentation/devicetree/bindings/cpu/cpu-topology.txt
+         * In a SMP system, the hierarchy of CPUs is defined through four
+         * entities that are used to describe the layout of physical CPUs
+         * in the system: socket/cluster/core/thread.
+         */
+        qemu_fdt_add_subnode(ms->fdt, "/cpus/cpu-map");
+
+        for (cpu = ms->smp.cpus - 1; cpu >= 0; cpu--) {
+            char *cpu_path = g_strdup_printf("/cpus/cpu@%d", cpu);
+            char *map_path;
+
+            if (ms->smp.threads > 1) {
+                map_path = g_strdup_printf(
+                    "/cpus/cpu-map/%s%d/%s%d/%s%d",
+                    "socket", cpu / (ms->smp.cores * ms->smp.threads),
+                    "core", (cpu / ms->smp.threads) % ms->smp.cores,
+                    "thread", cpu % ms->smp.threads);
+            } else {
+                map_path = g_strdup_printf(
+                    "/cpus/cpu-map/%s%d/%s%d",
+                    "socket", cpu / ms->smp.cores,
+                    "core", cpu % ms->smp.cores);
+            }
+            qemu_fdt_add_path(ms->fdt, map_path);
+            qemu_fdt_setprop_phandle(ms->fdt, map_path, "cpu", cpu_path);
+            g_free(map_path);
+            g_free(cpu_path);
+        }
+    }
 }
 
 static void fdt_add_its_gic_node(VirtMachineState *vms)
@@ -2769,6 +2807,7 @@ static void virt_machine_5_2_options(MachineClass *mc)
     virt_machine_6_0_options(mc);
     compat_props_add(mc->compat_props, hw_compat_5_2, hw_compat_5_2_len);
     vmc->no_secure_gpio = true;
+    vmc->no_cpu_topology = true;
 }
 DEFINE_VIRT_MACHINE(5, 2)
 
diff --git a/include/hw/arm/virt.h b/include/hw/arm/virt.h
index 921416f918..4a4b98e4a7 100644
--- a/include/hw/arm/virt.h
+++ b/include/hw/arm/virt.h
@@ -129,6 +129,7 @@ struct VirtMachineClass {
     bool no_kvm_steal_time;
     bool acpi_expose_flash;
     bool no_secure_gpio;
+    bool no_cpu_topology;
 };
 
 struct VirtMachineState {
-- 
2.19.1




* [RFC PATCH v2 3/6] hw/arm/virt-acpi-build: Distinguish possible and present cpus
  2021-04-13  8:07 [RFC PATCH v2 0/6] hw/arm/virt: Introduce cpu topology support Yanan Wang
  2021-04-13  8:07 ` [RFC PATCH v2 1/6] device_tree: Add qemu_fdt_add_path Yanan Wang
  2021-04-13  8:07 ` [RFC PATCH v2 2/6] hw/arm/virt: DT: Add cpu-map Yanan Wang
@ 2021-04-13  8:07 ` Yanan Wang
  2021-04-27 13:18   ` Andrew Jones
  2021-04-27 14:50   ` Andrew Jones
  2021-04-13  8:07 ` [RFC PATCH v2 4/6] hw/acpi/aml-build: Add processor hierarchy node structure Yanan Wang
                   ` (4 subsequent siblings)
  7 siblings, 2 replies; 40+ messages in thread
From: Yanan Wang @ 2021-04-13  8:07 UTC (permalink / raw)
  To: qemu-devel, qemu-arm
  Cc: Peter Maydell, Andrew Jones, Michael S . Tsirkin,
	wanghaibin.wang, Yanan Wang, Shannon Zhao, Alistair Francis,
	prime.zeng, yangyicong, yuzenghui, Igor Mammedov, Ying Fang,
	zhukeqian1, Jiajie Li, David Gibson

From: Ying Fang <fangying1@huawei.com>

When building ACPI tables for CPUs we should always build them for
the number of possible CPUs, not the number of present CPUs, and then
ensure that only the present CPUs are marked enabled in the MADT.
This will also be needed if we are to support CPU hotplug in the
future.

This patch is a rework based on Andrew Jones's contribution at
https://lists.gnu.org/archive/html/qemu-arm/2018-07/msg00076.html

Signed-off-by: Ying Fang <fangying1@huawei.com>
Signed-off-by: Yanan Wang <wangyanan55@huawei.com>
---
 hw/arm/virt-acpi-build.c | 14 ++++++++++----
 hw/arm/virt.c            |  3 +++
 2 files changed, 13 insertions(+), 4 deletions(-)

diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
index f5a2b2d4cb..2ad5dad1bf 100644
--- a/hw/arm/virt-acpi-build.c
+++ b/hw/arm/virt-acpi-build.c
@@ -61,13 +61,16 @@
 
 static void acpi_dsdt_add_cpus(Aml *scope, VirtMachineState *vms)
 {
-    MachineState *ms = MACHINE(vms);
+    CPUArchIdList *possible_cpus = MACHINE(vms)->possible_cpus;
     uint16_t i;
 
-    for (i = 0; i < ms->smp.cpus; i++) {
+    for (i = 0; i < possible_cpus->len; i++) {
         Aml *dev = aml_device("C%.03X", i);
         aml_append(dev, aml_name_decl("_HID", aml_string("ACPI0007")));
         aml_append(dev, aml_name_decl("_UID", aml_int(i)));
+        if (possible_cpus->cpus[i].cpu == NULL) {
+            aml_append(dev, aml_name_decl("_STA", aml_int(0)));
+        }
         aml_append(scope, dev);
     }
 }
@@ -479,6 +482,7 @@ build_madt(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
     const int *irqmap = vms->irqmap;
     AcpiMadtGenericDistributor *gicd;
     AcpiMadtGenericMsiFrame *gic_msi;
+    CPUArchIdList *possible_cpus = MACHINE(vms)->possible_cpus;
     int i;
 
     acpi_data_push(table_data, sizeof(AcpiMultipleApicTable));
@@ -489,7 +493,7 @@ build_madt(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
     gicd->base_address = cpu_to_le64(memmap[VIRT_GIC_DIST].base);
     gicd->version = vms->gic_version;
 
-    for (i = 0; i < MACHINE(vms)->smp.cpus; i++) {
+    for (i = 0; i < possible_cpus->len; i++) {
         AcpiMadtGenericCpuInterface *gicc = acpi_data_push(table_data,
                                                            sizeof(*gicc));
         ARMCPU *armcpu = ARM_CPU(qemu_get_cpu(i));
@@ -504,7 +508,9 @@ build_madt(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
         gicc->cpu_interface_number = cpu_to_le32(i);
         gicc->arm_mpidr = cpu_to_le64(armcpu->mp_affinity);
         gicc->uid = cpu_to_le32(i);
-        gicc->flags = cpu_to_le32(ACPI_MADT_GICC_ENABLED);
+        if (possible_cpus->cpus[i].cpu != NULL) {
+            gicc->flags = cpu_to_le32(ACPI_MADT_GICC_ENABLED);
+        }
 
         if (arm_feature(&armcpu->env, ARM_FEATURE_PMU)) {
             gicc->performance_interrupt = cpu_to_le32(PPI(VIRTUAL_PMU_IRQ));
diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index f4ae60ded9..3e5d9b6f26 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -2063,6 +2063,9 @@ static void machvirt_init(MachineState *machine)
         }
 
         qdev_realize(DEVICE(cpuobj), NULL, &error_fatal);
+
+        /* Initialize cpu member here since cpu hotplug is not supported yet */
+        machine->possible_cpus->cpus[n].cpu = cpuobj;
         object_unref(cpuobj);
     }
     fdt_add_timer_nodes(vms);
-- 
2.19.1




* [RFC PATCH v2 4/6] hw/acpi/aml-build: Add processor hierarchy node structure
  2021-04-13  8:07 [RFC PATCH v2 0/6] hw/arm/virt: Introduce cpu topology support Yanan Wang
                   ` (2 preceding siblings ...)
  2021-04-13  8:07 ` [RFC PATCH v2 3/6] hw/arm/virt-acpi-build: Distinguish possible and present cpus Yanan Wang
@ 2021-04-13  8:07 ` Yanan Wang
  2021-04-27 13:37   ` Andrew Jones
  2021-04-13  8:07 ` [RFC PATCH v2 5/6] hw/arm/virt-acpi-build: Add PPTT table Yanan Wang
                   ` (3 subsequent siblings)
  7 siblings, 1 reply; 40+ messages in thread
From: Yanan Wang @ 2021-04-13  8:07 UTC (permalink / raw)
  To: qemu-devel, qemu-arm
  Cc: Peter Maydell, Andrew Jones, Michael S . Tsirkin,
	wanghaibin.wang, Yanan Wang, Shannon Zhao, Henglong Fan,
	Alistair Francis, prime.zeng, yangyicong, yuzenghui,
	Igor Mammedov, Ying Fang, zhukeqian1, Jiajie Li, David Gibson

Add a generic API to build the Processor Hierarchy Node Structure
(Type 0), strictly consistent with the description in ACPI 6.3,
section 5.2.29.1.

This function will be used to build the ACPI PPTT table for the cpu
topology.

Signed-off-by: Ying Fang <fangying1@huawei.com>
Signed-off-by: Henglong Fan <fanhenglong@huawei.com>
Signed-off-by: Yanan Wang <wangyanan55@huawei.com>
---
 hw/acpi/aml-build.c         | 27 +++++++++++++++++++++++++++
 include/hw/acpi/aml-build.h |  4 ++++
 2 files changed, 31 insertions(+)

diff --git a/hw/acpi/aml-build.c b/hw/acpi/aml-build.c
index d33ce8954a..75e01aea17 100644
--- a/hw/acpi/aml-build.c
+++ b/hw/acpi/aml-build.c
@@ -1916,6 +1916,33 @@ void build_slit(GArray *table_data, BIOSLinker *linker, MachineState *ms,
                  table_data->len - slit_start, 1, oem_id, oem_table_id);
 }
 
+/*
+ * ACPI 6.3: 5.2.29.1 Processor Hierarchy Node Structure (Type 0)
+ */
+void build_processor_hierarchy_node(GArray *tbl, uint32_t flags,
+                                    uint32_t parent, uint32_t id,
+                                    uint32_t *priv_rsrc, uint32_t priv_num)
+{
+    int i;
+
+    build_append_byte(tbl, 0);                 /* Type 0 - processor */
+    build_append_byte(tbl, 20 + priv_num * 4); /* Length */
+    build_append_int_noprefix(tbl, 0, 2);      /* Reserved */
+    build_append_int_noprefix(tbl, flags, 4);  /* Flags */
+    build_append_int_noprefix(tbl, parent, 4); /* Parent */
+    build_append_int_noprefix(tbl, id, 4);     /* ACPI processor ID */
+
+    /* Number of private resources */
+    build_append_int_noprefix(tbl, priv_num, 4);
+
+    /* Private resources[N] */
+    if (priv_num > 0 && priv_rsrc != NULL) {
+        for (i = 0; i < priv_num; i++) {
+            build_append_int_noprefix(tbl, priv_rsrc[i], 4);
+        }
+    }
+}
+
 /* build rev1/rev3/rev5.1 FADT */
 void build_fadt(GArray *tbl, BIOSLinker *linker, const AcpiFadtData *f,
                 const char *oem_id, const char *oem_table_id)
diff --git a/include/hw/acpi/aml-build.h b/include/hw/acpi/aml-build.h
index 471266d739..ea74b8f6ed 100644
--- a/include/hw/acpi/aml-build.h
+++ b/include/hw/acpi/aml-build.h
@@ -462,6 +462,10 @@ void build_srat_memory(AcpiSratMemoryAffinity *numamem, uint64_t base,
 void build_slit(GArray *table_data, BIOSLinker *linker, MachineState *ms,
                 const char *oem_id, const char *oem_table_id);
 
+void build_processor_hierarchy_node(GArray *tbl, uint32_t flags,
+                                    uint32_t parent, uint32_t id,
+                                    uint32_t *priv_rsrc, uint32_t priv_num);
+
 void build_fadt(GArray *tbl, BIOSLinker *linker, const AcpiFadtData *f,
                 const char *oem_id, const char *oem_table_id);
 
-- 
2.19.1




* [RFC PATCH v2 5/6] hw/arm/virt-acpi-build: Add PPTT table
  2021-04-13  8:07 [RFC PATCH v2 0/6] hw/arm/virt: Introduce cpu topology support Yanan Wang
                   ` (3 preceding siblings ...)
  2021-04-13  8:07 ` [RFC PATCH v2 4/6] hw/acpi/aml-build: Add processor hierarchy node structure Yanan Wang
@ 2021-04-13  8:07 ` Yanan Wang
  2021-04-27 14:16   ` Andrew Jones
  2021-05-13  5:10   ` wangyanan (Y)
  2021-04-13  8:07 ` [RFC PATCH v2 6/6] hw/arm/virt: Replace smp_parse with one that prefers cores Yanan Wang
                   ` (2 subsequent siblings)
  7 siblings, 2 replies; 40+ messages in thread
From: Yanan Wang @ 2021-04-13  8:07 UTC (permalink / raw)
  To: qemu-devel, qemu-arm
  Cc: Peter Maydell, Andrew Jones, Michael S . Tsirkin,
	wanghaibin.wang, Yanan Wang, Shannon Zhao, Alistair Francis,
	prime.zeng, yangyicong, yuzenghui, Igor Mammedov, Ying Fang,
	zhukeqian1, Jiajie Li, David Gibson

Add the Processor Properties Topology Table (PPTT) to present
CPU topology information to ACPI guests. Note that while a DT-boot
Linux guest with a non-flat CPU topology will see socket and core
IDs as sequential integers starting from zero, an ACPI-boot guest
will not. E.g. with -smp 4,sockets=2,cores=2,threads=1

a DT boot produces

 cpu:  0 package_id:  0 core_id:  0
 cpu:  1 package_id:  0 core_id:  1
 cpu:  2 package_id:  1 core_id:  0
 cpu:  3 package_id:  1 core_id:  1

an ACPI boot produces

 cpu:  0 package_id: 36 core_id:  0
 cpu:  1 package_id: 36 core_id:  1
 cpu:  2 package_id: 96 core_id:  2
 cpu:  3 package_id: 96 core_id:  3

This is due to several reasons:

 1) DT cpu nodes do not have an equivalent field to what the PPTT
    ACPI Processor ID must be, i.e. something equal to the MADT CPU
    UID or equal to the UID of an ACPI processor container. In both
    ACPI cases those are platform-dependent IDs assigned by the
    vendor.

 2) While QEMU is the vendor for a guest, if the topology specifies
    SMT (> 1 thread), then, with ACPI, it is impossible to assign a
    core-id the same value as a package-id, thus it is not possible
    to have package-id=0 and core-id=0. This is because package and
    core containers must be in the same ACPI namespace and therefore
    must have unique UIDs.

 3) ACPI processor containers are not required for PPTT tables to
    be used and, due to the limitations of which IDs are selected
    described above in (2), they are not helpful for QEMU, so we
    don't build them with this patch. In the absence of them, Linux
    assigns its own unique IDs. The maintainers have chosen not to use
    counters from zero, but rather ACPI table offsets, which explains
    why the numbers are so much larger than with DT.

 4) When there is no SMT (threads=1) the core IDs for ACPI boot guests
    match the logical CPU IDs, because these IDs must be equal to the
    MADT CPU UID (as no processor containers are present), and QEMU
    uses the logical CPU ID for these MADT IDs.

Tested-by: Jiajie Li <lijiajie11@huawei.com>
Signed-off-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Ying Fang <fangying1@huawei.com>
Signed-off-by: Yanan Wang <wangyanan55@huawei.com>
---
 hw/arm/virt-acpi-build.c | 63 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 63 insertions(+)

diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
index 2ad5dad1bf..03fd812d5a 100644
--- a/hw/arm/virt-acpi-build.c
+++ b/hw/arm/virt-acpi-build.c
@@ -436,6 +436,64 @@ build_srat(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
                  vms->oem_table_id);
 }
 
+/* PPTT */
+static void
+build_pptt(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
+{
+    int pptt_start = table_data->len;
+    int uid = 0, cpus = 0, socket = 0;
+    MachineState *ms = MACHINE(vms);
+    unsigned int smp_cores = ms->smp.cores;
+    unsigned int smp_threads = ms->smp.threads;
+
+    acpi_data_push(table_data, sizeof(AcpiTableHeader));
+
+    for (socket = 0; cpus < ms->possible_cpus->len; socket++) {
+        uint32_t socket_offset = table_data->len - pptt_start;
+        int core;
+
+        build_processor_hierarchy_node(
+            table_data, 1, /* Physical package */
+            0, socket, /* No parent */
+            NULL, 0);  /* No private resources */
+
+        for (core = 0; core < smp_cores; core++) {
+            uint32_t core_offset = table_data->len - pptt_start;
+            int thread;
+
+            if (smp_threads <= 1) {
+                build_processor_hierarchy_node(
+                    table_data,
+                    (1 << 1) | /* ACPI Processor ID valid */
+                    (1 << 3),  /* ACPI 6.3 - Node is a Leaf */
+                    socket_offset, uid++, /* Parent is a Socket */
+                    NULL, 0);  /* No private resources */
+            } else {
+                build_processor_hierarchy_node(
+                    table_data, 0,
+                    socket_offset, core, /* Parent is a Socket */
+                    NULL, 0); /* No private resources */
+
+                for (thread = 0; thread < smp_threads; thread++) {
+                    build_processor_hierarchy_node(
+                        table_data,
+                        (1 << 1) | /* ACPI Processor ID valid */
+                        (1 << 2) | /* ACPI 6.3 - Processor is a Thread */
+                        (1 << 3),  /* ACPI 6.3 - Node is a Leaf */
+                        core_offset, uid++, /* Parent is a Core */
+                        NULL, 0);  /* No private resources */
+                }
+            }
+        }
+        cpus += smp_cores * smp_threads;
+    }
+
+    build_header(linker, table_data,
+                 (void *)(table_data->data + pptt_start), "PPTT",
+                 table_data->len - pptt_start, 2,
+                 vms->oem_id, vms->oem_table_id);
+}
+
 /* GTDT */
 static void
 build_gtdt(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
@@ -707,6 +765,11 @@ void virt_acpi_build(VirtMachineState *vms, AcpiBuildTables *tables)
     acpi_add_table(table_offsets, tables_blob);
     build_madt(tables_blob, tables->linker, vms);
 
+    if (ms->smp.cpus > 1 && !vmc->no_cpu_topology) {
+        acpi_add_table(table_offsets, tables_blob);
+        build_pptt(tables_blob, tables->linker, vms);
+    }
+
     acpi_add_table(table_offsets, tables_blob);
     build_gtdt(tables_blob, tables->linker, vms);
 
-- 
2.19.1




* [RFC PATCH v2 6/6] hw/arm/virt: Replace smp_parse with one that prefers cores
  2021-04-13  8:07 [RFC PATCH v2 0/6] hw/arm/virt: Introduce cpu topology support Yanan Wang
                   ` (4 preceding siblings ...)
  2021-04-13  8:07 ` [RFC PATCH v2 5/6] hw/arm/virt-acpi-build: Add PPTT table Yanan Wang
@ 2021-04-13  8:07 ` Yanan Wang
  2021-04-27 14:58   ` Andrew Jones
  2021-04-21  7:40 ` [RFC PATCH v2 0/6] hw/arm/virt: Introduce cpu topology support wangyanan (Y)
  2021-04-21  9:31 ` wangyanan (Y)
  7 siblings, 1 reply; 40+ messages in thread
From: Yanan Wang @ 2021-04-13  8:07 UTC (permalink / raw)
  To: qemu-devel, qemu-arm
  Cc: Peter Maydell, Andrew Jones, Michael S . Tsirkin,
	wanghaibin.wang, Yanan Wang, Shannon Zhao, Alistair Francis,
	prime.zeng, yangyicong, yuzenghui, Igor Mammedov, zhukeqian1,
	Jiajie Li, David Gibson

From: Andrew Jones <drjones@redhat.com>

The virt machine type has never used the CPU topology parameters, other
than the number of online CPUs and max CPUs. When choosing how to
allocate those CPUs, the default has been to assume cores. In
preparation for using the other CPU topology parameters, let's use an
smp_parse that prefers cores over sockets. We can also enforce that the
topology matches max_cpus, because we have no legacy to preserve.

Signed-off-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Yanan Wang <wangyanan55@huawei.com>
---
 hw/arm/virt.c | 76 +++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 76 insertions(+)

diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 3e5d9b6f26..57ef961cb5 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -79,6 +79,8 @@
 #include "hw/virtio/virtio-iommu.h"
 #include "hw/char/pl011.h"
 #include "qemu/guest-random.h"
+#include "qapi/qmp/qerror.h"
+#include "sysemu/replay.h"
 
 #define DEFINE_VIRT_MACHINE_LATEST(major, minor, latest) \
     static void virt_##major##_##minor##_class_init(ObjectClass *oc, \
@@ -2625,6 +2627,79 @@ static int virt_kvm_type(MachineState *ms, const char *type_str)
     return fixed_ipa ? 0 : requested_pa_size;
 }
 
+/*
+ * Unlike smp_parse() in hw/core/machine.c, we prefer cores over sockets,
+ * e.g. '-smp 8' creates 1 socket with 8 cores.  Whereas '-smp 8' with
+ * hw/core/machine.c's smp_parse() creates 8 sockets, each with 1 core.
+ * Additionally, we can enforce the topology matches max_cpus check,
+ * because we have no legacy to preserve.
+ */
+static void virt_smp_parse(MachineState *ms, QemuOpts *opts)
+{
+    if (opts) {
+        unsigned cpus = qemu_opt_get_number(opts, "cpus", 0);
+        unsigned sockets = qemu_opt_get_number(opts, "sockets", 0);
+        unsigned cores = qemu_opt_get_number(opts, "cores", 0);
+        unsigned threads = qemu_opt_get_number(opts, "threads", 0);
+
+        /*
+         * Compute missing values; prefer cores over sockets and
+         * sockets over threads.
+         */
+        if (cpus == 0 || cores == 0) {
+            sockets = sockets > 0 ? sockets : 1;
+            threads = threads > 0 ? threads : 1;
+            if (cpus == 0) {
+                cores = cores > 0 ? cores : 1;
+                cpus = cores * threads * sockets;
+            } else {
+                ms->smp.max_cpus = qemu_opt_get_number(opts, "maxcpus", cpus);
+                cores = ms->smp.max_cpus / (sockets * threads);
+            }
+        } else if (sockets == 0) {
+            threads = threads > 0 ? threads : 1;
+            sockets = cpus / (cores * threads);
+            sockets = sockets > 0 ? sockets : 1;
+        } else if (threads == 0) {
+            threads = cpus / (cores * sockets);
+            threads = threads > 0 ? threads : 1;
+        } else if (sockets * cores * threads < cpus) {
+            error_report("cpu topology: "
+                         "sockets (%u) * cores (%u) * threads (%u) < "
+                         "smp_cpus (%u)",
+                         sockets, cores, threads, cpus);
+            exit(1);
+        }
+
+        ms->smp.max_cpus = qemu_opt_get_number(opts, "maxcpus", cpus);
+
+        if (ms->smp.max_cpus < cpus) {
+            error_report("maxcpus must be equal to or greater than smp");
+            exit(1);
+        }
+
+        if (sockets * cores * threads != ms->smp.max_cpus) {
+            error_report("cpu topology: "
+                         "sockets (%u) * cores (%u) * threads (%u)"
+                         "!= maxcpus (%u)",
+                         sockets, cores, threads,
+                         ms->smp.max_cpus);
+            exit(1);
+        }
+
+        ms->smp.cpus = cpus;
+        ms->smp.cores = cores;
+        ms->smp.threads = threads;
+        ms->smp.sockets = sockets;
+    }
+
+    if (ms->smp.cpus > 1) {
+        Error *blocker = NULL;
+        error_setg(&blocker, QERR_REPLAY_NOT_SUPPORTED, "smp");
+        replay_add_blocker(blocker);
+    }
+}
+
 static void virt_machine_class_init(ObjectClass *oc, void *data)
 {
     MachineClass *mc = MACHINE_CLASS(oc);
@@ -2650,6 +2725,7 @@ static void virt_machine_class_init(ObjectClass *oc, void *data)
     mc->cpu_index_to_instance_props = virt_cpu_index_to_props;
     mc->default_cpu_type = ARM_CPU_TYPE_NAME("cortex-a15");
     mc->get_default_cpu_node_id = virt_get_default_cpu_node_id;
+    mc->smp_parse = virt_smp_parse;
     mc->kvm_type = virt_kvm_type;
     assert(!mc->get_hotplug_handler);
     mc->get_hotplug_handler = virt_machine_get_hotplug_handler;
-- 
2.19.1




* Re: [RFC PATCH v2 1/6] device_tree: Add qemu_fdt_add_path
  2021-04-13  8:07 ` [RFC PATCH v2 1/6] device_tree: Add qemu_fdt_add_path Yanan Wang
@ 2021-04-16  4:52   ` David Gibson
  2021-04-17  2:36     ` wangyanan (Y)
  0 siblings, 1 reply; 40+ messages in thread
From: David Gibson @ 2021-04-16  4:52 UTC (permalink / raw)
  To: Yanan Wang
  Cc: Peter Maydell, Andrew Jones, Michael S . Tsirkin,
	wanghaibin.wang, qemu-devel, Shannon Zhao, qemu-arm,
	Alistair Francis, prime.zeng, yangyicong, yuzenghui,
	Igor Mammedov, zhukeqian1, Jiajie Li



On Tue, Apr 13, 2021 at 04:07:40PM +0800, Yanan Wang wrote:
> From: Andrew Jones <drjones@redhat.com>
> 
> qemu_fdt_add_path() works like qemu_fdt_add_subnode(), except
> it also adds any missing subnodes in the path. We also tweak
> an error message of qemu_fdt_add_subnode().
> 
> We'll make use of this new function in a coming patch.
> 
> Signed-off-by: Andrew Jones <drjones@redhat.com>
> Signed-off-by: Yanan Wang <wangyanan55@huawei.com>
> ---
>  include/sysemu/device_tree.h |  1 +
>  softmmu/device_tree.c        | 45 ++++++++++++++++++++++++++++++++++--
>  2 files changed, 44 insertions(+), 2 deletions(-)
> 
> diff --git a/include/sysemu/device_tree.h b/include/sysemu/device_tree.h
> index 8a2fe55622..ef060a9759 100644
> --- a/include/sysemu/device_tree.h
> +++ b/include/sysemu/device_tree.h
> @@ -121,6 +121,7 @@ uint32_t qemu_fdt_get_phandle(void *fdt, const char *path);
>  uint32_t qemu_fdt_alloc_phandle(void *fdt);
>  int qemu_fdt_nop_node(void *fdt, const char *node_path);
>  int qemu_fdt_add_subnode(void *fdt, const char *name);
> +int qemu_fdt_add_path(void *fdt, const char *path);
>  
>  #define qemu_fdt_setprop_cells(fdt, node_path, property, ...)                 \
>      do {                                                                      \
> diff --git a/softmmu/device_tree.c b/softmmu/device_tree.c
> index 2691c58cf6..8592c7aa1b 100644
> --- a/softmmu/device_tree.c
> +++ b/softmmu/device_tree.c
> @@ -541,8 +541,8 @@ int qemu_fdt_add_subnode(void *fdt, const char *name)
>  
>      retval = fdt_add_subnode(fdt, parent, basename);
>      if (retval < 0) {
> -        error_report("FDT: Failed to create subnode %s: %s", name,
> -                     fdt_strerror(retval));
> +        error_report("%s: Failed to create subnode %s: %s",
> +                     __func__, name, fdt_strerror(retval));
>          exit(1);
>      }
>  
> @@ -550,6 +550,47 @@ int qemu_fdt_add_subnode(void *fdt, const char *name)
>      return retval;
>  }
>  
> +/*
> + * Like qemu_fdt_add_subnode(), but will add all missing
> + * subnodes in the path.
> + */
> +int qemu_fdt_add_path(void *fdt, const char *path)
> +{
> +    char *dupname, *basename, *p;
> +    int parent, retval = -1;
> +
> +    if (path[0] != '/') {
> +        return retval;
> +    }
> +
> +    parent = fdt_path_offset(fdt, "/");

Getting the offset for "/" is never needed - it's always 0.

> +    p = dupname = g_strdup(path);

You shouldn't need the strdup(), see below.

> +
> +    while (p) {
> +        *p = '/';
> +        basename = p + 1;
> +        p = strchr(p + 1, '/');
> +        if (p) {
> +            *p = '\0';
> +        }
> +        retval = fdt_path_offset(fdt, dupname);

The fdt_path_offset_namelen() function exists *exactly* so that you
can look up partial paths without having to mangle your input
string.  Just set the namelen right, and it will ignore anything to
the right of that.

> +        if (retval < 0 && retval != -FDT_ERR_NOTFOUND) {
> +            error_report("%s: Invalid path %s: %s",
> +                         __func__, path, fdt_strerror(retval));

If you're getting an error other than FDT_ERR_NOTFOUND here, chances
are it's not an invalid path, but a corrupted fdt blob or something
else.

> +            exit(1);
> +        } else if (retval == -FDT_ERR_NOTFOUND) {
> +            retval = fdt_add_subnode(fdt, parent, basename);
> +            if (retval < 0) {
> +                break;
> +            }
> +        }
> +        parent = retval;
> +    }
> +
> +    g_free(dupname);
> +    return retval;
> +}
> +
>  void qemu_fdt_dumpdtb(void *fdt, int size)
>  {
>      const char *dumpdtb = current_machine->dumpdtb;

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson



* Re: [RFC PATCH v2 1/6] device_tree: Add qemu_fdt_add_path
  2021-04-16  4:52   ` David Gibson
@ 2021-04-17  2:36     ` wangyanan (Y)
  2021-04-19  1:13       ` David Gibson
  0 siblings, 1 reply; 40+ messages in thread
From: wangyanan (Y) @ 2021-04-17  2:36 UTC (permalink / raw)
  To: David Gibson
  Cc: Peter Maydell, Andrew Jones, Michael S . Tsirkin,
	wanghaibin.wang, qemu-devel, Shannon Zhao, qemu-arm,
	Alistair Francis, prime.zeng, yangyicong, yuzenghui,
	Igor Mammedov, zhukeqian1, Jiajie Li

Hi David,

On 2021/4/16 12:52, David Gibson wrote:
> On Tue, Apr 13, 2021 at 04:07:40PM +0800, Yanan Wang wrote:
>> From: Andrew Jones <drjones@redhat.com>
>>
>> qemu_fdt_add_path() works like qemu_fdt_add_subnode(), except
>> it also adds any missing subnodes in the path. We also tweak
>> an error message of qemu_fdt_add_subnode().
>>
>> We'll make use of this new function in a coming patch.
>>
>> Signed-off-by: Andrew Jones <drjones@redhat.com>
>> Signed-off-by: Yanan Wang <wangyanan55@huawei.com>
>> ---
>>   include/sysemu/device_tree.h |  1 +
>>   softmmu/device_tree.c        | 45 ++++++++++++++++++++++++++++++++++--
>>   2 files changed, 44 insertions(+), 2 deletions(-)
>>
>> diff --git a/include/sysemu/device_tree.h b/include/sysemu/device_tree.h
>> index 8a2fe55622..ef060a9759 100644
>> --- a/include/sysemu/device_tree.h
>> +++ b/include/sysemu/device_tree.h
>> @@ -121,6 +121,7 @@ uint32_t qemu_fdt_get_phandle(void *fdt, const char *path);
>>   uint32_t qemu_fdt_alloc_phandle(void *fdt);
>>   int qemu_fdt_nop_node(void *fdt, const char *node_path);
>>   int qemu_fdt_add_subnode(void *fdt, const char *name);
>> +int qemu_fdt_add_path(void *fdt, const char *path);
>>   
>>   #define qemu_fdt_setprop_cells(fdt, node_path, property, ...)                 \
>>       do {                                                                      \
>> diff --git a/softmmu/device_tree.c b/softmmu/device_tree.c
>> index 2691c58cf6..8592c7aa1b 100644
>> --- a/softmmu/device_tree.c
>> +++ b/softmmu/device_tree.c
>> @@ -541,8 +541,8 @@ int qemu_fdt_add_subnode(void *fdt, const char *name)
>>   
>>       retval = fdt_add_subnode(fdt, parent, basename);
>>       if (retval < 0) {
>> -        error_report("FDT: Failed to create subnode %s: %s", name,
>> -                     fdt_strerror(retval));
>> +        error_report("%s: Failed to create subnode %s: %s",
>> +                     __func__, name, fdt_strerror(retval));
>>           exit(1);
>>       }
>>   
>> @@ -550,6 +550,47 @@ int qemu_fdt_add_subnode(void *fdt, const char *name)
>>       return retval;
>>   }
>>   
>> +/*
>> + * Like qemu_fdt_add_subnode(), but will add all missing
>> + * subnodes in the path.
>> + */
>> +int qemu_fdt_add_path(void *fdt, const char *path)
>> +{
>> +    char *dupname, *basename, *p;
>> +    int parent, retval = -1;
>> +
>> +    if (path[0] != '/') {
>> +        return retval;
>> +    }
>> +
>> +    parent = fdt_path_offset(fdt, "/");
> Getting the offset for "/" is never needed - it's always 0.
Thanks, will fix it.
>> +    p = dupname = g_strdup(path);
> You shouldn't need the strdup(), see below.
>
>> +
>> +    while (p) {
>> +        *p = '/';
>> +        basename = p + 1;
>> +        p = strchr(p + 1, '/');
>> +        if (p) {
>> +            *p = '\0';
>> +        }
>> +        retval = fdt_path_offset(fdt, dupname);
> The fdt_path_offset_namelen() function exists *exactly* so that you
> can look up partial paths without having to mangle your input
> string.  Just set the namelen right, and it will ignore anything to
> the right of that.
Function fdt_path_offset_namelen() seems more reasonable.

After we call qemu_fdt_add_path() to add "/cpus/cpu-map/socket0/core0"
successfully, if we then want to add another path like
"/cpus/cpu-map/socket0/core1", we will get -FDT_ERR_NOTFOUND for each
partial path. But "/cpus/cpu-map/socket0" actually already exists, so
by using fdt_path_offset_namelen() with the right namelen we can avoid
the error retval for that part.
>> +        if (retval < 0 && retval != -FDT_ERR_NOTFOUND) {
>> +            error_report("%s: Invalid path %s: %s",
>> +                         __func__, path, fdt_strerror(retval));
> If you're getting an error other than FDT_ERR_NOTFOUND here, chances
> are it's not an invalid path, but a corrupted fdt blob or something
> else.

Right, there can be various reasons for the failure besides an
invalid path.

>> +            exit(1);
>> +        } else if (retval == -FDT_ERR_NOTFOUND) {
>> +            retval = fdt_add_subnode(fdt, parent, basename);
>> +            if (retval < 0) {
>> +                break;
>> +            }
I found another question here. If path "/cpus/cpu-map/socket0/core0"
has already been added, then when we want to add another path
"/cpus/cpu-map/socket0/core1" and reach this point with
retval = fdt_add_subnode(fdt, parent, "cpus"), retval will be
-FDT_ERR_EXISTS, but we can't just break the loop in this case.

Is my explanation right?

Thanks,
Yanan
>> +        }
>> +        parent = retval;
>> +    }
>> +
>> +    g_free(dupname);
>> +    return retval;
>> +}
>> +
>>   void qemu_fdt_dumpdtb(void *fdt, int size)
>>   {
>>       const char *dumpdtb = current_machine->dumpdtb;



* Re: [RFC PATCH v2 1/6] device_tree: Add qemu_fdt_add_path
  2021-04-17  2:36     ` wangyanan (Y)
@ 2021-04-19  1:13       ` David Gibson
  2021-04-19  7:02         ` wangyanan (Y)
  0 siblings, 1 reply; 40+ messages in thread
From: David Gibson @ 2021-04-19  1:13 UTC (permalink / raw)
  To: wangyanan (Y)
  Cc: Peter Maydell, Andrew Jones, Michael S . Tsirkin,
	wanghaibin.wang, qemu-devel, Shannon Zhao, qemu-arm,
	Alistair Francis, prime.zeng, yangyicong, yuzenghui,
	Igor Mammedov, zhukeqian1, Jiajie Li



On Sat, Apr 17, 2021 at 10:36:20AM +0800, wangyanan (Y) wrote:
> Hi David,
> 
> On 2021/4/16 12:52, David Gibson wrote:
> > On Tue, Apr 13, 2021 at 04:07:40PM +0800, Yanan Wang wrote:
> > > From: Andrew Jones <drjones@redhat.com>
> > > 
> > > qemu_fdt_add_path() works like qemu_fdt_add_subnode(), except
> > > it also adds any missing subnodes in the path. We also tweak
> > > an error message of qemu_fdt_add_subnode().
> > > 
> > > We'll make use of this new function in a coming patch.
> > > 
> > > Signed-off-by: Andrew Jones <drjones@redhat.com>
> > > Signed-off-by: Yanan Wang <wangyanan55@huawei.com>
> > > ---
> > >   include/sysemu/device_tree.h |  1 +
> > >   softmmu/device_tree.c        | 45 ++++++++++++++++++++++++++++++++++--
> > >   2 files changed, 44 insertions(+), 2 deletions(-)
> > > 
> > > diff --git a/include/sysemu/device_tree.h b/include/sysemu/device_tree.h
> > > index 8a2fe55622..ef060a9759 100644
> > > --- a/include/sysemu/device_tree.h
> > > +++ b/include/sysemu/device_tree.h
> > > @@ -121,6 +121,7 @@ uint32_t qemu_fdt_get_phandle(void *fdt, const char *path);
> > >   uint32_t qemu_fdt_alloc_phandle(void *fdt);
> > >   int qemu_fdt_nop_node(void *fdt, const char *node_path);
> > >   int qemu_fdt_add_subnode(void *fdt, const char *name);
> > > +int qemu_fdt_add_path(void *fdt, const char *path);
> > >   #define qemu_fdt_setprop_cells(fdt, node_path, property, ...)                 \
> > >       do {                                                                      \
> > > diff --git a/softmmu/device_tree.c b/softmmu/device_tree.c
> > > index 2691c58cf6..8592c7aa1b 100644
> > > --- a/softmmu/device_tree.c
> > > +++ b/softmmu/device_tree.c
> > > @@ -541,8 +541,8 @@ int qemu_fdt_add_subnode(void *fdt, const char *name)
> > >       retval = fdt_add_subnode(fdt, parent, basename);
> > >       if (retval < 0) {
> > > -        error_report("FDT: Failed to create subnode %s: %s", name,
> > > -                     fdt_strerror(retval));
> > > +        error_report("%s: Failed to create subnode %s: %s",
> > > +                     __func__, name, fdt_strerror(retval));
> > >           exit(1);
> > >       }
> > > @@ -550,6 +550,47 @@ int qemu_fdt_add_subnode(void *fdt, const char *name)
> > >       return retval;
> > >   }
> > > +/*
> > > + * Like qemu_fdt_add_subnode(), but will add all missing
> > > + * subnodes in the path.
> > > + */
> > > +int qemu_fdt_add_path(void *fdt, const char *path)
> > > +{
> > > +    char *dupname, *basename, *p;
> > > +    int parent, retval = -1;
> > > +
> > > +    if (path[0] != '/') {
> > > +        return retval;
> > > +    }
> > > +
> > > +    parent = fdt_path_offset(fdt, "/");
> > Getting the offset for "/" is never needed - it's always 0.
> Thanks, will fix it.
> > > +    p = dupname = g_strdup(path);
> > You shouldn't need the strdup(), see below.
> > 
> > > +
> > > +    while (p) {
> > > +        *p = '/';
> > > +        basename = p + 1;
> > > +        p = strchr(p + 1, '/');
> > > +        if (p) {
> > > +            *p = '\0';
> > > +        }
> > > +        retval = fdt_path_offset(fdt, dupname);
> > The fdt_path_offset_namelen() function exists *exactly* so that you
> > can look up partial paths without having to mangle your input
> > string.  Just set the namelen right, and it will ignore anything to
> > the right of that.
> Function fdt_path_offset_namelen() seems more reasonable.
> 
> After we call qemu_fdt_add_path() to add "/cpus/cpu-map/socket0/core0"
> successfully,
> if we want to add another path like "/cpus/cpu-map/socket0/core1" we will
> get the error
> -FDT_ERR_NOTFOUND for each partial path. But actually
> "/cpus/cpu-map/socket0"
> already exists, so by using fdt_path_offset_namelen() with right namelen we
> can avoid
> the error retval for this part.

I don't quite follow what you're saying here.  AFAICT your logic was
correct - it just involved a lot of mangling the given path (adding
and removing \0s) which becomes unnecessary with
fdt_path_offset_namelen().

> > > +        if (retval < 0 && retval != -FDT_ERR_NOTFOUND) {
> > > +            error_report("%s: Invalid path %s: %s",
> > > +                         __func__, path, fdt_strerror(retval));
> > If you're getting an error other than FDT_ERR_NOTFOUND here, chances
> > are it's not an invalid path, but a corrupted fdt blob or something
> > else.
> 
> Right, there can be various reasons for the failure besides an invalid
> path.
> 
> > > +            exit(1);
> > > +        } else if (retval == -FDT_ERR_NOTFOUND) {
> > > +            retval = fdt_add_subnode(fdt, parent, basename);
> > > +            if (retval < 0) {
> > > +                break;
> > > +            }
> I found another question here. If path "/cpus/cpu-map/socket0/core0" has
> already
> been added, when we want to add another path "/cpus/cpu-map/socket0/core1"
> and go here with retval = fdt_add_subnode(fdt, parent, "cpus"), then retval
> will
> be -FDT_ERR_EXISTS, but we can't just break the loop in this case.
> 
> Am I right of the explanation ?

Not exactly.  If you called that fdt_add_subnode() in that case you
would get FDT_ERR_EXISTS.  However, because the previous
fdt_path_offset() has returned -FDT_ERR_NOTFOUND, you've already
established that the partial path doesn't exist, so if you got an
FDT_ERR_EXISTS here something has gone very strangely wrong (such as
concurrent access to the flat tree, which would very likely corrupt
it).

Note that the repeated use of fdt_path_offset() in this function will
repeatedly rescan the whole fdt from the beginning.  If you're
concerned about performance (which you might well not be), then a more
efficient approach would be to take your input path component by
component and use fdt_subnode_offset() directly.

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson



* Re: [RFC PATCH v2 1/6] device_tree: Add qemu_fdt_add_path
  2021-04-19  1:13       ` David Gibson
@ 2021-04-19  7:02         ` wangyanan (Y)
  0 siblings, 0 replies; 40+ messages in thread
From: wangyanan (Y) @ 2021-04-19  7:02 UTC (permalink / raw)
  To: David Gibson
  Cc: Peter Maydell, Andrew Jones, Michael S . Tsirkin,
	wanghaibin.wang, qemu-devel, Shannon Zhao, qemu-arm,
	Alistair Francis, prime.zeng, yangyicong, yuzenghui,
	Igor Mammedov, zhukeqian1, Jiajie Li

Hi David,

On 2021/4/19 9:13, David Gibson wrote:
> On Sat, Apr 17, 2021 at 10:36:20AM +0800, wangyanan (Y) wrote:
>> Hi David,
>>
>> On 2021/4/16 12:52, David Gibson wrote:
>>> On Tue, Apr 13, 2021 at 04:07:40PM +0800, Yanan Wang wrote:
>>>> From: Andrew Jones <drjones@redhat.com>
>>>>
>>>> qemu_fdt_add_path() works like qemu_fdt_add_subnode(), except
>>>> it also adds any missing subnodes in the path. We also tweak
>>>> an error message of qemu_fdt_add_subnode().
>>>>
>>>> We'll make use of this new function in a coming patch.
>>>>
>>>> Signed-off-by: Andrew Jones <drjones@redhat.com>
>>>> Signed-off-by: Yanan Wang <wangyanan55@huawei.com>
>>>> ---
>>>>    include/sysemu/device_tree.h |  1 +
>>>>    softmmu/device_tree.c        | 45 ++++++++++++++++++++++++++++++++++--
>>>>    2 files changed, 44 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/include/sysemu/device_tree.h b/include/sysemu/device_tree.h
>>>> index 8a2fe55622..ef060a9759 100644
>>>> --- a/include/sysemu/device_tree.h
>>>> +++ b/include/sysemu/device_tree.h
>>>> @@ -121,6 +121,7 @@ uint32_t qemu_fdt_get_phandle(void *fdt, const char *path);
>>>>    uint32_t qemu_fdt_alloc_phandle(void *fdt);
>>>>    int qemu_fdt_nop_node(void *fdt, const char *node_path);
>>>>    int qemu_fdt_add_subnode(void *fdt, const char *name);
>>>> +int qemu_fdt_add_path(void *fdt, const char *path);
>>>>    #define qemu_fdt_setprop_cells(fdt, node_path, property, ...)                 \
>>>>        do {                                                                      \
>>>> diff --git a/softmmu/device_tree.c b/softmmu/device_tree.c
>>>> index 2691c58cf6..8592c7aa1b 100644
>>>> --- a/softmmu/device_tree.c
>>>> +++ b/softmmu/device_tree.c
>>>> @@ -541,8 +541,8 @@ int qemu_fdt_add_subnode(void *fdt, const char *name)
>>>>        retval = fdt_add_subnode(fdt, parent, basename);
>>>>        if (retval < 0) {
>>>> -        error_report("FDT: Failed to create subnode %s: %s", name,
>>>> -                     fdt_strerror(retval));
>>>> +        error_report("%s: Failed to create subnode %s: %s",
>>>> +                     __func__, name, fdt_strerror(retval));
>>>>            exit(1);
>>>>        }
>>>> @@ -550,6 +550,47 @@ int qemu_fdt_add_subnode(void *fdt, const char *name)
>>>>        return retval;
>>>>    }
>>>> +/*
>>>> + * Like qemu_fdt_add_subnode(), but will add all missing
>>>> + * subnodes in the path.
>>>> + */
>>>> +int qemu_fdt_add_path(void *fdt, const char *path)
>>>> +{
>>>> +    char *dupname, *basename, *p;
>>>> +    int parent, retval = -1;
>>>> +
>>>> +    if (path[0] != '/') {
>>>> +        return retval;
>>>> +    }
>>>> +
>>>> +    parent = fdt_path_offset(fdt, "/");
>>> Getting the offset for "/" is never needed - it's always 0.
>> Thanks, will fix it.
>>>> +    p = dupname = g_strdup(path);
>>> You shouldn't need the strdup(), see below.
>>>
>>>> +
>>>> +    while (p) {
>>>> +        *p = '/';
>>>> +        basename = p + 1;
>>>> +        p = strchr(p + 1, '/');
>>>> +        if (p) {
>>>> +            *p = '\0';
>>>> +        }
>>>> +        retval = fdt_path_offset(fdt, dupname);
>>> The fdt_path_offset_namelen() function exists *exactly* so that you
>>> can look up partial paths without having to mangle your input
>>> string.  Just set the namelen right, and it will ignore anything to
>>> the right of that.
>> Function fdt_path_offset_namelen() seems more reasonable.
>>
>> After we call qemu_fdt_add_path() to add "/cpus/cpu-map/socket0/core0"
>> successfully,
>> if we want to add another path like "/cpus/cpu-map/socket0/core1" we will
>> get the error
>> -FDT_ERR_NOTFOUND for each partial path. But actually
>> "/cpus/cpu-map/socket0"
>> already exists, so by using fdt_path_offset_namelen() with right namelen we
>> can avoid
>> the error retval for this part.
> I don't quite follow what you're saying here.  AFAICT your logic was
> correct - it just involved a lot of mangling the given path (adding
> and removing \0s) which becomes unnecessary with
> fdt_path_offset_namelen().
Right.
>>>> +        if (retval < 0 && retval != -FDT_ERR_NOTFOUND) {
>>>> +            error_report("%s: Invalid path %s: %s",
>>>> +                         __func__, path, fdt_strerror(retval));
>>> If you're getting an error other than FDT_ERR_NOTFOUND here, chances
>>> are it's not an invalid path, but a corrupted fdt blob or something
>>> else.
>> Right, there can be various reasons for the failure besides an invalid
>> path.
>>
>>>> +            exit(1);
>>>> +        } else if (retval == -FDT_ERR_NOTFOUND) {
>>>> +            retval = fdt_add_subnode(fdt, parent, basename);
>>>> +            if (retval < 0) {
>>>> +                break;
>>>> +            }
>> I found another question here. If path "/cpus/cpu-map/socket0/core0" has
>> already
>> been added, when we want to add another path "/cpus/cpu-map/socket0/core1"
>> and go here with retval = fdt_add_subnode(fdt, parent, "cpus"), then retval
>> will
>> be -FDT_ERR_EXISTS, but we can't just break the loop in this case.
>>
>> Am I right of the explanation ?
> Not exactly.  If you called that fdt_add_subnode() in that case you
> would get FDT_ERR_EXISTS.  However, because the previous
> fdt_path_offset() has returned -FDT_ERR_NOTFOUND, you've already
> established that the partial path doesn't exist, so if you got an
> FDT_ERR_EXISTS here something has gone very strangely wrong (such as
> concurrent access to the flat tree, which would very likely corrupt
> it).
Thanks for the analysis, I was wrong...
> Note that the repeated use of fdt_path_offset() in this function will
> repeatedly rescan the whole fdt from the beginning.  If you're
> concerned about performance (which you might well not be), then a more
> efficient approach would be to take your input path component by
> component and use fdt_subnode_offset() directly.
I will try to address your comments and rework this patch in the
next version.

Thanks,
Yanan
>



* Re: [RFC PATCH v2 0/6] hw/arm/virt: Introduce cpu topology support
  2021-04-13  8:07 [RFC PATCH v2 0/6] hw/arm/virt: Introduce cpu topology support Yanan Wang
                   ` (5 preceding siblings ...)
  2021-04-13  8:07 ` [RFC PATCH v2 6/6] hw/arm/virt: Replace smp_parse with one that prefers cores Yanan Wang
@ 2021-04-21  7:40 ` wangyanan (Y)
  2021-04-21  9:31 ` wangyanan (Y)
  7 siblings, 0 replies; 40+ messages in thread
From: wangyanan (Y) @ 2021-04-21  7:40 UTC (permalink / raw)
  To: Peter Maydell, Andrew Jones, Michael S . Tsirkin, Shannon Zhao,
	Igor Mammedov, qemu-devel, qemu-arm
  Cc: Alistair Francis, wanghaibin.wang, yuzenghui, yangyicong,
	zhukeqian1, Jiajie Li, David Gibson

Hey guys, any comments will be really welcomed and appreciated! 😉

Thanks,
Yanan
On 2021/4/13 16:07, Yanan Wang wrote:
> Hi,
>
> This series is a new version of [0] recently posted by Ying Fang
> to introduce cpu topology support for ARM platform. I have taken
> over his work about this now, thanks for his contribution.
>
> Description:
> An accurate cpu topology may help improve the cpu scheduler's decision
> making when dealing with a multi-core system. So a cpu topology
> description is helpful to provide the guest with the right view. Dario
> Faggioli's talk in [1] also shows that the virtual topology can have an
> impact on scheduling performance. Thus this patch series introduces cpu
> topology support for the ARM platform.
>
> This series originally comes from Andrew Jones's patches [2], but with
> some re-arrangement. Thanks for Andrew's contribution. In this series,
> both fdt and ACPI PPTT table are introduced to present cpu topology to
> the guest. And a new function virt_smp_parse() not like the default
> smp_parse() is introduced, which prefers cores over sockets.
>
> [0] https://patchwork.kernel.org/project/qemu-devel/cover/20210225085627.2263-1-fangying1@huawei.com/
> [1] https://kvmforum2020.sched.com/event/eE1y/virtual-topology-for-virtual-machines-friend-or-foe-dario-faggioli-suse
> [2] https://github.com/rhdrjones/qemu/commit/ecfc1565f22187d2c715a99bbcd35cf3a7e428fa
>
> Test results:
> After applying this patch series, launch a guest with virt-6.0 and cpu
> topology configured with: -smp 96,sockets=2,clusters=6,cores=4,threads=2,
> VM's cpu topology description shows as below.
>
> Architecture:        aarch64
> Byte Order:          Little Endian
> CPU(s):              96
> On-line CPU(s) list: 0-95
> Thread(s) per core:  2
> Core(s) per socket:  24
> Socket(s):           2
> NUMA node(s):        1
> Vendor ID:           0x48
> Model:               0
> Stepping:            0x1
> BogoMIPS:            200.00
> NUMA node0 CPU(s):   0-95
>
> ---
>
> Changelogs:
> v1->v2:
> - Address Andrew Jones's comments
> - Address Michael S. Tsirkin's comments
> - Pick up one more patch(patch#6) of Andrew Jones
> - Rebased on v6.0.0-rc2 release
>
> ---
>
> Andrew Jones (3):
>    device_tree: Add qemu_fdt_add_path
>    hw/arm/virt: DT: Add cpu-map
>    hw/arm/virt: Replace smp_parse with one that prefers cores
>
> Yanan Wang (2):
>    hw/acpi/aml-build: Add processor hierarchy node structure
>    hw/arm/virt-acpi-build: Add PPTT table
>
> Ying Fang (1):
>    hw/arm/virt-acpi-build: Distinguish possible and present cpus
>
>   hw/acpi/aml-build.c          |  27 ++++++++
>   hw/arm/virt-acpi-build.c     |  77 ++++++++++++++++++++--
>   hw/arm/virt.c                | 120 ++++++++++++++++++++++++++++++++++-
>   include/hw/acpi/aml-build.h  |   4 ++
>   include/hw/arm/virt.h        |   1 +
>   include/sysemu/device_tree.h |   1 +
>   softmmu/device_tree.c        |  45 ++++++++++++-
>   7 files changed, 268 insertions(+), 7 deletions(-)
>



* Re: [RFC PATCH v2 0/6] hw/arm/virt: Introduce cpu topology support
  2021-04-13  8:07 [RFC PATCH v2 0/6] hw/arm/virt: Introduce cpu topology support Yanan Wang
                   ` (6 preceding siblings ...)
  2021-04-21  7:40 ` [RFC PATCH v2 0/6] hw/arm/virt: Introduce cpu topology support wangyanan (Y)
@ 2021-04-21  9:31 ` wangyanan (Y)
  7 siblings, 0 replies; 40+ messages in thread
From: wangyanan (Y) @ 2021-04-21  9:31 UTC (permalink / raw)
  To: qemu-devel, qemu-arm
  Cc: Peter Maydell, Andrew Jones, Michael S . Tsirkin,
	wanghaibin.wang, Shannon Zhao, Alistair Francis, prime.zeng,
	yangyicong, yuzenghui, Igor Mammedov, zhukeqian1, Jiajie Li,
	David Gibson


On 2021/4/13 16:07, Yanan Wang wrote:
> Hi,
>
> This series is a new version of [0] recently posted by Ying Fang
> to introduce cpu topology support for ARM platform. I have taken
> over his work about this now, thanks for his contribution.
>
> Description:
> An accurate cpu topology may help improve the cpu scheduler's decision
> making when dealing with a multi-core system. So a cpu topology
> description is helpful to provide the guest with the right view. Dario
> Faggioli's talk in [1] also shows that the virtual topology can have an
> impact on scheduling performance. Thus this patch series introduces cpu
> topology support for the ARM platform.
>
> This series originally comes from Andrew Jones's patches [2], but with
> some re-arrangement. Thanks for Andrew's contribution. In this series,
> both fdt and ACPI PPTT table are introduced to present cpu topology to
> the guest. And a new function virt_smp_parse() not like the default
> smp_parse() is introduced, which prefers cores over sockets.
>
> [0] https://patchwork.kernel.org/project/qemu-devel/cover/20210225085627.2263-1-fangying1@huawei.com/
> [1] https://kvmforum2020.sched.com/event/eE1y/virtual-topology-for-virtual-machines-friend-or-foe-dario-faggioli-suse
> [2] https://github.com/rhdrjones/qemu/commit/ecfc1565f22187d2c715a99bbcd35cf3a7e428fa
>
> Test results:
> After applying this patch series, launch a guest with virt-6.0 and cpu
> topology configured with: -smp 96,sockets=2,clusters=6,cores=4,threads=2,
A correction to the statement above: the command line here was
actually "-smp 96,sockets=2,cores=24,threads=2".

Thanks,
Yanan
> VM's cpu topology description shows as below.
>
> Architecture:        aarch64
> Byte Order:          Little Endian
> CPU(s):              96
> On-line CPU(s) list: 0-95
> Thread(s) per core:  2
> Core(s) per socket:  24
> Socket(s):           2
> NUMA node(s):        1
> Vendor ID:           0x48
> Model:               0
> Stepping:            0x1
> BogoMIPS:            200.00
> NUMA node0 CPU(s):   0-95
>
> ---
>
> Changelogs:
> v1->v2:
> - Address Andrew Jones's comments
> - Address Michael S. Tsirkin's comments
> - Pick up one more patch(patch#6) of Andrew Jones
> - Rebased on v6.0.0-rc2 release
>
> ---
>
> Andrew Jones (3):
>    device_tree: Add qemu_fdt_add_path
>    hw/arm/virt: DT: Add cpu-map
>    hw/arm/virt: Replace smp_parse with one that prefers cores
>
> Yanan Wang (2):
>    hw/acpi/aml-build: Add processor hierarchy node structure
>    hw/arm/virt-acpi-build: Add PPTT table
>
> Ying Fang (1):
>    hw/arm/virt-acpi-build: Distinguish possible and present cpus
>
>   hw/acpi/aml-build.c          |  27 ++++++++
>   hw/arm/virt-acpi-build.c     |  77 ++++++++++++++++++++--
>   hw/arm/virt.c                | 120 ++++++++++++++++++++++++++++++++++-
>   include/hw/acpi/aml-build.h  |   4 ++
>   include/hw/arm/virt.h        |   1 +
>   include/sysemu/device_tree.h |   1 +
>   softmmu/device_tree.c        |  45 ++++++++++++-
>   7 files changed, 268 insertions(+), 7 deletions(-)
>



* Re: [RFC PATCH v2 2/6] hw/arm/virt: DT: Add cpu-map
  2021-04-13  8:07 ` [RFC PATCH v2 2/6] hw/arm/virt: DT: Add cpu-map Yanan Wang
@ 2021-04-27  9:47   ` Philippe Mathieu-Daudé
  2021-04-27 10:04     ` Andrew Jones
  2021-05-13  6:58   ` Andrew Jones
  1 sibling, 1 reply; 40+ messages in thread
From: Philippe Mathieu-Daudé @ 2021-04-27  9:47 UTC (permalink / raw)
  To: Yanan Wang, qemu-devel, qemu-arm, Andrew Jones, David Gibson
  Cc: Peter Maydell, Michael S . Tsirkin, Shannon Zhao, Igor Mammedov,
	Alistair Francis, prime.zeng, yangyicong, yuzenghui,
	wanghaibin.wang, zhukeqian1, Jiajie Li

Hi Yanan, Drew,

On 4/13/21 10:07 AM, Yanan Wang wrote:
> From: Andrew Jones <drjones@redhat.com>
> 
> Support device tree CPU topology descriptions.
> 
> Signed-off-by: Andrew Jones <drjones@redhat.com>
> Signed-off-by: Yanan Wang <wangyanan55@huawei.com>
> ---
>  hw/arm/virt.c         | 41 ++++++++++++++++++++++++++++++++++++++++-
>  include/hw/arm/virt.h |  1 +
>  2 files changed, 41 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/arm/virt.c b/hw/arm/virt.c
> index 9f01d9041b..f4ae60ded9 100644
> --- a/hw/arm/virt.c
> +++ b/hw/arm/virt.c
> @@ -352,10 +352,11 @@ static void fdt_add_cpu_nodes(const VirtMachineState *vms)
>      int cpu;
>      int addr_cells = 1;
>      const MachineState *ms = MACHINE(vms);
> +    const VirtMachineClass *vmc = VIRT_MACHINE_GET_CLASS(vms);
>      int smp_cpus = ms->smp.cpus;
>  
>      /*
> -     * From Documentation/devicetree/bindings/arm/cpus.txt
> +     *  See Linux Documentation/devicetree/bindings/arm/cpus.yaml
>       *  On ARM v8 64-bit systems value should be set to 2,
>       *  that corresponds to the MPIDR_EL1 register size.
>       *  If MPIDR_EL1[63:32] value is equal to 0 on all CPUs
> @@ -408,8 +409,45 @@ static void fdt_add_cpu_nodes(const VirtMachineState *vms)
>                  ms->possible_cpus->cpus[cs->cpu_index].props.node_id);
>          }
>  
> +        if (ms->smp.cpus > 1 && !vmc->no_cpu_topology) {
> +            qemu_fdt_setprop_cell(ms->fdt, nodename, "phandle",
> +                                  qemu_fdt_alloc_phandle(ms->fdt));
> +        }
> +
>          g_free(nodename);
>      }
> +
> +    if (ms->smp.cpus > 1 && !vmc->no_cpu_topology) {
> +        /*
> +         * See Linux Documentation/devicetree/bindings/cpu/cpu-topology.txt
> +         * In a SMP system, the hierarchy of CPUs is defined through four
> +         * entities that are used to describe the layout of physical CPUs
> +         * in the system: socket/cluster/core/thread.
> +         */
> +        qemu_fdt_add_subnode(ms->fdt, "/cpus/cpu-map");
> +
> +        for (cpu = ms->smp.cpus - 1; cpu >= 0; cpu--) {
> +            char *cpu_path = g_strdup_printf("/cpus/cpu@%d", cpu);
> +            char *map_path;
> +
> +            if (ms->smp.threads > 1) {
> +                map_path = g_strdup_printf(
> +                    "/cpus/cpu-map/%s%d/%s%d/%s%d",
> +                    "socket", cpu / (ms->smp.cores * ms->smp.threads),
> +                    "core", (cpu / ms->smp.threads) % ms->smp.cores,
> +                    "thread", cpu % ms->smp.threads);
> +            } else {
> +                map_path = g_strdup_printf(
> +                    "/cpus/cpu-map/%s%d/%s%d",
> +                    "socket", cpu / ms->smp.cores,
> +                    "core", cpu % ms->smp.cores);
> +            }
> +            qemu_fdt_add_path(ms->fdt, map_path);
> +            qemu_fdt_setprop_phandle(ms->fdt, map_path, "cpu", cpu_path);
> +            g_free(map_path);
> +            g_free(cpu_path);
> +        }
> +    }
>  }
>  
>  static void fdt_add_its_gic_node(VirtMachineState *vms)
> @@ -2769,6 +2807,7 @@ static void virt_machine_5_2_options(MachineClass *mc)
>      virt_machine_6_0_options(mc);
>      compat_props_add(mc->compat_props, hw_compat_5_2, hw_compat_5_2_len);
>      vmc->no_secure_gpio = true;
> +    vmc->no_cpu_topology = true;

Bear with me because "machine versioning" is something new to me; I was
expecting it to be related only to migrated fields.
Why do we need to care about not adding the FDT node in older machines?
Shouldn't the guest skip unknown FDT nodes?

Thanks,

Phil.

>  }
>  DEFINE_VIRT_MACHINE(5, 2)
>  
> diff --git a/include/hw/arm/virt.h b/include/hw/arm/virt.h
> index 921416f918..4a4b98e4a7 100644
> --- a/include/hw/arm/virt.h
> +++ b/include/hw/arm/virt.h
> @@ -129,6 +129,7 @@ struct VirtMachineClass {
>      bool no_kvm_steal_time;
>      bool acpi_expose_flash;
>      bool no_secure_gpio;
> +    bool no_cpu_topology;
>  };
>  
>  struct VirtMachineState {
> 



^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [RFC PATCH v2 2/6] hw/arm/virt: DT: Add cpu-map
  2021-04-27  9:47   ` Philippe Mathieu-Daudé
@ 2021-04-27 10:04     ` Andrew Jones
  2021-04-27 12:36       ` Philippe Mathieu-Daudé
  0 siblings, 1 reply; 40+ messages in thread
From: Andrew Jones @ 2021-04-27 10:04 UTC (permalink / raw)
  To: Philippe Mathieu-Daudé
  Cc: Peter Maydell, Michael S . Tsirkin, qemu-devel, Yanan Wang,
	Shannon Zhao, Igor Mammedov, qemu-arm, Alistair Francis,
	prime.zeng, yangyicong, yuzenghui, wanghaibin.wang, zhukeqian1,
	Jiajie Li, David Gibson

On Tue, Apr 27, 2021 at 11:47:17AM +0200, Philippe Mathieu-Daudé wrote:
> Hi Yanan, Drew,
> 
> On 4/13/21 10:07 AM, Yanan Wang wrote:
> > From: Andrew Jones <drjones@redhat.com>
> > 
> > Support device tree CPU topology descriptions.
> > 
> > Signed-off-by: Andrew Jones <drjones@redhat.com>
> > Signed-off-by: Yanan Wang <wangyanan55@huawei.com>
> > ---
> >  hw/arm/virt.c         | 41 ++++++++++++++++++++++++++++++++++++++++-
> >  include/hw/arm/virt.h |  1 +
> >  2 files changed, 41 insertions(+), 1 deletion(-)
> > 
> > diff --git a/hw/arm/virt.c b/hw/arm/virt.c
> > index 9f01d9041b..f4ae60ded9 100644
> > --- a/hw/arm/virt.c
> > +++ b/hw/arm/virt.c
> > @@ -352,10 +352,11 @@ static void fdt_add_cpu_nodes(const VirtMachineState *vms)
> >      int cpu;
> >      int addr_cells = 1;
> >      const MachineState *ms = MACHINE(vms);
> > +    const VirtMachineClass *vmc = VIRT_MACHINE_GET_CLASS(vms);
> >      int smp_cpus = ms->smp.cpus;
> >  
> >      /*
> > -     * From Documentation/devicetree/bindings/arm/cpus.txt
> > +     *  See Linux Documentation/devicetree/bindings/arm/cpus.yaml
> >       *  On ARM v8 64-bit systems value should be set to 2,
> >       *  that corresponds to the MPIDR_EL1 register size.
> >       *  If MPIDR_EL1[63:32] value is equal to 0 on all CPUs
> > @@ -408,8 +409,45 @@ static void fdt_add_cpu_nodes(const VirtMachineState *vms)
> >                  ms->possible_cpus->cpus[cs->cpu_index].props.node_id);
> >          }
> >  
> > +        if (ms->smp.cpus > 1 && !vmc->no_cpu_topology) {
> > +            qemu_fdt_setprop_cell(ms->fdt, nodename, "phandle",
> > +                                  qemu_fdt_alloc_phandle(ms->fdt));
> > +        }
> > +
> >          g_free(nodename);
> >      }
> > +
> > +    if (ms->smp.cpus > 1 && !vmc->no_cpu_topology) {
> > +        /*
> > +         * See Linux Documentation/devicetree/bindings/cpu/cpu-topology.txt
> > +         * In a SMP system, the hierarchy of CPUs is defined through four
> > +         * entities that are used to describe the layout of physical CPUs
> > +         * in the system: socket/cluster/core/thread.
> > +         */
> > +        qemu_fdt_add_subnode(ms->fdt, "/cpus/cpu-map");
> > +
> > +        for (cpu = ms->smp.cpus - 1; cpu >= 0; cpu--) {
> > +            char *cpu_path = g_strdup_printf("/cpus/cpu@%d", cpu);
> > +            char *map_path;
> > +
> > +            if (ms->smp.threads > 1) {
> > +                map_path = g_strdup_printf(
> > +                    "/cpus/cpu-map/%s%d/%s%d/%s%d",
> > +                    "socket", cpu / (ms->smp.cores * ms->smp.threads),
> > +                    "core", (cpu / ms->smp.threads) % ms->smp.cores,
> > +                    "thread", cpu % ms->smp.threads);
> > +            } else {
> > +                map_path = g_strdup_printf(
> > +                    "/cpus/cpu-map/%s%d/%s%d",
> > +                    "socket", cpu / ms->smp.cores,
> > +                    "core", cpu % ms->smp.cores);
> > +            }
> > +            qemu_fdt_add_path(ms->fdt, map_path);
> > +            qemu_fdt_setprop_phandle(ms->fdt, map_path, "cpu", cpu_path);
> > +            g_free(map_path);
> > +            g_free(cpu_path);
> > +        }
> > +    }
> >  }
> >  
> >  static void fdt_add_its_gic_node(VirtMachineState *vms)
> > @@ -2769,6 +2807,7 @@ static void virt_machine_5_2_options(MachineClass *mc)
> >      virt_machine_6_0_options(mc);
> >      compat_props_add(mc->compat_props, hw_compat_5_2, hw_compat_5_2_len);
> >      vmc->no_secure_gpio = true;
> > +    vmc->no_cpu_topology = true;
> 
> Bear with me because "machine versioning" is something new to me; I was
> expecting it to be related only to migrated fields.
> Why do we need to care about not adding the FDT node in older machines?
> Shouldn't the guest skip unknown FDT nodes?

It probably should; the question is whether it would. Also, the nodes may
not be unknown, so the guest will read the information and set up its
topology as instructed. That topology may not be the same as what was
getting used by default without the topology description. It's possible
that a user's application has a dependency on the topology, and if that
topology gets changed under its feet, it'll behave differently.
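
For the record, the decomposition encoded in those cpu-map paths is plain
integer arithmetic over a dense cpu index; a standalone sketch (the helper
names below are made up for illustration, not QEMU code):

```c
#include <assert.h>

/*
 * Hypothetical helpers mirroring the index math in fdt_add_cpu_nodes():
 * cpu indices are dense, with threads varying fastest, then cores,
 * then sockets.
 */
static int topo_socket(int cpu, int cores, int threads)
{
    return cpu / (cores * threads);
}

static int topo_core(int cpu, int cores, int threads)
{
    return (cpu / threads) % cores;
}

static int topo_thread(int cpu, int threads)
{
    return cpu % threads;
}
```

E.g. with -smp 8,sockets=2,cores=2,threads=2, cpu 5 lands in
/cpus/cpu-map/socket1/core0/thread1.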

In short, machine versioning isn't just about vmstate, it's also about
keeping a machine type looking the same to the guest.

Now, it's possible that we're being overly cautious here, but this compat
variable doesn't complicate the code too much, so I think I'd rather use it
than not.

Thanks,
drew



^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [RFC PATCH v2 2/6] hw/arm/virt: DT: Add cpu-map
  2021-04-27 10:04     ` Andrew Jones
@ 2021-04-27 12:36       ` Philippe Mathieu-Daudé
  2021-04-28  6:36         ` wangyanan (Y)
  0 siblings, 1 reply; 40+ messages in thread
From: Philippe Mathieu-Daudé @ 2021-04-27 12:36 UTC (permalink / raw)
  To: Andrew Jones
  Cc: Peter Maydell, Michael S . Tsirkin, qemu-devel, Yanan Wang,
	Shannon Zhao, Igor Mammedov, qemu-arm, Alistair Francis,
	prime.zeng, yangyicong, yuzenghui, wanghaibin.wang, zhukeqian1,
	Jiajie Li, David Gibson

On 4/27/21 12:04 PM, Andrew Jones wrote:
> On Tue, Apr 27, 2021 at 11:47:17AM +0200, Philippe Mathieu-Daudé wrote:
>> Hi Yanan, Drew,
>>
>> On 4/13/21 10:07 AM, Yanan Wang wrote:
>>> From: Andrew Jones <drjones@redhat.com>
>>>
>>> Support device tree CPU topology descriptions.
>>>
>>> Signed-off-by: Andrew Jones <drjones@redhat.com>
>>> Signed-off-by: Yanan Wang <wangyanan55@huawei.com>
>>> ---
>>>  hw/arm/virt.c         | 41 ++++++++++++++++++++++++++++++++++++++++-
>>>  include/hw/arm/virt.h |  1 +
>>>  2 files changed, 41 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/hw/arm/virt.c b/hw/arm/virt.c
>>> index 9f01d9041b..f4ae60ded9 100644
>>> --- a/hw/arm/virt.c
>>> +++ b/hw/arm/virt.c
>>> @@ -352,10 +352,11 @@ static void fdt_add_cpu_nodes(const VirtMachineState *vms)
>>>      int cpu;
>>>      int addr_cells = 1;
>>>      const MachineState *ms = MACHINE(vms);
>>> +    const VirtMachineClass *vmc = VIRT_MACHINE_GET_CLASS(vms);
>>>      int smp_cpus = ms->smp.cpus;
>>>  
>>>      /*
>>> -     * From Documentation/devicetree/bindings/arm/cpus.txt
>>> +     *  See Linux Documentation/devicetree/bindings/arm/cpus.yaml
>>>       *  On ARM v8 64-bit systems value should be set to 2,
>>>       *  that corresponds to the MPIDR_EL1 register size.
>>>       *  If MPIDR_EL1[63:32] value is equal to 0 on all CPUs
>>> @@ -408,8 +409,45 @@ static void fdt_add_cpu_nodes(const VirtMachineState *vms)
>>>                  ms->possible_cpus->cpus[cs->cpu_index].props.node_id);
>>>          }
>>>  
>>> +        if (ms->smp.cpus > 1 && !vmc->no_cpu_topology) {
>>> +            qemu_fdt_setprop_cell(ms->fdt, nodename, "phandle",
>>> +                                  qemu_fdt_alloc_phandle(ms->fdt));
>>> +        }
>>> +
>>>          g_free(nodename);
>>>      }
>>> +
>>> +    if (ms->smp.cpus > 1 && !vmc->no_cpu_topology) {
>>> +        /*
>>> +         * See Linux Documentation/devicetree/bindings/cpu/cpu-topology.txt
>>> +         * In a SMP system, the hierarchy of CPUs is defined through four
>>> +         * entities that are used to describe the layout of physical CPUs
>>> +         * in the system: socket/cluster/core/thread.
>>> +         */
>>> +        qemu_fdt_add_subnode(ms->fdt, "/cpus/cpu-map");
>>> +
>>> +        for (cpu = ms->smp.cpus - 1; cpu >= 0; cpu--) {
>>> +            char *cpu_path = g_strdup_printf("/cpus/cpu@%d", cpu);
>>> +            char *map_path;
>>> +
>>> +            if (ms->smp.threads > 1) {
>>> +                map_path = g_strdup_printf(
>>> +                    "/cpus/cpu-map/%s%d/%s%d/%s%d",
>>> +                    "socket", cpu / (ms->smp.cores * ms->smp.threads),
>>> +                    "core", (cpu / ms->smp.threads) % ms->smp.cores,
>>> +                    "thread", cpu % ms->smp.threads);
>>> +            } else {
>>> +                map_path = g_strdup_printf(
>>> +                    "/cpus/cpu-map/%s%d/%s%d",
>>> +                    "socket", cpu / ms->smp.cores,
>>> +                    "core", cpu % ms->smp.cores);
>>> +            }
>>> +            qemu_fdt_add_path(ms->fdt, map_path);
>>> +            qemu_fdt_setprop_phandle(ms->fdt, map_path, "cpu", cpu_path);
>>> +            g_free(map_path);
>>> +            g_free(cpu_path);
>>> +        }
>>> +    }
>>>  }
>>>  
>>>  static void fdt_add_its_gic_node(VirtMachineState *vms)
>>> @@ -2769,6 +2807,7 @@ static void virt_machine_5_2_options(MachineClass *mc)
>>>      virt_machine_6_0_options(mc);
>>>      compat_props_add(mc->compat_props, hw_compat_5_2, hw_compat_5_2_len);
>>>      vmc->no_secure_gpio = true;
>>> +    vmc->no_cpu_topology = true;
>>
>> Bear with me because "machine versioning" is something new to me; I was
>> expecting it to be related only to migrated fields.
>> Why do we need to care about not adding the FDT node in older machines?
>> Shouldn't the guest skip unknown FDT nodes?
> 
> It probably should; the question is whether it would. Also, the nodes may
> not be unknown, so the guest will read the information and set up its
> topology as instructed. That topology may not be the same as what was
> getting used by default without the topology description. It's possible
> that a user's application has a dependency on the topology, and if that
> topology gets changed under its feet, it'll behave differently.

[*]

I see.

> In short, machine versioning isn't just about vmstate, it's also about
> keeping a machine type looking the same to the guest.

Yes, TIL.

> Now, it's possible that we're being overly cautious here, but this compat
> variable doesn't complicate the code too much, so I think I'd rather use it
> than not.

No problem. Could you or Yanan add your first paragraph ([*], reworded)
to the commit description? I don't think a comment in the code is
useful, but having it in the commit message is helpful IMO.

Thanks,

Phil.



^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [RFC PATCH v2 3/6] hw/arm/virt-acpi-build: Distinguish possible and present cpus
  2021-04-13  8:07 ` [RFC PATCH v2 3/6] hw/arm/virt-acpi-build: Distinguish possible and present cpus Yanan Wang
@ 2021-04-27 13:18   ` Andrew Jones
  2021-04-28  6:42     ` wangyanan (Y)
  2021-04-27 14:50   ` Andrew Jones
  1 sibling, 1 reply; 40+ messages in thread
From: Andrew Jones @ 2021-04-27 13:18 UTC (permalink / raw)
  To: Yanan Wang
  Cc: Peter Maydell, Michael S . Tsirkin, wanghaibin.wang, qemu-devel,
	Shannon Zhao, qemu-arm, Alistair Francis, prime.zeng, yangyicong,
	yuzenghui, Igor Mammedov, Ying Fang, zhukeqian1, Jiajie Li,
	David Gibson

On Tue, Apr 13, 2021 at 04:07:42PM +0800, Yanan Wang wrote:
> From: Ying Fang <fangying1@huawei.com>
> 
> When building ACPI tables regarding CPUs we should always build
> them for the number of possible CPUs, not the number of present
> CPUs. We then ensure only the present CPUs are enabled in MADT.
> Furthermore, it is also needed if we are going to support CPU
> hotplug in the future.
> 
> This patch is a rework based on Andrew Jones's contribution at
> https://lists.gnu.org/archive/html/qemu-arm/2018-07/msg00076.html
> 
> Signed-off-by: Ying Fang <fangying1@huawei.com>
> Signed-off-by: Yanan Wang <wangyanan55@huawei.com>
> ---
>  hw/arm/virt-acpi-build.c | 14 ++++++++++----
>  hw/arm/virt.c            |  3 +++
>  2 files changed, 13 insertions(+), 4 deletions(-)
> 
> diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
> index f5a2b2d4cb..2ad5dad1bf 100644
> --- a/hw/arm/virt-acpi-build.c
> +++ b/hw/arm/virt-acpi-build.c
> @@ -61,13 +61,16 @@
>  
>  static void acpi_dsdt_add_cpus(Aml *scope, VirtMachineState *vms)
>  {
> -    MachineState *ms = MACHINE(vms);
> +    CPUArchIdList *possible_cpus = MACHINE(vms)->possible_cpus;
>      uint16_t i;
>  
> -    for (i = 0; i < ms->smp.cpus; i++) {
> +    for (i = 0; i < possible_cpus->len; i++) {
>          Aml *dev = aml_device("C%.03X", i);
>          aml_append(dev, aml_name_decl("_HID", aml_string("ACPI0007")));
>          aml_append(dev, aml_name_decl("_UID", aml_int(i)));
> +        if (possible_cpus->cpus[i].cpu == NULL) {
> +            aml_append(dev, aml_name_decl("_STA", aml_int(0)));
> +        }
>          aml_append(scope, dev);
>      }
>  }
> @@ -479,6 +482,7 @@ build_madt(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
>      const int *irqmap = vms->irqmap;
>      AcpiMadtGenericDistributor *gicd;
>      AcpiMadtGenericMsiFrame *gic_msi;
> +    CPUArchIdList *possible_cpus = MACHINE(vms)->possible_cpus;
>      int i;
>  
>      acpi_data_push(table_data, sizeof(AcpiMultipleApicTable));
> @@ -489,7 +493,7 @@ build_madt(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
>      gicd->base_address = cpu_to_le64(memmap[VIRT_GIC_DIST].base);
>      gicd->version = vms->gic_version;
>  
> -    for (i = 0; i < MACHINE(vms)->smp.cpus; i++) {
> +    for (i = 0; i < possible_cpus->len; i++) {
>          AcpiMadtGenericCpuInterface *gicc = acpi_data_push(table_data,
>                                                             sizeof(*gicc));
>          ARMCPU *armcpu = ARM_CPU(qemu_get_cpu(i));
> @@ -504,7 +508,9 @@ build_madt(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
>          gicc->cpu_interface_number = cpu_to_le32(i);
>          gicc->arm_mpidr = cpu_to_le64(armcpu->mp_affinity);
>          gicc->uid = cpu_to_le32(i);
> -        gicc->flags = cpu_to_le32(ACPI_MADT_GICC_ENABLED);
> +        if (possible_cpus->cpus[i].cpu != NULL) {
> +            gicc->flags = cpu_to_le32(ACPI_MADT_GICC_ENABLED);
> +        }
>  
>          if (arm_feature(&armcpu->env, ARM_FEATURE_PMU)) {
>              gicc->performance_interrupt = cpu_to_le32(PPI(VIRTUAL_PMU_IRQ));
> diff --git a/hw/arm/virt.c b/hw/arm/virt.c
> index f4ae60ded9..3e5d9b6f26 100644
> --- a/hw/arm/virt.c
> +++ b/hw/arm/virt.c
> @@ -2063,6 +2063,9 @@ static void machvirt_init(MachineState *machine)
>          }
>  
>          qdev_realize(DEVICE(cpuobj), NULL, &error_fatal);
> +
> +        /* Initialize cpu member here since cpu hotplug is not supported yet */
> +        machine->possible_cpus->cpus[n].cpu = cpuobj;

Can drop the 'machine->' as possible_cpus is already set to the pointer.

>          object_unref(cpuobj);
>      }
>      fdt_add_timer_nodes(vms);
> -- 
> 2.19.1
> 

Thanks,
drew



^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [RFC PATCH v2 4/6] hw/acpi/aml-build: Add processor hierarchy node structure
  2021-04-13  8:07 ` [RFC PATCH v2 4/6] hw/acpi/aml-build: Add processor hierarchy node structure Yanan Wang
@ 2021-04-27 13:37   ` Andrew Jones
  2021-04-28  6:59     ` wangyanan (Y)
  0 siblings, 1 reply; 40+ messages in thread
From: Andrew Jones @ 2021-04-27 13:37 UTC (permalink / raw)
  To: Yanan Wang
  Cc: Peter Maydell, Henglong Fan, Michael S . Tsirkin,
	wanghaibin.wang, qemu-devel, Shannon Zhao, qemu-arm,
	Alistair Francis, prime.zeng, yangyicong, yuzenghui,
	Igor Mammedov, Ying Fang, zhukeqian1, Jiajie Li, David Gibson

On Tue, Apr 13, 2021 at 04:07:43PM +0800, Yanan Wang wrote:
> Add a generic API to build Processor Hierarchy Node Structure(Type 0),
> which is strictly consistent with descriptions in ACPI 6.3: 5.2.29.1.
> 
> This function will be used to build ACPI PPTT table for cpu topology.
> 
> Signed-off-by: Ying Fang <fangying1@huawei.com>
> Signed-off-by: Henglong Fan <fanhenglong@huawei.com>
> Signed-off-by: Yanan Wang <wangyanan55@huawei.com>
> ---
>  hw/acpi/aml-build.c         | 27 +++++++++++++++++++++++++++
>  include/hw/acpi/aml-build.h |  4 ++++
>  2 files changed, 31 insertions(+)
> 
> diff --git a/hw/acpi/aml-build.c b/hw/acpi/aml-build.c
> index d33ce8954a..75e01aea17 100644
> --- a/hw/acpi/aml-build.c
> +++ b/hw/acpi/aml-build.c
> @@ -1916,6 +1916,33 @@ void build_slit(GArray *table_data, BIOSLinker *linker, MachineState *ms,
>                   table_data->len - slit_start, 1, oem_id, oem_table_id);
>  }
>  
> +/*
> + * ACPI 6.3: 5.2.29.1 Processor Hierarchy Node Structure (Type 0)
             ^ Doesn't this table show up in 6.2 first? We should always
use the oldest specification we can.

Also, please don't capitalize Hierarchy, Node, and Structure. Those words
are not capitalized in the spec section name and we want an exact match
here.

> + */
> +void build_processor_hierarchy_node(GArray *tbl, uint32_t flags,
> +                                    uint32_t parent, uint32_t id,
> +                                    uint32_t *priv_rsrc, uint32_t priv_num)
> +{
> +    int i;
> +
> +    build_append_byte(tbl, 0);                 /* Type 0 - processor */
> +    build_append_byte(tbl, 20 + priv_num * 4); /* Length */
> +    build_append_int_noprefix(tbl, 0, 2);      /* Reserved */
> +    build_append_int_noprefix(tbl, flags, 4);  /* Flags */
> +    build_append_int_noprefix(tbl, parent, 4); /* Parent */
> +    build_append_int_noprefix(tbl, id, 4);     /* ACPI processor ID */
                                                          ^ should be
capitalized like in the spec
                        
> +
> +    /* Number of private resources */
> +    build_append_int_noprefix(tbl, priv_num, 4);
> +
> +    /* Private resources[N] */
> +    if (priv_num > 0 && priv_rsrc != NULL) {

Since we should never have priv_num > 0 and priv_rsrc == NULL, we can do

   if (priv_num > 0) {
       assert(priv_rsrc);
       ...

> +        for (i = 0; i < priv_num; i++) {
> +            build_append_int_noprefix(tbl, priv_rsrc[i], 4);
> +        }
> +    }
> +}
> +
>  /* build rev1/rev3/rev5.1 FADT */ 
>  void build_fadt(GArray *tbl, BIOSLinker *linker, const AcpiFadtData *f,
>                  const char *oem_id, const char *oem_table_id)
> diff --git a/include/hw/acpi/aml-build.h b/include/hw/acpi/aml-build.h
> index 471266d739..ea74b8f6ed 100644
> --- a/include/hw/acpi/aml-build.h
> +++ b/include/hw/acpi/aml-build.h
> @@ -462,6 +462,10 @@ void build_srat_memory(AcpiSratMemoryAffinity *numamem, uint64_t base,
>  void build_slit(GArray *table_data, BIOSLinker *linker, MachineState *ms,
>                  const char *oem_id, const char *oem_table_id);
>  
> +void build_processor_hierarchy_node(GArray *tbl, uint32_t flags,
> +                                    uint32_t parent, uint32_t id,
> +                                    uint32_t *priv_rsrc, uint32_t priv_num);
> +
>  void build_fadt(GArray *tbl, BIOSLinker *linker, const AcpiFadtData *f,
>                  const char *oem_id, const char *oem_table_id);
>  
> -- 
> 2.19.1
>

Thanks,
drew 



^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [RFC PATCH v2 5/6] hw/arm/virt-acpi-build: Add PPTT table
  2021-04-13  8:07 ` [RFC PATCH v2 5/6] hw/arm/virt-acpi-build: Add PPTT table Yanan Wang
@ 2021-04-27 14:16   ` Andrew Jones
  2021-04-28  7:30     ` wangyanan (Y)
  2021-05-13  5:10   ` wangyanan (Y)
  1 sibling, 1 reply; 40+ messages in thread
From: Andrew Jones @ 2021-04-27 14:16 UTC (permalink / raw)
  To: Yanan Wang
  Cc: Peter Maydell, Michael S . Tsirkin, wanghaibin.wang, qemu-devel,
	Shannon Zhao, qemu-arm, Alistair Francis, prime.zeng, yangyicong,
	yuzenghui, Igor Mammedov, Ying Fang, zhukeqian1, Jiajie Li,
	David Gibson

On Tue, Apr 13, 2021 at 04:07:44PM +0800, Yanan Wang wrote:
> Add the Processor Properties Topology Table (PPTT) to present
> CPU topology information to ACPI guests. Note, while a DT boot
> Linux guest with a non-flat CPU topology will see socket and
> core IDs being sequential integers starting from zero, e.g.
> with -smp 4,sockets=2,cores=2,threads=1
> 
> a DT boot produces
> 
>  cpu:  0 package_id:  0 core_id:  0
>  cpu:  1 package_id:  0 core_id:  1
>  cpu:  2 package_id:  1 core_id:  0
>  cpu:  3 package_id:  1 core_id:  1
> 
> an ACPI boot produces
> 
>  cpu:  0 package_id: 36 core_id:  0
>  cpu:  1 package_id: 36 core_id:  1
>  cpu:  2 package_id: 96 core_id:  2
>  cpu:  3 package_id: 96 core_id:  3
> 
> This is due to several reasons:
> 
>  1) DT cpu nodes do not have an equivalent field to what the PPTT
>     ACPI Processor ID must be, i.e. something equal to the MADT CPU
>     UID or equal to the UID of an ACPI processor container. In both
>     ACPI cases those are platform dependant IDs assigned by the
>     vendor.
> 
>  2) While QEMU is the vendor for a guest, if the topology specifies
>     SMT (> 1 thread), then, with ACPI, it is impossible to assign a
>     core-id the same value as a package-id, thus it is not possible
>     to have package-id=0 and core-id=0. This is because package and
>     core containers must be in the same ACPI namespace and therefore
>     must have unique UIDs.
> 
>  3) ACPI processor containers are not required for PPTT tables to
>     be used and, due to the limitations of which IDs are selected
>     described above in (2), they are not helpful for QEMU, so we
>     don't build them with this patch. In the absence of them, Linux
>     assigns its own unique IDs. The maintainers have chosen not to use
>     counters from zero, but rather ACPI table offsets, which explains
>     why the numbers are so much larger than with DT.
> 
>  4) When there is no SMT (threads=1) the core IDs for ACPI boot guests
>     match the logical CPU IDs, because these IDs must be equal to the
>     MADT CPU UID (as no processor containers are present), and QEMU
>     uses the logical CPU ID for these MADT IDs.
> 
> Tested-by: Jiajie Li <lijiajie11@huawei.com>
> Signed-off-by: Andrew Jones <drjones@redhat.com>
> Signed-off-by: Ying Fang <fangying1@huawei.com>
> Signed-off-by: Yanan Wang <wangyanan55@huawei.com>
> ---
>  hw/arm/virt-acpi-build.c | 63 ++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 63 insertions(+)
> 
> diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
> index 2ad5dad1bf..03fd812d5a 100644
> --- a/hw/arm/virt-acpi-build.c
> +++ b/hw/arm/virt-acpi-build.c
> @@ -436,6 +436,64 @@ build_srat(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
>                   vms->oem_table_id);
>  }
>  
> +/* PPTT */

Please point out the ACPI spec section "5.2.29 Processor Properties
Topology Table"

> +static void
> +build_pptt(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)

QEMU doesn't do this style, please write as

static void build_pptt(GArray *table_data, BIOSLinker *linker,
                       VirtMachineState *vms)

> +{
> +    int pptt_start = table_data->len;
> +    int uid = 0, cpus = 0, socket = 0;
> +    MachineState *ms = MACHINE(vms);
> +    unsigned int smp_cores = ms->smp.cores;
> +    unsigned int smp_threads = ms->smp.threads;
> +
> +    acpi_data_push(table_data, sizeof(AcpiTableHeader));
> +
> +    for (socket = 0; cpus < ms->possible_cpus->len; socket++) {

Why not iterate from zero to ms->smp.sockets? With this type of loop if
the number of sockets doesn't correctly fit the number of possible cpus,
then you'll magically create new sockets that the user didn't want. That
case shouldn't be able to happen, though, because the smp parsing should
catch it. In any case, iterating sockets from zero to their number would
make more sense.
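
Roughly, the bounded loop shape I have in mind looks like the sketch below,
with the PPTT node emission replaced by a leaf-uid counter so the structure
stands alone (none of this is the actual QEMU code):

```c
#include <assert.h>

/*
 * Sketch: iterate sockets 0..sockets-1 instead of deriving the socket
 * count from possible_cpus; each call to build_processor_hierarchy_node()
 * for a leaf is replaced by incrementing a uid counter.
 */
static int count_pptt_leaves(int sockets, int cores, int threads)
{
    int uid = 0;

    for (int socket = 0; socket < sockets; socket++) {
        for (int core = 0; core < cores; core++) {
            if (threads <= 1) {
                uid++;          /* the core node is the leaf */
            } else {
                for (int thread = 0; thread < threads; thread++) {
                    uid++;      /* each thread node is a leaf */
                }
            }
        }
    }
    return uid;
}
```

The leaf count then always equals sockets * cores * threads, i.e. the
number of possible cpus, with no sockets invented on the fly.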

> +        uint32_t socket_offset = table_data->len - pptt_start;
> +        int core;
> +
> +        build_processor_hierarchy_node(
> +            table_data, 1, /* Physical package */

If we want to pass the flags with in-argument-list comments, then please
make sure the flags are on separate lines. See below.

> +            0, socket, /* No parent */
> +            NULL, 0);  /* No private resources */

We don't need the 'No parent' and 'No private resources' comments.

        build_processor_hierarchy_node(table_data,
            (1 << 0), /* ACPI 6.2: Physical package */
            0, socket, NULL, 0);

> +
> +        for (core = 0; core < smp_cores; core++) {
> +            uint32_t core_offset = table_data->len - pptt_start;
> +            int thread;
> +
> +            if (smp_threads <= 1) {
> +                build_processor_hierarchy_node(
> +                    table_data,
> +                    (1 << 1) | /* ACPI Processor ID valid */
> +                    (1 << 3),  /* ACPI 6.3 - Node is a Leaf */
> +                    socket_offset, uid++, /* Parent is a Socket */
> +                    NULL, 0);  /* No private resources */

Now I see why you were calling out 6.3 in the previous patch. I suggest
keeping the reference to 6.2 in the function from the previous patch, but
also keeping the reference to 6.3 here, as you already do:

                build_processor_hierarchy_node(table_data,
                    (1 << 1) | /* ACPI Processor ID valid */
                    (1 << 3),  /* ACPI 6.3 - Node is a Leaf */
                    socket_offset, uid++, NULL, 0);

> +            } else {
> +                build_processor_hierarchy_node(
> +                    table_data, 0,
> +                    socket_offset, core, /* Parent is a Socket */
> +                    NULL, 0); /* No private resources */

No need for these in-argument-comments that don't match up with the spec.

> +
> +                for (thread = 0; thread < smp_threads; thread++) {
> +                    build_processor_hierarchy_node(
> +                        table_data,
> +                        (1 << 1) | /* ACPI Processor ID valid */
> +                        (1 << 2) | /* ACPI 6.3 - Processor is a Thread */
> +                        (1 << 3),  /* ACPI 6.3 - Node is a Leaf */

This looks good.

> +                        core_offset, uid++, /* Parent is a Core */
> +                        NULL, 0);  /* No private resources */

Don't need these comments.

> +                }
> +            }
> +        }
> +        cpus += smp_cores * smp_threads;

As stated above, we don't want this.

> +    }
> +
> +    build_header(linker, table_data,
> +                 (void *)(table_data->data + pptt_start), "PPTT",
> +                 table_data->len - pptt_start, 2,
> +                 vms->oem_id, vms->oem_table_id);
> +}
> +
>  /* GTDT */
>  static void
>  build_gtdt(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
> @@ -707,6 +765,11 @@ void virt_acpi_build(VirtMachineState *vms, AcpiBuildTables *tables)
>      acpi_add_table(table_offsets, tables_blob);
>      build_madt(tables_blob, tables->linker, vms);
>  
> +    if (ms->smp.cpus > 1 && !vmc->no_cpu_topology) {
> +        acpi_add_table(table_offsets, tables_blob);
> +        build_pptt(tables_blob, tables->linker, vms);
> +    }
> +
>      acpi_add_table(table_offsets, tables_blob);
>      build_gtdt(tables_blob, tables->linker, vms);
>  
> -- 
> 2.19.1
>

Apart from some changes that I think should be reverted, and the 6.3
flags, this patch looks very similar to [1], so I'd prefer my
authorship be maintained. However, if my authorship is dropped, then
my s-o-b should be replaced with a Co-developed-by.

[1] https://github.com/rhdrjones/qemu/commit/439b38d67ca1f2cbfa5b9892a822b651ebd05c11 

Thanks,
drew



^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [RFC PATCH v2 3/6] hw/arm/virt-acpi-build: Distinguish possible and present cpus
  2021-04-13  8:07 ` [RFC PATCH v2 3/6] hw/arm/virt-acpi-build: Distinguish possible and present cpus Yanan Wang
  2021-04-27 13:18   ` Andrew Jones
@ 2021-04-27 14:50   ` Andrew Jones
  2021-04-28  6:47     ` wangyanan (Y)
  1 sibling, 1 reply; 40+ messages in thread
From: Andrew Jones @ 2021-04-27 14:50 UTC (permalink / raw)
  To: Yanan Wang
  Cc: Peter Maydell, Michael S . Tsirkin, wanghaibin.wang, qemu-devel,
	Shannon Zhao, qemu-arm, Alistair Francis, prime.zeng, yangyicong,
	yuzenghui, Igor Mammedov, Ying Fang, zhukeqian1, Jiajie Li,
	David Gibson

On Tue, Apr 13, 2021 at 04:07:42PM +0800, Yanan Wang wrote:
> From: Ying Fang <fangying1@huawei.com>
> 
> When building ACPI tables regarding CPUs we should always build
> them for the number of possible CPUs, not the number of present
> CPUs. We then ensure only the present CPUs are enabled in MADT.
> Furthermore, it is also needed if we are going to support CPU
> hotplug in the future.
> 
> This patch is a rework based on Andrew Jones's contribution at
> https://lists.gnu.org/archive/html/qemu-arm/2018-07/msg00076.html

Thank you for this credit, but I think I'd prefer a Co-developed-by
tag instead, if you don't mind.

Thanks,
drew



^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [RFC PATCH v2 6/6] hw/arm/virt: Replace smp_parse with one that prefers cores
  2021-04-13  8:07 ` [RFC PATCH v2 6/6] hw/arm/virt: Replace smp_parse with one that prefers cores Yanan Wang
@ 2021-04-27 14:58   ` Andrew Jones
  2021-04-28  8:04     ` wangyanan (Y)
  2021-04-28  9:36     ` wangyanan (Y)
  0 siblings, 2 replies; 40+ messages in thread
From: Andrew Jones @ 2021-04-27 14:58 UTC (permalink / raw)
  To: Yanan Wang
  Cc: Peter Maydell, Michael S . Tsirkin, wanghaibin.wang, qemu-devel,
	Shannon Zhao, qemu-arm, Alistair Francis, prime.zeng, yangyicong,
	yuzenghui, Igor Mammedov, zhukeqian1, Jiajie Li, David Gibson

On Tue, Apr 13, 2021 at 04:07:45PM +0800, Yanan Wang wrote:
> From: Andrew Jones <drjones@redhat.com>
> 
> The virt machine type has never used the CPU topology parameters, other
> than number of online CPUs and max CPUs. When choosing how to allocate
> those CPUs the default has been to assume cores. In preparation for
> using the other CPU topology parameters let's use an smp_parse that
> prefers cores over sockets. We can also enforce the topology matches
> max_cpus check because we have no legacy to preserve.
> 
> Signed-off-by: Andrew Jones <drjones@redhat.com>
> Signed-off-by: Yanan Wang <wangyanan55@huawei.com>
> ---
>  hw/arm/virt.c | 76 +++++++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 76 insertions(+)

Thanks, this patch matches [1]. Of course, I've always considered this
patch to be something of an RFC, though. Is there any harm in defaulting
to sockets over cores? If not, I wonder if we shouldn't just leave the
default as it is to avoid a mach-virt specific smp parser. The "no
topology" compat variable will keep existing machine types from switching
from cores to sockets, so we don't need to worry about that.

[1] https://github.com/rhdrjones/qemu/commit/c0670b1bccb4d08c7cf7c6957cc8878a2af131dd

Thanks,
drew




* Re: [RFC PATCH v2 2/6] hw/arm/virt: DT: Add cpu-map
  2021-04-27 12:36       ` Philippe Mathieu-Daudé
@ 2021-04-28  6:36         ` wangyanan (Y)
  0 siblings, 0 replies; 40+ messages in thread
From: wangyanan (Y) @ 2021-04-28  6:36 UTC (permalink / raw)
  To: Philippe Mathieu-Daudé, Andrew Jones
  Cc: Peter Maydell, Michael S . Tsirkin, qemu-devel, Shannon Zhao,
	Igor Mammedov, qemu-arm, Alistair Francis, prime.zeng,
	yangyicong, yuzenghui, wanghaibin.wang, zhukeqian1, Jiajie Li,
	David Gibson


On 2021/4/27 20:36, Philippe Mathieu-Daudé wrote:
> On 4/27/21 12:04 PM, Andrew Jones wrote:
>> On Tue, Apr 27, 2021 at 11:47:17AM +0200, Philippe Mathieu-Daudé wrote:
>>> Hi Yanan, Drew,
>>>
>>> On 4/13/21 10:07 AM, Yanan Wang wrote:
>>>> From: Andrew Jones <drjones@redhat.com>
>>>>
>>>> Support device tree CPU topology descriptions.
>>>>
>>>> Signed-off-by: Andrew Jones <drjones@redhat.com>
>>>> Signed-off-by: Yanan Wang <wangyanan55@huawei.com>
>>>> ---
>>>>   hw/arm/virt.c         | 41 ++++++++++++++++++++++++++++++++++++++++-
>>>>   include/hw/arm/virt.h |  1 +
>>>>   2 files changed, 41 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/hw/arm/virt.c b/hw/arm/virt.c
>>>> index 9f01d9041b..f4ae60ded9 100644
>>>> --- a/hw/arm/virt.c
>>>> +++ b/hw/arm/virt.c
>>>> @@ -352,10 +352,11 @@ static void fdt_add_cpu_nodes(const VirtMachineState *vms)
>>>>       int cpu;
>>>>       int addr_cells = 1;
>>>>       const MachineState *ms = MACHINE(vms);
>>>> +    const VirtMachineClass *vmc = VIRT_MACHINE_GET_CLASS(vms);
>>>>       int smp_cpus = ms->smp.cpus;
>>>>   
>>>>       /*
>>>> -     * From Documentation/devicetree/bindings/arm/cpus.txt
>>>> +     *  See Linux Documentation/devicetree/bindings/arm/cpus.yaml
>>>>        *  On ARM v8 64-bit systems value should be set to 2,
>>>>        *  that corresponds to the MPIDR_EL1 register size.
>>>>        *  If MPIDR_EL1[63:32] value is equal to 0 on all CPUs
>>>> @@ -408,8 +409,45 @@ static void fdt_add_cpu_nodes(const VirtMachineState *vms)
>>>>                   ms->possible_cpus->cpus[cs->cpu_index].props.node_id);
>>>>           }
>>>>   
>>>> +        if (ms->smp.cpus > 1 && !vmc->no_cpu_topology) {
>>>> +            qemu_fdt_setprop_cell(ms->fdt, nodename, "phandle",
>>>> +                                  qemu_fdt_alloc_phandle(ms->fdt));
>>>> +        }
>>>> +
>>>>           g_free(nodename);
>>>>       }
>>>> +
>>>> +    if (ms->smp.cpus > 1 && !vmc->no_cpu_topology) {
>>>> +        /*
>>>> +         * See Linux Documentation/devicetree/bindings/cpu/cpu-topology.txt
>>>> +         * In a SMP system, the hierarchy of CPUs is defined through four
>>>> +         * entities that are used to describe the layout of physical CPUs
>>>> +         * in the system: socket/cluster/core/thread.
>>>> +         */
>>>> +        qemu_fdt_add_subnode(ms->fdt, "/cpus/cpu-map");
>>>> +
>>>> +        for (cpu = ms->smp.cpus - 1; cpu >= 0; cpu--) {
>>>> +            char *cpu_path = g_strdup_printf("/cpus/cpu@%d", cpu);
>>>> +            char *map_path;
>>>> +
>>>> +            if (ms->smp.threads > 1) {
>>>> +                map_path = g_strdup_printf(
>>>> +                    "/cpus/cpu-map/%s%d/%s%d/%s%d",
>>>> +                    "socket", cpu / (ms->smp.cores * ms->smp.threads),
>>>> +                    "core", (cpu / ms->smp.threads) % ms->smp.cores,
>>>> +                    "thread", cpu % ms->smp.threads);
>>>> +            } else {
>>>> +                map_path = g_strdup_printf(
>>>> +                    "/cpus/cpu-map/%s%d/%s%d",
>>>> +                    "socket", cpu / ms->smp.cores,
>>>> +                    "core", cpu % ms->smp.cores);
>>>> +            }
>>>> +            qemu_fdt_add_path(ms->fdt, map_path);
>>>> +            qemu_fdt_setprop_phandle(ms->fdt, map_path, "cpu", cpu_path);
>>>> +            g_free(map_path);
>>>> +            g_free(cpu_path);
>>>> +        }
>>>> +    }
>>>>   }
>>>>   
>>>>   static void fdt_add_its_gic_node(VirtMachineState *vms)
>>>> @@ -2769,6 +2807,7 @@ static void virt_machine_5_2_options(MachineClass *mc)
>>>>       virt_machine_6_0_options(mc);
>>>>       compat_props_add(mc->compat_props, hw_compat_5_2, hw_compat_5_2_len);
>>>>       vmc->no_secure_gpio = true;
>>>> +    vmc->no_cpu_topology = true;
>>> Bear with me because "machine versioning" is something new to me; I was
>>> expecting it to be only related to migrated fields.
>>> Why do we need to care about not adding the FDT node in older machines?
>>> Shouldn't the guest skip unknown FDT nodes?
>> It probably should, the question is whether it would. Also, the nodes may
>> not be unknown, so the guest will read the information and set up its
>> topology as instructed. That topology may not be the same as what was
>> getting used by default without the topology description. It's possible
> that a user's application has a dependency on the topology and if that
> topology gets changed under its feet it'll behave differently.
> [*]
>
> I see.
>
>> In short, machine versioning isn't just about vmstate, it's also about
>> keeping a machine type looking the same to the guest.
> Yes, TIL.
>
>> Now, it's possible that we're being overly cautious here, but this compat
>> variable doesn't complicate code too much. So I think I'd prefer to use it
>> than not.
> No problem. Could you or Yanan add your first paragraph ([*], reworded)
> in the commit description? I don't think a comment in the code is
> useful, but having it in the commit is helpful IMO.
Hi Philippe,

Of course. I think I can do this for the commit description.

Thanks,
Yanan
> Thanks,
>
> Phil.
>
> .



* Re: [RFC PATCH v2 3/6] hw/arm/virt-acpi-build: Distinguish possible and present cpus
  2021-04-27 13:18   ` Andrew Jones
@ 2021-04-28  6:42     ` wangyanan (Y)
  0 siblings, 0 replies; 40+ messages in thread
From: wangyanan (Y) @ 2021-04-28  6:42 UTC (permalink / raw)
  To: Andrew Jones
  Cc: Peter Maydell, Michael S . Tsirkin, qemu-devel, Shannon Zhao,
	Igor Mammedov, qemu-arm, Alistair Francis, prime.zeng,
	yangyicong, yuzenghui, wanghaibin.wang, David Gibson, zhukeqian1,
	Jiajie Li, Ying Fang

On 2021/4/27 21:18, Andrew Jones wrote:
> On Tue, Apr 13, 2021 at 04:07:42PM +0800, Yanan Wang wrote:
>> From: Ying Fang <fangying1@huawei.com>
>>
>> When building ACPI tables regarding CPUs we should always build
>> them for the number of possible CPUs, not the number of present
>> CPUs. We then ensure only the present CPUs are enabled in MADT.
>> Furthermore, it is also needed if we are going to support CPU
>> hotplug in the future.
>>
>> This patch is a rework based on Andrew Jones's contribution at
>> https://lists.gnu.org/archive/html/qemu-arm/2018-07/msg00076.html
>>
>> Signed-off-by: Ying Fang <fangying1@huawei.com>
>> Signed-off-by: Yanan Wang <wangyanan55@huawei.com>
>> ---
>>   hw/arm/virt-acpi-build.c | 14 ++++++++++----
>>   hw/arm/virt.c            |  3 +++
>>   2 files changed, 13 insertions(+), 4 deletions(-)
>>
>> diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
>> index f5a2b2d4cb..2ad5dad1bf 100644
>> --- a/hw/arm/virt-acpi-build.c
>> +++ b/hw/arm/virt-acpi-build.c
>> @@ -61,13 +61,16 @@
>>   
>>   static void acpi_dsdt_add_cpus(Aml *scope, VirtMachineState *vms)
>>   {
>> -    MachineState *ms = MACHINE(vms);
>> +    CPUArchIdList *possible_cpus = MACHINE(vms)->possible_cpus;
>>       uint16_t i;
>>   
>> -    for (i = 0; i < ms->smp.cpus; i++) {
>> +    for (i = 0; i < possible_cpus->len; i++) {
>>           Aml *dev = aml_device("C%.03X", i);
>>           aml_append(dev, aml_name_decl("_HID", aml_string("ACPI0007")));
>>           aml_append(dev, aml_name_decl("_UID", aml_int(i)));
>> +        if (possible_cpus->cpus[i].cpu == NULL) {
>> +            aml_append(dev, aml_name_decl("_STA", aml_int(0)));
>> +        }
>>           aml_append(scope, dev);
>>       }
>>   }
>> @@ -479,6 +482,7 @@ build_madt(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
>>       const int *irqmap = vms->irqmap;
>>       AcpiMadtGenericDistributor *gicd;
>>       AcpiMadtGenericMsiFrame *gic_msi;
>> +    CPUArchIdList *possible_cpus = MACHINE(vms)->possible_cpus;
>>       int i;
>>   
>>       acpi_data_push(table_data, sizeof(AcpiMultipleApicTable));
>> @@ -489,7 +493,7 @@ build_madt(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
>>       gicd->base_address = cpu_to_le64(memmap[VIRT_GIC_DIST].base);
>>       gicd->version = vms->gic_version;
>>   
>> -    for (i = 0; i < MACHINE(vms)->smp.cpus; i++) {
>> +    for (i = 0; i < possible_cpus->len; i++) {
>>           AcpiMadtGenericCpuInterface *gicc = acpi_data_push(table_data,
>>                                                              sizeof(*gicc));
>>           ARMCPU *armcpu = ARM_CPU(qemu_get_cpu(i));
>> @@ -504,7 +508,9 @@ build_madt(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
>>           gicc->cpu_interface_number = cpu_to_le32(i);
>>           gicc->arm_mpidr = cpu_to_le64(armcpu->mp_affinity);
>>           gicc->uid = cpu_to_le32(i);
>> -        gicc->flags = cpu_to_le32(ACPI_MADT_GICC_ENABLED);
>> +        if (possible_cpus->cpus[i].cpu != NULL) {
>> +            gicc->flags = cpu_to_le32(ACPI_MADT_GICC_ENABLED);
>> +        }
>>   
>>           if (arm_feature(&armcpu->env, ARM_FEATURE_PMU)) {
>>               gicc->performance_interrupt = cpu_to_le32(PPI(VIRTUAL_PMU_IRQ));
>> diff --git a/hw/arm/virt.c b/hw/arm/virt.c
>> index f4ae60ded9..3e5d9b6f26 100644
>> --- a/hw/arm/virt.c
>> +++ b/hw/arm/virt.c
>> @@ -2063,6 +2063,9 @@ static void machvirt_init(MachineState *machine)
>>           }
>>   
>>           qdev_realize(DEVICE(cpuobj), NULL, &error_fatal);
>> +
>> +        /* Initialize cpu member here since cpu hotplug is not supported yet */
>> +        machine->possible_cpus->cpus[n].cpu = cpuobj;
> Can drop the 'machine->' as possible_cpus is already set to the pointer.
Right, I will drop it.

Thanks,
Yanan
>>           object_unref(cpuobj);
>>       }
>>       fdt_add_timer_nodes(vms);
>> -- 
>> 2.19.1
>>
> Thanks,
> drew
>
>
> .



* Re: [RFC PATCH v2 3/6] hw/arm/virt-acpi-build: Distinguish possible and present cpus
  2021-04-27 14:50   ` Andrew Jones
@ 2021-04-28  6:47     ` wangyanan (Y)
  0 siblings, 0 replies; 40+ messages in thread
From: wangyanan (Y) @ 2021-04-28  6:47 UTC (permalink / raw)
  To: Andrew Jones
  Cc: Peter Maydell, Michael S . Tsirkin, wanghaibin.wang, qemu-devel,
	Shannon Zhao, qemu-arm, Alistair Francis, prime.zeng, yangyicong,
	yuzenghui, Igor Mammedov, Ying Fang, zhukeqian1, Jiajie Li,
	David Gibson

Hi Drew,

On 2021/4/27 22:50, Andrew Jones wrote:
> On Tue, Apr 13, 2021 at 04:07:42PM +0800, Yanan Wang wrote:
>> From: Ying Fang <fangying1@huawei.com>
>>
>> When building ACPI tables regarding CPUs we should always build
>> them for the number of possible CPUs, not the number of present
>> CPUs. We then ensure only the present CPUs are enabled in MADT.
>> Furthermore, it is also needed if we are going to support CPU
>> hotplug in the future.
>>
>> This patch is a rework based on Andrew Jones's contribution at
>> https://lists.gnu.org/archive/html/qemu-arm/2018-07/msg00076.html
> Thank you for this credit, but I think I'd prefer a Co-developed-by
> tag instead, if you don't mind.
No problem. I will add your Co-developed-by tag; you deserve the credit.
There may still be some inaccurate credit attribution elsewhere in this
series, but I will try to make it all right.

Thanks,
Yanan
> Thanks,
> drew
>
> .



* Re: [RFC PATCH v2 4/6] hw/acpi/aml-build: Add processor hierarchy node structure
  2021-04-27 13:37   ` Andrew Jones
@ 2021-04-28  6:59     ` wangyanan (Y)
  0 siblings, 0 replies; 40+ messages in thread
From: wangyanan (Y) @ 2021-04-28  6:59 UTC (permalink / raw)
  To: Andrew Jones
  Cc: Peter Maydell, Henglong Fan, Michael S . Tsirkin,
	wanghaibin.wang, qemu-devel, Shannon Zhao, qemu-arm,
	Alistair Francis, prime.zeng, yangyicong, yuzenghui,
	Igor Mammedov, zhukeqian1, Jiajie Li, David Gibson

Hi Drew,

On 2021/4/27 21:37, Andrew Jones wrote:
> On Tue, Apr 13, 2021 at 04:07:43PM +0800, Yanan Wang wrote:
>> Add a generic API to build Processor Hierarchy Node Structure(Type 0),
>> which is strictly consistent with descriptions in ACPI 6.3: 5.2.29.1.
>>
>> This function will be used to build ACPI PPTT table for cpu topology.
>>
>> Signed-off-by: Ying Fang <fangying1@huawei.com>
>> Signed-off-by: Henglong Fan <fanhenglong@huawei.com>
>> Signed-off-by: Yanan Wang <wangyanan55@huawei.com>
>> ---
>>   hw/acpi/aml-build.c         | 27 +++++++++++++++++++++++++++
>>   include/hw/acpi/aml-build.h |  4 ++++
>>   2 files changed, 31 insertions(+)
>>
>> diff --git a/hw/acpi/aml-build.c b/hw/acpi/aml-build.c
>> index d33ce8954a..75e01aea17 100644
>> --- a/hw/acpi/aml-build.c
>> +++ b/hw/acpi/aml-build.c
>> @@ -1916,6 +1916,33 @@ void build_slit(GArray *table_data, BIOSLinker *linker, MachineState *ms,
>>                    table_data->len - slit_start, 1, oem_id, oem_table_id);
>>   }
>>   
>> +/*
>> + * ACPI 6.3: 5.2.29.1 Processor Hierarchy Node Structure (Type 0)
>               ^ Doesn't this table show up in 6.2 first? We should always
> use the oldest specification we can.
Yes, it shows up for the first time in 6.2 specification.
> Also, please don't capitalize Hierarchy, Node, and Structure. Those words
> are not capitalized in the spec section name and we want an exact match
> here.
Indeed, the exact format in the spec is "5.2.29.1 Processor hierarchy
node structure (Type 0)". I will fix this.
>> + */
>> +void build_processor_hierarchy_node(GArray *tbl, uint32_t flags,
>> +                                    uint32_t parent, uint32_t id,
>> +                                    uint32_t *priv_rsrc, uint32_t priv_num)
>> +{
>> +    int i;
>> +
>> +    build_append_byte(tbl, 0);                 /* Type 0 - processor */
>> +    build_append_byte(tbl, 20 + priv_num * 4); /* Length */
>> +    build_append_int_noprefix(tbl, 0, 2);      /* Reserved */
>> +    build_append_int_noprefix(tbl, flags, 4);  /* Flags */
>> +    build_append_int_noprefix(tbl, parent, 4); /* Parent */
>> +    build_append_int_noprefix(tbl, id, 4);     /* ACPI processor ID */
>                                                            ^ should be
> capitalized like in the spec
Right.
>                          
>> +
>> +    /* Number of private resources */
>> +    build_append_int_noprefix(tbl, priv_num, 4);
>> +
>> +    /* Private resources[N] */
>> +    if (priv_num > 0 && priv_rsrc != NULL) {
> Since we should never have priv_num > 0 and priv_rsrc == NULL, then we can
> do
>
>     if (priv_num > 0) {
>         assert(priv_rsrc);
>         ...
It seems much better, I will fix it.

Thanks,
Yanan
>> +        for (i = 0; i < priv_num; i++) {
>> +            build_append_int_noprefix(tbl, priv_rsrc[i], 4);
>> +        }
>> +    }
>> +}
>> +
>>   /* build rev1/rev3/rev5.1 FADT */
>>   void build_fadt(GArray *tbl, BIOSLinker *linker, const AcpiFadtData *f,
>>                   const char *oem_id, const char *oem_table_id)
>> diff --git a/include/hw/acpi/aml-build.h b/include/hw/acpi/aml-build.h
>> index 471266d739..ea74b8f6ed 100644
>> --- a/include/hw/acpi/aml-build.h
>> +++ b/include/hw/acpi/aml-build.h
>> @@ -462,6 +462,10 @@ void build_srat_memory(AcpiSratMemoryAffinity *numamem, uint64_t base,
>>   void build_slit(GArray *table_data, BIOSLinker *linker, MachineState *ms,
>>                   const char *oem_id, const char *oem_table_id);
>>   
>> +void build_processor_hierarchy_node(GArray *tbl, uint32_t flags,
>> +                                    uint32_t parent, uint32_t id,
>> +                                    uint32_t *priv_rsrc, uint32_t priv_num);
>> +
>>   void build_fadt(GArray *tbl, BIOSLinker *linker, const AcpiFadtData *f,
>>                   const char *oem_id, const char *oem_table_id);
>>   
>> -- 
>> 2.19.1
>>
> Thanks,
> drew
>
> .



* Re: [RFC PATCH v2 5/6] hw/arm/virt-acpi-build: Add PPTT table
  2021-04-27 14:16   ` Andrew Jones
@ 2021-04-28  7:30     ` wangyanan (Y)
  0 siblings, 0 replies; 40+ messages in thread
From: wangyanan (Y) @ 2021-04-28  7:30 UTC (permalink / raw)
  To: Andrew Jones
  Cc: Peter Maydell, Michael S . Tsirkin, wanghaibin.wang, qemu-devel,
	Shannon Zhao, qemu-arm, Alistair Francis, prime.zeng, yangyicong,
	yuzenghui, Igor Mammedov, zhukeqian1, Jiajie Li, David Gibson

Hi Drew,

On 2021/4/27 22:16, Andrew Jones wrote:
> On Tue, Apr 13, 2021 at 04:07:44PM +0800, Yanan Wang wrote:
>> Add the Processor Properties Topology Table (PPTT) to present
>> CPU topology information to ACPI guests. Note, while a DT boot
>> Linux guest with a non-flat CPU topology will see socket and
>> core IDs being sequential integers starting from zero, e.g.
>> with -smp 4,sockets=2,cores=2,threads=1
>>
>> a DT boot produces
>>
>>   cpu:  0 package_id:  0 core_id:  0
>>   cpu:  1 package_id:  0 core_id:  1
>>   cpu:  2 package_id:  1 core_id:  0
>>   cpu:  3 package_id:  1 core_id:  1
>>
>> an ACPI boot produces
>>
>>   cpu:  0 package_id: 36 core_id:  0
>>   cpu:  1 package_id: 36 core_id:  1
>>   cpu:  2 package_id: 96 core_id:  2
>>   cpu:  3 package_id: 96 core_id:  3
>>
>> This is due to several reasons:
>>
>>   1) DT cpu nodes do not have an equivalent field to what the PPTT
>>      ACPI Processor ID must be, i.e. something equal to the MADT CPU
>>      UID or equal to the UID of an ACPI processor container. In both
>>      ACPI cases those are platform dependant IDs assigned by the
>>      vendor.
>>
>>   2) While QEMU is the vendor for a guest, if the topology specifies
>>      SMT (> 1 thread), then, with ACPI, it is impossible to assign a
>>      core-id the same value as a package-id, thus it is not possible
>>      to have package-id=0 and core-id=0. This is because package and
>>      core containers must be in the same ACPI namespace and therefore
>>      must have unique UIDs.
>>
>>   3) ACPI processor containers are not required for PPTT tables to
>>      be used and, due to the limitations of which IDs are selected
>>      described above in (2), they are not helpful for QEMU, so we
>>      don't build them with this patch. In the absence of them, Linux
>>      assigns its own unique IDs. The maintainers have chosen not to use
>>      counters from zero, but rather ACPI table offsets, which explains
>>      why the numbers are so much larger than with DT.
>>
>>   4) When there is no SMT (threads=1) the core IDs for ACPI boot guests
>>      match the logical CPU IDs, because these IDs must be equal to the
>>      MADT CPU UID (as no processor containers are present), and QEMU
>>      uses the logical CPU ID for these MADT IDs.
>>
>> Tested-by: Jiajie Li <lijiajie11@huawei.com>
>> Signed-off-by: Andrew Jones <drjones@redhat.com>
>> Signed-off-by: Ying Fang <fangying1@huawei.com>
>> Signed-off-by: Yanan Wang <wangyanan55@huawei.com>
>> ---
>>   hw/arm/virt-acpi-build.c | 63 ++++++++++++++++++++++++++++++++++++++++
>>   1 file changed, 63 insertions(+)
>>
>> diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
>> index 2ad5dad1bf..03fd812d5a 100644
>> --- a/hw/arm/virt-acpi-build.c
>> +++ b/hw/arm/virt-acpi-build.c
>> @@ -436,6 +436,64 @@ build_srat(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
>>                    vms->oem_table_id);
>>   }
>>   
>> +/* PPTT */
> Please point out the ACPI spec section "5.2.29 Processor Properties
> Topology Table"
Will fix.
>> +static void
>> +build_pptt(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
> QEMU doesn't do this style, please write as
>
> static void build_pptt(GArray *table_data, BIOSLinker *linker,
>                         VirtMachineState *vms)
Will fix.
>> +{
>> +    int pptt_start = table_data->len;
>> +    int uid = 0, cpus = 0, socket = 0;
>> +    MachineState *ms = MACHINE(vms);
>> +    unsigned int smp_cores = ms->smp.cores;
>> +    unsigned int smp_threads = ms->smp.threads;
>> +
>> +    acpi_data_push(table_data, sizeof(AcpiTableHeader));
>> +
>> +    for (socket = 0; cpus < ms->possible_cpus->len; socket++) {
> Why not iterate from zero to ms->smp.sockets? With this type of loop if
> the number of sockets doesn't correctly fit the number of possible cpus,
> then you'll magically create new sockets that the user didn't want. That
> case shouldn't be able to happen, though, because the smp parsing should
> catch it. In any case, iterating sockets from zero to their number would
> make more sense.
Either way, we will never hit "sockets * cores * threads !=
possible_cpus->len" here. But yes, what you describe makes more sense
and will make the code easier to read.
>> +        uint32_t socket_offset = table_data->len - pptt_start;
>> +        int core;
>> +
>> +        build_processor_hierarchy_node(
>> +            table_data, 1, /* Physical package */
> If we want to pass the flags with in-argument-list comments, then please
> make sure the flags are on separate lines. See below.
>
>> +            0, socket, /* No parent */
>> +            NULL, 0);  /* No private resources */
> We don't need the 'No parent' and 'No private resources' comments.
>
>          build_processor_hierarchy_node(table_data,
>              (1 << 0), /* ACPI 6.2: Physical package */
>              0, socket, NULL, 0);
>
>> +
>> +        for (core = 0; core < smp_cores; core++) {
>> +            uint32_t core_offset = table_data->len - pptt_start;
>> +            int thread;
>> +
>> +            if (smp_threads <= 1) {
>> +                build_processor_hierarchy_node(
>> +                    table_data,
>> +                    (1 << 1) | /* ACPI Processor ID valid */
>> +                    (1 << 3),  /* ACPI 6.3 - Node is a Leaf */
>> +                    socket_offset, uid++, /* Parent is a Socket */
>> +                    NULL, 0);  /* No private resources */
> Now I see why you were calling out 6.3 in the previous patch. I suggest
> still keeping the function of the previous patch referencing 6.2, but
> also keep referencing 6.3 here, like you already do
>
>                  build_processor_hierarchy_node(table_data,
>                      (1 << 1) | /* ACPI Processor ID valid */
>                      (1 << 3),  /* ACPI 6.3 - Node is a Leaf */
>                      socket_offset, uid++, NULL, 0);
>
>> +            } else {
>> +                build_processor_hierarchy_node(
>> +                    table_data, 0,
>> +                    socket_offset, core, /* Parent is a Socket */
>> +                    NULL, 0); /* No private resources */
> No need for these in-argument-comments that don't match up with the spec.
>
>> +
>> +                for (thread = 0; thread < smp_threads; thread++) {
>> +                    build_processor_hierarchy_node(
>> +                        table_data,
>> +                        (1 << 1) | /* ACPI Processor ID valid */
>> +                        (1 << 2) | /* ACPI 6.3 - Processor is a Thread */
>> +                        (1 << 3),  /* ACPI 6.3 - Node is a Leaf */
> This looks good.
>
>> +                        core_offset, uid++, /* Parent is a Core */
>> +                        NULL, 0);  /* No private resources */
> Don't need these comments.
Thanks for the above suggestions and guidance about in-argument
comments. I will make some adjustments.
>> +                }
>> +            }
>> +        }
>> +        cpus += smp_cores * smp_threads;
> As stated above, we don't want this.
>
>> +    }
>> +
>> +    build_header(linker, table_data,
>> +                 (void *)(table_data->data + pptt_start), "PPTT",
>> +                 table_data->len - pptt_start, 2,
>> +                 vms->oem_id, vms->oem_table_id);
>> +}
>> +
>>   /* GTDT */
>>   static void
>>   build_gtdt(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
>> @@ -707,6 +765,11 @@ void virt_acpi_build(VirtMachineState *vms, AcpiBuildTables *tables)
>>       acpi_add_table(table_offsets, tables_blob);
>>       build_madt(tables_blob, tables->linker, vms);
>>   
>> +    if (ms->smp.cpus > 1 && !vmc->no_cpu_topology) {
>> +        acpi_add_table(table_offsets, tables_blob);
>> +        build_pptt(tables_blob, tables->linker, vms);
>> +    }
>> +
>>       acpi_add_table(table_offsets, tables_blob);
>>       build_gtdt(tables_blob, tables->linker, vms);
>>   
>> -- 
>> 2.19.1
>>
> Besides some changes that I think should be changed back and the 6.3
> flags, this patch looks very similar to [1], so I'd prefer my
> authorship be maintained. However, if my authorship is dropped, then
> my s-o-b should be replaced with a Co-developed-by.
Of course, I will make it right.

Thanks,
Yanan
>
> [1] https://github.com/rhdrjones/qemu/commit/439b38d67ca1f2cbfa5b9892a822b651ebd05c11
>
> Thanks,
> drew
>
> .



* Re: [RFC PATCH v2 6/6] hw/arm/virt: Replace smp_parse with one that prefers cores
  2021-04-27 14:58   ` Andrew Jones
@ 2021-04-28  8:04     ` wangyanan (Y)
  2021-04-28  9:36     ` wangyanan (Y)
  1 sibling, 0 replies; 40+ messages in thread
From: wangyanan (Y) @ 2021-04-28  8:04 UTC (permalink / raw)
  To: Andrew Jones
  Cc: Peter Maydell, Michael S . Tsirkin, qemu-devel, Shannon Zhao,
	Igor Mammedov, qemu-arm, Alistair Francis, prime.zeng,
	yangyicong, yuzenghui, wanghaibin.wang, zhukeqian1, Jiajie Li,
	David Gibson

Hi Drew,

On 2021/4/27 22:58, Andrew Jones wrote:
> On Tue, Apr 13, 2021 at 04:07:45PM +0800, Yanan Wang wrote:
>> From: Andrew Jones <drjones@redhat.com>
>>
>> The virt machine type has never used the CPU topology parameters, other
>> than number of online CPUs and max CPUs. When choosing how to allocate
>> those CPUs the default has been to assume cores. In preparation for
>> using the other CPU topology parameters let's use an smp_parse that
>> prefers cores over sockets. We can also enforce the topology matches
>> max_cpus check because we have no legacy to preserve.
>>
>> Signed-off-by: Andrew Jones <drjones@redhat.com>
>> Signed-off-by: Yanan Wang <wangyanan55@huawei.com>
>> ---
>>   hw/arm/virt.c | 76 +++++++++++++++++++++++++++++++++++++++++++++++++++
>>   1 file changed, 76 insertions(+)
> Thanks, this patch matches [1]. Of course, I've always considered this
> patch to be something of an RFC, though. Is there any harm in defaulting
> to sockets over cores? If not, I wonder if we shouldn't just leave the
> default as it is to avoid a mach-virt specific smp parser.
From my view, I didn't find any harm in defaulting to sockets over
cores, but I'm not really sure. At least, the arch-neutral smp_parse
and the x86 pc_smp_parse both prefer sockets over cores by default.
>   The "no
> topology" compat variable will keep existing machine types from switching
> from cores to sockets, so we don't need to worry about that.
Yes, I agree with this.

Thanks,
Yanan
> [1] https://github.com/rhdrjones/qemu/commit/c0670b1bccb4d08c7cf7c6957cc8878a2af131dd
>
> Thanks,
> drew
>
>
> .



* Re: [RFC PATCH v2 6/6] hw/arm/virt: Replace smp_parse with one that prefers cores
  2021-04-27 14:58   ` Andrew Jones
  2021-04-28  8:04     ` wangyanan (Y)
@ 2021-04-28  9:36     ` wangyanan (Y)
  2021-04-28 10:13       ` Andrew Jones
  1 sibling, 1 reply; 40+ messages in thread
From: wangyanan (Y) @ 2021-04-28  9:36 UTC (permalink / raw)
  To: Andrew Jones
  Cc: Peter Maydell, Michael S . Tsirkin, qemu-devel, Shannon Zhao,
	Igor Mammedov, qemu-arm, Alistair Francis, prime.zeng,
	yangyicong, yuzenghui, wanghaibin.wang, zhukeqian1, Jiajie Li,
	David Gibson


On 2021/4/27 22:58, Andrew Jones wrote:
> On Tue, Apr 13, 2021 at 04:07:45PM +0800, Yanan Wang wrote:
>> From: Andrew Jones <drjones@redhat.com>
>>
>> The virt machine type has never used the CPU topology parameters, other
>> than number of online CPUs and max CPUs. When choosing how to allocate
>> those CPUs the default has been to assume cores. In preparation for
>> using the other CPU topology parameters let's use an smp_parse that
>> prefers cores over sockets. We can also enforce the topology matches
>> max_cpus check because we have no legacy to preserve.
>>
>> Signed-off-by: Andrew Jones <drjones@redhat.com>
>> Signed-off-by: Yanan Wang <wangyanan55@huawei.com>
>> ---
>>   hw/arm/virt.c | 76 +++++++++++++++++++++++++++++++++++++++++++++++++++
>>   1 file changed, 76 insertions(+)
> Thanks, this patch matches [1]. Of course, I've always considered this
> patch to be something of an RFC, though. Is there any harm in defaulting
> to sockets over cores? If not, I wonder if we shouldn't just leave the
> default as it is to avoid a mach-virt specific smp parser. The "no
> topology" compat variable will keep existing machine types from switching
> from cores to sockets, so we don't need to worry about that.
>
> [1] https://github.com/rhdrjones/qemu/commit/c0670b1bccb4d08c7cf7c6957cc8878a2af131dd
For CPU topology population, the ARM64 kernel will first try to parse
the ACPI PPTT table and then DT in init_cpu_topology(); if both fail,
it will fall back on the MPIDR value in store_cpu_topology(). But MPIDR
cannot be trusted and is ignored for any topology deduction. Instead, a
topology of one single socket with multiple cores is assumed, which may
not represent the real underlying system topology. I think this is why
VMs prefer cores over sockets by default when no topology description
is provided.

With the feature introduced by this series, the guest kernel can
successfully get cpu information from one of the two (ACPI or DT) for
topology population.

According to the above analysis, IMO, whether the parsing logic is
"sockets over cores" or "cores over sockets", it just provides
different topology information and consequently different scheduling
performance. Apart from this, I don't think any harm or problems will
be caused.

So maybe it's fine to just use the arch-neutral parsing logic?
What do you think?

Thanks,
Yanan
> Thanks,
> drew
>
>
> .



* Re: [RFC PATCH v2 6/6] hw/arm/virt: Replace smp_parse with one that prefers cores
  2021-04-28  9:36     ` wangyanan (Y)
@ 2021-04-28 10:13       ` Andrew Jones
  2021-04-29  2:21         ` wangyanan (Y)
  0 siblings, 1 reply; 40+ messages in thread
From: Andrew Jones @ 2021-04-28 10:13 UTC (permalink / raw)
  To: wangyanan (Y)
  Cc: Peter Maydell, Michael S . Tsirkin, qemu-devel, Shannon Zhao,
	Igor Mammedov, qemu-arm, Alistair Francis, prime.zeng,
	yangyicong, yuzenghui, wanghaibin.wang, zhukeqian1, Jiajie Li,
	David Gibson

On Wed, Apr 28, 2021 at 05:36:43PM +0800, wangyanan (Y) wrote:
> 
> On 2021/4/27 22:58, Andrew Jones wrote:
> > On Tue, Apr 13, 2021 at 04:07:45PM +0800, Yanan Wang wrote:
> > > From: Andrew Jones <drjones@redhat.com>
> > > 
> > > The virt machine type has never used the CPU topology parameters, other
> > > than number of online CPUs and max CPUs. When choosing how to allocate
> > > those CPUs the default has been to assume cores. In preparation for
> > > using the other CPU topology parameters let's use an smp_parse that
> > > prefers cores over sockets. We can also enforce the topology matches
> > > max_cpus check because we have no legacy to preserve.
> > > 
> > > Signed-off-by: Andrew Jones <drjones@redhat.com>
> > > Signed-off-by: Yanan Wang <wangyanan55@huawei.com>
> > > ---
> > >   hw/arm/virt.c | 76 +++++++++++++++++++++++++++++++++++++++++++++++++++
> > >   1 file changed, 76 insertions(+)
> > Thanks, this patch matches [1]. Of course, I've always considered this
> > patch to be something of an RFC, though. Is there any harm in defaulting
> > to sockets over cores? If not, I wonder if we shouldn't just leave the
> > default as it is to avoid a mach-virt specific smp parser. The "no
> > topology" compat variable will keep existing machine types from switching
> > from cores to sockets, so we don't need to worry about that.
> > 
> > [1] https://github.com/rhdrjones/qemu/commit/c0670b1bccb4d08c7cf7c6957cc8878a2af131dd
> For CPU topology population, the ARM64 kernel first tries to parse the
> ACPI PPTT table and then the DT in init_cpu_topology(); if both fail,
> it falls back to the MPIDR value in store_cpu_topology(). But MPIDR
> cannot be trusted and is ignored for any topology deduction. Instead,
> a topology of a single socket with multiple cores is assumed, which
> may not represent the real underlying system topology. I think this is
> the reason why VMs by default prefer cores over sockets when no
> topology description is provided.
> 
> With the feature introduced by this series, the guest kernel can
> successfully get CPU information from one of the two sources (ACPI or
> DT) for topology population.
> 
> According to the above analysis, IMO, whether the parsing logic is
> "sockets over cores" or "cores over sockets", it just provides
> different topology information and consequently different scheduling
> performance. Apart from this, I don't think any harm or problems will
> be caused.
> 
> So maybe it's fine if we just use the arch-neutral parsing logic?
> What do you think?

Can you do an experiment where you create a guest with N vcpus, where N is
the number of cores in a single socket. Then, pin each of those vcpus to a
core in a single physical socket. Then, boot the VM with a topology of one
socket and N cores and run some benchmarks. Then, boot the VM again with N
sockets, one core each, and run the same benchmarks.

I'm guessing we'll see the same benchmark numbers (within noise allowance)
for both the runs. If we don't see the same numbers, then that'd be
interesting.
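The suggested experiment could be sketched roughly like this
(everything below is hypothetical: the host core range, the guest
kernel/initrd paths, and the benchmark choice; pinning is approximated
by running QEMU under taskset so that all vcpu threads stay on one
physical socket):

```shell
#!/bin/sh
# Hypothetical sketch of the experiment: N=4 vcpus confined to cores 0-3
# of a single physical socket, booted with two different guest topologies.
N=4
HOST_CORES=0-3   # cores of one physical socket on the host (assumption)

run_one() {
    # $1 is the -smp topology string for this run
    taskset -c "$HOST_CORES" \
        qemu-system-aarch64 -machine virt,gic-version=host -enable-kvm \
            -cpu host -m 4G -smp "$1" \
            -kernel Image -initrd rootfs.cpio \
            -nographic -append "console=ttyAMA0"
    # ...boot, run the same benchmark (e.g. kernbench) in the guest,
    # record the numbers, then shut the guest down before the next run...
}

run_one "${N},sockets=1,cores=${N},threads=1"   # one socket, N cores
run_one "${N},sockets=${N},cores=1,threads=1"   # N sockets, one core each
```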

Thanks,
drew



^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [RFC PATCH v2 6/6] hw/arm/virt: Replace smp_parse with one that prefers cores
  2021-04-28 10:13       ` Andrew Jones
@ 2021-04-29  2:21         ` wangyanan (Y)
  0 siblings, 0 replies; 40+ messages in thread
From: wangyanan (Y) @ 2021-04-29  2:21 UTC (permalink / raw)
  To: Andrew Jones
  Cc: Peter Maydell, Michael S . Tsirkin, qemu-devel, Shannon Zhao,
	Igor Mammedov, qemu-arm, Alistair Francis, prime.zeng,
	yangyicong, yuzenghui, wanghaibin.wang, zhukeqian1, Jiajie Li,
	David Gibson

Hi Drew,

On 2021/4/28 18:13, Andrew Jones wrote:
> On Wed, Apr 28, 2021 at 05:36:43PM +0800, wangyanan (Y) wrote:
>> On 2021/4/27 22:58, Andrew Jones wrote:
>>> On Tue, Apr 13, 2021 at 04:07:45PM +0800, Yanan Wang wrote:
>>>> From: Andrew Jones <drjones@redhat.com>
>>>>
>>>> The virt machine type has never used the CPU topology parameters, other
>>>> than number of online CPUs and max CPUs. When choosing how to allocate
>>>> those CPUs the default has been to assume cores. In preparation for
>>>> using the other CPU topology parameters let's use an smp_parse that
>>>> prefers cores over sockets. We can also enforce the topology matches
>>>> max_cpus check because we have no legacy to preserve.
>>>>
>>>> Signed-off-by: Andrew Jones <drjones@redhat.com>
>>>> Signed-off-by: Yanan Wang <wangyanan55@huawei.com>
>>>> ---
>>>>    hw/arm/virt.c | 76 +++++++++++++++++++++++++++++++++++++++++++++++++++
>>>>    1 file changed, 76 insertions(+)
>>> Thanks, this patch matches [1]. Of course, I've always considered this
>>> patch to be something of an RFC, though. Is there any harm in defaulting
>>> to sockets over cores? If not, I wonder if we shouldn't just leave the
>>> default as it is to avoid a mach-virt specific smp parser. The "no
>>> topology" compat variable will keep existing machine types from switching
>>> from cores to sockets, so we don't need to worry about that.
>>>
>>> [1] https://github.com/rhdrjones/qemu/commit/c0670b1bccb4d08c7cf7c6957cc8878a2af131dd
>> For CPU topology population, the ARM64 kernel first tries to parse the
>> ACPI PPTT table and then the DT in init_cpu_topology(); if both fail,
>> it falls back to the MPIDR value in store_cpu_topology(). But MPIDR
>> cannot be trusted and is ignored for any topology deduction. Instead,
>> a topology of a single socket with multiple cores is assumed, which
>> may not represent the real underlying system topology. I think this
>> is the reason why VMs by default prefer cores over sockets when no
>> topology description is provided.
>>
>> With the feature introduced by this series, the guest kernel can
>> successfully get CPU information from one of the two sources (ACPI or
>> DT) for topology population.
>>
>> According to the above analysis, IMO, whether the parsing logic is
>> "sockets over cores" or "cores over sockets", it just provides
>> different topology information and consequently different scheduling
>> performance. Apart from this, I don't think any harm or problems will
>> be caused.
>>
>> So maybe it's fine if we just use the arch-neutral parsing logic?
>> What do you think?
> Can you do an experiment where you create a guest with N vcpus, where N is
> the number of cores in a single socket. Then, pin each of those vcpus to a
> core in a single physical socket. Then, boot the VM with a topology of one
> socket and N cores and run some benchmarks. Then, boot the VM again with N
> sockets, one core each, and run the same benchmarks.
>
> I'm guessing we'll see the same benchmark numbers (within noise allowance)
> for both the runs. If we don't see the same numbers, then that'd be
> interesting.
Yes, I can do the experiment, and will post the results later.

Thanks,
Yanan
> Thanks,
> drew
>
> .


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [RFC PATCH v2 5/6] hw/arm/virt-acpi-build: Add PPTT table
  2021-04-13  8:07 ` [RFC PATCH v2 5/6] hw/arm/virt-acpi-build: Add PPTT table Yanan Wang
  2021-04-27 14:16   ` Andrew Jones
@ 2021-05-13  5:10   ` wangyanan (Y)
  2021-05-13  6:55     ` Andrew Jones
  2021-05-18  7:17     ` Salil Mehta
  1 sibling, 2 replies; 40+ messages in thread
From: wangyanan (Y) @ 2021-05-13  5:10 UTC (permalink / raw)
  To: Andrew Jones
  Cc: Peter Maydell, Michael S . Tsirkin, wanghaibin.wang, qemu-devel,
	Shannon Zhao, qemu-arm, Alistair Francis, prime.zeng, yangyicong,
	yuzenghui, Igor Mammedov, zhukeqian1, Jiajie Li, David Gibson

Hi Drew,

I have a question below and hope for your reply. Thanks!
On 2021/4/13 16:07, Yanan Wang wrote:
> Add the Processor Properties Topology Table (PPTT) to present
> CPU topology information to ACPI guests. Note, while a DT boot
> Linux guest with a non-flat CPU topology will see socket and
> core IDs being sequential integers starting from zero, e.g.
> with -smp 4,sockets=2,cores=2,threads=1
>
> a DT boot produces
>
>   cpu:  0 package_id:  0 core_id:  0
>   cpu:  1 package_id:  0 core_id:  1
>   cpu:  2 package_id:  1 core_id:  0
>   cpu:  3 package_id:  1 core_id:  1
>
> an ACPI boot produces
>
>   cpu:  0 package_id: 36 core_id:  0
>   cpu:  1 package_id: 36 core_id:  1
>   cpu:  2 package_id: 96 core_id:  2
>   cpu:  3 package_id: 96 core_id:  3
>
> This is due to several reasons:
>
>   1) DT cpu nodes do not have an equivalent field to what the PPTT
>      ACPI Processor ID must be, i.e. something equal to the MADT CPU
>      UID or equal to the UID of an ACPI processor container. In both
>      ACPI cases those are platform dependent IDs assigned by the
>      vendor.
>
>   2) While QEMU is the vendor for a guest, if the topology specifies
>      SMT (> 1 thread), then, with ACPI, it is impossible to assign a
>      core-id the same value as a package-id, thus it is not possible
>      to have package-id=0 and core-id=0. This is because package and
>      core containers must be in the same ACPI namespace and therefore
>      must have unique UIDs.
>
>   3) ACPI processor containers are not required for PPTT tables to
>      be used and, due to the limitations of which IDs are selected
>      described above in (2), they are not helpful for QEMU, so we
>      don't build them with this patch. In the absence of them, Linux
>      assigns its own unique IDs. The maintainers have chosen not to use
>      counters from zero, but rather ACPI table offsets, which explains
>      why the numbers are so much larger than with DT.
>
>   4) When there is no SMT (threads=1) the core IDs for ACPI boot guests
>      match the logical CPU IDs, because these IDs must be equal to the
>      MADT CPU UID (as no processor containers are present), and QEMU
>      uses the logical CPU ID for these MADT IDs.
>
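The package IDs in the ACPI boot output above (36 and 96) can be
reproduced by mocking up the table layout: with no private resources,
each processor-hierarchy node in this table is 20 bytes, and the nodes
follow the 36-byte ACPI table header. A small sketch (not the QEMU
code):

```python
# Sketch: compute node offsets for -smp 4,sockets=2,cores=2,threads=1.
# threads=1, so cores are emitted as leaf nodes directly under sockets.
HDR, NODE = 36, 20   # ACPI table header / processor-hierarchy node sizes

offset = HDR
layout = []
for socket in range(2):
    layout.append(("socket", socket, offset))  # socket node
    offset += NODE
    for core in range(2):
        layout.append(("core", core, offset))  # leaf core node
        offset += NODE

for kind, idx, off in layout:
    print(kind, idx, "at offset", off)
# The socket nodes land at offsets 36 and 96, which is exactly where the
# package IDs 36/96 in the ACPI boot output come from: Linux uses the
# table offset of the package node as its ID.
```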
> Tested-by: Jiajie Li <lijiajie11@huawei.com>
> Signed-off-by: Andrew Jones <drjones@redhat.com>
> Signed-off-by: Ying Fang <fangying1@huawei.com>
> Signed-off-by: Yanan Wang <wangyanan55@huawei.com>
> ---
>   hw/arm/virt-acpi-build.c | 63 ++++++++++++++++++++++++++++++++++++++++
>   1 file changed, 63 insertions(+)
>
> diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
> index 2ad5dad1bf..03fd812d5a 100644
> --- a/hw/arm/virt-acpi-build.c
> +++ b/hw/arm/virt-acpi-build.c
> @@ -436,6 +436,64 @@ build_srat(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
>                    vms->oem_table_id);
>   }
>   
> +/* PPTT */
> +static void
> +build_pptt(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
> +{
> +    int pptt_start = table_data->len;
> +    int uid = 0, cpus = 0, socket = 0;
> +    MachineState *ms = MACHINE(vms);
> +    unsigned int smp_cores = ms->smp.cores;
> +    unsigned int smp_threads = ms->smp.threads;
> +
> +    acpi_data_push(table_data, sizeof(AcpiTableHeader));
> +
> +    for (socket = 0; cpus < ms->possible_cpus->len; socket++) {
> +        uint32_t socket_offset = table_data->len - pptt_start;
> +        int core;
> +
> +        build_processor_hierarchy_node(
> +            table_data, 1, /* Physical package */
> +            0, socket, /* No parent */
> +            NULL, 0);  /* No private resources */
> +
> +        for (core = 0; core < smp_cores; core++) {
> +            uint32_t core_offset = table_data->len - pptt_start;
> +            int thread;
> +
> +            if (smp_threads <= 1) {
> +                build_processor_hierarchy_node(
> +                    table_data,
> +                    (1 << 1) | /* ACPI Processor ID valid */
> +                    (1 << 3),  /* ACPI 6.3 - Node is a Leaf */
> +                    socket_offset, uid++, /* Parent is a Socket */
> +                    NULL, 0);  /* No private resources */
> +            } else {
> +                build_processor_hierarchy_node(
> +                    table_data, 0,
> +                    socket_offset, core, /* Parent is a Socket */
> +                    NULL, 0); /* No private resources */
> +
> +                for (thread = 0; thread < smp_threads; thread++) {
> +                    build_processor_hierarchy_node(
> +                        table_data,
> +                        (1 << 1) | /* ACPI Processor ID valid */
> +                        (1 << 2) | /* ACPI 6.3 - Processor is a Thread */
> +                        (1 << 3),  /* ACPI 6.3 - Node is a Leaf */
> +                        core_offset, uid++, /* Parent is a Core */
> +                        NULL, 0);  /* No private resources */
> +                }
> +            }
> +        }
> +        cpus += smp_cores * smp_threads;
> +    }
> +
> +    build_header(linker, table_data,
> +                 (void *)(table_data->data + pptt_start), "PPTT",
> +                 table_data->len - pptt_start, 2,
> +                 vms->oem_id, vms->oem_table_id);
> +}
> +
>   /* GTDT */
>   static void
>   build_gtdt(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
> @@ -707,6 +765,11 @@ void virt_acpi_build(VirtMachineState *vms, AcpiBuildTables *tables)
>       acpi_add_table(table_offsets, tables_blob);
>       build_madt(tables_blob, tables->linker, vms);
>   
> +    if (ms->smp.cpus > 1 && !vmc->no_cpu_topology) {
I'm not really sure why we need to care about "ms->smp.cpus > 1" here.

IMO, just like the MADT, in which we create both ENABLED and DISABLED
gicc nodes regardless of whether the number of ENABLED nodes is one or
more, we should create the PPTT table for all the possible CPUs and not
care about the number of smp CPUs either. This will be more consistent
with the ACPI specification, and the PPTT table will be used for ACPI
cpu hotplug in the future even with "smp.cpus == 1".

Checking "smp.cpus > 1" in the DT cpu-map part makes sense to me,
because we are required to add only present cpu nodes to the DT, and
the Linux documentation says that a cpu-map is not needed for
uniprocessor systems.

Thanks,
Yanan
> +        acpi_add_table(table_offsets, tables_blob);
> +        build_pptt(tables_blob, tables->linker, vms);
> +    }
> +
>       acpi_add_table(table_offsets, tables_blob);
>       build_gtdt(tables_blob, tables->linker, vms);
>   


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [RFC PATCH v2 5/6] hw/arm/virt-acpi-build: Add PPTT table
  2021-05-13  5:10   ` wangyanan (Y)
@ 2021-05-13  6:55     ` Andrew Jones
  2021-05-18  7:17     ` Salil Mehta
  1 sibling, 0 replies; 40+ messages in thread
From: Andrew Jones @ 2021-05-13  6:55 UTC (permalink / raw)
  To: wangyanan (Y)
  Cc: Peter Maydell, Michael S . Tsirkin, wanghaibin.wang, qemu-devel,
	Shannon Zhao, qemu-arm, Alistair Francis, prime.zeng, yangyicong,
	yuzenghui, Igor Mammedov, zhukeqian1, Jiajie Li, David Gibson

On Thu, May 13, 2021 at 01:10:10PM +0800, wangyanan (Y) wrote:
> >   /* GTDT */
> >   static void
> >   build_gtdt(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
> > @@ -707,6 +765,11 @@ void virt_acpi_build(VirtMachineState *vms, AcpiBuildTables *tables)
> >       acpi_add_table(table_offsets, tables_blob);
> >       build_madt(tables_blob, tables->linker, vms);
> > +    if (ms->smp.cpus > 1 && !vmc->no_cpu_topology) {
> I'm not really sure why we need to care about "ms->smp.cpus > 1" here.
> 
> IMO, just like the MADT, in which we create both ENABLED and DISABLED
> gicc nodes regardless of whether the number of ENABLED nodes is one or
> more, we should create the PPTT table for all the possible CPUs and
> not care about the number of smp CPUs either. This will be more
> consistent with the ACPI specification, and the PPTT table will be
> used for ACPI cpu hotplug in the future even with "smp.cpus == 1".
> 
> Checking "smp.cpus > 1" in the DT cpu-map part makes sense to me,
> because we are required to add only present cpu nodes to the DT, and
> the Linux documentation says that a cpu-map is not needed for
> uniprocessor systems.
> 

Hi Yanan,

You're right. Let's just always generate the PPTT.

Thanks,
drew



^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [RFC PATCH v2 2/6] hw/arm/virt: DT: Add cpu-map
  2021-04-13  8:07 ` [RFC PATCH v2 2/6] hw/arm/virt: DT: Add cpu-map Yanan Wang
  2021-04-27  9:47   ` Philippe Mathieu-Daudé
@ 2021-05-13  6:58   ` Andrew Jones
  2021-05-13  7:15     ` wangyanan (Y)
  1 sibling, 1 reply; 40+ messages in thread
From: Andrew Jones @ 2021-05-13  6:58 UTC (permalink / raw)
  To: Yanan Wang
  Cc: Peter Maydell, Michael S . Tsirkin, wanghaibin.wang, qemu-devel,
	Shannon Zhao, qemu-arm, Alistair Francis, prime.zeng, yangyicong,
	yuzenghui, Igor Mammedov, zhukeqian1, Jiajie Li, David Gibson

On Tue, Apr 13, 2021 at 04:07:41PM +0800, Yanan Wang wrote:
> From: Andrew Jones <drjones@redhat.com>
> 
> Support device tree CPU topology descriptions.
> 
> Signed-off-by: Andrew Jones <drjones@redhat.com>
> Signed-off-by: Yanan Wang <wangyanan55@huawei.com>
> ---
>  hw/arm/virt.c         | 41 ++++++++++++++++++++++++++++++++++++++++-
>  include/hw/arm/virt.h |  1 +
>  2 files changed, 41 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/arm/virt.c b/hw/arm/virt.c
> index 9f01d9041b..f4ae60ded9 100644
> --- a/hw/arm/virt.c
> +++ b/hw/arm/virt.c
> @@ -352,10 +352,11 @@ static void fdt_add_cpu_nodes(const VirtMachineState *vms)
>      int cpu;
>      int addr_cells = 1;
>      const MachineState *ms = MACHINE(vms);
> +    const VirtMachineClass *vmc = VIRT_MACHINE_GET_CLASS(vms);
>      int smp_cpus = ms->smp.cpus;
>  
>      /*
> -     * From Documentation/devicetree/bindings/arm/cpus.txt
> +     *  See Linux Documentation/devicetree/bindings/arm/cpus.yaml
>       *  On ARM v8 64-bit systems value should be set to 2,
>       *  that corresponds to the MPIDR_EL1 register size.
>       *  If MPIDR_EL1[63:32] value is equal to 0 on all CPUs
> @@ -408,8 +409,45 @@ static void fdt_add_cpu_nodes(const VirtMachineState *vms)
>                  ms->possible_cpus->cpus[cs->cpu_index].props.node_id);
>          }
>  
> +        if (ms->smp.cpus > 1 && !vmc->no_cpu_topology) {

We should probably always generate the cpu-map, like we agreed to always
generate the PPTT. If, for some reason, we don't want to generate the
cpu-map for uniprocessor systems, then we should actually be checking
ms->smp.maxcpus here (and below) to be sure it's uniprocessor.

Thanks,
drew

> +            qemu_fdt_setprop_cell(ms->fdt, nodename, "phandle",
> +                                  qemu_fdt_alloc_phandle(ms->fdt));
> +        }
> +
>          g_free(nodename);
>      }
> +
> +    if (ms->smp.cpus > 1 && !vmc->no_cpu_topology) {
> +        /*
> +         * See Linux Documentation/devicetree/bindings/cpu/cpu-topology.txt
> +         * In a SMP system, the hierarchy of CPUs is defined through four
> +         * entities that are used to describe the layout of physical CPUs
> +         * in the system: socket/cluster/core/thread.
> +         */
> +        qemu_fdt_add_subnode(ms->fdt, "/cpus/cpu-map");
> +
> +        for (cpu = ms->smp.cpus - 1; cpu >= 0; cpu--) {
> +            char *cpu_path = g_strdup_printf("/cpus/cpu@%d", cpu);
> +            char *map_path;
> +
> +            if (ms->smp.threads > 1) {
> +                map_path = g_strdup_printf(
> +                    "/cpus/cpu-map/%s%d/%s%d/%s%d",
> +                    "socket", cpu / (ms->smp.cores * ms->smp.threads),
> +                    "core", (cpu / ms->smp.threads) % ms->smp.cores,
> +                    "thread", cpu % ms->smp.threads);
> +            } else {
> +                map_path = g_strdup_printf(
> +                    "/cpus/cpu-map/%s%d/%s%d",
> +                    "socket", cpu / ms->smp.cores,
> +                    "core", cpu % ms->smp.cores);
> +            }
> +            qemu_fdt_add_path(ms->fdt, map_path);
> +            qemu_fdt_setprop_phandle(ms->fdt, map_path, "cpu", cpu_path);
> +            g_free(map_path);
> +            g_free(cpu_path);
> +        }
> +    }
>  }
>  
>  static void fdt_add_its_gic_node(VirtMachineState *vms)
> @@ -2769,6 +2807,7 @@ static void virt_machine_5_2_options(MachineClass *mc)
>      virt_machine_6_0_options(mc);
>      compat_props_add(mc->compat_props, hw_compat_5_2, hw_compat_5_2_len);
>      vmc->no_secure_gpio = true;
> +    vmc->no_cpu_topology = true;
>  }
>  DEFINE_VIRT_MACHINE(5, 2)
>  
> diff --git a/include/hw/arm/virt.h b/include/hw/arm/virt.h
> index 921416f918..4a4b98e4a7 100644
> --- a/include/hw/arm/virt.h
> +++ b/include/hw/arm/virt.h
> @@ -129,6 +129,7 @@ struct VirtMachineClass {
>      bool no_kvm_steal_time;
>      bool acpi_expose_flash;
>      bool no_secure_gpio;
> +    bool no_cpu_topology;
>  };
>  
>  struct VirtMachineState {
> -- 
> 2.19.1
> 



^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [RFC PATCH v2 2/6] hw/arm/virt: DT: Add cpu-map
  2021-05-13  6:58   ` Andrew Jones
@ 2021-05-13  7:15     ` wangyanan (Y)
  0 siblings, 0 replies; 40+ messages in thread
From: wangyanan (Y) @ 2021-05-13  7:15 UTC (permalink / raw)
  To: Andrew Jones
  Cc: Peter Maydell, Michael S . Tsirkin, wanghaibin.wang, qemu-devel,
	Shannon Zhao, qemu-arm, Alistair Francis, prime.zeng, yangyicong,
	yuzenghui, Igor Mammedov, zhukeqian1, Jiajie Li, David Gibson


On 2021/5/13 14:58, Andrew Jones wrote:
> On Tue, Apr 13, 2021 at 04:07:41PM +0800, Yanan Wang wrote:
>> From: Andrew Jones <drjones@redhat.com>
>>
>> Support device tree CPU topology descriptions.
>>
>> Signed-off-by: Andrew Jones <drjones@redhat.com>
>> Signed-off-by: Yanan Wang <wangyanan55@huawei.com>
>> ---
>>   hw/arm/virt.c         | 41 ++++++++++++++++++++++++++++++++++++++++-
>>   include/hw/arm/virt.h |  1 +
>>   2 files changed, 41 insertions(+), 1 deletion(-)
>>
>> diff --git a/hw/arm/virt.c b/hw/arm/virt.c
>> index 9f01d9041b..f4ae60ded9 100644
>> --- a/hw/arm/virt.c
>> +++ b/hw/arm/virt.c
>> @@ -352,10 +352,11 @@ static void fdt_add_cpu_nodes(const VirtMachineState *vms)
>>       int cpu;
>>       int addr_cells = 1;
>>       const MachineState *ms = MACHINE(vms);
>> +    const VirtMachineClass *vmc = VIRT_MACHINE_GET_CLASS(vms);
>>       int smp_cpus = ms->smp.cpus;
>>   
>>       /*
>> -     * From Documentation/devicetree/bindings/arm/cpus.txt
>> +     *  See Linux Documentation/devicetree/bindings/arm/cpus.yaml
>>        *  On ARM v8 64-bit systems value should be set to 2,
>>        *  that corresponds to the MPIDR_EL1 register size.
>>        *  If MPIDR_EL1[63:32] value is equal to 0 on all CPUs
>> @@ -408,8 +409,45 @@ static void fdt_add_cpu_nodes(const VirtMachineState *vms)
>>                   ms->possible_cpus->cpus[cs->cpu_index].props.node_id);
>>           }
>>   
>> +        if (ms->smp.cpus > 1 && !vmc->no_cpu_topology) {
> We should probably always generate the cpu-map, like we agreed to always
> generate the PPTT.
OK, I will remove the smp.cpus check for the cpu-map too.
A single cpu node, corresponding to a single cpu-map path, also works.
> If, for some reason, we don't want to generate the
> cpu-map for uniprocessor systems, then we should actually be checking
> ms->smp.maxcpus here (and below) to be sure it's uniprocessor.
Right, it's max cpus that ought to be checked, not smp cpus.
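By the way, the socket/core/thread derivation in the hunk above can be
double-checked with a quick sketch mirroring its arithmetic (not the
actual QEMU code):

```python
def cpu_map_path(cpu, cores, threads):
    """Map a linear cpu index to its /cpus/cpu-map path, as in the patch."""
    if threads > 1:
        return "/cpus/cpu-map/socket%d/core%d/thread%d" % (
            cpu // (cores * threads),   # socket index
            (cpu // threads) % cores,   # core index within the socket
            cpu % threads)              # thread index within the core
    return "/cpus/cpu-map/socket%d/core%d" % (cpu // cores, cpu % cores)

# e.g. -smp 8,sockets=2,cores=2,threads=2: cpu 5 sits in socket 1,
# core 0, thread 1
print(cpu_map_path(5, cores=2, threads=2))
```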

Thanks,
Yanan
> Thanks,
> drew
>
>> +            qemu_fdt_setprop_cell(ms->fdt, nodename, "phandle",
>> +                                  qemu_fdt_alloc_phandle(ms->fdt));
>> +        }
>> +
>>           g_free(nodename);
>>       }
>> +
>> +    if (ms->smp.cpus > 1 && !vmc->no_cpu_topology) {
>> +        /*
>> +         * See Linux Documentation/devicetree/bindings/cpu/cpu-topology.txt
>> +         * In a SMP system, the hierarchy of CPUs is defined through four
>> +         * entities that are used to describe the layout of physical CPUs
>> +         * in the system: socket/cluster/core/thread.
>> +         */
>> +        qemu_fdt_add_subnode(ms->fdt, "/cpus/cpu-map");
>> +
>> +        for (cpu = ms->smp.cpus - 1; cpu >= 0; cpu--) {
>> +            char *cpu_path = g_strdup_printf("/cpus/cpu@%d", cpu);
>> +            char *map_path;
>> +
>> +            if (ms->smp.threads > 1) {
>> +                map_path = g_strdup_printf(
>> +                    "/cpus/cpu-map/%s%d/%s%d/%s%d",
>> +                    "socket", cpu / (ms->smp.cores * ms->smp.threads),
>> +                    "core", (cpu / ms->smp.threads) % ms->smp.cores,
>> +                    "thread", cpu % ms->smp.threads);
>> +            } else {
>> +                map_path = g_strdup_printf(
>> +                    "/cpus/cpu-map/%s%d/%s%d",
>> +                    "socket", cpu / ms->smp.cores,
>> +                    "core", cpu % ms->smp.cores);
>> +            }
>> +            qemu_fdt_add_path(ms->fdt, map_path);
>> +            qemu_fdt_setprop_phandle(ms->fdt, map_path, "cpu", cpu_path);
>> +            g_free(map_path);
>> +            g_free(cpu_path);
>> +        }
>> +    }
>>   }
>>   
>>   static void fdt_add_its_gic_node(VirtMachineState *vms)
>> @@ -2769,6 +2807,7 @@ static void virt_machine_5_2_options(MachineClass *mc)
>>       virt_machine_6_0_options(mc);
>>       compat_props_add(mc->compat_props, hw_compat_5_2, hw_compat_5_2_len);
>>       vmc->no_secure_gpio = true;
>> +    vmc->no_cpu_topology = true;
>>   }
>>   DEFINE_VIRT_MACHINE(5, 2)
>>   
>> diff --git a/include/hw/arm/virt.h b/include/hw/arm/virt.h
>> index 921416f918..4a4b98e4a7 100644
>> --- a/include/hw/arm/virt.h
>> +++ b/include/hw/arm/virt.h
>> @@ -129,6 +129,7 @@ struct VirtMachineClass {
>>       bool no_kvm_steal_time;
>>       bool acpi_expose_flash;
>>       bool no_secure_gpio;
>> +    bool no_cpu_topology;
>>   };
>>   
>>   struct VirtMachineState {
>> -- 
>> 2.19.1
>>
> .


^ permalink raw reply	[flat|nested] 40+ messages in thread

* RE: [RFC PATCH v2 5/6] hw/arm/virt-acpi-build: Add PPTT table
  2021-05-13  5:10   ` wangyanan (Y)
  2021-05-13  6:55     ` Andrew Jones
@ 2021-05-18  7:17     ` Salil Mehta
  2021-05-18  7:42       ` Andrew Jones
  2021-05-18  9:16       ` wangyanan (Y)
  1 sibling, 2 replies; 40+ messages in thread
From: Salil Mehta @ 2021-05-18  7:17 UTC (permalink / raw)
  To: wangyanan (Y), Andrew Jones
  Cc: Peter Maydell, linuxarm, Michael S . Tsirkin, qemu-devel,
	Linuxarm, Shannon Zhao, Igor Mammedov, qemu-arm,
	Alistair Francis, Zengtao (B),
	yangyicong, yuzenghui, Wanghaibin (D), zhukeqian, lijiajie (H),
	David Gibson

> From: Qemu-arm [mailto:qemu-arm-bounces+salil.mehta=huawei.com@nongnu.org]
> On Behalf Of wangyanan (Y)
> Sent: Thursday, May 13, 2021 6:10 AM
> 
> Hi Drew,
> 
> I have a question below and hope for your reply. Thanks!
> On 2021/4/13 16:07, Yanan Wang wrote:
> > Add the Processor Properties Topology Table (PPTT) to present
> > CPU topology information to ACPI guests. Note, while a DT boot
> > Linux guest with a non-flat CPU topology will see socket and
> > core IDs being sequential integers starting from zero, e.g.
> > with -smp 4,sockets=2,cores=2,threads=1
> >
> > a DT boot produces
> >
> >   cpu:  0 package_id:  0 core_id:  0
> >   cpu:  1 package_id:  0 core_id:  1
> >   cpu:  2 package_id:  1 core_id:  0
> >   cpu:  3 package_id:  1 core_id:  1
> >
> > an ACPI boot produces
> >
> >   cpu:  0 package_id: 36 core_id:  0
> >   cpu:  1 package_id: 36 core_id:  1
> >   cpu:  2 package_id: 96 core_id:  2
> >   cpu:  3 package_id: 96 core_id:  3
> >
> > This is due to several reasons:
> >
> >   1) DT cpu nodes do not have an equivalent field to what the PPTT
> >      ACPI Processor ID must be, i.e. something equal to the MADT CPU
> >      UID or equal to the UID of an ACPI processor container. In both
> >      ACPI cases those are platform dependent IDs assigned by the
> >      vendor.
> >
> >   2) While QEMU is the vendor for a guest, if the topology specifies
> >      SMT (> 1 thread), then, with ACPI, it is impossible to assign a
> >      core-id the same value as a package-id, thus it is not possible
> >      to have package-id=0 and core-id=0. This is because package and
> >      core containers must be in the same ACPI namespace and therefore
> >      must have unique UIDs.
> >
> >   3) ACPI processor containers are not required for PPTT tables to
> >      be used and, due to the limitations of which IDs are selected
> >      described above in (2), they are not helpful for QEMU, so we
> >      don't build them with this patch. In the absence of them, Linux
> >      assigns its own unique IDs. The maintainers have chosen not to use
> >      counters from zero, but rather ACPI table offsets, which explains
> >      why the numbers are so much larger than with DT.
> >
> >   4) When there is no SMT (threads=1) the core IDs for ACPI boot guests
> >      match the logical CPU IDs, because these IDs must be equal to the
> >      MADT CPU UID (as no processor containers are present), and QEMU
> >      uses the logical CPU ID for these MADT IDs.
> >
> > Tested-by: Jiajie Li <lijiajie11@huawei.com>
> > Signed-off-by: Andrew Jones <drjones@redhat.com>
> > Signed-off-by: Ying Fang <fangying1@huawei.com>
> > Signed-off-by: Yanan Wang <wangyanan55@huawei.com>
> > ---
> >   hw/arm/virt-acpi-build.c | 63 ++++++++++++++++++++++++++++++++++++++++
> >   1 file changed, 63 insertions(+)
> >
> > diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
> > index 2ad5dad1bf..03fd812d5a 100644
> > --- a/hw/arm/virt-acpi-build.c
> > +++ b/hw/arm/virt-acpi-build.c
> > @@ -436,6 +436,64 @@ build_srat(GArray *table_data, BIOSLinker *linker,
> VirtMachineState *vms)
> >                    vms->oem_table_id);
> >   }
> >
> > +/* PPTT */
> > +static void
> > +build_pptt(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
> > +{
> > +    int pptt_start = table_data->len;
> > +    int uid = 0, cpus = 0, socket = 0;
> > +    MachineState *ms = MACHINE(vms);
> > +    unsigned int smp_cores = ms->smp.cores;
> > +    unsigned int smp_threads = ms->smp.threads;
> > +
> > +    acpi_data_push(table_data, sizeof(AcpiTableHeader));
> > +
> > +    for (socket = 0; cpus < ms->possible_cpus->len; socket++) {
> > +        uint32_t socket_offset = table_data->len - pptt_start;
> > +        int core;
> > +
> > +        build_processor_hierarchy_node(
> > +            table_data, 1, /* Physical package */
> > +            0, socket, /* No parent */
> > +            NULL, 0);  /* No private resources */
> > +
> > +        for (core = 0; core < smp_cores; core++) {
> > +            uint32_t core_offset = table_data->len - pptt_start;
> > +            int thread;
> > +
> > +            if (smp_threads <= 1) {
> > +                build_processor_hierarchy_node(
> > +                    table_data,
> > +                    (1 << 1) | /* ACPI Processor ID valid */
> > +                    (1 << 3),  /* ACPI 6.3 - Node is a Leaf */
> > +                    socket_offset, uid++, /* Parent is a Socket */
> > +                    NULL, 0);  /* No private resources */
> > +            } else {
> > +                build_processor_hierarchy_node(
> > +                    table_data, 0,
> > +                    socket_offset, core, /* Parent is a Socket */
> > +                    NULL, 0); /* No private resources */
> > +
> > +                for (thread = 0; thread < smp_threads; thread++) {
> > +                    build_processor_hierarchy_node(
> > +                        table_data,
> > +                        (1 << 1) | /* ACPI Processor ID valid */
> > +                        (1 << 2) | /* ACPI 6.3 - Processor is a Thread */
> > +                        (1 << 3),  /* ACPI 6.3 - Node is a Leaf */
> > +                        core_offset, uid++, /* Parent is a Core */
> > +                        NULL, 0);  /* No private resources */
> > +                }
> > +            }
> > +        }
> > +        cpus += smp_cores * smp_threads;
> > +    }
> > +
> > +    build_header(linker, table_data,
> > +                 (void *)(table_data->data + pptt_start), "PPTT",
> > +                 table_data->len - pptt_start, 2,
> > +                 vms->oem_id, vms->oem_table_id);
> > +}
> > +
> >   /* GTDT */
> >   static void
> >   build_gtdt(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
> > @@ -707,6 +765,11 @@ void virt_acpi_build(VirtMachineState *vms,
> AcpiBuildTables *tables)
> >       acpi_add_table(table_offsets, tables_blob);
> >       build_madt(tables_blob, tables->linker, vms);
> >
> > +    if (ms->smp.cpus > 1 && !vmc->no_cpu_topology) {
> I'm not really sure why we need to care about "ms->smp.cpus > 1" here?
> 
> IMO, just like MADT in which we create both ENABLED and DISABLED
> gicc nodes no matter of number of ENABLED nodes is one or not, we
> should create PPTT table for all the possible cpus and not care about
> number of smp cpus, too. This will be more consistent with the ACPI
> specification and the PPTT table will be used for ACPI cpu hotplug in
> the future even with  "smp.cpus == 1".


A humble request:
Let us not anticipate the vcpu hotplug changes here. Things are fluid
with respect to vcpu hotplug right now, and I think it would not be
right to base the PPTT table changes on anticipation of something whose
final shape we are not yet sure of.

Any such decisions should be postponed and made part of the actual
vcpu hotplug changes when (and if ever) they come for ARM64. This will
also ensure proper review of such changes in the particular context
where they are useful.


Thanks

> 
> Care of "smp.cpus > 1" in the DT cpu-map part makes sense to me,
> because we are required to only add present cpu nodes to the DT and
> Linux Doc says that a cpu-map is not needed for uniprocessor systems.
> 
> Thanks,
> Yanan
> > +        acpi_add_table(table_offsets, tables_blob);
> > +        build_pptt(tables_blob, tables->linker, vms);
> > +    }
> > +
> >       acpi_add_table(table_offsets, tables_blob);
> >       build_gtdt(tables_blob, tables->linker, vms);
> >


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [RFC PATCH v2 5/6] hw/arm/virt-acpi-build: Add PPTT table
  2021-05-18  7:17     ` Salil Mehta
@ 2021-05-18  7:42       ` Andrew Jones
  2021-05-18 18:34         ` Salil Mehta
  2021-05-18  9:16       ` wangyanan (Y)
  1 sibling, 1 reply; 40+ messages in thread
From: Andrew Jones @ 2021-05-18  7:42 UTC (permalink / raw)
  To: Salil Mehta
  Cc: Peter Maydell, linuxarm, Michael S . Tsirkin, qemu-devel,
	Linuxarm, wangyanan (Y),
	Shannon Zhao, Igor Mammedov, qemu-arm, Alistair Francis,
	Zengtao (B), yangyicong, yuzenghui, Wanghaibin (D),
	zhukeqian, lijiajie (H),
	David Gibson

On Tue, May 18, 2021 at 07:17:56AM +0000, Salil Mehta wrote:
> > From: Qemu-arm [mailto:qemu-arm-bounces+salil.mehta=huawei.com@nongnu.org]
> > On Behalf Of wangyanan (Y)
> > Sent: Thursday, May 13, 2021 6:10 AM
> > 
> > Hi Drew,
> > 
> > I got a question below, and hope your reply. Thanks!
> > On 2021/4/13 16:07, Yanan Wang wrote:
> > > Add the Processor Properties Topology Table (PPTT) to present
> > > CPU topology information to ACPI guests. Note, while a DT boot
> > > Linux guest with a non-flat CPU topology will see socket and
> > > core IDs being sequential integers starting from zero, e.g.
> > > with -smp 4,sockets=2,cores=2,threads=1
> > >
> > > a DT boot produces
> > >
> > >   cpu:  0 package_id:  0 core_id:  0
> > >   cpu:  1 package_id:  0 core_id:  1
> > >   cpu:  2 package_id:  1 core_id:  0
> > >   cpu:  3 package_id:  1 core_id:  1
> > >
> > > an ACPI boot produces
> > >
> > >   cpu:  0 package_id: 36 core_id:  0
> > >   cpu:  1 package_id: 36 core_id:  1
> > >   cpu:  2 package_id: 96 core_id:  2
> > >   cpu:  3 package_id: 96 core_id:  3
> > >
> > > This is due to several reasons:
> > >
> > >   1) DT cpu nodes do not have an equivalent field to what the PPTT
> > >      ACPI Processor ID must be, i.e. something equal to the MADT CPU
> > >      UID or equal to the UID of an ACPI processor container. In both
> > >      ACPI cases those are platform dependant IDs assigned by the
> > >      vendor.
> > >
> > >   2) While QEMU is the vendor for a guest, if the topology specifies
> > >      SMT (> 1 thread), then, with ACPI, it is impossible to assign a
> > >      core-id the same value as a package-id, thus it is not possible
> > >      to have package-id=0 and core-id=0. This is because package and
> > >      core containers must be in the same ACPI namespace and therefore
> > >      must have unique UIDs.
> > >
> > >   3) ACPI processor containers are not required for PPTT tables to
> > >      be used and, due to the limitations of which IDs are selected
> > >      described above in (2), they are not helpful for QEMU, so we
> > >      don't build them with this patch. In the absence of them, Linux
> > >      assigns its own unique IDs. The maintainers have chosen not to use
> > >      counters from zero, but rather ACPI table offsets, which explains
> > >      why the numbers are so much larger than with DT.
> > >
> > >   4) When there is no SMT (threads=1) the core IDs for ACPI boot guests
> > >      match the logical CPU IDs, because these IDs must be equal to the
> > >      MADT CPU UID (as no processor containers are present), and QEMU
> > >      uses the logical CPU ID for these MADT IDs.
> > >
> > > Tested-by: Jiajie Li <lijiajie11@huawei.com>
> > > Signed-off-by: Andrew Jones <drjones@redhat.com>
> > > Signed-off-by: Ying Fang <fangying1@huawei.com>
> > > Signed-off-by: Yanan Wang <wangyanan55@huawei.com>
> > > ---
> > >   hw/arm/virt-acpi-build.c | 63 ++++++++++++++++++++++++++++++++++++++++
> > >   1 file changed, 63 insertions(+)
> > >
> > > diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
> > > index 2ad5dad1bf..03fd812d5a 100644
> > > --- a/hw/arm/virt-acpi-build.c
> > > +++ b/hw/arm/virt-acpi-build.c
> > > @@ -436,6 +436,64 @@ build_srat(GArray *table_data, BIOSLinker *linker,
> > VirtMachineState *vms)
> > >                    vms->oem_table_id);
> > >   }
> > >
> > > +/* PPTT */
> > > +static void
> > > +build_pptt(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
> > > +{
> > > +    int pptt_start = table_data->len;
> > > +    int uid = 0, cpus = 0, socket = 0;
> > > +    MachineState *ms = MACHINE(vms);
> > > +    unsigned int smp_cores = ms->smp.cores;
> > > +    unsigned int smp_threads = ms->smp.threads;
> > > +
> > > +    acpi_data_push(table_data, sizeof(AcpiTableHeader));
> > > +
> > > +    for (socket = 0; cpus < ms->possible_cpus->len; socket++) {
> > > +        uint32_t socket_offset = table_data->len - pptt_start;
> > > +        int core;
> > > +
> > > +        build_processor_hierarchy_node(
> > > +            table_data, 1, /* Physical package */
> > > +            0, socket, /* No parent */
> > > +            NULL, 0);  /* No private resources */
> > > +
> > > +        for (core = 0; core < smp_cores; core++) {
> > > +            uint32_t core_offset = table_data->len - pptt_start;
> > > +            int thread;
> > > +
> > > +            if (smp_threads <= 1) {
> > > +                build_processor_hierarchy_node(
> > > +                    table_data,
> > > +                    (1 << 1) | /* ACPI Processor ID valid */
> > > +                    (1 << 3),  /* ACPI 6.3 - Node is a Leaf */
> > > +                    socket_offset, uid++, /* Parent is a Socket */
> > > +                    NULL, 0);  /* No private resources */
> > > +            } else {
> > > +                build_processor_hierarchy_node(
> > > +                    table_data, 0,
> > > +                    socket_offset, core, /* Parent is a Socket */
> > > +                    NULL, 0); /* No private resources */
> > > +
> > > +                for (thread = 0; thread < smp_threads; thread++) {
> > > +                    build_processor_hierarchy_node(
> > > +                        table_data,
> > > +                        (1 << 1) | /* ACPI Processor ID valid */
> > > +                        (1 << 2) | /* ACPI 6.3 - Processor is a Thread */
> > > +                        (1 << 3),  /* ACPI 6.3 - Node is a Leaf */
> > > +                        core_offset, uid++, /* Parent is a Core */
> > > +                        NULL, 0);  /* No private resources */
> > > +                }
> > > +            }
> > > +        }
> > > +        cpus += smp_cores * smp_threads;
> > > +    }
> > > +
> > > +    build_header(linker, table_data,
> > > +                 (void *)(table_data->data + pptt_start), "PPTT",
> > > +                 table_data->len - pptt_start, 2,
> > > +                 vms->oem_id, vms->oem_table_id);
> > > +}
> > > +
> > >   /* GTDT */
> > >   static void
> > >   build_gtdt(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
> > > @@ -707,6 +765,11 @@ void virt_acpi_build(VirtMachineState *vms,
> > AcpiBuildTables *tables)
> > >       acpi_add_table(table_offsets, tables_blob);
> > >       build_madt(tables_blob, tables->linker, vms);
> > >
> > > +    if (ms->smp.cpus > 1 && !vmc->no_cpu_topology) {
> > I'm not really sure why we need to care about "ms->smp.cpus > 1" here?
> > 
> > IMO, just like MADT in which we create both ENABLED and DISABLED
> > gicc nodes no matter of number of ENABLED nodes is one or not, we
> > should create PPTT table for all the possible cpus and not care about
> > number of smp cpus, too. This will be more consistent with the ACPI
> > specification and the PPTT table will be used for ACPI cpu hotplug in
> > the future even with  "smp.cpus == 1".
> 
> 
> A humble request:
> Let us not anticipate the changes of vcpu Hotplug here. Things are fluid
> with respect to the vcpu Hotplug right now and I think it will not be
> right to base PPTT Table changes in anticipation of something we are not
> sure of what it looks like.
> 
> Any such decisions should be postponed and be made part of the actual
> vcpu Hotplug changes when(and if ever) they come for ARM64. This will
> also ensure proper review of such changes and useful in that particular
> context.

Hi Salil,

Can you please elaborate on this and send some pointers to the hot plug
discussions? You're not saying that we shouldn't try to generate PPTT
tables for AArch64 guests until a solution for hot plug has been
determined, are you? If so, I don't think I would agree with that. There
are benefits to properly describing cpu topology to guests, even without
hot plug. Those benefits, when vcpu pinning is used, are the same benefits
as for the host, which already uses PPTT tables to describe topology, even
though hot plug isn't supported.
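To illustrate the kind of topology-aware pinning meant here, the sketch
below (hypothetical, not QEMU or libvirt code) maps each vcpu onto a host
cpu in the matching host socket, so the sharing relationships the guest
PPTT advertises line up with the real hardware. The helper name and the
assumption that host sockets hold at least as many cpus as guest sockets
are mine, for illustration only.

```c
#include <assert.h>

/* Hypothetical helper: with 1:1 vcpu pinning, place the vcpus of each
 * guest socket onto host cpus of one host socket, so the topology the
 * guest sees matches the cpus it actually runs on.  Assumes the host
 * has at least guest_cpus_per_socket cpus per socket. */
static int vcpu_to_host_cpu(int vcpu, int guest_cpus_per_socket,
                            int host_cpus_per_socket)
{
    int socket = vcpu / guest_cpus_per_socket;   /* guest socket index */
    int offset = vcpu % guest_cpus_per_socket;   /* cpu within socket  */
    return socket * host_cpus_per_socket + offset;
}
```

With a guest of 2 sockets x 4 cpus on a host of 2 sockets x 8 cpus,
vcpus 0-3 land on host cpus 0-3 and vcpus 4-7 on host cpus 8-11.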

Now, if you're saying we should only generate tables for smp.cpus, not
smp.maxcpus, because hot plug isn't supported anyway, then I see your
point. But, it'd be better to require smp.cpus == smp.maxcpus in our
smp_parse function to do that, which we've never done before, so we may
have trouble supporting existing command lines.

Thanks,
drew

> 
> 
> Thanks
> 
> > 
> > Care of "smp.cpus > 1" in the DT cpu-map part makes sense to me,
> > because we are required to only add present cpu nodes to the DT and
> > Linux Doc says that a cpu-map is not needed for uniprocessor systems.
> > 
> > Thanks,
> > Yanan
> > > +        acpi_add_table(table_offsets, tables_blob);
> > > +        build_pptt(tables_blob, tables->linker, vms);
> > > +    }
> > > +
> > >       acpi_add_table(table_offsets, tables_blob);
> > >       build_gtdt(tables_blob, tables->linker, vms);
> > >
> 




* Re: [RFC PATCH v2 5/6] hw/arm/virt-acpi-build: Add PPTT table
  2021-05-18  7:17     ` Salil Mehta
  2021-05-18  7:42       ` Andrew Jones
@ 2021-05-18  9:16       ` wangyanan (Y)
  1 sibling, 0 replies; 40+ messages in thread
From: wangyanan (Y) @ 2021-05-18  9:16 UTC (permalink / raw)
  To: Salil Mehta, Andrew Jones
  Cc: Peter Maydell, linuxarm, Michael S . Tsirkin, qemu-devel,
	Linuxarm, Shannon Zhao, Igor Mammedov, qemu-arm,
	Alistair Francis, Zengtao (B),
	yangyicong, yuzenghui, Wanghaibin (D), zhukeqian, lijiajie (H),
	David Gibson


On 2021/5/18 15:17, Salil Mehta wrote:
>> From: Qemu-arm [mailto:qemu-arm-bounces+salil.mehta=huawei.com@nongnu.org]
>> On Behalf Of wangyanan (Y)
>> Sent: Thursday, May 13, 2021 6:10 AM
>>
>> Hi Drew,
>>
>> I got a question below, and hope your reply. Thanks!
>> On 2021/4/13 16:07, Yanan Wang wrote:
>>> Add the Processor Properties Topology Table (PPTT) to present
>>> CPU topology information to ACPI guests. Note, while a DT boot
>>> Linux guest with a non-flat CPU topology will see socket and
>>> core IDs being sequential integers starting from zero, e.g.
>>> with -smp 4,sockets=2,cores=2,threads=1
>>>
>>> a DT boot produces
>>>
>>>    cpu:  0 package_id:  0 core_id:  0
>>>    cpu:  1 package_id:  0 core_id:  1
>>>    cpu:  2 package_id:  1 core_id:  0
>>>    cpu:  3 package_id:  1 core_id:  1
>>>
>>> an ACPI boot produces
>>>
>>>    cpu:  0 package_id: 36 core_id:  0
>>>    cpu:  1 package_id: 36 core_id:  1
>>>    cpu:  2 package_id: 96 core_id:  2
>>>    cpu:  3 package_id: 96 core_id:  3
>>>
>>> This is due to several reasons:
>>>
>>>    1) DT cpu nodes do not have an equivalent field to what the PPTT
>>>       ACPI Processor ID must be, i.e. something equal to the MADT CPU
>>>       UID or equal to the UID of an ACPI processor container. In both
>>>       ACPI cases those are platform dependant IDs assigned by the
>>>       vendor.
>>>
>>>    2) While QEMU is the vendor for a guest, if the topology specifies
>>>       SMT (> 1 thread), then, with ACPI, it is impossible to assign a
>>>       core-id the same value as a package-id, thus it is not possible
>>>       to have package-id=0 and core-id=0. This is because package and
>>>       core containers must be in the same ACPI namespace and therefore
>>>       must have unique UIDs.
>>>
>>>    3) ACPI processor containers are not required for PPTT tables to
>>>       be used and, due to the limitations of which IDs are selected
>>>       described above in (2), they are not helpful for QEMU, so we
>>>       don't build them with this patch. In the absence of them, Linux
>>>       assigns its own unique IDs. The maintainers have chosen not to use
>>>       counters from zero, but rather ACPI table offsets, which explains
>>>       why the numbers are so much larger than with DT.
>>>
>>>    4) When there is no SMT (threads=1) the core IDs for ACPI boot guests
>>>       match the logical CPU IDs, because these IDs must be equal to the
>>>       MADT CPU UID (as no processor containers are present), and QEMU
>>>       uses the logical CPU ID for these MADT IDs.
>>>
>>> Tested-by: Jiajie Li <lijiajie11@huawei.com>
>>> Signed-off-by: Andrew Jones <drjones@redhat.com>
>>> Signed-off-by: Ying Fang <fangying1@huawei.com>
>>> Signed-off-by: Yanan Wang <wangyanan55@huawei.com>
>>> ---
>>>    hw/arm/virt-acpi-build.c | 63 ++++++++++++++++++++++++++++++++++++++++
>>>    1 file changed, 63 insertions(+)
>>>
>>> diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
>>> index 2ad5dad1bf..03fd812d5a 100644
>>> --- a/hw/arm/virt-acpi-build.c
>>> +++ b/hw/arm/virt-acpi-build.c
>>> @@ -436,6 +436,64 @@ build_srat(GArray *table_data, BIOSLinker *linker,
>> VirtMachineState *vms)
>>>                     vms->oem_table_id);
>>>    }
>>>
>>> +/* PPTT */
>>> +static void
>>> +build_pptt(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
>>> +{
>>> +    int pptt_start = table_data->len;
>>> +    int uid = 0, cpus = 0, socket = 0;
>>> +    MachineState *ms = MACHINE(vms);
>>> +    unsigned int smp_cores = ms->smp.cores;
>>> +    unsigned int smp_threads = ms->smp.threads;
>>> +
>>> +    acpi_data_push(table_data, sizeof(AcpiTableHeader));
>>> +
>>> +    for (socket = 0; cpus < ms->possible_cpus->len; socket++) {
>>> +        uint32_t socket_offset = table_data->len - pptt_start;
>>> +        int core;
>>> +
>>> +        build_processor_hierarchy_node(
>>> +            table_data, 1, /* Physical package */
>>> +            0, socket, /* No parent */
>>> +            NULL, 0);  /* No private resources */
>>> +
>>> +        for (core = 0; core < smp_cores; core++) {
>>> +            uint32_t core_offset = table_data->len - pptt_start;
>>> +            int thread;
>>> +
>>> +            if (smp_threads <= 1) {
>>> +                build_processor_hierarchy_node(
>>> +                    table_data,
>>> +                    (1 << 1) | /* ACPI Processor ID valid */
>>> +                    (1 << 3),  /* ACPI 6.3 - Node is a Leaf */
>>> +                    socket_offset, uid++, /* Parent is a Socket */
>>> +                    NULL, 0);  /* No private resources */
>>> +            } else {
>>> +                build_processor_hierarchy_node(
>>> +                    table_data, 0,
>>> +                    socket_offset, core, /* Parent is a Socket */
>>> +                    NULL, 0); /* No private resources */
>>> +
>>> +                for (thread = 0; thread < smp_threads; thread++) {
>>> +                    build_processor_hierarchy_node(
>>> +                        table_data,
>>> +                        (1 << 1) | /* ACPI Processor ID valid */
>>> +                        (1 << 2) | /* ACPI 6.3 - Processor is a Thread */
>>> +                        (1 << 3),  /* ACPI 6.3 - Node is a Leaf */
>>> +                        core_offset, uid++, /* Parent is a Core */
>>> +                        NULL, 0);  /* No private resources */
>>> +                }
>>> +            }
>>> +        }
>>> +        cpus += smp_cores * smp_threads;
>>> +    }
>>> +
>>> +    build_header(linker, table_data,
>>> +                 (void *)(table_data->data + pptt_start), "PPTT",
>>> +                 table_data->len - pptt_start, 2,
>>> +                 vms->oem_id, vms->oem_table_id);
>>> +}
>>> +
>>>    /* GTDT */
>>>    static void
>>>    build_gtdt(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
>>> @@ -707,6 +765,11 @@ void virt_acpi_build(VirtMachineState *vms,
>> AcpiBuildTables *tables)
>>>        acpi_add_table(table_offsets, tables_blob);
>>>        build_madt(tables_blob, tables->linker, vms);
>>>
>>> +    if (ms->smp.cpus > 1 && !vmc->no_cpu_topology) {
>> I'm not really sure why we need to care about "ms->smp.cpus > 1" here?
>>
>> IMO, just like MADT in which we create both ENABLED and DISABLED
>> gicc nodes no matter of number of ENABLED nodes is one or not, we
>> should create PPTT table for all the possible cpus and not care about
>> number of smp cpus, too. This will be more consistent with the ACPI
>> specification and the PPTT table will be used for ACPI cpu hotplug in
>> the future even with  "smp.cpus == 1".
>
> A humble request:
> Let us not anticipate the changes of vcpu Hotplug here. Things are fluid
> with respect to the vcpu Hotplug right now and I think it will not be
> right to base PPTT Table changes in anticipation of something we are not
> sure of what it looks like.
Hi Salil,

I agree that I shouldn't anticipate vcpu hotplug, which has little
connection with this series. So it's not appropriate to consider too
much of it when generating the PPTT. I'm guessing this is what you mean.

That said, PPTT generation is needed to expose the cpu topology to the
guest, and the ACPI spec also indicates that we should provide the
hierarchy information of all cpus. See [1] (note the info at page 260).
Can we agree on this?

[1] https://uefi.org/sites/default/files/resources/ACPI_6_3_final_Jan30.pdf
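To make concrete what covering all possible cpus implies, here is a rough
model (a sketch of mine, not QEMU code) of how many processor hierarchy
nodes the build_pptt() loop quoted above emits for a given topology:

```c
#include <assert.h>

/* Rough model of the node count produced by the build_pptt() loop
 * quoted above, assuming possible_cpus->len == sockets*cores*threads:
 * one package node per socket, then either one leaf node per core
 * (threads <= 1) or a core node plus one leaf node per thread. */
static int pptt_node_count(int sockets, int cores, int threads)
{
    if (threads <= 1) {
        return sockets + sockets * cores;        /* packages + core leaves */
    }
    return sockets + sockets * cores
           + sockets * cores * threads;          /* + thread leaves */
}
```

E.g. -smp 4,sockets=2,cores=2,threads=1 yields 6 nodes, while
-smp 16,sockets=2,cores=4,threads=2 yields 26.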

Thanks,
Yanan
> Any such decisions should be postponed and be made part of the actual
> vcpu Hotplug changes when(and if ever) they come for ARM64. This will
> also ensure proper review of such changes and useful in that particular
> context.
>
>
> Thanks
>
>> Care of "smp.cpus > 1" in the DT cpu-map part makes sense to me,
>> because we are required to only add present cpu nodes to the DT and
>> Linux Doc says that a cpu-map is not needed for uniprocessor systems.
>>
>> Thanks,
>> Yanan
>>> +        acpi_add_table(table_offsets, tables_blob);
>>> +        build_pptt(tables_blob, tables->linker, vms);
>>> +    }
>>> +
>>>        acpi_add_table(table_offsets, tables_blob);
>>>        build_gtdt(tables_blob, tables->linker, vms);
>>>



* RE: [RFC PATCH v2 5/6] hw/arm/virt-acpi-build: Add PPTT table
  2021-05-18  7:42       ` Andrew Jones
@ 2021-05-18 18:34         ` Salil Mehta
  2021-05-18 19:05           ` Andrew Jones
  0 siblings, 1 reply; 40+ messages in thread
From: Salil Mehta @ 2021-05-18 18:34 UTC (permalink / raw)
  To: Andrew Jones
  Cc: Peter Maydell, linuxarm, Michael S . Tsirkin, qemu-devel,
	Linuxarm, wangyanan (Y),
	Shannon Zhao, Igor Mammedov, qemu-arm, Alistair Francis,
	Zengtao (B), yangyicong, yuzenghui, Wanghaibin (D),
	zhukeqian, lijiajie (H),
	David Gibson

> From: Andrew Jones [mailto:drjones@redhat.com]
> Sent: Tuesday, May 18, 2021 8:42 AM
> To: Salil Mehta <salil.mehta@huawei.com>
> Cc: wangyanan (Y) <wangyanan55@huawei.com>; Peter Maydell
> <peter.maydell@linaro.org>; Michael S . Tsirkin <mst@redhat.com>; Wanghaibin
> (D) <wanghaibin.wang@huawei.com>; qemu-devel@nongnu.org; Shannon Zhao
> <shannon.zhaosl@gmail.com>; qemu-arm@nongnu.org; Alistair Francis
> <alistair.francis@wdc.com>; Zengtao (B) <prime.zeng@hisilicon.com>;
> yangyicong <yangyicong@huawei.com>; yuzenghui <yuzenghui@huawei.com>; Igor
> Mammedov <imammedo@redhat.com>; zhukeqian <zhukeqian1@huawei.com>; lijiajie (H)
> <lijiajie11@huawei.com>; David Gibson <david@gibson.dropbear.id.au>; Linuxarm
> <linuxarm@huawei.com>; linuxarm@openeuler.org
> Subject: Re: [RFC PATCH v2 5/6] hw/arm/virt-acpi-build: Add PPTT table
> 
> On Tue, May 18, 2021 at 07:17:56AM +0000, Salil Mehta wrote:
> > > From: Qemu-arm
> [mailto:qemu-arm-bounces+salil.mehta=huawei.com@nongnu.org]
> > > On Behalf Of wangyanan (Y)
> > > Sent: Thursday, May 13, 2021 6:10 AM
> > >
> > > Hi Drew,
> > >
> > > I got a question below, and hope your reply. Thanks!
> > > On 2021/4/13 16:07, Yanan Wang wrote:
> > > > Add the Processor Properties Topology Table (PPTT) to present
> > > > CPU topology information to ACPI guests. Note, while a DT boot
> > > > Linux guest with a non-flat CPU topology will see socket and
> > > > core IDs being sequential integers starting from zero, e.g.
> > > > with -smp 4,sockets=2,cores=2,threads=1
> > > >
> > > > a DT boot produces
> > > >
> > > >   cpu:  0 package_id:  0 core_id:  0
> > > >   cpu:  1 package_id:  0 core_id:  1
> > > >   cpu:  2 package_id:  1 core_id:  0
> > > >   cpu:  3 package_id:  1 core_id:  1
> > > >
> > > > an ACPI boot produces
> > > >
> > > >   cpu:  0 package_id: 36 core_id:  0
> > > >   cpu:  1 package_id: 36 core_id:  1
> > > >   cpu:  2 package_id: 96 core_id:  2
> > > >   cpu:  3 package_id: 96 core_id:  3
> > > >
> > > > This is due to several reasons:
> > > >
> > > >   1) DT cpu nodes do not have an equivalent field to what the PPTT
> > > >      ACPI Processor ID must be, i.e. something equal to the MADT CPU
> > > >      UID or equal to the UID of an ACPI processor container. In both
> > > >      ACPI cases those are platform dependant IDs assigned by the
> > > >      vendor.
> > > >
> > > >   2) While QEMU is the vendor for a guest, if the topology specifies
> > > >      SMT (> 1 thread), then, with ACPI, it is impossible to assign a
> > > >      core-id the same value as a package-id, thus it is not possible
> > > >      to have package-id=0 and core-id=0. This is because package and
> > > >      core containers must be in the same ACPI namespace and therefore
> > > >      must have unique UIDs.
> > > >
> > > >   3) ACPI processor containers are not required for PPTT tables to
> > > >      be used and, due to the limitations of which IDs are selected
> > > >      described above in (2), they are not helpful for QEMU, so we
> > > >      don't build them with this patch. In the absence of them, Linux
> > > >      assigns its own unique IDs. The maintainers have chosen not to use
> > > >      counters from zero, but rather ACPI table offsets, which explains
> > > >      why the numbers are so much larger than with DT.
> > > >
> > > >   4) When there is no SMT (threads=1) the core IDs for ACPI boot guests
> > > >      match the logical CPU IDs, because these IDs must be equal to the
> > > >      MADT CPU UID (as no processor containers are present), and QEMU
> > > >      uses the logical CPU ID for these MADT IDs.
> > > >
> > > > Tested-by: Jiajie Li <lijiajie11@huawei.com>
> > > > Signed-off-by: Andrew Jones <drjones@redhat.com>
> > > > Signed-off-by: Ying Fang <fangying1@huawei.com>
> > > > Signed-off-by: Yanan Wang <wangyanan55@huawei.com>
> > > > ---
> > > >   hw/arm/virt-acpi-build.c | 63 ++++++++++++++++++++++++++++++++++++++++
> > > >   1 file changed, 63 insertions(+)
> > > >
> > > > diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
> > > > index 2ad5dad1bf..03fd812d5a 100644
> > > > --- a/hw/arm/virt-acpi-build.c
> > > > +++ b/hw/arm/virt-acpi-build.c
> > > > @@ -436,6 +436,64 @@ build_srat(GArray *table_data, BIOSLinker *linker,
> > > VirtMachineState *vms)
> > > >                    vms->oem_table_id);
> > > >   }
> > > >
> > > > +/* PPTT */
> > > > +static void
> > > > +build_pptt(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
> > > > +{
> > > > +    int pptt_start = table_data->len;
> > > > +    int uid = 0, cpus = 0, socket = 0;
> > > > +    MachineState *ms = MACHINE(vms);
> > > > +    unsigned int smp_cores = ms->smp.cores;
> > > > +    unsigned int smp_threads = ms->smp.threads;
> > > > +
> > > > +    acpi_data_push(table_data, sizeof(AcpiTableHeader));
> > > > +
> > > > +    for (socket = 0; cpus < ms->possible_cpus->len; socket++) {
> > > > +        uint32_t socket_offset = table_data->len - pptt_start;
> > > > +        int core;
> > > > +
> > > > +        build_processor_hierarchy_node(
> > > > +            table_data, 1, /* Physical package */
> > > > +            0, socket, /* No parent */
> > > > +            NULL, 0);  /* No private resources */
> > > > +
> > > > +        for (core = 0; core < smp_cores; core++) {
> > > > +            uint32_t core_offset = table_data->len - pptt_start;
> > > > +            int thread;
> > > > +
> > > > +            if (smp_threads <= 1) {
> > > > +                build_processor_hierarchy_node(
> > > > +                    table_data,
> > > > +                    (1 << 1) | /* ACPI Processor ID valid */
> > > > +                    (1 << 3),  /* ACPI 6.3 - Node is a Leaf */
> > > > +                    socket_offset, uid++, /* Parent is a Socket */
> > > > +                    NULL, 0);  /* No private resources */
> > > > +            } else {
> > > > +                build_processor_hierarchy_node(
> > > > +                    table_data, 0,
> > > > +                    socket_offset, core, /* Parent is a Socket */
> > > > +                    NULL, 0); /* No private resources */
> > > > +
> > > > +                for (thread = 0; thread < smp_threads; thread++) {
> > > > +                    build_processor_hierarchy_node(
> > > > +                        table_data,
> > > > +                        (1 << 1) | /* ACPI Processor ID valid */
> > > > +                        (1 << 2) | /* ACPI 6.3 - Processor is a Thread */
> > > > +                        (1 << 3),  /* ACPI 6.3 - Node is a Leaf */
> > > > +                        core_offset, uid++, /* Parent is a Core */
> > > > +                        NULL, 0);  /* No private resources */
> > > > +                }
> > > > +            }
> > > > +        }
> > > > +        cpus += smp_cores * smp_threads;
> > > > +    }
> > > > +
> > > > +    build_header(linker, table_data,
> > > > +                 (void *)(table_data->data + pptt_start), "PPTT",
> > > > +                 table_data->len - pptt_start, 2,
> > > > +                 vms->oem_id, vms->oem_table_id);
> > > > +}
> > > > +
> > > >   /* GTDT */
> > > >   static void
> > > >   build_gtdt(GArray *table_data, BIOSLinker *linker, VirtMachineState
> *vms)
> > > > @@ -707,6 +765,11 @@ void virt_acpi_build(VirtMachineState *vms,
> > > AcpiBuildTables *tables)
> > > >       acpi_add_table(table_offsets, tables_blob);
> > > >       build_madt(tables_blob, tables->linker, vms);
> > > >
> > > > +    if (ms->smp.cpus > 1 && !vmc->no_cpu_topology) {
> > > I'm not really sure why we need to care about "ms->smp.cpus > 1" here?
> > >
> > > IMO, just like MADT in which we create both ENABLED and DISABLED
> > > gicc nodes no matter of number of ENABLED nodes is one or not, we
> > > should create PPTT table for all the possible cpus and not care about
> > > number of smp cpus, too. This will be more consistent with the ACPI
> > > specification and the PPTT table will be used for ACPI cpu hotplug in
> > > the future even with  "smp.cpus == 1".
> >
> >
> > A humble request:
> > Let us not anticipate the changes of vcpu Hotplug here. Things are fluid
> > with respect to the vcpu Hotplug right now and I think it will not be
> > right to base PPTT Table changes in anticipation of something we are not
> > sure of what it looks like.
> >
> > Any such decisions should be postponed and be made part of the actual
> > vcpu Hotplug changes when(and if ever) they come for ARM64. This will
> > also ensure proper review of such changes and useful in that particular
> > context.
> 
> Hi Salil,
> 
> Can you please elaborate on this and send some pointers to the hot plug
> discussions? 

Hi Andrew,
As you are aware, ACPI-based vcpu hotplug is under contention right now.
It is being discussed within ARM to have a hotplug mechanism which does
not involve QEMU<->guest ACPI hotplug exchanges and is purely based on
PSCI triggers (which might take a different ACPI path). If you wish, you
can join the Linaro Open Discussion meeting on this topic; all these
discussions have been happening there:

https://linaro.atlassian.net/wiki/spaces/LOD/pages/26844463630/2021-5-25+Meeting+Meeting+notes


> You're not saying that we shouldn't try to generate PPTT
> tables for AArch64 guests until a solution for hot plug has been
> determined, are you? 

Sorry, I did not mean that. The PPTT changes are independent of vcpu
hotplug support and are still required without it. No problem with that.


> If so, I don't think I would agree with that. There
> are benefits to properly describing cpu topology to guests, even without
> hot plug.

Agreed. No second thoughts on that.

> Those benefits, when vcpu pinning is used, are the same benefits
> as for the host, which already use PPTT tables to describe topology, even
> though hot plug isn't supported.

Yes, sure. You mean pinning vcpus according to the cpu topology for performance?

> 
> Now, if you're saying we should only generate tables for smp.cpus, not

Correct. This is what I thought we must be doing even now.

> smp.maxcpus, because hot plug isn't supported anyway, then I see your
> point. But, it'd be better to require smp.cpus == smp.maxcpus in our
> smp_parse function to do that, which we've never done before, so we may
> have trouble supporting existing command lines.

I am trying to recall: if vcpu hotplug is not supported, can they
ever be different?

cpus =  (threads * cores * sockets)

static void smp_parse(MachineState *ms, QemuOpts *opts)
{
     [...]

        if (sockets * cores * threads != ms->smp.max_cpus) {
            warn_report("Invalid CPU topology deprecated: "
                        "sockets (%u) * cores (%u) * threads (%u) "
                        "!= maxcpus (%u)",
                        sockets, cores, threads,
                        ms->smp.max_cpus);
        }
     [...]
}
  
Although, the above check does not exit(1) and just warns on detecting an
invalid CPU topology. I am not sure why.
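A stricter variant of that check, along the lines being discussed, might
look like the following sketch (hypothetical, not the actual QEMU
smp_parse() code; the function name and return convention are mine):
reject the topology instead of warning, and additionally require
cpus == maxcpus while hotplug is unsupported.

```c
#include <assert.h>

/* Hypothetical stricter validation, not actual QEMU code: fail instead
 * of merely warning when the topology product disagrees with maxcpus,
 * and require cpus == maxcpus while cpu hotplug is unsupported. */
static int validate_smp(unsigned sockets, unsigned cores, unsigned threads,
                        unsigned cpus, unsigned max_cpus)
{
    if (sockets * cores * threads != max_cpus) {
        return -1;  /* invalid: topology product must equal maxcpus */
    }
    if (cpus != max_cpus) {
        return -1;  /* no hotplug: all cpus must be present at boot */
    }
    return 0;
}
```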

Well, if you think there are subtleties in supporting the above and we
cannot do it now, then sure, it is your call. :)

I just thought to slim the patch set down and group the relevant logic
in the places where it would ideally have made more sense to review.


Thanks
Salil.


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [RFC PATCH v2 5/6] hw/arm/virt-acpi-build: Add PPTT table
  2021-05-18 18:34         ` Salil Mehta
@ 2021-05-18 19:05           ` Andrew Jones
  2021-05-18 19:22             ` Salil Mehta
  0 siblings, 1 reply; 40+ messages in thread
From: Andrew Jones @ 2021-05-18 19:05 UTC (permalink / raw)
  To: Salil Mehta
  Cc: Peter Maydell, linuxarm, Michael S . Tsirkin, qemu-devel,
	Linuxarm, wangyanan (Y),
	Shannon Zhao, Igor Mammedov, qemu-arm, Alistair Francis,
	Zengtao (B), yangyicong, yuzenghui, Wanghaibin (D),
	zhukeqian, lijiajie (H),
	David Gibson

On Tue, May 18, 2021 at 06:34:08PM +0000, Salil Mehta wrote:
> > Those benefits, when vcpu pinning is used, are the same benefits
> > as for the host, which already use PPTT tables to describe topology, even
> > though hot plug isn't supported.
> 
> yes sure, you mean pinning vcpus according to the cpu topology for performance?

Yup

> 
> > 
> > Now, if you're saying we should only generate tables for smp.cpus, not
> 
> Correct. This is what I thought we must be doing even now
> 
> > smp.maxcpus, because hot plug isn't supported anyway, then I see your
> > point. But, it'd be better to require smp.cpus == smp.maxcpus in our
> > smp_parse function to do that, which we've never done before, so we may
> > have trouble supporting existing command lines.
> 
> I am trying to recall, if the vcpu Hotplug is not supported then can they
> ever be different?
> 
> cpus =  (threads * cores * sockets)
> 
> static void smp_parse(MachineState *ms, QemuOpts *opts)
> {
>      [...]
> 
>         if (sockets * cores * threads != ms->smp.max_cpus) {
>             warn_report("Invalid CPU topology deprecated: "
>                         "sockets (%u) * cores (%u) * threads (%u) "
>                         "!= maxcpus (%u)",
>                         sockets, cores, threads,
>                         ms->smp.max_cpus);
>         }
>      [...]
> }
>   
> Although, above check does not exit(1) and just warns on detecting invalid
> CPU topology. Not sure why?

Hmm, not sure what code you have there. I see this in
hw/core/machine.c:smp_parse

        if (ms->smp.max_cpus < cpus) {
            error_report("maxcpus must be equal to or greater than smp");
            exit(1);
        }

        if (sockets * cores * threads != ms->smp.max_cpus) {
            error_report("Invalid CPU topology: "
                         "sockets (%u) * cores (%u) * threads (%u) "
                         "!= maxcpus (%u)",
                         sockets, cores, threads,
                         ms->smp.max_cpus);
            exit(1);
        }

> 
> Well if you think there are subtleties to support above implementation and
> we cannot do it now then sure it is your call. :)

The problem is that -smp 4,maxcpus=8 doesn't error out today, even though
it doesn't do anything. OTOH, -smp 4,cores=2 doesn't error out either, but
we're proposing that it should. Maybe we can start erroring out when
cpus != maxcpus until hot plug is supported?

Thanks,
drew




* RE: [RFC PATCH v2 5/6] hw/arm/virt-acpi-build: Add PPTT table
  2021-05-18 19:05           ` Andrew Jones
@ 2021-05-18 19:22             ` Salil Mehta
  0 siblings, 0 replies; 40+ messages in thread
From: Salil Mehta @ 2021-05-18 19:22 UTC (permalink / raw)
  To: Andrew Jones
  Cc: Peter Maydell, linuxarm, Michael S . Tsirkin, qemu-devel,
	Linuxarm, wangyanan (Y),
	Shannon Zhao, Igor Mammedov, qemu-arm, Alistair Francis,
	Zengtao (B), yangyicong, yuzenghui, Wanghaibin (D),
	zhukeqian, lijiajie (H),
	David Gibson

> On Tue, May 18, 2021 at 06:34:08PM +0000, Salil Mehta wrote:
> > > Those benefits, when vcpu pinning is used, are the same benefits
> > > as for the host, which already use PPTT tables to describe topology, even
> > > though hot plug isn't supported.
> >
> > yes sure, you mean pinning vcpus according to the cpu topology for performance?
> 
> Yup

Already agreed. :)

> > > Now, if you're saying we should only generate tables for smp.cpus, not
> >
> > Correct. This is what I thought we must be doing even now
> >
> > > smp.maxcpus, because hot plug isn't supported anyway, then I see your
> > > point. But, it'd be better to require smp.cpus == smp.maxcpus in our
> > > smp_parse function to do that, which we've never done before, so we may
> > > have trouble supporting existing command lines.
> >
> > I am trying to recall, if the vcpu Hotplug is not supported then can they
> > ever be different?
> >
> > cpus =  (threads * cores * sockets)
> >
> > static void smp_parse(MachineState *ms, QemuOpts *opts)
> > {
> >      [...]
> >
> >         if (sockets * cores * threads != ms->smp.max_cpus) {
> >             warn_report("Invalid CPU topology deprecated: "
> >                         "sockets (%u) * cores (%u) * threads (%u) "
> >                         "!= maxcpus (%u)",
> >                         sockets, cores, threads,
> >                         ms->smp.max_cpus);
> >         }
> >      [...]
> > }
> >
> > Although, above check does not exit(1) and just warns on detecting invalid
> > CPU topology. Not sure why?
> 
> Hmm, not sure what code you have there. I see this in
> hw/core/machine.c:smp_parse
> 
>         if (ms->smp.max_cpus < cpus) {
>             error_report("maxcpus must be equal to or greater than smp");
>             exit(1);
>         }
> 
>         if (sockets * cores * threads != ms->smp.max_cpus) {
>             error_report("Invalid CPU topology: "
>                          "sockets (%u) * cores (%u) * threads (%u) "
>                          "!= maxcpus (%u)",
>                          sockets, cores, threads,
>                          ms->smp.max_cpus);
>             exit(1);
>         }
> 
> >
> > Well if you think there are subtleties to support above implementation and
> > we cannot do it now then sure it is your call. :)
> 
> The problem is that -smp 4,maxcpus=8 doesn't error out today, even though
> it doesn't do anything. OTOH, -smp 4,cores=2 doesn't error out either, but
> we're proposing that it should. Maybe we can start erroring out when
> cpus != maxcpus until hot plug is supported?

Agreed. Neither makes sense if hotplug is not supported, and ideally both
should fail with an error. We should block any such topology configuration.


Thanks
Salil



end of thread, back to index

Thread overview: 40+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-04-13  8:07 [RFC PATCH v2 0/6] hw/arm/virt: Introduce cpu topology support Yanan Wang
2021-04-13  8:07 ` [RFC PATCH v2 1/6] device_tree: Add qemu_fdt_add_path Yanan Wang
2021-04-16  4:52   ` David Gibson
2021-04-17  2:36     ` wangyanan (Y)
2021-04-19  1:13       ` David Gibson
2021-04-19  7:02         ` wangyanan (Y)
2021-04-13  8:07 ` [RFC PATCH v2 2/6] hw/arm/virt: DT: Add cpu-map Yanan Wang
2021-04-27  9:47   ` Philippe Mathieu-Daudé
2021-04-27 10:04     ` Andrew Jones
2021-04-27 12:36       ` Philippe Mathieu-Daudé
2021-04-28  6:36         ` wangyanan (Y)
2021-05-13  6:58   ` Andrew Jones
2021-05-13  7:15     ` wangyanan (Y)
2021-04-13  8:07 ` [RFC PATCH v2 3/6] hw/arm/virt-acpi-build: Distinguish possible and present cpus Yanan Wang
2021-04-27 13:18   ` Andrew Jones
2021-04-28  6:42     ` wangyanan (Y)
2021-04-27 14:50   ` Andrew Jones
2021-04-28  6:47     ` wangyanan (Y)
2021-04-13  8:07 ` [RFC PATCH v2 4/6] hw/acpi/aml-build: Add processor hierarchy node structure Yanan Wang
2021-04-27 13:37   ` Andrew Jones
2021-04-28  6:59     ` wangyanan (Y)
2021-04-13  8:07 ` [RFC PATCH v2 5/6] hw/arm/virt-acpi-build: Add PPTT table Yanan Wang
2021-04-27 14:16   ` Andrew Jones
2021-04-28  7:30     ` wangyanan (Y)
2021-05-13  5:10   ` wangyanan (Y)
2021-05-13  6:55     ` Andrew Jones
2021-05-18  7:17     ` Salil Mehta
2021-05-18  7:42       ` Andrew Jones
2021-05-18 18:34         ` Salil Mehta
2021-05-18 19:05           ` Andrew Jones
2021-05-18 19:22             ` Salil Mehta
2021-05-18  9:16       ` wangyanan (Y)
2021-04-13  8:07 ` [RFC PATCH v2 6/6] hw/arm/virt: Replace smp_parse with one that prefers cores Yanan Wang
2021-04-27 14:58   ` Andrew Jones
2021-04-28  8:04     ` wangyanan (Y)
2021-04-28  9:36     ` wangyanan (Y)
2021-04-28 10:13       ` Andrew Jones
2021-04-29  2:21         ` wangyanan (Y)
2021-04-21  7:40 ` [RFC PATCH v2 0/6] hw/arm/virt: Introduce cpu topology support wangyanan (Y)
2021-04-21  9:31 ` wangyanan (Y)
