qemu-devel.nongnu.org archive mirror
* [PATCH v4 0/3] NUMA: Apply cluster-NUMA-node boundary for aarch64 and riscv machines
@ 2023-03-17  6:25 Gavin Shan
  2023-03-17  6:25 ` [PATCH v4 1/3] numa: Validate cluster and NUMA node boundary if required Gavin Shan
                   ` (3 more replies)
  0 siblings, 4 replies; 12+ messages in thread
From: Gavin Shan @ 2023-03-17  6:25 UTC (permalink / raw)
  To: qemu-arm
  Cc: qemu-devel, qemu-riscv, rad, peter.maydell, quic_llindhol,
	eduardo, marcel.apfelbaum, philmd, wangyanan55, palmer,
	alistair.francis, bin.meng, thuth, lvivier, pbonzini, imammedo,
	ajones, berrange, dbarboza, yihyu, shan.gavin

On arm64 and riscv, the Linux guest populates its CPU topology through
the arch_topology driver (/base/arch_topology.c). The CPUs in one cluster
must not span multiple NUMA nodes. Otherwise, the Linux scheduler can't
build its scheduling domains, as the following warning message indicates.
To avoid this confusion, the series warns about such irregular
configurations.

   -smp 6,maxcpus=6,sockets=2,clusters=1,cores=3,threads=1 \
   -numa node,nodeid=0,cpus=0-1,memdev=ram0                \
   -numa node,nodeid=1,cpus=2-3,memdev=ram1                \
   -numa node,nodeid=2,cpus=4-5,memdev=ram2                \

   ------------[ cut here ]------------
   WARNING: CPU: 0 PID: 1 at kernel/sched/topology.c:2271 build_sched_domains+0x284/0x910
   Modules linked in:
   CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.14.0-268.el9.aarch64 #1
   pstate: 00400005 (nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
   pc : build_sched_domains+0x284/0x910
   lr : build_sched_domains+0x184/0x910
   sp : ffff80000804bd50
   x29: ffff80000804bd50 x28: 0000000000000002 x27: 0000000000000000
   x26: ffff800009cf9a80 x25: 0000000000000000 x24: ffff800009cbf840
   x23: ffff000080325000 x22: ffff0000005df800 x21: ffff80000a4ce508
   x20: 0000000000000000 x19: ffff000080324440 x18: 0000000000000014
   x17: 00000000388925c0 x16: 000000005386a066 x15: 000000009c10cc2e
   x14: 00000000000001c0 x13: 0000000000000001 x12: ffff00007fffb1a0
   x11: ffff00007fffb180 x10: ffff80000a4ce508 x9 : 0000000000000041
   x8 : ffff80000a4ce500 x7 : ffff80000a4cf920 x6 : 0000000000000001
   x5 : 0000000000000001 x4 : 0000000000000007 x3 : 0000000000000002
   x2 : 0000000000001000 x1 : ffff80000a4cf928 x0 : 0000000000000001
   Call trace:
    build_sched_domains+0x284/0x910
    sched_init_domains+0xac/0xe0
    sched_init_smp+0x48/0xc8
    kernel_init_freeable+0x140/0x1ac
    kernel_init+0x28/0x140
    ret_from_fork+0x10/0x20
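
To see why the example above is irregular, it helps to map each CPU index
to its socket and cluster. The sketch below is only an illustration: the
struct and function names are hypothetical (not QEMU types), and it assumes
CPUs are numbered thread-first, then core, cluster and socket, which may
differ from what a given machine actually does.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the -smp parameters; not a QEMU type. */
struct smp_layout {
    unsigned sockets, clusters, cores, threads;
};

/* Hypothetical stand-in for the topology IDs of one CPU. */
struct cpu_topo {
    int64_t socket_id, cluster_id, core_id, thread_id;
};

/*
 * Derive topology IDs from a linear CPU index, assuming CPUs are numbered
 * thread-first, then core, cluster and socket.
 */
static struct cpu_topo cpu_index_to_topo(const struct smp_layout *smp,
                                         unsigned index)
{
    struct cpu_topo t;

    assert(smp->clusters && smp->cores && smp->threads);
    t.thread_id = index % smp->threads;
    t.core_id = (index / smp->threads) % smp->cores;
    t.cluster_id = (index / (smp->threads * smp->cores)) % smp->clusters;
    t.socket_id = index / (smp->threads * smp->cores * smp->clusters);
    return t;
}

/* NUMA node of each CPU in the example: 0-1 -> 0, 2-3 -> 1, 4-5 -> 2. */
static int64_t example_cpu_to_node(unsigned index)
{
    return index / 2;
}
```

Under that assumption, sockets=2,clusters=1,cores=3,threads=1 puts CPUs 0-2
in socket 0 / cluster 0 and CPUs 3-5 in socket 1 / cluster 0. CPU 1 is in
node 0 while CPU 2 is in node 1, so the cluster in socket 0 spans two nodes.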

PATCH[1] Warn about the irregular configuration if required
PATCH[2] Enable the validation for aarch64 machines
PATCH[3] Enable the validation for riscv machines

v3: https://lists.nongnu.org/archive/html/qemu-arm/2023-02/msg01226.html
v2: https://lists.nongnu.org/archive/html/qemu-arm/2023-02/msg01080.html
v1: https://lists.nongnu.org/archive/html/qemu-arm/2023-02/msg00886.html

Changelog
=========
v4:
  * Pick r-b and ack-b from Daniel/Philippe                   (Gavin)
  * Replace local variable @len with possible_cpus->len in
    validate_cpu_cluster_to_numa_boundary()                   (Philippe)
v3:
  * Validate cluster-to-NUMA instead of socket-to-NUMA
    boundary                                                  (Gavin)
  * Move the switch from MachineState to MachineClass         (Philippe)
  * Warning instead of rejecting the irregular configuration  (Daniel)
  * Comments to mention cluster-to-NUMA is platform instead
    of architectural choice                                   (Drew)
  * Drop PATCH[v2 1/4] related to qtests/numa-test            (Gavin)
v2:
  * Fix socket-NUMA-node boundary issues in qtests/numa-test  (Gavin)
  * Add helper set_numa_socket_boundary() and validate the
    boundary in the generic path                              (Philippe)

Gavin Shan (3):
  numa: Validate cluster and NUMA node boundary if required
  hw/arm: Validate cluster and NUMA node boundary
  hw/riscv: Validate cluster and NUMA node boundary

 hw/arm/sbsa-ref.c   |  2 ++
 hw/arm/virt.c       |  2 ++
 hw/core/machine.c   | 42 ++++++++++++++++++++++++++++++++++++++++++
 hw/riscv/spike.c    |  2 ++
 hw/riscv/virt.c     |  2 ++
 include/hw/boards.h |  1 +
 6 files changed, 51 insertions(+)

-- 
2.23.0




* [PATCH v4 1/3] numa: Validate cluster and NUMA node boundary if required
  2023-03-17  6:25 [PATCH v4 0/3] NUMA: Apply cluster-NUMA-node boundary for aarch64 and riscv machines Gavin Shan
@ 2023-03-17  6:25 ` Gavin Shan
  2023-03-21 11:39   ` Alistair Francis
  2023-03-17  6:25 ` [PATCH v4 2/3] hw/arm: Validate cluster and NUMA node boundary Gavin Shan
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 12+ messages in thread
From: Gavin Shan @ 2023-03-17  6:25 UTC (permalink / raw)
  To: qemu-arm
  Cc: qemu-devel, qemu-riscv, rad, peter.maydell, quic_llindhol,
	eduardo, marcel.apfelbaum, philmd, wangyanan55, palmer,
	alistair.francis, bin.meng, thuth, lvivier, pbonzini, imammedo,
	ajones, berrange, dbarboza, yihyu, shan.gavin

On some architectures like ARM64, multiple CPUs in one cluster can be
associated with different NUMA nodes. This is an irregular configuration
because it doesn't occur in a bare-metal environment. The irregular
configuration causes the Linux guest to misbehave, as the following
warning messages indicate.

  -smp 6,maxcpus=6,sockets=2,clusters=1,cores=3,threads=1 \
  -numa node,nodeid=0,cpus=0-1,memdev=ram0                \
  -numa node,nodeid=1,cpus=2-3,memdev=ram1                \
  -numa node,nodeid=2,cpus=4-5,memdev=ram2                \

  ------------[ cut here ]------------
  WARNING: CPU: 0 PID: 1 at kernel/sched/topology.c:2271 build_sched_domains+0x284/0x910
  Modules linked in:
  CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.14.0-268.el9.aarch64 #1
  pstate: 00400005 (nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
  pc : build_sched_domains+0x284/0x910
  lr : build_sched_domains+0x184/0x910
  sp : ffff80000804bd50
  x29: ffff80000804bd50 x28: 0000000000000002 x27: 0000000000000000
  x26: ffff800009cf9a80 x25: 0000000000000000 x24: ffff800009cbf840
  x23: ffff000080325000 x22: ffff0000005df800 x21: ffff80000a4ce508
  x20: 0000000000000000 x19: ffff000080324440 x18: 0000000000000014
  x17: 00000000388925c0 x16: 000000005386a066 x15: 000000009c10cc2e
  x14: 00000000000001c0 x13: 0000000000000001 x12: ffff00007fffb1a0
  x11: ffff00007fffb180 x10: ffff80000a4ce508 x9 : 0000000000000041
  x8 : ffff80000a4ce500 x7 : ffff80000a4cf920 x6 : 0000000000000001
  x5 : 0000000000000001 x4 : 0000000000000007 x3 : 0000000000000002
  x2 : 0000000000001000 x1 : ffff80000a4cf928 x0 : 0000000000000001
  Call trace:
   build_sched_domains+0x284/0x910
   sched_init_domains+0xac/0xe0
   sched_init_smp+0x48/0xc8
   kernel_init_freeable+0x140/0x1ac
   kernel_init+0x28/0x140
   ret_from_fork+0x10/0x20

Warn when multiple CPUs in one cluster have been associated with
different NUMA nodes. However, one NUMA node is still allowed to be
associated with different clusters.

Signed-off-by: Gavin Shan <gshan@redhat.com>
Acked-by: Philippe Mathieu-Daudé <philmd@linaro.org>
---
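The rule in the commit message can be tried outside QEMU with a reduced
model. This is a sketch under stated assumptions, not the patch's code:
struct cpu_props and both helpers are hypothetical names, and the function
counts offending CPU pairs instead of printing a warning so that it is easy
to test.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified, hypothetical subset of the per-CPU properties. */
struct cpu_props {
    bool has_socket_id, has_cluster_id, has_node_id;
    int64_t socket_id, cluster_id, node_id;
};

/* True when both CPUs sit in the same cluster of the same socket. */
static bool same_cluster(const struct cpu_props *a, const struct cpu_props *b)
{
    return a->has_socket_id && a->has_cluster_id &&
           b->has_socket_id && b->has_cluster_id &&
           a->socket_id == b->socket_id &&
           a->cluster_id == b->cluster_id;
}

/*
 * Count the CPU pairs that share a cluster but not a NUMA node.
 * Zero means every cluster stays inside one node.
 */
static size_t count_cluster_numa_conflicts(const struct cpu_props *cpus,
                                           size_t len)
{
    size_t conflicts = 0;

    assert(cpus || len == 0);
    for (size_t i = 0; i < len; i++) {
        for (size_t j = i + 1; j < len; j++) {
            if (same_cluster(&cpus[i], &cpus[j]) &&
                cpus[i].has_node_id && cpus[j].has_node_id &&
                cpus[i].node_id != cpus[j].node_id) {
                conflicts++;
            }
        }
    }
    return conflicts;
}
```

A node that covers several whole clusters produces no conflict, because
the comparison only fires for CPUs that share both socket and cluster IDs.
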
 hw/core/machine.c   | 42 ++++++++++++++++++++++++++++++++++++++++++
 include/hw/boards.h |  1 +
 2 files changed, 43 insertions(+)

diff --git a/hw/core/machine.c b/hw/core/machine.c
index 45e3d24fdc..a2329f975d 100644
--- a/hw/core/machine.c
+++ b/hw/core/machine.c
@@ -1255,6 +1255,45 @@ static void machine_numa_finish_cpu_init(MachineState *machine)
     g_string_free(s, true);
 }
 
+static void validate_cpu_cluster_to_numa_boundary(MachineState *ms)
+{
+    MachineClass *mc = MACHINE_GET_CLASS(ms);
+    NumaState *state = ms->numa_state;
+    const CPUArchIdList *possible_cpus = mc->possible_cpu_arch_ids(ms);
+    const CPUArchId *cpus = possible_cpus->cpus;
+    int i, j;
+
+    if (state->num_nodes <= 1 || possible_cpus->len <= 1) {
+        return;
+    }
+
+    /*
+     * The Linux scheduling domain can't be parsed when multiple CPUs
+     * in one cluster have been associated with different NUMA nodes. However,
+     * it's fine to associate one NUMA node with CPUs in different clusters.
+     */
+    for (i = 0; i < possible_cpus->len; i++) {
+        for (j = i + 1; j < possible_cpus->len; j++) {
+            if (cpus[i].props.has_socket_id &&
+                cpus[i].props.has_cluster_id &&
+                cpus[i].props.has_node_id &&
+                cpus[j].props.has_socket_id &&
+                cpus[j].props.has_cluster_id &&
+                cpus[j].props.has_node_id &&
+                cpus[i].props.socket_id == cpus[j].props.socket_id &&
+                cpus[i].props.cluster_id == cpus[j].props.cluster_id &&
+                cpus[i].props.node_id != cpus[j].props.node_id) {
+                warn_report("CPU-%d and CPU-%d in socket-%" PRId64
+                            "-cluster-%" PRId64 " have been associated with "
+                            "node-%" PRId64 " and node-%" PRId64 " respectively"
+                            ". It can cause OSes like Linux to misbehave", i, j,
+                            cpus[i].props.socket_id, cpus[i].props.cluster_id,
+                            cpus[i].props.node_id, cpus[j].props.node_id);
+            }
+        }
+    }
+}
+
 MemoryRegion *machine_consume_memdev(MachineState *machine,
                                      HostMemoryBackend *backend)
 {
@@ -1340,6 +1379,9 @@ void machine_run_board_init(MachineState *machine, const char *mem_path, Error *
         numa_complete_configuration(machine);
         if (machine->numa_state->num_nodes) {
             machine_numa_finish_cpu_init(machine);
+            if (machine_class->cpu_cluster_has_numa_boundary) {
+                validate_cpu_cluster_to_numa_boundary(machine);
+            }
         }
     }
 
diff --git a/include/hw/boards.h b/include/hw/boards.h
index 6fbbfd56c8..c9793b2789 100644
--- a/include/hw/boards.h
+++ b/include/hw/boards.h
@@ -273,6 +273,7 @@ struct MachineClass {
     bool nvdimm_supported;
     bool numa_mem_supported;
     bool auto_enable_numa;
+    bool cpu_cluster_has_numa_boundary;
     SMPCompatProps smp_props;
     const char *default_ram_id;
 
-- 
2.23.0




* [PATCH v4 2/3] hw/arm: Validate cluster and NUMA node boundary
  2023-03-17  6:25 [PATCH v4 0/3] NUMA: Apply cluster-NUMA-node boundary for aarch64 and riscv machines Gavin Shan
  2023-03-17  6:25 ` [PATCH v4 1/3] numa: Validate cluster and NUMA node boundary if required Gavin Shan
@ 2023-03-17  6:25 ` Gavin Shan
  2023-03-17  6:25 ` [PATCH v4 3/3] hw/riscv: " Gavin Shan
  2023-03-27 13:26 ` [PATCH v4 0/3] NUMA: Apply cluster-NUMA-node boundary for aarch64 and riscv machines Igor Mammedov
  3 siblings, 0 replies; 12+ messages in thread
From: Gavin Shan @ 2023-03-17  6:25 UTC (permalink / raw)
  To: qemu-arm
  Cc: qemu-devel, qemu-riscv, rad, peter.maydell, quic_llindhol,
	eduardo, marcel.apfelbaum, philmd, wangyanan55, palmer,
	alistair.francis, bin.meng, thuth, lvivier, pbonzini, imammedo,
	ajones, berrange, dbarboza, yihyu, shan.gavin

There are two NUMA-aware ARM machines: 'virt' and 'sbsa-ref'. Both of
them are required to follow the cluster-NUMA-node boundary. Enable the
validation to warn about the irregular configuration where multiple CPUs
in one cluster have been associated with different NUMA nodes.

Signed-off-by: Gavin Shan <gshan@redhat.com>
---
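The opt-in pattern used here (a per-class flag that gates a generic check)
can be modeled in isolation. The sketch below is illustrative only:
struct machine_class, struct machine_state, board_init() and
validate_boundary() are hypothetical reductions, and only the flag name
cpu_cluster_has_numa_boundary is taken from the series.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, heavily reduced model of a machine class. */
struct machine_class {
    const char *name;
    bool cpu_cluster_has_numa_boundary; /* opt-in, false by default */
};

/* Hypothetical model of the per-instance state that matters here. */
struct machine_state {
    const struct machine_class *mc;
    int num_numa_nodes;
    int validations_run; /* bookkeeping for the sketch only */
};

/* Placeholder for the real boundary check; it only records the call. */
static void validate_boundary(struct machine_state *ms)
{
    ms->validations_run++;
}

/* Mirrors the gating order: NUMA configured first, then the class flag. */
static void board_init(struct machine_state *ms)
{
    assert(ms && ms->mc);
    if (ms->num_numa_nodes > 0 && ms->mc->cpu_cluster_has_numa_boundary) {
        validate_boundary(ms);
    }
}
```

Keeping the switch on the class means machines that never set the flag
skip the check entirely, so their behavior is unchanged.
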
 hw/arm/sbsa-ref.c | 2 ++
 hw/arm/virt.c     | 2 ++
 2 files changed, 4 insertions(+)

diff --git a/hw/arm/sbsa-ref.c b/hw/arm/sbsa-ref.c
index 0b93558dde..efb380e7c8 100644
--- a/hw/arm/sbsa-ref.c
+++ b/hw/arm/sbsa-ref.c
@@ -864,6 +864,8 @@ static void sbsa_ref_class_init(ObjectClass *oc, void *data)
     mc->possible_cpu_arch_ids = sbsa_ref_possible_cpu_arch_ids;
     mc->cpu_index_to_instance_props = sbsa_ref_cpu_index_to_props;
     mc->get_default_cpu_node_id = sbsa_ref_get_default_cpu_node_id;
+    /* platform instead of architectural choice */
+    mc->cpu_cluster_has_numa_boundary = true;
 }
 
 static const TypeInfo sbsa_ref_info = {
diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index ac626b3bef..b73ac6eabb 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -3030,6 +3030,8 @@ static void virt_machine_class_init(ObjectClass *oc, void *data)
     mc->smp_props.clusters_supported = true;
     mc->auto_enable_numa_with_memhp = true;
     mc->auto_enable_numa_with_memdev = true;
+    /* platform instead of architectural choice */
+    mc->cpu_cluster_has_numa_boundary = true;
     mc->default_ram_id = "mach-virt.ram";
 
     object_class_property_add(oc, "acpi", "OnOffAuto",
-- 
2.23.0




* [PATCH v4 3/3] hw/riscv: Validate cluster and NUMA node boundary
  2023-03-17  6:25 [PATCH v4 0/3] NUMA: Apply cluster-NUMA-node boundary for aarch64 and riscv machines Gavin Shan
  2023-03-17  6:25 ` [PATCH v4 1/3] numa: Validate cluster and NUMA node boundary if required Gavin Shan
  2023-03-17  6:25 ` [PATCH v4 2/3] hw/arm: Validate cluster and NUMA node boundary Gavin Shan
@ 2023-03-17  6:25 ` Gavin Shan
  2023-03-21 11:40   ` Alistair Francis
  2023-03-27 13:26 ` [PATCH v4 0/3] NUMA: Apply cluster-NUMA-node boundary for aarch64 and riscv machines Igor Mammedov
  3 siblings, 1 reply; 12+ messages in thread
From: Gavin Shan @ 2023-03-17  6:25 UTC (permalink / raw)
  To: qemu-arm
  Cc: qemu-devel, qemu-riscv, rad, peter.maydell, quic_llindhol,
	eduardo, marcel.apfelbaum, philmd, wangyanan55, palmer,
	alistair.francis, bin.meng, thuth, lvivier, pbonzini, imammedo,
	ajones, berrange, dbarboza, yihyu, shan.gavin

There are two NUMA-aware RISC-V machines: 'virt' and 'spike'. Both of
them are required to follow the cluster-NUMA-node boundary. Enable the
validation to warn about the irregular configuration where multiple CPUs
in one cluster have been associated with different NUMA nodes.

Signed-off-by: Gavin Shan <gshan@redhat.com>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
---
 hw/riscv/spike.c | 2 ++
 hw/riscv/virt.c  | 2 ++
 2 files changed, 4 insertions(+)

diff --git a/hw/riscv/spike.c b/hw/riscv/spike.c
index a584d5b3a2..4bf783884b 100644
--- a/hw/riscv/spike.c
+++ b/hw/riscv/spike.c
@@ -349,6 +349,8 @@ static void spike_machine_class_init(ObjectClass *oc, void *data)
     mc->cpu_index_to_instance_props = riscv_numa_cpu_index_to_props;
     mc->get_default_cpu_node_id = riscv_numa_get_default_cpu_node_id;
     mc->numa_mem_supported = true;
+    /* platform instead of architectural choice */
+    mc->cpu_cluster_has_numa_boundary = true;
     mc->default_ram_id = "riscv.spike.ram";
 }
 
diff --git a/hw/riscv/virt.c b/hw/riscv/virt.c
index 4e3efbee16..84a2bca460 100644
--- a/hw/riscv/virt.c
+++ b/hw/riscv/virt.c
@@ -1678,6 +1678,8 @@ static void virt_machine_class_init(ObjectClass *oc, void *data)
     mc->cpu_index_to_instance_props = riscv_numa_cpu_index_to_props;
     mc->get_default_cpu_node_id = riscv_numa_get_default_cpu_node_id;
     mc->numa_mem_supported = true;
+    /* platform instead of architectural choice */
+    mc->cpu_cluster_has_numa_boundary = true;
     mc->default_ram_id = "riscv_virt_board.ram";
     assert(!mc->get_hotplug_handler);
     mc->get_hotplug_handler = virt_machine_get_hotplug_handler;
-- 
2.23.0




* Re: [PATCH v4 1/3] numa: Validate cluster and NUMA node boundary if required
  2023-03-17  6:25 ` [PATCH v4 1/3] numa: Validate cluster and NUMA node boundary if required Gavin Shan
@ 2023-03-21 11:39   ` Alistair Francis
  0 siblings, 0 replies; 12+ messages in thread
From: Alistair Francis @ 2023-03-21 11:39 UTC (permalink / raw)
  To: Gavin Shan
  Cc: qemu-arm, qemu-devel, qemu-riscv, rad, peter.maydell,
	quic_llindhol, eduardo, marcel.apfelbaum, philmd, wangyanan55,
	palmer, alistair.francis, bin.meng, thuth, lvivier, pbonzini,
	imammedo, ajones, berrange, dbarboza, yihyu, shan.gavin

On Fri, Mar 17, 2023 at 4:29 PM Gavin Shan <gshan@redhat.com> wrote:
>
> For some architectures like ARM64, multiple CPUs in one cluster can be
> associated with different NUMA nodes, which is irregular configuration
> because we shouldn't have this in baremetal environment. The irregular
> configuration causes Linux guest to misbehave, as the following warning
> messages indicate.
>
>   -smp 6,maxcpus=6,sockets=2,clusters=1,cores=3,threads=1 \
>   -numa node,nodeid=0,cpus=0-1,memdev=ram0                \
>   -numa node,nodeid=1,cpus=2-3,memdev=ram1                \
>   -numa node,nodeid=2,cpus=4-5,memdev=ram2                \
>
>   ------------[ cut here ]------------
>   WARNING: CPU: 0 PID: 1 at kernel/sched/topology.c:2271 build_sched_domains+0x284/0x910
>   Modules linked in:
>   CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.14.0-268.el9.aarch64 #1
>   pstate: 00400005 (nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
>   pc : build_sched_domains+0x284/0x910
>   lr : build_sched_domains+0x184/0x910
>   sp : ffff80000804bd50
>   x29: ffff80000804bd50 x28: 0000000000000002 x27: 0000000000000000
>   x26: ffff800009cf9a80 x25: 0000000000000000 x24: ffff800009cbf840
>   x23: ffff000080325000 x22: ffff0000005df800 x21: ffff80000a4ce508
>   x20: 0000000000000000 x19: ffff000080324440 x18: 0000000000000014
>   x17: 00000000388925c0 x16: 000000005386a066 x15: 000000009c10cc2e
>   x14: 00000000000001c0 x13: 0000000000000001 x12: ffff00007fffb1a0
>   x11: ffff00007fffb180 x10: ffff80000a4ce508 x9 : 0000000000000041
>   x8 : ffff80000a4ce500 x7 : ffff80000a4cf920 x6 : 0000000000000001
>   x5 : 0000000000000001 x4 : 0000000000000007 x3 : 0000000000000002
>   x2 : 0000000000001000 x1 : ffff80000a4cf928 x0 : 0000000000000001
>   Call trace:
>    build_sched_domains+0x284/0x910
>    sched_init_domains+0xac/0xe0
>    sched_init_smp+0x48/0xc8
>    kernel_init_freeable+0x140/0x1ac
>    kernel_init+0x28/0x140
>    ret_from_fork+0x10/0x20
>
> Improve the situation to warn when multiple CPUs in one cluster have
> been associated with different NUMA nodes. However, one NUMA node is
> allowed to be associated with different clusters.
>
> Signed-off-by: Gavin Shan <gshan@redhat.com>
> Acked-by: Philippe Mathieu-Daudé <philmd@linaro.org>

Acked-by: Alistair Francis <alistair.francis@wdc.com>

Alistair

> ---
>  hw/core/machine.c   | 42 ++++++++++++++++++++++++++++++++++++++++++
>  include/hw/boards.h |  1 +
>  2 files changed, 43 insertions(+)
>
> diff --git a/hw/core/machine.c b/hw/core/machine.c
> index 45e3d24fdc..a2329f975d 100644
> --- a/hw/core/machine.c
> +++ b/hw/core/machine.c
> @@ -1255,6 +1255,45 @@ static void machine_numa_finish_cpu_init(MachineState *machine)
>      g_string_free(s, true);
>  }
>
> +static void validate_cpu_cluster_to_numa_boundary(MachineState *ms)
> +{
> +    MachineClass *mc = MACHINE_GET_CLASS(ms);
> +    NumaState *state = ms->numa_state;
> +    const CPUArchIdList *possible_cpus = mc->possible_cpu_arch_ids(ms);
> +    const CPUArchId *cpus = possible_cpus->cpus;
> +    int i, j;
> +
> +    if (state->num_nodes <= 1 || possible_cpus->len <= 1) {
> +        return;
> +    }
> +
> +    /*
> +     * The Linux scheduling domain can't be parsed when the multiple CPUs
> +     * in one cluster have been associated with different NUMA nodes. However,
> +     * it's fine to associate one NUMA node with CPUs in different clusters.
> +     */
> +    for (i = 0; i < possible_cpus->len; i++) {
> +        for (j = i + 1; j < possible_cpus->len; j++) {
> +            if (cpus[i].props.has_socket_id &&
> +                cpus[i].props.has_cluster_id &&
> +                cpus[i].props.has_node_id &&
> +                cpus[j].props.has_socket_id &&
> +                cpus[j].props.has_cluster_id &&
> +                cpus[j].props.has_node_id &&
> +                cpus[i].props.socket_id == cpus[j].props.socket_id &&
> +                cpus[i].props.cluster_id == cpus[j].props.cluster_id &&
> +                cpus[i].props.node_id != cpus[j].props.node_id) {
> +                warn_report("CPU-%d and CPU-%d in socket-%ld-cluster-%ld "
> +                             "have been associated with node-%ld and node-%ld "
> +                             "respectively. It can cause OSes like Linux to "
> +                             "misbehave", i, j, cpus[i].props.socket_id,
> +                             cpus[i].props.cluster_id, cpus[i].props.node_id,
> +                             cpus[j].props.node_id);
> +            }
> +        }
> +    }
> +}
> +
>  MemoryRegion *machine_consume_memdev(MachineState *machine,
>                                       HostMemoryBackend *backend)
>  {
> @@ -1340,6 +1379,9 @@ void machine_run_board_init(MachineState *machine, const char *mem_path, Error *
>          numa_complete_configuration(machine);
>          if (machine->numa_state->num_nodes) {
>              machine_numa_finish_cpu_init(machine);
> +            if (machine_class->cpu_cluster_has_numa_boundary) {
> +                validate_cpu_cluster_to_numa_boundary(machine);
> +            }
>          }
>      }
>
> diff --git a/include/hw/boards.h b/include/hw/boards.h
> index 6fbbfd56c8..c9793b2789 100644
> --- a/include/hw/boards.h
> +++ b/include/hw/boards.h
> @@ -273,6 +273,7 @@ struct MachineClass {
>      bool nvdimm_supported;
>      bool numa_mem_supported;
>      bool auto_enable_numa;
> +    bool cpu_cluster_has_numa_boundary;
>      SMPCompatProps smp_props;
>      const char *default_ram_id;
>
> --
> 2.23.0
>
>



* Re: [PATCH v4 3/3] hw/riscv: Validate cluster and NUMA node boundary
  2023-03-17  6:25 ` [PATCH v4 3/3] hw/riscv: " Gavin Shan
@ 2023-03-21 11:40   ` Alistair Francis
  0 siblings, 0 replies; 12+ messages in thread
From: Alistair Francis @ 2023-03-21 11:40 UTC (permalink / raw)
  To: Gavin Shan
  Cc: qemu-arm, qemu-devel, qemu-riscv, rad, peter.maydell,
	quic_llindhol, eduardo, marcel.apfelbaum, philmd, wangyanan55,
	palmer, alistair.francis, bin.meng, thuth, lvivier, pbonzini,
	imammedo, ajones, berrange, dbarboza, yihyu, shan.gavin

On Fri, Mar 17, 2023 at 4:29 PM Gavin Shan <gshan@redhat.com> wrote:
>
> There are two RISCV machines where NUMA is aware: 'virt' and 'spike'.
> Both of them are required to follow cluster-NUMA-node boundary. To
> enable the validation to warn about the irregular configuration where
> multiple CPUs in one cluster has been associated with multiple NUMA
> nodes.
>
> Signed-off-by: Gavin Shan <gshan@redhat.com>
> Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>

Acked-by: Alistair Francis <alistair.francis@wdc.com>

Alistair

> ---
>  hw/riscv/spike.c | 2 ++
>  hw/riscv/virt.c  | 2 ++
>  2 files changed, 4 insertions(+)
>
> diff --git a/hw/riscv/spike.c b/hw/riscv/spike.c
> index a584d5b3a2..4bf783884b 100644
> --- a/hw/riscv/spike.c
> +++ b/hw/riscv/spike.c
> @@ -349,6 +349,8 @@ static void spike_machine_class_init(ObjectClass *oc, void *data)
>      mc->cpu_index_to_instance_props = riscv_numa_cpu_index_to_props;
>      mc->get_default_cpu_node_id = riscv_numa_get_default_cpu_node_id;
>      mc->numa_mem_supported = true;
> +    /* platform instead of architectural choice */
> +    mc->cpu_cluster_has_numa_boundary = true;
>      mc->default_ram_id = "riscv.spike.ram";
>  }
>
> diff --git a/hw/riscv/virt.c b/hw/riscv/virt.c
> index 4e3efbee16..84a2bca460 100644
> --- a/hw/riscv/virt.c
> +++ b/hw/riscv/virt.c
> @@ -1678,6 +1678,8 @@ static void virt_machine_class_init(ObjectClass *oc, void *data)
>      mc->cpu_index_to_instance_props = riscv_numa_cpu_index_to_props;
>      mc->get_default_cpu_node_id = riscv_numa_get_default_cpu_node_id;
>      mc->numa_mem_supported = true;
> +    /* platform instead of architectural choice */
> +    mc->cpu_cluster_has_numa_boundary = true;
>      mc->default_ram_id = "riscv_virt_board.ram";
>      assert(!mc->get_hotplug_handler);
>      mc->get_hotplug_handler = virt_machine_get_hotplug_handler;
> --
> 2.23.0
>
>



* Re: [PATCH v4 0/3] NUMA: Apply cluster-NUMA-node boundary for aarch64 and riscv machines
  2023-03-17  6:25 [PATCH v4 0/3] NUMA: Apply cluster-NUMA-node boundary for aarch64 and riscv machines Gavin Shan
                   ` (2 preceding siblings ...)
  2023-03-17  6:25 ` [PATCH v4 3/3] hw/riscv: " Gavin Shan
@ 2023-03-27 13:26 ` Igor Mammedov
  2023-04-12  1:07   ` Gavin Shan
  3 siblings, 1 reply; 12+ messages in thread
From: Igor Mammedov @ 2023-03-27 13:26 UTC (permalink / raw)
  To: Gavin Shan
  Cc: qemu-arm, qemu-devel, qemu-riscv, rad, peter.maydell,
	quic_llindhol, eduardo, marcel.apfelbaum, philmd, wangyanan55,
	palmer, alistair.francis, bin.meng, thuth, lvivier, pbonzini,
	ajones, berrange, dbarboza, yihyu, shan.gavin

On Fri, 17 Mar 2023 14:25:39 +0800
Gavin Shan <gshan@redhat.com> wrote:

> For arm64 and riscv architecture, the driver (/base/arch_topology.c) is
> used to populate the CPU topology in the Linux guest. It's required that
> the CPUs in one cluster can't span multiple NUMA nodes. Otherwise, the Linux
> scheduling domain can't be sorted out, as the following warning message
> indicates. To avoid the unexpected confusion, this series attempts to
> warn about such kind of irregular configurations.
> 
>    -smp 6,maxcpus=6,sockets=2,clusters=1,cores=3,threads=1 \
>    -numa node,nodeid=0,cpus=0-1,memdev=ram0                \
>    -numa node,nodeid=1,cpus=2-3,memdev=ram1                \
>    -numa node,nodeid=2,cpus=4-5,memdev=ram2                \
> 
>    ------------[ cut here ]------------
>    WARNING: CPU: 0 PID: 1 at kernel/sched/topology.c:2271 build_sched_domains+0x284/0x910
>    Modules linked in:
>    CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.14.0-268.el9.aarch64 #1
>    pstate: 00400005 (nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
>    pc : build_sched_domains+0x284/0x910
>    lr : build_sched_domains+0x184/0x910
>    sp : ffff80000804bd50
>    x29: ffff80000804bd50 x28: 0000000000000002 x27: 0000000000000000
>    x26: ffff800009cf9a80 x25: 0000000000000000 x24: ffff800009cbf840
>    x23: ffff000080325000 x22: ffff0000005df800 x21: ffff80000a4ce508
>    x20: 0000000000000000 x19: ffff000080324440 x18: 0000000000000014
>    x17: 00000000388925c0 x16: 000000005386a066 x15: 000000009c10cc2e
>    x14: 00000000000001c0 x13: 0000000000000001 x12: ffff00007fffb1a0
>    x11: ffff00007fffb180 x10: ffff80000a4ce508 x9 : 0000000000000041
>    x8 : ffff80000a4ce500 x7 : ffff80000a4cf920 x6 : 0000000000000001
>    x5 : 0000000000000001 x4 : 0000000000000007 x3 : 0000000000000002
>    x2 : 0000000000001000 x1 : ffff80000a4cf928 x0 : 0000000000000001
>    Call trace:
>     build_sched_domains+0x284/0x910
>     sched_init_domains+0xac/0xe0
>     sched_init_smp+0x48/0xc8
>     kernel_init_freeable+0x140/0x1ac
>     kernel_init+0x28/0x140
>     ret_from_fork+0x10/0x20
> 
> PATCH[1] Warn about the irregular configuration if required
> PATCH[2] Enable the validation for aarch64 machines
> PATCH[3] Enable the validation for riscv machines
> 
> v3: https://lists.nongnu.org/archive/html/qemu-arm/2023-02/msg01226.html
> v2: https://lists.nongnu.org/archive/html/qemu-arm/2023-02/msg01080.html
> v1: https://lists.nongnu.org/archive/html/qemu-arm/2023-02/msg00886.html
> 
> Changelog
> =========
> v4:
>   * Pick r-b and ack-b from Daniel/Philippe                   (Gavin)
>   * Replace local variable @len with possible_cpus->len in
>     validate_cpu_cluster_to_numa_boundary()                   (Philippe)
> v3:
>   * Validate cluster-to-NUMA instead of socket-to-NUMA
>     boundary                                                  (Gavin)
>   * Move the switch from MachineState to MachineClass         (Philippe)
>   * Warning instead of rejecting the irregular configuration  (Daniel)
>   * Comments to mention cluster-to-NUMA is platform instead
>     of architectural choice                                   (Drew)
>   * Drop PATCH[v2 1/4] related to qtests/numa-test            (Gavin)
> v2:
>   * Fix socket-NUMA-node boundary issues in qtests/numa-test  (Gavin)
>   * Add helper set_numa_socket_boundary() and validate the
>     boundary in the generic path                              (Philippe)
> 
> Gavin Shan (3):
>   numa: Validate cluster and NUMA node boundary if required
>   hw/arm: Validate cluster and NUMA node boundary
>   hw/riscv: Validate cluster and NUMA node boundary
> 
>  hw/arm/sbsa-ref.c   |  2 ++
>  hw/arm/virt.c       |  2 ++
>  hw/core/machine.c   | 42 ++++++++++++++++++++++++++++++++++++++++++
>  hw/riscv/spike.c    |  2 ++
>  hw/riscv/virt.c     |  2 ++
>  include/hw/boards.h |  1 +
>  6 files changed, 51 insertions(+)
> 

Acked-by: Igor Mammedov <imammedo@redhat.com>




* Re: [PATCH v4 0/3] NUMA: Apply cluster-NUMA-node boundary for aarch64 and riscv machines
  2023-03-27 13:26 ` [PATCH v4 0/3] NUMA: Apply cluster-NUMA-node boundary for aarch64 and riscv machines Igor Mammedov
@ 2023-04-12  1:07   ` Gavin Shan
  2023-04-12 11:42     ` Peter Maydell
  0 siblings, 1 reply; 12+ messages in thread
From: Gavin Shan @ 2023-04-12  1:07 UTC (permalink / raw)
  To: Igor Mammedov
  Cc: qemu-arm, qemu-devel, qemu-riscv, rad, peter.maydell,
	quic_llindhol, eduardo, marcel.apfelbaum, philmd, wangyanan55,
	palmer, alistair.francis, bin.meng, thuth, lvivier, pbonzini,
	ajones, berrange, dbarboza, yihyu, shan.gavin

Hi Peter,

On 3/27/23 9:26 PM, Igor Mammedov wrote:
> On Fri, 17 Mar 2023 14:25:39 +0800
> Gavin Shan <gshan@redhat.com> wrote:
> 
>> For arm64 and riscv architecture, the driver (/base/arch_topology.c) is
>> used to populate the CPU topology in the Linux guest. It's required that
>> the CPUs in one cluster can't span multiple NUMA nodes. Otherwise, the Linux
>> scheduling domain can't be sorted out, as the following warning message
>> indicates. To avoid the unexpected confusion, this series attempts to
>> warn about such kind of irregular configurations.
>>
>>     -smp 6,maxcpus=6,sockets=2,clusters=1,cores=3,threads=1 \
>>     -numa node,nodeid=0,cpus=0-1,memdev=ram0                \
>>     -numa node,nodeid=1,cpus=2-3,memdev=ram1                \
>>     -numa node,nodeid=2,cpus=4-5,memdev=ram2                \
>>
>>     ------------[ cut here ]------------
>>     WARNING: CPU: 0 PID: 1 at kernel/sched/topology.c:2271 build_sched_domains+0x284/0x910
>>     Modules linked in:
>>     CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.14.0-268.el9.aarch64 #1
>>     pstate: 00400005 (nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
>>     pc : build_sched_domains+0x284/0x910
>>     lr : build_sched_domains+0x184/0x910
>>     sp : ffff80000804bd50
>>     x29: ffff80000804bd50 x28: 0000000000000002 x27: 0000000000000000
>>     x26: ffff800009cf9a80 x25: 0000000000000000 x24: ffff800009cbf840
>>     x23: ffff000080325000 x22: ffff0000005df800 x21: ffff80000a4ce508
>>     x20: 0000000000000000 x19: ffff000080324440 x18: 0000000000000014
>>     x17: 00000000388925c0 x16: 000000005386a066 x15: 000000009c10cc2e
>>     x14: 00000000000001c0 x13: 0000000000000001 x12: ffff00007fffb1a0
>>     x11: ffff00007fffb180 x10: ffff80000a4ce508 x9 : 0000000000000041
>>     x8 : ffff80000a4ce500 x7 : ffff80000a4cf920 x6 : 0000000000000001
>>     x5 : 0000000000000001 x4 : 0000000000000007 x3 : 0000000000000002
>>     x2 : 0000000000001000 x1 : ffff80000a4cf928 x0 : 0000000000000001
>>     Call trace:
>>      build_sched_domains+0x284/0x910
>>      sched_init_domains+0xac/0xe0
>>      sched_init_smp+0x48/0xc8
>>      kernel_init_freeable+0x140/0x1ac
>>      kernel_init+0x28/0x140
>>      ret_from_fork+0x10/0x20
>>
>> PATCH[1] Warn about the irregular configuration if required
>> PATCH[2] Enable the validation for aarch64 machines
>> PATCH[3] Enable the validation for riscv machines
>>
>> v3: https://lists.nongnu.org/archive/html/qemu-arm/2023-02/msg01226.html
>> v2: https://lists.nongnu.org/archive/html/qemu-arm/2023-02/msg01080.html
>> v1: https://lists.nongnu.org/archive/html/qemu-arm/2023-02/msg00886.html
>>
>> Changelog
>> =========
>> v4:
>>    * Pick r-b and ack-b from Daniel/Philippe                   (Gavin)
>>    * Replace local variable @len with possible_cpus->len in
>>      validate_cpu_cluster_to_numa_boundary()                   (Philippe)
>> v3:
>>    * Validate cluster-to-NUMA instead of socket-to-NUMA
>>      boundary                                                  (Gavin)
>>    * Move the switch from MachineState to MachineClass         (Philippe)
>>    * Warning instead of rejecting the irregular configuration  (Daniel)
>>    * Comments to mention cluster-to-NUMA is platform instead
>>      of architectural choice                                   (Drew)
>>    * Drop PATCH[v2 1/4] related to qtests/numa-test            (Gavin)
>> v2:
>>    * Fix socket-NUMA-node boundary issues in qtests/numa-test  (Gavin)
>>    * Add helper set_numa_socket_boundary() and validate the
>>      boundary in the generic path                              (Philippe)
>>
>> Gavin Shan (3):
>>    numa: Validate cluster and NUMA node boundary if required
>>    hw/arm: Validate cluster and NUMA node boundary
>>    hw/riscv: Validate cluster and NUMA node boundary
>>
>>   hw/arm/sbsa-ref.c   |  2 ++
>>   hw/arm/virt.c       |  2 ++
>>   hw/core/machine.c   | 42 ++++++++++++++++++++++++++++++++++++++++++
>>   hw/riscv/spike.c    |  2 ++
>>   hw/riscv/virt.c     |  2 ++
>>   include/hw/boards.h |  1 +
>>   6 files changed, 51 insertions(+)
>>
> 
> Acked-by: Igor Mammedov <imammedo@redhat.com>
> 

Not sure if this series can still be integrated into QEMU v8.0.
Otherwise, it should be something for QEMU v8.1. By the way, I'm
also uncertain who needs to merge this series.

Thanks,
Gavin




* Re: [PATCH v4 0/3] NUMA: Apply cluster-NUMA-node boundary for aarch64 and riscv machines
  2023-04-12  1:07   ` Gavin Shan
@ 2023-04-12 11:42     ` Peter Maydell
  2023-04-13  5:50       ` Gavin Shan
  0 siblings, 1 reply; 12+ messages in thread
From: Peter Maydell @ 2023-04-12 11:42 UTC (permalink / raw)
  To: Gavin Shan
  Cc: Igor Mammedov, qemu-arm, qemu-devel, qemu-riscv, rad,
	quic_llindhol, eduardo, marcel.apfelbaum, philmd, wangyanan55,
	palmer, alistair.francis, bin.meng, thuth, lvivier, pbonzini,
	ajones, berrange, dbarboza, yihyu, shan.gavin

On Wed, 12 Apr 2023 at 02:08, Gavin Shan <gshan@redhat.com> wrote:
>
> Hi Peter,
>
> On 3/27/23 9:26 PM, Igor Mammedov wrote:
> > On Fri, 17 Mar 2023 14:25:39 +0800
> > Gavin Shan <gshan@redhat.com> wrote:
> >
> >> For the arm64 and riscv architectures, the driver (/base/arch_topology.c)
> >> is used to populate the CPU topology in the Linux guest. The CPUs in one
> >> cluster must not span multiple NUMA nodes. Otherwise, the Linux
> >> scheduling domain can't be sorted out, as the following warning message
> >> indicates. To avoid confusion, this series attempts to warn about this
> >> kind of irregular configuration.
> >>
> >>     -smp 6,maxcpus=6,sockets=2,clusters=1,cores=3,threads=1 \
> >>     -numa node,nodeid=0,cpus=0-1,memdev=ram0                \
> >>     -numa node,nodeid=1,cpus=2-3,memdev=ram1                \
> >>     -numa node,nodeid=2,cpus=4-5,memdev=ram2                \
> >>
> >>     ------------[ cut here ]------------
> >>     WARNING: CPU: 0 PID: 1 at kernel/sched/topology.c:2271 build_sched_domains+0x284/0x910
> >>     Modules linked in:
> >>     CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.14.0-268.el9.aarch64 #1
> >>     pstate: 00400005 (nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> >>     pc : build_sched_domains+0x284/0x910
> >>     lr : build_sched_domains+0x184/0x910
> >>     sp : ffff80000804bd50
> >>     x29: ffff80000804bd50 x28: 0000000000000002 x27: 0000000000000000
> >>     x26: ffff800009cf9a80 x25: 0000000000000000 x24: ffff800009cbf840
> >>     x23: ffff000080325000 x22: ffff0000005df800 x21: ffff80000a4ce508
> >>     x20: 0000000000000000 x19: ffff000080324440 x18: 0000000000000014
> >>     x17: 00000000388925c0 x16: 000000005386a066 x15: 000000009c10cc2e
> >>     x14: 00000000000001c0 x13: 0000000000000001 x12: ffff00007fffb1a0
> >>     x11: ffff00007fffb180 x10: ffff80000a4ce508 x9 : 0000000000000041
> >>     x8 : ffff80000a4ce500 x7 : ffff80000a4cf920 x6 : 0000000000000001
> >>     x5 : 0000000000000001 x4 : 0000000000000007 x3 : 0000000000000002
> >>     x2 : 0000000000001000 x1 : ffff80000a4cf928 x0 : 0000000000000001
> >>     Call trace:
> >>      build_sched_domains+0x284/0x910
> >>      sched_init_domains+0xac/0xe0
> >>      sched_init_smp+0x48/0xc8
> >>      kernel_init_freeable+0x140/0x1ac
> >>      kernel_init+0x28/0x140
> >>      ret_from_fork+0x10/0x20
> >>
> >> PATCH[1] Warn about the irregular configuration if required
> >> PATCH[2] Enable the validation for aarch64 machines
> >> PATCH[3] Enable the validation for riscv machines
> >>
> >> v3: https://lists.nongnu.org/archive/html/qemu-arm/2023-02/msg01226.html
> >> v2: https://lists.nongnu.org/archive/html/qemu-arm/2023-02/msg01080.html
> >> v1: https://lists.nongnu.org/archive/html/qemu-arm/2023-02/msg00886.html
> >>
> >> Changelog
> >> =========
> >> v4:
> >>    * Pick r-b and ack-b from Daniel/Philippe                   (Gavin)
> >>    * Replace local variable @len with possible_cpus->len in
> >>      validate_cpu_cluster_to_numa_boundary()                   (Philippe)
> >> v3:
> >>    * Validate cluster-to-NUMA instead of socket-to-NUMA
> >>      boundary                                                  (Gavin)
> >>    * Move the switch from MachineState to MachineClass         (Philippe)
> >>    * Warning instead of rejecting the irregular configuration  (Daniel)
> >>    * Comments to mention cluster-to-NUMA is platform instead
> >>      of architectural choice                                   (Drew)
> >>    * Drop PATCH[v2 1/4] related to qtests/numa-test            (Gavin)
> >> v2:
> >>    * Fix socket-NUMA-node boundary issues in qtests/numa-test  (Gavin)
> >>    * Add helper set_numa_socket_boundary() and validate the
> >>      boundary in the generic path                              (Philippe)
> >>
> >> Gavin Shan (3):
> >>    numa: Validate cluster and NUMA node boundary if required
> >>    hw/arm: Validate cluster and NUMA node boundary
> >>    hw/riscv: Validate cluster and NUMA node boundary
> >>
> >>   hw/arm/sbsa-ref.c   |  2 ++
> >>   hw/arm/virt.c       |  2 ++
> >>   hw/core/machine.c   | 42 ++++++++++++++++++++++++++++++++++++++++++
> >>   hw/riscv/spike.c    |  2 ++
> >>   hw/riscv/virt.c     |  2 ++
> >>   include/hw/boards.h |  1 +
> >>   6 files changed, 51 insertions(+)
> >>
> >
> > Acked-by: Igor Mammedov <imammedo@redhat.com>
> >
>
> Not sure if this series can still be integrated into QEMU v8.0.
> Otherwise, it should be something for QEMU v8.1. By the way, I'm
> also uncertain who needs to merge this series.

It barely touches arm-specific boards, so I'm assuming it will
be reviewed and taken by whoever handles hw/core/machine.c.

And yes, 8.0 is nearly out the door; this is 8.1 stuff.

thanks
-- PMM



* Re: [PATCH v4 0/3] NUMA: Apply cluster-NUMA-node boundary for aarch64 and riscv machines
  2023-04-12 11:42     ` Peter Maydell
@ 2023-04-13  5:50       ` Gavin Shan
  2023-04-13 11:21         ` Igor Mammedov
  0 siblings, 1 reply; 12+ messages in thread
From: Gavin Shan @ 2023-04-13  5:50 UTC (permalink / raw)
  To: Peter Maydell
  Cc: Igor Mammedov, qemu-arm, qemu-devel, qemu-riscv, rad,
	quic_llindhol, eduardo, marcel.apfelbaum, philmd, wangyanan55,
	palmer, alistair.francis, bin.meng, thuth, lvivier, pbonzini,
	ajones, berrange, dbarboza, yihyu, shan.gavin

On 4/12/23 7:42 PM, Peter Maydell wrote:
> On Wed, 12 Apr 2023 at 02:08, Gavin Shan <gshan@redhat.com> wrote:
>> On 3/27/23 9:26 PM, Igor Mammedov wrote:
>>> On Fri, 17 Mar 2023 14:25:39 +0800
>>> Gavin Shan <gshan@redhat.com> wrote:
>>>
>>>> For the arm64 and riscv architectures, the driver (/base/arch_topology.c)
>>>> is used to populate the CPU topology in the Linux guest. The CPUs in one
>>>> cluster must not span multiple NUMA nodes. Otherwise, the Linux
>>>> scheduling domain can't be sorted out, as the following warning message
>>>> indicates. To avoid confusion, this series attempts to warn about this
>>>> kind of irregular configuration.
>>>>
>>>>      -smp 6,maxcpus=6,sockets=2,clusters=1,cores=3,threads=1 \
>>>>      -numa node,nodeid=0,cpus=0-1,memdev=ram0                \
>>>>      -numa node,nodeid=1,cpus=2-3,memdev=ram1                \
>>>>      -numa node,nodeid=2,cpus=4-5,memdev=ram2                \
>>>>
>>>>      ------------[ cut here ]------------
>>>>      WARNING: CPU: 0 PID: 1 at kernel/sched/topology.c:2271 build_sched_domains+0x284/0x910
>>>>      Modules linked in:
>>>>      CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.14.0-268.el9.aarch64 #1
>>>>      pstate: 00400005 (nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
>>>>      pc : build_sched_domains+0x284/0x910
>>>>      lr : build_sched_domains+0x184/0x910
>>>>      sp : ffff80000804bd50
>>>>      x29: ffff80000804bd50 x28: 0000000000000002 x27: 0000000000000000
>>>>      x26: ffff800009cf9a80 x25: 0000000000000000 x24: ffff800009cbf840
>>>>      x23: ffff000080325000 x22: ffff0000005df800 x21: ffff80000a4ce508
>>>>      x20: 0000000000000000 x19: ffff000080324440 x18: 0000000000000014
>>>>      x17: 00000000388925c0 x16: 000000005386a066 x15: 000000009c10cc2e
>>>>      x14: 00000000000001c0 x13: 0000000000000001 x12: ffff00007fffb1a0
>>>>      x11: ffff00007fffb180 x10: ffff80000a4ce508 x9 : 0000000000000041
>>>>      x8 : ffff80000a4ce500 x7 : ffff80000a4cf920 x6 : 0000000000000001
>>>>      x5 : 0000000000000001 x4 : 0000000000000007 x3 : 0000000000000002
>>>>      x2 : 0000000000001000 x1 : ffff80000a4cf928 x0 : 0000000000000001
>>>>      Call trace:
>>>>       build_sched_domains+0x284/0x910
>>>>       sched_init_domains+0xac/0xe0
>>>>       sched_init_smp+0x48/0xc8
>>>>       kernel_init_freeable+0x140/0x1ac
>>>>       kernel_init+0x28/0x140
>>>>       ret_from_fork+0x10/0x20
>>>>
>>>> PATCH[1] Warn about the irregular configuration if required
>>>> PATCH[2] Enable the validation for aarch64 machines
>>>> PATCH[3] Enable the validation for riscv machines
>>>>
>>>> v3: https://lists.nongnu.org/archive/html/qemu-arm/2023-02/msg01226.html
>>>> v2: https://lists.nongnu.org/archive/html/qemu-arm/2023-02/msg01080.html
>>>> v1: https://lists.nongnu.org/archive/html/qemu-arm/2023-02/msg00886.html
>>>>
>>>> Changelog
>>>> =========
>>>> v4:
>>>>     * Pick r-b and ack-b from Daniel/Philippe                   (Gavin)
>>>>     * Replace local variable @len with possible_cpus->len in
>>>>       validate_cpu_cluster_to_numa_boundary()                   (Philippe)
>>>> v3:
>>>>     * Validate cluster-to-NUMA instead of socket-to-NUMA
>>>>       boundary                                                  (Gavin)
>>>>     * Move the switch from MachineState to MachineClass         (Philippe)
>>>>     * Warning instead of rejecting the irregular configuration  (Daniel)
>>>>     * Comments to mention cluster-to-NUMA is platform instead
>>>>       of architectural choice                                   (Drew)
>>>>     * Drop PATCH[v2 1/4] related to qtests/numa-test            (Gavin)
>>>> v2:
>>>>     * Fix socket-NUMA-node boundary issues in qtests/numa-test  (Gavin)
>>>>     * Add helper set_numa_socket_boundary() and validate the
>>>>       boundary in the generic path                              (Philippe)
>>>>
>>>> Gavin Shan (3):
>>>>     numa: Validate cluster and NUMA node boundary if required
>>>>     hw/arm: Validate cluster and NUMA node boundary
>>>>     hw/riscv: Validate cluster and NUMA node boundary
>>>>
>>>>    hw/arm/sbsa-ref.c   |  2 ++
>>>>    hw/arm/virt.c       |  2 ++
>>>>    hw/core/machine.c   | 42 ++++++++++++++++++++++++++++++++++++++++++
>>>>    hw/riscv/spike.c    |  2 ++
>>>>    hw/riscv/virt.c     |  2 ++
>>>>    include/hw/boards.h |  1 +
>>>>    6 files changed, 51 insertions(+)
>>>>
>>>
>>> Acked-by: Igor Mammedov <imammedo@redhat.com>
>>>
>>
>> Not sure if this series can still be integrated into QEMU v8.0.
>> Otherwise, it should be something for QEMU v8.1. By the way, I'm
>> also uncertain who needs to merge this series.
>
> It barely touches arm-specific boards, so I'm assuming it will
> be reviewed and taken by whoever handles hw/core/machine.c.
>
> And yes, 8.0 is nearly out the door; this is 8.1 stuff.
> 

Indeed. In this case, it needs to be merged via the 'Machine core' tree,
which is taken care of by Eduardo Habkost or Marcel Apfelbaum.

Eduardo and Marcel, could you please merge this into QEMU v8.1 when it's
ready? Thanks in advance.

Thanks,
Gavin




* Re: [PATCH v4 0/3] NUMA: Apply cluster-NUMA-node boundary for aarch64 and riscv machines
  2023-04-13  5:50       ` Gavin Shan
@ 2023-04-13 11:21         ` Igor Mammedov
  2023-04-18  8:57           ` Gavin Shan
  0 siblings, 1 reply; 12+ messages in thread
From: Igor Mammedov @ 2023-04-13 11:21 UTC (permalink / raw)
  To: Gavin Shan, pbonzini
  Cc: Peter Maydell, qemu-arm, qemu-devel, qemu-riscv, rad,
	quic_llindhol, eduardo, marcel.apfelbaum, philmd, wangyanan55,
	palmer, alistair.francis, bin.meng, thuth, lvivier, ajones,
	berrange, dbarboza, yihyu, shan.gavin

On Thu, 13 Apr 2023 13:50:57 +0800
Gavin Shan <gshan@redhat.com> wrote:

> On 4/12/23 7:42 PM, Peter Maydell wrote:
> > On Wed, 12 Apr 2023 at 02:08, Gavin Shan <gshan@redhat.com> wrote:  
> >> On 3/27/23 9:26 PM, Igor Mammedov wrote:  
> >>> On Fri, 17 Mar 2023 14:25:39 +0800
> >>> Gavin Shan <gshan@redhat.com> wrote:
> >>>  
> >>>> For the arm64 and riscv architectures, the driver (/base/arch_topology.c)
> >>>> is used to populate the CPU topology in the Linux guest. The CPUs in one
> >>>> cluster must not span multiple NUMA nodes. Otherwise, the Linux
> >>>> scheduling domain can't be sorted out, as the following warning message
> >>>> indicates. To avoid confusion, this series attempts to warn about this
> >>>> kind of irregular configuration.
> >>>>
> >>>>      -smp 6,maxcpus=6,sockets=2,clusters=1,cores=3,threads=1 \
> >>>>      -numa node,nodeid=0,cpus=0-1,memdev=ram0                \
> >>>>      -numa node,nodeid=1,cpus=2-3,memdev=ram1                \
> >>>>      -numa node,nodeid=2,cpus=4-5,memdev=ram2                \
> >>>>
> >>>>      ------------[ cut here ]------------
> >>>>      WARNING: CPU: 0 PID: 1 at kernel/sched/topology.c:2271 build_sched_domains+0x284/0x910
> >>>>      Modules linked in:
> >>>>      CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.14.0-268.el9.aarch64 #1
> >>>>      pstate: 00400005 (nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> >>>>      pc : build_sched_domains+0x284/0x910
> >>>>      lr : build_sched_domains+0x184/0x910
> >>>>      sp : ffff80000804bd50
> >>>>      x29: ffff80000804bd50 x28: 0000000000000002 x27: 0000000000000000
> >>>>      x26: ffff800009cf9a80 x25: 0000000000000000 x24: ffff800009cbf840
> >>>>      x23: ffff000080325000 x22: ffff0000005df800 x21: ffff80000a4ce508
> >>>>      x20: 0000000000000000 x19: ffff000080324440 x18: 0000000000000014
> >>>>      x17: 00000000388925c0 x16: 000000005386a066 x15: 000000009c10cc2e
> >>>>      x14: 00000000000001c0 x13: 0000000000000001 x12: ffff00007fffb1a0
> >>>>      x11: ffff00007fffb180 x10: ffff80000a4ce508 x9 : 0000000000000041
> >>>>      x8 : ffff80000a4ce500 x7 : ffff80000a4cf920 x6 : 0000000000000001
> >>>>      x5 : 0000000000000001 x4 : 0000000000000007 x3 : 0000000000000002
> >>>>      x2 : 0000000000001000 x1 : ffff80000a4cf928 x0 : 0000000000000001
> >>>>      Call trace:
> >>>>       build_sched_domains+0x284/0x910
> >>>>       sched_init_domains+0xac/0xe0
> >>>>       sched_init_smp+0x48/0xc8
> >>>>       kernel_init_freeable+0x140/0x1ac
> >>>>       kernel_init+0x28/0x140
> >>>>       ret_from_fork+0x10/0x20
> >>>>
> >>>> PATCH[1] Warn about the irregular configuration if required
> >>>> PATCH[2] Enable the validation for aarch64 machines
> >>>> PATCH[3] Enable the validation for riscv machines
> >>>>
> >>>> v3: https://lists.nongnu.org/archive/html/qemu-arm/2023-02/msg01226.html
> >>>> v2: https://lists.nongnu.org/archive/html/qemu-arm/2023-02/msg01080.html
> >>>> v1: https://lists.nongnu.org/archive/html/qemu-arm/2023-02/msg00886.html
> >>>>
> >>>> Changelog
> >>>> =========
> >>>> v4:
> >>>>     * Pick r-b and ack-b from Daniel/Philippe                   (Gavin)
> >>>>     * Replace local variable @len with possible_cpus->len in
> >>>>       validate_cpu_cluster_to_numa_boundary()                   (Philippe)
> >>>> v3:
> >>>>     * Validate cluster-to-NUMA instead of socket-to-NUMA
> >>>>       boundary                                                  (Gavin)
> >>>>     * Move the switch from MachineState to MachineClass         (Philippe)
> >>>>     * Warning instead of rejecting the irregular configuration  (Daniel)
> >>>>     * Comments to mention cluster-to-NUMA is platform instead
> >>>>       of architectural choice                                   (Drew)
> >>>>     * Drop PATCH[v2 1/4] related to qtests/numa-test            (Gavin)
> >>>> v2:
> >>>>     * Fix socket-NUMA-node boundary issues in qtests/numa-test  (Gavin)
> >>>>     * Add helper set_numa_socket_boundary() and validate the
> >>>>       boundary in the generic path                              (Philippe)
> >>>>
> >>>> Gavin Shan (3):
> >>>>     numa: Validate cluster and NUMA node boundary if required
> >>>>     hw/arm: Validate cluster and NUMA node boundary
> >>>>     hw/riscv: Validate cluster and NUMA node boundary
> >>>>
> >>>>    hw/arm/sbsa-ref.c   |  2 ++
> >>>>    hw/arm/virt.c       |  2 ++
> >>>>    hw/core/machine.c   | 42 ++++++++++++++++++++++++++++++++++++++++++
> >>>>    hw/riscv/spike.c    |  2 ++
> >>>>    hw/riscv/virt.c     |  2 ++
> >>>>    include/hw/boards.h |  1 +
> >>>>    6 files changed, 51 insertions(+)
> >>>>  
> >>>
> >>> Acked-by: Igor Mammedov <imammedo@redhat.com>
> >>>  
> >>
> >> Not sure if this series can still be integrated into QEMU v8.0.
> >> Otherwise, it should be something for QEMU v8.1. By the way, I'm
> >> also uncertain who needs to merge this series.
> >
> > It barely touches arm-specific boards, so I'm assuming it will
> > be reviewed and taken by whoever handles hw/core/machine.c.
> >
> > And yes, 8.0 is nearly out the door; this is 8.1 stuff.
> >   
> 
> Indeed. In this case, it needs to be merged via the 'Machine core' tree,
> which is taken care of by Eduardo Habkost or Marcel Apfelbaum.
>
> Eduardo and Marcel, could you please merge this into QEMU v8.1 when it's
> ready? Thanks in advance.

Lately it has been Paolo who is taking care of the generic machine queue.

> 
> Thanks,
> Gavin
> 




* Re: [PATCH v4 0/3] NUMA: Apply cluster-NUMA-node boundary for aarch64 and riscv machines
  2023-04-13 11:21         ` Igor Mammedov
@ 2023-04-18  8:57           ` Gavin Shan
  0 siblings, 0 replies; 12+ messages in thread
From: Gavin Shan @ 2023-04-18  8:57 UTC (permalink / raw)
  To: Igor Mammedov, pbonzini
  Cc: Peter Maydell, qemu-arm, qemu-devel, qemu-riscv, rad,
	quic_llindhol, eduardo, marcel.apfelbaum, philmd, wangyanan55,
	palmer, alistair.francis, bin.meng, thuth, lvivier, ajones,
	berrange, dbarboza, yihyu, shan.gavin

Hi Igor,

On 4/13/23 7:21 PM, Igor Mammedov wrote:
> On Thu, 13 Apr 2023 13:50:57 +0800
> Gavin Shan <gshan@redhat.com> wrote:
> 
>> On 4/12/23 7:42 PM, Peter Maydell wrote:
>>> On Wed, 12 Apr 2023 at 02:08, Gavin Shan <gshan@redhat.com> wrote:
>>>> On 3/27/23 9:26 PM, Igor Mammedov wrote:
>>>>> On Fri, 17 Mar 2023 14:25:39 +0800
>>>>> Gavin Shan <gshan@redhat.com> wrote:
>>>>>   
>>>>>> For the arm64 and riscv architectures, the driver (/base/arch_topology.c)
>>>>>> is used to populate the CPU topology in the Linux guest. The CPUs in one
>>>>>> cluster must not span multiple NUMA nodes. Otherwise, the Linux
>>>>>> scheduling domain can't be sorted out, as the following warning message
>>>>>> indicates. To avoid confusion, this series attempts to warn about this
>>>>>> kind of irregular configuration.
>>>>>>
>>>>>>       -smp 6,maxcpus=6,sockets=2,clusters=1,cores=3,threads=1 \
>>>>>>       -numa node,nodeid=0,cpus=0-1,memdev=ram0                \
>>>>>>       -numa node,nodeid=1,cpus=2-3,memdev=ram1                \
>>>>>>       -numa node,nodeid=2,cpus=4-5,memdev=ram2                \
>>>>>>
>>>>>>       ------------[ cut here ]------------
>>>>>>       WARNING: CPU: 0 PID: 1 at kernel/sched/topology.c:2271 build_sched_domains+0x284/0x910
>>>>>>       Modules linked in:
>>>>>>       CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.14.0-268.el9.aarch64 #1
>>>>>>       pstate: 00400005 (nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
>>>>>>       pc : build_sched_domains+0x284/0x910
>>>>>>       lr : build_sched_domains+0x184/0x910
>>>>>>       sp : ffff80000804bd50
>>>>>>       x29: ffff80000804bd50 x28: 0000000000000002 x27: 0000000000000000
>>>>>>       x26: ffff800009cf9a80 x25: 0000000000000000 x24: ffff800009cbf840
>>>>>>       x23: ffff000080325000 x22: ffff0000005df800 x21: ffff80000a4ce508
>>>>>>       x20: 0000000000000000 x19: ffff000080324440 x18: 0000000000000014
>>>>>>       x17: 00000000388925c0 x16: 000000005386a066 x15: 000000009c10cc2e
>>>>>>       x14: 00000000000001c0 x13: 0000000000000001 x12: ffff00007fffb1a0
>>>>>>       x11: ffff00007fffb180 x10: ffff80000a4ce508 x9 : 0000000000000041
>>>>>>       x8 : ffff80000a4ce500 x7 : ffff80000a4cf920 x6 : 0000000000000001
>>>>>>       x5 : 0000000000000001 x4 : 0000000000000007 x3 : 0000000000000002
>>>>>>       x2 : 0000000000001000 x1 : ffff80000a4cf928 x0 : 0000000000000001
>>>>>>       Call trace:
>>>>>>        build_sched_domains+0x284/0x910
>>>>>>        sched_init_domains+0xac/0xe0
>>>>>>        sched_init_smp+0x48/0xc8
>>>>>>        kernel_init_freeable+0x140/0x1ac
>>>>>>        kernel_init+0x28/0x140
>>>>>>        ret_from_fork+0x10/0x20
>>>>>>
>>>>>> PATCH[1] Warn about the irregular configuration if required
>>>>>> PATCH[2] Enable the validation for aarch64 machines
>>>>>> PATCH[3] Enable the validation for riscv machines
>>>>>>
>>>>>> v3: https://lists.nongnu.org/archive/html/qemu-arm/2023-02/msg01226.html
>>>>>> v2: https://lists.nongnu.org/archive/html/qemu-arm/2023-02/msg01080.html
>>>>>> v1: https://lists.nongnu.org/archive/html/qemu-arm/2023-02/msg00886.html
>>>>>>
>>>>>> Changelog
>>>>>> =========
>>>>>> v4:
>>>>>>      * Pick r-b and ack-b from Daniel/Philippe                   (Gavin)
>>>>>>      * Replace local variable @len with possible_cpus->len in
>>>>>>        validate_cpu_cluster_to_numa_boundary()                   (Philippe)
>>>>>> v3:
>>>>>>      * Validate cluster-to-NUMA instead of socket-to-NUMA
>>>>>>        boundary                                                  (Gavin)
>>>>>>      * Move the switch from MachineState to MachineClass         (Philippe)
>>>>>>      * Warning instead of rejecting the irregular configuration  (Daniel)
>>>>>>      * Comments to mention cluster-to-NUMA is platform instead
>>>>>>        of architectural choice                                   (Drew)
>>>>>>      * Drop PATCH[v2 1/4] related to qtests/numa-test            (Gavin)
>>>>>> v2:
>>>>>>      * Fix socket-NUMA-node boundary issues in qtests/numa-test  (Gavin)
>>>>>>      * Add helper set_numa_socket_boundary() and validate the
>>>>>>        boundary in the generic path                              (Philippe)
>>>>>>
>>>>>> Gavin Shan (3):
>>>>>>      numa: Validate cluster and NUMA node boundary if required
>>>>>>      hw/arm: Validate cluster and NUMA node boundary
>>>>>>      hw/riscv: Validate cluster and NUMA node boundary
>>>>>>
>>>>>>     hw/arm/sbsa-ref.c   |  2 ++
>>>>>>     hw/arm/virt.c       |  2 ++
>>>>>>     hw/core/machine.c   | 42 ++++++++++++++++++++++++++++++++++++++++++
>>>>>>     hw/riscv/spike.c    |  2 ++
>>>>>>     hw/riscv/virt.c     |  2 ++
>>>>>>     include/hw/boards.h |  1 +
>>>>>>     6 files changed, 51 insertions(+)
>>>>>>   
>>>>>
>>>>> Acked-by: Igor Mammedov <imammedo@redhat.com>
>>>>>   
>>>>
>>>> Not sure if this series can still be integrated into QEMU v8.0.
>>>> Otherwise, it should be something for QEMU v8.1. By the way, I'm
>>>> also uncertain who needs to merge this series.
>>>
>>> It barely touches arm-specific boards, so I'm assuming it will
>>> be reviewed and taken by whoever handles hw/core/machine.c.
>>>
>>> And yes, 8.0 is nearly out the door; this is 8.1 stuff.
>>>    
>>
>> Indeed. In this case, it needs to be merged via the 'Machine core' tree,
>> which is taken care of by Eduardo Habkost or Marcel Apfelbaum.
>>
>> Eduardo and Marcel, could you please merge this into QEMU v8.1 when it's
>> ready? Thanks in advance.
> 
> Lately it has been Paolo who is taking care of the generic machine queue.
> 

Thanks a lot, Igor. I will ping Paolo if needed when QEMU v8.1 is ready.

Thanks,
Gavin




end of thread [~2023-04-18  8:58 UTC]

Thread overview: 12+ messages
2023-03-17  6:25 [PATCH v4 0/3] NUMA: Apply cluster-NUMA-node boundary for aarch64 and riscv machines Gavin Shan
2023-03-17  6:25 ` [PATCH v4 1/3] numa: Validate cluster and NUMA node boundary if required Gavin Shan
2023-03-21 11:39   ` Alistair Francis
2023-03-17  6:25 ` [PATCH v4 2/3] hw/arm: Validate cluster and NUMA node boundary Gavin Shan
2023-03-17  6:25 ` [PATCH v4 3/3] hw/riscv: " Gavin Shan
2023-03-21 11:40   ` Alistair Francis
2023-03-27 13:26 ` [PATCH v4 0/3] NUMA: Apply cluster-NUMA-node boundary for aarch64 and riscv machines Igor Mammedov
2023-04-12  1:07   ` Gavin Shan
2023-04-12 11:42     ` Peter Maydell
2023-04-13  5:50       ` Gavin Shan
2023-04-13 11:21         ` Igor Mammedov
2023-04-18  8:57           ` Gavin Shan
