All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2 00/17] Device tree based NUMA support for Arm - Part#3
@ 2023-01-10  8:49 Wei Chen
  2023-01-10  8:49 ` [PATCH v2 01/17] xen/arm: use NR_MEM_BANKS to override default NR_NODE_MEMBLKS Wei Chen
                   ` (16 more replies)
  0 siblings, 17 replies; 34+ messages in thread
From: Wei Chen @ 2023-01-10  8:49 UTC (permalink / raw)
  To: xen-devel
  Cc: nd, Wei Chen, Stefano Stabellini, Julien Grall, Bertrand Marquis,
	Volodymyr Babchuk, Andrew Cooper, George Dunlap, Jan Beulich,
	Wei Liu, Roger Pau Monné

(The Arm device tree based NUMA support patch set contains 35
patches. In order to make stuff easier for reviewers, I split
them into 3 parts:
1. Preparation. I have re-sorted the patch series. And moved
   independent patches to the head of the series - merged in [1]
2. Move generically usable code from x86 to common - merged in [2]
3. Add new code to support Arm - this series.

This series only contains the third part patches. As the whole NUMA
series has been reviewed for 1 round in [3], so this series would
be v2)

Xen memory allocation and scheduler modules are NUMA aware.
But actually, on x86 has implemented the architecture APIs
to support NUMA. Arm was providing a set of fake architecture
APIs to make it compatible with NUMA awared memory allocation
and scheduler.

Arm system was working well as a single node NUMA system with
these fake APIs, because we didn't have multiple nodes NUMA
system on Arm. But in recent years, more and more Arm devices
support multiple nodes NUMA system.

So now we have a new problem. When Xen is running on these Arm
devices, Xen still treat them as single node SMP systems. The
NUMA affinity capability of Xen memory allocation and scheduler
becomes meaningless. Because they rely on input data that does
not reflect real NUMA layout.

Xen still think the access time for all of the memory is the
same for all CPUs. However, Xen may allocate memory to a VM
from different NUMA nodes with different access speeds. This
difference can be amplified in workloads inside VM, causing
performance instability and timeouts.

So in this patch series, we implement a set of NUMA API to use
device tree to describe the NUMA layout. We reuse most of the
code of x86 NUMA to create and maintain the mapping between
memory and CPU, create the matrix between any two NUMA nodes.
Except ACPI and some x86 specified code, we have moved other
code to common. In next stage, when we implement ACPI based
NUMA for Arm64, we may move the ACPI NUMA code to common too,
but in current stage, we keep it as x86 only.

This patch serires has been tested and booted well on one
Arm64 NUMA machine and one HPE x86 NUMA machine.

[1] https://lists.xenproject.org/archives/html/xen-devel/2022-06/msg00499.html
[2] https://lists.xenproject.org/archives/html/xen-devel/2022-11/msg01043.html
[3] https://lists.xenproject.org/archives/html/xen-devel/2021-09/msg01903.html

Henry Wang (1):
  xen/arm: Set correct per-cpu cpu_core_mask

Wei Chen (16):
  xen/arm: use NR_MEM_BANKS to override default NR_NODE_MEMBLKS
  xen/arm: implement helpers to get and update NUMA status
  xen/arm: implement node distance helpers for Arm
  xen/arm: use arch_get_ram_range to memory ranges from bootinfo
  xen/arm: build NUMA cpu_to_node map in dt_smp_init_cpus
  xen/arm: Add boot and secondary CPU to NUMA system
  xen/arm: introduce a helper to parse device tree processor node
  xen/arm: introduce a helper to parse device tree memory node
  xen/arm: introduce a helper to parse device tree NUMA distance map
  xen/arm: unified entry to parse all NUMA data from device tree
  xen/arm: keep guest still be NUMA unware
  xen/arm: enable device tree based NUMA in system init
  xen/arm: implement numa_node_to_arch_nid for device tree NUMA
  xen/arm: use CONFIG_NUMA to gate node_online_map in smpboot
  xen/arm: Provide Kconfig options for Arm to enable NUMA
  docs: update numa command line to support Arm

 SUPPORT.md                        |   1 +
 docs/misc/xen-command-line.pandoc |   2 +-
 xen/arch/arm/Kconfig              |  11 ++
 xen/arch/arm/Makefile             |   2 +
 xen/arch/arm/domain_build.c       |   6 +
 xen/arch/arm/include/asm/numa.h   |  87 ++++++++-
 xen/arch/arm/numa.c               | 183 +++++++++++++++++++
 xen/arch/arm/numa_device_tree.c   | 290 ++++++++++++++++++++++++++++++
 xen/arch/arm/setup.c              |  17 ++
 xen/arch/arm/smpboot.c            |  38 ++++
 xen/arch/x86/include/asm/numa.h   |   2 -
 xen/arch/x86/srat.c               |   2 +-
 xen/include/xen/numa.h            |  11 ++
 13 files changed, 647 insertions(+), 5 deletions(-)
 create mode 100644 xen/arch/arm/numa.c
 create mode 100644 xen/arch/arm/numa_device_tree.c

-- 
2.25.1



^ permalink raw reply	[flat|nested] 34+ messages in thread

* [PATCH v2 01/17] xen/arm: use NR_MEM_BANKS to override default NR_NODE_MEMBLKS
  2023-01-10  8:49 [PATCH v2 00/17] Device tree based NUMA support for Arm - Part#3 Wei Chen
@ 2023-01-10  8:49 ` Wei Chen
  2023-01-10  9:59   ` Jan Beulich
  2023-01-10  8:49 ` [PATCH v2 02/17] xen/arm: implement helpers to get and update NUMA status Wei Chen
                   ` (15 subsequent siblings)
  16 siblings, 1 reply; 34+ messages in thread
From: Wei Chen @ 2023-01-10  8:49 UTC (permalink / raw)
  To: xen-devel
  Cc: nd, Wei Chen, Stefano Stabellini, Julien Grall, Bertrand Marquis,
	Volodymyr Babchuk, Andrew Cooper, George Dunlap, Jan Beulich,
	Wei Liu

As a memory range described in device tree cannot be split across
multiple nodes. And it is very likely than if you have more than
64 nodes, you may need a lot more than 2 regions per node. So the
default NR_NODE_MEMBLKS value (MAX_NUMNODES * 2) makes no sense
on Arm.

So, for Arm, we would just define NR_NODE_MEMBLKS as an alias to
NR_MEM_BANKS. And in the future NR_MEM_BANKS will be user-configurable
via kconfig, but for now leave NR_MEM_BANKS as 128 on Arm. This
avoid to have different way to define the value based NUMA vs non-NUMA.

Further discussions can be found here[1].

[1] https://lists.xenproject.org/archives/html/xen-devel/2021-09/msg02322.html

Signed-off-by: Wei Chen <wei.chen@arm.com>
---
v1 -> v2:
1. Add code comments to explain using NR_MEM_BANKS for Arm
2. Refine commit messages.
---
 xen/arch/arm/include/asm/numa.h | 19 ++++++++++++++++++-
 xen/include/xen/numa.h          |  9 +++++++++
 2 files changed, 27 insertions(+), 1 deletion(-)

diff --git a/xen/arch/arm/include/asm/numa.h b/xen/arch/arm/include/asm/numa.h
index e2bee2bd82..7d6ae36a19 100644
--- a/xen/arch/arm/include/asm/numa.h
+++ b/xen/arch/arm/include/asm/numa.h
@@ -3,9 +3,26 @@
 
 #include <xen/mm.h>
 
+#include <asm/setup.h>
+
 typedef u8 nodeid_t;
 
-#ifndef CONFIG_NUMA
+#ifdef CONFIG_NUMA
+
+/*
+ * It is very likely that if you have more than 64 nodes, you may
+ * need a lot more than 2 regions per node. So, for Arm, we would
+ * just define NR_NODE_MEMBLKS as an alias to NR_MEM_BANKS.
+ * And in the future NR_MEM_BANKS will be bumped for new platforms,
+ * but for now leave NR_MEM_BANKS as it is on Arm. This avoid to
+ * have different way to define the value based NUMA vs non-NUMA.
+ *
+ * Further discussions can be found here:
+ * https://lists.xenproject.org/archives/html/xen-devel/2021-09/msg02322.html
+ */
+#define NR_NODE_MEMBLKS NR_MEM_BANKS
+
+#else
 
 /* Fake one node for now. See also node_online_map. */
 #define cpu_to_node(cpu) 0
diff --git a/xen/include/xen/numa.h b/xen/include/xen/numa.h
index 29b8c2df89..b86d0851fc 100644
--- a/xen/include/xen/numa.h
+++ b/xen/include/xen/numa.h
@@ -13,7 +13,16 @@
 #define MAX_NUMNODES 1
 #endif
 
+/*
+ * Some architectures may have different considerations for
+ * number of node memory blocks. They can define their
+ * NR_NODE_MEMBLKS in asm/numa.h to reflect their architectural
+ * implementation. If the arch does not have specific implementation,
+ * the following default NR_NODE_MEMBLKS will be used.
+ */
+#ifndef NR_NODE_MEMBLKS
 #define NR_NODE_MEMBLKS (MAX_NUMNODES * 2)
+#endif
 
 #define vcpu_to_node(v) (cpu_to_node((v)->processor))
 
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH v2 02/17] xen/arm: implement helpers to get and update NUMA status
  2023-01-10  8:49 [PATCH v2 00/17] Device tree based NUMA support for Arm - Part#3 Wei Chen
  2023-01-10  8:49 ` [PATCH v2 01/17] xen/arm: use NR_MEM_BANKS to override default NR_NODE_MEMBLKS Wei Chen
@ 2023-01-10  8:49 ` Wei Chen
  2023-01-10 16:38   ` Jan Beulich
  2023-01-10  8:49 ` [PATCH v2 03/17] xen/arm: implement node distance helpers for Arm Wei Chen
                   ` (14 subsequent siblings)
  16 siblings, 1 reply; 34+ messages in thread
From: Wei Chen @ 2023-01-10  8:49 UTC (permalink / raw)
  To: xen-devel
  Cc: nd, Wei Chen, Stefano Stabellini, Julien Grall, Bertrand Marquis,
	Volodymyr Babchuk, Andrew Cooper, George Dunlap, Jan Beulich,
	Wei Liu, Roger Pau Monné

NUMA has one global and one implementation specific switches. For
ACPI NUMA implementation, Xen has acpi_numa, so we introduce
device_tree_numa for device tree NUMA implementation. And use
enumerations to indicate init, off and on status.

arch_numa_disabled will get device_tree_numa status, but for
arch_numa_setup we have not provided boot arguments to setup
device_tree_numa. So we just return -EINVAL in this patch.

Signed-off-by: Wei Chen <wei.chen@arm.com>
---
v1 -> v2:
1. Use arch_numa_disabled to replace numa_enable_with_firmware.
2. Introduce enumerations for device tree numa status.
3. Use common numa_disabled, drop Arm version numa_disabled.
4. Introduce arch_numa_setup for Arm.
5. Rename bad_srat to numa_bad.
6. Add numa_enable_with_firmware helper.
7. Add numa_disabled helper.
8. Refine commit message.
---
 xen/arch/arm/include/asm/numa.h | 17 +++++++++++++
 xen/arch/arm/numa.c             | 44 +++++++++++++++++++++++++++++++++
 xen/arch/x86/include/asm/numa.h |  1 -
 xen/include/xen/numa.h          |  1 +
 4 files changed, 62 insertions(+), 1 deletion(-)
 create mode 100644 xen/arch/arm/numa.c

diff --git a/xen/arch/arm/include/asm/numa.h b/xen/arch/arm/include/asm/numa.h
index 7d6ae36a19..52ca414e47 100644
--- a/xen/arch/arm/include/asm/numa.h
+++ b/xen/arch/arm/include/asm/numa.h
@@ -22,6 +22,12 @@ typedef u8 nodeid_t;
  */
 #define NR_NODE_MEMBLKS NR_MEM_BANKS
 
+enum dt_numa_status {
+    DT_NUMA_INIT,
+    DT_NUMA_ON,
+    DT_NUMA_OFF,
+};
+
 #else
 
 /* Fake one node for now. See also node_online_map. */
@@ -39,6 +45,17 @@ extern mfn_t first_valid_mfn;
 #define node_start_pfn(nid) (mfn_x(first_valid_mfn))
 #define __node_distance(a, b) (20)
 
+#define numa_disabled() (true)
+static inline bool arch_numa_unavailable(void)
+{
+    return true;
+}
+
+static inline bool arch_numa_broken(void)
+{
+    return true;
+}
+
 #endif
 
 #define arch_want_default_dmazone() (false)
diff --git a/xen/arch/arm/numa.c b/xen/arch/arm/numa.c
new file mode 100644
index 0000000000..1c02b6a25d
--- /dev/null
+++ b/xen/arch/arm/numa.c
@@ -0,0 +1,44 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Arm Architecture support layer for NUMA.
+ *
+ * Copyright (C) 2021 Arm Ltd
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program. If not, see <http://www.gnu.org/licenses/>.
+ *
+ */
+#include <xen/init.h>
+#include <xen/numa.h>
+
+static enum dt_numa_status __read_mostly device_tree_numa;
+
+void __init numa_fw_bad(void)
+{
+    printk(KERN_ERR "NUMA: device tree numa info table not used.\n");
+    device_tree_numa = DT_NUMA_OFF;
+}
+
+bool __init arch_numa_unavailable(void)
+{
+    return device_tree_numa != DT_NUMA_ON;
+}
+
+bool arch_numa_disabled(void)
+{
+    return device_tree_numa == DT_NUMA_OFF;
+}
+
+int __init arch_numa_setup(const char *opt)
+{
+    return -EINVAL;
+}
diff --git a/xen/arch/x86/include/asm/numa.h b/xen/arch/x86/include/asm/numa.h
index 7866afa408..61efe60a95 100644
--- a/xen/arch/x86/include/asm/numa.h
+++ b/xen/arch/x86/include/asm/numa.h
@@ -12,7 +12,6 @@ extern unsigned int numa_node_to_arch_nid(nodeid_t n);
 
 #define ZONE_ALIGN (1UL << (MAX_ORDER+PAGE_SHIFT))
 
-extern bool numa_disabled(void);
 extern nodeid_t setup_node(unsigned int pxm);
 extern void srat_detect_node(int cpu);
 
diff --git a/xen/include/xen/numa.h b/xen/include/xen/numa.h
index b86d0851fc..7d7aeb3a3c 100644
--- a/xen/include/xen/numa.h
+++ b/xen/include/xen/numa.h
@@ -55,6 +55,7 @@ extern void numa_init_array(void);
 extern void numa_set_node(unsigned int cpu, nodeid_t node);
 extern void numa_initmem_init(unsigned long start_pfn, unsigned long end_pfn);
 extern void numa_fw_bad(void);
+extern bool numa_disabled(void);
 
 extern int arch_numa_setup(const char *opt);
 extern bool arch_numa_unavailable(void);
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH v2 03/17] xen/arm: implement node distance helpers for Arm
  2023-01-10  8:49 [PATCH v2 00/17] Device tree based NUMA support for Arm - Part#3 Wei Chen
  2023-01-10  8:49 ` [PATCH v2 01/17] xen/arm: use NR_MEM_BANKS to override default NR_NODE_MEMBLKS Wei Chen
  2023-01-10  8:49 ` [PATCH v2 02/17] xen/arm: implement helpers to get and update NUMA status Wei Chen
@ 2023-01-10  8:49 ` Wei Chen
  2023-01-10 16:47   ` Jan Beulich
  2023-01-10  8:49 ` [PATCH v2 04/17] xen/arm: use arch_get_ram_range to memory ranges from bootinfo Wei Chen
                   ` (13 subsequent siblings)
  16 siblings, 1 reply; 34+ messages in thread
From: Wei Chen @ 2023-01-10  8:49 UTC (permalink / raw)
  To: xen-devel
  Cc: nd, Wei Chen, Stefano Stabellini, Julien Grall, Bertrand Marquis,
	Volodymyr Babchuk, Andrew Cooper, George Dunlap, Jan Beulich,
	Wei Liu, Roger Pau Monné

We will parse NUMA nodes distances from device tree. So we need
a matrix to record the distances between any two nodes we parsed.
Accordingly, we provide this node_set_distance API for device tree
NUMA to set the distance for any two nodes in this patch. When
NUMA initialization failed, __node_distance will return
NUMA_REMOTE_DISTANCE, this will help us avoid doing rollback
for distance maxtrix when NUMA initialization failed.

As both x86 and Arm have implemented __node_distance, so we move
its definition from asm/numa.h to xen/numa.h. At same time, the
outdated u8 return value of x86 has been changed to unsigned char.

Signed-off-by: Wei Chen <wei.chen@arm.com>
---
v1 -> v2:
1. Use unsigned int/char instead of uint32_t/u8.
2. Re-org the commit message.
---
 xen/arch/arm/Makefile           |  1 +
 xen/arch/arm/include/asm/numa.h | 14 +++++++++
 xen/arch/arm/numa.c             | 52 ++++++++++++++++++++++++++++++++-
 xen/arch/x86/include/asm/numa.h |  1 -
 xen/arch/x86/srat.c             |  2 +-
 xen/include/xen/numa.h          |  1 +
 6 files changed, 68 insertions(+), 3 deletions(-)

diff --git a/xen/arch/arm/Makefile b/xen/arch/arm/Makefile
index 4d076b278b..9073398d6e 100644
--- a/xen/arch/arm/Makefile
+++ b/xen/arch/arm/Makefile
@@ -38,6 +38,7 @@ obj-$(CONFIG_LIVEPATCH) += livepatch.o
 obj-y += mem_access.o
 obj-y += mm.o
 obj-y += monitor.o
+obj-$(CONFIG_NUMA) += numa.o
 obj-y += p2m.o
 obj-y += percpu.o
 obj-y += platform.o
diff --git a/xen/arch/arm/include/asm/numa.h b/xen/arch/arm/include/asm/numa.h
index 52ca414e47..dbdb632711 100644
--- a/xen/arch/arm/include/asm/numa.h
+++ b/xen/arch/arm/include/asm/numa.h
@@ -28,6 +28,20 @@ enum dt_numa_status {
     DT_NUMA_OFF,
 };
 
+/*
+ * In ACPI spec, 0-9 are the reserved values for node distance,
+ * 10 indicates local node distance, 20 indicates remote node
+ * distance. Set node distance map in device tree will follow
+ * the ACPI's definition.
+ */
+#define NUMA_DISTANCE_UDF_MIN   0
+#define NUMA_DISTANCE_UDF_MAX   9
+#define NUMA_LOCAL_DISTANCE     10
+#define NUMA_REMOTE_DISTANCE    20
+
+extern void numa_set_distance(nodeid_t from, nodeid_t to,
+                              unsigned int distance);
+
 #else
 
 /* Fake one node for now. See also node_online_map. */
diff --git a/xen/arch/arm/numa.c b/xen/arch/arm/numa.c
index 1c02b6a25d..34851ceacf 100644
--- a/xen/arch/arm/numa.c
+++ b/xen/arch/arm/numa.c
@@ -2,7 +2,7 @@
 /*
  * Arm Architecture support layer for NUMA.
  *
- * Copyright (C) 2021 Arm Ltd
+ * Copyright (C) 2022 Arm Ltd
  *
  * This program is free software; you can redistribute it and/or modify
  * it under the terms of the GNU General Public License version 2 as
@@ -22,6 +22,11 @@
 
 static enum dt_numa_status __read_mostly device_tree_numa;
 
+static unsigned char __read_mostly
+node_distance_map[MAX_NUMNODES][MAX_NUMNODES] = {
+    { 0 }
+};
+
 void __init numa_fw_bad(void)
 {
     printk(KERN_ERR "NUMA: device tree numa info table not used.\n");
@@ -42,3 +47,48 @@ int __init arch_numa_setup(const char *opt)
 {
     return -EINVAL;
 }
+
+void __init numa_set_distance(nodeid_t from, nodeid_t to,
+                              unsigned int distance)
+{
+    if ( from >= MAX_NUMNODES || to >= MAX_NUMNODES )
+    {
+        printk(KERN_WARNING
+               "NUMA: invalid nodes: from=%"PRIu8" to=%"PRIu8" MAX=%"PRIu8"\n",
+               from, to, MAX_NUMNODES);
+        return;
+    }
+
+    /* NUMA defines 0xff as an unreachable node and 0-9 are undefined */
+    if ( distance >= NUMA_NO_DISTANCE ||
+        (distance >= NUMA_DISTANCE_UDF_MIN &&
+         distance <= NUMA_DISTANCE_UDF_MAX) ||
+        (from == to && distance != NUMA_LOCAL_DISTANCE) )
+    {
+        printk(KERN_WARNING
+               "NUMA: invalid distance: from=%"PRIu8" to=%"PRIu8" distance=%"PRIu32"\n",
+               from, to, distance);
+        return;
+    }
+
+    node_distance_map[from][to] = distance;
+}
+
+unsigned char __node_distance(nodeid_t from, nodeid_t to)
+{
+    /* When NUMA is off, any distance will be treated as remote. */
+    if ( numa_disabled() )
+        return NUMA_REMOTE_DISTANCE;
+
+    /*
+     * Check whether the nodes are in the matrix range.
+     * When any node is out of range, except from and to nodes are the
+     * same, we treat them as unreachable (return 0xFF)
+     */
+    if ( from >= MAX_NUMNODES || to >= MAX_NUMNODES )
+        return from == to ? NUMA_LOCAL_DISTANCE : NUMA_NO_DISTANCE;
+
+    return node_distance_map[from][to];
+}
+
+EXPORT_SYMBOL(__node_distance);
diff --git a/xen/arch/x86/include/asm/numa.h b/xen/arch/x86/include/asm/numa.h
index 61efe60a95..18b71ddfef 100644
--- a/xen/arch/x86/include/asm/numa.h
+++ b/xen/arch/x86/include/asm/numa.h
@@ -21,7 +21,6 @@ extern void init_cpu_to_node(void);
 #define arch_want_default_dmazone() (num_online_nodes() > 1)
 
 void srat_parse_regions(paddr_t addr);
-extern u8 __node_distance(nodeid_t a, nodeid_t b);
 unsigned int arch_get_dma_bitsize(void);
 
 #endif
diff --git a/xen/arch/x86/srat.c b/xen/arch/x86/srat.c
index 56749ddca5..50faf5d352 100644
--- a/xen/arch/x86/srat.c
+++ b/xen/arch/x86/srat.c
@@ -328,7 +328,7 @@ unsigned int numa_node_to_arch_nid(nodeid_t n)
 	return 0;
 }
 
-u8 __node_distance(nodeid_t a, nodeid_t b)
+unsigned char __node_distance(nodeid_t a, nodeid_t b)
 {
 	unsigned index;
 	u8 slit_val;
diff --git a/xen/include/xen/numa.h b/xen/include/xen/numa.h
index 7d7aeb3a3c..cff4fb8ccc 100644
--- a/xen/include/xen/numa.h
+++ b/xen/include/xen/numa.h
@@ -115,6 +115,7 @@ extern bool numa_memblks_available(void);
 extern bool numa_update_node_memblks(nodeid_t node, unsigned int arch_nid,
                                      paddr_t start, paddr_t size, bool hotplug);
 extern void numa_set_processor_nodes_parsed(nodeid_t node);
+extern unsigned char __node_distance(nodeid_t a, nodeid_t b);
 
 #else
 
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH v2 04/17] xen/arm: use arch_get_ram_range to memory ranges from bootinfo
  2023-01-10  8:49 [PATCH v2 00/17] Device tree based NUMA support for Arm - Part#3 Wei Chen
                   ` (2 preceding siblings ...)
  2023-01-10  8:49 ` [PATCH v2 03/17] xen/arm: implement node distance helpers for Arm Wei Chen
@ 2023-01-10  8:49 ` Wei Chen
  2023-01-10  8:49 ` [PATCH v2 05/17] xen/arm: build NUMA cpu_to_node map in dt_smp_init_cpus Wei Chen
                   ` (12 subsequent siblings)
  16 siblings, 0 replies; 34+ messages in thread
From: Wei Chen @ 2023-01-10  8:49 UTC (permalink / raw)
  To: xen-devel
  Cc: nd, Wei Chen, Stefano Stabellini, Julien Grall, Bertrand Marquis,
	Volodymyr Babchuk

Implement the same helper "arch_get_ram_range" as x86 for NUMA
code to get memory bank from Arm bootinfo.

Signed-off-by: Wei Chen <wei.chen@arm.com>
---
v2 -> v3:
1. Use arch_get_ram_range instead of arch_get_memory_map.
v1 -> v2:
1. Use arch_get_memory_map to replace arch_get_memory_bank_range
   and arch_get_memory_bank_number.
---
 xen/arch/arm/numa.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/xen/arch/arm/numa.c b/xen/arch/arm/numa.c
index 34851ceacf..dcfcd85fcf 100644
--- a/xen/arch/arm/numa.c
+++ b/xen/arch/arm/numa.c
@@ -92,3 +92,14 @@ unsigned char __node_distance(nodeid_t from, nodeid_t to)
 }
 
 EXPORT_SYMBOL(__node_distance);
+
+int __init arch_get_ram_range(unsigned int idx, paddr_t *start, paddr_t *end)
+{
+    if ( idx >= bootinfo.mem.nr_banks )
+        return -ENOENT;
+
+    *start = bootinfo.mem.bank[idx].start;
+    *end = *start + bootinfo.mem.bank[idx].size;
+
+    return 0;
+}
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH v2 05/17] xen/arm: build NUMA cpu_to_node map in dt_smp_init_cpus
  2023-01-10  8:49 [PATCH v2 00/17] Device tree based NUMA support for Arm - Part#3 Wei Chen
                   ` (3 preceding siblings ...)
  2023-01-10  8:49 ` [PATCH v2 04/17] xen/arm: use arch_get_ram_range to memory ranges from bootinfo Wei Chen
@ 2023-01-10  8:49 ` Wei Chen
  2023-01-10  8:49 ` [PATCH v2 06/17] xen/arm: Add boot and secondary CPU to NUMA system Wei Chen
                   ` (11 subsequent siblings)
  16 siblings, 0 replies; 34+ messages in thread
From: Wei Chen @ 2023-01-10  8:49 UTC (permalink / raw)
  To: xen-devel
  Cc: nd, Wei Chen, Stefano Stabellini, Julien Grall, Bertrand Marquis,
	Volodymyr Babchuk

NUMA implementation has a cpu_to_node array to store CPU to NODE
map. Xen is using CPU logical ID in runtime components, so we
use CPU logical ID as CPU index in cpu_to_node.

In device tree case, cpu_logical_map is created in dt_smp_init_cpus.
So, when NUMA is enabled, dt_smp_init_cpus will fetch CPU NUMA id
at the same time for cpu_to_node.

Signed-off-by: Wei Chen <wei.chen@arm.com>
---
v1 -> v2:
1. Use static inline to replace macros to perform
   function paramerters type check.
2. Add numa_disabled to gate the numa-node-id check for
   CONFIG_NUMA on but numa disabled user case.
3. Use macro instead of static inline function to stub
   numa_set_node.
---
 xen/arch/arm/include/asm/numa.h |  4 ++++
 xen/arch/arm/smpboot.c          | 36 +++++++++++++++++++++++++++++++++
 2 files changed, 40 insertions(+)

diff --git a/xen/arch/arm/include/asm/numa.h b/xen/arch/arm/include/asm/numa.h
index dbdb632711..3bc28416b4 100644
--- a/xen/arch/arm/include/asm/numa.h
+++ b/xen/arch/arm/include/asm/numa.h
@@ -70,6 +70,10 @@ static inline bool arch_numa_broken(void)
     return true;
 }
 
+static inline void numa_set_node(unsigned int cpu, nodeid_t node)
+{
+}
+
 #endif
 
 #define arch_want_default_dmazone() (false)
diff --git a/xen/arch/arm/smpboot.c b/xen/arch/arm/smpboot.c
index 412ae22869..5ee6ab11e9 100644
--- a/xen/arch/arm/smpboot.c
+++ b/xen/arch/arm/smpboot.c
@@ -118,7 +118,12 @@ static void __init dt_smp_init_cpus(void)
     {
         [0 ... NR_CPUS - 1] = MPIDR_INVALID
     };
+    static nodeid_t node_map[NR_CPUS] __initdata =
+    {
+        [0 ... NR_CPUS - 1] = NUMA_NO_NODE
+    };
     bool bootcpu_valid = false;
+    unsigned int nid = 0;
     int rc;
 
     mpidr = system_cpuinfo.mpidr.bits & MPIDR_HWID_MASK;
@@ -169,6 +174,28 @@ static void __init dt_smp_init_cpus(void)
             continue;
         }
 
+        if ( IS_ENABLED(CONFIG_NUMA) )
+        {
+            /*
+             * When CONFIG_NUMA is set, try to fetch numa infomation
+             * from CPU dts node, otherwise the nid is always 0.
+             */
+            if ( !dt_property_read_u32(cpu, "numa-node-id", &nid) )
+            {
+                printk(XENLOG_WARNING
+                       "cpu[%d] dts path: %s: doesn't have numa information!\n",
+                       cpuidx, dt_node_full_name(cpu));
+                /*
+                 * During the early stage of NUMA initialization, when Xen
+                 * found any CPU dts node doesn't have numa-node-id info, the
+                 * NUMA will be treated as off, all CPU will be set to a FAKE
+                 * node 0. So if we get numa-node-id failed here, we should
+                 * set nid to 0.
+                 */
+                nid = 0;
+            }
+        }
+
         /*
          * 8 MSBs must be set to 0 in the DT since the reg property
          * defines the MPIDR[23:0]
@@ -228,9 +255,13 @@ static void __init dt_smp_init_cpus(void)
         {
             printk("cpu%d init failed (hwid %"PRIregister"): %d\n", i, hwid, rc);
             tmp_map[i] = MPIDR_INVALID;
+            node_map[i] = NUMA_NO_NODE;
         }
         else
+        {
             tmp_map[i] = hwid;
+            node_map[i] = nid;
+        }
     }
 
     if ( !bootcpu_valid )
@@ -246,6 +277,11 @@ static void __init dt_smp_init_cpus(void)
             continue;
         cpumask_set_cpu(i, &cpu_possible_map);
         cpu_logical_map(i) = tmp_map[i];
+
+        nid = node_map[i];
+        if ( nid >= MAX_NUMNODES )
+            nid = 0;
+        numa_set_node(i, nid);
     }
 }
 
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH v2 06/17] xen/arm: Add boot and secondary CPU to NUMA system
  2023-01-10  8:49 [PATCH v2 00/17] Device tree based NUMA support for Arm - Part#3 Wei Chen
                   ` (4 preceding siblings ...)
  2023-01-10  8:49 ` [PATCH v2 05/17] xen/arm: build NUMA cpu_to_node map in dt_smp_init_cpus Wei Chen
@ 2023-01-10  8:49 ` Wei Chen
  2023-01-10  8:49 ` [PATCH v2 07/17] xen/arm: introduce a helper to parse device tree processor node Wei Chen
                   ` (10 subsequent siblings)
  16 siblings, 0 replies; 34+ messages in thread
From: Wei Chen @ 2023-01-10  8:49 UTC (permalink / raw)
  To: xen-devel
  Cc: nd, Wei Chen, Stefano Stabellini, Julien Grall, Bertrand Marquis,
	Volodymyr Babchuk

In this patch, we make NUMA node online and add cpu to
its NUMA node. This will make NUMA-aware components
have NUMA affinity data to support their work.

To keep the mostly the same behavior of x86, we use
numa_detect_cpu_node to online node. The difference is that,
we have prepared cpu_to_node in dt_smp_init_cpus, so we don't
need to setup cpu_to_node in numa_detect_cpu_node.

Signed-off-by: Wei Chen <wei.chen@arm.com>
---
v2 -> v3:
1. Use unsigned int instead of int for cpu id.
2. Use static inline for stub to do type check.
v1 -> v2:
1. Use numa_detect_cpu_node to online node.
2. Use macros instead of static inline functions to stub
   numa_detect_cpu_node.
---
 xen/arch/arm/include/asm/numa.h |  9 +++++++++
 xen/arch/arm/numa.c             | 10 ++++++++++
 xen/arch/arm/setup.c            |  5 +++++
 3 files changed, 24 insertions(+)

diff --git a/xen/arch/arm/include/asm/numa.h b/xen/arch/arm/include/asm/numa.h
index 3bc28416b4..e0c909cbb7 100644
--- a/xen/arch/arm/include/asm/numa.h
+++ b/xen/arch/arm/include/asm/numa.h
@@ -41,6 +41,7 @@ enum dt_numa_status {
 
 extern void numa_set_distance(nodeid_t from, nodeid_t to,
                               unsigned int distance);
+extern void numa_detect_cpu_node(unsigned int cpu);
 
 #else
 
@@ -74,6 +75,14 @@ static inline void numa_set_node(unsigned int cpu, nodeid_t node)
 {
 }
 
+static inline void numa_add_cpu(unsigned int cpu)
+{
+}
+
+static inline void numa_detect_cpu_node(unsigned int cpu)
+{
+}
+
 #endif
 
 #define arch_want_default_dmazone() (false)
diff --git a/xen/arch/arm/numa.c b/xen/arch/arm/numa.c
index dcfcd85fcf..4dd7cf10ba 100644
--- a/xen/arch/arm/numa.c
+++ b/xen/arch/arm/numa.c
@@ -74,6 +74,16 @@ void __init numa_set_distance(nodeid_t from, nodeid_t to,
     node_distance_map[from][to] = distance;
 }
 
+void numa_detect_cpu_node(unsigned int cpu)
+{
+    nodeid_t node = cpu_to_node[cpu];
+
+    if ( node == NUMA_NO_NODE )
+        node = 0;
+
+    node_set_online(node);
+}
+
 unsigned char __node_distance(nodeid_t from, nodeid_t to)
 {
     /* When NUMA is off, any distance will be treated as remote. */
diff --git a/xen/arch/arm/setup.c b/xen/arch/arm/setup.c
index 1f26f67b90..8c02cf6cd4 100644
--- a/xen/arch/arm/setup.c
+++ b/xen/arch/arm/setup.c
@@ -1115,6 +1115,11 @@ void __init start_xen(unsigned long boot_phys_offset,
 
     for_each_present_cpu ( i )
     {
+        /* Detect and online node based on cpu_to_node[]. */
+        numa_detect_cpu_node(i);
+        /* Set up node_to_cpumask based on cpu_to_node[]. */
+        numa_add_cpu(i);
+
         if ( (num_online_cpus() < nr_cpu_ids) && !cpu_online(i) )
         {
             int ret = cpu_up(i);
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH v2 07/17] xen/arm: introduce a helper to parse device tree processor node
  2023-01-10  8:49 [PATCH v2 00/17] Device tree based NUMA support for Arm - Part#3 Wei Chen
                   ` (5 preceding siblings ...)
  2023-01-10  8:49 ` [PATCH v2 06/17] xen/arm: Add boot and secondary CPU to NUMA system Wei Chen
@ 2023-01-10  8:49 ` Wei Chen
  2023-01-10  8:49 ` [PATCH v2 08/17] xen/arm: introduce a helper to parse device tree memory node Wei Chen
                   ` (9 subsequent siblings)
  16 siblings, 0 replies; 34+ messages in thread
From: Wei Chen @ 2023-01-10  8:49 UTC (permalink / raw)
  To: xen-devel
  Cc: nd, Wei Chen, Stefano Stabellini, Julien Grall, Bertrand Marquis,
	Volodymyr Babchuk

Processor NUMA ID information is stored in device tree's processor
node as "numa-node-id". We need a new helper to parse this ID from
processor node. If we get this ID from processor node, this ID's
validity still need to be checked. Once we got a invalid NUMA ID
from any processor node, the device tree will be marked as NUMA
information invalid.

Signed-off-by: Wei Chen <wei.chen@arm.com>
---
v1 -> v2:
1. Move numa_disabled from fdt_numa_processor_affinity_init
   to fdt_parse_numa_cpu_node.
2. Move invalid NUMA id check to fdt_parse_numa_cpu_node.
3. Return ENODATA for normal dtb without NUMA info.
4. Use NUMA status helpers instead of SRAT functions.
---
 xen/arch/arm/Makefile           |  1 +
 xen/arch/arm/include/asm/numa.h |  2 ++
 xen/arch/arm/numa.c             |  2 +-
 xen/arch/arm/numa_device_tree.c | 64 +++++++++++++++++++++++++++++++++
 4 files changed, 68 insertions(+), 1 deletion(-)
 create mode 100644 xen/arch/arm/numa_device_tree.c

diff --git a/xen/arch/arm/Makefile b/xen/arch/arm/Makefile
index 9073398d6e..bbc68e3735 100644
--- a/xen/arch/arm/Makefile
+++ b/xen/arch/arm/Makefile
@@ -39,6 +39,7 @@ obj-y += mem_access.o
 obj-y += mm.o
 obj-y += monitor.o
 obj-$(CONFIG_NUMA) += numa.o
+obj-$(CONFIG_DEVICE_TREE_NUMA) += numa_device_tree.o
 obj-y += p2m.o
 obj-y += percpu.o
 obj-y += platform.o
diff --git a/xen/arch/arm/include/asm/numa.h b/xen/arch/arm/include/asm/numa.h
index e0c909cbb7..923ffbfd42 100644
--- a/xen/arch/arm/include/asm/numa.h
+++ b/xen/arch/arm/include/asm/numa.h
@@ -28,6 +28,8 @@ enum dt_numa_status {
     DT_NUMA_OFF,
 };
 
+extern enum dt_numa_status device_tree_numa;
+
 /*
  * In ACPI spec, 0-9 are the reserved values for node distance,
  * 10 indicates local node distance, 20 indicates remote node
diff --git a/xen/arch/arm/numa.c b/xen/arch/arm/numa.c
index 4dd7cf10ba..3e02cec646 100644
--- a/xen/arch/arm/numa.c
+++ b/xen/arch/arm/numa.c
@@ -20,7 +20,7 @@
 #include <xen/init.h>
 #include <xen/numa.h>
 
-static enum dt_numa_status __read_mostly device_tree_numa;
+enum dt_numa_status __read_mostly device_tree_numa;
 
 static unsigned char __read_mostly
 node_distance_map[MAX_NUMNODES][MAX_NUMNODES] = {
diff --git a/xen/arch/arm/numa_device_tree.c b/xen/arch/arm/numa_device_tree.c
new file mode 100644
index 0000000000..c031053d24
--- /dev/null
+++ b/xen/arch/arm/numa_device_tree.c
@@ -0,0 +1,64 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Arm Architecture support layer for device tree NUMA.
+ *
+ * Copyright (C) 2022 Arm Ltd
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program. If not, see <http://www.gnu.org/licenses/>.
+ *
+ */
+#include <xen/init.h>
+#include <xen/nodemask.h>
+#include <xen/numa.h>
+#include <xen/libfdt/libfdt.h>
+#include <xen/device_tree.h>
+
+/* Callback for device tree processor affinity */
+static int __init fdt_numa_processor_affinity_init(nodeid_t node)
+{
+    numa_set_processor_nodes_parsed(node);
+    device_tree_numa = DT_NUMA_ON;
+
+    printk(KERN_INFO "DT: NUMA node %"PRIu8" processor parsed\n", node);
+
+    return 0;
+}
+
+/* Parse CPU NUMA node info */
+static int __init fdt_parse_numa_cpu_node(const void *fdt, int node)
+{
+    unsigned int nid;
+
+    if ( numa_disabled() )
+        return -EINVAL;
+
+    /*
+     * device_tree_get_u32 will return NUMA_NO_NODE when this CPU
+     * DT node doesn't have numa-node-id. This can help us to
+     * distinguish a bad DTB and a normal DTB without NUMA info.
+     */
+    nid = device_tree_get_u32(fdt, node, "numa-node-id", NUMA_NO_NODE);
+    if ( nid == NUMA_NO_NODE )
+    {
+        numa_fw_bad();
+        return -ENODATA;
+    }
+    else if ( nid >= MAX_NUMNODES )
+    {
+        printk(XENLOG_ERR "DT: CPU numa node id %u is invalid\n", nid);
+        numa_fw_bad();
+        return -EINVAL;
+    }
+
+    return fdt_numa_processor_affinity_init(nid);
+}
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH v2 08/17] xen/arm: introduce a helper to parse device tree memory node
  2023-01-10  8:49 [PATCH v2 00/17] Device tree based NUMA support for Arm - Part#3 Wei Chen
                   ` (6 preceding siblings ...)
  2023-01-10  8:49 ` [PATCH v2 07/17] xen/arm: introduce a helper to parse device tree processor node Wei Chen
@ 2023-01-10  8:49 ` Wei Chen
  2023-01-10  8:49 ` [PATCH v2 09/17] xen/arm: introduce a helper to parse device tree NUMA distance map Wei Chen
                   ` (8 subsequent siblings)
  16 siblings, 0 replies; 34+ messages in thread
From: Wei Chen @ 2023-01-10  8:49 UTC (permalink / raw)
  To: xen-devel
  Cc: nd, Wei Chen, Stefano Stabellini, Julien Grall, Bertrand Marquis,
	Volodymyr Babchuk

Memory blocks' NUMA ID information is stored in device tree's
memory nodes as "numa-node-id". We need a new helper to parse
and verify this ID from memory nodes.

Signed-off-by: Wei Chen <wei.chen@arm.com>
---
v1->v2:
1. Move numa_disabled check to fdt_parse_numa_memory_node.
2. Use numa_bad to replace bad_srat.
3. Replace tabs by spaces.
4. Align parameters.
5. return ENODATA for a normal dtb without numa info.
6. Un-addressed comment:
   "Why not parse numa-node-id and call fdt_numa_memory_affinity_init
   from xen/arch/arm/bootfdt.c:device_tree_get_meminfo. Is it because
   device_tree_get_meminfo is called too early?"
   I checked the device_tree_get_meminfo code and I think the answer
   is similar as I reply in RFC. I prefer a unify numa initialization
   entry. Don't want to make numa parse code in different places.
7. Use node id as dummy PXM for numa_update_node_memblks.
---
 xen/arch/arm/numa_device_tree.c | 89 +++++++++++++++++++++++++++++++++
 1 file changed, 89 insertions(+)

diff --git a/xen/arch/arm/numa_device_tree.c b/xen/arch/arm/numa_device_tree.c
index c031053d24..793a410fd4 100644
--- a/xen/arch/arm/numa_device_tree.c
+++ b/xen/arch/arm/numa_device_tree.c
@@ -34,6 +34,26 @@ static int __init fdt_numa_processor_affinity_init(nodeid_t node)
     return 0;
 }
 
+/* Callback for parsing of the memory regions affinity */
+static int __init fdt_numa_memory_affinity_init(nodeid_t node,
+                                                paddr_t start, paddr_t size)
+{
+    if ( !numa_memblks_available() )
+    {
+        dprintk(XENLOG_WARNING,
+                "Too many numa entry, try bigger NR_NODE_MEMBLKS\n");
+        return -EINVAL;
+    }
+
+    numa_fw_nid_name = "numa-node-id";
+    if ( !numa_update_node_memblks(node, node, start, size, false) )
+        return -EINVAL;
+
+    device_tree_numa = DT_NUMA_ON;
+
+    return 0;
+}
+
 /* Parse CPU NUMA node info */
 static int __init fdt_parse_numa_cpu_node(const void *fdt, int node)
 {
@@ -62,3 +82,72 @@ static int __init fdt_parse_numa_cpu_node(const void *fdt, int node)
 
     return fdt_numa_processor_affinity_init(nid);
 }
+
+/* Parse memory node NUMA info */
+static int __init fdt_parse_numa_memory_node(const void *fdt, int node,
+                                             const char *name,
+                                             unsigned int addr_cells,
+                                             unsigned int size_cells)
+{
+    unsigned int nid;
+    int ret = 0, len;
+    paddr_t addr, size;
+    const struct fdt_property *prop;
+    unsigned int idx, ranges;
+    const __be32 *addresses;
+
+    if ( numa_disabled() )
+        return -EINVAL;
+
+    /*
+     * device_tree_get_u32 will return NUMA_NO_NODE when this memory
+     * DT node doesn't have numa-node-id. This can help us to
+     * distinguish a bad DTB and a normal DTB without NUMA info.
+     */
+    nid = device_tree_get_u32(fdt, node, "numa-node-id", NUMA_NO_NODE);
+    if ( node == NUMA_NO_NODE )
+    {
+        numa_fw_bad();
+        return -ENODATA;
+    }
+    else if ( nid >= MAX_NUMNODES )
+    {
+        printk(XENLOG_WARNING "Node id %u exceeds maximum value\n", nid);
+        goto invalid_data;
+    }
+
+    prop = fdt_get_property(fdt, node, "reg", &len);
+    if ( !prop )
+    {
+        printk(XENLOG_WARNING
+               "fdt: node `%s': missing `reg' property\n", name);
+        goto invalid_data;
+    }
+
+    addresses = (const __be32 *)prop->data;
+    ranges = len / (sizeof(__be32)* (addr_cells + size_cells));
+    for ( idx = 0; idx < ranges; idx++ )
+    {
+        device_tree_get_reg(&addresses, addr_cells, size_cells, &addr, &size);
+        /* Skip zero size ranges */
+        if ( !size )
+            continue;
+
+        ret = fdt_numa_memory_affinity_init(nid, addr, size);
+        if ( ret )
+            goto invalid_data;
+    }
+
+    if ( idx == 0 )
+    {
+        printk(XENLOG_ERR
+               "bad property in memory node, idx=%d ret=%d\n", idx, ret);
+        goto invalid_data;
+    }
+
+    return 0;
+
+invalid_data:
+    numa_fw_bad();
+    return -EINVAL;
+}
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH v2 09/17] xen/arm: introduce a helper to parse device tree NUMA distance map
  2023-01-10  8:49 [PATCH v2 00/17] Device tree based NUMA support for Arm - Part#3 Wei Chen
                   ` (7 preceding siblings ...)
  2023-01-10  8:49 ` [PATCH v2 08/17] xen/arm: introduce a helper to parse device tree memory node Wei Chen
@ 2023-01-10  8:49 ` Wei Chen
  2023-01-10  8:49 ` [PATCH v2 10/17] xen/arm: unified entry to parse all NUMA data from device tree Wei Chen
                   ` (7 subsequent siblings)
  16 siblings, 0 replies; 34+ messages in thread
From: Wei Chen @ 2023-01-10  8:49 UTC (permalink / raw)
  To: xen-devel
  Cc: nd, Wei Chen, Stefano Stabellini, Julien Grall, Bertrand Marquis,
	Volodymyr Babchuk

A NUMA aware device tree will provide a "distance-map" node to
describe distance between any two nodes. This patch introduce a
new helper to parse this distance map.

Signed-off-by: Wei Chen <wei.chen@arm.com>
---
v1 -> v2:
1. Get rid of useless braces.
2. Use new NUMA status helper.
3. Use PRIu32 to replace u in print messages.
4. Fix opposite = __node_distance(to, from).
5. disable dtb numa info table when we find an invalid data
   in dtb.
---
 xen/arch/arm/numa_device_tree.c | 107 ++++++++++++++++++++++++++++++++
 1 file changed, 107 insertions(+)

diff --git a/xen/arch/arm/numa_device_tree.c b/xen/arch/arm/numa_device_tree.c
index 793a410fd4..88056e7ef8 100644
--- a/xen/arch/arm/numa_device_tree.c
+++ b/xen/arch/arm/numa_device_tree.c
@@ -151,3 +151,110 @@ invalid_data:
     numa_fw_bad();
     return -EINVAL;
 }
+
+/* Parse NUMA distance map v1 */
+static int __init fdt_parse_numa_distance_map_v1(const void *fdt, int node)
+{
+    const struct fdt_property *prop;
+    const __be32 *matrix;
+    unsigned int i, entry_count;
+    int len;
+
+    printk(XENLOG_INFO "NUMA: parsing numa-distance-map\n");
+
+    prop = fdt_get_property(fdt, node, "distance-matrix", &len);
+    if ( !prop )
+    {
+        printk(XENLOG_WARNING
+               "NUMA: No distance-matrix property in distance-map\n");
+        goto invalid_data;
+    }
+
+    if ( len % sizeof(__be32) != 0 )
+    {
+        printk(XENLOG_WARNING
+               "distance-matrix in node is not a multiple of u32\n");
+        goto invalid_data;
+    }
+
+    entry_count = len / sizeof(__be32);
+    if ( entry_count == 0 )
+    {
+        printk(XENLOG_WARNING "NUMA: Invalid distance-matrix\n");
+        goto invalid_data;
+    }
+
+    matrix = (const __be32 *)prop->data;
+    for ( i = 0; i + 2 < entry_count; i += 3 )
+    {
+        unsigned int from, to, distance, opposite;
+
+        from = dt_next_cell(1, &matrix);
+        to = dt_next_cell(1, &matrix);
+        distance = dt_next_cell(1, &matrix);
+        if ( (from == to && distance != NUMA_LOCAL_DISTANCE) ||
+            (from != to && distance <= NUMA_LOCAL_DISTANCE) )
+        {
+            printk(XENLOG_WARNING
+                   "NUMA: Invalid distance: NODE#%"PRIu32"->NODE#%"PRIu32":%"PRIu32"\n",
+                   from, to, distance);
+            goto invalid_data;
+        }
+
+        printk(XENLOG_INFO "NUMA: distance: NODE#%"PRIu32"->NODE#%"PRIu32":%"PRIu32"\n",
+               from, to, distance);
+
+        /* Get opposite way distance */
+        opposite = __node_distance(to, from);
+        if ( opposite == 0 )
+        {
+            /* Bi-directions are not set, set both */
+            numa_set_distance(from, to, distance);
+            numa_set_distance(to, from, distance);
+        }
+        else
+        {
+            /*
+             * Opposite way distance has been set to a different value.
+             * It may be a firmware device tree bug?
+             */
+            if ( opposite != distance )
+            {
+                /*
+                 * In device tree NUMA distance-matrix binding:
+                 * https://www.kernel.org/doc/Documentation/devicetree/bindings/numa.txt
+                 * There is a notes mentions:
+                 * "Each entry represents distance from first node to
+                 *  second node. The distances are equal in either
+                 *  direction."
+                 *
+                 * That means device tree doesn't permit this case.
+                 * But in ACPI spec, it cares to specifically permit this
+                 * case:
+                 * "Except for the relative distance from a System Locality
+                 *  to itself, each relative distance is stored twice in the
+                 *  matrix. This provides the capability to describe the
+                 *  scenario where the relative distances for the two
+                 *  directions between System Localities is different."
+                 *
+                 * That means a real machine allows such NUMA configuration.
+                 * So, place a WARNING here to notice system administrators,
+                 * is it the specail case that they hijack the device tree
+                 * to support their rare machines?
+                 */
+                printk(XENLOG_WARNING
+                       "Un-matched bi-direction! NODE#%"PRIu32"->NODE#%"PRIu32":%"PRIu32", NODE#%"PRIu32"->NODE#%"PRIu32":%"PRIu32"\n",
+                       from, to, distance, to, from, opposite);
+            }
+
+            /* Opposite way distance has been set, just set this way */
+            numa_set_distance(from, to, distance);
+        }
+    }
+
+    return 0;
+
+invalid_data:
+    numa_fw_bad();
+    return -EINVAL;
+}
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH v2 10/17] xen/arm: unified entry to parse all NUMA data from device tree
  2023-01-10  8:49 [PATCH v2 00/17] Device tree based NUMA support for Arm - Part#3 Wei Chen
                   ` (8 preceding siblings ...)
  2023-01-10  8:49 ` [PATCH v2 09/17] xen/arm: introduce a helper to parse device tree NUMA distance map Wei Chen
@ 2023-01-10  8:49 ` Wei Chen
  2023-01-10  8:49 ` [PATCH v2 11/17] xen/arm: keep guest still be NUMA unware Wei Chen
                   ` (6 subsequent siblings)
  16 siblings, 0 replies; 34+ messages in thread
From: Wei Chen @ 2023-01-10  8:49 UTC (permalink / raw)
  To: xen-devel
  Cc: nd, Wei Chen, Stefano Stabellini, Julien Grall, Bertrand Marquis,
	Volodymyr Babchuk

In this function, we scan the whole device tree to parse CPU node id,
memory node id and distance-map. Though early_scan_node will invoke
a handler to process memory nodes. If we want to parse memory node id
in that handler, we have to embed NUMA parse code in that handler.
But we still need to scan whole device tree to find CPU NUMA id and
distance-map. In this case, we include memory NUMA id parse in this
function too. Another benefit is that we have a unique entry for device
tree NUMA data parse.

Signed-off-by: Wei Chen <wei.chen@arm.com>
---
v1->v2:
1. Fix typos in commit message.
2. Fix code style and align parameters.
3. Use strncmp to replace memcmp.
---
 xen/arch/arm/include/asm/numa.h |  1 +
 xen/arch/arm/numa_device_tree.c | 30 ++++++++++++++++++++++++++++++
 2 files changed, 31 insertions(+)

diff --git a/xen/arch/arm/include/asm/numa.h b/xen/arch/arm/include/asm/numa.h
index 923ffbfd42..1213d30ce0 100644
--- a/xen/arch/arm/include/asm/numa.h
+++ b/xen/arch/arm/include/asm/numa.h
@@ -44,6 +44,7 @@ extern enum dt_numa_status device_tree_numa;
 extern void numa_set_distance(nodeid_t from, nodeid_t to,
                               unsigned int distance);
 extern void numa_detect_cpu_node(unsigned int cpu);
+extern int numa_device_tree_init(const void *fdt);
 
 #else
 
diff --git a/xen/arch/arm/numa_device_tree.c b/xen/arch/arm/numa_device_tree.c
index 88056e7ef8..4009b9b6de 100644
--- a/xen/arch/arm/numa_device_tree.c
+++ b/xen/arch/arm/numa_device_tree.c
@@ -258,3 +258,33 @@ invalid_data:
     numa_fw_bad();
     return -EINVAL;
 }
+
+static int __init fdt_scan_numa_nodes(const void *fdt, int node,
+                                      const char *uname, int depth,
+                                      unsigned int address_cells,
+                                      unsigned int size_cells, void *data)
+{
+    int len, ret = 0;
+    const void *prop;
+
+    prop = fdt_getprop(fdt, node, "device_type", &len);
+    if ( prop )
+    {
+        if ( strncmp(prop, "cpu", len) == 0 )
+            ret = fdt_parse_numa_cpu_node(fdt, node);
+        else if ( strncmp(prop, "memory", len) == 0 )
+            ret = fdt_parse_numa_memory_node(fdt, node, uname,
+                                address_cells, size_cells);
+    }
+    else if ( fdt_node_check_compatible(fdt, node,
+                                        "numa-distance-map-v1") == 0 )
+        ret = fdt_parse_numa_distance_map_v1(fdt, node);
+
+    return ret;
+}
+
+/* Initialize NUMA from device tree */
+int __init numa_device_tree_init(const void *fdt)
+{
+    return device_tree_for_each_node(fdt, 0, fdt_scan_numa_nodes, NULL);
+}
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH v2 11/17] xen/arm: keep guest still be NUMA unware
  2023-01-10  8:49 [PATCH v2 00/17] Device tree based NUMA support for Arm - Part#3 Wei Chen
                   ` (9 preceding siblings ...)
  2023-01-10  8:49 ` [PATCH v2 10/17] xen/arm: unified entry to parse all NUMA data from device tree Wei Chen
@ 2023-01-10  8:49 ` Wei Chen
  2023-01-10  8:49 ` [PATCH v2 12/17] xen/arm: enable device tree based NUMA in system init Wei Chen
                   ` (5 subsequent siblings)
  16 siblings, 0 replies; 34+ messages in thread
From: Wei Chen @ 2023-01-10  8:49 UTC (permalink / raw)
  To: xen-devel
  Cc: nd, Wei Chen, Stefano Stabellini, Julien Grall, Bertrand Marquis,
	Volodymyr Babchuk

The NUMA information provided in the host Device-Tree
are only for Xen. For dom0, we want to hide them as they
may be different (for now, dom0 is still not aware of NUMA)
The CPU and memory nodes are recreated from scratch for the
domain. So we already skip the "numa-node-id" property for
these two types of nodes.

However, some devices like PCIe may have "numa-node-id"
property too. We have to skip them as well.

Signed-off-by: Wei Chen <wei.chen@arm.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
---
v1 -> v2:
1. Add Rb
---
 xen/arch/arm/domain_build.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
index 829cea8de8..8258048d0e 100644
--- a/xen/arch/arm/domain_build.c
+++ b/xen/arch/arm/domain_build.c
@@ -1185,6 +1185,10 @@ static int __init write_properties(struct domain *d, struct kernel_info *kinfo,
                 continue;
         }
 
+        /* Dom0 is currently NUMA unaware */
+        if ( dt_property_name_is_equal(prop, "numa-node-id") )
+            continue;
+
         res = fdt_property(kinfo->fdt, prop->name, prop_data, prop_len);
 
         if ( res )
@@ -2540,6 +2544,8 @@ static int __init handle_node(struct domain *d, struct kernel_info *kinfo,
         DT_MATCH_TYPE("memory"),
         /* The memory mapped timer is not supported by Xen. */
         DT_MATCH_COMPATIBLE("arm,armv7-timer-mem"),
+        /* Numa info doesn't need to be exposed to Domain-0 */
+        DT_MATCH_COMPATIBLE("numa-distance-map-v1"),
         { /* sentinel */ },
     };
     static const struct dt_device_match timer_matches[] __initconst =
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH v2 12/17] xen/arm: enable device tree based NUMA in system init
  2023-01-10  8:49 [PATCH v2 00/17] Device tree based NUMA support for Arm - Part#3 Wei Chen
                   ` (10 preceding siblings ...)
  2023-01-10  8:49 ` [PATCH v2 11/17] xen/arm: keep guest still be NUMA unware Wei Chen
@ 2023-01-10  8:49 ` Wei Chen
  2023-01-10  8:49 ` [PATCH v2 13/17] xen/arm: implement numa_node_to_arch_nid for device tree NUMA Wei Chen
                   ` (4 subsequent siblings)
  16 siblings, 0 replies; 34+ messages in thread
From: Wei Chen @ 2023-01-10  8:49 UTC (permalink / raw)
  To: xen-devel
  Cc: nd, Wei Chen, Stefano Stabellini, Julien Grall, Bertrand Marquis,
	Volodymyr Babchuk

In this patch, we can start to create NUMA system that is
based on device tree.

Signed-off-by: Wei Chen <wei.chen@arm.com>
---
v1->v2:
1. replace ~0 by INVALID_PADDR.
2. only print error messages for invalid dtb data.
3. remove unnecessary return.
4. remove the parameter of numa_init.
---
 xen/arch/arm/include/asm/numa.h |  5 +++
 xen/arch/arm/numa.c             | 57 +++++++++++++++++++++++++++++++++
 xen/arch/arm/setup.c            |  7 ++++
 3 files changed, 69 insertions(+)

diff --git a/xen/arch/arm/include/asm/numa.h b/xen/arch/arm/include/asm/numa.h
index 1213d30ce0..34eec9378c 100644
--- a/xen/arch/arm/include/asm/numa.h
+++ b/xen/arch/arm/include/asm/numa.h
@@ -45,6 +45,7 @@ extern void numa_set_distance(nodeid_t from, nodeid_t to,
                               unsigned int distance);
 extern void numa_detect_cpu_node(unsigned int cpu);
 extern int numa_device_tree_init(const void *fdt);
+extern void numa_init(void);
 
 #else
 
@@ -86,6 +87,10 @@ static inline void numa_detect_cpu_node(unsigned int cpu)
 {
 }
 
+static inline void numa_init(void)
+{
+}
+
 #endif
 
 #define arch_want_default_dmazone() (false)
diff --git a/xen/arch/arm/numa.c b/xen/arch/arm/numa.c
index 3e02cec646..e9081d45ce 100644
--- a/xen/arch/arm/numa.c
+++ b/xen/arch/arm/numa.c
@@ -18,7 +18,11 @@
  *
  */
 #include <xen/init.h>
+#include <xen/device_tree.h>
+#include <xen/nodemask.h>
 #include <xen/numa.h>
+#include <xen/pfn.h>
+#include <xen/acpi.h>
 
 enum dt_numa_status __read_mostly device_tree_numa;
 
@@ -103,6 +107,59 @@ unsigned char __node_distance(nodeid_t from, nodeid_t to)
 
 EXPORT_SYMBOL(__node_distance);
 
+void __init numa_init(void)
+{
+    unsigned int idx;
+    paddr_t ram_start = INVALID_PADDR;
+    paddr_t ram_size = 0;
+    paddr_t ram_end = 0;
+
+    /* NUMA has been turned off through Xen parameters */
+    if ( numa_off )
+        goto mem_init;
+
+    /* Initialize NUMA from device tree when system is not ACPI booted */
+    if ( acpi_disabled )
+    {
+        int ret = numa_device_tree_init(device_tree_flattened);
+        if ( ret )
+        {
+            numa_off = true;
+            if ( ret == -EINVAL )
+                printk(XENLOG_WARNING
+                       "Init NUMA from device tree failed, ret=%d\n", ret);
+        }
+    }
+    else
+    {
+        /* We don't support NUMA for ACPI boot currently */
+        printk(XENLOG_WARNING
+               "ACPI NUMA has not been supported yet, NUMA off!\n");
+        numa_off = true;
+    }
+
+mem_init:
+    /*
+     * Find the minimal and maximum address of RAM, NUMA will
+     * build a memory to node mapping table for the whole range.
+     */
+    ram_start = bootinfo.mem.bank[0].start;
+    ram_size  = bootinfo.mem.bank[0].size;
+    ram_end   = ram_start + ram_size;
+    for ( idx = 1 ; idx < bootinfo.mem.nr_banks; idx++ )
+    {
+        paddr_t bank_start = bootinfo.mem.bank[idx].start;
+        paddr_t bank_size = bootinfo.mem.bank[idx].size;
+        paddr_t bank_end = bank_start + bank_size;
+
+        ram_size  = ram_size + bank_size;
+        ram_start = min(ram_start, bank_start);
+        ram_end   = max(ram_end, bank_end);
+    }
+
+    numa_initmem_init(PFN_UP(ram_start), PFN_DOWN(ram_end));
+}
+
 int __init arch_get_ram_range(unsigned int idx, paddr_t *start, paddr_t *end)
 {
     if ( idx >= bootinfo.mem.nr_banks )
diff --git a/xen/arch/arm/setup.c b/xen/arch/arm/setup.c
index 8c02cf6cd4..4cdc7e2edb 100644
--- a/xen/arch/arm/setup.c
+++ b/xen/arch/arm/setup.c
@@ -1031,6 +1031,13 @@ void __init start_xen(unsigned long boot_phys_offset,
     /* Parse the ACPI tables for possible boot-time configuration */
     acpi_boot_table_init();
 
+    /*
+     * Try to initialize NUMA system, if failed, the system will
+     * fallback to uniform system which means system has only 1
+     * NUMA node.
+     */
+    numa_init();
+
     end_boot_allocator();
 
     /*
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH v2 13/17] xen/arm: implement numa_node_to_arch_nid for device tree NUMA
  2023-01-10  8:49 [PATCH v2 00/17] Device tree based NUMA support for Arm - Part#3 Wei Chen
                   ` (11 preceding siblings ...)
  2023-01-10  8:49 ` [PATCH v2 12/17] xen/arm: enable device tree based NUMA in system init Wei Chen
@ 2023-01-10  8:49 ` Wei Chen
  2023-01-10  8:49 ` [PATCH v2 14/17] xen/arm: use CONFIG_NUMA to gate node_online_map in smpboot Wei Chen
                   ` (3 subsequent siblings)
  16 siblings, 0 replies; 34+ messages in thread
From: Wei Chen @ 2023-01-10  8:49 UTC (permalink / raw)
  To: xen-devel
  Cc: nd, Wei Chen, Stefano Stabellini, Julien Grall, Bertrand Marquis,
	Volodymyr Babchuk

Device tree based NUMA doesn't have the proximity domain like
ACPI. So we can return node id directly as arch nid.

Signed-off-by: Wei Chen <wei.chen@arm.com>
---
v1 -> v2:
1. Use numa_node_to_arch_nid instead of dummy node_to_pxm.
---
 xen/arch/arm/include/asm/numa.h | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/xen/arch/arm/include/asm/numa.h b/xen/arch/arm/include/asm/numa.h
index 34eec9378c..a0b8d7a11c 100644
--- a/xen/arch/arm/include/asm/numa.h
+++ b/xen/arch/arm/include/asm/numa.h
@@ -47,6 +47,15 @@ extern void numa_detect_cpu_node(unsigned int cpu);
 extern int numa_device_tree_init(const void *fdt);
 extern void numa_init(void);
 
+/*
+ * Device tree NUMA doesn't have architecural node id.
+ * So we can just return node id as arch nid.
+ */
+static inline unsigned int numa_node_to_arch_nid(nodeid_t n)
+{
+    return n;
+}
+
 #else
 
 /* Fake one node for now. See also node_online_map. */
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH v2 14/17] xen/arm: use CONFIG_NUMA to gate node_online_map in smpboot
  2023-01-10  8:49 [PATCH v2 00/17] Device tree based NUMA support for Arm - Part#3 Wei Chen
                   ` (12 preceding siblings ...)
  2023-01-10  8:49 ` [PATCH v2 13/17] xen/arm: implement numa_node_to_arch_nid for device tree NUMA Wei Chen
@ 2023-01-10  8:49 ` Wei Chen
  2023-01-10  8:49 ` [PATCH v2 15/17] xen/arm: Set correct per-cpu cpu_core_mask Wei Chen
                   ` (2 subsequent siblings)
  16 siblings, 0 replies; 34+ messages in thread
From: Wei Chen @ 2023-01-10  8:49 UTC (permalink / raw)
  To: xen-devel
  Cc: nd, Wei Chen, Stefano Stabellini, Julien Grall, Bertrand Marquis,
	Volodymyr Babchuk

node_online_map in smpboot still need for Arm when NUMA is turn
off by Kconfig.

Signed-off-by: Wei Chen <wei.chen@arm.com>
---
v1 -> v2:
1. No change.
---
 xen/arch/arm/smpboot.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/xen/arch/arm/smpboot.c b/xen/arch/arm/smpboot.c
index 5ee6ab11e9..3ae359bbeb 100644
--- a/xen/arch/arm/smpboot.c
+++ b/xen/arch/arm/smpboot.c
@@ -41,8 +41,10 @@ integer_param("maxcpus", max_cpus);
 /* CPU logical map: map xen cpuid to an MPIDR */
 register_t __cpu_logical_map[NR_CPUS] = { [0 ... NR_CPUS-1] = MPIDR_INVALID };
 
+#ifndef CONFIG_NUMA
 /* Fake one node for now. See also asm/numa.h */
 nodemask_t __read_mostly node_online_map = { { [0] = 1UL } };
+#endif
 
 /* Xen stack for bringing up the first CPU. */
 static unsigned char __initdata cpu0_boot_stack[STACK_SIZE]
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH v2 15/17] xen/arm: Set correct per-cpu cpu_core_mask
  2023-01-10  8:49 [PATCH v2 00/17] Device tree based NUMA support for Arm - Part#3 Wei Chen
                   ` (13 preceding siblings ...)
  2023-01-10  8:49 ` [PATCH v2 14/17] xen/arm: use CONFIG_NUMA to gate node_online_map in smpboot Wei Chen
@ 2023-01-10  8:49 ` Wei Chen
  2023-01-10  8:49 ` [PATCH v2 16/17] xen/arm: Provide Kconfig options for Arm to enable NUMA Wei Chen
  2023-01-10  8:49 ` [PATCH v2 17/17] docs: update numa command line to support Arm Wei Chen
  16 siblings, 0 replies; 34+ messages in thread
From: Wei Chen @ 2023-01-10  8:49 UTC (permalink / raw)
  To: xen-devel
  Cc: nd, Henry Wang, Stefano Stabellini, Julien Grall,
	Bertrand Marquis, Volodymyr Babchuk

From: Henry Wang <Henry.Wang@arm.com>

In the common sysctl command XEN_SYSCTL_physinfo, the cores_per_socket
is calculated based on the cpu_core_mask of CPU0. Currently on Arm
this is a fixed value 1 (can be checked via xl info), which is not
correct. This is because during the Arm cpu online process,
set_cpu_sibling_map() only sets the per-cpu cpu_core_mask for itself.

cores_per_socket refers to the number of cores that belong to the same
socket (NUMA node). Therefore, this commit introduces a helper function
numa_set_cpu_core_mask(cpu), which sets the per-cpu cpu_core_mask to
the cpus in the same NUMA node as cpu. Calling this function at the
boot time can ensure the correct cpu_core_mask, leading to the correct
cores_per_socket to be returned by XEN_SYSCTL_physinfo.

Signed-off-by: Henry Wang <Henry.Wang@arm.com>
---
v1 -> v2:
1. New patch

---
 xen/arch/arm/include/asm/numa.h |  7 +++++++
 xen/arch/arm/numa.c             | 11 +++++++++++
 xen/arch/arm/setup.c            |  5 +++++
 3 files changed, 23 insertions(+)

diff --git a/xen/arch/arm/include/asm/numa.h b/xen/arch/arm/include/asm/numa.h
index a0b8d7a11c..e66fb0a11f 100644
--- a/xen/arch/arm/include/asm/numa.h
+++ b/xen/arch/arm/include/asm/numa.h
@@ -46,6 +46,7 @@ extern void numa_set_distance(nodeid_t from, nodeid_t to,
 extern void numa_detect_cpu_node(unsigned int cpu);
 extern int numa_device_tree_init(const void *fdt);
 extern void numa_init(void);
+extern void numa_set_cpu_core_mask(int cpu);
 
 /*
  * Device tree NUMA doesn't have architecural node id.
@@ -62,6 +63,12 @@ static inline unsigned int numa_node_to_arch_nid(nodeid_t n)
 #define cpu_to_node(cpu) 0
 #define node_to_cpumask(node)   (cpu_online_map)
 
+static inline void numa_set_cpu_core_mask(int cpu)
+{
+    cpumask_or(per_cpu(cpu_core_mask, cpu),
+               per_cpu(cpu_core_mask, cpu), &cpu_possible_map);
+}
+
 /*
  * TODO: make first_valid_mfn static when NUMA is supported on Arm, this
  * is required because the dummy helpers are using it.
diff --git a/xen/arch/arm/numa.c b/xen/arch/arm/numa.c
index e9081d45ce..ef245e39a8 100644
--- a/xen/arch/arm/numa.c
+++ b/xen/arch/arm/numa.c
@@ -52,6 +52,17 @@ int __init arch_numa_setup(const char *opt)
     return -EINVAL;
 }
 
+void numa_set_cpu_core_mask(int cpu)
+{
+    nodeid_t node = cpu_to_node[cpu];
+
+    if ( node == NUMA_NO_NODE )
+        node = 0;
+
+    cpumask_or(per_cpu(cpu_core_mask, cpu),
+               per_cpu(cpu_core_mask, cpu), &node_to_cpumask(node));
+}
+
 void __init numa_set_distance(nodeid_t from, nodeid_t to,
                               unsigned int distance)
 {
diff --git a/xen/arch/arm/setup.c b/xen/arch/arm/setup.c
index 4cdc7e2edb..d45becedee 100644
--- a/xen/arch/arm/setup.c
+++ b/xen/arch/arm/setup.c
@@ -1136,6 +1136,11 @@ void __init start_xen(unsigned long boot_phys_offset,
     }
 
     printk("Brought up %ld CPUs\n", (long)num_online_cpus());
+
+    /* Set per-cpu cpu_core_mask to cpus that belongs to the same NUMA node. */
+    for_each_online_cpu ( i )
+        numa_set_cpu_core_mask(i);
+
     /* TODO: smp_cpus_done(); */
 
     /* This should be done in a vpmu driver but we do not have one yet. */
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH v2 16/17] xen/arm: Provide Kconfig options for Arm to enable NUMA
  2023-01-10  8:49 [PATCH v2 00/17] Device tree based NUMA support for Arm - Part#3 Wei Chen
                   ` (14 preceding siblings ...)
  2023-01-10  8:49 ` [PATCH v2 15/17] xen/arm: Set correct per-cpu cpu_core_mask Wei Chen
@ 2023-01-10  8:49 ` Wei Chen
  2023-01-10  8:49 ` [PATCH v2 17/17] docs: update numa command line to support Arm Wei Chen
  16 siblings, 0 replies; 34+ messages in thread
From: Wei Chen @ 2023-01-10  8:49 UTC (permalink / raw)
  To: xen-devel
  Cc: nd, Wei Chen, Stefano Stabellini, Julien Grall, Bertrand Marquis,
	Volodymyr Babchuk

Arm platforms support both ACPI and device tree. We don't
want users to select device tree NUMA or ACPI NUMA manually.
We hope users can just enable NUMA for Arm, and device tree
NUMA and ACPI NUMA can be selected depends on device tree
feature and ACPI feature status automatically. In this case,
these two kinds of NUMA support code can be co-exist in one
Xen binary. Xen can check feature flags to decide using
device tree or ACPI as NUMA based firmware.

So in this patch, we introduce a generic option:
CONFIG_ARM_NUMA for users to enable NUMA for Arm.
And one CONFIG_DEVICE_TREE_NUMA option for ARM_NUMA
to select when HAS_DEVICE_TREE option is enabled.
Once when ACPI NUMA for Arm is supported, ACPI_NUMA
can be selected here too.

Signed-off-by: Wei Chen <wei.chen@arm.com>
---
v1 -> v2:
1. Remove the condition of selecting DEVICE_TREE_NUMA.
---
 xen/arch/arm/Kconfig | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/xen/arch/arm/Kconfig b/xen/arch/arm/Kconfig
index 239d3aed3c..e751ad50d1 100644
--- a/xen/arch/arm/Kconfig
+++ b/xen/arch/arm/Kconfig
@@ -39,6 +39,17 @@ config ACPI
 config ARM_EFI
 	bool
 
+config ARM_NUMA
+	bool "Arm NUMA (Non-Uniform Memory Access) Support (UNSUPPORTED)" if UNSUPPORTED
+	depends on HAS_DEVICE_TREE
+	select DEVICE_TREE_NUMA
+	help
+	  Enable Non-Uniform Memory Access (NUMA) for Arm architecutres
+
+config DEVICE_TREE_NUMA
+	bool
+	select NUMA
+
 config GICV3
 	bool "GICv3 driver"
 	depends on !NEW_VGIC
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH v2 17/17] docs: update numa command line to support Arm
  2023-01-10  8:49 [PATCH v2 00/17] Device tree based NUMA support for Arm - Part#3 Wei Chen
                   ` (15 preceding siblings ...)
  2023-01-10  8:49 ` [PATCH v2 16/17] xen/arm: Provide Kconfig options for Arm to enable NUMA Wei Chen
@ 2023-01-10  8:49 ` Wei Chen
  2023-01-10 10:01   ` Jan Beulich
  16 siblings, 1 reply; 34+ messages in thread
From: Wei Chen @ 2023-01-10  8:49 UTC (permalink / raw)
  To: xen-devel
  Cc: nd, Wei Chen, Andrew Cooper, George Dunlap, Jan Beulich,
	Julien Grall, Stefano Stabellini, Wei Liu

Current numa command in documentation is x86 only. Remove
x86 from numa command's arch limitation in this patch.

Signed-off-by: Wei Chen <wei.chen@arm.com>
---
v1 -> v2:
1. Update Arm NUMA status in SUPPORT.md to "Tech Preview".
---
 SUPPORT.md                        | 1 +
 docs/misc/xen-command-line.pandoc | 2 +-
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/SUPPORT.md b/SUPPORT.md
index 295369998e..b73d04a028 100644
--- a/SUPPORT.md
+++ b/SUPPORT.md
@@ -401,6 +401,7 @@ on embedded platforms and the x86 PV shim.
 Enables NUMA aware scheduling in Xen
 
     Status, x86: Supported
+    Status, Arm: Tech Preview
 
 ## Scalability
 
diff --git a/docs/misc/xen-command-line.pandoc b/docs/misc/xen-command-line.pandoc
index 923910f553..40da980b67 100644
--- a/docs/misc/xen-command-line.pandoc
+++ b/docs/misc/xen-command-line.pandoc
@@ -1878,7 +1878,7 @@ i.e. a limit on the number of guests it is possible to start each having
 assigned a device sharing a common interrupt line.  Accepts values between
 1 and 255.
 
-### numa (x86)
+### numa
 > `= on | off | fake=<integer> | noacpi`
 
 > Default: `on`
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 34+ messages in thread

* Re: [PATCH v2 01/17] xen/arm: use NR_MEM_BANKS to override default NR_NODE_MEMBLKS
  2023-01-10  8:49 ` [PATCH v2 01/17] xen/arm: use NR_MEM_BANKS to override default NR_NODE_MEMBLKS Wei Chen
@ 2023-01-10  9:59   ` Jan Beulich
  2023-01-11 10:03     ` Wei Chen
  0 siblings, 1 reply; 34+ messages in thread
From: Jan Beulich @ 2023-01-10  9:59 UTC (permalink / raw)
  To: Wei Chen
  Cc: nd, Stefano Stabellini, Julien Grall, Bertrand Marquis,
	Volodymyr Babchuk, Andrew Cooper, George Dunlap, Wei Liu,
	xen-devel

On 10.01.2023 09:49, Wei Chen wrote:
> --- a/xen/arch/arm/include/asm/numa.h
> +++ b/xen/arch/arm/include/asm/numa.h
> @@ -3,9 +3,26 @@
>  
>  #include <xen/mm.h>
>  
> +#include <asm/setup.h>
> +
>  typedef u8 nodeid_t;
>  
> -#ifndef CONFIG_NUMA
> +#ifdef CONFIG_NUMA
> +
> +/*
> + * It is very likely that if you have more than 64 nodes, you may
> + * need a lot more than 2 regions per node. So, for Arm, we would
> + * just define NR_NODE_MEMBLKS as an alias to NR_MEM_BANKS.
> + * And in the future NR_MEM_BANKS will be bumped for new platforms,
> + * but for now leave NR_MEM_BANKS as it is on Arm. This avoid to
> + * have different way to define the value based NUMA vs non-NUMA.
> + *
> + * Further discussions can be found here:
> + * https://lists.xenproject.org/archives/html/xen-devel/2021-09/msg02322.html
> + */
> +#define NR_NODE_MEMBLKS NR_MEM_BANKS

But isn't NR_MEM_BANKS a system-wide setting, which doesn't really
make sense to use as a per-node one? IOW aren't you now allowing
NR_MEM_BANKS regions on each node, which taken together will be
much more than NR_MEM_BANKS that you can deal with on the whole
system?

Jan


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v2 17/17] docs: update numa command line to support Arm
  2023-01-10  8:49 ` [PATCH v2 17/17] docs: update numa command line to support Arm Wei Chen
@ 2023-01-10 10:01   ` Jan Beulich
  0 siblings, 0 replies; 34+ messages in thread
From: Jan Beulich @ 2023-01-10 10:01 UTC (permalink / raw)
  To: Wei Chen
  Cc: nd, Andrew Cooper, George Dunlap, Julien Grall,
	Stefano Stabellini, Wei Liu, xen-devel

On 10.01.2023 09:49, Wei Chen wrote:
> Current numa command in documentation is x86 only. Remove
> x86 from numa command's arch limitation in this patch.
> 
> Signed-off-by: Wei Chen <wei.chen@arm.com>

Acked-by: Jan Beulich <jbeulich@suse.com>

Of course not on its own, only on top of all the earlier patches.

Jan


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v2 02/17] xen/arm: implement helpers to get and update NUMA status
  2023-01-10  8:49 ` [PATCH v2 02/17] xen/arm: implement helpers to get and update NUMA status Wei Chen
@ 2023-01-10 16:38   ` Jan Beulich
  2023-01-12  6:22     ` Wei Chen
  0 siblings, 1 reply; 34+ messages in thread
From: Jan Beulich @ 2023-01-10 16:38 UTC (permalink / raw)
  To: Wei Chen
  Cc: nd, Stefano Stabellini, Julien Grall, Bertrand Marquis,
	Volodymyr Babchuk, Andrew Cooper, George Dunlap, Wei Liu,
	Roger Pau Monné,
	xen-devel

On 10.01.2023 09:49, Wei Chen wrote:
> --- a/xen/arch/arm/include/asm/numa.h
> +++ b/xen/arch/arm/include/asm/numa.h
> @@ -22,6 +22,12 @@ typedef u8 nodeid_t;
>   */
>  #define NR_NODE_MEMBLKS NR_MEM_BANKS
>  
> +enum dt_numa_status {
> +    DT_NUMA_INIT,

I don't see any use of this. I also think the name isn't good, as INIT
can be taken for "initializer" as well as "initialized". Suggesting an
alternative would require knowing what the future plans with this are;
right now ...

> +    DT_NUMA_ON,
> +    DT_NUMA_OFF,
> +};

... the other two are also used only in a single file, at which point
their placing in a header is also questionable.

> --- /dev/null
> +++ b/xen/arch/arm/numa.c
> @@ -0,0 +1,44 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * Arm Architecture support layer for NUMA.
> + *
> + * Copyright (C) 2021 Arm Ltd
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with this program. If not, see <http://www.gnu.org/licenses/>.
> + *
> + */
> +#include <xen/init.h>
> +#include <xen/numa.h>
> +
> +static enum dt_numa_status __read_mostly device_tree_numa;

__ro_after_init?

> --- a/xen/arch/x86/include/asm/numa.h
> +++ b/xen/arch/x86/include/asm/numa.h
> @@ -12,7 +12,6 @@ extern unsigned int numa_node_to_arch_nid(nodeid_t n);
>  
>  #define ZONE_ALIGN (1UL << (MAX_ORDER+PAGE_SHIFT))
>  
> -extern bool numa_disabled(void);
>  extern nodeid_t setup_node(unsigned int pxm);
>  extern void srat_detect_node(int cpu);
>  
> --- a/xen/include/xen/numa.h
> +++ b/xen/include/xen/numa.h
> @@ -55,6 +55,7 @@ extern void numa_init_array(void);
>  extern void numa_set_node(unsigned int cpu, nodeid_t node);
>  extern void numa_initmem_init(unsigned long start_pfn, unsigned long end_pfn);
>  extern void numa_fw_bad(void);
> +extern bool numa_disabled(void);
>  
>  extern int arch_numa_setup(const char *opt);
>  extern bool arch_numa_unavailable(void);

How is this movement of a declaration related to the subject of the patch?

Jan


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v2 03/17] xen/arm: implement node distance helpers for Arm
  2023-01-10  8:49 ` [PATCH v2 03/17] xen/arm: implement node distance helpers for Arm Wei Chen
@ 2023-01-10 16:47   ` Jan Beulich
  2023-01-12  6:31     ` Wei Chen
  0 siblings, 1 reply; 34+ messages in thread
From: Jan Beulich @ 2023-01-10 16:47 UTC (permalink / raw)
  To: Wei Chen
  Cc: nd, Stefano Stabellini, Julien Grall, Bertrand Marquis,
	Volodymyr Babchuk, Andrew Cooper, George Dunlap, Wei Liu,
	Roger Pau Monné,
	xen-devel

On 10.01.2023 09:49, Wei Chen wrote:
> --- a/xen/arch/arm/include/asm/numa.h
> +++ b/xen/arch/arm/include/asm/numa.h
> @@ -28,6 +28,20 @@ enum dt_numa_status {
>      DT_NUMA_OFF,
>  };
>  
> +/*
> + * In ACPI spec, 0-9 are the reserved values for node distance,
> + * 10 indicates local node distance, 20 indicates remote node
> + * distance. Set node distance map in device tree will follow
> + * the ACPI's definition.
> + */
> +#define NUMA_DISTANCE_UDF_MIN   0
> +#define NUMA_DISTANCE_UDF_MAX   9
> +#define NUMA_LOCAL_DISTANCE     10
> +#define NUMA_REMOTE_DISTANCE    20

In the absence of a caller of numa_set_distance() it is entirely unclear
whether this tying to ACPI used values is actually appropriate.

> --- a/xen/arch/arm/numa.c
> +++ b/xen/arch/arm/numa.c
> @@ -2,7 +2,7 @@
>  /*
>   * Arm Architecture support layer for NUMA.
>   *
> - * Copyright (C) 2021 Arm Ltd
> + * Copyright (C) 2022 Arm Ltd

I don't think it makes sense to change such after the fact. And certainly
not in an unrelated patch.

> @@ -22,6 +22,11 @@
>  
>  static enum dt_numa_status __read_mostly device_tree_numa;
>  
> +static unsigned char __read_mostly
> +node_distance_map[MAX_NUMNODES][MAX_NUMNODES] = {
> +    { 0 }
> +};

__ro_after_init?

> @@ -42,3 +47,48 @@ int __init arch_numa_setup(const char *opt)
>  {
>      return -EINVAL;
>  }
> +
> +void __init numa_set_distance(nodeid_t from, nodeid_t to,
> +                              unsigned int distance)
> +{
> +    if ( from >= MAX_NUMNODES || to >= MAX_NUMNODES )
> +    {
> +        printk(KERN_WARNING
> +               "NUMA: invalid nodes: from=%"PRIu8" to=%"PRIu8" MAX=%"PRIu8"\n",
> +               from, to, MAX_NUMNODES);
> +        return;
> +    }
> +
> +    /* NUMA defines 0xff as an unreachable node and 0-9 are undefined */
> +    if ( distance >= NUMA_NO_DISTANCE ||
> +        (distance >= NUMA_DISTANCE_UDF_MIN &&

Nit: Indentation.

> +         distance <= NUMA_DISTANCE_UDF_MAX) ||
> +        (from == to && distance != NUMA_LOCAL_DISTANCE) )
> +    {
> +        printk(KERN_WARNING
> +               "NUMA: invalid distance: from=%"PRIu8" to=%"PRIu8" distance=%"PRIu32"\n",
> +               from, to, distance);
> +        return;
> +    }
> +
> +    node_distance_map[from][to] = distance;
> +}
> +
> +unsigned char __node_distance(nodeid_t from, nodeid_t to)
> +{
> +    /* When NUMA is off, any distance will be treated as remote. */
> +    if ( numa_disabled() )
> +        return NUMA_REMOTE_DISTANCE;
> +
> +    /*
> +     * Check whether the nodes are in the matrix range.
> +     * When any node is out of range, except from and to nodes are the
> +     * same, we treat them as unreachable (return 0xFF)
> +     */
> +    if ( from >= MAX_NUMNODES || to >= MAX_NUMNODES )

I guess using ARRAY_SIZE() here would be more future-proof.

Jan


^ permalink raw reply	[flat|nested] 34+ messages in thread

* RE: [PATCH v2 01/17] xen/arm: use NR_MEM_BANKS to override default NR_NODE_MEMBLKS
  2023-01-10  9:59   ` Jan Beulich
@ 2023-01-11 10:03     ` Wei Chen
  0 siblings, 0 replies; 34+ messages in thread
From: Wei Chen @ 2023-01-11 10:03 UTC (permalink / raw)
  To: Jan Beulich
  Cc: nd, Stefano Stabellini, Julien Grall, Bertrand Marquis,
	Volodymyr Babchuk, Andrew Cooper, George Dunlap, Wei Liu,
	xen-devel

Hi Jan,

> -----Original Message-----
> From: Jan Beulich <jbeulich@suse.com>
> Sent: 2023年1月10日 18:00
> To: Wei Chen <Wei.Chen@arm.com>
> Cc: nd <nd@arm.com>; Stefano Stabellini <sstabellini@kernel.org>; Julien
> Grall <julien@xen.org>; Bertrand Marquis <Bertrand.Marquis@arm.com>;
> Volodymyr Babchuk <Volodymyr_Babchuk@epam.com>; Andrew Cooper
> <andrew.cooper3@citrix.com>; George Dunlap <george.dunlap@citrix.com>; Wei
> Liu <wl@xen.org>; xen-devel@lists.xenproject.org
> Subject: Re: [PATCH v2 01/17] xen/arm: use NR_MEM_BANKS to override
> default NR_NODE_MEMBLKS
> 
> On 10.01.2023 09:49, Wei Chen wrote:
> > --- a/xen/arch/arm/include/asm/numa.h
> > +++ b/xen/arch/arm/include/asm/numa.h
> > @@ -3,9 +3,26 @@
> >
> >  #include <xen/mm.h>
> >
> > +#include <asm/setup.h>
> > +
> >  typedef u8 nodeid_t;
> >
> > -#ifndef CONFIG_NUMA
> > +#ifdef CONFIG_NUMA
> > +
> > +/*
> > + * It is very likely that if you have more than 64 nodes, you may
> > + * need a lot more than 2 regions per node. So, for Arm, we would
> > + * just define NR_NODE_MEMBLKS as an alias to NR_MEM_BANKS.
> > + * And in the future NR_MEM_BANKS will be bumped for new platforms,
> > + * but for now leave NR_MEM_BANKS as it is on Arm. This avoid to
> > + * have different way to define the value based NUMA vs non-NUMA.
> > + *
> > + * Further discussions can be found here:
> > + * https://lists.xenproject.org/archives/html/xen-devel/2021-
> 09/msg02322.html
> > + */
> > +#define NR_NODE_MEMBLKS NR_MEM_BANKS
> 
> But isn't NR_MEM_BANKS a system-wide setting, which doesn't really
> make sense to use as a per-node one? IOW aren't you now allowing
> NR_MEM_BANKS regions on each node, which taken together will be
> much more than NR_MEM_BANKS that you can deal with on the whole
> system?
> 

Thanks of pointing out this. Yes NR_MEM_BANKS a system-wide setting,
but we just use it to define the MAX memory banks for single node,
this does not mean that there are really so many banks on these nodes.
When a system only contains one node NR_MEM_BANKS equals to
NR_NODE_MEMBLKS. Our idea is that, if the total memory banks of
all nodes exceed the NR_MEM_BANKS, we will bump NR_MEM_BANKS.

But I am open to this question, if there are more suggestions from
maintainers.

Cheers,
Wei Chen


> Jan

^ permalink raw reply	[flat|nested] 34+ messages in thread

* RE: [PATCH v2 02/17] xen/arm: implement helpers to get and update NUMA status
  2023-01-10 16:38   ` Jan Beulich
@ 2023-01-12  6:22     ` Wei Chen
  2023-01-12  8:08       ` Jan Beulich
  0 siblings, 1 reply; 34+ messages in thread
From: Wei Chen @ 2023-01-12  6:22 UTC (permalink / raw)
  To: Jan Beulich
  Cc: nd, Stefano Stabellini, Julien Grall, Bertrand Marquis,
	Volodymyr Babchuk, Andrew Cooper, George Dunlap, Wei Liu,
	Roger Pau Monné,
	xen-devel

Hi Jan,

> -----Original Message-----
> From: Jan Beulich <jbeulich@suse.com>
> Sent: 2023年1月11日 0:38
> To: Wei Chen <Wei.Chen@arm.com>
> Cc: nd <nd@arm.com>; Stefano Stabellini <sstabellini@kernel.org>; Julien
> Grall <julien@xen.org>; Bertrand Marquis <Bertrand.Marquis@arm.com>;
> Volodymyr Babchuk <Volodymyr_Babchuk@epam.com>; Andrew Cooper
> <andrew.cooper3@citrix.com>; George Dunlap <george.dunlap@citrix.com>; Wei
> Liu <wl@xen.org>; Roger Pau Monné <roger.pau@citrix.com>; xen-
> devel@lists.xenproject.org
> Subject: Re: [PATCH v2 02/17] xen/arm: implement helpers to get and update
> NUMA status
> 
> On 10.01.2023 09:49, Wei Chen wrote:
> > --- a/xen/arch/arm/include/asm/numa.h
> > +++ b/xen/arch/arm/include/asm/numa.h
> > @@ -22,6 +22,12 @@ typedef u8 nodeid_t;
> >   */
> >  #define NR_NODE_MEMBLKS NR_MEM_BANKS
> >
> > +enum dt_numa_status {
> > +    DT_NUMA_INIT,
> 
> I don't see any use of this. I also think the name isn't good, as INIT
> can be taken for "initializer" as well as "initialized". Suggesting an
> alternative would require knowing what the future plans with this are;
> right now ...
> 

static enum dt_numa_status __read_mostly device_tree_numa;
We use DT_NUMA_INIT to indicate device_tree_numa is in a default value
(System's initial value, hasn't done initialization). Maybe rename it
To DT_NUMA_UNINIT be better?

> > +    DT_NUMA_ON,
> > +    DT_NUMA_OFF,
> > +};
> 
> ... the other two are also used only in a single file, at which point
> their placing in a header is also questionable.
>

This is a good point, I will move them to that file.

> > --- /dev/null
> > +++ b/xen/arch/arm/numa.c
> > @@ -0,0 +1,44 @@
> > +/* SPDX-License-Identifier: GPL-2.0 */
> > +/*
> > + * Arm Architecture support layer for NUMA.
> > + *
> > + * Copyright (C) 2021 Arm Ltd
> > + *
> > + * This program is free software; you can redistribute it and/or modify
> > + * it under the terms of the GNU General Public License version 2 as
> > + * published by the Free Software Foundation.
> > + *
> > + * This program is distributed in the hope that it will be useful,
> > + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> > + * GNU General Public License for more details.
> > + *
> > + * You should have received a copy of the GNU General Public License
> > + * along with this program. If not, see <http://www.gnu.org/licenses/>.
> > + *
> > + */
> > +#include <xen/init.h>
> > +#include <xen/numa.h>
> > +
> > +static enum dt_numa_status __read_mostly device_tree_numa;
> 
> __ro_after_init?
>

OK.
 
> > --- a/xen/arch/x86/include/asm/numa.h
> > +++ b/xen/arch/x86/include/asm/numa.h
> > @@ -12,7 +12,6 @@ extern unsigned int numa_node_to_arch_nid(nodeid_t n);
> >
> >  #define ZONE_ALIGN (1UL << (MAX_ORDER+PAGE_SHIFT))
> >
> > -extern bool numa_disabled(void);
> >  extern nodeid_t setup_node(unsigned int pxm);
> >  extern void srat_detect_node(int cpu);
> >
> > --- a/xen/include/xen/numa.h
> > +++ b/xen/include/xen/numa.h
> > @@ -55,6 +55,7 @@ extern void numa_init_array(void);
> >  extern void numa_set_node(unsigned int cpu, nodeid_t node);
> >  extern void numa_initmem_init(unsigned long start_pfn, unsigned long
> end_pfn);
> >  extern void numa_fw_bad(void);
> > +extern bool numa_disabled(void);
> >
> >  extern int arch_numa_setup(const char *opt);
> >  extern bool arch_numa_unavailable(void);
> 
> How is this movement of a declaration related to the subject of the patch?
>

Can I add some messages in commit log to say something like "As we have
Implemented numa_disabled for Arm, so we move numa_disabled to common header"?

Cheers,
Wei Chen
 
> Jan

^ permalink raw reply	[flat|nested] 34+ messages in thread

* RE: [PATCH v2 03/17] xen/arm: implement node distance helpers for Arm
  2023-01-10 16:47   ` Jan Beulich
@ 2023-01-12  6:31     ` Wei Chen
  2023-01-12  8:11       ` Jan Beulich
  0 siblings, 1 reply; 34+ messages in thread
From: Wei Chen @ 2023-01-12  6:31 UTC (permalink / raw)
  To: Jan Beulich
  Cc: nd, Stefano Stabellini, Julien Grall, Bertrand Marquis,
	Volodymyr Babchuk, Andrew Cooper, George Dunlap, Wei Liu,
	Roger Pau Monné,
	xen-devel

Hi Jan,

> -----Original Message-----
> From: Jan Beulich <jbeulich@suse.com>
> Sent: 2023年1月11日 0:47
> To: Wei Chen <Wei.Chen@arm.com>
> Cc: nd <nd@arm.com>; Stefano Stabellini <sstabellini@kernel.org>; Julien
> Grall <julien@xen.org>; Bertrand Marquis <Bertrand.Marquis@arm.com>;
> Volodymyr Babchuk <Volodymyr_Babchuk@epam.com>; Andrew Cooper
> <andrew.cooper3@citrix.com>; George Dunlap <george.dunlap@citrix.com>; Wei
> Liu <wl@xen.org>; Roger Pau Monné <roger.pau@citrix.com>; xen-
> devel@lists.xenproject.org
> Subject: Re: [PATCH v2 03/17] xen/arm: implement node distance helpers for
> Arm
> 
> On 10.01.2023 09:49, Wei Chen wrote:
> > --- a/xen/arch/arm/include/asm/numa.h
> > +++ b/xen/arch/arm/include/asm/numa.h
> > @@ -28,6 +28,20 @@ enum dt_numa_status {
> >      DT_NUMA_OFF,
> >  };
> >
> > +/*
> > + * In ACPI spec, 0-9 are the reserved values for node distance,
> > + * 10 indicates local node distance, 20 indicates remote node
> > + * distance. Set node distance map in device tree will follow
> > + * the ACPI's definition.
> > + */
> > +#define NUMA_DISTANCE_UDF_MIN   0
> > +#define NUMA_DISTANCE_UDF_MAX   9
> > +#define NUMA_LOCAL_DISTANCE     10
> > +#define NUMA_REMOTE_DISTANCE    20
> 
> In the absence of a caller of numa_set_distance() it is entirely unclear
> whether this tying to ACPI used values is actually appropriate.
> 

From Kernel's NUMA device tree binding, it seems DT NUMA are reusing
ACPI used values for distances [1].

> > --- a/xen/arch/arm/numa.c
> > +++ b/xen/arch/arm/numa.c
> > @@ -2,7 +2,7 @@
> >  /*
> >   * Arm Architecture support layer for NUMA.
> >   *
> > - * Copyright (C) 2021 Arm Ltd
> > + * Copyright (C) 2022 Arm Ltd
> 
> I don't think it makes sense to change such after the fact. And certainly
> not in an unrelated patch.
> 

I will retore it, and add a SPDX header.

> > @@ -22,6 +22,11 @@
> >
> >  static enum dt_numa_status __read_mostly device_tree_numa;
> >
> > +static unsigned char __read_mostly
> > +node_distance_map[MAX_NUMNODES][MAX_NUMNODES] = {
> > +    { 0 }
> > +};
> 
> __ro_after_init?
> 

Yes.

> > @@ -42,3 +47,48 @@ int __init arch_numa_setup(const char *opt)
> >  {
> >      return -EINVAL;
> >  }
> > +
> > +void __init numa_set_distance(nodeid_t from, nodeid_t to,
> > +                              unsigned int distance)
> > +{
> > +    if ( from >= MAX_NUMNODES || to >= MAX_NUMNODES )
> > +    {
> > +        printk(KERN_WARNING
> > +               "NUMA: invalid nodes: from=%"PRIu8" to=%"PRIu8"
> MAX=%"PRIu8"\n",
> > +               from, to, MAX_NUMNODES);
> > +        return;
> > +    }
> > +
> > +    /* NUMA defines 0xff as an unreachable node and 0-9 are undefined
> */
> > +    if ( distance >= NUMA_NO_DISTANCE ||
> > +        (distance >= NUMA_DISTANCE_UDF_MIN &&
> 
> Nit: Indentation.
> 

Ok.

> > +         distance <= NUMA_DISTANCE_UDF_MAX) ||
> > +        (from == to && distance != NUMA_LOCAL_DISTANCE) )
> > +    {
> > +        printk(KERN_WARNING
> > +               "NUMA: invalid distance: from=%"PRIu8" to=%"PRIu8"
> distance=%"PRIu32"\n",
> > +               from, to, distance);
> > +        return;
> > +    }
> > +
> > +    node_distance_map[from][to] = distance;
> > +}
> > +
> > +unsigned char __node_distance(nodeid_t from, nodeid_t to)
> > +{
> > +    /* When NUMA is off, any distance will be treated as remote. */
> > +    if ( numa_disabled() )
> > +        return NUMA_REMOTE_DISTANCE;
> > +
> > +    /*
> > +     * Check whether the nodes are in the matrix range.
> > +     * When any node is out of range, except from and to nodes are the
> > +     * same, we treat them as unreachable (return 0xFF)
> > +     */
> > +    if ( from >= MAX_NUMNODES || to >= MAX_NUMNODES )
> 
> I guess using ARRAY_SIZE() here would be more future-proof.
> 

I will use it in next version.

[1]https://www.kernel.org/doc/Documentation/devicetree/bindings/numa.txt

Thanks,
Wei Chen

> Jan

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v2 02/17] xen/arm: implement helpers to get and update NUMA status
  2023-01-12  6:22     ` Wei Chen
@ 2023-01-12  8:08       ` Jan Beulich
  2023-01-16  9:20         ` Wei Chen
  0 siblings, 1 reply; 34+ messages in thread
From: Jan Beulich @ 2023-01-12  8:08 UTC (permalink / raw)
  To: Wei Chen
  Cc: nd, Stefano Stabellini, Julien Grall, Bertrand Marquis,
	Volodymyr Babchuk, Andrew Cooper, George Dunlap, Wei Liu,
	Roger Pau Monné,
	xen-devel

On 12.01.2023 07:22, Wei Chen wrote:
>> -----Original Message-----
>> From: Jan Beulich <jbeulich@suse.com>
>> Sent: 2023年1月11日 0:38
>>
>> On 10.01.2023 09:49, Wei Chen wrote:
>>> --- a/xen/arch/arm/include/asm/numa.h
>>> +++ b/xen/arch/arm/include/asm/numa.h
>>> @@ -22,6 +22,12 @@ typedef u8 nodeid_t;
>>>   */
>>>  #define NR_NODE_MEMBLKS NR_MEM_BANKS
>>>
>>> +enum dt_numa_status {
>>> +    DT_NUMA_INIT,
>>
>> I don't see any use of this. I also think the name isn't good, as INIT
>> can be taken for "initializer" as well as "initialized". Suggesting an
>> alternative would require knowing what the future plans with this are;
>> right now ...
>>
> 
> static enum dt_numa_status __read_mostly device_tree_numa;

There's no DT_NUMA_INIT here. You _imply_ it having a value of zero.

> We use DT_NUMA_INIT to indicate device_tree_numa is in a default value
> (System's initial value, hasn't done initialization). Maybe rename it
> To DT_NUMA_UNINIT be better?

Perhaps, yes.

>>> --- a/xen/arch/x86/include/asm/numa.h
>>> +++ b/xen/arch/x86/include/asm/numa.h
>>> @@ -12,7 +12,6 @@ extern unsigned int numa_node_to_arch_nid(nodeid_t n);
>>>
>>>  #define ZONE_ALIGN (1UL << (MAX_ORDER+PAGE_SHIFT))
>>>
>>> -extern bool numa_disabled(void);
>>>  extern nodeid_t setup_node(unsigned int pxm);
>>>  extern void srat_detect_node(int cpu);
>>>
>>> --- a/xen/include/xen/numa.h
>>> +++ b/xen/include/xen/numa.h
>>> @@ -55,6 +55,7 @@ extern void numa_init_array(void);
>>>  extern void numa_set_node(unsigned int cpu, nodeid_t node);
>>>  extern void numa_initmem_init(unsigned long start_pfn, unsigned long
>> end_pfn);
>>>  extern void numa_fw_bad(void);
>>> +extern bool numa_disabled(void);
>>>
>>>  extern int arch_numa_setup(const char *opt);
>>>  extern bool arch_numa_unavailable(void);
>>
>> How is this movement of a declaration related to the subject of the patch?
>>
> 
> Can I add some messages in commit log to say something like "As we have
> Implemented numa_disabled for Arm, so we move numa_disabled to common header"?

See your own patch 3, where you have a similar statement (albeit you mean
"declaration" there, not "definition"). However, right now numa_disabled()
is a #define on Arm, so the declaration becoming common isn't really
warranted. In fact it'll get in the way of converting function-like macros
to inline functions for Misra.

Jan


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v2 03/17] xen/arm: implement node distance helpers for Arm
  2023-01-12  6:31     ` Wei Chen
@ 2023-01-12  8:11       ` Jan Beulich
  2023-01-12  8:29         ` Wei Chen
  2023-01-12  9:47         ` Julien Grall
  0 siblings, 2 replies; 34+ messages in thread
From: Jan Beulich @ 2023-01-12  8:11 UTC (permalink / raw)
  To: Wei Chen
  Cc: nd, Stefano Stabellini, Julien Grall, Bertrand Marquis,
	Volodymyr Babchuk, Andrew Cooper, George Dunlap, Wei Liu,
	Roger Pau Monné,
	xen-devel

On 12.01.2023 07:31, Wei Chen wrote:
>> -----Original Message-----
>> From: Jan Beulich <jbeulich@suse.com>
>> Sent: 2023年1月11日 0:47
>>
>> On 10.01.2023 09:49, Wei Chen wrote:
>>> --- a/xen/arch/arm/include/asm/numa.h
>>> +++ b/xen/arch/arm/include/asm/numa.h
>>> @@ -28,6 +28,20 @@ enum dt_numa_status {
>>>      DT_NUMA_OFF,
>>>  };
>>>
>>> +/*
>>> + * In ACPI spec, 0-9 are the reserved values for node distance,
>>> + * 10 indicates local node distance, 20 indicates remote node
>>> + * distance. Set node distance map in device tree will follow
>>> + * the ACPI's definition.
>>> + */
>>> +#define NUMA_DISTANCE_UDF_MIN   0
>>> +#define NUMA_DISTANCE_UDF_MAX   9
>>> +#define NUMA_LOCAL_DISTANCE     10
>>> +#define NUMA_REMOTE_DISTANCE    20
>>
>> In the absence of a caller of numa_set_distance() it is entirely unclear
>> whether this tying to ACPI used values is actually appropriate.
>>
> 
> From Kernel's NUMA device tree binding, it seems DT NUMA are reusing
> ACPI used values for distances [1].

I can't find any mention of ACPI in that doc, so the example values used
there matching ACPI's may also be coincidental. In no event can a Linux
kernel doc serve as DT specification. If values are to match ACPI's, I
expect a DT spec to actually say so.

Jan


^ permalink raw reply	[flat|nested] 34+ messages in thread

* RE: [PATCH v2 03/17] xen/arm: implement node distance helpers for Arm
  2023-01-12  8:11       ` Jan Beulich
@ 2023-01-12  8:29         ` Wei Chen
  2023-01-12  9:47         ` Julien Grall
  1 sibling, 0 replies; 34+ messages in thread
From: Wei Chen @ 2023-01-12  8:29 UTC (permalink / raw)
  To: Jan Beulich
  Cc: nd, Stefano Stabellini, Julien Grall, Bertrand Marquis,
	Volodymyr Babchuk, Andrew Cooper, George Dunlap, Wei Liu,
	Roger Pau Monné,
	xen-devel

Hi Jan,

> -----Original Message-----
> From: Jan Beulich <jbeulich@suse.com>
> Sent: 2023年1月12日 16:11
> To: Wei Chen <Wei.Chen@arm.com>
> Cc: nd <nd@arm.com>; Stefano Stabellini <sstabellini@kernel.org>; Julien
> Grall <julien@xen.org>; Bertrand Marquis <Bertrand.Marquis@arm.com>;
> Volodymyr Babchuk <Volodymyr_Babchuk@epam.com>; Andrew Cooper
> <andrew.cooper3@citrix.com>; George Dunlap <george.dunlap@citrix.com>; Wei
> Liu <wl@xen.org>; Roger Pau Monné <roger.pau@citrix.com>; xen-
> devel@lists.xenproject.org
> Subject: Re: [PATCH v2 03/17] xen/arm: implement node distance helpers for
> Arm
> 
> On 12.01.2023 07:31, Wei Chen wrote:
> >> -----Original Message-----
> >> From: Jan Beulich <jbeulich@suse.com>
> >> Sent: 2023年1月11日 0:47
> >>
> >> On 10.01.2023 09:49, Wei Chen wrote:
> >>> --- a/xen/arch/arm/include/asm/numa.h
> >>> +++ b/xen/arch/arm/include/asm/numa.h
> >>> @@ -28,6 +28,20 @@ enum dt_numa_status {
> >>>      DT_NUMA_OFF,
> >>>  };
> >>>
> >>> +/*
> >>> + * In ACPI spec, 0-9 are the reserved values for node distance,
> >>> + * 10 indicates local node distance, 20 indicates remote node
> >>> + * distance. Set node distance map in device tree will follow
> >>> + * the ACPI's definition.
> >>> + */
> >>> +#define NUMA_DISTANCE_UDF_MIN   0
> >>> +#define NUMA_DISTANCE_UDF_MAX   9
> >>> +#define NUMA_LOCAL_DISTANCE     10
> >>> +#define NUMA_REMOTE_DISTANCE    20
> >>
> >> In the absence of a caller of numa_set_distance() it is entirely
> unclear
> >> whether this tying to ACPI used values is actually appropriate.
> >>
> >
> > From Kernel's NUMA device tree binding, it seems DT NUMA are reusing
> > ACPI used values for distances [1].
> 
> I can't find any mention of ACPI in that doc, so the example values used
> there matching ACPI's may also be coincidental. In no event can a Linux
> kernel doc serve as DT specification. If values are to match ACPI's, I
> expect a DT spec to actually say so.
> 

Unfortunately, the latest device tree spec doesn't have any NUMA
description [1]. But if we define different values for DT NUMA in Xen,
we may have an incompatible with Linux DT.

[1] https://github.com/devicetree-org/devicetree-specification/releases/download/v0.4-rc1/devicetree-specification-v0.4-rc1.pdf

Cheers,
Wei Chen

> Jan

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v2 03/17] xen/arm: implement node distance helpers for Arm
  2023-01-12  8:11       ` Jan Beulich
  2023-01-12  8:29         ` Wei Chen
@ 2023-01-12  9:47         ` Julien Grall
  2023-01-16  9:40           ` Wei Chen
  1 sibling, 1 reply; 34+ messages in thread
From: Julien Grall @ 2023-01-12  9:47 UTC (permalink / raw)
  To: Jan Beulich, Wei Chen
  Cc: nd, Stefano Stabellini, Bertrand Marquis, Volodymyr Babchuk,
	Andrew Cooper, George Dunlap, Wei Liu, Roger Pau Monné,
	xen-devel

Hi,

On 12/01/2023 08:11, Jan Beulich wrote:
> On 12.01.2023 07:31, Wei Chen wrote:
>>> -----Original Message-----
>>> From: Jan Beulich <jbeulich@suse.com>
>>> Sent: 2023年1月11日 0:47
>>>
>>> On 10.01.2023 09:49, Wei Chen wrote:
>>>> --- a/xen/arch/arm/include/asm/numa.h
>>>> +++ b/xen/arch/arm/include/asm/numa.h
>>>> @@ -28,6 +28,20 @@ enum dt_numa_status {
>>>>       DT_NUMA_OFF,
>>>>   };
>>>>
>>>> +/*
>>>> + * In ACPI spec, 0-9 are the reserved values for node distance,
>>>> + * 10 indicates local node distance, 20 indicates remote node
>>>> + * distance. Set node distance map in device tree will follow
>>>> + * the ACPI's definition.

Looking at the ACPI spec, I agree that the local node distance is 
defined to 10. But I couldn't find any mention of the value 20.

How is NUMA_REMOTE_DISTANCE is meant to be used? Is this a default 
value? If so, maybe we should add "DEFAULT" in the name.

>>>> + */
>>>> +#define NUMA_DISTANCE_UDF_MIN   0
>>>> +#define NUMA_DISTANCE_UDF_MAX   9
>>>> +#define NUMA_LOCAL_DISTANCE     10
>>>> +#define NUMA_REMOTE_DISTANCE    20
>>>
>>> In the absence of a caller of numa_set_distance() it is entirely unclear
>>> whether this tying to ACPI used values is actually appropriate.
>>>
>>
>>  From Kernel's NUMA device tree binding, it seems DT NUMA are reusing
>> ACPI used values for distances [1].
> 
> I can't find any mention of ACPI in that doc, so the example values used
> there matching ACPI's may also be coincidental. In no event can a Linux
> kernel doc serve as DT specification. 

The directory Documentation/devicetree is the de-facto place where all 
the bindings are described. This is used by most (to not say all) users.

I vaguely remember there were plan in the past to move the bindings out 
of the kernel. But it looks like this has not been done. Yet, they tend 
to be reviewed independently from the kernel.

So, as Wei pointed, if we don't follow them then we will not be able to 
re-use Device-Tree shipped.

> If values are to match ACPI's, I
> expect a DT spec to actually say so.
I don't think it is necessary to say that. Bindings are not meant to 
change and a developer can rely on the local distance value to not 
change with the current bindings.

So IMHO, it is OK to assume that the local distance value is the same 
between ACPI and DT. That would definitely simplify the parsing and code.

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v2 02/17] xen/arm: implement helpers to get and update NUMA status
  2023-01-12  8:08       ` Jan Beulich
@ 2023-01-16  9:20         ` Wei Chen
  2023-01-16 11:14           ` Jan Beulich
  0 siblings, 1 reply; 34+ messages in thread
From: Wei Chen @ 2023-01-16  9:20 UTC (permalink / raw)
  To: Jan Beulich
  Cc: nd, Stefano Stabellini, Julien Grall, Bertrand Marquis,
	Volodymyr Babchuk, Andrew Cooper, George Dunlap, Wei Liu,
	Roger Pau Monné,
	xen-devel

Hi Jan,

On 2023/1/12 16:08, Jan Beulich wrote:
> On 12.01.2023 07:22, Wei Chen wrote:
>>> -----Original Message-----
>>> From: Jan Beulich <jbeulich@suse.com>
>>> Sent: 2023年1月11日 0:38
>>>
>>> On 10.01.2023 09:49, Wei Chen wrote:
>>>> --- a/xen/arch/arm/include/asm/numa.h
>>>> +++ b/xen/arch/arm/include/asm/numa.h
>>>> @@ -22,6 +22,12 @@ typedef u8 nodeid_t;
>>>>    */
>>>>   #define NR_NODE_MEMBLKS NR_MEM_BANKS
>>>>
>>>> +enum dt_numa_status {
>>>> +    DT_NUMA_INIT,
>>>
>>> I don't see any use of this. I also think the name isn't good, as INIT
>>> can be taken for "initializer" as well as "initialized". Suggesting an
>>> alternative would require knowing what the future plans with this are;
>>> right now ...
>>>
>>
>> static enum dt_numa_status __read_mostly device_tree_numa;
> 
> There's no DT_NUMA_INIT here. You _imply_ it having a value of zero.
> 

How about I assign device_tree_numa explicitly like:
... __read_mostly device_tree_numa = DT_NUMA_UNINIT;

>> We use DT_NUMA_INIT to indicate device_tree_numa is in a default value
>> (System's initial value, hasn't done initialization). Maybe rename it
>> To DT_NUMA_UNINIT be better?
> 
> Perhaps, yes.
> 
>>>> --- a/xen/arch/x86/include/asm/numa.h
>>>> +++ b/xen/arch/x86/include/asm/numa.h
>>>> @@ -12,7 +12,6 @@ extern unsigned int numa_node_to_arch_nid(nodeid_t n);
>>>>
>>>>   #define ZONE_ALIGN (1UL << (MAX_ORDER+PAGE_SHIFT))
>>>>
>>>> -extern bool numa_disabled(void);
>>>>   extern nodeid_t setup_node(unsigned int pxm);
>>>>   extern void srat_detect_node(int cpu);
>>>>
>>>> --- a/xen/include/xen/numa.h
>>>> +++ b/xen/include/xen/numa.h
>>>> @@ -55,6 +55,7 @@ extern void numa_init_array(void);
>>>>   extern void numa_set_node(unsigned int cpu, nodeid_t node);
>>>>   extern void numa_initmem_init(unsigned long start_pfn, unsigned long
>>> end_pfn);
>>>>   extern void numa_fw_bad(void);
>>>> +extern bool numa_disabled(void);
>>>>
>>>>   extern int arch_numa_setup(const char *opt);
>>>>   extern bool arch_numa_unavailable(void);
>>>
>>> How is this movement of a declaration related to the subject of the patch?
>>>
>>
>> Can I add some messages in commit log to say something like "As we have
>> Implemented numa_disabled for Arm, so we move numa_disabled to common header"?
> 
> See your own patch 3, where you have a similar statement (albeit you mean
> "declaration" there, not "definition"). However, right now numa_disabled()
> is a #define on Arm, so the declaration becoming common isn't really
> warranted. In fact it'll get in the way of converting function-like macros
> to inline functions for Misra.
> 

Yes, I think you're right. This may also seem a little strange,when we 
disable NUMA, there are two headers have numa_disabled statement. I will 
revert this change. And I also will covert the macro to a static inline 
function.

Cheers,
Wei Chen

> Jan


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v2 03/17] xen/arm: implement node distance helpers for Arm
  2023-01-12  9:47         ` Julien Grall
@ 2023-01-16  9:40           ` Wei Chen
  0 siblings, 0 replies; 34+ messages in thread
From: Wei Chen @ 2023-01-16  9:40 UTC (permalink / raw)
  To: Julien Grall, Jan Beulich
  Cc: nd, Stefano Stabellini, Bertrand Marquis, Volodymyr Babchuk,
	Andrew Cooper, George Dunlap, Wei Liu, Roger Pau Monné,
	xen-devel

Hi Julien,

On 2023/1/12 17:47, Julien Grall wrote:
> Hi,
> 
> On 12/01/2023 08:11, Jan Beulich wrote:
>> On 12.01.2023 07:31, Wei Chen wrote:
>>>> -----Original Message-----
>>>> From: Jan Beulich <jbeulich@suse.com>
>>>> Sent: 2023年1月11日 0:47
>>>>
>>>> On 10.01.2023 09:49, Wei Chen wrote:
>>>>> --- a/xen/arch/arm/include/asm/numa.h
>>>>> +++ b/xen/arch/arm/include/asm/numa.h
>>>>> @@ -28,6 +28,20 @@ enum dt_numa_status {
>>>>>       DT_NUMA_OFF,
>>>>>   };
>>>>>
>>>>> +/*
>>>>> + * In ACPI spec, 0-9 are the reserved values for node distance,
>>>>> + * 10 indicates local node distance, 20 indicates remote node
>>>>> + * distance. Set node distance map in device tree will follow
>>>>> + * the ACPI's definition.
> 
> Looking at the ACPI spec, I agree that the local node distance is 
> defined to 10. But I couldn't find any mention of the value 20.
> 
> How is NUMA_REMOTE_DISTANCE is meant to be used? Is this a default 
> value? If so, maybe we should add "DEFAULT" in the name.
> 

I think you're right, maybe we can rename NUMA_REMOTE_DISTANCE
to NUMA_DEFAULT_DISTANCE.Different pairs of nodes may have different 
remote distance values, I can not define one value for them.

And why use 20 as default value, I have followed the ACPI's logic.
In ACPI NUMA, when ACPI doesn't provide SLIT table, 20 will be used
for all nodes default distance.

>>>>> + */
>>>>> +#define NUMA_DISTANCE_UDF_MIN   0
>>>>> +#define NUMA_DISTANCE_UDF_MAX   9
>>>>> +#define NUMA_LOCAL_DISTANCE     10
>>>>> +#define NUMA_REMOTE_DISTANCE    20
>>>>
>>>> In the absence of a caller of numa_set_distance() it is entirely 
>>>> unclear
>>>> whether this tying to ACPI used values is actually appropriate.
>>>>
>>>
>>>  From Kernel's NUMA device tree binding, it seems DT NUMA are reusing
>>> ACPI used values for distances [1].
>>
>> I can't find any mention of ACPI in that doc, so the example values used
>> there matching ACPI's may also be coincidental. In no event can a Linux
>> kernel doc serve as DT specification. 
> 
> The directory Documentation/devicetree is the de-facto place where all 
> the bindings are described. This is used by most (to not say all) users.
> 
> I vaguely remember there were plan in the past to move the bindings out 
> of the kernel. But it looks like this has not been done. Yet, they tend 
> to be reviewed independently from the kernel.
> 
> So, as Wei pointed, if we don't follow them then we will not be able to 
> re-use Device-Tree shipped.
> 
>> If values are to match ACPI's, I
>> expect a DT spec to actually say so.
> I don't think it is necessary to say that. Bindings are not meant to 
> change and a developer can rely on the local distance value to not 
> change with the current bindings.
> 
> So IMHO, it is OK to assume that the local distance value is the same 
> between ACPI and DT. That would definitely simplify the parsing and code.
> 

Thanks for clarifying this.

Cheers,
Wei Chen

> Cheers,
> 


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v2 02/17] xen/arm: implement helpers to get and update NUMA status
  2023-01-16  9:20         ` Wei Chen
@ 2023-01-16 11:14           ` Jan Beulich
  2023-01-16 12:01             ` Wei Chen
  0 siblings, 1 reply; 34+ messages in thread
From: Jan Beulich @ 2023-01-16 11:14 UTC (permalink / raw)
  To: Wei Chen
  Cc: nd, Stefano Stabellini, Julien Grall, Bertrand Marquis,
	Volodymyr Babchuk, Andrew Cooper, George Dunlap, Wei Liu,
	Roger Pau Monné,
	xen-devel

On 16.01.2023 10:20, Wei Chen wrote:
> Hi Jan,
> 
> On 2023/1/12 16:08, Jan Beulich wrote:
>> On 12.01.2023 07:22, Wei Chen wrote:
>>>> -----Original Message-----
>>>> From: Jan Beulich <jbeulich@suse.com>
>>>> Sent: 2023年1月11日 0:38
>>>>
>>>> On 10.01.2023 09:49, Wei Chen wrote:
>>>>> --- a/xen/arch/arm/include/asm/numa.h
>>>>> +++ b/xen/arch/arm/include/asm/numa.h
>>>>> @@ -22,6 +22,12 @@ typedef u8 nodeid_t;
>>>>>    */
>>>>>   #define NR_NODE_MEMBLKS NR_MEM_BANKS
>>>>>
>>>>> +enum dt_numa_status {
>>>>> +    DT_NUMA_INIT,
>>>>
>>>> I don't see any use of this. I also think the name isn't good, as INIT
>>>> can be taken for "initializer" as well as "initialized". Suggesting an
>>>> alternative would require knowing what the future plans with this are;
>>>> right now ...
>>>>
>>>
>>> static enum dt_numa_status __read_mostly device_tree_numa;
>>
>> There's no DT_NUMA_INIT here. You _imply_ it having a value of zero.
>>
> 
> How about I assign device_tree_numa explicitly like:
> ... __read_mostly device_tree_numa = DT_NUMA_UNINIT;

Well, yes, this is what I was asking for when mentioning the lack of use
of the enumerator. Irrespective of that I remain unhappy with the name,
though.

Jan


^ permalink raw reply	[flat|nested] 34+ messages in thread

* RE: [PATCH v2 02/17] xen/arm: implement helpers to get and update NUMA status
  2023-01-16 11:14           ` Jan Beulich
@ 2023-01-16 12:01             ` Wei Chen
  2023-01-16 13:02               ` Jan Beulich
  0 siblings, 1 reply; 34+ messages in thread
From: Wei Chen @ 2023-01-16 12:01 UTC (permalink / raw)
  To: Jan Beulich
  Cc: nd, Stefano Stabellini, Julien Grall, Bertrand Marquis,
	Volodymyr Babchuk, Andrew Cooper, George Dunlap, Wei Liu,
	Roger Pau Monné,
	xen-devel

Hi Jan,

> -----Original Message-----
> From: Jan Beulich <jbeulich@suse.com>
> Sent: 2023年1月16日 19:15
> To: Wei Chen <Wei.Chen@arm.com>
> Cc: nd <nd@arm.com>; Stefano Stabellini <sstabellini@kernel.org>; Julien
> Grall <julien@xen.org>; Bertrand Marquis <Bertrand.Marquis@arm.com>;
> Volodymyr Babchuk <Volodymyr_Babchuk@epam.com>; Andrew Cooper
> <andrew.cooper3@citrix.com>; George Dunlap <george.dunlap@citrix.com>; Wei
> Liu <wl@xen.org>; Roger Pau Monné <roger.pau@citrix.com>; xen-
> devel@lists.xenproject.org
> Subject: Re: [PATCH v2 02/17] xen/arm: implement helpers to get and update
> NUMA status
> 
> On 16.01.2023 10:20, Wei Chen wrote:
> > Hi Jan,
> >
> > On 2023/1/12 16:08, Jan Beulich wrote:
> >> On 12.01.2023 07:22, Wei Chen wrote:
> >>>> -----Original Message-----
> >>>> From: Jan Beulich <jbeulich@suse.com>
> >>>> Sent: 2023年1月11日 0:38
> >>>>
> >>>> On 10.01.2023 09:49, Wei Chen wrote:
> >>>>> --- a/xen/arch/arm/include/asm/numa.h
> >>>>> +++ b/xen/arch/arm/include/asm/numa.h
> >>>>> @@ -22,6 +22,12 @@ typedef u8 nodeid_t;
> >>>>>    */
> >>>>>   #define NR_NODE_MEMBLKS NR_MEM_BANKS
> >>>>>
> >>>>> +enum dt_numa_status {
> >>>>> +    DT_NUMA_INIT,
> >>>>
> >>>> I don't see any use of this. I also think the name isn't good, as
> INIT
> >>>> can be taken for "initializer" as well as "initialized". Suggesting
> an
> >>>> alternative would require knowing what the future plans with this are;
> >>>> right now ...
> >>>>
> >>>
> >>> static enum dt_numa_status __read_mostly device_tree_numa;
> >>
> >> There's no DT_NUMA_INIT here. You _imply_ it having a value of zero.
> >>
> >
> > How about I assign device_tree_numa explicitly like:
> > ... __read_mostly device_tree_numa = DT_NUMA_UNINIT;
> 
> Well, yes, this is what I was asking for when mentioning the lack of use
> of the enumerator. Irrespective of that I remain unhappy with the name,
> though.
> 

How about DT_NUMA_DEF or do you have some suggestions for the name?

Cheers,
Wei Chen

> Jan

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v2 02/17] xen/arm: implement helpers to get and update NUMA status
  2023-01-16 12:01             ` Wei Chen
@ 2023-01-16 13:02               ` Jan Beulich
  0 siblings, 0 replies; 34+ messages in thread
From: Jan Beulich @ 2023-01-16 13:02 UTC (permalink / raw)
  To: Wei Chen
  Cc: nd, Stefano Stabellini, Julien Grall, Bertrand Marquis,
	Volodymyr Babchuk, Andrew Cooper, George Dunlap, Wei Liu,
	Roger Pau Monné,
	xen-devel

On 16.01.2023 13:01, Wei Chen wrote:
>> -----Original Message-----
>> From: Jan Beulich <jbeulich@suse.com>
>> Sent: 2023年1月16日 19:15
>>
>> On 16.01.2023 10:20, Wei Chen wrote:
>>> On 2023/1/12 16:08, Jan Beulich wrote:
>>>> On 12.01.2023 07:22, Wei Chen wrote:
>>>>>> -----Original Message-----
>>>>>> From: Jan Beulich <jbeulich@suse.com>
>>>>>> Sent: 2023年1月11日 0:38
>>>>>>
>>>>>> On 10.01.2023 09:49, Wei Chen wrote:
>>>>>>> --- a/xen/arch/arm/include/asm/numa.h
>>>>>>> +++ b/xen/arch/arm/include/asm/numa.h
>>>>>>> @@ -22,6 +22,12 @@ typedef u8 nodeid_t;
>>>>>>>    */
>>>>>>>   #define NR_NODE_MEMBLKS NR_MEM_BANKS
>>>>>>>
>>>>>>> +enum dt_numa_status {
>>>>>>> +    DT_NUMA_INIT,
>>>>>>
>>>>>> I don't see any use of this. I also think the name isn't good, as
>> INIT
>>>>>> can be taken for "initializer" as well as "initialized". Suggesting
>> an
>>>>>> alternative would require knowing what the future plans with this are;
>>>>>> right now ...
>>>>>>
>>>>>
>>>>> static enum dt_numa_status __read_mostly device_tree_numa;
>>>>
>>>> There's no DT_NUMA_INIT here. You _imply_ it having a value of zero.
>>>>
>>>
>>> How about I assign device_tree_numa explicitly like:
>>> ... __read_mostly device_tree_numa = DT_NUMA_UNINIT;
>>
>> Well, yes, this is what I was asking for when mentioning the lack of use
>> of the enumerator. Irrespective of that I remain unhappy with the name,
>> though.
>>
> 
> How about DT_NUMA_DEF or do you have some suggestions for the name?

Yeah, "DEFAULT" is probably the least bad one.

Jan


^ permalink raw reply	[flat|nested] 34+ messages in thread

end of thread, other threads:[~2023-01-16 13:02 UTC | newest]

Thread overview: 34+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-01-10  8:49 [PATCH v2 00/17] Device tree based NUMA support for Arm - Part#3 Wei Chen
2023-01-10  8:49 ` [PATCH v2 01/17] xen/arm: use NR_MEM_BANKS to override default NR_NODE_MEMBLKS Wei Chen
2023-01-10  9:59   ` Jan Beulich
2023-01-11 10:03     ` Wei Chen
2023-01-10  8:49 ` [PATCH v2 02/17] xen/arm: implement helpers to get and update NUMA status Wei Chen
2023-01-10 16:38   ` Jan Beulich
2023-01-12  6:22     ` Wei Chen
2023-01-12  8:08       ` Jan Beulich
2023-01-16  9:20         ` Wei Chen
2023-01-16 11:14           ` Jan Beulich
2023-01-16 12:01             ` Wei Chen
2023-01-16 13:02               ` Jan Beulich
2023-01-10  8:49 ` [PATCH v2 03/17] xen/arm: implement node distance helpers for Arm Wei Chen
2023-01-10 16:47   ` Jan Beulich
2023-01-12  6:31     ` Wei Chen
2023-01-12  8:11       ` Jan Beulich
2023-01-12  8:29         ` Wei Chen
2023-01-12  9:47         ` Julien Grall
2023-01-16  9:40           ` Wei Chen
2023-01-10  8:49 ` [PATCH v2 04/17] xen/arm: use arch_get_ram_range to memory ranges from bootinfo Wei Chen
2023-01-10  8:49 ` [PATCH v2 05/17] xen/arm: build NUMA cpu_to_node map in dt_smp_init_cpus Wei Chen
2023-01-10  8:49 ` [PATCH v2 06/17] xen/arm: Add boot and secondary CPU to NUMA system Wei Chen
2023-01-10  8:49 ` [PATCH v2 07/17] xen/arm: introduce a helper to parse device tree processor node Wei Chen
2023-01-10  8:49 ` [PATCH v2 08/17] xen/arm: introduce a helper to parse device tree memory node Wei Chen
2023-01-10  8:49 ` [PATCH v2 09/17] xen/arm: introduce a helper to parse device tree NUMA distance map Wei Chen
2023-01-10  8:49 ` [PATCH v2 10/17] xen/arm: unified entry to parse all NUMA data from device tree Wei Chen
2023-01-10  8:49 ` [PATCH v2 11/17] xen/arm: keep guest still be NUMA unware Wei Chen
2023-01-10  8:49 ` [PATCH v2 12/17] xen/arm: enable device tree based NUMA in system init Wei Chen
2023-01-10  8:49 ` [PATCH v2 13/17] xen/arm: implement numa_node_to_arch_nid for device tree NUMA Wei Chen
2023-01-10  8:49 ` [PATCH v2 14/17] xen/arm: use CONFIG_NUMA to gate node_online_map in smpboot Wei Chen
2023-01-10  8:49 ` [PATCH v2 15/17] xen/arm: Set correct per-cpu cpu_core_mask Wei Chen
2023-01-10  8:49 ` [PATCH v2 16/17] xen/arm: Provide Kconfig options for Arm to enable NUMA Wei Chen
2023-01-10  8:49 ` [PATCH v2 17/17] docs: update numa command line to support Arm Wei Chen
2023-01-10 10:01   ` Jan Beulich

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.