All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2 00/10] Device tree based NUMA support for Arm - Part#1
@ 2022-04-18  9:07 Wei Chen
  2022-04-18  9:07 ` [PATCH v2 01/10] xen/arm: Print a 64-bit number in hex from early uart Wei Chen
                   ` (10 more replies)
  0 siblings, 11 replies; 38+ messages in thread
From: Wei Chen @ 2022-04-18  9:07 UTC (permalink / raw)
  To: --to=xen-devel, xen-devel
  Cc: nd, Wei Chen, Stefano Stabellini, Julien Grall, Jan Beulich

(The Arm device tree based NUMA support patch set contains 35
patches. In order to make stuff easier for reviewers, I split
them into 3 parts:
1. Preparation. I have re-sorted the patch series. And moved
   independent patches to the head of the series.
2. Move generically usable code from x86 to common.
3. Add new code to support Arm.

This series only contains the first part patches.
The full patch list can be found in this cover letter.)

Xen memory allocation and scheduler modules are NUMA aware.
But actually, on x86 has implemented the architecture APIs
to support NUMA. Arm was providing a set of fake architecture
APIs to make it compatible with NUMA awared memory allocation
and scheduler.

Arm system was working well as a single node NUMA system with
these fake APIs, because we didn't have multiple nodes NUMA
system on Arm. But in recent years, more and more Arm devices
support multiple nodes NUMA system.

So now we have a new problem. When Xen is running on these Arm
devices, Xen still treat them as single node SMP systems. The
NUMA affinity capability of Xen memory allocation and scheduler
becomes meaningless. Because they rely on input data that does
not reflect real NUMA layout.

Xen still think the access time for all of the memory is the
same for all CPUs. However, Xen may allocate memory to a VM
from different NUMA nodes with different access speeds. This
difference can be amplified in workloads inside VM, causing
performance instability and timeouts. 

So in this patch series, we implement a set of NUMA API to use
device tree to describe the NUMA layout. We reuse most of the
code of x86 NUMA to create and maintain the mapping between
memory and CPU, create the matrix between any two NUMA nodes.
Except ACPI and some x86 specified code, we have moved other
code to common. In next stage, when we implement ACPI based
NUMA for Arm64, we may move the ACPI NUMA code to common too,
but in current stage, we keep it as x86 only.

This patch serires has been tested and booted well on one
Arm64 NUMA machine and one HPE x86 NUMA machine.

---
Part1 v1->v2:
1. Move independent patches from later to early of this series.
2. Drop the copy of EFI stub.c from Arm. Share common codes of
   x86 EFI stub for Arm.
3. Use CONFIG_ARM_EFI to replace CONFIG_EFI and remove help text
   and make CONFIG_ARM_EFI invisible.
4. Use ASSERT to replace VIRTUAL_BUG_ON in phys_to_nid.
5. Move MAX_NUMNODES from xen/numa.h to asm/numa.h for x86.
6. Extend the description of Arm's workaround for reserve DMA
   allocations to avoid the same discussion every time for
   arch_have_default_dmazone.
7. Update commit messages.

Wei Chen (10):
  xen/arm: Print a 64-bit number in hex from early uart
  xen/x86: move reusable EFI stub functions from x86 to common
  xen/arm: add CONFIG_ARM_EFI to stub EFI API
  xen/arm: Keep memory nodes in device tree when Xen boots from EFI
  xen/x86: Use ASSERT instead of VIRTUAL_BUG_ON for phys_to_nid
  xen: introduce an arch helper for default dma zone status
  xen: decouple NUMA from ACPI in Kconfig
  xen/arm: use !CONFIG_NUMA to keep fake NUMA API
  xen/x86: use paddr_t for addresses in NUMA node structure
  xen/x86: add detection of memory interleaves for different nodes

PART2:
  xen/x86: introduce a helper to update memory hotplug end
  xen/x86: Use enumerations to indicate NUMA status
  xen/x86: move generically usable NUMA code from x86 to common
  xen/x86: use arch_get_memory_map to get information from E820 map
  xen/x86: move NUMA scan nodes codes from x86 to common
  xen/x86: rename bad_srat to numa_bad

PART3:
  xen: rename acpi_scan_nodes to numa_scan_nodes
  xen: introduce a Kconfig option to configure NUMA nodes number
  xen/arm: use NR_MEM_BANKS to override default NR_NODE_MEMBLKS
  xen/arm: implement helpers to get and update NUMA status
  xen/arm: implement node distance helpers for Arm
  xen/arm: use arch_get_memory_map to memory bank from bootinfo
  xen/arm: build NUMA cpu_to_node map in dt_smp_init_cpus
  xen/arm: Add boot and secondary CPU to NUMA system
  xen/arm: stub mem_hotplug_update_boundary for Arm
  xen/arm: introduce a helper to parse device tree processor node
  xen/arm: introduce a helper to parse device tree memory node
  xen/arm: introduce a helper to parse device tree NUMA distance map
  xen/arm: unified entry to parse all NUMA data from device tree
  xen/arm: keep guest still be NUMA unware
  xen/arm: enable device tree based NUMA in system init
  xen/arm: implement a dummy 1:1 node to pxm mapping
  xen/arm: use CONFIG_NUMA to gate node_online_map in smpboot
  xen/arm: Provide Kconfig options for Arm to enable NUMA
  docs: update numa command line to support Arm

 xen/arch/arm/Kconfig                    |  5 ++
 xen/arch/arm/Makefile                   |  2 +-
 xen/arch/arm/arm64/head.S               | 12 ++--
 xen/arch/arm/bootfdt.c                  |  8 ++-
 xen/arch/arm/efi/Makefile               |  5 ++
 xen/arch/arm/efi/efi-boot.h             | 25 ---------
 xen/arch/arm/include/asm/numa.h         |  6 ++
 xen/arch/x86/Kconfig                    |  2 +-
 xen/arch/x86/efi/Makefile               |  4 +-
 xen/arch/x86/efi/{stub.c => stub-x86.c} | 37 -------------
 xen/arch/x86/include/asm/config.h       |  1 -
 xen/arch/x86/include/asm/numa.h         | 15 ++---
 xen/arch/x86/numa.c                     | 30 +++++-----
 xen/arch/x86/srat.c                     | 74 +++++++++++++++++++------
 xen/common/Kconfig                      |  3 +
 xen/common/efi/stub.c                   | 38 +++++++++++++
 xen/common/page_alloc.c                 |  2 +-
 xen/drivers/acpi/Kconfig                |  3 +-
 xen/drivers/acpi/Makefile               |  2 +-
 xen/include/xen/numa.h                  |  2 +
 20 files changed, 162 insertions(+), 114 deletions(-)
 rename xen/arch/x86/efi/{stub.c => stub-x86.c} (71%)
 create mode 100644 xen/common/efi/stub.c

-- 
2.25.1



^ permalink raw reply	[flat|nested] 38+ messages in thread

* [PATCH v2 01/10] xen/arm: Print a 64-bit number in hex from early uart
  2022-04-18  9:07 [PATCH v2 00/10] Device tree based NUMA support for Arm - Part#1 Wei Chen
@ 2022-04-18  9:07 ` Wei Chen
  2022-04-19  9:13   ` Jiamei Xie
  2022-04-18  9:07 ` [PATCH v2 02/10] xen/x86: move reusable EFI stub functions from x86 to common Wei Chen
                   ` (9 subsequent siblings)
  10 siblings, 1 reply; 38+ messages in thread
From: Wei Chen @ 2022-04-18  9:07 UTC (permalink / raw)
  To: --to=xen-devel, xen-devel
  Cc: nd, Wei Chen, Stefano Stabellini, Julien Grall, Bertrand Marquis,
	Volodymyr Babchuk, Julien Grall

Current putn function that is using for early print
only can print low 32-bit of AArch64 register. This
will lose some important messages while debugging
with early console. For example:
(XEN) Bringing up CPU5
- CPU 0000000100000100 booting -
Will be truncated to
(XEN) Bringing up CPU5
- CPU 00000100 booting -

In this patch, we increased the print loops and shift
bits to make putn print 64-bit number.

Signed-off-by: Wei Chen <wei.chen@arm.com>
Acked-by: Julien Grall <jgrall@amazon.com>
---
 xen/arch/arm/arm64/head.S | 12 +++++++-----
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/xen/arch/arm/arm64/head.S b/xen/arch/arm/arm64/head.S
index e62c48ec1c..2bb7906f69 100644
--- a/xen/arch/arm/arm64/head.S
+++ b/xen/arch/arm/arm64/head.S
@@ -866,17 +866,19 @@ puts:
         ret
 ENDPROC(puts)
 
-/* Print a 32-bit number in hex.  Specific to the PL011 UART.
+/*
+ * Print a 64-bit number in hex.
  * x0: Number to print.
  * x23: Early UART base address
- * Clobbers x0-x3 */
+ * Clobbers x0-x3
+ */
 putn:
         adr   x1, hex
-        mov   x3, #8
+        mov   x3, #16
 1:
         early_uart_ready x23, 2
-        and   x2, x0, #0xf0000000    /* Mask off the top nybble */
-        lsr   x2, x2, #28
+        and   x2, x0, #(0xf<<60)     /* Mask off the top nybble */
+        lsr   x2, x2, #60
         ldrb  w2, [x1, x2]           /* Convert to a char */
         early_uart_transmit x23, w2
         lsl   x0, x0, #4             /* Roll it through one nybble at a time */
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH v2 02/10] xen/x86: move reusable EFI stub functions from x86 to common
  2022-04-18  9:07 [PATCH v2 00/10] Device tree based NUMA support for Arm - Part#1 Wei Chen
  2022-04-18  9:07 ` [PATCH v2 01/10] xen/arm: Print a 64-bit number in hex from early uart Wei Chen
@ 2022-04-18  9:07 ` Wei Chen
  2022-04-21  0:17   ` Stefano Stabellini
  2022-04-26  8:53   ` Jan Beulich
  2022-04-18  9:07 ` [PATCH v2 03/10] xen/arm: add CONFIG_ARM_EFI to stub EFI API Wei Chen
                   ` (8 subsequent siblings)
  10 siblings, 2 replies; 38+ messages in thread
From: Wei Chen @ 2022-04-18  9:07 UTC (permalink / raw)
  To: --to=xen-devel, xen-devel
  Cc: nd, Wei Chen, Jan Beulich, Andrew Cooper, Roger Pau Monné, Wei Liu

Most of the functions in x86 EFI stub.c can be reused for other
architectures. So we move them to common and keep the x86 specific
function in stub-x86.c.

Signed-off-by: Wei Chen <wei.chen@arm.com>
---
v1 -> v2:
1. Drop the copy of stub.c from Arm EFI.
2. Share common codes of x86 EFI stub for other architectures.
---
 xen/arch/x86/efi/Makefile               |  4 +--
 xen/arch/x86/efi/{stub.c => stub-x86.c} | 37 ------------------------
 xen/common/efi/stub.c                   | 38 +++++++++++++++++++++++++
 3 files changed, 40 insertions(+), 39 deletions(-)
 rename xen/arch/x86/efi/{stub.c => stub-x86.c} (71%)
 create mode 100644 xen/common/efi/stub.c

diff --git a/xen/arch/x86/efi/Makefile b/xen/arch/x86/efi/Makefile
index 034ec87895..5ca3a0b4a4 100644
--- a/xen/arch/x86/efi/Makefile
+++ b/xen/arch/x86/efi/Makefile
@@ -11,8 +11,8 @@ $(obj)/boot.init.o: $(obj)/buildid.o
 $(call cc-option-add,cflags-stack-boundary,CC,-mpreferred-stack-boundary=4)
 $(addprefix $(obj)/,$(EFIOBJ-y)): CFLAGS_stack_boundary := $(cflags-stack-boundary)
 
-obj-y := stub.o
+obj-y := stub.o stub-x86.o
 obj-$(XEN_BUILD_EFI) := $(filter-out %.init.o,$(EFIOBJ-y))
 obj-bin-$(XEN_BUILD_EFI) := $(filter %.init.o,$(EFIOBJ-y))
 extra-$(XEN_BUILD_EFI) += buildid.o relocs-dummy.o
-nocov-$(XEN_BUILD_EFI) += stub.o
+nocov-$(XEN_BUILD_EFI) += stub.o stub-x86.o
diff --git a/xen/arch/x86/efi/stub.c b/xen/arch/x86/efi/stub-x86.c
similarity index 71%
rename from xen/arch/x86/efi/stub.c
rename to xen/arch/x86/efi/stub-x86.c
index 9984932626..2cd5c8d4dc 100644
--- a/xen/arch/x86/efi/stub.c
+++ b/xen/arch/x86/efi/stub-x86.c
@@ -1,7 +1,5 @@
 #include <xen/efi.h>
-#include <xen/errno.h>
 #include <xen/init.h>
-#include <xen/lib.h>
 #include <asm/asm_defns.h>
 #include <asm/efibind.h>
 #include <asm/page.h>
@@ -45,11 +43,6 @@ void __init noreturn efi_multiboot2(EFI_HANDLE ImageHandle,
     unreachable();
 }
 
-bool efi_enabled(unsigned int feature)
-{
-    return false;
-}
-
 void __init efi_init_memory(void) { }
 
 bool efi_boot_mem_unused(unsigned long *start, unsigned long *end)
@@ -61,33 +54,3 @@ bool efi_boot_mem_unused(unsigned long *start, unsigned long *end)
 }
 
 void efi_update_l4_pgtable(unsigned int l4idx, l4_pgentry_t l4e) { }
-
-bool efi_rs_using_pgtables(void)
-{
-    return false;
-}
-
-unsigned long efi_get_time(void)
-{
-    BUG();
-    return 0;
-}
-
-void efi_halt_system(void) { }
-void efi_reset_system(bool warm) { }
-
-int efi_get_info(uint32_t idx, union xenpf_efi_info *info)
-{
-    return -ENOSYS;
-}
-
-int efi_compat_get_info(uint32_t idx, union compat_pf_efi_info *)
-    __attribute__((__alias__("efi_get_info")));
-
-int efi_runtime_call(struct xenpf_efi_runtime_call *op)
-{
-    return -ENOSYS;
-}
-
-int efi_compat_runtime_call(struct compat_pf_efi_runtime_call *)
-    __attribute__((__alias__("efi_runtime_call")));
diff --git a/xen/common/efi/stub.c b/xen/common/efi/stub.c
new file mode 100644
index 0000000000..6e4f4de9af
--- /dev/null
+++ b/xen/common/efi/stub.c
@@ -0,0 +1,38 @@
+#include <xen/efi.h>
+#include <xen/errno.h>
+#include <xen/lib.h>
+
+bool efi_enabled(unsigned int feature)
+{
+    return false;
+}
+
+bool efi_rs_using_pgtables(void)
+{
+    return false;
+}
+
+unsigned long efi_get_time(void)
+{
+    BUG();
+    return 0;
+}
+
+void efi_halt_system(void) { }
+void efi_reset_system(bool warm) { }
+
+int efi_get_info(uint32_t idx, union xenpf_efi_info *info)
+{
+    return -ENOSYS;
+}
+
+int efi_compat_get_info(uint32_t idx, union compat_pf_efi_info *)
+    __attribute__((__alias__("efi_get_info")));
+
+int efi_runtime_call(struct xenpf_efi_runtime_call *op)
+{
+    return -ENOSYS;
+}
+
+int efi_compat_runtime_call(struct compat_pf_efi_runtime_call *)
+    __attribute__((__alias__("efi_runtime_call")));
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH v2 03/10] xen/arm: add CONFIG_ARM_EFI to stub EFI API
  2022-04-18  9:07 [PATCH v2 00/10] Device tree based NUMA support for Arm - Part#1 Wei Chen
  2022-04-18  9:07 ` [PATCH v2 01/10] xen/arm: Print a 64-bit number in hex from early uart Wei Chen
  2022-04-18  9:07 ` [PATCH v2 02/10] xen/x86: move reusable EFI stub functions from x86 to common Wei Chen
@ 2022-04-18  9:07 ` Wei Chen
  2022-04-21  0:25   ` Stefano Stabellini
  2022-04-18  9:07 ` [PATCH v2 04/10] xen/arm: Keep memory nodes in device tree when Xen boots from EFI Wei Chen
                   ` (7 subsequent siblings)
  10 siblings, 1 reply; 38+ messages in thread
From: Wei Chen @ 2022-04-18  9:07 UTC (permalink / raw)
  To: --to=xen-devel, xen-devel
  Cc: nd, Wei Chen, Stefano Stabellini, Julien Grall, Bertrand Marquis,
	Volodymyr Babchuk

x86 is using compiler feature testing to decide EFI
build enable or not. When EFI build is disabled, x86
will use a efi/stub.c file to replace efi/runtime.c
for build objects. Following this idea, we introduce
a stub file for Arm, but use CONFIG_ARM_EFI to decide
EFI build enable or not.

Signed-off-by: Wei Chen <wei.chen@arm.com>
---
v1 -> v2:
1. Use CONFIG_ARM_EFI to replace CONFIG_EFI
2. Remove help text and make CONFIG_ARM_EFI invisible.
3. Merge one following patch:
   xen/arm: introduce a stub file for non-EFI architectures
4. Use the common stub.c instead of creating new one.
---
 xen/arch/arm/Kconfig      | 5 +++++
 xen/arch/arm/Makefile     | 2 +-
 xen/arch/arm/efi/Makefile | 5 +++++
 3 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/xen/arch/arm/Kconfig b/xen/arch/arm/Kconfig
index ecfa6822e4..5f1b0121b0 100644
--- a/xen/arch/arm/Kconfig
+++ b/xen/arch/arm/Kconfig
@@ -6,6 +6,7 @@ config ARM_64
 	def_bool y
 	depends on !ARM_32
 	select 64BIT
+	select ARM_EFI
 	select HAS_FAST_MULTIPLY
 
 config ARM
@@ -33,6 +34,10 @@ config ACPI
 	  Advanced Configuration and Power Interface (ACPI) support for Xen is
 	  an alternative to device tree on ARM64.
 
+config ARM_EFI
+	bool
+	depends on ARM_64
+
 config GICV3
 	bool "GICv3 driver"
 	depends on ARM_64 && !NEW_VGIC
diff --git a/xen/arch/arm/Makefile b/xen/arch/arm/Makefile
index 1d862351d1..bb7a6151c1 100644
--- a/xen/arch/arm/Makefile
+++ b/xen/arch/arm/Makefile
@@ -1,6 +1,5 @@
 obj-$(CONFIG_ARM_32) += arm32/
 obj-$(CONFIG_ARM_64) += arm64/
-obj-$(CONFIG_ARM_64) += efi/
 obj-$(CONFIG_ACPI) += acpi/
 obj-$(CONFIG_HAS_PCI) += pci/
 ifneq ($(CONFIG_NO_PLAT),y)
@@ -20,6 +19,7 @@ obj-y += domain.o
 obj-y += domain_build.init.o
 obj-y += domctl.o
 obj-$(CONFIG_EARLY_PRINTK) += early_printk.o
+obj-y += efi/
 obj-y += gic.o
 obj-y += gic-v2.o
 obj-$(CONFIG_GICV3) += gic-v3.o
diff --git a/xen/arch/arm/efi/Makefile b/xen/arch/arm/efi/Makefile
index 4313c39066..75ef180233 100644
--- a/xen/arch/arm/efi/Makefile
+++ b/xen/arch/arm/efi/Makefile
@@ -1,4 +1,9 @@
 include $(srctree)/common/efi/efi-common.mk
 
+ifeq ($(CONFIG_ARM_EFI),y)
 obj-y += $(EFIOBJ-y)
 obj-$(CONFIG_ACPI) +=  efi-dom0.init.o
+else
+EFIOBJ-y += stub.o
+obj-y += stub.o
+endif
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH v2 04/10] xen/arm: Keep memory nodes in device tree when Xen boots from EFI
  2022-04-18  9:07 [PATCH v2 00/10] Device tree based NUMA support for Arm - Part#1 Wei Chen
                   ` (2 preceding siblings ...)
  2022-04-18  9:07 ` [PATCH v2 03/10] xen/arm: add CONFIG_ARM_EFI to stub EFI API Wei Chen
@ 2022-04-18  9:07 ` Wei Chen
  2022-04-21  0:30   ` Stefano Stabellini
  2022-04-18  9:07 ` [PATCH v2 05/10] xen/x86: Use ASSERT instead of VIRTUAL_BUG_ON for phys_to_nid Wei Chen
                   ` (6 subsequent siblings)
  10 siblings, 1 reply; 38+ messages in thread
From: Wei Chen @ 2022-04-18  9:07 UTC (permalink / raw)
  To: --to=xen-devel, xen-devel
  Cc: nd, Wei Chen, Stefano Stabellini, Julien Grall, Bertrand Marquis,
	Volodymyr Babchuk

In current code, when Xen is booting from EFI, it will delete
all memory nodes in device tree. This would work well in current
stage, because Xen can get memory map from EFI system table.
However, EFI system table cannot completely replace memory nodes
of device tree. EFI system table doesn't contain memory NUMA
information. Xen depends on ACPI SRAT or device tree memory nodes
to parse memory blocks' NUMA mapping. So in EFI + DTB boot, Xen
doesn't have any method to get numa-node-id for memory blocks any
more. This makes device tree based NUMA support become impossible
for Xen in EFI + DTB boot.

So in this patch, we will keep memory nodes in device tree for
NUMA code to parse memory numa-node-id later.

As a side effect, if we still parse boot memory information in
early_scan_node, bootmem.info will calculate memory ranges in
memory nodes twice. So we have to prevent early_scan_node to
parse memory nodes in EFI boot.

Signed-off-by: Wei Chen <wei.chen@arm.com>
---
v1 -> v2:
1. Move this patch from later to early of this series.
2. Refine commit message.
---
 xen/arch/arm/bootfdt.c      |  8 +++++++-
 xen/arch/arm/efi/efi-boot.h | 25 -------------------------
 2 files changed, 7 insertions(+), 26 deletions(-)

diff --git a/xen/arch/arm/bootfdt.c b/xen/arch/arm/bootfdt.c
index e318ef9603..78e10c6ebc 100644
--- a/xen/arch/arm/bootfdt.c
+++ b/xen/arch/arm/bootfdt.c
@@ -11,6 +11,7 @@
 #include <xen/lib.h>
 #include <xen/kernel.h>
 #include <xen/init.h>
+#include <xen/efi.h>
 #include <xen/device_tree.h>
 #include <xen/libfdt/libfdt.h>
 #include <xen/sort.h>
@@ -370,7 +371,12 @@ static int __init early_scan_node(const void *fdt,
 {
     int rc = 0;
 
-    if ( device_tree_node_matches(fdt, node, "memory") )
+    /*
+     * If Xen has been booted via UEFI, the memory banks are
+     * populated. So we should skip the parsing.
+     */
+    if ( !efi_enabled(EFI_BOOT) &&
+         device_tree_node_matches(fdt, node, "memory") )
         rc = process_memory_node(fdt, node, name, depth,
                                  address_cells, size_cells, &bootinfo.mem);
     else if ( depth == 1 && !dt_node_cmp(name, "reserved-memory") )
diff --git a/xen/arch/arm/efi/efi-boot.h b/xen/arch/arm/efi/efi-boot.h
index e452b687d8..59d93c24a1 100644
--- a/xen/arch/arm/efi/efi-boot.h
+++ b/xen/arch/arm/efi/efi-boot.h
@@ -231,33 +231,8 @@ EFI_STATUS __init fdt_add_uefi_nodes(EFI_SYSTEM_TABLE *sys_table,
     int status;
     u32 fdt_val32;
     u64 fdt_val64;
-    int prev;
     int num_rsv;
 
-    /*
-     * Delete any memory nodes present.  The EFI memory map is the only
-     * memory description provided to Xen.
-     */
-    prev = 0;
-    for (;;)
-    {
-        const char *type;
-        int len;
-
-        node = fdt_next_node(fdt, prev, NULL);
-        if ( node < 0 )
-            break;
-
-        type = fdt_getprop(fdt, node, "device_type", &len);
-        if ( type && strncmp(type, "memory", len) == 0 )
-        {
-            fdt_del_node(fdt, node);
-            continue;
-        }
-
-        prev = node;
-    }
-
    /*
     * Delete all memory reserve map entries. When booting via UEFI,
     * kernel will use the UEFI memory map to find reserved regions.
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH v2 05/10] xen/x86: Use ASSERT instead of VIRTUAL_BUG_ON for phys_to_nid
  2022-04-18  9:07 [PATCH v2 00/10] Device tree based NUMA support for Arm - Part#1 Wei Chen
                   ` (3 preceding siblings ...)
  2022-04-18  9:07 ` [PATCH v2 04/10] xen/arm: Keep memory nodes in device tree when Xen boots from EFI Wei Chen
@ 2022-04-18  9:07 ` Wei Chen
  2022-04-21  0:37   ` Stefano Stabellini
  2022-04-26  9:02   ` Jan Beulich
  2022-04-18  9:07 ` [PATCH v2 06/10] xen: introduce an arch helper for default dma zone status Wei Chen
                   ` (5 subsequent siblings)
  10 siblings, 2 replies; 38+ messages in thread
From: Wei Chen @ 2022-04-18  9:07 UTC (permalink / raw)
  To: --to=xen-devel, xen-devel
  Cc: nd, Wei Chen, Jan Beulich, Andrew Cooper, Roger Pau Monné,
	Wei Liu, George Dunlap, Julien Grall, Stefano Stabellini

VIRTUAL_BUG_ON is an empty macro used in phys_to_nid. This
results in two lines of error-checking code in phys_to_nid
that is not actually working and causing two compilation
errors:
1. error: "MAX_NUMNODES" undeclared (first use in this function).
   This is because in the common header file, "MAX_NUMNODES" is
   defined after the common header file includes the ARCH header
   file, where phys_to_nid has attempted to use "MAX_NUMNODES".
   This error was resolved when we moved the definition of
   "MAX_NUMNODES" to x86 ARCH header file. And we reserve the
   "MAX_NUMNODES" definition in common header file through a
   conditional compilation for some architectures that don't
   need to define "MAX_NUMNODES" in their ARCH header files.
2. error: wrong type argument to unary exclamation mark.
   This is because, the error-checking code contains !node_data[nid].
   But node_data is a data structure variable, it's not a pointer.

So, in this patch, we use ASSERT instead of VIRTUAL_BUG_ON to
enable the two lines of error-checking code. And fix the left
compilation errors by replacing !node_data[nid] to
!node_data[nid].node_spanned_pages.

Because when node_spanned_pages is 0, this node has no memory,
numa_scan_node will print warning message for such kind of nodes:
"Firmware Bug or mis-configured hardware?".

Signed-off-by: Wei Chen <wei.chen@arm.com>
---
v1 -> v2:
1. Use ASSERT to replace VIRTUAL_BUG_ON in phys_to_nid.
2. Adjust the conditional express for ASSERT.
3. Move MAX_NUMNODES from xen/numa.h to asm/numa.h for x86.
4. Use conditional macro to gate MAX_NUMNODES for other architectures.
---
 xen/arch/x86/include/asm/numa.h | 6 +++---
 xen/include/xen/numa.h          | 2 ++
 2 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/xen/arch/x86/include/asm/numa.h b/xen/arch/x86/include/asm/numa.h
index bada2c0bb9..1f268ce77d 100644
--- a/xen/arch/x86/include/asm/numa.h
+++ b/xen/arch/x86/include/asm/numa.h
@@ -4,6 +4,7 @@
 #include <xen/cpumask.h>
 
 #define NODES_SHIFT 6
+#define MAX_NUMNODES    (1 << NODES_SHIFT)
 
 typedef u8 nodeid_t;
 
@@ -26,7 +27,6 @@ extern int compute_hash_shift(struct node *nodes, int numnodes,
 extern nodeid_t pxm_to_node(unsigned int pxm);
 
 #define ZONE_ALIGN (1UL << (MAX_ORDER+PAGE_SHIFT))
-#define VIRTUAL_BUG_ON(x) 
 
 extern void numa_add_cpu(int cpu);
 extern void numa_init_array(void);
@@ -62,9 +62,9 @@ extern struct node_data node_data[];
 static inline __attribute__((pure)) nodeid_t phys_to_nid(paddr_t addr)
 { 
 	nodeid_t nid;
-	VIRTUAL_BUG_ON((paddr_to_pdx(addr) >> memnode_shift) >= memnodemapsize);
+	ASSERT((paddr_to_pdx(addr) >> memnode_shift) < memnodemapsize);
 	nid = memnodemap[paddr_to_pdx(addr) >> memnode_shift]; 
-	VIRTUAL_BUG_ON(nid >= MAX_NUMNODES || !node_data[nid]); 
+	ASSERT(nid < MAX_NUMNODES && node_data[nid].node_spanned_pages);
 	return nid; 
 } 
 
diff --git a/xen/include/xen/numa.h b/xen/include/xen/numa.h
index 7aef1a88dc..91b25c5617 100644
--- a/xen/include/xen/numa.h
+++ b/xen/include/xen/numa.h
@@ -10,7 +10,9 @@
 #define NUMA_NO_NODE     0xFF
 #define NUMA_NO_DISTANCE 0xFF
 
+#ifndef MAX_NUMNODES
 #define MAX_NUMNODES    (1 << NODES_SHIFT)
+#endif
 
 #define vcpu_to_node(v) (cpu_to_node((v)->processor))
 
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH v2 06/10] xen: introduce an arch helper for default dma zone status
  2022-04-18  9:07 [PATCH v2 00/10] Device tree based NUMA support for Arm - Part#1 Wei Chen
                   ` (4 preceding siblings ...)
  2022-04-18  9:07 ` [PATCH v2 05/10] xen/x86: Use ASSERT instead of VIRTUAL_BUG_ON for phys_to_nid Wei Chen
@ 2022-04-18  9:07 ` Wei Chen
  2022-04-19  9:18   ` Jan Beulich
  2022-04-18  9:07 ` [PATCH v2 07/10] xen: decouple NUMA from ACPI in Kconfig Wei Chen
                   ` (4 subsequent siblings)
  10 siblings, 1 reply; 38+ messages in thread
From: Wei Chen @ 2022-04-18  9:07 UTC (permalink / raw)
  To: --to=xen-devel, xen-devel
  Cc: nd, Wei Chen, Stefano Stabellini, Julien Grall, Bertrand Marquis,
	Volodymyr Babchuk, Andrew Cooper, George Dunlap, Jan Beulich,
	Wei Liu, Roger Pau Monné

In current code, when Xen is running in a multiple nodes
NUMA system, it will set dma_bitsize in end_boot_allocator
to reserve some low address memory as DMA zone.

There are some x86 implications in the implementation.
Because on x86, memory starts from 0. On a multiple-nodes
NUMA system, if a single node contains the majority or all
of the DMA memory, x86 prefers to give out memory from
non-local allocations rather than exhausting the DMA memory
ranges. Hence x86 uses dma_bitsize to set aside some largely
arbitrary amount of memory for DMA zone. The allocations
from DMA zone would happen only after exhausting all other
nodes' memory.

But the implications are not shared across all architectures.
For example, Arm cannot guarantee the availability of memory
below a certain boundary for DMA limited-capability devices
either. But currently, Arm doesn't need a reserved DMA zone
in Xen. Because there is no DMA device in Xen. And for guests,
Xen Arm only allows Dom0 to have DMA operations without IOMMU.
Xen will try to allocate memory under 4GB or memory range that
is limited by dma_bitsize for Dom0 in boot time. For DomU, even
Xen can passthrough devices to DomU without IOMMU, but Xen Arm
doesn't guarantee their DMA operations. So, Xen Arm doesn't
need a reserved DMA zone to provide DMA memory for guests.

In this patch, we introduce an arch_have_default_dmazone helper
for different architectures to determine whether they need to
set dma_bitsize for DMA zone reservation or not.

At the same time, when x86 Xen is built with CONFIG_PV=n could
probably leverage this new helper to actually not trigger DMA
zone reservation.

Signed-off-by: Wei Chen <wei.chen@arm.com>
---
v1 -> v2:
1. Extend the description of Arm's workaround for reserve DMA
   allocations to avoid the same discussion every time.
2. Use a macro to define arch_have_default_dmazone, because
   it's little hard to make x86 version to static inline.
   Use a macro will also avoid add __init for this function.
3. Change arch_have_default_dmazone return value from
   unsigned int to bool.
4. Un-addressed comment: make arch_have_default_dmazone
   of x86 to be static inline. Because, if we move
   arch_have_default_dmazone to x86/asm/numa.h, it depends
   on nodemask.h to provide num_online_nodes. But nodemask.h
   needs numa.h to provide MAX_NUMANODES. This will cause a
   loop dependency. And this function can only be used in
   end_boot_allocator, in Xen initialization. So I think,
   compared to the changes introduced by inline, it doesn't
   mean much.
---
 xen/arch/arm/include/asm/numa.h | 1 +
 xen/arch/x86/include/asm/numa.h | 1 +
 xen/common/page_alloc.c         | 2 +-
 3 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/xen/arch/arm/include/asm/numa.h b/xen/arch/arm/include/asm/numa.h
index 31a6de4e23..268ba93a92 100644
--- a/xen/arch/arm/include/asm/numa.h
+++ b/xen/arch/arm/include/asm/numa.h
@@ -24,6 +24,7 @@ extern mfn_t first_valid_mfn;
 #define node_spanned_pages(nid) (max_page - mfn_x(first_valid_mfn))
 #define node_start_pfn(nid) (mfn_x(first_valid_mfn))
 #define __node_distance(a, b) (20)
+#define arch_have_default_dmazone() (false)
 
 #endif /* __ARCH_ARM_NUMA_H */
 /*
diff --git a/xen/arch/x86/include/asm/numa.h b/xen/arch/x86/include/asm/numa.h
index 1f268ce77d..6eeae02060 100644
--- a/xen/arch/x86/include/asm/numa.h
+++ b/xen/arch/x86/include/asm/numa.h
@@ -74,6 +74,7 @@ static inline __attribute__((pure)) nodeid_t phys_to_nid(paddr_t addr)
 #define node_spanned_pages(nid)	(NODE_DATA(nid)->node_spanned_pages)
 #define node_end_pfn(nid)       (NODE_DATA(nid)->node_start_pfn + \
 				 NODE_DATA(nid)->node_spanned_pages)
+#define arch_have_default_dmazone() (num_online_nodes() > 1)
 
 extern int valid_numa_range(u64 start, u64 end, nodeid_t node);
 
diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c
index 319029140f..4c0dc3cb3c 100644
--- a/xen/common/page_alloc.c
+++ b/xen/common/page_alloc.c
@@ -1889,7 +1889,7 @@ void __init end_boot_allocator(void)
     }
     nr_bootmem_regions = 0;
 
-    if ( !dma_bitsize && (num_online_nodes() > 1) )
+    if ( !dma_bitsize && arch_have_default_dmazone() )
         dma_bitsize = arch_get_dma_bitsize();
 
     printk("Domain heap initialised");
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH v2 07/10] xen: decouple NUMA from ACPI in Kconfig
  2022-04-18  9:07 [PATCH v2 00/10] Device tree based NUMA support for Arm - Part#1 Wei Chen
                   ` (5 preceding siblings ...)
  2022-04-18  9:07 ` [PATCH v2 06/10] xen: introduce an arch helper for default dma zone status Wei Chen
@ 2022-04-18  9:07 ` Wei Chen
  2022-04-18  9:07 ` [PATCH v2 08/10] xen/arm: use !CONFIG_NUMA to keep fake NUMA API Wei Chen
                   ` (3 subsequent siblings)
  10 siblings, 0 replies; 38+ messages in thread
From: Wei Chen @ 2022-04-18  9:07 UTC (permalink / raw)
  To: --to=xen-devel, xen-devel
  Cc: nd, Wei Chen, Jan Beulich, Andrew Cooper, Roger Pau Monné,
	Wei Liu, George Dunlap, Julien Grall, Stefano Stabellini

In current Xen code only implements x86 ACPI-based NUMA support.
So in Xen Kconfig system, NUMA equals to ACPI_NUMA. x86 selects
NUMA by default, and CONFIG_ACPI_NUMA is hardcode in config.h.

In a follow-up patch, we will introduce support for NUMA using
the device tree. That means we will have two NUMA implementations,
so in this patch we decouple NUMA from ACPI based NUMA in Kconfig.
Make NUMA as a common feature, that device tree based NUMA also
can select it.

Signed-off-by: Wei Chen <wei.chen@arm.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
---
 xen/arch/x86/Kconfig              | 2 +-
 xen/arch/x86/include/asm/config.h | 1 -
 xen/common/Kconfig                | 3 +++
 xen/drivers/acpi/Kconfig          | 3 ++-
 xen/drivers/acpi/Makefile         | 2 +-
 5 files changed, 7 insertions(+), 4 deletions(-)

diff --git a/xen/arch/x86/Kconfig b/xen/arch/x86/Kconfig
index 06d6fbc864..1e31edc99f 100644
--- a/xen/arch/x86/Kconfig
+++ b/xen/arch/x86/Kconfig
@@ -6,6 +6,7 @@ config X86
 	def_bool y
 	select ACPI
 	select ACPI_LEGACY_TABLES_LOOKUP
+	select ACPI_NUMA
 	select ALTERNATIVE_CALL
 	select ARCH_SUPPORTS_INT128
 	select CORE_PARKING
@@ -26,7 +27,6 @@ config X86
 	select HAS_UBSAN
 	select HAS_VPCI if HVM
 	select NEEDS_LIBELF
-	select NUMA
 
 config ARCH_DEFCONFIG
 	string
diff --git a/xen/arch/x86/include/asm/config.h b/xen/arch/x86/include/asm/config.h
index de20642524..07bcd15831 100644
--- a/xen/arch/x86/include/asm/config.h
+++ b/xen/arch/x86/include/asm/config.h
@@ -31,7 +31,6 @@
 /* Intel P4 currently has largest cache line (L2 line size is 128 bytes). */
 #define CONFIG_X86_L1_CACHE_SHIFT 7
 
-#define CONFIG_ACPI_NUMA 1
 #define CONFIG_ACPI_SRAT 1
 #define CONFIG_ACPI_CSTATE 1
 
diff --git a/xen/common/Kconfig b/xen/common/Kconfig
index d921c74d61..d65add3fc6 100644
--- a/xen/common/Kconfig
+++ b/xen/common/Kconfig
@@ -70,6 +70,9 @@ config MEM_ACCESS
 config NEEDS_LIBELF
 	bool
 
+config NUMA
+	bool
+
 config STATIC_MEMORY
 	bool "Static Allocation Support (UNSUPPORTED)" if UNSUPPORTED
 	depends on ARM
diff --git a/xen/drivers/acpi/Kconfig b/xen/drivers/acpi/Kconfig
index b64d3731fb..e3f3d8f4b1 100644
--- a/xen/drivers/acpi/Kconfig
+++ b/xen/drivers/acpi/Kconfig
@@ -5,5 +5,6 @@ config ACPI
 config ACPI_LEGACY_TABLES_LOOKUP
 	bool
 
-config NUMA
+config ACPI_NUMA
 	bool
+	select NUMA
diff --git a/xen/drivers/acpi/Makefile b/xen/drivers/acpi/Makefile
index 4f8e97228e..2fc5230253 100644
--- a/xen/drivers/acpi/Makefile
+++ b/xen/drivers/acpi/Makefile
@@ -3,7 +3,7 @@ obj-y += utilities/
 obj-$(CONFIG_X86) += apei/
 
 obj-bin-y += tables.init.o
-obj-$(CONFIG_NUMA) += numa.o
+obj-$(CONFIG_ACPI_NUMA) += numa.o
 obj-y += osl.o
 obj-$(CONFIG_HAS_CPUFREQ) += pmstat.o
 
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH v2 08/10] xen/arm: use !CONFIG_NUMA to keep fake NUMA API
  2022-04-18  9:07 [PATCH v2 00/10] Device tree based NUMA support for Arm - Part#1 Wei Chen
                   ` (6 preceding siblings ...)
  2022-04-18  9:07 ` [PATCH v2 07/10] xen: decouple NUMA from ACPI in Kconfig Wei Chen
@ 2022-04-18  9:07 ` Wei Chen
  2022-04-18  9:07 ` [PATCH v2 09/10] xen/x86: use paddr_t for addresses in NUMA node structure Wei Chen
                   ` (2 subsequent siblings)
  10 siblings, 0 replies; 38+ messages in thread
From: Wei Chen @ 2022-04-18  9:07 UTC (permalink / raw)
  To: --to=xen-devel, xen-devel
  Cc: nd, Wei Chen, Stefano Stabellini, Julien Grall, Bertrand Marquis,
	Volodymyr Babchuk

We have introduced CONFIG_NUMA in a previous patch. And this
option is enabled only on x86 at the current stage. In a follow
up patch, we will enable this option for Arm. But we still
want users to be able to disable the CONFIG_NUMA via Kconfig. In
this case, keep the fake NUMA API, will make Arm code still
able to work with NUMA aware memory allocation and scheduler.

Signed-off-by: Wei Chen <wei.chen@arm.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
---
 xen/arch/arm/include/asm/numa.h | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/xen/arch/arm/include/asm/numa.h b/xen/arch/arm/include/asm/numa.h
index 268ba93a92..59cdf8e39e 100644
--- a/xen/arch/arm/include/asm/numa.h
+++ b/xen/arch/arm/include/asm/numa.h
@@ -5,6 +5,8 @@
 
 typedef u8 nodeid_t;
 
+#ifndef CONFIG_NUMA
+
 /* Fake one node for now. See also node_online_map. */
 #define cpu_to_node(cpu) 0
 #define node_to_cpumask(node)   (cpu_online_map)
@@ -24,6 +26,9 @@ extern mfn_t first_valid_mfn;
 #define node_spanned_pages(nid) (max_page - mfn_x(first_valid_mfn))
 #define node_start_pfn(nid) (mfn_x(first_valid_mfn))
 #define __node_distance(a, b) (20)
+
+#endif
+
 #define arch_have_default_dmazone() (false)
 
 #endif /* __ARCH_ARM_NUMA_H */
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH v2 09/10] xen/x86: use paddr_t for addresses in NUMA node structure
  2022-04-18  9:07 [PATCH v2 00/10] Device tree based NUMA support for Arm - Part#1 Wei Chen
                   ` (7 preceding siblings ...)
  2022-04-18  9:07 ` [PATCH v2 08/10] xen/arm: use !CONFIG_NUMA to keep fake NUMA API Wei Chen
@ 2022-04-18  9:07 ` Wei Chen
  2022-04-26  9:11   ` Jan Beulich
  2022-04-18  9:07 ` [PATCH v2 10/10] xen/x86: add detection of memory interleaves for different nodes Wei Chen
  2022-04-19  9:29 ` [PATCH v2 00/10] Device tree based NUMA support for Arm - Part#1 Wei Chen
  10 siblings, 1 reply; 38+ messages in thread
From: Wei Chen @ 2022-04-18  9:07 UTC (permalink / raw)
  To: --to=xen-devel, xen-devel
  Cc: nd, Wei Chen, Jan Beulich, Andrew Cooper, Roger Pau Monné, Wei Liu

NUMA node structure "struct node" is using u64 as node memory
range. In order to make other architectures can reuse this
NUMA node relative code, we replace the u64 to paddr_t. And
use pfn_to_paddr and paddr_to_pfn to replace explicit shift
operations. The relate PRIx64 in print messages have been
replaced by PRIpaddr at the same time. And some being-phased-out
types like u64 in the lines we have touched also have been
converted to uint64_t or unsigned long.

Signed-off-by: Wei Chen <wei.chen@arm.com>
---
v1 ->v2:
1. Drop useless cast.
2. Use initializers of the variables.
3. Replace u64 by uint64_t.
4. Use unsigned long for start_pfn and end_pfn.
---
 xen/arch/x86/include/asm/numa.h |  8 ++++----
 xen/arch/x86/numa.c             | 30 +++++++++++++++---------------
 xen/arch/x86/srat.c             | 25 +++++++++++++------------
 3 files changed, 32 insertions(+), 31 deletions(-)

diff --git a/xen/arch/x86/include/asm/numa.h b/xen/arch/x86/include/asm/numa.h
index 6eeae02060..1239f017c2 100644
--- a/xen/arch/x86/include/asm/numa.h
+++ b/xen/arch/x86/include/asm/numa.h
@@ -19,7 +19,7 @@ extern cpumask_t     node_to_cpumask[];
 #define node_to_cpumask(node)    (node_to_cpumask[node])
 
 struct node { 
-	u64 start,end; 
+	paddr_t start, end;
 };
 
 extern int compute_hash_shift(struct node *nodes, int numnodes,
@@ -38,7 +38,7 @@ extern void numa_set_node(int cpu, nodeid_t node);
 extern nodeid_t setup_node(unsigned int pxm);
 extern void srat_detect_node(int cpu);
 
-extern void setup_node_bootmem(nodeid_t nodeid, u64 start, u64 end);
+extern void setup_node_bootmem(nodeid_t nodeid, paddr_t start, paddr_t end);
 extern nodeid_t apicid_to_node[];
 extern void init_cpu_to_node(void);
 
@@ -76,9 +76,9 @@ static inline __attribute__((pure)) nodeid_t phys_to_nid(paddr_t addr)
 				 NODE_DATA(nid)->node_spanned_pages)
 #define arch_have_default_dmazone() (num_online_nodes() > 1)
 
-extern int valid_numa_range(u64 start, u64 end, nodeid_t node);
+extern int valid_numa_range(paddr_t start, paddr_t end, nodeid_t node);
 
-void srat_parse_regions(u64 addr);
+void srat_parse_regions(paddr_t addr);
 extern u8 __node_distance(nodeid_t a, nodeid_t b);
 unsigned int arch_get_dma_bitsize(void);
 
diff --git a/xen/arch/x86/numa.c b/xen/arch/x86/numa.c
index 680b7d9002..2b3a51afd0 100644
--- a/xen/arch/x86/numa.c
+++ b/xen/arch/x86/numa.c
@@ -162,12 +162,12 @@ int __init compute_hash_shift(struct node *nodes, int numnodes,
     return shift;
 }
 /* initialize NODE_DATA given nodeid and start/end */
-void __init setup_node_bootmem(nodeid_t nodeid, u64 start, u64 end)
-{ 
+void __init setup_node_bootmem(nodeid_t nodeid, paddr_t start, paddr_t end)
+{
     unsigned long start_pfn, end_pfn;
 
-    start_pfn = start >> PAGE_SHIFT;
-    end_pfn = end >> PAGE_SHIFT;
+    start_pfn = paddr_to_pfn(start);
+    end_pfn = paddr_to_pfn(end);
 
     NODE_DATA(nodeid)->node_start_pfn = start_pfn;
     NODE_DATA(nodeid)->node_spanned_pages = end_pfn - start_pfn;
@@ -198,11 +198,12 @@ void __init numa_init_array(void)
 static int numa_fake __initdata = 0;
 
 /* Numa emulation */
-static int __init numa_emulation(u64 start_pfn, u64 end_pfn)
+static int __init numa_emulation(unsigned long start_pfn,
+                                 unsigned long end_pfn)
 {
     int i;
     struct node nodes[MAX_NUMNODES];
-    u64 sz = ((end_pfn - start_pfn)<<PAGE_SHIFT) / numa_fake;
+    uint64_t sz = pfn_to_paddr(end_pfn - start_pfn) / numa_fake;
 
     /* Kludge needed for the hash function */
     if ( hweight64(sz) > 1 )
@@ -218,9 +219,9 @@ static int __init numa_emulation(u64 start_pfn, u64 end_pfn)
     memset(&nodes,0,sizeof(nodes));
     for ( i = 0; i < numa_fake; i++ )
     {
-        nodes[i].start = (start_pfn<<PAGE_SHIFT) + i*sz;
+        nodes[i].start = pfn_to_paddr(start_pfn) + i*sz;
         if ( i == numa_fake - 1 )
-            sz = (end_pfn<<PAGE_SHIFT) - nodes[i].start;
+            sz = pfn_to_paddr(end_pfn) - nodes[i].start;
         nodes[i].end = nodes[i].start + sz;
         printk(KERN_INFO "Faking node %d at %"PRIx64"-%"PRIx64" (%"PRIu64"MB)\n",
                i,
@@ -246,6 +247,8 @@ static int __init numa_emulation(u64 start_pfn, u64 end_pfn)
 void __init numa_initmem_init(unsigned long start_pfn, unsigned long end_pfn)
 { 
     int i;
+    paddr_t start = pfn_to_paddr(start_pfn);
+    paddr_t end = pfn_to_paddr(end_pfn);
 
 #ifdef CONFIG_NUMA_EMU
     if ( numa_fake && !numa_emulation(start_pfn, end_pfn) )
@@ -253,17 +256,15 @@ void __init numa_initmem_init(unsigned long start_pfn, unsigned long end_pfn)
 #endif
 
 #ifdef CONFIG_ACPI_NUMA
-    if ( !numa_off && !acpi_scan_nodes((u64)start_pfn << PAGE_SHIFT,
-         (u64)end_pfn << PAGE_SHIFT) )
+    if ( !numa_off && !acpi_scan_nodes(start, end) )
         return;
 #endif
 
     printk(KERN_INFO "%s\n",
            numa_off ? "NUMA turned off" : "No NUMA configuration found");
 
-    printk(KERN_INFO "Faking a node at %016"PRIx64"-%016"PRIx64"\n",
-           (u64)start_pfn << PAGE_SHIFT,
-           (u64)end_pfn << PAGE_SHIFT);
+    printk(KERN_INFO "Faking a node at %"PRIpaddr"-%"PRIpaddr"\n",
+           start, end);
     /* setup dummy node covering all memory */
     memnode_shift = BITS_PER_LONG - 1;
     memnodemap = _memnodemap;
@@ -276,8 +277,7 @@ void __init numa_initmem_init(unsigned long start_pfn, unsigned long end_pfn)
     for ( i = 0; i < nr_cpu_ids; i++ )
         numa_set_node(i, 0);
     cpumask_copy(&node_to_cpumask[0], cpumask_of(0));
-    setup_node_bootmem(0, (u64)start_pfn << PAGE_SHIFT,
-                    (u64)end_pfn << PAGE_SHIFT);
+    setup_node_bootmem(0, start, end);
 }
 
 void numa_add_cpu(int cpu)
diff --git a/xen/arch/x86/srat.c b/xen/arch/x86/srat.c
index cfe24c7e78..c3e13059e9 100644
--- a/xen/arch/x86/srat.c
+++ b/xen/arch/x86/srat.c
@@ -104,7 +104,7 @@ nodeid_t setup_node(unsigned pxm)
 	return node;
 }
 
-int valid_numa_range(u64 start, u64 end, nodeid_t node)
+int valid_numa_range(paddr_t start, paddr_t end, nodeid_t node)
 {
 	int i;
 
@@ -119,7 +119,7 @@ int valid_numa_range(u64 start, u64 end, nodeid_t node)
 	return 0;
 }
 
-static __init int conflicting_memblks(u64 start, u64 end)
+static __init int conflicting_memblks(paddr_t start, paddr_t end)
 {
 	int i;
 
@@ -135,7 +135,7 @@ static __init int conflicting_memblks(u64 start, u64 end)
 	return -1;
 }
 
-static __init void cutoff_node(int i, u64 start, u64 end)
+static __init void cutoff_node(int i, paddr_t start, paddr_t end)
 {
 	struct node *nd = &nodes[i];
 	if (nd->start < start) {
@@ -275,7 +275,7 @@ acpi_numa_processor_affinity_init(const struct acpi_srat_cpu_affinity *pa)
 void __init
 acpi_numa_memory_affinity_init(const struct acpi_srat_mem_affinity *ma)
 {
-	u64 start, end;
+	paddr_t start, end;
 	unsigned pxm;
 	nodeid_t node;
 	int i;
@@ -318,7 +318,7 @@ acpi_numa_memory_affinity_init(const struct acpi_srat_mem_affinity *ma)
 		bool mismatch = !(ma->flags & ACPI_SRAT_MEM_HOT_PLUGGABLE) !=
 		                !test_bit(i, memblk_hotplug);
 
-		printk("%sSRAT: PXM %u (%"PRIx64"-%"PRIx64") overlaps with itself (%"PRIx64"-%"PRIx64")\n",
+		printk("%sSRAT: PXM %u (%"PRIpaddr"-%"PRIpaddr") overlaps with itself (%"PRIpaddr"-%"PRIpaddr")\n",
 		       mismatch ? KERN_ERR : KERN_WARNING, pxm, start, end,
 		       node_memblk_range[i].start, node_memblk_range[i].end);
 		if (mismatch) {
@@ -327,7 +327,7 @@ acpi_numa_memory_affinity_init(const struct acpi_srat_mem_affinity *ma)
 		}
 	} else {
 		printk(KERN_ERR
-		       "SRAT: PXM %u (%"PRIx64"-%"PRIx64") overlaps with PXM %u (%"PRIx64"-%"PRIx64")\n",
+		       "SRAT: PXM %u (%"PRIpaddr"-%"PRIpaddr") overlaps with PXM %u (%"PRIpaddr"-%"PRIpaddr")\n",
 		       pxm, start, end, node_to_pxm(memblk_nodeid[i]),
 		       node_memblk_range[i].start, node_memblk_range[i].end);
 		bad_srat();
@@ -346,7 +346,7 @@ acpi_numa_memory_affinity_init(const struct acpi_srat_mem_affinity *ma)
 				nd->end = end;
 		}
 	}
-	printk(KERN_INFO "SRAT: Node %u PXM %u %"PRIx64"-%"PRIx64"%s\n",
+	printk(KERN_INFO "SRAT: Node %u PXM %u %"PRIpaddr"-%"PRIpaddr"%s\n",
 	       node, pxm, start, end,
 	       ma->flags & ACPI_SRAT_MEM_HOT_PLUGGABLE ? " (hotplug)" : "");
 
@@ -369,7 +369,7 @@ static int __init nodes_cover_memory(void)
 
 	for (i = 0; i < e820.nr_map; i++) {
 		int j, found;
-		unsigned long long start, end;
+		paddr_t start, end;
 
 		if (e820.map[i].type != E820_RAM) {
 			continue;
@@ -396,7 +396,7 @@ static int __init nodes_cover_memory(void)
 
 		if (start < end) {
 			printk(KERN_ERR "SRAT: No PXM for e820 range: "
-				"%016Lx - %016Lx\n", start, end);
+				"%"PRIpaddr" - %"PRIpaddr"\n", start, end);
 			return 0;
 		}
 	}
@@ -432,7 +432,7 @@ static int __init cf_check srat_parse_region(
 	return 0;
 }
 
-void __init srat_parse_regions(u64 addr)
+void __init srat_parse_regions(paddr_t addr)
 {
 	u64 mask;
 	unsigned int i;
@@ -457,7 +457,7 @@ void __init srat_parse_regions(u64 addr)
 }
 
 /* Use the information discovered above to actually set up the nodes. */
-int __init acpi_scan_nodes(u64 start, u64 end)
+int __init acpi_scan_nodes(paddr_t start, paddr_t end)
 {
 	int i;
 	nodemask_t all_nodes_parsed;
@@ -489,7 +489,8 @@ int __init acpi_scan_nodes(u64 start, u64 end)
 	/* Finally register nodes */
 	for_each_node_mask(i, all_nodes_parsed)
 	{
-		u64 size = nodes[i].end - nodes[i].start;
+		paddr_t size = nodes[i].end - nodes[i].start;
+
 		if ( size == 0 )
 			printk(KERN_WARNING "SRAT: Node %u has no memory. "
 			       "BIOS Bug or mis-configured hardware?\n", i);
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH v2 10/10] xen/x86: add detection of memory interleaves for different nodes
  2022-04-18  9:07 [PATCH v2 00/10] Device tree based NUMA support for Arm - Part#1 Wei Chen
                   ` (8 preceding siblings ...)
  2022-04-18  9:07 ` [PATCH v2 09/10] xen/x86: use paddr_t for addresses in NUMA node structure Wei Chen
@ 2022-04-18  9:07 ` Wei Chen
  2022-04-26  9:20   ` Jan Beulich
  2022-04-19  9:29 ` [PATCH v2 00/10] Device tree based NUMA support for Arm - Part#1 Wei Chen
  10 siblings, 1 reply; 38+ messages in thread
From: Wei Chen @ 2022-04-18  9:07 UTC (permalink / raw)
  To: --to=xen-devel, xen-devel
  Cc: nd, Wei Chen, Jan Beulich, Andrew Cooper, Roger Pau Monné, Wei Liu

One NUMA node may contain several memory blocks. In current Xen
code, Xen will maintain a node memory range for each node to cover
all its memory blocks. But here comes the problem, in the gap of
one node's two memory blocks, if there are some memory blocks don't
belong to this node (remote memory blocks). This node's memory range
will be expanded to cover these remote memory blocks.

One node's memory range contains other nodes' memory, this is
obviously not very reasonable. This means current NUMA code only
can support node has no interleaved memory blocks. However, on a
physical machine, the addresses of multiple nodes can be interleaved.

So in this patch, we add code to detect memory interleaves of
different nodes. NUMA initialization will be failed and error
messages will be printed when Xen detect such hardware configuration.

Signed-off-by: Wei Chen <wei.chen@arm.com>
---
v1 ->v2:
1. Update the description to say we're after is no memory
   interleaves of different nodes.
2. Only update node range when it passes the interleave check.
3. Don't use full upper-case for "node".
---
 xen/arch/x86/srat.c | 49 +++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 45 insertions(+), 4 deletions(-)

diff --git a/xen/arch/x86/srat.c b/xen/arch/x86/srat.c
index c3e13059e9..53968e4085 100644
--- a/xen/arch/x86/srat.c
+++ b/xen/arch/x86/srat.c
@@ -271,6 +271,35 @@ acpi_numa_processor_affinity_init(const struct acpi_srat_cpu_affinity *pa)
 		       pxm, pa->apic_id, node);
 }
 
+/*
+ * Check to see if there are other nodes within this node's range.
+ * We just need to check full contains situation. Because overlaps
+ * have been checked before by conflicting_memblks.
+ */
+static bool __init check_node_memory_interleave(nodeid_t nid,
+                                                paddr_t start, paddr_t end)
+{
+	nodeid_t i;
+	const struct node *nd = &nodes[nid];
+
+	for_each_node_mask(i, memory_nodes_parsed)
+	{
+		/* Skip itself */
+		if (i == nid)
+			continue;
+
+		nd = &nodes[i];
+		if (start < nd->start && nd->end < end) {
+			printk(KERN_ERR
+			       "Node %u: (%"PRIpaddr"-%"PRIpaddr") interleaves with node %u (%"PRIpaddr"-%"PRIpaddr")\n",
+			       nid, start, end, i, nd->start, nd->end);
+			return true;
+		}
+	}
+
+	return false;
+}
+
 /* Callback for parsing of the Proximity Domain <-> Memory Area mappings */
 void __init
 acpi_numa_memory_affinity_init(const struct acpi_srat_mem_affinity *ma)
@@ -340,10 +369,22 @@ acpi_numa_memory_affinity_init(const struct acpi_srat_mem_affinity *ma)
 			nd->start = start;
 			nd->end = end;
 		} else {
-			if (start < nd->start)
-				nd->start = start;
-			if (nd->end < end)
-				nd->end = end;
+			paddr_t new_start = nd->start;
+			paddr_t new_end = nd->end;
+
+			if (start < new_start)
+				new_start = start;
+			if (new_end < end)
+				new_end = end;
+
+			/* Check whether new range contains memory for other nodes */
+			if (check_node_memory_interleave(node, new_start, new_end)) {
+				bad_srat();
+				return;
+			}
+
+			nd->start = new_start;
+			nd->end = new_end;
 		}
 	}
 	printk(KERN_INFO "SRAT: Node %u PXM %u %"PRIpaddr"-%"PRIpaddr"%s\n",
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 38+ messages in thread

* RE: [PATCH v2 01/10] xen/arm: Print a 64-bit number in hex from early uart
  2022-04-18  9:07 ` [PATCH v2 01/10] xen/arm: Print a 64-bit number in hex from early uart Wei Chen
@ 2022-04-19  9:13   ` Jiamei Xie
  2022-04-21  7:05     ` Wei Chen
  0 siblings, 1 reply; 38+ messages in thread
From: Jiamei Xie @ 2022-04-19  9:13 UTC (permalink / raw)
  To: Wei Chen, --to=xen-devel, xen-devel
  Cc: nd, Wei Chen, Stefano Stabellini, Julien Grall, Bertrand Marquis,
	Volodymyr Babchuk, Julien Grall

Hi Wei,

> -----Original Message-----
> From: Xen-devel <xen-devel-bounces@lists.xenproject.org> On Behalf Of
> Wei Chen
> Sent: 2022年4月18日 17:07
> To: --to=xen-devel@lists.xenproject.org; xen-devel@lists.xenproject.org
> Cc: nd <nd@arm.com>; Wei Chen <Wei.Chen@arm.com>; Stefano Stabellini
> <sstabellini@kernel.org>; Julien Grall <julien@xen.org>; Bertrand Marquis
> <Bertrand.Marquis@arm.com>; Volodymyr Babchuk
> <Volodymyr_Babchuk@epam.com>; Julien Grall <jgrall@amazon.com>
> Subject: [PATCH v2 01/10] xen/arm: Print a 64-bit number in hex from early
> uart
> 
> Current putn function that is using for early print
> only can print low 32-bit of AArch64 register. This
> will lose some important messages while debugging
> with early console. For example:
> (XEN) Bringing up CPU5
> - CPU 0000000100000100 booting -
> Will be truncated to
> (XEN) Bringing up CPU5
> - CPU 00000100 booting -
> 
> In this patch, we increased the print loops and shift
> bits to make putn print 64-bit number.
> 
> Signed-off-by: Wei Chen <wei.chen@arm.com>
> Acked-by: Julien Grall <jgrall@amazon.com>
> ---
>  xen/arch/arm/arm64/head.S | 12 +++++++-----
>  1 file changed, 7 insertions(+), 5 deletions(-)
> 
> diff --git a/xen/arch/arm/arm64/head.S b/xen/arch/arm/arm64/head.S
> index e62c48ec1c..2bb7906f69 100644
> --- a/xen/arch/arm/arm64/head.S
> +++ b/xen/arch/arm/arm64/head.S
> @@ -866,17 +866,19 @@ puts:
>          ret
>  ENDPROC(puts)
> 
> -/* Print a 32-bit number in hex.  Specific to the PL011 UART.
> +/*
> + * Print a 64-bit number in hex.
>   * x0: Number to print.
>   * x23: Early UART base address
> - * Clobbers x0-x3 */
> + * Clobbers x0-x3
> + */
>  putn:
>          adr   x1, hex
> -        mov   x3, #8
> +        mov   x3, #16
>  1:
>          early_uart_ready x23, 2
> -        and   x2, x0, #0xf0000000    /* Mask off the top nybble */
> -        lsr   x2, x2, #28
> +        and   x2, x0, #(0xf<<60)     /* Mask off the top nybble */
> +        lsr   x2, x2, #60
>          ldrb  w2, [x1, x2]           /* Convert to a char */
>          early_uart_transmit x23, w2
>          lsl   x0, x0, #4             /* Roll it through one nybble at a time */
> --
> 2.25.1
> 

I have tested the whole patch series on Armv8A(config without NUMA) and X86(config with NUMA), both can enter Dom0 successfully and the X86 NUMA works normally.

Tested-by: Jiamei Xie <jiamei.xie@arm.com> 

Regards,
Jiamei Xie


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v2 06/10] xen: introduce an arch helper for default dma zone status
  2022-04-18  9:07 ` [PATCH v2 06/10] xen: introduce an arch helper for default dma zone status Wei Chen
@ 2022-04-19  9:18   ` Jan Beulich
  2022-04-21  7:03     ` Wei Chen
  0 siblings, 1 reply; 38+ messages in thread
From: Jan Beulich @ 2022-04-19  9:18 UTC (permalink / raw)
  To: Wei Chen
  Cc: nd, Stefano Stabellini, Julien Grall, Bertrand Marquis,
	Volodymyr Babchuk, Andrew Cooper, George Dunlap, Wei Liu,
	Roger Pau Monné,
	xen-devel

On 18.04.2022 11:07, Wei Chen wrote:
> --- a/xen/common/page_alloc.c
> +++ b/xen/common/page_alloc.c
> @@ -1889,7 +1889,7 @@ void __init end_boot_allocator(void)
>      }
>      nr_bootmem_regions = 0;
>  
> -    if ( !dma_bitsize && (num_online_nodes() > 1) )
> +    if ( !dma_bitsize && arch_have_default_dmazone() )
>          dma_bitsize = arch_get_dma_bitsize();

Considering its purpose, may I suggest "want" instead of "have" in the
hook name?

Jan



^ permalink raw reply	[flat|nested] 38+ messages in thread

* RE: [PATCH v2 00/10] Device tree based NUMA support for Arm - Part#1
  2022-04-18  9:07 [PATCH v2 00/10] Device tree based NUMA support for Arm - Part#1 Wei Chen
                   ` (9 preceding siblings ...)
  2022-04-18  9:07 ` [PATCH v2 10/10] xen/x86: add detection of memory interleaves for different nodes Wei Chen
@ 2022-04-19  9:29 ` Wei Chen
  10 siblings, 0 replies; 38+ messages in thread
From: Wei Chen @ 2022-04-19  9:29 UTC (permalink / raw)
  To: Wei Chen, xen-devel; +Cc: nd, Stefano Stabellini, Julien Grall, Jan Beulich

Hi,

I am sorry, I had added a wrong address--to=xen-devel@lists.xenproject.org
in this series' recipient address by mistake. When you reply to this series,
please remember to remove it. I'm sorry for the inconvenience again!

Cheers,
Wei Chen

> -----Original Message-----
> From: Wei Chen <wei.chen@arm.com>
> Sent: 2022年4月18日 17:07
> To: --to=xen-devel@lists.xenproject.org; xen-devel@lists.xenproject.org
> Cc: nd <nd@arm.com>; Wei Chen <Wei.Chen@arm.com>; Stefano Stabellini
> <sstabellini@kernel.org>; Julien Grall <julien@xen.org>; Jan Beulich
> <jbeulich@suse.com>
> Subject: [PATCH v2 00/10] Device tree based NUMA support for Arm - Part#1
> 
> (The Arm device tree based NUMA support patch set contains 35
> patches. In order to make stuff easier for reviewers, I split
> them into 3 parts:
> 1. Preparation. I have re-sorted the patch series. And moved
>    independent patches to the head of the series.
> 2. Move generically usable code from x86 to common.
> 3. Add new code to support Arm.
> 
> This series only contains the first part patches.
> The full patch list can be found in this cover letter.)
> 
> Xen memory allocation and scheduler modules are NUMA aware.
> But actually, on x86 has implemented the architecture APIs
> to support NUMA. Arm was providing a set of fake architecture
> APIs to make it compatible with NUMA awared memory allocation
> and scheduler.
> 
> Arm system was working well as a single node NUMA system with
> these fake APIs, because we didn't have multiple nodes NUMA
> system on Arm. But in recent years, more and more Arm devices
> support multiple nodes NUMA system.
> 
> So now we have a new problem. When Xen is running on these Arm
> devices, Xen still treat them as single node SMP systems. The
> NUMA affinity capability of Xen memory allocation and scheduler
> becomes meaningless. Because they rely on input data that does
> not reflect real NUMA layout.
> 
> Xen still think the access time for all of the memory is the
> same for all CPUs. However, Xen may allocate memory to a VM
> from different NUMA nodes with different access speeds. This
> difference can be amplified in workloads inside VM, causing
> performance instability and timeouts.
> 
> So in this patch series, we implement a set of NUMA API to use
> device tree to describe the NUMA layout. We reuse most of the
> code of x86 NUMA to create and maintain the mapping between
> memory and CPU, create the matrix between any two NUMA nodes.
> Except ACPI and some x86 specified code, we have moved other
> code to common. In next stage, when we implement ACPI based
> NUMA for Arm64, we may move the ACPI NUMA code to common too,
> but in current stage, we keep it as x86 only.
> 
> This patch serires has been tested and booted well on one
> Arm64 NUMA machine and one HPE x86 NUMA machine.
> 
> ---
> Part1 v1->v2:
> 1. Move independent patches from later to early of this series.
> 2. Drop the copy of EFI stub.c from Arm. Share common codes of
>    x86 EFI stub for Arm.
> 3. Use CONFIG_ARM_EFI to replace CONFIG_EFI and remove help text
>    and make CONFIG_ARM_EFI invisible.
> 4. Use ASSERT to replace VIRTUAL_BUG_ON in phys_to_nid.
> 5. Move MAX_NUMNODES from xen/numa.h to asm/numa.h for x86.
> 6. Extend the description of Arm's workaround for reserve DMA
>    allocations to avoid the same discussion every time for
>    arch_have_default_dmazone.
> 7. Update commit messages.
> 
> Wei Chen (10):
>   xen/arm: Print a 64-bit number in hex from early uart
>   xen/x86: move reusable EFI stub functions from x86 to common
>   xen/arm: add CONFIG_ARM_EFI to stub EFI API
>   xen/arm: Keep memory nodes in device tree when Xen boots from EFI
>   xen/x86: Use ASSERT instead of VIRTUAL_BUG_ON for phys_to_nid
>   xen: introduce an arch helper for default dma zone status
>   xen: decouple NUMA from ACPI in Kconfig
>   xen/arm: use !CONFIG_NUMA to keep fake NUMA API
>   xen/x86: use paddr_t for addresses in NUMA node structure
>   xen/x86: add detection of memory interleaves for different nodes
> 
> PART2:
>   xen/x86: introduce a helper to update memory hotplug end
>   xen/x86: Use enumerations to indicate NUMA status
>   xen/x86: move generically usable NUMA code from x86 to common
>   xen/x86: use arch_get_memory_map to get information from E820 map
>   xen/x86: move NUMA scan nodes codes from x86 to common
>   xen/x86: rename bad_srat to numa_bad
> 
> PART3:
>   xen: rename acpi_scan_nodes to numa_scan_nodes
>   xen: introduce a Kconfig option to configure NUMA nodes number
>   xen/arm: use NR_MEM_BANKS to override default NR_NODE_MEMBLKS
>   xen/arm: implement helpers to get and update NUMA status
>   xen/arm: implement node distance helpers for Arm
>   xen/arm: use arch_get_memory_map to memory bank from bootinfo
>   xen/arm: build NUMA cpu_to_node map in dt_smp_init_cpus
>   xen/arm: Add boot and secondary CPU to NUMA system
>   xen/arm: stub mem_hotplug_update_boundary for Arm
>   xen/arm: introduce a helper to parse device tree processor node
>   xen/arm: introduce a helper to parse device tree memory node
>   xen/arm: introduce a helper to parse device tree NUMA distance map
>   xen/arm: unified entry to parse all NUMA data from device tree
>   xen/arm: keep guest still be NUMA unware
>   xen/arm: enable device tree based NUMA in system init
>   xen/arm: implement a dummy 1:1 node to pxm mapping
>   xen/arm: use CONFIG_NUMA to gate node_online_map in smpboot
>   xen/arm: Provide Kconfig options for Arm to enable NUMA
>   docs: update numa command line to support Arm
> 
>  xen/arch/arm/Kconfig                    |  5 ++
>  xen/arch/arm/Makefile                   |  2 +-
>  xen/arch/arm/arm64/head.S               | 12 ++--
>  xen/arch/arm/bootfdt.c                  |  8 ++-
>  xen/arch/arm/efi/Makefile               |  5 ++
>  xen/arch/arm/efi/efi-boot.h             | 25 ---------
>  xen/arch/arm/include/asm/numa.h         |  6 ++
>  xen/arch/x86/Kconfig                    |  2 +-
>  xen/arch/x86/efi/Makefile               |  4 +-
>  xen/arch/x86/efi/{stub.c => stub-x86.c} | 37 -------------
>  xen/arch/x86/include/asm/config.h       |  1 -
>  xen/arch/x86/include/asm/numa.h         | 15 ++---
>  xen/arch/x86/numa.c                     | 30 +++++-----
>  xen/arch/x86/srat.c                     | 74 +++++++++++++++++++------
>  xen/common/Kconfig                      |  3 +
>  xen/common/efi/stub.c                   | 38 +++++++++++++
>  xen/common/page_alloc.c                 |  2 +-
>  xen/drivers/acpi/Kconfig                |  3 +-
>  xen/drivers/acpi/Makefile               |  2 +-
>  xen/include/xen/numa.h                  |  2 +
>  20 files changed, 162 insertions(+), 114 deletions(-)
>  rename xen/arch/x86/efi/{stub.c => stub-x86.c} (71%)
>  create mode 100644 xen/common/efi/stub.c
> 
> --
> 2.25.1


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v2 02/10] xen/x86: move reusable EFI stub functions from x86 to common
  2022-04-18  9:07 ` [PATCH v2 02/10] xen/x86: move reusable EFI stub functions from x86 to common Wei Chen
@ 2022-04-21  0:17   ` Stefano Stabellini
  2022-04-26  8:53   ` Jan Beulich
  1 sibling, 0 replies; 38+ messages in thread
From: Stefano Stabellini @ 2022-04-21  0:17 UTC (permalink / raw)
  To: Wei Chen
  Cc: xen-devel, nd, Jan Beulich, Andrew Cooper, Roger Pau Monné, Wei Liu

On Mon, 18 Apr 2022, Wei Chen wrote:
> Most of the functions in x86 EFI stub.c can be reused for other
> architectures. So we move them to common and keep the x86 specific
> function in stub-x86.c.
> 
> Signed-off-by: Wei Chen <wei.chen@arm.com>

Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>


> ---
> v1 -> v2:
> 1. Drop the copy of stub.c from Arm EFI.
> 2. Share common codes of x86 EFI stub for other architectures.
> ---
>  xen/arch/x86/efi/Makefile               |  4 +--
>  xen/arch/x86/efi/{stub.c => stub-x86.c} | 37 ------------------------
>  xen/common/efi/stub.c                   | 38 +++++++++++++++++++++++++
>  3 files changed, 40 insertions(+), 39 deletions(-)
>  rename xen/arch/x86/efi/{stub.c => stub-x86.c} (71%)
>  create mode 100644 xen/common/efi/stub.c
> 
> diff --git a/xen/arch/x86/efi/Makefile b/xen/arch/x86/efi/Makefile
> index 034ec87895..5ca3a0b4a4 100644
> --- a/xen/arch/x86/efi/Makefile
> +++ b/xen/arch/x86/efi/Makefile
> @@ -11,8 +11,8 @@ $(obj)/boot.init.o: $(obj)/buildid.o
>  $(call cc-option-add,cflags-stack-boundary,CC,-mpreferred-stack-boundary=4)
>  $(addprefix $(obj)/,$(EFIOBJ-y)): CFLAGS_stack_boundary := $(cflags-stack-boundary)
>  
> -obj-y := stub.o
> +obj-y := stub.o stub-x86.o
>  obj-$(XEN_BUILD_EFI) := $(filter-out %.init.o,$(EFIOBJ-y))
>  obj-bin-$(XEN_BUILD_EFI) := $(filter %.init.o,$(EFIOBJ-y))
>  extra-$(XEN_BUILD_EFI) += buildid.o relocs-dummy.o
> -nocov-$(XEN_BUILD_EFI) += stub.o
> +nocov-$(XEN_BUILD_EFI) += stub.o stub-x86.o
> diff --git a/xen/arch/x86/efi/stub.c b/xen/arch/x86/efi/stub-x86.c
> similarity index 71%
> rename from xen/arch/x86/efi/stub.c
> rename to xen/arch/x86/efi/stub-x86.c
> index 9984932626..2cd5c8d4dc 100644
> --- a/xen/arch/x86/efi/stub.c
> +++ b/xen/arch/x86/efi/stub-x86.c
> @@ -1,7 +1,5 @@
>  #include <xen/efi.h>
> -#include <xen/errno.h>
>  #include <xen/init.h>
> -#include <xen/lib.h>
>  #include <asm/asm_defns.h>
>  #include <asm/efibind.h>
>  #include <asm/page.h>
> @@ -45,11 +43,6 @@ void __init noreturn efi_multiboot2(EFI_HANDLE ImageHandle,
>      unreachable();
>  }
>  
> -bool efi_enabled(unsigned int feature)
> -{
> -    return false;
> -}
> -
>  void __init efi_init_memory(void) { }
>  
>  bool efi_boot_mem_unused(unsigned long *start, unsigned long *end)
> @@ -61,33 +54,3 @@ bool efi_boot_mem_unused(unsigned long *start, unsigned long *end)
>  }
>  
>  void efi_update_l4_pgtable(unsigned int l4idx, l4_pgentry_t l4e) { }
> -
> -bool efi_rs_using_pgtables(void)
> -{
> -    return false;
> -}
> -
> -unsigned long efi_get_time(void)
> -{
> -    BUG();
> -    return 0;
> -}
> -
> -void efi_halt_system(void) { }
> -void efi_reset_system(bool warm) { }
> -
> -int efi_get_info(uint32_t idx, union xenpf_efi_info *info)
> -{
> -    return -ENOSYS;
> -}
> -
> -int efi_compat_get_info(uint32_t idx, union compat_pf_efi_info *)
> -    __attribute__((__alias__("efi_get_info")));
> -
> -int efi_runtime_call(struct xenpf_efi_runtime_call *op)
> -{
> -    return -ENOSYS;
> -}
> -
> -int efi_compat_runtime_call(struct compat_pf_efi_runtime_call *)
> -    __attribute__((__alias__("efi_runtime_call")));
> diff --git a/xen/common/efi/stub.c b/xen/common/efi/stub.c
> new file mode 100644
> index 0000000000..6e4f4de9af
> --- /dev/null
> +++ b/xen/common/efi/stub.c
> @@ -0,0 +1,38 @@
> +#include <xen/efi.h>
> +#include <xen/errno.h>
> +#include <xen/lib.h>
> +
> +bool efi_enabled(unsigned int feature)
> +{
> +    return false;
> +}
> +
> +bool efi_rs_using_pgtables(void)
> +{
> +    return false;
> +}
> +
> +unsigned long efi_get_time(void)
> +{
> +    BUG();
> +    return 0;
> +}
> +
> +void efi_halt_system(void) { }
> +void efi_reset_system(bool warm) { }
> +
> +int efi_get_info(uint32_t idx, union xenpf_efi_info *info)
> +{
> +    return -ENOSYS;
> +}
> +
> +int efi_compat_get_info(uint32_t idx, union compat_pf_efi_info *)
> +    __attribute__((__alias__("efi_get_info")));
> +
> +int efi_runtime_call(struct xenpf_efi_runtime_call *op)
> +{
> +    return -ENOSYS;
> +}
> +
> +int efi_compat_runtime_call(struct compat_pf_efi_runtime_call *)
> +    __attribute__((__alias__("efi_runtime_call")));
> -- 
> 2.25.1
> 
> 


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v2 03/10] xen/arm: add CONFIG_ARM_EFI to stub EFI API
  2022-04-18  9:07 ` [PATCH v2 03/10] xen/arm: add CONFIG_ARM_EFI to stub EFI API Wei Chen
@ 2022-04-21  0:25   ` Stefano Stabellini
  2022-04-21  7:01     ` Wei Chen
  0 siblings, 1 reply; 38+ messages in thread
From: Stefano Stabellini @ 2022-04-21  0:25 UTC (permalink / raw)
  To: Wei Chen
  Cc: xen-devel, nd, Stefano Stabellini, Julien Grall,
	Bertrand Marquis, Volodymyr Babchuk

On Mon, 18 Apr 2022, Wei Chen wrote:
> x86 is using compiler feature testing to decide EFI
> build enable or not. When EFI build is disabled, x86
> will use a efi/stub.c file to replace efi/runtime.c
> for build objects. Following this idea, we introduce
> a stub file for Arm, but use CONFIG_ARM_EFI to decide
> EFI build enable or not.
> 
> Signed-off-by: Wei Chen <wei.chen@arm.com>
> ---
> v1 -> v2:
> 1. Use CONFIG_ARM_EFI to replace CONFIG_EFI
> 2. Remove help text and make CONFIG_ARM_EFI invisible.
> 3. Merge one following patch:
>    xen/arm: introduce a stub file for non-EFI architectures
> 4. Use the common stub.c instead of creating new one.
> ---
>  xen/arch/arm/Kconfig      | 5 +++++
>  xen/arch/arm/Makefile     | 2 +-
>  xen/arch/arm/efi/Makefile | 5 +++++
>  3 files changed, 11 insertions(+), 1 deletion(-)
> 
> diff --git a/xen/arch/arm/Kconfig b/xen/arch/arm/Kconfig
> index ecfa6822e4..5f1b0121b0 100644
> --- a/xen/arch/arm/Kconfig
> +++ b/xen/arch/arm/Kconfig
> @@ -6,6 +6,7 @@ config ARM_64
>  	def_bool y
>  	depends on !ARM_32
>  	select 64BIT
> +	select ARM_EFI
>  	select HAS_FAST_MULTIPLY
>  
>  config ARM
> @@ -33,6 +34,10 @@ config ACPI
>  	  Advanced Configuration and Power Interface (ACPI) support for Xen is
>  	  an alternative to device tree on ARM64.
>  
> +config ARM_EFI
> +	bool
> +	depends on ARM_64

As ARM_EFI is selected by ARM_64 and it cannot be enable via a menu (it
is hidden) there is no need for the "depends" line here


>  config GICV3
>  	bool "GICv3 driver"
>  	depends on ARM_64 && !NEW_VGIC
> diff --git a/xen/arch/arm/Makefile b/xen/arch/arm/Makefile
> index 1d862351d1..bb7a6151c1 100644
> --- a/xen/arch/arm/Makefile
> +++ b/xen/arch/arm/Makefile
> @@ -1,6 +1,5 @@
>  obj-$(CONFIG_ARM_32) += arm32/
>  obj-$(CONFIG_ARM_64) += arm64/
> -obj-$(CONFIG_ARM_64) += efi/
>  obj-$(CONFIG_ACPI) += acpi/
>  obj-$(CONFIG_HAS_PCI) += pci/
>  ifneq ($(CONFIG_NO_PLAT),y)
> @@ -20,6 +19,7 @@ obj-y += domain.o
>  obj-y += domain_build.init.o
>  obj-y += domctl.o
>  obj-$(CONFIG_EARLY_PRINTK) += early_printk.o
> +obj-y += efi/
>  obj-y += gic.o
>  obj-y += gic-v2.o
>  obj-$(CONFIG_GICV3) += gic-v3.o
> diff --git a/xen/arch/arm/efi/Makefile b/xen/arch/arm/efi/Makefile
> index 4313c39066..75ef180233 100644
> --- a/xen/arch/arm/efi/Makefile
> +++ b/xen/arch/arm/efi/Makefile
> @@ -1,4 +1,9 @@
>  include $(srctree)/common/efi/efi-common.mk
>  
> +ifeq ($(CONFIG_ARM_EFI),y)
>  obj-y += $(EFIOBJ-y)
>  obj-$(CONFIG_ACPI) +=  efi-dom0.init.o
> +else
> +EFIOBJ-y += stub.o
> +obj-y += stub.o

The change to EFIOBJ-y is not needed here as EFIOBJ-y is not used.
Only obj-y += stub.o should be enough

> +endif
> -- 
> 2.25.1
> 


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v2 04/10] xen/arm: Keep memory nodes in device tree when Xen boots from EFI
  2022-04-18  9:07 ` [PATCH v2 04/10] xen/arm: Keep memory nodes in device tree when Xen boots from EFI Wei Chen
@ 2022-04-21  0:30   ` Stefano Stabellini
  0 siblings, 0 replies; 38+ messages in thread
From: Stefano Stabellini @ 2022-04-21  0:30 UTC (permalink / raw)
  To: Wei Chen
  Cc: xen-devel, nd, Stefano Stabellini, Julien Grall,
	Bertrand Marquis, Volodymyr Babchuk

On Mon, 18 Apr 2022, Wei Chen wrote:
> In current code, when Xen is booting from EFI, it will delete
> all memory nodes in device tree. This would work well in current
> stage, because Xen can get memory map from EFI system table.
> However, EFI system table cannot completely replace memory nodes
> of device tree. EFI system table doesn't contain memory NUMA
> information. Xen depends on ACPI SRAT or device tree memory nodes
> to parse memory blocks' NUMA mapping. So in EFI + DTB boot, Xen
> doesn't have any method to get numa-node-id for memory blocks any
> more. This makes device tree based NUMA support become impossible
> for Xen in EFI + DTB boot.
> 
> So in this patch, we will keep memory nodes in device tree for
> NUMA code to parse memory numa-node-id later.
> 
> As a side effect, if we still parse boot memory information in
> early_scan_node, bootmem.info will calculate memory ranges in
> memory nodes twice. So we have to prevent early_scan_node to
> parse memory nodes in EFI boot.
> 
> Signed-off-by: Wei Chen <wei.chen@arm.com>

Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>


> ---
> v1 -> v2:
> 1. Move this patch from later to early of this series.
> 2. Refine commit message.
> ---
>  xen/arch/arm/bootfdt.c      |  8 +++++++-
>  xen/arch/arm/efi/efi-boot.h | 25 -------------------------
>  2 files changed, 7 insertions(+), 26 deletions(-)
> 
> diff --git a/xen/arch/arm/bootfdt.c b/xen/arch/arm/bootfdt.c
> index e318ef9603..78e10c6ebc 100644
> --- a/xen/arch/arm/bootfdt.c
> +++ b/xen/arch/arm/bootfdt.c
> @@ -11,6 +11,7 @@
>  #include <xen/lib.h>
>  #include <xen/kernel.h>
>  #include <xen/init.h>
> +#include <xen/efi.h>
>  #include <xen/device_tree.h>
>  #include <xen/libfdt/libfdt.h>
>  #include <xen/sort.h>
> @@ -370,7 +371,12 @@ static int __init early_scan_node(const void *fdt,
>  {
>      int rc = 0;
>  
> -    if ( device_tree_node_matches(fdt, node, "memory") )
> +    /*
> +     * If Xen has been booted via UEFI, the memory banks are
> +     * populated. So we should skip the parsing.
> +     */
> +    if ( !efi_enabled(EFI_BOOT) &&
> +         device_tree_node_matches(fdt, node, "memory") )
>          rc = process_memory_node(fdt, node, name, depth,
>                                   address_cells, size_cells, &bootinfo.mem);
>      else if ( depth == 1 && !dt_node_cmp(name, "reserved-memory") )
> diff --git a/xen/arch/arm/efi/efi-boot.h b/xen/arch/arm/efi/efi-boot.h
> index e452b687d8..59d93c24a1 100644
> --- a/xen/arch/arm/efi/efi-boot.h
> +++ b/xen/arch/arm/efi/efi-boot.h
> @@ -231,33 +231,8 @@ EFI_STATUS __init fdt_add_uefi_nodes(EFI_SYSTEM_TABLE *sys_table,
>      int status;
>      u32 fdt_val32;
>      u64 fdt_val64;
> -    int prev;
>      int num_rsv;
>  
> -    /*
> -     * Delete any memory nodes present.  The EFI memory map is the only
> -     * memory description provided to Xen.
> -     */
> -    prev = 0;
> -    for (;;)
> -    {
> -        const char *type;
> -        int len;
> -
> -        node = fdt_next_node(fdt, prev, NULL);
> -        if ( node < 0 )
> -            break;
> -
> -        type = fdt_getprop(fdt, node, "device_type", &len);
> -        if ( type && strncmp(type, "memory", len) == 0 )
> -        {
> -            fdt_del_node(fdt, node);
> -            continue;
> -        }
> -
> -        prev = node;
> -    }
> -
>     /*
>      * Delete all memory reserve map entries. When booting via UEFI,
>      * kernel will use the UEFI memory map to find reserved regions.
> -- 
> 2.25.1
> 


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v2 05/10] xen/x86: Use ASSERT instead of VIRTUAL_BUG_ON for phys_to_nid
  2022-04-18  9:07 ` [PATCH v2 05/10] xen/x86: Use ASSERT instead of VIRTUAL_BUG_ON for phys_to_nid Wei Chen
@ 2022-04-21  0:37   ` Stefano Stabellini
  2022-04-26  9:02   ` Jan Beulich
  1 sibling, 0 replies; 38+ messages in thread
From: Stefano Stabellini @ 2022-04-21  0:37 UTC (permalink / raw)
  To: Wei Chen
  Cc: xen-devel, nd, Jan Beulich, Andrew Cooper, Roger Pau Monné,
	Wei Liu, George Dunlap, Julien Grall, Stefano Stabellini

On Mon, 18 Apr 2022, Wei Chen wrote:
> VIRTUAL_BUG_ON is an empty macro used in phys_to_nid. This
> results in two lines of error-checking code in phys_to_nid
> that is not actually working and causing two compilation
> errors:
> 1. error: "MAX_NUMNODES" undeclared (first use in this function).
>    This is because in the common header file, "MAX_NUMNODES" is
>    defined after the common header file includes the ARCH header
>    file, where phys_to_nid has attempted to use "MAX_NUMNODES".
>    This error was resolved when we moved the definition of
>    "MAX_NUMNODES" to x86 ARCH header file. And we reserve the
>    "MAX_NUMNODES" definition in common header file through a
>    conditional compilation for some architectures that don't
>    need to define "MAX_NUMNODES" in their ARCH header files.
> 2. error: wrong type argument to unary exclamation mark.
>    This is because, the error-checking code contains !node_data[nid].
>    But node_data is a data structure variable, it's not a pointer.
> 
> So, in this patch, we use ASSERT instead of VIRTUAL_BUG_ON to
> enable the two lines of error-checking code. And fix the left
> compilation errors by replacing !node_data[nid] to
> !node_data[nid].node_spanned_pages.
> 
> Because when node_spanned_pages is 0, this node has no memory,
> numa_scan_node will print warning message for such kind of nodes:
> "Firmware Bug or mis-configured hardware?".
>
> Signed-off-by: Wei Chen <wei.chen@arm.com>

This patch looks OK to me but the x86 changes would benefit from a
review from one of the x86 maintainers


> ---
> v1 -> v2:
> 1. Use ASSERT to replace VIRTUAL_BUG_ON in phys_to_nid.
> 2. Adjust the conditional express for ASSERT.
> 3. Move MAX_NUMNODES from xen/numa.h to asm/numa.h for x86.
> 4. Use conditional macro to gate MAX_NUMNODES for other architectures.
> ---
>  xen/arch/x86/include/asm/numa.h | 6 +++---
>  xen/include/xen/numa.h          | 2 ++
>  2 files changed, 5 insertions(+), 3 deletions(-)
> 
> diff --git a/xen/arch/x86/include/asm/numa.h b/xen/arch/x86/include/asm/numa.h
> index bada2c0bb9..1f268ce77d 100644
> --- a/xen/arch/x86/include/asm/numa.h
> +++ b/xen/arch/x86/include/asm/numa.h
> @@ -4,6 +4,7 @@
>  #include <xen/cpumask.h>
>  
>  #define NODES_SHIFT 6
> +#define MAX_NUMNODES    (1 << NODES_SHIFT)
>  
>  typedef u8 nodeid_t;
>  
> @@ -26,7 +27,6 @@ extern int compute_hash_shift(struct node *nodes, int numnodes,
>  extern nodeid_t pxm_to_node(unsigned int pxm);
>  
>  #define ZONE_ALIGN (1UL << (MAX_ORDER+PAGE_SHIFT))
> -#define VIRTUAL_BUG_ON(x) 
>  
>  extern void numa_add_cpu(int cpu);
>  extern void numa_init_array(void);
> @@ -62,9 +62,9 @@ extern struct node_data node_data[];
>  static inline __attribute__((pure)) nodeid_t phys_to_nid(paddr_t addr)
>  { 
>  	nodeid_t nid;
> -	VIRTUAL_BUG_ON((paddr_to_pdx(addr) >> memnode_shift) >= memnodemapsize);
> +	ASSERT((paddr_to_pdx(addr) >> memnode_shift) < memnodemapsize);
>  	nid = memnodemap[paddr_to_pdx(addr) >> memnode_shift]; 
> -	VIRTUAL_BUG_ON(nid >= MAX_NUMNODES || !node_data[nid]); 
> +	ASSERT(nid < MAX_NUMNODES && node_data[nid].node_spanned_pages);
>  	return nid; 
>  } 
>  
> diff --git a/xen/include/xen/numa.h b/xen/include/xen/numa.h
> index 7aef1a88dc..91b25c5617 100644
> --- a/xen/include/xen/numa.h
> +++ b/xen/include/xen/numa.h
> @@ -10,7 +10,9 @@
>  #define NUMA_NO_NODE     0xFF
>  #define NUMA_NO_DISTANCE 0xFF
>  
> +#ifndef MAX_NUMNODES
>  #define MAX_NUMNODES    (1 << NODES_SHIFT)
> +#endif
>  
>  #define vcpu_to_node(v) (cpu_to_node((v)->processor))
>  
> -- 
> 2.25.1
> 


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v2 03/10] xen/arm: add CONFIG_ARM_EFI to stub EFI API
  2022-04-21  0:25   ` Stefano Stabellini
@ 2022-04-21  7:01     ` Wei Chen
  2022-04-21 21:35       ` Stefano Stabellini
  0 siblings, 1 reply; 38+ messages in thread
From: Wei Chen @ 2022-04-21  7:01 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: xen-devel, nd, Julien Grall, Bertrand Marquis, Volodymyr Babchuk

Hi Stefano,

On 2022/4/21 8:25, Stefano Stabellini wrote:
> On Mon, 18 Apr 2022, Wei Chen wrote:
>> x86 is using compiler feature testing to decide EFI
>> build enable or not. When EFI build is disabled, x86
>> will use a efi/stub.c file to replace efi/runtime.c
>> for build objects. Following this idea, we introduce
>> a stub file for Arm, but use CONFIG_ARM_EFI to decide
>> EFI build enable or not.
>>
>> Signed-off-by: Wei Chen <wei.chen@arm.com>
>> ---
>> v1 -> v2:
>> 1. Use CONFIG_ARM_EFI to replace CONFIG_EFI
>> 2. Remove help text and make CONFIG_ARM_EFI invisible.
>> 3. Merge one following patch:
>>     xen/arm: introduce a stub file for non-EFI architectures
>> 4. Use the common stub.c instead of creating new one.
>> ---
>>   xen/arch/arm/Kconfig      | 5 +++++
>>   xen/arch/arm/Makefile     | 2 +-
>>   xen/arch/arm/efi/Makefile | 5 +++++
>>   3 files changed, 11 insertions(+), 1 deletion(-)
>>
>> diff --git a/xen/arch/arm/Kconfig b/xen/arch/arm/Kconfig
>> index ecfa6822e4..5f1b0121b0 100644
>> --- a/xen/arch/arm/Kconfig
>> +++ b/xen/arch/arm/Kconfig
>> @@ -6,6 +6,7 @@ config ARM_64
>>   	def_bool y
>>   	depends on !ARM_32
>>   	select 64BIT
>> +	select ARM_EFI
>>   	select HAS_FAST_MULTIPLY
>>   
>>   config ARM
>> @@ -33,6 +34,10 @@ config ACPI
>>   	  Advanced Configuration and Power Interface (ACPI) support for Xen is
>>   	  an alternative to device tree on ARM64.
>>   
>> +config ARM_EFI
>> +	bool
>> +	depends on ARM_64
> 
> As ARM_EFI is selected by ARM_64 and it cannot be enable via a menu (it
> is hidden) there is no need for the "depends" line here
> 

Ok, I will remove it.

> 
>>   config GICV3
>>   	bool "GICv3 driver"
>>   	depends on ARM_64 && !NEW_VGIC
>> diff --git a/xen/arch/arm/Makefile b/xen/arch/arm/Makefile
>> index 1d862351d1..bb7a6151c1 100644
>> --- a/xen/arch/arm/Makefile
>> +++ b/xen/arch/arm/Makefile
>> @@ -1,6 +1,5 @@
>>   obj-$(CONFIG_ARM_32) += arm32/
>>   obj-$(CONFIG_ARM_64) += arm64/
>> -obj-$(CONFIG_ARM_64) += efi/
>>   obj-$(CONFIG_ACPI) += acpi/
>>   obj-$(CONFIG_HAS_PCI) += pci/
>>   ifneq ($(CONFIG_NO_PLAT),y)
>> @@ -20,6 +19,7 @@ obj-y += domain.o
>>   obj-y += domain_build.init.o
>>   obj-y += domctl.o
>>   obj-$(CONFIG_EARLY_PRINTK) += early_printk.o
>> +obj-y += efi/
>>   obj-y += gic.o
>>   obj-y += gic-v2.o
>>   obj-$(CONFIG_GICV3) += gic-v3.o
>> diff --git a/xen/arch/arm/efi/Makefile b/xen/arch/arm/efi/Makefile
>> index 4313c39066..75ef180233 100644
>> --- a/xen/arch/arm/efi/Makefile
>> +++ b/xen/arch/arm/efi/Makefile
>> @@ -1,4 +1,9 @@
>>   include $(srctree)/common/efi/efi-common.mk
>>   
>> +ifeq ($(CONFIG_ARM_EFI),y)
>>   obj-y += $(EFIOBJ-y)
>>   obj-$(CONFIG_ACPI) +=  efi-dom0.init.o
>> +else
>> +EFIOBJ-y += stub.o
>> +obj-y += stub.o
> 
> The change to EFIOBJ-y is not needed here as EFIOBJ-y is not used.
> Only obj-y += stub.o should be enough
> 

I add stub.o to EFIOBJ-y that's because I want to use the clean-files
in efi-common.mk. Otherwise the link of stub.c in arm/efi will not
be cleaned in "make clean".

>> +endif
>> -- 
>> 2.25.1
>>


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v2 06/10] xen: introduce an arch helper for default dma zone status
  2022-04-19  9:18   ` Jan Beulich
@ 2022-04-21  7:03     ` Wei Chen
  0 siblings, 0 replies; 38+ messages in thread
From: Wei Chen @ 2022-04-21  7:03 UTC (permalink / raw)
  To: Jan Beulich
  Cc: nd, Stefano Stabellini, Julien Grall, Bertrand Marquis,
	Volodymyr Babchuk, Andrew Cooper, George Dunlap, Wei Liu,
	Roger Pau Monné,
	xen-devel

Hi Jan,

On 2022/4/19 17:18, Jan Beulich wrote:
> On 18.04.2022 11:07, Wei Chen wrote:
>> --- a/xen/common/page_alloc.c
>> +++ b/xen/common/page_alloc.c
>> @@ -1889,7 +1889,7 @@ void __init end_boot_allocator(void)
>>       }
>>       nr_bootmem_regions = 0;
>>   
>> -    if ( !dma_bitsize && (num_online_nodes() > 1) )
>> +    if ( !dma_bitsize && arch_have_default_dmazone() )
>>           dma_bitsize = arch_get_dma_bitsize();
> 
> Considering its purpose, may I suggest "want" instead of "have" in the
> hook name?
> 

Ok, I will do it.

> Jan
> 


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v2 01/10] xen/arm: Print a 64-bit number in hex from early uart
  2022-04-19  9:13   ` Jiamei Xie
@ 2022-04-21  7:05     ` Wei Chen
  0 siblings, 0 replies; 38+ messages in thread
From: Wei Chen @ 2022-04-21  7:05 UTC (permalink / raw)
  To: Jiamei Xie, xen-devel
  Cc: nd, Stefano Stabellini, Julien Grall, Bertrand Marquis,
	Volodymyr Babchuk, Julien Grall

Hi Jiamei,

On 2022/4/19 17:13, Jiamei Xie wrote:
> Hi Wei,
> 
>> -----Original Message-----
>> From: Xen-devel <xen-devel-bounces@lists.xenproject.org> On Behalf Of
>> Wei Chen
>> Sent: 2022年4月18日 17:07
>> To: --to=xen-devel@lists.xenproject.org; xen-devel@lists.xenproject.org
>> Cc: nd <nd@arm.com>; Wei Chen <Wei.Chen@arm.com>; Stefano Stabellini
>> <sstabellini@kernel.org>; Julien Grall <julien@xen.org>; Bertrand Marquis
>> <Bertrand.Marquis@arm.com>; Volodymyr Babchuk
>> <Volodymyr_Babchuk@epam.com>; Julien Grall <jgrall@amazon.com>
>> Subject: [PATCH v2 01/10] xen/arm: Print a 64-bit number in hex from early
>> uart
>>
>> Current putn function that is using for early print
>> only can print low 32-bit of AArch64 register. This
>> will lose some important messages while debugging
>> with early console. For example:
>> (XEN) Bringing up CPU5
>> - CPU 0000000100000100 booting -
>> Will be truncated to
>> (XEN) Bringing up CPU5
>> - CPU 00000100 booting -
>>
>> In this patch, we increased the print loops and shift
>> bits to make putn print 64-bit number.
>>
>> Signed-off-by: Wei Chen <wei.chen@arm.com>
>> Acked-by: Julien Grall <jgrall@amazon.com>
>> ---
>>   xen/arch/arm/arm64/head.S | 12 +++++++-----
>>   1 file changed, 7 insertions(+), 5 deletions(-)
>>
>> diff --git a/xen/arch/arm/arm64/head.S b/xen/arch/arm/arm64/head.S
>> index e62c48ec1c..2bb7906f69 100644
>> --- a/xen/arch/arm/arm64/head.S
>> +++ b/xen/arch/arm/arm64/head.S
>> @@ -866,17 +866,19 @@ puts:
>>           ret
>>   ENDPROC(puts)
>>
>> -/* Print a 32-bit number in hex.  Specific to the PL011 UART.
>> +/*
>> + * Print a 64-bit number in hex.
>>    * x0: Number to print.
>>    * x23: Early UART base address
>> - * Clobbers x0-x3 */
>> + * Clobbers x0-x3
>> + */
>>   putn:
>>           adr   x1, hex
>> -        mov   x3, #8
>> +        mov   x3, #16
>>   1:
>>           early_uart_ready x23, 2
>> -        and   x2, x0, #0xf0000000    /* Mask off the top nybble */
>> -        lsr   x2, x2, #28
>> +        and   x2, x0, #(0xf<<60)     /* Mask off the top nybble */
>> +        lsr   x2, x2, #60
>>           ldrb  w2, [x1, x2]           /* Convert to a char */
>>           early_uart_transmit x23, w2
>>           lsl   x0, x0, #4             /* Roll it through one nybble at a time */
>> --
>> 2.25.1
>>
> 
> I have tested the whole patch series on Armv8A(config without NUMA) and X86(config with NUMA), both can enter Dom0 successfully and the X86 NUMA works normally.
> 
> Tested-by: Jiamei Xie <jiamei.xie@arm.com>
> 

Thanks for your testing!

> Regards,
> Jiamei Xie


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v2 03/10] xen/arm: add CONFIG_ARM_EFI to stub EFI API
  2022-04-21  7:01     ` Wei Chen
@ 2022-04-21 21:35       ` Stefano Stabellini
  0 siblings, 0 replies; 38+ messages in thread
From: Stefano Stabellini @ 2022-04-21 21:35 UTC (permalink / raw)
  To: Wei Chen
  Cc: Stefano Stabellini, xen-devel, nd, Julien Grall,
	Bertrand Marquis, Volodymyr Babchuk

On Thu, 21 Apr 2022, Wei Chen wrote:
> Hi Stefano,
> 
> On 2022/4/21 8:25, Stefano Stabellini wrote:
> > On Mon, 18 Apr 2022, Wei Chen wrote:
> > > x86 is using compiler feature testing to decide EFI
> > > build enable or not. When EFI build is disabled, x86
> > > will use a efi/stub.c file to replace efi/runtime.c
> > > for build objects. Following this idea, we introduce
> > > a stub file for Arm, but use CONFIG_ARM_EFI to decide
> > > EFI build enable or not.
> > > 
> > > Signed-off-by: Wei Chen <wei.chen@arm.com>
> > > ---
> > > v1 -> v2:
> > > 1. Use CONFIG_ARM_EFI to replace CONFIG_EFI
> > > 2. Remove help text and make CONFIG_ARM_EFI invisible.
> > > 3. Merge one following patch:
> > >     xen/arm: introduce a stub file for non-EFI architectures
> > > 4. Use the common stub.c instead of creating new one.
> > > ---
> > >   xen/arch/arm/Kconfig      | 5 +++++
> > >   xen/arch/arm/Makefile     | 2 +-
> > >   xen/arch/arm/efi/Makefile | 5 +++++
> > >   3 files changed, 11 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/xen/arch/arm/Kconfig b/xen/arch/arm/Kconfig
> > > index ecfa6822e4..5f1b0121b0 100644
> > > --- a/xen/arch/arm/Kconfig
> > > +++ b/xen/arch/arm/Kconfig
> > > @@ -6,6 +6,7 @@ config ARM_64
> > >   	def_bool y
> > >   	depends on !ARM_32
> > >   	select 64BIT
> > > +	select ARM_EFI
> > >   	select HAS_FAST_MULTIPLY
> > >     config ARM
> > > @@ -33,6 +34,10 @@ config ACPI
> > >   	  Advanced Configuration and Power Interface (ACPI) support for Xen is
> > >   	  an alternative to device tree on ARM64.
> > >   +config ARM_EFI
> > > +	bool
> > > +	depends on ARM_64
> > 
> > As ARM_EFI is selected by ARM_64 and it cannot be enable via a menu (it
> > is hidden) there is no need for the "depends" line here
> > 
> 
> Ok, I will remove it.
> 
> > 
> > >   config GICV3
> > >   	bool "GICv3 driver"
> > >   	depends on ARM_64 && !NEW_VGIC
> > > diff --git a/xen/arch/arm/Makefile b/xen/arch/arm/Makefile
> > > index 1d862351d1..bb7a6151c1 100644
> > > --- a/xen/arch/arm/Makefile
> > > +++ b/xen/arch/arm/Makefile
> > > @@ -1,6 +1,5 @@
> > >   obj-$(CONFIG_ARM_32) += arm32/
> > >   obj-$(CONFIG_ARM_64) += arm64/
> > > -obj-$(CONFIG_ARM_64) += efi/
> > >   obj-$(CONFIG_ACPI) += acpi/
> > >   obj-$(CONFIG_HAS_PCI) += pci/
> > >   ifneq ($(CONFIG_NO_PLAT),y)
> > > @@ -20,6 +19,7 @@ obj-y += domain.o
> > >   obj-y += domain_build.init.o
> > >   obj-y += domctl.o
> > >   obj-$(CONFIG_EARLY_PRINTK) += early_printk.o
> > > +obj-y += efi/
> > >   obj-y += gic.o
> > >   obj-y += gic-v2.o
> > >   obj-$(CONFIG_GICV3) += gic-v3.o
> > > diff --git a/xen/arch/arm/efi/Makefile b/xen/arch/arm/efi/Makefile
> > > index 4313c39066..75ef180233 100644
> > > --- a/xen/arch/arm/efi/Makefile
> > > +++ b/xen/arch/arm/efi/Makefile
> > > @@ -1,4 +1,9 @@
> > >   include $(srctree)/common/efi/efi-common.mk
> > >   +ifeq ($(CONFIG_ARM_EFI),y)
> > >   obj-y += $(EFIOBJ-y)
> > >   obj-$(CONFIG_ACPI) +=  efi-dom0.init.o
> > > +else
> > > +EFIOBJ-y += stub.o
> > > +obj-y += stub.o
> > 
> > The change to EFIOBJ-y is not needed here as EFIOBJ-y is not used.
> > Only obj-y += stub.o should be enough
> > 
> 
> I add stub.o to EFIOBJ-y that's because I want to use the clean-files
> in efi-common.mk. Otherwise the link of stub.c in arm/efi will not
> be cleaned in "make clean".

I see. Makes sense.


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v2 02/10] xen/x86: move reusable EFI stub functions from x86 to common
  2022-04-18  9:07 ` [PATCH v2 02/10] xen/x86: move reusable EFI stub functions from x86 to common Wei Chen
  2022-04-21  0:17   ` Stefano Stabellini
@ 2022-04-26  8:53   ` Jan Beulich
  2022-04-26 10:37     ` Wei Chen
  1 sibling, 1 reply; 38+ messages in thread
From: Jan Beulich @ 2022-04-26  8:53 UTC (permalink / raw)
  To: Wei Chen; +Cc: nd, Andrew Cooper, Roger Pau Monné, Wei Liu, xen-devel

On 18.04.2022 11:07, Wei Chen wrote:
> diff --git a/xen/arch/x86/efi/stub.c b/xen/arch/x86/efi/stub-x86.c
> similarity index 71%
> rename from xen/arch/x86/efi/stub.c
> rename to xen/arch/x86/efi/stub-x86.c
> index 9984932626..2cd5c8d4dc 100644
> --- a/xen/arch/x86/efi/stub.c
> +++ b/xen/arch/x86/efi/stub-x86.c

I'm not happy to see a file named *x86*.[ch] under x86/. I think the
x86 file wants to simply include the common one (and the symlinking
be suppressed when a real file already exists). Naming the common
file stub-common.c wouldn't help, as a similar anomaly would result.

> --- /dev/null
> +++ b/xen/common/efi/stub.c
> @@ -0,0 +1,38 @@
> +#include <xen/efi.h>
> +#include <xen/errno.h>
> +#include <xen/lib.h>
> +
> +bool efi_enabled(unsigned int feature)
> +{
> +    return false;
> +}
> +
> +bool efi_rs_using_pgtables(void)
> +{
> +    return false;
> +}
> +
> +unsigned long efi_get_time(void)
> +{
> +    BUG();
> +    return 0;
> +}
> +
> +void efi_halt_system(void) { }
> +void efi_reset_system(bool warm) { }
> +
> +int efi_get_info(uint32_t idx, union xenpf_efi_info *info)
> +{
> +    return -ENOSYS;
> +}
> +
> +int efi_compat_get_info(uint32_t idx, union compat_pf_efi_info *)
> +    __attribute__((__alias__("efi_get_info")));

I doubt you need this outside of x86.

> +int efi_runtime_call(struct xenpf_efi_runtime_call *op)
> +{
> +    return -ENOSYS;
> +}
> +
> +int efi_compat_runtime_call(struct compat_pf_efi_runtime_call *)
> +    __attribute__((__alias__("efi_runtime_call")));

Same here.

Even for the non-compat variants the need is un-obvious: Are you
intending to wire these up anywhere in Arm or common code? This
of course is once again pointing out that such code movement would
better be done when the new consumers actually appear, such that
it's clear why the movement is done - for every individual item.

Jan



^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v2 05/10] xen/x86: Use ASSERT instead of VIRTUAL_BUG_ON for phys_to_nid
  2022-04-18  9:07 ` [PATCH v2 05/10] xen/x86: Use ASSERT instead of VIRTUAL_BUG_ON for phys_to_nid Wei Chen
  2022-04-21  0:37   ` Stefano Stabellini
@ 2022-04-26  9:02   ` Jan Beulich
  2022-04-26 10:59     ` Wei Chen
  1 sibling, 1 reply; 38+ messages in thread
From: Jan Beulich @ 2022-04-26  9:02 UTC (permalink / raw)
  To: Wei Chen
  Cc: nd, Andrew Cooper, Roger Pau Monné,
	Wei Liu, George Dunlap, Julien Grall, Stefano Stabellini,
	xen-devel

On 18.04.2022 11:07, Wei Chen wrote:
> VIRTUAL_BUG_ON is an empty macro used in phys_to_nid. This
> results in two lines of error-checking code in phys_to_nid
> that is not actually working and causing two compilation
> errors:
> 1. error: "MAX_NUMNODES" undeclared (first use in this function).
>    This is because in the common header file, "MAX_NUMNODES" is
>    defined after the common header file includes the ARCH header
>    file, where phys_to_nid has attempted to use "MAX_NUMNODES".
>    This error was resolved when we moved the definition of
>    "MAX_NUMNODES" to x86 ARCH header file. And we reserve the
>    "MAX_NUMNODES" definition in common header file through a
>    conditional compilation for some architectures that don't
>    need to define "MAX_NUMNODES" in their ARCH header files.

No, that's setting up a trap for someone else to fall into, especially
with the #ifdef around the original definition. Afaict all you need to
do is to move that #define ahead of the #include in xen/numa.h. Unlike
functions, #define-s can reference not-yet-defined identifiers.

> 2. error: wrong type argument to unary exclamation mark.
>    This is because, the error-checking code contains !node_data[nid].
>    But node_data is a data structure variable, it's not a pointer.
> 
> So, in this patch, we use ASSERT instead of VIRTUAL_BUG_ON to
> enable the two lines of error-checking code. And fix the left
> compilation errors by replacing !node_data[nid] to
> !node_data[nid].node_spanned_pages.
> 
> Because when node_spanned_pages is 0, this node has no memory,
> numa_scan_node will print warning message for such kind of nodes:
> "Firmware Bug or mis-configured hardware?".

This warning is bogus - nodes can have only processors. Therefore I'd
like to ask that you don't use it for justification. And indeed you
don't need to: phys_to_nid() is about translating an address. The
input address can't be valid if it maps to a node with no memory.

Jan



^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v2 09/10] xen/x86: use paddr_t for addresses in NUMA node structure
  2022-04-18  9:07 ` [PATCH v2 09/10] xen/x86: use paddr_t for addresses in NUMA node structure Wei Chen
@ 2022-04-26  9:11   ` Jan Beulich
  2022-04-26 10:42     ` Wei Chen
  0 siblings, 1 reply; 38+ messages in thread
From: Jan Beulich @ 2022-04-26  9:11 UTC (permalink / raw)
  To: Wei Chen; +Cc: nd, Andrew Cooper, Roger Pau Monné, Wei Liu, xen-devel

On 18.04.2022 11:07, Wei Chen wrote:
> v1 ->v2:
> 1. Drop useless cast.
> 2. Use initializers of the variables.

Would have been nice if this was extended to ...

> diff --git a/xen/arch/x86/numa.c b/xen/arch/x86/numa.c
> index 680b7d9002..2b3a51afd0 100644
> --- a/xen/arch/x86/numa.c
> +++ b/xen/arch/x86/numa.c
> @@ -162,12 +162,12 @@ int __init compute_hash_shift(struct node *nodes, int numnodes,
>      return shift;
>  }
>  /* initialize NODE_DATA given nodeid and start/end */
> -void __init setup_node_bootmem(nodeid_t nodeid, u64 start, u64 end)
> -{ 
> +void __init setup_node_bootmem(nodeid_t nodeid, paddr_t start, paddr_t end)
> +{
>      unsigned long start_pfn, end_pfn;
>  
> -    start_pfn = start >> PAGE_SHIFT;
> -    end_pfn = end >> PAGE_SHIFT;
> +    start_pfn = paddr_to_pfn(start);
> +    end_pfn = paddr_to_pfn(end);

... these as well.

> @@ -218,9 +219,9 @@ static int __init numa_emulation(u64 start_pfn, u64 end_pfn)
>      memset(&nodes,0,sizeof(nodes));
>      for ( i = 0; i < numa_fake; i++ )
>      {
> -        nodes[i].start = (start_pfn<<PAGE_SHIFT) + i*sz;
> +        nodes[i].start = pfn_to_paddr(start_pfn) + i*sz;

Please add the missing blanks around * while touching this line.

> @@ -489,7 +489,8 @@ int __init acpi_scan_nodes(u64 start, u64 end)
>  	/* Finally register nodes */
>  	for_each_node_mask(i, all_nodes_parsed)
>  	{
> -		u64 size = nodes[i].end - nodes[i].start;
> +		paddr_t size = nodes[i].end - nodes[i].start;

In numa_emulation() you use uint64_t for a size; here you use paddr_t.
Please be consistent.

Jan



^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v2 10/10] xen/x86: add detection of memory interleaves for different nodes
  2022-04-18  9:07 ` [PATCH v2 10/10] xen/x86: add detection of memory interleaves for different nodes Wei Chen
@ 2022-04-26  9:20   ` Jan Beulich
  2022-04-26 11:07     ` Wei Chen
  0 siblings, 1 reply; 38+ messages in thread
From: Jan Beulich @ 2022-04-26  9:20 UTC (permalink / raw)
  To: Wei Chen; +Cc: nd, Andrew Cooper, Roger Pau Monné, Wei Liu, xen-devel

On 18.04.2022 11:07, Wei Chen wrote:
> --- a/xen/arch/x86/srat.c
> +++ b/xen/arch/x86/srat.c
> @@ -271,6 +271,35 @@ acpi_numa_processor_affinity_init(const struct acpi_srat_cpu_affinity *pa)
>  		       pxm, pa->apic_id, node);
>  }
>  
> +/*
> + * Check to see if there are other nodes within this node's range.
> + * We just need to check full contains situation. Because overlaps
> + * have been checked before by conflicting_memblks.
> + */
> +static bool __init check_node_memory_interleave(nodeid_t nid,
> +                                                paddr_t start, paddr_t end)
> +{
> +	nodeid_t i;
> +	const struct node *nd = &nodes[nid];
> +
> +	for_each_node_mask(i, memory_nodes_parsed)
> +	{
> +		/* Skip itself */
> +		if (i == nid)
> +			continue;
> +
> +		nd = &nodes[i];
> +		if (start < nd->start && nd->end < end) {
> +			printk(KERN_ERR
> +			       "Node %u: (%"PRIpaddr"-%"PRIpaddr") interleaves with node %u (%"PRIpaddr"-%"PRIpaddr")\n",
> +			       nid, start, end, i, nd->start, nd->end);
> +			return true;
> +		}
> +	}
> +
> +	return false;
> +}
> +
>  /* Callback for parsing of the Proximity Domain <-> Memory Area mappings */
>  void __init
>  acpi_numa_memory_affinity_init(const struct acpi_srat_mem_affinity *ma)
> @@ -340,10 +369,22 @@ acpi_numa_memory_affinity_init(const struct acpi_srat_mem_affinity *ma)

Just up from here there already is overlap detection (via a call to
conflicting_memblks(), and you even mention that in the earlier
comment). If that doesn't cover all cases, I think it wants fixing
there rather than introducing a 2nd checking function. But afaics
that code covers the "fully contains" case.

Jan

>  			nd->start = start;
>  			nd->end = end;
>  		} else {
> -			if (start < nd->start)
> -				nd->start = start;
> -			if (nd->end < end)
> -				nd->end = end;
> +			paddr_t new_start = nd->start;
> +			paddr_t new_end = nd->end;
> +
> +			if (start < new_start)
> +				new_start = start;
> +			if (new_end < end)
> +				new_end = end;
> +
> +			/* Check whether new range contains memory for other nodes */
> +			if (check_node_memory_interleave(node, new_start, new_end)) {
> +				bad_srat();
> +				return;
> +			}
> +
> +			nd->start = new_start;
> +			nd->end = new_end;
>  		}
>  	}
>  	printk(KERN_INFO "SRAT: Node %u PXM %u %"PRIpaddr"-%"PRIpaddr"%s\n",



^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v2 02/10] xen/x86: move reusable EFI stub functions from x86 to common
  2022-04-26  8:53   ` Jan Beulich
@ 2022-04-26 10:37     ` Wei Chen
  2022-04-26 14:31       ` Jan Beulich
  0 siblings, 1 reply; 38+ messages in thread
From: Wei Chen @ 2022-04-26 10:37 UTC (permalink / raw)
  To: Jan Beulich; +Cc: nd, Andrew Cooper, Roger Pau Monné, Wei Liu, xen-devel

Hi Jan,

On 2022/4/26 16:53, Jan Beulich wrote:
> On 18.04.2022 11:07, Wei Chen wrote:
>> diff --git a/xen/arch/x86/efi/stub.c b/xen/arch/x86/efi/stub-x86.c
>> similarity index 71%
>> rename from xen/arch/x86/efi/stub.c
>> rename to xen/arch/x86/efi/stub-x86.c
>> index 9984932626..2cd5c8d4dc 100644
>> --- a/xen/arch/x86/efi/stub.c
>> +++ b/xen/arch/x86/efi/stub-x86.c
> 
> I'm not happy to see a file named *x86*.[ch] under x86/. I think the
> x86 file wants to simply include the common one (and the symlinking
> be suppressed when a real file already exists). Naming the common
> file stub-common.c wouldn't help, as a similar anomaly would result.
> 

How about using stub-arch.c to indicate this stub file only contains
the arch specific contents? However, we cannot predict what link files 
will be created in this directory in the future. If someone needs to
create a stub-arch.c link file in the future, can we solve it at that
time?  Or do you have any suggestions?

>> --- /dev/null
>> +++ b/xen/common/efi/stub.c
>> @@ -0,0 +1,38 @@
>> +#include <xen/efi.h>
>> +#include <xen/errno.h>
>> +#include <xen/lib.h>
>> +
>> +bool efi_enabled(unsigned int feature)
>> +{
>> +    return false;
>> +}
>> +
>> +bool efi_rs_using_pgtables(void)
>> +{
>> +    return false;
>> +}
>> +
>> +unsigned long efi_get_time(void)
>> +{
>> +    BUG();
>> +    return 0;
>> +}
>> +
>> +void efi_halt_system(void) { }
>> +void efi_reset_system(bool warm) { }
>> +
>> +int efi_get_info(uint32_t idx, union xenpf_efi_info *info)
>> +{
>> +    return -ENOSYS;
>> +}
>> +
>> +int efi_compat_get_info(uint32_t idx, union compat_pf_efi_info *)
>> +    __attribute__((__alias__("efi_get_info")));
> 
> I doubt you need this outside of x86.
> 
>> +int efi_runtime_call(struct xenpf_efi_runtime_call *op)
>> +{
>> +    return -ENOSYS;
>> +}
>> +
>> +int efi_compat_runtime_call(struct compat_pf_efi_runtime_call *)
>> +    __attribute__((__alias__("efi_runtime_call")));
> 
> Same here.
> 

You're correct, I check the code, Arm doesn't need above two
compat functions. I will restore them to x86 specific file.

> Even for the non-compat variants the need is un-obvious: Are you
> intending to wire these up anywhere in Arm or common code? This
> of course is once again pointing out that such code movement would
> better be done when the new consumers actually appear, such that
> it's clear why the movement is done - for every individual item.
> 

Yes, but I didn't deliberately ignore your comment from the last
series. I also hesitated for a while when constructing this patch.
I felt that this independent work, maybe it would be better to use
an independent patch.
I will merge patch3 with it, indicating that this movement is to
share the EFI stub for Arm.

Thanks,
Wei Che

> Jan
> 


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v2 09/10] xen/x86: use paddr_t for addresses in NUMA node structure
  2022-04-26  9:11   ` Jan Beulich
@ 2022-04-26 10:42     ` Wei Chen
  0 siblings, 0 replies; 38+ messages in thread
From: Wei Chen @ 2022-04-26 10:42 UTC (permalink / raw)
  To: Jan Beulich; +Cc: nd, Andrew Cooper, Roger Pau Monné, Wei Liu, xen-devel

Hi Jan,

On 2022/4/26 17:11, Jan Beulich wrote:
> On 18.04.2022 11:07, Wei Chen wrote:
>> v1 ->v2:
>> 1. Drop useless cast.
>> 2. Use initializers of the variables.
> 
> Would have been nice if this was extended to ...
> 
>> diff --git a/xen/arch/x86/numa.c b/xen/arch/x86/numa.c
>> index 680b7d9002..2b3a51afd0 100644
>> --- a/xen/arch/x86/numa.c
>> +++ b/xen/arch/x86/numa.c
>> @@ -162,12 +162,12 @@ int __init compute_hash_shift(struct node *nodes, int numnodes,
>>       return shift;
>>   }
>>   /* initialize NODE_DATA given nodeid and start/end */
>> -void __init setup_node_bootmem(nodeid_t nodeid, u64 start, u64 end)
>> -{
>> +void __init setup_node_bootmem(nodeid_t nodeid, paddr_t start, paddr_t end)
>> +{
>>       unsigned long start_pfn, end_pfn;
>>   
>> -    start_pfn = start >> PAGE_SHIFT;
>> -    end_pfn = end >> PAGE_SHIFT;
>> +    start_pfn = paddr_to_pfn(start);
>> +    end_pfn = paddr_to_pfn(end);
> 
> ... these as well.
> 

Ok, I will do it.

>> @@ -218,9 +219,9 @@ static int __init numa_emulation(u64 start_pfn, u64 end_pfn)
>>       memset(&nodes,0,sizeof(nodes));
>>       for ( i = 0; i < numa_fake; i++ )
>>       {
>> -        nodes[i].start = (start_pfn<<PAGE_SHIFT) + i*sz;
>> +        nodes[i].start = pfn_to_paddr(start_pfn) + i*sz;
> 
> Please add the missing blanks around * while touching this line.
> 

Ok.

>> @@ -489,7 +489,8 @@ int __init acpi_scan_nodes(u64 start, u64 end)
>>   	/* Finally register nodes */
>>   	for_each_node_mask(i, all_nodes_parsed)
>>   	{
>> -		u64 size = nodes[i].end - nodes[i].start;
>> +		paddr_t size = nodes[i].end - nodes[i].start;
> 
> In numa_emulation() you use uint64_t for a size; here you use paddr_t.
> Please be consistent.
> 

Ok.

> Jan
> 


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v2 05/10] xen/x86: Use ASSERT instead of VIRTUAL_BUG_ON for phys_to_nid
  2022-04-26  9:02   ` Jan Beulich
@ 2022-04-26 10:59     ` Wei Chen
  2022-04-26 14:42       ` Jan Beulich
  0 siblings, 1 reply; 38+ messages in thread
From: Wei Chen @ 2022-04-26 10:59 UTC (permalink / raw)
  To: Jan Beulich
  Cc: nd, Andrew Cooper, Roger Pau Monné,
	Wei Liu, George Dunlap, Julien Grall, Stefano Stabellini,
	xen-devel

Hi Jan,

On 2022/4/26 17:02, Jan Beulich wrote:
> On 18.04.2022 11:07, Wei Chen wrote:
>> VIRTUAL_BUG_ON is an empty macro used in phys_to_nid. This
>> results in two lines of error-checking code in phys_to_nid
>> that is not actually working and causing two compilation
>> errors:
>> 1. error: "MAX_NUMNODES" undeclared (first use in this function).
>>     This is because in the common header file, "MAX_NUMNODES" is
>>     defined after the common header file includes the ARCH header
>>     file, where phys_to_nid has attempted to use "MAX_NUMNODES".
>>     This error was resolved when we moved the definition of
>>     "MAX_NUMNODES" to x86 ARCH header file. And we reserve the
>>     "MAX_NUMNODES" definition in common header file through a
>>     conditional compilation for some architectures that don't
>>     need to define "MAX_NUMNODES" in their ARCH header files.
> 
> No, that's setting up a trap for someone else to fall into, especially
> with the #ifdef around the original definition. Afaict all you need to
> do is to move that #define ahead of the #include in xen/numa.h. Unlike
> functions, #define-s can reference not-yet-defined identifiers.
> 

I had tried it before. MAX_NUMNODES depends on NODE_SHIFT. But
NODE_SHIFT depends on the definition status in asm/numa.h. If I move 
MAX_NUMNODES to before asm/numa.h, then I have to move NODES_SHIFT as 
well. But this will break the original design. NODES_SHIFT in xen/numa.h 
will always be defined before asm/numa.h. This will be a duplicated 
definition error.

How about I move MAX_NUMNODES to arm and x86 asm/numa.h in this patch
at the same time? Because in one of following patches, MAX_NUMNODES and
phys_to_nid will be moved to xen/numa.h at the same time?

>> 2. error: wrong type argument to unary exclamation mark.
>>     This is because, the error-checking code contains !node_data[nid].
>>     But node_data is a data structure variable, it's not a pointer.
>>
>> So, in this patch, we use ASSERT instead of VIRTUAL_BUG_ON to
>> enable the two lines of error-checking code. And fix the left
>> compilation errors by replacing !node_data[nid] to
>> !node_data[nid].node_spanned_pages.
>>
>> Because when node_spanned_pages is 0, this node has no memory,
>> numa_scan_node will print warning message for such kind of nodes:
>> "Firmware Bug or mis-configured hardware?".
> 
> This warning is bogus - nodes can have only processors. Therefore I'd
> like to ask that you don't use it for justification. And indeed you

Yes, you're right, node can only has CPUs! I will remove it.

> don't need to: phys_to_nid() is about translating an address. The
> input address can't be valid if it maps to a node with no memory.
> 

Can I understand your comment:
Any input address is invalid, when node_spanned_pages is zero, because
this node has no memory?

Thanks,
Wei Chen

> Jan
> 


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v2 10/10] xen/x86: add detection of memory interleaves for different nodes
  2022-04-26  9:20   ` Jan Beulich
@ 2022-04-26 11:07     ` Wei Chen
  0 siblings, 0 replies; 38+ messages in thread
From: Wei Chen @ 2022-04-26 11:07 UTC (permalink / raw)
  To: Jan Beulich; +Cc: nd, Andrew Cooper, Roger Pau Monné, Wei Liu, xen-devel

Hi Jan,

On 2022/4/26 17:20, Jan Beulich wrote:
> On 18.04.2022 11:07, Wei Chen wrote:
>> --- a/xen/arch/x86/srat.c
>> +++ b/xen/arch/x86/srat.c
>> @@ -271,6 +271,35 @@ acpi_numa_processor_affinity_init(const struct acpi_srat_cpu_affinity *pa)
>>   		       pxm, pa->apic_id, node);
>>   }
>>   
>> +/*
>> + * Check to see if there are other nodes within this node's range.
>> + * We just need to check full contains situation. Because overlaps
>> + * have been checked before by conflicting_memblks.
>> + */
>> +static bool __init check_node_memory_interleave(nodeid_t nid,
>> +                                                paddr_t start, paddr_t end)
>> +{
>> +	nodeid_t i;
>> +	const struct node *nd = &nodes[nid];
>> +
>> +	for_each_node_mask(i, memory_nodes_parsed)
>> +	{
>> +		/* Skip itself */
>> +		if (i == nid)
>> +			continue;
>> +
>> +		nd = &nodes[i];
>> +		if (start < nd->start && nd->end < end) {
>> +			printk(KERN_ERR
>> +			       "Node %u: (%"PRIpaddr"-%"PRIpaddr") interleaves with node %u (%"PRIpaddr"-%"PRIpaddr")\n",
>> +			       nid, start, end, i, nd->start, nd->end);
>> +			return true;
>> +		}
>> +	}
>> +
>> +	return false;
>> +}
>> +
>>   /* Callback for parsing of the Proximity Domain <-> Memory Area mappings */
>>   void __init
>>   acpi_numa_memory_affinity_init(const struct acpi_srat_mem_affinity *ma)
>> @@ -340,10 +369,22 @@ acpi_numa_memory_affinity_init(const struct acpi_srat_mem_affinity *ma)
> 
> Just up from here there already is overlap detection (via a call to
> conflicting_memblks(), and you even mention that in the earlier
> comment). If that doesn't cover all cases, I think it wants fixing
> there rather than introducing a 2nd checking function. But afaics
> that code covers the "fully contains" case.
> 

Yes, that makes sense, I will try to add this case check in 
conflicting_memblks.

Thanks,
Wei Chen

> Jan
> 
>>   			nd->start = start;
>>   			nd->end = end;
>>   		} else {
>> -			if (start < nd->start)
>> -				nd->start = start;
>> -			if (nd->end < end)
>> -				nd->end = end;
>> +			paddr_t new_start = nd->start;
>> +			paddr_t new_end = nd->end;
>> +
>> +			if (start < new_start)
>> +				new_start = start;
>> +			if (new_end < end)
>> +				new_end = end;
>> +
>> +			/* Check whether new range contains memory for other nodes */
>> +			if (check_node_memory_interleave(node, new_start, new_end)) {
>> +				bad_srat();
>> +				return;
>> +			}
>> +
>> +			nd->start = new_start;
>> +			nd->end = new_end;
>>   		}
>>   	}
>>   	printk(KERN_INFO "SRAT: Node %u PXM %u %"PRIpaddr"-%"PRIpaddr"%s\n",
> 


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v2 02/10] xen/x86: move reusable EFI stub functions from x86 to common
  2022-04-26 10:37     ` Wei Chen
@ 2022-04-26 14:31       ` Jan Beulich
  2022-04-27  2:56         ` Wei Chen
  0 siblings, 1 reply; 38+ messages in thread
From: Jan Beulich @ 2022-04-26 14:31 UTC (permalink / raw)
  To: Wei Chen; +Cc: nd, Andrew Cooper, Roger Pau Monné, Wei Liu, xen-devel

On 26.04.2022 12:37, Wei Chen wrote:
> On 2022/4/26 16:53, Jan Beulich wrote:
>> On 18.04.2022 11:07, Wei Chen wrote:
>>> diff --git a/xen/arch/x86/efi/stub.c b/xen/arch/x86/efi/stub-x86.c
>>> similarity index 71%
>>> rename from xen/arch/x86/efi/stub.c
>>> rename to xen/arch/x86/efi/stub-x86.c
>>> index 9984932626..2cd5c8d4dc 100644
>>> --- a/xen/arch/x86/efi/stub.c
>>> +++ b/xen/arch/x86/efi/stub-x86.c
>>
>> I'm not happy to see a file named *x86*.[ch] under x86/. I think the
>> x86 file wants to simply include the common one (and the symlinking
>> be suppressed when a real file already exists). Naming the common
>> file stub-common.c wouldn't help, as a similar anomaly would result.
>>
> 
> How about using stub-arch.c to indicate this stub file only contains
> the arch specific contents? However, we cannot predict what link files 
> will be created in this directory in the future. If someone needs to
> create a stub-arch.c link file in the future, can we solve it at that
> time?  Or do you have any suggestions?

I did provide my suggestion. I do not like stub-arch.c any better than
stub-x86.c or stub-common.c.

>>> --- /dev/null
>>> +++ b/xen/common/efi/stub.c
>>> @@ -0,0 +1,38 @@
>>> +#include <xen/efi.h>
>>> +#include <xen/errno.h>
>>> +#include <xen/lib.h>
>>> +
>>> +bool efi_enabled(unsigned int feature)
>>> +{
>>> +    return false;
>>> +}
>>> +
>>> +bool efi_rs_using_pgtables(void)
>>> +{
>>> +    return false;
>>> +}
>>> +
>>> +unsigned long efi_get_time(void)
>>> +{
>>> +    BUG();
>>> +    return 0;
>>> +}
>>> +
>>> +void efi_halt_system(void) { }
>>> +void efi_reset_system(bool warm) { }
>>> +
>>> +int efi_get_info(uint32_t idx, union xenpf_efi_info *info)
>>> +{
>>> +    return -ENOSYS;
>>> +}
>>> +
>>> +int efi_compat_get_info(uint32_t idx, union compat_pf_efi_info *)
>>> +    __attribute__((__alias__("efi_get_info")));
>>
>> I doubt you need this outside of x86.
>>
>>> +int efi_runtime_call(struct xenpf_efi_runtime_call *op)
>>> +{
>>> +    return -ENOSYS;
>>> +}
>>> +
>>> +int efi_compat_runtime_call(struct compat_pf_efi_runtime_call *)
>>> +    __attribute__((__alias__("efi_runtime_call")));
>>
>> Same here.
>>
> 
> You're correct, I check the code, Arm doesn't need above two
> compat functions. I will restore them to x86 specific file.
> 
>> Even for the non-compat variants the need is un-obvious: Are you
>> intending to wire these up anywhere in Arm or common code? This
>> of course is once again pointing out that such code movement would
>> better be done when the new consumers actually appear, such that
>> it's clear why the movement is done - for every individual item.
>>
> 
> Yes, but I didn't deliberately ignore your comment from the last
> series. I also hesitated for a while when constructing this patch.
> I felt that this independent work, maybe it would be better to use
> an independent patch.

Well, it of course depends on further aspects. If it had been clear
that what is moved is actually going to be wired up, this being a
standalone patch would be okay-ish. But with this unclear (and, as
per above, actually having gone too far) it's imo better to move
things as they get re-used.

Jan



^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v2 05/10] xen/x86: Use ASSERT instead of VIRTUAL_BUG_ON for phys_to_nid
  2022-04-26 10:59     ` Wei Chen
@ 2022-04-26 14:42       ` Jan Beulich
  2022-04-27  3:52         ` Wei Chen
  0 siblings, 1 reply; 38+ messages in thread
From: Jan Beulich @ 2022-04-26 14:42 UTC (permalink / raw)
  To: Wei Chen
  Cc: nd, Andrew Cooper, Roger Pau Monné,
	Wei Liu, George Dunlap, Julien Grall, Stefano Stabellini,
	xen-devel

On 26.04.2022 12:59, Wei Chen wrote:
> On 2022/4/26 17:02, Jan Beulich wrote:
>> On 18.04.2022 11:07, Wei Chen wrote:
>>> VIRTUAL_BUG_ON is an empty macro used in phys_to_nid. This
>>> results in two lines of error-checking code in phys_to_nid
>>> that is not actually working and causing two compilation
>>> errors:
>>> 1. error: "MAX_NUMNODES" undeclared (first use in this function).
>>>     This is because in the common header file, "MAX_NUMNODES" is
>>>     defined after the common header file includes the ARCH header
>>>     file, where phys_to_nid has attempted to use "MAX_NUMNODES".
>>>     This error was resolved when we moved the definition of
>>>     "MAX_NUMNODES" to x86 ARCH header file. And we reserve the
>>>     "MAX_NUMNODES" definition in common header file through a
>>>     conditional compilation for some architectures that don't
>>>     need to define "MAX_NUMNODES" in their ARCH header files.
>>
>> No, that's setting up a trap for someone else to fall into, especially
>> with the #ifdef around the original definition. Afaict all you need to
>> do is to move that #define ahead of the #include in xen/numa.h. Unlike
>> functions, #define-s can reference not-yet-defined identifiers.
>>
> 
> I had tried it before. MAX_NUMNODES depends on NODE_SHIFT. But
> NODE_SHIFT depends on the definition status in asm/numa.h. If I move 
> MAX_NUMNODES to before asm/numa.h, then I have to move NODES_SHIFT as 
> well. But this will break the original design. NODES_SHIFT in xen/numa.h 
> will always be defined before asm/numa.h. This will be a duplicated 
> definition error.

I'm afraid I don't follow. MAX_NUMNODES depends on NODES_SHIFT only as
soon as some code actually uses MAX_NUMNODES. It does not require
NODES_SHIFT to be defined up front. Of course with the current layout
(phys_to_nid() living in an inline function in asm/numa.h) things won't
build. But wasn't the plan to move phys_to_nid() to xen/numa.h as well?

Otherwise I'd recommend to introduce a new header, say numa-defs.h,
holding (for now) just NODES_SHIFT. Then you'd include asm/numa-defs.h
first and asm/numa.h only after having defined MAX_NUMNODES. But
splitting the header should only be a last resort if things can't be
made work another way.

> How about I move MAX_NUMNODES to arm and x86 asm/numa.h in this patch
> at the same time? Because in one of following patches, MAX_NUMNODES and
> phys_to_nid will be moved to xen/numa.h at the same time?
> 
>>> 2. error: wrong type argument to unary exclamation mark.
>>>     This is because, the error-checking code contains !node_data[nid].
>>>     But node_data is a data structure variable, it's not a pointer.
>>>
>>> So, in this patch, we use ASSERT instead of VIRTUAL_BUG_ON to
>>> enable the two lines of error-checking code. And fix the left
>>> compilation errors by replacing !node_data[nid] to
>>> !node_data[nid].node_spanned_pages.
>>>
>>> Because when node_spanned_pages is 0, this node has no memory,
>>> numa_scan_node will print warning message for such kind of nodes:
>>> "Firmware Bug or mis-configured hardware?".
>>
>> This warning is bogus - nodes can have only processors. Therefore I'd
>> like to ask that you don't use it for justification. And indeed you
> 
> Yes, you're right, node can only has CPUs! I will remove it.
> 
>> don't need to: phys_to_nid() is about translating an address. The
>> input address can't be valid if it maps to a node with no memory.
>>
> 
> Can I understand your comment:
> Any input address is invalid, when node_spanned_pages is zero, because
> this node has no memory?

It's getting close, but it's not exactly equivalent I think. A node
with 0 bytes of memory might (at least in theory) have an entry in
memnodemap[]. But finding a node ID for that address would still
not mean that at least one byte of memory at that address is present
on the given node, because the node covers 0 bytes.

Jan



^ permalink raw reply	[flat|nested] 38+ messages in thread

* RE: [PATCH v2 02/10] xen/x86: move reusable EFI stub functions from x86 to common
  2022-04-26 14:31       ` Jan Beulich
@ 2022-04-27  2:56         ` Wei Chen
  2022-04-27  5:54           ` Jan Beulich
  0 siblings, 1 reply; 38+ messages in thread
From: Wei Chen @ 2022-04-27  2:56 UTC (permalink / raw)
  To: Jan Beulich; +Cc: nd, Andrew Cooper, Roger Pau Monné, Wei Liu, xen-devel

Hi Jan,

> -----Original Message-----
> From: Jan Beulich <jbeulich@suse.com>
> Sent: 2022年4月26日 22:31
> To: Wei Chen <Wei.Chen@arm.com>
> Cc: nd <nd@arm.com>; Andrew Cooper <andrew.cooper3@citrix.com>; Roger Pau
> Monné <roger.pau@citrix.com>; Wei Liu <wl@xen.org>; xen-
> devel@lists.xenproject.org
> Subject: Re: [PATCH v2 02/10] xen/x86: move reusable EFI stub functions
> from x86 to common
> 
> On 26.04.2022 12:37, Wei Chen wrote:
> > On 2022/4/26 16:53, Jan Beulich wrote:
> >> On 18.04.2022 11:07, Wei Chen wrote:
> >>> diff --git a/xen/arch/x86/efi/stub.c b/xen/arch/x86/efi/stub-x86.c
> >>> similarity index 71%
> >>> rename from xen/arch/x86/efi/stub.c
> >>> rename to xen/arch/x86/efi/stub-x86.c
> >>> index 9984932626..2cd5c8d4dc 100644
> >>> --- a/xen/arch/x86/efi/stub.c
> >>> +++ b/xen/arch/x86/efi/stub-x86.c
> >>
> >> I'm not happy to see a file named *x86*.[ch] under x86/. I think the
> >> x86 file wants to simply include the common one (and the symlinking
> >> be suppressed when a real file already exists). Naming the common
> >> file stub-common.c wouldn't help, as a similar anomaly would result.
> >>
> >
> > How about using stub-arch.c to indicate this stub file only contains
> > the arch specific contents? However, we cannot predict what link files
> > will be created in this directory in the future. If someone needs to
> > create a stub-arch.c link file in the future, can we solve it at that
> > time?  Or do you have any suggestions?
> 
> I did provide my suggestion. I do not like stub-arch.c any better than
> stub-x86.c or stub-common.c.
> 

With my limited English level, I can only see that you don't like this, but
I can't get what you want clearly from your comments. I can only guess:
For "x86 file wants to simply include the common one":
1. Did you mean, x86 still keeps it stub.c and includes all its original
   contents. The common/efi/stub.c link behavior will be ignored, because
   of x86 has a real stub.c? And common/efi/stub.c still can works for
   other architectures like Arm whom doesn't have a real stub.c?
   But in previous version's discussion, I had said I created a stub.c in
   Arm/efi, and copied Arm required functions from x86/efi/stub.c. But
   people didn't like it. If my guess is correct, I don't know what is
   the essential difference between the two approaches.
2. Keeps stub.c in x86/efi, and use it to include common/stub.c.
   I think this may not be the right understanding, but I can't think
   of any other understanding.
   And please forgive my limited reading level again!

> >>> --- /dev/null
> >>> +++ b/xen/common/efi/stub.c
> >>> @@ -0,0 +1,38 @@
> >>> +#include <xen/efi.h>
> >>> +#include <xen/errno.h>
> >>> +#include <xen/lib.h>
> >>> +
> >>> +bool efi_enabled(unsigned int feature)
> >>> +{
> >>> +    return false;
> >>> +}
> >>> +
> >>> +bool efi_rs_using_pgtables(void)
> >>> +{
> >>> +    return false;
> >>> +}
> >>> +
> >>> +unsigned long efi_get_time(void)
> >>> +{
> >>> +    BUG();
> >>> +    return 0;
> >>> +}
> >>> +
> >>> +void efi_halt_system(void) { }
> >>> +void efi_reset_system(bool warm) { }
> >>> +
> >>> +int efi_get_info(uint32_t idx, union xenpf_efi_info *info)
> >>> +{
> >>> +    return -ENOSYS;
> >>> +}
> >>> +
> >>> +int efi_compat_get_info(uint32_t idx, union compat_pf_efi_info *)
> >>> +    __attribute__((__alias__("efi_get_info")));
> >>
> >> I doubt you need this outside of x86.
> >>
> >>> +int efi_runtime_call(struct xenpf_efi_runtime_call *op)
> >>> +{
> >>> +    return -ENOSYS;
> >>> +}
> >>> +
> >>> +int efi_compat_runtime_call(struct compat_pf_efi_runtime_call *)
> >>> +    __attribute__((__alias__("efi_runtime_call")));
> >>
> >> Same here.
> >>
> >
> > You're correct, I check the code, Arm doesn't need above two
> > compat functions. I will restore them to x86 specific file.
> >
> >> Even for the non-compat variants the need is un-obvious: Are you
> >> intending to wire these up anywhere in Arm or common code? This
> >> of course is once again pointing out that such code movement would
> >> better be done when the new consumers actually appear, such that
> >> it's clear why the movement is done - for every individual item.
> >>
> >
> > Yes, but I didn't deliberately ignore your comment from the last
> > series. I also hesitated for a while when constructing this patch.
> > I felt that this independent work, maybe it would be better to use
> > an independent patch.
> 
> Well, it of course depends on further aspects. If it had been clear
> that what is moved is actually going to be wired up, this being a
> standalone patch would be okay-ish. But with this unclear (and, as
> per above, actually having gone too far) it's imo better to move
> things as they get re-used.
> 

Ok, I understand it now.

Thanks,
Wei Chen

> Jan


^ permalink raw reply	[flat|nested] 38+ messages in thread

* RE: [PATCH v2 05/10] xen/x86: Use ASSERT instead of VIRTUAL_BUG_ON for phys_to_nid
  2022-04-26 14:42       ` Jan Beulich
@ 2022-04-27  3:52         ` Wei Chen
  2022-04-27  5:56           ` Jan Beulich
  0 siblings, 1 reply; 38+ messages in thread
From: Wei Chen @ 2022-04-27  3:52 UTC (permalink / raw)
  To: Jan Beulich
  Cc: nd, Andrew Cooper, Roger Pau Monné,
	Wei Liu, George Dunlap, Julien Grall, Stefano Stabellini,
	xen-devel

Hi Jan,

> -----Original Message-----
> From: Jan Beulich <jbeulich@suse.com>
> Sent: 2022年4月26日 22:42
> To: Wei Chen <Wei.Chen@arm.com>
> Cc: nd <nd@arm.com>; Andrew Cooper <andrew.cooper3@citrix.com>; Roger Pau
> Monné <roger.pau@citrix.com>; Wei Liu <wl@xen.org>; George Dunlap
> <george.dunlap@citrix.com>; Julien Grall <julien@xen.org>; Stefano
> Stabellini <sstabellini@kernel.org>; xen-devel@lists.xenproject.org
> Subject: Re: [PATCH v2 05/10] xen/x86: Use ASSERT instead of
> VIRTUAL_BUG_ON for phys_to_nid
> 
> On 26.04.2022 12:59, Wei Chen wrote:
> > On 2022/4/26 17:02, Jan Beulich wrote:
> >> On 18.04.2022 11:07, Wei Chen wrote:
> >>> VIRTUAL_BUG_ON is an empty macro used in phys_to_nid. This
> >>> results in two lines of error-checking code in phys_to_nid
> >>> that is not actually working and causing two compilation
> >>> errors:
> >>> 1. error: "MAX_NUMNODES" undeclared (first use in this function).
> >>>     This is because in the common header file, "MAX_NUMNODES" is
> >>>     defined after the common header file includes the ARCH header
> >>>     file, where phys_to_nid has attempted to use "MAX_NUMNODES".
> >>>     This error was resolved when we moved the definition of
> >>>     "MAX_NUMNODES" to x86 ARCH header file. And we reserve the
> >>>     "MAX_NUMNODES" definition in common header file through a
> >>>     conditional compilation for some architectures that don't
> >>>     need to define "MAX_NUMNODES" in their ARCH header files.
> >>
> >> No, that's setting up a trap for someone else to fall into, especially
> >> with the #ifdef around the original definition. Afaict all you need to
> >> do is to move that #define ahead of the #include in xen/numa.h. Unlike
> >> functions, #define-s can reference not-yet-defined identifiers.
> >>
> >
> > I had tried it before. MAX_NUMNODES depends on NODE_SHIFT. But
> > NODE_SHIFT depends on the definition status in asm/numa.h. If I move
> > MAX_NUMNODES to before asm/numa.h, then I have to move NODES_SHIFT as
> > well. But this will break the original design. NODES_SHIFT in xen/numa.h
> > will always be defined before asm/numa.h. This will be a duplicated
> > definition error.
> 
> I'm afraid I don't follow. MAX_NUMNODES depends on NODES_SHIFT only as
> soon as some code actually uses MAX_NUMNODES. It does not require
> NODES_SHIFT to be defined up front. Of course with the current layout
> (phys_to_nid() living in an inline function in asm/numa.h) things won't
> build. But wasn't the plan to move phys_to_nid() to xen/numa.h as well?
>

Yes, I will drop this patch from part#1, and move it to part#2. This
patch will follow when we move phys_to_nid() to xen/numa.h.

Thanks,
Wei Chen

> Otherwise I'd recommend to introduce a new header, say numa-defs.h,
> holding (for now) just NODES_SHIFT. Then you'd include asm/numa-defs.h
> first and asm/numa.h only after having defined MAX_NUMNODES. But
> splitting the header should only be a last resort if things can't be
> made work another way.
> 
> > How about I move MAX_NUMNODES to arm and x86 asm/numa.h in this patch
> > at the same time? Because in one of following patches, MAX_NUMNODES and
> > phys_to_nid will be moved to xen/numa.h at the same time?
> >
> >>> 2. error: wrong type argument to unary exclamation mark.
> >>>     This is because, the error-checking code contains !node_data[nid].
> >>>     But node_data is a data structure variable, it's not a pointer.
> >>>
> >>> So, in this patch, we use ASSERT instead of VIRTUAL_BUG_ON to
> >>> enable the two lines of error-checking code. And fix the left
> >>> compilation errors by replacing !node_data[nid] to
> >>> !node_data[nid].node_spanned_pages.
> >>>
> >>> Because when node_spanned_pages is 0, this node has no memory,
> >>> numa_scan_node will print warning message for such kind of nodes:
> >>> "Firmware Bug or mis-configured hardware?".
> >>
> >> This warning is bogus - nodes can have only processors. Therefore I'd
> >> like to ask that you don't use it for justification. And indeed you
> >
> > Yes, you're right, node can only has CPUs! I will remove it.
> >
> >> don't need to: phys_to_nid() is about translating an address. The
> >> input address can't be valid if it maps to a node with no memory.
> >>
> >
> > Can I understand your comment:
> > Any input address is invalid, when node_spanned_pages is zero, because
> > this node has no memory?
> 
> It's getting close, but it's not exactly equivalent I think. A node
> with 0 bytes of memory might (at least in theory) have an entry in
> memnodemap[]. But finding a node ID for that address would still

I have done a quick check in populate_memnodemap:
74          spdx = paddr_to_pdx(nodes[i].start);
75          epdx = paddr_to_pdx(nodes[i].end - 1) + 1;
76          if ( spdx >= epdx )
77              continue;

It seems that if node has no memory, start == end, then this function
will not populate memnodemap entry for this node.

> not mean that at least one byte of memory at that address is present
> on the given node, because the node covers 0 bytes.
> 

And back to this patch, can I just drop the unnecessary justification
from the commit message?

And for the bogus warning message, can I update it to an INFO level
message in part#2 series, and just keep:
printk(KERN_INFO "SRAT: Node %u has no memory!\n", i);
but remove "BIOS Bug or mis-configured hardware?\n", i); ?

Thanks,
Wei Chen

> Jan


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v2 02/10] xen/x86: move reusable EFI stub functions from x86 to common
  2022-04-27  2:56         ` Wei Chen
@ 2022-04-27  5:54           ` Jan Beulich
  2022-04-27  6:26             ` Wei Chen
  0 siblings, 1 reply; 38+ messages in thread
From: Jan Beulich @ 2022-04-27  5:54 UTC (permalink / raw)
  To: Wei Chen; +Cc: nd, Andrew Cooper, Roger Pau Monné, Wei Liu, xen-devel

On 27.04.2022 04:56, Wei Chen wrote:
>> -----Original Message-----
>> From: Jan Beulich <jbeulich@suse.com>
>> Sent: 2022年4月26日 22:31
>>
>> On 26.04.2022 12:37, Wei Chen wrote:
>>> On 2022/4/26 16:53, Jan Beulich wrote:
>>>> On 18.04.2022 11:07, Wei Chen wrote:
>>>>> diff --git a/xen/arch/x86/efi/stub.c b/xen/arch/x86/efi/stub-x86.c
>>>>> similarity index 71%
>>>>> rename from xen/arch/x86/efi/stub.c
>>>>> rename to xen/arch/x86/efi/stub-x86.c
>>>>> index 9984932626..2cd5c8d4dc 100644
>>>>> --- a/xen/arch/x86/efi/stub.c
>>>>> +++ b/xen/arch/x86/efi/stub-x86.c
>>>>
>>>> I'm not happy to see a file named *x86*.[ch] under x86/. I think the
>>>> x86 file wants to simply include the common one (and the symlinking
>>>> be suppressed when a real file already exists). Naming the common
>>>> file stub-common.c wouldn't help, as a similar anomaly would result.
>>>>
>>>
>>> How about using stub-arch.c to indicate this stub file only contains
>>> the arch specific contents? However, we cannot predict what link files
>>> will be created in this directory in the future. If someone needs to
>>> create a stub-arch.c link file in the future, can we solve it at that
>>> time?  Or do you have any suggestions?
>>
>> I did provide my suggestion. I do not like stub-arch.c any better than
>> stub-x86.c or stub-common.c.
>>
> 
> With my limited English level, I can only see that you don't like this, but
> I can't get what you want clearly from your comments. I can only guess:
> For "x86 file wants to simply include the common one":
> 1. Did you mean, x86 still keeps it stub.c and includes all its original
>    contents. The common/efi/stub.c link behavior will be ignored, because
>    of x86 has a real stub.c? And common/efi/stub.c still can works for
>    other architectures like Arm whom doesn't have a real stub.c?
>    But in previous version's discussion, I had said I created a stub.c in
>    Arm/efi, and copied Arm required functions from x86/efi/stub.c. But
>    people didn't like it. If my guess is correct, I don't know what is
>    the essential difference between the two approaches.
> 2. Keeps stub.c in x86/efi, and use it to include common/stub.c.
>    I think this may not be the right understanding, but I can't think
>    of any other understanding.

2 is what I've been suggesting.

Jan



^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v2 05/10] xen/x86: Use ASSERT instead of VIRTUAL_BUG_ON for phys_to_nid
  2022-04-27  3:52         ` Wei Chen
@ 2022-04-27  5:56           ` Jan Beulich
  2022-04-27  6:27             ` Wei Chen
  0 siblings, 1 reply; 38+ messages in thread
From: Jan Beulich @ 2022-04-27  5:56 UTC (permalink / raw)
  To: Wei Chen
  Cc: nd, Andrew Cooper, Roger Pau Monné,
	Wei Liu, George Dunlap, Julien Grall, Stefano Stabellini,
	xen-devel

On 27.04.2022 05:52, Wei Chen wrote:
>> From: Jan Beulich <jbeulich@suse.com>
>> Sent: 2022年4月26日 22:42
>>
>> On 26.04.2022 12:59, Wei Chen wrote:
>>> On 2022/4/26 17:02, Jan Beulich wrote:
>>>> On 18.04.2022 11:07, Wei Chen wrote:
>>>>> 2. error: wrong type argument to unary exclamation mark.
>>>>>     This is because, the error-checking code contains !node_data[nid].
>>>>>     But node_data is a data structure variable, it's not a pointer.
>>>>>
>>>>> So, in this patch, we use ASSERT instead of VIRTUAL_BUG_ON to
>>>>> enable the two lines of error-checking code. And fix the left
>>>>> compilation errors by replacing !node_data[nid] to
>>>>> !node_data[nid].node_spanned_pages.
>>>>>
>>>>> Because when node_spanned_pages is 0, this node has no memory,
>>>>> numa_scan_node will print warning message for such kind of nodes:
>>>>> "Firmware Bug or mis-configured hardware?".
>>>>
>>>> This warning is bogus - nodes can have only processors. Therefore I'd
>>>> like to ask that you don't use it for justification. And indeed you
>>>
>>> Yes, you're right, node can only has CPUs! I will remove it.
>>>
>>>> don't need to: phys_to_nid() is about translating an address. The
>>>> input address can't be valid if it maps to a node with no memory.
>>>>
>>>
>>> Can I understand your comment:
>>> Any input address is invalid, when node_spanned_pages is zero, because
>>> this node has no memory?
>>
>> It's getting close, but it's not exactly equivalent I think. A node
>> with 0 bytes of memory might (at least in theory) have an entry in
>> memnodemap[]. But finding a node ID for that address would still
> 
> I have done a quick check in populate_memnodemap:
> 74          spdx = paddr_to_pdx(nodes[i].start);
> 75          epdx = paddr_to_pdx(nodes[i].end - 1) + 1;
> 76          if ( spdx >= epdx )
> 77              continue;
> 
> It seems that if node has no memory, start == end, then this function
> will not populate memnodemap entry for this node.
> 
>> not mean that at least one byte of memory at that address is present
>> on the given node, because the node covers 0 bytes.
>>
> 
> And back to this patch, can I just drop the unnecessary justification
> from the commit message?

Well, you'll want to have _some_ justification for that particular
aspect of the patch.

> And for the bogus warning message, can I update it to an INFO level
> message in part#2 series, and just keep:
> printk(KERN_INFO "SRAT: Node %u has no memory!\n", i);
> but remove "BIOS Bug or mis-configured hardware?\n", i); ?

This sounds at least plausible to do.

Jan



^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v2 02/10] xen/x86: move reusable EFI stub functions from x86 to common
  2022-04-27  5:54           ` Jan Beulich
@ 2022-04-27  6:26             ` Wei Chen
  0 siblings, 0 replies; 38+ messages in thread
From: Wei Chen @ 2022-04-27  6:26 UTC (permalink / raw)
  To: Jan Beulich; +Cc: nd, Andrew Cooper, Roger Pau Monné, Wei Liu, xen-devel

Hi Jan,

On 2022/4/27 13:54, Jan Beulich wrote:
> On 27.04.2022 04:56, Wei Chen wrote:
>>> -----Original Message-----
>>> From: Jan Beulich <jbeulich@suse.com>
>>> Sent: 2022年4月26日 22:31
>>>
>>> On 26.04.2022 12:37, Wei Chen wrote:
>>>> On 2022/4/26 16:53, Jan Beulich wrote:
>>>>> On 18.04.2022 11:07, Wei Chen wrote:
>>>>>> diff --git a/xen/arch/x86/efi/stub.c b/xen/arch/x86/efi/stub-x86.c
>>>>>> similarity index 71%
>>>>>> rename from xen/arch/x86/efi/stub.c
>>>>>> rename to xen/arch/x86/efi/stub-x86.c
>>>>>> index 9984932626..2cd5c8d4dc 100644
>>>>>> --- a/xen/arch/x86/efi/stub.c
>>>>>> +++ b/xen/arch/x86/efi/stub-x86.c
>>>>>
>>>>> I'm not happy to see a file named *x86*.[ch] under x86/. I think the
>>>>> x86 file wants to simply include the common one (and the symlinking
>>>>> be suppressed when a real file already exists). Naming the common
>>>>> file stub-common.c wouldn't help, as a similar anomaly would result.
>>>>>
>>>>
>>>> How about using stub-arch.c to indicate this stub file only contains
>>>> the arch specific contents? However, we cannot predict what link files
>>>> will be created in this directory in the future. If someone needs to
>>>> create a stub-arch.c link file in the future, can we solve it at that
>>>> time?  Or do you have any suggestions?
>>>
>>> I did provide my suggestion. I do not like stub-arch.c any better than
>>> stub-x86.c or stub-common.c.
>>>
>>
>> With my limited English level, I can only see that you don't like this, but
>> I can't get what you want clearly from your comments. I can only guess:
>> For "x86 file wants to simply include the common one":
>> 1. Did you mean, x86 still keeps it stub.c and includes all its original
>>     contents. The common/efi/stub.c link behavior will be ignored, because
>>     of x86 has a real stub.c? And common/efi/stub.c still can works for
>>     other architectures like Arm whom doesn't have a real stub.c?
>>     But in previous version's discussion, I had said I created a stub.c in
>>     Arm/efi, and copied Arm required functions from x86/efi/stub.c. But
>>     people didn't like it. If my guess is correct, I don't know what is
>>     the essential difference between the two approaches.
>> 2. Keeps stub.c in x86/efi, and use it to include common/stub.c.
>>     I think this may not be the right understanding, but I can't think
>>     of any other understanding.
> 
> 2 is what I've been suggesting.
> 

Got it, thanks!

> Jan
> 


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v2 05/10] xen/x86: Use ASSERT instead of VIRTUAL_BUG_ON for phys_to_nid
  2022-04-27  5:56           ` Jan Beulich
@ 2022-04-27  6:27             ` Wei Chen
  0 siblings, 0 replies; 38+ messages in thread
From: Wei Chen @ 2022-04-27  6:27 UTC (permalink / raw)
  To: Jan Beulich
  Cc: nd, Andrew Cooper, Roger Pau Monné,
	Wei Liu, George Dunlap, Julien Grall, Stefano Stabellini,
	xen-devel

Hi Jan,

On 2022/4/27 13:56, Jan Beulich wrote:
> On 27.04.2022 05:52, Wei Chen wrote:
>>> From: Jan Beulich <jbeulich@suse.com>
>>> Sent: 2022年4月26日 22:42
>>>
>>> On 26.04.2022 12:59, Wei Chen wrote:
>>>> On 2022/4/26 17:02, Jan Beulich wrote:
>>>>> On 18.04.2022 11:07, Wei Chen wrote:
>>>>>> 2. error: wrong type argument to unary exclamation mark.
>>>>>>      This is because, the error-checking code contains !node_data[nid].
>>>>>>      But node_data is a data structure variable, it's not a pointer.
>>>>>>
>>>>>> So, in this patch, we use ASSERT instead of VIRTUAL_BUG_ON to
>>>>>> enable the two lines of error-checking code. And fix the left
>>>>>> compilation errors by replacing !node_data[nid] to
>>>>>> !node_data[nid].node_spanned_pages.
>>>>>>
>>>>>> Because when node_spanned_pages is 0, this node has no memory,
>>>>>> numa_scan_node will print warning message for such kind of nodes:
>>>>>> "Firmware Bug or mis-configured hardware?".
>>>>>
>>>>> This warning is bogus - nodes can have only processors. Therefore I'd
>>>>> like to ask that you don't use it for justification. And indeed you
>>>>
>>>> Yes, you're right, node can only has CPUs! I will remove it.
>>>>
>>>>> don't need to: phys_to_nid() is about translating an address. The
>>>>> input address can't be valid if it maps to a node with no memory.
>>>>>
>>>>
>>>> Can I understand your comment:
>>>> Any input address is invalid, when node_spanned_pages is zero, because
>>>> this node has no memory?
>>>
>>> It's getting close, but it's not exactly equivalent I think. A node
>>> with 0 bytes of memory might (at least in theory) have an entry in
>>> memnodemap[]. But finding a node ID for that address would still
>>
>> I have done a quick check in populate_memnodemap:
>> 74          spdx = paddr_to_pdx(nodes[i].start);
>> 75          epdx = paddr_to_pdx(nodes[i].end - 1) + 1;
>> 76          if ( spdx >= epdx )
>> 77              continue;
>>
>> It seems that if node has no memory, start == end, then this function
>> will not populate memnodemap entry for this node.
>>
>>> not mean that at least one byte of memory at that address is present
>>> on the given node, because the node covers 0 bytes.
>>>
>>
>> And back to this patch, can I just drop the unnecessary justification
>> from the commit message?
> 
> Well, you'll want to have _some_ justification for that particular
> aspect of the patch.
> 

Ok.

>> And for the bogus warning message, can I update it to an INFO level
>> message in part#2 series, and just keep:
>> printk(KERN_INFO "SRAT: Node %u has no memory!\n", i);
>> but remove "BIOS Bug or mis-configured hardware?\n", i); ?
> 
> This sounds at least plausible to do.
> 

Ok.

> Jan
> 


^ permalink raw reply	[flat|nested] 38+ messages in thread

end of thread, other threads:[~2022-04-27  6:28 UTC | newest]

Thread overview: 38+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-04-18  9:07 [PATCH v2 00/10] Device tree based NUMA support for Arm - Part#1 Wei Chen
2022-04-18  9:07 ` [PATCH v2 01/10] xen/arm: Print a 64-bit number in hex from early uart Wei Chen
2022-04-19  9:13   ` Jiamei Xie
2022-04-21  7:05     ` Wei Chen
2022-04-18  9:07 ` [PATCH v2 02/10] xen/x86: move reusable EFI stub functions from x86 to common Wei Chen
2022-04-21  0:17   ` Stefano Stabellini
2022-04-26  8:53   ` Jan Beulich
2022-04-26 10:37     ` Wei Chen
2022-04-26 14:31       ` Jan Beulich
2022-04-27  2:56         ` Wei Chen
2022-04-27  5:54           ` Jan Beulich
2022-04-27  6:26             ` Wei Chen
2022-04-18  9:07 ` [PATCH v2 03/10] xen/arm: add CONFIG_ARM_EFI to stub EFI API Wei Chen
2022-04-21  0:25   ` Stefano Stabellini
2022-04-21  7:01     ` Wei Chen
2022-04-21 21:35       ` Stefano Stabellini
2022-04-18  9:07 ` [PATCH v2 04/10] xen/arm: Keep memory nodes in device tree when Xen boots from EFI Wei Chen
2022-04-21  0:30   ` Stefano Stabellini
2022-04-18  9:07 ` [PATCH v2 05/10] xen/x86: Use ASSERT instead of VIRTUAL_BUG_ON for phys_to_nid Wei Chen
2022-04-21  0:37   ` Stefano Stabellini
2022-04-26  9:02   ` Jan Beulich
2022-04-26 10:59     ` Wei Chen
2022-04-26 14:42       ` Jan Beulich
2022-04-27  3:52         ` Wei Chen
2022-04-27  5:56           ` Jan Beulich
2022-04-27  6:27             ` Wei Chen
2022-04-18  9:07 ` [PATCH v2 06/10] xen: introduce an arch helper for default dma zone status Wei Chen
2022-04-19  9:18   ` Jan Beulich
2022-04-21  7:03     ` Wei Chen
2022-04-18  9:07 ` [PATCH v2 07/10] xen: decouple NUMA from ACPI in Kconfig Wei Chen
2022-04-18  9:07 ` [PATCH v2 08/10] xen/arm: use !CONFIG_NUMA to keep fake NUMA API Wei Chen
2022-04-18  9:07 ` [PATCH v2 09/10] xen/x86: use paddr_t for addresses in NUMA node structure Wei Chen
2022-04-26  9:11   ` Jan Beulich
2022-04-26 10:42     ` Wei Chen
2022-04-18  9:07 ` [PATCH v2 10/10] xen/x86: add detection of memory interleaves for different nodes Wei Chen
2022-04-26  9:20   ` Jan Beulich
2022-04-26 11:07     ` Wei Chen
2022-04-19  9:29 ` [PATCH v2 00/10] Device tree based NUMA support for Arm - Part#1 Wei Chen

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.