* [PATCH 00/37] Add device tree based NUMA support to Arm
@ 2021-09-23 12:01 Wei Chen
  2021-09-23 12:02 ` [PATCH 01/37] xen/arm: Print a 64-bit number in hex from early uart Wei Chen
                   ` (36 more replies)
  0 siblings, 37 replies; 192+ messages in thread
From: Wei Chen @ 2021-09-23 12:01 UTC (permalink / raw)
  To: wei.chen, xen-devel, sstabellini, julien; +Cc: Bertrand.Marquis

Xen's memory allocation and scheduler modules are NUMA aware.
However, only x86 has implemented the architecture APIs needed
to support NUMA. Arm has been providing a set of fake
architecture APIs to stay compatible with the NUMA-aware memory
allocator and scheduler.

Arm systems have worked well as single-node NUMA systems with
these fake APIs, because there were no multi-node NUMA systems
on Arm. But in recent years, more and more Arm devices support
multiple NUMA nodes.

So now we have a new problem: when Xen runs on these Arm
devices, it still treats them as single-node SMP systems. The
NUMA affinity capability of Xen's memory allocator and
scheduler becomes meaningless, because they rely on input data
that does not reflect the real NUMA layout.

Xen still assumes the access time to all memory is the same for
all CPUs. However, Xen may allocate memory to a VM from
different NUMA nodes with different access speeds. This
difference can be amplified by workloads inside the VM, causing
performance instability and timeouts.

So in this patch series, we implement a set of NUMA APIs that
use the device tree to describe the NUMA layout. We reuse most
of the x86 NUMA code to create and maintain the mapping between
memory and CPUs, and to build the distance matrix between any
two NUMA nodes. Except for ACPI and some x86-specific code, we
have moved the rest to common code. In the next stage, when we
implement ACPI-based NUMA for Arm64, we may move the ACPI NUMA
code to common as well, but for now we keep it x86-only.

This patch series has been tested and boots well on one Arm64
NUMA machine and one HPE x86 NUMA machine.
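
For reference, the device tree layout such a parser consumes
follows the standard numa-node-id / numa-distance-map-v1
binding. A minimal illustrative two-node sketch (node names,
addresses and sizes here are made up, not from this series):

```dts
/* Each cpu and memory node carries a numa-node-id, and a
 * distance-map node supplies the inter-node distance matrix. */
cpus {
    cpu@0 {
        device_type = "cpu";
        reg = <0x0>;
        numa-node-id = <0>;
    };
    cpu@1 {
        device_type = "cpu";
        reg = <0x1>;
        numa-node-id = <1>;
    };
};

memory@80000000 {
    device_type = "memory";
    reg = <0x0 0x80000000 0x0 0x80000000>;
    numa-node-id = <0>;
};

memory@100000000 {
    device_type = "memory";
    reg = <0x1 0x00000000 0x0 0x80000000>;
    numa-node-id = <1>;
};

distance-map {
    compatible = "numa-distance-map-v1";
    distance-matrix = <0 0 10>,
                      <0 1 20>,
                      <1 0 20>,
                      <1 1 10>;
};
```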

---
@Julien, about the numa=noacpi option: I haven't removed it in
this patch series. I tried, but this option is not easy to
remove, so I'm not going to attempt that in this series.

rfc -> v1:
 1. Re-order the whole patch series to avoid temporary code
 2. Add detection of discontinuous node memory ranges
 3. Fix typos in commit messages and code comments.
 4. For variables that are used in common code, we no longer
    convert them to external. Instead, we export helpers to
    provide access to them.
 5. Revert memnodemap[0] to 0 when NUMA init failed. Change
    memnodemapsize from ARRAY_SIZE(_memnodemap) to 1 to reflect
    reality.
 6. Use arch_have_default_dmazone in page_alloc.c instead of
    changing code inside Arm
 7. Keep Kconfig options alphabetically sorted.
 8. Replace #if !defined by #ifndef
 9. Use paddr_t for addresses in NUMA node structures and function
    parameters
10. Use fw_numa to replace acpi_numa for neutrality
11. Change BIOS to Firmware in print message.
12. Promote VIRTUAL_BUG_ON to ASSERT
13. Introduce CONFIG_EFI to stub API for non-EFI architecture
14. Use EFI stub API to replace arch helper for efi_enabled
15. Use NR_MEM_BANKS for Arm's NR_NODE_MEMBLKS
16. Change matrix map default value from NUMA_REMOTE_DISTANCE to 0
17. Remove check in numa_set_node.
18. Follow the x86's method of adding CPU to NUMA
19. Use fdt prefix for all device tree NUMA parser's API
20. Check mismatched bi-directional distances in the matrix map
21. Remove useless fdt type check function
22. Update docs to remove x86-specific NUMA wording
23. Introduce Arm generic NUMA Kconfig option

Wei Chen (37):
  xen/arm: Print a 64-bit number in hex from early uart
  xen: introduce a Kconfig option to configure NUMA nodes number
  xen/x86: Initialize memnodemapsize while faking NUMA node
  xen: introduce an arch helper for default dma zone status
  xen: decouple NUMA from ACPI in Kconfig
  xen/arm: use !CONFIG_NUMA to keep fake NUMA API
  xen/x86: use paddr_t for addresses in NUMA node structure
  xen/x86: add detection of discontinous node memory range
  xen/x86: introduce two helpers to access memory hotplug end
  xen/x86: use helpers to access/update mem_hotplug
  xen/x86: abstract neutral code from acpi_numa_memory_affinity_init
  xen/x86: decouple nodes_cover_memory from E820 map
  xen/x86: decouple processor_nodes_parsed from acpi numa functions
  xen/x86: use name fw_numa to replace acpi_numa
  xen/x86: rename acpi_scan_nodes to numa_scan_nodes
  xen/x86: export srat_bad to external
  xen/x86: use CONFIG_NUMA to gate numa_scan_nodes
  xen: move NUMA common code from x86 to common
  xen/x86: promote VIRTUAL_BUG_ON to ASSERT in
  xen: introduce CONFIG_EFI to stub API for non-EFI architecture
  xen/arm: Keep memory nodes in dtb for NUMA when boot from EFI
  xen/arm: use NR_MEM_BANKS to override default NR_NODE_MEMBLKS
  xen/arm: implement node distance helpers for Arm
  xen/arm: implement two arch helpers to get memory map info
  xen/arm: implement bad_srat for Arm NUMA initialization
  xen/arm: build NUMA cpu_to_node map in dt_smp_init_cpus
  xen/arm: Add boot and secondary CPU to NUMA system
  xen/arm: stub memory hotplug access helpers for Arm
  xen/arm: introduce a helper to parse device tree processor node
  xen/arm: introduce a helper to parse device tree memory node
  xen/arm: introduce a helper to parse device tree NUMA distance map
  xen/arm: unified entry to parse all NUMA data from device tree
  xen/arm: keep guest still be NUMA unware
  xen/arm: enable device tree based NUMA in system init
  xen/arm: use CONFIG_NUMA to gate node_online_map in smpboot
  xen/arm: Provide Kconfig options for Arm to enable NUMA
  docs: update numa command line to support Arm

 docs/misc/xen-command-line.pandoc |   2 +-
 xen/arch/Kconfig                  |  11 +
 xen/arch/arm/Kconfig              |  12 +
 xen/arch/arm/Makefile             |   4 +-
 xen/arch/arm/arm64/head.S         |   9 +-
 xen/arch/arm/bootfdt.c            |   8 +-
 xen/arch/arm/domain_build.c       |   6 +
 xen/arch/arm/efi/efi-boot.h       |  25 --
 xen/arch/arm/numa.c               | 155 ++++++++++
 xen/arch/arm/numa_device_tree.c   | 274 ++++++++++++++++++
 xen/arch/arm/setup.c              |  12 +
 xen/arch/arm/smpboot.c            |  39 ++-
 xen/arch/x86/Kconfig              |   3 +-
 xen/arch/x86/numa.c               | 449 ++---------------------------
 xen/arch/x86/setup.c              |   2 +-
 xen/arch/x86/srat.c               | 232 ++-------------
 xen/common/Kconfig                |  14 +
 xen/common/Makefile               |   2 +
 xen/common/numa.c                 | 450 ++++++++++++++++++++++++++++++
 xen/common/numa_srat.c            | 264 ++++++++++++++++++
 xen/common/page_alloc.c           |   2 +-
 xen/drivers/acpi/Kconfig          |   3 +-
 xen/drivers/acpi/Makefile         |   2 +-
 xen/include/asm-arm/mm.h          |  10 +
 xen/include/asm-arm/numa.h        |  50 ++++
 xen/include/asm-x86/acpi.h        |   4 -
 xen/include/asm-x86/config.h      |   1 -
 xen/include/asm-x86/mm.h          |  10 +
 xen/include/asm-x86/numa.h        |  65 +----
 xen/include/asm-x86/setup.h       |   1 -
 xen/include/xen/efi.h             |  11 +
 xen/include/xen/numa.h            |  94 ++++++-
 32 files changed, 1470 insertions(+), 756 deletions(-)
 create mode 100644 xen/arch/arm/numa.c
 create mode 100644 xen/arch/arm/numa_device_tree.c
 create mode 100644 xen/common/numa.c
 create mode 100644 xen/common/numa_srat.c

-- 
2.25.1



^ permalink raw reply	[flat|nested] 192+ messages in thread

* [PATCH 01/37] xen/arm: Print a 64-bit number in hex from early uart
  2021-09-23 12:01 [PATCH 00/37] Add device tree based NUMA support to Arm Wei Chen
@ 2021-09-23 12:02 ` Wei Chen
  2021-09-23 12:02 ` [PATCH 02/37] xen: introduce a Kconfig option to configure NUMA nodes number Wei Chen
                   ` (35 subsequent siblings)
  36 siblings, 0 replies; 192+ messages in thread
From: Wei Chen @ 2021-09-23 12:02 UTC (permalink / raw)
  To: wei.chen, xen-devel, sstabellini, julien; +Cc: Bertrand.Marquis

The current putn function used for early printing can only
print the low 32 bits of an AArch64 register. This can lose
important information while debugging with the early console.
For example:
(XEN) Bringing up CPU5
- CPU 0000000100000100 booting -
will be truncated to
(XEN) Bringing up CPU5
- CPU 00000100 booting -

In this patch, we increase the number of print loops and the
shift bits to make putn print a 64-bit number.

Signed-off-by: Wei Chen <wei.chen@arm.com>
Acked-by: Julien Grall <jgrall@amazon.com>
---
 xen/arch/arm/arm64/head.S | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/xen/arch/arm/arm64/head.S b/xen/arch/arm/arm64/head.S
index aa1f88c764..d957ea377b 100644
--- a/xen/arch/arm/arm64/head.S
+++ b/xen/arch/arm/arm64/head.S
@@ -862,17 +862,18 @@ puts:
         ret
 ENDPROC(puts)
 
-/* Print a 32-bit number in hex.  Specific to the PL011 UART.
+/* Print a 64-bit number in hex.
  * x0: Number to print.
  * x23: Early UART base address
  * Clobbers x0-x3 */
+#define PRINT_MASK 0xf000000000000000
 putn:
         adr   x1, hex
-        mov   x3, #8
+        mov   x3, #16
 1:
         early_uart_ready x23, 2
-        and   x2, x0, #0xf0000000    /* Mask off the top nybble */
-        lsr   x2, x2, #28
+        and   x2, x0, #PRINT_MASK    /* Mask off the top nybble */
+        lsr   x2, x2, #60
         ldrb  w2, [x1, x2]           /* Convert to a char */
         early_uart_transmit x23, w2
         lsl   x0, x0, #4             /* Roll it through one nybble at a time */
-- 
2.25.1




* [PATCH 02/37] xen: introduce a Kconfig option to configure NUMA nodes number
  2021-09-23 12:01 [PATCH 00/37] Add device tree based NUMA support to Arm Wei Chen
  2021-09-23 12:02 ` [PATCH 01/37] xen/arm: Print a 64-bit number in hex from early uart Wei Chen
@ 2021-09-23 12:02 ` Wei Chen
  2021-09-23 23:45   ` Stefano Stabellini
  2021-09-24  8:55   ` Jan Beulich
  2021-09-23 12:02 ` [PATCH 03/37] xen/x86: Initialize memnodemapsize while faking NUMA node Wei Chen
                   ` (34 subsequent siblings)
  36 siblings, 2 replies; 192+ messages in thread
From: Wei Chen @ 2021-09-23 12:02 UTC (permalink / raw)
  To: wei.chen, xen-devel, sstabellini, julien; +Cc: Bertrand.Marquis

The number of NUMA nodes is currently a hardcoded
configuration, which is difficult for an administrator to
change without modifying the code.

So in this patch, we introduce a new Kconfig option that lets
administrators change the number of NUMA nodes conveniently.
Considering that not all architectures support NUMA, this
Kconfig option is only visible on NUMA-enabled architectures.
Architectures without NUMA support still use 1 as MAX_NUMNODES.

Signed-off-by: Wei Chen <wei.chen@arm.com>
---
 xen/arch/Kconfig           | 11 +++++++++++
 xen/include/asm-x86/numa.h |  2 --
 xen/include/xen/numa.h     | 10 +++++-----
 3 files changed, 16 insertions(+), 7 deletions(-)

diff --git a/xen/arch/Kconfig b/xen/arch/Kconfig
index f16eb0df43..8a20da67ed 100644
--- a/xen/arch/Kconfig
+++ b/xen/arch/Kconfig
@@ -17,3 +17,14 @@ config NR_CPUS
 	  For CPU cores which support Simultaneous Multi-Threading or similar
 	  technologies, this the number of logical threads which Xen will
 	  support.
+
+config NR_NUMA_NODES
+	int "Maximum number of NUMA nodes supported"
+	range 1 4095
+	default "64"
+	depends on NUMA
+	help
+	  Controls the build-time size of various arrays and bitmaps
+	  associated with multiple-nodes management. It is the upper bound of
+	  the number of NUMA nodes the scheduler, memory allocation and other
+	  NUMA-aware components can handle.
diff --git a/xen/include/asm-x86/numa.h b/xen/include/asm-x86/numa.h
index bada2c0bb9..3cf26c2def 100644
--- a/xen/include/asm-x86/numa.h
+++ b/xen/include/asm-x86/numa.h
@@ -3,8 +3,6 @@
 
 #include <xen/cpumask.h>
 
-#define NODES_SHIFT 6
-
 typedef u8 nodeid_t;
 
 extern int srat_rev;
diff --git a/xen/include/xen/numa.h b/xen/include/xen/numa.h
index 7aef1a88dc..52950a3150 100644
--- a/xen/include/xen/numa.h
+++ b/xen/include/xen/numa.h
@@ -3,14 +3,14 @@
 
 #include <asm/numa.h>
 
-#ifndef NODES_SHIFT
-#define NODES_SHIFT     0
-#endif
-
 #define NUMA_NO_NODE     0xFF
 #define NUMA_NO_DISTANCE 0xFF
 
-#define MAX_NUMNODES    (1 << NODES_SHIFT)
+#ifdef CONFIG_NR_NUMA_NODES
+#define MAX_NUMNODES CONFIG_NR_NUMA_NODES
+#else
+#define MAX_NUMNODES    1
+#endif
 
 #define vcpu_to_node(v) (cpu_to_node((v)->processor))
 
-- 
2.25.1




* [PATCH 03/37] xen/x86: Initialize memnodemapsize while faking NUMA node
  2021-09-23 12:01 [PATCH 00/37] Add device tree based NUMA support to Arm Wei Chen
  2021-09-23 12:02 ` [PATCH 01/37] xen/arm: Print a 64-bit number in hex from early uart Wei Chen
  2021-09-23 12:02 ` [PATCH 02/37] xen: introduce a Kconfig option to configure NUMA nodes number Wei Chen
@ 2021-09-23 12:02 ` Wei Chen
  2021-09-24  8:57   ` Jan Beulich
  2021-09-23 12:02 ` [PATCH 04/37] xen: introduce an arch helper for default dma zone status Wei Chen
                   ` (33 subsequent siblings)
  36 siblings, 1 reply; 192+ messages in thread
From: Wei Chen @ 2021-09-23 12:02 UTC (permalink / raw)
  To: wei.chen, xen-devel, sstabellini, julien; +Cc: Bertrand.Marquis

When NUMA is turned off or the system lacks NUMA support, Xen
fakes a NUMA node to make the system work as a single-node
NUMA system.

In this case the memory node map doesn't need to be allocated
from boot pages; it uses _memnodemap directly. But
memnodemapsize hasn't been set, so an assertion should trigger
in phys_to_nid. Because x86 was using an empty macro
"VIRTUAL_BUG_ON" in place of ASSERT, this bug is not triggered
on x86.

In reality, Xen only uses 1 slot of memnodemap in this case.
So in this patch we set memnodemap[0] to 0 and memnodemapsize
to 1 to fix it.

Signed-off-by: Wei Chen <wei.chen@arm.com>
---
 xen/arch/x86/numa.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/xen/arch/x86/numa.c b/xen/arch/x86/numa.c
index f1066c59c7..ce79ee44ce 100644
--- a/xen/arch/x86/numa.c
+++ b/xen/arch/x86/numa.c
@@ -270,6 +270,10 @@ void __init numa_initmem_init(unsigned long start_pfn, unsigned long end_pfn)
     /* setup dummy node covering all memory */
     memnode_shift = BITS_PER_LONG - 1;
     memnodemap = _memnodemap;
+    /* Dummy node only uses 1 slot in reality */
+    memnodemap[0] = 0;
+    memnodemapsize = 1;
+
     nodes_clear(node_online_map);
     node_set_online(0);
     for ( i = 0; i < nr_cpu_ids; i++ )
-- 
2.25.1




* [PATCH 04/37] xen: introduce an arch helper for default dma zone status
  2021-09-23 12:01 [PATCH 00/37] Add device tree based NUMA support to Arm Wei Chen
                   ` (2 preceding siblings ...)
  2021-09-23 12:02 ` [PATCH 03/37] xen/x86: Initialize memnodemapsize while faking NUMA node Wei Chen
@ 2021-09-23 12:02 ` Wei Chen
  2021-09-23 23:55   ` Stefano Stabellini
  2022-01-17 16:10   ` Jan Beulich
  2021-09-23 12:02 ` [PATCH 05/37] xen: decouple NUMA from ACPI in Kconfig Wei Chen
                   ` (32 subsequent siblings)
  36 siblings, 2 replies; 192+ messages in thread
From: Wei Chen @ 2021-09-23 12:02 UTC (permalink / raw)
  To: wei.chen, xen-devel, sstabellini, julien; +Cc: Bertrand.Marquis

In the current code, when Xen is running on a multi-node NUMA
system, it sets dma_bitsize in end_boot_allocator to reserve
some low-address memory for DMA.

The current implementation carries some x86 implications,
because on x86 memory starts from 0, so on a multi-node NUMA
system a single node may contain the majority or all of the DMA
memory. x86 prefers to hand out memory from non-local
allocations rather than exhausting the DMA memory ranges, hence
it uses dma_bitsize to set aside a largely arbitrary amount of
memory for DMA ranges; allocations from these ranges happen
only after all other nodes' memory is exhausted.

But these implications are not shared across all architectures;
Arm, for example, doesn't have them. So in this patch, we
introduce an arch_have_default_dmazone helper for an
architecture to indicate whether it needs to set dma_bitsize to
reserve memory for DMA allocations.

Signed-off-by: Wei Chen <wei.chen@arm.com>
---
 xen/arch/x86/numa.c        | 5 +++++
 xen/common/page_alloc.c    | 2 +-
 xen/include/asm-arm/numa.h | 5 +++++
 xen/include/asm-x86/numa.h | 1 +
 4 files changed, 12 insertions(+), 1 deletion(-)

diff --git a/xen/arch/x86/numa.c b/xen/arch/x86/numa.c
index ce79ee44ce..1fabbe8281 100644
--- a/xen/arch/x86/numa.c
+++ b/xen/arch/x86/numa.c
@@ -371,6 +371,11 @@ unsigned int __init arch_get_dma_bitsize(void)
                  + PAGE_SHIFT, 32);
 }
 
+unsigned int arch_have_default_dmazone(void)
+{
+    return ( num_online_nodes() > 1 ) ? 1 : 0;
+}
+
 static void dump_numa(unsigned char key)
 {
     s_time_t now = NOW();
diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c
index 5801358b4b..80916205e5 100644
--- a/xen/common/page_alloc.c
+++ b/xen/common/page_alloc.c
@@ -1889,7 +1889,7 @@ void __init end_boot_allocator(void)
     }
     nr_bootmem_regions = 0;
 
-    if ( !dma_bitsize && (num_online_nodes() > 1) )
+    if ( !dma_bitsize && arch_have_default_dmazone() )
         dma_bitsize = arch_get_dma_bitsize();
 
     printk("Domain heap initialised");
diff --git a/xen/include/asm-arm/numa.h b/xen/include/asm-arm/numa.h
index 31a6de4e23..9d5739542d 100644
--- a/xen/include/asm-arm/numa.h
+++ b/xen/include/asm-arm/numa.h
@@ -25,6 +25,11 @@ extern mfn_t first_valid_mfn;
 #define node_start_pfn(nid) (mfn_x(first_valid_mfn))
 #define __node_distance(a, b) (20)
 
+static inline unsigned int arch_have_default_dmazone(void)
+{
+    return 0;
+}
+
 #endif /* __ARCH_ARM_NUMA_H */
 /*
  * Local variables:
diff --git a/xen/include/asm-x86/numa.h b/xen/include/asm-x86/numa.h
index 3cf26c2def..8060cbf3f4 100644
--- a/xen/include/asm-x86/numa.h
+++ b/xen/include/asm-x86/numa.h
@@ -78,5 +78,6 @@ extern int valid_numa_range(u64 start, u64 end, nodeid_t node);
 void srat_parse_regions(u64 addr);
 extern u8 __node_distance(nodeid_t a, nodeid_t b);
 unsigned int arch_get_dma_bitsize(void);
+unsigned int arch_have_default_dmazone(void);
 
 #endif
-- 
2.25.1




* [PATCH 05/37] xen: decouple NUMA from ACPI in Kconfig
  2021-09-23 12:01 [PATCH 00/37] Add device tree based NUMA support to Arm Wei Chen
                   ` (3 preceding siblings ...)
  2021-09-23 12:02 ` [PATCH 04/37] xen: introduce an arch helper for default dma zone status Wei Chen
@ 2021-09-23 12:02 ` Wei Chen
  2021-09-23 12:02 ` [PATCH 06/37] xen/arm: use !CONFIG_NUMA to keep fake NUMA API Wei Chen
                   ` (31 subsequent siblings)
  36 siblings, 0 replies; 192+ messages in thread
From: Wei Chen @ 2021-09-23 12:02 UTC (permalink / raw)
  To: wei.chen, xen-devel, sstabellini, julien; +Cc: Bertrand.Marquis

The current Xen code only implements x86 ACPI-based NUMA
support, so in the Xen Kconfig system NUMA equals ACPI_NUMA:
x86 selects NUMA by default, and CONFIG_ACPI_NUMA is hardcoded
in config.h.

In a follow-up patch, we will introduce support for NUMA using
the device tree. That means we will have two NUMA
implementations, so in this patch we decouple NUMA from
ACPI-based NUMA in Kconfig, making NUMA a common feature that
device tree based NUMA can also select.

Signed-off-by: Wei Chen <wei.chen@arm.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
---
 xen/arch/x86/Kconfig         | 2 +-
 xen/common/Kconfig           | 3 +++
 xen/drivers/acpi/Kconfig     | 3 ++-
 xen/drivers/acpi/Makefile    | 2 +-
 xen/include/asm-x86/config.h | 1 -
 5 files changed, 7 insertions(+), 4 deletions(-)

diff --git a/xen/arch/x86/Kconfig b/xen/arch/x86/Kconfig
index 1f83518ee0..28d13b9705 100644
--- a/xen/arch/x86/Kconfig
+++ b/xen/arch/x86/Kconfig
@@ -6,6 +6,7 @@ config X86
 	def_bool y
 	select ACPI
 	select ACPI_LEGACY_TABLES_LOOKUP
+	select ACPI_NUMA
 	select ALTERNATIVE_CALL
 	select ARCH_SUPPORTS_INT128
 	select CORE_PARKING
@@ -25,7 +26,6 @@ config X86
 	select HAS_UBSAN
 	select HAS_VPCI if HVM
 	select NEEDS_LIBELF
-	select NUMA
 
 config ARCH_DEFCONFIG
 	string
diff --git a/xen/common/Kconfig b/xen/common/Kconfig
index db687b1785..9ebb1c239b 100644
--- a/xen/common/Kconfig
+++ b/xen/common/Kconfig
@@ -70,6 +70,9 @@ config MEM_ACCESS
 config NEEDS_LIBELF
 	bool
 
+config NUMA
+	bool
+
 config STATIC_MEMORY
 	bool "Static Allocation Support (UNSUPPORTED)" if UNSUPPORTED
 	depends on ARM
diff --git a/xen/drivers/acpi/Kconfig b/xen/drivers/acpi/Kconfig
index b64d3731fb..e3f3d8f4b1 100644
--- a/xen/drivers/acpi/Kconfig
+++ b/xen/drivers/acpi/Kconfig
@@ -5,5 +5,6 @@ config ACPI
 config ACPI_LEGACY_TABLES_LOOKUP
 	bool
 
-config NUMA
+config ACPI_NUMA
 	bool
+	select NUMA
diff --git a/xen/drivers/acpi/Makefile b/xen/drivers/acpi/Makefile
index 4f8e97228e..2fc5230253 100644
--- a/xen/drivers/acpi/Makefile
+++ b/xen/drivers/acpi/Makefile
@@ -3,7 +3,7 @@ obj-y += utilities/
 obj-$(CONFIG_X86) += apei/
 
 obj-bin-y += tables.init.o
-obj-$(CONFIG_NUMA) += numa.o
+obj-$(CONFIG_ACPI_NUMA) += numa.o
 obj-y += osl.o
 obj-$(CONFIG_HAS_CPUFREQ) += pmstat.o
 
diff --git a/xen/include/asm-x86/config.h b/xen/include/asm-x86/config.h
index 883c2ef0df..9a6f0a6edf 100644
--- a/xen/include/asm-x86/config.h
+++ b/xen/include/asm-x86/config.h
@@ -31,7 +31,6 @@
 /* Intel P4 currently has largest cache line (L2 line size is 128 bytes). */
 #define CONFIG_X86_L1_CACHE_SHIFT 7
 
-#define CONFIG_ACPI_NUMA 1
 #define CONFIG_ACPI_SRAT 1
 #define CONFIG_ACPI_CSTATE 1
 
-- 
2.25.1




* [PATCH 06/37] xen/arm: use !CONFIG_NUMA to keep fake NUMA API
  2021-09-23 12:01 [PATCH 00/37] Add device tree based NUMA support to Arm Wei Chen
                   ` (4 preceding siblings ...)
  2021-09-23 12:02 ` [PATCH 05/37] xen: decouple NUMA from ACPI in Kconfig Wei Chen
@ 2021-09-23 12:02 ` Wei Chen
  2021-09-24  0:05   ` Stefano Stabellini
  2021-09-23 12:02 ` [PATCH 07/37] xen/x86: use paddr_t for addresses in NUMA node structure Wei Chen
                   ` (30 subsequent siblings)
  36 siblings, 1 reply; 192+ messages in thread
From: Wei Chen @ 2021-09-23 12:02 UTC (permalink / raw)
  To: wei.chen, xen-devel, sstabellini, julien; +Cc: Bertrand.Marquis

We introduced CONFIG_NUMA in a previous patch, and at this
stage the option is enabled only on x86. In a follow-up patch,
we will enable it for Arm. But we still want users to be able
to disable CONFIG_NUMA through Kconfig; in that case, keeping
the current fake NUMA API lets Arm code still work with the
NUMA-aware memory allocator and scheduler.

Signed-off-by: Wei Chen <wei.chen@arm.com>
---
 xen/include/asm-arm/numa.h | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/xen/include/asm-arm/numa.h b/xen/include/asm-arm/numa.h
index 9d5739542d..8f1c67e3eb 100644
--- a/xen/include/asm-arm/numa.h
+++ b/xen/include/asm-arm/numa.h
@@ -5,6 +5,8 @@
 
 typedef u8 nodeid_t;
 
+#ifndef CONFIG_NUMA
+
 /* Fake one node for now. See also node_online_map. */
 #define cpu_to_node(cpu) 0
 #define node_to_cpumask(node)   (cpu_online_map)
@@ -25,6 +27,8 @@ extern mfn_t first_valid_mfn;
 #define node_start_pfn(nid) (mfn_x(first_valid_mfn))
 #define __node_distance(a, b) (20)
 
+#endif
+
 static inline unsigned int arch_have_default_dmazone(void)
 {
     return 0;
-- 
2.25.1




* [PATCH 07/37] xen/x86: use paddr_t for addresses in NUMA node structure
  2021-09-23 12:01 [PATCH 00/37] Add device tree based NUMA support to Arm Wei Chen
                   ` (5 preceding siblings ...)
  2021-09-23 12:02 ` [PATCH 06/37] xen/arm: use !CONFIG_NUMA to keep fake NUMA API Wei Chen
@ 2021-09-23 12:02 ` Wei Chen
  2021-09-24  0:11   ` Stefano Stabellini
  2022-01-18 15:22   ` Jan Beulich
  2021-09-23 12:02 ` [PATCH 08/37] xen/x86: add detection of discontinous node memory range Wei Chen
                   ` (29 subsequent siblings)
  36 siblings, 2 replies; 192+ messages in thread
From: Wei Chen @ 2021-09-23 12:02 UTC (permalink / raw)
  To: wei.chen, xen-devel, sstabellini, julien; +Cc: Bertrand.Marquis

The NUMA node structure "struct node" uses u64 for node memory
ranges. To let other architectures reuse this NUMA node related
code, we replace the u64 with paddr_t, and use pfn_to_paddr and
paddr_to_pfn to replace explicit shift operations. The related
PRIx64 in print messages has been replaced by PRIpaddr at the
same time.

Signed-off-by: Wei Chen <wei.chen@arm.com>
---
 xen/arch/x86/numa.c        | 32 +++++++++++++++++---------------
 xen/arch/x86/srat.c        | 26 +++++++++++++-------------
 xen/include/asm-x86/numa.h |  8 ++++----
 3 files changed, 34 insertions(+), 32 deletions(-)

diff --git a/xen/arch/x86/numa.c b/xen/arch/x86/numa.c
index 1fabbe8281..6337bbdf31 100644
--- a/xen/arch/x86/numa.c
+++ b/xen/arch/x86/numa.c
@@ -165,12 +165,12 @@ int __init compute_hash_shift(struct node *nodes, int numnodes,
     return shift;
 }
 /* initialize NODE_DATA given nodeid and start/end */
-void __init setup_node_bootmem(nodeid_t nodeid, u64 start, u64 end)
-{ 
+void __init setup_node_bootmem(nodeid_t nodeid, paddr_t start, paddr_t end)
+{
     unsigned long start_pfn, end_pfn;
 
-    start_pfn = start >> PAGE_SHIFT;
-    end_pfn = end >> PAGE_SHIFT;
+    start_pfn = paddr_to_pfn(start);
+    end_pfn = paddr_to_pfn(end);
 
     NODE_DATA(nodeid)->node_start_pfn = start_pfn;
     NODE_DATA(nodeid)->node_spanned_pages = end_pfn - start_pfn;
@@ -201,11 +201,12 @@ void __init numa_init_array(void)
 static int numa_fake __initdata = 0;
 
 /* Numa emulation */
-static int __init numa_emulation(u64 start_pfn, u64 end_pfn)
+static int __init numa_emulation(unsigned long start_pfn,
+                                 unsigned long end_pfn)
 {
     int i;
     struct node nodes[MAX_NUMNODES];
-    u64 sz = ((end_pfn - start_pfn)<<PAGE_SHIFT) / numa_fake;
+    u64 sz = pfn_to_paddr(end_pfn - start_pfn) / numa_fake;
 
     /* Kludge needed for the hash function */
     if ( hweight64(sz) > 1 )
@@ -221,9 +222,9 @@ static int __init numa_emulation(u64 start_pfn, u64 end_pfn)
     memset(&nodes,0,sizeof(nodes));
     for ( i = 0; i < numa_fake; i++ )
     {
-        nodes[i].start = (start_pfn<<PAGE_SHIFT) + i*sz;
+        nodes[i].start = pfn_to_paddr(start_pfn) + i*sz;
         if ( i == numa_fake - 1 )
-            sz = (end_pfn<<PAGE_SHIFT) - nodes[i].start;
+            sz = pfn_to_paddr(end_pfn) - nodes[i].start;
         nodes[i].end = nodes[i].start + sz;
         printk(KERN_INFO "Faking node %d at %"PRIx64"-%"PRIx64" (%"PRIu64"MB)\n",
                i,
@@ -249,24 +250,26 @@ static int __init numa_emulation(u64 start_pfn, u64 end_pfn)
 void __init numa_initmem_init(unsigned long start_pfn, unsigned long end_pfn)
 { 
     int i;
+    paddr_t start, end;
 
 #ifdef CONFIG_NUMA_EMU
     if ( numa_fake && !numa_emulation(start_pfn, end_pfn) )
         return;
 #endif
 
+    start = pfn_to_paddr(start_pfn);
+    end = pfn_to_paddr(end_pfn);
+
 #ifdef CONFIG_ACPI_NUMA
-    if ( !numa_off && !acpi_scan_nodes((u64)start_pfn << PAGE_SHIFT,
-         (u64)end_pfn << PAGE_SHIFT) )
+    if ( !numa_off && !acpi_scan_nodes(start, end) )
         return;
 #endif
 
     printk(KERN_INFO "%s\n",
            numa_off ? "NUMA turned off" : "No NUMA configuration found");
 
-    printk(KERN_INFO "Faking a node at %016"PRIx64"-%016"PRIx64"\n",
-           (u64)start_pfn << PAGE_SHIFT,
-           (u64)end_pfn << PAGE_SHIFT);
+    printk(KERN_INFO "Faking a node at %016"PRIpaddr"-%016"PRIpaddr"\n",
+           start, end);
     /* setup dummy node covering all memory */
     memnode_shift = BITS_PER_LONG - 1;
     memnodemap = _memnodemap;
@@ -279,8 +282,7 @@ void __init numa_initmem_init(unsigned long start_pfn, unsigned long end_pfn)
     for ( i = 0; i < nr_cpu_ids; i++ )
         numa_set_node(i, 0);
     cpumask_copy(&node_to_cpumask[0], cpumask_of(0));
-    setup_node_bootmem(0, (u64)start_pfn << PAGE_SHIFT,
-                    (u64)end_pfn << PAGE_SHIFT);
+    setup_node_bootmem(0, start, end);
 }
 
 void numa_add_cpu(int cpu)
diff --git a/xen/arch/x86/srat.c b/xen/arch/x86/srat.c
index 6b77b98201..7d20d7f222 100644
--- a/xen/arch/x86/srat.c
+++ b/xen/arch/x86/srat.c
@@ -104,7 +104,7 @@ nodeid_t setup_node(unsigned pxm)
 	return node;
 }
 
-int valid_numa_range(u64 start, u64 end, nodeid_t node)
+int valid_numa_range(paddr_t start, paddr_t end, nodeid_t node)
 {
 	int i;
 
@@ -119,7 +119,7 @@ int valid_numa_range(u64 start, u64 end, nodeid_t node)
 	return 0;
 }
 
-static __init int conflicting_memblks(u64 start, u64 end)
+static __init int conflicting_memblks(paddr_t start, paddr_t end)
 {
 	int i;
 
@@ -135,7 +135,7 @@ static __init int conflicting_memblks(u64 start, u64 end)
 	return -1;
 }
 
-static __init void cutoff_node(int i, u64 start, u64 end)
+static __init void cutoff_node(int i, paddr_t start, paddr_t end)
 {
 	struct node *nd = &nodes[i];
 	if (nd->start < start) {
@@ -275,7 +275,7 @@ acpi_numa_processor_affinity_init(const struct acpi_srat_cpu_affinity *pa)
 void __init
 acpi_numa_memory_affinity_init(const struct acpi_srat_mem_affinity *ma)
 {
-	u64 start, end;
+	paddr_t start, end;
 	unsigned pxm;
 	nodeid_t node;
 	int i;
@@ -318,7 +318,7 @@ acpi_numa_memory_affinity_init(const struct acpi_srat_mem_affinity *ma)
 		bool mismatch = !(ma->flags & ACPI_SRAT_MEM_HOT_PLUGGABLE) !=
 		                !test_bit(i, memblk_hotplug);
 
-		printk("%sSRAT: PXM %u (%"PRIx64"-%"PRIx64") overlaps with itself (%"PRIx64"-%"PRIx64")\n",
+		printk("%sSRAT: PXM %u (%"PRIpaddr"-%"PRIpaddr") overlaps with itself (%"PRIpaddr"-%"PRIpaddr")\n",
 		       mismatch ? KERN_ERR : KERN_WARNING, pxm, start, end,
 		       node_memblk_range[i].start, node_memblk_range[i].end);
 		if (mismatch) {
@@ -327,7 +327,7 @@ acpi_numa_memory_affinity_init(const struct acpi_srat_mem_affinity *ma)
 		}
 	} else {
 		printk(KERN_ERR
-		       "SRAT: PXM %u (%"PRIx64"-%"PRIx64") overlaps with PXM %u (%"PRIx64"-%"PRIx64")\n",
+		       "SRAT: PXM %u (%"PRIpaddr"-%"PRIpaddr") overlaps with PXM %u (%"PRIpaddr"-%"PRIpaddr")\n",
 		       pxm, start, end, node_to_pxm(memblk_nodeid[i]),
 		       node_memblk_range[i].start, node_memblk_range[i].end);
 		bad_srat();
@@ -346,7 +346,7 @@ acpi_numa_memory_affinity_init(const struct acpi_srat_mem_affinity *ma)
 				nd->end = end;
 		}
 	}
-	printk(KERN_INFO "SRAT: Node %u PXM %u %"PRIx64"-%"PRIx64"%s\n",
+	printk(KERN_INFO "SRAT: Node %u PXM %u %"PRIpaddr"-%"PRIpaddr"%s\n",
 	       node, pxm, start, end,
 	       ma->flags & ACPI_SRAT_MEM_HOT_PLUGGABLE ? " (hotplug)" : "");
 
@@ -369,7 +369,7 @@ static int __init nodes_cover_memory(void)
 
 	for (i = 0; i < e820.nr_map; i++) {
 		int j, found;
-		unsigned long long start, end;
+		paddr_t start, end;
 
 		if (e820.map[i].type != E820_RAM) {
 			continue;
@@ -396,7 +396,7 @@ static int __init nodes_cover_memory(void)
 
 		if (start < end) {
 			printk(KERN_ERR "SRAT: No PXM for e820 range: "
-				"%016Lx - %016Lx\n", start, end);
+				"%"PRIpaddr" - %"PRIpaddr"\n", start, end);
 			return 0;
 		}
 	}
@@ -432,7 +432,7 @@ static int __init srat_parse_region(struct acpi_subtable_header *header,
 	return 0;
 }
 
-void __init srat_parse_regions(u64 addr)
+void __init srat_parse_regions(paddr_t addr)
 {
 	u64 mask;
 	unsigned int i;
@@ -441,7 +441,7 @@ void __init srat_parse_regions(u64 addr)
 	    acpi_table_parse(ACPI_SIG_SRAT, acpi_parse_srat))
 		return;
 
-	srat_region_mask = pdx_init_mask(addr);
+	srat_region_mask = pdx_init_mask((u64)addr);
 	acpi_table_parse_srat(ACPI_SRAT_TYPE_MEMORY_AFFINITY,
 			      srat_parse_region, 0);
 
@@ -457,7 +457,7 @@ void __init srat_parse_regions(u64 addr)
 }
 
 /* Use the information discovered above to actually set up the nodes. */
-int __init acpi_scan_nodes(u64 start, u64 end)
+int __init acpi_scan_nodes(paddr_t start, paddr_t end)
 {
 	int i;
 	nodemask_t all_nodes_parsed;
@@ -489,7 +489,7 @@ int __init acpi_scan_nodes(u64 start, u64 end)
 	/* Finally register nodes */
 	for_each_node_mask(i, all_nodes_parsed)
 	{
-		u64 size = nodes[i].end - nodes[i].start;
+		paddr_t size = nodes[i].end - nodes[i].start;
 		if ( size == 0 )
 			printk(KERN_WARNING "SRAT: Node %u has no memory. "
 			       "BIOS Bug or mis-configured hardware?\n", i);
diff --git a/xen/include/asm-x86/numa.h b/xen/include/asm-x86/numa.h
index 8060cbf3f4..50cfd8e7ef 100644
--- a/xen/include/asm-x86/numa.h
+++ b/xen/include/asm-x86/numa.h
@@ -16,7 +16,7 @@ extern cpumask_t     node_to_cpumask[];
 #define node_to_cpumask(node)    (node_to_cpumask[node])
 
 struct node { 
-	u64 start,end; 
+	paddr_t start,end;
 };
 
 extern int compute_hash_shift(struct node *nodes, int numnodes,
@@ -36,7 +36,7 @@ extern void numa_set_node(int cpu, nodeid_t node);
 extern nodeid_t setup_node(unsigned int pxm);
 extern void srat_detect_node(int cpu);
 
-extern void setup_node_bootmem(nodeid_t nodeid, u64 start, u64 end);
+extern void setup_node_bootmem(nodeid_t nodeid, paddr_t start, paddr_t end);
 extern nodeid_t apicid_to_node[];
 extern void init_cpu_to_node(void);
 
@@ -73,9 +73,9 @@ static inline __attribute__((pure)) nodeid_t phys_to_nid(paddr_t addr)
 #define node_end_pfn(nid)       (NODE_DATA(nid)->node_start_pfn + \
 				 NODE_DATA(nid)->node_spanned_pages)
 
-extern int valid_numa_range(u64 start, u64 end, nodeid_t node);
+extern int valid_numa_range(paddr_t start, paddr_t end, nodeid_t node);
 
-void srat_parse_regions(u64 addr);
+void srat_parse_regions(paddr_t addr);
 extern u8 __node_distance(nodeid_t a, nodeid_t b);
 unsigned int arch_get_dma_bitsize(void);
 unsigned int arch_have_default_dmazone(void);
-- 
2.25.1



^ permalink raw reply	[flat|nested] 192+ messages in thread

* [PATCH 08/37] xen/x86: add detection of discontinous node memory range
  2021-09-23 12:01 [PATCH 00/37] Add device tree based NUMA support to Arm Wei Chen
                   ` (6 preceding siblings ...)
  2021-09-23 12:02 ` [PATCH 07/37] xen/x86: use paddr_t for addresses in NUMA node structure Wei Chen
@ 2021-09-23 12:02 ` Wei Chen
  2021-09-24  0:25   ` Stefano Stabellini
  2022-01-18 16:13   ` Jan Beulich
  2021-09-23 12:02 ` [PATCH 09/37] xen/x86: introduce two helpers to access memory hotplug end Wei Chen
                   ` (28 subsequent siblings)
  36 siblings, 2 replies; 192+ messages in thread
From: Wei Chen @ 2021-09-23 12:02 UTC (permalink / raw)
  To: wei.chen, xen-devel, sstabellini, julien; +Cc: Bertrand.Marquis

One NUMA node may contain several memory blocks. In the current Xen
code, Xen maintains a single memory range per node to cover all of
that node's memory blocks. The problem is that, if the gap between
two of a node's memory blocks contains blocks that belong to other
nodes (remote memory blocks), the node's range is expanded to cover
those remote blocks as well.

Having one node's memory range contain other nodes' memory is
obviously unreasonable. It means the current NUMA code can only
support nodes with contiguous memory blocks. However, on a physical
machine, the address ranges of multiple nodes can be interleaved.

So in this patch, we add code to detect discontiguous memory blocks
for a node. NUMA initialization will fail and error messages will be
printed when Xen detects such a hardware configuration.

Signed-off-by: Wei Chen <wei.chen@arm.com>
---
 xen/arch/x86/srat.c | 36 ++++++++++++++++++++++++++++++++++++
 1 file changed, 36 insertions(+)

diff --git a/xen/arch/x86/srat.c b/xen/arch/x86/srat.c
index 7d20d7f222..2f08fa4660 100644
--- a/xen/arch/x86/srat.c
+++ b/xen/arch/x86/srat.c
@@ -271,6 +271,36 @@ acpi_numa_processor_affinity_init(const struct acpi_srat_cpu_affinity *pa)
 		       pxm, pa->apic_id, node);
 }
 
+/*
+ * Check to see if there are other nodes within this node's range.
+ * We just need to check full contains situation. Because overlaps
+ * have been checked before by conflicting_memblks.
+ */
+static bool __init is_node_memory_continuous(nodeid_t nid,
+    paddr_t start, paddr_t end)
+{
+	nodeid_t i;
+
+	struct node *nd = &nodes[nid];
+	for_each_node_mask(i, memory_nodes_parsed)
+	{
+		/* Skip itself */
+		if (i == nid)
+			continue;
+
+		nd = &nodes[i];
+		if (start < nd->start && nd->end < end)
+		{
+			printk(KERN_ERR
+			       "NODE %u: (%"PRIpaddr"-%"PRIpaddr") intertwine with NODE %u (%"PRIpaddr"-%"PRIpaddr")\n",
+			       nid, start, end, i, nd->start, nd->end);
+			return false;
+		}
+	}
+
+	return true;
+}
+
 /* Callback for parsing of the Proximity Domain <-> Memory Area mappings */
 void __init
 acpi_numa_memory_affinity_init(const struct acpi_srat_mem_affinity *ma)
@@ -344,6 +374,12 @@ acpi_numa_memory_affinity_init(const struct acpi_srat_mem_affinity *ma)
 				nd->start = start;
 			if (nd->end < end)
 				nd->end = end;
+
+			/* Check whether this range contains memory for other nodes */
+			if (!is_node_memory_continuous(node, nd->start, nd->end)) {
+				bad_srat();
+				return;
+			}
 		}
 	}
 	printk(KERN_INFO "SRAT: Node %u PXM %u %"PRIpaddr"-%"PRIpaddr"%s\n",
-- 
2.25.1




* [PATCH 09/37] xen/x86: introduce two helpers to access memory hotplug end
  2021-09-23 12:01 [PATCH 00/37] Add device tree based NUMA support to Arm Wei Chen
                   ` (7 preceding siblings ...)
  2021-09-23 12:02 ` [PATCH 08/37] xen/x86: add detection of discontinous node memory range Wei Chen
@ 2021-09-23 12:02 ` Wei Chen
  2021-09-24  0:29   ` Stefano Stabellini
  2022-01-24 16:24   ` Jan Beulich
  2021-09-23 12:02 ` [PATCH 10/37] xen/x86: use helpers to access/update mem_hotplug Wei Chen
                   ` (27 subsequent siblings)
  36 siblings, 2 replies; 192+ messages in thread
From: Wei Chen @ 2021-09-23 12:02 UTC (permalink / raw)
  To: wei.chen, xen-devel, sstabellini, julien; +Cc: Bertrand.Marquis

x86 provides a mem_hotplug variable to maintain the memory hotplug
end address, and this variable is accessed outside of mm.c. We want
some of that code to be reusable by architectures without memory
hotplug ability. So in this patch, we introduce two helpers to
replace direct access to mem_hotplug. This gives such architectures
the ability to stub these two APIs.

Signed-off-by: Wei Chen <wei.chen@arm.com>
---
 xen/include/asm-x86/mm.h | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/xen/include/asm-x86/mm.h b/xen/include/asm-x86/mm.h
index cb90527499..af2fc4b0cd 100644
--- a/xen/include/asm-x86/mm.h
+++ b/xen/include/asm-x86/mm.h
@@ -475,6 +475,16 @@ static inline int get_page_and_type(struct page_info *page,
 
 extern paddr_t mem_hotplug;
 
+static inline void mem_hotplug_update_boundary(paddr_t end)
+{
+    mem_hotplug = end;
+}
+
+static inline paddr_t mem_hotplug_boundary(void)
+{
+    return mem_hotplug;
+}
+
 /******************************************************************************
  * With shadow pagetables, the different kinds of address start
  * to get get confusing.
-- 
2.25.1




* [PATCH 10/37] xen/x86: use helpers to access/update mem_hotplug
  2021-09-23 12:01 [PATCH 00/37] Add device tree based NUMA support to Arm Wei Chen
                   ` (8 preceding siblings ...)
  2021-09-23 12:02 ` [PATCH 09/37] xen/x86: introduce two helpers to access memory hotplug end Wei Chen
@ 2021-09-23 12:02 ` Wei Chen
  2021-09-24  0:31   ` Stefano Stabellini
  2022-01-24 16:29   ` Jan Beulich
  2021-09-23 12:02 ` [PATCH 11/37] xen/x86: abstract neutral code from acpi_numa_memory_affinity_init Wei Chen
                   ` (26 subsequent siblings)
  36 siblings, 2 replies; 192+ messages in thread
From: Wei Chen @ 2021-09-23 12:02 UTC (permalink / raw)
  To: wei.chen, xen-devel, sstabellini, julien; +Cc: Bertrand.Marquis

We want to abstract code from acpi_numa_memory_affinity_init, but
mem_hotplug is coupled with x86. In this patch, we use the helpers
to replace direct access to mem_hotplug. This will allow most of
the code to become common.

Signed-off-by: Wei Chen <wei.chen@arm.com>
---
 xen/arch/x86/srat.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/xen/arch/x86/srat.c b/xen/arch/x86/srat.c
index 2f08fa4660..3334ede7a5 100644
--- a/xen/arch/x86/srat.c
+++ b/xen/arch/x86/srat.c
@@ -391,8 +391,8 @@ acpi_numa_memory_affinity_init(const struct acpi_srat_mem_affinity *ma)
 	memblk_nodeid[num_node_memblks] = node;
 	if (ma->flags & ACPI_SRAT_MEM_HOT_PLUGGABLE) {
 		__set_bit(num_node_memblks, memblk_hotplug);
-		if (end > mem_hotplug)
-			mem_hotplug = end;
+		if (end > mem_hotplug_boundary())
+			mem_hotplug_update_boundary(end);
 	}
 	num_node_memblks++;
 }
-- 
2.25.1




* [PATCH 11/37] xen/x86: abstract neutral code from acpi_numa_memory_affinity_init
  2021-09-23 12:01 [PATCH 00/37] Add device tree based NUMA support to Arm Wei Chen
                   ` (9 preceding siblings ...)
  2021-09-23 12:02 ` [PATCH 10/37] xen/x86: use helpers to access/update mem_hotplug Wei Chen
@ 2021-09-23 12:02 ` Wei Chen
  2021-09-24  0:38   ` Stefano Stabellini
  2022-01-24 16:50   ` Jan Beulich
  2021-09-23 12:02 ` [PATCH 12/37] xen/x86: decouple nodes_cover_memory from E820 map Wei Chen
                   ` (25 subsequent siblings)
  36 siblings, 2 replies; 192+ messages in thread
From: Wei Chen @ 2021-09-23 12:02 UTC (permalink / raw)
  To: wei.chen, xen-devel, sstabellini, julien; +Cc: Bertrand.Marquis

There is some code in acpi_numa_memory_affinity_init that updates
the node memory range and the node_memblk_range array. This code is
not ACPI specific; it can be shared by other NUMA implementations,
such as a device tree based one.

So in this patch, we abstract the memory range and block handling
code into a new function. This avoids exporting static variables
like node_memblk_range. The PXM in the neutral code's print
messages has been replaced by NODE, as PXM is ACPI specific.

Signed-off-by: Wei Chen <wei.chen@arm.com>
---
 xen/arch/x86/srat.c        | 131 +++++++++++++++++++++----------------
 xen/include/asm-x86/numa.h |   3 +
 2 files changed, 77 insertions(+), 57 deletions(-)

diff --git a/xen/arch/x86/srat.c b/xen/arch/x86/srat.c
index 3334ede7a5..18bc6b19bb 100644
--- a/xen/arch/x86/srat.c
+++ b/xen/arch/x86/srat.c
@@ -104,6 +104,14 @@ nodeid_t setup_node(unsigned pxm)
 	return node;
 }
 
+bool __init numa_memblks_available(void)
+{
+	if (num_node_memblks < NR_NODE_MEMBLKS)
+		return true;
+
+	return false;
+}
+
 int valid_numa_range(paddr_t start, paddr_t end, nodeid_t node)
 {
 	int i;
@@ -301,69 +309,35 @@ static bool __init is_node_memory_continuous(nodeid_t nid,
 	return true;
 }
 
-/* Callback for parsing of the Proximity Domain <-> Memory Area mappings */
-void __init
-acpi_numa_memory_affinity_init(const struct acpi_srat_mem_affinity *ma)
+/* Neutral NUMA memory affinity init function for ACPI and DT */
+int __init numa_update_node_memblks(nodeid_t node,
+		paddr_t start, paddr_t size, bool hotplug)
 {
-	paddr_t start, end;
-	unsigned pxm;
-	nodeid_t node;
+	paddr_t end = start + size;
 	int i;
 
-	if (srat_disabled())
-		return;
-	if (ma->header.length != sizeof(struct acpi_srat_mem_affinity)) {
-		bad_srat();
-		return;
-	}
-	if (!(ma->flags & ACPI_SRAT_MEM_ENABLED))
-		return;
-
-	start = ma->base_address;
-	end = start + ma->length;
-	/* Supplement the heuristics in l1tf_calculations(). */
-	l1tf_safe_maddr = max(l1tf_safe_maddr, ROUNDUP(end, PAGE_SIZE));
-
-	if (num_node_memblks >= NR_NODE_MEMBLKS)
-	{
-		dprintk(XENLOG_WARNING,
-                "Too many numa entry, try bigger NR_NODE_MEMBLKS \n");
-		bad_srat();
-		return;
-	}
-
-	pxm = ma->proximity_domain;
-	if (srat_rev < 2)
-		pxm &= 0xff;
-	node = setup_node(pxm);
-	if (node == NUMA_NO_NODE) {
-		bad_srat();
-		return;
-	}
-	/* It is fine to add this area to the nodes data it will be used later*/
+	/* It is fine to add this area to the nodes data it will be used later */
 	i = conflicting_memblks(start, end);
 	if (i < 0)
 		/* everything fine */;
 	else if (memblk_nodeid[i] == node) {
-		bool mismatch = !(ma->flags & ACPI_SRAT_MEM_HOT_PLUGGABLE) !=
-		                !test_bit(i, memblk_hotplug);
+		bool mismatch = !hotplug != !test_bit(i, memblk_hotplug);
 
-		printk("%sSRAT: PXM %u (%"PRIpaddr"-%"PRIpaddr") overlaps with itself (%"PRIpaddr"-%"PRIpaddr")\n",
-		       mismatch ? KERN_ERR : KERN_WARNING, pxm, start, end,
+		printk("%sSRAT: NODE %u (%"PRIpaddr"-%"PRIpaddr") overlaps with itself (%"PRIpaddr"-%"PRIpaddr")\n",
+		       mismatch ? KERN_ERR : KERN_WARNING, node, start, end,
 		       node_memblk_range[i].start, node_memblk_range[i].end);
 		if (mismatch) {
-			bad_srat();
-			return;
+			return -1;
 		}
 	} else {
 		printk(KERN_ERR
-		       "SRAT: PXM %u (%"PRIpaddr"-%"PRIpaddr") overlaps with PXM %u (%"PRIpaddr"-%"PRIpaddr")\n",
-		       pxm, start, end, node_to_pxm(memblk_nodeid[i]),
+		       "SRAT: NODE %u (%"PRIpaddr"-%"PRIpaddr") overlaps with NODE %u (%"PRIpaddr"-%"PRIpaddr")\n",
+		       node, start, end, memblk_nodeid[i],
 		       node_memblk_range[i].start, node_memblk_range[i].end);
-		bad_srat();
-		return;
+		return -1;
 	}
-	if (!(ma->flags & ACPI_SRAT_MEM_HOT_PLUGGABLE)) {
+
+	if (!hotplug) {
 		struct node *nd = &nodes[node];
 
 		if (!node_test_and_set(node, memory_nodes_parsed)) {
@@ -375,26 +349,69 @@ acpi_numa_memory_affinity_init(const struct acpi_srat_mem_affinity *ma)
 			if (nd->end < end)
 				nd->end = end;
 
-			/* Check whether this range contains memory for other nodes */
-			if (!is_node_memory_continuous(node, nd->start, nd->end)) {
-				bad_srat();
-				return;
-			}
+			if (!is_node_memory_continuous(node, nd->start, nd->end))
+				return -1;
 		}
 	}
-	printk(KERN_INFO "SRAT: Node %u PXM %u %"PRIpaddr"-%"PRIpaddr"%s\n",
-	       node, pxm, start, end,
-	       ma->flags & ACPI_SRAT_MEM_HOT_PLUGGABLE ? " (hotplug)" : "");
+
+	printk(KERN_INFO "SRAT: Node %u %"PRIpaddr"-%"PRIpaddr"%s\n",
+	       node, start, end, hotplug ? " (hotplug)" : "");
 
 	node_memblk_range[num_node_memblks].start = start;
 	node_memblk_range[num_node_memblks].end = end;
 	memblk_nodeid[num_node_memblks] = node;
-	if (ma->flags & ACPI_SRAT_MEM_HOT_PLUGGABLE) {
+	if (hotplug) {
 		__set_bit(num_node_memblks, memblk_hotplug);
 		if (end > mem_hotplug_boundary())
 			mem_hotplug_update_boundary(end);
 	}
 	num_node_memblks++;
+
+	return 0;
+}
+
+/* Callback for parsing of the Proximity Domain <-> Memory Area mappings */
+void __init
+acpi_numa_memory_affinity_init(const struct acpi_srat_mem_affinity *ma)
+{
+	unsigned pxm;
+	nodeid_t node;
+	int ret;
+
+	if (srat_disabled())
+		return;
+	if (ma->header.length != sizeof(struct acpi_srat_mem_affinity)) {
+		bad_srat();
+		return;
+	}
+	if (!(ma->flags & ACPI_SRAT_MEM_ENABLED))
+		return;
+
+	/* Supplement the heuristics in l1tf_calculations(). */
+	l1tf_safe_maddr = max(l1tf_safe_maddr,
+			ROUNDUP((ma->base_address + ma->length), PAGE_SIZE));
+
+	if (!numa_memblks_available())
+	{
+		dprintk(XENLOG_WARNING,
+                "Too many numa entry, try bigger NR_NODE_MEMBLKS \n");
+		bad_srat();
+		return;
+	}
+
+	pxm = ma->proximity_domain;
+	if (srat_rev < 2)
+		pxm &= 0xff;
+	node = setup_node(pxm);
+	if (node == NUMA_NO_NODE) {
+		bad_srat();
+		return;
+	}
+
+	ret = numa_update_node_memblks(node, ma->base_address, ma->length,
+					ma->flags & ACPI_SRAT_MEM_HOT_PLUGGABLE);
+	if (ret != 0)
+		bad_srat();
 }
 
 /* Sanity check to catch more bad SRATs (they are amazingly common).
diff --git a/xen/include/asm-x86/numa.h b/xen/include/asm-x86/numa.h
index 50cfd8e7ef..5772a70665 100644
--- a/xen/include/asm-x86/numa.h
+++ b/xen/include/asm-x86/numa.h
@@ -74,6 +74,9 @@ static inline __attribute__((pure)) nodeid_t phys_to_nid(paddr_t addr)
 				 NODE_DATA(nid)->node_spanned_pages)
 
 extern int valid_numa_range(paddr_t start, paddr_t end, nodeid_t node);
+extern bool numa_memblks_available(void);
+extern int numa_update_node_memblks(nodeid_t node,
+		paddr_t start, paddr_t size, bool hotplug);
 
 void srat_parse_regions(paddr_t addr);
 extern u8 __node_distance(nodeid_t a, nodeid_t b);
-- 
2.25.1




* [PATCH 12/37] xen/x86: decouple nodes_cover_memory from E820 map
  2021-09-23 12:01 [PATCH 00/37] Add device tree based NUMA support to Arm Wei Chen
                   ` (10 preceding siblings ...)
  2021-09-23 12:02 ` [PATCH 11/37] xen/x86: abstract neutral code from acpi_numa_memory_affinity_init Wei Chen
@ 2021-09-23 12:02 ` Wei Chen
  2021-09-24  0:39   ` Stefano Stabellini
  2022-01-24 16:59   ` Jan Beulich
  2021-09-23 12:02 ` [PATCH 13/37] xen/x86: decouple processor_nodes_parsed from acpi numa functions Wei Chen
                   ` (24 subsequent siblings)
  36 siblings, 2 replies; 192+ messages in thread
From: Wei Chen @ 2021-09-23 12:02 UTC (permalink / raw)
  To: wei.chen, xen-devel, sstabellini, julien; +Cc: Bertrand.Marquis

We will reuse nodes_cover_memory for Arm to check its bootmem
info. So we introduce two arch helpers to get the memory map's
entry count and a specified entry's range:
    arch_meminfo_get_nr_bank
    arch_meminfo_get_ram_bank_range

Depending on these two helpers, nodes_cover_memory becomes
architecture independent. The only change from an x86 perspective
is the additional checks:
  !start || !end

Signed-off-by: Wei Chen <wei.chen@arm.com>
---
 xen/arch/x86/numa.c        | 18 ++++++++++++++++++
 xen/arch/x86/srat.c        | 11 ++++-------
 xen/include/asm-x86/numa.h |  3 +++
 3 files changed, 25 insertions(+), 7 deletions(-)

diff --git a/xen/arch/x86/numa.c b/xen/arch/x86/numa.c
index 6337bbdf31..6bc4ade411 100644
--- a/xen/arch/x86/numa.c
+++ b/xen/arch/x86/numa.c
@@ -378,6 +378,24 @@ unsigned int arch_have_default_dmazone(void)
     return ( num_online_nodes() > 1 ) ? 1 : 0;
 }
 
+uint32_t __init arch_meminfo_get_nr_bank(void)
+{
+	return e820.nr_map;
+}
+
+int __init arch_meminfo_get_ram_bank_range(uint32_t bank,
+	paddr_t *start, paddr_t *end)
+{
+	if (e820.map[bank].type != E820_RAM || !start || !end) {
+		return -1;
+	}
+
+	*start = e820.map[bank].addr;
+	*end = e820.map[bank].addr + e820.map[bank].size;
+
+	return 0;
+}
+
 static void dump_numa(unsigned char key)
 {
     s_time_t now = NOW();
diff --git a/xen/arch/x86/srat.c b/xen/arch/x86/srat.c
index 18bc6b19bb..aa07a7e975 100644
--- a/xen/arch/x86/srat.c
+++ b/xen/arch/x86/srat.c
@@ -419,17 +419,14 @@ acpi_numa_memory_affinity_init(const struct acpi_srat_mem_affinity *ma)
 static int __init nodes_cover_memory(void)
 {
 	int i;
+	uint32_t nr_banks = arch_meminfo_get_nr_bank();
 
-	for (i = 0; i < e820.nr_map; i++) {
+	for (i = 0; i < nr_banks; i++) {
 		int j, found;
 		paddr_t start, end;
 
-		if (e820.map[i].type != E820_RAM) {
+		if (arch_meminfo_get_ram_bank_range(i, &start, &end))
 			continue;
-		}
-
-		start = e820.map[i].addr;
-		end = e820.map[i].addr + e820.map[i].size;
 
 		do {
 			found = 0;
@@ -448,7 +445,7 @@ static int __init nodes_cover_memory(void)
 		} while (found && start < end);
 
 		if (start < end) {
-			printk(KERN_ERR "SRAT: No PXM for e820 range: "
+			printk(KERN_ERR "SRAT: No NODE for memory map range: "
 				"%"PRIpaddr" - %"PRIpaddr"\n", start, end);
 			return 0;
 		}
diff --git a/xen/include/asm-x86/numa.h b/xen/include/asm-x86/numa.h
index 5772a70665..78e044a390 100644
--- a/xen/include/asm-x86/numa.h
+++ b/xen/include/asm-x86/numa.h
@@ -82,5 +82,8 @@ void srat_parse_regions(paddr_t addr);
 extern u8 __node_distance(nodeid_t a, nodeid_t b);
 unsigned int arch_get_dma_bitsize(void);
 unsigned int arch_have_default_dmazone(void);
+extern uint32_t arch_meminfo_get_nr_bank(void);
+extern int arch_meminfo_get_ram_bank_range(uint32_t bank,
+    paddr_t *start, paddr_t *end);
 
 #endif
-- 
2.25.1




* [PATCH 13/37] xen/x86: decouple processor_nodes_parsed from acpi numa functions
  2021-09-23 12:01 [PATCH 00/37] Add device tree based NUMA support to Arm Wei Chen
                   ` (11 preceding siblings ...)
  2021-09-23 12:02 ` [PATCH 12/37] xen/x86: decouple nodes_cover_memory from E820 map Wei Chen
@ 2021-09-23 12:02 ` Wei Chen
  2021-09-24  0:40   ` Stefano Stabellini
  2022-01-25  9:49   ` Jan Beulich
  2021-09-23 12:02 ` [PATCH 14/37] xen/x86: use name fw_numa to replace acpi_numa Wei Chen
                   ` (23 subsequent siblings)
  36 siblings, 2 replies; 192+ messages in thread
From: Wei Chen @ 2021-09-23 12:02 UTC (permalink / raw)
  To: wei.chen, xen-devel, sstabellini, julien; +Cc: Bertrand.Marquis

Xen uses processor_nodes_parsed to record processor nodes parsed
from the ACPI table or another firmware-provided resource table.
This variable is used directly in the ACPI NUMA functions. In
follow-up patches, neutral NUMA code will be abstracted and moved
to other files. So in this patch, we introduce the
numa_set_processor_nodes_parsed helper to decouple
processor_nodes_parsed from the ACPI NUMA functions.

Signed-off-by: Wei Chen <wei.chen@arm.com>
---
 xen/arch/x86/srat.c        | 9 +++++++--
 xen/include/asm-x86/numa.h | 1 +
 2 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/xen/arch/x86/srat.c b/xen/arch/x86/srat.c
index aa07a7e975..9276a52138 100644
--- a/xen/arch/x86/srat.c
+++ b/xen/arch/x86/srat.c
@@ -104,6 +104,11 @@ nodeid_t setup_node(unsigned pxm)
 	return node;
 }
 
+void  __init numa_set_processor_nodes_parsed(nodeid_t node)
+{
+	node_set(node, processor_nodes_parsed);
+}
+
 bool __init numa_memblks_available(void)
 {
 	if (num_node_memblks < NR_NODE_MEMBLKS)
@@ -236,7 +241,7 @@ acpi_numa_x2apic_affinity_init(const struct acpi_srat_x2apic_cpu_affinity *pa)
 	}
 
 	apicid_to_node[pa->apic_id] = node;
-	node_set(node, processor_nodes_parsed);
+	numa_set_processor_nodes_parsed(node);
 	acpi_numa = 1;
 
 	if (opt_acpi_verbose)
@@ -271,7 +276,7 @@ acpi_numa_processor_affinity_init(const struct acpi_srat_cpu_affinity *pa)
 		return;
 	}
 	apicid_to_node[pa->apic_id] = node;
-	node_set(node, processor_nodes_parsed);
+	numa_set_processor_nodes_parsed(node);
 	acpi_numa = 1;
 
 	if (opt_acpi_verbose)
diff --git a/xen/include/asm-x86/numa.h b/xen/include/asm-x86/numa.h
index 78e044a390..295f875a51 100644
--- a/xen/include/asm-x86/numa.h
+++ b/xen/include/asm-x86/numa.h
@@ -77,6 +77,7 @@ extern int valid_numa_range(paddr_t start, paddr_t end, nodeid_t node);
 extern bool numa_memblks_available(void);
 extern int numa_update_node_memblks(nodeid_t node,
 		paddr_t start, paddr_t size, bool hotplug);
+extern void numa_set_processor_nodes_parsed(nodeid_t node);
 
 void srat_parse_regions(paddr_t addr);
 extern u8 __node_distance(nodeid_t a, nodeid_t b);
-- 
2.25.1




* [PATCH 14/37] xen/x86: use name fw_numa to replace acpi_numa
  2021-09-23 12:01 [PATCH 00/37] Add device tree based NUMA support to Arm Wei Chen
                   ` (12 preceding siblings ...)
  2021-09-23 12:02 ` [PATCH 13/37] xen/x86: decouple processor_nodes_parsed from acpi numa functions Wei Chen
@ 2021-09-23 12:02 ` Wei Chen
  2021-09-24  0:40   ` Stefano Stabellini
  2022-01-25 10:12   ` Jan Beulich
  2021-09-23 12:02 ` [PATCH 15/37] xen/x86: rename acpi_scan_nodes to numa_scan_nodes Wei Chen
                   ` (22 subsequent siblings)
  36 siblings, 2 replies; 192+ messages in thread
From: Wei Chen @ 2021-09-23 12:02 UTC (permalink / raw)
  To: wei.chen, xen-devel, sstabellini, julien; +Cc: Bertrand.Marquis

Xen uses acpi_numa as a switch for ACPI based NUMA. We want to use
this switch logic for other firmware based NUMA implementations,
like the device tree based NUMA introduced in follow-up patches.
Xen will never use both ACPI and device tree based NUMA at runtime,
so we rename acpi_numa to the more generic name fw_numa. This will
also allow the code to be mostly common.

Signed-off-by: Wei Chen <wei.chen@arm.com>
---
 xen/arch/x86/numa.c        |  6 +++---
 xen/arch/x86/setup.c       |  2 +-
 xen/arch/x86/srat.c        | 10 +++++-----
 xen/include/asm-x86/acpi.h |  2 +-
 4 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/xen/arch/x86/numa.c b/xen/arch/x86/numa.c
index 6bc4ade411..2ef385ae3f 100644
--- a/xen/arch/x86/numa.c
+++ b/xen/arch/x86/numa.c
@@ -51,11 +51,11 @@ cpumask_t node_to_cpumask[MAX_NUMNODES] __read_mostly;
 nodemask_t __read_mostly node_online_map = { { [0] = 1UL } };
 
 bool numa_off;
-s8 acpi_numa = 0;
+s8 fw_numa = 0;
 
 int srat_disabled(void)
 {
-    return numa_off || acpi_numa < 0;
+    return numa_off || fw_numa < 0;
 }
 
 /*
@@ -315,7 +315,7 @@ static __init int numa_setup(const char *opt)
     else if ( !strncmp(opt,"noacpi",6) )
     {
         numa_off = false;
-        acpi_numa = -1;
+        fw_numa = -1;
     }
 #endif
     else
diff --git a/xen/arch/x86/setup.c b/xen/arch/x86/setup.c
index b101565f14..1a2093b554 100644
--- a/xen/arch/x86/setup.c
+++ b/xen/arch/x86/setup.c
@@ -313,7 +313,7 @@ void srat_detect_node(int cpu)
     node_set_online(node);
     numa_set_node(cpu, node);
 
-    if ( opt_cpu_info && acpi_numa > 0 )
+    if ( opt_cpu_info && fw_numa > 0 )
         printk("CPU %d APIC %d -> Node %d\n", cpu, apicid, node);
 }
 
diff --git a/xen/arch/x86/srat.c b/xen/arch/x86/srat.c
index 9276a52138..4921830f94 100644
--- a/xen/arch/x86/srat.c
+++ b/xen/arch/x86/srat.c
@@ -167,7 +167,7 @@ static __init void bad_srat(void)
 {
 	int i;
 	printk(KERN_ERR "SRAT: SRAT not used.\n");
-	acpi_numa = -1;
+	fw_numa = -1;
 	for (i = 0; i < MAX_LOCAL_APIC; i++)
 		apicid_to_node[i] = NUMA_NO_NODE;
 	for (i = 0; i < ARRAY_SIZE(pxm2node); i++)
@@ -242,7 +242,7 @@ acpi_numa_x2apic_affinity_init(const struct acpi_srat_x2apic_cpu_affinity *pa)
 
 	apicid_to_node[pa->apic_id] = node;
 	numa_set_processor_nodes_parsed(node);
-	acpi_numa = 1;
+	fw_numa = 1;
 
 	if (opt_acpi_verbose)
 		printk(KERN_INFO "SRAT: PXM %u -> APIC %08x -> Node %u\n",
@@ -277,7 +277,7 @@ acpi_numa_processor_affinity_init(const struct acpi_srat_cpu_affinity *pa)
 	}
 	apicid_to_node[pa->apic_id] = node;
 	numa_set_processor_nodes_parsed(node);
-	acpi_numa = 1;
+	fw_numa = 1;
 
 	if (opt_acpi_verbose)
 		printk(KERN_INFO "SRAT: PXM %u -> APIC %02x -> Node %u\n",
@@ -492,7 +492,7 @@ void __init srat_parse_regions(paddr_t addr)
 	u64 mask;
 	unsigned int i;
 
-	if (acpi_disabled || acpi_numa < 0 ||
+	if (acpi_disabled || fw_numa < 0 ||
 	    acpi_table_parse(ACPI_SIG_SRAT, acpi_parse_srat))
 		return;
 
@@ -521,7 +521,7 @@ int __init acpi_scan_nodes(paddr_t start, paddr_t end)
 	for (i = 0; i < MAX_NUMNODES; i++)
 		cutoff_node(i, start, end);
 
-	if (acpi_numa <= 0)
+	if (fw_numa <= 0)
 		return -1;
 
 	if (!nodes_cover_memory()) {
diff --git a/xen/include/asm-x86/acpi.h b/xen/include/asm-x86/acpi.h
index 7032f3a001..83be71fec3 100644
--- a/xen/include/asm-x86/acpi.h
+++ b/xen/include/asm-x86/acpi.h
@@ -101,7 +101,7 @@ extern unsigned long acpi_wakeup_address;
 
 #define ARCH_HAS_POWER_INIT	1
 
-extern s8 acpi_numa;
+extern s8 fw_numa;
 extern int acpi_scan_nodes(u64 start, u64 end);
 #define NR_NODE_MEMBLKS (MAX_NUMNODES*2)
 
-- 
2.25.1




* [PATCH 15/37] xen/x86: rename acpi_scan_nodes to numa_scan_nodes
  2021-09-23 12:01 [PATCH 00/37] Add device tree based NUMA support to Arm Wei Chen
                   ` (13 preceding siblings ...)
  2021-09-23 12:02 ` [PATCH 14/37] xen/x86: use name fw_numa to replace acpi_numa Wei Chen
@ 2021-09-23 12:02 ` Wei Chen
  2021-09-24  0:40   ` Stefano Stabellini
  2022-01-25 10:17   ` Jan Beulich
  2021-09-23 12:02 ` [PATCH 16/37] xen/x86: export srat_bad to external Wei Chen
                   ` (21 subsequent siblings)
  36 siblings, 2 replies; 192+ messages in thread
From: Wei Chen @ 2021-09-23 12:02 UTC (permalink / raw)
  To: wei.chen, xen-devel, sstabellini, julien; +Cc: Bertrand.Marquis

Most of the code in acpi_scan_nodes can be reused by other NUMA
implementations. Rename acpi_scan_nodes to the more generic name
numa_scan_nodes, and replace BIOS with Firmware in the print
message, as BIOS is an x86-specific name.

Signed-off-by: Wei Chen <wei.chen@arm.com>
---
 xen/arch/x86/numa.c        | 2 +-
 xen/arch/x86/srat.c        | 4 ++--
 xen/include/asm-x86/acpi.h | 2 +-
 3 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/xen/arch/x86/numa.c b/xen/arch/x86/numa.c
index 2ef385ae3f..8a4710df39 100644
--- a/xen/arch/x86/numa.c
+++ b/xen/arch/x86/numa.c
@@ -261,7 +261,7 @@ void __init numa_initmem_init(unsigned long start_pfn, unsigned long end_pfn)
     end = pfn_to_paddr(end_pfn);
 
 #ifdef CONFIG_ACPI_NUMA
-    if ( !numa_off && !acpi_scan_nodes(start, end) )
+    if ( !numa_off && !numa_scan_nodes(start, end) )
         return;
 #endif
 
diff --git a/xen/arch/x86/srat.c b/xen/arch/x86/srat.c
index 4921830f94..0b8b0b0c95 100644
--- a/xen/arch/x86/srat.c
+++ b/xen/arch/x86/srat.c
@@ -512,7 +512,7 @@ void __init srat_parse_regions(paddr_t addr)
 }
 
 /* Use the information discovered above to actually set up the nodes. */
-int __init acpi_scan_nodes(paddr_t start, paddr_t end)
+int __init numa_scan_nodes(paddr_t start, paddr_t end)
 {
 	int i;
 	nodemask_t all_nodes_parsed;
@@ -547,7 +547,7 @@ int __init acpi_scan_nodes(paddr_t start, paddr_t end)
 		paddr_t size = nodes[i].end - nodes[i].start;
 		if ( size == 0 )
 			printk(KERN_WARNING "SRAT: Node %u has no memory. "
-			       "BIOS Bug or mis-configured hardware?\n", i);
+			       "Firmware Bug or mis-configured hardware?\n", i);
 
 		setup_node_bootmem(i, nodes[i].start, nodes[i].end);
 	}
diff --git a/xen/include/asm-x86/acpi.h b/xen/include/asm-x86/acpi.h
index 83be71fec3..2add971072 100644
--- a/xen/include/asm-x86/acpi.h
+++ b/xen/include/asm-x86/acpi.h
@@ -102,7 +102,7 @@ extern unsigned long acpi_wakeup_address;
 #define ARCH_HAS_POWER_INIT	1
 
 extern s8 fw_numa;
-extern int acpi_scan_nodes(u64 start, u64 end);
+extern int numa_scan_nodes(u64 start, u64 end);
 #define NR_NODE_MEMBLKS (MAX_NUMNODES*2)
 
 extern struct acpi_sleep_info acpi_sinfo;
-- 
2.25.1




* [PATCH 16/37] xen/x86: export srat_bad to external
  2021-09-23 12:01 [PATCH 00/37] Add device tree based NUMA support to Arm Wei Chen
                   ` (14 preceding siblings ...)
  2021-09-23 12:02 ` [PATCH 15/37] xen/x86: rename acpi_scan_nodes to numa_scan_nodes Wei Chen
@ 2021-09-23 12:02 ` Wei Chen
  2021-09-24  0:41   ` Stefano Stabellini
  2022-01-25 10:22   ` Jan Beulich
  2021-09-23 12:02 ` [PATCH 17/37] xen/x86: use CONFIG_NUMA to gate numa_scan_nodes Wei Chen
                   ` (20 subsequent siblings)
  36 siblings, 2 replies; 192+ messages in thread
From: Wei Chen @ 2021-09-23 12:02 UTC (permalink / raw)
  To: wei.chen, xen-devel, sstabellini, julien; +Cc: Bertrand.Marquis

bad_srat is used when the NUMA initialization code fails to scan
the SRAT. It turns fw_numa to disabled status. Its implementation
depends on the NUMA implementation, and we want every NUMA
implementation to provide this function for the common
initialization code.

In this patch, we export bad_srat. This will allow the code to be
mostly common.

Signed-off-by: Wei Chen <wei.chen@arm.com>
---
 xen/arch/x86/srat.c        | 2 +-
 xen/include/asm-x86/numa.h | 1 +
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/xen/arch/x86/srat.c b/xen/arch/x86/srat.c
index 0b8b0b0c95..94bd5b34da 100644
--- a/xen/arch/x86/srat.c
+++ b/xen/arch/x86/srat.c
@@ -163,7 +163,7 @@ static __init void cutoff_node(int i, paddr_t start, paddr_t end)
 	}
 }
 
-static __init void bad_srat(void)
+__init void bad_srat(void)
 {
 	int i;
 	printk(KERN_ERR "SRAT: SRAT not used.\n");
diff --git a/xen/include/asm-x86/numa.h b/xen/include/asm-x86/numa.h
index 295f875a51..a5690a7098 100644
--- a/xen/include/asm-x86/numa.h
+++ b/xen/include/asm-x86/numa.h
@@ -32,6 +32,7 @@ extern bool numa_off;
 
 
 extern int srat_disabled(void);
+extern void bad_srat(void);
 extern void numa_set_node(int cpu, nodeid_t node);
 extern nodeid_t setup_node(unsigned int pxm);
 extern void srat_detect_node(int cpu);
-- 
2.25.1



^ permalink raw reply	[flat|nested] 192+ messages in thread

* [PATCH 17/37] xen/x86: use CONFIG_NUMA to gate numa_scan_nodes
  2021-09-23 12:01 [PATCH 00/37] Add device tree based NUMA support to Arm Wei Chen
                   ` (15 preceding siblings ...)
  2021-09-23 12:02 ` [PATCH 16/37] xen/x86: export srat_bad to external Wei Chen
@ 2021-09-23 12:02 ` Wei Chen
  2021-09-24  0:41   ` Stefano Stabellini
  2022-01-25 10:26   ` Jan Beulich
  2021-09-23 12:02 ` [PATCH 18/37] xen: move NUMA common code from x86 to common Wei Chen
                   ` (19 subsequent siblings)
  36 siblings, 2 replies; 192+ messages in thread
From: Wei Chen @ 2021-09-23 12:02 UTC (permalink / raw)
  To: wei.chen, xen-devel, sstabellini, julien; +Cc: Bertrand.Marquis

Now that numa_scan_nodes has become a neutral function, it no longer
makes sense to use CONFIG_ACPI_NUMA to gate it in numa_initmem_init.
As CONFIG_ACPI_NUMA is selected by CONFIG_NUMA on x86, this patch
replaces CONFIG_ACPI_NUMA with CONFIG_NUMA.

Signed-off-by: Wei Chen <wei.chen@arm.com>
---
 xen/arch/x86/numa.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/xen/arch/x86/numa.c b/xen/arch/x86/numa.c
index 8a4710df39..509d2738c0 100644
--- a/xen/arch/x86/numa.c
+++ b/xen/arch/x86/numa.c
@@ -260,7 +260,7 @@ void __init numa_initmem_init(unsigned long start_pfn, unsigned long end_pfn)
     start = pfn_to_paddr(start_pfn);
     end = pfn_to_paddr(end_pfn);
 
-#ifdef CONFIG_ACPI_NUMA
+#ifdef CONFIG_NUMA
     if ( !numa_off && !numa_scan_nodes(start, end) )
         return;
 #endif
-- 
2.25.1



^ permalink raw reply	[flat|nested] 192+ messages in thread

* [PATCH 18/37] xen: move NUMA common code from x86 to common
  2021-09-23 12:01 [PATCH 00/37] Add device tree based NUMA support to Arm Wei Chen
                   ` (16 preceding siblings ...)
  2021-09-23 12:02 ` [PATCH 17/37] xen/x86: use CONFIG_NUMA to gate numa_scan_nodes Wei Chen
@ 2021-09-23 12:02 ` Wei Chen
  2021-09-23 12:02 ` [PATCH 19/37] xen/x86: promote VIRTUAL_BUG_ON to ASSERT in Wei Chen
                   ` (18 subsequent siblings)
  36 siblings, 0 replies; 192+ messages in thread
From: Wei Chen @ 2021-09-23 12:02 UTC (permalink / raw)
  To: wei.chen, xen-devel, sstabellini, julien; +Cc: Bertrand.Marquis

Some common code has been decoupled and abstracted from the x86 ACPI
based NUMA implementation. To make this code reusable by other NUMA
implementations, we move it from x86 to the common folder. The code is
gated by CONFIG_NUMA, so it can only be used by architectures with NUMA
enabled. Architectures that do not support NUMA can still implement
stub NUMA APIs in asm/numa.h to keep NUMA-aware components happy.

In this patch, we also remove some unused include headers.

Signed-off-by: Wei Chen <wei.chen@arm.com>
---
 xen/arch/x86/numa.c         | 446 +----------------------------------
 xen/arch/x86/srat.c         | 253 +-------------------
 xen/common/Makefile         |   2 +
 xen/common/numa.c           | 450 ++++++++++++++++++++++++++++++++++++
 xen/common/numa_srat.c      | 264 +++++++++++++++++++++
 xen/include/asm-x86/acpi.h  |   4 -
 xen/include/asm-x86/numa.h  |  68 ------
 xen/include/asm-x86/setup.h |   1 -
 xen/include/xen/numa.h      |  82 +++++++
 9 files changed, 802 insertions(+), 768 deletions(-)
 create mode 100644 xen/common/numa.c
 create mode 100644 xen/common/numa_srat.c

diff --git a/xen/arch/x86/numa.c b/xen/arch/x86/numa.c
index 509d2738c0..92b6bdf7b9 100644
--- a/xen/arch/x86/numa.c
+++ b/xen/arch/x86/numa.c
@@ -3,24 +3,13 @@
  * Copyright 2002,2003 Andi Kleen, SuSE Labs.
  * Adapted for Xen: Ryan Harper <ryanh@us.ibm.com>
  */ 
-
-#include <xen/mm.h>
-#include <xen/string.h>
 #include <xen/init.h>
-#include <xen/ctype.h>
+#include <xen/mm.h>
 #include <xen/nodemask.h>
 #include <xen/numa.h>
-#include <xen/keyhandler.h>
-#include <xen/param.h>
-#include <xen/time.h>
-#include <xen/smp.h>
-#include <xen/pfn.h>
-#include <asm/acpi.h>
 #include <xen/sched.h>
-#include <xen/softirq.h>
 
-static int numa_setup(const char *s);
-custom_param("numa", numa_setup);
+#include <asm/acpi.h>
 
 #ifndef Dprintk
 #define Dprintk(x...)
@@ -29,300 +18,12 @@ custom_param("numa", numa_setup);
 /* from proto.h */
 #define round_up(x,y) ((((x)+(y))-1) & (~((y)-1)))
 
-struct node_data node_data[MAX_NUMNODES];
-
-/* Mapping from pdx to node id */
-int memnode_shift;
-static typeof(*memnodemap) _memnodemap[64];
-unsigned long memnodemapsize;
-u8 *memnodemap;
-
-nodeid_t cpu_to_node[NR_CPUS] __read_mostly = {
-    [0 ... NR_CPUS-1] = NUMA_NO_NODE
-};
 /*
  * Keep BIOS's CPU2node information, should not be used for memory allocaion
  */
 nodeid_t apicid_to_node[MAX_LOCAL_APIC] = {
     [0 ... MAX_LOCAL_APIC-1] = NUMA_NO_NODE
 };
-cpumask_t node_to_cpumask[MAX_NUMNODES] __read_mostly;
-
-nodemask_t __read_mostly node_online_map = { { [0] = 1UL } };
-
-bool numa_off;
-s8 fw_numa = 0;
-
-int srat_disabled(void)
-{
-    return numa_off || fw_numa < 0;
-}
-
-/*
- * Given a shift value, try to populate memnodemap[]
- * Returns :
- * 1 if OK
- * 0 if memnodmap[] too small (of shift too small)
- * -1 if node overlap or lost ram (shift too big)
- */
-static int __init populate_memnodemap(const struct node *nodes,
-                                      int numnodes, int shift, nodeid_t *nodeids)
-{
-    unsigned long spdx, epdx;
-    int i, res = -1;
-
-    memset(memnodemap, NUMA_NO_NODE, memnodemapsize * sizeof(*memnodemap));
-    for ( i = 0; i < numnodes; i++ )
-    {
-        spdx = paddr_to_pdx(nodes[i].start);
-        epdx = paddr_to_pdx(nodes[i].end - 1) + 1;
-        if ( spdx >= epdx )
-            continue;
-        if ( (epdx >> shift) >= memnodemapsize )
-            return 0;
-        do {
-            if ( memnodemap[spdx >> shift] != NUMA_NO_NODE )
-                return -1;
-
-            if ( !nodeids )
-                memnodemap[spdx >> shift] = i;
-            else
-                memnodemap[spdx >> shift] = nodeids[i];
-
-            spdx += (1UL << shift);
-        } while ( spdx < epdx );
-        res = 1;
-    }
-
-    return res;
-}
-
-static int __init allocate_cachealigned_memnodemap(void)
-{
-    unsigned long size = PFN_UP(memnodemapsize * sizeof(*memnodemap));
-    unsigned long mfn = mfn_x(alloc_boot_pages(size, 1));
-
-    memnodemap = mfn_to_virt(mfn);
-    mfn <<= PAGE_SHIFT;
-    size <<= PAGE_SHIFT;
-    printk(KERN_DEBUG "NUMA: Allocated memnodemap from %lx - %lx\n",
-           mfn, mfn + size);
-    memnodemapsize = size / sizeof(*memnodemap);
-
-    return 0;
-}
-
-/*
- * The LSB of all start and end addresses in the node map is the value of the
- * maximum possible shift.
- */
-static int __init extract_lsb_from_nodes(const struct node *nodes,
-                                         int numnodes)
-{
-    int i, nodes_used = 0;
-    unsigned long spdx, epdx;
-    unsigned long bitfield = 0, memtop = 0;
-
-    for ( i = 0; i < numnodes; i++ )
-    {
-        spdx = paddr_to_pdx(nodes[i].start);
-        epdx = paddr_to_pdx(nodes[i].end - 1) + 1;
-        if ( spdx >= epdx )
-            continue;
-        bitfield |= spdx;
-        nodes_used++;
-        if ( epdx > memtop )
-            memtop = epdx;
-    }
-    if ( nodes_used <= 1 )
-        i = BITS_PER_LONG - 1;
-    else
-        i = find_first_bit(&bitfield, sizeof(unsigned long)*8);
-    memnodemapsize = (memtop >> i) + 1;
-    return i;
-}
-
-int __init compute_hash_shift(struct node *nodes, int numnodes,
-                              nodeid_t *nodeids)
-{
-    int shift;
-
-    shift = extract_lsb_from_nodes(nodes, numnodes);
-    if ( memnodemapsize <= ARRAY_SIZE(_memnodemap) )
-        memnodemap = _memnodemap;
-    else if ( allocate_cachealigned_memnodemap() )
-        return -1;
-    printk(KERN_DEBUG "NUMA: Using %d for the hash shift.\n", shift);
-
-    if ( populate_memnodemap(nodes, numnodes, shift, nodeids) != 1 )
-    {
-        printk(KERN_INFO "Your memory is not aligned you need to "
-               "rebuild your hypervisor with a bigger NODEMAPSIZE "
-               "shift=%d\n", shift);
-        return -1;
-    }
-
-    return shift;
-}
-/* initialize NODE_DATA given nodeid and start/end */
-void __init setup_node_bootmem(nodeid_t nodeid, paddr_t start, paddr_t end)
-{
-    unsigned long start_pfn, end_pfn;
-
-    start_pfn = paddr_to_pfn(start);
-    end_pfn = paddr_to_pfn(end);
-
-    NODE_DATA(nodeid)->node_start_pfn = start_pfn;
-    NODE_DATA(nodeid)->node_spanned_pages = end_pfn - start_pfn;
-
-    node_set_online(nodeid);
-} 
-
-void __init numa_init_array(void)
-{
-    int rr, i;
-
-    /* There are unfortunately some poorly designed mainboards around
-       that only connect memory to a single CPU. This breaks the 1:1 cpu->node
-       mapping. To avoid this fill in the mapping for all possible
-       CPUs, as the number of CPUs is not known yet.
-       We round robin the existing nodes. */
-    rr = first_node(node_online_map);
-    for ( i = 0; i < nr_cpu_ids; i++ )
-    {
-        if ( cpu_to_node[i] != NUMA_NO_NODE )
-            continue;
-        numa_set_node(i, rr);
-        rr = cycle_node(rr, node_online_map);
-    }
-}
-
-#ifdef CONFIG_NUMA_EMU
-static int numa_fake __initdata = 0;
-
-/* Numa emulation */
-static int __init numa_emulation(unsigned long start_pfn,
-                                 unsigned long end_pfn)
-{
-    int i;
-    struct node nodes[MAX_NUMNODES];
-    u64 sz = pfn_to_paddr(end_pfn - start_pfn) / numa_fake;
-
-    /* Kludge needed for the hash function */
-    if ( hweight64(sz) > 1 )
-    {
-        u64 x = 1;
-        while ( (x << 1) < sz )
-            x <<= 1;
-        if ( x < sz/2 )
-            printk(KERN_ERR "Numa emulation unbalanced. Complain to maintainer\n");
-        sz = x;
-    }
-
-    memset(&nodes,0,sizeof(nodes));
-    for ( i = 0; i < numa_fake; i++ )
-    {
-        nodes[i].start = pfn_to_paddr(start_pfn) + i*sz;
-        if ( i == numa_fake - 1 )
-            sz = pfn_to_paddr(end_pfn) - nodes[i].start;
-        nodes[i].end = nodes[i].start + sz;
-        printk(KERN_INFO "Faking node %d at %"PRIx64"-%"PRIx64" (%"PRIu64"MB)\n",
-               i,
-               nodes[i].start, nodes[i].end,
-               (nodes[i].end - nodes[i].start) >> 20);
-        node_set_online(i);
-    }
-    memnode_shift = compute_hash_shift(nodes, numa_fake, NULL);
-    if ( memnode_shift < 0 )
-    {
-        memnode_shift = 0;
-        printk(KERN_ERR "No NUMA hash function found. Emulation disabled.\n");
-        return -1;
-    }
-    for_each_online_node ( i )
-        setup_node_bootmem(i, nodes[i].start, nodes[i].end);
-    numa_init_array();
-
-    return 0;
-}
-#endif
-
-void __init numa_initmem_init(unsigned long start_pfn, unsigned long end_pfn)
-{ 
-    int i;
-    paddr_t start, end;
-
-#ifdef CONFIG_NUMA_EMU
-    if ( numa_fake && !numa_emulation(start_pfn, end_pfn) )
-        return;
-#endif
-
-    start = pfn_to_paddr(start_pfn);
-    end = pfn_to_paddr(end_pfn);
-
-#ifdef CONFIG_NUMA
-    if ( !numa_off && !numa_scan_nodes(start, end) )
-        return;
-#endif
-
-    printk(KERN_INFO "%s\n",
-           numa_off ? "NUMA turned off" : "No NUMA configuration found");
-
-    printk(KERN_INFO "Faking a node at %016"PRIpaddr"-%016"PRIpaddr"\n",
-           start, end);
-    /* setup dummy node covering all memory */
-    memnode_shift = BITS_PER_LONG - 1;
-    memnodemap = _memnodemap;
-    /* Dummy node only uses 1 slot in reality */
-    memnodemap[0] = 0;
-    memnodemapsize = 1;
-
-    nodes_clear(node_online_map);
-    node_set_online(0);
-    for ( i = 0; i < nr_cpu_ids; i++ )
-        numa_set_node(i, 0);
-    cpumask_copy(&node_to_cpumask[0], cpumask_of(0));
-    setup_node_bootmem(0, start, end);
-}
-
-void numa_add_cpu(int cpu)
-{
-    cpumask_set_cpu(cpu, &node_to_cpumask[cpu_to_node(cpu)]);
-} 
-
-void numa_set_node(int cpu, nodeid_t node)
-{
-    cpu_to_node[cpu] = node;
-}
-
-/* [numa=off] */
-static __init int numa_setup(const char *opt)
-{
-    if ( !strncmp(opt,"off",3) )
-        numa_off = true;
-    else if ( !strncmp(opt,"on",2) )
-        numa_off = false;
-#ifdef CONFIG_NUMA_EMU
-    else if ( !strncmp(opt, "fake=", 5) )
-    {
-        numa_off = false;
-        numa_fake = simple_strtoul(opt+5,NULL,0);
-        if ( numa_fake >= MAX_NUMNODES )
-            numa_fake = MAX_NUMNODES;
-    }
-#endif
-#ifdef CONFIG_ACPI_NUMA
-    else if ( !strncmp(opt,"noacpi",6) )
-    {
-        numa_off = false;
-        fw_numa = -1;
-    }
-#endif
-    else
-        return -EINVAL;
-
-    return 0;
-} 
 
 /*
  * Setup early cpu_to_node.
@@ -395,146 +96,3 @@ int __init arch_meminfo_get_ram_bank_range(uint32_t bank,
 
 	return 0;
 }
-
-static void dump_numa(unsigned char key)
-{
-    s_time_t now = NOW();
-    unsigned int i, j, n;
-    struct domain *d;
-    struct page_info *page;
-    unsigned int page_num_node[MAX_NUMNODES];
-    const struct vnuma_info *vnuma;
-
-    printk("'%c' pressed -> dumping numa info (now = %"PRI_stime")\n", key,
-           now);
-
-    for_each_online_node ( i )
-    {
-        paddr_t pa = pfn_to_paddr(node_start_pfn(i) + 1);
-
-        printk("NODE%u start->%lu size->%lu free->%lu\n",
-               i, node_start_pfn(i), node_spanned_pages(i),
-               avail_node_heap_pages(i));
-        /* sanity check phys_to_nid() */
-        if ( phys_to_nid(pa) != i )
-            printk("phys_to_nid(%"PRIpaddr") -> %d should be %u\n",
-                   pa, phys_to_nid(pa), i);
-    }
-
-    j = cpumask_first(&cpu_online_map);
-    n = 0;
-    for_each_online_cpu ( i )
-    {
-        if ( i != j + n || cpu_to_node[j] != cpu_to_node[i] )
-        {
-            if ( n > 1 )
-                printk("CPU%u...%u -> NODE%d\n", j, j + n - 1, cpu_to_node[j]);
-            else
-                printk("CPU%u -> NODE%d\n", j, cpu_to_node[j]);
-            j = i;
-            n = 1;
-        }
-        else
-            ++n;
-    }
-    if ( n > 1 )
-        printk("CPU%u...%u -> NODE%d\n", j, j + n - 1, cpu_to_node[j]);
-    else
-        printk("CPU%u -> NODE%d\n", j, cpu_to_node[j]);
-
-    rcu_read_lock(&domlist_read_lock);
-
-    printk("Memory location of each domain:\n");
-    for_each_domain ( d )
-    {
-        process_pending_softirqs();
-
-        printk("Domain %u (total: %u):\n", d->domain_id, domain_tot_pages(d));
-
-        for_each_online_node ( i )
-            page_num_node[i] = 0;
-
-        spin_lock(&d->page_alloc_lock);
-        page_list_for_each(page, &d->page_list)
-        {
-            i = phys_to_nid(page_to_maddr(page));
-            page_num_node[i]++;
-        }
-        spin_unlock(&d->page_alloc_lock);
-
-        for_each_online_node ( i )
-            printk("    Node %u: %u\n", i, page_num_node[i]);
-
-        if ( !read_trylock(&d->vnuma_rwlock) )
-            continue;
-
-        if ( !d->vnuma )
-        {
-            read_unlock(&d->vnuma_rwlock);
-            continue;
-        }
-
-        vnuma = d->vnuma;
-        printk("     %u vnodes, %u vcpus, guest physical layout:\n",
-               vnuma->nr_vnodes, d->max_vcpus);
-        for ( i = 0; i < vnuma->nr_vnodes; i++ )
-        {
-            unsigned int start_cpu = ~0U;
-
-            if ( vnuma->vnode_to_pnode[i] == NUMA_NO_NODE )
-                printk("       %3u: pnode ???,", i);
-            else
-                printk("       %3u: pnode %3u,", i, vnuma->vnode_to_pnode[i]);
-
-            printk(" vcpus ");
-
-            for ( j = 0; j < d->max_vcpus; j++ )
-            {
-                if ( !(j & 0x3f) )
-                    process_pending_softirqs();
-
-                if ( vnuma->vcpu_to_vnode[j] == i )
-                {
-                    if ( start_cpu == ~0U )
-                    {
-                        printk("%d", j);
-                        start_cpu = j;
-                    }
-                }
-                else if ( start_cpu != ~0U )
-                {
-                    if ( j - 1 != start_cpu )
-                        printk("-%d ", j - 1);
-                    else
-                        printk(" ");
-                    start_cpu = ~0U;
-                }
-            }
-
-            if ( start_cpu != ~0U  && start_cpu != j - 1 )
-                printk("-%d", j - 1);
-
-            printk("\n");
-
-            for ( j = 0; j < vnuma->nr_vmemranges; j++ )
-            {
-                if ( vnuma->vmemrange[j].nid == i )
-                    printk("           %016"PRIx64" - %016"PRIx64"\n",
-                           vnuma->vmemrange[j].start,
-                           vnuma->vmemrange[j].end);
-            }
-        }
-
-        read_unlock(&d->vnuma_rwlock);
-    }
-
-    rcu_read_unlock(&domlist_read_lock);
-}
-
-static __init int register_numa_trigger(void)
-{
-    register_keyhandler('u', dump_numa, "dump NUMA info", 1);
-    return 0;
-}
-__initcall(register_numa_trigger);
-
diff --git a/xen/arch/x86/srat.c b/xen/arch/x86/srat.c
index 94bd5b34da..44517c7b62 100644
--- a/xen/arch/x86/srat.c
+++ b/xen/arch/x86/srat.c
@@ -10,24 +10,19 @@
  * 
  * Adapted for Xen: Ryan Harper <ryanh@us.ibm.com>
  */
-
+#include <xen/acpi.h>
 #include <xen/init.h>
 #include <xen/mm.h>
-#include <xen/inttypes.h>
 #include <xen/nodemask.h>
-#include <xen/acpi.h>
 #include <xen/numa.h>
 #include <xen/pfn.h>
+
 #include <asm/e820.h>
 #include <asm/page.h>
 #include <asm/spec_ctrl.h>
 
 static struct acpi_table_slit *__read_mostly acpi_slit;
 
-static nodemask_t memory_nodes_parsed __initdata;
-static nodemask_t processor_nodes_parsed __initdata;
-static struct node nodes[MAX_NUMNODES] __initdata;
-
 struct pxm2node {
 	unsigned pxm;
 	nodeid_t node;
@@ -37,11 +32,6 @@ static struct pxm2node __read_mostly pxm2node[MAX_NUMNODES] =
 
 static unsigned node_to_pxm(nodeid_t n);
 
-static int num_node_memblks;
-static struct node node_memblk_range[NR_NODE_MEMBLKS];
-static nodeid_t memblk_nodeid[NR_NODE_MEMBLKS];
-static __initdata DECLARE_BITMAP(memblk_hotplug, NR_NODE_MEMBLKS);
-
 static inline bool node_found(unsigned idx, unsigned pxm)
 {
 	return ((pxm2node[idx].pxm == pxm) &&
@@ -104,65 +94,6 @@ nodeid_t setup_node(unsigned pxm)
 	return node;
 }
 
-void  __init numa_set_processor_nodes_parsed(nodeid_t node)
-{
-	node_set(node, processor_nodes_parsed);
-}
-
-bool __init numa_memblks_available(void)
-{
-	if (num_node_memblks < NR_NODE_MEMBLKS)
-		return true;
-
-	return false;
-}
-
-int valid_numa_range(paddr_t start, paddr_t end, nodeid_t node)
-{
-	int i;
-
-	for (i = 0; i < num_node_memblks; i++) {
-		struct node *nd = &node_memblk_range[i];
-
-		if (nd->start <= start && nd->end >= end &&
-			memblk_nodeid[i] == node)
-			return 1;
-	}
-
-	return 0;
-}
-
-static __init int conflicting_memblks(paddr_t start, paddr_t end)
-{
-	int i;
-
-	for (i = 0; i < num_node_memblks; i++) {
-		struct node *nd = &node_memblk_range[i];
-		if (nd->start == nd->end)
-			continue;
-		if (nd->end > start && nd->start < end)
-			return i;
-		if (nd->end == end && nd->start == start)
-			return i;
-	}
-	return -1;
-}
-
-static __init void cutoff_node(int i, paddr_t start, paddr_t end)
-{
-	struct node *nd = &nodes[i];
-	if (nd->start < start) {
-		nd->start = start;
-		if (nd->end < nd->start)
-			nd->start = nd->end;
-	}
-	if (nd->end > end) {
-		nd->end = end;
-		if (nd->start > nd->end)
-			nd->start = nd->end;
-	}
-}
-
 __init void bad_srat(void)
 {
 	int i;
@@ -284,97 +215,6 @@ acpi_numa_processor_affinity_init(const struct acpi_srat_cpu_affinity *pa)
 		       pxm, pa->apic_id, node);
 }
 
-/*
- * Check to see if there are other nodes within this node's range.
- * We just need to check full contains situation. Because overlaps
- * have been checked before by conflicting_memblks.
- */
-static bool __init is_node_memory_continuous(nodeid_t nid,
-    paddr_t start, paddr_t end)
-{
-	nodeid_t i;
-
-	struct node *nd = &nodes[nid];
-	for_each_node_mask(i, memory_nodes_parsed)
-	{
-		/* Skip itself */
-		if (i == nid)
-			continue;
-
-		nd = &nodes[i];
-		if (start < nd->start && nd->end < end)
-		{
-			printk(KERN_ERR
-			       "NODE %u: (%"PRIpaddr"-%"PRIpaddr") intertwine with NODE %u (%"PRIpaddr"-%"PRIpaddr")\n",
-			       nid, start, end, i, nd->start, nd->end);
-			return false;
-		}
-	}
-
-	return true;
-}
-
-/* Neutral NUMA memory affinity init function for ACPI and DT */
-int __init numa_update_node_memblks(nodeid_t node,
-		paddr_t start, paddr_t size, bool hotplug)
-{
-	paddr_t end = start + size;
-	int i;
-
-	/* It is fine to add this area to the nodes data it will be used later */
-	i = conflicting_memblks(start, end);
-	if (i < 0)
-		/* everything fine */;
-	else if (memblk_nodeid[i] == node) {
-		bool mismatch = !hotplug != !test_bit(i, memblk_hotplug);
-
-		printk("%sSRAT: NODE %u (%"PRIpaddr"-%"PRIpaddr") overlaps with itself (%"PRIpaddr"-%"PRIpaddr")\n",
-		       mismatch ? KERN_ERR : KERN_WARNING, node, start, end,
-		       node_memblk_range[i].start, node_memblk_range[i].end);
-		if (mismatch) {
-			return -1;
-		}
-	} else {
-		printk(KERN_ERR
-		       "SRAT: NODE %u (%"PRIpaddr"-%"PRIpaddr") overlaps with NODE %u (%"PRIpaddr"-%"PRIpaddr")\n",
-		       node, start, end, memblk_nodeid[i],
-		       node_memblk_range[i].start, node_memblk_range[i].end);
-		return -1;
-	}
-
-	if (!hotplug) {
-		struct node *nd = &nodes[node];
-
-		if (!node_test_and_set(node, memory_nodes_parsed)) {
-			nd->start = start;
-			nd->end = end;
-		} else {
-			if (start < nd->start)
-				nd->start = start;
-			if (nd->end < end)
-				nd->end = end;
-
-			if (!is_node_memory_continuous(node, nd->start, nd->end))
-				return -1;
-		}
-	}
-
-	printk(KERN_INFO "SRAT: Node %u %"PRIpaddr"-%"PRIpaddr"%s\n",
-	       node, start, end, hotplug ? " (hotplug)" : "");
-
-	node_memblk_range[num_node_memblks].start = start;
-	node_memblk_range[num_node_memblks].end = end;
-	memblk_nodeid[num_node_memblks] = node;
-	if (hotplug) {
-		__set_bit(num_node_memblks, memblk_hotplug);
-		if (end > mem_hotplug_boundary())
-			mem_hotplug_update_boundary(end);
-	}
-	num_node_memblks++;
-
-	return 0;
-}
-
 /* Callback for parsing of the Proximity Domain <-> Memory Area mappings */
 void __init
 acpi_numa_memory_affinity_init(const struct acpi_srat_mem_affinity *ma)
@@ -419,45 +259,6 @@ acpi_numa_memory_affinity_init(const struct acpi_srat_mem_affinity *ma)
 		bad_srat();
 }
 
-/* Sanity check to catch more bad SRATs (they are amazingly common).
-   Make sure the PXMs cover all memory. */
-static int __init nodes_cover_memory(void)
-{
-	int i;
-	uint32_t nr_banks = arch_meminfo_get_nr_bank();
-
-	for (i = 0; i < nr_banks; i++) {
-		int j, found;
-		paddr_t start, end;
-
-		if (arch_meminfo_get_ram_bank_range(i, &start, &end))
-			continue;
-
-		do {
-			found = 0;
-			for_each_node_mask(j, memory_nodes_parsed)
-				if (start < nodes[j].end
-				    && end > nodes[j].start) {
-					if (start >= nodes[j].start) {
-						start = nodes[j].end;
-						found = 1;
-					}
-					if (end <= nodes[j].end) {
-						end = nodes[j].start;
-						found = 1;
-					}
-				}
-		} while (found && start < end);
-
-		if (start < end) {
-			printk(KERN_ERR "SRAT: No NODE for memory map range: "
-				"%"PRIpaddr" - %"PRIpaddr"\n", start, end);
-			return 0;
-		}
-	}
-	return 1;
-}
-
 void __init acpi_numa_arch_fixup(void) {}
 
 static uint64_t __initdata srat_region_mask;
@@ -511,56 +312,6 @@ void __init srat_parse_regions(paddr_t addr)
 	pfn_pdx_hole_setup(mask >> PAGE_SHIFT);
 }
 
-/* Use the information discovered above to actually set up the nodes. */
-int __init numa_scan_nodes(paddr_t start, paddr_t end)
-{
-	int i;
-	nodemask_t all_nodes_parsed;
-
-	/* First clean up the node list */
-	for (i = 0; i < MAX_NUMNODES; i++)
-		cutoff_node(i, start, end);
-
-	if (fw_numa <= 0)
-		return -1;
-
-	if (!nodes_cover_memory()) {
-		bad_srat();
-		return -1;
-	}
-
-	memnode_shift = compute_hash_shift(node_memblk_range, num_node_memblks,
-				memblk_nodeid);
-
-	if (memnode_shift < 0) {
-		printk(KERN_ERR
-		     "SRAT: No NUMA node hash function found. Contact maintainer\n");
-		bad_srat();
-		return -1;
-	}
-
-	nodes_or(all_nodes_parsed, memory_nodes_parsed, processor_nodes_parsed);
-
-	/* Finally register nodes */
-	for_each_node_mask(i, all_nodes_parsed)
-	{
-		paddr_t size = nodes[i].end - nodes[i].start;
-		if ( size == 0 )
-			printk(KERN_WARNING "SRAT: Node %u has no memory. "
-			       "Firmware Bug or mis-configured hardware?\n", i);
-
-		setup_node_bootmem(i, nodes[i].start, nodes[i].end);
-	}
-	for (i = 0; i < nr_cpu_ids; i++) {
-		if (cpu_to_node[i] == NUMA_NO_NODE)
-			continue;
-		if (!nodemask_test(cpu_to_node[i], &processor_nodes_parsed))
-			numa_set_node(i, NUMA_NO_NODE);
-	}
-	numa_init_array();
-	return 0;
-}
-
 static unsigned node_to_pxm(nodeid_t n)
 {
 	unsigned i;
diff --git a/xen/common/Makefile b/xen/common/Makefile
index 54de70d422..90e5bf3efb 100644
--- a/xen/common/Makefile
+++ b/xen/common/Makefile
@@ -26,6 +26,8 @@ obj-$(CONFIG_MEM_ACCESS) += mem_access.o
 obj-y += memory.o
 obj-y += multicall.o
 obj-y += notifier.o
+obj-$(CONFIG_NUMA) += numa.o
+obj-$(CONFIG_NUMA) += numa_srat.o
 obj-y += page_alloc.o
 obj-$(CONFIG_HAS_PDX) += pdx.o
 obj-$(CONFIG_PERF_COUNTERS) += perfc.o
diff --git a/xen/common/numa.c b/xen/common/numa.c
new file mode 100644
index 0000000000..fc6bba3981
--- /dev/null
+++ b/xen/common/numa.c
@@ -0,0 +1,450 @@
+/* 
+ * Generic VM initialization for NUMA setups.
+ * Copyright 2002,2003 Andi Kleen, SuSE Labs.
+ * Adapted for Xen: Ryan Harper <ryanh@us.ibm.com>
+ */
+#include <xen/init.h>
+#include <xen/keyhandler.h>
+#include <xen/mm.h>
+#include <xen/nodemask.h>
+#include <xen/numa.h>
+#include <xen/param.h>
+
+#include <xen/sched.h>
+#include <xen/softirq.h>
+
+static int numa_setup(const char *s);
+custom_param("numa", numa_setup);
+
+struct node_data node_data[MAX_NUMNODES];
+
+/* Mapping from pdx to node id */
+int memnode_shift;
+static typeof(*memnodemap) _memnodemap[64];
+unsigned long memnodemapsize;
+u8 *memnodemap;
+
+nodeid_t cpu_to_node[NR_CPUS] __read_mostly = {
+    [0 ... NR_CPUS-1] = NUMA_NO_NODE
+};
+
+cpumask_t node_to_cpumask[MAX_NUMNODES] __read_mostly;
+nodemask_t __read_mostly node_online_map = { { [0] = 1UL } };
+
+bool numa_off;
+s8 fw_numa = 0;
+
+int srat_disabled(void)
+{
+    return numa_off || fw_numa < 0;
+}
+
+/*
+ * Given a shift value, try to populate memnodemap[]
+ * Returns :
+ * 1 if OK
+ * 0 if memnodmap[] too small (of shift too small)
+ * -1 if node overlap or lost ram (shift too big)
+ */
+static int __init populate_memnodemap(const struct node *nodes,
+                                      int numnodes, int shift, nodeid_t *nodeids)
+{
+    unsigned long spdx, epdx;
+    int i, res = -1;
+
+    memset(memnodemap, NUMA_NO_NODE, memnodemapsize * sizeof(*memnodemap));
+    for ( i = 0; i < numnodes; i++ )
+    {
+        spdx = paddr_to_pdx(nodes[i].start);
+        epdx = paddr_to_pdx(nodes[i].end - 1) + 1;
+        if ( spdx >= epdx )
+            continue;
+        if ( (epdx >> shift) >= memnodemapsize )
+            return 0;
+        do {
+            if ( memnodemap[spdx >> shift] != NUMA_NO_NODE )
+                return -1;
+
+            if ( !nodeids )
+                memnodemap[spdx >> shift] = i;
+            else
+                memnodemap[spdx >> shift] = nodeids[i];
+
+            spdx += (1UL << shift);
+        } while ( spdx < epdx );
+        res = 1;
+    }
+
+    return res;
+}
+
+static int __init allocate_cachealigned_memnodemap(void)
+{
+    unsigned long size = PFN_UP(memnodemapsize * sizeof(*memnodemap));
+    unsigned long mfn = mfn_x(alloc_boot_pages(size, 1));
+
+    memnodemap = mfn_to_virt(mfn);
+    mfn <<= PAGE_SHIFT;
+    size <<= PAGE_SHIFT;
+    printk(KERN_DEBUG "NUMA: Allocated memnodemap from %lx - %lx\n",
+           mfn, mfn + size);
+    memnodemapsize = size / sizeof(*memnodemap);
+
+    return 0;
+}
+
+/*
+ * The LSB of all start and end addresses in the node map is the value of the
+ * maximum possible shift.
+ */
+static int __init extract_lsb_from_nodes(const struct node *nodes,
+                                         int numnodes)
+{
+    int i, nodes_used = 0;
+    unsigned long spdx, epdx;
+    unsigned long bitfield = 0, memtop = 0;
+
+    for ( i = 0; i < numnodes; i++ )
+    {
+        spdx = paddr_to_pdx(nodes[i].start);
+        epdx = paddr_to_pdx(nodes[i].end - 1) + 1;
+        if ( spdx >= epdx )
+            continue;
+        bitfield |= spdx;
+        nodes_used++;
+        if ( epdx > memtop )
+            memtop = epdx;
+    }
+    if ( nodes_used <= 1 )
+        i = BITS_PER_LONG - 1;
+    else
+        i = find_first_bit(&bitfield, sizeof(unsigned long)*8);
+    memnodemapsize = (memtop >> i) + 1;
+    return i;
+}
+
+int __init compute_hash_shift(struct node *nodes, int numnodes,
+                              nodeid_t *nodeids)
+{
+    int shift;
+
+    shift = extract_lsb_from_nodes(nodes, numnodes);
+    if ( memnodemapsize <= ARRAY_SIZE(_memnodemap) )
+        memnodemap = _memnodemap;
+    else if ( allocate_cachealigned_memnodemap() )
+        return -1;
+    printk(KERN_DEBUG "NUMA: Using %d for the hash shift.\n", shift);
+
+    if ( populate_memnodemap(nodes, numnodes, shift, nodeids) != 1 )
+    {
+        printk(KERN_INFO "Your memory is not aligned you need to "
+               "rebuild your hypervisor with a bigger NODEMAPSIZE "
+               "shift=%d\n", shift);
+        return -1;
+    }
+
+    return shift;
+}
+/* initialize NODE_DATA given nodeid and start/end */
+void __init setup_node_bootmem(nodeid_t nodeid, paddr_t start, paddr_t end)
+{
+    unsigned long start_pfn, end_pfn;
+
+    start_pfn = paddr_to_pfn(start);
+    end_pfn = paddr_to_pfn(end);
+
+    NODE_DATA(nodeid)->node_start_pfn = start_pfn;
+    NODE_DATA(nodeid)->node_spanned_pages = end_pfn - start_pfn;
+
+    node_set_online(nodeid);
+}
+
+void __init numa_init_array(void)
+{
+    int rr, i;
+
+    /* There are unfortunately some poorly designed mainboards around
+       that only connect memory to a single CPU. This breaks the 1:1 cpu->node
+       mapping. To avoid this fill in the mapping for all possible
+       CPUs, as the number of CPUs is not known yet.
+       We round robin the existing nodes. */
+    rr = first_node(node_online_map);
+    for ( i = 0; i < nr_cpu_ids; i++ )
+    {
+        if ( cpu_to_node[i] != NUMA_NO_NODE )
+            continue;
+        numa_set_node(i, rr);
+        rr = cycle_node(rr, node_online_map);
+    }
+}
+
+#ifdef CONFIG_NUMA_EMU
+static int numa_fake __initdata = 0;
+
+/* Numa emulation */
+static int __init numa_emulation(unsigned long start_pfn,
+                                 unsigned long end_pfn)
+{
+    int i;
+    struct node nodes[MAX_NUMNODES];
+    u64 sz = pfn_to_paddr(end_pfn - start_pfn) / numa_fake;
+
+    /* Kludge needed for the hash function */
+    if ( hweight64(sz) > 1 )
+    {
+        u64 x = 1;
+        while ( (x << 1) < sz )
+            x <<= 1;
+        if ( x < sz/2 )
+            printk(KERN_ERR "Numa emulation unbalanced. Complain to maintainer\n");
+        sz = x;
+    }
+
+    memset(&nodes,0,sizeof(nodes));
+    for ( i = 0; i < numa_fake; i++ )
+    {
+        nodes[i].start = pfn_to_paddr(start_pfn) + i*sz;
+        if ( i == numa_fake - 1 )
+            sz = pfn_to_paddr(end_pfn) - nodes[i].start;
+        nodes[i].end = nodes[i].start + sz;
+        printk(KERN_INFO "Faking node %d at %"PRIx64"-%"PRIx64" (%"PRIu64"MB)\n",
+               i,
+               nodes[i].start, nodes[i].end,
+               (nodes[i].end - nodes[i].start) >> 20);
+        node_set_online(i);
+    }
+    memnode_shift = compute_hash_shift(nodes, numa_fake, NULL);
+    if ( memnode_shift < 0 )
+    {
+        memnode_shift = 0;
+        printk(KERN_ERR "No NUMA hash function found. Emulation disabled.\n");
+        return -1;
+    }
+    for_each_online_node ( i )
+        setup_node_bootmem(i, nodes[i].start, nodes[i].end);
+    numa_init_array();
+
+    return 0;
+}
+#endif
+
+void __init numa_initmem_init(unsigned long start_pfn, unsigned long end_pfn)
+{
+    int i;
+    paddr_t start, end;
+
+#ifdef CONFIG_NUMA_EMU
+    if ( numa_fake && !numa_emulation(start_pfn, end_pfn) )
+        return;
+#endif
+
+    start = pfn_to_paddr(start_pfn);
+    end = pfn_to_paddr(end_pfn);
+
+#ifdef CONFIG_NUMA
+    if ( !numa_off && !numa_scan_nodes(start, end) )
+        return;
+#endif
+
+    printk(KERN_INFO "%s\n",
+           numa_off ? "NUMA turned off" : "No NUMA configuration found");
+
+    printk(KERN_INFO "Faking a node at %016"PRIpaddr"-%016"PRIpaddr"\n",
+           start, end);
+    /* setup dummy node covering all memory */
+    memnode_shift = BITS_PER_LONG - 1;
+    memnodemap = _memnodemap;
+    /* Dummy node only uses 1 slot in reality */
+    memnodemap[0] = 0;
+    memnodemapsize = 1;
+
+    nodes_clear(node_online_map);
+    node_set_online(0);
+    for ( i = 0; i < nr_cpu_ids; i++ )
+        numa_set_node(i, 0);
+    cpumask_copy(&node_to_cpumask[0], cpumask_of(0));
+    setup_node_bootmem(0, start, end);
+}
+
+void numa_add_cpu(int cpu)
+{
+    cpumask_set_cpu(cpu, &node_to_cpumask[cpu_to_node(cpu)]);
+}
+
+void numa_set_node(int cpu, nodeid_t node)
+{
+    cpu_to_node[cpu] = node;
+}
+
+
+/* [numa=off] */
+static __init int numa_setup(const char *opt)
+{
+    if ( !strncmp(opt,"off",3) )
+        numa_off = true;
+    else if ( !strncmp(opt,"on",2) )
+        numa_off = false;
+#ifdef CONFIG_NUMA_EMU
+    else if ( !strncmp(opt, "fake=", 5) )
+    {
+        numa_off = false;
+        numa_fake = simple_strtoul(opt+5,NULL,0);
+        if ( numa_fake >= MAX_NUMNODES )
+            numa_fake = MAX_NUMNODES;
+    }
+#endif
+#ifdef CONFIG_ACPI_NUMA
+    else if ( !strncmp(opt,"noacpi",6) )
+    {
+        numa_off = false;
+        fw_numa = -1;
+    }
+#endif
+    else
+        return -EINVAL;
+
+    return 0;
+}
+
+
+static void dump_numa(unsigned char key)
+{
+    s_time_t now = NOW();
+    unsigned int i, j, n;
+    struct domain *d;
+    struct page_info *page;
+    unsigned int page_num_node[MAX_NUMNODES];
+    const struct vnuma_info *vnuma;
+
+    printk("'%c' pressed -> dumping numa info (now = %"PRI_stime")\n", key,
+           now);
+
+    for_each_online_node ( i )
+    {
+        paddr_t pa = pfn_to_paddr(node_start_pfn(i) + 1);
+
+        printk("NODE%u start->%lu size->%lu free->%lu\n",
+               i, node_start_pfn(i), node_spanned_pages(i),
+               avail_node_heap_pages(i));
+        /* sanity check phys_to_nid() */
+        if ( phys_to_nid(pa) != i )
+            printk("phys_to_nid(%"PRIpaddr") -> %d should be %u\n",
+                   pa, phys_to_nid(pa), i);
+    }
+
+    j = cpumask_first(&cpu_online_map);
+    n = 0;
+    for_each_online_cpu ( i )
+    {
+        if ( i != j + n || cpu_to_node[j] != cpu_to_node[i] )
+        {
+            if ( n > 1 )
+                printk("CPU%u...%u -> NODE%d\n", j, j + n - 1, cpu_to_node[j]);
+            else
+                printk("CPU%u -> NODE%d\n", j, cpu_to_node[j]);
+            j = i;
+            n = 1;
+        }
+        else
+            ++n;
+    }
+    if ( n > 1 )
+        printk("CPU%u...%u -> NODE%d\n", j, j + n - 1, cpu_to_node[j]);
+    else
+        printk("CPU%u -> NODE%d\n", j, cpu_to_node[j]);
+
+    rcu_read_lock(&domlist_read_lock);
+
+    printk("Memory location of each domain:\n");
+    for_each_domain ( d )
+    {
+        process_pending_softirqs();
+
+        printk("Domain %u (total: %u):\n", d->domain_id, domain_tot_pages(d));
+
+        for_each_online_node ( i )
+            page_num_node[i] = 0;
+
+        spin_lock(&d->page_alloc_lock);
+        page_list_for_each(page, &d->page_list)
+        {
+            i = phys_to_nid(page_to_maddr(page));
+            page_num_node[i]++;
+        }
+        spin_unlock(&d->page_alloc_lock);
+
+        for_each_online_node ( i )
+            printk("    Node %u: %u\n", i, page_num_node[i]);
+
+        if ( !read_trylock(&d->vnuma_rwlock) )
+            continue;
+
+        if ( !d->vnuma )
+        {
+            read_unlock(&d->vnuma_rwlock);
+            continue;
+        }
+
+        vnuma = d->vnuma;
+        printk("     %u vnodes, %u vcpus, guest physical layout:\n",
+               vnuma->nr_vnodes, d->max_vcpus);
+        for ( i = 0; i < vnuma->nr_vnodes; i++ )
+        {
+            unsigned int start_cpu = ~0U;
+
+            if ( vnuma->vnode_to_pnode[i] == NUMA_NO_NODE )
+                printk("       %3u: pnode ???,", i);
+            else
+                printk("       %3u: pnode %3u,", i, vnuma->vnode_to_pnode[i]);
+
+            printk(" vcpus ");
+
+            for ( j = 0; j < d->max_vcpus; j++ )
+            {
+                if ( !(j & 0x3f) )
+                    process_pending_softirqs();
+
+                if ( vnuma->vcpu_to_vnode[j] == i )
+                {
+                    if ( start_cpu == ~0U )
+                    {
+                        printk("%d", j);
+                        start_cpu = j;
+                    }
+                }
+                else if ( start_cpu != ~0U )
+                {
+                    if ( j - 1 != start_cpu )
+                        printk("-%d ", j - 1);
+                    else
+                        printk(" ");
+                    start_cpu = ~0U;
+                }
+            }
+
+            if ( start_cpu != ~0U  && start_cpu != j - 1 )
+                printk("-%d", j - 1);
+
+            printk("\n");
+
+            for ( j = 0; j < vnuma->nr_vmemranges; j++ )
+            {
+                if ( vnuma->vmemrange[j].nid == i )
+                    printk("           %016"PRIx64" - %016"PRIx64"\n",
+                           vnuma->vmemrange[j].start,
+                           vnuma->vmemrange[j].end);
+            }
+        }
+
+        read_unlock(&d->vnuma_rwlock);
+    }
+
+    rcu_read_unlock(&domlist_read_lock);
+}
+
+static __init int register_numa_trigger(void)
+{
+    register_keyhandler('u', dump_numa, "dump NUMA info", 1);
+    return 0;
+}
+__initcall(register_numa_trigger);
diff --git a/xen/common/numa_srat.c b/xen/common/numa_srat.c
new file mode 100644
index 0000000000..7bda2ecef6
--- /dev/null
+++ b/xen/common/numa_srat.c
@@ -0,0 +1,264 @@
+/*
+ * ACPI 3.0 based NUMA setup
+ * Copyright 2004 Andi Kleen, SuSE Labs.
+ *
+ * Reads the ACPI SRAT table to figure out what memory belongs to which CPUs.
+ *
+ * Called from acpi_numa_init while reading the SRAT and SLIT tables.
+ * Assumes all memory regions belonging to a single proximity domain
+ * are in one chunk. Holes between them will be included in the node.
+ *
+ * Adapted for Xen: Ryan Harper <ryanh@us.ibm.com>
+ */
+#include <xen/init.h>
+#include <xen/mm.h>
+#include <xen/nodemask.h>
+#include <xen/numa.h>
+
+static nodemask_t memory_nodes_parsed __initdata;
+static nodemask_t processor_nodes_parsed __initdata;
+static struct node nodes[MAX_NUMNODES] __initdata;
+
+static int num_node_memblks;
+static struct node node_memblk_range[NR_NODE_MEMBLKS];
+static nodeid_t memblk_nodeid[NR_NODE_MEMBLKS];
+static __initdata DECLARE_BITMAP(memblk_hotplug, NR_NODE_MEMBLKS);
+
+void  __init numa_set_processor_nodes_parsed(nodeid_t node)
+{
+	node_set(node, processor_nodes_parsed);
+}
+
+bool __init numa_memblks_available(void)
+{
+	if (num_node_memblks < NR_NODE_MEMBLKS)
+		return true;
+
+	return false;
+}
+
+int valid_numa_range(paddr_t start, paddr_t end, nodeid_t node)
+{
+	int i;
+
+	for (i = 0; i < num_node_memblks; i++) {
+		struct node *nd = &node_memblk_range[i];
+
+		if (nd->start <= start && nd->end >= end &&
+			memblk_nodeid[i] == node)
+			return 1;
+	}
+
+	return 0;
+}
+
+static __init int conflicting_memblks(paddr_t start, paddr_t end)
+{
+	int i;
+
+	for (i = 0; i < num_node_memblks; i++) {
+		struct node *nd = &node_memblk_range[i];
+		if (nd->start == nd->end)
+			continue;
+		if (nd->end > start && nd->start < end)
+			return i;
+		if (nd->end == end && nd->start == start)
+			return i;
+	}
+	return -1;
+}
+
+static __init void cutoff_node(int i, paddr_t start, paddr_t end)
+{
+	struct node *nd = &nodes[i];
+	if (nd->start < start) {
+		nd->start = start;
+		if (nd->end < nd->start)
+			nd->start = nd->end;
+	}
+	if (nd->end > end) {
+		nd->end = end;
+		if (nd->start > nd->end)
+			nd->start = nd->end;
+	}
+}
+
+/*
+ * Check to see if there are other nodes within this node's range.
+ * We just need to check full contains situation. Because overlaps
+ * have been checked before by conflicting_memblks.
+ */
+static bool __init is_node_memory_continuous(nodeid_t nid,
+    paddr_t start, paddr_t end)
+{
+	nodeid_t i;
+
+	struct node *nd = &nodes[nid];
+	for_each_node_mask(i, memory_nodes_parsed)
+	{
+		/* Skip itself */
+		if (i == nid)
+			continue;
+
+		nd = &nodes[i];
+		if (start < nd->start && nd->end < end)
+		{
+			printk(KERN_ERR
+			       "NODE %u: (%"PRIpaddr"-%"PRIpaddr") intertwine with NODE %u (%"PRIpaddr"-%"PRIpaddr")\n",
+			       nid, start, end, i, nd->start, nd->end);
+			return false;
+		}
+	}
+
+	return true;
+}
+
+/* Neutral NUMA memory affinity init function for ACPI and DT */
+int __init numa_update_node_memblks(nodeid_t node,
+		paddr_t start, paddr_t size, bool hotplug)
+{
+	paddr_t end = start + size;
+	int i;
+
+	/* It is fine to add this area to the nodes data it will be used later */
+	i = conflicting_memblks(start, end);
+	if (i < 0)
+		/* everything fine */;
+	else if (memblk_nodeid[i] == node) {
+		bool mismatch = !hotplug != !test_bit(i, memblk_hotplug);
+
+		printk("%sSRAT: NODE %u (%"PRIpaddr"-%"PRIpaddr") overlaps with itself (%"PRIpaddr"-%"PRIpaddr")\n",
+		       mismatch ? KERN_ERR : KERN_WARNING, node, start, end,
+		       node_memblk_range[i].start, node_memblk_range[i].end);
+		if (mismatch) {
+			return -1;
+		}
+	} else {
+		printk(KERN_ERR
+		       "SRAT: NODE %u (%"PRIpaddr"-%"PRIpaddr") overlaps with NODE %u (%"PRIpaddr"-%"PRIpaddr")\n",
+		       node, start, end, memblk_nodeid[i],
+		       node_memblk_range[i].start, node_memblk_range[i].end);
+		return -1;
+	}
+
+	if (!hotplug) {
+		struct node *nd = &nodes[node];
+
+		if (!node_test_and_set(node, memory_nodes_parsed)) {
+			nd->start = start;
+			nd->end = end;
+		} else {
+			if (start < nd->start)
+				nd->start = start;
+			if (nd->end < end)
+				nd->end = end;
+
+			if (!is_node_memory_continuous(node, nd->start, nd->end))
+				return -1;
+		}
+	}
+
+	printk(KERN_INFO "SRAT: Node %u %"PRIpaddr"-%"PRIpaddr"%s\n",
+	       node, start, end, hotplug ? " (hotplug)" : "");
+
+	node_memblk_range[num_node_memblks].start = start;
+	node_memblk_range[num_node_memblks].end = end;
+	memblk_nodeid[num_node_memblks] = node;
+	if (hotplug) {
+		__set_bit(num_node_memblks, memblk_hotplug);
+		if (end > mem_hotplug_boundary())
+			mem_hotplug_update_boundary(end);
+	}
+	num_node_memblks++;
+
+	return 0;
+}
+
+/* Sanity check to catch more bad SRATs (they are amazingly common).
+   Make sure the PXMs cover all memory. */
+static int __init nodes_cover_memory(void)
+{
+	int i;
+	uint32_t nr_banks = arch_meminfo_get_nr_bank();
+
+	for (i = 0; i < nr_banks; i++) {
+		int j, found;
+		paddr_t start, end;
+
+		if (arch_meminfo_get_ram_bank_range(i, &start, &end))
+			continue;
+
+		do {
+			found = 0;
+			for_each_node_mask(j, memory_nodes_parsed)
+				if (start < nodes[j].end
+				    && end > nodes[j].start) {
+					if (start >= nodes[j].start) {
+						start = nodes[j].end;
+						found = 1;
+					}
+					if (end <= nodes[j].end) {
+						end = nodes[j].start;
+						found = 1;
+					}
+				}
+		} while (found && start < end);
+
+		if (start < end) {
+			printk(KERN_ERR "SRAT: No NODE for memory map range: "
+				"%"PRIpaddr" - %"PRIpaddr"\n", start, end);
+			return 0;
+		}
+	}
+	return 1;
+}
+
+/* Use the information discovered above to actually set up the nodes. */
+int __init numa_scan_nodes(paddr_t start, paddr_t end)
+{
+	int i;
+	nodemask_t all_nodes_parsed;
+
+	/* First clean up the node list */
+	for (i = 0; i < MAX_NUMNODES; i++)
+		cutoff_node(i, start, end);
+
+	if (fw_numa <= 0)
+		return -1;
+
+	if (!nodes_cover_memory()) {
+		bad_srat();
+		return -1;
+	}
+
+	memnode_shift = compute_hash_shift(node_memblk_range, num_node_memblks,
+				memblk_nodeid);
+
+	if (memnode_shift < 0) {
+		printk(KERN_ERR
+		     "SRAT: No NUMA node hash function found. Contact maintainer\n");
+		bad_srat();
+		return -1;
+	}
+
+	nodes_or(all_nodes_parsed, memory_nodes_parsed, processor_nodes_parsed);
+
+	/* Finally register nodes */
+	for_each_node_mask(i, all_nodes_parsed)
+	{
+		paddr_t size = nodes[i].end - nodes[i].start;
+		if ( size == 0 )
+			printk(KERN_WARNING "SRAT: Node %u has no memory. "
+			       "Firmware Bug or mis-configured hardware?\n", i);
+
+		setup_node_bootmem(i, nodes[i].start, nodes[i].end);
+	}
+	for (i = 0; i < nr_cpu_ids; i++) {
+		if (cpu_to_node[i] == NUMA_NO_NODE)
+			continue;
+		if (!nodemask_test(cpu_to_node[i], &processor_nodes_parsed))
+			numa_set_node(i, NUMA_NO_NODE);
+	}
+	numa_init_array();
+	return 0;
+}
diff --git a/xen/include/asm-x86/acpi.h b/xen/include/asm-x86/acpi.h
index 2add971072..2140461ff3 100644
--- a/xen/include/asm-x86/acpi.h
+++ b/xen/include/asm-x86/acpi.h
@@ -101,10 +101,6 @@ extern unsigned long acpi_wakeup_address;
 
 #define ARCH_HAS_POWER_INIT	1
 
-extern s8 fw_numa;
-extern int numa_scan_nodes(u64 start, u64 end);
-#define NR_NODE_MEMBLKS (MAX_NUMNODES*2)
-
 extern struct acpi_sleep_info acpi_sinfo;
 #define acpi_video_flags bootsym(video_flags)
 struct xenpf_enter_acpi_sleep;
diff --git a/xen/include/asm-x86/numa.h b/xen/include/asm-x86/numa.h
index a5690a7098..cd407804c8 100644
--- a/xen/include/asm-x86/numa.h
+++ b/xen/include/asm-x86/numa.h
@@ -7,85 +7,17 @@ typedef u8 nodeid_t;
 
 extern int srat_rev;
 
-extern nodeid_t      cpu_to_node[NR_CPUS];
-extern cpumask_t     node_to_cpumask[];
-
-#define cpu_to_node(cpu)		(cpu_to_node[cpu])
-#define parent_node(node)		(node)
-#define node_to_first_cpu(node)  (__ffs(node_to_cpumask[node]))
-#define node_to_cpumask(node)    (node_to_cpumask[node])
-
-struct node { 
-	paddr_t start,end;
-};
-
-extern int compute_hash_shift(struct node *nodes, int numnodes,
-			      nodeid_t *nodeids);
 extern nodeid_t pxm_to_node(unsigned int pxm);
 
 #define ZONE_ALIGN (1UL << (MAX_ORDER+PAGE_SHIFT))
-#define VIRTUAL_BUG_ON(x) 
-
-extern void numa_add_cpu(int cpu);
-extern void numa_init_array(void);
-extern bool numa_off;
 
-
-extern int srat_disabled(void);
-extern void bad_srat(void);
-extern void numa_set_node(int cpu, nodeid_t node);
 extern nodeid_t setup_node(unsigned int pxm);
-extern void srat_detect_node(int cpu);
 
-extern void setup_node_bootmem(nodeid_t nodeid, paddr_t start, paddr_t end);
 extern nodeid_t apicid_to_node[];
 extern void init_cpu_to_node(void);
 
-static inline void clear_node_cpumask(int cpu)
-{
-	cpumask_clear_cpu(cpu, &node_to_cpumask[cpu_to_node(cpu)]);
-}
-
-/* Simple perfect hash to map pdx to node numbers */
-extern int memnode_shift; 
-extern unsigned long memnodemapsize;
-extern u8 *memnodemap;
-
-struct node_data {
-    unsigned long node_start_pfn;
-    unsigned long node_spanned_pages;
-};
-
-extern struct node_data node_data[];
-
-static inline __attribute__((pure)) nodeid_t phys_to_nid(paddr_t addr)
-{ 
-	nodeid_t nid;
-	VIRTUAL_BUG_ON((paddr_to_pdx(addr) >> memnode_shift) >= memnodemapsize);
-	nid = memnodemap[paddr_to_pdx(addr) >> memnode_shift]; 
-	VIRTUAL_BUG_ON(nid >= MAX_NUMNODES || !node_data[nid]); 
-	return nid; 
-} 
-
-#define NODE_DATA(nid)		(&(node_data[nid]))
-
-#define node_start_pfn(nid)	(NODE_DATA(nid)->node_start_pfn)
-#define node_spanned_pages(nid)	(NODE_DATA(nid)->node_spanned_pages)
-#define node_end_pfn(nid)       (NODE_DATA(nid)->node_start_pfn + \
-				 NODE_DATA(nid)->node_spanned_pages)
-
-extern int valid_numa_range(paddr_t start, paddr_t end, nodeid_t node);
-extern bool numa_memblks_available(void);
-extern int numa_update_node_memblks(nodeid_t node,
-		paddr_t start, paddr_t size, bool hotplug);
-extern void numa_set_processor_nodes_parsed(nodeid_t node);
-
 void srat_parse_regions(paddr_t addr);
-extern u8 __node_distance(nodeid_t a, nodeid_t b);
 unsigned int arch_get_dma_bitsize(void);
 unsigned int arch_have_default_dmazone(void);
-extern uint32_t arch_meminfo_get_nr_bank(void);
-extern int arch_meminfo_get_ram_bank_range(uint32_t bank,
-    paddr_t *start, paddr_t *end);
 
 #endif
diff --git a/xen/include/asm-x86/setup.h b/xen/include/asm-x86/setup.h
index 24be46115d..63838ba2d1 100644
--- a/xen/include/asm-x86/setup.h
+++ b/xen/include/asm-x86/setup.h
@@ -17,7 +17,6 @@ void early_time_init(void);
 
 void set_nr_cpu_ids(unsigned int max_cpus);
 
-void numa_initmem_init(unsigned long start_pfn, unsigned long end_pfn);
 void arch_init_memory(void);
 void subarch_init_memory(void);
 
diff --git a/xen/include/xen/numa.h b/xen/include/xen/numa.h
index 52950a3150..51391a2440 100644
--- a/xen/include/xen/numa.h
+++ b/xen/include/xen/numa.h
@@ -12,10 +12,92 @@
 #define MAX_NUMNODES    1
 #endif
 
+#define NR_NODE_MEMBLKS (MAX_NUMNODES*2)
+
 #define vcpu_to_node(v) (cpu_to_node((v)->processor))
 
 #define domain_to_node(d) \
   (((d)->vcpu != NULL && (d)->vcpu[0] != NULL) \
    ? vcpu_to_node((d)->vcpu[0]) : NUMA_NO_NODE)
 
+/* The following content can be used when NUMA feature is enabled */
+#ifdef CONFIG_NUMA
+
+extern nodeid_t      cpu_to_node[NR_CPUS];
+extern cpumask_t     node_to_cpumask[];
+
+#define cpu_to_node(cpu)		(cpu_to_node[cpu])
+#define parent_node(node)		(node)
+#define node_to_first_cpu(node)  (__ffs(node_to_cpumask[node]))
+#define node_to_cpumask(node)    (node_to_cpumask[node])
+
+struct node {
+	paddr_t start,end;
+};
+
+extern int compute_hash_shift(struct node *nodes, int numnodes,
+			      nodeid_t *nodeids);
+
+#define VIRTUAL_BUG_ON(x)
+
+extern void numa_add_cpu(int cpu);
+extern void numa_init_array(void);
+extern bool numa_off;
+extern s8 fw_numa;
+
+extern int srat_disabled(void);
+extern void srat_detect_node(int cpu);
+extern void bad_srat(void);
+
+extern void numa_initmem_init(unsigned long start_pfn, unsigned long end_pfn);
+extern void numa_set_node(int cpu, nodeid_t node);
+extern void setup_node_bootmem(nodeid_t nodeid, paddr_t start, paddr_t end);
+
+static inline void clear_node_cpumask(int cpu)
+{
+	cpumask_clear_cpu(cpu, &node_to_cpumask[cpu_to_node(cpu)]);
+}
+
+/* Simple perfect hash to map pdx to node numbers */
+extern int memnode_shift;
+extern unsigned long memnodemapsize;
+extern u8 *memnodemap;
+
+struct node_data {
+    unsigned long node_start_pfn;
+    unsigned long node_spanned_pages;
+};
+
+extern struct node_data node_data[];
+
+static inline __attribute__((pure)) nodeid_t phys_to_nid(paddr_t addr)
+{
+	nodeid_t nid;
+	VIRTUAL_BUG_ON((paddr_to_pdx(addr) >> memnode_shift) >= memnodemapsize);
+	nid = memnodemap[paddr_to_pdx(addr) >> memnode_shift];
+	VIRTUAL_BUG_ON(nid >= MAX_NUMNODES || !node_data[nid]);
+	return nid;
+}
+
+#define NODE_DATA(nid)		(&(node_data[nid]))
+
+#define node_start_pfn(nid)	(NODE_DATA(nid)->node_start_pfn)
+#define node_spanned_pages(nid)	(NODE_DATA(nid)->node_spanned_pages)
+#define node_end_pfn(nid)       (NODE_DATA(nid)->node_start_pfn + \
+				 NODE_DATA(nid)->node_spanned_pages)
+
+extern int valid_numa_range(paddr_t start, paddr_t end, nodeid_t node);
+extern bool numa_memblks_available(void);
+extern int numa_update_node_memblks(nodeid_t node,
+		paddr_t start, paddr_t size, bool hotplug);
+extern void numa_set_processor_nodes_parsed(nodeid_t node);
+extern int numa_scan_nodes(u64 start, u64 end);
+
+extern u8 __node_distance(nodeid_t a, nodeid_t b);
+extern uint32_t arch_meminfo_get_nr_bank(void);
+extern int arch_meminfo_get_ram_bank_range(uint32_t bank,
+    paddr_t *start, paddr_t *end);
+
+#endif
+
 #endif /* _XEN_NUMA_H */
-- 
2.25.1



^ permalink raw reply	[flat|nested] 192+ messages in thread

* [PATCH 19/37] xen/x86: promote VIRTUAL_BUG_ON to ASSERT in
  2021-09-23 12:01 [PATCH 00/37] Add device tree based NUMA support to Arm Wei Chen
                   ` (17 preceding siblings ...)
  2021-09-23 12:02 ` [PATCH 18/37] xen: move NUMA common code from x86 to common Wei Chen
@ 2021-09-23 12:02 ` Wei Chen
  2022-01-17 16:21   ` Jan Beulich
  2021-09-23 12:02 ` [PATCH 20/37] xen: introduce CONFIG_EFI to stub API for non-EFI architecture Wei Chen
                   ` (17 subsequent siblings)
  36 siblings, 1 reply; 192+ messages in thread
From: Wei Chen @ 2021-09-23 12:02 UTC (permalink / raw)
  To: wei.chen, xen-devel, sstabellini, julien; +Cc: Bertrand.Marquis

VIRTUAL_BUG_ON, which is used in phys_to_nid, is an empty macro. As a
result, the two lines of error-checking code in phys_to_nid do not
actually work. It also covers up two compilation errors:
1. error: ‘MAX_NUMNODES’ undeclared (first use in this function).
   This is because MAX_NUMNODES is defined in xen/numa.h, but
   asm/numa.h is a dependency of xen/numa.h, so we can't include
   xen/numa.h in asm/numa.h. This error has been fixed since we
   moved phys_to_nid to xen/numa.h.
2. error: wrong type argument to unary exclamation mark.
   This is because the error-checking code contains !node_data[nid],
   but node_data is an array of structures, not an array of pointers.

So, in this patch, we use ASSERT in VIRTUAL_BUG_ON to enable the two
lines of error-checking code, and fix the remaining compilation error
by replacing !node_data[nid] with !node_data[nid].node_spanned_pages.

When node_spanned_pages is 0, the node has no memory, and
numa_scan_nodes prints a warning message for such nodes:
"Firmware Bug or mis-configured hardware?". Even though Xen allows
such nodes to be brought online, phys_to_nid should never return a
memory-less node for a physical address.

Signed-off-by: Wei Chen <wei.chen@arm.com>
---
 xen/include/xen/numa.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/xen/include/xen/numa.h b/xen/include/xen/numa.h
index 51391a2440..1978e2be1b 100644
--- a/xen/include/xen/numa.h
+++ b/xen/include/xen/numa.h
@@ -38,7 +38,7 @@ struct node {
 extern int compute_hash_shift(struct node *nodes, int numnodes,
 			      nodeid_t *nodeids);
 
-#define VIRTUAL_BUG_ON(x)
+#define VIRTUAL_BUG_ON(x) ASSERT(!(x))
 
 extern void numa_add_cpu(int cpu);
 extern void numa_init_array(void);
@@ -75,7 +75,7 @@ static inline __attribute__((pure)) nodeid_t phys_to_nid(paddr_t addr)
 	nodeid_t nid;
 	VIRTUAL_BUG_ON((paddr_to_pdx(addr) >> memnode_shift) >= memnodemapsize);
 	nid = memnodemap[paddr_to_pdx(addr) >> memnode_shift];
-	VIRTUAL_BUG_ON(nid >= MAX_NUMNODES || !node_data[nid]);
+	VIRTUAL_BUG_ON(nid >= MAX_NUMNODES || !node_data[nid].node_spanned_pages);
 	return nid;
 }
 
-- 
2.25.1




* [PATCH 20/37] xen: introduce CONFIG_EFI to stub API for non-EFI architecture
  2021-09-23 12:01 [PATCH 00/37] Add device tree based NUMA support to Arm Wei Chen
                   ` (18 preceding siblings ...)
  2021-09-23 12:02 ` [PATCH 19/37] xen/x86: promote VIRTUAL_BUG_ON to ASSERT in Wei Chen
@ 2021-09-23 12:02 ` Wei Chen
  2021-09-24  1:15   ` Stefano Stabellini
  2022-01-25 10:34   ` Jan Beulich
  2021-09-23 12:02 ` [PATCH 21/37] xen/arm: Keep memory nodes in dtb for NUMA when boot from EFI Wei Chen
                   ` (16 subsequent siblings)
  36 siblings, 2 replies; 192+ messages in thread
From: Wei Chen @ 2021-09-23 12:02 UTC (permalink / raw)
  To: wei.chen, xen-devel, sstabellini, julien; +Cc: Bertrand.Marquis

Some architectures do not support EFI, but the EFI APIs are used by
some common features. Instead of spreading #ifdef ARCH around, we
introduce this Kconfig option to give Xen the ability to stub the
EFI APIs on architectures without EFI support.

Signed-off-by: Wei Chen <wei.chen@arm.com>
---
 xen/arch/arm/Kconfig  |  1 +
 xen/arch/arm/Makefile |  2 +-
 xen/arch/x86/Kconfig  |  1 +
 xen/common/Kconfig    | 11 +++++++++++
 xen/include/xen/efi.h |  4 ++++
 5 files changed, 18 insertions(+), 1 deletion(-)

diff --git a/xen/arch/arm/Kconfig b/xen/arch/arm/Kconfig
index ecfa6822e4..865ad83a89 100644
--- a/xen/arch/arm/Kconfig
+++ b/xen/arch/arm/Kconfig
@@ -6,6 +6,7 @@ config ARM_64
 	def_bool y
 	depends on !ARM_32
 	select 64BIT
+	select EFI
 	select HAS_FAST_MULTIPLY
 
 config ARM
diff --git a/xen/arch/arm/Makefile b/xen/arch/arm/Makefile
index 3d3b97b5b4..ae4efbf76e 100644
--- a/xen/arch/arm/Makefile
+++ b/xen/arch/arm/Makefile
@@ -1,6 +1,6 @@
 obj-$(CONFIG_ARM_32) += arm32/
 obj-$(CONFIG_ARM_64) += arm64/
-obj-$(CONFIG_ARM_64) += efi/
+obj-$(CONFIG_EFI) += efi/
 obj-$(CONFIG_ACPI) += acpi/
 ifneq ($(CONFIG_NO_PLAT),y)
 obj-y += platforms/
diff --git a/xen/arch/x86/Kconfig b/xen/arch/x86/Kconfig
index 28d13b9705..b9ed187f6b 100644
--- a/xen/arch/x86/Kconfig
+++ b/xen/arch/x86/Kconfig
@@ -10,6 +10,7 @@ config X86
 	select ALTERNATIVE_CALL
 	select ARCH_SUPPORTS_INT128
 	select CORE_PARKING
+	select EFI
 	select HAS_ALTERNATIVE
 	select HAS_COMPAT
 	select HAS_CPUFREQ
diff --git a/xen/common/Kconfig b/xen/common/Kconfig
index 9ebb1c239b..f998746a1a 100644
--- a/xen/common/Kconfig
+++ b/xen/common/Kconfig
@@ -11,6 +11,16 @@ config COMPAT
 config CORE_PARKING
 	bool
 
+config EFI
+	bool
+	---help---
+      This option provides support for runtime services provided
+      by UEFI firmware (such as non-volatile variables, realtime
+      clock, and platform reset). A UEFI stub is also provided to
+      allow the kernel to be booted as an EFI application. This
+      is only useful for kernels that may run on systems that have
+      UEFI firmware.
+
 config GRANT_TABLE
 	bool "Grant table support" if EXPERT
 	default y
@@ -196,6 +206,7 @@ config KEXEC
 
 config EFI_SET_VIRTUAL_ADDRESS_MAP
     bool "EFI: call SetVirtualAddressMap()" if EXPERT
+    depends on EFI
     ---help---
       Call EFI SetVirtualAddressMap() runtime service to setup memory map for
       further runtime services. According to UEFI spec, it isn't strictly
diff --git a/xen/include/xen/efi.h b/xen/include/xen/efi.h
index 94a7e547f9..661a48286a 100644
--- a/xen/include/xen/efi.h
+++ b/xen/include/xen/efi.h
@@ -25,6 +25,8 @@ extern struct efi efi;
 
 #ifndef __ASSEMBLY__
 
+#ifdef CONFIG_EFI
+
 union xenpf_efi_info;
 union compat_pf_efi_info;
 
@@ -45,6 +47,8 @@ int efi_runtime_call(struct xenpf_efi_runtime_call *);
 int efi_compat_get_info(uint32_t idx, union compat_pf_efi_info *);
 int efi_compat_runtime_call(struct compat_pf_efi_runtime_call *);
 
+#endif /* CONFIG_EFI*/
+
 #endif /* !__ASSEMBLY__ */
 
 #endif /* __XEN_EFI_H__ */
-- 
2.25.1




* [PATCH 21/37] xen/arm: Keep memory nodes in dtb for NUMA when boot from EFI
  2021-09-23 12:01 [PATCH 00/37] Add device tree based NUMA support to Arm Wei Chen
                   ` (19 preceding siblings ...)
  2021-09-23 12:02 ` [PATCH 20/37] xen: introduce CONFIG_EFI to stub API for non-EFI architecture Wei Chen
@ 2021-09-23 12:02 ` Wei Chen
  2021-09-24  1:23   ` Stefano Stabellini
  2022-01-25 10:38   ` Jan Beulich
  2021-09-23 12:02 ` [PATCH 22/37] xen/arm: use NR_MEM_BANKS to override default NR_NODE_MEMBLKS Wei Chen
                   ` (15 subsequent siblings)
  36 siblings, 2 replies; 192+ messages in thread
From: Wei Chen @ 2021-09-23 12:02 UTC (permalink / raw)
  To: wei.chen, xen-devel, sstabellini, julien; +Cc: Bertrand.Marquis

EFI can get the memory map from the EFI system table. But the EFI
system table doesn't contain memory NUMA information, so EFI depends
on the ACPI SRAT or device tree memory nodes to parse the NUMA
mapping of memory blocks.

But in the current code, when Xen boots from EFI, it deletes all
memory nodes in the device tree. So in a UEFI + DTB boot, we no
longer have numa-node-id for memory blocks.

So in this patch, we keep the memory nodes in the device tree so
that the NUMA code can parse the memory numa-node-id later.

As a side effect, if we still parse boot memory information in
early_scan_node, bootinfo.mem will accumulate the memory ranges
from the memory nodes twice. So we have to prevent early_scan_node
from parsing memory nodes in an EFI boot.

As the EFI APIs can only be used on Arm64, we introduce a stub API
for Arm32, which has no EFI support. This prevents build failures
on Arm32.
Signed-off-by: Wei Chen <wei.chen@arm.com>
---
 xen/arch/arm/bootfdt.c      |  8 +++++++-
 xen/arch/arm/efi/efi-boot.h | 25 -------------------------
 xen/include/xen/efi.h       |  7 +++++++
 3 files changed, 14 insertions(+), 26 deletions(-)

diff --git a/xen/arch/arm/bootfdt.c b/xen/arch/arm/bootfdt.c
index afaa0e249b..6bc5a465ec 100644
--- a/xen/arch/arm/bootfdt.c
+++ b/xen/arch/arm/bootfdt.c
@@ -11,6 +11,7 @@
 #include <xen/lib.h>
 #include <xen/kernel.h>
 #include <xen/init.h>
+#include <xen/efi.h>
 #include <xen/device_tree.h>
 #include <xen/libfdt/libfdt.h>
 #include <xen/sort.h>
@@ -370,7 +371,12 @@ static int __init early_scan_node(const void *fdt,
 {
     int rc = 0;
 
-    if ( device_tree_node_matches(fdt, node, "memory") )
+    /*
+     * If Xen has been booted via UEFI, the memory banks will already
+     * be populated. So we should skip the parsing.
+     */
+    if ( !efi_enabled(EFI_BOOT) &&
+         device_tree_node_matches(fdt, node, "memory"))
         rc = process_memory_node(fdt, node, name, depth,
                                  address_cells, size_cells, &bootinfo.mem);
     else if ( depth == 1 && !dt_node_cmp(name, "reserved-memory") )
diff --git a/xen/arch/arm/efi/efi-boot.h b/xen/arch/arm/efi/efi-boot.h
index cf9c37153f..d0a9987fa4 100644
--- a/xen/arch/arm/efi/efi-boot.h
+++ b/xen/arch/arm/efi/efi-boot.h
@@ -197,33 +197,8 @@ EFI_STATUS __init fdt_add_uefi_nodes(EFI_SYSTEM_TABLE *sys_table,
     int status;
     u32 fdt_val32;
     u64 fdt_val64;
-    int prev;
     int num_rsv;
 
-    /*
-     * Delete any memory nodes present.  The EFI memory map is the only
-     * memory description provided to Xen.
-     */
-    prev = 0;
-    for (;;)
-    {
-        const char *type;
-        int len;
-
-        node = fdt_next_node(fdt, prev, NULL);
-        if ( node < 0 )
-            break;
-
-        type = fdt_getprop(fdt, node, "device_type", &len);
-        if ( type && strncmp(type, "memory", len) == 0 )
-        {
-            fdt_del_node(fdt, node);
-            continue;
-        }
-
-        prev = node;
-    }
-
    /*
     * Delete all memory reserve map entries. When booting via UEFI,
     * kernel will use the UEFI memory map to find reserved regions.
diff --git a/xen/include/xen/efi.h b/xen/include/xen/efi.h
index 661a48286a..b52a4678e9 100644
--- a/xen/include/xen/efi.h
+++ b/xen/include/xen/efi.h
@@ -47,6 +47,13 @@ int efi_runtime_call(struct xenpf_efi_runtime_call *);
 int efi_compat_get_info(uint32_t idx, union compat_pf_efi_info *);
 int efi_compat_runtime_call(struct compat_pf_efi_runtime_call *);
 
+#else
+
+static inline bool efi_enabled(unsigned int feature)
+{
+    return false;
+}
+
 #endif /* CONFIG_EFI*/
 
 #endif /* !__ASSEMBLY__ */
-- 
2.25.1




* [PATCH 22/37] xen/arm: use NR_MEM_BANKS to override default NR_NODE_MEMBLKS
  2021-09-23 12:01 [PATCH 00/37] Add device tree based NUMA support to Arm Wei Chen
                   ` (20 preceding siblings ...)
  2021-09-23 12:02 ` [PATCH 21/37] xen/arm: Keep memory nodes in dtb for NUMA when boot from EFI Wei Chen
@ 2021-09-23 12:02 ` Wei Chen
  2021-09-24  1:34   ` Stefano Stabellini
  2021-09-23 12:02 ` [PATCH 23/37] xen/arm: implement node distance helpers for Arm Wei Chen
                   ` (14 subsequent siblings)
  36 siblings, 1 reply; 192+ messages in thread
From: Wei Chen @ 2021-09-23 12:02 UTC (permalink / raw)
  To: wei.chen, xen-devel, sstabellini, julien; +Cc: Bertrand.Marquis

A memory range described in the device tree cannot be split across
multiple nodes, so we define NR_NODE_MEMBLKS as NR_MEM_BANKS in the
arch header, and keep the default NR_NODE_MEMBLKS in the common
header for architectures where NUMA is disabled.

Signed-off-by: Wei Chen <wei.chen@arm.com>
---
 xen/include/asm-arm/numa.h | 8 +++++++-
 xen/include/xen/numa.h     | 2 ++
 2 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/xen/include/asm-arm/numa.h b/xen/include/asm-arm/numa.h
index 8f1c67e3eb..21569e634b 100644
--- a/xen/include/asm-arm/numa.h
+++ b/xen/include/asm-arm/numa.h
@@ -3,9 +3,15 @@
 
 #include <xen/mm.h>
 
+#include <asm/setup.h>
+
 typedef u8 nodeid_t;
 
-#ifndef CONFIG_NUMA
+#ifdef CONFIG_NUMA
+
+#define NR_NODE_MEMBLKS NR_MEM_BANKS
+
+#else
 
 /* Fake one node for now. See also node_online_map. */
 #define cpu_to_node(cpu) 0
diff --git a/xen/include/xen/numa.h b/xen/include/xen/numa.h
index 1978e2be1b..1731e1cc6b 100644
--- a/xen/include/xen/numa.h
+++ b/xen/include/xen/numa.h
@@ -12,7 +12,9 @@
 #define MAX_NUMNODES    1
 #endif
 
+#ifndef NR_NODE_MEMBLKS
 #define NR_NODE_MEMBLKS (MAX_NUMNODES*2)
+#endif
 
 #define vcpu_to_node(v) (cpu_to_node((v)->processor))
 
-- 
2.25.1



^ permalink raw reply	[flat|nested] 192+ messages in thread

* [PATCH 23/37] xen/arm: implement node distance helpers for Arm
  2021-09-23 12:01 [PATCH 00/37] Add device tree based NUMA support to Arm Wei Chen
                   ` (21 preceding siblings ...)
  2021-09-23 12:02 ` [PATCH 22/37] xen/arm: use NR_MEM_BANKS to override default NR_NODE_MEMBLKS Wei Chen
@ 2021-09-23 12:02 ` Wei Chen
  2021-09-24  1:46   ` Stefano Stabellini
  2021-09-23 12:02 ` [PATCH 24/37] xen/arm: implement two arch helpers to get memory map info Wei Chen
                   ` (13 subsequent siblings)
  36 siblings, 1 reply; 192+ messages in thread
From: Wei Chen @ 2021-09-23 12:02 UTC (permalink / raw)
  To: wei.chen, xen-devel, sstabellini, julien; +Cc: Bertrand.Marquis

We will parse NUMA node distances from the device tree or ACPI
table, so we need a matrix to record the distances between any
two parsed nodes. Accordingly, in this patch we provide the
numa_set_distance API for the device tree or ACPI table parsers
to set the distance between any two nodes.
When NUMA initialization fails, __node_distance will return
NUMA_REMOTE_DISTANCE; this helps us avoid rolling back the
distance matrix on failure.

Signed-off-by: Wei Chen <wei.chen@arm.com>
---
 xen/arch/arm/Makefile      |  1 +
 xen/arch/arm/numa.c        | 69 ++++++++++++++++++++++++++++++++++++++
 xen/include/asm-arm/numa.h | 13 +++++++
 3 files changed, 83 insertions(+)
 create mode 100644 xen/arch/arm/numa.c

diff --git a/xen/arch/arm/Makefile b/xen/arch/arm/Makefile
index ae4efbf76e..41ca311b6b 100644
--- a/xen/arch/arm/Makefile
+++ b/xen/arch/arm/Makefile
@@ -35,6 +35,7 @@ obj-$(CONFIG_LIVEPATCH) += livepatch.o
 obj-y += mem_access.o
 obj-y += mm.o
 obj-y += monitor.o
+obj-$(CONFIG_NUMA) += numa.o
 obj-y += p2m.o
 obj-y += percpu.o
 obj-y += platform.o
diff --git a/xen/arch/arm/numa.c b/xen/arch/arm/numa.c
new file mode 100644
index 0000000000..3f08870d69
--- /dev/null
+++ b/xen/arch/arm/numa.c
@@ -0,0 +1,69 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Arm Architecture support layer for NUMA.
+ *
+ * Copyright (C) 2021 Arm Ltd
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program. If not, see <http://www.gnu.org/licenses/>.
+ *
+ */
+#include <xen/init.h>
+#include <xen/numa.h>
+
+static uint8_t __read_mostly
+node_distance_map[MAX_NUMNODES][MAX_NUMNODES] = {
+    { 0 }
+};
+
+void __init numa_set_distance(nodeid_t from, nodeid_t to, uint32_t distance)
+{
+    if ( from >= MAX_NUMNODES || to >= MAX_NUMNODES )
+    {
+        printk(KERN_WARNING
+               "NUMA: invalid nodes: from=%"PRIu8" to=%"PRIu8" MAX=%u\n",
+               from, to, MAX_NUMNODES);
+        return;
+    }
+
+    /* NUMA defines 0xff as an unreachable node and 0-9 are undefined */
+    if ( distance >= NUMA_NO_DISTANCE ||
+         (distance >= NUMA_DISTANCE_UDF_MIN &&
+          distance <= NUMA_DISTANCE_UDF_MAX) ||
+         (from == to && distance != NUMA_LOCAL_DISTANCE) )
+    {
+        printk(KERN_WARNING
+               "NUMA: invalid distance: from=%"PRIu8" to=%"PRIu8" distance=%"PRIu32"\n",
+               from, to, distance);
+        return;
+    }
+
+    node_distance_map[from][to] = distance;
+}
+
+uint8_t __node_distance(nodeid_t from, nodeid_t to)
+{
+    /* When NUMA is off, any distance will be treated as remote. */
+    if ( srat_disabled() )
+        return NUMA_REMOTE_DISTANCE;
+
+    /*
+     * Check whether the nodes are in the matrix range.
+     * When any node is out of range, except from and to nodes are the
+     * same, we treat them as unreachable (return 0xFF)
+     */
+    if ( from >= MAX_NUMNODES || to >= MAX_NUMNODES )
+        return from == to ? NUMA_LOCAL_DISTANCE : NUMA_NO_DISTANCE;
+
+    return node_distance_map[from][to];
+}
+EXPORT_SYMBOL(__node_distance);
diff --git a/xen/include/asm-arm/numa.h b/xen/include/asm-arm/numa.h
index 21569e634b..758eafeb05 100644
--- a/xen/include/asm-arm/numa.h
+++ b/xen/include/asm-arm/numa.h
@@ -9,8 +9,21 @@ typedef u8 nodeid_t;
 
 #ifdef CONFIG_NUMA
 
+/*
+ * In ACPI spec, 0-9 are the reserved values for node distance,
+ * 10 indicates local node distance, 20 indicates remote node
+ * distance. Set node distance map in device tree will follow
+ * the ACPI's definition.
+ */
+#define NUMA_DISTANCE_UDF_MIN   0
+#define NUMA_DISTANCE_UDF_MAX   9
+#define NUMA_LOCAL_DISTANCE     10
+#define NUMA_REMOTE_DISTANCE    20
+
 #define NR_NODE_MEMBLKS NR_MEM_BANKS
 
+extern void numa_set_distance(nodeid_t from, nodeid_t to, uint32_t distance);
+
 #else
 
 /* Fake one node for now. See also node_online_map. */
-- 
2.25.1



^ permalink raw reply	[flat|nested] 192+ messages in thread

* [PATCH 24/37] xen/arm: implement two arch helpers to get memory map info
  2021-09-23 12:01 [PATCH 00/37] Add device tree based NUMA support to Arm Wei Chen
                   ` (22 preceding siblings ...)
  2021-09-23 12:02 ` [PATCH 23/37] xen/arm: implement node distance helpers for Arm Wei Chen
@ 2021-09-23 12:02 ` Wei Chen
  2021-09-24  2:06   ` Stefano Stabellini
  2021-09-23 12:02 ` [PATCH 25/37] xen/arm: implement bad_srat for Arm NUMA initialization Wei Chen
                   ` (12 subsequent siblings)
  36 siblings, 1 reply; 192+ messages in thread
From: Wei Chen @ 2021-09-23 12:02 UTC (permalink / raw)
  To: wei.chen, xen-devel, sstabellini, julien; +Cc: Bertrand.Marquis

These two helpers are architecture APIs that are required by
nodes_cover_memory.

Signed-off-by: Wei Chen <wei.chen@arm.com>
---
 xen/arch/arm/numa.c | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/xen/arch/arm/numa.c b/xen/arch/arm/numa.c
index 3f08870d69..3755b01ef4 100644
--- a/xen/arch/arm/numa.c
+++ b/xen/arch/arm/numa.c
@@ -67,3 +67,17 @@ uint8_t __node_distance(nodeid_t from, nodeid_t to)
     return node_distance_map[from][to];
 }
 EXPORT_SYMBOL(__node_distance);
+
+uint32_t __init arch_meminfo_get_nr_bank(void)
+{
+    return bootinfo.mem.nr_banks;
+}
+
+int __init arch_meminfo_get_ram_bank_range(uint32_t bank,
+                                           paddr_t *start, paddr_t *end)
+{
+    *start = bootinfo.mem.bank[bank].start;
+    *end = bootinfo.mem.bank[bank].start + bootinfo.mem.bank[bank].size;
+
+    return 0;
+}
-- 
2.25.1



^ permalink raw reply	[flat|nested] 192+ messages in thread

* [PATCH 25/37] xen/arm: implement bad_srat for Arm NUMA initialization
  2021-09-23 12:01 [PATCH 00/37] Add device tree based NUMA support to Arm Wei Chen
                   ` (23 preceding siblings ...)
  2021-09-23 12:02 ` [PATCH 24/37] xen/arm: implement two arch helpers to get memory map info Wei Chen
@ 2021-09-23 12:02 ` Wei Chen
  2021-09-24  2:09   ` Stefano Stabellini
  2021-09-23 12:02 ` [PATCH 26/37] xen/arm: build NUMA cpu_to_node map in dt_smp_init_cpus Wei Chen
                   ` (11 subsequent siblings)
  36 siblings, 1 reply; 192+ messages in thread
From: Wei Chen @ 2021-09-23 12:02 UTC (permalink / raw)
  To: wei.chen, xen-devel, sstabellini, julien; +Cc: Bertrand.Marquis

NUMA initialization parses information from the firmware-provided
static resource affinity table (ACPI SRAT or DTB). bad_srat is a
function that will be used when the initialization code encounters
unexpected errors.

In this patch, we introduce the Arm version of bad_srat for the
NUMA common initialization code to invoke.

Signed-off-by: Wei Chen <wei.chen@arm.com>
---
 xen/arch/arm/numa.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/xen/arch/arm/numa.c b/xen/arch/arm/numa.c
index 3755b01ef4..5209d3de4d 100644
--- a/xen/arch/arm/numa.c
+++ b/xen/arch/arm/numa.c
@@ -18,6 +18,7 @@
  *
  */
 #include <xen/init.h>
+#include <xen/nodemask.h>
 #include <xen/numa.h>
 
 static uint8_t __read_mostly
@@ -25,6 +26,12 @@ node_distance_map[MAX_NUMNODES][MAX_NUMNODES] = {
     { 0 }
 };
 
+__init void bad_srat(void)
+{
+    printk(KERN_ERR "NUMA: Firmware SRAT table not used.\n");
+    fw_numa = -1;
+}
+
 void __init numa_set_distance(nodeid_t from, nodeid_t to, uint32_t distance)
 {
     if ( from >= MAX_NUMNODES || to >= MAX_NUMNODES )
-- 
2.25.1



^ permalink raw reply	[flat|nested] 192+ messages in thread

* [PATCH 26/37] xen/arm: build NUMA cpu_to_node map in dt_smp_init_cpus
  2021-09-23 12:01 [PATCH 00/37] Add device tree based NUMA support to Arm Wei Chen
                   ` (24 preceding siblings ...)
  2021-09-23 12:02 ` [PATCH 25/37] xen/arm: implement bad_srat for Arm NUMA initialization Wei Chen
@ 2021-09-23 12:02 ` Wei Chen
  2021-09-24  2:26   ` Stefano Stabellini
  2021-09-23 12:02 ` [PATCH 27/37] xen/arm: Add boot and secondary CPU to NUMA system Wei Chen
                   ` (10 subsequent siblings)
  36 siblings, 1 reply; 192+ messages in thread
From: Wei Chen @ 2021-09-23 12:02 UTC (permalink / raw)
  To: wei.chen, xen-devel, sstabellini, julien; +Cc: Bertrand.Marquis

The NUMA implementation has a cpu_to_node array to store the
CPU-to-node map. Xen uses the CPU logical ID in runtime components,
so we use the CPU logical ID as the CPU index in cpu_to_node.

In the device tree case, cpu_logical_map is created in
dt_smp_init_cpus. So, when NUMA is enabled, dt_smp_init_cpus fetches
the CPU's NUMA id at the same time for cpu_to_node.

Signed-off-by: Wei Chen <wei.chen@arm.com>
---
 xen/arch/arm/smpboot.c     | 37 ++++++++++++++++++++++++++++++++++++-
 xen/include/asm-arm/numa.h |  5 +++++
 2 files changed, 41 insertions(+), 1 deletion(-)

diff --git a/xen/arch/arm/smpboot.c b/xen/arch/arm/smpboot.c
index 60c0e82fc5..6e3cc8d3cc 100644
--- a/xen/arch/arm/smpboot.c
+++ b/xen/arch/arm/smpboot.c
@@ -121,7 +121,12 @@ static void __init dt_smp_init_cpus(void)
     {
         [0 ... NR_CPUS - 1] = MPIDR_INVALID
     };
+    static nodeid_t node_map[NR_CPUS] __initdata =
+    {
+        [0 ... NR_CPUS - 1] = NUMA_NO_NODE
+    };
     bool bootcpu_valid = false;
+    uint32_t nid = 0;
     int rc;
 
     mpidr = system_cpuinfo.mpidr.bits & MPIDR_HWID_MASK;
@@ -172,6 +177,28 @@ static void __init dt_smp_init_cpus(void)
             continue;
         }
 
+        if ( IS_ENABLED(CONFIG_NUMA) )
+        {
+            /*
+             * When CONFIG_NUMA is set, try to fetch NUMA information
+             * from CPU dts node, otherwise the nid is always 0.
+             */
+            if ( !dt_property_read_u32(cpu, "numa-node-id", &nid) )
+            {
+                printk(XENLOG_WARNING
+                       "cpu[%d] dts path: %s: doesn't have numa information!\n",
+                       cpuidx, dt_node_full_name(cpu));
+                /*
+                 * During the early stage of NUMA initialization, if Xen
+                 * finds any CPU dts node without numa-node-id info, NUMA
+                 * will be treated as off and all CPUs will be placed on a
+                 * fake node 0. So if reading numa-node-id failed here, we
+                 * should set nid to 0.
+                 */
+                nid = 0;
+            }
+        }
+
         /*
          * 8 MSBs must be set to 0 in the DT since the reg property
          * defines the MPIDR[23:0]
@@ -231,9 +258,12 @@ static void __init dt_smp_init_cpus(void)
         {
             printk("cpu%d init failed (hwid %"PRIregister"): %d\n", i, hwid, rc);
             tmp_map[i] = MPIDR_INVALID;
+            node_map[i] = NUMA_NO_NODE;
         }
-        else
+        else {
             tmp_map[i] = hwid;
+            node_map[i] = nid;
+        }
     }
 
     if ( !bootcpu_valid )
@@ -249,6 +279,11 @@ static void __init dt_smp_init_cpus(void)
             continue;
         cpumask_set_cpu(i, &cpu_possible_map);
         cpu_logical_map(i) = tmp_map[i];
+
+        nid = node_map[i];
+        if ( nid >= MAX_NUMNODES )
+            nid = 0;
+        numa_set_node(i, nid);
     }
 }
 
diff --git a/xen/include/asm-arm/numa.h b/xen/include/asm-arm/numa.h
index 758eafeb05..8a4ad379e0 100644
--- a/xen/include/asm-arm/numa.h
+++ b/xen/include/asm-arm/numa.h
@@ -46,6 +46,11 @@ extern mfn_t first_valid_mfn;
 #define node_start_pfn(nid) (mfn_x(first_valid_mfn))
 #define __node_distance(a, b) (20)
 
+static inline void numa_set_node(int cpu, nodeid_t node)
+{
+
+}
+
 #endif
 
 static inline unsigned int arch_have_default_dmazone(void)
-- 
2.25.1



^ permalink raw reply	[flat|nested] 192+ messages in thread

* [PATCH 27/37] xen/arm: Add boot and secondary CPU to NUMA system
  2021-09-23 12:01 [PATCH 00/37] Add device tree based NUMA support to Arm Wei Chen
                   ` (25 preceding siblings ...)
  2021-09-23 12:02 ` [PATCH 26/37] xen/arm: build NUMA cpu_to_node map in dt_smp_init_cpus Wei Chen
@ 2021-09-23 12:02 ` Wei Chen
  2021-09-23 12:02 ` [PATCH 28/37] xen/arm: stub memory hotplug access helpers for Arm Wei Chen
                   ` (9 subsequent siblings)
  36 siblings, 0 replies; 192+ messages in thread
From: Wei Chen @ 2021-09-23 12:02 UTC (permalink / raw)
  To: wei.chen, xen-devel, sstabellini, julien; +Cc: Bertrand.Marquis

In this patch, we bring NUMA nodes online and add CPUs to their
NUMA nodes. This gives NUMA-aware components the NUMA affinity
data they need to do their work.

To keep mostly the same behavior as x86, we still use
srat_detect_node to bring nodes online. The difference is that we
have already prepared cpu_to_node in dt_smp_init_cpus, so we don't
need to set up cpu_to_node in srat_detect_node.

Signed-off-by: Wei Chen <wei.chen@arm.com>
---
 xen/arch/arm/numa.c        | 10 ++++++++++
 xen/arch/arm/setup.c       |  5 +++++
 xen/include/asm-arm/numa.h | 10 ++++++++++
 3 files changed, 25 insertions(+)

diff --git a/xen/arch/arm/numa.c b/xen/arch/arm/numa.c
index 5209d3de4d..7f05299b76 100644
--- a/xen/arch/arm/numa.c
+++ b/xen/arch/arm/numa.c
@@ -32,6 +32,16 @@ __init void bad_srat(void)
     fw_numa = -1;
 }
 
+void srat_detect_node(int cpu)
+{
+    nodeid_t node = cpu_to_node[cpu];
+
+    if ( node == NUMA_NO_NODE )
+        node = 0;
+
+    node_set_online(node);
+}
+
 void __init numa_set_distance(nodeid_t from, nodeid_t to, uint32_t distance)
 {
     if ( from >= MAX_NUMNODES || to >= MAX_NUMNODES )
diff --git a/xen/arch/arm/setup.c b/xen/arch/arm/setup.c
index 49dc90d198..1f0fbc95b5 100644
--- a/xen/arch/arm/setup.c
+++ b/xen/arch/arm/setup.c
@@ -988,6 +988,11 @@ void __init start_xen(unsigned long boot_phys_offset,
 
     for_each_present_cpu ( i )
     {
+        /* Detect and online node based on cpu_to_node[] */
+        srat_detect_node(i);
+        /* Set up node_to_cpumask based on cpu_to_node[]. */
+        numa_add_cpu(i);
+
         if ( (num_online_cpus() < cpus) && !cpu_online(i) )
         {
             int ret = cpu_up(i);
diff --git a/xen/include/asm-arm/numa.h b/xen/include/asm-arm/numa.h
index 8a4ad379e0..7675012cb7 100644
--- a/xen/include/asm-arm/numa.h
+++ b/xen/include/asm-arm/numa.h
@@ -46,11 +46,21 @@ extern mfn_t first_valid_mfn;
 #define node_start_pfn(nid) (mfn_x(first_valid_mfn))
 #define __node_distance(a, b) (20)
 
+static inline void numa_add_cpu(int cpu)
+{
+
+}
+
 static inline void numa_set_node(int cpu, nodeid_t node)
 {
 
 }
 
+static inline void srat_detect_node(int cpu)
+{
+
+}
+
 #endif
 
 static inline unsigned int arch_have_default_dmazone(void)
-- 
2.25.1



^ permalink raw reply	[flat|nested] 192+ messages in thread

* [PATCH 28/37] xen/arm: stub memory hotplug access helpers for Arm
  2021-09-23 12:01 [PATCH 00/37] Add device tree based NUMA support to Arm Wei Chen
                   ` (26 preceding siblings ...)
  2021-09-23 12:02 ` [PATCH 27/37] xen/arm: Add boot and secondary CPU to NUMA system Wei Chen
@ 2021-09-23 12:02 ` Wei Chen
  2021-09-24  2:33   ` Stefano Stabellini
  2021-09-23 12:02 ` [PATCH 29/37] xen/arm: introduce a helper to parse device tree processor node Wei Chen
                   ` (8 subsequent siblings)
  36 siblings, 1 reply; 192+ messages in thread
From: Wei Chen @ 2021-09-23 12:02 UTC (permalink / raw)
  To: wei.chen, xen-devel, sstabellini, julien; +Cc: Bertrand.Marquis

Common code in NUMA needs these two helpers to access/update the
memory hotplug end address. Arm does not support memory hotplug
yet, so we stub these two helpers in this patch to keep the NUMA
common code happy.

Signed-off-by: Wei Chen <wei.chen@arm.com>
---
 xen/include/asm-arm/mm.h | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/xen/include/asm-arm/mm.h b/xen/include/asm-arm/mm.h
index 7b5e7b7f69..fc9433165d 100644
--- a/xen/include/asm-arm/mm.h
+++ b/xen/include/asm-arm/mm.h
@@ -362,6 +362,16 @@ void clear_and_clean_page(struct page_info *page);
 
 unsigned int arch_get_dma_bitsize(void);
 
+static inline void mem_hotplug_update_boundary(paddr_t end)
+{
+
+}
+
+static inline paddr_t mem_hotplug_boundary(void)
+{
+    return 0;
+}
+
 #endif /*  __ARCH_ARM_MM__ */
 /*
  * Local variables:
-- 
2.25.1



^ permalink raw reply	[flat|nested] 192+ messages in thread

* [PATCH 29/37] xen/arm: introduce a helper to parse device tree processor node
  2021-09-23 12:01 [PATCH 00/37] Add device tree based NUMA support to Arm Wei Chen
                   ` (27 preceding siblings ...)
  2021-09-23 12:02 ` [PATCH 28/37] xen/arm: stub memory hotplug access helpers for Arm Wei Chen
@ 2021-09-23 12:02 ` Wei Chen
  2021-09-24  2:44   ` Stefano Stabellini
  2021-09-23 12:02 ` [PATCH 30/37] xen/arm: introduce a helper to parse device tree memory node Wei Chen
                   ` (7 subsequent siblings)
  36 siblings, 1 reply; 192+ messages in thread
From: Wei Chen @ 2021-09-23 12:02 UTC (permalink / raw)
  To: wei.chen, xen-devel, sstabellini, julien; +Cc: Bertrand.Marquis

Processor NUMA ID information is stored in the device tree's
processor nodes as "numa-node-id". We need a new helper to parse
this ID from a processor node. Once we get the ID from a processor
node, its validity still needs to be checked. If we get an invalid
NUMA ID from any processor node, the device tree will be marked as
having invalid NUMA information.

Signed-off-by: Wei Chen <wei.chen@arm.com>
---
 xen/arch/arm/Makefile           |  1 +
 xen/arch/arm/numa_device_tree.c | 58 +++++++++++++++++++++++++++++++++
 2 files changed, 59 insertions(+)
 create mode 100644 xen/arch/arm/numa_device_tree.c

diff --git a/xen/arch/arm/Makefile b/xen/arch/arm/Makefile
index 41ca311b6b..c50df2c25d 100644
--- a/xen/arch/arm/Makefile
+++ b/xen/arch/arm/Makefile
@@ -36,6 +36,7 @@ obj-y += mem_access.o
 obj-y += mm.o
 obj-y += monitor.o
 obj-$(CONFIG_NUMA) += numa.o
+obj-$(CONFIG_DEVICE_TREE_NUMA) += numa_device_tree.o
 obj-y += p2m.o
 obj-y += percpu.o
 obj-y += platform.o
diff --git a/xen/arch/arm/numa_device_tree.c b/xen/arch/arm/numa_device_tree.c
new file mode 100644
index 0000000000..2428fbae0b
--- /dev/null
+++ b/xen/arch/arm/numa_device_tree.c
@@ -0,0 +1,58 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Arm Architecture support layer for NUMA.
+ *
+ * Copyright (C) 2021 Arm Ltd
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program. If not, see <http://www.gnu.org/licenses/>.
+ *
+ */
+#include <xen/init.h>
+#include <xen/nodemask.h>
+#include <xen/numa.h>
+#include <xen/libfdt/libfdt.h>
+#include <xen/device_tree.h>
+
+/* Callback for device tree processor affinity */
+static int __init fdt_numa_processor_affinity_init(nodeid_t node)
+{
+    if ( srat_disabled() )
+        return -EINVAL;
+    else if ( node == NUMA_NO_NODE || node >= MAX_NUMNODES )
+    {
+        bad_srat();
+        return -EINVAL;
+    }
+
+    numa_set_processor_nodes_parsed(node);
+    fw_numa = 1;
+
+    printk(KERN_INFO "DT: NUMA node %"PRIu8" processor parsed\n", node);
+
+    return 0;
+}
+
+/* Parse CPU NUMA node info */
+static int __init fdt_parse_numa_cpu_node(const void *fdt, int node)
+{
+    uint32_t nid;
+
+    nid = device_tree_get_u32(fdt, node, "numa-node-id", MAX_NUMNODES);
+    if ( nid >= MAX_NUMNODES )
+    {
+        printk(XENLOG_ERR "Node id %u exceeds maximum value\n", nid);
+        return -EINVAL;
+    }
+
+    return fdt_numa_processor_affinity_init(nid);
+}
-- 
2.25.1



^ permalink raw reply	[flat|nested] 192+ messages in thread

* [PATCH 30/37] xen/arm: introduce a helper to parse device tree memory node
  2021-09-23 12:01 [PATCH 00/37] Add device tree based NUMA support to Arm Wei Chen
                   ` (28 preceding siblings ...)
  2021-09-23 12:02 ` [PATCH 29/37] xen/arm: introduce a helper to parse device tree processor node Wei Chen
@ 2021-09-23 12:02 ` Wei Chen
  2021-09-24  3:05   ` Stefano Stabellini
  2021-09-23 12:02 ` [PATCH 31/37] xen/arm: introduce a helper to parse device tree NUMA distance map Wei Chen
                   ` (6 subsequent siblings)
  36 siblings, 1 reply; 192+ messages in thread
From: Wei Chen @ 2021-09-23 12:02 UTC (permalink / raw)
  To: wei.chen, xen-devel, sstabellini, julien; +Cc: Bertrand.Marquis

Memory blocks' NUMA ID information is stored in device tree's
memory nodes as "numa-node-id". We need a new helper to parse
and verify this ID from memory nodes.

Signed-off-by: Wei Chen <wei.chen@arm.com>
---
 xen/arch/arm/numa_device_tree.c | 80 +++++++++++++++++++++++++++++++++
 1 file changed, 80 insertions(+)

diff --git a/xen/arch/arm/numa_device_tree.c b/xen/arch/arm/numa_device_tree.c
index 2428fbae0b..7918a397fa 100644
--- a/xen/arch/arm/numa_device_tree.c
+++ b/xen/arch/arm/numa_device_tree.c
@@ -42,6 +42,35 @@ static int __init fdt_numa_processor_affinity_init(nodeid_t node)
     return 0;
 }
 
+/* Callback for parsing of the memory regions affinity */
+static int __init fdt_numa_memory_affinity_init(nodeid_t node,
+                                paddr_t start, paddr_t size)
+{
+    int ret;
+
+    if ( srat_disabled() )
+    {
+        return -EINVAL;
+    }
+
+    if ( !numa_memblks_available() )
+    {
+        dprintk(XENLOG_WARNING,
+                "Too many NUMA entries, try a bigger NR_NODE_MEMBLKS\n");
+        bad_srat();
+        return -EINVAL;
+    }
+
+    ret = numa_update_node_memblks(node, start, size, false);
+    if ( ret != 0 )
+    {
+        bad_srat();
+        return -EINVAL;
+    }
+
+    return 0;
+}
+
 /* Parse CPU NUMA node info */
 static int __init fdt_parse_numa_cpu_node(const void *fdt, int node)
 {
@@ -56,3 +85,54 @@ static int __init fdt_parse_numa_cpu_node(const void *fdt, int node)
 
     return fdt_numa_processor_affinity_init(nid);
 }
+
+/* Parse memory node NUMA info */
+static int __init fdt_parse_numa_memory_node(const void *fdt, int node,
+    const char *name, uint32_t addr_cells, uint32_t size_cells)
+{
+    uint32_t nid;
+    int ret = 0, len;
+    paddr_t addr, size;
+    const struct fdt_property *prop;
+    uint32_t idx, ranges;
+    const __be32 *addresses;
+
+    nid = device_tree_get_u32(fdt, node, "numa-node-id", MAX_NUMNODES);
+    if ( nid >= MAX_NUMNODES )
+    {
+        printk(XENLOG_WARNING "Node id %u exceeds maximum value\n", nid);
+        return -EINVAL;
+    }
+
+    prop = fdt_get_property(fdt, node, "reg", &len);
+    if ( !prop )
+    {
+        printk(XENLOG_WARNING
+               "fdt: node `%s': missing `reg' property\n", name);
+        return -EINVAL;
+    }
+
+    addresses = (const __be32 *)prop->data;
+    ranges = len / (sizeof(__be32)* (addr_cells + size_cells));
+    for ( idx = 0; idx < ranges; idx++ )
+    {
+        device_tree_get_reg(&addresses, addr_cells, size_cells, &addr, &size);
+        /* Skip zero size ranges */
+        if ( !size )
+            continue;
+
+        ret = fdt_numa_memory_affinity_init(nid, addr, size);
+        if ( ret ) {
+            return -EINVAL;
+        }
+    }
+
+    if ( idx == 0 )
+    {
+        printk(XENLOG_ERR
+               "bad property in memory node, idx=%u ret=%d\n", idx, ret);
+        return -EINVAL;
+    }
+
+    return 0;
+}
-- 
2.25.1



^ permalink raw reply	[flat|nested] 192+ messages in thread

* [PATCH 31/37] xen/arm: introduce a helper to parse device tree NUMA distance map
  2021-09-23 12:01 [PATCH 00/37] Add device tree based NUMA support to Arm Wei Chen
                   ` (29 preceding siblings ...)
  2021-09-23 12:02 ` [PATCH 30/37] xen/arm: introduce a helper to parse device tree memory node Wei Chen
@ 2021-09-23 12:02 ` Wei Chen
  2021-09-24  3:05   ` Stefano Stabellini
  2021-09-23 12:02 ` [PATCH 32/37] xen/arm: unified entry to parse all NUMA data from device tree Wei Chen
                   ` (5 subsequent siblings)
  36 siblings, 1 reply; 192+ messages in thread
From: Wei Chen @ 2021-09-23 12:02 UTC (permalink / raw)
  To: wei.chen, xen-devel, sstabellini, julien; +Cc: Bertrand.Marquis

A NUMA-aware device tree will provide a "distance-map" node to
describe the distance between any two nodes. This patch introduces
a new helper to parse this distance map.

Signed-off-by: Wei Chen <wei.chen@arm.com>
---
 xen/arch/arm/numa_device_tree.c | 106 ++++++++++++++++++++++++++++++++
 1 file changed, 106 insertions(+)

diff --git a/xen/arch/arm/numa_device_tree.c b/xen/arch/arm/numa_device_tree.c
index 7918a397fa..e7fa84df4c 100644
--- a/xen/arch/arm/numa_device_tree.c
+++ b/xen/arch/arm/numa_device_tree.c
@@ -136,3 +136,109 @@ static int __init fdt_parse_numa_memory_node(const void *fdt, int node,
 
     return 0;
 }
+
+
+/* Parse NUMA distance map v1 */
+static int __init fdt_parse_numa_distance_map_v1(const void *fdt, int node)
+{
+    const struct fdt_property *prop;
+    const __be32 *matrix;
+    uint32_t entry_count;
+    int len, i;
+
+    printk(XENLOG_INFO "NUMA: parsing numa-distance-map\n");
+
+    prop = fdt_get_property(fdt, node, "distance-matrix", &len);
+    if ( !prop )
+    {
+        printk(XENLOG_WARNING
+               "NUMA: No distance-matrix property in distance-map\n");
+
+        return -EINVAL;
+    }
+
+    if ( len % sizeof(uint32_t) != 0 )
+    {
+        printk(XENLOG_WARNING
+               "distance-matrix in node is not a multiple of u32\n");
+        return -EINVAL;
+    }
+
+    entry_count = len / sizeof(uint32_t);
+    if ( entry_count == 0 )
+    {
+        printk(XENLOG_WARNING "NUMA: Invalid distance-matrix\n");
+
+        return -EINVAL;
+    }
+
+    matrix = (const __be32 *)prop->data;
+    for ( i = 0; i + 2 < entry_count; i += 3 )
+    {
+        uint32_t from, to, distance, opposite;
+
+        from = dt_next_cell(1, &matrix);
+        to = dt_next_cell(1, &matrix);
+        distance = dt_next_cell(1, &matrix);
+        if ( (from == to && distance != NUMA_LOCAL_DISTANCE) ||
+             (from != to && distance <= NUMA_LOCAL_DISTANCE) )
+        {
+            printk(XENLOG_WARNING
+                   "NUMA: Invalid distance: NODE#%u->NODE#%u:%u\n",
+                   from, to, distance);
+            return -EINVAL;
+        }
+
+        printk(XENLOG_INFO "NUMA: distance: NODE#%u->NODE#%u:%u\n",
+               from, to, distance);
+
+        /* Get opposite way distance */
+        opposite = __node_distance(from, to);
+        if ( opposite == 0 )
+        {
+            /* Bi-directions are not set, set both */
+            numa_set_distance(from, to, distance);
+            numa_set_distance(to, from, distance);
+        }
+        else
+        {
+            /*
+             * Opposite way distance has been set to a different value.
+             * It may be a firmware device tree bug?
+             */
+            if ( opposite != distance )
+            {
+                /*
+                 * In device tree NUMA distance-matrix binding:
+                 * https://www.kernel.org/doc/Documentation/devicetree/bindings/numa.txt
+                 * There is a notes mentions:
+                 * "Each entry represents distance from first node to
+                 *  second node. The distances are equal in either
+                 *  direction."
+                 *
+                 * That means device tree doesn't permit this case.
+                 * But in ACPI spec, it cares to specifically permit this
+                 * case:
+                 * "Except for the relative distance from a System Locality
+                 *  to itself, each relative distance is stored twice in the
+                 *  matrix. This provides the capability to describe the
+                 *  scenario where the relative distances for the two
+                 *  directions between System Localities is different."
+                 *
+                 * That means a real machine allows such a NUMA
+                 * configuration. So, place a WARNING here to let system
+                 * administrators check whether this is the special case
+                 * where they adjusted the device tree for such machines.
+                 */
+                printk(XENLOG_WARNING
+                       "Un-matched bi-direction! NODE#%u->NODE#%u:%u, NODE#%u->NODE#%u:%u\n",
+                       from, to, distance, to, from, opposite);
+            }
+
+            /* Opposite way distance has been set, just set this way */
+            numa_set_distance(from, to, distance);
+        }
+    }
+
+    return 0;
+}
-- 
2.25.1



^ permalink raw reply	[flat|nested] 192+ messages in thread

* [PATCH 32/37] xen/arm: unified entry to parse all NUMA data from device tree
  2021-09-23 12:01 [PATCH 00/37] Add device tree based NUMA support to Arm Wei Chen
                   ` (30 preceding siblings ...)
  2021-09-23 12:02 ` [PATCH 31/37] xen/arm: introduce a helper to parse device tree NUMA distance map Wei Chen
@ 2021-09-23 12:02 ` Wei Chen
  2021-09-24  3:16   ` Stefano Stabellini
  2021-09-23 12:02 ` [PATCH 33/37] xen/arm: keep guest still be NUMA unware Wei Chen
                   ` (4 subsequent siblings)
  36 siblings, 1 reply; 192+ messages in thread
From: Wei Chen @ 2021-09-23 12:02 UTC (permalink / raw)
  To: wei.chen, xen-devel, sstabellini, julien; +Cc: Bertrand.Marquis

In this API, we scan the whole device tree to parse CPU node ids,
memory node ids and the distance-map. Although early_scan_node has
a handler to process memory nodes, parsing memory node ids in that
handler would mean embedding NUMA parsing code there, and we would
still need to scan the whole device tree for CPU NUMA ids and the
distance-map. So we include memory NUMA id parsing in this API too.
Another benefit is that we get a single entry point for parsing
device tree NUMA data.

Signed-off-by: Wei Chen <wei.chen@arm.com>
---
 xen/arch/arm/numa_device_tree.c | 30 ++++++++++++++++++++++++++++++
 xen/include/asm-arm/numa.h      |  1 +
 2 files changed, 31 insertions(+)

diff --git a/xen/arch/arm/numa_device_tree.c b/xen/arch/arm/numa_device_tree.c
index e7fa84df4c..6a3fed0002 100644
--- a/xen/arch/arm/numa_device_tree.c
+++ b/xen/arch/arm/numa_device_tree.c
@@ -242,3 +242,33 @@ static int __init fdt_parse_numa_distance_map_v1(const void *fdt, int node)
 
     return 0;
 }
+
+static int __init fdt_scan_numa_nodes(const void *fdt,
+                int node, const char *uname, int depth,
+                u32 address_cells, u32 size_cells, void *data)
+{
+    int len, ret = 0;
+    const void *prop;
+
+    prop = fdt_getprop(fdt, node, "device_type", &len);
+    if ( prop )
+    {
+        /* The returned length includes the NUL terminator */
+        if ( len == sizeof("cpu") && memcmp(prop, "cpu", len) == 0 )
+            ret = fdt_parse_numa_cpu_node(fdt, node);
+        else if ( len == sizeof("memory") && memcmp(prop, "memory", len) == 0 )
+            ret = fdt_parse_numa_memory_node(fdt, node, uname,
+                                             address_cells, size_cells);
+    }
+    else if ( fdt_node_check_compatible(fdt, node,
+                                "numa-distance-map-v1") == 0 )
+        ret = fdt_parse_numa_distance_map_v1(fdt, node);
+
+    return ret;
+}
+
+/* Initialize NUMA from device tree */
+int __init numa_device_tree_init(const void *fdt)
+{
+    return device_tree_for_each_node(fdt, 0, fdt_scan_numa_nodes, NULL);
+}
diff --git a/xen/include/asm-arm/numa.h b/xen/include/asm-arm/numa.h
index 7675012cb7..f46e8e2935 100644
--- a/xen/include/asm-arm/numa.h
+++ b/xen/include/asm-arm/numa.h
@@ -23,6 +23,7 @@ typedef u8 nodeid_t;
 #define NR_NODE_MEMBLKS NR_MEM_BANKS
 
 extern void numa_set_distance(nodeid_t from, nodeid_t to, uint32_t distance);
+extern int numa_device_tree_init(const void *fdt);
 
 #else
 
-- 
2.25.1



^ permalink raw reply	[flat|nested] 192+ messages in thread

* [PATCH 33/37] xen/arm: keep guests NUMA-unaware
  2021-09-23 12:01 [PATCH 00/37] Add device tree based NUMA support to Arm Wei Chen
                   ` (31 preceding siblings ...)
  2021-09-23 12:02 ` [PATCH 32/37] xen/arm: unified entry to parse all NUMA data from device tree Wei Chen
@ 2021-09-23 12:02 ` Wei Chen
  2021-09-24  3:19   ` Stefano Stabellini
  2021-09-23 12:02 ` [PATCH 34/37] xen/arm: enable device tree based NUMA in system init Wei Chen
                   ` (3 subsequent siblings)
  36 siblings, 1 reply; 192+ messages in thread
From: Wei Chen @ 2021-09-23 12:02 UTC (permalink / raw)
  To: wei.chen, xen-devel, sstabellini, julien; +Cc: Bertrand.Marquis

The NUMA information provided in the host Device-Tree is only
for Xen. For dom0, we want to hide it, as the domain's layout may
be different (and for now, dom0 is not NUMA-aware anyway). The CPU
and memory nodes are recreated from scratch for the domain, so we
already skip the "numa-node-id" property for these two types of
nodes.

However, some other devices, such as PCIe devices, may have a
"numa-node-id" property too. We have to skip it as well.

Signed-off-by: Wei Chen <wei.chen@arm.com>
---
 xen/arch/arm/domain_build.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
index d233d634c1..6e94922238 100644
--- a/xen/arch/arm/domain_build.c
+++ b/xen/arch/arm/domain_build.c
@@ -737,6 +737,10 @@ static int __init write_properties(struct domain *d, struct kernel_info *kinfo,
                 continue;
         }
 
+        /* Guests are NUMA-unaware at this stage */
+        if ( dt_property_name_is_equal(prop, "numa-node-id") )
+            continue;
+
         res = fdt_property(kinfo->fdt, prop->name, prop_data, prop_len);
 
         if ( res )
@@ -1607,6 +1611,8 @@ static int __init handle_node(struct domain *d, struct kernel_info *kinfo,
         DT_MATCH_TYPE("memory"),
         /* The memory mapped timer is not supported by Xen. */
         DT_MATCH_COMPATIBLE("arm,armv7-timer-mem"),
+        /* NUMA info doesn't need to be exposed to Domain-0 */
+        DT_MATCH_COMPATIBLE("numa-distance-map-v1"),
         { /* sentinel */ },
     };
     static const struct dt_device_match timer_matches[] __initconst =
-- 
2.25.1



^ permalink raw reply	[flat|nested] 192+ messages in thread

* [PATCH 34/37] xen/arm: enable device tree based NUMA in system init
  2021-09-23 12:01 [PATCH 00/37] Add device tree based NUMA support to Arm Wei Chen
                   ` (32 preceding siblings ...)
  2021-09-23 12:02 ` [PATCH 33/37] xen/arm: keep guests NUMA-unaware Wei Chen
@ 2021-09-23 12:02 ` Wei Chen
  2021-09-24  3:28   ` Stefano Stabellini
  2021-09-23 12:02 ` [PATCH 35/37] xen/arm: use CONFIG_NUMA to gate node_online_map in smpboot Wei Chen
                   ` (2 subsequent siblings)
  36 siblings, 1 reply; 192+ messages in thread
From: Wei Chen @ 2021-09-23 12:02 UTC (permalink / raw)
  To: wei.chen, xen-devel, sstabellini, julien; +Cc: Bertrand.Marquis

With this patch, Xen can start to create a NUMA system based on
the device tree.

Signed-off-by: Wei Chen <wei.chen@arm.com>
---
 xen/arch/arm/numa.c        | 55 ++++++++++++++++++++++++++++++++++++++
 xen/arch/arm/setup.c       |  7 +++++
 xen/include/asm-arm/numa.h |  6 +++++
 3 files changed, 68 insertions(+)

diff --git a/xen/arch/arm/numa.c b/xen/arch/arm/numa.c
index 7f05299b76..d7a3d32d4b 100644
--- a/xen/arch/arm/numa.c
+++ b/xen/arch/arm/numa.c
@@ -18,8 +18,10 @@
  *
  */
 #include <xen/init.h>
+#include <xen/device_tree.h>
 #include <xen/nodemask.h>
 #include <xen/numa.h>
+#include <xen/pfn.h>
 
 static uint8_t __read_mostly
 node_distance_map[MAX_NUMNODES][MAX_NUMNODES] = {
@@ -85,6 +87,59 @@ uint8_t __node_distance(nodeid_t from, nodeid_t to)
 }
 EXPORT_SYMBOL(__node_distance);
 
+void __init numa_init(bool acpi_off)
+{
+    uint32_t idx;
+    paddr_t ram_start = ~0;
+    paddr_t ram_size = 0;
+    paddr_t ram_end = 0;
+
+    /* NUMA has been turned off through Xen parameters */
+    if ( numa_off )
+        goto mem_init;
+
+    /* Initialize NUMA from device tree when system is not ACPI booted */
+    if ( acpi_off )
+    {
+        int ret = numa_device_tree_init(device_tree_flattened);
+        if ( ret )
+        {
+            printk(XENLOG_WARNING
+                   "Init NUMA from device tree failed, ret=%d\n", ret);
+            numa_off = true;
+        }
+    }
+    else
+    {
+        /* We don't support NUMA for ACPI boot currently */
+        printk(XENLOG_WARNING
+               "ACPI NUMA is not supported yet, NUMA off!\n");
+        numa_off = true;
+    }
+
+mem_init:
+    /*
+     * Find the minimal and maximum address of RAM, NUMA will
+     * build a memory to node mapping table for the whole range.
+     */
+    ram_start = bootinfo.mem.bank[0].start;
+    ram_size  = bootinfo.mem.bank[0].size;
+    ram_end   = ram_start + ram_size;
+    for ( idx = 1 ; idx < bootinfo.mem.nr_banks; idx++ )
+    {
+        paddr_t bank_start = bootinfo.mem.bank[idx].start;
+        paddr_t bank_size = bootinfo.mem.bank[idx].size;
+        paddr_t bank_end = bank_start + bank_size;
+
+        ram_size  = ram_size + bank_size;
+        ram_start = min(ram_start, bank_start);
+        ram_end   = max(ram_end, bank_end);
+    }
+
+    numa_initmem_init(PFN_UP(ram_start), PFN_DOWN(ram_end));
+    return;
+}
+
 uint32_t __init arch_meminfo_get_nr_bank(void)
 {
 	return bootinfo.mem.nr_banks;
diff --git a/xen/arch/arm/setup.c b/xen/arch/arm/setup.c
index 1f0fbc95b5..6097850682 100644
--- a/xen/arch/arm/setup.c
+++ b/xen/arch/arm/setup.c
@@ -905,6 +905,13 @@ void __init start_xen(unsigned long boot_phys_offset,
     /* Parse the ACPI tables for possible boot-time configuration */
     acpi_boot_table_init();
 
+    /*
+     * Try to initialize the NUMA system. If this fails, the system
+     * will fall back to a uniform memory system, i.e. a system with
+     * only one NUMA node.
+     */
+    numa_init(acpi_disabled);
+
     end_boot_allocator();
 
     /*
diff --git a/xen/include/asm-arm/numa.h b/xen/include/asm-arm/numa.h
index f46e8e2935..5b03dde87f 100644
--- a/xen/include/asm-arm/numa.h
+++ b/xen/include/asm-arm/numa.h
@@ -24,6 +24,7 @@ typedef u8 nodeid_t;
 
 extern void numa_set_distance(nodeid_t from, nodeid_t to, uint32_t distance);
 extern int numa_device_tree_init(const void *fdt);
+extern void numa_init(bool acpi_off);
 
 #else
 
@@ -47,6 +48,11 @@ extern mfn_t first_valid_mfn;
 #define node_start_pfn(nid) (mfn_x(first_valid_mfn))
 #define __node_distance(a, b) (20)
 
+static inline void numa_init(bool acpi_off)
+{
+
+}
+
 static inline void numa_add_cpu(int cpu)
 {
 
-- 
2.25.1



^ permalink raw reply	[flat|nested] 192+ messages in thread

* [PATCH 35/37] xen/arm: use CONFIG_NUMA to gate node_online_map in smpboot
  2021-09-23 12:01 [PATCH 00/37] Add device tree based NUMA support to Arm Wei Chen
                   ` (33 preceding siblings ...)
  2021-09-23 12:02 ` [PATCH 34/37] xen/arm: enable device tree based NUMA in system init Wei Chen
@ 2021-09-23 12:02 ` Wei Chen
  2021-09-23 12:02 ` [PATCH 36/37] xen/arm: Provide Kconfig options for Arm to enable NUMA Wei Chen
  2021-09-23 12:02 ` [PATCH 37/37] docs: update numa command line to support Arm Wei Chen
  36 siblings, 0 replies; 192+ messages in thread
From: Wei Chen @ 2021-09-23 12:02 UTC (permalink / raw)
  To: wei.chen, xen-devel, sstabellini, julien; +Cc: Bertrand.Marquis

node_online_map in smpboot is still needed for Arm when NUMA is
turned off in Kconfig.

Signed-off-by: Wei Chen <wei.chen@arm.com>
---
 xen/arch/arm/smpboot.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/xen/arch/arm/smpboot.c b/xen/arch/arm/smpboot.c
index 6e3cc8d3cc..216c8144b4 100644
--- a/xen/arch/arm/smpboot.c
+++ b/xen/arch/arm/smpboot.c
@@ -46,8 +46,10 @@ struct cpuinfo_arm cpu_data[NR_CPUS];
 /* CPU logical map: map xen cpuid to an MPIDR */
 register_t __cpu_logical_map[NR_CPUS] = { [0 ... NR_CPUS-1] = MPIDR_INVALID };
 
+#ifndef CONFIG_NUMA
 /* Fake one node for now. See also include/asm-arm/numa.h */
 nodemask_t __read_mostly node_online_map = { { [0] = 1UL } };
+#endif
 
 /* Xen stack for bringing up the first CPU. */
 static unsigned char __initdata cpu0_boot_stack[STACK_SIZE]
-- 
2.25.1



^ permalink raw reply	[flat|nested] 192+ messages in thread

* [PATCH 36/37] xen/arm: Provide Kconfig options for Arm to enable NUMA
  2021-09-23 12:01 [PATCH 00/37] Add device tree based NUMA support to Arm Wei Chen
                   ` (34 preceding siblings ...)
  2021-09-23 12:02 ` [PATCH 35/37] xen/arm: use CONFIG_NUMA to gate node_online_map in smpboot Wei Chen
@ 2021-09-23 12:02 ` Wei Chen
  2021-09-24  3:31   ` Stefano Stabellini
  2021-09-24 10:25   ` Jan Beulich
  2021-09-23 12:02 ` [PATCH 37/37] docs: update numa command line to support Arm Wei Chen
  36 siblings, 2 replies; 192+ messages in thread
From: Wei Chen @ 2021-09-23 12:02 UTC (permalink / raw)
  To: wei.chen, xen-devel, sstabellini, julien; +Cc: Bertrand.Marquis

Arm platforms support both ACPI and device tree booting. We don't
want users to have to select device tree NUMA or ACPI NUMA
manually. Instead, users should just enable NUMA for Arm, and
device tree NUMA or ACPI NUMA will be selected automatically,
depending on the device tree and ACPI feature status. This way,
the two kinds of NUMA support code can co-exist in one Xen binary,
and Xen can check the feature flags to decide whether to use
device tree or ACPI as the source of NUMA information.

So in this patch, we introduce a generic option, CONFIG_ARM_NUMA,
for users to enable NUMA for Arm, and a CONFIG_DEVICE_TREE_NUMA
option which ARM_NUMA selects when HAS_DEVICE_TREE is enabled.
Once ACPI NUMA for Arm is supported, ACPI_NUMA can be selected
here too.

Signed-off-by: Wei Chen <wei.chen@arm.com>
---
 xen/arch/arm/Kconfig | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/xen/arch/arm/Kconfig b/xen/arch/arm/Kconfig
index 865ad83a89..ded94ebd37 100644
--- a/xen/arch/arm/Kconfig
+++ b/xen/arch/arm/Kconfig
@@ -34,6 +34,17 @@ config ACPI
 	  Advanced Configuration and Power Interface (ACPI) support for Xen is
 	  an alternative to device tree on ARM64.
 
+config DEVICE_TREE_NUMA
+	def_bool n
+	select NUMA
+
+config ARM_NUMA
+	bool "Arm NUMA (Non-Uniform Memory Access) Support (UNSUPPORTED)" if UNSUPPORTED
+	select DEVICE_TREE_NUMA if HAS_DEVICE_TREE
+	---help---
+
+	  Enable Non-Uniform Memory Access (NUMA) for Arm architectures
+
 config GICV3
 	bool "GICv3 driver"
 	depends on ARM_64 && !NEW_VGIC
-- 
2.25.1



^ permalink raw reply	[flat|nested] 192+ messages in thread

* [PATCH 37/37] docs: update numa command line to support Arm
  2021-09-23 12:01 [PATCH 00/37] Add device tree based NUMA support to Arm Wei Chen
                   ` (35 preceding siblings ...)
  2021-09-23 12:02 ` [PATCH 36/37] xen/arm: Provide Kconfig options for Arm to enable NUMA Wei Chen
@ 2021-09-23 12:02 ` Wei Chen
  36 siblings, 0 replies; 192+ messages in thread
From: Wei Chen @ 2021-09-23 12:02 UTC (permalink / raw)
  To: wei.chen, xen-devel, sstabellini, julien; +Cc: Bertrand.Marquis

The numa command line option is currently documented as x86 only.
Remove the x86 arch limitation from the numa option in this patch.

Signed-off-by: Wei Chen <wei.chen@arm.com>
---
 docs/misc/xen-command-line.pandoc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/misc/xen-command-line.pandoc b/docs/misc/xen-command-line.pandoc
index 177e656f12..4f3f24eb9d 100644
--- a/docs/misc/xen-command-line.pandoc
+++ b/docs/misc/xen-command-line.pandoc
@@ -1785,7 +1785,7 @@ i.e. a limit on the number of guests it is possible to start each having
 assigned a device sharing a common interrupt line.  Accepts values between
 1 and 255.
 
-### numa (x86)
+### numa
 > `= on | off | fake=<integer> | noacpi`
 
 > Default: `on`
-- 
2.25.1



^ permalink raw reply	[flat|nested] 192+ messages in thread

* Re: [PATCH 02/37] xen: introduce a Kconfig option to configure NUMA nodes number
  2021-09-23 12:02 ` [PATCH 02/37] xen: introduce a Kconfig option to configure NUMA nodes number Wei Chen
@ 2021-09-23 23:45   ` Stefano Stabellini
  2021-09-24  1:24     ` Wei Chen
  2021-09-24  8:55   ` Jan Beulich
  1 sibling, 1 reply; 192+ messages in thread
From: Stefano Stabellini @ 2021-09-23 23:45 UTC (permalink / raw)
  To: Wei Chen; +Cc: xen-devel, sstabellini, julien, Bertrand.Marquis

On Thu, 23 Sep 2021, Wei Chen wrote:
> Current NUMA nodes number is a hardcode configuration. This
> configuration is difficult for an administrator to change
> unless changing the code.
> 
> So in this patch, we introduce this new Kconfig option for
> administrators to change NUMA nodes number conveniently.
> Also considering that not all architectures support NUMA,
> this Kconfig option only can be visible on NUMA enabled
> architectures. Non-NUMA supported architectures can still
> use 1 as MAX_NUMNODES.

This is OK but I think you should also mention in the commit message
that you are taking the opportunity to remove NODES_SHIFT because it is
currently unused.

With that:

Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>


> Signed-off-by: Wei Chen <wei.chen@arm.com>
> ---
>  xen/arch/Kconfig           | 11 +++++++++++
>  xen/include/asm-x86/numa.h |  2 --
>  xen/include/xen/numa.h     | 10 +++++-----
>  3 files changed, 16 insertions(+), 7 deletions(-)
> 
> diff --git a/xen/arch/Kconfig b/xen/arch/Kconfig
> index f16eb0df43..8a20da67ed 100644
> --- a/xen/arch/Kconfig
> +++ b/xen/arch/Kconfig
> @@ -17,3 +17,14 @@ config NR_CPUS
>  	  For CPU cores which support Simultaneous Multi-Threading or similar
>  	  technologies, this the number of logical threads which Xen will
>  	  support.
> +
> +config NR_NUMA_NODES
> +	int "Maximum number of NUMA nodes supported"
> +	range 1 4095
> +	default "64"
> +	depends on NUMA
> +	help
> +	  Controls the build-time size of various arrays and bitmaps
> +	  associated with multiple-nodes management. It is the upper bound of
> +	  the number of NUMA nodes the scheduler, memory allocation and other
> +	  NUMA-aware components can handle.
> diff --git a/xen/include/asm-x86/numa.h b/xen/include/asm-x86/numa.h
> index bada2c0bb9..3cf26c2def 100644
> --- a/xen/include/asm-x86/numa.h
> +++ b/xen/include/asm-x86/numa.h
> @@ -3,8 +3,6 @@
>  
>  #include <xen/cpumask.h>
>  
> -#define NODES_SHIFT 6
> -
>  typedef u8 nodeid_t;
>  
>  extern int srat_rev;
> diff --git a/xen/include/xen/numa.h b/xen/include/xen/numa.h
> index 7aef1a88dc..52950a3150 100644
> --- a/xen/include/xen/numa.h
> +++ b/xen/include/xen/numa.h
> @@ -3,14 +3,14 @@
>  
>  #include <asm/numa.h>
>  
> -#ifndef NODES_SHIFT
> -#define NODES_SHIFT     0
> -#endif
> -
>  #define NUMA_NO_NODE     0xFF
>  #define NUMA_NO_DISTANCE 0xFF
>  
> -#define MAX_NUMNODES    (1 << NODES_SHIFT)
> +#ifdef CONFIG_NR_NUMA_NODES
> +#define MAX_NUMNODES CONFIG_NR_NUMA_NODES
> +#else
> +#define MAX_NUMNODES    1
> +#endif
>  
>  #define vcpu_to_node(v) (cpu_to_node((v)->processor))
>  
> -- 
> 2.25.1
> 


^ permalink raw reply	[flat|nested] 192+ messages in thread

* Re: [PATCH 04/37] xen: introduce an arch helper for default dma zone status
  2021-09-23 12:02 ` [PATCH 04/37] xen: introduce an arch helper for default dma zone status Wei Chen
@ 2021-09-23 23:55   ` Stefano Stabellini
  2021-09-24  1:50     ` Wei Chen
  2022-01-17 16:10   ` Jan Beulich
  1 sibling, 1 reply; 192+ messages in thread
From: Stefano Stabellini @ 2021-09-23 23:55 UTC (permalink / raw)
  To: Wei Chen; +Cc: xen-devel, sstabellini, julien, Bertrand.Marquis

On Thu, 23 Sep 2021, Wei Chen wrote:
> In current code, when Xen is running in a multiple nodes NUMA
> system, it will set dma_bitsize in end_boot_allocator to reserve
> some low address memory for DMA.
> 
> There are some x86 implications in current implementation. Becuase
                                    ^ the                    ^Because

> on x86, memory starts from 0. On a multiple nodes NUMA system, if
> a single node contains the majority or all of the DMA memory. x86
                                                              ^,

> prefer to give out memory from non-local allocations rather than
> exhausting the DMA memory ranges. Hence x86 use dma_bitsize to set
> aside some largely arbitrary amount memory for DMA memory ranges.
                                     ^ of memory

> The allocations from these memory ranges would happen only after
> exhausting all other nodes' memory.
> 
> But the implications are not shared across all architectures. For
> example, Arm doesn't have these implications. So in this patch, we
> introduce an arch_have_default_dmazone helper for arch to determine
> that it need to set dma_bitsize for reserve DMA allocations or not.
          ^ needs

> 
> Signed-off-by: Wei Chen <wei.chen@arm.com>
> ---
>  xen/arch/x86/numa.c        | 5 +++++
>  xen/common/page_alloc.c    | 2 +-
>  xen/include/asm-arm/numa.h | 5 +++++
>  xen/include/asm-x86/numa.h | 1 +
>  4 files changed, 12 insertions(+), 1 deletion(-)
> 
> diff --git a/xen/arch/x86/numa.c b/xen/arch/x86/numa.c
> index ce79ee44ce..1fabbe8281 100644
> --- a/xen/arch/x86/numa.c
> +++ b/xen/arch/x86/numa.c
> @@ -371,6 +371,11 @@ unsigned int __init arch_get_dma_bitsize(void)
>                   + PAGE_SHIFT, 32);
>  }
>  
> +unsigned int arch_have_default_dmazone(void)

Can this function return bool?
Also, can it be a static inline?


> +{
> +    return ( num_online_nodes() > 1 ) ? 1 : 0;
> +}
> +
>  static void dump_numa(unsigned char key)
>  {
>      s_time_t now = NOW();
> diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c
> index 5801358b4b..80916205e5 100644
> --- a/xen/common/page_alloc.c
> +++ b/xen/common/page_alloc.c
> @@ -1889,7 +1889,7 @@ void __init end_boot_allocator(void)
>      }
>      nr_bootmem_regions = 0;
>  
> -    if ( !dma_bitsize && (num_online_nodes() > 1) )
> +    if ( !dma_bitsize && arch_have_default_dmazone() )
>          dma_bitsize = arch_get_dma_bitsize();
>  
>      printk("Domain heap initialised");
> diff --git a/xen/include/asm-arm/numa.h b/xen/include/asm-arm/numa.h
> index 31a6de4e23..9d5739542d 100644
> --- a/xen/include/asm-arm/numa.h
> +++ b/xen/include/asm-arm/numa.h
> @@ -25,6 +25,11 @@ extern mfn_t first_valid_mfn;
>  #define node_start_pfn(nid) (mfn_x(first_valid_mfn))
>  #define __node_distance(a, b) (20)
>  
> +static inline unsigned int arch_have_default_dmazone(void)
> +{
> +    return 0;
> +}
> +
>  #endif /* __ARCH_ARM_NUMA_H */
>  /*
>   * Local variables:
> diff --git a/xen/include/asm-x86/numa.h b/xen/include/asm-x86/numa.h
> index 3cf26c2def..8060cbf3f4 100644
> --- a/xen/include/asm-x86/numa.h
> +++ b/xen/include/asm-x86/numa.h
> @@ -78,5 +78,6 @@ extern int valid_numa_range(u64 start, u64 end, nodeid_t node);
>  void srat_parse_regions(u64 addr);
>  extern u8 __node_distance(nodeid_t a, nodeid_t b);
>  unsigned int arch_get_dma_bitsize(void);
> +unsigned int arch_have_default_dmazone(void);
>  
>  #endif
> -- 
> 2.25.1
> 


^ permalink raw reply	[flat|nested] 192+ messages in thread

* Re: [PATCH 06/37] xen/arm: use !CONFIG_NUMA to keep fake NUMA API
  2021-09-23 12:02 ` [PATCH 06/37] xen/arm: use !CONFIG_NUMA to keep fake NUMA API Wei Chen
@ 2021-09-24  0:05   ` Stefano Stabellini
  2021-09-24 10:21     ` Wei Chen
  0 siblings, 1 reply; 192+ messages in thread
From: Stefano Stabellini @ 2021-09-24  0:05 UTC (permalink / raw)
  To: Wei Chen; +Cc: xen-devel, sstabellini, julien, Bertrand.Marquis

On Thu, 23 Sep 2021, Wei Chen wrote:
> We have introduced CONFIG_NUMA in previous patch. And this
                                   ^ a

> option is enabled only on x86 in current stage. In a follow
                                ^ at the

> up patch, we will enable this option for Arm. But we still
> want users can disable the CONFIG_NUMA through Kconfig. In
             ^ to be able to disable CONFIG_NUMA via Kconfig.


> this case, keep current fake NUMA API, will make Arm code
                 ^ the

> still can work with NUMA aware memory allocation and scheduler.
        ^ able to work

> 
> Signed-off-by: Wei Chen <wei.chen@arm.com>

With the small grammar fixes:

Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>


> ---
>  xen/include/asm-arm/numa.h | 4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/xen/include/asm-arm/numa.h b/xen/include/asm-arm/numa.h
> index 9d5739542d..8f1c67e3eb 100644
> --- a/xen/include/asm-arm/numa.h
> +++ b/xen/include/asm-arm/numa.h
> @@ -5,6 +5,8 @@
>  
>  typedef u8 nodeid_t;
>  
> +#ifndef CONFIG_NUMA
> +
>  /* Fake one node for now. See also node_online_map. */
>  #define cpu_to_node(cpu) 0
>  #define node_to_cpumask(node)   (cpu_online_map)
> @@ -25,6 +27,8 @@ extern mfn_t first_valid_mfn;
>  #define node_start_pfn(nid) (mfn_x(first_valid_mfn))
>  #define __node_distance(a, b) (20)
>  
> +#endif
> +
>  static inline unsigned int arch_have_default_dmazone(void)
>  {
>      return 0;
> -- 
> 2.25.1
> 


^ permalink raw reply	[flat|nested] 192+ messages in thread

* Re: [PATCH 07/37] xen/x86: use paddr_t for addresses in NUMA node structure
  2021-09-23 12:02 ` [PATCH 07/37] xen/x86: use paddr_t for addresses in NUMA node structure Wei Chen
@ 2021-09-24  0:11   ` Stefano Stabellini
  2021-09-24  0:13     ` Stefano Stabellini
  2022-01-18 15:22   ` Jan Beulich
  1 sibling, 1 reply; 192+ messages in thread
From: Stefano Stabellini @ 2021-09-24  0:11 UTC (permalink / raw)
  To: Wei Chen; +Cc: xen-devel, sstabellini, julien, Bertrand.Marquis

On Thu, 23 Sep 2021, Wei Chen wrote:
> NUMA node structure "struct node" is using u64 as node memory
> range. In order to make other architectures can reuse this
> NUMA node relative code, we replace the u64 to paddr_t. And
> use pfn_to_paddr and paddr_to_pfn to replace explicit shift
> operations. The relate PRIx64 in print messages have been
> replaced by PRIpaddr at the same time.
> 
> Signed-off-by: Wei Chen <wei.chen@arm.com>
> ---
>  xen/arch/x86/numa.c        | 32 +++++++++++++++++---------------
>  xen/arch/x86/srat.c        | 26 +++++++++++++-------------
>  xen/include/asm-x86/numa.h |  8 ++++----
>  3 files changed, 34 insertions(+), 32 deletions(-)
> 
> diff --git a/xen/arch/x86/numa.c b/xen/arch/x86/numa.c
> index 1fabbe8281..6337bbdf31 100644
> --- a/xen/arch/x86/numa.c
> +++ b/xen/arch/x86/numa.c
> @@ -165,12 +165,12 @@ int __init compute_hash_shift(struct node *nodes, int numnodes,
>      return shift;
>  }
>  /* initialize NODE_DATA given nodeid and start/end */
> -void __init setup_node_bootmem(nodeid_t nodeid, u64 start, u64 end)
> -{ 
> +void __init setup_node_bootmem(nodeid_t nodeid, paddr_t start, paddr_t end)
> +{
>      unsigned long start_pfn, end_pfn;
>  
> -    start_pfn = start >> PAGE_SHIFT;
> -    end_pfn = end >> PAGE_SHIFT;
> +    start_pfn = paddr_to_pfn(start);
> +    end_pfn = paddr_to_pfn(end);
>  
>      NODE_DATA(nodeid)->node_start_pfn = start_pfn;
>      NODE_DATA(nodeid)->node_spanned_pages = end_pfn - start_pfn;
> @@ -201,11 +201,12 @@ void __init numa_init_array(void)
>  static int numa_fake __initdata = 0;
>  
>  /* Numa emulation */
> -static int __init numa_emulation(u64 start_pfn, u64 end_pfn)
> +static int __init numa_emulation(unsigned long start_pfn,
> +                                 unsigned long end_pfn)

Why not changing numa_emulation to take paddr_t too?


>  {
>      int i;
>      struct node nodes[MAX_NUMNODES];
> -    u64 sz = ((end_pfn - start_pfn)<<PAGE_SHIFT) / numa_fake;
> +    u64 sz = pfn_to_paddr(end_pfn - start_pfn) / numa_fake;
>  
>      /* Kludge needed for the hash function */
>      if ( hweight64(sz) > 1 )
> @@ -221,9 +222,9 @@ static int __init numa_emulation(u64 start_pfn, u64 end_pfn)
>      memset(&nodes,0,sizeof(nodes));
>      for ( i = 0; i < numa_fake; i++ )
>      {
> -        nodes[i].start = (start_pfn<<PAGE_SHIFT) + i*sz;
> +        nodes[i].start = pfn_to_paddr(start_pfn) + i*sz;
>          if ( i == numa_fake - 1 )
> -            sz = (end_pfn<<PAGE_SHIFT) - nodes[i].start;
> +            sz = pfn_to_paddr(end_pfn) - nodes[i].start;
>          nodes[i].end = nodes[i].start + sz;
>          printk(KERN_INFO "Faking node %d at %"PRIx64"-%"PRIx64" (%"PRIu64"MB)\n",
>                 i,
> @@ -249,24 +250,26 @@ static int __init numa_emulation(u64 start_pfn, u64 end_pfn)
>  void __init numa_initmem_init(unsigned long start_pfn, unsigned long end_pfn)

same here


>  { 
>      int i;
> +    paddr_t start, end;
>  
>  #ifdef CONFIG_NUMA_EMU
>      if ( numa_fake && !numa_emulation(start_pfn, end_pfn) )
>          return;
>  #endif
>  
> +    start = pfn_to_paddr(start_pfn);
> +    end = pfn_to_paddr(end_pfn);
> +
>  #ifdef CONFIG_ACPI_NUMA
> -    if ( !numa_off && !acpi_scan_nodes((u64)start_pfn << PAGE_SHIFT,
> -         (u64)end_pfn << PAGE_SHIFT) )
> +    if ( !numa_off && !acpi_scan_nodes(start, end) )
>          return;
>  #endif
>  
>      printk(KERN_INFO "%s\n",
>             numa_off ? "NUMA turned off" : "No NUMA configuration found");
>  
> -    printk(KERN_INFO "Faking a node at %016"PRIx64"-%016"PRIx64"\n",
> -           (u64)start_pfn << PAGE_SHIFT,
> -           (u64)end_pfn << PAGE_SHIFT);
> +    printk(KERN_INFO "Faking a node at %016"PRIpaddr"-%016"PRIpaddr"\n",
> +           start, end);
>      /* setup dummy node covering all memory */
>      memnode_shift = BITS_PER_LONG - 1;
>      memnodemap = _memnodemap;
> @@ -279,8 +282,7 @@ void __init numa_initmem_init(unsigned long start_pfn, unsigned long end_pfn)
>      for ( i = 0; i < nr_cpu_ids; i++ )
>          numa_set_node(i, 0);
>      cpumask_copy(&node_to_cpumask[0], cpumask_of(0));
> -    setup_node_bootmem(0, (u64)start_pfn << PAGE_SHIFT,
> -                    (u64)end_pfn << PAGE_SHIFT);
> +    setup_node_bootmem(0, start, end);
>  }
>  
>  void numa_add_cpu(int cpu)
> diff --git a/xen/arch/x86/srat.c b/xen/arch/x86/srat.c
> index 6b77b98201..7d20d7f222 100644
> --- a/xen/arch/x86/srat.c
> +++ b/xen/arch/x86/srat.c
> @@ -104,7 +104,7 @@ nodeid_t setup_node(unsigned pxm)
>  	return node;
>  }
>  
> -int valid_numa_range(u64 start, u64 end, nodeid_t node)
> +int valid_numa_range(paddr_t start, paddr_t end, nodeid_t node)
>  {
>  	int i;
>  
> @@ -119,7 +119,7 @@ int valid_numa_range(u64 start, u64 end, nodeid_t node)
>  	return 0;
>  }
>  
> -static __init int conflicting_memblks(u64 start, u64 end)
> +static __init int conflicting_memblks(paddr_t start, paddr_t end)
>  {
>  	int i;
>  
> @@ -135,7 +135,7 @@ static __init int conflicting_memblks(u64 start, u64 end)
>  	return -1;
>  }
>  
> -static __init void cutoff_node(int i, u64 start, u64 end)
> +static __init void cutoff_node(int i, paddr_t start, paddr_t end)
>  {
>  	struct node *nd = &nodes[i];
>  	if (nd->start < start) {
> @@ -275,7 +275,7 @@ acpi_numa_processor_affinity_init(const struct acpi_srat_cpu_affinity *pa)
>  void __init
>  acpi_numa_memory_affinity_init(const struct acpi_srat_mem_affinity *ma)
>  {
> -	u64 start, end;
> +	paddr_t start, end;
>  	unsigned pxm;
>  	nodeid_t node;
>  	int i;
> @@ -318,7 +318,7 @@ acpi_numa_memory_affinity_init(const struct acpi_srat_mem_affinity *ma)
>  		bool mismatch = !(ma->flags & ACPI_SRAT_MEM_HOT_PLUGGABLE) !=
>  		                !test_bit(i, memblk_hotplug);
>  
> -		printk("%sSRAT: PXM %u (%"PRIx64"-%"PRIx64") overlaps with itself (%"PRIx64"-%"PRIx64")\n",
> +		printk("%sSRAT: PXM %u (%"PRIpaddr"-%"PRIpaddr") overlaps with itself (%"PRIpaddr"-%"PRIpaddr")\n",
>  		       mismatch ? KERN_ERR : KERN_WARNING, pxm, start, end,
>  		       node_memblk_range[i].start, node_memblk_range[i].end);
>  		if (mismatch) {
> @@ -327,7 +327,7 @@ acpi_numa_memory_affinity_init(const struct acpi_srat_mem_affinity *ma)
>  		}
>  	} else {
>  		printk(KERN_ERR
> -		       "SRAT: PXM %u (%"PRIx64"-%"PRIx64") overlaps with PXM %u (%"PRIx64"-%"PRIx64")\n",
> +		       "SRAT: PXM %u (%"PRIpaddr"-%"PRIpaddr") overlaps with PXM %u (%"PRIpaddr"-%"PRIpaddr")\n",
>  		       pxm, start, end, node_to_pxm(memblk_nodeid[i]),
>  		       node_memblk_range[i].start, node_memblk_range[i].end);
>  		bad_srat();
> @@ -346,7 +346,7 @@ acpi_numa_memory_affinity_init(const struct acpi_srat_mem_affinity *ma)
>  				nd->end = end;
>  		}
>  	}
> -	printk(KERN_INFO "SRAT: Node %u PXM %u %"PRIx64"-%"PRIx64"%s\n",
> +	printk(KERN_INFO "SRAT: Node %u PXM %u %"PRIpaddr"-%"PRIpaddr"%s\n",
>  	       node, pxm, start, end,
>  	       ma->flags & ACPI_SRAT_MEM_HOT_PLUGGABLE ? " (hotplug)" : "");
>  
> @@ -369,7 +369,7 @@ static int __init nodes_cover_memory(void)
>  
>  	for (i = 0; i < e820.nr_map; i++) {
>  		int j, found;
> -		unsigned long long start, end;
> +		paddr_t start, end;
>  
>  		if (e820.map[i].type != E820_RAM) {
>  			continue;
> @@ -396,7 +396,7 @@ static int __init nodes_cover_memory(void)
>  
>  		if (start < end) {
>  			printk(KERN_ERR "SRAT: No PXM for e820 range: "
> -				"%016Lx - %016Lx\n", start, end);
> +				"%"PRIpaddr" - %"PRIpaddr"\n", start, end);
>  			return 0;
>  		}
>  	}
> @@ -432,7 +432,7 @@ static int __init srat_parse_region(struct acpi_subtable_header *header,
>  	return 0;
>  }
>  
> -void __init srat_parse_regions(u64 addr)
> +void __init srat_parse_regions(paddr_t addr)
>  {
>  	u64 mask;
>  	unsigned int i;
> @@ -441,7 +441,7 @@ void __init srat_parse_regions(u64 addr)
>  	    acpi_table_parse(ACPI_SIG_SRAT, acpi_parse_srat))
>  		return;
>  
> -	srat_region_mask = pdx_init_mask(addr);
> +	srat_region_mask = pdx_init_mask((u64)addr);
>  	acpi_table_parse_srat(ACPI_SRAT_TYPE_MEMORY_AFFINITY,
>  			      srat_parse_region, 0);
>  
> @@ -457,7 +457,7 @@ void __init srat_parse_regions(u64 addr)
>  }
>  
>  /* Use the information discovered above to actually set up the nodes. */
> -int __init acpi_scan_nodes(u64 start, u64 end)
> +int __init acpi_scan_nodes(paddr_t start, paddr_t end)
>  {
>  	int i;
>  	nodemask_t all_nodes_parsed;
> @@ -489,7 +489,7 @@ int __init acpi_scan_nodes(u64 start, u64 end)
>  	/* Finally register nodes */
>  	for_each_node_mask(i, all_nodes_parsed)
>  	{
> -		u64 size = nodes[i].end - nodes[i].start;
> +		paddr_t size = nodes[i].end - nodes[i].start;
>  		if ( size == 0 )
>  			printk(KERN_WARNING "SRAT: Node %u has no memory. "
>  			       "BIOS Bug or mis-configured hardware?\n", i);
> diff --git a/xen/include/asm-x86/numa.h b/xen/include/asm-x86/numa.h
> index 8060cbf3f4..50cfd8e7ef 100644
> --- a/xen/include/asm-x86/numa.h
> +++ b/xen/include/asm-x86/numa.h
> @@ -16,7 +16,7 @@ extern cpumask_t     node_to_cpumask[];
>  #define node_to_cpumask(node)    (node_to_cpumask[node])
>  
>  struct node { 
> -	u64 start,end; 
> +	paddr_t start,end;
>  };
>  
>  extern int compute_hash_shift(struct node *nodes, int numnodes,
> @@ -36,7 +36,7 @@ extern void numa_set_node(int cpu, nodeid_t node);
>  extern nodeid_t setup_node(unsigned int pxm);
>  extern void srat_detect_node(int cpu);
>  
> -extern void setup_node_bootmem(nodeid_t nodeid, u64 start, u64 end);
> +extern void setup_node_bootmem(nodeid_t nodeid, paddr_t start, paddr_t end);
>  extern nodeid_t apicid_to_node[];
>  extern void init_cpu_to_node(void);
>  
> @@ -73,9 +73,9 @@ static inline __attribute__((pure)) nodeid_t phys_to_nid(paddr_t addr)
>  #define node_end_pfn(nid)       (NODE_DATA(nid)->node_start_pfn + \
>  				 NODE_DATA(nid)->node_spanned_pages)
>  
> -extern int valid_numa_range(u64 start, u64 end, nodeid_t node);
> +extern int valid_numa_range(paddr_t start, paddr_t end, nodeid_t node);
>  
> -void srat_parse_regions(u64 addr);
> +void srat_parse_regions(paddr_t addr);
>  extern u8 __node_distance(nodeid_t a, nodeid_t b);
>  unsigned int arch_get_dma_bitsize(void);
>  unsigned int arch_have_default_dmazone(void);
> -- 
> 2.25.1
> 


^ permalink raw reply	[flat|nested] 192+ messages in thread

* Re: [PATCH 07/37] xen/x86: use paddr_t for addresses in NUMA node structure
  2021-09-24  0:11   ` Stefano Stabellini
@ 2021-09-24  0:13     ` Stefano Stabellini
  2021-09-24  3:00       ` Wei Chen
  0 siblings, 1 reply; 192+ messages in thread
From: Stefano Stabellini @ 2021-09-24  0:13 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: Wei Chen, xen-devel, julien, Bertrand.Marquis, jbeulich,
	andrew.cooper3, roger.pau, wl

You forgot to add the x86 maintainers in CC to all the patches touching
x86 code in this series. Adding them now but you should probably resend.


On Thu, 23 Sep 2021, Stefano Stabellini wrote:
> On Thu, 23 Sep 2021, Wei Chen wrote:
> > NUMA node structure "struct node" uses u64 for node memory
> > ranges. To allow other architectures to reuse this NUMA node
> > related code, we replace u64 with paddr_t, and use pfn_to_paddr
> > and paddr_to_pfn to replace explicit shift operations. The
> > related PRIx64 specifiers in print messages have been replaced
> > by PRIpaddr at the same time.
> > 
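For readers following the conversion: pfn_to_paddr()/paddr_to_pfn() are thin wrappers around the PAGE_SHIFT shifts being removed, so the behaviour of setup_node_bootmem() is unchanged. A minimal self-contained sketch (the typedef and PAGE_SHIFT value below are assumptions for illustration, not taken from the Xen headers):

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t paddr_t;   /* assumption: 64-bit physical addresses */
#define PAGE_SHIFT 12       /* assumption: 4K pages */

/* Equivalents of the helpers the patch substitutes for open-coded shifts. */
static inline paddr_t pfn_to_paddr(unsigned long pfn)
{
    return (paddr_t)pfn << PAGE_SHIFT;
}

static inline unsigned long paddr_to_pfn(paddr_t pa)
{
    return (unsigned long)(pa >> PAGE_SHIFT);
}
```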
> > Signed-off-by: Wei Chen <wei.chen@arm.com>
> > ---
> >  xen/arch/x86/numa.c        | 32 +++++++++++++++++---------------
> >  xen/arch/x86/srat.c        | 26 +++++++++++++-------------
> >  xen/include/asm-x86/numa.h |  8 ++++----
> >  3 files changed, 34 insertions(+), 32 deletions(-)
> > 
> > diff --git a/xen/arch/x86/numa.c b/xen/arch/x86/numa.c
> > index 1fabbe8281..6337bbdf31 100644
> > --- a/xen/arch/x86/numa.c
> > +++ b/xen/arch/x86/numa.c
> > @@ -165,12 +165,12 @@ int __init compute_hash_shift(struct node *nodes, int numnodes,
> >      return shift;
> >  }
> >  /* initialize NODE_DATA given nodeid and start/end */
> > -void __init setup_node_bootmem(nodeid_t nodeid, u64 start, u64 end)
> > -{ 
> > +void __init setup_node_bootmem(nodeid_t nodeid, paddr_t start, paddr_t end)
> > +{
> >      unsigned long start_pfn, end_pfn;
> >  
> > -    start_pfn = start >> PAGE_SHIFT;
> > -    end_pfn = end >> PAGE_SHIFT;
> > +    start_pfn = paddr_to_pfn(start);
> > +    end_pfn = paddr_to_pfn(end);
> >  
> >      NODE_DATA(nodeid)->node_start_pfn = start_pfn;
> >      NODE_DATA(nodeid)->node_spanned_pages = end_pfn - start_pfn;
> > @@ -201,11 +201,12 @@ void __init numa_init_array(void)
> >  static int numa_fake __initdata = 0;
> >  
> >  /* Numa emulation */
> > -static int __init numa_emulation(u64 start_pfn, u64 end_pfn)
> > +static int __init numa_emulation(unsigned long start_pfn,
> > +                                 unsigned long end_pfn)
> 
> Why not changing numa_emulation to take paddr_t too?
> 
> 
> >  {
> >      int i;
> >      struct node nodes[MAX_NUMNODES];
> > -    u64 sz = ((end_pfn - start_pfn)<<PAGE_SHIFT) / numa_fake;
> > +    u64 sz = pfn_to_paddr(end_pfn - start_pfn) / numa_fake;
> >  
> >      /* Kludge needed for the hash function */
> >      if ( hweight64(sz) > 1 )
> > @@ -221,9 +222,9 @@ static int __init numa_emulation(u64 start_pfn, u64 end_pfn)
> >      memset(&nodes,0,sizeof(nodes));
> >      for ( i = 0; i < numa_fake; i++ )
> >      {
> > -        nodes[i].start = (start_pfn<<PAGE_SHIFT) + i*sz;
> > +        nodes[i].start = pfn_to_paddr(start_pfn) + i*sz;
> >          if ( i == numa_fake - 1 )
> > -            sz = (end_pfn<<PAGE_SHIFT) - nodes[i].start;
> > +            sz = pfn_to_paddr(end_pfn) - nodes[i].start;
> >          nodes[i].end = nodes[i].start + sz;
> >          printk(KERN_INFO "Faking node %d at %"PRIx64"-%"PRIx64" (%"PRIu64"MB)\n",
> >                 i,
> > @@ -249,24 +250,26 @@ static int __init numa_emulation(u64 start_pfn, u64 end_pfn)
> >  void __init numa_initmem_init(unsigned long start_pfn, unsigned long end_pfn)
> 
> same here
> 
> 
> >  { 
> >      int i;
> > +    paddr_t start, end;
> >  
> >  #ifdef CONFIG_NUMA_EMU
> >      if ( numa_fake && !numa_emulation(start_pfn, end_pfn) )
> >          return;
> >  #endif
> >  
> > +    start = pfn_to_paddr(start_pfn);
> > +    end = pfn_to_paddr(end_pfn);
> > +
> >  #ifdef CONFIG_ACPI_NUMA
> > -    if ( !numa_off && !acpi_scan_nodes((u64)start_pfn << PAGE_SHIFT,
> > -         (u64)end_pfn << PAGE_SHIFT) )
> > +    if ( !numa_off && !acpi_scan_nodes(start, end) )
> >          return;
> >  #endif
> >  
> >      printk(KERN_INFO "%s\n",
> >             numa_off ? "NUMA turned off" : "No NUMA configuration found");
> >  
> > -    printk(KERN_INFO "Faking a node at %016"PRIx64"-%016"PRIx64"\n",
> > -           (u64)start_pfn << PAGE_SHIFT,
> > -           (u64)end_pfn << PAGE_SHIFT);
> > +    printk(KERN_INFO "Faking a node at %016"PRIpaddr"-%016"PRIpaddr"\n",
> > +           start, end);
> >      /* setup dummy node covering all memory */
> >      memnode_shift = BITS_PER_LONG - 1;
> >      memnodemap = _memnodemap;
> > @@ -279,8 +282,7 @@ void __init numa_initmem_init(unsigned long start_pfn, unsigned long end_pfn)
> >      for ( i = 0; i < nr_cpu_ids; i++ )
> >          numa_set_node(i, 0);
> >      cpumask_copy(&node_to_cpumask[0], cpumask_of(0));
> > -    setup_node_bootmem(0, (u64)start_pfn << PAGE_SHIFT,
> > -                    (u64)end_pfn << PAGE_SHIFT);
> > +    setup_node_bootmem(0, start, end);
> >  }
> >  
> >  void numa_add_cpu(int cpu)
> > diff --git a/xen/arch/x86/srat.c b/xen/arch/x86/srat.c
> > index 6b77b98201..7d20d7f222 100644
> > --- a/xen/arch/x86/srat.c
> > +++ b/xen/arch/x86/srat.c
> > @@ -104,7 +104,7 @@ nodeid_t setup_node(unsigned pxm)
> >  	return node;
> >  }
> >  
> > -int valid_numa_range(u64 start, u64 end, nodeid_t node)
> > +int valid_numa_range(paddr_t start, paddr_t end, nodeid_t node)
> >  {
> >  	int i;
> >  
> > @@ -119,7 +119,7 @@ int valid_numa_range(u64 start, u64 end, nodeid_t node)
> >  	return 0;
> >  }
> >  
> > -static __init int conflicting_memblks(u64 start, u64 end)
> > +static __init int conflicting_memblks(paddr_t start, paddr_t end)
> >  {
> >  	int i;
> >  
> > @@ -135,7 +135,7 @@ static __init int conflicting_memblks(u64 start, u64 end)
> >  	return -1;
> >  }
> >  
> > -static __init void cutoff_node(int i, u64 start, u64 end)
> > +static __init void cutoff_node(int i, paddr_t start, paddr_t end)
> >  {
> >  	struct node *nd = &nodes[i];
> >  	if (nd->start < start) {
> > @@ -275,7 +275,7 @@ acpi_numa_processor_affinity_init(const struct acpi_srat_cpu_affinity *pa)
> >  void __init
> >  acpi_numa_memory_affinity_init(const struct acpi_srat_mem_affinity *ma)
> >  {
> > -	u64 start, end;
> > +	paddr_t start, end;
> >  	unsigned pxm;
> >  	nodeid_t node;
> >  	int i;
> > @@ -318,7 +318,7 @@ acpi_numa_memory_affinity_init(const struct acpi_srat_mem_affinity *ma)
> >  		bool mismatch = !(ma->flags & ACPI_SRAT_MEM_HOT_PLUGGABLE) !=
> >  		                !test_bit(i, memblk_hotplug);
> >  
> > -		printk("%sSRAT: PXM %u (%"PRIx64"-%"PRIx64") overlaps with itself (%"PRIx64"-%"PRIx64")\n",
> > +		printk("%sSRAT: PXM %u (%"PRIpaddr"-%"PRIpaddr") overlaps with itself (%"PRIpaddr"-%"PRIpaddr")\n",
> >  		       mismatch ? KERN_ERR : KERN_WARNING, pxm, start, end,
> >  		       node_memblk_range[i].start, node_memblk_range[i].end);
> >  		if (mismatch) {
> > @@ -327,7 +327,7 @@ acpi_numa_memory_affinity_init(const struct acpi_srat_mem_affinity *ma)
> >  		}
> >  	} else {
> >  		printk(KERN_ERR
> > -		       "SRAT: PXM %u (%"PRIx64"-%"PRIx64") overlaps with PXM %u (%"PRIx64"-%"PRIx64")\n",
> > +		       "SRAT: PXM %u (%"PRIpaddr"-%"PRIpaddr") overlaps with PXM %u (%"PRIpaddr"-%"PRIpaddr")\n",
> >  		       pxm, start, end, node_to_pxm(memblk_nodeid[i]),
> >  		       node_memblk_range[i].start, node_memblk_range[i].end);
> >  		bad_srat();
> > @@ -346,7 +346,7 @@ acpi_numa_memory_affinity_init(const struct acpi_srat_mem_affinity *ma)
> >  				nd->end = end;
> >  		}
> >  	}
> > -	printk(KERN_INFO "SRAT: Node %u PXM %u %"PRIx64"-%"PRIx64"%s\n",
> > +	printk(KERN_INFO "SRAT: Node %u PXM %u %"PRIpaddr"-%"PRIpaddr"%s\n",
> >  	       node, pxm, start, end,
> >  	       ma->flags & ACPI_SRAT_MEM_HOT_PLUGGABLE ? " (hotplug)" : "");
> >  
> > @@ -369,7 +369,7 @@ static int __init nodes_cover_memory(void)
> >  
> >  	for (i = 0; i < e820.nr_map; i++) {
> >  		int j, found;
> > -		unsigned long long start, end;
> > +		paddr_t start, end;
> >  
> >  		if (e820.map[i].type != E820_RAM) {
> >  			continue;
> > @@ -396,7 +396,7 @@ static int __init nodes_cover_memory(void)
> >  
> >  		if (start < end) {
> >  			printk(KERN_ERR "SRAT: No PXM for e820 range: "
> > -				"%016Lx - %016Lx\n", start, end);
> > +				"%"PRIpaddr" - %"PRIpaddr"\n", start, end);
> >  			return 0;
> >  		}
> >  	}
> > @@ -432,7 +432,7 @@ static int __init srat_parse_region(struct acpi_subtable_header *header,
> >  	return 0;
> >  }
> >  
> > -void __init srat_parse_regions(u64 addr)
> > +void __init srat_parse_regions(paddr_t addr)
> >  {
> >  	u64 mask;
> >  	unsigned int i;
> > @@ -441,7 +441,7 @@ void __init srat_parse_regions(u64 addr)
> >  	    acpi_table_parse(ACPI_SIG_SRAT, acpi_parse_srat))
> >  		return;
> >  
> > -	srat_region_mask = pdx_init_mask(addr);
> > +	srat_region_mask = pdx_init_mask((u64)addr);
> >  	acpi_table_parse_srat(ACPI_SRAT_TYPE_MEMORY_AFFINITY,
> >  			      srat_parse_region, 0);
> >  
> > @@ -457,7 +457,7 @@ void __init srat_parse_regions(u64 addr)
> >  }
> >  
> >  /* Use the information discovered above to actually set up the nodes. */
> > -int __init acpi_scan_nodes(u64 start, u64 end)
> > +int __init acpi_scan_nodes(paddr_t start, paddr_t end)
> >  {
> >  	int i;
> >  	nodemask_t all_nodes_parsed;
> > @@ -489,7 +489,7 @@ int __init acpi_scan_nodes(u64 start, u64 end)
> >  	/* Finally register nodes */
> >  	for_each_node_mask(i, all_nodes_parsed)
> >  	{
> > -		u64 size = nodes[i].end - nodes[i].start;
> > +		paddr_t size = nodes[i].end - nodes[i].start;
> >  		if ( size == 0 )
> >  			printk(KERN_WARNING "SRAT: Node %u has no memory. "
> >  			       "BIOS Bug or mis-configured hardware?\n", i);
> > diff --git a/xen/include/asm-x86/numa.h b/xen/include/asm-x86/numa.h
> > index 8060cbf3f4..50cfd8e7ef 100644
> > --- a/xen/include/asm-x86/numa.h
> > +++ b/xen/include/asm-x86/numa.h
> > @@ -16,7 +16,7 @@ extern cpumask_t     node_to_cpumask[];
> >  #define node_to_cpumask(node)    (node_to_cpumask[node])
> >  
> >  struct node { 
> > -	u64 start,end; 
> > +	paddr_t start,end;
> >  };
> >  
> >  extern int compute_hash_shift(struct node *nodes, int numnodes,
> > @@ -36,7 +36,7 @@ extern void numa_set_node(int cpu, nodeid_t node);
> >  extern nodeid_t setup_node(unsigned int pxm);
> >  extern void srat_detect_node(int cpu);
> >  
> > -extern void setup_node_bootmem(nodeid_t nodeid, u64 start, u64 end);
> > +extern void setup_node_bootmem(nodeid_t nodeid, paddr_t start, paddr_t end);
> >  extern nodeid_t apicid_to_node[];
> >  extern void init_cpu_to_node(void);
> >  
> > @@ -73,9 +73,9 @@ static inline __attribute__((pure)) nodeid_t phys_to_nid(paddr_t addr)
> >  #define node_end_pfn(nid)       (NODE_DATA(nid)->node_start_pfn + \
> >  				 NODE_DATA(nid)->node_spanned_pages)
> >  
> > -extern int valid_numa_range(u64 start, u64 end, nodeid_t node);
> > +extern int valid_numa_range(paddr_t start, paddr_t end, nodeid_t node);
> >  
> > -void srat_parse_regions(u64 addr);
> > +void srat_parse_regions(paddr_t addr);
> >  extern u8 __node_distance(nodeid_t a, nodeid_t b);
> >  unsigned int arch_get_dma_bitsize(void);
> >  unsigned int arch_have_default_dmazone(void);
> > -- 
> > 2.25.1
> > 
> 



* Re: [PATCH 08/37] xen/x86: add detection of discontinous node memory range
  2021-09-23 12:02 ` [PATCH 08/37] xen/x86: add detection of discontinous node memory range Wei Chen
@ 2021-09-24  0:25   ` Stefano Stabellini
  2021-09-24  4:28     ` Wei Chen
  2022-01-18 16:13   ` Jan Beulich
  1 sibling, 1 reply; 192+ messages in thread
From: Stefano Stabellini @ 2021-09-24  0:25 UTC (permalink / raw)
  To: Wei Chen
  Cc: xen-devel, sstabellini, julien, Bertrand.Marquis, jbeulich,
	andrew.cooper3, roger.pau, wl

CC'ing x86 maintainers

On Thu, 23 Sep 2021, Wei Chen wrote:
> One NUMA node may contain several memory blocks. In the current Xen
> code, Xen maintains a single memory range per node to cover all of
> its memory blocks. But this creates a problem: if the gap between
> two of a node's memory blocks contains memory blocks that belong to
> other nodes (remote memory blocks), this node's memory range will be
> expanded to cover those remote memory blocks as well.
> 
> One node's memory range containing other nodes' memory is obviously
> not reasonable. It means the current NUMA code can only support
> nodes with contiguous memory blocks. However, on a physical machine,
> the addresses of multiple nodes can be interleaved.
> 
> So in this patch, we add code to detect discontiguous memory blocks
> for one node. NUMA initialization will fail and error messages will
> be printed when Xen detects such a hardware configuration.

At least on ARM, it is not just memory that can be interleaved, but also
MMIO regions. For instance:

node0 bank0 0-0x1000000
MMIO 0x1000000-0x1002000
Hole 0x1002000-0x2000000
node0 bank1 0x2000000-0x3000000

So I am not familiar with the SRAT format, but I think on ARM the check
would look different: we would just look for multiple memory ranges
under a device_type = "memory" node of a NUMA node in device tree.
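For reference, the containment test the patch performs can be sketched in isolation as follows. This is a simplified model of is_node_memory_continuous() (names and types are stand-ins); as the patch comment notes, plain overlaps are assumed to have been rejected earlier by conflicting_memblks(), so only full containment needs checking here:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t paddr_t;

struct node { paddr_t start, end; };

/* Return false if node nid's (already extended) range [start, end)
 * fully contains some other parsed node's range, i.e. the two nodes'
 * addresses are intertwined. */
static bool node_memory_continuous(const struct node *nodes, int nr_nodes,
                                   int nid, paddr_t start, paddr_t end)
{
    for (int i = 0; i < nr_nodes; i++) {
        if (i == nid)
            continue;
        if (start < nodes[i].start && nodes[i].end < end)
            return false;
    }
    return true;
}
```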



> Signed-off-by: Wei Chen <wei.chen@arm.com>
> ---
>  xen/arch/x86/srat.c | 36 ++++++++++++++++++++++++++++++++++++
>  1 file changed, 36 insertions(+)
> 
> diff --git a/xen/arch/x86/srat.c b/xen/arch/x86/srat.c
> index 7d20d7f222..2f08fa4660 100644
> --- a/xen/arch/x86/srat.c
> +++ b/xen/arch/x86/srat.c
> @@ -271,6 +271,36 @@ acpi_numa_processor_affinity_init(const struct acpi_srat_cpu_affinity *pa)
>  		       pxm, pa->apic_id, node);
>  }
>  
> +/*
> + * Check to see if there are other nodes within this node's range.
> + * We just need to check full contains situation. Because overlaps
> + * have been checked before by conflicting_memblks.
> + */
> +static bool __init is_node_memory_continuous(nodeid_t nid,
> +    paddr_t start, paddr_t end)
> +{
> +	nodeid_t i;
> +
> +	struct node *nd = &nodes[nid];
> +	for_each_node_mask(i, memory_nodes_parsed)
> +	{
> +		/* Skip itself */
> +		if (i == nid)
> +			continue;
> +
> +		nd = &nodes[i];
> +		if (start < nd->start && nd->end < end)
> +		{
> +			printk(KERN_ERR
> +			       "NODE %u: (%"PRIpaddr"-%"PRIpaddr") intertwine with NODE %u (%"PRIpaddr"-%"PRIpaddr")\n",
> +			       nid, start, end, i, nd->start, nd->end);
> +			return false;
> +		}
> +	}
> +
> +	return true;
> +}
> +
>  /* Callback for parsing of the Proximity Domain <-> Memory Area mappings */
>  void __init
>  acpi_numa_memory_affinity_init(const struct acpi_srat_mem_affinity *ma)
> @@ -344,6 +374,12 @@ acpi_numa_memory_affinity_init(const struct acpi_srat_mem_affinity *ma)
>  				nd->start = start;
>  			if (nd->end < end)
>  				nd->end = end;
> +
> +			/* Check whether this range contains memory for other nodes */
> +			if (!is_node_memory_continuous(node, nd->start, nd->end)) {
> +				bad_srat();
> +				return;
> +			}
>  		}
>  	}
>  	printk(KERN_INFO "SRAT: Node %u PXM %u %"PRIpaddr"-%"PRIpaddr"%s\n",
> -- 
> 2.25.1
> 



* Re: [PATCH 09/37] xen/x86: introduce two helpers to access memory hotplug end
  2021-09-23 12:02 ` [PATCH 09/37] xen/x86: introduce two helpers to access memory hotplug end Wei Chen
@ 2021-09-24  0:29   ` Stefano Stabellini
  2021-09-24  4:21     ` Wei Chen
  2022-01-24 16:24   ` Jan Beulich
  1 sibling, 1 reply; 192+ messages in thread
From: Stefano Stabellini @ 2021-09-24  0:29 UTC (permalink / raw)
  To: Wei Chen
  Cc: xen-devel, sstabellini, julien, Bertrand.Marquis, jbeulich,
	andrew.cooper3, roger.pau, wl

+x86 maintainers

On Thu, 23 Sep 2021, Wei Chen wrote:
> x86 provides a mem_hotplug to maintain the end of memory hotplug
                            ^ variable

> end address. This variable can be accessed out of mm.c. We want
> some code out of mm.c can be reused by other architectures without
                       ^ so that it can be reused

> memory hotplug ability. So in this patch, we introduce these two
> helpers to replace mem_hotplug direct access. This will give the
> ability to stub these two API.
                            ^ APIs


> Signed-off-by: Wei Chen <wei.chen@arm.com>
> ---
>  xen/include/asm-x86/mm.h | 10 ++++++++++
>  1 file changed, 10 insertions(+)
> 
> diff --git a/xen/include/asm-x86/mm.h b/xen/include/asm-x86/mm.h
> index cb90527499..af2fc4b0cd 100644
> --- a/xen/include/asm-x86/mm.h
> +++ b/xen/include/asm-x86/mm.h
> @@ -475,6 +475,16 @@ static inline int get_page_and_type(struct page_info *page,
>  
>  extern paddr_t mem_hotplug;
>  
> +static inline void mem_hotplug_update_boundary(paddr_t end)
> +{
> +    mem_hotplug = end;
> +}
> +
> +static inline paddr_t mem_hotplug_boundary(void)
> +{
> +    return mem_hotplug;
> +}
> +
>  /******************************************************************************
>   * With shadow pagetables, the different kinds of address start
>   * to get get confusing.
> -- 
> 2.25.1
> 
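The point of wrapping mem_hotplug behind these two accessors is that the same call sites can compile on an architecture without memory hotplug at all. A hypothetical stub pair for such an architecture (not part of this series, purely illustrative) might look like:

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t paddr_t;

/* Hypothetical stubs for an architecture with no memory hotplug:
 * the boundary never moves, so common code such as
 *     if (end > mem_hotplug_boundary())
 *         mem_hotplug_update_boundary(end);
 * compiles unchanged and stays a no-op. */
static inline void mem_hotplug_update_boundary(paddr_t end)
{
    (void)end; /* nothing to record */
}

static inline paddr_t mem_hotplug_boundary(void)
{
    return 0;
}
```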



* Re: [PATCH 10/37] xen/x86: use helpers to access/update mem_hotplug
  2021-09-23 12:02 ` [PATCH 10/37] xen/x86: use helpers to access/update mem_hotplug Wei Chen
@ 2021-09-24  0:31   ` Stefano Stabellini
  2021-09-24  4:29     ` Wei Chen
  2022-01-24 16:29   ` Jan Beulich
  1 sibling, 1 reply; 192+ messages in thread
From: Stefano Stabellini @ 2021-09-24  0:31 UTC (permalink / raw)
  To: Wei Chen
  Cc: xen-devel, sstabellini, julien, Bertrand.Marquis, jbeulich,
	andrew.cooper3, roger.pau, wl

+x86 maintainers


On Thu, 23 Sep 2021, Wei Chen wrote:
> We want to abstract code from acpi_numa_memory_affinity_init.
> But mem_hotplug is coupled with x86. In this patch, we use
> helpers to repace mem_hotplug direct accessing. This will
             ^ replace

> allow most code can be common.
                  ^ to be

I think this patch could be merged with the previous patch


> Signed-off-by: Wei Chen <wei.chen@arm.com>
> ---
>  xen/arch/x86/srat.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/xen/arch/x86/srat.c b/xen/arch/x86/srat.c
> index 2f08fa4660..3334ede7a5 100644
> --- a/xen/arch/x86/srat.c
> +++ b/xen/arch/x86/srat.c
> @@ -391,8 +391,8 @@ acpi_numa_memory_affinity_init(const struct acpi_srat_mem_affinity *ma)
>  	memblk_nodeid[num_node_memblks] = node;
>  	if (ma->flags & ACPI_SRAT_MEM_HOT_PLUGGABLE) {
>  		__set_bit(num_node_memblks, memblk_hotplug);
> -		if (end > mem_hotplug)
> -			mem_hotplug = end;
> +		if (end > mem_hotplug_boundary())
> +			mem_hotplug_update_boundary(end);
>  	}
>  	num_node_memblks++;
>  }
> -- 
> 2.25.1
> 



* Re: [PATCH 11/37] xen/x86: abstract neutral code from acpi_numa_memory_affinity_init
  2021-09-23 12:02 ` [PATCH 11/37] xen/x86: abstract neutral code from acpi_numa_memory_affinity_init Wei Chen
@ 2021-09-24  0:38   ` Stefano Stabellini
  2022-01-24 16:50   ` Jan Beulich
  1 sibling, 0 replies; 192+ messages in thread
From: Stefano Stabellini @ 2021-09-24  0:38 UTC (permalink / raw)
  To: Wei Chen
  Cc: xen-devel, sstabellini, julien, Bertrand.Marquis, jbeulich,
	andrew.cooper3, roger.pau, wl

+x86 maintainers


On Thu, 23 Sep 2021, Wei Chen wrote:
> There is some code in acpi_numa_memory_affinity_init to update a
> node's memory range and the node_memblk_range array. This code is
> not ACPI specific; it can be shared by other NUMA implementations,
> such as a device tree based one.
> 
> So in this patch, we abstract this memory range and block handling
> code into a new function. This avoids exporting static variables
> like node_memblk_range. PXM in the neutral code's print messages
> has been replaced by NODE, as PXM is ACPI specific.
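The conflict handling the commit message refers to relies on a scan like conflicting_memblks(); a simplified, self-contained sketch (the array layout and half-open range semantics below are assumptions, the real Xen function works on its static node_memblk_range table):

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t paddr_t;

struct memblk { paddr_t start, end; };

/* Return the index of a recorded memory block that overlaps the
 * half-open range [start, end), or -1 if there is no conflict. */
static int conflicting_memblks(const struct memblk *blks, int nr,
                               paddr_t start, paddr_t end)
{
    for (int i = 0; i < nr; i++) {
        if (start < blks[i].end && blks[i].start < end)
            return i;
    }
    return -1;
}
```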
> 
> Signed-off-by: Wei Chen <wei.chen@arm.com>
> ---
>  xen/arch/x86/srat.c        | 131 +++++++++++++++++++++----------------
>  xen/include/asm-x86/numa.h |   3 +
>  2 files changed, 77 insertions(+), 57 deletions(-)
> 
> diff --git a/xen/arch/x86/srat.c b/xen/arch/x86/srat.c
> index 3334ede7a5..18bc6b19bb 100644
> --- a/xen/arch/x86/srat.c
> +++ b/xen/arch/x86/srat.c
> @@ -104,6 +104,14 @@ nodeid_t setup_node(unsigned pxm)
>  	return node;
>  }
>  
> +bool __init numa_memblks_available(void)
> +{
> +	if (num_node_memblks < NR_NODE_MEMBLKS)
> +		return true;
> +
> +	return false;
> +}
> +
>  int valid_numa_range(paddr_t start, paddr_t end, nodeid_t node)
>  {
>  	int i;
> @@ -301,69 +309,35 @@ static bool __init is_node_memory_continuous(nodeid_t nid,
>  	return true;
>  }
>  
> -/* Callback for parsing of the Proximity Domain <-> Memory Area mappings */
> -void __init
> -acpi_numa_memory_affinity_init(const struct acpi_srat_mem_affinity *ma)
> +/* Neutral NUMA memory affinity init function for ACPI and DT */
> +int __init numa_update_node_memblks(nodeid_t node,
> +		paddr_t start, paddr_t size, bool hotplug)
>  {
> -	paddr_t start, end;
> -	unsigned pxm;
> -	nodeid_t node;
> +	paddr_t end = start + size;
>  	int i;
>  
> -	if (srat_disabled())
> -		return;
> -	if (ma->header.length != sizeof(struct acpi_srat_mem_affinity)) {
> -		bad_srat();
> -		return;
> -	}
> -	if (!(ma->flags & ACPI_SRAT_MEM_ENABLED))
> -		return;
> -
> -	start = ma->base_address;
> -	end = start + ma->length;
> -	/* Supplement the heuristics in l1tf_calculations(). */
> -	l1tf_safe_maddr = max(l1tf_safe_maddr, ROUNDUP(end, PAGE_SIZE));
> -
> -	if (num_node_memblks >= NR_NODE_MEMBLKS)
> -	{
> -		dprintk(XENLOG_WARNING,
> -                "Too many numa entry, try bigger NR_NODE_MEMBLKS \n");
> -		bad_srat();
> -		return;
> -	}
> -
> -	pxm = ma->proximity_domain;
> -	if (srat_rev < 2)
> -		pxm &= 0xff;
> -	node = setup_node(pxm);
> -	if (node == NUMA_NO_NODE) {
> -		bad_srat();
> -		return;
> -	}
> -	/* It is fine to add this area to the nodes data it will be used later*/
> +	/* It is fine to add this area to the nodes data it will be used later */
>  	i = conflicting_memblks(start, end);
>  	if (i < 0)
>  		/* everything fine */;
>  	else if (memblk_nodeid[i] == node) {
> -		bool mismatch = !(ma->flags & ACPI_SRAT_MEM_HOT_PLUGGABLE) !=
> -		                !test_bit(i, memblk_hotplug);
> +		bool mismatch = !hotplug != !test_bit(i, memblk_hotplug);
>  
> -		printk("%sSRAT: PXM %u (%"PRIpaddr"-%"PRIpaddr") overlaps with itself (%"PRIpaddr"-%"PRIpaddr")\n",
> -		       mismatch ? KERN_ERR : KERN_WARNING, pxm, start, end,
> +		printk("%sSRAT: NODE %u (%"PRIpaddr"-%"PRIpaddr") overlaps with itself (%"PRIpaddr"-%"PRIpaddr")\n",
> +		       mismatch ? KERN_ERR : KERN_WARNING, node, start, end,
>  		       node_memblk_range[i].start, node_memblk_range[i].end);
>  		if (mismatch) {
> -			bad_srat();
> -			return;
> +			return -1;
>  		}
>  	} else {
>  		printk(KERN_ERR
> -		       "SRAT: PXM %u (%"PRIpaddr"-%"PRIpaddr") overlaps with PXM %u (%"PRIpaddr"-%"PRIpaddr")\n",
> -		       pxm, start, end, node_to_pxm(memblk_nodeid[i]),
> +		       "SRAT: NODE %u (%"PRIpaddr"-%"PRIpaddr") overlaps with NODE %u (%"PRIpaddr"-%"PRIpaddr")\n",
> +		       node, start, end, memblk_nodeid[i],
>  		       node_memblk_range[i].start, node_memblk_range[i].end);
> -		bad_srat();
> -		return;
> +		return -1;
>  	}
> -	if (!(ma->flags & ACPI_SRAT_MEM_HOT_PLUGGABLE)) {
> +
> +	if (!hotplug) {
>  		struct node *nd = &nodes[node];
>  
>  		if (!node_test_and_set(node, memory_nodes_parsed)) {
> @@ -375,26 +349,69 @@ acpi_numa_memory_affinity_init(const struct acpi_srat_mem_affinity *ma)
>  			if (nd->end < end)
>  				nd->end = end;
>  
> -			/* Check whether this range contains memory for other nodes */
> -			if (!is_node_memory_continuous(node, nd->start, nd->end)) {
> -				bad_srat();
> -				return;
> -			}
> +			if (!is_node_memory_continuous(node, nd->start, nd->end))
> +				return -1;
>  		}
>  	}
> -	printk(KERN_INFO "SRAT: Node %u PXM %u %"PRIpaddr"-%"PRIpaddr"%s\n",
> -	       node, pxm, start, end,
> -	       ma->flags & ACPI_SRAT_MEM_HOT_PLUGGABLE ? " (hotplug)" : "");
> +
> +	printk(KERN_INFO "SRAT: Node %u %"PRIpaddr"-%"PRIpaddr"%s\n",
> +	       node, start, end, hotplug ? " (hotplug)" : "");
>  
>  	node_memblk_range[num_node_memblks].start = start;
>  	node_memblk_range[num_node_memblks].end = end;
>  	memblk_nodeid[num_node_memblks] = node;
> -	if (ma->flags & ACPI_SRAT_MEM_HOT_PLUGGABLE) {
> +	if (hotplug) {
>  		__set_bit(num_node_memblks, memblk_hotplug);
>  		if (end > mem_hotplug_boundary())
>  			mem_hotplug_update_boundary(end);
>  	}
>  	num_node_memblks++;
> +
> +	return 0;
> +}
> +
> +/* Callback for parsing of the Proximity Domain <-> Memory Area mappings */
> +void __init
> +acpi_numa_memory_affinity_init(const struct acpi_srat_mem_affinity *ma)
> +{
> +	unsigned pxm;
> +	nodeid_t node;
> +	int ret;
> +
> +	if (srat_disabled())
> +		return;
> +	if (ma->header.length != sizeof(struct acpi_srat_mem_affinity)) {
> +		bad_srat();
> +		return;
> +	}
> +	if (!(ma->flags & ACPI_SRAT_MEM_ENABLED))
> +		return;
> +
> +	/* Supplement the heuristics in l1tf_calculations(). */
> +	l1tf_safe_maddr = max(l1tf_safe_maddr,
> +			ROUNDUP((ma->base_address + ma->length), PAGE_SIZE));
> +
> +	if (!numa_memblks_available())
> +	{
> +		dprintk(XENLOG_WARNING,
> +                "Too many numa entry, try bigger NR_NODE_MEMBLKS \n");
> +		bad_srat();
> +		return;
> +	}
> +
> +	pxm = ma->proximity_domain;
> +	if (srat_rev < 2)
> +		pxm &= 0xff;
> +	node = setup_node(pxm);
> +	if (node == NUMA_NO_NODE) {
> +		bad_srat();
> +		return;
> +	}
> +
> +	ret = numa_update_node_memblks(node, ma->base_address, ma->length,
> +					ma->flags & ACPI_SRAT_MEM_HOT_PLUGGABLE);
> +	if (ret != 0)
> +		bad_srat();
>  }
>  
>  /* Sanity check to catch more bad SRATs (they are amazingly common).
> diff --git a/xen/include/asm-x86/numa.h b/xen/include/asm-x86/numa.h
> index 50cfd8e7ef..5772a70665 100644
> --- a/xen/include/asm-x86/numa.h
> +++ b/xen/include/asm-x86/numa.h
> @@ -74,6 +74,9 @@ static inline __attribute__((pure)) nodeid_t phys_to_nid(paddr_t addr)
>  				 NODE_DATA(nid)->node_spanned_pages)
>  
>  extern int valid_numa_range(paddr_t start, paddr_t end, nodeid_t node);
> +extern bool numa_memblks_available(void);
> +extern int numa_update_node_memblks(nodeid_t node,
> +		paddr_t start, paddr_t size, bool hotplug);
>  
>  void srat_parse_regions(paddr_t addr);
>  extern u8 __node_distance(nodeid_t a, nodeid_t b);
> -- 
> 2.25.1
> 



* Re: [PATCH 12/37] xen/x86: decouple nodes_cover_memory from E820 map
  2021-09-23 12:02 ` [PATCH 12/37] xen/x86: decouple nodes_cover_memory from E820 map Wei Chen
@ 2021-09-24  0:39   ` Stefano Stabellini
  2022-01-24 16:59   ` Jan Beulich
  1 sibling, 0 replies; 192+ messages in thread
From: Stefano Stabellini @ 2021-09-24  0:39 UTC (permalink / raw)
  To: Wei Chen
  Cc: xen-devel, sstabellini, julien, Bertrand.Marquis, jbeulich,
	andrew.cooper3, roger.pau, wl

+x86 maintainers


On Thu, 23 Sep 2021, Wei Chen wrote:
> We will reuse nodes_cover_memory on Arm to check its bootmem
> info. So we introduce two arch helpers to get the memory map's
> entry count and a specified entry's range:
>     arch_meminfo_get_nr_bank
>     arch_meminfo_get_ram_bank_range
> 
> Based on these two helpers, nodes_cover_memory becomes
> architecture independent. The only change from an x86
> perspective is the additional checks:
>   !start || !end
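The decoupled iteration pattern can be modelled outside Xen with a toy bank table standing in for e820 (the table contents and the total_ram() consumer below are invented for illustration; only the two helper signatures come from the patch):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t paddr_t;

struct bank { paddr_t start, end; bool ram; };

/* Toy backing store playing the role of the x86 e820 map. */
static struct bank banks[] = {
    { 0x00000000, 0x01000000, true  },
    { 0x01000000, 0x01100000, false }, /* MMIO: reported as -1 */
    { 0x02000000, 0x03000000, true  },
};

static uint32_t arch_meminfo_get_nr_bank(void)
{
    return sizeof(banks) / sizeof(banks[0]);
}

static int arch_meminfo_get_ram_bank_range(uint32_t bank,
                                           paddr_t *start, paddr_t *end)
{
    if (!banks[bank].ram || !start || !end)
        return -1;
    *start = banks[bank].start;
    *end = banks[bank].end;
    return 0;
}

/* Walk the banks the way architecture-independent code would. */
static paddr_t total_ram(void)
{
    paddr_t sum = 0, s, e;
    for (uint32_t i = 0; i < arch_meminfo_get_nr_bank(); i++)
        if (!arch_meminfo_get_ram_bank_range(i, &s, &e))
            sum += e - s;
    return sum;
}
```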
> 
> Signed-off-by: Wei Chen <wei.chen@arm.com>
> ---
>  xen/arch/x86/numa.c        | 18 ++++++++++++++++++
>  xen/arch/x86/srat.c        | 11 ++++-------
>  xen/include/asm-x86/numa.h |  3 +++
>  3 files changed, 25 insertions(+), 7 deletions(-)
> 
> diff --git a/xen/arch/x86/numa.c b/xen/arch/x86/numa.c
> index 6337bbdf31..6bc4ade411 100644
> --- a/xen/arch/x86/numa.c
> +++ b/xen/arch/x86/numa.c
> @@ -378,6 +378,24 @@ unsigned int arch_have_default_dmazone(void)
>      return ( num_online_nodes() > 1 ) ? 1 : 0;
>  }
>  
> +uint32_t __init arch_meminfo_get_nr_bank(void)
> +{
> +	return e820.nr_map;
> +}
> +
> +int __init arch_meminfo_get_ram_bank_range(uint32_t bank,
> +	paddr_t *start, paddr_t *end)
> +{
> +	if (e820.map[bank].type != E820_RAM || !start || !end) {
> +		return -1;
> +	}
> +
> +	*start = e820.map[bank].addr;
> +	*end = e820.map[bank].addr + e820.map[bank].size;
> +
> +	return 0;
> +}
> +
>  static void dump_numa(unsigned char key)
>  {
>      s_time_t now = NOW();
> diff --git a/xen/arch/x86/srat.c b/xen/arch/x86/srat.c
> index 18bc6b19bb..aa07a7e975 100644
> --- a/xen/arch/x86/srat.c
> +++ b/xen/arch/x86/srat.c
> @@ -419,17 +419,14 @@ acpi_numa_memory_affinity_init(const struct acpi_srat_mem_affinity *ma)
>  static int __init nodes_cover_memory(void)
>  {
>  	int i;
> +	uint32_t nr_banks = arch_meminfo_get_nr_bank();
>  
> -	for (i = 0; i < e820.nr_map; i++) {
> +	for (i = 0; i < nr_banks; i++) {
>  		int j, found;
>  		paddr_t start, end;
>  
> -		if (e820.map[i].type != E820_RAM) {
> +		if (arch_meminfo_get_ram_bank_range(i, &start, &end))
>  			continue;
> -		}
> -
> -		start = e820.map[i].addr;
> -		end = e820.map[i].addr + e820.map[i].size;
>  
>  		do {
>  			found = 0;
> @@ -448,7 +445,7 @@ static int __init nodes_cover_memory(void)
>  		} while (found && start < end);
>  
>  		if (start < end) {
> -			printk(KERN_ERR "SRAT: No PXM for e820 range: "
> +			printk(KERN_ERR "SRAT: No NODE for memory map range: "
>  				"%"PRIpaddr" - %"PRIpaddr"\n", start, end);
>  			return 0;
>  		}
> diff --git a/xen/include/asm-x86/numa.h b/xen/include/asm-x86/numa.h
> index 5772a70665..78e044a390 100644
> --- a/xen/include/asm-x86/numa.h
> +++ b/xen/include/asm-x86/numa.h
> @@ -82,5 +82,8 @@ void srat_parse_regions(paddr_t addr);
>  extern u8 __node_distance(nodeid_t a, nodeid_t b);
>  unsigned int arch_get_dma_bitsize(void);
>  unsigned int arch_have_default_dmazone(void);
> +extern uint32_t arch_meminfo_get_nr_bank(void);
> +extern int arch_meminfo_get_ram_bank_range(uint32_t bank,
> +    paddr_t *start, paddr_t *end);
>  
>  #endif
> -- 
> 2.25.1
> 


^ permalink raw reply	[flat|nested] 192+ messages in thread

* Re: [PATCH 13/37] xen/x86: decouple processor_nodes_parsed from acpi numa functions
  2021-09-23 12:02 ` [PATCH 13/37] xen/x86: decouple processor_nodes_parsed from acpi numa functions Wei Chen
@ 2021-09-24  0:40   ` Stefano Stabellini
  2022-01-25  9:49   ` Jan Beulich
  1 sibling, 0 replies; 192+ messages in thread
From: Stefano Stabellini @ 2021-09-24  0:40 UTC (permalink / raw)
  To: Wei Chen
  Cc: xen-devel, sstabellini, julien, Bertrand.Marquis, jbeulich,
	andrew.cooper3, roger.pau, wl

+x86 maintainers


On Thu, 23 Sep 2021, Wei Chen wrote:
> Xen uses processor_nodes_parsed to record processor nodes parsed
> from the ACPI table or another firmware-provided resource table.
> This variable is used directly in the ACPI NUMA functions. In
> follow-up patches, arch-neutral NUMA code will be abstracted and
> moved to other files. So in this patch, we introduce the
> numa_set_processor_nodes_parsed helper to decouple
> processor_nodes_parsed from the ACPI NUMA functions.
> 
> Signed-off-by: Wei Chen <wei.chen@arm.com>
> ---
>  xen/arch/x86/srat.c        | 9 +++++++--
>  xen/include/asm-x86/numa.h | 1 +
>  2 files changed, 8 insertions(+), 2 deletions(-)
> 
> diff --git a/xen/arch/x86/srat.c b/xen/arch/x86/srat.c
> index aa07a7e975..9276a52138 100644
> --- a/xen/arch/x86/srat.c
> +++ b/xen/arch/x86/srat.c
> @@ -104,6 +104,11 @@ nodeid_t setup_node(unsigned pxm)
>  	return node;
>  }
>  
> +void  __init numa_set_processor_nodes_parsed(nodeid_t node)
> +{
> +	node_set(node, processor_nodes_parsed);
> +}
> +
>  bool __init numa_memblks_available(void)
>  {
>  	if (num_node_memblks < NR_NODE_MEMBLKS)
> @@ -236,7 +241,7 @@ acpi_numa_x2apic_affinity_init(const struct acpi_srat_x2apic_cpu_affinity *pa)
>  	}
>  
>  	apicid_to_node[pa->apic_id] = node;
> -	node_set(node, processor_nodes_parsed);
> +	numa_set_processor_nodes_parsed(node);
>  	acpi_numa = 1;
>  
>  	if (opt_acpi_verbose)
> @@ -271,7 +276,7 @@ acpi_numa_processor_affinity_init(const struct acpi_srat_cpu_affinity *pa)
>  		return;
>  	}
>  	apicid_to_node[pa->apic_id] = node;
> -	node_set(node, processor_nodes_parsed);
> +	numa_set_processor_nodes_parsed(node);
>  	acpi_numa = 1;
>  
>  	if (opt_acpi_verbose)
> diff --git a/xen/include/asm-x86/numa.h b/xen/include/asm-x86/numa.h
> index 78e044a390..295f875a51 100644
> --- a/xen/include/asm-x86/numa.h
> +++ b/xen/include/asm-x86/numa.h
> @@ -77,6 +77,7 @@ extern int valid_numa_range(paddr_t start, paddr_t end, nodeid_t node);
>  extern bool numa_memblks_available(void);
>  extern int numa_update_node_memblks(nodeid_t node,
>  		paddr_t start, paddr_t size, bool hotplug);
> +extern void numa_set_processor_nodes_parsed(nodeid_t node);
>  
>  void srat_parse_regions(paddr_t addr);
>  extern u8 __node_distance(nodeid_t a, nodeid_t b);
> -- 
> 2.25.1
> 
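The decoupling pattern above can be sketched in isolation: the mask stays file-local and the helper becomes the only way outside code updates it. The nodemask type here is a simplified stand-in for Xen's real bitmap:

```c
#include <assert.h>
#include <stdint.h>

/* Simplified stand-ins: Xen's nodemask_t is a bitmap with one bit per
 * node; 64 nodes fit in one 64-bit word for this sketch. */
#define MAX_NUMNODES 64
typedef uint8_t nodeid_t;
typedef struct { uint64_t bits; } nodemask_t;

/* Keeping the mask static confines it to this file; the helper below
 * is the only way other code can update it, which is the decoupling
 * the patch introduces for processor_nodes_parsed in srat.c. */
static nodemask_t processor_nodes_parsed;

static void numa_set_processor_nodes_parsed(nodeid_t node)
{
    processor_nodes_parsed.bits |= UINT64_C(1) << node;
}

static int node_parsed(nodeid_t node)
{
    return (processor_nodes_parsed.bits >> node) & 1;
}
```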



* Re: [PATCH 14/37] xen/x86: use name fw_numa to replace acpi_numa
  2021-09-23 12:02 ` [PATCH 14/37] xen/x86: use name fw_numa to replace acpi_numa Wei Chen
@ 2021-09-24  0:40   ` Stefano Stabellini
  2022-01-25 10:12   ` Jan Beulich
  1 sibling, 0 replies; 192+ messages in thread
From: Stefano Stabellini @ 2021-09-24  0:40 UTC (permalink / raw)
  To: Wei Chen
  Cc: xen-devel, sstabellini, julien, Bertrand.Marquis, jbeulich,
	andrew.cooper3, roger.pau, wl

+x86 maintainers


On Thu, 23 Sep 2021, Wei Chen wrote:
> Xen uses acpi_numa as a switch for ACPI-based NUMA. We want to
> reuse this switch logic for other firmware-based NUMA
> implementations, like the device tree based NUMA in follow-up
> patches. As Xen will never use both ACPI and device tree based
> NUMA at runtime, rename acpi_numa to the more generic name
> fw_numa. This will also allow the code to be mostly common.
> 
> Signed-off-by: Wei Chen <wei.chen@arm.com>
> ---
>  xen/arch/x86/numa.c        |  6 +++---
>  xen/arch/x86/setup.c       |  2 +-
>  xen/arch/x86/srat.c        | 10 +++++-----
>  xen/include/asm-x86/acpi.h |  2 +-
>  4 files changed, 10 insertions(+), 10 deletions(-)
> 
> diff --git a/xen/arch/x86/numa.c b/xen/arch/x86/numa.c
> index 6bc4ade411..2ef385ae3f 100644
> --- a/xen/arch/x86/numa.c
> +++ b/xen/arch/x86/numa.c
> @@ -51,11 +51,11 @@ cpumask_t node_to_cpumask[MAX_NUMNODES] __read_mostly;
>  nodemask_t __read_mostly node_online_map = { { [0] = 1UL } };
>  
>  bool numa_off;
> -s8 acpi_numa = 0;
> +s8 fw_numa = 0;
>  
>  int srat_disabled(void)
>  {
> -    return numa_off || acpi_numa < 0;
> +    return numa_off || fw_numa < 0;
>  }
>  
>  /*
> @@ -315,7 +315,7 @@ static __init int numa_setup(const char *opt)
>      else if ( !strncmp(opt,"noacpi",6) )
>      {
>          numa_off = false;
> -        acpi_numa = -1;
> +        fw_numa = -1;
>      }
>  #endif
>      else
> diff --git a/xen/arch/x86/setup.c b/xen/arch/x86/setup.c
> index b101565f14..1a2093b554 100644
> --- a/xen/arch/x86/setup.c
> +++ b/xen/arch/x86/setup.c
> @@ -313,7 +313,7 @@ void srat_detect_node(int cpu)
>      node_set_online(node);
>      numa_set_node(cpu, node);
>  
> -    if ( opt_cpu_info && acpi_numa > 0 )
> +    if ( opt_cpu_info && fw_numa > 0 )
>          printk("CPU %d APIC %d -> Node %d\n", cpu, apicid, node);
>  }
>  
> diff --git a/xen/arch/x86/srat.c b/xen/arch/x86/srat.c
> index 9276a52138..4921830f94 100644
> --- a/xen/arch/x86/srat.c
> +++ b/xen/arch/x86/srat.c
> @@ -167,7 +167,7 @@ static __init void bad_srat(void)
>  {
>  	int i;
>  	printk(KERN_ERR "SRAT: SRAT not used.\n");
> -	acpi_numa = -1;
> +	fw_numa = -1;
>  	for (i = 0; i < MAX_LOCAL_APIC; i++)
>  		apicid_to_node[i] = NUMA_NO_NODE;
>  	for (i = 0; i < ARRAY_SIZE(pxm2node); i++)
> @@ -242,7 +242,7 @@ acpi_numa_x2apic_affinity_init(const struct acpi_srat_x2apic_cpu_affinity *pa)
>  
>  	apicid_to_node[pa->apic_id] = node;
>  	numa_set_processor_nodes_parsed(node);
> -	acpi_numa = 1;
> +	fw_numa = 1;
>  
>  	if (opt_acpi_verbose)
>  		printk(KERN_INFO "SRAT: PXM %u -> APIC %08x -> Node %u\n",
> @@ -277,7 +277,7 @@ acpi_numa_processor_affinity_init(const struct acpi_srat_cpu_affinity *pa)
>  	}
>  	apicid_to_node[pa->apic_id] = node;
>  	numa_set_processor_nodes_parsed(node);
> -	acpi_numa = 1;
> +	fw_numa = 1;
>  
>  	if (opt_acpi_verbose)
>  		printk(KERN_INFO "SRAT: PXM %u -> APIC %02x -> Node %u\n",
> @@ -492,7 +492,7 @@ void __init srat_parse_regions(paddr_t addr)
>  	u64 mask;
>  	unsigned int i;
>  
> -	if (acpi_disabled || acpi_numa < 0 ||
> +	if (acpi_disabled || fw_numa < 0 ||
>  	    acpi_table_parse(ACPI_SIG_SRAT, acpi_parse_srat))
>  		return;
>  
> @@ -521,7 +521,7 @@ int __init acpi_scan_nodes(paddr_t start, paddr_t end)
>  	for (i = 0; i < MAX_NUMNODES; i++)
>  		cutoff_node(i, start, end);
>  
> -	if (acpi_numa <= 0)
> +	if (fw_numa <= 0)
>  		return -1;
>  
>  	if (!nodes_cover_memory()) {
> diff --git a/xen/include/asm-x86/acpi.h b/xen/include/asm-x86/acpi.h
> index 7032f3a001..83be71fec3 100644
> --- a/xen/include/asm-x86/acpi.h
> +++ b/xen/include/asm-x86/acpi.h
> @@ -101,7 +101,7 @@ extern unsigned long acpi_wakeup_address;
>  
>  #define ARCH_HAS_POWER_INIT	1
>  
> -extern s8 acpi_numa;
> +extern s8 fw_numa;
>  extern int acpi_scan_nodes(u64 start, u64 end);
>  #define NR_NODE_MEMBLKS (MAX_NUMNODES*2)
>  
> -- 
> 2.25.1
> 
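The renamed switch keeps the existing tri-state convention, which can be sketched standalone (the comments describe the convention as used in this series; the sketch is illustrative, not Xen code):

```c
#include <assert.h>

/* The fw_numa tri-state as used in the series:
 *   < 0  firmware NUMA disabled (bad tables, or "numa=noacpi")
 *  == 0  nothing parsed yet
 *   > 0  firmware (ACPI SRAT, or later DT) provided NUMA affinity
 */
static signed char fw_numa;
static int numa_off;

/* Mirrors srat_disabled() after the rename. */
static int srat_disabled(void)
{
    return numa_off || fw_numa < 0;
}
```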



* Re: [PATCH 15/37] xen/x86: rename acpi_scan_nodes to numa_scan_nodes
  2021-09-23 12:02 ` [PATCH 15/37] xen/x86: rename acpi_scan_nodes to numa_scan_nodes Wei Chen
@ 2021-09-24  0:40   ` Stefano Stabellini
  2022-01-25 10:17   ` Jan Beulich
  1 sibling, 0 replies; 192+ messages in thread
From: Stefano Stabellini @ 2021-09-24  0:40 UTC (permalink / raw)
  To: Wei Chen
  Cc: xen-devel, sstabellini, julien, Bertrand.Marquis, jbeulich,
	andrew.cooper3, roger.pau, wl

+x86 maintainers


On Thu, 23 Sep 2021, Wei Chen wrote:
> Most of the code in acpi_scan_nodes can be reused by other NUMA
> implementations. Rename acpi_scan_nodes to the more generic name
> numa_scan_nodes, and replace "BIOS" with "Firmware" in the print
> message, as BIOS is an x86-specific term.
> 
> Signed-off-by: Wei Chen <wei.chen@arm.com>
> ---
>  xen/arch/x86/numa.c        | 2 +-
>  xen/arch/x86/srat.c        | 4 ++--
>  xen/include/asm-x86/acpi.h | 2 +-
>  3 files changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/xen/arch/x86/numa.c b/xen/arch/x86/numa.c
> index 2ef385ae3f..8a4710df39 100644
> --- a/xen/arch/x86/numa.c
> +++ b/xen/arch/x86/numa.c
> @@ -261,7 +261,7 @@ void __init numa_initmem_init(unsigned long start_pfn, unsigned long end_pfn)
>      end = pfn_to_paddr(end_pfn);
>  
>  #ifdef CONFIG_ACPI_NUMA
> -    if ( !numa_off && !acpi_scan_nodes(start, end) )
> +    if ( !numa_off && !numa_scan_nodes(start, end) )
>          return;
>  #endif
>  
> diff --git a/xen/arch/x86/srat.c b/xen/arch/x86/srat.c
> index 4921830f94..0b8b0b0c95 100644
> --- a/xen/arch/x86/srat.c
> +++ b/xen/arch/x86/srat.c
> @@ -512,7 +512,7 @@ void __init srat_parse_regions(paddr_t addr)
>  }
>  
>  /* Use the information discovered above to actually set up the nodes. */
> -int __init acpi_scan_nodes(paddr_t start, paddr_t end)
> +int __init numa_scan_nodes(paddr_t start, paddr_t end)
>  {
>  	int i;
>  	nodemask_t all_nodes_parsed;
> @@ -547,7 +547,7 @@ int __init acpi_scan_nodes(paddr_t start, paddr_t end)
>  		paddr_t size = nodes[i].end - nodes[i].start;
>  		if ( size == 0 )
>  			printk(KERN_WARNING "SRAT: Node %u has no memory. "
> -			       "BIOS Bug or mis-configured hardware?\n", i);
> +			       "Firmware Bug or mis-configured hardware?\n", i);
>  
>  		setup_node_bootmem(i, nodes[i].start, nodes[i].end);
>  	}
> diff --git a/xen/include/asm-x86/acpi.h b/xen/include/asm-x86/acpi.h
> index 83be71fec3..2add971072 100644
> --- a/xen/include/asm-x86/acpi.h
> +++ b/xen/include/asm-x86/acpi.h
> @@ -102,7 +102,7 @@ extern unsigned long acpi_wakeup_address;
>  #define ARCH_HAS_POWER_INIT	1
>  
>  extern s8 fw_numa;
> -extern int acpi_scan_nodes(u64 start, u64 end);
> +extern int numa_scan_nodes(u64 start, u64 end);
>  #define NR_NODE_MEMBLKS (MAX_NUMNODES*2)
>  
>  extern struct acpi_sleep_info acpi_sinfo;
> -- 
> 2.25.1
> 



* Re: [PATCH 16/37] xen/x86: export srat_bad to external
  2021-09-23 12:02 ` [PATCH 16/37] xen/x86: export srat_bad to external Wei Chen
@ 2021-09-24  0:41   ` Stefano Stabellini
  2022-01-25 10:22   ` Jan Beulich
  1 sibling, 0 replies; 192+ messages in thread
From: Stefano Stabellini @ 2021-09-24  0:41 UTC (permalink / raw)
  To: Wei Chen
  Cc: xen-devel, sstabellini, julien, Bertrand.Marquis, jbeulich,
	andrew.cooper3, roger.pau, wl

+x86 maintainers


On Thu, 23 Sep 2021, Wei Chen wrote:
> bad_srat is used when the NUMA initialization code fails to scan the
> SRAT. It turns fw_numa to disabled status. Its implementation depends
> on the NUMA implementation. We want every NUMA implementation to
> provide this function for the common initialization code.
> 
> In this patch, we export bad_srat. This will allow the code to be
> mostly common.
> 
> Signed-off-by: Wei Chen <wei.chen@arm.com>
> ---
>  xen/arch/x86/srat.c        | 2 +-
>  xen/include/asm-x86/numa.h | 1 +
>  2 files changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/xen/arch/x86/srat.c b/xen/arch/x86/srat.c
> index 0b8b0b0c95..94bd5b34da 100644
> --- a/xen/arch/x86/srat.c
> +++ b/xen/arch/x86/srat.c
> @@ -163,7 +163,7 @@ static __init void cutoff_node(int i, paddr_t start, paddr_t end)
>  	}
>  }
>  
> -static __init void bad_srat(void)
> +__init void bad_srat(void)
>  {
>  	int i;
>  	printk(KERN_ERR "SRAT: SRAT not used.\n");
> diff --git a/xen/include/asm-x86/numa.h b/xen/include/asm-x86/numa.h
> index 295f875a51..a5690a7098 100644
> --- a/xen/include/asm-x86/numa.h
> +++ b/xen/include/asm-x86/numa.h
> @@ -32,6 +32,7 @@ extern bool numa_off;
>  
>  
>  extern int srat_disabled(void);
> +extern void bad_srat(void);
>  extern void numa_set_node(int cpu, nodeid_t node);
>  extern nodeid_t setup_node(unsigned int pxm);
>  extern void srat_detect_node(int cpu);
> -- 
> 2.25.1
> 



* Re: [PATCH 17/37] xen/x86: use CONFIG_NUMA to gate numa_scan_nodes
  2021-09-23 12:02 ` [PATCH 17/37] xen/x86: use CONFIG_NUMA to gate numa_scan_nodes Wei Chen
@ 2021-09-24  0:41   ` Stefano Stabellini
  2022-01-25 10:26   ` Jan Beulich
  1 sibling, 0 replies; 192+ messages in thread
From: Stefano Stabellini @ 2021-09-24  0:41 UTC (permalink / raw)
  To: Wei Chen
  Cc: xen-devel, sstabellini, julien, Bertrand.Marquis, jbeulich,
	andrew.cooper3, roger.pau, wl

+x86 maintainers


On Thu, 23 Sep 2021, Wei Chen wrote:
> Now that numa_scan_nodes has been turned into a neutral function,
> it no longer makes sense to gate it with CONFIG_ACPI_NUMA in
> numa_initmem_init. As CONFIG_ACPI_NUMA is selected by CONFIG_NUMA
> on x86, this patch replaces CONFIG_ACPI_NUMA with CONFIG_NUMA.
> 
> Signed-off-by: Wei Chen <wei.chen@arm.com>
> ---
>  xen/arch/x86/numa.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/xen/arch/x86/numa.c b/xen/arch/x86/numa.c
> index 8a4710df39..509d2738c0 100644
> --- a/xen/arch/x86/numa.c
> +++ b/xen/arch/x86/numa.c
> @@ -260,7 +260,7 @@ void __init numa_initmem_init(unsigned long start_pfn, unsigned long end_pfn)
>      start = pfn_to_paddr(start_pfn);
>      end = pfn_to_paddr(end_pfn);
>  
> -#ifdef CONFIG_ACPI_NUMA
> +#ifdef CONFIG_NUMA
>      if ( !numa_off && !numa_scan_nodes(start, end) )
>          return;
>  #endif
> -- 
> 2.25.1
> 



* Re: [PATCH 20/37] xen: introduce CONFIG_EFI to stub API for non-EFI architecture
  2021-09-23 12:02 ` [PATCH 20/37] xen: introduce CONFIG_EFI to stub API for non-EFI architecture Wei Chen
@ 2021-09-24  1:15   ` Stefano Stabellini
  2021-09-24  4:34     ` Wei Chen
  2022-01-25 10:34   ` Jan Beulich
  1 sibling, 1 reply; 192+ messages in thread
From: Stefano Stabellini @ 2021-09-24  1:15 UTC (permalink / raw)
  To: Wei Chen; +Cc: xen-devel, sstabellini, julien, Bertrand.Marquis

On Thu, 23 Sep 2021, Wei Chen wrote:
> Some architectures do not support EFI, but the EFI API will be used
> in some common features. Instead of spreading #ifdef ARCH around,
> we introduce this Kconfig option to give Xen the ability to stub
> the EFI API for architectures without EFI support.
> 
> Signed-off-by: Wei Chen <wei.chen@arm.com>
> ---
>  xen/arch/arm/Kconfig  |  1 +
>  xen/arch/arm/Makefile |  2 +-
>  xen/arch/x86/Kconfig  |  1 +
>  xen/common/Kconfig    | 11 +++++++++++
>  xen/include/xen/efi.h |  4 ++++
>  5 files changed, 18 insertions(+), 1 deletion(-)
> 
> diff --git a/xen/arch/arm/Kconfig b/xen/arch/arm/Kconfig
> index ecfa6822e4..865ad83a89 100644
> --- a/xen/arch/arm/Kconfig
> +++ b/xen/arch/arm/Kconfig
> @@ -6,6 +6,7 @@ config ARM_64
>  	def_bool y
>  	depends on !ARM_32
>  	select 64BIT
> +	select EFI
>  	select HAS_FAST_MULTIPLY
>  
>  config ARM
> diff --git a/xen/arch/arm/Makefile b/xen/arch/arm/Makefile
> index 3d3b97b5b4..ae4efbf76e 100644
> --- a/xen/arch/arm/Makefile
> +++ b/xen/arch/arm/Makefile
> @@ -1,6 +1,6 @@
>  obj-$(CONFIG_ARM_32) += arm32/
>  obj-$(CONFIG_ARM_64) += arm64/
> -obj-$(CONFIG_ARM_64) += efi/
> +obj-$(CONFIG_EFI) += efi/
>  obj-$(CONFIG_ACPI) += acpi/
>  ifneq ($(CONFIG_NO_PLAT),y)
>  obj-y += platforms/
> diff --git a/xen/arch/x86/Kconfig b/xen/arch/x86/Kconfig
> index 28d13b9705..b9ed187f6b 100644
> --- a/xen/arch/x86/Kconfig
> +++ b/xen/arch/x86/Kconfig
> @@ -10,6 +10,7 @@ config X86
>  	select ALTERNATIVE_CALL
>  	select ARCH_SUPPORTS_INT128
>  	select CORE_PARKING
> +	select EFI
>  	select HAS_ALTERNATIVE
>  	select HAS_COMPAT
>  	select HAS_CPUFREQ
> diff --git a/xen/common/Kconfig b/xen/common/Kconfig
> index 9ebb1c239b..f998746a1a 100644
> --- a/xen/common/Kconfig
> +++ b/xen/common/Kconfig
> @@ -11,6 +11,16 @@ config COMPAT
>  config CORE_PARKING
>  	bool
>  
> +config EFI
> +	bool

Without the title the option is not user-selectable (or de-selectable).
So the help message below can never be seen.

Either add a title, e.g.:

bool "EFI support"

Or fully make the option a silent option by removing the help text.
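For reference, the two alternatives might look like this (a sketch, not the final Kconfig text):

```kconfig
# Alternative 1: user-visible option with a title, so the help is shown:
config EFI
	bool "EFI support"
	---help---
	  Support for runtime services provided by UEFI firmware.

# Alternative 2: silent (select-only) option: no title, no help text:
config EFI
	bool
```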



> +	---help---
> +      This option provides support for runtime services provided
> +      by UEFI firmware (such as non-volatile variables, realtime
> +      clock, and platform reset). A UEFI stub is also provided to
> +      allow the kernel to be booted as an EFI application. This
> +      is only useful for kernels that may run on systems that have
> +      UEFI firmware.
> +
>  config GRANT_TABLE
>  	bool "Grant table support" if EXPERT
>  	default y
> @@ -196,6 +206,7 @@ config KEXEC
>  
>  config EFI_SET_VIRTUAL_ADDRESS_MAP
>      bool "EFI: call SetVirtualAddressMap()" if EXPERT
> +    depends on EFI
>      ---help---
>        Call EFI SetVirtualAddressMap() runtime service to setup memory map for
>        further runtime services. According to UEFI spec, it isn't strictly
> diff --git a/xen/include/xen/efi.h b/xen/include/xen/efi.h
> index 94a7e547f9..661a48286a 100644
> --- a/xen/include/xen/efi.h
> +++ b/xen/include/xen/efi.h
> @@ -25,6 +25,8 @@ extern struct efi efi;
>  
>  #ifndef __ASSEMBLY__
>  
> +#ifdef CONFIG_EFI
> +
>  union xenpf_efi_info;
>  union compat_pf_efi_info;
>  
> @@ -45,6 +47,8 @@ int efi_runtime_call(struct xenpf_efi_runtime_call *);
>  int efi_compat_get_info(uint32_t idx, union compat_pf_efi_info *);
>  int efi_compat_runtime_call(struct compat_pf_efi_runtime_call *);
>  
> +#endif /* CONFIG_EFI*/
> +
>  #endif /* !__ASSEMBLY__ */
>  
>  #endif /* __XEN_EFI_H__ */
> -- 
> 2.25.1
> 



* Re: [PATCH 21/37] xen/arm: Keep memory nodes in dtb for NUMA when boot from EFI
  2021-09-23 12:02 ` [PATCH 21/37] xen/arm: Keep memory nodes in dtb for NUMA when boot from EFI Wei Chen
@ 2021-09-24  1:23   ` Stefano Stabellini
  2021-09-24  4:36     ` Wei Chen
  2022-01-25 10:38   ` Jan Beulich
  1 sibling, 1 reply; 192+ messages in thread
From: Stefano Stabellini @ 2021-09-24  1:23 UTC (permalink / raw)
  To: Wei Chen; +Cc: xen-devel, sstabellini, julien, Bertrand.Marquis

On Thu, 23 Sep 2021, Wei Chen wrote:
> EFI can get the memory map from the EFI system table. But the EFI
> system table doesn't contain memory NUMA information; EFI depends
> on the ACPI SRAT or device tree memory nodes to parse the memory
> blocks' NUMA mapping.
> 
> But in the current code, when Xen boots from EFI, it deletes all
> memory nodes in the device tree. So in a UEFI + DTB boot, we no
> longer have numa-node-id for the memory blocks.
> 
> So in this patch, we keep the memory nodes in the device tree for
> the NUMA code to parse the memory numa-node-id later.
> 
> As a side effect, if we still parse boot memory information in
> early_scan_node, bootinfo.mem would accumulate the memory ranges
> from the memory nodes twice. So we have to prevent early_scan_node
> from parsing memory nodes in an EFI boot.
> 
> As EFI APIs only can be used in Arm64, so we introduced a stub
> API for non-EFI supported Arm32. This will prevent

This last sentence is incomplete.

But aside from that, this patch looks good to me.


> Signed-off-by: Wei Chen <wei.chen@arm.com>
> ---
>  xen/arch/arm/bootfdt.c      |  8 +++++++-
>  xen/arch/arm/efi/efi-boot.h | 25 -------------------------
>  xen/include/xen/efi.h       |  7 +++++++
>  3 files changed, 14 insertions(+), 26 deletions(-)
> 
> diff --git a/xen/arch/arm/bootfdt.c b/xen/arch/arm/bootfdt.c
> index afaa0e249b..6bc5a465ec 100644
> --- a/xen/arch/arm/bootfdt.c
> +++ b/xen/arch/arm/bootfdt.c
> @@ -11,6 +11,7 @@
>  #include <xen/lib.h>
>  #include <xen/kernel.h>
>  #include <xen/init.h>
> +#include <xen/efi.h>
>  #include <xen/device_tree.h>
>  #include <xen/libfdt/libfdt.h>
>  #include <xen/sort.h>
> @@ -370,7 +371,12 @@ static int __init early_scan_node(const void *fdt,
>  {
>      int rc = 0;
>  
> -    if ( device_tree_node_matches(fdt, node, "memory") )
> +    /*
> +     * If Xen has been booted via UEFI, the memory banks will already
> +     * be populated. So we should skip the parsing.
> +     */
> +    if ( !efi_enabled(EFI_BOOT) &&
> +         device_tree_node_matches(fdt, node, "memory"))
>          rc = process_memory_node(fdt, node, name, depth,
>                                   address_cells, size_cells, &bootinfo.mem);
>      else if ( depth == 1 && !dt_node_cmp(name, "reserved-memory") )
> diff --git a/xen/arch/arm/efi/efi-boot.h b/xen/arch/arm/efi/efi-boot.h
> index cf9c37153f..d0a9987fa4 100644
> --- a/xen/arch/arm/efi/efi-boot.h
> +++ b/xen/arch/arm/efi/efi-boot.h
> @@ -197,33 +197,8 @@ EFI_STATUS __init fdt_add_uefi_nodes(EFI_SYSTEM_TABLE *sys_table,
>      int status;
>      u32 fdt_val32;
>      u64 fdt_val64;
> -    int prev;
>      int num_rsv;
>  
> -    /*
> -     * Delete any memory nodes present.  The EFI memory map is the only
> -     * memory description provided to Xen.
> -     */
> -    prev = 0;
> -    for (;;)
> -    {
> -        const char *type;
> -        int len;
> -
> -        node = fdt_next_node(fdt, prev, NULL);
> -        if ( node < 0 )
> -            break;
> -
> -        type = fdt_getprop(fdt, node, "device_type", &len);
> -        if ( type && strncmp(type, "memory", len) == 0 )
> -        {
> -            fdt_del_node(fdt, node);
> -            continue;
> -        }
> -
> -        prev = node;
> -    }
> -
>     /*
>      * Delete all memory reserve map entries. When booting via UEFI,
>      * kernel will use the UEFI memory map to find reserved regions.
> diff --git a/xen/include/xen/efi.h b/xen/include/xen/efi.h
> index 661a48286a..b52a4678e9 100644
> --- a/xen/include/xen/efi.h
> +++ b/xen/include/xen/efi.h
> @@ -47,6 +47,13 @@ int efi_runtime_call(struct xenpf_efi_runtime_call *);
>  int efi_compat_get_info(uint32_t idx, union compat_pf_efi_info *);
>  int efi_compat_runtime_call(struct compat_pf_efi_runtime_call *);
>  
> +#else
> +
> +static inline bool efi_enabled(unsigned int feature)
> +{
> +    return false;
> +}
> +
>  #endif /* CONFIG_EFI*/
>  
>  #endif /* !__ASSEMBLY__ */
> -- 
> 2.25.1
> 
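The stub pattern this patch relies on can be sketched as a standalone unit. Without CONFIG_EFI the predicate is a constant-false static inline, so common code keeps compiling and the compiler can discard EFI-only branches. The EFI_BOOT value here is illustrative, not Xen's real flag:

```c
#include <assert.h>
#include <stdbool.h>

#define EFI_BOOT 0   /* illustrative feature-bit index */

#ifdef CONFIG_EFI
bool efi_enabled(unsigned int feature);   /* real implementation elsewhere */
#else
/* The stub from the patch: always false on non-EFI builds. */
static inline bool efi_enabled(unsigned int feature)
{
    (void)feature;
    return false;
}
#endif

/* Mirrors the early_scan_node() guard: DT memory nodes are only
 * parsed when Xen was not booted via EFI. */
static bool should_parse_dt_memory_node(void)
{
    return !efi_enabled(EFI_BOOT);
}
```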



* RE: [PATCH 02/37] xen: introduce a Kconfig option to configure NUMA nodes number
  2021-09-23 23:45   ` Stefano Stabellini
@ 2021-09-24  1:24     ` Wei Chen
  0 siblings, 0 replies; 192+ messages in thread
From: Wei Chen @ 2021-09-24  1:24 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: xen-devel, julien, Bertrand Marquis

Hi Stefano,

> -----Original Message-----
> From: Stefano Stabellini <sstabellini@kernel.org>
> Sent: 2021年9月24日 7:45
> To: Wei Chen <Wei.Chen@arm.com>
> Cc: xen-devel@lists.xenproject.org; sstabellini@kernel.org; julien@xen.org;
> Bertrand Marquis <Bertrand.Marquis@arm.com>
> Subject: Re: [PATCH 02/37] xen: introduce a Kconfig option to configure
> NUMA nodes number
> 
> On Thu, 23 Sep 2021, Wei Chen wrote:
> > The current number of NUMA nodes is a hardcoded configuration,
> > which is difficult for an administrator to change without
> > changing the code.
> >
> > So in this patch, we introduce this new Kconfig option for
> > administrators to change the NUMA nodes number conveniently.
> > Also, considering that not all architectures support NUMA,
> > this Kconfig option is only visible on NUMA-enabled
> > architectures. Architectures without NUMA support can still
> > use 1 as MAX_NUMNODES.
> 
> This is OK but I think you should also mention in the commit message
> that you are taking the opportunity to remove NODES_SHIFT because it is
> currently unused.
> 
> With that:
> 
> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
> 
> 

Thanks, I will update it in next version.

> > Signed-off-by: Wei Chen <wei.chen@arm.com>
> > ---
> >  xen/arch/Kconfig           | 11 +++++++++++
> >  xen/include/asm-x86/numa.h |  2 --
> >  xen/include/xen/numa.h     | 10 +++++-----
> >  3 files changed, 16 insertions(+), 7 deletions(-)
> >
> > diff --git a/xen/arch/Kconfig b/xen/arch/Kconfig
> > index f16eb0df43..8a20da67ed 100644
> > --- a/xen/arch/Kconfig
> > +++ b/xen/arch/Kconfig
> > @@ -17,3 +17,14 @@ config NR_CPUS
> >  	  For CPU cores which support Simultaneous Multi-Threading or
> similar
> >  	  technologies, this the number of logical threads which Xen will
> >  	  support.
> > +
> > +config NR_NUMA_NODES
> > +	int "Maximum number of NUMA nodes supported"
> > +	range 1 4095
> > +	default "64"
> > +	depends on NUMA
> > +	help
> > +	  Controls the build-time size of various arrays and bitmaps
> > +	  associated with multiple-nodes management. It is the upper bound
> of
> > +	  the number of NUMA nodes the scheduler, memory allocation and
> other
> > +	  NUMA-aware components can handle.
> > diff --git a/xen/include/asm-x86/numa.h b/xen/include/asm-x86/numa.h
> > index bada2c0bb9..3cf26c2def 100644
> > --- a/xen/include/asm-x86/numa.h
> > +++ b/xen/include/asm-x86/numa.h
> > @@ -3,8 +3,6 @@
> >
> >  #include <xen/cpumask.h>
> >
> > -#define NODES_SHIFT 6
> > -
> >  typedef u8 nodeid_t;
> >
> >  extern int srat_rev;
> > diff --git a/xen/include/xen/numa.h b/xen/include/xen/numa.h
> > index 7aef1a88dc..52950a3150 100644
> > --- a/xen/include/xen/numa.h
> > +++ b/xen/include/xen/numa.h
> > @@ -3,14 +3,14 @@
> >
> >  #include <asm/numa.h>
> >
> > -#ifndef NODES_SHIFT
> > -#define NODES_SHIFT     0
> > -#endif
> > -
> >  #define NUMA_NO_NODE     0xFF
> >  #define NUMA_NO_DISTANCE 0xFF
> >
> > -#define MAX_NUMNODES    (1 << NODES_SHIFT)
> > +#ifdef CONFIG_NR_NUMA_NODES
> > +#define MAX_NUMNODES CONFIG_NR_NUMA_NODES
> > +#else
> > +#define MAX_NUMNODES    1
> > +#endif
> >
> >  #define vcpu_to_node(v) (cpu_to_node((v)->processor))
> >
> > --
> > 2.25.1
> >
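The MAX_NUMNODES fallback the quoted hunk introduces can be sketched on its own; CONFIG_NR_NUMA_NODES would normally come from Kconfig on NUMA-enabled architectures, and its absence here exercises the single-node fallback:

```c
#include <assert.h>

/* Sketch of the fallback in xen/include/xen/numa.h after the patch:
 * take the Kconfig value when present, otherwise assume one node. */
#ifdef CONFIG_NR_NUMA_NODES
#define MAX_NUMNODES CONFIG_NR_NUMA_NODES
#else
#define MAX_NUMNODES 1
#endif
```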


* Re: [PATCH 22/37] xen/arm: use NR_MEM_BANKS to override default NR_NODE_MEMBLKS
  2021-09-23 12:02 ` [PATCH 22/37] xen/arm: use NR_MEM_BANKS to override default NR_NODE_MEMBLKS Wei Chen
@ 2021-09-24  1:34   ` Stefano Stabellini
  2021-09-26 13:13     ` Wei Chen
  0 siblings, 1 reply; 192+ messages in thread
From: Stefano Stabellini @ 2021-09-24  1:34 UTC (permalink / raw)
  To: Wei Chen; +Cc: xen-devel, sstabellini, julien, Bertrand.Marquis

On Thu, 23 Sep 2021, Wei Chen wrote:
> A memory range described in device tree cannot be split across
> multiple nodes, so we define NR_NODE_MEMBLKS as NR_MEM_BANKS in
> the arch header.

This statement is true but what is the goal of this patch? Is it to
reduce code size and memory consumption?

I am asking because NR_MEM_BANKS is 128, and NR_NODE_MEMBLKS is
2*MAX_NUMNODES, where MAX_NUMNODES is 64 by default, so
NR_NODE_MEMBLKS is also 128 before this patch.

In other words, this patch alone doesn't make any difference; at least
doesn't make any difference unless CONFIG_NR_NUMA_NODES is increased.

So, is the goal to reduce memory usage when CONFIG_NR_NUMA_NODES is
higher than 64?
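The arithmetic behind this question can be checked in isolation (constants copied from this thread; a toy check, not Xen code):

```c
#include <assert.h>

/* Numbers quoted in the discussion: NR_MEM_BANKS is 128 on Arm, and
 * the default CONFIG_NR_NUMA_NODES is 64. */
#define NR_MEM_BANKS            128
#define MAX_NUMNODES            64                  /* Kconfig default */
#define NR_NODE_MEMBLKS_COMMON  (MAX_NUMNODES * 2)  /* common header */
#define NR_NODE_MEMBLKS_ARM     NR_MEM_BANKS        /* after this patch */
```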


> And keep default NR_NODE_MEMBLKS in common header
> for those architectures NUMA is disabled.

This last sentence is not accurate: on x86 NUMA is enabled and
NR_NODE_MEMBLKS is still defined in xen/include/xen/numa.h (there is no
x86 definition of it)


> Signed-off-by: Wei Chen <wei.chen@arm.com>
> ---
>  xen/include/asm-arm/numa.h | 8 +++++++-
>  xen/include/xen/numa.h     | 2 ++
>  2 files changed, 9 insertions(+), 1 deletion(-)
> 
> diff --git a/xen/include/asm-arm/numa.h b/xen/include/asm-arm/numa.h
> index 8f1c67e3eb..21569e634b 100644
> --- a/xen/include/asm-arm/numa.h
> +++ b/xen/include/asm-arm/numa.h
> @@ -3,9 +3,15 @@
>  
>  #include <xen/mm.h>
>  
> +#include <asm/setup.h>
> +
>  typedef u8 nodeid_t;
>  
> -#ifndef CONFIG_NUMA
> +#ifdef CONFIG_NUMA
> +
> +#define NR_NODE_MEMBLKS NR_MEM_BANKS
> +
> +#else
>  
>  /* Fake one node for now. See also node_online_map. */
>  #define cpu_to_node(cpu) 0
> diff --git a/xen/include/xen/numa.h b/xen/include/xen/numa.h
> index 1978e2be1b..1731e1cc6b 100644
> --- a/xen/include/xen/numa.h
> +++ b/xen/include/xen/numa.h
> @@ -12,7 +12,9 @@
>  #define MAX_NUMNODES    1
>  #endif
>  
> +#ifndef NR_NODE_MEMBLKS
>  #define NR_NODE_MEMBLKS (MAX_NUMNODES*2)
> +#endif
>  
>  #define vcpu_to_node(v) (cpu_to_node((v)->processor))
>  
> -- 
> 2.25.1
> 



* Re: [PATCH 23/37] xen/arm: implement node distance helpers for Arm
  2021-09-23 12:02 ` [PATCH 23/37] xen/arm: implement node distance helpers for Arm Wei Chen
@ 2021-09-24  1:46   ` Stefano Stabellini
  2021-09-24  4:41     ` Wei Chen
  0 siblings, 1 reply; 192+ messages in thread
From: Stefano Stabellini @ 2021-09-24  1:46 UTC (permalink / raw)
  To: Wei Chen; +Cc: xen-devel, sstabellini, julien, Bertrand.Marquis

On Thu, 23 Sep 2021, Wei Chen wrote:
> We will parse NUMA node distances from the device tree or ACPI
> table, so we need a matrix to record the distances between any
> two nodes we parsed. Accordingly, this patch provides the
> numa_set_distance API for device tree or ACPI table parsers to
> set the distance between any two nodes.
> When NUMA initialization has failed, __node_distance will return
> NUMA_REMOTE_DISTANCE; this helps us avoid rolling back the
> distance matrix when NUMA initialization fails.
> 
> Signed-off-by: Wei Chen <wei.chen@arm.com>
> ---
>  xen/arch/arm/Makefile      |  1 +
>  xen/arch/arm/numa.c        | 69 ++++++++++++++++++++++++++++++++++++++
>  xen/include/asm-arm/numa.h | 13 +++++++
>  3 files changed, 83 insertions(+)
>  create mode 100644 xen/arch/arm/numa.c
> 
> diff --git a/xen/arch/arm/Makefile b/xen/arch/arm/Makefile
> index ae4efbf76e..41ca311b6b 100644
> --- a/xen/arch/arm/Makefile
> +++ b/xen/arch/arm/Makefile
> @@ -35,6 +35,7 @@ obj-$(CONFIG_LIVEPATCH) += livepatch.o
>  obj-y += mem_access.o
>  obj-y += mm.o
>  obj-y += monitor.o
> +obj-$(CONFIG_NUMA) += numa.o
>  obj-y += p2m.o
>  obj-y += percpu.o
>  obj-y += platform.o
> diff --git a/xen/arch/arm/numa.c b/xen/arch/arm/numa.c
> new file mode 100644
> index 0000000000..3f08870d69
> --- /dev/null
> +++ b/xen/arch/arm/numa.c
> @@ -0,0 +1,69 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Arm Architecture support layer for NUMA.
> + *
> + * Copyright (C) 2021 Arm Ltd
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with this program. If not, see <http://www.gnu.org/licenses/>.
> + *
> + */
> +#include <xen/init.h>
> +#include <xen/numa.h>
> +
> +static uint8_t __read_mostly
> +node_distance_map[MAX_NUMNODES][MAX_NUMNODES] = {
> +    { 0 }
> +};
> +
> +void __init numa_set_distance(nodeid_t from, nodeid_t to, uint32_t distance)
> +{
> +    if ( from >= MAX_NUMNODES || to >= MAX_NUMNODES )
> +    {
> +        printk(KERN_WARNING
> +               "NUMA: invalid nodes: from=%"PRIu8" to=%"PRIu8" MAX=%"PRIu8"\n",
> +               from, to, MAX_NUMNODES);
> +        return;
> +    }
> +
> +    /* NUMA defines 0xff as an unreachable node and 0-9 are undefined */
> +    if ( distance >= NUMA_NO_DISTANCE ||
> +        (distance >= NUMA_DISTANCE_UDF_MIN &&
> +         distance <= NUMA_DISTANCE_UDF_MAX) ||
> +        (from == to && distance != NUMA_LOCAL_DISTANCE) )
> +    {
> +        printk(KERN_WARNING
> +               "NUMA: invalid distance: from=%"PRIu8" to=%"PRIu8" distance=%"PRIu32"\n",
> +               from, to, distance);
> +        return;
> +    }
> +
> +    node_distance_map[from][to] = distance;
> +}
> +
> +uint8_t __node_distance(nodeid_t from, nodeid_t to)
> +{
> +    /* When NUMA is off, any distance will be treated as remote. */
> +    if ( srat_disabled() )

Given that this is Arm-specific code and that SRAT is an ACPI-specific
concept, I don't think we should have any call to something called
"srat_disabled" here.

I suggest renaming srat_disabled to numa_distance_disabled.

Other than that, this patch looks OK to me.


> +        return NUMA_REMOTE_DISTANCE;
> +
> +    /*
> +     * Check whether the nodes are in the matrix range.
> +     * When any node is out of range, except from and to nodes are the
> +     * same, we treat them as unreachable (return 0xFF)
> +     */
> +    if ( from >= MAX_NUMNODES || to >= MAX_NUMNODES )
> +        return from == to ? NUMA_LOCAL_DISTANCE : NUMA_NO_DISTANCE;
> +
> +    return node_distance_map[from][to];
> +}
> +EXPORT_SYMBOL(__node_distance);
> diff --git a/xen/include/asm-arm/numa.h b/xen/include/asm-arm/numa.h
> index 21569e634b..758eafeb05 100644
> --- a/xen/include/asm-arm/numa.h
> +++ b/xen/include/asm-arm/numa.h
> @@ -9,8 +9,21 @@ typedef u8 nodeid_t;
>  
>  #ifdef CONFIG_NUMA
>  
> +/*
> + * In ACPI spec, 0-9 are the reserved values for node distance,
> + * 10 indicates local node distance, 20 indicates remote node
> + * distance. Set node distance map in device tree will follow
> + * the ACPI's definition.
> + */
> +#define NUMA_DISTANCE_UDF_MIN   0
> +#define NUMA_DISTANCE_UDF_MAX   9
> +#define NUMA_LOCAL_DISTANCE     10
> +#define NUMA_REMOTE_DISTANCE    20
> +
>  #define NR_NODE_MEMBLKS NR_MEM_BANKS
>  
> +extern void numa_set_distance(nodeid_t from, nodeid_t to, uint32_t distance);
> +
>  #else
>  
>  /* Fake one node for now. See also node_online_map. */
> -- 
> 2.25.1
> 


^ permalink raw reply	[flat|nested] 192+ messages in thread

* RE: [PATCH 04/37] xen: introduce an arch helper for default dma zone status
  2021-09-23 23:55   ` Stefano Stabellini
@ 2021-09-24  1:50     ` Wei Chen
  0 siblings, 0 replies; 192+ messages in thread
From: Wei Chen @ 2021-09-24  1:50 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: xen-devel, julien, Bertrand Marquis

Hi Stefano,

> -----Original Message-----
> From: Stefano Stabellini <sstabellini@kernel.org>
> Sent: September 24, 2021 7:56
> To: Wei Chen <Wei.Chen@arm.com>
> Cc: xen-devel@lists.xenproject.org; sstabellini@kernel.org; julien@xen.org;
> Bertrand Marquis <Bertrand.Marquis@arm.com>
> Subject: Re: [PATCH 04/37] xen: introduce an arch helper for default dma
> zone status
> 
> On Thu, 23 Sep 2021, Wei Chen wrote:
> > In current code, when Xen is running in a multiple nodes NUMA
> > system, it will set dma_bitsize in end_boot_allocator to reserve
> > some low address memory for DMA.
> >
> > There are some x86 implications in current implementation. Becuase
>                                     ^ the                    ^Because
> 
> > on x86, memory starts from 0. On a multiple nodes NUMA system, if
> > a single node contains the majority or all of the DMA memory. x86
>                                                               ^,
> 
> > prefer to give out memory from non-local allocations rather than
> > exhausting the DMA memory ranges. Hence x86 use dma_bitsize to set
> > aside some largely arbitrary amount memory for DMA memory ranges.
>                                      ^ of memory
> 
> > The allocations from these memory ranges would happen only after
> > exhausting all other nodes' memory.
> >
> > But the implications are not shared across all architectures. For
> > example, Arm doesn't have these implications. So in this patch, we
> > introduce an arch_have_default_dmazone helper for arch to determine
> > that it need to set dma_bitsize for reserve DMA allocations or not.
>           ^ needs
> 

I will fix the above typos in the next version.

> >
> > Signed-off-by: Wei Chen <wei.chen@arm.com>
> > ---
> >  xen/arch/x86/numa.c        | 5 +++++
> >  xen/common/page_alloc.c    | 2 +-
> >  xen/include/asm-arm/numa.h | 5 +++++
> >  xen/include/asm-x86/numa.h | 1 +
> >  4 files changed, 12 insertions(+), 1 deletion(-)
> >
> > diff --git a/xen/arch/x86/numa.c b/xen/arch/x86/numa.c
> > index ce79ee44ce..1fabbe8281 100644
> > --- a/xen/arch/x86/numa.c
> > +++ b/xen/arch/x86/numa.c
> > @@ -371,6 +371,11 @@ unsigned int __init arch_get_dma_bitsize(void)
> >                   + PAGE_SHIFT, 32);
> >  }
> >
> > +unsigned int arch_have_default_dmazone(void)
> 
> Can this function return bool?
> Also, can it be a static inline?
> 

Yes, bool would be better. I will place a static inline in asm/numa.h,
because Arm will have another static inline implementation.

> 
> > +{
> > +    return ( num_online_nodes() > 1 ) ? 1 : 0;
> > +}
> > +
> >  static void dump_numa(unsigned char key)
> >  {
> >      s_time_t now = NOW();
> > diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c
> > index 5801358b4b..80916205e5 100644
> > --- a/xen/common/page_alloc.c
> > +++ b/xen/common/page_alloc.c
> > @@ -1889,7 +1889,7 @@ void __init end_boot_allocator(void)
> >      }
> >      nr_bootmem_regions = 0;
> >
> > -    if ( !dma_bitsize && (num_online_nodes() > 1) )
> > +    if ( !dma_bitsize && arch_have_default_dmazone() )
> >          dma_bitsize = arch_get_dma_bitsize();
> >
> >      printk("Domain heap initialised");
> > diff --git a/xen/include/asm-arm/numa.h b/xen/include/asm-arm/numa.h
> > index 31a6de4e23..9d5739542d 100644
> > --- a/xen/include/asm-arm/numa.h
> > +++ b/xen/include/asm-arm/numa.h
> > @@ -25,6 +25,11 @@ extern mfn_t first_valid_mfn;
> >  #define node_start_pfn(nid) (mfn_x(first_valid_mfn))
> >  #define __node_distance(a, b) (20)
> >
> > +static inline unsigned int arch_have_default_dmazone(void)
> > +{
> > +    return 0;
> > +}
> > +
> >  #endif /* __ARCH_ARM_NUMA_H */
> >  /*
> >   * Local variables:
> > diff --git a/xen/include/asm-x86/numa.h b/xen/include/asm-x86/numa.h
> > index 3cf26c2def..8060cbf3f4 100644
> > --- a/xen/include/asm-x86/numa.h
> > +++ b/xen/include/asm-x86/numa.h
> > @@ -78,5 +78,6 @@ extern int valid_numa_range(u64 start, u64 end,
> nodeid_t node);
> >  void srat_parse_regions(u64 addr);
> >  extern u8 __node_distance(nodeid_t a, nodeid_t b);
> >  unsigned int arch_get_dma_bitsize(void);
> > +unsigned int arch_have_default_dmazone(void);
> >
> >  #endif
> > --
> > 2.25.1
> >


* Re: [PATCH 24/37] xen/arm: implement two arch helpers to get memory map info
  2021-09-23 12:02 ` [PATCH 24/37] xen/arm: implement two arch helpers to get memory map info Wei Chen
@ 2021-09-24  2:06   ` Stefano Stabellini
  2021-09-24  4:42     ` Wei Chen
  0 siblings, 1 reply; 192+ messages in thread
From: Stefano Stabellini @ 2021-09-24  2:06 UTC (permalink / raw)
  To: Wei Chen; +Cc: xen-devel, sstabellini, julien, Bertrand.Marquis

On Thu, 23 Sep 2021, Wei Chen wrote:
> These two helpers are architecture APIs that are required by
> nodes_cover_memory.
> 
> Signed-off-by: Wei Chen <wei.chen@arm.com>
> ---
>  xen/arch/arm/numa.c | 14 ++++++++++++++
>  1 file changed, 14 insertions(+)
> 
> diff --git a/xen/arch/arm/numa.c b/xen/arch/arm/numa.c
> index 3f08870d69..3755b01ef4 100644
> --- a/xen/arch/arm/numa.c
> +++ b/xen/arch/arm/numa.c
> @@ -67,3 +67,17 @@ uint8_t __node_distance(nodeid_t from, nodeid_t to)
>      return node_distance_map[from][to];
>  }
>  EXPORT_SYMBOL(__node_distance);
> +
> +uint32_t __init arch_meminfo_get_nr_bank(void)
> +{
> +	return bootinfo.mem.nr_banks;
> +}
> +
> +int __init arch_meminfo_get_ram_bank_range(uint32_t bank,
> +	paddr_t *start, paddr_t *end)
> +{
> +	*start = bootinfo.mem.bank[bank].start;
> +	*end = bootinfo.mem.bank[bank].start + bootinfo.mem.bank[bank].size;
> +
> +	return 0;
> +}

The rest of the file is indented using spaces, while this patch is using
tabs.

Also, given the implementation, it looks like
arch_meminfo_get_ram_bank_range should either return void or bool.



* Re: [PATCH 25/37] xen/arm: implement bad_srat for Arm NUMA initialization
  2021-09-23 12:02 ` [PATCH 25/37] xen/arm: implement bad_srat for Arm NUMA initialization Wei Chen
@ 2021-09-24  2:09   ` Stefano Stabellini
  2021-09-24  4:45     ` Wei Chen
  2021-09-24  8:07     ` Jan Beulich
  0 siblings, 2 replies; 192+ messages in thread
From: Stefano Stabellini @ 2021-09-24  2:09 UTC (permalink / raw)
  To: Wei Chen
  Cc: xen-devel, sstabellini, julien, Bertrand.Marquis, jbeulich,
	andrew.cooper3, roger.pau, wl

On Thu, 23 Sep 2021, Wei Chen wrote:
> NUMA initialization will parse information from firmware provided
> static resource affinity table (ACPI SRAT or DTB). bad_srat if a
> function that will be used when initialization code encounters
> some unexcepted errors.
> 
> In this patch, we introduce Arm version bad_srat for NUMA common
> initialization code to invoke it.
> 
> Signed-off-by: Wei Chen <wei.chen@arm.com>
> ---
>  xen/arch/arm/numa.c | 7 +++++++
>  1 file changed, 7 insertions(+)
> 
> diff --git a/xen/arch/arm/numa.c b/xen/arch/arm/numa.c
> index 3755b01ef4..5209d3de4d 100644
> --- a/xen/arch/arm/numa.c
> +++ b/xen/arch/arm/numa.c
> @@ -18,6 +18,7 @@
>   *
>   */
>  #include <xen/init.h>
> +#include <xen/nodemask.h>
>  #include <xen/numa.h>
>  
>  static uint8_t __read_mostly
> @@ -25,6 +26,12 @@ node_distance_map[MAX_NUMNODES][MAX_NUMNODES] = {
>      { 0 }
>  };
>  
> +__init void bad_srat(void)
> +{
> +    printk(KERN_ERR "NUMA: Firmware SRAT table not used.\n");
> +    fw_numa = -1;
> +}

I realize that the series keeps the "srat" terminology everywhere, even
in the DT code. I wonder if it is worth replacing srat with something
like "numa_distance" everywhere as appropriate. I am adding the x86
maintainers for an opinion.

If you guys prefer to keep "srat" (if nothing else, it is concise), I
am also OK with that, although it is not technically accurate.



* Re: [PATCH 26/37] xen/arm: build NUMA cpu_to_node map in dt_smp_init_cpus
  2021-09-23 12:02 ` [PATCH 26/37] xen/arm: build NUMA cpu_to_node map in dt_smp_init_cpus Wei Chen
@ 2021-09-24  2:26   ` Stefano Stabellini
  2021-09-24  4:25     ` Wei Chen
  0 siblings, 1 reply; 192+ messages in thread
From: Stefano Stabellini @ 2021-09-24  2:26 UTC (permalink / raw)
  To: Wei Chen; +Cc: xen-devel, sstabellini, julien, Bertrand.Marquis

On Thu, 23 Sep 2021, Wei Chen wrote:
> NUMA implementation has a cpu_to_node array to store CPU to NODE
> map. Xen is using CPU logical ID in runtime components, so we
> use CPU logical ID as CPU index in cpu_to_node.
> 
> In device tree case, cpu_logical_map is created in dt_smp_init_cpus.
> So, when NUMA is enabled, dt_smp_init_cpus will fetch CPU NUMA id
> at the same time for cpu_to_node.
> 
> Signed-off-by: Wei Chen <wei.chen@arm.com>
> ---
>  xen/arch/arm/smpboot.c     | 37 ++++++++++++++++++++++++++++++++++++-
>  xen/include/asm-arm/numa.h |  5 +++++
>  2 files changed, 41 insertions(+), 1 deletion(-)
> 
> diff --git a/xen/arch/arm/smpboot.c b/xen/arch/arm/smpboot.c
> index 60c0e82fc5..6e3cc8d3cc 100644
> --- a/xen/arch/arm/smpboot.c
> +++ b/xen/arch/arm/smpboot.c
> @@ -121,7 +121,12 @@ static void __init dt_smp_init_cpus(void)
>      {
>          [0 ... NR_CPUS - 1] = MPIDR_INVALID
>      };
> +    static nodeid_t node_map[NR_CPUS] __initdata =
> +    {
> +        [0 ... NR_CPUS - 1] = NUMA_NO_NODE
> +    };
>      bool bootcpu_valid = false;
> +    uint32_t nid = 0;
>      int rc;
>  
>      mpidr = system_cpuinfo.mpidr.bits & MPIDR_HWID_MASK;
> @@ -172,6 +177,28 @@ static void __init dt_smp_init_cpus(void)
>              continue;
>          }
>  
> +        if ( IS_ENABLED(CONFIG_NUMA) )
> +        {
> +            /*
> +             * When CONFIG_NUMA is set, try to fetch numa infomation
> +             * from CPU dts node, otherwise the nid is always 0.
> +             */
> +            if ( !dt_property_read_u32(cpu, "numa-node-id", &nid) )
> +            {
> +                printk(XENLOG_WARNING
> +                       "cpu[%d] dts path: %s: doesn't have numa information!\n",
                               ^ %u


> +                       cpuidx, dt_node_full_name(cpu));

I think that this message shouldn't be a warning: CONFIG_NUMA is a
compile-time option. Anybody who enables CONFIG_NUMA in the Xen build
will get this warning printed at boot time whenever Xen boots on a
regular non-NUMA machine, right?

The warning should only be printed if NUMA is actively enabled, e.g.
there is a distance-map but the cpus don't have numa-node-id.

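For context, the case described above corresponds to device tree
content like the following fragment (illustrative only, following the
standard "numa-distance-map-v1" binding with <from to distance>
triplets; dropping a "numa-node-id" property while keeping the
distance-map reproduces the scenario where a warning is justified):

```dts
/ {
        cpus {
                cpu@0 {
                        device_type = "cpu";
                        reg = <0x0>;
                        numa-node-id = <0>;
                };
                cpu@100 {
                        device_type = "cpu";
                        reg = <0x100>;
                        numa-node-id = <1>;
                };
        };

        distance-map {
                compatible = "numa-distance-map-v1";
                distance-matrix = <0 0 10>,
                                  <0 1 20>,
                                  <1 0 20>,
                                  <1 1 10>;
        };
};
```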


> +                /*
> +                 * During the early stage of NUMA initialization, when Xen
> +                 * found any CPU dts node doesn't have numa-node-id info, the
> +                 * NUMA will be treated as off, all CPU will be set to a FAKE
> +                 * node 0. So if we get numa-node-id failed here, we should
> +                 * set nid to 0.
> +                 */
> +                nid = 0;
> +            }
> +        }
> +
>          /*
>           * 8 MSBs must be set to 0 in the DT since the reg property
>           * defines the MPIDR[23:0]
> @@ -231,9 +258,12 @@ static void __init dt_smp_init_cpus(void)
>          {
>              printk("cpu%d init failed (hwid %"PRIregister"): %d\n", i, hwid, rc);
>              tmp_map[i] = MPIDR_INVALID;
> +            node_map[i] = NUMA_NO_NODE;
>          }
> -        else
> +        else {
>              tmp_map[i] = hwid;
> +            node_map[i] = nid;
> +        }
>      }
>  
>      if ( !bootcpu_valid )
> @@ -249,6 +279,11 @@ static void __init dt_smp_init_cpus(void)
>              continue;
>          cpumask_set_cpu(i, &cpu_possible_map);
>          cpu_logical_map(i) = tmp_map[i];
> +
> +        nid = node_map[i];
> +        if ( nid >= MAX_NUMNODES )
> +            nid = 0;
> +        numa_set_node(i, nid);
>      }
>  }
>  
> diff --git a/xen/include/asm-arm/numa.h b/xen/include/asm-arm/numa.h
> index 758eafeb05..8a4ad379e0 100644
> --- a/xen/include/asm-arm/numa.h
> +++ b/xen/include/asm-arm/numa.h
> @@ -46,6 +46,11 @@ extern mfn_t first_valid_mfn;
>  #define node_start_pfn(nid) (mfn_x(first_valid_mfn))
>  #define __node_distance(a, b) (20)
>  
> +static inline void numa_set_node(int cpu, nodeid_t node)
> +{
> +
> +}
> +
>  #endif
>  
>  static inline unsigned int arch_have_default_dmazone(void)
> -- 
> 2.25.1
> 



* Re: [PATCH 28/37] xen/arm: stub memory hotplug access helpers for Arm
  2021-09-23 12:02 ` [PATCH 28/37] xen/arm: stub memory hotplug access helpers for Arm Wei Chen
@ 2021-09-24  2:33   ` Stefano Stabellini
  2021-09-24  4:26     ` Wei Chen
  0 siblings, 1 reply; 192+ messages in thread
From: Stefano Stabellini @ 2021-09-24  2:33 UTC (permalink / raw)
  To: Wei Chen; +Cc: xen-devel, sstabellini, julien, Bertrand.Marquis

On Thu, 23 Sep 2021, Wei Chen wrote:
> Common code in NUMA need these two helpers to access/update
> memory hotplug end address. Arm has not support memory hotplug
> yet. So we stub these two helpers in this patch to make NUMA
> common code happy.
> 
> Signed-off-by: Wei Chen <wei.chen@arm.com>
> ---
>  xen/include/asm-arm/mm.h | 10 ++++++++++
>  1 file changed, 10 insertions(+)
> 
> diff --git a/xen/include/asm-arm/mm.h b/xen/include/asm-arm/mm.h
> index 7b5e7b7f69..fc9433165d 100644
> --- a/xen/include/asm-arm/mm.h
> +++ b/xen/include/asm-arm/mm.h
> @@ -362,6 +362,16 @@ void clear_and_clean_page(struct page_info *page);
>  
>  unsigned int arch_get_dma_bitsize(void);
>  
> +static inline void mem_hotplug_update_boundary(paddr_t end)
> +{
> +
> +}
> +
> +static inline paddr_t mem_hotplug_boundary(void)
> +{
> +    return 0;
> +}

Why zero? Could it be INVALID_PADDR ?



* Re: [PATCH 29/37] xen/arm: introduce a helper to parse device tree processor node
  2021-09-23 12:02 ` [PATCH 29/37] xen/arm: introduce a helper to parse device tree processor node Wei Chen
@ 2021-09-24  2:44   ` Stefano Stabellini
  2021-09-24  4:46     ` Wei Chen
  0 siblings, 1 reply; 192+ messages in thread
From: Stefano Stabellini @ 2021-09-24  2:44 UTC (permalink / raw)
  To: Wei Chen; +Cc: xen-devel, sstabellini, julien, Bertrand.Marquis

On Thu, 23 Sep 2021, Wei Chen wrote:
> Processor NUMA ID information is stored in device tree's processor
> node as "numa-node-id". We need a new helper to parse this ID from
> processor node. If we get this ID from processor node, this ID's
> validity still need to be checked. Once we got a invalid NUMA ID
> from any processor node, the device tree will be marked as NUMA
> information invalid.
> 
> Signed-off-by: Wei Chen <wei.chen@arm.com>
> ---
>  xen/arch/arm/Makefile           |  1 +
>  xen/arch/arm/numa_device_tree.c | 58 +++++++++++++++++++++++++++++++++
>  2 files changed, 59 insertions(+)
>  create mode 100644 xen/arch/arm/numa_device_tree.c
> 
> diff --git a/xen/arch/arm/Makefile b/xen/arch/arm/Makefile
> index 41ca311b6b..c50df2c25d 100644
> --- a/xen/arch/arm/Makefile
> +++ b/xen/arch/arm/Makefile
> @@ -36,6 +36,7 @@ obj-y += mem_access.o
>  obj-y += mm.o
>  obj-y += monitor.o
>  obj-$(CONFIG_NUMA) += numa.o
> +obj-$(CONFIG_DEVICE_TREE_NUMA) += numa_device_tree.o
>  obj-y += p2m.o
>  obj-y += percpu.o
>  obj-y += platform.o
> diff --git a/xen/arch/arm/numa_device_tree.c b/xen/arch/arm/numa_device_tree.c
> new file mode 100644
> index 0000000000..2428fbae0b
> --- /dev/null
> +++ b/xen/arch/arm/numa_device_tree.c
> @@ -0,0 +1,58 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Arm Architecture support layer for NUMA.
> + *
> + * Copyright (C) 2021 Arm Ltd
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with this program. If not, see <http://www.gnu.org/licenses/>.
> + *
> + */
> +#include <xen/init.h>
> +#include <xen/nodemask.h>
> +#include <xen/numa.h>
> +#include <xen/libfdt/libfdt.h>
> +#include <xen/device_tree.h>
> +
> +/* Callback for device tree processor affinity */
> +static int __init fdt_numa_processor_affinity_init(nodeid_t node)
> +{
> +    if ( srat_disabled() )
> +        return -EINVAL;

fdt_numa_processor_affinity_init is called by fdt_parse_numa_cpu_node
which is already parsing NUMA related info. Should this srat_disabled
check be moved to fdt_parse_numa_cpu_node?


> +    else if ( node == NUMA_NO_NODE || node >= MAX_NUMNODES )
> +    {
> +        bad_srat();
> +        return -EINVAL;
> +	}
> +
> +    numa_set_processor_nodes_parsed(node);
> +    fw_numa = 1;
> +
> +    printk(KERN_INFO "DT: NUMA node %"PRIu7" processor parsed\n", node);
> +
> +    return 0;
> +}
> +
> +/* Parse CPU NUMA node info */
> +static int __init fdt_parse_numa_cpu_node(const void *fdt, int node)
> +{
> +    uint32_t nid;
> +
> +    nid = device_tree_get_u32(fdt, node, "numa-node-id", MAX_NUMNODES);
> +    if ( nid >= MAX_NUMNODES )
> +    {
> +        printk(XENLOG_ERR "Node id %u exceeds maximum value\n", nid);
                                      ^ PRIu32


> +        return -EINVAL;
> +    }
> +
> +    return fdt_numa_processor_affinity_init(nid);
> +}
> -- 
> 2.25.1
> 



* RE: [PATCH 07/37] xen/x86: use paddr_t for addresses in NUMA node structure
  2021-09-24  0:13     ` Stefano Stabellini
@ 2021-09-24  3:00       ` Wei Chen
  0 siblings, 0 replies; 192+ messages in thread
From: Wei Chen @ 2021-09-24  3:00 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: xen-devel, julien, Bertrand Marquis, jbeulich, andrew.cooper3,
	roger.pau, wl



> -----Original Message-----
> From: Stefano Stabellini <sstabellini@kernel.org>
> Sent: September 24, 2021 8:14
> To: Stefano Stabellini <sstabellini@kernel.org>
> Cc: Wei Chen <Wei.Chen@arm.com>; xen-devel@lists.xenproject.org;
> julien@xen.org; Bertrand Marquis <Bertrand.Marquis@arm.com>;
> jbeulich@suse.com; andrew.cooper3@citrix.com; roger.pau@citrix.com;
> wl@xen.org
> Subject: Re: [PATCH 07/37] xen/x86: use paddr_t for addresses in NUMA node
> structure
> 
> You forgot to add the x86 maintainers in CC to all the patches touching
> x86 code in this series. Adding them now but you should probably resend.
>

I am very sorry about it. I realized the problem when I pressed Enter.
I had wanted to repost the series at that time, but I wasn't sure
whether reposting these patches would come across as spamming...

I will resend this series ASAP with some changes to address your comments.

> 
> On Thu, 23 Sep 2021, Stefano Stabellini wrote:
> > On Thu, 23 Sep 2021, Wei Chen wrote:
> > > NUMA node structure "struct node" is using u64 as node memory
> > > range. In order to make other architectures can reuse this
> > > NUMA node relative code, we replace the u64 to paddr_t. And
> > > use pfn_to_paddr and paddr_to_pfn to replace explicit shift
> > > operations. The relate PRIx64 in print messages have been
> > > replaced by PRIpaddr at the same time.
> > >
> > > Signed-off-by: Wei Chen <wei.chen@arm.com>
> > > ---
> > >  xen/arch/x86/numa.c        | 32 +++++++++++++++++---------------
> > >  xen/arch/x86/srat.c        | 26 +++++++++++++-------------
> > >  xen/include/asm-x86/numa.h |  8 ++++----
> > >  3 files changed, 34 insertions(+), 32 deletions(-)
> > >
> > > diff --git a/xen/arch/x86/numa.c b/xen/arch/x86/numa.c
> > > index 1fabbe8281..6337bbdf31 100644
> > > --- a/xen/arch/x86/numa.c
> > > +++ b/xen/arch/x86/numa.c
> > > @@ -165,12 +165,12 @@ int __init compute_hash_shift(struct node *nodes,
> int numnodes,
> > >      return shift;
> > >  }
> > >  /* initialize NODE_DATA given nodeid and start/end */
> > > -void __init setup_node_bootmem(nodeid_t nodeid, u64 start, u64 end)
> > > -{
> > > +void __init setup_node_bootmem(nodeid_t nodeid, paddr_t start,
> paddr_t end)
> > > +{
> > >      unsigned long start_pfn, end_pfn;
> > >
> > > -    start_pfn = start >> PAGE_SHIFT;
> > > -    end_pfn = end >> PAGE_SHIFT;
> > > +    start_pfn = paddr_to_pfn(start);
> > > +    end_pfn = paddr_to_pfn(end);
> > >
> > >      NODE_DATA(nodeid)->node_start_pfn = start_pfn;
> > >      NODE_DATA(nodeid)->node_spanned_pages = end_pfn - start_pfn;
> > > @@ -201,11 +201,12 @@ void __init numa_init_array(void)
> > >  static int numa_fake __initdata = 0;
> > >
> > >  /* Numa emulation */
> > > -static int __init numa_emulation(u64 start_pfn, u64 end_pfn)
> > > +static int __init numa_emulation(unsigned long start_pfn,
> > > +                                 unsigned long end_pfn)
> >
> > Why not changing numa_emulation to take paddr_t too?
> >

The numa_emulation parameters are PFNs, not addresses. I discussed
PFNs with Julien in the RFC; he suggested using mfn_t or unsigned long
for PFNs. Compared to mfn_t, unsigned long requires fewer changes.

> >
> > >  {
> > >      int i;
> > >      struct node nodes[MAX_NUMNODES];
> > > -    u64 sz = ((end_pfn - start_pfn)<<PAGE_SHIFT) / numa_fake;
> > > +    u64 sz = pfn_to_paddr(end_pfn - start_pfn) / numa_fake;
> > >
> > >      /* Kludge needed for the hash function */
> > >      if ( hweight64(sz) > 1 )
> > > @@ -221,9 +222,9 @@ static int __init numa_emulation(u64 start_pfn,
> u64 end_pfn)
> > >      memset(&nodes,0,sizeof(nodes));
> > >      for ( i = 0; i < numa_fake; i++ )
> > >      {
> > > -        nodes[i].start = (start_pfn<<PAGE_SHIFT) + i*sz;
> > > +        nodes[i].start = pfn_to_paddr(start_pfn) + i*sz;
> > >          if ( i == numa_fake - 1 )
> > > -            sz = (end_pfn<<PAGE_SHIFT) - nodes[i].start;
> > > +            sz = pfn_to_paddr(end_pfn) - nodes[i].start;
> > >          nodes[i].end = nodes[i].start + sz;
> > >          printk(KERN_INFO "Faking node %d at %"PRIx64"-%"PRIx64"
> (%"PRIu64"MB)\n",
> > >                 i,
> > > @@ -249,24 +250,26 @@ static int __init numa_emulation(u64 start_pfn,
> u64 end_pfn)
> > >  void __init numa_initmem_init(unsigned long start_pfn, unsigned long
> end_pfn)
> >
> > same here
> >
> >
> > >  {
> > >      int i;
> > > +    paddr_t start, end;
> > >
> > >  #ifdef CONFIG_NUMA_EMU
> > >      if ( numa_fake && !numa_emulation(start_pfn, end_pfn) )
> > >          return;
> > >  #endif
> > >
> > > +    start = pfn_to_paddr(start_pfn);
> > > +    end = pfn_to_paddr(end_pfn);
> > > +
> > >  #ifdef CONFIG_ACPI_NUMA
> > > -    if ( !numa_off && !acpi_scan_nodes((u64)start_pfn << PAGE_SHIFT,
> > > -         (u64)end_pfn << PAGE_SHIFT) )
> > > +    if ( !numa_off && !acpi_scan_nodes(start, end) )
> > >          return;
> > >  #endif
> > >
> > >      printk(KERN_INFO "%s\n",
> > >             numa_off ? "NUMA turned off" : "No NUMA configuration
> found");
> > >
> > > -    printk(KERN_INFO "Faking a node at %016"PRIx64"-%016"PRIx64"\n",
> > > -           (u64)start_pfn << PAGE_SHIFT,
> > > -           (u64)end_pfn << PAGE_SHIFT);
> > > +    printk(KERN_INFO "Faking a node at %016"PRIpaddr"-
> %016"PRIpaddr"\n",
> > > +           start, end);
> > >      /* setup dummy node covering all memory */
> > >      memnode_shift = BITS_PER_LONG - 1;
> > >      memnodemap = _memnodemap;
> > > @@ -279,8 +282,7 @@ void __init numa_initmem_init(unsigned long
> start_pfn, unsigned long end_pfn)
> > >      for ( i = 0; i < nr_cpu_ids; i++ )
> > >          numa_set_node(i, 0);
> > >      cpumask_copy(&node_to_cpumask[0], cpumask_of(0));
> > > -    setup_node_bootmem(0, (u64)start_pfn << PAGE_SHIFT,
> > > -                    (u64)end_pfn << PAGE_SHIFT);
> > > +    setup_node_bootmem(0, start, end);
> > >  }
> > >
> > >  void numa_add_cpu(int cpu)
> > > diff --git a/xen/arch/x86/srat.c b/xen/arch/x86/srat.c
> > > index 6b77b98201..7d20d7f222 100644
> > > --- a/xen/arch/x86/srat.c
> > > +++ b/xen/arch/x86/srat.c
> > > @@ -104,7 +104,7 @@ nodeid_t setup_node(unsigned pxm)
> > >  	return node;
> > >  }
> > >
> > > -int valid_numa_range(u64 start, u64 end, nodeid_t node)
> > > +int valid_numa_range(paddr_t start, paddr_t end, nodeid_t node)
> > >  {
> > >  	int i;
> > >
> > > @@ -119,7 +119,7 @@ int valid_numa_range(u64 start, u64 end, nodeid_t
> node)
> > >  	return 0;
> > >  }
> > >
> > > -static __init int conflicting_memblks(u64 start, u64 end)
> > > +static __init int conflicting_memblks(paddr_t start, paddr_t end)
> > >  {
> > >  	int i;
> > >
> > > @@ -135,7 +135,7 @@ static __init int conflicting_memblks(u64 start,
> u64 end)
> > >  	return -1;
> > >  }
> > >
> > > -static __init void cutoff_node(int i, u64 start, u64 end)
> > > +static __init void cutoff_node(int i, paddr_t start, paddr_t end)
> > >  {
> > >  	struct node *nd = &nodes[i];
> > >  	if (nd->start < start) {
> > > @@ -275,7 +275,7 @@ acpi_numa_processor_affinity_init(const struct
> acpi_srat_cpu_affinity *pa)
> > >  void __init
> > >  acpi_numa_memory_affinity_init(const struct acpi_srat_mem_affinity
> *ma)
> > >  {
> > > -	u64 start, end;
> > > +	paddr_t start, end;
> > >  	unsigned pxm;
> > >  	nodeid_t node;
> > >  	int i;
> > > @@ -318,7 +318,7 @@ acpi_numa_memory_affinity_init(const struct
> acpi_srat_mem_affinity *ma)
> > >  		bool mismatch = !(ma->flags & ACPI_SRAT_MEM_HOT_PLUGGABLE) !=
> > >  		                !test_bit(i, memblk_hotplug);
> > >
> > > -		printk("%sSRAT: PXM %u (%"PRIx64"-%"PRIx64") overlaps with
> itself (%"PRIx64"-%"PRIx64")\n",
> > > +		printk("%sSRAT: PXM %u (%"PRIpaddr"-%"PRIpaddr") overlaps with
> itself (%"PRIpaddr"-%"PRIpaddr")\n",
> > >  		       mismatch ? KERN_ERR : KERN_WARNING, pxm, start, end,
> > >  		       node_memblk_range[i].start, node_memblk_range[i].end);
> > >  		if (mismatch) {
> > > @@ -327,7 +327,7 @@ acpi_numa_memory_affinity_init(const struct
> acpi_srat_mem_affinity *ma)
> > >  		}
> > >  	} else {
> > >  		printk(KERN_ERR
> > > -		       "SRAT: PXM %u (%"PRIx64"-%"PRIx64") overlaps with
> PXM %u (%"PRIx64"-%"PRIx64")\n",
> > > +		       "SRAT: PXM %u (%"PRIpaddr"-%"PRIpaddr") overlaps with
> PXM %u (%"PRIpaddr"-%"PRIpaddr")\n",
> > >  		       pxm, start, end, node_to_pxm(memblk_nodeid[i]),
> > >  		       node_memblk_range[i].start, node_memblk_range[i].end);
> > >  		bad_srat();
> > > @@ -346,7 +346,7 @@ acpi_numa_memory_affinity_init(const struct
> acpi_srat_mem_affinity *ma)
> > >  				nd->end = end;
> > >  		}
> > >  	}
> > > -	printk(KERN_INFO "SRAT: Node %u PXM %u %"PRIx64"-%"PRIx64"%s\n",
> > > +	printk(KERN_INFO "SRAT: Node %u PXM %u %"PRIpaddr"-%"PRIpaddr"%s\n",
> > >  	       node, pxm, start, end,
> > >  	       ma->flags & ACPI_SRAT_MEM_HOT_PLUGGABLE ? " (hotplug)" : "");
> > >
> > > @@ -369,7 +369,7 @@ static int __init nodes_cover_memory(void)
> > >
> > >  	for (i = 0; i < e820.nr_map; i++) {
> > >  		int j, found;
> > > -		unsigned long long start, end;
> > > +		paddr_t start, end;
> > >
> > >  		if (e820.map[i].type != E820_RAM) {
> > >  			continue;
> > > @@ -396,7 +396,7 @@ static int __init nodes_cover_memory(void)
> > >
> > >  		if (start < end) {
> > >  			printk(KERN_ERR "SRAT: No PXM for e820 range: "
> > > -				"%016Lx - %016Lx\n", start, end);
> > > +				"%"PRIpaddr" - %"PRIpaddr"\n", start, end);
> > >  			return 0;
> > >  		}
> > >  	}
> > > @@ -432,7 +432,7 @@ static int __init srat_parse_region(struct
> acpi_subtable_header *header,
> > >  	return 0;
> > >  }
> > >
> > > -void __init srat_parse_regions(u64 addr)
> > > +void __init srat_parse_regions(paddr_t addr)
> > >  {
> > >  	u64 mask;
> > >  	unsigned int i;
> > > @@ -441,7 +441,7 @@ void __init srat_parse_regions(u64 addr)
> > >  	    acpi_table_parse(ACPI_SIG_SRAT, acpi_parse_srat))
> > >  		return;
> > >
> > > -	srat_region_mask = pdx_init_mask(addr);
> > > +	srat_region_mask = pdx_init_mask((u64)addr);
> > >  	acpi_table_parse_srat(ACPI_SRAT_TYPE_MEMORY_AFFINITY,
> > >  			      srat_parse_region, 0);
> > >
> > > @@ -457,7 +457,7 @@ void __init srat_parse_regions(u64 addr)
> > >  }
> > >
> > >  /* Use the information discovered above to actually set up the nodes.
> */
> > > -int __init acpi_scan_nodes(u64 start, u64 end)
> > > +int __init acpi_scan_nodes(paddr_t start, paddr_t end)
> > >  {
> > >  	int i;
> > >  	nodemask_t all_nodes_parsed;
> > > @@ -489,7 +489,7 @@ int __init acpi_scan_nodes(u64 start, u64 end)
> > >  	/* Finally register nodes */
> > >  	for_each_node_mask(i, all_nodes_parsed)
> > >  	{
> > > -		u64 size = nodes[i].end - nodes[i].start;
> > > +		paddr_t size = nodes[i].end - nodes[i].start;
> > >  		if ( size == 0 )
> > >  			printk(KERN_WARNING "SRAT: Node %u has no memory. "
> > >  			       "BIOS Bug or mis-configured hardware?\n", i);
> > > diff --git a/xen/include/asm-x86/numa.h b/xen/include/asm-x86/numa.h
> > > index 8060cbf3f4..50cfd8e7ef 100644
> > > --- a/xen/include/asm-x86/numa.h
> > > +++ b/xen/include/asm-x86/numa.h
> > > @@ -16,7 +16,7 @@ extern cpumask_t     node_to_cpumask[];
> > >  #define node_to_cpumask(node)    (node_to_cpumask[node])
> > >
> > >  struct node {
> > > -	u64 start,end;
> > > +	paddr_t start,end;
> > >  };
> > >
> > >  extern int compute_hash_shift(struct node *nodes, int numnodes,
> > > @@ -36,7 +36,7 @@ extern void numa_set_node(int cpu, nodeid_t node);
> > >  extern nodeid_t setup_node(unsigned int pxm);
> > >  extern void srat_detect_node(int cpu);
> > >
> > > -extern void setup_node_bootmem(nodeid_t nodeid, u64 start, u64 end);
> > > +extern void setup_node_bootmem(nodeid_t nodeid, paddr_t start,
> paddr_t end);
> > >  extern nodeid_t apicid_to_node[];
> > >  extern void init_cpu_to_node(void);
> > >
> > > @@ -73,9 +73,9 @@ static inline __attribute__((pure)) nodeid_t
> phys_to_nid(paddr_t addr)
> > >  #define node_end_pfn(nid)       (NODE_DATA(nid)->node_start_pfn + \
> > >  				 NODE_DATA(nid)->node_spanned_pages)
> > >
> > > -extern int valid_numa_range(u64 start, u64 end, nodeid_t node);
> > > +extern int valid_numa_range(paddr_t start, paddr_t end, nodeid_t
> node);
> > >
> > > -void srat_parse_regions(u64 addr);
> > > +void srat_parse_regions(paddr_t addr);
> > >  extern u8 __node_distance(nodeid_t a, nodeid_t b);
> > >  unsigned int arch_get_dma_bitsize(void);
> > >  unsigned int arch_have_default_dmazone(void);
> > > --
> > > 2.25.1
> > >
> >

^ permalink raw reply	[flat|nested] 192+ messages in thread

* Re: [PATCH 31/37] xen/arm: introduce a helper to parse device tree NUMA distance map
  2021-09-23 12:02 ` [PATCH 31/37] xen/arm: introduce a helper to parse device tree NUMA distance map Wei Chen
@ 2021-09-24  3:05   ` Stefano Stabellini
  2021-09-24  5:23     ` Wei Chen
  0 siblings, 1 reply; 192+ messages in thread
From: Stefano Stabellini @ 2021-09-24  3:05 UTC (permalink / raw)
  To: Wei Chen; +Cc: xen-devel, sstabellini, julien, Bertrand.Marquis

On Thu, 23 Sep 2021, Wei Chen wrote:
> A NUMA aware device tree will provide a "distance-map" node to
> describe distance between any two nodes. This patch introduce a
> new helper to parse this distance map.
> 
> Signed-off-by: Wei Chen <wei.chen@arm.com>
> ---
>  xen/arch/arm/numa_device_tree.c | 106 ++++++++++++++++++++++++++++++++
>  1 file changed, 106 insertions(+)
> 
> diff --git a/xen/arch/arm/numa_device_tree.c b/xen/arch/arm/numa_device_tree.c
> index 7918a397fa..e7fa84df4c 100644
> --- a/xen/arch/arm/numa_device_tree.c
> +++ b/xen/arch/arm/numa_device_tree.c
> @@ -136,3 +136,109 @@ static int __init fdt_parse_numa_memory_node(const void *fdt, int node,
>  
>      return 0;
>  }
> +
> +
> +/* Parse NUMA distance map v1 */
> +static int __init fdt_parse_numa_distance_map_v1(const void *fdt, int node)
> +{
> +    const struct fdt_property *prop;
> +    const __be32 *matrix;
> +    uint32_t entry_count;
> +    int len, i;
> +
> +    printk(XENLOG_INFO "NUMA: parsing numa-distance-map\n");
> +
> +    prop = fdt_get_property(fdt, node, "distance-matrix", &len);
> +    if ( !prop )
> +    {
> +        printk(XENLOG_WARNING
> +               "NUMA: No distance-matrix property in distance-map\n");

I haven't seen where this is called from yet but make sure to print an
error here only if NUMA info is actually expected and required, not on
regular non-NUMA boots on non-NUMA machines.


> +        return -EINVAL;
> +    }
> +
> +    if ( len % sizeof(uint32_t) != 0 )
> +    {
> +        printk(XENLOG_WARNING
> +               "distance-matrix in node is not a multiple of u32\n");
> +        return -EINVAL;
> +    }
> +
> +    entry_count = len / sizeof(uint32_t);
> +    if ( entry_count == 0 )
> +    {
> +        printk(XENLOG_WARNING "NUMA: Invalid distance-matrix\n");
> +
> +        return -EINVAL;
> +    }
> +
> +    matrix = (const __be32 *)prop->data;
> +    for ( i = 0; i + 2 < entry_count; i += 3 )
> +    {
> +        uint32_t from, to, distance, opposite;
> +
> +        from = dt_next_cell(1, &matrix);
> +        to = dt_next_cell(1, &matrix);
> +        distance = dt_next_cell(1, &matrix);
> +        if ( (from == to && distance != NUMA_LOCAL_DISTANCE) ||
> +            (from != to && distance <= NUMA_LOCAL_DISTANCE) )
> +        {
> +            printk(XENLOG_WARNING
> +                   "NUMA: Invalid distance: NODE#%u->NODE#%u:%u\n",
> +                   from, to, distance);
> +            return -EINVAL;
> +        }
> +
> +        printk(XENLOG_INFO "NUMA: distance: NODE#%u->NODE#%u:%u\n",
> +               from, to, distance);
> +
> +        /* Get opposite way distance */
> +        opposite = __node_distance(from, to);

This is not checking for the opposite node distance but...


> +        if ( opposite == 0 )
> +        {
> +            /* Bi-directions are not set, set both */
> +            numa_set_distance(from, to, distance);
> +            numa_set_distance(to, from, distance);

...since you set both directions here at once then it is OK. You are
checking if this direction has already been set which is correct I
think. But the comment "Get opposite way distance" and the variable name
"opposite" are wrong.


> +        }
> +        else
> +        {
> +            /*
> +             * Opposite way distance has been set to a different value.
> +             * It may be a firmware device tree bug?
> +             */
> +            if ( opposite != distance )
> +            {
> +                /*
> +                 * In device tree NUMA distance-matrix binding:
> +                 * https://www.kernel.org/doc/Documentation/devicetree/bindings/numa.txt
> +                 * There is a notes mentions:
> +                 * "Each entry represents distance from first node to
> +                 *  second node. The distances are equal in either
> +                 *  direction."
> +                 *
> +                 * That means device tree doesn't permit this case.
> +                 * But in ACPI spec, it cares to specifically permit this
> +                 * case:
> +                 * "Except for the relative distance from a System Locality
> +                 *  to itself, each relative distance is stored twice in the
> +                 *  matrix. This provides the capability to describe the
> +                 *  scenario where the relative distances for the two
> +                 *  directions between System Localities is different."
> +                 *
> +                 * That means a real machine allows such NUMA configuration.
> +                 * So, place a WARNING here to notice system administrators,
> +                 * is it the specail case that they hijack the device tree
> +                 * to support their rare machines?
> +                 */
> +                printk(XENLOG_WARNING
> +                       "Un-matched bi-direction! NODE#%u->NODE#%u:%u, NODE#%u->NODE#%u:%u\n",
> +                       from, to, distance, to, from, opposite);

PRIu32


> +            }
> +
> +            /* Opposite way distance has been set, just set this way */
> +            numa_set_distance(from, to, distance);
> +        }
> +    }
> +
> +    return 0;
> +}
> -- 
> 2.25.1
> 



* Re: [PATCH 30/37] xen/arm: introduce a helper to parse device tree memory node
  2021-09-23 12:02 ` [PATCH 30/37] xen/arm: introduce a helper to parse device tree memory node Wei Chen
@ 2021-09-24  3:05   ` Stefano Stabellini
  2021-09-24  7:54     ` Wei Chen
  0 siblings, 1 reply; 192+ messages in thread
From: Stefano Stabellini @ 2021-09-24  3:05 UTC (permalink / raw)
  To: Wei Chen; +Cc: xen-devel, sstabellini, julien, Bertrand.Marquis

On Thu, 23 Sep 2021, Wei Chen wrote:
> Memory blocks' NUMA ID information is stored in device tree's
> memory nodes as "numa-node-id". We need a new helper to parse
> and verify this ID from memory nodes.
> 
> Signed-off-by: Wei Chen <wei.chen@arm.com>

There are tabs for indentation in this patch, we use spaces.


> ---
>  xen/arch/arm/numa_device_tree.c | 80 +++++++++++++++++++++++++++++++++
>  1 file changed, 80 insertions(+)
> 
> diff --git a/xen/arch/arm/numa_device_tree.c b/xen/arch/arm/numa_device_tree.c
> index 2428fbae0b..7918a397fa 100644
> --- a/xen/arch/arm/numa_device_tree.c
> +++ b/xen/arch/arm/numa_device_tree.c
> @@ -42,6 +42,35 @@ static int __init fdt_numa_processor_affinity_init(nodeid_t node)
>      return 0;
>  }
>  
> +/* Callback for parsing of the memory regions affinity */
> +static int __init fdt_numa_memory_affinity_init(nodeid_t node,
> +                                paddr_t start, paddr_t size)

Please align the parameters


> +{
> +    int ret;
> +
> +    if ( srat_disabled() )
> +    {
> +        return -EINVAL;
> +    }
> +
> +	if ( !numa_memblks_available() )
> +	{
> +		dprintk(XENLOG_WARNING,
> +                "Too many numa entry, try bigger NR_NODE_MEMBLKS \n");
> +		bad_srat();
> +		return -EINVAL;
> +	}
> +
> +	ret = numa_update_node_memblks(node, start, size, false);
> +	if ( ret != 0 )
> +	{
> +		bad_srat();
> +	    return -EINVAL;
> +	}
> +
> +    return 0;
> +}

Aside from spaces/tabs, this is a lot better!


>  /* Parse CPU NUMA node info */
>  static int __init fdt_parse_numa_cpu_node(const void *fdt, int node)
>  {
> @@ -56,3 +85,54 @@ static int __init fdt_parse_numa_cpu_node(const void *fdt, int node)
>  
>      return fdt_numa_processor_affinity_init(nid);
>  }
> +
> +/* Parse memory node NUMA info */
> +static int __init fdt_parse_numa_memory_node(const void *fdt, int node,
> +    const char *name, uint32_t addr_cells, uint32_t size_cells)

Please align the parameters


> +{
> +    uint32_t nid;
> +    int ret = 0, len;
> +    paddr_t addr, size;
> +    const struct fdt_property *prop;
> +    uint32_t idx, ranges;
> +    const __be32 *addresses;
> +
> +    nid = device_tree_get_u32(fdt, node, "numa-node-id", MAX_NUMNODES);
> +    if ( nid >= MAX_NUMNODES )
> +    {
> +        printk(XENLOG_WARNING "Node id %u exceeds maximum value\n", nid);
> +        return -EINVAL;
> +    }
> +
> +    prop = fdt_get_property(fdt, node, "reg", &len);
> +    if ( !prop )
> +    {
> +        printk(XENLOG_WARNING
> +               "fdt: node `%s': missing `reg' property\n", name);
> +        return -EINVAL;
> +    }
> +
> +    addresses = (const __be32 *)prop->data;
> +    ranges = len / (sizeof(__be32)* (addr_cells + size_cells));
> +    for ( idx = 0; idx < ranges; idx++ )
> +    {
> +        device_tree_get_reg(&addresses, addr_cells, size_cells, &addr, &size);
> +        /* Skip zero size ranges */
> +        if ( !size )
> +            continue;
> +
> +        ret = fdt_numa_memory_affinity_init(nid, addr, size);
> +        if ( ret ) {
> +            return -EINVAL;
> +        }
> +    }

I take it that it would be difficult to parse numa-node-id and call
fdt_numa_memory_affinity_init from
xen/arch/arm/bootfdt.c:device_tree_get_meminfo. Is it because
device_tree_get_meminfo is called too early?


> +    if ( idx == 0 )
> +    {
> +        printk(XENLOG_ERR
> +               "bad property in memory node, idx=%d ret=%d\n", idx, ret);
> +        return -EINVAL;
> +    }
> +
> +    return 0;
> +}
> -- 
> 2.25.1
> 



* Re: [PATCH 32/37] xen/arm: unified entry to parse all NUMA data from device tree
  2021-09-23 12:02 ` [PATCH 32/37] xen/arm: unified entry to parse all NUMA data from device tree Wei Chen
@ 2021-09-24  3:16   ` Stefano Stabellini
  2021-09-24  7:58     ` Wei Chen
  0 siblings, 1 reply; 192+ messages in thread
From: Stefano Stabellini @ 2021-09-24  3:16 UTC (permalink / raw)
  To: Wei Chen; +Cc: xen-devel, sstabellini, julien, Bertrand.Marquis

On Thu, 23 Sep 2021, Wei Chen wrote:
> In this API, we scan whole device tree to parse CPU node id, memory
          ^ function   ^ the whole

> node id and distance-map. Though early_scan_node will invoke has a
> handler to process memory nodes. If we want to parse memory node id
> in this handler, we have to embeded NUMA parse code in this handler.
                              ^ embed

> But we still need to scan whole device tree to find CPU NUMA id and
> distance-map. In this case, we include memory NUMA id parse in this
> API too. Another benefit is that we have a unique entry for device
  ^ function

> tree NUMA data parse.

Ah, that's the explanation I was asking for earlier!


> Signed-off-by: Wei Chen <wei.chen@arm.com>
> ---
>  xen/arch/arm/numa_device_tree.c | 30 ++++++++++++++++++++++++++++++
>  xen/include/asm-arm/numa.h      |  1 +
>  2 files changed, 31 insertions(+)
> 
> diff --git a/xen/arch/arm/numa_device_tree.c b/xen/arch/arm/numa_device_tree.c
> index e7fa84df4c..6a3fed0002 100644
> --- a/xen/arch/arm/numa_device_tree.c
> +++ b/xen/arch/arm/numa_device_tree.c
> @@ -242,3 +242,33 @@ static int __init fdt_parse_numa_distance_map_v1(const void *fdt, int node)
>  
>      return 0;
>  }
> +
> +static int __init fdt_scan_numa_nodes(const void *fdt,
> +                int node, const char *uname, int depth,
> +                u32 address_cells, u32 size_cells, void *data)

Please align parameters


> +{
> +    int len, ret = 0;
> +    const void *prop;
> +
> +    prop = fdt_getprop(fdt, node, "device_type", &len);
> +    if (prop)

code style


> +    {
> +        len += 1;
> +        if ( memcmp(prop, "cpu", len) == 0 )
> +            ret = fdt_parse_numa_cpu_node(fdt, node);
> +        else if ( memcmp(prop, "memory", len) == 0 )
> +            ret = fdt_parse_numa_memory_node(fdt, node, uname,
> +                                address_cells, size_cells);

I realize that with the inclusion of '\0' in the check, the usage of
memcmp should be safe, but I would prefer if we used strncmp instead.


> +    }
> +    else if ( fdt_node_check_compatible(fdt, node,
> +                                "numa-distance-map-v1") == 0 )
> +        ret = fdt_parse_numa_distance_map_v1(fdt, node);
> +
> +    return ret;
> +}
> +
> +/* Initialize NUMA from device tree */
> +int __init numa_device_tree_init(const void *fdt)
> +{
> +    return device_tree_for_each_node(fdt, 0, fdt_scan_numa_nodes, NULL);
> +}
> diff --git a/xen/include/asm-arm/numa.h b/xen/include/asm-arm/numa.h
> index 7675012cb7..f46e8e2935 100644
> --- a/xen/include/asm-arm/numa.h
> +++ b/xen/include/asm-arm/numa.h
> @@ -23,6 +23,7 @@ typedef u8 nodeid_t;
>  #define NR_NODE_MEMBLKS NR_MEM_BANKS
>  
>  extern void numa_set_distance(nodeid_t from, nodeid_t to, uint32_t distance);
> +extern int numa_device_tree_init(const void *fdt);
>  
>  #else
>  
> -- 
> 2.25.1
> 



* Re: [PATCH 33/37] xen/arm: keep guest still be NUMA unware
  2021-09-23 12:02 ` [PATCH 33/37] xen/arm: keep guest still be NUMA unware Wei Chen
@ 2021-09-24  3:19   ` Stefano Stabellini
  2021-09-24 10:23     ` Wei Chen
  0 siblings, 1 reply; 192+ messages in thread
From: Stefano Stabellini @ 2021-09-24  3:19 UTC (permalink / raw)
  To: Wei Chen; +Cc: xen-devel, sstabellini, julien, Bertrand.Marquis

On Thu, 23 Sep 2021, Wei Chen wrote:
> The NUMA information provided in the host Device-Tree
> are only for Xen. For dom0, we want to hide them as they
> may be different (for now, dom0 is still not aware of NUMA)
> The CPU and memory nodes are recreated from scratch for the
> domain. So we already skip the "numa-node-id" property for
> these two types of nodes.
> 
> However, some devices like PCIe may have "numa-node-id"
> property too. We have to skip them as well.
> 
> Signed-off-by: Wei Chen <wei.chen@arm.com>
> ---
>  xen/arch/arm/domain_build.c | 6 ++++++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
> index d233d634c1..6e94922238 100644
> --- a/xen/arch/arm/domain_build.c
> +++ b/xen/arch/arm/domain_build.c
> @@ -737,6 +737,10 @@ static int __init write_properties(struct domain *d, struct kernel_info *kinfo,
>                  continue;
>          }
>  
> +        /* Guest is numa unaware in current stage */

I would say: "Dom0 is currently NUMA unaware"

Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>


> +        if ( dt_property_name_is_equal(prop, "numa-node-id") )
> +            continue;
> +
>          res = fdt_property(kinfo->fdt, prop->name, prop_data, prop_len);
>  
>          if ( res )
> @@ -1607,6 +1611,8 @@ static int __init handle_node(struct domain *d, struct kernel_info *kinfo,
>          DT_MATCH_TYPE("memory"),
>          /* The memory mapped timer is not supported by Xen. */
>          DT_MATCH_COMPATIBLE("arm,armv7-timer-mem"),
> +        /* Numa info doesn't need to be exposed to Domain-0 */
> +        DT_MATCH_COMPATIBLE("numa-distance-map-v1"),
>          { /* sentinel */ },
>      };
>      static const struct dt_device_match timer_matches[] __initconst =
> -- 
> 2.25.1
> 



* Re: [PATCH 34/37] xen/arm: enable device tree based NUMA in system init
  2021-09-23 12:02 ` [PATCH 34/37] xen/arm: enable device tree based NUMA in system init Wei Chen
@ 2021-09-24  3:28   ` Stefano Stabellini
  2021-09-24  9:52     ` Wei Chen
  0 siblings, 1 reply; 192+ messages in thread
From: Stefano Stabellini @ 2021-09-24  3:28 UTC (permalink / raw)
  To: Wei Chen; +Cc: xen-devel, sstabellini, julien, Bertrand.Marquis

On Thu, 23 Sep 2021, Wei Chen wrote:
> In this patch, we can start to create NUMA system that is
> based on device tree.
> 
> Signed-off-by: Wei Chen <wei.chen@arm.com>
> ---
>  xen/arch/arm/numa.c        | 55 ++++++++++++++++++++++++++++++++++++++
>  xen/arch/arm/setup.c       |  7 +++++
>  xen/include/asm-arm/numa.h |  6 +++++
>  3 files changed, 68 insertions(+)
> 
> diff --git a/xen/arch/arm/numa.c b/xen/arch/arm/numa.c
> index 7f05299b76..d7a3d32d4b 100644
> --- a/xen/arch/arm/numa.c
> +++ b/xen/arch/arm/numa.c
> @@ -18,8 +18,10 @@
>   *
>   */
>  #include <xen/init.h>
> +#include <xen/device_tree.h>
>  #include <xen/nodemask.h>
>  #include <xen/numa.h>
> +#include <xen/pfn.h>
>  
>  static uint8_t __read_mostly
>  node_distance_map[MAX_NUMNODES][MAX_NUMNODES] = {
> @@ -85,6 +87,59 @@ uint8_t __node_distance(nodeid_t from, nodeid_t to)
>  }
>  EXPORT_SYMBOL(__node_distance);
>  
> +void __init numa_init(bool acpi_off)
> +{
> +    uint32_t idx;
> +    paddr_t ram_start = ~0;

INVALID_PADDR


> +    paddr_t ram_size = 0;
> +    paddr_t ram_end = 0;
> +
> +    /* NUMA has been turned off through Xen parameters */
> +    if ( numa_off )
> +        goto mem_init;
> +
> +    /* Initialize NUMA from device tree when system is not ACPI booted */
> +    if ( acpi_off )
> +    {
> +        int ret = numa_device_tree_init(device_tree_flattened);
> +        if ( ret )
> +        {
> +            printk(XENLOG_WARNING
> +                   "Init NUMA from device tree failed, ret=%d\n", ret);

As I mentioned in other patches we need to distinguish between two
cases:

1) NUMA initialization failed because no NUMA information has been found
2) NUMA initialization failed because wrong/inconsistent NUMA info has
   been found

In case of 1), we print nothing. Maybe a single XENLOG_DEBUG message.
In case of 2), all the warnings are good to print.


In this case, if ret != 0 because of 2), then it is fine to print this
warning. But it looks like it could be that ret is -EINVAL simply because a
CPU node doesn't have numa-node-id, which is a normal condition for
non-NUMA machines.


> +            numa_off = true;
> +        }
> +    }
> +    else
> +    {
> +        /* We don't support NUMA for ACPI boot currently */
> +        printk(XENLOG_WARNING
> +               "ACPI NUMA has not been supported yet, NUMA off!\n");
> +        numa_off = true;
> +    }
> +
> +mem_init:
> +    /*
> +     * Find the minimal and maximum address of RAM, NUMA will
> +     * build a memory to node mapping table for the whole range.
> +     */
> +    ram_start = bootinfo.mem.bank[0].start;
> +    ram_size  = bootinfo.mem.bank[0].size;
> +    ram_end   = ram_start + ram_size;
> +    for ( idx = 1 ; idx < bootinfo.mem.nr_banks; idx++ )
> +    {
> +        paddr_t bank_start = bootinfo.mem.bank[idx].start;
> +        paddr_t bank_size = bootinfo.mem.bank[idx].size;
> +        paddr_t bank_end = bank_start + bank_size;
> +
> +        ram_size  = ram_size + bank_size;
> +        ram_start = min(ram_start, bank_start);
> +        ram_end   = max(ram_end, bank_end);
> +    }
> +
> +    numa_initmem_init(PFN_UP(ram_start), PFN_DOWN(ram_end));
> +    return;

No need for return


> +}
> +
>  uint32_t __init arch_meminfo_get_nr_bank(void)
>  {
>  	return bootinfo.mem.nr_banks;
> diff --git a/xen/arch/arm/setup.c b/xen/arch/arm/setup.c
> index 1f0fbc95b5..6097850682 100644
> --- a/xen/arch/arm/setup.c
> +++ b/xen/arch/arm/setup.c
> @@ -905,6 +905,13 @@ void __init start_xen(unsigned long boot_phys_offset,
>      /* Parse the ACPI tables for possible boot-time configuration */
>      acpi_boot_table_init();
>  
> +    /*
> +     * Try to initialize NUMA system, if failed, the system will
> +     * fallback to uniform system which means system has only 1
> +     * NUMA node.
> +     */
> +    numa_init(acpi_disabled);
> +
>      end_boot_allocator();
>  
>      /*
> diff --git a/xen/include/asm-arm/numa.h b/xen/include/asm-arm/numa.h
> index f46e8e2935..5b03dde87f 100644
> --- a/xen/include/asm-arm/numa.h
> +++ b/xen/include/asm-arm/numa.h
> @@ -24,6 +24,7 @@ typedef u8 nodeid_t;
>  
>  extern void numa_set_distance(nodeid_t from, nodeid_t to, uint32_t distance);
>  extern int numa_device_tree_init(const void *fdt);
> +extern void numa_init(bool acpi_off);
>  
>  #else
>  
> @@ -47,6 +48,11 @@ extern mfn_t first_valid_mfn;
>  #define node_start_pfn(nid) (mfn_x(first_valid_mfn))
>  #define __node_distance(a, b) (20)
>  
> +static inline void numa_init(bool acpi_off)
> +{
> +
> +}
> +
>  static inline void numa_add_cpu(int cpu)
>  {
>  
> -- 
> 2.25.1
> 



* Re: [PATCH 36/37] xen/arm: Provide Kconfig options for Arm to enable NUMA
  2021-09-23 12:02 ` [PATCH 36/37] xen/arm: Provide Kconfig options for Arm to enable NUMA Wei Chen
@ 2021-09-24  3:31   ` Stefano Stabellini
  2021-09-24 10:13     ` Wei Chen
  2021-09-24 10:25   ` Jan Beulich
  1 sibling, 1 reply; 192+ messages in thread
From: Stefano Stabellini @ 2021-09-24  3:31 UTC (permalink / raw)
  To: Wei Chen; +Cc: xen-devel, sstabellini, julien, Bertrand.Marquis

On Thu, 23 Sep 2021, Wei Chen wrote:
> Arm platforms support both ACPI and device tree. We don't
> want users to select device tree NUMA or ACPI NUMA manually.
> We hope usrs can just enable NUMA for Arm, and device tree
          ^ users

> NUMA and ACPI NUMA can be selected depends on device tree
> feature and ACPI feature status automatically. In this case,
> these two kinds of NUMA support code can be co-exist in one
> Xen binary. Xen can check feature flags to decide using
> device tree or ACPI as NUMA based firmware.
> 
> So in this patch, we introduce a generic option:
> CONFIG_ARM_NUMA for user to enable NUMA for Arm.
                      ^ users

> And one CONFIG_DEVICE_TREE_NUMA option for ARM_NUMA
> to select when HAS_DEVICE_TREE option is enabled.
> Once when ACPI NUMA for Arm is supported, ACPI_NUMA
> can be selected here too.
> 
> Signed-off-by: Wei Chen <wei.chen@arm.com>
> ---
>  xen/arch/arm/Kconfig | 11 +++++++++++
>  1 file changed, 11 insertions(+)
> 
> diff --git a/xen/arch/arm/Kconfig b/xen/arch/arm/Kconfig
> index 865ad83a89..ded94ebd37 100644
> --- a/xen/arch/arm/Kconfig
> +++ b/xen/arch/arm/Kconfig
> @@ -34,6 +34,17 @@ config ACPI
>  	  Advanced Configuration and Power Interface (ACPI) support for Xen is
>  	  an alternative to device tree on ARM64.
>  
> + config DEVICE_TREE_NUMA
> +	def_bool n
> +	select NUMA
> +
> +config ARM_NUMA
> +	bool "Arm NUMA (Non-Uniform Memory Access) Support (UNSUPPORTED)" if UNSUPPORTED
> +	select DEVICE_TREE_NUMA if HAS_DEVICE_TREE

Should it be: depends on HAS_DEVICE_TREE ?
(And eventually depends on HAS_DEVICE_TREE || ACPI)


> +	---help---
> +
> +	  Enable Non-Uniform Memory Access (NUMA) for Arm architecutres
                                                      ^ architectures


> +
>  config GICV3
>  	bool "GICv3 driver"
>  	depends on ARM_64 && !NEW_VGIC
> -- 
> 2.25.1
> 


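The shape Stefano suggests would look roughly like the fragment below. This is an untested sketch, not the final patch; the option names are taken from the patch under review.

```kconfig
config ARM_NUMA
	bool "Arm NUMA (Non-Uniform Memory Access) Support (UNSUPPORTED)" if UNSUPPORTED
	depends on HAS_DEVICE_TREE
	select DEVICE_TREE_NUMA
	---help---

	  Enable Non-Uniform Memory Access (NUMA) for Arm architectures
```

With the dependency expressed via `depends on`, the `if HAS_DEVICE_TREE` qualifier on the `select` becomes redundant; it would become meaningful again once the dependency grows to `HAS_DEVICE_TREE || ACPI` and ACPI NUMA support is selectable.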

* Re: [PATCH 09/37] xen/x86: introduce two helpers to access memory hotplug end
  2021-09-24  0:29   ` Stefano Stabellini
@ 2021-09-24  4:21     ` Wei Chen
  0 siblings, 0 replies; 192+ messages in thread
From: Wei Chen @ 2021-09-24  4:21 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: xen-devel, julien, Bertrand.Marquis, jbeulich, andrew.cooper3,
	roger.pau, wl



On 2021/9/24 8:29, Stefano Stabellini wrote:
> +x86 maintainers
> 
> On Thu, 23 Sep 2021, Wei Chen wrote:
>> x86 provides a mem_hotplug to maintain the end of memory hotplug
>                              ^ variable
> 
>> end address. This variable can be accessed out of mm.c. We want
>> some code out of mm.c can be reused by other architectures without
>                         ^ so that it can be reused
> 
>> memory hotplug ability. So in this patch, we introduce these two
>> helpers to replace mem_hotplug direct access. This will give the
>> ability to stub these two API.
>                              ^ APIs
> 
> 

OK

>> Signed-off-by: Wei Chen <wei.chen@arm.com>
>> ---
>>   xen/include/asm-x86/mm.h | 10 ++++++++++
>>   1 file changed, 10 insertions(+)
>>
>> diff --git a/xen/include/asm-x86/mm.h b/xen/include/asm-x86/mm.h
>> index cb90527499..af2fc4b0cd 100644
>> --- a/xen/include/asm-x86/mm.h
>> +++ b/xen/include/asm-x86/mm.h
>> @@ -475,6 +475,16 @@ static inline int get_page_and_type(struct page_info *page,
>>   
>>   extern paddr_t mem_hotplug;
>>   
>> +static inline void mem_hotplug_update_boundary(paddr_t end)
>> +{
>> +    mem_hotplug = end;
>> +}
>> +
>> +static inline paddr_t mem_hotplug_boundary(void)
>> +{
>> +    return mem_hotplug;
>> +}
>> +
>>   /******************************************************************************
>>    * With shadow pagetables, the different kinds of address start
>>    * to get get confusing.
>> -- 
>> 2.25.1
>>



* RE: [PATCH 26/37] xen/arm: build NUMA cpu_to_node map in dt_smp_init_cpus
  2021-09-24  2:26   ` Stefano Stabellini
@ 2021-09-24  4:25     ` Wei Chen
  0 siblings, 0 replies; 192+ messages in thread
From: Wei Chen @ 2021-09-24  4:25 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: xen-devel, julien, Bertrand Marquis

Hi Stefano,

> -----Original Message-----
> From: Stefano Stabellini <sstabellini@kernel.org>
> Sent: September 24, 2021 10:26
> To: Wei Chen <Wei.Chen@arm.com>
> Cc: xen-devel@lists.xenproject.org; sstabellini@kernel.org; julien@xen.org;
> Bertrand Marquis <Bertrand.Marquis@arm.com>
> Subject: Re: [PATCH 26/37] xen/arm: build NUMA cpu_to_node map in
> dt_smp_init_cpus
> 
> On Thu, 23 Sep 2021, Wei Chen wrote:
> > NUMA implementation has a cpu_to_node array to store CPU to NODE
> > map. Xen is using CPU logical ID in runtime components, so we
> > use CPU logical ID as CPU index in cpu_to_node.
> >
> > In device tree case, cpu_logical_map is created in dt_smp_init_cpus.
> > So, when NUMA is enabled, dt_smp_init_cpus will fetch CPU NUMA id
> > at the same time for cpu_to_node.
> >
> > Signed-off-by: Wei Chen <wei.chen@arm.com>
> > ---
> >  xen/arch/arm/smpboot.c     | 37 ++++++++++++++++++++++++++++++++++++-
> >  xen/include/asm-arm/numa.h |  5 +++++
> >  2 files changed, 41 insertions(+), 1 deletion(-)
> >
> > diff --git a/xen/arch/arm/smpboot.c b/xen/arch/arm/smpboot.c
> > index 60c0e82fc5..6e3cc8d3cc 100644
> > --- a/xen/arch/arm/smpboot.c
> > +++ b/xen/arch/arm/smpboot.c
> > @@ -121,7 +121,12 @@ static void __init dt_smp_init_cpus(void)
> >      {
> >          [0 ... NR_CPUS - 1] = MPIDR_INVALID
> >      };
> > +    static nodeid_t node_map[NR_CPUS] __initdata =
> > +    {
> > +        [0 ... NR_CPUS - 1] = NUMA_NO_NODE
> > +    };
> >      bool bootcpu_valid = false;
> > +    uint32_t nid = 0;
> >      int rc;
> >
> >      mpidr = system_cpuinfo.mpidr.bits & MPIDR_HWID_MASK;
> > @@ -172,6 +177,28 @@ static void __init dt_smp_init_cpus(void)
> >              continue;
> >          }
> >
> > +        if ( IS_ENABLED(CONFIG_NUMA) )
> > +        {
> > +            /*
> > +             * When CONFIG_NUMA is set, try to fetch numa infomation
> > +             * from CPU dts node, otherwise the nid is always 0.
> > +             */
> > +            if ( !dt_property_read_u32(cpu, "numa-node-id", &nid) )
> > +            {
> > +                printk(XENLOG_WARNING
> > +                       "cpu[%d] dts path: %s: doesn't have numa
> information!\n",
>                                ^ %u
> 
> 
> > +                       cpuidx, dt_node_full_name(cpu));
> 
> I think that this message shouldn't be a warning: CONFIG_NUMA is a
> compile time option. Anybody that enables CONFIG_NUMA in the Xen build
> will get this warning printed out at boot time if Xen is booting on a
> regular non-NUMA machine, right?
> 
> The warning should only be printed if NUMA is actively enabled, e.g.
> there is a distance-map but the cpus don't have numa-node-id.
> 
> 

Yes, this message would be unexpected on a regular non-NUMA machine.
I will add some check conditions before printing this message.

> 
> > +                /*
> > +                 * During the early stage of NUMA initialization, when
> Xen
> > +                 * found any CPU dts node doesn't have numa-node-id
> info, the
> > +                 * NUMA will be treated as off, all CPU will be set to
> a FAKE
> > +                 * node 0. So if we get numa-node-id failed here, we
> should
> > +                 * set nid to 0.
> > +                 */
> > +                nid = 0;
> > +            }
> > +        }
> > +
> >          /*
> >           * 8 MSBs must be set to 0 in the DT since the reg property
> >           * defines the MPIDR[23:0]
> > @@ -231,9 +258,12 @@ static void __init dt_smp_init_cpus(void)
> >          {
> >              printk("cpu%d init failed (hwid %"PRIregister"): %d\n", i,
> hwid, rc);
> >              tmp_map[i] = MPIDR_INVALID;
> > +            node_map[i] = NUMA_NO_NODE;
> >          }
> > -        else
> > +        else {
> >              tmp_map[i] = hwid;
> > +            node_map[i] = nid;
> > +        }
> >      }
> >
> >      if ( !bootcpu_valid )
> > @@ -249,6 +279,11 @@ static void __init dt_smp_init_cpus(void)
> >              continue;
> >          cpumask_set_cpu(i, &cpu_possible_map);
> >          cpu_logical_map(i) = tmp_map[i];
> > +
> > +        nid = node_map[i];
> > +        if ( nid >= MAX_NUMNODES )
> > +            nid = 0;
> > +        numa_set_node(i, nid);
> >      }
> >  }
> >
> > diff --git a/xen/include/asm-arm/numa.h b/xen/include/asm-arm/numa.h
> > index 758eafeb05..8a4ad379e0 100644
> > --- a/xen/include/asm-arm/numa.h
> > +++ b/xen/include/asm-arm/numa.h
> > @@ -46,6 +46,11 @@ extern mfn_t first_valid_mfn;
> >  #define node_start_pfn(nid) (mfn_x(first_valid_mfn))
> >  #define __node_distance(a, b) (20)
> >
> > +static inline void numa_set_node(int cpu, nodeid_t node)
> > +{
> > +
> > +}
> > +
> >  #endif
> >
> >  static inline unsigned int arch_have_default_dmazone(void)
> > --
> > 2.25.1
> >


* RE: [PATCH 28/37] xen/arm: stub memory hotplug access helpers for Arm
  2021-09-24  2:33   ` Stefano Stabellini
@ 2021-09-24  4:26     ` Wei Chen
  0 siblings, 0 replies; 192+ messages in thread
From: Wei Chen @ 2021-09-24  4:26 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: xen-devel, julien, Bertrand Marquis



> -----Original Message-----
> From: Stefano Stabellini <sstabellini@kernel.org>
> Sent: September 24, 2021 10:34
> To: Wei Chen <Wei.Chen@arm.com>
> Cc: xen-devel@lists.xenproject.org; sstabellini@kernel.org; julien@xen.org;
> Bertrand Marquis <Bertrand.Marquis@arm.com>
> Subject: Re: [PATCH 28/37] xen/arm: stub memory hotplug access helpers for
> Arm
> 
> On Thu, 23 Sep 2021, Wei Chen wrote:
> > Common NUMA code needs these two helpers to access/update the
> > memory hotplug end address. Arm does not support memory hotplug
> > yet. So we stub these two helpers in this patch to make the NUMA
> > common code happy.
> >
> > Signed-off-by: Wei Chen <wei.chen@arm.com>
> > ---
> >  xen/include/asm-arm/mm.h | 10 ++++++++++
> >  1 file changed, 10 insertions(+)
> >
> > diff --git a/xen/include/asm-arm/mm.h b/xen/include/asm-arm/mm.h
> > index 7b5e7b7f69..fc9433165d 100644
> > --- a/xen/include/asm-arm/mm.h
> > +++ b/xen/include/asm-arm/mm.h
> > @@ -362,6 +362,16 @@ void clear_and_clean_page(struct page_info *page);
> >
> >  unsigned int arch_get_dma_bitsize(void);
> >
> > +static inline void mem_hotplug_update_boundary(paddr_t end)
> > +{
> > +
> > +}
> > +
> > +static inline paddr_t mem_hotplug_boundary(void)
> > +{
> > +    return 0;
> > +}
> 
> Why zero? Could it be INVALID_PADDR ?

Yes, INVALID_PADDR is better.


* RE: [PATCH 08/37] xen/x86: add detection of discontinous node memory range
  2021-09-24  0:25   ` Stefano Stabellini
@ 2021-09-24  4:28     ` Wei Chen
  2021-09-24 19:52       ` Stefano Stabellini
  0 siblings, 1 reply; 192+ messages in thread
From: Wei Chen @ 2021-09-24  4:28 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: xen-devel, julien, Bertrand Marquis, jbeulich, andrew.cooper3,
	roger.pau, wl



> -----Original Message-----
> From: Stefano Stabellini <sstabellini@kernel.org>
> Sent: September 24, 2021 8:26
> To: Wei Chen <Wei.Chen@arm.com>
> Cc: xen-devel@lists.xenproject.org; sstabellini@kernel.org; julien@xen.org;
> Bertrand Marquis <Bertrand.Marquis@arm.com>; jbeulich@suse.com;
> andrew.cooper3@citrix.com; roger.pau@citrix.com; wl@xen.org
> Subject: Re: [PATCH 08/37] xen/x86: add detection of discontinous node
> memory range
> 
> CC'ing x86 maintainers
> 
> On Thu, 23 Sep 2021, Wei Chen wrote:
> > One NUMA node may contain several memory blocks. In the current Xen
> > code, Xen maintains a node memory range for each node to cover
> > all of its memory blocks. But here comes the problem: if, in the gap
> > between two of a node's memory blocks, there are memory blocks that
> > don't belong to this node (remote memory blocks), this node's memory
> > range will be expanded to cover these remote memory blocks.
> >
> > One node's memory range containing other nodes' memory is obviously
> > not reasonable. This means the current NUMA code can only support
> > nodes with contiguous memory blocks. However, on a physical machine,
> > the addresses of multiple nodes can be interleaved.
> >
> > So in this patch, we add code to detect discontiguous memory blocks
> > for one node. NUMA initialization will fail and error messages
> > will be printed when Xen detects such a hardware configuration.
> 
> At least on ARM, it is not just memory that can be interleaved, but also
> MMIO regions. For instance:
> 
> node0 bank0 0-0x1000000
> MMIO 0x1000000-0x1002000
> Hole 0x1002000-0x2000000
> node0 bank1 0x2000000-0x3000000
> 
> So I am not familiar with the SRAT format, but I think on ARM the check
> would look different: we would just look for multiple memory ranges
> under a device_type = "memory" node of a NUMA node in device tree.
> 
> 

Should I include/refine the above message in the commit log?

> 
> > Signed-off-by: Wei Chen <wei.chen@arm.com>
> > ---
> >  xen/arch/x86/srat.c | 36 ++++++++++++++++++++++++++++++++++++
> >  1 file changed, 36 insertions(+)
> >
> > diff --git a/xen/arch/x86/srat.c b/xen/arch/x86/srat.c
> > index 7d20d7f222..2f08fa4660 100644
> > --- a/xen/arch/x86/srat.c
> > +++ b/xen/arch/x86/srat.c
> > @@ -271,6 +271,36 @@ acpi_numa_processor_affinity_init(const struct
> acpi_srat_cpu_affinity *pa)
> >  		       pxm, pa->apic_id, node);
> >  }
> >
> > +/*
> > + * Check to see if there are other nodes within this node's range.
> > + * We just need to check full contains situation. Because overlaps
> > + * have been checked before by conflicting_memblks.
> > + */
> > +static bool __init is_node_memory_continuous(nodeid_t nid,
> > +    paddr_t start, paddr_t end)
> > +{
> > +	nodeid_t i;
> > +
> > +	struct node *nd = &nodes[nid];
> > +	for_each_node_mask(i, memory_nodes_parsed)
> > +	{
> > +		/* Skip itself */
> > +		if (i == nid)
> > +			continue;
> > +
> > +		nd = &nodes[i];
> > +		if (start < nd->start && nd->end < end)
> > +		{
> > +			printk(KERN_ERR
> > +			       "NODE %u: (%"PRIpaddr"-%"PRIpaddr") intertwine
> with NODE %u (%"PRIpaddr"-%"PRIpaddr")\n",
> > +			       nid, start, end, i, nd->start, nd->end);
> > +			return false;
> > +		}
> > +	}
> > +
> > +	return true;
> > +}
> > +
> >  /* Callback for parsing of the Proximity Domain <-> Memory Area
> mappings */
> >  void __init
> >  acpi_numa_memory_affinity_init(const struct acpi_srat_mem_affinity *ma)
> > @@ -344,6 +374,12 @@ acpi_numa_memory_affinity_init(const struct
> acpi_srat_mem_affinity *ma)
> >  				nd->start = start;
> >  			if (nd->end < end)
> >  				nd->end = end;
> > +
> > +			/* Check whether this range contains memory for other
> nodes */
> > +			if (!is_node_memory_continuous(node, nd->start, nd->end))
> {
> > +				bad_srat();
> > +				return;
> > +			}
> >  		}
> >  	}
> >  	printk(KERN_INFO "SRAT: Node %u PXM %u %"PRIpaddr"-%"PRIpaddr"%s\n",
> > --
> > 2.25.1
> >
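The containment check in the quoted hunk can be modelled in isolation. The node table, count, and addresses below are illustrative stand-ins for Xen's `nodes[]`, not the real layout; overlap rejection is assumed to have happened earlier, as `conflicting_memblks()` does in Xen:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t paddr_t;
typedef uint8_t nodeid_t;

struct node { paddr_t start, end; };

/* Illustrative fixed-size table standing in for Xen's nodes[] */
#define NR_NODES 4
static struct node nodes[NR_NODES];

/*
 * Returns false when any other node's range falls strictly inside
 * [start, end) -- i.e. the expanded range of node nid would swallow
 * a remote node's memory.
 */
static bool is_node_memory_continuous(nodeid_t nid, paddr_t start, paddr_t end)
{
    for ( nodeid_t i = 0; i < NR_NODES; i++ )
    {
        if ( i == nid )
            continue;   /* skip the node being checked */
        if ( start < nodes[i].start && nodes[i].end < end )
            return false;   /* a remote node sits inside our range */
    }
    return true;
}
```

The test is pure interval containment: strict inequalities on both ends, because a range merely touching another node's boundary is not an interleave.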


* RE: [PATCH 10/37] xen/x86: use helpers to access/update mem_hotplug
  2021-09-24  0:31   ` Stefano Stabellini
@ 2021-09-24  4:29     ` Wei Chen
  0 siblings, 0 replies; 192+ messages in thread
From: Wei Chen @ 2021-09-24  4:29 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: xen-devel, julien, Bertrand Marquis, jbeulich, andrew.cooper3,
	roger.pau, wl



> -----Original Message-----
> From: Xen-devel <xen-devel-bounces@lists.xenproject.org> On Behalf Of
> Stefano Stabellini
> Sent: September 24, 2021 8:32
> To: Wei Chen <Wei.Chen@arm.com>
> Cc: xen-devel@lists.xenproject.org; sstabellini@kernel.org; julien@xen.org;
> Bertrand Marquis <Bertrand.Marquis@arm.com>; jbeulich@suse.com;
> andrew.cooper3@citrix.com; roger.pau@citrix.com; wl@xen.org
> Subject: Re: [PATCH 10/37] xen/x86: use helpers to access/update
> mem_hotplug
> 
> +x86 maintainers
> 
> 
> On Thu, 23 Sep 2021, Wei Chen wrote:
> > We want to abstract code from acpi_numa_memory_affinity_init.
> > But mem_hotplug is coupled with x86. In this patch, we use
> > helpers to repace mem_hotplug direct accessing. This will
>              ^ replace
> 
> > allow most code can be common.
>                   ^ to be
> 
> I think this patch could be merged with the previous patch
> 

OK, I will do that and fix the above typos.

> 
> > Signed-off-by: Wei Chen <wei.chen@arm.com>
> > ---
> >  xen/arch/x86/srat.c | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/xen/arch/x86/srat.c b/xen/arch/x86/srat.c
> > index 2f08fa4660..3334ede7a5 100644
> > --- a/xen/arch/x86/srat.c
> > +++ b/xen/arch/x86/srat.c
> > @@ -391,8 +391,8 @@ acpi_numa_memory_affinity_init(const struct
> acpi_srat_mem_affinity *ma)
> >  	memblk_nodeid[num_node_memblks] = node;
> >  	if (ma->flags & ACPI_SRAT_MEM_HOT_PLUGGABLE) {
> >  		__set_bit(num_node_memblks, memblk_hotplug);
> > -		if (end > mem_hotplug)
> > -			mem_hotplug = end;
> > +		if (end > mem_hotplug_boundary())
> > +			mem_hotplug_update_boundary(end);
> >  	}
> >  	num_node_memblks++;
> >  }
> > --
> > 2.25.1
> >



* RE: [PATCH 20/37] xen: introduce CONFIG_EFI to stub API for non-EFI architecture
  2021-09-24  1:15   ` Stefano Stabellini
@ 2021-09-24  4:34     ` Wei Chen
  2021-09-24  7:58       ` Jan Beulich
  0 siblings, 1 reply; 192+ messages in thread
From: Wei Chen @ 2021-09-24  4:34 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: xen-devel, julien, Bertrand Marquis

Hi Stefano,

> -----Original Message-----
> From: Stefano Stabellini <sstabellini@kernel.org>
> Sent: September 24, 2021 9:15
> To: Wei Chen <Wei.Chen@arm.com>
> Cc: xen-devel@lists.xenproject.org; sstabellini@kernel.org; julien@xen.org;
> Bertrand Marquis <Bertrand.Marquis@arm.com>
> Subject: Re: [PATCH 20/37] xen: introduce CONFIG_EFI to stub API for non-
> EFI architecture
> 
> On Thu, 23 Sep 2021, Wei Chen wrote:
> > Some architectures do not support EFI, but the EFI API will be used
> > by some common features. Instead of spreading #ifdef ARCH, we
> > introduce this Kconfig option to give Xen the ability to stub the
> > EFI API for architectures without EFI support.
> >
> > Signed-off-by: Wei Chen <wei.chen@arm.com>
> > ---
> >  xen/arch/arm/Kconfig  |  1 +
> >  xen/arch/arm/Makefile |  2 +-
> >  xen/arch/x86/Kconfig  |  1 +
> >  xen/common/Kconfig    | 11 +++++++++++
> >  xen/include/xen/efi.h |  4 ++++
> >  5 files changed, 18 insertions(+), 1 deletion(-)
> >
> > diff --git a/xen/arch/arm/Kconfig b/xen/arch/arm/Kconfig
> > index ecfa6822e4..865ad83a89 100644
> > --- a/xen/arch/arm/Kconfig
> > +++ b/xen/arch/arm/Kconfig
> > @@ -6,6 +6,7 @@ config ARM_64
> >  	def_bool y
> >  	depends on !ARM_32
> >  	select 64BIT
> > +	select EFI
> >  	select HAS_FAST_MULTIPLY
> >
> >  config ARM
> > diff --git a/xen/arch/arm/Makefile b/xen/arch/arm/Makefile
> > index 3d3b97b5b4..ae4efbf76e 100644
> > --- a/xen/arch/arm/Makefile
> > +++ b/xen/arch/arm/Makefile
> > @@ -1,6 +1,6 @@
> >  obj-$(CONFIG_ARM_32) += arm32/
> >  obj-$(CONFIG_ARM_64) += arm64/
> > -obj-$(CONFIG_ARM_64) += efi/
> > +obj-$(CONFIG_EFI) += efi/
> >  obj-$(CONFIG_ACPI) += acpi/
> >  ifneq ($(CONFIG_NO_PLAT),y)
> >  obj-y += platforms/
> > diff --git a/xen/arch/x86/Kconfig b/xen/arch/x86/Kconfig
> > index 28d13b9705..b9ed187f6b 100644
> > --- a/xen/arch/x86/Kconfig
> > +++ b/xen/arch/x86/Kconfig
> > @@ -10,6 +10,7 @@ config X86
> >  	select ALTERNATIVE_CALL
> >  	select ARCH_SUPPORTS_INT128
> >  	select CORE_PARKING
> > +	select EFI
> >  	select HAS_ALTERNATIVE
> >  	select HAS_COMPAT
> >  	select HAS_CPUFREQ
> > diff --git a/xen/common/Kconfig b/xen/common/Kconfig
> > index 9ebb1c239b..f998746a1a 100644
> > --- a/xen/common/Kconfig
> > +++ b/xen/common/Kconfig
> > @@ -11,6 +11,16 @@ config COMPAT
> >  config CORE_PARKING
> >  	bool
> >
> > +config EFI
> > +	bool
> 
> Without the title the option is not user-selectable (or de-selectable).
> So the help message below can never be seen.
> 
> Either add a title, e.g.:
> 
> bool "EFI support"
> 
> Or fully make the option a silent option by removing the help text.
> 
> 

OK. In the current Xen code, EFI is unconditionally compiled. Until
we change the related code, I prefer to remove the help text.
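With the help text dropped, EFI becomes a silent Kconfig symbol: it has no prompt, so users can never toggle it, and it is only enabled via `select` from the architecture symbols. A sketch of the resulting fragments (paths as in the patch):

```
# xen/common/Kconfig -- silent option: no prompt string, so never user-visible
config EFI
	bool

# xen/arch/arm/Kconfig -- ARM_64 (like X86) pulls it in unconditionally
config ARM_64
	def_bool y
	depends on !ARM_32
	select 64BIT
	select EFI
	select HAS_FAST_MULTIPLY
```

This matches the current reality that EFI support is unconditionally built on the architectures that have it, while still giving common code a single CONFIG_EFI to test.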

> 
> > +	---help---
> > +      This option provides support for runtime services provided
> > +      by UEFI firmware (such as non-volatile variables, realtime
> > +      clock, and platform reset). A UEFI stub is also provided to
> > +      allow the kernel to be booted as an EFI application. This
> > +      is only useful for kernels that may run on systems that have
> > +      UEFI firmware.
> > +
> >  config GRANT_TABLE
> >  	bool "Grant table support" if EXPERT
> >  	default y
> > @@ -196,6 +206,7 @@ config KEXEC
> >
> >  config EFI_SET_VIRTUAL_ADDRESS_MAP
> >      bool "EFI: call SetVirtualAddressMap()" if EXPERT
> > +    depends on EFI
> >      ---help---
> >        Call EFI SetVirtualAddressMap() runtime service to setup memory
> map for
> >        further runtime services. According to UEFI spec, it isn't
> strictly
> > diff --git a/xen/include/xen/efi.h b/xen/include/xen/efi.h
> > index 94a7e547f9..661a48286a 100644
> > --- a/xen/include/xen/efi.h
> > +++ b/xen/include/xen/efi.h
> > @@ -25,6 +25,8 @@ extern struct efi efi;
> >
> >  #ifndef __ASSEMBLY__
> >
> > +#ifdef CONFIG_EFI
> > +
> >  union xenpf_efi_info;
> >  union compat_pf_efi_info;
> >
> > @@ -45,6 +47,8 @@ int efi_runtime_call(struct xenpf_efi_runtime_call *);
> >  int efi_compat_get_info(uint32_t idx, union compat_pf_efi_info *);
> >  int efi_compat_runtime_call(struct compat_pf_efi_runtime_call *);
> >
> > +#endif /* CONFIG_EFI*/
> > +
> >  #endif /* !__ASSEMBLY__ */
> >
> >  #endif /* __XEN_EFI_H__ */
> > --
> > 2.25.1
> >


* RE: [PATCH 21/37] xen/arm: Keep memory nodes in dtb for NUMA when boot from EFI
  2021-09-24  1:23   ` Stefano Stabellini
@ 2021-09-24  4:36     ` Wei Chen
  0 siblings, 0 replies; 192+ messages in thread
From: Wei Chen @ 2021-09-24  4:36 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: xen-devel, julien, Bertrand Marquis


> -----Original Message-----
> From: Stefano Stabellini <sstabellini@kernel.org>
> Sent: 2021年9月24日 9:23
> To: Wei Chen <Wei.Chen@arm.com>
> Cc: xen-devel@lists.xenproject.org; sstabellini@kernel.org; julien@xen.org;
> Bertrand Marquis <Bertrand.Marquis@arm.com>
> Subject: Re: [PATCH 21/37] xen/arm: Keep memory nodes in dtb for NUMA when
> boot from EFI
> 
> On Thu, 23 Sep 2021, Wei Chen wrote:
> > EFI can get the memory map from the EFI system table. But the EFI
> > system table doesn't contain memory NUMA information; EFI depends on
> > the ACPI SRAT or device tree memory nodes to parse the memory
> > blocks' NUMA mapping.
> >
> > But in the current code, when Xen is booting from EFI, it will
> > delete all memory nodes in the device tree. So in a UEFI + DTB
> > boot, we no longer have numa-node-id for the memory blocks.
> >
> > So in this patch, we keep the memory nodes in the device tree for
> > the NUMA code to parse the memory numa-node-id later.
> >
> > As a side effect, if we still parse boot memory information in
> > early_scan_node, bootinfo.mem will calculate memory ranges in
> > memory nodes twice. So we have to prevent early_scan_node from
> > parsing memory nodes in an EFI boot.
> >
> > As EFI APIs only can be used in Arm64, so we introduced a stub
> > API for non-EFI supported Arm32. This will prevent
> 
> This last sentence is incomplete.
> 
> But aside from that, this patch looks good to me.
> 

Ah, it was truncated by accident. I will fix it.

> 
> > Signed-off-by: Wei Chen <wei.chen@arm.com>
> > ---
> >  xen/arch/arm/bootfdt.c      |  8 +++++++-
> >  xen/arch/arm/efi/efi-boot.h | 25 -------------------------
> >  xen/include/xen/efi.h       |  7 +++++++
> >  3 files changed, 14 insertions(+), 26 deletions(-)
> >
> > diff --git a/xen/arch/arm/bootfdt.c b/xen/arch/arm/bootfdt.c
> > index afaa0e249b..6bc5a465ec 100644
> > --- a/xen/arch/arm/bootfdt.c
> > +++ b/xen/arch/arm/bootfdt.c
> > @@ -11,6 +11,7 @@
> >  #include <xen/lib.h>
> >  #include <xen/kernel.h>
> >  #include <xen/init.h>
> > +#include <xen/efi.h>
> >  #include <xen/device_tree.h>
> >  #include <xen/libfdt/libfdt.h>
> >  #include <xen/sort.h>
> > @@ -370,7 +371,12 @@ static int __init early_scan_node(const void *fdt,
> >  {
> >      int rc = 0;
> >
> > -    if ( device_tree_node_matches(fdt, node, "memory") )
> > +    /*
> > +     * If Xen has been booted via UEFI, the memory banks will already
> > +     * be populated. So we should skip the parsing.
> > +     */
> > +    if ( !efi_enabled(EFI_BOOT) &&
> > +         device_tree_node_matches(fdt, node, "memory"))
> >          rc = process_memory_node(fdt, node, name, depth,
> >                                   address_cells, size_cells,
> &bootinfo.mem);
> >      else if ( depth == 1 && !dt_node_cmp(name, "reserved-memory") )
> > diff --git a/xen/arch/arm/efi/efi-boot.h b/xen/arch/arm/efi/efi-boot.h
> > index cf9c37153f..d0a9987fa4 100644
> > --- a/xen/arch/arm/efi/efi-boot.h
> > +++ b/xen/arch/arm/efi/efi-boot.h
> > @@ -197,33 +197,8 @@ EFI_STATUS __init
> fdt_add_uefi_nodes(EFI_SYSTEM_TABLE *sys_table,
> >      int status;
> >      u32 fdt_val32;
> >      u64 fdt_val64;
> > -    int prev;
> >      int num_rsv;
> >
> > -    /*
> > -     * Delete any memory nodes present.  The EFI memory map is the only
> > -     * memory description provided to Xen.
> > -     */
> > -    prev = 0;
> > -    for (;;)
> > -    {
> > -        const char *type;
> > -        int len;
> > -
> > -        node = fdt_next_node(fdt, prev, NULL);
> > -        if ( node < 0 )
> > -            break;
> > -
> > -        type = fdt_getprop(fdt, node, "device_type", &len);
> > -        if ( type && strncmp(type, "memory", len) == 0 )
> > -        {
> > -            fdt_del_node(fdt, node);
> > -            continue;
> > -        }
> > -
> > -        prev = node;
> > -    }
> > -
> >     /*
> >      * Delete all memory reserve map entries. When booting via UEFI,
> >      * kernel will use the UEFI memory map to find reserved regions.
> > diff --git a/xen/include/xen/efi.h b/xen/include/xen/efi.h
> > index 661a48286a..b52a4678e9 100644
> > --- a/xen/include/xen/efi.h
> > +++ b/xen/include/xen/efi.h
> > @@ -47,6 +47,13 @@ int efi_runtime_call(struct xenpf_efi_runtime_call *);
> >  int efi_compat_get_info(uint32_t idx, union compat_pf_efi_info *);
> >  int efi_compat_runtime_call(struct compat_pf_efi_runtime_call *);
> >
> > +#else
> > +
> > +static inline bool efi_enabled(unsigned int feature)
> > +{
> > +    return false;
> > +}
> > +
> >  #endif /* CONFIG_EFI*/
> >
> >  #endif /* !__ASSEMBLY__ */
> > --
> > 2.25.1
> >


* RE: [PATCH 23/37] xen/arm: implement node distance helpers for Arm
  2021-09-24  1:46   ` Stefano Stabellini
@ 2021-09-24  4:41     ` Wei Chen
  2021-09-24 19:36       ` Stefano Stabellini
  0 siblings, 1 reply; 192+ messages in thread
From: Wei Chen @ 2021-09-24  4:41 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: xen-devel, julien, Bertrand Marquis

Hi Stefano,

> -----Original Message-----
> From: Stefano Stabellini <sstabellini@kernel.org>
> Sent: September 24, 2021 9:47
> To: Wei Chen <Wei.Chen@arm.com>
> Cc: xen-devel@lists.xenproject.org; sstabellini@kernel.org; julien@xen.org;
> Bertrand Marquis <Bertrand.Marquis@arm.com>
> Subject: Re: [PATCH 23/37] xen/arm: implement node distance helpers for
> Arm
> 
> On Thu, 23 Sep 2021, Wei Chen wrote:
> > We will parse NUMA node distances from the device tree or ACPI
> > table. So we need a matrix to record the distances between
> > any two nodes we parsed. Accordingly, in this patch we provide the
> > numa_set_distance API for the device tree or ACPI table parsers
> > to set the distance between any two nodes.
> > When NUMA initialization fails, __node_distance will return
> > NUMA_REMOTE_DISTANCE; this helps us avoid rolling back the
> > distance matrix when NUMA initialization fails.
> >
> > Signed-off-by: Wei Chen <wei.chen@arm.com>
> > ---
> >  xen/arch/arm/Makefile      |  1 +
> >  xen/arch/arm/numa.c        | 69 ++++++++++++++++++++++++++++++++++++++
> >  xen/include/asm-arm/numa.h | 13 +++++++
> >  3 files changed, 83 insertions(+)
> >  create mode 100644 xen/arch/arm/numa.c
> >
> > diff --git a/xen/arch/arm/Makefile b/xen/arch/arm/Makefile
> > index ae4efbf76e..41ca311b6b 100644
> > --- a/xen/arch/arm/Makefile
> > +++ b/xen/arch/arm/Makefile
> > @@ -35,6 +35,7 @@ obj-$(CONFIG_LIVEPATCH) += livepatch.o
> >  obj-y += mem_access.o
> >  obj-y += mm.o
> >  obj-y += monitor.o
> > +obj-$(CONFIG_NUMA) += numa.o
> >  obj-y += p2m.o
> >  obj-y += percpu.o
> >  obj-y += platform.o
> > diff --git a/xen/arch/arm/numa.c b/xen/arch/arm/numa.c
> > new file mode 100644
> > index 0000000000..3f08870d69
> > --- /dev/null
> > +++ b/xen/arch/arm/numa.c
> > @@ -0,0 +1,69 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +/*
> > + * Arm Architecture support layer for NUMA.
> > + *
> > + * Copyright (C) 2021 Arm Ltd
> > + *
> > + * This program is free software; you can redistribute it and/or modify
> > + * it under the terms of the GNU General Public License version 2 as
> > + * published by the Free Software Foundation.
> > + *
> > + * This program is distributed in the hope that it will be useful,
> > + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> > + * GNU General Public License for more details.
> > + *
> > + * You should have received a copy of the GNU General Public License
> > + * along with this program. If not, see <http://www.gnu.org/licenses/>.
> > + *
> > + */
> > +#include <xen/init.h>
> > +#include <xen/numa.h>
> > +
> > +static uint8_t __read_mostly
> > +node_distance_map[MAX_NUMNODES][MAX_NUMNODES] = {
> > +    { 0 }
> > +};
> > +
> > +void __init numa_set_distance(nodeid_t from, nodeid_t to, uint32_t
> distance)
> > +{
> > +    if ( from >= MAX_NUMNODES || to >= MAX_NUMNODES )
> > +    {
> > +        printk(KERN_WARNING
> > +               "NUMA: invalid nodes: from=%"PRIu8" to=%"PRIu8"
> MAX=%"PRIu8"\n",
> > +               from, to, MAX_NUMNODES);
> > +        return;
> > +    }
> > +
> > +    /* NUMA defines 0xff as an unreachable node and 0-9 are undefined
> */
> > +    if ( distance >= NUMA_NO_DISTANCE ||
> > +        (distance >= NUMA_DISTANCE_UDF_MIN &&
> > +         distance <= NUMA_DISTANCE_UDF_MAX) ||
> > +        (from == to && distance != NUMA_LOCAL_DISTANCE) )
> > +    {
> > +        printk(KERN_WARNING
> > +               "NUMA: invalid distance: from=%"PRIu8" to=%"PRIu8"
> distance=%"PRIu32"\n",
> > +               from, to, distance);
> > +        return;
> > +    }
> > +
> > +    node_distance_map[from][to] = distance;
> > +}
> > +
> > +uint8_t __node_distance(nodeid_t from, nodeid_t to)
> > +{
> > +    /* When NUMA is off, any distance will be treated as remote. */
> > +    if ( srat_disabled() )
> 
> Given that this is ARM specific code and specific to ACPI, I don't think
> we should have any call to something called "srat_disabled".
> 
> I suggest to either rename srat_disabled to numa_distance_disabled.
> 
> Other than that, this patch looks OK to me.
> 

SRAT stands for static resource affinity table; I think a DTB can also be
treated as a static resource affinity table. So I kept SRAT in this patch
and the others. I have seen your comment on patch #25. Until the x86
maintainers give any feedback, can we still keep srat here?

> 
> > +        return NUMA_REMOTE_DISTANCE;
> > +
> > +    /*
> > +     * Check whether the nodes are in the matrix range.
> > +     * When any node is out of range, except from and to nodes are the
> > +     * same, we treat them as unreachable (return 0xFF)
> > +     */
> > +    if ( from >= MAX_NUMNODES || to >= MAX_NUMNODES )
> > +        return from == to ? NUMA_LOCAL_DISTANCE : NUMA_NO_DISTANCE;
> > +
> > +    return node_distance_map[from][to];
> > +}
> > +EXPORT_SYMBOL(__node_distance);
> > diff --git a/xen/include/asm-arm/numa.h b/xen/include/asm-arm/numa.h
> > index 21569e634b..758eafeb05 100644
> > --- a/xen/include/asm-arm/numa.h
> > +++ b/xen/include/asm-arm/numa.h
> > @@ -9,8 +9,21 @@ typedef u8 nodeid_t;
> >
> >  #ifdef CONFIG_NUMA
> >
> > +/*
> > + * In ACPI spec, 0-9 are the reserved values for node distance,
> > + * 10 indicates local node distance, 20 indicates remote node
> > + * distance. Set node distance map in device tree will follow
> > + * the ACPI's definition.
> > + */
> > +#define NUMA_DISTANCE_UDF_MIN   0
> > +#define NUMA_DISTANCE_UDF_MAX   9
> > +#define NUMA_LOCAL_DISTANCE     10
> > +#define NUMA_REMOTE_DISTANCE    20
> > +
> >  #define NR_NODE_MEMBLKS NR_MEM_BANKS
> >
> > +extern void numa_set_distance(nodeid_t from, nodeid_t to, uint32_t
> distance);
> > +
> >  #else
> >
> >  /* Fake one node for now. See also node_online_map. */
> > --
> > 2.25.1
> >
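The validation rules that numa_set_distance() applies to these values (0-9 reserved per the ACPI spec, 0xFF meaning unreachable, and the self-distance pinned to the local value) can be distilled into a standalone predicate. The macro names mirror the patch; the predicate itself is a sketch, not Xen code:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint8_t nodeid_t;

/* Values mirroring the ACPI SLIT conventions used by the patch */
#define NUMA_DISTANCE_UDF_MIN   0
#define NUMA_DISTANCE_UDF_MAX   9
#define NUMA_LOCAL_DISTANCE     10
#define NUMA_NO_DISTANCE        0xFF

/* True when (from, to, distance) would be accepted by numa_set_distance() */
static bool numa_distance_is_valid(nodeid_t from, nodeid_t to, uint32_t distance)
{
    if ( distance >= NUMA_NO_DISTANCE )
        return false;   /* 0xFF and above: unreachable / invalid */
    if ( distance <= NUMA_DISTANCE_UDF_MAX )
        return false;   /* 0-9 are reserved by the ACPI spec */
    if ( from == to && distance != NUMA_LOCAL_DISTANCE )
        return false;   /* self-distance must be exactly the local value */
    return true;
}
```

Anything rejected here is only warned about and dropped by the patch, so a bogus firmware entry cannot corrupt the distance map.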


* RE: [PATCH 24/37] xen/arm: implement two arch helpers to get memory map info
  2021-09-24  2:06   ` Stefano Stabellini
@ 2021-09-24  4:42     ` Wei Chen
  0 siblings, 0 replies; 192+ messages in thread
From: Wei Chen @ 2021-09-24  4:42 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: xen-devel, julien, Bertrand Marquis



> -----Original Message-----
> From: Stefano Stabellini <sstabellini@kernel.org>
> Sent: September 24, 2021 10:06
> To: Wei Chen <Wei.Chen@arm.com>
> Cc: xen-devel@lists.xenproject.org; sstabellini@kernel.org; julien@xen.org;
> Bertrand Marquis <Bertrand.Marquis@arm.com>
> Subject: Re: [PATCH 24/37] xen/arm: implement two arch helpers to get
> memory map info
> 
> On Thu, 23 Sep 2021, Wei Chen wrote:
> > These two helpers are architecture APIs that are required by
> > nodes_cover_memory.
> >
> > Signed-off-by: Wei Chen <wei.chen@arm.com>
> > ---
> >  xen/arch/arm/numa.c | 14 ++++++++++++++
> >  1 file changed, 14 insertions(+)
> >
> > diff --git a/xen/arch/arm/numa.c b/xen/arch/arm/numa.c
> > index 3f08870d69..3755b01ef4 100644
> > --- a/xen/arch/arm/numa.c
> > +++ b/xen/arch/arm/numa.c
> > @@ -67,3 +67,17 @@ uint8_t __node_distance(nodeid_t from, nodeid_t to)
> >      return node_distance_map[from][to];
> >  }
> >  EXPORT_SYMBOL(__node_distance);
> > +
> > +uint32_t __init arch_meminfo_get_nr_bank(void)
> > +{
> > +	return bootinfo.mem.nr_banks;
> > +}
> > +
> > +int __init arch_meminfo_get_ram_bank_range(uint32_t bank,
> > +	paddr_t *start, paddr_t *end)
> > +{
> > +	*start = bootinfo.mem.bank[bank].start;
> > +	*end = bootinfo.mem.bank[bank].start + bootinfo.mem.bank[bank].size;
> > +
> > +	return 0;
> > +}
> 
> The rest of the file is indented using spaces, while this patch is using
> tabs.
> 
> Also, given the implementation, it looks like
> arch_meminfo_get_ram_bank_range should either return void or bool.

I will fix them in the next version.


* RE: [PATCH 25/37] xen/arm: implement bad_srat for Arm NUMA initialization
  2021-09-24  2:09   ` Stefano Stabellini
@ 2021-09-24  4:45     ` Wei Chen
  2021-09-24  8:07     ` Jan Beulich
  1 sibling, 0 replies; 192+ messages in thread
From: Wei Chen @ 2021-09-24  4:45 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: xen-devel, julien, Bertrand Marquis, jbeulich, andrew.cooper3,
	roger.pau, wl

Hi Stefano,

> -----Original Message-----
> From: Stefano Stabellini <sstabellini@kernel.org>
> Sent: September 24, 2021 10:10
> To: Wei Chen <Wei.Chen@arm.com>
> Cc: xen-devel@lists.xenproject.org; sstabellini@kernel.org; julien@xen.org;
> Bertrand Marquis <Bertrand.Marquis@arm.com>; jbeulich@suse.com;
> andrew.cooper3@citrix.com; roger.pau@citrix.com; wl@xen.org
> Subject: Re: [PATCH 25/37] xen/arm: implement bad_srat for Arm NUMA
> initialization
> 
> On Thu, 23 Sep 2021, Wei Chen wrote:
> > NUMA initialization will parse information from the firmware-provided
> > static resource affinity table (ACPI SRAT or DTB). bad_srat is a
> > function that will be used when the initialization code encounters
> > some unexpected errors.
> >
> > In this patch, we introduce an Arm version of bad_srat for the NUMA
> > common initialization code to invoke.
> >
> > Signed-off-by: Wei Chen <wei.chen@arm.com>
> > ---
> >  xen/arch/arm/numa.c | 7 +++++++
> >  1 file changed, 7 insertions(+)
> >
> > diff --git a/xen/arch/arm/numa.c b/xen/arch/arm/numa.c
> > index 3755b01ef4..5209d3de4d 100644
> > --- a/xen/arch/arm/numa.c
> > +++ b/xen/arch/arm/numa.c
> > @@ -18,6 +18,7 @@
> >   *
> >   */
> >  #include <xen/init.h>
> > +#include <xen/nodemask.h>
> >  #include <xen/numa.h>
> >
> >  static uint8_t __read_mostly
> > @@ -25,6 +26,12 @@ node_distance_map[MAX_NUMNODES][MAX_NUMNODES] = {
> >      { 0 }
> >  };
> >
> > +__init void bad_srat(void)
> > +{
> > +    printk(KERN_ERR "NUMA: Firmware SRAT table not used.\n");
> > +    fw_numa = -1;
> > +}
> 
> I realize that the series keeps the "srat" terminology everywhere on DT
> too. I wonder if it is worth replacing srat with something like
> "numa_distance" everywhere as appropriate. I am adding the x86
> maintainers for an opinion.
> 
> If you guys prefer to keep srat (if nothing else, it is concise), I am
> also OK with keeping srat although it is not technically accurate.

I have left some comments on patch #23 about srat.
I would prefer not to replace srat with numa_distance, because SRAT not
only contains distances; it also includes affinity information for CPUs
and other devices.



* RE: [PATCH 29/37] xen/arm: introduce a helper to parse device tree processor node
  2021-09-24  2:44   ` Stefano Stabellini
@ 2021-09-24  4:46     ` Wei Chen
  0 siblings, 0 replies; 192+ messages in thread
From: Wei Chen @ 2021-09-24  4:46 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: xen-devel, julien, Bertrand Marquis


> -----Original Message-----
> From: Stefano Stabellini <sstabellini@kernel.org>
> Sent: September 24, 2021 10:45
> To: Wei Chen <Wei.Chen@arm.com>
> Cc: xen-devel@lists.xenproject.org; sstabellini@kernel.org; julien@xen.org;
> Bertrand Marquis <Bertrand.Marquis@arm.com>
> Subject: Re: [PATCH 29/37] xen/arm: introduce a helper to parse device
> tree processor node
> 
> On Thu, 23 Sep 2021, Wei Chen wrote:
> > Processor NUMA ID information is stored in a device tree processor
> > node as "numa-node-id". We need a new helper to parse this ID from
> > the processor node. Once we have read the ID, its validity still
> > needs to be checked. If we get an invalid NUMA ID from any
> > processor node, the device tree will be marked as having invalid
> > NUMA information.
> >
> > Signed-off-by: Wei Chen <wei.chen@arm.com>
> > ---
> >  xen/arch/arm/Makefile           |  1 +
> >  xen/arch/arm/numa_device_tree.c | 58 +++++++++++++++++++++++++++++++++
> >  2 files changed, 59 insertions(+)
> >  create mode 100644 xen/arch/arm/numa_device_tree.c
> >
> > diff --git a/xen/arch/arm/Makefile b/xen/arch/arm/Makefile
> > index 41ca311b6b..c50df2c25d 100644
> > --- a/xen/arch/arm/Makefile
> > +++ b/xen/arch/arm/Makefile
> > @@ -36,6 +36,7 @@ obj-y += mem_access.o
> >  obj-y += mm.o
> >  obj-y += monitor.o
> >  obj-$(CONFIG_NUMA) += numa.o
> > +obj-$(CONFIG_DEVICE_TREE_NUMA) += numa_device_tree.o
> >  obj-y += p2m.o
> >  obj-y += percpu.o
> >  obj-y += platform.o
> > diff --git a/xen/arch/arm/numa_device_tree.c
> b/xen/arch/arm/numa_device_tree.c
> > new file mode 100644
> > index 0000000000..2428fbae0b
> > --- /dev/null
> > +++ b/xen/arch/arm/numa_device_tree.c
> > @@ -0,0 +1,58 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +/*
> > + * Arm Architecture support layer for NUMA.
> > + *
> > + * Copyright (C) 2021 Arm Ltd
> > + *
> > + * This program is free software; you can redistribute it and/or modify
> > + * it under the terms of the GNU General Public License version 2 as
> > + * published by the Free Software Foundation.
> > + *
> > + * This program is distributed in the hope that it will be useful,
> > + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> > + * GNU General Public License for more details.
> > + *
> > + * You should have received a copy of the GNU General Public License
> > + * along with this program. If not, see <http://www.gnu.org/licenses/>.
> > + *
> > + */
> > +#include <xen/init.h>
> > +#include <xen/nodemask.h>
> > +#include <xen/numa.h>
> > +#include <xen/libfdt/libfdt.h>
> > +#include <xen/device_tree.h>
> > +
> > +/* Callback for device tree processor affinity */
> > +static int __init fdt_numa_processor_affinity_init(nodeid_t node)
> > +{
> > +    if ( srat_disabled() )
> > +        return -EINVAL;
> 
> fdt_numa_processor_affinity_init is called by fdt_parse_numa_cpu_node
> which is already parsing NUMA related info. Should this srat_disabled
> check be moved to fdt_parse_numa_cpu_node?
> 

Ah, yes, it's a good suggestion, I will address it in next version.

> 
> > +    else if ( node == NUMA_NO_NODE || node >= MAX_NUMNODES )
> > +    {
> > +        bad_srat();
> > +        return -EINVAL;
> > +	}
> > +
> > +    numa_set_processor_nodes_parsed(node);
> > +    fw_numa = 1;
> > +
> > +    printk(KERN_INFO "DT: NUMA node %"PRIu7" processor parsed\n", node);
> > +
> > +    return 0;
> > +}
> > +
> > +/* Parse CPU NUMA node info */
> > +static int __init fdt_parse_numa_cpu_node(const void *fdt, int node)
> > +{
> > +    uint32_t nid;
> > +
> > +    nid = device_tree_get_u32(fdt, node, "numa-node-id", MAX_NUMNODES);
> > +    if ( nid >= MAX_NUMNODES )
> > +    {
> > +        printk(XENLOG_ERR "Node id %u exceeds maximum value\n", nid);
>                                       ^ PRIu32
> 
> 
> > +        return -EINVAL;
> > +    }
> > +
> > +    return fdt_numa_processor_affinity_init(nid);
> > +}
> > --
> > 2.25.1
> >
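The lookup in the quoted hunk amounts to: read the u32 cell if the property is present, fall back to a sentinel, then range-check against MAX_NUMNODES. A standalone model (the raw property buffer and the helper names here are illustrative stand-ins for libfdt/Xen's `device_tree_get_u32`):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_NUMNODES 8   /* illustrative; Xen derives this from NODES_SHIFT */

/* FDT property cells are stored big-endian; convert one cell to host order */
static uint32_t fdt32_to_cpu_sketch(const uint8_t *cell)
{
    return ((uint32_t)cell[0] << 24) | ((uint32_t)cell[1] << 16) |
           ((uint32_t)cell[2] << 8)  |  (uint32_t)cell[3];
}

/* Model of device_tree_get_u32(): return the cell, or dflt when absent */
static uint32_t get_u32_or_default(const uint8_t *prop, uint32_t dflt)
{
    return prop ? fdt32_to_cpu_sketch(prop) : dflt;
}

/* Returns the node id, or MAX_NUMNODES to signal "absent or out of range" */
static uint32_t parse_numa_node_id(const uint8_t *prop)
{
    uint32_t nid = get_u32_or_default(prop, MAX_NUMNODES);

    return (nid >= MAX_NUMNODES) ? MAX_NUMNODES : nid;
}
```

Using MAX_NUMNODES as both the "property missing" default and the invalid marker is why the patch can get away with a single `nid >= MAX_NUMNODES` test.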


* RE: [PATCH 31/37] xen/arm: introduce a helper to parse device tree NUMA distance map
  2021-09-24  3:05   ` Stefano Stabellini
@ 2021-09-24  5:23     ` Wei Chen
  0 siblings, 0 replies; 192+ messages in thread
From: Wei Chen @ 2021-09-24  5:23 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: xen-devel, julien, Bertrand Marquis

Hi Stefano,

> -----Original Message-----
> From: Stefano Stabellini <sstabellini@kernel.org>
> Sent: September 24, 2021 11:05
> To: Wei Chen <Wei.Chen@arm.com>
> Cc: xen-devel@lists.xenproject.org; sstabellini@kernel.org; julien@xen.org;
> Bertrand Marquis <Bertrand.Marquis@arm.com>
> Subject: Re: [PATCH 31/37] xen/arm: introduce a helper to parse device
> tree NUMA distance map
> 
> On Thu, 23 Sep 2021, Wei Chen wrote:
> > A NUMA aware device tree will provide a "distance-map" node to
> > describe distance between any two nodes. This patch introduce a
> > new helper to parse this distance map.
> >
> > Signed-off-by: Wei Chen <wei.chen@arm.com>
> > ---
> >  xen/arch/arm/numa_device_tree.c | 106 ++++++++++++++++++++++++++++++++
> >  1 file changed, 106 insertions(+)
> >
> > diff --git a/xen/arch/arm/numa_device_tree.c
> b/xen/arch/arm/numa_device_tree.c
> > index 7918a397fa..e7fa84df4c 100644
> > --- a/xen/arch/arm/numa_device_tree.c
> > +++ b/xen/arch/arm/numa_device_tree.c
> > @@ -136,3 +136,109 @@ static int __init fdt_parse_numa_memory_node(const
> void *fdt, int node,
> >
> >      return 0;
> >  }
> > +
> > +
> > +/* Parse NUMA distance map v1 */
> > +static int __init fdt_parse_numa_distance_map_v1(const void *fdt, int
> node)
> > +{
> > +    const struct fdt_property *prop;
> > +    const __be32 *matrix;
> > +    uint32_t entry_count;
> > +    int len, i;
> > +
> > +    printk(XENLOG_INFO "NUMA: parsing numa-distance-map\n");
> > +
> > +    prop = fdt_get_property(fdt, node, "distance-matrix", &len);
> > +    if ( !prop )
> > +    {
> > +        printk(XENLOG_WARNING
> > +               "NUMA: No distance-matrix property in distance-map\n");
> 
> I haven't seen where this is called from yet but make sure to print an
> error here only if NUMA info is actually expected and required, not on
> regular non-NUMA boots on non-NUMA machines.
> 

Xen can only reach this function when users have enabled the NUMA option
and numa_off is false (this check is in numa_init). So non-NUMA machines
will not reach here.

> 
> > +        return -EINVAL;
> > +    }
> > +
> > +    if ( len % sizeof(uint32_t) != 0 )
> > +    {
> > +        printk(XENLOG_WARNING
> > +               "distance-matrix in node is not a multiple of u32\n");
> > +        return -EINVAL;
> > +    }
> > +
> > +    entry_count = len / sizeof(uint32_t);
> > +    if ( entry_count == 0 )
> > +    {
> > +        printk(XENLOG_WARNING "NUMA: Invalid distance-matrix\n");
> > +
> > +        return -EINVAL;
> > +    }
> > +
> > +    matrix = (const __be32 *)prop->data;
> > +    for ( i = 0; i + 2 < entry_count; i += 3 )
> > +    {
> > +        uint32_t from, to, distance, opposite;
> > +
> > +        from = dt_next_cell(1, &matrix);
> > +        to = dt_next_cell(1, &matrix);
> > +        distance = dt_next_cell(1, &matrix);
> > +        if ( (from == to && distance != NUMA_LOCAL_DISTANCE) ||
> > +            (from != to && distance <= NUMA_LOCAL_DISTANCE) )
> > +        {
> > +            printk(XENLOG_WARNING
> > +                   "NUMA: Invalid distance: NODE#%u->NODE#%u:%u\n",
> > +                   from, to, distance);
> > +            return -EINVAL;
> > +        }
> > +
> > +        printk(XENLOG_INFO "NUMA: distance: NODE#%u->NODE#%u:%u\n",
> > +               from, to, distance);
> > +
> > +        /* Get opposite way distance */
> > +        opposite = __node_distance(from, to);
> 
> This is not checking for the opposite node distance but...
> 

Ah, yes, it's a mistake. It should be __node_distance(to, from);
> 
> > +        if ( opposite == 0 )
> > +        {
> > +            /* Bi-directions are not set, set both */
> > +            numa_set_distance(from, to, distance);
> > +            numa_set_distance(to, from, distance);
> 
> ...since you set both directions here at once then it is OK. You are
> checking if this direction has already been set which is correct I
> think. But the comment "Get opposite way distance" and the variable name
> "opposite" are wrong.
> 

My mistake above caused this misunderstanding:
I want to check whether the opposite-way distance has been set.
If it has not been set, I will set both ways here.

So I will change "opposite = __node_distance(from, to);" to
"opposite = __node_distance(to, from);" and keep the comment.
What do you think?

> 
> > +        }
> > +        else
> > +        {
> > +            /*
> > +             * Opposite way distance has been set to a different value.
> > +             * It may be a firmware device tree bug?
> > +             */
> > +            if ( opposite != distance )
> > +            {
> > +                /*
> > +                 * In device tree NUMA distance-matrix binding:
> > +                 *
> https://www.kernel.org/doc/Documentation/devicetree/bindings/numa.txt
> > +                 * There is a notes mentions:
> > +                 * "Each entry represents distance from first node to
> > +                 *  second node. The distances are equal in either
> > +                 *  direction."
> > +                 *
> > +                 * That means device tree doesn't permit this case.
> > +                 * But in ACPI spec, it cares to specifically permit
> this
> > +                 * case:
> > +                 * "Except for the relative distance from a System
> Locality
> > +                 *  to itself, each relative distance is stored twice
> in the
> > +                 *  matrix. This provides the capability to describe
> the
> > +                 *  scenario where the relative distances for the two
> > +                 *  directions between System Localities is different."
> > +                 *
> > +                 * That means a real machine allows such NUMA
> configuration.
> > +                 * So, place a WARNING here to notice system
> administrators,
> > +                 * is it the specail case that they hijack the device
> tree
> > +                 * to support their rare machines?
> > +                 */
> > +                printk(XENLOG_WARNING
> > +                       "Un-matched bi-direction! NODE#%u->NODE#%u:%u,
> NODE#%u->NODE#%u:%u\n",
> > +                       from, to, distance, to, from, opposite);
> 
> PRIu32

Yes.

> 
> 
> > +            }
> > +
> > +            /* Opposite way distance has been set, just set this way */
> > +            numa_set_distance(from, to, distance);
> > +        }
> > +    }
> > +
> > +    return 0;
> > +}
> > --
> > 2.25.1
> >


* RE: [PATCH 30/37] xen/arm: introduce a helper to parse device tree memory node
  2021-09-24  3:05   ` Stefano Stabellini
@ 2021-09-24  7:54     ` Wei Chen
  0 siblings, 0 replies; 192+ messages in thread
From: Wei Chen @ 2021-09-24  7:54 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: xen-devel, julien, Bertrand Marquis

Hi Stefano,

> -----Original Message-----
> From: Stefano Stabellini <sstabellini@kernel.org>
> Sent: 2021年9月24日 11:05
> To: Wei Chen <Wei.Chen@arm.com>
> Cc: xen-devel@lists.xenproject.org; sstabellini@kernel.org; julien@xen.org;
> Bertrand Marquis <Bertrand.Marquis@arm.com>
> Subject: Re: [PATCH 30/37] xen/arm: introduce a helper to parse device
> tree memory node
> 
> On Thu, 23 Sep 2021, Wei Chen wrote:
> > Memory blocks' NUMA ID information is stored in device tree's
> > memory nodes as "numa-node-id". We need a new helper to parse
> > and verify this ID from memory nodes.
> >
> > Signed-off-by: Wei Chen <wei.chen@arm.com>
> 
> There are tabs for indentation in this patch, we use spaces.
> 

OK

> 
> > ---
> >  xen/arch/arm/numa_device_tree.c | 80 +++++++++++++++++++++++++++++++++
> >  1 file changed, 80 insertions(+)
> >
> > diff --git a/xen/arch/arm/numa_device_tree.c
> b/xen/arch/arm/numa_device_tree.c
> > index 2428fbae0b..7918a397fa 100644
> > --- a/xen/arch/arm/numa_device_tree.c
> > +++ b/xen/arch/arm/numa_device_tree.c
> > @@ -42,6 +42,35 @@ static int __init
> fdt_numa_processor_affinity_init(nodeid_t node)
> >      return 0;
> >  }
> >
> > +/* Callback for parsing of the memory regions affinity */
> > +static int __init fdt_numa_memory_affinity_init(nodeid_t node,
> > +                                paddr_t start, paddr_t size)
> 
> Please align the parameters
> 

OK

> 
> > +{
> > +    int ret;
> > +
> > +    if ( srat_disabled() )
> > +    {
> > +        return -EINVAL;
> > +    }
> > +
> > +	if ( !numa_memblks_available() )
> > +	{
> > +		dprintk(XENLOG_WARNING,
> > +                "Too many numa entry, try bigger NR_NODE_MEMBLKS \n");
> > +		bad_srat();
> > +		return -EINVAL;
> > +	}
> > +
> > +	ret = numa_update_node_memblks(node, start, size, false);
> > +	if ( ret != 0 )
> > +	{
> > +		bad_srat();
> > +	    return -EINVAL;
> > +	}
> > +
> > +    return 0;
> > +}
> 
> Aside from spaces/tabs, this is a lot better!
> 

ok

> 
> >  /* Parse CPU NUMA node info */
> >  static int __init fdt_parse_numa_cpu_node(const void *fdt, int node)
> >  {
> > @@ -56,3 +85,54 @@ static int __init fdt_parse_numa_cpu_node(const void
> *fdt, int node)
> >
> >      return fdt_numa_processor_affinity_init(nid);
> >  }
> > +
> > +/* Parse memory node NUMA info */
> > +static int __init fdt_parse_numa_memory_node(const void *fdt, int node,
> > +    const char *name, uint32_t addr_cells, uint32_t size_cells)
> 
> Please align the parameters
> 

ok

> 
> > +{
> > +    uint32_t nid;
> > +    int ret = 0, len;
> > +    paddr_t addr, size;
> > +    const struct fdt_property *prop;
> > +    uint32_t idx, ranges;
> > +    const __be32 *addresses;
> > +
> > +    nid = device_tree_get_u32(fdt, node, "numa-node-id", MAX_NUMNODES);
> > +    if ( nid >= MAX_NUMNODES )
> > +    {
> > +        printk(XENLOG_WARNING "Node id %u exceeds maximum value\n",
> nid);
> > +        return -EINVAL;
> > +    }
> > +
> > +    prop = fdt_get_property(fdt, node, "reg", &len);
> > +    if ( !prop )
> > +    {
> > +        printk(XENLOG_WARNING
> > +               "fdt: node `%s': missing `reg' property\n", name);
> > +        return -EINVAL;
> > +    }
> > +
> > +    addresses = (const __be32 *)prop->data;
> > +    ranges = len / (sizeof(__be32)* (addr_cells + size_cells));
> > +    for ( idx = 0; idx < ranges; idx++ )
> > +    {
> > +        device_tree_get_reg(&addresses, addr_cells, size_cells, &addr,
> &size);
> > +        /* Skip zero size ranges */
> > +        if ( !size )
> > +            continue;
> > +
> > +        ret = fdt_numa_memory_affinity_init(nid, addr, size);
> > +        if ( ret ) {
> > +            return -EINVAL;
> > +        }
> > +    }
> 
> I take it would be difficult to parse numa-node-id and call
> fdt_numa_memory_affinity_init from
> xen/arch/arm/bootfdt.c:device_tree_get_meminfo. Is it because
> device_tree_get_meminfo is called too early?
> 

When I was composing this patch, Penny's patch hadn't been merged yet.
I will look into it.

> 
> > +    if ( idx == 0 )
> > +    {
> > +        printk(XENLOG_ERR
> > +               "bad property in memory node, idx=%d ret=%d\n", idx,
> ret);
> > +        return -EINVAL;
> > +    }
> > +
> > +    return 0;
> > +}
> > --
> > 2.25.1
> >


* RE: [PATCH 32/37] xen/arm: unified entry to parse all NUMA data from device tree
  2021-09-24  3:16   ` Stefano Stabellini
@ 2021-09-24  7:58     ` Wei Chen
  2021-09-24 19:42       ` Stefano Stabellini
  0 siblings, 1 reply; 192+ messages in thread
From: Wei Chen @ 2021-09-24  7:58 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: xen-devel, julien, Bertrand Marquis

Hi Stefano,

> -----Original Message-----
> From: Stefano Stabellini <sstabellini@kernel.org>
> Sent: 2021年9月24日 11:17
> To: Wei Chen <Wei.Chen@arm.com>
> Cc: xen-devel@lists.xenproject.org; sstabellini@kernel.org; julien@xen.org;
> Bertrand Marquis <Bertrand.Marquis@arm.com>
> Subject: Re: [PATCH 32/37] xen/arm: unified entry to parse all NUMA data
> from device tree
> 
> On Thu, 23 Sep 2021, Wei Chen wrote:
> > In this API, we scan whole device tree to parse CPU node id, memory
>           ^ function   ^ the whole
> 
> > node id and distance-map. Though early_scan_node will invoke has a
> > handler to process memory nodes. If we want to parse memory node id
> > in this handler, we have to embeded NUMA parse code in this handler.
>                               ^ embed
> 
> > But we still need to scan whole device tree to find CPU NUMA id and
> > distance-map. In this case, we include memory NUMA id parse in this
> > API too. Another benefit is that we have a unique entry for device
>   ^ function
> 
> > tree NUMA data parse.
> 
> Ah, that's the explanation I was asking for earlier!
> 

The question about device_tree_get_meminfo?

> 
> > Signed-off-by: Wei Chen <wei.chen@arm.com>
> > ---
> >  xen/arch/arm/numa_device_tree.c | 30 ++++++++++++++++++++++++++++++
> >  xen/include/asm-arm/numa.h      |  1 +
> >  2 files changed, 31 insertions(+)
> >
> > diff --git a/xen/arch/arm/numa_device_tree.c
> b/xen/arch/arm/numa_device_tree.c
> > index e7fa84df4c..6a3fed0002 100644
> > --- a/xen/arch/arm/numa_device_tree.c
> > +++ b/xen/arch/arm/numa_device_tree.c
> > @@ -242,3 +242,33 @@ static int __init
> fdt_parse_numa_distance_map_v1(const void *fdt, int node)
> >
> >      return 0;
> >  }
> > +
> > +static int __init fdt_scan_numa_nodes(const void *fdt,
> > +                int node, const char *uname, int depth,
> > +                u32 address_cells, u32 size_cells, void *data)
> 
> Please align parameters
> 

OK

> 
> > +{
> > +    int len, ret = 0;
> > +    const void *prop;
> > +
> > +    prop = fdt_getprop(fdt, node, "device_type", &len);
> > +    if (prop)
> 
> code style
> 

OK

> 
> > +    {
> > +        len += 1;
> > +        if ( memcmp(prop, "cpu", len) == 0 )
> > +            ret = fdt_parse_numa_cpu_node(fdt, node);
> > +        else if ( memcmp(prop, "memory", len) == 0 )
> > +            ret = fdt_parse_numa_memory_node(fdt, node, uname,
> > +                                address_cells, size_cells);
> 
> I realize that with the inclusion of '\0' in the check, the usage of
> memcmp should be safe, but I would prefer if we used strncmp instead.
> 

Ok, I will use strncmp in next version.

> 
> > +    }
> > +    else if ( fdt_node_check_compatible(fdt, node,
> > +                                "numa-distance-map-v1") == 0 )
> > +        ret = fdt_parse_numa_distance_map_v1(fdt, node);
> > +
> > +    return ret;
> > +}
> > +
> > +/* Initialize NUMA from device tree */
> > +int __init numa_device_tree_init(const void *fdt)
> > +{
> > +    return device_tree_for_each_node(fdt, 0, fdt_scan_numa_nodes, NULL);
> > +}
> > diff --git a/xen/include/asm-arm/numa.h b/xen/include/asm-arm/numa.h
> > index 7675012cb7..f46e8e2935 100644
> > --- a/xen/include/asm-arm/numa.h
> > +++ b/xen/include/asm-arm/numa.h
> > @@ -23,6 +23,7 @@ typedef u8 nodeid_t;
> >  #define NR_NODE_MEMBLKS NR_MEM_BANKS
> >
> >  extern void numa_set_distance(nodeid_t from, nodeid_t to, uint32_t
> distance);
> > +extern int numa_device_tree_init(const void *fdt);
> >
> >  #else
> >
> > --
> > 2.25.1
> >


* Re: [PATCH 20/37] xen: introduce CONFIG_EFI to stub API for non-EFI architecture
  2021-09-24  4:34     ` Wei Chen
@ 2021-09-24  7:58       ` Jan Beulich
  2021-09-24 10:31         ` Wei Chen
  0 siblings, 1 reply; 192+ messages in thread
From: Jan Beulich @ 2021-09-24  7:58 UTC (permalink / raw)
  To: Wei Chen; +Cc: xen-devel, julien, Bertrand Marquis, Stefano Stabellini

On 24.09.2021 06:34, Wei Chen wrote:
>> From: Stefano Stabellini <sstabellini@kernel.org>
>> Sent: 2021年9月24日 9:15
>>
>> On Thu, 23 Sep 2021, Wei Chen wrote:
>>> --- a/xen/common/Kconfig
>>> +++ b/xen/common/Kconfig
>>> @@ -11,6 +11,16 @@ config COMPAT
>>>  config CORE_PARKING
>>>  	bool
>>>
>>> +config EFI
>>> +	bool
>>
>> Without the title the option is not user-selectable (or de-selectable).
>> So the help message below can never be seen.
>>
>> Either add a title, e.g.:
>>
>> bool "EFI support"
>>
>> Or fully make the option a silent option by removing the help text.
> 
OK. In the current Xen code, EFI is compiled unconditionally. Until
we change the related code, I prefer to remove the help text.

But that's not true: At least on x86 EFI gets compiled depending on
tool chain capabilities. Ultimately we may indeed want a user
selectable option here, but until then I'm afraid having this option
at all may be misleading on x86.

Jan
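The distinction Stefano draws above comes down to whether the `bool` line carries a prompt. A hedged Kconfig sketch of the two forms under discussion (illustrative only, not the committed code):

```kconfig
# Silent option: no prompt, so the user never sees it in menuconfig
# and any help text below it would be unreachable.
config EFI
	bool

# User-selectable option: the prompt string makes it (de)selectable,
# and the help text becomes visible.
config EFI
	bool "EFI support"
	help
	  Enable EFI boot and runtime services support.
```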




* Re: [PATCH 25/37] xen/arm: implement bad_srat for Arm NUMA initialization
  2021-09-24  2:09   ` Stefano Stabellini
  2021-09-24  4:45     ` Wei Chen
@ 2021-09-24  8:07     ` Jan Beulich
  2021-09-24 19:33       ` Stefano Stabellini
  1 sibling, 1 reply; 192+ messages in thread
From: Jan Beulich @ 2021-09-24  8:07 UTC (permalink / raw)
  To: Stefano Stabellini, Wei Chen
  Cc: xen-devel, julien, Bertrand.Marquis, andrew.cooper3, roger.pau, wl

On 24.09.2021 04:09, Stefano Stabellini wrote:
> On Thu, 23 Sep 2021, Wei Chen wrote:
>> NUMA initialization will parse information from firmware provided
>> static resource affinity table (ACPI SRAT or DTB). bad_srat if a
>> function that will be used when initialization code encounters
>> some unexcepted errors.
>>
>> In this patch, we introduce Arm version bad_srat for NUMA common
>> initialization code to invoke it.
>>
>> Signed-off-by: Wei Chen <wei.chen@arm.com>
>> ---
>>  xen/arch/arm/numa.c | 7 +++++++
>>  1 file changed, 7 insertions(+)
>>
>> diff --git a/xen/arch/arm/numa.c b/xen/arch/arm/numa.c
>> index 3755b01ef4..5209d3de4d 100644
>> --- a/xen/arch/arm/numa.c
>> +++ b/xen/arch/arm/numa.c
>> @@ -18,6 +18,7 @@
>>   *
>>   */
>>  #include <xen/init.h>
>> +#include <xen/nodemask.h>
>>  #include <xen/numa.h>
>>  
>>  static uint8_t __read_mostly
>> @@ -25,6 +26,12 @@ node_distance_map[MAX_NUMNODES][MAX_NUMNODES] = {
>>      { 0 }
>>  };
>>  
>> +__init void bad_srat(void)
>> +{
>> +    printk(KERN_ERR "NUMA: Firmware SRAT table not used.\n");
>> +    fw_numa = -1;
>> +}
> 
> I realize that the series keeps the "srat" terminology everywhere on DT
> too. I wonder if it is worth replacing srat with something like
> "numa_distance" everywhere as appropriate. I am adding the x86
> maintainers for an opinion.
> 
> If you guys prefer to keep srat (if nothing else, it is concise), I am
> also OK with keeping srat although it is not technically accurate.

I think we want to tell apart both things: Where we truly talk about
the firmware's SRAT table, keeping that name is fine. But I suppose
there no "Firmware SRAT table" (as in the log message above) when
using DT? If so, at the very least in log messages SRAT shouldn't be
mentioned. Perhaps even functions serving both an ACPI and a DT
purpose would better not use "srat" in their names (but I'm not as
fussed about it there.)

Jan




* Re: [PATCH 02/37] xen: introduce a Kconfig option to configure NUMA nodes number
  2021-09-23 12:02 ` [PATCH 02/37] xen: introduce a Kconfig option to configure NUMA nodes number Wei Chen
  2021-09-23 23:45   ` Stefano Stabellini
@ 2021-09-24  8:55   ` Jan Beulich
  2021-09-24 10:33     ` Wei Chen
  1 sibling, 1 reply; 192+ messages in thread
From: Jan Beulich @ 2021-09-24  8:55 UTC (permalink / raw)
  To: Wei Chen; +Cc: Bertrand.Marquis, xen-devel, sstabellini, julien

On 23.09.2021 14:02, Wei Chen wrote:
> Current NUMA nodes number is a hardcode configuration. This
> configuration is difficult for an administrator to change
> unless changing the code.
> 
> So in this patch, we introduce this new Kconfig option for
> administrators to change NUMA nodes number conveniently.
> Also considering that not all architectures support NUMA,
> this Kconfig option only can be visible on NUMA enabled
> architectures. Non-NUMA supported architectures can still
> use 1 as MAX_NUMNODES.

Do you really mean administrators here? To me command line options
are for administrators, but build decisions are usually taken by
build managers of distros.

> --- a/xen/arch/Kconfig
> +++ b/xen/arch/Kconfig
> @@ -17,3 +17,14 @@ config NR_CPUS
>  	  For CPU cores which support Simultaneous Multi-Threading or similar
>  	  technologies, this the number of logical threads which Xen will
>  	  support.
> +
> +config NR_NUMA_NODES
> +	int "Maximum number of NUMA nodes supported"
> +	range 1 4095

How was this upper bound established? Seeing 4095 is the limit of the
number of CPUs, do we really expect a CPU per node on such huge
systems? And did you check that whichever involved data types and
structures are actually suitable? I'm thinking e.g. of things like ...

> --- a/xen/include/asm-x86/numa.h
> +++ b/xen/include/asm-x86/numa.h
> @@ -3,8 +3,6 @@
>  
>  #include <xen/cpumask.h>
>  
> -#define NODES_SHIFT 6
> -
>  typedef u8 nodeid_t;

... this.

Jan




* Re: [PATCH 03/37] xen/x86: Initialize memnodemapsize while faking NUMA node
  2021-09-23 12:02 ` [PATCH 03/37] xen/x86: Initialize memnodemapsize while faking NUMA node Wei Chen
@ 2021-09-24  8:57   ` Jan Beulich
  2021-09-24 10:34     ` Wei Chen
  0 siblings, 1 reply; 192+ messages in thread
From: Jan Beulich @ 2021-09-24  8:57 UTC (permalink / raw)
  To: Wei Chen; +Cc: Bertrand.Marquis, xen-devel, sstabellini, julien

On 23.09.2021 14:02, Wei Chen wrote:
> When system turns NUMA off or system lacks of NUMA support,
> Xen will fake a NUMA node to make system works as a single
> node NUMA system.
> 
> In this case the memory node map doesn't need to be allocated
> from boot pages, it will use the _memnodemap directly. But
> memnodemapsize hasn't been set. Xen should assert in phys_to_nid.
> Because x86 was using an empty macro "VIRTUAL_BUG_ON" to replace
> SSERT, this bug will not be triggered on x86.

Somehow an A got lost here, which I'll add back while committing.

> Actually, Xen will only use 1 slot of memnodemap in this case.
> So we set memnodemap[0] to 0 and memnodemapsize to 1 in this
> patch to fix it.
> 
> Signed-off-by: Wei Chen <wei.chen@arm.com>

Acked-by: Jan Beulich <jbeulich@suse.com>




* RE: [PATCH 34/37] xen/arm: enable device tree based NUMA in system init
  2021-09-24  3:28   ` Stefano Stabellini
@ 2021-09-24  9:52     ` Wei Chen
  0 siblings, 0 replies; 192+ messages in thread
From: Wei Chen @ 2021-09-24  9:52 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: xen-devel, julien, Bertrand Marquis

Hi Stefano,

> -----Original Message-----
> From: Stefano Stabellini <sstabellini@kernel.org>
> Sent: 2021年9月24日 11:28
> To: Wei Chen <Wei.Chen@arm.com>
> Cc: xen-devel@lists.xenproject.org; sstabellini@kernel.org; julien@xen.org;
> Bertrand Marquis <Bertrand.Marquis@arm.com>
> Subject: Re: [PATCH 34/37] xen/arm: enable device tree based NUMA in
> system init
> 
> On Thu, 23 Sep 2021, Wei Chen wrote:
> > In this patch, we can start to create NUMA system that is
> > based on device tree.
> >
> > Signed-off-by: Wei Chen <wei.chen@arm.com>
> > ---
> >  xen/arch/arm/numa.c        | 55 ++++++++++++++++++++++++++++++++++++++
> >  xen/arch/arm/setup.c       |  7 +++++
> >  xen/include/asm-arm/numa.h |  6 +++++
> >  3 files changed, 68 insertions(+)
> >
> > diff --git a/xen/arch/arm/numa.c b/xen/arch/arm/numa.c
> > index 7f05299b76..d7a3d32d4b 100644
> > --- a/xen/arch/arm/numa.c
> > +++ b/xen/arch/arm/numa.c
> > @@ -18,8 +18,10 @@
> >   *
> >   */
> >  #include <xen/init.h>
> > +#include <xen/device_tree.h>
> >  #include <xen/nodemask.h>
> >  #include <xen/numa.h>
> > +#include <xen/pfn.h>
> >
> >  static uint8_t __read_mostly
> >  node_distance_map[MAX_NUMNODES][MAX_NUMNODES] = {
> > @@ -85,6 +87,59 @@ uint8_t __node_distance(nodeid_t from, nodeid_t to)
> >  }
> >  EXPORT_SYMBOL(__node_distance);
> >
> > +void __init numa_init(bool acpi_off)
> > +{
> > +    uint32_t idx;
> > +    paddr_t ram_start = ~0;
> 
> INVALID_PADDR
> 

Oh, yes

> 
> > +    paddr_t ram_size = 0;
> > +    paddr_t ram_end = 0;
> > +
> > +    /* NUMA has been turned off through Xen parameters */
> > +    if ( numa_off )
> > +        goto mem_init;
> > +
> > +    /* Initialize NUMA from device tree when system is not ACPI booted
> */
> > +    if ( acpi_off )
> > +    {
> > +        int ret = numa_device_tree_init(device_tree_flattened);
> > +        if ( ret )
> > +        {
> > +            printk(XENLOG_WARNING
> > +                   "Init NUMA from device tree failed, ret=%d\n", ret);
> 
> As I mentioned in other patches we need to distinguish between two
> cases:
> 
> 1) NUMA initialization failed because no NUMA information has been found
> 2) NUMA initialization failed because wrong/inconsistent NUMA info has
>    been found
> 
> In case of 1), we print nothing. Maybe a single XENLOG_DEBUG message.
> In case of 2), all the warnings are good to print.
> 
> 
> In this case, if ret != 0 because of 2), then it is fine to print this
> warning. But it looks like could be that ret is -EINVAL simply because a
> CPU node doesn't have numa-node-id, which is a normal condition for
> non-NUMA machines.
> 

Yes, we do need to distinguish these two cases. I will try to address
it in the next version.

> 
> > +            numa_off = true;
> > +        }
> > +    }
> > +    else
> > +    {
> > +        /* We don't support NUMA for ACPI boot currently */
> > +        printk(XENLOG_WARNING
> > +               "ACPI NUMA has not been supported yet, NUMA off!\n");
> > +        numa_off = true;
> > +    }
> > +
> > +mem_init:
> > +    /*
> > +     * Find the minimal and maximum address of RAM, NUMA will
> > +     * build a memory to node mapping table for the whole range.
> > +     */
> > +    ram_start = bootinfo.mem.bank[0].start;
> > +    ram_size  = bootinfo.mem.bank[0].size;
> > +    ram_end   = ram_start + ram_size;
> > +    for ( idx = 1 ; idx < bootinfo.mem.nr_banks; idx++ )
> > +    {
> > +        paddr_t bank_start = bootinfo.mem.bank[idx].start;
> > +        paddr_t bank_size = bootinfo.mem.bank[idx].size;
> > +        paddr_t bank_end = bank_start + bank_size;
> > +
> > +        ram_size  = ram_size + bank_size;
> > +        ram_start = min(ram_start, bank_start);
> > +        ram_end   = max(ram_end, bank_end);
> > +    }
> > +
> > +    numa_initmem_init(PFN_UP(ram_start), PFN_DOWN(ram_end));
> > +    return;
> 
> No need for return
> 

Ok, I will remove it.

> 
> > +}
> > +
> >  uint32_t __init arch_meminfo_get_nr_bank(void)
> >  {
> >  	return bootinfo.mem.nr_banks;
> > diff --git a/xen/arch/arm/setup.c b/xen/arch/arm/setup.c
> > index 1f0fbc95b5..6097850682 100644
> > --- a/xen/arch/arm/setup.c
> > +++ b/xen/arch/arm/setup.c
> > @@ -905,6 +905,13 @@ void __init start_xen(unsigned long
> boot_phys_offset,
> >      /* Parse the ACPI tables for possible boot-time configuration */
> >      acpi_boot_table_init();
> >
> > +    /*
> > +     * Try to initialize NUMA system, if failed, the system will
> > +     * fallback to uniform system which means system has only 1
> > +     * NUMA node.
> > +     */
> > +    numa_init(acpi_disabled);
> > +
> >      end_boot_allocator();
> >
> >      /*
> > diff --git a/xen/include/asm-arm/numa.h b/xen/include/asm-arm/numa.h
> > index f46e8e2935..5b03dde87f 100644
> > --- a/xen/include/asm-arm/numa.h
> > +++ b/xen/include/asm-arm/numa.h
> > @@ -24,6 +24,7 @@ typedef u8 nodeid_t;
> >
> >  extern void numa_set_distance(nodeid_t from, nodeid_t to, uint32_t
> distance);
> >  extern int numa_device_tree_init(const void *fdt);
> > +extern void numa_init(bool acpi_off);
> >
> >  #else
> >
> > @@ -47,6 +48,11 @@ extern mfn_t first_valid_mfn;
> >  #define node_start_pfn(nid) (mfn_x(first_valid_mfn))
> >  #define __node_distance(a, b) (20)
> >
> > +static inline void numa_init(bool acpi_off)
> > +{
> > +
> > +}
> > +
> >  static inline void numa_add_cpu(int cpu)
> >  {
> >
> > --
> > 2.25.1
> >


* Re: [PATCH 36/37] xen/arm: Provide Kconfig options for Arm to enable NUMA
  2021-09-24  3:31   ` Stefano Stabellini
@ 2021-09-24 10:13     ` Wei Chen
  2021-09-24 19:39       ` Stefano Stabellini
  0 siblings, 1 reply; 192+ messages in thread
From: Wei Chen @ 2021-09-24 10:13 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: xen-devel, julien, Bertrand.Marquis

Hi Stefano,

On 2021/9/24 11:31, Stefano Stabellini wrote:
> On Thu, 23 Sep 2021, Wei Chen wrote:
>> Arm platforms support both ACPI and device tree. We don't
>> want users to select device tree NUMA or ACPI NUMA manually.
>> We hope usrs can just enable NUMA for Arm, and device tree
>            ^ users
> 
>> NUMA and ACPI NUMA can be selected depends on device tree
>> feature and ACPI feature status automatically. In this case,
>> these two kinds of NUMA support code can be co-exist in one
>> Xen binary. Xen can check feature flags to decide using
>> device tree or ACPI as NUMA based firmware.
>>
>> So in this patch, we introduce a generic option:
>> CONFIG_ARM_NUMA for user to enable NUMA for Arm.
>                        ^ users
>

OK

>> And one CONFIG_DEVICE_TREE_NUMA option for ARM_NUMA
>> to select when HAS_DEVICE_TREE option is enabled.
>> Once when ACPI NUMA for Arm is supported, ACPI_NUMA
>> can be selected here too.
>>
>> Signed-off-by: Wei Chen <wei.chen@arm.com>
>> ---
>>   xen/arch/arm/Kconfig | 11 +++++++++++
>>   1 file changed, 11 insertions(+)
>>
>> diff --git a/xen/arch/arm/Kconfig b/xen/arch/arm/Kconfig
>> index 865ad83a89..ded94ebd37 100644
>> --- a/xen/arch/arm/Kconfig
>> +++ b/xen/arch/arm/Kconfig
>> @@ -34,6 +34,17 @@ config ACPI
>>   	  Advanced Configuration and Power Interface (ACPI) support for Xen is
>>   	  an alternative to device tree on ARM64.
>>   
>> + config DEVICE_TREE_NUMA
>> +	def_bool n
>> +	select NUMA
>> +
>> +config ARM_NUMA
>> +	bool "Arm NUMA (Non-Uniform Memory Access) Support (UNSUPPORTED)" if UNSUPPORTED
>> +	select DEVICE_TREE_NUMA if HAS_DEVICE_TREE
> 
> Should it be: depends on HAS_DEVICE_TREE ?
> (And eventually depends on HAS_DEVICE_TREE || ACPI)
> 

As discussed in the RFC [1], we want to make ARM_NUMA a generic
option that users can select, and have it select DEVICE_TREE_NUMA
or ACPI_NUMA depending on HAS_DEVICE_TREE or ACPI.

If we add HAS_DEVICE_TREE || ACPI as dependencies for ARM_NUMA,
wouldn't that become a circular dependency?

[1] https://lists.xenproject.org/archives/html/xen-devel/2021-08/msg00888.html
> 
>> +	---help---
>> +
>> +	  Enable Non-Uniform Memory Access (NUMA) for Arm architecutres
>                                                        ^ architectures
> 
> 
>> +
>>   config GICV3
>>   	bool "GICv3 driver"
>>   	depends on ARM_64 && !NEW_VGIC
>> -- 
>> 2.25.1
>>



* RE: [PATCH 06/37] xen/arm: use !CONFIG_NUMA to keep fake NUMA API
  2021-09-24  0:05   ` Stefano Stabellini
@ 2021-09-24 10:21     ` Wei Chen
  0 siblings, 0 replies; 192+ messages in thread
From: Wei Chen @ 2021-09-24 10:21 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: xen-devel, julien, Bertrand Marquis


> -----Original Message-----
> From: Stefano Stabellini <sstabellini@kernel.org>
> Sent: 2021年9月24日 8:05
> To: Wei Chen <Wei.Chen@arm.com>
> Cc: xen-devel@lists.xenproject.org; sstabellini@kernel.org; julien@xen.org;
> Bertrand Marquis <Bertrand.Marquis@arm.com>
> Subject: Re: [PATCH 06/37] xen/arm: use !CONFIG_NUMA to keep fake NUMA API
> 
> On Thu, 23 Sep 2021, Wei Chen wrote:
> > We have introduced CONFIG_NUMA in previous patch. And this
>                                    ^ a
> 
> > option is enabled only on x86 in current stage. In a follow
>                                 ^ at the
> 
> > up patch, we will enable this option for Arm. But we still
> > want users can disable the CONFIG_NUMA through Kconfig. In
>              ^ to be able to disable CONFIG_NUMA via Kconfig.
> 
> 
> > this case, keep current fake NUMA API, will make Arm code
>                  ^ the
> 
> > still can work with NUMA aware memory allocation and scheduler.
>         ^ able to work
> 
> >
> > Signed-off-by: Wei Chen <wei.chen@arm.com>
> 
> With the small grammar fixes:
> 
> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
> 
> 

Thanks, I will fix them in next version.

> > ---
> >  xen/include/asm-arm/numa.h | 4 ++++
> >  1 file changed, 4 insertions(+)
> >
> > diff --git a/xen/include/asm-arm/numa.h b/xen/include/asm-arm/numa.h
> > index 9d5739542d..8f1c67e3eb 100644
> > --- a/xen/include/asm-arm/numa.h
> > +++ b/xen/include/asm-arm/numa.h
> > @@ -5,6 +5,8 @@
> >
> >  typedef u8 nodeid_t;
> >
> > +#ifndef CONFIG_NUMA
> > +
> >  /* Fake one node for now. See also node_online_map. */
> >  #define cpu_to_node(cpu) 0
> >  #define node_to_cpumask(node)   (cpu_online_map)
> > @@ -25,6 +27,8 @@ extern mfn_t first_valid_mfn;
> >  #define node_start_pfn(nid) (mfn_x(first_valid_mfn))
> >  #define __node_distance(a, b) (20)
> >
> > +#endif
> > +
> >  static inline unsigned int arch_have_default_dmazone(void)
> >  {
> >      return 0;
> > --
> > 2.25.1
> >

^ permalink raw reply	[flat|nested] 192+ messages in thread

* RE: [PATCH 33/37] xen/arm: keep guest still be NUMA unware
  2021-09-24  3:19   ` Stefano Stabellini
@ 2021-09-24 10:23     ` Wei Chen
  0 siblings, 0 replies; 192+ messages in thread
From: Wei Chen @ 2021-09-24 10:23 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: xen-devel, julien, Bertrand Marquis


> -----Original Message-----
> From: Stefano Stabellini <sstabellini@kernel.org>
> Sent: 2021年9月24日 11:19
> To: Wei Chen <Wei.Chen@arm.com>
> Cc: xen-devel@lists.xenproject.org; sstabellini@kernel.org; julien@xen.org;
> Bertrand Marquis <Bertrand.Marquis@arm.com>
> Subject: Re: [PATCH 33/37] xen/arm: keep guest still be NUMA unware
> 
> On Thu, 23 Sep 2021, Wei Chen wrote:
> > The NUMA information provided in the host Device-Tree
> > are only for Xen. For dom0, we want to hide them as they
> > may be different (for now, dom0 is still not aware of NUMA)
> > The CPU and memory nodes are recreated from scratch for the
> > domain. So we already skip the "numa-node-id" property for
> > these two types of nodes.
> >
> > However, some devices like PCIe may have "numa-node-id"
> > property too. We have to skip them as well.
> >
> > Signed-off-by: Wei Chen <wei.chen@arm.com>
> > ---
> >  xen/arch/arm/domain_build.c | 6 ++++++
> >  1 file changed, 6 insertions(+)
> >
> > diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
> > index d233d634c1..6e94922238 100644
> > --- a/xen/arch/arm/domain_build.c
> > +++ b/xen/arch/arm/domain_build.c
> > @@ -737,6 +737,10 @@ static int __init write_properties(struct domain *d,
> struct kernel_info *kinfo,
> >                  continue;
> >          }
> >
> > +        /* Guest is numa unaware in current stage */
> 
> I would say: "Dom0 is currently NUMA unaware"
> 
> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
> 

I will update the code comment in the next version.
Thanks!

> 
> > +        if ( dt_property_name_is_equal(prop, "numa-node-id") )
> > +            continue;
> > +
> >          res = fdt_property(kinfo->fdt, prop->name, prop_data, prop_len);
> >
> >          if ( res )
> > @@ -1607,6 +1611,8 @@ static int __init handle_node(struct domain *d,
> struct kernel_info *kinfo,
> >          DT_MATCH_TYPE("memory"),
> >          /* The memory mapped timer is not supported by Xen. */
> >          DT_MATCH_COMPATIBLE("arm,armv7-timer-mem"),
> > +        /* Numa info doesn't need to be exposed to Domain-0 */
> > +        DT_MATCH_COMPATIBLE("numa-distance-map-v1"),
> >          { /* sentinel */ },
> >      };
> >      static const struct dt_device_match timer_matches[] __initconst =
> > --
> > 2.25.1
> >

^ permalink raw reply	[flat|nested] 192+ messages in thread

* Re: [PATCH 36/37] xen/arm: Provide Kconfig options for Arm to enable NUMA
  2021-09-23 12:02 ` [PATCH 36/37] xen/arm: Provide Kconfig options for Arm to enable NUMA Wei Chen
  2021-09-24  3:31   ` Stefano Stabellini
@ 2021-09-24 10:25   ` Jan Beulich
  2021-09-24 10:37     ` Wei Chen
  1 sibling, 1 reply; 192+ messages in thread
From: Jan Beulich @ 2021-09-24 10:25 UTC (permalink / raw)
  To: Wei Chen; +Cc: Bertrand.Marquis, xen-devel, sstabellini, julien

On 23.09.2021 14:02, Wei Chen wrote:
> --- a/xen/arch/arm/Kconfig
> +++ b/xen/arch/arm/Kconfig
> @@ -34,6 +34,17 @@ config ACPI
>  	  Advanced Configuration and Power Interface (ACPI) support for Xen is
>  	  an alternative to device tree on ARM64.
>  
> + config DEVICE_TREE_NUMA
> +	def_bool n
> +	select NUMA

Two nits here: There's a stray blank on the first line, and you
appear to mean just "bool", not "def_bool n" (there's no point
in having defaults for select-only options).

> +config ARM_NUMA
> +	bool "Arm NUMA (Non-Uniform Memory Access) Support (UNSUPPORTED)" if UNSUPPORTED
> +	select DEVICE_TREE_NUMA if HAS_DEVICE_TREE
> +	---help---

And another nit here: We try to move away from "---help---", which
is no longer supported by Linux'es newer kconfig. Please use just
"help" in new code.

Jan



^ permalink raw reply	[flat|nested] 192+ messages in thread

* RE: [PATCH 20/37] xen: introduce CONFIG_EFI to stub API for non-EFI architecture
  2021-09-24  7:58       ` Jan Beulich
@ 2021-09-24 10:31         ` Wei Chen
  2021-09-24 10:49           ` Jan Beulich
  0 siblings, 1 reply; 192+ messages in thread
From: Wei Chen @ 2021-09-24 10:31 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel, julien, Bertrand Marquis, Stefano Stabellini

Hi Jan,

> -----Original Message-----
> From: Jan Beulich <jbeulich@suse.com>
> Sent: 2021年9月24日 15:59
> To: Wei Chen <Wei.Chen@arm.com>
> Cc: xen-devel@lists.xenproject.org; julien@xen.org; Bertrand Marquis
> <Bertrand.Marquis@arm.com>; Stefano Stabellini <sstabellini@kernel.org>
> Subject: Re: [PATCH 20/37] xen: introduce CONFIG_EFI to stub API for non-
> EFI architecture
> 
> On 24.09.2021 06:34, Wei Chen wrote:
> >> From: Stefano Stabellini <sstabellini@kernel.org>
> >> Sent: 2021年9月24日 9:15
> >>
> >> On Thu, 23 Sep 2021, Wei Chen wrote:
> >>> --- a/xen/common/Kconfig
> >>> +++ b/xen/common/Kconfig
> >>> @@ -11,6 +11,16 @@ config COMPAT
> >>>  config CORE_PARKING
> >>>  	bool
> >>>
> >>> +config EFI
> >>> +	bool
> >>
> >> Without the title the option is not user-selectable (or de-selectable).
> >> So the help message below can never be seen.
> >>
> >> Either add a title, e.g.:
> >>
> >> bool "EFI support"
> >>
> >> Or fully make the option a silent option by removing the help text.
> >
> > OK, in current Xen code, EFI is unconditionally compiled. Before
> > we change related code, I prefer to remove the help text.
> 
> But that's not true: At least on x86 EFI gets compiled depending on
> tool chain capabilities. Ultimately we may indeed want a user
> selectable option here, but until then I'm afraid having this option
> at all may be misleading on x86.
> 

I checked the build scripts, and yes, you're right: for x86, EFI is not a
selectable option in Kconfig. I agree that we can't use the Kconfig
system to decide whether to enable the EFI build for x86.

So how about we use this EFI option for Arm only? On Arm, we do not
have such a toolchain dependency.

> Jan


^ permalink raw reply	[flat|nested] 192+ messages in thread

* RE: [PATCH 02/37] xen: introduce a Kconfig option to configure NUMA nodes number
  2021-09-24  8:55   ` Jan Beulich
@ 2021-09-24 10:33     ` Wei Chen
  2021-09-24 10:47       ` Jan Beulich
  0 siblings, 1 reply; 192+ messages in thread
From: Wei Chen @ 2021-09-24 10:33 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Bertrand Marquis, xen-devel, sstabellini, julien

Hi Jan,

> -----Original Message-----
> From: Jan Beulich <jbeulich@suse.com>
> Sent: 2021年9月24日 16:56
> To: Wei Chen <Wei.Chen@arm.com>
> Cc: Bertrand Marquis <Bertrand.Marquis@arm.com>; xen-
> devel@lists.xenproject.org; sstabellini@kernel.org; julien@xen.org
> Subject: Re: [PATCH 02/37] xen: introduce a Kconfig option to configure
> NUMA nodes number
> 
> On 23.09.2021 14:02, Wei Chen wrote:
> > Current NUMA nodes number is a hardcode configuration. This
> > configuration is difficult for an administrator to change
> > unless changing the code.
> >
> > So in this patch, we introduce this new Kconfig option for
> > administrators to change NUMA nodes number conveniently.
> > Also considering that not all architectures support NUMA,
> > this Kconfig option only can be visible on NUMA enabled
> > architectures. Non-NUMA supported architectures can still
> > use 1 as MAX_NUMNODES.
> 
> Do you really mean administrators here? To me command line options
> are for administrators, but build decisions are usually taken by
> build managers of distros.
> 
> > --- a/xen/arch/Kconfig
> > +++ b/xen/arch/Kconfig
> > @@ -17,3 +17,14 @@ config NR_CPUS
> >  	  For CPU cores which support Simultaneous Multi-Threading or
> similar
> >  	  technologies, this the number of logical threads which Xen will
> >  	  support.
> > +
> > +config NR_NUMA_NODES
> > +	int "Maximum number of NUMA nodes supported"
> > +	range 1 4095
> 
> How was this upper bound established? Seeing 4095 is the limit of the
> number of CPUs, do we really expect a CPU per node on such huge
> systems? And did you check that whichever involved data types and
> structures are actually suitable? I'm thinking e.g. of things like ...
> 
> > --- a/xen/include/asm-x86/numa.h
> > +++ b/xen/include/asm-x86/numa.h
> > @@ -3,8 +3,6 @@
> >
> >  #include <xen/cpumask.h>
> >
> > -#define NODES_SHIFT 6
> > -
> >  typedef u8 nodeid_t;
> 
> ... this.
> 

You're right, we use u8 as nodeid_t, so 4095 as the upper bound for the node
number in this option is not reasonable. Maybe a 255 upper bound would be good?

> Jan


^ permalink raw reply	[flat|nested] 192+ messages in thread

* RE: [PATCH 03/37] xen/x86: Initialize memnodemapsize while faking NUMA node
  2021-09-24  8:57   ` Jan Beulich
@ 2021-09-24 10:34     ` Wei Chen
  0 siblings, 0 replies; 192+ messages in thread
From: Wei Chen @ 2021-09-24 10:34 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Bertrand Marquis, xen-devel, sstabellini, julien



> -----Original Message-----
> From: Jan Beulich <jbeulich@suse.com>
> Sent: 2021年9月24日 16:57
> To: Wei Chen <Wei.Chen@arm.com>
> Cc: Bertrand Marquis <Bertrand.Marquis@arm.com>; xen-
> devel@lists.xenproject.org; sstabellini@kernel.org; julien@xen.org
> Subject: Re: [PATCH 03/37] xen/x86: Initialize memnodemapsize while faking
> NUMA node
> 
> On 23.09.2021 14:02, Wei Chen wrote:
> > When the system turns NUMA off or lacks NUMA support,
> > Xen will fake a NUMA node to make the system work as a single
> > node NUMA system.
> >
> > In this case the memory node map doesn't need to be allocated
> > from boot pages, it will use the _memnodemap directly. But
> > memnodemapsize hasn't been set. Xen should assert in phys_to_nid.
> > Because x86 was using an empty macro "VIRTUAL_BUG_ON" to replace
> > SSERT, this bug will not be triggered on x86.
> 
> Somehow an A got lost here, which I'll add back while committing.
> 

Thanks!

> > Actually, Xen will only use 1 slot of memnodemap in this case.
> > So we set memnodemap[0] to 0 and memnodemapsize to 1 in this
> > patch to fix it.
> >
> > Signed-off-by: Wei Chen <wei.chen@arm.com>
> 
> Acked-by: Jan Beulich <jbeulich@suse.com>


^ permalink raw reply	[flat|nested] 192+ messages in thread

* RE: [PATCH 36/37] xen/arm: Provide Kconfig options for Arm to enable NUMA
  2021-09-24 10:25   ` Jan Beulich
@ 2021-09-24 10:37     ` Wei Chen
  0 siblings, 0 replies; 192+ messages in thread
From: Wei Chen @ 2021-09-24 10:37 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Bertrand Marquis, xen-devel, sstabellini, julien


> -----Original Message-----
> From: Jan Beulich <jbeulich@suse.com>
> Sent: 2021年9月24日 18:26
> To: Wei Chen <Wei.Chen@arm.com>
> Cc: Bertrand Marquis <Bertrand.Marquis@arm.com>; xen-
> devel@lists.xenproject.org; sstabellini@kernel.org; julien@xen.org
> Subject: Re: [PATCH 36/37] xen/arm: Provide Kconfig options for Arm to
> enable NUMA
> 
> On 23.09.2021 14:02, Wei Chen wrote:
> > --- a/xen/arch/arm/Kconfig
> > +++ b/xen/arch/arm/Kconfig
> > @@ -34,6 +34,17 @@ config ACPI
> >  	  Advanced Configuration and Power Interface (ACPI) support for Xen
> is
> >  	  an alternative to device tree on ARM64.
> >
> > + config DEVICE_TREE_NUMA
> > +	def_bool n
> > +	select NUMA
> 
> Two nits here: There's a stray blank on the first line, and you
> appear to mean just "bool", not "def_bool n" (there's no point
> in having defaults for select-only options).
> 

Ok

> > +config ARM_NUMA
> > +	bool "Arm NUMA (Non-Uniform Memory Access) Support (UNSUPPORTED)" if
> UNSUPPORTED
> > +	select DEVICE_TREE_NUMA if HAS_DEVICE_TREE
> > +	---help---
> 
> And another nit here: We try to move away from "---help---", which
> is no longer supported by Linux'es newer kconfig. Please use just
> "help" in new code.
> 

Thanks, I will do it.

> Jan


^ permalink raw reply	[flat|nested] 192+ messages in thread

* Re: [PATCH 02/37] xen: introduce a Kconfig option to configure NUMA nodes number
  2021-09-24 10:33     ` Wei Chen
@ 2021-09-24 10:47       ` Jan Beulich
  0 siblings, 0 replies; 192+ messages in thread
From: Jan Beulich @ 2021-09-24 10:47 UTC (permalink / raw)
  To: Wei Chen; +Cc: Bertrand Marquis, xen-devel, sstabellini, julien

On 24.09.2021 12:33, Wei Chen wrote:
>> From: Jan Beulich <jbeulich@suse.com>
>> Sent: 2021年9月24日 16:56
>>
>> On 23.09.2021 14:02, Wei Chen wrote:
>>> --- a/xen/arch/Kconfig
>>> +++ b/xen/arch/Kconfig
>>> @@ -17,3 +17,14 @@ config NR_CPUS
>>>  	  For CPU cores which support Simultaneous Multi-Threading or
>> similar
>>>  	  technologies, this the number of logical threads which Xen will
>>>  	  support.
>>> +
>>> +config NR_NUMA_NODES
>>> +	int "Maximum number of NUMA nodes supported"
>>> +	range 1 4095
>>
>> How was this upper bound established? Seeing 4095 is the limit of the
>> number of CPUs, do we really expect a CPU per node on such huge
>> systems? And did you check that whichever involved data types and
>> structures are actually suitable? I'm thinking e.g. of things like ...
>>
>>> --- a/xen/include/asm-x86/numa.h
>>> +++ b/xen/include/asm-x86/numa.h
>>> @@ -3,8 +3,6 @@
>>>
>>>  #include <xen/cpumask.h>
>>>
>>> -#define NODES_SHIFT 6
>>> -
>>>  typedef u8 nodeid_t;
>>
>> ... this.
>>
> 
> You're right, we use u8 as nodeid_t, so 4095 as the upper bound for the node
> number in this option is not reasonable. Maybe a 255 upper bound would be good?

I think it is, yes, but you will want to properly check.
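
A bound that fits the u8 nodeid_t would then read along these lines (an illustrative sketch; the default value shown is hypothetical, not taken from the series):

```kconfig
config NR_NUMA_NODES
	int "Maximum number of NUMA nodes supported"
	range 1 255
	default 64
```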

Jan



^ permalink raw reply	[flat|nested] 192+ messages in thread

* Re: [PATCH 20/37] xen: introduce CONFIG_EFI to stub API for non-EFI architecture
  2021-09-24 10:31         ` Wei Chen
@ 2021-09-24 10:49           ` Jan Beulich
  2021-09-26 10:25             ` Wei Chen
  0 siblings, 1 reply; 192+ messages in thread
From: Jan Beulich @ 2021-09-24 10:49 UTC (permalink / raw)
  To: Wei Chen; +Cc: xen-devel, julien, Bertrand Marquis, Stefano Stabellini

On 24.09.2021 12:31, Wei Chen wrote:
>> From: Jan Beulich <jbeulich@suse.com>
>> Sent: 2021年9月24日 15:59
>>
>> On 24.09.2021 06:34, Wei Chen wrote:
>>>> From: Stefano Stabellini <sstabellini@kernel.org>
>>>> Sent: 2021年9月24日 9:15
>>>>
>>>> On Thu, 23 Sep 2021, Wei Chen wrote:
>>>>> --- a/xen/common/Kconfig
>>>>> +++ b/xen/common/Kconfig
>>>>> @@ -11,6 +11,16 @@ config COMPAT
>>>>>  config CORE_PARKING
>>>>>  	bool
>>>>>
>>>>> +config EFI
>>>>> +	bool
>>>>
>>>> Without the title the option is not user-selectable (or de-selectable).
>>>> So the help message below can never be seen.
>>>>
>>>> Either add a title, e.g.:
>>>>
>>>> bool "EFI support"
>>>>
>>>> Or fully make the option a silent option by removing the help text.
>>>
>>> OK, in current Xen code, EFI is unconditionally compiled. Before
>>> we change related code, I prefer to remove the help text.
>>
>> But that's not true: At least on x86 EFI gets compiled depending on
>> tool chain capabilities. Ultimately we may indeed want a user
>> selectable option here, but until then I'm afraid having this option
>> at all may be misleading on x86.
>>
> 
> I checked the build scripts, and yes, you're right: for x86, EFI is not a
> selectable option in Kconfig. I agree that we can't use the Kconfig
> system to decide whether to enable the EFI build for x86.
> 
> So how about we use this EFI option for Arm only? On Arm, we do not
> have such a toolchain dependency.

To be honest - don't know. That's because I don't know what you want
to use the option for subsequently.

Jan



^ permalink raw reply	[flat|nested] 192+ messages in thread

* Re: [PATCH 25/37] xen/arm: implement bad_srat for Arm NUMA initialization
  2021-09-24  8:07     ` Jan Beulich
@ 2021-09-24 19:33       ` Stefano Stabellini
  0 siblings, 0 replies; 192+ messages in thread
From: Stefano Stabellini @ 2021-09-24 19:33 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Stefano Stabellini, Wei Chen, xen-devel, julien,
	Bertrand.Marquis, andrew.cooper3, roger.pau, wl

On Fri, 24 Sep 2021, Jan Beulich wrote:
> On 24.09.2021 04:09, Stefano Stabellini wrote:
> > On Thu, 23 Sep 2021, Wei Chen wrote:
> >> NUMA initialization will parse information from a firmware-provided
> >> static resource affinity table (ACPI SRAT or DTB). bad_srat is a
> >> function that will be used when the initialization code encounters
> >> some unexpected errors.
> >>
> >> In this patch, we introduce the Arm version of bad_srat for the common
> >> NUMA initialization code to invoke.
> >>
> >> Signed-off-by: Wei Chen <wei.chen@arm.com>
> >> ---
> >>  xen/arch/arm/numa.c | 7 +++++++
> >>  1 file changed, 7 insertions(+)
> >>
> >> diff --git a/xen/arch/arm/numa.c b/xen/arch/arm/numa.c
> >> index 3755b01ef4..5209d3de4d 100644
> >> --- a/xen/arch/arm/numa.c
> >> +++ b/xen/arch/arm/numa.c
> >> @@ -18,6 +18,7 @@
> >>   *
> >>   */
> >>  #include <xen/init.h>
> >> +#include <xen/nodemask.h>
> >>  #include <xen/numa.h>
> >>  
> >>  static uint8_t __read_mostly
> >> @@ -25,6 +26,12 @@ node_distance_map[MAX_NUMNODES][MAX_NUMNODES] = {
> >>      { 0 }
> >>  };
> >>  
> >> +__init void bad_srat(void)
> >> +{
> >> +    printk(KERN_ERR "NUMA: Firmware SRAT table not used.\n");
> >> +    fw_numa = -1;
> >> +}
> > 
> > I realize that the series keeps the "srat" terminology everywhere on DT
> > too. I wonder if it is worth replacing srat with something like
> > "numa_distance" everywhere as appropriate. I am adding the x86
> > maintainers for an opinion.
> > 
> > If you guys prefer to keep srat (if nothing else, it is concise), I am
> > also OK with keeping srat although it is not technically accurate.
> 
> I think we want to tell apart both things: Where we truly talk about
> the firmware's SRAT table, keeping that name is fine. But I suppose
> there no "Firmware SRAT table" (as in the log message above) when
> using DT?

No. FYI this is the DT binding:
https://github.com/torvalds/linux/blob/master/Documentation/devicetree/bindings/numa.txt

The interesting bit is the "distance-map"
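
For reference, a minimal fragment following that binding might look like this (values purely illustrative):

```dts
/ {
	cpus {
		cpu@0 {
			device_type = "cpu";
			numa-node-id = <0>;
		};
	};

	memory@0 {
		device_type = "memory";
		reg = <0x0 0x0 0x0 0x40000000>;
		numa-node-id = <0>;
	};

	distance-map {
		compatible = "numa-distance-map-v1";
		distance-matrix = <0 0 10>,
				  <0 1 20>,
				  <1 0 20>,
				  <1 1 10>;
	};
};
```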


> If so, at the very least in log messages SRAT shouldn't be
> mentioned. Perhaps even functions serving both an ACPI and a DT
> purpose would better not use "srat" in their names (but I'm not as
> fussed about it there.)

I agree 100% with what you wrote.


^ permalink raw reply	[flat|nested] 192+ messages in thread

* RE: [PATCH 23/37] xen/arm: implement node distance helpers for Arm
  2021-09-24  4:41     ` Wei Chen
@ 2021-09-24 19:36       ` Stefano Stabellini
  2021-09-26 10:15         ` Wei Chen
  0 siblings, 1 reply; 192+ messages in thread
From: Stefano Stabellini @ 2021-09-24 19:36 UTC (permalink / raw)
  To: Wei Chen; +Cc: Stefano Stabellini, xen-devel, julien, Bertrand Marquis


On Fri, 24 Sep 2021, Wei Chen wrote:
> > -----Original Message-----
> > From: Stefano Stabellini <sstabellini@kernel.org>
> > Sent: 2021年9月24日 9:47
> > To: Wei Chen <Wei.Chen@arm.com>
> > Cc: xen-devel@lists.xenproject.org; sstabellini@kernel.org; julien@xen.org;
> > Bertrand Marquis <Bertrand.Marquis@arm.com>
> > Subject: Re: [PATCH 23/37] xen/arm: implement node distance helpers for
> > Arm
> > 
> > On Thu, 23 Sep 2021, Wei Chen wrote:
> > > We will parse NUMA node distances from the device tree or ACPI
> > > table, so we need a matrix to record the distances between
> > > any two parsed nodes. Accordingly, in this patch we provide the
> > > numa_set_distance API for device tree or ACPI table parsers
> > > to set the distance between any two nodes.
> > > When NUMA initialization fails, __node_distance will return
> > > NUMA_REMOTE_DISTANCE; this helps us avoid rolling back the
> > > distance matrix when NUMA initialization fails.
> > >
> > > Signed-off-by: Wei Chen <wei.chen@arm.com>
> > > ---
> > >  xen/arch/arm/Makefile      |  1 +
> > >  xen/arch/arm/numa.c        | 69 ++++++++++++++++++++++++++++++++++++++
> > >  xen/include/asm-arm/numa.h | 13 +++++++
> > >  3 files changed, 83 insertions(+)
> > >  create mode 100644 xen/arch/arm/numa.c
> > >
> > > diff --git a/xen/arch/arm/Makefile b/xen/arch/arm/Makefile
> > > index ae4efbf76e..41ca311b6b 100644
> > > --- a/xen/arch/arm/Makefile
> > > +++ b/xen/arch/arm/Makefile
> > > @@ -35,6 +35,7 @@ obj-$(CONFIG_LIVEPATCH) += livepatch.o
> > >  obj-y += mem_access.o
> > >  obj-y += mm.o
> > >  obj-y += monitor.o
> > > +obj-$(CONFIG_NUMA) += numa.o
> > >  obj-y += p2m.o
> > >  obj-y += percpu.o
> > >  obj-y += platform.o
> > > diff --git a/xen/arch/arm/numa.c b/xen/arch/arm/numa.c
> > > new file mode 100644
> > > index 0000000000..3f08870d69
> > > --- /dev/null
> > > +++ b/xen/arch/arm/numa.c
> > > @@ -0,0 +1,69 @@
> > > +// SPDX-License-Identifier: GPL-2.0
> > > +/*
> > > + * Arm Architecture support layer for NUMA.
> > > + *
> > > + * Copyright (C) 2021 Arm Ltd
> > > + *
> > > + * This program is free software; you can redistribute it and/or modify
> > > + * it under the terms of the GNU General Public License version 2 as
> > > + * published by the Free Software Foundation.
> > > + *
> > > + * This program is distributed in the hope that it will be useful,
> > > + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> > > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> > > + * GNU General Public License for more details.
> > > + *
> > > + * You should have received a copy of the GNU General Public License
> > > + * along with this program. If not, see <http://www.gnu.org/licenses/>.
> > > + *
> > > + */
> > > +#include <xen/init.h>
> > > +#include <xen/numa.h>
> > > +
> > > +static uint8_t __read_mostly
> > > +node_distance_map[MAX_NUMNODES][MAX_NUMNODES] = {
> > > +    { 0 }
> > > +};
> > > +
> > > +void __init numa_set_distance(nodeid_t from, nodeid_t to, uint32_t
> > distance)
> > > +{
> > > +    if ( from >= MAX_NUMNODES || to >= MAX_NUMNODES )
> > > +    {
> > > +        printk(KERN_WARNING
> > > +               "NUMA: invalid nodes: from=%"PRIu8" to=%"PRIu8"
> > MAX=%"PRIu8"\n",
> > > +               from, to, MAX_NUMNODES);
> > > +        return;
> > > +    }
> > > +
> > > +    /* NUMA defines 0xff as an unreachable node and 0-9 are undefined
> > */
> > > +    if ( distance >= NUMA_NO_DISTANCE ||
> > > +        (distance >= NUMA_DISTANCE_UDF_MIN &&
> > > +         distance <= NUMA_DISTANCE_UDF_MAX) ||
> > > +        (from == to && distance != NUMA_LOCAL_DISTANCE) )
> > > +    {
> > > +        printk(KERN_WARNING
> > > +               "NUMA: invalid distance: from=%"PRIu8" to=%"PRIu8"
> > distance=%"PRIu32"\n",
> > > +               from, to, distance);
> > > +        return;
> > > +    }
> > > +
> > > +    node_distance_map[from][to] = distance;
> > > +}
> > > +
> > > +uint8_t __node_distance(nodeid_t from, nodeid_t to)
> > > +{
> > > +    /* When NUMA is off, any distance will be treated as remote. */
> > > +    if ( srat_disabled() )
> > 
> > Given that this is ARM specific code and specific to ACPI, I don't think
> > we should have any call to something called "srat_disabled".
> > 
> > I suggest to either rename srat_disabled to numa_distance_disabled.
> > 
> > Other than that, this patch looks OK to me.
> > 
> 
> SRAT stands for static resource affinity table, and I think a DTB can also
> be treated as a static resource affinity table, so I kept SRAT in this patch
> and the others. I have seen your comment on patch #25. Until the x86
> maintainers give any feedback, can we still keep srat here?

Jan and I replied in the other thread. I think that in warning messages
"SRAT" should not be mentioned when booting from DT. Ideally functions
names and variables should be renamed too when shared between ACPI and
DT but it is less critical, and it is fine if you don't do that in the
next version.

^ permalink raw reply	[flat|nested] 192+ messages in thread

* Re: [PATCH 36/37] xen/arm: Provide Kconfig options for Arm to enable NUMA
  2021-09-24 10:13     ` Wei Chen
@ 2021-09-24 19:39       ` Stefano Stabellini
  2021-09-27  8:33         ` Jan Beulich
  0 siblings, 1 reply; 192+ messages in thread
From: Stefano Stabellini @ 2021-09-24 19:39 UTC (permalink / raw)
  To: Wei Chen; +Cc: Stefano Stabellini, xen-devel, julien, Bertrand.Marquis

On Fri, 24 Sep 2021, Wei Chen wrote:
> Hi Stefano,
> 
> On 2021/9/24 11:31, Stefano Stabellini wrote:
> > On Thu, 23 Sep 2021, Wei Chen wrote:
> > > Arm platforms support both ACPI and device tree. We don't
> > > want users to select device tree NUMA or ACPI NUMA manually.
> > > We hope usrs can just enable NUMA for Arm, and device tree
> >            ^ users
> > 
> > > NUMA and ACPI NUMA can be selected depends on device tree
> > > feature and ACPI feature status automatically. In this case,
> > > these two kinds of NUMA support code can be co-exist in one
> > > Xen binary. Xen can check feature flags to decide using
> > > device tree or ACPI as NUMA based firmware.
> > > 
> > > So in this patch, we introduce a generic option:
> > > CONFIG_ARM_NUMA for user to enable NUMA for Arm.
> >                        ^ users
> > 
> 
> OK
> 
> > > And one CONFIG_DEVICE_TREE_NUMA option for ARM_NUMA
> > > to select when the HAS_DEVICE_TREE option is enabled.
> > > Once ACPI NUMA for Arm is supported, ACPI_NUMA
> > > can be selected here too.
> > > 
> > > Signed-off-by: Wei Chen <wei.chen@arm.com>
> > > ---
> > >   xen/arch/arm/Kconfig | 11 +++++++++++
> > >   1 file changed, 11 insertions(+)
> > > 
> > > diff --git a/xen/arch/arm/Kconfig b/xen/arch/arm/Kconfig
> > > index 865ad83a89..ded94ebd37 100644
> > > --- a/xen/arch/arm/Kconfig
> > > +++ b/xen/arch/arm/Kconfig
> > > @@ -34,6 +34,17 @@ config ACPI
> > >   	  Advanced Configuration and Power Interface (ACPI) support for Xen is
> > >   	  an alternative to device tree on ARM64.
> > >   + config DEVICE_TREE_NUMA
> > > +	def_bool n
> > > +	select NUMA
> > > +
> > > +config ARM_NUMA
> > > +	bool "Arm NUMA (Non-Uniform Memory Access) Support (UNSUPPORTED)" if
> > > UNSUPPORTED
> > > +	select DEVICE_TREE_NUMA if HAS_DEVICE_TREE
> > 
> > Should it be: depends on HAS_DEVICE_TREE ?
> > (And eventually depends on HAS_DEVICE_TREE || ACPI)
> > 
> 
> As discussed in the RFC [1], we want to make ARM_NUMA a generic
> option that users can select, and depend on HAS_DEVICE_TREE
> or ACPI to select DEVICE_TREE_NUMA or ACPI_NUMA.
> 
> If we add HAS_DEVICE_TREE || ACPI as dependencies for ARM_NUMA,
> does it become a loop dependency?
> 
> https://lists.xenproject.org/archives/html/xen-devel/2021-08/msg00888.html

OK, I am fine with that. I was just trying to catch the case where a
user selects "ARM_NUMA" but actually neither ACPI nor HAS_DEVICE_TREE
are selected so nothing happens. I was trying to make it clear that
ARM_NUMA depends on having at least one between HAS_DEVICE_TREE or ACPI
because otherwise it is not going to work.

That said, I don't think this is important because HAS_DEVICE_TREE
cannot be unselected. So if we cannot find a way to express the
dependency, I think it is fine to keep the patch as is.


^ permalink raw reply	[flat|nested] 192+ messages in thread

* RE: [PATCH 32/37] xen/arm: unified entry to parse all NUMA data from device tree
  2021-09-24  7:58     ` Wei Chen
@ 2021-09-24 19:42       ` Stefano Stabellini
  0 siblings, 0 replies; 192+ messages in thread
From: Stefano Stabellini @ 2021-09-24 19:42 UTC (permalink / raw)
  To: Wei Chen; +Cc: Stefano Stabellini, xen-devel, julien, Bertrand Marquis


On Fri, 24 Sep 2021, Wei Chen wrote:
> > -----Original Message-----
> > From: Stefano Stabellini <sstabellini@kernel.org>
> > Sent: 2021年9月24日 11:17
> > To: Wei Chen <Wei.Chen@arm.com>
> > Cc: xen-devel@lists.xenproject.org; sstabellini@kernel.org; julien@xen.org;
> > Bertrand Marquis <Bertrand.Marquis@arm.com>
> > Subject: Re: [PATCH 32/37] xen/arm: unified entry to parse all NUMA data
> > from device tree
> > 
> > On Thu, 23 Sep 2021, Wei Chen wrote:
> > > In this API, we scan whole device tree to parse CPU node id, memory
> >           ^ function   ^ the whole
> > 
> > > node id and distance-map. Though early_scan_node will invoke has a
> > > handler to process memory nodes. If we want to parse memory node id
> > > in this handler, we have to embeded NUMA parse code in this handler.
> >                               ^ embed
> > 
> > > But we still need to scan whole device tree to find CPU NUMA id and
> > > distance-map. In this case, we include memory NUMA id parse in this
> > > API too. Another benefit is that we have a unique entry for device
> >   ^ function
> > 
> > > tree NUMA data parse.
> > 
> > Ah, that's the explanation I was asking for earlier!
> > 
> 
> The question about device_tree_get_meminfo?

Yes, it would be nice to reuse process_memory_node if we can, but I
understand if we cannot.

^ permalink raw reply	[flat|nested] 192+ messages in thread

* RE: [PATCH 08/37] xen/x86: add detection of discontinous node memory range
  2021-09-24  4:28     ` Wei Chen
@ 2021-09-24 19:52       ` Stefano Stabellini
  2021-09-26 10:11         ` Wei Chen
  0 siblings, 1 reply; 192+ messages in thread
From: Stefano Stabellini @ 2021-09-24 19:52 UTC (permalink / raw)
  To: Wei Chen
  Cc: Stefano Stabellini, xen-devel, julien, Bertrand Marquis,
	jbeulich, andrew.cooper3, roger.pau, wl


On Fri, 24 Sep 2021, Wei Chen wrote:
> > -----Original Message-----
> > From: Stefano Stabellini <sstabellini@kernel.org>
> > Sent: 2021年9月24日 8:26
> > To: Wei Chen <Wei.Chen@arm.com>
> > Cc: xen-devel@lists.xenproject.org; sstabellini@kernel.org; julien@xen.org;
> > Bertrand Marquis <Bertrand.Marquis@arm.com>; jbeulich@suse.com;
> > andrew.cooper3@citrix.com; roger.pau@citrix.com; wl@xen.org
> > Subject: Re: [PATCH 08/37] xen/x86: add detection of discontinous node
> > memory range
> > 
> > CC'ing x86 maintainers
> > 
> > On Thu, 23 Sep 2021, Wei Chen wrote:
> > > One NUMA node may contain several memory blocks. In the current Xen
> > > code, Xen maintains a node memory range for each node to cover
> > > all its memory blocks. But here comes the problem: if, in the gap
> > > between one node's two memory blocks, there are memory blocks that
> > > don't belong to this node (remote memory blocks), this node's memory
> > > range will be expanded to cover those remote memory blocks.
> > >
> > > One node's memory range containing other nodes' memory is obviously
> > > not very reasonable. It means the current NUMA code can only support
> > > nodes with contiguous memory blocks. However, on a physical machine,
> > > the addresses of multiple nodes can be interleaved.
> > >
> > > So in this patch, we add code to detect discontinuous memory blocks
> > > for one node. NUMA initialization will fail and error messages
> > > will be printed when Xen detects such a hardware configuration.
> > 
> > At least on ARM, it is not just memory that can be interleaved, but also
> > MMIO regions. For instance:
> > 
> > node0 bank0 0-0x1000000
> > MMIO 0x1000000-0x1002000
> > Hole 0x1002000-0x2000000
> > node0 bank1 0x2000000-0x3000000
> > 
> > So I am not familiar with the SRAT format, but I think on ARM the check
> > would look different: we would just look for multiple memory ranges
> > under a device_type = "memory" node of a NUMA node in device tree.
> > 
> > 
> 
> Should I include/refine the above message in the commit log?

Let me ask you a question first.

With the NUMA implementation of this patch series, can we deal with
cases where each node has multiple memory banks, not interleaved?
As an example:

node0: 0x0        - 0x10000000
MMIO : 0x10000000 - 0x20000000
node0: 0x20000000 - 0x30000000
MMIO : 0x30000000 - 0x50000000
node1: 0x50000000 - 0x60000000
MMIO : 0x60000000 - 0x80000000
node2: 0x80000000 - 0x90000000


I assume we can deal with this case simply by setting node0 memory to
0x0-0x30000000 even if there is actually something else, a device, that
doesn't belong to node0 in between the two node0 banks?

Is it only other nodes' memory being interleaved that causes issues? In
other words, is only the following a problematic scenario?

node0: 0x0        - 0x10000000
MMIO : 0x10000000 - 0x20000000
node1: 0x20000000 - 0x30000000
MMIO : 0x30000000 - 0x50000000
node0: 0x50000000 - 0x60000000

Because node1 is in between the two ranges of node0?


I am asking these questions because it is certainly possible to have
multiple memory ranges for each NUMA node in device tree, either by
specifying multiple ranges with a single "reg" property, or by
specifying multiple memory nodes with the same numa-node-id.
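As a concrete illustration of the two forms just mentioned, a sketch following the generic device-tree NUMA binding (the addresses and sizes here are made up):

```dts
/* Form 1: one memory node whose "reg" carries multiple ranges. */
memory@0 {
    device_type = "memory";
    reg = <0x0 0x00000000 0x0 0x10000000>,   /* bank 0 */
          <0x0 0x20000000 0x0 0x10000000>;   /* bank 1 */
    numa-node-id = <0>;
};

/* Form 2: a separate memory node sharing the same numa-node-id. */
memory@20000000 {
    device_type = "memory";
    reg = <0x0 0x20000000 0x0 0x10000000>;
    numa-node-id = <0>;
};
```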

^ permalink raw reply	[flat|nested] 192+ messages in thread

* RE: [PATCH 08/37] xen/x86: add detection of discontinous node memory range
  2021-09-24 19:52       ` Stefano Stabellini
@ 2021-09-26 10:11         ` Wei Chen
  2021-09-27  3:13           ` Stefano Stabellini
  0 siblings, 1 reply; 192+ messages in thread
From: Wei Chen @ 2021-09-26 10:11 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: xen-devel, julien, Bertrand Marquis, jbeulich, andrew.cooper3,
	roger.pau, wl

Hi Stefano,

> -----Original Message-----
> From: Stefano Stabellini <sstabellini@kernel.org>
> Sent: September 25, 2021 3:53
> To: Wei Chen <Wei.Chen@arm.com>
> Cc: Stefano Stabellini <sstabellini@kernel.org>; xen-
> devel@lists.xenproject.org; julien@xen.org; Bertrand Marquis
> <Bertrand.Marquis@arm.com>; jbeulich@suse.com; andrew.cooper3@citrix.com;
> roger.pau@citrix.com; wl@xen.org
> Subject: RE: [PATCH 08/37] xen/x86: add detection of discontinous node
> memory range
> 
> On Fri, 24 Sep 2021, Wei Chen wrote:
> > > -----Original Message-----
> > > From: Stefano Stabellini <sstabellini@kernel.org>
> > > Sent: September 24, 2021 8:26
> > > To: Wei Chen <Wei.Chen@arm.com>
> > > Cc: xen-devel@lists.xenproject.org; sstabellini@kernel.org;
> julien@xen.org;
> > > Bertrand Marquis <Bertrand.Marquis@arm.com>; jbeulich@suse.com;
> > > andrew.cooper3@citrix.com; roger.pau@citrix.com; wl@xen.org
> > > Subject: Re: [PATCH 08/37] xen/x86: add detection of discontinous node
> > > memory range
> > >
> > > CC'ing x86 maintainers
> > >
> > > On Thu, 23 Sep 2021, Wei Chen wrote:
> > > > One NUMA node may contain several memory blocks. In the current
> > > > Xen code, Xen maintains one memory range per node to cover all
> > > > of its memory blocks. But here comes the problem: if the gap
> > > > between two memory blocks of one node contains memory blocks
> > > > that do not belong to this node (remote memory blocks), this
> > > > node's memory range will be expanded to cover these remote
> > > > memory blocks.
> > > >
> > > > One node's memory range containing other nodes' memory is
> > > > obviously not reasonable. This means the current NUMA code can
> > > > only support nodes with continuous memory blocks. However, on a
> > > > physical machine, the addresses of multiple nodes can be
> > > > interleaved.
> > > >
> > > > So in this patch, we add code to detect discontinuous memory
> > > > blocks for one node. NUMA initialization will fail and error
> > > > messages will be printed when Xen detects such a hardware
> > > > configuration.
> > >
> > > At least on ARM, it is not just memory that can be interleaved,
> > > but also MMIO regions. For instance:
> > >
> > > node0 bank0 0-0x1000000
> > > MMIO 0x1000000-0x1002000
> > > Hole 0x1002000-0x2000000
> > > node0 bank1 0x2000000-0x3000000
> > >
> > > So I am not familiar with the SRAT format, but I think on ARM the
> > > check would look different: we would just look for multiple memory
> > > ranges under a device_type = "memory" node of a NUMA node in
> > > device tree.
> > >
> > >
> >
> > Should I include/refine the above message in the commit log?
> 
> Let me ask you a question first.
> 
> With the NUMA implementation of this patch series, can we deal with
> cases where each node has multiple memory banks, not interleaved?

Yes.

> As an example:
> 
> node0: 0x0        - 0x10000000
> MMIO : 0x10000000 - 0x20000000
> node0: 0x20000000 - 0x30000000
> MMIO : 0x30000000 - 0x50000000
> node1: 0x50000000 - 0x60000000
> MMIO : 0x60000000 - 0x80000000
> node2: 0x80000000 - 0x90000000
> 
> 
> I assume we can deal with this case simply by setting node0 memory to
> 0x0-0x30000000 even if there is actually something else, a device, that
> doesn't belong to node0 in between the two node0 banks?

While this configuration is rare in SoC design, it is not impossible.

> 
> Is it only other nodes' memory being interleaved that causes issues?
> In other words, is only the following a problematic scenario?
> 
> node0: 0x0        - 0x10000000
> MMIO : 0x10000000 - 0x20000000
> node1: 0x20000000 - 0x30000000
> MMIO : 0x30000000 - 0x50000000
> node0: 0x50000000 - 0x60000000
> 
> Because node1 is in between the two ranges of node0?
> 

But only ranges with device_type="memory" are added to the allocator.
For MMIO there are two cases:
1. The MMIO region doesn't have a NUMA id property.
2. The MMIO region has a NUMA id property, just like some PCIe
   controllers. But we don't need to handle these kinds of MMIO
   devices in memory block parsing, because we don't need to allocate
   memory from these MMIO ranges. And for accessing them, we would
   need a NUMA-aware PCIe controller driver or generic NUMA-aware
   MMIO access APIs.

> 
> I am asking these questions because it is certainly possible to have
> multiple memory ranges for each NUMA node in device tree, either by
> specifying multiple ranges with a single "reg" property, or by
> specifying multiple memory nodes with the same numa-node-id.



^ permalink raw reply	[flat|nested] 192+ messages in thread

* RE: [PATCH 23/37] xen/arm: implement node distance helpers for Arm
  2021-09-24 19:36       ` Stefano Stabellini
@ 2021-09-26 10:15         ` Wei Chen
  0 siblings, 0 replies; 192+ messages in thread
From: Wei Chen @ 2021-09-26 10:15 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: xen-devel, julien, Bertrand Marquis


> -----Original Message-----
> From: Stefano Stabellini <sstabellini@kernel.org>
> Sent: September 25, 2021 3:36
> To: Wei Chen <Wei.Chen@arm.com>
> Cc: Stefano Stabellini <sstabellini@kernel.org>; xen-
> devel@lists.xenproject.org; julien@xen.org; Bertrand Marquis
> <Bertrand.Marquis@arm.com>
> Subject: RE: [PATCH 23/37] xen/arm: implement node distance helpers for
> Arm
> 
> On Fri, 24 Sep 2021, Wei Chen wrote:
> > > -----Original Message-----
> > > From: Stefano Stabellini <sstabellini@kernel.org>
> > > Sent: September 24, 2021 9:47
> > > To: Wei Chen <Wei.Chen@arm.com>
> > > Cc: xen-devel@lists.xenproject.org; sstabellini@kernel.org;
> julien@xen.org;
> > > Bertrand Marquis <Bertrand.Marquis@arm.com>
> > > Subject: Re: [PATCH 23/37] xen/arm: implement node distance helpers
> for
> > > Arm
> > >
> > > On Thu, 23 Sep 2021, Wei Chen wrote:
> > > > We will parse NUMA node distances from the device tree or ACPI
> > > > table. So we need a matrix to record the distances between any
> > > > two nodes we parsed. Accordingly, in this patch we provide the
> > > > node_set_distance API for the device tree or ACPI table parsers
> > > > to set the distance for any two nodes.
> > > > When NUMA initialization fails, __node_distance will return
> > > > NUMA_REMOTE_DISTANCE; this helps us avoid rolling back the
> > > > distance matrix when NUMA initialization fails.
> > > >
> > > > Signed-off-by: Wei Chen <wei.chen@arm.com>
> > > > ---
> > > >  xen/arch/arm/Makefile      |  1 +
> > > >  xen/arch/arm/numa.c        | 69 ++++++++++++++++++++++++++++++++++++++
> > > >  xen/include/asm-arm/numa.h | 13 +++++++
> > > >  3 files changed, 83 insertions(+)
> > > >  create mode 100644 xen/arch/arm/numa.c
> > > >
> > > > diff --git a/xen/arch/arm/Makefile b/xen/arch/arm/Makefile
> > > > index ae4efbf76e..41ca311b6b 100644
> > > > --- a/xen/arch/arm/Makefile
> > > > +++ b/xen/arch/arm/Makefile
> > > > @@ -35,6 +35,7 @@ obj-$(CONFIG_LIVEPATCH) += livepatch.o
> > > >  obj-y += mem_access.o
> > > >  obj-y += mm.o
> > > >  obj-y += monitor.o
> > > > +obj-$(CONFIG_NUMA) += numa.o
> > > >  obj-y += p2m.o
> > > >  obj-y += percpu.o
> > > >  obj-y += platform.o
> > > > diff --git a/xen/arch/arm/numa.c b/xen/arch/arm/numa.c
> > > > new file mode 100644
> > > > index 0000000000..3f08870d69
> > > > --- /dev/null
> > > > +++ b/xen/arch/arm/numa.c
> > > > @@ -0,0 +1,69 @@
> > > > +// SPDX-License-Identifier: GPL-2.0
> > > > +/*
> > > > + * Arm Architecture support layer for NUMA.
> > > > + *
> > > > + * Copyright (C) 2021 Arm Ltd
> > > > + *
> > > > + * This program is free software; you can redistribute it and/or modify
> > > > + * it under the terms of the GNU General Public License version 2 as
> > > > + * published by the Free Software Foundation.
> > > > + *
> > > > + * This program is distributed in the hope that it will be useful,
> > > > + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> > > > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> > > > + * GNU General Public License for more details.
> > > > + *
> > > > + * You should have received a copy of the GNU General Public License
> > > > + * along with this program. If not, see <http://www.gnu.org/licenses/>.
> > > > + *
> > > > + */
> > > > +#include <xen/init.h>
> > > > +#include <xen/numa.h>
> > > > +
> > > > +static uint8_t __read_mostly
> > > > +node_distance_map[MAX_NUMNODES][MAX_NUMNODES] = {
> > > > +    { 0 }
> > > > +};
> > > > +
> > > > +void __init numa_set_distance(nodeid_t from, nodeid_t to, uint32_t distance)
> > > > +{
> > > > +    if ( from >= MAX_NUMNODES || to >= MAX_NUMNODES )
> > > > +    {
> > > > +        printk(KERN_WARNING
> > > > +               "NUMA: invalid nodes: from=%"PRIu8" to=%"PRIu8" MAX=%"PRIu8"\n",
> > > > +               from, to, MAX_NUMNODES);
> > > > +        return;
> > > > +    }
> > > > +
> > > > +    /* NUMA defines 0xff as an unreachable node and 0-9 are undefined */
> > > > +    if ( distance >= NUMA_NO_DISTANCE ||
> > > > +        (distance >= NUMA_DISTANCE_UDF_MIN &&
> > > > +         distance <= NUMA_DISTANCE_UDF_MAX) ||
> > > > +        (from == to && distance != NUMA_LOCAL_DISTANCE) )
> > > > +    {
> > > > +        printk(KERN_WARNING
> > > > +               "NUMA: invalid distance: from=%"PRIu8" to=%"PRIu8" distance=%"PRIu32"\n",
> > > > +               from, to, distance);
> > > > +        return;
> > > > +    }
> > > > +
> > > > +    node_distance_map[from][to] = distance;
> > > > +}
> > > > +
> > > > +uint8_t __node_distance(nodeid_t from, nodeid_t to)
> > > > +{
> > > > +    /* When NUMA is off, any distance will be treated as remote. */
> > > > +    if ( srat_disabled() )
> > >
> > > Given that this is ARM-specific code and specific to ACPI, I don't
> > > think we should have any call to something called "srat_disabled".
> > >
> > > I suggest renaming srat_disabled to numa_distance_disabled.
> > >
> > > Other than that, this patch looks OK to me.
> > >
> >
> > SRAT stands for Static Resource Affinity Table; I think the DTB can
> > also be treated as a static resource affinity table. So I keep SRAT
> > in this patch and the other patches. I have seen your comment in
> > patch#25. Before the x86 maintainers give any feedback, can we still
> > keep srat here?
> 
> Jan and I replied in the other thread. I think that "SRAT" should not
> be mentioned in warning messages when booting from DT. Ideally,
> function names and variables should be renamed too when shared between
> ACPI and DT, but that is less critical, and it is fine if you don't do
> it in the next version.

Thanks. I'll leave it as it is unless I find a better name.
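For reference, the device-tree counterpart of the SLIT-style distance information that a DT parser would feed into numa_set_distance() is the generic distance-map node; the values below are illustrative:

```dts
distance-map {
    compatible = "numa-distance-map-v1";
    /* Triplets of <from-node to-node distance>; 10 is the local
     * distance, 20 a typical remote distance. */
    distance-matrix = <0 0 10>,
                      <0 1 20>,
                      <1 0 20>,
                      <1 1 10>;
};
```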

^ permalink raw reply	[flat|nested] 192+ messages in thread

* RE: [PATCH 20/37] xen: introduce CONFIG_EFI to stub API for non-EFI architecture
  2021-09-24 10:49           ` Jan Beulich
@ 2021-09-26 10:25             ` Wei Chen
  2021-09-27 10:28               ` Wei Chen
  0 siblings, 1 reply; 192+ messages in thread
From: Wei Chen @ 2021-09-26 10:25 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel, julien, Bertrand Marquis, Stefano Stabellini

Hi Jan,

> -----Original Message-----
> From: Xen-devel <xen-devel-bounces@lists.xenproject.org> On Behalf Of Jan
> Beulich
> Sent: September 24, 2021 18:49
> To: Wei Chen <Wei.Chen@arm.com>
> Cc: xen-devel@lists.xenproject.org; julien@xen.org; Bertrand Marquis
> <Bertrand.Marquis@arm.com>; Stefano Stabellini <sstabellini@kernel.org>
> Subject: Re: [PATCH 20/37] xen: introduce CONFIG_EFI to stub API for non-
> EFI architecture
> 
> On 24.09.2021 12:31, Wei Chen wrote:
> >> From: Jan Beulich <jbeulich@suse.com>
> >> Sent: September 24, 2021 15:59
> >>
> >> On 24.09.2021 06:34, Wei Chen wrote:
> >>>> From: Stefano Stabellini <sstabellini@kernel.org>
> >>>> Sent: September 24, 2021 9:15
> >>>>
> >>>> On Thu, 23 Sep 2021, Wei Chen wrote:
> >>>>> --- a/xen/common/Kconfig
> >>>>> +++ b/xen/common/Kconfig
> >>>>> @@ -11,6 +11,16 @@ config COMPAT
> >>>>>  config CORE_PARKING
> >>>>>  	bool
> >>>>>
> >>>>> +config EFI
> >>>>> +	bool
> >>>>
> >>>> Without the title the option is not user-selectable (or de-
> selectable).
> >>>> So the help message below can never be seen.
> >>>>
> >>>> Either add a title, e.g.:
> >>>>
> >>>> bool "EFI support"
> >>>>
> >>>> Or fully make the option a silent option by removing the help text.
> >>>
> >>> OK, in the current Xen code, EFI is unconditionally compiled.
> >>> Before we change the related code, I prefer to remove the help text.
> >>
> >> But that's not true: At least on x86 EFI gets compiled depending on
> >> tool chain capabilities. Ultimately we may indeed want a user
> >> selectable option here, but until then I'm afraid having this option
> >> at all may be misleading on x86.
> >>
> >
> > I checked the build scripts, and yes, you're right. For x86, EFI is
> > not a selectable option in Kconfig. I agree with you that we can't
> > use the Kconfig system to decide whether to enable the EFI build for
> > x86.
> >
> > So how about we just use this EFI option for Arm only? Because on
> > Arm, we do not have such a toolchain dependency.
> 
> To be honest - don't know. That's because I don't know what you want
> to use the option for subsequently.
> 

In the last version, I had introduced an arch helper to stub EFI_BOOT
for Arm32 in Arm's common code, because Arm32 doesn't support EFI.
So Julien suggested [1] that I introduce a CONFIG_EFI option so that
architectures without EFI support can stub out the EFI layer.

[1] https://lists.xenproject.org/archives/html/xen-devel/2021-08/msg00808.html
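For context, the shape of the option being discussed is roughly the following (an illustrative Kconfig sketch, not the final patch): a silent option with no prompt string, which EFI-capable architectures enable via select, so architectures that never select it (such as Arm32) build against the stubbed EFI layer:

```kconfig
config EFI
	bool
	# No prompt string, so the option is invisible to users and can
	# only be enabled by an architecture's "select EFI" (e.g. from
	# an Arm64 Kconfig entry). Without a prompt, any help text would
	# never be shown, which is why it was dropped.
```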

> Jan
> 


^ permalink raw reply	[flat|nested] 192+ messages in thread

* RE: [PATCH 22/37] xen/arm: use NR_MEM_BANKS to override default NR_NODE_MEMBLKS
  2021-09-24  1:34   ` Stefano Stabellini
@ 2021-09-26 13:13     ` Wei Chen
  2021-09-27  3:25       ` Stefano Stabellini
  0 siblings, 1 reply; 192+ messages in thread
From: Wei Chen @ 2021-09-26 13:13 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: xen-devel, julien, Bertrand Marquis

Hi Stefano,

> -----Original Message-----
> From: Stefano Stabellini <sstabellini@kernel.org>
> Sent: September 24, 2021 9:35
> To: Wei Chen <Wei.Chen@arm.com>
> Cc: xen-devel@lists.xenproject.org; sstabellini@kernel.org; julien@xen.org;
> Bertrand Marquis <Bertrand.Marquis@arm.com>
> Subject: Re: [PATCH 22/37] xen/arm: use NR_MEM_BANKS to override default
> NR_NODE_MEMBLKS
> 
> On Thu, 23 Sep 2021, Wei Chen wrote:
> > A memory range described in the device tree cannot be split across
> > multiple nodes, so we define NR_NODE_MEMBLKS as NR_MEM_BANKS in the
> > arch header.
> 
> This statement is true but what is the goal of this patch? Is it to
> reduce code size and memory consumption?
> 

No, when Julien and I discussed this in the last version[1], we hadn't
thought about it so deeply. We just thought that a memory range
described in DT cannot be split across multiple nodes, so
NR_NODE_MEMBLKS should be equal to NR_MEM_BANKS.

[1] https://lists.xenproject.org/archives/html/xen-devel/2021-08/msg00974.html

> I am asking because NR_MEM_BANKS is 128 and
> NR_NODE_MEMBLKS=2*MAX_NUMNODES which is 64 by default so again
> NR_NODE_MEMBLKS is 128 before this patch.
> 
> In other words, this patch alone doesn't make any difference; at least
> doesn't make any difference unless CONFIG_NR_NUMA_NODES is increased.
> 
> So, is the goal to reduce memory usage when CONFIG_NR_NUMA_NODES is
> higher than 64?
> 

I also thought about this problem when I was writing this patch.
CONFIG_NR_NUMA_NODES can increase, but NR_MEM_BANKS is a fixed value,
so NR_MEM_BANKS can become smaller than CONFIG_NR_NUMA_NODES at some
point.

But I agree with Julien's suggestion that NR_MEM_BANKS and
NR_NODE_MEMBLKS must be aware of each other. I had thought of adding
some ASSERT check, but I don't know how to do it well. So I posted
this patch for more suggestions.

> 
> > And keep the default NR_NODE_MEMBLKS in the common header
> > for those architectures where NUMA is disabled.
> 
> This last sentence is not accurate: on x86 NUMA is enabled and
> NR_NODE_MEMBLKS is still defined in xen/include/xen/numa.h (there is no
> x86 definition of it)
> 

Yes.

> 
> > Signed-off-by: Wei Chen <wei.chen@arm.com>
> > ---
> >  xen/include/asm-arm/numa.h | 8 +++++++-
> >  xen/include/xen/numa.h     | 2 ++
> >  2 files changed, 9 insertions(+), 1 deletion(-)
> >
> > diff --git a/xen/include/asm-arm/numa.h b/xen/include/asm-arm/numa.h
> > index 8f1c67e3eb..21569e634b 100644
> > --- a/xen/include/asm-arm/numa.h
> > +++ b/xen/include/asm-arm/numa.h
> > @@ -3,9 +3,15 @@
> >
> >  #include <xen/mm.h>
> >
> > +#include <asm/setup.h>
> > +
> >  typedef u8 nodeid_t;
> >
> > -#ifndef CONFIG_NUMA
> > +#ifdef CONFIG_NUMA
> > +
> > +#define NR_NODE_MEMBLKS NR_MEM_BANKS
> > +
> > +#else
> >
> >  /* Fake one node for now. See also node_online_map. */
> >  #define cpu_to_node(cpu) 0
> > diff --git a/xen/include/xen/numa.h b/xen/include/xen/numa.h
> > index 1978e2be1b..1731e1cc6b 100644
> > --- a/xen/include/xen/numa.h
> > +++ b/xen/include/xen/numa.h
> > @@ -12,7 +12,9 @@
> >  #define MAX_NUMNODES    1
> >  #endif
> >
> > +#ifndef NR_NODE_MEMBLKS
> >  #define NR_NODE_MEMBLKS (MAX_NUMNODES*2)
> > +#endif
> >
> >  #define vcpu_to_node(v) (cpu_to_node((v)->processor))
> >
> > --
> > 2.25.1
> >

^ permalink raw reply	[flat|nested] 192+ messages in thread

* RE: [PATCH 08/37] xen/x86: add detection of discontinous node memory range
  2021-09-26 10:11         ` Wei Chen
@ 2021-09-27  3:13           ` Stefano Stabellini
  2021-09-27  5:05             ` Stefano Stabellini
  0 siblings, 1 reply; 192+ messages in thread
From: Stefano Stabellini @ 2021-09-27  3:13 UTC (permalink / raw)
  To: Wei Chen
  Cc: Stefano Stabellini, xen-devel, julien, Bertrand Marquis,
	jbeulich, andrew.cooper3, roger.pau, wl

On Sun, 26 Sep 2021, Wei Chen wrote:
> > -----Original Message-----
> > From: Stefano Stabellini <sstabellini@kernel.org>
> > Sent: September 25, 2021 3:53
> > To: Wei Chen <Wei.Chen@arm.com>
> > Cc: Stefano Stabellini <sstabellini@kernel.org>; xen-
> > devel@lists.xenproject.org; julien@xen.org; Bertrand Marquis
> > <Bertrand.Marquis@arm.com>; jbeulich@suse.com; andrew.cooper3@citrix.com;
> > roger.pau@citrix.com; wl@xen.org
> > Subject: RE: [PATCH 08/37] xen/x86: add detection of discontinous node
> > memory range
> > 
> > On Fri, 24 Sep 2021, Wei Chen wrote:
> > > > -----Original Message-----
> > > > From: Stefano Stabellini <sstabellini@kernel.org>
> > > > Sent: September 24, 2021 8:26
> > > > To: Wei Chen <Wei.Chen@arm.com>
> > > > Cc: xen-devel@lists.xenproject.org; sstabellini@kernel.org;
> > julien@xen.org;
> > > > Bertrand Marquis <Bertrand.Marquis@arm.com>; jbeulich@suse.com;
> > > > andrew.cooper3@citrix.com; roger.pau@citrix.com; wl@xen.org
> > > > Subject: Re: [PATCH 08/37] xen/x86: add detection of discontinous node
> > > > memory range
> > > >
> > > > CC'ing x86 maintainers
> > > >
> > > > On Thu, 23 Sep 2021, Wei Chen wrote:
> > > > > One NUMA node may contain several memory blocks. In the
> > > > > current Xen code, Xen maintains one memory range per node to
> > > > > cover all of its memory blocks. But here comes the problem: if
> > > > > the gap between two memory blocks of one node contains memory
> > > > > blocks that do not belong to this node (remote memory blocks),
> > > > > this node's memory range will be expanded to cover these
> > > > > remote memory blocks.
> > > > >
> > > > > One node's memory range containing other nodes' memory is
> > > > > obviously not reasonable. This means the current NUMA code can
> > > > > only support nodes with continuous memory blocks. However, on
> > > > > a physical machine, the addresses of multiple nodes can be
> > > > > interleaved.
> > > > >
> > > > > So in this patch, we add code to detect discontinuous memory
> > > > > blocks for one node. NUMA initialization will fail and error
> > > > > messages will be printed when Xen detects such a hardware
> > > > > configuration.
> > > >
> > > > At least on ARM, it is not just memory that can be interleaved,
> > > > but also MMIO regions. For instance:
> > > >
> > > > node0 bank0 0-0x1000000
> > > > MMIO 0x1000000-0x1002000
> > > > Hole 0x1002000-0x2000000
> > > > node0 bank1 0x2000000-0x3000000
> > > >
> > > > So I am not familiar with the SRAT format, but I think on ARM
> > > > the check would look different: we would just look for multiple
> > > > memory ranges under a device_type = "memory" node of a NUMA node
> > > > in device tree.
> > > >
> > > >
> > >
> > > Should I include/refine the above message in the commit log?
> > 
> > Let me ask you a question first.
> > 
> > With the NUMA implementation of this patch series, can we deal with
> > cases where each node has multiple memory banks, not interleaved?
> 
> Yes.
> 
> > As an example:
> > 
> > node0: 0x0        - 0x10000000
> > MMIO : 0x10000000 - 0x20000000
> > node0: 0x20000000 - 0x30000000
> > MMIO : 0x30000000 - 0x50000000
> > node1: 0x50000000 - 0x60000000
> > MMIO : 0x60000000 - 0x80000000
> > node2: 0x80000000 - 0x90000000
> > 
> > 
> > I assume we can deal with this case simply by setting node0 memory to
> > 0x0-0x30000000 even if there is actually something else, a device, that
> > doesn't belong to node0 in between the two node0 banks?
> 
> While this configuration is rare in SoC design, it is not impossible.

Definitely, I have seen it before.


> > Is it only other nodes' memory being interleaved that causes
> > issues? In other words, is only the following a problematic
> > scenario?
> > 
> > node0: 0x0        - 0x10000000
> > MMIO : 0x10000000 - 0x20000000
> > node1: 0x20000000 - 0x30000000
> > MMIO : 0x30000000 - 0x50000000
> > node0: 0x50000000 - 0x60000000
> > 
> > Because node1 is in between the two ranges of node0?
> > 
> 
> But only ranges with device_type="memory" are added to the allocator.
> For MMIO there are two cases:
> 1. The MMIO region doesn't have a NUMA id property.
> 2. The MMIO region has a NUMA id property, just like some PCIe
>    controllers. But we don't need to handle these kinds of MMIO
>    devices in memory block parsing, because we don't need to allocate
>    memory from these MMIO ranges. And for accessing them, we would
>    need a NUMA-aware PCIe controller driver or generic NUMA-aware
>    MMIO access APIs.

Yes, I am not too worried about devices with a NUMA id property because
they are less common and this series doesn't handle them at all, right?
I imagine they would be treated like any other device without NUMA
awareness.

I am thinking about the case where the memory of each NUMA node is made
of multiple banks. I understand that this patch adds an explicit check
for cases where these banks are interleaving; however, there are many
other cases where NUMA memory nodes are *not* interleaving but are
still made of multiple discontinuous banks, like in the two examples
above.

My question is whether this patch series in its current form can handle
the two cases above correctly. If so, I am wondering how it works given
that we only have a single "start" and "size" parameter per node.

On the other hand if this series cannot handle the two cases above, my
question is whether it would fail explicitly or not. The new
check is_node_memory_continuous doesn't seem to be able to catch them.


> > I am asking these questions because it is certainly possible to have
> > multiple memory ranges for each NUMA node in device tree, either by
> > specifying multiple ranges with a single "reg" property, or by
> > specifying multiple memory nodes with the same numa-node-id.
> 
> 
> 

^ permalink raw reply	[flat|nested] 192+ messages in thread

* RE: [PATCH 22/37] xen/arm: use NR_MEM_BANKS to override default NR_NODE_MEMBLKS
  2021-09-26 13:13     ` Wei Chen
@ 2021-09-27  3:25       ` Stefano Stabellini
  2021-09-27  4:18         ` Wei Chen
  0 siblings, 1 reply; 192+ messages in thread
From: Stefano Stabellini @ 2021-09-27  3:25 UTC (permalink / raw)
  To: Wei Chen; +Cc: Stefano Stabellini, xen-devel, julien, Bertrand Marquis

On Sun, 26 Sep 2021, Wei Chen wrote:
> > -----Original Message-----
> > From: Stefano Stabellini <sstabellini@kernel.org>
> > Sent: September 24, 2021 9:35
> > To: Wei Chen <Wei.Chen@arm.com>
> > Cc: xen-devel@lists.xenproject.org; sstabellini@kernel.org; julien@xen.org;
> > Bertrand Marquis <Bertrand.Marquis@arm.com>
> > Subject: Re: [PATCH 22/37] xen/arm: use NR_MEM_BANKS to override default
> > NR_NODE_MEMBLKS
> > 
> > On Thu, 23 Sep 2021, Wei Chen wrote:
> > > A memory range described in the device tree cannot be split across
> > > multiple nodes, so we define NR_NODE_MEMBLKS as NR_MEM_BANKS in
> > > the arch header.
> > 
> > This statement is true but what is the goal of this patch? Is it to
> > reduce code size and memory consumption?
> > 
> 
> No, when Julien and I discussed this in the last version[1], we
> hadn't thought about it so deeply. We just thought that a memory range
> described in DT cannot be split across multiple nodes, so
> NR_NODE_MEMBLKS should be equal to NR_MEM_BANKS.
> 
> [1] https://lists.xenproject.org/archives/html/xen-devel/2021-08/msg00974.html
> 
> > I am asking because NR_MEM_BANKS is 128 and
> > NR_NODE_MEMBLKS=2*MAX_NUMNODES which is 64 by default so again
> > NR_NODE_MEMBLKS is 128 before this patch.
> > 
> > In other words, this patch alone doesn't make any difference; at least
> > doesn't make any difference unless CONFIG_NR_NUMA_NODES is increased.
> > 
> > So, is the goal to reduce memory usage when CONFIG_NR_NUMA_NODES is
> > higher than 64?
> > 
> 
> I also thought about this problem when I was writing this patch.
> CONFIG_NR_NUMA_NODES can increase, but NR_MEM_BANKS is a fixed value,
> so NR_MEM_BANKS can become smaller than CONFIG_NR_NUMA_NODES at some
> point.
> 
> But I agree with Julien's suggestion that NR_MEM_BANKS and
> NR_NODE_MEMBLKS must be aware of each other. I had thought of adding
> some ASSERT check, but I don't know how to do it well. So I posted
> this patch for more suggestions.

OK. In that case I'd say to get rid of the previous definition of
NR_NODE_MEMBLKS as it is probably not necessary, see below.



> > 
> > > And keep the default NR_NODE_MEMBLKS in the common header
> > > for those architectures where NUMA is disabled.
> > 
> > This last sentence is not accurate: on x86 NUMA is enabled and
> > NR_NODE_MEMBLKS is still defined in xen/include/xen/numa.h (there is no
> > x86 definition of it)
> > 
> 
> Yes.
> 
> > 
> > > Signed-off-by: Wei Chen <wei.chen@arm.com>
> > > ---
> > >  xen/include/asm-arm/numa.h | 8 +++++++-
> > >  xen/include/xen/numa.h     | 2 ++
> > >  2 files changed, 9 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/xen/include/asm-arm/numa.h b/xen/include/asm-arm/numa.h
> > > index 8f1c67e3eb..21569e634b 100644
> > > --- a/xen/include/asm-arm/numa.h
> > > +++ b/xen/include/asm-arm/numa.h
> > > @@ -3,9 +3,15 @@
> > >
> > >  #include <xen/mm.h>
> > >
> > > +#include <asm/setup.h>
> > > +
> > >  typedef u8 nodeid_t;
> > >
> > > -#ifndef CONFIG_NUMA
> > > +#ifdef CONFIG_NUMA
> > > +
> > > +#define NR_NODE_MEMBLKS NR_MEM_BANKS
> > > +
> > > +#else
> > >
> > >  /* Fake one node for now. See also node_online_map. */
> > >  #define cpu_to_node(cpu) 0
> > > diff --git a/xen/include/xen/numa.h b/xen/include/xen/numa.h
> > > index 1978e2be1b..1731e1cc6b 100644
> > > --- a/xen/include/xen/numa.h
> > > +++ b/xen/include/xen/numa.h
> > > @@ -12,7 +12,9 @@
> > >  #define MAX_NUMNODES    1
> > >  #endif
> > >
> > > +#ifndef NR_NODE_MEMBLKS
> > >  #define NR_NODE_MEMBLKS (MAX_NUMNODES*2)
> > > +#endif

This one we can remove completely, right?

^ permalink raw reply	[flat|nested] 192+ messages in thread

* RE: [PATCH 22/37] xen/arm: use NR_MEM_BANKS to override default NR_NODE_MEMBLKS
  2021-09-27  3:25       ` Stefano Stabellini
@ 2021-09-27  4:18         ` Wei Chen
  2021-09-27  4:59           ` Stefano Stabellini
  0 siblings, 1 reply; 192+ messages in thread
From: Wei Chen @ 2021-09-27  4:18 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: xen-devel, julien, Bertrand Marquis

Hi Stefano,

> -----Original Message-----
> From: Stefano Stabellini <sstabellini@kernel.org>
> Sent: September 27, 2021 11:26
> To: Wei Chen <Wei.Chen@arm.com>
> Cc: Stefano Stabellini <sstabellini@kernel.org>; xen-
> devel@lists.xenproject.org; julien@xen.org; Bertrand Marquis
> <Bertrand.Marquis@arm.com>
> Subject: RE: [PATCH 22/37] xen/arm: use NR_MEM_BANKS to override default
> NR_NODE_MEMBLKS
> 
> On Sun, 26 Sep 2021, Wei Chen wrote:
> > > -----Original Message-----
> > > From: Stefano Stabellini <sstabellini@kernel.org>
> > > Sent: September 24, 2021 9:35
> > > To: Wei Chen <Wei.Chen@arm.com>
> > > Cc: xen-devel@lists.xenproject.org; sstabellini@kernel.org;
> julien@xen.org;
> > > Bertrand Marquis <Bertrand.Marquis@arm.com>
> > > Subject: Re: [PATCH 22/37] xen/arm: use NR_MEM_BANKS to override
> default
> > > NR_NODE_MEMBLKS
> > >
> > > On Thu, 23 Sep 2021, Wei Chen wrote:
> > > > A memory range described in the device tree cannot be split
> > > > across multiple nodes, so we define NR_NODE_MEMBLKS as
> > > > NR_MEM_BANKS in the arch header.
> > >
> > > This statement is true but what is the goal of this patch? Is it to
> > > reduce code size and memory consumption?
> > >
> >
> > No, when Julien and I discussed this in the last version[1], we
> > hadn't thought about it so deeply. We just thought that a memory
> > range described in DT cannot be split across multiple nodes, so
> > NR_NODE_MEMBLKS should be equal to NR_MEM_BANKS.
> >
> > [1] https://lists.xenproject.org/archives/html/xen-devel/2021-08/msg00974.html
> >
> > > I am asking because NR_MEM_BANKS is 128 and
> > > NR_NODE_MEMBLKS=2*MAX_NUMNODES which is 64 by default so again
> > > NR_NODE_MEMBLKS is 128 before this patch.
> > >
> > > In other words, this patch alone doesn't make any difference; at least
> > > doesn't make any difference unless CONFIG_NR_NUMA_NODES is increased.
> > >
> > > So, is the goal to reduce memory usage when CONFIG_NR_NUMA_NODES is
> > > higher than 64?
> > >
> >
> > I also thought about this problem when I was writing this patch.
> > CONFIG_NR_NUMA_NODES can increase, but NR_MEM_BANKS is a fixed
> > value, so NR_MEM_BANKS can become smaller than CONFIG_NR_NUMA_NODES
> > at some point.
> >
> > But I agree with Julien's suggestion that NR_MEM_BANKS and
> > NR_NODE_MEMBLKS must be aware of each other. I had thought of adding
> > some ASSERT check, but I don't know how to do it well. So I posted
> > this patch for more suggestions.
> 
> OK. In that case I'd say to get rid of the previous definition of
> NR_NODE_MEMBLKS as it is probably not necessary, see below.
> 
> 
> 
> > >
> > > > And keep the default NR_NODE_MEMBLKS in the common header
> > > > for those architectures where NUMA is disabled.
> > >
> > > This last sentence is not accurate: on x86 NUMA is enabled and
> > > NR_NODE_MEMBLKS is still defined in xen/include/xen/numa.h (there is
> no
> > > x86 definition of it)
> > >
> >
> > Yes.
> >
> > >
> > > > Signed-off-by: Wei Chen <wei.chen@arm.com>
> > > > ---
> > > >  xen/include/asm-arm/numa.h | 8 +++++++-
> > > >  xen/include/xen/numa.h     | 2 ++
> > > >  2 files changed, 9 insertions(+), 1 deletion(-)
> > > >
> > > > diff --git a/xen/include/asm-arm/numa.h b/xen/include/asm-arm/numa.h
> > > > index 8f1c67e3eb..21569e634b 100644
> > > > --- a/xen/include/asm-arm/numa.h
> > > > +++ b/xen/include/asm-arm/numa.h
> > > > @@ -3,9 +3,15 @@
> > > >
> > > >  #include <xen/mm.h>
> > > >
> > > > +#include <asm/setup.h>
> > > > +
> > > >  typedef u8 nodeid_t;
> > > >
> > > > -#ifndef CONFIG_NUMA
> > > > +#ifdef CONFIG_NUMA
> > > > +
> > > > +#define NR_NODE_MEMBLKS NR_MEM_BANKS
> > > > +
> > > > +#else
> > > >
> > > >  /* Fake one node for now. See also node_online_map. */
> > > >  #define cpu_to_node(cpu) 0
> > > > diff --git a/xen/include/xen/numa.h b/xen/include/xen/numa.h
> > > > index 1978e2be1b..1731e1cc6b 100644
> > > > --- a/xen/include/xen/numa.h
> > > > +++ b/xen/include/xen/numa.h
> > > > @@ -12,7 +12,9 @@
> > > >  #define MAX_NUMNODES    1
> > > >  #endif
> > > >
> > > > +#ifndef NR_NODE_MEMBLKS
> > > >  #define NR_NODE_MEMBLKS (MAX_NUMNODES*2)
> > > > +#endif
> 
> This one we can remove it completely right?

How about defining NR_MEM_BANKS as:
#ifdef CONFIG_NR_NUMA_NODES
#define NR_MEM_BANKS (CONFIG_NR_NUMA_NODES * 2)
#else
#define NR_MEM_BANKS 128
#endif
for both x86 and Arm? Architectures that do not support or enable NUMA
can still use "NR_MEM_BANKS 128". We could then replace all uses of
NR_NODE_MEMBLKS in the NUMA code with NR_MEM_BANKS and remove
NR_NODE_MEMBLKS completely. That way, NR_MEM_BANKS tracks changes to
CONFIG_NR_NUMA_NODES.





^ permalink raw reply	[flat|nested] 192+ messages in thread

* RE: [PATCH 22/37] xen/arm: use NR_MEM_BANKS to override default NR_NODE_MEMBLKS
  2021-09-27  4:18         ` Wei Chen
@ 2021-09-27  4:59           ` Stefano Stabellini
  2021-09-27  6:25             ` Julien Grall
  2021-09-27  6:46             ` Wei Chen
  0 siblings, 2 replies; 192+ messages in thread
From: Stefano Stabellini @ 2021-09-27  4:59 UTC (permalink / raw)
  To: Wei Chen
  Cc: Stefano Stabellini, xen-devel, julien, Bertrand Marquis,
	jbeulich, roger.pau, andrew.cooper3

+x86 maintainers

On Mon, 27 Sep 2021, Wei Chen wrote:
> > -----Original Message-----
> > From: Stefano Stabellini <sstabellini@kernel.org>
> > Sent: 2021年9月27日 11:26
> > To: Wei Chen <Wei.Chen@arm.com>
> > Cc: Stefano Stabellini <sstabellini@kernel.org>; xen-
> > devel@lists.xenproject.org; julien@xen.org; Bertrand Marquis
> > <Bertrand.Marquis@arm.com>
> > Subject: RE: [PATCH 22/37] xen/arm: use NR_MEM_BANKS to override default
> > NR_NODE_MEMBLKS
> > 
> > On Sun, 26 Sep 2021, Wei Chen wrote:
> > > > -----Original Message-----
> > > > From: Stefano Stabellini <sstabellini@kernel.org>
> > > > Sent: 2021年9月24日 9:35
> > > > To: Wei Chen <Wei.Chen@arm.com>
> > > > Cc: xen-devel@lists.xenproject.org; sstabellini@kernel.org;
> > julien@xen.org;
> > > > Bertrand Marquis <Bertrand.Marquis@arm.com>
> > > > Subject: Re: [PATCH 22/37] xen/arm: use NR_MEM_BANKS to override
> > default
> > > > NR_NODE_MEMBLKS
> > > >
> > > > On Thu, 23 Sep 2021, Wei Chen wrote:
> > > > > As a memory range described in device tree cannot be split across
> > > > > multiple nodes. So we define NR_NODE_MEMBLKS as NR_MEM_BANKS in
> > > > > arch header.
> > > >
> > > > This statement is true but what is the goal of this patch? Is it to
> > > > reduce code size and memory consumption?
> > > >
> > >
> > > No, when Julien and I discussed this in last version[1], we hadn't
> > thought
> > > so deeply. We just thought a memory range described in DT cannot be
> > split
> > > across multiple nodes. So NR_NODE_MEMBLKS should be equal to NR_MEM_BANKS.
> > >
> > > https://lists.xenproject.org/archives/html/xen-devel/2021-
> > 08/msg00974.html
> > >
> > > > I am asking because NR_MEM_BANKS is 128 and
> > > > NR_NODE_MEMBLKS=2*MAX_NUMNODES which is 64 by default so again
> > > > NR_NODE_MEMBLKS is 128 before this patch.
> > > >
> > > > In other words, this patch alone doesn't make any difference; at least
> > > > doesn't make any difference unless CONFIG_NR_NUMA_NODES is increased.
> > > >
> > > > So, is the goal to reduce memory usage when CONFIG_NR_NUMA_NODES is
> > > > higher than 64?
> > > >
> > >
> > > I also thought about this problem when I was writing this patch.
> > > CONFIG_NR_NUMA_NODES is increasing, but NR_MEM_BANKS is a fixed
> > > value, then NR_MEM_BANKS can be smaller than CONFIG_NR_NUMA_NODES
> > > at one point.
> > >
> > > But I agree with Julien's suggestion, NR_MEM_BANKS and NR_NODE_MEMBLKS
> > > must be aware of each other. I had thought to add some ASSERT check,
> > > but I don't know how to do it better. So I post this patch for more
> > > suggestion.
> > 
> > OK. In that case I'd say to get rid of the previous definition of
> > NR_NODE_MEMBLKS as it is probably not necessary, see below.
> > 
> > 
> > 
> > > >
> > > > > And keep default NR_NODE_MEMBLKS in common header
> > > > > for those architectures NUMA is disabled.
> > > >
> > > > This last sentence is not accurate: on x86 NUMA is enabled and
> > > > NR_NODE_MEMBLKS is still defined in xen/include/xen/numa.h (there is
> > no
> > > > x86 definition of it)
> > > >
> > >
> > > Yes.
> > >
> > > >
> > > > > Signed-off-by: Wei Chen <wei.chen@arm.com>
> > > > > ---
> > > > >  xen/include/asm-arm/numa.h | 8 +++++++-
> > > > >  xen/include/xen/numa.h     | 2 ++
> > > > >  2 files changed, 9 insertions(+), 1 deletion(-)
> > > > >
> > > > > diff --git a/xen/include/asm-arm/numa.h b/xen/include/asm-arm/numa.h
> > > > > index 8f1c67e3eb..21569e634b 100644
> > > > > --- a/xen/include/asm-arm/numa.h
> > > > > +++ b/xen/include/asm-arm/numa.h
> > > > > @@ -3,9 +3,15 @@
> > > > >
> > > > >  #include <xen/mm.h>
> > > > >
> > > > > +#include <asm/setup.h>
> > > > > +
> > > > >  typedef u8 nodeid_t;
> > > > >
> > > > > -#ifndef CONFIG_NUMA
> > > > > +#ifdef CONFIG_NUMA
> > > > > +
> > > > > +#define NR_NODE_MEMBLKS NR_MEM_BANKS
> > > > > +
> > > > > +#else
> > > > >
> > > > >  /* Fake one node for now. See also node_online_map. */
> > > > >  #define cpu_to_node(cpu) 0
> > > > > diff --git a/xen/include/xen/numa.h b/xen/include/xen/numa.h
> > > > > index 1978e2be1b..1731e1cc6b 100644
> > > > > --- a/xen/include/xen/numa.h
> > > > > +++ b/xen/include/xen/numa.h
> > > > > @@ -12,7 +12,9 @@
> > > > >  #define MAX_NUMNODES    1
> > > > >  #endif
> > > > >
> > > > > +#ifndef NR_NODE_MEMBLKS
> > > > >  #define NR_NODE_MEMBLKS (MAX_NUMNODES*2)
> > > > > +#endif
> > 
> > This one we can remove it completely right?
> 
> How about define NR_MEM_BANKS to:
> #ifdef CONFIG_NR_NUMA_NODES
> #define NR_MEM_BANKS (CONFIG_NR_NUMA_NODES * 2)
> #else
> #define NR_MEM_BANKS 128
> #endif
> for both x86 and Arm. For those architectures do not support or enable
> NUMA, they can still use "NR_MEM_BANKS 128". And replace all NR_NODE_MEMBLKS
> in NUMA code to NR_MEM_BANKS to remove NR_NODE_MEMBLKS completely.
> In this case, NR_MEM_BANKS can be aware of the changes of CONFIG_NR_NUMA_NODES.

x86 doesn't have NR_MEM_BANKS as far as I can tell. I guess you also
meant to rename NR_NODE_MEMBLKS to NR_MEM_BANKS?

But NR_MEM_BANKS is not directly related to CONFIG_NR_NUMA_NODES because
there can be many memory banks for each numa node, certainly more than
2. The existing definition on x86:

#define NR_NODE_MEMBLKS (MAX_NUMNODES*2)

Doesn't make a lot of sense to me. Was it just an arbitrary limit for
the lack of a better way to set a maximum?


On the other hand, NR_MEM_BANKS and NR_NODE_MEMBLKS seem to be related.
In fact, what's the difference?

NR_MEM_BANKS is the max number of memory banks (with or without
numa-node-id).

NR_NODE_MEMBLKS is the max number of memory banks with NUMA support
(with numa-node-id)?

They are basically the same thing. On ARM I would just do:

#define NR_NODE_MEMBLKS MAX(NR_MEM_BANKS, (CONFIG_NR_NUMA_NODES * 2))


And maybe the definition could be common with x86 if we define
NR_MEM_BANKS to 128 on x86 too.


* RE: [PATCH 08/37] xen/x86: add detection of discontinous node memory range
  2021-09-27  3:13           ` Stefano Stabellini
@ 2021-09-27  5:05             ` Stefano Stabellini
  2021-09-27  9:50               ` Wei Chen
  0 siblings, 1 reply; 192+ messages in thread
From: Stefano Stabellini @ 2021-09-27  5:05 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: Wei Chen, xen-devel, julien, Bertrand Marquis, jbeulich,
	andrew.cooper3, roger.pau, wl

On Sun, 26 Sep 2021, Stefano Stabellini wrote:
> On Sun, 26 Sep 2021, Wei Chen wrote:
> > > -----Original Message-----
> > > From: Stefano Stabellini <sstabellini@kernel.org>
> > > Sent: 2021年9月25日 3:53
> > > To: Wei Chen <Wei.Chen@arm.com>
> > > Cc: Stefano Stabellini <sstabellini@kernel.org>; xen-
> > > devel@lists.xenproject.org; julien@xen.org; Bertrand Marquis
> > > <Bertrand.Marquis@arm.com>; jbeulich@suse.com; andrew.cooper3@citrix.com;
> > > roger.pau@citrix.com; wl@xen.org
> > > Subject: RE: [PATCH 08/37] xen/x86: add detection of discontinous node
> > > memory range
> > > 
> > > On Fri, 24 Sep 2021, Wei Chen wrote:
> > > > > -----Original Message-----
> > > > > From: Stefano Stabellini <sstabellini@kernel.org>
> > > > > Sent: 2021年9月24日 8:26
> > > > > To: Wei Chen <Wei.Chen@arm.com>
> > > > > Cc: xen-devel@lists.xenproject.org; sstabellini@kernel.org;
> > > julien@xen.org;
> > > > > Bertrand Marquis <Bertrand.Marquis@arm.com>; jbeulich@suse.com;
> > > > > andrew.cooper3@citrix.com; roger.pau@citrix.com; wl@xen.org
> > > > > Subject: Re: [PATCH 08/37] xen/x86: add detection of discontinous node
> > > > > memory range
> > > > >
> > > > > CC'ing x86 maintainers
> > > > >
> > > > > On Thu, 23 Sep 2021, Wei Chen wrote:
> > > > > > One NUMA node may contain several memory blocks. In the
> > > > > > current Xen code, Xen maintains a single memory range per node
> > > > > > to cover all of its memory blocks. But here comes the problem:
> > > > > > in the gap between two of a node's memory blocks there may be
> > > > > > memory blocks that don't belong to this node (remote memory
> > > > > > blocks). This node's memory range will be expanded to cover
> > > > > > these remote memory blocks.
> > > > > >
> > > > > > One node's memory range containing other nodes' memory is
> > > > > > obviously not reasonable. It means the current NUMA code can
> > > > > > only support nodes with continuous memory blocks. However, on
> > > > > > a physical machine, the addresses of multiple nodes can be
> > > > > > interleaved.
> > > > > >
> > > > > > So in this patch, we add code to detect discontinuous memory
> > > > > > blocks for one node. NUMA initialization will fail and error
> > > > > > messages will be printed when Xen detects such a hardware
> > > > > > configuration.
> > > > >
> > > > > At least on ARM, it is not just memory that can be interleaved, but
> > > also
> > > > > MMIO regions. For instance:
> > > > >
> > > > > node0 bank0 0-0x1000000
> > > > > MMIO 0x1000000-0x1002000
> > > > > Hole 0x1002000-0x2000000
> > > > > node0 bank1 0x2000000-0x3000000
> > > > >
> > > > > So I am not familiar with the SRAT format, but I think on ARM the
> > > check
> > > > > would look different: we would just look for multiple memory ranges
> > > > > under a device_type = "memory" node of a NUMA node in device tree.
> > > > >
> > > > >
> > > >
> > > > Should I need to include/refine above message to commit log?
> > > 
> > > Let me ask you a question first.
> > > 
> > > With the NUMA implementation of this patch series, can we deal with
> > > cases where each node has multiple memory banks, not interleaved?
> > 
> > Yes.
> > 
> > > An an example:
> > > 
> > > node0: 0x0        - 0x10000000
> > > MMIO : 0x10000000 - 0x20000000
> > > node0: 0x20000000 - 0x30000000
> > > MMIO : 0x30000000 - 0x50000000
> > > node1: 0x50000000 - 0x60000000
> > > MMIO : 0x60000000 - 0x80000000
> > > node2: 0x80000000 - 0x90000000
> > > 
> > > 
> > > I assume we can deal with this case simply by setting node0 memory to
> > > 0x0-0x30000000 even if there is actually something else, a device, that
> > > doesn't belong to node0 in between the two node0 banks?
> > 
> > While this configuration is rare in SoC design, it is not impossible.
> 
> Definitely, I have seen it before.
> 
> 
> > > Is it only other nodes' memory interleaved that cause issues? In other
> > > words, only the following is a problematic scenario?
> > > 
> > > node0: 0x0        - 0x10000000
> > > MMIO : 0x10000000 - 0x20000000
> > > node1: 0x20000000 - 0x30000000
> > > MMIO : 0x30000000 - 0x50000000
> > > node0: 0x50000000 - 0x60000000
> > > 
> > > Because node1 is in between the two ranges of node0?
> > > 
> > 
> > But only ranges with device_type="memory" are added to the allocator.
> > For mmio there are two cases:
> > 1. mmio doesn't have NUMA id property.
> > 2. mmio has NUMA id property, just like some PCIe controllers.
> >    But we don’t need to handle these kinds of MMIO devices
> >    in memory block parsing. Because we don't need to allocate
> >    memory from these mmio ranges. And for accessing, we need
> >    a NUMA-aware PCIe controller driver or a generic NUMA-aware
> >    MMIO accessing APIs.
> 
> Yes, I am not too worried about devices with a NUMA id property because
> they are less common and this series doesn't handle them at all, right?
> I imagine they would be treated like any other device without NUMA
> awareness.
> 
> I am thinking about the case where the memory of each NUMA node is made
> of multiple banks. I understand that this patch adds an explicit check
> for cases where these banks are interleaving, however there are many
> other cases where NUMA memory nodes are *not* interleaving but they are
> still made of multiple discontinuous banks, like in the two example
> above.
> 
> My question is whether this patch series in its current form can handle
> the two cases above correctly. If so, I am wondering how it works given
> that we only have a single "start" and "size" parameter per node.
> 
> On the other hand if this series cannot handle the two cases above, my
> question is whether it would fail explicitly or not. The new
> check is_node_memory_continuous doesn't seem to be able to catch them.


Looking at numa_update_node_memblks, it is clear that the code is meant
to increase the range of each numa node to cover even MMIO regions in
between memory banks. Also see the comment at the top of the file:

 * Assumes all memory regions belonging to a single proximity domain
 * are in one chunk. Holes between them will be included in the node.

So if there are multiple banks for each node, start and end are
stretched to cover the holes between them, and it works as long as
memory banks of different NUMA nodes don't interleave.

I would appreciate if you could add an in-code comment to explain this
on top of numa_update_node_memblk.

Have you had a chance to test this? If not it would be fantastic if you
could give it a quick test to make sure it works as intended: for
instance by creating multiple memory banks for each NUMA node by
splitting a real bank into two smaller banks with a hole in between in
device tree, just for the sake of testing.


* Re: [PATCH 22/37] xen/arm: use NR_MEM_BANKS to override default NR_NODE_MEMBLKS
  2021-09-27  4:59           ` Stefano Stabellini
@ 2021-09-27  6:25             ` Julien Grall
  2021-09-27  6:46             ` Wei Chen
  1 sibling, 0 replies; 192+ messages in thread
From: Julien Grall @ 2021-09-27  6:25 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: Wei Chen, xen-devel, Bertrand Marquis, Jan Beulich,
	Roger Pau Monné,
	Andrew Cooper

On Mon, 27 Sep 2021, 06:59 Stefano Stabellini, <sstabellini@kernel.org>
wrote:

> +x86 maintainers
>
> On Mon, 27 Sep 2021, Wei Chen wrote:
> > > -----Original Message-----
> > > From: Stefano Stabellini <sstabellini@kernel.org>
> > > Sent: 2021年9月27日 11:26
> > > To: Wei Chen <Wei.Chen@arm.com>
> > > Cc: Stefano Stabellini <sstabellini@kernel.org>; xen-
> > > devel@lists.xenproject.org; julien@xen.org; Bertrand Marquis
> > > <Bertrand.Marquis@arm.com>
> > > Subject: RE: [PATCH 22/37] xen/arm: use NR_MEM_BANKS to override
> default
> > > NR_NODE_MEMBLKS
> > >
> > > On Sun, 26 Sep 2021, Wei Chen wrote:
> > > > > -----Original Message-----
> > > > > From: Stefano Stabellini <sstabellini@kernel.org>
> > > > > Sent: 2021年9月24日 9:35
> > > > > To: Wei Chen <Wei.Chen@arm.com>
> > > > > Cc: xen-devel@lists.xenproject.org; sstabellini@kernel.org;
> > > julien@xen.org;
> > > > > Bertrand Marquis <Bertrand.Marquis@arm.com>
> > > > > Subject: Re: [PATCH 22/37] xen/arm: use NR_MEM_BANKS to override
> > > default
> > > > > NR_NODE_MEMBLKS
> > > > >
> > > > > On Thu, 23 Sep 2021, Wei Chen wrote:
> > > > > > As a memory range described in device tree cannot be split across
> > > > > > multiple nodes. So we define NR_NODE_MEMBLKS as NR_MEM_BANKS in
> > > > > > arch header.
> > > > >
> > > > > This statement is true but what is the goal of this patch? Is it to
> > > > > reduce code size and memory consumption?
> > > > >
> > > >
> > > > No, when Julien and I discussed this in last version[1], we hadn't
> > > thought
> > > > so deeply. We just thought a memory range described in DT cannot be
> > > split
> > > > across multiple nodes. So NR_NODE_MEMBLKS should be equal to
> NR_MEM_BANKS.
> > > >
> > > > https://lists.xenproject.org/archives/html/xen-devel/2021-
> > > 08/msg00974.html
> > > >
> > > > > I am asking because NR_MEM_BANKS is 128 and
> > > > > NR_NODE_MEMBLKS=2*MAX_NUMNODES which is 64 by default so again
> > > > > NR_NODE_MEMBLKS is 128 before this patch.
> > > > >
> > > > > In other words, this patch alone doesn't make any difference; at
> least
> > > > > doesn't make any difference unless CONFIG_NR_NUMA_NODES is
> increased.
> > > > >
> > > > > So, is the goal to reduce memory usage when CONFIG_NR_NUMA_NODES is
> > > > > higher than 64?
> > > > >
> > > >
> > > > I also thought about this problem when I was writing this patch.
> > > > CONFIG_NR_NUMA_NODES is increasing, but NR_MEM_BANKS is a fixed
> > > > value, then NR_MEM_BANKS can be smaller than CONFIG_NR_NUMA_NODES
> > > > at one point.
> > > >
> > > > But I agree with Julien's suggestion, NR_MEM_BANKS and
> NR_NODE_MEMBLKS
> > > > must be aware of each other. I had thought to add some ASSERT check,
> > > > but I don't know how to do it better. So I post this patch for more
> > > > suggestion.
> > >
> > > OK. In that case I'd say to get rid of the previous definition of
> > > NR_NODE_MEMBLKS as it is probably not necessary, see below.
> > >
> > >
> > >
> > > > >
> > > > > > And keep default NR_NODE_MEMBLKS in common header
> > > > > > for those architectures NUMA is disabled.
> > > > >
> > > > > This last sentence is not accurate: on x86 NUMA is enabled and
> > > > > NR_NODE_MEMBLKS is still defined in xen/include/xen/numa.h (there
> is
> > > no
> > > > > x86 definition of it)
> > > > >
> > > >
> > > > Yes.
> > > >
> > > > >
> > > > > > Signed-off-by: Wei Chen <wei.chen@arm.com>
> > > > > > ---
> > > > > >  xen/include/asm-arm/numa.h | 8 +++++++-
> > > > > >  xen/include/xen/numa.h     | 2 ++
> > > > > >  2 files changed, 9 insertions(+), 1 deletion(-)
> > > > > >
> > > > > > diff --git a/xen/include/asm-arm/numa.h
> b/xen/include/asm-arm/numa.h
> > > > > > index 8f1c67e3eb..21569e634b 100644
> > > > > > --- a/xen/include/asm-arm/numa.h
> > > > > > +++ b/xen/include/asm-arm/numa.h
> > > > > > @@ -3,9 +3,15 @@
> > > > > >
> > > > > >  #include <xen/mm.h>
> > > > > >
> > > > > > +#include <asm/setup.h>
> > > > > > +
> > > > > >  typedef u8 nodeid_t;
> > > > > >
> > > > > > -#ifndef CONFIG_NUMA
> > > > > > +#ifdef CONFIG_NUMA
> > > > > > +
> > > > > > +#define NR_NODE_MEMBLKS NR_MEM_BANKS
> > > > > > +
> > > > > > +#else
> > > > > >
> > > > > >  /* Fake one node for now. See also node_online_map. */
> > > > > >  #define cpu_to_node(cpu) 0
> > > > > > diff --git a/xen/include/xen/numa.h b/xen/include/xen/numa.h
> > > > > > index 1978e2be1b..1731e1cc6b 100644
> > > > > > --- a/xen/include/xen/numa.h
> > > > > > +++ b/xen/include/xen/numa.h
> > > > > > @@ -12,7 +12,9 @@
> > > > > >  #define MAX_NUMNODES    1
> > > > > >  #endif
> > > > > >
> > > > > > +#ifndef NR_NODE_MEMBLKS
> > > > > >  #define NR_NODE_MEMBLKS (MAX_NUMNODES*2)
> > > > > > +#endif
> > >
> > > This one we can remove it completely right?
> >
> > How about define NR_MEM_BANKS to:
> > #ifdef CONFIG_NR_NUMA_NODES
> > #define NR_MEM_BANKS (CONFIG_NR_NUMA_NODES * 2)
> > #else
> > #define NR_MEM_BANKS 128
> > #endif
> > for both x86 and Arm. For those architectures do not support or enable
> > NUMA, they can still use "NR_MEM_BANKS 128". And replace all
> NR_NODE_MEMBLKS
> > in NUMA code to NR_MEM_BANKS to remove NR_NODE_MEMBLKS completely.
> > In this case, NR_MEM_BANKS can be aware of the changes of
> CONFIG_NR_NUMA_NODES.
>
> x86 doesn't have NR_MEM_BANKS as far as I can tell. I guess you also
> meant to rename NR_NODE_MEMBLKS to NR_MEM_BANKS?
>
> But NR_MEM_BANKS is not directly related to CONFIG_NR_NUMA_NODES because
> there can be many memory banks for each numa node, certainly more than
> 2. The existing definition on x86:
>
> #define NR_NODE_MEMBLKS (MAX_NUMNODES*2)
>
> Doesn't make a lot of sense to me. Was it just an arbitrary limit for
> the lack of a better way to set a maximum?
>
>
> On the other hand, NR_MEM_BANKS and NR_NODE_MEMBLKS seem to be related.
> In fact, what's the difference?
>
> NR_MEM_BANKS is the max number of memory banks (with or without
> numa-node-id).
>
> NR_NODE_MEMBLKS is the max number of memory banks with NUMA support
> (with numa-node-id)?
>
> They are basically the same thing. On ARM I would just do:
>
> #define NR_NODE_MEMBLKS MAX(NR_MEM_BANKS, (CONFIG_NR_NUMA_NODES * 2))
>

As you wrote above, the second part of the MAX is totally arbitrary. In
fact, it is very likely that if you have more than 64 nodes, you will
need a lot more than 2 regions per node.

So, for Arm, I would just define NR_NODE_MEMBLKS as an alias to
NR_MEM_BANKS so it can be used by common code.


* RE: [PATCH 22/37] xen/arm: use NR_MEM_BANKS to override default NR_NODE_MEMBLKS
  2021-09-27  4:59           ` Stefano Stabellini
  2021-09-27  6:25             ` Julien Grall
@ 2021-09-27  6:46             ` Wei Chen
  2021-09-27  6:53               ` Wei Chen
  1 sibling, 1 reply; 192+ messages in thread
From: Wei Chen @ 2021-09-27  6:46 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: xen-devel, julien, Bertrand Marquis, jbeulich, roger.pau, andrew.cooper3

Hi Stefano, Julien,

> -----Original Message-----
> From: Stefano Stabellini <sstabellini@kernel.org>
> Sent: 2021年9月27日 13:00
> To: Wei Chen <Wei.Chen@arm.com>
> Cc: Stefano Stabellini <sstabellini@kernel.org>; xen-
> devel@lists.xenproject.org; julien@xen.org; Bertrand Marquis
> <Bertrand.Marquis@arm.com>; jbeulich@suse.com; roger.pau@citrix.com;
> andrew.cooper3@citrix.com
> Subject: RE: [PATCH 22/37] xen/arm: use NR_MEM_BANKS to override default
> NR_NODE_MEMBLKS
> 
> +x86 maintainers
> 
> On Mon, 27 Sep 2021, Wei Chen wrote:
> > > -----Original Message-----
> > > From: Stefano Stabellini <sstabellini@kernel.org>
> > > Sent: 2021年9月27日 11:26
> > > To: Wei Chen <Wei.Chen@arm.com>
> > > Cc: Stefano Stabellini <sstabellini@kernel.org>; xen-
> > > devel@lists.xenproject.org; julien@xen.org; Bertrand Marquis
> > > <Bertrand.Marquis@arm.com>
> > > Subject: RE: [PATCH 22/37] xen/arm: use NR_MEM_BANKS to override
> default
> > > NR_NODE_MEMBLKS
> > >
> > > On Sun, 26 Sep 2021, Wei Chen wrote:
> > > > > -----Original Message-----
> > > > > From: Stefano Stabellini <sstabellini@kernel.org>
> > > > > Sent: 2021年9月24日 9:35
> > > > > To: Wei Chen <Wei.Chen@arm.com>
> > > > > Cc: xen-devel@lists.xenproject.org; sstabellini@kernel.org;
> > > julien@xen.org;
> > > > > Bertrand Marquis <Bertrand.Marquis@arm.com>
> > > > > Subject: Re: [PATCH 22/37] xen/arm: use NR_MEM_BANKS to override
> > > default
> > > > > NR_NODE_MEMBLKS
> > > > >
> > > > > On Thu, 23 Sep 2021, Wei Chen wrote:
> > > > > > As a memory range described in device tree cannot be split
> across
> > > > > > multiple nodes. So we define NR_NODE_MEMBLKS as NR_MEM_BANKS in
> > > > > > arch header.
> > > > >
> > > > > This statement is true but what is the goal of this patch? Is it
> to
> > > > > reduce code size and memory consumption?
> > > > >
> > > >
> > > > No, when Julien and I discussed this in last version[1], we hadn't
> > > thought
> > > > so deeply. We just thought a memory range described in DT cannot be
> > > split
> > > > across multiple nodes. So NR_NODE_MEMBLKS should be equal to
> NR_MEM_BANKS.
> > > >
> > > > https://lists.xenproject.org/archives/html/xen-devel/2021-
> > > 08/msg00974.html
> > > >
> > > > > I am asking because NR_MEM_BANKS is 128 and
> > > > > NR_NODE_MEMBLKS=2*MAX_NUMNODES which is 64 by default so again
> > > > > NR_NODE_MEMBLKS is 128 before this patch.
> > > > >
> > > > > In other words, this patch alone doesn't make any difference; at
> least
> > > > > doesn't make any difference unless CONFIG_NR_NUMA_NODES is
> increased.
> > > > >
> > > > > So, is the goal to reduce memory usage when CONFIG_NR_NUMA_NODES
> is
> > > > > higher than 64?
> > > > >
> > > >
> > > > I also thought about this problem when I was writing this patch.
> > > > CONFIG_NR_NUMA_NODES is increasing, but NR_MEM_BANKS is a fixed
> > > > value, then NR_MEM_BANKS can be smaller than CONFIG_NR_NUMA_NODES
> > > > at one point.
> > > >
> > > > But I agree with Julien's suggestion, NR_MEM_BANKS and
> NR_NODE_MEMBLKS
> > > > must be aware of each other. I had thought to add some ASSERT check,
> > > > but I don't know how to do it better. So I post this patch for more
> > > > suggestion.
> > >
> > > OK. In that case I'd say to get rid of the previous definition of
> > > NR_NODE_MEMBLKS as it is probably not necessary, see below.
> > >
> > >
> > >
> > > > >
> > > > > > And keep default NR_NODE_MEMBLKS in common header
> > > > > > for those architectures NUMA is disabled.
> > > > >
> > > > > This last sentence is not accurate: on x86 NUMA is enabled and
> > > > > NR_NODE_MEMBLKS is still defined in xen/include/xen/numa.h (there
> is
> > > no
> > > > > x86 definition of it)
> > > > >
> > > >
> > > > Yes.
> > > >
> > > > >
> > > > > > Signed-off-by: Wei Chen <wei.chen@arm.com>
> > > > > > ---
> > > > > >  xen/include/asm-arm/numa.h | 8 +++++++-
> > > > > >  xen/include/xen/numa.h     | 2 ++
> > > > > >  2 files changed, 9 insertions(+), 1 deletion(-)
> > > > > >
> > > > > > diff --git a/xen/include/asm-arm/numa.h b/xen/include/asm-
> arm/numa.h
> > > > > > index 8f1c67e3eb..21569e634b 100644
> > > > > > --- a/xen/include/asm-arm/numa.h
> > > > > > +++ b/xen/include/asm-arm/numa.h
> > > > > > @@ -3,9 +3,15 @@
> > > > > >
> > > > > >  #include <xen/mm.h>
> > > > > >
> > > > > > +#include <asm/setup.h>
> > > > > > +
> > > > > >  typedef u8 nodeid_t;
> > > > > >
> > > > > > -#ifndef CONFIG_NUMA
> > > > > > +#ifdef CONFIG_NUMA
> > > > > > +
> > > > > > +#define NR_NODE_MEMBLKS NR_MEM_BANKS
> > > > > > +
> > > > > > +#else
> > > > > >
> > > > > >  /* Fake one node for now. See also node_online_map. */
> > > > > >  #define cpu_to_node(cpu) 0
> > > > > > diff --git a/xen/include/xen/numa.h b/xen/include/xen/numa.h
> > > > > > index 1978e2be1b..1731e1cc6b 100644
> > > > > > --- a/xen/include/xen/numa.h
> > > > > > +++ b/xen/include/xen/numa.h
> > > > > > @@ -12,7 +12,9 @@
> > > > > >  #define MAX_NUMNODES    1
> > > > > >  #endif
> > > > > >
> > > > > > +#ifndef NR_NODE_MEMBLKS
> > > > > >  #define NR_NODE_MEMBLKS (MAX_NUMNODES*2)
> > > > > > +#endif
> > >
> > > This one we can remove it completely right?
> >
> > How about define NR_MEM_BANKS to:
> > #ifdef CONFIG_NR_NUMA_NODES
> > #define NR_MEM_BANKS (CONFIG_NR_NUMA_NODES * 2)
> > #else
> > #define NR_MEM_BANKS 128
> > #endif
> > for both x86 and Arm. For those architectures do not support or enable
> > NUMA, they can still use "NR_MEM_BANKS 128". And replace all
> NR_NODE_MEMBLKS
> > in NUMA code to NR_MEM_BANKS to remove NR_NODE_MEMBLKS completely.
> > In this case, NR_MEM_BANKS can be aware of the changes of
> CONFIG_NR_NUMA_NODES.
> 
> x86 doesn't have NR_MEM_BANKS as far as I can tell. I guess you also
> meant to rename NR_NODE_MEMBLKS to NR_MEM_BANKS?
> 

Yes.

> But NR_MEM_BANKS is not directly related to CONFIG_NR_NUMA_NODES because
> there can be many memory banks for each numa node, certainly more than
> 2. The existing definition on x86:
> 
> #define NR_NODE_MEMBLKS (MAX_NUMNODES*2)
> 
> Doesn't make a lot of sense to me. Was it just an arbitrary limit for
> the lack of a better way to set a maximum?
> 

At that time, this was probably the most cost-effective approach:
sufficient and simple. But if more nodes need to be supported in the
future, that may bring more memory blocks, and this maximum value might
no longer apply. The maximum may then need to support dynamic extension.

> 
> On the other hand, NR_MEM_BANKS and NR_NODE_MEMBLKS seem to be related.
> In fact, what's the difference?
> 
> NR_MEM_BANKS is the max number of memory banks (with or without
> numa-node-id).
> 
> NR_NODE_MEMBLKS is the max number of memory banks with NUMA support
> (with numa-node-id)?
> 
> They are basically the same thing. On ARM I would just do:
> 

Probably not: NR_MEM_BANKS counts memory ranges regardless of
numa-node-id during the boot memory parsing stage (process_memory_node
or the EFI parser), whereas NR_NODE_MEMBLKS only counts memory ranges
with a numa-node-id.
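To illustrate the distinction, a hypothetical device tree fragment
(addresses and sizes made up): both banks count toward NR_MEM_BANKS, but
only the second, which carries a numa-node-id, would count toward
NR_NODE_MEMBLKS:

```dts
/* Illustrative fragment only, not from a real platform. */
memory@0 {
        device_type = "memory";
        reg = <0x0 0x0 0x0 0x10000000>;
        /* no numa-node-id: a plain memory bank */
};

memory@50000000 {
        device_type = "memory";
        reg = <0x0 0x50000000 0x0 0x10000000>;
        numa-node-id = <0x1>;   /* a NUMA-annotated memory bank */
};
```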

> #define NR_NODE_MEMBLKS MAX(NR_MEM_BANKS, (CONFIG_NR_NUMA_NODES * 2))
> 
> 
> And maybe the definition could be common with x86 if we define
> NR_MEM_BANKS to 128 on x86 too.

Julien had a comment here; I will continue in that email.


* RE: [PATCH 22/37] xen/arm: use NR_MEM_BANKS to override default NR_NODE_MEMBLKS
  2021-09-27  6:46             ` Wei Chen
@ 2021-09-27  6:53               ` Wei Chen
  2021-09-27  7:35                 ` Julien Grall
  0 siblings, 1 reply; 192+ messages in thread
From: Wei Chen @ 2021-09-27  6:53 UTC (permalink / raw)
  To: Wei Chen, Stefano Stabellini
  Cc: xen-devel, julien, Bertrand Marquis, jbeulich, roger.pau, andrew.cooper3

Hi Julien,

> -----Original Message-----
> From: Xen-devel <xen-devel-bounces@lists.xenproject.org> On Behalf Of Wei
> Chen
> Sent: 2021年9月27日 14:46
> To: Stefano Stabellini <sstabellini@kernel.org>
> Cc: xen-devel@lists.xenproject.org; julien@xen.org; Bertrand Marquis
> <Bertrand.Marquis@arm.com>; jbeulich@suse.com; roger.pau@citrix.com;
> andrew.cooper3@citrix.com
> Subject: RE: [PATCH 22/37] xen/arm: use NR_MEM_BANKS to override default
> NR_NODE_MEMBLKS
> 
> Hi Stefano, Julien,
> 
> > -----Original Message-----
> > From: Stefano Stabellini <sstabellini@kernel.org>
> > Sent: 2021年9月27日 13:00
> > To: Wei Chen <Wei.Chen@arm.com>
> > Cc: Stefano Stabellini <sstabellini@kernel.org>; xen-
> > devel@lists.xenproject.org; julien@xen.org; Bertrand Marquis
> > <Bertrand.Marquis@arm.com>; jbeulich@suse.com; roger.pau@citrix.com;
> > andrew.cooper3@citrix.com
> > Subject: RE: [PATCH 22/37] xen/arm: use NR_MEM_BANKS to override default
> > NR_NODE_MEMBLKS
> >
> > +x86 maintainers
> >
> > On Mon, 27 Sep 2021, Wei Chen wrote:
> > > > -----Original Message-----
> > > > From: Stefano Stabellini <sstabellini@kernel.org>
> > > > Sent: 2021年9月27日 11:26
> > > > To: Wei Chen <Wei.Chen@arm.com>
> > > > Cc: Stefano Stabellini <sstabellini@kernel.org>; xen-
> > > > devel@lists.xenproject.org; julien@xen.org; Bertrand Marquis
> > > > <Bertrand.Marquis@arm.com>
> > > > Subject: RE: [PATCH 22/37] xen/arm: use NR_MEM_BANKS to override
> > default
> > > > NR_NODE_MEMBLKS
> > > >
> > > > On Sun, 26 Sep 2021, Wei Chen wrote:
> > > > > > -----Original Message-----
> > > > > > From: Stefano Stabellini <sstabellini@kernel.org>
> > > > > > Sent: 2021年9月24日 9:35
> > > > > > To: Wei Chen <Wei.Chen@arm.com>
> > > > > > Cc: xen-devel@lists.xenproject.org; sstabellini@kernel.org;
> > > > julien@xen.org;
> > > > > > Bertrand Marquis <Bertrand.Marquis@arm.com>
> > > > > > Subject: Re: [PATCH 22/37] xen/arm: use NR_MEM_BANKS to override
> > > > default
> > > > > > NR_NODE_MEMBLKS
> > > > > >
> > > > > > On Thu, 23 Sep 2021, Wei Chen wrote:
> > > > > > > As a memory range described in device tree cannot be split
> > across
> > > > > > > multiple nodes. So we define NR_NODE_MEMBLKS as NR_MEM_BANKS
> in
> > > > > > > arch header.
> > > > > >
> > > > > > This statement is true but what is the goal of this patch? Is it
> > to
> > > > > > reduce code size and memory consumption?
> > > > > >
> > > > >
> > > > > No, when Julien and I discussed this in last version[1], we hadn't
> > > > thought
> > > > > so deeply. We just thought a memory range described in DT cannot
> be
> > > > split
> > > > > across multiple nodes. So NR_MEM_BANKS should be equal to
> > NR_MEM_BANKS.
> > > > >
> > > > > https://lists.xenproject.org/archives/html/xen-devel/2021-
> > > > 08/msg00974.html
> > > > >
> > > > > > I am asking because NR_MEM_BANKS is 128 and
> > > > > > NR_NODE_MEMBLKS=2*MAX_NUMNODES which is 64 by default so again
> > > > > > NR_NODE_MEMBLKS is 128 before this patch.
> > > > > >
> > > > > > In other words, this patch alone doesn't make any difference; at
> > least
> > > > > > doesn't make any difference unless CONFIG_NR_NUMA_NODES is
> > increased.
> > > > > >
> > > > > > So, is the goal to reduce memory usage when CONFIG_NR_NUMA_NODES
> > is
> > > > > > higher than 64?
> > > > > >
> > > > >
> > > > > I also thought about this problem when I was writing this patch.
> > > > > CONFIG_NR_NUMA_NODES is increasing, but NR_MEM_BANKS is a fixed
> > > > > value, then NR_MEM_BANKS can be smaller than CONFIG_NR_NUMA_NODES
> > > > > at one point.
> > > > >
> > > > > But I agree with Julien's suggestion, NR_MEM_BANKS and
> > NR_NODE_MEMBLKS
> > > > > must be aware of each other. I had thought to add some ASSERT
> check,
> > > > > but I don't know how to do it better. So I post this patch for
> more
> > > > > suggestion.
> > > >
> > > > OK. In that case I'd say to get rid of the previous definition of
> > > > NR_NODE_MEMBLKS as it is probably not necessary, see below.
> > > >
> > > >
> > > >
> > > > > >
> > > > > > > And keep default NR_NODE_MEMBLKS in common header
> > > > > > > for those architectures NUMA is disabled.
> > > > > >
> > > > > > This last sentence is not accurate: on x86 NUMA is enabled and
> > > > > > NR_NODE_MEMBLKS is still defined in xen/include/xen/numa.h
> (there
> > is
> > > > no
> > > > > > x86 definition of it)
> > > > > >
> > > > >
> > > > > Yes.
> > > > >
> > > > > >
> > > > > > > Signed-off-by: Wei Chen <wei.chen@arm.com>
> > > > > > > ---
> > > > > > >  xen/include/asm-arm/numa.h | 8 +++++++-
> > > > > > >  xen/include/xen/numa.h     | 2 ++
> > > > > > >  2 files changed, 9 insertions(+), 1 deletion(-)
> > > > > > >
> > > > > > > diff --git a/xen/include/asm-arm/numa.h b/xen/include/asm-
> > arm/numa.h
> > > > > > > index 8f1c67e3eb..21569e634b 100644
> > > > > > > --- a/xen/include/asm-arm/numa.h
> > > > > > > +++ b/xen/include/asm-arm/numa.h
> > > > > > > @@ -3,9 +3,15 @@
> > > > > > >
> > > > > > >  #include <xen/mm.h>
> > > > > > >
> > > > > > > +#include <asm/setup.h>
> > > > > > > +
> > > > > > >  typedef u8 nodeid_t;
> > > > > > >
> > > > > > > -#ifndef CONFIG_NUMA
> > > > > > > +#ifdef CONFIG_NUMA
> > > > > > > +
> > > > > > > +#define NR_NODE_MEMBLKS NR_MEM_BANKS
> > > > > > > +
> > > > > > > +#else
> > > > > > >
> > > > > > >  /* Fake one node for now. See also node_online_map. */
> > > > > > >  #define cpu_to_node(cpu) 0
> > > > > > > diff --git a/xen/include/xen/numa.h b/xen/include/xen/numa.h
> > > > > > > index 1978e2be1b..1731e1cc6b 100644
> > > > > > > --- a/xen/include/xen/numa.h
> > > > > > > +++ b/xen/include/xen/numa.h
> > > > > > > @@ -12,7 +12,9 @@
> > > > > > >  #define MAX_NUMNODES    1
> > > > > > >  #endif
> > > > > > >
> > > > > > > +#ifndef NR_NODE_MEMBLKS
> > > > > > >  #define NR_NODE_MEMBLKS (MAX_NUMNODES*2)
> > > > > > > +#endif
> > > >
> > > > This one we can remove it completely right?
> > >
> > > How about define NR_MEM_BANKS to:
> > > #ifdef CONFIG_NR_NUMA_NODES
> > > #define NR_MEM_BANKS (CONFIG_NR_NUMA_NODES * 2)
> > > #else
> > > #define NR_MEM_BANKS 128
> > > #endif
> > > for both x86 and Arm. For those architectures do not support or enable
> > > NUMA, they can still use "NR_MEM_BANKS 128". And replace all
> > NR_NODE_MEMBLKS
> > > in NUMA code to NR_MEM_BANKS to remove NR_NODE_MEMBLKS completely.
> > > In this case, NR_MEM_BANKS can be aware of the changes of
> > CONFIG_NR_NUMA_NODES.
> >
> > x86 doesn't have NR_MEM_BANKS as far as I can tell. I guess you also
> > meant to rename NR_NODE_MEMBLKS to NR_MEM_BANKS?
> >
> 
> Yes.
> 
> > But NR_MEM_BANKS is not directly related to CONFIG_NR_NUMA_NODES because
> > there can be many memory banks for each numa node, certainly more than
> > 2. The existing definition on x86:
> >
> > #define NR_NODE_MEMBLKS (MAX_NUMNODES*2)
> >
> > Doesn't make a lot of sense to me. Was it just an arbitrary limit for
> > the lack of a better way to set a maximum?
> >
> 
> At that time, this was probably the most cost-effective approach.
> Enough and easy. But, if more nodes need to be supported in the
> future, it may bring more memory blocks. And this maximum value
> might not apply. The maximum may need to support dynamic extension.
> 
> >
> > On the other hand, NR_MEM_BANKS and NR_NODE_MEMBLKS seem to be related.
> > In fact, what's the difference?
> >
> > NR_MEM_BANKS is the max number of memory banks (with or without
> > numa-node-id).
> >
> > NR_NODE_MEMBLKS is the max number of memory banks with NUMA support
> > (with numa-node-id)?
> >
> > They are basically the same thing. On ARM I would just do:
> >
> 
> Probably not, NR_MEM_BANKS will count those memory ranges without
> numa-node-id in boot memory parsing stage (process_memory_node or
> EFI parser). But NR_NODE_MEMBLKS will only count those memory ranges
> with numa-node-id.
> 
> > #define NR_NODE_MEMBLKS MAX(NR_MEM_BANKS, (CONFIG_NR_NUMA_NODES * 2))
> >
> >

Quoting Julien's comment from the HTML email here:
" As you wrote above, the second part of the MAX is totally arbitrary.
In fact, it is very likely that if you have more than 64 nodes, you may
need a lot more than 2 regions per node.

So, for Arm, I would just define NR_NODE_MEMBLKS as an alias to NR_MEM_BANKS
so it can be used by common code.
"

But here comes the problem:
How can we set the NR_MEM_BANKS maximum value? 128 seems arbitrary too.
If we #define NR_MEM_BANKS (CONFIG_NR_NUMA_NODES * N), what should N be?

> > And maybe the definition could be common with x86 if we define
> > NR_MEM_BANKS to 128 on x86 too.
> 
> Julien had comment here, I will continue in that email.


* Re: [PATCH 22/37] xen/arm: use NR_MEM_BANKS to override default NR_NODE_MEMBLKS
  2021-09-27  6:53               ` Wei Chen
@ 2021-09-27  7:35                 ` Julien Grall
  2021-09-27 10:21                   ` Wei Chen
  0 siblings, 1 reply; 192+ messages in thread
From: Julien Grall @ 2021-09-27  7:35 UTC (permalink / raw)
  To: Wei Chen
  Cc: Stefano Stabellini, xen-devel, Bertrand Marquis, Jan Beulich,
	Roger Pau Monné,
	Andrew Cooper


On Mon, 27 Sep 2021, 08:53 Wei Chen, <Wei.Chen@arm.com> wrote:

> Hi Julien,
>
> > -----Original Message-----
> > From: Xen-devel <xen-devel-bounces@lists.xenproject.org> On Behalf Of
> Wei
> > Chen
> > Sent: 2021年9月27日 14:46
> > To: Stefano Stabellini <sstabellini@kernel.org>
> > Cc: xen-devel@lists.xenproject.org; julien@xen.org; Bertrand Marquis
> > <Bertrand.Marquis@arm.com>; jbeulich@suse.com; roger.pau@citrix.com;
> > andrew.cooper3@citrix.com
> > Subject: RE: [PATCH 22/37] xen/arm: use NR_MEM_BANKS to override default
> > NR_NODE_MEMBLKS
> >
> > Hi Stefano, Julien,
> >
> > > -----Original Message-----
> > > From: Stefano Stabellini <sstabellini@kernel.org>
> > > Sent: 2021年9月27日 13:00
> > > To: Wei Chen <Wei.Chen@arm.com>
> > > Cc: Stefano Stabellini <sstabellini@kernel.org>; xen-
> > > devel@lists.xenproject.org; julien@xen.org; Bertrand Marquis
> > > <Bertrand.Marquis@arm.com>; jbeulich@suse.com; roger.pau@citrix.com;
> > > andrew.cooper3@citrix.com
> > > Subject: RE: [PATCH 22/37] xen/arm: use NR_MEM_BANKS to override
> default
> > > NR_NODE_MEMBLKS
> > >
> > > +x86 maintainers
> > >
> > > On Mon, 27 Sep 2021, Wei Chen wrote:
> > > > > -----Original Message-----
> > > > > From: Stefano Stabellini <sstabellini@kernel.org>
> > > > > Sent: 2021年9月27日 11:26
> > > > > To: Wei Chen <Wei.Chen@arm.com>
> > > > > Cc: Stefano Stabellini <sstabellini@kernel.org>; xen-
> > > > > devel@lists.xenproject.org; julien@xen.org; Bertrand Marquis
> > > > > <Bertrand.Marquis@arm.com>
> > > > > Subject: RE: [PATCH 22/37] xen/arm: use NR_MEM_BANKS to override
> > > default
> > > > > NR_NODE_MEMBLKS
> > > > >
> > > > > On Sun, 26 Sep 2021, Wei Chen wrote:
> > > > > > > -----Original Message-----
> > > > > > > From: Stefano Stabellini <sstabellini@kernel.org>
> > > > > > > Sent: 2021年9月24日 9:35
> > > > > > > To: Wei Chen <Wei.Chen@arm.com>
> > > > > > > Cc: xen-devel@lists.xenproject.org; sstabellini@kernel.org;
> > > > > julien@xen.org;
> > > > > > > Bertrand Marquis <Bertrand.Marquis@arm.com>
> > > > > > > Subject: Re: [PATCH 22/37] xen/arm: use NR_MEM_BANKS to
> override
> > > > > default
> > > > > > > NR_NODE_MEMBLKS
> > > > > > >
> > > > > > > On Thu, 23 Sep 2021, Wei Chen wrote:
> > > > > > > > As a memory range described in device tree cannot be split
> > > across
> > > > > > > > multiple nodes. So we define NR_NODE_MEMBLKS as NR_MEM_BANKS
> > in
> > > > > > > > arch header.
> > > > > > >
> > > > > > > This statement is true but what is the goal of this patch? Is
> it
> > > to
> > > > > > > reduce code size and memory consumption?
> > > > > > >
> > > > > >
> > > > > > No, when Julien and I discussed this in last version[1], we
> hadn't
> > > > > thought
> > > > > > so deeply. We just thought a memory range described in DT cannot
> > be
> > > > > split
> > > > > > across multiple nodes. So NR_MEM_BANKS should be equal to
> > > NR_MEM_BANKS.
> > > > > >
> > > > > > https://lists.xenproject.org/archives/html/xen-devel/2021-
> > > > > 08/msg00974.html
> > > > > >
> > > > > > > I am asking because NR_MEM_BANKS is 128 and
> > > > > > > NR_NODE_MEMBLKS=2*MAX_NUMNODES which is 64 by default so again
> > > > > > > NR_NODE_MEMBLKS is 128 before this patch.
> > > > > > >
> > > > > > > In other words, this patch alone doesn't make any difference;
> at
> > > least
> > > > > > > doesn't make any difference unless CONFIG_NR_NUMA_NODES is
> > > increased.
> > > > > > >
> > > > > > > So, is the goal to reduce memory usage when
> CONFIG_NR_NUMA_NODES
> > > is
> > > > > > > higher than 64?
> > > > > > >
> > > > > >
> > > > > > I also thought about this problem when I was writing this patch.
> > > > > > CONFIG_NR_NUMA_NODES is increasing, but NR_MEM_BANKS is a fixed
> > > > > > value, then NR_MEM_BANKS can be smaller than CONFIG_NR_NUMA_NODES
> > > > > > at one point.
> > > > > >
> > > > > > But I agree with Julien's suggestion, NR_MEM_BANKS and
> > > NR_NODE_MEMBLKS
> > > > > > must be aware of each other. I had thought to add some ASSERT
> > check,
> > > > > > but I don't know how to do it better. So I post this patch for
> > more
> > > > > > suggestion.
> > > > >
> > > > > OK. In that case I'd say to get rid of the previous definition of
> > > > > NR_NODE_MEMBLKS as it is probably not necessary, see below.
> > > > >
> > > > >
> > > > >
> > > > > > >
> > > > > > > > And keep default NR_NODE_MEMBLKS in common header
> > > > > > > > for those architectures NUMA is disabled.
> > > > > > >
> > > > > > > This last sentence is not accurate: on x86 NUMA is enabled and
> > > > > > > NR_NODE_MEMBLKS is still defined in xen/include/xen/numa.h
> > (there
> > > is
> > > > > no
> > > > > > > x86 definition of it)
> > > > > > >
> > > > > >
> > > > > > Yes.
> > > > > >
> > > > > > >
> > > > > > > > Signed-off-by: Wei Chen <wei.chen@arm.com>
> > > > > > > > ---
> > > > > > > >  xen/include/asm-arm/numa.h | 8 +++++++-
> > > > > > > >  xen/include/xen/numa.h     | 2 ++
> > > > > > > >  2 files changed, 9 insertions(+), 1 deletion(-)
> > > > > > > >
> > > > > > > > diff --git a/xen/include/asm-arm/numa.h b/xen/include/asm-
> > > arm/numa.h
> > > > > > > > index 8f1c67e3eb..21569e634b 100644
> > > > > > > > --- a/xen/include/asm-arm/numa.h
> > > > > > > > +++ b/xen/include/asm-arm/numa.h
> > > > > > > > @@ -3,9 +3,15 @@
> > > > > > > >
> > > > > > > >  #include <xen/mm.h>
> > > > > > > >
> > > > > > > > +#include <asm/setup.h>
> > > > > > > > +
> > > > > > > >  typedef u8 nodeid_t;
> > > > > > > >
> > > > > > > > -#ifndef CONFIG_NUMA
> > > > > > > > +#ifdef CONFIG_NUMA
> > > > > > > > +
> > > > > > > > +#define NR_NODE_MEMBLKS NR_MEM_BANKS
> > > > > > > > +
> > > > > > > > +#else
> > > > > > > >
> > > > > > > >  /* Fake one node for now. See also node_online_map. */
> > > > > > > >  #define cpu_to_node(cpu) 0
> > > > > > > > diff --git a/xen/include/xen/numa.h b/xen/include/xen/numa.h
> > > > > > > > index 1978e2be1b..1731e1cc6b 100644
> > > > > > > > --- a/xen/include/xen/numa.h
> > > > > > > > +++ b/xen/include/xen/numa.h
> > > > > > > > @@ -12,7 +12,9 @@
> > > > > > > >  #define MAX_NUMNODES    1
> > > > > > > >  #endif
> > > > > > > >
> > > > > > > > +#ifndef NR_NODE_MEMBLKS
> > > > > > > >  #define NR_NODE_MEMBLKS (MAX_NUMNODES*2)
> > > > > > > > +#endif
> > > > >
> > > > > This one we can remove it completely right?
> > > >
> > > > How about define NR_MEM_BANKS to:
> > > > #ifdef CONFIG_NR_NUMA_NODES
> > > > #define NR_MEM_BANKS (CONFIG_NR_NUMA_NODES * 2)
> > > > #else
> > > > #define NR_MEM_BANKS 128
> > > > #endif
> > > > for both x86 and Arm. For those architectures do not support or
> enable
> > > > NUMA, they can still use "NR_MEM_BANKS 128". And replace all
> > > NR_NODE_MEMBLKS
> > > > in NUMA code to NR_MEM_BANKS to remove NR_NODE_MEMBLKS completely.
> > > > In this case, NR_MEM_BANKS can be aware of the changes of
> > > CONFIG_NR_NUMA_NODES.
> > >
> > > x86 doesn't have NR_MEM_BANKS as far as I can tell. I guess you also
> > > meant to rename NR_NODE_MEMBLKS to NR_MEM_BANKS?
> > >
> >
> > Yes.
> >
> > > But NR_MEM_BANKS is not directly related to CONFIG_NR_NUMA_NODES
> because
> > > there can be many memory banks for each numa node, certainly more than
> > > 2. The existing definition on x86:
> > >
> > > #define NR_NODE_MEMBLKS (MAX_NUMNODES*2)
> > >
> > > Doesn't make a lot of sense to me. Was it just an arbitrary limit for
> > > the lack of a better way to set a maximum?
> > >
> >
> > At that time, this was probably the most cost-effective approach.
> > Enough and easy. But, if more nodes need to be supported in the
> > future, it may bring more memory blocks. And this maximum value
> > might not apply. The maximum may need to support dynamic extension.
> >
> > >
> > > On the other hand, NR_MEM_BANKS and NR_NODE_MEMBLKS seem to be related.
> > > In fact, what's the difference?
> > >
> > > NR_MEM_BANKS is the max number of memory banks (with or without
> > > numa-node-id).
> > >
> > > NR_NODE_MEMBLKS is the max number of memory banks with NUMA support
> > > (with numa-node-id)?
> > >
> > > They are basically the same thing. On ARM I would just do:
> > >
> >
> > Probably not, NR_MEM_BANKS will count those memory ranges without
> > numa-node-id in boot memory parsing stage (process_memory_node or
> > EFI parser). But NR_NODE_MEMBLKS will only count those memory ranges
> > with numa-node-id.
> >
> > > #define NR_NODE_MEMBLKS MAX(NR_MEM_BANKS, (CONFIG_NR_NUMA_NODES * 2))
> > >
> > >
>
> Quote Julien's comment from HTML email to here:
> " As you wrote above, the second part of the MAX is totally arbitrary.
> In fact, it is very likely that if you have more than 64 nodes, you may
> need a lot more than 2 regions per node.
>
> So, for Arm, I would just define NR_NODE_MEMBLKS as an alias to
> NR_MEM_BANKS
> so it can be used by common code.
> "
>
> But here comes the problem:
> How can we set the NR_MEM_BANKS maximum value, 128 seems an arbitrary too?
>

This is based on the hardware we currently support (the last time we bumped
the value was, IIRC, for Thunder-X). When booting via UEFI, we can get a
lot of small ranges as we discover the RAM from the UEFI memory map.

> If #define NR_MEM_BANKS (CONFIG_NR_NUMA_NODES * N)? And what N should be?


N would have to be the maximum number of ranges you can find in a NUMA node.

We would also need to make sure this doesn't break existing platforms. So N
would have to be quite large or we need a MAX as Stefano suggested.

But I would prefer to keep the existing 128 and allow it to be configured at
build time (not necessarily in this series). This avoids having different
ways to define the value based on NUMA vs non-NUMA.
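A build-time knob along these lines could work (a sketch only, not part of this series; the prompt text and help wording are hypothetical):

```kconfig
# Sketch: keep 128 as the default but make the limit tunable at build
# time, independent of whether NUMA is enabled.
config NR_MEM_BANKS
	int "Maximum number of memory banks"
	default 128
	help
	  Maximum number of memory banks Xen can handle. The default
	  covers currently supported hardware; raise it for platforms
	  where UEFI reports a more fragmented memory map.
```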


> > > And maybe the definition could be common with x86 if we define
> > > NR_MEM_BANKS to 128 on x86 too.
> >
> > Julien had comment here, I will continue in that email.
>



* Re: [PATCH 36/37] xen/arm: Provide Kconfig options for Arm to enable NUMA
  2021-09-24 19:39       ` Stefano Stabellini
@ 2021-09-27  8:33         ` Jan Beulich
  2021-09-27  8:45           ` Julien Grall
  0 siblings, 1 reply; 192+ messages in thread
From: Jan Beulich @ 2021-09-27  8:33 UTC (permalink / raw)
  To: Stefano Stabellini, Wei Chen; +Cc: xen-devel, julien, Bertrand.Marquis

On 24.09.2021 21:39, Stefano Stabellini wrote:
> On Fri, 24 Sep 2021, Wei Chen wrote:
>> On 2021/9/24 11:31, Stefano Stabellini wrote:
>>> On Thu, 23 Sep 2021, Wei Chen wrote:
>>>> --- a/xen/arch/arm/Kconfig
>>>> +++ b/xen/arch/arm/Kconfig
>>>> @@ -34,6 +34,17 @@ config ACPI
>>>>   	  Advanced Configuration and Power Interface (ACPI) support for Xen is
>>>>   	  an alternative to device tree on ARM64.
>>>>   + config DEVICE_TREE_NUMA
>>>> +	def_bool n
>>>> +	select NUMA
>>>> +
>>>> +config ARM_NUMA
>>>> +	bool "Arm NUMA (Non-Uniform Memory Access) Support (UNSUPPORTED)" if
>>>> UNSUPPORTED
>>>> +	select DEVICE_TREE_NUMA if HAS_DEVICE_TREE
>>>
>>> Should it be: depends on HAS_DEVICE_TREE ?
>>> (And eventually depends on HAS_DEVICE_TREE || ACPI)
>>>
>>
>> As the discussion in RFC [1]. We want to make ARM_NUMA as a generic
>> option can be selected by users. And depends on has_device_tree
>> or ACPI to select DEVICE_TREE_NUMA or ACPI_NUMA.
>>
>> If we add HAS_DEVICE_TREE || ACPI as dependencies for ARM_NUMA,
>> does it become a loop dependency?
>>
>> https://lists.xenproject.org/archives/html/xen-devel/2021-08/msg00888.html
> 
> OK, I am fine with that. I was just trying to catch the case where a
> user selects "ARM_NUMA" but actually neither ACPI nor HAS_DEVICE_TREE
> are selected so nothing happens. I was trying to make it clear that
> ARM_NUMA depends on having at least one between HAS_DEVICE_TREE or ACPI
> because otherwise it is not going to work.
> 
> That said, I don't think this is important because HAS_DEVICE_TREE
> cannot be unselected. So if we cannot find a way to express the
> dependency, I think it is fine to keep the patch as is.

So how about doing things the other way around: ARM_NUMA has no prompt
and defaults to ACPI_NUMA || DT_NUMA, and DT_NUMA gains a prompt instead
(and, for Arm at least, ACPI_NUMA as well; this might even be worthwhile
to have on x86 down the road).
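Roughly like this (a sketch only; the symbol names are hypothetical, and ACPI_NUMA does not currently exist):

```kconfig
# The firmware-specific options carry the prompts...
config DT_NUMA
	bool "NUMA support via Device-Tree (UNSUPPORTED)" if UNSUPPORTED
	depends on HAS_DEVICE_TREE
	select NUMA

config ACPI_NUMA
	bool "NUMA support via ACPI (UNSUPPORTED)" if UNSUPPORTED
	depends on ACPI
	select NUMA

# ...and ARM_NUMA is derived, with no prompt of its own.
config ARM_NUMA
	def_bool DT_NUMA || ACPI_NUMA
```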

Jan




* Re: [PATCH 36/37] xen/arm: Provide Kconfig options for Arm to enable NUMA
  2021-09-27  8:33         ` Jan Beulich
@ 2021-09-27  8:45           ` Julien Grall
  2021-09-27  9:17             ` Jan Beulich
  0 siblings, 1 reply; 192+ messages in thread
From: Julien Grall @ 2021-09-27  8:45 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Stefano Stabellini, Wei Chen, xen-devel, Bertrand Marquis


On Mon, 27 Sep 2021, 10:33 Jan Beulich, <jbeulich@suse.com> wrote:

> On 24.09.2021 21:39, Stefano Stabellini wrote:
> > On Fri, 24 Sep 2021, Wei Chen wrote:
> >> On 2021/9/24 11:31, Stefano Stabellini wrote:
> >>> On Thu, 23 Sep 2021, Wei Chen wrote:
> >>>> --- a/xen/arch/arm/Kconfig
> >>>> +++ b/xen/arch/arm/Kconfig
> >>>> @@ -34,6 +34,17 @@ config ACPI
> >>>>      Advanced Configuration and Power Interface (ACPI) support for
> Xen is
> >>>>      an alternative to device tree on ARM64.
> >>>>   + config DEVICE_TREE_NUMA
> >>>> +  def_bool n
> >>>> +  select NUMA
> >>>> +
> >>>> +config ARM_NUMA
> >>>> +  bool "Arm NUMA (Non-Uniform Memory Access) Support (UNSUPPORTED)"
> if
> >>>> UNSUPPORTED
> >>>> +  select DEVICE_TREE_NUMA if HAS_DEVICE_TREE
> >>>
> >>> Should it be: depends on HAS_DEVICE_TREE ?
> >>> (And eventually depends on HAS_DEVICE_TREE || ACPI)
> >>>
> >>
> >> As the discussion in RFC [1]. We want to make ARM_NUMA as a generic
> >> option can be selected by users. And depends on has_device_tree
> >> or ACPI to select DEVICE_TREE_NUMA or ACPI_NUMA.
> >>
> >> If we add HAS_DEVICE_TREE || ACPI as dependencies for ARM_NUMA,
> >> does it become a loop dependency?
> >>
> >>
> https://lists.xenproject.org/archives/html/xen-devel/2021-08/msg00888.html
> >
> > OK, I am fine with that. I was just trying to catch the case where a
> > user selects "ARM_NUMA" but actually neither ACPI nor HAS_DEVICE_TREE
> > are selected so nothing happens. I was trying to make it clear that
> > ARM_NUMA depends on having at least one between HAS_DEVICE_TREE or ACPI
> > because otherwise it is not going to work.
> >
> > That said, I don't think this is important because HAS_DEVICE_TREE
> > cannot be unselected. So if we cannot find a way to express the
> > dependency, I think it is fine to keep the patch as is.
>
> So how about doing things the other way around: ARM_NUMA has no prompt
> and defaults to ACPI_NUMA || DT_NUMA, and DT_NUMA gains a prompt instead
> (and, for Arm at least, ACPI_NUMA as well; this might even be worthwhile
> to have on x86 down the road).
>

As I wrote before, I don't think the user should say "I want to enable NUMA
with Device-Tree or ACPI". Instead, they should say whether they want to use
NUMA and let Xen decide whether to enable the DT/ACPI support.

In other words, the prompt should stay on ARM_NUMA.
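Concretely, something along these lines keeps the single user-facing prompt (a sketch only; the ACPI_NUMA wiring is hypothetical and not in the series):

```kconfig
# The only user-visible prompt: "do you want NUMA?"
config ARM_NUMA
	bool "Arm NUMA (Non-Uniform Memory Access) Support (UNSUPPORTED)" if UNSUPPORTED
	# Xen picks the firmware-specific backend automatically.
	select DEVICE_TREE_NUMA if HAS_DEVICE_TREE
	select ACPI_NUMA if ACPI

# Internal helpers with no prompt; never offered to the user directly.
config DEVICE_TREE_NUMA
	def_bool n
	select NUMA

config ACPI_NUMA
	def_bool n
	select NUMA
```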

Cheers,


> Jan
>
>



* Re: [PATCH 36/37] xen/arm: Provide Kconfig options for Arm to enable NUMA
  2021-09-27  8:45           ` Julien Grall
@ 2021-09-27  9:17             ` Jan Beulich
  2021-09-27 17:17               ` Stefano Stabellini
  0 siblings, 1 reply; 192+ messages in thread
From: Jan Beulich @ 2021-09-27  9:17 UTC (permalink / raw)
  To: Julien Grall; +Cc: Stefano Stabellini, Wei Chen, xen-devel, Bertrand Marquis

On 27.09.2021 10:45, Julien Grall wrote:
> On Mon, 27 Sep 2021, 10:33 Jan Beulich, <jbeulich@suse.com> wrote:
> 
>> On 24.09.2021 21:39, Stefano Stabellini wrote:
>>> On Fri, 24 Sep 2021, Wei Chen wrote:
>>>> On 2021/9/24 11:31, Stefano Stabellini wrote:
>>>>> On Thu, 23 Sep 2021, Wei Chen wrote:
>>>>>> --- a/xen/arch/arm/Kconfig
>>>>>> +++ b/xen/arch/arm/Kconfig
>>>>>> @@ -34,6 +34,17 @@ config ACPI
>>>>>>      Advanced Configuration and Power Interface (ACPI) support for
>> Xen is
>>>>>>      an alternative to device tree on ARM64.
>>>>>>   + config DEVICE_TREE_NUMA
>>>>>> +  def_bool n
>>>>>> +  select NUMA
>>>>>> +
>>>>>> +config ARM_NUMA
>>>>>> +  bool "Arm NUMA (Non-Uniform Memory Access) Support (UNSUPPORTED)"
>> if
>>>>>> UNSUPPORTED
>>>>>> +  select DEVICE_TREE_NUMA if HAS_DEVICE_TREE
>>>>>
>>>>> Should it be: depends on HAS_DEVICE_TREE ?
>>>>> (And eventually depends on HAS_DEVICE_TREE || ACPI)
>>>>>
>>>>
>>>> As the discussion in RFC [1]. We want to make ARM_NUMA as a generic
>>>> option can be selected by users. And depends on has_device_tree
>>>> or ACPI to select DEVICE_TREE_NUMA or ACPI_NUMA.
>>>>
>>>> If we add HAS_DEVICE_TREE || ACPI as dependencies for ARM_NUMA,
>>>> does it become a loop dependency?
>>>>
>>>>
>> https://lists.xenproject.org/archives/html/xen-devel/2021-08/msg00888.html
>>>
>>> OK, I am fine with that. I was just trying to catch the case where a
>>> user selects "ARM_NUMA" but actually neither ACPI nor HAS_DEVICE_TREE
>>> are selected so nothing happens. I was trying to make it clear that
>>> ARM_NUMA depends on having at least one between HAS_DEVICE_TREE or ACPI
>>> because otherwise it is not going to work.
>>>
>>> That said, I don't think this is important because HAS_DEVICE_TREE
>>> cannot be unselected. So if we cannot find a way to express the
>>> dependency, I think it is fine to keep the patch as is.
>>
>> So how about doing things the other way around: ARM_NUMA has no prompt
>> and defaults to ACPI_NUMA || DT_NUMA, and DT_NUMA gains a prompt instead
>> (and, for Arm at least, ACPI_NUMA as well; this might even be worthwhile
>> to have on x86 down the road).
>>
> 
> As I wrote before, I don't think the user should say "I want to enable NUMA
> with Device-Tree or ACPI". Instead, they say whether they want to use NUMA
> and let Xen decide to enable the DT/ACPI support.
> 
> In other word, the prompt should stay on ARM_NUMA.

Okay. In that case I'm confused by Stefano's question.

Jan




* RE: [PATCH 08/37] xen/x86: add detection of discontinous node memory range
  2021-09-27  5:05             ` Stefano Stabellini
@ 2021-09-27  9:50               ` Wei Chen
  2021-09-27 17:19                 ` Stefano Stabellini
  0 siblings, 1 reply; 192+ messages in thread
From: Wei Chen @ 2021-09-27  9:50 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: xen-devel, julien, Bertrand Marquis, jbeulich, andrew.cooper3,
	roger.pau, wl

Hi Stefano,

> -----Original Message-----
> From: Stefano Stabellini <sstabellini@kernel.org>
> Sent: 2021年9月27日 13:05
> To: Stefano Stabellini <sstabellini@kernel.org>
> Cc: Wei Chen <Wei.Chen@arm.com>; xen-devel@lists.xenproject.org;
> julien@xen.org; Bertrand Marquis <Bertrand.Marquis@arm.com>;
> jbeulich@suse.com; andrew.cooper3@citrix.com; roger.pau@citrix.com;
> wl@xen.org
> Subject: RE: [PATCH 08/37] xen/x86: add detection of discontinous node
> memory range
> 
> On Sun, 26 Sep 2021, Stefano Stabellini wrote:
> > On Sun, 26 Sep 2021, Wei Chen wrote:
> > > > -----Original Message-----
> > > > From: Stefano Stabellini <sstabellini@kernel.org>
> > > > Sent: 2021年9月25日 3:53
> > > > To: Wei Chen <Wei.Chen@arm.com>
> > > > Cc: Stefano Stabellini <sstabellini@kernel.org>; xen-
> > > > devel@lists.xenproject.org; julien@xen.org; Bertrand Marquis
> > > > <Bertrand.Marquis@arm.com>; jbeulich@suse.com;
> andrew.cooper3@citrix.com;
> > > > roger.pau@citrix.com; wl@xen.org
> > > > Subject: RE: [PATCH 08/37] xen/x86: add detection of discontinous
> node
> > > > memory range
> > > >
> > > > On Fri, 24 Sep 2021, Wei Chen wrote:
> > > > > > -----Original Message-----
> > > > > > From: Stefano Stabellini <sstabellini@kernel.org>
> > > > > > Sent: 2021年9月24日 8:26
> > > > > > To: Wei Chen <Wei.Chen@arm.com>
> > > > > > Cc: xen-devel@lists.xenproject.org; sstabellini@kernel.org;
> > > > julien@xen.org;
> > > > > > Bertrand Marquis <Bertrand.Marquis@arm.com>; jbeulich@suse.com;
> > > > > > andrew.cooper3@citrix.com; roger.pau@citrix.com; wl@xen.org
> > > > > > Subject: Re: [PATCH 08/37] xen/x86: add detection of
> discontinous node
> > > > > > memory range
> > > > > >
> > > > > > CC'ing x86 maintainers
> > > > > >
> > > > > > On Thu, 23 Sep 2021, Wei Chen wrote:
> > > > > > > One NUMA node may contain several memory blocks. In current
> Xen
> > > > > > > code, Xen will maintain a node memory range for each node to
> cover
> > > > > > > all its memory blocks. But here comes the problem, in the gap
> of
> > > > > > > one node's two memory blocks, if there are some memory blocks
> don't
> > > > > > > belong to this node (remote memory blocks). This node's memory
> range
> > > > > > > will be expanded to cover these remote memory blocks.
> > > > > > >
> > > > > > > One node's memory range contains othe nodes' memory, this is
> > > > obviously
> > > > > > > not very reasonable. This means current NUMA code only can
> support
> > > > > > > node has continous memory blocks. However, on a physical
> machine,
> > > > the
> > > > > > > addresses of multiple nodes can be interleaved.
> > > > > > >
> > > > > > > So in this patch, we add code to detect discontinous memory
> blocks
> > > > > > > for one node. NUMA initializtion will be failed and error
> messages
> > > > > > > will be printed when Xen detect such hardware configuration.
> > > > > >
> > > > > > At least on ARM, it is not just memory that can be interleaved,
> but
> > > > also
> > > > > > MMIO regions. For instance:
> > > > > >
> > > > > > node0 bank0 0-0x1000000
> > > > > > MMIO 0x1000000-0x1002000
> > > > > > Hole 0x1002000-0x2000000
> > > > > > node0 bank1 0x2000000-0x3000000
> > > > > >
> > > > > > So I am not familiar with the SRAT format, but I think on ARM
> the
> > > > check
> > > > > > would look different: we would just look for multiple memory
> ranges
> > > > > > under a device_type = "memory" node of a NUMA node in device
> tree.
> > > > > >
> > > > > >
> > > > >
> > > > > Should I need to include/refine above message to commit log?
> > > >
> > > > Let me ask you a question first.
> > > >
> > > > With the NUMA implementation of this patch series, can we deal with
> > > > cases where each node has multiple memory banks, not interleaved?
> > >
> > > Yes.
> > >
> > > > An an example:
> > > >
> > > > node0: 0x0        - 0x10000000
> > > > MMIO : 0x10000000 - 0x20000000
> > > > node0: 0x20000000 - 0x30000000
> > > > MMIO : 0x30000000 - 0x50000000
> > > > node1: 0x50000000 - 0x60000000
> > > > MMIO : 0x60000000 - 0x80000000
> > > > node2: 0x80000000 - 0x90000000
> > > >
> > > >
> > > > I assume we can deal with this case simply by setting node0 memory
> to
> > > > 0x0-0x30000000 even if there is actually something else, a device,
> that
> > > > doesn't belong to node0 in between the two node0 banks?
> > >
> > > While this configuration is rare in SoC design, but it is not
> impossible.
> >
> > Definitely, I have seen it before.
> >
> >
> > > > Is it only other nodes' memory interleaved that cause issues? In
> other
> > > > words, only the following is a problematic scenario?
> > > >
> > > > node0: 0x0        - 0x10000000
> > > > MMIO : 0x10000000 - 0x20000000
> > > > node1: 0x20000000 - 0x30000000
> > > > MMIO : 0x30000000 - 0x50000000
> > > > node0: 0x50000000 - 0x60000000
> > > >
> > > > Because node1 is in between the two ranges of node0?
> > > >
> > >
> > > But only device_type="memory" can be added to allocation.
> > > For mmio there are two cases:
> > > 1. mmio doesn't have NUMA id property.
> > > 2. mmio has NUMA id property, just like some PCIe controllers.
> > >    But we don’t need to handle these kinds of MMIO devices
> > >    in memory block parsing. Because we don't need to allocate
> > >    memory from these mmio ranges. And for accessing, we need
> > >    a NUMA-aware PCIe controller driver or a generic NUMA-aware
> > >    MMIO accessing APIs.
> >
> > Yes, I am not too worried about devices with a NUMA id property because
> > they are less common and this series doesn't handle them at all, right?
> > I imagine they would be treated like any other device without NUMA
> > awareness.
> >
> > I am thinking about the case where the memory of each NUMA node is made
> > of multiple banks. I understand that this patch adds an explicit check
> > for cases where these banks are interleaving, however there are many
> > other cases where NUMA memory nodes are *not* interleaving but they are
> > still made of multiple discontinuous banks, like in the two example
> > above.
> >
> > My question is whether this patch series in its current form can handle
> > the two cases above correctly. If so, I am wondering how it works given
> > that we only have a single "start" and "size" parameter per node.
> >
> > On the other hand if this series cannot handle the two cases above, my
> > question is whether it would fail explicitly or not. The new
> > check is_node_memory_continuous doesn't seem to be able to catch them.
> 
> 
> Looking at numa_update_node_memblks, it is clear that the code is meant
> to increase the range of each numa node to cover even MMIO regions in
> between memory banks. Also see the comment at the top of the file:
> 
>  * Assumes all memory regions belonging to a single proximity domain
>  * are in one chunk. Holes between them will be included in the node.
> 
> So if there are multiple banks for each node, start and end are
> stretched to cover the holes between them, and it works as long as
> memory banks of different NUMA nodes don't interleave.
> 
> I would appreciate if you could add an in-code comment to explain this
> on top of numa_update_node_memblk.

Yes, I will do it.

> 
> Have you had a chance to test this? If not it would be fantastic if you
> could give it a quick test to make sure it works as intended: for
> instance by creating multiple memory banks for each NUMA node by
> splitting an real bank into two smaller banks with a hole in between in
> device tree, just for the sake of testing.

Yes, I have created some fake NUMA nodes in the FVP device tree to test it.
The interleaving of nodes' address ranges can be detected:

(XEN) SRAT: Node 0 0000000080000000-00000000ff000000
(XEN) SRAT: Node 1 0000000880000000-00000008c0000000
(XEN) NODE 0: (0000000080000000-00000008d0000000) intertwine with NODE 1 (0000000880000000-00000008c0000000)


^ permalink raw reply	[flat|nested] 192+ messages in thread

* RE: [PATCH 22/37] xen/arm: use NR_MEM_BANKS to override default NR_NODE_MEMBLKS
  2021-09-27  7:35                 ` Julien Grall
@ 2021-09-27 10:21                   ` Wei Chen
  2021-09-27 10:39                     ` Julien Grall
  0 siblings, 1 reply; 192+ messages in thread
From: Wei Chen @ 2021-09-27 10:21 UTC (permalink / raw)
  To: Julien Grall
  Cc: Stefano Stabellini, xen-devel, Bertrand Marquis, Jan Beulich,
	Roger Pau Monné,
	Andrew Cooper

Hi Julien,

From: Julien Grall <julien.grall.oss@gmail.com> 
Sent: 2021年9月27日 15:36
To: Wei Chen <Wei.Chen@arm.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>; xen-devel <xen-devel@lists.xenproject.org>; Bertrand Marquis <Bertrand.Marquis@arm.com>; Jan Beulich <jbeulich@suse.com>; Roger Pau Monné <roger.pau@citrix.com>; Andrew Cooper <andrew.cooper3@citrix.com>
Subject: Re: [PATCH 22/37] xen/arm: use NR_MEM_BANKS to override default NR_NODE_MEMBLKS


On Mon, 27 Sep 2021, 08:53 Wei Chen, <mailto:Wei.Chen@arm.com> wrote:
Hi Julien,

> -----Original Message-----
> From: Xen-devel <mailto:xen-devel-bounces@lists.xenproject.org> On Behalf Of Wei
> Chen
> Sent: 2021年9月27日 14:46
> To: Stefano Stabellini <mailto:sstabellini@kernel.org>
> Cc: mailto:xen-devel@lists.xenproject.org; mailto:julien@xen.org; Bertrand Marquis
> <mailto:Bertrand.Marquis@arm.com>; mailto:jbeulich@suse.com; mailto:roger.pau@citrix.com;
> mailto:andrew.cooper3@citrix.com
> Subject: RE: [PATCH 22/37] xen/arm: use NR_MEM_BANKS to override default
> NR_NODE_MEMBLKS
> 
> Hi Stefano, Julien,
> 
> > -----Original Message-----
> > From: Stefano Stabellini <mailto:sstabellini@kernel.org>
> > Sent: 2021年9月27日 13:00
> > To: Wei Chen <mailto:Wei.Chen@arm.com>
> > Cc: Stefano Stabellini <mailto:sstabellini@kernel.org>; xen-
> > mailto:devel@lists.xenproject.org; mailto:julien@xen.org; Bertrand Marquis
> > <mailto:Bertrand.Marquis@arm.com>; mailto:jbeulich@suse.com; mailto:roger.pau@citrix.com;
> > mailto:andrew.cooper3@citrix.com
> > Subject: RE: [PATCH 22/37] xen/arm: use NR_MEM_BANKS to override default
> > NR_NODE_MEMBLKS
> >
> > +x86 maintainers
> >
> > On Mon, 27 Sep 2021, Wei Chen wrote:
> > > > -----Original Message-----
> > > > From: Stefano Stabellini <mailto:sstabellini@kernel.org>
> > > > Sent: 2021年9月27日 11:26
> > > > To: Wei Chen <mailto:Wei.Chen@arm.com>
> > > > Cc: Stefano Stabellini <mailto:sstabellini@kernel.org>; xen-
> > > > mailto:devel@lists.xenproject.org; mailto:julien@xen.org; Bertrand Marquis
> > > > <mailto:Bertrand.Marquis@arm.com>
> > > > Subject: RE: [PATCH 22/37] xen/arm: use NR_MEM_BANKS to override
> > default
> > > > NR_NODE_MEMBLKS
> > > >
> > > > On Sun, 26 Sep 2021, Wei Chen wrote:
> > > > > > -----Original Message-----
> > > > > > From: Stefano Stabellini <mailto:sstabellini@kernel.org>
> > > > > > Sent: 2021年9月24日 9:35
> > > > > > To: Wei Chen <mailto:Wei.Chen@arm.com>
> > > > > > Cc: mailto:xen-devel@lists.xenproject.org; mailto:sstabellini@kernel.org;
> > > > mailto:julien@xen.org;
> > > > > > Bertrand Marquis <mailto:Bertrand.Marquis@arm.com>
> > > > > > Subject: Re: [PATCH 22/37] xen/arm: use NR_MEM_BANKS to override
> > > > default
> > > > > > NR_NODE_MEMBLKS
> > > > > >
> > > > > > On Thu, 23 Sep 2021, Wei Chen wrote:
> > > > > > > As a memory range described in device tree cannot be split
> > across
> > > > > > > multiple nodes. So we define NR_NODE_MEMBLKS as NR_MEM_BANKS
> in
> > > > > > > arch header.
> > > > > >
> > > > > > This statement is true but what is the goal of this patch? Is it
> > to
> > > > > > reduce code size and memory consumption?
> > > > > >
> > > > >
> > > > > No, when Julien and I discussed this in last version[1], we hadn't
> > > > thought
> > > > > so deeply. We just thought a memory range described in DT cannot
> be
> > > > split
> > > > > across multiple nodes. So NR_NODE_MEMBLKS should be equal to
> > NR_MEM_BANKS.
> > > > >
> > > > > https://lists.xenproject.org/archives/html/xen-devel/2021-
> > > > 08/msg00974.html
> > > > >
> > > > > > I am asking because NR_MEM_BANKS is 128 and
> > > > > > NR_NODE_MEMBLKS=2*MAX_NUMNODES which is 64 by default so again
> > > > > > NR_NODE_MEMBLKS is 128 before this patch.
> > > > > >
> > > > > > In other words, this patch alone doesn't make any difference; at
> > least
> > > > > > doesn't make any difference unless CONFIG_NR_NUMA_NODES is
> > increased.
> > > > > >
> > > > > > So, is the goal to reduce memory usage when CONFIG_NR_NUMA_NODES
> > is
> > > > > > higher than 64?
> > > > > >
> > > > >
> > > > > I also thought about this problem when I was writing this patch.
> > > > > CONFIG_NR_NUMA_NODES is increasing, but NR_MEM_BANKS is a fixed
> > > > > value, then NR_MEM_BANKS can be smaller than CONFIG_NR_NUMA_NODES
> > > > > at one point.
> > > > >
> > > > > But I agree with Julien's suggestion, NR_MEM_BANKS and
> > NR_NODE_MEMBLKS
> > > > > must be aware of each other. I had thought to add some ASSERT
> check,
> > > > > but I don't know how to do it better. So I post this patch for
> more
> > > > > suggestion.
> > > >
> > > > OK. In that case I'd say to get rid of the previous definition of
> > > > NR_NODE_MEMBLKS as it is probably not necessary, see below.
> > > >
> > > >
> > > >
> > > > > >
> > > > > > > And keep default NR_NODE_MEMBLKS in common header
> > > > > > > for those architectures NUMA is disabled.
> > > > > >
> > > > > > This last sentence is not accurate: on x86 NUMA is enabled and
> > > > > > NR_NODE_MEMBLKS is still defined in xen/include/xen/numa.h
> (there
> > is
> > > > no
> > > > > > x86 definition of it)
> > > > > >
> > > > >
> > > > > Yes.
> > > > >
> > > > > >
> > > > > > > Signed-off-by: Wei Chen <mailto:wei.chen@arm.com>
> > > > > > > ---
> > > > > > >  xen/include/asm-arm/numa.h | 8 +++++++-
> > > > > > >  xen/include/xen/numa.h     | 2 ++
> > > > > > >  2 files changed, 9 insertions(+), 1 deletion(-)
> > > > > > >
> > > > > > > diff --git a/xen/include/asm-arm/numa.h b/xen/include/asm-
> > arm/numa.h
> > > > > > > index 8f1c67e3eb..21569e634b 100644
> > > > > > > --- a/xen/include/asm-arm/numa.h
> > > > > > > +++ b/xen/include/asm-arm/numa.h
> > > > > > > @@ -3,9 +3,15 @@
> > > > > > >
> > > > > > >  #include <xen/mm.h>
> > > > > > >
> > > > > > > +#include <asm/setup.h>
> > > > > > > +
> > > > > > >  typedef u8 nodeid_t;
> > > > > > >
> > > > > > > -#ifndef CONFIG_NUMA
> > > > > > > +#ifdef CONFIG_NUMA
> > > > > > > +
> > > > > > > +#define NR_NODE_MEMBLKS NR_MEM_BANKS
> > > > > > > +
> > > > > > > +#else
> > > > > > >
> > > > > > >  /* Fake one node for now. See also node_online_map. */
> > > > > > >  #define cpu_to_node(cpu) 0
> > > > > > > diff --git a/xen/include/xen/numa.h b/xen/include/xen/numa.h
> > > > > > > index 1978e2be1b..1731e1cc6b 100644
> > > > > > > --- a/xen/include/xen/numa.h
> > > > > > > +++ b/xen/include/xen/numa.h
> > > > > > > @@ -12,7 +12,9 @@
> > > > > > >  #define MAX_NUMNODES    1
> > > > > > >  #endif
> > > > > > >
> > > > > > > +#ifndef NR_NODE_MEMBLKS
> > > > > > >  #define NR_NODE_MEMBLKS (MAX_NUMNODES*2)
> > > > > > > +#endif
> > > >
> > > > This one we can remove it completely right?
> > >
> > > How about define NR_MEM_BANKS to:
> > > #ifdef CONFIG_NR_NUMA_NODES
> > > #define NR_MEM_BANKS (CONFIG_NR_NUMA_NODES * 2)
> > > #else
> > > #define NR_MEM_BANKS 128
> > > #endif
> > > for both x86 and Arm. For those architectures do not support or enable
> > > NUMA, they can still use "NR_MEM_BANKS 128". And replace all
> > NR_NODE_MEMBLKS
> > > in NUMA code to NR_MEM_BANKS to remove NR_NODE_MEMBLKS completely.
> > > In this case, NR_MEM_BANKS can be aware of the changes of
> > CONFIG_NR_NUMA_NODES.
> >
> > x86 doesn't have NR_MEM_BANKS as far as I can tell. I guess you also
> > meant to rename NR_NODE_MEMBLKS to NR_MEM_BANKS?
> >
> 
> Yes.
> 
> > But NR_MEM_BANKS is not directly related to CONFIG_NR_NUMA_NODES because
> > there can be many memory banks for each numa node, certainly more than
> > 2. The existing definition on x86:
> >
> > #define NR_NODE_MEMBLKS (MAX_NUMNODES*2)
> >
> > Doesn't make a lot of sense to me. Was it just an arbitrary limit for
> > the lack of a better way to set a maximum?
> >
> 
> At that time, this was probably the most cost-effective approach.
> Enough and easy. But, if more nodes need to be supported in the
> future, it may bring more memory blocks. And this maximum value
> might not apply. The maximum may need to support dynamic extension.
> 
> >
> > On the other hand, NR_MEM_BANKS and NR_NODE_MEMBLKS seem to be related.
> > In fact, what's the difference?
> >
> > NR_MEM_BANKS is the max number of memory banks (with or without
> > numa-node-id).
> >
> > NR_NODE_MEMBLKS is the max number of memory banks with NUMA support
> > (with numa-node-id)?
> >
> > They are basically the same thing. On ARM I would just do:
> >
> 
> Probably not, NR_MEM_BANKS will count those memory ranges without
> numa-node-id in boot memory parsing stage (process_memory_node or
> EFI parser). But NR_NODE_MEMBLKS will only count those memory ranges
> with numa-node-id.
> 
> > #define NR_NODE_MEMBLKS MAX(NR_MEM_BANKS, (CONFIG_NR_NUMA_NODES * 2))
> >
> >

> Quote Julien's comment from HTML email to here:
> " As you wrote above, the second part of the MAX is totally arbitrary.
> In fact, it is very likely than if you have more than 64 nodes, you may
> need a lot more than 2 regions per node.
> 
> So, for Arm, I would just define NR_NODE_MEMBLKS as an alias to NR_MEM_BANKS
> so it can be used by common code.
> "
> 
> > But here comes the problem:
> > How can we set the NR_MEM_BANKS maximum value, 128 seems an arbitrary too?
> 
> This is based on hardware we currently support (the last time we bumped the value was, IIRC, for Thunder-X). In the case of booting UEFI, we can get a lot of small ranges as we discover the RAM using the UEFI memory map.
> 

Thanks for the background.

> 
> > If #define NR_MEM_BANKS (CONFIG_NR_NUMA_NODES * N)? And what N should be.
> 
> N would have to be the maximum number of ranges you can find in a NUMA node.
> 
> We would also need to make sure this doesn't break existing platforms. So N would have to be quite large or we need a MAX as Stefano suggested.
> 
> But I would prefer to keep the existing 128 and allow to configure it at build time (not necessarily in this series). This avoid to have different way to define the value based NUMA vs non-NUMA.

In this case, can we use Stefano's
"#define NR_NODE_MEMBLKS MAX(NR_MEM_BANKS, (CONFIG_NR_NUMA_NODES * 2))"
in the next version? If yes, should we also change the x86 side? Because
NR_MEM_BANKS has not been defined on x86.

> > And maybe the definition could be common with x86 if we define
> > NR_MEM_BANKS to 128 on x86 too.
> 
> Julien had comment here, I will continue in that email.


* RE: [PATCH 20/37] xen: introduce CONFIG_EFI to stub API for non-EFI architecture
  2021-09-26 10:25             ` Wei Chen
@ 2021-09-27 10:28               ` Wei Chen
  2021-09-28  0:59                 ` Stefano Stabellini
  0 siblings, 1 reply; 192+ messages in thread
From: Wei Chen @ 2021-09-27 10:28 UTC (permalink / raw)
  To: Wei Chen, Jan Beulich
  Cc: xen-devel, julien, Bertrand Marquis, Stefano Stabellini

Hi Julien, Stefano,

> -----Original Message-----
> From: Xen-devel <xen-devel-bounces@lists.xenproject.org> On Behalf Of Wei
> Chen
> Sent: 2021年9月26日 18:25
> To: Jan Beulich <jbeulich@suse.com>
> Cc: xen-devel@lists.xenproject.org; julien@xen.org; Bertrand Marquis
> <Bertrand.Marquis@arm.com>; Stefano Stabellini <sstabellini@kernel.org>
> Subject: RE: [PATCH 20/37] xen: introduce CONFIG_EFI to stub API for non-
> EFI architecture
> 
> Hi Jan,
> 
> > -----Original Message-----
> > From: Xen-devel <xen-devel-bounces@lists.xenproject.org> On Behalf Of
> Jan
> > Beulich
> > Sent: 2021年9月24日 18:49
> > To: Wei Chen <Wei.Chen@arm.com>
> > Cc: xen-devel@lists.xenproject.org; julien@xen.org; Bertrand Marquis
> > <Bertrand.Marquis@arm.com>; Stefano Stabellini <sstabellini@kernel.org>
> > Subject: Re: [PATCH 20/37] xen: introduce CONFIG_EFI to stub API for
> non-
> > EFI architecture
> >
> > On 24.09.2021 12:31, Wei Chen wrote:
> > >> From: Jan Beulich <jbeulich@suse.com>
> > >> Sent: 2021年9月24日 15:59
> > >>
> > >> On 24.09.2021 06:34, Wei Chen wrote:
> > >>>> From: Stefano Stabellini <sstabellini@kernel.org>
> > >>>> Sent: 2021年9月24日 9:15
> > >>>>
> > >>>> On Thu, 23 Sep 2021, Wei Chen wrote:
> > >>>>> --- a/xen/common/Kconfig
> > >>>>> +++ b/xen/common/Kconfig
> > >>>>> @@ -11,6 +11,16 @@ config COMPAT
> > >>>>>  config CORE_PARKING
> > >>>>>  	bool
> > >>>>>
> > >>>>> +config EFI
> > >>>>> +	bool
> > >>>>
> > >>>> Without the title the option is not user-selectable (or de-
> > selectable).
> > >>>> So the help message below can never be seen.
> > >>>>
> > >>>> Either add a title, e.g.:
> > >>>>
> > >>>> bool "EFI support"
> > >>>>
> > >>>> Or fully make the option a silent option by removing the help text.
> > >>>
> > >>> OK, in current Xen code, EFI is unconditionally compiled. Before
> > >>> we change related code, I prefer to remove the help text.
> > >>
> > >> But that's not true: At least on x86 EFI gets compiled depending on
> > >> tool chain capabilities. Ultimately we may indeed want a user
> > >> selectable option here, but until then I'm afraid having this option
> > >> at all may be misleading on x86.
> > >>
> > >
> > > I check the build scripts, yes, you're right. For x86, EFI is not a
> > > selectable option in Kconfig. I agree with you, we can't use Kconfig
> > > system to decide to enable EFI build for x86 or not.
> > >
> > > So how about we just use this EFI option for Arm only? Because on Arm,
> > > we do not have such toolchain dependency.
> >
> > To be honest - don't know. That's because I don't know what you want
> > to use the option for subsequently.
> >
> 
> In last version, I had introduced an arch-helper to stub EFI_BOOT
> in Arm's common code for Arm32. Because Arm32 doesn't support EFI.
> So Julien suggested me to introduce a CONFIG_EFI option for non-EFI
> supported architectures to stub in EFI layer.
> 
> [1] https://lists.xenproject.org/archives/html/xen-devel/2021-
> 08/msg00808.html
> 

As Jan reminded, x86 doesn't depend on Kconfig to build its EFI code.
So, if we use CONFIG_EFI to stub the EFI APIs for x86, we can end up
with the toolchain enabling EFI while Kconfig disables it, or with
Kconfig enabling EFI while the toolchain doesn't provide EFI build
support. Either way, x86 would not work correctly.

If we use CONFIG_EFI for Arm only, that means CONFIG_EFI for x86
is off, which will also cause problems.

So, can we still use the previous arch helpers to stub EFI for Arm32,
until x86 can use this selectable option?

> > Jan
> >



* Re: [PATCH 22/37] xen/arm: use NR_MEM_BANKS to override default NR_NODE_MEMBLKS
  2021-09-27 10:21                   ` Wei Chen
@ 2021-09-27 10:39                     ` Julien Grall
  2021-09-27 16:58                       ` Stefano Stabellini
  0 siblings, 1 reply; 192+ messages in thread
From: Julien Grall @ 2021-09-27 10:39 UTC (permalink / raw)
  To: Wei Chen
  Cc: Julien Grall, Stefano Stabellini, xen-devel, Bertrand Marquis,
	Jan Beulich, Roger Pau Monné,
	Andrew Cooper


On Mon, 27 Sep 2021, 12:22 Wei Chen, <Wei.Chen@arm.com> wrote:

> Hi Julien,
>
> From: Julien Grall <julien.grall.oss@gmail.com>
> Sent: 2021年9月27日 15:36
> To: Wei Chen <Wei.Chen@arm.com>
> Cc: Stefano Stabellini <sstabellini@kernel.org>; xen-devel <
> xen-devel@lists.xenproject.org>; Bertrand Marquis <
> Bertrand.Marquis@arm.com>; Jan Beulich <jbeulich@suse.com>; Roger Pau
> Monné <roger.pau@citrix.com>; Andrew Cooper <andrew.cooper3@citrix.com>
> Subject: Re: [PATCH 22/37] xen/arm: use NR_MEM_BANKS to override default
> NR_NODE_MEMBLKS
>
>
> On Mon, 27 Sep 2021, 08:53 Wei Chen, <mailto:Wei.Chen@arm.com> wrote:
> Hi Julien,
>
> > -----Original Message-----
> > From: Xen-devel <mailto:xen-devel-bounces@lists.xenproject.org> On
> Behalf Of Wei
> > Chen
> > Sent: 2021年9月27日 14:46
> > To: Stefano Stabellini <mailto:sstabellini@kernel.org>
> > Cc: mailto:xen-devel@lists.xenproject.org; mailto:julien@xen.org;
> Bertrand Marquis
> > <mailto:Bertrand.Marquis@arm.com>; mailto:jbeulich@suse.com; mailto:
> roger.pau@citrix.com;
> > mailto:andrew.cooper3@citrix.com
> > Subject: RE: [PATCH 22/37] xen/arm: use NR_MEM_BANKS to override default
> > NR_NODE_MEMBLKS
> >
> > Hi Stefano, Julien,
> >
> > > -----Original Message-----
> > > From: Stefano Stabellini <mailto:sstabellini@kernel.org>
> > > Sent: 2021年9月27日 13:00
> > > To: Wei Chen <mailto:Wei.Chen@arm.com>
> > > Cc: Stefano Stabellini <mailto:sstabellini@kernel.org>; xen-
> > > mailto:devel@lists.xenproject.org; mailto:julien@xen.org; Bertrand
> Marquis
> > > <mailto:Bertrand.Marquis@arm.com>; mailto:jbeulich@suse.com; mailto:
> roger.pau@citrix.com;
> > > mailto:andrew.cooper3@citrix.com
> > > Subject: RE: [PATCH 22/37] xen/arm: use NR_MEM_BANKS to override
> default
> > > NR_NODE_MEMBLKS
> > >
> > > +x86 maintainers
> > >
> > > On Mon, 27 Sep 2021, Wei Chen wrote:
> > > > > -----Original Message-----
> > > > > From: Stefano Stabellini <mailto:sstabellini@kernel.org>
> > > > > Sent: 2021年9月27日 11:26
> > > > > To: Wei Chen <mailto:Wei.Chen@arm.com>
> > > > > Cc: Stefano Stabellini <mailto:sstabellini@kernel.org>; xen-
> > > > > mailto:devel@lists.xenproject.org; mailto:julien@xen.org;
> Bertrand Marquis
> > > > > <mailto:Bertrand.Marquis@arm.com>
> > > > > Subject: RE: [PATCH 22/37] xen/arm: use NR_MEM_BANKS to override
> > > default
> > > > > NR_NODE_MEMBLKS
> > > > >
> > > > > On Sun, 26 Sep 2021, Wei Chen wrote:
> > > > > > > -----Original Message-----
> > > > > > > From: Stefano Stabellini <mailto:sstabellini@kernel.org>
> > > > > > > Sent: 2021年9月24日 9:35
> > > > > > > To: Wei Chen <mailto:Wei.Chen@arm.com>
> > > > > > > Cc: mailto:xen-devel@lists.xenproject.org; mailto:
> sstabellini@kernel.org;
> > > > > mailto:julien@xen.org;
> > > > > > > Bertrand Marquis <mailto:Bertrand.Marquis@arm.com>
> > > > > > > Subject: Re: [PATCH 22/37] xen/arm: use NR_MEM_BANKS to
> override
> > > > > default
> > > > > > > NR_NODE_MEMBLKS
> > > > > > >
> > > > > > > On Thu, 23 Sep 2021, Wei Chen wrote:
> > > > > > > > As a memory range described in device tree cannot be split
> > > across
> > > > > > > > multiple nodes. So we define NR_NODE_MEMBLKS as NR_MEM_BANKS
> > in
> > > > > > > > arch header.
> > > > > > >
> > > > > > > This statement is true but what is the goal of this patch? Is
> it
> > > to
> > > > > > > reduce code size and memory consumption?
> > > > > > >
> > > > > >
> > > > > > No, when Julien and I discussed this in last version[1], we
> hadn't
> > > > > thought
> > > > > > so deeply. We just thought a memory range described in DT cannot
> > be
> > > > > split
> > > > > > across multiple nodes. So NR_NODE_MEMBLKS should be equal to
> > > NR_MEM_BANKS.
> > > > > >
> > > > > > https://lists.xenproject.org/archives/html/xen-devel/2021-
> > > > > 08/msg00974.html
> > > > > >
> > > > > > > I am asking because NR_MEM_BANKS is 128 and
> > > > > > > NR_NODE_MEMBLKS=2*MAX_NUMNODES which is 64 by default so again
> > > > > > > NR_NODE_MEMBLKS is 128 before this patch.
> > > > > > >
> > > > > > > In other words, this patch alone doesn't make any difference;
> at
> > > least
> > > > > > > doesn't make any difference unless CONFIG_NR_NUMA_NODES is
> > > increased.
> > > > > > >
> > > > > > > So, is the goal to reduce memory usage when
> CONFIG_NR_NUMA_NODES
> > > is
> > > > > > > higher than 64?
> > > > > > >
> > > > > >
> > > > > > I also thought about this problem when I was writing this patch.
> > > > > > CONFIG_NR_NUMA_NODES is increasing, but NR_MEM_BANKS is a fixed
> > > > > > value, then NR_MEM_BANKS can be smaller than CONFIG_NR_NUMA_NODES
> > > > > > at one point.
> > > > > >
> > > > > > But I agree with Julien's suggestion, NR_MEM_BANKS and
> > > NR_NODE_MEMBLKS
> > > > > > must be aware of each other. I had thought to add some ASSERT
> > check,
> > > > > > but I don't know how to do it better. So I post this patch for
> > more
> > > > > > suggestion.
> > > > >
> > > > > OK. In that case I'd say to get rid of the previous definition of
> > > > > NR_NODE_MEMBLKS as it is probably not necessary, see below.
> > > > >
> > > > >
> > > > >
> > > > > > >
> > > > > > > > And keep default NR_NODE_MEMBLKS in common header
> > > > > > > > for those architectures NUMA is disabled.
> > > > > > >
> > > > > > > This last sentence is not accurate: on x86 NUMA is enabled and
> > > > > > > NR_NODE_MEMBLKS is still defined in xen/include/xen/numa.h
> > (there
> > > is
> > > > > no
> > > > > > > x86 definition of it)
> > > > > > >
> > > > > >
> > > > > > Yes.
> > > > > >
> > > > > > >
> > > > > > > > Signed-off-by: Wei Chen <mailto:wei.chen@arm.com>
> > > > > > > > ---
> > > > > > > >  xen/include/asm-arm/numa.h | 8 +++++++-
> > > > > > > >  xen/include/xen/numa.h     | 2 ++
> > > > > > > >  2 files changed, 9 insertions(+), 1 deletion(-)
> > > > > > > >
> > > > > > > > diff --git a/xen/include/asm-arm/numa.h b/xen/include/asm-
> > > arm/numa.h
> > > > > > > > index 8f1c67e3eb..21569e634b 100644
> > > > > > > > --- a/xen/include/asm-arm/numa.h
> > > > > > > > +++ b/xen/include/asm-arm/numa.h
> > > > > > > > @@ -3,9 +3,15 @@
> > > > > > > >
> > > > > > > >  #include <xen/mm.h>
> > > > > > > >
> > > > > > > > +#include <asm/setup.h>
> > > > > > > > +
> > > > > > > >  typedef u8 nodeid_t;
> > > > > > > >
> > > > > > > > -#ifndef CONFIG_NUMA
> > > > > > > > +#ifdef CONFIG_NUMA
> > > > > > > > +
> > > > > > > > +#define NR_NODE_MEMBLKS NR_MEM_BANKS
> > > > > > > > +
> > > > > > > > +#else
> > > > > > > >
> > > > > > > >  /* Fake one node for now. See also node_online_map. */
> > > > > > > >  #define cpu_to_node(cpu) 0
> > > > > > > > diff --git a/xen/include/xen/numa.h b/xen/include/xen/numa.h
> > > > > > > > index 1978e2be1b..1731e1cc6b 100644
> > > > > > > > --- a/xen/include/xen/numa.h
> > > > > > > > +++ b/xen/include/xen/numa.h
> > > > > > > > @@ -12,7 +12,9 @@
> > > > > > > >  #define MAX_NUMNODES    1
> > > > > > > >  #endif
> > > > > > > >
> > > > > > > > +#ifndef NR_NODE_MEMBLKS
> > > > > > > >  #define NR_NODE_MEMBLKS (MAX_NUMNODES*2)
> > > > > > > > +#endif
> > > > >
> > > > > This one we can remove it completely right?
> > > >
> > > > How about define NR_MEM_BANKS to:
> > > > #ifdef CONFIG_NR_NUMA_NODES
> > > > #define NR_MEM_BANKS (CONFIG_NR_NUMA_NODES * 2)
> > > > #else
> > > > #define NR_MEM_BANKS 128
> > > > #endif
> > > > for both x86 and Arm. For those architectures do not support or
> enable
> > > > NUMA, they can still use "NR_MEM_BANKS 128". And replace all
> > > NR_NODE_MEMBLKS
> > > > in NUMA code to NR_MEM_BANKS to remove NR_NODE_MEMBLKS completely.
> > > > In this case, NR_MEM_BANKS can be aware of the changes of
> > > CONFIG_NR_NUMA_NODES.
> > >
> > > x86 doesn't have NR_MEM_BANKS as far as I can tell. I guess you also
> > > meant to rename NR_NODE_MEMBLKS to NR_MEM_BANKS?
> > >
> >
> > Yes.
> >
> > > But NR_MEM_BANKS is not directly related to CONFIG_NR_NUMA_NODES
> because
> > > there can be many memory banks for each numa node, certainly more than
> > > 2. The existing definition on x86:
> > >
> > > #define NR_NODE_MEMBLKS (MAX_NUMNODES*2)
> > >
> > > Doesn't make a lot of sense to me. Was it just an arbitrary limit for
> > > the lack of a better way to set a maximum?
> > >
> >
> > At that time, this was probably the most cost-effective approach.
> > Enough and easy. But, if more nodes need to be supported in the
> > future, it may bring more memory blocks. And this maximum value
> > might not apply. The maximum may need to support dynamic extension.
> >
> > >
> > > On the other hand, NR_MEM_BANKS and NR_NODE_MEMBLKS seem to be related.
> > > In fact, what's the difference?
> > >
> > > NR_MEM_BANKS is the max number of memory banks (with or without
> > > numa-node-id).
> > >
> > > NR_NODE_MEMBLKS is the max number of memory banks with NUMA support
> > > (with numa-node-id)?
> > >
> > > They are basically the same thing. On ARM I would just do:
> > >
> >
> > Probably not, NR_MEM_BANKS will count those memory ranges without
> > numa-node-id in boot memory parsing stage (process_memory_node or
> > EFI parser). But NR_NODE_MEMBLKS will only count those memory ranges
> > with numa-node-id.
> >
> > > #define NR_NODE_MEMBLKS MAX(NR_MEM_BANKS, (CONFIG_NR_NUMA_NODES * 2))
> > >
> > >
>
> > Quote Julien's comment from HTML email to here:
> > " As you wrote above, the second part of the MAX is totally arbitrary.
> > In fact, it is very likely than if you have more than 64 nodes, you may
> > need a lot more than 2 regions per node.
> >
> > So, for Arm, I would just define NR_NODE_MEMBLKS as an alias to
> NR_MEM_BANKS
> > so it can be used by common code.
> > "
> >
> > > But here comes the problem:
> > > How can we set the NR_MEM_BANKS maximum value, 128 seems an arbitrary
> too?
> >
> > This is based on hardware we currently support (the last time we bumped
> the value was, IIRC, for Thunder-X). In the case of booting UEFI, we can
> get a lot of small ranges as we discover the RAM using the UEFI memory map.
> >
>
> Thanks for the background.
>
> >
> > > If #define NR_MEM_BANKS (CONFIG_NR_NUMA_NODES * N)? And what N should
> be.
> >
> > N would have to be the maximum number of ranges you can find in a NUMA
> node.
> >
> > We would also need to make sure this doesn't break existing platforms.
> So N would have to be quite large or we need a MAX as Stefano suggested.
> >
> > But I would prefer to keep the existing 128 and allow to configure it at
> build time (not necessarily in this series). This avoid to have different
> way to define the value based NUMA vs non-NUMA.
>
> In this case, can we use Stefano's
> "#define NR_NODE_MEMBLKS MAX(NR_MEM_BANKS, (CONFIG_NR_NUMA_NODES * 2))"
> in next version. If yes, should we change x86 part? Because NR_MEM_BANKS
> has not been defined in x86.


What I meant by configuring dynamically is allowing NR_MEM_BANKS to be
set by the user.

The second part of the MAX makes no sense to me (at least on Arm), so I
would really prefer that it not be part of the initial version.

We can refine the value, or introduce the MAX in the future, if we have
a justification for it.


> > > And maybe the definition could be common with x86 if we define
> > > NR_MEM_BANKS to 128 on x86 too.
> >
> > Julien had comment here, I will continue in that email.
>



* Re: [PATCH 22/37] xen/arm: use NR_MEM_BANKS to override default NR_NODE_MEMBLKS
  2021-09-27 10:39                     ` Julien Grall
@ 2021-09-27 16:58                       ` Stefano Stabellini
  2021-09-28  2:57                         ` Wei Chen
  0 siblings, 1 reply; 192+ messages in thread
From: Stefano Stabellini @ 2021-09-27 16:58 UTC (permalink / raw)
  To: Julien Grall
  Cc: Wei Chen, Julien Grall, Stefano Stabellini, xen-devel,
	Bertrand Marquis, Jan Beulich, Roger Pau Monné,
	Andrew Cooper


On Mon, 27 Sep 2021, Julien Grall wrote:
> On Mon, 27 Sep 2021, 12:22 Wei Chen, <Wei.Chen@arm.com> wrote:
>       Hi Julien,
> 
>       From: Julien Grall <julien.grall.oss@gmail.com>
>       Sent: 2021年9月27日 15:36
>       To: Wei Chen <Wei.Chen@arm.com>
>       Cc: Stefano Stabellini <sstabellini@kernel.org>; xen-devel <xen-devel@lists.xenproject.org>; Bertrand Marquis
>       <Bertrand.Marquis@arm.com>; Jan Beulich <jbeulich@suse.com>; Roger Pau Monné <roger.pau@citrix.com>; Andrew Cooper
>       <andrew.cooper3@citrix.com>
>       Subject: Re: [PATCH 22/37] xen/arm: use NR_MEM_BANKS to override default NR_NODE_MEMBLKS
> 
> 
>       On Mon, 27 Sep 2021, 08:53 Wei Chen, <mailto:Wei.Chen@arm.com> wrote:
>       Hi Julien,
> 
>       > -----Original Message-----
>       > From: Xen-devel <mailto:xen-devel-bounces@lists.xenproject.org> On Behalf Of Wei
>       > Chen
>       > Sent: 2021年9月27日 14:46
>       > To: Stefano Stabellini <mailto:sstabellini@kernel.org>
>       > Cc: mailto:xen-devel@lists.xenproject.org; mailto:julien@xen.org; Bertrand Marquis
>       > <mailto:Bertrand.Marquis@arm.com>; mailto:jbeulich@suse.com; mailto:roger.pau@citrix.com;
>       > mailto:andrew.cooper3@citrix.com
>       > Subject: RE: [PATCH 22/37] xen/arm: use NR_MEM_BANKS to override default
>       > NR_NODE_MEMBLKS
>       >
>       > Hi Stefano, Julien,
>       >
>       > > -----Original Message-----
>       > > From: Stefano Stabellini <mailto:sstabellini@kernel.org>
>       > > Sent: 2021年9月27日 13:00
>       > > To: Wei Chen <mailto:Wei.Chen@arm.com>
>       > > Cc: Stefano Stabellini <mailto:sstabellini@kernel.org>; xen-
>       > > mailto:devel@lists.xenproject.org; mailto:julien@xen.org; Bertrand Marquis
>       > > <mailto:Bertrand.Marquis@arm.com>; mailto:jbeulich@suse.com; mailto:roger.pau@citrix.com;
>       > > mailto:andrew.cooper3@citrix.com
>       > > Subject: RE: [PATCH 22/37] xen/arm: use NR_MEM_BANKS to override default
>       > > NR_NODE_MEMBLKS
>       > >
>       > > +x86 maintainers
>       > >
>       > > On Mon, 27 Sep 2021, Wei Chen wrote:
>       > > > > -----Original Message-----
>       > > > > From: Stefano Stabellini <sstabellini@kernel.org>
>       > > > > Sent: 2021年9月27日 11:26
>       > > > > To: Wei Chen <Wei.Chen@arm.com>
>       > > > > Cc: Stefano Stabellini <sstabellini@kernel.org>;
>       > > > > xen-devel@lists.xenproject.org; julien@xen.org; Bertrand Marquis
>       > > > > <Bertrand.Marquis@arm.com>
>       > > > > Subject: RE: [PATCH 22/37] xen/arm: use NR_MEM_BANKS to override
>       > > default
>       > > > > NR_NODE_MEMBLKS
>       > > > >
>       > > > > On Sun, 26 Sep 2021, Wei Chen wrote:
>       > > > > > > -----Original Message-----
>       > > > > > > From: Stefano Stabellini <sstabellini@kernel.org>
>       > > > > > > Sent: 2021年9月24日 9:35
>       > > > > > > To: Wei Chen <Wei.Chen@arm.com>
>       > > > > > > Cc: xen-devel@lists.xenproject.org; sstabellini@kernel.org;
>       > > > > > > julien@xen.org;
>       > > > > > > Bertrand Marquis <Bertrand.Marquis@arm.com>
>       > > > > > > Subject: Re: [PATCH 22/37] xen/arm: use NR_MEM_BANKS to override
>       > > > > default
>       > > > > > > NR_NODE_MEMBLKS
>       > > > > > >
>       > > > > > > On Thu, 23 Sep 2021, Wei Chen wrote:
>       > > > > > > > As a memory range described in device tree cannot be split
>       > > > > > > > across multiple nodes, we define NR_NODE_MEMBLKS as
>       > > > > > > > NR_MEM_BANKS in the arch header.
>       > > > > > >
>       > > > > > > This statement is true, but what is the goal of this patch? Is
>       > > > > > > it to reduce code size and memory consumption?
>       > > > > > >
>       > > > > >
>       > > > > > No, when Julien and I discussed this in the last version[1], we
>       > > > > > hadn't thought so deeply. We just thought a memory range described
>       > > > > > in DT cannot be split across multiple nodes. So NR_NODE_MEMBLKS
>       > > > > > should be equal to NR_MEM_BANKS.
>       > > > > >
>       > > > > > https://lists.xenproject.org/archives/html/xen-devel/2021-08/msg00974.html
>       > > > > >
>       > > > > > > I am asking because NR_MEM_BANKS is 128 and
>       > > > > > > NR_NODE_MEMBLKS=2*MAX_NUMNODES which is 64 by default so again
>       > > > > > > NR_NODE_MEMBLKS is 128 before this patch.
>       > > > > > >
>       > > > > > > In other words, this patch alone doesn't make any difference; at
>       > > > > > > least it doesn't make any difference unless CONFIG_NR_NUMA_NODES
>       > > > > > > is increased.
>       > > > > > >
>       > > > > > > So, is the goal to reduce memory usage when CONFIG_NR_NUMA_NODES
>       > > > > > > is higher than 64?
>       > > > > > >
>       > > > > >
>       > > > > > I also thought about this problem when I was writing this patch.
>       > > > > > CONFIG_NR_NUMA_NODES can be increased, but NR_MEM_BANKS is a fixed
>       > > > > > value, so NR_MEM_BANKS can become smaller than CONFIG_NR_NUMA_NODES
>       > > > > > at some point.
>       > > > > >
>       > > > > > But I agree with Julien's suggestion that NR_MEM_BANKS and
>       > > > > > NR_NODE_MEMBLKS must be aware of each other. I had thought to add
>       > > > > > some ASSERT checks, but I don't know how to do it better. So I
>       > > > > > posted this patch for more suggestions.
>       > > > >
>       > > > > OK. In that case I'd say to get rid of the previous definition of
>       > > > > NR_NODE_MEMBLKS as it is probably not necessary, see below.
>       > > > >
>       > > > >
>       > > > >
>       > > > > > >
>       > > > > > > > And keep default NR_NODE_MEMBLKS in common header
>       > > > > > > > for those architectures NUMA is disabled.
>       > > > > > >
>       > > > > > > This last sentence is not accurate: on x86 NUMA is enabled and
>       > > > > > > NR_NODE_MEMBLKS is still defined in xen/include/xen/numa.h
>       > > > > > > (there is no x86 definition of it)
>       > > > > > >
>       > > > > >
>       > > > > > Yes.
>       > > > > >
>       > > > > > >
>       > > > > > > > Signed-off-by: Wei Chen <wei.chen@arm.com>
>       > > > > > > > ---
>       > > > > > > >  xen/include/asm-arm/numa.h | 8 +++++++-
>       > > > > > > >  xen/include/xen/numa.h     | 2 ++
>       > > > > > > >  2 files changed, 9 insertions(+), 1 deletion(-)
>       > > > > > > >
>       > > > > > > > diff --git a/xen/include/asm-arm/numa.h b/xen/include/asm-arm/numa.h
>       > > > > > > > index 8f1c67e3eb..21569e634b 100644
>       > > > > > > > --- a/xen/include/asm-arm/numa.h
>       > > > > > > > +++ b/xen/include/asm-arm/numa.h
>       > > > > > > > @@ -3,9 +3,15 @@
>       > > > > > > >
>       > > > > > > >  #include <xen/mm.h>
>       > > > > > > >
>       > > > > > > > +#include <asm/setup.h>
>       > > > > > > > +
>       > > > > > > >  typedef u8 nodeid_t;
>       > > > > > > >
>       > > > > > > > -#ifndef CONFIG_NUMA
>       > > > > > > > +#ifdef CONFIG_NUMA
>       > > > > > > > +
>       > > > > > > > +#define NR_NODE_MEMBLKS NR_MEM_BANKS
>       > > > > > > > +
>       > > > > > > > +#else
>       > > > > > > >
>       > > > > > > >  /* Fake one node for now. See also node_online_map. */
>       > > > > > > >  #define cpu_to_node(cpu) 0
>       > > > > > > > diff --git a/xen/include/xen/numa.h b/xen/include/xen/numa.h
>       > > > > > > > index 1978e2be1b..1731e1cc6b 100644
>       > > > > > > > --- a/xen/include/xen/numa.h
>       > > > > > > > +++ b/xen/include/xen/numa.h
>       > > > > > > > @@ -12,7 +12,9 @@
>       > > > > > > >  #define MAX_NUMNODES    1
>       > > > > > > >  #endif
>       > > > > > > >
>       > > > > > > > +#ifndef NR_NODE_MEMBLKS
>       > > > > > > >  #define NR_NODE_MEMBLKS (MAX_NUMNODES*2)
>       > > > > > > > +#endif
>       > > > >
>       > > > > This one we can remove it completely right?
>       > > >
>       > > > How about define NR_MEM_BANKS to:
>       > > > #ifdef CONFIG_NR_NUMA_NODES
>       > > > #define NR_MEM_BANKS (CONFIG_NR_NUMA_NODES * 2)
>       > > > #else
>       > > > #define NR_MEM_BANKS 128
>       > > > #endif
>       > > > for both x86 and Arm. Those architectures that do not support or
>       > > > enable NUMA can still use "NR_MEM_BANKS 128". And we can replace all
>       > > > NR_NODE_MEMBLKS in the NUMA code with NR_MEM_BANKS to remove
>       > > > NR_NODE_MEMBLKS completely. In this case, NR_MEM_BANKS can be aware
>       > > > of changes to CONFIG_NR_NUMA_NODES.
>       > >
>       > > x86 doesn't have NR_MEM_BANKS as far as I can tell. I guess you also
>       > > meant to rename NR_NODE_MEMBLKS to NR_MEM_BANKS?
>       > >
>       >
>       > Yes.
>       >
>       > > But NR_MEM_BANKS is not directly related to CONFIG_NR_NUMA_NODES because
>       > > there can be many memory banks for each numa node, certainly more than
>       > > 2. The existing definition on x86:
>       > >
>       > > #define NR_NODE_MEMBLKS (MAX_NUMNODES*2)
>       > >
>       > > Doesn't make a lot of sense to me. Was it just an arbitrary limit for
>       > > the lack of a better way to set a maximum?
>       > >
>       >
>       > At that time, this was probably the most cost-effective approach:
>       > enough and easy. But if more nodes need to be supported in the
>       > future, they may bring more memory blocks, and this maximum value
>       > might no longer apply. The maximum may need to support dynamic extension.
>       >
>       > >
>       > > On the other hand, NR_MEM_BANKS and NR_NODE_MEMBLKS seem to be related.
>       > > In fact, what's the difference?
>       > >
>       > > NR_MEM_BANKS is the max number of memory banks (with or without
>       > > numa-node-id).
>       > >
>       > > NR_NODE_MEMBLKS is the max number of memory banks with NUMA support
>       > > (with numa-node-id)?
>       > >
>       > > They are basically the same thing. On ARM I would just do:
>       > >
>       >
>       > Probably not: NR_MEM_BANKS will also count memory ranges without a
>       > numa-node-id in the boot memory parsing stage (process_memory_node or
>       > the EFI parser). But NR_NODE_MEMBLKS will only count memory ranges
>       > with a numa-node-id.
>       >
>       > > #define NR_NODE_MEMBLKS MAX(NR_MEM_BANKS, (CONFIG_NR_NUMA_NODES * 2))
>       > >
>       > >
> 
>       > Quote Julien's comment from HTML email to here:
>       > " As you wrote above, the second part of the MAX is totally arbitrary.
>       > In fact, it is very likely that if you have more than 64 nodes, you may
>       > need a lot more than 2 regions per node.
>       >
>       > So, for Arm, I would just define NR_NODE_MEMBLKS as an alias to NR_MEM_BANKS
>       > so it can be used by common code.
>       > "
>       >
>       > > But here comes the problem:
>       > > How can we set the NR_MEM_BANKS maximum value? 128 seems arbitrary too.
>       >
>       > This is based on hardware we currently support (the last time we bumped the value was, IIRC, for Thunder-X). In the case of
>       booting UEFI, we can get a lot of small ranges as we discover the RAM using the UEFI memory map.
>       >
> 
>       Thanks for the background.
> 
>       >
>       > > What if we #define NR_MEM_BANKS (CONFIG_NR_NUMA_NODES * N)? And what should N be?
>       >
>       > N would have to be the maximum number of ranges you can find in a NUMA node.
>       >
>       > We would also need to make sure this doesn't break existing platforms. So N would have to be quite large or we need a MAX as
>       Stefano suggested.
>       >
>       > But I would prefer to keep the existing 128 and allow configuring it at build time (not necessarily in this series). This
>       avoids having different ways to define the value based on NUMA vs non-NUMA.
> 
>       In this case, can we use Stefano's
>       "#define NR_NODE_MEMBLKS MAX(NR_MEM_BANKS, (CONFIG_NR_NUMA_NODES * 2))"
>       in the next version? If yes, should we change the x86 part? Because
>       NR_MEM_BANKS has not been defined on x86.
> 
> 
> What I meant by configuring dynamically is allowing NR_MEM_BANKS to be set by the user.
> 
> The second part of the MAX makes no sense to me (at least on Arm). So I really prefer if this is not part of the initial version.
> 
> We can refine the value, or introduce the MAX in the future if we have a justification for it.

OK, so for clarity the suggestion is:

- define NR_NODE_MEMBLKS as NR_MEM_BANKS on ARM in this series
- in the future make NR_MEM_BANKS user-configurable via kconfig
- for now leave NR_MEM_BANKS as 128 on ARM

That's fine by me.

^ permalink raw reply	[flat|nested] 192+ messages in thread

* Re: [PATCH 36/37] xen/arm: Provide Kconfig options for Arm to enable NUMA
  2021-09-27  9:17             ` Jan Beulich
@ 2021-09-27 17:17               ` Stefano Stabellini
  2021-09-28  2:59                 ` Wei Chen
  0 siblings, 1 reply; 192+ messages in thread
From: Stefano Stabellini @ 2021-09-27 17:17 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Julien Grall, Stefano Stabellini, Wei Chen, xen-devel, Bertrand Marquis

On Mon, 27 Sep 2021, Jan Beulich wrote:
> On 27.09.2021 10:45, Julien Grall wrote:
> > On Mon, 27 Sep 2021, 10:33 Jan Beulich, <jbeulich@suse.com> wrote:
> > 
> >> On 24.09.2021 21:39, Stefano Stabellini wrote:
> >>> On Fri, 24 Sep 2021, Wei Chen wrote:
> >>>> On 2021/9/24 11:31, Stefano Stabellini wrote:
> >>>>> On Thu, 23 Sep 2021, Wei Chen wrote:
> >>>>>> --- a/xen/arch/arm/Kconfig
> >>>>>> +++ b/xen/arch/arm/Kconfig
> >>>>>> @@ -34,6 +34,17 @@ config ACPI
> >>>>>>      Advanced Configuration and Power Interface (ACPI) support for Xen is
> >>>>>>      an alternative to device tree on ARM64.
> >>>>>>
> >>>>>> +config DEVICE_TREE_NUMA
> >>>>>> +  def_bool n
> >>>>>> +  select NUMA
> >>>>>> +
> >>>>>> +config ARM_NUMA
> >>>>>> +  bool "Arm NUMA (Non-Uniform Memory Access) Support (UNSUPPORTED)" if UNSUPPORTED
> >>>>>> +  select DEVICE_TREE_NUMA if HAS_DEVICE_TREE
> >>>>>
> >>>>> Should it be: depends on HAS_DEVICE_TREE ?
> >>>>> (And eventually depends on HAS_DEVICE_TREE || ACPI)
> >>>>>
> >>>>
> >>>> As discussed in the RFC [1], we want to make ARM_NUMA a generic
> >>>> option that can be selected by users, and depend on HAS_DEVICE_TREE
> >>>> or ACPI to select DEVICE_TREE_NUMA or ACPI_NUMA.
> >>>>
> >>>> If we add HAS_DEVICE_TREE || ACPI as dependencies for ARM_NUMA,
> >>>> does it become a circular dependency?
> >>>>
> >>>>
> >> https://lists.xenproject.org/archives/html/xen-devel/2021-08/msg00888.html
> >>>
> >>> OK, I am fine with that. I was just trying to catch the case where a
> >>> user selects "ARM_NUMA" but actually neither ACPI nor HAS_DEVICE_TREE
> >>> are selected so nothing happens. I was trying to make it clear that
> >>> ARM_NUMA depends on having at least one between HAS_DEVICE_TREE or ACPI
> >>> because otherwise it is not going to work.
> >>>
> >>> That said, I don't think this is important because HAS_DEVICE_TREE
> >>> cannot be unselected. So if we cannot find a way to express the
> >>> dependency, I think it is fine to keep the patch as is.
> >>
> >> So how about doing things the other way around: ARM_NUMA has no prompt
> >> and defaults to ACPI_NUMA || DT_NUMA, and DT_NUMA gains a prompt instead
> >> (and, for Arm at least, ACPI_NUMA as well; this might even be worthwhile
> >> to have on x86 down the road).
> >>
> > 
> > As I wrote before, I don't think the user should say "I want to enable NUMA
> > with Device-Tree or ACPI". Instead, they should say whether they want to use
> > NUMA and let Xen decide whether to enable the DT/ACPI support.
> > 
> > In other words, the prompt should stay on ARM_NUMA.
> 
> Okay. In which case I'm confused by Stefano's question.

Let me clarify: I think it is fine to have a single prompt for NUMA in
Kconfig. However, I am just pointing out that it is theoretically
possible with the current code to present an ARM_NUMA prompt to the user
but actually have no NUMA enabled at the end because both DEVICE TREE
and ACPI are disabled. This is only a theoretical problem because DEVICE
TREE support (HAS_DEVICE_TREE) cannot be disabled today. Also I cannot
imagine how a configuration with neither DEVICE TREE nor ACPI can be
correct. So I don't think it is a critical concern.

That said, you can see that, at least theoretically, ARM_NUMA depends on
either HAS_DEVICE_TREE or ACPI, so I suggested to add:

depends on HAS_DEVICE_TREE || ACPI

Wei answered that it might introduce a circular dependency, but I did
try the addition of "depends on HAS_DEVICE_TREE || ACPI" under ARM_NUMA
in Kconfig and everything built fine here.
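Putting the pieces of this exchange together, the variant with the explicit dependency would look roughly like the Kconfig fragment below. This is an illustrative sketch combining the hunk quoted earlier with the suggested `depends on`, not the exact code from the series:

```kconfig
config DEVICE_TREE_NUMA
	def_bool n
	select NUMA

config ARM_NUMA
	bool "Arm NUMA (Non-Uniform Memory Access) Support (UNSUPPORTED)" if UNSUPPORTED
	# Suggested addition: make the (currently theoretical) requirement explicit
	depends on HAS_DEVICE_TREE || ACPI
	select DEVICE_TREE_NUMA if HAS_DEVICE_TREE
```

Since HAS_DEVICE_TREE cannot be deselected today, the `depends on` line changes nothing in practice; it only documents the requirement.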


^ permalink raw reply	[flat|nested] 192+ messages in thread

* RE: [PATCH 08/37] xen/x86: add detection of discontinous node memory range
  2021-09-27  9:50               ` Wei Chen
@ 2021-09-27 17:19                 ` Stefano Stabellini
  2021-09-28  4:41                   ` Wei Chen
  0 siblings, 1 reply; 192+ messages in thread
From: Stefano Stabellini @ 2021-09-27 17:19 UTC (permalink / raw)
  To: Wei Chen
  Cc: Stefano Stabellini, xen-devel, julien, Bertrand Marquis,
	jbeulich, andrew.cooper3, roger.pau, wl


On Mon, 27 Sep 2021, Wei Chen wrote:
> > -----Original Message-----
> > From: Stefano Stabellini <sstabellini@kernel.org>
> > Sent: 2021年9月27日 13:05
> > To: Stefano Stabellini <sstabellini@kernel.org>
> > Cc: Wei Chen <Wei.Chen@arm.com>; xen-devel@lists.xenproject.org;
> > julien@xen.org; Bertrand Marquis <Bertrand.Marquis@arm.com>;
> > jbeulich@suse.com; andrew.cooper3@citrix.com; roger.pau@citrix.com;
> > wl@xen.org
> > Subject: RE: [PATCH 08/37] xen/x86: add detection of discontinous node
> > memory range
> > 
> > On Sun, 26 Sep 2021, Stefano Stabellini wrote:
> > > On Sun, 26 Sep 2021, Wei Chen wrote:
> > > > > -----Original Message-----
> > > > > From: Stefano Stabellini <sstabellini@kernel.org>
> > > > > Sent: 2021年9月25日 3:53
> > > > > To: Wei Chen <Wei.Chen@arm.com>
> > > > > Cc: Stefano Stabellini <sstabellini@kernel.org>; xen-
> > > > > devel@lists.xenproject.org; julien@xen.org; Bertrand Marquis
> > > > > <Bertrand.Marquis@arm.com>; jbeulich@suse.com;
> > andrew.cooper3@citrix.com;
> > > > > roger.pau@citrix.com; wl@xen.org
> > > > > Subject: RE: [PATCH 08/37] xen/x86: add detection of discontinous
> > node
> > > > > memory range
> > > > >
> > > > > On Fri, 24 Sep 2021, Wei Chen wrote:
> > > > > > > -----Original Message-----
> > > > > > > From: Stefano Stabellini <sstabellini@kernel.org>
> > > > > > > Sent: 2021年9月24日 8:26
> > > > > > > To: Wei Chen <Wei.Chen@arm.com>
> > > > > > > Cc: xen-devel@lists.xenproject.org; sstabellini@kernel.org;
> > > > > julien@xen.org;
> > > > > > > Bertrand Marquis <Bertrand.Marquis@arm.com>; jbeulich@suse.com;
> > > > > > > andrew.cooper3@citrix.com; roger.pau@citrix.com; wl@xen.org
> > > > > > > Subject: Re: [PATCH 08/37] xen/x86: add detection of
> > discontinous node
> > > > > > > memory range
> > > > > > >
> > > > > > > CC'ing x86 maintainers
> > > > > > >
> > > > > > > On Thu, 23 Sep 2021, Wei Chen wrote:
> > > > > > > > One NUMA node may contain several memory blocks. In the current
> > > > > > > > Xen code, Xen will maintain a node memory range for each node to
> > > > > > > > cover all its memory blocks. But here comes the problem: if, in
> > > > > > > > the gap between one node's two memory blocks, there are memory
> > > > > > > > blocks that don't belong to this node (remote memory blocks),
> > > > > > > > this node's memory range will be expanded to cover these remote
> > > > > > > > memory blocks.
> > > > > > > >
> > > > > > > > Having one node's memory range contain other nodes' memory is
> > > > > > > > obviously not reasonable. This means the current NUMA code can
> > > > > > > > only support nodes with contiguous memory blocks. However, on a
> > > > > > > > physical machine, the addresses of multiple nodes can be
> > > > > > > > interleaved.
> > > > > > > >
> > > > > > > > So in this patch, we add code to detect discontinuous memory
> > > > > > > > blocks for one node. NUMA initialization will fail and error
> > > > > > > > messages will be printed when Xen detects such a hardware
> > > > > > > > configuration.
> > > > > > >
> > > > > > > At least on ARM, it is not just memory that can be interleaved,
> > > > > > > but also MMIO regions. For instance:
> > > > > > >
> > > > > > > node0 bank0 0-0x1000000
> > > > > > > MMIO 0x1000000-0x1002000
> > > > > > > Hole 0x1002000-0x2000000
> > > > > > > node0 bank1 0x2000000-0x3000000
> > > > > > >
> > > > > > > So I am not familiar with the SRAT format, but I think on ARM
> > > > > > > the check would look different: we would just look for multiple
> > > > > > > memory ranges under a device_type = "memory" node of a NUMA node
> > > > > > > in device tree.
> > > > > > >
> > > > > > >
> > > > > >
> > > > > > Should I need to include/refine above message to commit log?
> > > > >
> > > > > Let me ask you a question first.
> > > > >
> > > > > With the NUMA implementation of this patch series, can we deal with
> > > > > cases where each node has multiple memory banks, not interleaved?
> > > >
> > > > Yes.
> > > >
> > > > > An an example:
> > > > >
> > > > > node0: 0x0        - 0x10000000
> > > > > MMIO : 0x10000000 - 0x20000000
> > > > > node0: 0x20000000 - 0x30000000
> > > > > MMIO : 0x30000000 - 0x50000000
> > > > > node1: 0x50000000 - 0x60000000
> > > > > MMIO : 0x60000000 - 0x80000000
> > > > > node2: 0x80000000 - 0x90000000
> > > > >
> > > > >
> > > > > I assume we can deal with this case simply by setting node0 memory
> > > > > to 0x0-0x30000000 even if there is actually something else, a device,
> > > > > that doesn't belong to node0 in between the two node0 banks?
> > > >
> > > > While this configuration is rare in SoC design, it is not
> > > > impossible.
> > >
> > > Definitely, I have seen it before.
> > >
> > >
> > > > > Is it only other nodes' memory interleaving that causes issues? In
> > > > > other words, is only the following a problematic scenario?
> > > > >
> > > > > node0: 0x0        - 0x10000000
> > > > > MMIO : 0x10000000 - 0x20000000
> > > > > node1: 0x20000000 - 0x30000000
> > > > > MMIO : 0x30000000 - 0x50000000
> > > > > node0: 0x50000000 - 0x60000000
> > > > >
> > > > > Because node1 is in between the two ranges of node0?
> > > > >
> > > >
> > > > But only device_type="memory" ranges can be added to the allocator.
> > > > For MMIO there are two cases:
> > > > 1. MMIO doesn't have a NUMA id property.
> > > > 2. MMIO has a NUMA id property, just like some PCIe controllers.
> > > >    But we don't need to handle these kinds of MMIO devices
> > > >    in memory block parsing, because we don't need to allocate
> > > >    memory from these MMIO ranges. And for access, we need
> > > >    a NUMA-aware PCIe controller driver or generic NUMA-aware
> > > >    MMIO access APIs.
> > >
> > > Yes, I am not too worried about devices with a NUMA id property because
> > > they are less common and this series doesn't handle them at all, right?
> > > I imagine they would be treated like any other device without NUMA
> > > awareness.
> > >
> > > I am thinking about the case where the memory of each NUMA node is made
> > > of multiple banks. I understand that this patch adds an explicit check
> > > for cases where these banks are interleaving, however there are many
> > > other cases where NUMA memory nodes are *not* interleaving but they are
> > > still made of multiple discontinuous banks, like in the two example
> > > above.
> > >
> > > My question is whether this patch series in its current form can handle
> > > the two cases above correctly. If so, I am wondering how it works given
> > > that we only have a single "start" and "size" parameter per node.
> > >
> > > On the other hand if this series cannot handle the two cases above, my
> > > question is whether it would fail explicitly or not. The new
> > > check is_node_memory_continuous doesn't seem to be able to catch them.
> > 
> > 
> > Looking at numa_update_node_memblks, it is clear that the code is meant
> > to increase the range of each numa node to cover even MMIO regions in
> > between memory banks. Also see the comment at the top of the file:
> > 
> >  * Assumes all memory regions belonging to a single proximity domain
> >  * are in one chunk. Holes between them will be included in the node.
> > 
> > So if there are multiple banks for each node, start and end are
> > stretched to cover the holes between them, and it works as long as
> > memory banks of different NUMA nodes don't interleave.
> > 
> > I would appreciate if you could add an in-code comment to explain this
> > on top of numa_update_node_memblk.
> 
> Yes, I will do it.
 
Thank you


> > Have you had a chance to test this? If not it would be fantastic if you
> > could give it a quick test to make sure it works as intended: for
> > instance by creating multiple memory banks for each NUMA node by
> > splitting an real bank into two smaller banks with a hole in between in
> > device tree, just for the sake of testing.
> 
> Yes, I have created some fake NUMA nodes in FVP device tree to test it.
> The intertwining of nodes' addresses can be detected.
> 
> (XEN) SRAT: Node 0 0000000080000000-00000000ff000000
> (XEN) SRAT: Node 1 0000000880000000-00000008c0000000
> (XEN) NODE 0: (0000000080000000-00000008d0000000) intertwine with NODE 1 (0000000880000000-00000008c0000000)

Great thanks! And what if there are multiple non-contiguous memory banks
per node, but *not* intertwined. Does that all work correctly as
expected?

^ permalink raw reply	[flat|nested] 192+ messages in thread

* RE: [PATCH 20/37] xen: introduce CONFIG_EFI to stub API for non-EFI architecture
  2021-09-27 10:28               ` Wei Chen
@ 2021-09-28  0:59                 ` Stefano Stabellini
  2021-09-28  4:16                   ` Wei Chen
  0 siblings, 1 reply; 192+ messages in thread
From: Stefano Stabellini @ 2021-09-28  0:59 UTC (permalink / raw)
  To: Wei Chen
  Cc: Jan Beulich, xen-devel, julien, Bertrand Marquis, Stefano Stabellini


On Mon, 27 Sep 2021, Wei Chen wrote:
> > -----Original Message-----
> > From: Xen-devel <xen-devel-bounces@lists.xenproject.org> On Behalf Of Wei
> > Chen
> > Sent: 2021年9月26日 18:25
> > To: Jan Beulich <jbeulich@suse.com>
> > Cc: xen-devel@lists.xenproject.org; julien@xen.org; Bertrand Marquis
> > <Bertrand.Marquis@arm.com>; Stefano Stabellini <sstabellini@kernel.org>
> > Subject: RE: [PATCH 20/37] xen: introduce CONFIG_EFI to stub API for non-
> > EFI architecture
> > 
> > Hi Jan,
> > 
> > > -----Original Message-----
> > > From: Xen-devel <xen-devel-bounces@lists.xenproject.org> On Behalf Of
> > Jan
> > > Beulich
> > > Sent: 2021年9月24日 18:49
> > > To: Wei Chen <Wei.Chen@arm.com>
> > > Cc: xen-devel@lists.xenproject.org; julien@xen.org; Bertrand Marquis
> > > <Bertrand.Marquis@arm.com>; Stefano Stabellini <sstabellini@kernel.org>
> > > Subject: Re: [PATCH 20/37] xen: introduce CONFIG_EFI to stub API for
> > non-
> > > EFI architecture
> > >
> > > On 24.09.2021 12:31, Wei Chen wrote:
> > > >> From: Jan Beulich <jbeulich@suse.com>
> > > >> Sent: 2021年9月24日 15:59
> > > >>
> > > >> On 24.09.2021 06:34, Wei Chen wrote:
> > > >>>> From: Stefano Stabellini <sstabellini@kernel.org>
> > > >>>> Sent: 2021年9月24日 9:15
> > > >>>>
> > > >>>> On Thu, 23 Sep 2021, Wei Chen wrote:
> > > >>>>> --- a/xen/common/Kconfig
> > > >>>>> +++ b/xen/common/Kconfig
> > > >>>>> @@ -11,6 +11,16 @@ config COMPAT
> > > >>>>>  config CORE_PARKING
> > > >>>>>  	bool
> > > >>>>>
> > > >>>>> +config EFI
> > > >>>>> +	bool
> > > >>>>
> > > >>>> Without the title the option is not user-selectable (or de-
> > > selectable).
> > > >>>> So the help message below can never be seen.
> > > >>>>
> > > >>>> Either add a title, e.g.:
> > > >>>>
> > > >>>> bool "EFI support"
> > > >>>>
> > > >>>> Or fully make the option a silent option by removing the help text.
> > > >>>
> > > >>> OK, in current Xen code, EFI is unconditionally compiled. Before
> > > >>> we change related code, I prefer to remove the help text.
> > > >>
> > > >> But that's not true: At least on x86 EFI gets compiled depending on
> > > >> tool chain capabilities. Ultimately we may indeed want a user
> > > >> selectable option here, but until then I'm afraid having this option
> > > >> at all may be misleading on x86.
> > > >>
> > > >
> > > > I check the build scripts, yes, you're right. For x86, EFI is not a
> > > > selectable option in Kconfig. I agree with you, we can't use Kconfig
> > > > system to decide to enable EFI build for x86 or not.
> > > >
> > > > So how about we just use this EFI option for Arm only? Because on Arm,
> > > > we do not have such toolchain dependency.
> > >
> > > To be honest - don't know. That's because I don't know what you want
> > > to use the option for subsequently.
> > >
> > 
> > In the last version, I had introduced an arch helper to stub EFI_BOOT
> > in Arm's common code for Arm32, because Arm32 doesn't support EFI.
> > So Julien suggested that I introduce a CONFIG_EFI option for
> > architectures without EFI support to stub out the EFI layer.
> > 
> > [1] https://lists.xenproject.org/archives/html/xen-devel/2021-08/msg00808.html
> > 
> 
> As Jan reminded, x86 doesn't depend on Kconfig to build EFI code.
> So, if we use CONFIG_EFI to stub the EFI APIs for x86, we may end up
> with the toolchain enabling EFI while Kconfig disables it, or Kconfig
> enabling EFI while the toolchain doesn't provide EFI build support.
> Then x86 could not work well.
> 
> If we use CONFIG_EFI for Arm only, that means CONFIG_EFI for x86
> is off, which will also cause problems.
> 
> So, can we still use the previous arch helpers to stub for Arm32,
> until x86 can use this selectable option?

EFI doesn't have to be necessarily a user-visible option in Kconfig at
this point. I think Julien was just asking to make the #ifdef based on
> an EFI-related config rather than just on CONFIG_ARM64.

On x86 EFI is detected based on compiler support, setting XEN_BUILD_EFI
in xen/arch/x86/Makefile. Let's say that we keep using the same name
"XEN_BUILD_EFI" on ARM as well.

On ARM32, XEN_BUILD_EFI should be always unset.

On ARM64 XEN_BUILD_EFI should be always set.

That's it, right? I'd argue that CONFIG_EFI or HAS_EFI are better names
than XEN_BUILD_EFI, but that's OK anyway. So for instance you can make
XEN_BUILD_EFI an invisible symbol in xen/arch/arm/Kconfig and select it
only on ARM64.
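As a rough sketch of this suggestion (the symbol names are illustrative; the final naming was left open in the thread), the invisible symbol in xen/arch/arm/Kconfig could look like:

```kconfig
# xen/arch/arm/Kconfig -- illustrative sketch only
config EFI
	bool    # invisible symbol: no prompt, so never user-selectable

config ARM_64
	def_bool y
	depends on 64BIT
	select EFI   # EFI support is always built on arm64, never on arm32
```

Because the symbol has no prompt, it mirrors how XEN_BUILD_EFI behaves on x86: the build, not the user, decides whether EFI code is compiled in.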

^ permalink raw reply	[flat|nested] 192+ messages in thread

* RE: [PATCH 22/37] xen/arm: use NR_MEM_BANKS to override default NR_NODE_MEMBLKS
  2021-09-27 16:58                       ` Stefano Stabellini
@ 2021-09-28  2:57                         ` Wei Chen
  0 siblings, 0 replies; 192+ messages in thread
From: Wei Chen @ 2021-09-28  2:57 UTC (permalink / raw)
  To: Stefano Stabellini, Julien Grall
  Cc: Julien Grall, xen-devel, Bertrand Marquis, Jan Beulich,
	Roger Pau Monné,
	Andrew Cooper

Hi Stefano, Julien,

> -----Original Message-----
> From: Stefano Stabellini <sstabellini@kernel.org>
> Sent: 2021年9月28日 0:58
> To: Julien Grall <julien.grall@gmail.com>
> Cc: Wei Chen <Wei.Chen@arm.com>; Julien Grall <julien.grall.oss@gmail.com>;
> Stefano Stabellini <sstabellini@kernel.org>; xen-devel <xen-
> devel@lists.xenproject.org>; Bertrand Marquis <Bertrand.Marquis@arm.com>;
> Jan Beulich <jbeulich@suse.com>; Roger Pau Monné <roger.pau@citrix.com>;
> Andrew Cooper <andrew.cooper3@citrix.com>
> Subject: Re: [PATCH 22/37] xen/arm: use NR_MEM_BANKS to override default
> NR_NODE_MEMBLKS
> 
> On Mon, 27 Sep 2021, Julien Grall wrote:
> > On Mon, 27 Sep 2021, 12:22 Wei Chen, <Wei.Chen@arm.com> wrote:
> >       Hi Julien,
> >
> >       From: Julien Grall <julien.grall.oss@gmail.com>
> >       Sent: 2021年9月27日 15:36
> >       To: Wei Chen <Wei.Chen@arm.com>
> >       Cc: Stefano Stabellini <sstabellini@kernel.org>; xen-devel <xen-
> devel@lists.xenproject.org>; Bertrand Marquis
> >       <Bertrand.Marquis@arm.com>; Jan Beulich <jbeulich@suse.com>; Roger
> Pau Monné <roger.pau@citrix.com>; Andrew Cooper
> >       <andrew.cooper3@citrix.com>
> >       Subject: Re: [PATCH 22/37] xen/arm: use NR_MEM_BANKS to override
> default NR_NODE_MEMBLKS
> >
> >
> >       On Mon, 27 Sep 2021, 08:53 Wei Chen, <Wei.Chen@arm.com> wrote:
> >       Hi Julien,
> >
> >       > -----Original Message-----
> >       > From: Xen-devel <xen-devel-bounces@lists.xenproject.org> On Behalf Of Wei
> >       > Chen
> >       > Sent: 2021年9月27日 14:46
> >       > To: Stefano Stabellini <sstabellini@kernel.org>
> >       > Cc: xen-devel@lists.xenproject.org; julien@xen.org; Bertrand Marquis
> >       > <Bertrand.Marquis@arm.com>; jbeulich@suse.com; roger.pau@citrix.com;
> >       > andrew.cooper3@citrix.com
> >       > Subject: RE: [PATCH 22/37] xen/arm: use NR_MEM_BANKS to override
> default
> >       > NR_NODE_MEMBLKS
> >       >
> >       > Hi Stefano, Julien,
> >       >
> >       > > -----Original Message-----
> >       > > From: Stefano Stabellini <sstabellini@kernel.org>
> >       > > Sent: 2021年9月27日 13:00
> >       > > To: Wei Chen <Wei.Chen@arm.com>
> >       > > Cc: Stefano Stabellini <sstabellini@kernel.org>;
> >       > > xen-devel@lists.xenproject.org; julien@xen.org; Bertrand Marquis
> >       > > <Bertrand.Marquis@arm.com>; jbeulich@suse.com; roger.pau@citrix.com;
> >       > > andrew.cooper3@citrix.com
> >       > > Subject: RE: [PATCH 22/37] xen/arm: use NR_MEM_BANKS to override default
> >       > > NR_NODE_MEMBLKS
> >       > >
> >       > > +x86 maintainers
> >       > >
> >       > > On Mon, 27 Sep 2021, Wei Chen wrote:
> >       > > > > -----Original Message-----
> >       > > > > From: Stefano Stabellini <sstabellini@kernel.org>
> >       > > > > Sent: 2021年9月27日 11:26
> >       > > > > To: Wei Chen <Wei.Chen@arm.com>
> >       > > > > Cc: Stefano Stabellini <sstabellini@kernel.org>;
> >       > > > > xen-devel@lists.xenproject.org; julien@xen.org; Bertrand Marquis
> >       > > > > <Bertrand.Marquis@arm.com>
> >       > > > > Subject: RE: [PATCH 22/37] xen/arm: use NR_MEM_BANKS to override
> >       > > > > default NR_NODE_MEMBLKS
> >       > > > >
> >       > > > > On Sun, 26 Sep 2021, Wei Chen wrote:
> >       > > > > > > -----Original Message-----
> >       > > > > > > From: Stefano Stabellini
> <sstabellini@kernel.org>
> >       > > > > > > Sent: 24 September 2021 9:35
> >       > > > > > > To: Wei Chen <Wei.Chen@arm.com>
> >       > > > > > > Cc: xen-devel@lists.xenproject.org;
> sstabellini@kernel.org;
> >       > > > > julien@xen.org;
> >       > > > > > > Bertrand Marquis <Bertrand.Marquis@arm.com>
> >       > > > > > > Subject: Re: [PATCH 22/37] xen/arm: use NR_MEM_BANKS
> to override
> >       > > > > default
> >       > > > > > > NR_NODE_MEMBLKS
> >       > > > > > >
> >       > > > > > > On Thu, 23 Sep 2021, Wei Chen wrote:
> >       > > > > > > > A memory range described in device tree cannot be
> >       > > > > > > > split across multiple nodes, so we define
> >       > > > > > > > NR_NODE_MEMBLKS as NR_MEM_BANKS in the arch header.
> >       > > > > > >
> >       > > > > > > This statement is true but what is the goal of this
> patch? Is it
> >       > > to
> >       > > > > > > reduce code size and memory consumption?
> >       > > > > > >
> >       > > > > >
> >       > > > > > No, when Julien and I discussed this in the last
> >       > > > > > version[1], we hadn't thought about it so deeply. We just
> >       > > > > > thought a memory range described in DT cannot be split
> >       > > > > > across multiple nodes, so NR_NODE_MEMBLKS should be equal
> >       > > > > > to NR_MEM_BANKS.
> >       > > > > >
> >       > > > > > https://lists.xenproject.org/archives/html/xen-
> devel/2021-
> >       > > > > 08/msg00974.html
> >       > > > > >
> >       > > > > > > I am asking because NR_MEM_BANKS is 128 and
> >       > > > > > > NR_NODE_MEMBLKS=2*MAX_NUMNODES which is 64 by default
> so again
> >       > > > > > > NR_NODE_MEMBLKS is 128 before this patch.
> >       > > > > > >
> >       > > > > > > In other words, this patch alone doesn't make any
> difference; at
> >       > > least
> >       > > > > > > doesn't make any difference unless
> CONFIG_NR_NUMA_NODES is
> >       > > increased.
> >       > > > > > >
> >       > > > > > > So, is the goal to reduce memory usage when
> CONFIG_NR_NUMA_NODES
> >       > > is
> >       > > > > > > higher than 64?
> >       > > > > > >
> >       > > > > >
> >       > > > > > I also thought about this problem when I was writing this
> >       > > > > > patch. CONFIG_NR_NUMA_NODES can be increased, but
> >       > > > > > NR_MEM_BANKS is a fixed value, so NR_MEM_BANKS can become
> >       > > > > > smaller than CONFIG_NR_NUMA_NODES at some point.
> >       > > > > >
> >       > > > > > But I agree with Julien's suggestion that NR_MEM_BANKS and
> >       > > > > > NR_NODE_MEMBLKS must be aware of each other. I had thought
> >       > > > > > about adding some ASSERT checks, but I don't know how to do
> >       > > > > > it better, so I posted this patch for more suggestions.
> >       > > > >
> >       > > > > OK. In that case I'd say to get rid of the previous
> definition of
> >       > > > > NR_NODE_MEMBLKS as it is probably not necessary, see below.
> >       > > > >
> >       > > > >
> >       > > > >
> >       > > > > > >
> >       > > > > > > > And keep default NR_NODE_MEMBLKS in common header
> >       > > > > > > > for those architectures NUMA is disabled.
> >       > > > > > >
> >       > > > > > > This last sentence is not accurate: on x86 NUMA is
> enabled and
> >       > > > > > > NR_NODE_MEMBLKS is still defined in
> xen/include/xen/numa.h
> >       > (there
> >       > > is
> >       > > > > no
> >       > > > > > > x86 definition of it)
> >       > > > > > >
> >       > > > > >
> >       > > > > > Yes.
> >       > > > > >
> >       > > > > > >
> >       > > > > > > > Signed-off-by: Wei Chen <wei.chen@arm.com>
> >       > > > > > > > ---
> >       > > > > > > >  xen/include/asm-arm/numa.h | 8 +++++++-
> >       > > > > > > >  xen/include/xen/numa.h     | 2 ++
> >       > > > > > > >  2 files changed, 9 insertions(+), 1 deletion(-)
> >       > > > > > > >
> >       > > > > > > > diff --git a/xen/include/asm-arm/numa.h
> b/xen/include/asm-
> >       > > arm/numa.h
> >       > > > > > > > index 8f1c67e3eb..21569e634b 100644
> >       > > > > > > > --- a/xen/include/asm-arm/numa.h
> >       > > > > > > > +++ b/xen/include/asm-arm/numa.h
> >       > > > > > > > @@ -3,9 +3,15 @@
> >       > > > > > > >
> >       > > > > > > >  #include <xen/mm.h>
> >       > > > > > > >
> >       > > > > > > > +#include <asm/setup.h>
> >       > > > > > > > +
> >       > > > > > > >  typedef u8 nodeid_t;
> >       > > > > > > >
> >       > > > > > > > -#ifndef CONFIG_NUMA
> >       > > > > > > > +#ifdef CONFIG_NUMA
> >       > > > > > > > +
> >       > > > > > > > +#define NR_NODE_MEMBLKS NR_MEM_BANKS
> >       > > > > > > > +
> >       > > > > > > > +#else
> >       > > > > > > >
> >       > > > > > > >  /* Fake one node for now. See also node_online_map.
> */
> >       > > > > > > >  #define cpu_to_node(cpu) 0
> >       > > > > > > > diff --git a/xen/include/xen/numa.h
> b/xen/include/xen/numa.h
> >       > > > > > > > index 1978e2be1b..1731e1cc6b 100644
> >       > > > > > > > --- a/xen/include/xen/numa.h
> >       > > > > > > > +++ b/xen/include/xen/numa.h
> >       > > > > > > > @@ -12,7 +12,9 @@
> >       > > > > > > >  #define MAX_NUMNODES    1
> >       > > > > > > >  #endif
> >       > > > > > > >
> >       > > > > > > > +#ifndef NR_NODE_MEMBLKS
> >       > > > > > > >  #define NR_NODE_MEMBLKS (MAX_NUMNODES*2)
> >       > > > > > > > +#endif
> >       > > > >
> >       > > > > This one we can remove it completely right?
> >       > > >
> >       > > > How about defining NR_MEM_BANKS as:
> >       > > > #ifdef CONFIG_NR_NUMA_NODES
> >       > > > #define NR_MEM_BANKS (CONFIG_NR_NUMA_NODES * 2)
> >       > > > #else
> >       > > > #define NR_MEM_BANKS 128
> >       > > > #endif
> >       > > > for both x86 and Arm. Architectures that do not support or
> >       > > > enable NUMA can still use "NR_MEM_BANKS 128". And replace every
> >       > > > NR_NODE_MEMBLKS in the NUMA code with NR_MEM_BANKS, removing
> >       > > > NR_NODE_MEMBLKS completely. In this case, NR_MEM_BANKS is aware
> >       > > > of changes to CONFIG_NR_NUMA_NODES.
> >       > >
> >       > > x86 doesn't have NR_MEM_BANKS as far as I can tell. I guess
> you also
> >       > > meant to rename NR_NODE_MEMBLKS to NR_MEM_BANKS?
> >       > >
> >       >
> >       > Yes.
> >       >
> >       > > But NR_MEM_BANKS is not directly related to
> CONFIG_NR_NUMA_NODES because
> >       > > there can be many memory banks for each numa node, certainly
> more than
> >       > > 2. The existing definition on x86:
> >       > >
> >       > > #define NR_NODE_MEMBLKS (MAX_NUMNODES*2)
> >       > >
> >       > > Doesn't make a lot of sense to me. Was it just an arbitrary
> limit for
> >       > > the lack of a better way to set a maximum?
> >       > >
> >       >
> >       > At that time, this was probably the most cost-effective approach:
> >       > enough and easy. But if more nodes need to be supported in the
> >       > future, they may bring more memory blocks, and this maximum value
> >       > might no longer apply. The maximum may need to support dynamic
> >       > extension.
> >       >
> >       > >
> >       > > On the other hand, NR_MEM_BANKS and NR_NODE_MEMBLKS seem to be
> related.
> >       > > In fact, what's the difference?
> >       > >
> >       > > NR_MEM_BANKS is the max number of memory banks (with or
> without
> >       > > numa-node-id).
> >       > >
> >       > > NR_NODE_MEMBLKS is the max number of memory banks with NUMA
> support
> >       > > (with numa-node-id)?
> >       > >
> >       > > They are basically the same thing. On ARM I would just do:
> >       > >
> >       >
> >       > Probably not: NR_MEM_BANKS counts memory ranges without a
> >       > numa-node-id in the boot memory parsing stage (process_memory_node
> >       > or the EFI parser), while NR_NODE_MEMBLKS only counts memory
> >       > ranges with a numa-node-id.
> >       >
> >       > > #define NR_NODE_MEMBLKS MAX(NR_MEM_BANKS,
> (CONFIG_NR_NUMA_NODES * 2))
> >       > >
> >       > >
> >
> >       > Quote Julien's comment from HTML email to here:
> >       > " As you wrote above, the second part of the MAX is totally
> arbitrary.
> >       > In fact, it is very likely that if you have more than 64 nodes,
> you may
> >       > need a lot more than 2 regions per node.
> >       >
> >       > So, for Arm, I would just define NR_NODE_MEMBLKS as an alias to
> NR_MEM_BANKS
> >       > so it can be used by common code.
> >       > "
> >       >
> >       > > But here comes the problem:
> >       > > How can we set the NR_MEM_BANKS maximum value? 128 seems
> arbitrary too.
> >       >
> >       > This is based on hardware we currently support (the last time we
> bumped the value was, IIRC, for Thunder-X). In the case of
> >       booting UEFI, we can get a lot of small ranges as we discover the
> RAM using the UEFI memory map.
> >       >
> >
> >       Thanks for the background.
> >
> >       >
> >       > > If #define NR_MEM_BANKS (CONFIG_NR_NUMA_NODES * N)? And what N
> should be.
> >       >
> >       > N would have to be the maximum number of ranges you can find in
> a NUMA node.
> >       >
> >       > We would also need to make sure this doesn't break existing
> platforms. So N would have to be quite large or we need a MAX as
> >       Stefano suggested.
> >       >
> >       > But I would prefer to keep the existing 128 and allow it to be
> configured at build time (not necessarily in this series). This
> >       avoids having different ways to define the value for NUMA vs non-
> NUMA.
> >
> >       In this case, can we use Stefano's
> >       "#define NR_NODE_MEMBLKS MAX(NR_MEM_BANKS, (CONFIG_NR_NUMA_NODES *
> 2))"
> >       in the next version? If yes, should we change the x86 part? Because
> NR_MEM_BANKS
> >       has not been defined on x86.
> >
> >
> > What I meant by configuring dynamically is allowing NR_MEM_BANKS to be
> set by the user.
> >
> > The second part of the MAX makes no sense to me (at least on Arm). So I
> really prefer if this is not part of the initial version.
> >
> > We can refine the value, or introduce the MAX in the future if we have a
> justification for it.
> 
> OK, so for clarity the suggestion is:
> 
> - define NR_NODE_MEMBLKS as NR_MEM_BANKS on ARM in this series
> - in the future make NR_MEM_BANKS user-configurable via kconfig
> - for now leave NR_MEM_BANKS as 128 on ARM
> 
> That's fine by me.

Ok, I will only keep
#define NR_NODE_MEMBLKS NR_MEM_BANKS in asm-arm/numa.h, and leave
the x86 NR_NODE_MEMBLKS definition as it was in asm-x86/numa.h,
because at the current stage we cannot unify them in one place.
I will also update the commit message to record some of our
discussion in this thread.





^ permalink raw reply	[flat|nested] 192+ messages in thread

* RE: [PATCH 36/37] xen/arm: Provide Kconfig options for Arm to enable NUMA
  2021-09-27 17:17               ` Stefano Stabellini
@ 2021-09-28  2:59                 ` Wei Chen
  2021-09-28  3:30                   ` Stefano Stabellini
  0 siblings, 1 reply; 192+ messages in thread
From: Wei Chen @ 2021-09-28  2:59 UTC (permalink / raw)
  To: Stefano Stabellini, Jan Beulich; +Cc: Julien Grall, xen-devel, Bertrand Marquis

Hi Stefano,

> -----Original Message-----
> From: Stefano Stabellini <sstabellini@kernel.org>
> Sent: 28 September 2021 1:17
> To: Jan Beulich <jbeulich@suse.com>
> Cc: Julien Grall <julien.grall.oss@gmail.com>; Stefano Stabellini
> <sstabellini@kernel.org>; Wei Chen <Wei.Chen@arm.com>; xen-devel <xen-
> devel@lists.xenproject.org>; Bertrand Marquis <Bertrand.Marquis@arm.com>
> Subject: Re: [PATCH 36/37] xen/arm: Provide Kconfig options for Arm to
> enable NUMA
> 
> On Mon, 27 Sep 2021, Jan Beulich wrote:
> > On 27.09.2021 10:45, Julien Grall wrote:
> > > On Mon, 27 Sep 2021, 10:33 Jan Beulich, <jbeulich@suse.com> wrote:
> > >
> > >> On 24.09.2021 21:39, Stefano Stabellini wrote:
> > >>> On Fri, 24 Sep 2021, Wei Chen wrote:
> > >>>> On 2021/9/24 11:31, Stefano Stabellini wrote:
> > >>>>> On Thu, 23 Sep 2021, Wei Chen wrote:
> > >>>>>> --- a/xen/arch/arm/Kconfig
> > >>>>>> +++ b/xen/arch/arm/Kconfig
> > >>>>>> @@ -34,6 +34,17 @@ config ACPI
> > >>>>>>      Advanced Configuration and Power Interface (ACPI) support
> for
> > >> Xen is
> > >>>>>>      an alternative to device tree on ARM64.
> > >>>>>>   + config DEVICE_TREE_NUMA
> > >>>>>> +  def_bool n
> > >>>>>> +  select NUMA
> > >>>>>> +
> > >>>>>> +config ARM_NUMA
> > >>>>>> +  bool "Arm NUMA (Non-Uniform Memory Access) Support
> (UNSUPPORTED)"
> > >> if
> > >>>>>> UNSUPPORTED
> > >>>>>> +  select DEVICE_TREE_NUMA if HAS_DEVICE_TREE
> > >>>>>
> > >>>>> Should it be: depends on HAS_DEVICE_TREE ?
> > >>>>> (And eventually depends on HAS_DEVICE_TREE || ACPI)
> > >>>>>
> > >>>>
> > >>>> As discussed in the RFC [1], we want to make ARM_NUMA a generic
> > >>>> option that can be selected by users, and depend on has_device_tree
> > >>>> or ACPI to select DEVICE_TREE_NUMA or ACPI_NUMA.
> > >>>>
> > >>>> If we add HAS_DEVICE_TREE || ACPI as dependencies for ARM_NUMA,
> > >>>> does it become a loop dependency?
> > >>>>
> > >>>>
> > >> https://lists.xenproject.org/archives/html/xen-devel/2021-
> 08/msg00888.html
> > >>>
> > >>> OK, I am fine with that. I was just trying to catch the case where a
> > >>> user selects "ARM_NUMA" but actually neither ACPI nor
> HAS_DEVICE_TREE
> > >>> are selected so nothing happens. I was trying to make it clear that
> > >>> ARM_NUMA depends on having at least one between HAS_DEVICE_TREE or
> ACPI
> > >>> because otherwise it is not going to work.
> > >>>
> > >>> That said, I don't think this is important because HAS_DEVICE_TREE
> > >>> cannot be unselected. So if we cannot find a way to express the
> > >>> dependency, I think it is fine to keep the patch as is.
> > >>
> > >> So how about doing things the other way around: ARM_NUMA has no
> prompt
> > >> and defaults to ACPI_NUMA || DT_NUMA, and DT_NUMA gains a prompt
> instead
> > >> (and, for Arm at least, ACPI_NUMA as well; this might even be
> worthwhile
> > >> to have on x86 down the road).
> > >>
> > >
> > > As I wrote before, I don't think the user should say "I want to enable
> NUMA
> > > with Device-Tree or ACPI". Instead, they say whether they want to use
> NUMA
> > > and let Xen decide to enable the DT/ACPI support.
> > >
> > > In other words, the prompt should stay on ARM_NUMA.
> >
> > Okay. In which case I'm confused by Stefano's question.
> 
> Let me clarify: I think it is fine to have a single prompt for NUMA in
> Kconfig. However, I am just pointing out that it is theoretically
> possible with the current code to present an ARM_NUMA prompt to the user
> but actually have no NUMA enabled at the end because both DEVICE TREE
> and ACPI are disabled. This is only a theoretical problem because DEVICE
> TREE support (HAS_DEVICE_TREE) cannot be disabled today. Also I cannot
> imagine how a configuration with neither DEVICE TREE nor ACPI can be
> correct. So I don't think it is a critical concern.
> 
> That said, you can see that, at least theoretically, ARM_NUMA depends on
> either HAS_DEVICE_TREE or ACPI, so I suggested to add:
> 
> depends on HAS_DEVICE_TREE || ACPI
> 
> Wei answered that it might introduce a circular dependency, but I did
> try the addition of "depends on HAS_DEVICE_TREE || ACPI" under ARM_NUMA
> in Kconfig and everything built fine here.

Ok, I will add "depends on HAS_DEVICE_TREE" in the next version; "|| ACPI"
will come later, once we have ACPI NUMA for Arm : )
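[Editorial note] The Kconfig arrangement this sub-thread converges on would look roughly like the sketch below. This is only an illustration of the discussed outcome, not the final committed form; option text and dependencies may differ in the applied version.

```kconfig
config DEVICE_TREE_NUMA
	def_bool n
	select NUMA

config ARM_NUMA
	bool "Arm NUMA (Non-Uniform Memory Access) Support (UNSUPPORTED)" if UNSUPPORTED
	depends on HAS_DEVICE_TREE
	select DEVICE_TREE_NUMA
```

With the prompt kept on ARM_NUMA, the user only chooses whether they want NUMA; Xen selects the DT (and, later, ACPI) backend itself.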



* RE: [PATCH 36/37] xen/arm: Provide Kconfig options for Arm to enable NUMA
  2021-09-28  2:59                 ` Wei Chen
@ 2021-09-28  3:30                   ` Stefano Stabellini
  0 siblings, 0 replies; 192+ messages in thread
From: Stefano Stabellini @ 2021-09-28  3:30 UTC (permalink / raw)
  To: Wei Chen
  Cc: Stefano Stabellini, Jan Beulich, Julien Grall, xen-devel,
	Bertrand Marquis


On Tue, 28 Sep 2021, Wei Chen wrote:
> > -----Original Message-----
> > From: Stefano Stabellini <sstabellini@kernel.org>
> > Sent: 28 September 2021 1:17
> > To: Jan Beulich <jbeulich@suse.com>
> > Cc: Julien Grall <julien.grall.oss@gmail.com>; Stefano Stabellini
> > <sstabellini@kernel.org>; Wei Chen <Wei.Chen@arm.com>; xen-devel <xen-
> > devel@lists.xenproject.org>; Bertrand Marquis <Bertrand.Marquis@arm.com>
> > Subject: Re: [PATCH 36/37] xen/arm: Provide Kconfig options for Arm to
> > enable NUMA
> > 
> > On Mon, 27 Sep 2021, Jan Beulich wrote:
> > > On 27.09.2021 10:45, Julien Grall wrote:
> > > > On Mon, 27 Sep 2021, 10:33 Jan Beulich, <jbeulich@suse.com> wrote:
> > > >
> > > >> On 24.09.2021 21:39, Stefano Stabellini wrote:
> > > >>> On Fri, 24 Sep 2021, Wei Chen wrote:
> > > >>>> On 2021/9/24 11:31, Stefano Stabellini wrote:
> > > >>>>> On Thu, 23 Sep 2021, Wei Chen wrote:
> > > >>>>>> --- a/xen/arch/arm/Kconfig
> > > >>>>>> +++ b/xen/arch/arm/Kconfig
> > > >>>>>> @@ -34,6 +34,17 @@ config ACPI
> > > >>>>>>      Advanced Configuration and Power Interface (ACPI) support
> > for
> > > >> Xen is
> > > >>>>>>      an alternative to device tree on ARM64.
> > > >>>>>>   + config DEVICE_TREE_NUMA
> > > >>>>>> +  def_bool n
> > > >>>>>> +  select NUMA
> > > >>>>>> +
> > > >>>>>> +config ARM_NUMA
> > > >>>>>> +  bool "Arm NUMA (Non-Uniform Memory Access) Support
> > (UNSUPPORTED)"
> > > >> if
> > > >>>>>> UNSUPPORTED
> > > >>>>>> +  select DEVICE_TREE_NUMA if HAS_DEVICE_TREE
> > > >>>>>
> > > >>>>> Should it be: depends on HAS_DEVICE_TREE ?
> > > >>>>> (And eventually depends on HAS_DEVICE_TREE || ACPI)
> > > >>>>>
> > > >>>>
> > > >>>> As discussed in the RFC [1], we want to make ARM_NUMA a generic
> > > >>>> option that can be selected by users, and depend on has_device_tree
> > > >>>> or ACPI to select DEVICE_TREE_NUMA or ACPI_NUMA.
> > > >>>>
> > > >>>> If we add HAS_DEVICE_TREE || ACPI as dependencies for ARM_NUMA,
> > > >>>> does it become a loop dependency?
> > > >>>>
> > > >>>>
> > > >> https://lists.xenproject.org/archives/html/xen-devel/2021-
> > 08/msg00888.html
> > > >>>
> > > >>> OK, I am fine with that. I was just trying to catch the case where a
> > > >>> user selects "ARM_NUMA" but actually neither ACPI nor
> > HAS_DEVICE_TREE
> > > >>> are selected so nothing happens. I was trying to make it clear that
> > > >>> ARM_NUMA depends on having at least one between HAS_DEVICE_TREE or
> > ACPI
> > > >>> because otherwise it is not going to work.
> > > >>>
> > > >>> That said, I don't think this is important because HAS_DEVICE_TREE
> > > >>> cannot be unselected. So if we cannot find a way to express the
> > > >>> dependency, I think it is fine to keep the patch as is.
> > > >>
> > > >> So how about doing things the other way around: ARM_NUMA has no
> > prompt
> > > >> and defaults to ACPI_NUMA || DT_NUMA, and DT_NUMA gains a prompt
> > instead
> > > >> (and, for Arm at least, ACPI_NUMA as well; this might even be
> > worthwhile
> > > >> to have on x86 down the road).
> > > >>
> > > >
> > > > As I wrote before, I don't think the user should say "I want to enable
> > NUMA
> > > > with Device-Tree or ACPI". Instead, they say whether they want to use
> > NUMA
> > > > and let Xen decide to enable the DT/ACPI support.
> > > >
> > > > In other words, the prompt should stay on ARM_NUMA.
> > >
> > > Okay. In which case I'm confused by Stefano's question.
> > 
> > Let me clarify: I think it is fine to have a single prompt for NUMA in
> > Kconfig. However, I am just pointing out that it is theoretically
> > possible with the current code to present an ARM_NUMA prompt to the user
> > but actually have no NUMA enabled at the end because both DEVICE TREE
> > and ACPI are disabled. This is only a theoretical problem because DEVICE
> > TREE support (HAS_DEVICE_TREE) cannot be disabled today. Also I cannot
> > imagine how a configuration with neither DEVICE TREE nor ACPI can be
> > correct. So I don't think it is a critical concern.
> > 
> > That said, you can see that, at least theoretically, ARM_NUMA depends on
> > either HAS_DEVICE_TREE or ACPI, so I suggested to add:
> > 
> > depends on HAS_DEVICE_TREE || ACPI
> > 
> > Wei answered that it might introduce a circular dependency, but I did
> > try the addition of "depends on HAS_DEVICE_TREE || ACPI" under ARM_NUMA
> > in Kconfig and everything built fine here.
> 
> Ok, I will add "depends on HAS_DEVICE_TREE" in the next version; "|| ACPI"
> will come later, once we have ACPI NUMA for Arm : )

Good point :)


* RE: [PATCH 20/37] xen: introduce CONFIG_EFI to stub API for non-EFI architecture
  2021-09-28  0:59                 ` Stefano Stabellini
@ 2021-09-28  4:16                   ` Wei Chen
  2021-09-28  5:01                     ` Stefano Stabellini
  0 siblings, 1 reply; 192+ messages in thread
From: Wei Chen @ 2021-09-28  4:16 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: Jan Beulich, xen-devel, julien, Bertrand Marquis

Hi Stefano,

> -----Original Message-----
> From: Stefano Stabellini <sstabellini@kernel.org>
> Sent: 28 September 2021 9:00
> To: Wei Chen <Wei.Chen@arm.com>
> Cc: Jan Beulich <jbeulich@suse.com>; xen-devel@lists.xenproject.org;
> julien@xen.org; Bertrand Marquis <Bertrand.Marquis@arm.com>; Stefano
> Stabellini <sstabellini@kernel.org>
> Subject: RE: [PATCH 20/37] xen: introduce CONFIG_EFI to stub API for non-
> EFI architecture
> 
> On Mon, 27 Sep 2021, Wei Chen wrote:
> > > -----Original Message-----
> > > From: Xen-devel <xen-devel-bounces@lists.xenproject.org> On Behalf Of
> Wei
> > > Chen
> > > Sent: 26 September 2021 18:25
> > > To: Jan Beulich <jbeulich@suse.com>
> > > Cc: xen-devel@lists.xenproject.org; julien@xen.org; Bertrand Marquis
> > > <Bertrand.Marquis@arm.com>; Stefano Stabellini <sstabellini@kernel.org>
> > > Subject: RE: [PATCH 20/37] xen: introduce CONFIG_EFI to stub API for
> non-
> > > EFI architecture
> > >
> > > Hi Jan,
> > >
> > > > -----Original Message-----
> > > > From: Xen-devel <xen-devel-bounces@lists.xenproject.org> On Behalf
> Of
> > > Jan
> > > > Beulich
> > > > Sent: 24 September 2021 18:49
> > > > To: Wei Chen <Wei.Chen@arm.com>
> > > > Cc: xen-devel@lists.xenproject.org; julien@xen.org; Bertrand Marquis
> > > > <Bertrand.Marquis@arm.com>; Stefano Stabellini
> <sstabellini@kernel.org>
> > > > Subject: Re: [PATCH 20/37] xen: introduce CONFIG_EFI to stub API for
> > > non-
> > > > EFI architecture
> > > >
> > > > On 24.09.2021 12:31, Wei Chen wrote:
> > > > >> From: Jan Beulich <jbeulich@suse.com>
> > > > >> Sent: 24 September 2021 15:59
> > > > >>
> > > > >> On 24.09.2021 06:34, Wei Chen wrote:
> > > > >>>> From: Stefano Stabellini <sstabellini@kernel.org>
> > > > >>>> Sent: 24 September 2021 9:15
> > > > >>>>
> > > > >>>> On Thu, 23 Sep 2021, Wei Chen wrote:
> > > > >>>>> --- a/xen/common/Kconfig
> > > > >>>>> +++ b/xen/common/Kconfig
> > > > >>>>> @@ -11,6 +11,16 @@ config COMPAT
> > > > >>>>>  config CORE_PARKING
> > > > >>>>>  	bool
> > > > >>>>>
> > > > >>>>> +config EFI
> > > > >>>>> +	bool
> > > > >>>>
> > > > >>>> Without the title the option is not user-selectable (or de-
> > > > selectable).
> > > > >>>> So the help message below can never be seen.
> > > > >>>>
> > > > >>>> Either add a title, e.g.:
> > > > >>>>
> > > > >>>> bool "EFI support"
> > > > >>>>
> > > > >>>> Or fully make the option a silent option by removing the help
> text.
> > > > >>>
> > > > >>> OK, in the current Xen code, EFI is unconditionally compiled.
> > > > >>> Until we change the related code, I prefer to remove the help text.
> > > > >>
> > > > >> But that's not true: At least on x86 EFI gets compiled depending
> on
> > > > >> tool chain capabilities. Ultimately we may indeed want a user
> > > > >> selectable option here, but until then I'm afraid having this
> option
> > > > >> at all may be misleading on x86.
> > > > >>
> > > > >
> > > > > I checked the build scripts; yes, you're right. For x86, EFI is
> > > > > not a selectable option in Kconfig. I agree with you, we can't use
> > > > > the Kconfig system to decide whether to enable the EFI build for x86.
> > > > >
> > > > > So how about we just use this EFI option for Arm only? On Arm we do
> > > > > not have such a toolchain dependency.
> > > >
> > > > To be honest - don't know. That's because I don't know what you want
> > > > to use the option for subsequently.
> > > >
> > >
> > > In the last version, I introduced an arch helper to stub EFI_BOOT
> > > in Arm's common code for Arm32, because Arm32 doesn't support EFI.
> > > So Julien suggested that I introduce a CONFIG_EFI option for
> > > architectures without EFI support to stub the EFI layer.
> > >
> > > [1] https://lists.xenproject.org/archives/html/xen-devel/2021-
> > > 08/msg00808.html
> > >
> >
> > As Jan reminded us, x86 doesn't depend on Kconfig to build EFI code.
> > So, if we use CONFIG_EFI to stub the EFI APIs for x86, we may end up
> > with the toolchain enabling EFI while Kconfig disables it, or with
> > Kconfig enabling EFI while the toolchain doesn't provide EFI build
> > support. In either case x86 would not work well.
> >
> > If we use CONFIG_EFI for Arm only, that means CONFIG_EFI for x86
> > is off, which would also cause problems.
> >
> > So, can we still use the previous arch helpers to stub for Arm32,
> > until x86 can use this selectable option?
> 
> EFI doesn't have to be necessarily a user-visible option in Kconfig at
> this point. I think Julien was just asking to make the #ifdef based on
> an EFI-related config rather than just on CONFIG_ARM64.
> 
> On x86 EFI is detected based on compiler support, setting XEN_BUILD_EFI
> in xen/arch/x86/Makefile. Let's say that we keep using the same name
> "XEN_BUILD_EFI" on ARM as well.
> 
> On ARM32, XEN_BUILD_EFI should be always unset.
> 
> On ARM64 XEN_BUILD_EFI should be always set.
> 
> That's it, right? I'd argue that CONFIG_EFI or HAS_EFI are better names
> than XEN_BUILD_EFI, but that's OK anyway. So for instance you can make
> XEN_BUILD_EFI an invisible symbol in xen/arch/arm/Kconfig and select it
> only on ARM64.

Thanks, this is a good approach. But if we place XEN_BUILD_EFI in Kconfig,
it will turn into CONFIG_XEN_BUILD_EFI. How about using another name
in Kconfig, like ARM_EFI, and then using CONFIG_ARM_EFI in config.h to
define XEN_BUILD_EFI?





* RE: [PATCH 08/37] xen/x86: add detection of discontinous node memory range
  2021-09-27 17:19                 ` Stefano Stabellini
@ 2021-09-28  4:41                   ` Wei Chen
  2021-09-28  4:59                     ` Stefano Stabellini
  0 siblings, 1 reply; 192+ messages in thread
From: Wei Chen @ 2021-09-28  4:41 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: xen-devel, julien, Bertrand Marquis, jbeulich, andrew.cooper3,
	roger.pau, wl

Hi Stefano,

> -----Original Message-----
> From: Stefano Stabellini <sstabellini@kernel.org>
> Sent: 28 September 2021 1:19
> To: Wei Chen <Wei.Chen@arm.com>
> Cc: Stefano Stabellini <sstabellini@kernel.org>; xen-
> devel@lists.xenproject.org; julien@xen.org; Bertrand Marquis
> <Bertrand.Marquis@arm.com>; jbeulich@suse.com; andrew.cooper3@citrix.com;
> roger.pau@citrix.com; wl@xen.org
> Subject: RE: [PATCH 08/37] xen/x86: add detection of discontinous node
> memory range
> 
> On Mon, 27 Sep 2021, Wei Chen wrote:
> > > -----Original Message-----
> > > From: Stefano Stabellini <sstabellini@kernel.org>
> > > Sent: 27 September 2021 13:05
> > > To: Stefano Stabellini <sstabellini@kernel.org>
> > > Cc: Wei Chen <Wei.Chen@arm.com>; xen-devel@lists.xenproject.org;
> > > julien@xen.org; Bertrand Marquis <Bertrand.Marquis@arm.com>;
> > > jbeulich@suse.com; andrew.cooper3@citrix.com; roger.pau@citrix.com;
> > > wl@xen.org
> > > Subject: RE: [PATCH 08/37] xen/x86: add detection of discontinous node
> > > memory range
> > >
> > > On Sun, 26 Sep 2021, Stefano Stabellini wrote:
> > > > On Sun, 26 Sep 2021, Wei Chen wrote:
> > > > > > -----Original Message-----
> > > > > > From: Stefano Stabellini <sstabellini@kernel.org>
> > > > > > Sent: 25 September 2021 3:53
> > > > > > To: Wei Chen <Wei.Chen@arm.com>
> > > > > > Cc: Stefano Stabellini <sstabellini@kernel.org>; xen-
> > > > > > devel@lists.xenproject.org; julien@xen.org; Bertrand Marquis
> > > > > > <Bertrand.Marquis@arm.com>; jbeulich@suse.com;
> > > andrew.cooper3@citrix.com;
> > > > > > roger.pau@citrix.com; wl@xen.org
> > > > > > Subject: RE: [PATCH 08/37] xen/x86: add detection of
> discontinous
> > > node
> > > > > > memory range
> > > > > >
> > > > > > On Fri, 24 Sep 2021, Wei Chen wrote:
> > > > > > > > -----Original Message-----
> > > > > > > > From: Stefano Stabellini <sstabellini@kernel.org>
> > > > > > > > Sent: 24 September 2021 8:26
> > > > > > > > To: Wei Chen <Wei.Chen@arm.com>
> > > > > > > > Cc: xen-devel@lists.xenproject.org; sstabellini@kernel.org;
> > > > > > julien@xen.org;
> > > > > > > > Bertrand Marquis <Bertrand.Marquis@arm.com>;
> jbeulich@suse.com;
> > > > > > > > andrew.cooper3@citrix.com; roger.pau@citrix.com; wl@xen.org
> > > > > > > > Subject: Re: [PATCH 08/37] xen/x86: add detection of
> > > discontinous node
> > > > > > > > memory range
> > > > > > > >
> > > > > > > > CC'ing x86 maintainers
> > > > > > > >
> > > > > > > > On Thu, 23 Sep 2021, Wei Chen wrote:
> >       > > > > > > > > > One NUMA node may contain several memory blocks.
> >       > > > > > > > > > In the current Xen code, Xen maintains a node
> >       > > > > > > > > > memory range for each node to cover all its memory
> >       > > > > > > > > > blocks. But here comes the problem: in the gap
> >       > > > > > > > > > between one node's two memory blocks, there may be
> >       > > > > > > > > > memory blocks that do not belong to this node
> >       > > > > > > > > > (remote memory blocks). This node's memory range
> >       > > > > > > > > > will then be expanded to cover those remote blocks.
> >       > > > > > > > > >
> >       > > > > > > > > > One node's memory range containing other nodes'
> >       > > > > > > > > > memory is obviously not very reasonable. It means
> >       > > > > > > > > > the current NUMA code can only support nodes with
> >       > > > > > > > > > contiguous memory blocks. However, on a physical
> >       > > > > > > > > > machine, the addresses of multiple nodes can be
> >       > > > > > > > > > interleaved.
> >       > > > > > > > > >
> >       > > > > > > > > > So in this patch, we add code to detect
> >       > > > > > > > > > discontinuous memory blocks for one node. NUMA
> >       > > > > > > > > > initialization will fail and error messages will be
> >       > > > > > > > > > printed when Xen detects such a hardware
> >       > > > > > > > > > configuration.
> > > > > > > >
> > > > > > > > At least on ARM, it is not just memory that can be
> interleaved,
> > > but
> > > > > > also
> > > > > > > > MMIO regions. For instance:
> > > > > > > >
> > > > > > > > node0 bank0 0-0x1000000
> > > > > > > > MMIO 0x1000000-0x1002000
> > > > > > > > Hole 0x1002000-0x2000000
> > > > > > > > node0 bank1 0x2000000-0x3000000
> > > > > > > >
> > > > > > > > So I am not familiar with the SRAT format, but I think on
> ARM
> > > the
> > > > > > check
> > > > > > > > would look different: we would just look for multiple memory
> > > ranges
> > > > > > > > under a device_type = "memory" node of a NUMA node in device
> > > tree.
> > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > > > Should I need to include/refine above message to commit log?
> > > > > >
> > > > > > Let me ask you a question first.
> > > > > >
> > > > > > With the NUMA implementation of this patch series, can we deal
> with
> > > > > > cases where each node has multiple memory banks, not interleaved?
> > > > >
> > > > > Yes.
> > > > >
> > > > > > As an example:
> > > > > >
> > > > > > node0: 0x0        - 0x10000000
> > > > > > MMIO : 0x10000000 - 0x20000000
> > > > > > node0: 0x20000000 - 0x30000000
> > > > > > MMIO : 0x30000000 - 0x50000000
> > > > > > node1: 0x50000000 - 0x60000000
> > > > > > MMIO : 0x60000000 - 0x80000000
> > > > > > node2: 0x80000000 - 0x90000000
> > > > > >
> > > > > >
> > > > > > I assume we can deal with this case simply by setting node0
> > > > > > memory to 0x0-0x30000000, even if there is actually something
> > > > > > else, a device, that doesn't belong to node0 in between the two
> > > > > > node0 banks?
> > > > >
> > > > > While this configuration is rare in SoC design, it is not
> > > > > impossible.
> > > >
> > > > Definitely, I have seen it before.
> > > >
> > > >
> > > > > > Is it only other nodes' memory interleaving that causes issues?
> > > > > > In other words, is only the following a problematic scenario?
> > > > > >
> > > > > > node0: 0x0        - 0x10000000
> > > > > > MMIO : 0x10000000 - 0x20000000
> > > > > > node1: 0x20000000 - 0x30000000
> > > > > > MMIO : 0x30000000 - 0x50000000
> > > > > > node0: 0x50000000 - 0x60000000
> > > > > >
> > > > > > Because node1 is in between the two ranges of node0?
> > > > > >
> > > > >
> > > > > But only device_type = "memory" ranges can be added to the
> > > > > allocator. For MMIO there are two cases:
> > > > > 1. The MMIO region doesn't have a NUMA id property.
> > > > > 2. The MMIO region has a NUMA id property, just like some PCIe
> > > > >    controllers. But we don't need to handle these kinds of MMIO
> > > > >    devices in memory block parsing, because we don't need to
> > > > >    allocate memory from these MMIO ranges. And for accessing them,
> > > > >    we would need a NUMA-aware PCIe controller driver or generic
> > > > >    NUMA-aware MMIO access APIs.
> > > >
> > > > Yes, I am not too worried about devices with a NUMA id property
> > > > because they are less common and this series doesn't handle them at
> > > > all, right? I imagine they would be treated like any other device
> > > > without NUMA awareness.
> > > >
> > > > I am thinking about the case where the memory of each NUMA node is
> > > > made of multiple banks. I understand that this patch adds an
> > > > explicit check for cases where these banks are interleaving; however,
> > > > there are many other cases where NUMA memory nodes are *not*
> > > > interleaving but they are still made of multiple discontinuous
> > > > banks, like in the two examples above.
> > > >
> > > > My question is whether this patch series in its current form can
> > > > handle the two cases above correctly. If so, I am wondering how it
> > > > works given that we only have a single "start" and "size" parameter
> > > > per node.
> > > >
> > > > On the other hand, if this series cannot handle the two cases above,
> > > > my question is whether it would fail explicitly or not. The new
> > > > check is_node_memory_continuous doesn't seem to be able to catch
> > > > them.
> > >
> > >
> > > Looking at numa_update_node_memblks, it is clear that the code is
> > > meant to increase the range of each numa node to cover even MMIO
> > > regions in between memory banks. Also see the comment at the top of
> > > the file:
> > >
> > >  * Assumes all memory regions belonging to a single proximity domain
> > >  * are in one chunk. Holes between them will be included in the node.
> > >
> > > So if there are multiple banks for each node, start and end are
> > > stretched to cover the holes between them, and it works as long as
> > > memory banks of different NUMA nodes don't interleave.
> > >
> > > I would appreciate it if you could add an in-code comment to explain
> > > this on top of numa_update_node_memblk.
> >
> > Yes, I will do it.
> 
> Thank you
> 
> 
> > > Have you had a chance to test this? If not, it would be fantastic if
> > > you could give it a quick test to make sure it works as intended: for
> > > instance by creating multiple memory banks for each NUMA node,
> > > splitting a real bank into two smaller banks with a hole in between
> > > in device tree, just for the sake of testing.
> >
> > Yes, I have created some fake NUMA nodes in the FVP device tree to test
> > it. The intertwining of node addresses can be detected:
> >
> > (XEN) SRAT: Node 0 0000000080000000-00000000ff000000
> > (XEN) SRAT: Node 1 0000000880000000-00000008c0000000
> > (XEN) NODE 0: (0000000080000000-00000008d0000000) intertwine with NODE 1 (0000000880000000-00000008c0000000)
> 
> Great thanks! And what if there are multiple non-contiguous memory banks
> per node, but *not* intertwined. Does that all work correctly as
> expected?

Yes, I am using a device tree setting like this:

    memory@80000000 {
        device_type = "memory";
        reg = <0x0 0x80000000 0x0 0x80000000>;
        numa-node-id = <0>;
    };

    memory@880000000 {
        device_type = "memory";
        reg = <0x8 0x80000000 0x0 0x8000000>;
        numa-node-id = <0>;
    };

    memory@890000000 {
        device_type = "memory";
        reg = <0x8 0x90000000 0x0 0x8000000>;
        numa-node-id = <0>;
    };

    memory@8A0000000 {
        device_type = "memory";
        reg = <0x8 0xA0000000 0x0 0x8000000>;
        numa-node-id = <0>;
    };

    memory@8B0000000 {
        device_type = "memory";
        reg = <0x8 0xB0000000 0x0 0x8000000>;
        numa-node-id = <0>;
    };

    memory@8C0000000 {
        device_type = "memory";
        reg = <0x8 0xC0000000 0x0 0x8000000>;
        numa-node-id = <1>;
    };

    memory@8D0000000 {
        device_type = "memory";
        reg = <0x8 0xD0000000 0x0 0x8000000>;
        numa-node-id = <1>;
    };

    memory@8E0000000 {
        device_type = "memory";
        reg = <0x8 0xE0000000 0x0 0x8000000>;
        numa-node-id = <1>;
    };

    memory@8F0000000 {
        device_type = "memory";
        reg = <0x8 0xF0000000 0x0 0x8000000>;
        numa-node-id = <1>;
    };

And in Xen we got the output:

(XEN) DT: NUMA node 0 processor parsed
(XEN) DT: NUMA node 0 processor parsed
(XEN) DT: NUMA node 1 processor parsed
(XEN) DT: NUMA node 1 processor parsed
(XEN) SRAT: Node 0 0000000080000000-00000000ff000000
(XEN) SRAT: Node 0 0000000880000000-0000000888000000
(XEN) SRAT: Node 0 0000000890000000-0000000898000000
(XEN) SRAT: Node 0 00000008a0000000-00000008a8000000
(XEN) SRAT: Node 0 00000008b0000000-00000008b8000000
(XEN) SRAT: Node 1 00000008c0000000-00000008c8000000
(XEN) SRAT: Node 1 00000008d0000000-00000008d8000000
(XEN) SRAT: Node 1 00000008e0000000-00000008e8000000
(XEN) SRAT: Node 1 00000008f0000000-00000008f8000000
(XEN) NUMA: parsing numa-distance-map
(XEN) NUMA: distance: NODE#0->NODE#0:10
(XEN) NUMA: distance: NODE#0->NODE#1:20
(XEN) NUMA: distance: NODE#1->NODE#1:10
(XEN) NUMA: Using 16 for the hash shift.
(XEN) Domain heap initialised
(XEN) Booting using Device Tree
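[Editor's note: the distance values in that log correspond to a numa-distance-map node in the device tree. A fragment consistent with the output above would look roughly like this, based on the common v1 binding; only the matrix values are taken from the log, the rest is boilerplate:]

```dts
distance-map {
    compatible = "numa-distance-map-v1";
    distance-matrix = <0 0 10>,
                      <0 1 20>,
                      <1 1 10>;
};
```
Entries are (node A, node B, distance); the local distance is 10 and the remote distance 20, matching the NODE#0->NODE#1:20 line in the log.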

Dom0 can boot successfully, and xl info gives:
xl info
host                   : X-Dom0
release                : 5.12.0
version                : #20 SMP PREEMPT Wed Jul 28 13:41:28 CST 2021
machine                : aarch64
nr_cpus                : 4
max_cpu_id             : 3
nr_nodes               : 2
cores_per_socket       : 1
threads_per_core       : 1

Using the Xen debug console to dump NUMA info, we got:

(XEN) 'u' pressed -> dumping numa info (now = 13229372281010)
(XEN) NODE0 start->524288 size->8617984 free->388741
(XEN) NODE1 start->9175040 size->229376 free->106460
(XEN) CPU0...1 -> NODE0
(XEN) CPU2...3 -> NODE1
(XEN) Memory location of each domain:
(XEN) Domain 0 (total: 262144):
(XEN)     Node 0: 262144
(XEN)     Node 1: 0


^ permalink raw reply	[flat|nested] 192+ messages in thread

* RE: [PATCH 08/37] xen/x86: add detection of discontinous node memory range
  2021-09-28  4:41                   ` Wei Chen
@ 2021-09-28  4:59                     ` Stefano Stabellini
  0 siblings, 0 replies; 192+ messages in thread
From: Stefano Stabellini @ 2021-09-28  4:59 UTC (permalink / raw)
  To: Wei Chen
  Cc: Stefano Stabellini, xen-devel, julien, Bertrand Marquis,
	jbeulich, andrew.cooper3, roger.pau, wl


On Tue, 28 Sep 2021, Wei Chen wrote:
> Hi Stefano,
> 
> > -----Original Message-----
> > From: Stefano Stabellini <sstabellini@kernel.org>
> > Sent: 2021年9月28日 1:19
> > To: Wei Chen <Wei.Chen@arm.com>
> > Cc: Stefano Stabellini <sstabellini@kernel.org>; xen-
> > devel@lists.xenproject.org; julien@xen.org; Bertrand Marquis
> > <Bertrand.Marquis@arm.com>; jbeulich@suse.com; andrew.cooper3@citrix.com;
> > roger.pau@citrix.com; wl@xen.org
> > Subject: RE: [PATCH 08/37] xen/x86: add detection of discontinous node
> > memory range
> > 
> > On Mon, 27 Sep 2021, Wei Chen wrote:
> > > > -----Original Message-----
> > > > From: Stefano Stabellini <sstabellini@kernel.org>
> > > > Sent: 2021年9月27日 13:05
> > > > To: Stefano Stabellini <sstabellini@kernel.org>
> > > > Cc: Wei Chen <Wei.Chen@arm.com>; xen-devel@lists.xenproject.org;
> > > > julien@xen.org; Bertrand Marquis <Bertrand.Marquis@arm.com>;
> > > > jbeulich@suse.com; andrew.cooper3@citrix.com; roger.pau@citrix.com;
> > > > wl@xen.org
> > > > Subject: RE: [PATCH 08/37] xen/x86: add detection of discontinous node
> > > > memory range
> > > >
> > > > [...]
> > > > Have you had a chance to test this? If not, it would be fantastic
> > > > if you could give it a quick test to make sure it works as intended:
> > > > for instance by creating multiple memory banks for each NUMA node,
> > > > splitting a real bank into two smaller banks with a hole in between
> > > > in device tree, just for the sake of testing.
> > >
> > > Yes, I have created some fake NUMA nodes in the FVP device tree to
> > > test it. The intertwining of node addresses can be detected:
> > >
> > > (XEN) SRAT: Node 0 0000000080000000-00000000ff000000
> > > (XEN) SRAT: Node 1 0000000880000000-00000008c0000000
> > > (XEN) NODE 0: (0000000080000000-00000008d0000000) intertwine with NODE 1 (0000000880000000-00000008c0000000)
> > 
> > Great thanks! And what if there are multiple non-contiguous memory banks
> > per node, but *not* intertwined. Does that all work correctly as
> > expected?
> 
> Yes, I am using a device tree setting like this:

Perfect! Thank you!




* RE: [PATCH 20/37] xen: introduce CONFIG_EFI to stub API for non-EFI architecture
  2021-09-28  4:16                   ` Wei Chen
@ 2021-09-28  5:01                     ` Stefano Stabellini
  2021-09-28  8:02                       ` Jan Beulich
  0 siblings, 1 reply; 192+ messages in thread
From: Stefano Stabellini @ 2021-09-28  5:01 UTC (permalink / raw)
  To: Wei Chen
  Cc: Stefano Stabellini, Jan Beulich, xen-devel, julien, Bertrand Marquis


On Tue, 28 Sep 2021, Wei Chen wrote:
> > -----Original Message-----
> > From: Stefano Stabellini <sstabellini@kernel.org>
> > Sent: 2021年9月28日 9:00
> > To: Wei Chen <Wei.Chen@arm.com>
> > Cc: Jan Beulich <jbeulich@suse.com>; xen-devel@lists.xenproject.org;
> > julien@xen.org; Bertrand Marquis <Bertrand.Marquis@arm.com>; Stefano
> > Stabellini <sstabellini@kernel.org>
> > Subject: RE: [PATCH 20/37] xen: introduce CONFIG_EFI to stub API for non-
> > EFI architecture
> > 
> > On Mon, 27 Sep 2021, Wei Chen wrote:
> > > > -----Original Message-----
> > > > From: Xen-devel <xen-devel-bounces@lists.xenproject.org> On Behalf Of
> > Wei
> > > > Chen
> > > > Sent: 2021年9月26日 18:25
> > > > To: Jan Beulich <jbeulich@suse.com>
> > > > Cc: xen-devel@lists.xenproject.org; julien@xen.org; Bertrand Marquis
> > > > <Bertrand.Marquis@arm.com>; Stefano Stabellini <sstabellini@kernel.org>
> > > > Subject: RE: [PATCH 20/37] xen: introduce CONFIG_EFI to stub API for
> > non-
> > > > EFI architecture
> > > >
> > > > Hi Jan,
> > > >
> > > > > -----Original Message-----
> > > > > From: Xen-devel <xen-devel-bounces@lists.xenproject.org> On Behalf
> > Of
> > > > Jan
> > > > > Beulich
> > > > > Sent: 2021年9月24日 18:49
> > > > > To: Wei Chen <Wei.Chen@arm.com>
> > > > > Cc: xen-devel@lists.xenproject.org; julien@xen.org; Bertrand Marquis
> > > > > <Bertrand.Marquis@arm.com>; Stefano Stabellini
> > <sstabellini@kernel.org>
> > > > > Subject: Re: [PATCH 20/37] xen: introduce CONFIG_EFI to stub API for
> > > > non-
> > > > > EFI architecture
> > > > >
> > > > > On 24.09.2021 12:31, Wei Chen wrote:
> > > > > >> From: Jan Beulich <jbeulich@suse.com>
> > > > > >> Sent: 2021年9月24日 15:59
> > > > > >>
> > > > > >> On 24.09.2021 06:34, Wei Chen wrote:
> > > > > >>>> From: Stefano Stabellini <sstabellini@kernel.org>
> > > > > >>>> Sent: 2021年9月24日 9:15
> > > > > >>>>
> > > > > >>>> On Thu, 23 Sep 2021, Wei Chen wrote:
> > > > > >>>>> --- a/xen/common/Kconfig
> > > > > >>>>> +++ b/xen/common/Kconfig
> > > > > >>>>> @@ -11,6 +11,16 @@ config COMPAT
> > > > > >>>>>  config CORE_PARKING
> > > > > >>>>>  	bool
> > > > > >>>>>
> > > > > >>>>> +config EFI
> > > > > >>>>> +	bool
> > > > > >>>>
> > > > > >>>> Without the title the option is not user-selectable (or de-
> > > > > selectable).
> > > > > >>>> So the help message below can never be seen.
> > > > > >>>>
> > > > > >>>> Either add a title, e.g.:
> > > > > >>>>
> > > > > >>>> bool "EFI support"
> > > > > >>>>
> > > > > >>>> Or fully make the option a silent option by removing the help
> > text.
> > > > > >>>
> > > > > >>> OK, in current Xen code, EFI is unconditionally compiled. Before
> > > > > >>> we change related code, I prefer to remove the help text.
> > > > > >>
> > > > > >> But that's not true: At least on x86, EFI gets compiled
> > > > > >> depending on tool chain capabilities. Ultimately we may indeed
> > > > > >> want a user-selectable option here, but until then I'm afraid
> > > > > >> having this option at all may be misleading on x86.
> > > > > >>
> > > > > >
> > > > > > I checked the build scripts, and yes, you're right. For x86,
> > > > > > EFI is not a selectable option in Kconfig. I agree with you; we
> > > > > > can't use the Kconfig system to decide whether to enable the EFI
> > > > > > build for x86.
> > > > > >
> > > > > > So how about we just use this EFI option for Arm only? Because
> > > > > > on Arm, we do not have such a toolchain dependency.
> > > > >
> > > > > To be honest - don't know. That's because I don't know what you want
> > > > > to use the option for subsequently.
> > > > >
> > > >
> > > > In the last version, I had introduced an arch helper to stub
> > > > EFI_BOOT in Arm's common code, because Arm32 doesn't support EFI.
> > > > So Julien suggested that I introduce a CONFIG_EFI option for
> > > > non-EFI-supported architectures to stub the EFI layer.
> > > >
> > > > [1] https://lists.xenproject.org/archives/html/xen-devel/2021-
> > > > 08/msg00808.html
> > > >
> > >
> > > As Jan reminded us, x86 doesn't depend on Kconfig to build its EFI
> > > code. So, if we use CONFIG_EFI to stub the EFI APIs for x86, we could
> > > end up with the toolchain enabling EFI while Kconfig disables it, or
> > > with Kconfig enabling EFI while the toolchain doesn't support an EFI
> > > build. In either case x86 would not work well.
> > >
> > > If we use CONFIG_EFI for Arm only, that means CONFIG_EFI for x86
> > > is off, which will also cause problems.
> > >
> > > So, can we still use the previous arch_helpers to stub for Arm32,
> > > until x86 can use this selectable option?
> > 
> > EFI doesn't necessarily have to be a user-visible option in Kconfig at
> > this point. I think Julien was just asking to make the #ifdef based on
> > an EFI-related config rather than just on CONFIG_ARM64.
> > 
> > On x86 EFI is detected based on compiler support, setting XEN_BUILD_EFI
> > in xen/arch/x86/Makefile. Let's say that we keep using the same name
> > "XEN_BUILD_EFI" on ARM as well.
> > 
> > On ARM32, XEN_BUILD_EFI should be always unset.
> > 
> > On ARM64 XEN_BUILD_EFI should be always set.
> > 
> > That's it, right? I'd argue that CONFIG_EFI or HAS_EFI are better names
> > than XEN_BUILD_EFI, but that's OK anyway. So for instance you can make
> > XEN_BUILD_EFI an invisible symbol in xen/arch/arm/Kconfig and select it
> > only on ARM64.
> 
> Thanks, this is a good approach. But if we place XEN_BUILD_EFI in
> Kconfig, it will be turned into CONFIG_XEN_BUILD_EFI. How about using
> another name in Kconfig, like ARM_EFI, but using CONFIG_ARM_EFI in
> config.h to define XEN_BUILD_EFI?

I am OK with that. Another option is to rename XEN_BUILD_EFI to
CONFIG_XEN_BUILD_EFI on x86. Either way is fine by me. Jan, do you have
a preference?
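[Editor's note: the arrangement under discussion could look roughly like the following. This is a sketch only; the ARM_EFI name is the one proposed above, and the exact file locations and symbol wiring are assumptions, not the merged implementation.]

```
# xen/arch/arm/Kconfig (sketch): an invisible symbol with no prompt,
# selected only for 64-bit Arm, where EFI boot is supported.
config ARM_EFI
	bool

config ARM_64
	def_bool y if 64BIT
	select ARM_EFI
```

With that in place, asm-arm/config.h could translate the Kconfig symbol into the existing build guard (e.g. `#ifdef CONFIG_ARM_EFI` / `#define XEN_BUILD_EFI 1`), leaving the x86 Makefile-based toolchain detection untouched.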


* Re: [PATCH 20/37] xen: introduce CONFIG_EFI to stub API for non-EFI architecture
  2021-09-28  5:01                     ` Stefano Stabellini
@ 2021-09-28  8:02                       ` Jan Beulich
  2021-10-03 23:28                         ` Wei Chen
  0 siblings, 1 reply; 192+ messages in thread
From: Jan Beulich @ 2021-09-28  8:02 UTC (permalink / raw)
  To: Stefano Stabellini, Wei Chen; +Cc: xen-devel, julien, Bertrand Marquis

On 28.09.2021 07:01, Stefano Stabellini wrote:
> On Tue, 28 Sep 2021, Wei Chen wrote:
>>> -----Original Message-----
>>> From: Stefano Stabellini <sstabellini@kernel.org>
>>> Sent: 2021年9月28日 9:00
>>> To: Wei Chen <Wei.Chen@arm.com>
>>> Cc: Jan Beulich <jbeulich@suse.com>; xen-devel@lists.xenproject.org;
>>> julien@xen.org; Bertrand Marquis <Bertrand.Marquis@arm.com>; Stefano
>>> Stabellini <sstabellini@kernel.org>
>>> Subject: RE: [PATCH 20/37] xen: introduce CONFIG_EFI to stub API for non-
>>> EFI architecture
>>>
>>> On Mon, 27 Sep 2021, Wei Chen wrote:
>>>>> [...]
>>>>
>>>> As Jan reminded us, x86 doesn't depend on Kconfig to build its EFI
>>>> code. So, if we use CONFIG_EFI to stub the EFI APIs for x86, we could
>>>> end up with the toolchain enabling EFI while Kconfig disables it, or
>>>> with Kconfig enabling EFI while the toolchain doesn't support an EFI
>>>> build. In either case x86 would not work well.
>>>>
>>>> If we use CONFIG_EFI for Arm only, that means CONFIG_EFI for x86
>>>> is off, which will also cause problems.
>>>>
>>>> So, can we still use the previous arch_helpers to stub for Arm32,
>>>> until x86 can use this selectable option?
>>>
>>> EFI doesn't have to be necessarily a user-visible option in Kconfig at
>>> this point. I think Julien was just asking to make the #ifdef based on
>>> an EFI-related config rather than just on CONFIG_ARM64.
>>>
>>> On x86 EFI is detected based on compiler support, setting XEN_BUILD_EFI
>>> in xen/arch/x86/Makefile. Let's say that we keep using the same name
>>> "XEN_BUILD_EFI" on ARM as well.
>>>
>>> On ARM32, XEN_BUILD_EFI should be always unset.
>>>
>>> On ARM64 XEN_BUILD_EFI should be always set.
>>>
>>> That's it, right? I'd argue that CONFIG_EFI or HAS_EFI are better names
>>> than XEN_BUILD_EFI, but that's OK anyway. So for instance you can make
>>> XEN_BUILD_EFI an invisible symbol in xen/arch/arm/Kconfig and select it
>>> only on ARM64.
>>
>> Thanks, this is a good approach. But if we place XEN_BUILD_EFI in Kconfig
>> it will be turned into CONFIG_XEN_BUILD_EFI. How about using another name
>> in Kconfig, like ARM_EFI, but using CONFIG_ARM_EFI in config.h to define
>> XEN_BUILD_EFI?
> 
> I am OK with that. Another option is to rename XEN_BUILD_EFI to
> CONFIG_XEN_BUILD_EFI on x86. Either way is fine by me. Jan, do you have
> a preference?

Yes, I do: No new CONFIG_* settings please that don't originate from
Kconfig. Hence I'm afraid this is a "no" to your suggestion.

Mid-term we should try to get rid of the remaining CONFIG_* which
get #define-d in e.g. asm/config.h.

Jan



^ permalink raw reply	[flat|nested] 192+ messages in thread

* RE: [PATCH 20/37] xen: introduce CONFIG_EFI to stub API for non-EFI architecture
  2021-09-28  8:02                       ` Jan Beulich
@ 2021-10-03 23:28                         ` Wei Chen
  0 siblings, 0 replies; 192+ messages in thread
From: Wei Chen @ 2021-10-03 23:28 UTC (permalink / raw)
  To: Jan Beulich, Stefano Stabellini; +Cc: xen-devel, julien, Bertrand Marquis



> -----Original Message-----
> From: Jan Beulich <jbeulich@suse.com>
> Sent: 2021年9月28日 16:03
> To: Stefano Stabellini <sstabellini@kernel.org>; Wei Chen
> <Wei.Chen@arm.com>
> Cc: xen-devel@lists.xenproject.org; julien@xen.org; Bertrand Marquis
> <Bertrand.Marquis@arm.com>
> Subject: Re: [PATCH 20/37] xen: introduce CONFIG_EFI to stub API for non-
> EFI architecture
> 
> On 28.09.2021 07:01, Stefano Stabellini wrote:
> > On Tue, 28 Sep 2021, Wei Chen wrote:
> >>> -----Original Message-----
> >>> From: Stefano Stabellini <sstabellini@kernel.org>
> >>> Sent: 2021年9月28日 9:00
> >>> To: Wei Chen <Wei.Chen@arm.com>
> >>> Cc: Jan Beulich <jbeulich@suse.com>; xen-devel@lists.xenproject.org;
> >>> julien@xen.org; Bertrand Marquis <Bertrand.Marquis@arm.com>; Stefano
> >>> Stabellini <sstabellini@kernel.org>
> >>> Subject: RE: [PATCH 20/37] xen: introduce CONFIG_EFI to stub API for
> non-
> >>> EFI architecture
> >>>
> >>> On Mon, 27 Sep 2021, Wei Chen wrote:
> >>>>> -----Original Message-----
> >>>>> From: Xen-devel <xen-devel-bounces@lists.xenproject.org> On Behalf
> Of
> >>> Wei
> >>>>> Chen
> >>>>> Sent: 2021年9月26日 18:25
> >>>>> To: Jan Beulich <jbeulich@suse.com>
> >>>>> Cc: xen-devel@lists.xenproject.org; julien@xen.org; Bertrand Marquis
> >>>>> <Bertrand.Marquis@arm.com>; Stefano Stabellini
> <sstabellini@kernel.org>
> >>>>> Subject: RE: [PATCH 20/37] xen: introduce CONFIG_EFI to stub API for
> >>> non-
> >>>>> EFI architecture
> >>>>>
> >>>>> Hi Jan,
> >>>>>
> >>>>>> -----Original Message-----
> >>>>>> From: Xen-devel <xen-devel-bounces@lists.xenproject.org> On Behalf
> >>> Of
> >>>>> Jan
> >>>>>> Beulich
> >>>>>> Sent: 2021年9月24日 18:49
> >>>>>> To: Wei Chen <Wei.Chen@arm.com>
> >>>>>> Cc: xen-devel@lists.xenproject.org; julien@xen.org; Bertrand
> Marquis
> >>>>>> <Bertrand.Marquis@arm.com>; Stefano Stabellini
> >>> <sstabellini@kernel.org>
> >>>>>> Subject: Re: [PATCH 20/37] xen: introduce CONFIG_EFI to stub API
> for
> >>>>> non-
> >>>>>> EFI architecture
> >>>>>>
> >>>>>> On 24.09.2021 12:31, Wei Chen wrote:
> >>>>>>>> From: Jan Beulich <jbeulich@suse.com>
> >>>>>>>> Sent: 2021年9月24日 15:59
> >>>>>>>>
> >>>>>>>> On 24.09.2021 06:34, Wei Chen wrote:
> >>>>>>>>>> From: Stefano Stabellini <sstabellini@kernel.org>
> >>>>>>>>>> Sent: 2021年9月24日 9:15
> >>>>>>>>>>
> >>>>>>>>>> On Thu, 23 Sep 2021, Wei Chen wrote:
> >>>>>>>>>>> --- a/xen/common/Kconfig
> >>>>>>>>>>> +++ b/xen/common/Kconfig
> >>>>>>>>>>> @@ -11,6 +11,16 @@ config COMPAT
> >>>>>>>>>>>  config CORE_PARKING
> >>>>>>>>>>>  	bool
> >>>>>>>>>>>
> >>>>>>>>>>> +config EFI
> >>>>>>>>>>> +	bool
> >>>>>>>>>>
> >>>>>>>>>> Without the title the option is not user-selectable (or
> >>>>>>>>>> de-selectable). So the help message below can never be seen.
> >>>>>>>>>>
> >>>>>>>>>> Either add a title, e.g.:
> >>>>>>>>>>
> >>>>>>>>>> bool "EFI support"
> >>>>>>>>>>
> >>>>>>>>>> Or fully make the option a silent option by removing the help
> >>>>>>>>>> text.
> >>>>>>>>>
> >>>>>>>>> OK, in current Xen code, EFI is unconditionally compiled. Before
> >>>>>>>>> we change related code, I prefer to remove the help text.
> >>>>>>>>
> >>>>>>>> But that's not true: At least on x86 EFI gets compiled depending
> >>>>>>>> on tool chain capabilities. Ultimately we may indeed want a user
> >>>>>>>> selectable option here, but until then I'm afraid having this
> >>>>>>>> option at all may be misleading on x86.
> >>>>>>>>
> >>>>>>>
> >>>>>>> I checked the build scripts; yes, you're right. For x86, EFI is
> >>>>>>> not a selectable option in Kconfig. I agree with you: we can't use
> >>>>>>> the Kconfig system to decide whether to enable the EFI build for
> >>>>>>> x86 or not.
> >>>>>>>
> >>>>>>> So how about we just use this EFI option for Arm only? Because
> >>>>>>> on Arm, we do not have such a toolchain dependency.
> >>>>>>
> >>>>>> To be honest - don't know. That's because I don't know what you
> >>>>>> want to use the option for subsequently.
> >>>>>>
> >>>>>
> >>>>> In the last version, I introduced an arch helper to stub EFI_BOOT
> >>>>> in Arm's common code for Arm32, because Arm32 doesn't support EFI.
> >>>>> So Julien suggested that I introduce a CONFIG_EFI option for
> >>>>> architectures without EFI support to stub out the EFI layer.
> >>>>>
> >>>>> [1] https://lists.xenproject.org/archives/html/xen-devel/2021-
> >>>>> 08/msg00808.html
> >>>>>
> >>>>
> >>>> As Jan reminded, x86 doesn't depend on Kconfig to build EFI code.
> >>>> So, if we use CONFIG_EFI to stub the EFI APIs for x86, we may end
> >>>> up with the toolchain enabling EFI but Kconfig disabling it, or
> >>>> Kconfig enabling EFI while the toolchain doesn't provide EFI build
> >>>> support. Then x86 would not work well.
> >>>>
> >>>> If we use CONFIG_EFI for Arm only, that means CONFIG_EFI for x86
> >>>> is off, which will also cause problems.
> >>>>
> >>>> So, can we still use the previous arch helpers to stub for Arm32,
> >>>> until x86 can use this selectable option?
> >>>
> >>> EFI doesn't have to be necessarily a user-visible option in Kconfig at
> >>> this point. I think Julien was just asking to make the #ifdef based
> >>> on an EFI-related config rather than just on CONFIG_ARM64.
> >>>
> >>> On x86 EFI is detected based on compiler support, setting
> >>> XEN_BUILD_EFI in xen/arch/x86/Makefile. Let's say that we keep using
> >>> the same name "XEN_BUILD_EFI" on ARM as well.
> >>>
> >>> On ARM32, XEN_BUILD_EFI should be always unset.
> >>>
> >>> On ARM64 XEN_BUILD_EFI should be always set.
> >>>
> >>> That's it, right? I'd argue that CONFIG_EFI or HAS_EFI are better
> >>> names than XEN_BUILD_EFI, but that's OK anyway. So for instance you
> >>> can make XEN_BUILD_EFI an invisible symbol in xen/arch/arm/Kconfig
> >>> and select it only on ARM64.
> >>
> >> Thanks, this is a good approach. But if we place XEN_BUILD_EFI in
> >> Kconfig it will be turned into CONFIG_XEN_BUILD_EFI. How about using
> >> another name in Kconfig, like ARM_EFI, but using CONFIG_ARM_EFI in
> >> config.h to define XEN_BUILD_EFI?
> >
> > I am OK with that. Another option is to rename XEN_BUILD_EFI to
> > CONFIG_XEN_BUILD_EFI on x86. Either way is fine by me. Jan, do you
> > have a preference?
> 
> Yes, I do: No new CONFIG_* settings please that don't originate from
> Kconfig. Hence I'm afraid this is a "no" to your suggestion.
> 
> Mid-term we should try to get rid of the remaining CONFIG_* which
> get #define-d in e.g. asm/config.h.
> 

I will do something like this:
 - introduce an invisible ARM_EFI symbol in Kconfig, selected by ARM64 only
 - use CONFIG_ARM_EFI to define XEN_BUILD_EFI in config.h
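Sketched in Kconfig terms, the plan above might look roughly like this (an illustration only; the exact file placement and the ARM_64 symbol spelling are assumptions based on the discussion, not the final patch):

```kconfig
# xen/arch/arm/Kconfig (sketch): with no prompt string, ARM_EFI is an
# invisible symbol that users cannot toggle; arm64 selects it always.
config ARM_EFI
	bool

config ARM_64
	def_bool y
	select ARM_EFI
```

Then `xen/include/asm-arm/config.h` could carry `#ifdef CONFIG_ARM_EFI` / `#define XEN_BUILD_EFI 1` / `#endif`, so existing code guarded by XEN_BUILD_EFI keeps working unchanged.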

> Jan



* Re: [PATCH 04/37] xen: introduce an arch helper for default dma zone status
  2021-09-23 12:02 ` [PATCH 04/37] xen: introduce an arch helper for default dma zone status Wei Chen
  2021-09-23 23:55   ` Stefano Stabellini
@ 2022-01-17 16:10   ` Jan Beulich
  2022-01-18  7:51     ` Wei Chen
  1 sibling, 1 reply; 192+ messages in thread
From: Jan Beulich @ 2022-01-17 16:10 UTC (permalink / raw)
  To: Wei Chen; +Cc: Bertrand.Marquis, xen-devel, sstabellini, julien

I realize this series has been pending for a long time, but I don't
recall any indication that it would have been dropped. Hence as a
first try, a few comments on this relatively simple change. I'm
sorry it has taken so long to get to it.

On 23.09.2021 14:02, Wei Chen wrote:
> In the current code, when Xen is running on a multi-node NUMA
> system, it will set dma_bitsize in end_boot_allocator to reserve
> some low-address memory for DMA.
> 
> The current implementation carries some x86 implications, because
> on x86 memory starts from 0. On a multi-node NUMA system, a single
> node may contain the majority or all of the DMA memory, and x86
> prefers to give out memory from non-local allocations rather than
> exhausting the DMA memory ranges. Hence x86 uses dma_bitsize to set
> aside some largely arbitrary amount of memory for DMA memory ranges.
> Allocations from these ranges would happen only after exhausting
> all other nodes' memory.
> 
> But these implications are not shared across all architectures; Arm,
> for example, doesn't have them. So in this patch, we introduce an
> arch_have_default_dmazone helper for each arch to determine whether
> it needs to set dma_bitsize to reserve memory for DMA allocations.

How would Arm guarantee availability of memory below a certain
boundary for limited-capability devices? Or is there no need
because there's an assumption that I/O for such devices would
always pass through an IOMMU, lifting address size restrictions?
(I guess in a !PV build on x86 we could also get rid of such a
reservation.)

> --- a/xen/arch/x86/numa.c
> +++ b/xen/arch/x86/numa.c
> @@ -371,6 +371,11 @@ unsigned int __init arch_get_dma_bitsize(void)
>                   + PAGE_SHIFT, 32);
>  }
>  
> +unsigned int arch_have_default_dmazone(void)
> +{
> +    return ( num_online_nodes() > 1 ) ? 1 : 0;
> +}

According to the expression and ...

> --- a/xen/common/page_alloc.c
> +++ b/xen/common/page_alloc.c
> @@ -1889,7 +1889,7 @@ void __init end_boot_allocator(void)
>      }
>      nr_bootmem_regions = 0;
>  
> -    if ( !dma_bitsize && (num_online_nodes() > 1) )
> +    if ( !dma_bitsize && arch_have_default_dmazone() )
>          dma_bitsize = arch_get_dma_bitsize();

... the use site, you mean the function to return boolean. Please
indicate so by making it have a return type of "bool". Independent
of that you don't need a conditional expression above, nor
(malformed) use of parentheses. I further wonder whether ...

> --- a/xen/include/asm-arm/numa.h
> +++ b/xen/include/asm-arm/numa.h
> @@ -25,6 +25,11 @@ extern mfn_t first_valid_mfn;
>  #define node_start_pfn(nid) (mfn_x(first_valid_mfn))
>  #define __node_distance(a, b) (20)
>  
> +static inline unsigned int arch_have_default_dmazone(void)
> +{
> +    return 0;
> +}

... like this one, x86'es couldn't be inline as well. If indeed
it can't be, making it a macro may still be better (and avoid a
further comment regarding the lack of __init).

Jan




* Re: [PATCH 19/37] xen/x86: promote VIRTUAL_BUG_ON to ASSERT in
  2021-09-23 12:02 ` [PATCH 19/37] xen/x86: promote VIRTUAL_BUG_ON to ASSERT in Wei Chen
@ 2022-01-17 16:21   ` Jan Beulich
  2022-01-18  7:52     ` Wei Chen
  0 siblings, 1 reply; 192+ messages in thread
From: Jan Beulich @ 2022-01-17 16:21 UTC (permalink / raw)
  To: Wei Chen; +Cc: Bertrand.Marquis, xen-devel, sstabellini, julien

On 23.09.2021 14:02, Wei Chen wrote:
> The VIRTUAL_BUG_ON used in phys_to_nid is an empty macro. This
> results in the two lines of error-checking code in phys_to_nid not
> actually working. It also covers up two compilation errors:
> 1. error: ‘MAX_NUMNODES’ undeclared (first use in this function).
>    This is because MAX_NUMNODES is defined in xen/numa.h, but
>    asm/numa.h is a dependency of xen/numa.h, so we can't
>    include xen/numa.h in asm/numa.h. This error has been fixed
>    after we moved phys_to_nid to xen/numa.h.

This could easily be taken care of by moving MAX_NUMNODES up ahead of
the asm/numa.h inclusion point. And then the change here would become
independent of the rest of the series (and could hence go in early).

> 2. error: wrong type argument to unary exclamation mark.
>    This is because the error-checking code contains !node_data[nid],
>    but node_data is a structure variable, not a pointer.
> 
> So, in this patch, we use ASSERT in VIRTUAL_BUG_ON to enable the two
> lines of error-checking code.

May I suggest to drop VIRTUAL_BUG_ON() and instead use ASSERT()
directly?
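Concretely, that suggestion would turn phys_to_nid into something like the following sketch. The table sizes, the memnode_shift value, and mapping Xen's ASSERT() onto the standard assert() are illustrative assumptions made to keep the example self-contained, not Xen's actual definitions.

```c
#include <assert.h>

#define ASSERT(p)       assert(p)   /* stand-in for Xen's ASSERT() */
#define MAX_NUMNODES    64
#define MEMNODEMAP_SIZE 256

typedef unsigned char nodeid_t;

/* Simplified stand-ins for the tables Xen builds during NUMA init. */
static nodeid_t memnodemap[MEMNODEMAP_SIZE]; /* all zero: node 0 */
static unsigned int memnode_shift = 20;      /* 1 MiB granularity */

/* VIRTUAL_BUG_ON() dropped; the checks become plain ASSERT()s that
 * actually fire in debug builds. */
static nodeid_t phys_to_nid(unsigned long paddr)
{
    nodeid_t nid;

    ASSERT((paddr >> memnode_shift) < MEMNODEMAP_SIZE);
    nid = memnodemap[paddr >> memnode_shift];
    ASSERT(nid < MAX_NUMNODES);
    return nid;
}
```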

Jan




* RE: [PATCH 04/37] xen: introduce an arch helper for default dma zone status
  2022-01-17 16:10   ` Jan Beulich
@ 2022-01-18  7:51     ` Wei Chen
  2022-01-18  8:16       ` Jan Beulich
  0 siblings, 1 reply; 192+ messages in thread
From: Wei Chen @ 2022-01-18  7:51 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Bertrand Marquis, xen-devel, sstabellini, julien

Hi Jan,

> -----Original Message-----
> From: Jan Beulich <jbeulich@suse.com>
> Sent: 2022年1月18日 0:11
> To: Wei Chen <Wei.Chen@arm.com>
> Cc: Bertrand Marquis <Bertrand.Marquis@arm.com>; xen-
> devel@lists.xenproject.org; sstabellini@kernel.org; julien@xen.org
> Subject: Re: [PATCH 04/37] xen: introduce an arch helper for default dma
> zone status
> 
> I realize this series has been pending for a long time, but I don't
> recall any indication that it would have been dropped. Hence as a
> first try, a few comments on this relatively simple change. I'm
> sorry it has taken so long to get to it.
> 

Thanks for reviewing this series and picking it up again. We are
still working on it and will send a new version soon.

> On 23.09.2021 14:02, Wei Chen wrote:
> > In the current code, when Xen is running on a multi-node NUMA
> > system, it will set dma_bitsize in end_boot_allocator to reserve
> > some low-address memory for DMA.
> >
> > The current implementation carries some x86 implications, because
> > on x86 memory starts from 0. On a multi-node NUMA system, a single
> > node may contain the majority or all of the DMA memory, and x86
> > prefers to give out memory from non-local allocations rather than
> > exhausting the DMA memory ranges. Hence x86 uses dma_bitsize to set
> > aside some largely arbitrary amount of memory for DMA memory ranges.
> > Allocations from these ranges would happen only after exhausting
> > all other nodes' memory.
> >
> > But these implications are not shared across all architectures; Arm,
> > for example, doesn't have them. So in this patch, we introduce an
> > arch_have_default_dmazone helper for each arch to determine whether
> > it needs to set dma_bitsize to reserve memory for DMA allocations.
> 
> How would Arm guarantee availability of memory below a certain
> boundary for limited-capability devices? Or is there no need
> because there's an assumption that I/O for such devices would
> always pass through an IOMMU, lifting address size restrictions?
> (I guess in a !PV build on x86 we could also get rid of such a
> reservation.)

On Arm, we can still have some devices with limited DMA capability,
and we don't force all such devices to use an IOMMU. These devices
affect dma_bitsize: the RPi platform, for example, sets its dma_bitsize
to 30. But on a multi-node NUMA system, Arm doesn't have a default
DMA zone; having multiple nodes is not a constraint on dma_bitsize.
Some previous discussion can be found here [1].
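As a side note on what dma_bitsize expresses: it is a bit width, so the RPi value of 30 mentioned above corresponds to a 1 GiB boundary, i.e. the DMA-capable range sits below 1 << dma_bitsize. A trivial illustration (the dma_limit helper is hypothetical, written only to show the arithmetic):

```c
#include <assert.h>
#include <stdint.h>

/* Upper bound (exclusive) of the address range implied by dma_bitsize. */
static uint64_t dma_limit(unsigned int dma_bitsize)
{
    return UINT64_C(1) << dma_bitsize;
}
```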

> 
> > --- a/xen/arch/x86/numa.c
> > +++ b/xen/arch/x86/numa.c
> > @@ -371,6 +371,11 @@ unsigned int __init arch_get_dma_bitsize(void)
> >                   + PAGE_SHIFT, 32);
> >  }
> >
> > +unsigned int arch_have_default_dmazone(void)
> > +{
> > +    return ( num_online_nodes() > 1 ) ? 1 : 0;
> > +}
> 
> According to the expression and ...
> 
> > --- a/xen/common/page_alloc.c
> > +++ b/xen/common/page_alloc.c
> > @@ -1889,7 +1889,7 @@ void __init end_boot_allocator(void)
> >      }
> >      nr_bootmem_regions = 0;
> >
> > -    if ( !dma_bitsize && (num_online_nodes() > 1) )
> > +    if ( !dma_bitsize && arch_have_default_dmazone() )
> >          dma_bitsize = arch_get_dma_bitsize();
> 
> ... the use site, you mean the function to return boolean. Please
> indicate so by making it have a return type of "bool". Independent
> of that you don't need a conditional expression above, nor
> (malformed) use of parentheses. I further wonder whether ...
> 

I will fix them in the next version. But I am not very clear about
this comment: "you don't need a conditional expression above".
Does the "above" refer to this line:
"return ( num_online_nodes() > 1 ) ? 1 : 0;"?

> > --- a/xen/include/asm-arm/numa.h
> > +++ b/xen/include/asm-arm/numa.h
> > @@ -25,6 +25,11 @@ extern mfn_t first_valid_mfn;
> >  #define node_start_pfn(nid) (mfn_x(first_valid_mfn))
> >  #define __node_distance(a, b) (20)
> >
> > +static inline unsigned int arch_have_default_dmazone(void)
> > +{
> > +    return 0;
> > +}
> 
> ... like this one, x86'es couldn't be inline as well. If indeed
> it can't be, making it a macro may still be better (and avoid a
> further comment regarding the lack of __init).

OK, that would be better; I will do it in the next version.

> 
> Jan

[1] https://lists.xenproject.org/archives/html/xen-devel/2021-08/msg00772.html




* RE: [PATCH 19/37] xen/x86: promote VIRTUAL_BUG_ON to ASSERT in
  2022-01-17 16:21   ` Jan Beulich
@ 2022-01-18  7:52     ` Wei Chen
  0 siblings, 0 replies; 192+ messages in thread
From: Wei Chen @ 2022-01-18  7:52 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Bertrand Marquis, xen-devel, sstabellini, julien

Hi Jan,

> -----Original Message-----
> From: Jan Beulich <jbeulich@suse.com>
> Sent: 2022年1月18日 0:22
> To: Wei Chen <Wei.Chen@arm.com>
> Cc: Bertrand Marquis <Bertrand.Marquis@arm.com>; xen-
> devel@lists.xenproject.org; sstabellini@kernel.org; julien@xen.org
> Subject: Re: [PATCH 19/37] xen/x86: promote VIRTUAL_BUG_ON to ASSERT in
> 
> On 23.09.2021 14:02, Wei Chen wrote:
> > The VIRTUAL_BUG_ON used in phys_to_nid is an empty macro. This
> > results in the two lines of error-checking code in phys_to_nid not
> > actually working. It also covers up two compilation errors:
> > 1. error: ‘MAX_NUMNODES’ undeclared (first use in this function).
> >    This is because MAX_NUMNODES is defined in xen/numa.h, but
> >    asm/numa.h is a dependency of xen/numa.h, so we can't
> >    include xen/numa.h in asm/numa.h. This error has been fixed
> >    after we moved phys_to_nid to xen/numa.h.
> 
> This could easily be taken care of by moving MAX_NUMNODES up ahead of
> the asm/numa.h inclusion point. And then the change here would become
> independent of the rest of the series (and could hence go in early).
> 
> > 2. error: wrong type argument to unary exclamation mark.
> >    This is because the error-checking code contains !node_data[nid],
> >    but node_data is a structure variable, not a pointer.
> >
> > So, in this patch, we use ASSERT in VIRTUAL_BUG_ON to enable the two
> > lines of error-checking code.
> 
> May I suggest to drop VIRTUAL_BUG_ON() and instead use ASSERT()
> directly?
> 

Sure!

> Jan



* Re: [PATCH 04/37] xen: introduce an arch helper for default dma zone status
  2022-01-18  7:51     ` Wei Chen
@ 2022-01-18  8:16       ` Jan Beulich
  2022-01-18  9:20         ` Wei Chen
  0 siblings, 1 reply; 192+ messages in thread
From: Jan Beulich @ 2022-01-18  8:16 UTC (permalink / raw)
  To: Wei Chen; +Cc: Bertrand Marquis, xen-devel, sstabellini, julien

On 18.01.2022 08:51, Wei Chen wrote:
>> -----Original Message-----
>> From: Jan Beulich <jbeulich@suse.com>
>> Sent: 2022年1月18日 0:11
>> On 23.09.2021 14:02, Wei Chen wrote:
>>> In the current code, when Xen is running on a multi-node NUMA
>>> system, it will set dma_bitsize in end_boot_allocator to reserve
>>> some low-address memory for DMA.
>>>
>>> The current implementation carries some x86 implications, because
>>> on x86 memory starts from 0. On a multi-node NUMA system, a single
>>> node may contain the majority or all of the DMA memory, and x86
>>> prefers to give out memory from non-local allocations rather than
>>> exhausting the DMA memory ranges. Hence x86 uses dma_bitsize to set
>>> aside some largely arbitrary amount of memory for DMA memory ranges.
>>> Allocations from these ranges would happen only after exhausting
>>> all other nodes' memory.
>>>
>>> But these implications are not shared across all architectures; Arm,
>>> for example, doesn't have them. So in this patch, we introduce an
>>> arch_have_default_dmazone helper for each arch to determine whether
>>> it needs to set dma_bitsize to reserve memory for DMA allocations.
>>
>> How would Arm guarantee availability of memory below a certain
>> boundary for limited-capability devices? Or is there no need
>> because there's an assumption that I/O for such devices would
>> always pass through an IOMMU, lifting address size restrictions?
>> (I guess in a !PV build on x86 we could also get rid of such a
>> reservation.)
> 
> On Arm, we can still have some devices with limited DMA capability,
> and we don't force all such devices to use an IOMMU. These devices
> affect dma_bitsize: the RPi platform, for example, sets its dma_bitsize
> to 30. But on a multi-node NUMA system, Arm doesn't have a default
> DMA zone; having multiple nodes is not a constraint on dma_bitsize.
> Some previous discussion can be found here [1].

I'm afraid that doesn't give me more clues. For example, in the mail
being replied to there I find "That means, only first 4GB memory can
be used for DMA." Yet that's not an implication from setting
dma_bitsize. DMA is fine to occur to any address. The special address
range is being held back in case in particular Dom0 is in need of such
a range to perform I/O to _some_ devices.

>>> --- a/xen/arch/x86/numa.c
>>> +++ b/xen/arch/x86/numa.c
>>> @@ -371,6 +371,11 @@ unsigned int __init arch_get_dma_bitsize(void)
>>>                   + PAGE_SHIFT, 32);
>>>  }
>>>
>>> +unsigned int arch_have_default_dmazone(void)
>>> +{
>>> +    return ( num_online_nodes() > 1 ) ? 1 : 0;
>>> +}
>>
>> According to the expression and ...
>>
>>> --- a/xen/common/page_alloc.c
>>> +++ b/xen/common/page_alloc.c
>>> @@ -1889,7 +1889,7 @@ void __init end_boot_allocator(void)
>>>      }
>>>      nr_bootmem_regions = 0;
>>>
>>> -    if ( !dma_bitsize && (num_online_nodes() > 1) )
>>> +    if ( !dma_bitsize && arch_have_default_dmazone() )
>>>          dma_bitsize = arch_get_dma_bitsize();
>>
>> ... the use site, you mean the function to return boolean. Please
>> indicate so by making it have a return type of "bool". Independent
>> of that you don't need a conditional expression above, nor
>> (malformed) use of parentheses. I further wonder whether ...
>>
> 
> I will fix them in the next version. But I am not very clear about
> this comment: "you don't need a conditional expression above".
> Does the "above" refer to this line:
> "return ( num_online_nodes() > 1 ) ? 1 : 0;"?

Yes. Even without the use of bool such an expression is a more
complicated form of

    return num_online_nodes() > 1;

where we'd prefer to use the simpler variant for being easier to
read / follow.

Jan

> [1] https://lists.xenproject.org/archives/html/xen-devel/2021-08/msg00772.html
> 
> 




* RE: [PATCH 04/37] xen: introduce an arch helper for default dma zone status
  2022-01-18  8:16       ` Jan Beulich
@ 2022-01-18  9:20         ` Wei Chen
  2022-01-18 14:16           ` Jan Beulich
  0 siblings, 1 reply; 192+ messages in thread
From: Wei Chen @ 2022-01-18  9:20 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Bertrand Marquis, xen-devel, sstabellini, julien

Hi Jan,

> -----Original Message-----
> From: Jan Beulich <jbeulich@suse.com>
> Sent: 2022年1月18日 16:16
> To: Wei Chen <Wei.Chen@arm.com>
> Cc: Bertrand Marquis <Bertrand.Marquis@arm.com>; xen-
> devel@lists.xenproject.org; sstabellini@kernel.org; julien@xen.org
> Subject: Re: [PATCH 04/37] xen: introduce an arch helper for default dma
> zone status
> 
> On 18.01.2022 08:51, Wei Chen wrote:
> >> -----Original Message-----
> >> From: Jan Beulich <jbeulich@suse.com>
> >> Sent: 2022年1月18日 0:11
> >> On 23.09.2021 14:02, Wei Chen wrote:
> >>> In the current code, when Xen is running on a multi-node NUMA
> >>> system, it will set dma_bitsize in end_boot_allocator to reserve
> >>> some low-address memory for DMA.
> >>>
> >>> The current implementation carries some x86 implications, because
> >>> on x86 memory starts from 0. On a multi-node NUMA system, a single
> >>> node may contain the majority or all of the DMA memory, and x86
> >>> prefers to give out memory from non-local allocations rather than
> >>> exhausting the DMA memory ranges. Hence x86 uses dma_bitsize to set
> >>> aside some largely arbitrary amount of memory for DMA memory ranges.
> >>> Allocations from these ranges would happen only after exhausting
> >>> all other nodes' memory.
> >>>
> >>> But these implications are not shared across all architectures; Arm,
> >>> for example, doesn't have them. So in this patch, we introduce an
> >>> arch_have_default_dmazone helper for each arch to determine whether
> >>> it needs to set dma_bitsize to reserve memory for DMA allocations.
> >>
> >> How would Arm guarantee availability of memory below a certain
> >> boundary for limited-capability devices? Or is there no need
> >> because there's an assumption that I/O for such devices would
> >> always pass through an IOMMU, lifting address size restrictions?
> >> (I guess in a !PV build on x86 we could also get rid of such a
> >> reservation.)
> >
> > On Arm, we can still have some devices with limited DMA capability,
> > and we don't force all such devices to use an IOMMU. These devices
> > affect dma_bitsize: the RPi platform, for example, sets its dma_bitsize
> > to 30. But on a multi-node NUMA system, Arm doesn't have a default
> > DMA zone; having multiple nodes is not a constraint on dma_bitsize.
> > Some previous discussion can be found here [1].
> 
> I'm afraid that doesn't give me more clues. For example, in the mail
> being replied to there I find "That means, only first 4GB memory can
> be used for DMA." Yet that's not an implication from setting
> dma_bitsize. DMA is fine to occur to any address. The special address
> range is being held back in case in particular Dom0 is in need of such
> a range to perform I/O to _some_ devices.

I am sorry that my last reply didn't give you more clues. On Arm, only
Dom0 can do DMA without an IOMMU. So when we allocate memory for Dom0,
we try to allocate memory under 4GB, or in the range that dma_bitsize
indicates. I think these operations match your description of Dom0's
special address range above. As we have already allocated memory for
DMA, I think we don't need a DMA zone in page allocation. I am not sure
whether that answers your earlier question.

> 
> >>> --- a/xen/arch/x86/numa.c
> >>> +++ b/xen/arch/x86/numa.c
> >>> @@ -371,6 +371,11 @@ unsigned int __init arch_get_dma_bitsize(void)
> >>>                   + PAGE_SHIFT, 32);
> >>>  }
> >>>
> >>> +unsigned int arch_have_default_dmazone(void)
> >>> +{
> >>> +    return ( num_online_nodes() > 1 ) ? 1 : 0;
> >>> +}
> >>
> >> According to the expression and ...
> >>
> >>> --- a/xen/common/page_alloc.c
> >>> +++ b/xen/common/page_alloc.c
> >>> @@ -1889,7 +1889,7 @@ void __init end_boot_allocator(void)
> >>>      }
> >>>      nr_bootmem_regions = 0;
> >>>
> >>> -    if ( !dma_bitsize && (num_online_nodes() > 1) )
> >>> +    if ( !dma_bitsize && arch_have_default_dmazone() )
> >>>          dma_bitsize = arch_get_dma_bitsize();
> >>
> >> ... the use site, you mean the function to return boolean. Please
> >> indicate so by making it have a return type of "bool". Independent
> >> of that you don't need a conditional expression above, nor
> >> (malformed) use of parentheses. I further wonder whether ...
> >>
> >
> > I will fix them in the next version. But I am not very clear about
> > this comment: "you don't need a conditional expression above".
> > Does the "above" refer to this line:
> > "return ( num_online_nodes() > 1 ) ? 1 : 0;"?
> 
> Yes. Even without the use of bool such an expression is a more
> complicated form of
> 
>     return num_online_nodes() > 1;
> 
> where we'd prefer to use the simpler variant for being easier to
> read / follow.
> 

Thanks for clarification, I will fix it. 

> Jan
> 
> > [1] https://lists.xenproject.org/archives/html/xen-devel/2021-08/msg00772.html
> >
> >



* Re: [PATCH 04/37] xen: introduce an arch helper for default dma zone status
  2022-01-18  9:20         ` Wei Chen
@ 2022-01-18 14:16           ` Jan Beulich
  2022-01-19  2:49             ` Wei Chen
  0 siblings, 1 reply; 192+ messages in thread
From: Jan Beulich @ 2022-01-18 14:16 UTC (permalink / raw)
  To: Wei Chen; +Cc: Bertrand Marquis, xen-devel, sstabellini, julien

On 18.01.2022 10:20, Wei Chen wrote:
>> From: Jan Beulich <jbeulich@suse.com>
>> Sent: 2022年1月18日 16:16
>>
>> On 18.01.2022 08:51, Wei Chen wrote:
>>>> From: Jan Beulich <jbeulich@suse.com>
>>>> Sent: 2022年1月18日 0:11
>>>> On 23.09.2021 14:02, Wei Chen wrote:
>>>>> In current code, when Xen is running in a multiple nodes NUMA
>>>>> system, it will set dma_bitsize in end_boot_allocator to reserve
>>>>> some low address memory for DMA.
>>>>>
>>>>> There are some x86 implications in current implementation. Becuase
>>>>> on x86, memory starts from 0. On a multiple nodes NUMA system, if
>>>>> a single node contains the majority or all of the DMA memory. x86
>>>>> prefer to give out memory from non-local allocations rather than
>>>>> exhausting the DMA memory ranges. Hence x86 use dma_bitsize to set
>>>>> aside some largely arbitrary amount memory for DMA memory ranges.
>>>>> The allocations from these memory ranges would happen only after
>>>>> exhausting all other nodes' memory.
>>>>>
>>>>> But the implications are not shared across all architectures. For
>>>>> example, Arm doesn't have these implications. So in this patch, we
>>>>> introduce an arch_have_default_dmazone helper for arch to determine
>>>>> that it need to set dma_bitsize for reserve DMA allocations or not.
>>>>
>>>> How would Arm guarantee availability of memory below a certain
>>>> boundary for limited-capability devices? Or is there no need
>>>> because there's an assumption that I/O for such devices would
>>>> always pass through an IOMMU, lifting address size restrictions?
>>>> (I guess in a !PV build on x86 we could also get rid of such a
>>>> reservation.)
>>>
>>> On Arm, we still can have some devices with limited DMA capability.
>>> And we also don't force all such devices to use an IOMMU. These devices
>>> will affect the dma_bitsize. For example, the RPi platform sets its
>>> dma_bitsize to 30. But on a multi-node NUMA system, Arm doesn't have a
>>> default DMA zone. Having multiple nodes is not a constraint on
>>> dma_bitsize. And some previous discussions can be found here [1].
>>
>> I'm afraid that doesn't give me more clues. For example, in the mail
>> being replied to there I find "That means, only first 4GB memory can
>> be used for DMA." Yet that's not an implication from setting
>> dma_bitsize. DMA is fine to occur to any address. The special address
>> range is being held back in case in particular Dom0 is in need of such
>> a range to perform I/O to _some_ devices.
> 
> I am sorry that my last reply hasn't given you more clues. On Arm, only
> Dom0 can have DMA without an IOMMU. So when we allocate memory for Dom0,
> we're trying to allocate memory under 4GB or in the range indicated by
> dma_bitsize. I think these operations match your Dom0 special address
> range description above. As we have already allocated memory for DMA, I
> think we don't need a DMA zone in page allocation. I am not sure whether
> that answers your earlier question?

I view all of this as flawed, or as a workaround at best. Xen shouldn't
make assumptions on what Dom0 may need. Instead Dom0 should make
arrangements such that it can do I/O to/from all devices of interest.
This may involve arranging for address restricted buffers. And for this
to be possible, Xen would need to have available some suitable memory.
I understand this is complicated by the fact that despite being HVM-like,
due to the lack of an IOMMU in front of certain devices address
restrictions on Dom0 address space alone (i.e. without any Xen
involvement) won't help ...

Jan



^ permalink raw reply	[flat|nested] 192+ messages in thread

* Re: [PATCH 07/37] xen/x86: use paddr_t for addresses in NUMA node structure
  2021-09-23 12:02 ` [PATCH 07/37] xen/x86: use paddr_t for addresses in NUMA node structure Wei Chen
  2021-09-24  0:11   ` Stefano Stabellini
@ 2022-01-18 15:22   ` Jan Beulich
  2022-01-19  6:33     ` Wei Chen
  1 sibling, 1 reply; 192+ messages in thread
From: Jan Beulich @ 2022-01-18 15:22 UTC (permalink / raw)
  To: Wei Chen; +Cc: Bertrand.Marquis, xen-devel, sstabellini, julien

On 23.09.2021 14:02, Wei Chen wrote:
> @@ -201,11 +201,12 @@ void __init numa_init_array(void)
>  static int numa_fake __initdata = 0;
>  
>  /* Numa emulation */
> -static int __init numa_emulation(u64 start_pfn, u64 end_pfn)
> +static int __init numa_emulation(unsigned long start_pfn,
> +                                 unsigned long end_pfn)
>  {
>      int i;
>      struct node nodes[MAX_NUMNODES];
> -    u64 sz = ((end_pfn - start_pfn)<<PAGE_SHIFT) / numa_fake;
> +    u64 sz = pfn_to_paddr(end_pfn - start_pfn) / numa_fake;

Nit: Please convert to uint64_t (and alike) whenever you touch a line
anyway that uses being-phased-out types.

> @@ -249,24 +250,26 @@ static int __init numa_emulation(u64 start_pfn, u64 end_pfn)
>  void __init numa_initmem_init(unsigned long start_pfn, unsigned long end_pfn)
>  { 
>      int i;
> +    paddr_t start, end;
>  
>  #ifdef CONFIG_NUMA_EMU
>      if ( numa_fake && !numa_emulation(start_pfn, end_pfn) )
>          return;
>  #endif
>  
> +    start = pfn_to_paddr(start_pfn);
> +    end = pfn_to_paddr(end_pfn);

Nit: Would be slightly neater if these were the initializers of the
variables.

>  #ifdef CONFIG_ACPI_NUMA
> -    if ( !numa_off && !acpi_scan_nodes((u64)start_pfn << PAGE_SHIFT,
> -         (u64)end_pfn << PAGE_SHIFT) )
> +    if ( !numa_off && !acpi_scan_nodes(start, end) )
>          return;
>  #endif
>  
>      printk(KERN_INFO "%s\n",
>             numa_off ? "NUMA turned off" : "No NUMA configuration found");
>  
> -    printk(KERN_INFO "Faking a node at %016"PRIx64"-%016"PRIx64"\n",
> -           (u64)start_pfn << PAGE_SHIFT,
> -           (u64)end_pfn << PAGE_SHIFT);
> +    printk(KERN_INFO "Faking a node at %016"PRIpaddr"-%016"PRIpaddr"\n",
> +           start, end);

When switching to PRIpaddr I suppose you did look up what that one
expands to? IOW - please drop the 016 from here.

> @@ -441,7 +441,7 @@ void __init srat_parse_regions(u64 addr)
>  	    acpi_table_parse(ACPI_SIG_SRAT, acpi_parse_srat))
>  		return;
>  
> -	srat_region_mask = pdx_init_mask(addr);
> +	srat_region_mask = pdx_init_mask((u64)addr);

I don't see the need for a cast here.

> @@ -489,7 +489,7 @@ int __init acpi_scan_nodes(u64 start, u64 end)
>  	/* Finally register nodes */
>  	for_each_node_mask(i, all_nodes_parsed)
>  	{
> -		u64 size = nodes[i].end - nodes[i].start;
> +		paddr_t size = nodes[i].end - nodes[i].start;
>  		if ( size == 0 )

Please take the opportunity and add the missing blank line between
declarations and statements.

> --- a/xen/include/asm-x86/numa.h
> +++ b/xen/include/asm-x86/numa.h
> @@ -16,7 +16,7 @@ extern cpumask_t     node_to_cpumask[];
>  #define node_to_cpumask(node)    (node_to_cpumask[node])
>  
>  struct node { 
> -	u64 start,end; 
> +	paddr_t start,end;

Please take the opportunity and add the missing blank after the comma.

Jan



^ permalink raw reply	[flat|nested] 192+ messages in thread

* Re: [PATCH 08/37] xen/x86: add detection of discontinous node memory range
  2021-09-23 12:02 ` [PATCH 08/37] xen/x86: add detection of discontinous node memory range Wei Chen
  2021-09-24  0:25   ` Stefano Stabellini
@ 2022-01-18 16:13   ` Jan Beulich
  2022-01-19  7:33     ` Wei Chen
  1 sibling, 1 reply; 192+ messages in thread
From: Jan Beulich @ 2022-01-18 16:13 UTC (permalink / raw)
  To: Wei Chen; +Cc: Bertrand.Marquis, xen-devel, sstabellini, julien

On 23.09.2021 14:02, Wei Chen wrote:
> One NUMA node may contain several memory blocks. In the current Xen
> code, Xen maintains a node memory range for each node to cover
> all its memory blocks. But here comes the problem: if, in the gap
> between two of one node's memory blocks, there are memory blocks that
> don't belong to this node (remote memory blocks), this node's memory
> range will be expanded to cover these remote memory blocks.
>
> Having one node's memory range contain other nodes' memory is obviously
> not reasonable. This means the current NUMA code can only support
> nodes with contiguous memory blocks. However, on a physical machine, the
> addresses of multiple nodes can be interleaved.
>
> So in this patch, we add code to detect discontinuous memory blocks
> for one node. NUMA initialization will fail and error messages
> will be printed when Xen detects such a hardware configuration.

Luckily what you actually check for isn't as strict as "discontinuous":
What you're after is no interleaving of memory. A single node can still
have multiple discontiguous ranges (and that'll often be the case on
x86). Please adjust description and function name accordingly.

> --- a/xen/arch/x86/srat.c
> +++ b/xen/arch/x86/srat.c
> @@ -271,6 +271,36 @@ acpi_numa_processor_affinity_init(const struct acpi_srat_cpu_affinity *pa)
>  		       pxm, pa->apic_id, node);
>  }
>  
> +/*
> + * Check to see if there are other nodes within this node's range.
> + * We just need to check full contains situation. Because overlaps
> + * have been checked before by conflicting_memblks.
> + */
> +static bool __init is_node_memory_continuous(nodeid_t nid,
> +    paddr_t start, paddr_t end)

This indentation style demands indenting like ...

> +{
> +	nodeid_t i;

... variable declarations at function scope, i.e. in a Linux-style
file by a tab.

> +
> +	struct node *nd = &nodes[nid];
> +	for_each_node_mask(i, memory_nodes_parsed)

Please move the blank line to be between declarations and statements.

Also please make nd pointer-to const.

> +	{

In a Linux-style file opening braces do not belong on their own lines.

> +		/* Skip itself */
> +		if (i == nid)
> +			continue;
> +
> +		nd = &nodes[i];
> +		if (start < nd->start && nd->end < end)
> +		{
> +			printk(KERN_ERR
> +			       "NODE %u: (%"PRIpaddr"-%"PRIpaddr") intertwine with NODE %u (%"PRIpaddr"-%"PRIpaddr")\n",

s/intertwine/interleaves/ ?

> @@ -344,6 +374,12 @@ acpi_numa_memory_affinity_init(const struct acpi_srat_mem_affinity *ma)
>  				nd->start = start;
>  			if (nd->end < end)
>  				nd->end = end;
> +
> +			/* Check whether this range contains memory for other nodes */
> +			if (!is_node_memory_continuous(node, nd->start, nd->end)) {
> +				bad_srat();
> +				return;
> +			}

I think this check would better come before nodes[] gets updated? Otoh
bad_srat() will zap everything anyway ...

Jan



^ permalink raw reply	[flat|nested] 192+ messages in thread

* RE: [PATCH 04/37] xen: introduce an arch helper for default dma zone status
  2022-01-18 14:16           ` Jan Beulich
@ 2022-01-19  2:49             ` Wei Chen
  2022-01-19  7:50               ` Jan Beulich
  0 siblings, 1 reply; 192+ messages in thread
From: Wei Chen @ 2022-01-19  2:49 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Bertrand Marquis, xen-devel, sstabellini, julien

Hi Jan,

> -----Original Message-----
> From: Jan Beulich <jbeulich@suse.com>
> Sent: 2022-01-18 22:16
> To: Wei Chen <Wei.Chen@arm.com>
> Cc: Bertrand Marquis <Bertrand.Marquis@arm.com>; xen-
> devel@lists.xenproject.org; sstabellini@kernel.org; julien@xen.org
> Subject: Re: [PATCH 04/37] xen: introduce an arch helper for default dma
> zone status
> 
> On 18.01.2022 10:20, Wei Chen wrote:
> >> From: Jan Beulich <jbeulich@suse.com>
> >> Sent: 2022-01-18 16:16
> >>
> >> On 18.01.2022 08:51, Wei Chen wrote:
> >>>> From: Jan Beulich <jbeulich@suse.com>
> >>>> Sent: 2022-01-18 0:11
> >>>> On 23.09.2021 14:02, Wei Chen wrote:
> >>>>> In current code, when Xen is running in a multiple nodes NUMA
> >>>>> system, it will set dma_bitsize in end_boot_allocator to reserve
> >>>>> some low address memory for DMA.
> >>>>>
> >>>>> There are some x86 implications in the current implementation, because
> >>>>> on x86, memory starts from 0. On a multi-node NUMA system, if
> >>>>> a single node contains the majority or all of the DMA memory, x86
> >>>>> prefers to give out memory from non-local allocations rather than
> >>>>> exhausting the DMA memory ranges. Hence x86 uses dma_bitsize to set
> >>>>> aside some largely arbitrary amount of memory for DMA memory ranges.
> >>>>> The allocations from these memory ranges would happen only after
> >>>>> exhausting all other nodes' memory.
> >>>>>
> >>>>> But these implications are not shared across all architectures. For
> >>>>> example, Arm doesn't have them. So in this patch, we introduce an
> >>>>> arch_have_default_dmazone helper for each arch to determine whether
> >>>>> it needs to set dma_bitsize to reserve memory for DMA allocations.
> >>>>
> >>>> How would Arm guarantee availability of memory below a certain
> >>>> boundary for limited-capability devices? Or is there no need
> >>>> because there's an assumption that I/O for such devices would
> >>>> always pass through an IOMMU, lifting address size restrictions?
> >>>> (I guess in a !PV build on x86 we could also get rid of such a
> >>>> reservation.)
> >>>
> >>> On Arm, we still can have some devices with limited DMA capability.
> >>> And we also don't force all such devices to use an IOMMU. These devices
> >>> will affect the dma_bitsize. For example, the RPi platform sets its
> >>> dma_bitsize to 30. But on a multi-node NUMA system, Arm doesn't have a
> >>> default DMA zone. Having multiple nodes is not a constraint on
> >>> dma_bitsize. And some previous discussions can be found here [1].
> >>
> >> I'm afraid that doesn't give me more clues. For example, in the mail
> >> being replied to there I find "That means, only first 4GB memory can
> >> be used for DMA." Yet that's not an implication from setting
> >> dma_bitsize. DMA is fine to occur to any address. The special address
> >> range is being held back in case in particular Dom0 is in need of such
> >> a range to perform I/O to _some_ devices.
> >
> > I am sorry that my last reply hasn't given you more clues. On Arm, only
> > Dom0 can have DMA without an IOMMU. So when we allocate memory for Dom0,
> > we're trying to allocate memory under 4GB or in the range indicated by
> > dma_bitsize. I think these operations match your Dom0 special address
> > range description above. As we have already allocated memory for DMA, I
> > think we don't need a DMA zone in page allocation. I am not sure whether
> > that answers your earlier question?
> 
> I view all of this as flawed, or as a workaround at best. Xen shouldn't
> make assumptions on what Dom0 may need. Instead Dom0 should make
> arrangements such that it can do I/O to/from all devices of interest.
> This may involve arranging for address restricted buffers. And for this
> to be possible, Xen would need to have available some suitable memory.
> I understand this is complicated by the fact that despite being HVM-like,
> due to the lack of an IOMMU in front of certain devices address
> restrictions on Dom0 address space alone (i.e. without any Xen
> involvement) won't help ...
> 

I agree with you that the current implementation is probably the best
kind of workaround. Do you have any suggestions for this patch to
address the above comments? Or should I just modify the commit log
to capture some of our discussion above?

Thanks,
Wei Chen

> Jan


^ permalink raw reply	[flat|nested] 192+ messages in thread

* RE: [PATCH 07/37] xen/x86: use paddr_t for addresses in NUMA node structure
  2022-01-18 15:22   ` Jan Beulich
@ 2022-01-19  6:33     ` Wei Chen
  2022-01-19  7:55       ` Jan Beulich
  0 siblings, 1 reply; 192+ messages in thread
From: Wei Chen @ 2022-01-19  6:33 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Bertrand Marquis, xen-devel, sstabellini, julien

Hi Jan,

> -----Original Message-----
> From: Jan Beulich <jbeulich@suse.com>
> Sent: 2022-01-18 23:23
> To: Wei Chen <Wei.Chen@arm.com>
> Cc: Bertrand Marquis <Bertrand.Marquis@arm.com>; xen-
> devel@lists.xenproject.org; sstabellini@kernel.org; julien@xen.org
> Subject: Re: [PATCH 07/37] xen/x86: use paddr_t for addresses in NUMA node
> structure
> 
> On 23.09.2021 14:02, Wei Chen wrote:
> > @@ -201,11 +201,12 @@ void __init numa_init_array(void)
> >  static int numa_fake __initdata = 0;
> >
> >  /* Numa emulation */
> > -static int __init numa_emulation(u64 start_pfn, u64 end_pfn)
> > +static int __init numa_emulation(unsigned long start_pfn,
> > +                                 unsigned long end_pfn)
> >  {
> >      int i;
> >      struct node nodes[MAX_NUMNODES];
> > -    u64 sz = ((end_pfn - start_pfn)<<PAGE_SHIFT) / numa_fake;
> > +    u64 sz = pfn_to_paddr(end_pfn - start_pfn) / numa_fake;
> 
> Nit: Please convert to uint64_t (and alike) whenever you touch a line
> anyway that uses being-phased-out types.
> 

Ok, I will do it.

> > @@ -249,24 +250,26 @@ static int __init numa_emulation(u64 start_pfn,
> u64 end_pfn)
> >  void __init numa_initmem_init(unsigned long start_pfn, unsigned long
> end_pfn)
> >  {
> >      int i;
> > +    paddr_t start, end;
> >
> >  #ifdef CONFIG_NUMA_EMU
> >      if ( numa_fake && !numa_emulation(start_pfn, end_pfn) )
> >          return;
> >  #endif
> >
> > +    start = pfn_to_paddr(start_pfn);
> > +    end = pfn_to_paddr(end_pfn);
> 
> Nit: Would be slightly neater if these were the initializers of the
> variables.

But if this function returns early in the numa_fake && !numa_emulation
case, won't the two pfn_to_paddr operations be wasted?

> 
> >  #ifdef CONFIG_ACPI_NUMA
> > -    if ( !numa_off && !acpi_scan_nodes((u64)start_pfn << PAGE_SHIFT,
> > -         (u64)end_pfn << PAGE_SHIFT) )
> > +    if ( !numa_off && !acpi_scan_nodes(start, end) )
> >          return;
> >  #endif
> >
> >      printk(KERN_INFO "%s\n",
> >             numa_off ? "NUMA turned off" : "No NUMA configuration
> found");
> >
> > -    printk(KERN_INFO "Faking a node at %016"PRIx64"-%016"PRIx64"\n",
> > -           (u64)start_pfn << PAGE_SHIFT,
> > -           (u64)end_pfn << PAGE_SHIFT);
> > +    printk(KERN_INFO "Faking a node at %016"PRIpaddr"-%016"PRIpaddr"\n",
> > +           start, end);
> 
> When switching to PRIpaddr I suppose you did look up what that one
> expands to? IOW - please drop the 016 from here.

Oh, yes, I forgot to drop the duplicated 016. I will do it.

> 
> > @@ -441,7 +441,7 @@ void __init srat_parse_regions(u64 addr)
> >  	    acpi_table_parse(ACPI_SIG_SRAT, acpi_parse_srat))
> >  		return;
> >
> > -	srat_region_mask = pdx_init_mask(addr);
> > +	srat_region_mask = pdx_init_mask((u64)addr);
> 
> I don't see the need for a cast here.
> 

The addr type has been changed to paddr_t, but pdx_init_mask
accepts a u64 parameter. I know paddr_t is a typedef of
u64 on Arm64/32, or unsigned long on x86. At the current stage,
their widths are the same. But in case of future
changes, I think an explicit cast here may be better?

> > @@ -489,7 +489,7 @@ int __init acpi_scan_nodes(u64 start, u64 end)
> >  	/* Finally register nodes */
> >  	for_each_node_mask(i, all_nodes_parsed)
> >  	{
> > -		u64 size = nodes[i].end - nodes[i].start;
> > +		paddr_t size = nodes[i].end - nodes[i].start;
> >  		if ( size == 0 )
> 
> Please take the opportunity and add the missing blank line between
> declarations and statements.
> 

Ok

> > --- a/xen/include/asm-x86/numa.h
> > +++ b/xen/include/asm-x86/numa.h
> > @@ -16,7 +16,7 @@ extern cpumask_t     node_to_cpumask[];
> >  #define node_to_cpumask(node)    (node_to_cpumask[node])
> >
> >  struct node {
> > -	u64 start,end;
> > +	paddr_t start,end;
> 
> Please take the opportunity and add the missing blank after the comma.
> 

Ok

> Jan


^ permalink raw reply	[flat|nested] 192+ messages in thread

* RE: [PATCH 08/37] xen/x86: add detection of discontinous node memory range
  2022-01-18 16:13   ` Jan Beulich
@ 2022-01-19  7:33     ` Wei Chen
  2022-01-19  8:01       ` Jan Beulich
  0 siblings, 1 reply; 192+ messages in thread
From: Wei Chen @ 2022-01-19  7:33 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Bertrand Marquis, xen-devel, sstabellini, julien

Hi Jan,

> -----Original Message-----
> From: Jan Beulich <jbeulich@suse.com>
> Sent: 2022-01-19 0:13
> To: Wei Chen <Wei.Chen@arm.com>
> Cc: Bertrand Marquis <Bertrand.Marquis@arm.com>; xen-
> devel@lists.xenproject.org; sstabellini@kernel.org; julien@xen.org
> Subject: Re: [PATCH 08/37] xen/x86: add detection of discontinous node
> memory range
> 
> On 23.09.2021 14:02, Wei Chen wrote:
> > One NUMA node may contain several memory blocks. In the current Xen
> > code, Xen maintains a node memory range for each node to cover
> > all its memory blocks. But here comes the problem: if, in the gap
> > between two of one node's memory blocks, there are memory blocks that
> > don't belong to this node (remote memory blocks), this node's memory
> > range will be expanded to cover these remote memory blocks.
> >
> > Having one node's memory range contain other nodes' memory is obviously
> > not reasonable. This means the current NUMA code can only support
> > nodes with contiguous memory blocks. However, on a physical machine, the
> > addresses of multiple nodes can be interleaved.
> >

I will adjust the above paragraph to:
... This means the current NUMA code can only support nodes with no
interlaced memory blocks. ...

> > So in this patch, we add code to detect discontinuous memory blocks
> > for one node. NUMA initialization will fail and error messages
> > will be printed when Xen detects such a hardware configuration.

I will adjust the above paragraph to:
So in this patch, we add code to detect interleaving of different nodes'
memory blocks. NUMA initialization will be ...

> 
> Luckily what you actually check for isn't as strict as "discontinuous":
>> What you're after is no interleaving of memory. A single node can still
> have multiple discontiguous ranges (and that'll often be the case on
> x86). Please adjust description and function name accordingly.
> 

Yes, we're checking for no interlaced memory among nodes. Within one
node's memory range, the memory blocks can still be discontiguous.

I will rename the subject to:
"add detection of interlaced memory for different nodes"
And I would rename is_node_memory_continuous to:
node_without_interleave_memory.

> > --- a/xen/arch/x86/srat.c
> > +++ b/xen/arch/x86/srat.c
> > @@ -271,6 +271,36 @@ acpi_numa_processor_affinity_init(const struct
> acpi_srat_cpu_affinity *pa)
> >  		       pxm, pa->apic_id, node);
> >  }
> >
> > +/*
> > + * Check to see if there are other nodes within this node's range.
> > + * We just need to check full contains situation. Because overlaps
> > + * have been checked before by conflicting_memblks.
> > + */
> > +static bool __init is_node_memory_continuous(nodeid_t nid,
> > +    paddr_t start, paddr_t end)
> 
> This indentation style demands indenting like ...
> 

Ok.

> > +{
> > +	nodeid_t i;
> 
> ... variable declarations at function scope, i.e. in a Linux-style
> file by a tab.
> 
> > +
> > +	struct node *nd = &nodes[nid];
> > +	for_each_node_mask(i, memory_nodes_parsed)
> 
> Please move the blank line to be between declarations and statements.
> 
> Also please make nd pointer-to const.

Ok.

> 
> > +	{
> 
> In a Linux-style file opening braces do not belong on their own lines.
> 

Ok.

> > +		/* Skip itself */
> > +		if (i == nid)
> > +			continue;
> > +
> > +		nd = &nodes[i];
> > +		if (start < nd->start && nd->end < end)
> > +		{
> > +			printk(KERN_ERR
> > +			       "NODE %u: (%"PRIpaddr"-%"PRIpaddr") intertwine
> with NODE %u (%"PRIpaddr"-%"PRIpaddr")\n",
> 
> s/intertwine/interleaves/ ?

Yes, interleaves. I will fix it.

> 
> > @@ -344,6 +374,12 @@ acpi_numa_memory_affinity_init(const struct
> acpi_srat_mem_affinity *ma)
> >  				nd->start = start;
> >  			if (nd->end < end)
> >  				nd->end = end;
> > +
> > +			/* Check whether this range contains memory for other
> nodes */
> > +			if (!is_node_memory_continuous(node, nd->start, nd->end))
> {
> > +				bad_srat();
> > +				return;
> > +			}
> 
> I think this check would better come before nodes[] gets updated? Otoh
> bad_srat() will zap everything anyway ...

Yes, when I wrote this check, I considered that when the check fails,
bad_srat would make NUMA initialization fail, so the values in nodes[]
would not take any effect. That's why I didn't adjust the order. But if
bad_srat is changed in the future, and nodes[] is used in some fallback
path, this would take more effort to debug. With that in mind, I agree
with you; I will update the order in the next version.

Thanks,
Wei Chen

> 
> Jan


^ permalink raw reply	[flat|nested] 192+ messages in thread

* Re: [PATCH 04/37] xen: introduce an arch helper for default dma zone status
  2022-01-19  2:49             ` Wei Chen
@ 2022-01-19  7:50               ` Jan Beulich
  2022-01-19  8:33                 ` Wei Chen
  0 siblings, 1 reply; 192+ messages in thread
From: Jan Beulich @ 2022-01-19  7:50 UTC (permalink / raw)
  To: Wei Chen; +Cc: Bertrand Marquis, xen-devel, sstabellini, julien

On 19.01.2022 03:49, Wei Chen wrote:
> Hi Jan,
> 
>> -----Original Message-----
>> From: Jan Beulich <jbeulich@suse.com>
>> Sent: 2022-01-18 22:16
>> To: Wei Chen <Wei.Chen@arm.com>
>> Cc: Bertrand Marquis <Bertrand.Marquis@arm.com>; xen-
>> devel@lists.xenproject.org; sstabellini@kernel.org; julien@xen.org
>> Subject: Re: [PATCH 04/37] xen: introduce an arch helper for default dma
>> zone status
>>
>> On 18.01.2022 10:20, Wei Chen wrote:
>>>> From: Jan Beulich <jbeulich@suse.com>
>>>> Sent: 2022-01-18 16:16
>>>>
>>>> On 18.01.2022 08:51, Wei Chen wrote:
>>>>>> From: Jan Beulich <jbeulich@suse.com>
>>>>>> Sent: 2022-01-18 0:11
>>>>>> On 23.09.2021 14:02, Wei Chen wrote:
>>>>>>> In current code, when Xen is running in a multiple nodes NUMA
>>>>>>> system, it will set dma_bitsize in end_boot_allocator to reserve
>>>>>>> some low address memory for DMA.
>>>>>>>
>>>>>>> There are some x86 implications in the current implementation, because
>>>>>>> on x86, memory starts from 0. On a multi-node NUMA system, if
>>>>>>> a single node contains the majority or all of the DMA memory, x86
>>>>>>> prefers to give out memory from non-local allocations rather than
>>>>>>> exhausting the DMA memory ranges. Hence x86 uses dma_bitsize to set
>>>>>>> aside some largely arbitrary amount of memory for DMA memory ranges.
>>>>>>> The allocations from these memory ranges would happen only after
>>>>>>> exhausting all other nodes' memory.
>>>>>>>
>>>>>>> But these implications are not shared across all architectures. For
>>>>>>> example, Arm doesn't have them. So in this patch, we introduce an
>>>>>>> arch_have_default_dmazone helper for each arch to determine whether
>>>>>>> it needs to set dma_bitsize to reserve memory for DMA allocations.
>>>>>>
>>>>>> How would Arm guarantee availability of memory below a certain
>>>>>> boundary for limited-capability devices? Or is there no need
>>>>>> because there's an assumption that I/O for such devices would
>>>>>> always pass through an IOMMU, lifting address size restrictions?
>>>>>> (I guess in a !PV build on x86 we could also get rid of such a
>>>>>> reservation.)
>>>>>
>>>>> On Arm, we still can have some devices with limited DMA capability.
>>>>> And we also don't force all such devices to use an IOMMU. These devices
>>>>> will affect the dma_bitsize. For example, the RPi platform sets its
>>>>> dma_bitsize to 30. But on a multi-node NUMA system, Arm doesn't have a
>>>>> default DMA zone. Having multiple nodes is not a constraint on
>>>>> dma_bitsize. And some previous discussions can be found here [1].
>>>>
>>>> I'm afraid that doesn't give me more clues. For example, in the mail
>>>> being replied to there I find "That means, only first 4GB memory can
>>>> be used for DMA." Yet that's not an implication from setting
>>>> dma_bitsize. DMA is fine to occur to any address. The special address
>>>> range is being held back in case in particular Dom0 is in need of such
>>>> a range to perform I/O to _some_ devices.
>>>
>>> I am sorry that my last reply hasn't given you more clues. On Arm, only
>>> Dom0 can have DMA without an IOMMU. So when we allocate memory for Dom0,
>>> we're trying to allocate memory under 4GB or in the range indicated by
>>> dma_bitsize. I think these operations match your Dom0 special address
>>> range description above. As we have already allocated memory for DMA, I
>>> think we don't need a DMA zone in page allocation. I am not sure whether
>>> that answers your earlier question?
>>
>> I view all of this as flawed, or as a workaround at best. Xen shouldn't
>> make assumptions on what Dom0 may need. Instead Dom0 should make
>> arrangements such that it can do I/O to/from all devices of interest.
>> This may involve arranging for address restricted buffers. And for this
>> to be possible, Xen would need to have available some suitable memory.
>> I understand this is complicated by the fact that despite being HVM-like,
>> due to the lack of an IOMMU in front of certain devices address
>> restrictions on Dom0 address space alone (i.e. without any Xen
>> involvement) won't help ...
>>
> 
> I agree with you that the current implementation is probably the best
> kind of workaround. Do you have any suggestions for this patch to
> address the above comments? Or should I just modify the commit log
> to capture some of our discussion above?

Extending the description is my primary request, or else we may end up
having the same discussion every time you submit a new version. As to
improving the situation such that preferably per-arch customization
wouldn't be necessary - that's perhaps better to be thought about by
Arm folks. Otoh, as said, an x86 build with CONFIG_PV=n could probably
leverage the new hook to actually not trigger reservation.

Jan



^ permalink raw reply	[flat|nested] 192+ messages in thread

* Re: [PATCH 07/37] xen/x86: use paddr_t for addresses in NUMA node structure
  2022-01-19  6:33     ` Wei Chen
@ 2022-01-19  7:55       ` Jan Beulich
  2022-01-19  8:36         ` Wei Chen
  0 siblings, 1 reply; 192+ messages in thread
From: Jan Beulich @ 2022-01-19  7:55 UTC (permalink / raw)
  To: Wei Chen; +Cc: Bertrand Marquis, xen-devel, sstabellini, julien

On 19.01.2022 07:33, Wei Chen wrote:
>> From: Jan Beulich <jbeulich@suse.com>
>> Sent: 2022-01-18 23:23
>>
>> On 23.09.2021 14:02, Wei Chen wrote:
>>> @@ -249,24 +250,26 @@ static int __init numa_emulation(u64 start_pfn,
>> u64 end_pfn)
>>>  void __init numa_initmem_init(unsigned long start_pfn, unsigned long
>> end_pfn)
>>>  {
>>>      int i;
>>> +    paddr_t start, end;
>>>
>>>  #ifdef CONFIG_NUMA_EMU
>>>      if ( numa_fake && !numa_emulation(start_pfn, end_pfn) )
>>>          return;
>>>  #endif
>>>
>>> +    start = pfn_to_paddr(start_pfn);
>>> +    end = pfn_to_paddr(end_pfn);
>>
>> Nit: Would be slightly neater if these were the initializers of the
>> variables.
> 
> But if this function returns early in the numa_fake && !numa_emulation
> case, won't the two pfn_to_paddr operations be wasted?

And what harm would that do?

>>> @@ -441,7 +441,7 @@ void __init srat_parse_regions(u64 addr)
>>>  	    acpi_table_parse(ACPI_SIG_SRAT, acpi_parse_srat))
>>>  		return;
>>>
>>> -	srat_region_mask = pdx_init_mask(addr);
>>> +	srat_region_mask = pdx_init_mask((u64)addr);
>>
>> I don't see the need for a cast here.
>>
> 
> The addr type has been changed to paddr_t, but pdx_init_mask
> accepts a u64 parameter. I know paddr_t is a typedef of
> u64 on Arm64/32, or unsigned long on x86. At the current stage,
> their widths are the same. But in case of future
> changes, I think an explicit cast here may be better?

It may only ever be an up-cast, yet the compiler would do a widening
conversion (according to the usual type conversion rules) for us
anyway no matter whether there's a cast. Down-casts (in the general
compiler case, i.e. considering a wider set than just gcc and clang)
sometimes need making explicit to silence compiler warnings about
truncation, but I've not observed any compiler warning when widening
values.

Jan



^ permalink raw reply	[flat|nested] 192+ messages in thread

* Re: [PATCH 08/37] xen/x86: add detection of discontinous node memory range
  2022-01-19  7:33     ` Wei Chen
@ 2022-01-19  8:01       ` Jan Beulich
  2022-01-19  8:24         ` Wei Chen
  0 siblings, 1 reply; 192+ messages in thread
From: Jan Beulich @ 2022-01-19  8:01 UTC (permalink / raw)
  To: Wei Chen; +Cc: Bertrand Marquis, xen-devel, sstabellini, julien

On 19.01.2022 08:33, Wei Chen wrote:
>> From: Jan Beulich <jbeulich@suse.com>
>> Sent: 2022-01-19 0:13
>>
>> On 23.09.2021 14:02, Wei Chen wrote:
>>> One NUMA node may contain several memory blocks. In the current Xen
>>> code, Xen maintains a node memory range for each node to cover
>>> all its memory blocks. But here comes the problem: if, in the gap
>>> between two of one node's memory blocks, there are memory blocks that
>>> don't belong to this node (remote memory blocks), this node's memory
>>> range will be expanded to cover these remote memory blocks.
>>>
>>> Having one node's memory range contain other nodes' memory is obviously
>>> not reasonable. This means the current NUMA code can only support
>>> nodes with contiguous memory blocks. However, on a physical machine, the
>>> addresses of multiple nodes can be interleaved.
>>>
> 
> I will adjust above paragraph to:
> ... This means current NUMA code only can support node has no interlaced
> memory blocks. ...
> 
>>> So in this patch, we add code to detect discontinuous memory blocks
>>> for one node. NUMA initialization will fail and error messages
>>> will be printed when Xen detects such a hardware configuration.
> 
> I will adjust above paragraph to:
> So in this patch, we add code to detect interleave of different nodes'
> memory blocks. NUMA initialization will be ...

Taking just this part of your reply (the issue continues later), may I
ask that you use a consistent term throughout this single patch? Mixing
"interlace" and "interleave" like you do may make people wonder whether
the two are intended to express slightly different aspects. Personally,
as per my suggestion, I'd prefer "interleave", but I'm not a native
speaker.

Jan



^ permalink raw reply	[flat|nested] 192+ messages in thread

* RE: [PATCH 08/37] xen/x86: add detection of discontinous node memory range
  2022-01-19  8:01       ` Jan Beulich
@ 2022-01-19  8:24         ` Wei Chen
  0 siblings, 0 replies; 192+ messages in thread
From: Wei Chen @ 2022-01-19  8:24 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Bertrand Marquis, xen-devel, sstabellini, julien

Hi Jan,

> -----Original Message-----
> From: Jan Beulich <jbeulich@suse.com>
> Sent: January 19, 2022 16:01
> To: Wei Chen <Wei.Chen@arm.com>
> Cc: Bertrand Marquis <Bertrand.Marquis@arm.com>; xen-
> devel@lists.xenproject.org; sstabellini@kernel.org; julien@xen.org
> Subject: Re: [PATCH 08/37] xen/x86: add detection of discontinous node
> memory range
> 
> On 19.01.2022 08:33, Wei Chen wrote:
> >> From: Jan Beulich <jbeulich@suse.com>
> >> Sent: January 19, 2022 0:13
> >>
> >> On 23.09.2021 14:02, Wei Chen wrote:
> >>> One NUMA node may contain several memory blocks. In current Xen
> >>> code, Xen will maintain a node memory range for each node to cover
> >>> all its memory blocks. But here comes the problem: if, in the gap
> >>> between one node's two memory blocks, there are some memory blocks
> >>> that don't belong to this node (remote memory blocks), this node's
> >>> memory range will be expanded to cover these remote memory blocks.
> >>>
> >>> One node's memory range containing other nodes' memory is obviously
> >>> not very reasonable. This means the current NUMA code can only
> >>> support nodes with contiguous memory blocks. However, on a physical
> >>> machine, the addresses of multiple nodes can be interleaved.
> >>>
> >
> > I will adjust the above paragraph to:
> > ... This means the current NUMA code can only support nodes with no
> > interlaced memory blocks. ...
> >
> >>> So in this patch, we add code to detect discontinuous memory blocks
> >>> for one node. NUMA initialization will fail and error messages
> >>> will be printed when Xen detects such hardware configuration.
> >
> > I will adjust the above paragraph to:
> > So in this patch, we add code to detect interleaving of different nodes'
> > memory blocks. NUMA initialization will be ...
> 
> Taking just this part of your reply (the issue continues later), may I
> ask that you use a consistent term throughout this single patch? Mixing
> "interlace" and "interleave" like you do may make people wonder whether
> the two are intended to express slightly different aspects. Personally,
> as per my suggestion, I'd prefer "interleave", but I'm not a native
> speaker.
> 

Sorry, I am not a native speaker either. I had checked a dictionary for
"interlaced" before I used it: https://www.merriam-webster.com/thesaurus/interlaced

Obviously, I was probably using it incorrectly and making it harder to
understand; I will use "interleave" in my patches.

Thanks,
Wei Chen


> Jan


^ permalink raw reply	[flat|nested] 192+ messages in thread

* RE: [PATCH 04/37] xen: introduce an arch helper for default dma zone status
  2022-01-19  7:50               ` Jan Beulich
@ 2022-01-19  8:33                 ` Wei Chen
  0 siblings, 0 replies; 192+ messages in thread
From: Wei Chen @ 2022-01-19  8:33 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Bertrand Marquis, xen-devel, sstabellini, julien

Hi Jan,

> -----Original Message-----
> From: Jan Beulich <jbeulich@suse.com>
> Sent: January 19, 2022 15:50
> To: Wei Chen <Wei.Chen@arm.com>
> Cc: Bertrand Marquis <Bertrand.Marquis@arm.com>; xen-
> devel@lists.xenproject.org; sstabellini@kernel.org; julien@xen.org
> Subject: Re: [PATCH 04/37] xen: introduce an arch helper for default dma
> zone status
> 
> On 19.01.2022 03:49, Wei Chen wrote:
> > Hi Jan,
> >
> >> -----Original Message-----
> >> From: Jan Beulich <jbeulich@suse.com>
> >> Sent: January 18, 2022 22:16
> >> To: Wei Chen <Wei.Chen@arm.com>
> >> Cc: Bertrand Marquis <Bertrand.Marquis@arm.com>; xen-
> >> devel@lists.xenproject.org; sstabellini@kernel.org; julien@xen.org
> >> Subject: Re: [PATCH 04/37] xen: introduce an arch helper for default dma
> >> zone status
> >>
> >> On 18.01.2022 10:20, Wei Chen wrote:
> >>>> From: Jan Beulich <jbeulich@suse.com>
> >>>> Sent: January 18, 2022 16:16
> >>>>
> >>>> On 18.01.2022 08:51, Wei Chen wrote:
> >>>>>> From: Jan Beulich <jbeulich@suse.com>
> >>>>>> Sent: January 18, 2022 0:11
> >>>>>> On 23.09.2021 14:02, Wei Chen wrote:
> >>>>>>> In current code, when Xen is running on a multi-node NUMA
> >>>>>>> system, it will set dma_bitsize in end_boot_allocator to reserve
> >>>>>>> some low-address memory for DMA.
> >>>>>>>
> >>>>>>> There are some x86 implications in the current implementation,
> >>>>>>> because on x86 memory starts from 0. On a multi-node NUMA
> >>>>>>> system, if a single node contains the majority or all of the DMA
> >>>>>>> memory, x86 prefers to give out memory from non-local allocations
> >>>>>>> rather than exhausting the DMA memory ranges. Hence x86 uses
> >>>>>>> dma_bitsize to set aside some largely arbitrary amount of memory
> >>>>>>> for DMA memory ranges. Allocations from these memory ranges would
> >>>>>>> happen only after exhausting all other nodes' memory.
> >>>>>>>
> >>>>>>> But the implications are not shared across all architectures. For
> >>>>>>> example, Arm doesn't have these implications. So in this patch, we
> >>>>>>> introduce an arch_have_default_dmazone helper for an arch to
> >>>>>>> determine whether it needs to set dma_bitsize to reserve memory
> >>>>>>> for DMA allocations or not.
> >>>>>>
> >>>>>> How would Arm guarantee availability of memory below a certain
> >>>>>> boundary for limited-capability devices? Or is there no need
> >>>>>> because there's an assumption that I/O for such devices would
> >>>>>> always pass through an IOMMU, lifting address size restrictions?
> >>>>>> (I guess in a !PV build on x86 we could also get rid of such a
> >>>>>> reservation.)
> >>>>>
> >>>>> On Arm, we still can have some devices with limited DMA capability,
> >>>>> and we also don't force all such devices to use an IOMMU. These
> >>>>> devices will affect dma_bitsize; the RPi platform, for example, sets
> >>>>> its dma_bitsize to 30. But even on a multi-node NUMA system, Arm
> >>>>> doesn't have a default DMA zone: multiple nodes are not a constraint
> >>>>> on dma_bitsize. Some previous discussion can be found here [1].
> >>>>
> >>>> I'm afraid that doesn't give me more clues. For example, in the mail
> >>>> being replied to there I find "That means, only first 4GB memory can
> >>>> be used for DMA." Yet that's not an implication from setting
> >>>> dma_bitsize. DMA is fine to occur to any address. The special address
> >>>> range is being held back in case in particular Dom0 is in need of such
> >>>> a range to perform I/O to _some_ devices.
> >>>
> >>> I am sorry that my last reply hasn't given you more clues. On Arm, only
> >>> Dom0 can have DMA without an IOMMU. So when we allocate memory for Dom0,
> >>>