[PATCH 00/19] Virtual NUMA for PV and HVM

* [PATCH 00/19] Virtual NUMA for PV and HVM
@ 2014-11-21 15:06 Wei Liu
  2014-11-21 15:06 ` [PATCH 01/19] xen: dump vNUMA information with debug key "u" Wei Liu
                   ` (20 more replies)
  0 siblings, 21 replies; 44+ messages in thread
From: Wei Liu @ 2014-11-21 15:06 UTC (permalink / raw)
  To: xen-devel; +Cc: Wei Liu

Hi all

This patch series implements virtual NUMA support for both PV and HVM guest.
That is, admin can configure via libxl what virtual NUMA topology the guest
sees.

This is the stage 1 (basic vNUMA support) and part of stage 2 (vNUMA-ware
ballooning, hypervisor side) described in my previous email to xen-devel [0].

This series is broken into several parts:

1. xen patches: vNUMA debug output and vNUMA-aware memory hypercall support.
2. libxc/libxl support for PV vNUMA.
3. libxc/libxl support for HVM vNUMA.
4. xl vNUMA configuration documentation and parser.

I think one significant difference from Elena's work is that this patch series
makes use of multiple vmemranges should there be a memory hole, instead of
shrinking ram. This matches the behaviour of real hardware.

The vNUMA auto placement algorithm is missing at the moment and Dario is
working on it.

This series can be found at:
 git://xenbits.xen.org/people/liuw/xen.git wip.vnuma-v1 

With this series, the following configuration can be used to enabled virtual
NUMA support, and it works for both PV and HVM guests.

memory = 6000
vnuma_memory = [3000, 3000]
vnuma_vcpu_map = [0, 1]
vnuma_pnode_map = [0, 0]
vnuma_vdistances = [10, 30] # optional

dmesg output for HVM guest:

[    0.000000] ACPI: SRAT 00000000fc009ff0 000C8 (v01    Xen      HVM 00000000 HVML 00000000)
[    0.000000] ACPI: SLIT 00000000fc00a0c0 00030 (v01    Xen      HVM 00000000 HVML 00000000)
<...snip...>
[    0.000000] SRAT: PXM 0 -> APIC 0x00 -> Node 0
[    0.000000] SRAT: PXM 1 -> APIC 0x02 -> Node 1
[    0.000000] SRAT: Node 0 PXM 0 [mem 0x00000000-0xbb7fffff]
[    0.000000] SRAT: Node 1 PXM 1 [mem 0xbb800000-0xefffffff]
[    0.000000] SRAT: Node 1 PXM 1 [mem 0x100000000-0x186ffffff]
[    0.000000] NUMA: Initialized distance table, cnt=2
[    0.000000] NUMA: Node 1 [mem 0xbb800000-0xefffffff] + [mem 0x100000000-0x1867fffff] -> [mem 0xbb800000-0x1867fffff]
[    0.000000] Initmem setup node 0 [mem 0x00000000-0xbb7fffff]
[    0.000000]   NODE_DATA [mem 0xbb7fc000-0xbb7fffff]
[    0.000000] Initmem setup node 1 [mem 0xbb800000-0x1867fffff]
[    0.000000]   NODE_DATA [mem 0x1867f7000-0x1867fafff]
[    0.000000]  [ffffea0000000000-ffffea00029fffff] PMD -> [ffff8800b8600000-ffff8800baffffff] on node 0
[    0.000000]  [ffffea0002a00000-ffffea00055fffff] PMD -> [ffff880183000000-ffff8801859fffff] on node 1
<...snip...>
[    0.000000] Early memory node ranges
[    0.000000]   node   0: [mem 0x00001000-0x0009efff]
[    0.000000]   node   0: [mem 0x00100000-0xbb7fffff]
[    0.000000]   node   1: [mem 0xbb800000-0xefffefff]
[    0.000000]   node   1: [mem 0x100000000-0x1867fffff]

numactl output for HVM guest:

available: 2 nodes (0-1)
node 0 cpus: 0
node 0 size: 2999 MB
node 0 free: 2546 MB
node 1 cpus: 1
node 1 size: 2991 MB
node 1 free: 2144 MB
node distances:
node   0   1 
  0:  10  30 
  1:  30  10 

dmesg output for PV guest:

[    0.000000] NUMA: Initialized distance table, cnt=2
[    0.000000] NUMA: Node 1 [mem 0xbb800000-0xce68efff] + [mem 0x100000000-0x1a8970fff] -> [mem 0xbb800000-0x1a8970fff]
[    0.000000] NODE_DATA(0) allocated [mem 0xbb7fc000-0xbb7fffff]
[    0.000000] NODE_DATA(1) allocated [mem 0x1a8969000-0x1a896cfff]

numactl output for PV guest:

available: 2 nodes (0-1)
node 0 cpus: 0
node 0 size: 2944 MB
node 0 free: 2917 MB
node 1 cpus: 1
node 1 size: 2934 MB
node 1 free: 2904 MB
node distances:
node   0   1 
  0:  10  30
  1:  30  10

And for a HVM guest on a real NUMA-capable machine:

(XEN) Memory location of each domain:
(XEN) Domain 0 (total: 262144):
(XEN)     Node 0: 245758
(XEN)     Node 1: 16386
(XEN) Domain 2 (total: 2097226):
(XEN)     Node 0: 1046335
(XEN)     Node 1: 1050891
(XEN)      2 vnodes, 4 vcpus
(XEN)        vnode   0 - pnode 0
(XEN)         3840 MB:  0x0 - 0xf0000000
(XEN)         256 MB:  0x100000000 - 0x110000000
(XEN)         vcpus: 0 1 
(XEN)        vnode   1 - pnode 1
(XEN)         4096 MB:  0x110000000 - 0x210000000
(XEN)         vcpus: 2 3 

Wei.

[0] <20141111173606.GC21312@zion.uk.xensource.com>

Wei Liu (19):
  xen: dump vNUMA information with debug key "u"
  xen: make two memory hypercalls vNUMA-aware
  libxc: allocate memory with vNUMA information for PV guest
  libxl: add emacs local variables in libxl_{x86,arm}.c
  libxl: introduce vNUMA types
  libxl: add vmemrange to libxl__domain_build_state
  libxl: introduce libxl__vnuma_config_check
  libxl: x86: factor out e820_host_sanitize
  libxl: functions to build vmemranges for PV guest
  libxl: build, check and pass vNUMA info to Xen for PV guest
  hvmloader: add new fields for vNUMA information
  hvmloader: construct SRAT
  hvmloader: construct SLIT
  hvmloader: disallow memory relocation when vNUMA is enabled
  libxc: allocate memory with vNUMA information for HVM guest
  libxl: build, check and pass vNUMA info to Xen for HVM guest
  libxl: refactor hvm_build_set_params
  libxl: fill vNUMA information in hvm info
  xl: vNUMA support

 docs/man/xl.cfg.pod.5                   |   32 +++++
 tools/firmware/hvmloader/acpi/acpi2_0.h |   61 +++++++++
 tools/firmware/hvmloader/acpi/build.c   |  104 ++++++++++++++
 tools/firmware/hvmloader/pci.c          |   13 ++
 tools/libxc/include/xc_dom.h            |    5 +
 tools/libxc/include/xenguest.h          |    7 +
 tools/libxc/xc_dom_x86.c                |   72 ++++++++--
 tools/libxc/xc_hvm_build_x86.c          |  224 +++++++++++++++++++-----------
 tools/libxc/xc_private.h                |    2 +
 tools/libxl/Makefile                    |    2 +-
 tools/libxl/libxl_arch.h                |    6 +
 tools/libxl/libxl_arm.c                 |   17 +++
 tools/libxl/libxl_create.c              |    9 ++
 tools/libxl/libxl_dom.c                 |  172 ++++++++++++++++++++---
 tools/libxl/libxl_internal.h            |   18 +++
 tools/libxl/libxl_types.idl             |    9 ++
 tools/libxl/libxl_vnuma.c               |  228 +++++++++++++++++++++++++++++++
 tools/libxl/libxl_x86.c                 |  113 +++++++++++++--
 tools/libxl/xl_cmdimpl.c                |  151 ++++++++++++++++++++
 xen/arch/x86/numa.c                     |   46 ++++++-
 xen/common/memory.c                     |   58 +++++++-
 xen/include/public/features.h           |    3 +
 xen/include/public/hvm/hvm_info_table.h |   19 +++
 xen/include/public/memory.h             |    2 +
 24 files changed, 1247 insertions(+), 126 deletions(-)
 create mode 100644 tools/libxl/libxl_vnuma.c

-- 
1.7.10.4

^ permalink raw reply	[flat|nested] 44+ messages in thread