* [PATCH v6 00/23] Virtual NUMA for PV and HVM
@ 2015-02-26 15:55 Wei Liu
  2015-02-26 15:55 ` [PATCH v6 01/23] xen: factor out construct_memop_from_reservation Wei Liu
                   ` (22 more replies)
  0 siblings, 23 replies; 81+ messages in thread
From: Wei Liu @ 2015-02-26 15:55 UTC (permalink / raw)
  To: xen-devel; +Cc: Wei Liu

Hi all

This is version 6 of this series rebased on top of staging.

This patch series implements virtual NUMA support for both PV and HVM guests.
That is, the admin can configure via libxl the virtual NUMA topology that the
guest sees.

This is stage 1 (basic vNUMA support) and part of stage 2 (vNUMA-aware
ballooning, hypervisor side) described in my previous email to xen-devel [0].

This series is broken into several parts:

1. xen patches: vNUMA debug output and vNUMA-aware memory hypercall support.
2. libxc/libxl support for PV vNUMA.
3. libxc/libxl/hypervisor support for HVM vNUMA.
4. xl vNUMA configuration documentation and parser.

One significant difference from Elena's work is that this patch series makes
use of multiple vmemranges should there be a memory hole, instead of shrinking
RAM. This matches the behaviour of real hardware.
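
For instance, a vnode that straddles a memory hole could be described with
two vmemranges sharing the same nid (a hedged sketch; the addresses and the
hole boundary below are made up purely for illustration):

    xen_vmemrange_t ranges[] = {
        /* RAM below the (hypothetical) hole */
        { .start = 0x00000000ULL,  .end = 0xf0000000ULL,
          .flags = 0, .nid = 0 },
        /* remainder of the same vnode, relocated above 4GB */
        { .start = 0x100000000ULL, .end = 0x110000000ULL,
          .flags = 0, .nid = 0 },
    };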

The vNUMA auto placement algorithm is missing at the moment and Dario is
working on it.

This series can be found at:
 git://xenbits.xen.org/people/liuw/xen.git wip.vnuma-v5

With this series, the following configuration can be used to enable virtual
NUMA support; it works for both PV and HVM guests.

vnuma = [ [ "pnode=0","size=3000","vcpus=0-3","vdistances=10,20"  ],
          [ "pnode=0","size=3000","vcpus=4-7","vdistances=20,10"  ],
        ]

This example defines two vnodes of 3000MB each, both backed by physical node
0, with four vcpus per vnode and symmetric distances (10 local, 20 remote).
For example output of guest NUMA information, please look at [1].

In terms of libxl / libxc internals, things are broken into several
parts:

1. libxl interface

Users of libxl can only specify how many vnodes a guest has, but currently
they have no control over the actual memory layout. Note that it would be
fairly easy to export an interface to control the memory layout in the
future.

2. libxl internal

Libxl generates internal vNUMA configurations when building the domain, then
transforms them into libxc representations. It also validates the vNUMA
configuration along the way.

3. libxc internal

Libxc does what it's told to do. It doesn't do anything smart (in fact, I
deliberately didn't put any smart logic inside it). Libxc also reports some
information back to libxl in the HVM case, but that's it.

Wei.

[0] <20141111173606.GC21312@zion.uk.xensource.com>
[1] <1416582421-10789-1-git-send-email-wei.liu2@citrix.com>

Wei Liu (23):
  xen: factor out construct_memop_from_reservation
  xen: move NUMA_NO_NODE to public memory.h as XEN_NUMA_NO_NODE
  xen: make two memory hypercalls vNUMA-aware
  libxc: duplicate snippet to allocate p2m_host array
  libxc: add p2m_size to xc_dom_image
  libxc: allocate memory with vNUMA information for PV guest
  libxl: introduce vNUMA types
  libxl: add vmemrange to libxl__domain_build_state
  libxl: introduce libxl__vnuma_config_check
  libxl: x86: factor out e820_host_sanitize
  libxl: functions to build vmemranges for PV guest
  libxl: build, check and pass vNUMA info to Xen for PV guest
  libxc: indentation change to xc_hvm_build_x86.c
  libxc: allocate memory with vNUMA information for HVM guest
  libxl: build, check and pass vNUMA info to Xen for HVM guest
  libxl: disallow memory relocation when vNUMA is enabled
  libxl: define LIBXL_HAVE_VNUMA
  libxlu: rework internal representation of setting
  libxlu: nested list support
  libxlu: record line and column number when parsing values
  libxlu: introduce new APIs
  xl: introduce xcalloc
  xl: vNUMA support

 docs/man/xl.cfg.pod.5                    |  54 +++++++
 tools/libxc/include/xc_dom.h             |  13 +-
 tools/libxc/include/xenguest.h           |  11 ++
 tools/libxc/xc_dom_arm.c                 |   1 +
 tools/libxc/xc_dom_core.c                |   8 +-
 tools/libxc/xc_dom_x86.c                 | 129 +++++++++++++---
 tools/libxc/xc_hvm_build_x86.c           | 237 +++++++++++++++++++----------
 tools/libxl/Makefile                     |   2 +-
 tools/libxl/libxl.h                      |   7 +
 tools/libxl/libxl_arch.h                 |   6 +
 tools/libxl/libxl_arm.c                  |   8 +
 tools/libxl/libxl_create.c               |   9 ++
 tools/libxl/libxl_dm.c                   |   6 +-
 tools/libxl/libxl_dom.c                  | 120 +++++++++++++++
 tools/libxl/libxl_internal.h             |  24 +++
 tools/libxl/libxl_types.idl              |  10 ++
 tools/libxl/libxl_vnuma.c                | 253 +++++++++++++++++++++++++++++++
 tools/libxl/libxl_x86.c                  | 105 +++++++++++--
 tools/libxl/libxlu_cfg.c                 | 209 ++++++++++++++++++-------
 tools/libxl/libxlu_cfg_i.h               |  14 +-
 tools/libxl/libxlu_cfg_y.c               |  72 ++++-----
 tools/libxl/libxlu_cfg_y.h               |   2 +-
 tools/libxl/libxlu_cfg_y.y               |  18 ++-
 tools/libxl/libxlu_internal.h            |  24 ++-
 tools/libxl/libxlutil.h                  |  13 ++
 tools/libxl/xl_cmdimpl.c                 | 150 +++++++++++++++++-
 xen/arch/x86/hpet.c                      |   2 +-
 xen/arch/x86/irq.c                       |   4 +-
 xen/arch/x86/numa.c                      |  14 +-
 xen/arch/x86/physdev.c                   |   2 +-
 xen/arch/x86/setup.c                     |   2 +-
 xen/arch/x86/smpboot.c                   |   2 +-
 xen/arch/x86/srat.c                      |  28 ++--
 xen/arch/x86/x86_64/mm.c                 |   2 +-
 xen/common/kernel.c                      |   2 +-
 xen/common/memory.c                      |  85 ++++++++---
 xen/common/page_alloc.c                  |   4 +-
 xen/drivers/passthrough/amd/iommu_init.c |   2 +-
 xen/drivers/passthrough/vtd/iommu.c      |   8 +-
 xen/include/public/features.h            |   3 +
 xen/include/public/memory.h              |   4 +
 xen/include/xen/numa.h                   |   5 +-
 42 files changed, 1392 insertions(+), 282 deletions(-)
 create mode 100644 tools/libxl/libxl_vnuma.c

-- 
1.9.1


* [PATCH v6 01/23] xen: factor out construct_memop_from_reservation
  2015-02-26 15:55 [PATCH v6 00/23] Virtual NUMA for PV and HVM Wei Liu
@ 2015-02-26 15:55 ` Wei Liu
  2015-02-27 10:57   ` Andrew Cooper
  2015-02-26 15:55 ` [PATCH v6 02/23] xen: move NUMA_NO_NODE to public memory.h as XEN_NUMA_NO_NODE Wei Liu
                   ` (21 subsequent siblings)
  22 siblings, 1 reply; 81+ messages in thread
From: Wei Liu @ 2015-02-26 15:55 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Wei Liu, Jan Beulich

No functional change.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Cc: Jan Beulich <JBeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
---
 xen/common/memory.c | 52 +++++++++++++++++++++++++++++++++++-----------------
 1 file changed, 35 insertions(+), 17 deletions(-)

diff --git a/xen/common/memory.c b/xen/common/memory.c
index e84ace9..d24b001 100644
--- a/xen/common/memory.c
+++ b/xen/common/memory.c
@@ -692,11 +692,43 @@ out:
     return rc;
 }
 
+static int construct_memop_from_reservation(
+               const struct xen_memory_reservation *r,
+               struct memop_args *a)
+{
+    int rc;
+    unsigned int address_bits;
+
+    a->extent_list  = r->extent_start;
+    a->nr_extents   = r->nr_extents;
+    a->extent_order = r->extent_order;
+    a->memflags     = 0;
+
+    address_bits = XENMEMF_get_address_bits(r->mem_flags);
+    if ( (address_bits != 0) &&
+         (address_bits < (get_order_from_pages(max_page) + PAGE_SHIFT)) )
+    {
+        if ( address_bits <= PAGE_SHIFT )
+        {
+            rc = -EINVAL;
+            goto out;
+        }
+        a->memflags = MEMF_bits(address_bits);
+    }
+
+    a->memflags |= MEMF_node(XENMEMF_get_node(r->mem_flags));
+    if ( r->mem_flags & XENMEMF_exact_node_request )
+        a->memflags |= MEMF_exact_node;
+
+    rc = 0;
+ out:
+    return rc;
+}
+
 long do_memory_op(unsigned long cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
 {
     struct domain *d;
     long rc;
-    unsigned int address_bits;
     struct xen_memory_reservation reservation;
     struct memop_args args;
     domid_t domid;
@@ -718,25 +750,11 @@ long do_memory_op(unsigned long cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
         if ( unlikely(start_extent >= reservation.nr_extents) )
             return start_extent;
 
-        args.extent_list  = reservation.extent_start;
-        args.nr_extents   = reservation.nr_extents;
-        args.extent_order = reservation.extent_order;
         args.nr_done      = start_extent;
         args.preempted    = 0;
-        args.memflags     = 0;
 
-        address_bits = XENMEMF_get_address_bits(reservation.mem_flags);
-        if ( (address_bits != 0) &&
-             (address_bits < (get_order_from_pages(max_page) + PAGE_SHIFT)) )
-        {
-            if ( address_bits <= PAGE_SHIFT )
-                return start_extent;
-            args.memflags = MEMF_bits(address_bits);
-        }
-
-        args.memflags |= MEMF_node(XENMEMF_get_node(reservation.mem_flags));
-        if ( reservation.mem_flags & XENMEMF_exact_node_request )
-            args.memflags |= MEMF_exact_node;
+        if ( construct_memop_from_reservation(&reservation, &args) )
+            return start_extent;
 
         if ( op == XENMEM_populate_physmap
              && (reservation.mem_flags & XENMEMF_populate_on_demand) )
-- 
1.9.1


* [PATCH v6 02/23] xen: move NUMA_NO_NODE to public memory.h as XEN_NUMA_NO_NODE
  2015-02-26 15:55 [PATCH v6 00/23] Virtual NUMA for PV and HVM Wei Liu
  2015-02-26 15:55 ` [PATCH v6 01/23] xen: factor out construct_memop_from_reservation Wei Liu
@ 2015-02-26 15:55 ` Wei Liu
  2015-02-27 11:38   ` Andrew Cooper
  2015-02-27 16:42   ` Jan Beulich
  2015-02-26 15:55 ` [PATCH v6 03/23] xen: make two memory hypercalls vNUMA-aware Wei Liu
                   ` (20 subsequent siblings)
  22 siblings, 2 replies; 81+ messages in thread
From: Wei Liu @ 2015-02-26 15:55 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Wei Liu, Jan Beulich

Update users of NUMA_NO_NODE in Xen code to use the new macro.

No functional change introduced.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Jan Beulich <JBeulich@suse.com>
---
 xen/arch/x86/hpet.c                      |  2 +-
 xen/arch/x86/irq.c                       |  4 ++--
 xen/arch/x86/numa.c                      | 14 +++++++-------
 xen/arch/x86/physdev.c                   |  2 +-
 xen/arch/x86/setup.c                     |  2 +-
 xen/arch/x86/smpboot.c                   |  2 +-
 xen/arch/x86/srat.c                      | 28 ++++++++++++++--------------
 xen/arch/x86/x86_64/mm.c                 |  2 +-
 xen/common/page_alloc.c                  |  4 ++--
 xen/drivers/passthrough/amd/iommu_init.c |  2 +-
 xen/drivers/passthrough/vtd/iommu.c      |  8 ++++----
 xen/include/public/memory.h              |  2 ++
 xen/include/xen/numa.h                   |  5 ++---
 13 files changed, 39 insertions(+), 38 deletions(-)

diff --git a/xen/arch/x86/hpet.c b/xen/arch/x86/hpet.c
index 8f36f6f..3b6d12f 100644
--- a/xen/arch/x86/hpet.c
+++ b/xen/arch/x86/hpet.c
@@ -375,7 +375,7 @@ static int __init hpet_assign_irq(struct hpet_event_channel *ch)
 {
     int irq;
 
-    if ( (irq = create_irq(NUMA_NO_NODE)) < 0 )
+    if ( (irq = create_irq(XEN_NUMA_NO_NODE)) < 0 )
         return irq;
 
     ch->msi.irq = irq;
diff --git a/xen/arch/x86/irq.c b/xen/arch/x86/irq.c
index 786d1fc..deb67d7 100644
--- a/xen/arch/x86/irq.c
+++ b/xen/arch/x86/irq.c
@@ -173,7 +173,7 @@ int create_irq(nodeid_t node)
     {
         cpumask_t *mask = NULL;
 
-        if ( node != NUMA_NO_NODE )
+        if ( node != XEN_NUMA_NO_NODE )
         {
             mask = &node_to_cpumask(node);
             if (cpumask_empty(mask))
@@ -2000,7 +2000,7 @@ int map_domain_pirq(
             spin_unlock_irqrestore(&desc->lock, flags);
 
             info = NULL;
-            irq = create_irq(NUMA_NO_NODE);
+            irq = create_irq(XEN_NUMA_NO_NODE);
             ret = irq >= 0 ? prepare_domain_irq_pirq(d, irq, pirq + nr, &info)
                            : irq;
             if ( ret )
diff --git a/xen/arch/x86/numa.c b/xen/arch/x86/numa.c
index 132d694..6e1a0b8 100644
--- a/xen/arch/x86/numa.c
+++ b/xen/arch/x86/numa.c
@@ -37,13 +37,13 @@ unsigned long memnodemapsize;
 u8 *memnodemap;
 
 nodeid_t cpu_to_node[NR_CPUS] __read_mostly = {
-    [0 ... NR_CPUS-1] = NUMA_NO_NODE
+    [0 ... NR_CPUS-1] = XEN_NUMA_NO_NODE
 };
 /*
  * Keep BIOS's CPU2node information, should not be used for memory allocaion
  */
 nodeid_t apicid_to_node[MAX_LOCAL_APIC] __cpuinitdata = {
-    [0 ... MAX_LOCAL_APIC-1] = NUMA_NO_NODE
+    [0 ... MAX_LOCAL_APIC-1] = XEN_NUMA_NO_NODE
 };
 cpumask_t node_to_cpumask[MAX_NUMNODES] __read_mostly;
 
@@ -71,7 +71,7 @@ static int __init populate_memnodemap(const struct node *nodes,
     unsigned long spdx, epdx;
     int i, res = -1;
 
-    memset(memnodemap, NUMA_NO_NODE, memnodemapsize * sizeof(*memnodemap));
+    memset(memnodemap, XEN_NUMA_NO_NODE, memnodemapsize * sizeof(*memnodemap));
     for ( i = 0; i < numnodes; i++ )
     {
         spdx = paddr_to_pdx(nodes[i].start);
@@ -81,7 +81,7 @@ static int __init populate_memnodemap(const struct node *nodes,
         if ( (epdx >> shift) >= memnodemapsize )
             return 0;
         do {
-            if ( memnodemap[spdx >> shift] != NUMA_NO_NODE )
+            if ( memnodemap[spdx >> shift] != XEN_NUMA_NO_NODE )
                 return -1;
 
             if ( !nodeids )
@@ -199,7 +199,7 @@ void __init numa_init_array(void)
     rr = first_node(node_online_map);
     for ( i = 0; i < nr_cpu_ids; i++ )
     {
-        if ( cpu_to_node[i] != NUMA_NO_NODE )
+        if ( cpu_to_node[i] != XEN_NUMA_NO_NODE )
             continue;
         numa_set_node(i, rr);
         rr = next_node(rr, node_online_map);
@@ -350,7 +350,7 @@ void __init init_cpu_to_node(void)
         if ( apicid == BAD_APICID )
             continue;
         node = apicid_to_node[apicid];
-        if ( node == NUMA_NO_NODE || !node_online(node) )
+        if ( node == XEN_NUMA_NO_NODE || !node_online(node) )
             node = 0;
         numa_set_node(i, node);
     }
@@ -433,7 +433,7 @@ static void dump_numa(unsigned char key)
 
             err = snprintf(keyhandler_scratch, 12, "%3u",
                     vnuma->vnode_to_pnode[i]);
-            if ( err < 0 || vnuma->vnode_to_pnode[i] == NUMA_NO_NODE )
+            if ( err < 0 || vnuma->vnode_to_pnode[i] == XEN_NUMA_NO_NODE )
                 strlcpy(keyhandler_scratch, "???", sizeof(keyhandler_scratch));
 
             printk("       %3u: pnode %s,", i, keyhandler_scratch);
diff --git a/xen/arch/x86/physdev.c b/xen/arch/x86/physdev.c
index 1be1d50..a3a9564 100644
--- a/xen/arch/x86/physdev.c
+++ b/xen/arch/x86/physdev.c
@@ -146,7 +146,7 @@ int physdev_map_pirq(domid_t domid, int type, int *index, int *pirq_p,
         irq = *index;
         if ( irq == -1 )
     case MAP_PIRQ_TYPE_MULTI_MSI:
-            irq = create_irq(NUMA_NO_NODE);
+            irq = create_irq(XEN_NUMA_NO_NODE);
 
         if ( irq < nr_irqs_gsi || irq >= nr_irqs )
         {
diff --git a/xen/arch/x86/setup.c b/xen/arch/x86/setup.c
index 7593533..c4138ec 100644
--- a/xen/arch/x86/setup.c
+++ b/xen/arch/x86/setup.c
@@ -192,7 +192,7 @@ void __devinit srat_detect_node(int cpu)
     u32 apicid = x86_cpu_to_apicid[cpu];
 
     node = apicid_to_node[apicid];
-    if ( node == NUMA_NO_NODE )
+    if ( node == XEN_NUMA_NO_NODE )
         node = 0;
 
     node_set_online(node);
diff --git a/xen/arch/x86/smpboot.c b/xen/arch/x86/smpboot.c
index 314e253..168dd6e 100644
--- a/xen/arch/x86/smpboot.c
+++ b/xen/arch/x86/smpboot.c
@@ -879,7 +879,7 @@ int cpu_add(uint32_t apic_id, uint32_t acpi_id, uint32_t pxm)
     {
         nodeid_t node = setup_node(pxm);
 
-        if ( node == NUMA_NO_NODE )
+        if ( node == XEN_NUMA_NO_NODE )
         {
             dprintk(XENLOG_WARNING,
                     "Setup node failed for pxm %x\n", pxm);
diff --git a/xen/arch/x86/srat.c b/xen/arch/x86/srat.c
index dfabba3..ed91b99 100644
--- a/xen/arch/x86/srat.c
+++ b/xen/arch/x86/srat.c
@@ -33,7 +33,7 @@ struct pxm2node {
 	nodeid_t node;
 };
 static struct pxm2node __read_mostly pxm2node[MAX_NUMNODES] =
-	{ [0 ... MAX_NUMNODES - 1] = {.node = NUMA_NO_NODE} };
+	{ [0 ... MAX_NUMNODES - 1] = {.node = XEN_NUMA_NO_NODE} };
 
 static unsigned node_to_pxm(nodeid_t n);
 
@@ -44,7 +44,7 @@ static nodeid_t memblk_nodeid[NR_NODE_MEMBLKS];
 static inline bool_t node_found(unsigned idx, unsigned pxm)
 {
 	return ((pxm2node[idx].pxm == pxm) &&
-		(pxm2node[idx].node != NUMA_NO_NODE));
+		(pxm2node[idx].node != XEN_NUMA_NO_NODE));
 }
 
 nodeid_t pxm_to_node(unsigned pxm)
@@ -58,7 +58,7 @@ nodeid_t pxm_to_node(unsigned pxm)
 		if (node_found(i, pxm))
 			return pxm2node[i].node;
 
-	return NUMA_NO_NODE;
+	return XEN_NUMA_NO_NODE;
 }
 
 __devinit nodeid_t setup_node(unsigned pxm)
@@ -67,21 +67,21 @@ __devinit nodeid_t setup_node(unsigned pxm)
 	unsigned idx;
 	static bool_t warned;
 
-	BUILD_BUG_ON(MAX_NUMNODES >= NUMA_NO_NODE);
+	BUILD_BUG_ON(MAX_NUMNODES >= XEN_NUMA_NO_NODE);
 
 	if (pxm < ARRAY_SIZE(pxm2node)) {
 		if (node_found(pxm, pxm))
 			return pxm2node[pxm].node;
 
 		/* Try to maintain indexing of pxm2node by pxm */
-		if (pxm2node[pxm].node == NUMA_NO_NODE) {
+		if (pxm2node[pxm].node == XEN_NUMA_NO_NODE) {
 			idx = pxm;
 			goto finish;
 		}
 	}
 
 	for (idx = 0; idx < ARRAY_SIZE(pxm2node); idx++)
-		if (pxm2node[idx].node == NUMA_NO_NODE)
+		if (pxm2node[idx].node == XEN_NUMA_NO_NODE)
 			goto finish;
 
 	if (!warned) {
@@ -89,7 +89,7 @@ __devinit nodeid_t setup_node(unsigned pxm)
 		warned = 1;
 	}
 
-	return NUMA_NO_NODE;
+	return XEN_NUMA_NO_NODE;
 
  finish:
 	node = first_unset_node(nodes_found);
@@ -152,9 +152,9 @@ static __init void bad_srat(void)
 	printk(KERN_ERR "SRAT: SRAT not used.\n");
 	acpi_numa = -1;
 	for (i = 0; i < MAX_LOCAL_APIC; i++)
-		apicid_to_node[i] = NUMA_NO_NODE;
+		apicid_to_node[i] = XEN_NUMA_NO_NODE;
 	for (i = 0; i < ARRAY_SIZE(pxm2node); i++)
-		pxm2node[i].node = NUMA_NO_NODE;
+		pxm2node[i].node = XEN_NUMA_NO_NODE;
 	mem_hotplug = 0;
 }
 
@@ -218,7 +218,7 @@ acpi_numa_x2apic_affinity_init(struct acpi_srat_x2apic_cpu_affinity *pa)
 		return;
 	pxm = pa->proximity_domain;
 	node = setup_node(pxm);
-	if (node == NUMA_NO_NODE) {
+	if (node == XEN_NUMA_NO_NODE) {
 		printk(KERN_ERR "SRAT: Too many proximity domains %x\n", pxm);
 		bad_srat();
 		return;
@@ -253,7 +253,7 @@ acpi_numa_processor_affinity_init(struct acpi_srat_cpu_affinity *pa)
 		pxm |= pa->proximity_domain_hi[2] << 24;
 	}
 	node = setup_node(pxm);
-	if (node == NUMA_NO_NODE) {
+	if (node == XEN_NUMA_NO_NODE) {
 		printk(KERN_ERR "SRAT: Too many proximity domains %x\n", pxm);
 		bad_srat();
 		return;
@@ -298,7 +298,7 @@ acpi_numa_memory_affinity_init(struct acpi_srat_mem_affinity *ma)
 	if (srat_rev < 2)
 		pxm &= 0xff;
 	node = setup_node(pxm);
-	if (node == NUMA_NO_NODE) {
+	if (node == XEN_NUMA_NO_NODE) {
 		printk(KERN_ERR "SRAT: Too many proximity domains.\n");
 		bad_srat();
 		return;
@@ -475,10 +475,10 @@ int __init acpi_scan_nodes(u64 start, u64 end)
 		setup_node_bootmem(i, nodes[i].start, nodes[i].end);
 	}
 	for (i = 0; i < nr_cpu_ids; i++) {
-		if (cpu_to_node[i] == NUMA_NO_NODE)
+		if (cpu_to_node[i] == XEN_NUMA_NO_NODE)
 			continue;
 		if (!node_isset(cpu_to_node[i], processor_nodes_parsed))
-			numa_set_node(i, NUMA_NO_NODE);
+			numa_set_node(i, XEN_NUMA_NO_NODE);
 	}
 	numa_init_array();
 	return 0;
diff --git a/xen/arch/x86/x86_64/mm.c b/xen/arch/x86/x86_64/mm.c
index 6875c92..c6b7f7c 100644
--- a/xen/arch/x86/x86_64/mm.c
+++ b/xen/arch/x86/x86_64/mm.c
@@ -1354,7 +1354,7 @@ int memory_add(unsigned long spfn, unsigned long epfn, unsigned int pxm)
     if ( !mem_hotadd_check(spfn, epfn) )
         return -EINVAL;
 
-    if ( (node = setup_node(pxm)) == NUMA_NO_NODE )
+    if ( (node = setup_node(pxm)) == XEN_NUMA_NO_NODE )
         return -EINVAL;
 
     if ( !valid_numa_range(spfn << PAGE_SHIFT, epfn << PAGE_SHIFT, node) )
diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c
index d96d25b..f62ee74 100644
--- a/xen/common/page_alloc.c
+++ b/xen/common/page_alloc.c
@@ -591,7 +591,7 @@ static struct page_info *alloc_heap_pages(
     /* Make sure there are enough bits in memflags for nodeID. */
     BUILD_BUG_ON((_MEMF_bits - _MEMF_node) < (8 * sizeof(nodeid_t)));
 
-    if ( node == NUMA_NO_NODE )
+    if ( node == XEN_NUMA_NO_NODE )
     {
         memflags &= ~MEMF_exact_node;
         if ( d != NULL )
@@ -1291,7 +1291,7 @@ static void __init smp_scrub_heap_pages(void *data)
     else
     {
         node = cpu_to_node(cpu);
-        if ( node == NUMA_NO_NODE )
+        if ( node == XEN_NUMA_NO_NODE )
             return;
         r = &region[node];
     }
diff --git a/xen/drivers/passthrough/amd/iommu_init.c b/xen/drivers/passthrough/amd/iommu_init.c
index 56bda00..13fe7ae 100644
--- a/xen/drivers/passthrough/amd/iommu_init.c
+++ b/xen/drivers/passthrough/amd/iommu_init.c
@@ -782,7 +782,7 @@ static bool_t __init set_iommu_interrupt_handler(struct amd_iommu *iommu)
     unsigned long flags;
     u16 control;
 
-    irq = create_irq(NUMA_NO_NODE);
+    irq = create_irq(XEN_NUMA_NO_NODE);
     if ( irq <= 0 )
     {
         dprintk(XENLOG_ERR, "IOMMU: no irqs\n");
diff --git a/xen/drivers/passthrough/vtd/iommu.c b/xen/drivers/passthrough/vtd/iommu.c
index 1063677..d03b2a8 100644
--- a/xen/drivers/passthrough/vtd/iommu.c
+++ b/xen/drivers/passthrough/vtd/iommu.c
@@ -190,7 +190,7 @@ u64 alloc_pgtable_maddr(struct acpi_drhd_unit *drhd, unsigned long npages)
     struct acpi_rhsa_unit *rhsa;
     struct page_info *pg, *cur_pg;
     u64 *vaddr;
-    nodeid_t node = NUMA_NO_NODE;
+    nodeid_t node = XEN_NUMA_NO_NODE;
     unsigned int i;
 
     rhsa = drhd_to_rhsa(drhd);
@@ -198,7 +198,7 @@ u64 alloc_pgtable_maddr(struct acpi_drhd_unit *drhd, unsigned long npages)
         node =  pxm_to_node(rhsa->proximity_domain);
 
     pg = alloc_domheap_pages(NULL, get_order_from_pages(npages),
-                             (node == NUMA_NO_NODE) ? 0 : MEMF_node(node));
+                             (node == XEN_NUMA_NO_NODE) ? 0 : MEMF_node(node));
     if ( !pg )
         return 0;
 
@@ -1064,7 +1064,7 @@ static int __init iommu_set_interrupt(struct acpi_drhd_unit *drhd)
     struct irq_desc *desc;
 
     irq = create_irq(rhsa ? pxm_to_node(rhsa->proximity_domain)
-                          : NUMA_NO_NODE);
+                          : XEN_NUMA_NO_NODE);
     if ( irq <= 0 )
     {
         dprintk(XENLOG_ERR VTDPREFIX, "IOMMU: no irq available!\n");
@@ -1959,7 +1959,7 @@ static void adjust_irq_affinity(struct acpi_drhd_unit *drhd)
 {
     const struct acpi_rhsa_unit *rhsa = drhd_to_rhsa(drhd);
     unsigned int node = rhsa ? pxm_to_node(rhsa->proximity_domain)
-                             : NUMA_NO_NODE;
+                             : XEN_NUMA_NO_NODE;
     const cpumask_t *cpumask = &cpu_online_map;
 
     if ( node < MAX_NUMNODES && node_online(node) &&
diff --git a/xen/include/public/memory.h b/xen/include/public/memory.h
index 595f953..0d8c85f 100644
--- a/xen/include/public/memory.h
+++ b/xen/include/public/memory.h
@@ -29,6 +29,8 @@
 
 #include "xen.h"
 
+#define XEN_NUMA_NO_NODE    0xFF
+
 /*
  * Increase or decrease the specified domain's memory reservation. Returns the
  * number of extents successfully allocated or freed.
diff --git a/xen/include/xen/numa.h b/xen/include/xen/numa.h
index ac4b391..60daaa5 100644
--- a/xen/include/xen/numa.h
+++ b/xen/include/xen/numa.h
@@ -1,20 +1,19 @@
 #ifndef _XEN_NUMA_H
 #define _XEN_NUMA_H
 
+#include <public/memory.h>
 #include <asm/numa.h>
 
 #ifndef NODES_SHIFT
 #define NODES_SHIFT     0
 #endif
 
-#define NUMA_NO_NODE    0xFF
-
 #define MAX_NUMNODES    (1 << NODES_SHIFT)
 
 #define vcpu_to_node(v) (cpu_to_node((v)->processor))
 
 #define domain_to_node(d) \
   (((d)->vcpu != NULL && (d)->vcpu[0] != NULL) \
-   ? vcpu_to_node((d)->vcpu[0]) : NUMA_NO_NODE)
+   ? vcpu_to_node((d)->vcpu[0]) : XEN_NUMA_NO_NODE)
 
 #endif /* _XEN_NUMA_H */
-- 
1.9.1


* [PATCH v6 03/23] xen: make two memory hypercalls vNUMA-aware
  2015-02-26 15:55 [PATCH v6 00/23] Virtual NUMA for PV and HVM Wei Liu
  2015-02-26 15:55 ` [PATCH v6 01/23] xen: factor out construct_memop_from_reservation Wei Liu
  2015-02-26 15:55 ` [PATCH v6 02/23] xen: move NUMA_NO_NODE to public memory.h as XEN_NUMA_NO_NODE Wei Liu
@ 2015-02-26 15:55 ` Wei Liu
  2015-02-27 16:59   ` Jan Beulich
  2015-02-26 15:55 ` [PATCH v6 04/23] libxc: duplicate snippet to allocate p2m_host array Wei Liu
                   ` (19 subsequent siblings)
  22 siblings, 1 reply; 81+ messages in thread
From: Wei Liu @ 2015-02-26 15:55 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Wei Liu, Jan Beulich

Make XENMEM_increase_reservation and XENMEM_populate_physmap
vNUMA-aware.

That is, if a guest requests Xen to allocate memory for a specific vnode, Xen
can translate the vnode to a pnode using that guest's vNUMA information.

XENMEMF_vnode is introduced for the guest to indicate that the node number is
in fact a virtual node number and should be translated by Xen.

XENFEAT_memory_op_vnode_supported is introduced to indicate that Xen is able
to translate a virtual node to a physical node.
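
For illustration, a guest could then request memory from a given virtual
node roughly as follows (a hedged sketch, not part of this series; it
assumes the usual guest-side hypercall wrappers such as
HYPERVISOR_memory_op() and set_xen_guest_handle()):

    /* Populate one page, asking Xen to place it on the pnode backing
     * virtual node 1 of this domain. */
    xen_pfn_t pfn = 0x100000;
    struct xen_memory_reservation res = {
        .nr_extents   = 1,
        .extent_order = 0,
        .mem_flags    = XENMEMF_vnode | XENMEMF_exact_node(1),
        .domid        = DOMID_SELF,
    };

    set_xen_guest_handle(res.extent_start, &pfn);
    if ( HYPERVISOR_memory_op(XENMEM_populate_physmap, &res) != 1 )
        /* fall back to an allocation without a node constraint */;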

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Cc: Jan Beulich <JBeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
---
Changes in v6:
1. Add logic in construct_memop_from_reservation.
---
 xen/common/kernel.c           |  2 +-
 xen/common/memory.c           | 45 ++++++++++++++++++++++++++++++++++++-------
 xen/include/public/features.h |  3 +++
 xen/include/public/memory.h   |  2 ++
 4 files changed, 44 insertions(+), 8 deletions(-)

diff --git a/xen/common/kernel.c b/xen/common/kernel.c
index 0d9e519..e5e0050 100644
--- a/xen/common/kernel.c
+++ b/xen/common/kernel.c
@@ -301,7 +301,7 @@ DO(xen_version)(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
         switch ( fi.submap_idx )
         {
         case 0:
-            fi.submap = 0;
+            fi.submap = (1U << XENFEAT_memory_op_vnode_supported);
             if ( VM_ASSIST(d, VMASST_TYPE_pae_extended_cr3) )
                 fi.submap |= (1U << XENFEAT_pae_pgdir_above_4gb);
             if ( paging_mode_translate(current->domain) )
diff --git a/xen/common/memory.c b/xen/common/memory.c
index d24b001..9f8891b 100644
--- a/xen/common/memory.c
+++ b/xen/common/memory.c
@@ -692,7 +692,7 @@ out:
     return rc;
 }
 
-static int construct_memop_from_reservation(
+static int construct_memop_from_reservation(struct domain *d,
                const struct xen_memory_reservation *r,
                struct memop_args *a)
 {
@@ -716,9 +716,37 @@ static int construct_memop_from_reservation(
         a->memflags = MEMF_bits(address_bits);
     }
 
-    a->memflags |= MEMF_node(XENMEMF_get_node(r->mem_flags));
-    if ( r->mem_flags & XENMEMF_exact_node_request )
-        a->memflags |= MEMF_exact_node;
+    if ( r->mem_flags & XENMEMF_vnode )
+    {
+        unsigned int vnode, pnode;
+
+        read_lock(&d->vnuma_rwlock);
+        if ( d->vnuma )
+        {
+            vnode = XENMEMF_get_node(r->mem_flags);
+            if ( vnode >= d->vnuma->nr_vnodes )
+            {
+                rc = -EINVAL;
+                read_unlock(&d->vnuma_rwlock);
+                goto out;
+            }
+
+            pnode = d->vnuma->vnode_to_pnode[vnode];
+            if ( pnode != XEN_NUMA_NO_NODE )
+            {
+                a->memflags |= MEMF_node(pnode);
+                if ( r->mem_flags & XENMEMF_exact_node_request )
+                    a->memflags |= MEMF_exact_node;
+            }
+        }
+        read_unlock(&d->vnuma_rwlock);
+    }
+    else
+    {
+        a->memflags |= MEMF_node(XENMEMF_get_node(r->mem_flags));
+        if ( r->mem_flags & XENMEMF_exact_node_request )
+            a->memflags |= MEMF_exact_node;
+    }
 
     rc = 0;
  out:
@@ -753,9 +781,6 @@ long do_memory_op(unsigned long cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
         args.nr_done      = start_extent;
         args.preempted    = 0;
 
-        if ( construct_memop_from_reservation(&reservation, &args) )
-            return start_extent;
-
         if ( op == XENMEM_populate_physmap
              && (reservation.mem_flags & XENMEMF_populate_on_demand) )
             args.memflags |= MEMF_populate_on_demand;
@@ -765,6 +790,12 @@ long do_memory_op(unsigned long cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
             return start_extent;
         args.domain = d;
 
+        if ( construct_memop_from_reservation(d, &reservation, &args) )
+        {
+            rcu_unlock_domain(d);
+            return start_extent;
+        }
+
         if ( xsm_memory_adjust_reservation(XSM_TARGET, current->domain, d) )
         {
             rcu_unlock_domain(d);
diff --git a/xen/include/public/features.h b/xen/include/public/features.h
index 16d92aa..2110b04 100644
--- a/xen/include/public/features.h
+++ b/xen/include/public/features.h
@@ -99,6 +99,9 @@
 #define XENFEAT_grant_map_identity        12
  */
 
+/* Guest can use XENMEMF_vnode to specify virtual node for memory op. */
+#define XENFEAT_memory_op_vnode_supported 13
+
 #define XENFEAT_NR_SUBMAPS 1
 
 #endif /* __XEN_PUBLIC_FEATURES_H__ */
diff --git a/xen/include/public/memory.h b/xen/include/public/memory.h
index 0d8c85f..d71127f 100644
--- a/xen/include/public/memory.h
+++ b/xen/include/public/memory.h
@@ -57,6 +57,8 @@
 /* Flag to request allocation only from the node specified */
 #define XENMEMF_exact_node_request  (1<<17)
 #define XENMEMF_exact_node(n) (XENMEMF_node(n) | XENMEMF_exact_node_request)
+/* Flag to indicate the node specified is virtual node */
+#define XENMEMF_vnode  (1<<18)
 #endif
 
 struct xen_memory_reservation {
-- 
1.9.1


* [PATCH v6 04/23] libxc: duplicate snippet to allocate p2m_host array
  2015-02-26 15:55 [PATCH v6 00/23] Virtual NUMA for PV and HVM Wei Liu
                   ` (2 preceding siblings ...)
  2015-02-26 15:55 ` [PATCH v6 03/23] xen: make two memory hypercalls vNUMA-aware Wei Liu
@ 2015-02-26 15:55 ` Wei Liu
  2015-03-02 15:26   ` Ian Campbell
  2015-02-26 15:55 ` [PATCH v6 05/23] libxc: add p2m_size to xc_dom_image Wei Liu
                   ` (18 subsequent siblings)
  22 siblings, 1 reply; 81+ messages in thread
From: Wei Liu @ 2015-02-26 15:55 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson, Wei Liu, Ian Campbell

Currently no in-tree code sets the superpage flag, but Konrad wants it
retained for the moment.

As I'm going to change the p2m_host array allocation, duplicate the code
snippet that allocates the p2m_host array in this patch, so that we retain
the existing behaviour in the superpage case.

This patch introduces no functional change and will make a future patch
easier to review. I also removed one stray tab while I was there.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
CC: Konrad Wilk <konrad.wilk@oracle.com>
---
 tools/libxc/xc_dom_x86.c | 15 ++++++++++-----
 1 file changed, 10 insertions(+), 5 deletions(-)

diff --git a/tools/libxc/xc_dom_x86.c b/tools/libxc/xc_dom_x86.c
index bf06fe4..9dbaedb 100644
--- a/tools/libxc/xc_dom_x86.c
+++ b/tools/libxc/xc_dom_x86.c
@@ -772,15 +772,16 @@ int arch_setup_meminit(struct xc_dom_image *dom)
             return rc;
     }
 
-    dom->p2m_host = xc_dom_malloc(dom, sizeof(xen_pfn_t) * dom->total_pages);
-    if ( dom->p2m_host == NULL )
-        return -EINVAL;
-
     if ( dom->superpages )
     {
         int count = dom->total_pages >> SUPERPAGE_PFN_SHIFT;
         xen_pfn_t extents[count];
 
+        dom->p2m_host = xc_dom_malloc(dom, sizeof(xen_pfn_t) *
+                                      dom->total_pages);
+        if ( dom->p2m_host == NULL )
+            return -EINVAL;
+
         DOMPRINTF("Populating memory with %d superpages", count);
         for ( pfn = 0; pfn < count; pfn++ )
             extents[pfn] = pfn << SUPERPAGE_PFN_SHIFT;
@@ -809,9 +810,13 @@ int arch_setup_meminit(struct xc_dom_image *dom)
                 return rc;
         }
         /* setup initial p2m */
+        dom->p2m_host = xc_dom_malloc(dom, sizeof(xen_pfn_t) *
+                                      dom->total_pages);
+        if ( dom->p2m_host == NULL )
+            return -EINVAL;
         for ( pfn = 0; pfn < dom->total_pages; pfn++ )
             dom->p2m_host[pfn] = pfn;
-        
+
         /* allocate guest memory */
         for ( i = rc = allocsz = 0;
               (i < dom->total_pages) && !rc;
-- 
1.9.1


* [PATCH v6 05/23] libxc: add p2m_size to xc_dom_image
  2015-02-26 15:55 [PATCH v6 00/23] Virtual NUMA for PV and HVM Wei Liu
                   ` (3 preceding siblings ...)
  2015-02-26 15:55 ` [PATCH v6 04/23] libxc: duplicate snippet to allocate p2m_host array Wei Liu
@ 2015-02-26 15:55 ` Wei Liu
  2015-03-02 15:28   ` Ian Campbell
  2015-02-26 15:55 ` [PATCH v6 06/23] libxc: allocate memory with vNUMA information for PV guest Wei Liu
                   ` (17 subsequent siblings)
  22 siblings, 1 reply; 81+ messages in thread
From: Wei Liu @ 2015-02-26 15:55 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson, Wei Liu, Ian Campbell

Add a new field p2m_size to keep track of the number of pages covered by the
p2m.  Change total_pages to p2m_size in functions which in fact need the
size of the p2m.

This is needed because we are going to ditch the assumption that PV x86 has
only one contiguous RAM region. Originally the p2m size was always equal to
total_pages, but a later patch will change that: with multiple RAM regions
the p2m has to cover every pfn up to the end of the highest region, which
can be larger than total_pages.

This patch doesn't change the behaviour of libxc.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Dario Faggioli <dario.faggioli@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
---
 tools/libxc/include/xc_dom.h |  1 +
 tools/libxc/xc_dom_arm.c     |  1 +
 tools/libxc/xc_dom_core.c    |  8 ++++----
 tools/libxc/xc_dom_x86.c     | 19 +++++++++++--------
 4 files changed, 17 insertions(+), 12 deletions(-)

diff --git a/tools/libxc/include/xc_dom.h b/tools/libxc/include/xc_dom.h
index 07d7224..6b8ddf4 100644
--- a/tools/libxc/include/xc_dom.h
+++ b/tools/libxc/include/xc_dom.h
@@ -129,6 +129,7 @@ struct xc_dom_image {
      */
     xen_pfn_t rambase_pfn;
     xen_pfn_t total_pages;
+    xen_pfn_t p2m_size;         /* number of pfns covered by p2m */
     struct xc_dom_phys *phys_pages;
     int realmodearea_log;
 #if defined (__arm__) || defined(__aarch64__)
diff --git a/tools/libxc/xc_dom_arm.c b/tools/libxc/xc_dom_arm.c
index c7feca7..b9fa66d 100644
--- a/tools/libxc/xc_dom_arm.c
+++ b/tools/libxc/xc_dom_arm.c
@@ -449,6 +449,7 @@ int arch_setup_meminit(struct xc_dom_image *dom)
     assert(dom->rambank_size[0] != 0);
     assert(ramsize == 0); /* Too much RAM is rejected above */
 
+    dom->p2m_size = p2m_size;
     dom->p2m_host = xc_dom_malloc(dom, sizeof(xen_pfn_t) * p2m_size);
     if ( dom->p2m_host == NULL )
         return -EINVAL;
diff --git a/tools/libxc/xc_dom_core.c b/tools/libxc/xc_dom_core.c
index ecbf981..b100ce1 100644
--- a/tools/libxc/xc_dom_core.c
+++ b/tools/libxc/xc_dom_core.c
@@ -931,9 +931,9 @@ int xc_dom_update_guest_p2m(struct xc_dom_image *dom)
     {
     case 4:
         DOMPRINTF("%s: dst 32bit, pages 0x%" PRIpfn "",
-                  __FUNCTION__, dom->total_pages);
+                  __FUNCTION__, dom->p2m_size);
         p2m_32 = dom->p2m_guest;
-        for ( i = 0; i < dom->total_pages; i++ )
+        for ( i = 0; i < dom->p2m_size; i++ )
             if ( dom->p2m_host[i] != INVALID_P2M_ENTRY )
                 p2m_32[i] = dom->p2m_host[i];
             else
@@ -941,9 +941,9 @@ int xc_dom_update_guest_p2m(struct xc_dom_image *dom)
         break;
     case 8:
         DOMPRINTF("%s: dst 64bit, pages 0x%" PRIpfn "",
-                  __FUNCTION__, dom->total_pages);
+                  __FUNCTION__, dom->p2m_size);
         p2m_64 = dom->p2m_guest;
-        for ( i = 0; i < dom->total_pages; i++ )
+        for ( i = 0; i < dom->p2m_size; i++ )
             if ( dom->p2m_host[i] != INVALID_P2M_ENTRY )
                 p2m_64[i] = dom->p2m_host[i];
             else
diff --git a/tools/libxc/xc_dom_x86.c b/tools/libxc/xc_dom_x86.c
index 9dbaedb..bea54f2 100644
--- a/tools/libxc/xc_dom_x86.c
+++ b/tools/libxc/xc_dom_x86.c
@@ -122,11 +122,11 @@ static int count_pgtables(struct xc_dom_image *dom, int pae,
 
         try_pfn_end = (try_virt_end - dom->parms.virt_base) >> PAGE_SHIFT_X86;
 
-        if ( try_pfn_end > dom->total_pages )
+        if ( try_pfn_end > dom->p2m_size )
         {
             xc_dom_panic(dom->xch, XC_OUT_OF_MEMORY,
                          "%s: not enough memory for initial mapping (%#"PRIpfn" > %#"PRIpfn")",
-                         __FUNCTION__, try_pfn_end, dom->total_pages);
+                         __FUNCTION__, try_pfn_end, dom->p2m_size);
             return -ENOMEM;
         }
 
@@ -440,10 +440,11 @@ pfn_error:
 
 static int alloc_magic_pages(struct xc_dom_image *dom)
 {
-    size_t p2m_size = dom->total_pages * dom->arch_hooks->sizeof_pfn;
+    size_t p2m_alloc_size = dom->p2m_size * dom->arch_hooks->sizeof_pfn;
 
     /* allocate phys2mach table */
-    if ( xc_dom_alloc_segment(dom, &dom->p2m_seg, "phys2mach", 0, p2m_size) )
+    if ( xc_dom_alloc_segment(dom, &dom->p2m_seg, "phys2mach",
+                              0, p2m_alloc_size) )
         return -1;
     dom->p2m_guest = xc_dom_seg_to_ptr(dom, &dom->p2m_seg);
     if ( dom->p2m_guest == NULL )
@@ -777,8 +778,9 @@ int arch_setup_meminit(struct xc_dom_image *dom)
         int count = dom->total_pages >> SUPERPAGE_PFN_SHIFT;
         xen_pfn_t extents[count];
 
+        dom->p2m_size = dom->total_pages;
         dom->p2m_host = xc_dom_malloc(dom, sizeof(xen_pfn_t) *
-                                      dom->total_pages);
+                                      dom->p2m_size);
         if ( dom->p2m_host == NULL )
             return -EINVAL;
 
@@ -810,8 +812,9 @@ int arch_setup_meminit(struct xc_dom_image *dom)
                 return rc;
         }
         /* setup initial p2m */
+        dom->p2m_size = dom->total_pages;
         dom->p2m_host = xc_dom_malloc(dom, sizeof(xen_pfn_t) *
-                                      dom->total_pages);
+                                      dom->p2m_size);
         if ( dom->p2m_host == NULL )
             return -EINVAL;
         for ( pfn = 0; pfn < dom->total_pages; pfn++ )
@@ -860,7 +863,7 @@ static int map_grant_table_frames(struct xc_dom_image *dom)
     {
         rc = xc_domain_add_to_physmap(dom->xch, dom->guest_domid,
                                       XENMAPSPACE_grant_table,
-                                      i, dom->total_pages + i);
+                                      i, dom->p2m_size + i);
         if ( rc != 0 )
         {
             if ( (i > 0) && (errno == EINVAL) )
@@ -870,7 +873,7 @@ static int map_grant_table_frames(struct xc_dom_image *dom)
             }
             xc_dom_panic(dom->xch, XC_INTERNAL_ERROR,
                          "%s: mapping grant tables failed " "(pfn=0x%" PRIpfn
-                         ", rc=%d)", __FUNCTION__, dom->total_pages + i, rc);
+                         ", rc=%d)", __FUNCTION__, dom->p2m_size + i, rc);
             return rc;
         }
     }
-- 
1.9.1


* [PATCH v6 06/23] libxc: allocate memory with vNUMA information for PV guest
  2015-02-26 15:55 [PATCH v6 00/23] Virtual NUMA for PV and HVM Wei Liu
                   ` (4 preceding siblings ...)
  2015-02-26 15:55 ` [PATCH v6 05/23] libxc: add p2m_size to xc_dom_image Wei Liu
@ 2015-02-26 15:55 ` Wei Liu
  2015-03-02 15:36   ` Ian Campbell
  2015-02-26 15:55 ` [PATCH v6 07/23] libxl: introduce vNUMA types Wei Liu
                   ` (16 subsequent siblings)
  22 siblings, 1 reply; 81+ messages in thread
From: Wei Liu @ 2015-02-26 15:55 UTC (permalink / raw)
  To: xen-devel
  Cc: Ian Jackson, Dario Faggioli, Wei Liu, Ian Campbell, Elena Ufimtseva

From libxc's point of view, it only needs to know the vnode-to-pnode mapping
and the size of each vnode to allocate memory accordingly. Add these fields
to the xc_dom_image structure.

The caller might not pass in vNUMA information. In that case, a dummy
layout is generated for the convenience of libxc's allocation code. The
upper layer (libxl etc.) still sees the domain as having no vNUMA
configuration.

Note that with this patch a PV x86 guest can have multiple regions of RAM
allocated.
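
For illustration, a caller could describe a single 1GB vnode pinned to
pnode 0 roughly like this (a hedged sketch, similar in shape to the dummy
layout generated below; error handling omitted):

    dom->nr_vnodes         = 1;
    dom->vnode_to_pnode    = xc_dom_malloc(dom, sizeof(*dom->vnode_to_pnode));
    dom->vnode_to_pnode[0] = 0;             /* backed by pnode 0 */

    dom->nr_vmemranges       = 1;
    dom->vmemranges          = xc_dom_malloc(dom, sizeof(*dom->vmemranges));
    dom->vmemranges[0].start = 0;
    dom->vmemranges[0].end   = 1ULL << 30;  /* 1GB */
    dom->vmemranges[0].flags = 0;
    dom->vmemranges[0].nid   = 0;           /* belongs to vnode 0 */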

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Dario Faggioli <dario.faggioli@citrix.com>
Cc: Elena Ufimtseva <ufimtseva@gmail.com>
---
Changes in v6:
1. Ditch XC_VNUMA_NO_NODE and use XEN_NUMA_NO_NODE.
2. Update comment in xc_dom.h.

Changes in v5:
1. Ditch xc_vnuma_info.

Changes in v4:
1. Pack fields into a struct.
2. Use "page" as unit.
3. __FUNCTION__ -> __func__.
4. Don't print total_pages.
5. Improve comment.

Changes in v3:
1. Rewrite commit log.
2. Shorten some error messages.
---
 tools/libxc/include/xc_dom.h |  12 ++++-
 tools/libxc/xc_dom_x86.c     | 101 +++++++++++++++++++++++++++++++++++++------
 2 files changed, 97 insertions(+), 16 deletions(-)

diff --git a/tools/libxc/include/xc_dom.h b/tools/libxc/include/xc_dom.h
index 6b8ddf4..a7d059a 100644
--- a/tools/libxc/include/xc_dom.h
+++ b/tools/libxc/include/xc_dom.h
@@ -119,8 +119,10 @@ struct xc_dom_image {
 
     /* physical memory
      *
-     * An x86 PV guest has a single contiguous block of physical RAM,
-     * consisting of total_pages starting at rambase_pfn.
+     * An x86 PV guest has one or more blocks of physical RAM,
+     * consisting of total_pages starting at rambase_pfn. The start
+     * address and size of each block is controlled by vNUMA
+     * structures.
      *
      * An ARM guest has GUEST_RAM_BANKS regions of RAM, with
      * rambank_size[i] pages in each. The lowest RAM address
@@ -168,6 +170,12 @@ struct xc_dom_image {
     struct xc_dom_loader *kernel_loader;
     void *private_loader;
 
+    /* vNUMA information */
+    xen_vmemrange_t *vmemranges;
+    unsigned int nr_vmemranges;
+    unsigned int *vnode_to_pnode;
+    unsigned int nr_vnodes;
+
     /* kernel loader */
     struct xc_dom_arch *arch_hooks;
     /* allocate up to virt_alloc_end */
diff --git a/tools/libxc/xc_dom_x86.c b/tools/libxc/xc_dom_x86.c
index bea54f2..268d4db 100644
--- a/tools/libxc/xc_dom_x86.c
+++ b/tools/libxc/xc_dom_x86.c
@@ -760,7 +760,8 @@ static int x86_shadow(xc_interface *xch, domid_t domid)
 int arch_setup_meminit(struct xc_dom_image *dom)
 {
     int rc;
-    xen_pfn_t pfn, allocsz, i, j, mfn;
+    xen_pfn_t pfn, allocsz, mfn, total, pfn_base;
+    int i, j;
 
     rc = x86_compat(dom->xch, dom->guest_domid, dom->guest_type);
     if ( rc )
@@ -811,26 +812,98 @@ int arch_setup_meminit(struct xc_dom_image *dom)
             if ( rc )
                 return rc;
         }
-        /* setup initial p2m */
-        dom->p2m_size = dom->total_pages;
+
+        /* Setup dummy vNUMA information if it's not provided. Note
+         * that this is a valid state if libxl doesn't provide any
+         * vNUMA information.
+         *
+         * The dummy values make libxc allocate all pages from
+         * arbitrary physical nodes. This is the expected behaviour if
+         * no vNUMA configuration is provided to libxc.
+         *
+         * Note that the following hunk is just for the convenience of
+         * allocation code. No defaulting happens in libxc.
+         */
+        if ( dom->nr_vmemranges == 0 )
+        {
+            dom->nr_vmemranges = 1;
+            dom->vmemranges = xc_dom_malloc(dom, sizeof(*dom->vmemranges));
+            dom->vmemranges[0].start = 0;
+            dom->vmemranges[0].end   = dom->total_pages << PAGE_SHIFT;
+            dom->vmemranges[0].flags = 0;
+            dom->vmemranges[0].nid   = 0;
+
+            dom->nr_vnodes = 1;
+            dom->vnode_to_pnode = xc_dom_malloc(dom,
+                                      sizeof(*dom->vnode_to_pnode));
+            dom->vnode_to_pnode[0] = XEN_NUMA_NO_NODE;
+        }
+
+        total = dom->p2m_size = 0;
+        for ( i = 0; i < dom->nr_vmemranges; i++ )
+        {
+            total += ((dom->vmemranges[i].end - dom->vmemranges[i].start)
+                      >> PAGE_SHIFT);
+            dom->p2m_size =
+                dom->p2m_size > (dom->vmemranges[i].end >> PAGE_SHIFT) ?
+                dom->p2m_size : (dom->vmemranges[i].end >> PAGE_SHIFT);
+        }
+        if ( total != dom->total_pages )
+        {
+            xc_dom_panic(dom->xch, XC_INTERNAL_ERROR,
+                         "%s: vNUMA page count mismatch (0x%"PRIpfn" != 0x%"PRIpfn")\n",
+                         __func__, total, dom->total_pages);
+            return -EINVAL;
+        }
+
         dom->p2m_host = xc_dom_malloc(dom, sizeof(xen_pfn_t) *
                                       dom->p2m_size);
         if ( dom->p2m_host == NULL )
             return -EINVAL;
-        for ( pfn = 0; pfn < dom->total_pages; pfn++ )
-            dom->p2m_host[pfn] = pfn;
+        for ( pfn = 0; pfn < dom->p2m_size; pfn++ )
+            dom->p2m_host[pfn] = INVALID_P2M_ENTRY;
 
         /* allocate guest memory */
-        for ( i = rc = allocsz = 0;
-              (i < dom->total_pages) && !rc;
-              i += allocsz )
+        for ( i = 0; i < dom->nr_vmemranges; i++ )
         {
-            allocsz = dom->total_pages - i;
-            if ( allocsz > 1024*1024 )
-                allocsz = 1024*1024;
-            rc = xc_domain_populate_physmap_exact(
-                dom->xch, dom->guest_domid, allocsz,
-                0, 0, &dom->p2m_host[i]);
+            unsigned int memflags;
+            uint64_t pages;
+            unsigned int pnode = dom->vnode_to_pnode[dom->vmemranges[i].nid];
+
+            memflags = 0;
+            if ( pnode != XEN_NUMA_NO_NODE )
+                memflags |= XENMEMF_exact_node(pnode);
+
+            pages = (dom->vmemranges[i].end - dom->vmemranges[i].start)
+                >> PAGE_SHIFT;
+            pfn_base = dom->vmemranges[i].start >> PAGE_SHIFT;
+
+            for ( pfn = pfn_base; pfn < pfn_base+pages; pfn++ )
+                dom->p2m_host[pfn] = pfn;
+
+            for ( j = 0; j < pages; j += allocsz )
+            {
+                allocsz = pages - j;
+                if ( allocsz > 1024*1024 )
+                    allocsz = 1024*1024;
+
+                rc = xc_domain_populate_physmap_exact(dom->xch,
+                         dom->guest_domid, allocsz, 0, memflags,
+                         &dom->p2m_host[pfn_base+j]);
+
+                if ( rc )
+                {
+                    if ( pnode != XEN_NUMA_NO_NODE )
+                        xc_dom_panic(dom->xch, XC_INTERNAL_ERROR,
+                                     "%s: failed to allocate 0x%"PRIx64" pages (v=%d, p=%d)\n",
+                                     __func__, pages, i, pnode);
+                    else
+                        xc_dom_panic(dom->xch, XC_INTERNAL_ERROR,
+                                     "%s: failed to allocate 0x%"PRIx64" pages\n",
+                                     __func__, pages);
+                    return rc;
+                }
+            }
         }
 
         /* Ensure no unclaimed pages are left unused.
-- 
1.9.1


* [PATCH v6 07/23] libxl: introduce vNUMA types
  2015-02-26 15:55 [PATCH v6 00/23] Virtual NUMA for PV and HVM Wei Liu
                   ` (5 preceding siblings ...)
  2015-02-26 15:55 ` [PATCH v6 06/23] libxc: allocate memory with vNUMA information for PV guest Wei Liu
@ 2015-02-26 15:55 ` Wei Liu
  2015-02-26 15:55 ` [PATCH v6 08/23] libxl: add vmemrange to libxl__domain_build_state Wei Liu
                   ` (15 subsequent siblings)
  22 siblings, 0 replies; 81+ messages in thread
From: Wei Liu @ 2015-02-26 15:55 UTC (permalink / raw)
  To: xen-devel
  Cc: Ian Jackson, Dario Faggioli, Wei Liu, Ian Campbell, Elena Ufimtseva

A domain can contain several virtual NUMA nodes, hence we introduce an
array in libxl_domain_build_info.

libxl_vnode_info contains the size of memory in that node, the distances
from that node to every other node, the underlying pnode and a bitmap of
vcpus.
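
For illustration, a libxl caller might fill the array in roughly as follows
(a hedged sketch assuming the IDL-generated libxl_vnode_info_init() helper,
where ctx is the caller's libxl_ctx; error handling omitted):

    /* Two vnodes of 512MB each, both backed by pnode 0, with distance 10
     * locally and 20 remotely, and two vcpus per vnode. */
    b_info->num_vnuma_nodes = 2;
    b_info->vnuma_nodes = calloc(2, sizeof(*b_info->vnuma_nodes));
    for (unsigned int i = 0; i < 2; i++) {
        libxl_vnode_info *v = &b_info->vnuma_nodes[i];

        libxl_vnode_info_init(v);
        v->memkb = 512 * 1024;
        v->pnode = 0;
        v->num_distances = 2;
        v->distances = calloc(2, sizeof(*v->distances));
        v->distances[i]     = 10;   /* local */
        v->distances[1 - i] = 20;   /* remote */
        libxl_cpu_bitmap_alloc(ctx, &v->vcpus, 4);
        libxl_bitmap_set(&v->vcpus, i * 2);
        libxl_bitmap_set(&v->vcpus, i * 2 + 1);
    }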

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Dario Faggioli <dario.faggioli@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Dario Faggioli <dario.faggioli@citrix.com>
Cc: Elena Ufimtseva <ufimtseva@gmail.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
---
Changes in v4:
1. Use MemKB.

Changes in v3:
1. Add commit message.
---
 tools/libxl/libxl_types.idl | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
index 02be466..14c7e7c 100644
--- a/tools/libxl/libxl_types.idl
+++ b/tools/libxl/libxl_types.idl
@@ -356,6 +356,13 @@ libxl_domain_sched_params = Struct("domain_sched_params",[
     ("budget",       integer, {'init_val': 'LIBXL_DOMAIN_SCHED_PARAM_BUDGET_DEFAULT'}),
     ])
 
+libxl_vnode_info = Struct("vnode_info", [
+    ("memkb", MemKB),
+    ("distances", Array(uint32, "num_distances")), # distances from this node to other nodes
+    ("pnode", uint32), # physical node of this node
+    ("vcpus", libxl_bitmap), # vcpus in this node
+    ])
+
 libxl_domain_build_info = Struct("domain_build_info",[
     ("max_vcpus",       integer),
     ("avail_vcpus",     libxl_bitmap),
@@ -376,6 +383,8 @@ libxl_domain_build_info = Struct("domain_build_info",[
     ("disable_migrate", libxl_defbool),
     ("cpuid",           libxl_cpuid_policy_list),
     ("blkdev_start",    string),
+
+    ("vnuma_nodes", Array(libxl_vnode_info, "num_vnuma_nodes")),
     
     ("device_model_version", libxl_device_model_version),
     ("device_model_stubdomain", libxl_defbool),
-- 
1.9.1


* [PATCH v6 08/23] libxl: add vmemrange to libxl__domain_build_state
  2015-02-26 15:55 [PATCH v6 00/23] Virtual NUMA for PV and HVM Wei Liu
                   ` (6 preceding siblings ...)
  2015-02-26 15:55 ` [PATCH v6 07/23] libxl: introduce vNUMA types Wei Liu
@ 2015-02-26 15:55 ` Wei Liu
  2015-02-26 15:55 ` [PATCH v6 09/23] libxl: introduce libxl__vnuma_config_check Wei Liu
                   ` (14 subsequent siblings)
  22 siblings, 0 replies; 81+ messages in thread
From: Wei Liu @ 2015-02-26 15:55 UTC (permalink / raw)
  To: xen-devel
  Cc: Ian Jackson, Dario Faggioli, Wei Liu, Ian Campbell, Elena Ufimtseva

A vnode consists of one or more vmemranges (virtual memory ranges). One
example of multiple vmemranges is a vnode that contains a memory hole.

Currently we haven't exported the vmemrange interface to libxl users.
Vmemranges are generated during domain build, so the relevant structures
live in the domain build state.

Later if we discover we need to export the interface, those structures
can be moved to libxl_domain_build_info as well.

These new fields (along with other fields in that struct) are set to 0
at start of day so we don't need to explicitly initialise them. A
following patch which introduces an independent checking function will
need to access these fields. I don't feel very comfortable squashing
this change into that one so I didn't use a single commit.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Dario Faggioli <dario.faggioli@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Dario Faggioli <dario.faggioli@citrix.com>
Cc: Elena Ufimtseva <ufimtseva@gmail.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
---
Changes in v5:
1. Fix commit message.

Changes in v4:
1. Improve commit message.

Changes in v3:
1. Rewrite commit message.
---
 tools/libxl/libxl_internal.h | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index 934465a..6d3ac58 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -973,6 +973,9 @@ typedef struct {
     libxl__file_reference pv_ramdisk;
     const char * pv_cmdline;
     bool pvh_enabled;
+
+    xen_vmemrange_t *vmemranges;
+    uint32_t num_vmemranges;
 } libxl__domain_build_state;
 
 _hidden int libxl__build_pre(libxl__gc *gc, uint32_t domid,
-- 
1.9.1


* [PATCH v6 09/23] libxl: introduce libxl__vnuma_config_check
  2015-02-26 15:55 [PATCH v6 00/23] Virtual NUMA for PV and HVM Wei Liu
                   ` (7 preceding siblings ...)
  2015-02-26 15:55 ` [PATCH v6 08/23] libxl: add vmemrange to libxl__domain_build_state Wei Liu
@ 2015-02-26 15:55 ` Wei Liu
  2015-03-02 15:34   ` Ian Campbell
  2015-02-26 15:55 ` [PATCH v6 10/23] libxl: x86: factor out e820_host_sanitize Wei Liu
                   ` (13 subsequent siblings)
  22 siblings, 1 reply; 81+ messages in thread
From: Wei Liu @ 2015-02-26 15:55 UTC (permalink / raw)
  To: xen-devel
  Cc: Ian Jackson, Dario Faggioli, Wei Liu, Ian Campbell, Elena Ufimtseva

This function is used to check whether a vNUMA configuration (be it
auto-generated or supplied by the user) is valid.

Define a new error code ERROR_VNUMA_CONFIG_INVALID.

The checks performed are listed in the comment of the function.

This vNUMA function (and future ones) is placed in a new file called
libxl_vnuma.c.
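
For illustration, the intended call site (wired up by later patches in this
series) looks roughly like this hedged sketch, where gc, info and state are
the gc, build info and build state available at the call site:

    /* Reject the domain build if the vNUMA configuration is bogus. */
    rc = libxl__vnuma_config_check(gc, info, state);
    if (rc) goto out;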

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Dario Faggioli <dario.faggioli@citrix.com>
Cc: Elena Ufimtseva <ufimtseva@gmail.com>
---
Changes in v6:
1. Address comments from Andrew.
2. Check vdistances.
3. use libxl_numainfo_list_free.
4. Change p to v.

Changes in v5:
1. Define and use new error code.
2. Use LOG macro.
3. Fix hard tabs.

Changes in v4:
1. Adapt to new interface.

Changes in v3:
1. Rewrite commit log.
2. Shorten two error messages.
---
 tools/libxl/Makefile         |   2 +-
 tools/libxl/libxl_internal.h |   7 ++
 tools/libxl/libxl_types.idl  |   1 +
 tools/libxl/libxl_vnuma.c    | 151 +++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 160 insertions(+), 1 deletion(-)
 create mode 100644 tools/libxl/libxl_vnuma.c

diff --git a/tools/libxl/Makefile b/tools/libxl/Makefile
index 7329521..1b16598 100644
--- a/tools/libxl/Makefile
+++ b/tools/libxl/Makefile
@@ -93,7 +93,7 @@ LIBXL_LIBS += -lyajl
 LIBXL_OBJS = flexarray.o libxl.o libxl_create.o libxl_dm.o libxl_pci.o \
 			libxl_dom.o libxl_exec.o libxl_xshelp.o libxl_device.o \
 			libxl_internal.o libxl_utils.o libxl_uuid.o \
-			libxl_json.o libxl_aoutils.o libxl_numa.o \
+			libxl_json.o libxl_aoutils.o libxl_numa.o libxl_vnuma.o \
 			libxl_save_callout.o _libxl_save_msgs_callout.o \
 			libxl_qmp.o libxl_event.o libxl_fork.o $(LIBXL_OBJS-y)
 LIBXL_OBJS += libxl_genid.o
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index 6d3ac58..258be0d 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -3394,6 +3394,13 @@ void libxl__numa_candidate_put_nodemap(libxl__gc *gc,
     libxl_bitmap_copy(CTX, &cndt->nodemap, nodemap);
 }
 
+/* Check if vNUMA config is valid. Returns 0 if valid,
+ * ERROR_VNUMA_CONFIG_INVALID otherwise.
+ */
+int libxl__vnuma_config_check(libxl__gc *gc,
+                              const libxl_domain_build_info *b_info,
+                              const libxl__domain_build_state *state);
+
 _hidden int libxl__ms_vm_genid_set(libxl__gc *gc, uint32_t domid,
                                    const libxl_ms_vm_genid *id);
 
diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
index 14c7e7c..23951fc 100644
--- a/tools/libxl/libxl_types.idl
+++ b/tools/libxl/libxl_types.idl
@@ -63,6 +63,7 @@ libxl_error = Enumeration("error", [
     (-17, "DEVICE_EXISTS"),
     (-18, "REMUS_DEVOPS_DOES_NOT_MATCH"),
     (-19, "REMUS_DEVICE_NOT_SUPPORTED"),
+    (-20, "VNUMA_CONFIG_INVALID"),
     ], value_namespace = "")
 
 libxl_domain_type = Enumeration("domain_type", [
diff --git a/tools/libxl/libxl_vnuma.c b/tools/libxl/libxl_vnuma.c
new file mode 100644
index 0000000..33d7a3c
--- /dev/null
+++ b/tools/libxl/libxl_vnuma.c
@@ -0,0 +1,151 @@
+/*
+ * Copyright (C) 2014      Citrix Ltd.
+ * Author Wei Liu <wei.liu2@citrix.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU Lesser General Public License as published
+ * by the Free Software Foundation; version 2.1 only. with the special
+ * exception on linking described in file LICENSE.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU Lesser General Public License for more details.
+ */
+#include "libxl_osdeps.h" /* must come before any other headers */
+#include "libxl_internal.h"
+#include <stdlib.h>
+
+/* Sort vmemranges in ascending order with "start" */
+static int compare_vmemrange(const void *a, const void *b)
+{
+    const xen_vmemrange_t *x = a, *y = b;
+    if (x->start < y->start)
+        return -1;
+    if (x->start > y->start)
+        return 1;
+    return 0;
+}
+
+/* Check if vNUMA configuration is valid:
+ *  1. all pnodes inside vnode_to_pnode array are valid
+ *  2. each vcpu belongs to one and only one vnode
+ *  3. each vmemrange is valid and doesn't overlap with any other
+ *  4. local distance cannot be larger than remote distance
+ */
+int libxl__vnuma_config_check(libxl__gc *gc,
+                              const libxl_domain_build_info *b_info,
+                              const libxl__domain_build_state *state)
+{
+    int nr_nodes = 0, rc = ERROR_VNUMA_CONFIG_INVALID;
+    unsigned int i, j;
+    libxl_numainfo *ninfo = NULL;
+    uint64_t total_memkb = 0;
+    libxl_bitmap cpumap;
+    libxl_vnode_info *v;
+
+    libxl_bitmap_init(&cpumap);
+
+    /* Check pnode specified is valid */
+    ninfo = libxl_get_numainfo(CTX, &nr_nodes);
+    if (!ninfo) {
+        LOG(ERROR, "libxl_get_numainfo failed");
+        goto out;
+    }
+
+    for (i = 0; i < b_info->num_vnuma_nodes; i++) {
+        uint32_t pnode;
+
+        v = &b_info->vnuma_nodes[i];
+        pnode = v->pnode;
+
+        /* The pnode specified is not valid? */
+        if (pnode >= nr_nodes) {
+            LOG(ERROR, "Invalid pnode %u specified", pnode);
+            goto out;
+        }
+
+        total_memkb += v->memkb;
+    }
+
+    if (total_memkb != b_info->max_memkb) {
+        LOG(ERROR, "Amount of memory mismatch (0x%"PRIx64" != 0x%"PRIx64")",
+            total_memkb, b_info->max_memkb);
+        goto out;
+    }
+
+    /* Check vcpu mapping */
+    libxl_cpu_bitmap_alloc(CTX, &cpumap, b_info->max_vcpus);
+    for (i = 0; i < b_info->num_vnuma_nodes; i++) {
+        v = &b_info->vnuma_nodes[i];
+        libxl_for_each_set_bit(j, v->vcpus) {
+            if (!libxl_bitmap_test(&cpumap, j))
+                libxl_bitmap_set(&cpumap, j);
+            else {
+                LOG(ERROR, "Vcpu %d assigned more than once", j);
+                goto out;
+            }
+        }
+    }
+
+    for (i = 0; i < b_info->max_vcpus; i++) {
+        if (!libxl_bitmap_test(&cpumap, i)) {
+            LOG(ERROR, "Vcpu %d is not assigned to any vnode", i);
+            goto out;
+        }
+    }
+
+    /* Check vmemranges */
+    qsort(state->vmemranges, state->num_vmemranges, sizeof(xen_vmemrange_t),
+          compare_vmemrange);
+
+    for (i = 0; i < state->num_vmemranges; i++) {
+        if (state->vmemranges[i].end < state->vmemranges[i].start) {
+                LOG(ERROR, "Vmemrange end < start");
+                goto out;
+        }
+    }
+
+    for (i = 0; i < state->num_vmemranges - 1; i++) {
+        if (state->vmemranges[i].end > state->vmemranges[i+1].start) {
+            LOG(ERROR,
+                "Vmemranges overlapped, 0x%"PRIx64"-0x%"PRIx64", 0x%"PRIx64"-0x%"PRIx64,
+                state->vmemranges[i].start, state->vmemranges[i].end,
+                state->vmemranges[i+1].start, state->vmemranges[i+1].end);
+            goto out;
+        }
+    }
+
+    /* Check vdistances */
+    for (i = 0; i < b_info->num_vnuma_nodes; i++) {
+        uint32_t local_distance, remote_distance;
+
+        v = &b_info->vnuma_nodes[i];
+        local_distance = v->distances[i];
+
+        for (j = 0; j < v->num_distances; j++) {
+            if (i == j) continue;
+            remote_distance = v->distances[j];
+            if (local_distance > remote_distance) {
+                LOG(ERROR,
+                    "Distance from %u to %u smaller than %u's local distance",
+                    i, j, i);
+                goto out;
+            }
+        }
+    }
+
+    rc = 0;
+out:
+    libxl_numainfo_list_free(ninfo, nr_nodes);
+    libxl_bitmap_dispose(&cpumap);
+    return rc;
+}
+
+/*
+ * Local variables:
+ * mode: C
+ * c-basic-offset: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 81+ messages in thread

* [PATCH v6 10/23] libxl: x86: factor out e820_host_sanitize
  2015-02-26 15:55 [PATCH v6 00/23] Virtual NUMA for PV and HVM Wei Liu
                   ` (8 preceding siblings ...)
  2015-02-26 15:55 ` [PATCH v6 09/23] libxl: introduce libxl__vnuma_config_check Wei Liu
@ 2015-02-26 15:55 ` Wei Liu
  2015-02-26 15:55 ` [PATCH v6 11/23] libxl: functions to build vmemranges for PV guest Wei Liu
                   ` (12 subsequent siblings)
  22 siblings, 0 replies; 81+ messages in thread
From: Wei Liu @ 2015-02-26 15:55 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson, Wei Liu, Ian Campbell, Elena Ufimtseva

This function gets the machine E820 map and sanitizes it according to
the PV guest configuration.

This will be used in a later patch. No functional change is introduced
in this patch.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Dario Faggioli <dario.faggioli@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Elena Ufimtseva <ufimtseva@gmail.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
---
Changes in v4:
1. Use actual size of the map instead of using E820MAX.
---
 tools/libxl/libxl_x86.c | 32 +++++++++++++++++++++++---------
 1 file changed, 23 insertions(+), 9 deletions(-)

diff --git a/tools/libxl/libxl_x86.c b/tools/libxl/libxl_x86.c
index 9ceb373..d012b4d 100644
--- a/tools/libxl/libxl_x86.c
+++ b/tools/libxl/libxl_x86.c
@@ -207,6 +207,27 @@ static int e820_sanitize(libxl_ctx *ctx, struct e820entry src[],
     return 0;
 }
 
+static int e820_host_sanitize(libxl__gc *gc,
+                              libxl_domain_build_info *b_info,
+                              struct e820entry map[],
+                              uint32_t *nr)
+{
+    int rc;
+
+    rc = xc_get_machine_memory_map(CTX->xch, map, *nr);
+    if (rc < 0) {
+        errno = rc;
+        return ERROR_FAIL;
+    }
+
+    *nr = rc;
+
+    rc = e820_sanitize(CTX, map, nr, b_info->target_memkb,
+                       (b_info->max_memkb - b_info->target_memkb) +
+                       b_info->u.pv.slack_memkb);
+    return rc;
+}
+
 static int libxl__e820_alloc(libxl__gc *gc, uint32_t domid,
         libxl_domain_config *d_config)
 {
@@ -223,15 +244,8 @@ static int libxl__e820_alloc(libxl__gc *gc, uint32_t domid,
     if (!libxl_defbool_val(b_info->u.pv.e820_host))
         return ERROR_INVAL;
 
-    rc = xc_get_machine_memory_map(ctx->xch, map, E820MAX);
-    if (rc < 0) {
-        errno = rc;
-        return ERROR_FAIL;
-    }
-    nr = rc;
-    rc = e820_sanitize(ctx, map, &nr, b_info->target_memkb,
-                       (b_info->max_memkb - b_info->target_memkb) +
-                       b_info->u.pv.slack_memkb);
+    nr = E820MAX;
+    rc = e820_host_sanitize(gc, b_info, map, &nr);
     if (rc)
         return ERROR_FAIL;
 
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 81+ messages in thread

* [PATCH v6 11/23] libxl: functions to build vmemranges for PV guest
  2015-02-26 15:55 [PATCH v6 00/23] Virtual NUMA for PV and HVM Wei Liu
                   ` (9 preceding siblings ...)
  2015-02-26 15:55 ` [PATCH v6 10/23] libxl: x86: factor out e820_host_sanitize Wei Liu
@ 2015-02-26 15:55 ` Wei Liu
  2015-02-26 16:39   ` Dario Faggioli
  2015-03-02 15:41   ` Ian Campbell
  2015-02-26 15:55 ` [PATCH v6 12/23] libxl: build, check and pass vNUMA info to Xen " Wei Liu
                   ` (11 subsequent siblings)
  22 siblings, 2 replies; 81+ messages in thread
From: Wei Liu @ 2015-02-26 15:55 UTC (permalink / raw)
  To: xen-devel
  Cc: Ian Jackson, Dario Faggioli, Wei Liu, Ian Campbell, Elena Ufimtseva

Introduce an arch-independent routine to generate one vmemrange per
vnode. Also introduce arch-dependent routines for different
architectures, because part of the process is arch-specific -- ARM does
not yet have NUMA support and E820 is x86-only.

For x86 guests that care about the machine E820 map (i.e. with
e820_host=1), a vnode is further split into several vmemranges to
accommodate memory holes.  A few stubs for libxl_arm.c are created.
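
For illustration only (a standalone sketch, not part of the patch): in
the generic case each vnode becomes exactly one vmemrange, laid out
back to back starting from guest address 0. The sizes below are made
up.

    #include <inttypes.h>
    #include <stdio.h>

    int main(void)
    {
        /* Hypothetical vnode sizes in KiB. */
        const uint64_t memkb[] = { 1024 * 1024, 2048 * 1024 };
        uint64_t next = 0;

        for (unsigned int i = 0; i < sizeof(memkb) / sizeof(memkb[0]); i++) {
            uint64_t start = next, end = next + (memkb[i] << 10);

            printf("vmemrange %u: 0x%010" PRIx64 "-0x%010" PRIx64 " nid %u\n",
                   i, start, end, i);
            next = end;
        }
        return 0;
    }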

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Dario Faggioli <dario.faggioli@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Dario Faggioli <dario.faggioli@citrix.com>
Cc: Elena Ufimtseva <ufimtseva@gmail.com>
---
Changes in v5:
1. Allocate array all in one go.
2. Reverse the logic of vmemranges generation.

Changes in v4:
1. Adapt to new interface.
2. Address Ian Jackson's comments.

Changes in v3:
1. Rewrite commit log.
---
 tools/libxl/libxl_arch.h     |  6 ++++
 tools/libxl/libxl_arm.c      |  8 +++++
 tools/libxl/libxl_internal.h |  8 +++++
 tools/libxl/libxl_vnuma.c    | 41 +++++++++++++++++++++++++
 tools/libxl/libxl_x86.c      | 73 ++++++++++++++++++++++++++++++++++++++++++++
 5 files changed, 136 insertions(+)

diff --git a/tools/libxl/libxl_arch.h b/tools/libxl/libxl_arch.h
index d3bc136..e249048 100644
--- a/tools/libxl/libxl_arch.h
+++ b/tools/libxl/libxl_arch.h
@@ -27,4 +27,10 @@ int libxl__arch_domain_init_hw_description(libxl__gc *gc,
 int libxl__arch_domain_finalise_hw_description(libxl__gc *gc,
                                       libxl_domain_build_info *info,
                                       struct xc_dom_image *dom);
+
+/* build vNUMA vmemrange with arch specific information */
+int libxl__arch_vnuma_build_vmemrange(libxl__gc *gc,
+                                      uint32_t domid,
+                                      libxl_domain_build_info *b_info,
+                                      libxl__domain_build_state *state);
 #endif
diff --git a/tools/libxl/libxl_arm.c b/tools/libxl/libxl_arm.c
index 65a762b..7da254f 100644
--- a/tools/libxl/libxl_arm.c
+++ b/tools/libxl/libxl_arm.c
@@ -707,6 +707,14 @@ int libxl__arch_domain_finalise_hw_description(libxl__gc *gc,
     return 0;
 }
 
+int libxl__arch_vnuma_build_vmemrange(libxl__gc *gc,
+                                      uint32_t domid,
+                                      libxl_domain_build_info *info,
+                                      libxl__domain_build_state *state)
+{
+    return libxl__vnuma_build_vmemrange_pv_generic(gc, domid, info, state);
+}
+
 /*
  * Local variables:
  * mode: C
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index 258be0d..7d1e1cf 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -3400,6 +3400,14 @@ void libxl__numa_candidate_put_nodemap(libxl__gc *gc,
 int libxl__vnuma_config_check(libxl__gc *gc,
                               const libxl_domain_build_info *b_info,
                               const libxl__domain_build_state *state);
+int libxl__vnuma_build_vmemrange_pv_generic(libxl__gc *gc,
+                                            uint32_t domid,
+                                            libxl_domain_build_info *b_info,
+                                            libxl__domain_build_state *state);
+int libxl__vnuma_build_vmemrange_pv(libxl__gc *gc,
+                                    uint32_t domid,
+                                    libxl_domain_build_info *b_info,
+                                    libxl__domain_build_state *state);
 
 _hidden int libxl__ms_vm_genid_set(libxl__gc *gc, uint32_t domid,
                                    const libxl_ms_vm_genid *id);
diff --git a/tools/libxl/libxl_vnuma.c b/tools/libxl/libxl_vnuma.c
index 33d7a3c..04672b5 100644
--- a/tools/libxl/libxl_vnuma.c
+++ b/tools/libxl/libxl_vnuma.c
@@ -14,6 +14,7 @@
  */
 #include "libxl_osdeps.h" /* must come before any other headers */
 #include "libxl_internal.h"
+#include "libxl_arch.h"
 #include <stdlib.h>
 
 /* Sort vmemranges in ascending order with "start" */
@@ -142,6 +143,46 @@ out:
     return rc;
 }
 
+
+int libxl__vnuma_build_vmemrange_pv_generic(libxl__gc *gc,
+                                            uint32_t domid,
+                                            libxl_domain_build_info *b_info,
+                                            libxl__domain_build_state *state)
+{
+    int i;
+    uint64_t next;
+    xen_vmemrange_t *v = NULL;
+
+    /* Generate one vmemrange for each virtual node. */
+    GCREALLOC_ARRAY(v, b_info->num_vnuma_nodes);
+    next = 0;
+    for (i = 0; i < b_info->num_vnuma_nodes; i++) {
+        libxl_vnode_info *p = &b_info->vnuma_nodes[i];
+
+        v[i].start = next;
+        v[i].end = next + (p->memkb << 10);
+        v[i].flags = 0;
+        v[i].nid = i;
+
+        next = v[i].end;
+    }
+
+    state->vmemranges = v;
+    state->num_vmemranges = i;
+
+    return 0;
+}
+
+/* Build vmemranges for PV guest */
+int libxl__vnuma_build_vmemrange_pv(libxl__gc *gc,
+                                    uint32_t domid,
+                                    libxl_domain_build_info *b_info,
+                                    libxl__domain_build_state *state)
+{
+    assert(state->vmemranges == NULL);
+    return libxl__arch_vnuma_build_vmemrange(gc, domid, b_info, state);
+}
+
 /*
  * Local variables:
  * mode: C
diff --git a/tools/libxl/libxl_x86.c b/tools/libxl/libxl_x86.c
index d012b4d..d37cca1 100644
--- a/tools/libxl/libxl_x86.c
+++ b/tools/libxl/libxl_x86.c
@@ -339,6 +339,79 @@ int libxl__arch_domain_finalise_hw_description(libxl__gc *gc,
     return 0;
 }
 
+/* Return 0 on success, ERROR_* on failure. */
+int libxl__arch_vnuma_build_vmemrange(libxl__gc *gc,
+                                      uint32_t domid,
+                                      libxl_domain_build_info *b_info,
+                                      libxl__domain_build_state *state)
+{
+    int nid, nr_vmemrange, rc;
+    uint32_t nr_e820, e820_count;
+    struct e820entry map[E820MAX];
+    xen_vmemrange_t *vmemranges;
+
+    /* If e820_host is not set, call the generic function */
+    if (!(b_info->type == LIBXL_DOMAIN_TYPE_PV &&
+          libxl_defbool_val(b_info->u.pv.e820_host)))
+        return libxl__vnuma_build_vmemrange_pv_generic(gc, domid, b_info,
+                                                       state);
+
+    assert(state->vmemranges == NULL);
+
+    nr_e820 = E820MAX;
+    rc = e820_host_sanitize(gc, b_info, map, &nr_e820);
+    if (rc) goto out;
+
+    e820_count = 0;
+    nr_vmemrange = 0;
+    vmemranges = NULL;
+    for (nid = 0; nid < b_info->num_vnuma_nodes; nid++) {
+        libxl_vnode_info *p = &b_info->vnuma_nodes[nid];
+        uint64_t remaining_bytes = (p->memkb << 10), bytes;
+
+        while (remaining_bytes > 0) {
+            if (e820_count >= nr_e820) {
+                rc = ERROR_NOMEM;
+                goto out;
+            }
+
+            /* Skip non RAM region */
+            if (map[e820_count].type != E820_RAM) {
+                e820_count++;
+                continue;
+            }
+
+            GCREALLOC_ARRAY(vmemranges, nr_vmemrange+1);
+
+            bytes = map[e820_count].size >= remaining_bytes ?
+                remaining_bytes : map[e820_count].size;
+
+            vmemranges[nr_vmemrange].start = map[e820_count].addr;
+            vmemranges[nr_vmemrange].end = map[e820_count].addr + bytes;
+
+            if (map[e820_count].size >= remaining_bytes) {
+                map[e820_count].addr += bytes;
+                map[e820_count].size -= bytes;
+            } else {
+                e820_count++;
+            }
+
+            remaining_bytes -= bytes;
+
+            vmemranges[nr_vmemrange].flags = 0;
+            vmemranges[nr_vmemrange].nid = nid;
+            nr_vmemrange++;
+        }
+    }
+
+    state->vmemranges = vmemranges;
+    state->num_vmemranges = nr_vmemrange;
+
+    rc = 0;
+out:
+    return rc;
+}
+
 /*
  * Local variables:
  * mode: C
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 81+ messages in thread

* [PATCH v6 12/23] libxl: build, check and pass vNUMA info to Xen for PV guest
  2015-02-26 15:55 [PATCH v6 00/23] Virtual NUMA for PV and HVM Wei Liu
                   ` (10 preceding siblings ...)
  2015-02-26 15:55 ` [PATCH v6 11/23] libxl: functions to build vmemranges for PV guest Wei Liu
@ 2015-02-26 15:55 ` Wei Liu
  2015-02-26 15:55 ` [PATCH v6 13/23] libxc: indentation change to xc_hvm_build_x86.c Wei Liu
                   ` (10 subsequent siblings)
  22 siblings, 0 replies; 81+ messages in thread
From: Wei Liu @ 2015-02-26 15:55 UTC (permalink / raw)
  To: xen-devel
  Cc: Ian Jackson, Dario Faggioli, Wei Liu, Ian Campbell, Elena Ufimtseva

Transform the user-supplied vNUMA configuration into libxl internal
representations, and finally libxc representations. Check the validity
of the configuration along the way.
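
For illustration only (a standalone sketch of the flattening performed
by set_vnuma_info() in this patch, with made-up values): the per-vnode
distance arrays are packed row-major into a single vdistance array of
num_vnuma_nodes * num_vnuma_nodes entries, and each vcpu index maps to
its vnode.

    #include <stdio.h>

    #define NR_VNODES 2

    int main(void)
    {
        /* Hypothetical per-vnode distances as given in the guest config. */
        const unsigned int node_dist[NR_VNODES][NR_VNODES] = {
            { 10, 20 },
            { 20, 10 },
        };
        unsigned int vdistance[NR_VNODES * NR_VNODES];

        /* Row-major flattening; set_vnuma_info() does this with memcpy. */
        for (unsigned int i = 0; i < NR_VNODES; i++)
            for (unsigned int j = 0; j < NR_VNODES; j++)
                vdistance[i * NR_VNODES + j] = node_dist[i][j];

        for (unsigned int k = 0; k < NR_VNODES * NR_VNODES; k++)
            printf("vdistance[%u] = %u\n", k, vdistance[k]);
        return 0;
    }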

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Dario Faggioli <dario.faggioli@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Dario Faggioli <dario.faggioli@citrix.com>
Cc: Elena Ufimtseva <ufimtseva@gmail.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
---
Changes in v6:
1. Use "unsigned" for some variables.
2. Variable name: bit -> j.

Changes in v5:
1. Adapt to change of interface (ditching xc_vnuma_info).

Changes in v4:
1. Adapt to new interfaces.

Changes in v3:
1. Add more commit log.
---
 tools/libxl/libxl_dom.c | 77 +++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 77 insertions(+)

diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c
index a16d4a1..b58a19b 100644
--- a/tools/libxl/libxl_dom.c
+++ b/tools/libxl/libxl_dom.c
@@ -515,6 +515,51 @@ retry_transaction:
     return 0;
 }
 
+static int set_vnuma_info(libxl__gc *gc, uint32_t domid,
+                          const libxl_domain_build_info *info,
+                          const libxl__domain_build_state *state)
+{
+    int rc = 0;
+    unsigned int i, nr_vdistance;
+    unsigned int *vcpu_to_vnode, *vnode_to_pnode, *vdistance = NULL;
+
+    vcpu_to_vnode = libxl__calloc(gc, info->max_vcpus,
+                                  sizeof(unsigned int));
+    vnode_to_pnode = libxl__calloc(gc, info->num_vnuma_nodes,
+                                   sizeof(unsigned int));
+
+    nr_vdistance = info->num_vnuma_nodes * info->num_vnuma_nodes;
+    vdistance = libxl__calloc(gc, nr_vdistance, sizeof(unsigned int));
+
+    for (i = 0; i < info->num_vnuma_nodes; i++) {
+        libxl_vnode_info *v = &info->vnuma_nodes[i];
+        int j;
+
+        /* vnode to pnode mapping */
+        vnode_to_pnode[i] = v->pnode;
+
+        /* vcpu to vnode mapping */
+        libxl_for_each_set_bit(j, v->vcpus)
+            vcpu_to_vnode[j] = i;
+
+        /* node distances */
+        assert(info->num_vnuma_nodes == v->num_distances);
+        memcpy(vdistance + (i * info->num_vnuma_nodes),
+               v->distances,
+               v->num_distances * sizeof(unsigned int));
+    }
+
+    if (xc_domain_setvnuma(CTX->xch, domid, info->num_vnuma_nodes,
+                           state->num_vmemranges, info->max_vcpus,
+                           state->vmemranges, vdistance,
+                           vcpu_to_vnode, vnode_to_pnode) < 0) {
+        LOGE(ERROR, "xc_domain_setvnuma failed");
+        rc = ERROR_FAIL;
+    }
+
+    return rc;
+}
+
 int libxl__build_pv(libxl__gc *gc, uint32_t domid,
              libxl_domain_build_info *info, libxl__domain_build_state *state)
 {
@@ -572,6 +617,38 @@ int libxl__build_pv(libxl__gc *gc, uint32_t domid,
     dom->xenstore_domid = state->store_domid;
     dom->claim_enabled = libxl_defbool_val(info->claim_mode);
 
+    if (info->num_vnuma_nodes != 0) {
+        unsigned int i;
+
+        ret = libxl__vnuma_build_vmemrange_pv(gc, domid, info, state);
+        if (ret) {
+            LOGE(ERROR, "cannot build vmemranges");
+            goto out;
+        }
+        ret = libxl__vnuma_config_check(gc, info, state);
+        if (ret) goto out;
+
+        ret = set_vnuma_info(gc, domid, info, state);
+        if (ret) goto out;
+
+        dom->nr_vmemranges = state->num_vmemranges;
+        dom->vmemranges = xc_dom_malloc(dom, sizeof(*dom->vmemranges) *
+                                        dom->nr_vmemranges);
+
+        for (i = 0; i < dom->nr_vmemranges; i++) {
+            dom->vmemranges[i].start = state->vmemranges[i].start;
+            dom->vmemranges[i].end   = state->vmemranges[i].end;
+            dom->vmemranges[i].flags = state->vmemranges[i].flags;
+            dom->vmemranges[i].nid   = state->vmemranges[i].nid;
+        }
+
+        dom->nr_vnodes = info->num_vnuma_nodes;
+        dom->vnode_to_pnode = xc_dom_malloc(dom, sizeof(*dom->vnode_to_pnode) *
+                                            dom->nr_vnodes);
+        for (i = 0; i < info->num_vnuma_nodes; i++)
+            dom->vnode_to_pnode[i] = info->vnuma_nodes[i].pnode;
+    }
+
     if ( (ret = xc_dom_boot_xen_init(dom, ctx->xch, domid)) != 0 ) {
         LOGE(ERROR, "xc_dom_boot_xen_init failed");
         goto out;
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 81+ messages in thread

* [PATCH v6 13/23] libxc: indentation change to xc_hvm_build_x86.c
  2015-02-26 15:55 [PATCH v6 00/23] Virtual NUMA for PV and HVM Wei Liu
                   ` (11 preceding siblings ...)
  2015-02-26 15:55 ` [PATCH v6 12/23] libxl: build, check and pass vNUMA info to Xen " Wei Liu
@ 2015-02-26 15:55 ` Wei Liu
  2015-02-26 15:55 ` [PATCH v6 14/23] libxc: allocate memory with vNUMA information for HVM guest Wei Liu
                   ` (9 subsequent siblings)
  22 siblings, 0 replies; 81+ messages in thread
From: Wei Liu @ 2015-02-26 15:55 UTC (permalink / raw)
  To: xen-devel
  Cc: Ian Jackson, Dario Faggioli, Wei Liu, Ian Campbell, Elena Ufimtseva

Move a while loop in xc_hvm_build_x86.c one block to the right. No
functional change is introduced.

Functional changes will be introduced in the next patch.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Dario Faggioli <dario.faggioli@citrix.com>
Cc: Elena Ufimtseva <ufimtseva@gmail.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
---
 tools/libxc/xc_hvm_build_x86.c | 153 ++++++++++++++++++++++-------------------
 1 file changed, 81 insertions(+), 72 deletions(-)

diff --git a/tools/libxc/xc_hvm_build_x86.c b/tools/libxc/xc_hvm_build_x86.c
index c81a25b..ecc3224 100644
--- a/tools/libxc/xc_hvm_build_x86.c
+++ b/tools/libxc/xc_hvm_build_x86.c
@@ -353,98 +353,107 @@ static int setup_guest(xc_interface *xch,
     cur_pages = 0xc0;
     stat_normal_pages = 0xc0;
 
-    while ( (rc == 0) && (nr_pages > cur_pages) )
     {
-        /* Clip count to maximum 1GB extent. */
-        unsigned long count = nr_pages - cur_pages;
-        unsigned long max_pages = SUPERPAGE_1GB_NR_PFNS;
-
-        if ( count > max_pages )
-            count = max_pages;
-
-        cur_pfn = page_array[cur_pages];
-
-        /* Take care the corner cases of super page tails */
-        if ( ((cur_pfn & (SUPERPAGE_1GB_NR_PFNS-1)) != 0) &&
-             (count > (-cur_pfn & (SUPERPAGE_1GB_NR_PFNS-1))) )
-            count = -cur_pfn & (SUPERPAGE_1GB_NR_PFNS-1);
-        else if ( ((count & (SUPERPAGE_1GB_NR_PFNS-1)) != 0) &&
-                  (count > SUPERPAGE_1GB_NR_PFNS) )
-            count &= ~(SUPERPAGE_1GB_NR_PFNS - 1);
-
-        /* Attemp to allocate 1GB super page. Because in each pass we only
-         * allocate at most 1GB, we don't have to clip super page boundaries.
-         */
-        if ( ((count | cur_pfn) & (SUPERPAGE_1GB_NR_PFNS - 1)) == 0 &&
-             /* Check if there exists MMIO hole in the 1GB memory range */
-             !check_mmio_hole(cur_pfn << PAGE_SHIFT,
-                              SUPERPAGE_1GB_NR_PFNS << PAGE_SHIFT,
-                              mmio_start, mmio_size) )
+        while ( (rc == 0) && (nr_pages > cur_pages) )
         {
-            long done;
-            unsigned long nr_extents = count >> SUPERPAGE_1GB_SHIFT;
-            xen_pfn_t sp_extents[nr_extents];
-
-            for ( i = 0; i < nr_extents; i++ )
-                sp_extents[i] = page_array[cur_pages+(i<<SUPERPAGE_1GB_SHIFT)];
-
-            done = xc_domain_populate_physmap(xch, dom, nr_extents, SUPERPAGE_1GB_SHIFT,
-                                              pod_mode, sp_extents);
-
-            if ( done > 0 )
-            {
-                stat_1gb_pages += done;
-                done <<= SUPERPAGE_1GB_SHIFT;
-                cur_pages += done;
-                count -= done;
-            }
-        }
+            /* Clip count to maximum 1GB extent. */
+            unsigned long count = nr_pages - cur_pages;
+            unsigned long max_pages = SUPERPAGE_1GB_NR_PFNS;
 
-        if ( count != 0 )
-        {
-            /* Clip count to maximum 8MB extent. */
-            max_pages = SUPERPAGE_2MB_NR_PFNS * 4;
             if ( count > max_pages )
                 count = max_pages;
-            
-            /* Clip partial superpage extents to superpage boundaries. */
-            if ( ((cur_pfn & (SUPERPAGE_2MB_NR_PFNS-1)) != 0) &&
-                 (count > (-cur_pfn & (SUPERPAGE_2MB_NR_PFNS-1))) )
-                count = -cur_pfn & (SUPERPAGE_2MB_NR_PFNS-1);
-            else if ( ((count & (SUPERPAGE_2MB_NR_PFNS-1)) != 0) &&
-                      (count > SUPERPAGE_2MB_NR_PFNS) )
-                count &= ~(SUPERPAGE_2MB_NR_PFNS - 1); /* clip non-s.p. tail */
-
-            /* Attempt to allocate superpage extents. */
-            if ( ((count | cur_pfn) & (SUPERPAGE_2MB_NR_PFNS - 1)) == 0 )
+
+            cur_pfn = page_array[cur_pages];
+
+            /* Take care the corner cases of super page tails */
+            if ( ((cur_pfn & (SUPERPAGE_1GB_NR_PFNS-1)) != 0) &&
+                 (count > (-cur_pfn & (SUPERPAGE_1GB_NR_PFNS-1))) )
+                count = -cur_pfn & (SUPERPAGE_1GB_NR_PFNS-1);
+            else if ( ((count & (SUPERPAGE_1GB_NR_PFNS-1)) != 0) &&
+                      (count > SUPERPAGE_1GB_NR_PFNS) )
+                count &= ~(SUPERPAGE_1GB_NR_PFNS - 1);
+
+            /* Attemp to allocate 1GB super page. Because in each pass
+             * we only allocate at most 1GB, we don't have to clip
+             * super page boundaries.
+             */
+            if ( ((count | cur_pfn) & (SUPERPAGE_1GB_NR_PFNS - 1)) == 0 &&
+                 /* Check if there exists MMIO hole in the 1GB memory
+                  * range */
+                 !check_mmio_hole(cur_pfn << PAGE_SHIFT,
+                                  SUPERPAGE_1GB_NR_PFNS << PAGE_SHIFT,
+                                  mmio_start, mmio_size) )
             {
                 long done;
-                unsigned long nr_extents = count >> SUPERPAGE_2MB_SHIFT;
+                unsigned long nr_extents = count >> SUPERPAGE_1GB_SHIFT;
                 xen_pfn_t sp_extents[nr_extents];
 
                 for ( i = 0; i < nr_extents; i++ )
-                    sp_extents[i] = page_array[cur_pages+(i<<SUPERPAGE_2MB_SHIFT)];
+                    sp_extents[i] =
+                        page_array[cur_pages+(i<<SUPERPAGE_1GB_SHIFT)];
 
-                done = xc_domain_populate_physmap(xch, dom, nr_extents, SUPERPAGE_2MB_SHIFT,
+                done = xc_domain_populate_physmap(xch, dom, nr_extents,
+                                                  SUPERPAGE_1GB_SHIFT,
                                                   pod_mode, sp_extents);
 
                 if ( done > 0 )
                 {
-                    stat_2mb_pages += done;
-                    done <<= SUPERPAGE_2MB_SHIFT;
+                    stat_1gb_pages += done;
+                    done <<= SUPERPAGE_1GB_SHIFT;
                     cur_pages += done;
                     count -= done;
                 }
             }
-        }
 
-        /* Fall back to 4kB extents. */
-        if ( count != 0 )
-        {
-            rc = xc_domain_populate_physmap_exact(
-                xch, dom, count, 0, pod_mode, &page_array[cur_pages]);
-            cur_pages += count;
-            stat_normal_pages += count;
+            if ( count != 0 )
+            {
+                /* Clip count to maximum 8MB extent. */
+                max_pages = SUPERPAGE_2MB_NR_PFNS * 4;
+                if ( count > max_pages )
+                    count = max_pages;
+
+                /* Clip partial superpage extents to superpage
+                 * boundaries. */
+                if ( ((cur_pfn & (SUPERPAGE_2MB_NR_PFNS-1)) != 0) &&
+                     (count > (-cur_pfn & (SUPERPAGE_2MB_NR_PFNS-1))) )
+                    count = -cur_pfn & (SUPERPAGE_2MB_NR_PFNS-1);
+                else if ( ((count & (SUPERPAGE_2MB_NR_PFNS-1)) != 0) &&
+                          (count > SUPERPAGE_2MB_NR_PFNS) )
+                    count &= ~(SUPERPAGE_2MB_NR_PFNS - 1); /* clip non-s.p. tail */
+
+                /* Attempt to allocate superpage extents. */
+                if ( ((count | cur_pfn) & (SUPERPAGE_2MB_NR_PFNS - 1)) == 0 )
+                {
+                    long done;
+                    unsigned long nr_extents = count >> SUPERPAGE_2MB_SHIFT;
+                    xen_pfn_t sp_extents[nr_extents];
+
+                    for ( i = 0; i < nr_extents; i++ )
+                        sp_extents[i] =
+                            page_array[cur_pages+(i<<SUPERPAGE_2MB_SHIFT)];
+
+                    done = xc_domain_populate_physmap(xch, dom, nr_extents,
+                                                      SUPERPAGE_2MB_SHIFT,
+                                                      pod_mode, sp_extents);
+
+                    if ( done > 0 )
+                    {
+                        stat_2mb_pages += done;
+                        done <<= SUPERPAGE_2MB_SHIFT;
+                        cur_pages += done;
+                        count -= done;
+                    }
+                }
+            }
+
+            /* Fall back to 4kB extents. */
+            if ( count != 0 )
+            {
+                rc = xc_domain_populate_physmap_exact(
+                    xch, dom, count, 0, pod_mode, &page_array[cur_pages]);
+                cur_pages += count;
+                stat_normal_pages += count;
+            }
         }
     }
 
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 81+ messages in thread

* [PATCH v6 14/23] libxc: allocate memory with vNUMA information for HVM guest
  2015-02-26 15:55 [PATCH v6 00/23] Virtual NUMA for PV and HVM Wei Liu
                   ` (12 preceding siblings ...)
  2015-02-26 15:55 ` [PATCH v6 13/23] libxc: indentation change to xc_hvm_build_x86.c Wei Liu
@ 2015-02-26 15:55 ` Wei Liu
  2015-03-02 15:43   ` Ian Campbell
  2015-02-26 15:55 ` [PATCH v6 15/23] libxl: build, check and pass vNUMA info to Xen " Wei Liu
                   ` (8 subsequent siblings)
  22 siblings, 1 reply; 81+ messages in thread
From: Wei Liu @ 2015-02-26 15:55 UTC (permalink / raw)
  To: xen-devel
  Cc: Ian Jackson, Dario Faggioli, Wei Liu, Ian Campbell, Elena Ufimtseva

The algorithm is more or less the same as the one used for PV guests.
Libxc gets hold of the vnode-to-pnode mapping and the size of each
vnode, then allocates memory accordingly.

The function then returns the low memory end, the high memory end and
the MMIO start to the caller. Libxl needs those values to construct
vmemranges for that guest.
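
For illustration only (a standalone sketch with made-up numbers; the
authoritative values are computed in build_hvm_info() and merely copied
into args here): RAM that would collide with the MMIO hole is placed
above 4GiB, which is roughly how the three out parameters relate.

    #include <inttypes.h>
    #include <stdio.h>

    int main(void)
    {
        /* Hypothetical guest: 6 GiB of RAM, 512 MiB MMIO hole below 4 GiB. */
        uint64_t mem_size   = 6ULL << 30;
        uint64_t mmio_size  = 512ULL << 20;
        uint64_t mmio_start = (1ULL << 32) - mmio_size;
        uint64_t lowmem_end = mem_size, highmem_end = 0;

        if ( lowmem_end > mmio_start )
        {
            highmem_end = (1ULL << 32) + (lowmem_end - mmio_start);
            lowmem_end  = mmio_start;
        }

        printf("lowmem_end  = 0x%" PRIx64 "\n", lowmem_end);
        printf("highmem_end = 0x%" PRIx64 "\n", highmem_end);
        printf("mmio_start  = 0x%" PRIx64 "\n", mmio_start);
        return 0;
    }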

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Dario Faggioli <dario.faggioli@citrix.com>
Cc: Elena Ufimtseva <ufimtseva@gmail.com>
---
Changes in v6:
1. Use XEN_NUMA_NO_NODE.
2. Fix a minor bug discovered by Dario.

Changes in v5:
1. Use a better loop variable name vnid.

Changes in v4:
1. Adapt to new interface.
2. Shorten error message.
3. This patch includes only functional changes.

Changes in v3:
1. Rewrite commit log.
2. Add a few code comments.
---
 tools/libxc/include/xenguest.h |  11 +++++
 tools/libxc/xc_hvm_build_x86.c | 102 ++++++++++++++++++++++++++++++++++-------
 2 files changed, 97 insertions(+), 16 deletions(-)

diff --git a/tools/libxc/include/xenguest.h b/tools/libxc/include/xenguest.h
index 40bbac8..ff66cb1 100644
--- a/tools/libxc/include/xenguest.h
+++ b/tools/libxc/include/xenguest.h
@@ -230,6 +230,17 @@ struct xc_hvm_build_args {
     struct xc_hvm_firmware_module smbios_module;
     /* Whether to use claim hypercall (1 - enable, 0 - disable). */
     int claim_enabled;
+
+    /* vNUMA information*/
+    xen_vmemrange_t *vmemranges;
+    unsigned int nr_vmemranges;
+    unsigned int *vnode_to_pnode;
+    unsigned int nr_vnodes;
+
+    /* Out parameters  */
+    uint64_t lowmem_end;
+    uint64_t highmem_end;
+    uint64_t mmio_start;
 };
 
 /**
diff --git a/tools/libxc/xc_hvm_build_x86.c b/tools/libxc/xc_hvm_build_x86.c
index ecc3224..fba02fb 100644
--- a/tools/libxc/xc_hvm_build_x86.c
+++ b/tools/libxc/xc_hvm_build_x86.c
@@ -89,7 +89,8 @@ static int modules_init(struct xc_hvm_build_args *args,
 }
 
 static void build_hvm_info(void *hvm_info_page, uint64_t mem_size,
-                           uint64_t mmio_start, uint64_t mmio_size)
+                           uint64_t mmio_start, uint64_t mmio_size,
+                           struct xc_hvm_build_args *args)
 {
     struct hvm_info_table *hvm_info = (struct hvm_info_table *)
         (((unsigned char *)hvm_info_page) + HVM_INFO_OFFSET);
@@ -119,6 +120,10 @@ static void build_hvm_info(void *hvm_info_page, uint64_t mem_size,
     hvm_info->high_mem_pgend = highmem_end >> PAGE_SHIFT;
     hvm_info->reserved_mem_pgstart = ioreq_server_pfn(0);
 
+    args->lowmem_end = lowmem_end;
+    args->highmem_end = highmem_end;
+    args->mmio_start = mmio_start;
+
     /* Finish with the checksum. */
     for ( i = 0, sum = 0; i < hvm_info->length; i++ )
         sum += ((uint8_t *)hvm_info)[i];
@@ -244,7 +249,7 @@ static int setup_guest(xc_interface *xch,
                        char *image, unsigned long image_size)
 {
     xen_pfn_t *page_array = NULL;
-    unsigned long i, nr_pages = args->mem_size >> PAGE_SHIFT;
+    unsigned long i, vmemid, nr_pages = args->mem_size >> PAGE_SHIFT;
     unsigned long target_pages = args->mem_target >> PAGE_SHIFT;
     uint64_t mmio_start = (1ull << 32) - args->mmio_size;
     uint64_t mmio_size = args->mmio_size;
@@ -258,13 +263,13 @@ static int setup_guest(xc_interface *xch,
     xen_capabilities_info_t caps;
     unsigned long stat_normal_pages = 0, stat_2mb_pages = 0, 
         stat_1gb_pages = 0;
-    int pod_mode = 0;
+    unsigned int memflags = 0;
     int claim_enabled = args->claim_enabled;
     xen_pfn_t special_array[NR_SPECIAL_PAGES];
     xen_pfn_t ioreq_server_array[NR_IOREQ_SERVER_PAGES];
-
-    if ( nr_pages > target_pages )
-        pod_mode = XENMEMF_populate_on_demand;
+    uint64_t total_pages;
+    xen_vmemrange_t dummy_vmemrange;
+    unsigned int dummy_vnode_to_pnode;
 
     memset(&elf, 0, sizeof(elf));
     if ( elf_init(&elf, image, image_size) != 0 )
@@ -276,6 +281,43 @@ static int setup_guest(xc_interface *xch,
     v_start = 0;
     v_end = args->mem_size;
 
+    if ( nr_pages > target_pages )
+        memflags |= XENMEMF_populate_on_demand;
+
+    if ( args->nr_vmemranges == 0 )
+    {
+        /* Build dummy vnode information */
+        dummy_vmemrange.start = 0;
+        dummy_vmemrange.end   = args->mem_size;
+        dummy_vmemrange.flags = 0;
+        dummy_vmemrange.nid   = 0;
+        args->nr_vmemranges = 1;
+        args->vmemranges = &dummy_vmemrange;
+
+        dummy_vnode_to_pnode = XEN_NUMA_NO_NODE;
+        args->nr_vnodes = 1;
+        args->vnode_to_pnode = &dummy_vnode_to_pnode;
+    }
+    else
+    {
+        if ( nr_pages > target_pages )
+        {
+            PERROR("Cannot enable vNUMA and PoD at the same time");
+            goto error_out;
+        }
+    }
+
+    total_pages = 0;
+    for ( i = 0; i < args->nr_vmemranges; i++ )
+        total_pages += ((args->vmemranges[i].end - args->vmemranges[i].start)
+                        >> PAGE_SHIFT);
+    if ( total_pages != (args->mem_size >> PAGE_SHIFT) )
+    {
+        PERROR("vNUMA memory pages mismatch (0x%"PRIx64" != 0x%"PRIx64")",
+               total_pages, args->mem_size >> PAGE_SHIFT);
+        goto error_out;
+    }
+
     if ( xc_version(xch, XENVER_capabilities, &caps) != 0 )
     {
         PERROR("Could not get Xen capabilities");
@@ -320,7 +362,7 @@ static int setup_guest(xc_interface *xch,
         }
     }
 
-    if ( pod_mode )
+    if ( memflags & XENMEMF_populate_on_demand )
     {
         /*
          * Subtract VGA_HOLE_SIZE from target_pages for the VGA
@@ -349,15 +391,37 @@ static int setup_guest(xc_interface *xch,
      * ensure that we can be preempted and hence dom0 remains responsive.
      */
     rc = xc_domain_populate_physmap_exact(
-        xch, dom, 0xa0, 0, pod_mode, &page_array[0x00]);
-    cur_pages = 0xc0;
-    stat_normal_pages = 0xc0;
+        xch, dom, 0xa0, 0, memflags, &page_array[0x00]);
 
+    stat_normal_pages = 0;
+    for ( vmemid = 0; vmemid < args->nr_vmemranges; vmemid++ )
     {
-        while ( (rc == 0) && (nr_pages > cur_pages) )
+        unsigned int new_memflags = memflags;
+        uint64_t end_pages;
+        unsigned int vnode = args->vmemranges[vmemid].nid;
+        unsigned int pnode = args->vnode_to_pnode[vnode];
+
+        if ( pnode != XEN_NUMA_NO_NODE )
+            new_memflags |= XENMEMF_exact_node(pnode);
+
+        end_pages = args->vmemranges[vmemid].end >> PAGE_SHIFT;
+        /*
+         * Consider vga hole belongs to the vmemrange that covers
+         * 0xA0000-0xC0000. Note that 0x00000-0xA0000 is populated just
+         * before this loop.
+         */
+        if ( args->vmemranges[vmemid].start == 0 )
+        {
+            cur_pages = 0xc0;
+            stat_normal_pages += 0xc0;
+        }
+        else
+            cur_pages = args->vmemranges[vmemid].start >> PAGE_SHIFT;
+
+        while ( (rc == 0) && (end_pages > cur_pages) )
         {
             /* Clip count to maximum 1GB extent. */
-            unsigned long count = nr_pages - cur_pages;
+            unsigned long count = end_pages - cur_pages;
             unsigned long max_pages = SUPERPAGE_1GB_NR_PFNS;
 
             if ( count > max_pages )
@@ -394,7 +458,7 @@ static int setup_guest(xc_interface *xch,
 
                 done = xc_domain_populate_physmap(xch, dom, nr_extents,
                                                   SUPERPAGE_1GB_SHIFT,
-                                                  pod_mode, sp_extents);
+                                                  memflags, sp_extents);
 
                 if ( done > 0 )
                 {
@@ -434,7 +498,7 @@ static int setup_guest(xc_interface *xch,
 
                     done = xc_domain_populate_physmap(xch, dom, nr_extents,
                                                       SUPERPAGE_2MB_SHIFT,
-                                                      pod_mode, sp_extents);
+                                                      memflags, sp_extents);
 
                     if ( done > 0 )
                     {
@@ -450,11 +514,14 @@ static int setup_guest(xc_interface *xch,
             if ( count != 0 )
             {
                 rc = xc_domain_populate_physmap_exact(
-                    xch, dom, count, 0, pod_mode, &page_array[cur_pages]);
+                    xch, dom, count, 0, new_memflags, &page_array[cur_pages]);
                 cur_pages += count;
                 stat_normal_pages += count;
             }
         }
+
+        if ( rc != 0 )
+            break;
     }
 
     if ( rc != 0 )
@@ -478,7 +545,7 @@ static int setup_guest(xc_interface *xch,
               xch, dom, PAGE_SIZE, PROT_READ | PROT_WRITE,
               HVM_INFO_PFN)) == NULL )
         goto error_out;
-    build_hvm_info(hvm_info_page, v_end, mmio_start, mmio_size);
+    build_hvm_info(hvm_info_page, v_end, mmio_start, mmio_size, args);
     munmap(hvm_info_page, PAGE_SIZE);
 
     /* Allocate and clear special pages. */
@@ -617,6 +684,9 @@ int xc_hvm_build(xc_interface *xch, uint32_t domid,
             args.acpi_module.guest_addr_out;
         hvm_args->smbios_module.guest_addr_out = 
             args.smbios_module.guest_addr_out;
+        hvm_args->lowmem_end = args.lowmem_end;
+        hvm_args->highmem_end = args.highmem_end;
+        hvm_args->mmio_start = args.mmio_start;
     }
 
     free(image);
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 81+ messages in thread

* [PATCH v6 15/23] libxl: build, check and pass vNUMA info to Xen for HVM guest
  2015-02-26 15:55 [PATCH v6 00/23] Virtual NUMA for PV and HVM Wei Liu
                   ` (13 preceding siblings ...)
  2015-02-26 15:55 ` [PATCH v6 14/23] libxc: allocate memory with vNUMA information for HVM guest Wei Liu
@ 2015-02-26 15:55 ` Wei Liu
  2015-03-02 15:44   ` Ian Campbell
  2015-02-26 15:55 ` [PATCH v6 16/23] libxl: disallow memory relocation when vNUMA is enabled Wei Liu
                   ` (7 subsequent siblings)
  22 siblings, 1 reply; 81+ messages in thread
From: Wei Liu @ 2015-02-26 15:55 UTC (permalink / raw)
  To: xen-devel
  Cc: Ian Jackson, Dario Faggioli, Wei Liu, Ian Campbell, Elena Ufimtseva

Transform the user-supplied vNUMA configuration into libxl internal
representations, then libxc representations. Check validity along the
way.

Libxc is more involved in building vmemranges in the HVM case than in
the PV case. The building of vmemranges is placed after xc_hvm_build
returns, because it relies on memory hole information provided by
xc_hvm_build.
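
For illustration only (a standalone sketch of the carving done by
libxl__vnuma_build_vmemrange_hvm() in this patch, with made-up sizes):
a vnode whose memory straddles the hole ends up as two vmemranges, one
below the hole and one above 4GiB.

    #include <inttypes.h>
    #include <stdio.h>

    int main(void)
    {
        /* Hypothetical hole: [3.5 GiB, 4 GiB).  Two vnodes of 2 GiB each. */
        const uint64_t hole_start = 0xE0000000ULL, hole_end = 1ULL << 32;
        const uint64_t vnode_bytes[2] = { 2ULL << 30, 2ULL << 30 };
        uint64_t next = 0;

        for (unsigned int nid = 0; nid < 2; nid++) {
            uint64_t remaining = vnode_bytes[nid];

            while (remaining > 0) {
                uint64_t count = remaining;

                if (next >= hole_start && next < hole_end)
                    next = hole_end;
                if (next < hole_start && next + remaining >= hole_start)
                    count = hole_start - next;

                printf("vmemrange: 0x%010" PRIx64 "-0x%010" PRIx64 " nid %u\n",
                       next, next + count, nid);
                remaining -= count;
                next += count;
            }
        }
        return 0;
    }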

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Dario Faggioli <dario.faggioli@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Dario Faggioli <dario.faggioli@citrix.com>
Cc: Elena Ufimtseva <ufimtseva@gmail.com>
---
Changes in v6:
1. Fix a minor bug discovered by Dario.

Changes in v5:
1. Check vnode 0 is large enough to accommodate video ram.

Changes in v4:
1. Adapt to new interface.
2. Rename some variables.
3. Use GCREALLOC_ARRAY.

Changes in v3:
1. Rewrite commit log.
---
 tools/libxl/libxl_create.c   |  9 +++++++
 tools/libxl/libxl_dom.c      | 43 ++++++++++++++++++++++++++++++++++
 tools/libxl/libxl_internal.h |  5 ++++
 tools/libxl/libxl_vnuma.c    | 56 ++++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 113 insertions(+)

diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
index 98687bd..af04248 100644
--- a/tools/libxl/libxl_create.c
+++ b/tools/libxl/libxl_create.c
@@ -853,6 +853,15 @@ static void initiate_domain_create(libxl__egc *egc,
         goto error_out;
     }
 
+    /* Disallow PoD and vNUMA to be enabled at the same time because PoD
+     * pool is not vNUMA-aware yet.
+     */
+    if (pod_enabled && d_config->b_info.num_vnuma_nodes) {
+        ret = ERROR_INVAL;
+        LOG(ERROR, "Cannot enable PoD and vNUMA at the same time");
+        goto error_out;
+    }
+
     ret = libxl__domain_create_info_setdefault(gc, &d_config->c_info);
     if (ret) goto error_out;
 
diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c
index b58a19b..c1a409d 100644
--- a/tools/libxl/libxl_dom.c
+++ b/tools/libxl/libxl_dom.c
@@ -893,12 +893,55 @@ int libxl__build_hvm(libxl__gc *gc, uint32_t domid,
         goto out;
     }
 
+    if (info->num_vnuma_nodes != 0) {
+        int i;
+
+        args.nr_vmemranges = state->num_vmemranges;
+        args.vmemranges = libxl__malloc(gc, sizeof(*args.vmemranges) *
+                                        args.nr_vmemranges);
+
+        for (i = 0; i < args.nr_vmemranges; i++) {
+            args.vmemranges[i].start = state->vmemranges[i].start;
+            args.vmemranges[i].end   = state->vmemranges[i].end;
+            args.vmemranges[i].flags = state->vmemranges[i].flags;
+            args.vmemranges[i].nid   = state->vmemranges[i].nid;
+        }
+
+        /* Consider video ram belongs to vmemrange 0 -- just shrink it
+         * by the size of video ram.
+         */
+        if (((args.vmemranges[0].end - args.vmemranges[0].start) >> 10)
+            < info->video_memkb) {
+            LOG(ERROR, "vmemrange 0 too small to contain video ram");
+            goto out;
+        }
+
+        args.vmemranges[0].end -= (info->video_memkb << 10);
+
+        args.nr_vnodes = info->num_vnuma_nodes;
+        args.vnode_to_pnode = libxl__malloc(gc, sizeof(*args.vnode_to_pnode) *
+                                            args.nr_vnodes);
+        for (i = 0; i < args.nr_vnodes; i++)
+            args.vnode_to_pnode[i] = info->vnuma_nodes[i].pnode;
+    }
+
     ret = xc_hvm_build(ctx->xch, domid, &args);
     if (ret) {
         LOGEV(ERROR, ret, "hvm building failed");
         goto out;
     }
 
+    if (info->num_vnuma_nodes != 0) {
+        ret = libxl__vnuma_build_vmemrange_hvm(gc, domid, info, state, &args);
+        if (ret) {
+            LOGEV(ERROR, ret, "hvm build vmemranges failed");
+            goto out;
+        }
+        ret = libxl__vnuma_config_check(gc, info, state);
+        if (ret) goto out;
+        ret = set_vnuma_info(gc, domid, info, state);
+        if (ret) goto out;
+    }
     ret = hvm_build_set_params(ctx->xch, domid, info, state->store_port,
                                &state->store_mfn, state->console_port,
                                &state->console_mfn, state->store_domid,
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index 7d1e1cf..e93089a 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -3408,6 +3408,11 @@ int libxl__vnuma_build_vmemrange_pv(libxl__gc *gc,
                                     uint32_t domid,
                                     libxl_domain_build_info *b_info,
                                     libxl__domain_build_state *state);
+int libxl__vnuma_build_vmemrange_hvm(libxl__gc *gc,
+                                     uint32_t domid,
+                                     libxl_domain_build_info *b_info,
+                                     libxl__domain_build_state *state,
+                                     struct xc_hvm_build_args *args);
 
 _hidden int libxl__ms_vm_genid_set(libxl__gc *gc, uint32_t domid,
                                    const libxl_ms_vm_genid *id);
diff --git a/tools/libxl/libxl_vnuma.c b/tools/libxl/libxl_vnuma.c
index 04672b5..a0576ee 100644
--- a/tools/libxl/libxl_vnuma.c
+++ b/tools/libxl/libxl_vnuma.c
@@ -183,6 +183,62 @@ int libxl__vnuma_build_vmemrange_pv(libxl__gc *gc,
     return libxl__arch_vnuma_build_vmemrange(gc, domid, b_info, state);
 }
 
+/* Build vmemranges for HVM guest */
+int libxl__vnuma_build_vmemrange_hvm(libxl__gc *gc,
+                                     uint32_t domid,
+                                     libxl_domain_build_info *b_info,
+                                     libxl__domain_build_state *state,
+                                     struct xc_hvm_build_args *args)
+{
+    uint64_t hole_start, hole_end, next;
+    int nid, nr_vmemrange;
+    xen_vmemrange_t *vmemranges;
+
+    /* Derive vmemranges from vnode size and memory hole.
+     *
+     * Guest physical address space layout:
+     * [0, hole_start) [hole_start, hole_end) [hole_end, highmem_end)
+     */
+    hole_start = args->lowmem_end < args->mmio_start ?
+        args->lowmem_end : args->mmio_start;
+    hole_end = (args->mmio_start + args->mmio_size) > (1ULL << 32) ?
+        (args->mmio_start + args->mmio_size) : (1ULL << 32);
+
+    assert(state->vmemranges == NULL);
+
+    next = 0;
+    nr_vmemrange = 0;
+    vmemranges = NULL;
+    for (nid = 0; nid < b_info->num_vnuma_nodes; nid++) {
+        libxl_vnode_info *p = &b_info->vnuma_nodes[nid];
+        uint64_t remaining_bytes = p->memkb << 10;
+
+        while (remaining_bytes > 0) {
+            uint64_t count = remaining_bytes;
+
+            if (next >= hole_start && next < hole_end)
+                next = hole_end;
+            if ((next < hole_start) && (next + remaining_bytes >= hole_start))
+                count = hole_start - next;
+
+            GCREALLOC_ARRAY(vmemranges, nr_vmemrange+1);
+            vmemranges[nr_vmemrange].start = next;
+            vmemranges[nr_vmemrange].end = next + count;
+            vmemranges[nr_vmemrange].flags = 0;
+            vmemranges[nr_vmemrange].nid = nid;
+
+            nr_vmemrange++;
+            remaining_bytes -= count;
+            next += count;
+        }
+    }
+
+    state->vmemranges = vmemranges;
+    state->num_vmemranges = nr_vmemrange;
+
+    return 0;
+}
+
 /*
  * Local variables:
  * mode: C
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 81+ messages in thread

* [PATCH v6 16/23] libxl: disallow memory relocation when vNUMA is enabled
  2015-02-26 15:55 [PATCH v6 00/23] Virtual NUMA for PV and HVM Wei Liu
                   ` (14 preceding siblings ...)
  2015-02-26 15:55 ` [PATCH v6 15/23] libxl: build, check and pass vNUMA info to Xen " Wei Liu
@ 2015-02-26 15:55 ` Wei Liu
  2015-03-02 15:46   ` Ian Campbell
  2015-02-26 15:56 ` [PATCH v6 17/23] libxl: define LIBXL_HAVE_VNUMA Wei Liu
                   ` (6 subsequent siblings)
  22 siblings, 1 reply; 81+ messages in thread
From: Wei Liu @ 2015-02-26 15:55 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson, Wei Liu, Ian Campbell

Disallow memory relocation when vNUMA is enabled, because relocated
memory ends up off-node. Furthermore, even if we dynamically expanded
node coverage in hvmloader, low memory and high memory may reside on
different physical nodes, so blindly relocating low memory to high
memory would give us a sub-optimal configuration.

Introduce a function called libxl__vnuma_configured and use it.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Konrad Wilk <konrad.wilk@oracle.com>
---
Changes in v6:
1. Introduce a helper function.
---
 tools/libxl/libxl_dm.c       | 6 ++++--
 tools/libxl/libxl_internal.h | 1 +
 tools/libxl/libxl_vnuma.c    | 5 +++++
 3 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/tools/libxl/libxl_dm.c b/tools/libxl/libxl_dm.c
index 8599a6a..7b09512 100644
--- a/tools/libxl/libxl_dm.c
+++ b/tools/libxl/libxl_dm.c
@@ -1365,13 +1365,15 @@ void libxl__spawn_local_dm(libxl__egc *egc, libxl__dm_spawn_state *dmss)
                         libxl__sprintf(gc, "%s/hvmloader/bios", path),
                         "%s", libxl_bios_type_to_string(b_info->u.hvm.bios));
         /* Disable relocating memory to make the MMIO hole larger
-         * unless we're running qemu-traditional */
+         * unless we're running qemu-traditional and vNUMA is not
+         * configured. */
         libxl__xs_write(gc, XBT_NULL,
                         libxl__sprintf(gc,
                                        "%s/hvmloader/allow-memory-relocate",
                                        path),
                         "%d",
-                        b_info->device_model_version==LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN_TRADITIONAL);
+                        b_info->device_model_version==LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN_TRADITIONAL &&
+                        !libxl__vnuma_configured(b_info));
         free(path);
     }
 
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index e93089a..d04b6aa 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -3413,6 +3413,7 @@ int libxl__vnuma_build_vmemrange_hvm(libxl__gc *gc,
                                      libxl_domain_build_info *b_info,
                                      libxl__domain_build_state *state,
                                      struct xc_hvm_build_args *args);
+bool libxl__vnuma_configured(const libxl_domain_build_info *b_info);
 
 _hidden int libxl__ms_vm_genid_set(libxl__gc *gc, uint32_t domid,
                                    const libxl_ms_vm_genid *id);
diff --git a/tools/libxl/libxl_vnuma.c b/tools/libxl/libxl_vnuma.c
index a0576ee..6af3cde 100644
--- a/tools/libxl/libxl_vnuma.c
+++ b/tools/libxl/libxl_vnuma.c
@@ -17,6 +17,11 @@
 #include "libxl_arch.h"
 #include <stdlib.h>
 
+bool libxl__vnuma_configured(const libxl_domain_build_info *b_info)
+{
+    return b_info->num_vnuma_nodes != 0;
+}
+
 /* Sort vmemranges in ascending order with "start" */
 static int compare_vmemrange(const void *a, const void *b)
 {
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 81+ messages in thread

* [PATCH v6 17/23] libxl: define LIBXL_HAVE_VNUMA
  2015-02-26 15:55 [PATCH v6 00/23] Virtual NUMA for PV and HVM Wei Liu
                   ` (15 preceding siblings ...)
  2015-02-26 15:55 ` [PATCH v6 16/23] libxl: disallow memory relocation when vNUMA is enabled Wei Liu
@ 2015-02-26 15:56 ` Wei Liu
  2015-02-27 13:46   ` Dario Faggioli
  2015-02-26 15:56 ` [PATCH v6 18/23] libxlu: rework internal representation of setting Wei Liu
                   ` (5 subsequent siblings)
  22 siblings, 1 reply; 81+ messages in thread
From: Wei Liu @ 2015-02-26 15:56 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson, Wei Liu, Ian Campbell

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
---
Changes in v6:
1. Better description of the macro.
---
 tools/libxl/libxl.h | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h
index c219f59..c668c04 100644
--- a/tools/libxl/libxl.h
+++ b/tools/libxl/libxl.h
@@ -67,6 +67,13 @@
  * the same $(XEN_VERSION) (e.g. throughout a major release).
  */
 
+/* LIBXL_HAVE_VNUMA
+ *
+ * If this is defined the type libxl_vnode_info exists, and a
+ * field 'vnuma_nodes' is present in libxl_domain_build_info.
+ */
+#define LIBXL_HAVE_VNUMA 1
+
 /* LIBXL_HAVE_USERDATA_UNLINK
  *
  * If it is defined, libxl has a library function called
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 81+ messages in thread

* [PATCH v6 18/23] libxlu: rework internal representation of setting
  2015-02-26 15:55 [PATCH v6 00/23] Virtual NUMA for PV and HVM Wei Liu
                   ` (16 preceding siblings ...)
  2015-02-26 15:56 ` [PATCH v6 17/23] libxl: define LIBXL_HAVE_VNUMA Wei Liu
@ 2015-02-26 15:56 ` Wei Liu
  2015-02-26 15:56 ` [PATCH v6 19/23] libxlu: nested list support Wei Liu
                   ` (4 subsequent siblings)
  22 siblings, 0 replies; 81+ messages in thread
From: Wei Liu @ 2015-02-26 15:56 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson, Wei Liu, Ian Campbell

This patch does the following things:

1. Properly define a XLU_ConfigList type. Originally it was defined to
   be XLU_ConfigSetting.
2. Define XLU_ConfigValue type, which can be either a string or a list
   of XLU_ConfigValue.
3. ConfigSetting now references XLU_ConfigValue. Originally it only
   worked with **string.
4. Properly construct list where necessary, see changes to .y file.

To achieve above changes:

1. xlu__cfg_set_mk and xlu__cfg_set_add are deleted, because they
   are no longer needed in the new code.
2. Introduce xlu__cfg_string_mk to make a XLU_ConfigSetting that points
   to a XLU_ConfigValue that wraps a string.
3. Introduce xlu__cfg_list_mk to make a XLU_ConfigSetting that points
   to XLU_ConfigValue that is a list.
4. The parser now generates XLU_ConfigValue instead of XLU_ConfigSetting
   when constructing values, which enables us to recursively generate
   lists of lists.
5. XLU_ConfigSetting is generated in xlu__cfg_set_store.
6. Adapt other functions to use new types.

No change to the public API. xl compiles without problems and 'xl create -n
guest.cfg' is valgrind clean.

This patch is needed because we're going to implement nested list
support, which requires support for lists of lists.
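
For orientation only (a sketch; the exact definitions live in
libxlu_internal.h and may differ in detail), the value type used by
libxlu_cfg.c below is a tagged union that is recursive through its
list arm:

    typedef struct XLU_ConfigValue XLU_ConfigValue;

    typedef struct XLU_ConfigList {
        int nvalues;              /* number of values used */
        int avalues;              /* number of values allocated */
        XLU_ConfigValue **values;
    } XLU_ConfigList;

    struct XLU_ConfigValue {
        enum { XLU_STRING, XLU_LIST } type;
        union {
            char *string;
            XLU_ConfigList list;
        } u;
    };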

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
---
Changes in v5:
1. Use standard expanding-array pattern.
---
 tools/libxl/libxlu_cfg.c      | 170 ++++++++++++++++++++++++++++++------------
 tools/libxl/libxlu_cfg_i.h    |  12 ++-
 tools/libxl/libxlu_cfg_y.c    |  24 +++---
 tools/libxl/libxlu_cfg_y.h    |   2 +-
 tools/libxl/libxlu_cfg_y.y    |  14 ++--
 tools/libxl/libxlu_internal.h |  30 ++++++--
 6 files changed, 173 insertions(+), 79 deletions(-)

diff --git a/tools/libxl/libxlu_cfg.c b/tools/libxl/libxlu_cfg.c
index 22adcb0..f000eed 100644
--- a/tools/libxl/libxlu_cfg.c
+++ b/tools/libxl/libxlu_cfg.c
@@ -131,14 +131,28 @@ int xlu_cfg_readdata(XLU_Config *cfg, const char *data, int length) {
     return ctx.err;
 }
 
-void xlu__cfg_set_free(XLU_ConfigSetting *set) {
+void xlu__cfg_value_free(XLU_ConfigValue *value)
+{
     int i;
 
+    if (!value) return;
+
+    switch (value->type) {
+    case XLU_STRING:
+        free(value->u.string);
+        break;
+    case XLU_LIST:
+        for (i = 0; i < value->u.list.nvalues; i++)
+            xlu__cfg_value_free(value->u.list.values[i]);
+        free(value->u.list.values);
+    }
+    free(value);
+}
+
+void xlu__cfg_set_free(XLU_ConfigSetting *set) {
     if (!set) return;
     free(set->name);
-    for (i=0; i<set->nvalues; i++)
-        free(set->values[i]);
-    free(set->values);
+    xlu__cfg_value_free(set->value);
     free(set);
 }
 
@@ -173,7 +187,7 @@ static int find_atom(const XLU_Config *cfg, const char *n,
     set= find(cfg,n);
     if (!set) return ESRCH;
 
-    if (set->avalues!=1) {
+    if (set->value->type!=XLU_STRING) {
         if (!dont_warn)
             fprintf(cfg->report,
                     "%s:%d: warning: parameter `%s' is"
@@ -191,7 +205,7 @@ int xlu_cfg_get_string(const XLU_Config *cfg, const char *n,
     int e;
 
     e= find_atom(cfg,n,&set,dont_warn);  if (e) return e;
-    *value_r= set->values[0];
+    *value_r= set->value->u.string;
     return 0;
 }
 
@@ -202,7 +216,7 @@ int xlu_cfg_replace_string(const XLU_Config *cfg, const char *n,
 
     e= find_atom(cfg,n,&set,dont_warn);  if (e) return e;
     free(*value_r);
-    *value_r= strdup(set->values[0]);
+    *value_r= strdup(set->value->u.string);
     return 0;
 }
 
@@ -214,7 +228,7 @@ int xlu_cfg_get_long(const XLU_Config *cfg, const char *n,
     char *ep;
 
     e= find_atom(cfg,n,&set,dont_warn);  if (e) return e;
-    errno= 0; l= strtol(set->values[0], &ep, 0);
+    errno= 0; l= strtol(set->value->u.string, &ep, 0);
     e= errno;
     if (errno) {
         e= errno;
@@ -226,7 +240,7 @@ int xlu_cfg_get_long(const XLU_Config *cfg, const char *n,
                     cfg->config_source, set->lineno, n, strerror(e));
         return e;
     }
-    if (*ep || ep==set->values[0]) {
+    if (*ep || ep==set->value->u.string) {
         if (!dont_warn)
             fprintf(cfg->report,
                     "%s:%d: warning: parameter `%s' is not a valid number\n",
@@ -253,7 +267,7 @@ int xlu_cfg_get_list(const XLU_Config *cfg, const char *n,
                      XLU_ConfigList **list_r, int *entries_r, int dont_warn) {
     XLU_ConfigSetting *set;
     set= find(cfg,n);  if (!set) return ESRCH;
-    if (set->avalues==1) {
+    if (set->value->type!=XLU_LIST) {
         if (!dont_warn) {
             fprintf(cfg->report,
                     "%s:%d: warning: parameter `%s' is a single value"
@@ -262,8 +276,8 @@ int xlu_cfg_get_list(const XLU_Config *cfg, const char *n,
         }
         return EINVAL;
     }
-    if (list_r) *list_r= set;
-    if (entries_r) *entries_r= set->nvalues;
+    if (list_r) *list_r= &set->value->u.list;
+    if (entries_r) *entries_r= set->value->u.list.nvalues;
     return 0;
 }
 
@@ -290,72 +304,130 @@ int xlu_cfg_get_list_as_string_list(const XLU_Config *cfg, const char *n,
     return 0;
 }
 
-const char *xlu_cfg_get_listitem(const XLU_ConfigList *set, int entry) {
-    if (entry < 0 || entry >= set->nvalues) return 0;
-    return set->values[entry];
+const char *xlu_cfg_get_listitem(const XLU_ConfigList *list, int entry) {
+    if (entry < 0 || entry >= list->nvalues) return 0;
+    if (list->values[entry]->type != XLU_STRING) return 0;
+    return list->values[entry]->u.string;
 }
 
 
-XLU_ConfigSetting *xlu__cfg_set_mk(CfgParseContext *ctx,
-                                   int alloc, char *atom) {
-    XLU_ConfigSetting *set= 0;
+XLU_ConfigValue *xlu__cfg_string_mk(CfgParseContext *ctx, char *atom)
+{
+    XLU_ConfigValue *value = NULL;
 
     if (ctx->err) goto x;
-    assert(!!alloc == !!atom);
 
-    set= malloc(sizeof(*set));
-    if (!set) goto xe;
+    value = malloc(sizeof(*value));
+    if (!value) goto xe;
+    value->type = XLU_STRING;
+    value->u.string = atom;
+
+    return value;
 
-    set->name= 0; /* tbd */
-    set->avalues= alloc;
+ xe:
+    ctx->err= errno;
+ x:
+    free(value);
+    free(atom);
+    return NULL;
+}
 
-    if (!alloc) {
-        set->nvalues= 0;
-        set->values= 0;
-    } else {
-        set->values= malloc(sizeof(*set->values) * alloc);
-        if (!set->values) goto xe;
+XLU_ConfigValue *xlu__cfg_list_mk(CfgParseContext *ctx, char *atom)
+{
+    XLU_ConfigValue *value = NULL;
+    XLU_ConfigValue **values = NULL;
+    XLU_ConfigValue *val = NULL;
 
-        set->nvalues= 1;
-        set->values[0]= atom;
-    }
-    return set;
+    if (ctx->err) goto x;
+
+    val = malloc(sizeof(*val));
+    if (!val) goto xe;
+    val->type = XLU_STRING;
+    val->u.string = atom;
+
+    values = malloc(sizeof(*values));
+    if (!values) goto xe;
+    values[0] = val;
+
+    value = malloc(sizeof(*value));
+    if (!value) goto xe;
+    value->type = XLU_LIST;
+    value->u.list.nvalues = 1;
+    value->u.list.avalues = 1;
+    value->u.list.values = values;
+
+    return value;
 
  xe:
     ctx->err= errno;
  x:
-    free(set);
+    free(value);
+    free(values);
+    free(val);
     free(atom);
-    return 0;
+    return NULL;
 }
 
-void xlu__cfg_set_add(CfgParseContext *ctx, XLU_ConfigSetting *set,
-                      char *atom) {
+void xlu__cfg_list_append(CfgParseContext *ctx,
+                          XLU_ConfigValue *list,
+                          char *atom)
+{
+    XLU_ConfigValue *val = NULL;
     if (ctx->err) return;
 
     assert(atom);
+    assert(list->type == XLU_LIST);
 
-    if (set->nvalues >= set->avalues) {
+    if (list->u.list.nvalues >= list->u.list.avalues) {
         int new_avalues;
-        char **new_values;
-
-        if (set->avalues > INT_MAX / 100) { ctx->err= ERANGE; return; }
-        new_avalues= set->avalues * 4;
-        new_values= realloc(set->values,
-                            sizeof(*new_values) * new_avalues);
-        if (!new_values) { ctx->err= errno; free(atom); return; }
-        set->values= new_values;
-        set->avalues= new_avalues;
+        XLU_ConfigValue **new_values = NULL;
+
+        if (list->u.list.avalues > INT_MAX / 100) {
+            ctx->err = ERANGE;
+            free(atom);
+            return;
+        }
+
+        new_avalues = list->u.list.avalues * 4;
+        new_values  = realloc(list->u.list.values,
+                              sizeof(*new_values) * new_avalues);
+        if (!new_values) {
+            ctx->err = errno;
+            free(atom);
+            return;
+        }
+
+        list->u.list.avalues = new_avalues;
+        list->u.list.values  = new_values;
+    }
+
+    val = malloc(sizeof(*val));
+    if (!val) {
+        ctx->err = errno;
+        free(atom);
+        return;
     }
-    set->values[set->nvalues++]= atom;
+
+    val->type = XLU_STRING;
+    val->u.string = atom;
+    list->u.list.values[list->u.list.nvalues] = val;
+    list->u.list.nvalues++;
 }
 
 void xlu__cfg_set_store(CfgParseContext *ctx, char *name,
-                        XLU_ConfigSetting *set, int lineno) {
+                        XLU_ConfigValue *val, int lineno) {
+    XLU_ConfigSetting *set;
+
     if (ctx->err) return;
 
     assert(name);
+    set = malloc(sizeof(*set));
+    if (!set) {
+        ctx->err = errno;
+        return;
+    }
     set->name= name;
+    set->value = val;
     set->lineno= lineno;
     set->next= ctx->cfg->settings;
     ctx->cfg->settings= set;
diff --git a/tools/libxl/libxlu_cfg_i.h b/tools/libxl/libxlu_cfg_i.h
index 54d033c..b71e9fd 100644
--- a/tools/libxl/libxlu_cfg_i.h
+++ b/tools/libxl/libxlu_cfg_i.h
@@ -23,11 +23,15 @@
 #include "libxlu_cfg_y.h"
 
 void xlu__cfg_set_free(XLU_ConfigSetting *set);
-XLU_ConfigSetting *xlu__cfg_set_mk(CfgParseContext*, int alloc, char *atom);
-void xlu__cfg_set_add(CfgParseContext*, XLU_ConfigSetting *set, char *atom);
 void xlu__cfg_set_store(CfgParseContext*, char *name,
-                        XLU_ConfigSetting *set, int lineno);
-
+                        XLU_ConfigValue *val, int lineno);
+XLU_ConfigValue *xlu__cfg_string_mk(CfgParseContext *ctx,
+                                    char *atom);
+XLU_ConfigValue *xlu__cfg_list_mk(CfgParseContext *ctx, char *atom);
+void xlu__cfg_list_append(CfgParseContext *ctx,
+                          XLU_ConfigValue *list,
+                          char *atom);
+void xlu__cfg_value_free(XLU_ConfigValue *value);
 char *xlu__cfgl_strdup(CfgParseContext*, const char *src);
 char *xlu__cfgl_dequote(CfgParseContext*, const char *src);
 
diff --git a/tools/libxl/libxlu_cfg_y.c b/tools/libxl/libxlu_cfg_y.c
index 07b5a1d..eb3884f 100644
--- a/tools/libxl/libxlu_cfg_y.c
+++ b/tools/libxl/libxlu_cfg_y.c
@@ -126,7 +126,7 @@ typedef union YYSTYPE
 #line 25 "libxlu_cfg_y.y"
 
   char *string;
-  XLU_ConfigSetting *setting;
+  XLU_ConfigValue *value;
 
 
 
@@ -1148,7 +1148,7 @@ yydestruct (yymsg, yytype, yyvaluep, yylocationp, ctx)
 
 /* Line 1391 of yacc.c  */
 #line 43 "libxlu_cfg_y.y"
-	{ xlu__cfg_set_free((yyvaluep->setting)); };
+	{ xlu__cfg_value_free((yyvaluep->value)); };
 
 /* Line 1391 of yacc.c  */
 #line 1155 "libxlu_cfg_y.c"
@@ -1166,7 +1166,7 @@ yydestruct (yymsg, yytype, yyvaluep, yylocationp, ctx)
 
 /* Line 1391 of yacc.c  */
 #line 43 "libxlu_cfg_y.y"
-	{ xlu__cfg_set_free((yyvaluep->setting)); };
+	{ xlu__cfg_value_free((yyvaluep->value)); };
 
 /* Line 1391 of yacc.c  */
 #line 1173 "libxlu_cfg_y.c"
@@ -1175,7 +1175,7 @@ yydestruct (yymsg, yytype, yyvaluep, yylocationp, ctx)
 
 /* Line 1391 of yacc.c  */
 #line 43 "libxlu_cfg_y.y"
-	{ xlu__cfg_set_free((yyvaluep->setting)); };
+	{ xlu__cfg_value_free((yyvaluep->value)); };
 
 /* Line 1391 of yacc.c  */
 #line 1182 "libxlu_cfg_y.c"
@@ -1508,21 +1508,21 @@ yyreduce:
 
 /* Line 1806 of yacc.c  */
 #line 57 "libxlu_cfg_y.y"
-    { xlu__cfg_set_store(ctx,(yyvsp[(1) - (3)].string),(yyvsp[(3) - (3)].setting),(yylsp[(3) - (3)]).first_line); }
+    { xlu__cfg_set_store(ctx,(yyvsp[(1) - (3)].string),(yyvsp[(3) - (3)].value),(yylsp[(3) - (3)]).first_line); }
     break;
 
   case 12:
 
 /* Line 1806 of yacc.c  */
 #line 62 "libxlu_cfg_y.y"
-    { (yyval.setting)= xlu__cfg_set_mk(ctx,1,(yyvsp[(1) - (1)].string)); }
+    { (yyval.value)= xlu__cfg_string_mk(ctx,(yyvsp[(1) - (1)].string)); }
     break;
 
   case 13:
 
 /* Line 1806 of yacc.c  */
 #line 63 "libxlu_cfg_y.y"
-    { (yyval.setting)= (yyvsp[(3) - (4)].setting); }
+    { (yyval.value)= (yyvsp[(3) - (4)].value); }
     break;
 
   case 14:
@@ -1543,35 +1543,35 @@ yyreduce:
 
 /* Line 1806 of yacc.c  */
 #line 68 "libxlu_cfg_y.y"
-    { (yyval.setting)= xlu__cfg_set_mk(ctx,0,0); }
+    { (yyval.value)= xlu__cfg_list_mk(ctx,NULL); }
     break;
 
   case 17:
 
 /* Line 1806 of yacc.c  */
 #line 69 "libxlu_cfg_y.y"
-    { (yyval.setting)= (yyvsp[(1) - (1)].setting); }
+    { (yyval.value)= (yyvsp[(1) - (1)].value); }
     break;
 
   case 18:
 
 /* Line 1806 of yacc.c  */
 #line 70 "libxlu_cfg_y.y"
-    { (yyval.setting)= (yyvsp[(1) - (3)].setting); }
+    { (yyval.value)= (yyvsp[(1) - (3)].value); }
     break;
 
   case 19:
 
 /* Line 1806 of yacc.c  */
 #line 72 "libxlu_cfg_y.y"
-    { (yyval.setting)= xlu__cfg_set_mk(ctx,2,(yyvsp[(1) - (2)].string)); }
+    { (yyval.value)= xlu__cfg_list_mk(ctx,(yyvsp[(1) - (2)].string)); }
     break;
 
   case 20:
 
 /* Line 1806 of yacc.c  */
 #line 73 "libxlu_cfg_y.y"
-    { xlu__cfg_set_add(ctx,(yyvsp[(1) - (5)].setting),(yyvsp[(4) - (5)].string)); (yyval.setting)= (yyvsp[(1) - (5)].setting); }
+    { xlu__cfg_list_append(ctx,(yyvsp[(1) - (5)].value),(yyvsp[(4) - (5)].string)); (yyval.value)= (yyvsp[(1) - (5)].value); }
     break;
 
 
diff --git a/tools/libxl/libxlu_cfg_y.h b/tools/libxl/libxlu_cfg_y.h
index d7dfaf2..37e8213 100644
--- a/tools/libxl/libxlu_cfg_y.h
+++ b/tools/libxl/libxlu_cfg_y.h
@@ -54,7 +54,7 @@ typedef union YYSTYPE
 #line 25 "libxlu_cfg_y.y"
 
   char *string;
-  XLU_ConfigSetting *setting;
+  XLU_ConfigValue *value;
 
 
 
diff --git a/tools/libxl/libxlu_cfg_y.y b/tools/libxl/libxlu_cfg_y.y
index 5acd438..6848686 100644
--- a/tools/libxl/libxlu_cfg_y.y
+++ b/tools/libxl/libxlu_cfg_y.y
@@ -24,7 +24,7 @@
 
 %union {
   char *string;
-  XLU_ConfigSetting *setting;
+  XLU_ConfigValue *value;
 }
 
 %locations
@@ -39,8 +39,8 @@
 %type <string>            atom
 %destructor { free($$); } atom IDENT STRING NUMBER
 
-%type <setting>                         value valuelist values
-%destructor { xlu__cfg_set_free($$); }  value valuelist values
+%type <value>                             value valuelist values
+%destructor { xlu__cfg_value_free($$); }  value valuelist values
 
 %%
 
@@ -59,18 +59,18 @@ assignment: IDENT '=' value { xlu__cfg_set_store(ctx,$1,$3,@3.first_line); }
 endstmt: NEWLINE
  |      ';'
 
-value:  atom                         { $$= xlu__cfg_set_mk(ctx,1,$1); }
+value:  atom                         { $$= xlu__cfg_string_mk(ctx,$1); }
  |      '[' nlok valuelist ']'       { $$= $3; }
 
 atom:   STRING                   { $$= $1; }
  |      NUMBER                   { $$= $1; }
 
-valuelist: /* empty */           { $$= xlu__cfg_set_mk(ctx,0,0); }
+valuelist: /* empty */           { $$= xlu__cfg_list_mk(ctx,NULL); }
  |      values                  { $$= $1; }
  |      values ',' nlok         { $$= $1; }
 
-values: atom nlok                  { $$= xlu__cfg_set_mk(ctx,2,$1); }
- |      values ',' nlok atom nlok  { xlu__cfg_set_add(ctx,$1,$4); $$= $1; }
+values: atom nlok                  { $$= xlu__cfg_list_mk(ctx,$1); }
+ |      values ',' nlok atom nlok  { xlu__cfg_list_append(ctx,$1,$4); $$= $1; }
 
 nlok:
         /* nothing */
diff --git a/tools/libxl/libxlu_internal.h b/tools/libxl/libxlu_internal.h
index 7579158..092a17a 100644
--- a/tools/libxl/libxlu_internal.h
+++ b/tools/libxl/libxlu_internal.h
@@ -23,17 +23,35 @@
 #include <assert.h>
 #include <regex.h>
 
-#define XLU_ConfigList XLU_ConfigSetting
-
 #include "libxlutil.h"
 
-struct XLU_ConfigSetting { /* transparent */
+enum XLU_ConfigValueType {
+    XLU_STRING,
+    XLU_LIST,
+};
+
+typedef struct XLU_ConfigValue XLU_ConfigValue;
+
+typedef struct XLU_ConfigList {
+    int avalues; /* available slots */
+    int nvalues; /* actual occupied slots */
+    XLU_ConfigValue **values;
+} XLU_ConfigList;
+
+struct XLU_ConfigValue {
+    enum XLU_ConfigValueType type;
+    union {
+        char *string;
+        XLU_ConfigList list;
+    } u;
+};
+
+typedef struct XLU_ConfigSetting { /* transparent */
     struct XLU_ConfigSetting *next;
     char *name;
-    int nvalues, avalues; /* lists have avalues>1 */
-    char **values;
+    XLU_ConfigValue *value;
     int lineno;
-};
+} XLU_ConfigSetting;
 
 struct XLU_Config {
     XLU_ConfigSetting *settings;
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 81+ messages in thread

* [PATCH v6 19/23] libxlu: nested list support
  2015-02-26 15:55 [PATCH v6 00/23] Virtual NUMA for PV and HVM Wei Liu
                   ` (17 preceding siblings ...)
  2015-02-26 15:56 ` [PATCH v6 18/23] libxlu: rework internal representation of setting Wei Liu
@ 2015-02-26 15:56 ` Wei Liu
  2015-02-26 15:56 ` [PATCH v6 20/23] libxlu: record line and column number when parsing values Wei Liu
                   ` (3 subsequent siblings)
  22 siblings, 0 replies; 81+ messages in thread
From: Wei Liu @ 2015-02-26 15:56 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson, Wei Liu, Ian Campbell

1. Extend grammar of parser.
2. Adjust internal functions to accept XLU_ConfigValue instead of
   char *.
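
For example, with this patch the grammar accepts a nested list such as
the following (sketch of the intended syntax, which the vNUMA parser
later in this series consumes):

    vnuma = [ [ "pnode=0", "size=512" ],
              [ "pnode=1", "size=512" ] ]

Each inner list becomes an XLU_ConfigValue of type XLU_LIST stored
inside the outer list's XLU_ConfigValue.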

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
---
 tools/libxl/libxlu_cfg.c   | 30 +++++++-----------------------
 tools/libxl/libxlu_cfg_i.h |  5 +++--
 tools/libxl/libxlu_cfg_y.c | 26 +++++++++++++-------------
 tools/libxl/libxlu_cfg_y.y |  4 ++--
 4 files changed, 25 insertions(+), 40 deletions(-)

diff --git a/tools/libxl/libxlu_cfg.c b/tools/libxl/libxlu_cfg.c
index f000eed..611f5ec 100644
--- a/tools/libxl/libxlu_cfg.c
+++ b/tools/libxl/libxlu_cfg.c
@@ -332,19 +332,14 @@ XLU_ConfigValue *xlu__cfg_string_mk(CfgParseContext *ctx, char *atom)
     return NULL;
 }
 
-XLU_ConfigValue *xlu__cfg_list_mk(CfgParseContext *ctx, char *atom)
+XLU_ConfigValue *xlu__cfg_list_mk(CfgParseContext *ctx,
+                                  XLU_ConfigValue *val)
 {
     XLU_ConfigValue *value = NULL;
     XLU_ConfigValue **values = NULL;
-    XLU_ConfigValue *val = NULL;
 
     if (ctx->err) goto x;
 
-    val = malloc(sizeof(*val));
-    if (!val) goto xe;
-    val->type = XLU_STRING;
-    val->u.string = atom;
-
     values = malloc(sizeof(*values));
     if (!values) goto xe;
     values[0] = val;
@@ -363,19 +358,17 @@ XLU_ConfigValue *xlu__cfg_list_mk(CfgParseContext *ctx, char *atom)
  x:
     free(value);
     free(values);
-    free(val);
-    free(atom);
+    xlu__cfg_value_free(val);
     return NULL;
 }
 
 void xlu__cfg_list_append(CfgParseContext *ctx,
                           XLU_ConfigValue *list,
-                          char *atom)
+                          XLU_ConfigValue *val)
 {
-    XLU_ConfigValue *val = NULL;
     if (ctx->err) return;
 
-    assert(atom);
+    assert(val);
     assert(list->type == XLU_LIST);
 
     if (list->u.list.nvalues >= list->u.list.avalues) {
@@ -384,7 +377,7 @@ void xlu__cfg_list_append(CfgParseContext *ctx,
 
         if (list->u.list.avalues > INT_MAX / 100) {
             ctx->err = ERANGE;
-            free(atom);
+            xlu__cfg_value_free(val);
             return;
         }
 
@@ -393,7 +386,7 @@ void xlu__cfg_list_append(CfgParseContext *ctx,
                               sizeof(*new_values) * new_avalues);
         if (!new_values) {
             ctx->err = errno;
-            free(atom);
+            xlu__cfg_value_free(val);
             return;
         }
 
@@ -401,15 +394,6 @@ void xlu__cfg_list_append(CfgParseContext *ctx,
         list->u.list.values  = new_values;
     }
 
-    val = malloc(sizeof(*val));
-    if (!val) {
-        ctx->err = errno;
-        free(atom);
-        return;
-    }
-
-    val->type = XLU_STRING;
-    val->u.string = atom;
     list->u.list.values[list->u.list.nvalues] = val;
     list->u.list.nvalues++;
 }
diff --git a/tools/libxl/libxlu_cfg_i.h b/tools/libxl/libxlu_cfg_i.h
index b71e9fd..11dc33f 100644
--- a/tools/libxl/libxlu_cfg_i.h
+++ b/tools/libxl/libxlu_cfg_i.h
@@ -27,10 +27,11 @@ void xlu__cfg_set_store(CfgParseContext*, char *name,
                         XLU_ConfigValue *val, int lineno);
 XLU_ConfigValue *xlu__cfg_string_mk(CfgParseContext *ctx,
                                     char *atom);
-XLU_ConfigValue *xlu__cfg_list_mk(CfgParseContext *ctx, char *atom);
+XLU_ConfigValue *xlu__cfg_list_mk(CfgParseContext *ctx,
+                                  XLU_ConfigValue *val);
 void xlu__cfg_list_append(CfgParseContext *ctx,
                           XLU_ConfigValue *list,
-                          char *atom);
+                          XLU_ConfigValue *val);
 void xlu__cfg_value_free(XLU_ConfigValue *value);
 char *xlu__cfgl_strdup(CfgParseContext*, const char *src);
 char *xlu__cfgl_dequote(CfgParseContext*, const char *src);
diff --git a/tools/libxl/libxlu_cfg_y.c b/tools/libxl/libxlu_cfg_y.c
index eb3884f..b05e48b 100644
--- a/tools/libxl/libxlu_cfg_y.c
+++ b/tools/libxl/libxlu_cfg_y.c
@@ -377,7 +377,7 @@ union yyalloc
 /* YYFINAL -- State number of the termination state.  */
 #define YYFINAL  3
 /* YYLAST -- Last index in YYTABLE.  */
-#define YYLAST   24
+#define YYLAST   25
 
 /* YYNTOKENS -- Number of terminals.  */
 #define YYNTOKENS  12
@@ -444,8 +444,8 @@ static const yytype_int8 yyrhs[] =
       15,    -1,    16,    17,    -1,    17,    -1,     1,     6,    -1,
        3,     7,    18,    -1,     6,    -1,     8,    -1,    19,    -1,
        9,    22,    20,    10,    -1,     4,    -1,     5,    -1,    -1,
-      21,    -1,    21,    11,    22,    -1,    19,    22,    -1,    21,
-      11,    22,    19,    22,    -1,    -1,    22,     6,    -1
+      21,    -1,    21,    11,    22,    -1,    18,    22,    -1,    21,
+      11,    22,    18,    22,    -1,    -1,    22,     6,    -1
 };
 
 /* YYRLINE[YYN] -- source line where rule number YYN was defined.  */
@@ -517,14 +517,14 @@ static const yytype_int8 yydefgoto[] =
 static const yytype_int8 yypact[] =
 {
      -18,     4,     0,   -18,    -1,     6,   -18,   -18,   -18,     3,
-     -18,   -18,    11,   -18,   -18,   -18,   -18,   -18,   -18,    13,
-     -18,   -18,    12,    10,    17,   -18,   -18,    13,   -18,    17
+     -18,   -18,    14,   -18,   -18,   -18,   -18,   -18,   -18,    11,
+     -18,   -18,    12,    10,    18,   -18,   -18,    11,   -18,    18
 };
 
 /* YYPGOTO[NTERM-NUM].  */
 static const yytype_int8 yypgoto[] =
 {
-     -18,   -18,   -18,   -18,   -18,    15,   -18,   -17,   -18,   -18,
+     -18,   -18,   -18,   -18,   -18,    16,   -17,   -18,   -18,   -18,
      -14
 };
 
@@ -535,8 +535,8 @@ static const yytype_int8 yypgoto[] =
 static const yytype_int8 yytable[] =
 {
       -2,     4,    21,     5,     3,    11,     6,    24,     7,     6,
-      28,     7,    27,    12,    29,    14,    15,    14,    15,    20,
-      16,    26,    25,    20,    13
+      28,     7,    27,    12,    29,    14,    15,    20,    14,    15,
+      16,    26,    25,    16,    20,    13
 };
 
 #define yypact_value_is_default(yystate) \
@@ -548,8 +548,8 @@ static const yytype_int8 yytable[] =
 static const yytype_uint8 yycheck[] =
 {
        0,     1,    19,     3,     0,     6,     6,    21,     8,     6,
-      27,     8,    26,     7,    28,     4,     5,     4,     5,     6,
-       9,    11,    10,     6,     9
+      27,     8,    26,     7,    28,     4,     5,     6,     4,     5,
+       9,    11,    10,     9,     6,     9
 };
 
 /* YYSTOS[STATE-NUM] -- The (internal number of the) accessing
@@ -558,7 +558,7 @@ static const yytype_uint8 yystos[] =
 {
        0,    13,    14,     0,     1,     3,     6,     8,    15,    16,
       17,     6,     7,    17,     4,     5,     9,    18,    19,    22,
-       6,    19,    20,    21,    22,    10,    11,    22,    19,    22
+       6,    18,    20,    21,    22,    10,    11,    22,    18,    22
 };
 
 #define yyerrok		(yyerrstatus = 0)
@@ -1564,14 +1564,14 @@ yyreduce:
 
 /* Line 1806 of yacc.c  */
 #line 72 "libxlu_cfg_y.y"
-    { (yyval.value)= xlu__cfg_list_mk(ctx,(yyvsp[(1) - (2)].string)); }
+    { (yyval.value)= xlu__cfg_list_mk(ctx,(yyvsp[(1) - (2)].value)); }
     break;
 
   case 20:
 
 /* Line 1806 of yacc.c  */
 #line 73 "libxlu_cfg_y.y"
-    { xlu__cfg_list_append(ctx,(yyvsp[(1) - (5)].value),(yyvsp[(4) - (5)].string)); (yyval.value)= (yyvsp[(1) - (5)].value); }
+    { xlu__cfg_list_append(ctx,(yyvsp[(1) - (5)].value),(yyvsp[(4) - (5)].value)); (yyval.value)= (yyvsp[(1) - (5)].value); }
     break;
 
 
diff --git a/tools/libxl/libxlu_cfg_y.y b/tools/libxl/libxlu_cfg_y.y
index 6848686..4a5ca3a 100644
--- a/tools/libxl/libxlu_cfg_y.y
+++ b/tools/libxl/libxlu_cfg_y.y
@@ -69,8 +69,8 @@ valuelist: /* empty */           { $$= xlu__cfg_list_mk(ctx,NULL); }
  |      values                  { $$= $1; }
  |      values ',' nlok         { $$= $1; }
 
-values: atom nlok                  { $$= xlu__cfg_list_mk(ctx,$1); }
- |      values ',' nlok atom nlok  { xlu__cfg_list_append(ctx,$1,$4); $$= $1; }
+values: value nlok                  { $$= xlu__cfg_list_mk(ctx,$1); }
+ |      values ',' nlok value nlok  { xlu__cfg_list_append(ctx,$1,$4); $$= $1; }
 
 nlok:
         /* nothing */
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 81+ messages in thread

* [PATCH v6 20/23] libxlu: record line and column number when parsing values
  2015-02-26 15:55 [PATCH v6 00/23] Virtual NUMA for PV and HVM Wei Liu
                   ` (18 preceding siblings ...)
  2015-02-26 15:56 ` [PATCH v6 19/23] libxlu: nested list support Wei Liu
@ 2015-02-26 15:56 ` Wei Liu
  2015-03-06 11:36   ` Ian Jackson
  2015-02-26 15:56 ` [PATCH v6 21/23] libxlu: introduce new APIs Wei Liu
                   ` (2 subsequent siblings)
  22 siblings, 1 reply; 81+ messages in thread
From: Wei Liu @ 2015-02-26 15:56 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson, Wei Liu, Ian Campbell

Originally only the setting had its line number recorded. Since we're
moving to a more sophisticated API, record the line number and column
number for each individual value. They are useful for error reporting.
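
For illustration only (the exact format is introduced by a later patch
in this series, and the file name and position here are made up), a
type mismatch can then be reported with the value's own position rather
than just the setting's line:

    guest.cfg:3:10: warning: value is not a string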

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
---
 tools/libxl/libxlu_cfg.c      | 10 ++++++++--
 tools/libxl/libxlu_cfg_i.h    |  5 +++--
 tools/libxl/libxlu_cfg_y.c    | 32 ++++++++++++++++++--------------
 tools/libxl/libxlu_cfg_y.y    | 10 +++++++---
 tools/libxl/libxlu_internal.h |  1 +
 5 files changed, 37 insertions(+), 21 deletions(-)

diff --git a/tools/libxl/libxlu_cfg.c b/tools/libxl/libxlu_cfg.c
index 611f5ec..b921a13 100644
--- a/tools/libxl/libxlu_cfg.c
+++ b/tools/libxl/libxlu_cfg.c
@@ -311,7 +311,8 @@ const char *xlu_cfg_get_listitem(const XLU_ConfigList *list, int entry) {
 }
 
 
-XLU_ConfigValue *xlu__cfg_string_mk(CfgParseContext *ctx, char *atom)
+XLU_ConfigValue *xlu__cfg_string_mk(CfgParseContext *ctx, char *atom,
+                                    int line, int column)
 {
     XLU_ConfigValue *value = NULL;
 
@@ -321,6 +322,8 @@ XLU_ConfigValue *xlu__cfg_string_mk(CfgParseContext *ctx, char *atom)
     if (!value) goto xe;
     value->type = XLU_STRING;
     value->u.string = atom;
+    value->line = line;
+    value->column = column;
 
     return value;
 
@@ -333,7 +336,8 @@ XLU_ConfigValue *xlu__cfg_string_mk(CfgParseContext *ctx, char *atom)
 }
 
 XLU_ConfigValue *xlu__cfg_list_mk(CfgParseContext *ctx,
-                                  XLU_ConfigValue *val)
+                                  XLU_ConfigValue *val,
+                                  int line, int column)
 {
     XLU_ConfigValue *value = NULL;
     XLU_ConfigValue **values = NULL;
@@ -350,6 +354,8 @@ XLU_ConfigValue *xlu__cfg_list_mk(CfgParseContext *ctx,
     value->u.list.nvalues = 1;
     value->u.list.avalues = 1;
     value->u.list.values = values;
+    value->line = line;
+    value->column = column;
 
     return value;
 
diff --git a/tools/libxl/libxlu_cfg_i.h b/tools/libxl/libxlu_cfg_i.h
index 11dc33f..fa46460 100644
--- a/tools/libxl/libxlu_cfg_i.h
+++ b/tools/libxl/libxlu_cfg_i.h
@@ -26,9 +26,10 @@ void xlu__cfg_set_free(XLU_ConfigSetting *set);
 void xlu__cfg_set_store(CfgParseContext*, char *name,
                         XLU_ConfigValue *val, int lineno);
 XLU_ConfigValue *xlu__cfg_string_mk(CfgParseContext *ctx,
-                                    char *atom);
+                                    char *atom, int line, int column);
 XLU_ConfigValue *xlu__cfg_list_mk(CfgParseContext *ctx,
-                                  XLU_ConfigValue *val);
+                                  XLU_ConfigValue *val,
+                                  int line, int column);
 void xlu__cfg_list_append(CfgParseContext *ctx,
                           XLU_ConfigValue *list,
                           XLU_ConfigValue *val);
diff --git a/tools/libxl/libxlu_cfg_y.c b/tools/libxl/libxlu_cfg_y.c
index b05e48b..ec4d95d 100644
--- a/tools/libxl/libxlu_cfg_y.c
+++ b/tools/libxl/libxlu_cfg_y.c
@@ -452,8 +452,8 @@ static const yytype_int8 yyrhs[] =
 static const yytype_uint8 yyrline[] =
 {
        0,    47,    47,    48,    50,    51,    53,    54,    55,    57,
-      59,    60,    62,    63,    65,    66,    68,    69,    70,    72,
-      73,    75,    77
+      59,    60,    62,    65,    67,    68,    70,    71,    72,    74,
+      77,    79,    81
 };
 #endif
 
@@ -1515,69 +1515,73 @@ yyreduce:
 
 /* Line 1806 of yacc.c  */
 #line 62 "libxlu_cfg_y.y"
-    { (yyval.value)= xlu__cfg_string_mk(ctx,(yyvsp[(1) - (1)].string)); }
+    { (yyval.value)= xlu__cfg_string_mk(ctx,(yyvsp[(1) - (1)].string),
+							      (yylsp[(1) - (1)]).first_line,
+							      (yylsp[(1) - (1)]).first_column); }
     break;
 
   case 13:
 
 /* Line 1806 of yacc.c  */
-#line 63 "libxlu_cfg_y.y"
+#line 65 "libxlu_cfg_y.y"
     { (yyval.value)= (yyvsp[(3) - (4)].value); }
     break;
 
   case 14:
 
 /* Line 1806 of yacc.c  */
-#line 65 "libxlu_cfg_y.y"
+#line 67 "libxlu_cfg_y.y"
     { (yyval.string)= (yyvsp[(1) - (1)].string); }
     break;
 
   case 15:
 
 /* Line 1806 of yacc.c  */
-#line 66 "libxlu_cfg_y.y"
+#line 68 "libxlu_cfg_y.y"
     { (yyval.string)= (yyvsp[(1) - (1)].string); }
     break;
 
   case 16:
 
 /* Line 1806 of yacc.c  */
-#line 68 "libxlu_cfg_y.y"
-    { (yyval.value)= xlu__cfg_list_mk(ctx,NULL); }
+#line 70 "libxlu_cfg_y.y"
+    { (yyval.value)= xlu__cfg_list_mk(ctx,NULL,0,0); }
     break;
 
   case 17:
 
 /* Line 1806 of yacc.c  */
-#line 69 "libxlu_cfg_y.y"
+#line 71 "libxlu_cfg_y.y"
     { (yyval.value)= (yyvsp[(1) - (1)].value); }
     break;
 
   case 18:
 
 /* Line 1806 of yacc.c  */
-#line 70 "libxlu_cfg_y.y"
+#line 72 "libxlu_cfg_y.y"
     { (yyval.value)= (yyvsp[(1) - (3)].value); }
     break;
 
   case 19:
 
 /* Line 1806 of yacc.c  */
-#line 72 "libxlu_cfg_y.y"
-    { (yyval.value)= xlu__cfg_list_mk(ctx,(yyvsp[(1) - (2)].value)); }
+#line 74 "libxlu_cfg_y.y"
+    { (yyval.value)= xlu__cfg_list_mk(ctx,(yyvsp[(1) - (2)].value),
+							   (yylsp[(1) - (2)]).first_line,
+							   (yylsp[(1) - (2)]).first_column); }
     break;
 
   case 20:
 
 /* Line 1806 of yacc.c  */
-#line 73 "libxlu_cfg_y.y"
+#line 77 "libxlu_cfg_y.y"
     { xlu__cfg_list_append(ctx,(yyvsp[(1) - (5)].value),(yyvsp[(4) - (5)].value)); (yyval.value)= (yyvsp[(1) - (5)].value); }
     break;
 
 
 
 /* Line 1806 of yacc.c  */
-#line 1581 "libxlu_cfg_y.c"
+#line 1585 "libxlu_cfg_y.c"
       default: break;
     }
   /* User semantic actions sometimes alter yychar, and that requires
diff --git a/tools/libxl/libxlu_cfg_y.y b/tools/libxl/libxlu_cfg_y.y
index 4a5ca3a..4ebf0b3 100644
--- a/tools/libxl/libxlu_cfg_y.y
+++ b/tools/libxl/libxlu_cfg_y.y
@@ -59,17 +59,21 @@ assignment: IDENT '=' value { xlu__cfg_set_store(ctx,$1,$3,@3.first_line); }
 endstmt: NEWLINE
  |      ';'
 
-value:  atom                         { $$= xlu__cfg_string_mk(ctx,$1); }
+value:  atom                         { $$= xlu__cfg_string_mk(ctx,$1,
+							      @1.first_line,
+							      @1.first_column); }
  |      '[' nlok valuelist ']'       { $$= $3; }
 
 atom:   STRING                   { $$= $1; }
  |      NUMBER                   { $$= $1; }
 
-valuelist: /* empty */           { $$= xlu__cfg_list_mk(ctx,NULL); }
+valuelist: /* empty */           { $$= xlu__cfg_list_mk(ctx,NULL,0,0); }
  |      values                  { $$= $1; }
  |      values ',' nlok         { $$= $1; }
 
-values: value nlok                  { $$= xlu__cfg_list_mk(ctx,$1); }
+values: value nlok                  { $$= xlu__cfg_list_mk(ctx,$1,
+							   @1.first_line,
+							   @1.first_column); }
  |      values ',' nlok value nlok  { xlu__cfg_list_append(ctx,$1,$4); $$= $1; }
 
 nlok:
diff --git a/tools/libxl/libxlu_internal.h b/tools/libxl/libxlu_internal.h
index 092a17a..73fd85f 100644
--- a/tools/libxl/libxlu_internal.h
+++ b/tools/libxl/libxlu_internal.h
@@ -44,6 +44,7 @@ struct XLU_ConfigValue {
         char *string;
         XLU_ConfigList list;
     } u;
+    int line, column;
 };
 
 typedef struct XLU_ConfigSetting { /* transparent */
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 81+ messages in thread

* [PATCH v6 21/23] libxlu: introduce new APIs
  2015-02-26 15:55 [PATCH v6 00/23] Virtual NUMA for PV and HVM Wei Liu
                   ` (19 preceding siblings ...)
  2015-02-26 15:56 ` [PATCH v6 20/23] libxlu: record line and column number when parsing values Wei Liu
@ 2015-02-26 15:56 ` Wei Liu
  2015-03-06 11:40   ` Ian Jackson
  2015-02-26 15:56 ` [PATCH v6 22/23] xl: introduce xcalloc Wei Liu
  2015-02-26 15:56 ` [PATCH v6 23/23] xl: vNUMA support Wei Liu
  22 siblings, 1 reply; 81+ messages in thread
From: Wei Liu @ 2015-02-26 15:56 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson, Wei Liu, Ian Campbell

These APIs can be used to manipulate XLU_ConfigValue and XLU_ConfigList.

APIs introduced:
1. xlu_cfg_value_type
2. xlu_cfg_value_get_string
3. xlu_cfg_value_get_list
4. xlu_cfg_get_listitem2

Move some definitions from private header to public header as needed.
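
For illustration only (not part of this patch; the function name and
the "vnuma" setting are just examples), a caller could use the new APIs
to walk a list of lists like this:

    /* Sketch: iterate over a list-of-lists setting.  Assumes
     * <stdio.h> and libxlutil.h; error handling kept minimal. */
    static void dump_list_of_lists(const XLU_Config *config)
    {
        XLU_ConfigList *outer, *inner;
        XLU_ConfigValue *spec, *opt;
        char *s;
        int n, i, j;

        if (xlu_cfg_get_list(config, "vnuma", &outer, &n, 1))
            return;

        for (i = 0; (spec = xlu_cfg_get_listitem2(outer, i)); i++) {
            if (xlu_cfg_value_type(spec) != XLU_LIST ||
                xlu_cfg_value_get_list(config, spec, &inner, 0))
                continue;
            for (j = 0; (opt = xlu_cfg_get_listitem2(inner, j)); j++)
                if (!xlu_cfg_value_get_string(config, opt, &s, 0))
                    printf("vnode %d option: %s\n", i, s);
        }
    }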

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
---
Changes in v6:
1. Report value's line and column number on error.

Changes in v5:
1. Use calling convention like old APIs.
---
 tools/libxl/libxlu_cfg.c      | 45 +++++++++++++++++++++++++++++++++++++++++++
 tools/libxl/libxlu_internal.h |  7 -------
 tools/libxl/libxlutil.h       | 13 +++++++++++++
 3 files changed, 58 insertions(+), 7 deletions(-)

diff --git a/tools/libxl/libxlu_cfg.c b/tools/libxl/libxlu_cfg.c
index b921a13..62fb798 100644
--- a/tools/libxl/libxlu_cfg.c
+++ b/tools/libxl/libxlu_cfg.c
@@ -199,6 +199,51 @@ static int find_atom(const XLU_Config *cfg, const char *n,
     return 0;
 }
 
+
+enum XLU_ConfigValueType xlu_cfg_value_type(const XLU_ConfigValue *value)
+{
+    return value->type;
+}
+
+int xlu_cfg_value_get_string(const XLU_Config *cfg, XLU_ConfigValue *value,
+                             char **value_r, int dont_warn)
+{
+    if (value->type != XLU_STRING) {
+        if (!dont_warn)
+            fprintf(cfg->report,
+                    "%s:%d:%d: warning: value is not a string\n",
+                    cfg->config_source, value->line, value->column);
+        *value_r = NULL;
+        return EINVAL;
+    }
+
+    *value_r = value->u.string;
+    return 0;
+}
+
+int xlu_cfg_value_get_list(const XLU_Config *cfg, XLU_ConfigValue *value,
+                           XLU_ConfigList **value_r, int dont_warn)
+{
+    if (value->type != XLU_LIST) {
+        if (!dont_warn)
+            fprintf(cfg->report,
+                    "%s:%d:%d: warning: value is not a list\n",
+                    cfg->config_source, value->line, value->column);
+        *value_r = NULL;
+        return EINVAL;
+    }
+
+    *value_r = &value->u.list;
+    return 0;
+}
+
+XLU_ConfigValue *xlu_cfg_get_listitem2(const XLU_ConfigList *list,
+                                       int entry)
+{
+    if (entry < 0 || entry >= list->nvalues) return NULL;
+    return list->values[entry];
+}
+
 int xlu_cfg_get_string(const XLU_Config *cfg, const char *n,
                        const char **value_r, int dont_warn) {
     XLU_ConfigSetting *set;
diff --git a/tools/libxl/libxlu_internal.h b/tools/libxl/libxlu_internal.h
index 73fd85f..1d310b1 100644
--- a/tools/libxl/libxlu_internal.h
+++ b/tools/libxl/libxlu_internal.h
@@ -25,13 +25,6 @@
 
 #include "libxlutil.h"
 
-enum XLU_ConfigValueType {
-    XLU_STRING,
-    XLU_LIST,
-};
-
-typedef struct XLU_ConfigValue XLU_ConfigValue;
-
 typedef struct XLU_ConfigList {
     int avalues; /* available slots */
     int nvalues; /* actual occupied slots */
diff --git a/tools/libxl/libxlutil.h b/tools/libxl/libxlutil.h
index 0333e55..989605a 100644
--- a/tools/libxl/libxlutil.h
+++ b/tools/libxl/libxlutil.h
@@ -20,9 +20,15 @@
 
 #include "libxl.h"
 
+enum XLU_ConfigValueType {
+    XLU_STRING,
+    XLU_LIST,
+};
+
 /* Unless otherwise stated, all functions return an errno value. */
 typedef struct XLU_Config XLU_Config;
 typedef struct XLU_ConfigList XLU_ConfigList;
+typedef struct XLU_ConfigValue XLU_ConfigValue;
 
 XLU_Config *xlu_cfg_init(FILE *report, const char *report_filename);
   /* 0 means we got ENOMEM. */
@@ -66,6 +72,13 @@ const char *xlu_cfg_get_listitem(const XLU_ConfigList*, int entry);
   /* xlu_cfg_get_listitem cannot fail, except that if entry is
    * out of range it returns 0 (not setting errno) */
 
+enum XLU_ConfigValueType xlu_cfg_value_type(const XLU_ConfigValue *value);
+int xlu_cfg_value_get_string(const XLU_Config *cfg,  XLU_ConfigValue *value,
+                             char **value_r, int dont_warn);
+int xlu_cfg_value_get_list(const XLU_Config *cfg, XLU_ConfigValue *value,
+                           XLU_ConfigList **value_r, int dont_warn);
+XLU_ConfigValue *xlu_cfg_get_listitem2(const XLU_ConfigList *list,
+                                       int entry);
 
 /*
  * Disk specification parsing.
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 81+ messages in thread

* [PATCH v6 22/23] xl: introduce xcalloc
  2015-02-26 15:55 [PATCH v6 00/23] Virtual NUMA for PV and HVM Wei Liu
                   ` (20 preceding siblings ...)
  2015-02-26 15:56 ` [PATCH v6 21/23] libxlu: introduce new APIs Wei Liu
@ 2015-02-26 15:56 ` Wei Liu
  2015-03-02 15:51   ` Ian Campbell
  2015-02-26 15:56 ` [PATCH v6 23/23] xl: vNUMA support Wei Liu
  22 siblings, 1 reply; 81+ messages in thread
From: Wei Liu @ 2015-02-26 15:56 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson, Wei Liu, Ian Campbell

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
---
Changes in v6:
1. Join two lines to make code more compact.
2. Use %zu and drop casting.
---
 tools/libxl/xl_cmdimpl.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/tools/libxl/xl_cmdimpl.c b/tools/libxl/xl_cmdimpl.c
index 53c16eb..5b366f2 100644
--- a/tools/libxl/xl_cmdimpl.c
+++ b/tools/libxl/xl_cmdimpl.c
@@ -289,6 +289,16 @@ static void *xmalloc(size_t sz) {
     return r;
 }
 
+static void *xcalloc(size_t n, size_t sz) __attribute__((unused));
+static void *xcalloc(size_t n, size_t sz) {
+    void *r = calloc(n, sz);
+    if (!r) {
+        fprintf(stderr,"xl: Unable to calloc %zu bytes.\n", sz*n);
+        exit(-ERROR_FAIL);
+    }
+    return r;
+}
+
 static void *xrealloc(void *ptr, size_t sz) {
     void *r;
     if (!sz) { free(ptr); return 0; }
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 81+ messages in thread

* [PATCH v6 23/23] xl: vNUMA support
  2015-02-26 15:55 [PATCH v6 00/23] Virtual NUMA for PV and HVM Wei Liu
                   ` (21 preceding siblings ...)
  2015-02-26 15:56 ` [PATCH v6 22/23] xl: introduce xcalloc Wei Liu
@ 2015-02-26 15:56 ` Wei Liu
  2015-02-27 16:17   ` Dario Faggioli
  2015-03-02 15:59   ` Ian Campbell
  22 siblings, 2 replies; 81+ messages in thread
From: Wei Liu @ 2015-02-26 15:56 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson, Wei Liu, Ian Campbell

This patch includes the configuration option parser and documentation.

Please see the hunk to xl.cfg.pod.5 for more information.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
---
Changes in v6:
1. Disable NUMA auto-placement.
---
 docs/man/xl.cfg.pod.5    |  54 ++++++++++++++++++
 tools/libxl/xl_cmdimpl.c | 140 ++++++++++++++++++++++++++++++++++++++++++++++-
 2 files changed, 193 insertions(+), 1 deletion(-)

diff --git a/docs/man/xl.cfg.pod.5 b/docs/man/xl.cfg.pod.5
index 408653f..2a27b1c 100644
--- a/docs/man/xl.cfg.pod.5
+++ b/docs/man/xl.cfg.pod.5
@@ -266,6 +266,60 @@ it will crash.
 
 =back
 
+=head3 Guest Virtual NUMA Configuration
+
+=over 4
+
+=item B<vnuma=[ VNODE_SPEC, VNODE_SPEC, ... ]>
+
+Specify virtual NUMA configuration with positional arguments. The
+nth B<VNODE_SPEC> in the list specifies the configuration of the nth
+virtual node.
+
+Each B<VNODE_SPEC> is a list, which has a form of
+"[VNODE_CONFIG_OPTION,VNODE_CONFIG_OPTION, ... ]"  (without quotes).
+
+For example vnuma = [ ["pnode=0","size=512","vcpus=0-4","vdistances=10,20"] ]
+means vnode 0 is mapped to pnode 0, has 512MB ram, has vcpus 0 to 4, the
+distance to itself is 10 and the distance to vnode 1 is 20.
+
+Each B<VNODE_CONFIG_OPTION> is a quoted string. Supported
+B<VNODE_CONFIG_OPTION>s are:
+
+=over 4
+
+=item B<pnode=NUMBER>
+
+Specify which physical node this virtual node maps to.
+
+=item B<size=MBYTES>
+
+Specify the size of this virtual node. The sum of memory size of all
+vnodes must match B<maxmem=> (or B<memory=> if B<maxmem=> is not
+specified).
+
+=item B<vcpus=CPU-STRING>
+
+Specify which vcpus belong to this node. B<CPU-STRING> is a
+comma-separated string. You can specify ranges and single cpus. An
+example is "vcpus=0-5,8", which means vcpus 0 to 5 plus vcpu
+8.
+
+=item B<vdistances=NUMBER, NUMBER, ... >
+
+Specify virtual distance from this node to all nodes (including
+itself) with positional arguments. For example, "vdistances=10,20"
+for vnode 0 means the distance from vnode 0 to vnode 0 is 10, from
+vnode 0 to vnode 1 is 20. The number of arguments supplied must match
+the total number of vnodes.
+
+Normally you can use the values from "xl info -n" or "numactl
+--hardware" to fill in vdistance list.
+
+=back
+
+=back
+
 =head3 Event Actions
 
 =over 4
diff --git a/tools/libxl/xl_cmdimpl.c b/tools/libxl/xl_cmdimpl.c
index 5b366f2..2899d9f 100644
--- a/tools/libxl/xl_cmdimpl.c
+++ b/tools/libxl/xl_cmdimpl.c
@@ -158,7 +158,6 @@ struct domain_create {
 };
 
 
-static uint32_t find_domain(const char *p) __attribute__((warn_unused_result));
 static uint32_t find_domain(const char *p)
 {
     uint32_t domid;
@@ -987,6 +986,143 @@ static int parse_nic_config(libxl_device_nic *nic, XLU_Config **config, char *to
     return 0;
 }
 
+static void parse_vnuma_config(const XLU_Config *config,
+                               libxl_domain_build_info *b_info)
+{
+    libxl_physinfo physinfo;
+    uint32_t nr_nodes;
+    XLU_ConfigList *vnuma;
+    int i, j, len, num_vnuma;
+
+
+    libxl_physinfo_init(&physinfo);
+    if (libxl_get_physinfo(ctx, &physinfo) != 0) {
+        libxl_physinfo_dispose(&physinfo);
+        fprintf(stderr, "libxl_get_physinfo failed\n");
+        exit(1);
+    }
+
+    nr_nodes = physinfo.nr_nodes;
+    libxl_physinfo_dispose(&physinfo);
+
+    if (xlu_cfg_get_list(config, "vnuma", &vnuma, &num_vnuma, 1))
+        return;
+
+    b_info->num_vnuma_nodes = num_vnuma;
+    b_info->vnuma_nodes = xcalloc(num_vnuma, sizeof(libxl_vnode_info));
+
+    for (i = 0; i < b_info->num_vnuma_nodes; i++) {
+        libxl_vnode_info *p = &b_info->vnuma_nodes[i];
+
+        libxl_vnode_info_init(p);
+        libxl_cpu_bitmap_alloc(ctx, &p->vcpus, b_info->max_vcpus);
+        libxl_bitmap_set_none(&p->vcpus);
+        p->distances = xcalloc(b_info->num_vnuma_nodes,
+                               sizeof(*p->distances));
+        p->num_distances = b_info->num_vnuma_nodes;
+    }
+
+    for (i = 0; i < num_vnuma; i++) {
+        XLU_ConfigValue *vnode_spec, *conf_option;
+        XLU_ConfigList *vnode_config_list;
+        int conf_count;
+        libxl_vnode_info *p = &b_info->vnuma_nodes[i];
+
+        vnode_spec = xlu_cfg_get_listitem2(vnuma, i);
+        assert(vnode_spec);
+
+        xlu_cfg_value_get_list(config, vnode_spec, &vnode_config_list, 0);
+        if (!vnode_config_list) {
+            fprintf(stderr, "xl: cannot get vnode config option list\n");
+            exit(1);
+        }
+
+        for (conf_count = 0;
+             (conf_option =
+              xlu_cfg_get_listitem2(vnode_config_list, conf_count));
+             conf_count++) {
+
+            if (xlu_cfg_value_type(conf_option) == XLU_STRING) {
+                char *buf, *option_untrimmed, *value_untrimmed;
+                char *option, *value;
+                char *endptr;
+                unsigned long val;
+
+                xlu_cfg_value_get_string(config, conf_option, &buf, 0);
+
+                if (!buf) continue;
+
+                if (split_string_into_pair(buf, "=",
+                                           &option_untrimmed,
+                                           &value_untrimmed)) {
+                    fprintf(stderr, "xl: failed to split \"%s\" into pair\n",
+                            buf);
+                    exit(1);
+                }
+                trim(isspace, option_untrimmed, &option);
+                trim(isspace, value_untrimmed, &value);
+
+#define ABORT_IF_FAILED(str)                                            \
+                do {                                                    \
+                    if (endptr == value || val == ULONG_MAX) {          \
+                        fprintf(stderr,                                 \
+                                "xl: failed to convert \"%s\" to number\n", \
+                                (str));                                 \
+                        exit(1);                                        \
+                    }                                                   \
+                } while (0)
+
+                if (!strcmp("pnode", option)) {
+                    val = strtoul(value, &endptr, 10);
+                    ABORT_IF_FAILED(value);
+                    if (val >= nr_nodes) {
+                        fprintf(stderr,
+                                "xl: invalid pnode number: %lu\n", val);
+                        exit(1);
+                    }
+                    p->pnode = val;
+                    libxl_defbool_set(&b_info->numa_placement, false);
+                } else if (!strcmp("size", option)) {
+                    val = strtoul(value, &endptr, 10);
+                    ABORT_IF_FAILED(value);
+                    p->memkb = val << 10;
+                } else if (!strcmp("vcpus", option)) {
+                    libxl_string_list cpu_spec_list;
+                    int cpu;
+                    unsigned long s, e;
+
+                    split_string_into_string_list(value, ",", &cpu_spec_list);
+                    len = libxl_string_list_length(&cpu_spec_list);
+
+                    for (j = 0; j < len; j++) {
+                        parse_range(cpu_spec_list[j], &s, &e);
+                        for (cpu = s; cpu <=e; cpu++)
+                            libxl_bitmap_set(&p->vcpus, cpu);
+                    }
+                    libxl_string_list_dispose(&cpu_spec_list);
+                } else if (!strcmp("vdistances", option)) {
+                    libxl_string_list vdist;
+
+                    split_string_into_string_list(value, ",", &vdist);
+                    len = libxl_string_list_length(&vdist);
+
+                    for (j = 0; j < len; j++) {
+                        val = strtoul(vdist[j], &endptr, 10);
+                        ABORT_IF_FAILED(vdist[j]);
+                        p->distances[j] = val;
+                    }
+                    libxl_string_list_dispose(&vdist);
+                }
+#undef ABORT_IF_FAILED
+                free(option);
+                free(value);
+                free(option_untrimmed);
+                free(value_untrimmed);
+            }
+        }
+    }
+}
+
 static void parse_config_data(const char *config_source,
                               const char *config_data,
                               int config_len,
@@ -1177,6 +1313,8 @@ static void parse_config_data(const char *config_source,
         }
     }
 
+    parse_vnuma_config(config, b_info);
+
     if (!xlu_cfg_get_long(config, "rtc_timeoffset", &l, 0))
         b_info->rtc_timeoffset = l;
 
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 81+ messages in thread

* Re: [PATCH v6 11/23] libxl: functions to build vmemranges for PV guest
  2015-02-26 15:55 ` [PATCH v6 11/23] libxl: functions to build vmemranges for PV guest Wei Liu
@ 2015-02-26 16:39   ` Dario Faggioli
  2015-03-02 15:41   ` Ian Campbell
  1 sibling, 0 replies; 81+ messages in thread
From: Dario Faggioli @ 2015-02-26 16:39 UTC (permalink / raw)
  To: Wei Liu; +Cc: Ian Jackson, Ian Campbell, ufimtseva, xen-devel


[-- Attachment #1.1: Type: text/plain, Size: 1817 bytes --]

On Thu, 2015-02-26 at 15:55 +0000, Wei Liu wrote:
> Introduce a arch-independent routine to generate one vmemrange per
> vnode. Also introduce arch-dependent routines for different
> architectures because part of the process is arch-specific -- ARM has
> yet have NUMA support and E820 is x86 only.
> 
> For those x86 guests who care about machine E820 map (i.e. with
> e820_host=1), vnode is further split into several vmemranges to
> accommodate memory holes.  A few stubs for libxl_arm.c are created.
> 
> Signed-off-by: Wei Liu <wei.liu2@citrix.com>
> Reviewed-by: Dario Faggioli <dario.faggioli@citrix.com>
> Cc: Ian Campbell <ian.campbell@citrix.com>
> Cc: Ian Jackson <ian.jackson@eu.citrix.com>
> Cc: Dario Faggioli <dario.faggioli@citrix.com>
> Cc: Elena Ufimtseva <ufimtseva@gmail.com>

> diff --git a/tools/libxl/libxl_vnuma.c b/tools/libxl/libxl_vnuma.c
> index 33d7a3c..04672b5 100644
> --- a/tools/libxl/libxl_vnuma.c
> +++ b/tools/libxl/libxl_vnuma.c
> @@ -14,6 +14,7 @@
>   */
>  #include "libxl_osdeps.h" /* must come before any other headers */
>  #include "libxl_internal.h"
> +#include "libxl_arch.h"
>  #include <stdlib.h>
>  
>  /* Sort vmemranges in ascending order with "start" */
> @@ -142,6 +143,46 @@ out:
>      return rc;
>  }
>  
> +
Aren't you adding an extra, unnecessary, blank line here?

> +int libxl__vnuma_build_vmemrange_pv_generic(libxl__gc *gc,
> +                                            uint32_t domid,
> +                                            libxl_domain_build_info *b_info,
> +                                            libxl__domain_build_state *state)
>

<snip>

Of course, my Reviewed-by still stands... I just noticed this while
having a quick look. So, if you happen to have to resend... :-)

Regards,
Dario

[-- Attachment #1.2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 181 bytes --]

[-- Attachment #2: Type: text/plain, Size: 126 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH v6 01/23] xen: factor out construct_memop_from_reservation
  2015-02-26 15:55 ` [PATCH v6 01/23] xen: factor out construct_memop_from_reservation Wei Liu
@ 2015-02-27 10:57   ` Andrew Cooper
  0 siblings, 0 replies; 81+ messages in thread
From: Andrew Cooper @ 2015-02-27 10:57 UTC (permalink / raw)
  To: Wei Liu, xen-devel; +Cc: Jan Beulich

On 26/02/15 15:55, Wei Liu wrote:
> No functional change.
>
> Signed-off-by: Wei Liu <wei.liu2@citrix.com>
> Cc: Jan Beulich <JBeulich@suse.com>
> Cc: Andrew Cooper <andrew.cooper3@citrix.com>
> ---
>  xen/common/memory.c | 52 +++++++++++++++++++++++++++++++++++-----------------
>  1 file changed, 35 insertions(+), 17 deletions(-)
>
> diff --git a/xen/common/memory.c b/xen/common/memory.c
> index e84ace9..d24b001 100644
> --- a/xen/common/memory.c
> +++ b/xen/common/memory.c
> @@ -692,11 +692,43 @@ out:
>      return rc;
>  }
>  
> +static int construct_memop_from_reservation(
> +               const struct xen_memory_reservation *r,
> +               struct memop_args *a)
> +{
> +    int rc;
> +    unsigned int address_bits;
> +
> +    a->extent_list  = r->extent_start;
> +    a->nr_extents   = r->nr_extents;
> +    a->extent_order = r->extent_order;
> +    a->memflags     = 0;
> +
> +    address_bits = XENMEMF_get_address_bits(r->mem_flags);
> +    if ( (address_bits != 0) &&
> +         (address_bits < (get_order_from_pages(max_page) + PAGE_SHIFT)) )
> +    {
> +        if ( address_bits <= PAGE_SHIFT )
> +        {
> +            rc = -EINVAL;
> +            goto out;
> +        }
> +        a->memflags = MEMF_bits(address_bits);
> +    }
> +
> +    a->memflags |= MEMF_node(XENMEMF_get_node(r->mem_flags));
> +    if ( r->mem_flags & XENMEMF_exact_node_request )
> +        a->memflags |= MEMF_exact_node;
> +
> +    rc = 0;
> + out:
> +    return rc;
> +}
> +
>  long do_memory_op(unsigned long cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
>  {
>      struct domain *d;
>      long rc;
> -    unsigned int address_bits;
>      struct xen_memory_reservation reservation;
>      struct memop_args args;
>      domid_t domid;
> @@ -718,25 +750,11 @@ long do_memory_op(unsigned long cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
>          if ( unlikely(start_extent >= reservation.nr_extents) )
>              return start_extent;
>  
> -        args.extent_list  = reservation.extent_start;
> -        args.nr_extents   = reservation.nr_extents;
> -        args.extent_order = reservation.extent_order;
>          args.nr_done      = start_extent;
>          args.preempted    = 0;

As construct_memop_from_reservation() nominally has free rein over all
fields in args, these two assignments should move after the call.
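
Something like this, for illustration (a sketch of the suggested
ordering, not an actual patch):

    if ( construct_memop_from_reservation(&reservation, &args) )
        return start_extent;

    args.nr_done   = start_extent;
    args.preempted = 0;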

Other than that, Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>

> -        args.memflags     = 0;
>  
> -        address_bits = XENMEMF_get_address_bits(reservation.mem_flags);
> -        if ( (address_bits != 0) &&
> -             (address_bits < (get_order_from_pages(max_page) + PAGE_SHIFT)) )
> -        {
> -            if ( address_bits <= PAGE_SHIFT )
> -                return start_extent;
> -            args.memflags = MEMF_bits(address_bits);
> -        }
> -
> -        args.memflags |= MEMF_node(XENMEMF_get_node(reservation.mem_flags));
> -        if ( reservation.mem_flags & XENMEMF_exact_node_request )
> -            args.memflags |= MEMF_exact_node;
> +        if ( construct_memop_from_reservation(&reservation, &args) )
> +            return start_extent;
>  
>          if ( op == XENMEM_populate_physmap
>               && (reservation.mem_flags & XENMEMF_populate_on_demand) )

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH v6 02/23] xen: move NUMA_NO_NODE to public memory.h as XEN_NUMA_NO_NODE
  2015-02-26 15:55 ` [PATCH v6 02/23] xen: move NUMA_NO_NODE to public memory.h as XEN_NUMA_NO_NODE Wei Liu
@ 2015-02-27 11:38   ` Andrew Cooper
  2015-02-27 16:42   ` Jan Beulich
  1 sibling, 0 replies; 81+ messages in thread
From: Andrew Cooper @ 2015-02-27 11:38 UTC (permalink / raw)
  To: Wei Liu, xen-devel; +Cc: Jan Beulich

On 26/02/15 15:55, Wei Liu wrote:
> Update NUMA_NO_NODE in Xen code to use the new macro.
>
> No functional change introduced.
>
> Signed-off-by: Wei Liu <wei.liu2@citrix.com>
> Cc: Andrew Cooper <andrew.cooper3@citrix.com>
> Cc: Jan Beulich <JBeulich@suse.com>
>
> diff --git a/xen/include/public/memory.h b/xen/include/public/memory.h
> index 595f953..0d8c85f 100644
> --- a/xen/include/public/memory.h
> +++ b/xen/include/public/memory.h
> @@ -29,6 +29,8 @@
>  
>  #include "xen.h"
>  
> +#define XEN_NUMA_NO_NODE    0xFF
> +

I would suggest moving the define to beside the XENMEMF_{,get_}node()
definitions, which is where it is used.

(Also, someone in the future is going to have a fun time extending Xen's
node ids beyond 255, seeing as this is baked into our ABI)

>  /*
>   * Increase or decrease the specified domain's memory reservation. Returns the
>   * number of extents successfully allocated or freed.
> diff --git a/xen/include/xen/numa.h b/xen/include/xen/numa.h
> index ac4b391..60daaa5 100644
> --- a/xen/include/xen/numa.h
> +++ b/xen/include/xen/numa.h
> @@ -1,20 +1,19 @@
>  #ifndef _XEN_NUMA_H
>  #define _XEN_NUMA_H
>  
> +#include <public/memory.h>
>  #include <asm/numa.h>
>  
>  #ifndef NODES_SHIFT
>  #define NODES_SHIFT     0
>  #endif
>  
> -#define NUMA_NO_NODE    0xFF
> -

An option would be to

#define NUMA_NO_NODE XEN_NUMA_NO_NODE

Which would avoid changing any other code. We have a precedent for this
with the errno public header.

I would welcome the opinions of other maintainers.

~Andrew

>  #define MAX_NUMNODES    (1 << NODES_SHIFT)
>  
>  #define vcpu_to_node(v) (cpu_to_node((v)->processor))
>  
>  #define domain_to_node(d) \
>    (((d)->vcpu != NULL && (d)->vcpu[0] != NULL) \
> -   ? vcpu_to_node((d)->vcpu[0]) : NUMA_NO_NODE)
> +   ? vcpu_to_node((d)->vcpu[0]) : XEN_NUMA_NO_NODE)
>  
>  #endif /* _XEN_NUMA_H */

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH v6 17/23] libxl: define LIBXL_HAVE_VNUMA
  2015-02-26 15:56 ` [PATCH v6 17/23] libxl: define LIBXL_HAVE_VNUMA Wei Liu
@ 2015-02-27 13:46   ` Dario Faggioli
  0 siblings, 0 replies; 81+ messages in thread
From: Dario Faggioli @ 2015-02-27 13:46 UTC (permalink / raw)
  To: Wei Liu; +Cc: Ian Jackson, Ian Campbell, xen-devel


[-- Attachment #1.1: Type: text/plain, Size: 271 bytes --]

On Thu, 2015-02-26 at 15:56 +0000, Wei Liu wrote:
> Signed-off-by: Wei Liu <wei.liu2@citrix.com>
> Cc: Ian Campbell <ian.campbell@citrix.com>
> Cc: Ian Jackson <ian.jackson@eu.citrix.com>
>
Reviewed-by: Dario Faggioli <dario.faggioli@citrix.com>

Regards,
Dario


* Re: [PATCH v6 23/23] xl: vNUMA support
  2015-02-26 15:56 ` [PATCH v6 23/23] xl: vNUMA support Wei Liu
@ 2015-02-27 16:17   ` Dario Faggioli
  2015-03-02 15:59   ` Ian Campbell
  1 sibling, 0 replies; 81+ messages in thread
From: Dario Faggioli @ 2015-02-27 16:17 UTC (permalink / raw)
  To: Wei Liu; +Cc: Ian Jackson, Ian Campbell, xen-devel


On Thu, 2015-02-26 at 15:56 +0000, Wei Liu wrote:
> This patch includes configuration options parser and documentation.
> 
> Please find the hunk to xl.cfg.pod.5 for more information.
> 
> Signed-off-by: Wei Liu <wei.liu2@citrix.com>
> Cc: Ian Campbell <ian.campbell@citrix.com>
> Cc: Ian Jackson <ian.jackson@eu.citrix.com>
> ---
> Changes in v6:
> 1. Disable NUMA auto-placement.
> ---
>
Reviewed-and-Tested-by: Dario Faggioli <dario.faggioli@citrix.com>

Regards,
Dario


* Re: [PATCH v6 02/23] xen: move NUMA_NO_NODE to public memory.h as XEN_NUMA_NO_NODE
  2015-02-26 15:55 ` [PATCH v6 02/23] xen: move NUMA_NO_NODE to public memory.h as XEN_NUMA_NO_NODE Wei Liu
  2015-02-27 11:38   ` Andrew Cooper
@ 2015-02-27 16:42   ` Jan Beulich
  2015-02-27 16:51     ` Wei Liu
  1 sibling, 1 reply; 81+ messages in thread
From: Jan Beulich @ 2015-02-27 16:42 UTC (permalink / raw)
  To: Wei Liu; +Cc: Andrew Cooper, xen-devel

>>> On 26.02.15 at 16:55, <wei.liu2@citrix.com> wrote:
> Update NUMA_NO_NODE in Xen code to use the new macro.
> 
> No functional change introduced.

But also no explanation is given for why this is being done. After all,
just leaving out the explicit node specification in the memop flags
has the effect of saying NUMA_NO_NODE.

Jan


* Re: [PATCH v6 02/23] xen: move NUMA_NO_NODE to public memory.h as XEN_NUMA_NO_NODE
  2015-02-27 16:42   ` Jan Beulich
@ 2015-02-27 16:51     ` Wei Liu
  2015-02-27 16:52       ` Andrew Cooper
  0 siblings, 1 reply; 81+ messages in thread
From: Wei Liu @ 2015-02-27 16:51 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Andrew Cooper, Wei Liu, xen-devel

On Fri, Feb 27, 2015 at 04:42:42PM +0000, Jan Beulich wrote:
> >>> On 26.02.15 at 16:55, <wei.liu2@citrix.com> wrote:
> > Update NUMA_NO_NODE in Xen code to use the new macro.
> > 
> > No functional change introduced.
> 
> But also no explanation given why this is being done. After all just
> leaving out the explicit specification on a node in the memop flags
> has the effect of saying NUMA_NO_NODE.
> 

During the last round of review, Andrew wanted me to move this to a Xen
public header to avoid reinventing it in libxc. Now this value is used in
a libxc patch.

But I don't particularly mind whether we move it or not, it's up to you
maintainers to decide.

Wei.

> Jan
> 
> 


* Re: [PATCH v6 02/23] xen: move NUMA_NO_NODE to public memory.h as XEN_NUMA_NO_NODE
  2015-02-27 16:51     ` Wei Liu
@ 2015-02-27 16:52       ` Andrew Cooper
  2015-03-02  7:04         ` Jan Beulich
  0 siblings, 1 reply; 81+ messages in thread
From: Andrew Cooper @ 2015-02-27 16:52 UTC (permalink / raw)
  To: Wei Liu, Jan Beulich; +Cc: xen-devel

On 27/02/15 16:51, Wei Liu wrote:
> On Fri, Feb 27, 2015 at 04:42:42PM +0000, Jan Beulich wrote:
>>>>> On 26.02.15 at 16:55, <wei.liu2@citrix.com> wrote:
>>> Update NUMA_NO_NODE in Xen code to use the new macro.
>>>
>>> No functional change introduced.
>> But also no explanation given why this is being done. After all just
>> leaving out the explicit specification on a node in the memop flags
>> has the effect of saying NUMA_NO_NODE.
>>
> During last round review, Andrew wanted me to move this to Xen public
> header to avoid reinventing it in libxc. Now this value is used in libxc
> patch.
>
> But I don't particularly mind whether we move it or not, it's up to you
> maintainers to decide.

It is a sentinel value used in the public ABI.  It should therefore
appear in the public API.

~Andrew


* Re: [PATCH v6 03/23] xen: make two memory hypercalls vNUMA-aware
  2015-02-26 15:55 ` [PATCH v6 03/23] xen: make two memory hypercalls vNUMA-aware Wei Liu
@ 2015-02-27 16:59   ` Jan Beulich
  2015-02-27 17:03     ` Wei Liu
  0 siblings, 1 reply; 81+ messages in thread
From: Jan Beulich @ 2015-02-27 16:59 UTC (permalink / raw)
  To: Wei Liu; +Cc: Andrew Cooper, xen-devel

>>> On 26.02.15 at 16:55, <wei.liu2@citrix.com> wrote:
> Make XENMEM_increase_reservation and XENMEM_populate_physmap
> vNUMA-aware.
> 
> That is, if guest requests Xen to allocate memory for specific vnode,
> Xen can translate vnode to pnode using vNUMA information of that guest.
> 
> XENMEMF_vnode is introduced for the guest to mark the node number is in
> fact virtual node number and should be translated by Xen.
> 
> XENFEAT_memory_op_vnode_supported is introduced to indicate that Xen is
> able to translate virtual node to physical node.
> 
> Signed-off-by: Wei Liu <wei.liu2@citrix.com>

As I massaged your first patch (also, but not only, to do what Andrew
requested), this one will need adjustment too. Perhaps additionally if
the 2nd one is to be dropped...

Jan


* Re: [PATCH v6 03/23] xen: make two memory hypercalls vNUMA-aware
  2015-02-27 16:59   ` Jan Beulich
@ 2015-02-27 17:03     ` Wei Liu
  0 siblings, 0 replies; 81+ messages in thread
From: Wei Liu @ 2015-02-27 17:03 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Andrew Cooper, Wei Liu, xen-devel

On Fri, Feb 27, 2015 at 04:59:02PM +0000, Jan Beulich wrote:
> >>> On 26.02.15 at 16:55, <wei.liu2@citrix.com> wrote:
> > Make XENMEM_increase_reservation and XENMEM_populate_physmap
> > vNUMA-aware.
> > 
> > That is, if guest requests Xen to allocate memory for specific vnode,
> > Xen can translate vnode to pnode using vNUMA information of that guest.
> > 
> > XENMEMF_vnode is introduced for the guest to mark the node number is in
> > fact virtual node number and should be translated by Xen.
> > 
> > XENFEAT_memory_op_vnode_supported is introduced to indicate that Xen is
> > able to translate virtual node to physical node.
> > 
> > Signed-off-by: Wei Liu <wei.liu2@citrix.com>
> 
> As I massaged your first patch (also, but not only, to do what Andrew
> requested), this one will need adjustment too. Perhaps additionally if
> the 2nd one is to be dropped...
> 

I can resend after we come to conclusion on what to do with patch 2.

Wei.

> Jan


* Re: [PATCH v6 02/23] xen: move NUMA_NO_NODE to public memory.h as XEN_NUMA_NO_NODE
  2015-02-27 16:52       ` Andrew Cooper
@ 2015-03-02  7:04         ` Jan Beulich
  2015-03-02 15:30           ` Ian Campbell
  0 siblings, 1 reply; 81+ messages in thread
From: Jan Beulich @ 2015-03-02  7:04 UTC (permalink / raw)
  To: andrew.cooper3, wei.liu2; +Cc: xen-devel

>>> Andrew Cooper <andrew.cooper3@citrix.com> 02/27/15 5:58 PM >>>
>On 27/02/15 16:51, Wei Liu wrote:
>> During last round review, Andrew wanted me to move this to Xen public
>> header to avoid reinventing it in libxc. Now this value is used in libxc
>> patch.
>>
>> But I don't particularly mind whether we move it or not, it's up to you
>> maintainers to decide.
>
>It is a sentinel value used in the public ABI.  It should therefore
>appear in the public API.

Which it already does, as XENMEMF_get_node(0). I don't think it needs
particular naming as a new constant, all the more so as it isn't intended
to be used explicitly in any of the memops.
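
(For reference, the flag encoding in public/memory.h is roughly the
following -- quoted from memory, so please check the header -- which is
why a mem_flags value with no node specified decodes to 0xff:)

#define XENMEMF_node(x)     (((x) + 1) << 8)
#define XENMEMF_get_node(x) ((((x) >> 8) - 1) & 0xffu)

/* Hence XENMEMF_get_node(0) == ((0 - 1) & 0xff) == 0xff, i.e. "no node". */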

Jan


* Re: [PATCH v6 04/23] libxc: duplicate snippet to allocate p2m_host array
  2015-02-26 15:55 ` [PATCH v6 04/23] libxc: duplicate snippet to allocate p2m_host array Wei Liu
@ 2015-03-02 15:26   ` Ian Campbell
  2015-03-02 15:33     ` Wei Liu
  2015-03-02 16:46     ` Konrad Rzeszutek Wilk
  0 siblings, 2 replies; 81+ messages in thread
From: Ian Campbell @ 2015-03-02 15:26 UTC (permalink / raw)
  To: Wei Liu; +Cc: Ian Jackson, xen-devel

On Thu, 2015-02-26 at 15:55 +0000, Wei Liu wrote:
> Currently all in tree code doesn't set the superpage flag, but Konrad
> wants it retained for the moment.

... because? I can't seem to see any relevant mails from Konrad in my
vnuma patch folder.

In any case, the why is far more important than the who here.

> 
> As I'm going to change the p2m_host array allocation, duplicate the code
> snippet to allocate p2m_host array in this patch, so that we retain the
> behaviour in superpage case.
> 
> This patch introduces no functional change and it will make future patch
> easier to review. Also removed one stray tab while I was there.
> 
> Signed-off-by: Wei Liu <wei.liu2@citrix.com>
> Cc: Ian Campbell <ian.campbell@citrix.com>
> Cc: Ian Jackson <ian.jackson@eu.citrix.com>
> CC: Konrad Wilk <konrad.wilk@oracle.com>
> ---
>  tools/libxc/xc_dom_x86.c | 15 ++++++++++-----
>  1 file changed, 10 insertions(+), 5 deletions(-)
> 
> diff --git a/tools/libxc/xc_dom_x86.c b/tools/libxc/xc_dom_x86.c
> index bf06fe4..9dbaedb 100644
> --- a/tools/libxc/xc_dom_x86.c
> +++ b/tools/libxc/xc_dom_x86.c
> @@ -772,15 +772,16 @@ int arch_setup_meminit(struct xc_dom_image *dom)
>              return rc;
>      }
>  
> -    dom->p2m_host = xc_dom_malloc(dom, sizeof(xen_pfn_t) * dom->total_pages);
> -    if ( dom->p2m_host == NULL )
> -        return -EINVAL;
> -
>      if ( dom->superpages )
>      {
>          int count = dom->total_pages >> SUPERPAGE_PFN_SHIFT;
>          xen_pfn_t extents[count];
>  
> +        dom->p2m_host = xc_dom_malloc(dom, sizeof(xen_pfn_t) *
> +                                      dom->total_pages);
> +        if ( dom->p2m_host == NULL )
> +            return -EINVAL;
> +
>          DOMPRINTF("Populating memory with %d superpages", count);
>          for ( pfn = 0; pfn < count; pfn++ )
>              extents[pfn] = pfn << SUPERPAGE_PFN_SHIFT;
> @@ -809,9 +810,13 @@ int arch_setup_meminit(struct xc_dom_image *dom)
>                  return rc;
>          }
>          /* setup initial p2m */
> +        dom->p2m_host = xc_dom_malloc(dom, sizeof(xen_pfn_t) *
> +                                      dom->total_pages);
> +        if ( dom->p2m_host == NULL )
> +            return -EINVAL;
>          for ( pfn = 0; pfn < dom->total_pages; pfn++ )
>              dom->p2m_host[pfn] = pfn;
> -        
> +
>          /* allocate guest memory */
>          for ( i = rc = allocsz = 0;
>                (i < dom->total_pages) && !rc;


* Re: [PATCH v6 05/23] libxc: add p2m_size to xc_dom_image
  2015-02-26 15:55 ` [PATCH v6 05/23] libxc: add p2m_size to xc_dom_image Wei Liu
@ 2015-03-02 15:28   ` Ian Campbell
  0 siblings, 0 replies; 81+ messages in thread
From: Ian Campbell @ 2015-03-02 15:28 UTC (permalink / raw)
  To: Wei Liu; +Cc: Ian Jackson, xen-devel

On Thu, 2015-02-26 at 15:55 +0000, Wei Liu wrote:
> Add a new field p2m_size to keep track of the number of pages covered by
> p2m.  Change total_pages to p2m_size in functions which in fact need
> the size of p2m.
> 
> This is needed because we are going to ditch the assumption that PV x86
> has only one contiguous ram region. Originally the p2m size was always
> equal to total_pages, but we will soon change that in later patch.
> 
> This patch doesn't change the behaviour of libxc.
> 
> Signed-off-by: Wei Liu <wei.liu2@citrix.com>
> Reviewed-by: Dario Faggioli <dario.faggioli@citrix.com>

Acked-by: Ian Campbell <ian.campbell@citrix.com>


* Re: [PATCH v6 02/23] xen: move NUMA_NO_NODE to public memory.h as XEN_NUMA_NO_NODE
  2015-03-02  7:04         ` Jan Beulich
@ 2015-03-02 15:30           ` Ian Campbell
  2015-03-02 15:38             ` Wei Liu
  0 siblings, 1 reply; 81+ messages in thread
From: Ian Campbell @ 2015-03-02 15:30 UTC (permalink / raw)
  To: Jan Beulich; +Cc: andrew.cooper3, wei.liu2, xen-devel

On Mon, 2015-03-02 at 07:04 +0000, Jan Beulich wrote:
> >>> Andrew Cooper <andrew.cooper3@citrix.com> 02/27/15 5:58 PM >>>
> >On 27/02/15 16:51, Wei Liu wrote:
> >> During last round review, Andrew wanted me to move this to Xen public
> >> header to avoid reinventing it in libxc. Now this value is used in libxc
> >> patch.
> >>
> >> But I don't particularly mind whether we move it or not, it's up to you
> >> maintainers to decide.
> >
> >It is a sentinel value used in the public ABI.  It should therefore
> >appear in the public API.
> 
> Which it already does, as XENMEMF_get_node(0). I don't think it needs
> particular naming as a new constant, even more that it isn't intended to
> be used explicitly in any of the memops.

IMHO the named constant does seem to make the tools code at least more
readable, but without Wei having said where this is to be used I'm not
sure where it should live. In particular I'm unsure if/how/where this
value gets passed to a hypercall, as opposed to perhaps being used as a
flag within the tools (which would therefore assume it is unused in the
h/call interface).

Ian.


* Re: [PATCH v6 04/23] libxc: duplicate snippet to allocate p2m_host array
  2015-03-02 15:26   ` Ian Campbell
@ 2015-03-02 15:33     ` Wei Liu
  2015-03-02 16:18       ` Ian Campbell
  2015-03-02 16:46     ` Konrad Rzeszutek Wilk
  1 sibling, 1 reply; 81+ messages in thread
From: Wei Liu @ 2015-03-02 15:33 UTC (permalink / raw)
  To: Ian Campbell; +Cc: Ian Jackson, Wei Liu, xen-devel

On Mon, Mar 02, 2015 at 03:26:58PM +0000, Ian Campbell wrote:
> On Thu, 2015-02-26 at 15:55 +0000, Wei Liu wrote:
> > Currently all in tree code doesn't set the superpage flag, but Konrad
> > wants it retained for the moment.
> 
> ... because? I can't seem to see any relevant mails from Konrad in my
> vnuma patch folder.
> 

That was quite some time ago (last year) when I asked him about this, and
he said Oracle is still using it.

I would remove superpage support if I could, but since Konrad said they
are using it I will just leave it alone.

Wei.


* Re: [PATCH v6 09/23] libxl: introduce libxl__vnuma_config_check
  2015-02-26 15:55 ` [PATCH v6 09/23] libxl: introduce libxl__vnuma_config_check Wei Liu
@ 2015-03-02 15:34   ` Ian Campbell
  2015-03-02 15:50     ` Wei Liu
  2015-03-03  3:52     ` Dario Faggioli
  0 siblings, 2 replies; 81+ messages in thread
From: Ian Campbell @ 2015-03-02 15:34 UTC (permalink / raw)
  To: Wei Liu; +Cc: Dario Faggioli, Ian Jackson, Elena Ufimtseva, xen-devel

On Thu, 2015-02-26 at 15:55 +0000, Wei Liu wrote:
> This function is used to check whether vNUMA configuration (be it
> auto-generated or supplied by user) is valid.
> 
> Define a new error code ERROR_VNUMA_CONFIG_INVALID.
> 
> The checks performed can be found in the comment of the function.
> 
> This vNUMA function (and future ones) is placed in a new file called
> libxl_vnuma.c
> 
> Signed-off-by: Wei Liu <wei.liu2@citrix.com>

Acked-by: Ian Campbell <ian.campbell@citrix.com>

Although I'd like to see comments from Dario and/or Elena. (Ian's
concerns from the last round look to me to have been addressed; I'm sure
he'll speak up if not.)

I have one question:
> + *  4. local distance cannot be larger than remote distance

What is the "local distance" in NUMA speak?


* Re: [PATCH v6 06/23] libxc: allocate memory with vNUMA information for PV guest
  2015-02-26 15:55 ` [PATCH v6 06/23] libxc: allocate memory with vNUMA information for PV guest Wei Liu
@ 2015-03-02 15:36   ` Ian Campbell
  0 siblings, 0 replies; 81+ messages in thread
From: Ian Campbell @ 2015-03-02 15:36 UTC (permalink / raw)
  To: Wei Liu; +Cc: Dario Faggioli, Ian Jackson, Elena Ufimtseva, xen-devel

On Thu, 2015-02-26 at 15:55 +0000, Wei Liu wrote:
> From libxc's point of view, it only needs to know vnode to pnode mapping
> and size of each vnode to allocate memory accordingly. Add these fields
> to xc_dom structure.
> 
> The caller might not pass in vNUMA information. In that case, a dummy
> layout is generated for the convenience of libxc's allocation code. The
> upper layer (libxl etc) still sees the domain has no vNUMA
> configuration.
> 
> Note that for this patch on PV x86 guest can have multiple regions of
> ram allocated.
> 
> Signed-off-by: Wei Liu <wei.liu2@citrix.com>

Acked-by: Ian Campbell <ian.campbell@citrix.com>

> Cc: Ian Jackson <ian.jackson@eu.citrix.com>
> Cc: Dario Faggioli <dario.faggioli@citrix.com>
> Cc: Elena Ufimtseva <ufimtseva@gmail.com>
> ---
> Changes in v6:
> 1. Ditch XC_VNUMA_NO_NODE and use XEN_NUMA_NO_NODE.
> 2. Update comment in xc_dom.h.
> 
> Changes in v5:
> 1. Ditch xc_vnuma_info.
> 
> Changes in v4:
> 1. Pack fields into a struct.
> 2. Use "page" as unit.
> 3. __FUNCTION__ -> __func__.
> 4. Don't print total_pages.
> 5. Improve comment.
> 
> Changes in v3:
> 1. Rewrite commit log.
> 2. Shorten some error messages.
> ---
>  tools/libxc/include/xc_dom.h |  12 ++++-
>  tools/libxc/xc_dom_x86.c     | 101 +++++++++++++++++++++++++++++++++++++------
>  2 files changed, 97 insertions(+), 16 deletions(-)
> 
> diff --git a/tools/libxc/include/xc_dom.h b/tools/libxc/include/xc_dom.h
> index 6b8ddf4..a7d059a 100644
> --- a/tools/libxc/include/xc_dom.h
> +++ b/tools/libxc/include/xc_dom.h
> @@ -119,8 +119,10 @@ struct xc_dom_image {
>  
>      /* physical memory
>       *
> -     * An x86 PV guest has a single contiguous block of physical RAM,
> -     * consisting of total_pages starting at rambase_pfn.
> +     * An x86 PV guest has one or more blocks of physical RAM,
> +     * consisting of total_pages starting at rambase_pfn. The start
> +     * address and size of each block is controlled by vNUMA
> +     * structures.
>       *
>       * An ARM guest has GUEST_RAM_BANKS regions of RAM, with
>       * rambank_size[i] pages in each. The lowest RAM address
> @@ -168,6 +170,12 @@ struct xc_dom_image {
>      struct xc_dom_loader *kernel_loader;
>      void *private_loader;
>  
> +    /* vNUMA information */
> +    xen_vmemrange_t *vmemranges;
> +    unsigned int nr_vmemranges;
> +    unsigned int *vnode_to_pnode;
> +    unsigned int nr_vnodes;
> +
>      /* kernel loader */
>      struct xc_dom_arch *arch_hooks;
>      /* allocate up to virt_alloc_end */
> diff --git a/tools/libxc/xc_dom_x86.c b/tools/libxc/xc_dom_x86.c
> index bea54f2..268d4db 100644
> --- a/tools/libxc/xc_dom_x86.c
> +++ b/tools/libxc/xc_dom_x86.c
> @@ -760,7 +760,8 @@ static int x86_shadow(xc_interface *xch, domid_t domid)
>  int arch_setup_meminit(struct xc_dom_image *dom)
>  {
>      int rc;
> -    xen_pfn_t pfn, allocsz, i, j, mfn;
> +    xen_pfn_t pfn, allocsz, mfn, total, pfn_base;
> +    int i, j;
>  
>      rc = x86_compat(dom->xch, dom->guest_domid, dom->guest_type);
>      if ( rc )
> @@ -811,26 +812,98 @@ int arch_setup_meminit(struct xc_dom_image *dom)
>              if ( rc )
>                  return rc;
>          }
> -        /* setup initial p2m */
> -        dom->p2m_size = dom->total_pages;
> +
> +        /* Setup dummy vNUMA information if it's not provided. Note
> +         * that this is a valid state if libxl doesn't provide any
> +         * vNUMA information.
> +         *
> +         * The dummy values make libxc allocate all pages from
> +         * arbitrary physical nodes. This is the expected behaviour if
> +         * no vNUMA configuration is provided to libxc.
> +         *
> +         * Note that the following hunk is just for the convenience of
> +         * allocation code. No defaulting happens in libxc.
> +         */
> +        if ( dom->nr_vmemranges == 0 )
> +        {
> +            dom->nr_vmemranges = 1;
> +            dom->vmemranges = xc_dom_malloc(dom, sizeof(*dom->vmemranges));
> +            dom->vmemranges[0].start = 0;
> +            dom->vmemranges[0].end   = dom->total_pages << PAGE_SHIFT;
> +            dom->vmemranges[0].flags = 0;
> +            dom->vmemranges[0].nid   = 0;
> +
> +            dom->nr_vnodes = 1;
> +            dom->vnode_to_pnode = xc_dom_malloc(dom,
> +                                      sizeof(*dom->vnode_to_pnode));
> +            dom->vnode_to_pnode[0] = XEN_NUMA_NO_NODE;
> +        }
> +
> +        total = dom->p2m_size = 0;
> +        for ( i = 0; i < dom->nr_vmemranges; i++ )
> +        {
> +            total += ((dom->vmemranges[i].end - dom->vmemranges[i].start)
> +                      >> PAGE_SHIFT);
> +            dom->p2m_size =
> +                dom->p2m_size > (dom->vmemranges[i].end >> PAGE_SHIFT) ?
> +                dom->p2m_size : (dom->vmemranges[i].end >> PAGE_SHIFT);
> +        }
> +        if ( total != dom->total_pages )
> +        {
> +            xc_dom_panic(dom->xch, XC_INTERNAL_ERROR,
> +                         "%s: vNUMA page count mismatch (0x%"PRIpfn" != 0x%"PRIpfn")\n",
> +                         __func__, total, dom->total_pages);
> +            return -EINVAL;
> +        }
> +
>          dom->p2m_host = xc_dom_malloc(dom, sizeof(xen_pfn_t) *
>                                        dom->p2m_size);
>          if ( dom->p2m_host == NULL )
>              return -EINVAL;
> -        for ( pfn = 0; pfn < dom->total_pages; pfn++ )
> -            dom->p2m_host[pfn] = pfn;
> +        for ( pfn = 0; pfn < dom->p2m_size; pfn++ )
> +            dom->p2m_host[pfn] = INVALID_P2M_ENTRY;
>  
>          /* allocate guest memory */
> -        for ( i = rc = allocsz = 0;
> -              (i < dom->total_pages) && !rc;
> -              i += allocsz )
> +        for ( i = 0; i < dom->nr_vmemranges; i++ )
>          {
> -            allocsz = dom->total_pages - i;
> -            if ( allocsz > 1024*1024 )
> -                allocsz = 1024*1024;
> -            rc = xc_domain_populate_physmap_exact(
> -                dom->xch, dom->guest_domid, allocsz,
> -                0, 0, &dom->p2m_host[i]);
> +            unsigned int memflags;
> +            uint64_t pages;
> +            unsigned int pnode = dom->vnode_to_pnode[dom->vmemranges[i].nid];
> +
> +            memflags = 0;
> +            if ( pnode != XEN_NUMA_NO_NODE )
> +                memflags |= XENMEMF_exact_node(pnode);
> +
> +            pages = (dom->vmemranges[i].end - dom->vmemranges[i].start)
> +                >> PAGE_SHIFT;
> +            pfn_base = dom->vmemranges[i].start >> PAGE_SHIFT;
> +
> +            for ( pfn = pfn_base; pfn < pfn_base+pages; pfn++ )
> +                dom->p2m_host[pfn] = pfn;
> +
> +            for ( j = 0; j < pages; j += allocsz )
> +            {
> +                allocsz = pages - j;
> +                if ( allocsz > 1024*1024 )
> +                    allocsz = 1024*1024;
> +
> +                rc = xc_domain_populate_physmap_exact(dom->xch,
> +                         dom->guest_domid, allocsz, 0, memflags,
> +                         &dom->p2m_host[pfn_base+j]);
> +
> +                if ( rc )
> +                {
> +                    if ( pnode != XEN_NUMA_NO_NODE )
> +                        xc_dom_panic(dom->xch, XC_INTERNAL_ERROR,
> +                                     "%s: failed to allocate 0x%"PRIx64" pages (v=%d, p=%d)\n",
> +                                     __func__, pages, i, pnode);
> +                    else
> +                        xc_dom_panic(dom->xch, XC_INTERNAL_ERROR,
> +                                     "%s: failed to allocate 0x%"PRIx64" pages\n",
> +                                     __func__, pages);
> +                    return rc;
> +                }
> +            }
>          }
>  
>          /* Ensure no unclaimed pages are left unused.


* Re: [PATCH v6 02/23] xen: move NUMA_NO_NODE to public memory.h as XEN_NUMA_NO_NODE
  2015-03-02 15:30           ` Ian Campbell
@ 2015-03-02 15:38             ` Wei Liu
  2015-03-02 15:51               ` Jan Beulich
  0 siblings, 1 reply; 81+ messages in thread
From: Wei Liu @ 2015-03-02 15:38 UTC (permalink / raw)
  To: Ian Campbell; +Cc: andrew.cooper3, wei.liu2, Jan Beulich, xen-devel

On Mon, Mar 02, 2015 at 03:30:21PM +0000, Ian Campbell wrote:
> On Mon, 2015-03-02 at 07:04 +0000, Jan Beulich wrote:
> > >>> Andrew Cooper <andrew.cooper3@citrix.com> 02/27/15 5:58 PM >>>
> > >On 27/02/15 16:51, Wei Liu wrote:
> > >> During last round review, Andrew wanted me to move this to Xen public
> > >> header to avoid reinventing it in libxc. Now this value is used in libxc
> > >> patch.
> > >>
> > >> But I don't particularly mind whether we move it or not, it's up to you
> > >> maintainers to decide.
> > >
> > >It is a sentinel value used in the public ABI.  It should therefore
> > >appear in the public API.
> > 
> > Which it already does, as XENMEMF_get_node(0). I don't think it needs
> > particular naming as a new constant, even more that it isn't intended to
> > be used explicitly in any of the memops.
> 
> IMHO the named constant does seem to make the tools code at least more
> readable, but without Wei having said where this is to be used I'm not
> sure where it should live. In particular I'm unsure if/how/where this
> value gets passed to a hypercall, as opposed to perhaps being used as a

This is used to fill in the vnode_to_pnode array. That array gets
subsequently passed down to the hypervisor.
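
On the libxl side it ends up roughly like this (sketch only; have_affinity
and pnode_for_vnode are made-up names for illustration):

/* One entry per vnode; XEN_NUMA_NO_NODE means "no affinity, let Xen
 * allocate from wherever it likes". */
for (i = 0; i < nr_vnodes; i++)
    vnode_to_pnode[i] = have_affinity ? pnode_for_vnode[i]
                                      : XEN_NUMA_NO_NODE;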

Wei.

> flag within the tools (which would therefore assume it is unused in the
> h/call interface).
> 
> Ian.


* Re: [PATCH v6 11/23] libxl: functions to build vmemranges for PV guest
  2015-02-26 15:55 ` [PATCH v6 11/23] libxl: functions to build vmemranges for PV guest Wei Liu
  2015-02-26 16:39   ` Dario Faggioli
@ 2015-03-02 15:41   ` Ian Campbell
  2015-03-02 17:52     ` Wei Liu
  1 sibling, 1 reply; 81+ messages in thread
From: Ian Campbell @ 2015-03-02 15:41 UTC (permalink / raw)
  To: Wei Liu; +Cc: Dario Faggioli, Ian Jackson, Elena Ufimtseva, xen-devel

On Thu, 2015-02-26 at 15:55 +0000, Wei Liu wrote:
> Introduce a arch-independent routine to generate one vmemrange per
> vnode. Also introduce arch-dependent routines for different
> architectures because part of the process is arch-specific -- ARM has
> yet have NUMA support and E820 is x86 only.
> 
> For those x86 guests who care about machine E820 map (i.e. with
> e820_host=1), vnode is further split into several vmemranges to
> accommodate memory holes.  A few stubs for libxl_arm.c are created.
> 
> Signed-off-by: Wei Liu <wei.liu2@citrix.com>
> Reviewed-by: Dario Faggioli <dario.faggioli@citrix.com>

Acked-by: Ian Campbell <ian.campbell@citrix.com>

Although one comment:

> +    e820_count = 0;
> +    nr_vmemrange = 0;
> +    vmemranges = NULL;
> +    for (nid = 0; nid < b_info->num_vnuma_nodes; nid++) {
> +        libxl_vnode_info *p = &b_info->vnuma_nodes[nid];
> +        uint64_t remaining_bytes = (p->memkb << 10), bytes;
> +
> +        while (remaining_bytes > 0) {
> +            if (e820_count >= nr_e820) {
> +                rc = ERROR_NOMEM;
> +                goto out;
> +            }
> +
> +            /* Skip non RAM region */
> +            if (map[e820_count].type != E820_RAM) {
> +                e820_count++;
> +                continue;
> +            }
> +
> +            GCREALLOC_ARRAY(vmemranges, nr_vmemrange+1);

Once we've hit the limit this is going to reallocate on every iteration
(and the limit starts at zero, so that's every iteration), which means an
awful lot of reallocing.

Perhaps start with some non-zero size and increase by a non-linear
amount as needed? That would require tracking the array size separately
from nr_vmemrange, of course.
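
Something along these lines, perhaps (plain C and purely illustrative; in
libxl it would go through the gc allocators rather than realloc, and all
the names here are made up):

#include <stdlib.h>

/* Grow geometrically so we do O(log n) reallocations rather than one per
 * appended element. */
static void *grow_array(void *arr, size_t elem_size,
                        size_t needed, size_t *capacity)
{
    size_t new_cap;
    void *tmp;

    if (needed <= *capacity)
        return arr;

    new_cap = *capacity ? *capacity * 2 : 16;
    while (new_cap < needed)
        new_cap *= 2;

    tmp = realloc(arr, new_cap * elem_size);
    if (tmp)
        *capacity = new_cap;
    return tmp;              /* NULL on failure; the old array is untouched */
}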

Ian.


* Re: [PATCH v6 14/23] libxc: allocate memory with vNUMA information for HVM guest
  2015-02-26 15:55 ` [PATCH v6 14/23] libxc: allocate memory with vNUMA information for HVM guest Wei Liu
@ 2015-03-02 15:43   ` Ian Campbell
  0 siblings, 0 replies; 81+ messages in thread
From: Ian Campbell @ 2015-03-02 15:43 UTC (permalink / raw)
  To: Wei Liu; +Cc: Dario Faggioli, Ian Jackson, Elena Ufimtseva, xen-devel

On Thu, 2015-02-26 at 15:55 +0000, Wei Liu wrote:
> The algorithm is more or less the same as the one used for PV guest.
> Libxc gets hold of the mapping of vnode to pnode and size of each vnode
> then allocate memory accordingly.
> 
> And then the function returns low memory end, high memory end and mmio
> start to caller. Libxl needs those values to construct vmemranges for
> that guest.
> 
> Signed-off-by: Wei Liu <wei.liu2@citrix.com>

Acked-by: Ian Campbell <ian.campbell@citrix.com>


* Re: [PATCH v6 15/23] libxl: build, check and pass vNUMA info to Xen for HVM guest
  2015-02-26 15:55 ` [PATCH v6 15/23] libxl: build, check and pass vNUMA info to Xen " Wei Liu
@ 2015-03-02 15:44   ` Ian Campbell
  0 siblings, 0 replies; 81+ messages in thread
From: Ian Campbell @ 2015-03-02 15:44 UTC (permalink / raw)
  To: Wei Liu; +Cc: Dario Faggioli, Ian Jackson, Elena Ufimtseva, xen-devel

On Thu, 2015-02-26 at 15:55 +0000, Wei Liu wrote:
> Transform user supplied vNUMA configuration into libxl internal
> representations then libxc representations. Check validity along the
> line.
> 
> Libxc has more involvement in building vmemranges in HVM case compared
> to PV case. The building of vmemranges is placed after xc_hvm_build
> returns, because it relies on memory hole information provided by
> xc_hvm_build.
> 
> Signed-off-by: Wei Liu <wei.liu2@citrix.com>
> Reviewed-by: Dario Faggioli <dario.faggioli@citrix.com>

Acked-by: Ian Campbell <ian.campbell@citrix.com>


* Re: [PATCH v6 16/23] libxl: disallow memory relocation when vNUMA is enabled
  2015-02-26 15:55 ` [PATCH v6 16/23] libxl: disallow memory relocation when vNUMA is enabled Wei Liu
@ 2015-03-02 15:46   ` Ian Campbell
  0 siblings, 0 replies; 81+ messages in thread
From: Ian Campbell @ 2015-03-02 15:46 UTC (permalink / raw)
  To: Wei Liu; +Cc: Ian Jackson, xen-devel

On Thu, 2015-02-26 at 15:55 +0000, Wei Liu wrote:
> Disallow memory relocation when vNUMA is enabled, because relocated
> memory ends up off node. Further more, even if we dynamically expand
> node coverage in hvmloader, low memory and high memory may reside
> in different physical nodes, blindly relocating low memory to high
> memory gives us a sub-optimal configuration.
> 
> Introduce a function called libxl__vnuma_configured and use it.
> 
> Signed-off-by: Wei Liu <wei.liu2@citrix.com>

Acked-by: Ian Campbell <ian.campbell@citrix.com>

> Cc: Ian Jackson <ian.jackson@eu.citrix.com>
> Cc: Konrad Wilk <konrad.wilk@oracle.com>
> ---
> Changes in v6:
> 1. Introduce a helper function.
> ---
>  tools/libxl/libxl_dm.c       | 6 ++++--
>  tools/libxl/libxl_internal.h | 1 +
>  tools/libxl/libxl_vnuma.c    | 5 +++++
>  3 files changed, 10 insertions(+), 2 deletions(-)
> 
> diff --git a/tools/libxl/libxl_dm.c b/tools/libxl/libxl_dm.c
> index 8599a6a..7b09512 100644
> --- a/tools/libxl/libxl_dm.c
> +++ b/tools/libxl/libxl_dm.c
> @@ -1365,13 +1365,15 @@ void libxl__spawn_local_dm(libxl__egc *egc, libxl__dm_spawn_state *dmss)
>                          libxl__sprintf(gc, "%s/hvmloader/bios", path),
>                          "%s", libxl_bios_type_to_string(b_info->u.hvm.bios));
>          /* Disable relocating memory to make the MMIO hole larger
> -         * unless we're running qemu-traditional */
> +         * unless we're running qemu-traditional and vNUMA is not
> +         * configured. */
>          libxl__xs_write(gc, XBT_NULL,
>                          libxl__sprintf(gc,
>                                         "%s/hvmloader/allow-memory-relocate",
>                                         path),
>                          "%d",
> -                        b_info->device_model_version==LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN_TRADITIONAL);
> +                        b_info->device_model_version==LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN_TRADITIONAL &&
> +                        !libxl__vnuma_configured(b_info));
>          free(path);
>      }
>  
> diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
> index e93089a..d04b6aa 100644
> --- a/tools/libxl/libxl_internal.h
> +++ b/tools/libxl/libxl_internal.h
> @@ -3413,6 +3413,7 @@ int libxl__vnuma_build_vmemrange_hvm(libxl__gc *gc,
>                                       libxl_domain_build_info *b_info,
>                                       libxl__domain_build_state *state,
>                                       struct xc_hvm_build_args *args);
> +bool libxl__vnuma_configured(const libxl_domain_build_info *b_info);
>  
>  _hidden int libxl__ms_vm_genid_set(libxl__gc *gc, uint32_t domid,
>                                     const libxl_ms_vm_genid *id);
> diff --git a/tools/libxl/libxl_vnuma.c b/tools/libxl/libxl_vnuma.c
> index a0576ee..6af3cde 100644
> --- a/tools/libxl/libxl_vnuma.c
> +++ b/tools/libxl/libxl_vnuma.c
> @@ -17,6 +17,11 @@
>  #include "libxl_arch.h"
>  #include <stdlib.h>
>  
> +bool libxl__vnuma_configured(const libxl_domain_build_info *b_info)
> +{
> +    return b_info->num_vnuma_nodes != 0;
> +}
> +
>  /* Sort vmemranges in ascending order with "start" */
>  static int compare_vmemrange(const void *a, const void *b)
>  {


* Re: [PATCH v6 09/23] libxl: introduce libxl__vnuma_config_check
  2015-03-02 15:34   ` Ian Campbell
@ 2015-03-02 15:50     ` Wei Liu
  2015-03-03  3:52     ` Dario Faggioli
  1 sibling, 0 replies; 81+ messages in thread
From: Wei Liu @ 2015-03-02 15:50 UTC (permalink / raw)
  To: Ian Campbell
  Cc: Ian Jackson, Dario Faggioli, Wei Liu, Elena Ufimtseva, xen-devel

On Mon, Mar 02, 2015 at 03:34:58PM +0000, Ian Campbell wrote:
> On Thu, 2015-02-26 at 15:55 +0000, Wei Liu wrote:
> > This function is used to check whether vNUMA configuration (be it
> > auto-generated or supplied by user) is valid.
> > 
> > Define a new error code ERROR_VNUMA_CONFIG_INVALID.
> > 
> > The checks performed can be found in the comment of the function.
> > 
> > This vNUMA function (and future ones) is placed in a new file called
> > libxl_vnuma.c
> > 
> > Signed-off-by: Wei Liu <wei.liu2@citrix.com>
> 
> Acked-by: Ian Campbell <ian.campbell@citrix.com>
> 
> Although I'd like to see comment from Dario and/or Elena. (Ian's
> concerns from last round look to me to have been addresses, Im sure
> he'll speak up if not).
> 
> I have one question:
> > + *  4. local distance cannot be larger than remote distance
> 
> What is the "local distance" in NUMA speak?
> 

Local distance means the distance from a node to itself.
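
So check 4 boils down to something like this (just a sketch; the real code
walks the libxl vnuma_nodes structures, and dist[][] is a made-up name):

/* dist[i][j] is the distance from vnode i to vnode j. */
for (i = 0; i < nr_vnodes; i++)
    for (j = 0; j < nr_vnodes; j++)
        if (dist[i][i] > dist[i][j])   /* local distance larger than remote */
            return ERROR_VNUMA_CONFIG_INVALID;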

Wei.


* Re: [PATCH v6 22/23] xl: introduce xcalloc
  2015-02-26 15:56 ` [PATCH v6 22/23] xl: introduce xcalloc Wei Liu
@ 2015-03-02 15:51   ` Ian Campbell
  0 siblings, 0 replies; 81+ messages in thread
From: Ian Campbell @ 2015-03-02 15:51 UTC (permalink / raw)
  To: Wei Liu; +Cc: Ian Jackson, xen-devel

On Thu, 2015-02-26 at 15:56 +0000, Wei Liu wrote:
> Signed-off-by: Wei Liu <wei.liu2@citrix.com>

Acked-by: Ian Campbell <ian.campbell@citrix.com>


* Re: [PATCH v6 02/23] xen: move NUMA_NO_NODE to public memory.h as XEN_NUMA_NO_NODE
  2015-03-02 15:38             ` Wei Liu
@ 2015-03-02 15:51               ` Jan Beulich
  2015-03-02 16:08                 ` Wei Liu
  0 siblings, 1 reply; 81+ messages in thread
From: Jan Beulich @ 2015-03-02 15:51 UTC (permalink / raw)
  To: Wei Liu; +Cc: andrew.cooper3, Ian Campbell, xen-devel

>>> On 02.03.15 at 16:38, <wei.liu2@citrix.com> wrote:
> On Mon, Mar 02, 2015 at 03:30:21PM +0000, Ian Campbell wrote:
>> On Mon, 2015-03-02 at 07:04 +0000, Jan Beulich wrote:
>> > >>> Andrew Cooper <andrew.cooper3@citrix.com> 02/27/15 5:58 PM >>>
>> > >On 27/02/15 16:51, Wei Liu wrote:
>> > >> During last round review, Andrew wanted me to move this to Xen public
>> > >> header to avoid reinventing it in libxc. Now this value is used in libxc
>> > >> patch.
>> > >>
>> > >> But I don't particularly mind whether we move it or not, it's up to you
>> > >> maintainers to decide.
>> > >
>> > >It is a sentinel value used in the public ABI.  It should therefore
>> > >appear in the public API.
>> > 
>> > Which it already does, as XENMEMF_get_node(0). I don't think it needs
>> > particular naming as a new constant, even more that it isn't intended to
>> > be used explicitly in any of the memops.
>> 
>> IMHO the named constant does seem to make the tools code at least more
>> readable, but without Wei having said where this is to be used I'm not
>> sure where it should live. In particular I'm unsure if/how/where this
>> value gets passed to a hypercall, as opposed to perhaps being used as a
> 
> This is used to fill in vnode_to_pnode array. That array get
> subsequently passed down to hypervisor.

Do we really accept NUMA_NO_NODE to be passed that way?

Jan


* Re: [PATCH v6 23/23] xl: vNUMA support
  2015-02-26 15:56 ` [PATCH v6 23/23] xl: vNUMA support Wei Liu
  2015-02-27 16:17   ` Dario Faggioli
@ 2015-03-02 15:59   ` Ian Campbell
  2015-03-02 16:31     ` Wei Liu
  1 sibling, 1 reply; 81+ messages in thread
From: Ian Campbell @ 2015-03-02 15:59 UTC (permalink / raw)
  To: Wei Liu; +Cc: Ian Jackson, xen-devel

On Thu, 2015-02-26 at 15:56 +0000, Wei Liu wrote:
> This patch includes configuration options parser and documentation.
> 
> Please find the hunk to xl.cfg.pod.5 for more information.
> 
> Signed-off-by: Wei Liu <wei.liu2@citrix.com>
> Cc: Ian Campbell <ian.campbell@citrix.com>
> Cc: Ian Jackson <ian.jackson@eu.citrix.com>
> ---
> Changes in v6:
> 1. Disable NUMA auto-placement.
> ---
>  docs/man/xl.cfg.pod.5    |  54 ++++++++++++++++++
>  tools/libxl/xl_cmdimpl.c | 140 ++++++++++++++++++++++++++++++++++++++++++++++-
>  2 files changed, 193 insertions(+), 1 deletion(-)
> 
> diff --git a/docs/man/xl.cfg.pod.5 b/docs/man/xl.cfg.pod.5
> index 408653f..2a27b1c 100644
> --- a/docs/man/xl.cfg.pod.5
> +++ b/docs/man/xl.cfg.pod.5
> @@ -266,6 +266,60 @@ it will crash.
>  
>  =back
>  
> +=head3 Guest Virtual NUMA Configuration
> +
> +=over 4
> +
> +=item B<vnuma=[ VNODE_SPEC, VNODE_SPEC, ... ]
> +
> +Specify virtual NUMA configuration with positional arguments. The
> +nth B<VNODE_SPECE> in the list specifies the configuration of nth

                    ^stray E?

> +virtual node.
> +
> +Each B<VNODE_SPEC> is a list, which has a form of
> +"[VNODE_CONFIG_OPTION,VNODE_CONFIG_OPTION, ... ]"  (without quotes).

BTW, near the top of this file is:
        A list of C<VALUES> of the above types. Lists are homogeneous and are
        not nested.

"are not nested" is no longer true.

> +
> +For example vnuma = [ ["pnode=0","size=512","vcpus=0-4","vdistances=10,20"] ]
> +means vnode 0 is mapped to pnode 0, has 512MB ram, has vcpus 0 to 4, the
> +distance to itself is 10 and the distance to vnode 1 is 20.
> +
> +Each B<VNODE_CONFIG_OPTION> is a quoted string.

s/string/key=value pair/?

>  Supported
> +B<VNODE_CONFIG_OPTION>s are:
> +
> +=over 4
> +
> +=item B<pnode=NUMBER>
> +
> +Specify which physical node this virtual node maps to.
> +
> +=item B<size=MBYTES>
> +
> +Specify the size of this virtual node. The sum of memory size of all
> +vnodes must match B<maxmem=> (or B<memory=> if B<maxmem=> is not
> +specified).

Didn't you suggest in a reply to v4 that this need for the user to do
tedious bookkeeping could go away after some other change: "This point
becomes moot if we use other syntax to specify vcpu list."?

Should it be allowable to not specify maxmem and have xl do the sums over
the vnodes to come up with the correct value? (memory= could still be
given in that case.)

The code looks fine.


* Re: [PATCH v6 02/23] xen: move NUMA_NO_NODE to public memory.h as XEN_NUMA_NO_NODE
  2015-03-02 15:51               ` Jan Beulich
@ 2015-03-02 16:08                 ` Wei Liu
  2015-03-02 16:27                   ` Jan Beulich
  0 siblings, 1 reply; 81+ messages in thread
From: Wei Liu @ 2015-03-02 16:08 UTC (permalink / raw)
  To: Jan Beulich; +Cc: andrew.cooper3, Wei Liu, Ian Campbell, xen-devel

On Mon, Mar 02, 2015 at 03:51:37PM +0000, Jan Beulich wrote:
> >>> On 02.03.15 at 16:38, <wei.liu2@citrix.com> wrote:
> > On Mon, Mar 02, 2015 at 03:30:21PM +0000, Ian Campbell wrote:
> >> On Mon, 2015-03-02 at 07:04 +0000, Jan Beulich wrote:
> >> > >>> Andrew Cooper <andrew.cooper3@citrix.com> 02/27/15 5:58 PM >>>
> >> > >On 27/02/15 16:51, Wei Liu wrote:
> >> > >> During last round review, Andrew wanted me to move this to Xen public
> >> > >> header to avoid reinventing it in libxc. Now this value is used in libxc
> >> > >> patch.
> >> > >>
> >> > >> But I don't particularly mind whether we move it or not, it's up to you
> >> > >> maintainers to decide.
> >> > >
> >> > >It is a sentinel value used in the public ABI.  It should therefore
> >> > >appear in the public API.
> >> > 
> >> > Which it already does, as XENMEMF_get_node(0). I don't think it needs
> >> > particular naming as a new constant, even more that it isn't intended to
> >> > be used explicitly in any of the memops.
> >> 
> >> IMHO the named constant does seem to make the tools code at least more
> >> readable, but without Wei having said where this is to be used I'm not
> >> sure where it should live. In particular I'm unsure if/how/where this
> >> value gets passed to a hypercall, as opposed to perhaps being used as a
> > 
> > This is used to fill in vnode_to_pnode array. That array get
> > subsequently passed down to hypervisor.
> 
> Do we really accept NUMA_NO_NODE to be passed that way?
> 

public/domctl.h: struct xen_domctl_vnuma has a vnode_to_pnode array.

Wei.

> Jan


* Re: [PATCH v6 04/23] libxc: duplicate snippet to allocate p2m_host array
  2015-03-02 15:33     ` Wei Liu
@ 2015-03-02 16:18       ` Ian Campbell
  2015-03-02 16:45         ` Konrad Rzeszutek Wilk
  0 siblings, 1 reply; 81+ messages in thread
From: Ian Campbell @ 2015-03-02 16:18 UTC (permalink / raw)
  To: Wei Liu; +Cc: Ian Jackson, xen-devel

On Mon, 2015-03-02 at 15:33 +0000, Wei Liu wrote:
> On Mon, Mar 02, 2015 at 03:26:58PM +0000, Ian Campbell wrote:
> > On Thu, 2015-02-26 at 15:55 +0000, Wei Liu wrote:
> > > Currently all in tree code doesn't set the superpage flag, but Konrad
> > > wants it retained for the moment.
> > 
> > ... because? I can't seem to see any relevant mails from Konrad in my
> > vnuma patch folder.
> > 
> 
> That was quite some time ago (last year) I asked him about this and he
> said oracle is still using it.

and will still be using it when they upgrade to 4.6?

> I would remove superpage support if I can, but since Konrad said they
> are using it I will just leave it alone.

Please can you explain this in the commit log then.


* Re: [PATCH v6 02/23] xen: move NUMA_NO_NODE to public memory.h as XEN_NUMA_NO_NODE
  2015-03-02 16:08                 ` Wei Liu
@ 2015-03-02 16:27                   ` Jan Beulich
  2015-03-02 16:39                     ` Wei Liu
  0 siblings, 1 reply; 81+ messages in thread
From: Jan Beulich @ 2015-03-02 16:27 UTC (permalink / raw)
  To: Wei Liu; +Cc: andrew.cooper3, Ian Campbell, xen-devel

>>> On 02.03.15 at 17:08, <wei.liu2@citrix.com> wrote:
> On Mon, Mar 02, 2015 at 03:51:37PM +0000, Jan Beulich wrote:
>> >>> On 02.03.15 at 16:38, <wei.liu2@citrix.com> wrote:
>> > On Mon, Mar 02, 2015 at 03:30:21PM +0000, Ian Campbell wrote:
>> >> On Mon, 2015-03-02 at 07:04 +0000, Jan Beulich wrote:
>> >> > >>> Andrew Cooper <andrew.cooper3@citrix.com> 02/27/15 5:58 PM >>>
>> >> > >On 27/02/15 16:51, Wei Liu wrote:
>> >> > >> During last round review, Andrew wanted me to move this to Xen public
>> >> > >> header to avoid reinventing it in libxc. Now this value is used in libxc
>> >> > >> patch.
>> >> > >>
>> >> > >> But I don't particularly mind whether we move it or not, it's up to you
>> >> > >> maintainers to decide.
>> >> > >
>> >> > >It is a sentinel value used in the public ABI.  It should therefore
>> >> > >appear in the public API.
>> >> > 
>> >> > Which it already does, as XENMEMF_get_node(0). I don't think it needs
>> >> > particular naming as a new constant, even more that it isn't intended to
>> >> > be used explicitly in any of the memops.
>> >> 
>> >> IMHO the named constant does seem to make the tools code at least more
>> >> readable, but without Wei having said where this is to be used I'm not
>> >> sure where it should live. In particular I'm unsure if/how/where this
>> >> value gets passed to a hypercall, as opposed to perhaps being used as a
>> > 
>> > This is used to fill in vnode_to_pnode array. That array get
>> > subsequently passed down to hypervisor.
>> 
>> Do we really accept NUMA_NO_NODE to be passed that way?
>> 
> 
> public/domctl.h:struct xen_domctl_vnuma has vnode_to_pnode array.

That wasn't my concern - I was rather wondering why we would
accept any of this array's fields to be set to "no node".

Jan


* Re: [PATCH v6 23/23] xl: vNUMA support
  2015-03-02 15:59   ` Ian Campbell
@ 2015-03-02 16:31     ` Wei Liu
  0 siblings, 0 replies; 81+ messages in thread
From: Wei Liu @ 2015-03-02 16:31 UTC (permalink / raw)
  To: Ian Campbell; +Cc: Ian Jackson, Wei Liu, xen-devel

On Mon, Mar 02, 2015 at 03:59:19PM +0000, Ian Campbell wrote:
> On Thu, 2015-02-26 at 15:56 +0000, Wei Liu wrote:
> > This patch includes configuration options parser and documentation.
> > 
> > Please find the hunk to xl.cfg.pod.5 for more information.
> > 
> > Signed-off-by: Wei Liu <wei.liu2@citrix.com>
> > Cc: Ian Campbell <ian.campbell@citrix.com>
> > Cc: Ian Jackson <ian.jackson@eu.citrix.com>
> > ---
> > Changes in v6:
> > 1. Disable NUMA auto-placement.
> > ---
> >  docs/man/xl.cfg.pod.5    |  54 ++++++++++++++++++
> >  tools/libxl/xl_cmdimpl.c | 140 ++++++++++++++++++++++++++++++++++++++++++++++-
> >  2 files changed, 193 insertions(+), 1 deletion(-)
> > 
> > diff --git a/docs/man/xl.cfg.pod.5 b/docs/man/xl.cfg.pod.5
> > index 408653f..2a27b1c 100644
> > --- a/docs/man/xl.cfg.pod.5
> > +++ b/docs/man/xl.cfg.pod.5
> > @@ -266,6 +266,60 @@ it will crash.
> >  
> >  =back
> >  
> > +=head3 Guest Virtual NUMA Configuration
> > +
> > +=over 4
> > +
> > +=item B<vnuma=[ VNODE_SPEC, VNODE_SPEC, ... ]
> > +
> > +Specify virtual NUMA configuration with positional arguments. The
> > +nth B<VNODE_SPECE> in the list specifies the configuration of nth
> 
>                     ^stray E?
> 

Fixed.

> > +virtual node.
> > +
> > +Each B<VNODE_SPEC> is a list, which has a form of
> > +"[VNODE_CONFIG_OPTION,VNODE_CONFIG_OPTION, ... ]"  (without quotes).
> 
> BTW, near the top of this file is:
>         A list of C<VALUES> of the above types. Lists are homogeneous and are
>         not nested.
> 
> "are not nested" is no longer true.
> 

I will add a patch after my changes to libxlu, so that we don't mix that
up with vNUMA configuration options.

> > +
> > +For example vnuma = [ ["pnode=0","size=512","vcpus=0-4","vdistances=10,20"] ]
> > +means vnode 0 is mapped to pnode 0, has 512MB ram, has vcpus 0 to 4, the
> > +distance to itself is 10 and the distance to vnode 1 is 20.
> > +
> > +Each B<VNODE_CONFIG_OPTION> is a quoted string.
> 
> s/string/key=value pair/?

Fixed.

> 
> >  Supported
> > +B<VNODE_CONFIG_OPTION>s are:
> > +
> > +=over 4
> > +
> > +=item B<pnode=NUMBER>
> > +
> > +Specify which physical node this virtual node maps to.
> > +
> > +=item B<size=MBYTES>
> > +
> > +Specify the size of this virtual node. The sum of memory size of all
> > +vnodes must match B<maxmem=> (or B<memory=> if B<maxmem=> is not
> > +specified).
> 
> Didn't you suggest in a reply v4 that this need for the user to do
> tedious book keeping could go away after some other change: "This point
> becomes moot if we use other syntax to specify vcpu list."?
> 

That was for specifying vcpus I think.

> Should it be allowable to no specify maxmem and have xl do the sums over
> the vnodes  to come up with the correct value? (memory= could still be
> given in that case).
> 

I missed this point.

I think your suggestion is good. We can derive maxmem= from vnuma to
relieve the user from tedious bookkeeping. If the user has specified both
maxmem= and vnuma, we can check in xl that maxmem= equals the sum of the
vnuma node sizes.
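
Something like this in xl (sketch only; names approximate and untested):

/* Sum the vnode sizes, then derive or cross-check maxmem=. */
uint64_t vnuma_memkb = 0;

for (i = 0; i < b_info->num_vnuma_nodes; i++)
    vnuma_memkb += b_info->vnuma_nodes[i].memkb;

if (b_info->max_memkb == LIBXL_MEMKB_DEFAULT)      /* maxmem= not given */
    b_info->max_memkb = vnuma_memkb;
else if (b_info->max_memkb != vnuma_memkb) {
    fprintf(stderr, "xl: maxmem= must equal the sum of the vnuma node sizes\n");
    exit(1);
}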

Wei.

> The code looks fine.
> 


* Re: [PATCH v6 02/23] xen: move NUMA_NO_NODE to public memory.h as XEN_NUMA_NO_NODE
  2015-03-02 16:27                   ` Jan Beulich
@ 2015-03-02 16:39                     ` Wei Liu
  2015-03-02 16:50                       ` Jan Beulich
  0 siblings, 1 reply; 81+ messages in thread
From: Wei Liu @ 2015-03-02 16:39 UTC (permalink / raw)
  To: Jan Beulich; +Cc: andrew.cooper3, Wei Liu, Ian Campbell, xen-devel

On Mon, Mar 02, 2015 at 04:27:25PM +0000, Jan Beulich wrote:
> >>> On 02.03.15 at 17:08, <wei.liu2@citrix.com> wrote:
> > On Mon, Mar 02, 2015 at 03:51:37PM +0000, Jan Beulich wrote:
> >> >>> On 02.03.15 at 16:38, <wei.liu2@citrix.com> wrote:
> >> > On Mon, Mar 02, 2015 at 03:30:21PM +0000, Ian Campbell wrote:
> >> >> On Mon, 2015-03-02 at 07:04 +0000, Jan Beulich wrote:
> >> >> > >>> Andrew Cooper <andrew.cooper3@citrix.com> 02/27/15 5:58 PM >>>
> >> >> > >On 27/02/15 16:51, Wei Liu wrote:
> >> >> > >> During last round review, Andrew wanted me to move this to Xen public
> >> >> > >> header to avoid reinventing it in libxc. Now this value is used in libxc
> >> >> > >> patch.
> >> >> > >>
> >> >> > >> But I don't particularly mind whether we move it or not, it's up to you
> >> >> > >> maintainers to decide.
> >> >> > >
> >> >> > >It is a sentinel value used in the public ABI.  It should therefore
> >> >> > >appear in the public API.
> >> >> > 
> >> >> > Which it already does, as XENMEMF_get_node(0). I don't think it needs
> >> >> > particular naming as a new constant, even more that it isn't intended to
> >> >> > be used explicitly in any of the memops.
> >> >> 
> >> >> IMHO the named constant does seem to make the tools code at least more
> >> >> readable, but without Wei having said where this is to be used I'm not
> >> >> sure where it should live. In particular I'm unsure if/how/where this
> >> >> value gets passed to a hypercall, as opposed to perhaps being used as a
> >> > 
> >> > This is used to fill in vnode_to_pnode array. That array get
> >> > subsequently passed down to hypervisor.
> >> 
> >> Do we really accept NUMA_NO_NODE to be passed that way?
> >> 
> > 
> > public/domctl.h:struct xen_domctl_vnuma has vnode_to_pnode array.
> 
> That wasn't my concern - I was rather wondering why we would
> accept any of this array's fields to be set to "no node".
> 

If you want to have a NUMA topology exposed to the guest but don't care
about the underlying memory affinity?

My code as is doesn't support this kind of setting though.

Wei.


> Jan


* Re: [PATCH v6 04/23] libxc: duplicate snippet to allocate p2m_host array
  2015-03-02 16:18       ` Ian Campbell
@ 2015-03-02 16:45         ` Konrad Rzeszutek Wilk
  0 siblings, 0 replies; 81+ messages in thread
From: Konrad Rzeszutek Wilk @ 2015-03-02 16:45 UTC (permalink / raw)
  To: Ian Campbell; +Cc: Ian Jackson, Wei Liu, xen-devel

On Mon, Mar 02, 2015 at 04:18:08PM +0000, Ian Campbell wrote:
> On Mon, 2015-03-02 at 15:33 +0000, Wei Liu wrote:
> > On Mon, Mar 02, 2015 at 03:26:58PM +0000, Ian Campbell wrote:
> > > On Thu, 2015-02-26 at 15:55 +0000, Wei Liu wrote:
> > > > Currently all in tree code doesn't set the superpage flag, but Konrad
> > > > wants it retained for the moment.
> > > 
> > > ... because? I can't seem to see any relevant mails from Konrad in my
> > > vnuma patch folder.
> > > 
> > 
> > That was quite some time ago (last year) I asked him about this and he
> > said oracle is still using it.
> 
> and will still be using it when they upgrade to 4.6?

Sadly yes.
> 
> > I would remove superpage support if I can, but since Konrad said they
> > are using it I will just leave it alone.
> 
> Please can you explain this in the commit log then.
> 
> 


* Re: [PATCH v6 04/23] libxc: duplicate snippet to allocate p2m_host array
  2015-03-02 15:26   ` Ian Campbell
  2015-03-02 15:33     ` Wei Liu
@ 2015-03-02 16:46     ` Konrad Rzeszutek Wilk
  1 sibling, 0 replies; 81+ messages in thread
From: Konrad Rzeszutek Wilk @ 2015-03-02 16:46 UTC (permalink / raw)
  To: Ian Campbell; +Cc: Ian Jackson, Wei Liu, xen-devel

> > @@ -772,15 +772,16 @@ int arch_setup_meminit(struct xc_dom_image *dom)
> >              return rc;
> >      }
> >  
> > -    dom->p2m_host = xc_dom_malloc(dom, sizeof(xen_pfn_t) * dom->total_pages);
> > -    if ( dom->p2m_host == NULL )
> > -        return -EINVAL;
> > -
> >      if ( dom->superpages )
> >      {
> >          int count = dom->total_pages >> SUPERPAGE_PFN_SHIFT;
> >          xen_pfn_t extents[count];
> >  
> > +        dom->p2m_host = xc_dom_malloc(dom, sizeof(xen_pfn_t) *
> > +                                      dom->total_pages);

'count' pls.

> > +        if ( dom->p2m_host == NULL )
> > +            return -EINVAL;
> > +
> >          DOMPRINTF("Populating memory with %d superpages", count);
> >          for ( pfn = 0; pfn < count; pfn++ )
> >              extents[pfn] = pfn << SUPERPAGE_PFN_SHIFT;
> > @@ -809,9 +810,13 @@ int arch_setup_meminit(struct xc_dom_image *dom)
> >                  return rc;
> >          }
> >          /* setup initial p2m */
> > +        dom->p2m_host = xc_dom_malloc(dom, sizeof(xen_pfn_t) *
> > +                                      dom->total_pages);
> > +        if ( dom->p2m_host == NULL )
> > +            return -EINVAL;
> >          for ( pfn = 0; pfn < dom->total_pages; pfn++ )
> >              dom->p2m_host[pfn] = pfn;
> > -        
> > +

??
> >          /* allocate guest memory */
> >          for ( i = rc = allocsz = 0;
> >                (i < dom->total_pages) && !rc;
> 
> 


* Re: [PATCH v6 02/23] xen: move NUMA_NO_NODE to public memory.h as XEN_NUMA_NO_NODE
  2015-03-02 16:39                     ` Wei Liu
@ 2015-03-02 16:50                       ` Jan Beulich
  2015-03-02 17:00                         ` Wei Liu
  2015-03-02 17:01                         ` Andrew Cooper
  0 siblings, 2 replies; 81+ messages in thread
From: Jan Beulich @ 2015-03-02 16:50 UTC (permalink / raw)
  To: Wei Liu; +Cc: andrew.cooper3, Ian Campbell, xen-devel

>>> On 02.03.15 at 17:39, <wei.liu2@citrix.com> wrote:
> On Mon, Mar 02, 2015 at 04:27:25PM +0000, Jan Beulich wrote:
>> >>> On 02.03.15 at 17:08, <wei.liu2@citrix.com> wrote:
>> > On Mon, Mar 02, 2015 at 03:51:37PM +0000, Jan Beulich wrote:
>> >> >>> On 02.03.15 at 16:38, <wei.liu2@citrix.com> wrote:
>> >> > On Mon, Mar 02, 2015 at 03:30:21PM +0000, Ian Campbell wrote:
>> >> >> On Mon, 2015-03-02 at 07:04 +0000, Jan Beulich wrote:
>> >> >> > >>> Andrew Cooper <andrew.cooper3@citrix.com> 02/27/15 5:58 PM >>>
>> >> >> > >On 27/02/15 16:51, Wei Liu wrote:
>> >> >> > >> During last round review, Andrew wanted me to move this to Xen public
>> >> >> > >> header to avoid reinventing it in libxc. Now this value is used in libxc
>> >> >> > >> patch.
>> >> >> > >>
>> >> >> > >> But I don't particularly mind whether we move it or not, it's up to you
>> >> >> > >> maintainers to decide.
>> >> >> > >
>> >> >> > >It is a sentinel value used in the public ABI.  It should therefore
>> >> >> > >appear in the public API.
>> >> >> > 
>> >> >> > Which it already does, as XENMEMF_get_node(0). I don't think it needs
>> >> >> > particular naming as a new constant, even more that it isn't intended to
>> >> >> > be used explicitly in any of the memops.
>> >> >> 
>> >> >> IMHO the named constant does seem to make the tools code at least more
>> >> >> readable, but without Wei having said where this is to be used I'm not
>> >> >> sure where it should live. In particular I'm unsure if/how/where this
>> >> >> value gets passed to a hypercall, as opposed to perhaps being used as a
>> >> > 
>> >> > This is used to fill in vnode_to_pnode array. That array get
>> >> > subsequently passed down to hypervisor.
>> >> 
>> >> Do we really accept NUMA_NO_NODE to be passed that way?
>> >> 
>> > 
>> > public/domctl.h:struct xen_domctl_vnuma has vnode_to_pnode array.
>> 
>> That wasn't my concern - I was rather wondering why we would
>> accept any of this array's fields to be set to "no node".
>> 
> 
> If you want to have numa topology exposed to guest but doesn't care
> about underly memory affinity?

Is this useful for anything in reality? I.e. why would you want to
tell the guest it's NUMA when it really isn't? The only case I could
see is testing vNUMA code changes without having a NUMA box
around, but that hardly counts as a real use.

Furthermore iirc that array is an array of uint32_t, and the
sentinel (if any) there ought to be 0xffffffff irrespective of what
we use internally in the hypervisor.

Jan

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH v6 02/23] xen: move NUMA_NO_NODE to public memory.h as XEN_NUMA_NO_NODE
  2015-03-02 16:50                       ` Jan Beulich
@ 2015-03-02 17:00                         ` Wei Liu
  2015-03-03  7:44                           ` Jan Beulich
  2015-03-02 17:01                         ` Andrew Cooper
  1 sibling, 1 reply; 81+ messages in thread
From: Wei Liu @ 2015-03-02 17:00 UTC (permalink / raw)
  To: Jan Beulich; +Cc: andrew.cooper3, Wei Liu, Ian Campbell, xen-devel

On Mon, Mar 02, 2015 at 04:50:48PM +0000, Jan Beulich wrote:
> >>> On 02.03.15 at 17:39, <wei.liu2@citrix.com> wrote:
> > On Mon, Mar 02, 2015 at 04:27:25PM +0000, Jan Beulich wrote:
> >> >>> On 02.03.15 at 17:08, <wei.liu2@citrix.com> wrote:
> >> > On Mon, Mar 02, 2015 at 03:51:37PM +0000, Jan Beulich wrote:
> >> >> >>> On 02.03.15 at 16:38, <wei.liu2@citrix.com> wrote:
> >> >> > On Mon, Mar 02, 2015 at 03:30:21PM +0000, Ian Campbell wrote:
> >> >> >> On Mon, 2015-03-02 at 07:04 +0000, Jan Beulich wrote:
> >> >> >> > >>> Andrew Cooper <andrew.cooper3@citrix.com> 02/27/15 5:58 PM >>>
> >> >> >> > >On 27/02/15 16:51, Wei Liu wrote:
> >> >> >> > >> During last round review, Andrew wanted me to move this to Xen public
> >> >> >> > >> header to avoid reinventing it in libxc. Now this value is used in libxc
> >> >> >> > >> patch.
> >> >> >> > >>
> >> >> >> > >> But I don't particularly mind whether we move it or not, it's up to you
> >> >> >> > >> maintainers to decide.
> >> >> >> > >
> >> >> >> > >It is a sentinel value used in the public ABI.  It should therefore
> >> >> >> > >appear in the public API.
> >> >> >> > 
> >> >> >> > Which it already does, as XENMEMF_get_node(0). I don't think it needs
> >> >> >> > particular naming as a new constant, even more that it isn't intended to
> >> >> >> > be used explicitly in any of the memops.
> >> >> >> 
> >> >> >> IMHO the named constant does seem to make the tools code at least more
> >> >> >> readable, but without Wei having said where this is to be used I'm not
> >> >> >> sure where it should live. In particular I'm unsure if/how/where this
> >> >> >> value gets passed to a hypercall, as opposed to perhaps being used as a
> >> >> > 
> >> >> > This is used to fill in vnode_to_pnode array. That array get
> >> >> > subsequently passed down to hypervisor.
> >> >> 
> >> >> Do we really accept NUMA_NO_NODE to be passed that way?
> >> >> 
> >> > 
> >> > public/domctl.h:struct xen_domctl_vnuma has vnode_to_pnode array.
> >> 
> >> That wasn't my concern - I was rather wondering why we would
> >> accept any of this array's fields to be set to "no node".
> >> 
> > 
> > If you want to have numa topology exposed to guest but doesn't care
> > about underly memory affinity?
> 
> Is this useful for anything in reality? I.e. why would you want to
> tell the guest it's NUMA when it really isn't? The only case I could
> see is testing vNUMA code changes without having a NUMA box
> around, but that hardly counts as a real use.
> 

Dario has the idea that vNUMA is not only about performance. AIUI he
thinks it's perfectly OK to have that kind of use case.

> Furthermore iirc that array is an array of uint32_t, and the
> sentinel (if any) there ought to be 0xffffffff irrespective of what
> we use internally in the hypervisor.
> 

There are two issues here irrespective of this series:

1. should we expose that sentinel in the ABI?
2. if so, what should the sentinel be?

I think Andrew and you disagree on the first one. We can work out the
answer to the second question later.

Currently my code doesn't really pass that down to the hypervisor (that is,
it requires a valid pnode to map a vnode to), so I'm fine with either
exposing or not exposing it.

Wei.

> Jan
> 
> 

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH v6 02/23] xen: move NUMA_NO_NODE to public memory.h as XEN_NUMA_NO_NODE
  2015-03-02 16:50                       ` Jan Beulich
  2015-03-02 17:00                         ` Wei Liu
@ 2015-03-02 17:01                         ` Andrew Cooper
  2015-03-02 17:26                           ` Jan Beulich
  1 sibling, 1 reply; 81+ messages in thread
From: Andrew Cooper @ 2015-03-02 17:01 UTC (permalink / raw)
  To: Jan Beulich, Wei Liu; +Cc: Ian Campbell, xen-devel

On 02/03/15 16:50, Jan Beulich wrote:
>>>> On 02.03.15 at 17:39, <wei.liu2@citrix.com> wrote:
>> On Mon, Mar 02, 2015 at 04:27:25PM +0000, Jan Beulich wrote:
>>>>>> On 02.03.15 at 17:08, <wei.liu2@citrix.com> wrote:
>>>> On Mon, Mar 02, 2015 at 03:51:37PM +0000, Jan Beulich wrote:
>>>>>>>> On 02.03.15 at 16:38, <wei.liu2@citrix.com> wrote:
>>>>>> On Mon, Mar 02, 2015 at 03:30:21PM +0000, Ian Campbell wrote:
>>>>>>> On Mon, 2015-03-02 at 07:04 +0000, Jan Beulich wrote:
>>>>>>>>>>> Andrew Cooper <andrew.cooper3@citrix.com> 02/27/15 5:58 PM >>>
>>>>>>>>> On 27/02/15 16:51, Wei Liu wrote:
>>>>>>>>>> During last round review, Andrew wanted me to move this to Xen public
>>>>>>>>>> header to avoid reinventing it in libxc. Now this value is used in libxc
>>>>>>>>>> patch.
>>>>>>>>>>
>>>>>>>>>> But I don't particularly mind whether we move it or not, it's up to you
>>>>>>>>>> maintainers to decide.
>>>>>>>>> It is a sentinel value used in the public ABI.  It should therefore
>>>>>>>>> appear in the public API.
>>>>>>>> Which it already does, as XENMEMF_get_node(0). I don't think it needs
>>>>>>>> particular naming as a new constant, even more that it isn't intended to
>>>>>>>> be used explicitly in any of the memops.
>>>>>>> IMHO the named constant does seem to make the tools code at least more
>>>>>>> readable, but without Wei having said where this is to be used I'm not
>>>>>>> sure where it should live. In particular I'm unsure if/how/where this
>>>>>>> value gets passed to a hypercall, as opposed to perhaps being used as a
>>>>>> This is used to fill in vnode_to_pnode array. That array get
>>>>>> subsequently passed down to hypervisor.
>>>>> Do we really accept NUMA_NO_NODE to be passed that way?
>>>>>
>>>> public/domctl.h:struct xen_domctl_vnuma has vnode_to_pnode array.
>>> That wasn't my concern - I was rather wondering why we would
>>> accept any of this array's fields to be set to "no node".
>>>
>> If you want to have numa topology exposed to guest but doesn't care
>> about underly memory affinity?
> Is this useful for anything in reality?

Yes.  If there is insufficient memory on real numa nodes, the memory
could be striped and the guest told that its memory really is scattered
all over.

~Andrew

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH v6 02/23] xen: move NUMA_NO_NODE to public memory.h as XEN_NUMA_NO_NODE
  2015-03-02 17:01                         ` Andrew Cooper
@ 2015-03-02 17:26                           ` Jan Beulich
  2015-03-02 17:34                             ` David Vrabel
  0 siblings, 1 reply; 81+ messages in thread
From: Jan Beulich @ 2015-03-02 17:26 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Wei Liu, Ian Campbell, xen-devel

>>> On 02.03.15 at 18:01, <andrew.cooper3@citrix.com> wrote:
> On 02/03/15 16:50, Jan Beulich wrote:
>>>>> On 02.03.15 at 17:39, <wei.liu2@citrix.com> wrote:
>>> On Mon, Mar 02, 2015 at 04:27:25PM +0000, Jan Beulich wrote:
>>>>>>> On 02.03.15 at 17:08, <wei.liu2@citrix.com> wrote:
>>>>> On Mon, Mar 02, 2015 at 03:51:37PM +0000, Jan Beulich wrote:
>>>>>>>>> On 02.03.15 at 16:38, <wei.liu2@citrix.com> wrote:
>>>>>>> On Mon, Mar 02, 2015 at 03:30:21PM +0000, Ian Campbell wrote:
>>>>>>>> On Mon, 2015-03-02 at 07:04 +0000, Jan Beulich wrote:
>>>>>>>>>>>> Andrew Cooper <andrew.cooper3@citrix.com> 02/27/15 5:58 PM >>>
>>>>>>>>>> On 27/02/15 16:51, Wei Liu wrote:
>>>>>>>>>>> During last round review, Andrew wanted me to move this to Xen public
>>>>>>>>>>> header to avoid reinventing it in libxc. Now this value is used in libxc
>>>>>>>>>>> patch.
>>>>>>>>>>>
>>>>>>>>>>> But I don't particularly mind whether we move it or not, it's up to you
>>>>>>>>>>> maintainers to decide.
>>>>>>>>>> It is a sentinel value used in the public ABI.  It should therefore
>>>>>>>>>> appear in the public API.
>>>>>>>>> Which it already does, as XENMEMF_get_node(0). I don't think it needs
>>>>>>>>> particular naming as a new constant, even more that it isn't intended to
>>>>>>>>> be used explicitly in any of the memops.
>>>>>>>> IMHO the named constant does seem to make the tools code at least more
>>>>>>>> readable, but without Wei having said where this is to be used I'm not
>>>>>>>> sure where it should live. In particular I'm unsure if/how/where this
>>>>>>>> value gets passed to a hypercall, as opposed to perhaps being used as a
>>>>>>> This is used to fill in vnode_to_pnode array. That array get
>>>>>>> subsequently passed down to hypervisor.
>>>>>> Do we really accept NUMA_NO_NODE to be passed that way?
>>>>>>
>>>>> public/domctl.h:struct xen_domctl_vnuma has vnode_to_pnode array.
>>>> That wasn't my concern - I was rather wondering why we would
>>>> accept any of this array's fields to be set to "no node".
>>>>
>>> If you want to have numa topology exposed to guest but doesn't care
>>> about underly memory affinity?
>> Is this useful for anything in reality?
> 
> Yes.  If there is insufficient memory on real numa nodes, the memory
> could be striped and the guest told that its memory really is scattered
> all over.

In which case it could as well not be handed any vNUMA info at all.

Jan

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH v6 02/23] xen: move NUMA_NO_NODE to public memory.h as XEN_NUMA_NO_NODE
  2015-03-02 17:26                           ` Jan Beulich
@ 2015-03-02 17:34                             ` David Vrabel
  2015-03-02 17:43                               ` Andrew Cooper
  2015-03-03  7:36                               ` Jan Beulich
  0 siblings, 2 replies; 81+ messages in thread
From: David Vrabel @ 2015-03-02 17:34 UTC (permalink / raw)
  To: Jan Beulich, Andrew Cooper; +Cc: Wei Liu, Ian Campbell, xen-devel

On 02/03/15 17:26, Jan Beulich wrote:
>>>> On 02.03.15 at 18:01, <andrew.cooper3@citrix.com> wrote:
>> On 02/03/15 16:50, Jan Beulich wrote:
>>>>>> On 02.03.15 at 17:39, <wei.liu2@citrix.com> wrote:
>>>> On Mon, Mar 02, 2015 at 04:27:25PM +0000, Jan Beulich wrote:
>>>>>>>> On 02.03.15 at 17:08, <wei.liu2@citrix.com> wrote:
>>>>>> On Mon, Mar 02, 2015 at 03:51:37PM +0000, Jan Beulich wrote:
>>>>>>>>>> On 02.03.15 at 16:38, <wei.liu2@citrix.com> wrote:
>>>>>>>> On Mon, Mar 02, 2015 at 03:30:21PM +0000, Ian Campbell wrote:
>>>>>>>>> On Mon, 2015-03-02 at 07:04 +0000, Jan Beulich wrote:
>>>>>>>>>>>>> Andrew Cooper <andrew.cooper3@citrix.com> 02/27/15 5:58 PM >>>
>>>>>>>>>>> On 27/02/15 16:51, Wei Liu wrote:
>>>>>>>>>>>> During last round review, Andrew wanted me to move this to Xen public
>>>>>>>>>>>> header to avoid reinventing it in libxc. Now this value is used in libxc
>>>>>>>>>>>> patch.
>>>>>>>>>>>>
>>>>>>>>>>>> But I don't particularly mind whether we move it or not, it's up to you
>>>>>>>>>>>> maintainers to decide.
>>>>>>>>>>> It is a sentinel value used in the public ABI.  It should therefore
>>>>>>>>>>> appear in the public API.
>>>>>>>>>> Which it already does, as XENMEMF_get_node(0). I don't think it needs
>>>>>>>>>> particular naming as a new constant, even more that it isn't intended to
>>>>>>>>>> be used explicitly in any of the memops.
>>>>>>>>> IMHO the named constant does seem to make the tools code at least more
>>>>>>>>> readable, but without Wei having said where this is to be used I'm not
>>>>>>>>> sure where it should live. In particular I'm unsure if/how/where this
>>>>>>>>> value gets passed to a hypercall, as opposed to perhaps being used as a
>>>>>>>> This is used to fill in vnode_to_pnode array. That array get
>>>>>>>> subsequently passed down to hypervisor.
>>>>>>> Do we really accept NUMA_NO_NODE to be passed that way?
>>>>>>>
>>>>>> public/domctl.h:struct xen_domctl_vnuma has vnode_to_pnode array.
>>>>> That wasn't my concern - I was rather wondering why we would
>>>>> accept any of this array's fields to be set to "no node".
>>>>>
>>>> If you want to have numa topology exposed to guest but doesn't care
>>>> about underly memory affinity?
>>> Is this useful for anything in reality?
>>
>> Yes.  If there is insufficient memory on real numa nodes, the memory
>> could be striped and the guest told that its memory really is scattered
>> all over.
> 
> I which case it could as well not be handed any vNUMA info at all.

A guest that previously had 2 vNUMA nodes is migrated to a host with
only 1 pNUMA node.  It should still have 2 vNUMA nodes.

David

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH v6 02/23] xen: move NUMA_NO_NODE to public memory.h as XEN_NUMA_NO_NODE
  2015-03-02 17:34                             ` David Vrabel
@ 2015-03-02 17:43                               ` Andrew Cooper
  2015-03-02 17:48                                 ` David Vrabel
  2015-03-03  7:36                               ` Jan Beulich
  1 sibling, 1 reply; 81+ messages in thread
From: Andrew Cooper @ 2015-03-02 17:43 UTC (permalink / raw)
  To: David Vrabel, Jan Beulich; +Cc: Wei Liu, Ian Campbell, xen-devel

On 02/03/15 17:34, David Vrabel wrote:
> On 02/03/15 17:26, Jan Beulich wrote:
>>>>> On 02.03.15 at 18:01, <andrew.cooper3@citrix.com> wrote:
>>> On 02/03/15 16:50, Jan Beulich wrote:
>>>>>>> On 02.03.15 at 17:39, <wei.liu2@citrix.com> wrote:
>>>>> On Mon, Mar 02, 2015 at 04:27:25PM +0000, Jan Beulich wrote:
>>>>>>>>> On 02.03.15 at 17:08, <wei.liu2@citrix.com> wrote:
>>>>>>> On Mon, Mar 02, 2015 at 03:51:37PM +0000, Jan Beulich wrote:
>>>>>>>>>>> On 02.03.15 at 16:38, <wei.liu2@citrix.com> wrote:
>>>>>>>>> On Mon, Mar 02, 2015 at 03:30:21PM +0000, Ian Campbell wrote:
>>>>>>>>>> On Mon, 2015-03-02 at 07:04 +0000, Jan Beulich wrote:
>>>>>>>>>>>>>> Andrew Cooper <andrew.cooper3@citrix.com> 02/27/15 5:58 PM >>>
>>>>>>>>>>>> On 27/02/15 16:51, Wei Liu wrote:
>>>>>>>>>>>>> During last round review, Andrew wanted me to move this to Xen public
>>>>>>>>>>>>> header to avoid reinventing it in libxc. Now this value is used in libxc
>>>>>>>>>>>>> patch.
>>>>>>>>>>>>>
>>>>>>>>>>>>> But I don't particularly mind whether we move it or not, it's up to you
>>>>>>>>>>>>> maintainers to decide.
>>>>>>>>>>>> It is a sentinel value used in the public ABI.  It should therefore
>>>>>>>>>>>> appear in the public API.
>>>>>>>>>>> Which it already does, as XENMEMF_get_node(0). I don't think it needs
>>>>>>>>>>> particular naming as a new constant, even more that it isn't intended to
>>>>>>>>>>> be used explicitly in any of the memops.
>>>>>>>>>> IMHO the named constant does seem to make the tools code at least more
>>>>>>>>>> readable, but without Wei having said where this is to be used I'm not
>>>>>>>>>> sure where it should live. In particular I'm unsure if/how/where this
>>>>>>>>>> value gets passed to a hypercall, as opposed to perhaps being used as a
>>>>>>>>> This is used to fill in vnode_to_pnode array. That array get
>>>>>>>>> subsequently passed down to hypervisor.
>>>>>>>> Do we really accept NUMA_NO_NODE to be passed that way?
>>>>>>>>
>>>>>>> public/domctl.h:struct xen_domctl_vnuma has vnode_to_pnode array.
>>>>>> That wasn't my concern - I was rather wondering why we would
>>>>>> accept any of this array's fields to be set to "no node".
>>>>>>
>>>>> If you want to have numa topology exposed to guest but doesn't care
>>>>> about underly memory affinity?
>>>> Is this useful for anything in reality?
>>> Yes.  If there is insufficient memory on real numa nodes, the memory
>>> could be striped and the guest told that its memory really is scattered
>>> all over.
>> I which case it could as well not be handed any vNUMA info at all.
> A guest that previously had 2 vNUMA nodes is migrated to a host with
> only 1 pNUMA node.  It should still have 2 vNUMA nodes.

A natural consequence of vNUMA is that the guest must expect the vNUMA
layout to change across suspend/resume.  The toolstack cannot guarantee
that it can construct a similar vNUMA layout after a migration.  This
includes the toolstack indicating that it was unable to make any useful
NUMA affinity with the memory ranges.

~Andrew

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH v6 02/23] xen: move NUMA_NO_NODE to public memory.h as XEN_NUMA_NO_NODE
  2015-03-02 17:43                               ` Andrew Cooper
@ 2015-03-02 17:48                                 ` David Vrabel
  2015-03-02 17:52                                   ` Ian Campbell
  0 siblings, 1 reply; 81+ messages in thread
From: David Vrabel @ 2015-03-02 17:48 UTC (permalink / raw)
  To: Andrew Cooper, Jan Beulich; +Cc: Wei Liu, Ian Campbell, xen-devel

On 02/03/15 17:43, Andrew Cooper wrote:
> On 02/03/15 17:34, David Vrabel wrote:
>> 
>> A guest that previously had 2 vNUMA nodes is migrated to a host with
>> only 1 pNUMA node.  It should still have 2 vNUMA nodes.
> 
> A natural consequence of vNUMA is that the guest must expect the vNUMA
> layout to change across suspend/resume.  The toolstack cannot guarentee
> that it can construct a similar vNUMA layout after a migration.  This
> includes the toolstack indicating that it was unable to make any useful
> NUMA affinity with the memory ranges.

Eep!  I very much doubt we can do anything in Linux except retain the
existing NUMA layout across a save/restore.

David

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH v6 11/23] libxl: functions to build vmemranges for PV guest
  2015-03-02 15:41   ` Ian Campbell
@ 2015-03-02 17:52     ` Wei Liu
  0 siblings, 0 replies; 81+ messages in thread
From: Wei Liu @ 2015-03-02 17:52 UTC (permalink / raw)
  To: Ian Campbell
  Cc: Ian Jackson, Dario Faggioli, Wei Liu, Elena Ufimtseva, xen-devel

On Mon, Mar 02, 2015 at 03:41:21PM +0000, Ian Campbell wrote:
> On Thu, 2015-02-26 at 15:55 +0000, Wei Liu wrote:
> > Introduce a arch-independent routine to generate one vmemrange per
> > vnode. Also introduce arch-dependent routines for different
> > architectures because part of the process is arch-specific -- ARM has
> > yet have NUMA support and E820 is x86 only.
> > 
> > For those x86 guests who care about machine E820 map (i.e. with
> > e820_host=1), vnode is further split into several vmemranges to
> > accommodate memory holes.  A few stubs for libxl_arm.c are created.
> > 
> > Signed-off-by: Wei Liu <wei.liu2@citrix.com>
> > Reviewed-by: Dario Faggioli <dario.faggioli@citrix.com>
> 
> Acked-by: Ian Campbell <ian.campbell@citrix.com>
> 

Thanks.

> Although one comment:
> 
> > +    e820_count = 0;
> > +    nr_vmemrange = 0;
> > +    vmemranges = NULL;
> > +    for (nid = 0; nid < b_info->num_vnuma_nodes; nid++) {
> > +        libxl_vnode_info *p = &b_info->vnuma_nodes[nid];
> > +        uint64_t remaining_bytes = (p->memkb << 10), bytes;
> > +
> > +        while (remaining_bytes > 0) {
> > +            if (e820_count >= nr_e820) {
> > +                rc = ERROR_NOMEM;
> > +                goto out;
> > +            }
> > +
> > +            /* Skip non RAM region */
> > +            if (map[e820_count].type != E820_RAM) {
> > +                e820_count++;
> > +                continue;
> > +            }
> > +
> > +            GCREALLOC_ARRAY(vmemranges, nr_vmemrange+1);
> 
> Once we've hit the limit this is going to reallocate every time (and the
> limit starts as zero, so that's every time). Which means an aweful lot
> of reallocing every time.
> 
> Perhaps start with some non-zero size and increase by a non-linear
> amount as needed? That would require tracking the array size separate
> from nr_vmemrange of course.
> 

I've made some changes to have it increased by 32 elements every time.
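
Roughly along these lines (a standalone sketch with placeholder names; the
real code uses libxl's gc allocations and GCREALLOC_ARRAY rather than plain
realloc):

#include <stdlib.h>

/* Placeholder element type; the real array holds xen_vmemrange_t. */
struct range { unsigned long start, end; unsigned int flags, nid; };

#define CHUNK 32   /* grow in blocks of 32 to avoid a realloc per element */

static int ensure_space(struct range **arr, unsigned int *allocated,
                        unsigned int needed)
{
    struct range *tmp;
    unsigned int n;

    if (needed <= *allocated)
        return 0;

    /* Round the allocation up to the next multiple of CHUNK. */
    n = (needed + CHUNK - 1) / CHUNK * CHUNK;
    tmp = realloc(*arr, n * sizeof(*tmp));
    if (!tmp)
        return -1;          /* caller maps this to ERROR_NOMEM */

    *arr = tmp;
    *allocated = n;
    return 0;
}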

Wei.

> Ian.

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH v6 02/23] xen: move NUMA_NO_NODE to public memory.h as XEN_NUMA_NO_NODE
  2015-03-02 17:48                                 ` David Vrabel
@ 2015-03-02 17:52                                   ` Ian Campbell
  2015-03-02 18:19                                     ` Andrew Cooper
  0 siblings, 1 reply; 81+ messages in thread
From: Ian Campbell @ 2015-03-02 17:52 UTC (permalink / raw)
  To: David Vrabel; +Cc: Andrew Cooper, Wei Liu, Jan Beulich, xen-devel

On Mon, 2015-03-02 at 17:48 +0000, David Vrabel wrote:
> On 02/03/15 17:43, Andrew Cooper wrote:
> > On 02/03/15 17:34, David Vrabel wrote:
> >> 
> >> A guest that previously had 2 vNUMA nodes is migrated to a host with
> >> only 1 pNUMA node.  It should still have 2 vNUMA nodes.
> > 
> > A natural consequence of vNUMA is that the guest must expect the vNUMA
> > layout to change across suspend/resume.  The toolstack cannot guarentee
> > that it can construct a similar vNUMA layout after a migration.  This
> > includes the toolstack indicating that it was unable to make any useful
> > NUMA affinity with the memory ranges.
> 
> Eep!  I very much doubt we can do anything in Linux except retain the
> existing NUMA layout across a save/restore.

In the case you mention above I would expect the 2 vnuma nodes to just
point to the same single pnuma node.

As such I think it's probably not relevant to the need for
XEN_NUMA_NO_NODE?

Or is that not what would be expected?

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH v6 02/23] xen: move NUMA_NO_NODE to public memory.h as XEN_NUMA_NO_NODE
  2015-03-02 17:52                                   ` Ian Campbell
@ 2015-03-02 18:19                                     ` Andrew Cooper
  2015-03-03  3:42                                       ` Dario Faggioli
  2015-03-03  7:56                                       ` Jan Beulich
  0 siblings, 2 replies; 81+ messages in thread
From: Andrew Cooper @ 2015-03-02 18:19 UTC (permalink / raw)
  To: Ian Campbell, David Vrabel; +Cc: Wei Liu, Jan Beulich, xen-devel

On 02/03/15 17:52, Ian Campbell wrote:
> On Mon, 2015-03-02 at 17:48 +0000, David Vrabel wrote:
>> On 02/03/15 17:43, Andrew Cooper wrote:
>>> On 02/03/15 17:34, David Vrabel wrote:
>>>> A guest that previously had 2 vNUMA nodes is migrated to a host with
>>>> only 1 pNUMA node.  It should still have 2 vNUMA nodes.
>>> A natural consequence of vNUMA is that the guest must expect the vNUMA
>>> layout to change across suspend/resume.  The toolstack cannot guarentee
>>> that it can construct a similar vNUMA layout after a migration.  This
>>> includes the toolstack indicating that it was unable to make any useful
>>> NUMA affinity with the memory ranges.
>> Eep!  I very much doubt we can do anything in Linux except retain the
>> existing NUMA layout across a save/restore.
> In the case you mention above I would expect the 2 vnuma nodes to just
> point to the same single pnuma node.
>
> As such I think it's probably not relevant to the need for
> XEN_NO_NUMA_NODE?
>
> Or is that not would be expected?

If we were to go down that route, the toolstack would need a way of
signalling "this vNUMA node does not contain memory on a single pNUMA
node" if there was insufficient free space to make the allocation.

In this case, a pnode of XEN_NUMA_NO_NODE seems like precisely the
correct value to report.

~Andrew

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH v6 02/23] xen: move NUMA_NO_NODE to public memory.h as XEN_NUMA_NO_NODE
  2015-03-02 18:19                                     ` Andrew Cooper
@ 2015-03-03  3:42                                       ` Dario Faggioli
  2015-03-03  8:55                                         ` Jan Beulich
  2015-03-03  7:56                                       ` Jan Beulich
  1 sibling, 1 reply; 81+ messages in thread
From: Dario Faggioli @ 2015-03-03  3:42 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: xen-devel, Wei Liu, Ian Campbell, Jan Beulich, David Vrabel


On Mon, 2015-03-02 at 18:19 +0000, Andrew Cooper wrote:
> On 02/03/15 17:52, Ian Campbell wrote:
> >> On 02/03/15 17:43, Andrew Cooper wrote:
> >>> On 02/03/15 17:34, David Vrabel wrote:
> >>>> A guest that previously had 2 vNUMA nodes is migrated to a host with
> >>>> only 1 pNUMA node.  It should still have 2 vNUMA nodes.
> >>> A natural consequence of vNUMA is that the guest must expect the vNUMA
> >>> layout to change across suspend/resume.  The toolstack cannot guarentee
> >>> that it can construct a similar vNUMA layout after a migration.  This
> >>> includes the toolstack indicating that it was unable to make any useful
> >>> NUMA affinity with the memory ranges.

> > In the case you mention above I would expect the 2 vnuma nodes to just
> > point to the same single pnuma node.
> >
Right, that's still doable. But what if we have 4 nodes but none can,
alone, accommodate all the memory of the guest's vnodes? (e.g., 1GB free
on each pnode, guest with 2 vnodes, each 1.2GB wide.)

The point being that it would be quite complicated to have to deal with
all the possible variations of such situations in the toolstack, especially
considering that Xen smoothly handles that already (at the cost of
performance, of course).

> > As such I think it's probably not relevant to the need for
> > XEN_NO_NUMA_NODE?
> >
> > Or is that not would be expected?
> 
> If we were to go down that route, the toolstack would need a way of
> signalling "this vNUMA node does not contain memory on a single pNUMA
> node" if there was insufficient free space to make the allocation.
> 
Exactly.

BTW, about the use cases: wanting to test vNUMA without NUMA hardware,
as Jan said. Also, wanting to test NUMA support in the guest OS, or in
an application inside the guest, without having NUMA hardware.

But much more important are the situations that Andrew and David
described.

> In this case, a pnode of XEN_NO_NUMA_NODE seems like precisely the
> correct value to report.
> 
Indeed. It tells Xen: <<hey Xen, toolstack here: we either don't care or
could not come up with any sane vnode-to-pnode mapping, please figure
that out yourself>>.

That makes the code, IMO, simpler at every level. In fact, at the Xen
level, there already is a default way to deal with the situation (the
striping). At the toolstack level, we only need to care about trying to
come up with some super-cool and super-good (for performance)
configuration, and just give up if anything like what David and Andrew
described occurs.

It's exactly what we're doing right now, BTW, with no vNUMA: we try to
place a domain on one (or on the least possible amount of) NUMA node(s)
but, if we fail, we inform the user that performance may suffer (with a
WARNING), and let Xen do what it wants with the guest's memory.

Regards,
Dario


^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH v6 09/23] libxl: introduce libxl__vnuma_config_check
  2015-03-02 15:34   ` Ian Campbell
  2015-03-02 15:50     ` Wei Liu
@ 2015-03-03  3:52     ` Dario Faggioli
  1 sibling, 0 replies; 81+ messages in thread
From: Dario Faggioli @ 2015-03-03  3:52 UTC (permalink / raw)
  To: Ian Campbell; +Cc: Ian Jackson, Wei Liu, ufimtseva, xen-devel


On Mon, 2015-03-02 at 15:34 +0000, Ian Campbell wrote:
> On Thu, 2015-02-26 at 15:55 +0000, Wei Liu wrote:
> > This function is used to check whether vNUMA configuration (be it
> > auto-generated or supplied by user) is valid.
> > 
> > Define a new error code ERROR_VNUMA_CONFIG_INVALID.
> > 
> > The checks performed can be found in the comment of the function.
> > 
> > This vNUMA function (and future ones) is placed in a new file called
> > libxl_vnuma.c
> > 
> > Signed-off-by: Wei Liu <wei.liu2@citrix.com>
> 
> Acked-by: Ian Campbell <ian.campbell@citrix.com>
> 
> Although I'd like to see comment from Dario and/or Elena. (Ian's
> concerns from last round look to me to have been addresses, Im sure
> he'll speak up if not).
> 
Yes, sorry for the delay.

The (minor) concerns I had on v5 have been addressed, so:

Reviewed-by: Dario Faggioli <dario.faggioli@citrix.com>

Dario


^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH v6 02/23] xen: move NUMA_NO_NODE to public memory.h as XEN_NUMA_NO_NODE
  2015-03-02 17:34                             ` David Vrabel
  2015-03-02 17:43                               ` Andrew Cooper
@ 2015-03-03  7:36                               ` Jan Beulich
  1 sibling, 0 replies; 81+ messages in thread
From: Jan Beulich @ 2015-03-03  7:36 UTC (permalink / raw)
  To: David Vrabel; +Cc: Andrew Cooper, Wei Liu, Ian Campbell, xen-devel

>>> On 02.03.15 at 18:34, <david.vrabel@citrix.com> wrote:
> A guest that previously had 2 vNUMA nodes is migrated to a host with
> only 1 pNUMA node.  It should still have 2 vNUMA nodes.

I don't think the migration aspect was really thought through yet
by the submitters of the vNUMA patches that went in so far. But
yes, I agree with your later reply that there is little hope of doing
anything about an eventual change in the guest. In which case it
again becomes a pretty moot thing to discuss how that state
should be presented to the guest.

Jan

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH v6 02/23] xen: move NUMA_NO_NODE to public memory.h as XEN_NUMA_NO_NODE
  2015-03-02 17:00                         ` Wei Liu
@ 2015-03-03  7:44                           ` Jan Beulich
  2015-03-03 11:08                             ` Wei Liu
  0 siblings, 1 reply; 81+ messages in thread
From: Jan Beulich @ 2015-03-03  7:44 UTC (permalink / raw)
  To: Wei Liu; +Cc: andrew.cooper3, Ian Campbell, xen-devel

>>> On 02.03.15 at 18:00, <wei.liu2@citrix.com> wrote:
> There are two issues here irrespective of this series:
> 
> 1. should we expose that sentinel in ABI?
> 2. if so, what should the sentinel be?
> 
> I think Andrew and you disagree on the first one. We can work out the
> answer to the second question later.

I very much think that if we want to allow a "no node" specification
via the domctl, then this should be part of the ABI. But that value
and its (implicit) equivalent used for memops don't need to be the
same, and it looks like they can't be. And looking at this I think the
code we have right now needs fixing: The internal vnode_to_pnode
array should become nodeid_t[], and input from the domctl should
be validated to either be a valid pnode or the to-be-defined sentinel
(which then, due to being stored as a more narrow type, needs
translation to NUMA_NO_NODE).

If we don't want to allow "no node", then input should be validated
to present valid pnodes.
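
A rough sketch of that validation (the concrete values are illustrative:
0xffffffff as a hypothetical domctl sentinel, and Xen's internal
NUMA_NO_NODE; a real check would also have to cope with offline or sparse
node IDs):

#include <stdint.h>

typedef uint8_t nodeid_t;                 /* narrow Xen-internal node id */
#define NUMA_NO_NODE      ((nodeid_t)0xff)
#define XEN_NUMA_NO_NODE  0xffffffffU     /* hypothetical ABI sentinel */

/* Validate one uint32_t vnode_to_pnode entry from the domctl and
 * translate it to the narrower internal type. */
static int pnode_from_domctl(uint32_t in, unsigned int nr_pnodes,
                             nodeid_t *out)
{
    if (in == XEN_NUMA_NO_NODE) {
        *out = NUMA_NO_NODE;   /* explicit "no node" */
        return 0;
    }
    if (in >= nr_pnodes)
        return -1;             /* reject out-of-range physical nodes */
    *out = (nodeid_t)in;
    return 0;
}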

Jan

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH v6 02/23] xen: move NUMA_NO_NODE to public memory.h as XEN_NUMA_NO_NODE
  2015-03-02 18:19                                     ` Andrew Cooper
  2015-03-03  3:42                                       ` Dario Faggioli
@ 2015-03-03  7:56                                       ` Jan Beulich
  1 sibling, 0 replies; 81+ messages in thread
From: Jan Beulich @ 2015-03-03  7:56 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: xen-devel, Wei Liu, David Vrabel, Ian Campbell

>>> On 02.03.15 at 19:19, <andrew.cooper3@citrix.com> wrote:
> On 02/03/15 17:52, Ian Campbell wrote:
>> On Mon, 2015-03-02 at 17:48 +0000, David Vrabel wrote:
>>> On 02/03/15 17:43, Andrew Cooper wrote:
>>>> On 02/03/15 17:34, David Vrabel wrote:
>>>>> A guest that previously had 2 vNUMA nodes is migrated to a host with
>>>>> only 1 pNUMA node.  It should still have 2 vNUMA nodes.
>>>> A natural consequence of vNUMA is that the guest must expect the vNUMA
>>>> layout to change across suspend/resume.  The toolstack cannot guarentee
>>>> that it can construct a similar vNUMA layout after a migration.  This
>>>> includes the toolstack indicating that it was unable to make any useful
>>>> NUMA affinity with the memory ranges.
>>> Eep!  I very much doubt we can do anything in Linux except retain the
>>> existing NUMA layout across a save/restore.
>> In the case you mention above I would expect the 2 vnuma nodes to just
>> point to the same single pnuma node.
>>
>> As such I think it's probably not relevant to the need for
>> XEN_NO_NUMA_NODE?
>>
>> Or is that not would be expected?
> 
> If we were to go down that route, the toolstack would need a way of
> signalling "this vNUMA node does not contain memory on a single pNUMA
> node" if there was insufficient free space to make the allocation.

That's quite the opposite of the example above: When collapsing 2
nodes to just one, there's no problem representing things - as Ian
says, just store the same node ID everywhere. Problems arise
when you need to distribute the guest across more nodes than it
originally ran on.

> In this case, a pnode of XEN_NO_NUMA_NODE seems like precisely the
> correct value to report.

As would be the one single node ID.

For the other case I just mentioned: Distances between nodes
may vary, and hence it would still be better to have a way to
indicate what subset of nodes you'd like the allocations to come
from. Granted this can't be represented by the current model, as
it would require node masks instead of node IDs. But (dependent
upon improvements to the page allocator) the tool stack could
still at least hint at what it wants by selecting two nodes
representing the maximum distance it wants to allow for. The
fallback in the page allocator should be tweaked anyway to not
blindly consider any node when needing to allocate outside of
the initially specified node mask, but closest one(s) first. I just
added this to my todo list.

Jan

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH v6 02/23] xen: move NUMA_NO_NODE to public memory.h as XEN_NUMA_NO_NODE
  2015-03-03  3:42                                       ` Dario Faggioli
@ 2015-03-03  8:55                                         ` Jan Beulich
  2015-03-04 12:51                                           ` Dario Faggioli
  0 siblings, 1 reply; 81+ messages in thread
From: Jan Beulich @ 2015-03-03  8:55 UTC (permalink / raw)
  To: Dario Faggioli
  Cc: Andrew Cooper, xen-devel, Wei Liu, David Vrabel, Ian Campbell

>>> On 03.03.15 at 04:42, <raistlin.df@gmail.com> wrote:
> On Mon, 2015-03-02 at 18:19 +0000, Andrew Cooper wrote:
>> In this case, a pnode of XEN_NO_NUMA_NODE seems like precisely the
>> correct value to report.
>> 
> Indeed. It tells Xen: <<hey Xen, toolstack here: we either don't care or
> could not come up with any sane vnode-to-pnode mapping, please figure
> that out yourself>>.
> 
> That makes the code, IMO, simpler at any level. In fact, at Xen level,
> there is a default way to deal with the situation (the striping)
> already. At the toolstack level, we can only care about trying to come
> up with some super-cool and super-good (for performance) configuration
> and just give up, if anything like what David and Andrew said occurs.

See my earlier reply - the tool stack at least giving hints to the
hypervisor in such a case would likely still be better (for the final
result) than leaving it entirely up to the hypervisor: "No node"
really means allocate from anywhere, whereas some specific
node passed in still allows the hypervisor to find second best fits
when having to fall back.

Jan

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH v6 02/23] xen: move NUMA_NO_NODE to public memory.h as XEN_NUMA_NO_NODE
  2015-03-03  7:44                           ` Jan Beulich
@ 2015-03-03 11:08                             ` Wei Liu
  0 siblings, 0 replies; 81+ messages in thread
From: Wei Liu @ 2015-03-03 11:08 UTC (permalink / raw)
  To: Jan Beulich; +Cc: andrew.cooper3, Wei Liu, Ian Campbell, xen-devel

On Tue, Mar 03, 2015 at 07:44:25AM +0000, Jan Beulich wrote:
> >>> On 02.03.15 at 18:00, <wei.liu2@citrix.com> wrote:
> > There are two issues here irrespective of this series:
> > 
> > 1. should we expose that sentinel in ABI?
> > 2. if so, what should the sentinel be?
> > 
> > I think Andrew and you disagree on the first one. We can work out the
> > answer to the second question later.
> 
> I very much think that if we want to allow a "no node" specification
> via the domctl, then this should be part of the ABI. But that value
> and its (implicit) equivalent used for memops don't need to be the
> same, and it looks like they can't. And looking at this I think the
> code we have right now needs fixing: The internal vnode_to_pnode
> array should become nodeid_t[], and input from the domctl should
> be validated to either be a valid pnode or the to be defined sentinel
> (which then, due to being stored as a more narrow type, needs
> translation to NUMA_NO_NODE).
> 
> If we don't want to allow "no node", then input should be validated
> to present valid pnodes.
> 

This is currently the case: I don't allow "no node". Not because that is
how the toolstack is designed / expected to work, but because there are
many implications that I am not quite sure of (and this thread sort of
confirms that everybody has his / her own idea of how this should work).
Libxl always requires the pnode to be a valid one. A sentinel is used
between libxc and libxl to mark whether the node specified is a valid
pnode.  Other than that, the sentinel is not used and is not passed down
to the hypervisor.
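
Something along these lines (a sketch only; the marker name is made up
here, and the memflags handling is simplified, using the XENMEMF_* macros
from Xen's public memory.h):

#include <xenctrl.h>   /* pulls in XENMEMF_exact_node() */

/* Toolstack-internal marker meaning "no valid pnode for this vnode";
 * it never reaches the hypervisor. */
#define PNODE_INVALID (~0U)

static unsigned int memflags_for_vnode(unsigned int pnode)
{
    if (pnode == PNODE_INVALID)
        return 0;                       /* no node preference at all */
    return XENMEMF_exact_node(pnode);   /* pin allocations to that pnode */
}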

I think Andrew and Dario came to the idea of needing a unified sentinel
from different angles. Andrew happened to see me using (~0UL) as a
sentinel and then wanted me to propagate the hypervisor sentinel. Dario
thought more about other use cases that we might need to provision for.

In any case, whether or not we expose this sentinel has no immediate
user-visible effect (it's only needed between libxc and libxl at the
moment), and we don't have any burden maintaining a different sentinel in
libxc because libxc is not defined as a stable interface.

Given the controversy around this patch, I'm fine with dropping it and
moving forward (20+ patches are now blocked by this one). If we can't
come to a resolution within this week, I will just drop this patch and
resend the others next week.

Wei.

> Jan

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH v6 02/23] xen: move NUMA_NO_NODE to public memory.h as XEN_NUMA_NO_NODE
  2015-03-03  8:55                                         ` Jan Beulich
@ 2015-03-04 12:51                                           ` Dario Faggioli
  0 siblings, 0 replies; 81+ messages in thread
From: Dario Faggioli @ 2015-03-04 12:51 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Andrew Cooper, xen-devel, Wei Liu, David Vrabel, Ian Campbell


On Tue, 2015-03-03 at 08:55 +0000, Jan Beulich wrote:
> >>> On 03.03.15 at 04:42, <raistlin.df@gmail.com> wrote:

> > Indeed. It tells Xen: <<hey Xen, toolstack here: we either don't care or
> > could not come up with any sane vnode-to-pnode mapping, please figure
> > that out yourself>>.
> > 
> > That makes the code, IMO, simpler at any level. In fact, at Xen level,
> > there is a default way to deal with the situation (the striping)
> > already. At the toolstack level, we can only care about trying to come
> > up with some super-cool and super-good (for performance) configuration
> > and just give up, if anything like what David and Andrew said occurs.
> 
> See my earlier reply - the tool stack at least giving hints to the
> hypervisor in such a case would likely still be better (for the final
> result) than leaving it entirely up to the hypervisor: "No node"
> really means allocate from anywhere, whereas some specific
> node passed in still allows the hypervisor to find second best fits
> when having to fall back.
> 
Yes, at the cost of more complex algorithms, both in the hypervisor (as
you say in your other email) and in the toolstack. The hypervisor side may
not be an issue; the toolstack, I'm not sure...

I already have a draft implementation for that, which I'll rebase and
submit on top of Wei's series as soon as that one is in.

I think I agree with Wei when he says that it's probably better to drop
the argument for now... We'll see later whether we really need a way to
tell NO_NODE to the hypervisor, and add it then, or not.

Regards,
Dario




^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH v6 20/23] libxlu: record line and column number when parsing values
  2015-02-26 15:56 ` [PATCH v6 20/23] libxlu: record line and column number when parsing values Wei Liu
@ 2015-03-06 11:36   ` Ian Jackson
  2015-03-06 12:03     ` Wei Liu
  0 siblings, 1 reply; 81+ messages in thread
From: Ian Jackson @ 2015-03-06 11:36 UTC (permalink / raw)
  To: Wei Liu; +Cc: Ian Campbell, xen-devel

Wei Liu writes ("[PATCH v6 20/23] libxlu: record line and column number when parsing values"):
> Originally only setting has line number recorded. Since we're moving to
> more sophisticated API, record line number and column number for
> individual value. They are useful for error reporting.

This is a good idea.

Why not use the flex location type in your formal parameters
everywhere ?  That would mean plumbing one rather than two values.

Ian.

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH v6 21/23] libxlu: introduce new APIs
  2015-02-26 15:56 ` [PATCH v6 21/23] libxlu: introduce new APIs Wei Liu
@ 2015-03-06 11:40   ` Ian Jackson
  0 siblings, 0 replies; 81+ messages in thread
From: Ian Jackson @ 2015-03-06 11:40 UTC (permalink / raw)
  To: Wei Liu; +Cc: Ian Campbell, xen-devel

Wei Liu writes ("[PATCH v6 21/23] libxlu: introduce new APIs"):
> These APIs can be used to manipulate XLU_ConfigValue and XLU_ConfigList.

Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH v6 20/23] libxlu: record line and column number when parsing values
  2015-03-06 11:36   ` Ian Jackson
@ 2015-03-06 12:03     ` Wei Liu
  2015-03-06 14:30       ` Ian Jackson
  0 siblings, 1 reply; 81+ messages in thread
From: Wei Liu @ 2015-03-06 12:03 UTC (permalink / raw)
  To: Ian Jackson; +Cc: Wei Liu, Ian Campbell, xen-devel

On Fri, Mar 06, 2015 at 11:36:03AM +0000, Ian Jackson wrote:
> Wei Liu writes ("[PATCH v6 20/23] libxlu: record line and column number when parsing values"):
> > Originally only setting has line number recorded. Since we're moving to
> > more sophisticated API, record line number and column number for
> > individual value. They are useful for error reporting.
> 
> This is a good idea.
> 
> Why not use the flex location type in your formal parameters
> everywhere ?  That would mean plumbing one rather than two values.
> 

Ack. This is how it looks now. The changes are:

1. Use YYLTYPE instead of line and column.
2. Handle the empty rule inside list_mk and string_mk (that is, when the
   YYLTYPE pointer is NULL).

---8<---
>From 2594d29a21909b8039e0f09001a44035af8f5f5b Mon Sep 17 00:00:00 2001
From: Wei Liu <wei.liu2@citrix.com>
Date: Mon, 16 Feb 2015 19:01:12 +0000
Subject: [PATCH] libxlu: record line and column number when parsing values

Originally only the setting has its line number recorded. Since we're
moving to a more sophisticated API, record the line number and column
number for each individual value. They are useful for error reporting.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
---
Changes in v7:
1. Use YYLTYPE instead of individual line / column values.
---
 tools/libxl/libxlu_cfg.c      | 16 ++++++++++++++--
 tools/libxl/libxlu_cfg_i.h    |  5 +++--
 tools/libxl/libxlu_cfg_y.c    |  6 +++---
 tools/libxl/libxlu_cfg_y.y    |  6 +++---
 tools/libxl/libxlu_internal.h |  1 +
 5 files changed, 24 insertions(+), 10 deletions(-)

diff --git a/tools/libxl/libxlu_cfg.c b/tools/libxl/libxlu_cfg.c
index 611f5ec..0c69573 100644
--- a/tools/libxl/libxlu_cfg.c
+++ b/tools/libxl/libxlu_cfg.c
@@ -311,7 +311,8 @@ const char *xlu_cfg_get_listitem(const XLU_ConfigList *list, int entry) {
 }
 
 
-XLU_ConfigValue *xlu__cfg_string_mk(CfgParseContext *ctx, char *atom)
+XLU_ConfigValue *xlu__cfg_string_mk(CfgParseContext *ctx, char *atom,
+                                    YYLTYPE *loc)
 {
     XLU_ConfigValue *value = NULL;
 
@@ -321,6 +322,11 @@ XLU_ConfigValue *xlu__cfg_string_mk(CfgParseContext *ctx, char *atom)
     if (!value) goto xe;
     value->type = XLU_STRING;
     value->u.string = atom;
+    if (loc) {
+        value->line = loc->first_line;
+        value->column = loc->first_column;
+    } else
+        value->line = value->column = 0;
 
     return value;
 
@@ -333,7 +339,8 @@ XLU_ConfigValue *xlu__cfg_string_mk(CfgParseContext *ctx, char *atom)
 }
 
 XLU_ConfigValue *xlu__cfg_list_mk(CfgParseContext *ctx,
-                                  XLU_ConfigValue *val)
+                                  XLU_ConfigValue *val,
+                                  YYLTYPE *loc)
 {
     XLU_ConfigValue *value = NULL;
     XLU_ConfigValue **values = NULL;
@@ -350,6 +357,11 @@ XLU_ConfigValue *xlu__cfg_list_mk(CfgParseContext *ctx,
     value->u.list.nvalues = 1;
     value->u.list.avalues = 1;
     value->u.list.values = values;
+    if (loc) {
+        value->line = loc->first_line;
+        value->column = loc->first_column;
+    } else
+        value->line = value->column = 0;
 
     return value;
 
diff --git a/tools/libxl/libxlu_cfg_i.h b/tools/libxl/libxlu_cfg_i.h
index 11dc33f..1b59b33 100644
--- a/tools/libxl/libxlu_cfg_i.h
+++ b/tools/libxl/libxlu_cfg_i.h
@@ -26,9 +26,10 @@ void xlu__cfg_set_free(XLU_ConfigSetting *set);
 void xlu__cfg_set_store(CfgParseContext*, char *name,
                         XLU_ConfigValue *val, int lineno);
 XLU_ConfigValue *xlu__cfg_string_mk(CfgParseContext *ctx,
-                                    char *atom);
+                                    char *atom, YYLTYPE *loc);
 XLU_ConfigValue *xlu__cfg_list_mk(CfgParseContext *ctx,
-                                  XLU_ConfigValue *val);
+                                  XLU_ConfigValue *val,
+                                  YYLTYPE *loc);
 void xlu__cfg_list_append(CfgParseContext *ctx,
                           XLU_ConfigValue *list,
                           XLU_ConfigValue *val);
diff --git a/tools/libxl/libxlu_cfg_y.c b/tools/libxl/libxlu_cfg_y.c
index b05e48b..7e690d5 100644
--- a/tools/libxl/libxlu_cfg_y.c
+++ b/tools/libxl/libxlu_cfg_y.c
@@ -1515,7 +1515,7 @@ yyreduce:
 
 /* Line 1806 of yacc.c  */
 #line 62 "libxlu_cfg_y.y"
-    { (yyval.value)= xlu__cfg_string_mk(ctx,(yyvsp[(1) - (1)].string)); }
+    { (yyval.value)= xlu__cfg_string_mk(ctx,(yyvsp[(1) - (1)].string),&(yylsp[(1) - (1)])); }
     break;
 
   case 13:
@@ -1543,7 +1543,7 @@ yyreduce:
 
 /* Line 1806 of yacc.c  */
 #line 68 "libxlu_cfg_y.y"
-    { (yyval.value)= xlu__cfg_list_mk(ctx,NULL); }
+    { (yyval.value)= xlu__cfg_list_mk(ctx,NULL,NULL); }
     break;
 
   case 17:
@@ -1564,7 +1564,7 @@ yyreduce:
 
 /* Line 1806 of yacc.c  */
 #line 72 "libxlu_cfg_y.y"
-    { (yyval.value)= xlu__cfg_list_mk(ctx,(yyvsp[(1) - (2)].value)); }
+    { (yyval.value)= xlu__cfg_list_mk(ctx,(yyvsp[(1) - (2)].value),&(yylsp[(1) - (2)])); }
     break;
 
   case 20:
diff --git a/tools/libxl/libxlu_cfg_y.y b/tools/libxl/libxlu_cfg_y.y
index 4a5ca3a..c9437c9 100644
--- a/tools/libxl/libxlu_cfg_y.y
+++ b/tools/libxl/libxlu_cfg_y.y
@@ -59,17 +59,17 @@ assignment: IDENT '=' value { xlu__cfg_set_store(ctx,$1,$3,@3.first_line); }
 endstmt: NEWLINE
  |      ';'
 
-value:  atom                         { $$= xlu__cfg_string_mk(ctx,$1); }
+value:  atom                         { $$= xlu__cfg_string_mk(ctx,$1,&@1); }
  |      '[' nlok valuelist ']'       { $$= $3; }
 
 atom:   STRING                   { $$= $1; }
  |      NUMBER                   { $$= $1; }
 
-valuelist: /* empty */           { $$= xlu__cfg_list_mk(ctx,NULL); }
+valuelist: /* empty */           { $$= xlu__cfg_list_mk(ctx,NULL,NULL); }
  |      values                  { $$= $1; }
  |      values ',' nlok         { $$= $1; }
 
-values: value nlok                  { $$= xlu__cfg_list_mk(ctx,$1); }
+values: value nlok                  { $$= xlu__cfg_list_mk(ctx,$1,&@1); }
  |      values ',' nlok value nlok  { xlu__cfg_list_append(ctx,$1,$4); $$= $1; }
 
 nlok:
diff --git a/tools/libxl/libxlu_internal.h b/tools/libxl/libxlu_internal.h
index 092a17a..73fd85f 100644
--- a/tools/libxl/libxlu_internal.h
+++ b/tools/libxl/libxlu_internal.h
@@ -44,6 +44,7 @@ struct XLU_ConfigValue {
         char *string;
         XLU_ConfigList list;
     } u;
+    int line, column;
 };
 
 typedef struct XLU_ConfigSetting { /* transparent */
-- 
1.9.1



> Ian.

^ permalink raw reply related	[flat|nested] 81+ messages in thread

* Re: [PATCH v6 20/23] libxlu: record line and column number when parsing values
  2015-03-06 12:03     ` Wei Liu
@ 2015-03-06 14:30       ` Ian Jackson
  2015-03-06 16:11         ` Wei Liu
  0 siblings, 1 reply; 81+ messages in thread
From: Ian Jackson @ 2015-03-06 14:30 UTC (permalink / raw)
  To: Wei Liu; +Cc: Ian Campbell, xen-devel

Wei Liu writes ("Re: [PATCH v6 20/23] libxlu: record line and column number when parsing values"):
> Ack. This is how it looks like now. The changes are
> 
> 1. Use YYLTYPE instead of line and column.

This is better but why not have YYLTYPE inside XLU_ConfigValue ?

> 2. Handle empty rule inside list_mk and string_mk (that is when YYLTYPE
>    points to NULL)

I hadn't spotted this in the previous patch.  Why not use `yyloc', the
location of the lookahead token ?

> +    if (loc) {
> +        value->line = loc->first_line;
> +        value->column = loc->first_column;
> +    } else
> +        value->line = value->column = 0;

This code appears twice.

Thanks,
Ian.

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH v6 20/23] libxlu: record line and column number when parsing values
  2015-03-06 14:30       ` Ian Jackson
@ 2015-03-06 16:11         ` Wei Liu
  2015-03-06 16:57           ` Wei Liu
  0 siblings, 1 reply; 81+ messages in thread
From: Wei Liu @ 2015-03-06 16:11 UTC (permalink / raw)
  To: Ian Jackson; +Cc: Wei Liu, Ian Campbell, xen-devel

On Fri, Mar 06, 2015 at 02:30:27PM +0000, Ian Jackson wrote:
> Wei Liu writes ("Re: [PATCH v6 20/23] libxlu: record line and column number when parsing values"):
> > Ack. This is how it looks like now. The changes are
> > 
> > 1. Use YYLTYPE instead of line and column.
> 
> This is better but why not have YYLTYPE inside XLU_ConfigValue ?
> 

That appears to cause some kind of circular inclusion problem. I will
see what I can do.

> > 2. Handle empty rule inside list_mk and string_mk (that is when YYLTYPE
> >    points to NULL)
> 
> I hadn't spotted this in the previous patch.  Why not use `yyloc', the
> location of the lookahead token ?
> 

I think you mean yylloc; yyloc is an internal variable.

> > +    if (loc) {
> > +        value->line = loc->first_line;
> > +        value->column = loc->first_column;
> > +    } else
> > +        value->line = value->column = 0;
> 
> This code appears twice.
> 

If I use yylloc (or yyloc) this hunk will just be gone.

Wei.

> Thanks,
> Ian.

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH v6 20/23] libxlu: record line and column number when parsing values
  2015-03-06 16:11         ` Wei Liu
@ 2015-03-06 16:57           ` Wei Liu
  0 siblings, 0 replies; 81+ messages in thread
From: Wei Liu @ 2015-03-06 16:57 UTC (permalink / raw)
  To: Ian Jackson; +Cc: Wei Liu, Ian Campbell, xen-devel

On Fri, Mar 06, 2015 at 04:11:53PM +0000, Wei Liu wrote:
> On Fri, Mar 06, 2015 at 02:30:27PM +0000, Ian Jackson wrote:
> > Wei Liu writes ("Re: [PATCH v6 20/23] libxlu: record line and column number when parsing values"):
> > > Ack. This is how it looks like now. The changes are
> > > 
> > > 1. Use YYLTYPE instead of line and column.
> > 
> > This is better but why not have YYLTYPE inside XLU_ConfigValue ?
> > 
> 
> That appears to cause some kind of circular inclusion problem. I will
> see what I can do.
> 

I used a trick to work around this: only keep a pointer in
XLU_ConfigValue, then malloc the XLU_ConfigValue and the YYLTYPE in one go
and set the pointer accordingly.

This way we avoid including libxlu_cfg_i.h in libxlu_internal.h and don't
need to worry about leaking memory.

---8<---
>From ce87242c706549c42581971e8e35ffcd4e6c1da7 Mon Sep 17 00:00:00 2001
From: Wei Liu <wei.liu2@citrix.com>
Date: Mon, 16 Feb 2015 19:01:12 +0000
Subject: [PATCH] libxlu: record location when parsing values

Originally only the setting has its line number recorded. Since we're
moving to a more sophisticated API, record the location for each
individual value. It is useful for error reporting.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
---
Changes in v7:
1. Use yylloc in empty rule.
2. Use YYLTYPE instead of individual line / column values.
---
 tools/libxl/libxlu_cfg.c      | 14 ++++++++++----
 tools/libxl/libxlu_cfg_i.h    |  5 +++--
 tools/libxl/libxlu_cfg_y.c    |  6 +++---
 tools/libxl/libxlu_cfg_y.y    |  6 +++---
 tools/libxl/libxlu_internal.h |  2 ++
 5 files changed, 21 insertions(+), 12 deletions(-)

diff --git a/tools/libxl/libxlu_cfg.c b/tools/libxl/libxlu_cfg.c
index 611f5ec..858f894 100644
--- a/tools/libxl/libxlu_cfg.c
+++ b/tools/libxl/libxlu_cfg.c
@@ -311,16 +311,19 @@ const char *xlu_cfg_get_listitem(const XLU_ConfigList *list, int entry) {
 }
 
 
-XLU_ConfigValue *xlu__cfg_string_mk(CfgParseContext *ctx, char *atom)
+XLU_ConfigValue *xlu__cfg_string_mk(CfgParseContext *ctx, char *atom,
+                                    YYLTYPE *loc)
 {
     XLU_ConfigValue *value = NULL;
 
     if (ctx->err) goto x;
 
-    value = malloc(sizeof(*value));
+    value = malloc(sizeof(*value)+sizeof(*loc));
     if (!value) goto xe;
     value->type = XLU_STRING;
     value->u.string = atom;
+    value->loc = (YYLTYPE *)(value+1);
+    memcpy(value->loc, loc, sizeof(*loc));
 
     return value;
 
@@ -333,7 +336,8 @@ XLU_ConfigValue *xlu__cfg_string_mk(CfgParseContext *ctx, char *atom)
 }
 
 XLU_ConfigValue *xlu__cfg_list_mk(CfgParseContext *ctx,
-                                  XLU_ConfigValue *val)
+                                  XLU_ConfigValue *val,
+                                  YYLTYPE *loc)
 {
     XLU_ConfigValue *value = NULL;
     XLU_ConfigValue **values = NULL;
@@ -344,12 +348,14 @@ XLU_ConfigValue *xlu__cfg_list_mk(CfgParseContext *ctx,
     if (!values) goto xe;
     values[0] = val;
 
-    value = malloc(sizeof(*value));
+    value = malloc(sizeof(*value)+sizeof(*loc));
     if (!value) goto xe;
     value->type = XLU_LIST;
     value->u.list.nvalues = 1;
     value->u.list.avalues = 1;
     value->u.list.values = values;
+    value->loc = (YYLTYPE *)(value+1);
+    memcpy(value->loc, loc, sizeof(*loc));
 
     return value;
 
diff --git a/tools/libxl/libxlu_cfg_i.h b/tools/libxl/libxlu_cfg_i.h
index 11dc33f..1b59b33 100644
--- a/tools/libxl/libxlu_cfg_i.h
+++ b/tools/libxl/libxlu_cfg_i.h
@@ -26,9 +26,10 @@ void xlu__cfg_set_free(XLU_ConfigSetting *set);
 void xlu__cfg_set_store(CfgParseContext*, char *name,
                         XLU_ConfigValue *val, int lineno);
 XLU_ConfigValue *xlu__cfg_string_mk(CfgParseContext *ctx,
-                                    char *atom);
+                                    char *atom, YYLTYPE *loc);
 XLU_ConfigValue *xlu__cfg_list_mk(CfgParseContext *ctx,
-                                  XLU_ConfigValue *val);
+                                  XLU_ConfigValue *val,
+                                  YYLTYPE *loc);
 void xlu__cfg_list_append(CfgParseContext *ctx,
                           XLU_ConfigValue *list,
                           XLU_ConfigValue *val);
diff --git a/tools/libxl/libxlu_cfg_y.c b/tools/libxl/libxlu_cfg_y.c
index b05e48b..fbfdd0f 100644
--- a/tools/libxl/libxlu_cfg_y.c
+++ b/tools/libxl/libxlu_cfg_y.c
@@ -1515,7 +1515,7 @@ yyreduce:
 
 /* Line 1806 of yacc.c  */
 #line 62 "libxlu_cfg_y.y"
-    { (yyval.value)= xlu__cfg_string_mk(ctx,(yyvsp[(1) - (1)].string)); }
+    { (yyval.value)= xlu__cfg_string_mk(ctx,(yyvsp[(1) - (1)].string),&(yylsp[(1) - (1)])); }
     break;
 
   case 13:
@@ -1543,7 +1543,7 @@ yyreduce:
 
 /* Line 1806 of yacc.c  */
 #line 68 "libxlu_cfg_y.y"
-    { (yyval.value)= xlu__cfg_list_mk(ctx,NULL); }
+    { (yyval.value)= xlu__cfg_list_mk(ctx,NULL,&yylloc); }
     break;
 
   case 17:
@@ -1564,7 +1564,7 @@ yyreduce:
 
 /* Line 1806 of yacc.c  */
 #line 72 "libxlu_cfg_y.y"
-    { (yyval.value)= xlu__cfg_list_mk(ctx,(yyvsp[(1) - (2)].value)); }
+    { (yyval.value)= xlu__cfg_list_mk(ctx,(yyvsp[(1) - (2)].value),&(yylsp[(1) - (2)])); }
     break;
 
   case 20:
diff --git a/tools/libxl/libxlu_cfg_y.y b/tools/libxl/libxlu_cfg_y.y
index 4a5ca3a..a923f76 100644
--- a/tools/libxl/libxlu_cfg_y.y
+++ b/tools/libxl/libxlu_cfg_y.y
@@ -59,17 +59,17 @@ assignment: IDENT '=' value { xlu__cfg_set_store(ctx,$1,$3,@3.first_line); }
 endstmt: NEWLINE
  |      ';'
 
-value:  atom                         { $$= xlu__cfg_string_mk(ctx,$1); }
+value:  atom                         { $$= xlu__cfg_string_mk(ctx,$1,&@1); }
  |      '[' nlok valuelist ']'       { $$= $3; }
 
 atom:   STRING                   { $$= $1; }
  |      NUMBER                   { $$= $1; }
 
-valuelist: /* empty */           { $$= xlu__cfg_list_mk(ctx,NULL); }
+valuelist: /* empty */           { $$= xlu__cfg_list_mk(ctx,NULL,&yylloc); }
  |      values                  { $$= $1; }
  |      values ',' nlok         { $$= $1; }
 
-values: value nlok                  { $$= xlu__cfg_list_mk(ctx,$1); }
+values: value nlok                  { $$= xlu__cfg_list_mk(ctx,$1,&@1); }
  |      values ',' nlok value nlok  { xlu__cfg_list_append(ctx,$1,$4); $$= $1; }
 
 nlok:
diff --git a/tools/libxl/libxlu_internal.h b/tools/libxl/libxlu_internal.h
index 092a17a..cc1d400 100644
--- a/tools/libxl/libxlu_internal.h
+++ b/tools/libxl/libxlu_internal.h
@@ -38,12 +38,14 @@ typedef struct XLU_ConfigList {
     XLU_ConfigValue **values;
 } XLU_ConfigList;
 
+typedef struct YYLTYPE YYLTYPE;
 struct XLU_ConfigValue {
     enum XLU_ConfigValueType type;
     union {
         char *string;
         XLU_ConfigList list;
     } u;
+    YYLTYPE *loc;
 };
 
 typedef struct XLU_ConfigSetting { /* transparent */
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 81+ messages in thread

end of thread, other threads:[~2015-03-06 16:57 UTC | newest]

Thread overview: 81+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-02-26 15:55 [PATCH v6 00/23] Virtual NUMA for PV and HVM Wei Liu
2015-02-26 15:55 ` [PATCH v6 01/23] xen: factor out construct_memop_from_reservation Wei Liu
2015-02-27 10:57   ` Andrew Cooper
2015-02-26 15:55 ` [PATCH v6 02/23] xen: move NUMA_NO_NODE to public memory.h as XEN_NUMA_NO_NODE Wei Liu
2015-02-27 11:38   ` Andrew Cooper
2015-02-27 16:42   ` Jan Beulich
2015-02-27 16:51     ` Wei Liu
2015-02-27 16:52       ` Andrew Cooper
2015-03-02  7:04         ` Jan Beulich
2015-03-02 15:30           ` Ian Campbell
2015-03-02 15:38             ` Wei Liu
2015-03-02 15:51               ` Jan Beulich
2015-03-02 16:08                 ` Wei Liu
2015-03-02 16:27                   ` Jan Beulich
2015-03-02 16:39                     ` Wei Liu
2015-03-02 16:50                       ` Jan Beulich
2015-03-02 17:00                         ` Wei Liu
2015-03-03  7:44                           ` Jan Beulich
2015-03-03 11:08                             ` Wei Liu
2015-03-02 17:01                         ` Andrew Cooper
2015-03-02 17:26                           ` Jan Beulich
2015-03-02 17:34                             ` David Vrabel
2015-03-02 17:43                               ` Andrew Cooper
2015-03-02 17:48                                 ` David Vrabel
2015-03-02 17:52                                   ` Ian Campbell
2015-03-02 18:19                                     ` Andrew Cooper
2015-03-03  3:42                                       ` Dario Faggioli
2015-03-03  8:55                                         ` Jan Beulich
2015-03-04 12:51                                           ` Dario Faggioli
2015-03-03  7:56                                       ` Jan Beulich
2015-03-03  7:36                               ` Jan Beulich
2015-02-26 15:55 ` [PATCH v6 03/23] xen: make two memory hypercalls vNUMA-aware Wei Liu
2015-02-27 16:59   ` Jan Beulich
2015-02-27 17:03     ` Wei Liu
2015-02-26 15:55 ` [PATCH v6 04/23] libxc: duplicate snippet to allocate p2m_host array Wei Liu
2015-03-02 15:26   ` Ian Campbell
2015-03-02 15:33     ` Wei Liu
2015-03-02 16:18       ` Ian Campbell
2015-03-02 16:45         ` Konrad Rzeszutek Wilk
2015-03-02 16:46     ` Konrad Rzeszutek Wilk
2015-02-26 15:55 ` [PATCH v6 05/23] libxc: add p2m_size to xc_dom_image Wei Liu
2015-03-02 15:28   ` Ian Campbell
2015-02-26 15:55 ` [PATCH v6 06/23] libxc: allocate memory with vNUMA information for PV guest Wei Liu
2015-03-02 15:36   ` Ian Campbell
2015-02-26 15:55 ` [PATCH v6 07/23] libxl: introduce vNUMA types Wei Liu
2015-02-26 15:55 ` [PATCH v6 08/23] libxl: add vmemrange to libxl__domain_build_state Wei Liu
2015-02-26 15:55 ` [PATCH v6 09/23] libxl: introduce libxl__vnuma_config_check Wei Liu
2015-03-02 15:34   ` Ian Campbell
2015-03-02 15:50     ` Wei Liu
2015-03-03  3:52     ` Dario Faggioli
2015-02-26 15:55 ` [PATCH v6 10/23] libxl: x86: factor out e820_host_sanitize Wei Liu
2015-02-26 15:55 ` [PATCH v6 11/23] libxl: functions to build vmemranges for PV guest Wei Liu
2015-02-26 16:39   ` Dario Faggioli
2015-03-02 15:41   ` Ian Campbell
2015-03-02 17:52     ` Wei Liu
2015-02-26 15:55 ` [PATCH v6 12/23] libxl: build, check and pass vNUMA info to Xen " Wei Liu
2015-02-26 15:55 ` [PATCH v6 13/23] libxc: indentation change to xc_hvm_build_x86.c Wei Liu
2015-02-26 15:55 ` [PATCH v6 14/23] libxc: allocate memory with vNUMA information for HVM guest Wei Liu
2015-03-02 15:43   ` Ian Campbell
2015-02-26 15:55 ` [PATCH v6 15/23] libxl: build, check and pass vNUMA info to Xen " Wei Liu
2015-03-02 15:44   ` Ian Campbell
2015-02-26 15:55 ` [PATCH v6 16/23] libxl: disallow memory relocation when vNUMA is enabled Wei Liu
2015-03-02 15:46   ` Ian Campbell
2015-02-26 15:56 ` [PATCH v6 17/23] libxl: define LIBXL_HAVE_VNUMA Wei Liu
2015-02-27 13:46   ` Dario Faggioli
2015-02-26 15:56 ` [PATCH v6 18/23] libxlu: rework internal representation of setting Wei Liu
2015-02-26 15:56 ` [PATCH v6 19/23] libxlu: nested list support Wei Liu
2015-02-26 15:56 ` [PATCH v6 20/23] libxlu: record line and column number when parsing values Wei Liu
2015-03-06 11:36   ` Ian Jackson
2015-03-06 12:03     ` Wei Liu
2015-03-06 14:30       ` Ian Jackson
2015-03-06 16:11         ` Wei Liu
2015-03-06 16:57           ` Wei Liu
2015-02-26 15:56 ` [PATCH v6 21/23] libxlu: introduce new APIs Wei Liu
2015-03-06 11:40   ` Ian Jackson
2015-02-26 15:56 ` [PATCH v6 22/23] xl: introduce xcalloc Wei Liu
2015-03-02 15:51   ` Ian Campbell
2015-02-26 15:56 ` [PATCH v6 23/23] xl: vNUMA support Wei Liu
2015-02-27 16:17   ` Dario Faggioli
2015-03-02 15:59   ` Ian Campbell
2015-03-02 16:31     ` Wei Liu
