All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v4 0/9] Display IO topology when PXM data is available (plus some cleanup)
@ 2015-03-10  2:27 Boris Ostrovsky
  2015-03-10  2:27 ` [PATCH v4 1/9] numa: __node_distance() should return u8 Boris Ostrovsky
                   ` (9 more replies)
  0 siblings, 10 replies; 34+ messages in thread
From: Boris Ostrovsky @ 2015-03-10  2:27 UTC (permalink / raw)
  To: andrew.cooper3, jbeulich, keir, ian.jackson, stefano.stabellini,
	ian.campbell, wei.liu2
  Cc: elena.ufimtseva, dario.faggioli, boris.ostrovsky, xen-devel


Changes in v4:
* Split cputopology and NUMA info changes into separate patches
* Added patch#1 (partly because patch#4 needs to know when when distance is invalid,
  i.e. NUMA_NO_DISTANCE)
* Split sysctl version update into a separate patch
* Other changes are listed in each patch
* NOTE: I did not test python's xc changes since I don't think I know how.

Changes in v3:
* Added patch #1 to more consistently define nodes as a u8 and properly
  use NUMA_NO_NODE.
* Make changes to xen_sysctl_numainfo, similar to those made to
  xen_sysctl_topologyinfo. (Q: I kept both sets of changes in the same
  patch #3 to avoid bumping interface version twice. Perhaps it's better
  to split it into two?)
* Instead of copying data for each loop index allocate a buffer and copy
  once for all three queries in sysctl.c.
* Move hypercall buffer management from libxl to libxc (as requested by
  Dario, patches #5 and #6).
* Report topology info for offlined CPUs as well
* Added LIBXL_HAVE_PCITOPO macro

Changes in v2:
* Split topology sysctls into two --- one for CPU topology and the other
  for devices
* Avoid long loops in the hypervisor by using continuations. (I am not
  particularly happy about using first_dev in the interface, suggestions
  for a better interface would be appreciated)
* Use proper libxl conventions for interfaces
* Avoid hypervisor stack corruption when copying PXM data from guest


A few patches that add interface for querying hypervisor about device
topology and allow 'xl info -n' display this information if PXM object
is provided by ACPI.

This series also makes some optimizations and cleanup of current CPU
topology and NUMA sysctl queries.

Boris Ostrovsky (9):
  numa: __node_distance() should return u8
  pci: Stash device's PXM information in struct pci_dev
  sysctl: Make XEN_SYSCTL_topologyinfo sysctl a little more efficient
  sysctl: Make XEN_SYSCTL_numainfo a little more efficient
  sysctl: Add sysctl interface for querying PCI topology
  sysctl: Update sysctl version to 0x0000000C
  libxl/libxc: Move libxl_get_cpu_topology()'s hypercall buffer
    management to libxc
  libxl/libxc: Move libxl_get_numainfo()'s hypercall buffer management
    to libxc
  libxl: Add interface for querying hypervisor about PCI topology

 tools/libxc/include/xenctrl.h     |   12 ++-
 tools/libxc/xc_misc.c             |   83 +++++++++++++---
 tools/libxl/libxl.c               |  157 +++++++++++++++--------------
 tools/libxl/libxl.h               |   12 +++
 tools/libxl/libxl_freebsd.c       |   12 +++
 tools/libxl/libxl_internal.h      |    5 +
 tools/libxl/libxl_linux.c         |   69 +++++++++++++
 tools/libxl/libxl_netbsd.c        |   12 +++
 tools/libxl/libxl_types.idl       |    7 ++
 tools/libxl/libxl_utils.c         |    8 ++
 tools/libxl/xl_cmdimpl.c          |   40 ++++++--
 tools/misc/xenpm.c                |   92 +++++++-----------
 tools/python/xen/lowlevel/xc/xc.c |  109 ++++++++-------------
 xen/arch/x86/physdev.c            |   23 ++++-
 xen/arch/x86/srat.c               |   15 ++-
 xen/common/page_alloc.c           |    4 +-
 xen/common/sysctl.c               |  197 +++++++++++++++++++++++++++----------
 xen/drivers/passthrough/pci.c     |   13 ++-
 xen/include/asm-x86/numa.h        |    2 +-
 xen/include/public/physdev.h      |    6 +
 xen/include/public/sysctl.h       |  111 +++++++++++++++------
 xen/include/xen/numa.h            |    3 +-
 xen/include/xen/pci.h             |    5 +-
 23 files changed, 669 insertions(+), 328 deletions(-)

^ permalink raw reply	[flat|nested] 34+ messages in thread

* [PATCH v4 1/9] numa: __node_distance() should return u8
  2015-03-10  2:27 [PATCH v4 0/9] Display IO topology when PXM data is available (plus some cleanup) Boris Ostrovsky
@ 2015-03-10  2:27 ` Boris Ostrovsky
  2015-03-10 11:53   ` Andrew Cooper
  2015-03-10  2:27 ` [PATCH v4 2/9] pci: Stash device's PXM information in struct pci_dev Boris Ostrovsky
                   ` (8 subsequent siblings)
  9 siblings, 1 reply; 34+ messages in thread
From: Boris Ostrovsky @ 2015-03-10  2:27 UTC (permalink / raw)
  To: andrew.cooper3, jbeulich, keir, ian.jackson, stefano.stabellini,
	ian.campbell, wei.liu2
  Cc: elena.ufimtseva, dario.faggioli, boris.ostrovsky, xen-devel

SLIT values are byte-sized and some of them (0-9 and 255) have
special meaning. Adjust __node_distance() to reflect this and
modify scrub_heap_pages() to deal with __node_distance() returning
an invalid SLIT entry.

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
---
 xen/arch/x86/srat.c        |   15 +++++++++++----
 xen/common/page_alloc.c    |    4 ++--
 xen/include/asm-x86/numa.h |    2 +-
 xen/include/xen/numa.h     |    3 ++-
 4 files changed, 16 insertions(+), 8 deletions(-)

diff --git a/xen/arch/x86/srat.c b/xen/arch/x86/srat.c
index dfabba3..aa2eda3 100644
--- a/xen/arch/x86/srat.c
+++ b/xen/arch/x86/srat.c
@@ -496,14 +496,21 @@ static unsigned node_to_pxm(nodeid_t n)
 	return 0;
 }
 
-int __node_distance(nodeid_t a, nodeid_t b)
+u8 __node_distance(nodeid_t a, nodeid_t b)
 {
-	int index;
+	u8 slit_val;
 
 	if (!acpi_slit)
 		return a == b ? 10 : 20;
-	index = acpi_slit->locality_count * node_to_pxm(a);
-	return acpi_slit->entry[index + node_to_pxm(b)];
+
+	slit_val = acpi_slit->entry[acpi_slit->locality_count * node_to_pxm(a) +
+				                node_to_pxm(b)];
+
+	/* ACPI defines 0xff as an unreachable node and 0-9 are undefined */
+	if ((slit_val == 0xff) || (slit_val <= 9))
+		return NUMA_NO_DISTANCE;
+	else
+		return slit_val;
 }
 
 EXPORT_SYMBOL(__node_distance);
diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c
index d96d25b..0f0ca56 100644
--- a/xen/common/page_alloc.c
+++ b/xen/common/page_alloc.c
@@ -1435,13 +1435,13 @@ void __init scrub_heap_pages(void)
         /* Figure out which NODE CPUs are close. */
         for_each_online_node ( j )
         {
-            int distance;
+            u8 distance;
 
             if ( cpumask_empty(&node_to_cpumask(j)) )
                 continue;
 
             distance = __node_distance(i, j);
-            if ( distance < last_distance )
+            if ( (distance < last_distance) && (distance != NUMA_NO_DISTANCE) )
             {
                 last_distance = distance;
                 best_node = j;
diff --git a/xen/include/asm-x86/numa.h b/xen/include/asm-x86/numa.h
index cc5b5d1..7a489d3 100644
--- a/xen/include/asm-x86/numa.h
+++ b/xen/include/asm-x86/numa.h
@@ -85,6 +85,6 @@ extern int valid_numa_range(u64 start, u64 end, nodeid_t node);
 #endif
 
 void srat_parse_regions(u64 addr);
-extern int __node_distance(nodeid_t a, nodeid_t b);
+extern u8 __node_distance(nodeid_t a, nodeid_t b);
 
 #endif
diff --git a/xen/include/xen/numa.h b/xen/include/xen/numa.h
index ac4b391..7aef1a8 100644
--- a/xen/include/xen/numa.h
+++ b/xen/include/xen/numa.h
@@ -7,7 +7,8 @@
 #define NODES_SHIFT     0
 #endif
 
-#define NUMA_NO_NODE    0xFF
+#define NUMA_NO_NODE     0xFF
+#define NUMA_NO_DISTANCE 0xFF
 
 #define MAX_NUMNODES    (1 << NODES_SHIFT)
 
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH v4 2/9] pci: Stash device's PXM information in struct pci_dev
  2015-03-10  2:27 [PATCH v4 0/9] Display IO topology when PXM data is available (plus some cleanup) Boris Ostrovsky
  2015-03-10  2:27 ` [PATCH v4 1/9] numa: __node_distance() should return u8 Boris Ostrovsky
@ 2015-03-10  2:27 ` Boris Ostrovsky
  2015-03-13 16:15   ` Jan Beulich
  2015-03-10  2:27 ` [PATCH v4 3/9] sysctl: Make XEN_SYSCTL_topologyinfo sysctl a little more efficient Boris Ostrovsky
                   ` (7 subsequent siblings)
  9 siblings, 1 reply; 34+ messages in thread
From: Boris Ostrovsky @ 2015-03-10  2:27 UTC (permalink / raw)
  To: andrew.cooper3, jbeulich, keir, ian.jackson, stefano.stabellini,
	ian.campbell, wei.liu2
  Cc: elena.ufimtseva, dario.faggioli, boris.ostrovsky, xen-devel

If ACPI provides PXM data for IO devices then dom0 will pass it to
hypervisor during PHYSDEVOP_pci_device_add call. This information,
however, is currently ignored.

We will store this information (in the form of nodeID) in pci_dev
structure so that we can provide it, for example, to the toolstack
when it adds support (in the following patches) for querying the
hypervisor about device topology

We will also print it when user requests device information dump.

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
---

Changes in v4:
* New commit title

 xen/arch/x86/physdev.c        |   23 ++++++++++++++++++++---
 xen/drivers/passthrough/pci.c |   13 +++++++++----
 xen/include/public/physdev.h  |    6 ++++++
 xen/include/xen/pci.h         |    5 ++++-
 4 files changed, 39 insertions(+), 8 deletions(-)

diff --git a/xen/arch/x86/physdev.c b/xen/arch/x86/physdev.c
index 1be1d50..8d63da5 100644
--- a/xen/arch/x86/physdev.c
+++ b/xen/arch/x86/physdev.c
@@ -565,7 +565,8 @@ ret_t do_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
         if ( copy_from_guest(&manage_pci, arg, 1) != 0 )
             break;
 
-        ret = pci_add_device(0, manage_pci.bus, manage_pci.devfn, NULL);
+        ret = pci_add_device(0, manage_pci.bus, manage_pci.devfn,
+                             NULL, NUMA_NO_NODE);
         break;
     }
 
@@ -597,13 +598,14 @@ ret_t do_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
         pdev_info.physfn.devfn = manage_pci_ext.physfn.devfn;
         ret = pci_add_device(0, manage_pci_ext.bus,
                              manage_pci_ext.devfn,
-                             &pdev_info);
+                             &pdev_info, NUMA_NO_NODE);
         break;
     }
 
     case PHYSDEVOP_pci_device_add: {
         struct physdev_pci_device_add add;
         struct pci_dev_info pdev_info;
+        u8 node;
 
         ret = -EFAULT;
         if ( copy_from_guest(&add, arg, 1) != 0 )
@@ -618,7 +620,22 @@ ret_t do_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
         }
         else
             pdev_info.is_virtfn = 0;
-        ret = pci_add_device(add.seg, add.bus, add.devfn, &pdev_info);
+
+        if ( add.flags & XEN_PCI_DEV_PXM )
+        {
+            uint32_t pxm;
+            size_t optarr_off = offsetof(struct physdev_pci_device_add, optarr) /
+                                sizeof(add.optarr[0]);
+
+            if ( copy_from_guest_offset(&pxm, arg, optarr_off, 1) )
+                break;
+
+            node = pxm_to_node(pxm);
+        }
+        else
+            node = NUMA_NO_NODE;
+
+        ret = pci_add_device(add.seg, add.bus, add.devfn, &pdev_info, node);
         break;
     }
 
diff --git a/xen/drivers/passthrough/pci.c b/xen/drivers/passthrough/pci.c
index 4b83583..7d3590d 100644
--- a/xen/drivers/passthrough/pci.c
+++ b/xen/drivers/passthrough/pci.c
@@ -568,7 +568,8 @@ static void pci_enable_acs(struct pci_dev *pdev)
     pci_conf_write16(seg, bus, dev, func, pos + PCI_ACS_CTRL, ctrl);
 }
 
-int pci_add_device(u16 seg, u8 bus, u8 devfn, const struct pci_dev_info *info)
+int pci_add_device(u16 seg, u8 bus, u8 devfn,
+                   const struct pci_dev_info *info, u8 node)
 {
     struct pci_seg *pseg;
     struct pci_dev *pdev;
@@ -586,7 +587,8 @@ int pci_add_device(u16 seg, u8 bus, u8 devfn, const struct pci_dev_info *info)
         pdev = pci_get_pdev(seg, info->physfn.bus, info->physfn.devfn);
         spin_unlock(&pcidevs_lock);
         if ( !pdev )
-            pci_add_device(seg, info->physfn.bus, info->physfn.devfn, NULL);
+            pci_add_device(seg, info->physfn.bus, info->physfn.devfn,
+                           NULL, node);
         pdev_type = "virtual function";
     }
     else
@@ -609,6 +611,8 @@ int pci_add_device(u16 seg, u8 bus, u8 devfn, const struct pci_dev_info *info)
     if ( !pdev )
         goto out;
 
+    pdev->node = node;
+
     if ( info )
         pdev->info = *info;
     else if ( !pdev->vf_rlen[0] )
@@ -1191,10 +1195,11 @@ static int _dump_pci_devices(struct pci_seg *pseg, void *arg)
 
     list_for_each_entry ( pdev, &pseg->alldevs_list, alldevs_list )
     {
-        printk("%04x:%02x:%02x.%u - dom %-3d - MSIs < ",
+        printk("%04x:%02x:%02x.%u - dom %-3d - node %-3d - MSIs < ",
                pseg->nr, pdev->bus,
                PCI_SLOT(pdev->devfn), PCI_FUNC(pdev->devfn),
-               pdev->domain ? pdev->domain->domain_id : -1);
+               pdev->domain ? pdev->domain->domain_id : -1,
+               (pdev->node != NUMA_NO_NODE) ? pdev->node : -1);
         list_for_each_entry ( msi, &pdev->msi_list, list )
                printk("%d ", msi->irq);
         printk(">\n");
diff --git a/xen/include/public/physdev.h b/xen/include/public/physdev.h
index 2683719..f33845d 100644
--- a/xen/include/public/physdev.h
+++ b/xen/include/public/physdev.h
@@ -293,6 +293,12 @@ struct physdev_pci_device_add {
         uint8_t bus;
         uint8_t devfn;
     } physfn;
+
+    /*
+     * Optional parameters array.
+     * First element ([0]) is PXM domain associated with the device (if
+     * XEN_PCI_DEV_PXM is set)
+     */
 #if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 199901L
     uint32_t optarr[];
 #elif defined(__GNUC__)
diff --git a/xen/include/xen/pci.h b/xen/include/xen/pci.h
index 3988ee6..daece64 100644
--- a/xen/include/xen/pci.h
+++ b/xen/include/xen/pci.h
@@ -57,6 +57,8 @@ struct pci_dev {
 
     u8 phantom_stride;
 
+    u8 node; /* NUMA node */
+
     enum pdev_type {
         DEV_TYPE_PCI_UNKNOWN,
         DEV_TYPE_PCIe_ENDPOINT,
@@ -103,7 +105,8 @@ void setup_hwdom_pci_devices(struct domain *,
 int pci_release_devices(struct domain *d);
 int pci_add_segment(u16 seg);
 const unsigned long *pci_get_ro_map(u16 seg);
-int pci_add_device(u16 seg, u8 bus, u8 devfn, const struct pci_dev_info *);
+int pci_add_device(u16 seg, u8 bus, u8 devfn,
+                   const struct pci_dev_info *, u8 node);
 int pci_remove_device(u16 seg, u8 bus, u8 devfn);
 int pci_ro_device(int seg, int bus, int devfn);
 void arch_pci_ro_device(int seg, int bdf);
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH v4 3/9] sysctl: Make XEN_SYSCTL_topologyinfo sysctl a little more efficient
  2015-03-10  2:27 [PATCH v4 0/9] Display IO topology when PXM data is available (plus some cleanup) Boris Ostrovsky
  2015-03-10  2:27 ` [PATCH v4 1/9] numa: __node_distance() should return u8 Boris Ostrovsky
  2015-03-10  2:27 ` [PATCH v4 2/9] pci: Stash device's PXM information in struct pci_dev Boris Ostrovsky
@ 2015-03-10  2:27 ` Boris Ostrovsky
  2015-03-10 14:29   ` Andrew Cooper
                     ` (2 more replies)
  2015-03-10  2:27 ` [PATCH v4 4/9] sysctl: Make XEN_SYSCTL_numainfo " Boris Ostrovsky
                   ` (6 subsequent siblings)
  9 siblings, 3 replies; 34+ messages in thread
From: Boris Ostrovsky @ 2015-03-10  2:27 UTC (permalink / raw)
  To: andrew.cooper3, jbeulich, keir, ian.jackson, stefano.stabellini,
	ian.campbell, wei.liu2
  Cc: elena.ufimtseva, dario.faggioli, boris.ostrovsky, xen-devel

Instead of copying data for each field in xen_sysctl_topologyinfo separately
put cpu/socket/node into a single structure and do a single copy for each
processor.

Do not use max_cpu_index, which is almost always used for calculating number
CPUs (thus requiring adding or subtracting one), replace it with num_cpus.

There is no need to copy whole op in sysctl to user at the end, we only need
num_cpus.

Rename xen_sysctl_topologyinfo and XEN_SYSCTL_topologyinfo to reflect the fact
that these are used for CPU topology. Subsequent patch will add support for
PCI topology sysctl.

Replace INVALID_TOPOLOGY_ID with "XEN_"-prefixed macros for each invalid type
(core, socket, node).

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
---

Changes in v4:
* Split v3's patch into two --- one for CPU topology and one for NUMA info
* Replaced max_cpu_index with num_cpus to avoid always adding/subtracting one
* Replaced INVALID_TOPOLOGY_ID with separate macros for core/socket/node
* No buffer allocation in sysctl, copy data cpu-by-cpu
* Check for BAD_APICID/NUMA_NO_NODE in sysctl

 tools/libxc/include/xenctrl.h     |    4 +-
 tools/libxc/xc_misc.c             |   10 ++--
 tools/libxl/libxl.c               |   46 +++++++-----------
 tools/misc/xenpm.c                |   93 +++++++++++++++---------------------
 tools/python/xen/lowlevel/xc/xc.c |   60 ++++++++++--------------
 xen/common/sysctl.c               |   66 +++++++++++++++++---------
 xen/include/public/sysctl.h       |   48 +++++++++++--------
 7 files changed, 160 insertions(+), 167 deletions(-)

diff --git a/tools/libxc/include/xenctrl.h b/tools/libxc/include/xenctrl.h
index 09d819f..fef4cca 100644
--- a/tools/libxc/include/xenctrl.h
+++ b/tools/libxc/include/xenctrl.h
@@ -1226,7 +1226,7 @@ int xc_readconsolering(xc_interface *xch,
 int xc_send_debug_keys(xc_interface *xch, char *keys);
 
 typedef xen_sysctl_physinfo_t xc_physinfo_t;
-typedef xen_sysctl_topologyinfo_t xc_topologyinfo_t;
+typedef xen_sysctl_cputopoinfo_t xc_cputopoinfo_t;
 typedef xen_sysctl_numainfo_t xc_numainfo_t;
 
 typedef uint32_t xc_cpu_to_node_t;
@@ -1237,7 +1237,7 @@ typedef uint64_t xc_node_to_memfree_t;
 typedef uint32_t xc_node_to_node_dist_t;
 
 int xc_physinfo(xc_interface *xch, xc_physinfo_t *info);
-int xc_topologyinfo(xc_interface *xch, xc_topologyinfo_t *info);
+int xc_cputopoinfo(xc_interface *xch, xc_cputopoinfo_t *info);
 int xc_numainfo(xc_interface *xch, xc_numainfo_t *info);
 
 int xc_sched_id(xc_interface *xch,
diff --git a/tools/libxc/xc_misc.c b/tools/libxc/xc_misc.c
index e253a58..be68291 100644
--- a/tools/libxc/xc_misc.c
+++ b/tools/libxc/xc_misc.c
@@ -177,20 +177,20 @@ int xc_physinfo(xc_interface *xch,
     return 0;
 }
 
-int xc_topologyinfo(xc_interface *xch,
-                xc_topologyinfo_t *put_info)
+int xc_cputopoinfo(xc_interface *xch,
+                   xc_cputopoinfo_t *put_info)
 {
     int ret;
     DECLARE_SYSCTL;
 
-    sysctl.cmd = XEN_SYSCTL_topologyinfo;
+    sysctl.cmd = XEN_SYSCTL_cputopoinfo;
 
-    memcpy(&sysctl.u.topologyinfo, put_info, sizeof(*put_info));
+    memcpy(&sysctl.u.cputopoinfo, put_info, sizeof(*put_info));
 
     if ( (ret = do_sysctl(xch, &sysctl)) != 0 )
         return ret;
 
-    memcpy(put_info, &sysctl.u.topologyinfo, sizeof(*put_info));
+    memcpy(put_info, &sysctl.u.cputopoinfo, sizeof(*put_info));
 
     return 0;
 }
diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
index a68f6ef..6660133 100644
--- a/tools/libxl/libxl.c
+++ b/tools/libxl/libxl.c
@@ -5036,10 +5036,8 @@ int libxl_get_physinfo(libxl_ctx *ctx, libxl_physinfo *physinfo)
 libxl_cputopology *libxl_get_cpu_topology(libxl_ctx *ctx, int *nb_cpu_out)
 {
     GC_INIT(ctx);
-    xc_topologyinfo_t tinfo;
-    DECLARE_HYPERCALL_BUFFER(xc_cpu_to_core_t, coremap);
-    DECLARE_HYPERCALL_BUFFER(xc_cpu_to_socket_t, socketmap);
-    DECLARE_HYPERCALL_BUFFER(xc_cpu_to_node_t, nodemap);
+    xc_cputopoinfo_t tinfo;
+    DECLARE_HYPERCALL_BUFFER(xen_sysctl_cputopo_t, cputopo);
     libxl_cputopology *ret = NULL;
     int i;
     int max_cpus;
@@ -5052,45 +5050,37 @@ libxl_cputopology *libxl_get_cpu_topology(libxl_ctx *ctx, int *nb_cpu_out)
         goto out;
     }
 
-    coremap = xc_hypercall_buffer_alloc
-        (ctx->xch, coremap, sizeof(*coremap) * max_cpus);
-    socketmap = xc_hypercall_buffer_alloc
-        (ctx->xch, socketmap, sizeof(*socketmap) * max_cpus);
-    nodemap = xc_hypercall_buffer_alloc
-        (ctx->xch, nodemap, sizeof(*nodemap) * max_cpus);
-    if ((coremap == NULL) || (socketmap == NULL) || (nodemap == NULL)) {
+    cputopo = xc_hypercall_buffer_alloc(ctx->xch, cputopo,
+                                        sizeof(*cputopo) * max_cpus);
+    if (cputopo == NULL) {
         LIBXL__LOG_ERRNOVAL(ctx, XTL_ERROR, ENOMEM,
                             "Unable to allocate hypercall arguments");
         goto fail;
     }
+    set_xen_guest_handle(tinfo.cputopo, cputopo);
+    tinfo.num_cpus = max_cpus;
 
-    set_xen_guest_handle(tinfo.cpu_to_core, coremap);
-    set_xen_guest_handle(tinfo.cpu_to_socket, socketmap);
-    set_xen_guest_handle(tinfo.cpu_to_node, nodemap);
-    tinfo.max_cpu_index = max_cpus - 1;
-    if (xc_topologyinfo(ctx->xch, &tinfo) != 0) {
-        LIBXL__LOG_ERRNO(ctx, XTL_ERROR, "Topology info hypercall failed");
+    if (xc_cputopoinfo(ctx->xch, &tinfo) != 0) {
+        LIBXL__LOG_ERRNO(ctx, XTL_ERROR, "CPU topology info hypercall failed");
         goto fail;
     }
 
-    if (tinfo.max_cpu_index < max_cpus - 1)
-        max_cpus = tinfo.max_cpu_index + 1;
+    if (tinfo.num_cpus < max_cpus)
+        max_cpus = tinfo.num_cpus;
 
     ret = libxl__zalloc(NOGC, sizeof(libxl_cputopology) * max_cpus);
 
     for (i = 0; i < max_cpus; i++) {
-#define V(map, i) (map[i] == INVALID_TOPOLOGY_ID) ? \
-    LIBXL_CPUTOPOLOGY_INVALID_ENTRY : map[i]
-        ret[i].core = V(coremap, i);
-        ret[i].socket = V(socketmap, i);
-        ret[i].node = V(nodemap, i);
+#define V(map, i, invalid) ( cputopo[i].map == invalid) ? \
+   LIBXL_CPUTOPOLOGY_INVALID_ENTRY : cputopo[i].map
+        ret[i].core = V(core, i, XEN_INVALID_CORE_ID);
+        ret[i].socket = V(socket, i, XEN_INVALID_SOCKET_ID);
+        ret[i].node = V(node, i, XEN_INVALID_NODE_ID);
 #undef V
     }
 
-fail:
-    xc_hypercall_buffer_free(ctx->xch, coremap);
-    xc_hypercall_buffer_free(ctx->xch, socketmap);
-    xc_hypercall_buffer_free(ctx->xch, nodemap);
+ fail:
+    xc_hypercall_buffer_free(ctx->xch, cputopo);
 
     if (ret)
         *nb_cpu_out = max_cpus;
diff --git a/tools/misc/xenpm.c b/tools/misc/xenpm.c
index e43924c..23d6b63 100644
--- a/tools/misc/xenpm.c
+++ b/tools/misc/xenpm.c
@@ -355,16 +355,13 @@ static void signal_int_handler(int signo)
     int i, j, k;
     struct timeval tv;
     int cx_cap = 0, px_cap = 0;
-    DECLARE_HYPERCALL_BUFFER(uint32_t, cpu_to_core);
-    DECLARE_HYPERCALL_BUFFER(uint32_t, cpu_to_socket);
-    DECLARE_HYPERCALL_BUFFER(uint32_t, cpu_to_node);
-    xc_topologyinfo_t info = { 0 };
+    DECLARE_HYPERCALL_BUFFER(xen_sysctl_cputopo_t, cputopo);
+    xc_cputopoinfo_t info = { 0 };
 
-    cpu_to_core = xc_hypercall_buffer_alloc(xc_handle, cpu_to_core, sizeof(*cpu_to_core) * MAX_NR_CPU);
-    cpu_to_socket = xc_hypercall_buffer_alloc(xc_handle, cpu_to_socket, sizeof(*cpu_to_socket) * MAX_NR_CPU);
-    cpu_to_node = xc_hypercall_buffer_alloc(xc_handle, cpu_to_node, sizeof(*cpu_to_node) * MAX_NR_CPU);
+    cputopo = xc_hypercall_buffer_alloc(xc_handle, cputopo,
+                                        sizeof(*cputopo) * MAX_NR_CPU);
 
-    if ( cpu_to_core == NULL || cpu_to_socket == NULL || cpu_to_node == NULL )
+    if ( cputopo == NULL )
     {
 	fprintf(stderr, "failed to allocate hypercall buffers\n");
 	goto out;
@@ -448,47 +445,45 @@ static void signal_int_handler(int signo)
             printf("  Avg freq\t%d\tKHz\n", avgfreq[i]);
     }
 
-    set_xen_guest_handle(info.cpu_to_core, cpu_to_core);
-    set_xen_guest_handle(info.cpu_to_socket, cpu_to_socket);
-    set_xen_guest_handle(info.cpu_to_node, cpu_to_node);
-    info.max_cpu_index = MAX_NR_CPU - 1;
+    set_xen_guest_handle(info.cputopo, cputopo);
+    info.num_cpus = MAX_NR_CPU;
 
-    if ( cx_cap && !xc_topologyinfo(xc_handle, &info) )
+    if ( cx_cap && !xc_cputopoinfo(xc_handle, &info) )
     {
         uint32_t socket_ids[MAX_NR_CPU];
         uint32_t core_ids[MAX_NR_CPU];
         uint32_t socket_nr = 0;
         uint32_t core_nr = 0;
 
-        if ( info.max_cpu_index > MAX_NR_CPU - 1 )
-            info.max_cpu_index = MAX_NR_CPU - 1;
+        if ( info.num_cpus > MAX_NR_CPU )
+            info.num_cpus = MAX_NR_CPU;
         /* check validity */
-        for ( i = 0; i <= info.max_cpu_index; i++ )
+        for ( i = 0; i < info.num_cpus; i++ )
         {
-            if ( cpu_to_core[i] == INVALID_TOPOLOGY_ID ||
-                 cpu_to_socket[i] == INVALID_TOPOLOGY_ID )
+            if ( cputopo[i].core == XEN_INVALID_CORE_ID ||
+                 cputopo[i].socket == XEN_INVALID_SOCKET_ID )
                 break;
         }
-        if ( i > info.max_cpu_index )
+        if ( i >= info.num_cpus )
         {
             /* find socket nr & core nr per socket */
-            for ( i = 0; i <= info.max_cpu_index; i++ )
+            for ( i = 0; i < info.num_cpus; i++ )
             {
                 for ( j = 0; j < socket_nr; j++ )
-                    if ( cpu_to_socket[i] == socket_ids[j] )
+                    if ( cputopo[i].socket == socket_ids[j] )
                         break;
                 if ( j == socket_nr )
                 {
-                    socket_ids[j] = cpu_to_socket[i];
+                    socket_ids[j] = cputopo[i].socket;
                     socket_nr++;
                 }
 
                 for ( j = 0; j < core_nr; j++ )
-                    if ( cpu_to_core[i] == core_ids[j] )
+                    if ( cputopo[i].core == core_ids[j] )
                         break;
                 if ( j == core_nr )
                 {
-                    core_ids[j] = cpu_to_core[i];
+                    core_ids[j] = cputopo[i].core;
                     core_nr++;
                 }
             }
@@ -499,9 +494,9 @@ static void signal_int_handler(int signo)
                 unsigned int n;
                 uint64_t res;
 
-                for ( j = 0; j <= info.max_cpu_index; j++ )
+                for ( j = 0; j < info.num_cpus; j++ )
                 {
-                    if ( cpu_to_socket[j] == socket_ids[i] )
+                    if ( cputopo[j].socket == socket_ids[i] )
                         break;
                 }
                 printf("\nSocket %d\n", socket_ids[i]);
@@ -518,10 +513,10 @@ static void signal_int_handler(int signo)
                 }
                 for ( k = 0; k < core_nr; k++ )
                 {
-                    for ( j = 0; j <= info.max_cpu_index; j++ )
+                    for ( j = 0; j < info.num_cpus; j++ )
                     {
-                        if ( cpu_to_socket[j] == socket_ids[i] &&
-                             cpu_to_core[j] == core_ids[k] )
+                        if ( cputopo[j].socket == socket_ids[i] &&
+                             cputopo[j].core == core_ids[k] )
                             break;
                     }
                     printf("\t Core %d CPU %d\n", core_ids[k], j);
@@ -556,9 +551,7 @@ static void signal_int_handler(int signo)
     free(sum);
     free(avgfreq);
 out:
-    xc_hypercall_buffer_free(xc_handle, cpu_to_core);
-    xc_hypercall_buffer_free(xc_handle, cpu_to_socket);
-    xc_hypercall_buffer_free(xc_handle, cpu_to_node);
+    xc_hypercall_buffer_free(xc_handle, cputopo);
     xc_interface_close(xc_handle);
     exit(0);
 }
@@ -965,28 +958,22 @@ void scaling_governor_func(int argc, char *argv[])
 
 void cpu_topology_func(int argc, char *argv[])
 {
-    DECLARE_HYPERCALL_BUFFER(uint32_t, cpu_to_core);
-    DECLARE_HYPERCALL_BUFFER(uint32_t, cpu_to_socket);
-    DECLARE_HYPERCALL_BUFFER(uint32_t, cpu_to_node);
-    xc_topologyinfo_t info = { 0 };
+    DECLARE_HYPERCALL_BUFFER(xen_sysctl_cputopo_t, cputopo);
+    xc_cputopoinfo_t info = { 0 };
     int i, rc = ENOMEM;
 
-    cpu_to_core = xc_hypercall_buffer_alloc(xc_handle, cpu_to_core, sizeof(*cpu_to_core) * MAX_NR_CPU);
-    cpu_to_socket = xc_hypercall_buffer_alloc(xc_handle, cpu_to_socket, sizeof(*cpu_to_socket) * MAX_NR_CPU);
-    cpu_to_node = xc_hypercall_buffer_alloc(xc_handle, cpu_to_node, sizeof(*cpu_to_node) * MAX_NR_CPU);
-
-    if ( cpu_to_core == NULL || cpu_to_socket == NULL || cpu_to_node == NULL )
+    cputopo = xc_hypercall_buffer_alloc(xc_handle, cputopo,
+                                        sizeof(*cputopo) * MAX_NR_CPU);
+    if ( cputopo == NULL )
     {
 	fprintf(stderr, "failed to allocate hypercall buffers\n");
 	goto out;
     }
 
-    set_xen_guest_handle(info.cpu_to_core, cpu_to_core);
-    set_xen_guest_handle(info.cpu_to_socket, cpu_to_socket);
-    set_xen_guest_handle(info.cpu_to_node, cpu_to_node);
-    info.max_cpu_index = MAX_NR_CPU-1;
+    set_xen_guest_handle(info.cputopo, cputopo);
+    info.num_cpus = MAX_NR_CPU;
 
-    if ( xc_topologyinfo(xc_handle, &info) )
+    if ( xc_cputopoinfo(xc_handle, &info) )
     {
         rc = errno;
         fprintf(stderr, "Cannot get Xen CPU topology (%d - %s)\n",
@@ -994,22 +981,20 @@ void cpu_topology_func(int argc, char *argv[])
         goto out;
     }
 
-    if ( info.max_cpu_index > (MAX_NR_CPU-1) )
-        info.max_cpu_index = MAX_NR_CPU-1;
+    if ( info.num_cpus > (MAX_NR_CPU) )
+        info.num_cpus = MAX_NR_CPU;
 
     printf("CPU\tcore\tsocket\tnode\n");
-    for ( i = 0; i <= info.max_cpu_index; i++ )
+    for ( i = 0; i < info.num_cpus; i++ )
     {
-        if ( cpu_to_core[i] == INVALID_TOPOLOGY_ID )
+        if ( cputopo[i].core == XEN_INVALID_CORE_ID )
             continue;
         printf("CPU%d\t %d\t %d\t %d\n",
-               i, cpu_to_core[i], cpu_to_socket[i], cpu_to_node[i]);
+               i, cputopo[i].core, cputopo[i].socket, cputopo[i].node);
     }
     rc = 0;
 out:
-    xc_hypercall_buffer_free(xc_handle, cpu_to_core);
-    xc_hypercall_buffer_free(xc_handle, cpu_to_socket);
-    xc_hypercall_buffer_free(xc_handle, cpu_to_node);
+    xc_hypercall_buffer_free(xc_handle, cputopo);
     if ( rc )
         exit(rc);
 }
diff --git a/tools/python/xen/lowlevel/xc/xc.c b/tools/python/xen/lowlevel/xc/xc.c
index 2aa0dc7..2fd93e0 100644
--- a/tools/python/xen/lowlevel/xc/xc.c
+++ b/tools/python/xen/lowlevel/xc/xc.c
@@ -1220,78 +1220,70 @@ static PyObject *pyxc_getcpuinfo(XcObject *self, PyObject *args, PyObject *kwds)
 
 static PyObject *pyxc_topologyinfo(XcObject *self)
 {
-#define MAX_CPU_INDEX 255
-    xc_topologyinfo_t tinfo = { 0 };
-    int i, max_cpu_index;
+#define MAX_CPUS 256
+    xc_cputopoinfo_t tinfo = { 0 };
+    unsigned i, num_cpus;
     PyObject *ret_obj = NULL;
     PyObject *cpu_to_core_obj, *cpu_to_socket_obj, *cpu_to_node_obj;
-    DECLARE_HYPERCALL_BUFFER(xc_cpu_to_core_t, coremap);
-    DECLARE_HYPERCALL_BUFFER(xc_cpu_to_socket_t, socketmap);
-    DECLARE_HYPERCALL_BUFFER(xc_cpu_to_node_t, nodemap);
 
-    coremap = xc_hypercall_buffer_alloc(self->xc_handle, coremap, sizeof(*coremap) * (MAX_CPU_INDEX+1));
-    if ( coremap == NULL )
-        goto out;
-    socketmap = xc_hypercall_buffer_alloc(self->xc_handle, socketmap, sizeof(*socketmap) * (MAX_CPU_INDEX+1));
-    if ( socketmap == NULL  )
-        goto out;
-    nodemap = xc_hypercall_buffer_alloc(self->xc_handle, nodemap, sizeof(*nodemap) * (MAX_CPU_INDEX+1));
-    if ( nodemap == NULL )
-        goto out;
+    DECLARE_HYPERCALL_BUFFER(xen_sysctl_cputopo_t, cputopo);
+
+    cputopo = xc_hypercall_buffer_alloc(self->xc_handle, cputopo,
+                                        sizeof(*cputopo) * (MAX_CPUS));
+    if ( cputopo == NULL )
+    	goto out;
 
-    set_xen_guest_handle(tinfo.cpu_to_core, coremap);
-    set_xen_guest_handle(tinfo.cpu_to_socket, socketmap);
-    set_xen_guest_handle(tinfo.cpu_to_node, nodemap);
-    tinfo.max_cpu_index = MAX_CPU_INDEX;
+    set_xen_guest_handle(tinfo.cputopo, cputopo);
+    tinfo.num_cpus = MAX_CPUS;
 
-    if ( xc_topologyinfo(self->xc_handle, &tinfo) != 0 )
+    if ( xc_cputopoinfo(self->xc_handle, &tinfo) != 0 )
         goto out;
 
-    max_cpu_index = tinfo.max_cpu_index;
-    if ( max_cpu_index > MAX_CPU_INDEX )
-        max_cpu_index = MAX_CPU_INDEX;
+    num_cpus = tinfo.num_cpus;
+    if ( num_cpus > MAX_CPUS )
+        num_cpus = MAX_CPUS;
 
     /* Construct cpu-to-* lists. */
     cpu_to_core_obj = PyList_New(0);
     cpu_to_socket_obj = PyList_New(0);
     cpu_to_node_obj = PyList_New(0);
-    for ( i = 0; i <= max_cpu_index; i++ )
+    for ( i = 0; i < num_cpus; i++ )
     {
-        if ( coremap[i] == INVALID_TOPOLOGY_ID )
+        if ( cputopo[i].core == XEN_INVALID_CORE_ID )
         {
             PyList_Append(cpu_to_core_obj, Py_None);
         }
         else
         {
-            PyObject *pyint = PyInt_FromLong(coremap[i]);
+            PyObject *pyint = PyInt_FromLong(cputopo[i].core);
             PyList_Append(cpu_to_core_obj, pyint);
             Py_DECREF(pyint);
         }
 
-        if ( socketmap[i] == INVALID_TOPOLOGY_ID )
+        if ( cputopo[i].socket == XEN_INVALID_SOCKET_ID )
         {
             PyList_Append(cpu_to_socket_obj, Py_None);
         }
         else
         {
-            PyObject *pyint = PyInt_FromLong(socketmap[i]);
+            PyObject *pyint = PyInt_FromLong(cputopo[i].socket);
             PyList_Append(cpu_to_socket_obj, pyint);
             Py_DECREF(pyint);
         }
 
-        if ( nodemap[i] == INVALID_TOPOLOGY_ID )
+        if ( cputopo[i].node == XEN_INVALID_NODE_ID )
         {
             PyList_Append(cpu_to_node_obj, Py_None);
         }
         else
         {
-            PyObject *pyint = PyInt_FromLong(nodemap[i]);
+            PyObject *pyint = PyInt_FromLong(cputopo[i].node);
             PyList_Append(cpu_to_node_obj, pyint);
             Py_DECREF(pyint);
         }
     }
 
-    ret_obj = Py_BuildValue("{s:i}", "max_cpu_index", max_cpu_index);
+    ret_obj = Py_BuildValue("{s:i}", "max_cpu_index", num_cpus + 1);
 
     PyDict_SetItemString(ret_obj, "cpu_to_core", cpu_to_core_obj);
     Py_DECREF(cpu_to_core_obj);
@@ -1303,9 +1295,7 @@ static PyObject *pyxc_topologyinfo(XcObject *self)
     Py_DECREF(cpu_to_node_obj);
 
 out:
-    xc_hypercall_buffer_free(self->xc_handle, coremap);
-    xc_hypercall_buffer_free(self->xc_handle, socketmap);
-    xc_hypercall_buffer_free(self->xc_handle, nodemap);
+    xc_hypercall_buffer_free(self->xc_handle, cputopo);
     return ret_obj ? ret_obj : pyxc_error_to_exception(self->xc_handle);
 #undef MAX_CPU_INDEX
 }
@@ -1375,7 +1365,7 @@ static PyObject *pyxc_numainfo(XcObject *self)
         for ( j = 0; j <= max_node_index; j++ )
         {
             uint32_t dist = nodes_dist[i*(max_node_index+1) + j];
-            if ( dist == INVALID_TOPOLOGY_ID )
+            if ( dist == ~0u )
             {
                 PyList_Append(node_to_node_dist_obj, Py_None);
             }
diff --git a/xen/common/sysctl.c b/xen/common/sysctl.c
index 0cb6ee1..fe48ee8 100644
--- a/xen/common/sysctl.c
+++ b/xen/common/sysctl.c
@@ -320,39 +320,61 @@ long do_sysctl(XEN_GUEST_HANDLE_PARAM(xen_sysctl_t) u_sysctl)
     }
     break;
 
-    case XEN_SYSCTL_topologyinfo:
+    case XEN_SYSCTL_cputopoinfo:
     {
-        uint32_t i, max_cpu_index, last_online_cpu;
-        xen_sysctl_topologyinfo_t *ti = &op->u.topologyinfo;
+        uint32_t i, num_cpus;
+        xen_sysctl_cputopoinfo_t *ti = &op->u.cputopoinfo;
 
-        last_online_cpu = cpumask_last(&cpu_online_map);
-        max_cpu_index = min_t(uint32_t, ti->max_cpu_index, last_online_cpu);
-        ti->max_cpu_index = last_online_cpu;
+        if ( guest_handle_is_null(ti->cputopo) )
+        {
+            ret = -EINVAL;
+            break;
+        }
 
-        for ( i = 0; i <= max_cpu_index; i++ )
+        num_cpus = cpumask_last(&cpu_online_map) + 1;
+        if ( ti->num_cpus != num_cpus )
         {
-            if ( !guest_handle_is_null(ti->cpu_to_core) )
+            uint32_t array_sz = ti->num_cpus;
+
+            ti->num_cpus = num_cpus;
+            if ( __copy_field_to_guest(u_sysctl, op,
+                                       u.cputopoinfo.num_cpus) )
             {
-                uint32_t core = cpu_online(i) ? cpu_to_core(i) : ~0u;
-                if ( copy_to_guest_offset(ti->cpu_to_core, i, &core, 1) )
-                    break;
+                ret = -EFAULT;
+                break;
+            }
+            num_cpus = min_t(uint32_t, array_sz, num_cpus);
+        }
+
+        for ( i = 0; i < num_cpus; i++ )
+        {
+            xen_sysctl_cputopo_t cputopo;
+
+            if ( cpu_present(i) )
+            {
+                cputopo.core = cpu_to_core(i);
+                if ( cputopo.core == BAD_APICID )
+                    cputopo.core = XEN_INVALID_CORE_ID;
+                cputopo.socket = cpu_to_socket(i);
+                if ( cputopo.socket == BAD_APICID )
+                    cputopo.socket = XEN_INVALID_SOCKET_ID;
+                cputopo.node = cpu_to_node(i);
+                if ( cputopo.node == NUMA_NO_NODE )
+                    cputopo.node = XEN_INVALID_NODE_ID;
             }
-            if ( !guest_handle_is_null(ti->cpu_to_socket) )
+            else
             {
-                uint32_t socket = cpu_online(i) ? cpu_to_socket(i) : ~0u;
-                if ( copy_to_guest_offset(ti->cpu_to_socket, i, &socket, 1) )
-                    break;
+                cputopo.core = XEN_INVALID_CORE_ID;
+                cputopo.socket = XEN_INVALID_SOCKET_ID;
+                cputopo.node = XEN_INVALID_NODE_ID;
             }
-            if ( !guest_handle_is_null(ti->cpu_to_node) )
+
+            if ( copy_to_guest_offset(ti->cputopo, i, &cputopo, 1) )
             {
-                uint32_t node = cpu_online(i) ? cpu_to_node(i) : ~0u;
-                if ( copy_to_guest_offset(ti->cpu_to_node, i, &node, 1) )
-                    break;
+                ret = -EFAULT;
+                break;
             }
         }
-
-        ret = ((i <= max_cpu_index) || copy_to_guest(u_sysctl, op, 1))
-            ? -EFAULT : 0;
     }
     break;
 
diff --git a/xen/include/public/sysctl.h b/xen/include/public/sysctl.h
index 8552dc6..f20da69 100644
--- a/xen/include/public/sysctl.h
+++ b/xen/include/public/sysctl.h
@@ -462,31 +462,37 @@ struct xen_sysctl_lockprof_op {
 typedef struct xen_sysctl_lockprof_op xen_sysctl_lockprof_op_t;
 DEFINE_XEN_GUEST_HANDLE(xen_sysctl_lockprof_op_t);
 
-/* XEN_SYSCTL_topologyinfo */
-#define INVALID_TOPOLOGY_ID  (~0U)
-struct xen_sysctl_topologyinfo {
+/* XEN_SYSCTL_cputopoinfo */
+#define XEN_INVALID_CORE_ID     (~0U)
+#define XEN_INVALID_SOCKET_ID   (~0U)
+#define XEN_INVALID_NODE_ID     ((uint8_t)~0)
+
+struct xen_sysctl_cputopo {
+    uint32_t core;
+    uint32_t socket;
+    uint8_t node;
+};
+typedef struct xen_sysctl_cputopo xen_sysctl_cputopo_t;
+DEFINE_XEN_GUEST_HANDLE(xen_sysctl_cputopo_t);
+
+struct xen_sysctl_cputopoinfo {
     /*
-     * IN: maximum addressable entry in the caller-provided arrays.
-     * OUT: largest cpu identifier in the system.
-     * If OUT is greater than IN then the arrays are truncated!
-     * If OUT is leass than IN then the array tails are not written by sysctl.
+     * IN: size of caller-provided cputopo array.
+     * OUT: Number of CPUs in the system.
      */
-    uint32_t max_cpu_index;
+    uint32_t num_cpus;
 
     /*
-     * If not NULL, these arrays are filled with core/socket/node identifier
-     * for each cpu.
-     * If a cpu has no core/socket/node information (e.g., cpu not present) 
-     * then the sentinel value ~0u is written to each array.
-     * The number of array elements written by the sysctl is:
-     *   min(@max_cpu_index_IN,@max_cpu_index_OUT)+1
+     * If not NULL, filled with core/socket/node identifier for each cpu
+     * If information for a particular entry is not avalable it is set to
+     * XEN_INVALID_<xxx>_ID.
+     * The number of array elements for CPU topology written by the sysctl is:
+     *   min(@num_cpus_IN,@num_cpus_OUT)+1.
      */
-    XEN_GUEST_HANDLE_64(uint32) cpu_to_core;
-    XEN_GUEST_HANDLE_64(uint32) cpu_to_socket;
-    XEN_GUEST_HANDLE_64(uint32) cpu_to_node;
+    XEN_GUEST_HANDLE_64(xen_sysctl_cputopo_t) cputopo;
 };
-typedef struct xen_sysctl_topologyinfo xen_sysctl_topologyinfo_t;
-DEFINE_XEN_GUEST_HANDLE(xen_sysctl_topologyinfo_t);
+typedef struct xen_sysctl_cputopoinfo xen_sysctl_cputopoinfo_t;
+DEFINE_XEN_GUEST_HANDLE(xen_sysctl_cputopoinfo_t);
 
 /* XEN_SYSCTL_numainfo */
 #define INVALID_NUMAINFO_ID (~0U)
@@ -672,7 +678,7 @@ struct xen_sysctl {
 #define XEN_SYSCTL_pm_op                         12
 #define XEN_SYSCTL_page_offline_op               14
 #define XEN_SYSCTL_lockprof_op                   15
-#define XEN_SYSCTL_topologyinfo                  16 
+#define XEN_SYSCTL_cputopoinfo                   16
 #define XEN_SYSCTL_numainfo                      17
 #define XEN_SYSCTL_cpupool_op                    18
 #define XEN_SYSCTL_scheduler_op                  19
@@ -683,7 +689,7 @@ struct xen_sysctl {
         struct xen_sysctl_readconsole       readconsole;
         struct xen_sysctl_tbuf_op           tbuf_op;
         struct xen_sysctl_physinfo          physinfo;
-        struct xen_sysctl_topologyinfo      topologyinfo;
+        struct xen_sysctl_cputopoinfo       cputopoinfo;
         struct xen_sysctl_numainfo          numainfo;
         struct xen_sysctl_sched_id          sched_id;
         struct xen_sysctl_perfc_op          perfc_op;
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH v4 4/9] sysctl: Make XEN_SYSCTL_numainfo a little more efficient
  2015-03-10  2:27 [PATCH v4 0/9] Display IO topology when PXM data is available (plus some cleanup) Boris Ostrovsky
                   ` (2 preceding siblings ...)
  2015-03-10  2:27 ` [PATCH v4 3/9] sysctl: Make XEN_SYSCTL_topologyinfo sysctl a little more efficient Boris Ostrovsky
@ 2015-03-10  2:27 ` Boris Ostrovsky
  2015-03-10 14:34   ` Andrew Cooper
                     ` (2 more replies)
  2015-03-10  2:27 ` [PATCH v4 5/9] sysctl: Add sysctl interface for querying PCI topology Boris Ostrovsky
                   ` (5 subsequent siblings)
  9 siblings, 3 replies; 34+ messages in thread
From: Boris Ostrovsky @ 2015-03-10  2:27 UTC (permalink / raw)
  To: andrew.cooper3, jbeulich, keir, ian.jackson, stefano.stabellini,
	ian.campbell, wei.liu2
  Cc: elena.ufimtseva, dario.faggioli, boris.ostrovsky, xen-devel

Make sysctl NUMA topology query use fewer copies by combining some
fields into a single structure and copying distances for each node
in a single copy.

Instead of using max_node_index for passing number of nodes keep this
value in num_nodes: almost all uses of max_node_index required adding
or subtracting one to eventually get to number of nodes anyway.

Replace INVALID_NUMAINFO_ID with XEN_INVALID_MEM_SZ and add
XEN_INVALID_NODE_DIST.

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
---

Changes in v4:
* Split this patch from CPU topology changes
* Replace max_node_index with num_nodes (for the same reason as in previous patch)
* No buffer allocation in sysctl, copy data node-by-node
* Replaced INVALID_NUMAINFO_ID with XEN_INVALID_MEM_SZ, added XEN_INVALID_NODE_DIST

 tools/libxl/libxl.c               |   50 ++++++++++++-------------
 tools/python/xen/lowlevel/xc/xc.c |   56 ++++++++++++++---------------
 xen/common/sysctl.c               |   72 +++++++++++++++++++++----------------
 xen/include/public/sysctl.h       |   32 +++++++++++------
 4 files changed, 113 insertions(+), 97 deletions(-)

diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
index 6660133..8a6f979 100644
--- a/tools/libxl/libxl.c
+++ b/tools/libxl/libxl.c
@@ -5093,9 +5093,8 @@ libxl_numainfo *libxl_get_numainfo(libxl_ctx *ctx, int *nr)
 {
     GC_INIT(ctx);
     xc_numainfo_t ninfo;
-    DECLARE_HYPERCALL_BUFFER(xc_node_to_memsize_t, memsize);
-    DECLARE_HYPERCALL_BUFFER(xc_node_to_memfree_t, memfree);
-    DECLARE_HYPERCALL_BUFFER(uint32_t, node_dists);
+    DECLARE_HYPERCALL_BUFFER(xen_sysctl_meminfo_t, meminfo);
+    DECLARE_HYPERCALL_BUFFER(uint8_t, distance);
     libxl_numainfo *ret = NULL;
     int i, j, max_nodes;
 
@@ -5107,51 +5106,50 @@ libxl_numainfo *libxl_get_numainfo(libxl_ctx *ctx, int *nr)
         goto out;
     }
 
-    memsize = xc_hypercall_buffer_alloc
-        (ctx->xch, memsize, sizeof(*memsize) * max_nodes);
-    memfree = xc_hypercall_buffer_alloc
-        (ctx->xch, memfree, sizeof(*memfree) * max_nodes);
-    node_dists = xc_hypercall_buffer_alloc
-        (ctx->xch, node_dists, sizeof(*node_dists) * max_nodes * max_nodes);
-    if ((memsize == NULL) || (memfree == NULL) || (node_dists == NULL)) {
+    meminfo = xc_hypercall_buffer_alloc(ctx->xch, meminfo,
+                                        sizeof(*meminfo) * max_nodes);
+    distance = xc_hypercall_buffer_alloc(ctx->xch, distance,
+                                         sizeof(*distance) *
+                                         max_nodes * max_nodes);
+    if ((meminfo == NULL) || (distance == NULL)) {
         LIBXL__LOG_ERRNOVAL(ctx, XTL_ERROR, ENOMEM,
                             "Unable to allocate hypercall arguments");
         goto fail;
     }
 
-    set_xen_guest_handle(ninfo.node_to_memsize, memsize);
-    set_xen_guest_handle(ninfo.node_to_memfree, memfree);
-    set_xen_guest_handle(ninfo.node_to_node_distance, node_dists);
-    ninfo.max_node_index = max_nodes - 1;
+    set_xen_guest_handle(ninfo.meminfo, meminfo);
+    set_xen_guest_handle(ninfo.distance, distance);
+    ninfo.num_nodes = max_nodes;
     if (xc_numainfo(ctx->xch, &ninfo) != 0) {
         LIBXL__LOG_ERRNO(ctx, LIBXL__LOG_ERROR, "getting numainfo");
         goto fail;
     }
 
-    if (ninfo.max_node_index < max_nodes - 1)
-        max_nodes = ninfo.max_node_index + 1;
+    if (ninfo.num_nodes < max_nodes)
+        max_nodes = ninfo.num_nodes;
 
     *nr = max_nodes;
 
     ret = libxl__zalloc(NOGC, sizeof(libxl_numainfo) * max_nodes);
     for (i = 0; i < max_nodes; i++)
-        ret[i].dists = libxl__calloc(NOGC, max_nodes, sizeof(*node_dists));
+        ret[i].dists = libxl__calloc(NOGC, max_nodes, sizeof(*distance));
 
     for (i = 0; i < max_nodes; i++) {
-#define V(mem, i) (mem[i] == INVALID_NUMAINFO_ID) ? \
-    LIBXL_NUMAINFO_INVALID_ENTRY : mem[i]
-        ret[i].size = V(memsize, i);
-        ret[i].free = V(memfree, i);
+#define V(val, invalid) (val == invalid) ? \
+       LIBXL_NUMAINFO_INVALID_ENTRY : val
+        ret[i].size = V(meminfo[i].memsize, XEN_INVALID_MEM_SZ);
+        ret[i].free = V(meminfo[i].memfree, XEN_INVALID_MEM_SZ);
         ret[i].num_dists = max_nodes;
-        for (j = 0; j < ret[i].num_dists; j++)
-            ret[i].dists[j] = V(node_dists, i * max_nodes + j);
+        for (j = 0; j < ret[i].num_dists; j++) {
+            unsigned idx = i * max_nodes + j;
+            ret[i].dists[j] = V(distance[idx], XEN_INVALID_NODE_DIST);
+        }
 #undef V
     }
 
  fail:
-    xc_hypercall_buffer_free(ctx->xch, memsize);
-    xc_hypercall_buffer_free(ctx->xch, memfree);
-    xc_hypercall_buffer_free(ctx->xch, node_dists);
+    xc_hypercall_buffer_free(ctx->xch, meminfo);
+    xc_hypercall_buffer_free(ctx->xch, distance);
 
  out:
     GC_FREE;
diff --git a/tools/python/xen/lowlevel/xc/xc.c b/tools/python/xen/lowlevel/xc/xc.c
index 2fd93e0..edbaf60 100644
--- a/tools/python/xen/lowlevel/xc/xc.c
+++ b/tools/python/xen/lowlevel/xc/xc.c
@@ -1302,55 +1302,53 @@ out:
 
 static PyObject *pyxc_numainfo(XcObject *self)
 {
-#define MAX_NODE_INDEX 31
+#define MAX_NODES 32
     xc_numainfo_t ninfo = { 0 };
-    int i, j, max_node_index;
+    unsigned i, j, num_nodes;
     uint64_t free_heap;
     PyObject *ret_obj = NULL, *node_to_node_dist_list_obj;
     PyObject *node_to_memsize_obj, *node_to_memfree_obj;
     PyObject *node_to_dma32_mem_obj, *node_to_node_dist_obj;
-    DECLARE_HYPERCALL_BUFFER(xc_node_to_memsize_t, node_memsize);
-    DECLARE_HYPERCALL_BUFFER(xc_node_to_memfree_t, node_memfree);
-    DECLARE_HYPERCALL_BUFFER(xc_node_to_node_dist_t, nodes_dist);
+    DECLARE_HYPERCALL_BUFFER(xen_sysctl_meminfo_t, meminfo);
+    DECLARE_HYPERCALL_BUFFER(uint8_t, distance);
 
-    node_memsize = xc_hypercall_buffer_alloc(self->xc_handle, node_memsize, sizeof(*node_memsize)*(MAX_NODE_INDEX+1));
-    if ( node_memsize == NULL )
+    meminfo = xc_hypercall_buffer_alloc(self->xc_handle, meminfo,
+                                        sizeof(*meminfo) * MAX_NODES);
+    if ( meminfo == NULL )
         goto out;
-    node_memfree = xc_hypercall_buffer_alloc(self->xc_handle, node_memfree, sizeof(*node_memfree)*(MAX_NODE_INDEX+1));
-    if ( node_memfree == NULL )
-        goto out;
-    nodes_dist = xc_hypercall_buffer_alloc(self->xc_handle, nodes_dist, sizeof(*nodes_dist)*(MAX_NODE_INDEX+1)*(MAX_NODE_INDEX+1));
-    if ( nodes_dist == NULL )
+    distance = xc_hypercall_buffer_alloc(self->xc_handle, distance,
+                                         sizeof(*distance) * MAX_NODES * MAX_NODES);
+    if ( distance == NULL )
         goto out;
 
-    set_xen_guest_handle(ninfo.node_to_memsize, node_memsize);
-    set_xen_guest_handle(ninfo.node_to_memfree, node_memfree);
-    set_xen_guest_handle(ninfo.node_to_node_distance, nodes_dist);
-    ninfo.max_node_index = MAX_NODE_INDEX;
+    set_xen_guest_handle(ninfo.meminfo, meminfo);
+    set_xen_guest_handle(ninfo.distance, distance);
+    ninfo.num_nodes = MAX_NODES;
 
     if ( xc_numainfo(self->xc_handle, &ninfo) != 0 )
         goto out;
 
-    max_node_index = ninfo.max_node_index;
-    if ( max_node_index > MAX_NODE_INDEX )
-        max_node_index = MAX_NODE_INDEX;
+    num_nodes = ninfo.num_nodes;
+    if ( num_nodes > MAX_NODES )
+        num_nodes = MAX_NODES;
 
     /* Construct node-to-* lists. */
     node_to_memsize_obj = PyList_New(0);
     node_to_memfree_obj = PyList_New(0);
     node_to_dma32_mem_obj = PyList_New(0);
     node_to_node_dist_list_obj = PyList_New(0);
-    for ( i = 0; i <= max_node_index; i++ )
+    for ( i = 0; i < num_nodes; i++ )
     {
         PyObject *pyint;
+        unsigned invalid_node;
 
         /* Total Memory */
-        pyint = PyInt_FromLong(node_memsize[i] >> 20); /* MB */
+        pyint = PyInt_FromLong(meminfo[i].memsize >> 20); /* MB */
         PyList_Append(node_to_memsize_obj, pyint);
         Py_DECREF(pyint);
 
         /* Free Memory */
-        pyint = PyInt_FromLong(node_memfree[i] >> 20); /* MB */
+        pyint = PyInt_FromLong(meminfo[i].memfree >> 20); /* MB */
         PyList_Append(node_to_memfree_obj, pyint);
         Py_DECREF(pyint);
 
@@ -1362,10 +1360,11 @@ static PyObject *pyxc_numainfo(XcObject *self)
 
         /* Node to Node Distance */
         node_to_node_dist_obj = PyList_New(0);
-        for ( j = 0; j <= max_node_index; j++ )
+        invalid_node = (meminfo[i].memsize == XEN_INVALID_MEM_SZ);
+        for ( j = 0; j < num_nodes; j++ )
         {
-            uint32_t dist = nodes_dist[i*(max_node_index+1) + j];
-            if ( dist == ~0u )
+            uint8_t dist = distance[i * num_nodes + j];
+            if ( invalid_node || (dist == XEN_INVALID_NODE_DIST) )
             {
                 PyList_Append(node_to_node_dist_obj, Py_None);
             }
@@ -1380,7 +1379,7 @@ static PyObject *pyxc_numainfo(XcObject *self)
         Py_DECREF(node_to_node_dist_obj);
     }
 
-    ret_obj = Py_BuildValue("{s:i}", "max_node_index", max_node_index);
+    ret_obj = Py_BuildValue("{s:i}", "max_node_index", num_nodes + 1);
 
     PyDict_SetItemString(ret_obj, "node_memsize", node_to_memsize_obj);
     Py_DECREF(node_to_memsize_obj);
@@ -1396,9 +1395,8 @@ static PyObject *pyxc_numainfo(XcObject *self)
     Py_DECREF(node_to_node_dist_list_obj);
 
 out:
-    xc_hypercall_buffer_free(self->xc_handle, node_memsize);
-    xc_hypercall_buffer_free(self->xc_handle, node_memfree);
-    xc_hypercall_buffer_free(self->xc_handle, nodes_dist);
+    xc_hypercall_buffer_free(self->xc_handle, meminfo);
+    xc_hypercall_buffer_free(self->xc_handle, distance);
     return ret_obj ? ret_obj : pyxc_error_to_exception(self->xc_handle);
 #undef MAX_NODE_INDEX
 }
diff --git a/xen/common/sysctl.c b/xen/common/sysctl.c
index fe48ee8..2d11a76 100644
--- a/xen/common/sysctl.c
+++ b/xen/common/sysctl.c
@@ -274,49 +274,59 @@ long do_sysctl(XEN_GUEST_HANDLE_PARAM(xen_sysctl_t) u_sysctl)
 
     case XEN_SYSCTL_numainfo:
     {
-        uint32_t i, j, max_node_index, last_online_node;
+        uint32_t i, j, num_nodes;
         xen_sysctl_numainfo_t *ni = &op->u.numainfo;
 
-        last_online_node = last_node(node_online_map);
-        max_node_index = min_t(uint32_t, ni->max_node_index, last_online_node);
-        ni->max_node_index = last_online_node;
+        if ( guest_handle_is_null(ni->meminfo) ||
+             guest_handle_is_null(ni->distance) )
+        {
+            ret = -EINVAL;
+            break;
+        }
 
-        for ( i = 0; i <= max_node_index; i++ )
+        num_nodes = last_node(node_online_map) + 1;
+        if ( ni->num_nodes != num_nodes )
         {
-            if ( !guest_handle_is_null(ni->node_to_memsize) )
+            uint32_t array_sz = ni->num_nodes;
+
+            ni->num_nodes = num_nodes;
+            if ( __copy_field_to_guest(u_sysctl, op,
+                                       u.numainfo.num_nodes) )
             {
-                uint64_t memsize = node_online(i) ?
-                                   node_spanned_pages(i) << PAGE_SHIFT : 0ul;
-                if ( copy_to_guest_offset(ni->node_to_memsize, i, &memsize, 1) )
-                    break;
+                ret = -EFAULT;
+                break;
+            }
+            num_nodes = min_t(uint32_t, array_sz, num_nodes);
+        }
+
+        for ( i = 0; i < num_nodes; i++ )
+        {
+            xen_sysctl_meminfo_t meminfo;
+            uint8_t distance[MAX_NUMNODES];
+
+            if ( node_online(i) )
+            {
+                meminfo.memsize = node_spanned_pages(i) << PAGE_SHIFT;
+                meminfo.memfree = avail_node_heap_pages(i) << PAGE_SHIFT;
             }
-            if ( !guest_handle_is_null(ni->node_to_memfree) )
+            else
+                meminfo.memsize = meminfo.memfree = XEN_INVALID_MEM_SZ;
+
+            for ( j = 0; j < num_nodes; j++ )
             {
-                uint64_t memfree = node_online(i) ?
-                                   avail_node_heap_pages(i) << PAGE_SHIFT : 0ul;
-                if ( copy_to_guest_offset(ni->node_to_memfree, i, &memfree, 1) )
-                    break;
+                distance[j] = __node_distance(i, j);
+                if ( distance[j] == NUMA_NO_DISTANCE )
+                    distance[j] = XEN_INVALID_NODE_DIST;
             }
 
-            if ( !guest_handle_is_null(ni->node_to_node_distance) )
+            if ( copy_to_guest_offset(ni->distance, i * num_nodes,
+                                      distance, num_nodes) ||
+                 copy_to_guest_offset(ni->meminfo, i, &meminfo, 1) )
             {
-                for ( j = 0; j <= max_node_index; j++)
-                {
-                    uint32_t distance = ~0u;
-                    if ( node_online(i) && node_online(j) )
-                        distance = __node_distance(i, j);
-                    if ( copy_to_guest_offset(
-                        ni->node_to_node_distance,
-                        i*(max_node_index+1) + j, &distance, 1) )
-                        break;
-                }
-                if ( j <= max_node_index )
-                    break;
+                ret = -EFAULT;
+                break;
             }
         }
-
-        ret = ((i <= max_node_index) || copy_to_guest(u_sysctl, op, 1))
-            ? -EFAULT : 0;
     }
     break;
 
diff --git a/xen/include/public/sysctl.h b/xen/include/public/sysctl.h
index f20da69..c544c76 100644
--- a/xen/include/public/sysctl.h
+++ b/xen/include/public/sysctl.h
@@ -495,23 +495,33 @@ typedef struct xen_sysctl_cputopoinfo xen_sysctl_cputopoinfo_t;
 DEFINE_XEN_GUEST_HANDLE(xen_sysctl_cputopoinfo_t);
 
 /* XEN_SYSCTL_numainfo */
-#define INVALID_NUMAINFO_ID (~0U)
+#define XEN_INVALID_MEM_SZ     (~0U)
+#define XEN_INVALID_NODE_DIST  ((uint8_t)~0)
+
+struct xen_sysctl_meminfo {
+    uint64_t memsize;
+    uint64_t memfree;
+};
+typedef struct xen_sysctl_meminfo xen_sysctl_meminfo_t;
+DEFINE_XEN_GUEST_HANDLE(xen_sysctl_meminfo_t);
+
 struct xen_sysctl_numainfo {
     /*
-     * IN: maximum addressable entry in the caller-provided arrays.
-     * OUT: largest node identifier in the system.
-     * If OUT is greater than IN then the arrays are truncated!
+     * IN: size of caller-provided arrays.
+     * OUT: number of nodes in the system.
      */
-    uint32_t max_node_index;
+    uint32_t num_nodes;
 
-    /* NB. Entries are 0 if node is not present. */
-    XEN_GUEST_HANDLE_64(uint64) node_to_memsize;
-    XEN_GUEST_HANDLE_64(uint64) node_to_memfree;
+    /*
+     * OUT: max_node_index-sized array. Entries are XEN_INVALID_MEM_SZ
+     * if node is not present.
+     */
+    XEN_GUEST_HANDLE_64(xen_sysctl_meminfo_t) meminfo;
 
     /*
-     * Array, of size (max_node_index+1)^2, listing memory access distances
+     * OUT: Array, of size (max_node_index+1)^2, listing memory access distances
      * between nodes. If an entry has no node distance information (e.g., node 
-     * not present) then the value ~0u is written.
+     * not present) then the value XEN_INVALID_NODE_DIST is written.
      * 
      * Note that the array rows must be indexed by multiplying by the minimum 
      * of the caller-provided max_node_index and the returned value of
@@ -522,7 +532,7 @@ struct xen_sysctl_numainfo {
      * in the system is larger than the caller can handle, then a 2-d array of
      * the maximum size handleable by the caller is constructed.
      */
-    XEN_GUEST_HANDLE_64(uint32) node_to_node_distance;
+    XEN_GUEST_HANDLE_64(uint8) distance;
 };
 typedef struct xen_sysctl_numainfo xen_sysctl_numainfo_t;
 DEFINE_XEN_GUEST_HANDLE(xen_sysctl_numainfo_t);
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH v4 5/9] sysctl: Add sysctl interface for querying PCI topology
  2015-03-10  2:27 [PATCH v4 0/9] Display IO topology when PXM data is available (plus some cleanup) Boris Ostrovsky
                   ` (3 preceding siblings ...)
  2015-03-10  2:27 ` [PATCH v4 4/9] sysctl: Make XEN_SYSCTL_numainfo " Boris Ostrovsky
@ 2015-03-10  2:27 ` Boris Ostrovsky
  2015-03-10 14:54   ` Andrew Cooper
  2015-03-13 16:03   ` Jan Beulich
  2015-03-10  2:27 ` [PATCH v4 6/9] sysctl: Update sysctl version to 0x0000000C Boris Ostrovsky
                   ` (4 subsequent siblings)
  9 siblings, 2 replies; 34+ messages in thread
From: Boris Ostrovsky @ 2015-03-10  2:27 UTC (permalink / raw)
  To: andrew.cooper3, jbeulich, keir, ian.jackson, stefano.stabellini,
	ian.campbell, wei.liu2
  Cc: elena.ufimtseva, dario.faggioli, boris.ostrovsky, xen-devel

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
---

Changes in v4:
* No buffer allocation in sysctl, copy data device-by-device
* Replace INVALID_TOPOLOGY_ID with XEN_INVALID_NODE_ID

 xen/common/sysctl.c         |   59 +++++++++++++++++++++++++++++++++++++++++++
 xen/include/public/sysctl.h |   29 +++++++++++++++++++++
 2 files changed, 88 insertions(+), 0 deletions(-)

diff --git a/xen/common/sysctl.c b/xen/common/sysctl.c
index 2d11a76..9cd6321 100644
--- a/xen/common/sysctl.c
+++ b/xen/common/sysctl.c
@@ -387,7 +387,66 @@ long do_sysctl(XEN_GUEST_HANDLE_PARAM(xen_sysctl_t) u_sysctl)
         }
     }
     break;
+#ifdef HAS_PCI
+    case XEN_SYSCTL_pcitopoinfo:
+    {
+        xen_sysctl_pcitopoinfo_t *ti = &op->u.pcitopoinfo;
+
+        if ( guest_handle_is_null(ti->devs) ||
+             guest_handle_is_null(ti->nodes) ||
+             (ti->first_dev > ti->num_devs) )
+        {
+            ret = -EINVAL;
+            break;
+        }
+
+        for ( ; ti->first_dev < ti->num_devs; ti->first_dev++ )
+        {
+            physdev_pci_device_t dev;
+            uint8_t node;
+            struct pci_dev *pdev;
+
+            if ( copy_from_guest_offset(&dev, ti->devs, ti->first_dev, 1) )
+            {
+                ret = -EFAULT;
+                break;
+            }
+
+            spin_lock(&pcidevs_lock);
+            pdev = pci_get_pdev(dev.seg, dev.bus, dev.devfn);
+            if ( !pdev || (pdev->node == NUMA_NO_NODE) )
+                node = XEN_INVALID_NODE_ID;
+            else
+                node = pdev->node;
+            spin_unlock(&pcidevs_lock);
+
+            if ( copy_to_guest_offset(ti->nodes, ti->first_dev, &node, 1) )
+            {
+                ret = -EFAULT;
+                break;
+            }
 
+            if ( hypercall_preempt_check() )
+                break;
+        }
+
+        if ( !ret )
+        {
+            ti->first_dev++;
+
+            if ( __copy_field_to_guest(u_sysctl, op, u.pcitopoinfo.first_dev) )
+            {
+                ret = -EFAULT;
+                break;
+            }
+
+            if ( ti->first_dev < ti->num_devs )
+                ret = hypercall_create_continuation(__HYPERVISOR_sysctl,
+                                                    "h", u_sysctl);
+        }
+    }
+    break;
+#endif
 #ifdef TEST_COVERAGE
     case XEN_SYSCTL_coverage_op:
         ret = sysctl_coverage_op(&op->u.coverage_op);
diff --git a/xen/include/public/sysctl.h b/xen/include/public/sysctl.h
index c544c76..a224951 100644
--- a/xen/include/public/sysctl.h
+++ b/xen/include/public/sysctl.h
@@ -33,6 +33,7 @@
 
 #include "xen.h"
 #include "domctl.h"
+#include "physdev.h"
 
 #define XEN_SYSCTL_INTERFACE_VERSION 0x0000000B
 
@@ -494,6 +495,32 @@ struct xen_sysctl_cputopoinfo {
 typedef struct xen_sysctl_cputopoinfo xen_sysctl_cputopoinfo_t;
 DEFINE_XEN_GUEST_HANDLE(xen_sysctl_cputopoinfo_t);
 
+/* XEN_SYSCTL_pcitopoinfo */
+struct xen_sysctl_pcitopoinfo {
+    /* IN: Size of pcitopo array */
+    uint32_t num_devs;
+
+    /*
+     * IN/OUT: First element of pcitopo array that needs to be processed by
+     * hypervisor.
+     * This is used primarily by hypercall continuations and callers will
+     * typically set it to zero.
+     */
+    uint32_t first_dev;
+
+    /* IN: list of devices */
+    XEN_GUEST_HANDLE_64(physdev_pci_device_t) devs;
+
+    /*
+     * OUT: node identifier for each device.
+     * If information for a particular device is not avalable then set
+     * to XEN_INVALID_NODE_ID.
+     */
+    XEN_GUEST_HANDLE_64(uint8) nodes;
+};
+typedef struct xen_sysctl_pcitopoinfo xen_sysctl_pcitopoinfo_t;
+DEFINE_XEN_GUEST_HANDLE(xen_sysctl_pcitopoinfo_t);
+
 /* XEN_SYSCTL_numainfo */
 #define XEN_INVALID_MEM_SZ     (~0U)
 #define XEN_INVALID_NODE_DIST  ((uint8_t)~0)
@@ -694,12 +721,14 @@ struct xen_sysctl {
 #define XEN_SYSCTL_scheduler_op                  19
 #define XEN_SYSCTL_coverage_op                   20
 #define XEN_SYSCTL_psr_cmt_op                    21
+#define XEN_SYSCTL_pcitopoinfo                   22
     uint32_t interface_version; /* XEN_SYSCTL_INTERFACE_VERSION */
     union {
         struct xen_sysctl_readconsole       readconsole;
         struct xen_sysctl_tbuf_op           tbuf_op;
         struct xen_sysctl_physinfo          physinfo;
         struct xen_sysctl_cputopoinfo       cputopoinfo;
+        struct xen_sysctl_pcitopoinfo       pcitopoinfo;
         struct xen_sysctl_numainfo          numainfo;
         struct xen_sysctl_sched_id          sched_id;
         struct xen_sysctl_perfc_op          perfc_op;
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH v4 6/9] sysctl: Update sysctl version to 0x0000000C
  2015-03-10  2:27 [PATCH v4 0/9] Display IO topology when PXM data is available (plus some cleanup) Boris Ostrovsky
                   ` (4 preceding siblings ...)
  2015-03-10  2:27 ` [PATCH v4 5/9] sysctl: Add sysctl interface for querying PCI topology Boris Ostrovsky
@ 2015-03-10  2:27 ` Boris Ostrovsky
  2015-03-10  2:27 ` [PATCH v4 7/9] libxl/libxc: Move libxl_get_cpu_topology()'s hypercall buffer management to libxc Boris Ostrovsky
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 34+ messages in thread
From: Boris Ostrovsky @ 2015-03-10  2:27 UTC (permalink / raw)
  To: andrew.cooper3, jbeulich, keir, ian.jackson, stefano.stabellini,
	ian.campbell, wei.liu2
  Cc: elena.ufimtseva, dario.faggioli, boris.ostrovsky, xen-devel

Need to bump sysctl version due to changes to CPU and NUMA topology
data structures

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
---
 xen/include/public/sysctl.h |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/xen/include/public/sysctl.h b/xen/include/public/sysctl.h
index a224951..c3446e2 100644
--- a/xen/include/public/sysctl.h
+++ b/xen/include/public/sysctl.h
@@ -35,7 +35,7 @@
 #include "domctl.h"
 #include "physdev.h"
 
-#define XEN_SYSCTL_INTERFACE_VERSION 0x0000000B
+#define XEN_SYSCTL_INTERFACE_VERSION 0x0000000C
 
 /*
  * Read console content from Xen buffer ring.
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH v4 7/9] libxl/libxc: Move libxl_get_cpu_topology()'s hypercall buffer management to libxc
  2015-03-10  2:27 [PATCH v4 0/9] Display IO topology when PXM data is available (plus some cleanup) Boris Ostrovsky
                   ` (5 preceding siblings ...)
  2015-03-10  2:27 ` [PATCH v4 6/9] sysctl: Update sysctl version to 0x0000000C Boris Ostrovsky
@ 2015-03-10  2:27 ` Boris Ostrovsky
  2015-03-10 13:40   ` Dario Faggioli
  2015-03-10  2:27 ` [PATCH v4 8/9] libxl/libxc: Move libxl_get_numainfo()'s " Boris Ostrovsky
                   ` (2 subsequent siblings)
  9 siblings, 1 reply; 34+ messages in thread
From: Boris Ostrovsky @ 2015-03-10  2:27 UTC (permalink / raw)
  To: andrew.cooper3, jbeulich, keir, ian.jackson, stefano.stabellini,
	ian.campbell, wei.liu2
  Cc: elena.ufimtseva, dario.faggioli, boris.ostrovsky, xen-devel

xc_cputopoinfo() is not expected to be used on a hot path and therefore
hypercall buffer management can be pushed into libxc. This will simplify
life for callers.

Also update error reporting macros.

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
---

Changes in v4:
* Update commit message
* Make max_cpus argument in xc_cputopoinfo an unsigned
* Replace error logging macros

 tools/libxc/include/xenctrl.h     |    5 ++-
 tools/libxc/xc_misc.c             |   23 +++++++++++-----
 tools/libxl/libxl.c               |   33 ++++++++----------------
 tools/misc/xenpm.c                |   51 ++++++++++++++++---------------------
 tools/python/xen/lowlevel/xc/xc.c |   16 +++--------
 5 files changed, 57 insertions(+), 71 deletions(-)

diff --git a/tools/libxc/include/xenctrl.h b/tools/libxc/include/xenctrl.h
index fef4cca..5550916 100644
--- a/tools/libxc/include/xenctrl.h
+++ b/tools/libxc/include/xenctrl.h
@@ -1226,7 +1226,7 @@ int xc_readconsolering(xc_interface *xch,
 int xc_send_debug_keys(xc_interface *xch, char *keys);
 
 typedef xen_sysctl_physinfo_t xc_physinfo_t;
-typedef xen_sysctl_cputopoinfo_t xc_cputopoinfo_t;
+typedef xen_sysctl_cputopo_t xc_cputopo_t;
 typedef xen_sysctl_numainfo_t xc_numainfo_t;
 
 typedef uint32_t xc_cpu_to_node_t;
@@ -1237,7 +1237,8 @@ typedef uint64_t xc_node_to_memfree_t;
 typedef uint32_t xc_node_to_node_dist_t;
 
 int xc_physinfo(xc_interface *xch, xc_physinfo_t *info);
-int xc_cputopoinfo(xc_interface *xch, xc_cputopoinfo_t *info);
+int xc_cputopoinfo(xc_interface *xch, unsigned *max_cpus,
+                   xc_cputopo_t *cputopo);
 int xc_numainfo(xc_interface *xch, xc_numainfo_t *info);
 
 int xc_sched_id(xc_interface *xch,
diff --git a/tools/libxc/xc_misc.c b/tools/libxc/xc_misc.c
index be68291..bc6eed2 100644
--- a/tools/libxc/xc_misc.c
+++ b/tools/libxc/xc_misc.c
@@ -177,22 +177,31 @@ int xc_physinfo(xc_interface *xch,
     return 0;
 }
 
-int xc_cputopoinfo(xc_interface *xch,
-                   xc_cputopoinfo_t *put_info)
+int xc_cputopoinfo(xc_interface *xch, unsigned *max_cpus,
+                   xc_cputopo_t *cputopo)
 {
     int ret;
     DECLARE_SYSCTL;
+    DECLARE_HYPERCALL_BOUNCE(cputopo, *max_cpus * sizeof(*cputopo),
+                             XC_HYPERCALL_BUFFER_BOUNCE_OUT);
 
-    sysctl.cmd = XEN_SYSCTL_cputopoinfo;
+    if ((ret = xc_hypercall_bounce_pre(xch, cputopo)))
+        goto out;
 
-    memcpy(&sysctl.u.cputopoinfo, put_info, sizeof(*put_info));
+    sysctl.u.cputopoinfo.num_cpus = *max_cpus;
+    set_xen_guest_handle(sysctl.u.cputopoinfo.cputopo, cputopo);
+
+    sysctl.cmd = XEN_SYSCTL_cputopoinfo;
 
     if ( (ret = do_sysctl(xch, &sysctl)) != 0 )
-        return ret;
+        goto out;
 
-    memcpy(put_info, &sysctl.u.cputopoinfo, sizeof(*put_info));
+    *max_cpus = sysctl.u.cputopoinfo.num_cpus;
 
-    return 0;
+out:
+    xc_hypercall_bounce_post(xch, cputopo);
+
+    return ret;
 }
 
 int xc_numainfo(xc_interface *xch,
diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
index 8a6f979..9d7d9f5 100644
--- a/tools/libxl/libxl.c
+++ b/tools/libxl/libxl.c
@@ -5036,37 +5036,29 @@ int libxl_get_physinfo(libxl_ctx *ctx, libxl_physinfo *physinfo)
 libxl_cputopology *libxl_get_cpu_topology(libxl_ctx *ctx, int *nb_cpu_out)
 {
     GC_INIT(ctx);
-    xc_cputopoinfo_t tinfo;
-    DECLARE_HYPERCALL_BUFFER(xen_sysctl_cputopo_t, cputopo);
+    xc_cputopo_t *cputopo;
     libxl_cputopology *ret = NULL;
     int i;
     int max_cpus;
+    unsigned num_cpus;
 
     max_cpus = libxl_get_max_cpus(ctx);
     if (max_cpus < 0)
     {
-        LIBXL__LOG(ctx, XTL_ERROR, "Unable to determine number of CPUS");
-        ret = NULL;
+        LOG(ERROR, "Unable to determine number of CPUS");
         goto out;
     }
 
-    cputopo = xc_hypercall_buffer_alloc(ctx->xch, cputopo,
-                                        sizeof(*cputopo) * max_cpus);
-    if (cputopo == NULL) {
-        LIBXL__LOG_ERRNOVAL(ctx, XTL_ERROR, ENOMEM,
-                            "Unable to allocate hypercall arguments");
-        goto fail;
-    }
-    set_xen_guest_handle(tinfo.cputopo, cputopo);
-    tinfo.num_cpus = max_cpus;
+    cputopo = libxl__zalloc(gc, sizeof(*cputopo) * max_cpus);
 
-    if (xc_cputopoinfo(ctx->xch, &tinfo) != 0) {
-        LIBXL__LOG_ERRNO(ctx, XTL_ERROR, "CPU topology info hypercall failed");
-        goto fail;
+    num_cpus = max_cpus;
+    if (xc_cputopoinfo(ctx->xch, &num_cpus, cputopo) != 0) {
+        LOGE(ERROR, "CPU topology info hypercall failed");
+        goto out;
     }
 
-    if (tinfo.num_cpus < max_cpus)
-        max_cpus = tinfo.num_cpus;
+    if (num_cpus < max_cpus)
+        max_cpus = num_cpus;
 
     ret = libxl__zalloc(NOGC, sizeof(libxl_cputopology) * max_cpus);
 
@@ -5079,11 +5071,8 @@ libxl_cputopology *libxl_get_cpu_topology(libxl_ctx *ctx, int *nb_cpu_out)
 #undef V
     }
 
- fail:
-    xc_hypercall_buffer_free(ctx->xch, cputopo);
+    *nb_cpu_out = max_cpus;
 
-    if (ret)
-        *nb_cpu_out = max_cpus;
  out:
     GC_FREE;
     return ret;
diff --git a/tools/misc/xenpm.c b/tools/misc/xenpm.c
index 23d6b63..122bee4 100644
--- a/tools/misc/xenpm.c
+++ b/tools/misc/xenpm.c
@@ -355,12 +355,11 @@ static void signal_int_handler(int signo)
     int i, j, k;
     struct timeval tv;
     int cx_cap = 0, px_cap = 0;
-    DECLARE_HYPERCALL_BUFFER(xen_sysctl_cputopo_t, cputopo);
-    xc_cputopoinfo_t info = { 0 };
-
-    cputopo = xc_hypercall_buffer_alloc(xc_handle, cputopo,
-                                        sizeof(*cputopo) * MAX_NR_CPU);
+    xc_cputopo_t *cputopo;
+    unsigned max_cpus;
 
+    max_cpus = MAX_NR_CPU;
+    cputopo = calloc(max_cpus, sizeof(*cputopo));
     if ( cputopo == NULL )
     {
 	fprintf(stderr, "failed to allocate hypercall buffers\n");
@@ -445,29 +444,26 @@ static void signal_int_handler(int signo)
             printf("  Avg freq\t%d\tKHz\n", avgfreq[i]);
     }
 
-    set_xen_guest_handle(info.cputopo, cputopo);
-    info.num_cpus = MAX_NR_CPU;
-
-    if ( cx_cap && !xc_cputopoinfo(xc_handle, &info) )
+    if ( cx_cap && !xc_cputopoinfo(xc_handle, &max_cpus, cputopo) )
     {
         uint32_t socket_ids[MAX_NR_CPU];
         uint32_t core_ids[MAX_NR_CPU];
         uint32_t socket_nr = 0;
         uint32_t core_nr = 0;
 
-        if ( info.num_cpus > MAX_NR_CPU )
-            info.num_cpus = MAX_NR_CPU;
+        if ( max_cpus > MAX_NR_CPU )
+            max_cpus = MAX_NR_CPU;
         /* check validity */
-        for ( i = 0; i < info.num_cpus; i++ )
+        for ( i = 0; i < max_cpus; i++ )
         {
             if ( cputopo[i].core == XEN_INVALID_CORE_ID ||
                  cputopo[i].socket == XEN_INVALID_SOCKET_ID )
                 break;
         }
-        if ( i >= info.num_cpus )
+        if ( i >= max_cpus )
         {
             /* find socket nr & core nr per socket */
-            for ( i = 0; i < info.num_cpus; i++ )
+            for ( i = 0; i < max_cpus; i++ )
             {
                 for ( j = 0; j < socket_nr; j++ )
                     if ( cputopo[i].socket == socket_ids[j] )
@@ -494,7 +490,7 @@ static void signal_int_handler(int signo)
                 unsigned int n;
                 uint64_t res;
 
-                for ( j = 0; j < info.num_cpus; j++ )
+                for ( j = 0; j < max_cpus; j++ )
                 {
                     if ( cputopo[j].socket == socket_ids[i] )
                         break;
@@ -513,7 +509,7 @@ static void signal_int_handler(int signo)
                 }
                 for ( k = 0; k < core_nr; k++ )
                 {
-                    for ( j = 0; j < info.num_cpus; j++ )
+                    for ( j = 0; j < max_cpus; j++ )
                     {
                         if ( cputopo[j].socket == socket_ids[i] &&
                              cputopo[j].core == core_ids[k] )
@@ -551,7 +547,7 @@ static void signal_int_handler(int signo)
     free(sum);
     free(avgfreq);
 out:
-    xc_hypercall_buffer_free(xc_handle, cputopo);
+    free(cputopo);
     xc_interface_close(xc_handle);
     exit(0);
 }
@@ -958,22 +954,19 @@ void scaling_governor_func(int argc, char *argv[])
 
 void cpu_topology_func(int argc, char *argv[])
 {
-    DECLARE_HYPERCALL_BUFFER(xen_sysctl_cputopo_t, cputopo);
-    xc_cputopoinfo_t info = { 0 };
+    xc_cputopo_t *cputopo;
+    unsigned max_cpus;
     int i, rc = ENOMEM;
 
-    cputopo = xc_hypercall_buffer_alloc(xc_handle, cputopo,
-                                        sizeof(*cputopo) * MAX_NR_CPU);
+    max_cpus = MAX_NR_CPU;
+    cputopo = calloc(max_cpus, sizeof(*cputopo));
     if ( cputopo == NULL )
     {
 	fprintf(stderr, "failed to allocate hypercall buffers\n");
 	goto out;
     }
 
-    set_xen_guest_handle(info.cputopo, cputopo);
-    info.num_cpus = MAX_NR_CPU;
-
-    if ( xc_cputopoinfo(xc_handle, &info) )
+    if ( xc_cputopoinfo(xc_handle, &max_cpus, cputopo) )
     {
         rc = errno;
         fprintf(stderr, "Cannot get Xen CPU topology (%d - %s)\n",
@@ -981,11 +974,11 @@ void cpu_topology_func(int argc, char *argv[])
         goto out;
     }
 
-    if ( info.num_cpus > (MAX_NR_CPU) )
-        info.num_cpus = MAX_NR_CPU;
+    if ( max_cpus > (MAX_NR_CPU) )
+        max_cpus = MAX_NR_CPU;
 
     printf("CPU\tcore\tsocket\tnode\n");
-    for ( i = 0; i < info.num_cpus; i++ )
+    for ( i = 0; i < max_cpus; i++ )
     {
         if ( cputopo[i].core == XEN_INVALID_CORE_ID )
             continue;
@@ -994,7 +987,7 @@ void cpu_topology_func(int argc, char *argv[])
     }
     rc = 0;
 out:
-    xc_hypercall_buffer_free(xc_handle, cputopo);
+    free(cputopo);
     if ( rc )
         exit(rc);
 }
diff --git a/tools/python/xen/lowlevel/xc/xc.c b/tools/python/xen/lowlevel/xc/xc.c
index edbaf60..20c4d39 100644
--- a/tools/python/xen/lowlevel/xc/xc.c
+++ b/tools/python/xen/lowlevel/xc/xc.c
@@ -1221,25 +1221,19 @@ static PyObject *pyxc_getcpuinfo(XcObject *self, PyObject *args, PyObject *kwds)
 static PyObject *pyxc_topologyinfo(XcObject *self)
 {
 #define MAX_CPUS 256
-    xc_cputopoinfo_t tinfo = { 0 };
+    xc_cputopo_t *cputopo;
     unsigned i, num_cpus;
     PyObject *ret_obj = NULL;
     PyObject *cpu_to_core_obj, *cpu_to_socket_obj, *cpu_to_node_obj;
 
-    DECLARE_HYPERCALL_BUFFER(xen_sysctl_cputopo_t, cputopo);
-
-    cputopo = xc_hypercall_buffer_alloc(self->xc_handle, cputopo,
-                                        sizeof(*cputopo) * (MAX_CPUS));
+    num_cpus = MAX_CPUS;
+    cputopo = calloc(num_cpus, sizeof(*cputopo));
     if ( cputopo == NULL )
     	goto out;
 
-    set_xen_guest_handle(tinfo.cputopo, cputopo);
-    tinfo.num_cpus = MAX_CPUS;
-
-    if ( xc_cputopoinfo(self->xc_handle, &tinfo) != 0 )
+    if ( xc_cputopoinfo(self->xc_handle, &num_cpus, cputopo) != 0 )
         goto out;
 
-    num_cpus = tinfo.num_cpus;
     if ( num_cpus > MAX_CPUS )
         num_cpus = MAX_CPUS;
 
@@ -1295,7 +1289,7 @@ static PyObject *pyxc_topologyinfo(XcObject *self)
     Py_DECREF(cpu_to_node_obj);
 
 out:
-    xc_hypercall_buffer_free(self->xc_handle, cputopo);
+    free(cputopo);
     return ret_obj ? ret_obj : pyxc_error_to_exception(self->xc_handle);
 #undef MAX_CPU_INDEX
 }
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH v4 8/9] libxl/libxc: Move libxl_get_numainfo()'s hypercall buffer management to libxc
  2015-03-10  2:27 [PATCH v4 0/9] Display IO topology when PXM data is available (plus some cleanup) Boris Ostrovsky
                   ` (6 preceding siblings ...)
  2015-03-10  2:27 ` [PATCH v4 7/9] libxl/libxc: Move libxl_get_cpu_topology()'s hypercall buffer management to libxc Boris Ostrovsky
@ 2015-03-10  2:27 ` Boris Ostrovsky
  2015-03-10 13:44   ` Dario Faggioli
  2015-03-10  2:27 ` [PATCH v4 9/9] libxl: Add interface for querying hypervisor about PCI topology Boris Ostrovsky
  2015-03-10  8:20 ` [PATCH v4 0/9] Display IO topology when PXM data is available (plus some cleanup) Jan Beulich
  9 siblings, 1 reply; 34+ messages in thread
From: Boris Ostrovsky @ 2015-03-10  2:27 UTC (permalink / raw)
  To: andrew.cooper3, jbeulich, keir, ian.jackson, stefano.stabellini,
	ian.campbell, wei.liu2
  Cc: elena.ufimtseva, dario.faggioli, boris.ostrovsky, xen-devel

xc_numainfo() is not expected to be used on a hot path and therefore
hypercall buffer management can be pushed into libxc. This will simplify
life for callers.

Also update error logging macros.

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
---

Changes in v4:
* Update commit message
* Make max_nodes argument in xc_numainfo an unsigned
* Replace error logging macros

 tools/libxc/include/xenctrl.h     |    4 ++-
 tools/libxc/xc_misc.c             |   29 +++++++++++++++++++------
 tools/libxl/libxl.c               |   42 +++++++++++-------------------------
 tools/python/xen/lowlevel/xc/xc.c |   27 ++++++++---------------
 4 files changed, 47 insertions(+), 55 deletions(-)

diff --git a/tools/libxc/include/xenctrl.h b/tools/libxc/include/xenctrl.h
index 5550916..09b0ec2 100644
--- a/tools/libxc/include/xenctrl.h
+++ b/tools/libxc/include/xenctrl.h
@@ -1228,6 +1228,7 @@ int xc_send_debug_keys(xc_interface *xch, char *keys);
 typedef xen_sysctl_physinfo_t xc_physinfo_t;
 typedef xen_sysctl_cputopo_t xc_cputopo_t;
 typedef xen_sysctl_numainfo_t xc_numainfo_t;
+typedef xen_sysctl_meminfo_t xc_meminfo_t;
 
 typedef uint32_t xc_cpu_to_node_t;
 typedef uint32_t xc_cpu_to_socket_t;
@@ -1239,7 +1240,8 @@ typedef uint32_t xc_node_to_node_dist_t;
 int xc_physinfo(xc_interface *xch, xc_physinfo_t *info);
 int xc_cputopoinfo(xc_interface *xch, unsigned *max_cpus,
                    xc_cputopo_t *cputopo);
-int xc_numainfo(xc_interface *xch, xc_numainfo_t *info);
+int xc_numainfo(xc_interface *xch, unsigned *max_nodes,
+                xc_meminfo_t *meminfo, uint8_t *distance);
 
 int xc_sched_id(xc_interface *xch,
                 int *sched_id);
diff --git a/tools/libxc/xc_misc.c b/tools/libxc/xc_misc.c
index bc6eed2..58d3883 100644
--- a/tools/libxc/xc_misc.c
+++ b/tools/libxc/xc_misc.c
@@ -204,22 +204,37 @@ out:
     return ret;
 }
 
-int xc_numainfo(xc_interface *xch,
-                xc_numainfo_t *put_info)
+int xc_numainfo(xc_interface *xch, unsigned *max_nodes,
+                xc_meminfo_t *meminfo, uint8_t *distance)
 {
     int ret;
     DECLARE_SYSCTL;
+    DECLARE_HYPERCALL_BOUNCE(meminfo, *max_nodes * sizeof(*meminfo),
+                             XC_HYPERCALL_BUFFER_BOUNCE_OUT);
+    DECLARE_HYPERCALL_BOUNCE(distance,
+                             *max_nodes * *max_nodes * sizeof(*distance),
+                             XC_HYPERCALL_BUFFER_BOUNCE_OUT);
 
-    sysctl.cmd = XEN_SYSCTL_numainfo;
+    if ((ret = xc_hypercall_bounce_pre(xch, meminfo)))
+        goto out;
+    if ((ret = xc_hypercall_bounce_pre(xch, distance)))
+        goto out;
 
-    memcpy(&sysctl.u.numainfo, put_info, sizeof(*put_info));
+    sysctl.u.numainfo.num_nodes = *max_nodes;
+    set_xen_guest_handle(sysctl.u.numainfo.meminfo, meminfo);
+    set_xen_guest_handle(sysctl.u.numainfo.distance, distance);
 
+    sysctl.cmd = XEN_SYSCTL_numainfo;
     if ((ret = do_sysctl(xch, &sysctl)) != 0)
-        return ret;
+        goto out;
 
-    memcpy(put_info, &sysctl.u.numainfo, sizeof(*put_info));
+    *max_nodes = sysctl.u.numainfo.num_nodes;
 
-    return 0;
+out:
+    xc_hypercall_bounce_post(xch, meminfo);
+    xc_hypercall_bounce_post(xch, distance);
+
+    return ret;
 }
 
 
diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
index 9d7d9f5..b37b3bd 100644
--- a/tools/libxl/libxl.c
+++ b/tools/libxl/libxl.c
@@ -5081,41 +5081,29 @@ libxl_cputopology *libxl_get_cpu_topology(libxl_ctx *ctx, int *nb_cpu_out)
 libxl_numainfo *libxl_get_numainfo(libxl_ctx *ctx, int *nr)
 {
     GC_INIT(ctx);
-    xc_numainfo_t ninfo;
-    DECLARE_HYPERCALL_BUFFER(xen_sysctl_meminfo_t, meminfo);
-    DECLARE_HYPERCALL_BUFFER(uint8_t, distance);
+    xc_meminfo_t *meminfo;
+    uint8_t *distance;
     libxl_numainfo *ret = NULL;
     int i, j, max_nodes;
+    unsigned num_nodes;
 
     max_nodes = libxl_get_max_nodes(ctx);
-    if (max_nodes < 0)
-    {
-        LIBXL__LOG(ctx, XTL_ERROR, "Unable to determine number of NODES");
-        ret = NULL;
+    if (max_nodes < 0) {
+        LOG(ERROR, "Unable to determine number of NODES");
         goto out;
     }
 
-    meminfo = xc_hypercall_buffer_alloc(ctx->xch, meminfo,
-                                        sizeof(*meminfo) * max_nodes);
-    distance = xc_hypercall_buffer_alloc(ctx->xch, distance,
-                                         sizeof(*distance) *
-                                         max_nodes * max_nodes);
-    if ((meminfo == NULL) || (distance == NULL)) {
-        LIBXL__LOG_ERRNOVAL(ctx, XTL_ERROR, ENOMEM,
-                            "Unable to allocate hypercall arguments");
-        goto fail;
-    }
+    meminfo = libxl__zalloc(gc, sizeof(*meminfo) * max_nodes);
+    distance = libxl__zalloc(gc, sizeof(*distance) * max_nodes * max_nodes);
 
-    set_xen_guest_handle(ninfo.meminfo, meminfo);
-    set_xen_guest_handle(ninfo.distance, distance);
-    ninfo.num_nodes = max_nodes;
-    if (xc_numainfo(ctx->xch, &ninfo) != 0) {
-        LIBXL__LOG_ERRNO(ctx, LIBXL__LOG_ERROR, "getting numainfo");
-        goto fail;
+    num_nodes = max_nodes;
+    if (xc_numainfo(ctx->xch, &num_nodes, meminfo, distance) != 0) {
+        LOGE(ERROR, "getting numainfo");
+        goto out;
     }
 
-    if (ninfo.num_nodes < max_nodes)
-        max_nodes = ninfo.num_nodes;
+    if (num_nodes < max_nodes)
+        max_nodes = num_nodes;
 
     *nr = max_nodes;
 
@@ -5136,10 +5124,6 @@ libxl_numainfo *libxl_get_numainfo(libxl_ctx *ctx, int *nr)
 #undef V
     }
 
- fail:
-    xc_hypercall_buffer_free(ctx->xch, meminfo);
-    xc_hypercall_buffer_free(ctx->xch, distance);
-
  out:
     GC_FREE;
     return ret;
diff --git a/tools/python/xen/lowlevel/xc/xc.c b/tools/python/xen/lowlevel/xc/xc.c
index 20c4d39..dacf49e 100644
--- a/tools/python/xen/lowlevel/xc/xc.c
+++ b/tools/python/xen/lowlevel/xc/xc.c
@@ -1297,32 +1297,23 @@ out:
 static PyObject *pyxc_numainfo(XcObject *self)
 {
 #define MAX_NODES 32
-    xc_numainfo_t ninfo = { 0 };
     unsigned i, j, num_nodes;
     uint64_t free_heap;
     PyObject *ret_obj = NULL, *node_to_node_dist_list_obj;
     PyObject *node_to_memsize_obj, *node_to_memfree_obj;
     PyObject *node_to_dma32_mem_obj, *node_to_node_dist_obj;
-    DECLARE_HYPERCALL_BUFFER(xen_sysctl_meminfo_t, meminfo);
-    DECLARE_HYPERCALL_BUFFER(uint8_t, distance);
+    xc_meminfo_t *meminfo;
+    uint8_t *distance;
 
-    meminfo = xc_hypercall_buffer_alloc(self->xc_handle, meminfo,
-                                        sizeof(*meminfo) * MAX_NODES);
-    if ( meminfo == NULL )
+    num_nodes = MAX_NODES;
+    meminfo = calloc(num_nodes, sizeof(*meminfo));
+    distance = calloc(num_nodes * num_nodes, sizeof(*distance));
+    if ( meminfo == NULL || distance == NULL)
         goto out;
-    distance = xc_hypercall_buffer_alloc(self->xc_handle, distance,
-                                         sizeof(*distance) * MAX_NODES * MAX_NODES);
-    if ( distance == NULL )
-        goto out;
-
-    set_xen_guest_handle(ninfo.meminfo, meminfo);
-    set_xen_guest_handle(ninfo.distance, distance);
-    ninfo.num_nodes = MAX_NODES;
 
-    if ( xc_numainfo(self->xc_handle, &ninfo) != 0 )
+    if ( xc_numainfo(self->xc_handle, &num_nodes, meminfo, distance) != 0 )
         goto out;
 
-    num_nodes = ninfo.num_nodes;
     if ( num_nodes > MAX_NODES )
         num_nodes = MAX_NODES;
 
@@ -1389,8 +1380,8 @@ static PyObject *pyxc_numainfo(XcObject *self)
     Py_DECREF(node_to_node_dist_list_obj);
 
 out:
-    xc_hypercall_buffer_free(self->xc_handle, meminfo);
-    xc_hypercall_buffer_free(self->xc_handle, distance);
+    free(meminfo);
+    free(distance);
     return ret_obj ? ret_obj : pyxc_error_to_exception(self->xc_handle);
 #undef MAX_NODE_INDEX
 }
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH v4 9/9] libxl: Add interface for querying hypervisor about PCI topology
  2015-03-10  2:27 [PATCH v4 0/9] Display IO topology when PXM data is available (plus some cleanup) Boris Ostrovsky
                   ` (7 preceding siblings ...)
  2015-03-10  2:27 ` [PATCH v4 8/9] libxl/libxc: Move libxl_get_numainfo()'s " Boris Ostrovsky
@ 2015-03-10  2:27 ` Boris Ostrovsky
  2015-03-10 13:51   ` Dario Faggioli
  2015-03-10  8:20 ` [PATCH v4 0/9] Display IO topology when PXM data is available (plus some cleanup) Jan Beulich
  9 siblings, 1 reply; 34+ messages in thread
From: Boris Ostrovsky @ 2015-03-10  2:27 UTC (permalink / raw)
  To: andrew.cooper3, jbeulich, keir, ian.jackson, stefano.stabellini,
	ian.campbell, wei.liu2
  Cc: elena.ufimtseva, dario.faggioli, boris.ostrovsky, xen-devel

.. and use this new interface to display it along with CPU topology
and NUMA information when 'xl info -n' command is issued

The output will look like
...
cpu_topology           :
cpu:    core    socket     node
  0:       0        0        0
...
device topology        :
device           node
0000:00:00.0      0
0000:00:01.0      0
...

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
---

Changes in v4:
* Rename LIBXL_HAVE_PCITOPO to LIBXL_HAVE_PCITOPOLOGY, drop ifdefs from code
* Don't treat no PCI devices as an error in libxl_get_pci_topology()
* Drop unnecessary comments
* Make num_devs argument in xc_pcitopoinfo an unsigned

 tools/libxc/include/xenctrl.h |    3 ++
 tools/libxc/xc_misc.c         |   31 ++++++++++++++++++
 tools/libxl/libxl.c           |   42 +++++++++++++++++++++++++
 tools/libxl/libxl.h           |   12 +++++++
 tools/libxl/libxl_freebsd.c   |   12 +++++++
 tools/libxl/libxl_internal.h  |    5 +++
 tools/libxl/libxl_linux.c     |   69 +++++++++++++++++++++++++++++++++++++++++
 tools/libxl/libxl_netbsd.c    |   12 +++++++
 tools/libxl/libxl_types.idl   |    7 ++++
 tools/libxl/libxl_utils.c     |    8 +++++
 tools/libxl/xl_cmdimpl.c      |   40 +++++++++++++++++++----
 11 files changed, 234 insertions(+), 7 deletions(-)

diff --git a/tools/libxc/include/xenctrl.h b/tools/libxc/include/xenctrl.h
index 09b0ec2..c08f253 100644
--- a/tools/libxc/include/xenctrl.h
+++ b/tools/libxc/include/xenctrl.h
@@ -1229,6 +1229,7 @@ typedef xen_sysctl_physinfo_t xc_physinfo_t;
 typedef xen_sysctl_cputopo_t xc_cputopo_t;
 typedef xen_sysctl_numainfo_t xc_numainfo_t;
 typedef xen_sysctl_meminfo_t xc_meminfo_t;
+typedef xen_sysctl_pcitopoinfo_t xc_pcitopoinfo_t;
 
 typedef uint32_t xc_cpu_to_node_t;
 typedef uint32_t xc_cpu_to_socket_t;
@@ -1242,6 +1243,8 @@ int xc_cputopoinfo(xc_interface *xch, unsigned *max_cpus,
                    xc_cputopo_t *cputopo);
 int xc_numainfo(xc_interface *xch, unsigned *max_nodes,
                 xc_meminfo_t *meminfo, uint8_t *distance);
+int xc_pcitopoinfo(xc_interface *xch, unsigned num_devs,
+                   physdev_pci_device_t *devs, uint8_t *nodes);
 
 int xc_sched_id(xc_interface *xch,
                 int *sched_id);
diff --git a/tools/libxc/xc_misc.c b/tools/libxc/xc_misc.c
index 58d3883..f6cb05b 100644
--- a/tools/libxc/xc_misc.c
+++ b/tools/libxc/xc_misc.c
@@ -237,6 +237,37 @@ out:
     return ret;
 }
 
+int xc_pcitopoinfo(xc_interface *xch, unsigned num_devs,
+                   physdev_pci_device_t *devs,
+                   uint8_t *nodes)
+{
+    int ret;
+    DECLARE_SYSCTL;
+    DECLARE_HYPERCALL_BOUNCE(devs, num_devs * sizeof(*devs),
+                             XC_HYPERCALL_BUFFER_BOUNCE_IN);
+    DECLARE_HYPERCALL_BOUNCE(nodes, num_devs* sizeof(*nodes),
+                             XC_HYPERCALL_BUFFER_BOUNCE_OUT);
+
+    if ((ret = xc_hypercall_bounce_pre(xch, devs)))
+        goto out;
+    if ((ret = xc_hypercall_bounce_pre(xch, nodes)))
+        goto out;
+
+    sysctl.u.pcitopoinfo.first_dev = 0;
+    sysctl.u.pcitopoinfo.num_devs = num_devs;
+    set_xen_guest_handle(sysctl.u.pcitopoinfo.devs, devs);
+    set_xen_guest_handle(sysctl.u.pcitopoinfo.nodes, nodes);
+
+    sysctl.cmd = XEN_SYSCTL_pcitopoinfo;
+
+    ret = do_sysctl(xch, &sysctl);
+
+ out:
+    xc_hypercall_bounce_post(xch, devs);
+    xc_hypercall_bounce_post(xch, nodes);
+
+    return ret;
+}
 
 int xc_sched_id(xc_interface *xch,
                 int *sched_id)
diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
index b37b3bd..0f8b519 100644
--- a/tools/libxl/libxl.c
+++ b/tools/libxl/libxl.c
@@ -5078,6 +5078,48 @@ libxl_cputopology *libxl_get_cpu_topology(libxl_ctx *ctx, int *nb_cpu_out)
     return ret;
 }
 
+libxl_pcitopology *libxl_get_pci_topology(libxl_ctx *ctx, int *num_devs)
+{
+    GC_INIT(ctx);
+    physdev_pci_device_t *devs;
+    uint8_t *nodes;
+    libxl_pcitopology *ret = NULL;
+    int i;
+
+    *num_devs = libxl__pci_numdevs(gc);
+    if (*num_devs < 0) {
+        LOG(ERROR, "Unable to determine number of PCI devices");
+        goto out;
+    }
+
+    devs = libxl__zalloc(gc, sizeof(*devs) * *num_devs);
+    nodes = libxl__zalloc(gc, sizeof(*nodes) * *num_devs);
+
+    if (libxl__pci_topology_init(gc, devs, *num_devs)) {
+        LOGE(ERROR, "Cannot initialize PCI hypercall structure");
+        goto out;
+    }
+
+    if (xc_pcitopoinfo(ctx->xch, *num_devs, devs, nodes) != 0) {
+        LOGE(ERROR, "PCI topology info hypercall failed");
+        goto out;
+    }
+
+    ret = libxl__zalloc(NOGC, sizeof(libxl_pcitopology) * *num_devs);
+
+    for (i = 0; i < *num_devs; i++) {
+        ret[i].seg = devs[i].seg;
+        ret[i].bus = devs[i].bus;
+        ret[i].devfn = devs[i].devfn;
+        ret[i].node = (nodes[i] == XEN_INVALID_NODE_ID) ?
+            LIBXL_PCITOPOLOGY_INVALID_ENTRY : nodes[i];
+    }
+
+ out:
+    GC_FREE;
+    return ret;
+}
+
 libxl_numainfo *libxl_get_numainfo(libxl_ctx *ctx, int *nr)
 {
     GC_INIT(ctx);
diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h
index f784df5..a4c1bf5 100644
--- a/tools/libxl/libxl.h
+++ b/tools/libxl/libxl.h
@@ -714,6 +714,14 @@ void libxl_mac_copy(libxl_ctx *ctx, libxl_mac *dst, libxl_mac *src);
 #define LIBXL_HAVE_PSR_CMT 1
 #endif
 
+/*
+ * LIBXL_HAVE_PCITOPOLOGY
+ *
+ * If this is defined, then interface to query hypervisor about PCI device
+ * topology is available.
+ */
+#define LIBXL_HAVE_PCITOPOLOGY 1
+
 typedef char **libxl_string_list;
 void libxl_string_list_dispose(libxl_string_list *sl);
 int libxl_string_list_length(const libxl_string_list *sl);
@@ -1092,6 +1100,10 @@ void libxl_vminfo_list_free(libxl_vminfo *list, int nb_vm);
 libxl_cputopology *libxl_get_cpu_topology(libxl_ctx *ctx, int *nb_cpu_out);
 void libxl_cputopology_list_free(libxl_cputopology *, int nb_cpu);
 
+#define LIBXL_PCITOPOLOGY_INVALID_ENTRY (~(uint32_t)0)
+libxl_pcitopology *libxl_get_pci_topology(libxl_ctx *ctx, int *num_devs);
+void libxl_pcitopology_list_free(libxl_pcitopology *, int num_devs);
+
 #define LIBXL_NUMAINFO_INVALID_ENTRY (~(uint32_t)0)
 libxl_numainfo *libxl_get_numainfo(libxl_ctx *ctx, int *nr);
 void libxl_numainfo_list_free(libxl_numainfo *, int nr);
diff --git a/tools/libxl/libxl_freebsd.c b/tools/libxl/libxl_freebsd.c
index e8b88b3..47c3391 100644
--- a/tools/libxl/libxl_freebsd.c
+++ b/tools/libxl/libxl_freebsd.c
@@ -131,3 +131,15 @@ libxl_device_model_version libxl__default_device_model(libxl__gc *gc)
 {
     return LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN;
 }
+
+int libxl__pci_numdevs(libxl__gc *gc)
+{
+    return ERROR_NI;
+}
+
+int libxl__pci_topology_init(libxl__gc *gc,
+                             physdev_pci_device_t *devs,
+                             int num_devs)
+{
+    return ERROR_NI;
+}
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index 934465a..42942b2 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -1170,6 +1170,11 @@ _hidden int libxl__try_phy_backend(mode_t st_mode);
 
 _hidden char *libxl__devid_to_localdev(libxl__gc *gc, int devid);
 
+_hidden int libxl__pci_numdevs(libxl__gc *gc);
+_hidden int libxl__pci_topology_init(libxl__gc *gc,
+                                     physdev_pci_device_t *devs,
+                                     int num_devs);
+
 /* from libxl_pci */
 
 _hidden int libxl__device_pci_add(libxl__gc *gc, uint32_t domid, libxl_device_pci *pcidev, int starting);
diff --git a/tools/libxl/libxl_linux.c b/tools/libxl/libxl_linux.c
index b51930c..be76c6f 100644
--- a/tools/libxl/libxl_linux.c
+++ b/tools/libxl/libxl_linux.c
@@ -279,3 +279,72 @@ libxl_device_model_version libxl__default_device_model(libxl__gc *gc)
 {
     return LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN;
 }
+
+int libxl__pci_numdevs(libxl__gc *gc)
+{
+    DIR *dir;
+    struct dirent *entry;
+    int num_devs = 0;
+
+    dir = opendir("/sys/bus/pci/devices");
+    if (!dir) {
+        LOGEV(ERROR, errno, "Cannot open /sys/bus/pci/devices");
+        return ERROR_FAIL;
+    }
+
+    while ((entry = readdir(dir))) {
+        if (entry->d_name[0] == '.')
+            continue;
+        num_devs++;
+    }
+    closedir(dir);
+
+    return num_devs;
+}
+
+int libxl__pci_topology_init(libxl__gc *gc,
+                             physdev_pci_device_t *devs,
+                             int num_devs)
+{
+
+    DIR *dir;
+    struct dirent *entry;
+    int i, err = 0;
+
+    dir = opendir("/sys/bus/pci/devices");
+    if (!dir) {
+        LOGEV(ERROR, errno, "Cannot open /sys/bus/pci/devices");
+        return ERROR_FAIL;
+    }
+
+    i = 0;
+    while ((entry = readdir(dir))) {
+        unsigned int dom, bus, dev, func;
+
+        if (entry->d_name[0] == '.')
+            continue;
+
+        if (i == num_devs) {
+            LOGE(ERROR, "Too many devices\n");
+            err = ERROR_FAIL;
+            goto out;
+        }
+
+        if (sscanf(entry->d_name, "%x:%x:%x.%d", &dom, &bus, &dev, &func) < 4) {
+            LOGEV(ERROR, errno, "Error processing /sys/bus/pci/devices");
+            err = ERROR_FAIL;
+            goto out;
+        }
+
+        devs[i].seg = dom;
+        devs[i].bus = bus;
+        devs[i].devfn = ((dev & 0x1f) << 3) | (func & 7);
+
+        i++;
+    }
+
+ out:
+    closedir(dir);
+
+    return err;
+}
diff --git a/tools/libxl/libxl_netbsd.c b/tools/libxl/libxl_netbsd.c
index 898e160..a2a962e 100644
--- a/tools/libxl/libxl_netbsd.c
+++ b/tools/libxl/libxl_netbsd.c
@@ -95,3 +95,15 @@ libxl_device_model_version libxl__default_device_model(libxl__gc *gc)
 {
     return LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN_TRADITIONAL;
 }
+
+int libxl__pci_numdevs(libxl__gc *gc)
+{
+    return ERROR_NI;
+}
+
+int libxl__pci_topology_init(libxl__gc *gc,
+                             physdev_pci_device_t *devs,
+                             int num_devs)
+{
+    return ERROR_NI;
+}
diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
index 02be466..2accda3 100644
--- a/tools/libxl/libxl_types.idl
+++ b/tools/libxl/libxl_types.idl
@@ -645,6 +645,13 @@ libxl_cputopology = Struct("cputopology", [
     ("node", uint32),
     ], dir=DIR_OUT)
 
+libxl_pcitopology = Struct("pcitopology", [
+    ("seg", uint16),
+    ("bus", uint8),
+    ("devfn", uint8),
+    ("node", uint32),
+    ], dir=DIR_OUT)
+
 libxl_sched_credit_params = Struct("sched_credit_params", [
     ("tslice_ms", integer),
     ("ratelimit_us", integer),
diff --git a/tools/libxl/libxl_utils.c b/tools/libxl/libxl_utils.c
index 7095b58..c31407f 100644
--- a/tools/libxl/libxl_utils.c
+++ b/tools/libxl/libxl_utils.c
@@ -877,6 +877,14 @@ void libxl_cputopology_list_free(libxl_cputopology *list, int nr)
     free(list);
 }
 
+void libxl_pcitopology_list_free(libxl_pcitopology *list, int nr)
+{
+    int i;
+    for (i = 0; i < nr; i++)
+        libxl_pcitopology_dispose(&list[i]);
+    free(list);
+}
+
 void libxl_numainfo_list_free(libxl_numainfo *list, int nr)
 {
     int i;
diff --git a/tools/libxl/xl_cmdimpl.c b/tools/libxl/xl_cmdimpl.c
index e41f633..b9fd8cf 100644
--- a/tools/libxl/xl_cmdimpl.c
+++ b/tools/libxl/xl_cmdimpl.c
@@ -5214,12 +5214,15 @@ static void output_numainfo(void)
 
 static void output_topologyinfo(void)
 {
-    libxl_cputopology *info;
+    libxl_cputopology *cpuinfo;
     int i, nr;
+    libxl_pcitopology *pciinfo;
+    int valid_devs;
 
-    info = libxl_get_cpu_topology(ctx, &nr);
-    if (info == NULL) {
-        fprintf(stderr, "libxl_get_topologyinfo failed.\n");
+
+    cpuinfo = libxl_get_cpu_topology(ctx, &nr);
+    if (cpuinfo == NULL) {
+        fprintf(stderr, "libxl_get_cpu_topology failed.\n");
         return;
     }
 
@@ -5227,12 +5230,35 @@ static void output_topologyinfo(void)
     printf("cpu:    core    socket     node\n");
 
     for (i = 0; i < nr; i++) {
-        if (info[i].core != LIBXL_CPUTOPOLOGY_INVALID_ENTRY)
+        if (cpuinfo[i].core != LIBXL_CPUTOPOLOGY_INVALID_ENTRY)
             printf("%3d:    %4d     %4d     %4d\n", i,
-                   info[i].core, info[i].socket, info[i].node);
+                   cpuinfo[i].core, cpuinfo[i].socket, cpuinfo[i].node);
+    }
+
+    libxl_cputopology_list_free(cpuinfo, nr);
+
+    pciinfo = libxl_get_pci_topology(ctx, &nr);
+    if (cpuinfo == NULL) {
+        fprintf(stderr, "libxl_get_pci_topology failed.\n");
+        return;
     }
 
-    libxl_cputopology_list_free(info, nr);
+    printf("device topology        :\n");
+    printf("device           node\n");
+    for (i = 0; i < nr; i++) {
+        if (pciinfo[i].node != LIBXL_PCITOPOLOGY_INVALID_ENTRY) {
+            printf("%04x:%02x:%02x.%01x      %d\n", pciinfo[i].seg,
+                   pciinfo[i].bus,
+                   ((pciinfo[i].devfn >> 3) & 0x1f), (pciinfo[i].devfn & 7),
+                   pciinfo[i].node);
+            valid_devs++;
+        }
+    }
+
+    if (valid_devs == 0)
+        printf("No device topology data available\n");
+
+    libxl_pcitopology_list_free(pciinfo, nr);
 
     return;
 }
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* Re: [PATCH v4 0/9] Display IO topology when PXM data is available (plus some cleanup)
  2015-03-10  2:27 [PATCH v4 0/9] Display IO topology when PXM data is available (plus some cleanup) Boris Ostrovsky
                   ` (8 preceding siblings ...)
  2015-03-10  2:27 ` [PATCH v4 9/9] libxl: Add interface for querying hypervisor about PCI topology Boris Ostrovsky
@ 2015-03-10  8:20 ` Jan Beulich
  2015-03-10 13:39   ` Boris Ostrovsky
  9 siblings, 1 reply; 34+ messages in thread
From: Jan Beulich @ 2015-03-10  8:20 UTC (permalink / raw)
  To: Boris Ostrovsky
  Cc: elena.ufimtseva, wei.liu2, ian.campbell, stefano.stabellini,
	andrew.cooper3, dario.faggioli, ian.jackson, xen-devel, keir

>>> On 10.03.15 at 03:27, <boris.ostrovsky@oracle.com> wrote:
> Changes in v4:
> * Split cputopology and NUMA info changes into separate patches
> * Added patch#1 (partly because patch#4 needs to know when when distance is invalid,
>   i.e. NUMA_NO_DISTANCE)
> * Split sysctl version update into a separate patch

Why?

Jan

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v4 1/9] numa: __node_distance() should return u8
  2015-03-10  2:27 ` [PATCH v4 1/9] numa: __node_distance() should return u8 Boris Ostrovsky
@ 2015-03-10 11:53   ` Andrew Cooper
  2015-03-10 13:49     ` Boris Ostrovsky
  0 siblings, 1 reply; 34+ messages in thread
From: Andrew Cooper @ 2015-03-10 11:53 UTC (permalink / raw)
  To: Boris Ostrovsky, jbeulich, keir, ian.jackson, stefano.stabellini,
	ian.campbell, wei.liu2
  Cc: elena.ufimtseva, dario.faggioli, xen-devel

On 10/03/15 02:27, Boris Ostrovsky wrote:
> SLIT values are byte-sized and some of them (0-9 and 255) have
> special meaning. Adjust __node_distance() to reflect this and
> modify scrub_heap_pages() to deal with __node_distance() returning
> an invalid SLIT entry.
>
> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>

You also need to teach XEN_SYSCTL_numainfo about the new NUMA_NO_DISTANCE.

> ---
>  xen/arch/x86/srat.c        |   15 +++++++++++----
>  xen/common/page_alloc.c    |    4 ++--
>  xen/include/asm-x86/numa.h |    2 +-
>  xen/include/xen/numa.h     |    3 ++-
>  4 files changed, 16 insertions(+), 8 deletions(-)
>
> diff --git a/xen/arch/x86/srat.c b/xen/arch/x86/srat.c
> index dfabba3..aa2eda3 100644
> --- a/xen/arch/x86/srat.c
> +++ b/xen/arch/x86/srat.c
> @@ -496,14 +496,21 @@ static unsigned node_to_pxm(nodeid_t n)
>  	return 0;
>  }
>  
> -int __node_distance(nodeid_t a, nodeid_t b)
> +u8 __node_distance(nodeid_t a, nodeid_t b)
>  {
> -	int index;
> +	u8 slit_val;
>  
>  	if (!acpi_slit)
>  		return a == b ? 10 : 20;
> -	index = acpi_slit->locality_count * node_to_pxm(a);
> -	return acpi_slit->entry[index + node_to_pxm(b)];
> +
> +	slit_val = acpi_slit->entry[acpi_slit->locality_count * node_to_pxm(a) +
> +				                node_to_pxm(b)];

This would be easier to read if you kept the old index temporary
(although making it a u64).

~Andrew

> +
> +	/* ACPI defines 0xff as an unreachable node and 0-9 are undefined */
> +	if ((slit_val == 0xff) || (slit_val <= 9))
> +		return NUMA_NO_DISTANCE;
> +	else
> +		return slit_val;
>  }
>  
>  EXPORT_SYMBOL(__node_distance);
> diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c
> index d96d25b..0f0ca56 100644
> --- a/xen/common/page_alloc.c
> +++ b/xen/common/page_alloc.c
> @@ -1435,13 +1435,13 @@ void __init scrub_heap_pages(void)
>          /* Figure out which NODE CPUs are close. */
>          for_each_online_node ( j )
>          {
> -            int distance;
> +            u8 distance;
>  
>              if ( cpumask_empty(&node_to_cpumask(j)) )
>                  continue;
>  
>              distance = __node_distance(i, j);
> -            if ( distance < last_distance )
> +            if ( (distance < last_distance) && (distance != NUMA_NO_DISTANCE) )
>              {
>                  last_distance = distance;
>                  best_node = j;
> diff --git a/xen/include/asm-x86/numa.h b/xen/include/asm-x86/numa.h
> index cc5b5d1..7a489d3 100644
> --- a/xen/include/asm-x86/numa.h
> +++ b/xen/include/asm-x86/numa.h
> @@ -85,6 +85,6 @@ extern int valid_numa_range(u64 start, u64 end, nodeid_t node);
>  #endif
>  
>  void srat_parse_regions(u64 addr);
> -extern int __node_distance(nodeid_t a, nodeid_t b);
> +extern u8 __node_distance(nodeid_t a, nodeid_t b);
>  
>  #endif
> diff --git a/xen/include/xen/numa.h b/xen/include/xen/numa.h
> index ac4b391..7aef1a8 100644
> --- a/xen/include/xen/numa.h
> +++ b/xen/include/xen/numa.h
> @@ -7,7 +7,8 @@
>  #define NODES_SHIFT     0
>  #endif
>  
> -#define NUMA_NO_NODE    0xFF
> +#define NUMA_NO_NODE     0xFF
> +#define NUMA_NO_DISTANCE 0xFF
>  
>  #define MAX_NUMNODES    (1 << NODES_SHIFT)
>  

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v4 0/9] Display IO topology when PXM data is available (plus some cleanup)
  2015-03-10  8:20 ` [PATCH v4 0/9] Display IO topology when PXM data is available (plus some cleanup) Jan Beulich
@ 2015-03-10 13:39   ` Boris Ostrovsky
  2015-03-10 13:47     ` Jan Beulich
  0 siblings, 1 reply; 34+ messages in thread
From: Boris Ostrovsky @ 2015-03-10 13:39 UTC (permalink / raw)
  To: Jan Beulich
  Cc: elena.ufimtseva, wei.liu2, ian.campbell, stefano.stabellini,
	andrew.cooper3, dario.faggioli, ian.jackson, xen-devel, keir

On 03/10/2015 04:20 AM, Jan Beulich wrote:
>>>> On 10.03.15 at 03:27, <boris.ostrovsky@oracle.com> wrote:
>> Changes in v4:
>> * Split cputopology and NUMA info changes into separate patches
>> * Added patch#1 (partly because patch#4 needs to know when when distance is invalid,
>>    i.e. NUMA_NO_DISTANCE)
>> * Split sysctl version update into a separate patch
> Why?

Which patch should it go to then? The first that changed the interfaces 
(patch#3) or the second (#4)?

-boris

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v4 7/9] libxl/libxc: Move libxl_get_cpu_topology()'s hypercall buffer management to libxc
  2015-03-10  2:27 ` [PATCH v4 7/9] libxl/libxc: Move libxl_get_cpu_topology()'s hypercall buffer management to libxc Boris Ostrovsky
@ 2015-03-10 13:40   ` Dario Faggioli
  2015-03-11 11:07     ` Ian Campbell
  0 siblings, 1 reply; 34+ messages in thread
From: Dario Faggioli @ 2015-03-10 13:40 UTC (permalink / raw)
  To: boris.ostrovsky
  Cc: elena.ufimtseva, Wei Liu, Ian Campbell, Andrew Cooper, xen-devel,
	Stefano Stabellini, jbeulich, Ian Jackson, Keir (Xen.org)


[-- Attachment #1.1: Type: text/plain, Size: 420 bytes --]

On Mon, 2015-03-09 at 22:27 -0400, Boris Ostrovsky wrote:
> xc_cputopoinfo() is not expected to be used on a hot path and therefore
> hypercall buffer management can be pushed into libxc. This will simplify
> life for callers.
> 
> Also update error reporting macros.
> 
> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
>
Reviewed-by: Dario Faggioli <dario.faggioli@citrix.com>

Regards,
Dario

[-- Attachment #1.2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 181 bytes --]

[-- Attachment #2: Type: text/plain, Size: 126 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v4 8/9] libxl/libxc: Move libxl_get_numainfo()'s hypercall buffer management to libxc
  2015-03-10  2:27 ` [PATCH v4 8/9] libxl/libxc: Move libxl_get_numainfo()'s " Boris Ostrovsky
@ 2015-03-10 13:44   ` Dario Faggioli
  2015-03-11 11:08     ` Ian Campbell
  0 siblings, 1 reply; 34+ messages in thread
From: Dario Faggioli @ 2015-03-10 13:44 UTC (permalink / raw)
  To: boris.ostrovsky
  Cc: elena.ufimtseva, Wei Liu, Ian Campbell, Andrew Cooper, xen-devel,
	Stefano Stabellini, jbeulich, Ian Jackson, Keir (Xen.org)


[-- Attachment #1.1: Type: text/plain, Size: 415 bytes --]

On Mon, 2015-03-09 at 22:27 -0400, Boris Ostrovsky wrote:
> xc_numainfo() is not expected to be used on a hot path and therefore
> hypercall buffer management can be pushed into libxc. This will simplify
> life for callers.
> 
> Also update error logging macros.
> 
> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
>
Reviewed-by: Dario Faggioli <dario.faggioli@citrix.com>

Regards,
Dario

[-- Attachment #1.2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 181 bytes --]

[-- Attachment #2: Type: text/plain, Size: 126 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v4 0/9] Display IO topology when PXM data is available (plus some cleanup)
  2015-03-10 13:39   ` Boris Ostrovsky
@ 2015-03-10 13:47     ` Jan Beulich
  0 siblings, 0 replies; 34+ messages in thread
From: Jan Beulich @ 2015-03-10 13:47 UTC (permalink / raw)
  To: Boris Ostrovsky
  Cc: elena.ufimtseva, wei.liu2, ian.campbell, stefano.stabellini,
	andrew.cooper3, dario.faggioli, ian.jackson, xen-devel, keir

>>> On 10.03.15 at 14:39, <boris.ostrovsky@oracle.com> wrote:
> On 03/10/2015 04:20 AM, Jan Beulich wrote:
>>>>> On 10.03.15 at 03:27, <boris.ostrovsky@oracle.com> wrote:
>>> Changes in v4:
>>> * Split cputopology and NUMA info changes into separate patches
>>> * Added patch#1 (partly because patch#4 needs to know when when distance is invalid,
>>>    i.e. NUMA_NO_DISTANCE)
>>> * Split sysctl version update into a separate patch
>> Why?
> 
> Which patch should it go to then? The first that changed the interfaces 
> (patch#3) or the second (#4)?

Whichever first changes the interface in an incompatible way.

Jan

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v4 1/9] numa: __node_distance() should return u8
  2015-03-10 11:53   ` Andrew Cooper
@ 2015-03-10 13:49     ` Boris Ostrovsky
  0 siblings, 0 replies; 34+ messages in thread
From: Boris Ostrovsky @ 2015-03-10 13:49 UTC (permalink / raw)
  To: Andrew Cooper, jbeulich, keir, ian.jackson, stefano.stabellini,
	ian.campbell, wei.liu2
  Cc: elena.ufimtseva, dario.faggioli, xen-devel

On 03/10/2015 07:53 AM, Andrew Cooper wrote:
> On 10/03/15 02:27, Boris Ostrovsky wrote:
>> SLIT values are byte-sized and some of them (0-9 and 255) have
>> special meaning. Adjust __node_distance() to reflect this and
>> modify scrub_heap_pages() to deal with __node_distance() returning
>> an invalid SLIT entry.
>>
>> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
> You also need to teach XEN_SYSCTL_numainfo about the new NUMA_NO_DISTANCE.

The sysctl is updated later in the series but this may indeed break 
bisection.

>
>> ---
>>   xen/arch/x86/srat.c        |   15 +++++++++++----
>>   xen/common/page_alloc.c    |    4 ++--
>>   xen/include/asm-x86/numa.h |    2 +-
>>   xen/include/xen/numa.h     |    3 ++-
>>   4 files changed, 16 insertions(+), 8 deletions(-)
>>
>> diff --git a/xen/arch/x86/srat.c b/xen/arch/x86/srat.c
>> index dfabba3..aa2eda3 100644
>> --- a/xen/arch/x86/srat.c
>> +++ b/xen/arch/x86/srat.c
>> @@ -496,14 +496,21 @@ static unsigned node_to_pxm(nodeid_t n)
>>   	return 0;
>>   }
>>   
>> -int __node_distance(nodeid_t a, nodeid_t b)
>> +u8 __node_distance(nodeid_t a, nodeid_t b)
>>   {
>> -	int index;
>> +	u8 slit_val;
>>   
>>   	if (!acpi_slit)
>>   		return a == b ? 10 : 20;
>> -	index = acpi_slit->locality_count * node_to_pxm(a);
>> -	return acpi_slit->entry[index + node_to_pxm(b)];
>> +
>> +	slit_val = acpi_slit->entry[acpi_slit->locality_count * node_to_pxm(a) +
>> +				                node_to_pxm(b)];
> This would be easier to read if you kept the old index temporary
> (although making it a u64).

Yes.

-boris

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v4 9/9] libxl: Add interface for querying hypervisor about PCI topology
  2015-03-10  2:27 ` [PATCH v4 9/9] libxl: Add interface for querying hypervisor about PCI topology Boris Ostrovsky
@ 2015-03-10 13:51   ` Dario Faggioli
  2015-03-11 11:10     ` Ian Campbell
  0 siblings, 1 reply; 34+ messages in thread
From: Dario Faggioli @ 2015-03-10 13:51 UTC (permalink / raw)
  To: boris.ostrovsky
  Cc: elena.ufimtseva, Wei Liu, Ian Campbell, Andrew Cooper, xen-devel,
	Stefano Stabellini, jbeulich, Ian Jackson, Keir (Xen.org)


[-- Attachment #1.1: Type: text/plain, Size: 586 bytes --]

On Mon, 2015-03-09 at 22:27 -0400, Boris Ostrovsky wrote:
> .. and use this new interface to display it along with CPU topology
> and NUMA information when 'xl info -n' command is issued
> 
> The output will look like
> ...
> cpu_topology           :
> cpu:    core    socket     node
>   0:       0        0        0
> ...
> device topology        :
> device           node
> 0000:00:00.0      0
> 0000:00:01.0      0
> ...
> 
> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
>
Reviewed-by: Dario Faggioli <dario.faggioli@citrix.com>

Regards,
Dario

[-- Attachment #1.2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 181 bytes --]

[-- Attachment #2: Type: text/plain, Size: 126 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v4 3/9] sysctl: Make XEN_SYSCTL_topologyinfo sysctl a little more efficient
  2015-03-10  2:27 ` [PATCH v4 3/9] sysctl: Make XEN_SYSCTL_topologyinfo sysctl a little more efficient Boris Ostrovsky
@ 2015-03-10 14:29   ` Andrew Cooper
  2015-03-10 15:22     ` Boris Ostrovsky
  2015-03-11 11:04   ` Ian Campbell
  2015-03-13 15:51   ` Jan Beulich
  2 siblings, 1 reply; 34+ messages in thread
From: Andrew Cooper @ 2015-03-10 14:29 UTC (permalink / raw)
  To: Boris Ostrovsky, jbeulich, keir, ian.jackson, stefano.stabellini,
	ian.campbell, wei.liu2
  Cc: elena.ufimtseva, dario.faggioli, xen-devel

On 10/03/15 02:27, Boris Ostrovsky wrote:
> Instead of copying data for each field in xen_sysctl_topologyinfo separately
> put cpu/socket/node into a single structure and do a single copy for each
> processor.
>
> Do not use max_cpu_index, which is almost always used for calculating number
> CPUs (thus requiring adding or subtracting one), replace it with num_cpus.
>
> There is no need to copy whole op in sysctl to user at the end, we only need
> num_cpus.
>
> Rename xen_sysctl_topologyinfo and XEN_SYSCTL_topologyinfo to reflect the fact
> that these are used for CPU topology. Subsequent patch will add support for
> PCI topology sysctl.
>
> Replace INVALID_TOPOLOGY_ID with "XEN_"-prefixed macros for each invalid type
> (core, socket, node).
>
> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>

In principle, a good improvement, but I have some specific issues.

> ---
>
> diff --git a/tools/python/xen/lowlevel/xc/xc.c b/tools/python/xen/lowlevel/xc/xc.c
> index 2aa0dc7..2fd93e0 100644
> --- a/tools/python/xen/lowlevel/xc/xc.c
> +++ b/tools/python/xen/lowlevel/xc/xc.c
> @@ -1375,7 +1365,7 @@ static PyObject *pyxc_numainfo(XcObject *self)
>          for ( j = 0; j <= max_node_index; j++ )
>          {
>              uint32_t dist = nodes_dist[i*(max_node_index+1) + j];
> -            if ( dist == INVALID_TOPOLOGY_ID )
> +            if ( dist == ~0u )
>              {
>                  PyList_Append(node_to_node_dist_obj, Py_None);
>              }

This looks like a spurious hunk (perhaps from patch 9?)

> diff --git a/xen/common/sysctl.c b/xen/common/sysctl.c
> index 0cb6ee1..fe48ee8 100644
> --- a/xen/common/sysctl.c
> +++ b/xen/common/sysctl.c
> @@ -320,39 +320,61 @@ long do_sysctl(XEN_GUEST_HANDLE_PARAM(xen_sysctl_t) u_sysctl)
>      }
>      break;
>  
> -    case XEN_SYSCTL_topologyinfo:
> +    case XEN_SYSCTL_cputopoinfo:
>      {
> -        uint32_t i, max_cpu_index, last_online_cpu;
> -        xen_sysctl_topologyinfo_t *ti = &op->u.topologyinfo;
> +        uint32_t i, num_cpus;
> +        xen_sysctl_cputopoinfo_t *ti = &op->u.cputopoinfo;
>  
> -        last_online_cpu = cpumask_last(&cpu_online_map);
> -        max_cpu_index = min_t(uint32_t, ti->max_cpu_index, last_online_cpu);
> -        ti->max_cpu_index = last_online_cpu;
> +        if ( guest_handle_is_null(ti->cputopo) )
> +        {
> +            ret = -EINVAL;
> +            break;
> +        }

The prevailing hypervisor style is to use a NULL guest handle as an
explicit request for size.  i.e. write back num_cpus in ti and return
success.

>  
> -        for ( i = 0; i <= max_cpu_index; i++ )
> +        num_cpus = cpumask_last(&cpu_online_map) + 1;
> +        if ( ti->num_cpus != num_cpus )
>          {
> -            if ( !guest_handle_is_null(ti->cpu_to_core) )
> +            uint32_t array_sz = ti->num_cpus;
> +
> +            ti->num_cpus = num_cpus;
> +            if ( __copy_field_to_guest(u_sysctl, op,
> +                                       u.cputopoinfo.num_cpus) )
>              {
> -                uint32_t core = cpu_online(i) ? cpu_to_core(i) : ~0u;
> -                if ( copy_to_guest_offset(ti->cpu_to_core, i, &core, 1) )
> -                    break;
> +                ret = -EFAULT;
> +                break;
> +            }
> +            num_cpus = min_t(uint32_t, array_sz, num_cpus);
> +        }
> +
> +        for ( i = 0; i < num_cpus; i++ )
> +        {
> +            xen_sysctl_cputopo_t cputopo;
> +
> +            if ( cpu_present(i) )
> +            {
> +                cputopo.core = cpu_to_core(i);
> +                if ( cputopo.core == BAD_APICID )
> +                    cputopo.core = XEN_INVALID_CORE_ID;
> +                cputopo.socket = cpu_to_socket(i);
> +                if ( cputopo.socket == BAD_APICID )
> +                    cputopo.socket = XEN_INVALID_SOCKET_ID;
> +                cputopo.node = cpu_to_node(i);
> +                if ( cputopo.node == NUMA_NO_NODE )
> +                    cputopo.node = XEN_INVALID_NODE_ID;
>              }
> -            if ( !guest_handle_is_null(ti->cpu_to_socket) )
> +            else
>              {
> -                uint32_t socket = cpu_online(i) ? cpu_to_socket(i) : ~0u;
> -                if ( copy_to_guest_offset(ti->cpu_to_socket, i, &socket, 1) )
> -                    break;
> +                cputopo.core = XEN_INVALID_CORE_ID;
> +                cputopo.socket = XEN_INVALID_SOCKET_ID;
> +                cputopo.node = XEN_INVALID_NODE_ID;
>              }
> -            if ( !guest_handle_is_null(ti->cpu_to_node) )
> +
> +            if ( copy_to_guest_offset(ti->cputopo, i, &cputopo, 1) )
>              {
> -                uint32_t node = cpu_online(i) ? cpu_to_node(i) : ~0u;
> -                if ( copy_to_guest_offset(ti->cpu_to_node, i, &node, 1) )
> -                    break;
> +                ret = -EFAULT;
> +                break;
>              }
>          }
> -
> -        ret = ((i <= max_cpu_index) || copy_to_guest(u_sysctl, op, 1))
> -            ? -EFAULT : 0;
>      }
>      break;
>  
> diff --git a/xen/include/public/sysctl.h b/xen/include/public/sysctl.h
> index 8552dc6..f20da69 100644
> --- a/xen/include/public/sysctl.h
> +++ b/xen/include/public/sysctl.h
> @@ -462,31 +462,37 @@ struct xen_sysctl_lockprof_op {
>  typedef struct xen_sysctl_lockprof_op xen_sysctl_lockprof_op_t;
>  DEFINE_XEN_GUEST_HANDLE(xen_sysctl_lockprof_op_t);
>  
> -/* XEN_SYSCTL_topologyinfo */
> -#define INVALID_TOPOLOGY_ID  (~0U)
> -struct xen_sysctl_topologyinfo {
> +/* XEN_SYSCTL_cputopoinfo */
> +#define XEN_INVALID_CORE_ID     (~0U)
> +#define XEN_INVALID_SOCKET_ID   (~0U)
> +#define XEN_INVALID_NODE_ID     ((uint8_t)~0)
> +
> +struct xen_sysctl_cputopo {
> +    uint32_t core;
> +    uint32_t socket;
> +    uint8_t node;
> +};
> +typedef struct xen_sysctl_cputopo xen_sysctl_cputopo_t;
> +DEFINE_XEN_GUEST_HANDLE(xen_sysctl_cputopo_t);
> +
> +struct xen_sysctl_cputopoinfo {
>      /*
> -     * IN: maximum addressable entry in the caller-provided arrays.
> -     * OUT: largest cpu identifier in the system.
> -     * If OUT is greater than IN then the arrays are truncated!
> -     * If OUT is leass than IN then the array tails are not written by sysctl.
> +     * IN: size of caller-provided cputopo array.
> +     * OUT: Number of CPUs in the system.
>       */
> -    uint32_t max_cpu_index;
> +    uint32_t num_cpus;
>  
>      /*
> -     * If not NULL, these arrays are filled with core/socket/node identifier
> -     * for each cpu.
> -     * If a cpu has no core/socket/node information (e.g., cpu not present) 
> -     * then the sentinel value ~0u is written to each array.
> -     * The number of array elements written by the sysctl is:
> -     *   min(@max_cpu_index_IN,@max_cpu_index_OUT)+1
> +     * If not NULL, filled with core/socket/node identifier for each cpu
> +     * If information for a particular entry is not avalable it is set to
> +     * XEN_INVALID_<xxx>_ID.
> +     * The number of array elements for CPU topology written by the sysctl is:
> +     *   min(@num_cpus_IN,@num_cpus_OUT)+1.

This is also a problematic interface as it clobbers the actual length of
the valid array.  strace/valgrind have no way of validating the state
after the hypercall, and the user of the interface has to remember the
number they passed in to start with.

It is also prevailing style to return the number of entries actually
written to, and fail the hypercall with -ENOBUFS if there is
insufficient space in the destination array.

See the implementation of XEN_DOMCTL_get_vcpu_msrs as an example which
handles each of these corner cases.

~Andrew

>       */
> -    XEN_GUEST_HANDLE_64(uint32) cpu_to_core;
> -    XEN_GUEST_HANDLE_64(uint32) cpu_to_socket;
> -    XEN_GUEST_HANDLE_64(uint32) cpu_to_node;
> +    XEN_GUEST_HANDLE_64(xen_sysctl_cputopo_t) cputopo;
>  };
> -typedef struct xen_sysctl_topologyinfo xen_sysctl_topologyinfo_t;
> -DEFINE_XEN_GUEST_HANDLE(xen_sysctl_topologyinfo_t);
> +typedef struct xen_sysctl_cputopoinfo xen_sysctl_cputopoinfo_t;
> +DEFINE_XEN_GUEST_HANDLE(xen_sysctl_cputopoinfo_t);
>  
>  /* XEN_SYSCTL_numainfo */
>  #define INVALID_NUMAINFO_ID (~0U)
> @@ -672,7 +678,7 @@ struct xen_sysctl {
>  #define XEN_SYSCTL_pm_op                         12
>  #define XEN_SYSCTL_page_offline_op               14
>  #define XEN_SYSCTL_lockprof_op                   15
> -#define XEN_SYSCTL_topologyinfo                  16 
> +#define XEN_SYSCTL_cputopoinfo                   16
>  #define XEN_SYSCTL_numainfo                      17
>  #define XEN_SYSCTL_cpupool_op                    18
>  #define XEN_SYSCTL_scheduler_op                  19
> @@ -683,7 +689,7 @@ struct xen_sysctl {
>          struct xen_sysctl_readconsole       readconsole;
>          struct xen_sysctl_tbuf_op           tbuf_op;
>          struct xen_sysctl_physinfo          physinfo;
> -        struct xen_sysctl_topologyinfo      topologyinfo;
> +        struct xen_sysctl_cputopoinfo       cputopoinfo;
>          struct xen_sysctl_numainfo          numainfo;
>          struct xen_sysctl_sched_id          sched_id;
>          struct xen_sysctl_perfc_op          perfc_op;

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v4 4/9] sysctl: Make XEN_SYSCTL_numainfo a little more efficient
  2015-03-10  2:27 ` [PATCH v4 4/9] sysctl: Make XEN_SYSCTL_numainfo " Boris Ostrovsky
@ 2015-03-10 14:34   ` Andrew Cooper
  2015-03-11 11:06   ` Ian Campbell
  2015-03-13 15:55   ` Jan Beulich
  2 siblings, 0 replies; 34+ messages in thread
From: Andrew Cooper @ 2015-03-10 14:34 UTC (permalink / raw)
  To: Boris Ostrovsky, jbeulich, keir, ian.jackson, stefano.stabellini,
	ian.campbell, wei.liu2
  Cc: elena.ufimtseva, dario.faggioli, xen-devel

On 10/03/15 02:27, Boris Ostrovsky wrote:
> Make sysctl NUMA topology query use fewer copies by combining some
> fields into a single structure and copying distances for each node
> in a single copy.
>
> Instead of using max_node_index for passing number of nodes keep this
> value in num_nodes: almost all uses of max_node_index required adding
> or subtracting one to eventually get to number of nodes anyway.
>
> Replace INVALID_NUMAINFO_ID with XEN_INVALID_MEM_SZ and add
> XEN_INVALID_NODE_DIST.
>
> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>

I have the same concerns with the interface as patch 3, but other comments.

~Andrew

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v4 5/9] sysctl: Add sysctl interface for querying PCI topology
  2015-03-10  2:27 ` [PATCH v4 5/9] sysctl: Add sysctl interface for querying PCI topology Boris Ostrovsky
@ 2015-03-10 14:54   ` Andrew Cooper
  2015-03-13 16:03   ` Jan Beulich
  1 sibling, 0 replies; 34+ messages in thread
From: Andrew Cooper @ 2015-03-10 14:54 UTC (permalink / raw)
  To: Boris Ostrovsky, jbeulich, keir, ian.jackson, stefano.stabellini,
	ian.campbell, wei.liu2
  Cc: elena.ufimtseva, dario.faggioli, xen-devel

On 10/03/15 02:27, Boris Ostrovsky wrote:
> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
> ---
>
> Changes in v4:
> * No buffer allocation in sysctl, copy data device-by-device
> * Replace INVALID_TOPOLOGY_ID with XEN_INVALID_NODE_ID
>
>  xen/common/sysctl.c         |   59 +++++++++++++++++++++++++++++++++++++++++++
>  xen/include/public/sysctl.h |   29 +++++++++++++++++++++
>  2 files changed, 88 insertions(+), 0 deletions(-)
>
> diff --git a/xen/common/sysctl.c b/xen/common/sysctl.c
> index 2d11a76..9cd6321 100644
> --- a/xen/common/sysctl.c
> +++ b/xen/common/sysctl.c
> @@ -387,7 +387,66 @@ long do_sysctl(XEN_GUEST_HANDLE_PARAM(xen_sysctl_t) u_sysctl)
>          }
>      }
>      break;

Blank line here.

> +#ifdef HAS_PCI
> +    case XEN_SYSCTL_pcitopoinfo:

Please try and keep the SYSCTL implementations in numerical order, which
should put this new block below the pcr and coverage blocks.

> +    {
> +        xen_sysctl_pcitopoinfo_t *ti = &op->u.pcitopoinfo;
> +
> +        if ( guest_handle_is_null(ti->devs) ||
> +             guest_handle_is_null(ti->nodes) ||
> +             (ti->first_dev > ti->num_devs) )
> +        {
> +            ret = -EINVAL;
> +            break;
> +        }
> +
> +        for ( ; ti->first_dev < ti->num_devs; ti->first_dev++ )
> +        {
> +            physdev_pci_device_t dev;
> +            uint8_t node;
> +            struct pci_dev *pdev;
> +
> +            if ( copy_from_guest_offset(&dev, ti->devs, ti->first_dev, 1) )
> +            {
> +                ret = -EFAULT;
> +                break;
> +            }
> +
> +            spin_lock(&pcidevs_lock);
> +            pdev = pci_get_pdev(dev.seg, dev.bus, dev.devfn);
> +            if ( !pdev || (pdev->node == NUMA_NO_NODE) )
> +                node = XEN_INVALID_NODE_ID;
> +            else
> +                node = pdev->node;
> +            spin_unlock(&pcidevs_lock);
> +
> +            if ( copy_to_guest_offset(ti->nodes, ti->first_dev, &node, 1) )
> +            {
> +                ret = -EFAULT;
> +                break;
> +            }
>  
> +            if ( hypercall_preempt_check() )
> +                break;
> +        }
> +
> +        if ( !ret )
> +        {
> +            ti->first_dev++;
> +
> +            if ( __copy_field_to_guest(u_sysctl, op, u.pcitopoinfo.first_dev) )
> +            {
> +                ret = -EFAULT;
> +                break;
> +            }
> +
> +            if ( ti->first_dev < ti->num_devs )
> +                ret = hypercall_create_continuation(__HYPERVISOR_sysctl,
> +                                                    "h", u_sysctl);
> +        }
> +    }
> +    break;
> +#endif

And here.


>  #ifdef TEST_COVERAGE
>      case XEN_SYSCTL_coverage_op:
>          ret = sysctl_coverage_op(&op->u.coverage_op);
> diff --git a/xen/include/public/sysctl.h b/xen/include/public/sysctl.h
> index c544c76..a224951 100644
> --- a/xen/include/public/sysctl.h
> +++ b/xen/include/public/sysctl.h
> @@ -33,6 +33,7 @@
>  
>  #include "xen.h"
>  #include "domctl.h"
> +#include "physdev.h"
>  
>  #define XEN_SYSCTL_INTERFACE_VERSION 0x0000000B
>  
> @@ -494,6 +495,32 @@ struct xen_sysctl_cputopoinfo {
>  typedef struct xen_sysctl_cputopoinfo xen_sysctl_cputopoinfo_t;
>  DEFINE_XEN_GUEST_HANDLE(xen_sysctl_cputopoinfo_t);
>  
> +/* XEN_SYSCTL_pcitopoinfo */
> +struct xen_sysctl_pcitopoinfo {
> +    /* IN: Size of pcitopo array */

Number of elements (as the two arrays are different sized structures) of
both the devs and nodes array.  It is not sensible for a caller to ever
pass arrays of different lengths.

> +    uint32_t num_devs;
> +
> +    /*
> +     * IN/OUT: First element of pcitopo array that needs to be processed by
> +     * hypervisor.
> +     * This is used primarily by hypercall continuations and callers will
> +     * typically set it to zero.

I would word this more strongly and state that callers must set this to
0 and that it is an internal implementation detail of Xen.

Otherwise, Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>

> +     */
> +    uint32_t first_dev;
> +
> +    /* IN: list of devices */
> +    XEN_GUEST_HANDLE_64(physdev_pci_device_t) devs;
> +
> +    /*
> +     * OUT: node identifier for each device.
> +     * If information for a particular device is not avalable then set
> +     * to XEN_INVALID_NODE_ID.
> +     */
> +    XEN_GUEST_HANDLE_64(uint8) nodes;
> +};
> +typedef struct xen_sysctl_pcitopoinfo xen_sysctl_pcitopoinfo_t;
> +DEFINE_XEN_GUEST_HANDLE(xen_sysctl_pcitopoinfo_t);
> +
>  /* XEN_SYSCTL_numainfo */
>  #define XEN_INVALID_MEM_SZ     (~0U)
>  #define XEN_INVALID_NODE_DIST  ((uint8_t)~0)
> @@ -694,12 +721,14 @@ struct xen_sysctl {
>  #define XEN_SYSCTL_scheduler_op                  19
>  #define XEN_SYSCTL_coverage_op                   20
>  #define XEN_SYSCTL_psr_cmt_op                    21
> +#define XEN_SYSCTL_pcitopoinfo                   22
>      uint32_t interface_version; /* XEN_SYSCTL_INTERFACE_VERSION */
>      union {
>          struct xen_sysctl_readconsole       readconsole;
>          struct xen_sysctl_tbuf_op           tbuf_op;
>          struct xen_sysctl_physinfo          physinfo;
>          struct xen_sysctl_cputopoinfo       cputopoinfo;
> +        struct xen_sysctl_pcitopoinfo       pcitopoinfo;
>          struct xen_sysctl_numainfo          numainfo;
>          struct xen_sysctl_sched_id          sched_id;
>          struct xen_sysctl_perfc_op          perfc_op;

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v4 3/9] sysctl: Make XEN_SYSCTL_topologyinfo sysctl a little more efficient
  2015-03-10 14:29   ` Andrew Cooper
@ 2015-03-10 15:22     ` Boris Ostrovsky
  0 siblings, 0 replies; 34+ messages in thread
From: Boris Ostrovsky @ 2015-03-10 15:22 UTC (permalink / raw)
  To: Andrew Cooper, jbeulich, keir, ian.jackson, stefano.stabellini,
	ian.campbell, wei.liu2
  Cc: elena.ufimtseva, dario.faggioli, xen-devel

On 03/10/2015 10:29 AM, Andrew Cooper wrote:
> On 10/03/15 02:27, Boris Ostrovsky wrote:
>> Instead of copying data for each field in xen_sysctl_topologyinfo separately
>> put cpu/socket/node into a single structure and do a single copy for each
>> processor.
>>
>> Do not use max_cpu_index, which is almost always used for calculating number
>> CPUs (thus requiring adding or subtracting one), replace it with num_cpus.
>>
>> There is no need to copy whole op in sysctl to user at the end, we only need
>> num_cpus.
>>
>> Rename xen_sysctl_topologyinfo and XEN_SYSCTL_topologyinfo to reflect the fact
>> that these are used for CPU topology. Subsequent patch will add support for
>> PCI topology sysctl.
>>
>> Replace INVALID_TOPOLOGY_ID with "XEN_"-prefixed macros for each invalid type
>> (core, socket, node).
>>
>> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
> In principle, a good improvement, but I have some specific issues.
>
>> ---
>>
>> diff --git a/tools/python/xen/lowlevel/xc/xc.c b/tools/python/xen/lowlevel/xc/xc.c
>> index 2aa0dc7..2fd93e0 100644
>> --- a/tools/python/xen/lowlevel/xc/xc.c
>> +++ b/tools/python/xen/lowlevel/xc/xc.c
>> @@ -1375,7 +1365,7 @@ static PyObject *pyxc_numainfo(XcObject *self)
>>           for ( j = 0; j <= max_node_index; j++ )
>>           {
>>               uint32_t dist = nodes_dist[i*(max_node_index+1) + j];
>> -            if ( dist == INVALID_TOPOLOGY_ID )
>> +            if ( dist == ~0u )
>>               {
>>                   PyList_Append(node_to_node_dist_obj, Py_None);
>>               }
> This looks like a spurious hunk (perhaps from patch 9?)

No, this is sort of an intermediate step: INVALID_TOPOLOGY_ID is no 
longer defined and a new macro for distance will show up in the next patch.

And, in fact, it's wrong anyway since the sysctl never sets distance to 
INVALID_TOPOLOGY_ID, it sets it explicitly to ~0u. It just so happens 
that INVALID_TOPOLOGY_ID is also ~0U.

I thought about moving macro definition into this patch but then 
logically it wouldn't belong here so I figured I'd do what the 
hypervisor does, which is use a constant. Until the next patch.


>
>> diff --git a/xen/common/sysctl.c b/xen/common/sysctl.c
>> index 0cb6ee1..fe48ee8 100644
>> --- a/xen/common/sysctl.c
>> +++ b/xen/common/sysctl.c
>> @@ -320,39 +320,61 @@ long do_sysctl(XEN_GUEST_HANDLE_PARAM(xen_sysctl_t) u_sysctl)
>>       }
>>       break;
>>   
>> -    case XEN_SYSCTL_topologyinfo:
>> +    case XEN_SYSCTL_cputopoinfo:
>>       {
>> -        uint32_t i, max_cpu_index, last_online_cpu;
>> -        xen_sysctl_topologyinfo_t *ti = &op->u.topologyinfo;
>> +        uint32_t i, num_cpus;
>> +        xen_sysctl_cputopoinfo_t *ti = &op->u.cputopoinfo;
>>   
>> -        last_online_cpu = cpumask_last(&cpu_online_map);
>> -        max_cpu_index = min_t(uint32_t, ti->max_cpu_index, last_online_cpu);
>> -        ti->max_cpu_index = last_online_cpu;
>> +        if ( guest_handle_is_null(ti->cputopo) )
>> +        {
>> +            ret = -EINVAL;
>> +            break;
>> +        }
> The prevailing hypervisor style is to use a NULL guest handle as an
> explicit request for size.  i.e. write back num_cpus in ti and return
> success.

Yes, this would look better from the interface POV.

-boris

>
>>   
>> -        for ( i = 0; i <= max_cpu_index; i++ )
>> +        num_cpus = cpumask_last(&cpu_online_map) + 1;
>> +        if ( ti->num_cpus != num_cpus )
>>           {
>> -            if ( !guest_handle_is_null(ti->cpu_to_core) )
>> +            uint32_t array_sz = ti->num_cpus;
>> +
>> +            ti->num_cpus = num_cpus;
>> +            if ( __copy_field_to_guest(u_sysctl, op,
>> +                                       u.cputopoinfo.num_cpus) )
>>               {
>> -                uint32_t core = cpu_online(i) ? cpu_to_core(i) : ~0u;
>> -                if ( copy_to_guest_offset(ti->cpu_to_core, i, &core, 1) )
>> -                    break;
>> +                ret = -EFAULT;
>> +                break;
>> +            }
>> +            num_cpus = min_t(uint32_t, array_sz, num_cpus);
>> +        }
>> +
>> +        for ( i = 0; i < num_cpus; i++ )
>> +        {
>> +            xen_sysctl_cputopo_t cputopo;
>> +
>> +            if ( cpu_present(i) )
>> +            {
>> +                cputopo.core = cpu_to_core(i);
>> +                if ( cputopo.core == BAD_APICID )
>> +                    cputopo.core = XEN_INVALID_CORE_ID;
>> +                cputopo.socket = cpu_to_socket(i);
>> +                if ( cputopo.socket == BAD_APICID )
>> +                    cputopo.socket = XEN_INVALID_SOCKET_ID;
>> +                cputopo.node = cpu_to_node(i);
>> +                if ( cputopo.node == NUMA_NO_NODE )
>> +                    cputopo.node = XEN_INVALID_NODE_ID;
>>               }
>> -            if ( !guest_handle_is_null(ti->cpu_to_socket) )
>> +            else
>>               {
>> -                uint32_t socket = cpu_online(i) ? cpu_to_socket(i) : ~0u;
>> -                if ( copy_to_guest_offset(ti->cpu_to_socket, i, &socket, 1) )
>> -                    break;
>> +                cputopo.core = XEN_INVALID_CORE_ID;
>> +                cputopo.socket = XEN_INVALID_SOCKET_ID;
>> +                cputopo.node = XEN_INVALID_NODE_ID;
>>               }
>> -            if ( !guest_handle_is_null(ti->cpu_to_node) )
>> +
>> +            if ( copy_to_guest_offset(ti->cputopo, i, &cputopo, 1) )
>>               {
>> -                uint32_t node = cpu_online(i) ? cpu_to_node(i) : ~0u;
>> -                if ( copy_to_guest_offset(ti->cpu_to_node, i, &node, 1) )
>> -                    break;
>> +                ret = -EFAULT;
>> +                break;
>>               }
>>           }
>> -
>> -        ret = ((i <= max_cpu_index) || copy_to_guest(u_sysctl, op, 1))
>> -            ? -EFAULT : 0;
>>       }
>>       break;
>>   
>> diff --git a/xen/include/public/sysctl.h b/xen/include/public/sysctl.h
>> index 8552dc6..f20da69 100644
>> --- a/xen/include/public/sysctl.h
>> +++ b/xen/include/public/sysctl.h
>> @@ -462,31 +462,37 @@ struct xen_sysctl_lockprof_op {
>>   typedef struct xen_sysctl_lockprof_op xen_sysctl_lockprof_op_t;
>>   DEFINE_XEN_GUEST_HANDLE(xen_sysctl_lockprof_op_t);
>>   
>> -/* XEN_SYSCTL_topologyinfo */
>> -#define INVALID_TOPOLOGY_ID  (~0U)
>> -struct xen_sysctl_topologyinfo {
>> +/* XEN_SYSCTL_cputopoinfo */
>> +#define XEN_INVALID_CORE_ID     (~0U)
>> +#define XEN_INVALID_SOCKET_ID   (~0U)
>> +#define XEN_INVALID_NODE_ID     ((uint8_t)~0)
>> +
>> +struct xen_sysctl_cputopo {
>> +    uint32_t core;
>> +    uint32_t socket;
>> +    uint8_t node;
>> +};
>> +typedef struct xen_sysctl_cputopo xen_sysctl_cputopo_t;
>> +DEFINE_XEN_GUEST_HANDLE(xen_sysctl_cputopo_t);
>> +
>> +struct xen_sysctl_cputopoinfo {
>>       /*
>> -     * IN: maximum addressable entry in the caller-provided arrays.
>> -     * OUT: largest cpu identifier in the system.
>> -     * If OUT is greater than IN then the arrays are truncated!
>> -     * If OUT is leass than IN then the array tails are not written by sysctl.
>> +     * IN: size of caller-provided cputopo array.
>> +     * OUT: Number of CPUs in the system.
>>        */
>> -    uint32_t max_cpu_index;
>> +    uint32_t num_cpus;
>>   
>>       /*
>> -     * If not NULL, these arrays are filled with core/socket/node identifier
>> -     * for each cpu.
>> -     * If a cpu has no core/socket/node information (e.g., cpu not present)
>> -     * then the sentinel value ~0u is written to each array.
>> -     * The number of array elements written by the sysctl is:
>> -     *   min(@max_cpu_index_IN,@max_cpu_index_OUT)+1
>> +     * If not NULL, filled with core/socket/node identifier for each cpu
>> +     * If information for a particular entry is not avalable it is set to
>> +     * XEN_INVALID_<xxx>_ID.
>> +     * The number of array elements for CPU topology written by the sysctl is:
>> +     *   min(@num_cpus_IN,@num_cpus_OUT)+1.
> This is also a problematic interface as it clobbers the actual length of
> the valid array.  strace/valgrind have no way of validating the state
> after the hypercall, and the user of the interface has to remember the
> number they passed in to start with.
>
> It is also prevailing style to return the number of entries actually
> written to, and fail the hypercall with -ENOBUFS if there is
> insufficient space in the destination array.
>
> See the implementation of XEN_DOMCTL_get_vcpu_msrs as an example which
> handles each of these corner cases.
>
> ~Andrew
>
>>        */
>> -    XEN_GUEST_HANDLE_64(uint32) cpu_to_core;
>> -    XEN_GUEST_HANDLE_64(uint32) cpu_to_socket;
>> -    XEN_GUEST_HANDLE_64(uint32) cpu_to_node;
>> +    XEN_GUEST_HANDLE_64(xen_sysctl_cputopo_t) cputopo;
>>   };
>> -typedef struct xen_sysctl_topologyinfo xen_sysctl_topologyinfo_t;
>> -DEFINE_XEN_GUEST_HANDLE(xen_sysctl_topologyinfo_t);
>> +typedef struct xen_sysctl_cputopoinfo xen_sysctl_cputopoinfo_t;
>> +DEFINE_XEN_GUEST_HANDLE(xen_sysctl_cputopoinfo_t);
>>   
>>   /* XEN_SYSCTL_numainfo */
>>   #define INVALID_NUMAINFO_ID (~0U)
>> @@ -672,7 +678,7 @@ struct xen_sysctl {
>>   #define XEN_SYSCTL_pm_op                         12
>>   #define XEN_SYSCTL_page_offline_op               14
>>   #define XEN_SYSCTL_lockprof_op                   15
>> -#define XEN_SYSCTL_topologyinfo                  16
>> +#define XEN_SYSCTL_cputopoinfo                   16
>>   #define XEN_SYSCTL_numainfo                      17
>>   #define XEN_SYSCTL_cpupool_op                    18
>>   #define XEN_SYSCTL_scheduler_op                  19
>> @@ -683,7 +689,7 @@ struct xen_sysctl {
>>           struct xen_sysctl_readconsole       readconsole;
>>           struct xen_sysctl_tbuf_op           tbuf_op;
>>           struct xen_sysctl_physinfo          physinfo;
>> -        struct xen_sysctl_topologyinfo      topologyinfo;
>> +        struct xen_sysctl_cputopoinfo       cputopoinfo;
>>           struct xen_sysctl_numainfo          numainfo;
>>           struct xen_sysctl_sched_id          sched_id;
>>           struct xen_sysctl_perfc_op          perfc_op;
>

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v4 3/9] sysctl: Make XEN_SYSCTL_topologyinfo sysctl a little more efficient
  2015-03-10  2:27 ` [PATCH v4 3/9] sysctl: Make XEN_SYSCTL_topologyinfo sysctl a little more efficient Boris Ostrovsky
  2015-03-10 14:29   ` Andrew Cooper
@ 2015-03-11 11:04   ` Ian Campbell
  2015-03-11 12:02     ` Boris Ostrovsky
  2015-03-13 15:51   ` Jan Beulich
  2 siblings, 1 reply; 34+ messages in thread
From: Ian Campbell @ 2015-03-11 11:04 UTC (permalink / raw)
  To: Boris Ostrovsky
  Cc: elena.ufimtseva, wei.liu2, stefano.stabellini, andrew.cooper3,
	dario.faggioli, ian.jackson, xen-devel, jbeulich, keir

On Mon, 2015-03-09 at 22:27 -0400, Boris Ostrovsky wrote:
> Instead of copying data for each field in xen_sysctl_topologyinfo separately
> put cpu/socket/node into a single structure and do a single copy for each
> processor.
> 
> Do not use max_cpu_index, which is almost always used for calculating number
> CPUs (thus requiring adding or subtracting one), replace it with num_cpus.
> 
> There is no need to copy whole op in sysctl to user at the end, we only need
> num_cpus.
> 
> Rename xen_sysctl_topologyinfo and XEN_SYSCTL_topologyinfo to reflect the fact
> that these are used for CPU topology. Subsequent patch will add support for
> PCI topology sysctl.
> 
> Replace INVALID_TOPOLOGY_ID with "XEN_"-prefixed macros for each invalid type
> (core, socket, node).
> 
> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
> ---
> 
> Changes in v4:
> * Split v3's patch into two --- one for CPU topology and one for NUMA info

I think this means this is now back to how v2 looked, in which case you
may feel free to reinstate my ack. I only glanced through this version
but it looks ok. Let me know if it is actually different to v2 and I'll
have a closer look.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v4 4/9] sysctl: Make XEN_SYSCTL_numainfo a little more efficient
  2015-03-10  2:27 ` [PATCH v4 4/9] sysctl: Make XEN_SYSCTL_numainfo " Boris Ostrovsky
  2015-03-10 14:34   ` Andrew Cooper
@ 2015-03-11 11:06   ` Ian Campbell
  2015-03-13 15:55   ` Jan Beulich
  2 siblings, 0 replies; 34+ messages in thread
From: Ian Campbell @ 2015-03-11 11:06 UTC (permalink / raw)
  To: Boris Ostrovsky
  Cc: elena.ufimtseva, wei.liu2, stefano.stabellini, andrew.cooper3,
	dario.faggioli, ian.jackson, xen-devel, jbeulich, keir

On Mon, 2015-03-09 at 22:27 -0400, Boris Ostrovsky wrote:
> Make sysctl NUMA topology query use fewer copies by combining some
> fields into a single structure and copying distances for each node
> in a single copy.
> 
> Instead of using max_node_index for passing number of nodes keep this
> value in num_nodes: almost all uses of max_node_index required adding
> or subtracting one to eventually get to number of nodes anyway.
> 
> Replace INVALID_NUMAINFO_ID with XEN_INVALID_MEM_SZ and add
> XEN_INVALID_NODE_DIST.
> 
> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>

Assuming the h/v guys are happy with the interface the tools side looks
like a valid wrapper to it:
        Acked-by: Ian Campbell <ian.campbell@citrix.com>
        
Ian

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v4 7/9] libxl/libxc: Move libxl_get_cpu_topology()'s hypercall buffer management to libxc
  2015-03-10 13:40   ` Dario Faggioli
@ 2015-03-11 11:07     ` Ian Campbell
  0 siblings, 0 replies; 34+ messages in thread
From: Ian Campbell @ 2015-03-11 11:07 UTC (permalink / raw)
  To: Dario Faggioli
  Cc: elena.ufimtseva, Wei Liu, Andrew Cooper, xen-devel,
	Stefano Stabellini, jbeulich, Ian Jackson, boris.ostrovsky,
	Keir (Xen.org)

On Tue, 2015-03-10 at 13:40 +0000, Dario Faggioli wrote:
> On Mon, 2015-03-09 at 22:27 -0400, Boris Ostrovsky wrote:
> > xc_cputopoinfo() is not expected to be used on a hot path and therefore
> > hypercall buffer management can be pushed into libxc. This will simplify
> > life for callers.
> > 
> > Also update error reporting macros.
> > 
> > Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
> >
> Reviewed-by: Dario Faggioli <dario.faggioli@citrix.com>

Acked-by: Ian Campbell <ian.campbell@citrix.com>

(without close review of my own, thanks Dario).

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v4 8/9] libxl/libxc: Move libxl_get_numainfo()'s hypercall buffer management to libxc
  2015-03-10 13:44   ` Dario Faggioli
@ 2015-03-11 11:08     ` Ian Campbell
  0 siblings, 0 replies; 34+ messages in thread
From: Ian Campbell @ 2015-03-11 11:08 UTC (permalink / raw)
  To: Dario Faggioli
  Cc: elena.ufimtseva, Wei Liu, Andrew Cooper, xen-devel,
	Stefano Stabellini, jbeulich, Ian Jackson, boris.ostrovsky,
	Keir (Xen.org)

On Tue, 2015-03-10 at 13:44 +0000, Dario Faggioli wrote:
> On Mon, 2015-03-09 at 22:27 -0400, Boris Ostrovsky wrote:
> > xc_numainfo() is not expected to be used on a hot path and therefore
> > hypercall buffer management can be pushed into libxc. This will simplify
> > life for callers.
> > 
> > Also update error logging macros.
> > 
> > Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
> >
> Reviewed-by: Dario Faggioli <dario.faggioli@citrix.com>

Acked-by: Ian Campbell <ian.campbell@citrix.com>

(again without close review, thanks Dario)

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v4 9/9] libxl: Add interface for querying hypervisor about PCI topology
  2015-03-10 13:51   ` Dario Faggioli
@ 2015-03-11 11:10     ` Ian Campbell
  0 siblings, 0 replies; 34+ messages in thread
From: Ian Campbell @ 2015-03-11 11:10 UTC (permalink / raw)
  To: Dario Faggioli
  Cc: elena.ufimtseva, Wei Liu, Andrew Cooper, xen-devel,
	Stefano Stabellini, jbeulich, Ian Jackson, boris.ostrovsky,
	Keir (Xen.org)

On Tue, 2015-03-10 at 13:51 +0000, Dario Faggioli wrote:
> On Mon, 2015-03-09 at 22:27 -0400, Boris Ostrovsky wrote:
> > .. and use this new interface to display it along with CPU topology
> > and NUMA information when 'xl info -n' command is issued
> > 
> > The output will look like
> > ...
> > cpu_topology           :
> > cpu:    core    socket     node
> >   0:       0        0        0
> > ...
> > device topology        :
> > device           node
> > 0000:00:00.0      0
> > 0000:00:01.0      0
> > ...
> > 
> > Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
> >
> Reviewed-by: Dario Faggioli <dario.faggioli@citrix.com>

Acked-by: Ian Campbell <ian.campbell@citrix.com>

(I only looked closely at the libxl API changes)

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v4 3/9] sysctl: Make XEN_SYSCTL_topologyinfo sysctl a little more efficient
  2015-03-11 11:04   ` Ian Campbell
@ 2015-03-11 12:02     ` Boris Ostrovsky
  2015-03-11 12:24       ` Ian Campbell
  0 siblings, 1 reply; 34+ messages in thread
From: Boris Ostrovsky @ 2015-03-11 12:02 UTC (permalink / raw)
  To: Ian Campbell
  Cc: elena.ufimtseva, wei.liu2, stefano.stabellini, andrew.cooper3,
	dario.faggioli, ian.jackson, xen-devel, jbeulich, keir

On 03/11/2015 07:04 AM, Ian Campbell wrote:
> On Mon, 2015-03-09 at 22:27 -0400, Boris Ostrovsky wrote:
>> Instead of copying data for each field in xen_sysctl_topologyinfo separately
>> put cpu/socket/node into a single structure and do a single copy for each
>> processor.
>>
>> Do not use max_cpu_index, which is almost always used for calculating number
>> CPUs (thus requiring adding or subtracting one), replace it with num_cpus.
>>
>> There is no need to copy whole op in sysctl to user at the end, we only need
>> num_cpus.
>>
>> Rename xen_sysctl_topologyinfo and XEN_SYSCTL_topologyinfo to reflect the fact
>> that these are used for CPU topology. Subsequent patch will add support for
>> PCI topology sysctl.
>>
>> Replace INVALID_TOPOLOGY_ID with "XEN_"-prefixed macros for each invalid type
>> (core, socket, node).
>>
>> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
>> ---
>>
>> Changes in v4:
>> * Split v3's patch into two --- one for CPU topology and one for NUMA info
> I think this means this is now back to how v2 looked, in which case you
> may feel free to reinstate my ack. I only glanced through this version
> but it looks ok. Let me know if it is actually different to v2 and I'll
> have a closer look.


Yes, this is close to v2. The differences are that we are now sizing 
everything with max number of CPUs vs max CPU index (so a review for 
off-by-one errors would be useful) plus new INVALID macros that resulted 
in some (fairly minor) changes.

However, Andrew asked for more changes in sysctl implementation that may 
affect the interface (for both this and NUMA patches) so it's probably 
better to wait until v5.


-boris

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v4 3/9] sysctl: Make XEN_SYSCTL_topologyinfo sysctl a little more efficient
  2015-03-11 12:02     ` Boris Ostrovsky
@ 2015-03-11 12:24       ` Ian Campbell
  0 siblings, 0 replies; 34+ messages in thread
From: Ian Campbell @ 2015-03-11 12:24 UTC (permalink / raw)
  To: Boris Ostrovsky
  Cc: elena.ufimtseva, wei.liu2, stefano.stabellini, andrew.cooper3,
	dario.faggioli, ian.jackson, xen-devel, jbeulich, keir

On Wed, 2015-03-11 at 08:02 -0400, Boris Ostrovsky wrote:
> it's probably better to wait until v5.

Done!

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v4 3/9] sysctl: Make XEN_SYSCTL_topologyinfo sysctl a little more efficient
  2015-03-10  2:27 ` [PATCH v4 3/9] sysctl: Make XEN_SYSCTL_topologyinfo sysctl a little more efficient Boris Ostrovsky
  2015-03-10 14:29   ` Andrew Cooper
  2015-03-11 11:04   ` Ian Campbell
@ 2015-03-13 15:51   ` Jan Beulich
  2 siblings, 0 replies; 34+ messages in thread
From: Jan Beulich @ 2015-03-13 15:51 UTC (permalink / raw)
  To: Boris Ostrovsky
  Cc: elena.ufimtseva, wei.liu2, ian.campbell, stefano.stabellini,
	andrew.cooper3, dario.faggioli, ian.jackson, xen-devel, keir

>>> On 10.03.15 at 03:27, <boris.ostrovsky@oracle.com> wrote:
> +struct xen_sysctl_cputopo {
> +    uint32_t core;
> +    uint32_t socket;
> +    uint8_t node;
> +};

Let's avoid me asking for explicit padding as well a future problems
(it's sysctl and hence can be changed, but anyway) by making the
node a uint32_t too (especially if you already think socket and core
IDs could get this large).

Jan

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v4 4/9] sysctl: Make XEN_SYSCTL_numainfo a little more efficient
  2015-03-10  2:27 ` [PATCH v4 4/9] sysctl: Make XEN_SYSCTL_numainfo " Boris Ostrovsky
  2015-03-10 14:34   ` Andrew Cooper
  2015-03-11 11:06   ` Ian Campbell
@ 2015-03-13 15:55   ` Jan Beulich
  2 siblings, 0 replies; 34+ messages in thread
From: Jan Beulich @ 2015-03-13 15:55 UTC (permalink / raw)
  To: Boris Ostrovsky
  Cc: elena.ufimtseva, wei.liu2, ian.campbell, stefano.stabellini,
	andrew.cooper3, dario.faggioli, ian.jackson, xen-devel, keir

>>> On 10.03.15 at 03:27, <boris.ostrovsky@oracle.com> wrote:
> @@ -522,7 +532,7 @@ struct xen_sysctl_numainfo {
>       * in the system is larger than the caller can handle, then a 2-d array of
>       * the maximum size handleable by the caller is constructed.
>       */
> -    XEN_GUEST_HANDLE_64(uint32) node_to_node_distance;
> +    XEN_GUEST_HANDLE_64(uint8) distance;

For an architecture neutral interface this shouldn't be artificially
constrained to 8 bits.

Jan

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v4 5/9] sysctl: Add sysctl interface for querying PCI topology
  2015-03-10  2:27 ` [PATCH v4 5/9] sysctl: Add sysctl interface for querying PCI topology Boris Ostrovsky
  2015-03-10 14:54   ` Andrew Cooper
@ 2015-03-13 16:03   ` Jan Beulich
  2015-03-13 16:35     ` Boris Ostrovsky
  1 sibling, 1 reply; 34+ messages in thread
From: Jan Beulich @ 2015-03-13 16:03 UTC (permalink / raw)
  To: Boris Ostrovsky
  Cc: elena.ufimtseva, wei.liu2, ian.campbell, stefano.stabellini,
	andrew.cooper3, dario.faggioli, ian.jackson, xen-devel, keir

>>> On 10.03.15 at 03:27, <boris.ostrovsky@oracle.com> wrote:
> +    case XEN_SYSCTL_pcitopoinfo:
> +    {
> +        xen_sysctl_pcitopoinfo_t *ti = &op->u.pcitopoinfo;
> +
> +        if ( guest_handle_is_null(ti->devs) ||
> +             guest_handle_is_null(ti->nodes) ||
> +             (ti->first_dev > ti->num_devs) )
> +        {
> +            ret = -EINVAL;
> +            break;
> +        }
> +
> +        for ( ; ti->first_dev < ti->num_devs; ti->first_dev++ )
> +        {
> +            physdev_pci_device_t dev;
> +            uint8_t node;
> +            struct pci_dev *pdev;
> +
> +            if ( copy_from_guest_offset(&dev, ti->devs, ti->first_dev, 1) )
> +            {
> +                ret = -EFAULT;
> +                break;
> +            }
> +
> +            spin_lock(&pcidevs_lock);
> +            pdev = pci_get_pdev(dev.seg, dev.bus, dev.devfn);
> +            if ( !pdev || (pdev->node == NUMA_NO_NODE) )
> +                node = XEN_INVALID_NODE_ID;
> +            else
> +                node = pdev->node;
> +            spin_unlock(&pcidevs_lock);
> +
> +            if ( copy_to_guest_offset(ti->nodes, ti->first_dev, &node, 1) )
> +            {
> +                ret = -EFAULT;
> +                break;
> +            }
>  
> +            if ( hypercall_preempt_check() )
> +                break;
> +        }
> +
> +        if ( !ret )
> +        {
> +            ti->first_dev++;

This is correct in the preempt case, but it seems wrong to me in the
completely-done one. Perhaps it would be better to put the
increment right before the preempt check?

> +struct xen_sysctl_pcitopoinfo {
> +    /* IN: Size of pcitopo array */
> +    uint32_t num_devs;
> +
> +    /*
> +     * IN/OUT: First element of pcitopo array that needs to be processed by
> +     * hypervisor.
> +     * This is used primarily by hypercall continuations and callers will
> +     * typically set it to zero.
> +     */
> +    uint32_t first_dev;
> +
> +    /* IN: list of devices */
> +    XEN_GUEST_HANDLE_64(physdev_pci_device_t) devs;
> +
> +    /*
> +     * OUT: node identifier for each device.
> +     * If information for a particular device is not avalable then set
> +     * to XEN_INVALID_NODE_ID.
> +     */
> +    XEN_GUEST_HANDLE_64(uint8) nodes;

As said in earlier patches - let's not expose the 8-bit limitation to
user space, avoiding needless changes going forward.

Jan

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v4 2/9] pci: Stash device's PXM information in struct pci_dev
  2015-03-10  2:27 ` [PATCH v4 2/9] pci: Stash device's PXM information in struct pci_dev Boris Ostrovsky
@ 2015-03-13 16:15   ` Jan Beulich
  0 siblings, 0 replies; 34+ messages in thread
From: Jan Beulich @ 2015-03-13 16:15 UTC (permalink / raw)
  To: Boris Ostrovsky
  Cc: elena.ufimtseva, wei.liu2, ian.campbell, stefano.stabellini,
	andrew.cooper3, dario.faggioli, ian.jackson, xen-devel, keir

>>> On 10.03.15 at 03:27, <boris.ostrovsky@oracle.com> wrote:
> --- a/xen/include/xen/pci.h
> +++ b/xen/include/xen/pci.h
> @@ -57,6 +57,8 @@ struct pci_dev {
>  
>      u8 phantom_stride;
>  
> +    u8 node; /* NUMA node */

I was about to commit this when I noticed that you use u8 instead of
nodeid_t (which you had introduced yourself) throughout this patch.

Jan

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v4 5/9] sysctl: Add sysctl interface for querying PCI topology
  2015-03-13 16:03   ` Jan Beulich
@ 2015-03-13 16:35     ` Boris Ostrovsky
  0 siblings, 0 replies; 34+ messages in thread
From: Boris Ostrovsky @ 2015-03-13 16:35 UTC (permalink / raw)
  To: Jan Beulich
  Cc: elena.ufimtseva, wei.liu2, ian.campbell, stefano.stabellini,
	andrew.cooper3, dario.faggioli, ian.jackson, xen-devel, keir

On 03/13/2015 12:03 PM, Jan Beulich wrote:
>>>> On 10.03.15 at 03:27, <boris.ostrovsky@oracle.com> wrote:
>> +    case XEN_SYSCTL_pcitopoinfo:
>> +    {
>> +        xen_sysctl_pcitopoinfo_t *ti = &op->u.pcitopoinfo;
>> +
>> +        if ( guest_handle_is_null(ti->devs) ||
>> +             guest_handle_is_null(ti->nodes) ||
>> +             (ti->first_dev > ti->num_devs) )
>> +        {
>> +            ret = -EINVAL;
>> +            break;
>> +        }
>> +
>> +        for ( ; ti->first_dev < ti->num_devs; ti->first_dev++ )
>> +        {
>> +            physdev_pci_device_t dev;
>> +            uint8_t node;
>> +            struct pci_dev *pdev;
>> +
>> +            if ( copy_from_guest_offset(&dev, ti->devs, ti->first_dev, 1) )
>> +            {
>> +                ret = -EFAULT;
>> +                break;
>> +            }
>> +
>> +            spin_lock(&pcidevs_lock);
>> +            pdev = pci_get_pdev(dev.seg, dev.bus, dev.devfn);
>> +            if ( !pdev || (pdev->node == NUMA_NO_NODE) )
>> +                node = XEN_INVALID_NODE_ID;
>> +            else
>> +                node = pdev->node;
>> +            spin_unlock(&pcidevs_lock);
>> +
>> +            if ( copy_to_guest_offset(ti->nodes, ti->first_dev, &node, 1) )
>> +            {
>> +                ret = -EFAULT;
>> +                break;
>> +            }
>>   
>> +            if ( hypercall_preempt_check() )
>> +                break;
>> +        }
>> +
>> +        if ( !ret )
>> +        {
>> +            ti->first_dev++;
> This is correct in the preempt case, but it seems wrong to me in the
> completely-done one. Perhaps it would be better to put the
> increment right before the preempt check?


Callers are really not supposed to use this field as it is useless as 
output.

But for consistency (and to make this look more natural) --- yes, moving it
to the loop above (which will become a 'while') would be better.

-boris

^ permalink raw reply	[flat|nested] 34+ messages in thread

end of thread, other threads:[~2015-03-13 16:35 UTC | newest]

Thread overview: 34+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-03-10  2:27 [PATCH v4 0/9] Display IO topology when PXM data is available (plus some cleanup) Boris Ostrovsky
2015-03-10  2:27 ` [PATCH v4 1/9] numa: __node_distance() should return u8 Boris Ostrovsky
2015-03-10 11:53   ` Andrew Cooper
2015-03-10 13:49     ` Boris Ostrovsky
2015-03-10  2:27 ` [PATCH v4 2/9] pci: Stash device's PXM information in struct pci_dev Boris Ostrovsky
2015-03-13 16:15   ` Jan Beulich
2015-03-10  2:27 ` [PATCH v4 3/9] sysctl: Make XEN_SYSCTL_topologyinfo sysctl a little more efficient Boris Ostrovsky
2015-03-10 14:29   ` Andrew Cooper
2015-03-10 15:22     ` Boris Ostrovsky
2015-03-11 11:04   ` Ian Campbell
2015-03-11 12:02     ` Boris Ostrovsky
2015-03-11 12:24       ` Ian Campbell
2015-03-13 15:51   ` Jan Beulich
2015-03-10  2:27 ` [PATCH v4 4/9] sysctl: Make XEN_SYSCTL_numainfo " Boris Ostrovsky
2015-03-10 14:34   ` Andrew Cooper
2015-03-11 11:06   ` Ian Campbell
2015-03-13 15:55   ` Jan Beulich
2015-03-10  2:27 ` [PATCH v4 5/9] sysctl: Add sysctl interface for querying PCI topology Boris Ostrovsky
2015-03-10 14:54   ` Andrew Cooper
2015-03-13 16:03   ` Jan Beulich
2015-03-13 16:35     ` Boris Ostrovsky
2015-03-10  2:27 ` [PATCH v4 6/9] sysctl: Update sysctl version to 0x0000000C Boris Ostrovsky
2015-03-10  2:27 ` [PATCH v4 7/9] libxl/libxc: Move libxl_get_cpu_topology()'s hypercall buffer management to libxc Boris Ostrovsky
2015-03-10 13:40   ` Dario Faggioli
2015-03-11 11:07     ` Ian Campbell
2015-03-10  2:27 ` [PATCH v4 8/9] libxl/libxc: Move libxl_get_numainfo()'s " Boris Ostrovsky
2015-03-10 13:44   ` Dario Faggioli
2015-03-11 11:08     ` Ian Campbell
2015-03-10  2:27 ` [PATCH v4 9/9] libxl: Add interface for querying hypervisor about PCI topology Boris Ostrovsky
2015-03-10 13:51   ` Dario Faggioli
2015-03-11 11:10     ` Ian Campbell
2015-03-10  8:20 ` [PATCH v4 0/9] Display IO topology when PXM data is available (plus some cleanup) Jan Beulich
2015-03-10 13:39   ` Boris Ostrovsky
2015-03-10 13:47     ` Jan Beulich

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.