* [RFC PATCH v2 00/25] ARM: Add Xen NUMA support
@ 2017-03-28 15:53 vijay.kilari
  2017-03-28 15:53 ` [RFC PATCH v2 01/25] x86: NUMA: Clean up: Drop trailing spaces vijay.kilari
                   ` (24 more replies)
  0 siblings, 25 replies; 71+ messages in thread
From: vijay.kilari @ 2017-03-28 15:53 UTC (permalink / raw)
  To: xen-devel
  Cc: sstabellini, wei.liu2, George.Dunlap, andrew.cooper3,
	ian.jackson, tim, julien.grall, jbeulich, Vijaya Kumar K

From: Vijaya Kumar K <Vijaya.Kumar@cavium.com>

This RFC patch series adds NUMA support for the ARM platform.
Both DT-based and ACPI-based NUMA support are added.
Only Xen is made NUMA-aware. NUMA awareness for DOM0 is not
added.

As part of this series, the x86 NUMA code is reused by moving it
into common files.
New files xen/common/numa.c and xen/drivers/acpi/srat.c are
added.
For ARM-specific code, a new folder xen/arch/arm/numa is added, and new
files numa.c, dt_numa.c and acpi_numa.c are introduced under this folder.

DT NUMA: The following major changes are made
 - Dropped numa-node-id information from the Dom0 DT,
   so that Dom0 devices allocate from node 0 for
   devmalloc requests.
 - Memory DT is not deleted by EFI. It is exposed to Xen
   to extract NUMA information.
 - On NUMA failure, fall back to non-NUMA booting,
   assuming all the memory and CPUs are under node 0.
 - CONFIG_NUMA is introduced.
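
The fallback behaviour described in the list above can be sketched as
follows. This is only an illustration and not the code from this
series: every sketch_* name, the CPU count and the parse callback are
assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_NR_CPUS 8     /* hypothetical CPU count */
#define SKETCH_NO_NODE 0xFF  /* plays the role of NUMA_NO_NODE */

typedef uint8_t nodeid_t;

/* Hypothetical per-CPU node table, similar in spirit to cpu_to_node[]. */
static nodeid_t sketch_cpu_to_node[SKETCH_NR_CPUS];

/* Stand-in for DT or ACPI parsing: 0 on success, negative on failure. */
typedef int (*sketch_parse_fn)(nodeid_t *cpu_to_node, size_t nr_cpus);

/* Returns true if NUMA info was used, false if the node 0 fallback ran. */
static bool sketch_numa_init(sketch_parse_fn parse)
{
    size_t i;

    assert(parse != NULL);

    for (i = 0; i < SKETCH_NR_CPUS; i++)
        sketch_cpu_to_node[i] = SKETCH_NO_NODE;

    if (parse(sketch_cpu_to_node, SKETCH_NR_CPUS) == 0)
        return true;

    /* NUMA failure: boot as non-NUMA with every CPU under node 0. */
    for (i = 0; i < SKETCH_NR_CPUS; i++)
        sketch_cpu_to_node[i] = 0;

    return false;
}

/* Example parser that always fails, to exercise the fallback path. */
static int sketch_parse_fail(nodeid_t *cpu_to_node, size_t nr_cpus)
{
    (void)cpu_to_node;
    (void)nr_cpus;
    return -1;
}

/* Example parser that splits the CPUs evenly across two nodes. */
static int sketch_parse_two_nodes(nodeid_t *cpu_to_node, size_t nr_cpus)
{
    size_t i;

    for (i = 0; i < nr_cpus; i++)
        cpu_to_node[i] = (i < nr_cpus / 2) ? 0 : 1;

    return 0;
}
```

Calling sketch_numa_init(sketch_parse_fail) leaves every entry of the
table at node 0, which is the non-NUMA view of the system.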

ACPI NUMA:
 - MADT is parsed before the SRAT table to extract the
   CPU_ID to MPIDR mapping info. Linux instead opens the MADT
   table while parsing the SRAT table to extract the MPIDR. Parsing
   MADT first avoids opening ACPI tables recursively.
 - SRAT table is parsed for ACPI_SRAT_TYPE_GICC_AFFINITY to extract
   proximity info; the MPIDR is taken from the CPU_ID to MPIDR
   mapping table.
 - SRAT table is parsed for ACPI_SRAT_TYPE_MEMORY_AFFINITY to extract
   memory proximity.
 - Re-use SLIT parsing of x86 for node distance information.
 - CONFIG_ACPI_NUMA is introduced.
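
The two-pass ordering described above (MADT first, then SRAT) can be
sketched roughly as below. This is an illustration under assumptions,
not the code in acpi_numa.c: the sketch_* names, the table size and the
invalid-MPIDR marker are all made up for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_CPUS      4           /* hypothetical table size */
#define SKETCH_INVALID_MPIDR UINT64_MAX  /* hypothetical "not found" value */

/* One CPU_ID to MPIDR pair, as collected from a MADT entry. */
struct sketch_cpuid_mpidr {
    uint32_t cpu_id;
    uint64_t mpidr;
};

static struct sketch_cpuid_mpidr sketch_map[SKETCH_MAX_CPUS];
static size_t sketch_map_count;

/* Pass 1: called once per MADT entry, before SRAT is opened. */
static int sketch_record_madt_entry(uint32_t cpu_id, uint64_t mpidr)
{
    if (sketch_map_count >= SKETCH_MAX_CPUS)
        return -1;

    sketch_map[sketch_map_count].cpu_id = cpu_id;
    sketch_map[sketch_map_count].mpidr = mpidr;
    sketch_map_count++;

    return 0;
}

/*
 * Pass 2: called while handling an SRAT GICC affinity entry. Only the
 * table built in pass 1 is consulted, so MADT is not reopened here.
 */
static uint64_t sketch_mpidr_for_cpu_id(uint32_t cpu_id)
{
    size_t i;

    assert(sketch_map_count <= SKETCH_MAX_CPUS);

    for (i = 0; i < sketch_map_count; i++)
        if (sketch_map[i].cpu_id == cpu_id)
            return sketch_map[i].mpidr;

    return SKETCH_INVALID_MPIDR;
}
```

Keeping the mapping in a small table trades a little memory for a
simpler SRAT handler that never has to touch another ACPI table.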

No functional changes are made to the x86 implementation; the code is only
sanitized and refactored. Hence x86 is only compile-tested.

Code is shared at
https://github.com/vijaykilari/xen-numa/commits/rfc_2

v2: Major changes
  - Rebased to latest staging branch
  - Reworked x86 NUMA code and cleaned it up to the extent possible.
    Patches 1 to 8 are created for this
  - Reworked extraction of DT and ACPI NUMA information
  - Reused DT code for memory node processing to extract NUMA info.
  - Fixed issues with DT processing
  - Added arch specific processing of SRAT
  - Reworked MADT and SRAT processing
  - Reworked node distance
  - All ARM changes are moved under folder arch/arm/numa.
  - NUMA ACPI common changes are kept in drivers/acpi/srat.c

Vijaya Kumar K (25):
  x86: NUMA: Clean up: Drop trailing spaces
  x86: NUMA: Fix datatypes and attributes
  x86: NUMA: Rename and sanitize some common functions
  x86: NUMA: Add accessors for acpi_numa, numa_off and numa_fake
    variables
  x86: NUMA: Move generic dummy_numa_init to separate function
  x86: NUMA: Add accessors for nodes[] and node_memblk_range[] structs
  x86: NUMA: Rename some generic functions
  x86: NUMA: Sanitize node distance
  ARM: NUMA: Add existing ARM numa code under CONFIG_NUMA
  x86: NUMA: Move numa code and make it generic
  x86: NUMA: Move common code from srat.c
  ARM: NUMA: Parse CPU NUMA information
  ARM: NUMA: Parse memory NUMA information
  ARM: NUMA: Parse NUMA distance information
  ARM: NUMA: Add CPU NUMA support
  ARM: NUMA: Add memory NUMA support
  ARM: NUMA: Add fallback on NUMA failure
  ARM: NUMA: Do not expose numa info to DOM0
  ACPI: Refactor acpi SRAT and SLIT table handling code
  ARM: NUMA: Extract MPIDR from MADT table
  ACPI: Move arch specific SRAT parsing
  ARM: NUMA: Extract proximity from SRAT table
  ARM: NUMA: Initialize ACPI NUMA
  NUMA: Move CONFIG_NUMA to common Kconfig
  NUMA: Enable ACPI_NUMA config

 xen/arch/arm/Makefile               |   1 +
 xen/arch/arm/acpi/boot.c            |   2 +
 xen/arch/arm/bootfdt.c              |  44 ++-
 xen/arch/arm/domain_build.c         |   9 +
 xen/arch/arm/efi/efi-boot.h         |  25 --
 xen/arch/arm/numa/Makefile          |   3 +
 xen/arch/arm/numa/acpi_numa.c       | 249 ++++++++++++++
 xen/arch/arm/numa/dt_numa.c         | 244 +++++++++++++
 xen/arch/arm/numa/numa.c            | 196 +++++++++++
 xen/arch/arm/setup.c                |   4 +
 xen/arch/arm/smpboot.c              |  25 +-
 xen/arch/x86/dom0_build.c           |   1 +
 xen/arch/x86/mm.c                   |   2 -
 xen/arch/x86/numa.c                 | 454 +------------------------
 xen/arch/x86/physdev.c              |   1 +
 xen/arch/x86/setup.c                |   3 +-
 xen/arch/x86/smpboot.c              |   3 +-
 xen/arch/x86/srat.c                 | 412 ++++------------------
 xen/arch/x86/x86_64/mm.c            |   3 +-
 xen/common/Kconfig                  |   4 +
 xen/common/Makefile                 |   1 +
 xen/common/numa.c                   | 662 ++++++++++++++++++++++++++++++++++++
 xen/drivers/acpi/Kconfig            |   5 +-
 xen/drivers/acpi/Makefile           |   1 +
 xen/drivers/acpi/numa.c             |  58 +---
 xen/drivers/acpi/srat.c             | 299 ++++++++++++++++
 xen/drivers/passthrough/vtd/iommu.c |   1 +
 xen/include/acpi/actbl1.h           |  17 +-
 xen/include/acpi/srat.h             |  24 ++
 xen/include/asm-arm/numa.h          |  73 +++-
 xen/include/asm-arm/setup.h         |   6 +-
 xen/include/asm-x86/acpi.h          |   4 -
 xen/include/asm-x86/config.h        |   1 -
 xen/include/asm-x86/mm.h            |   1 -
 xen/include/asm-x86/numa.h          |  64 +---
 xen/include/xen/acpi.h              |   6 +
 xen/include/xen/mm.h                |   2 +
 xen/include/xen/numa.h              |  42 +++
 38 files changed, 2023 insertions(+), 929 deletions(-)
 create mode 100644 xen/arch/arm/numa/Makefile
 create mode 100644 xen/arch/arm/numa/acpi_numa.c
 create mode 100644 xen/arch/arm/numa/dt_numa.c
 create mode 100644 xen/arch/arm/numa/numa.c
 create mode 100644 xen/common/numa.c
 create mode 100644 xen/drivers/acpi/srat.c
 create mode 100644 xen/include/acpi/srat.h

-- 
2.7.4


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


* [RFC PATCH v2 01/25] x86: NUMA: Clean up: Drop trailing spaces
  2017-03-28 15:53 [RFC PATCH v2 00/25] ARM: Add Xen NUMA support vijay.kilari
@ 2017-03-28 15:53 ` vijay.kilari
  2017-03-28 16:44   ` Wei Liu
                     ` (2 more replies)
  2017-03-28 15:53 ` [RFC PATCH v2 02/25] x86: NUMA: Fix datatypes and attributes vijay.kilari
                   ` (23 subsequent siblings)
  24 siblings, 3 replies; 71+ messages in thread
From: vijay.kilari @ 2017-03-28 15:53 UTC (permalink / raw)
  To: xen-devel
  Cc: sstabellini, wei.liu2, George.Dunlap, andrew.cooper3,
	ian.jackson, tim, julien.grall, jbeulich, Vijaya Kumar K

From: Vijaya Kumar K <Vijaya.Kumar@cavium.com>

Fix coding style, trailing spaces, tabs in NUMA code.
Also drop unused macros and functions.

Signed-off-by: Vijaya Kumar K <Vijaya.Kumar@cavium.com>
---
 xen/arch/x86/numa.c        | 47 +++++++++++++++++++++-------------------------
 xen/arch/x86/srat.c        |  2 +-
 xen/include/asm-x86/numa.h | 43 ++++++++++++++++--------------------------
 3 files changed, 38 insertions(+), 54 deletions(-)

diff --git a/xen/arch/x86/numa.c b/xen/arch/x86/numa.c
index 6f4d438..8ee2302 100644
--- a/xen/arch/x86/numa.c
+++ b/xen/arch/x86/numa.c
@@ -1,8 +1,8 @@
-/* 
+/*
  * Generic VM initialization for x86-64 NUMA setups.
  * Copyright 2002,2003 Andi Kleen, SuSE Labs.
  * Adapted for Xen: Ryan Harper <ryanh@us.ibm.com>
- */ 
+ */
 
 #include <xen/mm.h>
 #include <xen/string.h>
@@ -21,13 +21,6 @@
 static int numa_setup(char *s);
 custom_param("numa", numa_setup);
 
-#ifndef Dprintk
-#define Dprintk(x...)
-#endif
-
-/* from proto.h */
-#define round_up(x,y) ((((x)+(y))-1) & (~((y)-1)))
-
 struct node_data node_data[MAX_NUMNODES];
 
 /* Mapping from pdx to node id */
@@ -144,8 +137,9 @@ static int __init extract_lsb_from_nodes(const struct node *nodes,
     if ( nodes_used <= 1 )
         i = BITS_PER_LONG - 1;
     else
-        i = find_first_bit(&bitfield, sizeof(unsigned long)*8);
+        i = find_first_bit(&bitfield, sizeof(unsigned long) * 8);
     memnodemapsize = (memtop >> i) + 1;
+
     return i;
 }
 
@@ -173,7 +167,7 @@ int __init compute_hash_shift(struct node *nodes, int numnodes,
 }
 /* initialize NODE_DATA given nodeid and start/end */
 void __init setup_node_bootmem(nodeid_t nodeid, u64 start, u64 end)
-{ 
+{
     unsigned long start_pfn, end_pfn;
 
     start_pfn = start >> PAGE_SHIFT;
@@ -183,7 +177,7 @@ void __init setup_node_bootmem(nodeid_t nodeid, u64 start, u64 end)
     NODE_DATA(nodeid)->node_spanned_pages = end_pfn - start_pfn;
 
     node_set_online(nodeid);
-} 
+}
 
 void __init numa_init_array(void)
 {
@@ -214,7 +208,7 @@ static int __init numa_emulation(u64 start_pfn, u64 end_pfn)
 {
     int i;
     struct node nodes[MAX_NUMNODES];
-    u64 sz = ((end_pfn - start_pfn)<<PAGE_SHIFT) / numa_fake;
+    u64 sz = ((end_pfn - start_pfn) << PAGE_SHIFT) / numa_fake;
 
     /* Kludge needed for the hash function */
     if ( hweight64(sz) > 1 )
@@ -222,21 +216,22 @@ static int __init numa_emulation(u64 start_pfn, u64 end_pfn)
         u64 x = 1;
         while ( (x << 1) < sz )
             x <<= 1;
-        if ( x < sz/2 )
-            printk(KERN_ERR "Numa emulation unbalanced. Complain to maintainer\n");
+        if ( x < sz / 2 )
+            printk(KERN_ERR
+                   "Numa emulation unbalanced. Complain to maintainer\n");
         sz = x;
     }
 
     memset(&nodes,0,sizeof(nodes));
     for ( i = 0; i < numa_fake; i++ )
     {
-        nodes[i].start = (start_pfn<<PAGE_SHIFT) + i*sz;
+        nodes[i].start = (start_pfn << PAGE_SHIFT) + i * sz;
         if ( i == numa_fake - 1 )
-            sz = (end_pfn<<PAGE_SHIFT) - nodes[i].start;
+            sz = (end_pfn << PAGE_SHIFT) - nodes[i].start;
         nodes[i].end = nodes[i].start + sz;
-        printk(KERN_INFO "Faking node %d at %"PRIx64"-%"PRIx64" (%"PRIu64"MB)\n",
-               i,
-               nodes[i].start, nodes[i].end,
+        printk(KERN_INFO
+               "Faking node %d at %"PRIx64"-%"PRIx64" (%"PRIu64"MB)\n",
+               i, nodes[i].start, nodes[i].end,
                (nodes[i].end - nodes[i].start) >> 20);
         node_set_online(i);
     }
@@ -256,7 +251,7 @@ static int __init numa_emulation(u64 start_pfn, u64 end_pfn)
 #endif
 
 void __init numa_initmem_init(unsigned long start_pfn, unsigned long end_pfn)
-{ 
+{
     int i;
 
 #ifdef CONFIG_NUMA_EMU
@@ -291,7 +286,7 @@ void __init numa_initmem_init(unsigned long start_pfn, unsigned long end_pfn)
 void numa_add_cpu(int cpu)
 {
     cpumask_set_cpu(cpu, &node_to_cpumask[cpu_to_node(cpu)]);
-} 
+}
 
 void numa_set_node(int cpu, nodeid_t node)
 {
@@ -299,8 +294,8 @@ void numa_set_node(int cpu, nodeid_t node)
 }
 
 /* [numa=off] */
-static __init int numa_setup(char *opt) 
-{ 
+static __init int numa_setup(char *opt)
+{
     if ( !strncmp(opt,"off",3) )
         numa_off = 1;
     if ( !strncmp(opt,"on",2) )
@@ -323,7 +318,7 @@ static __init int numa_setup(char *opt)
 #endif
 
     return 1;
-} 
+}
 
 /*
  * Setup early cpu_to_node.
@@ -385,7 +380,7 @@ static void dump_numa(unsigned char key)
     const struct vnuma_info *vnuma;
 
     printk("'%c' pressed -> dumping numa info (now-0x%X:%08X)\n", key,
-           (u32)(now>>32), (u32)now);
+           (u32)(now >> 32), (u32)now);
 
     for_each_online_node ( i )
     {
diff --git a/xen/arch/x86/srat.c b/xen/arch/x86/srat.c
index d86783e..d270b75 100644
--- a/xen/arch/x86/srat.c
+++ b/xen/arch/x86/srat.c
@@ -7,7 +7,7 @@
  * Called from acpi_numa_init while reading the SRAT and SLIT tables.
  * Assumes all memory regions belonging to a single proximity domain
  * are in one chunk. Holes between them will be included in the node.
- * 
+ *
  * Adapted for Xen: Ryan Harper <ryanh@us.ibm.com>
  */
 
diff --git a/xen/include/asm-x86/numa.h b/xen/include/asm-x86/numa.h
index 2479238..da8a459 100644
--- a/xen/include/asm-x86/numa.h
+++ b/xen/include/asm-x86/numa.h
@@ -1,4 +1,4 @@
-#ifndef _ASM_X8664_NUMA_H 
+#ifndef _ASM_X8664_NUMA_H
 #define _ASM_X8664_NUMA_H 1
 
 #include <xen/cpumask.h>
@@ -12,21 +12,20 @@ extern int srat_rev;
 extern nodeid_t      cpu_to_node[NR_CPUS];
 extern cpumask_t     node_to_cpumask[];
 
-#define cpu_to_node(cpu)		(cpu_to_node[cpu])
-#define parent_node(node)		(node)
+#define cpu_to_node(cpu)         (cpu_to_node[cpu])
+#define parent_node(node)        (node)
 #define node_to_first_cpu(node)  (__ffs(node_to_cpumask[node]))
 #define node_to_cpumask(node)    (node_to_cpumask[node])
 
-struct node { 
-	u64 start,end; 
+struct node {
+    u64 start,end;
 };
 
 extern int compute_hash_shift(struct node *nodes, int numnodes,
-			      nodeid_t *nodeids);
+                              nodeid_t *nodeids);
 extern nodeid_t pxm_to_node(unsigned int pxm);
 
 #define ZONE_ALIGN (1UL << (MAX_ORDER+PAGE_SHIFT))
-#define VIRTUAL_BUG_ON(x) 
 
 extern void numa_add_cpu(int cpu);
 extern void numa_init_array(void);
@@ -42,14 +41,8 @@ extern void setup_node_bootmem(nodeid_t nodeid, u64 start, u64 end);
 extern nodeid_t apicid_to_node[];
 extern void init_cpu_to_node(void);
 
-static inline void clear_node_cpumask(int cpu)
-{
-	cpumask_clear_cpu(cpu, &node_to_cpumask[cpu_to_node(cpu)]);
-}
-
 /* Simple perfect hash to map pdx to node numbers */
-extern int memnode_shift; 
-extern unsigned long memnodemapsize;
+extern int memnode_shift;
 extern u8 *memnodemap;
 
 struct node_data {
@@ -60,20 +53,16 @@ struct node_data {
 extern struct node_data node_data[];
 
 static inline __attribute__((pure)) nodeid_t phys_to_nid(paddr_t addr)
-{ 
-	nodeid_t nid;
-	VIRTUAL_BUG_ON((paddr_to_pdx(addr) >> memnode_shift) >= memnodemapsize);
-	nid = memnodemap[paddr_to_pdx(addr) >> memnode_shift]; 
-	VIRTUAL_BUG_ON(nid >= MAX_NUMNODES || !node_data[nid]); 
-	return nid; 
-} 
-
-#define NODE_DATA(nid)		(&(node_data[nid]))
-
-#define node_start_pfn(nid)	(NODE_DATA(nid)->node_start_pfn)
-#define node_spanned_pages(nid)	(NODE_DATA(nid)->node_spanned_pages)
+{
+    return memnodemap[paddr_to_pdx(addr) >> memnode_shift];
+}
+
+#define NODE_DATA(nid)          (&(node_data[nid]))
+
+#define node_start_pfn(nid)     (NODE_DATA(nid)->node_start_pfn)
+#define node_spanned_pages(nid) (NODE_DATA(nid)->node_spanned_pages)
 #define node_end_pfn(nid)       (NODE_DATA(nid)->node_start_pfn + \
-				 NODE_DATA(nid)->node_spanned_pages)
+                                 NODE_DATA(nid)->node_spanned_pages)
 
 extern int valid_numa_range(u64 start, u64 end, nodeid_t node);
 
-- 
2.7.4



* [RFC PATCH v2 02/25] x86: NUMA: Fix datatypes and attributes
  2017-03-28 15:53 [RFC PATCH v2 00/25] ARM: Add Xen NUMA support vijay.kilari
  2017-03-28 15:53 ` [RFC PATCH v2 01/25] x86: NUMA: Clean up: Drop trailing spaces vijay.kilari
@ 2017-03-28 15:53 ` vijay.kilari
  2017-03-28 16:44   ` Wei Liu
  2017-05-31 10:35   ` Jan Beulich
  2017-03-28 15:53 ` [RFC PATCH v2 03/25] x86: NUMA: Rename and sanitize some common functions vijay.kilari
                   ` (22 subsequent siblings)
  24 siblings, 2 replies; 71+ messages in thread
From: vijay.kilari @ 2017-03-28 15:53 UTC (permalink / raw)
  To: xen-devel
  Cc: sstabellini, wei.liu2, George.Dunlap, andrew.cooper3,
	ian.jackson, tim, julien.grall, jbeulich, Vijaya Kumar K

From: Vijaya Kumar K <Vijaya.Kumar@cavium.com>

Change u{8,32,64} to uint{8,32,64}_t and bool_t to bool.
Fix the coding style of attributes.
Also change memnode_shift to unsigned int.

Signed-off-by: Vijaya Kumar K <Vijaya.Kumar@cavium.com>
---
 xen/arch/x86/numa.c        | 40 +++++++++++++++++------------------
 xen/arch/x86/srat.c        | 52 +++++++++++++++++++++++-----------------------
 xen/include/asm-arm/numa.h |  2 +-
 xen/include/asm-x86/numa.h | 17 ++++++++-------
 4 files changed, 56 insertions(+), 55 deletions(-)

diff --git a/xen/arch/x86/numa.c b/xen/arch/x86/numa.c
index 8ee2302..8ed31cb 100644
--- a/xen/arch/x86/numa.c
+++ b/xen/arch/x86/numa.c
@@ -24,12 +24,12 @@ custom_param("numa", numa_setup);
 struct node_data node_data[MAX_NUMNODES];
 
 /* Mapping from pdx to node id */
-int memnode_shift;
+unsigned int memnode_shift;
 static typeof(*memnodemap) _memnodemap[64];
 unsigned long memnodemapsize;
-u8 *memnodemap;
+uint8_t *memnodemap;
 
-nodeid_t cpu_to_node[NR_CPUS] __read_mostly = {
+nodeid_t __read_mostly cpu_to_node[NR_CPUS] = {
     [0 ... NR_CPUS-1] = NUMA_NO_NODE
 };
 /*
@@ -38,11 +38,11 @@ nodeid_t cpu_to_node[NR_CPUS] __read_mostly = {
 nodeid_t apicid_to_node[MAX_LOCAL_APIC] = {
     [0 ... MAX_LOCAL_APIC-1] = NUMA_NO_NODE
 };
-cpumask_t node_to_cpumask[MAX_NUMNODES] __read_mostly;
+cpumask_t __read_mostly node_to_cpumask[MAX_NUMNODES];
 
 nodemask_t __read_mostly node_online_map = { { [0] = 1UL } };
 
-bool_t numa_off = 0;
+bool numa_off = 0;
 s8 acpi_numa = 0;
 
 int srat_disabled(void)
@@ -166,7 +166,7 @@ int __init compute_hash_shift(struct node *nodes, int numnodes,
     return shift;
 }
 /* initialize NODE_DATA given nodeid and start/end */
-void __init setup_node_bootmem(nodeid_t nodeid, u64 start, u64 end)
+void __init setup_node_bootmem(nodeid_t nodeid, paddr_t start, paddr_t end)
 {
     unsigned long start_pfn, end_pfn;
 
@@ -201,19 +201,19 @@ void __init numa_init_array(void)
 }
 
 #ifdef CONFIG_NUMA_EMU
-static int numa_fake __initdata = 0;
+static int __initdata numa_fake = 0;
 
 /* Numa emulation */
-static int __init numa_emulation(u64 start_pfn, u64 end_pfn)
+static int __init numa_emulation(uint64_t start_pfn, uint64_t end_pfn)
 {
     int i;
     struct node nodes[MAX_NUMNODES];
-    u64 sz = ((end_pfn - start_pfn) << PAGE_SHIFT) / numa_fake;
+    uint64_t sz = ((end_pfn - start_pfn) << PAGE_SHIFT) / numa_fake;
 
     /* Kludge needed for the hash function */
     if ( hweight64(sz) > 1 )
     {
-        u64 x = 1;
+        uint64_t x = 1;
         while ( (x << 1) < sz )
             x <<= 1;
         if ( x < sz / 2 )
@@ -260,8 +260,8 @@ void __init numa_initmem_init(unsigned long start_pfn, unsigned long end_pfn)
 #endif
 
 #ifdef CONFIG_ACPI_NUMA
-    if ( !numa_off && !acpi_scan_nodes((u64)start_pfn << PAGE_SHIFT,
-         (u64)end_pfn << PAGE_SHIFT) )
+    if ( !numa_off && !acpi_scan_nodes((uint64_t)start_pfn << PAGE_SHIFT,
+         (uint64_t)end_pfn << PAGE_SHIFT) )
         return;
 #endif
 
@@ -269,8 +269,8 @@ void __init numa_initmem_init(unsigned long start_pfn, unsigned long end_pfn)
            numa_off ? "NUMA turned off" : "No NUMA configuration found");
 
     printk(KERN_INFO "Faking a node at %016"PRIx64"-%016"PRIx64"\n",
-           (u64)start_pfn << PAGE_SHIFT,
-           (u64)end_pfn << PAGE_SHIFT);
+           (uint64_t)start_pfn << PAGE_SHIFT,
+           (uint64_t)end_pfn << PAGE_SHIFT);
     /* setup dummy node covering all memory */
     memnode_shift = BITS_PER_LONG - 1;
     memnodemap = _memnodemap;
@@ -279,8 +279,8 @@ void __init numa_initmem_init(unsigned long start_pfn, unsigned long end_pfn)
     for ( i = 0; i < nr_cpu_ids; i++ )
         numa_set_node(i, 0);
     cpumask_copy(&node_to_cpumask[0], cpumask_of(0));
-    setup_node_bootmem(0, (u64)start_pfn << PAGE_SHIFT,
-                    (u64)end_pfn << PAGE_SHIFT);
+    setup_node_bootmem(0, (paddr_t)start_pfn << PAGE_SHIFT,
+                    (paddr_t)end_pfn << PAGE_SHIFT);
 }
 
 void numa_add_cpu(int cpu)
@@ -294,7 +294,7 @@ void numa_set_node(int cpu, nodeid_t node)
 }
 
 /* [numa=off] */
-static __init int numa_setup(char *opt)
+static int __init numa_setup(char *opt)
 {
     if ( !strncmp(opt,"off",3) )
         numa_off = 1;
@@ -339,7 +339,7 @@ void __init init_cpu_to_node(void)
 
     for ( i = 0; i < nr_cpu_ids; i++ )
     {
-        u32 apicid = x86_cpu_to_apicid[i];
+        uint32_t apicid = x86_cpu_to_apicid[i];
         if ( apicid == BAD_APICID )
             continue;
         node = apicid < MAX_LOCAL_APIC ? apicid_to_node[apicid] : NUMA_NO_NODE;
@@ -380,7 +380,7 @@ static void dump_numa(unsigned char key)
     const struct vnuma_info *vnuma;
 
     printk("'%c' pressed -> dumping numa info (now-0x%X:%08X)\n", key,
-           (u32)(now >> 32), (u32)now);
+           (uint32_t)(now >> 32), (uint32_t)now);
 
     for_each_online_node ( i )
     {
@@ -507,7 +507,7 @@ static void dump_numa(unsigned char key)
     rcu_read_unlock(&domlist_read_lock);
 }
 
-static __init int register_numa_trigger(void)
+static int __init register_numa_trigger(void)
 {
     register_keyhandler('u', dump_numa, "dump NUMA info", 1);
     return 0;
diff --git a/xen/arch/x86/srat.c b/xen/arch/x86/srat.c
index d270b75..800a7c3 100644
--- a/xen/arch/x86/srat.c
+++ b/xen/arch/x86/srat.c
@@ -23,12 +23,12 @@
 
 static struct acpi_table_slit *__read_mostly acpi_slit;
 
-static nodemask_t memory_nodes_parsed __initdata;
-static nodemask_t processor_nodes_parsed __initdata;
-static struct node nodes[MAX_NUMNODES] __initdata;
+static nodemask_t __initdata memory_nodes_parsed;
+static nodemask_t __initdata processor_nodes_parsed;
+static struct node __initdata nodes[MAX_NUMNODES];
 
 struct pxm2node {
-	unsigned pxm;
+	unsigned int pxm;
 	nodeid_t node;
 };
 static struct pxm2node __read_mostly pxm2node[MAX_NUMNODES] =
@@ -41,15 +41,15 @@ static struct node node_memblk_range[NR_NODE_MEMBLKS];
 static nodeid_t memblk_nodeid[NR_NODE_MEMBLKS];
 static __initdata DECLARE_BITMAP(memblk_hotplug, NR_NODE_MEMBLKS);
 
-static inline bool_t node_found(unsigned idx, unsigned pxm)
+static inline bool node_found(unsigned idx, unsigned pxm)
 {
 	return ((pxm2node[idx].pxm == pxm) &&
 		(pxm2node[idx].node != NUMA_NO_NODE));
 }
 
-nodeid_t pxm_to_node(unsigned pxm)
+nodeid_t pxm_to_node(unsigned int pxm)
 {
-	unsigned i;
+	unsigned int i;
 
 	if ((pxm < ARRAY_SIZE(pxm2node)) && node_found(pxm, pxm))
 		return pxm2node[pxm].node;
@@ -64,9 +64,9 @@ nodeid_t pxm_to_node(unsigned pxm)
 nodeid_t setup_node(unsigned pxm)
 {
 	nodeid_t node;
-	unsigned idx;
-	static bool_t warned;
-	static unsigned nodes_found;
+	unsigned int idx;
+	static bool warned;
+	static unsigned int nodes_found;
 
 	BUILD_BUG_ON(MAX_NUMNODES >= NUMA_NO_NODE);
 
@@ -103,7 +103,7 @@ nodeid_t setup_node(unsigned pxm)
 	return node;
 }
 
-int valid_numa_range(u64 start, u64 end, nodeid_t node)
+int valid_numa_range(paddr_t start, paddr_t end, nodeid_t node)
 {
 	int i;
 
@@ -118,7 +118,7 @@ int valid_numa_range(u64 start, u64 end, nodeid_t node)
 	return 0;
 }
 
-static __init int conflicting_memblks(u64 start, u64 end)
+static int __init conflicting_memblks(paddr_t start, paddr_t end)
 {
 	int i;
 
@@ -134,7 +134,7 @@ static __init int conflicting_memblks(u64 start, u64 end)
 	return -1;
 }
 
-static __init void cutoff_node(int i, u64 start, u64 end)
+static void __init cutoff_node(int i, paddr_t start, paddr_t end)
 {
 	struct node *nd = &nodes[i];
 	if (nd->start < start) {
@@ -149,7 +149,7 @@ static __init void cutoff_node(int i, u64 start, u64 end)
 	}
 }
 
-static __init void bad_srat(void)
+static void __init bad_srat(void)
 {
 	int i;
 	printk(KERN_ERR "SRAT: SRAT not used.\n");
@@ -167,13 +167,13 @@ static __init void bad_srat(void)
  * distance than the others.
  * Do some quick checks here and only use the SLIT if it passes.
  */
-static __init int slit_valid(struct acpi_table_slit *slit)
+static int __init slit_valid(struct acpi_table_slit *slit)
 {
 	int i, j;
 	int d = slit->locality_count;
 	for (i = 0; i < d; i++) {
 		for (j = 0; j < d; j++)  {
-			u8 val = slit->entry[d*i + j];
+			uint8_t val = slit->entry[d*i + j];
 			if (i == j) {
 				if (val != 10)
 					return 0;
@@ -274,7 +274,7 @@ acpi_numa_processor_affinity_init(const struct acpi_srat_cpu_affinity *pa)
 void __init
 acpi_numa_memory_affinity_init(const struct acpi_srat_mem_affinity *ma)
 {
-	u64 start, end;
+	uint64_t start, end;
 	unsigned pxm;
 	nodeid_t node;
 	int i;
@@ -311,8 +311,8 @@ acpi_numa_memory_affinity_init(const struct acpi_srat_mem_affinity *ma)
 	if (i < 0)
 		/* everything fine */;
 	else if (memblk_nodeid[i] == node) {
-		bool_t mismatch = !(ma->flags & ACPI_SRAT_MEM_HOT_PLUGGABLE) !=
-		                  !test_bit(i, memblk_hotplug);
+		bool mismatch = !(ma->flags & ACPI_SRAT_MEM_HOT_PLUGGABLE) !=
+		                !test_bit(i, memblk_hotplug);
 
 		printk("%sSRAT: PXM %u (%"PRIx64"-%"PRIx64") overlaps with itself (%"PRIx64"-%"PRIx64")\n",
 		       mismatch ? KERN_ERR : KERN_WARNING, pxm, start, end,
@@ -401,7 +401,7 @@ static int __init nodes_cover_memory(void)
 
 void __init acpi_numa_arch_fixup(void) {}
 
-static u64 __initdata srat_region_mask;
+static uint64_t __initdata srat_region_mask;
 
 static int __init srat_parse_region(struct acpi_subtable_header *header,
 				    const unsigned long end)
@@ -428,9 +428,9 @@ static int __init srat_parse_region(struct acpi_subtable_header *header,
 	return 0;
 }
 
-void __init srat_parse_regions(u64 addr)
+void __init srat_parse_regions(uint64_t addr)
 {
-	u64 mask;
+	uint64_t mask;
 	unsigned int i;
 
 	if (acpi_disabled || acpi_numa < 0 ||
@@ -453,7 +453,7 @@ void __init srat_parse_regions(u64 addr)
 }
 
 /* Use the information discovered above to actually set up the nodes. */
-int __init acpi_scan_nodes(u64 start, u64 end)
+int __init acpi_scan_nodes(uint64_t start, uint64_t end)
 {
 	int i;
 	nodemask_t all_nodes_parsed;
@@ -485,7 +485,7 @@ int __init acpi_scan_nodes(u64 start, u64 end)
 	/* Finally register nodes */
 	for_each_node_mask(i, all_nodes_parsed)
 	{
-		u64 size = nodes[i].end - nodes[i].start;
+		uint64_t size = nodes[i].end - nodes[i].start;
 		if ( size == 0 )
 			printk(KERN_WARNING "SRAT: Node %u has no memory. "
 			       "BIOS Bug or mis-configured hardware?\n", i);
@@ -514,10 +514,10 @@ static unsigned node_to_pxm(nodeid_t n)
 	return 0;
 }
 
-u8 __node_distance(nodeid_t a, nodeid_t b)
+uint8_t __node_distance(nodeid_t a, nodeid_t b)
 {
 	unsigned index;
-	u8 slit_val;
+	uint8_t slit_val;
 
 	if (!acpi_slit)
 		return a == b ? 10 : 20;
diff --git a/xen/include/asm-arm/numa.h b/xen/include/asm-arm/numa.h
index a2c1a34..53f99af 100644
--- a/xen/include/asm-arm/numa.h
+++ b/xen/include/asm-arm/numa.h
@@ -1,7 +1,7 @@
 #ifndef __ARCH_ARM_NUMA_H
 #define __ARCH_ARM_NUMA_H
 
-typedef u8 nodeid_t;
+typedef uint8_t nodeid_t;
 
 /* Fake one node for now. See also node_online_map. */
 #define cpu_to_node(cpu) 0
diff --git a/xen/include/asm-x86/numa.h b/xen/include/asm-x86/numa.h
index da8a459..748cdfd 100644
--- a/xen/include/asm-x86/numa.h
+++ b/xen/include/asm-x86/numa.h
@@ -5,7 +5,7 @@
 
 #define NODES_SHIFT 6
 
-typedef u8 nodeid_t;
+typedef uint8_t nodeid_t;
 
 extern int srat_rev;
 
@@ -18,7 +18,8 @@ extern cpumask_t     node_to_cpumask[];
 #define node_to_cpumask(node)    (node_to_cpumask[node])
 
 struct node {
-    u64 start,end;
+    paddr_t start;
+    paddr_t end;
 };
 
 extern int compute_hash_shift(struct node *nodes, int numnodes,
@@ -37,13 +38,13 @@ extern void numa_set_node(int cpu, nodeid_t node);
 extern nodeid_t setup_node(unsigned int pxm);
 extern void srat_detect_node(int cpu);
 
-extern void setup_node_bootmem(nodeid_t nodeid, u64 start, u64 end);
+extern void setup_node_bootmem(nodeid_t nodeid, paddr_t start, paddr_t end);
 extern nodeid_t apicid_to_node[];
 extern void init_cpu_to_node(void);
 
 /* Simple perfect hash to map pdx to node numbers */
-extern int memnode_shift;
-extern u8 *memnodemap;
+extern unsigned int memnode_shift;
+extern uint8_t *memnodemap;
 
 struct node_data {
     unsigned long node_start_pfn;
@@ -64,10 +65,10 @@ static inline __attribute__((pure)) nodeid_t phys_to_nid(paddr_t addr)
 #define node_end_pfn(nid)       (NODE_DATA(nid)->node_start_pfn + \
                                  NODE_DATA(nid)->node_spanned_pages)
 
-extern int valid_numa_range(u64 start, u64 end, nodeid_t node);
+extern int valid_numa_range(paddr_t start, paddr_t end, nodeid_t node);
 
-void srat_parse_regions(u64 addr);
-extern u8 __node_distance(nodeid_t a, nodeid_t b);
+void srat_parse_regions(uint64_t addr);
+extern uint8_t __node_distance(nodeid_t a, nodeid_t b);
 unsigned int arch_get_dma_bitsize(void);
 
 #endif
-- 
2.7.4



* [RFC PATCH v2 03/25] x86: NUMA: Rename and sanitize some common functions
  2017-03-28 15:53 [RFC PATCH v2 00/25] ARM: Add Xen NUMA support vijay.kilari
  2017-03-28 15:53 ` [RFC PATCH v2 01/25] x86: NUMA: Clean up: Drop trailing spaces vijay.kilari
  2017-03-28 15:53 ` [RFC PATCH v2 02/25] x86: NUMA: Fix datatypes and attributes vijay.kilari
@ 2017-03-28 15:53 ` vijay.kilari
  2017-06-30 14:05   ` Jan Beulich
  2017-03-28 15:53 ` [RFC PATCH v2 04/25] x86: NUMA: Add accessors for acpi_numa, numa_off and numa_fake variables vijay.kilari
                   ` (21 subsequent siblings)
  24 siblings, 1 reply; 71+ messages in thread
From: vijay.kilari @ 2017-03-28 15:53 UTC (permalink / raw)
  To: xen-devel
  Cc: sstabellini, wei.liu2, George.Dunlap, andrew.cooper3,
	ian.jackson, tim, julien.grall, jbeulich, Vijaya Kumar K

From: Vijaya Kumar K <Vijaya.Kumar@cavium.com>

The following changes are made:
 - Rename compute_hash_shift to compute_memnode_shift
   and return error values instead of the shift.
 - Change the prototypes of populate_memnodemap()
   and extract_lsb_from_nodes().

Signed-off-by: Vijaya Kumar K <Vijaya.Kumar@cavium.com>
---
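Note: the "return error values instead of the shift" convention can be
sketched as below. The shift becomes an output parameter, so the return
value only carries 0 or a negative errno, and a shift of 0 is no longer
confused with a status. sketch_compute_shift() and its bitfield
argument are hypothetical and much simpler than compute_memnode_shift().

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/*
 * Hypothetical helper: find the lowest set bit of 'bitfield' (for
 * example the OR of all node start/end values) and report it through
 * '*shift'. Returns 0 on success or -EINVAL if no bit is set.
 * '*shift' is left untouched on failure.
 */
static int sketch_compute_shift(unsigned long bitfield, unsigned int *shift)
{
    unsigned int i = 0;

    assert(shift != NULL);

    if (bitfield == 0)
        return -EINVAL;

    while (!(bitfield & 1UL))
    {
        bitfield >>= 1;
        i++;
    }

    *shift = i;

    return 0;
}
```

A caller checks the return value first and reads the shift only on
success.
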
 xen/arch/x86/numa.c        | 47 +++++++++++++++++++++++-----------------------
 xen/arch/x86/srat.c        |  7 +++----
 xen/include/asm-x86/numa.h |  4 ++--
 3 files changed, 28 insertions(+), 30 deletions(-)

diff --git a/xen/arch/x86/numa.c b/xen/arch/x86/numa.c
index 8ed31cb..964fc5a 100644
--- a/xen/arch/x86/numa.c
+++ b/xen/arch/x86/numa.c
@@ -53,15 +53,15 @@ int srat_disabled(void)
 /*
  * Given a shift value, try to populate memnodemap[]
  * Returns :
- * 1 if OK
- * 0 if memnodmap[] too small (of shift too small)
- * -1 if node overlap or lost ram (shift too big)
+ * 0 if OK
+ * -EINVAL if memnodmap[] too small (of shift too small)
+ * OR if node overlap or lost ram (shift too big)
  */
-static int __init populate_memnodemap(const struct node *nodes,
-                                      int numnodes, int shift, nodeid_t *nodeids)
+static int __init populate_memnodemap(const struct node *nodes, int numnodes,
+                                      unsigned int shift, nodeid_t *nodeids)
 {
     unsigned long spdx, epdx;
-    int i, res = -1;
+    int i, res = -EINVAL;
 
     memset(memnodemap, NUMA_NO_NODE, memnodemapsize * sizeof(*memnodemap));
     for ( i = 0; i < numnodes; i++ )
@@ -74,7 +74,7 @@ static int __init populate_memnodemap(const struct node *nodes,
             return 0;
         do {
             if ( memnodemap[spdx >> shift] != NUMA_NO_NODE )
-                return -1;
+                return -EINVAL;
 
             if ( !nodeids )
                 memnodemap[spdx >> shift] = i;
@@ -83,7 +83,7 @@ static int __init populate_memnodemap(const struct node *nodes,
 
             spdx += (1UL << shift);
         } while ( spdx < epdx );
-        res = 1;
+        res = 0;
     }
 
     return res;
@@ -99,7 +99,7 @@ static int __init allocate_cachealigned_memnodemap(void)
         printk(KERN_ERR
                "NUMA: Unable to allocate Memory to Node hash map\n");
         memnodemapsize = 0;
-        return -1;
+        return -ENOMEM;
     }
 
     memnodemap = mfn_to_virt(mfn);
@@ -116,10 +116,10 @@ static int __init allocate_cachealigned_memnodemap(void)
  * The LSB of all start and end addresses in the node map is the value of the
  * maximum possible shift.
  */
-static int __init extract_lsb_from_nodes(const struct node *nodes,
-                                         int numnodes)
+static unsigned int __init extract_lsb_from_nodes(const struct node *nodes,
+                                                  int numnodes)
 {
-    int i, nodes_used = 0;
+    unsigned int i, nodes_used = 0;
     unsigned long spdx, epdx;
     unsigned long bitfield = 0, memtop = 0;
 
@@ -143,27 +143,27 @@ static int __init extract_lsb_from_nodes(const struct node *nodes,
     return i;
 }
 
-int __init compute_hash_shift(struct node *nodes, int numnodes,
-                              nodeid_t *nodeids)
+int __init compute_memnode_shift(struct node *nodes, int numnodes,
+                                 nodeid_t *nodeids, unsigned int *shift)
 {
-    int shift;
+    *shift = extract_lsb_from_nodes(nodes, numnodes);
 
-    shift = extract_lsb_from_nodes(nodes, numnodes);
     if ( memnodemapsize <= ARRAY_SIZE(_memnodemap) )
         memnodemap = _memnodemap;
     else if ( allocate_cachealigned_memnodemap() )
-        return -1;
-    printk(KERN_DEBUG "NUMA: Using %d for the hash shift.\n", shift);
+        return -ENOMEM;
+
+    printk(KERN_DEBUG "NUMA: Using %u for the hash shift.\n", *shift);
 
-    if ( populate_memnodemap(nodes, numnodes, shift, nodeids) != 1 )
+    if ( populate_memnodemap(nodes, numnodes, *shift, nodeids) )
     {
         printk(KERN_INFO "Your memory is not aligned you need to "
                "rebuild your hypervisor with a bigger NODEMAPSIZE "
-               "shift=%d\n", shift);
-        return -1;
+               "shift=%u\n", *shift);
+        return -EINVAL;
     }
 
-    return shift;
+    return 0;
 }
 /* initialize NODE_DATA given nodeid and start/end */
 void __init setup_node_bootmem(nodeid_t nodeid, paddr_t start, paddr_t end)
@@ -235,8 +235,7 @@ static int __init numa_emulation(uint64_t start_pfn, uint64_t end_pfn)
                (nodes[i].end - nodes[i].start) >> 20);
         node_set_online(i);
     }
-    memnode_shift = compute_hash_shift(nodes, numa_fake, NULL);
-    if ( memnode_shift < 0 )
+    if ( compute_memnode_shift(nodes, numa_fake, NULL, &memnode_shift) )
     {
         memnode_shift = 0;
         printk(KERN_ERR "No NUMA hash function found. Emulation disabled.\n");
diff --git a/xen/arch/x86/srat.c b/xen/arch/x86/srat.c
index 800a7c3..2d0c047 100644
--- a/xen/arch/x86/srat.c
+++ b/xen/arch/x86/srat.c
@@ -470,10 +470,9 @@ int __init acpi_scan_nodes(uint64_t start, uint64_t end)
 		return -1;
 	}
 
-	memnode_shift = compute_hash_shift(node_memblk_range, num_node_memblks,
-				memblk_nodeid);
-
-	if (memnode_shift < 0) {
+	if (compute_memnode_shift(node_memblk_range, num_node_memblks,
+				  memblk_nodeid, &memnode_shift)) {
+		memnode_shift = 0;
 		printk(KERN_ERR
 		     "SRAT: No NUMA node hash function found. Contact maintainer\n");
 		bad_srat();
diff --git a/xen/include/asm-x86/numa.h b/xen/include/asm-x86/numa.h
index 748cdfd..bb22bff 100644
--- a/xen/include/asm-x86/numa.h
+++ b/xen/include/asm-x86/numa.h
@@ -22,8 +22,8 @@ struct node {
     paddr_t end;
 };
 
-extern int compute_hash_shift(struct node *nodes, int numnodes,
-                              nodeid_t *nodeids);
+extern int compute_memnode_shift(struct node *nodes, int numnodes,
+                                 nodeid_t *nodeids, unsigned int *shift);
 extern nodeid_t pxm_to_node(unsigned int pxm);
 
 #define ZONE_ALIGN (1UL << (MAX_ORDER+PAGE_SHIFT))
-- 
2.7.4


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


* [RFC PATCH v2 04/25] x86: NUMA: Add accessors for acpi_numa, numa_off and numa_fake variables
  2017-03-28 15:53 [RFC PATCH v2 00/25] ARM: Add Xen NUMA support vijay.kilari
                   ` (2 preceding siblings ...)
  2017-03-28 15:53 ` [RFC PATCH v2 03/25] x86: NUMA: Rename and sanitize some common functions vijay.kilari
@ 2017-03-28 15:53 ` vijay.kilari
  2017-04-20 15:59   ` Julien Grall
  2017-06-30 14:07   ` Jan Beulich
  2017-03-28 15:53 ` [RFC PATCH v2 05/25] x86: NUMA: Move generic dummy_numa_init to separate function vijay.kilari
                   ` (20 subsequent siblings)
  24 siblings, 2 replies; 71+ messages in thread
From: vijay.kilari @ 2017-03-28 15:53 UTC (permalink / raw)
  To: xen-devel
  Cc: sstabellini, wei.liu2, George.Dunlap, andrew.cooper3,
	ian.jackson, tim, julien.grall, jbeulich, Vijaya Kumar K

From: Vijaya Kumar K <Vijaya.Kumar@cavium.com>

Make the acpi_numa and numa_off variables static and add accessor
functions for them; numa_fake gets a local accessor as well.
acpi_numa becomes a bool whose initial value is 1 instead of 0.
Also change the return type of srat_disabled() to bool.

Signed-off-by: Vijaya Kumar K <Vijaya.Kumar@cavium.com>
---
 xen/arch/x86/numa.c        | 43 +++++++++++++++++++++++++++++++------------
 xen/arch/x86/setup.c       |  2 +-
 xen/arch/x86/srat.c        | 12 ++++++------
 xen/include/asm-x86/acpi.h |  1 -
 xen/include/asm-x86/numa.h |  5 +----
 xen/include/xen/numa.h     |  3 +++
 6 files changed, 42 insertions(+), 24 deletions(-)
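
For illustration, the accessor pattern used in this patch can be
sketched in isolation as below. This is a simplified model and not the
patch itself: all demo_* names are hypothetical. Note that a bool can
only say "usable" or "not usable", whereas the previous s8 held three
values (-1, 0 and 1).

```c
#include <stdbool.h>

/* File-local state: nothing outside this translation unit can touch it. */
static bool demo_numa_off;          /* false: NUMA has not been disabled */
static bool demo_acpi_numa = true;  /* true: ACPI NUMA data is considered usable */

bool demo_is_numa_off(void)
{
    return demo_numa_off;
}

bool demo_get_acpi_numa(void)
{
    return demo_acpi_numa;
}

void demo_set_acpi_numa(bool val)
{
    demo_acpi_numa = val;
}

/* SRAT is ignored when NUMA is off or the ACPI data has been rejected. */
bool demo_srat_disabled(void)
{
    return demo_is_numa_off() || !demo_get_acpi_numa();
}
```

Callers in other files then depend only on the function prototypes, so
the variables themselves can later move to common code without touching
the users.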

diff --git a/xen/arch/x86/numa.c b/xen/arch/x86/numa.c
index 964fc5a..6b794a7 100644
--- a/xen/arch/x86/numa.c
+++ b/xen/arch/x86/numa.c
@@ -42,12 +42,27 @@ cpumask_t __read_mostly node_to_cpumask[MAX_NUMNODES];
 
 nodemask_t __read_mostly node_online_map = { { [0] = 1UL } };
 
-bool numa_off = 0;
-s8 acpi_numa = 0;
+static bool numa_off = 0;
+static bool acpi_numa = 1;
 
-int srat_disabled(void)
+bool is_numa_off(void)
 {
-    return numa_off || acpi_numa < 0;
+    return numa_off;
+}
+
+bool get_acpi_numa(void)
+{
+    return acpi_numa;
+}
+
+void set_acpi_numa(bool val)
+{
+    acpi_numa = val;
+}
+
+bool srat_disabled(void)
+{
+    return numa_off || acpi_numa == 0;
 }
 
 /*
@@ -202,13 +217,17 @@ void __init numa_init_array(void)
 
 #ifdef CONFIG_NUMA_EMU
 static int __initdata numa_fake = 0;
+static int get_numa_fake(void)
+{
+    return numa_fake;
+}
 
 /* Numa emulation */
 static int __init numa_emulation(uint64_t start_pfn, uint64_t end_pfn)
 {
     int i;
     struct node nodes[MAX_NUMNODES];
-    uint64_t sz = ((end_pfn - start_pfn) << PAGE_SHIFT) / numa_fake;
+    uint64_t sz = ((end_pfn - start_pfn) << PAGE_SHIFT) / get_numa_fake();
 
     /* Kludge needed for the hash function */
     if ( hweight64(sz) > 1 )
@@ -223,10 +242,10 @@ static int __init numa_emulation(uint64_t start_pfn, uint64_t end_pfn)
     }
 
     memset(&nodes,0,sizeof(nodes));
-    for ( i = 0; i < numa_fake; i++ )
+    for ( i = 0; i < get_numa_fake(); i++ )
     {
         nodes[i].start = (start_pfn << PAGE_SHIFT) + i * sz;
-        if ( i == numa_fake - 1 )
+        if ( i == get_numa_fake() - 1 )
             sz = (end_pfn << PAGE_SHIFT) - nodes[i].start;
         nodes[i].end = nodes[i].start + sz;
         printk(KERN_INFO
@@ -235,7 +254,7 @@ static int __init numa_emulation(uint64_t start_pfn, uint64_t end_pfn)
                (nodes[i].end - nodes[i].start) >> 20);
         node_set_online(i);
     }
-    if ( compute_memnode_shift(nodes, numa_fake, NULL, &memnode_shift) )
+    if ( compute_memnode_shift(nodes, get_numa_fake(), NULL, &memnode_shift) )
     {
         memnode_shift = 0;
         printk(KERN_ERR "No NUMA hash function found. Emulation disabled.\n");
@@ -254,18 +273,18 @@ void __init numa_initmem_init(unsigned long start_pfn, unsigned long end_pfn)
     int i;
 
 #ifdef CONFIG_NUMA_EMU
-    if ( numa_fake && !numa_emulation(start_pfn, end_pfn) )
+    if ( get_numa_fake() && !numa_emulation(start_pfn, end_pfn) )
         return;
 #endif
 
 #ifdef CONFIG_ACPI_NUMA
-    if ( !numa_off && !acpi_scan_nodes((uint64_t)start_pfn << PAGE_SHIFT,
+    if ( !is_numa_off() && !acpi_scan_nodes((uint64_t)start_pfn << PAGE_SHIFT,
          (uint64_t)end_pfn << PAGE_SHIFT) )
         return;
 #endif
 
     printk(KERN_INFO "%s\n",
-           numa_off ? "NUMA turned off" : "No NUMA configuration found");
+           is_numa_off() ? "NUMA turned off" : "No NUMA configuration found");
 
     printk(KERN_INFO "Faking a node at %016"PRIx64"-%016"PRIx64"\n",
            (uint64_t)start_pfn << PAGE_SHIFT,
@@ -312,7 +331,7 @@ static int __init numa_setup(char *opt)
     if ( !strncmp(opt,"noacpi",6) )
     {
         numa_off = 0;
-        acpi_numa = -1;
+        acpi_numa = 0;
     }
 #endif
 
diff --git a/xen/arch/x86/setup.c b/xen/arch/x86/setup.c
index 1cd290e..4410e53 100644
--- a/xen/arch/x86/setup.c
+++ b/xen/arch/x86/setup.c
@@ -241,7 +241,7 @@ void srat_detect_node(int cpu)
     node_set_online(node);
     numa_set_node(cpu, node);
 
-    if ( opt_cpu_info && acpi_numa > 0 )
+    if ( opt_cpu_info && get_acpi_numa() )
         printk("CPU %d APIC %d -> Node %d\n", cpu, apicid, node);
 }
 
diff --git a/xen/arch/x86/srat.c b/xen/arch/x86/srat.c
index 2d0c047..ccacbcd 100644
--- a/xen/arch/x86/srat.c
+++ b/xen/arch/x86/srat.c
@@ -153,7 +153,7 @@ static void __init bad_srat(void)
 {
 	int i;
 	printk(KERN_ERR "SRAT: SRAT not used.\n");
-	acpi_numa = -1;
+	set_acpi_numa(0);
 	for (i = 0; i < MAX_LOCAL_APIC; i++)
 		apicid_to_node[i] = NUMA_NO_NODE;
 	for (i = 0; i < ARRAY_SIZE(pxm2node); i++)
@@ -232,7 +232,7 @@ acpi_numa_x2apic_affinity_init(const struct acpi_srat_x2apic_cpu_affinity *pa)
 
 	apicid_to_node[pa->apic_id] = node;
 	node_set(node, processor_nodes_parsed);
-	acpi_numa = 1;
+	set_acpi_numa(1);
 	printk(KERN_INFO "SRAT: PXM %u -> APIC %08x -> Node %u\n",
 	       pxm, pa->apic_id, node);
 }
@@ -265,7 +265,7 @@ acpi_numa_processor_affinity_init(const struct acpi_srat_cpu_affinity *pa)
 	}
 	apicid_to_node[pa->apic_id] = node;
 	node_set(node, processor_nodes_parsed);
-	acpi_numa = 1;
+	set_acpi_numa(1);
 	printk(KERN_INFO "SRAT: PXM %u -> APIC %02x -> Node %u\n",
 	       pxm, pa->apic_id, node);
 }
@@ -418,7 +418,7 @@ static int __init srat_parse_region(struct acpi_subtable_header *header,
 	    (ma->flags & ACPI_SRAT_MEM_NON_VOLATILE))
 		return 0;
 
-	if (numa_off)
+	if (is_numa_off())
 		printk(KERN_INFO "SRAT: %013"PRIx64"-%013"PRIx64"\n",
 		       ma->base_address, ma->base_address + ma->length - 1);
 
@@ -433,7 +433,7 @@ void __init srat_parse_regions(uint64_t addr)
 	uint64_t mask;
 	unsigned int i;
 
-	if (acpi_disabled || acpi_numa < 0 ||
+	if (acpi_disabled || (get_acpi_numa() == 0) ||
 	    acpi_table_parse(ACPI_SIG_SRAT, acpi_parse_srat))
 		return;
 
@@ -462,7 +462,7 @@ int __init acpi_scan_nodes(uint64_t start, uint64_t end)
 	for (i = 0; i < MAX_NUMNODES; i++)
 		cutoff_node(i, start, end);
 
-	if (acpi_numa <= 0)
+	if (get_acpi_numa() == 0)
 		return -1;
 
 	if (!nodes_cover_memory()) {
diff --git a/xen/include/asm-x86/acpi.h b/xen/include/asm-x86/acpi.h
index a766688..9298d42 100644
--- a/xen/include/asm-x86/acpi.h
+++ b/xen/include/asm-x86/acpi.h
@@ -103,7 +103,6 @@ extern void acpi_reserve_bootmem(void);
 
 #define ARCH_HAS_POWER_INIT	1
 
-extern s8 acpi_numa;
 extern int acpi_scan_nodes(u64 start, u64 end);
 #define NR_NODE_MEMBLKS (MAX_NUMNODES*2)
 
diff --git a/xen/include/asm-x86/numa.h b/xen/include/asm-x86/numa.h
index bb22bff..ae5768b 100644
--- a/xen/include/asm-x86/numa.h
+++ b/xen/include/asm-x86/numa.h
@@ -30,10 +30,7 @@ extern nodeid_t pxm_to_node(unsigned int pxm);
 
 extern void numa_add_cpu(int cpu);
 extern void numa_init_array(void);
-extern bool_t numa_off;
-
-
-extern int srat_disabled(void);
+extern bool srat_disabled(void);
 extern void numa_set_node(int cpu, nodeid_t node);
 extern nodeid_t setup_node(unsigned int pxm);
 extern void srat_detect_node(int cpu);
diff --git a/xen/include/xen/numa.h b/xen/include/xen/numa.h
index 7aef1a8..7f6d090 100644
--- a/xen/include/xen/numa.h
+++ b/xen/include/xen/numa.h
@@ -18,4 +18,7 @@
   (((d)->vcpu != NULL && (d)->vcpu[0] != NULL) \
    ? vcpu_to_node((d)->vcpu[0]) : NUMA_NO_NODE)
 
+bool is_numa_off(void);
+bool get_acpi_numa(void);
+void set_acpi_numa(bool val);
 #endif /* _XEN_NUMA_H */
-- 
2.7.4



* [RFC PATCH v2 05/25] x86: NUMA: Move generic dummy_numa_init to separate function
  2017-03-28 15:53 [RFC PATCH v2 00/25] ARM: Add Xen NUMA support vijay.kilari
                   ` (3 preceding siblings ...)
  2017-03-28 15:53 ` [RFC PATCH v2 04/25] x86: NUMA: Add accessors for acpi_numa, numa_off and numa_fake variables vijay.kilari
@ 2017-03-28 15:53 ` vijay.kilari
  2017-04-20 16:12   ` Julien Grall
  2017-06-30 14:08   ` Jan Beulich
  2017-03-28 15:53 ` [RFC PATCH v2 06/25] x86: NUMA: Add accessors for nodes[] and node_memblk_range[] structs vijay.kilari
                   ` (19 subsequent siblings)
  24 siblings, 2 replies; 71+ messages in thread
From: vijay.kilari @ 2017-03-28 15:53 UTC (permalink / raw)
  To: xen-devel
  Cc: sstabellini, wei.liu2, George.Dunlap, andrew.cooper3,
	ian.jackson, tim, julien.grall, jbeulich, Vijaya Kumar K

From: Vijaya Kumar K <Vijaya.Kumar@cavium.com>

Split numa_initmem_init() so that the generic NUMA fallback code
moves into a separate function, numa_dummy_init().

Signed-off-by: Vijaya Kumar K <Vijaya.Kumar@cavium.com>
---
 xen/arch/x86/numa.c | 29 +++++++++++++++++------------
 1 file changed, 17 insertions(+), 12 deletions(-)
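
The resulting control flow (try each configuration source, then fall
back to a single dummy node) can be sketched as follows. This is a
hedged, standalone model rather than the Xen code: the probe table and
every demo_* name are assumptions made for the example.

```c
#include <stddef.h>

/* Hypothetical probe signature: returns 0 if the probe configured NUMA. */
typedef int (*demo_probe_fn)(unsigned long start_pfn, unsigned long end_pfn);

static unsigned int demo_nr_nodes;  /* how many nodes ended up configured */

/* Fallback: treat the whole range as one node. */
static void demo_dummy_init(unsigned long start_pfn, unsigned long end_pfn)
{
    (void)start_pfn;
    (void)end_pfn;
    demo_nr_nodes = 1;
}

/* Example probe that never finds a configuration. */
static int demo_probe_fails(unsigned long start_pfn, unsigned long end_pfn)
{
    (void)start_pfn;
    (void)end_pfn;
    return -1;
}

/* Example probe that pretends to find two nodes. */
static int demo_probe_two_nodes(unsigned long start_pfn, unsigned long end_pfn)
{
    (void)start_pfn;
    (void)end_pfn;
    demo_nr_nodes = 2;
    return 0;
}

/* Try each probe in order; use the fallback only if none succeeds. */
static void demo_initmem_init(const demo_probe_fn *probes, size_t n,
                              unsigned long start_pfn, unsigned long end_pfn)
{
    size_t i;

    for (i = 0; i < n; i++)
        if (probes[i] != NULL && probes[i](start_pfn, end_pfn) == 0)
            return;

    demo_dummy_init(start_pfn, end_pfn);
}
```

Keeping the fallback in its own function means another architecture
can reuse it while supplying a different set of probes.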

diff --git a/xen/arch/x86/numa.c b/xen/arch/x86/numa.c
index 6b794a7..0888d53 100644
--- a/xen/arch/x86/numa.c
+++ b/xen/arch/x86/numa.c
@@ -268,21 +268,10 @@ static int __init numa_emulation(uint64_t start_pfn, uint64_t end_pfn)
 }
 #endif
 
-void __init numa_initmem_init(unsigned long start_pfn, unsigned long end_pfn)
+static void __init numa_dummy_init(unsigned long start_pfn, unsigned long end_pfn)
 {
     int i;
 
-#ifdef CONFIG_NUMA_EMU
-    if ( get_numa_fake() && !numa_emulation(start_pfn, end_pfn) )
-        return;
-#endif
-
-#ifdef CONFIG_ACPI_NUMA
-    if ( !is_numa_off() && !acpi_scan_nodes((uint64_t)start_pfn << PAGE_SHIFT,
-         (uint64_t)end_pfn << PAGE_SHIFT) )
-        return;
-#endif
-
     printk(KERN_INFO "%s\n",
            is_numa_off() ? "NUMA turned off" : "No NUMA configuration found");
 
@@ -301,6 +290,22 @@ void __init numa_initmem_init(unsigned long start_pfn, unsigned long end_pfn)
                     (paddr_t)end_pfn << PAGE_SHIFT);
 }
 
+void __init numa_initmem_init(unsigned long start_pfn, unsigned long end_pfn)
+{
+#ifdef CONFIG_NUMA_EMU
+    if ( get_numa_fake() && !numa_emulation(start_pfn, end_pfn) )
+        return;
+#endif
+
+#ifdef CONFIG_ACPI_NUMA
+    if ( !is_numa_off() && !acpi_scan_nodes((uint64_t)start_pfn << PAGE_SHIFT,
+         (uint64_t)end_pfn << PAGE_SHIFT) )
+        return;
+#endif
+
+    numa_dummy_init(start_pfn, end_pfn);
+}
+
 void numa_add_cpu(int cpu)
 {
     cpumask_set_cpu(cpu, &node_to_cpumask[cpu_to_node(cpu)]);
-- 
2.7.4



* [RFC PATCH v2 06/25] x86: NUMA: Add accessors for nodes[] and node_memblk_range[] structs
  2017-03-28 15:53 [RFC PATCH v2 00/25] ARM: Add Xen NUMA support vijay.kilari
                   ` (4 preceding siblings ...)
  2017-03-28 15:53 ` [RFC PATCH v2 05/25] x86: NUMA: Move generic dummy_numa_init to separate function vijay.kilari
@ 2017-03-28 15:53 ` vijay.kilari
  2017-05-08 14:39   ` Julien Grall
  2017-03-28 15:53 ` [RFC PATCH v2 07/25] x86: NUMA: Rename some generic functions vijay.kilari
                   ` (18 subsequent siblings)
  24 siblings, 1 reply; 71+ messages in thread
From: vijay.kilari @ 2017-03-28 15:53 UTC (permalink / raw)
  To: xen-devel
  Cc: sstabellini, wei.liu2, George.Dunlap, andrew.cooper3,
	ian.jackson, tim, julien.grall, jbeulich, Vijaya Kumar K

From: Vijaya Kumar K <Vijaya.Kumar@cavium.com>

Add accessors for nodes[] and other static variables, and
use those accessors.

Signed-off-by: Vijaya Kumar K <Vijaya.Kumar@cavium.com>
---
 xen/arch/x86/srat.c | 108 +++++++++++++++++++++++++++++++++++++++-------------
 1 file changed, 82 insertions(+), 26 deletions(-)
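
One point to watch with an "add memblk" helper that increments the
block counter internally: a caller that needs the index of the block it
just added can no longer read the counter afterwards, because it
already points at the next free slot. A minimal sketch of one way to
handle that, returning the new index from the helper, is shown below.
It is an illustration under assumed names (demo_memblk,
demo_add_memblk, DEMO_MAX_MEMBLKS), not the helper from this patch.

```c
#include <errno.h>
#include <stdint.h>

#define DEMO_MAX_MEMBLKS 4

/* Hypothetical memory block record: [start, end) plus owning node. */
struct demo_memblk {
    uint64_t start;
    uint64_t end;
    unsigned int node;
};

static struct demo_memblk demo_blks[DEMO_MAX_MEMBLKS];
static unsigned int demo_num_blks;

/*
 * Append a block. Returns the index of the new block (>= 0), -ENOSPC when
 * the table is full, or -EINVAL for an empty or wrapping range.
 */
static int demo_add_memblk(unsigned int node, uint64_t start, uint64_t size)
{
    unsigned int idx;

    if (demo_num_blks >= DEMO_MAX_MEMBLKS)
        return -ENOSPC;

    if (size == 0 || start + size < start)
        return -EINVAL;

    idx = demo_num_blks++;
    demo_blks[idx].start = start;
    demo_blks[idx].end = start + size;
    demo_blks[idx].node = node;

    return (int)idx;
}
```

A caller can then use the returned index directly, for example to mark
the block in a per-block bitmap, without depending on whether the
counter was read before or after the increment.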

diff --git a/xen/arch/x86/srat.c b/xen/arch/x86/srat.c
index ccacbcd..983e1d8 100644
--- a/xen/arch/x86/srat.c
+++ b/xen/arch/x86/srat.c
@@ -41,7 +41,45 @@ static struct node node_memblk_range[NR_NODE_MEMBLKS];
 static nodeid_t memblk_nodeid[NR_NODE_MEMBLKS];
 static __initdata DECLARE_BITMAP(memblk_hotplug, NR_NODE_MEMBLKS);
 
-static inline bool node_found(unsigned idx, unsigned pxm)
+static struct node *get_numa_node(int id)
+{
+	return &nodes[id];
+}
+
+static nodeid_t get_memblk_nodeid(int id)
+{
+	return memblk_nodeid[id];
+}
+
+static nodeid_t *get_memblk_nodeid_map(void)
+{
+	return &memblk_nodeid[0];
+}
+
+static struct node *get_node_memblk_range(int memblk)
+{
+	return &node_memblk_range[memblk];
+}
+
+static int get_num_node_memblks(void)
+{
+	return num_node_memblks;
+}
+
+static int __init numa_add_memblk(nodeid_t nodeid, paddr_t start, uint64_t size)
+{
+	if (nodeid >= NR_NODE_MEMBLKS)
+		return -EINVAL;
+
+	node_memblk_range[num_node_memblks].start = start;
+	node_memblk_range[num_node_memblks].end = start + size;
+	memblk_nodeid[num_node_memblks] = nodeid;
+	num_node_memblks++;
+
+	return 0;
+}
+
+static inline bool node_found(unsigned int idx, unsigned int pxm)
 {
 	return ((pxm2node[idx].pxm == pxm) &&
 		(pxm2node[idx].node != NUMA_NO_NODE));
@@ -107,11 +145,11 @@ int valid_numa_range(paddr_t start, paddr_t end, nodeid_t node)
 {
 	int i;
 
-	for (i = 0; i < num_node_memblks; i++) {
-		struct node *nd = &node_memblk_range[i];
+	for (i = 0; i < get_num_node_memblks(); i++) {
+		struct node *nd = get_node_memblk_range(i);
 
 		if (nd->start <= start && nd->end > end &&
-			memblk_nodeid[i] == node )
+		    get_memblk_nodeid(i) == node)
 			return 1;
 	}
 
@@ -122,8 +160,8 @@ static int __init conflicting_memblks(paddr_t start, paddr_t end)
 {
 	int i;
 
-	for (i = 0; i < num_node_memblks; i++) {
-		struct node *nd = &node_memblk_range[i];
+	for (i = 0; i < get_num_node_memblks(); i++) {
+		struct node *nd = get_node_memblk_range(i);
 		if (nd->start == nd->end)
 			continue;
 		if (nd->end > start && nd->start < end)
@@ -136,7 +174,8 @@ static int __init conflicting_memblks(paddr_t start, paddr_t end)
 
 static void __init cutoff_node(int i, paddr_t start, paddr_t end)
 {
-	struct node *nd = &nodes[i];
+	struct node *nd = get_numa_node(i);
+
 	if (nd->start < start) {
 		nd->start = start;
 		if (nd->end < nd->start)
@@ -278,6 +317,7 @@ acpi_numa_memory_affinity_init(const struct acpi_srat_mem_affinity *ma)
 	unsigned pxm;
 	nodeid_t node;
 	int i;
+	struct node *memblk;
 
 	if (srat_disabled())
 		return;
@@ -288,7 +328,7 @@ acpi_numa_memory_affinity_init(const struct acpi_srat_mem_affinity *ma)
 	if (!(ma->flags & ACPI_SRAT_MEM_ENABLED))
 		return;
 
-	if (num_node_memblks >= NR_NODE_MEMBLKS)
+	if (get_num_node_memblks() >= NR_NODE_MEMBLKS)
 	{
 		dprintk(XENLOG_WARNING,
                 "Too many numa entry, try bigger NR_NODE_MEMBLKS \n");
@@ -310,27 +350,31 @@ acpi_numa_memory_affinity_init(const struct acpi_srat_mem_affinity *ma)
 	i = conflicting_memblks(start, end);
 	if (i < 0)
 		/* everything fine */;
-	else if (memblk_nodeid[i] == node) {
+	else if (get_memblk_nodeid(i) == node) {
 		bool mismatch = !(ma->flags & ACPI_SRAT_MEM_HOT_PLUGGABLE) !=
 		                !test_bit(i, memblk_hotplug);
 
+		memblk = get_node_memblk_range(i);
+
 		printk("%sSRAT: PXM %u (%"PRIx64"-%"PRIx64") overlaps with itself (%"PRIx64"-%"PRIx64")\n",
 		       mismatch ? KERN_ERR : KERN_WARNING, pxm, start, end,
-		       node_memblk_range[i].start, node_memblk_range[i].end);
+		       memblk->start, memblk->end);
 		if (mismatch) {
 			bad_srat();
 			return;
 		}
 	} else {
+		memblk = get_node_memblk_range(i);
+
 		printk(KERN_ERR
 		       "SRAT: PXM %u (%"PRIx64"-%"PRIx64") overlaps with PXM %u (%"PRIx64"-%"PRIx64")\n",
-		       pxm, start, end, node_to_pxm(memblk_nodeid[i]),
-		       node_memblk_range[i].start, node_memblk_range[i].end);
+		       pxm, start, end, node_to_pxm(get_memblk_nodeid(i)),
+		       memblk->start, memblk->end);
 		bad_srat();
 		return;
 	}
 	if (!(ma->flags & ACPI_SRAT_MEM_HOT_PLUGGABLE)) {
-		struct node *nd = &nodes[node];
+		struct node *nd = get_numa_node(node);
 
 		if (!node_test_and_set(node, memory_nodes_parsed)) {
 			nd->start = start;
@@ -346,15 +390,17 @@ acpi_numa_memory_affinity_init(const struct acpi_srat_mem_affinity *ma)
 	       node, pxm, start, end,
 	       ma->flags & ACPI_SRAT_MEM_HOT_PLUGGABLE ? " (hotplug)" : "");
 
-	node_memblk_range[num_node_memblks].start = start;
-	node_memblk_range[num_node_memblks].end = end;
-	memblk_nodeid[num_node_memblks] = node;
+	if (numa_add_memblk(node, start, ma->length)) {
+		printk(KERN_ERR "SRAT: node-id %u out of range\n", node);
+		bad_srat();
+		return;
+	}
+
 	if (ma->flags & ACPI_SRAT_MEM_HOT_PLUGGABLE) {
-		__set_bit(num_node_memblks, memblk_hotplug);
+		__set_bit(get_num_node_memblks() - 1, memblk_hotplug);
 		if (end > mem_hotplug)
 			mem_hotplug = end;
 	}
-	num_node_memblks++;
 }
 
 /* Sanity check to catch more bad SRATs (they are amazingly common).
@@ -377,17 +423,21 @@ static int __init nodes_cover_memory(void)
 		do {
 			found = 0;
 			for_each_node_mask(j, memory_nodes_parsed)
-				if (start < nodes[j].end
-				    && end > nodes[j].start) {
-					if (start >= nodes[j].start) {
-						start = nodes[j].end;
+			{
+				struct node *nd = get_numa_node(j);
+
+				if (start < nd->end
+				    && end > nd->start) {
+					if (start >= nd->start) {
+						start = nd->end;
 						found = 1;
 					}
-					if (end <= nodes[j].end) {
-						end = nodes[j].start;
+					if (end <= nd->end) {
+						end = nd->start;
 						found = 1;
 					}
 				}
+			}
 		} while (found && start < end);
 
 		if (start < end) {
@@ -457,6 +507,8 @@ int __init acpi_scan_nodes(uint64_t start, uint64_t end)
 {
 	int i;
 	nodemask_t all_nodes_parsed;
+	struct node *memblks;
+	nodeid_t *nodeids;
 
 	/* First clean up the node list */
 	for (i = 0; i < MAX_NUMNODES; i++)
@@ -470,6 +522,8 @@ int __init acpi_scan_nodes(uint64_t start, uint64_t end)
 		return -1;
 	}
 
+	memblks = get_node_memblk_range(0);
+	nodeids = get_memblk_nodeid_map();
 	if (compute_memnode_shift(node_memblk_range, num_node_memblks,
 				  memblk_nodeid, &memnode_shift)) {
 		memnode_shift = 0;
@@ -484,12 +538,14 @@ int __init acpi_scan_nodes(uint64_t start, uint64_t end)
 	/* Finally register nodes */
 	for_each_node_mask(i, all_nodes_parsed)
 	{
-		uint64_t size = nodes[i].end - nodes[i].start;
+		struct node *nd = get_numa_node(i);
+		uint64_t size = nd->end - nd->start;
+
 		if ( size == 0 )
 			printk(KERN_WARNING "SRAT: Node %u has no memory. "
 			       "BIOS Bug or mis-configured hardware?\n", i);
 
-		setup_node_bootmem(i, nodes[i].start, nodes[i].end);
+		setup_node_bootmem(i, nd->start, nd->end);
 	}
 	for (i = 0; i < nr_cpu_ids; i++) {
 		if (cpu_to_node[i] == NUMA_NO_NODE)
-- 
2.7.4



* [RFC PATCH v2 07/25] x86: NUMA: Rename some generic functions
  2017-03-28 15:53 [RFC PATCH v2 00/25] ARM: Add Xen NUMA support vijay.kilari
                   ` (5 preceding siblings ...)
  2017-03-28 15:53 ` [RFC PATCH v2 06/25] x86: NUMA: Add accessors for nodes[] and node_memblk_range[] structs vijay.kilari
@ 2017-03-28 15:53 ` vijay.kilari
  2017-03-28 15:53 ` [RFC PATCH v2 08/25] x86: NUMA: Sanitize node distance vijay.kilari
                   ` (17 subsequent siblings)
  24 siblings, 0 replies; 71+ messages in thread
From: vijay.kilari @ 2017-03-28 15:53 UTC (permalink / raw)
  To: xen-devel
  Cc: sstabellini, wei.liu2, George.Dunlap, andrew.cooper3,
	ian.jackson, tim, julien.grall, jbeulich, Vijaya Kumar K

From: Vijaya Kumar K <Vijaya.Kumar@cavium.com>

Rename some functions in the ACPI code as follows:
 - Rename setup_node to acpi_setup_node
 - Rename bad_srat to numa_failed
 - Rename nodes_cover_memory to arch_sanitize_nodes_memory
 - Rename acpi_scan_nodes to numa_scan_nodes

Also introduce reset_pxm2node() to reset pxm2node variable.
This avoids exporting pxm2node.

Signed-off-by: Vijaya Kumar K <Vijaya.Kumar@cavium.com>
---
 xen/arch/x86/numa.c        |  2 +-
 xen/arch/x86/smpboot.c     |  2 +-
 xen/arch/x86/srat.c        | 51 ++++++++++++++++++++++++++--------------------
 xen/arch/x86/x86_64/mm.c   |  2 +-
 xen/include/asm-x86/acpi.h |  2 +-
 xen/include/asm-x86/numa.h |  2 +-
 6 files changed, 34 insertions(+), 27 deletions(-)
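
As an illustration of why a reset helper avoids exporting the table,
here is a small standalone model of a PXM-to-node map that stays
private to one file. It is a sketch only: the table size, the
DEMO_NO_NODE value and all demo_* names are assumptions, not Xen
definitions.

```c
#include <stddef.h>

#define DEMO_NO_NODE  0xFFu
#define DEMO_MAP_SIZE 4

/* Hypothetical mapping entry: proximity domain -> node id. */
struct demo_pxm2node {
    unsigned int pxm;
    unsigned char node;
};

/* Private to this file; zero-initialised, so reset it before first use. */
static struct demo_pxm2node demo_map[DEMO_MAP_SIZE];

/* Mark every slot as unused. */
void demo_reset_pxm2node(void)
{
    size_t i;

    for (i = 0; i < DEMO_MAP_SIZE; i++)
        demo_map[i].node = DEMO_NO_NODE;
}

/* Look up the node for a proximity domain, or DEMO_NO_NODE if unknown. */
unsigned char demo_pxm_to_node(unsigned int pxm)
{
    size_t i;

    for (i = 0; i < DEMO_MAP_SIZE; i++)
        if (demo_map[i].pxm == pxm && demo_map[i].node != DEMO_NO_NODE)
            return demo_map[i].node;

    return DEMO_NO_NODE;
}

/* Bind a proximity domain to a node in the first free slot; -1 if full. */
int demo_bind_pxm(unsigned int pxm, unsigned char node)
{
    size_t i;

    for (i = 0; i < DEMO_MAP_SIZE; i++) {
        if (demo_map[i].node == DEMO_NO_NODE) {
            demo_map[i].pxm = pxm;
            demo_map[i].node = node;
            return 0;
        }
    }

    return -1;
}
```

Failure paths elsewhere only need to call the reset function, so the
array layout can change without affecting them.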

diff --git a/xen/arch/x86/numa.c b/xen/arch/x86/numa.c
index 0888d53..3bdab9a 100644
--- a/xen/arch/x86/numa.c
+++ b/xen/arch/x86/numa.c
@@ -298,7 +298,7 @@ void __init numa_initmem_init(unsigned long start_pfn, unsigned long end_pfn)
 #endif
 
 #ifdef CONFIG_ACPI_NUMA
-    if ( !is_numa_off() && !acpi_scan_nodes((uint64_t)start_pfn << PAGE_SHIFT,
+    if ( !is_numa_off() && !numa_scan_nodes((uint64_t)start_pfn << PAGE_SHIFT,
          (uint64_t)end_pfn << PAGE_SHIFT) )
         return;
 #endif
diff --git a/xen/arch/x86/smpboot.c b/xen/arch/x86/smpboot.c
index 82559ed..203733e 100644
--- a/xen/arch/x86/smpboot.c
+++ b/xen/arch/x86/smpboot.c
@@ -959,7 +959,7 @@ int cpu_add(uint32_t apic_id, uint32_t acpi_id, uint32_t pxm)
 
     if ( !srat_disabled() )
     {
-        nodeid_t node = setup_node(pxm);
+        nodeid_t node = acpi_setup_node(pxm);
 
         if ( node == NUMA_NO_NODE )
         {
diff --git a/xen/arch/x86/srat.c b/xen/arch/x86/srat.c
index 983e1d8..3ade36d 100644
--- a/xen/arch/x86/srat.c
+++ b/xen/arch/x86/srat.c
@@ -85,6 +85,14 @@ static inline bool node_found(unsigned int idx, unsigned int pxm)
 		(pxm2node[idx].node != NUMA_NO_NODE));
 }
 
+static void reset_pxm2node(void)
+{
+	unsigned int i;
+
+	for (i = 0; i < ARRAY_SIZE(pxm2node); i++)
+		pxm2node[i].node = NUMA_NO_NODE;
+}
+
 nodeid_t pxm_to_node(unsigned int pxm)
 {
 	unsigned int i;
@@ -99,7 +107,7 @@ nodeid_t pxm_to_node(unsigned int pxm)
 	return NUMA_NO_NODE;
 }
 
-nodeid_t setup_node(unsigned pxm)
+nodeid_t acpi_setup_node(unsigned int pxm)
 {
 	nodeid_t node;
 	unsigned int idx;
@@ -188,15 +196,14 @@ static void __init cutoff_node(int i, paddr_t start, paddr_t end)
 	}
 }
 
-static void __init bad_srat(void)
+static void __init numa_failed(void)
 {
 	int i;
 	printk(KERN_ERR "SRAT: SRAT not used.\n");
 	set_acpi_numa(0);
 	for (i = 0; i < MAX_LOCAL_APIC; i++)
 		apicid_to_node[i] = NUMA_NO_NODE;
-	for (i = 0; i < ARRAY_SIZE(pxm2node); i++)
-		pxm2node[i].node = NUMA_NO_NODE;
+	reset_pxm2node();
 	mem_hotplug = 0;
 }
 
@@ -252,7 +259,7 @@ acpi_numa_x2apic_affinity_init(const struct acpi_srat_x2apic_cpu_affinity *pa)
 	if (srat_disabled())
 		return;
 	if (pa->header.length < sizeof(struct acpi_srat_x2apic_cpu_affinity)) {
-		bad_srat();
+		numa_failed();
 		return;
 	}
 	if (!(pa->flags & ACPI_SRAT_CPU_ENABLED))
@@ -263,9 +270,9 @@ acpi_numa_x2apic_affinity_init(const struct acpi_srat_x2apic_cpu_affinity *pa)
 	}
 
 	pxm = pa->proximity_domain;
-	node = setup_node(pxm);
+	node = acpi_setup_node(pxm);
 	if (node == NUMA_NO_NODE) {
-		bad_srat();
+		numa_failed();
 		return;
 	}
 
@@ -286,7 +293,7 @@ acpi_numa_processor_affinity_init(const struct acpi_srat_cpu_affinity *pa)
 	if (srat_disabled())
 		return;
 	if (pa->header.length != sizeof(struct acpi_srat_cpu_affinity)) {
-		bad_srat();
+		numa_failed();
 		return;
 	}
 	if (!(pa->flags & ACPI_SRAT_CPU_ENABLED))
@@ -297,9 +304,9 @@ acpi_numa_processor_affinity_init(const struct acpi_srat_cpu_affinity *pa)
 		pxm |= pa->proximity_domain_hi[1] << 16;
 		pxm |= pa->proximity_domain_hi[2] << 24;
 	}
-	node = setup_node(pxm);
+	node = acpi_setup_node(pxm);
 	if (node == NUMA_NO_NODE) {
-		bad_srat();
+		numa_failed();
 		return;
 	}
 	apicid_to_node[pa->apic_id] = node;
@@ -322,7 +329,7 @@ acpi_numa_memory_affinity_init(const struct acpi_srat_mem_affinity *ma)
 	if (srat_disabled())
 		return;
 	if (ma->header.length != sizeof(struct acpi_srat_mem_affinity)) {
-		bad_srat();
+		numa_failed();
 		return;
 	}
 	if (!(ma->flags & ACPI_SRAT_MEM_ENABLED))
@@ -332,7 +339,7 @@ acpi_numa_memory_affinity_init(const struct acpi_srat_mem_affinity *ma)
 	{
 		dprintk(XENLOG_WARNING,
                 "Too many numa entry, try bigger NR_NODE_MEMBLKS \n");
-		bad_srat();
+		numa_failed();
 		return;
 	}
 
@@ -341,9 +348,9 @@ acpi_numa_memory_affinity_init(const struct acpi_srat_mem_affinity *ma)
 	pxm = ma->proximity_domain;
 	if (srat_rev < 2)
 		pxm &= 0xff;
-	node = setup_node(pxm);
+	node = acpi_setup_node(pxm);
 	if (node == NUMA_NO_NODE) {
-		bad_srat();
+		numa_failed();
 		return;
 	}
 	/* It is fine to add this area to the nodes data it will be used later*/
@@ -360,7 +367,7 @@ acpi_numa_memory_affinity_init(const struct acpi_srat_mem_affinity *ma)
 		       mismatch ? KERN_ERR : KERN_WARNING, pxm, start, end,
 		       memblk->start, memblk->end);
 		if (mismatch) {
-			bad_srat();
+			numa_failed();
 			return;
 		}
 	} else {
@@ -370,7 +377,7 @@ acpi_numa_memory_affinity_init(const struct acpi_srat_mem_affinity *ma)
 		       "SRAT: PXM %u (%"PRIx64"-%"PRIx64") overlaps with PXM %u (%"PRIx64"-%"PRIx64")\n",
 		       pxm, start, end, node_to_pxm(get_memblk_nodeid(i)),
 		       memblk->start, memblk->end);
-		bad_srat();
+		numa_failed();
 		return;
 	}
 	if (!(ma->flags & ACPI_SRAT_MEM_HOT_PLUGGABLE)) {
@@ -392,7 +399,7 @@ acpi_numa_memory_affinity_init(const struct acpi_srat_mem_affinity *ma)
 
 	if (numa_add_memblk(node, start, ma->length)) {
 		printk(KERN_ERR "SRAT: node-id %u out of range\n", node);
-		bad_srat();
+		numa_failed();
 		return;
 	}
 
@@ -405,7 +412,7 @@ acpi_numa_memory_affinity_init(const struct acpi_srat_mem_affinity *ma)
 
 /* Sanity check to catch more bad SRATs (they are amazingly common).
    Make sure the PXMs cover all memory. */
-static int __init nodes_cover_memory(void)
+static int __init arch_sanitize_nodes_memory(void)
 {
 	int i;
 
@@ -503,7 +510,7 @@ void __init srat_parse_regions(uint64_t addr)
 }
 
 /* Use the information discovered above to actually set up the nodes. */
-int __init acpi_scan_nodes(uint64_t start, uint64_t end)
+int __init numa_scan_nodes(uint64_t start, uint64_t end)
 {
 	int i;
 	nodemask_t all_nodes_parsed;
@@ -517,8 +524,8 @@ int __init acpi_scan_nodes(uint64_t start, uint64_t end)
 	if (get_acpi_numa() == 0)
 		return -1;
 
-	if (!nodes_cover_memory()) {
-		bad_srat();
+	if (!arch_sanitize_nodes_memory()) {
+		numa_failed();
 		return -1;
 	}
 
@@ -529,7 +536,7 @@ int __init acpi_scan_nodes(uint64_t start, uint64_t end)
 		memnode_shift = 0;
 		printk(KERN_ERR
 		     "SRAT: No NUMA node hash function found. Contact maintainer\n");
-		bad_srat();
+		numa_failed();
 		return -1;
 	}
 
diff --git a/xen/arch/x86/x86_64/mm.c b/xen/arch/x86/x86_64/mm.c
index 34f3250..f0082e1 100644
--- a/xen/arch/x86/x86_64/mm.c
+++ b/xen/arch/x86/x86_64/mm.c
@@ -1369,7 +1369,7 @@ int memory_add(unsigned long spfn, unsigned long epfn, unsigned int pxm)
     if ( !mem_hotadd_check(spfn, epfn) )
         return -EINVAL;
 
-    if ( (node = setup_node(pxm)) == NUMA_NO_NODE )
+    if ( (node = acpi_setup_node(pxm)) == NUMA_NO_NODE )
         return -EINVAL;
 
     if ( !valid_numa_range(spfn << PAGE_SHIFT, epfn << PAGE_SHIFT, node) )
diff --git a/xen/include/asm-x86/acpi.h b/xen/include/asm-x86/acpi.h
index 9298d42..445b8e5 100644
--- a/xen/include/asm-x86/acpi.h
+++ b/xen/include/asm-x86/acpi.h
@@ -103,7 +103,7 @@ extern void acpi_reserve_bootmem(void);
 
 #define ARCH_HAS_POWER_INIT	1
 
-extern int acpi_scan_nodes(u64 start, u64 end);
+extern int numa_scan_nodes(u64 start, u64 end);
 #define NR_NODE_MEMBLKS (MAX_NUMNODES*2)
 
 #ifdef CONFIG_ACPI_SLEEP
diff --git a/xen/include/asm-x86/numa.h b/xen/include/asm-x86/numa.h
index ae5768b..7237ad1 100644
--- a/xen/include/asm-x86/numa.h
+++ b/xen/include/asm-x86/numa.h
@@ -32,7 +32,7 @@ extern void numa_add_cpu(int cpu);
 extern void numa_init_array(void);
 extern bool srat_disabled(void);
 extern void numa_set_node(int cpu, nodeid_t node);
-extern nodeid_t setup_node(unsigned int pxm);
+extern nodeid_t acpi_setup_node(unsigned int pxm);
 extern void srat_detect_node(int cpu);
 
 extern void setup_node_bootmem(nodeid_t nodeid, paddr_t start, paddr_t end);
-- 
2.7.4


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


* [RFC PATCH v2 08/25] x86: NUMA: Sanitize node distance
  2017-03-28 15:53 [RFC PATCH v2 00/25] ARM: Add Xen NUMA support vijay.kilari
                   ` (6 preceding siblings ...)
  2017-03-28 15:53 ` [RFC PATCH v2 07/25] x86: NUMA: Rename some generic functions vijay.kilari
@ 2017-03-28 15:53 ` vijay.kilari
  2017-03-28 15:53 ` [RFC PATCH v2 09/25] ARM: NUMA: Add existing ARM numa code under CONFIG_NUMA vijay.kilari
                   ` (16 subsequent siblings)
  24 siblings, 0 replies; 71+ messages in thread
From: vijay.kilari @ 2017-03-28 15:53 UTC (permalink / raw)
  To: xen-devel
  Cc: sstabellini, wei.liu2, George.Dunlap, andrew.cooper3,
	ian.jackson, tim, julien.grall, jbeulich, Vijaya Kumar K

From: Vijaya Kumar K <Vijaya.Kumar@cavium.com>

Introduce acpi_node_distance() and call it from __node_distance().
This makes it possible to implement an arch-specific __node_distance().
Also introduce the LOCAL_DISTANCE and REMOTE_DISTANCE macros.
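
A minimal sketch of the lookup-with-fallback idea, for illustration only.
It assumes an identity node-to-PXM mapping and uses a hypothetical
struct fake_slit in place of the real ACPI SLIT. The names
fake_slit_valid(), table_node_distance() and arch_node_distance() are
illustrative and are not functions added by this patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LOCAL_DISTANCE   10   /* distance of a node to itself */
#define REMOTE_DISTANCE  20   /* default distance between two different nodes */

typedef uint8_t nodeid_t;

/* Hypothetical stand-in for the ACPI SLIT: a row-major N x N matrix. */
struct fake_slit {
    unsigned int locality_count;
    const uint8_t *entry;
};

/* NULL means that firmware provided no distance table. */
static const struct fake_slit *slit;

/* Same rule as slit_valid(): diagonal == LOCAL_DISTANCE, the rest above it. */
static int fake_slit_valid(const struct fake_slit *s)
{
    unsigned int i, j, d = s->locality_count;

    for ( i = 0; i < d; i++ )
        for ( j = 0; j < d; j++ )
        {
            uint8_t val = s->entry[d * i + j];

            if ( i == j ? val != LOCAL_DISTANCE : val <= LOCAL_DISTANCE )
                return 0;
        }

    return 1;
}

/* Table lookup with a fixed fallback when no table is present. */
static uint8_t table_node_distance(nodeid_t a, nodeid_t b)
{
    if ( !slit )
        return a == b ? LOCAL_DISTANCE : REMOTE_DISTANCE;

    /* Assumption: node id == proximity domain, so no node_to_pxm() step. */
    assert(a < slit->locality_count && b < slit->locality_count);
    return slit->entry[slit->locality_count * a + b];
}

/* Arch-level entry point; another arch could plug in a different backend. */
static uint8_t arch_node_distance(nodeid_t a, nodeid_t b)
{
    return table_node_distance(a, b);
}
```

With this split, an architecture that has no SLIT could keep the same
fallback values while sourcing distances from elsewhere.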

Signed-off-by: Vijaya Kumar K <Vijaya.Kumar@cavium.com>
---
 xen/arch/x86/srat.c    | 13 +++++++++----
 xen/include/xen/numa.h |  2 ++
 2 files changed, 11 insertions(+), 4 deletions(-)

diff --git a/xen/arch/x86/srat.c b/xen/arch/x86/srat.c
index 3ade36d..7cf4771 100644
--- a/xen/arch/x86/srat.c
+++ b/xen/arch/x86/srat.c
@@ -221,9 +221,9 @@ static int __init slit_valid(struct acpi_table_slit *slit)
 		for (j = 0; j < d; j++)  {
 			uint8_t val = slit->entry[d*i + j];
 			if (i == j) {
-				if (val != 10)
+				if (val != LOCAL_DISTANCE)
 					return 0;
-			} else if (val <= 10)
+			} else if (val <= LOCAL_DISTANCE)
 				return 0;
 		}
 	}
@@ -576,13 +576,13 @@ static unsigned node_to_pxm(nodeid_t n)
 	return 0;
 }
 
-uint8_t __node_distance(nodeid_t a, nodeid_t b)
+static uint8_t acpi_node_distance(nodeid_t a, nodeid_t b)
 {
 	unsigned index;
 	uint8_t slit_val;
 
 	if (!acpi_slit)
-		return a == b ? 10 : 20;
+		return a == b ? LOCAL_DISTANCE : REMOTE_DISTANCE;
 	index = acpi_slit->locality_count * node_to_pxm(a);
 	slit_val = acpi_slit->entry[index + node_to_pxm(b)];
 
@@ -593,4 +593,9 @@ uint8_t __node_distance(nodeid_t a, nodeid_t b)
 		return slit_val;
 }
 
+uint8_t __node_distance(nodeid_t a, nodeid_t b)
+{
+	return acpi_node_distance(a, b);
+}
+
 EXPORT_SYMBOL(__node_distance);
diff --git a/xen/include/xen/numa.h b/xen/include/xen/numa.h
index 7f6d090..922fbd8 100644
--- a/xen/include/xen/numa.h
+++ b/xen/include/xen/numa.h
@@ -8,6 +8,8 @@
 #endif
 
 #define NUMA_NO_NODE     0xFF
+#define LOCAL_DISTANCE   10
+#define REMOTE_DISTANCE  20
 #define NUMA_NO_DISTANCE 0xFF
 
 #define MAX_NUMNODES    (1 << NODES_SHIFT)
-- 
2.7.4



* [RFC PATCH v2 09/25] ARM: NUMA: Add existing ARM numa code under CONFIG_NUMA
  2017-03-28 15:53 [RFC PATCH v2 00/25] ARM: Add Xen NUMA support vijay.kilari
                   ` (7 preceding siblings ...)
  2017-03-28 15:53 ` [RFC PATCH v2 08/25] x86: NUMA: Sanitize node distance vijay.kilari
@ 2017-03-28 15:53 ` vijay.kilari
  2017-05-08 15:58   ` Julien Grall
  2017-03-28 15:53 ` [RFC PATCH v2 10/25] x86: NUMA: Move numa code and make it generic vijay.kilari
                   ` (15 subsequent siblings)
  24 siblings, 1 reply; 71+ messages in thread
From: vijay.kilari @ 2017-03-28 15:53 UTC (permalink / raw)
  To: xen-devel
  Cc: sstabellini, wei.liu2, George.Dunlap, andrew.cooper3,
	ian.jackson, tim, julien.grall, jbeulich, Vijaya Kumar K

From: Vijaya Kumar K <Vijaya.Kumar@cavium.com>

Right now CONFIG_NUMA is not enabled for ARM and the
existing code in asm-arm/numa.h is for !CONFIG_NUMA.
Hence put this code under #ifndef CONFIG_NUMA.

This helps keep these changes working when CONFIG_NUMA
is not enabled.

Also define the NODES_SHIFT macro for ARM with the value 2.
This limits the number of NUMA nodes supported to 4.
There is no hard restriction behind the choice of 2.
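
The arithmetic behind the node limit can be checked at compile time.
This is a small sketch, not part of the patch: node_id_in_range() is a
hypothetical helper, and the second assertion is an observation about
the uint8_t nodeid_t and the 0xFF sentinel rather than a rule stated
by this series.

```c
#include <assert.h>
#include <stdint.h>

typedef uint8_t nodeid_t;

#define NUMA_NO_NODE 0xFF                 /* sentinel: "no node assigned" */
#define NODES_SHIFT  2                    /* ARM value chosen by this patch */
#define MAX_NUMNODES (1 << NODES_SHIFT)   /* as in xen/include/xen/numa.h */

/* NODES_SHIFT == 2 yields exactly four node ids: 0, 1, 2 and 3. */
static_assert(MAX_NUMNODES == 4, "NODES_SHIFT of 2 must give 4 nodes");

/* Every valid node id must stay distinct from the NUMA_NO_NODE sentinel. */
static_assert(MAX_NUMNODES <= NUMA_NO_NODE, "node ids collide with sentinel");

/* Hypothetical helper: can this id index a MAX_NUMNODES-sized array? */
static inline int node_id_in_range(unsigned int nid)
{
    return nid < MAX_NUMNODES;
}
```

Raising NODES_SHIFT grows every MAX_NUMNODES-sized array, so the value
is a trade-off between footprint and the largest supported platform.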

Signed-off-by: Vijaya Kumar K <Vijaya.Kumar@cavium.com>
---
 xen/include/asm-arm/numa.h | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/xen/include/asm-arm/numa.h b/xen/include/asm-arm/numa.h
index 53f99af..924bfc0 100644
--- a/xen/include/asm-arm/numa.h
+++ b/xen/include/asm-arm/numa.h
@@ -3,6 +3,10 @@
 
 typedef uint8_t nodeid_t;
 
+/* Limit number of NUMA nodes supported to 4 */
+#define NODES_SHIFT 2
+
+#ifndef CONFIG_NUMA
 /* Fake one node for now. See also node_online_map. */
 #define cpu_to_node(cpu) 0
 #define node_to_cpumask(node)   (cpu_online_map)
@@ -16,6 +20,7 @@ static inline __attribute__((pure)) nodeid_t phys_to_nid(paddr_t addr)
 #define node_spanned_pages(nid) (total_pages)
 #define node_start_pfn(nid) (pdx_to_pfn(frametable_base_pdx))
 #define __node_distance(a, b) (20)
+#endif /* CONFIG_NUMA */
 
 static inline unsigned int arch_get_dma_bitsize(void)
 {
-- 
2.7.4



* [RFC PATCH v2 10/25] x86: NUMA: Move numa code and make it generic
  2017-03-28 15:53 [RFC PATCH v2 00/25] ARM: Add Xen NUMA support vijay.kilari
                   ` (8 preceding siblings ...)
  2017-03-28 15:53 ` [RFC PATCH v2 09/25] ARM: NUMA: Add existing ARM numa code under CONFIG_NUMA vijay.kilari
@ 2017-03-28 15:53 ` vijay.kilari
  2017-05-08 16:41   ` Julien Grall
  2017-05-08 16:51   ` Julien Grall
  2017-03-28 15:53 ` [RFC PATCH v2 11/25] x86: NUMA: Move common code from srat.c vijay.kilari
                   ` (14 subsequent siblings)
  24 siblings, 2 replies; 71+ messages in thread
From: vijay.kilari @ 2017-03-28 15:53 UTC (permalink / raw)
  To: xen-devel
  Cc: sstabellini, wei.liu2, George.Dunlap, andrew.cooper3,
	ian.jackson, tim, julien.grall, jbeulich, Vijaya Kumar K

From: Vijaya Kumar K <Vijaya.Kumar@cavium.com>

Move code from xen/arch/x86/numa.c to xen/common/numa.c
so that it can be used by other architectures.
A few generic static functions in x86/numa.c are made
non-static in common/numa.c.

The generic contents of the header file asm-x86/numa.h
are moved to xen/numa.h.
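
For readers following the move, the core of the relocated code is the
memnodemap hash: a shift is derived from the lowest set bit of the node
start addresses, and each (address >> shift) slot records the owning
node. Below is a simplified sketch under stated assumptions: plain page
indices stand in for Xen's pdx values, the map is fixed at 64 slots with
no boot-time allocation, and struct range, lsb_shift(), build_map() and
page_to_node() are hypothetical names, not the functions being moved.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define NUMA_NO_NODE 0xFF
#define MAP_SLOTS    64                 /* mirrors the static _memnodemap[64] */
#define LONG_BITS    (8 * sizeof(unsigned long))

struct range { unsigned long start, end; };   /* [start, end) page indices */

static uint8_t memnodemap[MAP_SLOTS];
static unsigned int memnode_shift;

/* Largest shift that keeps every node start on a slot boundary. */
static unsigned int lsb_shift(const struct range *r, int n)
{
    unsigned long bits = 0;
    unsigned int used = 0, shift = 0;

    for ( int i = 0; i < n; i++ )
        if ( r[i].start < r[i].end )
        {
            bits |= r[i].start;
            used++;
        }

    if ( used <= 1 || !bits )
        return LONG_BITS - 1;           /* one node: a single slot suffices */

    while ( !(bits & 1UL) )
    {
        bits >>= 1;
        shift++;
    }
    return shift;
}

/* Returns 0 on success, -1 if the map is too small or nodes overlap. */
static int build_map(const struct range *r, int n)
{
    unsigned long top = 0;

    assert(n >= 0 && n < NUMA_NO_NODE); /* node index must fit below sentinel */
    memnode_shift = lsb_shift(r, n);

    for ( int i = 0; i < n; i++ )
        if ( r[i].start < r[i].end && r[i].end > top )
            top = r[i].end;
    if ( (top >> memnode_shift) + 1 > MAP_SLOTS )
        return -1;

    memset(memnodemap, NUMA_NO_NODE, sizeof(memnodemap));
    for ( int i = 0; i < n; i++ )
        for ( unsigned long p = r[i].start; p < r[i].end;
              p += 1UL << memnode_shift )
        {
            if ( memnodemap[p >> memnode_shift] != NUMA_NO_NODE )
                return -1;              /* slot already owned: overlap */
            memnodemap[p >> memnode_shift] = (uint8_t)i;
        }

    return 0;
}

/* O(1) lookup: which node owns this page index? */
static uint8_t page_to_node(unsigned long page)
{
    unsigned long slot = page >> memnode_shift;

    return slot < MAP_SLOTS ? memnodemap[slot] : NUMA_NO_NODE;
}
```

With nodes at [0x000, 0x100) and [0x100, 0x200) the derived shift is 8,
so a lookup costs one shift and one array read.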

Signed-off-by: Vijaya Kumar K <Vijaya.Kumar@cavium.com>
---
 xen/arch/x86/numa.c        | 456 ------------------------------------------
 xen/arch/x86/srat.c        |   7 +
 xen/common/Makefile        |   1 +
 xen/common/numa.c          | 488 +++++++++++++++++++++++++++++++++++++++++++++
 xen/include/asm-x86/numa.h |  15 --
 xen/include/xen/numa.h     |  18 ++
 6 files changed, 514 insertions(+), 471 deletions(-)

diff --git a/xen/arch/x86/numa.c b/xen/arch/x86/numa.c
index 3bdab9a..33c6806 100644
--- a/xen/arch/x86/numa.c
+++ b/xen/arch/x86/numa.c
@@ -10,286 +10,13 @@
 #include <xen/ctype.h>
 #include <xen/nodemask.h>
 #include <xen/numa.h>
-#include <xen/keyhandler.h>
 #include <xen/time.h>
 #include <xen/smp.h>
 #include <xen/pfn.h>
 #include <asm/acpi.h>
-#include <xen/sched.h>
-#include <xen/softirq.h>
-
-static int numa_setup(char *s);
-custom_param("numa", numa_setup);
-
-struct node_data node_data[MAX_NUMNODES];
-
-/* Mapping from pdx to node id */
-unsigned int memnode_shift;
-static typeof(*memnodemap) _memnodemap[64];
-unsigned long memnodemapsize;
-uint8_t *memnodemap;
-
-nodeid_t __read_mostly cpu_to_node[NR_CPUS] = {
-    [0 ... NR_CPUS-1] = NUMA_NO_NODE
-};
-/*
- * Keep BIOS's CPU2node information, should not be used for memory allocaion
- */
-nodeid_t apicid_to_node[MAX_LOCAL_APIC] = {
-    [0 ... MAX_LOCAL_APIC-1] = NUMA_NO_NODE
-};
-cpumask_t __read_mostly node_to_cpumask[MAX_NUMNODES];
 
 nodemask_t __read_mostly node_online_map = { { [0] = 1UL } };
 
-static bool numa_off = 0;
-static bool acpi_numa = 1;
-
-bool is_numa_off(void)
-{
-    return numa_off;
-}
-
-bool get_acpi_numa(void)
-{
-    return acpi_numa;
-}
-
-void set_acpi_numa(bool_t val)
-{
-    acpi_numa = val;
-}
-
-bool srat_disabled(void)
-{
-    return numa_off || acpi_numa == 0;
-}
-
-/*
- * Given a shift value, try to populate memnodemap[]
- * Returns :
- * 0 if OK
- * -EINVAL if memnodmap[] too small (of shift too small)
- * OR if node overlap or lost ram (shift too big)
- */
-static int __init populate_memnodemap(const struct node *nodes, int numnodes,
-                                      unsigned int shift, nodeid_t *nodeids)
-{
-    unsigned long spdx, epdx;
-    int i, res = -EINVAL;
-
-    memset(memnodemap, NUMA_NO_NODE, memnodemapsize * sizeof(*memnodemap));
-    for ( i = 0; i < numnodes; i++ )
-    {
-        spdx = paddr_to_pdx(nodes[i].start);
-        epdx = paddr_to_pdx(nodes[i].end - 1) + 1;
-        if ( spdx >= epdx )
-            continue;
-        if ( (epdx >> shift) >= memnodemapsize )
-            return 0;
-        do {
-            if ( memnodemap[spdx >> shift] != NUMA_NO_NODE )
-                return -EINVAL;
-
-            if ( !nodeids )
-                memnodemap[spdx >> shift] = i;
-            else
-                memnodemap[spdx >> shift] = nodeids[i];
-
-            spdx += (1UL << shift);
-        } while ( spdx < epdx );
-        res = 0;
-    }
-
-    return res;
-}
-
-static int __init allocate_cachealigned_memnodemap(void)
-{
-    unsigned long size = PFN_UP(memnodemapsize * sizeof(*memnodemap));
-    unsigned long mfn = alloc_boot_pages(size, 1);
-
-    if ( !mfn )
-    {
-        printk(KERN_ERR
-               "NUMA: Unable to allocate Memory to Node hash map\n");
-        memnodemapsize = 0;
-        return -ENOMEM;
-    }
-
-    memnodemap = mfn_to_virt(mfn);
-    mfn <<= PAGE_SHIFT;
-    size <<= PAGE_SHIFT;
-    printk(KERN_DEBUG "NUMA: Allocated memnodemap from %lx - %lx\n",
-           mfn, mfn + size);
-    memnodemapsize = size / sizeof(*memnodemap);
-
-    return 0;
-}
-
-/*
- * The LSB of all start and end addresses in the node map is the value of the
- * maximum possible shift.
- */
-static unsigned int __init extract_lsb_from_nodes(const struct node *nodes,
-                                                  int numnodes)
-{
-    unsigned int i, nodes_used = 0;
-    unsigned long spdx, epdx;
-    unsigned long bitfield = 0, memtop = 0;
-
-    for ( i = 0; i < numnodes; i++ )
-    {
-        spdx = paddr_to_pdx(nodes[i].start);
-        epdx = paddr_to_pdx(nodes[i].end - 1) + 1;
-        if ( spdx >= epdx )
-            continue;
-        bitfield |= spdx;
-        nodes_used++;
-        if ( epdx > memtop )
-            memtop = epdx;
-    }
-    if ( nodes_used <= 1 )
-        i = BITS_PER_LONG - 1;
-    else
-        i = find_first_bit(&bitfield, sizeof(unsigned long) * 8);
-    memnodemapsize = (memtop >> i) + 1;
-
-    return i;
-}
-
-int __init compute_memnode_shift(struct node *nodes, int numnodes,
-                                 nodeid_t *nodeids, unsigned int *shift)
-{
-    *shift = extract_lsb_from_nodes(nodes, numnodes);
-
-    if ( memnodemapsize <= ARRAY_SIZE(_memnodemap) )
-        memnodemap = _memnodemap;
-    else if ( allocate_cachealigned_memnodemap() )
-        return -ENOMEM;
-
-    printk(KERN_DEBUG "NUMA: Using %u for the hash shift.\n", *shift);
-
-    if ( populate_memnodemap(nodes, numnodes, *shift, nodeids) )
-    {
-        printk(KERN_INFO "Your memory is not aligned you need to "
-               "rebuild your hypervisor with a bigger NODEMAPSIZE "
-               "shift=%u\n", *shift);
-        return -EINVAL;
-    }
-
-    return 0;
-}
-/* initialize NODE_DATA given nodeid and start/end */
-void __init setup_node_bootmem(nodeid_t nodeid, paddr_t start, paddr_t end)
-{
-    unsigned long start_pfn, end_pfn;
-
-    start_pfn = start >> PAGE_SHIFT;
-    end_pfn = end >> PAGE_SHIFT;
-
-    NODE_DATA(nodeid)->node_start_pfn = start_pfn;
-    NODE_DATA(nodeid)->node_spanned_pages = end_pfn - start_pfn;
-
-    node_set_online(nodeid);
-}
-
-void __init numa_init_array(void)
-{
-    int rr, i;
-
-    /* There are unfortunately some poorly designed mainboards around
-       that only connect memory to a single CPU. This breaks the 1:1 cpu->node
-       mapping. To avoid this fill in the mapping for all possible
-       CPUs, as the number of CPUs is not known yet.
-       We round robin the existing nodes. */
-    rr = first_node(node_online_map);
-    for ( i = 0; i < nr_cpu_ids; i++ )
-    {
-        if ( cpu_to_node[i] != NUMA_NO_NODE )
-            continue;
-        numa_set_node(i, rr);
-        rr = next_node(rr, node_online_map);
-        if ( rr == MAX_NUMNODES )
-            rr = first_node(node_online_map);
-    }
-}
-
-#ifdef CONFIG_NUMA_EMU
-static int __initdata numa_fake = 0;
-static int get_numa_fake(void)
-{
-    return numa_fake;
-}
-
-/* Numa emulation */
-static int __init numa_emulation(uint64_t start_pfn, uint64_t end_pfn)
-{
-    int i;
-    struct node nodes[MAX_NUMNODES];
-    uint64_t sz = ((end_pfn - start_pfn) << PAGE_SHIFT) / get_numa_fake();
-
-    /* Kludge needed for the hash function */
-    if ( hweight64(sz) > 1 )
-    {
-        uint64_t x = 1;
-        while ( (x << 1) < sz )
-            x <<= 1;
-        if ( x < sz / 2 )
-            printk(KERN_ERR
-                   "Numa emulation unbalanced. Complain to maintainer\n");
-        sz = x;
-    }
-
-    memset(&nodes,0,sizeof(nodes));
-    for ( i = 0; i < get_numa_fake(); i++ )
-    {
-        nodes[i].start = (start_pfn << PAGE_SHIFT) + i * sz;
-        if ( i == get_numa_fake() - 1 )
-            sz = (end_pfn << PAGE_SHIFT) - nodes[i].start;
-        nodes[i].end = nodes[i].start + sz;
-        printk(KERN_INFO
-               "Faking node %d at %"PRIx64"-%"PRIx64" (%"PRIu64"MB)\n",
-               i, nodes[i].start, nodes[i].end,
-               (nodes[i].end - nodes[i].start) >> 20);
-        node_set_online(i);
-    }
-    if ( compute_memnode_shift(nodes, get_numa_fake(), NULL, &memnode_shift) )
-    {
-        memnode_shift = 0;
-        printk(KERN_ERR "No NUMA hash function found. Emulation disabled.\n");
-        return -1;
-    }
-    for_each_online_node ( i )
-        setup_node_bootmem(i, nodes[i].start, nodes[i].end);
-    numa_init_array();
-
-    return 0;
-}
-#endif
-
-static void __init numa_dummy_init(unsigned long start_pfn, unsigned long end_pfn)
-{
-    int i;
-
-    printk(KERN_INFO "%s\n",
-           is_numa_off() ? "NUMA turned off" : "No NUMA configuration found");
-
-    printk(KERN_INFO "Faking a node at %016"PRIx64"-%016"PRIx64"\n",
-           (uint64_t)start_pfn << PAGE_SHIFT,
-           (uint64_t)end_pfn << PAGE_SHIFT);
-    /* setup dummy node covering all memory */
-    memnode_shift = BITS_PER_LONG - 1;
-    memnodemap = _memnodemap;
-    nodes_clear(node_online_map);
-    node_set_online(0);
-    for ( i = 0; i < nr_cpu_ids; i++ )
-        numa_set_node(i, 0);
-    cpumask_copy(&node_to_cpumask[0], cpumask_of(0));
-    setup_node_bootmem(0, (paddr_t)start_pfn << PAGE_SHIFT,
-                    (paddr_t)end_pfn << PAGE_SHIFT);
-}
-
 void __init numa_initmem_init(unsigned long start_pfn, unsigned long end_pfn)
 {
 #ifdef CONFIG_NUMA_EMU
@@ -306,43 +33,6 @@ void __init numa_initmem_init(unsigned long start_pfn, unsigned long end_pfn)
     numa_dummy_init(start_pfn, end_pfn);
 }
 
-void numa_add_cpu(int cpu)
-{
-    cpumask_set_cpu(cpu, &node_to_cpumask[cpu_to_node(cpu)]);
-}
-
-void numa_set_node(int cpu, nodeid_t node)
-{
-    cpu_to_node[cpu] = node;
-}
-
-/* [numa=off] */
-static int __init numa_setup(char *opt)
-{
-    if ( !strncmp(opt,"off",3) )
-        numa_off = 1;
-    if ( !strncmp(opt,"on",2) )
-        numa_off = 0;
-#ifdef CONFIG_NUMA_EMU
-    if ( !strncmp(opt, "fake=", 5) )
-    {
-        numa_off = 0;
-        numa_fake = simple_strtoul(opt+5,NULL,0);
-        if ( numa_fake >= MAX_NUMNODES )
-            numa_fake = MAX_NUMNODES;
-    }
-#endif
-#ifdef CONFIG_ACPI_NUMA
-    if ( !strncmp(opt,"noacpi",6) )
-    {
-        numa_off = 0;
-        acpi_numa = 0;
-    }
-#endif
-
-    return 1;
-}
-
 /*
  * Setup early cpu_to_node.
  *
@@ -391,149 +81,3 @@ unsigned int __init arch_get_dma_bitsize(void)
                  flsl(node_start_pfn(node) + node_spanned_pages(node) / 4 - 1)
                  + PAGE_SHIFT, 32);
 }
-
-static void dump_numa(unsigned char key)
-{
-    s_time_t now = NOW();
-    unsigned int i, j, n;
-    int err;
-    struct domain *d;
-    struct page_info *page;
-    unsigned int page_num_node[MAX_NUMNODES];
-    const struct vnuma_info *vnuma;
-
-    printk("'%c' pressed -> dumping numa info (now-0x%X:%08X)\n", key,
-           (uint32_t)(now >> 32), (uint32_t)now);
-
-    for_each_online_node ( i )
-    {
-        paddr_t pa = pfn_to_paddr(node_start_pfn(i) + 1);
-
-        printk("NODE%u start->%lu size->%lu free->%lu\n",
-               i, node_start_pfn(i), node_spanned_pages(i),
-               avail_node_heap_pages(i));
-        /* sanity check phys_to_nid() */
-        if ( phys_to_nid(pa) != i )
-            printk("phys_to_nid(%"PRIpaddr") -> %d should be %u\n",
-                   pa, phys_to_nid(pa), i);
-    }
-
-    j = cpumask_first(&cpu_online_map);
-    n = 0;
-    for_each_online_cpu ( i )
-    {
-        if ( i != j + n || cpu_to_node[j] != cpu_to_node[i] )
-        {
-            if ( n > 1 )
-                printk("CPU%u...%u -> NODE%d\n", j, j + n - 1, cpu_to_node[j]);
-            else
-                printk("CPU%u -> NODE%d\n", j, cpu_to_node[j]);
-            j = i;
-            n = 1;
-        }
-        else
-            ++n;
-    }
-    if ( n > 1 )
-        printk("CPU%u...%u -> NODE%d\n", j, j + n - 1, cpu_to_node[j]);
-    else
-        printk("CPU%u -> NODE%d\n", j, cpu_to_node[j]);
-
-    rcu_read_lock(&domlist_read_lock);
-
-    printk("Memory location of each domain:\n");
-    for_each_domain ( d )
-    {
-        process_pending_softirqs();
-
-        printk("Domain %u (total: %u):\n", d->domain_id, d->tot_pages);
-
-        for_each_online_node ( i )
-            page_num_node[i] = 0;
-
-        spin_lock(&d->page_alloc_lock);
-        page_list_for_each(page, &d->page_list)
-        {
-            i = phys_to_nid((paddr_t)page_to_mfn(page) << PAGE_SHIFT);
-            page_num_node[i]++;
-        }
-        spin_unlock(&d->page_alloc_lock);
-
-        for_each_online_node ( i )
-            printk("    Node %u: %u\n", i, page_num_node[i]);
-
-        if ( !read_trylock(&d->vnuma_rwlock) )
-            continue;
-
-        if ( !d->vnuma )
-        {
-            read_unlock(&d->vnuma_rwlock);
-            continue;
-        }
-
-        vnuma = d->vnuma;
-        printk("     %u vnodes, %u vcpus, guest physical layout:\n",
-               vnuma->nr_vnodes, d->max_vcpus);
-        for ( i = 0; i < vnuma->nr_vnodes; i++ )
-        {
-            unsigned int start_cpu = ~0U;
-
-            err = snprintf(keyhandler_scratch, 12, "%3u",
-                    vnuma->vnode_to_pnode[i]);
-            if ( err < 0 || vnuma->vnode_to_pnode[i] == NUMA_NO_NODE )
-                strlcpy(keyhandler_scratch, "???", sizeof(keyhandler_scratch));
-
-            printk("       %3u: pnode %s,", i, keyhandler_scratch);
-
-            printk(" vcpus ");
-
-            for ( j = 0; j < d->max_vcpus; j++ )
-            {
-                if ( !(j & 0x3f) )
-                    process_pending_softirqs();
-
-                if ( vnuma->vcpu_to_vnode[j] == i )
-                {
-                    if ( start_cpu == ~0U )
-                    {
-                        printk("%d", j);
-                        start_cpu = j;
-                    }
-                }
-                else if ( start_cpu != ~0U )
-                {
-                    if ( j - 1 != start_cpu )
-                        printk("-%d ", j - 1);
-                    else
-                        printk(" ");
-                    start_cpu = ~0U;
-                }
-            }
-
-            if ( start_cpu != ~0U  && start_cpu != j - 1 )
-                printk("-%d", j - 1);
-
-            printk("\n");
-
-            for ( j = 0; j < vnuma->nr_vmemranges; j++ )
-            {
-                if ( vnuma->vmemrange[j].nid == i )
-                    printk("           %016"PRIx64" - %016"PRIx64"\n",
-                           vnuma->vmemrange[j].start,
-                           vnuma->vmemrange[j].end);
-            }
-        }
-
-        read_unlock(&d->vnuma_rwlock);
-    }
-
-    rcu_read_unlock(&domlist_read_lock);
-}
-
-static int __init register_numa_trigger(void)
-{
-    register_keyhandler('u', dump_numa, "dump NUMA info", 1);
-    return 0;
-}
-__initcall(register_numa_trigger);
-
diff --git a/xen/arch/x86/srat.c b/xen/arch/x86/srat.c
index 7cf4771..2cc87a3 100644
--- a/xen/arch/x86/srat.c
+++ b/xen/arch/x86/srat.c
@@ -27,6 +27,13 @@ static nodemask_t __initdata memory_nodes_parsed;
 static nodemask_t __initdata processor_nodes_parsed;
 static struct node __initdata nodes[MAX_NUMNODES];
 
+/*
+ * Keep BIOS's CPU2node information, should not be used for memory allocation
+ */
+nodeid_t apicid_to_node[MAX_LOCAL_APIC] = {
+    [0 ... MAX_LOCAL_APIC-1] = NUMA_NO_NODE
+};
+
 struct pxm2node {
 	unsigned int pxm;
 	nodeid_t node;
diff --git a/xen/common/Makefile b/xen/common/Makefile
index 0fed30b..4b17344 100644
--- a/xen/common/Makefile
+++ b/xen/common/Makefile
@@ -63,6 +63,7 @@ obj-y += wait.o
 obj-bin-y += warning.init.o
 obj-$(CONFIG_XENOPROF) += xenoprof.o
 obj-y += xmalloc_tlsf.o
+obj-$(CONFIG_NUMA) += numa.o
 
 obj-bin-$(CONFIG_X86) += $(foreach n,decompress bunzip2 unxz unlzma unlzo unlz4 earlycpio,$(n).init.o)
 
diff --git a/xen/common/numa.c b/xen/common/numa.c
new file mode 100644
index 0000000..207ebd8
--- /dev/null
+++ b/xen/common/numa.c
@@ -0,0 +1,488 @@
+/*
+ * Common NUMA handling functions for x86 and arm.
+ * Original code extracted from arch/x86/numa.c
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms and conditions of the GNU General Public
+ * License, version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <xen/mm.h>
+#include <xen/string.h>
+#include <xen/init.h>
+#include <xen/ctype.h>
+#include <xen/nodemask.h>
+#include <xen/numa.h>
+#include <xen/keyhandler.h>
+#include <xen/time.h>
+#include <xen/smp.h>
+#include <xen/pfn.h>
+#include <asm/acpi.h>
+#include <xen/sched.h>
+#include <xen/softirq.h>
+
+static int numa_setup(char *s);
+custom_param("numa", numa_setup);
+
+struct node_data node_data[MAX_NUMNODES];
+
+/* Mapping from pdx to node id */
+unsigned int memnode_shift;
+static typeof(*memnodemap) _memnodemap[64];
+unsigned long memnodemapsize;
+uint8_t *memnodemap;
+
+nodeid_t __read_mostly cpu_to_node[NR_CPUS] = {
+    [0 ... NR_CPUS-1] = NUMA_NO_NODE
+};
+cpumask_t __read_mostly node_to_cpumask[MAX_NUMNODES];
+
+static bool numa_off = 0;
+static bool acpi_numa = 1;
+
+bool is_numa_off(void)
+{
+    return numa_off;
+}
+
+bool get_acpi_numa(void)
+{
+    return acpi_numa;
+}
+
+void set_acpi_numa(bool_t val)
+{
+    acpi_numa = val;
+}
+
+bool srat_disabled(void)
+{
+    return numa_off || acpi_numa == 0;
+}
+
+/*
+ * Given a shift value, try to populate memnodemap[]
+ * Returns :
+ * 0 if OK
+ * -EINVAL if memnodemap[] too small (or shift too small)
+ * OR if node overlap or lost ram (shift too big)
+ */
+static int __init populate_memnodemap(const struct node *nodes, int numnodes,
+                                      unsigned int shift, nodeid_t *nodeids)
+{
+    unsigned long spdx, epdx;
+    int i, res = -EINVAL;
+
+    memset(memnodemap, NUMA_NO_NODE, memnodemapsize * sizeof(*memnodemap));
+    for ( i = 0; i < numnodes; i++ )
+    {
+        spdx = paddr_to_pdx(nodes[i].start);
+        epdx = paddr_to_pdx(nodes[i].end - 1) + 1;
+        if ( spdx >= epdx )
+            continue;
+        if ( (epdx >> shift) >= memnodemapsize )
+            return 0;
+        do {
+            if ( memnodemap[spdx >> shift] != NUMA_NO_NODE )
+                return -EINVAL;
+
+            if ( !nodeids )
+                memnodemap[spdx >> shift] = i;
+            else
+                memnodemap[spdx >> shift] = nodeids[i];
+
+            spdx += (1UL << shift);
+        } while ( spdx < epdx );
+        res = 0;
+    }
+
+    return res;
+}
+
+static int __init allocate_cachealigned_memnodemap(void)
+{
+    unsigned long size = PFN_UP(memnodemapsize * sizeof(*memnodemap));
+    unsigned long mfn = alloc_boot_pages(size, 1);
+
+    if ( !mfn )
+    {
+        printk(KERN_ERR
+               "NUMA: Unable to allocate Memory to Node hash map\n");
+        memnodemapsize = 0;
+        return -ENOMEM;
+    }
+
+    memnodemap = mfn_to_virt(mfn);
+    mfn <<= PAGE_SHIFT;
+    size <<= PAGE_SHIFT;
+    printk(KERN_DEBUG "NUMA: Allocated memnodemap from %lx - %lx\n",
+           mfn, mfn + size);
+    memnodemapsize = size / sizeof(*memnodemap);
+
+    return 0;
+}
+
+/*
+ * The LSB of all start and end addresses in the node map is the value of the
+ * maximum possible shift.
+ */
+static unsigned int __init extract_lsb_from_nodes(const struct node *nodes,
+                                                  int numnodes)
+{
+    unsigned int i, nodes_used = 0;
+    unsigned long spdx, epdx;
+    unsigned long bitfield = 0, memtop = 0;
+
+    for ( i = 0; i < numnodes; i++ )
+    {
+        spdx = paddr_to_pdx(nodes[i].start);
+        epdx = paddr_to_pdx(nodes[i].end - 1) + 1;
+        if ( spdx >= epdx )
+            continue;
+        bitfield |= spdx;
+        nodes_used++;
+        if ( epdx > memtop )
+            memtop = epdx;
+    }
+    if ( nodes_used <= 1 )
+        i = BITS_PER_LONG - 1;
+    else
+        i = find_first_bit(&bitfield, sizeof(unsigned long) * 8);
+
+    memnodemapsize = (memtop >> i) + 1;
+
+    return i;
+}
+
+int __init compute_memnode_shift(struct node *nodes, int numnodes,
+                                 nodeid_t *nodeids, unsigned int *shift)
+{
+    *shift = extract_lsb_from_nodes(nodes, numnodes);
+
+    if ( memnodemapsize <= ARRAY_SIZE(_memnodemap) )
+        memnodemap = _memnodemap;
+    else if ( allocate_cachealigned_memnodemap() )
+        return -ENOMEM;
+
+    printk(KERN_DEBUG "NUMA: Using %u for the hash shift.\n", *shift);
+
+    if ( populate_memnodemap(nodes, numnodes, *shift, nodeids) )
+    {
+        printk(KERN_INFO "Your memory is not aligned you need to "
+               "rebuild your hypervisor with a bigger NODEMAPSIZE "
+               "shift=%u\n", *shift);
+        return -EINVAL;
+    }
+
+    return 0;
+}
+/* initialize NODE_DATA given nodeid and start/end */
+void __init setup_node_bootmem(nodeid_t nodeid, paddr_t start, paddr_t end)
+{
+    unsigned long start_pfn, end_pfn;
+
+    start_pfn = start >> PAGE_SHIFT;
+    end_pfn = end >> PAGE_SHIFT;
+
+    NODE_DATA(nodeid)->node_start_pfn = start_pfn;
+    NODE_DATA(nodeid)->node_spanned_pages = end_pfn - start_pfn;
+
+    node_set_online(nodeid);
+}
+
+void __init numa_init_array(void)
+{
+    int rr, i;
+
+    /* There are unfortunately some poorly designed mainboards around
+       that only connect memory to a single CPU. This breaks the 1:1 cpu->node
+       mapping. To avoid this fill in the mapping for all possible
+       CPUs, as the number of CPUs is not known yet.
+       We round robin the existing nodes. */
+    rr = first_node(node_online_map);
+    for ( i = 0; i < nr_cpu_ids; i++ )
+    {
+        if ( cpu_to_node[i] != NUMA_NO_NODE )
+            continue;
+        numa_set_node(i, rr);
+        rr = next_node(rr, node_online_map);
+        if ( rr == MAX_NUMNODES )
+            rr = first_node(node_online_map);
+    }
+}
+
+#ifdef CONFIG_NUMA_EMU
+static int __initdata numa_fake = 0;
+
+int get_numa_fake(void)
+{
+    return numa_fake;
+}
+
+/* Numa emulation */
+int __init numa_emulation(uint64_t start_pfn, uint64_t end_pfn)
+{
+    int i;
+    struct node nodes[MAX_NUMNODES];
+    uint64_t sz = ((end_pfn - start_pfn) << PAGE_SHIFT) / get_numa_fake();
+
+    /* Kludge needed for the hash function */
+    if ( hweight64(sz) > 1 )
+    {
+        uint64_t x = 1;
+        while ( (x << 1) < sz )
+            x <<= 1;
+        if ( x < sz / 2 )
+            printk(KERN_ERR
+                   "Numa emulation unbalanced. Complain to maintainer\n");
+        sz = x;
+    }
+
+    memset(&nodes,0,sizeof(nodes));
+    for ( i = 0; i < get_numa_fake(); i++ )
+    {
+        nodes[i].start = (start_pfn << PAGE_SHIFT) + i * sz;
+        if ( i == get_numa_fake() - 1 )
+            sz = (end_pfn << PAGE_SHIFT) - nodes[i].start;
+        nodes[i].end = nodes[i].start + sz;
+        printk(KERN_INFO
+               "Faking node %d at %"PRIx64"-%"PRIx64" (%"PRIu64"MB)\n",
+               i, nodes[i].start, nodes[i].end,
+               (nodes[i].end - nodes[i].start) >> 20);
+        node_set_online(i);
+    }
+
+    if ( compute_memnode_shift(nodes, get_numa_fake(), NULL, &memnode_shift) )
+    {
+        memnode_shift = 0;
+        printk(KERN_ERR "No NUMA hash function found. Emulation disabled.\n");
+        return -1;
+    }
+    for_each_online_node ( i )
+        setup_node_bootmem(i, nodes[i].start, nodes[i].end);
+    numa_init_array();
+
+    return 0;
+}
+#endif
+
+void __init numa_dummy_init(unsigned long start_pfn, unsigned long end_pfn)
+{
+    int i;
+
+    printk(KERN_INFO "%s\n",
+           is_numa_off() ? "NUMA turned off" : "No NUMA configuration found");
+
+    printk(KERN_INFO "Faking a node at %016"PRIx64"-%016"PRIx64"\n",
+           (uint64_t)start_pfn << PAGE_SHIFT,
+           (uint64_t)end_pfn << PAGE_SHIFT);
+    /* setup dummy node covering all memory */
+    memnode_shift = BITS_PER_LONG - 1;
+    memnodemap = _memnodemap;
+    nodes_clear(node_online_map);
+    node_set_online(0);
+    for ( i = 0; i < nr_cpu_ids; i++ )
+        numa_set_node(i, 0);
+    cpumask_copy(&node_to_cpumask[0], cpumask_of(0));
+    setup_node_bootmem(0, (paddr_t)start_pfn << PAGE_SHIFT,
+                    (paddr_t)end_pfn << PAGE_SHIFT);
+}
+
+void numa_add_cpu(int cpu)
+{
+    cpumask_set_cpu(cpu, &node_to_cpumask[cpu_to_node(cpu)]);
+}
+
+void numa_set_node(int cpu, nodeid_t node)
+{
+    cpu_to_node[cpu] = node;
+}
+
+/* [numa=off] */
+static int __init numa_setup(char *opt)
+{
+    if ( !strncmp(opt,"off",3) )
+        numa_off = 1;
+    if ( !strncmp(opt,"on",2) )
+        numa_off = 0;
+#ifdef CONFIG_NUMA_EMU
+    if ( !strncmp(opt, "fake=", 5) )
+    {
+        numa_off = 0;
+        numa_fake = simple_strtoul(opt+5,NULL,0);
+        if ( numa_fake >= MAX_NUMNODES )
+            numa_fake = MAX_NUMNODES;
+    }
+#endif
+#ifdef CONFIG_ACPI_NUMA
+    if ( !strncmp(opt,"noacpi",6) )
+    {
+        numa_off = 0;
+        acpi_numa = 0;
+    }
+#endif
+
+    return 1;
+}
+
+static void dump_numa(unsigned char key)
+{
+    s_time_t now = NOW();
+    unsigned int i, j, n;
+    int err;
+    struct domain *d;
+    struct page_info *page;
+    unsigned int page_num_node[MAX_NUMNODES];
+    const struct vnuma_info *vnuma;
+
+    printk("'%c' pressed -> dumping numa info (now-0x%X:%08X)\n", key,
+           (uint32_t)(now >> 32), (uint32_t)now);
+
+    for_each_online_node ( i )
+    {
+        paddr_t pa = pfn_to_paddr(node_start_pfn(i) + 1);
+
+        printk("NODE%u start->%lu size->%lu free->%lu\n",
+               i, node_start_pfn(i), node_spanned_pages(i),
+               avail_node_heap_pages(i));
+        /* sanity check phys_to_nid() */
+        if ( phys_to_nid(pa) != i )
+            printk("phys_to_nid(%"PRIpaddr") -> %d should be %u\n",
+                   pa, phys_to_nid(pa), i);
+    }
+
+    j = cpumask_first(&cpu_online_map);
+    n = 0;
+    for_each_online_cpu ( i )
+    {
+        if ( i != j + n || cpu_to_node[j] != cpu_to_node[i] )
+        {
+            if ( n > 1 )
+                printk("CPU%u...%u -> NODE%d\n", j, j + n - 1, cpu_to_node[j]);
+            else
+                printk("CPU%u -> NODE%d\n", j, cpu_to_node[j]);
+            j = i;
+            n = 1;
+        }
+        else
+            ++n;
+    }
+    if ( n > 1 )
+        printk("CPU%u...%u -> NODE%d\n", j, j + n - 1, cpu_to_node[j]);
+    else
+        printk("CPU%u -> NODE%d\n", j, cpu_to_node[j]);
+
+    rcu_read_lock(&domlist_read_lock);
+
+    printk("Memory location of each domain:\n");
+    for_each_domain ( d )
+    {
+        process_pending_softirqs();
+
+        printk("Domain %u (total: %u):\n", d->domain_id, d->tot_pages);
+
+        for_each_online_node ( i )
+            page_num_node[i] = 0;
+
+        spin_lock(&d->page_alloc_lock);
+        page_list_for_each(page, &d->page_list)
+        {
+            i = phys_to_nid((paddr_t)page_to_mfn(page) << PAGE_SHIFT);
+            page_num_node[i]++;
+        }
+        spin_unlock(&d->page_alloc_lock);
+
+        for_each_online_node ( i )
+            printk("    Node %u: %u\n", i, page_num_node[i]);
+
+        if ( !read_trylock(&d->vnuma_rwlock) )
+            continue;
+
+        if ( !d->vnuma )
+        {
+            read_unlock(&d->vnuma_rwlock);
+            continue;
+        }
+
+        vnuma = d->vnuma;
+        printk("     %u vnodes, %u vcpus, guest physical layout:\n",
+               vnuma->nr_vnodes, d->max_vcpus);
+        for ( i = 0; i < vnuma->nr_vnodes; i++ )
+        {
+            unsigned int start_cpu = ~0U;
+
+            err = snprintf(keyhandler_scratch, 12, "%3u",
+                    vnuma->vnode_to_pnode[i]);
+            if ( err < 0 || vnuma->vnode_to_pnode[i] == NUMA_NO_NODE )
+                strlcpy(keyhandler_scratch, "???", sizeof(keyhandler_scratch));
+
+            printk("       %3u: pnode %s,", i, keyhandler_scratch);
+
+            printk(" vcpus ");
+
+            for ( j = 0; j < d->max_vcpus; j++ )
+            {
+                if ( !(j & 0x3f) )
+                    process_pending_softirqs();
+
+                if ( vnuma->vcpu_to_vnode[j] == i )
+                {
+                    if ( start_cpu == ~0U )
+                    {
+                        printk("%d", j);
+                        start_cpu = j;
+                    }
+                }
+                else if ( start_cpu != ~0U )
+                {
+                    if ( j - 1 != start_cpu )
+                        printk("-%d ", j - 1);
+                    else
+                        printk(" ");
+                    start_cpu = ~0U;
+                }
+            }
+
+            if ( start_cpu != ~0U  && start_cpu != j - 1 )
+                printk("-%d", j - 1);
+
+            printk("\n");
+
+            for ( j = 0; j < vnuma->nr_vmemranges; j++ )
+            {
+                if ( vnuma->vmemrange[j].nid == i )
+                    printk("           %016"PRIx64" - %016"PRIx64"\n",
+                           vnuma->vmemrange[j].start,
+                           vnuma->vmemrange[j].end);
+            }
+        }
+
+        read_unlock(&d->vnuma_rwlock);
+    }
+
+    rcu_read_unlock(&domlist_read_lock);
+}
+
+static int __init register_numa_trigger(void)
+{
+    register_keyhandler('u', dump_numa, "dump NUMA info", 1);
+    return 0;
+}
+__initcall(register_numa_trigger);
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/include/asm-x86/numa.h b/xen/include/asm-x86/numa.h
index 7237ad1..421e8b7 100644
--- a/xen/include/asm-x86/numa.h
+++ b/xen/include/asm-x86/numa.h
@@ -17,27 +17,12 @@ extern cpumask_t     node_to_cpumask[];
 #define node_to_first_cpu(node)  (__ffs(node_to_cpumask[node]))
 #define node_to_cpumask(node)    (node_to_cpumask[node])
 
-struct node {
-    paddr_t start;
-    paddr_t end;
-};
-
-extern int compute_memnode_shift(struct node *nodes, int numnodes,
-                                 nodeid_t *nodeids, unsigned int *shift);
 extern nodeid_t pxm_to_node(unsigned int pxm);
 
 #define ZONE_ALIGN (1UL << (MAX_ORDER+PAGE_SHIFT))
 
 extern void numa_add_cpu(int cpu);
-extern void numa_init_array(void);
-extern bool srat_disabled(void);
-extern void numa_set_node(int cpu, nodeid_t node);
-extern nodeid_t acpi_setup_node(unsigned int pxm);
-extern void srat_detect_node(int cpu);
-
-extern void setup_node_bootmem(nodeid_t nodeid, paddr_t start, paddr_t end);
 extern nodeid_t apicid_to_node[];
-extern void init_cpu_to_node(void);
 
 /* Simple perfect hash to map pdx to node numbers */
 extern unsigned int memnode_shift;
diff --git a/xen/include/xen/numa.h b/xen/include/xen/numa.h
index 922fbd8..eed40af 100644
--- a/xen/include/xen/numa.h
+++ b/xen/include/xen/numa.h
@@ -14,6 +14,21 @@
 
 #define MAX_NUMNODES    (1 << NODES_SHIFT)
 
+struct node {
+    paddr_t start;
+    paddr_t end;
+};
+
+extern int compute_memnode_shift(struct node *nodes, int numnodes,
+                                 nodeid_t *nodeids, unsigned int *shift);
+extern void numa_init_array(void);
+extern bool srat_disabled(void);
+extern void numa_set_node(int cpu, nodeid_t node);
+extern nodeid_t acpi_setup_node(unsigned int pxm);
+extern void srat_detect_node(int cpu);
+extern void setup_node_bootmem(nodeid_t nodeid, paddr_t start, paddr_t end);
+extern void init_cpu_to_node(void);
+
 #define vcpu_to_node(v) (cpu_to_node((v)->processor))
 
 #define domain_to_node(d) \
@@ -23,4 +38,7 @@
 bool is_numa_off(void);
 bool get_acpi_numa(void);
 void set_acpi_numa(bool val);
+int get_numa_fake(void);
+extern int numa_emulation(uint64_t start_pfn, uint64_t end_pfn);
+extern void numa_dummy_init(unsigned long start_pfn, unsigned long end_pfn);
 #endif /* _XEN_NUMA_H */
-- 
2.7.4


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


* [RFC PATCH v2 11/25] x86: NUMA: Move common code from srat.c
  2017-03-28 15:53 [RFC PATCH v2 00/25] ARM: Add Xen NUMA support vijay.kilari
                   ` (9 preceding siblings ...)
  2017-03-28 15:53 ` [RFC PATCH v2 10/25] x86: NUMA: Move numa code and make it generic vijay.kilari
@ 2017-03-28 15:53 ` vijay.kilari
  2017-05-08 17:06   ` Julien Grall
  2017-03-28 15:53 ` [RFC PATCH v2 12/25] ARM: NUMA: Parse CPU NUMA information vijay.kilari
                   ` (13 subsequent siblings)
  24 siblings, 1 reply; 71+ messages in thread
From: vijay.kilari @ 2017-03-28 15:53 UTC (permalink / raw)
  To: xen-devel
  Cc: sstabellini, wei.liu2, George.Dunlap, andrew.cooper3,
	ian.jackson, tim, julien.grall, jbeulich, Vijaya Kumar K

From: Vijaya Kumar K <Vijaya.Kumar@cavium.com>

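Move code from xen/arch/x86/srat.c to xen/common/numa.c
so that it can be used by other architectures.
A few generic static functions from x86/srat.c are made
non-static in common/numa.c. numa_failed() and
arch_sanitize_nodes_memory() remain in x86/srat.c but are made
non-static so that the common code can call them.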

Signed-off-by: Vijaya Kumar K <Vijaya.Kumar@cavium.com>
---
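Note for reviewers: the two range helpers moved by this patch are easy to
check in isolation. The sketch below mirrors the main overlap test of
conflicting_memblks() and the clamping done by cutoff_node(). It is an
illustration only: struct range, ranges_overlap() and clamp_range() are
hypothetical names, not part of Xen, and uint64_t stands in for paddr_t.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for Xen's struct node: the range [start, end). */
struct range {
    uint64_t start;
    uint64_t end;
};

/* Mirrors the main test in conflicting_memblks(): empty ranges never conflict. */
static bool ranges_overlap(const struct range *r, uint64_t start, uint64_t end)
{
    if ( r->start == r->end )
        return false;

    return r->end > start && r->start < end;
}

/* Mirrors cutoff_node(): trim r to [start, end]; a range outside becomes empty. */
static void clamp_range(struct range *r, uint64_t start, uint64_t end)
{
    assert(start <= end);

    if ( r->start < start )
    {
        r->start = start;
        if ( r->end < r->start )
            r->start = r->end;
    }
    if ( r->end > end )
    {
        r->end = end;
        if ( r->start > r->end )
            r->start = r->end;
    }
}
```

A range that lies entirely outside the window collapses to start == end,
which is why numa_scan_nodes() can later report such a node as having no
memory.
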
 xen/arch/x86/srat.c        | 152 ++-------------------------------------------
 xen/common/numa.c          | 146 +++++++++++++++++++++++++++++++++++++++++++
 xen/include/asm-x86/acpi.h |   3 -
 xen/include/asm-x86/numa.h |   2 -
 xen/include/xen/numa.h     |  14 +++++
 5 files changed, 164 insertions(+), 153 deletions(-)

diff --git a/xen/arch/x86/srat.c b/xen/arch/x86/srat.c
index 2cc87a3..55947bb 100644
--- a/xen/arch/x86/srat.c
+++ b/xen/arch/x86/srat.c
@@ -23,9 +23,8 @@
 
 static struct acpi_table_slit *__read_mostly acpi_slit;
 
-static nodemask_t __initdata memory_nodes_parsed;
-static nodemask_t __initdata processor_nodes_parsed;
-static struct node __initdata nodes[MAX_NUMNODES];
+extern nodemask_t processor_nodes_parsed;
+extern nodemask_t memory_nodes_parsed;
 
 /*
  * Keep BIOS's CPU2node information, should not be used for memory allocaion
@@ -43,49 +42,8 @@ static struct pxm2node __read_mostly pxm2node[MAX_NUMNODES] =
 
 static unsigned node_to_pxm(nodeid_t n);
 
-static int num_node_memblks;
-static struct node node_memblk_range[NR_NODE_MEMBLKS];
-static nodeid_t memblk_nodeid[NR_NODE_MEMBLKS];
 static __initdata DECLARE_BITMAP(memblk_hotplug, NR_NODE_MEMBLKS);
 
-static struct node *get_numa_node(int id)
-{
-	return &nodes[id];
-}
-
-static nodeid_t get_memblk_nodeid(int id)
-{
-	return memblk_nodeid[id];
-}
-
-static nodeid_t *get_memblk_nodeid_map(void)
-{
-	return &memblk_nodeid[0];
-}
-
-static struct node *get_node_memblk_range(int memblk)
-{
-	return &node_memblk_range[memblk];
-}
-
-static int get_num_node_memblks(void)
-{
-	return num_node_memblks;
-}
-
-static int __init numa_add_memblk(nodeid_t nodeid, paddr_t start, uint64_t size)
-{
-	if (nodeid >= NR_NODE_MEMBLKS)
-		return -EINVAL;
-
-	node_memblk_range[num_node_memblks].start = start;
-	node_memblk_range[num_node_memblks].end = start + size;
-	memblk_nodeid[num_node_memblks] = nodeid;
-	num_node_memblks++;
-
-	return 0;
-}
-
 static inline bool node_found(unsigned int idx, unsigned int pxm)
 {
 	return ((pxm2node[idx].pxm == pxm) &&
@@ -156,54 +114,7 @@ nodeid_t acpi_setup_node(unsigned int pxm)
 	return node;
 }
 
-int valid_numa_range(paddr_t start, paddr_t end, nodeid_t node)
-{
-	int i;
-
-	for (i = 0; i < get_num_node_memblks(); i++) {
-		struct node *nd = get_node_memblk_range(i);
-
-		if (nd->start <= start && nd->end > end &&
-		    get_memblk_nodeid(i) == node)
-			return 1;
-	}
-
-	return 0;
-}
-
-static int __init conflicting_memblks(paddr_t start, paddr_t end)
-{
-	int i;
-
-	for (i = 0; i < get_num_node_memblks(); i++) {
-		struct node *nd = get_node_memblk_range(i);
-		if (nd->start == nd->end)
-			continue;
-		if (nd->end > start && nd->start < end)
-			return i;
-		if (nd->end == end && nd->start == start)
-			return i;
-	}
-	return -1;
-}
-
-static void __init cutoff_node(int i, paddr_t start, paddr_t end)
-{
-	struct node *nd = get_numa_node(i);
-
-	if (nd->start < start) {
-		nd->start = start;
-		if (nd->end < nd->start)
-			nd->start = nd->end;
-	}
-	if (nd->end > end) {
-		nd->end = end;
-		if (nd->start > nd->end)
-			nd->start = nd->end;
-	}
-}
-
-static void __init numa_failed(void)
+void __init numa_failed(void)
 {
 	int i;
 	printk(KERN_ERR "SRAT: SRAT not used.\n");
@@ -419,7 +330,7 @@ acpi_numa_memory_affinity_init(const struct acpi_srat_mem_affinity *ma)
 
 /* Sanity check to catch more bad SRATs (they are amazingly common).
    Make sure the PXMs cover all memory. */
-static int __init arch_sanitize_nodes_memory(void)
+int __init arch_sanitize_nodes_memory(void)
 {
 	int i;
 
@@ -516,61 +427,6 @@ void __init srat_parse_regions(uint64_t addr)
 	pfn_pdx_hole_setup(mask >> PAGE_SHIFT);
 }
 
-/* Use the information discovered above to actually set up the nodes. */
-int __init numa_scan_nodes(uint64_t start, uint64_t end)
-{
-	int i;
-	nodemask_t all_nodes_parsed;
-	struct node *memblks;
-	nodeid_t *nodeids;
-
-	/* First clean up the node list */
-	for (i = 0; i < MAX_NUMNODES; i++)
-		cutoff_node(i, start, end);
-
-	if (get_acpi_numa() == 0)
-		return -1;
-
-	if (!arch_sanitize_nodes_memory()) {
-		numa_failed();
-		return -1;
-	}
-
-	memblks = get_node_memblk_range(0);
-	nodeids = get_memblk_nodeid_map();
-	if (compute_memnode_shift(node_memblk_range, num_node_memblks,
-				  memblk_nodeid, &memnode_shift)) {
-		memnode_shift = 0;
-		printk(KERN_ERR
-		     "SRAT: No NUMA node hash function found. Contact maintainer\n");
-		numa_failed();
-		return -1;
-	}
-
-	nodes_or(all_nodes_parsed, memory_nodes_parsed, processor_nodes_parsed);
-
-	/* Finally register nodes */
-	for_each_node_mask(i, all_nodes_parsed)
-	{
-		struct node *nd = get_numa_node(i);
-		uint64_t size = nd->end - nd->start;
-
-		if ( size == 0 )
-			printk(KERN_WARNING "SRAT: Node %u has no memory. "
-			       "BIOS Bug or mis-configured hardware?\n", i);
-
-		setup_node_bootmem(i, nd->start, nd->end);
-	}
-	for (i = 0; i < nr_cpu_ids; i++) {
-		if (cpu_to_node[i] == NUMA_NO_NODE)
-			continue;
-		if (!node_isset(cpu_to_node[i], processor_nodes_parsed))
-			numa_set_node(i, NUMA_NO_NODE);
-	}
-	numa_init_array();
-	return 0;
-}
-
 static unsigned node_to_pxm(nodeid_t n)
 {
 	unsigned i;
diff --git a/xen/common/numa.c b/xen/common/numa.c
index 207ebd8..1789bba 100644
--- a/xen/common/numa.c
+++ b/xen/common/numa.c
@@ -32,6 +32,8 @@
 static int numa_setup(char *s);
 custom_param("numa", numa_setup);
 
+nodemask_t __initdata memory_nodes_parsed;
+nodemask_t __initdata processor_nodes_parsed;
 struct node_data node_data[MAX_NUMNODES];
 
 /* Mapping from pdx to node id */
@@ -47,6 +49,10 @@ cpumask_t __read_mostly node_to_cpumask[MAX_NUMNODES];
 
 static bool numa_off = 0;
 static bool acpi_numa = 1;
+static int num_node_memblks;
+static struct node node_memblk_range[NR_NODE_MEMBLKS];
+static nodeid_t memblk_nodeid[NR_NODE_MEMBLKS];
+static struct node __initdata nodes[MAX_NUMNODES];
 
 bool is_numa_off(void)
 {
@@ -68,6 +74,91 @@ bool srat_disabled(void)
     return numa_off || acpi_numa == 0;
 }
 
+struct node *get_numa_node(int id)
+{
+    return &nodes[id];
+}
+
+nodeid_t get_memblk_nodeid(int id)
+{
+    return memblk_nodeid[id];
+}
+
+nodeid_t *get_memblk_nodeid_map(void)
+{
+    return &memblk_nodeid[0];
+}
+
+struct node *get_node_memblk_range(int memblk)
+{
+    return &node_memblk_range[memblk];
+}
+
+int get_num_node_memblks(void)
+{
+    return num_node_memblks;
+}
+
+int __init numa_add_memblk(nodeid_t nodeid, paddr_t start, uint64_t size)
+{
+    if (nodeid >= NR_NODE_MEMBLKS)
+        return -EINVAL;
+
+    node_memblk_range[num_node_memblks].start = start;
+    node_memblk_range[num_node_memblks].end = start + size;
+    memblk_nodeid[num_node_memblks] = nodeid;
+    num_node_memblks++;
+
+    return 0;
+}
+
+int valid_numa_range(paddr_t start, paddr_t end, nodeid_t node)
+{
+    int i;
+
+    for (i = 0; i < get_num_node_memblks(); i++) {
+        struct node *nd = get_node_memblk_range(i);
+
+        if (nd->start <= start && nd->end > end &&
+            get_memblk_nodeid(i) == node)
+            return 1;
+    }
+
+    return 0;
+}
+
+int __init conflicting_memblks(paddr_t start, paddr_t end)
+{
+    int i;
+
+    for (i = 0; i < get_num_node_memblks(); i++) {
+        struct node *nd = get_node_memblk_range(i);
+        if (nd->start == nd->end)
+            continue;
+        if (nd->end > start && nd->start < end)
+            return i;
+        if (nd->end == end && nd->start == start)
+            return i;
+    }
+    return -1;
+}
+
+void __init cutoff_node(int i, paddr_t start, paddr_t end)
+{
+    struct node *nd = get_numa_node(i);
+
+    if (nd->start < start) {
+        nd->start = start;
+        if (nd->end < nd->start)
+            nd->start = nd->end;
+    }
+    if (nd->end > end) {
+        nd->end = end;
+        if (nd->start > nd->end)
+            nd->start = nd->end;
+    }
+}
+
 /*
  * Given a shift value, try to populate memnodemap[]
  * Returns :
@@ -306,6 +397,61 @@ void numa_set_node(int cpu, nodeid_t node)
     cpu_to_node[cpu] = node;
 }
 
+/* Use the information discovered above to actually set up the nodes. */
+int __init numa_scan_nodes(uint64_t start, uint64_t end)
+{
+    int i;
+    nodemask_t all_nodes_parsed;
+    struct node *memblks;
+    nodeid_t *nodeids;
+
+    /* First clean up the node list */
+    for (i = 0; i < MAX_NUMNODES; i++)
+        cutoff_node(i, start, end);
+
+    if (get_acpi_numa() == 0)
+        return -1;
+
+    if (!arch_sanitize_nodes_memory()) {
+        numa_failed();
+        return -1;
+    }
+
+    memblks = get_node_memblk_range(0);
+    nodeids = get_memblk_nodeid_map();
+    if (compute_memnode_shift(node_memblk_range, num_node_memblks,
+                  memblk_nodeid, &memnode_shift)) {
+        memnode_shift = 0;
+        printk(KERN_ERR
+             "SRAT: No NUMA node hash function found. Contact maintainer\n");
+        numa_failed();
+        return -1;
+    }
+
+    nodes_or(all_nodes_parsed, memory_nodes_parsed, processor_nodes_parsed);
+
+    /* Finally register nodes */
+    for_each_node_mask(i, all_nodes_parsed)
+    {
+        struct node *nd = get_numa_node(i);
+        uint64_t size = nd->end - nd->start;
+
+        if ( size == 0 )
+            printk(KERN_WARNING "SRAT: Node %u has no memory. "
+                   "BIOS Bug or mis-configured hardware?\n", i);
+
+        setup_node_bootmem(i, nd->start, nd->end);
+    }
+    for (i = 0; i < nr_cpu_ids; i++) {
+        if (cpu_to_node[i] == NUMA_NO_NODE)
+            continue;
+        if (!node_isset(cpu_to_node[i], processor_nodes_parsed))
+            numa_set_node(i, NUMA_NO_NODE);
+    }
+    numa_init_array();
+    return 0;
+}
+
 /* [numa=off] */
 static int __init numa_setup(char *opt)
 {
diff --git a/xen/include/asm-x86/acpi.h b/xen/include/asm-x86/acpi.h
index 445b8e5..e4f3f9a 100644
--- a/xen/include/asm-x86/acpi.h
+++ b/xen/include/asm-x86/acpi.h
@@ -103,9 +103,6 @@ extern void acpi_reserve_bootmem(void);
 
 #define ARCH_HAS_POWER_INIT	1
 
-extern int numa_scan_nodes(u64 start, u64 end);
-#define NR_NODE_MEMBLKS (MAX_NUMNODES*2)
-
 #ifdef CONFIG_ACPI_SLEEP
 
 extern struct acpi_sleep_info acpi_sinfo;
diff --git a/xen/include/asm-x86/numa.h b/xen/include/asm-x86/numa.h
index 421e8b7..7cff220 100644
--- a/xen/include/asm-x86/numa.h
+++ b/xen/include/asm-x86/numa.h
@@ -47,8 +47,6 @@ static inline __attribute__((pure)) nodeid_t phys_to_nid(paddr_t addr)
 #define node_end_pfn(nid)       (NODE_DATA(nid)->node_start_pfn + \
                                  NODE_DATA(nid)->node_spanned_pages)
 
-extern int valid_numa_range(paddr_t start, paddr_t end, nodeid_t node);
-
 void srat_parse_regions(uint64_t addr);
 extern uint8_t __node_distance(nodeid_t a, nodeid_t b);
 unsigned int arch_get_dma_bitsize(void);
diff --git a/xen/include/xen/numa.h b/xen/include/xen/numa.h
index eed40af..ee53526 100644
--- a/xen/include/xen/numa.h
+++ b/xen/include/xen/numa.h
@@ -13,6 +13,7 @@
 #define NUMA_NO_DISTANCE 0xFF
 
 #define MAX_NUMNODES    (1 << NODES_SHIFT)
+#define NR_NODE_MEMBLKS (MAX_NUMNODES * 2)
 
 struct node {
     paddr_t start;
@@ -28,6 +29,19 @@ extern nodeid_t acpi_setup_node(unsigned int pxm);
 extern void srat_detect_node(int cpu);
 extern void setup_node_bootmem(nodeid_t nodeid, paddr_t start, paddr_t end);
 extern void init_cpu_to_node(void);
+extern int valid_numa_range(paddr_t start, paddr_t end, nodeid_t node);
+extern int conflicting_memblks(paddr_t start, paddr_t end);
+extern void cutoff_node(int i, paddr_t start, paddr_t end);
+extern struct node *get_numa_node(int id);
+extern nodeid_t get_memblk_nodeid(int memblk);
+extern nodeid_t *get_memblk_nodeid_map(void);
+extern struct node *get_node_memblk_range(int memblk);
+extern struct node *get_memblk(int memblk);
+extern int numa_add_memblk(nodeid_t nodeid, paddr_t start, uint64_t size);
+extern int get_num_node_memblks(void);
+extern int arch_sanitize_nodes_memory(void);
+extern void numa_failed(void);
+extern int numa_scan_nodes(uint64_t start, uint64_t end);
 
 #define vcpu_to_node(v) (cpu_to_node((v)->processor))
 
-- 
2.7.4



* [RFC PATCH v2 12/25] ARM: NUMA: Parse CPU NUMA information
  2017-03-28 15:53 [RFC PATCH v2 00/25] ARM: Add Xen NUMA support vijay.kilari
                   ` (10 preceding siblings ...)
  2017-03-28 15:53 ` [RFC PATCH v2 11/25] x86: NUMA: Move common code from srat.c vijay.kilari
@ 2017-03-28 15:53 ` vijay.kilari
  2017-05-08 17:31   ` Julien Grall
  2017-03-28 15:53 ` [RFC PATCH v2 13/25] ARM: NUMA: Parse memory " vijay.kilari
                   ` (12 subsequent siblings)
  24 siblings, 1 reply; 71+ messages in thread
From: vijay.kilari @ 2017-03-28 15:53 UTC (permalink / raw)
  To: xen-devel
  Cc: sstabellini, wei.liu2, George.Dunlap, andrew.cooper3,
	ian.jackson, tim, julien.grall, jbeulich, Vijaya Kumar K

From: Vijaya Kumar K <Vijaya.Kumar@cavium.com>

Parse the CPU nodes in the device tree and fetch their numa-node-id
property. For each valid node id found, set the corresponding bit in
the processor_nodes_parsed nodemask.
Refer to Documentation/devicetree/bindings/numa.txt in the Linux tree.

Signed-off-by: Vijaya Kumar K <Vijaya.Kumar@cavium.com>
---
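Note for reviewers: the per-CPU handling amounts to "look up numa-node-id,
fall back to an out-of-range default, and set a bit only for valid ids".
A minimal standalone sketch of that logic follows. The names
nodemask_sketch_t and record_cpu_node() are hypothetical, the has_prop/value
pair stands in for the device_tree_get_u32() lookup, and unlike the patch
(which warns and returns 0) the sketch returns -1 on rejection so that the
outcome is observable.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_NUMNODES 4  /* assumption: NODES_SHIFT == 2, as in asm-arm/numa.h */

/* Hypothetical stand-in for nodemask_t: one bit per node. */
typedef uint32_t nodemask_sketch_t;

/*
 * Record the node id of one CPU node.
 * A missing property yields the out-of-range default MAX_NUMNODES,
 * so it is rejected by the same check as an oversized id.
 * Returns 0 if the id was recorded, -1 if it was rejected.
 */
static int record_cpu_node(nodemask_sketch_t *mask, int has_prop,
                           uint32_t value)
{
    uint32_t nid = has_prop ? value : MAX_NUMNODES;

    assert(mask);

    if ( nid >= MAX_NUMNODES )
        return -1;

    *mask |= (nodemask_sketch_t)1 << nid;
    return 0;
}
```

Using MAX_NUMNODES as the default means a CPU node without numa-node-id
needs no separate error path.
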
 xen/arch/arm/Makefile       |  1 +
 xen/arch/arm/bootfdt.c      | 16 ++++++++--
 xen/arch/arm/numa/Makefile  |  2 ++
 xen/arch/arm/numa/dt_numa.c | 78 +++++++++++++++++++++++++++++++++++++++++++++
 xen/arch/arm/numa/numa.c    | 50 +++++++++++++++++++++++++++++
 xen/arch/arm/setup.c        |  4 +++
 xen/include/asm-arm/numa.h  | 10 +++++-
 xen/include/asm-arm/setup.h |  4 ++-
 8 files changed, 161 insertions(+), 4 deletions(-)

diff --git a/xen/arch/arm/Makefile b/xen/arch/arm/Makefile
index 0ce94a8..d13b79f 100644
--- a/xen/arch/arm/Makefile
+++ b/xen/arch/arm/Makefile
@@ -3,6 +3,7 @@ subdir-$(CONFIG_ARM_64) += arm64
 subdir-y += platforms
 subdir-$(CONFIG_ARM_64) += efi
 subdir-$(CONFIG_ACPI) += acpi
+subdir-$(CONFIG_NUMA) += numa
 
 obj-$(CONFIG_HAS_ALTERNATIVE) += alternative.o
 obj-y += bootfdt.init.o
diff --git a/xen/arch/arm/bootfdt.c b/xen/arch/arm/bootfdt.c
index ea188a0..1f876f0 100644
--- a/xen/arch/arm/bootfdt.c
+++ b/xen/arch/arm/bootfdt.c
@@ -62,8 +62,20 @@ static void __init device_tree_get_reg(const __be32 **cell, u32 address_cells,
     *size = dt_next_cell(size_cells, cell);
 }
 
-static u32 __init device_tree_get_u32(const void *fdt, int node,
-                                      const char *prop_name, u32 dflt)
+bool_t __init device_tree_type_matches(const void *fdt, int node,
+                                       const char *match)
+{
+    const void *prop;
+
+    prop = fdt_getprop(fdt, node, "device_type", NULL);
+    if ( prop == NULL )
+        return 0;
+
+    return strcmp(prop, match) == 0 ? 1 : 0;
+}
+
+u32 __init device_tree_get_u32(const void *fdt, int node,
+                               const char *prop_name, u32 dflt)
 {
     const struct fdt_property *prop;
 
diff --git a/xen/arch/arm/numa/Makefile b/xen/arch/arm/numa/Makefile
new file mode 100644
index 0000000..3af3aff
--- /dev/null
+++ b/xen/arch/arm/numa/Makefile
@@ -0,0 +1,2 @@
+obj-y += dt_numa.o
+obj-y += numa.o
diff --git a/xen/arch/arm/numa/dt_numa.c b/xen/arch/arm/numa/dt_numa.c
new file mode 100644
index 0000000..66c6efb
--- /dev/null
+++ b/xen/arch/arm/numa/dt_numa.c
@@ -0,0 +1,78 @@
+/*
+ * OF NUMA Parsing support.
+ *
+ * Copyright (C) 2015 - 2016 Cavium Inc.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <xen/libfdt/libfdt.h>
+#include <xen/mm.h>
+#include <xen/nodemask.h>
+#include <asm/mm.h>
+#include <xen/numa.h>
+#include <xen/device_tree.h>
+#include <asm/setup.h>
+
+extern nodemask_t processor_nodes_parsed;
+
+/*
+ * Even though we connect cpus to numa domains later in SMP
+ * init, we need to know the node ids now for all cpus.
+ */
+static int __init dt_numa_process_cpu_node(const void *fdt, int node,
+                                           const char *name,
+                                           uint32_t address_cells,
+                                           uint32_t size_cells)
+{
+    uint32_t nid;
+
+    nid = device_tree_get_u32(fdt, node, "numa-node-id", MAX_NUMNODES);
+
+    if ( nid >= MAX_NUMNODES )
+        printk(XENLOG_WARNING "NUMA: Node id %u exceeds maximum value\n", nid);
+    else
+        node_set(nid, processor_nodes_parsed);
+
+    return 0;
+}
+
+static int __init dt_numa_scan_cpu_node(const void *fdt, int node,
+                                        const char *name, int depth,
+                                        uint32_t address_cells,
+                                        uint32_t size_cells, void *data)
+{
+    if ( device_tree_type_matches(fdt, node, "cpu") )
+        return dt_numa_process_cpu_node(fdt, node, name, address_cells,
+                                        size_cells);
+
+    return 0;
+}
+
+int __init dt_numa_init(void)
+{
+    int ret;
+
+    ret = device_tree_for_each_node((void *)device_tree_flattened,
+                                    dt_numa_scan_cpu_node, NULL);
+    return ret;
+}
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/arch/arm/numa/numa.c b/xen/arch/arm/numa/numa.c
new file mode 100644
index 0000000..c1c7c35
--- /dev/null
+++ b/xen/arch/arm/numa/numa.c
@@ -0,0 +1,50 @@
+/*
+ * ARM NUMA Implementation
+ *
+ * Copyright (C) 2016 - Cavium Inc.
+ * Vijaya Kumar K <vijaya.kumar@cavium.com>
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms and conditions of the GNU General Public
+ * License, version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ */
+
+#include <xen/init.h>
+#include <xen/ctype.h>
+#include <xen/mm.h>
+#include <xen/nodemask.h>
+#include <asm/mm.h>
+#include <xen/numa.h>
+#include <asm/acpi.h>
+
+extern nodemask_t processor_nodes_parsed;
+
+void __init numa_init(void)
+{
+    int ret = 0;
+
+    nodes_clear(processor_nodes_parsed);
+    if ( is_numa_off() )
+        goto no_numa;
+
+    ret = dt_numa_init();
+    if ( ret )
+        printk(XENLOG_WARNING "DT NUMA init failed\n");
+
+no_numa:
+    return;
+}
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/arch/arm/setup.c b/xen/arch/arm/setup.c
index 92a2de6..ed58f0e 100644
--- a/xen/arch/arm/setup.c
+++ b/xen/arch/arm/setup.c
@@ -38,6 +38,7 @@
 #include <xen/libfdt/libfdt.h>
 #include <xen/acpi.h>
 #include <asm/alternative.h>
+#include <xen/numa.h>
 #include <asm/page.h>
 #include <asm/current.h>
 #include <asm/setup.h>
@@ -758,6 +759,9 @@ void __init start_xen(unsigned long boot_phys_offset,
     else
         printk("Booting using ACPI\n");
 
+    /* numa_init() parses ACPI tables, so call it after ACPI init. */
+    numa_init();
+
     end_boot_allocator();
 
     vm_init();
diff --git a/xen/include/asm-arm/numa.h b/xen/include/asm-arm/numa.h
index 924bfc0..e50ee19 100644
--- a/xen/include/asm-arm/numa.h
+++ b/xen/include/asm-arm/numa.h
@@ -6,7 +6,15 @@ typedef uint8_t nodeid_t;
 /* Limit number of NUMA nodes supported to 4 */
 #define NODES_SHIFT 2
 
-#ifndef CONFIG_NUMA
+#ifdef CONFIG_NUMA
+extern void numa_init(void);
+extern int dt_numa_init(void);
+#else
+static inline void numa_init(void)
+{
+    return;
+}
+
 /* Fake one node for now. See also node_online_map. */
 #define cpu_to_node(cpu) 0
 #define node_to_cpumask(node)   (cpu_online_map)
diff --git a/xen/include/asm-arm/setup.h b/xen/include/asm-arm/setup.h
index 7c76185..b1022a3 100644
--- a/xen/include/asm-arm/setup.h
+++ b/xen/include/asm-arm/setup.h
@@ -79,7 +79,9 @@ struct bootmodule *add_boot_module(bootmodule_kind kind,
                                    const char *cmdline);
 struct bootmodule *boot_module_find_by_kind(bootmodule_kind kind);
 const char * __init boot_module_kind_as_string(bootmodule_kind kind);
-
+u32 device_tree_get_u32(const void *fdt, int node,
+                        const char *prop_name, u32 dflt);
+bool_t device_tree_type_matches(const void *fdt, int node, const char *match);
 #endif
 /*
  * Local variables:
-- 
2.7.4



* [RFC PATCH v2 13/25] ARM: NUMA: Parse memory NUMA information
  2017-03-28 15:53 [RFC PATCH v2 00/25] ARM: Add Xen NUMA support vijay.kilari
                   ` (11 preceding siblings ...)
  2017-03-28 15:53 ` [RFC PATCH v2 12/25] ARM: NUMA: Parse CPU NUMA information vijay.kilari
@ 2017-03-28 15:53 ` vijay.kilari
  2017-03-28 15:53 ` [RFC PATCH v2 14/25] ARM: NUMA: Parse NUMA distance information vijay.kilari
                   ` (11 subsequent siblings)
  24 siblings, 0 replies; 71+ messages in thread
From: vijay.kilari @ 2017-03-28 15:53 UTC (permalink / raw)
  To: xen-devel
  Cc: sstabellini, wei.liu2, George.Dunlap, andrew.cooper3,
	ian.jackson, tim, julien.grall, jbeulich, Vijaya Kumar K

From: Vijaya Kumar K <Vijaya.Kumar@cavium.com>

Parse the memory nodes and fetch their numa-node-id property.
Store each memory range in node_memblk_range[] along with
its node id.

When booting in UEFI mode, memory information is passed to Xen
through the EFI memory descriptor table, and the EFI stub deletes
the memory nodes from the host DT. However, to fetch the NUMA node
id of memory, the memory DT nodes must not be deleted by the EFI
stub. With this patch, the memory nodes are no longer deleted from
the FDT. When EFI boot is enabled, the DT memory banks are used only
for NUMA information and are not added to bootinfo.mem.

The NUMA info of memory is extracted in process_memory_node()
instead of parsing the DT again during numa_init().

Signed-off-by: Vijaya Kumar K <Vijaya.Kumar@cavium.com>
---
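Note for reviewers: the bookkeeping described above is "append one
(start, end, node id) entry per non-empty memory bank to a fixed-size
table". The sketch below is an independent illustration and does not
reproduce dt_numa_process_memory_node(). struct memblk, struct memblk_table
and memblk_add() are hypothetical names. As an assumption of the sketch, it
bounds both the node id and the table index; the moved numa_add_memblk()
helper in this series only compares nodeid against NR_NODE_MEMBLKS.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_NUMNODES    4                   /* assumption: NODES_SHIFT == 2 */
#define NR_NODE_MEMBLKS (MAX_NUMNODES * 2)

/* One memory block: the range [start, end) owned by node nid. */
struct memblk {
    uint64_t start;
    uint64_t end;
    uint8_t  nid;
};

struct memblk_table {
    struct memblk blk[NR_NODE_MEMBLKS];
    unsigned int  count;
};

/*
 * Append [start, start + size) for node nid.
 * Returns 0 on success, -1 if the node id is out of range,
 * the bank is empty, or the table is full.
 */
static int memblk_add(struct memblk_table *t, uint32_t nid,
                      uint64_t start, uint64_t size)
{
    assert(t);

    if ( nid >= MAX_NUMNODES || size == 0 || t->count >= NR_NODE_MEMBLKS )
        return -1;

    t->blk[t->count].start = start;
    t->blk[t->count].end = start + size;
    t->blk[t->count].nid = (uint8_t)nid;
    t->count++;

    return 0;
}
```

A memory node without numa-node-id gets an out-of-range default in
process_memory_node(), so in this sketch it would be rejected by the nid
check rather than silently assigned to node 0.
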
 xen/arch/arm/bootfdt.c      | 24 ++++++++++++++++++++----
 xen/arch/arm/efi/efi-boot.h | 25 -------------------------
 xen/arch/arm/numa/dt_numa.c | 33 +++++++++++++++++++++++++++++++++
 xen/arch/arm/numa/numa.c    |  9 +++++++++
 xen/include/asm-arm/numa.h  |  2 ++
 5 files changed, 64 insertions(+), 29 deletions(-)

diff --git a/xen/arch/arm/bootfdt.c b/xen/arch/arm/bootfdt.c
index 1f876f0..993760a 100644
--- a/xen/arch/arm/bootfdt.c
+++ b/xen/arch/arm/bootfdt.c
@@ -13,6 +13,7 @@
 #include <xen/init.h>
 #include <xen/device_tree.h>
 #include <xen/libfdt/libfdt.h>
+#include <xen/efi.h>
 #include <xsm/xsm.h>
 #include <asm/setup.h>
 
@@ -146,6 +147,9 @@ static void __init process_memory_node(const void *fdt, int node,
     const __be32 *cell;
     paddr_t start, size;
     u32 reg_cells = address_cells + size_cells;
+#ifdef CONFIG_NUMA
+    uint32_t nid;
+#endif
 
     if ( address_cells < 1 || size_cells < 1 )
     {
@@ -154,24 +158,36 @@ static void __init process_memory_node(const void *fdt, int node,
         return;
     }
 
+#ifdef CONFIG_NUMA
+    nid = device_tree_get_u32(fdt, node, "numa-node-id", NR_NODE_MEMBLKS);
+#endif
     prop = fdt_get_property(fdt, node, "reg", NULL);
     if ( !prop )
     {
         printk("fdt: node `%s': missing `reg' property\n", name);
+#ifdef CONFIG_NUMA
+        numa_failed();
+#endif
         return;
     }
 
     cell = (const __be32 *)prop->data;
     banks = fdt32_to_cpu(prop->len) / (reg_cells * sizeof (u32));
 
-    for ( i = 0; i < banks && bootinfo.mem.nr_banks < NR_MEM_BANKS; i++ )
+    for ( i = 0; i < banks; i++ )
     {
         device_tree_get_reg(&cell, address_cells, size_cells, &start, &size);
         if ( !size )
             continue;
-        bootinfo.mem.bank[bootinfo.mem.nr_banks].start = start;
-        bootinfo.mem.bank[bootinfo.mem.nr_banks].size = size;
-        bootinfo.mem.nr_banks++;
+        if ( !efi_enabled(EFI_BOOT) && bootinfo.mem.nr_banks < NR_MEM_BANKS )
+        {
+            bootinfo.mem.bank[bootinfo.mem.nr_banks].start = start;
+            bootinfo.mem.bank[bootinfo.mem.nr_banks].size = size;
+            bootinfo.mem.nr_banks++;
+        }
+#ifdef CONFIG_NUMA
+        dt_numa_process_memory_node(nid, start, size);
+#endif
     }
 }
 
diff --git a/xen/arch/arm/efi/efi-boot.h b/xen/arch/arm/efi/efi-boot.h
index e1e447a..07fe178 100644
--- a/xen/arch/arm/efi/efi-boot.h
+++ b/xen/arch/arm/efi/efi-boot.h
@@ -194,33 +194,8 @@ EFI_STATUS __init fdt_add_uefi_nodes(EFI_SYSTEM_TABLE *sys_table,
     int status;
     u32 fdt_val32;
     u64 fdt_val64;
-    int prev;
     int num_rsv;
 
-    /*
-     * Delete any memory nodes present.  The EFI memory map is the only
-     * memory description provided to Xen.
-     */
-    prev = 0;
-    for (;;)
-    {
-        const char *type;
-        int len;
-
-        node = fdt_next_node(fdt, prev, NULL);
-        if ( node < 0 )
-            break;
-
-        type = fdt_getprop(fdt, node, "device_type", &len);
-        if ( type && strncmp(type, "memory", len) == 0 )
-        {
-            fdt_del_node(fdt, node);
-            continue;
-        }
-
-        prev = node;
-    }
-
    /*
     * Delete all memory reserve map entries. When booting via UEFI,
     * kernel will use the UEFI memory map to find reserved regions.
diff --git a/xen/arch/arm/numa/dt_numa.c b/xen/arch/arm/numa/dt_numa.c
index 66c6efb..593c647 100644
--- a/xen/arch/arm/numa/dt_numa.c
+++ b/xen/arch/arm/numa/dt_numa.c
@@ -25,6 +25,7 @@
 #include <asm/setup.h>
 
 extern nodemask_t processor_nodes_parsed;
+extern nodemask_t memory_nodes_parsed;
 
 /*
  * Even though we connect cpus to numa domains later in SMP
@@ -59,6 +60,38 @@ static int __init dt_numa_scan_cpu_node(const void *fdt, int node,
     return 0;
 }
 
+void __init dt_numa_process_memory_node(uint32_t nid, paddr_t start,
+                                       paddr_t size)
+{
+    struct node *nd;
+    int i;
+
+    i = conflicting_memblks(start, start + size);
+    if ( i < 0 )
+    {
+         if ( numa_add_memblk(nid, start, size) )
+         {
+             printk(XENLOG_WARNING "DT: NUMA: node-id %u overflow \n", nid);
+             numa_failed();
+             return;
+         }
+    }
+    else
+    {
+         nd = get_node_memblk_range(i);
+         printk(XENLOG_ERR
+                "NUMA DT: node %u (%"PRIx64"-%"PRIx64") overlaps with %d (%"PRIx64"-%"PRIx64")\n",
+                nid, start, start + size, i, nd->start, nd->end);
+
+         numa_failed();
+         return;
+    }
+
+    node_set(nid, memory_nodes_parsed);
+
+    return;
+}
+
 int __init dt_numa_init(void)
 {
     int ret;
diff --git a/xen/arch/arm/numa/numa.c b/xen/arch/arm/numa/numa.c
index c1c7c35..b333453 100644
--- a/xen/arch/arm/numa/numa.c
+++ b/xen/arch/arm/numa/numa.c
@@ -23,6 +23,12 @@
 #include <asm/acpi.h>
 
 extern nodemask_t processor_nodes_parsed;
+static bool_t dt_numa = 1;
+
+void numa_failed(void)
+{
+    dt_numa = 0;
+}
 
 void __init numa_init(void)
 {
@@ -32,6 +38,9 @@ void __init numa_init(void)
     if ( is_numa_off() )
         goto no_numa;
 
+    if ( !dt_numa )
+        goto no_numa;
+
     ret = dt_numa_init();
     if ( ret )
         printk(XENLOG_WARNING "DT NUMA init failed\n");
diff --git a/xen/include/asm-arm/numa.h b/xen/include/asm-arm/numa.h
index e50ee19..962a214 100644
--- a/xen/include/asm-arm/numa.h
+++ b/xen/include/asm-arm/numa.h
@@ -6,6 +6,8 @@ typedef uint8_t nodeid_t;
 /* Limit number of NUMA nodes supported to 4 */
 #define NODES_SHIFT 2
 
+extern void dt_numa_process_memory_node(uint32_t nid,paddr_t start,
+                                        paddr_t size);
 #ifdef CONFIG_NUMA
 extern void numa_init(void);
 extern int dt_numa_init(void);
-- 
2.7.4


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 71+ messages in thread

* [RFC PATCH v2 14/25] ARM: NUMA: Parse NUMA distance information
  2017-03-28 15:53 [RFC PATCH v2 00/25] ARM: Add Xen NUMA support vijay.kilari
                   ` (12 preceding siblings ...)
  2017-03-28 15:53 ` [RFC PATCH v2 13/25] ARM: NUMA: Parse memory " vijay.kilari
@ 2017-03-28 15:53 ` vijay.kilari
  2017-03-28 15:53 ` [RFC PATCH v2 15/25] ARM: NUMA: Add CPU NUMA support vijay.kilari
                   ` (10 subsequent siblings)
  24 siblings, 0 replies; 71+ messages in thread
From: vijay.kilari @ 2017-03-28 15:53 UTC (permalink / raw)
  To: xen-devel
  Cc: sstabellini, wei.liu2, George.Dunlap, andrew.cooper3,
	ian.jackson, tim, julien.grall, jbeulich, Vijaya Kumar K

From: Vijaya Kumar K <Vijaya.Kumar@cavium.com>

Parse the distance-matrix property and store the node distance
information in node_distance[].

Register dt_node_distance() as the distance callback with
the ARM NUMA code. The same approach can later be used for
ACPI.

Signed-off-by: Vijaya Kumar K <Vijaya.Kumar@cavium.com>
---
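For illustration only (not part of the patch): the distance-matrix
property is a flat list of big-endian u32 triples <nodeA nodeB
distance>. A standalone sketch of the parsing is given below. NODES,
LOCAL_DIST, REMOTE_DIST and all function names are hypothetical, and
the sketch is stricter than the patch in that it rejects any length
that is not a whole number of triples.

```c
#include <stddef.h>
#include <stdint.h>

#define NODES       4   /* hypothetical; matches NODES_SHIFT 2 */
#define LOCAL_DIST  10
#define REMOTE_DIST 20

static uint8_t dist[NODES][NODES];

/* Default table: 10 on the diagonal, 20 everywhere else. */
static void dist_init(void)
{
    for (int i = 0; i < NODES; i++)
        for (int j = 0; j < NODES; j++)
            dist[i][j] = (i == j) ? LOCAL_DIST : REMOTE_DIST;
}

/* Read one big-endian 32-bit cell. */
static uint32_t be32_read(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Parse len bytes of <a b distance> triples. Returns 0 on success, -1 on error. */
static int dist_parse(const uint8_t *data, size_t len)
{
    if (len == 0 || len % 12 != 0)
        return -1;

    for (size_t off = 0; off < len; off += 12) {
        uint32_t a = be32_read(data + off);
        uint32_t b = be32_read(data + off + 4);
        uint32_t d = be32_read(data + off + 8);

        if (a >= NODES || b >= NODES || d > 255)
            return -1;

        dist[a][b] = (uint8_t)d;
        /* Default B->A to the A->B value; a later entry may override it. */
        if (b > a)
            dist[b][a] = (uint8_t)d;
    }
    return 0;
}

/* Lookup with a bounds-checked fallback for unknown node ids. */
static uint8_t node_dist(unsigned int a, unsigned int b)
{
    if (a >= NODES || b >= NODES)
        return (a == b) ? LOCAL_DIST : REMOTE_DIST;
    return dist[a][b];
}
```

Calling dist_init() before dist_parse() keeps every pair that the
matrix does not mention at the default local/remote values.
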
 xen/arch/arm/bootfdt.c      |   4 +-
 xen/arch/arm/numa/dt_numa.c | 133 ++++++++++++++++++++++++++++++++++++++++++++
 xen/arch/arm/numa/numa.c    |  21 +++++++
 xen/include/asm-arm/numa.h  |   3 +
 xen/include/asm-arm/setup.h |   2 +
 5 files changed, 161 insertions(+), 2 deletions(-)

diff --git a/xen/arch/arm/bootfdt.c b/xen/arch/arm/bootfdt.c
index 993760a..c72300c 100644
--- a/xen/arch/arm/bootfdt.c
+++ b/xen/arch/arm/bootfdt.c
@@ -17,8 +17,8 @@
 #include <xsm/xsm.h>
 #include <asm/setup.h>
 
-static bool_t __init device_tree_node_matches(const void *fdt, int node,
-                                              const char *match)
+bool_t __init device_tree_node_matches(const void *fdt, int node,
+                                       const char *match)
 {
     const char *name;
     size_t match_len;
diff --git a/xen/arch/arm/numa/dt_numa.c b/xen/arch/arm/numa/dt_numa.c
index 593c647..c2dcfa1 100644
--- a/xen/arch/arm/numa/dt_numa.c
+++ b/xen/arch/arm/numa/dt_numa.c
@@ -27,6 +27,48 @@
 extern nodemask_t processor_nodes_parsed;
 extern nodemask_t memory_nodes_parsed;
 
+static uint8_t node_distance[MAX_NUMNODES][MAX_NUMNODES];
+
+static uint8_t dt_node_distance(nodeid_t nodea, nodeid_t nodeb)
+{
+    if ( nodea >= MAX_NUMNODES || nodeb >= MAX_NUMNODES )
+        return nodea == nodeb ? LOCAL_DISTANCE : REMOTE_DISTANCE;
+
+    return node_distance[nodea][nodeb];
+}
+
+static int dt_numa_set_distance(uint32_t nodea, uint32_t nodeb,
+                                uint32_t distance)
+{
+   /* node_distance is uint8_t. Ensure distance does not exceed 255 */
+   if ( nodea >= MAX_NUMNODES || nodeb >= MAX_NUMNODES || distance > 255 )
+       return -EINVAL;
+
+   node_distance[nodea][nodeb] = distance;
+
+   return 0;
+}
+
+void init_dt_numa_distance(void)
+{
+    int i, j;
+
+    for ( i = 0; i < MAX_NUMNODES; i++ )
+    {
+        for ( j = 0; j < MAX_NUMNODES; j++ )
+        {
+            /*
+             * Initialize distance 10 for local distance and
+             * 20 for remote distance.
+             */
+            if ( i  == j )
+                node_distance[i][j] = LOCAL_DISTANCE;
+            else
+                node_distance[i][j] = REMOTE_DISTANCE;
+        }
+    }
+}
+
 /*
  * Even though we connect cpus to numa domains later in SMP
  * init, we need to know the node ids now for all cpus.
@@ -48,6 +90,76 @@ static int __init dt_numa_process_cpu_node(const void *fdt, int node,
     return 0;
 }
 
+static int __init dt_numa_parse_distance_map(const void *fdt, int node,
+                                             const char *name,
+                                             uint32_t address_cells,
+                                             uint32_t size_cells)
+{
+    const struct fdt_property *prop;
+    const __be32 *matrix;
+    int entry_count, len, i;
+
+    printk(XENLOG_INFO "NUMA: parsing numa-distance-map\n");
+
+    prop = fdt_get_property(fdt, node, "distance-matrix", &len);
+    if ( !prop )
+    {
+        printk(XENLOG_WARNING
+               "NUMA: No distance-matrix property in distance-map\n");
+
+        return -EINVAL;
+    }
+
+    if ( len % sizeof(uint32_t) != 0 )
+    {
+         printk(XENLOG_WARNING
+                "distance-matrix in node is not a multiple of u32\n");
+
+        return -EINVAL;
+    }
+
+    entry_count = len / sizeof(uint32_t);
+    if ( entry_count <= 0 )
+    {
+        printk(XENLOG_WARNING "NUMA: Invalid distance-matrix\n");
+
+        return -EINVAL;
+    }
+
+    matrix = (const __be32 *)prop->data;
+    for ( i = 0; i + 2 < entry_count; i += 3 )
+    {
+        uint32_t nodea, nodeb, distance;
+
+        nodea = dt_read_number(matrix, 1);
+        matrix++;
+        nodeb = dt_read_number(matrix, 1);
+        matrix++;
+        distance = dt_read_number(matrix, 1);
+        matrix++;
+
+        if ( dt_numa_set_distance(nodea, nodeb, distance) )
+        {
+            printk(XENLOG_WARNING
+                   "NUMA: node-id or distance out of range in distance matrix for [node%u -> node%u]\n",
+                   nodea, nodeb);
+            return -EINVAL;
+
+        }
+        printk(XENLOG_INFO "NUMA: distance[node%u -> node%u] = %u\n",
+               nodea, nodeb, distance);
+
+        /*
+         * Set default distance of node B->A same as A->B.
+         * No need to check the return value of dt_numa_set_distance().
+         */
+        if ( nodeb > nodea )
+            dt_numa_set_distance(nodeb, nodea, distance);
+    }
+
+    return 0;
+}
+
 static int __init dt_numa_scan_cpu_node(const void *fdt, int node,
                                         const char *name, int depth,
                                         uint32_t address_cells,
@@ -92,12 +204,33 @@ void __init dt_numa_process_memory_node(uint32_t nid, paddr_t start,
     return;
 }
 
+static int __init dt_numa_scan_distance_node(const void *fdt, int node,
+                                             const char *name, int depth,
+                                             uint32_t address_cells,
+                                             uint32_t size_cells, void *data)
+{
+    if ( device_tree_node_matches(fdt, node, "numa-distance-map-v1") )
+        return dt_numa_parse_distance_map(fdt, node, name, address_cells,
+                                          size_cells);
+
+    return 0;
+}
+
 int __init dt_numa_init(void)
 {
     int ret;
 
     ret = device_tree_for_each_node((void *)device_tree_flattened,
                                     dt_numa_scan_cpu_node, NULL);
+    if ( ret )
+        return ret;
+
+    ret = device_tree_for_each_node((void *)device_tree_flattened,
+                                    dt_numa_scan_distance_node, NULL);
+
+    if ( !ret )
+        register_node_distance(&dt_node_distance);
+
     return ret;
 }
 
diff --git a/xen/arch/arm/numa/numa.c b/xen/arch/arm/numa/numa.c
index b333453..0ee89da 100644
--- a/xen/arch/arm/numa/numa.c
+++ b/xen/arch/arm/numa/numa.c
@@ -21,13 +21,33 @@
 #include <asm/mm.h>
 #include <xen/numa.h>
 #include <asm/acpi.h>
+#include <xen/errno.h>
+
+static uint8_t (*node_distance_fn)(nodeid_t a, nodeid_t b);
 
 extern nodemask_t processor_nodes_parsed;
 static bool_t dt_numa = 1;
 
+uint8_t __node_distance(nodeid_t a, nodeid_t b)
+{
+    if ( node_distance_fn != NULL )
+        return node_distance_fn(a, b);
+
+    return a == b ? LOCAL_DISTANCE : REMOTE_DISTANCE;
+}
+
+EXPORT_SYMBOL(__node_distance);
+
+void register_node_distance(uint8_t (fn)(nodeid_t a, nodeid_t b))
+{
+    node_distance_fn = fn;
+}
+
 void numa_failed(void)
 {
     dt_numa = 0;
+    init_dt_numa_distance();
+    node_distance_fn = NULL;
 }
 
 void __init numa_init(void)
@@ -35,6 +55,7 @@ void __init numa_init(void)
     int ret = 0;
 
     nodes_clear(processor_nodes_parsed);
+    init_dt_numa_distance();
     if ( is_numa_off() )
         goto no_numa;
 
diff --git a/xen/include/asm-arm/numa.h b/xen/include/asm-arm/numa.h
index 962a214..c390a0e 100644
--- a/xen/include/asm-arm/numa.h
+++ b/xen/include/asm-arm/numa.h
@@ -8,6 +8,9 @@ typedef uint8_t nodeid_t;
 
 extern void dt_numa_process_memory_node(uint32_t nid,paddr_t start,
                                         paddr_t size);
+extern void register_node_distance(uint8_t (fn)(nodeid_t a, nodeid_t b));
+extern void init_dt_numa_distance(void);
+extern uint8_t __node_distance(nodeid_t a, nodeid_t b);
 #ifdef CONFIG_NUMA
 extern void numa_init(void);
 extern int dt_numa_init(void);
diff --git a/xen/include/asm-arm/setup.h b/xen/include/asm-arm/setup.h
index b1022a3..8713657 100644
--- a/xen/include/asm-arm/setup.h
+++ b/xen/include/asm-arm/setup.h
@@ -82,6 +82,8 @@ const char * __init boot_module_kind_as_string(bootmodule_kind kind);
 u32 device_tree_get_u32(const void *fdt, int node,
                         const char *prop_name, u32 dflt);
 bool_t device_tree_type_matches(const void *fdt, int node, const char *match);
+bool_t __init device_tree_node_matches(const void *fdt, int node,
+                                       const char *match);
 #endif
 /*
  * Local variables:
-- 
2.7.4



* [RFC PATCH v2 15/25] ARM: NUMA: Add CPU NUMA support
  2017-03-28 15:53 [RFC PATCH v2 00/25] ARM: Add Xen NUMA support vijay.kilari
                   ` (13 preceding siblings ...)
  2017-03-28 15:53 ` [RFC PATCH v2 14/25] ARM: NUMA: Parse NUMA distance information vijay.kilari
@ 2017-03-28 15:53 ` vijay.kilari
  2017-03-28 15:53 ` [RFC PATCH v2 16/25] ARM: NUMA: Add memory " vijay.kilari
                   ` (9 subsequent siblings)
  24 siblings, 0 replies; 71+ messages in thread
From: vijay.kilari @ 2017-03-28 15:53 UTC (permalink / raw)
  To: xen-devel
  Cc: sstabellini, wei.liu2, George.Dunlap, andrew.cooper3,
	ian.jackson, tim, julien.grall, jbeulich, Vijaya Kumar K

From: Vijaya Kumar K <Vijaya.Kumar@cavium.com>

Initialize cpu_to_node[] to node 0 for every CPU, then update
each CPU's entry with the node id read from its numa-node-id
DT property.

Add macros to access the cpu_to_node[] information.

Signed-off-by: Vijaya Kumar K <Vijaya.Kumar@cavium.com>
---
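For illustration only (not part of the patch): the CPU-to-node
bookkeeping can be sketched in standalone C as below. NCPUS, NNODES,
node_parsed[] and the function names are hypothetical; in the patch
the parsed nodes are tracked in the processor_nodes_parsed nodemask
and the table is written through numa_set_node().

```c
#include <stdbool.h>
#include <stdint.h>

#define NCPUS  8  /* hypothetical, stands in for NR_CPUS */
#define NNODES 4  /* hypothetical, stands in for MAX_NUMNODES */

static uint8_t cpu_node[NCPUS];
static bool node_parsed[NNODES];

/* Early default: every CPU belongs to node 0. */
static void cpu_node_init(void)
{
    for (unsigned int i = 0; i < NCPUS; i++)
        cpu_node[i] = 0;
}

/* Assign cpu to nid, falling back to node 0 for unknown or unparsed nodes. */
static void cpu_node_set(unsigned int cpu, unsigned int nid)
{
    if (cpu >= NCPUS)
        return;

    /* Range check first, so the node_parsed[] lookup stays in bounds. */
    if (nid >= NNODES || !node_parsed[nid])
        nid = 0;

    cpu_node[cpu] = (uint8_t)nid;
}
```

The range check comes before the lookup on purpose: testing
membership with an out-of-range node id would read past the table.
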
 xen/arch/arm/numa/numa.c   | 21 +++++++++++++++++++++
 xen/arch/arm/smpboot.c     | 25 ++++++++++++++++++++++++-
 xen/include/asm-arm/numa.h | 24 ++++++++++++++++++++++++
 3 files changed, 69 insertions(+), 1 deletion(-)

diff --git a/xen/arch/arm/numa/numa.c b/xen/arch/arm/numa/numa.c
index 0ee89da..eef5870 100644
--- a/xen/arch/arm/numa/numa.c
+++ b/xen/arch/arm/numa/numa.c
@@ -28,6 +28,25 @@ static uint8_t (*node_distance_fn)(nodeid_t a, nodeid_t b);
 extern nodemask_t processor_nodes_parsed;
 static bool_t dt_numa = 1;
 
+/*
+ * Setup early cpu_to_node.
+ */
+void __init init_cpu_to_node(void)
+{
+    int i;
+
+    for ( i = 0; i < NR_CPUS; i++ )
+        numa_set_node(i, 0);
+}
+
+void __init numa_set_cpu_node(int cpu, unsigned int nid)
+{
+    if ( nid >= MAX_NUMNODES || !node_isset(nid, processor_nodes_parsed) )
+        nid = 0;
+
+    numa_set_node(cpu, nid);
+}
+
 uint8_t __node_distance(nodeid_t a, nodeid_t b)
 {
     if ( node_distance_fn != NULL )
@@ -48,6 +67,7 @@ void numa_failed(void)
     dt_numa = 0;
     init_dt_numa_distance();
     node_distance_fn = NULL;
+    init_cpu_to_node();
 }
 
 void __init numa_init(void)
@@ -55,6 +75,7 @@ void __init numa_init(void)
     int ret = 0;
 
     nodes_clear(processor_nodes_parsed);
+    init_cpu_to_node();
     init_dt_numa_distance();
     if ( is_numa_off() )
         goto no_numa;
diff --git a/xen/arch/arm/smpboot.c b/xen/arch/arm/smpboot.c
index 32e8722..bf7ddaf 100644
--- a/xen/arch/arm/smpboot.c
+++ b/xen/arch/arm/smpboot.c
@@ -29,6 +29,7 @@
 #include <xen/timer.h>
 #include <xen/irq.h>
 #include <xen/console.h>
+#include <xen/numa.h>
 #include <asm/cpuerrata.h>
 #include <asm/gic.h>
 #include <asm/psci.h>
@@ -106,6 +107,7 @@ static void __init dt_smp_init_cpus(void)
         [0 ... NR_CPUS - 1] = MPIDR_INVALID
     };
     bool_t bootcpu_valid = 0;
+    nodeid_t *cpu_to_nodemap;
     int rc;
 
     mpidr = boot_cpu_data.mpidr.bits & MPIDR_HWID_MASK;
@@ -117,11 +119,18 @@ static void __init dt_smp_init_cpus(void)
         return;
     }
 
+    cpu_to_nodemap = xzalloc_array(nodeid_t, NR_CPUS);
+    if ( !cpu_to_nodemap )
+    {
+        printk(XENLOG_WARNING "Failed to allocate memory for cpu_to_nodemap\n");
+        return;
+    }
+
     dt_for_each_child_node( cpus, cpu )
     {
         const __be32 *prop;
         u64 addr;
-        u32 reg_len;
+        u32 reg_len, nid;
         register_t hwid;
 
         if ( !dt_device_type_is_equal(cpu, "cpu") )
@@ -146,6 +155,15 @@ static void __init dt_smp_init_cpus(void)
             continue;
         }
 
+        if ( !dt_property_read_u32(cpu, "numa-node-id", &nid) )
+        {
+            printk(XENLOG_WARNING "cpu node `%s`: numa-node-id not found\n",
+                   dt_node_full_name(cpu));
+            nid = 0;
+        }
+
+        cpu_to_nodemap[cpuidx] = nid;
+
         addr = dt_read_number(prop, dt_n_addr_cells(cpu));
 
         hwid = addr;
@@ -224,6 +242,7 @@ static void __init dt_smp_init_cpus(void)
     {
         printk(XENLOG_WARNING "DT missing boot CPU MPIDR[23:0]\n"
                "Using only 1 CPU\n");
+        xfree(cpu_to_nodemap);
         return;
     }
 
@@ -233,7 +252,10 @@ static void __init dt_smp_init_cpus(void)
             continue;
         cpumask_set_cpu(i, &cpu_possible_map);
         cpu_logical_map(i) = tmp_map[i];
+        numa_set_cpu_node(i, cpu_to_nodemap[i]);
     }
+
+    xfree(cpu_to_nodemap);
 }
 
 void __init smp_init_cpus(void)
@@ -313,6 +335,7 @@ void start_secondary(unsigned long boot_phys_offset,
      */
     smp_wmb();
 
+    numa_add_cpu(cpuid);
     /* Now report this CPU is up */
     cpumask_set_cpu(cpuid, &cpu_online_map);
 
diff --git a/xen/include/asm-arm/numa.h b/xen/include/asm-arm/numa.h
index c390a0e..65bdd5e 100644
--- a/xen/include/asm-arm/numa.h
+++ b/xen/include/asm-arm/numa.h
@@ -14,12 +14,36 @@ extern uint8_t __node_distance(nodeid_t a, nodeid_t b);
 #ifdef CONFIG_NUMA
 extern void numa_init(void);
 extern int dt_numa_init(void);
+extern void numa_set_cpu_node(int cpu, unsigned int nid);
+extern void numa_add_cpu(int cpu);
+
+extern nodeid_t      cpu_to_node[NR_CPUS];
+extern cpumask_t     node_to_cpumask[];
+/* Simple perfect hash to map pdx to node numbers */
+extern unsigned int memnode_shift;
+extern uint8_t *memnodemap;
+
+#define cpu_to_node(cpu)         (cpu_to_node[cpu])
+#define parent_node(node)        (node)
+#define node_to_first_cpu(node)  (__ffs(node_to_cpumask[node]))
+#define node_to_cpumask(node)    (node_to_cpumask[node])
+
 #else
 static inline void numa_init(void)
 {
     return;
 }
 
+static inline void numa_set_cpu_node(int cpu, unsigned int nid)
+{
+    return;
+}
+
+static inline void numa_add_cpu(int cpu)
+{
+     return;
+}
+
 /* Fake one node for now. See also node_online_map. */
 #define cpu_to_node(cpu) 0
 #define node_to_cpumask(node)   (cpu_online_map)
-- 
2.7.4



* [RFC PATCH v2 16/25] ARM: NUMA: Add memory NUMA support
  2017-03-28 15:53 [RFC PATCH v2 00/25] ARM: Add Xen NUMA support vijay.kilari
                   ` (14 preceding siblings ...)
  2017-03-28 15:53 ` [RFC PATCH v2 15/25] ARM: NUMA: Add CPU NUMA support vijay.kilari
@ 2017-03-28 15:53 ` vijay.kilari
  2017-03-28 15:53 ` [RFC PATCH v2 17/25] ARM: NUMA: Add fallback on NUMA failure vijay.kilari
                   ` (8 subsequent siblings)
  24 siblings, 0 replies; 71+ messages in thread
From: vijay.kilari @ 2017-03-28 15:53 UTC (permalink / raw)
  To: xen-devel
  Cc: sstabellini, wei.liu2, George.Dunlap, andrew.cooper3,
	ian.jackson, tim, julien.grall, jbeulich, Vijaya Kumar K

From: Vijaya Kumar K <Vijaya.Kumar@cavium.com>

Implement arch_sanitize_nodes_memory(), which walks all banks
in bootinfo.mem and updates nodes[] with the address range
covered by the node of each bank.
Call the generic numa_scan_nodes() function with the RAM start
and end addresses; it computes memnode_shift and populates
memnodemap[] using the generic implementation.

Signed-off-by: Vijaya Kumar K <Vijaya.Kumar@cavium.com>
---
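For illustration only (not part of the patch): merging memory banks
into one start/end span per node can be sketched in standalone C as
below. struct bank, struct span and node_spans() are hypothetical; in
particular each bank carries its node id directly here, whereas the
patch looks it up with get_mem_nodeid().

```c
#include <stdbool.h>
#include <stdint.h>

#define NNODES 4  /* hypothetical, stands in for MAX_NUMNODES */

struct bank {
    uint64_t start, size;
    unsigned int nid;
};

struct span {
    uint64_t start, end;
    bool seen;
};

/* Compute, for every node, the smallest range covering all of its banks. */
static void node_spans(const struct bank *banks, int n, struct span *out)
{
    for (int i = 0; i < NNODES; i++) {
        out[i].start = 0;
        out[i].end = 0;
        out[i].seen = false;
    }

    for (int i = 0; i < n; i++) {
        uint64_t start = banks[i].start;
        uint64_t end = start + banks[i].size;
        struct span *s;

        if (banks[i].nid >= NNODES)
            continue;  /* ignore banks with an unknown node id */

        s = &out[banks[i].nid];
        if (!s->seen) {
            /* First bank of this node: take its range as is. */
            s->start = start;
            s->end = end;
            s->seen = true;
        } else {
            /* Later banks only widen the span. */
            if (start < s->start)
                s->start = start;
            if (end > s->end)
                s->end = end;
        }
    }
}
```

Note that a span may include holes or memory of another node when the
banks of a node are not contiguous; the sketch only tracks the outer
bounds.
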
 xen/arch/arm/numa/numa.c   | 79 +++++++++++++++++++++++++++++++++++++++++++++-
 xen/common/numa.c          | 14 ++++++++
 xen/include/asm-arm/numa.h | 19 +++++++++++
 xen/include/xen/numa.h     |  1 +
 4 files changed, 112 insertions(+), 1 deletion(-)

diff --git a/xen/arch/arm/numa/numa.c b/xen/arch/arm/numa/numa.c
index eef5870..7583a40 100644
--- a/xen/arch/arm/numa/numa.c
+++ b/xen/arch/arm/numa/numa.c
@@ -70,9 +70,74 @@ void numa_failed(void)
     init_cpu_to_node();
 }
 
+int __init arch_sanitize_nodes_memory(void)
+{
+    nodemask_t mem_nodes_parsed;
+    int bank, nodeid;
+    struct node *nd;
+    paddr_t start, size, end;
+
+    nodes_clear(mem_nodes_parsed);
+    for ( bank = 0 ; bank < bootinfo.mem.nr_banks; bank++ )
+    {
+        start = bootinfo.mem.bank[bank].start;
+        size = bootinfo.mem.bank[bank].size;
+        end = start + size;
+
+        nodeid = get_mem_nodeid(start, end);
+        if ( nodeid < 0 || nodeid >= NUMA_NO_NODE )
+        {
+            printk(XENLOG_WARNING
+                   "NUMA: node for mem bank start 0x%lx - 0x%lx not found\n",
+                   start, end);
+
+            return 0;
+        }
+
+        nd = get_numa_node(nodeid);
+        if ( !node_test_and_set(nodeid, mem_nodes_parsed) )
+        {
+            nd->start = start;
+            nd->end = end;
+        }
+        else
+        {
+            if ( start < nd->start )
+                nd->start = start;
+            if ( nd->end < end )
+                nd->end = end;
+        }
+    }
+
+    return 1;
+}
+
+static bool_t __init numa_initmem_init(paddr_t ram_start, paddr_t ram_end)
+{
+    int i;
+    struct node *nd;
+    /*
+     * arch_sanitize_nodes_memory() fills in nodes[] with the proper ranges.
+     * Hence reset nodes[] before calling numa_scan_nodes().
+     */
+    for ( i = 0; i < MAX_NUMNODES; i++ )
+    {
+        nd = get_numa_node(i);
+        nd->start = 0;
+        nd->end = 0;
+    }
+
+    if ( !numa_scan_nodes(ram_start, ram_end) )
+            return 0;
+
+    return 1;
+}
+
 void __init numa_init(void)
 {
-    int ret = 0;
+    int ret = 0, bank;
+    paddr_t ram_start = ~0;
+    paddr_t ram_end = 0;
 
     nodes_clear(processor_nodes_parsed);
     init_cpu_to_node();
@@ -87,6 +152,18 @@ void __init numa_init(void)
     if ( ret )
         printk(XENLOG_WARNING "DT NUMA init failed\n");
 
+    for ( bank = 0 ; bank < bootinfo.mem.nr_banks; bank++ )
+    {
+        paddr_t bank_start = bootinfo.mem.bank[bank].start;
+        paddr_t bank_end = bank_start + bootinfo.mem.bank[bank].size;
+
+        ram_start = min(ram_start, bank_start);
+        ram_end = max(ram_end, bank_end);
+    }
+
+    if ( !ret )
+        ret = numa_initmem_init(ram_start, ram_end);
+
 no_numa:
     return;
 }
diff --git a/xen/common/numa.c b/xen/common/numa.c
index 1789bba..f2ac726 100644
--- a/xen/common/numa.c
+++ b/xen/common/numa.c
@@ -84,6 +84,20 @@ nodeid_t get_memblk_nodeid(int id)
     return memblk_nodeid[id];
 }
 
+int __init get_mem_nodeid(paddr_t start, paddr_t end)
+{
+    unsigned int i;
+
+    for ( i = 0; i < get_num_node_memblks(); i++ )
+    {
+        if ( start >= node_memblk_range[i].start &&
+             end <= node_memblk_range[i].end )
+            return memblk_nodeid[i];
+    }
+
+    return -EINVAL;
+}
+
 nodeid_t *get_memblk_nodeid_map(void)
 {
     return &memblk_nodeid[0];
diff --git a/xen/include/asm-arm/numa.h b/xen/include/asm-arm/numa.h
index 65bdd5e..85fbbe8 100644
--- a/xen/include/asm-arm/numa.h
+++ b/xen/include/asm-arm/numa.h
@@ -1,6 +1,8 @@
 #ifndef __ARCH_ARM_NUMA_H
 #define __ARCH_ARM_NUMA_H
 
+#include <xen/mm.h>
+
 typedef uint8_t nodeid_t;
 
 /* Limit number of NUMA nodes supported to 4 */
@@ -28,6 +30,23 @@ extern uint8_t *memnodemap;
 #define node_to_first_cpu(node)  (__ffs(node_to_cpumask[node]))
 #define node_to_cpumask(node)    (node_to_cpumask[node])
 
+static inline __attribute__((pure)) nodeid_t phys_to_nid(paddr_t addr)
+{
+    return memnodemap[paddr_to_pdx(addr) >> memnode_shift];
+}
+
+struct node_data {
+    unsigned long node_start_pfn;
+    unsigned long node_spanned_pages;
+};
+
+extern struct node_data node_data[];
+#define NODE_DATA(nid)          (&(node_data[nid]))
+
+#define node_start_pfn(nid)     (NODE_DATA(nid)->node_start_pfn)
+#define node_spanned_pages(nid) (NODE_DATA(nid)->node_spanned_pages)
+#define node_end_pfn(nid)       (NODE_DATA(nid)->node_start_pfn + \
+                                 NODE_DATA(nid)->node_spanned_pages)
 #else
 static inline void numa_init(void)
 {
diff --git a/xen/include/xen/numa.h b/xen/include/xen/numa.h
index ee53526..b40a841 100644
--- a/xen/include/xen/numa.h
+++ b/xen/include/xen/numa.h
@@ -38,6 +38,7 @@ extern nodeid_t *get_memblk_nodeid_map(void);
 extern struct node *get_node_memblk_range(int memblk);
 extern struct node *get_memblk(int memblk);
 extern int numa_add_memblk(nodeid_t nodeid, paddr_t start, uint64_t size);
+extern int get_mem_nodeid(paddr_t start, paddr_t end);
 extern int get_num_node_memblks(void);
 extern int arch_sanitize_nodes_memory(void);
 extern void numa_failed(void);
-- 
2.7.4



* [RFC PATCH v2 17/25] ARM: NUMA: Add fallback on NUMA failure
  2017-03-28 15:53 [RFC PATCH v2 00/25] ARM: Add Xen NUMA support vijay.kilari
                   ` (15 preceding siblings ...)
  2017-03-28 15:53 ` [RFC PATCH v2 16/25] ARM: NUMA: Add memory " vijay.kilari
@ 2017-03-28 15:53 ` vijay.kilari
  2017-03-28 15:53 ` [RFC PATCH v2 18/25] ARM: NUMA: Do not expose numa info to DOM0 vijay.kilari
                   ` (7 subsequent siblings)
  24 siblings, 0 replies; 71+ messages in thread
From: vijay.kilari @ 2017-03-28 15:53 UTC (permalink / raw)
  To: xen-devel
  Cc: sstabellini, wei.liu2, George.Dunlap, andrew.cooper3,
	ian.jackson, tim, julien.grall, jbeulich, Vijaya Kumar K

From: Vijaya Kumar K <Vijaya.Kumar@cavium.com>

On NUMA initialization failure, reset all the NUMA
structures so that the system is treated as a single node.

Signed-off-by: Vijaya Kumar <Vijaya.Kumar@cavium.com>
---
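For illustration only (not part of the patch): the fallback passes
the RAM range as page frame numbers, rounding the start up and the
end down so that only whole pages are covered. A standalone sketch of
that rounding follows, assuming 4 KiB pages; pfn_up(), pfn_down() and
single_node_span() are hypothetical names, and what numa_dummy_init()
does with the range is not shown here.

```c
#include <stdint.h>

#define PAGE_SHIFT 12                        /* assumes 4 KiB pages */
#define PAGE_SIZE  ((uint64_t)1 << PAGE_SHIFT)

/* Round an address up to the next page frame number. */
static uint64_t pfn_up(uint64_t addr)
{
    return (addr + PAGE_SIZE - 1) >> PAGE_SHIFT;
}

/* Round an address down to its page frame number. */
static uint64_t pfn_down(uint64_t addr)
{
    return addr >> PAGE_SHIFT;
}

struct pfn_range {
    uint64_t start_pfn, end_pfn;
};

/* Page range for a single fallback node covering [ram_start, ram_end). */
static struct pfn_range single_node_span(uint64_t ram_start, uint64_t ram_end)
{
    struct pfn_range r;

    r.start_pfn = pfn_up(ram_start);
    r.end_pfn = pfn_down(ram_end);
    return r;
}
```

Partial pages at either end of RAM fall outside the resulting range.
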
 xen/arch/arm/numa/numa.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/xen/arch/arm/numa/numa.c b/xen/arch/arm/numa/numa.c
index 7583a40..891d304 100644
--- a/xen/arch/arm/numa/numa.c
+++ b/xen/arch/arm/numa/numa.c
@@ -22,6 +22,7 @@
 #include <xen/numa.h>
 #include <asm/acpi.h>
 #include <xen/errno.h>
+#include <xen/pfn.h>
 
 static uint8_t (*node_distance_fn)(nodeid_t a, nodeid_t b);
 
@@ -164,7 +165,12 @@ void __init numa_init(void)
     if ( !ret )
         ret = numa_initmem_init(ram_start, ram_end);
 
+    if ( !ret )
+        return;
+
 no_numa:
+    numa_dummy_init(PFN_UP(ram_start), PFN_DOWN(ram_end));
+
     return;
 }
 
-- 
2.7.4



* [RFC PATCH v2 18/25] ARM: NUMA: Do not expose numa info to DOM0
  2017-03-28 15:53 [RFC PATCH v2 00/25] ARM: Add Xen NUMA support vijay.kilari
                   ` (16 preceding siblings ...)
  2017-03-28 15:53 ` [RFC PATCH v2 17/25] ARM: NUMA: Add fallback on NUMA failure vijay.kilari
@ 2017-03-28 15:53 ` vijay.kilari
  2017-03-28 15:53 ` [RFC PATCH v2 19/25] ACPI: Refactor acpi SRAT and SLIT table handling code vijay.kilari
                   ` (6 subsequent siblings)
  24 siblings, 0 replies; 71+ messages in thread
From: vijay.kilari @ 2017-03-28 15:53 UTC (permalink / raw)
  To: xen-devel
  Cc: sstabellini, wei.liu2, George.Dunlap, andrew.cooper3,
	ian.jackson, tim, julien.grall, jbeulich, Vijaya Kumar K

From: Vijaya Kumar K <Vijaya.Kumar@cavium.com>

Delete the numa-node-id property and the distance-map node
from the DOM0 DT so that NUMA information is not exposed to
DOM0. In particular, this lets Node 1 devices boot as if they
were on Node 0.

However, this approach has a limitation for devices whose
memory allocation should be node-local.

Signed-off-by: Vijaya Kumar <Vijaya.Kumar@cavium.com>
---
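For illustration only (not part of the patch): hiding selected
properties while copying a device tree node can be sketched in
standalone C as below. hidden_props[], is_hidden() and filter_props()
are hypothetical; the patch does the equivalent check inside
write_properties() with dt_property_name_is_equal().

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Property names withheld from the guest DT in this sketch. */
static const char *const hidden_props[] = { "numa-node-id", "xen,passthrough" };

static bool is_hidden(const char *name)
{
    for (size_t i = 0; i < sizeof(hidden_props) / sizeof(hidden_props[0]); i++)
        if (strcmp(name, hidden_props[i]) == 0)
            return true;
    return false;
}

/* Copy the names that may be exposed into out[]; return how many were kept. */
static size_t filter_props(const char *const *in, size_t n, const char **out)
{
    size_t kept = 0;

    for (size_t i = 0; i < n; i++)
        if (!is_hidden(in[i]))
            out[kept++] = in[i];
    return kept;
}
```

The distance-map node is handled differently in the patch: the whole
node is skipped by matching its compatible string, not filtered
property by property.
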
 xen/arch/arm/domain_build.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
index de59e5f..4a7e645 100644
--- a/xen/arch/arm/domain_build.c
+++ b/xen/arch/arm/domain_build.c
@@ -421,6 +421,10 @@ static int write_properties(struct domain *d, struct kernel_info *kinfo,
             }
         }
 
+        /* Don't expose the property "numa-node-id" to the guest */
+        if ( dt_property_name_is_equal(prop, "numa-node-id") )
+            continue;
+
         /* Don't expose the property "xen,passthrough" to the guest */
         if ( dt_property_name_is_equal(prop, "xen,passthrough") )
             continue;
@@ -1173,6 +1177,11 @@ static int handle_node(struct domain *d, struct kernel_info *kinfo,
         DT_MATCH_TYPE("memory"),
         /* The memory mapped timer is not supported by Xen. */
         DT_MATCH_COMPATIBLE("arm,armv7-timer-mem"),
+        /*
+         * NUMA info is not exposed to Dom0.
+         * So, skip distance-map information
+         */
+        DT_MATCH_COMPATIBLE("numa-distance-map-v1"),
         { /* sentinel */ },
     };
     static const struct dt_device_match timer_matches[] __initconst =
-- 
2.7.4



* [RFC PATCH v2 19/25] ACPI: Refactor acpi SRAT and SLIT table handling code
  2017-03-28 15:53 [RFC PATCH v2 00/25] ARM: Add Xen NUMA support vijay.kilari
                   ` (17 preceding siblings ...)
  2017-03-28 15:53 ` [RFC PATCH v2 18/25] ARM: NUMA: Do not expose numa info to DOM0 vijay.kilari
@ 2017-03-28 15:53 ` vijay.kilari
  2017-03-28 15:53 ` [RFC PATCH v2 20/25] ARM: NUMA: Extract MPIDR from MADT table vijay.kilari
                   ` (5 subsequent siblings)
  24 siblings, 0 replies; 71+ messages in thread
From: vijay.kilari @ 2017-03-28 15:53 UTC (permalink / raw)
  To: xen-devel
  Cc: sstabellini, wei.liu2, George.Dunlap, andrew.cooper3,
	ian.jackson, tim, julien.grall, jbeulich, Vijaya Kumar K

From: Vijaya Kumar K <Vijaya.Kumar@cavium.com>

Move the SRAT handling code that is common across
architectures from xen/arch/x86/srat.c to the new file
xen/drivers/acpi/srat.c. A new header file, srat.h, is
introduced.

Signed-off-by: Vijaya Kumar K <Vijaya.Kumar@cavium.com>
---
 xen/arch/x86/dom0_build.c           |   1 +
 xen/arch/x86/mm.c                   |   2 -
 xen/arch/x86/physdev.c              |   1 +
 xen/arch/x86/setup.c                |   1 +
 xen/arch/x86/smpboot.c              |   1 +
 xen/arch/x86/srat.c                 | 250 +-----------------------------
 xen/arch/x86/x86_64/mm.c            |   1 +
 xen/drivers/acpi/Makefile           |   1 +
 xen/drivers/acpi/srat.c             | 299 ++++++++++++++++++++++++++++++++++++
 xen/drivers/passthrough/vtd/iommu.c |   1 +
 xen/include/acpi/srat.h             |  24 +++
 xen/include/asm-x86/mm.h            |   1 -
 xen/include/asm-x86/numa.h          |   4 -
 xen/include/xen/mm.h                |   2 +
 xen/include/xen/numa.h              |   1 -
 15 files changed, 333 insertions(+), 257 deletions(-)

diff --git a/xen/arch/x86/dom0_build.c b/xen/arch/x86/dom0_build.c
index 20221b5..c131a81 100644
--- a/xen/arch/x86/dom0_build.c
+++ b/xen/arch/x86/dom0_build.c
@@ -11,6 +11,7 @@
 #include <xen/sched.h>
 #include <xen/sched-if.h>
 #include <xen/softirq.h>
+#include <acpi/srat.h>
 
 #include <asm/dom0_build.h>
 #include <asm/hpet.h>
diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index a6b2649..ebabb0c 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -137,8 +137,6 @@ l1_pgentry_t __section(".bss.page_aligned") __aligned(PAGE_SIZE)
 #define PTE_UPDATE_WITH_CMPXCHG
 #endif
 
-paddr_t __read_mostly mem_hotplug;
-
 /* Private domain structs for DOMID_XEN and DOMID_IO. */
 struct domain *dom_xen, *dom_io, *dom_cow;
 
diff --git a/xen/arch/x86/physdev.c b/xen/arch/x86/physdev.c
index 81cd6c9..ecc0daf 100644
--- a/xen/arch/x86/physdev.c
+++ b/xen/arch/x86/physdev.c
@@ -8,6 +8,7 @@
 #include <xen/guest_access.h>
 #include <xen/iocap.h>
 #include <xen/serial.h>
+#include <acpi/srat.h>
 #include <asm/current.h>
 #include <asm/io_apic.h>
 #include <asm/msi.h>
diff --git a/xen/arch/x86/setup.c b/xen/arch/x86/setup.c
index 4410e53..d29fd1a 100644
--- a/xen/arch/x86/setup.c
+++ b/xen/arch/x86/setup.c
@@ -27,6 +27,7 @@
 #include <xen/tmem_xen.h>
 #include <xen/virtual_region.h>
 #include <xen/watchdog.h>
+#include <acpi/srat.h>
 #include <public/version.h>
 #include <compat/platform.h>
 #include <compat/xen.h>
diff --git a/xen/arch/x86/smpboot.c b/xen/arch/x86/smpboot.c
index 203733e..7dc06e4 100644
--- a/xen/arch/x86/smpboot.c
+++ b/xen/arch/x86/smpboot.c
@@ -33,6 +33,7 @@
 #include <xen/serial.h>
 #include <xen/numa.h>
 #include <xen/cpu.h>
+#include <acpi/srat.h>
 #include <asm/current.h>
 #include <asm/mc146818rtc.h>
 #include <asm/desc.h>
diff --git a/xen/arch/x86/srat.c b/xen/arch/x86/srat.c
index 55947bb..760df7f 100644
--- a/xen/arch/x86/srat.c
+++ b/xen/arch/x86/srat.c
@@ -18,14 +18,12 @@
 #include <xen/acpi.h>
 #include <xen/numa.h>
 #include <xen/pfn.h>
+#include <acpi/srat.h>
 #include <asm/e820.h>
 #include <asm/page.h>
 
-static struct acpi_table_slit *__read_mostly acpi_slit;
-
 extern nodemask_t processor_nodes_parsed;
 extern nodemask_t memory_nodes_parsed;
-
 /*
  * Keep BIOS's CPU2node information, should not be used for memory allocaion
  */
@@ -33,87 +31,6 @@ nodeid_t apicid_to_node[MAX_LOCAL_APIC] = {
     [0 ... MAX_LOCAL_APIC-1] = NUMA_NO_NODE
 };
 
-struct pxm2node {
-	unsigned int pxm;
-	nodeid_t node;
-};
-static struct pxm2node __read_mostly pxm2node[MAX_NUMNODES] =
-	{ [0 ... MAX_NUMNODES - 1] = {.node = NUMA_NO_NODE} };
-
-static unsigned node_to_pxm(nodeid_t n);
-
-static __initdata DECLARE_BITMAP(memblk_hotplug, NR_NODE_MEMBLKS);
-
-static inline bool node_found(unsigned int idx, unsigned int pxm)
-{
-	return ((pxm2node[idx].pxm == pxm) &&
-		(pxm2node[idx].node != NUMA_NO_NODE));
-}
-
-static void reset_pxm2node(void)
-{
-	unsigned int i;
-
-	for (i = 0; i < ARRAY_SIZE(pxm2node); i++)
-		pxm2node[i].node = NUMA_NO_NODE;
-}
-
-nodeid_t pxm_to_node(unsigned int pxm)
-{
-	unsigned int i;
-
-	if ((pxm < ARRAY_SIZE(pxm2node)) && node_found(pxm, pxm))
-		return pxm2node[pxm].node;
-
-	for (i = 0; i < ARRAY_SIZE(pxm2node); i++)
-		if (node_found(i, pxm))
-			return pxm2node[i].node;
-
-	return NUMA_NO_NODE;
-}
-
-nodeid_t acpi_setup_node(unsigned int pxm)
-{
-	nodeid_t node;
-	unsigned int idx;
-	static bool warned;
-	static unsigned int nodes_found;
-
-	BUILD_BUG_ON(MAX_NUMNODES >= NUMA_NO_NODE);
-
-	if (pxm < ARRAY_SIZE(pxm2node)) {
-		if (node_found(pxm, pxm))
-			return pxm2node[pxm].node;
-
-		/* Try to maintain indexing of pxm2node by pxm */
-		if (pxm2node[pxm].node == NUMA_NO_NODE) {
-			idx = pxm;
-			goto finish;
-		}
-	}
-
-	for (idx = 0; idx < ARRAY_SIZE(pxm2node); idx++)
-		if (pxm2node[idx].node == NUMA_NO_NODE)
-			goto finish;
-
-	if (!warned) {
-		printk(KERN_WARNING "SRAT: Too many proximity domains (%#x)\n",
-		       pxm);
-		warned = 1;
-	}
-
-	return NUMA_NO_NODE;
-
- finish:
-	node = nodes_found++;
-	if (node >= MAX_NUMNODES)
-		return NUMA_NO_NODE;
-	pxm2node[idx].pxm = pxm;
-	pxm2node[idx].node = node;
-
-	return node;
-}
-
 void __init numa_failed(void)
 {
 	int i;
@@ -125,48 +42,6 @@ void __init numa_failed(void)
 	mem_hotplug = 0;
 }
 
-/*
- * A lot of BIOS fill in 10 (= no distance) everywhere. This messes
- * up the NUMA heuristics which wants the local node to have a smaller
- * distance than the others.
- * Do some quick checks here and only use the SLIT if it passes.
- */
-static int __init slit_valid(struct acpi_table_slit *slit)
-{
-	int i, j;
-	int d = slit->locality_count;
-	for (i = 0; i < d; i++) {
-		for (j = 0; j < d; j++)  {
-			uint8_t val = slit->entry[d*i + j];
-			if (i == j) {
-				if (val != LOCAL_DISTANCE)
-					return 0;
-			} else if (val <= LOCAL_DISTANCE)
-				return 0;
-		}
-	}
-	return 1;
-}
-
-/* Callback for SLIT parsing */
-void __init acpi_numa_slit_init(struct acpi_table_slit *slit)
-{
-	unsigned long mfn;
-	if (!slit_valid(slit)) {
-		printk(KERN_INFO "ACPI: SLIT table looks invalid. "
-		       "Not used.\n");
-		return;
-	}
-	mfn = alloc_boot_pages(PFN_UP(slit->header.length), 1);
-	if (!mfn) {
-		printk(KERN_ERR "ACPI: Unable to allocate memory for "
-		       "saving ACPI SLIT numa information.\n");
-		return;
-	}
-	acpi_slit = mfn_to_virt(mfn);
-	memcpy(acpi_slit, slit, slit->header.length);
-}
-
 /* Callback for Proximity Domain -> x2APIC mapping */
 void __init
 acpi_numa_x2apic_affinity_init(const struct acpi_srat_x2apic_cpu_affinity *pa)
@@ -234,100 +109,6 @@ acpi_numa_processor_affinity_init(const struct acpi_srat_cpu_affinity *pa)
 	       pxm, pa->apic_id, node);
 }
 
-/* Callback for parsing of the Proximity Domain <-> Memory Area mappings */
-void __init
-acpi_numa_memory_affinity_init(const struct acpi_srat_mem_affinity *ma)
-{
-	uint64_t start, end;
-	unsigned pxm;
-	nodeid_t node;
-	int i;
-	struct node *memblk;
-
-	if (srat_disabled())
-		return;
-	if (ma->header.length != sizeof(struct acpi_srat_mem_affinity)) {
-		numa_failed();
-		return;
-	}
-	if (!(ma->flags & ACPI_SRAT_MEM_ENABLED))
-		return;
-
-	if (get_num_node_memblks() >= NR_NODE_MEMBLKS)
-	{
-		dprintk(XENLOG_WARNING,
-                "Too many numa entry, try bigger NR_NODE_MEMBLKS \n");
-		numa_failed();
-		return;
-	}
-
-	start = ma->base_address;
-	end = start + ma->length;
-	pxm = ma->proximity_domain;
-	if (srat_rev < 2)
-		pxm &= 0xff;
-	node = acpi_setup_node(pxm);
-	if (node == NUMA_NO_NODE) {
-		numa_failed();
-		return;
-	}
-	/* It is fine to add this area to the nodes data it will be used later*/
-	i = conflicting_memblks(start, end);
-	if (i < 0)
-		/* everything fine */;
-	else if (get_memblk_nodeid(i) == node) {
-		bool mismatch = !(ma->flags & ACPI_SRAT_MEM_HOT_PLUGGABLE) !=
-		                !test_bit(i, memblk_hotplug);
-
-		memblk = get_node_memblk_range(i);
-
-		printk("%sSRAT: PXM %u (%"PRIx64"-%"PRIx64") overlaps with itself (%"PRIx64"-%"PRIx64")\n",
-		       mismatch ? KERN_ERR : KERN_WARNING, pxm, start, end,
-		       memblk->start, memblk->end);
-		if (mismatch) {
-			numa_failed();
-			return;
-		}
-	} else {
-		memblk = get_node_memblk_range(i);
-
-		printk(KERN_ERR
-		       "SRAT: PXM %u (%"PRIx64"-%"PRIx64") overlaps with PXM %u (%"PRIx64"-%"PRIx64")\n",
-		       pxm, start, end, node_to_pxm(get_memblk_nodeid(i)),
-		       memblk->start, memblk->end);
-		numa_failed();
-		return;
-	}
-	if (!(ma->flags & ACPI_SRAT_MEM_HOT_PLUGGABLE)) {
-		struct node *nd = get_numa_node(node);
-
-		if (!node_test_and_set(node, memory_nodes_parsed)) {
-			nd->start = start;
-			nd->end = end;
-		} else {
-			if (start < nd->start)
-				nd->start = start;
-			if (nd->end < end)
-				nd->end = end;
-		}
-	}
-	printk(KERN_INFO "SRAT: Node %u PXM %u %"PRIx64"-%"PRIx64"%s\n",
-	       node, pxm, start, end,
-	       ma->flags & ACPI_SRAT_MEM_HOT_PLUGGABLE ? " (hotplug)" : "");
-
-	if (numa_add_memblk(node, start, ma->length)) {
-		printk(KERN_ERR "SRAT: node-id %u out of range\n", node);
-		numa_failed();
-		return;
-	}
-
-	if (ma->flags & ACPI_SRAT_MEM_HOT_PLUGGABLE) {
-		__set_bit(get_num_node_memblks(), memblk_hotplug);
-		if (end > mem_hotplug)
-			mem_hotplug = end;
-	}
-}
-
 /* Sanity check to catch more bad SRATs (they are amazingly common).
    Make sure the PXMs cover all memory. */
 int __init arch_sanitize_nodes_memory(void)
@@ -427,35 +208,6 @@ void __init srat_parse_regions(uint64_t addr)
 	pfn_pdx_hole_setup(mask >> PAGE_SHIFT);
 }
 
-static unsigned node_to_pxm(nodeid_t n)
-{
-	unsigned i;
-
-	if ((n < ARRAY_SIZE(pxm2node)) && (pxm2node[n].node == n))
-		return pxm2node[n].pxm;
-	for (i = 0; i < ARRAY_SIZE(pxm2node); i++)
-		if (pxm2node[i].node == n)
-			return pxm2node[i].pxm;
-	return 0;
-}
-
-static uint8_t acpi_node_distance(nodeid_t a, nodeid_t b)
-{
-	unsigned index;
-	uint8_t slit_val;
-
-	if (!acpi_slit)
-		return a == b ? LOCAL_DISTANCE : REMOTE_DISTANCE;
-	index = acpi_slit->locality_count * node_to_pxm(a);
-	slit_val = acpi_slit->entry[index + node_to_pxm(b)];
-
-	/* ACPI defines 0xff as an unreachable node and 0-9 are undefined */
-	if ((slit_val == 0xff) || (slit_val <= 9))
-		return NUMA_NO_DISTANCE;
-	else
-		return slit_val;
-}
-
 uint8_t __node_distance(nodeid_t a, nodeid_t b)
 {
 	return acpi_node_distance(a, b);
diff --git a/xen/arch/x86/x86_64/mm.c b/xen/arch/x86/x86_64/mm.c
index f0082e1..8d5fd4e 100644
--- a/xen/arch/x86/x86_64/mm.c
+++ b/xen/arch/x86/x86_64/mm.c
@@ -27,6 +27,7 @@ asm(".file \"" __FILE__ "\"");
 #include <xen/guest_access.h>
 #include <xen/hypercall.h>
 #include <xen/mem_access.h>
+#include <acpi/srat.h>
 #include <asm/current.h>
 #include <asm/asm_defns.h>
 #include <asm/page.h>
diff --git a/xen/drivers/acpi/Makefile b/xen/drivers/acpi/Makefile
index 444b11d..69edc26 100644
--- a/xen/drivers/acpi/Makefile
+++ b/xen/drivers/acpi/Makefile
@@ -4,6 +4,7 @@ subdir-$(CONFIG_X86) += apei
 
 obj-bin-y += tables.init.o
 obj-$(CONFIG_NUMA) += numa.o
+obj-$(CONFIG_NUMA) += srat.o
 obj-y += osl.o
 obj-$(CONFIG_HAS_CPUFREQ) += pmstat.o
 
diff --git a/xen/drivers/acpi/srat.c b/xen/drivers/acpi/srat.c
new file mode 100644
index 0000000..9a68a4b
--- /dev/null
+++ b/xen/drivers/acpi/srat.c
@@ -0,0 +1,299 @@
+/*
+ * ACPI 3.0 based NUMA setup
+ * Copyright 2004 Andi Kleen, SuSE Labs.
+ *
+ * Reads the ACPI SRAT table to figure out what memory belongs to which CPUs.
+ *
+ * Called from acpi_numa_init while reading the SRAT and SLIT tables.
+ * Assumes all memory regions belonging to a single proximity domain
+ * are in one chunk. Holes between them will be included in the node.
+ *
+ * Adapted for Xen: Ryan Harper <ryanh@us.ibm.com>
+ */
+
+#include <xen/init.h>
+#include <xen/mm.h>
+#include <xen/inttypes.h>
+#include <xen/nodemask.h>
+#include <xen/acpi.h>
+#include <xen/numa.h>
+#include <xen/pfn.h>
+#include <acpi/srat.h>
+#include <asm/page.h>
+#include <asm/acpi.h>
+
+paddr_t __read_mostly mem_hotplug;
+extern nodemask_t memory_nodes_parsed;
+static struct acpi_table_slit __read_mostly *acpi_slit;
+static struct pxm2node __read_mostly pxm2node[MAX_NUMNODES] =
+    { [0 ... MAX_NUMNODES - 1] = {.node = NUMA_NO_NODE} };
+
+static __initdata DECLARE_BITMAP(memblk_hotplug, NR_NODE_MEMBLKS);
+
+static inline bool_t node_found(unsigned int idx, unsigned int pxm)
+{
+    return ( (pxm2node[idx].pxm == pxm) &&
+        (pxm2node[idx].node != NUMA_NO_NODE) );
+}
+
+void reset_pxm2node(void)
+{
+    unsigned int i;
+
+    for ( i = 0; i < ARRAY_SIZE(pxm2node); i++ )
+        pxm2node[i].node = NUMA_NO_NODE;
+}
+
+unsigned node_to_pxm(nodeid_t n)
+{
+    unsigned int i;
+
+    if ( (n < ARRAY_SIZE(pxm2node)) && (pxm2node[n].node == n) )
+        return pxm2node[n].pxm;
+
+    for ( i = 0; i < ARRAY_SIZE(pxm2node); i++ )
+        if ( pxm2node[i].node == n )
+            return pxm2node[i].pxm;
+
+    return 0;
+}
+
+nodeid_t pxm_to_node(unsigned int pxm)
+{
+    unsigned int i;
+
+    if ( (pxm < ARRAY_SIZE(pxm2node)) && node_found(pxm, pxm) )
+        return pxm2node[pxm].node;
+
+    for ( i = 0; i < ARRAY_SIZE(pxm2node); i++ )
+        if ( node_found(i, pxm) )
+            return pxm2node[i].node;
+
+    return NUMA_NO_NODE;
+}
+
+nodeid_t acpi_setup_node(unsigned int pxm)
+{
+    nodeid_t node;
+    unsigned int idx;
+    static bool_t warned;
+    static unsigned int nodes_found;
+
+    BUILD_BUG_ON(MAX_NUMNODES >= NUMA_NO_NODE);
+
+    if ( pxm < ARRAY_SIZE(pxm2node) )
+    {
+        if ( node_found(pxm, pxm) )
+            return pxm2node[pxm].node;
+
+        /* Try to maintain indexing of pxm2node by pxm */
+        if ( pxm2node[pxm].node == NUMA_NO_NODE )
+        {
+            idx = pxm;
+            goto finish;
+        }
+    }
+
+    for ( idx = 0; idx < ARRAY_SIZE(pxm2node); idx++ )
+        if ( pxm2node[idx].node == NUMA_NO_NODE )
+            goto finish;
+
+    if ( !warned )
+    {
+        printk(KERN_WARNING "SRAT: Too many proximity domains (%#x)\n", pxm);
+        warned = 1;
+    }
+
+    return NUMA_NO_NODE;
+
+ finish:
+    node = nodes_found++;
+    if ( node >= MAX_NUMNODES )
+        return NUMA_NO_NODE;
+    pxm2node[idx].pxm = pxm;
+    pxm2node[idx].node = node;
+
+    return node;
+}
+
+/*
+ * A lot of BIOS fill in 10 (= no distance) everywhere. This messes
+ * up the NUMA heuristics which wants the local node to have a smaller
+ * distance than the others.
+ * Do some quick checks here and only use the SLIT if it passes.
+ */
+static __init int slit_valid(struct acpi_table_slit *slit)
+{
+    int i, j;
+    int d = slit->locality_count;
+
+    for ( i = 0; i < d; i++ )
+    {
+        for ( j = 0; j < d; j++ )
+        {
+            uint8_t val = slit->entry[d * i + j];
+
+            if ( i == j )
+            {
+                if ( val != LOCAL_DISTANCE )
+                    return 0;
+            } else if ( val <= LOCAL_DISTANCE )
+                return 0;
+        }
+    }
+
+    return 1;
+}
+
+/* Callback for SLIT parsing */
+void __init acpi_numa_slit_init(struct acpi_table_slit *slit)
+{
+    unsigned long mfn;
+
+    if ( !slit_valid(slit) )
+    {
+        printk(KERN_INFO "ACPI: SLIT table looks invalid. Not used.\n");
+        return;
+    }
+
+    mfn = alloc_boot_pages(PFN_UP(slit->header.length), 1);
+    if ( !mfn )
+    {
+        printk(KERN_ERR "ACPI: Unable to allocate memory for "
+               "saving ACPI SLIT numa information.\n");
+        return;
+    }
+    acpi_slit = mfn_to_virt(mfn);
+    memcpy(acpi_slit, slit, slit->header.length);
+}
+
+/* Callback for parsing of the Proximity Domain <-> Memory Area mappings */
+void __init
+acpi_numa_memory_affinity_init(const struct acpi_srat_mem_affinity *ma)
+{
+    uint64_t start, end;
+    unsigned pxm;
+    nodeid_t node;
+    int i;
+    struct node *memblk;
+
+    if ( srat_disabled() )
+        return;
+
+    if ( ma->header.length != sizeof(struct acpi_srat_mem_affinity) )
+    {
+        numa_failed();
+        return;
+    }
+
+    if ( !(ma->flags & ACPI_SRAT_MEM_ENABLED) )
+        return;
+
+    if ( get_num_node_memblks() >= NR_NODE_MEMBLKS )
+    {
+        dprintk(XENLOG_WARNING,
+                "Too many numa entry, try bigger NR_NODE_MEMBLKS \n");
+        numa_failed();
+        return;
+    }
+
+    start = ma->base_address;
+    end = start + ma->length;
+    pxm = ma->proximity_domain;
+    if ( srat_rev < 2 )
+        pxm &= 0xff;
+    node = acpi_setup_node(pxm);
+    if ( node == NUMA_NO_NODE )
+    {
+        numa_failed();
+        return;
+    }
+
+    /* It is fine to add this area to the node's data; it will be used later. */
+    i = conflicting_memblks(start, end);
+    if ( i < 0 )
+        /* everything fine */;
+    else if ( get_memblk_nodeid(i) == node )
+    {
+        bool_t mismatch = !(ma->flags & ACPI_SRAT_MEM_HOT_PLUGGABLE) !=
+                          !test_bit(i, memblk_hotplug);
+
+        memblk = get_node_memblk_range(i);
+
+        printk("%sSRAT: PXM %u (%"PRIx64"-%"PRIx64") overlaps with itself (%"PRIx64"-%"PRIx64")\n",
+               mismatch ? KERN_ERR : KERN_WARNING, pxm, start, end,
+               memblk->start, memblk->end);
+        if ( mismatch )
+        {
+            numa_failed();
+            return;
+        }
+    }
+    else
+    {
+        memblk = get_node_memblk_range(i);
+
+        printk(KERN_ERR
+               "SRAT: PXM %u (%"PRIx64"-%"PRIx64") overlaps with PXM %u (%"PRIx64"-%"PRIx64")\n",
+               pxm, start, end, node_to_pxm(get_memblk_nodeid(i)),
+               memblk->start, memblk->end);
+        numa_failed();
+        return;
+    }
+
+    if ( !(ma->flags & ACPI_SRAT_MEM_HOT_PLUGGABLE) )
+    {
+        struct node *nd = get_numa_node(node);
+
+        if ( !node_test_and_set(node, memory_nodes_parsed) )
+        {
+            nd->start = start;
+            nd->end = end;
+        }
+        else
+        {
+            if ( start < nd->start )
+                nd->start = start;
+            if ( nd->end < end )
+                nd->end = end;
+        }
+    }
+    printk(KERN_INFO "SRAT: Node %u PXM %u %"PRIx64"-%"PRIx64"%s\n",
+           node, pxm, start, end,
+           ma->flags & ACPI_SRAT_MEM_HOT_PLUGGABLE ? " (hotplug)" : "");
+
+    numa_add_memblk(node, start, ma->length);
+    if ( ma->flags & ACPI_SRAT_MEM_HOT_PLUGGABLE )
+    {
+        __set_bit(get_num_node_memblks(), memblk_hotplug);
+        if ( end > mem_hotplug )
+            mem_hotplug = end;
+    }
+}
+
+uint8_t acpi_node_distance(nodeid_t a, nodeid_t b)
+{
+    unsigned index;
+    uint8_t slit_val;
+
+    if ( !acpi_slit )
+        return a == b ? LOCAL_DISTANCE : REMOTE_DISTANCE;
+
+    index = acpi_slit->locality_count * node_to_pxm(a);
+    slit_val = acpi_slit->entry[index + node_to_pxm(b)];
+
+    /* ACPI defines 0xff as an unreachable node and 0-9 are undefined */
+    if ( (slit_val == 0xff) || (slit_val <= 9) )
+        return NUMA_NO_DISTANCE;
+    else
+        return slit_val;
+}
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/drivers/passthrough/vtd/iommu.c b/xen/drivers/passthrough/vtd/iommu.c
index a5c61c6..d882951 100644
--- a/xen/drivers/passthrough/vtd/iommu.c
+++ b/xen/drivers/passthrough/vtd/iommu.c
@@ -30,6 +30,7 @@
 #include <xen/pci.h>
 #include <xen/pci_regs.h>
 #include <xen/keyhandler.h>
+#include <acpi/srat.h>
 #include <asm/msi.h>
 #include <asm/irq.h>
 #include <asm/hvm/vmx/vmx.h>
diff --git a/xen/include/acpi/srat.h b/xen/include/acpi/srat.h
new file mode 100644
index 0000000..c630ae9
--- /dev/null
+++ b/xen/include/acpi/srat.h
@@ -0,0 +1,24 @@
+#ifndef __XEN_SRAT_H__
+#define __XEN_SRAT_H__
+
+extern int srat_rev;
+struct pxm2node {
+    unsigned int pxm;
+    nodeid_t node;
+};
+
+extern nodeid_t pxm_to_node(unsigned pxm);
+extern nodeid_t acpi_setup_node(unsigned pxm);
+extern unsigned int node_to_pxm(nodeid_t n);
+extern uint8_t acpi_node_distance(nodeid_t a, nodeid_t b);
+extern void reset_pxm2node(void);
+#endif /* __XEN_SRAT_H__ */
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/include/asm-x86/mm.h b/xen/include/asm-x86/mm.h
index e22603c..be0e0d4 100644
--- a/xen/include/asm-x86/mm.h
+++ b/xen/include/asm-x86/mm.h
@@ -405,7 +405,6 @@ static inline int get_page_and_type(struct page_info *page,
 int check_descriptor(const struct domain *, struct desc_struct *d);
 
 extern bool_t opt_allow_superpage;
-extern paddr_t mem_hotplug;
 
 /******************************************************************************
  * With shadow pagetables, the different kinds of address start 
diff --git a/xen/include/asm-x86/numa.h b/xen/include/asm-x86/numa.h
index 7cff220..d8fa67c 100644
--- a/xen/include/asm-x86/numa.h
+++ b/xen/include/asm-x86/numa.h
@@ -7,8 +7,6 @@
 
 typedef uint8_t nodeid_t;
 
-extern int srat_rev;
-
 extern nodeid_t      cpu_to_node[NR_CPUS];
 extern cpumask_t     node_to_cpumask[];
 
@@ -17,8 +15,6 @@ extern cpumask_t     node_to_cpumask[];
 #define node_to_first_cpu(node)  (__ffs(node_to_cpumask[node]))
 #define node_to_cpumask(node)    (node_to_cpumask[node])
 
-extern nodeid_t pxm_to_node(unsigned int pxm);
-
 #define ZONE_ALIGN (1UL << (MAX_ORDER+PAGE_SHIFT))
 
 extern void numa_add_cpu(int cpu);
diff --git a/xen/include/xen/mm.h b/xen/include/xen/mm.h
index 88de3c1..61d059d 100644
--- a/xen/include/xen/mm.h
+++ b/xen/include/xen/mm.h
@@ -63,6 +63,8 @@ TYPE_SAFE(unsigned long, mfn);
 #undef mfn_t
 #endif
 
+extern paddr_t mem_hotplug;
+
 static inline mfn_t mfn_add(mfn_t mfn, unsigned long i)
 {
     return _mfn(mfn_x(mfn) + i);
diff --git a/xen/include/xen/numa.h b/xen/include/xen/numa.h
index b40a841..851f4a7 100644
--- a/xen/include/xen/numa.h
+++ b/xen/include/xen/numa.h
@@ -25,7 +25,6 @@ extern int compute_memnode_shift(struct node *nodes, int numnodes,
 extern void numa_init_array(void);
 extern bool_t srat_disabled(void);
 extern void numa_set_node(int cpu, nodeid_t node);
-extern nodeid_t acpi_setup_node(unsigned int pxm);
 extern void srat_detect_node(int cpu);
 extern void setup_node_bootmem(nodeid_t nodeid, paddr_t start, paddr_t end);
 extern void init_cpu_to_node(void);
-- 
2.7.4


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 71+ messages in thread

* [RFC PATCH v2 20/25] ARM: NUMA: Extract MPIDR from MADT table
  2017-03-28 15:53 [RFC PATCH v2 00/25] ARM: Add Xen NUMA support vijay.kilari
                   ` (18 preceding siblings ...)
  2017-03-28 15:53 ` [RFC PATCH v2 19/25] ACPI: Refactor acpi SRAT and SLIT table handling code vijay.kilari
@ 2017-03-28 15:53 ` vijay.kilari
  2017-03-28 15:53 ` [RFC PATCH v2 21/25] ACPI: Move arch specific SRAT parsing vijay.kilari
                   ` (4 subsequent siblings)
  24 siblings, 0 replies; 71+ messages in thread
From: vijay.kilari @ 2017-03-28 15:53 UTC (permalink / raw)
  To: xen-devel
  Cc: sstabellini, wei.liu2, George.Dunlap, andrew.cooper3,
	ian.jackson, tim, julien.grall, jbeulich, Vijaya Kumar K

From: Vijaya Kumar K <Vijaya.Kumar@cavium.com>

Parse the MADT table, extract the MPIDR for every CPU ID
found in ACPI_MADT_TYPE_GENERIC_INTERRUPT entries, and
store the result in cpuid_to_hwid_map[].

This mapping is used during SRAT table parsing to
look up the MPIDR for a given CPU ID.

Signed-off-by: Vijaya Kumar <Vijaya.Kumar@cavium.com>
---
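Note on the mapping table: the sketch below is a standalone illustration of
the CPU ID to MPIDR table described above, not the patch code. Assumptions:
NR_CPUS is shrunk to 4, MPIDR_INVALID is a stand-in value (Xen defines its
own), the explicit bounds check is a defensive addition that the patch does
not contain (the patch relies on the NR_CPUS entry limit passed to
acpi_table_parse_madt()), and cpuid_to_mpidr() is a hypothetical lookup
helper of the kind the later SRAT parsing would need.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NR_CPUS       4            /* illustrative value */
#define MPIDR_INVALID (~0ULL)      /* assumption: stand-in for Xen's definition */

/* One MADT GICC entry: ACPI processor UID and the matching MPIDR. */
struct cpuid_to_hwid {
    uint32_t cpuid;
    uint64_t hwid;
};

static struct cpuid_to_hwid cpuid_to_hwid_map[NR_CPUS];
static unsigned int num_cpuid_to_hwid;

/* Append one mapping; reject invalid MPIDRs and a full table. */
static bool map_cpu_to_hwid(uint32_t cpuid, uint64_t mpidr)
{
    if ( mpidr == MPIDR_INVALID || num_cpuid_to_hwid >= NR_CPUS )
        return false;

    cpuid_to_hwid_map[num_cpuid_to_hwid].cpuid = cpuid;
    cpuid_to_hwid_map[num_cpuid_to_hwid].hwid = mpidr;
    num_cpuid_to_hwid++;

    return true;
}

/* Hypothetical helper: linear search for the MPIDR of a CPU ID. */
static uint64_t cpuid_to_mpidr(uint32_t cpuid)
{
    assert(num_cpuid_to_hwid <= NR_CPUS);

    for ( unsigned int i = 0; i < num_cpuid_to_hwid; i++ )
        if ( cpuid_to_hwid_map[i].cpuid == cpuid )
            return cpuid_to_hwid_map[i].hwid;

    return MPIDR_INVALID;
}
```

Filling the table once from the MADT and searching it later avoids
reopening the MADT while the SRAT is being walked, which is the motivation
given in the cover letter.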
 xen/arch/arm/numa/Makefile    |  1 +
 xen/arch/arm/numa/acpi_numa.c | 94 +++++++++++++++++++++++++++++++++++++++++++
 xen/arch/arm/numa/numa.c      |  6 +++
 3 files changed, 101 insertions(+)

diff --git a/xen/arch/arm/numa/Makefile b/xen/arch/arm/numa/Makefile
index 3af3aff..b549459 100644
--- a/xen/arch/arm/numa/Makefile
+++ b/xen/arch/arm/numa/Makefile
@@ -1,2 +1,3 @@
 obj-y += dt_numa.o
 obj-y += numa.o
+obj-$(CONFIG_ACPI_NUMA) += acpi_numa.o
diff --git a/xen/arch/arm/numa/acpi_numa.c b/xen/arch/arm/numa/acpi_numa.c
new file mode 100644
index 0000000..45b3d35
--- /dev/null
+++ b/xen/arch/arm/numa/acpi_numa.c
@@ -0,0 +1,94 @@
+/*
+ * ACPI based NUMA setup
+ *
+ * Copyright (C) 2016 - Cavium Inc.
+ * Vijaya Kumar K <Vijaya.Kumar@cavium.com>
+ *
+ * Reads the ACPI MADT and SRAT tables to set up NUMA information.
+ * Contains excerpts from the x86 implementation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ */
+
+#include <xen/init.h>
+#include <xen/mm.h>
+#include <xen/inttypes.h>
+#include <xen/nodemask.h>
+#include <xen/acpi.h>
+#include <xen/numa.h>
+#include <xen/pfn.h>
+#include <acpi/srat.h>
+#include <asm/page.h>
+#include <asm/acpi.h>
+
+/* Holds CPUID to MPIDR mapping read from MADT table. */
+struct cpuid_to_hwid {
+    uint32_t cpuid;
+    uint64_t hwid;
+};
+
+#define PHYS_CPUID_INVALID 0xff
+
+/* Holds mapping of CPU id to MPIDR read from MADT */
+static struct cpuid_to_hwid __read_mostly cpuid_to_hwid_map[NR_CPUS] =
+    { [0 ... NR_CPUS - 1] = {PHYS_CPUID_INVALID, MPIDR_INVALID} };
+static unsigned int num_cpuid_to_hwid;
+
+static void __init acpi_map_cpu_to_hwid(uint32_t cpuid, uint64_t mpidr)
+{
+    if ( mpidr == MPIDR_INVALID )
+    {
+        printk("Skip MADT cpu entry with invalid MPIDR\n");
+        numa_failed();
+        return;
+    }
+
+    cpuid_to_hwid_map[num_cpuid_to_hwid].hwid = mpidr;
+    cpuid_to_hwid_map[num_cpuid_to_hwid].cpuid = cpuid;
+    num_cpuid_to_hwid++;
+}
+
+static int __init acpi_parse_madt_handler(struct acpi_subtable_header *header,
+                                          const unsigned long end)
+{
+    uint64_t mpidr;
+    struct acpi_madt_generic_interrupt *p =
+               container_of(header, struct acpi_madt_generic_interrupt, header);
+
+    if ( BAD_MADT_ENTRY(p, end) )
+    {
+        /* The MADT entry is invalid, so disable NUMA via numa_failed() */
+        numa_failed();
+        return -EINVAL;
+    }
+
+    acpi_table_print_madt_entry(header);
+    mpidr = p->arm_mpidr & MPIDR_HWID_MASK;
+    acpi_map_cpu_to_hwid(p->uid, mpidr);
+
+    return 0;
+}
+
+void __init acpi_map_uid_to_mpidr(void)
+{
+    acpi_table_parse_madt(ACPI_MADT_TYPE_GENERIC_INTERRUPT,
+                    acpi_parse_madt_handler, NR_CPUS);
+}
+
+void __init acpi_numa_arch_fixup(void) {}
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/arch/arm/numa/numa.c b/xen/arch/arm/numa/numa.c
index 891d304..958085c 100644
--- a/xen/arch/arm/numa/numa.c
+++ b/xen/arch/arm/numa/numa.c
@@ -23,6 +23,7 @@
 #include <asm/acpi.h>
 #include <xen/errno.h>
 #include <xen/pfn.h>
+#include <acpi/srat.h>
 
 static uint8_t (*node_distance_fn)(nodeid_t a, nodeid_t b);
 
@@ -69,6 +70,11 @@ void numa_failed(void)
     init_dt_numa_distance();
     node_distance_fn = NULL;
     init_cpu_to_node();
+
+#ifdef CONFIG_ACPI_NUMA
+    set_acpi_numa(0);
+    reset_pxm2node();
+#endif
 }
 
 int __init arch_sanitize_nodes_memory(void)
-- 
2.7.4



^ permalink raw reply related	[flat|nested] 71+ messages in thread

* [RFC PATCH v2 21/25] ACPI: Move arch specific SRAT parsing
  2017-03-28 15:53 [RFC PATCH v2 00/25] ARM: Add Xen NUMA support vijay.kilari
                   ` (19 preceding siblings ...)
  2017-03-28 15:53 ` [RFC PATCH v2 20/25] ARM: NUMA: Extract MPIDR from MADT table vijay.kilari
@ 2017-03-28 15:53 ` vijay.kilari
  2017-03-28 15:53 ` [RFC PATCH v2 22/25] ARM: NUMA: Extract proximity from SRAT table vijay.kilari
                   ` (3 subsequent siblings)
  24 siblings, 0 replies; 71+ messages in thread
From: vijay.kilari @ 2017-03-28 15:53 UTC (permalink / raw)
  To: xen-devel
  Cc: sstabellini, wei.liu2, George.Dunlap, andrew.cooper3,
	ian.jackson, tim, julien.grall, jbeulich, Vijaya Kumar K

From: Vijaya Kumar K <Vijaya.Kumar@cavium.com>

SRAT's X2APIC_CPU_AFFINITY and CPU_AFFINITY types are not used
by ARM. Hence move the handling of these SRAT types to an
arch-specific file and handle them under arch_table_parse_srat().

Signed-off-by: Vijaya Kumar K <Vijaya.Kumar@cavium.com>
---
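Note on the hook: the sketch below is a standalone illustration of the
pattern used here, where common code parses the memory affinity entries and
then calls a per-architecture hook that compiles to an empty stub when ACPI
NUMA is disabled. It is not the patch code. Assumptions: the enum, the
record() helper and numa_init() are hypothetical names, and CONFIG_ACPI_NUMA
is defined locally instead of coming from Kconfig.

```c
#include <assert.h>

#define CONFIG_ACPI_NUMA 1         /* assumption: normally set by Kconfig */

enum step { STEP_MEMORY_AFFINITY, STEP_ARCH_CPU_AFFINITY };

static enum step order[4];
static unsigned int nsteps;

/* Remember the order in which the parsing steps ran. */
static void record(enum step s)
{
    assert(nsteps < sizeof(order) / sizeof(order[0]));
    order[nsteps++] = s;
}

#ifdef CONFIG_ACPI_NUMA
/* On x86 this would walk the CPU and x2APIC affinity entries. */
static void arch_table_parse_srat(void)
{
    record(STEP_ARCH_CPU_AFFINITY);
}
#else
/* Without ACPI NUMA the hook compiles to nothing. */
static inline void arch_table_parse_srat(void) { }
#endif

/* Common part: memory affinity entries are arch-independent. */
static void parse_memory_affinity(void)
{
    record(STEP_MEMORY_AFFINITY);
}

/* Mirrors the call order inside acpi_numa_init() after this patch. */
static int numa_init(int srat_present)
{
    if ( !srat_present )
        return -1;

    parse_memory_affinity();
    arch_table_parse_srat();

    return 0;
}
```

One thing the sketch makes visible: with this patch the memory affinity
entries are parsed before the CPU affinity entries, whereas the code being
removed parsed the x2APIC and CPU entries first. Whether that reordering
matters on x86 is not established here and may be worth checking in review.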
 xen/arch/arm/numa/acpi_numa.c |  5 +++++
 xen/arch/x86/srat.c           | 44 +++++++++++++++++++++++++++++++++++++++++++
 xen/drivers/acpi/numa.c       | 43 ++----------------------------------------
 xen/include/xen/acpi.h        |  6 ++++++
 4 files changed, 57 insertions(+), 41 deletions(-)

diff --git a/xen/arch/arm/numa/acpi_numa.c b/xen/arch/arm/numa/acpi_numa.c
index 45b3d35..6fd937d 100644
--- a/xen/arch/arm/numa/acpi_numa.c
+++ b/xen/arch/arm/numa/acpi_numa.c
@@ -82,6 +82,11 @@ void __init acpi_map_uid_to_mpidr(void)
                     acpi_parse_madt_handler, NR_CPUS);
 }
 
+void __init arch_table_parse_srat(void)
+{
+    return;
+}
+
 void __init acpi_numa_arch_fixup(void) {}
 
 /*
diff --git a/xen/arch/x86/srat.c b/xen/arch/x86/srat.c
index 760df7f..2c79329 100644
--- a/xen/arch/x86/srat.c
+++ b/xen/arch/x86/srat.c
@@ -214,3 +214,47 @@ uint8_t __node_distance(nodeid_t a, nodeid_t b)
 }
 
 EXPORT_SYMBOL(__node_distance);
+
+static int __init
+acpi_parse_x2apic_affinity(struct acpi_subtable_header *header,
+			   const unsigned long end)
+{
+	const struct acpi_srat_x2apic_cpu_affinity *processor_affinity
+		= container_of(header, struct acpi_srat_x2apic_cpu_affinity,
+			       header);
+
+	if (!header)
+		return -EINVAL;
+
+	acpi_table_print_srat_entry(header);
+
+	/* let architecture-dependent part to do it */
+	acpi_numa_x2apic_affinity_init(processor_affinity);
+
+	return 0;
+}
+
+static int __init
+acpi_parse_processor_affinity(struct acpi_subtable_header *header,
+			      const unsigned long end)
+{
+	const struct acpi_srat_cpu_affinity *processor_affinity
+		= container_of(header, struct acpi_srat_cpu_affinity, header);
+
+	if (!header)
+		return -EINVAL;
+
+	acpi_table_print_srat_entry(header);
+
+	acpi_numa_processor_affinity_init(processor_affinity);
+
+	return 0;
+}
+
+void __init arch_table_parse_srat(void)
+{
+	acpi_table_parse_srat(ACPI_SRAT_TYPE_X2APIC_CPU_AFFINITY,
+			      acpi_parse_x2apic_affinity, 0);
+	acpi_table_parse_srat(ACPI_SRAT_TYPE_CPU_AFFINITY,
+			      acpi_parse_processor_affinity, 0);
+}
diff --git a/xen/drivers/acpi/numa.c b/xen/drivers/acpi/numa.c
index 85f8917..0adc32c 100644
--- a/xen/drivers/acpi/numa.c
+++ b/xen/drivers/acpi/numa.c
@@ -120,43 +120,6 @@ static int __init acpi_parse_slit(struct acpi_table_header *table)
 }
 
 static int __init
-acpi_parse_x2apic_affinity(struct acpi_subtable_header *header,
-			   const unsigned long end)
-{
-	const struct acpi_srat_x2apic_cpu_affinity *processor_affinity
-		= container_of(header, struct acpi_srat_x2apic_cpu_affinity,
-			       header);
-
-	if (!header)
-		return -EINVAL;
-
-	acpi_table_print_srat_entry(header);
-
-	/* let architecture-dependent part to do it */
-	acpi_numa_x2apic_affinity_init(processor_affinity);
-
-	return 0;
-}
-
-static int __init
-acpi_parse_processor_affinity(struct acpi_subtable_header *header,
-			      const unsigned long end)
-{
-	const struct acpi_srat_cpu_affinity *processor_affinity
-		= container_of(header, struct acpi_srat_cpu_affinity, header);
-
-	if (!header)
-		return -EINVAL;
-
-	acpi_table_print_srat_entry(header);
-
-	/* let architecture-dependent part to do it */
-	acpi_numa_processor_affinity_init(processor_affinity);
-
-	return 0;
-}
-
-static int __init
 acpi_parse_memory_affinity(struct acpi_subtable_header *header,
 			   const unsigned long end)
 {
@@ -197,13 +160,11 @@ int __init acpi_numa_init(void)
 {
 	/* SRAT: Static Resource Affinity Table */
 	if (!acpi_table_parse(ACPI_SIG_SRAT, acpi_parse_srat)) {
-		acpi_table_parse_srat(ACPI_SRAT_TYPE_X2APIC_CPU_AFFINITY,
-				      acpi_parse_x2apic_affinity, 0);
-		acpi_table_parse_srat(ACPI_SRAT_TYPE_CPU_AFFINITY,
-				      acpi_parse_processor_affinity, 0);
 		acpi_table_parse_srat(ACPI_SRAT_TYPE_MEMORY_AFFINITY,
 				      acpi_parse_memory_affinity,
 				      NR_NODE_MEMBLKS);
+		/* This call handles architecture-dependent SRAT entries */
+		arch_table_parse_srat();
 	}
 
 	/* SLIT: System Locality Information Table */
diff --git a/xen/include/xen/acpi.h b/xen/include/xen/acpi.h
index 30ec0ee..0726524 100644
--- a/xen/include/xen/acpi.h
+++ b/xen/include/xen/acpi.h
@@ -95,7 +95,13 @@ void acpi_numa_slit_init (struct acpi_table_slit *slit);
 void acpi_numa_processor_affinity_init(const struct acpi_srat_cpu_affinity *);
 void acpi_numa_x2apic_affinity_init(const struct acpi_srat_x2apic_cpu_affinity *);
 void acpi_numa_memory_affinity_init(const struct acpi_srat_mem_affinity *);
+#ifdef CONFIG_ACPI_NUMA
 void acpi_numa_arch_fixup(void);
+void arch_table_parse_srat(void);
+#else
+static inline void acpi_numa_arch_fixup(void) { }
+static inline void arch_table_parse_srat(void) { }
+#endif
 
 #ifdef CONFIG_ACPI_HOTPLUG_CPU
 /* Arch dependent functions for cpu hotplug support */
-- 
2.7.4


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 71+ messages in thread

* [RFC PATCH v2 22/25] ARM: NUMA: Extract proximity from SRAT table
  2017-03-28 15:53 [RFC PATCH v2 00/25] ARM: Add Xen NUMA support vijay.kilari
                   ` (20 preceding siblings ...)
  2017-03-28 15:53 ` [RFC PATCH v2 21/25] ACPI: Move arch specific SRAT parsing vijay.kilari
@ 2017-03-28 15:53 ` vijay.kilari
  2017-03-28 15:53 ` [RFC PATCH v2 23/25] ARM: NUMA: Initialize ACPI NUMA vijay.kilari
                   ` (2 subsequent siblings)
  24 siblings, 0 replies; 71+ messages in thread
From: vijay.kilari @ 2017-03-28 15:53 UTC (permalink / raw)
  To: xen-devel
  Cc: sstabellini, wei.liu2, George.Dunlap, andrew.cooper3,
	ian.jackson, tim, julien.grall, jbeulich, Vijaya Kumar K

From: Vijaya Kumar K <Vijaya.Kumar@cavium.com>

Register an SRAT entry handler for type
ACPI_SRAT_TYPE_GICC_AFFINITY to parse the SRAT table
and extract the proximity domain for each CPU ID.

Signed-off-by: Vijaya Kumar <Vijaya.Kumar@cavium.com>
---
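Reader's note: the lookup added by this patch is a two-step join. MADT
GICC entries give ACPI processor UID -> MPIDR, SRAT GICC affinity
entries give UID -> proximity domain, and CPU bring-up later asks for
MPIDR -> node. The following is a minimal standalone sketch of that
join, not the patch's code. All names (record_madt, record_srat,
lookup_hwid, lookup_node, MAX_CPUS, NO_NODE, HWID_INVALID) are
illustrative stand-ins, and the PXM -> node translation done by
acpi_setup_node() is assumed to have happened already.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_CPUS      4           /* stand-in for NR_CPUS */
#define NO_NODE       0xffu       /* stand-in for NUMA_NO_NODE */
#define HWID_INVALID  UINT64_MAX  /* stand-in for MPIDR_INVALID */

struct uid_hwid_pair  { uint32_t uid;  uint64_t hwid; };  /* from MADT */
struct node_hwid_pair { uint8_t  node; uint64_t hwid; };  /* from SRAT */

static struct uid_hwid_pair  uid_map[MAX_CPUS];
static struct node_hwid_pair node_map[MAX_CPUS];
static unsigned int nr_uids, nr_srat_cpus;

/* Step 1: record one MADT GICC entry (UID -> MPIDR). */
static int record_madt(uint32_t uid, uint64_t hwid)
{
    if ( nr_uids >= MAX_CPUS || hwid == HWID_INVALID )
        return -1;

    uid_map[nr_uids].uid = uid;
    uid_map[nr_uids].hwid = hwid;
    nr_uids++;

    return 0;
}

/* Resolve a UID to the MPIDR recorded in step 1. */
static uint64_t lookup_hwid(uint32_t uid)
{
    unsigned int i;

    for ( i = 0; i < nr_uids; i++ )
        if ( uid_map[i].uid == uid )
            return uid_map[i].hwid;

    return HWID_INVALID;
}

/* Step 2: record one SRAT GICC affinity entry (UID -> node). */
static int record_srat(uint32_t uid, uint8_t node)
{
    uint64_t hwid = lookup_hwid(uid);

    /* Reject UIDs that MADT never described, and table overflow. */
    if ( hwid == HWID_INVALID || nr_srat_cpus >= MAX_CPUS )
        return -1;

    node_map[nr_srat_cpus].node = node;
    node_map[nr_srat_cpus].hwid = hwid;
    nr_srat_cpus++;

    return 0;
}

/* Step 3: at CPU bring-up, map an MPIDR to its node. */
static uint8_t lookup_node(uint64_t hwid)
{
    unsigned int i;

    for ( i = 0; i < nr_srat_cpus; i++ )
        if ( node_map[i].hwid == hwid )
            return node_map[i].node;

    return NO_NODE;
}
```

A CPU that appears in MADT but not in SRAT resolves to NO_NODE in this
sketch, which mirrors acpi_get_nodeid() returning NUMA_NO_NODE.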
 xen/arch/arm/acpi/boot.c      |   2 +
 xen/arch/arm/numa/acpi_numa.c | 126 +++++++++++++++++++++++++++++++++++++++++-
 xen/drivers/acpi/numa.c       |  15 +++++
 xen/include/acpi/actbl1.h     |  17 +++++-
 xen/include/asm-arm/numa.h    |   9 +++
 xen/include/xen/numa.h        |   4 ++
 6 files changed, 171 insertions(+), 2 deletions(-)

diff --git a/xen/arch/arm/acpi/boot.c b/xen/arch/arm/acpi/boot.c
index 889208a..835c44e 100644
--- a/xen/arch/arm/acpi/boot.c
+++ b/xen/arch/arm/acpi/boot.c
@@ -31,6 +31,7 @@
 #include <acpi/actables.h>
 #include <xen/mm.h>
 #include <xen/device_tree.h>
+#include <xen/numa.h>
 
 #include <asm/acpi.h>
 #include <asm/smp.h>
@@ -117,6 +118,7 @@ acpi_map_gic_cpu_interface(struct acpi_madt_generic_interrupt *processor)
         return;
     }
 
+    numa_set_node(enabled_cpus, acpi_get_nodeid(mpidr));
     /* map the logical cpu id to cpu MPIDR */
     cpu_logical_map(enabled_cpus) = mpidr;
 
diff --git a/xen/arch/arm/numa/acpi_numa.c b/xen/arch/arm/numa/acpi_numa.c
index 6fd937d..8f51ed0 100644
--- a/xen/arch/arm/numa/acpi_numa.c
+++ b/xen/arch/arm/numa/acpi_numa.c
@@ -28,19 +28,71 @@
 #include <asm/page.h>
 #include <asm/acpi.h>
 
+extern nodemask_t processor_nodes_parsed;
+
 /* Holds CPUID to MPIDR mapping read from MADT table. */
 struct cpuid_to_hwid {
     uint32_t cpuid;
     uint64_t hwid;
 };
 
+/* Holds NODE to MPIDR mapping. */
+struct node_to_hwid {
+    nodeid_t nodeid;
+    uint64_t hwid;
+};
+
 #define PHYS_CPUID_INVALID 0xff
 
 /* Holds mapping of CPU id to MPIDR read from MADT */
 static struct cpuid_to_hwid __read_mostly cpuid_to_hwid_map[NR_CPUS] =
     { [0 ... NR_CPUS - 1] = {PHYS_CPUID_INVALID, MPIDR_INVALID} };
+static struct node_to_hwid __read_mostly node_to_hwid_map[NR_CPUS] =
+    { [0 ... NR_CPUS - 1] = {NUMA_NO_NODE, MPIDR_INVALID} };
+static unsigned int cpus_in_srat;
 static unsigned int num_cpuid_to_hwid;
 
+nodeid_t __init acpi_get_nodeid(uint64_t hwid)
+{
+    unsigned int i;
+
+    for ( i = 0; i < cpus_in_srat; i++ )
+    {
+        if ( node_to_hwid_map[i].hwid == hwid )
+            return node_to_hwid_map[i].nodeid;
+    }
+
+    return NUMA_NO_NODE;
+}
+
+static uint64_t __init acpi_get_cpu_hwid(uint32_t cid)
+{
+    unsigned int i;
+
+    for ( i = 0; i < num_cpuid_to_hwid; i++ )
+    {
+        if ( cpuid_to_hwid_map[i].cpuid == cid )
+            return cpuid_to_hwid_map[i].hwid;
+    }
+
+    return MPIDR_INVALID;
+}
+
+static void __init acpi_map_node_to_hwid(nodeid_t nodeid, uint64_t hwid)
+{
+    if ( nodeid >= MAX_NUMNODES )
+    {
+        printk(XENLOG_WARNING
+               "ACPI: NUMA: nodeid out of range %d with MPIDR 0x%lx\n",
+               nodeid, hwid);
+        numa_failed();
+        return;
+    }
+
+    node_to_hwid_map[cpus_in_srat].nodeid = nodeid;
+    node_to_hwid_map[cpus_in_srat].hwid = hwid;
+}
+
 static void __init acpi_map_cpu_to_hwid(uint32_t cpuid, uint64_t mpidr)
 {
     if ( mpidr == MPIDR_INVALID )
@@ -76,15 +128,87 @@ static int __init acpi_parse_madt_handler(struct acpi_subtable_header *header,
     return 0;
 }
 
+/* Callback for Proximity Domain -> ACPI processor UID mapping */
+static void __init
+acpi_numa_gicc_affinity_init(const struct acpi_srat_gicc_affinity *pa)
+{
+    int pxm, node;
+    uint64_t mpidr;
+
+    if ( srat_disabled() )
+        return;
+
+    if ( pa->header.length < sizeof(struct acpi_srat_gicc_affinity) )
+    {
+        printk(XENLOG_WARNING "SRAT: Invalid SRAT header length: %d\n",
+               pa->header.length);
+        numa_failed();
+        return;
+    }
+
+    if ( !(pa->flags & ACPI_SRAT_GICC_ENABLED) )
+        return;
+
+    if ( cpus_in_srat >= NR_CPUS )
+    {
+        printk(XENLOG_ERR
+               "SRAT: node_to_hwid_map[%d] is too small to fit all cpus\n",
+               NR_CPUS);
+        return;
+    }
+
+    pxm = pa->proximity_domain;
+    node = acpi_setup_node(pxm);
+    if ( node == NUMA_NO_NODE )
+    {
+        numa_failed();
+        return;
+    }
+
+    mpidr = acpi_get_cpu_hwid(pa->acpi_processor_uid);
+    if ( mpidr == MPIDR_INVALID )
+    {
+        printk(XENLOG_ERR
+               "SRAT: PXM %d with ACPI ID %d has no valid MPIDR in MADT\n",
+               pxm, pa->acpi_processor_uid);
+        numa_failed();
+        return;
+    }
+
+    acpi_map_node_to_hwid(node, mpidr);
+    node_set(node, processor_nodes_parsed);
+    cpus_in_srat++;
+    set_acpi_numa(1);
+    printk(XENLOG_INFO "SRAT: PXM %d -> MPIDR 0x%lx -> Node %d\n",
+           pxm, mpidr, node);
+}
+
 void __init acpi_map_uid_to_mpidr(void)
 {
     acpi_table_parse_madt(ACPI_MADT_TYPE_GENERIC_INTERRUPT,
                     acpi_parse_madt_handler, NR_CPUS);
 }
 
+static int __init
+acpi_parse_gicc_affinity(struct acpi_subtable_header *header,
+                         const unsigned long end)
+{
+    const struct acpi_srat_gicc_affinity *processor_affinity
+                 = (struct acpi_srat_gicc_affinity *)header;
+
+    if ( !processor_affinity )
+        return -EINVAL;
+
+    acpi_table_print_srat_entry(header);
+    acpi_numa_gicc_affinity_init(processor_affinity);
+
+    return 0;
+}
+
 void __init arch_table_parse_srat(void)
 {
-    return;
+    acpi_table_parse_srat(ACPI_SRAT_TYPE_GICC_AFFINITY,
+                          acpi_parse_gicc_affinity, NR_CPUS);
 }
 
 void __init acpi_numa_arch_fixup(void) {}
diff --git a/xen/drivers/acpi/numa.c b/xen/drivers/acpi/numa.c
index 0adc32c..b48d91d 100644
--- a/xen/drivers/acpi/numa.c
+++ b/xen/drivers/acpi/numa.c
@@ -104,6 +104,21 @@ void __init acpi_table_print_srat_entry(struct acpi_subtable_header * header)
 		}
 #endif				/* ACPI_DEBUG_OUTPUT */
 		break;
+	case ACPI_SRAT_TYPE_GICC_AFFINITY:
+#ifdef ACPI_DEBUG_OUTPUT
+		{
+			struct acpi_srat_gicc_affinity *p =
+			    (struct acpi_srat_gicc_affinity *)header;
+			ACPI_DEBUG_PRINT((ACPI_DB_INFO,
+					  "SRAT Processor (acpi id[0x%04x]) in"
+					  " proximity domain %d %s\n",
+					  p->acpi_processor_uid,
+					  p->proximity_domain,
+					  (p->flags & ACPI_SRAT_GICC_ENABLED) ?
+					  "enabled" : "disabled");
+		}
+#endif				/* ACPI_DEBUG_OUTPUT */
+		break;
 	default:
 		printk(KERN_WARNING PREFIX
 		       "Found unsupported SRAT entry (type = %#x)\n",
diff --git a/xen/include/acpi/actbl1.h b/xen/include/acpi/actbl1.h
index e199136..b84bfba 100644
--- a/xen/include/acpi/actbl1.h
+++ b/xen/include/acpi/actbl1.h
@@ -949,7 +949,8 @@ enum acpi_srat_type {
 	ACPI_SRAT_TYPE_CPU_AFFINITY = 0,
 	ACPI_SRAT_TYPE_MEMORY_AFFINITY = 1,
 	ACPI_SRAT_TYPE_X2APIC_CPU_AFFINITY = 2,
-	ACPI_SRAT_TYPE_RESERVED = 3	/* 3 and greater are reserved */
+	ACPI_SRAT_TYPE_GICC_AFFINITY = 3,
+	ACPI_SRAT_TYPE_RESERVED = 4	/* 4 and greater are reserved */
 };
 
 /*
@@ -1007,6 +1008,20 @@ struct acpi_srat_x2apic_cpu_affinity {
 
 #define ACPI_SRAT_CPU_ENABLED       (1)	/* 00: Use affinity structure */
 
+/* 3: GICC Affinity (ACPI 5.1) */
+
+struct acpi_srat_gicc_affinity {
+	struct acpi_subtable_header header;
+	u32 proximity_domain;
+	u32 acpi_processor_uid;
+	u32 flags;
+	u32 clock_domain;
+};
+
+/* Flags for struct acpi_srat_gicc_affinity */
+
+#define ACPI_SRAT_GICC_ENABLED     (1)  /* 00: Use affinity structure */
+
 /* Reset to default packing */
 
 #pragma pack()
diff --git a/xen/include/asm-arm/numa.h b/xen/include/asm-arm/numa.h
index 85fbbe8..1d4dc98 100644
--- a/xen/include/asm-arm/numa.h
+++ b/xen/include/asm-arm/numa.h
@@ -13,6 +13,15 @@ extern void dt_numa_process_memory_node(uint32_t nid,paddr_t start,
 extern void register_node_distance(uint8_t (fn)(nodeid_t a, nodeid_t b));
 extern void init_dt_numa_distance(void);
 extern uint8_t __node_distance(nodeid_t a, nodeid_t b);
+#ifdef CONFIG_ACPI_NUMA
+nodeid_t acpi_get_nodeid(uint64_t hwid);
+#else
+static inline nodeid_t acpi_get_nodeid(uint64_t hwid)
+{
+    return 0;
+}
+#endif /* CONFIG_ACPI_NUMA */
+
 #ifdef CONFIG_NUMA
 extern void numa_init(void);
 extern int dt_numa_init(void);
diff --git a/xen/include/xen/numa.h b/xen/include/xen/numa.h
index 851f4a7..c3b4adc 100644
--- a/xen/include/xen/numa.h
+++ b/xen/include/xen/numa.h
@@ -24,7 +24,11 @@ extern int compute_memnode_shift(struct node *nodes, int numnodes,
                                  nodeid_t *nodeids, unsigned int *shift);
 extern void numa_init_array(void);
 extern bool_t srat_disabled(void);
+#ifdef CONFIG_NUMA
 extern void numa_set_node(int cpu, nodeid_t node);
+#else
+static inline void numa_set_node(int cpu, nodeid_t node) { }
+#endif
 extern void srat_detect_node(int cpu);
 extern void setup_node_bootmem(nodeid_t nodeid, paddr_t start, paddr_t end);
 extern void init_cpu_to_node(void);
-- 
2.7.4



^ permalink raw reply related	[flat|nested] 71+ messages in thread

* [RFC PATCH v2 23/25] ARM: NUMA: Initialize ACPI NUMA
  2017-03-28 15:53 [RFC PATCH v2 00/25] ARM: Add Xen NUMA support vijay.kilari
                   ` (21 preceding siblings ...)
  2017-03-28 15:53 ` [RFC PATCH v2 22/25] ARM: NUMA: Extract proximity from SRAT table vijay.kilari
@ 2017-03-28 15:53 ` vijay.kilari
  2017-03-28 15:53 ` [RFC PATCH v2 24/25] NUMA: Move CONFIG_NUMA to common Kconfig vijay.kilari
  2017-03-28 15:53 ` [RFC PATCH v2 25/25] NUMA: Enable ACPI_NUMA config vijay.kilari
  24 siblings, 0 replies; 71+ messages in thread
From: vijay.kilari @ 2017-03-28 15:53 UTC (permalink / raw)
  To: xen-devel
  Cc: sstabellini, wei.liu2, George.Dunlap, andrew.cooper3,
	ian.jackson, tim, julien.grall, jbeulich, Vijaya Kumar K

From: Vijaya Kumar K <Vijaya.Kumar@cavium.com>

Call ACPI NUMA initialization under CONFIG_ACPI_NUMA.

Signed-off-by: Vijaya Kumar <Vijaya.Kumar@cavium.com>
---
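Reader's note: arch_acpi_numa_init() clears the memory block table
because DT parsing may already have filled it before the ACPI tables
are walked. The sketch below shows only that reset step on a simplified
table. The names (add_memblk, clear_memblks, MAX_MEMBLKS, NO_NODE) are
hypothetical stand-ins for the node_memblk_range/memblk_nodeid
bookkeeping in xen/common/numa.c, not the actual implementation.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_MEMBLKS 8      /* stand-in for NR_NODE_MEMBLKS */
#define NO_NODE     0xffu  /* stand-in for NUMA_NO_NODE */

struct memblk { uint64_t start, end; };

static struct memblk blk_range[MAX_MEMBLKS];
static uint8_t blk_node[MAX_MEMBLKS];
static unsigned int nr_blks;

/* Append one [start, end) range owned by a node. */
static int add_memblk(uint8_t node, uint64_t start, uint64_t end)
{
    if ( nr_blks >= MAX_MEMBLKS || start >= end )
        return -1;

    blk_range[nr_blks].start = start;
    blk_range[nr_blks].end = end;
    blk_node[nr_blks] = node;
    nr_blks++;

    return 0;
}

/* Forget every recorded range so a second parser starts from scratch. */
static void clear_memblks(void)
{
    unsigned int i;

    for ( i = 0; i < nr_blks; i++ )
    {
        blk_range[i].start = 0;
        blk_range[i].end = 0;
        blk_node[i] = NO_NODE;
    }

    nr_blks = 0;
}
```

Without the reset, ranges recorded from the DT would stay in the table
next to the ranges later reported by SRAT.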
 xen/arch/arm/numa/acpi_numa.c | 28 +++++++++++++++++++++++++++-
 xen/arch/arm/numa/numa.c      |  6 ++++++
 xen/common/numa.c             | 14 ++++++++++++++
 xen/include/asm-arm/numa.h    |  1 +
 xen/include/xen/numa.h        |  1 +
 5 files changed, 49 insertions(+), 1 deletion(-)

diff --git a/xen/arch/arm/numa/acpi_numa.c b/xen/arch/arm/numa/acpi_numa.c
index 8f51ed0..574ed45 100644
--- a/xen/arch/arm/numa/acpi_numa.c
+++ b/xen/arch/arm/numa/acpi_numa.c
@@ -29,6 +29,7 @@
 #include <asm/acpi.h>
 
 extern nodemask_t processor_nodes_parsed;
+extern nodemask_t memory_nodes_parsed;
 
 /* Holds CPUID to MPIDR mapping read from MADT table. */
 struct cpuid_to_hwid {
@@ -183,7 +184,7 @@ acpi_numa_gicc_affinity_init(const struct acpi_srat_gicc_affinity *pa)
            pxm, mpidr, node);
 }
 
-void __init acpi_map_uid_to_mpidr(void)
+static void __init acpi_map_uid_to_mpidr(void)
 {
     acpi_table_parse_madt(ACPI_MADT_TYPE_GENERIC_INTERRUPT,
                     acpi_parse_madt_handler, NR_CPUS);
@@ -211,6 +212,31 @@ void __init arch_table_parse_srat(void)
                           acpi_parse_gicc_affinity, NR_CPUS);
 }
 
+bool_t __init arch_acpi_numa_init(void)
+{
+    int ret;
+
+    if ( !acpi_disabled )
+    {
+        /*
+         * If the firmware also provides a DT, process_memory_node()
+         * will already have added memory blocks. Reset them before
+         * ACPI NUMA init.
+         */
+        numa_clear_memblks();
+        nodes_clear(memory_nodes_parsed);
+        acpi_map_uid_to_mpidr();
+        ret = acpi_numa_init();
+        if ( ret || srat_disabled() )
+            return 1;
+
+        /* Register acpi node_distance handler */
+        register_node_distance(&acpi_node_distance);
+    }
+
+    return 0;
+}
+
 void __init acpi_numa_arch_fixup(void) {}
 
 /*
diff --git a/xen/arch/arm/numa/numa.c b/xen/arch/arm/numa/numa.c
index 958085c..b5556c6 100644
--- a/xen/arch/arm/numa/numa.c
+++ b/xen/arch/arm/numa/numa.c
@@ -152,12 +152,18 @@ void __init numa_init(void)
     if ( is_numa_off() )
         goto no_numa;
 
+#ifdef CONFIG_ACPI_NUMA
+    ret = arch_acpi_numa_init();
+    if ( ret )
+        printk(XENLOG_WARNING "ACPI NUMA init failed\n");
+#else
     if ( !dt_numa )
         goto no_numa;
 
     ret = dt_numa_init();
     if ( ret )
         printk(XENLOG_WARNING "DT NUMA init failed\n");
+#endif
 
     for ( bank = 0 ; bank < bootinfo.mem.nr_banks; bank++ )
     {
diff --git a/xen/common/numa.c b/xen/common/numa.c
index f2ac726..aca2386 100644
--- a/xen/common/numa.c
+++ b/xen/common/numa.c
@@ -84,6 +84,20 @@ nodeid_t get_memblk_nodeid(int id)
     return memblk_nodeid[id];
 }
 
+void __init numa_clear_memblks(void)
+{
+    unsigned int i;
+
+    for ( i = 0; i < get_num_node_memblks(); i++ )
+    {
+        node_memblk_range[i].start = 0;
+        node_memblk_range[i].end = 0;
+        memblk_nodeid[i] = NUMA_NO_NODE;
+    }
+
+    num_node_memblks = 0;
+}
+
 int __init get_mem_nodeid(paddr_t start, paddr_t end)
 {
     unsigned int i;
diff --git a/xen/include/asm-arm/numa.h b/xen/include/asm-arm/numa.h
index 1d4dc98..f932ba3 100644
--- a/xen/include/asm-arm/numa.h
+++ b/xen/include/asm-arm/numa.h
@@ -24,6 +24,7 @@ static inline nodeid_t acpi_get_nodeid(uint64_t hwid)
 
 #ifdef CONFIG_NUMA
 extern void numa_init(void);
+extern bool_t arch_acpi_numa_init(void);
 extern int dt_numa_init(void);
 extern void numa_set_cpu_node(int cpu, unsigned int nid);
 extern void numa_add_cpu(int cpu);
diff --git a/xen/include/xen/numa.h b/xen/include/xen/numa.h
index c3b4adc..6c885bd 100644
--- a/xen/include/xen/numa.h
+++ b/xen/include/xen/numa.h
@@ -59,4 +59,5 @@ void set_acpi_numa(bool val);
 int get_numa_fake(void);
 extern int numa_emulation(uint64_t start_pfn, uint64_t end_pfn);
 extern void numa_dummy_init(uint64_t start_pfn, uint64_t end_pfn);
+extern void numa_clear_memblks(void);
 #endif /* _XEN_NUMA_H */
-- 
2.7.4



^ permalink raw reply related	[flat|nested] 71+ messages in thread

* [RFC PATCH v2 24/25] NUMA: Move CONFIG_NUMA to common Kconfig
  2017-03-28 15:53 [RFC PATCH v2 00/25] ARM: Add Xen NUMA support vijay.kilari
                   ` (22 preceding siblings ...)
  2017-03-28 15:53 ` [RFC PATCH v2 23/25] ARM: NUMA: Initialize ACPI NUMA vijay.kilari
@ 2017-03-28 15:53 ` vijay.kilari
  2017-05-31 10:04   ` Jan Beulich
  2017-03-28 15:53 ` [RFC PATCH v2 25/25] NUMA: Enable ACPI_NUMA config vijay.kilari
  24 siblings, 1 reply; 71+ messages in thread
From: vijay.kilari @ 2017-03-28 15:53 UTC (permalink / raw)
  To: xen-devel
  Cc: sstabellini, wei.liu2, George.Dunlap, andrew.cooper3,
	ian.jackson, tim, julien.grall, jbeulich, Vijaya Kumar K

From: Vijaya Kumar K <Vijaya.Kumar@cavium.com>

CONFIG_NUMA is currently defined in xen/drivers/acpi/Kconfig.
Move it to xen/common/Kconfig and enable it by default.
The NUMA feature also uses PDX for the physical address to
memory node mapping, so make NUMA depend on HAS_PDX.

Signed-off-by: Vijaya Kumar K <Vijaya.Kumar@cavium.com>
---
 xen/common/Kconfig       | 4 ++++
 xen/drivers/acpi/Kconfig | 3 ---
 2 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/xen/common/Kconfig b/xen/common/Kconfig
index 5334be3..d6b8a40 100644
--- a/xen/common/Kconfig
+++ b/xen/common/Kconfig
@@ -41,6 +41,10 @@ config HAS_GDBSX
 config HAS_IOPORTS
 	bool
 
+config NUMA
+	def_bool y
+	depends on HAS_PDX
+
 config HAS_BUILD_ID
 	string
 	option env="XEN_HAS_BUILD_ID"
diff --git a/xen/drivers/acpi/Kconfig b/xen/drivers/acpi/Kconfig
index b64d373..488372f 100644
--- a/xen/drivers/acpi/Kconfig
+++ b/xen/drivers/acpi/Kconfig
@@ -4,6 +4,3 @@ config ACPI
 
 config ACPI_LEGACY_TABLES_LOOKUP
 	bool
-
-config NUMA
-	bool
-- 
2.7.4



^ permalink raw reply related	[flat|nested] 71+ messages in thread

* [RFC PATCH v2 25/25] NUMA: Enable ACPI_NUMA config
  2017-03-28 15:53 [RFC PATCH v2 00/25] ARM: Add Xen NUMA support vijay.kilari
                   ` (23 preceding siblings ...)
  2017-03-28 15:53 ` [RFC PATCH v2 24/25] NUMA: Move CONFIG_NUMA to common Kconfig vijay.kilari
@ 2017-03-28 15:53 ` vijay.kilari
  2017-05-31 10:05   ` Jan Beulich
  24 siblings, 1 reply; 71+ messages in thread
From: vijay.kilari @ 2017-03-28 15:53 UTC (permalink / raw)
  To: xen-devel
  Cc: sstabellini, wei.liu2, George.Dunlap, andrew.cooper3,
	ian.jackson, tim, julien.grall, jbeulich, Vijaya Kumar K

From: Vijaya Kumar K <Vijaya.Kumar@cavium.com>

Add CONFIG_ACPI_NUMA to xen/drivers/acpi/Kconfig and
drop the CONFIG_ACPI_NUMA definition from asm-x86/config.h.

Signed-off-by: Vijaya Kumar K <Vijaya.Kumar@cavium.com>
---
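Reader's note: once ACPI_NUMA is a Kconfig symbol, callers can rely on
header stubs instead of wrapping each call site in #ifdef, as the
earlier patches in this series do for acpi_get_nodeid() and
arch_table_parse_srat(). The following is a minimal sketch of that
pattern. SKETCH_ACPI_NUMA, sketch_get_nodeid and sketch_node_for_cpu
are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical switch. In Xen the equivalent symbol would be generated
 * from Kconfig; it is left undefined here so the stub is selected.
 */
/* #define SKETCH_ACPI_NUMA 1 */

#ifdef SKETCH_ACPI_NUMA
/* Real implementation lives in an arch file when the option is on. */
uint8_t sketch_get_nodeid(uint64_t hwid);
#else
/* Stub: with the option off, every CPU is reported on node 0. */
static inline uint8_t sketch_get_nodeid(uint64_t hwid)
{
    (void)hwid;
    return 0;
}
#endif

/* The caller needs no #ifdef of its own. */
static uint8_t sketch_node_for_cpu(uint64_t hwid)
{
    return sketch_get_nodeid(hwid);
}
```

The design choice is that the option is tested once, in the header,
and every caller compiles unchanged in both configurations.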
 xen/drivers/acpi/Kconfig     | 4 ++++
 xen/include/asm-x86/config.h | 1 -
 2 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/xen/drivers/acpi/Kconfig b/xen/drivers/acpi/Kconfig
index 488372f..8e15428 100644
--- a/xen/drivers/acpi/Kconfig
+++ b/xen/drivers/acpi/Kconfig
@@ -4,3 +4,7 @@ config ACPI
 
 config ACPI_LEGACY_TABLES_LOOKUP
 	bool
+
+config ACPI_NUMA
+	def_bool y
+	depends on ACPI && NUMA
diff --git a/xen/include/asm-x86/config.h b/xen/include/asm-x86/config.h
index b9a6d94..cc27a52 100644
--- a/xen/include/asm-x86/config.h
+++ b/xen/include/asm-x86/config.h
@@ -37,7 +37,6 @@
 #define CONFIG_X86_L1_CACHE_SHIFT 7
 
 #define CONFIG_ACPI_SLEEP 1
-#define CONFIG_ACPI_NUMA 1
 #define CONFIG_ACPI_SRAT 1
 #define CONFIG_ACPI_CSTATE 1
 
-- 
2.7.4



^ permalink raw reply related	[flat|nested] 71+ messages in thread

* Re: [RFC PATCH v2 01/25] x86: NUMA: Clean up: Drop trailing spaces
  2017-03-28 15:53 ` [RFC PATCH v2 01/25] x86: NUMA: Clean up: Drop trailing spaces vijay.kilari
@ 2017-03-28 16:44   ` Wei Liu
  2017-05-31 10:20   ` Jan Beulich
  2017-05-31 10:21   ` Jan Beulich
  2 siblings, 0 replies; 71+ messages in thread
From: Wei Liu @ 2017-03-28 16:44 UTC (permalink / raw)
  To: vijay.kilari
  Cc: tim, sstabellini, wei.liu2, George.Dunlap, andrew.cooper3,
	ian.jackson, xen-devel, julien.grall, jbeulich, Vijaya Kumar K

The subject should be updated -- it does more than dropping trailing
spaces.

On Tue, Mar 28, 2017 at 09:23:09PM +0530, vijay.kilari@gmail.com wrote:
> From: Vijaya Kumar K <Vijaya.Kumar@cavium.com>
> 
> Fix coding style, trailing spaces, tabs in NUMA code.
> Also drop unused macros and functions.
> 

No functional change.

> Signed-off-by: Vijaya Kumar K <Vijaya.Kumar@cavium.com>

Reviewed-by: Wei Liu <wei.liu2@citrix.com>


^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [RFC PATCH v2 02/25] x86: NUMA: Fix datatypes and attributes
  2017-03-28 15:53 ` [RFC PATCH v2 02/25] x86: NUMA: Fix datatypes and attributes vijay.kilari
@ 2017-03-28 16:44   ` Wei Liu
  2017-05-31 10:35   ` Jan Beulich
  1 sibling, 0 replies; 71+ messages in thread
From: Wei Liu @ 2017-03-28 16:44 UTC (permalink / raw)
  To: vijay.kilari
  Cc: tim, sstabellini, wei.liu2, George.Dunlap, andrew.cooper3,
	ian.jackson, xen-devel, julien.grall, jbeulich, Vijaya Kumar K

On Tue, Mar 28, 2017 at 09:23:10PM +0530, vijay.kilari@gmail.com wrote:
> From: Vijaya Kumar K <Vijaya.Kumar@cavium.com>
> 
> Change u{8,32,64} to uint{8,32,64}_t and bool_t to bool.
> Fix attributes coding styles.
> Also change memnodeshift to unsigned int.

I think you also need to say you changed u64 to paddr_t where
appropriate.

> @@ -269,8 +269,8 @@ void __init numa_initmem_init(unsigned long start_pfn, unsigned long end_pfn)
>             numa_off ? "NUMA turned off" : "No NUMA configuration found");
>  
>      printk(KERN_INFO "Faking a node at %016"PRIx64"-%016"PRIx64"\n",
> -           (u64)start_pfn << PAGE_SHIFT,
> -           (u64)end_pfn << PAGE_SHIFT);
> +           (uint64_t)start_pfn << PAGE_SHIFT,
> +           (uint64_t)end_pfn << PAGE_SHIFT);
>      /* setup dummy node covering all memory */
>      memnode_shift = BITS_PER_LONG - 1;
>      memnodemap = _memnodemap;
> @@ -279,8 +279,8 @@ void __init numa_initmem_init(unsigned long start_pfn, unsigned long end_pfn)
>      for ( i = 0; i < nr_cpu_ids; i++ )
>          numa_set_node(i, 0);
>      cpumask_copy(&node_to_cpumask[0], cpumask_of(0));
> -    setup_node_bootmem(0, (u64)start_pfn << PAGE_SHIFT,
> -                    (u64)end_pfn << PAGE_SHIFT);
> +    setup_node_bootmem(0, (paddr_t)start_pfn << PAGE_SHIFT,
> +                    (paddr_t)end_pfn << PAGE_SHIFT);

Indentation.



^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [RFC PATCH v2 04/25] x86: NUMA: Add accessors for acpi_numa, numa_off and numa_fake variables
  2017-03-28 15:53 ` [RFC PATCH v2 04/25] x86: NUMA: Add accessors for acpi_numa, numa_off and numa_fake variables vijay.kilari
@ 2017-04-20 15:59   ` Julien Grall
  2017-04-25  6:54     ` Vijay Kilari
  2017-06-30 14:07   ` Jan Beulich
  1 sibling, 1 reply; 71+ messages in thread
From: Julien Grall @ 2017-04-20 15:59 UTC (permalink / raw)
  To: vijay.kilari, xen-devel
  Cc: sstabellini, wei.liu2, George.Dunlap, andrew.cooper3,
	ian.jackson, tim, jbeulich, Vijaya Kumar K

Hi Vijay,

On 28/03/17 16:53, vijay.kilari@gmail.com wrote:
> From: Vijaya Kumar K <Vijaya.Kumar@cavium.com>
>
> Add accessor functions for acpi_numa and numa_off static
> variables. Init value of acpi_numa is set 1 instead of 0.

Please explain why you change the acpi_numa value from 0 to 1.

Also, I am not sure I understand the benefit of these helpers. Why do
you need them? Why not use the global variables directly? That would
avoid calling a function just to return a value.

> Also return value of srat_disabled is changed to bool.
>
> Signed-off-by: Vijaya Kumar K <Vijaya.Kumar@cavium.com>
> ---
>  xen/arch/x86/numa.c        | 43 +++++++++++++++++++++++++++++++------------
>  xen/arch/x86/setup.c       |  2 +-
>  xen/arch/x86/srat.c        | 12 ++++++------
>  xen/include/asm-x86/acpi.h |  1 -
>  xen/include/asm-x86/numa.h |  5 +----
>  xen/include/xen/numa.h     |  3 +++
>  6 files changed, 42 insertions(+), 24 deletions(-)
>
> diff --git a/xen/arch/x86/numa.c b/xen/arch/x86/numa.c
> index 964fc5a..6b794a7 100644
> --- a/xen/arch/x86/numa.c
> +++ b/xen/arch/x86/numa.c
> @@ -42,12 +42,27 @@ cpumask_t __read_mostly node_to_cpumask[MAX_NUMNODES];
>
>  nodemask_t __read_mostly node_online_map = { { [0] = 1UL } };
>
> -bool numa_off = 0;
> -s8 acpi_numa = 0;
> +static bool numa_off = 0;
> +static bool acpi_numa = 1;

Please don't mix 0/1 and bool. Instead use false/true.

>
> -int srat_disabled(void)
> +bool is_numa_off(void)
>  {
> -    return numa_off || acpi_numa < 0;
> +    return numa_off;
> +}
> +
> +bool get_acpi_numa(void)
> +{
> +    return acpi_numa;
> +}
> +
> +void set_acpi_numa(bool_t val)
> +{
> +    acpi_numa = val;
> +}
> +
> +bool srat_disabled(void)
> +{
> +    return numa_off || acpi_numa == 0;
>  }
>
>  /*
> @@ -202,13 +217,17 @@ void __init numa_init_array(void)
>
>  #ifdef CONFIG_NUMA_EMU
>  static int __initdata numa_fake = 0;
> +static int get_numa_fake(void)
> +{
> +    return numa_fake;
> +}
>
>  /* Numa emulation */
>  static int __init numa_emulation(uint64_t start_pfn, uint64_t end_pfn)
>  {
>      int i;
>      struct node nodes[MAX_NUMNODES];
> -    uint64_t sz = ((end_pfn - start_pfn) << PAGE_SHIFT) / numa_fake;
> +    uint64_t sz = ((end_pfn - start_pfn) << PAGE_SHIFT) / get_numa_fake();
>
>      /* Kludge needed for the hash function */
>      if ( hweight64(sz) > 1 )
> @@ -223,10 +242,10 @@ static int __init numa_emulation(uint64_t start_pfn, uint64_t end_pfn)
>      }
>
>      memset(&nodes,0,sizeof(nodes));
> -    for ( i = 0; i < numa_fake; i++ )
> +    for ( i = 0; i < get_numa_fake(); i++ )
>      {
>          nodes[i].start = (start_pfn << PAGE_SHIFT) + i * sz;
> -        if ( i == numa_fake - 1 )
> +        if ( i == get_numa_fake() - 1 )
>              sz = (end_pfn << PAGE_SHIFT) - nodes[i].start;
>          nodes[i].end = nodes[i].start + sz;
>          printk(KERN_INFO
> @@ -235,7 +254,7 @@ static int __init numa_emulation(uint64_t start_pfn, uint64_t end_pfn)
>                 (nodes[i].end - nodes[i].start) >> 20);
>          node_set_online(i);
>      }
> -    if ( compute_memnode_shift(nodes, numa_fake, NULL, &memnode_shift) )
> +    if ( compute_memnode_shift(nodes, get_numa_fake(), NULL, &memnode_shift) )
>      {
>          memnode_shift = 0;
>          printk(KERN_ERR "No NUMA hash function found. Emulation disabled.\n");
> @@ -254,18 +273,18 @@ void __init numa_initmem_init(unsigned long start_pfn, unsigned long end_pfn)
>      int i;
>
>  #ifdef CONFIG_NUMA_EMU
> -    if ( numa_fake && !numa_emulation(start_pfn, end_pfn) )
> +    if ( get_numa_fake() && !numa_emulation(start_pfn, end_pfn) )
>          return;
>  #endif
>
>  #ifdef CONFIG_ACPI_NUMA
> -    if ( !numa_off && !acpi_scan_nodes((uint64_t)start_pfn << PAGE_SHIFT,
> +    if ( !is_numa_off() && !acpi_scan_nodes((uint64_t)start_pfn << PAGE_SHIFT,
>           (uint64_t)end_pfn << PAGE_SHIFT) )
>          return;
>  #endif
>
>      printk(KERN_INFO "%s\n",
> -           numa_off ? "NUMA turned off" : "No NUMA configuration found");
> +           is_numa_off() ? "NUMA turned off" : "No NUMA configuration found");
>
>      printk(KERN_INFO "Faking a node at %016"PRIx64"-%016"PRIx64"\n",
>             (uint64_t)start_pfn << PAGE_SHIFT,
> @@ -312,7 +331,7 @@ static int __init numa_setup(char *opt)
>      if ( !strncmp(opt,"noacpi",6) )
>      {
>          numa_off = 0;
> -        acpi_numa = -1;
> +        acpi_numa = 0;
>      }
>  #endif
>
> diff --git a/xen/arch/x86/setup.c b/xen/arch/x86/setup.c
> index 1cd290e..4410e53 100644
> --- a/xen/arch/x86/setup.c
> +++ b/xen/arch/x86/setup.c
> @@ -241,7 +241,7 @@ void srat_detect_node(int cpu)
>      node_set_online(node);
>      numa_set_node(cpu, node);
>
> -    if ( opt_cpu_info && acpi_numa > 0 )
> +    if ( opt_cpu_info && get_acpi_numa() )
>          printk("CPU %d APIC %d -> Node %d\n", cpu, apicid, node);
>  }
>
> diff --git a/xen/arch/x86/srat.c b/xen/arch/x86/srat.c
> index 2d0c047..ccacbcd 100644
> --- a/xen/arch/x86/srat.c
> +++ b/xen/arch/x86/srat.c
> @@ -153,7 +153,7 @@ static void __init bad_srat(void)
>  {
>  	int i;
>  	printk(KERN_ERR "SRAT: SRAT not used.\n");
> -	acpi_numa = -1;
> +	set_acpi_numa(0);
>  	for (i = 0; i < MAX_LOCAL_APIC; i++)
>  		apicid_to_node[i] = NUMA_NO_NODE;
>  	for (i = 0; i < ARRAY_SIZE(pxm2node); i++)
> @@ -232,7 +232,7 @@ acpi_numa_x2apic_affinity_init(const struct acpi_srat_x2apic_cpu_affinity *pa)
>
>  	apicid_to_node[pa->apic_id] = node;
>  	node_set(node, processor_nodes_parsed);
> -	acpi_numa = 1;
> +	set_acpi_numa(1);
>  	printk(KERN_INFO "SRAT: PXM %u -> APIC %08x -> Node %u\n",
>  	       pxm, pa->apic_id, node);
>  }
> @@ -265,7 +265,7 @@ acpi_numa_processor_affinity_init(const struct acpi_srat_cpu_affinity *pa)
>  	}
>  	apicid_to_node[pa->apic_id] = node;
>  	node_set(node, processor_nodes_parsed);
> -	acpi_numa = 1;
> +	set_acpi_numa(1);
>  	printk(KERN_INFO "SRAT: PXM %u -> APIC %02x -> Node %u\n",
>  	       pxm, pa->apic_id, node);
>  }
> @@ -418,7 +418,7 @@ static int __init srat_parse_region(struct acpi_subtable_header *header,
>  	    (ma->flags & ACPI_SRAT_MEM_NON_VOLATILE))
>  		return 0;
>
> -	if (numa_off)
> +	if (is_numa_off())
>  		printk(KERN_INFO "SRAT: %013"PRIx64"-%013"PRIx64"\n",
>  		       ma->base_address, ma->base_address + ma->length - 1);
>
> @@ -433,7 +433,7 @@ void __init srat_parse_regions(uint64_t addr)
>  	uint64_t mask;
>  	unsigned int i;
>
> -	if (acpi_disabled || acpi_numa < 0 ||
> +	if (acpi_disabled || (get_acpi_numa() == 0) ||
>  	    acpi_table_parse(ACPI_SIG_SRAT, acpi_parse_srat))
>  		return;
>
> @@ -462,7 +462,7 @@ int __init acpi_scan_nodes(uint64_t start, uint64_t end)
>  	for (i = 0; i < MAX_NUMNODES; i++)
>  		cutoff_node(i, start, end);
>
> -	if (acpi_numa <= 0)
> +	if (get_acpi_numa() == 0)
>  		return -1;
>
>  	if (!nodes_cover_memory()) {
> diff --git a/xen/include/asm-x86/acpi.h b/xen/include/asm-x86/acpi.h
> index a766688..9298d42 100644
> --- a/xen/include/asm-x86/acpi.h
> +++ b/xen/include/asm-x86/acpi.h
> @@ -103,7 +103,6 @@ extern void acpi_reserve_bootmem(void);
>
>  #define ARCH_HAS_POWER_INIT	1
>
> -extern s8 acpi_numa;
>  extern int acpi_scan_nodes(u64 start, u64 end);
>  #define NR_NODE_MEMBLKS (MAX_NUMNODES*2)
>
> diff --git a/xen/include/asm-x86/numa.h b/xen/include/asm-x86/numa.h
> index bb22bff..ae5768b 100644
> --- a/xen/include/asm-x86/numa.h
> +++ b/xen/include/asm-x86/numa.h
> @@ -30,10 +30,7 @@ extern nodeid_t pxm_to_node(unsigned int pxm);
>
>  extern void numa_add_cpu(int cpu);
>  extern void numa_init_array(void);
> -extern bool_t numa_off;
> -
> -
> -extern int srat_disabled(void);
> +extern bool srat_disabled(void);
>  extern void numa_set_node(int cpu, nodeid_t node);
>  extern nodeid_t setup_node(unsigned int pxm);
>  extern void srat_detect_node(int cpu);
> diff --git a/xen/include/xen/numa.h b/xen/include/xen/numa.h
> index 7aef1a8..7f6d090 100644
> --- a/xen/include/xen/numa.h
> +++ b/xen/include/xen/numa.h
> @@ -18,4 +18,7 @@
>    (((d)->vcpu != NULL && (d)->vcpu[0] != NULL) \
>     ? vcpu_to_node((d)->vcpu[0]) : NUMA_NO_NODE)
>
> +bool is_numa_off(void);
> +bool get_acpi_numa(void);
> +void set_acpi_numa(bool val);

I am not sure I understand why you add these helpers directly in
xen/numa.h. IMHO, they belong in the code-movement patches.

>  #endif /* _XEN_NUMA_H */
>

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [RFC PATCH v2 05/25] x86: NUMA: Move generic dummy_numa_init to separate function
  2017-03-28 15:53 ` [RFC PATCH v2 05/25] x86: NUMA: Move generic dummy_numa_init to separate function vijay.kilari
@ 2017-04-20 16:12   ` Julien Grall
  2017-04-25  6:59     ` Vijay Kilari
  2017-06-30 14:08   ` Jan Beulich
  1 sibling, 1 reply; 71+ messages in thread
From: Julien Grall @ 2017-04-20 16:12 UTC (permalink / raw)
  To: vijay.kilari, xen-devel
  Cc: sstabellini, wei.liu2, George.Dunlap, andrew.cooper3,
	ian.jackson, tim, jbeulich, Vijaya Kumar K

Hi Vijay,

On 28/03/17 16:53, vijay.kilari@gmail.com wrote:
> From: Vijaya Kumar K <Vijaya.Kumar@cavium.com>
>
> Split numa_initmem_init() so that the numa fallback code is moved
> as separate function which is generic.
>
> Signed-off-by: Vijaya Kumar K <Vijaya.Kumar@cavium.com>
> ---
>  xen/arch/x86/numa.c | 29 +++++++++++++++++------------
>  1 file changed, 17 insertions(+), 12 deletions(-)
>
> diff --git a/xen/arch/x86/numa.c b/xen/arch/x86/numa.c
> index 6b794a7..0888d53 100644
> --- a/xen/arch/x86/numa.c
> +++ b/xen/arch/x86/numa.c
> @@ -268,21 +268,10 @@ static int __init numa_emulation(uint64_t start_pfn, uint64_t end_pfn)
>  }
>  #endif
>
> -void __init numa_initmem_init(unsigned long start_pfn, unsigned long end_pfn)
> +static void __init numa_dummy_init(unsigned long start_pfn, unsigned long end_pfn)
>  {
>      int i;
>
> -#ifdef CONFIG_NUMA_EMU
> -    if ( get_numa_fake() && !numa_emulation(start_pfn, end_pfn) )
> -        return;
> -#endif
> -
> -#ifdef CONFIG_ACPI_NUMA
> -    if ( !is_numa_off() && !acpi_scan_nodes((uint64_t)start_pfn << PAGE_SHIFT,
> -         (uint64_t)end_pfn << PAGE_SHIFT) )
> -        return;
> -#endif
> -
>      printk(KERN_INFO "%s\n",
>             is_numa_off() ? "NUMA turned off" : "No NUMA configuration found");
>
> @@ -301,6 +290,22 @@ void __init numa_initmem_init(unsigned long start_pfn, unsigned long end_pfn)
>                      (paddr_t)end_pfn << PAGE_SHIFT);
>  }
>
> +void __init numa_initmem_init(unsigned long start_pfn, unsigned long end_pfn)
> +{
> +#ifdef CONFIG_NUMA_EMU
> +    if ( get_numa_fake() && !numa_emulation(start_pfn, end_pfn) )
> +        return;
> +#endif

I am not sure where to comment about this in the series, so I will say it 
here.

As asked on v1, why don't you consider fake NUMA? This would help to 
test the series on a non-NUMA platform.

> +
> +#ifdef CONFIG_ACPI_NUMA
> +    if ( !is_numa_off() && !acpi_scan_nodes((uint64_t)start_pfn << PAGE_SHIFT,
> +         (uint64_t)end_pfn << PAGE_SHIFT) )
> +        return;
> +#endif
> +
> +    numa_dummy_init(start_pfn, end_pfn);
> +}
> +
>  void numa_add_cpu(int cpu)
>  {
>      cpumask_set_cpu(cpu, &node_to_cpumask[cpu_to_node(cpu)]);
>

Cheers,

-- 
  Julien Grall


* Re: [RFC PATCH v2 04/25] x86: NUMA: Add accessors for acpi_numa, numa_off and numa_fake variables
  2017-04-20 15:59   ` Julien Grall
@ 2017-04-25  6:54     ` Vijay Kilari
  2017-04-25 12:04       ` Julien Grall
  0 siblings, 1 reply; 71+ messages in thread
From: Vijay Kilari @ 2017-04-25  6:54 UTC (permalink / raw)
  To: Julien Grall
  Cc: Tim Deegan, Stefano Stabellini, Wei Liu, George Dunlap,
	Andrew Cooper, Ian Jackson, xen-devel, Jan Beulich,
	Vijaya Kumar K

On Thu, Apr 20, 2017 at 9:29 PM, Julien Grall <julien.grall@arm.com> wrote:
> Hi Vijay,
>
> On 28/03/17 16:53, vijay.kilari@gmail.com wrote:
>>
>> From: Vijaya Kumar K <Vijaya.Kumar@cavium.com>
>>
>> Add accessor functions for acpi_numa and numa_off static
>> variables. Init value of acpi_numa is set 1 instead of 0.
>
>
> Please explain why you change the acpi_numa value from 0 to 1.

Previously acpi_numa was an s8 and used the values 0 and -1. I have made
it a bool and set the initial value to 1.

Setting it to 1 enables acpi_numa by default. If it is not enabled, the
call below bails out early, because it checks srat_disabled() before
proceeding.

acpi_numa_memory_affinity_init(const struct acpi_srat_mem_affinity *ma)
{
....

    if ( srat_disabled() )
        return;

}

>
> Also, I am not sure to understand the benefits of those helpers. Why do you
> need them? Why not using the global variable directly, this will avoid to
> call a function just for returning a value...

These helpers are used later by both common code and arch-specific code.
Hence I made these global variables static and added helpers.

>
>> Also return value of srat_disabled is changed to bool.
>>
>> Signed-off-by: Vijaya Kumar K <Vijaya.Kumar@cavium.com>
>> ---
>>  xen/arch/x86/numa.c        | 43
>> +++++++++++++++++++++++++++++++------------
>>  xen/arch/x86/setup.c       |  2 +-
>>  xen/arch/x86/srat.c        | 12 ++++++------
>>  xen/include/asm-x86/acpi.h |  1 -
>>  xen/include/asm-x86/numa.h |  5 +----
>>  xen/include/xen/numa.h     |  3 +++
>>  6 files changed, 42 insertions(+), 24 deletions(-)
>>
>> diff --git a/xen/arch/x86/numa.c b/xen/arch/x86/numa.c
>> index 964fc5a..6b794a7 100644
>> --- a/xen/arch/x86/numa.c
>> +++ b/xen/arch/x86/numa.c
>> @@ -42,12 +42,27 @@ cpumask_t __read_mostly node_to_cpumask[MAX_NUMNODES];
>>
>>  nodemask_t __read_mostly node_online_map = { { [0] = 1UL } };
>>
>> -bool numa_off = 0;
>> -s8 acpi_numa = 0;
>> +static bool numa_off = 0;
>> +static bool acpi_numa = 1;
>
>
> Please don't mix 0/1 and bool. Instead use false/true.

OK.
>
>
>>
>> -int srat_disabled(void)
>> +bool is_numa_off(void)
>>  {
>> -    return numa_off || acpi_numa < 0;
>> +    return numa_off;
>> +}
>> +
>> +bool get_acpi_numa(void)
>> +{
>> +    return acpi_numa;
>> +}
>> +
>> +void set_acpi_numa(bool_t val)
>> +{
>> +    acpi_numa = val;
>> +}
>> +
>> +bool srat_disabled(void)
>> +{
>> +    return numa_off || acpi_numa == 0;
>>  }
>>
>>  /*
>> @@ -202,13 +217,17 @@ void __init numa_init_array(void)
>>
>>  #ifdef CONFIG_NUMA_EMU
>>  static int __initdata numa_fake = 0;
>> +static int get_numa_fake(void)
>> +{
>> +    return numa_fake;
>> +}
>>
>>  /* Numa emulation */
>>  static int __init numa_emulation(uint64_t start_pfn, uint64_t end_pfn)
>>  {
>>      int i;
>>      struct node nodes[MAX_NUMNODES];
>> -    uint64_t sz = ((end_pfn - start_pfn) << PAGE_SHIFT) / numa_fake;
>> +    uint64_t sz = ((end_pfn - start_pfn) << PAGE_SHIFT) /
>> get_numa_fake();
>>
>>      /* Kludge needed for the hash function */
>>      if ( hweight64(sz) > 1 )
>> @@ -223,10 +242,10 @@ static int __init numa_emulation(uint64_t start_pfn,
>> uint64_t end_pfn)
>>      }
>>
>>      memset(&nodes,0,sizeof(nodes));
>> -    for ( i = 0; i < numa_fake; i++ )
>> +    for ( i = 0; i < get_numa_fake(); i++ )
>>      {
>>          nodes[i].start = (start_pfn << PAGE_SHIFT) + i * sz;
>> -        if ( i == numa_fake - 1 )
>> +        if ( i == get_numa_fake() - 1 )
>>              sz = (end_pfn << PAGE_SHIFT) - nodes[i].start;
>>          nodes[i].end = nodes[i].start + sz;
>>          printk(KERN_INFO
>> @@ -235,7 +254,7 @@ static int __init numa_emulation(uint64_t start_pfn,
>> uint64_t end_pfn)
>>                 (nodes[i].end - nodes[i].start) >> 20);
>>          node_set_online(i);
>>      }
>> -    if ( compute_memnode_shift(nodes, numa_fake, NULL, &memnode_shift) )
>> +    if ( compute_memnode_shift(nodes, get_numa_fake(), NULL,
>> &memnode_shift) )
>>      {
>>          memnode_shift = 0;
>>          printk(KERN_ERR "No NUMA hash function found. Emulation
>> disabled.\n");
>> @@ -254,18 +273,18 @@ void __init numa_initmem_init(unsigned long
>> start_pfn, unsigned long end_pfn)
>>      int i;
>>
>>  #ifdef CONFIG_NUMA_EMU
>> -    if ( numa_fake && !numa_emulation(start_pfn, end_pfn) )
>> +    if ( get_numa_fake() && !numa_emulation(start_pfn, end_pfn) )
>>          return;
>>  #endif
>>
>>  #ifdef CONFIG_ACPI_NUMA
>> -    if ( !numa_off && !acpi_scan_nodes((uint64_t)start_pfn << PAGE_SHIFT,
>> +    if ( !is_numa_off() && !acpi_scan_nodes((uint64_t)start_pfn <<
>> PAGE_SHIFT,
>>           (uint64_t)end_pfn << PAGE_SHIFT) )
>>          return;
>>  #endif
>>
>>      printk(KERN_INFO "%s\n",
>> -           numa_off ? "NUMA turned off" : "No NUMA configuration found");
>> +           is_numa_off() ? "NUMA turned off" : "No NUMA configuration
>> found");
>>
>>      printk(KERN_INFO "Faking a node at %016"PRIx64"-%016"PRIx64"\n",
>>             (uint64_t)start_pfn << PAGE_SHIFT,
>> @@ -312,7 +331,7 @@ static int __init numa_setup(char *opt)
>>      if ( !strncmp(opt,"noacpi",6) )
>>      {
>>          numa_off = 0;
>> -        acpi_numa = -1;
>> +        acpi_numa = 0;
>>      }
>>  #endif
>>
>> diff --git a/xen/arch/x86/setup.c b/xen/arch/x86/setup.c
>> index 1cd290e..4410e53 100644
>> --- a/xen/arch/x86/setup.c
>> +++ b/xen/arch/x86/setup.c
>> @@ -241,7 +241,7 @@ void srat_detect_node(int cpu)
>>      node_set_online(node);
>>      numa_set_node(cpu, node);
>>
>> -    if ( opt_cpu_info && acpi_numa > 0 )
>> +    if ( opt_cpu_info && get_acpi_numa() )
>>          printk("CPU %d APIC %d -> Node %d\n", cpu, apicid, node);
>>  }
>>
>> diff --git a/xen/arch/x86/srat.c b/xen/arch/x86/srat.c
>> index 2d0c047..ccacbcd 100644
>> --- a/xen/arch/x86/srat.c
>> +++ b/xen/arch/x86/srat.c
>> @@ -153,7 +153,7 @@ static void __init bad_srat(void)
>>  {
>>         int i;
>>         printk(KERN_ERR "SRAT: SRAT not used.\n");
>> -       acpi_numa = -1;
>> +       set_acpi_numa(0);
>>         for (i = 0; i < MAX_LOCAL_APIC; i++)
>>                 apicid_to_node[i] = NUMA_NO_NODE;
>>         for (i = 0; i < ARRAY_SIZE(pxm2node); i++)
>> @@ -232,7 +232,7 @@ acpi_numa_x2apic_affinity_init(const struct
>> acpi_srat_x2apic_cpu_affinity *pa)
>>
>>         apicid_to_node[pa->apic_id] = node;
>>         node_set(node, processor_nodes_parsed);
>> -       acpi_numa = 1;
>> +       set_acpi_numa(1);
>>         printk(KERN_INFO "SRAT: PXM %u -> APIC %08x -> Node %u\n",
>>                pxm, pa->apic_id, node);
>>  }
>> @@ -265,7 +265,7 @@ acpi_numa_processor_affinity_init(const struct
>> acpi_srat_cpu_affinity *pa)
>>         }
>>         apicid_to_node[pa->apic_id] = node;
>>         node_set(node, processor_nodes_parsed);
>> -       acpi_numa = 1;
>> +       set_acpi_numa(1);
>>         printk(KERN_INFO "SRAT: PXM %u -> APIC %02x -> Node %u\n",
>>                pxm, pa->apic_id, node);
>>  }
>> @@ -418,7 +418,7 @@ static int __init srat_parse_region(struct
>> acpi_subtable_header *header,
>>             (ma->flags & ACPI_SRAT_MEM_NON_VOLATILE))
>>                 return 0;
>>
>> -       if (numa_off)
>> +       if (is_numa_off())
>>                 printk(KERN_INFO "SRAT: %013"PRIx64"-%013"PRIx64"\n",
>>                        ma->base_address, ma->base_address + ma->length -
>> 1);
>>
>> @@ -433,7 +433,7 @@ void __init srat_parse_regions(uint64_t addr)
>>         uint64_t mask;
>>         unsigned int i;
>>
>> -       if (acpi_disabled || acpi_numa < 0 ||
>> +       if (acpi_disabled || (get_acpi_numa() == 0) ||
>>             acpi_table_parse(ACPI_SIG_SRAT, acpi_parse_srat))
>>                 return;
>>
>> @@ -462,7 +462,7 @@ int __init acpi_scan_nodes(uint64_t start, uint64_t
>> end)
>>         for (i = 0; i < MAX_NUMNODES; i++)
>>                 cutoff_node(i, start, end);
>>
>> -       if (acpi_numa <= 0)
>> +       if (get_acpi_numa() == 0)
>>                 return -1;
>>
>>         if (!nodes_cover_memory()) {
>> diff --git a/xen/include/asm-x86/acpi.h b/xen/include/asm-x86/acpi.h
>> index a766688..9298d42 100644
>> --- a/xen/include/asm-x86/acpi.h
>> +++ b/xen/include/asm-x86/acpi.h
>> @@ -103,7 +103,6 @@ extern void acpi_reserve_bootmem(void);
>>
>>  #define ARCH_HAS_POWER_INIT    1
>>
>> -extern s8 acpi_numa;
>>  extern int acpi_scan_nodes(u64 start, u64 end);
>>  #define NR_NODE_MEMBLKS (MAX_NUMNODES*2)
>>
>> diff --git a/xen/include/asm-x86/numa.h b/xen/include/asm-x86/numa.h
>> index bb22bff..ae5768b 100644
>> --- a/xen/include/asm-x86/numa.h
>> +++ b/xen/include/asm-x86/numa.h
>> @@ -30,10 +30,7 @@ extern nodeid_t pxm_to_node(unsigned int pxm);
>>
>>  extern void numa_add_cpu(int cpu);
>>  extern void numa_init_array(void);
>> -extern bool_t numa_off;
>> -
>> -
>> -extern int srat_disabled(void);
>> +extern bool srat_disabled(void);
>>  extern void numa_set_node(int cpu, nodeid_t node);
>>  extern nodeid_t setup_node(unsigned int pxm);
>>  extern void srat_detect_node(int cpu);
>> diff --git a/xen/include/xen/numa.h b/xen/include/xen/numa.h
>> index 7aef1a8..7f6d090 100644
>> --- a/xen/include/xen/numa.h
>> +++ b/xen/include/xen/numa.h
>> @@ -18,4 +18,7 @@
>>    (((d)->vcpu != NULL && (d)->vcpu[0] != NULL) \
>>     ? vcpu_to_node((d)->vcpu[0]) : NUMA_NO_NODE)
>>
>> +bool is_numa_off(void);
>> +bool get_acpi_numa(void);
>> +void set_acpi_numa(bool val);
>
>
> I am not sure to understand why you add those helpers directly here in
> xen/numa.h. IHMO, This should belong to the moving code patches.

To keep the code-movement patches doing only code movement, I have exported
the helpers in the patch where they are defined.

>
>
>>  #endif /* _XEN_NUMA_H */
>>
>
> --
> Julien Grall


* Re: [RFC PATCH v2 05/25] x86: NUMA: Move generic dummy_numa_init to separate function
  2017-04-20 16:12   ` Julien Grall
@ 2017-04-25  6:59     ` Vijay Kilari
  0 siblings, 0 replies; 71+ messages in thread
From: Vijay Kilari @ 2017-04-25  6:59 UTC (permalink / raw)
  To: Julien Grall
  Cc: Tim Deegan, Stefano Stabellini, Wei Liu, George Dunlap,
	Andrew Cooper, Ian Jackson, xen-devel, Jan Beulich,
	Vijaya Kumar K

On Thu, Apr 20, 2017 at 9:42 PM, Julien Grall <julien.grall@arm.com> wrote:
> Hi Vijay,
>
>
> On 28/03/17 16:53, vijay.kilari@gmail.com wrote:
>>
>> From: Vijaya Kumar K <Vijaya.Kumar@cavium.com>
>>
>> Split numa_initmem_init() so that the numa fallback code is moved
>> as separate function which is generic.
>>
>> Signed-off-by: Vijaya Kumar K <Vijaya.Kumar@cavium.com>
>> ---
>>  xen/arch/x86/numa.c | 29 +++++++++++++++++------------
>>  1 file changed, 17 insertions(+), 12 deletions(-)
>>
>> diff --git a/xen/arch/x86/numa.c b/xen/arch/x86/numa.c
>> index 6b794a7..0888d53 100644
>> --- a/xen/arch/x86/numa.c
>> +++ b/xen/arch/x86/numa.c
>> @@ -268,21 +268,10 @@ static int __init numa_emulation(uint64_t start_pfn,
>> uint64_t end_pfn)
>>  }
>>  #endif
>>
>> -void __init numa_initmem_init(unsigned long start_pfn, unsigned long
>> end_pfn)
>> +static void __init numa_dummy_init(unsigned long start_pfn, unsigned long
>> end_pfn)
>>  {
>>      int i;
>>
>> -#ifdef CONFIG_NUMA_EMU
>> -    if ( get_numa_fake() && !numa_emulation(start_pfn, end_pfn) )
>> -        return;
>> -#endif
>> -
>> -#ifdef CONFIG_ACPI_NUMA
>> -    if ( !is_numa_off() && !acpi_scan_nodes((uint64_t)start_pfn <<
>> PAGE_SHIFT,
>> -         (uint64_t)end_pfn << PAGE_SHIFT) )
>> -        return;
>> -#endif
>> -
>>      printk(KERN_INFO "%s\n",
>>             is_numa_off() ? "NUMA turned off" : "No NUMA configuration
>> found");
>>
>> @@ -301,6 +290,22 @@ void __init numa_initmem_init(unsigned long
>> start_pfn, unsigned long end_pfn)
>>                      (paddr_t)end_pfn << PAGE_SHIFT);
>>  }
>>
>> +void __init numa_initmem_init(unsigned long start_pfn, unsigned long
>> end_pfn)
>> +{
>> +#ifdef CONFIG_NUMA_EMU
>> +    if ( get_numa_fake() && !numa_emulation(start_pfn, end_pfn) )
>> +        return;
>> +#endif
>
>
> I am not sure where to comment about it in this series, so I will say it
> here.
>
> As asked on v1, why don't you consider fake NUMA? This would help to test
> the series on non-NUMA platform.

I have not tested the non-NUMA case with this series. Agreed, these two
lines should be added to numa_initmem_init() for ARM
(xen/arch/arm/numa/numa.c).
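
For reference, the split that fake NUMA performs can be sketched in
isolation. This is a minimal standalone model, not the Xen code:
fake_numa_split(), struct fake_node, MAX_FAKE_NODES and the 4 KiB
PAGE_SHIFT are names and values assumed here for illustration. It follows
the arithmetic of numa_emulation() as quoted in patch 04: divide the range
evenly, round the per-node size down to a power of two (my reading of the
"kludge needed for the hash function"), and let the last node take the
remainder.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT     12   /* assumption: 4 KiB pages */
#define MAX_FAKE_NODES 8    /* hypothetical limit for this sketch */

struct fake_node {
    uint64_t start;         /* first byte of the node */
    uint64_t end;           /* one past the last byte of the node */
};

/* Largest power of two that is <= sz (sz must be non-zero). */
static uint64_t round_down_pow2(uint64_t sz)
{
    uint64_t x = 1;

    while ( (x << 1) != 0 && (x << 1) <= sz )
        x <<= 1;
    return x;
}

/* Split [start_pfn, end_pfn) into nr_nodes fake nodes; 0 on success. */
static int fake_numa_split(uint64_t start_pfn, uint64_t end_pfn,
                           unsigned int nr_nodes, struct fake_node *nodes)
{
    uint64_t start = start_pfn << PAGE_SHIFT;
    uint64_t end = end_pfn << PAGE_SHIFT;
    uint64_t sz;
    unsigned int i;

    assert(nodes != NULL);
    if ( nr_nodes == 0 || nr_nodes > MAX_FAKE_NODES || end <= start )
        return -1;

    sz = (end - start) / nr_nodes;
    if ( sz == 0 )
        return -1;
    sz = round_down_pow2(sz);

    for ( i = 0; i < nr_nodes; i++ )
    {
        nodes[i].start = start + i * sz;
        /* The last node absorbs whatever is left of the range. */
        nodes[i].end = (i == nr_nodes - 1) ? end : nodes[i].start + sz;
    }
    return 0;
}
```

With three nodes over 4 GiB, the per-node size rounds down to 1 GiB, so
the last node ends up twice as large as the others. Nothing in the split
itself depends on ACPI or DT, which is why it is usable for testing on a
non-NUMA platform.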

>
>> +
>> +#ifdef CONFIG_ACPI_NUMA
>> +    if ( !is_numa_off() && !acpi_scan_nodes((uint64_t)start_pfn <<
>> PAGE_SHIFT,
>> +         (uint64_t)end_pfn << PAGE_SHIFT) )
>> +        return;
>> +#endif
>> +
>> +    numa_dummy_init(start_pfn, end_pfn);
>> +}
>> +
>>  void numa_add_cpu(int cpu)
>>  {
>>      cpumask_set_cpu(cpu, &node_to_cpumask[cpu_to_node(cpu)]);
>>
>
> Cheers,
>
> --
>  Julien Grall


* Re: [RFC PATCH v2 04/25] x86: NUMA: Add accessors for acpi_numa, numa_off and numa_fake variables
  2017-04-25  6:54     ` Vijay Kilari
@ 2017-04-25 12:04       ` Julien Grall
  2017-04-25 12:20         ` Vijay Kilari
  0 siblings, 1 reply; 71+ messages in thread
From: Julien Grall @ 2017-04-25 12:04 UTC (permalink / raw)
  To: Vijay Kilari
  Cc: Tim Deegan, Stefano Stabellini, Wei Liu, George Dunlap,
	Andrew Cooper, Ian Jackson, xen-devel, Jan Beulich,
	Vijaya Kumar K

Hello Vijay,

On 25/04/17 07:54, Vijay Kilari wrote:
> On Thu, Apr 20, 2017 at 9:29 PM, Julien Grall <julien.grall@arm.com> wrote:
>> Hi Vijay,
>>
>> On 28/03/17 16:53, vijay.kilari@gmail.com wrote:
>>>
>>> From: Vijaya Kumar K <Vijaya.Kumar@cavium.com>
>>>
>>> Add accessor functions for acpi_numa and numa_off static
>>> variables. Init value of acpi_numa is set 1 instead of 0.
>>
>>
>> Please explain why you change the acpi_numa value from 0 to 1.
>
> previously acpi_numa was s8 and are using 0 and -1 values. I have made
> it bool and set
> the initial value to 1.

Are you sure? From a quick grep, it looks like acpi_numa can have the 
values 0, -1 and 1.

>
> By setting 1, we are enabling acpi_numa by default. If not enabled, the below
> call has check srat_disabled() before proceeding fails.

My understanding is that on x86 acpi_numa is disabled by default and is 
enabled only if the SRAT is parsed successfully. So why are you changing 
the behavior for x86?

>
> acpi_numa_memory_affinity_init(const struct acpi_srat_mem_affinity *ma)
> {
> ....
>
>     if ( srat_disabled() )
>         return;
>
> }
>
>>
>> Also, I am not sure to understand the benefits of those helpers. Why do you
>> need them? Why not using the global variable directly, this will avoid to
>> call a function just for returning a value...
>
> These helpers are used by both common code and arch specific code later.
> Hence made these global variables as static and added helpers

The variables were global so they could already be used anywhere.

Furthermore, those helpers are not even inlined, so every time you want 
to read acpi_numa you have to do a branch and then a memory access.

So what is the point of doing that?
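
To illustrate the inlining point, here is a minimal sketch, assuming the
flag is kept as an ordinary global and the accessor moves into a header as
a static inline. The single-file layout and accessor_selfcheck() are
hypothetical and exist only so the example is self-contained; this is not
what the patch does.

```c
#include <assert.h>
#include <stdbool.h>

/* Would live in a shared header: the flag stays an ordinary global. */
extern bool numa_off;

/* Inlined accessor: the compiler can reduce this to a plain load. */
static inline bool is_numa_off(void)
{
    return numa_off;
}

/* Would live in exactly one .c file. */
bool numa_off = false;

/* Hypothetical self-check, only here to exercise the accessor. */
static void accessor_selfcheck(void)
{
    numa_off = false;
    assert(!is_numa_off());
    numa_off = true;
    assert(is_numa_off());
    numa_off = false;
}
```

The trade-off is that an inline accessor in a header needs the variable to
be visible, so it cannot be combined with making the variable static in
numa.c.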

>>> diff --git a/xen/include/asm-x86/acpi.h b/xen/include/asm-x86/acpi.h
>>> index a766688..9298d42 100644
>>> --- a/xen/include/asm-x86/acpi.h
>>> +++ b/xen/include/asm-x86/acpi.h
>>> @@ -103,7 +103,6 @@ extern void acpi_reserve_bootmem(void);
>>>
>>>  #define ARCH_HAS_POWER_INIT    1
>>>
>>> -extern s8 acpi_numa;
>>>  extern int acpi_scan_nodes(u64 start, u64 end);
>>>  #define NR_NODE_MEMBLKS (MAX_NUMNODES*2)
>>>
>>> diff --git a/xen/include/asm-x86/numa.h b/xen/include/asm-x86/numa.h
>>> index bb22bff..ae5768b 100644
>>> --- a/xen/include/asm-x86/numa.h
>>> +++ b/xen/include/asm-x86/numa.h
>>> @@ -30,10 +30,7 @@ extern nodeid_t pxm_to_node(unsigned int pxm);
>>>
>>>  extern void numa_add_cpu(int cpu);
>>>  extern void numa_init_array(void);
>>> -extern bool_t numa_off;
>>> -
>>> -
>>> -extern int srat_disabled(void);
>>> +extern bool srat_disabled(void);
>>>  extern void numa_set_node(int cpu, nodeid_t node);
>>>  extern nodeid_t setup_node(unsigned int pxm);
>>>  extern void srat_detect_node(int cpu);
>>> diff --git a/xen/include/xen/numa.h b/xen/include/xen/numa.h
>>> index 7aef1a8..7f6d090 100644
>>> --- a/xen/include/xen/numa.h
>>> +++ b/xen/include/xen/numa.h
>>> @@ -18,4 +18,7 @@
>>>    (((d)->vcpu != NULL && (d)->vcpu[0] != NULL) \
>>>     ? vcpu_to_node((d)->vcpu[0]) : NUMA_NO_NODE)
>>>
>>> +bool is_numa_off(void);
>>> +bool get_acpi_numa(void);
>>> +void set_acpi_numa(bool val);
>>
>>
>> I am not sure to understand why you add those helpers directly here in
>> xen/numa.h. IHMO, This should belong to the moving code patches.
>
> To have code moving patches doing only code movement, I have exported
> in the patch it is defined.

Sorry but what are prototypes if not code?

The point of moving the prototypes in the patch which moves the actual 
declarations is to help the reviewers check that you correctly moved 
everything.

>
>>
>>
>>>  #endif /* _XEN_NUMA_H */
>>>
>>
>> --
>> Julien Grall

-- 
Julien Grall


* Re: [RFC PATCH v2 04/25] x86: NUMA: Add accessors for acpi_numa, numa_off and numa_fake variables
  2017-04-25 12:04       ` Julien Grall
@ 2017-04-25 12:20         ` Vijay Kilari
  2017-04-25 12:28           ` Julien Grall
  0 siblings, 1 reply; 71+ messages in thread
From: Vijay Kilari @ 2017-04-25 12:20 UTC (permalink / raw)
  To: Julien Grall
  Cc: Tim Deegan, Stefano Stabellini, Wei Liu, George Dunlap,
	Andrew Cooper, Ian Jackson, xen-devel, Jan Beulich,
	Vijaya Kumar K

On Tue, Apr 25, 2017 at 5:34 PM, Julien Grall <julien.grall@arm.com> wrote:
> Hello Vijay,
>
> On 25/04/17 07:54, Vijay Kilari wrote:
>>
>> On Thu, Apr 20, 2017 at 9:29 PM, Julien Grall <julien.grall@arm.com>
>> wrote:
>>>
>>> Hi Vijay,
>>>
>>> On 28/03/17 16:53, vijay.kilari@gmail.com wrote:
>>>>
>>>>
>>>> From: Vijaya Kumar K <Vijaya.Kumar@cavium.com>
>>>>
>>>> Add accessor functions for acpi_numa and numa_off static
>>>> variables. Init value of acpi_numa is set 1 instead of 0.
>>>
>>>
>>>
>>> Please explain why you change the acpi_numa value from 0 to 1.
>>
>>
>> previously acpi_numa was s8 and are using 0 and -1 values. I have made
>> it bool and set
>> the initial value to 1.
>
>
> Are you sure? With a quick grep I spot it sounds like acpi_numa can have the
> value 0, -1, 1.
>

Hmm, but I don't see any use in having three values (0, -1 and 1) just to
say enabled or disabled.

>>
>> By setting 1, we are enabling acpi_numa by default. If not enabled, the
>> below
>> call has check srat_disabled() before proceeding fails.
>
>
> My understanding is on x86 acpi_numa is disabled by default and will be
> enabled if they are able to parse the SRAT. So why are you changing the
> behavior for x86?

acpi_numa = 0 means it is enabled by default on x86.

>
>>
>> acpi_numa_memory_affinity_init(const struct acpi_srat_mem_affinity *ma)
>> {
>> ....
>>
>>     if ( srat_disabled() )
>>         return;
>>
>> }
>>
>>>
>>> Also, I am not sure to understand the benefits of those helpers. Why do
>>> you
>>> need them? Why not using the global variable directly, this will avoid to
>>> call a function just for returning a value...
>>
>>
>> These helpers are used by both common code and arch specific code later.
>> Hence made these global variables as static and added helpers
>
>
> The variables were global so they could already be used anywhere.
>
> Furthermore, those helpers are not even inlined, so everytime you want to
> access read acpi_numa you have to do a branch then a memory access.
>
> So what is the point to do that?

I agree with making them inline. But I don't think there is any harm in
making the variables static and adding helpers.

>
>
>>>> diff --git a/xen/include/asm-x86/acpi.h b/xen/include/asm-x86/acpi.h
>>>> index a766688..9298d42 100644
>>>> --- a/xen/include/asm-x86/acpi.h
>>>> +++ b/xen/include/asm-x86/acpi.h
>>>> @@ -103,7 +103,6 @@ extern void acpi_reserve_bootmem(void);
>>>>
>>>>  #define ARCH_HAS_POWER_INIT    1
>>>>
>>>> -extern s8 acpi_numa;
>>>>  extern int acpi_scan_nodes(u64 start, u64 end);
>>>>  #define NR_NODE_MEMBLKS (MAX_NUMNODES*2)
>>>>
>>>> diff --git a/xen/include/asm-x86/numa.h b/xen/include/asm-x86/numa.h
>>>> index bb22bff..ae5768b 100644
>>>> --- a/xen/include/asm-x86/numa.h
>>>> +++ b/xen/include/asm-x86/numa.h
>>>> @@ -30,10 +30,7 @@ extern nodeid_t pxm_to_node(unsigned int pxm);
>>>>
>>>>  extern void numa_add_cpu(int cpu);
>>>>  extern void numa_init_array(void);
>>>> -extern bool_t numa_off;
>>>> -
>>>> -
>>>> -extern int srat_disabled(void);
>>>> +extern bool srat_disabled(void);
>>>>  extern void numa_set_node(int cpu, nodeid_t node);
>>>>  extern nodeid_t setup_node(unsigned int pxm);
>>>>  extern void srat_detect_node(int cpu);
>>>> diff --git a/xen/include/xen/numa.h b/xen/include/xen/numa.h
>>>> index 7aef1a8..7f6d090 100644
>>>> --- a/xen/include/xen/numa.h
>>>> +++ b/xen/include/xen/numa.h
>>>> @@ -18,4 +18,7 @@
>>>>    (((d)->vcpu != NULL && (d)->vcpu[0] != NULL) \
>>>>     ? vcpu_to_node((d)->vcpu[0]) : NUMA_NO_NODE)
>>>>
>>>> +bool is_numa_off(void);
>>>> +bool get_acpi_numa(void);
>>>> +void set_acpi_numa(bool val);
>>>
>>>
>>>
>>> I am not sure to understand why you add those helpers directly here in
>>> xen/numa.h. IHMO, This should belong to the moving code patches.
>>
>>
>> To have code moving patches doing only code movement, I have exported
>> in the patch it is defined.
>
>
> Sorry but what are prototypes if not code?
>
> The point of moving the prototypes in the patch which move the actual
> declarations is helping the reviewers to check if you correctly moved
> everything.

I am ok if it helps in review.

>
>
>>
>>>
>>>
>>>>  #endif /* _XEN_NUMA_H */
>>>>
>>>
>>> --
>>> Julien Grall
>
>
> --
> Julien Grall


* Re: [RFC PATCH v2 04/25] x86: NUMA: Add accessors for acpi_numa, numa_off and numa_fake variables
  2017-04-25 12:20         ` Vijay Kilari
@ 2017-04-25 12:28           ` Julien Grall
  2017-04-25 14:54             ` Vijay Kilari
  0 siblings, 1 reply; 71+ messages in thread
From: Julien Grall @ 2017-04-25 12:28 UTC (permalink / raw)
  To: Vijay Kilari
  Cc: Tim Deegan, Stefano Stabellini, Wei Liu, George Dunlap,
	Andrew Cooper, Ian Jackson, xen-devel, Jan Beulich,
	Vijaya Kumar K



On 25/04/17 13:20, Vijay Kilari wrote:
> On Tue, Apr 25, 2017 at 5:34 PM, Julien Grall <julien.grall@arm.com> wrote:
>> Hello Vijay,
>>
>> On 25/04/17 07:54, Vijay Kilari wrote:
>>>
>>> On Thu, Apr 20, 2017 at 9:29 PM, Julien Grall <julien.grall@arm.com>
>>> wrote:
>>>>
>>>> Hi Vijay,
>>>>
>>>> On 28/03/17 16:53, vijay.kilari@gmail.com wrote:
>>>>>
>>>>>
>>>>> From: Vijaya Kumar K <Vijaya.Kumar@cavium.com>
>>>>>
>>>>> Add accessor functions for acpi_numa and numa_off static
>>>>> variables. Init value of acpi_numa is set 1 instead of 0.
>>>>
>>>>
>>>>
>>>> Please explain why you change the acpi_numa value from 0 to 1.
>>>
>>>
>>> previously acpi_numa was s8 and are using 0 and -1 values. I have made
>>> it bool and set
>>> the initial value to 1.
>>
>>
>> Are you sure? With a quick grep I spot it sounds like acpi_numa can have the
>> value 0, -1, 1.
>>
>
> Hmm.. But I don't see use of having 0, -1 and 1. But I don't see any use of
> having 3 values to say enable or disable.

Then explain why in the commit message, and don't leave people to 
discover it. If you have not done so already, I would recommend reading:

https://wiki.xenproject.org/wiki/Submitting_Xen_Project_Patches

>
>>>
>>> By setting 1, we are enabling acpi_numa by default. If not enabled, the
>>> below
>>> call has check srat_disabled() before proceeding fails.
>>
>>
>> My understanding is on x86 acpi_numa is disabled by default and will be
>> enabled if they are able to parse the SRAT. So why are you changing the
>> behavior for x86?
>
> acpi_numa = 0 means it is enabled by default on x86.

In acpi_scan_nodes:

if (acpi_numa <= 0)
   return -1;

So it does not seem that 0 means enabled.
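
To make this concrete, the two pre-patch checks can be modelled side by
side. This is a minimal sketch, not Xen code: old_srat_disabled() and
old_scan_nodes_bails() are hypothetical names, and the flags are passed as
arguments instead of being read from globals.

```c
#include <assert.h>
#include <stdbool.h>

/* Pre-patch srat_disabled(): only a negative acpi_numa disables SRAT. */
static bool old_srat_disabled(bool numa_off, int acpi_numa)
{
    assert(acpi_numa >= -1 && acpi_numa <= 1);
    return numa_off || acpi_numa < 0;
}

/* Pre-patch check in acpi_scan_nodes(): true means "return -1". */
static bool old_scan_nodes_bails(int acpi_numa)
{
    assert(acpi_numa >= -1 && acpi_numa <= 1);
    return acpi_numa <= 0;
}
```

For acpi_numa == 0 the first check lets SRAT parsing go ahead, while the
second still makes acpi_scan_nodes() fail. So 0 behaves differently from
both -1 and 1.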

>
>>
>>>
>>> acpi_numa_memory_affinity_init(const struct acpi_srat_mem_affinity *ma)
>>> {
>>> ....
>>>
>>>     if ( srat_disabled() )
>>>         return;
>>>
>>> }
>>>
>>>>
>>>> Also, I am not sure to understand the benefits of those helpers. Why do
>>>> you
>>>> need them? Why not using the global variable directly, this will avoid to
>>>> call a function just for returning a value...
>>>
>>>
>>> These helpers are used by both common code and arch specific code later.
>>> Hence made these global variables as static and added helpers
>>
>>
>> The variables were global so they could already be used anywhere.
>>
>> Furthermore, those helpers are not even inlined, so everytime you want to
>> access read acpi_numa you have to do a branch then a memory access.
>>
>> So what is the point to do that?
>
> I agree with making inline. But I don't think there is any harm in making them
> static and adding helpers.

But why? Why don't you keep the code as it is? You modify code without 
any justification and not for the better.

Cheers,

-- 
Julien Grall


* Re: [RFC PATCH v2 04/25] x86: NUMA: Add accessors for acpi_numa, numa_off and numa_fake variables
  2017-04-25 12:28           ` Julien Grall
@ 2017-04-25 14:54             ` Vijay Kilari
  2017-04-25 15:14               ` Julien Grall
  0 siblings, 1 reply; 71+ messages in thread
From: Vijay Kilari @ 2017-04-25 14:54 UTC (permalink / raw)
  To: Julien Grall
  Cc: Tim Deegan, Stefano Stabellini, Wei Liu, George Dunlap,
	Andrew Cooper, Ian Jackson, xen-devel, Jan Beulich,
	Vijaya Kumar K

On Tue, Apr 25, 2017 at 5:58 PM, Julien Grall <julien.grall@arm.com> wrote:
>
>
> On 25/04/17 13:20, Vijay Kilari wrote:
>>
>> On Tue, Apr 25, 2017 at 5:34 PM, Julien Grall <julien.grall@arm.com>
>> wrote:
>>>
>>> Hello Vijay,
>>>
>>> On 25/04/17 07:54, Vijay Kilari wrote:
>>>>
>>>>
>>>> On Thu, Apr 20, 2017 at 9:29 PM, Julien Grall <julien.grall@arm.com>
>>>> wrote:
>>>>>
>>>>>
>>>>> Hi Vijay,
>>>>>
>>>>> On 28/03/17 16:53, vijay.kilari@gmail.com wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>> From: Vijaya Kumar K <Vijaya.Kumar@cavium.com>
>>>>>>
>>>>>> Add accessor functions for acpi_numa and numa_off static
>>>>>> variables. Init value of acpi_numa is set 1 instead of 0.
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> Please explain why you change the acpi_numa value from 0 to 1.
>>>>
>>>>
>>>>
>>>> previously acpi_numa was s8 and are using 0 and -1 values. I have made
>>>> it bool and set
>>>> the initial value to 1.
>>>
>>>
>>>
>>> Are you sure? With a quick grep I spot it sounds like acpi_numa can have
>>> the
>>> value 0, -1, 1.
>>>
>>
>> Hmm.. But I don't see use of having 0, -1 and 1. But I don't see any use
>> of
>> having 3 values to say enable or disable.
>
>
> Then explain why in the commit message and don't let people discover. If you
> have not done it, I would recommend to read:
>
> https://wiki.xenproject.org/wiki/Submitting_Xen_Project_Patches
>
>>
>>>>
>>>> By setting 1, we are enabling acpi_numa by default. If not enabled, the
>>>> below
>>>> call has check srat_disabled() before proceeding fails.
>>>
>>>
>>>
>>> My understanding is on x86 acpi_numa is disabled by default and will be
>>> enabled if they are able to parse the SRAT. So why are you changing the
>>> behavior for x86?
>>
>>
>> acpi_numa = 0 means it is enabled by default on x86.
>
>
> In acpi_scan_nodes:
>
> if (acpi_numa <= 0)
>   return -1;
>
> So it does not seem that 0 means enabled.

IMO, on x86:
         -1 means disabled
          0 means enabled but NUMA not initialized
          1 means enabled and NUMA initialized

I merged 0 and 1.

I was referring to the code below in x86, where acpi_numa = 0 means that
srat_disabled() returns false, which means acpi_numa is enabled implicitly.

int srat_disabled(void)
{
      return numa_off || acpi_numa < 0;
}

When I changed acpi_numa to bool, I initialized it to 1 and changed the
code as below.

bool srat_disabled(void)
{
    return numa_off || acpi_numa == 0;
}

Also, srat_disabled() is used in acpi_numa_memory_affinity_init(), which is
called from acpi_numa_init() before acpi_scan_nodes() is called.

>
>>
>>>
>>>>
>>>> acpi_numa_memory_affinity_init(const struct acpi_srat_mem_affinity *ma)
>>>> {
>>>> ....
>>>>
>>>>     if ( srat_disabled() )
>>>>         return;
>>>>
>>>> }
>>>>
>>>>>
>>>>> Also, I am not sure to understand the benefits of those helpers. Why do
>>>>> you
>>>>> need them? Why not using the global variable directly, this will avoid
>>>>> to
>>>>> call a function just for returning a value...
>>>>
>>>>
>>>>
>>>> These helpers are used by both common code and arch specific code later.
>>>> Hence made these global variables as static and added helpers
>>>
>>>
>>>
>>> The variables were global so they could already be used anywhere.
>>>
>>> Furthermore, those helpers are not even inlined, so everytime you want to
>>> access read acpi_numa you have to do a branch then a memory access.
>>>
>>> So what is the point to do that?
>>
>>
>> I agree with making inline. But I don't think there is any harm in making
>> them
>> static and adding helpers.
>
>
> But why? Why don't you keep the code as it is? You modify code without any
> justification and not for the better.
>
> Cheers,
>
> --
> Julien Grall

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [RFC PATCH v2 04/25] x86: NUMA: Add accessors for acpi_numa, numa_off and numa_fake variables
  2017-04-25 14:54             ` Vijay Kilari
@ 2017-04-25 15:14               ` Julien Grall
  2017-04-25 15:43                 ` Jan Beulich
  0 siblings, 1 reply; 71+ messages in thread
From: Julien Grall @ 2017-04-25 15:14 UTC (permalink / raw)
  To: Vijay Kilari
  Cc: Tim Deegan, Stefano Stabellini, Wei Liu, George Dunlap,
	Andrew Cooper, Ian Jackson, xen-devel, Jan Beulich,
	Vijaya Kumar K



On 25/04/17 15:54, Vijay Kilari wrote:
> On Tue, Apr 25, 2017 at 5:58 PM, Julien Grall <julien.grall@arm.com> wrote:
>>>>>
>>>>> By setting 1, we are enabling acpi_numa by default. If not enabled, the
>>>>> below
>>>>> call has check srat_disabled() before proceeding fails.
>>>>
>>>>
>>>>
>>>> My understanding is on x86 acpi_numa is disabled by default and will be
>>>> enabled if they are able to parse the SRAT. So why are you changing the
>>>> behavior for x86?
>>>
>>>
>>> acpi_numa = 0 means it is enabled by default on x86.
>>
>>
>> In acpi_scan_nodes:
>>
>> if (acpi_numa <= 0)
>>   return -1;
>>
>> So it does not seem that 0 means enabled.
>
> IMO, In x86
>          -1 means disabled
>           0 enabled but not numa initialized
>           1 enabled and numa initialized.
>
> I clubbed 0 & 1.

From your description, 0 and 1 have different meanings, so I don't see
how you can merge them that easily without any explanation.

Anyway, I will leave it to the x86 maintainers to give their opinion here.

>
> I was referring to below code in x86 where in acpi_numa = 0 means
> srat_disabled() returns false. Which means acpi_numa is enabled implicitly.
>
> int srat_disabled(void)
> {
>       return numa_off || acpi_numa < 0;
> }
>
> When I changed acpi_numa to bool, it is initialized to 1 and changed
> below code.
>
> bool srat_disabled(void)
> {
>     return numa_off || acpi_numa == 0;
> }
>
> Also this srat_disabed() is used in acpi_numa_memory_affinity_init which is
> called from acpi_numa_init() before calling acpi_scan_nodes().



-- 
Julien Grall


* Re: [RFC PATCH v2 04/25] x86: NUMA: Add accessors for acpi_numa, numa_off and numa_fake variables
  2017-04-25 15:14               ` Julien Grall
@ 2017-04-25 15:43                 ` Jan Beulich
  2017-05-02  9:47                   ` Vijay Kilari
  0 siblings, 1 reply; 71+ messages in thread
From: Jan Beulich @ 2017-04-25 15:43 UTC (permalink / raw)
  To: Julien Grall, Vijay Kilari
  Cc: Tim Deegan, Stefano Stabellini, Wei Liu, George Dunlap,
	Andrew Cooper, Ian Jackson, Vijaya Kumar K, xen-devel

>>> On 25.04.17 at 17:14, <julien.grall@arm.com> wrote:
> On 25/04/17 15:54, Vijay Kilari wrote:
>> On Tue, Apr 25, 2017 at 5:58 PM, Julien Grall <julien.grall@arm.com> wrote:
>>>>>>
>>>>>> By setting 1, we are enabling acpi_numa by default. If not enabled, the
>>>>>> below
>>>>>> call has check srat_disabled() before proceeding fails.
>>>>>
>>>>>
>>>>>
>>>>> My understanding is on x86 acpi_numa is disabled by default and will be
>>>>> enabled if they are able to parse the SRAT. So why are you changing the
>>>>> behavior for x86?
>>>>
>>>>
>>>> acpi_numa = 0 means it is enabled by default on x86.
>>>
>>>
>>> In acpi_scan_nodes:
>>>
>>> if (acpi_numa <= 0)
>>>   return -1;
>>>
>>> So it does not seem that 0 means enabled.
>>
>> IMO, In x86
>>          -1 means disabled
>>           0 enabled but not numa initialized
>>           1 enabled and numa initialized.
>>
>> I clubbed 0 & 1.
> 
>  From your description 0 and 1 have different meaning, so I don't see 
> how you can merge them that easily without any explanation.
> 
> Anyway, I will leave x86 maintainers give their opinion here.

I'm pretty certain this needs to remain a tristate.

Jan



* Re: [RFC PATCH v2 04/25] x86: NUMA: Add accessors for acpi_numa, numa_off and numa_fake variables
  2017-04-25 15:43                 ` Jan Beulich
@ 2017-05-02  9:47                   ` Vijay Kilari
  2017-05-02  9:54                     ` Jan Beulich
  2017-05-08 17:38                     ` Julien Grall
  0 siblings, 2 replies; 71+ messages in thread
From: Vijay Kilari @ 2017-05-02  9:47 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Tim Deegan, Stefano Stabellini, Wei Liu, George Dunlap,
	Andrew Cooper, Ian Jackson, Vijaya Kumar K, Julien Grall,
	xen-devel

On Tue, Apr 25, 2017 at 9:13 PM, Jan Beulich <JBeulich@suse.com> wrote:
>>>> On 25.04.17 at 17:14, <julien.grall@arm.com> wrote:
>> On 25/04/17 15:54, Vijay Kilari wrote:
>>> On Tue, Apr 25, 2017 at 5:58 PM, Julien Grall <julien.grall@arm.com> wrote:
>>>>>>>
>>>>>>> By setting 1, we are enabling acpi_numa by default. If not enabled, the
>>>>>>> below
>>>>>>> call has check srat_disabled() before proceeding fails.
>>>>>>
>>>>>>
>>>>>>
>>>>>> My understanding is on x86 acpi_numa is disabled by default and will be
>>>>>> enabled if they are able to parse the SRAT. So why are you changing the
>>>>>> behavior for x86?
>>>>>
>>>>>
>>>>> acpi_numa = 0 means it is enabled by default on x86.
>>>>
>>>>
>>>> In acpi_scan_nodes:
>>>>
>>>> if (acpi_numa <= 0)
>>>>   return -1;
>>>>
>>>> So it does not seem that 0 means enabled.
>>>
>>> IMO, In x86
>>>          -1 means disabled
>>>           0 enabled but not numa initialized
>>>           1 enabled and numa initialized.
>>>
>>> I clubbed 0 & 1.
>>
>>  From your description 0 and 1 have different meaning, so I don't see
>> how you can merge them that easily without any explanation.
>>
>> Anyway, I will leave x86 maintainers give their opinion here.
>
> I'm pretty certain this needs to remain a tristate.

OK. I will drop this patch from this series; it can be fixed
outside this series.
BTW, are there any review comments on the remaining patches?


* Re: [RFC PATCH v2 04/25] x86: NUMA: Add accessors for acpi_numa, numa_off and numa_fake variables
  2017-05-02  9:47                   ` Vijay Kilari
@ 2017-05-02  9:54                     ` Jan Beulich
  2017-05-08 17:38                     ` Julien Grall
  1 sibling, 0 replies; 71+ messages in thread
From: Jan Beulich @ 2017-05-02  9:54 UTC (permalink / raw)
  To: Vijay Kilari
  Cc: Tim Deegan, Stefano Stabellini, Wei Liu, George Dunlap,
	Andrew Cooper, Ian Jackson, Vijaya Kumar K, Julien Grall,
	xen-devel

>>> On 02.05.17 at 11:47, <vijay.kilari@gmail.com> wrote:
> BTW, any review comments on remaining patches?

I didn't even manage to get to the start of the flood of RFCs
posted during the last 1.5 months (which priority-wise all sit behind
the various non-RFC postings), so I can't predict when I'll get to
yours. Also please remember that the focus right now is to make
4.9 good quality ...

Jan



* Re: [RFC PATCH v2 06/25] x86: NUMA: Add accessors for nodes[] and node_memblk_range[] structs
  2017-03-28 15:53 ` [RFC PATCH v2 06/25] x86: NUMA: Add accessors for nodes[] and node_memblk_range[] structs vijay.kilari
@ 2017-05-08 14:39   ` Julien Grall
  2017-05-09  7:02     ` Vijay Kilari
  0 siblings, 1 reply; 71+ messages in thread
From: Julien Grall @ 2017-05-08 14:39 UTC (permalink / raw)
  To: vijay.kilari, xen-devel
  Cc: sstabellini, wei.liu2, George.Dunlap, andrew.cooper3,
	ian.jackson, tim, jbeulich, Vijaya Kumar K

Hi Vijay,

On 28/03/17 16:53, vijay.kilari@gmail.com wrote:
> From: Vijaya Kumar K <Vijaya.Kumar@cavium.com>
>
> Add accessor for nodes[] and other static variables and

s/accessor/accessors/

> used those accessors.

Also, I am not sure I understand the usefulness of those accessors over
a global variable.

> Signed-off-by: Vijaya Kumar K <Vijaya.Kumar@cavium.com>
> ---
>  xen/arch/x86/srat.c | 108 +++++++++++++++++++++++++++++++++++++++-------------
>  1 file changed, 82 insertions(+), 26 deletions(-)
>
> diff --git a/xen/arch/x86/srat.c b/xen/arch/x86/srat.c
> index ccacbcd..983e1d8 100644
> --- a/xen/arch/x86/srat.c
> +++ b/xen/arch/x86/srat.c
> @@ -41,7 +41,45 @@ static struct node node_memblk_range[NR_NODE_MEMBLKS];
>  static nodeid_t memblk_nodeid[NR_NODE_MEMBLKS];
>  static __initdata DECLARE_BITMAP(memblk_hotplug, NR_NODE_MEMBLKS);
>
> -static inline bool node_found(unsigned idx, unsigned pxm)
> +static struct node *get_numa_node(int id)

unsigned int.

> +{
> +	return &nodes[id];
> +}
> +
> +static nodeid_t get_memblk_nodeid(int id)

unsigned int.

> +{
> +	return memblk_nodeid[id];
> +}
> +
> +static nodeid_t *get_memblk_nodeid_map(void)
> +{
> +	return &memblk_nodeid[0];
> +}
> +
> +static struct node *get_node_memblk_range(int memblk)

unsigned int.
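
As a rough illustration of why an unsigned index is presumably preferable
for these accessors (a standalone sketch, not the Xen code; NR_BLKS,
blk_range and get_blk_range() are made-up names): with "unsigned int" a
single upper-bound comparison rejects every invalid value, whereas a
signed "int" would also need a check against negative values. Note that
the accessors in the patch do not perform any bounds check; the check is
added here only to show the point.

```c
#include <stddef.h>

#define NR_BLKS 8  /* hypothetical array size for this sketch */

struct node {
    unsigned long start, end;
};

static struct node blk_range[NR_BLKS];

/* Unsigned index: a single comparison rejects every out-of-range value. */
static struct node *get_blk_range(unsigned int idx)
{
    return idx < NR_BLKS ? &blk_range[idx] : NULL;
}
```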

> +{
> +	return &node_memblk_range[memblk];
> +}
> +
> +static int get_num_node_memblks(void)
> +{
> +	return num_node_memblks;
> +}
> +
> +static int __init numa_add_memblk(nodeid_t nodeid, paddr_t start, uint64_t size)
> +{
> +	if (nodeid >= NR_NODE_MEMBLKS)
> +		return -EINVAL;
> +
> +	node_memblk_range[num_node_memblks].start = start;
> +	node_memblk_range[num_node_memblks].end = start + size;
> +	memblk_nodeid[num_node_memblks] = nodeid;
> +	num_node_memblks++;
> +
> +	return 0;
> +}
> +
> +static inline bool node_found(unsigned int idx, unsigned int pxm)

Please don't make unrelated changes in the same patch. In this case I 
don't see why you switch from "unsigned" to "unsigned int".

>  {
>  	return ((pxm2node[idx].pxm == pxm) &&
>  		(pxm2node[idx].node != NUMA_NO_NODE));
> @@ -107,11 +145,11 @@ int valid_numa_range(paddr_t start, paddr_t end, nodeid_t node)
>  {
>  	int i;
>
> -	for (i = 0; i < num_node_memblks; i++) {
> -		struct node *nd = &node_memblk_range[i];
> +	for (i = 0; i < get_num_node_memblks(); i++) {
> +		struct node *nd = get_node_memblk_range(i);
>
>  		if (nd->start <= start && nd->end > end &&
> -			memblk_nodeid[i] == node )
> +		    get_memblk_nodeid(i) == node)

Why did the indentation change here?

>  			return 1;
>  	}
>
> @@ -122,8 +160,8 @@ static int __init conflicting_memblks(paddr_t start, paddr_t end)
>  {
>  	int i;
>
> -	for (i = 0; i < num_node_memblks; i++) {
> -		struct node *nd = &node_memblk_range[i];
> +	for (i = 0; i < get_num_node_memblks(); i++) {
> +		struct node *nd = get_node_memblk_range(i);
>  		if (nd->start == nd->end)
>  			continue;
>  		if (nd->end > start && nd->start < end)
> @@ -136,7 +174,8 @@ static int __init conflicting_memblks(paddr_t start, paddr_t end)
>
>  static void __init cutoff_node(int i, paddr_t start, paddr_t end)
>  {
> -	struct node *nd = &nodes[i];
> +	struct node *nd = get_numa_node(i);
> +
>  	if (nd->start < start) {
>  		nd->start = start;
>  		if (nd->end < nd->start)
> @@ -278,6 +317,7 @@ acpi_numa_memory_affinity_init(const struct acpi_srat_mem_affinity *ma)
>  	unsigned pxm;
>  	nodeid_t node;
>  	int i;
> +	struct node *memblk;
>
>  	if (srat_disabled())
>  		return;
> @@ -288,7 +328,7 @@ acpi_numa_memory_affinity_init(const struct acpi_srat_mem_affinity *ma)
>  	if (!(ma->flags & ACPI_SRAT_MEM_ENABLED))
>  		return;
>
> -	if (num_node_memblks >= NR_NODE_MEMBLKS)
> +	if (get_num_node_memblks() >= NR_NODE_MEMBLKS)
>  	{
>  		dprintk(XENLOG_WARNING,
>                  "Too many numa entry, try bigger NR_NODE_MEMBLKS \n");
> @@ -310,27 +350,31 @@ acpi_numa_memory_affinity_init(const struct acpi_srat_mem_affinity *ma)
>  	i = conflicting_memblks(start, end);
>  	if (i < 0)
>  		/* everything fine */;
> -	else if (memblk_nodeid[i] == node) {
> +	else if (get_memblk_nodeid(i) == node) {
>  		bool mismatch = !(ma->flags & ACPI_SRAT_MEM_HOT_PLUGGABLE) !=
>  		                !test_bit(i, memblk_hotplug);
>
> +		memblk = get_node_memblk_range(i);
> +
>  		printk("%sSRAT: PXM %u (%"PRIx64"-%"PRIx64") overlaps with itself (%"PRIx64"-%"PRIx64")\n",
>  		       mismatch ? KERN_ERR : KERN_WARNING, pxm, start, end,
> -		       node_memblk_range[i].start, node_memblk_range[i].end);
> +		       memblk->start, memblk->end);
>  		if (mismatch) {
>  			bad_srat();
>  			return;
>  		}
>  	} else {
> +		memblk = get_node_memblk_range(i);
> +
>  		printk(KERN_ERR
>  		       "SRAT: PXM %u (%"PRIx64"-%"PRIx64") overlaps with PXM %u (%"PRIx64"-%"PRIx64")\n",
> -		       pxm, start, end, node_to_pxm(memblk_nodeid[i]),
> -		       node_memblk_range[i].start, node_memblk_range[i].end);
> +		       pxm, start, end, node_to_pxm(get_memblk_nodeid(i)),
> +		       memblk->start, memblk->end);
>  		bad_srat();
>  		return;
>  	}
>  	if (!(ma->flags & ACPI_SRAT_MEM_HOT_PLUGGABLE)) {
> -		struct node *nd = &nodes[node];
> +		struct node *nd = get_numa_node(node);
>
>  		if (!node_test_and_set(node, memory_nodes_parsed)) {
>  			nd->start = start;
> @@ -346,15 +390,17 @@ acpi_numa_memory_affinity_init(const struct acpi_srat_mem_affinity *ma)
>  	       node, pxm, start, end,
>  	       ma->flags & ACPI_SRAT_MEM_HOT_PLUGGABLE ? " (hotplug)" : "");
>
> -	node_memblk_range[num_node_memblks].start = start;
> -	node_memblk_range[num_node_memblks].end = end;
> -	memblk_nodeid[num_node_memblks] = node;
> +	if (numa_add_memblk(node, start, ma->length)) {
> +		printk(KERN_ERR "SRAT: node-id %u out of range\n", node);
> +		bad_srat();
> +		return;
> +	}
> +
>  	if (ma->flags & ACPI_SRAT_MEM_HOT_PLUGGABLE) {
> -		__set_bit(num_node_memblks, memblk_hotplug);
> +		__set_bit(get_num_node_memblks(), memblk_hotplug);
>  		if (end > mem_hotplug)
>  			mem_hotplug = end;
>  	}
> -	num_node_memblks++;
>  }
>
>  /* Sanity check to catch more bad SRATs (they are amazingly common).
> @@ -377,17 +423,21 @@ static int __init nodes_cover_memory(void)
>  		do {
>  			found = 0;
>  			for_each_node_mask(j, memory_nodes_parsed)
> -				if (start < nodes[j].end
> -				    && end > nodes[j].start) {
> -					if (start >= nodes[j].start) {
> -						start = nodes[j].end;
> +			{
> +		                struct node *nd = get_numa_node(j);
> +
> +				if (start < nd->end
> +				    && end > nd->start) {
> +					if (start >= nd->start) {
> +						start = nd->end;
>  						found = 1;
>  					}
> -					if (end <= nodes[j].end) {
> -						end = nodes[j].start;
> +					if (end <= nd->end) {
> +						end = nd->start;
>  						found = 1;
>  					}
>  				}
> +			}
>  		} while (found && start < end);
>
>  		if (start < end) {
> @@ -457,6 +507,8 @@ int __init acpi_scan_nodes(uint64_t start, uint64_t end)
>  {
>  	int i;
>  	nodemask_t all_nodes_parsed;
> +	struct node *memblks;
> +	nodeid_t *nodeids;
>
>  	/* First clean up the node list */
>  	for (i = 0; i < MAX_NUMNODES; i++)
> @@ -470,6 +522,8 @@ int __init acpi_scan_nodes(uint64_t start, uint64_t end)
>  		return -1;
>  	}
>
> +	memblks = get_node_memblk_range(0);
> +	nodeids = get_memblk_nodeid_map();
>  	if (compute_memnode_shift(node_memblk_range, num_node_memblks,
>  				  memblk_nodeid, &memnode_shift)) {
>  		memnode_shift = 0;
> @@ -484,12 +538,14 @@ int __init acpi_scan_nodes(uint64_t start, uint64_t end)
>  	/* Finally register nodes */
>  	for_each_node_mask(i, all_nodes_parsed)
>  	{
> -		uint64_t size = nodes[i].end - nodes[i].start;
> +		struct node *nd = get_numa_node(i);
> +		uint64_t size = nd->end - nd->start;
> +
>  		if ( size == 0 )
>  			printk(KERN_WARNING "SRAT: Node %u has no memory. "
>  			       "BIOS Bug or mis-configured hardware?\n", i);
>
> -		setup_node_bootmem(i, nodes[i].start, nodes[i].end);
> +		setup_node_bootmem(i, nd->start, nd->end);
>  	}
>  	for (i = 0; i < nr_cpu_ids; i++) {
>  		if (cpu_to_node[i] == NUMA_NO_NODE)
>

Cheers,

-- 
Julien Grall


* Re: [RFC PATCH v2 09/25] ARM: NUMA: Add existing ARM numa code under CONFIG_NUMA
  2017-03-28 15:53 ` [RFC PATCH v2 09/25] ARM: NUMA: Add existing ARM numa code under CONFIG_NUMA vijay.kilari
@ 2017-05-08 15:58   ` Julien Grall
  2017-05-09  7:14     ` Vijay Kilari
  0 siblings, 1 reply; 71+ messages in thread
From: Julien Grall @ 2017-05-08 15:58 UTC (permalink / raw)
  To: vijay.kilari, xen-devel
  Cc: sstabellini, wei.liu2, George.Dunlap, andrew.cooper3,
	ian.jackson, tim, jbeulich, Vijaya Kumar K

Hi Vijay,

On 28/03/17 16:53, vijay.kilari@gmail.com wrote:
> From: Vijaya Kumar K <Vijaya.Kumar@cavium.com>
>
> Right now CONFIG_NUMA is not enabled for ARM and
> existing code in asm-arm/numa.h is for !CONFIG_NUMA.
> Hence put this code under #ifndef CONFIG_NUMA.
>
> This help to make this changes work when CONFIG_NUMA
> is not enabled.

But you always turn NUMA on by default (see patch #24) and there is no 
possibility to turn off NUMA.

>
> Also define NODES_SHIFT macro for ARM to value 2.
> This limits number of NUMA nodes supported to 4.
> There is not hard restrictions on this value set to 2.

Again, why only 2 when x86 is supporting 6?

Furthermore, this is not related to this patch itself and should be part 
of a separate patch.

Lastly, why don't you move that to a Kconfig option allowing the user to 
configure the number of nodes?
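
To illustrate what the value means: MAX_NUMNODES is defined as
(1 << NODES_SHIFT) in xen/include/xen/numa.h, so a shift of 2 gives 4 nodes
and a shift of 6 gives 64. Below is a minimal sketch of a configurable
value, assuming a hypothetical Kconfig symbol CONFIG_NR_NUMA_NODES_SHIFT
(this name is made up for the example and is not an existing Xen option):

```c
/* CONFIG_NR_NUMA_NODES_SHIFT is a hypothetical Kconfig-provided value. */
#ifndef CONFIG_NR_NUMA_NODES_SHIFT
#define CONFIG_NR_NUMA_NODES_SHIFT 2   /* fallback used by this sketch only */
#endif

#define NODES_SHIFT   CONFIG_NR_NUMA_NODES_SHIFT
#define MAX_NUMNODES  (1 << NODES_SHIFT)

/* Number of nodes representable for a given shift. */
static unsigned int max_nodes_for_shift(unsigned int shift)
{
    return 1u << shift;
}
```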

>
> Signed-off-by: Vijaya Kumar K <Vijaya.Kumar@cavium.com>
> ---
>  xen/include/asm-arm/numa.h | 5 +++++
>  1 file changed, 5 insertions(+)
>
> diff --git a/xen/include/asm-arm/numa.h b/xen/include/asm-arm/numa.h
> index 53f99af..924bfc0 100644
> --- a/xen/include/asm-arm/numa.h
> +++ b/xen/include/asm-arm/numa.h
> @@ -3,6 +3,10 @@
>
>  typedef uint8_t nodeid_t;
>
> +/* Limit number of NUMA nodes supported to 4 */
> +#define NODES_SHIFT 2

Why is this not covered by CONFIG_NUMA?

> +
> +#ifndef CONFIG_NUMA
>  /* Fake one node for now. See also node_online_map. */
>  #define cpu_to_node(cpu) 0
>  #define node_to_cpumask(node)   (cpu_online_map)
> @@ -16,6 +20,7 @@ static inline __attribute__((pure)) nodeid_t phys_to_nid(paddr_t addr)
>  #define node_spanned_pages(nid) (total_pages)
>  #define node_start_pfn(nid) (pdx_to_pfn(frametable_base_pdx))
>  #define __node_distance(a, b) (20)
> +#endif /* CONFIG_NUMA */
>
>  static inline unsigned int arch_get_dma_bitsize(void)
>  {
>

Cheers,

-- 
Julien Grall


* Re: [RFC PATCH v2 10/25] x86: NUMA: Move numa code and make it generic
  2017-03-28 15:53 ` [RFC PATCH v2 10/25] x86: NUMA: Move numa code and make it generic vijay.kilari
@ 2017-05-08 16:41   ` Julien Grall
  2017-05-09  7:36     ` Vijay Kilari
  2017-05-08 16:51   ` Julien Grall
  1 sibling, 1 reply; 71+ messages in thread
From: Julien Grall @ 2017-05-08 16:41 UTC (permalink / raw)
  To: vijay.kilari, xen-devel
  Cc: sstabellini, wei.liu2, George.Dunlap, andrew.cooper3,
	ian.jackson, tim, jbeulich, Vijaya Kumar K

Hi Vijay,

On 28/03/17 16:53, vijay.kilari@gmail.com wrote:
> diff --git a/xen/arch/x86/numa.c b/xen/arch/x86/numa.c
> index 3bdab9a..33c6806 100644
> --- a/xen/arch/x86/numa.c
> +++ b/xen/arch/x86/numa.c
> @@ -10,286 +10,13 @@
>  #include <xen/ctype.h>
>  #include <xen/nodemask.h>
>  #include <xen/numa.h>
> -#include <xen/keyhandler.h>
>  #include <xen/time.h>
>  #include <xen/smp.h>
>  #include <xen/pfn.h>
>  #include <asm/acpi.h>
> -#include <xen/sched.h>
> -#include <xen/softirq.h>
> -
> -static int numa_setup(char *s);
> -custom_param("numa", numa_setup);
> -
> -struct node_data node_data[MAX_NUMNODES];
> -
> -/* Mapping from pdx to node id */
> -unsigned int memnode_shift;
> -static typeof(*memnodemap) _memnodemap[64];
> -unsigned long memnodemapsize;
> -uint8_t *memnodemap;
> -
> -nodeid_t __read_mostly cpu_to_node[NR_CPUS] = {
> -    [0 ... NR_CPUS-1] = NUMA_NO_NODE
> -};
> -/*
> - * Keep BIOS's CPU2node information, should not be used for memory allocaion
> - */
> -nodeid_t apicid_to_node[MAX_LOCAL_APIC] = {
> -    [0 ... MAX_LOCAL_APIC-1] = NUMA_NO_NODE
> -};

Why is this moved in this patch from here to x86/srat.c?

[...]

> diff --git a/xen/arch/x86/srat.c b/xen/arch/x86/srat.c
> index 7cf4771..2cc87a3 100644
> --- a/xen/arch/x86/srat.c
> +++ b/xen/arch/x86/srat.c
> @@ -27,6 +27,13 @@ static nodemask_t __initdata memory_nodes_parsed;
>  static nodemask_t __initdata processor_nodes_parsed;
>  static struct node __initdata nodes[MAX_NUMNODES];
>
> +/*
> + * Keep BIOS's CPU2node information, should not be used for memory allocaion
> + */
> +nodeid_t apicid_to_node[MAX_LOCAL_APIC] = {
> +    [0 ... MAX_LOCAL_APIC-1] = NUMA_NO_NODE
> +};
> +

This does not belong to this patch...

>  struct pxm2node {
>  	unsigned int pxm;
>  	nodeid_t node;

[...]

> diff --git a/xen/common/numa.c b/xen/common/numa.c
> new file mode 100644
> index 0000000..207ebd8
> --- /dev/null
> +++ b/xen/common/numa.c
> @@ -0,0 +1,488 @@
> +/*
> + * Common NUMA handling functions for x86 and arm.
> + * Original code extracted from arch/x86/numa.c
> + *
> + * This program is free software; you can redistribute it and/or
> + * modify it under the terms and conditions of the GNU General Public
> + * License, version 2, as published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with this program; If not, see <http://www.gnu.org/licenses/>.
> + */
> +
> +#include <xen/mm.h>
> +#include <xen/string.h>
> +#include <xen/init.h>
> +#include <xen/ctype.h>
> +#include <xen/nodemask.h>
> +#include <xen/numa.h>
> +#include <xen/keyhandler.h>
> +#include <xen/time.h>
> +#include <xen/smp.h>
> +#include <xen/pfn.h>
> +#include <asm/acpi.h>
> +#include <xen/sched.h>
> +#include <xen/softirq.h>

Whilst you are moving this into a new file, please order the includes.

[...]

> +static unsigned int __init extract_lsb_from_nodes(const struct node *nodes,
> +                                                  int numnodes)
> +{
> +    unsigned int i, nodes_used = 0;
> +    unsigned long spdx, epdx;
> +    unsigned long bitfield = 0, memtop = 0;
> +
> +    for ( i = 0; i < numnodes; i++ )
> +    {
> +        spdx = paddr_to_pdx(nodes[i].start);
> +        epdx = paddr_to_pdx(nodes[i].end - 1) + 1;
> +        if ( spdx >= epdx )
> +            continue;
> +        bitfield |= spdx;
> +        nodes_used++;
> +        if ( epdx > memtop )
> +            memtop = epdx;
> +    }
> +    if ( nodes_used <= 1 )
> +        i = BITS_PER_LONG - 1;
> +    else
> +        i = find_first_bit(&bitfield, sizeof(unsigned long) * 8);
> +

It is interesting to see that a newline was added in the process of moving 
the code.

> +    memnodemapsize = (memtop >> i) + 1;

[....]

> diff --git a/xen/include/xen/numa.h b/xen/include/xen/numa.h
> index 922fbd8..eed40af 100644
> --- a/xen/include/xen/numa.h
> +++ b/xen/include/xen/numa.h
> @@ -14,6 +14,21 @@
>
>  #define MAX_NUMNODES    (1 << NODES_SHIFT)
>
> +struct node {
> +    paddr_t start;
> +    paddr_t end;
> +};
> +
> +extern int compute_memnode_shift(struct node *nodes, int numnodes,
> +                                 nodeid_t *nodeids, unsigned int *shift);
> +extern void numa_init_array(void);
> +extern bool_t srat_disabled(void);
> +extern void numa_set_node(int cpu, nodeid_t node);
> +extern nodeid_t acpi_setup_node(unsigned int pxm);
> +extern void srat_detect_node(int cpu);
> +extern void setup_node_bootmem(nodeid_t nodeid, paddr_t start, paddr_t end);
> +extern void init_cpu_to_node(void);

Can you please be consistent with this file and drop the unnecessary 
"extern".

> +
>  #define vcpu_to_node(v) (cpu_to_node((v)->processor))
>
>  #define domain_to_node(d) \
> @@ -23,4 +38,7 @@
>  bool is_numa_off(void);
>  bool get_acpi_numa(void);
>  void set_acpi_numa(bool val);
> +int get_numa_fake(void);
> +extern int numa_emulation(uint64_t start_pfn, uint64_t end_pfn);
> +extern void numa_dummy_init(uint64_t start_pfn, uint64_t end_pfn);

Ditto.

>  #endif /* _XEN_NUMA_H */
>

-- 
Julien Grall


* Re: [RFC PATCH v2 10/25] x86: NUMA: Move numa code and make it generic
  2017-03-28 15:53 ` [RFC PATCH v2 10/25] x86: NUMA: Move numa code and make it generic vijay.kilari
  2017-05-08 16:41   ` Julien Grall
@ 2017-05-08 16:51   ` Julien Grall
  2017-05-09  7:39     ` Vijay Kilari
  1 sibling, 1 reply; 71+ messages in thread
From: Julien Grall @ 2017-05-08 16:51 UTC (permalink / raw)
  To: vijay.kilari, xen-devel
  Cc: sstabellini, wei.liu2, George.Dunlap, andrew.cooper3,
	ian.jackson, tim, jbeulich, Vijaya Kumar K

On 28/03/17 16:53, vijay.kilari@gmail.com wrote:
> diff --git a/xen/common/numa.c b/xen/common/numa.c
> new file mode 100644
> index 0000000..207ebd8
> --- /dev/null
> +++ b/xen/common/numa.c
> @@ -0,0 +1,488 @@
> +/*
> + * Common NUMA handling functions for x86 and arm.
> + * Original code extracted from arch/x86/numa.c
> + *
> + * This program is free software; you can redistribute it and/or
> + * modify it under the terms and conditions of the GNU General Public
> + * License, version 2, as published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with this program; If not, see <http://www.gnu.org/licenses/>.
> + */
> +
> +#include <xen/mm.h>
> +#include <xen/string.h>
> +#include <xen/init.h>
> +#include <xen/ctype.h>
> +#include <xen/nodemask.h>
> +#include <xen/numa.h>
> +#include <xen/keyhandler.h>
> +#include <xen/time.h>
> +#include <xen/smp.h>
> +#include <xen/pfn.h>
> +#include <asm/acpi.h>
> +#include <xen/sched.h>
> +#include <xen/softirq.h>
> +
> +static int numa_setup(char *s);
> +custom_param("numa", numa_setup);
> +
> +struct node_data node_data[MAX_NUMNODES];
> +
> +/* Mapping from pdx to node id */
> +unsigned int memnode_shift;
> +static typeof(*memnodemap) _memnodemap[64];

Also, you move the hardcoded 64 here. But have you checked it is valid 
for ARM?

Regardless of that, this sounds like something that should be turned into a 
define and requires a comment.
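
Something along these lines is what I have in mind. This is only a sketch:
NR_STATIC_MEMNODEMAP, static_memnodemap and memnodemap_fits_static() are
hypothetical names, and the value 64 is simply the one from the moved code,
which would still need to be validated for ARM.

```c
#include <stdint.h>

/*
 * Number of entries in the statically allocated pdx-to-node map.
 * NR_STATIC_MEMNODEMAP is a hypothetical name; 64 is the value from the
 * moved x86 code and has not been validated for ARM.
 */
#define NR_STATIC_MEMNODEMAP 64

typedef uint8_t nodeid_t;

static nodeid_t static_memnodemap[NR_STATIC_MEMNODEMAP];

/* True if a map of 'size' entries fits in the static array. */
static int memnodemap_fits_static(unsigned long size)
{
    return size <= sizeof(static_memnodemap) / sizeof(static_memnodemap[0]);
}
```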

Cheers,

-- 
Julien Grall


* Re: [RFC PATCH v2 11/25] x86: NUMA: Move common code from srat.c
  2017-03-28 15:53 ` [RFC PATCH v2 11/25] x86: NUMA: Move common code from srat.c vijay.kilari
@ 2017-05-08 17:06   ` Julien Grall
  2017-05-10  9:00     ` Vijay Kilari
  0 siblings, 1 reply; 71+ messages in thread
From: Julien Grall @ 2017-05-08 17:06 UTC (permalink / raw)
  To: vijay.kilari, xen-devel
  Cc: sstabellini, wei.liu2, George.Dunlap, andrew.cooper3,
	ian.jackson, tim, jbeulich, Vijaya Kumar K

Hi Vijay,

On 28/03/17 16:53, vijay.kilari@gmail.com wrote:
> From: Vijaya Kumar K <Vijaya.Kumar@cavium.com>
>
> Move code from xen/arch/x86/srat.c to xen/common/numa.c
> so that it can be used by other archs.
> Few generic static functions in x86/srat.c are made
> non-static common/numa.c
>
> Signed-off-by: Vijaya Kumar K <Vijaya.Kumar@cavium.com>
> ---
>  xen/arch/x86/srat.c        | 152 ++-------------------------------------------
>  xen/common/numa.c          | 146 +++++++++++++++++++++++++++++++++++++++++++
>  xen/include/asm-x86/acpi.h |   3 -
>  xen/include/asm-x86/numa.h |   2 -
>  xen/include/xen/numa.h     |  14 +++++
>  5 files changed, 164 insertions(+), 153 deletions(-)
>
> diff --git a/xen/arch/x86/srat.c b/xen/arch/x86/srat.c
> index 2cc87a3..55947bb 100644
> --- a/xen/arch/x86/srat.c
> +++ b/xen/arch/x86/srat.c
> @@ -23,9 +23,8 @@
>
>  static struct acpi_table_slit *__read_mostly acpi_slit;
>
> -static nodemask_t __initdata memory_nodes_parsed;
> -static nodemask_t __initdata processor_nodes_parsed;
> -static struct node __initdata nodes[MAX_NUMNODES];
> +extern nodemask_t processor_nodes_parsed;
> +extern nodemask_t memory_nodes_parsed;

On v1, Jan clearly NAKed changes like this. Declarations belong in 
header files. It is a different variable compared to v1, but I would have 
expected you to apply what he said everywhere...

[...]

> diff --git a/xen/common/numa.c b/xen/common/numa.c
> index 207ebd8..1789bba 100644
> --- a/xen/common/numa.c
> +++ b/xen/common/numa.c
> @@ -32,6 +32,8 @@
>  static int numa_setup(char *s);
>  custom_param("numa", numa_setup);
>
> +nodemask_t __initdata memory_nodes_parsed;
> +nodemask_t __initdata processor_nodes_parsed;
>  struct node_data node_data[MAX_NUMNODES];
>
>  /* Mapping from pdx to node id */
> @@ -47,6 +49,10 @@ cpumask_t __read_mostly node_to_cpumask[MAX_NUMNODES];
>
>  static bool numa_off = 0;
>  static bool acpi_numa = 1;
> +static int num_node_memblks;
> +static struct node node_memblk_range[NR_NODE_MEMBLKS];
> +static nodeid_t memblk_nodeid[NR_NODE_MEMBLKS];
> +static struct node __initdata nodes[MAX_NUMNODES];

It would make sense to keep those variables together with 
{memory,processor}_nodes_parsed.

[...]

> +int valid_numa_range(paddr_t start, paddr_t end, nodeid_t node)
> +{
> +    int i;
> +
> +    for (i = 0; i < get_num_node_memblks(); i++) {

common/numa.c is using Xen coding style whilst arch/x86/srat.c is using 
Linux coding style.

You validly decided to switch to soft tabs, which makes it quite difficult 
to check whether this patch is only code movement. But you did not go far 
enough and fix the coding style of the moved code.

Please do it properly and not half of it. For simplicity I would be OK 
with it being done in this patch. But this needs to be clearly written in 
the commit message.
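
For reference, here is a sketch of how the moved loop might look once
converted to Xen coding style (spaces inside the parentheses of control
statements, braces on their own line, 4-space indentation). The types and
the memblk_range/memblk_nodeid/num_memblks variables below are simplified
stand-ins so that the snippet is self-contained; they are not the real
definitions.

```c
#include <stdint.h>

typedef uint64_t paddr_t;
typedef uint8_t nodeid_t;

struct node {
    paddr_t start, end;
};

/* Simplified stand-ins for the real memblk bookkeeping. */
static struct node memblk_range[] = { { 0x0, 0x1000 }, { 0x1000, 0x3000 } };
static nodeid_t memblk_nodeid[] = { 0, 1 };
static unsigned int num_memblks = 2;

int valid_numa_range(paddr_t start, paddr_t end, nodeid_t node)
{
    unsigned int i;

    for ( i = 0; i < num_memblks; i++ )
    {
        const struct node *nd = &memblk_range[i];

        if ( nd->start <= start && nd->end > end &&
             memblk_nodeid[i] == node )
            return 1;
    }

    return 0;
}
```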

> +        struct node *nd = get_node_memblk_range(i);
> +
> +        if (nd->start <= start && nd->end > end &&
> +            get_memblk_nodeid(i) == node)
> +            return 1;
> +    }
> +
> +    return 0;
> +}

[...]

> diff --git a/xen/include/asm-x86/numa.h b/xen/include/asm-x86/numa.h
> index 421e8b7..7cff220 100644
> --- a/xen/include/asm-x86/numa.h
> +++ b/xen/include/asm-x86/numa.h
> @@ -47,8 +47,6 @@ static inline __attribute__((pure)) nodeid_t phys_to_nid(paddr_t addr)
>  #define node_end_pfn(nid)       (NODE_DATA(nid)->node_start_pfn + \
>                                   NODE_DATA(nid)->node_spanned_pages)
>
> -extern int valid_numa_range(paddr_t start, paddr_t end, nodeid_t node);
> -
>  void srat_parse_regions(uint64_t addr);
>  extern uint8_t __node_distance(nodeid_t a, nodeid_t b);
>  unsigned int arch_get_dma_bitsize(void);
> diff --git a/xen/include/xen/numa.h b/xen/include/xen/numa.h
> index eed40af..ee53526 100644
> --- a/xen/include/xen/numa.h
> +++ b/xen/include/xen/numa.h
> @@ -13,6 +13,7 @@
>  #define NUMA_NO_DISTANCE 0xFF
>
>  #define MAX_NUMNODES    (1 << NODES_SHIFT)
> +#define NR_NODE_MEMBLKS (MAX_NUMNODES * 2)
>
>  struct node {
>      paddr_t start;
> @@ -28,6 +29,19 @@ extern nodeid_t acpi_setup_node(unsigned int pxm);
>  extern void srat_detect_node(int cpu);
>  extern void setup_node_bootmem(nodeid_t nodeid, paddr_t start, paddr_t end);
>  extern void init_cpu_to_node(void);
> +extern int valid_numa_range(paddr_t start, paddr_t end, nodeid_t node);
> +extern int conflicting_memblks(paddr_t start, paddr_t end);
> +extern void cutoff_node(int i, paddr_t start, paddr_t end);
> +extern struct node *get_numa_node(int id);
> +extern nodeid_t get_memblk_nodeid(int memblk);
> +extern nodeid_t *get_memblk_nodeid_map(void);
> +extern struct node *get_node_memblk_range(int memblk);
> +extern struct node *get_memblk(int memblk);
> +extern int numa_add_memblk(nodeid_t nodeid, paddr_t start, uint64_t size);
> +extern int get_num_node_memblks(void);
> +extern int arch_sanitize_nodes_memory(void);
> +extern void numa_failed(void);
> +extern int numa_scan_nodes(uint64_t start, uint64_t end);

See my comment on the previous patch.

>
>  #define vcpu_to_node(v) (cpu_to_node((v)->processor))
>
>

Cheers,

-- 
Julien Grall

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


* Re: [RFC PATCH v2 12/25] ARM: NUMA: Parse CPU NUMA information
  2017-03-28 15:53 ` [RFC PATCH v2 12/25] ARM: NUMA: Parse CPU NUMA information vijay.kilari
@ 2017-05-08 17:31   ` Julien Grall
  2017-05-10  5:24     ` Vijay Kilari
  0 siblings, 1 reply; 71+ messages in thread
From: Julien Grall @ 2017-05-08 17:31 UTC (permalink / raw)
  To: vijay.kilari, xen-devel
  Cc: sstabellini, wei.liu2, George.Dunlap, andrew.cooper3,
	ian.jackson, tim, jbeulich, Vijaya Kumar K

Hi Vijay,

The title likely needs to have the word device-tree/DT in it.

On 28/03/17 16:53, vijay.kilari@gmail.com wrote:
> From: Vijaya Kumar K <Vijaya.Kumar@cavium.com>
>
> Parse CPU node and fetch numa-node-id information.
> For each node-id found, update nodemask_t mask.
> Refer to /Documentation/devicetree/bindings/numa.txt.

In which repository?

>
> Signed-off-by: Vijaya Kumar K <Vijaya.Kumar@cavium.com>
> ---
>  xen/arch/arm/Makefile       |  1 +
>  xen/arch/arm/bootfdt.c      | 16 ++++++++--
>  xen/arch/arm/numa/Makefile  |  2 ++
>  xen/arch/arm/numa/dt_numa.c | 78 +++++++++++++++++++++++++++++++++++++++++++++
>  xen/arch/arm/numa/numa.c    | 50 +++++++++++++++++++++++++++++
>  xen/arch/arm/setup.c        |  4 +++
>  xen/include/asm-arm/numa.h  | 10 +++++-
>  xen/include/asm-arm/setup.h |  4 ++-
>  8 files changed, 161 insertions(+), 4 deletions(-)
>
> diff --git a/xen/arch/arm/Makefile b/xen/arch/arm/Makefile
> index 0ce94a8..d13b79f 100644
> --- a/xen/arch/arm/Makefile
> +++ b/xen/arch/arm/Makefile
> @@ -3,6 +3,7 @@ subdir-$(CONFIG_ARM_64) += arm64
>  subdir-y += platforms
>  subdir-$(CONFIG_ARM_64) += efi
>  subdir-$(CONFIG_ACPI) += acpi
> +subdir-$(CONFIG_NUMA) += numa
>
>  obj-$(CONFIG_HAS_ALTERNATIVE) += alternative.o
>  obj-y += bootfdt.init.o
> diff --git a/xen/arch/arm/bootfdt.c b/xen/arch/arm/bootfdt.c
> index ea188a0..1f876f0 100644
> --- a/xen/arch/arm/bootfdt.c
> +++ b/xen/arch/arm/bootfdt.c
> @@ -62,8 +62,20 @@ static void __init device_tree_get_reg(const __be32 **cell, u32 address_cells,
>      *size = dt_next_cell(size_cells, cell);
>  }
>
> -static u32 __init device_tree_get_u32(const void *fdt, int node,
> -                                      const char *prop_name, u32 dflt)
> +bool_t __init device_tree_type_matches(const void *fdt, int node,
> +                                       const char *match)
> +{
> +    const void *prop;
> +
> +    prop = fdt_getprop(fdt, node, "device_type", NULL);
> +    if ( prop == NULL )
> +        return 0;
> +
> +    return strcmp(prop, match) == 0 ? 1 : 0;
> +}
> +

This change is not explained in the patch and does not belong to it anyway.

> +u32 __init device_tree_get_u32(const void *fdt, int node,
> +                               const char *prop_name, u32 dflt)

Ditto. I would recommend reading [1] for tips on how to break down a patch.

>  {
>      const struct fdt_property *prop;
>
> diff --git a/xen/arch/arm/numa/Makefile b/xen/arch/arm/numa/Makefile
> new file mode 100644
> index 0000000..3af3aff
> --- /dev/null
> +++ b/xen/arch/arm/numa/Makefile
> @@ -0,0 +1,2 @@
> +obj-y += dt_numa.o
> +obj-y += numa.o
> diff --git a/xen/arch/arm/numa/dt_numa.c b/xen/arch/arm/numa/dt_numa.c
> new file mode 100644
> index 0000000..66c6efb
> --- /dev/null
> +++ b/xen/arch/arm/numa/dt_numa.c
> @@ -0,0 +1,78 @@
> +/*
> + * OF NUMA Parsing support.
> + *
> + * Copyright (C) 2015 - 2016 Cavium Inc.
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with this program.  If not, see <http://www.gnu.org/licenses/>.
> + */
> +
> +#include <xen/libfdt/libfdt.h>
> +#include <xen/mm.h>
> +#include <xen/nodemask.h>
> +#include <asm/mm.h>

This is already included by xen/mm.h

> +#include <xen/numa.h>
> +#include <xen/device_tree.h>
> +#include <asm/setup.h>

Please order the includes.

> +
> +extern nodemask_t processor_nodes_parsed;

See my comment on patch #11. I may have missed some of them, so I am hoping 
you will fix all the occurrences in the next version.

> +
> +/*
> + * Even though we connect cpus to numa domains later in SMP
> + * init, we need to know the node ids now for all cpus.
> + */
> +static int __init dt_numa_process_cpu_node(const void *fdt, int node,
> +                                           const char *name,
> +                                           uint32_t address_cells,
> +                                           uint32_t size_cells)
> +{
> +    uint32_t nid;
> +
> +    nid = device_tree_get_u32(fdt, node, "numa-node-id", MAX_NUMNODES);
> +
> +    if ( nid >= MAX_NUMNODES )
> +        printk(XENLOG_WARNING "NUMA: Node id %u exceeds maximum value\n", nid);
> +    else
> +        node_set(nid, processor_nodes_parsed);
> +
> +    return 0;
> +}
> +
> +static int __init dt_numa_scan_cpu_node(const void *fdt, int node,
> +                                        const char *name, int depth,
> +                                        uint32_t address_cells,
> +                                        uint32_t size_cells, void *data)
> +{
> +    if ( device_tree_type_matches(fdt, node, "cpu") )
> +        return dt_numa_process_cpu_node(fdt, node, name, address_cells,
> +                                        size_cells);

As said on v1, this code is wrong. CPU nodes can only be in /cpus and 
you cannot rely on the name to be "cpu" (see the binding in 
Documentation/devicetree/bindings/arm/cpu.txt). The only way to check if 
it is a CPU is to look for the property device_type.
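
A minimal sketch of the check I have in mind is below. It deliberately does 
not use libfdt: struct toy_dt_node and toy_is_cpu_node() are hypothetical 
names, and the node is reduced to the three fields that matter here. The 
node name is carried along but never looked at.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified view of a device-tree node (not libfdt). */
struct toy_dt_node {
    const char *parent_path;  /* path of the parent node */
    const char *name;         /* node name, e.g. "cpu@0"; ignored below */
    const char *device_type;  /* device_type property, NULL if absent */
};

/* A node is a CPU only if it sits under /cpus and has device_type "cpu". */
static bool toy_is_cpu_node(const struct toy_dt_node *n)
{
    if ( strcmp(n->parent_path, "/cpus") != 0 )
        return false;

    return n->device_type != NULL && strcmp(n->device_type, "cpu") == 0;
}
```

With this shape, a child of /cpus without device_type (for instance a 
cpu-map node) is skipped, and so is a node outside /cpus that happens to be 
called cpu@0.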

Cheers,

[1] https://wiki.xenproject.org/wiki/Submitting_Xen_Project_Patches

-- 
Julien Grall


* Re: [RFC PATCH v2 04/25] x86: NUMA: Add accessors for acpi_numa, numa_off and numa_fake variables
  2017-05-02  9:47                   ` Vijay Kilari
  2017-05-02  9:54                     ` Jan Beulich
@ 2017-05-08 17:38                     ` Julien Grall
  1 sibling, 0 replies; 71+ messages in thread
From: Julien Grall @ 2017-05-08 17:38 UTC (permalink / raw)
  To: Vijay Kilari, Jan Beulich
  Cc: Tim Deegan, Stefano Stabellini, Wei Liu, George Dunlap,
	Andrew Cooper, Ian Jackson, Vijaya Kumar K, xen-devel

Hi Vijay,

On 02/05/17 10:47, Vijay Kilari wrote:
> On Tue, Apr 25, 2017 at 9:13 PM, Jan Beulich <JBeulich@suse.com> wrote:
>>>>> On 25.04.17 at 17:14, <julien.grall@arm.com> wrote:
>>> On 25/04/17 15:54, Vijay Kilari wrote:
>>>> On Tue, Apr 25, 2017 at 5:58 PM, Julien Grall <julien.grall@arm.com> wrote:
>>>>>>>>
>>>>>>>> By setting 1, we are enabling acpi_numa by default. If not enabled, the
>>>>>>>> below
>>>>>>>> call has check srat_disabled() before proceeding fails.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> My understanding is on x86 acpi_numa is disabled by default and will be
>>>>>>> enabled if they are able to parse the SRAT. So why are you changing the
>>>>>>> behavior for x86?
>>>>>>
>>>>>>
>>>>>> acpi_numa = 0 means it is enabled by default on x86.
>>>>>
>>>>>
>>>>> In acpi_scan_nodes:
>>>>>
>>>>> if (acpi_numa <= 0)
>>>>>   return -1;
>>>>>
>>>>> So it does not seem that 0 means enabled.
>>>>
>>>> IMO, In x86
>>>>          -1 means disabled
>>>>           0 enabled but not numa initialized
>>>>           1 enabled and numa initialized.
>>>>
>>>> I clubbed 0 & 1.
>>>
>>>  From your description 0 and 1 have different meaning, so I don't see
>>> how you can merge them that easily without any explanation.
>>>
>>> Anyway, I will leave x86 maintainers give their opinion here.
>>
>> I'm pretty certain this needs to remain a tristate.
>
> Ok. I will drop this patch from this series and can be fixed
> outside this series.
> BTW, any review comments on remaining patches?
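
For readers following the thread, the three states discussed above could be 
sketched as below. The enum and function names are hypothetical; the values 
mirror the -1/0/1 meanings quoted above, and the "disabled" test assumes 
that only a negative value counts as disabled.

```c
#include <stdbool.h>

/* Hypothetical names; values follow the -1/0/1 description in the thread. */
enum toy_acpi_numa_state {
    TOY_ACPI_NUMA_DISABLED = -1,  /* disabled */
    TOY_ACPI_NUMA_UNINIT   = 0,   /* enabled, NUMA not yet initialised */
    TOY_ACPI_NUMA_READY    = 1,   /* enabled and NUMA initialised */
};

/* Mirrors the "acpi_numa <= 0" test quoted from acpi_scan_nodes(). */
static bool toy_can_scan_nodes(enum toy_acpi_numa_state s)
{
    return s > 0;
}

/* Assumption: only the explicit "disabled" state is treated as disabled. */
static bool toy_srat_disabled(enum toy_acpi_numa_state s)
{
    return s < 0;
}
```

State 0 is neither "disabled" nor "ready to scan", which is why collapsing 
it into state 1 changes behaviour.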

I had a look at the series and decided to stop reviewing it because 
comments are not addressed.

I am not going to review anything until *all* the comments from the previous 
version are addressed. I would recommend the others do the same.

Cheers,

-- 
Julien Grall


* Re: [RFC PATCH v2 06/25] x86: NUMA: Add accessors for nodes[] and node_memblk_range[] structs
  2017-05-08 14:39   ` Julien Grall
@ 2017-05-09  7:02     ` Vijay Kilari
  2017-05-09  8:13       ` Julien Grall
  0 siblings, 1 reply; 71+ messages in thread
From: Vijay Kilari @ 2017-05-09  7:02 UTC (permalink / raw)
  To: Julien Grall
  Cc: Tim Deegan, Stefano Stabellini, Wei Liu, George Dunlap,
	Andrew Cooper, Ian Jackson, xen-devel, Jan Beulich,
	Vijaya Kumar K

On Mon, May 8, 2017 at 8:09 PM, Julien Grall <julien.grall@arm.com> wrote:
> Hi Vijay,
>
> On 28/03/17 16:53, vijay.kilari@gmail.com wrote:
>>
>> From: Vijaya Kumar K <Vijaya.Kumar@cavium.com>
>>
>> Add accessor for nodes[] and other static variables and
>
>
> s/accessor/accessors/
>
>> used those accessors.
>
>
> Also, I am not sure to understand the usefulness of those accessors over a
> global variable.

These are static variables which need to be accessed from other files and
are later moved to a generic file.
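
A minimal sketch of that intent, with hypothetical toy_ names: the array
stays file-local and other files go through an accessor. The bounds check
is an addition for the sketch and is not in the patch.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint64_t paddr_t;

struct node {
    paddr_t start;
    paddr_t end;
};

#define TOY_MAX_NUMNODES 4

/* File-local storage: not visible to other translation units. */
static struct node toy_nodes[TOY_MAX_NUMNODES];

/* Accessor other files would call; returns NULL for an out-of-range id. */
struct node *toy_get_numa_node(unsigned int id)
{
    return id < TOY_MAX_NUMNODES ? &toy_nodes[id] : NULL;
}
```

Taking an unsigned int index, as requested in the review, means only the
upper bound needs checking.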

>
>> Signed-off-by: Vijaya Kumar K <Vijaya.Kumar@cavium.com>
>> ---
>>  xen/arch/x86/srat.c | 108
>> +++++++++++++++++++++++++++++++++++++++-------------
>>  1 file changed, 82 insertions(+), 26 deletions(-)
>>
>> diff --git a/xen/arch/x86/srat.c b/xen/arch/x86/srat.c
>> index ccacbcd..983e1d8 100644
>> --- a/xen/arch/x86/srat.c
>> +++ b/xen/arch/x86/srat.c
>> @@ -41,7 +41,45 @@ static struct node node_memblk_range[NR_NODE_MEMBLKS];
>>  static nodeid_t memblk_nodeid[NR_NODE_MEMBLKS];
>>  static __initdata DECLARE_BITMAP(memblk_hotplug, NR_NODE_MEMBLKS);
>>
>> -static inline bool node_found(unsigned idx, unsigned pxm)
>> +static struct node *get_numa_node(int id)
>
>
> unsigned int.
OK
>
>> +{
>> +       return &nodes[id];
>> +}
>> +
>> +static nodeid_t get_memblk_nodeid(int id)
>
>
> unsigned int.
>
>> +{
>> +       return memblk_nodeid[id];
>> +}
>> +
>> +static nodeid_t *get_memblk_nodeid_map(void)
>> +{
>> +       return &memblk_nodeid[0];
>> +}
>> +
>> +static struct node *get_node_memblk_range(int memblk)
>
>
> unsigned int.
>
>> +{
>> +       return &node_memblk_range[memblk];
>> +}
>> +
>> +static int get_num_node_memblks(void)
>> +{
>> +       return num_node_memblks;
>> +}
>> +
>> +static int __init numa_add_memblk(nodeid_t nodeid, paddr_t start,
>> uint64_t size)
>> +{
>> +       if (nodeid >= NR_NODE_MEMBLKS)
>> +               return -EINVAL;
>> +
>> +       node_memblk_range[num_node_memblks].start = start;
>> +       node_memblk_range[num_node_memblks].end = start + size;
>> +       memblk_nodeid[num_node_memblks] = nodeid;
>> +       num_node_memblks++;
>> +
>> +       return 0;
>> +}
>> +
>> +static inline bool node_found(unsigned int idx, unsigned int pxm)
>
>
> Please don't make unrelated change in the same patch. In this case I don't
> see why you switch from "unsigned" to "unsigned int".
>
>>  {
>>         return ((pxm2node[idx].pxm == pxm) &&
>>                 (pxm2node[idx].node != NUMA_NO_NODE));
>> @@ -107,11 +145,11 @@ int valid_numa_range(paddr_t start, paddr_t end,
>> nodeid_t node)
>>  {
>>         int i;
>>
>> -       for (i = 0; i < num_node_memblks; i++) {
>> -               struct node *nd = &node_memblk_range[i];
>> +       for (i = 0; i < get_num_node_memblks(); i++) {
>> +               struct node *nd = get_node_memblk_range(i);
>>
>>                 if (nd->start <= start && nd->end > end &&
>> -                       memblk_nodeid[i] == node )
>> +                   get_memblk_nodeid(i) == node)
>
>
> Why the indentation changed here?

OK. I will put these changes in separate patches.

>
>
>>                         return 1;
>>         }
>>
>> @@ -122,8 +160,8 @@ static int __init conflicting_memblks(paddr_t start,
>> paddr_t end)
>>  {
>>         int i;
>>
>> -       for (i = 0; i < num_node_memblks; i++) {
>> -               struct node *nd = &node_memblk_range[i];
>> +       for (i = 0; i < get_num_node_memblks(); i++) {
>> +               struct node *nd = get_node_memblk_range(i);
>>                 if (nd->start == nd->end)
>>                         continue;
>>                 if (nd->end > start && nd->start < end)
>> @@ -136,7 +174,8 @@ static int __init conflicting_memblks(paddr_t start,
>> paddr_t end)
>>
>>  static void __init cutoff_node(int i, paddr_t start, paddr_t end)
>>  {
>> -       struct node *nd = &nodes[i];
>> +       struct node *nd = get_numa_node(i);
>> +
>>         if (nd->start < start) {
>>                 nd->start = start;
>>                 if (nd->end < nd->start)
>> @@ -278,6 +317,7 @@ acpi_numa_memory_affinity_init(const struct
>> acpi_srat_mem_affinity *ma)
>>         unsigned pxm;
>>         nodeid_t node;
>>         int i;
>> +       struct node *memblk;
>>
>>         if (srat_disabled())
>>                 return;
>> @@ -288,7 +328,7 @@ acpi_numa_memory_affinity_init(const struct
>> acpi_srat_mem_affinity *ma)
>>         if (!(ma->flags & ACPI_SRAT_MEM_ENABLED))
>>                 return;
>>
>> -       if (num_node_memblks >= NR_NODE_MEMBLKS)
>> +       if (get_num_node_memblks() >= NR_NODE_MEMBLKS)
>>         {
>>                 dprintk(XENLOG_WARNING,
>>                  "Too many numa entry, try bigger NR_NODE_MEMBLKS \n");
>> @@ -310,27 +350,31 @@ acpi_numa_memory_affinity_init(const struct
>> acpi_srat_mem_affinity *ma)
>>         i = conflicting_memblks(start, end);
>>         if (i < 0)
>>                 /* everything fine */;
>> -       else if (memblk_nodeid[i] == node) {
>> +       else if (get_memblk_nodeid(i) == node) {
>>                 bool mismatch = !(ma->flags & ACPI_SRAT_MEM_HOT_PLUGGABLE)
>> !=
>>                                 !test_bit(i, memblk_hotplug);
>>
>> +               memblk = get_node_memblk_range(i);
>> +
>>                 printk("%sSRAT: PXM %u (%"PRIx64"-%"PRIx64") overlaps with
>> itself (%"PRIx64"-%"PRIx64")\n",
>>                        mismatch ? KERN_ERR : KERN_WARNING, pxm, start,
>> end,
>> -                      node_memblk_range[i].start,
>> node_memblk_range[i].end);
>> +                      memblk->start, memblk->end);
>>                 if (mismatch) {
>>                         bad_srat();
>>                         return;
>>                 }
>>         } else {
>> +               memblk = get_node_memblk_range(i);
>> +
>>                 printk(KERN_ERR
>>                        "SRAT: PXM %u (%"PRIx64"-%"PRIx64") overlaps with
>> PXM %u (%"PRIx64"-%"PRIx64")\n",
>> -                      pxm, start, end, node_to_pxm(memblk_nodeid[i]),
>> -                      node_memblk_range[i].start,
>> node_memblk_range[i].end);
>> +                      pxm, start, end, node_to_pxm(get_memblk_nodeid(i)),
>> +                      memblk->start, memblk->end);
>>                 bad_srat();
>>                 return;
>>         }
>>         if (!(ma->flags & ACPI_SRAT_MEM_HOT_PLUGGABLE)) {
>> -               struct node *nd = &nodes[node];
>> +               struct node *nd = get_numa_node(node);
>>
>>                 if (!node_test_and_set(node, memory_nodes_parsed)) {
>>                         nd->start = start;
>> @@ -346,15 +390,17 @@ acpi_numa_memory_affinity_init(const struct
>> acpi_srat_mem_affinity *ma)
>>                node, pxm, start, end,
>>                ma->flags & ACPI_SRAT_MEM_HOT_PLUGGABLE ? " (hotplug)" :
>> "");
>>
>> -       node_memblk_range[num_node_memblks].start = start;
>> -       node_memblk_range[num_node_memblks].end = end;
>> -       memblk_nodeid[num_node_memblks] = node;
>> +       if (numa_add_memblk(node, start, ma->length)) {
>> +               printk(KERN_ERR "SRAT: node-id %u out of range\n", node);
>> +               bad_srat();
>> +               return;
>> +       }
>> +
>>         if (ma->flags & ACPI_SRAT_MEM_HOT_PLUGGABLE) {
>> -               __set_bit(num_node_memblks, memblk_hotplug);
>> +               __set_bit(get_num_node_memblks(), memblk_hotplug);
>>                 if (end > mem_hotplug)
>>                         mem_hotplug = end;
>>         }
>> -       num_node_memblks++;
>>  }
>>
>>  /* Sanity check to catch more bad SRATs (they are amazingly common).
>> @@ -377,17 +423,21 @@ static int __init nodes_cover_memory(void)
>>                 do {
>>                         found = 0;
>>                         for_each_node_mask(j, memory_nodes_parsed)
>> -                               if (start < nodes[j].end
>> -                                   && end > nodes[j].start) {
>> -                                       if (start >= nodes[j].start) {
>> -                                               start = nodes[j].end;
>> +                       {
>> +                               struct node *nd = get_numa_node(j);
>> +
>> +                               if (start < nd->end
>> +                                   && end > nd->start) {
>> +                                       if (start >= nd->start) {
>> +                                               start = nd->end;
>>                                                 found = 1;
>>                                         }
>> -                                       if (end <= nodes[j].end) {
>> -                                               end = nodes[j].start;
>> +                                       if (end <= nd->end) {
>> +                                               end = nd->start;
>>                                                 found = 1;
>>                                         }
>>                                 }
>> +                       }
>>                 } while (found && start < end);
>>
>>                 if (start < end) {
>> @@ -457,6 +507,8 @@ int __init acpi_scan_nodes(uint64_t start, uint64_t
>> end)
>>  {
>>         int i;
>>         nodemask_t all_nodes_parsed;
>> +       struct node *memblks;
>> +       nodeid_t *nodeids;
>>
>>         /* First clean up the node list */
>>         for (i = 0; i < MAX_NUMNODES; i++)
>> @@ -470,6 +522,8 @@ int __init acpi_scan_nodes(uint64_t start, uint64_t
>> end)
>>                 return -1;
>>         }
>>
>> +       memblks = get_node_memblk_range(0);
>> +       nodeids = get_memblk_nodeid_map();
>>         if (compute_memnode_shift(node_memblk_range, num_node_memblks,
>>                                   memblk_nodeid, &memnode_shift)) {
>>                 memnode_shift = 0;
>> @@ -484,12 +538,14 @@ int __init acpi_scan_nodes(uint64_t start, uint64_t
>> end)
>>         /* Finally register nodes */
>>         for_each_node_mask(i, all_nodes_parsed)
>>         {
>> -               uint64_t size = nodes[i].end - nodes[i].start;
>> +               struct node *nd = get_numa_node(i);
>> +               uint64_t size = nd->end - nd->start;
>> +
>>                 if ( size == 0 )
>>                         printk(KERN_WARNING "SRAT: Node %u has no memory.
>> "
>>                                "BIOS Bug or mis-configured hardware?\n",
>> i);
>>
>> -               setup_node_bootmem(i, nodes[i].start, nodes[i].end);
>> +               setup_node_bootmem(i, nd->start, nd->end);
>>         }
>>         for (i = 0; i < nr_cpu_ids; i++) {
>>                 if (cpu_to_node[i] == NUMA_NO_NODE)
>>
>
> Cheers,
>
> --
> Julien Grall


* Re: [RFC PATCH v2 09/25] ARM: NUMA: Add existing ARM numa code under CONFIG_NUMA
  2017-05-08 15:58   ` Julien Grall
@ 2017-05-09  7:14     ` Vijay Kilari
  2017-05-09  8:21       ` Julien Grall
  0 siblings, 1 reply; 71+ messages in thread
From: Vijay Kilari @ 2017-05-09  7:14 UTC (permalink / raw)
  To: Julien Grall
  Cc: Tim Deegan, Stefano Stabellini, Wei Liu, George Dunlap,
	Andrew Cooper, Ian Jackson, xen-devel, Jan Beulich,
	Vijaya Kumar K

On Mon, May 8, 2017 at 9:28 PM, Julien Grall <julien.grall@arm.com> wrote:
> Hi Vijay,
>
> On 28/03/17 16:53, vijay.kilari@gmail.com wrote:
>>
>> From: Vijaya Kumar K <Vijaya.Kumar@cavium.com>
>>
>> Right now CONFIG_NUMA is not enabled for ARM and
>> existing code in asm-arm/numa.h is for !CONFIG_NUMA.
>> Hence put this code under #ifndef CONFIG_NUMA.
>>
>> This help to make this changes work when CONFIG_NUMA
>> is not enabled.
>
>
> But you always turn NUMA on by default (see patch #24) and there is no
> possibility to turn off NUMA.

Yes, at the end of the series we enable NUMA by default.
But the intermediate patches of this series would otherwise fail to compile.

>
>>
>> Also define NODES_SHIFT macro for ARM to value 2.
>> This limits number of NUMA nodes supported to 4.
>> There is not hard restrictions on this value set to 2.
>
>
> Again, why only 2 when x86 is supporting 6?
>
> Furthermore, this is not related to this patch itself and should be part of
> separate patch.
>
> Lastly, why don't you move that to a Kconfig allowing the user to configure
> the number of Nodes?

ok

>
>>
>> Signed-off-by: Vijaya Kumar K <Vijaya.Kumar@cavium.com>
>> ---
>>  xen/include/asm-arm/numa.h | 5 +++++
>>  1 file changed, 5 insertions(+)
>>
>> diff --git a/xen/include/asm-arm/numa.h b/xen/include/asm-arm/numa.h
>> index 53f99af..924bfc0 100644
>> --- a/xen/include/asm-arm/numa.h
>> +++ b/xen/include/asm-arm/numa.h
>> @@ -3,6 +3,10 @@
>>
>>  typedef uint8_t nodeid_t;
>>
>> +/* Limit number of NUMA nodes supported to 4 */
>> +#define NODES_SHIFT 2
>
>
> Why this is not covered by CONFIG_NUMA?

The below define is used in generic code irrespective of CONFIG_NUMA

#define MAX_NUMNODES    (1 << NODES_SHIFT)
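
To make the dependency concrete, here is a small sketch. CONFIG_TOY_NODES_SHIFT
is a hypothetical Kconfig-provided symbol (nothing with that name exists in
the series); the fallback of 2 is the value used in this patch, and
toy_node_table is only there to show a table sized from MAX_NUMNODES.

```c
/*
 * CONFIG_TOY_NODES_SHIFT is a hypothetical Kconfig-provided value; it is
 * not a symbol from the series. Fall back to 2, the value in this patch.
 */
#ifndef CONFIG_TOY_NODES_SHIFT
#define CONFIG_TOY_NODES_SHIFT 2
#endif

#define NODES_SHIFT     CONFIG_TOY_NODES_SHIFT
#define MAX_NUMNODES    (1 << NODES_SHIFT)
#define NR_NODE_MEMBLKS (MAX_NUMNODES * 2)

/* Generic code sizes its tables from MAX_NUMNODES, with or without NUMA. */
static unsigned char toy_node_table[MAX_NUMNODES];
```

With the default shift of 2 this gives 4 nodes and 8 memory blocks; a
Kconfig option would only change the first define.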

>
>> +
>> +#ifndef CONFIG_NUMA
>>  /* Fake one node for now. See also node_online_map. */
>>  #define cpu_to_node(cpu) 0
>>  #define node_to_cpumask(node)   (cpu_online_map)
>> @@ -16,6 +20,7 @@ static inline __attribute__((pure)) nodeid_t
>> phys_to_nid(paddr_t addr)
>>  #define node_spanned_pages(nid) (total_pages)
>>  #define node_start_pfn(nid) (pdx_to_pfn(frametable_base_pdx))
>>  #define __node_distance(a, b) (20)
>> +#endif /* CONFIG_NUMA */
>>
>>  static inline unsigned int arch_get_dma_bitsize(void)
>>  {
>>
>
> Cheers,
>
> --
> Julien Grall


* Re: [RFC PATCH v2 10/25] x86: NUMA: Move numa code and make it generic
  2017-05-08 16:41   ` Julien Grall
@ 2017-05-09  7:36     ` Vijay Kilari
  2017-05-09  8:23       ` Julien Grall
  0 siblings, 1 reply; 71+ messages in thread
From: Vijay Kilari @ 2017-05-09  7:36 UTC (permalink / raw)
  To: Julien Grall
  Cc: Tim Deegan, Stefano Stabellini, Wei Liu, George Dunlap,
	Andrew Cooper, Ian Jackson, xen-devel, Jan Beulich,
	Vijaya Kumar K

On Mon, May 8, 2017 at 10:11 PM, Julien Grall <julien.grall@arm.com> wrote:
> Hi Vijay,
>
>
> On 28/03/17 16:53, vijay.kilari@gmail.com wrote:
>>
>> diff --git a/xen/arch/x86/numa.c b/xen/arch/x86/numa.c
>> index 3bdab9a..33c6806 100644
>> --- a/xen/arch/x86/numa.c
>> +++ b/xen/arch/x86/numa.c
>> @@ -10,286 +10,13 @@
>>  #include <xen/ctype.h>
>>  #include <xen/nodemask.h>
>>  #include <xen/numa.h>
>> -#include <xen/keyhandler.h>
>>  #include <xen/time.h>
>>  #include <xen/smp.h>
>>  #include <xen/pfn.h>
>>  #include <asm/acpi.h>
>> -#include <xen/sched.h>
>> -#include <xen/softirq.h>
>> -
>> -static int numa_setup(char *s);
>> -custom_param("numa", numa_setup);
>> -
>> -struct node_data node_data[MAX_NUMNODES];
>> -
>> -/* Mapping from pdx to node id */
>> -unsigned int memnode_shift;
>> -static typeof(*memnodemap) _memnodemap[64];
>> -unsigned long memnodemapsize;
>> -uint8_t *memnodemap;
>> -
>> -nodeid_t __read_mostly cpu_to_node[NR_CPUS] = {
>> -    [0 ... NR_CPUS-1] = NUMA_NO_NODE
>> -};
>> -/*
>> - * Keep BIOS's CPU2node information, should not be used for memory
>> allocaion
>> - */
>> -nodeid_t apicid_to_node[MAX_LOCAL_APIC] = {
>> -    [0 ... MAX_LOCAL_APIC-1] = NUMA_NO_NODE
>> -};
>
>
> Why this is moved in this patch from here to x86/srat.c?

This is x86 specific. I will make a separate patch for this
move.

>
> [...]
>
>> diff --git a/xen/arch/x86/srat.c b/xen/arch/x86/srat.c
>> index 7cf4771..2cc87a3 100644
>> --- a/xen/arch/x86/srat.c
>> +++ b/xen/arch/x86/srat.c
>> @@ -27,6 +27,13 @@ static nodemask_t __initdata memory_nodes_parsed;
>>  static nodemask_t __initdata processor_nodes_parsed;
>>  static struct node __initdata nodes[MAX_NUMNODES];
>>
>> +/*
>> + * Keep BIOS's CPU2node information, should not be used for memory
>> allocaion
>> + */
>> +nodeid_t apicid_to_node[MAX_LOCAL_APIC] = {
>> +    [0 ... MAX_LOCAL_APIC-1] = NUMA_NO_NODE
>> +};
>> +
>
>
> This does not belong to this patch...
Ok
>
>>  struct pxm2node {
>>         unsigned int pxm;
>>         nodeid_t node;
>
>
> [...]
>
>
>> diff --git a/xen/common/numa.c b/xen/common/numa.c
>> new file mode 100644
>> index 0000000..207ebd8
>> --- /dev/null
>> +++ b/xen/common/numa.c
>> @@ -0,0 +1,488 @@
>> +/*
>> + * Common NUMA handling functions for x86 and arm.
>> + * Original code extracted from arch/x86/numa.c
>> + *
>> + * This program is free software; you can redistribute it and/or
>> + * modify it under the terms and conditions of the GNU General Public
>> + * License, version 2, as published by the Free Software Foundation.
>> + *
>> + * This program is distributed in the hope that it will be useful,
>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>> + * GNU General Public License for more details.
>> + *
>> + * You should have received a copy of the GNU General Public License
>> + * along with this program; If not, see <http://www.gnu.org/licenses/>.
>> + */
>> +
>> +#include <xen/mm.h>
>> +#include <xen/string.h>
>> +#include <xen/init.h>
>> +#include <xen/ctype.h>
>> +#include <xen/nodemask.h>
>> +#include <xen/numa.h>
>> +#include <xen/keyhandler.h>
>> +#include <xen/time.h>
>> +#include <xen/smp.h>
>> +#include <xen/pfn.h>
>> +#include <asm/acpi.h>
>> +#include <xen/sched.h>
>> +#include <xen/softirq.h>
>
>
> Whilst you are moving this in a newfile, please order the includes.

I understand that you don't like any code changes in code movement
patch.

>
> [...]
>
>> +static unsigned int __init extract_lsb_from_nodes(const struct node
>> *nodes,
>> +                                                  int numnodes)
>> +{
>> +    unsigned int i, nodes_used = 0;
>> +    unsigned long spdx, epdx;
>> +    unsigned long bitfield = 0, memtop = 0;
>> +
>> +    for ( i = 0; i < numnodes; i++ )
>> +    {
>> +        spdx = paddr_to_pdx(nodes[i].start);
>> +        epdx = paddr_to_pdx(nodes[i].end - 1) + 1;
>> +        if ( spdx >= epdx )
>> +            continue;
>> +        bitfield |= spdx;
>> +        nodes_used++;
>> +        if ( epdx > memtop )
>> +            memtop = epdx;
>> +    }
>> +    if ( nodes_used <= 1 )
>> +        i = BITS_PER_LONG - 1;
>> +    else
>> +        i = find_first_bit(&bitfield, sizeof(unsigned long) * 8);
>> +
>
>
> It is interesting to see that newline was added in the process of moving the
> code.

OK.
>
>> +    memnodemapsize = (memtop >> i) + 1;
>
>
> [....]
>
>> diff --git a/xen/include/xen/numa.h b/xen/include/xen/numa.h
>> index 922fbd8..eed40af 100644
>> --- a/xen/include/xen/numa.h
>> +++ b/xen/include/xen/numa.h
>> @@ -14,6 +14,21 @@
>>
>>  #define MAX_NUMNODES    (1 << NODES_SHIFT)
>>
>> +struct node {
>> +    paddr_t start;
>> +    paddr_t end;
>> +};
>> +
>> +extern int compute_memnode_shift(struct node *nodes, int numnodes,
>> +                                 nodeid_t *nodeids, unsigned int *shift);
>> +extern void numa_init_array(void);
>> +extern bool_t srat_disabled(void);
>> +extern void numa_set_node(int cpu, nodeid_t node);
>> +extern nodeid_t acpi_setup_node(unsigned int pxm);
>> +extern void srat_detect_node(int cpu);
>> +extern void setup_node_bootmem(nodeid_t nodeid, paddr_t start, paddr_t
>> end);
>> +extern void init_cpu_to_node(void);
>
>
> Can you please be consistent with this file and drop the unecessary
> "extern".

I see that the externs are not required here. I will drop them.

>
>> +
>>  #define vcpu_to_node(v) (cpu_to_node((v)->processor))
>>
>>  #define domain_to_node(d) \
>> @@ -23,4 +38,7 @@
>>  bool is_numa_off(void);
>>  bool get_acpi_numa(void);
>>  void set_acpi_numa(bool val);
>> +int get_numa_fake(void);
>> +extern int numa_emulation(uint64_t start_pfn, uint64_t end_pfn);
>> +extern void numa_dummy_init(uint64_t start_pfn, uint64_t end_pfn);
>
>
> Ditto.
>
>
>>  #endif /* _XEN_NUMA_H */
>>
>
> --
> Julien Grall


* Re: [RFC PATCH v2 10/25] x86: NUMA: Move numa code and make it generic
  2017-05-08 16:51   ` Julien Grall
@ 2017-05-09  7:39     ` Vijay Kilari
  2017-05-09  8:26       ` Julien Grall
  0 siblings, 1 reply; 71+ messages in thread
From: Vijay Kilari @ 2017-05-09  7:39 UTC (permalink / raw)
  To: Julien Grall
  Cc: Tim Deegan, Stefano Stabellini, Wei Liu, George Dunlap,
	Andrew Cooper, Ian Jackson, xen-devel, Jan Beulich,
	Vijaya Kumar K

On Mon, May 8, 2017 at 10:21 PM, Julien Grall <julien.grall@arm.com> wrote:
> On 28/03/17 16:53, vijay.kilari@gmail.com wrote:
>>
>> diff --git a/xen/common/numa.c b/xen/common/numa.c
>> new file mode 100644
>> index 0000000..207ebd8
>> --- /dev/null
>> +++ b/xen/common/numa.c
>> @@ -0,0 +1,488 @@
>> +/*
>> + * Common NUMA handling functions for x86 and arm.
>> + * Original code extracted from arch/x86/numa.c
>> + *
>> + * This program is free software; you can redistribute it and/or
>> + * modify it under the terms and conditions of the GNU General Public
>> + * License, version 2, as published by the Free Software Foundation.
>> + *
>> + * This program is distributed in the hope that it will be useful,
>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>> + * GNU General Public License for more details.
>> + *
>> + * You should have received a copy of the GNU General Public License
>> + * along with this program; If not, see <http://www.gnu.org/licenses/>.
>> + */
>> +
>> +#include <xen/mm.h>
>> +#include <xen/string.h>
>> +#include <xen/init.h>
>> +#include <xen/ctype.h>
>> +#include <xen/nodemask.h>
>> +#include <xen/numa.h>
>> +#include <xen/keyhandler.h>
>> +#include <xen/time.h>
>> +#include <xen/smp.h>
>> +#include <xen/pfn.h>
>> +#include <asm/acpi.h>
>> +#include <xen/sched.h>
>> +#include <xen/softirq.h>
>> +
>> +static int numa_setup(char *s);
>> +custom_param("numa", numa_setup);
>> +
>> +struct node_data node_data[MAX_NUMNODES];
>> +
>> +/* Mapping from pdx to node id */
>> +unsigned int memnode_shift;
>> +static typeof(*memnodemap) _memnodemap[64];
>
>
> Also, you move the hardcoded 64 here. But have you checked it is valid for
> ARM?
>
> Regardless that, this sounds like something that should be turned into a
> define and require a comment.

64 is good enough. This _memnodemap is used only when NUMA setup has failed
or NUMA is off, in which case memnode_shift is 63 (BITS_PER_LONG - 1).

So every phys_to_nid() conversion will index within the limits of
_memnodemap[].
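
To make the bound concrete, here is a minimal sketch of the fallback
indexing, assuming a 64-bit build. FALLBACK_SHIFT, FALLBACK_MAP_SIZE and
memnode_index() are illustrative names only and are not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative stand-in for BITS_PER_LONG - 1 on a 64-bit build. */
#define FALLBACK_SHIFT    63
/* Illustrative stand-in for the hardcoded size of _memnodemap[]. */
#define FALLBACK_MAP_SIZE 64

/* Hypothetical helper: index into the fallback map for a given pdx. */
static inline unsigned int memnode_index(uint64_t pdx)
{
    unsigned int idx = (unsigned int)(pdx >> FALLBACK_SHIFT);

    /* With a shift of 63 only the top bit survives, so idx is 0 or 1. */
    assert(idx < FALLBACK_MAP_SIZE);

    return idx;
}
```

With this shift, any 64-bit pdx maps to index 0 or 1, so a 64-entry array
is larger than strictly needed for the fallback case.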

>
> Cheers,
>
> --
> Julien Grall


* Re: [RFC PATCH v2 06/25] x86: NUMA: Add accessors for nodes[] and node_memblk_range[] structs
  2017-05-09  7:02     ` Vijay Kilari
@ 2017-05-09  8:13       ` Julien Grall
  0 siblings, 0 replies; 71+ messages in thread
From: Julien Grall @ 2017-05-09  8:13 UTC (permalink / raw)
  To: Vijay Kilari
  Cc: Tim Deegan, Stefano Stabellini, Wei Liu, George Dunlap,
	Andrew Cooper, Ian Jackson, xen-devel, Jan Beulich,
	Vijaya Kumar K



On 05/09/2017 08:02 AM, Vijay Kilari wrote:
> On Mon, May 8, 2017 at 8:09 PM, Julien Grall <julien.grall@arm.com> wrote:
>> Hi Vijay,
>>
>> On 28/03/17 16:53, vijay.kilari@gmail.com wrote:
>>>
>>> From: Vijaya Kumar K <Vijaya.Kumar@cavium.com>
>>>
>>> Add accessor for nodes[] and other static variables and
>>
>>
>> s/accessor/accessors/
>>
>>> used those accessors.
>>
>>
>> Also, I am not sure to understand the usefulness of those accessors over a
>> global variable.
>
> These are static variables which needs to accessed from other files and
> later moved to generic file.

Contributing 101: always explain in the commit message why you are doing
something. Also, I am quite confused about why you sometimes use a static
variable with helpers and at other times a global variable.

Cheers,

-- 
Julien Grall


* Re: [RFC PATCH v2 09/25] ARM: NUMA: Add existing ARM numa code under CONFIG_NUMA
  2017-05-09  7:14     ` Vijay Kilari
@ 2017-05-09  8:21       ` Julien Grall
  0 siblings, 0 replies; 71+ messages in thread
From: Julien Grall @ 2017-05-09  8:21 UTC (permalink / raw)
  To: Vijay Kilari
  Cc: Tim Deegan, Stefano Stabellini, Wei Liu, George Dunlap,
	Andrew Cooper, Ian Jackson, xen-devel, Jan Beulich,
	Vijaya Kumar K

On 05/09/2017 08:14 AM, Vijay Kilari wrote:
> On Mon, May 8, 2017 at 9:28 PM, Julien Grall <julien.grall@arm.com> wrote:
>> Hi Vijay,
>>
>> On 28/03/17 16:53, vijay.kilari@gmail.com wrote:
>>>
>>> From: Vijaya Kumar K <Vijaya.Kumar@cavium.com>
>>>
>>> Right now CONFIG_NUMA is not enabled for ARM and
>>> existing code in asm-arm/numa.h is for !CONFIG_NUMA.
>>> Hence put this code under #ifndef CONFIG_NUMA.
>>>
>>> This help to make this changes work when CONFIG_NUMA
>>> is not enabled.
>>
>>
>> But you always turn NUMA on by default (see patch #24) and there is no
>> possibility to turn off NUMA.
>
> Yes at the end of the series we enable NUMA by default.
> But the the intermittent patches of this patch series fails to compile.

So, to help this series, you add code that will rot?

I don't like this idea at all; we should avoid adding code to Xen that
will not be used.

>>
>>>
>>> Also define NODES_SHIFT macro for ARM to value 2.
>>> This limits number of NUMA nodes supported to 4.
>>> There is not hard restrictions on this value set to 2.
>>
>>
>> Again, why only 2 when x86 is supporting 6?
>>
>> Furthermore, this is not related to this patch itself and should be part of
>> separate patch.
>>
>> Lastly, why don't you move that to a Kconfig allowing the user to configure
>> the number of Nodes?
>
> ok
>
>>
>>>
>>> Signed-off-by: Vijaya Kumar K <Vijaya.Kumar@cavium.com>
>>> ---
>>>  xen/include/asm-arm/numa.h | 5 +++++
>>>  1 file changed, 5 insertions(+)
>>>
>>> diff --git a/xen/include/asm-arm/numa.h b/xen/include/asm-arm/numa.h
>>> index 53f99af..924bfc0 100644
>>> --- a/xen/include/asm-arm/numa.h
>>> +++ b/xen/include/asm-arm/numa.h
>>> @@ -3,6 +3,10 @@
>>>
>>>  typedef uint8_t nodeid_t;
>>>
>>> +/* Limit number of NUMA nodes supported to 4 */
>>> +#define NODES_SHIFT 2
>>
>>
>> Why this is not covered by CONFIG_NUMA?
>
> The below define is used in generic code irrespective of CONFIG_NUMA
>
> #define MAX_NUMNODES    (1 << NODES_SHIFT)

As you may have noticed, NODES_SHIFT currently does not exist on ARM and
we are still able to compile the generic code. So why do you need to
define it unconditionally?

If you look at the code, xen/numa.h will define NODES_SHIFT to 0 if it
has not previously been defined. So I still don't see any reason for what
you are doing.
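
The fallback pattern described above looks roughly like the sketch below.
It is a standalone illustration, not a copy of xen/numa.h: only
NODES_SHIFT and MAX_NUMNODES are names taken from this thread, and
max_numnodes() is a hypothetical helper added so the result can be
checked.

```c
#include <assert.h>

/* An architecture header may define NODES_SHIFT before this point. */
#ifndef NODES_SHIFT
#define NODES_SHIFT 0   /* no arch override: a single node */
#endif

#define MAX_NUMNODES (1 << NODES_SHIFT)

/* Hypothetical helper exposing the resulting limit. */
static inline int max_numnodes(void)
{
    return MAX_NUMNODES;
}
```

When no architecture header provides a value, MAX_NUMNODES evaluates to 1,
which is why the generic code already builds without an ARM definition.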

Cheers,

-- 
Julien Grall


* Re: [RFC PATCH v2 10/25] x86: NUMA: Move numa code and make it generic
  2017-05-09  7:36     ` Vijay Kilari
@ 2017-05-09  8:23       ` Julien Grall
  0 siblings, 0 replies; 71+ messages in thread
From: Julien Grall @ 2017-05-09  8:23 UTC (permalink / raw)
  To: Vijay Kilari
  Cc: Tim Deegan, Stefano Stabellini, Wei Liu, George Dunlap,
	Andrew Cooper, Ian Jackson, xen-devel, Jan Beulich,
	Vijaya Kumar K



On 05/09/2017 08:36 AM, Vijay Kilari wrote:
> On Mon, May 8, 2017 at 10:11 PM, Julien Grall <julien.grall@arm.com> wrote:
>> Hi Vijay,
>>
>>
>> On 28/03/17 16:53, vijay.kilari@gmail.com wrote:
>>>
>>> diff --git a/xen/arch/x86/numa.c b/xen/arch/x86/numa.c
>>> index 3bdab9a..33c6806 100644
>>> --- a/xen/arch/x86/numa.c
>>> +++ b/xen/arch/x86/numa.c
>>> @@ -10,286 +10,13 @@
>>>  #include <xen/ctype.h>
>>>  #include <xen/nodemask.h>
>>>  #include <xen/numa.h>
>>> -#include <xen/keyhandler.h>
>>>  #include <xen/time.h>
>>>  #include <xen/smp.h>
>>>  #include <xen/pfn.h>
>>>  #include <asm/acpi.h>
>>> -#include <xen/sched.h>
>>> -#include <xen/softirq.h>
>>> -
>>> -static int numa_setup(char *s);
>>> -custom_param("numa", numa_setup);
>>> -
>>> -struct node_data node_data[MAX_NUMNODES];
>>> -
>>> -/* Mapping from pdx to node id */
>>> -unsigned int memnode_shift;
>>> -static typeof(*memnodemap) _memnodemap[64];
>>> -unsigned long memnodemapsize;
>>> -uint8_t *memnodemap;
>>> -
>>> -nodeid_t __read_mostly cpu_to_node[NR_CPUS] = {
>>> -    [0 ... NR_CPUS-1] = NUMA_NO_NODE
>>> -};
>>> -/*
>>> - * Keep BIOS's CPU2node information, should not be used for memory
>>> allocaion
>>> - */
>>> -nodeid_t apicid_to_node[MAX_LOCAL_APIC] = {
>>> -    [0 ... MAX_LOCAL_APIC-1] = NUMA_NO_NODE
>>> -};
>>
>>
>> Why this is moved in this patch from here to x86/srat.c?
>
> This is x86 specific. I will make a separate patch for this
> move.

But x86/numa.c is x86 specific. So why do you move it?

>
>>
>> [...]
>>
>>> diff --git a/xen/arch/x86/srat.c b/xen/arch/x86/srat.c
>>> index 7cf4771..2cc87a3 100644
>>> --- a/xen/arch/x86/srat.c
>>> +++ b/xen/arch/x86/srat.c
>>> @@ -27,6 +27,13 @@ static nodemask_t __initdata memory_nodes_parsed;
>>>  static nodemask_t __initdata processor_nodes_parsed;
>>>  static struct node __initdata nodes[MAX_NUMNODES];
>>>
>>> +/*
>>> + * Keep BIOS's CPU2node information, should not be used for memory
>>> allocaion
>>> + */
>>> +nodeid_t apicid_to_node[MAX_LOCAL_APIC] = {
>>> +    [0 ... MAX_LOCAL_APIC-1] = NUMA_NO_NODE
>>> +};
>>> +
>>
>>
>> This does not belong to this patch...
> Ok
>>
>>>  struct pxm2node {
>>>         unsigned int pxm;
>>>         nodeid_t node;
>>
>>
>> [...]
>>
>>
>>> diff --git a/xen/common/numa.c b/xen/common/numa.c
>>> new file mode 100644
>>> index 0000000..207ebd8
>>> --- /dev/null
>>> +++ b/xen/common/numa.c
>>> @@ -0,0 +1,488 @@
>>> +/*
>>> + * Common NUMA handling functions for x86 and arm.
>>> + * Original code extracted from arch/x86/numa.c
>>> + *
>>> + * This program is free software; you can redistribute it and/or
>>> + * modify it under the terms and conditions of the GNU General Public
>>> + * License, version 2, as published by the Free Software Foundation.
>>> + *
>>> + * This program is distributed in the hope that it will be useful,
>>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>>> + * GNU General Public License for more details.
>>> + *
>>> + * You should have received a copy of the GNU General Public License
>>> + * along with this program; If not, see <http://www.gnu.org/licenses/>.
>>> + */
>>> +
>>> +#include <xen/mm.h>
>>> +#include <xen/string.h>
>>> +#include <xen/init.h>
>>> +#include <xen/ctype.h>
>>> +#include <xen/nodemask.h>
>>> +#include <xen/numa.h>
>>> +#include <xen/keyhandler.h>
>>> +#include <xen/time.h>
>>> +#include <xen/smp.h>
>>> +#include <xen/pfn.h>
>>> +#include <asm/acpi.h>
>>> +#include <xen/sched.h>
>>> +#include <xen/softirq.h>
>>
>>
>> Whilst you are moving this in a newfile, please order the includes.
>
> I understand that you don't like any code changes in code movement
> patch.

Are you saying you blindly copied the headers without even checking they 
are necessary?

Surely you only added the ones that are necessary, which means it would
be OK to sort them: if one is missing, this would be caught by compilation.

Cheers,

-- 
Julien Grall


* Re: [RFC PATCH v2 10/25] x86: NUMA: Move numa code and make it generic
  2017-05-09  7:39     ` Vijay Kilari
@ 2017-05-09  8:26       ` Julien Grall
  0 siblings, 0 replies; 71+ messages in thread
From: Julien Grall @ 2017-05-09  8:26 UTC (permalink / raw)
  To: Vijay Kilari
  Cc: Tim Deegan, Stefano Stabellini, Wei Liu, George Dunlap,
	Andrew Cooper, Ian Jackson, xen-devel, Jan Beulich,
	Vijaya Kumar K



On 05/09/2017 08:39 AM, Vijay Kilari wrote:
> On Mon, May 8, 2017 at 10:21 PM, Julien Grall <julien.grall@arm.com> wrote:
>> On 28/03/17 16:53, vijay.kilari@gmail.com wrote:
>>> +static int numa_setup(char *s);
>>> +custom_param("numa", numa_setup);
>>> +
>>> +struct node_data node_data[MAX_NUMNODES];
>>> +
>>> +/* Mapping from pdx to node id */
>>> +unsigned int memnode_shift;
>>> +static typeof(*memnodemap) _memnodemap[64];
>>
>>
>> Also, you move the hardcoded 64 here. But have you checked it is valid for
>> ARM?
>>
>> Regardless that, this sounds like something that should be turned into a
>> define and require a comment.
>
> 64 is good enough. This _memnodemap is used in case of NUMA failed or off,
> in which case memnode_shift is 63 (BITS_PER_LONG -1).

If it is based on BITS_PER_LONG, then you should use BITS_PER_LONG (via a
proper define) rather than hardcoding it.
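
A minimal sketch of what such a define could look like follows. The names
SKETCH_BITS_PER_LONG, FALLBACK_MEMNODEMAP_SIZE, fallback_memnodemap and
fallback_memnodemap_entries() are hypothetical; SKETCH_BITS_PER_LONG only
stands in for Xen's BITS_PER_LONG so the example compiles on its own.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Stand-in for Xen's BITS_PER_LONG so the sketch is self-contained. */
#define SKETCH_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/*
 * Hypothetical name: number of entries in the static fallback map used
 * when NUMA is off or setup has failed. Tied to the width of long rather
 * than to a bare 64.
 */
#define FALLBACK_MEMNODEMAP_SIZE SKETCH_BITS_PER_LONG

static uint8_t fallback_memnodemap[FALLBACK_MEMNODEMAP_SIZE];

/* Hypothetical helper reporting how many entries the map holds. */
static inline unsigned int fallback_memnodemap_entries(void)
{
    return sizeof(fallback_memnodemap) / sizeof(fallback_memnodemap[0]);
}
```

The point of the named constant is that the comment next to it can state
where the size comes from, instead of leaving readers to guess at "64".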


-- 
Julien Grall


* Re: [RFC PATCH v2 12/25] ARM: NUMA: Parse CPU NUMA information
  2017-05-08 17:31   ` Julien Grall
@ 2017-05-10  5:24     ` Vijay Kilari
  2017-05-10  8:52       ` Julien Grall
  0 siblings, 1 reply; 71+ messages in thread
From: Vijay Kilari @ 2017-05-10  5:24 UTC (permalink / raw)
  To: Julien Grall
  Cc: Tim Deegan, Stefano Stabellini, Wei Liu, George Dunlap,
	Andrew Cooper, Ian Jackson, xen-devel, Jan Beulich,
	Vijaya Kumar K

On Mon, May 8, 2017 at 11:01 PM, Julien Grall <julien.grall@arm.com> wrote:
> Hi Vijay,
>
> The title likely needs to have the work device-tree/DT in it.
>
> On 28/03/17 16:53, vijay.kilari@gmail.com wrote:
>>
>> From: Vijaya Kumar K <Vijaya.Kumar@cavium.com>
>>
>> Parse CPU node and fetch numa-node-id information.
>> For each node-id found, update nodemask_t mask.
>> Refer to /Documentation/devicetree/bindings/numa.txt.
>
>
> In which repository?
>
>
>>
>> Signed-off-by: Vijaya Kumar K <Vijaya.Kumar@cavium.com>
>> ---
>>  xen/arch/arm/Makefile       |  1 +
>>  xen/arch/arm/bootfdt.c      | 16 ++++++++--
>>  xen/arch/arm/numa/Makefile  |  2 ++
>>  xen/arch/arm/numa/dt_numa.c | 78
>> +++++++++++++++++++++++++++++++++++++++++++++
>>  xen/arch/arm/numa/numa.c    | 50 +++++++++++++++++++++++++++++
>>  xen/arch/arm/setup.c        |  4 +++
>>  xen/include/asm-arm/numa.h  | 10 +++++-
>>  xen/include/asm-arm/setup.h |  4 ++-
>>  8 files changed, 161 insertions(+), 4 deletions(-)
>>
>> diff --git a/xen/arch/arm/Makefile b/xen/arch/arm/Makefile
>> index 0ce94a8..d13b79f 100644
>> --- a/xen/arch/arm/Makefile
>> +++ b/xen/arch/arm/Makefile
>> @@ -3,6 +3,7 @@ subdir-$(CONFIG_ARM_64) += arm64
>>  subdir-y += platforms
>>  subdir-$(CONFIG_ARM_64) += efi
>>  subdir-$(CONFIG_ACPI) += acpi
>> +subdir-$(CONFIG_NUMA) += numa
>>
>>  obj-$(CONFIG_HAS_ALTERNATIVE) += alternative.o
>>  obj-y += bootfdt.init.o
>> diff --git a/xen/arch/arm/bootfdt.c b/xen/arch/arm/bootfdt.c
>> index ea188a0..1f876f0 100644
>> --- a/xen/arch/arm/bootfdt.c
>> +++ b/xen/arch/arm/bootfdt.c
>> @@ -62,8 +62,20 @@ static void __init device_tree_get_reg(const __be32
>> **cell, u32 address_cells,
>>      *size = dt_next_cell(size_cells, cell);
>>  }
>>
>> -static u32 __init device_tree_get_u32(const void *fdt, int node,
>> -                                      const char *prop_name, u32 dflt)
>> +bool_t __init device_tree_type_matches(const void *fdt, int node,
>> +                                       const char *match)
>> +{
>> +    const void *prop;
>> +
>> +    prop = fdt_getprop(fdt, node, "device_type", NULL);
>> +    if ( prop == NULL )
>> +        return 0;
>> +
>> +    return strcmp(prop, match) == 0 ? 1 : 0;
>> +}
>> +
>
>
> This change is not explained in the patch and does not belong to it anyway.

OK.
>
>> +u32 __init device_tree_get_u32(const void *fdt, int node,
>> +                               const char *prop_name, u32 dflt)
>
>
> Ditto. I would recommend to read [1] for tips to break down a patch.
>
>
>>  {
>>      const struct fdt_property *prop;
>>
>> diff --git a/xen/arch/arm/numa/Makefile b/xen/arch/arm/numa/Makefile
>> new file mode 100644
>> index 0000000..3af3aff
>> --- /dev/null
>> +++ b/xen/arch/arm/numa/Makefile
>> @@ -0,0 +1,2 @@
>> +obj-y += dt_numa.o
>> +obj-y += numa.o
>> diff --git a/xen/arch/arm/numa/dt_numa.c b/xen/arch/arm/numa/dt_numa.c
>> new file mode 100644
>> index 0000000..66c6efb
>> --- /dev/null
>> +++ b/xen/arch/arm/numa/dt_numa.c
>> @@ -0,0 +1,78 @@
>> +/*
>> + * OF NUMA Parsing support.
>> + *
>> + * Copyright (C) 2015 - 2016 Cavium Inc.
>> + *
>> + * This program is free software; you can redistribute it and/or modify
>> + * it under the terms of the GNU General Public License version 2 as
>> + * published by the Free Software Foundation.
>> + *
>> + * This program is distributed in the hope that it will be useful,
>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>> + * GNU General Public License for more details.
>> + *
>> + * You should have received a copy of the GNU General Public License
>> + * along with this program.  If not, see <http://www.gnu.org/licenses/>.
>> + */
>> +
>> +#include <xen/libfdt/libfdt.h>
>> +#include <xen/mm.h>
>> +#include <xen/nodemask.h>
>> +#include <asm/mm.h>
>
>
> This is already included by xen/mm.h
>
>> +#include <xen/numa.h>
>> +#include <xen/device_tree.h>
>> +#include <asm/setup.h>
>
>
> Please order the include.
>
>> +
>> +extern nodemask_t processor_nodes_parsed;
>
>
> See my comment on patch #11. I may miss of them and hoping you will fix all
> the occurrence in the next version.
>
>
>> +
>> +/*
>> + * Even though we connect cpus to numa domains later in SMP
>> + * init, we need to know the node ids now for all cpus.
>> + */
>> +static int __init dt_numa_process_cpu_node(const void *fdt, int node,
>> +                                           const char *name,
>> +                                           uint32_t address_cells,
>> +                                           uint32_t size_cells)
>> +{
>> +    uint32_t nid;
>> +
>> +    nid = device_tree_get_u32(fdt, node, "numa-node-id", MAX_NUMNODES);
>> +
>> +    if ( nid >= MAX_NUMNODES )
>> +        printk(XENLOG_WARNING "NUMA: Node id %u exceeds maximum value\n",
>> nid);
>> +    else
>> +        node_set(nid, processor_nodes_parsed);
>> +
>> +    return 0;
>> +}
>> +
>> +static int __init dt_numa_scan_cpu_node(const void *fdt, int node,
>> +                                        const char *name, int depth,
>> +                                        uint32_t address_cells,
>> +                                        uint32_t size_cells, void *data)
>> +{
>> +    if ( device_tree_type_matches(fdt, node, "cpu") )
>> +        return dt_numa_process_cpu_node(fdt, node, name, address_cells,
>> +                                        size_cells);
>
>
> As said on v1, this code is wrong. CPUs nodes can only be in /cpus and you
> cannot rely on the name to be "cpu" (see binding in
> Documentation/devicetree/bindings/arm/cpu.txt). The only way to check if it
> is a CPU is to look for the property device_type.

Isn't device_tree_type_matches() already looking for device_type?
Below is the DT info on device_type. Is anything missing?

                cpu@10101 {
                        compatible = "cavium,thunder", "arm,armv8";
                        device_type = "cpu";
                        ....
                };

>
> Cheers,
>
> [1] https://wiki.xenproject.org/wiki/Submitting_Xen_Project_Patches
>
> --
> Julien Grall


* Re: [RFC PATCH v2 12/25] ARM: NUMA: Parse CPU NUMA information
  2017-05-10  5:24     ` Vijay Kilari
@ 2017-05-10  8:52       ` Julien Grall
  0 siblings, 0 replies; 71+ messages in thread
From: Julien Grall @ 2017-05-10  8:52 UTC (permalink / raw)
  To: Vijay Kilari
  Cc: Tim Deegan, Stefano Stabellini, Wei Liu, George Dunlap,
	Andrew Cooper, Ian Jackson, xen-devel, Jan Beulich,
	Vijaya Kumar K

Hi,

On 05/10/2017 06:24 AM, Vijay Kilari wrote:
> On Mon, May 8, 2017 at 11:01 PM, Julien Grall <julien.grall@arm.com> wrote:
>>> +static int __init dt_numa_scan_cpu_node(const void *fdt, int node,
>>> +                                        const char *name, int depth,
>>> +                                        uint32_t address_cells,
>>> +                                        uint32_t size_cells, void *data)
>>> +{
>>> +    if ( device_tree_type_matches(fdt, node, "cpu") )
>>> +        return dt_numa_process_cpu_node(fdt, node, name, address_cells,
>>> +                                        size_cells);
>>
>>
>> As said on v1, this code is wrong. CPUs nodes can only be in /cpus and you
>> cannot rely on the name to be "cpu" (see binding in
>> Documentation/devicetree/bindings/arm/cpu.txt). The only way to check if it
>> is a CPU is to look for the property device_type.
>
> The function device_tree_type_matches() isn't looking for device_type?.
> Below is dt info on device_type. Anything missing?
>
>                 cpu@10101 {
>                         compatible = "cavium,thunder", "arm,armv8";
>                         device_type = "cpu";
>                         ....
>                 };

You only cover one part of my comment and still miss "CPUs nodes can
only be in /cpus".
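
To illustrate checking both conditions together, here is a rough sketch.
It deliberately does not use libfdt or Xen's device tree API: struct
sketch_node and sketch_is_cpu_node() are hypothetical, and the sketch
assumes for simplicity that CPU nodes are direct children of /cpus.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified node description; not a libfdt or Xen type. */
struct sketch_node {
    const char *path;        /* full path, e.g. "/cpus/cpu@10101" */
    const char *device_type; /* device_type property value, or NULL */
};

/* True only for direct children of /cpus whose device_type is "cpu". */
static bool sketch_is_cpu_node(const struct sketch_node *n)
{
    static const char prefix[] = "/cpus/";
    const size_t len = sizeof(prefix) - 1;

    /* Condition 1: the device_type property must be "cpu". */
    if ( n->device_type == NULL || strcmp(n->device_type, "cpu") != 0 )
        return false;

    /* Condition 2: the node must live under /cpus. */
    if ( strncmp(n->path, prefix, len) != 0 )
        return false;

    /* Reject an empty name and deeper descendants such as /cpus/a/b. */
    return n->path[len] != '\0' && strchr(n->path + len, '/') == NULL;
}
```

A node such as /soc/cpu@0 with device_type = "cpu" is rejected here, which
is the case a device_type check alone would let through.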

Cheers,

-- 
Julien Grall


* Re: [RFC PATCH v2 11/25] x86: NUMA: Move common code from srat.c
  2017-05-08 17:06   ` Julien Grall
@ 2017-05-10  9:00     ` Vijay Kilari
  0 siblings, 0 replies; 71+ messages in thread
From: Vijay Kilari @ 2017-05-10  9:00 UTC (permalink / raw)
  To: Julien Grall
  Cc: Tim Deegan, Stefano Stabellini, Wei Liu, George Dunlap,
	Andrew Cooper, Ian Jackson, xen-devel, Jan Beulich,
	Vijaya Kumar K


On Mon, May 8, 2017 at 10:36 PM, Julien Grall <julien.grall@arm.com> wrote:
> Hi Vijay,
>
>
> On 28/03/17 16:53, vijay.kilari@gmail.com wrote:
>>
>> From: Vijaya Kumar K <Vijaya.Kumar@cavium.com>
>>
>> Move code from xen/arch/x86/srat.c to xen/common/numa.c
>> so that it can be used by other archs.
>> Few generic static functions in x86/srat.c are made
>> non-static common/numa.c
>>
>> Signed-off-by: Vijaya Kumar K <Vijaya.Kumar@cavium.com>
>> ---
>>  xen/arch/x86/srat.c        | 152
>> ++-------------------------------------------
>>  xen/common/numa.c          | 146
>> +++++++++++++++++++++++++++++++++++++++++++
>>  xen/include/asm-x86/acpi.h |   3 -
>>  xen/include/asm-x86/numa.h |   2 -
>>  xen/include/xen/numa.h     |  14 +++++
>>  5 files changed, 164 insertions(+), 153 deletions(-)
>>
>> diff --git a/xen/arch/x86/srat.c b/xen/arch/x86/srat.c
>> index 2cc87a3..55947bb 100644
>> --- a/xen/arch/x86/srat.c
>> +++ b/xen/arch/x86/srat.c
>> @@ -23,9 +23,8 @@
>>
>>  static struct acpi_table_slit *__read_mostly acpi_slit;
>>
>> -static nodemask_t __initdata memory_nodes_parsed;
>> -static nodemask_t __initdata processor_nodes_parsed;
>> -static struct node __initdata nodes[MAX_NUMNODES];
>> +extern nodemask_t processor_nodes_parsed;
>> +extern nodemask_t memory_nodes_parsed;
>
>
> On v1, Jan clearly NAK to changes like this. Declarations belong in header
> files. It is a different variable compare to v1, but I would have expected
> you to apply what he said everywhere...

OK, I will move these to header files.

One more change I made is to turn these from static into global variables,
because creating accessor functions around these nodemask_t variables is
tricky: the macros (defined in nodemask.h) do not take pointer parameters.

I will add a comment.
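
For reference, a rough sketch of how a pointer-returning accessor would
interact with such macros. Everything prefixed sketch_ is a simplified,
hypothetical stand-in written for this example; it assumes the real
macros take the mask itself and apply & internally, and it is not the
code from nodemask.h.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-in for nodemask_t: a single word of bits. */
typedef struct { unsigned long bits; } sketch_nodemask_t;

static inline void sketch_node_set_ptr(int node, sketch_nodemask_t *dst)
{
    dst->bits |= 1UL << node;
}

/* The macro takes the mask as an lvalue and takes its address itself. */
#define sketch_node_set(node, dst)   sketch_node_set_ptr((node), &(dst))
#define sketch_node_isset(node, src) ((((src).bits) >> (node)) & 1UL)

static sketch_nodemask_t sketch_processor_nodes_parsed;

/* Hypothetical accessor returning a pointer to the static mask. */
static inline sketch_nodemask_t *sketch_get_processor_nodes(void)
{
    return &sketch_processor_nodes_parsed;
}

/* The caller dereferences the pointer so the macro still sees an lvalue. */
static inline void sketch_mark_cpu_node(int nid)
{
    sketch_node_set(nid, *sketch_get_processor_nodes());
}
```

Under that assumption an accessor is workable, at the cost of a
dereference at every call site, which is the awkwardness referred to
above.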

>
> [...]
>
>> diff --git a/xen/common/numa.c b/xen/common/numa.c
>> index 207ebd8..1789bba 100644
>> --- a/xen/common/numa.c
>> +++ b/xen/common/numa.c
>> @@ -32,6 +32,8 @@
>>  static int numa_setup(char *s);
>>  custom_param("numa", numa_setup);
>>
>> +nodemask_t __initdata memory_nodes_parsed;
>> +nodemask_t __initdata processor_nodes_parsed;
>>  struct node_data node_data[MAX_NUMNODES];
>>
>>  /* Mapping from pdx to node id */
>> @@ -47,6 +49,10 @@ cpumask_t __read_mostly node_to_cpumask[MAX_NUMNODES];
>>
>>  static bool numa_off = 0;
>>  static bool acpi_numa = 1;
>> +static int num_node_memblks;
>> +static struct node node_memblk_range[NR_NODE_MEMBLKS];
>> +static nodeid_t memblk_nodeid[NR_NODE_MEMBLKS];
>> +static struct node __initdata nodes[MAX_NUMNODES];
>
>
> It would make sense to keep those variables together with
> {memory,processor}_nodes_parsed.

ok
>
> [...]
>
>> +int valid_numa_range(paddr_t start, paddr_t end, nodeid_t node)
>> +{
>> +    int i;
>> +
>> +    for (i = 0; i < get_num_node_memblks(); i++) {
>
>
> common/numa.c is using Xen coding style whilst arch/x86/srat.c is using
> Linux coding style.
>
> You decided to validly switch to soft tab, making quite difficult to check
> if this patch is only code movement. But you did not go far enough and fix
> the coding style of the code moved.
>
> Please do it properly and not half of it. For simplicity I would be OK that
> it is done in this patch. But this needs to be clearly written in the commit
> message.

I will mention in the commit message that the coding style of the moved
code was changed in the destination file compared to the source file.

>
>> +        struct node *nd = get_node_memblk_range(i);
>> +
>> +        if (nd->start <= start && nd->end > end &&
>> +            get_memblk_nodeid(i) == node)
>> +            return 1;
>> +    }
>> +
>> +    return 0;
>> +}
>
>
> [...]
>
>
>> diff --git a/xen/include/asm-x86/numa.h b/xen/include/asm-x86/numa.h
>> index 421e8b7..7cff220 100644
>> --- a/xen/include/asm-x86/numa.h
>> +++ b/xen/include/asm-x86/numa.h
>> @@ -47,8 +47,6 @@ static inline __attribute__((pure)) nodeid_t
>> phys_to_nid(paddr_t addr)
>>  #define node_end_pfn(nid)       (NODE_DATA(nid)->node_start_pfn + \
>>                                   NODE_DATA(nid)->node_spanned_pages)
>>
>> -extern int valid_numa_range(paddr_t start, paddr_t end, nodeid_t node);
>> -
>>  void srat_parse_regions(uint64_t addr);
>>  extern uint8_t __node_distance(nodeid_t a, nodeid_t b);
>>  unsigned int arch_get_dma_bitsize(void);
>> diff --git a/xen/include/xen/numa.h b/xen/include/xen/numa.h
>> index eed40af..ee53526 100644
>> --- a/xen/include/xen/numa.h
>> +++ b/xen/include/xen/numa.h
>> @@ -13,6 +13,7 @@
>>  #define NUMA_NO_DISTANCE 0xFF
>>
>>  #define MAX_NUMNODES    (1 << NODES_SHIFT)
>> +#define NR_NODE_MEMBLKS (MAX_NUMNODES * 2)
>>
>>  struct node {
>>      paddr_t start;
>> @@ -28,6 +29,19 @@ extern nodeid_t acpi_setup_node(unsigned int pxm);
>>  extern void srat_detect_node(int cpu);
>>  extern void setup_node_bootmem(nodeid_t nodeid, paddr_t start, paddr_t
>> end);
>>  extern void init_cpu_to_node(void);
>> +extern int valid_numa_range(paddr_t start, paddr_t end, nodeid_t node);
>> +extern int conflicting_memblks(paddr_t start, paddr_t end);
>> +extern void cutoff_node(int i, paddr_t start, paddr_t end);
>> +extern struct node *get_numa_node(int id);
>> +extern nodeid_t get_memblk_nodeid(int memblk);
>> +extern nodeid_t *get_memblk_nodeid_map(void);
>> +extern struct node *get_node_memblk_range(int memblk);
>> +extern struct node *get_memblk(int memblk);
>> +extern int numa_add_memblk(nodeid_t nodeid, paddr_t start, uint64_t
>> size);
>> +extern int get_num_node_memblks(void);
>> +extern int arch_sanitize_nodes_memory(void);
>> +extern void numa_failed(void);
>> +extern int numa_scan_nodes(uint64_t start, uint64_t end);
>
>
> See my comment on the previous patch.

I understand that we can drop this extern
>
>>
>>  #define vcpu_to_node(v) (cpu_to_node((v)->processor))
>>
>>
>
> Cheers,
>
> --
> Julien Grall


* Re: [RFC PATCH v2 24/25] NUMA: Move CONFIG_NUMA to common Kconfig
  2017-03-28 15:53 ` [RFC PATCH v2 24/25] NUMA: Move CONFIG_NUMA to common Kconfig vijay.kilari
@ 2017-05-31 10:04   ` Jan Beulich
  2017-05-31 10:18     ` Julien Grall
  0 siblings, 1 reply; 71+ messages in thread
From: Jan Beulich @ 2017-05-31 10:04 UTC (permalink / raw)
  To: vijay.kilari
  Cc: tim, sstabellini, wei.liu2, George.Dunlap, andrew.cooper3,
	ian.jackson, Vijaya Kumar K, julien.grall, xen-devel

>>> On 28.03.17 at 17:53, <vijay.kilari@gmail.com> wrote:
> --- a/xen/common/Kconfig
> +++ b/xen/common/Kconfig
> @@ -41,6 +41,10 @@ config HAS_GDBSX
>  config HAS_IOPORTS
>  	bool
>  
> +config NUMA
> +	def_bool y
> +	depends on HAS_PDX

What makes this dependency necessary?

> --- a/xen/drivers/acpi/Kconfig
> +++ b/xen/drivers/acpi/Kconfig
> @@ -4,6 +4,3 @@ config ACPI
>  
>  config ACPI_LEGACY_TABLES_LOOKUP
>  	bool
> -
> -config NUMA
> -	bool

This makes clear that so far this is an option which architectures
are expected to select. I think we want it to remain that way, but
if we didn't you should remove the existing select(s).

Also, does it really matter much whether this is under drivers/acpi/
or common/? After all ACPI appears to be a prereq on ARM too.

Jan



* Re: [RFC PATCH v2 25/25] NUMA: Enable ACPI_NUMA config
  2017-03-28 15:53 ` [RFC PATCH v2 25/25] NUMA: Enable ACPI_NUMA config vijay.kilari
@ 2017-05-31 10:05   ` Jan Beulich
  0 siblings, 0 replies; 71+ messages in thread
From: Jan Beulich @ 2017-05-31 10:05 UTC (permalink / raw)
  To: vijay.kilari
  Cc: tim, sstabellini, wei.liu2, George.Dunlap, andrew.cooper3,
	ian.jackson, Vijaya Kumar K, julien.grall, xen-devel

>>> On 28.03.17 at 17:53, <vijay.kilari@gmail.com> wrote:
> --- a/xen/drivers/acpi/Kconfig
> +++ b/xen/drivers/acpi/Kconfig
> @@ -4,3 +4,7 @@ config ACPI
>  
>  config ACPI_LEGACY_TABLES_LOOKUP
>  	bool
> +
> +config ACPI_NUMA
> +	def_bool y
> +	depends on ACPI && NUMA
> --- a/xen/include/asm-x86/config.h
> +++ b/xen/include/asm-x86/config.h
> @@ -37,7 +37,6 @@
>  #define CONFIG_X86_L1_CACHE_SHIFT 7
>  
>  #define CONFIG_ACPI_SLEEP 1
> -#define CONFIG_ACPI_NUMA 1
>  #define CONFIG_ACPI_SRAT 1
>  #define CONFIG_ACPI_CSTATE 1

I'm not convinced the better approach wouldn't be to simply drop
CONFIG_ACPI_NUMA, suitably replacing the few uses.

Jan



* Re: [RFC PATCH v2 24/25] NUMA: Move CONFIG_NUMA to common Kconfig
  2017-05-31 10:04   ` Jan Beulich
@ 2017-05-31 10:18     ` Julien Grall
  2017-05-31 10:37       ` Jan Beulich
  0 siblings, 1 reply; 71+ messages in thread
From: Julien Grall @ 2017-05-31 10:18 UTC (permalink / raw)
  To: Jan Beulich, vijay.kilari
  Cc: tim, sstabellini, wei.liu2, George.Dunlap, andrew.cooper3,
	ian.jackson, Vijaya Kumar K, xen-devel

Hi Jan,

On 31/05/17 11:04, Jan Beulich wrote:
>>>> On 28.03.17 at 17:53, <vijay.kilari@gmail.com> wrote:
>> --- a/xen/common/Kconfig
>> +++ b/xen/common/Kconfig
>> @@ -41,6 +41,10 @@ config HAS_GDBSX
>>  config HAS_IOPORTS
>>  	bool
>>
>> +config NUMA
>> +	def_bool y
>> +	depends on HAS_PDX
>
> What makes necessary this dependency?

IIRC, this is because the numa code is using PDX helpers.

>
>> --- a/xen/drivers/acpi/Kconfig
>> +++ b/xen/drivers/acpi/Kconfig
>> @@ -4,6 +4,3 @@ config ACPI
>>
>>  config ACPI_LEGACY_TABLES_LOOKUP
>>  	bool
>> -
>> -config NUMA
>> -	bool
>
> This makes clear that so far this is an option which architectures
> are expected to select. I think we want it to remain that way, but
> if we didn't you should remove the existing select(s).
>
> Also, does it really matter much whether this is under drivers/acpi/
> or common/? After all ACPI appears to be a prereq on ARM too.

ACPI is not a prereq for NUMA. You can use it with Device Tree too.

Cheers,

-- 
Julien Grall


* Re: [RFC PATCH v2 01/25] x86: NUMA: Clean up: Drop trailing spaces
  2017-03-28 15:53 ` [RFC PATCH v2 01/25] x86: NUMA: Clean up: Drop trailing spaces vijay.kilari
  2017-03-28 16:44   ` Wei Liu
@ 2017-05-31 10:20   ` Jan Beulich
  2017-05-31 10:21   ` Jan Beulich
  2 siblings, 0 replies; 71+ messages in thread
From: Jan Beulich @ 2017-05-31 10:20 UTC (permalink / raw)
  To: vijay.kilari
  Cc: tim, sstabellini, wei.liu2, George.Dunlap, andrew.cooper3,
	ian.jackson, Vijaya Kumar K, julien.grall, xen-devel

>>> On 28.03.17 at 17:53, <vijay.kilari@gmail.com> wrote:
> --- a/xen/include/asm-x86/numa.h
> +++ b/xen/include/asm-x86/numa.h
> @@ -1,4 +1,4 @@
> -#ifndef _ASM_X8664_NUMA_H 
> +#ifndef _ASM_X8664_NUMA_H
>  #define _ASM_X8664_NUMA_H 1
>  
>  #include <xen/cpumask.h>
> @@ -12,21 +12,20 @@ extern int srat_rev;
>  extern nodeid_t      cpu_to_node[NR_CPUS];
>  extern cpumask_t     node_to_cpumask[];
>  
> -#define cpu_to_node(cpu)		(cpu_to_node[cpu])
> -#define parent_node(node)		(node)
> +#define cpu_to_node(cpu)         (cpu_to_node[cpu])
> +#define parent_node(node)        (node)
>  #define node_to_first_cpu(node)  (__ffs(node_to_cpumask[node]))
>  #define node_to_cpumask(node)    (node_to_cpumask[node])
>  
> -struct node { 
> -	u64 start,end; 
> +struct node {
> +    u64 start,end;

Please add a blank after the comma. Also, at least where you
touch lines anyway, please switch to uint64_t and the like.

> @@ -42,14 +41,8 @@ extern void setup_node_bootmem(nodeid_t nodeid, u64 start, u64 end);
>  extern nodeid_t apicid_to_node[];
>  extern void init_cpu_to_node(void);
>  
> -static inline void clear_node_cpumask(int cpu)
> -{
> -	cpumask_clear_cpu(cpu, &node_to_cpumask[cpu_to_node(cpu)]);
> -}
> -
>  /* Simple perfect hash to map pdx to node numbers */
> -extern int memnode_shift; 
> -extern unsigned long memnodemapsize;
> +extern int memnode_shift;

If you remove an extern declaration from a header, the
corresponding definition should become static.

> @@ -60,20 +53,16 @@ struct node_data {
>  extern struct node_data node_data[];
>  
>  static inline __attribute__((pure)) nodeid_t phys_to_nid(paddr_t addr)
> -{ 
> -	nodeid_t nid;
> -	VIRTUAL_BUG_ON((paddr_to_pdx(addr) >> memnode_shift) >= memnodemapsize);
> -	nid = memnodemap[paddr_to_pdx(addr) >> memnode_shift]; 
> -	VIRTUAL_BUG_ON(nid >= MAX_NUMNODES || !node_data[nid]); 
> -	return nid; 
> -} 
> -
> -#define NODE_DATA(nid)		(&(node_data[nid]))
> -
> -#define node_start_pfn(nid)	(NODE_DATA(nid)->node_start_pfn)
> -#define node_spanned_pages(nid)	(NODE_DATA(nid)->node_spanned_pages)
> +{
> +    return memnodemap[paddr_to_pdx(addr) >> memnode_shift];

I think it would be a good idea to convert the ineffective
VIRTUAL_BUG_ON()s to ASSERT()s.
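
A minimal sketch of what this suggestion could look like, assuming
stand-in definitions for everything Xen normally provides (ASSERT(),
paddr_to_pdx(), the types, and the illustrative map contents are all
assumptions of this sketch, not the real definitions). Only the range
checks are kept, since the original !node_data[nid] test applies "!" to
a structure and could only exist because VIRTUAL_BUG_ON() expands to
nothing:

```c
#include <assert.h>
#include <stdint.h>

/* Stand-ins for Xen definitions; all values are illustrative only. */
#define MAX_NUMNODES 4
#define ASSERT(p)    assert(p)   /* assumed to be active in debug builds */

typedef uint64_t paddr_t;
typedef uint8_t  nodeid_t;

static unsigned int  memnode_shift  = 2;
static unsigned long memnodemapsize = 4;
static nodeid_t      memnodemap[4]  = { 0, 0, 1, 1 };

/* Hypothetical 1:1 translation with 4KiB pages, for this sketch only. */
static unsigned long paddr_to_pdx(paddr_t addr)
{
    return (unsigned long)(addr >> 12);
}

static nodeid_t phys_to_nid(paddr_t addr)
{
    unsigned long idx = paddr_to_pdx(addr) >> memnode_shift;
    nodeid_t nid;

    /* Checked in debug builds instead of being silently compiled out. */
    ASSERT(idx < memnodemapsize);
    nid = memnodemap[idx];
    ASSERT(nid < MAX_NUMNODES);

    return nid;
}
```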

> +}
> +
> +#define NODE_DATA(nid)          (&(node_data[nid]))

Please drop the pointless inner parentheses.

Jan


^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [RFC PATCH v2 01/25] x86: NUMA: Clean up: Drop trailing spaces
  2017-03-28 15:53 ` [RFC PATCH v2 01/25] x86: NUMA: Clean up: Drop trailing spaces vijay.kilari
  2017-03-28 16:44   ` Wei Liu
  2017-05-31 10:20   ` Jan Beulich
@ 2017-05-31 10:21   ` Jan Beulich
  2 siblings, 0 replies; 71+ messages in thread
From: Jan Beulich @ 2017-05-31 10:21 UTC (permalink / raw)
  To: vijay.kilari, xen-devel
  Cc: tim, sstabellini, wei.liu2, George.Dunlap, andrew.cooper3,
	ian.jackson, Vijaya Kumar K, julien.grall

>>> On 28.03.17 at 17:53, <vijay.kilari@gmail.com> wrote:
> @@ -60,20 +53,16 @@ struct node_data {
>  extern struct node_data node_data[];
>  
>  static inline __attribute__((pure)) nodeid_t phys_to_nid(paddr_t addr)

Please also take the opportunity and switch to __attribute_pure__
here.

Jan


^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [RFC PATCH v2 02/25] x86: NUMA: Fix datatypes and attributes
  2017-03-28 15:53 ` [RFC PATCH v2 02/25] x86: NUMA: Fix datatypes and attributes vijay.kilari
  2017-03-28 16:44   ` Wei Liu
@ 2017-05-31 10:35   ` Jan Beulich
  1 sibling, 0 replies; 71+ messages in thread
From: Jan Beulich @ 2017-05-31 10:35 UTC (permalink / raw)
  To: vijay.kilari, xen-devel
  Cc: tim, sstabellini, wei.liu2, George.Dunlap, andrew.cooper3,
	ian.jackson, Vijaya Kumar K, julien.grall

>>> On 28.03.17 at 17:53, <vijay.kilari@gmail.com> wrote:
> Change u{8,32,64} to uint{8,32,64}_t and bool_t to bool.

Oh, I see you do this in a separate patch. Please disregard the
respective comments on patch 1 then.

> -bool_t numa_off = 0;
> +bool numa_off = 0;

"false" at the same time (and "true" wherever it is being set).

> @@ -201,19 +201,19 @@ void __init numa_init_array(void)
>  }
>  
>  #ifdef CONFIG_NUMA_EMU
> -static int numa_fake __initdata = 0;
> +static int __initdata numa_fake = 0;

This should be unsigned int, and the initializer can be dropped at the
same time.

>  /* Numa emulation */
> -static int __init numa_emulation(u64 start_pfn, u64 end_pfn)
> +static int __init numa_emulation(uint64_t start_pfn, uint64_t end_pfn)
>  {
>      int i;
>      struct node nodes[MAX_NUMNODES];
> -    u64 sz = ((end_pfn - start_pfn) << PAGE_SHIFT) / numa_fake;
> +    uint64_t sz = ((end_pfn - start_pfn) << PAGE_SHIFT) / numa_fake;
>  
>      /* Kludge needed for the hash function */
>      if ( hweight64(sz) > 1 )
>      {
> -        u64 x = 1;
> +        uint64_t x = 1;
>          while ( (x << 1) < sz )

Please also add the missing blank line between declaration and
statements (same elsewhere).

> @@ -260,8 +260,8 @@ void __init numa_initmem_init(unsigned long start_pfn, unsigned long end_pfn)
>  #endif
>  
>  #ifdef CONFIG_ACPI_NUMA
> -    if ( !numa_off && !acpi_scan_nodes((u64)start_pfn << PAGE_SHIFT,
> -         (u64)end_pfn << PAGE_SHIFT) )
> +    if ( !numa_off && !acpi_scan_nodes((uint64_t)start_pfn << PAGE_SHIFT,
> +         (uint64_t)end_pfn << PAGE_SHIFT) )

Instead of altering the casts please use pfn_to_paddr() (also further
down).
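
To illustrate why a helper is preferable to open-coded casts, here is a
small sketch. The macro body and PAGE_SHIFT value are assumptions made
for this example and may not match the definition in the Xen tree, and
pfn_range_to_paddr() is a hypothetical helper:

```c
#include <stdint.h>

#define PAGE_SHIFT 12            /* 4KiB pages, assumed for this sketch */

typedef uint64_t paddr_t;

/*
 * Widen first, then shift, so that the high bits survive even when the
 * pfn type is only 32 bits wide.
 */
#define pfn_to_paddr(pfn) ((paddr_t)(pfn) << PAGE_SHIFT)

struct range {
    paddr_t start, end;
};

/* Hypothetical helper: convert a pfn range without any open-coded cast. */
static struct range pfn_range_to_paddr(unsigned long start_pfn,
                                       unsigned long end_pfn)
{
    struct range r = { pfn_to_paddr(start_pfn), pfn_to_paddr(end_pfn) };

    return r;
}
```

Keeping the widening inside one macro means a caller cannot forget the
cast or apply it after the shift.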

> --- a/xen/arch/x86/srat.c
> +++ b/xen/arch/x86/srat.c
> @@ -23,12 +23,12 @@
>  
>  static struct acpi_table_slit *__read_mostly acpi_slit;
>  
> -static nodemask_t memory_nodes_parsed __initdata;
> -static nodemask_t processor_nodes_parsed __initdata;
> -static struct node nodes[MAX_NUMNODES] __initdata;
> +static nodemask_t __initdata memory_nodes_parsed;
> +static nodemask_t __initdata processor_nodes_parsed;
> +static struct node __initdata nodes[MAX_NUMNODES];
>  
>  struct pxm2node {
> -	unsigned pxm;
> +	unsigned int pxm;
>  	nodeid_t node;
>  };

How come there are still tabs left here after patch 1?

> @@ -64,9 +64,9 @@ nodeid_t pxm_to_node(unsigned pxm)
>  nodeid_t setup_node(unsigned pxm)
>  {
>  	nodeid_t node;
> -	unsigned idx;
> -	static bool_t warned;
> -	static unsigned nodes_found;
> +	unsigned int idx;
> +	static bool warned;

Again together with the type change you should also change the
setting of the variable to use "true".

> @@ -134,7 +134,7 @@ static __init int conflicting_memblks(u64 start, u64 end)
>  	return -1;
>  }
>  
> -static __init void cutoff_node(int i, u64 start, u64 end)
> +static void __init cutoff_node(int i, paddr_t start, paddr_t end)

The first parameter also wants changing (presumably nodeid_t).

> @@ -274,7 +274,7 @@ acpi_numa_processor_affinity_init(const struct acpi_srat_cpu_affinity *pa)
>  void __init
>  acpi_numa_memory_affinity_init(const struct acpi_srat_mem_affinity *ma)
>  {
> -	u64 start, end;
> +	uint64_t start, end;

Why not paddr_t, like you do elsewhere?

> @@ -428,9 +428,9 @@ static int __init srat_parse_region(struct acpi_subtable_header *header,
>  	return 0;
>  }
>  
> -void __init srat_parse_regions(u64 addr)
> +void __init srat_parse_regions(uint64_t addr)

Same (at least) here.

Jan


^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [RFC PATCH v2 24/25] NUMA: Move CONFIG_NUMA to common Kconfig
  2017-05-31 10:18     ` Julien Grall
@ 2017-05-31 10:37       ` Jan Beulich
  2017-06-15  7:52         ` Vijay Kilari
  0 siblings, 1 reply; 71+ messages in thread
From: Jan Beulich @ 2017-05-31 10:37 UTC (permalink / raw)
  To: Julien Grall, vijay.kilari
  Cc: tim, sstabellini, wei.liu2, George.Dunlap, andrew.cooper3,
	ian.jackson, Vijaya Kumar K, xen-devel

>>> On 31.05.17 at 12:18, <julien.grall@arm.com> wrote:
> On 31/05/17 11:04, Jan Beulich wrote:
>>>>> On 28.03.17 at 17:53, <vijay.kilari@gmail.com> wrote:
>>> --- a/xen/common/Kconfig
>>> +++ b/xen/common/Kconfig
>>> @@ -41,6 +41,10 @@ config HAS_GDBSX
>>>  config HAS_IOPORTS
>>>  	bool
>>>
>>> +config NUMA
>>> +	def_bool y
>>> +	depends on HAS_PDX
>>
>> What makes necessary this dependency?
> 
> IIRC, this is because the numa code is using PDX helpers.

Well, these helpers should have 1:1 translation equivalents for
the non-PDX case; I don't see the need for the dependency.

>>> --- a/xen/drivers/acpi/Kconfig
>>> +++ b/xen/drivers/acpi/Kconfig
>>> @@ -4,6 +4,3 @@ config ACPI
>>>
>>>  config ACPI_LEGACY_TABLES_LOOKUP
>>>  	bool
>>> -
>>> -config NUMA
>>> -	bool
>>
>> This makes clear that so far this is an option which architectures
>> are expected to select. I think we want it to remain that way, but
>> if we didn't you should remove the existing select(s).
>>
>> Also, does it really matter much whether this is under drivers/acpi/
>> or common/? After all ACPI appears to be a prereq on ARM too.
> 
> ACPI is not a prereq for NUMA. You can use it with Device Tree too.

Oh, okay. That should be said in the commit message then.

Jan


^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [RFC PATCH v2 24/25] NUMA: Move CONFIG_NUMA to common Kconfig
  2017-05-31 10:37       ` Jan Beulich
@ 2017-06-15  7:52         ` Vijay Kilari
  2017-06-15  9:00           ` Julien Grall
  0 siblings, 1 reply; 71+ messages in thread
From: Vijay Kilari @ 2017-06-15  7:52 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Tim Deegan, Stefano Stabellini, Wei Liu, George Dunlap,
	Andrew Cooper, Ian Jackson, Vijaya Kumar K, Julien Grall,
	xen-devel

On Wed, May 31, 2017 at 4:07 PM, Jan Beulich <JBeulich@suse.com> wrote:
>>>> On 31.05.17 at 12:18, <julien.grall@arm.com> wrote:
>> On 31/05/17 11:04, Jan Beulich wrote:
>>>>>> On 28.03.17 at 17:53, <vijay.kilari@gmail.com> wrote:
>>>> --- a/xen/common/Kconfig
>>>> +++ b/xen/common/Kconfig
>>>> @@ -41,6 +41,10 @@ config HAS_GDBSX
>>>>  config HAS_IOPORTS
>>>>     bool
>>>>
>>>> +config NUMA
>>>> +   def_bool y
>>>> +   depends on HAS_PDX
>>>
>>> What makes necessary this dependency?
>>
>> IIRC, this is because the numa code is using PDX helpers.
>
> Well, these helpers should have 1:1 translation equivalents for
> the non-PDX case; I don't see the need for the dependency.

PDX is necessary: without it, Xen fails to compile for ARM.
IMO, there is no equivalent non-PDX support available.

Since PDX is a mandatory config, I propose to remove this dependency from
the NUMA config. OK?

>
>>>> --- a/xen/drivers/acpi/Kconfig
>>>> +++ b/xen/drivers/acpi/Kconfig
>>>> @@ -4,6 +4,3 @@ config ACPI
>>>>
>>>>  config ACPI_LEGACY_TABLES_LOOKUP
>>>>     bool
>>>> -
>>>> -config NUMA
>>>> -   bool
>>>
>>> This makes clear that so far this is an option which architectures
>>> are expected to select. I think we want it to remain that way, but
>>> if we didn't you should remove the existing select(s).
>>>
>>> Also, does it really matter much whether this is under drivers/acpi/
>>> or common/? After all ACPI appears to be a prereq on ARM too.
>>
>> ACPI is not a prereq for NUMA. You can use it with Device Tree too.
>
> Oh, okay. That should be said in the commit message then.
>
> Jan
>

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [RFC PATCH v2 24/25] NUMA: Move CONFIG_NUMA to common Kconfig
  2017-06-15  7:52         ` Vijay Kilari
@ 2017-06-15  9:00           ` Julien Grall
  0 siblings, 0 replies; 71+ messages in thread
From: Julien Grall @ 2017-06-15  9:00 UTC (permalink / raw)
  To: Vijay Kilari, Jan Beulich
  Cc: Tim Deegan, Stefano Stabellini, Wei Liu, George Dunlap,
	Andrew Cooper, Ian Jackson, Vijaya Kumar K, xen-devel

Hi Vijay,

On 15/06/17 08:52, Vijay Kilari wrote:
> On Wed, May 31, 2017 at 4:07 PM, Jan Beulich <JBeulich@suse.com> wrote:
>>>>> On 31.05.17 at 12:18, <julien.grall@arm.com> wrote:
>>> On 31/05/17 11:04, Jan Beulich wrote:
>>>>>>> On 28.03.17 at 17:53, <vijay.kilari@gmail.com> wrote:
>>>>> --- a/xen/common/Kconfig
>>>>> +++ b/xen/common/Kconfig
>>>>> @@ -41,6 +41,10 @@ config HAS_GDBSX
>>>>>  config HAS_IOPORTS
>>>>>     bool
>>>>>
>>>>> +config NUMA
>>>>> +   def_bool y
>>>>> +   depends on HAS_PDX
>>>>
>>>> What makes necessary this dependency?
>>>
>>> IIRC, this is because the numa code is using PDX helpers.
>>
>> Well, these helpers should have 1:1 translation equivalents for
>> the non-PDX case; I don't see the need for the dependency.
>
> PDX is necessary. Without that xen fails to compile for ARM.
> IMO, there is no equivalent non-PDX support available.

I am well aware that ARM requires PDX... This is because not all of the
memory is contiguous on ARM and we don't want to waste space in the frametable.

But this is not the point of the discussion. A new architecture may
decide that PDX is not necessary but still want to use NUMA. That is why
Jan suggested 1:1 helpers for the non-PDX case.
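
A rough sketch of what such 1:1 helpers could look like. The
CONFIG_HAS_PDX guard, the function forms, and PAGE_SHIFT are
assumptions of this example; the real helpers in the tree may be macros
with different signatures:

```c
#include <stdint.h>

#define PAGE_SHIFT 12            /* 4KiB pages, assumed for this sketch */

typedef uint64_t paddr_t;

#ifdef CONFIG_HAS_PDX
/* Provided by the real PDX compression code on such architectures. */
unsigned long pfn_to_pdx(unsigned long pfn);
unsigned long pdx_to_pfn(unsigned long pdx);
#else
/* Without PDX, a page index is simply the frame number itself. */
static inline unsigned long pfn_to_pdx(unsigned long pfn)
{
    return pfn;
}

static inline unsigned long pdx_to_pfn(unsigned long pdx)
{
    return pdx;
}
#endif

/* NUMA code can then use one spelling in both configurations. */
static inline unsigned long paddr_to_pdx(paddr_t pa)
{
    return pfn_to_pdx((unsigned long)(pa >> PAGE_SHIFT));
}
```

With identity fallbacks like these, the NUMA code would not need a
Kconfig dependency on PDX at all.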

>
> As it is mandatory config, I propose to remove this dependency with
> NUMA config. ok?

This is what Jan suggested and I am happy with that.

Cheers,

-- 
Julien Grall

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [RFC PATCH v2 03/25] x86: NUMA: Rename and sanitize some common functions
  2017-03-28 15:53 ` [RFC PATCH v2 03/25] x86: NUMA: Rename and sanitize some common functions vijay.kilari
@ 2017-06-30 14:05   ` Jan Beulich
  2017-07-11 10:16     ` Vijay Kilari
  0 siblings, 1 reply; 71+ messages in thread
From: Jan Beulich @ 2017-06-30 14:05 UTC (permalink / raw)
  To: vijay.kilari
  Cc: tim, sstabellini, wei.liu2, George.Dunlap, andrew.cooper3,
	ian.jackson, Vijaya.Kumar, julien.grall, xen-devel

>>> <vijay.kilari@gmail.com> 03/28/17 5:54 PM >>>
> --- a/xen/arch/x86/numa.c
> +++ b/xen/arch/x86/numa.c
> @@ -53,15 +53,15 @@ int srat_disabled(void)
>  /*
>   * Given a shift value, try to populate memnodemap[]
>   * Returns :
> - * 1 if OK
> - * 0 if memnodmap[] too small (of shift too small)
> - * -1 if node overlap or lost ram (shift too big)
> + * 0 if OK
> + * -EINVAL if memnodmap[] too small (of shift too small)
> + * OR if node overlap or lost ram (shift too big)

It may not matter too much, but you're actually making things worse for
the caller, as it now can't distinguish the two failure modes anymore.
Also, if you touch it anyway, please also correct the apparent typo
("of" quite likely meant to be "or"). But what I consider most problematic
is that you convert ...

> @@ -74,7 +74,7 @@ static int __init populate_memnodemap(const struct node *nodes,
>              return 0;

... what is an error case so far to a success one.

> @@ -116,10 +116,10 @@ static int __init allocate_cachealigned_memnodemap(void)
>   * The LSB of all start and end addresses in the node map is the value of the
>   * maximum possible shift.
>   */
> -static int __init extract_lsb_from_nodes(const struct node *nodes,
> -                                         int numnodes)
> +static unsigned int __init extract_lsb_from_nodes(const struct node *nodes,
> +                                                  int numnodes)

Why would you convert the return type to unsigned, but not also that of the
bogusly signed parameter?

> @@ -143,27 +143,27 @@ static int __init extract_lsb_from_nodes(const struct node *nodes,
>      return i;
>  }
>  
> -int __init compute_hash_shift(struct node *nodes, int numnodes,
> -                              nodeid_t *nodeids)
> +int __init compute_memnode_shift(struct node *nodes, int numnodes,
> +                                 nodeid_t *nodeids, unsigned int *shift)

I'm not in favor of returning the shift count via pointer when it can easily
be returned by value.

>  {
> -    int shift;
> +    *shift = extract_lsb_from_nodes(nodes, numnodes);
>  
> -    shift = extract_lsb_from_nodes(nodes, numnodes);
>      if ( memnodemapsize <= ARRAY_SIZE(_memnodemap) )
>          memnodemap = _memnodemap;
>      else if ( allocate_cachealigned_memnodemap() )
> -        return -1;
> -    printk(KERN_DEBUG "NUMA: Using %d for the hash shift.\n", shift);
> +        return -ENOMEM;
> +
> +    printk(KERN_DEBUG "NUMA: Using %u for the hash shift.\n", *shift);
>  
> -    if ( populate_memnodemap(nodes, numnodes, shift, nodeids) != 1 )
> +    if ( populate_memnodemap(nodes, numnodes, *shift, nodeids) )
>      {
>          printk(KERN_INFO "Your memory is not aligned you need to "
>                 "rebuild your hypervisor with a bigger NODEMAPSIZE "
> -               "shift=%d\n", shift);
> -        return -1;
> +               "shift=%u\n", *shift);
> +        return -EINVAL;

So you make populate_memnodemap() return proper error values, but then discard
them and uniformly use -EINVAL here. If you mean the function to simply return a
success/failure indicator, make it return bool. Otherwise use the error value
it returns (even if right now it can only ever be -EINVAL).
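
A small sketch combining the last two remarks: the shift is returned by
value, and the callee's error code is passed through unchanged. Both
function names and the stub's failure condition are hypothetical and
exist only for this illustration:

```c
#include <errno.h>

/* Hypothetical stand-in: returns 0 on success or a negative errno value. */
static int populate_memnodemap_stub(unsigned int shift)
{
    return shift > 63 ? -EINVAL : 0;
}

/*
 * Returns the shift (>= 0) on success, or a negative errno value on
 * failure. No output pointer is needed, and whatever error the callee
 * reported reaches the caller as-is.
 */
static int compute_memnode_shift_sketch(unsigned int candidate_shift)
{
    int rc = populate_memnodemap_stub(candidate_shift);

    if ( rc )
        return rc;

    return (int)candidate_shift;
}
```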

Jan

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [RFC PATCH v2 04/25] x86: NUMA: Add accessors for acpi_numa, numa_off and numa_fake variables
  2017-03-28 15:53 ` [RFC PATCH v2 04/25] x86: NUMA: Add accessors for acpi_numa, numa_off and numa_fake variables vijay.kilari
  2017-04-20 15:59   ` Julien Grall
@ 2017-06-30 14:07   ` Jan Beulich
  1 sibling, 0 replies; 71+ messages in thread
From: Jan Beulich @ 2017-06-30 14:07 UTC (permalink / raw)
  To: vijay.kilari
  Cc: tim, sstabellini, wei.liu2, George.Dunlap, andrew.cooper3,
	ian.jackson, Vijaya.Kumar, julien.grall, xen-devel

>>> <vijay.kilari@gmail.com> 03/28/17 5:55 PM >>>
> --- a/xen/arch/x86/numa.c
> +++ b/xen/arch/x86/numa.c
> @@ -42,12 +42,27 @@ cpumask_t __read_mostly node_to_cpumask[MAX_NUMNODES];
>  
>  nodemask_t __read_mostly node_online_map = { { [0] = 1UL } };
>  
> -bool numa_off = 0;
> -s8 acpi_numa = 0;
> +static bool numa_off = 0;
> +static bool acpi_numa = 1;
>  
> -int srat_disabled(void)
> +bool is_numa_off(void)

numa_enabled() (or less desirably numa_disabled())

> +bool get_acpi_numa(void)

acpi_numa_enabled() then perhaps.

IIRC Julien has already commented on the non-boolean nature of acpi_numa.
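
For illustration, a predicate-style accessor along the suggested naming
might look like the sketch below. numa_disable() is a hypothetical
setter added only so the example is complete, and acpi_numa is left out
on purpose because of its non-boolean nature:

```c
#include <stdbool.h>

/* Prototypes as they might appear in a shared header. */
bool numa_enabled(void);
void numa_disable(void);

/* The variable itself stays private to this translation unit. */
static bool numa_off;

/* Reads naturally at call sites: if ( numa_enabled() ) ... */
bool numa_enabled(void)
{
    return !numa_off;
}

/* Hypothetical setter, e.g. for a command line option handler. */
void numa_disable(void)
{
    numa_off = true;
}
```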

> @@ -202,13 +217,17 @@ void __init numa_init_array(void)
>  
>  #ifdef CONFIG_NUMA_EMU
>  static int __initdata numa_fake = 0;
> +static int get_numa_fake(void)
> +{
> +    return numa_fake;
> +}

I don't see the point of having static accessors for static variables. Even
if the accessor became non-static, I'd expect it to be used only in other
translation units.

Jan

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [RFC PATCH v2 05/25] x86: NUMA: Move generic dummy_numa_init to separate function
  2017-03-28 15:53 ` [RFC PATCH v2 05/25] x86: NUMA: Move generic dummy_numa_init to separate function vijay.kilari
  2017-04-20 16:12   ` Julien Grall
@ 2017-06-30 14:08   ` Jan Beulich
  1 sibling, 0 replies; 71+ messages in thread
From: Jan Beulich @ 2017-06-30 14:08 UTC (permalink / raw)
  To: vijay.kilari
  Cc: tim, sstabellini, wei.liu2, George.Dunlap, andrew.cooper3,
	ian.jackson, Vijaya.Kumar, julien.grall, xen-devel

>>> <vijay.kilari@gmail.com> 03/28/17 5:54 PM >>>
> @@ -301,6 +290,22 @@ void __init numa_initmem_init(unsigned long start_pfn, unsigned long end_pfn)
>                      (paddr_t)end_pfn << PAGE_SHIFT);
>  }
>  
> +void __init numa_initmem_init(unsigned long start_pfn, unsigned long end_pfn)
> +{
> +#ifdef CONFIG_NUMA_EMU
> +    if ( get_numa_fake() && !numa_emulation(start_pfn, end_pfn) )
> +        return;
> +#endif
> +
> +#ifdef CONFIG_ACPI_NUMA
> +    if ( !is_numa_off() && !acpi_scan_nodes((uint64_t)start_pfn << PAGE_SHIFT,
> +         (uint64_t)end_pfn << PAGE_SHIFT) )

Please use pfn_to_paddr() as you move this code.

Jan

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [RFC PATCH v2 03/25] x86: NUMA: Rename and sanitize some common functions
  2017-06-30 14:05   ` Jan Beulich
@ 2017-07-11 10:16     ` Vijay Kilari
  0 siblings, 0 replies; 71+ messages in thread
From: Vijay Kilari @ 2017-07-11 10:16 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Tim Deegan, Stefano Stabellini, Wei Liu, George Dunlap,
	Andrew Cooper, Ian Jackson, Kumar, Vijaya, Julien Grall,
	xen-devel

Hi Jan,

Sorry for the late reply.

On Fri, Jun 30, 2017 at 7:35 PM, Jan Beulich <jbeulich@suse.com> wrote:
>>>> <vijay.kilari@gmail.com> 03/28/17 5:54 PM >>>
>> --- a/xen/arch/x86/numa.c
>> +++ b/xen/arch/x86/numa.c
>> @@ -53,15 +53,15 @@ int srat_disabled(void)
>>  /*
>>   * Given a shift value, try to populate memnodemap[]
>>   * Returns :
>> - * 1 if OK
>> - * 0 if memnodmap[] too small (of shift too small)
>> - * -1 if node overlap or lost ram (shift too big)
>> + * 0 if OK
>> + * -EINVAL if memnodmap[] too small (of shift too small)
>> + * OR if node overlap or lost ram (shift too big)
>
> It may not matter too much, but you're making things actually worse to
> the caller, as it now can't distinguish the two failure modes anymore.
> Also, if you already touch it, please also correct the apparent typo
> ("of" quite likely meant to be "or"). But what I consider most problematic
> is that you convert ...

OK. I propose to return different error values for the two failure modes:
-ENOMEM if memnodemap[] is too small, and
-EINVAL if there is node overlap or lost RAM.

But in any case it does not matter much, so I can drop this change.
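
A simplified, self-contained model of this proposal. It is not the
actual Xen function: the node ranges are given directly in pdx units,
the map size is a small constant, and the names are made up for the
sketch. A caller that finds no usable node at all gets -EINVAL here,
which is also an assumption:

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>

#define NUMA_NO_NODE 0xFF
#define MAP_SIZE     8           /* illustrative size of memnodemap[] */

typedef uint8_t nodeid_t;

struct node {
    unsigned long spdx, epdx;    /* [spdx, epdx) in pdx units */
};

static nodeid_t memnodemap[MAP_SIZE];

static int populate_memnodemap_sketch(const struct node *nodes,
                                      unsigned int numnodes,
                                      unsigned int shift)
{
    unsigned int i;
    int rc = -EINVAL;            /* no usable node found so far */

    memset(memnodemap, NUMA_NO_NODE, sizeof(memnodemap));

    for ( i = 0; i < numnodes; i++ )
    {
        unsigned long spdx = nodes[i].spdx, epdx = nodes[i].epdx;

        if ( spdx >= epdx )
            continue;

        if ( (epdx >> shift) >= MAP_SIZE )
            return -ENOMEM;      /* map too small (or shift too small) */

        do {
            if ( memnodemap[spdx >> shift] != NUMA_NO_NODE )
                return -EINVAL;  /* two nodes share a slot: shift too big */

            memnodemap[spdx >> shift] = (nodeid_t)i;
            spdx += 1UL << shift;
        } while ( spdx < epdx );

        rc = 0;
    }

    return rc;
}
```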

> ... what is an error case so far to a success one.
>
>> @@ -116,10 +116,10 @@ static int __init allocate_cachealigned_memnodemap(void)
>>   * The LSB of all start and end addresses in the node map is the value of the
>>   * maximum possible shift.
>>   */
>> -static int __init extract_lsb_from_nodes(const struct node *nodes,
>> -                                         int numnodes)
>> +static unsigned int __init extract_lsb_from_nodes(const struct node *nodes,
>> +                                                  int numnodes)
>
> Why would you convert the return type to unsigned, but not also that of the
> bogusly signed parameter?

The return type is changed because the type of memnode_shift is changed
from int to unsigned int.

I will change the int parameter to unsigned int.
Apart from that, I see that the variable 'i' in extract_lsb_from_nodes() is int.
This needs to be changed to unsigned int as well.
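
A sketch of how the function could look with consistently unsigned
types. It is a simplified model under stated assumptions: node ranges
are given in pdx units, the lowest set bit is found with a plain loop
rather than a bitops helper, and the extra !bitfield guard is an
addition of this sketch:

```c
#include <limits.h>

struct node {
    unsigned long spdx, epdx;    /* [spdx, epdx) in pdx units */
};

static unsigned long memnodemapsize;

static unsigned int extract_lsb_from_nodes_sketch(const struct node *nodes,
                                                  unsigned int numnodes)
{
    unsigned int i, nodes_used = 0, shift = 0;
    unsigned long bitfield = 0, memtop = 0;

    for ( i = 0; i < numnodes; i++ )
    {
        if ( nodes[i].spdx >= nodes[i].epdx )
            continue;

        bitfield |= nodes[i].spdx;
        nodes_used++;
        if ( nodes[i].epdx > memtop )
            memtop = nodes[i].epdx;
    }

    if ( nodes_used <= 1 || !bitfield )
    {
        /* A single node needs no finer granularity than one slot. */
        shift = sizeof(unsigned long) * CHAR_BIT - 1;
    }
    else
    {
        /* Lowest set bit across all start addresses. */
        while ( !((bitfield >> shift) & 1) )
            shift++;
    }

    memnodemapsize = (memtop >> shift) + 1;

    return shift;
}
```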

>
>> @@ -143,27 +143,27 @@ static int __init extract_lsb_from_nodes(const struct node *nodes,
>>      return i;
>>  }
>>
>> -int __init compute_hash_shift(struct node *nodes, int numnodes,
>> -                              nodeid_t *nodeids)
>> +int __init compute_memnode_shift(struct node *nodes, int numnodes,
>> +                                 nodeid_t *nodeids, unsigned int *shift)
>
> I'm not in favor of returning the shift count via pointer when it can easily
> be returned by value.

OK.

>
>>  {
>> -    int shift;
>> +    *shift = extract_lsb_from_nodes(nodes, numnodes);
>>
>> -    shift = extract_lsb_from_nodes(nodes, numnodes);
>>      if ( memnodemapsize <= ARRAY_SIZE(_memnodemap) )
>>          memnodemap = _memnodemap;
>>      else if ( allocate_cachealigned_memnodemap() )
>> -        return -1;
>> -    printk(KERN_DEBUG "NUMA: Using %d for the hash shift.\n", shift);
>> +        return -ENOMEM;
>> +
>> +    printk(KERN_DEBUG "NUMA: Using %u for the hash shift.\n", *shift);
>>
>> -    if ( populate_memnodemap(nodes, numnodes, shift, nodeids) != 1 )
>> +    if ( populate_memnodemap(nodes, numnodes, *shift, nodeids) )
>>      {
>>          printk(KERN_INFO "Your memory is not aligned you need to "
>>                 "rebuild your hypervisor with a bigger NODEMAPSIZE "
>> -               "shift=%d\n", shift);
>> -        return -1;
>> +               "shift=%u\n", *shift);
>> +        return -EINVAL;
>
> So you make populate_memnodemap() return proper error values, but then discard
> it and uniformly use -EINVAL here. If you mean the function to simply return a
> success/failure indicator, make it return bool. Otherwise use the error value
> it return (even if right now it can only ever be -EINVAL).

OK. I will drop this change and keep compute_hash_shift() returning -1 or
the shift value.

Regards
Vijay

^ permalink raw reply	[flat|nested] 71+ messages in thread

end of thread, other threads:[~2017-07-11 10:16 UTC | newest]

Thread overview: 71+ messages
2017-03-28 15:53 [RFC PATCH v2 00/25] ARM: Add Xen NUMA support vijay.kilari
2017-03-28 15:53 ` [RFC PATCH v2 01/25] x86: NUMA: Clean up: Drop trailing spaces vijay.kilari
2017-03-28 16:44   ` Wei Liu
2017-05-31 10:20   ` Jan Beulich
2017-05-31 10:21   ` Jan Beulich
2017-03-28 15:53 ` [RFC PATCH v2 02/25] x86: NUMA: Fix datatypes and attributes vijay.kilari
2017-03-28 16:44   ` Wei Liu
2017-05-31 10:35   ` Jan Beulich
2017-03-28 15:53 ` [RFC PATCH v2 03/25] x86: NUMA: Rename and sanitize some common functions vijay.kilari
2017-06-30 14:05   ` Jan Beulich
2017-07-11 10:16     ` Vijay Kilari
2017-03-28 15:53 ` [RFC PATCH v2 04/25] x86: NUMA: Add accessors for acpi_numa, numa_off and numa_fake variables vijay.kilari
2017-04-20 15:59   ` Julien Grall
2017-04-25  6:54     ` Vijay Kilari
2017-04-25 12:04       ` Julien Grall
2017-04-25 12:20         ` Vijay Kilari
2017-04-25 12:28           ` Julien Grall
2017-04-25 14:54             ` Vijay Kilari
2017-04-25 15:14               ` Julien Grall
2017-04-25 15:43                 ` Jan Beulich
2017-05-02  9:47                   ` Vijay Kilari
2017-05-02  9:54                     ` Jan Beulich
2017-05-08 17:38                     ` Julien Grall
2017-06-30 14:07   ` Jan Beulich
2017-03-28 15:53 ` [RFC PATCH v2 05/25] x86: NUMA: Move generic dummy_numa_init to separate function vijay.kilari
2017-04-20 16:12   ` Julien Grall
2017-04-25  6:59     ` Vijay Kilari
2017-06-30 14:08   ` Jan Beulich
2017-03-28 15:53 ` [RFC PATCH v2 06/25] x86: NUMA: Add accessors for nodes[] and node_memblk_range[] structs vijay.kilari
2017-05-08 14:39   ` Julien Grall
2017-05-09  7:02     ` Vijay Kilari
2017-05-09  8:13       ` Julien Grall
2017-03-28 15:53 ` [RFC PATCH v2 07/25] x86: NUMA: Rename some generic functions vijay.kilari
2017-03-28 15:53 ` [RFC PATCH v2 08/25] x86: NUMA: Sanitize node distance vijay.kilari
2017-03-28 15:53 ` [RFC PATCH v2 09/25] ARM: NUMA: Add existing ARM numa code under CONFIG_NUMA vijay.kilari
2017-05-08 15:58   ` Julien Grall
2017-05-09  7:14     ` Vijay Kilari
2017-05-09  8:21       ` Julien Grall
2017-03-28 15:53 ` [RFC PATCH v2 10/25] x86: NUMA: Move numa code and make it generic vijay.kilari
2017-05-08 16:41   ` Julien Grall
2017-05-09  7:36     ` Vijay Kilari
2017-05-09  8:23       ` Julien Grall
2017-05-08 16:51   ` Julien Grall
2017-05-09  7:39     ` Vijay Kilari
2017-05-09  8:26       ` Julien Grall
2017-03-28 15:53 ` [RFC PATCH v2 11/25] x86: NUMA: Move common code from srat.c vijay.kilari
2017-05-08 17:06   ` Julien Grall
2017-05-10  9:00     ` Vijay Kilari
2017-03-28 15:53 ` [RFC PATCH v2 12/25] ARM: NUMA: Parse CPU NUMA information vijay.kilari
2017-05-08 17:31   ` Julien Grall
2017-05-10  5:24     ` Vijay Kilari
2017-05-10  8:52       ` Julien Grall
2017-03-28 15:53 ` [RFC PATCH v2 13/25] ARM: NUMA: Parse memory " vijay.kilari
2017-03-28 15:53 ` [RFC PATCH v2 14/25] ARM: NUMA: Parse NUMA distance information vijay.kilari
2017-03-28 15:53 ` [RFC PATCH v2 15/25] ARM: NUMA: Add CPU NUMA support vijay.kilari
2017-03-28 15:53 ` [RFC PATCH v2 16/25] ARM: NUMA: Add memory " vijay.kilari
2017-03-28 15:53 ` [RFC PATCH v2 17/25] ARM: NUMA: Add fallback on NUMA failure vijay.kilari
2017-03-28 15:53 ` [RFC PATCH v2 18/25] ARM: NUMA: Do not expose numa info to DOM0 vijay.kilari
2017-03-28 15:53 ` [RFC PATCH v2 19/25] ACPI: Refactor acpi SRAT and SLIT table handling code vijay.kilari
2017-03-28 15:53 ` [RFC PATCH v2 20/25] ARM: NUMA: Extract MPIDR from MADT table vijay.kilari
2017-03-28 15:53 ` [RFC PATCH v2 21/25] ACPI: Move arch specific SRAT parsing vijay.kilari
2017-03-28 15:53 ` [RFC PATCH v2 22/25] ARM: NUMA: Extract proximity from SRAT table vijay.kilari
2017-03-28 15:53 ` [RFC PATCH v2 23/25] ARM: NUMA: Initialize ACPI NUMA vijay.kilari
2017-03-28 15:53 ` [RFC PATCH v2 24/25] NUMA: Move CONFIG_NUMA to common Kconfig vijay.kilari
2017-05-31 10:04   ` Jan Beulich
2017-05-31 10:18     ` Julien Grall
2017-05-31 10:37       ` Jan Beulich
2017-06-15  7:52         ` Vijay Kilari
2017-06-15  9:00           ` Julien Grall
2017-03-28 15:53 ` [RFC PATCH v2 25/25] NUMA: Enable ACPI_NUMA config vijay.kilari
2017-05-31 10:05   ` Jan Beulich
