* [RFC PATCH v1 00/21] ARM: Add Xen NUMA support
@ 2017-02-09 15:56 vijay.kilari
  2017-02-09 15:56 ` [RFC PATCH v1 01/21] ARM: NUMA: Add existing ARM numa code under CONFIG_NUMA vijay.kilari
                   ` (22 more replies)
  0 siblings, 23 replies; 91+ messages in thread
From: vijay.kilari @ 2017-02-09 15:56 UTC (permalink / raw)
  To: julien.grall, sstabellini, andre.przywara, dario.faggioli
  Cc: xen-devel, Vijaya Kumar K

From: Vijaya Kumar K <Vijaya.Kumar@cavium.com>

This RFC patch series adds NUMA support for the ARM platform.
Both DT and ACPI based NUMA are supported.
Only Xen is made NUMA aware. Dom0 NUMA awareness is not
added.

As part of this series, generic NUMA code from the x86 architecture
is reused by moving it into common files.
The new files xen/common/numa.c and xen/common/srat.c are
added, which are common to both x86 and arm.

Patches 1 - 12 & 20 are for DT NUMA and 13 - 19 & 21 are for
ACPI NUMA.

DT NUMA: The following major changes are made
 - Dropped numa-node-id information from the Dom0 DT,
   so that Dom0 devices allocate from node 0 for
   devmalloc requests.
 - The memory DT node is not deleted by EFI. It is exposed to Xen
   to extract NUMA information.
 - On NUMA failure, fall back to non-NUMA booting,
   assuming all memory and CPUs are under node 0.
 - CONFIG_NUMA is introduced.

ACPI NUMA:
 - MADT is parsed before the SRAT table to extract the
   CPU_ID to MPIDR mapping. In Linux, the MADT table is opened
   while parsing the SRAT table to extract the MPIDR. However, this
   approach does not work on Xen, which allows only one table to
   be open at a time: when an ACPI table is opened, Xen
   maps it to a single region, so opening ACPI tables recursively
   overwrites the contents.
 - SRAT table is parsed for ACPI_SRAT_TYPE_GICC_AFFINITY to extract
   proximity info, and the MPIDR is taken from the CPU_ID to MPIDR
   mapping table.
 - SRAT table is parsed for ACPI_SRAT_TYPE_MEMORY_AFFINITY to extract
   memory proximity.
 - Re-use SLIT parsing of x86 for node distance information.
 - CONFIG_ACPI_NUMA is introduced
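
The two-pass ordering described above (cache the CPU_ID to MPIDR
mapping from the MADT first, then resolve it while walking the SRAT)
can be sketched roughly as follows. This is only an illustration, not
the code in this series: every sketch_* name, the fixed table size and
the function signatures are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_CPUS 8   /* hypothetical table size */

/* Hypothetical cache entry: one per MADT GICC structure. */
struct sketch_cpuid_mpidr {
    uint32_t cpu_id;   /* CPU_ID (ACPI processor UID) */
    uint64_t mpidr;
};

static struct sketch_cpuid_mpidr sketch_map[SKETCH_MAX_CPUS];
static size_t sketch_map_len;

/* Pass 1: called for each MADT GICC entry while only the MADT is mapped. */
static int sketch_madt_record(uint32_t cpu_id, uint64_t mpidr)
{
    if ( sketch_map_len >= SKETCH_MAX_CPUS )
        return -1;
    sketch_map[sketch_map_len].cpu_id = cpu_id;
    sketch_map[sketch_map_len].mpidr = mpidr;
    sketch_map_len++;
    return 0;
}

/*
 * Pass 2: called for each SRAT GICC affinity entry while only the SRAT is
 * mapped. The MPIDR comes from the cached table, so the MADT never has to
 * be opened a second time.
 */
static int sketch_srat_lookup_mpidr(uint32_t cpu_id, uint64_t *mpidr)
{
    size_t i;

    assert(mpidr != NULL);
    for ( i = 0; i < sketch_map_len; i++ )
    {
        if ( sketch_map[i].cpu_id == cpu_id )
        {
            *mpidr = sketch_map[i].mpidr;
            return 0;
        }
    }
    return -1; /* SRAT names a CPU that the MADT did not describe */
}
```

Keeping the cache as a flat array means neither pass needs more than
one ACPI table mapped at any time, which is the constraint described
in the first bullet.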

The node_distance() API is implemented separately for x86 and arm
as arm has DT and ACPI based distance information.
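
As a rough illustration of a firmware-agnostic distance lookup, the
sketch below keeps one matrix that either a DT or an ACPI SLIT parser
could fill, and falls back to default values for entries that were
never set. The sketch_* names, the matrix size and the defaults of 10
(local) and 20 (remote) are assumptions for the example, chosen to
match the usual SLIT convention and the existing
__node_distance(a, b) stub; this is not the implementation in this
series.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_MAX_NODES       4    /* hypothetical node limit */
#define SKETCH_LOCAL_DISTANCE  10
#define SKETCH_REMOTE_DISTANCE 20

/* Zero means "not provided by firmware". */
static uint8_t sketch_distance[SKETCH_MAX_NODES][SKETCH_MAX_NODES];

/* Called by whichever parser (DT or ACPI) found distance information. */
static void sketch_set_distance(unsigned int from, unsigned int to, uint8_t d)
{
    assert(from < SKETCH_MAX_NODES && to < SKETCH_MAX_NODES);
    sketch_distance[from][to] = d;
}

/* Lookup with defaults for out-of-range nodes and unset entries. */
static uint8_t sketch_node_distance(unsigned int from, unsigned int to)
{
    if ( from >= SKETCH_MAX_NODES || to >= SKETCH_MAX_NODES )
        return SKETCH_REMOTE_DISTANCE;
    if ( sketch_distance[from][to] != 0 )
        return sketch_distance[from][to];
    return (from == to) ? SKETCH_LOCAL_DISTANCE : SKETCH_REMOTE_DISTANCE;
}
```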

No functional changes are made to the x86 implementation; the code is
only refactored. Hence x86 is compile tested only.

Code is shared at https://github.com/vijaykilari/xen-numa rfc_1

Note: Please use this patch series only for review.
For testing, a patch to the boot allocator is required, which will
be sent outside this series.

Vijaya Kumar K (21):
  ARM: NUMA: Add existing ARM numa code under CONFIG_NUMA
  x86: NUMA: Refactor NUMA code
  NUMA: Move arch specific NUMA code as common
  NUMA: Refactor generic and arch specific code of numa_setup
  ARM: efi: Do not delete memory node from fdt
  ARM: NUMA: Parse CPU NUMA information
  ARM: NUMA: Parse memory NUMA information
  ARM: NUMA: Parse NUMA distance information
  ARM: NUMA: Add CPU NUMA support
  ARM: NUMA: Add memory NUMA support
  ARM: NUMA: Add fallback on NUMA failure
  ARM: NUMA: Do not expose numa info to DOM0
  ACPI: Refactor acpi SRAT and SLIT table handling code
  ACPI: Move srat_disabled to common code
  ARM: NUMA: Extract MPIDR from MADT table
  ARM: NUMA: Extract proximity from SRAT table
  ARM: NUMA: Extract memory proximity from SRAT table
  ARM: NUMA: update node_distance with ACPI support
  ARM: NUMA: Initialize ACPI NUMA
  ARM: NUMA: Enable CONFIG_NUMA config
  ARM: NUMA: Enable CONFIG_ACPI_NUMA config

 xen/arch/arm/Kconfig                |   5 +
 xen/arch/arm/Makefile               |   3 +
 xen/arch/arm/acpi_numa.c            | 257 +++++++++++++++++++++
 xen/arch/arm/bootfdt.c              |  21 +-
 xen/arch/arm/domain_build.c         |   9 +
 xen/arch/arm/dt_numa.c              | 244 ++++++++++++++++++++
 xen/arch/arm/efi/efi-boot.h         |  25 --
 xen/arch/arm/numa.c                 | 249 ++++++++++++++++++++
 xen/arch/arm/setup.c                |   5 +
 xen/arch/arm/smpboot.c              |   3 +
 xen/arch/x86/domain_build.c         |   1 +
 xen/arch/x86/numa.c                 | 318 +-------------------------
 xen/arch/x86/physdev.c              |   1 +
 xen/arch/x86/setup.c                |   1 +
 xen/arch/x86/smpboot.c              |   1 +
 xen/arch/x86/srat.c                 | 183 +--------------
 xen/arch/x86/x86_64/mm.c            |   1 +
 xen/common/Makefile                 |   2 +
 xen/common/numa.c                   | 439 ++++++++++++++++++++++++++++++++++++
 xen/common/srat.c                   | 157 +++++++++++++
 xen/drivers/acpi/numa.c             |  37 +++
 xen/drivers/acpi/osl.c              |   2 +
 xen/drivers/passthrough/vtd/iommu.c |   1 +
 xen/include/acpi/actbl1.h           |  17 +-
 xen/include/asm-arm/acpi.h          |   2 +
 xen/include/asm-arm/numa.h          |  41 ++++
 xen/include/asm-x86/acpi.h          |   2 -
 xen/include/asm-x86/numa.h          |  53 +----
 xen/include/xen/acpi.h              |  39 ++++
 xen/include/xen/device_tree.h       |   7 +
 xen/include/xen/numa.h              |  61 ++++-
 xen/include/xen/srat.h              |  15 ++
 32 files changed, 1620 insertions(+), 582 deletions(-)
 create mode 100644 xen/arch/arm/acpi_numa.c
 create mode 100644 xen/arch/arm/dt_numa.c
 create mode 100644 xen/arch/arm/numa.c
 create mode 100644 xen/common/numa.c
 create mode 100644 xen/common/srat.c
 create mode 100644 xen/include/xen/srat.h

-- 
2.7.4


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


* [RFC PATCH v1 01/21] ARM: NUMA: Add existing ARM numa code under CONFIG_NUMA
  2017-02-09 15:56 [RFC PATCH v1 00/21] ARM: Add Xen NUMA support vijay.kilari
@ 2017-02-09 15:56 ` vijay.kilari
  2017-02-20 11:39   ` Julien Grall
  2017-02-09 15:56 ` [RFC PATCH v1 02/21] x86: NUMA: Refactor NUMA code vijay.kilari
                   ` (21 subsequent siblings)
  22 siblings, 1 reply; 91+ messages in thread
From: vijay.kilari @ 2017-02-09 15:56 UTC (permalink / raw)
  To: julien.grall, sstabellini, andre.przywara, dario.faggioli
  Cc: xen-devel, Vijaya Kumar K

From: Vijaya Kumar K <Vijaya.Kumar@cavium.com>

Right now CONFIG_NUMA is not enabled for ARM, and the
existing code in asm-arm/numa.h is for !CONFIG_NUMA.
Hence put this code under #ifndef CONFIG_NUMA.

This helps the later changes work when CONFIG_NUMA
is not enabled.

Also define NODES_SHIFT macro for ARM.
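
A minimal sketch of the guard pattern, assuming MAX_NUMNODES is
derived as 1 << NODES_SHIFT (as I believe xen/include/xen/numa.h
does). The sketch_* names and the CONFIG_NUMA branch are hypothetical
and only illustrate how the fake single-node macros stay available
when CONFIG_NUMA is off:

```c
#include <assert.h>

#define NODES_SHIFT  2
#define MAX_NUMNODES (1 << NODES_SHIFT)   /* assumed derivation: 4 nodes */

/* nodeid_t is a u8 in asm-arm/numa.h, so every node id must fit in a byte. */
static_assert(MAX_NUMNODES <= 255, "node ids must fit in a u8 nodeid_t");

#ifndef CONFIG_NUMA
/* Fake one node: every CPU belongs to node 0. */
#define cpu_to_node(cpu) 0
#else
/* Hypothetical real mapping, provided elsewhere when NUMA is enabled. */
extern unsigned char sketch_cpu_to_node[];
#define cpu_to_node(cpu) (sketch_cpu_to_node[cpu])
#endif /* !CONFIG_NUMA */

/* Range check that works in both configurations. */
static inline int sketch_node_is_valid(unsigned int nid)
{
    return nid < MAX_NUMNODES;
}
```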

Signed-off-by: Vijaya Kumar K <Vijaya.Kumar@cavium.com>
---
 xen/include/asm-arm/numa.h | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/xen/include/asm-arm/numa.h b/xen/include/asm-arm/numa.h
index a2c1a34..a60c7eb 100644
--- a/xen/include/asm-arm/numa.h
+++ b/xen/include/asm-arm/numa.h
@@ -3,6 +3,9 @@
 
 typedef u8 nodeid_t;
 
+#define NODES_SHIFT 2
+
+#ifndef CONFIG_NUMA
 /* Fake one node for now. See also node_online_map. */
 #define cpu_to_node(cpu) 0
 #define node_to_cpumask(node)   (cpu_online_map)
@@ -16,6 +19,7 @@ static inline __attribute__((pure)) nodeid_t phys_to_nid(paddr_t addr)
 #define node_spanned_pages(nid) (total_pages)
 #define node_start_pfn(nid) (pdx_to_pfn(frametable_base_pdx))
 #define __node_distance(a, b) (20)
+#endif /* CONFIG_NUMA */
 
 static inline unsigned int arch_get_dma_bitsize(void)
 {
-- 
2.7.4



* [RFC PATCH v1 02/21] x86: NUMA: Refactor NUMA code
  2017-02-09 15:56 [RFC PATCH v1 00/21] ARM: Add Xen NUMA support vijay.kilari
  2017-02-09 15:56 ` [RFC PATCH v1 01/21] ARM: NUMA: Add existing ARM numa code under CONFIG_NUMA vijay.kilari
@ 2017-02-09 15:56 ` vijay.kilari
  2017-02-09 16:11   ` Jan Beulich
  2017-02-20 12:37   ` Julien Grall
  2017-02-09 15:56 ` [RFC PATCH v1 03/21] NUMA: Move arch specific NUMA code as common vijay.kilari
                   ` (20 subsequent siblings)
  22 siblings, 2 replies; 91+ messages in thread
From: vijay.kilari @ 2017-02-09 15:56 UTC (permalink / raw)
  To: julien.grall, sstabellini, andre.przywara, dario.faggioli
  Cc: xen-devel, Vijaya Kumar K

From: Vijaya Kumar K <Vijaya.Kumar@cavium.com>

Move generic NUMA code from xen/arch/x86/numa.c to
xen/common/numa.c. Also move the generic declarations from
xen/include/asm-x86/numa.h to xen/include/xen/numa.h.

This common code can be re-used later for ARM.
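
For reviewers unfamiliar with the code being moved, here is a
simplified, standalone sketch of the shift-based lookup that
memnodemap implements: the lowest set bit across all node start
addresses gives the largest usable shift, and each (address >> shift)
slot then maps to exactly one node. The sketch_* names, the fixed
64-entry map and the use of plain page indexes instead of pdx values
are assumptions for illustration; this is not the moved code itself.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_NO_NODE  0xFF
#define SKETCH_MAP_SIZE 64

struct sketch_node { uint64_t start, end; };   /* [start, end), page index */

static uint8_t sketch_memnodemap[SKETCH_MAP_SIZE];
static int sketch_shift;

/* The lowest set bit over all start addresses bounds the usable shift. */
static int sketch_extract_shift(const struct sketch_node *nodes, int n)
{
    uint64_t bits = 0;
    int i, shift = 0;

    for ( i = 0; i < n; i++ )
        bits |= nodes[i].start;
    if ( n <= 1 || bits == 0 )
        return 63;                   /* a single slot covers everything */
    while ( !(bits & 1) )
    {
        bits >>= 1;
        shift++;
    }
    return shift;
}

/* Returns 0 on success, -1 if the map is too small or nodes overlap. */
static int sketch_build(const struct sketch_node *nodes, int n)
{
    uint64_t p;
    int i;

    sketch_shift = sketch_extract_shift(nodes, n);
    memset(sketch_memnodemap, SKETCH_NO_NODE, sizeof(sketch_memnodemap));
    for ( i = 0; i < n; i++ )
    {
        if ( nodes[i].start >= nodes[i].end )
            continue;                /* skip empty ranges */
        if ( ((nodes[i].end - 1) >> sketch_shift) >= SKETCH_MAP_SIZE )
            return -1;
        for ( p = nodes[i].start; p < nodes[i].end;
              p += 1ULL << sketch_shift )
        {
            if ( sketch_memnodemap[p >> sketch_shift] != SKETCH_NO_NODE )
                return -1;
            sketch_memnodemap[p >> sketch_shift] = (uint8_t)i;
        }
    }
    return 0;
}

/* Constant-time lookup: one shift and one array read. */
static uint8_t sketch_phys_to_nid(uint64_t page)
{
    uint64_t idx = page >> sketch_shift;

    assert(idx < SKETCH_MAP_SIZE);
    return sketch_memnodemap[idx];
}
```

With two nodes covering pages [0x000, 0x400) and [0x400, 0x800), the
shift comes out as 10, so the map needs only two slots.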

Signed-off-by: Vijaya Kumar K <Vijaya.Kumar@cavium.com>
---
 xen/arch/x86/numa.c        | 299 ---------------------------------------
 xen/common/Makefile        |   1 +
 xen/common/numa.c          | 342 +++++++++++++++++++++++++++++++++++++++++++++
 xen/include/asm-x86/numa.h |  47 -------
 xen/include/xen/numa.h     |  54 +++++++
 5 files changed, 397 insertions(+), 346 deletions(-)

diff --git a/xen/arch/x86/numa.c b/xen/arch/x86/numa.c
index 6f4d438..bc787e0 100644
--- a/xen/arch/x86/numa.c
+++ b/xen/arch/x86/numa.c
@@ -25,27 +25,12 @@ custom_param("numa", numa_setup);
 #define Dprintk(x...)
 #endif
 
-/* from proto.h */
-#define round_up(x,y) ((((x)+(y))-1) & (~((y)-1)))
-
-struct node_data node_data[MAX_NUMNODES];
-
-/* Mapping from pdx to node id */
-int memnode_shift;
-static typeof(*memnodemap) _memnodemap[64];
-unsigned long memnodemapsize;
-u8 *memnodemap;
-
-nodeid_t cpu_to_node[NR_CPUS] __read_mostly = {
-    [0 ... NR_CPUS-1] = NUMA_NO_NODE
-};
 /*
  * Keep BIOS's CPU2node information, should not be used for memory allocaion
  */
 nodeid_t apicid_to_node[MAX_LOCAL_APIC] = {
     [0 ... MAX_LOCAL_APIC-1] = NUMA_NO_NODE
 };
-cpumask_t node_to_cpumask[MAX_NUMNODES] __read_mostly;
 
 nodemask_t __read_mostly node_online_map = { { [0] = 1UL } };
 
@@ -57,134 +42,6 @@ int srat_disabled(void)
     return numa_off || acpi_numa < 0;
 }
 
-/*
- * Given a shift value, try to populate memnodemap[]
- * Returns :
- * 1 if OK
- * 0 if memnodmap[] too small (of shift too small)
- * -1 if node overlap or lost ram (shift too big)
- */
-static int __init populate_memnodemap(const struct node *nodes,
-                                      int numnodes, int shift, nodeid_t *nodeids)
-{
-    unsigned long spdx, epdx;
-    int i, res = -1;
-
-    memset(memnodemap, NUMA_NO_NODE, memnodemapsize * sizeof(*memnodemap));
-    for ( i = 0; i < numnodes; i++ )
-    {
-        spdx = paddr_to_pdx(nodes[i].start);
-        epdx = paddr_to_pdx(nodes[i].end - 1) + 1;
-        if ( spdx >= epdx )
-            continue;
-        if ( (epdx >> shift) >= memnodemapsize )
-            return 0;
-        do {
-            if ( memnodemap[spdx >> shift] != NUMA_NO_NODE )
-                return -1;
-
-            if ( !nodeids )
-                memnodemap[spdx >> shift] = i;
-            else
-                memnodemap[spdx >> shift] = nodeids[i];
-
-            spdx += (1UL << shift);
-        } while ( spdx < epdx );
-        res = 1;
-    }
-
-    return res;
-}
-
-static int __init allocate_cachealigned_memnodemap(void)
-{
-    unsigned long size = PFN_UP(memnodemapsize * sizeof(*memnodemap));
-    unsigned long mfn = alloc_boot_pages(size, 1);
-
-    if ( !mfn )
-    {
-        printk(KERN_ERR
-               "NUMA: Unable to allocate Memory to Node hash map\n");
-        memnodemapsize = 0;
-        return -1;
-    }
-
-    memnodemap = mfn_to_virt(mfn);
-    mfn <<= PAGE_SHIFT;
-    size <<= PAGE_SHIFT;
-    printk(KERN_DEBUG "NUMA: Allocated memnodemap from %lx - %lx\n",
-           mfn, mfn + size);
-    memnodemapsize = size / sizeof(*memnodemap);
-
-    return 0;
-}
-
-/*
- * The LSB of all start and end addresses in the node map is the value of the
- * maximum possible shift.
- */
-static int __init extract_lsb_from_nodes(const struct node *nodes,
-                                         int numnodes)
-{
-    int i, nodes_used = 0;
-    unsigned long spdx, epdx;
-    unsigned long bitfield = 0, memtop = 0;
-
-    for ( i = 0; i < numnodes; i++ )
-    {
-        spdx = paddr_to_pdx(nodes[i].start);
-        epdx = paddr_to_pdx(nodes[i].end - 1) + 1;
-        if ( spdx >= epdx )
-            continue;
-        bitfield |= spdx;
-        nodes_used++;
-        if ( epdx > memtop )
-            memtop = epdx;
-    }
-    if ( nodes_used <= 1 )
-        i = BITS_PER_LONG - 1;
-    else
-        i = find_first_bit(&bitfield, sizeof(unsigned long)*8);
-    memnodemapsize = (memtop >> i) + 1;
-    return i;
-}
-
-int __init compute_hash_shift(struct node *nodes, int numnodes,
-                              nodeid_t *nodeids)
-{
-    int shift;
-
-    shift = extract_lsb_from_nodes(nodes, numnodes);
-    if ( memnodemapsize <= ARRAY_SIZE(_memnodemap) )
-        memnodemap = _memnodemap;
-    else if ( allocate_cachealigned_memnodemap() )
-        return -1;
-    printk(KERN_DEBUG "NUMA: Using %d for the hash shift.\n", shift);
-
-    if ( populate_memnodemap(nodes, numnodes, shift, nodeids) != 1 )
-    {
-        printk(KERN_INFO "Your memory is not aligned you need to "
-               "rebuild your hypervisor with a bigger NODEMAPSIZE "
-               "shift=%d\n", shift);
-        return -1;
-    }
-
-    return shift;
-}
-/* initialize NODE_DATA given nodeid and start/end */
-void __init setup_node_bootmem(nodeid_t nodeid, u64 start, u64 end)
-{ 
-    unsigned long start_pfn, end_pfn;
-
-    start_pfn = start >> PAGE_SHIFT;
-    end_pfn = end >> PAGE_SHIFT;
-
-    NODE_DATA(nodeid)->node_start_pfn = start_pfn;
-    NODE_DATA(nodeid)->node_spanned_pages = end_pfn - start_pfn;
-
-    node_set_online(nodeid);
-} 
-
 void __init numa_init_array(void)
 {
     int rr, i;
@@ -288,16 +145,6 @@ void __init numa_initmem_init(unsigned long start_pfn, unsigned long end_pfn)
                     (u64)end_pfn << PAGE_SHIFT);
 }
 
-void numa_add_cpu(int cpu)
-{
-    cpumask_set_cpu(cpu, &node_to_cpumask[cpu_to_node(cpu)]);
-} 
-
-void numa_set_node(int cpu, nodeid_t node)
-{
-    cpu_to_node[cpu] = node;
-}
-
 /* [numa=off] */
 static __init int numa_setup(char *opt) 
 { 
@@ -373,149 +220,3 @@ unsigned int __init arch_get_dma_bitsize(void)
                  flsl(node_start_pfn(node) + node_spanned_pages(node) / 4 - 1)
                  + PAGE_SHIFT, 32);
 }
-
-static void dump_numa(unsigned char key)
-{
-    s_time_t now = NOW();
-    unsigned int i, j, n;
-    int err;
-    struct domain *d;
-    struct page_info *page;
-    unsigned int page_num_node[MAX_NUMNODES];
-    const struct vnuma_info *vnuma;
-
-    printk("'%c' pressed -> dumping numa info (now-0x%X:%08X)\n", key,
-           (u32)(now>>32), (u32)now);
-
-    for_each_online_node ( i )
-    {
-        paddr_t pa = pfn_to_paddr(node_start_pfn(i) + 1);
-
-        printk("NODE%u start->%lu size->%lu free->%lu\n",
-               i, node_start_pfn(i), node_spanned_pages(i),
-               avail_node_heap_pages(i));
-        /* sanity check phys_to_nid() */
-        if ( phys_to_nid(pa) != i )
-            printk("phys_to_nid(%"PRIpaddr") -> %d should be %u\n",
-                   pa, phys_to_nid(pa), i);
-    }
-
-    j = cpumask_first(&cpu_online_map);
-    n = 0;
-    for_each_online_cpu ( i )
-    {
-        if ( i != j + n || cpu_to_node[j] != cpu_to_node[i] )
-        {
-            if ( n > 1 )
-                printk("CPU%u...%u -> NODE%d\n", j, j + n - 1, cpu_to_node[j]);
-            else
-                printk("CPU%u -> NODE%d\n", j, cpu_to_node[j]);
-            j = i;
-            n = 1;
-        }
-        else
-            ++n;
-    }
-    if ( n > 1 )
-        printk("CPU%u...%u -> NODE%d\n", j, j + n - 1, cpu_to_node[j]);
-    else
-        printk("CPU%u -> NODE%d\n", j, cpu_to_node[j]);
-
-    rcu_read_lock(&domlist_read_lock);
-
-    printk("Memory location of each domain:\n");
-    for_each_domain ( d )
-    {
-        process_pending_softirqs();
-
-        printk("Domain %u (total: %u):\n", d->domain_id, d->tot_pages);
-
-        for_each_online_node ( i )
-            page_num_node[i] = 0;
-
-        spin_lock(&d->page_alloc_lock);
-        page_list_for_each(page, &d->page_list)
-        {
-            i = phys_to_nid((paddr_t)page_to_mfn(page) << PAGE_SHIFT);
-            page_num_node[i]++;
-        }
-        spin_unlock(&d->page_alloc_lock);
-
-        for_each_online_node ( i )
-            printk("    Node %u: %u\n", i, page_num_node[i]);
-
-        if ( !read_trylock(&d->vnuma_rwlock) )
-            continue;
-
-        if ( !d->vnuma )
-        {
-            read_unlock(&d->vnuma_rwlock);
-            continue;
-        }
-
-        vnuma = d->vnuma;
-        printk("     %u vnodes, %u vcpus, guest physical layout:\n",
-               vnuma->nr_vnodes, d->max_vcpus);
-        for ( i = 0; i < vnuma->nr_vnodes; i++ )
-        {
-            unsigned int start_cpu = ~0U;
-
-            err = snprintf(keyhandler_scratch, 12, "%3u",
-                    vnuma->vnode_to_pnode[i]);
-            if ( err < 0 || vnuma->vnode_to_pnode[i] == NUMA_NO_NODE )
-                strlcpy(keyhandler_scratch, "???", sizeof(keyhandler_scratch));
-
-            printk("       %3u: pnode %s,", i, keyhandler_scratch);
-
-            printk(" vcpus ");
-
-            for ( j = 0; j < d->max_vcpus; j++ )
-            {
-                if ( !(j & 0x3f) )
-                    process_pending_softirqs();
-
-                if ( vnuma->vcpu_to_vnode[j] == i )
-                {
-                    if ( start_cpu == ~0U )
-                    {
-                        printk("%d", j);
-                        start_cpu = j;
-                    }
-                }
-                else if ( start_cpu != ~0U )
-                {
-                    if ( j - 1 != start_cpu )
-                        printk("-%d ", j - 1);
-                    else
-                        printk(" ");
-                    start_cpu = ~0U;
-                }
-            }
-
-            if ( start_cpu != ~0U  && start_cpu != j - 1 )
-                printk("-%d", j - 1);
-
-            printk("\n");
-
-            for ( j = 0; j < vnuma->nr_vmemranges; j++ )
-            {
-                if ( vnuma->vmemrange[j].nid == i )
-                    printk("           %016"PRIx64" - %016"PRIx64"\n",
-                           vnuma->vmemrange[j].start,
-                           vnuma->vmemrange[j].end);
-            }
-        }
-
-        read_unlock(&d->vnuma_rwlock);
-    }
-
-    rcu_read_unlock(&domlist_read_lock);
-}
-
-static __init int register_numa_trigger(void)
-{
-    register_keyhandler('u', dump_numa, "dump NUMA info", 1);
-    return 0;
-}
-__initcall(register_numa_trigger);
-
diff --git a/xen/common/Makefile b/xen/common/Makefile
index 0fed30b..c1bd2ff 100644
--- a/xen/common/Makefile
+++ b/xen/common/Makefile
@@ -63,6 +63,7 @@ obj-y += wait.o
 obj-bin-y += warning.init.o
 obj-$(CONFIG_XENOPROF) += xenoprof.o
 obj-y += xmalloc_tlsf.o
+obj-y += numa.o
 
 obj-bin-$(CONFIG_X86) += $(foreach n,decompress bunzip2 unxz unlzma unlzo unlz4 earlycpio,$(n).init.o)
 
diff --git a/xen/common/numa.c b/xen/common/numa.c
new file mode 100644
index 0000000..59dcb63
--- /dev/null
+++ b/xen/common/numa.c
@@ -0,0 +1,342 @@
+/*
+ * Common NUMA handling functions for x86 and arm.
+ * Original code extracted from arch/x86/numa.c
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; If not, see <http://www.gnu.org/licenses/>.
+ */
+
+
+#include <xen/mm.h>
+#include <xen/string.h>
+#include <xen/init.h>
+#include <xen/ctype.h>
+#include <xen/nodemask.h>
+#include <xen/numa.h>
+#include <xen/keyhandler.h>
+#include <xen/time.h>
+#include <xen/smp.h>
+#include <xen/pfn.h>
+#include <xen/sched.h>
+#include <xen/errno.h>
+#include <xen/softirq.h>
+#include <asm/setup.h>
+
+struct node_data node_data[MAX_NUMNODES];
+
+/* Mapping from pdx to node id */
+int memnode_shift;
+unsigned long memnodemapsize;
+u8 *memnodemap;
+typeof(*memnodemap) _memnodemap[64];
+
+nodeid_t cpu_to_node[NR_CPUS] __read_mostly = {
+    [0 ... NR_CPUS-1] = NUMA_NO_NODE
+};
+
+cpumask_t node_to_cpumask[MAX_NUMNODES] __read_mostly;
+
+/*
+ * Given a shift value, try to populate memnodemap[]
+ * Returns :
+ * 1 if OK
+ * 0 if memnodmap[] too small (of shift too small)
+ * -1 if node overlap or lost ram (shift too big)
+ */
+static int __init populate_memnodemap(const struct node *nodes,
+                                      int numnodes, int shift,
+                                      nodeid_t *nodeids)
+{
+    unsigned long spdx, epdx;
+    int i, res = -1;
+
+    memset(memnodemap, NUMA_NO_NODE, memnodemapsize * sizeof(*memnodemap));
+    for ( i = 0; i < numnodes; i++ )
+    {
+        spdx = paddr_to_pdx(nodes[i].start);
+        epdx = paddr_to_pdx(nodes[i].end - 1) + 1;
+        if ( spdx >= epdx )
+            continue;
+        if ( (epdx >> shift) >= memnodemapsize )
+            return 0;
+        do {
+            if ( memnodemap[spdx >> shift] != NUMA_NO_NODE )
+                return -1;
+
+            if ( !nodeids )
+                memnodemap[spdx >> shift] = i;
+            else
+                memnodemap[spdx >> shift] = nodeids[i];
+
+            spdx += (1UL << shift);
+        } while ( spdx < epdx );
+        res = 1;
+    }
+
+    return res;
+}
+
+static int __init allocate_cachealigned_memnodemap(void)
+{
+    unsigned long size = PFN_UP(memnodemapsize * sizeof(*memnodemap));
+    unsigned long mfn;
+
+    mfn = alloc_boot_pages(size, 1);
+    if ( !mfn )
+    {
+        printk(KERN_ERR
+               "NUMA: Unable to allocate Memory to Node hash map\n");
+        memnodemapsize = 0;
+        return -1;
+    }
+
+    memnodemap = mfn_to_virt(mfn);
+    mfn <<= PAGE_SHIFT;
+    size <<= PAGE_SHIFT;
+    printk(KERN_DEBUG "NUMA: Allocated memnodemap from %lx - %lx\n",
+           mfn, mfn + size);
+    memnodemapsize = size / sizeof(*memnodemap);
+
+    return 0;
+}
+
+/*
+ * The LSB of all start and end addresses in the node map is the value of the
+ * maximum possible shift.
+ */
+static int __init extract_lsb_from_nodes(const struct node *nodes,
+                                         int numnodes)
+{
+    int i, nodes_used = 0;
+    unsigned long spdx, epdx;
+    unsigned long bitfield = 0, memtop = 0;
+
+    for ( i = 0; i < numnodes; i++ )
+    {
+        spdx = paddr_to_pdx(nodes[i].start);
+        epdx = paddr_to_pdx(nodes[i].end - 1) + 1;
+        if ( spdx >= epdx )
+            continue;
+        bitfield |= spdx;
+        nodes_used++;
+        if ( epdx > memtop )
+            memtop = epdx;
+    }
+    if ( nodes_used <= 1 )
+        i = BITS_PER_LONG - 1;
+    else
+        i = find_first_bit(&bitfield, sizeof(unsigned long)*8);
+
+    memnodemapsize = (memtop >> i) + 1;
+
+    return i;
+}
+
+int __init compute_hash_shift(struct node *nodes, int numnodes,
+                              nodeid_t *nodeids)
+{
+    int shift;
+
+    shift = extract_lsb_from_nodes(nodes, numnodes);
+    if ( memnodemapsize <= ARRAY_SIZE(_memnodemap) )
+        memnodemap = _memnodemap;
+    else if ( allocate_cachealigned_memnodemap() )
+        return -1;
+    printk(KERN_DEBUG "NUMA: Using %d for the hash shift.\n", shift);
+
+    if ( populate_memnodemap(nodes, numnodes, shift, nodeids) != 1 )
+    {
+        printk(KERN_INFO "Your memory is not aligned you need to "
+               "rebuild your hypervisor with a bigger NODEMAPSIZE "
+               "shift=%d\n", shift);
+        return -1;
+    }
+
+    return shift;
+}
+
+/* initialize NODE_DATA given nodeid and start/end */
+void __init setup_node_bootmem(nodeid_t nodeid, u64 start, u64 end)
+{
+    unsigned long start_pfn, end_pfn;
+
+    start_pfn = start >> PAGE_SHIFT;
+    end_pfn = end >> PAGE_SHIFT;
+
+    NODE_DATA(nodeid)->node_id = nodeid;
+    NODE_DATA(nodeid)->node_start_pfn = start_pfn;
+    NODE_DATA(nodeid)->node_spanned_pages = end_pfn - start_pfn;
+
+    node_set_online(nodeid);
+}
+
+void numa_add_cpu(int cpu)
+{
+    cpumask_set_cpu(cpu, &node_to_cpumask[cpu_to_node(cpu)]);
+}
+
+void numa_set_node(int cpu, nodeid_t node)
+{
+    cpu_to_node[cpu] = node;
+}
+
+EXPORT_SYMBOL(node_to_cpumask);
+EXPORT_SYMBOL(memnode_shift);
+EXPORT_SYMBOL(memnodemap);
+EXPORT_SYMBOL(node_data);
+
+static void dump_numa(unsigned char key)
+{
+    s_time_t now = NOW();
+    unsigned int i, j, n;
+    int err;
+    struct domain *d;
+    struct page_info *page;
+    unsigned int page_num_node[MAX_NUMNODES] = {0};
+    const struct vnuma_info *vnuma;
+
+    printk("'%c' pressed -> dumping numa info (now-0x%X:%08X)\n", key,
+           (u32)(now>>32), (u32)now);
+
+    for_each_online_node ( i )
+    {
+        paddr_t pa = (paddr_t)(NODE_DATA(i)->node_start_pfn + 1)<< PAGE_SHIFT;
+        printk("idx%d -> NODE%d start->%lu size->%lu free->%lu\n",
+               i, NODE_DATA(i)->node_id,
+               NODE_DATA(i)->node_start_pfn,
+               NODE_DATA(i)->node_spanned_pages,
+               avail_node_heap_pages(i));
+        /* sanity check phys_to_nid() */
+        printk("phys_to_nid(%"PRIpaddr") -> %d should be %d\n", pa,
+               phys_to_nid(pa),
+               NODE_DATA(i)->node_id);
+    }
+
+    j = cpumask_first(&cpu_online_map);
+    n = 0;
+    for_each_online_cpu ( i )
+    {
+        if ( i != j + n || cpu_to_node[j] != cpu_to_node[i] )
+        {
+            if ( n > 1 )
+                printk("CPU%u...%u -> NODE%d\n", j, j + n - 1, cpu_to_node[j]);
+            else
+                printk("CPU%u -> NODE%d\n", j, cpu_to_node[j]);
+            j = i;
+            n = 1;
+        }
+        else
+            ++n;
+    }
+    if ( n > 1 )
+        printk("CPU%u...%u -> NODE%d\n", j, j + n - 1, cpu_to_node[j]);
+    else
+        printk("CPU%u -> NODE%d\n", j, cpu_to_node[j]);
+
+    rcu_read_lock(&domlist_read_lock);
+
+    printk("Memory location of each domain:\n");
+    for_each_domain ( d )
+    {
+        process_pending_softirqs();
+        printk("Domain %u (total: %u):\n", d->domain_id, d->tot_pages);
+
+        for_each_online_node ( i )
+            page_num_node[i] = 0;
+
+        spin_lock(&d->page_alloc_lock);
+        page_list_for_each(page, &d->page_list)
+        {
+            i = phys_to_nid((paddr_t)page_to_mfn(page) << PAGE_SHIFT);
+            page_num_node[i]++;
+        }
+        spin_unlock(&d->page_alloc_lock);
+
+        for_each_online_node ( i )
+            printk("    Node %u: %u\n", i, page_num_node[i]);
+
+        if ( !read_trylock(&d->vnuma_rwlock) )
+            continue;
+
+        if ( !d->vnuma )
+        {
+            read_unlock(&d->vnuma_rwlock);
+            continue;
+        }
+
+        vnuma = d->vnuma;
+        printk("     %u vnodes, %u vcpus, guest physical layout:\n",
+               vnuma->nr_vnodes, d->max_vcpus);
+        for ( i = 0; i < vnuma->nr_vnodes; i++ )
+        {
+            unsigned int start_cpu = ~0U;
+
+            err = snprintf(keyhandler_scratch, 12, "%3u",
+                    vnuma->vnode_to_pnode[i]);
+            if ( err < 0 || vnuma->vnode_to_pnode[i] == NUMA_NO_NODE )
+                strlcpy(keyhandler_scratch, "???", sizeof(keyhandler_scratch));
+
+            printk("       %3u: pnode %s,", i, keyhandler_scratch);
+
+            printk(" vcpus ");
+
+            for ( j = 0; j < d->max_vcpus; j++ )
+            {
+                if ( !(j & 0x3f) )
+                    process_pending_softirqs();
+
+                if ( vnuma->vcpu_to_vnode[j] == i )
+                {
+                    if ( start_cpu == ~0U )
+                    {
+                        printk("%d", j);
+                        start_cpu = j;
+                    }
+                }
+                else if ( start_cpu != ~0U )
+                {
+                    if ( j - 1 != start_cpu )
+                        printk("-%d ", j - 1);
+                    else
+                        printk(" ");
+                    start_cpu = ~0U;
+                }
+            }
+
+            if ( start_cpu != ~0U  && start_cpu != j - 1 )
+                printk("-%d", j - 1);
+
+            printk("\n");
+
+            for ( j = 0; j < vnuma->nr_vmemranges; j++ )
+            {
+                if ( vnuma->vmemrange[j].nid == i )
+                    printk("           %016"PRIx64" - %016"PRIx64"\n",
+                           vnuma->vmemrange[j].start,
+                           vnuma->vmemrange[j].end);
+            }
+        }
+
+        read_unlock(&d->vnuma_rwlock);
+    }
+
+    rcu_read_unlock(&domlist_read_lock);
+}
+
+static __init int register_numa_trigger(void)
+{
+    register_keyhandler('u', dump_numa, "dump NUMA info", 1);
+    return 0;
+}
+__initcall(register_numa_trigger);
+
diff --git a/xen/include/asm-x86/numa.h b/xen/include/asm-x86/numa.h
index 2479238..61bcd8e 100644
--- a/xen/include/asm-x86/numa.h
+++ b/xen/include/asm-x86/numa.h
@@ -17,64 +17,17 @@ extern cpumask_t     node_to_cpumask[];
 #define node_to_first_cpu(node)  (__ffs(node_to_cpumask[node]))
 #define node_to_cpumask(node)    (node_to_cpumask[node])
 
-struct node { 
-	u64 start,end; 
-};
-
-extern int compute_hash_shift(struct node *nodes, int numnodes,
-			      nodeid_t *nodeids);
 extern nodeid_t pxm_to_node(unsigned int pxm);
 
 #define ZONE_ALIGN (1UL << (MAX_ORDER+PAGE_SHIFT))
-#define VIRTUAL_BUG_ON(x) 
 
-extern void numa_add_cpu(int cpu);
 extern void numa_init_array(void);
 extern bool_t numa_off;
 
-
 extern int srat_disabled(void);
-extern void numa_set_node(int cpu, nodeid_t node);
-extern nodeid_t setup_node(unsigned int pxm);
 extern void srat_detect_node(int cpu);
 
-extern void setup_node_bootmem(nodeid_t nodeid, u64 start, u64 end);
 extern nodeid_t apicid_to_node[];
-extern void init_cpu_to_node(void);
-
-static inline void clear_node_cpumask(int cpu)
-{
-	cpumask_clear_cpu(cpu, &node_to_cpumask[cpu_to_node(cpu)]);
-}
-
-/* Simple perfect hash to map pdx to node numbers */
-extern int memnode_shift; 
-extern unsigned long memnodemapsize;
-extern u8 *memnodemap;
-
-struct node_data {
-    unsigned long node_start_pfn;
-    unsigned long node_spanned_pages;
-};
-
-extern struct node_data node_data[];
-
-static inline __attribute__((pure)) nodeid_t phys_to_nid(paddr_t addr)
-{ 
-	nodeid_t nid;
-	VIRTUAL_BUG_ON((paddr_to_pdx(addr) >> memnode_shift) >= memnodemapsize);
-	nid = memnodemap[paddr_to_pdx(addr) >> memnode_shift]; 
-	VIRTUAL_BUG_ON(nid >= MAX_NUMNODES || !node_data[nid]); 
-	return nid; 
-} 
-
-#define NODE_DATA(nid)		(&(node_data[nid]))
-
-#define node_start_pfn(nid)	(NODE_DATA(nid)->node_start_pfn)
-#define node_spanned_pages(nid)	(NODE_DATA(nid)->node_spanned_pages)
-#define node_end_pfn(nid)       (NODE_DATA(nid)->node_start_pfn + \
-				 NODE_DATA(nid)->node_spanned_pages)
-
 extern int valid_numa_range(u64 start, u64 end, nodeid_t node);
 
 void srat_parse_regions(u64 addr);
diff --git a/xen/include/xen/numa.h b/xen/include/xen/numa.h
index 7aef1a8..dd33c92 100644
--- a/xen/include/xen/numa.h
+++ b/xen/include/xen/numa.h
@@ -18,4 +18,58 @@
   (((d)->vcpu != NULL && (d)->vcpu[0] != NULL) \
    ? vcpu_to_node((d)->vcpu[0]) : NUMA_NO_NODE)
 
+struct node {
+	u64 start,end;
+};
+
+struct node_data {
+    unsigned long node_start_pfn;
+    unsigned long node_spanned_pages;
+    nodeid_t      node_id;
+};
+
+#define NODE_DATA(nid)		(&(node_data[nid]))
+#define VIRTUAL_BUG_ON(x)
+
+#ifdef CONFIG_NUMA
+extern void init_cpu_to_node(void);
+
+static inline void clear_node_cpumask(int cpu)
+{
+	cpumask_clear_cpu(cpu, &node_to_cpumask[cpu_to_node(cpu)]);
+}
+
+/* Simple perfect hash to map pdx to node numbers */
+extern int memnode_shift;
+extern unsigned long memnodemapsize;
+extern u8 *memnodemap;
+extern typeof(*memnodemap) _memnodemap[];
+
+extern struct node_data node_data[];
+
+static inline __attribute__((pure)) nodeid_t phys_to_nid(paddr_t addr)
+{
+	nodeid_t nid;
+	VIRTUAL_BUG_ON((paddr_to_pdx(addr) >> memnode_shift) >= memnodemapsize);
+	nid = memnodemap[paddr_to_pdx(addr) >> memnode_shift];
+	VIRTUAL_BUG_ON(nid >= MAX_NUMNODES || !node_data[nid]);
+	return nid;
+}
+
+#define node_start_pfn(nid)	(NODE_DATA(nid)->node_start_pfn)
+#define node_spanned_pages(nid)	(NODE_DATA(nid)->node_spanned_pages)
+#define node_end_pfn(nid)       (NODE_DATA(nid)->node_start_pfn + \
+				 NODE_DATA(nid)->node_spanned_pages)
+
+#else
+#define init_cpu_to_node() do {} while (0)
+#define clear_node_cpumask(cpu) do {} while (0)
+#endif /* CONFIG_NUMA */
+
+extern void numa_add_cpu(int cpu);
+extern nodeid_t setup_node(unsigned int pxm);
+extern void numa_set_node(int cpu, nodeid_t node);
+extern void setup_node_bootmem(nodeid_t nodeid, u64 start, u64 end);
+extern int compute_hash_shift(struct node *nodes, int numnodes,
+			      nodeid_t *nodeids);
 #endif /* _XEN_NUMA_H */
-- 
2.7.4
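
Note on the "simple perfect hash" used by phys_to_nid() above: the
lookup shifts a frame index right by memnode_shift and uses the result
as an index into memnodemap[]. The following is a minimal standalone
sketch of that idea, not the Xen code. The struct nodemap type, the
pfn_to_nid() name and the NO_NODE marker are hypothetical; Xen itself
indexes by pdx and uses VIRTUAL_BUG_ON() instead of returning a marker.

```c
#include <stddef.h>
#include <stdint.h>

#define NO_NODE 0xFF /* hypothetical "no node" marker for this sketch */

/* Hypothetical container for the shift, table size and table itself. */
struct nodemap {
    unsigned int shift;   /* log2 of the number of frames per map slot */
    size_t size;          /* number of slots in map[] */
    const uint8_t *map;   /* node id for each slot */
};

/* Map a frame number to a node id, or NO_NODE if it is out of range. */
static uint8_t pfn_to_nid(const struct nodemap *m, uint64_t pfn)
{
    uint64_t idx = pfn >> m->shift;

    return idx < m->size ? m->map[idx] : NO_NODE;
}
```

The lookup is a shift and one array read. The cost is paid at setup
time, where a shift must be found such that no slot spans two nodes.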


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


* [RFC PATCH v1 03/21] NUMA: Move arch specific NUMA code as common
  2017-02-09 15:56 [RFC PATCH v1 00/21] ARM: Add Xen NUMA support vijay.kilari
  2017-02-09 15:56 ` [RFC PATCH v1 01/21] ARM: NUMA: Add existing ARM numa code under CONFIG_NUMA vijay.kilari
  2017-02-09 15:56 ` [RFC PATCH v1 02/21] x86: NUMA: Refactor NUMA code vijay.kilari
@ 2017-02-09 15:56 ` vijay.kilari
  2017-02-09 16:15   ` Jan Beulich
  2017-02-20 12:47   ` Julien Grall
  2017-02-09 15:56 ` [RFC PATCH v1 04/21] NUMA: Refactor generic and arch specific code of numa_setup vijay.kilari
                   ` (19 subsequent siblings)
  22 siblings, 2 replies; 91+ messages in thread
From: vijay.kilari @ 2017-02-09 15:56 UTC (permalink / raw)
  To: julien.grall, sstabellini, andre.przywara, dario.faggioli
  Cc: xen-devel, Vijaya Kumar K

From: Vijaya Kumar K <Vijaya.Kumar@cavium.com>

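Move valid_numa_range(), conflicting_memblks() and cutoff_node(),
along with the memory block tables they use, from
xen/arch/x86/srat.c to xen/common/numa.c so that x86 and ARM
can share them.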

Signed-off-by: Vijaya Kumar K <Vijaya.Kumar@cavium.com>
---
 xen/arch/x86/srat.c        | 54 ++++-----------------------------------------
 xen/common/numa.c          | 55 ++++++++++++++++++++++++++++++++++++++++++++++
 xen/include/asm-x86/acpi.h |  1 -
 xen/include/asm-x86/numa.h |  1 -
 xen/include/xen/numa.h     |  5 ++++-
 5 files changed, 63 insertions(+), 53 deletions(-)
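
For readers unfamiliar with the helpers being moved, here is a minimal
standalone sketch of what conflicting_memblks() and cutoff_node() do:
the first finds a non-empty block that overlaps a half-open range, the
second clamps a node's range into a window. The struct range type and
the first_overlap() and clamp_range() names are hypothetical and exist
only in this sketch; ranges are assumed to be [start, end).

```c
#include <stdint.h>

/* Hypothetical stand-in for struct node: a half-open range [start, end). */
struct range {
    uint64_t start, end;
};

/* Index of the first non-empty block overlapping [start, end), or -1. */
static int first_overlap(const struct range *blks, int n,
                         uint64_t start, uint64_t end)
{
    for (int i = 0; i < n; i++) {
        if (blks[i].start == blks[i].end)
            continue; /* empty blocks never conflict */
        if (blks[i].end > start && blks[i].start < end)
            return i;
    }
    return -1;
}

/* Clamp r into [start, end); a range fully outside collapses to empty. */
static void clamp_range(struct range *r, uint64_t start, uint64_t end)
{
    if (r->start < start) {
        r->start = start;
        if (r->end < r->start)
            r->start = r->end;
    }
    if (r->end > end) {
        r->end = end;
        if (r->start > r->end)
            r->start = r->end;
    }
}
```

Two blocks that merely touch, such as [0, 0x1000) and [0x1000, 0x2000),
do not count as overlapping under this test.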

diff --git a/xen/arch/x86/srat.c b/xen/arch/x86/srat.c
index d86783e..58dee09 100644
--- a/xen/arch/x86/srat.c
+++ b/xen/arch/x86/srat.c
@@ -25,7 +25,7 @@ static struct acpi_table_slit *__read_mostly acpi_slit;
 
 static nodemask_t memory_nodes_parsed __initdata;
 static nodemask_t processor_nodes_parsed __initdata;
-static struct node nodes[MAX_NUMNODES] __initdata;
+extern struct node nodes[MAX_NUMNODES] __initdata;
 
 struct pxm2node {
 	unsigned pxm;
@@ -36,9 +36,9 @@ static struct pxm2node __read_mostly pxm2node[MAX_NUMNODES] =
 
 static unsigned node_to_pxm(nodeid_t n);
 
-static int num_node_memblks;
-static struct node node_memblk_range[NR_NODE_MEMBLKS];
-static nodeid_t memblk_nodeid[NR_NODE_MEMBLKS];
+extern int num_node_memblks;
+extern struct node node_memblk_range[NR_NODE_MEMBLKS];
+extern nodeid_t memblk_nodeid[NR_NODE_MEMBLKS];
 static __initdata DECLARE_BITMAP(memblk_hotplug, NR_NODE_MEMBLKS);
 
 static inline bool_t node_found(unsigned idx, unsigned pxm)
@@ -103,52 +103,6 @@ nodeid_t setup_node(unsigned pxm)
 	return node;
 }
 
-int valid_numa_range(u64 start, u64 end, nodeid_t node)
-{
-	int i;
-
-	for (i = 0; i < num_node_memblks; i++) {
-		struct node *nd = &node_memblk_range[i];
-
-		if (nd->start <= start && nd->end > end &&
-			memblk_nodeid[i] == node )
-			return 1;
-	}
-
-	return 0;
-}
-
-static __init int conflicting_memblks(u64 start, u64 end)
-{
-	int i;
-
-	for (i = 0; i < num_node_memblks; i++) {
-		struct node *nd = &node_memblk_range[i];
-		if (nd->start == nd->end)
-			continue;
-		if (nd->end > start && nd->start < end)
-			return i;
-		if (nd->end == end && nd->start == start)
-			return i;
-	}
-	return -1;
-}
-
-static __init void cutoff_node(int i, u64 start, u64 end)
-{
-	struct node *nd = &nodes[i];
-	if (nd->start < start) {
-		nd->start = start;
-		if (nd->end < nd->start)
-			nd->start = nd->end;
-	}
-	if (nd->end > end) {
-		nd->end = end;
-		if (nd->start > nd->end)
-			nd->start = nd->end;
-	}
-}
-
 static __init void bad_srat(void)
 {
 	int i;
diff --git a/xen/common/numa.c b/xen/common/numa.c
index 59dcb63..13f147c 100644
--- a/xen/common/numa.c
+++ b/xen/common/numa.c
@@ -46,6 +46,61 @@ nodeid_t cpu_to_node[NR_CPUS] __read_mostly = {
 
 cpumask_t node_to_cpumask[MAX_NUMNODES] __read_mostly;
 
+int num_node_memblks;
+struct node node_memblk_range[NR_NODE_MEMBLKS];
+nodeid_t memblk_nodeid[NR_NODE_MEMBLKS];
+struct node nodes[MAX_NUMNODES] __initdata;
+
+int valid_numa_range(u64 start, u64 end, nodeid_t node)
+{
+#ifdef CONFIG_NUMA
+    int i;
+
+    for (i = 0; i < num_node_memblks; i++) {
+        struct node *nd = &node_memblk_range[i];
+
+        if (nd->start <= start && nd->end > end &&
+            memblk_nodeid[i] == node )
+            return 1;
+    }
+
+    return 0;
+#else
+    return 1;
+#endif
+}
+
+__init int conflicting_memblks(u64 start, u64 end)
+{
+    int i;
+
+    for (i = 0; i < num_node_memblks; i++) {
+        struct node *nd = &node_memblk_range[i];
+        if (nd->start == nd->end)
+            continue;
+        if (nd->end > start && nd->start < end)
+            return i;
+        if (nd->end == end && nd->start == start)
+            return i;
+    }
+    return -1;
+}
+
+__init void cutoff_node(int i, u64 start, u64 end)
+{
+    struct node *nd = &nodes[i];
+    if (nd->start < start) {
+        nd->start = start;
+        if (nd->end < nd->start)
+            nd->start = nd->end;
+    }
+    if (nd->end > end) {
+        nd->end = end;
+        if (nd->start > nd->end)
+            nd->start = nd->end;
+    }
+}
+
 /*
  * Given a shift value, try to populate memnodemap[]
  * Returns :
diff --git a/xen/include/asm-x86/acpi.h b/xen/include/asm-x86/acpi.h
index d36bee9..f1a8e9d 100644
--- a/xen/include/asm-x86/acpi.h
+++ b/xen/include/asm-x86/acpi.h
@@ -106,7 +106,6 @@ extern void acpi_reserve_bootmem(void);
 
 extern s8 acpi_numa;
 extern int acpi_scan_nodes(u64 start, u64 end);
-#define NR_NODE_MEMBLKS (MAX_NUMNODES*2)
 
 #ifdef CONFIG_ACPI_SLEEP
 
diff --git a/xen/include/asm-x86/numa.h b/xen/include/asm-x86/numa.h
index 61bcd8e..df1f7d5 100644
--- a/xen/include/asm-x86/numa.h
+++ b/xen/include/asm-x86/numa.h
@@ -28,7 +28,6 @@ extern int srat_disabled(void);
 extern void srat_detect_node(int cpu);
 
 extern nodeid_t apicid_to_node[];
-extern int valid_numa_range(u64 start, u64 end, nodeid_t node);
 
 void srat_parse_regions(u64 addr);
 extern u8 __node_distance(nodeid_t a, nodeid_t b);
diff --git a/xen/include/xen/numa.h b/xen/include/xen/numa.h
index dd33c92..810f742 100644
--- a/xen/include/xen/numa.h
+++ b/xen/include/xen/numa.h
@@ -11,7 +11,7 @@
 #define NUMA_NO_DISTANCE 0xFF
 
 #define MAX_NUMNODES    (1 << NODES_SHIFT)
-
+#define NR_NODE_MEMBLKS (MAX_NUMNODES*2)
 #define vcpu_to_node(v) (cpu_to_node((v)->processor))
 
 #define domain_to_node(d) \
@@ -66,6 +66,9 @@ static inline __attribute__((pure)) nodeid_t phys_to_nid(paddr_t addr)
 #define clear_node_cpumask(cpu) do {} while (0)
 #endif /* CONFIG_NUMA */
 
+extern int valid_numa_range(u64 start, u64 end, nodeid_t node);
+extern int conflicting_memblks(u64 start, u64 end);
+extern void cutoff_node(int i, u64 start, u64 end);
 extern void numa_add_cpu(int cpu);
 extern nodeid_t setup_node(unsigned int pxm);
 extern void numa_set_node(int cpu, nodeid_t node);
-- 
2.7.4



* [RFC PATCH v1 04/21] NUMA: Refactor generic and arch specific code of numa_setup
  2017-02-09 15:56 [RFC PATCH v1 00/21] ARM: Add Xen NUMA support vijay.kilari
                   ` (2 preceding siblings ...)
  2017-02-09 15:56 ` [RFC PATCH v1 03/21] NUMA: Move arch specific NUMA code as common vijay.kilari
@ 2017-02-09 15:56 ` vijay.kilari
  2017-02-20 13:39   ` Julien Grall
  2017-02-09 15:56 ` [RFC PATCH v1 05/21] ARM: efi: Do not delete memory node from fdt vijay.kilari
                   ` (18 subsequent siblings)
  22 siblings, 1 reply; 91+ messages in thread
From: vijay.kilari @ 2017-02-09 15:56 UTC (permalink / raw)
  To: julien.grall, sstabellini, andre.przywara, dario.faggioli
  Cc: xen-devel, Vijaya Kumar K

From: Vijaya Kumar K <Vijaya.Kumar@cavium.com>

numa_setup() mixes generic and arch-specific code.
Keep the generic "numa=on|off" parsing in common code and move
the architecture-specific part into arch_numa_setup().

Signed-off-by: Vijaya Kumar K <Vijaya.Kumar@cavium.com>
---
 xen/arch/arm/Makefile      |  1 +
 xen/arch/arm/numa.c        | 28 ++++++++++++++++++++++++++++
 xen/arch/x86/numa.c        | 11 +----------
 xen/common/numa.c          | 14 ++++++++++++++
 xen/include/asm-arm/numa.h |  9 ++++++++-
 xen/include/asm-x86/numa.h |  2 +-
 xen/include/xen/numa.h     |  1 +
 7 files changed, 54 insertions(+), 12 deletions(-)
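
The split described above can be sketched outside of Xen as follows:
the generic handler interprets "on" and "off" and then always hands the
option string to an architecture hook. This is an illustration only.
The stub arch_numa_setup() below stands in for the per-arch function
and just returns 1, as the ARM version in this patch does.

```c
#include <stdbool.h>
#include <string.h>

static bool numa_off;

/* Stand-in for the per-architecture hook; a stub for this sketch only. */
static int arch_numa_setup(const char *opt)
{
    (void)opt;
    return 1;
}

/* Generic part: handle "on"/"off", then defer to the arch hook. */
static int numa_setup(const char *opt)
{
    if (!strncmp(opt, "off", 3))
        numa_off = true;
    if (!strncmp(opt, "on", 2))
        numa_off = false;

    return arch_numa_setup(opt);
}
```

Because the hook runs unconditionally, an architecture can still
recognise extra sub-options (x86 uses this for "fake=") without the
common code knowing about them.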

diff --git a/xen/arch/arm/Makefile b/xen/arch/arm/Makefile
index 7afb8a3..b5d7a19 100644
--- a/xen/arch/arm/Makefile
+++ b/xen/arch/arm/Makefile
@@ -49,6 +49,7 @@ obj-y += vm_event.o
 obj-y += vtimer.o
 obj-y += vpsci.o
 obj-y += vuart.o
+obj-$(CONFIG_NUMA) += numa.o
 
 #obj-bin-y += ....o
 
diff --git a/xen/arch/arm/numa.c b/xen/arch/arm/numa.c
new file mode 100644
index 0000000..59d09c7
--- /dev/null
+++ b/xen/arch/arm/numa.c
@@ -0,0 +1,28 @@
+/*
+ * ARM NUMA Implementation
+ *
+ * Copyright (C) 2016 - Cavium Inc.
+ * Vijaya Kumar K <vijaya.kumar@cavium.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ */
+
+#include <xen/init.h>
+#include <xen/ctype.h>
+#include <xen/mm.h>
+#include <xen/nodemask.h>
+#include <asm/mm.h>
+#include <xen/numa.h>
+
+int __init arch_numa_setup(char *opt)
+{
+    return 1;
+}
diff --git a/xen/arch/x86/numa.c b/xen/arch/x86/numa.c
index bc787e0..28d1891 100644
--- a/xen/arch/x86/numa.c
+++ b/xen/arch/x86/numa.c
@@ -18,9 +18,6 @@
 #include <xen/sched.h>
 #include <xen/softirq.h>
 
-static int numa_setup(char *s);
-custom_param("numa", numa_setup);
-
 #ifndef Dprintk
 #define Dprintk(x...)
 #endif
@@ -34,7 +31,6 @@ nodeid_t apicid_to_node[MAX_LOCAL_APIC] = {
 
 nodemask_t __read_mostly node_online_map = { { [0] = 1UL } };
 
-bool_t numa_off = 0;
 s8 acpi_numa = 0;
 
 int srat_disabled(void)
@@ -145,13 +141,8 @@ void __init numa_initmem_init(unsigned long start_pfn, unsigned long end_pfn)
                     (u64)end_pfn << PAGE_SHIFT);
 }
 
-/* [numa=off] */
-static __init int numa_setup(char *opt) 
+int __init arch_numa_setup(char *opt)
 { 
-    if ( !strncmp(opt,"off",3) )
-        numa_off = 1;
-    if ( !strncmp(opt,"on",2) )
-        numa_off = 0;
 #ifdef CONFIG_NUMA_EMU
     if ( !strncmp(opt, "fake=", 5) )
     {
diff --git a/xen/common/numa.c b/xen/common/numa.c
index 13f147c..9b9cf9c 100644
--- a/xen/common/numa.c
+++ b/xen/common/numa.c
@@ -32,6 +32,10 @@
 #include <xen/softirq.h>
 #include <asm/setup.h>
 
+static int numa_setup(char *s);
+custom_param("numa", numa_setup);
+
+bool_t numa_off = 0;
 struct node_data node_data[MAX_NUMNODES];
 
 /* Mapping from pdx to node id */
@@ -250,6 +254,16 @@ EXPORT_SYMBOL(memnode_shift);
 EXPORT_SYMBOL(memnodemap);
 EXPORT_SYMBOL(node_data);
 
+static __init int numa_setup(char *opt)
+{
+    if ( !strncmp(opt,"off",3) )
+        numa_off = 1;
+    if ( !strncmp(opt,"on",2) )
+        numa_off = 0;
+
+    return arch_numa_setup(opt);
+}
+
 static void dump_numa(unsigned char key)
 {
     s_time_t now = NOW();
diff --git a/xen/include/asm-arm/numa.h b/xen/include/asm-arm/numa.h
index a60c7eb..c1e8a7d 100644
--- a/xen/include/asm-arm/numa.h
+++ b/xen/include/asm-arm/numa.h
@@ -5,7 +5,14 @@ typedef u8 nodeid_t;
 
 #define NODES_SHIFT 2
 
-#ifndef CONFIG_NUMA
+#ifdef CONFIG_NUMA
+int arch_numa_setup(char *opt);
+#else
+static inline int arch_numa_setup(char *opt)
+{
+    return 1;
+}
+
 /* Fake one node for now. See also node_online_map. */
 #define cpu_to_node(cpu) 0
 #define node_to_cpumask(node)   (cpu_online_map)
diff --git a/xen/include/asm-x86/numa.h b/xen/include/asm-x86/numa.h
index df1f7d5..659ff6a 100644
--- a/xen/include/asm-x86/numa.h
+++ b/xen/include/asm-x86/numa.h
@@ -22,7 +22,6 @@ extern nodeid_t pxm_to_node(unsigned int pxm);
 #define ZONE_ALIGN (1UL << (MAX_ORDER+PAGE_SHIFT))
 
 extern void numa_init_array(void);
-extern bool_t numa_off;
 
 extern int srat_disabled(void);
 extern void srat_detect_node(int cpu);
@@ -32,5 +31,6 @@ extern nodeid_t apicid_to_node[];
 void srat_parse_regions(u64 addr);
 extern u8 __node_distance(nodeid_t a, nodeid_t b);
 unsigned int arch_get_dma_bitsize(void);
+int arch_numa_setup(char *opt);
 
 #endif
diff --git a/xen/include/xen/numa.h b/xen/include/xen/numa.h
index 810f742..77c5cfd 100644
--- a/xen/include/xen/numa.h
+++ b/xen/include/xen/numa.h
@@ -18,6 +18,7 @@
   (((d)->vcpu != NULL && (d)->vcpu[0] != NULL) \
    ? vcpu_to_node((d)->vcpu[0]) : NUMA_NO_NODE)
 
+extern bool_t numa_off;
 struct node {
 	u64 start,end;
 };
-- 
2.7.4



* [RFC PATCH v1 05/21] ARM: efi: Do not delete memory node from fdt
  2017-02-09 15:56 [RFC PATCH v1 00/21] ARM: Add Xen NUMA support vijay.kilari
                   ` (3 preceding siblings ...)
  2017-02-09 15:56 ` [RFC PATCH v1 04/21] NUMA: Refactor generic and arch specific code of numa_setup vijay.kilari
@ 2017-02-09 15:56 ` vijay.kilari
  2017-02-20 13:42   ` Julien Grall
  2017-02-09 15:56 ` [RFC PATCH v1 06/21] ARM: NUMA: Parse CPU NUMA information vijay.kilari
                   ` (17 subsequent siblings)
  22 siblings, 1 reply; 91+ messages in thread
From: vijay.kilari @ 2017-02-09 15:56 UTC (permalink / raw)
  To: julien.grall, sstabellini, andre.przywara, dario.faggioli
  Cc: xen-devel, Vijaya Kumar K

From: Vijaya Kumar K <Vijaya.Kumar@cavium.com>

When booting in UEFI mode, UEFI passes memory information
to Dom0 using the EFI memory descriptor table and deletes the
memory nodes from the host DT. However, to fetch the NUMA node
id of each memory region, the memory DT nodes must not be
deleted by the EFI stub.

With this patch, the memory nodes are no longer deleted from
the FDT. These memory nodes are later used by Xen to extract
NUMA node id information.

Also, parse the memory node only if bootinfo.mem is empty.

Signed-off-by: Vijaya Kumar K <Vijaya.Kumar@cavium.com>
---
 xen/arch/arm/bootfdt.c      |  9 +++++++--
 xen/arch/arm/efi/efi-boot.h | 25 -------------------------
 2 files changed, 7 insertions(+), 27 deletions(-)

diff --git a/xen/arch/arm/bootfdt.c b/xen/arch/arm/bootfdt.c
index cae6f83..979f675 100644
--- a/xen/arch/arm/bootfdt.c
+++ b/xen/arch/arm/bootfdt.c
@@ -285,8 +285,13 @@ static int __init early_scan_node(const void *fdt,
                                   u32 address_cells, u32 size_cells,
                                   void *data)
 {
-    if ( device_tree_node_matches(fdt, node, "memory") )
-        process_memory_node(fdt, node, name, address_cells, size_cells);
+    /*
+     * Parse memory node only if bootinfo.mem is empty.
+     */
+    if ( device_tree_node_matches(fdt, node, "memory") ) {
+        if ( bootinfo.mem.nr_banks == 0 )
+            process_memory_node(fdt, node, name, address_cells, size_cells);
+    }
     else if ( device_tree_node_compatible(fdt, node, "xen,multiboot-module" ) ||
               device_tree_node_compatible(fdt, node, "multiboot,module" ))
         process_multiboot_node(fdt, node, name, address_cells, size_cells);
diff --git a/xen/arch/arm/efi/efi-boot.h b/xen/arch/arm/efi/efi-boot.h
index 045d6ce..0b9c37f 100644
--- a/xen/arch/arm/efi/efi-boot.h
+++ b/xen/arch/arm/efi/efi-boot.h
@@ -192,33 +192,8 @@ EFI_STATUS __init fdt_add_uefi_nodes(EFI_SYSTEM_TABLE *sys_table,
     int status;
     u32 fdt_val32;
     u64 fdt_val64;
-    int prev;
     int num_rsv;
 
-    /*
-     * Delete any memory nodes present.  The EFI memory map is the only
-     * memory description provided to Xen.
-     */
-    prev = 0;
-    for (;;)
-    {
-        const char *type;
-        int len;
-
-        node = fdt_next_node(fdt, prev, NULL);
-        if ( node < 0 )
-            break;
-
-        type = fdt_getprop(fdt, node, "device_type", &len);
-        if ( type && strncmp(type, "memory", len) == 0 )
-        {
-            fdt_del_node(fdt, node);
-            continue;
-        }
-
-        prev = node;
-    }
-
    /*
     * Delete all memory reserve map entries. When booting via UEFI,
     * kernel will use the UEFI memory map to find reserved regions.
-- 
2.7.4



* [RFC PATCH v1 06/21] ARM: NUMA: Parse CPU NUMA information
  2017-02-09 15:56 [RFC PATCH v1 00/21] ARM: Add Xen NUMA support vijay.kilari
                   ` (4 preceding siblings ...)
  2017-02-09 15:56 ` [RFC PATCH v1 05/21] ARM: efi: Do not delete memory node from fdt vijay.kilari
@ 2017-02-09 15:56 ` vijay.kilari
  2017-02-20 17:32   ` Julien Grall
  2017-02-20 17:36   ` Julien Grall
  2017-02-09 15:56 ` [RFC PATCH v1 07/21] ARM: NUMA: Parse memory " vijay.kilari
                   ` (16 subsequent siblings)
  22 siblings, 2 replies; 91+ messages in thread
From: vijay.kilari @ 2017-02-09 15:56 UTC (permalink / raw)
  To: julien.grall, sstabellini, andre.przywara, dario.faggioli
  Cc: xen-devel, Vijaya Kumar K

From: Vijaya Kumar K <Vijaya.Kumar@cavium.com>

Parse the CPU nodes and fetch numa-node-id information.
For each node id found, set the corresponding bit in a
nodemask_t.

Call numa_init() from start_xen().

Signed-off-by: Vijaya Kumar K <Vijaya.Kumar@cavium.com>
---
 xen/arch/arm/Makefile         |  1 +
 xen/arch/arm/bootfdt.c        |  8 ++---
 xen/arch/arm/dt_numa.c        | 72 +++++++++++++++++++++++++++++++++++++++++++
 xen/arch/arm/numa.c           | 14 +++++++++
 xen/arch/arm/setup.c          |  3 ++
 xen/include/asm-arm/numa.h    | 14 +++++++++
 xen/include/xen/device_tree.h |  4 +++
 7 files changed, 112 insertions(+), 4 deletions(-)
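
A minimal standalone sketch of the CPU pass described above: each CPU
node contributes one node id, a missing property is represented by the
out-of-range default MAX_NUMNODES, and valid ids are collected in a
bitmask. The uint32_t nodemask and the record_cpu_node() name are
simplifications made for this sketch; Xen uses nodemask_t and
node_set().

```c
#include <stdint.h>

#define NODES_SHIFT  2
#define MAX_NUMNODES (1u << NODES_SHIFT)

/* One bit per node; a simplified stand-in for Xen's nodemask_t. */
typedef uint32_t nodemask_sketch_t;

/*
 * Record one CPU's node id.
 * Returns 0 if the id was recorded, -1 if it is missing or out of range.
 */
static int record_cpu_node(nodemask_sketch_t *parsed, uint32_t nid)
{
    if (nid >= MAX_NUMNODES)
        return -1;

    *parsed |= 1u << nid;
    return 0;
}
```

Using MAX_NUMNODES as the default value lets a single range check cover
both a missing numa-node-id property and an id that is too large.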

diff --git a/xen/arch/arm/Makefile b/xen/arch/arm/Makefile
index b5d7a19..7694485 100644
--- a/xen/arch/arm/Makefile
+++ b/xen/arch/arm/Makefile
@@ -50,6 +50,7 @@ obj-y += vtimer.o
 obj-y += vpsci.o
 obj-y += vuart.o
 obj-$(CONFIG_NUMA) += numa.o
+obj-$(CONFIG_NUMA) += dt_numa.o
 
 #obj-bin-y += ....o
 
diff --git a/xen/arch/arm/bootfdt.c b/xen/arch/arm/bootfdt.c
index 979f675..d1122d8 100644
--- a/xen/arch/arm/bootfdt.c
+++ b/xen/arch/arm/bootfdt.c
@@ -17,8 +17,8 @@
 #include <xsm/xsm.h>
 #include <asm/setup.h>
 
-static bool_t __init device_tree_node_matches(const void *fdt, int node,
-                                              const char *match)
+bool_t __init device_tree_node_matches(const void *fdt, int node,
+                                       const char *match)
 {
     const char *name;
     size_t match_len;
@@ -63,8 +63,8 @@ static void __init device_tree_get_reg(const __be32 **cell, u32 address_cells,
     *size = dt_next_cell(size_cells, cell);
 }
 
-static u32 __init device_tree_get_u32(const void *fdt, int node,
-                                      const char *prop_name, u32 dflt)
+u32 __init device_tree_get_u32(const void *fdt, int node,
+                               const char *prop_name, u32 dflt)
 {
     const struct fdt_property *prop;
 
diff --git a/xen/arch/arm/dt_numa.c b/xen/arch/arm/dt_numa.c
new file mode 100644
index 0000000..4b94c36
--- /dev/null
+++ b/xen/arch/arm/dt_numa.c
@@ -0,0 +1,72 @@
+/*
+ * OF NUMA Parsing support.
+ *
+ * Copyright (C) 2015 - 2016 Cavium Inc.
+ *
+ * Some code extracts are taken from linux drivers/of/of_numa.c
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <xen/config.h>
+#include <xen/device_tree.h>
+#include <xen/libfdt/libfdt.h>
+#include <xen/mm.h>
+#include <xen/nodemask.h>
+#include <asm/mm.h>
+#include <xen/numa.h>
+
+nodemask_t numa_nodes_parsed;
+
+/*
+ * Even though we connect cpus to numa domains later in SMP
+ * init, we need to know the node ids now for all cpus.
+*/
+static int __init dt_numa_process_cpu_node(const void *fdt, int node,
+                                           const char *name,
+                                           u32 address_cells, u32 size_cells)
+{
+    u32 nid;
+
+    nid = device_tree_get_u32(fdt, node, "numa-node-id", MAX_NUMNODES);
+
+    if ( nid >= MAX_NUMNODES )
+        printk(XENLOG_WARNING "NUMA: Node id %u exceeds maximum value\n", nid);
+    else
+        node_set(nid, numa_nodes_parsed);
+
+    return 0;
+}
+
+static int __init dt_numa_scan_cpu_node(const void *fdt, int node,
+                                        const char *name, int depth,
+                                        u32 address_cells, u32 size_cells,
+                                        void *data)
+
+{
+    if ( device_tree_node_matches(fdt, node, "cpu") )
+        return dt_numa_process_cpu_node(fdt, node, name, address_cells,
+                                        size_cells);
+
+    return 0;
+}
+
+int __init dt_numa_init(void)
+{
+    int ret;
+
+    nodes_clear(numa_nodes_parsed);
+    ret = device_tree_for_each_node((void *)device_tree_flattened,
+                                    dt_numa_scan_cpu_node, NULL);
+    return ret;
+}
diff --git a/xen/arch/arm/numa.c b/xen/arch/arm/numa.c
index 59d09c7..9a7f0bb 100644
--- a/xen/arch/arm/numa.c
+++ b/xen/arch/arm/numa.c
@@ -21,6 +21,20 @@
 #include <xen/nodemask.h>
 #include <asm/mm.h>
 #include <xen/numa.h>
+#include <asm/acpi.h>
+
+int __init numa_init(void)
+{
+    int ret = 0;
+
+    if ( numa_off )
+        goto no_numa;
+
+    ret = dt_numa_init();
+
+no_numa:
+    return ret;
+}
 
 int __init arch_numa_setup(char *opt)
 {
diff --git a/xen/arch/arm/setup.c b/xen/arch/arm/setup.c
index 049e449..b6618ca 100644
--- a/xen/arch/arm/setup.c
+++ b/xen/arch/arm/setup.c
@@ -39,6 +39,7 @@
 #include <xen/libfdt/libfdt.h>
 #include <xen/acpi.h>
 #include <asm/alternative.h>
+#include <xen/numa.h>
 #include <asm/page.h>
 #include <asm/current.h>
 #include <asm/setup.h>
@@ -753,6 +754,8 @@ void __init start_xen(unsigned long boot_phys_offset,
     /* Parse the ACPI tables for possible boot-time configuration */
     acpi_boot_table_init();
 
+    numa_init();
+
     end_boot_allocator();
 
     vm_init();
diff --git a/xen/include/asm-arm/numa.h b/xen/include/asm-arm/numa.h
index c1e8a7d..cdfeecd 100644
--- a/xen/include/asm-arm/numa.h
+++ b/xen/include/asm-arm/numa.h
@@ -1,18 +1,32 @@
 #ifndef __ARCH_ARM_NUMA_H
 #define __ARCH_ARM_NUMA_H
 
+#include <xen/errno.h>
+
 typedef u8 nodeid_t;
 
 #define NODES_SHIFT 2
 
 #ifdef CONFIG_NUMA
 int arch_numa_setup(char *opt);
+int __init numa_init(void);
+int __init dt_numa_init(void);
 #else
 static inline int arch_numa_setup(char *opt)
 {
     return 1;
 }
 
+static inline int __init numa_init(void)
+{
+    return 0;
+}
+
+static inline int __init dt_numa_init(void)
+{
+    return -EINVAL;
+}
+
 /* Fake one node for now. See also node_online_map. */
 #define cpu_to_node(cpu) 0
 #define node_to_cpumask(node)   (cpu_online_map)
diff --git a/xen/include/xen/device_tree.h b/xen/include/xen/device_tree.h
index 0aecbe0..de6b351 100644
--- a/xen/include/xen/device_tree.h
+++ b/xen/include/xen/device_tree.h
@@ -188,6 +188,10 @@ int device_tree_for_each_node(const void *fdt,
                                      device_tree_node_func func,
                                      void *data);
 
+bool_t device_tree_node_matches(const void *fdt, int node,
+                                const char *match);
+u32 device_tree_get_u32(const void *fdt, int node,
+                        const char *prop_name, u32 dflt);
 /**
  * dt_unflatten_host_device_tree - Unflatten the host device tree
  *
-- 
2.7.4



* [RFC PATCH v1 07/21] ARM: NUMA: Parse memory NUMA information
  2017-02-09 15:56 [RFC PATCH v1 00/21] ARM: Add Xen NUMA support vijay.kilari
                   ` (5 preceding siblings ...)
  2017-02-09 15:56 ` [RFC PATCH v1 06/21] ARM: NUMA: Parse CPU NUMA information vijay.kilari
@ 2017-02-09 15:56 ` vijay.kilari
  2017-02-20 18:05   ` Julien Grall
  2017-02-09 15:57 ` [RFC PATCH v1 08/21] ARM: NUMA: Parse NUMA distance information vijay.kilari
                   ` (15 subsequent siblings)
  22 siblings, 1 reply; 91+ messages in thread
From: vijay.kilari @ 2017-02-09 15:56 UTC (permalink / raw)
  To: julien.grall, sstabellini, andre.przywara, dario.faggioli
  Cc: xen-devel, Vijaya Kumar K

From: Vijaya Kumar K <Vijaya.Kumar@cavium.com>

Parse the memory nodes and fetch numa-node-id information.
Store each memory range in node_memblk_range[] along with
its node id.

Signed-off-by: Vijaya Kumar K <Vijaya.Kumar@cavium.com>
---
 xen/arch/arm/bootfdt.c        |  4 +--
 xen/arch/arm/dt_numa.c        | 84 ++++++++++++++++++++++++++++++++++++++++++-
 xen/common/numa.c             |  8 +++++
 xen/include/xen/device_tree.h |  3 ++
 xen/include/xen/numa.h        |  1 +
 5 files changed, 97 insertions(+), 3 deletions(-)
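
The "reg" handling in dt_numa_process_memory_node() can be sketched
outside of Xen as below: the property is a sequence of big-endian
32-bit cells, each bank is (address_cells + size_cells) cells long,
and every non-empty bank is appended to a bounded table. All names here
(struct memblk, next_cells(), add_reg_banks(), NR_MEMBLKS) are
hypothetical. The bounds check on the table is an addition in this
sketch and is not part of the patch.

```c
#include <stddef.h>
#include <stdint.h>

#define NR_MEMBLKS 8 /* hypothetical table size for this sketch */

struct memblk {
    uint64_t start, end; /* half-open range [start, end) */
    uint8_t nid;
};

/* Combine `cells` big-endian 32-bit cells into one value and advance *p. */
static uint64_t next_cells(const uint8_t **p, unsigned int cells)
{
    uint64_t v = 0;

    while (cells--) {
        uint32_t c = (uint32_t)(*p)[0] << 24 | (uint32_t)(*p)[1] << 16 |
                     (uint32_t)(*p)[2] << 8 | (uint32_t)(*p)[3];

        v = (v << 32) | c;
        *p += 4;
    }
    return v;
}

/* Append every non-empty bank in reg[] to tbl; returns new count or -1. */
static int add_reg_banks(struct memblk *tbl, int count, uint8_t nid,
                         const uint8_t *reg, size_t len,
                         unsigned int address_cells, unsigned int size_cells)
{
    size_t bank_bytes, banks;

    if (address_cells < 1 || size_cells < 1)
        return -1;

    bank_bytes = (address_cells + size_cells) * sizeof(uint32_t);
    banks = len / bank_bytes;

    for (size_t i = 0; i < banks; i++) {
        uint64_t start = next_cells(&reg, address_cells);
        uint64_t size = next_cells(&reg, size_cells);

        if (!size)
            continue;      /* skip empty banks */
        if (count >= NR_MEMBLKS)
            return -1;     /* table full */

        tbl[count].start = start;
        tbl[count].end = start + size;
        tbl[count].nid = nid;
        count++;
    }
    return count;
}
```

With two address cells and two size cells, a 32-byte "reg" property
describes two banks; a bank whose size is zero is skipped rather than
recorded.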

diff --git a/xen/arch/arm/bootfdt.c b/xen/arch/arm/bootfdt.c
index d1122d8..5e2df92 100644
--- a/xen/arch/arm/bootfdt.c
+++ b/xen/arch/arm/bootfdt.c
@@ -56,8 +56,8 @@ static bool_t __init device_tree_node_compatible(const void *fdt, int node,
     return 0;
 }
 
-static void __init device_tree_get_reg(const __be32 **cell, u32 address_cells,
-                                       u32 size_cells, u64 *start, u64 *size)
+void __init device_tree_get_reg(const __be32 **cell, u32 address_cells,
+                                u32 size_cells, u64 *start, u64 *size)
 {
     *start = dt_next_cell(address_cells, cell);
     *size = dt_next_cell(size_cells, cell);
diff --git a/xen/arch/arm/dt_numa.c b/xen/arch/arm/dt_numa.c
index 4b94c36..fce9e67 100644
--- a/xen/arch/arm/dt_numa.c
+++ b/xen/arch/arm/dt_numa.c
@@ -27,6 +27,7 @@
 #include <xen/numa.h>
 
 nodemask_t numa_nodes_parsed;
+extern struct node node_memblk_range[NR_NODE_MEMBLKS];
 
 /*
  * Even though we connect cpus to numa domains later in SMP
@@ -48,11 +49,73 @@ static int __init dt_numa_process_cpu_node(const void *fdt, int node,
     return 0;
 }
 
+static int __init dt_numa_process_memory_node(const void *fdt, int node,
+                                              const char *name,
+                                              u32 address_cells,
+                                              u32 size_cells)
+{
+    const struct fdt_property *prop;
+    int i, ret, banks;
+    const __be32 *cell;
+    paddr_t start, size;
+    u32 reg_cells = address_cells + size_cells;
+    u32 nid;
+
+    if ( address_cells < 1 || size_cells < 1 )
+    {
+        printk(XENLOG_WARNING
+               "fdt: node `%s': invalid #address-cells or #size-cells\n", name);
+        return -EINVAL;
+    }
+
+    nid = device_tree_get_u32(fdt, node, "numa-node-id", MAX_NUMNODES);
+    if ( nid >= MAX_NUMNODES) {
+        /*
+         * No node id found. Skip this memory node.
+         */
+        return 0;
+    }
+
+    prop = fdt_get_property(fdt, node, "reg", NULL);
+    if ( !prop )
+    {
+        printk(XENLOG_WARNING "fdt: node `%s': missing `reg' property\n",
+               name);
+        return -EINVAL;
+    }
+
+    cell = (const __be32 *)prop->data;
+    banks = fdt32_to_cpu(prop->len) / (reg_cells * sizeof (u32));
+
+    for ( i = 0; i < banks; i++ )
+    {
+        device_tree_get_reg(&cell, address_cells, size_cells, &start, &size);
+        if ( !size )
+            continue;
+
+        /* It is fine to add this area to the nodes data it will be used later*/
+        ret = conflicting_memblks(start, start + size);
+        if (ret < 0)
+             numa_add_memblk(nid, start, size);
+        else
+        {
+             printk(XENLOG_ERR
+                    "NUMA DT: node %u (%"PRIx64"-%"PRIx64") overlaps with ret %d (%"PRIx64"-%"PRIx64")\n",
+                    nid, start, start + size, ret,
+                    node_memblk_range[ret].start, node_memblk_range[ret].end);
+             return -EINVAL;
+        }
+    }
+
+    node_set(nid, numa_nodes_parsed);
+
+    return 0;
+}
+
 static int __init dt_numa_scan_cpu_node(const void *fdt, int node,
                                         const char *name, int depth,
                                         u32 address_cells, u32 size_cells,
                                         void *data)
-
 {
     if ( device_tree_node_matches(fdt, node, "cpu") )
         return dt_numa_process_cpu_node(fdt, node, name, address_cells,
@@ -61,6 +124,18 @@ static int __init dt_numa_scan_cpu_node(const void *fdt, int node,
     return 0;
 }
 
+static int __init dt_numa_scan_memory_node(const void *fdt, int node,
+                                           const char *name, int depth,
+                                           u32 address_cells, u32 size_cells,
+                                           void *data)
+{
+    if ( device_tree_node_matches(fdt, node, "memory") )
+        return dt_numa_process_memory_node(fdt, node, name, address_cells,
+                                           size_cells);
+
+    return 0;
+}
+
 int __init dt_numa_init(void)
 {
     int ret;
@@ -68,5 +143,12 @@ int __init dt_numa_init(void)
     nodes_clear(numa_nodes_parsed);
     ret = device_tree_for_each_node((void *)device_tree_flattened,
                                     dt_numa_scan_cpu_node, NULL);
+
+    if ( ret )
+        return ret;
+
+    ret = device_tree_for_each_node((void *)device_tree_flattened,
+                                    dt_numa_scan_memory_node, NULL);
+
     return ret;
 }
diff --git a/xen/common/numa.c b/xen/common/numa.c
index 9b9cf9c..62c76af 100644
--- a/xen/common/numa.c
+++ b/xen/common/numa.c
@@ -55,6 +55,14 @@ struct node node_memblk_range[NR_NODE_MEMBLKS];
 nodeid_t memblk_nodeid[NR_NODE_MEMBLKS];
 struct node nodes[MAX_NUMNODES] __initdata;
 
+void __init numa_add_memblk(nodeid_t nodeid, u64 start, u64 size)
+{
+    node_memblk_range[num_node_memblks].start = start;
+    node_memblk_range[num_node_memblks].end = start + size;
+    memblk_nodeid[num_node_memblks] = nodeid;
+    num_node_memblks++;
+}
+
 int valid_numa_range(u64 start, u64 end, nodeid_t node)
 {
 #ifdef CONFIG_NUMA
diff --git a/xen/include/xen/device_tree.h b/xen/include/xen/device_tree.h
index de6b351..d92e47e 100644
--- a/xen/include/xen/device_tree.h
+++ b/xen/include/xen/device_tree.h
@@ -192,6 +192,9 @@ bool_t device_tree_node_matches(const void *fdt, int node,
                                 const char *match);
 u32 device_tree_get_u32(const void *fdt, int node,
                         const char *prop_name, u32 dflt);
+void device_tree_get_reg(const __be32 **cell, u32 address_cells,
+                         u32 size_cells, u64 *start, u64 *size);
+
 /**
  * dt_unflatten_host_device_tree - Unflatten the host device tree
  *
diff --git a/xen/include/xen/numa.h b/xen/include/xen/numa.h
index 77c5cfd..9392a89 100644
--- a/xen/include/xen/numa.h
+++ b/xen/include/xen/numa.h
@@ -67,6 +67,7 @@ static inline __attribute__((pure)) nodeid_t phys_to_nid(paddr_t addr)
 #define clear_node_cpumask(cpu) do {} while (0)
 #endif /* CONFIG_NUMA */
 
+extern void numa_add_memblk(nodeid_t nodeid, u64 start, u64 size);
 extern int valid_numa_range(u64 start, u64 end, nodeid_t node);
 extern int conflicting_memblks(u64 start, u64 end);
 extern void cutoff_node(int i, u64 start, u64 end);
-- 
2.7.4


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


* [RFC PATCH v1 08/21] ARM: NUMA: Parse NUMA distance information
  2017-02-09 15:56 [RFC PATCH v1 00/21] ARM: Add Xen NUMA support vijay.kilari
                   ` (6 preceding siblings ...)
  2017-02-09 15:56 ` [RFC PATCH v1 07/21] ARM: NUMA: Parse memory " vijay.kilari
@ 2017-02-09 15:57 ` vijay.kilari
  2017-02-20 18:28   ` Julien Grall
  2017-02-09 15:57 ` [RFC PATCH v1 09/21] ARM: NUMA: Add CPU NUMA support vijay.kilari
                   ` (14 subsequent siblings)
  22 siblings, 1 reply; 91+ messages in thread
From: vijay.kilari @ 2017-02-09 15:57 UTC (permalink / raw)
  To: julien.grall, sstabellini, andre.przywara, dario.faggioli
  Cc: xen-devel, Vijaya Kumar K

From: Vijaya Kumar K <Vijaya.Kumar@cavium.com>

Parse distance-matrix and fetch node distance information.
Store distance information in node_distance[].
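
For illustration only (not part of this patch): a minimal standalone
sketch of how a distance-matrix property, laid out as big-endian
(node-a, node-b, distance) triplets, could be folded into a flat table.
All sketch_* names and SKETCH_MAX_NODES are hypothetical. Unlike the
patch, the sketch assumes an N x N table and rejects a property that is
not a whole number of triplets.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_NODES 4 /* hypothetical stand-in for MAX_NUMNODES */

/* Flat N x N table indexed as [from * N + to]; 0 means "not set". */
static uint8_t sketch_distance[SKETCH_MAX_NODES * SKETCH_MAX_NODES];

/* Read one big-endian 32-bit device-tree cell. */
static uint32_t sketch_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

static int sketch_set_distance(uint32_t a, uint32_t b, uint32_t d)
{
    if ( a >= SKETCH_MAX_NODES || b >= SKETCH_MAX_NODES || d > UINT8_MAX )
        return -1;

    sketch_distance[a * SKETCH_MAX_NODES + b] = (uint8_t)d;

    return 0;
}

/*
 * Parse len bytes of (node-a, node-b, distance) triplets.
 * Returns the number of triplets stored, or -1 on malformed input.
 */
static int sketch_parse_distance_matrix(const uint8_t *prop, size_t len)
{
    size_t i, cells;
    int count = 0;

    if ( len == 0 || len % (3 * sizeof(uint32_t)) != 0 )
        return -1;

    cells = len / sizeof(uint32_t);
    for ( i = 0; i < cells; i += 3 )
    {
        uint32_t a = sketch_be32(prop + 4 * i);
        uint32_t b = sketch_be32(prop + 4 * (i + 1));
        uint32_t d = sketch_be32(prop + 4 * (i + 2));

        if ( sketch_set_distance(a, b, d) )
            return -1;

        /* As in the patch: default B->A to the same value as A->B. */
        if ( b > a )
            sketch_set_distance(b, a, d);

        count++;
    }

    return count;
}
```

With the triplets (0,0,10), (0,1,20), (1,1,10) the sketch stores three
entries and mirrors the 0->1 distance into 1->0.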

Signed-off-by: Vijaya Kumar K <Vijaya.Kumar@cavium.com>
---
 xen/arch/arm/dt_numa.c     | 90 ++++++++++++++++++++++++++++++++++++++++++++++
 xen/arch/arm/numa.c        | 19 +++++++++-
 xen/include/asm-arm/numa.h |  1 +
 3 files changed, 109 insertions(+), 1 deletion(-)

diff --git a/xen/arch/arm/dt_numa.c b/xen/arch/arm/dt_numa.c
index fce9e67..8979612 100644
--- a/xen/arch/arm/dt_numa.c
+++ b/xen/arch/arm/dt_numa.c
@@ -28,6 +28,19 @@
 
 nodemask_t numa_nodes_parsed;
 extern struct node node_memblk_range[NR_NODE_MEMBLKS];
+extern int _node_distance[MAX_NUMNODES * 2];
+extern int *node_distance;
+
+static int numa_set_distance(u32 nodea, u32 nodeb, u32 distance)
+{
+    if ( nodea >= MAX_NUMNODES || nodeb >= MAX_NUMNODES )
+        return -EINVAL;
+
+    _node_distance[(nodea * MAX_NUMNODES) + nodeb] = distance;
+    node_distance = &_node_distance[0];
+
+    return 0;
+}
 
 /*
  * Even though we connect cpus to numa domains later in SMP
@@ -112,6 +125,66 @@ static int __init dt_numa_process_memory_node(const void *fdt, int node,
     return 0;
 }
 
+static int __init dt_numa_parse_distance_map(const void *fdt, int node,
+                                             const char *name,
+                                             u32 address_cells,
+                                             u32 size_cells)
+{
+    const struct fdt_property *prop;
+    const __be32 *matrix;
+    int entry_count, len, i;
+
+    printk(XENLOG_INFO "NUMA: parsing numa-distance-map\n");
+
+    prop = fdt_get_property(fdt, node, "distance-matrix", &len);
+    if ( !prop )
+    {
+        printk(XENLOG_WARNING
+               "NUMA: No distance-matrix property in distance-map\n");
+
+        return -EINVAL;
+    }
+
+    if ( len % sizeof(u32) != 0 )
+    {
+        printk(XENLOG_WARNING
+               "distance-matrix in node is not a multiple of u32\n");
+
+        return -EINVAL;
+    }
+
+    entry_count = len / sizeof(u32);
+    if ( entry_count <= 0 )
+    {
+        printk(XENLOG_WARNING "NUMA: Invalid distance-matrix\n");
+
+        return -EINVAL;
+    }
+
+    matrix = (const __be32 *)prop->data;
+    for ( i = 0; i + 2 < entry_count; i += 3 )
+    {
+        u32 nodea, nodeb, distance;
+
+        nodea = dt_read_number(matrix, 1);
+        matrix++;
+        nodeb = dt_read_number(matrix, 1);
+        matrix++;
+        distance = dt_read_number(matrix, 1);
+        matrix++;
+
+        numa_set_distance(nodea, nodeb, distance);
+        printk(XENLOG_INFO "NUMA:  distance[node%d -> node%d] = %d\n",
+               nodea, nodeb, distance);
+
+        /* Set default distance of node B->A same as A->B */
+        if ( nodeb > nodea )
+            numa_set_distance(nodeb, nodea, distance);
+    }
+
+    return 0;
+}
+
 static int __init dt_numa_scan_cpu_node(const void *fdt, int node,
                                         const char *name, int depth,
                                         u32 address_cells, u32 size_cells,
@@ -136,6 +209,18 @@ static int __init dt_numa_scan_memory_node(const void *fdt, int node,
     return 0;
 }
 
+static int __init dt_numa_scan_distance_node(const void *fdt, int node,
+                                             const char *name, int depth,
+                                             u32 address_cells, u32 size_cells,
+                                             void *data)
+{
+    if ( device_tree_node_matches(fdt, node, "distance-map") )
+        return dt_numa_parse_distance_map(fdt, node, name, address_cells,
+                                          size_cells);
+
+    return 0;
+}
+
 int __init dt_numa_init(void)
 {
     int ret;
@@ -149,6 +234,11 @@ int __init dt_numa_init(void)
 
     ret = device_tree_for_each_node((void *)device_tree_flattened,
                                     dt_numa_scan_memory_node, NULL);
+    if ( ret )
+        return ret;
+
+    ret = device_tree_for_each_node((void *)device_tree_flattened,
+                                    dt_numa_scan_distance_node, NULL);
 
     return ret;
 }
diff --git a/xen/arch/arm/numa.c b/xen/arch/arm/numa.c
index 9a7f0bb..11d100b 100644
--- a/xen/arch/arm/numa.c
+++ b/xen/arch/arm/numa.c
@@ -22,14 +22,31 @@
 #include <asm/mm.h>
 #include <xen/numa.h>
 #include <asm/acpi.h>
+#include <xen/errno.h>
+
+int _node_distance[MAX_NUMNODES * 2];
+int *node_distance;
+
+u8 __node_distance(nodeid_t a, nodeid_t b)
+{
+    if ( !node_distance )
+        return a == b ? 10 : 20;
+
+    return _node_distance[a * MAX_NUMNODES + b];
+}
+
+EXPORT_SYMBOL(__node_distance);
 
 int __init numa_init(void)
 {
-    int ret = 0;
+    int i, ret = 0;
 
     if ( numa_off )
         goto no_numa;
 
+    for ( i = 0; i < MAX_NUMNODES * 2; i++ )
+        _node_distance[i] = 0;
+
     ret = dt_numa_init();
 
 no_numa:
diff --git a/xen/include/asm-arm/numa.h b/xen/include/asm-arm/numa.h
index cdfeecd..b8857e2 100644
--- a/xen/include/asm-arm/numa.h
+++ b/xen/include/asm-arm/numa.h
@@ -11,6 +11,7 @@ typedef u8 nodeid_t;
 int arch_numa_setup(char *opt);
 int __init numa_init(void);
 int __init dt_numa_init(void);
+u8 __node_distance(nodeid_t a, nodeid_t b);
 #else
 static inline int arch_numa_setup(char *opt)
 {
-- 
2.7.4



* [RFC PATCH v1 09/21] ARM: NUMA: Add CPU NUMA support
  2017-02-09 15:56 [RFC PATCH v1 00/21] ARM: Add Xen NUMA support vijay.kilari
                   ` (7 preceding siblings ...)
  2017-02-09 15:57 ` [RFC PATCH v1 08/21] ARM: NUMA: Parse NUMA distance information vijay.kilari
@ 2017-02-09 15:57 ` vijay.kilari
  2017-02-20 18:32   ` Julien Grall
  2017-02-09 15:57 ` [RFC PATCH v1 10/21] ARM: NUMA: Add memory " vijay.kilari
                   ` (13 subsequent siblings)
  22 siblings, 1 reply; 91+ messages in thread
From: vijay.kilari @ 2017-02-09 15:57 UTC (permalink / raw)
  To: julien.grall, sstabellini, andre.przywara, dario.faggioli
  Cc: xen-devel, Vijaya Kumar K

From: Vijaya Kumar K <Vijaya.Kumar@cavium.com>

Initialize cpu_to_node[] to node 0 for every CPU early in boot.
Then, as each secondary CPU comes up, update its cpu_to_node[]
entry with the node id extracted from its MPIDR register.

Add macros to access cpu_to_node[] information.
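
For illustration only (not part of this patch): a minimal standalone
sketch of the MPIDR-to-node step. It assumes, as the patch's
"hwid >> 16 & 0xf" does, that the node id sits in MPIDR bits [19:16];
that placement is platform specific rather than architectural. The
sketch_* names, the bitmask and SKETCH_MAX_NODES are hypothetical.

```c
#include <stdint.h>

#define SKETCH_MAX_NODES 4 /* hypothetical stand-in for MAX_NUMNODES */

/* Bit n is set when node n was reported by firmware parsing. */
static unsigned int sketch_nodes_parsed;

/* Map a hardware id (MPIDR value) to a node id, falling back to node 0. */
static unsigned int sketch_hwid_to_node(uint64_t hwid)
{
    unsigned int node = (unsigned int)((hwid >> 16) & 0xf);

    /* Range-check first so the mask test never shifts out of range. */
    if ( node >= SKETCH_MAX_NODES || !(sketch_nodes_parsed & (1u << node)) )
        return 0;

    return node;
}
```

Checking the range before testing the mask keeps an unexpected
affinity value from indexing past the set of known nodes.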

Signed-off-by: Vijaya Kumar K <Vijaya.Kumar@cavium.com>
---
 xen/arch/arm/numa.c        | 25 +++++++++++++++++++++++++
 xen/arch/arm/setup.c       |  2 ++
 xen/arch/arm/smpboot.c     |  3 +++
 xen/include/asm-arm/numa.h | 15 +++++++++++++++
 4 files changed, 45 insertions(+)

diff --git a/xen/arch/arm/numa.c b/xen/arch/arm/numa.c
index 11d100b..d4dbad4 100644
--- a/xen/arch/arm/numa.c
+++ b/xen/arch/arm/numa.c
@@ -23,9 +23,23 @@
 #include <xen/numa.h>
 #include <asm/acpi.h>
 #include <xen/errno.h>
+#include <xen/cpumask.h>
 
 int _node_distance[MAX_NUMNODES * 2];
 int *node_distance;
+extern nodemask_t numa_nodes_parsed;
+
+void __init numa_set_cpu_node(int cpu, unsigned long hwid)
+{
+    unsigned node;
+
+    node = hwid >> 16 & 0xf;
+    if ( !node_isset(node, numa_nodes_parsed) || node == MAX_NUMNODES )
+        node = 0;
+
+    numa_set_node(cpu, node);
+    numa_add_cpu(cpu);
+}
 
 u8 __node_distance(nodeid_t a, nodeid_t b)
 {
@@ -37,6 +51,17 @@ u8 __node_distance(nodeid_t a, nodeid_t b)
 
 EXPORT_SYMBOL(__node_distance);
 
+/*
+ * Setup early cpu_to_node.
+ */
+void __init init_cpu_to_node(void)
+{
+    int i;
+
+    for ( i = 0; i < nr_cpu_ids; i++ )
+        numa_set_node(i, 0);
+}
+
 int __init numa_init(void)
 {
     int i, ret = 0;
diff --git a/xen/arch/arm/setup.c b/xen/arch/arm/setup.c
index b6618ca..5612ba6 100644
--- a/xen/arch/arm/setup.c
+++ b/xen/arch/arm/setup.c
@@ -749,6 +749,8 @@ void __init start_xen(unsigned long boot_phys_offset,
            xen_paddr, xen_paddr + xen_bootmodule->size);
     xen_bootmodule->start = xen_paddr;
 
+    init_cpu_to_node();
+
     setup_mm(fdt_paddr, fdt_size);
 
     /* Parse the ACPI tables for possible boot-time configuration */
diff --git a/xen/arch/arm/smpboot.c b/xen/arch/arm/smpboot.c
index 32e8722..3667d4b 100644
--- a/xen/arch/arm/smpboot.c
+++ b/xen/arch/arm/smpboot.c
@@ -30,6 +30,7 @@
 #include <xen/irq.h>
 #include <xen/console.h>
 #include <asm/cpuerrata.h>
+#include <xen/numa.h>
 #include <asm/gic.h>
 #include <asm/psci.h>
 #include <asm/acpi.h>
@@ -313,6 +314,8 @@ void start_secondary(unsigned long boot_phys_offset,
      */
     smp_wmb();
 
+    numa_set_cpu_node(cpuid, hwid);
+
     /* Now report this CPU is up */
     cpumask_set_cpu(cpuid, &cpu_online_map);
 
diff --git a/xen/include/asm-arm/numa.h b/xen/include/asm-arm/numa.h
index b8857e2..33a9e53 100644
--- a/xen/include/asm-arm/numa.h
+++ b/xen/include/asm-arm/numa.h
@@ -2,16 +2,26 @@
 #define __ARCH_ARM_NUMA_H
 
 #include <xen/errno.h>
+#include <xen/cpumask.h>
 
 typedef u8 nodeid_t;
 
 #define NODES_SHIFT 2
 
 #ifdef CONFIG_NUMA
+
+extern cpumask_t     node_to_cpumask[];
+extern nodeid_t      cpu_to_node[NR_CPUS];
+#define cpu_to_node(cpu)         (cpu_to_node[cpu])
+#define parent_node(node)        (node)
+#define node_to_first_cpu(node)  (__ffs(node_to_cpumask[node]))
+#define node_to_cpumask(node)    (node_to_cpumask[node])
+
 int arch_numa_setup(char *opt);
 int __init numa_init(void);
 int __init dt_numa_init(void);
 u8 __node_distance(nodeid_t a, nodeid_t b);
+void __init numa_set_cpu_node(int cpu, unsigned long hwid);
 #else
 static inline int arch_numa_setup(char *opt)
 {
@@ -28,6 +38,11 @@ static inline int __init dt_numa_init(void)
     return -EINVAL;
 }
 
+static inline void __init numa_set_cpu_node(int cpu, unsigned long hwid)
+{
+    return;
+}
+
 /* Fake one node for now. See also node_online_map. */
 #define cpu_to_node(cpu) 0
 #define node_to_cpumask(node)   (cpu_online_map)
-- 
2.7.4



* [RFC PATCH v1 10/21] ARM: NUMA: Add memory NUMA support
  2017-02-09 15:56 [RFC PATCH v1 00/21] ARM: Add Xen NUMA support vijay.kilari
                   ` (8 preceding siblings ...)
  2017-02-09 15:57 ` [RFC PATCH v1 09/21] ARM: NUMA: Add CPU NUMA support vijay.kilari
@ 2017-02-09 15:57 ` vijay.kilari
  2017-03-02 16:05   ` Julien Grall
  2017-02-09 15:57 ` [RFC PATCH v1 11/21] ARM: NUMA: Add fallback on NUMA failure vijay.kilari
                   ` (12 subsequent siblings)
  22 siblings, 1 reply; 91+ messages in thread
From: vijay.kilari @ 2017-02-09 15:57 UTC (permalink / raw)
  To: julien.grall, sstabellini, andre.przywara, dario.faggioli
  Cc: xen-devel, Vijaya Kumar K

From: Vijaya Kumar K <Vijaya.Kumar@cavium.com>

For each bank in bootinfo.mem, update nodes[] with the
corresponding nodeid, then register the nodes by calling
setup_node_bootmem().
Compute memnode_shift and initialize memnodemap[] so that the
nodeid for a given physical address can be looked up.
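
For illustration only (not part of this patch): a minimal standalone
sketch of the bank-to-node step. Each memory bank is matched against
the memory blocks recorded from firmware, and the owning node's
[start, end) span is widened to cover it. All sketch_* names and
limits are hypothetical.

```c
#include <stdint.h>

#define SKETCH_MAX_NODES 4 /* hypothetical stand-in for MAX_NUMNODES */
#define SKETCH_MAX_BLKS  8 /* hypothetical stand-in for NR_NODE_MEMBLKS */

struct sketch_range {
    uint64_t start, end; /* half-open: [start, end) */
};

static struct sketch_range sketch_blk[SKETCH_MAX_BLKS];
static int sketch_blk_node[SKETCH_MAX_BLKS];
static int sketch_nr_blks;

static struct sketch_range sketch_node[SKETCH_MAX_NODES];
static unsigned int sketch_node_seen; /* bit n set once node n has a span */

/* Record one firmware-described memory block and its node. */
static int sketch_add_memblk(int node, uint64_t start, uint64_t size)
{
    if ( node < 0 || node >= SKETCH_MAX_NODES ||
         sketch_nr_blks >= SKETCH_MAX_BLKS )
        return -1;

    sketch_blk[sketch_nr_blks].start = start;
    sketch_blk[sketch_nr_blks].end = start + size;
    sketch_blk_node[sketch_nr_blks] = node;
    sketch_nr_blks++;

    return 0;
}

/* Return the node whose block fully contains [start, end), or -1. */
static int sketch_get_node(uint64_t start, uint64_t end)
{
    int i;

    for ( i = 0; i < sketch_nr_blks; i++ )
        if ( start >= sketch_blk[i].start && end <= sketch_blk[i].end )
            return sketch_blk_node[i];

    return -1;
}

/* Fold one memory bank into its node's span; returns the node or -1. */
static int sketch_account_bank(uint64_t start, uint64_t size)
{
    uint64_t end = start + size;
    int node = sketch_get_node(start, end);

    if ( node < 0 )
        return -1;

    if ( !(sketch_node_seen & (1u << node)) )
    {
        sketch_node_seen |= 1u << node;
        sketch_node[node].start = start;
        sketch_node[node].end = end;
    }
    else
    {
        if ( start < sketch_node[node].start )
            sketch_node[node].start = start;
        if ( sketch_node[node].end < end )
            sketch_node[node].end = end;
    }

    return node;
}
```

Because only the lowest start and highest end are kept per node, any
hole between two banks of the same node ends up inside that node's
span.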

Signed-off-by: Vijaya Kumar K <Vijaya.Kumar@cavium.com>
---
 xen/arch/arm/numa.c    | 90 ++++++++++++++++++++++++++++++++++++++++++++++++++
 xen/common/numa.c      | 14 ++++++++
 xen/include/xen/numa.h |  1 +
 3 files changed, 105 insertions(+)

diff --git a/xen/arch/arm/numa.c b/xen/arch/arm/numa.c
index d4dbad4..aa34c82 100644
--- a/xen/arch/arm/numa.c
+++ b/xen/arch/arm/numa.c
@@ -24,10 +24,15 @@
 #include <asm/acpi.h>
 #include <xen/errno.h>
 #include <xen/cpumask.h>
+#include <asm/setup.h>
 
 int _node_distance[MAX_NUMNODES * 2];
 int *node_distance;
 extern nodemask_t numa_nodes_parsed;
+extern struct node nodes[MAX_NUMNODES] __initdata;
+extern int num_node_memblks;
+extern struct node node_memblk_range[NR_NODE_MEMBLKS];
+extern nodeid_t memblk_nodeid[NR_NODE_MEMBLKS];
 
 void __init numa_set_cpu_node(int cpu, unsigned long hwid)
 {
@@ -51,6 +56,88 @@ u8 __node_distance(nodeid_t a, nodeid_t b)
 
 EXPORT_SYMBOL(__node_distance);
 
+static int __init numa_mem_init(void)
+{
+    nodemask_t memory_nodes_parsed;
+    int bank, nodeid;
+    struct node *nd;
+    paddr_t start, size, end;
+
+    nodes_clear(memory_nodes_parsed);
+    for ( bank = 0 ; bank < bootinfo.mem.nr_banks; bank++ )
+    {
+        start = bootinfo.mem.bank[bank].start;
+        size = bootinfo.mem.bank[bank].size;
+        end = start + size;
+
+        nodeid = get_numa_node(start, end);
+        if ( nodeid == -EINVAL || nodeid >= MAX_NUMNODES )
+        {
+            printk(XENLOG_WARNING
+                   "NUMA: node for mem bank start 0x%lx - 0x%lx not found\n",
+                   start, end);
+
+            return -EINVAL;
+        }
+
+        nd = &nodes[nodeid];
+        if ( !node_test_and_set(nodeid, memory_nodes_parsed) )
+        {
+            nd->start = start;
+            nd->end = end;
+        }
+        else
+        {
+            if ( start < nd->start )
+                nd->start = start;
+            if ( nd->end < end )
+                nd->end = end;
+        }
+    }
+
+    return 0;
+}
+
+/* Use the information discovered above to actually set up the nodes. */
+static int __init numa_scan_mem_nodes(void)
+{
+    int i;
+
+    memnode_shift = compute_hash_shift(node_memblk_range, num_node_memblks,
+                                       memblk_nodeid);
+    if ( memnode_shift < 0 )
+    {
+        printk(XENLOG_WARNING "No NUMA hash found.\n");
+        memnode_shift = 0;
+    }
+
+    for_each_node_mask(i, numa_nodes_parsed)
+    {
+        u64 size = node_memblk_range[i].end - node_memblk_range[i].start;
+
+        if ( size == 0 )
+            printk(XENLOG_WARNING "NUMA: Node %u has no memory. \n", i);
+
+        printk(XENLOG_INFO
+               "NUMA: NODE[%d]: Start 0x%lx End 0x%lx\n",
+               i, nodes[i].start, nodes[i].end);
+        setup_node_bootmem(i, nodes[i].start, nodes[i].end);
+    }
+
+    return 0;
+}
+
+static int __init numa_initmem_init(void)
+{
+    if ( !numa_mem_init() )
+    {
+        if ( !numa_scan_mem_nodes() )
+            return 0;
+    }
+
+    return -EINVAL;
+}
+
 /*
  * Setup early cpu_to_node.
  */
@@ -74,6 +161,9 @@ int __init numa_init(void)
 
     ret = dt_numa_init();
 
+    if ( !ret )
+        ret = numa_initmem_init();
+
 no_numa:
     return ret;
 }
diff --git a/xen/common/numa.c b/xen/common/numa.c
index 62c76af..2f5266a 100644
--- a/xen/common/numa.c
+++ b/xen/common/numa.c
@@ -63,6 +63,20 @@ void __init numa_add_memblk(nodeid_t nodeid, u64 start, u64 size)
     num_node_memblks++;
 }
 
+int __init get_numa_node(u64 start, u64 end)
+{
+    int i;
+
+    for ( i = 0; i < num_node_memblks; i++ )
+    {
+        if ( start >= node_memblk_range[i].start &&
+             end <= node_memblk_range[i].end )
+            return memblk_nodeid[i];
+    }
+
+    return -EINVAL;
+}
+
 int valid_numa_range(u64 start, u64 end, nodeid_t node)
 {
 #ifdef CONFIG_NUMA
diff --git a/xen/include/xen/numa.h b/xen/include/xen/numa.h
index 9392a89..4f04ab4 100644
--- a/xen/include/xen/numa.h
+++ b/xen/include/xen/numa.h
@@ -68,6 +68,7 @@ static inline __attribute__((pure)) nodeid_t phys_to_nid(paddr_t addr)
 #endif /* CONFIG_NUMA */
 
 extern void numa_add_memblk(nodeid_t nodeid, u64 start, u64 size);
+extern int get_numa_node(u64 start, u64 end);
 extern int valid_numa_range(u64 start, u64 end, nodeid_t node);
 extern int conflicting_memblks(u64 start, u64 end);
 extern void cutoff_node(int i, u64 start, u64 end);
-- 
2.7.4



* [RFC PATCH v1 11/21] ARM: NUMA: Add fallback on NUMA failure
  2017-02-09 15:56 [RFC PATCH v1 00/21] ARM: Add Xen NUMA support vijay.kilari
                   ` (9 preceding siblings ...)
  2017-02-09 15:57 ` [RFC PATCH v1 10/21] ARM: NUMA: Add memory " vijay.kilari
@ 2017-02-09 15:57 ` vijay.kilari
  2017-03-02 16:09   ` Julien Grall
  2017-02-09 15:57 ` [RFC PATCH v1 12/21] ARM: NUMA: Do not expose numa info to DOM0 vijay.kilari
                   ` (11 subsequent siblings)
  22 siblings, 1 reply; 91+ messages in thread
From: vijay.kilari @ 2017-02-09 15:57 UTC (permalink / raw)
  To: julien.grall, sstabellini, andre.przywara, dario.faggioli
  Cc: xen-devel, Vijaya Kumar K

From: Vijaya Kumar K <Vijaya.Kumar@cavium.com>

On NUMA initialization failure, reset all the
NUMA structures to emulate a single node.
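
For illustration only (not part of this patch): a minimal standalone
sketch of how the span of the single fake node could be derived from
the memory banks. The start is rounded up and the end rounded down to
a page boundary. The sketch_* names and the 4K page size are
assumptions.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12 /* assumed 4K pages */

struct sketch_bank {
    uint64_t start, size;
};

/*
 * Compute the page-frame span [*start_pfn, *end_pfn) covering all banks.
 * Returns 0 on success, -1 if there are no banks.
 */
static int sketch_ram_span(const struct sketch_bank *banks, size_t nr,
                           uint64_t *start_pfn, uint64_t *end_pfn)
{
    uint64_t lo = UINT64_MAX, hi = 0;
    const uint64_t page_mask = ((uint64_t)1 << SKETCH_PAGE_SHIFT) - 1;
    size_t i;

    if ( nr == 0 )
        return -1;

    for ( i = 0; i < nr; i++ )
    {
        uint64_t bank_end = banks[i].start + banks[i].size;

        if ( banks[i].start < lo )
            lo = banks[i].start;
        if ( bank_end > hi )
            hi = bank_end;
    }

    *start_pfn = (lo + page_mask) >> SKETCH_PAGE_SHIFT; /* round up */
    *end_pfn = hi >> SKETCH_PAGE_SHIFT;                 /* round down */

    return 0;
}
```

Everything between the lowest and highest bank address, holes
included, is then treated as belonging to node 0.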

Signed-off-by: Vijaya Kumar <Vijaya.Kumar@cavium.com>
---
 xen/arch/arm/numa.c | 50 ++++++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 48 insertions(+), 2 deletions(-)

diff --git a/xen/arch/arm/numa.c b/xen/arch/arm/numa.c
index aa34c82..31dc552 100644
--- a/xen/arch/arm/numa.c
+++ b/xen/arch/arm/numa.c
@@ -19,6 +19,7 @@
 #include <xen/ctype.h>
 #include <xen/mm.h>
 #include <xen/nodemask.h>
+#include <xen/pfn.h>
 #include <asm/mm.h>
 #include <xen/numa.h>
 #include <asm/acpi.h>
@@ -127,6 +128,29 @@ static int __init numa_scan_mem_nodes(void)
     return 0;
 }
 
+static void __init numa_dummy_init(unsigned long start_pfn,
+                                   unsigned long end_pfn)
+{
+    int i;
+
+    nodes_clear(numa_nodes_parsed);
+    memnode_shift = BITS_PER_LONG - 1;
+    memnodemap = _memnodemap;
+    nodes_clear(node_online_map);
+    node_set_online(0);
+
+    for ( i = 0; i < NR_CPUS; i++ )
+        numa_set_node(i, 0);
+
+    node_distance = NULL;
+    for ( i = 0; i < MAX_NUMNODES * 2; i++ )
+        _node_distance[i] = 0;
+
+    cpumask_copy(&node_to_cpumask[0], cpumask_of(0));
+    setup_node_bootmem(0, (u64)start_pfn << PAGE_SHIFT,
+                       (u64)end_pfn << PAGE_SHIFT);
+}
+
 static int __init numa_initmem_init(void)
 {
     if ( !numa_mem_init() )
@@ -151,7 +175,9 @@ void __init init_cpu_to_node(void)
 
 int __init numa_init(void)
 {
-    int i, ret = 0;
+    int i, bank, ret = 0;
+    paddr_t ram_start = ~0;
+    paddr_t ram_end = 0;
 
     if ( numa_off )
         goto no_numa;
@@ -164,8 +190,28 @@ int __init numa_init(void)
     if ( !ret )
         ret = numa_initmem_init();
 
+    if ( !ret )
+        return 0;
+
 no_numa:
-    return ret;
+    for ( bank = 0 ; bank < bootinfo.mem.nr_banks; bank++ )
+    {
+        paddr_t bank_start = bootinfo.mem.bank[bank].start;
+        paddr_t bank_end = bank_start + bootinfo.mem.bank[bank].size;
+
+        ram_start = min(ram_start, bank_start);
+        ram_end = max(ram_end, bank_end);
+    }
+
+    printk(XENLOG_INFO "%s\n",
+           numa_off ? "NUMA turned off" : "No NUMA configuration found");
+
+    printk(XENLOG_INFO "Faking a node at %016"PRIx64"-%016"PRIx64"\n",
+           (u64)ram_start, (u64)ram_end);
+
+    numa_dummy_init(PFN_UP(ram_start), PFN_DOWN(ram_end));
+
+    return 0;
 }
 
 int __init arch_numa_setup(char *opt)
-- 
2.7.4



* [RFC PATCH v1 12/21] ARM: NUMA: Do not expose numa info to DOM0
  2017-02-09 15:56 [RFC PATCH v1 00/21] ARM: Add Xen NUMA support vijay.kilari
                   ` (10 preceding siblings ...)
  2017-02-09 15:57 ` [RFC PATCH v1 11/21] ARM: NUMA: Add fallback on NUMA failure vijay.kilari
@ 2017-02-09 15:57 ` vijay.kilari
  2017-02-20 18:36   ` Julien Grall
  2017-02-09 15:57 ` [RFC PATCH v1 13/21] ACPI: Refactor acpi SRAT and SLIT table handling code vijay.kilari
                   ` (10 subsequent siblings)
  22 siblings, 1 reply; 91+ messages in thread
From: vijay.kilari @ 2017-02-09 15:57 UTC (permalink / raw)
  To: julien.grall, sstabellini, andre.przywara, dario.faggioli
  Cc: xen-devel, Vijaya Kumar K

From: Vijaya Kumar K <Vijaya.Kumar@cavium.com>

Delete the numa-node-id property and the distance-map node from
the Dom0 DT so that NUMA information is not exposed to Dom0.

In particular, this lets Dom0 bring up devices attached to
node 1 as if they were on node 0.

However, this approach is limited in cases where memory for a
device should be allocated from its local node.

Signed-off-by: Vijaya Kumar <Vijaya.Kumar@cavium.com>
---
 xen/arch/arm/domain_build.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
index c97a1f5..5e89eaa 100644
--- a/xen/arch/arm/domain_build.c
+++ b/xen/arch/arm/domain_build.c
@@ -424,6 +424,10 @@ static int write_properties(struct domain *d, struct kernel_info *kinfo,
             }
         }
 
+        /* Don't expose the property "numa-node-id" to the guest */
+        if ( dt_property_name_is_equal(prop, "numa-node-id") )
+            continue;
+
         /* Don't expose the property "xen,passthrough" to the guest */
         if ( dt_property_name_is_equal(prop, "xen,passthrough") )
             continue;
@@ -1176,6 +1180,11 @@ static int handle_node(struct domain *d, struct kernel_info *kinfo,
         DT_MATCH_TYPE("memory"),
         /* The memory mapped timer is not supported by Xen. */
         DT_MATCH_COMPATIBLE("arm,armv7-timer-mem"),
+        /*
+         * NUMA info is not exposed to Dom0.
+         * So, skip distance-map information
+         */
+        DT_MATCH_COMPATIBLE("numa-distance-map-v1"),
         { /* sentinel */ },
     };
     static const struct dt_device_match timer_matches[] __initconst =
-- 
2.7.4



* [RFC PATCH v1 13/21] ACPI: Refactor acpi SRAT and SLIT table handling code
  2017-02-09 15:56 [RFC PATCH v1 00/21] ARM: Add Xen NUMA support vijay.kilari
                   ` (11 preceding siblings ...)
  2017-02-09 15:57 ` [RFC PATCH v1 12/21] ARM: NUMA: Do not expose numa info to DOM0 vijay.kilari
@ 2017-02-09 15:57 ` vijay.kilari
  2017-03-02 15:30   ` Julien Grall
  2017-02-09 15:57 ` [RFC PATCH v1 14/21] ACPI: Move srat_disabled to common code vijay.kilari
                   ` (9 subsequent siblings)
  22 siblings, 1 reply; 91+ messages in thread
From: vijay.kilari @ 2017-02-09 15:57 UTC (permalink / raw)
  To: julien.grall, sstabellini, andre.przywara, dario.faggioli
  Cc: xen-devel, Vijaya Kumar K

From: Vijaya Kumar K <Vijaya.Kumar@cavium.com>

Move the SRAT handling code that is common across
architectures from xen/arch/x86/srat.c to a new file,
xen/common/srat.c. Introduce a new header file, srat.h.

Signed-off-by: Vijaya Kumar <Vijaya.Kumar@cavium.com>
---
 xen/arch/x86/domain_build.c         |   1 +
 xen/arch/x86/numa.c                 |   1 +
 xen/arch/x86/physdev.c              |   1 +
 xen/arch/x86/setup.c                |   1 +
 xen/arch/x86/smpboot.c              |   1 +
 xen/arch/x86/srat.c                 | 129 +------------------------------
 xen/arch/x86/x86_64/mm.c            |   1 +
 xen/common/Makefile                 |   1 +
 xen/common/srat.c                   | 150 ++++++++++++++++++++++++++++++++++++
 xen/drivers/passthrough/vtd/iommu.c |   1 +
 xen/include/asm-x86/numa.h          |   2 -
 xen/include/xen/numa.h              |   1 -
 xen/include/xen/srat.h              |  13 ++++
 13 files changed, 173 insertions(+), 130 deletions(-)

diff --git a/xen/arch/x86/domain_build.c b/xen/arch/x86/domain_build.c
index 243df96..4d7795b 100644
--- a/xen/arch/x86/domain_build.c
+++ b/xen/arch/x86/domain_build.c
@@ -22,6 +22,7 @@
 #include <xen/compat.h>
 #include <xen/libelf.h>
 #include <xen/pfn.h>
+#include <xen/srat.h>
 #include <asm/regs.h>
 #include <asm/system.h>
 #include <asm/io.h>
diff --git a/xen/arch/x86/numa.c b/xen/arch/x86/numa.c
index 28d1891..58de324 100644
--- a/xen/arch/x86/numa.c
+++ b/xen/arch/x86/numa.c
@@ -14,6 +14,7 @@
 #include <xen/time.h>
 #include <xen/smp.h>
 #include <xen/pfn.h>
+#include <xen/srat.h>
 #include <asm/acpi.h>
 #include <xen/sched.h>
 #include <xen/softirq.h>
diff --git a/xen/arch/x86/physdev.c b/xen/arch/x86/physdev.c
index 5a49796..2184d62 100644
--- a/xen/arch/x86/physdev.c
+++ b/xen/arch/x86/physdev.c
@@ -9,6 +9,7 @@
 #include <xen/guest_access.h>
 #include <xen/iocap.h>
 #include <xen/serial.h>
+#include <xen/srat.h>
 #include <asm/current.h>
 #include <asm/io_apic.h>
 #include <asm/msi.h>
diff --git a/xen/arch/x86/setup.c b/xen/arch/x86/setup.c
index 176ee74..ee7a368 100644
--- a/xen/arch/x86/setup.c
+++ b/xen/arch/x86/setup.c
@@ -28,6 +28,7 @@
 #include <xen/tmem_xen.h>
 #include <xen/virtual_region.h>
 #include <xen/watchdog.h>
+#include <xen/srat.h>
 #include <public/version.h>
 #include <compat/platform.h>
 #include <compat/xen.h>
diff --git a/xen/arch/x86/smpboot.c b/xen/arch/x86/smpboot.c
index 9b390b8..6c61114 100644
--- a/xen/arch/x86/smpboot.c
+++ b/xen/arch/x86/smpboot.c
@@ -34,6 +34,7 @@
 #include <xen/serial.h>
 #include <xen/numa.h>
 #include <xen/cpu.h>
+#include <xen/srat.h>
 #include <asm/current.h>
 #include <asm/mc146818rtc.h>
 #include <asm/desc.h>
diff --git a/xen/arch/x86/srat.c b/xen/arch/x86/srat.c
index 58dee09..af12e26 100644
--- a/xen/arch/x86/srat.c
+++ b/xen/arch/x86/srat.c
@@ -18,91 +18,20 @@
 #include <xen/acpi.h>
 #include <xen/numa.h>
 #include <xen/pfn.h>
+#include <xen/srat.h>
 #include <asm/e820.h>
 #include <asm/page.h>
 
-static struct acpi_table_slit *__read_mostly acpi_slit;
+extern struct acpi_table_slit *__read_mostly acpi_slit;
 
 static nodemask_t memory_nodes_parsed __initdata;
 static nodemask_t processor_nodes_parsed __initdata;
 extern struct node nodes[MAX_NUMNODES] __initdata;
-
-struct pxm2node {
-	unsigned pxm;
-	nodeid_t node;
-};
-static struct pxm2node __read_mostly pxm2node[MAX_NUMNODES] =
-	{ [0 ... MAX_NUMNODES - 1] = {.node = NUMA_NO_NODE} };
-
-static unsigned node_to_pxm(nodeid_t n);
-
 extern int num_node_memblks;
 extern struct node node_memblk_range[NR_NODE_MEMBLKS];
 extern nodeid_t memblk_nodeid[NR_NODE_MEMBLKS];
 static __initdata DECLARE_BITMAP(memblk_hotplug, NR_NODE_MEMBLKS);
 
-static inline bool_t node_found(unsigned idx, unsigned pxm)
-{
-	return ((pxm2node[idx].pxm == pxm) &&
-		(pxm2node[idx].node != NUMA_NO_NODE));
-}
-
-nodeid_t pxm_to_node(unsigned pxm)
-{
-	unsigned i;
-
-	if ((pxm < ARRAY_SIZE(pxm2node)) && node_found(pxm, pxm))
-		return pxm2node[pxm].node;
-
-	for (i = 0; i < ARRAY_SIZE(pxm2node); i++)
-		if (node_found(i, pxm))
-			return pxm2node[i].node;
-
-	return NUMA_NO_NODE;
-}
-
-nodeid_t setup_node(unsigned pxm)
-{
-	nodeid_t node;
-	unsigned idx;
-	static bool_t warned;
-	static unsigned nodes_found;
-
-	BUILD_BUG_ON(MAX_NUMNODES >= NUMA_NO_NODE);
-
-	if (pxm < ARRAY_SIZE(pxm2node)) {
-		if (node_found(pxm, pxm))
-			return pxm2node[pxm].node;
-
-		/* Try to maintain indexing of pxm2node by pxm */
-		if (pxm2node[pxm].node == NUMA_NO_NODE) {
-			idx = pxm;
-			goto finish;
-		}
-	}
-
-	for (idx = 0; idx < ARRAY_SIZE(pxm2node); idx++)
-		if (pxm2node[idx].node == NUMA_NO_NODE)
-			goto finish;
-
-	if (!warned) {
-		printk(KERN_WARNING "SRAT: Too many proximity domains (%#x)\n",
-		       pxm);
-		warned = 1;
-	}
-
-	return NUMA_NO_NODE;
-
- finish:
-	node = nodes_found++;
-	if (node >= MAX_NUMNODES)
-		return NUMA_NO_NODE;
-	pxm2node[idx].pxm = pxm;
-	pxm2node[idx].node = node;
-
-	return node;
-}
-
 static __init void bad_srat(void)
 {
 	int i;
@@ -115,48 +44,6 @@ static __init void bad_srat(void)
 	mem_hotplug = 0;
 }
 
-/*
- * A lot of BIOS fill in 10 (= no distance) everywhere. This messes
- * up the NUMA heuristics which wants the local node to have a smaller
- * distance than the others.
- * Do some quick checks here and only use the SLIT if it passes.
- */
-static __init int slit_valid(struct acpi_table_slit *slit)
-{
-	int i, j;
-	int d = slit->locality_count;
-	for (i = 0; i < d; i++) {
-		for (j = 0; j < d; j++)  {
-			u8 val = slit->entry[d*i + j];
-			if (i == j) {
-				if (val != 10)
-					return 0;
-			} else if (val <= 10)
-				return 0;
-		}
-	}
-	return 1;
-}
-
-/* Callback for SLIT parsing */
-void __init acpi_numa_slit_init(struct acpi_table_slit *slit)
-{
-	unsigned long mfn;
-	if (!slit_valid(slit)) {
-		printk(KERN_INFO "ACPI: SLIT table looks invalid. "
-		       "Not used.\n");
-		return;
-	}
-	mfn = alloc_boot_pages(PFN_UP(slit->header.length), 1);
-	if (!mfn) {
-		printk(KERN_ERR "ACPI: Unable to allocate memory for "
-		       "saving ACPI SLIT numa information.\n");
-		return;
-	}
-	acpi_slit = mfn_to_virt(mfn);
-	memcpy(acpi_slit, slit, slit->header.length);
-}
-
 /* Callback for Proximity Domain -> x2APIC mapping */
 void __init
 acpi_numa_x2apic_affinity_init(const struct acpi_srat_x2apic_cpu_affinity *pa)
@@ -456,18 +343,6 @@ int __init acpi_scan_nodes(u64 start, u64 end)
 	return 0;
 }
 
-static unsigned node_to_pxm(nodeid_t n)
-{
-	unsigned i;
-
-	if ((n < ARRAY_SIZE(pxm2node)) && (pxm2node[n].node == n))
-		return pxm2node[n].pxm;
-	for (i = 0; i < ARRAY_SIZE(pxm2node); i++)
-		if (pxm2node[i].node == n)
-			return pxm2node[i].pxm;
-	return 0;
-}
-
 u8 __node_distance(nodeid_t a, nodeid_t b)
 {
 	unsigned index;
diff --git a/xen/arch/x86/x86_64/mm.c b/xen/arch/x86/x86_64/mm.c
index 9ead02e..f823fb3 100644
--- a/xen/arch/x86/x86_64/mm.c
+++ b/xen/arch/x86/x86_64/mm.c
@@ -27,6 +27,7 @@ asm(".file \"" __FILE__ "\"");
 #include <xen/guest_access.h>
 #include <xen/hypercall.h>
 #include <xen/mem_access.h>
+#include <xen/srat.h>
 #include <asm/current.h>
 #include <asm/asm_defns.h>
 #include <asm/page.h>
diff --git a/xen/common/Makefile b/xen/common/Makefile
index c1bd2ff..a668094 100644
--- a/xen/common/Makefile
+++ b/xen/common/Makefile
@@ -64,6 +64,7 @@ obj-bin-y += warning.init.o
 obj-$(CONFIG_XENOPROF) += xenoprof.o
 obj-y += xmalloc_tlsf.o
 obj-y += numa.o
+obj-y += srat.o
 
 obj-bin-$(CONFIG_X86) += $(foreach n,decompress bunzip2 unxz unlzma unlzo unlz4 earlycpio,$(n).init.o)
 
diff --git a/xen/common/srat.c b/xen/common/srat.c
new file mode 100644
index 0000000..cf50c78
--- /dev/null
+++ b/xen/common/srat.c
@@ -0,0 +1,150 @@
+/*
+ * ACPI 3.0 based NUMA setup
+ * Copyright 2004 Andi Kleen, SuSE Labs.
+ *
+ * Reads the ACPI SRAT table to figure out what memory belongs to which CPUs.
+ *
+ * Called from acpi_numa_init while reading the SRAT and SLIT tables.
+ * Assumes all memory regions belonging to a single proximity domain
+ * are in one chunk. Holes between them will be included in the node.
+ *
+ * Adapted for Xen: Ryan Harper <ryanh@us.ibm.com>
+ *
+ * Moved this generic code from xen/arch/x86/srat.c for other arch usage
+ * by Vijaya Kumar K <Vijaya.Kumar@cavium.com>
+ */
+
+#include <xen/init.h>
+#include <xen/mm.h>
+#include <xen/inttypes.h>
+#include <xen/nodemask.h>
+#include <xen/acpi.h>
+#include <xen/numa.h>
+#include <xen/pfn.h>
+#include <xen/srat.h>
+#include <asm/page.h>
+
+struct acpi_table_slit *__read_mostly acpi_slit;
+extern struct node nodes[MAX_NUMNODES] __initdata;
+
+struct pxm2node __read_mostly pxm2node[MAX_NUMNODES] =
+    { [0 ... MAX_NUMNODES - 1] = {.node = NUMA_NO_NODE} };
+
+static inline bool_t node_found(unsigned idx, unsigned pxm)
+{
+    return ((pxm2node[idx].pxm == pxm) &&
+        (pxm2node[idx].node != NUMA_NO_NODE));
+}
+
+nodeid_t pxm_to_node(unsigned pxm)
+{
+    unsigned i;
+
+    if ( (pxm < ARRAY_SIZE(pxm2node)) && node_found(pxm, pxm) )
+        return pxm2node[pxm].node;
+
+    for ( i = 0; i < ARRAY_SIZE(pxm2node); i++ )
+        if ( node_found(i, pxm) )
+            return pxm2node[i].node;
+
+    return NUMA_NO_NODE;
+}
+
+nodeid_t setup_node(unsigned pxm)
+{
+    nodeid_t node;
+    unsigned idx;
+    static bool_t warned;
+    static unsigned nodes_found;
+
+    BUILD_BUG_ON(MAX_NUMNODES >= NUMA_NO_NODE);
+
+    if ( pxm < ARRAY_SIZE(pxm2node) ) {
+        if (node_found(pxm, pxm))
+            return pxm2node[pxm].node;
+
+        /* Try to maintain indexing of pxm2node by pxm */
+        if ( pxm2node[pxm].node == NUMA_NO_NODE ) {
+            idx = pxm;
+            goto finish;
+        }
+    }
+
+    for ( idx = 0; idx < ARRAY_SIZE(pxm2node); idx++ )
+        if ( pxm2node[idx].node == NUMA_NO_NODE )
+            goto finish;
+
+    if ( !warned ) {
+        printk(KERN_WARNING "SRAT: Too many proximity domains (%#x)\n",
+               pxm);
+        warned = 1;
+    }
+
+    return NUMA_NO_NODE;
+
+ finish:
+    node = nodes_found++;
+    if (node >= MAX_NUMNODES)
+        return NUMA_NO_NODE;
+    pxm2node[idx].pxm = pxm;
+    pxm2node[idx].node = node;
+
+    return node;
+}
+
+/*
+ * A lot of BIOS fill in 10 (= no distance) everywhere. This messes
+ * up the NUMA heuristics which wants the local node to have a smaller
+ * distance than the others.
+ * Do some quick checks here and only use the SLIT if it passes.
+ */
+static __init int slit_valid(struct acpi_table_slit *slit)
+{
+    int i, j;
+    int d = slit->locality_count;
+
+    for ( i = 0; i < d; i++ ) {
+        for ( j = 0; j < d; j++ )  {
+            u8 val = slit->entry[d*i + j];
+            if ( i == j ) {
+                if (val != 10)
+                    return 0;
+            } else if ( val <= 10 )
+                return 0;
+        }
+    }
+
+    return 1;
+}
+
+/* Callback for SLIT parsing */
+void __init acpi_numa_slit_init(struct acpi_table_slit *slit)
+{
+    unsigned long mfn;
+
+    if ( !slit_valid(slit) ) {
+        printk(KERN_INFO "ACPI: SLIT table looks invalid. "
+               "Not used.\n");
+        return;
+    }
+    mfn = alloc_boot_pages(PFN_UP(slit->header.length), 1);
+    if ( !mfn ) {
+        printk(KERN_ERR "ACPI: Unable to allocate memory for "
+               "saving ACPI SLIT numa information.\n");
+        return;
+    }
+    acpi_slit = mfn_to_virt(mfn);
+    memcpy(acpi_slit, slit, slit->header.length);
+}
+
+unsigned node_to_pxm(nodeid_t n)
+{
+    unsigned i;
+
+    if ( (n < ARRAY_SIZE(pxm2node)) && (pxm2node[n].node == n) )
+        return pxm2node[n].pxm;
+    for ( i = 0; i < ARRAY_SIZE(pxm2node); i++ )
+        if ( pxm2node[i].node == n )
+            return pxm2node[i].pxm;
+    return 0;
+}
diff --git a/xen/drivers/passthrough/vtd/iommu.c b/xen/drivers/passthrough/vtd/iommu.c
index a5c61c6..7a4463d 100644
--- a/xen/drivers/passthrough/vtd/iommu.c
+++ b/xen/drivers/passthrough/vtd/iommu.c
@@ -30,6 +30,7 @@
 #include <xen/pci.h>
 #include <xen/pci_regs.h>
 #include <xen/keyhandler.h>
+#include <xen/srat.h>
 #include <asm/msi.h>
 #include <asm/irq.h>
 #include <asm/hvm/vmx/vmx.h>
diff --git a/xen/include/asm-x86/numa.h b/xen/include/asm-x86/numa.h
index 659ff6a..79a445c 100644
--- a/xen/include/asm-x86/numa.h
+++ b/xen/include/asm-x86/numa.h
@@ -17,8 +17,6 @@ extern cpumask_t     node_to_cpumask[];
 #define node_to_first_cpu(node)  (__ffs(node_to_cpumask[node]))
 #define node_to_cpumask(node)    (node_to_cpumask[node])
 
-extern nodeid_t pxm_to_node(unsigned int pxm);
-
 #define ZONE_ALIGN (1UL << (MAX_ORDER+PAGE_SHIFT))
 
 extern void numa_init_array(void);
diff --git a/xen/include/xen/numa.h b/xen/include/xen/numa.h
index 4f04ab4..eb18380 100644
--- a/xen/include/xen/numa.h
+++ b/xen/include/xen/numa.h
@@ -73,7 +73,6 @@ extern int valid_numa_range(u64 start, u64 end, nodeid_t node);
 extern int conflicting_memblks(u64 start, u64 end);
 extern void cutoff_node(int i, u64 start, u64 end);
 extern void numa_add_cpu(int cpu);
-extern nodeid_t setup_node(unsigned int pxm);
 extern void numa_set_node(int cpu, nodeid_t node);
 extern void setup_node_bootmem(nodeid_t nodeid, u64 start, u64 end);
 extern int compute_hash_shift(struct node *nodes, int numnodes,
diff --git a/xen/include/xen/srat.h b/xen/include/xen/srat.h
new file mode 100644
index 0000000..978f1e8
--- /dev/null
+++ b/xen/include/xen/srat.h
@@ -0,0 +1,13 @@
+#ifndef __XEN_SRAT_H__
+#define __XEN_SRAT_H__
+
+struct pxm2node {
+    unsigned pxm;
+    nodeid_t node;
+};
+
+extern struct pxm2node __read_mostly pxm2node[MAX_NUMNODES];
+nodeid_t pxm_to_node(unsigned pxm);
+nodeid_t setup_node(unsigned pxm);
+unsigned node_to_pxm(nodeid_t n);
+#endif /* __XEN_SRAT_H__ */
-- 
2.7.4


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


* [RFC PATCH v1 14/21] ACPI: Move srat_disabled to common code
  2017-02-09 15:56 [RFC PATCH v1 00/21] ARM: Add Xen NUMA support vijay.kilari
                   ` (12 preceding siblings ...)
  2017-02-09 15:57 ` [RFC PATCH v1 13/21] ACPI: Refactor acpi SRAT and SLIT table handling code vijay.kilari
@ 2017-02-09 15:57 ` vijay.kilari
  2017-02-09 15:57 ` [RFC PATCH v1 15/21] ARM: NUMA: Extract MPIDR from MADT table vijay.kilari
                   ` (8 subsequent siblings)
  22 siblings, 0 replies; 91+ messages in thread
From: vijay.kilari @ 2017-02-09 15:57 UTC (permalink / raw)
  To: julien.grall, sstabellini, andre.przywara, dario.faggioli
  Cc: xen-devel, Vijaya Kumar K

From: Vijaya Kumar K <Vijaya.Kumar@cavium.com>

Move srat_disabled() and the acpi_numa variable it tests from
xen/arch/x86/numa.c to xen/common/srat.c.

Signed-off-by: Vijaya Kumar <Vijaya.Kumar@cavium.com>
---
 xen/arch/x86/numa.c        | 7 -------
 xen/common/srat.c          | 7 +++++++
 xen/include/asm-x86/acpi.h | 1 -
 xen/include/asm-x86/numa.h | 1 -
 xen/include/xen/srat.h     | 2 ++
 5 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/xen/arch/x86/numa.c b/xen/arch/x86/numa.c
index 58de324..ec251bd 100644
--- a/xen/arch/x86/numa.c
+++ b/xen/arch/x86/numa.c
@@ -32,13 +32,6 @@ nodeid_t apicid_to_node[MAX_LOCAL_APIC] = {
 
 nodemask_t __read_mostly node_online_map = { { [0] = 1UL } };
 
-s8 acpi_numa = 0;
-
-int srat_disabled(void)
-{
-    return numa_off || acpi_numa < 0;
-}
-
 void __init numa_init_array(void)
 {
     int rr, i;
diff --git a/xen/common/srat.c b/xen/common/srat.c
index cf50c78..a96406e 100644
--- a/xen/common/srat.c
+++ b/xen/common/srat.c
@@ -30,6 +30,13 @@ extern struct node nodes[MAX_NUMNODES] __initdata;
 struct pxm2node __read_mostly pxm2node[MAX_NUMNODES] =
     { [0 ... MAX_NUMNODES - 1] = {.node = NUMA_NO_NODE} };
 
+s8 acpi_numa = 0;
+
+int srat_disabled(void)
+{
+    return numa_off || acpi_numa < 0;
+}
+
 static inline bool_t node_found(unsigned idx, unsigned pxm)
 {
     return ((pxm2node[idx].pxm == pxm) &&
diff --git a/xen/include/asm-x86/acpi.h b/xen/include/asm-x86/acpi.h
index f1a8e9d..934cd66 100644
--- a/xen/include/asm-x86/acpi.h
+++ b/xen/include/asm-x86/acpi.h
@@ -104,7 +104,6 @@ extern void acpi_reserve_bootmem(void);
 
 #define ARCH_HAS_POWER_INIT	1
 
-extern s8 acpi_numa;
 extern int acpi_scan_nodes(u64 start, u64 end);
 
 #ifdef CONFIG_ACPI_SLEEP
diff --git a/xen/include/asm-x86/numa.h b/xen/include/asm-x86/numa.h
index 79a445c..9c11db4 100644
--- a/xen/include/asm-x86/numa.h
+++ b/xen/include/asm-x86/numa.h
@@ -21,7 +21,6 @@ extern cpumask_t     node_to_cpumask[];
 
 extern void numa_init_array(void);
 
-extern int srat_disabled(void);
 extern void srat_detect_node(int cpu);
 
 extern nodeid_t apicid_to_node[];
diff --git a/xen/include/xen/srat.h b/xen/include/xen/srat.h
index 978f1e8..ab33d86 100644
--- a/xen/include/xen/srat.h
+++ b/xen/include/xen/srat.h
@@ -6,7 +6,9 @@ struct pxm2node {
     nodeid_t node;
 };
 
+extern s8 acpi_numa;
 extern struct pxm2node __read_mostly pxm2node[MAX_NUMNODES];
+extern int srat_disabled(void);
 nodeid_t pxm_to_node(unsigned pxm);
 nodeid_t setup_node(unsigned pxm);
 unsigned node_to_pxm(nodeid_t n);
-- 
2.7.4



* [RFC PATCH v1 15/21] ARM: NUMA: Extract MPIDR from MADT table
  2017-02-09 15:56 [RFC PATCH v1 00/21] ARM: Add Xen NUMA support vijay.kilari
                   ` (13 preceding siblings ...)
  2017-02-09 15:57 ` [RFC PATCH v1 14/21] ACPI: Move srat_disabled to common code vijay.kilari
@ 2017-02-09 15:57 ` vijay.kilari
  2017-03-02 16:28   ` Julien Grall
  2017-02-09 15:57 ` [RFC PATCH v1 16/21] ARM: NUMA: Extract proximity from SRAT table vijay.kilari
                   ` (7 subsequent siblings)
  22 siblings, 1 reply; 91+ messages in thread
From: vijay.kilari @ 2017-02-09 15:57 UTC (permalink / raw)
  To: julien.grall, sstabellini, andre.przywara, dario.faggioli
  Cc: xen-devel, Vijaya Kumar K

From: Vijaya Kumar K <Vijaya.Kumar@cavium.com>

Parse the MADT and extract the MPIDR of every CPU from its
ACPI_MADT_TYPE_GENERIC_INTERRUPT entries, storing each
(ACPI processor UID, MPIDR) pair in cpu_uid_to_hwid[].

SRAT parsing later uses this mapping to look up the MPIDR
for a given ACPI processor UID.
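
For readers skimming the series, the UID to MPIDR table can be
pictured with the minimal standalone sketch below. It is not the
patch code: every sketch_* name and SKETCH_* constant is a
hypothetical stand-in (for NR_CPUS, MPIDR_INVALID and so on), and
the bounds check in sketch_map_add() is an addition made for the
illustration.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_NR_CPUS       8           /* stand-in for NR_CPUS */
#define SKETCH_MPIDR_INVALID UINT64_MAX  /* stand-in for MPIDR_INVALID */
#define SKETCH_UID_NONE      UINT32_MAX  /* marks an unused slot */

struct sketch_uid_mpidr {
    uint32_t uid;
    uint64_t mpidr;
};

static struct sketch_uid_mpidr sketch_map[SKETCH_NR_CPUS];
static size_t sketch_used;

/* Mark every slot unused; must run before the (simulated) MADT walk. */
void sketch_map_reset(void)
{
    size_t i;

    for ( i = 0; i < SKETCH_NR_CPUS; i++ )
    {
        sketch_map[i].uid = SKETCH_UID_NONE;
        sketch_map[i].mpidr = SKETCH_MPIDR_INVALID;
    }
    sketch_used = 0;
}

/* Record one GICC entry; returns 0 on success, -1 if the table is full. */
int sketch_map_add(uint32_t uid, uint64_t mpidr)
{
    if ( sketch_used >= SKETCH_NR_CPUS )
        return -1;

    sketch_map[sketch_used].uid = uid;
    sketch_map[sketch_used].mpidr = mpidr;
    sketch_used++;

    return 0;
}

/* Look up the MPIDR for an ACPI processor UID, as SRAT parsing would. */
uint64_t sketch_map_lookup(uint32_t uid)
{
    size_t i;

    /* Fast path: the UID happens to equal the slot index. */
    if ( uid < SKETCH_NR_CPUS && sketch_map[uid].uid == uid )
        return sketch_map[uid].mpidr;

    /* Slow path: scan the slots filled so far. */
    for ( i = 0; i < sketch_used; i++ )
        if ( sketch_map[i].uid == uid )
            return sketch_map[i].mpidr;

    return SKETCH_MPIDR_INVALID;
}
```

The fast path only pays off when firmware numbers UIDs in MADT
order; sparse UIDs fall through to the linear scan.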

Signed-off-by: Vijaya Kumar <Vijaya.Kumar@cavium.com>
---
 xen/arch/arm/Makefile      |   1 +
 xen/arch/arm/acpi_numa.c   | 122 +++++++++++++++++++++++++++++++++++++++++++++
 xen/arch/arm/numa.c        |   1 +
 xen/include/asm-arm/acpi.h |   2 +
 4 files changed, 126 insertions(+)

diff --git a/xen/arch/arm/Makefile b/xen/arch/arm/Makefile
index 7694485..8c5e67b 100644
--- a/xen/arch/arm/Makefile
+++ b/xen/arch/arm/Makefile
@@ -51,6 +51,7 @@ obj-y += vpsci.o
 obj-y += vuart.o
 obj-$(CONFIG_NUMA) += numa.o
 obj-$(CONFIG_NUMA) += dt_numa.o
+obj-$(CONFIG_ACPI_NUMA) += acpi_numa.o
 
 #obj-bin-y += ....o
 
diff --git a/xen/arch/arm/acpi_numa.c b/xen/arch/arm/acpi_numa.c
new file mode 100644
index 0000000..3ee87f2
--- /dev/null
+++ b/xen/arch/arm/acpi_numa.c
@@ -0,0 +1,122 @@
+/*
+ * ACPI based NUMA setup
+ *
+ * Copyright (C) 2016 - Cavium Inc.
+ * Vijaya Kumar K <Vijaya.Kumar@cavium.com>
+ *
+ * Reads the ACPI MADT and SRAT table to setup NUMA information.
+ *
+ * Contains Excerpts from x86 implementation
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ */
+
+#include <xen/init.h>
+#include <xen/mm.h>
+#include <xen/inttypes.h>
+#include <xen/nodemask.h>
+#include <xen/acpi.h>
+#include <xen/numa.h>
+#include <xen/pfn.h>
+#include <xen/srat.h>
+#include <asm/page.h>
+#include <asm/acpi.h>
+
+extern nodemask_t numa_nodes_parsed;
+struct uid_to_mpidr {
+    u32 uid;
+    u64 mpidr;
+};
+
+/* Holds mapping of CPU id to MPIDR read from MADT */
+static struct uid_to_mpidr cpu_uid_to_hwid[NR_CPUS] __read_mostly;
+
+static __init void bad_srat(void)
+{
+    int i;
+
+    printk(KERN_ERR "SRAT: SRAT not used.\n");
+    acpi_numa = -1;
+    for (i = 0; i < ARRAY_SIZE(pxm2node); i++)
+        pxm2node[i].node = NUMA_NO_NODE;
+}
+
+static u64 acpi_get_cpu_mpidr(int uid)
+{
+    int i;
+
+    if ( uid < ARRAY_SIZE(cpu_uid_to_hwid) && cpu_uid_to_hwid[uid].uid == uid &&
+         cpu_uid_to_hwid[uid].mpidr != MPIDR_INVALID )
+        return cpu_uid_to_hwid[uid].mpidr;
+
+    for ( i = 0; i < NR_CPUS; i++ )
+    {
+        if ( cpu_uid_to_hwid[i].uid == uid )
+            return cpu_uid_to_hwid[i].mpidr;
+    }
+
+    return MPIDR_INVALID;
+}
+
+static void __init
+acpi_map_cpu_to_mpidr(struct acpi_madt_generic_interrupt *processor)
+{
+    static int cpus = 0;
+
+    u64 mpidr = processor->arm_mpidr & MPIDR_HWID_MASK;
+
+    if ( mpidr == MPIDR_INVALID )
+    {
+        printk("Skip MADT cpu entry with invalid MPIDR\n");
+        bad_srat();
+        return;
+    }
+
+    cpu_uid_to_hwid[cpus].mpidr = mpidr;
+    cpu_uid_to_hwid[cpus].uid = processor->uid;
+
+    cpus++;
+}
+
+static int __init acpi_parse_madt_handler(struct acpi_subtable_header *header,
+                                          const unsigned long end)
+{
+    struct acpi_madt_generic_interrupt *p =
+               container_of(header, struct acpi_madt_generic_interrupt, header);
+
+    if ( BAD_MADT_ENTRY(p, end) )
+    {
+        /* The MADT entry is malformed, so disable NUMA via bad_srat(). */
+        bad_srat();
+        return -EINVAL;
+    }
+
+    acpi_table_print_madt_entry(header);
+    acpi_map_cpu_to_mpidr(p);
+
+    return 0;
+}
+
+void __init acpi_map_uid_to_mpidr(void)
+{
+    int i;
+
+    for ( i  = 0; i < NR_CPUS; i++ )
+    {
+        cpu_uid_to_hwid[i].mpidr = MPIDR_INVALID;
+        cpu_uid_to_hwid[i].uid = -1;
+    }
+
+    acpi_table_parse_madt(ACPI_MADT_TYPE_GENERIC_INTERRUPT,
+                    acpi_parse_madt_handler, 0);
+}
+
+void __init acpi_numa_arch_fixup(void) {}
diff --git a/xen/arch/arm/numa.c b/xen/arch/arm/numa.c
index 31dc552..5c49347 100644
--- a/xen/arch/arm/numa.c
+++ b/xen/arch/arm/numa.c
@@ -20,6 +20,7 @@
 #include <xen/mm.h>
 #include <xen/nodemask.h>
 #include <xen/pfn.h>
+#include <xen/acpi.h>
 #include <asm/mm.h>
 #include <xen/numa.h>
 #include <asm/acpi.h>
diff --git a/xen/include/asm-arm/acpi.h b/xen/include/asm-arm/acpi.h
index 9f954d3..b1f36f4 100644
--- a/xen/include/asm-arm/acpi.h
+++ b/xen/include/asm-arm/acpi.h
@@ -68,6 +68,8 @@ static inline void enable_acpi(void)
 {
     acpi_disabled = 0;
 }
+
+void acpi_map_uid_to_mpidr(void);
 #else
 #define acpi_disabled (1)
 #define disable_acpi()
-- 
2.7.4



* [RFC PATCH v1 16/21] ARM: NUMA: Extract proximity from SRAT table
  2017-02-09 15:56 [RFC PATCH v1 00/21] ARM: Add Xen NUMA support vijay.kilari
                   ` (14 preceding siblings ...)
  2017-02-09 15:57 ` [RFC PATCH v1 15/21] ARM: NUMA: Extract MPIDR from MADT table vijay.kilari
@ 2017-02-09 15:57 ` vijay.kilari
  2017-03-02 17:21   ` Julien Grall
  2017-02-09 15:57 ` [RFC PATCH v1 17/21] ARM: NUMA: Extract memory " vijay.kilari
                   ` (6 subsequent siblings)
  22 siblings, 1 reply; 91+ messages in thread
From: vijay.kilari @ 2017-02-09 15:57 UTC (permalink / raw)
  To: julien.grall, sstabellini, andre.przywara, dario.faggioli
  Cc: xen-devel, Vijaya Kumar K

From: Vijaya Kumar K <Vijaya.Kumar@cavium.com>

Register an SRAT entry handler for type
ACPI_SRAT_TYPE_GICC_AFFINITY to parse the SRAT table
and extract the proximity domain of each CPU.
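
Each proximity domain (PXM) found this way is translated into a
dense node ID. The sketch below shows one simple way to do that
translation. It is a simplified illustration, not the setup_node()
used by the series: the sketch_* names and SKETCH_* constants are
hypothetical, and the real code additionally tries to keep the
table indexed by PXM.

```c
#include <stdint.h>

#define SKETCH_MAX_NODES 4      /* stand-in for MAX_NUMNODES */
#define SKETCH_NO_NODE   0xFF   /* stand-in for NUMA_NO_NODE */

struct sketch_pxm_node {
    uint32_t pxm;
    uint8_t node;
};

static struct sketch_pxm_node sketch_pxm_map[SKETCH_MAX_NODES];
static unsigned int sketch_nodes_found;

/*
 * Return the node already assigned to this PXM, or assign the next
 * free node ID. Returns SKETCH_NO_NODE once all node IDs are in use.
 */
uint8_t sketch_setup_node(uint32_t pxm)
{
    unsigned int i;

    for ( i = 0; i < sketch_nodes_found; i++ )
        if ( sketch_pxm_map[i].pxm == pxm )
            return sketch_pxm_map[i].node;

    if ( sketch_nodes_found >= SKETCH_MAX_NODES )
        return SKETCH_NO_NODE;

    sketch_pxm_map[sketch_nodes_found].pxm = pxm;
    sketch_pxm_map[sketch_nodes_found].node = (uint8_t)sketch_nodes_found;

    return (uint8_t)sketch_nodes_found++;
}
```

Because PXM values may be sparse, callers have to treat
SKETCH_NO_NODE as "too many proximity domains" and give up on
the SRAT, which is what the handler in this patch does via
bad_srat().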

Signed-off-by: Vijaya Kumar <Vijaya.Kumar@cavium.com>
---
 xen/arch/arm/acpi_numa.c  | 55 +++++++++++++++++++++++++++++++++++++++++++++++
 xen/drivers/acpi/numa.c   | 37 +++++++++++++++++++++++++++++++
 xen/drivers/acpi/osl.c    |  2 ++
 xen/include/acpi/actbl1.h | 17 ++++++++++++++-
 xen/include/xen/acpi.h    | 39 +++++++++++++++++++++++++++++++++
 5 files changed, 149 insertions(+), 1 deletion(-)

diff --git a/xen/arch/arm/acpi_numa.c b/xen/arch/arm/acpi_numa.c
index 3ee87f2..f659275 100644
--- a/xen/arch/arm/acpi_numa.c
+++ b/xen/arch/arm/acpi_numa.c
@@ -105,6 +105,61 @@ static int __init acpi_parse_madt_handler(struct acpi_subtable_header *header,
     return 0;
 }
 
+/* Callback for Proximity Domain -> ACPI processor UID mapping */
+void __init acpi_numa_gicc_affinity_init(const struct acpi_srat_gicc_affinity *pa)
+{
+    int pxm, node;
+    u64 mpidr = 0;
+    static u32 cpus_in_srat;
+
+    if ( srat_disabled() )
+        return;
+
+    if ( pa->header.length < sizeof(struct acpi_srat_gicc_affinity) )
+    {
+        printk(XENLOG_WARNING "SRAT: Invalid SRAT header length: %d\n",
+               pa->header.length);
+        bad_srat();
+        return;
+    }
+
+    if ( !(pa->flags & ACPI_SRAT_GICC_ENABLED) )
+        return;
+
+    if ( cpus_in_srat >= NR_CPUS )
+    {
+        printk(XENLOG_WARNING
+               "SRAT: cpu_to_node_map[%d] is too small to fit all cpus\n",
+               NR_CPUS);
+        return;
+    }
+
+    pxm = pa->proximity_domain;
+    node = setup_node(pxm);
+    if ( node == NUMA_NO_NODE || node >= MAX_NUMNODES )
+    {
+        printk(XENLOG_WARNING "SRAT: Too many proximity domains %d\n", pxm);
+        bad_srat();
+        return;
+    }
+
+    mpidr = acpi_get_cpu_mpidr(pa->acpi_processor_uid);
+    if ( mpidr == MPIDR_INVALID )
+    {
+        printk(XENLOG_WARNING
+               "SRAT: PXM %d with ACPI ID %d has no valid MPIDR in MADT\n",
+               pxm, pa->acpi_processor_uid);
+        bad_srat();
+        return;
+    }
+
+    node_set(node, numa_nodes_parsed);
+    cpus_in_srat++;
+    acpi_numa = 1;
+    printk(XENLOG_INFO "SRAT: PXM %d -> MPIDR 0x%"PRIx64" -> Node %d\n",
+           pxm, mpidr, node);
+}
+
 void __init acpi_map_uid_to_mpidr(void)
 {
     int i;
diff --git a/xen/drivers/acpi/numa.c b/xen/drivers/acpi/numa.c
index 50bf9f8..ce22e88 100644
--- a/xen/drivers/acpi/numa.c
+++ b/xen/drivers/acpi/numa.c
@@ -25,9 +25,11 @@
 #include <xen/init.h>
 #include <xen/types.h>
 #include <xen/errno.h>
+#include <xen/mm.h>
 #include <xen/acpi.h>
 #include <xen/numa.h>
 #include <acpi/acmacros.h>
+#include <asm/mm.h>
 
 #define ACPI_NUMA	0x80000000
 #define _COMPONENT	ACPI_NUMA
@@ -105,6 +107,21 @@ void __init acpi_table_print_srat_entry(struct acpi_subtable_header * header)
 		}
 #endif				/* ACPI_DEBUG_OUTPUT */
 		break;
+       case ACPI_SRAT_TYPE_GICC_AFFINITY:
+#ifdef ACPI_DEBUG_OUTPUT
+		{
+			struct acpi_srat_gicc_affinity *p =
+			    (struct acpi_srat_gicc_affinity *)header;
+			ACPI_DEBUG_PRINT((ACPI_DB_INFO,
+					  "SRAT Processor (acpi id[0x%04x]) in"
+					  " proximity domain %d %s\n",
+					  p->acpi_processor_uid,
+					  p->proximity_domain,
+					  (p->flags & ACPI_SRAT_GICC_ENABLED) ?
+					  "enabled" : "disabled");
+		}
+#endif                         /* ACPI_DEBUG_OUTPUT */
+               break;
 	default:
 		printk(KERN_WARNING PREFIX
 		       "Found unsupported SRAT entry (type = %#x)\n",
@@ -185,6 +202,24 @@ int __init acpi_parse_srat(struct acpi_table_header *table)
 	return 0;
 }
 
+static int __init
+acpi_parse_gicc_affinity(struct acpi_subtable_header *header,
+			 const unsigned long end)
+{
+	const struct acpi_srat_gicc_affinity *processor_affinity
+			= (struct acpi_srat_gicc_affinity *)header;
+
+	if (!processor_affinity)
+		return -EINVAL;
+
+	acpi_table_print_srat_entry(header);
+
+	/* let architecture-dependent part to do it */
+	acpi_numa_gicc_affinity_init(processor_affinity);
+
+	return 0;
+}
+
 int __init
 acpi_table_parse_srat(int id, acpi_madt_entry_handler handler,
 		      unsigned int max_entries)
@@ -205,6 +240,8 @@ int __init acpi_numa_init(void)
 		acpi_table_parse_srat(ACPI_SRAT_TYPE_MEMORY_AFFINITY,
 				      acpi_parse_memory_affinity,
 				      NR_NODE_MEMBLKS);
+		acpi_table_parse_srat(ACPI_SRAT_TYPE_GICC_AFFINITY,
+				      acpi_parse_gicc_affinity, NR_CPUS);
 	}
 
 	/* SLIT: System Locality Information Table */
diff --git a/xen/drivers/acpi/osl.c b/xen/drivers/acpi/osl.c
index 7199047..7046816 100644
--- a/xen/drivers/acpi/osl.c
+++ b/xen/drivers/acpi/osl.c
@@ -29,6 +29,7 @@
 #include <xen/pfn.h>
 #include <xen/types.h>
 #include <xen/errno.h>
+#include <xen/mm.h>
 #include <xen/acpi.h>
 #include <xen/numa.h>
 #include <acpi/acmacros.h>
@@ -39,6 +40,7 @@
 #include <xen/efi.h>
 #include <xen/vmap.h>
 #include <xen/kconfig.h>
+#include <asm/mm.h>
 
 #define _COMPONENT		ACPI_OS_SERVICES
 ACPI_MODULE_NAME("osl")
diff --git a/xen/include/acpi/actbl1.h b/xen/include/acpi/actbl1.h
index e199136..b84bfba 100644
--- a/xen/include/acpi/actbl1.h
+++ b/xen/include/acpi/actbl1.h
@@ -949,7 +949,8 @@ enum acpi_srat_type {
 	ACPI_SRAT_TYPE_CPU_AFFINITY = 0,
 	ACPI_SRAT_TYPE_MEMORY_AFFINITY = 1,
 	ACPI_SRAT_TYPE_X2APIC_CPU_AFFINITY = 2,
-	ACPI_SRAT_TYPE_RESERVED = 3	/* 3 and greater are reserved */
+	ACPI_SRAT_TYPE_GICC_AFFINITY = 3,
+	ACPI_SRAT_TYPE_RESERVED = 4	/* 4 and greater are reserved */
 };
 
 /*
@@ -1007,6 +1008,20 @@ struct acpi_srat_x2apic_cpu_affinity {
 
 #define ACPI_SRAT_CPU_ENABLED       (1)	/* 00: Use affinity structure */
 
+/* 3: GICC Affinity (ACPI 5.1) */
+
+struct acpi_srat_gicc_affinity {
+	struct acpi_subtable_header header;
+	u32 proximity_domain;
+	u32 acpi_processor_uid;
+	u32 flags;
+	u32 clock_domain;
+};
+
+/* Flags for struct acpi_srat_gicc_affinity */
+
+#define ACPI_SRAT_GICC_ENABLED     (1)  /* 00: Use affinity structure */
+
 /* Reset to default packing */
 
 #pragma pack()
diff --git a/xen/include/xen/acpi.h b/xen/include/xen/acpi.h
index 30ec0ee..67fe1bb 100644
--- a/xen/include/xen/acpi.h
+++ b/xen/include/xen/acpi.h
@@ -92,10 +92,49 @@ void acpi_table_print_srat_entry (struct acpi_subtable_header *srat);
 
 /* the following four functions are architecture-dependent */
 void acpi_numa_slit_init (struct acpi_table_slit *slit);
+#if defined(CONFIG_X86)
 void acpi_numa_processor_affinity_init(const struct acpi_srat_cpu_affinity *);
 void acpi_numa_x2apic_affinity_init(const struct acpi_srat_x2apic_cpu_affinity *);
 void acpi_numa_memory_affinity_init(const struct acpi_srat_mem_affinity *);
 void acpi_numa_arch_fixup(void);
+static inline void
+acpi_numa_gicc_affinity_init(const struct acpi_srat_gicc_affinity *pa)
+{
+	return;
+}
+#elif defined(CONFIG_ARM)
+static inline void
+acpi_numa_processor_affinity_init(const struct acpi_srat_cpu_affinity *cpu_aff)
+{
+	return;
+}
+static inline void
+acpi_numa_x2apic_affinity_init(const struct acpi_srat_x2apic_cpu_affinity *x2apic)
+{
+	return;
+}
+#if defined(CONFIG_ACPI_NUMA)
+void acpi_numa_gicc_affinity_init(const struct acpi_srat_gicc_affinity *pa);
+void acpi_numa_memory_affinity_init(const struct acpi_srat_mem_affinity *);
+void acpi_numa_arch_fixup(void);
+#else
+static inline void
+acpi_numa_gicc_affinity_init(const struct acpi_srat_gicc_affinity *pa)
+{
+	return;
+}
+static inline void
+acpi_numa_memory_affinity_init(const struct acpi_srat_mem_affinity *ma)
+{
+	return;
+}
+static inline void
+acpi_numa_arch_fixup(void)
+{
+	return;
+}
+#endif /* CONFIG_ACPI_NUMA */
+#endif /* CONFIG_X86 */
 
 #ifdef CONFIG_ACPI_HOTPLUG_CPU
 /* Arch dependent functions for cpu hotplug support */
-- 
2.7.4



* [RFC PATCH v1 17/21] ARM: NUMA: Extract memory proximity from SRAT table
  2017-02-09 15:56 [RFC PATCH v1 00/21] ARM: Add Xen NUMA support vijay.kilari
                   ` (15 preceding siblings ...)
  2017-02-09 15:57 ` [RFC PATCH v1 16/21] ARM: NUMA: Extract proximity from SRAT table vijay.kilari
@ 2017-02-09 15:57 ` vijay.kilari
  2017-02-10 17:33   ` Konrad Rzeszutek Wilk
  2017-02-09 15:57 ` [RFC PATCH v1 18/21] ARM: NUMA: update node_distance with ACPI support vijay.kilari
                   ` (5 subsequent siblings)
  22 siblings, 1 reply; 91+ messages in thread
From: vijay.kilari @ 2017-02-09 15:57 UTC (permalink / raw)
  To: julien.grall, sstabellini, andre.przywara, dario.faggioli
  Cc: xen-devel, Vijaya Kumar K

From: Vijaya Kumar K <Vijaya.Kumar@cavium.com>

Register an SRAT entry handler for type
ACPI_SRAT_TYPE_MEMORY_AFFINITY to parse the SRAT table
and extract the proximity domain of each memory range.
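
The handler rejects a memory range that overlaps a range already
recorded for a different node. The overlap test can be sketched
as below, assuming half-open [start, end) ranges. The sketch_*
names and SKETCH_MAX_BLKS are hypothetical and only mirror the
idea behind conflicting_memblks(), not its exact code.

```c
#include <stdint.h>

#define SKETCH_MAX_BLKS 8   /* stand-in for NR_NODE_MEMBLKS */

struct sketch_range {
    uint64_t start;   /* inclusive */
    uint64_t end;     /* exclusive */
};

static struct sketch_range sketch_blks[SKETCH_MAX_BLKS];
static int sketch_num_blks;

/* Return the index of the first recorded block overlapping the range, or -1. */
int sketch_conflicting_blk(uint64_t start, uint64_t end)
{
    int i;

    for ( i = 0; i < sketch_num_blks; i++ )
    {
        const struct sketch_range *r = &sketch_blks[i];

        if ( r->start == r->end )
            continue;               /* empty block, cannot overlap */
        if ( r->end > start && r->start < end )
            return i;
    }

    return -1;
}

/* Record a block; returns its index, or -1 if the table is full. */
int sketch_add_blk(uint64_t start, uint64_t size)
{
    if ( sketch_num_blks >= SKETCH_MAX_BLKS )
        return -1;

    sketch_blks[sketch_num_blks].start = start;
    sketch_blks[sketch_num_blks].end = start + size;

    return sketch_num_blks++;
}
```

Adjacent ranges such as [0x1000, 0x2000) and [0x2000, 0x3000)
do not count as overlapping under this test.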

Signed-off-by: Vijaya Kumar <Vijaya.Kumar@cavium.com>
---
 xen/arch/arm/acpi_numa.c | 80 ++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 80 insertions(+)

diff --git a/xen/arch/arm/acpi_numa.c b/xen/arch/arm/acpi_numa.c
index f659275..23cb07b 100644
--- a/xen/arch/arm/acpi_numa.c
+++ b/xen/arch/arm/acpi_numa.c
@@ -30,6 +30,9 @@
 #include <asm/page.h>
 #include <asm/acpi.h>
 
+extern int num_node_memblks;
+extern struct node node_memblk_range[NR_NODE_MEMBLKS];
+extern nodeid_t memblk_nodeid[NR_NODE_MEMBLKS];
 extern nodemask_t numa_nodes_parsed;
 struct uid_to_mpidr {
     u32 uid;
@@ -38,6 +41,7 @@ struct uid_to_mpidr {
 
 /* Holds mapping of CPU id to MPIDR read from MADT */
 static struct uid_to_mpidr cpu_uid_to_hwid[NR_CPUS] __read_mostly;
+static __initdata DECLARE_BITMAP(memblk_hotplug, NR_NODE_MEMBLKS);
 
 static __init void bad_srat(void)
 {
@@ -160,6 +164,82 @@ void __init acpi_numa_gicc_affinity_init(const struct acpi_srat_gicc_affinity *p
            pxm, mpidr, node);
 }
 
+/* Callback for parsing of the Proximity Domain <-> Memory Area mappings */
+void __init
+acpi_numa_memory_affinity_init(const struct acpi_srat_mem_affinity *ma)
+{
+    u64 start, end;
+    unsigned pxm;
+    nodeid_t node;
+    int i;
+
+    if ( srat_disabled() )
+        return;
+
+    if ( ma->header.length != sizeof(struct acpi_srat_mem_affinity) )
+    {
+        bad_srat();
+        return;
+    }
+    if ( !(ma->flags & ACPI_SRAT_MEM_ENABLED) )
+        return;
+
+    if ( num_node_memblks >= NR_NODE_MEMBLKS )
+    {
+        printk(XENLOG_WARNING
+               "Too many NUMA entries, try a bigger NR_NODE_MEMBLKS\n");
+        bad_srat();
+        return;
+    }
+
+    start = ma->base_address;
+    end = start + ma->length;
+    pxm = ma->proximity_domain;
+    node = setup_node(pxm);
+    if ( node == NUMA_NO_NODE )
+    {
+        bad_srat();
+        return;
+    }
+    /* It is fine to add this area to the node data; it is used later. */
+    i = conflicting_memblks(start, end);
+    if ( i < 0 )
+        /* everything fine */;
+    else if ( memblk_nodeid[i] == node )
+    {
+        bool_t mismatch = !(ma->flags & ACPI_SRAT_MEM_HOT_PLUGGABLE) !=
+                           !test_bit(i, memblk_hotplug);
+
+        printk("%sSRAT: PXM %u (%"PRIx64"-%"PRIx64") overlaps with itself "
+               "(%"PRIx64"-%"PRIx64")\n",
+               mismatch ? KERN_ERR : KERN_WARNING, pxm, start, end,
+               node_memblk_range[i].start, node_memblk_range[i].end);
+        if ( mismatch )
+        {
+            bad_srat();
+            return;
+        }
+    }
+    else
+    {
+         printk(XENLOG_WARNING
+                "SRAT: PXM %u (%"PRIx64"-%"PRIx64") overlaps with PXM %u (%"PRIx64"-%"PRIx64")\n",
+                pxm, start, end, node_to_pxm(memblk_nodeid[i]),
+                node_memblk_range[i].start, node_memblk_range[i].end);
+        bad_srat();
+        return;
+    }
+
+    printk(XENLOG_INFO "SRAT: Node %u PXM %u %"PRIx64"-%"PRIx64"%s\n",
+           node, pxm, start, end,
+           ma->flags & ACPI_SRAT_MEM_HOT_PLUGGABLE ? " (hotplug)" : "");
+
+    numa_add_memblk(node, start, ma->length);
+    node_set(node, numa_nodes_parsed);
+    if ( ma->flags & ACPI_SRAT_MEM_HOT_PLUGGABLE )
+        __set_bit(num_node_memblks, memblk_hotplug);
+}
+
 void __init acpi_map_uid_to_mpidr(void)
 {
     int i;
-- 
2.7.4



* [RFC PATCH v1 18/21] ARM: NUMA: update node_distance with ACPI support
  2017-02-09 15:56 [RFC PATCH v1 00/21] ARM: Add Xen NUMA support vijay.kilari
                   ` (16 preceding siblings ...)
  2017-02-09 15:57 ` [RFC PATCH v1 17/21] ARM: NUMA: Extract memory " vijay.kilari
@ 2017-02-09 15:57 ` vijay.kilari
  2017-03-02 17:24   ` Julien Grall
  2017-02-09 15:57 ` [RFC PATCH v1 19/21] ARM: NUMA: Initialize ACPI NUMA vijay.kilari
                   ` (4 subsequent siblings)
  22 siblings, 1 reply; 91+ messages in thread
From: vijay.kilari @ 2017-02-09 15:57 UTC (permalink / raw)
  To: julien.grall, sstabellini, andre.przywara, dario.faggioli
  Cc: xen-devel, Vijaya Kumar K

From: Vijaya Kumar K <Vijaya.Kumar@cavium.com>

Update __node_distance() to return distances from the
ACPI SLIT table when one has been parsed.
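
The SLIT is a flat locality_count x locality_count matrix of
one-byte distances, so the lookup reduces to a row-major index
plus a sanity check on the value. A minimal sketch, assuming the
caller already has proximity-domain indices; sketch_slit_distance()
and SKETCH_NO_DISTANCE are hypothetical names, and the range check
on the indices is an addition made for the illustration:

```c
#include <stdint.h>

#define SKETCH_NO_DISTANCE 0xFF   /* stand-in for NUMA_NO_DISTANCE */

/*
 * entry: row-major distance matrix, count * count bytes
 * from, to: proximity-domain indices into the matrix
 */
uint8_t sketch_slit_distance(const uint8_t *entry, unsigned int count,
                             unsigned int from, unsigned int to)
{
    uint8_t val;

    if ( from >= count || to >= count )
        return SKETCH_NO_DISTANCE;

    val = entry[count * from + to];

    /* 0xFF means unreachable; values 0-9 are not valid distances. */
    if ( val == 0xFF || val <= 9 )
        return SKETCH_NO_DISTANCE;

    return val;
}
```

The matrix need not be symmetric, so the order of the two
indices matters.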

Signed-off-by: Vijaya Kumar <Vijaya.Kumar@cavium.com>
---
 xen/arch/arm/numa.c | 20 +++++++++++++++++++-
 1 file changed, 19 insertions(+), 1 deletion(-)

diff --git a/xen/arch/arm/numa.c b/xen/arch/arm/numa.c
index 5c49347..50c3dea 100644
--- a/xen/arch/arm/numa.c
+++ b/xen/arch/arm/numa.c
@@ -23,6 +23,7 @@
 #include <xen/acpi.h>
 #include <asm/mm.h>
 #include <xen/numa.h>
+#include <xen/srat.h>
 #include <asm/acpi.h>
 #include <xen/errno.h>
 #include <xen/cpumask.h>
@@ -35,6 +36,7 @@ extern struct node nodes[MAX_NUMNODES] __initdata;
 extern int num_node_memblks;
 extern struct node node_memblk_range[NR_NODE_MEMBLKS];
 extern nodeid_t memblk_nodeid[NR_NODE_MEMBLKS];
+extern struct acpi_table_slit *__read_mostly acpi_slit;
 
 void __init numa_set_cpu_node(int cpu, unsigned long hwid)
 {
@@ -50,9 +52,24 @@ void __init numa_set_cpu_node(int cpu, unsigned long hwid)
 
 u8 __node_distance(nodeid_t a, nodeid_t b)
 {
-    if ( !node_distance )
+    unsigned index;
+    u8 slit_val;
+
+    if ( !node_distance && !acpi_slit )
         return a == b ? 10 : 20;
 
+    if ( acpi_slit )
+    {
+        index = acpi_slit->locality_count * node_to_pxm(a);
+        slit_val = acpi_slit->entry[index + node_to_pxm(b)];
+
+        /* ACPI defines 0xff as an unreachable node and 0-9 are undefined */
+        if ( (slit_val == 0xff) || (slit_val <= 9) )
+            return NUMA_NO_DISTANCE;
+        else
+            return slit_val;
+    }
+
     return _node_distance[a * MAX_NUMNODES + b];
 }
 
@@ -140,6 +157,7 @@ static void __init numa_dummy_init(unsigned long start_pfn,
     nodes_clear(node_online_map);
     node_set_online(0);
 
+    acpi_slit = NULL;
     for ( i = 0; i < NR_CPUS; i++ )
         numa_set_node(i, 0);
 
-- 
2.7.4



* [RFC PATCH v1 19/21] ARM: NUMA: Initialize ACPI NUMA
  2017-02-09 15:56 [RFC PATCH v1 00/21] ARM: Add Xen NUMA support vijay.kilari
                   ` (17 preceding siblings ...)
  2017-02-09 15:57 ` [RFC PATCH v1 18/21] ARM: NUMA: update node_distance with ACPI support vijay.kilari
@ 2017-02-09 15:57 ` vijay.kilari
  2017-03-02 17:25   ` Julien Grall
  2017-02-09 15:57 ` [RFC PATCH v1 20/21] ARM: NUMA: Enable CONFIG_NUMA config vijay.kilari
                   ` (3 subsequent siblings)
  22 siblings, 1 reply; 91+ messages in thread
From: vijay.kilari @ 2017-02-09 15:57 UTC (permalink / raw)
  To: julien.grall, sstabellini, andre.przywara, dario.faggioli
  Cc: xen-devel, Vijaya Kumar K

From: Vijaya Kumar K <Vijaya.Kumar@cavium.com>

Call ACPI NUMA initialization under CONFIG_ACPI_NUMA.

Signed-off-by: Vijaya Kumar <Vijaya.Kumar@cavium.com>
---
 xen/arch/arm/numa.c | 12 +++++++++++-
 xen/common/numa.c   |  6 ++++++
 2 files changed, 17 insertions(+), 1 deletion(-)

diff --git a/xen/arch/arm/numa.c b/xen/arch/arm/numa.c
index 50c3dea..1d6e16c 100644
--- a/xen/arch/arm/numa.c
+++ b/xen/arch/arm/numa.c
@@ -204,7 +204,17 @@ int __init numa_init(void)
     for ( i = 0; i < MAX_NUMNODES * 2; i++ )
         _node_distance[i] = 0;
 
-    ret = dt_numa_init();
+#ifdef CONFIG_ACPI_NUMA
+    if ( !acpi_disabled )
+    {
+        acpi_map_uid_to_mpidr();
+        ret = acpi_numa_init();
+        if ( ret || srat_disabled() )
+            goto no_numa;
+    }
+    else
+#endif
+        ret = dt_numa_init();
 
     if ( !ret )
         ret = numa_initmem_init();
diff --git a/xen/common/numa.c b/xen/common/numa.c
index 2f5266a..4c67d38 100644
--- a/xen/common/numa.c
+++ b/xen/common/numa.c
@@ -30,6 +30,7 @@
 #include <xen/sched.h>
 #include <xen/errno.h>
 #include <xen/softirq.h>
+#include <xen/srat.h>
 #include <asm/setup.h>
 
 static int numa_setup(char *s);
@@ -282,6 +283,11 @@ static __init int numa_setup(char *opt)
         numa_off = 1;
     if ( !strncmp(opt,"on",2) )
         numa_off = 0;
+    if ( !strncmp(opt,"noacpi",6) )
+    {
+        numa_off = 0;
+        acpi_numa = -1;
+    }
 
     return arch_numa_setup(opt);
 }
-- 
2.7.4
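
The option handling touched by this patch can be tried in isolation with
the sketch below. struct numa_opts and parse_numa_opt() are hypothetical
names; only the strncmp() prefixes and the values assigned to numa_off and
acpi_numa are taken from the hunk above, and the "off" case is assumed from
the context line that sets numa_off = 1.

```c
#include <string.h>

/* Hypothetical container for the two flags numa_setup() updates. */
struct numa_opts {
    int numa_off;    /* 1: NUMA disabled */
    int acpi_numa;   /* -1: do not use ACPI SRAT/SLIT information */
};

/* Returns 1 if the option was recognised, 0 otherwise. */
static int parse_numa_opt(const char *opt, struct numa_opts *o)
{
    if ( !strncmp(opt, "off", 3) )
    {
        o->numa_off = 1;
        return 1;
    }
    if ( !strncmp(opt, "on", 2) )
    {
        o->numa_off = 0;
        return 1;
    }
    if ( !strncmp(opt, "noacpi", 6) )
    {
        /* NUMA stays enabled, but the ACPI tables are ignored. */
        o->numa_off = 0;
        o->acpi_numa = -1;
        return 1;
    }
    return 0;
}
```

Because the comparison is prefix based, a string such as "onfoo" is also
accepted as "on"; the sketch keeps that behaviour rather than tightening it.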



^ permalink raw reply related	[flat|nested] 91+ messages in thread

* [RFC PATCH v1 20/21] ARM: NUMA: Enable CONFIG_NUMA config
  2017-02-09 15:56 [RFC PATCH v1 00/21] ARM: Add Xen NUMA support vijay.kilari
                   ` (18 preceding siblings ...)
  2017-02-09 15:57 ` [RFC PATCH v1 19/21] ARM: NUMA: Initialize ACPI NUMA vijay.kilari
@ 2017-02-09 15:57 ` vijay.kilari
  2017-03-02 17:27   ` Julien Grall
  2017-02-09 15:57 ` [RFC PATCH v1 21/21] ARM: NUMA: Enable CONFIG_ACPI_NUMA config vijay.kilari
                   ` (2 subsequent siblings)
  22 siblings, 1 reply; 91+ messages in thread
From: vijay.kilari @ 2017-02-09 15:57 UTC (permalink / raw)
  To: julien.grall, sstabellini, andre.przywara, dario.faggioli
  Cc: xen-devel, Vijaya Kumar K

From: Vijaya Kumar K <Vijaya.Kumar@cavium.com>

Enable CONFIG_NUMA to enable DT NUMA

Signed-off-by: Vijaya Kumar <Vijaya.Kumar@cavium.com>
---
 xen/arch/arm/Kconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/xen/arch/arm/Kconfig b/xen/arch/arm/Kconfig
index 2e023d1..fbc4f23 100644
--- a/xen/arch/arm/Kconfig
+++ b/xen/arch/arm/Kconfig
@@ -23,6 +23,7 @@ config ARM
 	select HAS_PASSTHROUGH
 	select HAS_PDX
 	select VIDEO
+	select NUMA
 
 config ARCH_DEFCONFIG
 	string
-- 
2.7.4



^ permalink raw reply related	[flat|nested] 91+ messages in thread

* [RFC PATCH v1 21/21] ARM: NUMA: Enable CONFIG_ACPI_NUMA config
  2017-02-09 15:56 [RFC PATCH v1 00/21] ARM: Add Xen NUMA support vijay.kilari
                   ` (19 preceding siblings ...)
  2017-02-09 15:57 ` [RFC PATCH v1 20/21] ARM: NUMA: Enable CONFIG_NUMA config vijay.kilari
@ 2017-02-09 15:57 ` vijay.kilari
  2017-03-02 17:31   ` Julien Grall
  2017-02-09 16:31 ` [RFC PATCH v1 00/21] ARM: Add Xen NUMA support Julien Grall
  2017-02-10 17:30 ` Konrad Rzeszutek Wilk
  22 siblings, 1 reply; 91+ messages in thread
From: vijay.kilari @ 2017-02-09 15:57 UTC (permalink / raw)
  To: julien.grall, sstabellini, andre.przywara, dario.faggioli
  Cc: xen-devel, Vijaya Kumar K

From: Vijaya Kumar K <Vijaya.Kumar@cavium.com>

Enable CONFIG_ACPI_NUMA to enable ACPI NUMA

Signed-off-by: Vijaya Kumar <Vijaya.Kumar@cavium.com>
---
 xen/arch/arm/Kconfig | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/xen/arch/arm/Kconfig b/xen/arch/arm/Kconfig
index fbc4f23..4b74eef 100644
--- a/xen/arch/arm/Kconfig
+++ b/xen/arch/arm/Kconfig
@@ -43,6 +43,10 @@ config ACPI
 	  Advanced Configuration and Power Interface (ACPI) support for Xen is
 	  an alternative to device tree on ARM64.
 
+config ACPI_NUMA
+	def_bool y
+	depends on ACPI
+
 config HAS_GICV3
 	bool
 
-- 
2.7.4



^ permalink raw reply related	[flat|nested] 91+ messages in thread

* Re: [RFC PATCH v1 02/21] x86: NUMA: Refactor NUMA code
  2017-02-09 15:56 ` [RFC PATCH v1 02/21] x86: NUMA: Refactor NUMA code vijay.kilari
@ 2017-02-09 16:11   ` Jan Beulich
  2017-02-20 11:41     ` Julien Grall
  2017-02-27 11:43     ` Vijay Kilari
  2017-02-20 12:37   ` Julien Grall
  1 sibling, 2 replies; 91+ messages in thread
From: Jan Beulich @ 2017-02-09 16:11 UTC (permalink / raw)
  To: vijay.kilari
  Cc: sstabellini, andre.przywara, dario.faggioli, Vijaya Kumar K,
	julien.grall, xen-devel

>>> On 09.02.17 at 16:56, <vijay.kilari@gmail.com> wrote:
> From: Vijaya Kumar K <Vijaya.Kumar@cavium.com>
> 
> Move common generic NUMA code to xen/common/numa.c from
> xen/arch/x86/numa.c. Also move generic code in header file
> xen/include/asm-x86/numa.h to xen/include/xen/numa.h
> 
> This common code can be re-used later for ARM.
> 
> Signed-off-by: Vijaya Kumar K <Vijaya.Kumar@cavium.com>

It would have been nice if you had Cc-ed the maintainers of the code
you're moving.

> --- /dev/null
> +++ b/xen/common/numa.c
> @@ -0,0 +1,342 @@
> +/*
> + * Common NUMA handling functions for x86 and arm.
> + * Original code extracted from arch/x86/numa.c
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation; either version 2 of the License, or
> + * (at your option) any later version.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with this program; If not, see <http://www.gnu.org/licenses/>.
> + */
> +
> +
> +#include <xen/mm.h>
> +#include <xen/string.h>
> +#include <xen/init.h>
> +#include <xen/ctype.h>
> +#include <xen/nodemask.h>
> +#include <xen/numa.h>
> +#include <xen/keyhandler.h>
> +#include <xen/time.h>
> +#include <xen/smp.h>
> +#include <xen/pfn.h>
> +#include <xen/sched.h>
> +#include <xen/errno.h>
> +#include <xen/softirq.h>
> +#include <asm/setup.h>

This last one would better not be included here.

> +struct node_data node_data[MAX_NUMNODES];
> +
> +/* Mapping from pdx to node id */
> +int memnode_shift;
> +unsigned long memnodemapsize;
> +u8 *memnodemap;
> +typeof(*memnodemap) _memnodemap[64];

Careful with removing any "static" please. For the last one in
particular you would need to change the name if it's really necessary
to make non-static. Even better would be though to keep it static
and provide suitable accessors.

Also please sanitize types as you're moving stuff: memnode_shift
looks like it really wants to be unsigned int, and u8 should really
be uint8_t (as we're trying to phase out u8 & Co). This also applies
to code further down.
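
A minimal sketch of the "keep it static and provide accessors" suggestion,
using the fixed-width types requested above. All function names here are
hypothetical, and the 64-entry size is simply copied from the quoted
declaration; this is not the interface the series ended up with.

```c
#include <stddef.h>
#include <stdint.h>

#define MEMNODEMAP_ENTRIES 64

/* Storage stays private to this translation unit. */
static uint8_t memnodemap_storage[MEMNODEMAP_ENTRIES];
static unsigned int memnode_shift;

void memnode_set_shift(unsigned int shift)
{
    memnode_shift = shift;
}

unsigned int memnode_get_shift(void)
{
    return memnode_shift;
}

/* Record the node for one map slot; returns 0 on success, -1 if out of range. */
int memnode_set(size_t idx, uint8_t nid)
{
    if ( idx >= MEMNODEMAP_ENTRIES )
        return -1;
    memnodemap_storage[idx] = nid;
    return 0;
}

/* Returns the node stored in the slot, or -1 if the index is out of range. */
int memnode_get(size_t idx)
{
    if ( idx >= MEMNODEMAP_ENTRIES )
        return -1;
    return memnodemap_storage[idx];
}
```

With accessors like these the array name never leaves the file, so the
naming concern about a non-static _memnodemap does not arise.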

Jan



^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [RFC PATCH v1 03/21] NUMA: Move arch specific NUMA code as common
  2017-02-09 15:56 ` [RFC PATCH v1 03/21] NUMA: Move arch specific NUMA code as common vijay.kilari
@ 2017-02-09 16:15   ` Jan Beulich
  2017-02-20 12:47   ` Julien Grall
  1 sibling, 0 replies; 91+ messages in thread
From: Jan Beulich @ 2017-02-09 16:15 UTC (permalink / raw)
  To: vijay.kilari
  Cc: sstabellini, andre.przywara, dario.faggioli, Vijaya Kumar K,
	julien.grall, xen-devel

>>> On 09.02.17 at 16:56, <vijay.kilari@gmail.com> wrote:
> --- a/xen/arch/x86/srat.c
> +++ b/xen/arch/x86/srat.c
> @@ -25,7 +25,7 @@ static struct acpi_table_slit *__read_mostly acpi_slit;
>  
>  static nodemask_t memory_nodes_parsed __initdata;
>  static nodemask_t processor_nodes_parsed __initdata;
> -static struct node nodes[MAX_NUMNODES] __initdata;
> +extern struct node nodes[MAX_NUMNODES] __initdata;

NAK to changes like this. Declarations belong in header files and
shouldn't have __init* annotations. But along the line of what I've
said for patch 2, it would be better if this wasn't made non-static.

> +__init int conflicting_memblks(u64 start, u64 end)

As you're moving code, again please sanitize stuff: Here, the __init
annotation is misplaced (it conventionally goes between type and
name). And of course the name is too generic to be kept as is.
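
To illustrate the two conventions mentioned above (no __init on
declarations, and __init placed between the return type and the function
name), here is a small sketch. __init is defined as empty so the fragment
builds outside Xen, and numa_memblks_conflict() together with the blks[]
table are hypothetical names and data, not the ones used in the series.

```c
#include <stddef.h>
#include <stdint.h>

/* Empty stand-in: in Xen, __init places the function in an init section. */
#define __init

/* What would live in a header: a plain declaration without __init. */
int numa_memblks_conflict(uint64_t start, uint64_t end);

/* Hypothetical table of already registered half-open ranges [start, end). */
static const struct {
    uint64_t start, end;
} blks[] = {
    { 0x1000, 0x2000 },
    { 0x4000, 0x8000 },
};

/*
 * Definition: the annotation sits between the return type and the name.
 * Returns the index of the first overlapping block, or -1 if none overlaps.
 */
int __init numa_memblks_conflict(uint64_t start, uint64_t end)
{
    size_t i;

    for ( i = 0; i < sizeof(blks) / sizeof(blks[0]); i++ )
    {
        if ( blks[i].start == blks[i].end )
            continue;                       /* skip empty entries */
        if ( blks[i].end > start && blks[i].start < end )
            return (int)i;
    }

    return -1;
}
```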

Jan



^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [RFC PATCH v1 00/21] ARM: Add Xen NUMA support
  2017-02-09 15:56 [RFC PATCH v1 00/21] ARM: Add Xen NUMA support vijay.kilari
                   ` (20 preceding siblings ...)
  2017-02-09 15:57 ` [RFC PATCH v1 21/21] ARM: NUMA: Enable CONFIG_ACPI_NUMA config vijay.kilari
@ 2017-02-09 16:31 ` Julien Grall
  2017-02-09 16:59   ` Vijay Kilari
  2017-02-10 17:30 ` Konrad Rzeszutek Wilk
  22 siblings, 1 reply; 91+ messages in thread
From: Julien Grall @ 2017-02-09 16:31 UTC (permalink / raw)
  To: vijay.kilari, sstabellini, andre.przywara, dario.faggioli
  Cc: xen-devel, Vijaya Kumar K

Hi Vijay,

On 02/09/2017 03:56 PM, vijay.kilari@gmail.com wrote:
> Note: Please use this patch series only for review.
> For testing, patch to boot allocator is required. Which will
> be sent outside this series.

Can you expand here? Is this patch NUMA specific?

Also, in a previous thread you mentioned an issue booting Xen with NUMA on 
Xen unstable. So how did you test it?

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [RFC PATCH v1 00/21] ARM: Add Xen NUMA support
  2017-02-09 16:31 ` [RFC PATCH v1 00/21] ARM: Add Xen NUMA support Julien Grall
@ 2017-02-09 16:59   ` Vijay Kilari
  0 siblings, 0 replies; 91+ messages in thread
From: Vijay Kilari @ 2017-02-09 16:59 UTC (permalink / raw)
  To: Julien Grall
  Cc: Andre Przywara, Dario Faggioli, Stefano Stabellini,
	Vijaya Kumar K, xen-devel

Hi Julien,

On Thu, Feb 9, 2017 at 10:01 PM, Julien Grall <julien.grall@arm.com> wrote:
> Hi Vijay,
>
> On 02/09/2017 03:56 PM, vijay.kilari@gmail.com wrote:
>>
>> Note: Please use this patch series only for review.
>> For testing, patch to boot allocator is required. Which will
>> be sent outside this series.
>
>
> Can you expand here? Is this patch NUMA specific?

Yes, it is NUMA specific; I reported it at the link below.
I have a workaround for this and need to prepare a patch. (I hope there is
no patch from anyone else for this issue so far.)

https://www.mail-archive.com/xen-devel@lists.xen.org/msg92093.html

>
> Also, in a previous thread you mentioned an issue booting Xen with NUMA on Xen
> unstable. So how did you test it?

The issue (a panic in page_alloc.c) that I reported is seen when I boot
plain unstable Xen on a NUMA board without any NUMA or ITS patches. It is
seen only on a NUMA board booted with DT.

I have tested this series with ACPI on the unstable version and with DT on
the 4.7 version.
Also, I have prepared a small ad hoc patch, shown below, in which I call
cpu_to_node() for all CPUs and print phys_to_nid() for a list of memory
ranges to check whether the node IDs are correct.

-----------------------------------------------------------------------------------------
diff --git a/xen/arch/arm/numa.c b/xen/arch/arm/numa.c
index d296523..d28e6bf 100644
--- a/xen/arch/arm/numa.c
+++ b/xen/arch/arm/numa.c
@@ -43,9 +43,11 @@ void __init numa_set_cpu_node(int cpu, unsigned long hwid)
     unsigned node;

     node = hwid >> 16 & 0xf;
+    printk("In %s cpu %d node %d\n",__func__, cpu, node);
     if ( !node_isset(node, numa_nodes_parsed) || node == MAX_NUMNODES )
         node = 0;

+    printk("In %s cpu %d node %d\n",__func__, cpu, node);
     numa_set_node(cpu, node);
     numa_add_cpu(cpu);
 }
@@ -245,3 +247,52 @@ int __init arch_numa_setup(char *opt)
 {
     return 1;
 }
+
+struct mem_list {
+    u64 start;
+    u64 end;
+};
+
+void numa_test(void)
+{
+    int i;
+
+    struct mem_list ml[] =
+    {
+        { 0x0000000001400000, 0x00000000fffecfff },
+        { 0x0000000100000000 , 0x0000000ff7ffffff },
+        { 0x0000000ff8000000 , 0x0000000ff801ffff },
+        { 0x0000000ff8020000 , 0x0000000fffa9cfff },
+        { 0x0000000fffa9d000 , 0x0000000fffffffff },
+        { 0x0000010000400000 , 0x0000010ff57b2fff },
+        { 0x0000010ff6618000 , 0x0000010ff6ff0fff },
+        { 0x0000010ff6ff1000 , 0x0000010ff724ffff },
+        { 0x0000010ff734b000 , 0x0000010ff73defff },
+        { 0x0000010ff73f0000 , 0x0000010ff73fbfff },
+        { 0x0000010ff73fc000 , 0x0000010ff74defff },
+        { 0x0000010ff74df000 , 0x0000010ff9718fff },
+        { 0x0000010ff97a2000 , 0x0000010ff97acfff },
+        { 0x0000010ff97ad000 , 0x0000010ff97b3fff },
+        { 0x0000010ff97b5000 , 0x0000010ff9813fff },
+        { 0x0000010ff9814000 , 0x0000010ff9819fff },
+        { 0x0000010ff981a000 , 0x0000010ff984afff },
+        { 0x0000010ff984c000 , 0x0000010ff9851fff },
+        { 0x0000010ff9935000 , 0x0000010ffaeb5fff },
+        { 0x0000010ffaff5000 , 0x0000010ffb008fff },
+        { 0x0000010ffb009000 , 0x0000010fffe28fff },
+        { 0x0000010fffe29000 , 0x0000010fffe70fff },
+        { 0x0000010fffe71000 , 0x0000010ffffb8fff },
+        { 0x0000010ffffff000 , 0x0000010fffffffff },
+    };
+
+    for ( i = 0; i < ARRAY_SIZE(ml); i++ )
+    {
+        printk("NUMA MEM TEST: start 0x%lx in node %d end 0x%lx in node %d\n",
+               ml[i].start, phys_to_nid(ml[i].start), ml[i].end, phys_to_nid(ml[i].end));
+    }
+
+    for ( i = 0; i < NR_CPUS; i++)
+    {
+       printk("NUMA CPU TEST: cpu %d in node %d\n", i, cpu_to_node(i));
+    }
+}
diff --git a/xen/arch/arm/setup.c b/xen/arch/arm/setup.c
index 5612ba6..0598672 100644
--- a/xen/arch/arm/setup.c
+++ b/xen/arch/arm/setup.c
@@ -698,6 +698,7 @@ void __init setup_cache(void)
     cacheline_bytes = 1U << (4 + (ccsid & 0x7));
 }

+extern void numa_test(void);
 /* C entry point for boot CPU */
 void __init start_xen(unsigned long boot_phys_offset,
                       unsigned long fdt_paddr,
@@ -825,6 +826,7 @@ void __init start_xen(unsigned long boot_phys_offset,
         }
     }

+    numa_test();
     printk("Brought up %ld CPUs\n", (long)num_online_cpus());
     /* TODO: smp_cpus_done(); */


>
> Cheers,
>
> --
> Julien Grall


^ permalink raw reply related	[flat|nested] 91+ messages in thread

* Re: [RFC PATCH v1 00/21] ARM: Add Xen NUMA support
  2017-02-09 15:56 [RFC PATCH v1 00/21] ARM: Add Xen NUMA support vijay.kilari
                   ` (21 preceding siblings ...)
  2017-02-09 16:31 ` [RFC PATCH v1 00/21] ARM: Add Xen NUMA support Julien Grall
@ 2017-02-10 17:30 ` Konrad Rzeszutek Wilk
  2017-03-02 14:49   ` Vijay Kilari
  22 siblings, 1 reply; 91+ messages in thread
From: Konrad Rzeszutek Wilk @ 2017-02-10 17:30 UTC (permalink / raw)
  To: vijay.kilari
  Cc: sstabellini, andre.przywara, dario.faggioli, Vijaya Kumar K,
	julien.grall, xen-devel

On Thu, Feb 09, 2017 at 09:26:52PM +0530, vijay.kilari@gmail.com wrote:
> From: Vijaya Kumar K <Vijaya.Kumar@cavium.com>
> 
> With this RFC patch series, NUMA support is added for arm platform.
> Both DT and ACPI based NUMA support is added.
> Only Xen is made aware of NUMA platform. Dom0 is awareness is not
> added.
> 
> As part of this series, the code under x86 architecture is
> reused by moving into common files.
> New files xen/common/numa.c and xen/commom/srat.c files are
> added which are common for both x86 and arm.
> 
> Patches 1 - 12 & 20 are for DT NUMA and 13 - 19 & 21 are for
> ACPI NUMA.
> 
> DT NUMA: The following major changes are performed
>  - Dropped numa-node-id information from Dom0 DT.
>    So that Dom0 devices make allocation from node 0 for
>    devmalloc requests.
>  - Memory DT is not deleted by EFI. It is exposed to Xen
>    to extract numa information.
>  - On NUMA failure, Fallback to Non-NUMA booting.
>    Assuming all the memory and CPU's are under node 0.
>  - CONFIG_NUMA is introduced.
> 
> ACPI NUMA:
>  - MADT is parsed before parsing SRAT table to extract
>    CPU_ID to MPIDR mapping info. In Linux, while parsing SRAT
>    table, MADT table is opened and extract MPIDR. However this
>    approach is not working on Xen it allows only one table to
>    be open at a time because when ACPI table is opened, Xen
>    maps to single region. So opening ACPI tables recursively
>    leads to overwriting of contents.

Huh? Why can't you use vmap APIs to map them?


^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [RFC PATCH v1 17/21] ARM: NUMA: Extract memory proximity from SRAT table
  2017-02-09 15:57 ` [RFC PATCH v1 17/21] ARM: NUMA: Extract memory " vijay.kilari
@ 2017-02-10 17:33   ` Konrad Rzeszutek Wilk
  2017-02-10 17:35     ` Konrad Rzeszutek Wilk
  0 siblings, 1 reply; 91+ messages in thread
From: Konrad Rzeszutek Wilk @ 2017-02-10 17:33 UTC (permalink / raw)
  To: vijay.kilari
  Cc: sstabellini, andre.przywara, dario.faggioli, Vijaya Kumar K,
	julien.grall, xen-devel

On Thu, Feb 09, 2017 at 09:27:09PM +0530, vijay.kilari@gmail.com wrote:
> From: Vijaya Kumar K <Vijaya.Kumar@cavium.com>
> 
> Register SRAT entry handler for type
> ACPI_SRAT_TYPE_MEMORY_AFFINITY to parse SRAT table
> and extract proximity for all memory mappings.

Why can't you use the arch/x86/srat.c code? Or move parts of that code to
common code?


^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [RFC PATCH v1 17/21] ARM: NUMA: Extract memory proximity from SRAT table
  2017-02-10 17:33   ` Konrad Rzeszutek Wilk
@ 2017-02-10 17:35     ` Konrad Rzeszutek Wilk
  2017-03-02 14:41       ` Vijay Kilari
  0 siblings, 1 reply; 91+ messages in thread
From: Konrad Rzeszutek Wilk @ 2017-02-10 17:35 UTC (permalink / raw)
  To: vijay.kilari
  Cc: sstabellini, andre.przywara, dario.faggioli, Vijaya Kumar K,
	julien.grall, xen-devel

On Fri, Feb 10, 2017 at 12:33:33PM -0500, Konrad Rzeszutek Wilk wrote:
> On Thu, Feb 09, 2017 at 09:27:09PM +0530, vijay.kilari@gmail.com wrote:
> > From: Vijaya Kumar K <Vijaya.Kumar@cavium.com>
> > 
> > Register SRAT entry handler for type
> > ACPI_SRAT_TYPE_MEMORY_AFFINITY to parse SRAT table
> > and extract proximity for all memory mappings.
> 
> Why can't you use the arch/x86/srat.c code? Or move parts of that code to
> common code?

And to be clear - I meant the 'acpi_numa_memory_affinity_init' function.


^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [RFC PATCH v1 01/21] ARM: NUMA: Add existing ARM numa code under CONFIG_NUMA
  2017-02-09 15:56 ` [RFC PATCH v1 01/21] ARM: NUMA: Add existing ARM numa code under CONFIG_NUMA vijay.kilari
@ 2017-02-20 11:39   ` Julien Grall
  2017-02-22  9:18     ` Vijay Kilari
  0 siblings, 1 reply; 91+ messages in thread
From: Julien Grall @ 2017-02-20 11:39 UTC (permalink / raw)
  To: vijay.kilari, sstabellini, andre.przywara, dario.faggioli
  Cc: xen-devel, nd, Vijaya Kumar K

Hello Vijay,

On 09/02/17 15:56, vijay.kilari@gmail.com wrote:
> From: Vijaya Kumar K <Vijaya.Kumar@cavium.com>
>
> Right not CONFIG_NUMA is not enabled for ARM and

s/not/now/

> existing code in asm-arm/numa.h is for !COFIG_NUMA.

s/COFIG_NUMA/CONFIG_NUMA/

> Hence put this code under #ifndef CONFIG_NUMA.
>
> This help to make this changes work when CONFIG_NUMA
> is not enabled.
>

Is there any reason to leave the choice of enabling/disabling NUMA to the user?

> Also define NODES_SHIFT macro for ARM.
>
> Signed-off-by: Vijaya Kumar K <Vijaya.Kumar@cavium.com>
> ---
>  xen/include/asm-arm/numa.h | 4 ++++
>  1 file changed, 4 insertions(+)
>
> diff --git a/xen/include/asm-arm/numa.h b/xen/include/asm-arm/numa.h
> index a2c1a34..a60c7eb 100644
> --- a/xen/include/asm-arm/numa.h
> +++ b/xen/include/asm-arm/numa.h
> @@ -3,6 +3,9 @@
>
>  typedef u8 nodeid_t;
>
> +#define NODES_SHIFT 2

Why 2?

> +
> +#ifndef CONFIG_NUMA
>  /* Fake one node for now. See also node_online_map. */
>  #define cpu_to_node(cpu) 0
>  #define node_to_cpumask(node)   (cpu_online_map)
> @@ -16,6 +19,7 @@ static inline __attribute__((pure)) nodeid_t phys_to_nid(paddr_t addr)
>  #define node_spanned_pages(nid) (total_pages)
>  #define node_start_pfn(nid) (pdx_to_pfn(frametable_base_pdx))
>  #define __node_distance(a, b) (20)
> +#endif /* CONFIG_NUMA */
>
>  static inline unsigned int arch_get_dma_bitsize(void)
>  {
>
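
For context on the NODES_SHIFT question above: assuming MAX_NUMNODES is
derived as (1 << NODES_SHIFT), which is how the common header is understood
to define it, a shift of 2 caps the system at four nodes. The sketch below
only demonstrates that arithmetic; node_id_valid() is a hypothetical helper.

```c
#include <stdint.h>

#define NODES_SHIFT  2
#define MAX_NUMNODES (1 << NODES_SHIFT)   /* assumed derivation: 4 nodes */

typedef uint8_t nodeid_t;

/* Every node ID, plus a 0xFF "no node" marker, must fit in nodeid_t. */
_Static_assert(MAX_NUMNODES <= 0xFF, "nodeid_t too small for MAX_NUMNODES");

/* Hypothetical helper: is 'node' within the compiled-in limit? */
static int node_id_valid(unsigned int node)
{
    return node < MAX_NUMNODES;
}
```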

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [RFC PATCH v1 02/21] x86: NUMA: Refactor NUMA code
  2017-02-09 16:11   ` Jan Beulich
@ 2017-02-20 11:41     ` Julien Grall
  2017-02-27 11:43     ` Vijay Kilari
  1 sibling, 0 replies; 91+ messages in thread
From: Julien Grall @ 2017-02-20 11:41 UTC (permalink / raw)
  To: Jan Beulich, vijay.kilari
  Cc: sstabellini, andre.przywara, dario.faggioli, Vijaya Kumar K,
	xen-devel, nd

Hi,

On 09/02/17 16:11, Jan Beulich wrote:
>>>> On 09.02.17 at 16:56, <vijay.kilari@gmail.com> wrote:
>> From: Vijaya Kumar K <Vijaya.Kumar@cavium.com>
>>
>> Move common generic NUMA code to xen/common/numa.c from
>> xen/arch/x86/numa.c. Also move generic code in header file
>> xen/include/asm-x86/numa.h to xen/include/xen/numa.h
>>
>> This common code can be re-used later for ARM.
>>
>> Signed-off-by: Vijaya Kumar K <Vijaya.Kumar@cavium.com>
>
> It would have been nice if you had Cc-ed the maintainers of the code
> you're moving.
>
>> --- /dev/null
>> +++ b/xen/common/numa.c
>> @@ -0,0 +1,342 @@
>> +/*
>> + * Common NUMA handling functions for x86 and arm.
>> + * Original code extracted from arch/x86/numa.c
>> + *
>> + * This program is free software; you can redistribute it and/or modify
>> + * it under the terms of the GNU General Public License as published by
>> + * the Free Software Foundation; either version 2 of the License, or
>> + * (at your option) any later version.
>> + *
>> + * This program is distributed in the hope that it will be useful,
>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>> + * GNU General Public License for more details.
>> + *
>> + * You should have received a copy of the GNU General Public License
>> + * along with this program; If not, see <http://www.gnu.org/licenses/>.
>> + */
>> +
>> +
>> +#include <xen/mm.h>
>> +#include <xen/string.h>
>> +#include <xen/init.h>
>> +#include <xen/ctype.h>
>> +#include <xen/nodemask.h>
>> +#include <xen/numa.h>
>> +#include <xen/keyhandler.h>
>> +#include <xen/time.h>
>> +#include <xen/smp.h>
>> +#include <xen/pfn.h>
>> +#include <xen/sched.h>
>> +#include <xen/errno.h>
>> +#include <xen/softirq.h>
>> +#include <asm/setup.h>
>
> This last one would better not be included here.
>
>> +struct node_data node_data[MAX_NUMNODES];
>> +
>> +/* Mapping from pdx to node id */
>> +int memnode_shift;
>> +unsigned long memnodemapsize;
>> +u8 *memnodemap;
>> +typeof(*memnodemap) _memnodemap[64];
>
> Careful with removing any "static" please. For the last one in
> particular you would need to change the name if it's really necessary
> to make non-static. Even better would be though to keep it static
> and provide suitable accessors.
>
> Also please sanitize types as you're moving stuff: memnode_shift
> looks like it really wants to be unsigned int, and u8 should really
> be uint8_t (as we're trying to phase out u8 & Co). This also applies
> to code further down.

I agree with the changes suggested. If possible I would prefer all those 
changes made in separate patches before the move. This would ease the 
review.

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [RFC PATCH v1 02/21] x86: NUMA: Refactor NUMA code
  2017-02-09 15:56 ` [RFC PATCH v1 02/21] x86: NUMA: Refactor NUMA code vijay.kilari
  2017-02-09 16:11   ` Jan Beulich
@ 2017-02-20 12:37   ` Julien Grall
  2017-02-22 10:04     ` Vijay Kilari
  1 sibling, 1 reply; 91+ messages in thread
From: Julien Grall @ 2017-02-20 12:37 UTC (permalink / raw)
  To: vijay.kilari, sstabellini, andre.przywara, dario.faggioli
  Cc: xen-devel, nd, Vijaya Kumar K

Hello Vijay,

On 09/02/17 15:56, vijay.kilari@gmail.com wrote:
> From: Vijaya Kumar K <Vijaya.Kumar@cavium.com>
>
> Move common generic NUMA code to xen/common/numa.c from
> xen/arch/x86/numa.c. Also move generic code in header file
> xen/include/asm-x86/numa.h to xen/include/xen/numa.h
>
> This common code can be re-used later for ARM.
>
> Signed-off-by: Vijaya Kumar K <Vijaya.Kumar@cavium.com>
> ---
>  xen/arch/x86/numa.c        | 299 ---------------------------------------
>  xen/common/Makefile        |   1 +
>  xen/common/numa.c          | 342 +++++++++++++++++++++++++++++++++++++++++++++
>  xen/include/asm-x86/numa.h |  47 -------
>  xen/include/xen/numa.h     |  54 +++++++
>  5 files changed, 397 insertions(+), 346 deletions(-)
>
> diff --git a/xen/arch/x86/numa.c b/xen/arch/x86/numa.c
> index 6f4d438..bc787e0 100644
> --- a/xen/arch/x86/numa.c
> +++ b/xen/arch/x86/numa.c
> @@ -25,27 +25,12 @@ custom_param("numa", numa_setup);
>  #define Dprintk(x...)
>  #endif
>
> -/* from proto.h */
> -#define round_up(x,y) ((((x)+(y))-1) & (~((y)-1)))
> -
> -struct node_data node_data[MAX_NUMNODES];
> -
> -/* Mapping from pdx to node id */
> -int memnode_shift;
> -static typeof(*memnodemap) _memnodemap[64];
> -unsigned long memnodemapsize;
> -u8 *memnodemap;
> -
> -nodeid_t cpu_to_node[NR_CPUS] __read_mostly = {
> -    [0 ... NR_CPUS-1] = NUMA_NO_NODE
> -};
>  /*
>   * Keep BIOS's CPU2node information, should not be used for memory allocaion
>   */
>  nodeid_t apicid_to_node[MAX_LOCAL_APIC] = {
>      [0 ... MAX_LOCAL_APIC-1] = NUMA_NO_NODE
>  };
> -cpumask_t node_to_cpumask[MAX_NUMNODES] __read_mostly;
>
>  nodemask_t __read_mostly node_online_map = { { [0] = 1UL } };

Why is this variable not moved?

[...]

>  void __init numa_init_array(void)

Same question for this function.

>  {
>      int rr, i;
> @@ -288,16 +145,6 @@ void __init numa_initmem_init(unsigned long start_pfn, unsigned long end_pfn)
>                      (u64)end_pfn << PAGE_SHIFT);
>  }
>
> -void numa_add_cpu(int cpu)
> -{
> -    cpumask_set_cpu(cpu, &node_to_cpumask[cpu_to_node(cpu)]);
> -}
> -
> -void numa_set_node(int cpu, nodeid_t node)
> -{
> -    cpu_to_node[cpu] = node;
> -}
> -
>  /* [numa=off] */
>  static __init int numa_setup(char *opt)

Same question here. Everything in numa_setup and the fake NUMA looks
valid for ARM too.

[....]

> diff --git a/xen/common/Makefile b/xen/common/Makefile
> index 0fed30b..c1bd2ff 100644
> --- a/xen/common/Makefile
> +++ b/xen/common/Makefile
> @@ -63,6 +63,7 @@ obj-y += wait.o
>  obj-bin-y += warning.init.o
>  obj-$(CONFIG_XENOPROF) += xenoprof.o
>  obj-y += xmalloc_tlsf.o
> +obj-y += numa.o

This needs to be gated with CONFIG_NUMA.

>
>  obj-bin-$(CONFIG_X86) += $(foreach n,decompress bunzip2 unxz unlzma unlzo unlz4 earlycpio,$(n).init.o)
>
> diff --git a/xen/common/numa.c b/xen/common/numa.c
> new file mode 100644
> index 0000000..59dcb63
> --- /dev/null
> +++ b/xen/common/numa.c
> @@ -0,0 +1,342 @@
> +/*
> + * Common NUMA handling functions for x86 and arm.
> + * Original code extracted from arch/x86/numa.c
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation; either version 2 of the License, or
> + * (at your option) any later version.

Xen is GPLv2 only. Please update to:

"This program is free software; you can redistribute it and/or
modify it under the terms and conditions of the GNU General Public
License, version 2, as published by the Free Software Foundation."

> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with this program; If not, see <http://www.gnu.org/licenses/>.
> + */
> +
> +
> +#include <xen/mm.h>
> +#include <xen/string.h>
> +#include <xen/init.h>
> +#include <xen/ctype.h>
> +#include <xen/nodemask.h>
> +#include <xen/numa.h>
> +#include <xen/keyhandler.h>
> +#include <xen/time.h>
> +#include <xen/smp.h>
> +#include <xen/pfn.h>
> +#include <xen/sched.h>
> +#include <xen/errno.h>
> +#include <xen/softirq.h>
> +#include <asm/setup.h>
> +
> +struct node_data node_data[MAX_NUMNODES];
> +
> +/* Mapping from pdx to node id */

Looking at this comment, it looks like NUMA support should depend on 
HAS_PDX, as PDX is not something that may be available on all architectures.
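
To make the pdx dependency concrete, here is a rough sketch of the kind of
shift-based lookup that phys_to_nid() performs on memnodemap. The names
paddr_to_pdx_sketch() and lookup_node() are hypothetical, the page size is
assumed to be 4 KiB, and the pdx conversion is reduced to a plain page-frame
shift, whereas Xen's real paddr_to_pdx() also compresses address holes.

```c
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT   12     /* assumed 4 KiB pages */
#define NUMA_NO_NODE 0xFF   /* assumed "no node" marker */

/* Placeholder for paddr_to_pdx(): no hole compression in this sketch. */
static uint64_t paddr_to_pdx_sketch(uint64_t addr)
{
    return addr >> PAGE_SHIFT;
}

/*
 * Map a physical address to a node: each map slot covers 2^shift pages.
 * Out-of-range addresses yield NUMA_NO_NODE instead of reading past the map.
 */
static uint8_t lookup_node(const uint8_t *map, size_t map_size,
                           unsigned int shift, uint64_t addr)
{
    uint64_t idx = paddr_to_pdx_sketch(addr) >> shift;

    return idx < map_size ? map[idx] : NUMA_NO_NODE;
}
```

With shift = 18 each slot covers 1 GiB, so a four-entry map { 0, 0, 1, 1 }
places the first 2 GiB on node 0 and the next 2 GiB on node 1.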

> +int memnode_shift;
> +unsigned long memnodemapsize;
> +u8 *memnodemap;
> +typeof(*memnodemap) _memnodemap[64];
> +
> +nodeid_t cpu_to_node[NR_CPUS] __read_mostly = {
> +    [0 ... NR_CPUS-1] = NUMA_NO_NODE
> +};
> +
> +cpumask_t node_to_cpumask[MAX_NUMNODES] __read_mostly;

[...]

> +void numa_add_cpu(int cpu)
> +{
> +    cpumask_set_cpu(cpu, &node_to_cpumask[cpu_to_node(cpu)]);
> +}
> +
> +void numa_set_node(int cpu, nodeid_t node)
> +{
> +    cpu_to_node[cpu] = node;
> +}
> +
> +EXPORT_SYMBOL(node_to_cpumask);
> +EXPORT_SYMBOL(memnode_shift);
> +EXPORT_SYMBOL(memnodemap);
> +EXPORT_SYMBOL(node_data);

Those 4 lines are not part of the original code. Why did you add them?

To ease review, I would like to see this patch split into multiple ones:
	- several small patches to prepare the code (export functions, change the types...)
	- a patch that moves the code and only moves it, with no changes at all.
[...]

> diff --git a/xen/include/asm-x86/numa.h b/xen/include/asm-x86/numa.h
> index 2479238..61bcd8e 100644
> --- a/xen/include/asm-x86/numa.h
> +++ b/xen/include/asm-x86/numa.h
> @@ -17,64 +17,17 @@ extern cpumask_t     node_to_cpumask[];
>  #define node_to_first_cpu(node)  (__ffs(node_to_cpumask[node]))
>  #define node_to_cpumask(node)    (node_to_cpumask[node])
>
> -struct node {
> -	u64 start,end;
> -};
> -
> -extern int compute_hash_shift(struct node *nodes, int numnodes,
> -			      nodeid_t *nodeids);
>  extern nodeid_t pxm_to_node(unsigned int pxm);
>
>  #define ZONE_ALIGN (1UL << (MAX_ORDER+PAGE_SHIFT))
> -#define VIRTUAL_BUG_ON(x)
>
> -extern void numa_add_cpu(int cpu);
>  extern void numa_init_array(void);
>  extern bool_t numa_off;
>
> -

Spurious change.

>  extern int srat_disabled(void);
> -extern void numa_set_node(int cpu, nodeid_t node);
> -extern nodeid_t setup_node(unsigned int pxm);
>  extern void srat_detect_node(int cpu);
>
> -extern void setup_node_bootmem(nodeid_t nodeid, u64 start, u64 end);
>  extern nodeid_t apicid_to_node[];
> -extern void init_cpu_to_node(void);
> -
> -static inline void clear_node_cpumask(int cpu)
> -{
> -	cpumask_clear_cpu(cpu, &node_to_cpumask[cpu_to_node(cpu)]);
> -}
> -
> -/* Simple perfect hash to map pdx to node numbers */
> -extern int memnode_shift;
> -extern unsigned long memnodemapsize;
> -extern u8 *memnodemap;
> -
> -struct node_data {
> -    unsigned long node_start_pfn;
> -    unsigned long node_spanned_pages;
> -};
> -
> -extern struct node_data node_data[];
> -
> -static inline __attribute__((pure)) nodeid_t phys_to_nid(paddr_t addr)
> -{
> -	nodeid_t nid;
> -	VIRTUAL_BUG_ON((paddr_to_pdx(addr) >> memnode_shift) >= memnodemapsize);
> -	nid = memnodemap[paddr_to_pdx(addr) >> memnode_shift];
> -	VIRTUAL_BUG_ON(nid >= MAX_NUMNODES || !node_data[nid]);
> -	return nid;
> -}
> -
> -#define NODE_DATA(nid)		(&(node_data[nid]))
> -
> -#define node_start_pfn(nid)	(NODE_DATA(nid)->node_start_pfn)
> -#define node_spanned_pages(nid)	(NODE_DATA(nid)->node_spanned_pages)
> -#define node_end_pfn(nid)       (NODE_DATA(nid)->node_start_pfn + \
> -				 NODE_DATA(nid)->node_spanned_pages)
> -
>  extern int valid_numa_range(u64 start, u64 end, nodeid_t node);
>
>  void srat_parse_regions(u64 addr);
> diff --git a/xen/include/xen/numa.h b/xen/include/xen/numa.h
> index 7aef1a8..dd33c92 100644
> --- a/xen/include/xen/numa.h
> +++ b/xen/include/xen/numa.h
> @@ -18,4 +18,58 @@
>    (((d)->vcpu != NULL && (d)->vcpu[0] != NULL) \
>     ? vcpu_to_node((d)->vcpu[0]) : NUMA_NO_NODE)
>
> +struct node {
> +	u64 start,end;

This contains a hard tab. It looks like asm-x86/numa.h has a mix of hard 
tabs and soft tabs. Can you send a clean-up patch to drop them first?

> +};
> +
> +struct node_data {
> +    unsigned long node_start_pfn;
> +    unsigned long node_spanned_pages;
> +    nodeid_t      node_id;
> +};
> +
> +#define NODE_DATA(nid)		(&(node_data[nid]))

Hard tab.

> +#define VIRTUAL_BUG_ON(x)

What is the point of having a BUG_ON that is a nop?

> +
> +#ifdef CONFIG_NUMA
> +extern void init_cpu_to_node(void);
> +
> +static inline void clear_node_cpumask(int cpu)
> +{
> +	cpumask_clear_cpu(cpu, &node_to_cpumask[cpu_to_node(cpu)]);

Hard tab.

> +}

You move this function to xen/numa.h but it is never used in Xen code. 
It would be better to drop it.

> +
> +/* Simple perfect hash to map pdx to node numbers */
> +extern int memnode_shift;
> +extern unsigned long memnodemapsize;
> +extern u8 *memnodemap;
> +extern typeof(*memnodemap) _memnodemap[];
> +
> +extern struct node_data node_data[];
> +
> +static inline __attribute__((pure)) nodeid_t phys_to_nid(paddr_t addr)
> +{
> +	nodeid_t nid;
> +	VIRTUAL_BUG_ON((paddr_to_pdx(addr) >> memnode_shift) >= memnodemapsize);
> +	nid = memnodemap[paddr_to_pdx(addr) >> memnode_shift];
> +	VIRTUAL_BUG_ON(nid >= MAX_NUMNODES || !node_data[nid]);
> +	return nid;

Hard tab in all this function.
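
For reference, the lookup in the quoted function can be sketched standalone as below. This is only an illustration, not part of the patch: pdx_to_nid(), the map contents and the shift value are made up for the example, and real assertions stand in for the no-op VIRTUAL_BUG_ON.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_NUMNODES 4

typedef uint8_t nodeid_t;

/*
 * Hypothetical stand-ins for the globals declared in the quoted header.
 * With a shift of 4, each map slot covers 2^4 = 16 frame indexes.
 */
static int memnode_shift = 4;
static nodeid_t memnodemap[8] = { 0, 0, 1, 1, 2, 2, 3, 3 };
static unsigned long memnodemapsize = 8;

/* Map a frame index to its node: one shift and one array lookup. */
static nodeid_t pdx_to_nid(unsigned long pdx)
{
    unsigned long slot = pdx >> memnode_shift;

    /* Checked here, unlike the no-op VIRTUAL_BUG_ON in the quoted code. */
    assert(slot < memnodemapsize);

    nodeid_t nid = memnodemap[slot];

    assert(nid < MAX_NUMNODES);

    return nid;
}
```

With this made-up map, frame indexes 0-31 land on node 0, 32-63 on node 1, and so on.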

> +}
> +
> +#define node_start_pfn(nid)	(NODE_DATA(nid)->node_start_pfn)
> +#define node_spanned_pages(nid)	(NODE_DATA(nid)->node_spanned_pages)
> +#define node_end_pfn(nid)       (NODE_DATA(nid)->node_start_pfn + \
> +				 NODE_DATA(nid)->node_spanned_pages)

Same for those 3 macros.

> +
> +#else
> +#define init_cpu_to_node() do {} while (0)

Please use static inline void init_cpu_to_node(void) {}
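
A minimal sketch of the pattern, assuming CONFIG_NUMA is left undefined so the stub is the one compiled. boot_sequence() and boot_steps are hypothetical names used only for the example.

```c
#include <assert.h>

/*
 * CONFIG_NUMA is deliberately left undefined in this sketch, so the
 * empty stub below is what gets compiled. Defining it would require a
 * real init_cpu_to_node() to be linked in.
 */
#ifdef CONFIG_NUMA
void init_cpu_to_node(void);
#else
/* Unlike a do {} while (0) macro, the stub keeps normal type checking. */
static inline void init_cpu_to_node(void) {}
#endif

static int boot_steps;

/* Hypothetical caller: no #ifdef is needed at the call site. */
static void boot_sequence(void)
{
    init_cpu_to_node();
    boot_steps++;
}
```

The caller looks the same in both configurations, and a call with wrong arguments fails to compile even when NUMA is disabled.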

> +#define clear_node_cpumask(cpu) do {} while (0)

No point in having this one.

> +#endif /* CONFIG_NUMA */
> +
> +extern void numa_add_cpu(int cpu);
> +extern nodeid_t setup_node(unsigned int pxm);
> +extern void numa_set_node(int cpu, nodeid_t node);
> +extern void setup_node_bootmem(nodeid_t nodeid, u64 start, u64 end);

I am not sure I understand why those functions are not protected by 
#ifdef CONFIG_NUMA.

> +extern int compute_hash_shift(struct node *nodes, int numnodes,

The function name is a bit too generic.

> +			      nodeid_t *nodeids);
>  #endif /* _XEN_NUMA_H */
>

Cheers,

-- 
Julien Grall

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [RFC PATCH v1 03/21] NUMA: Move arch specific NUMA code as common
  2017-02-09 15:56 ` [RFC PATCH v1 03/21] NUMA: Move arch specific NUMA code as common vijay.kilari
  2017-02-09 16:15   ` Jan Beulich
@ 2017-02-20 12:47   ` Julien Grall
  2017-02-22 10:08     ` Vijay Kilari
  1 sibling, 1 reply; 91+ messages in thread
From: Julien Grall @ 2017-02-20 12:47 UTC (permalink / raw)
  To: vijay.kilari, sstabellini, andre.przywara, dario.faggioli
  Cc: xen-devel, nd, Vijaya Kumar K

Hello Vijay,

On 09/02/17 15:56, vijay.kilari@gmail.com wrote:
> From: Vijaya Kumar K <Vijaya.Kumar@cavium.com>
>
> Move some common numa code from xen/arch/x86/srat.c
> to xen/common/numa.c
>
> Signed-off-by: Vijaya Kumar K <Vijaya.Kumar@cavium.com>
> ---
>  xen/arch/x86/srat.c        | 54 ++++-----------------------------------------
>  xen/common/numa.c          | 55 ++++++++++++++++++++++++++++++++++++++++++++++
>  xen/include/asm-x86/acpi.h |  1 -
>  xen/include/asm-x86/numa.h |  1 -
>  xen/include/xen/numa.h     |  5 ++++-
>  5 files changed, 63 insertions(+), 53 deletions(-)
>
> diff --git a/xen/arch/x86/srat.c b/xen/arch/x86/srat.c
> index d86783e..58dee09 100644
> --- a/xen/arch/x86/srat.c
> +++ b/xen/arch/x86/srat.c
> @@ -25,7 +25,7 @@ static struct acpi_table_slit *__read_mostly acpi_slit;
>
>  static nodemask_t memory_nodes_parsed __initdata;
>  static nodemask_t processor_nodes_parsed __initdata;
> -static struct node nodes[MAX_NUMNODES] __initdata;
> +extern struct node nodes[MAX_NUMNODES] __initdata;
>
>  struct pxm2node {
>  	unsigned pxm;
> @@ -36,9 +36,9 @@ static struct pxm2node __read_mostly pxm2node[MAX_NUMNODES] =
>
>  static unsigned node_to_pxm(nodeid_t n);
>
> -static int num_node_memblks;
> -static struct node node_memblk_range[NR_NODE_MEMBLKS];
> -static nodeid_t memblk_nodeid[NR_NODE_MEMBLKS];
> +extern int num_node_memblks;
> +extern struct node node_memblk_range[NR_NODE_MEMBLKS];
> +extern nodeid_t memblk_nodeid[NR_NODE_MEMBLKS];
>  static __initdata DECLARE_BITMAP(memblk_hotplug, NR_NODE_MEMBLKS);
>
>  static inline bool_t node_found(unsigned idx, unsigned pxm)
> @@ -103,52 +103,6 @@ nodeid_t setup_node(unsigned pxm)
>  	return node;
>  }
>
> -int valid_numa_range(u64 start, u64 end, nodeid_t node)
> -{
> -	int i;
> -
> -	for (i = 0; i < num_node_memblks; i++) {
> -		struct node *nd = &node_memblk_range[i];
> -
> -		if (nd->start <= start && nd->end > end &&
> -			memblk_nodeid[i] == node )
> -			return 1;
> -	}
> -
> -	return 0;
> -}
> -
> -static __init int conflicting_memblks(u64 start, u64 end)
> -{
> -	int i;
> -
> -	for (i = 0; i < num_node_memblks; i++) {
> -		struct node *nd = &node_memblk_range[i];
> -		if (nd->start == nd->end)
> -			continue;
> -		if (nd->end > start && nd->start < end)
> -			return i;
> -		if (nd->end == end && nd->start == start)
> -			return i;
> -	}
> -	return -1;
> -}
> -
> -static __init void cutoff_node(int i, u64 start, u64 end)
> -{
> -	struct node *nd = &nodes[i];
> -	if (nd->start < start) {
> -		nd->start = start;
> -		if (nd->end < nd->start)
> -			nd->start = nd->end;
> -	}
> -	if (nd->end > end) {
> -		nd->end = end;
> -		if (nd->start > nd->end)
> -			nd->start = nd->end;
> -	}
> -}
> -
>  static __init void bad_srat(void)
>  {
>  	int i;
> diff --git a/xen/common/numa.c b/xen/common/numa.c
> index 59dcb63..13f147c 100644
> --- a/xen/common/numa.c
> +++ b/xen/common/numa.c
> @@ -46,6 +46,61 @@ nodeid_t cpu_to_node[NR_CPUS] __read_mostly = {
>
>  cpumask_t node_to_cpumask[MAX_NUMNODES] __read_mostly;
>
> +int num_node_memblks;
> +struct node node_memblk_range[NR_NODE_MEMBLKS];
> +nodeid_t memblk_nodeid[NR_NODE_MEMBLKS];
> +struct node nodes[MAX_NUMNODES] __initdata;
> +
> +int valid_numa_range(u64 start, u64 end, nodeid_t node)

I am not sure why you move this code to common code when it is not even 
used in your series.

Furthermore, please use paddr_t rather than u64.
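
Something along these lines, as a standalone sketch. The typedefs and the table contents are assumptions made so the example compiles on its own; in Xen, paddr_t and nodeid_t come from the existing headers.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed typedefs for the sketch only. */
typedef uint64_t paddr_t;
typedef uint8_t nodeid_t;

struct node {
    paddr_t start, end;
};

#define NR_NODE_MEMBLKS 4

/* Made-up memory blocks: 1GB on node 0 followed by 1GB on node 1. */
static int num_node_memblks = 2;
static struct node node_memblk_range[NR_NODE_MEMBLKS] = {
    { 0x00000000, 0x40000000 },
    { 0x40000000, 0x80000000 },
};
static nodeid_t memblk_nodeid[NR_NODE_MEMBLKS] = { 0, 1 };

/* Same checks as the quoted function, taking paddr_t instead of u64. */
static int valid_numa_range(paddr_t start, paddr_t end, nodeid_t node)
{
    for ( int i = 0; i < num_node_memblks; i++ )
    {
        const struct node *nd = &node_memblk_range[i];

        if ( nd->start <= start && nd->end > end &&
             memblk_nodeid[i] == node )
            return 1;
    }

    return 0;
}
```

Note that the comparison on the end address is strict, as in the quoted code, so a range ending exactly at the block end is rejected.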

> +{
> +#ifdef CONFIG_NUMA

common/numa.c should really not be compiled at all for configurations not 
supporting NUMA. In other words, I really don't want to see #ifdefery in 
common/numa.c.

> +    int i;
> +
> +    for (i = 0; i < num_node_memblks; i++) {

I know you are moving code around, but fixing the coding style beforehand 
would have been appreciated.

> +        struct node *nd = &node_memblk_range[i];
> +
> +        if (nd->start <= start && nd->end > end &&
> +            memblk_nodeid[i] == node )
> +            return 1;
> +    }
> +
> +    return 0;
> +#else
> +    return 1;
> +#endif
> +}
> +
> +__init int conflicting_memblks(u64 start, u64 end)

Ditto for u64.

> +{
> +    int i;
> +
> +    for (i = 0; i < num_node_memblks; i++) {
> +        struct node *nd = &node_memblk_range[i];
> +        if (nd->start == nd->end)
> +            continue;
> +        if (nd->end > start && nd->start < end)
> +            return i;
> +        if (nd->end == end && nd->start == start)
> +            return i;
> +    }
> +    return -1;
> +}
> +
> +__init void cutoff_node(int i, u64 start, u64 end)

Same remark as above.
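
As a sketch of the same change for this helper: the version below takes a pointer to the node rather than an index into nodes[], which is an assumption made only to keep the example self-contained, and the typedef is likewise assumed.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed typedef for the sketch only. */
typedef uint64_t paddr_t;

struct node {
    paddr_t start, end;
};

/*
 * Clamp a node to [start, end]. A node lying entirely outside the
 * window collapses to an empty range (start == end).
 */
static void cutoff_node(struct node *nd, paddr_t start, paddr_t end)
{
    if ( nd->start < start )
    {
        nd->start = start;
        if ( nd->end < nd->start )
            nd->start = nd->end;
    }

    if ( nd->end > end )
    {
        nd->end = end;
        if ( nd->start > nd->end )
            nd->start = nd->end;
    }
}
```

The logic is unchanged from the quoted code; only the types and the brace style differ.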

> +{
> +    struct node *nd = &nodes[i];
> +    if (nd->start < start) {
> +        nd->start = start;
> +        if (nd->end < nd->start)
> +            nd->start = nd->end;
> +    }
> +    if (nd->end > end) {
> +        nd->end = end;
> +        if (nd->start > nd->end)
> +            nd->start = nd->end;
> +    }
> +}
> +
>  /*
>   * Given a shift value, try to populate memnodemap[]
>   * Returns :
> diff --git a/xen/include/asm-x86/acpi.h b/xen/include/asm-x86/acpi.h
> index d36bee9..f1a8e9d 100644
> --- a/xen/include/asm-x86/acpi.h
> +++ b/xen/include/asm-x86/acpi.h
> @@ -106,7 +106,6 @@ extern void acpi_reserve_bootmem(void);
>
>  extern s8 acpi_numa;
>  extern int acpi_scan_nodes(u64 start, u64 end);
> -#define NR_NODE_MEMBLKS (MAX_NUMNODES*2)
>
>  #ifdef CONFIG_ACPI_SLEEP
>
> diff --git a/xen/include/asm-x86/numa.h b/xen/include/asm-x86/numa.h
> index 61bcd8e..df1f7d5 100644
> --- a/xen/include/asm-x86/numa.h
> +++ b/xen/include/asm-x86/numa.h
> @@ -28,7 +28,6 @@ extern int srat_disabled(void);
>  extern void srat_detect_node(int cpu);
>
>  extern nodeid_t apicid_to_node[];
> -extern int valid_numa_range(u64 start, u64 end, nodeid_t node);
>
>  void srat_parse_regions(u64 addr);
>  extern u8 __node_distance(nodeid_t a, nodeid_t b);
> diff --git a/xen/include/xen/numa.h b/xen/include/xen/numa.h
> index dd33c92..810f742 100644
> --- a/xen/include/xen/numa.h
> +++ b/xen/include/xen/numa.h
> @@ -11,7 +11,7 @@
>  #define NUMA_NO_DISTANCE 0xFF
>
>  #define MAX_NUMNODES    (1 << NODES_SHIFT)
> -

Spurious change.

> +#define NR_NODE_MEMBLKS (MAX_NUMNODES*2)
>  #define vcpu_to_node(v) (cpu_to_node((v)->processor))
>
>  #define domain_to_node(d) \
> @@ -66,6 +66,9 @@ static inline __attribute__((pure)) nodeid_t phys_to_nid(paddr_t addr)
>  #define clear_node_cpumask(cpu) do {} while (0)
>  #endif /* CONFIG_NUMA */
>
> +extern int valid_numa_range(u64 start, u64 end, nodeid_t node);
> +extern int conflicting_memblks(u64 start, u64 end);
> +extern void cutoff_node(int i, u64 start, u64 end);
>  extern void numa_add_cpu(int cpu);
>  extern nodeid_t setup_node(unsigned int pxm);
>  extern void numa_set_node(int cpu, nodeid_t node);
>

Regards,

-- 
Julien Grall


* Re: [RFC PATCH v1 04/21] NUMA: Refactor generic and arch specific code of numa_setup
  2017-02-09 15:56 ` [RFC PATCH v1 04/21] NUMA: Refactor generic and arch specific code of numa_setup vijay.kilari
@ 2017-02-20 13:39   ` Julien Grall
  2017-02-22 10:27     ` Vijay Kilari
  0 siblings, 1 reply; 91+ messages in thread
From: Julien Grall @ 2017-02-20 13:39 UTC (permalink / raw)
  To: vijay.kilari, sstabellini, andre.przywara, dario.faggioli
  Cc: xen-devel, nd, Vijaya Kumar K

Hello Vijay,

On 09/02/17 15:56, vijay.kilari@gmail.com wrote:
> From: Vijaya Kumar K <Vijaya.Kumar@cavium.com>
>
> numa_setup() contains generic and arch specific code.
> Split numa_setup() and move architecture specific code
> under arch_numa_setup().
>
> Signed-off-by: Vijaya Kumar K <Vijaya.Kumar@cavium.com>
> ---
>  xen/arch/arm/Makefile      |  1 +
>  xen/arch/arm/numa.c        | 28 ++++++++++++++++++++++++++++
>  xen/arch/x86/numa.c        | 11 +----------
>  xen/common/numa.c          | 14 ++++++++++++++
>  xen/include/asm-arm/numa.h |  9 ++++++++-
>  xen/include/asm-x86/numa.h |  2 +-
>  xen/include/xen/numa.h     |  1 +
>  7 files changed, 54 insertions(+), 12 deletions(-)
>
> diff --git a/xen/arch/arm/Makefile b/xen/arch/arm/Makefile
> index 7afb8a3..b5d7a19 100644
> --- a/xen/arch/arm/Makefile
> +++ b/xen/arch/arm/Makefile
> @@ -49,6 +49,7 @@ obj-y += vm_event.o
>  obj-y += vtimer.o
>  obj-y += vpsci.o
>  obj-y += vuart.o
> +obj-$(CONFIG_NUMA) += numa.o
>
>  #obj-bin-y += ....o
>
> diff --git a/xen/arch/arm/numa.c b/xen/arch/arm/numa.c
> new file mode 100644
> index 0000000..59d09c7
> --- /dev/null
> +++ b/xen/arch/arm/numa.c
> @@ -0,0 +1,28 @@
> +/*
> + * ARM NUMA Implementation
> + *
> + * Copyright (C) 2016 - Cavium Inc.
> + * Vijaya Kumar K <vijaya.kumar@cavium.com>
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation; either version 2 of the License, or
> + * (at your option) any later version.

The copyright is wrong.

> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + */
> +
> +#include <xen/init.h>
> +#include <xen/ctype.h>
> +#include <xen/mm.h>
> +#include <xen/nodemask.h>
> +#include <asm/mm.h>
> +#include <xen/numa.h>
> +
> +int __init arch_numa_setup(char *opt)
> +{
> +    return 1;
> +}
> diff --git a/xen/arch/x86/numa.c b/xen/arch/x86/numa.c
> index bc787e0..28d1891 100644
> --- a/xen/arch/x86/numa.c
> +++ b/xen/arch/x86/numa.c
> @@ -18,9 +18,6 @@
>  #include <xen/sched.h>
>  #include <xen/softirq.h>
>
> -static int numa_setup(char *s);
> -custom_param("numa", numa_setup);
> -
>  #ifndef Dprintk
>  #define Dprintk(x...)
>  #endif
> @@ -34,7 +31,6 @@ nodeid_t apicid_to_node[MAX_LOCAL_APIC] = {
>
>  nodemask_t __read_mostly node_online_map = { { [0] = 1UL } };
>
> -bool_t numa_off = 0;
>  s8 acpi_numa = 0;
>
>  int srat_disabled(void)
> @@ -145,13 +141,8 @@ void __init numa_initmem_init(unsigned long start_pfn, unsigned long end_pfn)
>                      (u64)end_pfn << PAGE_SHIFT);
>  }
>
> -/* [numa=off] */
> -static __init int numa_setup(char *opt)
> +int __init arch_numa_setup(char *opt)

I don't understand why you split numa_setup. All the options look valid 
for ARM.
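
To illustrate, a single handler in common code could look roughly like the sketch below. The return convention (0 when the option is recognised, -1 otherwise) is an assumption for the example and not what the patch or the custom_param() machinery defines.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

static bool numa_off;

/*
 * One handler shared by all architectures. The prefix matches mirror
 * the quoted code, so "offline" would also be accepted as "off".
 */
static int numa_setup(const char *opt)
{
    if ( !strncmp(opt, "off", 3) )
        numa_off = true;
    else if ( !strncmp(opt, "on", 2) )
        numa_off = false;
    else
        return -1; /* assumed convention: unknown option */

    return 0;
}
```

Options that only make sense with extra support, such as the quoted "fake=" one, would then be handled in the same function under their existing config guard.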

>  {
> -    if ( !strncmp(opt,"off",3) )
> -        numa_off = 1;
> -    if ( !strncmp(opt,"on",2) )
> -        numa_off = 0;
>  #ifdef CONFIG_NUMA_EMU
>      if ( !strncmp(opt, "fake=", 5) )
>      {
> diff --git a/xen/common/numa.c b/xen/common/numa.c
> index 13f147c..9b9cf9c 100644
> --- a/xen/common/numa.c
> +++ b/xen/common/numa.c
> @@ -32,6 +32,10 @@
>  #include <xen/softirq.h>
>  #include <asm/setup.h>
>
> +static int numa_setup(char *s);

Forward declarations can be avoided in most cases. So please add the 
function at the right place from the beginning.

> +custom_param("numa", numa_setup);
> +
> +bool_t numa_off = 0;
>  struct node_data node_data[MAX_NUMNODES];
>
>  /* Mapping from pdx to node id */
> @@ -250,6 +254,16 @@ EXPORT_SYMBOL(memnode_shift);
>  EXPORT_SYMBOL(memnodemap);
>  EXPORT_SYMBOL(node_data);
>
> +static __init int numa_setup(char *opt)
> +{
> +    if ( !strncmp(opt,"off",3) )
> +        numa_off = 1;
> +    if ( !strncmp(opt,"on",2) )
> +        numa_off = 0;
> +
> +    return arch_numa_setup(opt);
> +}
> +
>  static void dump_numa(unsigned char key)
>  {
>      s_time_t now = NOW();
> diff --git a/xen/include/asm-arm/numa.h b/xen/include/asm-arm/numa.h
> index a60c7eb..c1e8a7d 100644
> --- a/xen/include/asm-arm/numa.h
> +++ b/xen/include/asm-arm/numa.h
> @@ -5,7 +5,14 @@ typedef u8 nodeid_t;
>
>  #define NODES_SHIFT 2
>
> -#ifndef CONFIG_NUMA
> +#ifdef CONFIG_NUMA
> +int arch_numa_setup(char *opt);
> +#else
> +static inline int arch_numa_setup(char *opt)
> +{
> +    return 1;
> +}

What is the point of this?

> +
>  /* Fake one node for now. See also node_online_map. */
>  #define cpu_to_node(cpu) 0
>  #define node_to_cpumask(node)   (cpu_online_map)
> diff --git a/xen/include/asm-x86/numa.h b/xen/include/asm-x86/numa.h
> index df1f7d5..659ff6a 100644
> --- a/xen/include/asm-x86/numa.h
> +++ b/xen/include/asm-x86/numa.h
> @@ -22,7 +22,6 @@ extern nodeid_t pxm_to_node(unsigned int pxm);
>  #define ZONE_ALIGN (1UL << (MAX_ORDER+PAGE_SHIFT))
>
>  extern void numa_init_array(void);
> -extern bool_t numa_off;
>
>  extern int srat_disabled(void);
>  extern void srat_detect_node(int cpu);
> @@ -32,5 +31,6 @@ extern nodeid_t apicid_to_node[];
>  void srat_parse_regions(u64 addr);
>  extern u8 __node_distance(nodeid_t a, nodeid_t b);
>  unsigned int arch_get_dma_bitsize(void);
> +int arch_numa_setup(char *opt);
>
>  #endif
> diff --git a/xen/include/xen/numa.h b/xen/include/xen/numa.h
> index 810f742..77c5cfd 100644
> --- a/xen/include/xen/numa.h
> +++ b/xen/include/xen/numa.h
> @@ -18,6 +18,7 @@
>    (((d)->vcpu != NULL && (d)->vcpu[0] != NULL) \
>     ? vcpu_to_node((d)->vcpu[0]) : NUMA_NO_NODE)
>
> +extern bool_t numa_off;

The place where you added this looks a bit random. And I would be surprised 
to see a need for it when NUMA is not enabled.

>  struct node {
>  	u64 start,end;
>  };
>

Regards,

-- 
Julien Grall


* Re: [RFC PATCH v1 05/21] ARM: efi: Do not delete memory node from fdt
  2017-02-09 15:56 ` [RFC PATCH v1 05/21] ARM: efi: Do not delete memory node from fdt vijay.kilari
@ 2017-02-20 13:42   ` Julien Grall
  0 siblings, 0 replies; 91+ messages in thread
From: Julien Grall @ 2017-02-20 13:42 UTC (permalink / raw)
  To: vijay.kilari, sstabellini, andre.przywara, dario.faggioli
  Cc: xen-devel, nd, Vijaya Kumar K

Hello Vijay,

On 09/02/17 15:56, vijay.kilari@gmail.com wrote:
> From: Vijaya Kumar K <Vijaya.Kumar@cavium.com>
>
> When booting in UEFI mode, UEFI passes memory information
> to Dom0 using EFI memory descriptor table and deletes the
> memory nodes from the host DT. However to fetch the memory
> numa node id, memory DT node should not be deleted by EFI stub.
>
> With this patch, do not delete memory node from FDT.
> This memory nodes are later used by XEN to extract numa
> node id information.
>
> Also, parse memory node only if bootmeminfo is NULL.
>
> Signed-off-by: Vijaya Kumar K <Vijaya.Kumar@cavium.com>
> ---
>  xen/arch/arm/bootfdt.c      |  9 +++++++--
>  xen/arch/arm/efi/efi-boot.h | 25 -------------------------
>  2 files changed, 7 insertions(+), 27 deletions(-)
>
> diff --git a/xen/arch/arm/bootfdt.c b/xen/arch/arm/bootfdt.c
> index cae6f83..979f675 100644
> --- a/xen/arch/arm/bootfdt.c
> +++ b/xen/arch/arm/bootfdt.c
> @@ -285,8 +285,13 @@ static int __init early_scan_node(const void *fdt,
>                                    u32 address_cells, u32 size_cells,
>                                    void *data)
>  {
> -    if ( device_tree_node_matches(fdt, node, "memory") )
> -        process_memory_node(fdt, node, name, address_cells, size_cells);
> +    /*
> +     * Parse memory node only if bootinfo.mem is empty.
> +     */
> +    if ( bootinfo.mem.nr_banks == 0 ) {

I would prefer if we use efi_enabled(EFI_BOOT) instead. This will catch 
any potential bug when booted without EFI.

> +        if ( device_tree_node_matches(fdt, node, "memory") )
> +            process_memory_node(fdt, node, name, address_cells, size_cells);
> +    }
>      else if ( device_tree_node_compatible(fdt, node, "xen,multiboot-module" ) ||
>                device_tree_node_compatible(fdt, node, "multiboot,module" ))
>          process_multiboot_node(fdt, node, name, address_cells, size_cells);

Regards,

-- 
Julien Grall


* Re: [RFC PATCH v1 06/21] ARM: NUMA: Parse CPU NUMA information
  2017-02-09 15:56 ` [RFC PATCH v1 06/21] ARM: NUMA: Parse CPU NUMA information vijay.kilari
@ 2017-02-20 17:32   ` Julien Grall
  2017-02-22 10:46     ` Vijay Kilari
  2017-02-20 17:36   ` Julien Grall
  1 sibling, 1 reply; 91+ messages in thread
From: Julien Grall @ 2017-02-20 17:32 UTC (permalink / raw)
  To: vijay.kilari, sstabellini, andre.przywara, dario.faggioli
  Cc: xen-devel, nd, Vijaya Kumar K

Hello Vijay,

On 09/02/17 15:56, vijay.kilari@gmail.com wrote:
> From: Vijaya Kumar K <Vijaya.Kumar@cavium.com>
>
> Parse CPU node and fetch numa-node-id information.
> For each node-id found, update nodemask_t mask.

A link to the bindings would have been useful...

> Call numa_init() from setup_mm() with start and end
> pfn of the complete ram..

s/.././


> Signed-off-by: Vijaya Kumar K <Vijaya.Kumar@cavium.com>
> ---
>  xen/arch/arm/Makefile         |  1 +
>  xen/arch/arm/bootfdt.c        |  8 ++---
>  xen/arch/arm/dt_numa.c        | 72 +++++++++++++++++++++++++++++++++++++++++++
>  xen/arch/arm/numa.c           | 14 +++++++++
>  xen/arch/arm/setup.c          |  3 ++
>  xen/include/asm-arm/numa.h    | 14 +++++++++
>  xen/include/xen/device_tree.h |  4 +++
>  7 files changed, 112 insertions(+), 4 deletions(-)
>
> diff --git a/xen/arch/arm/Makefile b/xen/arch/arm/Makefile
> index b5d7a19..7694485 100644
> --- a/xen/arch/arm/Makefile
> +++ b/xen/arch/arm/Makefile
> @@ -50,6 +50,7 @@ obj-y += vtimer.o
>  obj-y += vpsci.o
>  obj-y += vuart.o
>  obj-$(CONFIG_NUMA) += numa.o
> +obj-$(CONFIG_NUMA) += dt_numa.o

I would prefer if we introduced a numa directory within arm. This would 
make the root cleaner.

>
>  #obj-bin-y += ....o
>
> diff --git a/xen/arch/arm/bootfdt.c b/xen/arch/arm/bootfdt.c
> index 979f675..d1122d8 100644
> --- a/xen/arch/arm/bootfdt.c
> +++ b/xen/arch/arm/bootfdt.c
> @@ -17,8 +17,8 @@
>  #include <xsm/xsm.h>
>  #include <asm/setup.h>
>
> -static bool_t __init device_tree_node_matches(const void *fdt, int node,
> -                                              const char *match)
> +bool_t __init device_tree_node_matches(const void *fdt, int node,
> +                                       const char *match)
>  {
>      const char *name;
>      size_t match_len;
> @@ -63,8 +63,8 @@ static void __init device_tree_get_reg(const __be32 **cell, u32 address_cells,
>      *size = dt_next_cell(size_cells, cell);
>  }
>
> -static u32 __init device_tree_get_u32(const void *fdt, int node,
> -                                      const char *prop_name, u32 dflt)
> +u32 __init device_tree_get_u32(const void *fdt, int node,
> +                               const char *prop_name, u32 dflt)
>  {
>      const struct fdt_property *prop;
>
> diff --git a/xen/arch/arm/dt_numa.c b/xen/arch/arm/dt_numa.c
> new file mode 100644
> index 0000000..4b94c36
> --- /dev/null
> +++ b/xen/arch/arm/dt_numa.c
> @@ -0,0 +1,72 @@
> +/*
> + * OF NUMA Parsing support.
> + *
> + * Copyright (C) 2015 - 2016 Cavium Inc.
> + *
> + * Some code extracts are taken from linux drivers/of/of_numa.c

Please specify which code has been extracted from Linux and the commit id.

Looking at this patch only, none of this code is from Linux, or it has 
been heavily modified.

> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with this program.  If not, see <http://www.gnu.org/licenses/>.
> + */
> +
> +#include <xen/config.h>

No need to include xen/config.h

> +#include <xen/device_tree.h>

This include should not be there as the device tree is not yet parsed.

> +#include <xen/libfdt/libfdt.h>
> +#include <xen/mm.h>
> +#include <xen/nodemask.h>
> +#include <asm/mm.h>
> +#include <xen/numa.h>
> +
> +nodemask_t numa_nodes_parsed;

Why does this variable live in dt_numa.c when it is used by common and 
ACPI code?

Also, any exported variable/function should have its prototype in the 
associated header.

> +
> +/*
> + * Even though we connect cpus to numa domains later in SMP
> + * init, we need to know the node ids now for all cpus.
> +*/

Coding style:

/*
  * ...
  */

> +static int __init dt_numa_process_cpu_node(const void *fdt, int node,
> +                                           const char *name,
> +                                           u32 address_cells, u32 size_cells)
> +{
> +    u32 nid;
> +
> +    nid = device_tree_get_u32(fdt, node, "numa-node-id", MAX_NUMNODES);
> +
> +    if ( nid >= MAX_NUMNODES )
> +        printk(XENLOG_WARNING "NUMA: Node id %u exceeds maximum value\n", nid);
> +    else
> +        node_set(nid, numa_nodes_parsed);
> +
> +    return 0;
> +}
> +
> +static int __init dt_numa_scan_cpu_node(const void *fdt, int node,
> +                                        const char *name, int depth,
> +                                        u32 address_cells, u32 size_cells,
> +                                        void *data)
> +
> +{
> +    if ( device_tree_node_matches(fdt, node, "cpu") )
> +        return dt_numa_process_cpu_node(fdt, node, name, address_cells,
> +                                        size_cells);

This code is wrong. CPU nodes can only be in /cpus and you cannot rely 
on the name being "cpu" (see the binding in 
Documentation/devicetree/bindings/arm/cpu.txt). The only way to check if 
a node is a CPU is to look for the property device_type.
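
A rough sketch of such a check follows. To keep it runnable without a device tree blob, it uses a mock node structure: struct mock_node, mock_getprop() and node_is_cpu() are hypothetical names, and in the real code the property lookup would go through libfdt on the flattened tree instead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Mock device tree node, for illustration only. */
struct mock_prop {
    const char *name;
    const char *value;
};

struct mock_node {
    const char *name;
    const struct mock_prop *props;
    size_t nr_props;
};

/* Return the value of the named property, or NULL if it is absent. */
static const char *mock_getprop(const struct mock_node *n, const char *name)
{
    for ( size_t i = 0; i < n->nr_props; i++ )
        if ( !strcmp(n->props[i].name, name) )
            return n->props[i].value;

    return NULL;
}

/* A node is a CPU if device_type says so, whatever the node is called. */
static bool node_is_cpu(const struct mock_node *n)
{
    const char *type = mock_getprop(n, "device_type");

    return type != NULL && !strcmp(type, "cpu");
}

/* A CPU whose node name is not "cpu". */
static const struct mock_prop core_props[] = {
    { "device_type", "cpu" },
    { "numa-node-id", "1" },
};
static const struct mock_node core_node = { "core@0", core_props, 2 };

/* A node named "cpu" that is not a CPU. */
static const struct mock_prop other_props[] = {
    { "compatible", "vendor,thing" },
};
static const struct mock_node misnamed_node = { "cpu", other_props, 1 };
```

A name-based match would get both sample nodes wrong, while the device_type check accepts the first and rejects the second. The sketch does not cover the other half of the remark, i.e. restricting the scan to children of /cpus.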

> +
> +    return 0;
> +}
> +
> +int __init dt_numa_init(void)
> +{
> +    int ret;
> +
> +    nodes_clear(numa_nodes_parsed);

Why is this done in dt_numa_init and not numa_init?

> +    ret = device_tree_for_each_node((void *)device_tree_flattened,
> +                                    dt_numa_scan_cpu_node, NULL);
> +    return ret;
> +}
> diff --git a/xen/arch/arm/numa.c b/xen/arch/arm/numa.c
> index 59d09c7..9a7f0bb 100644
> --- a/xen/arch/arm/numa.c
> +++ b/xen/arch/arm/numa.c
> @@ -21,6 +21,20 @@
>  #include <xen/nodemask.h>
>  #include <asm/mm.h>
>  #include <xen/numa.h>
> +#include <asm/acpi.h>
> +
> +int __init numa_init(void)
> +{
> +    int ret = 0;
> +
> +    if ( numa_off )
> +        goto no_numa;
> +
> +    ret = dt_numa_init();
> +
> +no_numa:
> +    return ret;
> +}
>
>  int __init arch_numa_setup(char *opt)
>  {
> diff --git a/xen/arch/arm/setup.c b/xen/arch/arm/setup.c
> index 049e449..b6618ca 100644
> --- a/xen/arch/arm/setup.c
> +++ b/xen/arch/arm/setup.c
> @@ -39,6 +39,7 @@
>  #include <xen/libfdt/libfdt.h>
>  #include <xen/acpi.h>
>  #include <asm/alternative.h>
> +#include <xen/numa.h>
>  #include <asm/page.h>
>  #include <asm/current.h>
>  #include <asm/setup.h>
> @@ -753,6 +754,8 @@ void __init start_xen(unsigned long boot_phys_offset,
>      /* Parse the ACPI tables for possible boot-time configuration */
>      acpi_boot_table_init();
>
> +    numa_init();

I see very little point in having a function return a value that is never 
checked. If the return value does not matter, then the function should be void.

> +
>      end_boot_allocator();
>
>      vm_init();
> diff --git a/xen/include/asm-arm/numa.h b/xen/include/asm-arm/numa.h
> index c1e8a7d..cdfeecd 100644
> --- a/xen/include/asm-arm/numa.h
> +++ b/xen/include/asm-arm/numa.h
> @@ -1,18 +1,32 @@
>  #ifndef __ARCH_ARM_NUMA_H
>  #define __ARCH_ARM_NUMA_H
>
> +#include <xen/errno.h>
> +
>  typedef u8 nodeid_t;
>
>  #define NODES_SHIFT 2
>
>  #ifdef CONFIG_NUMA
>  int arch_numa_setup(char *opt);
> +int __init numa_init(void);

Please drop __init; it should only be on the definition.

> +int __init dt_numa_init(void);

Ditto.

>  #else
>  static inline int arch_numa_setup(char *opt)
>  {
>      return 1;
>  }
>
> +static inline int __init numa_init(void)
> +{
> +    return 0;
> +}
> +
> +static inline int __init dt_numa_init(void)
> +{
> +    return -EINVAL;
> +}

This wrapper should never be called when NUMA is disabled. So please 
drop it.

> +
>  /* Fake one node for now. See also node_online_map. */
>  #define cpu_to_node(cpu) 0
>  #define node_to_cpumask(node)   (cpu_online_map)
> diff --git a/xen/include/xen/device_tree.h b/xen/include/xen/device_tree.h
> index 0aecbe0..de6b351 100644
> --- a/xen/include/xen/device_tree.h
> +++ b/xen/include/xen/device_tree.h
> @@ -188,6 +188,10 @@ int device_tree_for_each_node(const void *fdt,
>                                       device_tree_node_func func,
>                                       void *data);
>
> +bool_t device_tree_node_matches(const void *fdt, int node,
> +                                const char *match);
> +u32 device_tree_get_u32(const void *fdt, int node,
> +                        const char *prop_name, u32 dflt);

Please don't mix common and arch code. Those functions are arch 
specific. This should be defined in asm/setup.h.

>  /**
>   * dt_unflatten_host_device_tree - Unflatten the host device tree
>   *
>

Regards,

-- 
Julien Grall


* Re: [RFC PATCH v1 06/21] ARM: NUMA: Parse CPU NUMA information
  2017-02-09 15:56 ` [RFC PATCH v1 06/21] ARM: NUMA: Parse CPU NUMA information vijay.kilari
  2017-02-20 17:32   ` Julien Grall
@ 2017-02-20 17:36   ` Julien Grall
  1 sibling, 0 replies; 91+ messages in thread
From: Julien Grall @ 2017-02-20 17:36 UTC (permalink / raw)
  To: vijay.kilari, sstabellini, andre.przywara, dario.faggioli
  Cc: xen-devel, nd, Vijaya Kumar K

Hello Vijay,

On 09/02/17 15:56, vijay.kilari@gmail.com wrote:
> +static int __init dt_numa_scan_cpu_node(const void *fdt, int node,
> +                                        const char *name, int depth,
> +                                        u32 address_cells, u32 size_cells,
> +                                        void *data)
> +

I forgot to mention this. Please drop this newline.


> +{
> +    if ( device_tree_node_matches(fdt, node, "cpu") )
> +        return dt_numa_process_cpu_node(fdt, node, name, address_cells,
> +                                        size_cells);
> +
> +    return 0;
> +}

Cheers,

-- 
Julien Grall


* Re: [RFC PATCH v1 07/21] ARM: NUMA: Parse memory NUMA information
  2017-02-09 15:56 ` [RFC PATCH v1 07/21] ARM: NUMA: Parse memory " vijay.kilari
@ 2017-02-20 18:05   ` Julien Grall
  2017-03-02 12:25     ` Vijay Kilari
  0 siblings, 1 reply; 91+ messages in thread
From: Julien Grall @ 2017-02-20 18:05 UTC (permalink / raw)
  To: vijay.kilari, sstabellini, andre.przywara, dario.faggioli
  Cc: xen-devel, nd, Vijaya Kumar K

Hello Vijay,

On 09/02/17 15:56, vijay.kilari@gmail.com wrote:
> From: Vijaya Kumar K <Vijaya.Kumar@cavium.com>
>
> Parse memory node and fetch numa-node-id information.
> For each memory range, store in node_memblk_range[]
> along with node id.
>
> Signed-off-by: Vijaya Kumar K <Vijaya.Kumar@cavium.com>
> ---
>  xen/arch/arm/bootfdt.c        |  4 +--
>  xen/arch/arm/dt_numa.c        | 84 ++++++++++++++++++++++++++++++++++++++++++-
>  xen/common/numa.c             |  8 +++++
>  xen/include/xen/device_tree.h |  3 ++
>  xen/include/xen/numa.h        |  1 +
>  5 files changed, 97 insertions(+), 3 deletions(-)
>
> diff --git a/xen/arch/arm/bootfdt.c b/xen/arch/arm/bootfdt.c
> index d1122d8..5e2df92 100644
> --- a/xen/arch/arm/bootfdt.c
> +++ b/xen/arch/arm/bootfdt.c
> @@ -56,8 +56,8 @@ static bool_t __init device_tree_node_compatible(const void *fdt, int node,
>      return 0;
>  }
>
> -static void __init device_tree_get_reg(const __be32 **cell, u32 address_cells,
> -                                       u32 size_cells, u64 *start, u64 *size)
> +void __init device_tree_get_reg(const __be32 **cell, u32 address_cells,
> +                                u32 size_cells, u64 *start, u64 *size)
>  {
>      *start = dt_next_cell(address_cells, cell);
>      *size = dt_next_cell(size_cells, cell);
> diff --git a/xen/arch/arm/dt_numa.c b/xen/arch/arm/dt_numa.c
> index 4b94c36..fce9e67 100644
> --- a/xen/arch/arm/dt_numa.c
> +++ b/xen/arch/arm/dt_numa.c
> @@ -27,6 +27,7 @@
>  #include <xen/numa.h>
>
>  nodemask_t numa_nodes_parsed;
> +extern struct node node_memblk_range[NR_NODE_MEMBLKS];

This should have been declared in a header (likely in patch #3).

>
>  /*
>   * Even though we connect cpus to numa domains later in SMP
> @@ -48,11 +49,73 @@ static int __init dt_numa_process_cpu_node(const void *fdt, int node,
>      return 0;
>  }
>
> +static int __init dt_numa_process_memory_node(const void *fdt, int node,
> +                                              const char *name,
> +                                              u32 address_cells,
> +                                              u32 size_cells)

Rather than reinventing the wheel, it might be better to hook the 
parsing into bootfdt.c. This would avoid an extra pass over the 
device tree, duplicated code, and exposing primitives for early DT 
parsing.

If parsing NUMA information can be done after the DT has been 
unflattened, that would be even better.

> +{
> +    const struct fdt_property *prop;
> +    int i, ret, banks;

Both banks and i should be unsigned.

> +    const __be32 *cell;
> +    paddr_t start, size;
> +    u32 reg_cells = address_cells + size_cells;
> +    u32 nid;
> +
> +    if ( address_cells < 1 || size_cells < 1 )
> +    {
> +        printk(XENLOG_WARNING
> +               "fdt: node `%s': invalid #address-cells or #size-cells", name);
> +        return -EINVAL;
> +    }
> +
> +    nid = device_tree_get_u32(fdt, node, "numa-node-id", MAX_NUMNODES);
> +    if ( nid >= MAX_NUMNODES) {

Coding style

if ( ... )
{

> +        /*
> +         * No node id found. Skip this memory node.
> +         */

This could be a single line:

/* ..... */

So no warning if it is impossible to get the numa-node-id? Also, I don't 
think it is right to boot with NUMA on a platform that has a buggy DT. 
So we should probably return an error and disable NUMA.

> +        return 0;
> +    }
> +
> +    prop = fdt_get_property(fdt, node, "reg", NULL);
> +    if ( !prop )
> +    {
> +        printk(XENLOG_WARNING "fdt: node `%s': missing `reg' property\n",
> +               name);
> +        return -EINVAL;
> +    }
> +
> +    cell = (const __be32 *)prop->data;
> +    banks = fdt32_to_cpu(prop->len) / (reg_cells * sizeof (u32));
> +
> +    for ( i = 0; i < banks; i++ )
> +    {
> +        device_tree_get_reg(&cell, address_cells, size_cells, &start, &size);
> +        if ( !size )
> +            continue;
> +
> +        /* It is fine to add this area to the nodes data it will be used later*/
> +        ret = conflicting_memblks(start, start + size);
> +        if (ret < 0)
> +             numa_add_memblk(nid, start, size);

numa_add_memblk relies on the caller to check that the array is not 
full. I think we should move the check into numa_add_memblk and return 
an error in this case.
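
A minimal sketch of numa_add_memblk with the bounds check folded in. The 
stand-in typedefs, the NR_NODE_MEMBLKS value and the choice of -ENOSPC 
are assumptions made only to keep the example self-contained; the real 
definitions live in the Xen headers.

```c
#include <errno.h>
#include <stdint.h>

/* Stand-ins so the sketch compiles on its own; Xen has its own types. */
typedef uint64_t paddr_t;
typedef uint8_t nodeid_t;
#define NR_NODE_MEMBLKS 8   /* illustrative value only */

struct node {
    paddr_t start, end;
};

static struct node node_memblk_range[NR_NODE_MEMBLKS];
static nodeid_t memblk_nodeid[NR_NODE_MEMBLKS];
static unsigned int num_node_memblks;

/* Record one memory block; refuse rather than overflow the arrays. */
int numa_add_memblk(nodeid_t nodeid, paddr_t start, paddr_t size)
{
    if ( num_node_memblks >= NR_NODE_MEMBLKS )
        return -ENOSPC;

    node_memblk_range[num_node_memblks].start = start;
    node_memblk_range[num_node_memblks].end = start + size;
    memblk_nodeid[num_node_memblks] = nodeid;
    num_node_memblks++;

    return 0;
}
```

The caller in dt_numa_process_memory_node could then propagate the error 
instead of writing past the end of node_memblk_range[].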

> +        else
> +        {
> +             printk(XENLOG_ERR
> +                    "NUMA DT: node %u (%"PRIx64"-%"PRIx64") overlaps with ret %d (%"PRIx64"-%"PRIx64")\n",
> +                    nid, start, start + size, ret,
> +                    node_memblk_range[i].start, node_memblk_range[i].end);

i != ret. Please use the correct variable accordingly. However, I don't 
think the overlap range really matters here.

> +             return -EINVAL;
> +        }
> +    }
> +
> +    node_set(nid, numa_nodes_parsed);
> +
> +    return 0;
> +}
> +
>  static int __init dt_numa_scan_cpu_node(const void *fdt, int node,
>                                          const char *name, int depth,
>                                          u32 address_cells, u32 size_cells,
>                                          void *data)
> -

Spurious change. Please don't add the blank line in the first place 
(patch #6).

>  {
>      if ( device_tree_node_matches(fdt, node, "cpu") )
>          return dt_numa_process_cpu_node(fdt, node, name, address_cells,
> @@ -61,6 +124,18 @@ static int __init dt_numa_scan_cpu_node(const void *fdt, int node,
>      return 0;
>  }
>
> +static int __init dt_numa_scan_memory_node(const void *fdt, int node,
> +                                           const char *name, int depth,
> +                                           u32 address_cells, u32 size_cells,
> +                                           void *data)
> +{
> +    if ( device_tree_node_matches(fdt, node, "memory") )
> +        return dt_numa_process_memory_node(fdt, node, name, address_cells,
> +                                           size_cells);

As with the CPUs, this code is wrong. You should check for 
device_type = "memory".

> +
> +    return 0;
> +}
> +
>  int __init dt_numa_init(void)
>  {
>      int ret;
> @@ -68,5 +143,12 @@ int __init dt_numa_init(void)
>      nodes_clear(numa_nodes_parsed);
>      ret = device_tree_for_each_node((void *)device_tree_flattened,
>                                      dt_numa_scan_cpu_node, NULL);
> +
> +    if ( ret )
> +        return ret;
> +
> +    ret = device_tree_for_each_node((void *)device_tree_flattened,
> +                                    dt_numa_scan_memory_node, NULL);
> +
>      return ret;
>  }
> diff --git a/xen/common/numa.c b/xen/common/numa.c
> index 9b9cf9c..62c76af 100644
> --- a/xen/common/numa.c
> +++ b/xen/common/numa.c
> @@ -55,6 +55,14 @@ struct node node_memblk_range[NR_NODE_MEMBLKS];
>  nodeid_t memblk_nodeid[NR_NODE_MEMBLKS];
>  struct node nodes[MAX_NUMNODES] __initdata;
>
> +void __init numa_add_memblk(nodeid_t nodeid, u64 start, u64 size)

Please replace u64 by paddr_t.

> +{
> +    node_memblk_range[num_node_memblks].start = start;
> +    node_memblk_range[num_node_memblks].end = start + size;
> +    memblk_nodeid[num_node_memblks] = nodeid;
> +    num_node_memblks++;
> +}

You probably want to factor the code in acpi_numa_memory_affinity_init 
to create this function.

Also, you don't check if the array is full.

> +
>  int valid_numa_range(u64 start, u64 end, nodeid_t node)
>  {
>  #ifdef CONFIG_NUMA
> diff --git a/xen/include/xen/device_tree.h b/xen/include/xen/device_tree.h
> index de6b351..d92e47e 100644
> --- a/xen/include/xen/device_tree.h
> +++ b/xen/include/xen/device_tree.h
> @@ -192,6 +192,9 @@ bool_t device_tree_node_matches(const void *fdt, int node,
>                                  const char *match);
>  u32 device_tree_get_u32(const void *fdt, int node,
>                          const char *prop_name, u32 dflt);
> +void device_tree_get_reg(const __be32 **cell, u32 address_cells,
> +                         u32 size_cells, u64 *start, u64 *size);
> +

Same remark as on patch #6 for the position of the declaration.

>  /**
>   * dt_unflatten_host_device_tree - Unflatten the host device tree
>   *
> diff --git a/xen/include/xen/numa.h b/xen/include/xen/numa.h
> index 77c5cfd..9392a89 100644
> --- a/xen/include/xen/numa.h
> +++ b/xen/include/xen/numa.h
> @@ -67,6 +67,7 @@ static inline __attribute__((pure)) nodeid_t phys_to_nid(paddr_t addr)
>  #define clear_node_cpumask(cpu) do {} while (0)
>  #endif /* CONFIG_NUMA */
>
> +extern void numa_add_memblk(nodeid_t nodeid, u64 start, u64 size);
>  extern int valid_numa_range(u64 start, u64 end, nodeid_t node);
>  extern int conflicting_memblks(u64 start, u64 end);
>  extern void cutoff_node(int i, u64 start, u64 end);
>

Regards,

-- 
Julien Grall


* Re: [RFC PATCH v1 08/21] ARM: NUMA: Parse NUMA distance information
  2017-02-09 15:57 ` [RFC PATCH v1 08/21] ARM: NUMA: Parse NUMA distance information vijay.kilari
@ 2017-02-20 18:28   ` Julien Grall
  2017-02-22 11:38     ` Vijay Kilari
  0 siblings, 1 reply; 91+ messages in thread
From: Julien Grall @ 2017-02-20 18:28 UTC (permalink / raw)
  To: vijay.kilari, sstabellini, andre.przywara, dario.faggioli
  Cc: xen-devel, nd, Vijaya Kumar K

Hello Vijay,

On 09/02/17 15:57, vijay.kilari@gmail.com wrote:
> From: Vijaya Kumar K <Vijaya.Kumar@cavium.com>
>
> Parse distance-matrix and fetch node distance information.
> Store distance information in node_distance[].
>
> Signed-off-by: Vijaya Kumar K <Vijaya.Kumar@cavium.com>
> ---
>  xen/arch/arm/dt_numa.c     | 90 ++++++++++++++++++++++++++++++++++++++++++++++
>  xen/arch/arm/numa.c        | 19 +++++++++-
>  xen/include/asm-arm/numa.h |  1 +
>  3 files changed, 109 insertions(+), 1 deletion(-)
>
> diff --git a/xen/arch/arm/dt_numa.c b/xen/arch/arm/dt_numa.c
> index fce9e67..8979612 100644
> --- a/xen/arch/arm/dt_numa.c
> +++ b/xen/arch/arm/dt_numa.c
> @@ -28,6 +28,19 @@
>
>  nodemask_t numa_nodes_parsed;
>  extern struct node node_memblk_range[NR_NODE_MEMBLKS];
> +extern int _node_distance[MAX_NUMNODES * 2];
> +extern int *node_distance;

I don't like the idea of having both _node_distance and node_distance. 
Looking at the code, I see little point in doing that. You could just 
initialize node_distance with the correct value.

Also the node distance can fit in u8, so you can save memory by using u8.

Lastly, I am not sure why you pre-allocate the memory. The distance 
table could be quite big.

> +
> +static int numa_set_distance(u32 nodea, u32 nodeb, u32 distance)

Please avoid the use of u32 in favor of uint32_t.

Also, this function does not look very DT specific.

> +{
> +   if ( nodea >= MAX_NUMNODES || nodeb >= MAX_NUMNODES )
> +       return -EINVAL;
> +

I would have expected some sanity check here.
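
One possible shape for those checks, as a sketch only. It assumes the 
rule from the device-tree NUMA binding that the local distance is 10 and 
every remote distance is greater than 10, and it stores the table as u8. 
The MAX_NUMNODES value, the NUMA_LOCAL_DISTANCE name and the 2-D table 
layout are illustrative stand-ins, not what the patch uses.

```c
#include <errno.h>
#include <stdint.h>

#define MAX_NUMNODES         4    /* illustrative value only */
#define NUMA_LOCAL_DISTANCE  10   /* hypothetical name */

static uint8_t node_distance[MAX_NUMNODES][MAX_NUMNODES];

int numa_set_distance(uint32_t nodea, uint32_t nodeb, uint32_t distance)
{
    if ( nodea >= MAX_NUMNODES || nodeb >= MAX_NUMNODES )
        return -EINVAL;

    /* The table is u8, so reject anything that would be truncated. */
    if ( distance > UINT8_MAX )
        return -EINVAL;

    /* A node must be at the local distance from itself... */
    if ( nodea == nodeb && distance != NUMA_LOCAL_DISTANCE )
        return -EINVAL;

    /* ...and strictly further away from any other node. */
    if ( nodea != nodeb && distance <= NUMA_LOCAL_DISTANCE )
        return -EINVAL;

    node_distance[nodea][nodeb] = (uint8_t)distance;

    return 0;
}
```

With this, the parser can check the return value and stop on the first 
bad triplet rather than recording it.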

> +   _node_distance[(nodea * MAX_NUMNODES) + nodeb] = distance;
> +   node_distance = &_node_distance[0];
> +
> +   return 0;
> +}
>
>  /*
>   * Even though we connect cpus to numa domains later in SMP
> @@ -112,6 +125,66 @@ static int __init dt_numa_process_memory_node(const void *fdt, int node,
>      return 0;
>  }
>
> +static int __init dt_numa_parse_distance_map(const void *fdt, int node,
> +                                             const char *name,
> +                                             u32 address_cells,
> +                                             u32 size_cells)
> +{
> +    const struct fdt_property *prop;
> +    const __be32 *matrix;
> +    int entry_count, len, i;
> +
> +    printk(XENLOG_INFO "NUMA: parsing numa-distance-map\n");
> +
> +    prop = fdt_get_property(fdt, node, "distance-matrix", &len);
> +    if ( !prop )
> +    {
> +        printk(XENLOG_WARNING
> +               "NUMA: No distance-matrix property in distance-map\n");
> +
> +        return -EINVAL;
> +    }
> +
> +    if ( len % sizeof(u32) != 0 )
> +    {
> +         printk(XENLOG_WARNING
> +                "distance-matrix in node is not a multiple of u32\n");
> +
> +        return -EINVAL;
> +    }
> +
> +    entry_count = len / sizeof(u32);
> +    if ( entry_count <= 0 )
> +    {
> +        printk(XENLOG_WARNING "NUMA: Invalid distance-matrix\n");
> +
> +        return -EINVAL;
> +    }
> +
> +    matrix = (const __be32 *)prop->data;
> +    for ( i = 0; i + 2 < entry_count; i += 3 )
> +    {
> +        u32 nodea, nodeb, distance;
> +
> +        nodea = dt_read_number(matrix, 1);
> +        matrix++;
> +        nodeb = dt_read_number(matrix, 1);
> +        matrix++;
> +        distance = dt_read_number(matrix, 1);
> +        matrix++;
> +
> +        numa_set_distance(nodea, nodeb, distance);

What if numa_set_distance failed?

> +        printk(XENLOG_INFO "NUMA:  distance[node%d -> node%d] = %d\n",
> +               nodea, nodeb, distance);
> +
> +        /* Set default distance of node B->A same as A->B */
> +        if ( nodeb > nodea )
> +            numa_set_distance(nodeb, nodea, distance);
> +    }
> +
> +    return 0;
> +}
> +
>  static int __init dt_numa_scan_cpu_node(const void *fdt, int node,
>                                          const char *name, int depth,
>                                          u32 address_cells, u32 size_cells,
> @@ -136,6 +209,18 @@ static int __init dt_numa_scan_memory_node(const void *fdt, int node,
>      return 0;
>  }
>
> +static int __init dt_numa_scan_distance_node(const void *fdt, int node,
> +                                             const char *name, int depth,
> +                                             u32 address_cells, u32 size_cells,
> +                                             void *data)
> +{
> +    if ( device_tree_node_matches(fdt, node, "distance-map") )

Similar to memory and cpu, the node name is not fixed. What you want to 
look for is the compatible string "numa-distance-map-v1".
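
For illustration, a "compatible" property is a list of NUL-terminated 
strings packed back to back, so matching means walking that list rather 
than comparing the node name. The helper below is a hypothetical, 
self-contained sketch of that walk; in Xen the existing 
device_tree_node_compatible() in bootfdt.c would be the natural thing to 
reuse.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: return true if 'match' is one of the
 * NUL-terminated strings packed into prop[0..len).
 */
bool compatible_list_contains(const char *prop, size_t len,
                              const char *match)
{
    size_t match_len = strlen(match);

    while ( len > 0 )
    {
        const char *nul = memchr(prop, '\0', len);
        size_t entry_len;

        /* Malformed property: the last string is not terminated. */
        if ( !nul )
            break;

        entry_len = (size_t)(nul - prop);
        if ( entry_len == match_len && !memcmp(prop, match, match_len) )
            return true;

        /* Step over this entry and its terminating NUL. */
        prop += entry_len + 1;
        len -= entry_len + 1;
    }

    return false;
}
```

A node named "distance-map" with a different compatible would then be 
skipped, and a differently named node carrying "numa-distance-map-v1" 
would be picked up.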

> +        return dt_numa_parse_distance_map(fdt, node, name, address_cells,
> +                                          size_cells);
> +
> +    return 0;
> +}
> +
>  int __init dt_numa_init(void)
>  {
>      int ret;
> @@ -149,6 +234,11 @@ int __init dt_numa_init(void)
>
>      ret = device_tree_for_each_node((void *)device_tree_flattened,
>                                      dt_numa_scan_memory_node, NULL);
> +    if ( ret )
> +        return ret;
> +
> +    ret = device_tree_for_each_node((void *)device_tree_flattened,
> +                                    dt_numa_scan_distance_node, NULL);
>
>      return ret;
>  }
> diff --git a/xen/arch/arm/numa.c b/xen/arch/arm/numa.c
> index 9a7f0bb..11d100b 100644
> --- a/xen/arch/arm/numa.c
> +++ b/xen/arch/arm/numa.c
> @@ -22,14 +22,31 @@
>  #include <asm/mm.h>
>  #include <xen/numa.h>
>  #include <asm/acpi.h>
> +#include <xen/errno.h>

Why did you add this include? I don't see any errno here.

> +
> +int _node_distance[MAX_NUMNODES * 2];
> +int *node_distance;
> +
> +u8 __node_distance(nodeid_t a, nodeid_t b)
> +{
> +    if ( !node_distance )
> +        return a == b ? 10 : 20;

Where do the 10/20 come from?

> +
> +    return _node_distance[a * MAX_NUMNODES + b];
> +}
> +
> +EXPORT_SYMBOL(__node_distance);
>
>  int __init numa_init(void)
>  {
> -    int ret = 0;
> +    int i, ret = 0;
>
>      if ( numa_off )
>          goto no_numa;
>
> +    for ( i = 0; i < MAX_NUMNODES * 2; i++ )
> +        _node_distance[i] = 0;
> +

Hmmmm... _node_distance is a global, so it is already zero-initialized. 
So why do this?

If you want to initialize it correctly, then use 10/20.
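
For illustration, the defaults could be written into the table once at 
init time, so the lookup needs no special case. This is a sketch under 
assumptions: the macro names, the 2-D u8 table and the helper names are 
hypothetical, and 10/20 are simply the local/remote values the patch 
already returns from __node_distance.

```c
#include <stdint.h>

#define MAX_NUMNODES          4    /* illustrative value only */
#define NUMA_LOCAL_DISTANCE   10   /* hypothetical name */
#define NUMA_REMOTE_DISTANCE  20   /* hypothetical name */

typedef uint8_t nodeid_t;

static uint8_t node_distance[MAX_NUMNODES][MAX_NUMNODES];

/* Fill the table with defaults before any firmware data is parsed. */
void numa_init_distance(void)
{
    unsigned int a, b;

    for ( a = 0; a < MAX_NUMNODES; a++ )
        for ( b = 0; b < MAX_NUMNODES; b++ )
            node_distance[a][b] = (a == b) ? NUMA_LOCAL_DISTANCE
                                           : NUMA_REMOTE_DISTANCE;
}

/* Lookup is a plain table read; the caller must pass valid node ids. */
uint8_t node_distance_get(nodeid_t a, nodeid_t b)
{
    return node_distance[a][b];
}
```

Entries later parsed from the firmware table would simply overwrite the 
defaults.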

>      ret = dt_numa_init();
>
>  no_numa:
> diff --git a/xen/include/asm-arm/numa.h b/xen/include/asm-arm/numa.h
> index cdfeecd..b8857e2 100644
> --- a/xen/include/asm-arm/numa.h
> +++ b/xen/include/asm-arm/numa.h
> @@ -11,6 +11,7 @@ typedef u8 nodeid_t;
>  int arch_numa_setup(char *opt);
>  int __init numa_init(void);
>  int __init dt_numa_init(void);
> +u8 __node_distance(nodeid_t a, nodeid_t b);

This should be defined in common/numa.h as it is used by common code.

>  #else
>  static inline int arch_numa_setup(char *opt)
>  {
>

Regards,

-- 
Julien Grall


* Re: [RFC PATCH v1 09/21] ARM: NUMA: Add CPU NUMA support
  2017-02-09 15:57 ` [RFC PATCH v1 09/21] ARM: NUMA: Add CPU NUMA support vijay.kilari
@ 2017-02-20 18:32   ` Julien Grall
  0 siblings, 0 replies; 91+ messages in thread
From: Julien Grall @ 2017-02-20 18:32 UTC (permalink / raw)
  To: vijay.kilari, sstabellini, andre.przywara, dario.faggioli
  Cc: xen-devel, nd, Vijaya Kumar K

Hello Vijay,

On 09/02/17 15:57, vijay.kilari@gmail.com wrote:
> From: Vijaya Kumar K <Vijaya.Kumar@cavium.com>
>
> For each cpu, update cpu_to_node[] with node id from
> the MPIDR registers. Also, initialize cpu_to_node[]
> with node 0.

The interpretation of MPIDR differs from platform to platform, and 
that's why you have the node ID described in the firmware table...

I will look at the rest of the patch when this is fixed/clarified.

Regards,

-- 
Julien Grall


* Re: [RFC PATCH v1 12/21] ARM: NUMA: Do not expose numa info to DOM0
  2017-02-09 15:57 ` [RFC PATCH v1 12/21] ARM: NUMA: Do not expose numa info to DOM0 vijay.kilari
@ 2017-02-20 18:36   ` Julien Grall
  2017-03-02 12:30     ` Vijay Kilari
  0 siblings, 1 reply; 91+ messages in thread
From: Julien Grall @ 2017-02-20 18:36 UTC (permalink / raw)
  To: vijay.kilari, sstabellini, andre.przywara, dario.faggioli
  Cc: xen-devel, nd, Vijaya Kumar K

Hello Vijay,

On 09/02/17 15:57, vijay.kilari@gmail.com wrote:
> From: Vijaya Kumar K <Vijaya.Kumar@cavium.com>
>
> Delete numa-node-id and distance map from Dom0 DT
> so that NUMA information is not exposed to Dom0.
>
> This helps particularly to boot Node 1 devices
> as if booting on Node0.
>
> However this approach has limitation where memory allocation
> for the devices should be local.

We had a discussion about this a few weeks ago, but you never answered 
back... (see [1]).

If there is an issue, please provide input with examples and explain 
what will happen.

>
> Signed-off-by: Vijaya Kumar <Vijaya.Kumar@cavium.com>
> ---
>  xen/arch/arm/domain_build.c | 9 +++++++++
>  1 file changed, 9 insertions(+)
>
> diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
> index c97a1f5..5e89eaa 100644
> --- a/xen/arch/arm/domain_build.c
> +++ b/xen/arch/arm/domain_build.c
> @@ -424,6 +424,10 @@ static int write_properties(struct domain *d, struct kernel_info *kinfo,
>              }
>          }
>
> +        /* Don't expose the property numa to the guest */
> +        if ( dt_property_name_is_equal(prop, "numa-node-id") )
> +            continue;
> +
>          /* Don't expose the property "xen,passthrough" to the guest */
>          if ( dt_property_name_is_equal(prop, "xen,passthrough") )
>              continue;
> @@ -1176,6 +1180,11 @@ static int handle_node(struct domain *d, struct kernel_info *kinfo,
>          DT_MATCH_TYPE("memory"),
>          /* The memory mapped timer is not supported by Xen. */
>          DT_MATCH_COMPATIBLE("arm,armv7-timer-mem"),
> +        /*
> +         * NUMA info is not exposed to Dom0.
> +         * So, skip distance-map infomation
> +         */
> +        DT_MATCH_COMPATIBLE("numa-distance-map-v1"),
>          { /* sentinel */ },
>      };
>      static const struct dt_device_match timer_matches[] __initconst =
>

Regards,

[1] 
https://lists.xenproject.org/archives/html/xen-devel/2016-12/msg02073.html

-- 
Julien Grall


* Re: [RFC PATCH v1 01/21] ARM: NUMA: Add existing ARM numa code under CONFIG_NUMA
  2017-02-20 11:39   ` Julien Grall
@ 2017-02-22  9:18     ` Vijay Kilari
  2017-02-22 10:49       ` Julien Grall
  0 siblings, 1 reply; 91+ messages in thread
From: Vijay Kilari @ 2017-02-22  9:18 UTC (permalink / raw)
  To: Julien Grall
  Cc: Stefano Stabellini, Andre Przywara, Dario Faggioli,
	Vijaya Kumar K, xen-devel, nd

On Mon, Feb 20, 2017 at 5:09 PM, Julien Grall <julien.grall@arm.com> wrote:
> Hello Vijay,
>
> On 09/02/17 15:56, vijay.kilari@gmail.com wrote:
>>
>> From: Vijaya Kumar K <Vijaya.Kumar@cavium.com>
>>
>> Right not CONFIG_NUMA is not enabled for ARM and
>
>
> s/not/now/
>
>> existing code in asm-arm/numa.h is for !COFIG_NUMA.
>
>
> s/COFIG_NUMA/CONFIG_NUMA/
>
>> Hence put this code under #ifndef CONFIG_NUMA.
>>
>> This help to make this changes work when CONFIG_NUMA
>> is not enabled.
>>
>
> Is there any reason to let the choice to the user to enable/disable NUMA?

I see no reason. At least in the case of x86, I see it is always enabled.

>
>> Also define NODES_SHIFT macro for ARM.
>>
>> Signed-off-by: Vijaya Kumar K <Vijaya.Kumar@cavium.com>
>> ---
>>  xen/include/asm-arm/numa.h | 4 ++++
>>  1 file changed, 4 insertions(+)
>>
>> diff --git a/xen/include/asm-arm/numa.h b/xen/include/asm-arm/numa.h
>> index a2c1a34..a60c7eb 100644
>> --- a/xen/include/asm-arm/numa.h
>> +++ b/xen/include/asm-arm/numa.h
>> @@ -3,6 +3,9 @@
>>
>>  typedef u8 nodeid_t;
>>
>> +#define NODES_SHIFT 2
>
>
> Why 2?

Just to support systems with up to 4 nodes.
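
As a worked example of the arithmetic, assuming MAX_NUMNODES is derived 
as 1 << NODES_SHIFT (as in xen/include/xen/numa.h); node_id_valid() is a 
hypothetical helper added only for the illustration:

```c
#define NODES_SHIFT   2
#define MAX_NUMNODES  (1 << NODES_SHIFT)

/* A shift of 2 gives 1 << 2 = 4 nodes, i.e. valid node ids 0..3. */
_Static_assert(MAX_NUMNODES == 4, "a shift of 2 yields four nodes");

/* Hypothetical helper: is 'nid' within the supported range? */
static inline int node_id_valid(unsigned int nid)
{
    return nid < MAX_NUMNODES;
}
```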

>
>> +
>> +#ifndef CONFIG_NUMA
>>  /* Fake one node for now. See also node_online_map. */
>>  #define cpu_to_node(cpu) 0
>>  #define node_to_cpumask(node)   (cpu_online_map)
>> @@ -16,6 +19,7 @@ static inline __attribute__((pure)) nodeid_t
>> phys_to_nid(paddr_t addr)
>>  #define node_spanned_pages(nid) (total_pages)
>>  #define node_start_pfn(nid) (pdx_to_pfn(frametable_base_pdx))
>>  #define __node_distance(a, b) (20)
>> +#endif /* CONFIG_NUMA */
>>
>>  static inline unsigned int arch_get_dma_bitsize(void)
>>  {
>>
>
> Cheers,
>
> --
> Julien Grall


* Re: [RFC PATCH v1 02/21] x86: NUMA: Refactor NUMA code
  2017-02-20 12:37   ` Julien Grall
@ 2017-02-22 10:04     ` Vijay Kilari
  2017-02-22 10:55       ` Julien Grall
  0 siblings, 1 reply; 91+ messages in thread
From: Vijay Kilari @ 2017-02-22 10:04 UTC (permalink / raw)
  To: Julien Grall
  Cc: Stefano Stabellini, Andre Przywara, Dario Faggioli,
	Vijaya Kumar K, xen-devel, nd

On Mon, Feb 20, 2017 at 6:07 PM, Julien Grall <julien.grall@arm.com> wrote:
> Hello Vijay,
>
>
> On 09/02/17 15:56, vijay.kilari@gmail.com wrote:
>>
>> From: Vijaya Kumar K <Vijaya.Kumar@cavium.com>
>>
>> Move common generic NUMA code to xen/common/numa.c from
>> xen/arch/x86/numa.c. Also move generic code in header file
>> xen/include/asm-x86/numa.h to xen/include/xen/numa.h
>>
>> This common code can be re-used later for ARM.
>>
>> Signed-off-by: Vijaya Kumar K <Vijaya.Kumar@cavium.com>
>> ---
>>  xen/arch/x86/numa.c        | 299 ---------------------------------------
>>  xen/common/Makefile        |   1 +
>>  xen/common/numa.c          | 342
>> +++++++++++++++++++++++++++++++++++++++++++++
>>  xen/include/asm-x86/numa.h |  47 -------
>>  xen/include/xen/numa.h     |  54 +++++++
>>  5 files changed, 397 insertions(+), 346 deletions(-)
>>
>> diff --git a/xen/arch/x86/numa.c b/xen/arch/x86/numa.c
>> index 6f4d438..bc787e0 100644
>> --- a/xen/arch/x86/numa.c
>> +++ b/xen/arch/x86/numa.c
>> @@ -25,27 +25,12 @@ custom_param("numa", numa_setup);
>>  #define Dprintk(x...)
>>  #endif
>>
>> -/* from proto.h */
>> -#define round_up(x,y) ((((x)+(y))-1) & (~((y)-1)))
>> -
>> -struct node_data node_data[MAX_NUMNODES];
>> -
>> -/* Mapping from pdx to node id */
>> -int memnode_shift;
>> -static typeof(*memnodemap) _memnodemap[64];
>> -unsigned long memnodemapsize;
>> -u8 *memnodemap;
>> -
>> -nodeid_t cpu_to_node[NR_CPUS] __read_mostly = {
>> -    [0 ... NR_CPUS-1] = NUMA_NO_NODE
>> -};
>>  /*
>>   * Keep BIOS's CPU2node information, should not be used for memory
>> allocaion
>>   */
>>  nodeid_t apicid_to_node[MAX_LOCAL_APIC] = {
>>      [0 ... MAX_LOCAL_APIC-1] = NUMA_NO_NODE
>>  };
>> -cpumask_t node_to_cpumask[MAX_NUMNODES] __read_mostly;
>>
>>  nodemask_t __read_mostly node_online_map = { { [0] = 1UL } };
>
>
> Why this variable is not moved?
Ok. Can be moved.
>
> [...]
>
>>  void __init numa_init_array(void)
>
>
> Same question for this function.

Initially I was suspicious about the comment in this function and thought
it might be valid for x86 alone. But it looks like it is generic.
I will have a look.

>
>>  {
>>      int rr, i;
>> @@ -288,16 +145,6 @@ void __init numa_initmem_init(unsigned long
>> start_pfn, unsigned long end_pfn)
>>                      (u64)end_pfn << PAGE_SHIFT);
>>  }
>>
>> -void numa_add_cpu(int cpu)
>> -{
>> -    cpumask_set_cpu(cpu, &node_to_cpumask[cpu_to_node(cpu)]);
>> -}
>> -
>> -void numa_set_node(int cpu, nodeid_t node)
>> -{
>> -    cpu_to_node[cpu] = node;
>> -}
>> -
>>  /* [numa=off] */
>>  static __init int numa_setup(char *opt)
>
>
> Same question here. Everything in numa_setup and the fake NUMA looks
> valid for ARM too.

numa_setup() is handled in a separate patch. The fake NUMA case is not
considered; for x86 it is under a separate config, CONFIG_NUMA_EMU.

>
> [....]
>
>> diff --git a/xen/common/Makefile b/xen/common/Makefile
>> index 0fed30b..c1bd2ff 100644
>> --- a/xen/common/Makefile
>> +++ b/xen/common/Makefile
>> @@ -63,6 +63,7 @@ obj-y += wait.o
>>  obj-bin-y += warning.init.o
>>  obj-$(CONFIG_XENOPROF) += xenoprof.o
>>  obj-y += xmalloc_tlsf.o
>> +obj-y += numa.o
>
>
> This needs to be gated with CONFIG_NUMA.

It is shared with x86, and prior to this change it was not gated under
any config for x86.

>
>>
>>  obj-bin-$(CONFIG_X86) += $(foreach n,decompress bunzip2 unxz unlzma unlzo
>> unlz4 earlycpio,$(n).init.o)
>>
>> diff --git a/xen/common/numa.c b/xen/common/numa.c
>> new file mode 100644
>> index 0000000..59dcb63
>> --- /dev/null
>> +++ b/xen/common/numa.c
>> @@ -0,0 +1,342 @@
>> +/*
>> + * Common NUMA handling functions for x86 and arm.
>> + * Original code extracted from arch/x86/numa.c
>> + *
>> + * This program is free software; you can redistribute it and/or modify
>> + * it under the terms of the GNU General Public License as published by
>> + * the Free Software Foundation; either version 2 of the License, or
>> + * (at your option) any later version.
>
>
> Xen is GPLv2 only. Please update to:
>
> "This program is free software; you can redistribute it and/or
> modify it under the terms and conditions of the GNU General Public
> License, version 2, as published by the Free Software Foundation."
>
>
>> + *
>> + * This program is distributed in the hope that it will be useful,
>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>> + * GNU General Public License for more details.
>> + *
>> + * You should have received a copy of the GNU General Public License
>> + * along with this program; If not, see <http://www.gnu.org/licenses/>.
>> + */
>> +
>> +
>> +#include <xen/mm.h>
>> +#include <xen/string.h>
>> +#include <xen/init.h>
>> +#include <xen/ctype.h>
>> +#include <xen/nodemask.h>
>> +#include <xen/numa.h>
>> +#include <xen/keyhandler.h>
>> +#include <xen/time.h>
>> +#include <xen/smp.h>
>> +#include <xen/pfn.h>
>> +#include <xen/sched.h>
>> +#include <xen/errno.h>
>> +#include <xen/softirq.h>
>> +#include <asm/setup.h>
>> +
>> +struct node_data node_data[MAX_NUMNODES];
>> +
>> +/* Mapping from pdx to node id */
>
>
> Looking at this comment, it looks like the NUMA support should depend on
> HAS_PDX as this is not something that may be able on all the architecture.

Yes, it uses pfn_to_pdx() while updating memnodemap.
Maybe a comment would suffice.

>
>> +int memnode_shift;
>> +unsigned long memnodemapsize;
>> +u8 *memnodemap;
>> +typeof(*memnodemap) _memnodemap[64];
>> +
>> +nodeid_t cpu_to_node[NR_CPUS] __read_mostly = {
>> +    [0 ... NR_CPUS-1] = NUMA_NO_NODE
>> +};
>> +
>> +cpumask_t node_to_cpumask[MAX_NUMNODES] __read_mostly;
>
>
> [...]
>
>> +void numa_add_cpu(int cpu)
>> +{
>> +    cpumask_set_cpu(cpu, &node_to_cpumask[cpu_to_node(cpu)]);
>> +}
>> +
>> +void numa_set_node(int cpu, nodeid_t node)
>> +{
>> +    cpu_to_node[cpu] = node;
>> +}
>> +
>> +EXPORT_SYMBOL(node_to_cpumask);
>> +EXPORT_SYMBOL(memnode_shift);
>> +EXPORT_SYMBOL(memnodemap);
>> +EXPORT_SYMBOL(node_data);
>
>
> Those 4 lines are not part of the original code. Why did you add them?
>
> To ease review I would like to see this patch split multiple one:
>         - multiple small to prepare the code (export function, change the
> type...)
>         - a patch to move the code and only move it. No changes at all in
> it.
OK
>
> [...]
>
>> diff --git a/xen/include/asm-x86/numa.h b/xen/include/asm-x86/numa.h
>> index 2479238..61bcd8e 100644
>> --- a/xen/include/asm-x86/numa.h
>> +++ b/xen/include/asm-x86/numa.h
>> @@ -17,64 +17,17 @@ extern cpumask_t     node_to_cpumask[];
>>  #define node_to_first_cpu(node)  (__ffs(node_to_cpumask[node]))
>>  #define node_to_cpumask(node)    (node_to_cpumask[node])
>>
>> -struct node {
>> -       u64 start,end;
>> -};
>> -
>> -extern int compute_hash_shift(struct node *nodes, int numnodes,
>> -                             nodeid_t *nodeids);
>>  extern nodeid_t pxm_to_node(unsigned int pxm);
>>
>>  #define ZONE_ALIGN (1UL << (MAX_ORDER+PAGE_SHIFT))
>> -#define VIRTUAL_BUG_ON(x)
>>
>> -extern void numa_add_cpu(int cpu);
>>  extern void numa_init_array(void);
>>  extern bool_t numa_off;
>>
>> -
>
>
> Spurious change.
>
>
>>  extern int srat_disabled(void);
>> -extern void numa_set_node(int cpu, nodeid_t node);
>> -extern nodeid_t setup_node(unsigned int pxm);
>>  extern void srat_detect_node(int cpu);
>>
>> -extern void setup_node_bootmem(nodeid_t nodeid, u64 start, u64 end);
>>  extern nodeid_t apicid_to_node[];
>> -extern void init_cpu_to_node(void);
>> -
>> -static inline void clear_node_cpumask(int cpu)
>> -{
>> -       cpumask_clear_cpu(cpu, &node_to_cpumask[cpu_to_node(cpu)]);
>> -}
>> -
>> -/* Simple perfect hash to map pdx to node numbers */
>> -extern int memnode_shift;
>> -extern unsigned long memnodemapsize;
>> -extern u8 *memnodemap;
>> -
>> -struct node_data {
>> -    unsigned long node_start_pfn;
>> -    unsigned long node_spanned_pages;
>> -};
>> -
>> -extern struct node_data node_data[];
>> -
>> -static inline __attribute__((pure)) nodeid_t phys_to_nid(paddr_t addr)
>> -{
>> -       nodeid_t nid;
>> -       VIRTUAL_BUG_ON((paddr_to_pdx(addr) >> memnode_shift) >=
>> memnodemapsize);
>> -       nid = memnodemap[paddr_to_pdx(addr) >> memnode_shift];
>> -       VIRTUAL_BUG_ON(nid >= MAX_NUMNODES || !node_data[nid]);
>> -       return nid;
>> -}
>> -
>> -#define NODE_DATA(nid)         (&(node_data[nid]))
>> -
>> -#define node_start_pfn(nid)    (NODE_DATA(nid)->node_start_pfn)
>> -#define node_spanned_pages(nid)
>> (NODE_DATA(nid)->node_spanned_pages)
>> -#define node_end_pfn(nid)       (NODE_DATA(nid)->node_start_pfn + \
>> -                                NODE_DATA(nid)->node_spanned_pages)
>> -
>>  extern int valid_numa_range(u64 start, u64 end, nodeid_t node);
>>
>>  void srat_parse_regions(u64 addr);
>> diff --git a/xen/include/xen/numa.h b/xen/include/xen/numa.h
>> index 7aef1a8..dd33c92 100644
>> --- a/xen/include/xen/numa.h
>> +++ b/xen/include/xen/numa.h
>> @@ -18,4 +18,58 @@
>>    (((d)->vcpu != NULL && (d)->vcpu[0] != NULL) \
>>     ? vcpu_to_node((d)->vcpu[0]) : NUMA_NO_NODE)
>>
>> +struct node {
>> +       u64 start,end;
>
>
> This contains hard tab. It looks like that asm-x86/numa.h add a mix hard
> tab/soft tab. Can you have a clean-up patch to drop them first?

OK. I will try.
>
>> +};
>> +
>> +struct node_data {
>> +    unsigned long node_start_pfn;
>> +    unsigned long node_spanned_pages;
>> +    nodeid_t      node_id;
>> +};
>> +
>> +#define NODE_DATA(nid)         (&(node_data[nid]))
>
>
> Hard tab.
>
>> +#define VIRTUAL_BUG_ON(x)
>
>
> What is the point to have a BUG_ON that is a nop?

Yes, that is part of the original code. I will drop it as part of the
clean-up patch.

>
>> +
>> +#ifdef CONFIG_NUMA
>> +extern void init_cpu_to_node(void);
>> +
>> +static inline void clear_node_cpumask(int cpu)
>> +{
>> +       cpumask_clear_cpu(cpu, &node_to_cpumask[cpu_to_node(cpu)]);
>
>
> Hard tab.
>
>> +}
>
>
> You move this function in xen/numa.h but this is never used in xen code. It
> would be better to drop it.
>
>> +
>> +/* Simple perfect hash to map pdx to node numbers */
>> +extern int memnode_shift;
>> +extern unsigned long memnodemapsize;
>> +extern u8 *memnodemap;
>> +extern typeof(*memnodemap) _memnodemap[];
>> +
>> +extern struct node_data node_data[];
>> +
>> +static inline __attribute__((pure)) nodeid_t phys_to_nid(paddr_t addr)
>> +{
>> +       nodeid_t nid;
>> +       VIRTUAL_BUG_ON((paddr_to_pdx(addr) >> memnode_shift) >=
>> memnodemapsize);
>> +       nid = memnodemap[paddr_to_pdx(addr) >> memnode_shift];
>> +       VIRTUAL_BUG_ON(nid >= MAX_NUMNODES || !node_data[nid]);
>> +       return nid;
>
>
> Hard tab in all this function.
>
>> +}
>> +
>> +#define node_start_pfn(nid)    (NODE_DATA(nid)->node_start_pfn)
>> +#define node_spanned_pages(nid)
>> (NODE_DATA(nid)->node_spanned_pages)
>> +#define node_end_pfn(nid)       (NODE_DATA(nid)->node_start_pfn + \
>> +                                NODE_DATA(nid)->node_spanned_pages)
>
>
> Same for those 3 macros.
>
>> +
>> +#else
>> +#define init_cpu_to_node() do {} while (0)
>
>
> Please use static inline init_cpu_to_node(void) {}
>
>> +#define clear_node_cpumask(cpu) do {} while (0)
>
>
> Not point of having this one.
>
>> +#endif /* CONFIG_NUMA */
>> +
>> +extern void numa_add_cpu(int cpu);
>> +extern nodeid_t setup_node(unsigned int pxm);
>> +extern void numa_set_node(int cpu, nodeid_t node);
>> +extern void setup_node_bootmem(nodeid_t nodeid, u64 start, u64 end);
>
>
> I am not sure to understand why those function are not protected by #ifdef
> CONFIFG_NUMA.
As these are defined in xen/common/numa.c, which is not under CONFIG_NUMA,
I have kept them outside CONFIG_NUMA.

>
>> +extern int compute_hash_shift(struct node *nodes, int numnodes,
>
>
> The function name is a bit too generic.
>
>> +                             nodeid_t *nodeids);
>>  #endif /* _XEN_NUMA_H */
>>
>
> Cheers,
>
> --
> Julien Grall

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


* Re: [RFC PATCH v1 03/21] NUMA: Move arch specific NUMA code as common
  2017-02-20 12:47   ` Julien Grall
@ 2017-02-22 10:08     ` Vijay Kilari
  2017-02-22 11:07       ` Julien Grall
  0 siblings, 1 reply; 91+ messages in thread
From: Vijay Kilari @ 2017-02-22 10:08 UTC (permalink / raw)
  To: Julien Grall
  Cc: Stefano Stabellini, Andre Przywara, Dario Faggioli,
	Vijaya Kumar K, xen-devel, nd

On Mon, Feb 20, 2017 at 6:17 PM, Julien Grall <julien.grall@arm.com> wrote:
> Hello Vijay,
>
>
> On 09/02/17 15:56, vijay.kilari@gmail.com wrote:
>>
>> From: Vijaya Kumar K <Vijaya.Kumar@cavium.com>
>>
>> Move some common numa code from xen/arch/x86/srat.c
>> to xen/common/numa.c
>>
>> Signed-off-by: Vijaya Kumar K <Vijaya.Kumar@cavium.com>
>> ---
>>  xen/arch/x86/srat.c        | 54
>> ++++-----------------------------------------
>>  xen/common/numa.c          | 55
>> ++++++++++++++++++++++++++++++++++++++++++++++
>>  xen/include/asm-x86/acpi.h |  1 -
>>  xen/include/asm-x86/numa.h |  1 -
>>  xen/include/xen/numa.h     |  5 ++++-
>>  5 files changed, 63 insertions(+), 53 deletions(-)
>>
>> diff --git a/xen/arch/x86/srat.c b/xen/arch/x86/srat.c
>> index d86783e..58dee09 100644
>> --- a/xen/arch/x86/srat.c
>> +++ b/xen/arch/x86/srat.c
>> @@ -25,7 +25,7 @@ static struct acpi_table_slit *__read_mostly acpi_slit;
>>
>>  static nodemask_t memory_nodes_parsed __initdata;
>>  static nodemask_t processor_nodes_parsed __initdata;
>> -static struct node nodes[MAX_NUMNODES] __initdata;
>> +extern struct node nodes[MAX_NUMNODES] __initdata;
>>
>>  struct pxm2node {
>>         unsigned pxm;
>> @@ -36,9 +36,9 @@ static struct pxm2node __read_mostly
>> pxm2node[MAX_NUMNODES] =
>>
>>  static unsigned node_to_pxm(nodeid_t n);
>>
>> -static int num_node_memblks;
>> -static struct node node_memblk_range[NR_NODE_MEMBLKS];
>> -static nodeid_t memblk_nodeid[NR_NODE_MEMBLKS];
>> +extern int num_node_memblks;
>> +extern struct node node_memblk_range[NR_NODE_MEMBLKS];
>> +extern nodeid_t memblk_nodeid[NR_NODE_MEMBLKS];
>>  static __initdata DECLARE_BITMAP(memblk_hotplug, NR_NODE_MEMBLKS);
>>
>>  static inline bool_t node_found(unsigned idx, unsigned pxm)
>> @@ -103,52 +103,6 @@ nodeid_t setup_node(unsigned pxm)
>>         return node;
>>  }
>>
>> -int valid_numa_range(u64 start, u64 end, nodeid_t node)
>> -{
>> -       int i;
>> -
>> -       for (i = 0; i < num_node_memblks; i++) {
>> -               struct node *nd = &node_memblk_range[i];
>> -
>> -               if (nd->start <= start && nd->end > end &&
>> -                       memblk_nodeid[i] == node )
>> -                       return 1;
>> -       }
>> -
>> -       return 0;
>> -}
>> -
>> -static __init int conflicting_memblks(u64 start, u64 end)
>> -{
>> -       int i;
>> -
>> -       for (i = 0; i < num_node_memblks; i++) {
>> -               struct node *nd = &node_memblk_range[i];
>> -               if (nd->start == nd->end)
>> -                       continue;
>> -               if (nd->end > start && nd->start < end)
>> -                       return i;
>> -               if (nd->end == end && nd->start == start)
>> -                       return i;
>> -       }
>> -       return -1;
>> -}
>> -
>> -static __init void cutoff_node(int i, u64 start, u64 end)
>> -{
>> -       struct node *nd = &nodes[i];
>> -       if (nd->start < start) {
>> -               nd->start = start;
>> -               if (nd->end < nd->start)
>> -                       nd->start = nd->end;
>> -       }
>> -       if (nd->end > end) {
>> -               nd->end = end;
>> -               if (nd->start > nd->end)
>> -                       nd->start = nd->end;
>> -       }
>> -}
>> -
>>  static __init void bad_srat(void)
>>  {
>>         int i;
>> diff --git a/xen/common/numa.c b/xen/common/numa.c
>> index 59dcb63..13f147c 100644
>> --- a/xen/common/numa.c
>> +++ b/xen/common/numa.c
>> @@ -46,6 +46,61 @@ nodeid_t cpu_to_node[NR_CPUS] __read_mostly = {
>>
>>  cpumask_t node_to_cpumask[MAX_NUMNODES] __read_mostly;
>>
>> +int num_node_memblks;
>> +struct node node_memblk_range[NR_NODE_MEMBLKS];
>> +nodeid_t memblk_nodeid[NR_NODE_MEMBLKS];
>> +struct node nodes[MAX_NUMNODES] __initdata;
>> +
>> +int valid_numa_range(u64 start, u64 end, nodeid_t node)
>
>
> I am not sure why you move this code in common code when it is not even used
> in your series.
Yes, it is used only by x86, but the code is generic. I will keep it in x86
to avoid further comments on this function.
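
For reference, a standalone sketch of the range check being discussed, with
paddr_t in place of u64 as requested. The typedefs, the table size and the
add_memblk() helper are stand-ins for this example only and are not part of
the series:

```c
#include <stdint.h>

/* Stand-ins for the Xen types; assumptions for this sketch only. */
typedef uint64_t paddr_t;
typedef uint8_t nodeid_t;

#define NR_NODE_MEMBLKS 8   /* arbitrary size for the sketch */

struct node {
    paddr_t start, end;
};

static int num_node_memblks;
static struct node node_memblk_range[NR_NODE_MEMBLKS];
static nodeid_t memblk_nodeid[NR_NODE_MEMBLKS];

/* Hypothetical helper to populate the tables; not in the series. */
static int add_memblk(paddr_t start, paddr_t end, nodeid_t node)
{
    if ( num_node_memblks >= NR_NODE_MEMBLKS )
        return -1;

    node_memblk_range[num_node_memblks].start = start;
    node_memblk_range[num_node_memblks].end = end;
    memblk_nodeid[num_node_memblks] = node;
    num_node_memblks++;

    return 0;
}

/* Same comparison as the quoted code: 1 if a block of 'node' holds the range. */
static int valid_numa_range(paddr_t start, paddr_t end, nodeid_t node)
{
    int i;

    for ( i = 0; i < num_node_memblks; i++ )
    {
        const struct node *nd = &node_memblk_range[i];

        if ( nd->start <= start && nd->end > end &&
             memblk_nodeid[i] == node )
            return 1;
    }

    return 0;
}
```

A range that straddles two blocks is rejected, since no single block
contains it.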

>
> Furthermore, please use paddr_t rather than u64.
>
>> +{
>> +#ifdef CONFIG_NUMA
>
>
> common/numa.c should really not be compiled at all for configuration not
> supporting NUMA. In other words, I really don't want to see #ifdefery in
> common/numa.c.
>
>> +    int i;
>> +
>> +    for (i = 0; i < num_node_memblks; i++) {
>
>
> I know you are moving code around, but fix the coding style before hand
> would have been appreciated.
>
>> +        struct node *nd = &node_memblk_range[i];
>> +
>> +        if (nd->start <= start && nd->end > end &&
>> +            memblk_nodeid[i] == node )
>> +            return 1;
>> +    }
>> +
>> +    return 0;
>> +#else
>> +    return 1;
>> +#endif
>> +}
>> +
>> +__init int conflicting_memblks(u64 start, u64 end)
>
>
> Ditto for u64.
OK
>
>> +{
>> +    int i;
>> +
>> +    for (i = 0; i < num_node_memblks; i++) {
>> +        struct node *nd = &node_memblk_range[i];
>> +        if (nd->start == nd->end)
>> +            continue;
>> +        if (nd->end > start && nd->start < end)
>> +            return i;
>> +        if (nd->end == end && nd->start == start)
>> +            return i;
>> +    }
>> +    return -1;
>> +}
>> +
>> +__init void cutoff_node(int i, u64 start, u64 end)
>
>
> Same remark as above.
OK
>
>
>> +{
>> +    struct node *nd = &nodes[i];
>> +    if (nd->start < start) {
>> +        nd->start = start;
>> +        if (nd->end < nd->start)
>> +            nd->start = nd->end;
>> +    }
>> +    if (nd->end > end) {
>> +        nd->end = end;
>> +        if (nd->start > nd->end)
>> +            nd->start = nd->end;
>> +    }
>> +}
>> +
>>  /*
>>   * Given a shift value, try to populate memnodemap[]
>>   * Returns :
>> diff --git a/xen/include/asm-x86/acpi.h b/xen/include/asm-x86/acpi.h
>> index d36bee9..f1a8e9d 100644
>> --- a/xen/include/asm-x86/acpi.h
>> +++ b/xen/include/asm-x86/acpi.h
>> @@ -106,7 +106,6 @@ extern void acpi_reserve_bootmem(void);
>>
>>  extern s8 acpi_numa;
>>  extern int acpi_scan_nodes(u64 start, u64 end);
>> -#define NR_NODE_MEMBLKS (MAX_NUMNODES*2)
>>
>>  #ifdef CONFIG_ACPI_SLEEP
>>
>> diff --git a/xen/include/asm-x86/numa.h b/xen/include/asm-x86/numa.h
>> index 61bcd8e..df1f7d5 100644
>> --- a/xen/include/asm-x86/numa.h
>> +++ b/xen/include/asm-x86/numa.h
>> @@ -28,7 +28,6 @@ extern int srat_disabled(void);
>>  extern void srat_detect_node(int cpu);
>>
>>  extern nodeid_t apicid_to_node[];
>> -extern int valid_numa_range(u64 start, u64 end, nodeid_t node);
>>
>>  void srat_parse_regions(u64 addr);
>>  extern u8 __node_distance(nodeid_t a, nodeid_t b);
>> diff --git a/xen/include/xen/numa.h b/xen/include/xen/numa.h
>> index dd33c92..810f742 100644
>> --- a/xen/include/xen/numa.h
>> +++ b/xen/include/xen/numa.h
>> @@ -11,7 +11,7 @@
>>  #define NUMA_NO_DISTANCE 0xFF
>>
>>  #define MAX_NUMNODES    (1 << NODES_SHIFT)
>> -
>
>
> Spurious change.
>
>> +#define NR_NODE_MEMBLKS (MAX_NUMNODES*2)
>>  #define vcpu_to_node(v) (cpu_to_node((v)->processor))
>>
>>  #define domain_to_node(d) \
>> @@ -66,6 +66,9 @@ static inline __attribute__((pure)) nodeid_t
>> phys_to_nid(paddr_t addr)
>>  #define clear_node_cpumask(cpu) do {} while (0)
>>  #endif /* CONFIG_NUMA */
>>
>> +extern int valid_numa_range(u64 start, u64 end, nodeid_t node);
>> +extern int conflicting_memblks(u64 start, u64 end);
>> +extern void cutoff_node(int i, u64 start, u64 end);
>>  extern void numa_add_cpu(int cpu);
>>  extern nodeid_t setup_node(unsigned int pxm);
>>  extern void numa_set_node(int cpu, nodeid_t node);
>>
>
> Regards,
>
> --
> Julien Grall


* Re: [RFC PATCH v1 04/21] NUMA: Refactor generic and arch specific code of numa_setup
  2017-02-20 13:39   ` Julien Grall
@ 2017-02-22 10:27     ` Vijay Kilari
  2017-02-22 11:09       ` Julien Grall
  0 siblings, 1 reply; 91+ messages in thread
From: Vijay Kilari @ 2017-02-22 10:27 UTC (permalink / raw)
  To: Julien Grall
  Cc: Stefano Stabellini, Andre Przywara, Dario Faggioli,
	Vijaya Kumar K, xen-devel, nd

On Mon, Feb 20, 2017 at 7:09 PM, Julien Grall <julien.grall@arm.com> wrote:
> Hello Vijay,
>
>
> On 09/02/17 15:56, vijay.kilari@gmail.com wrote:
>>
>> From: Vijaya Kumar K <Vijaya.Kumar@cavium.com>
>>
>> numa_setup() contains generic and arch specific code.
>> Split numa_setup() and move architecture specific code
>> under arch_numa_setup().
>>
>> Signed-off-by: Vijaya Kumar K <Vijaya.Kumar@cavium.com>
>> ---
>>  xen/arch/arm/Makefile      |  1 +
>>  xen/arch/arm/numa.c        | 28 ++++++++++++++++++++++++++++
>>  xen/arch/x86/numa.c        | 11 +----------
>>  xen/common/numa.c          | 14 ++++++++++++++
>>  xen/include/asm-arm/numa.h |  9 ++++++++-
>>  xen/include/asm-x86/numa.h |  2 +-
>>  xen/include/xen/numa.h     |  1 +
>>  7 files changed, 54 insertions(+), 12 deletions(-)
>>
>> diff --git a/xen/arch/arm/Makefile b/xen/arch/arm/Makefile
>> index 7afb8a3..b5d7a19 100644
>> --- a/xen/arch/arm/Makefile
>> +++ b/xen/arch/arm/Makefile
>> @@ -49,6 +49,7 @@ obj-y += vm_event.o
>>  obj-y += vtimer.o
>>  obj-y += vpsci.o
>>  obj-y += vuart.o
>> +obj-$(CONFIG_NUMA) += numa.o
>>
>>  #obj-bin-y += ....o
>>
>> diff --git a/xen/arch/arm/numa.c b/xen/arch/arm/numa.c
>> new file mode 100644
>> index 0000000..59d09c7
>> --- /dev/null
>> +++ b/xen/arch/arm/numa.c
>> @@ -0,0 +1,28 @@
>> +/*
>> + * ARM NUMA Implementation
>> + *
>> + * Copyright (C) 2016 - Cavium Inc.
>> + * Vijaya Kumar K <vijaya.kumar@cavium.com>
>> + *
>> + * This program is free software; you can redistribute it and/or modify
>> + * it under the terms of the GNU General Public License as published by
>> + * the Free Software Foundation; either version 2 of the License, or
>> + * (at your option) any later version.
>
>
> The copyright is wrong.
>
>
>> + *
>> + * This program is distributed in the hope that it will be useful,
>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>> + * GNU General Public License for more details.
>> + */
>> +
>> +#include <xen/init.h>
>> +#include <xen/ctype.h>
>> +#include <xen/mm.h>
>> +#include <xen/nodemask.h>
>> +#include <asm/mm.h>
>> +#include <xen/numa.h>
>> +
>> +int __init arch_numa_setup(char *opt)
>> +{
>> +    return 1;
>> +}
>> diff --git a/xen/arch/x86/numa.c b/xen/arch/x86/numa.c
>> index bc787e0..28d1891 100644
>> --- a/xen/arch/x86/numa.c
>> +++ b/xen/arch/x86/numa.c
>> @@ -18,9 +18,6 @@
>>  #include <xen/sched.h>
>>  #include <xen/softirq.h>
>>
>> -static int numa_setup(char *s);
>> -custom_param("numa", numa_setup);
>> -
>>  #ifndef Dprintk
>>  #define Dprintk(x...)
>>  #endif
>> @@ -34,7 +31,6 @@ nodeid_t apicid_to_node[MAX_LOCAL_APIC] = {
>>
>>  nodemask_t __read_mostly node_online_map = { { [0] = 1UL } };
>>
>> -bool_t numa_off = 0;
>>  s8 acpi_numa = 0;
>>
>>  int srat_disabled(void)
>> @@ -145,13 +141,8 @@ void __init numa_initmem_init(unsigned long
>> start_pfn, unsigned long end_pfn)
>>                      (u64)end_pfn << PAGE_SHIFT);
>>  }
>>
>> -/* [numa=off] */
>> -static __init int numa_setup(char *opt)
>> +int __init arch_numa_setup(char *opt)
>
>
> I don't understand why you split numa_setup. All the options look valid for
> ARM.

OK. These options are all valid for ARM, provided CONFIG_NUMA_EMU is
implemented. They can be moved to generic code, and for now we can keep
CONFIG_NUMA_EMU disabled for ARM.
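
A minimal sketch of the split as it stands in this patch: the generic parser
handles "on"/"off" and hands the string to the arch hook. The arch hook body
here is a placeholder, and const char * is used only to keep the sketch
warning-free; neither is a claim about the final code:

```c
#include <string.h>

/* Mirrors the global in the series; 0 means NUMA is enabled. */
static int numa_off;

/*
 * Placeholder for the arch hook. In the quoted patch this is where the
 * x86 "fake=" handling under CONFIG_NUMA_EMU stays.
 */
static int arch_numa_setup(const char *opt)
{
    (void)opt;
    return 1;
}

/* Generic part of the "numa=" parser, same checks as the quoted patch. */
static int numa_setup(const char *opt)
{
    if ( !strncmp(opt, "off", 3) )
        numa_off = 1;
    if ( !strncmp(opt, "on", 2) )
        numa_off = 0;

    return arch_numa_setup(opt);
}
```

If everything moves to generic code as suggested, the hook disappears and
the "fake=" branch would sit directly in numa_setup() behind the existing
CONFIG_NUMA_EMU guard.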

>
>>  {
>> -    if ( !strncmp(opt,"off",3) )
>> -        numa_off = 1;
>> -    if ( !strncmp(opt,"on",2) )
>> -        numa_off = 0;
>>  #ifdef CONFIG_NUMA_EMU
>>      if ( !strncmp(opt, "fake=", 5) )
>>      {
>> diff --git a/xen/common/numa.c b/xen/common/numa.c
>> index 13f147c..9b9cf9c 100644
>> --- a/xen/common/numa.c
>> +++ b/xen/common/numa.c
>> @@ -32,6 +32,10 @@
>>  #include <xen/softirq.h>
>>  #include <asm/setup.h>
>>
>> +static int numa_setup(char *s);
>
>
> Forward declaration can be avoided in most of the cases. So please add at
> the right place from the beginning.
>
>
>> +custom_param("numa", numa_setup);
>> +
>> +bool_t numa_off = 0;
>>  struct node_data node_data[MAX_NUMNODES];
>>
>>  /* Mapping from pdx to node id */
>> @@ -250,6 +254,16 @@ EXPORT_SYMBOL(memnode_shift);
>>  EXPORT_SYMBOL(memnodemap);
>>  EXPORT_SYMBOL(node_data);
>>
>> +static __init int numa_setup(char *opt)
>> +{
>> +    if ( !strncmp(opt,"off",3) )
>> +        numa_off = 1;
>> +    if ( !strncmp(opt,"on",2) )
>> +        numa_off = 0;
>> +
>> +    return arch_numa_setup(opt);
>> +}
>> +
>>  static void dump_numa(unsigned char key)
>>  {
>>      s_time_t now = NOW();
>> diff --git a/xen/include/asm-arm/numa.h b/xen/include/asm-arm/numa.h
>> index a60c7eb..c1e8a7d 100644
>> --- a/xen/include/asm-arm/numa.h
>> +++ b/xen/include/asm-arm/numa.h
>> @@ -5,7 +5,14 @@ typedef u8 nodeid_t;
>>
>>  #define NODES_SHIFT 2
>>
>> -#ifndef CONFIG_NUMA
>> +#ifdef CONFIG_NUMA
>> +int arch_numa_setup(char *opt);
>> +#else
>> +static inline int arch_numa_setup(char *opt)
>> +{
>> +    return 1;
>> +}
>
>
> What is the point of this?
>
>
>> +
>>  /* Fake one node for now. See also node_online_map. */
>>  #define cpu_to_node(cpu) 0
>>  #define node_to_cpumask(node)   (cpu_online_map)
>> diff --git a/xen/include/asm-x86/numa.h b/xen/include/asm-x86/numa.h
>> index df1f7d5..659ff6a 100644
>> --- a/xen/include/asm-x86/numa.h
>> +++ b/xen/include/asm-x86/numa.h
>> @@ -22,7 +22,6 @@ extern nodeid_t pxm_to_node(unsigned int pxm);
>>  #define ZONE_ALIGN (1UL << (MAX_ORDER+PAGE_SHIFT))
>>
>>  extern void numa_init_array(void);
>> -extern bool_t numa_off;
>>
>>  extern int srat_disabled(void);
>>  extern void srat_detect_node(int cpu);
>> @@ -32,5 +31,6 @@ extern nodeid_t apicid_to_node[];
>>  void srat_parse_regions(u64 addr);
>>  extern u8 __node_distance(nodeid_t a, nodeid_t b);
>>  unsigned int arch_get_dma_bitsize(void);
>> +int arch_numa_setup(char *opt);
>>
>>  #endif
>> diff --git a/xen/include/xen/numa.h b/xen/include/xen/numa.h
>> index 810f742..77c5cfd 100644
>> --- a/xen/include/xen/numa.h
>> +++ b/xen/include/xen/numa.h
>> @@ -18,6 +18,7 @@
>>    (((d)->vcpu != NULL && (d)->vcpu[0] != NULL) \
>>     ? vcpu_to_node((d)->vcpu[0]) : NUMA_NO_NODE)
>>
>> +extern bool_t numa_off;
>
>
> The place you added looks a bit random. And would be surprised to see a need
> when NUMA is not enabled.

Ok. I will check.

>
>>  struct node {
>>         u64 start,end;
>>  };
>>
>
> Regards,
>
> --
> Julien Grall


* Re: [RFC PATCH v1 06/21] ARM: NUMA: Parse CPU NUMA information
  2017-02-20 17:32   ` Julien Grall
@ 2017-02-22 10:46     ` Vijay Kilari
  2017-02-22 11:10       ` Julien Grall
  0 siblings, 1 reply; 91+ messages in thread
From: Vijay Kilari @ 2017-02-22 10:46 UTC (permalink / raw)
  To: Julien Grall
  Cc: Stefano Stabellini, Andre Przywara, Dario Faggioli,
	Vijaya Kumar K, xen-devel, nd

On Mon, Feb 20, 2017 at 11:02 PM, Julien Grall <julien.grall@arm.com> wrote:
> Hello Vijay,
>
> On 09/02/17 15:56, vijay.kilari@gmail.com wrote:
>>
>> From: Vijaya Kumar K <Vijaya.Kumar@cavium.com>
>>
>> Parse CPU node and fetch numa-node-id information.
>> For each node-id found, update nodemask_t mask.
>
>
> A link to the bindings would have been useful...
>
>> Call numa_init() from setup_mm() with start and end
>> pfn of the complete ram..
>
>
> s/.././
>
>
>> Signed-off-by: Vijaya Kumar K <Vijaya.Kumar@cavium.com>
>> ---
>>  xen/arch/arm/Makefile         |  1 +
>>  xen/arch/arm/bootfdt.c        |  8 ++---
>>  xen/arch/arm/dt_numa.c        | 72
>> +++++++++++++++++++++++++++++++++++++++++++
>>  xen/arch/arm/numa.c           | 14 +++++++++
>>  xen/arch/arm/setup.c          |  3 ++
>>  xen/include/asm-arm/numa.h    | 14 +++++++++
>>  xen/include/xen/device_tree.h |  4 +++
>>  7 files changed, 112 insertions(+), 4 deletions(-)
>>
>> diff --git a/xen/arch/arm/Makefile b/xen/arch/arm/Makefile
>> index b5d7a19..7694485 100644
>> --- a/xen/arch/arm/Makefile
>> +++ b/xen/arch/arm/Makefile
>> @@ -50,6 +50,7 @@ obj-y += vtimer.o
>>  obj-y += vpsci.o
>>  obj-y += vuart.o
>>  obj-$(CONFIG_NUMA) += numa.o
>> +obj-$(CONFIG_NUMA) += dt_numa.o
>
>
> I would prefer if we introduce a directly numa within arm. This would make
> the root cleaner.

As it is very specific to DT, I have introduced it in this file. ACPI is
implemented in a separate file. Common ARM-specific NUMA code is in numa.c.

>
>
>>
>>  #obj-bin-y += ....o
>>
>> diff --git a/xen/arch/arm/bootfdt.c b/xen/arch/arm/bootfdt.c
>> index 979f675..d1122d8 100644
>> --- a/xen/arch/arm/bootfdt.c
>> +++ b/xen/arch/arm/bootfdt.c
>> @@ -17,8 +17,8 @@
>>  #include <xsm/xsm.h>
>>  #include <asm/setup.h>
>>
>> -static bool_t __init device_tree_node_matches(const void *fdt, int node,
>> -                                              const char *match)
>> +bool_t __init device_tree_node_matches(const void *fdt, int node,
>> +                                       const char *match)
>>  {
>>      const char *name;
>>      size_t match_len;
>> @@ -63,8 +63,8 @@ static void __init device_tree_get_reg(const __be32
>> **cell, u32 address_cells,
>>      *size = dt_next_cell(size_cells, cell);
>>  }
>>
>> -static u32 __init device_tree_get_u32(const void *fdt, int node,
>> -                                      const char *prop_name, u32 dflt)
>> +u32 __init device_tree_get_u32(const void *fdt, int node,
>> +                               const char *prop_name, u32 dflt)
>>  {
>>      const struct fdt_property *prop;
>>
>> diff --git a/xen/arch/arm/dt_numa.c b/xen/arch/arm/dt_numa.c
>> new file mode 100644
>> index 0000000..4b94c36
>> --- /dev/null
>> +++ b/xen/arch/arm/dt_numa.c
>> @@ -0,0 +1,72 @@
>> +/*
>> + * OF NUMA Parsing support.
>> + *
>> + * Copyright (C) 2015 - 2016 Cavium Inc.
>> + *
>> + * Some code extracts are taken from linux drivers/of/of_numa.c
>
>
> Please specify which code has been extract from Linux and the commit id.
>
> Looking at this patch only, none of this code is from Linux. Or it has been
> heavily modified.
It is heavily modified. I will drop this statement.

>
>> + *
>> + * This program is free software; you can redistribute it and/or modify
>> + * it under the terms of the GNU General Public License version 2 as
>> + * published by the Free Software Foundation.
>> + *
>> + * This program is distributed in the hope that it will be useful,
>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>> + * GNU General Public License for more details.
>> + *
>> + * You should have received a copy of the GNU General Public License
>> + * along with this program.  If not, see <http://www.gnu.org/licenses/>.
>> + */
>> +
>> +#include <xen/config.h>
>
>
> No need to include xen/config.h
>
>> +#include <xen/device_tree.h>
>
>
> This include should not be there as the device tree is not yet parsed.
>
>> +#include <xen/libfdt/libfdt.h>
>> +#include <xen/mm.h>
>> +#include <xen/nodemask.h>
>> +#include <asm/mm.h>
>> +#include <xen/numa.h>
>> +
>> +nodemask_t numa_nodes_parsed;
>
>
> Why does this variable live in dt_numa.c when this is used by common and
> acpi code?
>
> Also, any variable/function exported should have there prototype in the
> associated header.

ok. will check.
>
>> +
>> +/*
>> + * Even though we connect cpus to numa domains later in SMP
>> + * init, we need to know the node ids now for all cpus.
>> +*/
>
>
> Coding style:
>
> /*
>  * ...
>
>  */
>
>> +static int __init dt_numa_process_cpu_node(const void *fdt, int node,
>> +                                           const char *name,
>> +                                           u32 address_cells, u32
>> size_cells)
>> +{
>> +    u32 nid;
>> +
>> +    nid = device_tree_get_u32(fdt, node, "numa-node-id", MAX_NUMNODES);
>> +
>> +    if ( nid >= MAX_NUMNODES )
>> +        printk(XENLOG_WARNING "NUMA: Node id %u exceeds maximum value\n",
>> nid);
>> +    else
>> +        node_set(nid, numa_nodes_parsed);
>> +
>> +    return 0;
>> +}
>> +
>> +static int __init dt_numa_scan_cpu_node(const void *fdt, int node,
>> +                                        const char *name, int depth,
>> +                                        u32 address_cells, u32
>> size_cells,
>> +                                        void *data)
>> +
>> +{
>> +    if ( device_tree_node_matches(fdt, node, "cpu") )
>> +        return dt_numa_process_cpu_node(fdt, node, name, address_cells,
>> +                                        size_cells);
>
>
> This code is wrong. CPUs nodes can only be in /cpus and you cannot rely on
> the name to be "cpu" (see binding in
> Documentation/devicetree/bindings/arm/cpu.txt). The only way to check if it
> is a CPU is to look for the property device_type.
OK
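
A rough sketch of the device_type test, assuming the property value and its
length have already been fetched (for example with libfdt's fdt_getprop()).
The helper name dt_node_is_cpu() is made up for this example, and the check
that the node sits under /cpus is not covered here:

```c
#include <string.h>

/*
 * Decide whether a node is a CPU from the raw value of its "device_type"
 * property. 'prop' and 'len' are what a property lookup hands back
 * (NULL and a negative length if the property is absent).
 */
static int dt_node_is_cpu(const char *prop, int len)
{
    static const char cpu[] = "cpu";

    /* The value is a NUL-terminated string, so the length counts the NUL. */
    if ( prop == NULL || len != (int)sizeof(cpu) )
        return 0;

    return memcmp(prop, cpu, sizeof(cpu)) == 0;
}
```

Comparing the full length avoids matching a longer value such as "cpus"
by prefix.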
>
>> +
>> +    return 0;
>> +}
>> +
>> +int __init dt_numa_init(void)
>> +{
>> +    int ret;
>> +
>> +    nodes_clear(numa_nodes_parsed);
>
>
> Why this is done in dt_numa_init and not numa_init?
ok.
>
>
>> +    ret = device_tree_for_each_node((void *)device_tree_flattened,
>> +                                    dt_numa_scan_cpu_node, NULL);
>> +    return ret;
>> +}
>> diff --git a/xen/arch/arm/numa.c b/xen/arch/arm/numa.c
>> index 59d09c7..9a7f0bb 100644
>> --- a/xen/arch/arm/numa.c
>> +++ b/xen/arch/arm/numa.c
>> @@ -21,6 +21,20 @@
>>  #include <xen/nodemask.h>
>>  #include <asm/mm.h>
>>  #include <xen/numa.h>
>> +#include <asm/acpi.h>
>> +
>> +int __init numa_init(void)
>> +{
>> +    int ret = 0;
>> +
>> +    if ( numa_off )
>> +        goto no_numa;
>> +
>> +    ret = dt_numa_init();
>> +
>> +no_numa:
>> +    return ret;
>> +}
>>
>>  int __init arch_numa_setup(char *opt)
>>  {
>> diff --git a/xen/arch/arm/setup.c b/xen/arch/arm/setup.c
>> index 049e449..b6618ca 100644
>> --- a/xen/arch/arm/setup.c
>> +++ b/xen/arch/arm/setup.c
>> @@ -39,6 +39,7 @@
>>  #include <xen/libfdt/libfdt.h>
>>  #include <xen/acpi.h>
>>  #include <asm/alternative.h>
>> +#include <xen/numa.h>
>>  #include <asm/page.h>
>>  #include <asm/current.h>
>>  #include <asm/setup.h>
>> @@ -753,6 +754,8 @@ void __init start_xen(unsigned long boot_phys_offset,
>>      /* Parse the ACPI tables for possible boot-time configuration */
>>      acpi_boot_table_init();
>>
>> +    numa_init();
>
>
> I see very little point to have a function return a value but never check
> it. If the return does not matter, then the function should be void.

ok
>
>> +
>>      end_boot_allocator();
>>
>>      vm_init();
>> diff --git a/xen/include/asm-arm/numa.h b/xen/include/asm-arm/numa.h
>> index c1e8a7d..cdfeecd 100644
>> --- a/xen/include/asm-arm/numa.h
>> +++ b/xen/include/asm-arm/numa.h
>> @@ -1,18 +1,32 @@
>>  #ifndef __ARCH_ARM_NUMA_H
>>  #define __ARCH_ARM_NUMA_H
>>
>> +#include <xen/errno.h>
>> +
>>  typedef u8 nodeid_t;
>>
>>  #define NODES_SHIFT 2
>>
>>  #ifdef CONFIG_NUMA
>>  int arch_numa_setup(char *opt);
>> +int __init numa_init(void);
>
>
> Please drop __init, it should only be only on the declaration.
ok
>
>> +int __init dt_numa_init(void);
>
>
> Ditto.
>
>>  #else
>>  static inline int arch_numa_setup(char *opt)
>>  {
>>      return 1;
>>  }
>>
>> +static inline int __init numa_init(void)
>> +{
>> +    return 0;
>> +}
>> +
>> +static inline int __init dt_numa_init(void)
>> +{
>> +    return -EINVAL;
>> +}
>
>
> This wrapper should never be called when NUMA is disabled. So please drop
> it.
ok
>
>> +
>>  /* Fake one node for now. See also node_online_map. */
>>  #define cpu_to_node(cpu) 0
>>  #define node_to_cpumask(node)   (cpu_online_map)
>> diff --git a/xen/include/xen/device_tree.h b/xen/include/xen/device_tree.h
>> index 0aecbe0..de6b351 100644
>> --- a/xen/include/xen/device_tree.h
>> +++ b/xen/include/xen/device_tree.h
>> @@ -188,6 +188,10 @@ int device_tree_for_each_node(const void *fdt,
>>                                       device_tree_node_func func,
>>                                       void *data);
>>
>> +bool_t device_tree_node_matches(const void *fdt, int node,
>> +                                const char *match);
>> +u32 device_tree_get_u32(const void *fdt, int node,
>> +                        const char *prop_name, u32 dflt);
>
>
> Please don't mix common and arch code. Those functions are arch specific.
> This should be defined in asm/setup.h.
ok
>
>>  /**
>>   * dt_unflatten_host_device_tree - Unflatten the host device tree
>>   *
>>
>
> Regards,
>
> --
> Julien Grall


* Re: [RFC PATCH v1 01/21] ARM: NUMA: Add existing ARM numa code under CONFIG_NUMA
  2017-02-22  9:18     ` Vijay Kilari
@ 2017-02-22 10:49       ` Julien Grall
  0 siblings, 0 replies; 91+ messages in thread
From: Julien Grall @ 2017-02-22 10:49 UTC (permalink / raw)
  To: Vijay Kilari
  Cc: Stefano Stabellini, Andre Przywara, Dario Faggioli,
	Vijaya Kumar K, xen-devel, nd

Hello Vijay,

On 22/02/17 09:18, Vijay Kilari wrote:
> On Mon, Feb 20, 2017 at 5:09 PM, Julien Grall <julien.grall@arm.com> wrote:
>> Hello Vijay,
>>
>> On 09/02/17 15:56, vijay.kilari@gmail.com wrote:
>>>
>>> From: Vijaya Kumar K <Vijaya.Kumar@cavium.com>
>>>
>>> Right not CONFIG_NUMA is not enabled for ARM and
>>
>>
>> s/not/now/
>>
>>> existing code in asm-arm/numa.h is for !COFIG_NUMA.
>>
>>
>> s/COFIG_NUMA/CONFIG_NUMA/
>>
>>> Hence put this code under #ifndef CONFIG_NUMA.
>>>
>>> This help to make this changes work when CONFIG_NUMA
>>> is not enabled.
>>>
>>
>> Is there any reason to let the choice to the user to enable/disable NUMA?
>
> I see no reason. Atleast in case of x86, I see it is enabled always.

Stefano, do you have any opinion on whether a user should be able to 
enable/disable NUMA?

>>
>>> Also define NODES_SHIFT macro for ARM.
>>>
>>> Signed-off-by: Vijaya Kumar K <Vijaya.Kumar@cavium.com>
>>> ---
>>>  xen/include/asm-arm/numa.h | 4 ++++
>>>  1 file changed, 4 insertions(+)
>>>
>>> diff --git a/xen/include/asm-arm/numa.h b/xen/include/asm-arm/numa.h
>>> index a2c1a34..a60c7eb 100644
>>> --- a/xen/include/asm-arm/numa.h
>>> +++ b/xen/include/asm-arm/numa.h
>>> @@ -3,6 +3,9 @@
>>>
>>>  typedef u8 nodeid_t;
>>>
>>> +#define NODES_SHIFT 2
>>
>>
>> Why 2?
>
> Just to support upto 4 node system.

Why are only 4 nodes supported? Also, this should be specified in the commit
message.
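
To spell out the arithmetic behind the limit, using the macros as they appear
in the series (node_id_in_range() is a name invented for this sketch):

```c
/* Definitions as quoted from asm-arm/numa.h and xen/numa.h in the series. */
#define NODES_SHIFT     2
#define MAX_NUMNODES    (1 << NODES_SHIFT)
#define NR_NODE_MEMBLKS (MAX_NUMNODES*2)

/* Hypothetical helper: valid node ids are 0 .. MAX_NUMNODES - 1. */
static int node_id_in_range(unsigned int nid)
{
    return nid < MAX_NUMNODES;
}
```

With NODES_SHIFT set to 2, a node id of 4 or above is rejected, and the
memory block tables are sized for 8 entries.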

Regards,

-- 
Julien Grall


* Re: [RFC PATCH v1 02/21] x86: NUMA: Refactor NUMA code
  2017-02-22 10:04     ` Vijay Kilari
@ 2017-02-22 10:55       ` Julien Grall
  0 siblings, 0 replies; 91+ messages in thread
From: Julien Grall @ 2017-02-22 10:55 UTC (permalink / raw)
  To: Vijay Kilari
  Cc: Stefano Stabellini, Andre Przywara, Dario Faggioli,
	Vijaya Kumar K, xen-devel, nd

Hello Vijay,

On 22/02/17 10:04, Vijay Kilari wrote:
> On Mon, Feb 20, 2017 at 6:07 PM, Julien Grall <julien.grall@arm.com> wrote:
>>
>>>  {
>>>      int rr, i;
>>> @@ -288,16 +145,6 @@ void __init numa_initmem_init(unsigned long
>>> start_pfn, unsigned long end_pfn)
>>>                      (u64)end_pfn << PAGE_SHIFT);
>>>  }
>>>
>>> -void numa_add_cpu(int cpu)
>>> -{
>>> -    cpumask_set_cpu(cpu, &node_to_cpumask[cpu_to_node(cpu)]);
>>> -}
>>> -
>>> -void numa_set_node(int cpu, nodeid_t node)
>>> -{
>>> -    cpu_to_node[cpu] = node;
>>> -}
>>> -
>>>  /* [numa=off] */
>>>  static __init int numa_setup(char *opt)
>>
>>
>> Same question here. Everything in numa_setup and the fake NUMA looks
>> valid for ARM too.
>
> numa_setup() is taken in separate patch. fake numa case is not considered.
> for x86 it is under separate config CONFIG_NUMA_EMU.

Why is fake NUMA not considered? This would help to test the series on
non-NUMA platforms.

>>
>> [....]
>>
>>> diff --git a/xen/common/Makefile b/xen/common/Makefile
>>> index 0fed30b..c1bd2ff 100644
>>> --- a/xen/common/Makefile
>>> +++ b/xen/common/Makefile
>>> @@ -63,6 +63,7 @@ obj-y += wait.o
>>>  obj-bin-y += warning.init.o
>>>  obj-$(CONFIG_XENOPROF) += xenoprof.o
>>>  obj-y += xmalloc_tlsf.o
>>> +obj-y += numa.o
>>
>>
>> This needs to be gated with CONFIG_NUMA.
>
> As it is shared with x86 and prior this changes it is not gated under
> any config for x86.

Because the x86 code assumes that CONFIG_NUMA is always enabled. If you look
at the .config, CONFIG_NUMA will be set, so you can gate it here.

>> Looking at this comment, it looks like the NUMA support should depend on
>> HAS_PDX as this is not something that may be able on all the architecture.
>
> yes it uses pfn_to_pdx() while updating memnodemap.
> May be comment should be suffice.

A comment may be missed; gating it in Kconfig will prevent a new
architecture from using NUMA without knowing that PDX is necessary.

[...]

>>> +#endif /* CONFIG_NUMA */
>>> +
>>> +extern void numa_add_cpu(int cpu);
>>> +extern nodeid_t setup_node(unsigned int pxm);
>>> +extern void numa_set_node(int cpu, nodeid_t node);
>>> +extern void setup_node_bootmem(nodeid_t nodeid, u64 start, u64 end);
>>
>>
>> I am not sure I understand why those functions are not protected by #ifdef
>> CONFIG_NUMA.
> As these defined in xen/common/numa.c which is not under CONFIG_NUMA,
> I have kept them outside CONFIG_NUMA

xen/common/numa.c should be under CONFIG_NUMA. There is no reason to
expose the code on non-NUMA platforms.
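
To illustrate the gating pattern, here is a minimal standalone sketch:
real definitions when CONFIG_NUMA is set, no-op inline stubs otherwise.
It is not the Xen header. NR_CPUS is a toy value and numa_cpu_node() is a
hypothetical accessor introduced only for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for Xen's nodeid_t, assumed here to be an 8-bit type. */
typedef uint8_t nodeid_t;

#define NR_CPUS 4 /* toy value for the sketch */

/* Build with -DCONFIG_NUMA to get the real implementation. */
#ifdef CONFIG_NUMA

static nodeid_t cpu_to_node[NR_CPUS];

static inline void numa_set_node(unsigned int cpu, nodeid_t node)
{
    assert(cpu < NR_CPUS);
    cpu_to_node[cpu] = node;
}

/* Hypothetical accessor, named for illustration only. */
static inline nodeid_t numa_cpu_node(unsigned int cpu)
{
    assert(cpu < NR_CPUS);
    return cpu_to_node[cpu];
}

#else /* !CONFIG_NUMA */

/* Non-NUMA build: setters are no-ops and every CPU is on node 0. */
static inline void numa_set_node(unsigned int cpu, nodeid_t node)
{
    (void)cpu;
    (void)node;
}

static inline nodeid_t numa_cpu_node(unsigned int cpu)
{
    (void)cpu;
    return 0;
}

#endif /* CONFIG_NUMA */
```

With stubs like these, callers do not need their own #ifdef, and nothing
NUMA-specific is compiled into a non-NUMA build.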

Regards,

-- 
Julien Grall

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [RFC PATCH v1 03/21] NUMA: Move arch specific NUMA code as common
  2017-02-22 10:08     ` Vijay Kilari
@ 2017-02-22 11:07       ` Julien Grall
  0 siblings, 0 replies; 91+ messages in thread
From: Julien Grall @ 2017-02-22 11:07 UTC (permalink / raw)
  To: Vijay Kilari
  Cc: Stefano Stabellini, Andre Przywara, Dario Faggioli,
	Vijaya Kumar K, xen-devel, nd

Hello Vijay,

On 22/02/17 10:08, Vijay Kilari wrote:
> On Mon, Feb 20, 2017 at 6:17 PM, Julien Grall <julien.grall@arm.com> wrote:
>> On 09/02/17 15:56, vijay.kilari@gmail.com wrote:
>>> diff --git a/xen/common/numa.c b/xen/common/numa.c
>>> index 59dcb63..13f147c 100644
>>> --- a/xen/common/numa.c
>>> +++ b/xen/common/numa.c
>>> @@ -46,6 +46,61 @@ nodeid_t cpu_to_node[NR_CPUS] __read_mostly = {
>>>
>>>  cpumask_t node_to_cpumask[MAX_NUMNODES] __read_mostly;
>>>
>>> +int num_node_memblks;
>>> +struct node node_memblk_range[NR_NODE_MEMBLKS];
>>> +nodeid_t memblk_nodeid[NR_NODE_MEMBLKS];
>>> +struct node nodes[MAX_NUMNODES] __initdata;
>>> +
>>> +int valid_numa_range(u64 start, u64 end, nodeid_t node)
>>
>>
>> I am not sure why you move this code in common code when it is not even used
>> in your series.
> Yes, it is used only by x86 but the code is generic. I will keep it in x86
> to avoid further comments on this function.

I was only asking why you moved the code, not saying it should not be
moved...

Anyway, I am ok with the move. But a bit more description in the commit 
message would have been appreciated.

Cheers,

Julien Grall

^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [RFC PATCH v1 04/21] NUMA: Refactor generic and arch specific code of numa_setup
  2017-02-22 10:27     ` Vijay Kilari
@ 2017-02-22 11:09       ` Julien Grall
  0 siblings, 0 replies; 91+ messages in thread
From: Julien Grall @ 2017-02-22 11:09 UTC (permalink / raw)
  To: Vijay Kilari
  Cc: Stefano Stabellini, Andre Przywara, Dario Faggioli,
	Vijaya Kumar K, xen-devel, nd

Hello Vijay,

On 22/02/17 10:27, Vijay Kilari wrote:
> On Mon, Feb 20, 2017 at 7:09 PM, Julien Grall <julien.grall@arm.com> wrote:
>> On 09/02/17 15:56, vijay.kilari@gmail.com wrote:
>>> @@ -145,13 +141,8 @@ void __init numa_initmem_init(unsigned long
>>> start_pfn, unsigned long end_pfn)
>>>                      (u64)end_pfn << PAGE_SHIFT);
>>>  }
>>>
>>> -/* [numa=off] */
>>> -static __init int numa_setup(char *opt)
>>> +int __init arch_numa_setup(char *opt)
>>
>>
>> I don't understand why you split numa_setup. All the options look valid for
>> ARM.
>
> OK.  This is all valid for arm, provided CONFIG_NUMA_EMU is implemented.
> Can be moved to generic and for now we can keep CONFIG_NUMA_EMU
> disabled for arm.

Why do you want to keep CONFIG_NUMA_EMU disabled on ARM? It is really
useful for testing NUMA code on non-NUMA platforms.
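
As a rough illustration of what NUMA emulation provides, the sketch below
splits one physical range into equally sized fake nodes. It is a
simplified standalone example, not the x86 CONFIG_NUMA_EMU code;
fake_numa_split() and the toy MAX_NUMNODES value are assumptions made for
the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_NUMNODES 8 /* toy value for the sketch */

struct node {
    uint64_t start, end; /* half-open range [start, end) */
};

/*
 * Split [start, end) into nr_fake equally sized fake nodes; the last
 * node absorbs any remainder. Returns the number of nodes created, or
 * -1 on invalid input.
 */
static int fake_numa_split(uint64_t start, uint64_t end,
                           unsigned int nr_fake, struct node *nodes)
{
    uint64_t size;
    unsigned int i;

    assert(nodes != NULL);

    if ( nr_fake == 0 || nr_fake > MAX_NUMNODES || end <= start )
        return -1;

    size = (end - start) / nr_fake;
    if ( size == 0 )
        return -1;

    for ( i = 0; i < nr_fake; i++ )
    {
        nodes[i].start = start + i * size;
        nodes[i].end = (i == nr_fake - 1) ? end : nodes[i].start + size;
    }

    return (int)nr_fake;
}
```

On a single-node board, a split like this would let the node lookup and
allocation paths see more than one node.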

Regards,

-- 
Julien Grall

^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [RFC PATCH v1 06/21] ARM: NUMA: Parse CPU NUMA information
  2017-02-22 10:46     ` Vijay Kilari
@ 2017-02-22 11:10       ` Julien Grall
  0 siblings, 0 replies; 91+ messages in thread
From: Julien Grall @ 2017-02-22 11:10 UTC (permalink / raw)
  To: Vijay Kilari
  Cc: Stefano Stabellini, Andre Przywara, Dario Faggioli,
	Vijaya Kumar K, xen-devel, nd

Hello Vijay,

On 22/02/17 10:46, Vijay Kilari wrote:
> On Mon, Feb 20, 2017 at 11:02 PM, Julien Grall <julien.grall@arm.com> wrote:
>> On 09/02/17 15:56, vijay.kilari@gmail.com wrote:
>>>
>>> From: Vijaya Kumar K <Vijaya.Kumar@cavium.com>
>>>
>>> Parse CPU node and fetch numa-node-id information.
>>> For each node-id found, update nodemask_t mask.
>>
>>
>> A link to the bindings would have been useful...
>>
>>> Call numa_init() from setup_mm() with start and end
>>> pfn of the complete ram..
>>
>>
>> s/.././
>>
>>
>>> Signed-off-by: Vijaya Kumar K <Vijaya.Kumar@cavium.com>
>>> ---
>>>  xen/arch/arm/Makefile         |  1 +
>>>  xen/arch/arm/bootfdt.c        |  8 ++---
>>>  xen/arch/arm/dt_numa.c        | 72
>>> +++++++++++++++++++++++++++++++++++++++++++
>>>  xen/arch/arm/numa.c           | 14 +++++++++
>>>  xen/arch/arm/setup.c          |  3 ++
>>>  xen/include/asm-arm/numa.h    | 14 +++++++++
>>>  xen/include/xen/device_tree.h |  4 +++
>>>  7 files changed, 112 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/xen/arch/arm/Makefile b/xen/arch/arm/Makefile
>>> index b5d7a19..7694485 100644
>>> --- a/xen/arch/arm/Makefile
>>> +++ b/xen/arch/arm/Makefile
>>> @@ -50,6 +50,7 @@ obj-y += vtimer.o
>>>  obj-y += vpsci.o
>>>  obj-y += vuart.o
>>>  obj-$(CONFIG_NUMA) += numa.o
>>> +obj-$(CONFIG_NUMA) += dt_numa.o
>>
>>
>> I would prefer if we introduce a directly numa within arm. This would make
>> the root cleaner.
>
> As it is very specific to DT, I have introduced it in this file. ACPI is
> implemented in a separate file. Common ARM-specific NUMA code is in numa.c.

Sorry, I meant a separate directory, not "directly".

Regards,

-- 
Julien Grall

^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [RFC PATCH v1 08/21] ARM: NUMA: Parse NUMA distance information
  2017-02-20 18:28   ` Julien Grall
@ 2017-02-22 11:38     ` Vijay Kilari
  2017-02-22 11:44       ` Julien Grall
  0 siblings, 1 reply; 91+ messages in thread
From: Vijay Kilari @ 2017-02-22 11:38 UTC (permalink / raw)
  To: Julien Grall
  Cc: Stefano Stabellini, Andre Przywara, Dario Faggioli,
	Vijaya Kumar K, xen-devel, nd

On Mon, Feb 20, 2017 at 11:58 PM, Julien Grall <julien.grall@arm.com> wrote:
> Hello Vijay,
>
> On 09/02/17 15:57, vijay.kilari@gmail.com wrote:
>>
>> From: Vijaya Kumar K <Vijaya.Kumar@cavium.com>
>>
>> Parse distance-matrix and fetch node distance information.
>> Store distance information in node_distance[].
>>
>> Signed-off-by: Vijaya Kumar K <Vijaya.Kumar@cavium.com>
>> ---
>>  xen/arch/arm/dt_numa.c     | 90
>> ++++++++++++++++++++++++++++++++++++++++++++++
>>  xen/arch/arm/numa.c        | 19 +++++++++-
>>  xen/include/asm-arm/numa.h |  1 +
>>  3 files changed, 109 insertions(+), 1 deletion(-)
>>
>> diff --git a/xen/arch/arm/dt_numa.c b/xen/arch/arm/dt_numa.c
>> index fce9e67..8979612 100644
>> --- a/xen/arch/arm/dt_numa.c
>> +++ b/xen/arch/arm/dt_numa.c
>> @@ -28,6 +28,19 @@
>>
>>  nodemask_t numa_nodes_parsed;
>>  extern struct node node_memblk_range[NR_NODE_MEMBLKS];
>> +extern int _node_distance[MAX_NUMNODES * 2];
>> +extern int *node_distance;
>
>
> I don't like this idea of having _node_distance and node_distance. Looking
> at the code, I see little point to do that. You could just initialize
> node_distance with the correct value.
>
> Also the node distance can fit in u8, so you can save memory by using u8.

u8 might restrict the distance value
>
> Lastly, I am not sure why you pre-allocate the memory. The distance table
> could be quite big.
ok. will do malloc
>
>> +
>> +static int numa_set_distance(u32 nodea, u32 nodeb, u32 distance)
>
>
> Please avoid the use of u32 in favor of uint32_t.
>
> Also, this function does not look very DT specific.
>
>> +{
>> +   if ( nodea >= MAX_NUMNODES || nodeb >= MAX_NUMNODES )
>> +       return -EINVAL;
>> +
>
>
> I would have expected some sanity check here.
>
>
>> +   _node_distance[(nodea * MAX_NUMNODES) + nodeb] = distance;
>> +   node_distance = &_node_distance[0];
>> +
>> +   return 0;
>> +}
>>
>>  /*
>>   * Even though we connect cpus to numa domains later in SMP
>> @@ -112,6 +125,66 @@ static int __init dt_numa_process_memory_node(const
>> void *fdt, int node,
>>      return 0;
>>  }
>>
>> +static int __init dt_numa_parse_distance_map(const void *fdt, int node,
>> +                                             const char *name,
>> +                                             u32 address_cells,
>> +                                             u32 size_cells)
>> +{
>> +    const struct fdt_property *prop;
>> +    const __be32 *matrix;
>> +    int entry_count, len, i;
>> +
>> +    printk(XENLOG_INFO "NUMA: parsing numa-distance-map\n");
>> +
>> +    prop = fdt_get_property(fdt, node, "distance-matrix", &len);
>> +    if ( !prop )
>> +    {
>> +        printk(XENLOG_WARNING
>> +               "NUMA: No distance-matrix property in distance-map\n");
>> +
>> +        return -EINVAL;
>> +    }
>> +
>> +    if ( len % sizeof(u32) != 0 )
>> +    {
>> +         printk(XENLOG_WARNING
>> +                "distance-matrix in node is not a multiple of u32\n");
>> +
>> +        return -EINVAL;
>> +    }
>> +
>> +    entry_count = len / sizeof(u32);
>> +    if ( entry_count <= 0 )
>> +    {
>> +        printk(XENLOG_WARNING "NUMA: Invalid distance-matrix\n");
>> +
>> +        return -EINVAL;
>> +    }
>> +
>> +    matrix = (const __be32 *)prop->data;
>> +    for ( i = 0; i + 2 < entry_count; i += 3 )
>> +    {
>> +        u32 nodea, nodeb, distance;
>> +
>> +        nodea = dt_read_number(matrix, 1);
>> +        matrix++;
>> +        nodeb = dt_read_number(matrix, 1);
>> +        matrix++;
>> +        distance = dt_read_number(matrix, 1);
>> +        matrix++;
>> +
>> +        numa_set_distance(nodea, nodeb, distance);
>
>
> What if numa_set_distance failed?

IMO, no need to fail numa. Should be set to default values for node_distance[].

>
>> +        printk(XENLOG_INFO "NUMA:  distance[node%d -> node%d] = %d\n",
>> +               nodea, nodeb, distance);
>> +
>> +        /* Set default distance of node B->A same as A->B */
>> +        if ( nodeb > nodea )
>> +            numa_set_distance(nodeb, nodea, distance);
>> +    }
>> +
>> +    return 0;
>> +}
>> +
>>  static int __init dt_numa_scan_cpu_node(const void *fdt, int node,
>>                                          const char *name, int depth,
>>                                          u32 address_cells, u32
>> size_cells,
>> @@ -136,6 +209,18 @@ static int __init dt_numa_scan_memory_node(const void
>> *fdt, int node,
>>      return 0;
>>  }
>>
>> +static int __init dt_numa_scan_distance_node(const void *fdt, int node,
>> +                                             const char *name, int depth,
>> +                                             u32 address_cells, u32
>> size_cells,
>> +                                             void *data)
>> +{
>> +    if ( device_tree_node_matches(fdt, node, "distance-map") )
>
>
> Similar to memory and cpu, the name is not fixed. What you want to look for
> is the compatible numa-distance-map-v1.
ok
>
>
>> +        return dt_numa_parse_distance_map(fdt, node, name, address_cells,
>> +                                          size_cells);
>> +
>> +    return 0;
>> +}
>> +
>>  int __init dt_numa_init(void)
>>  {
>>      int ret;
>> @@ -149,6 +234,11 @@ int __init dt_numa_init(void)
>>
>>      ret = device_tree_for_each_node((void *)device_tree_flattened,
>>                                      dt_numa_scan_memory_node, NULL);
>> +    if ( ret )
>> +        return ret;
>> +
>> +    ret = device_tree_for_each_node((void *)device_tree_flattened,
>> +                                    dt_numa_scan_distance_node, NULL);
>>
>>      return ret;
>>  }
>> diff --git a/xen/arch/arm/numa.c b/xen/arch/arm/numa.c
>> index 9a7f0bb..11d100b 100644
>> --- a/xen/arch/arm/numa.c
>> +++ b/xen/arch/arm/numa.c
>> @@ -22,14 +22,31 @@
>>  #include <asm/mm.h>
>>  #include <xen/numa.h>
>>  #include <asm/acpi.h>
>> +#include <xen/errno.h>
>
>
> Why did you add this include. I don't see any errno here.
>
>> +
>> +int _node_distance[MAX_NUMNODES * 2];
>> +int *node_distance;
>> +
>> +u8 __node_distance(nodeid_t a, nodeid_t b)
>> +{
>> +    if ( !node_distance )
>> +        return a == b ? 10 : 20;
>
>
> Why does the 10/20 comes from?

That is default distance value.

>
>> +
>> +    return _node_distance[a * MAX_NUMNODES + b];
>> +}
>> +
>> +EXPORT_SYMBOL(__node_distance);
>>
>>  int __init numa_init(void)
>>  {
>> -    int ret = 0;
>> +    int i, ret = 0;
>>
>>      if ( numa_off )
>>          goto no_numa;
>>
>> +    for ( i = 0; i < MAX_NUMNODES * 2; i++ )
>> +        _node_distance[i] = 0;
>> +
>
>
> Hmmmm... _node_distance will be zeroed by the compiler. So why that?
>
> If you want to initialize correctly then use 10/20.
>
>>      ret = dt_numa_init();
>>
>>  no_numa:
>> diff --git a/xen/include/asm-arm/numa.h b/xen/include/asm-arm/numa.h
>> index cdfeecd..b8857e2 100644
>> --- a/xen/include/asm-arm/numa.h
>> +++ b/xen/include/asm-arm/numa.h
>> @@ -11,6 +11,7 @@ typedef u8 nodeid_t;
>>  int arch_numa_setup(char *opt);
>>  int __init numa_init(void);
>>  int __init dt_numa_init(void);
>> +u8 __node_distance(nodeid_t a, nodeid_t b);
>
>
> This should be defined in common/numa.h as it is used by common code.
>
>>  #else
>>  static inline int arch_numa_setup(char *opt)
>>  {
>>
>
> Regards,
>
> --
> Julien Grall

^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [RFC PATCH v1 08/21] ARM: NUMA: Parse NUMA distance information
  2017-02-22 11:38     ` Vijay Kilari
@ 2017-02-22 11:44       ` Julien Grall
  2017-03-02 12:10         ` Vijay Kilari
  0 siblings, 1 reply; 91+ messages in thread
From: Julien Grall @ 2017-02-22 11:44 UTC (permalink / raw)
  To: Vijay Kilari
  Cc: Stefano Stabellini, Andre Przywara, Dario Faggioli,
	Vijaya Kumar K, xen-devel, nd

Hello Vijay,

On 22/02/17 11:38, Vijay Kilari wrote:
> On Mon, Feb 20, 2017 at 11:58 PM, Julien Grall <julien.grall@arm.com> wrote:
>> Hello Vijay,
>>
>> On 09/02/17 15:57, vijay.kilari@gmail.com wrote:
>>>
>>> From: Vijaya Kumar K <Vijaya.Kumar@cavium.com>
>>>
>>> Parse distance-matrix and fetch node distance information.
>>> Store distance information in node_distance[].
>>>
>>> Signed-off-by: Vijaya Kumar K <Vijaya.Kumar@cavium.com>
>>> ---
>>>  xen/arch/arm/dt_numa.c     | 90
>>> ++++++++++++++++++++++++++++++++++++++++++++++
>>>  xen/arch/arm/numa.c        | 19 +++++++++-
>>>  xen/include/asm-arm/numa.h |  1 +
>>>  3 files changed, 109 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/xen/arch/arm/dt_numa.c b/xen/arch/arm/dt_numa.c
>>> index fce9e67..8979612 100644
>>> --- a/xen/arch/arm/dt_numa.c
>>> +++ b/xen/arch/arm/dt_numa.c
>>> @@ -28,6 +28,19 @@
>>>
>>>  nodemask_t numa_nodes_parsed;
>>>  extern struct node node_memblk_range[NR_NODE_MEMBLKS];
>>> +extern int _node_distance[MAX_NUMNODES * 2];
>>> +extern int *node_distance;
>>
>>
>> I don't like this idea of having _node_distance and node_distance. Looking
>> at the code, I see little point to do that. You could just initialize
>> node_distance with the correct value.
>>
>> Also the node distance can fit in u8, so you can save memory by using u8.
>
> u8 might restrict the distance value

The NUMA distance function returns a u8 and the common code relies on u8.
So IMHO it is fine to restrict to u8.

If you want to keep u8 then please fix the rest of the code.

[...]

>>> +        numa_set_distance(nodea, nodeb, distance);
>>
>>
>> What if numa_set_distance failed?
>
> IMO, no need to fail numa. Should be set to default values for node_distance[].

If you look at your implementation of numa_set_distance, it returns an
error if nodea or nodeb is too big. So you should really check the
return value and report an error, because the DT is buggy.
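
A minimal standalone sketch of the kind of checking meant here: each
(nodea, nodeb, distance) triple is validated and the first failure is
propagated to the caller. This is not the patch itself. read_be32() and
parse_distance_matrix() are hypothetical helpers, MAX_NUMNODES is a toy
value, and rejecting distances that do not fit in 8 bits is an assumption
based on the u8 discussion above.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_NUMNODES 4 /* toy value for the sketch */

static uint8_t node_distance[MAX_NUMNODES][MAX_NUMNODES];

static int numa_set_distance(uint32_t nodea, uint32_t nodeb,
                             uint32_t distance)
{
    if ( nodea >= MAX_NUMNODES || nodeb >= MAX_NUMNODES )
        return -EINVAL;

    /* Assumption: distances are stored as 8-bit values. */
    if ( distance > UINT8_MAX )
        return -EINVAL;

    node_distance[nodea][nodeb] = (uint8_t)distance;

    return 0;
}

/* Read one big-endian 32-bit cell, as stored in a flattened DT. */
static uint32_t read_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Parse triples of cells; stop and report the first invalid entry. */
static int parse_distance_matrix(const uint8_t *data, size_t len)
{
    const size_t entry_size = 3 * sizeof(uint32_t);
    size_t off;

    assert(data != NULL);

    if ( len == 0 || len % entry_size != 0 )
        return -EINVAL;

    for ( off = 0; off < len; off += entry_size )
    {
        uint32_t nodea = read_be32(data + off);
        uint32_t nodeb = read_be32(data + off + 4);
        uint32_t distance = read_be32(data + off + 8);
        int ret = numa_set_distance(nodea, nodeb, distance);

        if ( ret )
            return ret;

        /* Default the B->A distance to the A->B one, as in the patch. */
        if ( nodeb > nodea )
        {
            ret = numa_set_distance(nodeb, nodea, distance);
            if ( ret )
                return ret;
        }
    }

    return 0;
}
```

The caller can then decide whether a non-zero return disables NUMA
entirely or only falls back to default distances.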

[...]

>>
>>
>>> +        return dt_numa_parse_distance_map(fdt, node, name, address_cells,
>>> +                                          size_cells);
>>> +
>>> +    return 0;
>>> +}
>>> +
>>>  int __init dt_numa_init(void)
>>>  {
>>>      int ret;
>>> @@ -149,6 +234,11 @@ int __init dt_numa_init(void)
>>>
>>>      ret = device_tree_for_each_node((void *)device_tree_flattened,
>>>                                      dt_numa_scan_memory_node, NULL);
>>> +    if ( ret )
>>> +        return ret;
>>> +
>>> +    ret = device_tree_for_each_node((void *)device_tree_flattened,
>>> +                                    dt_numa_scan_distance_node, NULL);
>>>
>>>      return ret;
>>>  }
>>> diff --git a/xen/arch/arm/numa.c b/xen/arch/arm/numa.c
>>> index 9a7f0bb..11d100b 100644
>>> --- a/xen/arch/arm/numa.c
>>> +++ b/xen/arch/arm/numa.c
>>> @@ -22,14 +22,31 @@
>>>  #include <asm/mm.h>
>>>  #include <xen/numa.h>
>>>  #include <asm/acpi.h>
>>> +#include <xen/errno.h>
>>
>>
>> Why did you add this include. I don't see any errno here.
>>
>>> +
>>> +int _node_distance[MAX_NUMNODES * 2];
>>> +int *node_distance;
>>> +
>>> +u8 __node_distance(nodeid_t a, nodeid_t b)
>>> +{
>>> +    if ( !node_distance )
>>> +        return a == b ? 10 : 20;
>>
>>
>> Why does the 10/20 comes from?
>
> That is default distance value.

From where? Please give a link to the doc.

Regards,

-- 
Julien Grall

^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [RFC PATCH v1 02/21] x86: NUMA: Refactor NUMA code
  2017-02-09 16:11   ` Jan Beulich
  2017-02-20 11:41     ` Julien Grall
@ 2017-02-27 11:43     ` Vijay Kilari
  2017-02-27 14:58       ` Jan Beulich
  1 sibling, 1 reply; 91+ messages in thread
From: Vijay Kilari @ 2017-02-27 11:43 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Stefano Stabellini, Andre Przywara, Dario Faggioli,
	Vijaya Kumar K, Julien Grall, xen-devel

Hi Jan,

On Thu, Feb 9, 2017 at 9:41 PM, Jan Beulich <JBeulich@suse.com> wrote:
>>>> On 09.02.17 at 16:56, <vijay.kilari@gmail.com> wrote:
>> From: Vijaya Kumar K <Vijaya.Kumar@cavium.com>
>>
>> Move common generic NUMA code to xen/common/numa.c from
>> xen/arch/x86/numa.c. Also move generic code in header file
>> xen/include/asm-x86/numa.h to xen/include/xen/numa.h
>>
>> This common code can be re-used later for ARM.
>>
>> Signed-off-by: Vijaya Kumar K <Vijaya.Kumar@cavium.com>
>
> I would have been nice if you Cc-ed the maintainers of the code
> you're moving.
>
>> --- /dev/null
>> +++ b/xen/common/numa.c
>> @@ -0,0 +1,342 @@
>> +/*
>> + * Common NUMA handling functions for x86 and arm.
>> + * Original code extracted from arch/x86/numa.c
>> + *
>> + * This program is free software; you can redistribute it and/or modify
>> + * it under the terms of the GNU General Public License as published by
>> + * the Free Software Foundation; either version 2 of the License, or
>> + * (at your option) any later version.
>> + *
>> + * This program is distributed in the hope that it will be useful,
>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>> + * GNU General Public License for more details.
>> + *
>> + * You should have received a copy of the GNU General Public License
>> + * along with this program; If not, see <http://www.gnu.org/licenses/>.
>> + */
>> +
>> +
>> +#include <xen/mm.h>
>> +#include <xen/string.h>
>> +#include <xen/init.h>
>> +#include <xen/ctype.h>
>> +#include <xen/nodemask.h>
>> +#include <xen/numa.h>
>> +#include <xen/keyhandler.h>
>> +#include <xen/time.h>
>> +#include <xen/smp.h>
>> +#include <xen/pfn.h>
>> +#include <xen/sched.h>
>> +#include <xen/errno.h>
>> +#include <xen/softirq.h>
>> +#include <asm/setup.h>
>
> This last one would better not be included here.
>
>> +struct node_data node_data[MAX_NUMNODES];
>> +
>> +/* Mapping from pdx to node id */
>> +int memnode_shift;
>> +unsigned long memnodemapsize;
>> +u8 *memnodemap;
>> +typeof(*memnodemap) _memnodemap[64];
>
> Careful with removing any "static" please. For the last one in
> particular you would need to change the name if it's really necessary
> to make non-static. Even better would be though to keep it static
> and provide suitable accessors.
>
> Also please sanitize types as you're moving stuff: memnode_shift
> looks like it really wants to be unsigned int, and u8 should really
> be uint8_t (as we're trying to phase out u8 & Co). This also applies
> to code further down.

You mean to change all occurrences of
s/u8/uint8_t
s/u32/uint32_t
s/u64/uint64_t

Also, I see that xen/arch/x86/srat.c does not adhere to the Xen
coding style.
Shall I clean it up before I move the code?

^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [RFC PATCH v1 02/21] x86: NUMA: Refactor NUMA code
  2017-02-27 11:43     ` Vijay Kilari
@ 2017-02-27 14:58       ` Jan Beulich
  0 siblings, 0 replies; 91+ messages in thread
From: Jan Beulich @ 2017-02-27 14:58 UTC (permalink / raw)
  To: vijay.kilari
  Cc: sstabellini, andre.przywara, dario.faggioli, Vijaya.Kumar,
	julien.grall, xen-devel

>>> Vijay Kilari <vijay.kilari@gmail.com> 02/27/17 12:43 PM >>>
>On Thu, Feb 9, 2017 at 9:41 PM, Jan Beulich <JBeulich@suse.com> wrote:
>>>>> On 09.02.17 at 16:56, <vijay.kilari@gmail.com> wrote:
>>> +struct node_data node_data[MAX_NUMNODES];
>>> +
>>> +/* Mapping from pdx to node id */
>>> +int memnode_shift;
>>> +unsigned long memnodemapsize;
>>> +u8 *memnodemap;
>>> +typeof(*memnodemap) _memnodemap[64];
>>
>> Careful with removing any "static" please. For the last one in
>> particular you would need to change the name if it's really necessary
>> to make non-static. Even better would be though to keep it static
>> and provide suitable accessors.
>>
>> Also please sanitize types as you're moving stuff: memnode_shift
>> looks like it really wants to be unsigned int, and u8 should really
>> be uint8_t (as we're trying to phase out u8 & Co). This also applies
>> to code further down.
>
>You mean to change all occurrences of
>s/u8/uint8_t
>s/u32/uint32_t
>s/u64/uint64_t

Yes.

>Also, I see that xen/arch/x86/srat.c coding style is not adhere to xen
>coding style.
>Shall I clean up before I move the code?

If you want to take the time - sure. All I'd like to ask for is that at least the
file(s) you move the code _to_ end up with consistent style.

Jan


^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [RFC PATCH v1 08/21] ARM: NUMA: Parse NUMA distance information
  2017-02-22 11:44       ` Julien Grall
@ 2017-03-02 12:10         ` Vijay Kilari
  2017-03-02 12:17           ` Julien Grall
  0 siblings, 1 reply; 91+ messages in thread
From: Vijay Kilari @ 2017-03-02 12:10 UTC (permalink / raw)
  To: Julien Grall
  Cc: Stefano Stabellini, Andre Przywara, Dario Faggioli,
	Vijaya Kumar K, xen-devel, nd

On Wed, Feb 22, 2017 at 5:14 PM, Julien Grall <julien.grall@arm.com> wrote:
> Hello Vijay,
>
>
> On 22/02/17 11:38, Vijay Kilari wrote:
>>
>> On Mon, Feb 20, 2017 at 11:58 PM, Julien Grall <julien.grall@arm.com>
>> wrote:
>>>
>>> Hello Vijay,
>>>
>>> On 09/02/17 15:57, vijay.kilari@gmail.com wrote:
>>>>
>>>>
>>>> From: Vijaya Kumar K <Vijaya.Kumar@cavium.com>
>>>>
>>>> Parse distance-matrix and fetch node distance information.
>>>> Store distance information in node_distance[].
>>>>
>>>> Signed-off-by: Vijaya Kumar K <Vijaya.Kumar@cavium.com>
>>>> ---
>>>>  xen/arch/arm/dt_numa.c     | 90
>>>> ++++++++++++++++++++++++++++++++++++++++++++++
>>>>  xen/arch/arm/numa.c        | 19 +++++++++-
>>>>  xen/include/asm-arm/numa.h |  1 +
>>>>  3 files changed, 109 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/xen/arch/arm/dt_numa.c b/xen/arch/arm/dt_numa.c
>>>> index fce9e67..8979612 100644
>>>> --- a/xen/arch/arm/dt_numa.c
>>>> +++ b/xen/arch/arm/dt_numa.c
>>>> @@ -28,6 +28,19 @@
>>>>
>>>>  nodemask_t numa_nodes_parsed;
>>>>  extern struct node node_memblk_range[NR_NODE_MEMBLKS];
>>>> +extern int _node_distance[MAX_NUMNODES * 2];
>>>> +extern int *node_distance;
>>>
>>>
>>>
>>> I don't like this idea of having _node_distance and node_distance.
>>> Looking
>>> at the code, I see little point to do that. You could just initialize
>>> node_distance with the correct value.
>>>
>>> Also the node distance can fit in u8, so you can save memory by using u8.
>>
>>
>> u8 might restrict the distance value
>
>
> The numa distance function returns an u8 and the common code rely on u8. So
> IHMO it is fine to restrict to u8.
>
> If you want to keep u8 then please fix the rest of the code.
>
> [...]
>
>>>> +        numa_set_distance(nodea, nodeb, distance);
>>>
>>>
>>>
>>> What if numa_set_distance failed?
>>
>>
>> IMO, no need to fail numa. Should be set to default values for
>> node_distance[].
>
>
> If you look at your implementation of numa_set_distance it returns an error
> if the nodea, nodeb are too big. So you should really check the
> return an report an error. Because the DT is buggy.
ok
>
> [...]
>
>
>>>
>>>
>>>> +        return dt_numa_parse_distance_map(fdt, node, name,
>>>> address_cells,
>>>> +                                          size_cells);
>>>> +
>>>> +    return 0;
>>>> +}
>>>> +
>>>>  int __init dt_numa_init(void)
>>>>  {
>>>>      int ret;
>>>> @@ -149,6 +234,11 @@ int __init dt_numa_init(void)
>>>>
>>>>      ret = device_tree_for_each_node((void *)device_tree_flattened,
>>>>                                      dt_numa_scan_memory_node, NULL);
>>>> +    if ( ret )
>>>> +        return ret;
>>>> +
>>>> +    ret = device_tree_for_each_node((void *)device_tree_flattened,
>>>> +                                    dt_numa_scan_distance_node, NULL);
>>>>
>>>>      return ret;
>>>>  }
>>>> diff --git a/xen/arch/arm/numa.c b/xen/arch/arm/numa.c
>>>> index 9a7f0bb..11d100b 100644
>>>> --- a/xen/arch/arm/numa.c
>>>> +++ b/xen/arch/arm/numa.c
>>>> @@ -22,14 +22,31 @@
>>>>  #include <asm/mm.h>
>>>>  #include <xen/numa.h>
>>>>  #include <asm/acpi.h>
>>>> +#include <xen/errno.h>
>>>
>>>
>>>
>>> Why did you add this include. I don't see any errno here.
>>>
>>>> +
>>>> +int _node_distance[MAX_NUMNODES * 2];
>>>> +int *node_distance;
>>>> +
>>>> +u8 __node_distance(nodeid_t a, nodeid_t b)
>>>> +{
>>>> +    if ( !node_distance )
>>>> +        return a == b ? 10 : 20;
>>>
>>>
>>>
>>> Why does the 10/20 comes from?
>>
>>
>> That is default distance value.
>
>
> From where? Please give a link to the doc.

10/20 is used by x86 implementation as well.
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/include/linux/topology.h?id=refs/tags/v4.10#n47

Also, the default matrix is shown in
Documentation/devicetree/bindings/numa.txt

>
> Regards,
>
> --
> Julien Grall

^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [RFC PATCH v1 08/21] ARM: NUMA: Parse NUMA distance information
  2017-03-02 12:10         ` Vijay Kilari
@ 2017-03-02 12:17           ` Julien Grall
  0 siblings, 0 replies; 91+ messages in thread
From: Julien Grall @ 2017-03-02 12:17 UTC (permalink / raw)
  To: Vijay Kilari
  Cc: Stefano Stabellini, Andre Przywara, Dario Faggioli,
	Vijaya Kumar K, xen-devel, nd

Hello Vijay,

On 02/03/17 12:10, Vijay Kilari wrote:
> On Wed, Feb 22, 2017 at 5:14 PM, Julien Grall <julien.grall@arm.com> wrote:
>> On 22/02/17 11:38, Vijay Kilari wrote:

[...]

>>> That is default distance value.
>>
>>
>> From where? Please give a link to the doc.
>
> 10/20 is used by x86 implementation as well.
> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/include/linux/topology.h?id=refs/tags/v4.10#n47


Then please introduce LOCAL_DISTANCE and REMOTE_DISTANCE to avoid
hardcoded values.
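
Something along these lines, as a standalone sketch rather than the final
code. The values 10 and 20 mirror the Linux topology.h definitions linked
above. numa_node_distance() is a hypothetical name, MAX_NUMNODES is a toy
value, and the single flat u8 table is an assumption for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_NUMNODES    4  /* toy value for the sketch */
#define LOCAL_DISTANCE  10 /* distance from a node to itself */
#define REMOTE_DISTANCE 20 /* default distance to any other node */

typedef uint8_t nodeid_t;

/* NULL until the firmware tables provide a distance matrix. */
static const uint8_t *node_distance;

static uint8_t numa_node_distance(nodeid_t a, nodeid_t b)
{
    assert(a < MAX_NUMNODES && b < MAX_NUMNODES);

    /* No matrix from DT/ACPI: fall back to the named defaults. */
    if ( !node_distance )
        return a == b ? LOCAL_DISTANCE : REMOTE_DISTANCE;

    return node_distance[a * MAX_NUMNODES + b];
}
```

The fallback then reads as "local vs remote" instead of bare numbers, and
the defaults live in one place.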

>
> Also the default matrix is shown below
> Documentation/devicetree/bindings/numa.txt

Looking at the binding, they show an example, not the default value.

Regards,

-- 
Julien Grall

^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [RFC PATCH v1 07/21] ARM: NUMA: Parse memory NUMA information
  2017-02-20 18:05   ` Julien Grall
@ 2017-03-02 12:25     ` Vijay Kilari
  2017-03-02 14:48       ` Julien Grall
  0 siblings, 1 reply; 91+ messages in thread
From: Vijay Kilari @ 2017-03-02 12:25 UTC (permalink / raw)
  To: Julien Grall
  Cc: Stefano Stabellini, Andre Przywara, Dario Faggioli,
	Vijaya Kumar K, xen-devel, nd

On Mon, Feb 20, 2017 at 11:35 PM, Julien Grall <julien.grall@arm.com> wrote:
> Hello Vijay,
>
>
> On 09/02/17 15:56, vijay.kilari@gmail.com wrote:
>>
>> From: Vijaya Kumar K <Vijaya.Kumar@cavium.com>
>>
>> Parse memory node and fetch numa-node-id information.
>> For each memory range, store in node_memblk_range[]
>> along with node id.
>>
>> Signed-off-by: Vijaya Kumar K <Vijaya.Kumar@cavium.com>
>> ---
>>  xen/arch/arm/bootfdt.c        |  4 +--
>>  xen/arch/arm/dt_numa.c        | 84
>> ++++++++++++++++++++++++++++++++++++++++++-
>>  xen/common/numa.c             |  8 +++++
>>  xen/include/xen/device_tree.h |  3 ++
>>  xen/include/xen/numa.h        |  1 +
>>  5 files changed, 97 insertions(+), 3 deletions(-)
>>
>> diff --git a/xen/arch/arm/bootfdt.c b/xen/arch/arm/bootfdt.c
>> index d1122d8..5e2df92 100644
>> --- a/xen/arch/arm/bootfdt.c
>> +++ b/xen/arch/arm/bootfdt.c
>> @@ -56,8 +56,8 @@ static bool_t __init device_tree_node_compatible(const
>> void *fdt, int node,
>>      return 0;
>>  }
>>
>> -static void __init device_tree_get_reg(const __be32 **cell, u32
>> address_cells,
>> -                                       u32 size_cells, u64 *start, u64
>> *size)
>> +void __init device_tree_get_reg(const __be32 **cell, u32 address_cells,
>> +                                u32 size_cells, u64 *start, u64 *size)
>>  {
>>      *start = dt_next_cell(address_cells, cell);
>>      *size = dt_next_cell(size_cells, cell);
>> diff --git a/xen/arch/arm/dt_numa.c b/xen/arch/arm/dt_numa.c
>> index 4b94c36..fce9e67 100644
>> --- a/xen/arch/arm/dt_numa.c
>> +++ b/xen/arch/arm/dt_numa.c
>> @@ -27,6 +27,7 @@
>>  #include <xen/numa.h>
>>
>>  nodemask_t numa_nodes_parsed;
>> +extern struct node node_memblk_range[NR_NODE_MEMBLKS];
>
>
> This should have been declared in an header (likely in patch #3).
>
>>
>>  /*
>>   * Even though we connect cpus to numa domains later in SMP
>> @@ -48,11 +49,73 @@ static int __init dt_numa_process_cpu_node(const void
>> *fdt, int node,
>>      return 0;
>>  }
>>
>> +static int __init dt_numa_process_memory_node(const void *fdt, int node,
>> +                                              const char *name,
>> +                                              u32 address_cells,
>> +                                              u32 size_cells)
>
>
> Rather than reinventing the wheel, it might be better to hook the parsing
> in bootfdt.c. This would avoid an extra pass over the device tree,
> duplicated code, and exposing primitives for early DT parsing.

The process_memory_node() is called only if EFI_BOOT is not enabled. So
hooking might not work.

>
> If parsing NUMA information can be done after the DT has been unflattened,
> this will be even better.
>
>> +{
>> +    const struct fdt_property *prop;
>> +    int i, ret, banks;
>
>
> Both banks and i should be unsigned.
>
>> +    const __be32 *cell;
>> +    paddr_t start, size;
>> +    u32 reg_cells = address_cells + size_cells;
>> +    u32 nid;
>> +
>> +    if ( address_cells < 1 || size_cells < 1 )
>> +    {
>> +        printk(XENLOG_WARNING
>> +               "fdt: node `%s': invalid #address-cells or #size-cells",
>> name);
>> +        return -EINVAL;
>> +    }
>> +
>> +    nid = device_tree_get_u32(fdt, node, "numa-node-id", MAX_NUMNODES);
>> +    if ( nid >= MAX_NUMNODES) {
>
>
> Coding style
>
> if ( ... )
> {
>
>> +        /*
>> +         * No node id found. Skip this memory node.
>> +         */
>
>
> This could be a single line:
>
> /* ..... */
>
> So no warning if it is impossible to get the numa-node-id? Also, I don't
> think it is right to boot using NUMA on a platform with a buggy DT. So we
> should probably return an error and disable NUMA.

OK.
>
>> +        return 0;
>> +    }
>> +
>> +    prop = fdt_get_property(fdt, node, "reg", NULL);
>> +    if ( !prop )
>> +    {
>> +        printk(XENLOG_WARNING "fdt: node `%s': missing `reg' property\n",
>> +               name);
>> +        return -EINVAL;
>> +    }
>> +
>> +    cell = (const __be32 *)prop->data;
>> +    banks = fdt32_to_cpu(prop->len) / (reg_cells * sizeof (u32));
>> +
>> +    for ( i = 0; i < banks; i++ )
>> +    {
>> +        device_tree_get_reg(&cell, address_cells, size_cells, &start,
>> &size);
>> +        if ( !size )
>> +            continue;
>> +
>> +        /* It is fine to add this area to the nodes data it will be used
>> later*/
>> +        ret = conflicting_memblks(start, start + size);
>> +        if (ret < 0)
>> +             numa_add_memblk(nid, start, size);
>
>
> numa_add_memblk relies on the caller to check that the array is not full. I
> think we should move the check into numa_add_memblk and return an error in
> this case.

OK
>
>> +        else
>> +        {
>> +             printk(XENLOG_ERR
>> +                    "NUMA DT: node %u (%"PRIx64"-%"PRIx64") overlaps with
>> ret %d (%"PRIx64"-%"PRIx64")\n",
>> +                    nid, start, start + size, ret,
>> +                    node_memblk_range[i].start,
>> node_memblk_range[i].end);
>
>
> i != ret. Please use the correct variable accordingly. However, I don't
> think the overlap range really matters here.

OK
>
>> +             return -EINVAL;
>> +        }
>> +    }
>> +
>> +    node_set(nid, numa_nodes_parsed);
>> +
>> +    return 0;
>> +}
>> +
>>  static int __init dt_numa_scan_cpu_node(const void *fdt, int node,
>>                                          const char *name, int depth,
>>                                          u32 address_cells, u32
>> size_cells,
>>                                          void *data)
>> -
>
>
> Spurious change. Please don't add the blank line in the first place (patch
> #6).
>
>>  {
>>      if ( device_tree_node_matches(fdt, node, "cpu") )
>>          return dt_numa_process_cpu_node(fdt, node, name, address_cells,
>> @@ -61,6 +124,18 @@ static int __init dt_numa_scan_cpu_node(const void
>> *fdt, int node,
>>      return 0;
>>  }
>>
>> +static int __init dt_numa_scan_memory_node(const void *fdt, int node,
>> +                                           const char *name, int depth,
>> +                                           u32 address_cells, u32
>> size_cells,
>> +                                           void *data)
>> +{
>> +    if ( device_tree_node_matches(fdt, node, "memory") )
>> +        return dt_numa_process_memory_node(fdt, node, name,
>> address_cells,
>> +                                           size_cells);
>
>
> Similarly to the CPUs, this code is wrong. You should check the type =
> "memory".

 if (!dt_node_type(node, "memory") ) should be fine?

>
>
>> +
>> +    return 0;
>> +}
>> +
>>  int __init dt_numa_init(void)
>>  {
>>      int ret;
>> @@ -68,5 +143,12 @@ int __init dt_numa_init(void)
>>      nodes_clear(numa_nodes_parsed);
>>      ret = device_tree_for_each_node((void *)device_tree_flattened,
>>                                      dt_numa_scan_cpu_node, NULL);
>> +
>> +    if ( ret )
>> +        return ret;
>> +
>> +    ret = device_tree_for_each_node((void *)device_tree_flattened,
>> +                                    dt_numa_scan_memory_node, NULL);
>> +
>>      return ret;
>>  }
>> diff --git a/xen/common/numa.c b/xen/common/numa.c
>> index 9b9cf9c..62c76af 100644
>> --- a/xen/common/numa.c
>> +++ b/xen/common/numa.c
>> @@ -55,6 +55,14 @@ struct node node_memblk_range[NR_NODE_MEMBLKS];
>>  nodeid_t memblk_nodeid[NR_NODE_MEMBLKS];
>>  struct node nodes[MAX_NUMNODES] __initdata;
>>
>> +void __init numa_add_memblk(nodeid_t nodeid, u64 start, u64 size)
>
>
> Please replace u64 by paddr_t.
>
>> +{
>> +    node_memblk_range[num_node_memblks].start = start;
>> +    node_memblk_range[num_node_memblks].end = start + size;
>> +    memblk_nodeid[num_node_memblks] = nodeid;
>> +    num_node_memblks++;
>> +}
>
>
> You probably want to factor the code in acpi_numa_memory_affinity_init to
> create this function.
>
> Also, you don't check if the array is full.

I think x86 can use this. I will make it part of initial code clean up.
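
For illustration, a minimal sketch of numa_add_memblk() with the full-array
check moved inside it, as requested in the review. The typedefs, the array
size and the -ENOSPC return value are stand-in assumptions so the snippet
builds on its own; they are not taken from the Xen tree.

```c
#include <errno.h>
#include <stdint.h>

/* Stand-ins for Xen's definitions; the values are illustrative only. */
#define NR_NODE_MEMBLKS 4
typedef uint64_t paddr_t;
typedef uint8_t nodeid_t;

struct node {
    paddr_t start, end;
};

static struct node node_memblk_range[NR_NODE_MEMBLKS];
static nodeid_t memblk_nodeid[NR_NODE_MEMBLKS];
static unsigned int num_node_memblks;

/* Record one memory block; fail instead of writing past the arrays. */
static int numa_add_memblk(nodeid_t nodeid, paddr_t start, paddr_t size)
{
    if ( num_node_memblks >= NR_NODE_MEMBLKS )
        return -ENOSPC; /* assumed error code, any negative errno would do */

    node_memblk_range[num_node_memblks].start = start;
    node_memblk_range[num_node_memblks].end = start + size;
    memblk_nodeid[num_node_memblks] = nodeid;
    num_node_memblks++;

    return 0;
}
```

With the check inside the function, both the DT caller and the x86 SRAT
caller only need to test the return value.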
>
>> +
>>  int valid_numa_range(u64 start, u64 end, nodeid_t node)
>>  {
>>  #ifdef CONFIG_NUMA
>> diff --git a/xen/include/xen/device_tree.h b/xen/include/xen/device_tree.h
>> index de6b351..d92e47e 100644
>> --- a/xen/include/xen/device_tree.h
>> +++ b/xen/include/xen/device_tree.h
>> @@ -192,6 +192,9 @@ bool_t device_tree_node_matches(const void *fdt, int
>> node,
>>                                  const char *match);
>>  u32 device_tree_get_u32(const void *fdt, int node,
>>                          const char *prop_name, u32 dflt);
>> +void device_tree_get_reg(const __be32 **cell, u32 address_cells,
>> +                         u32 size_cells, u64 *start, u64 *size);
>> +
>
>
> Same remark as on patch #6 for the position of the declaration.
>
>>  /**
>>   * dt_unflatten_host_device_tree - Unflatten the host device tree
>>   *
>> diff --git a/xen/include/xen/numa.h b/xen/include/xen/numa.h
>> index 77c5cfd..9392a89 100644
>> --- a/xen/include/xen/numa.h
>> +++ b/xen/include/xen/numa.h
>> @@ -67,6 +67,7 @@ static inline __attribute__((pure)) nodeid_t
>> phys_to_nid(paddr_t addr)
>>  #define clear_node_cpumask(cpu) do {} while (0)
>>  #endif /* CONFIG_NUMA */
>>
>> +extern void numa_add_memblk(nodeid_t nodeid, u64 start, u64 size);
>>  extern int valid_numa_range(u64 start, u64 end, nodeid_t node);
>>  extern int conflicting_memblks(u64 start, u64 end);
>>  extern void cutoff_node(int i, u64 start, u64 end);
>>
>
> Regards,
>
> --
> Julien Grall

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [RFC PATCH v1 12/21] ARM: NUMA: Do not expose numa info to DOM0
  2017-02-20 18:36   ` Julien Grall
@ 2017-03-02 12:30     ` Vijay Kilari
  0 siblings, 0 replies; 91+ messages in thread
From: Vijay Kilari @ 2017-03-02 12:30 UTC (permalink / raw)
  To: Julien Grall
  Cc: Stefano Stabellini, Andre Przywara, Dario Faggioli,
	Vijaya Kumar K, xen-devel, nd

On Tue, Feb 21, 2017 at 12:06 AM, Julien Grall <julien.grall@arm.com> wrote:
> Hello Vijay,
>
> On 09/02/17 15:57, vijay.kilari@gmail.com wrote:
>>
>> From: Vijaya Kumar K <Vijaya.Kumar@cavium.com>
>>
>> Delete numa-node-id and distance map from Dom0 DT
>> so that NUMA information is not exposed to Dom0.
>>
>> This helps particularly to boot Node 1 devices
>> as if booting on Node0.
>>
>> However this approach has limitation where memory allocation
>> for the devices should be local.
>
>
> We had a discussion about this a few weeks ago but you never answered back...
> (see [1]).

OK. I will reply to [1].

>
> If there is an issue, please provide input with examples and what will
> happen.
>
>
>>
>> Signed-off-by: Vijaya Kumar <Vijaya.Kumar@cavium.com>
>> ---
>>  xen/arch/arm/domain_build.c | 9 +++++++++
>>  1 file changed, 9 insertions(+)
>>
>> diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
>> index c97a1f5..5e89eaa 100644
>> --- a/xen/arch/arm/domain_build.c
>> +++ b/xen/arch/arm/domain_build.c
>> @@ -424,6 +424,10 @@ static int write_properties(struct domain *d, struct
>> kernel_info *kinfo,
>>              }
>>          }
>>
>> +        /* Don't expose the property numa to the guest */
>> +        if ( dt_property_name_is_equal(prop, "numa-node-id") )
>> +            continue;
>> +
>>          /* Don't expose the property "xen,passthrough" to the guest */
>>          if ( dt_property_name_is_equal(prop, "xen,passthrough") )
>>              continue;
>> @@ -1176,6 +1180,11 @@ static int handle_node(struct domain *d, struct
>> kernel_info *kinfo,
>>          DT_MATCH_TYPE("memory"),
>>          /* The memory mapped timer is not supported by Xen. */
>>          DT_MATCH_COMPATIBLE("arm,armv7-timer-mem"),
>> +        /*
>> +         * NUMA info is not exposed to Dom0.
>> +         * So, skip distance-map infomation
>> +         */
>> +        DT_MATCH_COMPATIBLE("numa-distance-map-v1"),
>>          { /* sentinel */ },
>>      };
>>      static const struct dt_device_match timer_matches[] __initconst =
>>
>
> Regards,
>
> [1]
> https://lists.xenproject.org/archives/html/xen-devel/2016-12/msg02073.html
>
> --
> Julien Grall


^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [RFC PATCH v1 17/21] ARM: NUMA: Extract memory proximity from SRAT table
  2017-02-10 17:35     ` Konrad Rzeszutek Wilk
@ 2017-03-02 14:41       ` Vijay Kilari
  0 siblings, 0 replies; 91+ messages in thread
From: Vijay Kilari @ 2017-03-02 14:41 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: Stefano Stabellini, Andre Przywara, Dario Faggioli,
	Vijaya Kumar K, Julien Grall, xen-devel

On Fri, Feb 10, 2017 at 11:05 PM, Konrad Rzeszutek Wilk
<konrad.wilk@oracle.com> wrote:
> On Fri, Feb 10, 2017 at 12:33:33PM -0500, Konrad Rzeszutek Wilk wrote:
>> On Thu, Feb 09, 2017 at 09:27:09PM +0530, vijay.kilari@gmail.com wrote:
>> > From: Vijaya Kumar K <Vijaya.Kumar@cavium.com>
>> >
>> > Register SRAT entry handler for type
>> > ACPI_SRAT_TYPE_MEMORY_AFFINITY to parse SRAT table
>> > and extract proximity for all memory mappings.
>>
>> Why can't you use arch/x86/srat.c code? Or move parts of that code to
>> common code?
>
> And to be clear - I meant the 'acpi_numa_memory_affinity_init' function?

OK. I will check.

Thanks
Vijay


^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [RFC PATCH v1 07/21] ARM: NUMA: Parse memory NUMA information
  2017-03-02 12:25     ` Vijay Kilari
@ 2017-03-02 14:48       ` Julien Grall
  2017-03-02 15:08         ` Vijay Kilari
  0 siblings, 1 reply; 91+ messages in thread
From: Julien Grall @ 2017-03-02 14:48 UTC (permalink / raw)
  To: Vijay Kilari
  Cc: Stefano Stabellini, Andre Przywara, Dario Faggioli,
	Vijaya Kumar K, xen-devel, nd

Hello Vijay,

On 02/03/17 12:25, Vijay Kilari wrote:
> On Mon, Feb 20, 2017 at 11:35 PM, Julien Grall <julien.grall@arm.com> wrote:
>> On 09/02/17 15:56, vijay.kilari@gmail.com wrote:

[...]

>> Rather than reinventing the wheel, it might be better to hook the parsing
>> in bootfdt.c. This would avoid an extra pass over the device tree,
>> duplicated code, and exposing primitives for early DT parsing.
>
> The process_memory_node() is called only if EFI_BOOT is not enabled. So
> hooking might not work.

This series adds this change (see patch #5) and the code is not set in 
stone.

We should rework the code when it makes sense rather than trying to find
a more convoluted way.

>>
>> Similarly to the CPUs, this code is wrong. You should check the type =
>> "memory".
>
>  if (!dt_node_type(node, "memory") ) should be fine?

You would have to create a helper for that. But the idea is there.
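
As a rough sketch of such a helper: the function below compares a raw
"device_type" property value against an expected type string. The name
dt_type_matches() is hypothetical; in the flattened-DT path the prop and
len arguments would come from a property lookup such as libfdt's
fdt_getprop(fdt, node, "device_type", &len).

```c
#include <stdbool.h>
#include <string.h>

/*
 * Compare a raw "device_type" property value against an expected type.
 * The property must be a NUL-terminated string of exactly the expected
 * length, so "memory@0" or a truncated value does not match "memory".
 */
static bool dt_type_matches(const void *prop, int len, const char *type)
{
    size_t want = strlen(type) + 1; /* include the terminating NUL */

    if ( !prop || len < 0 || (size_t)len != want )
        return false;

    return memcmp(prop, type, want) == 0;
}
```

Unlike device_tree_node_matches(), which looks at the node name, this only
accepts nodes whose device_type property is exactly the requested string.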

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [RFC PATCH v1 00/21] ARM: Add Xen NUMA support
  2017-02-10 17:30 ` Konrad Rzeszutek Wilk
@ 2017-03-02 14:49   ` Vijay Kilari
  0 siblings, 0 replies; 91+ messages in thread
From: Vijay Kilari @ 2017-03-02 14:49 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: Stefano Stabellini, Andre Przywara, Dario Faggioli,
	Vijaya Kumar K, Julien Grall, xen-devel

Hi Konrad,

On Fri, Feb 10, 2017 at 11:00 PM, Konrad Rzeszutek Wilk
<konrad.wilk@oracle.com> wrote:
> On Thu, Feb 09, 2017 at 09:26:52PM +0530, vijay.kilari@gmail.com wrote:
>> From: Vijaya Kumar K <Vijaya.Kumar@cavium.com>
>>
>> With this RFC patch series, NUMA support is added for arm platform.
>> Both DT and ACPI based NUMA support is added.
>> Only Xen is made aware of NUMA platform. Dom0 is awareness is not
>> added.
>>
>> As part of this series, the code under x86 architecture is
>> reused by moving into common files.
>> New files xen/common/numa.c and xen/commom/srat.c files are
>> added which are common for both x86 and arm.
>>
>> Patches 1 - 12 & 20 are for DT NUMA and 13 - 19 & 21 are for
>> ACPI NUMA.
>>
>> DT NUMA: The following major changes are performed
>>  - Dropped numa-node-id information from Dom0 DT.
>>    So that Dom0 devices make allocation from node 0 for
>>    devmalloc requests.
>>  - Memory DT is not deleted by EFI. It is exposed to Xen
>>    to extract numa information.
>>  - On NUMA failure, Fallback to Non-NUMA booting.
>>    Assuming all the memory and CPU's are under node 0.
>>  - CONFIG_NUMA is introduced.
>>
>> ACPI NUMA:
>>  - MADT is parsed before parsing SRAT table to extract
>>    CPU_ID to MPIDR mapping info. In Linux, while parsing SRAT
>>    table, MADT table is opened and extract MPIDR. However this
>>    approach is not working on Xen it allows only one table to
>>    be open at a time because when ACPI table is opened, Xen
>>    maps to single region. So opening ACPI tables recursively
>>    leads to overwriting of contents.
>
> Huh? Why can't you use vmap APIs to map them?
I see acpi_os_map_memory() could be used.

However, caching the CPU_ID to MPIDR mapping by parsing the MADT before
processing the SRAT is much more efficient.
In Linux, for every CPU_ID entry read from the SRAT, the MADT is opened,
searched for the CPU_ID to MPIDR mapping, and closed. So the MADT is
walked once per SRAT entry, roughly n*n entry comparisons in total.
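
The caching idea can be sketched as below. This is a simplified
illustration, not the code from the series: the structure, the function
names and MAX_CPUS are assumptions made for the example.

```c
#include <stdbool.h>
#include <stdint.h>

#define MAX_CPUS 8 /* illustrative limit, not Xen's NR_CPUS */

struct cpuid_to_hwid {
    uint32_t cpu_id; /* ACPI processor UID from a MADT GICC entry */
    uint64_t mpidr;
};

static struct cpuid_to_hwid cpuid_map[MAX_CPUS];
static unsigned int nr_cpuid_map;

/* Called once per GICC entry while the MADT is mapped. */
static bool cache_mpidr(uint32_t cpu_id, uint64_t mpidr)
{
    if ( nr_cpuid_map >= MAX_CPUS )
        return false;

    cpuid_map[nr_cpuid_map].cpu_id = cpu_id;
    cpuid_map[nr_cpuid_map].mpidr = mpidr;
    nr_cpuid_map++;

    return true;
}

/* Called per SRAT GICC affinity entry; no second ACPI table is opened. */
static bool lookup_mpidr(uint32_t cpu_id, uint64_t *mpidr)
{
    for ( unsigned int i = 0; i < nr_cpuid_map; i++ )
    {
        if ( cpuid_map[i].cpu_id == cpu_id )
        {
            *mpidr = cpuid_map[i].mpidr;
            return true;
        }
    }

    return false;
}
```

The lookup is still a linear scan of the cached array; the saving is that
the MADT itself is mapped and walked only once.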

Regards
Vijay


^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [RFC PATCH v1 07/21] ARM: NUMA: Parse memory NUMA information
  2017-03-02 14:48       ` Julien Grall
@ 2017-03-02 15:08         ` Vijay Kilari
  2017-03-02 15:19           ` Julien Grall
  0 siblings, 1 reply; 91+ messages in thread
From: Vijay Kilari @ 2017-03-02 15:08 UTC (permalink / raw)
  To: Julien Grall
  Cc: Stefano Stabellini, Andre Przywara, Dario Faggioli,
	Vijaya Kumar K, xen-devel, nd

On Thu, Mar 2, 2017 at 8:18 PM, Julien Grall <julien.grall@arm.com> wrote:
> Hello Vijay,
>
> On 02/03/17 12:25, Vijay Kilari wrote:
>>
>> On Mon, Feb 20, 2017 at 11:35 PM, Julien Grall <julien.grall@arm.com>
>> wrote:
>>>
>>> On 09/02/17 15:56, vijay.kilari@gmail.com wrote:
>
>
> [...]
>
>>> Rather than reinventing the wheel, it might be better to hook the parsing
>>> in bootfdt.c. This would avoid an extra pass over the device tree,
>>> duplicated code, and exposing primitives for early DT parsing.
>>
>>
>> The process_memory_node() is called only if EFI_BOOT is not enabled. So
>> hooking might not work.
>
>
> This series adds this change (see patch #5) and the code is not set in
> stone.

I have not added process_memory_node() in patch #5; I have only
introduced the check.
>
> We should rework the code when it makes sense rather than trying to find a
> more convoluted way.
>
I see two restrictions.
1) If we hook into process_memory_node(), the hook runs much earlier
   than numa_init().
2) We cannot move numa_init() before setup_mm().

>>>
>>> Similarly to the CPUs, this code is wrong. You should check the type =
>>> "memory".
>>
>>
>>  if (!dt_node_type(node, "memory") ) should be fine?
>
>
> You would have to create an helper for that. But the idea is there.
>
> Cheers,
>
> --
> Julien Grall


^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [RFC PATCH v1 07/21] ARM: NUMA: Parse memory NUMA information
  2017-03-02 15:08         ` Vijay Kilari
@ 2017-03-02 15:19           ` Julien Grall
  0 siblings, 0 replies; 91+ messages in thread
From: Julien Grall @ 2017-03-02 15:19 UTC (permalink / raw)
  To: Vijay Kilari
  Cc: Stefano Stabellini, Andre Przywara, Dario Faggioli,
	Vijaya Kumar K, xen-devel, nd



On 02/03/17 15:08, Vijay Kilari wrote:
> On Thu, Mar 2, 2017 at 8:18 PM, Julien Grall <julien.grall@arm.com> wrote:
>> Hello Vijay,
>>
>> On 02/03/17 12:25, Vijay Kilari wrote:
>>>
>>> On Mon, Feb 20, 2017 at 11:35 PM, Julien Grall <julien.grall@arm.com>
>>> wrote:
>>>>
>>>> On 09/02/17 15:56, vijay.kilari@gmail.com wrote:
>>
>>
>> [...]
>>
>>>> Rather than reinventing the wheel, it might be better to hook the parsing
>>>> in bootfdt.c. This would avoid an extra pass over the device tree,
>>>> duplicated code, and exposing primitives for early DT parsing.
>>>
>>>
>>> The process_memory_node() is called only if EFI_BOOT is not enabled. So
>>> hooking might not work.
>>
>>
>> This series adds this change (see patch #5) and the code is not set in
>> stone.
>
> I have not added process_memory_node() in the patch #5, I have only
> introduced the check.

That's exactly what I said...

>>
>> We should rework the code when it makes sense rather than trying to find a
>> more convoluted way.
>>
> I see two restrictions.
> 1) If we make a hook from process_memory_node(). This is called much
> earlier than
>     numa_init().
> 2) We cannot move numa_init() before setup_mm().

I didn't ask to move numa_init earlier, only to move the parsing earlier.
AFAICT, the parsing does not need to allocate memory, nor does it rely on
anything initialized in numa_init other than nodes_clear(...).

Regards,
-- 
Julien Grall


^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [RFC PATCH v1 13/21] ACPI: Refactor acpi SRAT and SLIT table handling code
  2017-02-09 15:57 ` [RFC PATCH v1 13/21] ACPI: Refactor acpi SRAT and SLIT table handling code vijay.kilari
@ 2017-03-02 15:30   ` Julien Grall
  2017-03-02 16:31     ` Vijay Kilari
  0 siblings, 1 reply; 91+ messages in thread
From: Julien Grall @ 2017-03-02 15:30 UTC (permalink / raw)
  To: vijay.kilari, sstabellini, andre.przywara, dario.faggioli
  Cc: xen-devel, nd, Vijaya Kumar K

Hello Vijay,

On 09/02/17 15:57, vijay.kilari@gmail.com wrote:
> From: Vijaya Kumar K <Vijaya.Kumar@cavium.com>
>
> Move SRAT handling code which is common across
> architecture is moved to new file xen/commom/srat.c
> from xen/arch/x86/srat.c file. New header file srat.h is
> introduced.
>
> Signed-off-by: Vijaya Kumar <Vijaya.Kumar@cavium.com>
> ---
>  xen/arch/x86/domain_build.c         |   1 +
>  xen/arch/x86/numa.c                 |   1 +
>  xen/arch/x86/physdev.c              |   1 +
>  xen/arch/x86/setup.c                |   1 +
>  xen/arch/x86/smpboot.c              |   1 +
>  xen/arch/x86/srat.c                 | 129 +------------------------------
>  xen/arch/x86/x86_64/mm.c            |   1 +
>  xen/common/Makefile                 |   1 +
>  xen/common/srat.c                   | 150 ++++++++++++++++++++++++++++++++++++

This new file should be created in xen/drivers/acpi/

>  xen/drivers/passthrough/vtd/iommu.c |   1 +
>  xen/include/asm-x86/numa.h          |   2 -
>  xen/include/xen/numa.h              |   1 -
>  xen/include/xen/srat.h              |  13 ++++

This new file should be created in xen/include/acpi/

[...]

> diff --git a/xen/arch/x86/srat.c b/xen/arch/x86/srat.c
> index 58dee09..af12e26 100644
> --- a/xen/arch/x86/srat.c
> +++ b/xen/arch/x86/srat.c
> @@ -18,91 +18,20 @@
>  #include <xen/acpi.h>
>  #include <xen/numa.h>
>  #include <xen/pfn.h>
> +#include <xen/srat.h>
>  #include <asm/e820.h>
>  #include <asm/page.h>
>
> -static struct acpi_table_slit *__read_mostly acpi_slit;
> +extern struct acpi_table_slit *__read_mostly acpi_slit;

This should be declared in the header. However, I don't like the idea of
exposing acpi_slit.

Looking at the usage, it is only needed to parse the distance, which can
be done in common code.
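
One possible shape for this, as a sketch only: keep the SLIT pointer
file-local and export a distance accessor instead. The struct slit_view
type and both function names are hypothetical; the row-major indexing
(from * locality_count + to) follows the SLIT layout in the ACPI
specification, and 0xFF is used here as the "no distance" marker.

```c
#include <stddef.h>
#include <stdint.h>

#define NUMA_NO_DISTANCE 0xFF /* stand-in for the Xen definition */

/* Minimal view of a SLIT: an N x N byte matrix in row-major order. */
struct slit_view {
    uint64_t locality_count;
    const uint8_t *entry;
};

static const struct slit_view *slit; /* file-local, not exported */

/* Called by the table parser once the SLIT has been validated. */
static void slit_register(const struct slit_view *s)
{
    slit = s;
}

/* The only interface the architecture code needs. */
static uint8_t slit_distance(unsigned int from_pxm, unsigned int to_pxm)
{
    if ( !slit || from_pxm >= slit->locality_count ||
         to_pxm >= slit->locality_count )
        return NUMA_NO_DISTANCE;

    return slit->entry[from_pxm * slit->locality_count + to_pxm];
}
```

The architecture's __node_distance() could then translate node IDs to
proximity domains and call the accessor, without seeing the table pointer.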

[...]

>  /* Callback for Proximity Domain -> x2APIC mapping */
>  void __init
>  acpi_numa_x2apic_affinity_init(const struct acpi_srat_x2apic_cpu_affinity *pa)
> @@ -456,18 +343,6 @@ int __init acpi_scan_nodes(u64 start, u64 end)
>  	return 0;
>  }

The code of acpi_numa_memory_affinity_init looks pretty generic. Why
didn't you move it into the common code?

[...]

> diff --git a/xen/common/Makefile b/xen/common/Makefile
> index c1bd2ff..a668094 100644
> --- a/xen/common/Makefile
> +++ b/xen/common/Makefile
> @@ -64,6 +64,7 @@ obj-bin-y += warning.init.o
>  obj-$(CONFIG_XENOPROF) += xenoprof.o
>  obj-y += xmalloc_tlsf.o
>  obj-y += numa.o
> +obj-y += srat.o

This should be only compiled when CONFIG_ACPI is enabled.

>
>  obj-bin-$(CONFIG_X86) += $(foreach n,decompress bunzip2 unxz unlzma unlzo unlz4 earlycpio,$(n).init.o)
>
> diff --git a/xen/common/srat.c b/xen/common/srat.c
> new file mode 100644
> index 0000000..cf50c78
> --- /dev/null
> +++ b/xen/common/srat.c
> @@ -0,0 +1,150 @@
> +/*
> + * ACPI 3.0 based NUMA setup
> + * Copyright 2004 Andi Kleen, SuSE Labs.
> + *
> + * Reads the ACPI SRAT table to figure out what memory belongs to which CPUs.
> + *
> + * Called from acpi_numa_init while reading the SRAT and SLIT tables.
> + * Assumes all memory regions belonging to a single proximity domain
> + * are in one chunk. Holes between them will be included in the node.
> + *
> + * Adapted for Xen: Ryan Harper <ryanh@us.ibm.com>
> + *
> + * Moved this generic code from xen/arch/x86/srat.c for other arch usage
> + * by Vijaya Kumar K <Vijaya.Kumar@cavium.com>
> + */
> +
> +#include <xen/init.h>
> +#include <xen/mm.h>
> +#include <xen/inttypes.h>
> +#include <xen/nodemask.h>
> +#include <xen/acpi.h>
> +#include <xen/numa.h>
> +#include <xen/pfn.h>
> +#include <xen/srat.h>
> +#include <asm/page.h>
> +
> +struct acpi_table_slit *__read_mostly acpi_slit;

This should really be static.

> +extern struct node nodes[MAX_NUMNODES] __initdata;
> +
> +struct pxm2node __read_mostly pxm2node[MAX_NUMNODES] =
> +    { [0 ... MAX_NUMNODES - 1] = {.node = NUMA_NO_NODE} };

So this is not only exposed because of bad_srat(). The code should be 
reworked to avoid that.

> +
> +static inline bool_t node_found(unsigned idx, unsigned pxm)
> +{
> +    return ((pxm2node[idx].pxm == pxm) &&
> +        (pxm2node[idx].node != NUMA_NO_NODE));
> +}
> +
> +nodeid_t pxm_to_node(unsigned pxm)
> +{
> +    unsigned i;
> +
> +    if ( (pxm < ARRAY_SIZE(pxm2node)) && node_found(pxm, pxm) )
> +        return pxm2node[pxm].node;
> +
> +    for ( i = 0; i < ARRAY_SIZE(pxm2node); i++ )
> +        if ( node_found(i, pxm) )
> +            return pxm2node[i].node;
> +
> +    return NUMA_NO_NODE;
> +}
> +
> +nodeid_t setup_node(unsigned pxm)

This name is too generic. The name of the function should make clear it 
is an ACPI only function.

[...]

> +unsigned node_to_pxm(nodeid_t n)
> +{
> +    unsigned i;
> +
> +    if ( (n < ARRAY_SIZE(pxm2node)) && (pxm2node[n].node == n) )
> +        return pxm2node[n].pxm;
> +    for ( i = 0; i < ARRAY_SIZE(pxm2node); i++ )
> +        if ( pxm2node[i].node == n )
> +            return pxm2node[i].pxm;
> +    return 0;
> +}

Missing emacs magic here.

> diff --git a/xen/include/asm-x86/numa.h b/xen/include/asm-x86/numa.h
> index 659ff6a..79a445c 100644
> --- a/xen/include/asm-x86/numa.h
> +++ b/xen/include/asm-x86/numa.h
> @@ -17,8 +17,6 @@ extern cpumask_t     node_to_cpumask[];
>  #define node_to_first_cpu(node)  (__ffs(node_to_cpumask[node]))
>  #define node_to_cpumask(node)    (node_to_cpumask[node])
>
> -extern nodeid_t pxm_to_node(unsigned int pxm);
> -
>  #define ZONE_ALIGN (1UL << (MAX_ORDER+PAGE_SHIFT))
>
>  extern void numa_init_array(void);
> diff --git a/xen/include/xen/numa.h b/xen/include/xen/numa.h
> index 4f04ab4..eb18380 100644
> --- a/xen/include/xen/numa.h
> +++ b/xen/include/xen/numa.h
> @@ -73,7 +73,6 @@ extern int valid_numa_range(u64 start, u64 end, nodeid_t node);
>  extern int conflicting_memblks(u64 start, u64 end);
>  extern void cutoff_node(int i, u64 start, u64 end);
>  extern void numa_add_cpu(int cpu);
> -extern nodeid_t setup_node(unsigned int pxm);
>  extern void numa_set_node(int cpu, nodeid_t node);
>  extern void setup_node_bootmem(nodeid_t nodeid, u64 start, u64 end);
>  extern int compute_hash_shift(struct node *nodes, int numnodes,
> diff --git a/xen/include/xen/srat.h b/xen/include/xen/srat.h
> new file mode 100644
> index 0000000..978f1e8
> --- /dev/null
> +++ b/xen/include/xen/srat.h
> @@ -0,0 +1,13 @@
> +#ifndef __XEN_SRAT_H__
> +#define __XEN_SRAT_H__
> +
> +struct pxm2node {
> +    unsigned pxm;
> +    nodeid_t node;
> +};
> +
> +extern struct pxm2node __read_mostly pxm2node[MAX_NUMNODES];
> +nodeid_t pxm_to_node(unsigned pxm);
> +nodeid_t setup_node(unsigned pxm);
> +unsigned node_to_pxm(nodeid_t n);
> +#endif /* __XEN_SRAT_H__ */

Missing emacs magic.

>

Regards,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [RFC PATCH v1 10/21] ARM: NUMA: Add memory NUMA support
  2017-02-09 15:57 ` [RFC PATCH v1 10/21] ARM: NUMA: Add memory " vijay.kilari
@ 2017-03-02 16:05   ` Julien Grall
  2017-03-02 16:23     ` Vijay Kilari
  0 siblings, 1 reply; 91+ messages in thread
From: Julien Grall @ 2017-03-02 16:05 UTC (permalink / raw)
  To: vijay.kilari, sstabellini, andre.przywara, dario.faggioli
  Cc: xen-devel, nd, Vijaya Kumar K

Hello Vijay,

On 09/02/17 15:57, vijay.kilari@gmail.com wrote:
> From: Vijaya Kumar K <Vijaya.Kumar@cavium.com>
>
> For all banks in bootinfo.mem, update nodes[] with
> corresponding nodeid and register these nodes by
> calling setup_node_bootmem().
> compute memnode_shift and initialize memnodemap[] to fetch
> nodeid for a given physical address.
>
> Signed-off-by: Vijaya Kumar K <Vijaya.Kumar@cavium.com>
> ---
>  xen/arch/arm/numa.c    | 90 ++++++++++++++++++++++++++++++++++++++++++++++++++
>  xen/common/numa.c      | 14 ++++++++
>  xen/include/xen/numa.h |  1 +
>  3 files changed, 105 insertions(+)
>
> diff --git a/xen/arch/arm/numa.c b/xen/arch/arm/numa.c
> index d4dbad4..aa34c82 100644
> --- a/xen/arch/arm/numa.c
> +++ b/xen/arch/arm/numa.c
> @@ -24,10 +24,15 @@
>  #include <asm/acpi.h>
>  #include <xen/errno.h>
>  #include <xen/cpumask.h>
> +#include <asm/setup.h>
>
>  int _node_distance[MAX_NUMNODES * 2];
>  int *node_distance;
>  extern nodemask_t numa_nodes_parsed;
> +extern struct node nodes[MAX_NUMNODES] __initdata;
> +extern int num_node_memblks;
> +extern struct node node_memblk_range[NR_NODE_MEMBLKS];
> +extern nodeid_t memblk_nodeid[NR_NODE_MEMBLKS];
>
>  void __init numa_set_cpu_node(int cpu, unsigned long hwid)
>  {
> @@ -51,6 +56,88 @@ u8 __node_distance(nodeid_t a, nodeid_t b)
>
>  EXPORT_SYMBOL(__node_distance);
>
> +static int __init numa_mem_init(void)

The function returns only two different values, 0 or -EINVAL. It
would be better to use a boolean.

> +{
> +    nodemask_t memory_nodes_parsed;
> +    int bank, nodeid;
> +    struct node *nd;
> +    paddr_t start, size, end;
> +
> +    nodes_clear(memory_nodes_parsed);
> +    for ( bank = 0 ; bank < bootinfo.mem.nr_banks; bank++ )
> +    {
> +        start = bootinfo.mem.bank[bank].start;
> +        size = bootinfo.mem.bank[bank].size;
> +        end = start + size;
> +
> +        nodeid = get_numa_node(start, end);
> +        if ( nodeid == -EINVAL || nodeid > MAX_NUMNODES )

If found, how could the nodeid be invalid? IMHO, this is a parsing bug.

> +        {
> +            printk(XENLOG_WARNING
> +                   "NUMA: node for mem bank start 0x%lx - 0x%lx not found\n",
> +                   start, end);
> +
> +            return -EINVAL;
> +        }
> +
> +        nd = &nodes[nodeid];
> +        if ( !node_test_and_set(nodeid, memory_nodes_parsed) )
> +        {
> +            nd->start = start;
> +            nd->end = end;
> +        }
> +        else
> +        {
> +            if ( start < nd->start )
> +                nd->start = start;
> +            if ( nd->end < end )
> +                nd->end = end;
> +        }
> +    }

This function is quite confusing. What is the purpose? Why do you go 
through bootinfo.mem and not node_memblk_range?
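
For comparison, a sketch of deriving each node's span directly from the
recorded memory blocks rather than from bootinfo.mem. The function name,
the typedefs and the present[] array are assumptions for the example, not
code from the series.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-ins for Xen's definitions; sizes are illustrative only. */
#define MAX_NUMNODES 4
typedef uint64_t paddr_t;
typedef uint8_t nodeid_t;

struct node {
    paddr_t start, end;
};

/*
 * Widen each node's [start, end) span to cover every memory block that
 * was recorded for it. Returns false if a block names an invalid node.
 */
static bool compute_node_spans(const struct node *blk,
                               const nodeid_t *blk_nid,
                               unsigned int nr_blks,
                               struct node *nodes, bool *present)
{
    unsigned int i;

    for ( i = 0; i < MAX_NUMNODES; i++ )
        present[i] = false;

    for ( i = 0; i < nr_blks; i++ )
    {
        nodeid_t nid = blk_nid[i];

        if ( nid >= MAX_NUMNODES )
            return false;

        if ( !present[nid] )
        {
            /* First block seen for this node: start from its range. */
            nodes[nid] = blk[i];
            present[nid] = true;
            continue;
        }

        if ( blk[i].start < nodes[nid].start )
            nodes[nid].start = blk[i].start;
        if ( blk[i].end > nodes[nid].end )
            nodes[nid].end = blk[i].end;
    }

    return true;
}
```

Note the bound check uses >= MAX_NUMNODES, since valid node IDs run from
0 to MAX_NUMNODES - 1.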

> +
> +    return 0;
> +}
> +
> +/* Use the information discovered above to actually set up the nodes. */
> +static int __init numa_scan_mem_nodes(void)

This function looks very similar to acpi_scan_nodes on x86. This is a
call to move more code into common code.

> +{
> +    int i;
> +
> +    memnode_shift = compute_hash_shift(node_memblk_range, num_node_memblks,
> +                                       memblk_nodeid);
> +    if ( memnode_shift < 0 )
> +    {
> +        printk(XENLOG_WARNING "No NUMA hash found.\n");
> +        memnode_shift = 0;
> +    }
> +
> +    for_each_node_mask(i, numa_nodes_parsed)
> +    {
> +        u64 size = node_memblk_range[i].end - node_memblk_range[i].start;
> +
> +        if ( size == 0 )
> +            printk(XENLOG_WARNING "NUMA: Node %u has no memory. \n", i);
> +
> +        printk(XENLOG_INFO
> +               "NUMA: NODE[%d]: Start 0x%lx End 0x%lx\n",
> +               i, nodes[i].start, nodes[i].end);
> +        setup_node_bootmem(i, nodes[i].start, nodes[i].end);
> +    }
> +
> +    return 0;
> +}
> +
> +static int __init numa_initmem_init(void)
> +{
> +    if ( !numa_mem_init() )
> +    {
> +        if ( !numa_scan_mem_nodes() )
> +            return 0;
> +    }

This could be simplified with

if ( !numa_mem_init() && !numa_scan_mem_nodes() )
    return 0;

However looking at the usage, this function seems rather pointless.

> +
> +    return -EINVAL;
> +}
> +
>  /*
>   * Setup early cpu_to_node.
>   */
> @@ -74,6 +161,9 @@ int __init numa_init(void)
>
>      ret = dt_numa_init();
>
> +    if ( !ret )
> +        ret = numa_initmem_init();
> +
>  no_numa:
>      return ret;
>  }
> diff --git a/xen/common/numa.c b/xen/common/numa.c
> index 62c76af..2f5266a 100644
> --- a/xen/common/numa.c
> +++ b/xen/common/numa.c
> @@ -63,6 +63,20 @@ void __init numa_add_memblk(nodeid_t nodeid, u64 start, u64 size)
>      num_node_memblks++;
>  }
>
> +int __init get_numa_node(u64 start, u64 end)

I think this function will be confusing for the caller. Why use
num_node_memblks and not the node itself?

> +{
> +    int i;

Please use unsigned int.

> +
> +    for ( i = 0; i < num_node_memblks; i++ )
> +    {
> +        if ( start >= node_memblk_range[i].start &&
> +             end <= node_memblk_range[i].end )
> +            return memblk_nodeid[i];
> +    }
> +
> +    return -EINVAL;
> +}
> +
>  int valid_numa_range(u64 start, u64 end, nodeid_t node)
>  {
>  #ifdef CONFIG_NUMA
> diff --git a/xen/include/xen/numa.h b/xen/include/xen/numa.h
> index 9392a89..4f04ab4 100644
> --- a/xen/include/xen/numa.h
> +++ b/xen/include/xen/numa.h
> @@ -68,6 +68,7 @@ static inline __attribute__((pure)) nodeid_t phys_to_nid(paddr_t addr)
>  #endif /* CONFIG_NUMA */
>
>  extern void numa_add_memblk(nodeid_t nodeid, u64 start, u64 size);
> +extern int get_numa_node(u64 start, u64 end);
>  extern int valid_numa_range(u64 start, u64 end, nodeid_t node);
>  extern int conflicting_memblks(u64 start, u64 end);
>  extern void cutoff_node(int i, u64 start, u64 end);
>

Regards,

-- 
Julien Grall

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


* Re: [RFC PATCH v1 11/21] ARM: NUMA: Add fallback on NUMA failure
  2017-02-09 15:57 ` [RFC PATCH v1 11/21] ARM: NUMA: Add fallback on NUMA failure vijay.kilari
@ 2017-03-02 16:09   ` Julien Grall
  2017-03-02 16:25     ` Vijay Kilari
  0 siblings, 1 reply; 91+ messages in thread
From: Julien Grall @ 2017-03-02 16:09 UTC (permalink / raw)
  To: vijay.kilari, sstabellini, andre.przywara, dario.faggioli
  Cc: xen-devel, nd, Vijaya Kumar K

Hello Vijay,

On 09/02/17 15:57, vijay.kilari@gmail.com wrote:
> From: Vijaya Kumar K <Vijaya.Kumar@cavium.com>
>
> On NUMA initialization failure, reset all the
> NUMA structures to emulate as single node.
>
> Signed-off-by: Vijaya Kumar <Vijaya.Kumar@cavium.com>
> ---
>  xen/arch/arm/numa.c | 50 ++++++++++++++++++++++++++++++++++++++++++++++++--
>  1 file changed, 48 insertions(+), 2 deletions(-)
>
> diff --git a/xen/arch/arm/numa.c b/xen/arch/arm/numa.c
> index aa34c82..31dc552 100644
> --- a/xen/arch/arm/numa.c
> +++ b/xen/arch/arm/numa.c
> @@ -19,6 +19,7 @@
>  #include <xen/ctype.h>
>  #include <xen/mm.h>
>  #include <xen/nodemask.h>
> +#include <xen/pfn.h>
>  #include <asm/mm.h>
>  #include <xen/numa.h>
>  #include <asm/acpi.h>
> @@ -127,6 +128,29 @@ static int __init numa_scan_mem_nodes(void)
>      return 0;
>  }
>
> +static void __init numa_dummy_init(unsigned long start_pfn,
> +                                   unsigned long end_pfn)

This code is very similar to numa_initmem_init on x86. This is a call to 
make the code more generic.

> +{
> +    int i;
> +
> +    nodes_clear(numa_nodes_parsed);
> +    memnode_shift = BITS_PER_LONG - 1;
> +    memnodemap = _memnodemap;
> +    nodes_clear(node_online_map);
> +    node_set_online(0);
> +
> +    for ( i = 0; i < NR_CPUS; i++ )
> +        numa_set_node(i, 0);
> +
> +    node_distance = NULL;
> +    for ( i = 0; i < MAX_NUMNODES * 2; i++ )
> +        _node_distance[i] = 0;
> +
> +    cpumask_copy(&node_to_cpumask[0], cpumask_of(0));
> +    setup_node_bootmem(0, (u64)start_pfn << PAGE_SHIFT,
> +                       (u64)end_pfn << PAGE_SHIFT);
> +}
> +
>  static int __init numa_initmem_init(void)
>  {
>      if ( !numa_mem_init() )
> @@ -151,7 +175,9 @@ void __init init_cpu_to_node(void)
>
>  int __init numa_init(void)

I would make numa_initmem_init from x86 generic and have an arch 
specific function called from there.
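
A rough sketch of the shape I have in mind, under stated assumptions: the
hook type arch_numa_probe_t, the function numa_initmem_init_sketch and the
two stub probes are hypothetical names for this example, and the hook is
assumed to return the number of nodes found or a negative value on failure.
It is not the x86 code.

```c
#include <stddef.h>

/*
 * Hypothetical arch hook: returns the number of NUMA nodes discovered
 * from firmware tables (DT or ACPI), or a negative value on failure.
 */
typedef int (*arch_numa_probe_t)(unsigned long start_pfn,
                                 unsigned long end_pfn);

/*
 * Generic entry point: try the arch hook first and fall back to a
 * single fake node covering all RAM when it is absent or fails.
 * Returns the number of nodes that end up in use.
 */
static unsigned int numa_initmem_init_sketch(arch_numa_probe_t probe,
                                             unsigned long start_pfn,
                                             unsigned long end_pfn)
{
    int nodes = probe ? probe(start_pfn, end_pfn) : -1;

    if ( nodes > 0 )
        return (unsigned int)nodes;

    /* Fallback: treat all memory and CPUs as belonging to node 0. */
    return 1;
}

/* Stub hooks standing in for a successful and a failing arch probe. */
static int probe_two_nodes(unsigned long start_pfn, unsigned long end_pfn)
{
    (void)start_pfn;
    (void)end_pfn;
    return 2;
}

static int probe_fails(unsigned long start_pfn, unsigned long end_pfn)
{
    (void)start_pfn;
    (void)end_pfn;
    return -1;
}
```

With this split, the fallback logic lives in one place and each
architecture only provides the probe.
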

>  {
> -    int i, ret = 0;
> +    int i, bank, ret = 0;
> +    paddr_t ram_start = ~0;
> +    paddr_t ram_end = 0;
>
>      if ( numa_off )
>          goto no_numa;
> @@ -164,8 +190,28 @@ int __init numa_init(void)
>      if ( !ret )
>          ret = numa_initmem_init();
>
> +    if ( !ret )
> +        return 0;
> +
>  no_numa:
> -    return ret;
> +    for ( bank = 0 ; bank < bootinfo.mem.nr_banks; bank++ )
> +    {
> +        paddr_t bank_start = bootinfo.mem.bank[bank].start;
> +        paddr_t bank_end = bank_start + bootinfo.mem.bank[bank].size;
> +
> +        ram_start = min(ram_start, bank_start);
> +        ram_end = max(ram_end, bank_end);
> +    }

I am sure this could be done in setup.c rather than here.

> +
> +    printk(XENLOG_INFO "%s\n",
> +           numa_off ? "NUMA turned off" : "No NUMA configuration found");
> +
> +    printk(XENLOG_INFO "Faking a node at %016"PRIx64"-%016"PRIx64"\n",
> +           (u64)ram_start, (u64)ram_end);
> +
> +    numa_dummy_init(PFN_UP(ram_start),PFN_DOWN(ram_end));
> +
> +    return 0;
>  }
>
>  int __init arch_numa_setup(char *opt)
>

Regards,

-- 
Julien Grall


* Re: [RFC PATCH v1 10/21] ARM: NUMA: Add memory NUMA support
  2017-03-02 16:05   ` Julien Grall
@ 2017-03-02 16:23     ` Vijay Kilari
  0 siblings, 0 replies; 91+ messages in thread
From: Vijay Kilari @ 2017-03-02 16:23 UTC (permalink / raw)
  To: Julien Grall
  Cc: Stefano Stabellini, Andre Przywara, Dario Faggioli,
	Vijaya Kumar K, xen-devel, nd

On Thu, Mar 2, 2017 at 9:35 PM, Julien Grall <julien.grall@arm.com> wrote:
> Hello Vijay,
>
>
> On 09/02/17 15:57, vijay.kilari@gmail.com wrote:
>>
>> From: Vijaya Kumar K <Vijaya.Kumar@cavium.com>
>>
>> For all banks in bootinfo.mem, update nodes[] with
>> corresponding nodeid and register these nodes by
>> calling setup_node_bootmem().
>> compute memnode_shift and initialize memnodemap[] to fetch
>> nodeid for a given physical address.
>>
>> Signed-off-by: Vijaya Kumar K <Vijaya.Kumar@cavium.com>
>> ---
>>  xen/arch/arm/numa.c    | 90 ++++++++++++++++++++++++++++++++++++++++++++++++++
>>  xen/common/numa.c      | 14 ++++++++
>>  xen/include/xen/numa.h |  1 +
>>  3 files changed, 105 insertions(+)
>>
>> diff --git a/xen/arch/arm/numa.c b/xen/arch/arm/numa.c
>> index d4dbad4..aa34c82 100644
>> --- a/xen/arch/arm/numa.c
>> +++ b/xen/arch/arm/numa.c
>> @@ -24,10 +24,15 @@
>>  #include <asm/acpi.h>
>>  #include <xen/errno.h>
>>  #include <xen/cpumask.h>
>> +#include <asm/setup.h>
>>
>>  int _node_distance[MAX_NUMNODES * 2];
>>  int *node_distance;
>>  extern nodemask_t numa_nodes_parsed;
>> +extern struct node nodes[MAX_NUMNODES] __initdata;
>> +extern int num_node_memblks;
>> +extern struct node node_memblk_range[NR_NODE_MEMBLKS];
>> +extern nodeid_t memblk_nodeid[NR_NODE_MEMBLKS];
>>
>>  void __init numa_set_cpu_node(int cpu, unsigned long hwid)
>>  {
>> @@ -51,6 +56,88 @@ u8 __node_distance(nodeid_t a, nodeid_t b)
>>
>>  EXPORT_SYMBOL(__node_distance);
>>
>> +static int __init numa_mem_init(void)
>
>
> The function returns only two different values, either 0 or -EINVAL. It
> would be better to use a boolean.
>
>> +{
>> +    nodemask_t memory_nodes_parsed;
>> +    int bank, nodeid;
>> +    struct node *nd;
>> +    paddr_t start, size, end;
>> +
>> +    nodes_clear(memory_nodes_parsed);
>> +    for ( bank = 0 ; bank < bootinfo.mem.nr_banks; bank++ )
>> +    {
>> +        start = bootinfo.mem.bank[bank].start;
>> +        size = bootinfo.mem.bank[bank].size;
>> +        end = start + size;
>> +
>> +        nodeid = get_numa_node(start, end);
>> +        if ( nodeid == -EINVAL || nodeid > MAX_NUMNODES )
>
>
> If found, how could the nodeid be invalid? IMHO, this is a parsing bug.
>
>> +        {
>> +            printk(XENLOG_WARNING
>> +                   "NUMA: node for mem bank start 0x%lx - 0x%lx not found\n",
>> +                   start, end);
>> +
>> +            return -EINVAL;
>> +        }
>> +
>> +        nd = &nodes[nodeid];
>> +        if ( !node_test_and_set(nodeid, memory_nodes_parsed) )
>> +        {
>> +            nd->start = start;
>> +            nd->end = end;
>> +        }
>> +        else
>> +        {
>> +            if ( start < nd->start )
>> +                nd->start = start;
>> +            if ( nd->end < end )
>> +                nd->end = end;
>> +        }
>> +    }
>
>
> This function is quite confusing. What is the purpose? Why do you go through
> bootinfo.mem and not node_memblk_range?

node_memblk_range[] contains the memory-bank-to-node mapping read from the
DT memory nodes. We go through bootinfo.mem and populate nodes[] with the
actual memory range of each node, indexed by nodeid.

I think this is similar to nodes_cover_memory() in arch/x86/srat.c.
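
To make the intent concrete, here is a minimal standalone sketch of how
per-node ranges are grown from the banks. The types struct bank and
struct span, the function build_node_spans and MAX_NODES are names invented
for this example, and each bank is assumed to already carry its node ID;
this is not the code from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

#define MAX_NODES 4  /* illustrative limit for this sketch */

struct bank {
    uint64_t start;
    uint64_t size;
    unsigned int nid;  /* node ID already resolved for this bank */
};

struct span {
    uint64_t start;
    uint64_t end;
};

/*
 * Grow nodes[nid] so that it covers every bank belonging to that node.
 * The caller must pass 'seen' with all entries set to false.
 */
static void build_node_spans(const struct bank *banks, unsigned int nr_banks,
                             struct span *nodes, bool *seen)
{
    unsigned int i;

    for ( i = 0; i < nr_banks; i++ )
    {
        unsigned int nid = banks[i].nid;
        uint64_t start = banks[i].start;
        uint64_t end = start + banks[i].size;

        if ( nid >= MAX_NODES )
            continue;  /* ignore banks with an out-of-range node ID */

        if ( !seen[nid] )
        {
            /* First bank of this node: take its range as-is. */
            nodes[nid].start = start;
            nodes[nid].end = end;
            seen[nid] = true;
        }
        else
        {
            /* Later banks only widen the existing range. */
            if ( start < nodes[nid].start )
                nodes[nid].start = start;
            if ( end > nodes[nid].end )
                nodes[nid].end = end;
        }
    }
}
```

Note that holes between two banks of the same node end up inside that
node's range.
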

>
>> +
>> +    return 0;
>> +}
>> +
>> +/* Use the information discovered above to actually set up the nodes. */
>> +static int __init numa_scan_mem_nodes(void)
>
>
> This function looks very similar to acpi_scan_nodes on x86. This is call to
> move more code in common.
>
Yes, I have already fixed this in the next revision.

>
>> +{
>> +    int i;
>> +
>> +    memnode_shift = compute_hash_shift(node_memblk_range, num_node_memblks,
>> +                                       memblk_nodeid);
>> +    if ( memnode_shift < 0 )
>> +    {
>> +        printk(XENLOG_WARNING "No NUMA hash found.\n");
>> +        memnode_shift = 0;
>> +    }
>> +
>> +    for_each_node_mask(i, numa_nodes_parsed)
>> +    {
>> +        u64 size = node_memblk_range[i].end - node_memblk_range[i].start;
>> +
>> +        if ( size == 0 )
>> +            printk(XENLOG_WARNING "NUMA: Node %u has no memory. \n", i);
>> +
>> +        printk(XENLOG_INFO
>> +               "NUMA: NODE[%d]: Start 0x%lx End 0x%lx\n",
>> +               i, nodes[i].start, nodes[i].end);
>> +        setup_node_bootmem(i, nodes[i].start, nodes[i].end);
>> +    }
>> +
>> +    return 0;
>> +}
>> +
>> +static int __init numa_initmem_init(void)
>> +{
>> +    if ( !numa_mem_init() )
>> +    {
>> +        if ( !numa_scan_mem_nodes() )
>> +            return 0;
>> +    }
>
>
> This could be simplified with
>
> if ( !numa_mem_init() && !numa_scan_mem_nodes() )
>    return 0;
>
> However looking at the usage, this function seems rather pointless.

OK. I have changed this in the next revision.

>
>> +
>> +    return -EINVAL;
>> +}
>> +
>>  /*
>>   * Setup early cpu_to_node.
>>   */
>> @@ -74,6 +161,9 @@ int __init numa_init(void)
>>
>>      ret = dt_numa_init();
>>
>> +    if ( !ret )
>> +        ret = numa_initmem_init();
>> +
>>  no_numa:
>>      return ret;
>>  }
>> diff --git a/xen/common/numa.c b/xen/common/numa.c
>> index 62c76af..2f5266a 100644
>> --- a/xen/common/numa.c
>> +++ b/xen/common/numa.c
>> @@ -63,6 +63,20 @@ void __init numa_add_memblk(nodeid_t nodeid, u64 start, u64 size)
>>      num_node_memblks++;
>>  }
>>
>> +int __init get_numa_node(u64 start, u64 end)
>
>
> I think this function will be confusing for the caller. Why use
> num_node_memblks and not the node itself?

OK. Will check.
>
>> +{
>> +    int i;
>
>
> Please use unsigned int.
>
>> +
>> +    for ( i = 0; i < num_node_memblks; i++ )
>> +    {
>> +        if ( start >= node_memblk_range[i].start &&
>> +             end <= node_memblk_range[i].end )
>> +            return memblk_nodeid[i];
>> +    }
>> +
>> +    return -EINVAL;
>> +}
>> +
>>  int valid_numa_range(u64 start, u64 end, nodeid_t node)
>>  {
>>  #ifdef CONFIG_NUMA
>> diff --git a/xen/include/xen/numa.h b/xen/include/xen/numa.h
>> index 9392a89..4f04ab4 100644
>> --- a/xen/include/xen/numa.h
>> +++ b/xen/include/xen/numa.h
>> @@ -68,6 +68,7 @@ static inline __attribute__((pure)) nodeid_t phys_to_nid(paddr_t addr)
>>  #endif /* CONFIG_NUMA */
>>
>>  extern void numa_add_memblk(nodeid_t nodeid, u64 start, u64 size);
>> +extern int get_numa_node(u64 start, u64 end);
>>  extern int valid_numa_range(u64 start, u64 end, nodeid_t node);
>>  extern int conflicting_memblks(u64 start, u64 end);
>>  extern void cutoff_node(int i, u64 start, u64 end);
>>
>
> Regards,
>
> --
> Julien Grall


* Re: [RFC PATCH v1 11/21] ARM: NUMA: Add fallback on NUMA failure
  2017-03-02 16:09   ` Julien Grall
@ 2017-03-02 16:25     ` Vijay Kilari
  0 siblings, 0 replies; 91+ messages in thread
From: Vijay Kilari @ 2017-03-02 16:25 UTC (permalink / raw)
  To: Julien Grall
  Cc: Stefano Stabellini, Andre Przywara, Dario Faggioli,
	Vijaya Kumar K, xen-devel, nd

On Thu, Mar 2, 2017 at 9:39 PM, Julien Grall <julien.grall@arm.com> wrote:
> Hello Vijay,
>
> On 09/02/17 15:57, vijay.kilari@gmail.com wrote:
>>
>> From: Vijaya Kumar K <Vijaya.Kumar@cavium.com>
>>
>> On NUMA initialization failure, reset all the
>> NUMA structures to emulate as single node.
>>
>> Signed-off-by: Vijaya Kumar <Vijaya.Kumar@cavium.com>
>> ---
>>  xen/arch/arm/numa.c | 50 ++++++++++++++++++++++++++++++++++++++++++++++++--
>>  1 file changed, 48 insertions(+), 2 deletions(-)
>>
>> diff --git a/xen/arch/arm/numa.c b/xen/arch/arm/numa.c
>> index aa34c82..31dc552 100644
>> --- a/xen/arch/arm/numa.c
>> +++ b/xen/arch/arm/numa.c
>> @@ -19,6 +19,7 @@
>>  #include <xen/ctype.h>
>>  #include <xen/mm.h>
>>  #include <xen/nodemask.h>
>> +#include <xen/pfn.h>
>>  #include <asm/mm.h>
>>  #include <xen/numa.h>
>>  #include <asm/acpi.h>
>> @@ -127,6 +128,29 @@ static int __init numa_scan_mem_nodes(void)
>>      return 0;
>>  }
>>
>> +static void __init numa_dummy_init(unsigned long start_pfn,
>> +                                   unsigned long end_pfn)
>
>
> This code is very similar to numa_initmem_init on x86. This is a call to
> make the code more generic.

Yes, I noticed this and have taken care of it in the next revision.
>
>
>> +{
>> +    int i;
>> +
>> +    nodes_clear(numa_nodes_parsed);
>> +    memnode_shift = BITS_PER_LONG - 1;
>> +    memnodemap = _memnodemap;
>> +    nodes_clear(node_online_map);
>> +    node_set_online(0);
>> +
>> +    for ( i = 0; i < NR_CPUS; i++ )
>> +        numa_set_node(i, 0);
>> +
>> +    node_distance = NULL;
>> +    for ( i = 0; i < MAX_NUMNODES * 2; i++ )
>> +        _node_distance[i] = 0;
>> +
>> +    cpumask_copy(&node_to_cpumask[0], cpumask_of(0));
>> +    setup_node_bootmem(0, (u64)start_pfn << PAGE_SHIFT,
>> +                       (u64)end_pfn << PAGE_SHIFT);
>> +}
>> +
>>  static int __init numa_initmem_init(void)
>>  {
>>      if ( !numa_mem_init() )
>> @@ -151,7 +175,9 @@ void __init init_cpu_to_node(void)
>>
>>  int __init numa_init(void)
>
>
> I would make numa_initmem_init from x86 generic and have an arch specific
> function called from there.
>
>>  {
>> -    int i, ret = 0;
>> +    int i, bank, ret = 0;
>> +    paddr_t ram_start = ~0;
>> +    paddr_t ram_end = 0;
>>
>>      if ( numa_off )
>>          goto no_numa;
>> @@ -164,8 +190,28 @@ int __init numa_init(void)
>>      if ( !ret )
>>          ret = numa_initmem_init();
>>
>> +    if ( !ret )
>> +        return 0;
>> +
>>  no_numa:
>> -    return ret;
>> +    for ( bank = 0 ; bank < bootinfo.mem.nr_banks; bank++ )
>> +    {
>> +        paddr_t bank_start = bootinfo.mem.bank[bank].start;
>> +        paddr_t bank_end = bank_start + bootinfo.mem.bank[bank].size;
>> +
>> +        ram_start = min(ram_start, bank_start);
>> +        ram_end = max(ram_end, bank_end);
>> +    }
>
>
> I am sure this could be done in setup.c rather than here.

Yes, I noticed this and have taken care of it in the next revision:
numa_init() is now called from setup_mm() with ram_start and ram_end as
parameters.
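
As a small standalone sketch of that idea (not the actual setup_mm() code),
the overall RAM range can be computed once from the bank list and then
handed to the NUMA init code. The names struct membank_sketch and ram_span
are made up for this example.

```c
#include <stdint.h>

struct membank_sketch {
    uint64_t start;
    uint64_t size;
};

/*
 * Compute the lowest start and highest end address over all banks.
 * With no banks, *ram_start stays at UINT64_MAX and *ram_end at 0,
 * so the caller can detect the empty case via *ram_start > *ram_end.
 */
static void ram_span(const struct membank_sketch *bank, unsigned int nr_banks,
                     uint64_t *ram_start, uint64_t *ram_end)
{
    unsigned int i;

    *ram_start = UINT64_MAX;
    *ram_end = 0;

    for ( i = 0; i < nr_banks; i++ )
    {
        uint64_t bank_start = bank[i].start;
        uint64_t bank_end = bank_start + bank[i].size;

        if ( bank_start < *ram_start )
            *ram_start = bank_start;
        if ( bank_end > *ram_end )
            *ram_end = bank_end;
    }
}
```

The caller would then pass the two values on, so the NUMA code no longer
needs to walk bootinfo.mem itself for the fallback case.
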

>
>> +
>> +    printk(XENLOG_INFO "%s\n",
>> +           numa_off ? "NUMA turned off" : "No NUMA configuration found");
>> +
>> +    printk(XENLOG_INFO "Faking a node at %016"PRIx64"-%016"PRIx64"\n",
>> +           (u64)ram_start, (u64)ram_end);
>> +
>> +    numa_dummy_init(PFN_UP(ram_start),PFN_DOWN(ram_end));
>> +
>> +    return 0;
>>  }
>>
>>  int __init arch_numa_setup(char *opt)
>>
>
> Regards,
>
> --
> Julien Grall


* Re: [RFC PATCH v1 15/21] ARM: NUMA: Extract MPIDR from MADT table
  2017-02-09 15:57 ` [RFC PATCH v1 15/21] ARM: NUMA: Extract MPIDR from MADT table vijay.kilari
@ 2017-03-02 16:28   ` Julien Grall
  2017-03-02 16:41     ` Vijay Kilari
  0 siblings, 1 reply; 91+ messages in thread
From: Julien Grall @ 2017-03-02 16:28 UTC (permalink / raw)
  To: vijay.kilari, sstabellini, andre.przywara, dario.faggioli
  Cc: xen-devel, nd, Vijaya Kumar K

Hello Vijay,

On 09/02/17 15:57, vijay.kilari@gmail.com wrote:
> From: Vijaya Kumar K <Vijaya.Kumar@cavium.com>
>
> Parse MADT table and extract MPIDR for all
> CPU IDs in MADT ACPI_MADT_TYPE_GENERIC_INTERRUPT entries
> and store in cpu_uid_to_hwid[].
>
> This mapping is used by SRAT table parsing to
> extract MPIDR of the CPU ID.
>
> Signed-off-by: Vijaya Kumar <Vijaya.Kumar@cavium.com>
> ---
>  xen/arch/arm/Makefile      |   1 +
>  xen/arch/arm/acpi_numa.c   | 122 +++++++++++++++++++++++++++++++++++++++++++++
>  xen/arch/arm/numa.c        |   1 +

This new file should go in xen/arch/arm/acpi/

>  xen/include/asm-arm/acpi.h |   2 +
>  4 files changed, 126 insertions(+)
>
> diff --git a/xen/arch/arm/Makefile b/xen/arch/arm/Makefile
> index 7694485..8c5e67b 100644
> --- a/xen/arch/arm/Makefile
> +++ b/xen/arch/arm/Makefile
> @@ -51,6 +51,7 @@ obj-y += vpsci.o
>  obj-y += vuart.o
>  obj-$(CONFIG_NUMA) += numa.o
>  obj-$(CONFIG_NUMA) += dt_numa.o
> +obj-$(CONFIG_ACPI_NUMA) += acpi_numa.o
>
>  #obj-bin-y += ....o
>
> diff --git a/xen/arch/arm/acpi_numa.c b/xen/arch/arm/acpi_numa.c
> new file mode 100644
> index 0000000..3ee87f2
> --- /dev/null
> +++ b/xen/arch/arm/acpi_numa.c
> @@ -0,0 +1,122 @@
> +/*
> + * ACPI based NUMA setup
> + *
> + * Copyright (C) 2016 - Cavium Inc.
> + * Vijaya Kumar K <Vijaya.Kumar@cavium.com>
> + *
> + * Reads the ACPI MADT and SRAT table to setup NUMA information.
> + *
> + * Contains Excerpts from x86 implementation
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation; either version 2 of the License, or
> + * (at your option) any later version.

Xen is GPLv2 only, so please drop the "or (at your option) any later
version" wording and update the license accordingly.

> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + */
> +
> +#include <xen/init.h>
> +#include <xen/mm.h>
> +#include <xen/inttypes.h>
> +#include <xen/nodemask.h>
> +#include <xen/acpi.h>
> +#include <xen/numa.h>
> +#include <xen/pfn.h>
> +#include <xen/srat.h>
> +#include <asm/page.h>
> +#include <asm/acpi.h>
> +
> +extern nodemask_t numa_nodes_parsed;
> +struct uid_to_mpidr {
> +    u32 uid;
> +    u64 mpidr;
> +};
> +
> +/* Holds mapping of CPU id to MPIDR read from MADT */
> +static struct uid_to_mpidr cpu_uid_to_hwid[NR_CPUS] __read_mostly;
> +
> +static __init void bad_srat(void)
> +{
> +    int i;
> +
> +    printk(KERN_ERR "SRAT: SRAT not used.\n");
> +    acpi_numa = -1;
> +    for (i = 0; i < ARRAY_SIZE(pxm2node); i++)
> +        pxm2node[i].node = NUMA_NO_NODE;
> +}
> +
> +static u64 acpi_get_cpu_mpidr(int uid)
> +{
> +    int i;
> +
> +    if ( uid < ARRAY_SIZE(cpu_uid_to_hwid) && cpu_uid_to_hwid[uid].uid == uid &&
> +         cpu_uid_to_hwid[uid].mpidr != MPIDR_INVALID )
> +        return cpu_uid_to_hwid[uid].mpidr;

Please don't make a special case. It makes the code more complicated to
read.

We should just loop to find the entry matching the UID.

> +
> +    for ( i = 0; i < NR_CPUS; i++ )

You can limit the loop by keeping track of the number of elements in the array.
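
A minimal sketch of what I mean, as standalone C. The names record_uid,
lookup_mpidr, nr_uid_entries and SKETCH_MPIDR_INVALID, and the array size,
are placeholders for this example; in particular the invalid value here is
not Xen's MPIDR_INVALID definition.

```c
#include <stdint.h>

#define SKETCH_NR_CPUS        8           /* illustrative array size */
#define SKETCH_MPIDR_INVALID  UINT64_MAX  /* placeholder sentinel */

struct uid_to_mpidr {
    uint32_t uid;
    uint64_t mpidr;
};

static struct uid_to_mpidr cpu_uid_to_hwid[SKETCH_NR_CPUS];
static unsigned int nr_uid_entries;  /* number of valid entries so far */

/* Append one UID/MPIDR pair; returns 0 on success, -1 if the array is full. */
static int record_uid(uint32_t uid, uint64_t mpidr)
{
    if ( nr_uid_entries >= SKETCH_NR_CPUS )
        return -1;

    cpu_uid_to_hwid[nr_uid_entries].uid = uid;
    cpu_uid_to_hwid[nr_uid_entries].mpidr = mpidr;
    nr_uid_entries++;

    return 0;
}

/* Single plain loop over the recorded entries, no fast-path special case. */
static uint64_t lookup_mpidr(uint32_t uid)
{
    unsigned int i;

    for ( i = 0; i < nr_uid_entries; i++ )
    {
        if ( cpu_uid_to_hwid[i].uid == uid )
            return cpu_uid_to_hwid[i].mpidr;
    }

    return SKETCH_MPIDR_INVALID;
}
```

Keeping the count also removes the need to pre-fill the whole array with
invalid entries before parsing.
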

> +    {
> +        if ( cpu_uid_to_hwid[i].uid == uid )
> +            return cpu_uid_to_hwid[i].mpidr;
> +    }
> +
> +    return MPIDR_INVALID;
> +}
> +
> +static void __init
> +acpi_map_cpu_to_mpidr(struct acpi_madt_generic_interrupt *processor)
> +{
> +    static int cpus = 0;
> +
> +    u64 mpidr = processor->arm_mpidr & MPIDR_HWID_MASK;
> +
> +    if ( mpidr == MPIDR_INVALID )
> +    {
> +        printk("Skip MADT cpu entry with invalid MPIDR\n");
> +        bad_srat();
> +        return;
> +    }
> +
> +    cpu_uid_to_hwid[cpus].mpidr = mpidr;
> +    cpu_uid_to_hwid[cpus].uid = processor->uid;
> +
> +    cpus++;
> +}
> +
> +static int __init acpi_parse_madt_handler(struct acpi_subtable_header *header,
> +                                          const unsigned long end)
> +{
> +    struct acpi_madt_generic_interrupt *p =
> +               container_of(header, struct acpi_madt_generic_interrupt, header);
> +
> +    if ( BAD_MADT_ENTRY(p, end) )
> +    {
> +        /* Though MADT is invalid, we disable NUMA by calling bad_srat() */
> +        bad_srat();
> +        return -EINVAL;
> +    }
> +
> +    acpi_table_print_madt_entry(header);
> +    acpi_map_cpu_to_mpidr(p);
> +
> +    return 0;
> +}

Why do you need to parse the MADT again? Can't this be done as part of the
MADT parsing already performed in acpi/boot.c?

> +
> +void __init acpi_map_uid_to_mpidr(void)
> +{
> +    int i;

unsigned int.

> +
> +    for ( i  = 0; i < NR_CPUS; i++ )
> +    {
> +        cpu_uid_to_hwid[i].mpidr = MPIDR_INVALID;
> +        cpu_uid_to_hwid[i].uid = -1;
> +    }
> +
> +    acpi_table_parse_madt(ACPI_MADT_TYPE_GENERIC_INTERRUPT,
> +                    acpi_parse_madt_handler, 0);
> +}
> +
> +void __init acpi_numa_arch_fixup(void) {}

Missing emacs magic.

> diff --git a/xen/arch/arm/numa.c b/xen/arch/arm/numa.c
> index 31dc552..5c49347 100644
> --- a/xen/arch/arm/numa.c
> +++ b/xen/arch/arm/numa.c
> @@ -20,6 +20,7 @@
>  #include <xen/mm.h>
>  #include <xen/nodemask.h>
>  #include <xen/pfn.h>
> +#include <xen/acpi.h>

Why this include? This patch should compile without it.

>  #include <asm/mm.h>
>  #include <xen/numa.h>
>  #include <asm/acpi.h>
> diff --git a/xen/include/asm-arm/acpi.h b/xen/include/asm-arm/acpi.h
> index 9f954d3..b1f36f4 100644
> --- a/xen/include/asm-arm/acpi.h
> +++ b/xen/include/asm-arm/acpi.h
> @@ -68,6 +68,8 @@ static inline void enable_acpi(void)
>  {
>      acpi_disabled = 0;
>  }
> +
> +void acpi_map_uid_to_mpidr(void);
>  #else
>  #define acpi_disabled (1)
>  #define disable_acpi()
>

Regards,

-- 
Julien Grall


* Re: [RFC PATCH v1 13/21] ACPI: Refactor acpi SRAT and SLIT table handling code
  2017-03-02 15:30   ` Julien Grall
@ 2017-03-02 16:31     ` Vijay Kilari
  2017-03-02 16:32       ` Julien Grall
  0 siblings, 1 reply; 91+ messages in thread
From: Vijay Kilari @ 2017-03-02 16:31 UTC (permalink / raw)
  To: Julien Grall
  Cc: Stefano Stabellini, Andre Przywara, Dario Faggioli,
	Vijaya Kumar K, xen-devel, nd

On Thu, Mar 2, 2017 at 9:00 PM, Julien Grall <julien.grall@arm.com> wrote:
> Hello Vijay,
>
> On 09/02/17 15:57, vijay.kilari@gmail.com wrote:
>>
>> From: Vijaya Kumar K <Vijaya.Kumar@cavium.com>
>>
>> Move SRAT handling code which is common across
>> architecture is moved to new file xen/commom/srat.c
>> from xen/arch/x86/srat.c file. New header file srat.h is
>> introduced.
>>
>> Signed-off-by: Vijaya Kumar <Vijaya.Kumar@cavium.com>
>> ---
>>  xen/arch/x86/domain_build.c         |   1 +
>>  xen/arch/x86/numa.c                 |   1 +
>>  xen/arch/x86/physdev.c              |   1 +
>>  xen/arch/x86/setup.c                |   1 +
>>  xen/arch/x86/smpboot.c              |   1 +
>>  xen/arch/x86/srat.c                 | 129 +------------------------------
>>  xen/arch/x86/x86_64/mm.c            |   1 +
>>  xen/common/Makefile                 |   1 +
>>  xen/common/srat.c                   | 150 ++++++++++++++++++++++++++++++++++++
>
>
> This new file should be created in xen/drivers/acpi/
OK
>
>>  xen/drivers/passthrough/vtd/iommu.c |   1 +
>>  xen/include/asm-x86/numa.h          |   2 -
>>  xen/include/xen/numa.h              |   1 -
>>  xen/include/xen/srat.h              |  13 ++++
>
>
> This new file should be created in xen/include/acpi/
OK
>
> [...]
>
>> diff --git a/xen/arch/x86/srat.c b/xen/arch/x86/srat.c
>> index 58dee09..af12e26 100644
>> --- a/xen/arch/x86/srat.c
>> +++ b/xen/arch/x86/srat.c
>> @@ -18,91 +18,20 @@
>>  #include <xen/acpi.h>
>>  #include <xen/numa.h>
>>  #include <xen/pfn.h>
>> +#include <xen/srat.h>
>>  #include <asm/e820.h>
>>  #include <asm/page.h>
>>
>> -static struct acpi_table_slit *__read_mostly acpi_slit;
>> +extern struct acpi_table_slit *__read_mostly acpi_slit;
>
>
> This should be defined in the header. However, I don't like the idea of
> exposing acpi_slit.
>
> Looking at the usage it is only to parse the distance that can be common.

node_distance() on x86 and arm is quite different, as arm also has the DT
mechanism. I will check whether the ACPI part of node_distance() can be made
common between x86 and arm.

>
> [...]
>
>>  /* Callback for Proximity Domain -> x2APIC mapping */
>>  void __init
>>  acpi_numa_x2apic_affinity_init(const struct acpi_srat_x2apic_cpu_affinity *pa)
>> @@ -456,18 +343,6 @@ int __init acpi_scan_nodes(u64 start, u64 end)
>>         return 0;
>>  }
>
>
> The code of acpi_numa_memory_affinity_init looks pretty generic. Why didn't
> you move it in the common code?

Agreed.
>
> [...]
>
>> diff --git a/xen/common/Makefile b/xen/common/Makefile
>> index c1bd2ff..a668094 100644
>> --- a/xen/common/Makefile
>> +++ b/xen/common/Makefile
>> @@ -64,6 +64,7 @@ obj-bin-y += warning.init.o
>>  obj-$(CONFIG_XENOPROF) += xenoprof.o
>>  obj-y += xmalloc_tlsf.o
>>  obj-y += numa.o
>> +obj-y += srat.o
>
>
> This should be only compiled when CONFIG_ACPI is enabled.
OK
>
>
>>
>>  obj-bin-$(CONFIG_X86) += $(foreach n,decompress bunzip2 unxz unlzma unlzo unlz4 earlycpio,$(n).init.o)
>>
>> diff --git a/xen/common/srat.c b/xen/common/srat.c
>> new file mode 100644
>> index 0000000..cf50c78
>> --- /dev/null
>> +++ b/xen/common/srat.c
>> @@ -0,0 +1,150 @@
>> +/*
>> + * ACPI 3.0 based NUMA setup
>> + * Copyright 2004 Andi Kleen, SuSE Labs.
>> + *
>> + * Reads the ACPI SRAT table to figure out what memory belongs to which CPUs.
>> + *
>> + * Called from acpi_numa_init while reading the SRAT and SLIT tables.
>> + * Assumes all memory regions belonging to a single proximity domain
>> + * are in one chunk. Holes between them will be included in the node.
>> + *
>> + * Adapted for Xen: Ryan Harper <ryanh@us.ibm.com>
>> + *
>> + * Moved this generic code from xen/arch/x86/srat.c for other arch usage
>> + * by Vijaya Kumar K <Vijaya.Kumar@cavium.com>
>> + */
>> +
>> +#include <xen/init.h>
>> +#include <xen/mm.h>
>> +#include <xen/inttypes.h>
>> +#include <xen/nodemask.h>
>> +#include <xen/acpi.h>
>> +#include <xen/numa.h>
>> +#include <xen/pfn.h>
>> +#include <xen/srat.h>
>> +#include <asm/page.h>
>> +
>> +struct acpi_table_slit *__read_mostly acpi_slit;
>
>
> This should really be static.
>
>> +extern struct node nodes[MAX_NUMNODES] __initdata;
>> +
>> +struct pxm2node __read_mostly pxm2node[MAX_NUMNODES] =
>> +    { [0 ... MAX_NUMNODES - 1] = {.node = NUMA_NO_NODE} };
>
>
> So this is not only exposed because of bad_srat(). The code should be
> reworked to avoid that.

I will check.
>
>> +
>> +static inline bool_t node_found(unsigned idx, unsigned pxm)
>> +{
>> +    return ((pxm2node[idx].pxm == pxm) &&
>> +        (pxm2node[idx].node != NUMA_NO_NODE));
>> +}
>> +
>> +nodeid_t pxm_to_node(unsigned pxm)
>> +{
>> +    unsigned i;
>> +
>> +    if ( (pxm < ARRAY_SIZE(pxm2node)) && node_found(pxm, pxm) )
>> +        return pxm2node[pxm].node;
>> +
>> +    for ( i = 0; i < ARRAY_SIZE(pxm2node); i++ )
>> +        if ( node_found(i, pxm) )
>> +            return pxm2node[i].node;
>> +
>> +    return NUMA_NO_NODE;
>> +}
>> +
>> +nodeid_t setup_node(unsigned pxm)
>
>
> This name is too generic. The name of the function should make clear it is
> an ACPI only function.
>
OK
> [...]
>
>> +unsigned node_to_pxm(nodeid_t n)
>> +{
>> +    unsigned i;
>> +
>> +    if ( (n < ARRAY_SIZE(pxm2node)) && (pxm2node[n].node == n) )
>> +        return pxm2node[n].pxm;
>> +    for ( i = 0; i < ARRAY_SIZE(pxm2node); i++ )
>> +        if ( pxm2node[i].node == n )
>> +            return pxm2node[i].pxm;
>> +    return 0;
>> +}
>
>
> Missing emacs magic here.
You mean this at the end of the file?

/*
 * Local variables:
 * mode: C
 * c-file-style: "BSD"
 * c-basic-offset: 4
 * indent-tabs-mode: nil
 * End:
 */


* Re: [RFC PATCH v1 13/21] ACPI: Refactor acpi SRAT and SLIT table handling code
  2017-03-02 16:31     ` Vijay Kilari
@ 2017-03-02 16:32       ` Julien Grall
  0 siblings, 0 replies; 91+ messages in thread
From: Julien Grall @ 2017-03-02 16:32 UTC (permalink / raw)
  To: Vijay Kilari
  Cc: Stefano Stabellini, Andre Przywara, Dario Faggioli,
	Vijaya Kumar K, xen-devel, nd



On 02/03/17 16:31, Vijay Kilari wrote:
> On Thu, Mar 2, 2017 at 9:00 PM, Julien Grall <julien.grall@arm.com> wrote:
>> Missing emacs magic here.
> You mean this at the end of the file?

Yes.

>
> /*
>  * Local variables:
>  * mode: C
>  * c-file-style: "BSD"
>  * c-basic-offset: 4
>  * indent-tabs-mode: nil
>  * End:
>  */
>

-- 
Julien Grall


* Re: [RFC PATCH v1 15/21] ARM: NUMA: Extract MPIDR from MADT table
  2017-03-02 16:28   ` Julien Grall
@ 2017-03-02 16:41     ` Vijay Kilari
  2017-03-02 16:49       ` Julien Grall
  0 siblings, 1 reply; 91+ messages in thread
From: Vijay Kilari @ 2017-03-02 16:41 UTC (permalink / raw)
  To: Julien Grall
  Cc: Stefano Stabellini, Andre Przywara, Dario Faggioli,
	Vijaya Kumar K, xen-devel, nd

On Thu, Mar 2, 2017 at 9:58 PM, Julien Grall <julien.grall@arm.com> wrote:
> Hello Vijay,
>
> On 09/02/17 15:57, vijay.kilari@gmail.com wrote:
>>
>> From: Vijaya Kumar K <Vijaya.Kumar@cavium.com>
>>
>> Parse MADT table and extract MPIDR for all
>> CPU IDs in MADT ACPI_MADT_TYPE_GENERIC_INTERRUPT entries
>> and store in cpu_uid_to_hwid[].
>>
>> This mapping is used by SRAT table parsing to
>> extract MPIDR of the CPU ID.
>>
>> Signed-off-by: Vijaya Kumar <Vijaya.Kumar@cavium.com>
>> ---
>>  xen/arch/arm/Makefile      |   1 +
>>  xen/arch/arm/acpi_numa.c   | 122 +++++++++++++++++++++++++++++++++++++++++++++
>>  xen/arch/arm/numa.c        |   1 +
>
>
> This new file should go in xen/arch/arm/acpi/

Shouldn't it be in xen/arch/arm/numa/?
>
>
>>  xen/include/asm-arm/acpi.h |   2 +
>>  4 files changed, 126 insertions(+)
>>
>> diff --git a/xen/arch/arm/Makefile b/xen/arch/arm/Makefile
>> index 7694485..8c5e67b 100644
>> --- a/xen/arch/arm/Makefile
>> +++ b/xen/arch/arm/Makefile
>> @@ -51,6 +51,7 @@ obj-y += vpsci.o
>>  obj-y += vuart.o
>>  obj-$(CONFIG_NUMA) += numa.o
>>  obj-$(CONFIG_NUMA) += dt_numa.o
>> +obj-$(CONFIG_ACPI_NUMA) += acpi_numa.o
>>
>>  #obj-bin-y += ....o
>>
>> diff --git a/xen/arch/arm/acpi_numa.c b/xen/arch/arm/acpi_numa.c
>> new file mode 100644
>> index 0000000..3ee87f2
>> --- /dev/null
>> +++ b/xen/arch/arm/acpi_numa.c
>> @@ -0,0 +1,122 @@
>> +/*
>> + * ACPI based NUMA setup
>> + *
>> + * Copyright (C) 2016 - Cavium Inc.
>> + * Vijaya Kumar K <Vijaya.Kumar@cavium.com>
>> + *
>> + * Reads the ACPI MADT and SRAT table to setup NUMA information.
>> + *
>> + * Contains Excerpts from x86 implementation
>> + *
>> + * This program is free software; you can redistribute it and/or modify
>> + * it under the terms of the GNU General Public License as published by
>> + * the Free Software Foundation; either version 2 of the License, or
>> + * (at your option) any later version.
>
>
> Xen is GPLv2, please update the license accordingly.
>
>
>> + *
>> + * This program is distributed in the hope that it will be useful,
>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>> + * GNU General Public License for more details.
>> + */
>> +
>> +#include <xen/init.h>
>> +#include <xen/mm.h>
>> +#include <xen/inttypes.h>
>> +#include <xen/nodemask.h>
>> +#include <xen/acpi.h>
>> +#include <xen/numa.h>
>> +#include <xen/pfn.h>
>> +#include <xen/srat.h>
>> +#include <asm/page.h>
>> +#include <asm/acpi.h>
>> +
>> +extern nodemask_t numa_nodes_parsed;
>> +struct uid_to_mpidr {
>> +    u32 uid;
>> +    u64 mpidr;
>> +};
>> +
>> +/* Holds mapping of CPU id to MPIDR read from MADT */
>> +static struct uid_to_mpidr cpu_uid_to_hwid[NR_CPUS] __read_mostly;
>> +
>> +static __init void bad_srat(void)
>> +{
>> +    int i;
>> +
>> +    printk(KERN_ERR "SRAT: SRAT not used.\n");
>> +    acpi_numa = -1;
>> +    for (i = 0; i < ARRAY_SIZE(pxm2node); i++)
>> +        pxm2node[i].node = NUMA_NO_NODE;
>> +}
>> +
>> +static u64 acpi_get_cpu_mpidr(int uid)
>> +{
>> +    int i;
>> +
>> +    if ( uid < ARRAY_SIZE(cpu_uid_to_hwid) && cpu_uid_to_hwid[uid].uid ==
>> uid &&
>> +         cpu_uid_to_hwid[uid].mpidr != MPIDR_INVALID )
>> +        return cpu_uid_to_hwid[uid].mpidr;
>
>
> Please don't make this a special case. It makes the code more complicated
> to read.
>
> We should just loop to find the entry matching the UID.
>
>> +
>> +    for ( i = 0; i < NR_CPUS; i++ )
>
>
> You can limit the loop by keeping track of the number of elements in the array.
OK.
>
>
>> +    {
>> +        if ( cpu_uid_to_hwid[i].uid == uid )
>> +            return cpu_uid_to_hwid[i].mpidr;
>> +    }
>> +
>> +    return MPIDR_INVALID;
>> +}
>> +
>> +static void __init
>> +acpi_map_cpu_to_mpidr(struct acpi_madt_generic_interrupt *processor)
>> +{
>> +    static int cpus = 0;
>> +
>> +    u64 mpidr = processor->arm_mpidr & MPIDR_HWID_MASK;
>> +
>> +    if ( mpidr == MPIDR_INVALID )
>> +    {
>> +        printk("Skip MADT cpu entry with invalid MPIDR\n");
>> +        bad_srat();
>> +        return;
>> +    }
>> +
>> +    cpu_uid_to_hwid[cpus].mpidr = mpidr;
>> +    cpu_uid_to_hwid[cpus].uid = processor->uid;
>> +
>> +    cpus++;
>> +}
>> +
>> +static int __init acpi_parse_madt_handler(struct acpi_subtable_header
>> *header,
>> +                                          const unsigned long end)
>> +{
>> +    struct acpi_madt_generic_interrupt *p =
>> +               container_of(header, struct acpi_madt_generic_interrupt,
>> header);
>> +
>> +    if ( BAD_MADT_ENTRY(p, end) )
>> +    {
>> +        /* Though MADT is invalid, we disable NUMA by calling bad_srat()
>> */
>> +        bad_srat();
>> +        return -EINVAL;
>> +    }
>> +
>> +    acpi_table_print_madt_entry(header);
>> +    acpi_map_cpu_to_mpidr(p);
>> +
>> +    return 0;
>> +}
>
>
> Why do you need to parse the MADT again? Can't this be done in the parsing
> made in acpi/boot.c?
I will check. But I see that this is done quite late in smp_init().
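
For the acpi_get_cpu_mpidr() comments above (no special case, loop bounded
by the number of recorded entries), a minimal standalone sketch could look
like the following. The counter nr_uid_entries, the helper names, and the
placeholder values of NR_CPUS and MPIDR_INVALID are assumptions made for
illustration; they are not taken from the patch.

```c
#include <stddef.h>
#include <stdint.h>

#define NR_CPUS        8                /* placeholder; Xen takes this from its config */
#define MPIDR_INVALID  (~(uint64_t)0)   /* placeholder sentinel for this sketch */

struct uid_to_mpidr {
    uint32_t uid;
    uint64_t mpidr;
};

/* Mapping of ACPI processor UID to MPIDR, filled while walking the MADT. */
static struct uid_to_mpidr cpu_uid_to_hwid[NR_CPUS];

/* Hypothetical counter: how many entries of the array are actually valid. */
static unsigned int nr_uid_entries;

/* Record one MADT GICC entry; returns 0 on success, -1 if the array is full. */
static int map_uid_to_mpidr(uint32_t uid, uint64_t mpidr)
{
    if ( nr_uid_entries >= NR_CPUS )
        return -1;

    cpu_uid_to_hwid[nr_uid_entries].uid = uid;
    cpu_uid_to_hwid[nr_uid_entries].mpidr = mpidr;
    nr_uid_entries++;

    return 0;
}

/* Plain linear search over the recorded entries only. */
static uint64_t get_cpu_mpidr(uint32_t uid)
{
    unsigned int i;

    for ( i = 0; i < nr_uid_entries; i++ )
    {
        if ( cpu_uid_to_hwid[i].uid == uid )
            return cpu_uid_to_hwid[i].mpidr;
    }

    return MPIDR_INVALID;
}
```

Bounding the loop by the entry count also avoids matching UID 0 against
zero-initialised, never-filled slots of the static array.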

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [RFC PATCH v1 15/21] ARM: NUMA: Extract MPIDR from MADT table
  2017-03-02 16:41     ` Vijay Kilari
@ 2017-03-02 16:49       ` Julien Grall
  0 siblings, 0 replies; 91+ messages in thread
From: Julien Grall @ 2017-03-02 16:49 UTC (permalink / raw)
  To: Vijay Kilari
  Cc: Stefano Stabellini, Andre Przywara, Dario Faggioli,
	Vijaya Kumar K, xen-devel, nd



On 02/03/17 16:41, Vijay Kilari wrote:
> On Thu, Mar 2, 2017 at 9:58 PM, Julien Grall <julien.grall@arm.com> wrote:
>> Hello Vijay,
>>
>> On 09/02/17 15:57, vijay.kilari@gmail.com wrote:
>>>
>>> From: Vijaya Kumar K <Vijaya.Kumar@cavium.com>
>>>
>>> Parse MADT table and extract MPIDR for all
>>> CPU IDs in MADT ACPI_MADT_TYPE_GENERIC_INTERRUPT entries
>>> and store in cpu_uid_to_hwid[].
>>>
>>> This mapping is used by SRAT table parsing to
>>> extract MPIDR of the CPU ID.
>>>
>>> Signed-off-by: Vijaya Kumar <Vijaya.Kumar@cavium.com>
>>> ---
>>>  xen/arch/arm/Makefile      |   1 +
>>>  xen/arch/arm/acpi_numa.c   | 122
>>> +++++++++++++++++++++++++++++++++++++++++++++
>>>  xen/arch/arm/numa.c        |   1 +
>>
>>
>> This new file should go in xen/arch/arm/acpi/
>
> Shouldn't it be in xen/arch/arm/numa/?

If you introduce a numa directory, then move it there. Otherwise, put it in acpi/.


>>> +static int __init acpi_parse_madt_handler(struct acpi_subtable_header
>>> *header,
>>> +                                          const unsigned long end)
>>> +{
>>> +    struct acpi_madt_generic_interrupt *p =
>>> +               container_of(header, struct acpi_madt_generic_interrupt,
>>> header);
>>> +
>>> +    if ( BAD_MADT_ENTRY(p, end) )
>>> +    {
>>> +        /* Though MADT is invalid, we disable NUMA by calling bad_srat()
>>> */
>>> +        bad_srat();
>>> +        return -EINVAL;
>>> +    }
>>> +
>>> +    acpi_table_print_madt_entry(header);
>>> +    acpi_map_cpu_to_mpidr(p);
>>> +
>>> +    return 0;
>>> +}
>>
>>
>> Why do you need to parse the MADT again? Can't this be done in the parsing
>> made in acpi/boot.c?
> I will check. But I see that this is done quite late in smp_init().

Then rework the code.

Regards,

-- 
Julien Grall


* Re: [RFC PATCH v1 16/21] ARM: NUMA: Extract proximity from SRAT table
  2017-02-09 15:57 ` [RFC PATCH v1 16/21] ARM: NUMA: Extract proximity from SRAT table vijay.kilari
@ 2017-03-02 17:21   ` Julien Grall
  2017-03-03 12:39     ` Vijay Kilari
  0 siblings, 1 reply; 91+ messages in thread
From: Julien Grall @ 2017-03-02 17:21 UTC (permalink / raw)
  To: vijay.kilari, sstabellini, andre.przywara, dario.faggioli
  Cc: xen-devel, nd, Jan Beulich, Vijaya Kumar K

(+ Jan as ACPI maintainer)

Hello Vijay,

On 09/02/17 15:57, vijay.kilari@gmail.com wrote:
> From: Vijaya Kumar K <Vijaya.Kumar@cavium.com>
>
> Register SRAT entry handler for type
> ACPI_SRAT_TYPE_GICC_AFFINITY to parse SRAT table
> and extract proximity for all CPU IDs.
>
> Signed-off-by: Vijaya Kumar <Vijaya.Kumar@cavium.com>
> ---
>  xen/arch/arm/acpi_numa.c  | 55 +++++++++++++++++++++++++++++++++++++++++++++++
>  xen/drivers/acpi/numa.c   | 37 +++++++++++++++++++++++++++++++
>  xen/drivers/acpi/osl.c    |  2 ++
>  xen/include/acpi/actbl1.h | 17 ++++++++++++++-
>  xen/include/xen/acpi.h    | 39 +++++++++++++++++++++++++++++++++
>  5 files changed, 149 insertions(+), 1 deletion(-)
>
> diff --git a/xen/arch/arm/acpi_numa.c b/xen/arch/arm/acpi_numa.c
> index 3ee87f2..f659275 100644
> --- a/xen/arch/arm/acpi_numa.c
> +++ b/xen/arch/arm/acpi_numa.c
> @@ -105,6 +105,61 @@ static int __init acpi_parse_madt_handler(struct acpi_subtable_header *header,
>      return 0;
>  }
>
> +/* Callback for Proximity Domain -> ACPI processor UID mapping */
> +void __init acpi_numa_gicc_affinity_init(const struct acpi_srat_gicc_affinity *pa)
> +{
> +    int pxm, node;
> +    u64 mpidr = 0;

mpidr does not need to be set to 0.

> +    static u32 cpus_in_srat;
> +
> +    if ( srat_disabled() )
> +        return;
> +
> +    if ( pa->header.length < sizeof(struct acpi_srat_gicc_affinity) )
> +    {
> +        printk(XENLOG_WARNING "SRAT: Invalid SRAT header length: %d\n",
> +               pa->header.length);
> +        bad_srat();
> +        return;
> +    }
> +
> +    if ( !(pa->flags & ACPI_SRAT_GICC_ENABLED) )
> +        return;
> +
> +    if ( cpus_in_srat >= NR_CPUS )
> +    {
> +        printk(XENLOG_WARNING

This should be XENLOG_ERROR.

> +               "SRAT: cpu_to_node_map[%d] is too small to fit all cpus\n",
> +               NR_CPUS);
> +        return;
> +    }
> +
> +    pxm = pa->proximity_domain;
> +    node = setup_node(pxm);
> +    if ( node == NUMA_NO_NODE || node >= MAX_NUMNODES )

Looking at the implementation of setup_node, node will either be equal 
to NUMA_NO_NODE or valid. It is not possible to have node >= MAX_NUMNODES.

> +    {
> +        printk(XENLOG_WARNING "SRAT: Too many proximity domains %d\n", pxm);

setup_node already prints an error if we have too many proximity
domains, so there is no need to print it again here.

> +        bad_srat();
> +        return;
> +    }
> +
> +    mpidr = acpi_get_cpu_mpidr(pa->acpi_processor_uid);
> +    if ( mpidr == MPIDR_INVALID )
> +    {
> +        printk(XENLOG_WARNING

s/XENLOG_WARNING/XENLOG_ERROR/

> +               "SRAT: PXM %d with ACPI ID %d has no valid MPIDR in MADT\n",
> +               pxm, pa->acpi_processor_uid);
> +        bad_srat();
> +        return;
> +    }
> +
> +    node_set(node, numa_nodes_parsed);
> +    cpus_in_srat++;
> +    acpi_numa = 1;
> +    printk(XENLOG_INFO "SRAT: PXM %d -> MPIDR 0x%lx -> Node %d\n",
> +           pxm, mpidr, node);
> +}
> +
>  void __init acpi_map_uid_to_mpidr(void)
>  {
>      int i;
> diff --git a/xen/drivers/acpi/numa.c b/xen/drivers/acpi/numa.c
> index 50bf9f8..ce22e88 100644
> --- a/xen/drivers/acpi/numa.c
> +++ b/xen/drivers/acpi/numa.c
> @@ -25,9 +25,11 @@
>  #include <xen/init.h>
>  #include <xen/types.h>
>  #include <xen/errno.h>
> +#include <xen/mm.h>

Why do you need to include xen/mm.h and ...

>  #include <xen/acpi.h>
>  #include <xen/numa.h>
>  #include <acpi/acmacros.h>
> +#include <asm/mm.h>

asm/mm.h?

>
>  #define ACPI_NUMA	0x80000000
>  #define _COMPONENT	ACPI_NUMA
> @@ -105,6 +107,21 @@ void __init acpi_table_print_srat_entry(struct acpi_subtable_header * header)
>  		}
>  #endif				/* ACPI_DEBUG_OUTPUT */
>  		break;
> +       case ACPI_SRAT_TYPE_GICC_AFFINITY:
> +#ifdef ACPI_DEBUG_OUTPUT
> +		{
> +			struct acpi_srat_gicc_affinity *p =
> +			    (struct acpi_srat_gicc_affinity *)header;
> +			ACPI_DEBUG_PRINT((ACPI_DB_INFO,
> +					  "SRAT Processor (acpi id[0x%04x]) in"
> +					  " proximity domain %d %s\n",
> +					  p->acpi_processor_uid,
> +					  p->proximity_domain,
> +					  (p->flags & ACPI_SRAT_GICC_ENABLED) ?
> +					  "enabled" : "disabled");
> +		}
> +#endif                         /* ACPI_DEBUG_OUTPUT */
> +               break;
>  	default:
>  		printk(KERN_WARNING PREFIX
>  		       "Found unsupported SRAT entry (type = %#x)\n",
> @@ -185,6 +202,24 @@ int __init acpi_parse_srat(struct acpi_table_header *table)
>  	return 0;
>  }
>
> +static int __init
> +acpi_parse_gicc_affinity(struct acpi_subtable_header *header,
> +			 const unsigned long end)
> +{
> +	const struct acpi_srat_gicc_affinity *processor_affinity
> +			= (struct acpi_srat_gicc_affinity *)header;
> +
> +	if (!processor_affinity)
> +		return -EINVAL;
> +
> +	acpi_table_print_srat_entry(header);
> +
> +	/* let architecture-dependent part to do it */
> +	acpi_numa_gicc_affinity_init(processor_affinity);
> +
> +	return 0;
> +}
> +
>  int __init
>  acpi_table_parse_srat(int id, acpi_madt_entry_handler handler,
>  		      unsigned int max_entries)
> @@ -205,6 +240,8 @@ int __init acpi_numa_init(void)
>  		acpi_table_parse_srat(ACPI_SRAT_TYPE_MEMORY_AFFINITY,
>  				      acpi_parse_memory_affinity,
>  				      NR_NODE_MEMBLKS);
> +		acpi_table_parse_srat(ACPI_SRAT_TYPE_GICC_AFFINITY,
> +				      acpi_parse_gicc_affinity, NR_CPUS);
>  	}
>
>  	/* SLIT: System Locality Information Table */
> diff --git a/xen/drivers/acpi/osl.c b/xen/drivers/acpi/osl.c
> index 7199047..7046816 100644
> --- a/xen/drivers/acpi/osl.c
> +++ b/xen/drivers/acpi/osl.c
> @@ -29,6 +29,7 @@
>  #include <xen/pfn.h>
>  #include <xen/types.h>
>  #include <xen/errno.h>
> +#include <xen/mm.h>
>  #include <xen/acpi.h>
>  #include <xen/numa.h>
>  #include <acpi/acmacros.h>
> @@ -39,6 +40,7 @@
>  #include <xen/efi.h>
>  #include <xen/vmap.h>
>  #include <xen/kconfig.h>
> +#include <asm/mm.h>
>
>  #define _COMPONENT		ACPI_OS_SERVICES
>  ACPI_MODULE_NAME("osl")
> diff --git a/xen/include/acpi/actbl1.h b/xen/include/acpi/actbl1.h
> index e199136..b84bfba 100644
> --- a/xen/include/acpi/actbl1.h
> +++ b/xen/include/acpi/actbl1.h
> @@ -949,7 +949,8 @@ enum acpi_srat_type {
>  	ACPI_SRAT_TYPE_CPU_AFFINITY = 0,
>  	ACPI_SRAT_TYPE_MEMORY_AFFINITY = 1,
>  	ACPI_SRAT_TYPE_X2APIC_CPU_AFFINITY = 2,
> -	ACPI_SRAT_TYPE_RESERVED = 3	/* 3 and greater are reserved */
> +	ACPI_SRAT_TYPE_GICC_AFFINITY = 3,
> +	ACPI_SRAT_TYPE_RESERVED = 4	/* 4 and greater are reserved */
>  };
>
>  /*
> @@ -1007,6 +1008,20 @@ struct acpi_srat_x2apic_cpu_affinity {
>
>  #define ACPI_SRAT_CPU_ENABLED       (1)	/* 00: Use affinity structure */
>
> +/* 3: GICC Affinity (ACPI 5.1) */
> +
> +struct acpi_srat_gicc_affinity {
> +	struct acpi_subtable_header header;
> +	u32 proximity_domain;
> +	u32 acpi_processor_uid;
> +	u32 flags;
> +	u32 clock_domain;
> +};
> +
> +/* Flags for struct acpi_srat_gicc_affinity */
> +
> +#define ACPI_SRAT_GICC_ENABLED     (1)  /* 00: Use affinity structure */
> +
>  /* Reset to default packing */
>
>  #pragma pack()
> diff --git a/xen/include/xen/acpi.h b/xen/include/xen/acpi.h
> index 30ec0ee..67fe1bb 100644
> --- a/xen/include/xen/acpi.h
> +++ b/xen/include/xen/acpi.h
> @@ -92,10 +92,49 @@ void acpi_table_print_srat_entry (struct acpi_subtable_header *srat);
>
>  /* the following four functions are architecture-dependent */
>  void acpi_numa_slit_init (struct acpi_table_slit *slit);
> +#if defined(CONFIG_X86)
>  void acpi_numa_processor_affinity_init(const struct acpi_srat_cpu_affinity *);
>  void acpi_numa_x2apic_affinity_init(const struct acpi_srat_x2apic_cpu_affinity *);
>  void acpi_numa_memory_affinity_init(const struct acpi_srat_mem_affinity *);
>  void acpi_numa_arch_fixup(void);
> +static inline void
> +acpi_numa_gicc_affinity_init(const struct acpi_srat_gicc_affinity *pa)
> +{
> +	return;
> +}
> +#elif defined(CONFIG_ARM)
> +static inline void
> +acpi_numa_processor_affinity_init(const struct acpi_srat_cpu_affinity *cpu_aff)
> +{
> +	return;
> +}
> +static inline void
> +acpi_numa_x2apic_affinity_init(const struct acpi_srat_x2apic_cpu_affinity *x2apic)
> +{
> +	return;
> +}
> +#if defined(CONFIG_ACPI_NUMA)
> +void acpi_numa_gicc_affinity_init(const struct acpi_srat_gicc_affinity *pa);
> +void acpi_numa_memory_affinity_init(const struct acpi_srat_mem_affinity *);
> +void acpi_numa_arch_fixup(void);
> +#else
> +static inline void
> +acpi_numa_gicc_affinity_init(const struct acpi_srat_gicc_affinity *pa)
> +{
> +	return;
> +}
> +static inline void
> +acpi_numa_memory_affinity_init(const struct acpi_srat_mem_affinity *ma)
> +{
> +	return;
> +}
> +static inline void
> +acpi_numa_arch_fixup(void)
> +{
> +	return;
> +}
> +#endif /* CONFIG_ACPI_NUMA */

This is quite disgusting. We should avoid any #ifdef CONFIG_{X86,ARM} in
common headers.

Also, x2apic and gicc are respectively x86-specific and arm-specific. So
I think we should move the parsing into a separate arch-dependent function
to avoid this ifdefery.

By that I mean having a function acpi_table_arch_parse_srat that will
parse x2apic on x86 and gicc on ARM. Jan, what do you think?
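
A rough standalone sketch of that split, assuming a per-entry arch hook:
the common code handles the memory affinity entries and hands everything
else to the architecture. The names arch_parse_srat_entry and
parse_srat_entry, the simplified header struct and the counters are made
up for illustration; only the SRAT type values come from the patch.

```c
#include <stdint.h>

/* SRAT subtable types, with the values used in actbl1.h in this patch. */
enum srat_type {
    SRAT_TYPE_CPU_AFFINITY = 0,
    SRAT_TYPE_MEMORY_AFFINITY = 1,
    SRAT_TYPE_X2APIC_CPU_AFFINITY = 2,
    SRAT_TYPE_GICC_AFFINITY = 3,
};

/* Simplified stand-in for struct acpi_subtable_header. */
struct subtable_header {
    uint8_t type;
    uint8_t length;
};

/* Counters standing in for the real per-entry processing. */
static unsigned int mem_entries, gicc_entries;

/*
 * Hypothetical arch hook, ARM flavour: only GICC affinity is handled.
 * An x86 build would provide its own version handling the (x2)APIC types,
 * so the common header needs no #ifdef CONFIG_{X86,ARM}.
 * Returns 0 if the entry was consumed, -1 if it is not for this arch.
 */
static int arch_parse_srat_entry(const struct subtable_header *h)
{
    switch ( h->type )
    {
    case SRAT_TYPE_GICC_AFFINITY:
        gicc_entries++;
        return 0;
    default:
        return -1;
    }
}

/* Common code: memory affinity is arch-independent, the rest is delegated. */
static int parse_srat_entry(const struct subtable_header *h)
{
    if ( h->type == SRAT_TYPE_MEMORY_AFFINITY )
    {
        mem_entries++;
        return 0;
    }

    return arch_parse_srat_entry(h);
}
```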

Cheers,

> +#endif /* CONFIG_X86 */
>
>  #ifdef CONFIG_ACPI_HOTPLUG_CPU
>  /* Arch dependent functions for cpu hotplug support */
>

-- 
Julien Grall


* Re: [RFC PATCH v1 18/21] ARM: NUMA: update node_distance with ACPI support
  2017-02-09 15:57 ` [RFC PATCH v1 18/21] ARM: NUMA: update node_distance with ACPI support vijay.kilari
@ 2017-03-02 17:24   ` Julien Grall
  2017-03-03 12:43     ` Vijay Kilari
  0 siblings, 1 reply; 91+ messages in thread
From: Julien Grall @ 2017-03-02 17:24 UTC (permalink / raw)
  To: vijay.kilari, sstabellini, andre.przywara, dario.faggioli
  Cc: xen-devel, nd, Vijaya Kumar K

Hello Vijay,

On 09/02/17 15:57, vijay.kilari@gmail.com wrote:
> From: Vijaya Kumar K <Vijaya.Kumar@cavium.com>
>
> Update node_distance() function to handle
> ACPI SLIT table information.
>
> Signed-off-by: Vijaya Kumar <Vijaya.Kumar@cavium.com>
> ---
>  xen/arch/arm/numa.c | 20 +++++++++++++++++++-
>  1 file changed, 19 insertions(+), 1 deletion(-)
>
> diff --git a/xen/arch/arm/numa.c b/xen/arch/arm/numa.c
> index 5c49347..50c3dea 100644
> --- a/xen/arch/arm/numa.c
> +++ b/xen/arch/arm/numa.c
> @@ -23,6 +23,7 @@
>  #include <xen/acpi.h>
>  #include <asm/mm.h>
>  #include <xen/numa.h>
> +#include <xen/srat.h>
>  #include <asm/acpi.h>
>  #include <xen/errno.h>
>  #include <xen/cpumask.h>
> @@ -35,6 +36,7 @@ extern struct node nodes[MAX_NUMNODES] __initdata;
>  extern int num_node_memblks;
>  extern struct node node_memblk_range[NR_NODE_MEMBLKS];
>  extern nodeid_t memblk_nodeid[NR_NODE_MEMBLKS];
> +extern struct acpi_table_slit *__read_mostly acpi_slit;
>
>  void __init numa_set_cpu_node(int cpu, unsigned long hwid)
>  {
> @@ -50,9 +52,24 @@ void __init numa_set_cpu_node(int cpu, unsigned long hwid)
>
>  u8 __node_distance(nodeid_t a, nodeid_t b)
>  {
> -    if ( !node_distance )
> +    unsigned index;
> +    u8 slit_val;
> +
> +    if ( !node_distance && !acpi_slit )
>          return a == b ? 10 : 20;
>
> +    if ( acpi_slit )
> +    {
> +        index = acpi_slit->locality_count * node_to_pxm(a);
> +        slit_val = acpi_slit->entry[index + node_to_pxm(b)];
> +
> +        /* ACPI defines 0xff as an unreachable node and 0-9 are undefined */
> +        if ( (slit_val == 0xff) || (slit_val <= 9) )
> +            return NUMA_NO_DISTANCE;
> +        else
> +            return slit_val;
> +    }
> +

arm/numa.c is the generic code and should not contain any ACPI-specific
code.

But as I said, the way to get the distance on ACPI is the same on x86
and ARM.

So I would introduce a __node_distance callback that will be set up at
boot time to point to either the ACPI version or the DT version.
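
A minimal standalone sketch of such a callback, assuming a function
pointer that defaults to the "10 local / 20 remote" fallback and is
switched to a SLIT-based lookup at boot. The setter name, the simplified
slit_table struct, the example matrix, the identity node_to_pxm() and the
value chosen for NUMA_NO_DISTANCE are placeholders for illustration; the
SLIT indexing and the 0xff / 0-9 checks follow the quoted patch.

```c
#include <stddef.h>
#include <stdint.h>

#define NUMA_NO_DISTANCE 0xFF   /* placeholder value for this sketch */

typedef uint8_t nodeid_t;
typedef uint8_t (*node_distance_fn)(nodeid_t a, nodeid_t b);

/* Simplified stand-in for the SLIT: an N x N byte matrix, row-major. */
struct slit_table {
    uint64_t locality_count;
    const uint8_t *entry;
};

/* Example 2x2 matrix: (1,0) is unreachable, (1,1) uses an undefined value. */
static const uint8_t example_entries[] = {
    10,   21,
    0xff, 5,
};
static const struct slit_table example_slit = { 2, example_entries };
static const struct slit_table *acpi_slit = &example_slit;

/* Identity mapping, assumed only to keep the sketch short. */
static unsigned int node_to_pxm(nodeid_t node)
{
    return node;
}

/* Fallback used when no firmware table provides distances. */
static uint8_t default_node_distance(nodeid_t a, nodeid_t b)
{
    return a == b ? 10 : 20;
}

/* ACPI version: 0xff means unreachable and 0-9 are undefined. */
static uint8_t acpi_node_distance(nodeid_t a, nodeid_t b)
{
    uint64_t index = acpi_slit->locality_count * node_to_pxm(a) +
                     node_to_pxm(b);
    uint8_t slit_val = acpi_slit->entry[index];

    if ( slit_val == 0xff || slit_val <= 9 )
        return NUMA_NO_DISTANCE;

    return slit_val;
}

static node_distance_fn node_distance_cb = default_node_distance;

/* Called once at boot by whichever firmware parser succeeded. */
static void numa_set_distance_cb(node_distance_fn fn)
{
    node_distance_cb = fn ? fn : default_node_distance;
}

/* Generic entry point: no ACPI or DT knowledge here. */
static uint8_t node_distance(nodeid_t a, nodeid_t b)
{
    return node_distance_cb(a, b);
}
```

With this shape the generic NUMA code only ever calls node_distance(); a
DT parser would register its own function in the same way.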

Regards,

-- 
Julien Grall


* Re: [RFC PATCH v1 19/21] ARM: NUMA: Initialize ACPI NUMA
  2017-02-09 15:57 ` [RFC PATCH v1 19/21] ARM: NUMA: Initialize ACPI NUMA vijay.kilari
@ 2017-03-02 17:25   ` Julien Grall
  2017-03-03 12:44     ` Vijay Kilari
  0 siblings, 1 reply; 91+ messages in thread
From: Julien Grall @ 2017-03-02 17:25 UTC (permalink / raw)
  To: vijay.kilari, sstabellini, andre.przywara, dario.faggioli
  Cc: xen-devel, nd, Vijaya Kumar K

Hello Vijay,

On 09/02/17 15:57, vijay.kilari@gmail.com wrote:
> From: Vijaya Kumar K <Vijaya.Kumar@cavium.com>
>
> Call ACPI NUMA initialization under CONFIG_ACPI_NUMA.
>
> Signed-off-by: Vijaya Kumar <Vijaya.Kumar@cavium.com>
> ---
>  xen/arch/arm/numa.c | 12 +++++++++++-
>  xen/common/numa.c   |  6 ++++++
>  2 files changed, 17 insertions(+), 1 deletion(-)
>
> diff --git a/xen/arch/arm/numa.c b/xen/arch/arm/numa.c
> index 50c3dea..1d6e16c 100644
> --- a/xen/arch/arm/numa.c
> +++ b/xen/arch/arm/numa.c
> @@ -204,7 +204,17 @@ int __init numa_init(void)
>      for ( i = 0; i < MAX_NUMNODES * 2; i++ )
>          _node_distance[i] = 0;
>
> -    ret = dt_numa_init();
> +#ifdef CONFIG_ACPI_NUMA
> +    if ( !acpi_disabled )
> +    {
> +        acpi_map_uid_to_mpidr();
> +        ret = acpi_numa_init();
> +        if ( ret || srat_disabled() )
> +            goto no_numa;
> +    }
> +    else
> +#endif

We should really have only one call to ACPI in the generic code. Please
move all of this into a function.
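
One possible shape for this, as a standalone sketch: the ACPI steps from
the hunk above are folded into a single helper, and the generic code picks
between that helper and the DT path in one place. The helper names
arch_acpi_numa_init and numa_firmware_init are hypothetical, and the other
functions and flags are stubs standing in for the real Xen ones so that
the sketch compiles on its own.

```c
#include <errno.h>
#include <stdbool.h>

/* Stubs standing in for Xen state and functions; values are test knobs. */
static bool acpi_disabled = true;
static bool srat_is_disabled;
static int acpi_numa_init_ret;
static int dt_calls, acpi_calls;

static void acpi_map_uid_to_mpidr(void) { }
static int acpi_numa_init(void) { acpi_calls++; return acpi_numa_init_ret; }
static bool srat_disabled(void) { return srat_is_disabled; }
static int dt_numa_init(void) { dt_calls++; return 0; }

/*
 * Hypothetical helper living next to the ACPI NUMA code: it hides the
 * MADT mapping, the SRAT/SLIT parsing and the srat_disabled() check.
 */
static int arch_acpi_numa_init(void)
{
    int ret;

    acpi_map_uid_to_mpidr();
    ret = acpi_numa_init();
    if ( ret )
        return ret;

    return srat_disabled() ? -EINVAL : 0;
}

/* Generic code: one decision point, no further ACPI details. */
static int numa_firmware_init(void)
{
    return acpi_disabled ? dt_numa_init() : arch_acpi_numa_init();
}
```

A non-zero return would then take the same no_numa fallback path that the
quoted hunk reaches with its goto.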

> +        ret = dt_numa_init();
>
>      if ( !ret )
>          ret = numa_initmem_init();
> diff --git a/xen/common/numa.c b/xen/common/numa.c
> index 2f5266a..4c67d38 100644
> --- a/xen/common/numa.c
> +++ b/xen/common/numa.c
> @@ -30,6 +30,7 @@
>  #include <xen/sched.h>
>  #include <xen/errno.h>
>  #include <xen/softirq.h>
> +#include <xen/srat.h>
>  #include <asm/setup.h>
>
>  static int numa_setup(char *s);
> @@ -282,6 +283,11 @@ static __init int numa_setup(char *opt)
>          numa_off = 1;
>      if ( !strncmp(opt,"on",2) )
>          numa_off = 0;
> +    if ( !strncmp(opt,"noacpi",6) )
> +    {
> +        numa_off = 0;
> +        acpi_numa = -1;
> +    }
>
>      return arch_numa_setup(opt);
>  }
>

-- 
Julien Grall


* Re: [RFC PATCH v1 20/21] ARM: NUMA: Enable CONFIG_NUMA config
  2017-02-09 15:57 ` [RFC PATCH v1 20/21] ARM: NUMA: Enable CONFIG_NUMA config vijay.kilari
@ 2017-03-02 17:27   ` Julien Grall
  0 siblings, 0 replies; 91+ messages in thread
From: Julien Grall @ 2017-03-02 17:27 UTC (permalink / raw)
  To: vijay.kilari, sstabellini, andre.przywara, dario.faggioli
  Cc: xen-devel, nd, Vijaya Kumar K

Hello Vijay,

On 09/02/17 15:57, vijay.kilari@gmail.com wrote:
> From: Vijaya Kumar K <Vijaya.Kumar@cavium.com>
>
> Enable CONFIG_NUMA to enable DT NUMA
>
> Signed-off-by: Vijaya Kumar <Vijaya.Kumar@cavium.com>
> ---
>  xen/arch/arm/Kconfig | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/xen/arch/arm/Kconfig b/xen/arch/arm/Kconfig
> index 2e023d1..fbc4f23 100644
> --- a/xen/arch/arm/Kconfig
> +++ b/xen/arch/arm/Kconfig
> @@ -23,6 +23,7 @@ config ARM
>  	select HAS_PASSTHROUGH
>  	select HAS_PDX
>  	select VIDEO
> +	select NUMA

I would have expected a patch to move config NUMA from 
drivers/acpi/Kconfig to config/Kconfig.

Regards,


-- 
Julien Grall


* Re: [RFC PATCH v1 21/21] ARM: NUMA: Enable CONFIG_ACPI_NUMA config
  2017-02-09 15:57 ` [RFC PATCH v1 21/21] ARM: NUMA: Enable CONFIG_ACPI_NUMA config vijay.kilari
@ 2017-03-02 17:31   ` Julien Grall
  0 siblings, 0 replies; 91+ messages in thread
From: Julien Grall @ 2017-03-02 17:31 UTC (permalink / raw)
  To: vijay.kilari, sstabellini, andre.przywara, dario.faggioli
  Cc: xen-devel, nd, Vijaya Kumar K

Hello Vijay,

On 09/02/17 15:57, vijay.kilari@gmail.com wrote:
> From: Vijaya Kumar K <Vijaya.Kumar@cavium.com>
>
> Enable CONFIG_ACPI_NUMA to enable ACPI NUMA
>
> Signed-off-by: Vijaya Kumar <Vijaya.Kumar@cavium.com>
> ---
>  xen/arch/arm/Kconfig | 4 ++++
>  1 file changed, 4 insertions(+)
>
> diff --git a/xen/arch/arm/Kconfig b/xen/arch/arm/Kconfig
> index fbc4f23..4b74eef 100644
> --- a/xen/arch/arm/Kconfig
> +++ b/xen/arch/arm/Kconfig
> @@ -43,6 +43,10 @@ config ACPI
>  	  Advanced Configuration and Power Interface (ACPI) support for Xen is
>  	  an alternative to device tree on ARM64.
>
> +config ACPI_NUMA
> +	def_bool y
> +	depends on ACPI

This also depends on NUMA

> +

The CONFIG_ACPI_NUMA define in asm-x86/config.h should be turned into a
Kconfig option in drivers/acpi and made selectable when NUMA && ACPI is
available.

>  config HAS_GICV3
>  	bool
>
>

Regards,

-- 
Julien Grall


* Re: [RFC PATCH v1 16/21] ARM: NUMA: Extract proximity from SRAT table
  2017-03-02 17:21   ` Julien Grall
@ 2017-03-03 12:39     ` Vijay Kilari
  2017-03-03 13:44       ` Julien Grall
  0 siblings, 1 reply; 91+ messages in thread
From: Vijay Kilari @ 2017-03-03 12:39 UTC (permalink / raw)
  To: Julien Grall
  Cc: Stefano Stabellini, Andre Przywara, Dario Faggioli,
	Vijaya Kumar K, Jan Beulich, xen-devel, nd

On Thu, Mar 2, 2017 at 10:51 PM, Julien Grall <julien.grall@arm.com> wrote:
> (+ Jan as ACPI maintainer)
>
> Hello Vijay,
>
> On 09/02/17 15:57, vijay.kilari@gmail.com wrote:
>>
>> From: Vijaya Kumar K <Vijaya.Kumar@cavium.com>
>>
>> Register SRAT entry handler for type
>> ACPI_SRAT_TYPE_GICC_AFFINITY to parse SRAT table
>> and extract proximity for all CPU IDs.
>>
>> Signed-off-by: Vijaya Kumar <Vijaya.Kumar@cavium.com>
>> ---
>>  xen/arch/arm/acpi_numa.c  | 55
>> +++++++++++++++++++++++++++++++++++++++++++++++
>>  xen/drivers/acpi/numa.c   | 37 +++++++++++++++++++++++++++++++
>>  xen/drivers/acpi/osl.c    |  2 ++
>>  xen/include/acpi/actbl1.h | 17 ++++++++++++++-
>>  xen/include/xen/acpi.h    | 39 +++++++++++++++++++++++++++++++++
>>  5 files changed, 149 insertions(+), 1 deletion(-)
>>
>> diff --git a/xen/arch/arm/acpi_numa.c b/xen/arch/arm/acpi_numa.c
>> index 3ee87f2..f659275 100644
>> --- a/xen/arch/arm/acpi_numa.c
>> +++ b/xen/arch/arm/acpi_numa.c
>> @@ -105,6 +105,61 @@ static int __init acpi_parse_madt_handler(struct
>> acpi_subtable_header *header,
>>      return 0;
>>  }
>>
>> +/* Callback for Proximity Domain -> ACPI processor UID mapping */
>> +void __init acpi_numa_gicc_affinity_init(const struct
>> acpi_srat_gicc_affinity *pa)
>> +{
>> +    int pxm, node;
>> +    u64 mpidr = 0;
>
>
> mpidr does not need to be set to 0.
>
>> +    static u32 cpus_in_srat;
>> +
>> +    if ( srat_disabled() )
>> +        return;
>> +
>> +    if ( pa->header.length < sizeof(struct acpi_srat_gicc_affinity) )
>> +    {
>> +        printk(XENLOG_WARNING "SRAT: Invalid SRAT header length: %d\n",
>> +               pa->header.length);
>> +        bad_srat();
>> +        return;
>> +    }
>> +
>> +    if ( !(pa->flags & ACPI_SRAT_GICC_ENABLED) )
>> +        return;
>> +
>> +    if ( cpus_in_srat >= NR_CPUS )
>> +    {
>> +        printk(XENLOG_WARNING
>
>
> This should be XENLOG_ERROR.
OK
>
>> +               "SRAT: cpu_to_node_map[%d] is too small to fit all
>> cpus\n",
>> +               NR_CPUS);
>> +        return;
>> +    }
>> +
>> +    pxm = pa->proximity_domain;
>> +    node = setup_node(pxm);
>> +    if ( node == NUMA_NO_NODE || node >= MAX_NUMNODES )
>
>
> Looking at the implementation of setup_node, node will either be equal to
> NUMA_NO_NODE or valid. It is not possible to have node >= MAX_NUMNODES.
ok
>
>> +    {
>> +        printk(XENLOG_WARNING "SRAT: Too many proximity domains %d\n",
>> pxm);
>
>
> setup_node already prints an error if we have too many proximity
> domains, so there is no need to print it again here.
ok
>
>> +        bad_srat();
>> +        return;
>> +    }
>> +
>> +    mpidr = acpi_get_cpu_mpidr(pa->acpi_processor_uid);
>> +    if ( mpidr == MPIDR_INVALID )
>> +    {
>> +        printk(XENLOG_WARNING
>
>
> s/XENLOG_WARNING/XENLOG_ERROR/
>
>> +               "SRAT: PXM %d with ACPI ID %d has no valid MPIDR in
>> MADT\n",
>> +               pxm, pa->acpi_processor_uid);
>> +        bad_srat();
>> +        return;
>> +    }
>> +
>> +    node_set(node, numa_nodes_parsed);
>> +    cpus_in_srat++;
>> +    acpi_numa = 1;
>> +    printk(XENLOG_INFO "SRAT: PXM %d -> MPIDR 0x%lx -> Node %d\n",
>> +           pxm, mpidr, node);
>> +}
>> +
>>  void __init acpi_map_uid_to_mpidr(void)
>>  {
>>      int i;
>> diff --git a/xen/drivers/acpi/numa.c b/xen/drivers/acpi/numa.c
>> index 50bf9f8..ce22e88 100644
>> --- a/xen/drivers/acpi/numa.c
>> +++ b/xen/drivers/acpi/numa.c
>> @@ -25,9 +25,11 @@
>>  #include <xen/init.h>
>>  #include <xen/types.h>
>>  #include <xen/errno.h>
>> +#include <xen/mm.h>
>
>
> Why do you need to include xen/mm.h and ...
>
>>  #include <xen/acpi.h>
>>  #include <xen/numa.h>
>>  #include <acpi/acmacros.h>
>> +#include <asm/mm.h>
>
>
> asm/mm.h?

As far as I remember, there is a compilation error when CONFIG_ACPI and
CONFIG_NUMA are both enabled.

>
>
>>
>>  #define ACPI_NUMA      0x80000000
>>  #define _COMPONENT     ACPI_NUMA
>> @@ -105,6 +107,21 @@ void __init acpi_table_print_srat_entry(struct
>> acpi_subtable_header * header)
>>                 }
>>  #endif                         /* ACPI_DEBUG_OUTPUT */
>>                 break;
>> +       case ACPI_SRAT_TYPE_GICC_AFFINITY:
>> +#ifdef ACPI_DEBUG_OUTPUT
>> +               {
>> +                       struct acpi_srat_gicc_affinity *p =
>> +                           (struct acpi_srat_gicc_affinity *)header;
>> +                       ACPI_DEBUG_PRINT((ACPI_DB_INFO,
>> +                                         "SRAT Processor (acpi
>> id[0x%04x]) in"
>> +                                         " proximity domain %d %s\n",
>> +                                         p->acpi_processor_uid,
>> +                                         p->proximity_domain,
>> +                                         (p->flags &
>> ACPI_SRAT_GICC_ENABLED) ?
>> +                                         "enabled" : "disabled");
>> +               }
>> +#endif                         /* ACPI_DEBUG_OUTPUT */
>> +               break;
>>         default:
>>                 printk(KERN_WARNING PREFIX
>>                        "Found unsupported SRAT entry (type = %#x)\n",
>> @@ -185,6 +202,24 @@ int __init acpi_parse_srat(struct acpi_table_header
>> *table)
>>         return 0;
>>  }
>>
>> +static int __init
>> +acpi_parse_gicc_affinity(struct acpi_subtable_header *header,
>> +                        const unsigned long end)
>> +{
>> +       const struct acpi_srat_gicc_affinity *processor_affinity
>> +                       = (struct acpi_srat_gicc_affinity *)header;
>> +
>> +       if (!processor_affinity)
>> +               return -EINVAL;
>> +
>> +       acpi_table_print_srat_entry(header);
>> +
>> +       /* let architecture-dependent part to do it */
>> +       acpi_numa_gicc_affinity_init(processor_affinity);
>> +
>> +       return 0;
>> +}
>> +
>>  int __init
>>  acpi_table_parse_srat(int id, acpi_madt_entry_handler handler,
>>                       unsigned int max_entries)
>> @@ -205,6 +240,8 @@ int __init acpi_numa_init(void)
>>                 acpi_table_parse_srat(ACPI_SRAT_TYPE_MEMORY_AFFINITY,
>>                                       acpi_parse_memory_affinity,
>>                                       NR_NODE_MEMBLKS);
>> +               acpi_table_parse_srat(ACPI_SRAT_TYPE_GICC_AFFINITY,
>> +                                     acpi_parse_gicc_affinity, NR_CPUS);
>>         }
>>
>>         /* SLIT: System Locality Information Table */
>> diff --git a/xen/drivers/acpi/osl.c b/xen/drivers/acpi/osl.c
>> index 7199047..7046816 100644
>> --- a/xen/drivers/acpi/osl.c
>> +++ b/xen/drivers/acpi/osl.c
>> @@ -29,6 +29,7 @@
>>  #include <xen/pfn.h>
>>  #include <xen/types.h>
>>  #include <xen/errno.h>
>> +#include <xen/mm.h>
>>  #include <xen/acpi.h>
>>  #include <xen/numa.h>
>>  #include <acpi/acmacros.h>
>> @@ -39,6 +40,7 @@
>>  #include <xen/efi.h>
>>  #include <xen/vmap.h>
>>  #include <xen/kconfig.h>
>> +#include <asm/mm.h>
>>
>>  #define _COMPONENT             ACPI_OS_SERVICES
>>  ACPI_MODULE_NAME("osl")
>> diff --git a/xen/include/acpi/actbl1.h b/xen/include/acpi/actbl1.h
>> index e199136..b84bfba 100644
>> --- a/xen/include/acpi/actbl1.h
>> +++ b/xen/include/acpi/actbl1.h
>> @@ -949,7 +949,8 @@ enum acpi_srat_type {
>>         ACPI_SRAT_TYPE_CPU_AFFINITY = 0,
>>         ACPI_SRAT_TYPE_MEMORY_AFFINITY = 1,
>>         ACPI_SRAT_TYPE_X2APIC_CPU_AFFINITY = 2,
>> -       ACPI_SRAT_TYPE_RESERVED = 3     /* 3 and greater are reserved */
>> +       ACPI_SRAT_TYPE_GICC_AFFINITY = 3,
>> +       ACPI_SRAT_TYPE_RESERVED = 4     /* 4 and greater are reserved */
>>  };
>>
>>  /*
>> @@ -1007,6 +1008,20 @@ struct acpi_srat_x2apic_cpu_affinity {
>>
>>  #define ACPI_SRAT_CPU_ENABLED       (1)        /* 00: Use affinity
>> structure */
>>
>> +/* 3: GICC Affinity (ACPI 5.1) */
>> +
>> +struct acpi_srat_gicc_affinity {
>> +       struct acpi_subtable_header header;
>> +       u32 proximity_domain;
>> +       u32 acpi_processor_uid;
>> +       u32 flags;
>> +       u32 clock_domain;
>> +};
>> +
>> +/* Flags for struct acpi_srat_gicc_affinity */
>> +
>> +#define ACPI_SRAT_GICC_ENABLED     (1)  /* 00: Use affinity structure */
>> +
>>  /* Reset to default packing */
>>
>>  #pragma pack()
>> diff --git a/xen/include/xen/acpi.h b/xen/include/xen/acpi.h
>> index 30ec0ee..67fe1bb 100644
>> --- a/xen/include/xen/acpi.h
>> +++ b/xen/include/xen/acpi.h
>> @@ -92,10 +92,49 @@ void acpi_table_print_srat_entry (struct
>> acpi_subtable_header *srat);
>>
>>  /* the following four functions are architecture-dependent */
>>  void acpi_numa_slit_init (struct acpi_table_slit *slit);
>> +#if defined(CONFIG_X86)
>>  void acpi_numa_processor_affinity_init(const struct
>> acpi_srat_cpu_affinity *);
>>  void acpi_numa_x2apic_affinity_init(const struct
>> acpi_srat_x2apic_cpu_affinity *);
>>  void acpi_numa_memory_affinity_init(const struct acpi_srat_mem_affinity
>> *);
>>  void acpi_numa_arch_fixup(void);
>> +static inline void
>> +acpi_numa_gicc_affinity_init(const struct acpi_srat_gicc_affinity *pa)
>> +{
>> +       return;
>> +}
>> +#elif defined(CONFIG_ARM)
>> +static inline void
>> +acpi_numa_processor_affinity_init(const struct acpi_srat_cpu_affinity
>> *cpu_aff)
>> +{
>> +       return;
>> +}
>> +static inline void
>> +acpi_numa_x2apic_affinity_init(const struct acpi_srat_x2apic_cpu_affinity
>> *x2apic)
>> +{
>> +       return;
>> +}
>> +#if defined(CONFIG_ACPI_NUMA)
>> +void acpi_numa_gicc_affinity_init(const struct acpi_srat_gicc_affinity
>> *pa);
>> +void acpi_numa_memory_affinity_init(const struct acpi_srat_mem_affinity
>> *);
>> +void acpi_numa_arch_fixup(void);
>> +#else
>> +static inline void
>> +acpi_numa_gicc_affinity_init(const struct acpi_srat_gicc_affinity *pa)
>> +{
>> +       return;
>> +}
>> +static inline void
>> +acpi_numa_memory_affinity_init(const struct acpi_srat_mem_affinity *ma)
>> +{
>> +       return;
>> +}
>> +static inline void
>> +acpi_numa_arch_fixup(void)
>> +{
>> +       return;
>> +}
>> +#endif /* CONFIG_ACPI_NUMA */
>
>
> This is quite disgusting. We should avoid any #ifdef CONFIG_{X86,ARM} in
> common header.
>
> Also, x2apic and gicc are respectively x86-specific and arm-specific. So I
> think we should move the parsing in a separate arch-depend function to avoid
> those ifdefery.
>
> By that I mean having a function acpi_table_arch_parse_srat that will parse
> x2apic on x86 and gicc on ARM. Jan, what do you think?

Linux also follows a similar approach:
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/include/linux/acpi.h?id=refs/tags/v4.10#n265

Regards
Vijay

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [RFC PATCH v1 18/21] ARM: NUMA: update node_distance with ACPI support
  2017-03-02 17:24   ` Julien Grall
@ 2017-03-03 12:43     ` Vijay Kilari
  2017-03-03 13:46       ` Julien Grall
  0 siblings, 1 reply; 91+ messages in thread
From: Vijay Kilari @ 2017-03-03 12:43 UTC (permalink / raw)
  To: Julien Grall
  Cc: Stefano Stabellini, Andre Przywara, Dario Faggioli,
	Vijaya Kumar K, xen-devel, nd

On Thu, Mar 2, 2017 at 10:54 PM, Julien Grall <julien.grall@arm.com> wrote:
> Hello Vijay,
>
>
> On 09/02/17 15:57, vijay.kilari@gmail.com wrote:
>>
>> From: Vijaya Kumar K <Vijaya.Kumar@cavium.com>
>>
>> Update node_distance() function to handle
>> ACPI SLIT table information.
>>
>> Signed-off-by: Vijaya Kumar <Vijaya.Kumar@cavium.com>
>> ---
>>  xen/arch/arm/numa.c | 20 +++++++++++++++++++-
>>  1 file changed, 19 insertions(+), 1 deletion(-)
>>
>> diff --git a/xen/arch/arm/numa.c b/xen/arch/arm/numa.c
>> index 5c49347..50c3dea 100644
>> --- a/xen/arch/arm/numa.c
>> +++ b/xen/arch/arm/numa.c
>> @@ -23,6 +23,7 @@
>>  #include <xen/acpi.h>
>>  #include <asm/mm.h>
>>  #include <xen/numa.h>
>> +#include <xen/srat.h>
>>  #include <asm/acpi.h>
>>  #include <xen/errno.h>
>>  #include <xen/cpumask.h>
>> @@ -35,6 +36,7 @@ extern struct node nodes[MAX_NUMNODES] __initdata;
>>  extern int num_node_memblks;
>>  extern struct node node_memblk_range[NR_NODE_MEMBLKS];
>>  extern nodeid_t memblk_nodeid[NR_NODE_MEMBLKS];
>> +extern struct acpi_table_slit *__read_mostly acpi_slit;
>>
>>  void __init numa_set_cpu_node(int cpu, unsigned long hwid)
>>  {
>> @@ -50,9 +52,24 @@ void __init numa_set_cpu_node(int cpu, unsigned long
>> hwid)
>>
>>  u8 __node_distance(nodeid_t a, nodeid_t b)
>>  {
>> -    if ( !node_distance )
>> +    unsigned index;
>> +    u8 slit_val;
>> +
>> +    if ( !node_distance && !acpi_slit )
>>          return a == b ? 10 : 20;
>>
>> +    if ( acpi_slit )
>> +    {
>> +        index = acpi_slit->locality_count * node_to_pxm(a);
>> +        slit_val = acpi_slit->entry[index + node_to_pxm(b)];
>> +
>> +        /* ACPI defines 0xff as an unreachable node and 0-9 are undefined
>> */
>> +        if ( (slit_val == 0xff) || (slit_val <= 9) )
>> +            return NUMA_NO_DISTANCE;
>> +        else
>> +            return slit_val;
>> +    }
>> +
>
>
> arm/numa.c is the generic code and should not contain any ACPI specific
> code.
Agreed.
>
> But as I said, the way to get the distance on ACPI is the same on x86 and
> ARM.
>
> So I would introduce __node_distance callback that will be setup at
> boot-time to either point to the ACPI version or DT version.

Instead of a callback, we could just call the ACPI node_distance function if
ACPI is enabled, and the DT-based one otherwise.

^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [RFC PATCH v1 19/21] ARM: NUMA: Initialize ACPI NUMA
  2017-03-02 17:25   ` Julien Grall
@ 2017-03-03 12:44     ` Vijay Kilari
  0 siblings, 0 replies; 91+ messages in thread
From: Vijay Kilari @ 2017-03-03 12:44 UTC (permalink / raw)
  To: Julien Grall
  Cc: Stefano Stabellini, Andre Przywara, Dario Faggioli,
	Vijaya Kumar K, xen-devel, nd

On Thu, Mar 2, 2017 at 10:55 PM, Julien Grall <julien.grall@arm.com> wrote:
> Hello Vijay,
>
>
> On 09/02/17 15:57, vijay.kilari@gmail.com wrote:
>>
>> From: Vijaya Kumar K <Vijaya.Kumar@cavium.com>
>>
>> Call ACPI NUMA initialization under CONFIG_ACPI_NUMA.
>>
>> Signed-off-by: Vijaya Kumar <Vijaya.Kumar@cavium.com>
>> ---
>>  xen/arch/arm/numa.c | 12 +++++++++++-
>>  xen/common/numa.c   |  6 ++++++
>>  2 files changed, 17 insertions(+), 1 deletion(-)
>>
>> diff --git a/xen/arch/arm/numa.c b/xen/arch/arm/numa.c
>> index 50c3dea..1d6e16c 100644
>> --- a/xen/arch/arm/numa.c
>> +++ b/xen/arch/arm/numa.c
>> @@ -204,7 +204,17 @@ int __init numa_init(void)
>>      for ( i = 0; i < MAX_NUMNODES * 2; i++ )
>>          _node_distance[i] = 0;
>>
>> -    ret = dt_numa_init();
>> +#ifdef CONFIG_ACPI_NUMA
>> +    if ( !acpi_disabled )
>> +    {
>> +        acpi_map_uid_to_mpidr();
>> +        ret = acpi_numa_init();
>> +        if ( ret || srat_disabled() )
>> +            goto no_numa;
>> +    }
>> +    else
>> +#endif
>
>
> We should really have only one call to ACPI in the generic code. Please move
> all of this in a function.
OK
>
>
>> +        ret = dt_numa_init();
>>
>>      if ( !ret )
>>          ret = numa_initmem_init();
>> diff --git a/xen/common/numa.c b/xen/common/numa.c
>> index 2f5266a..4c67d38 100644
>> --- a/xen/common/numa.c
>> +++ b/xen/common/numa.c
>> @@ -30,6 +30,7 @@
>>  #include <xen/sched.h>
>>  #include <xen/errno.h>
>>  #include <xen/softirq.h>
>> +#include <xen/srat.h>
>>  #include <asm/setup.h>
>>
>>  static int numa_setup(char *s);
>> @@ -282,6 +283,11 @@ static __init int numa_setup(char *opt)
>>          numa_off = 1;
>>      if ( !strncmp(opt,"on",2) )
>>          numa_off = 0;
>> +    if ( !strncmp(opt,"noacpi",6) )
>> +    {
>> +        numa_off = 0;
>> +        acpi_numa = -1;
>> +    }
>>
>>      return arch_numa_setup(opt);
>>  }
>>
>
> --
> Julien Grall

^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [RFC PATCH v1 16/21] ARM: NUMA: Extract proximity from SRAT table
  2017-03-03 12:39     ` Vijay Kilari
@ 2017-03-03 13:44       ` Julien Grall
  2017-03-03 13:50         ` Vijay Kilari
  0 siblings, 1 reply; 91+ messages in thread
From: Julien Grall @ 2017-03-03 13:44 UTC (permalink / raw)
  To: Vijay Kilari
  Cc: Stefano Stabellini, Andre Przywara, Dario Faggioli,
	Vijaya Kumar K, Jan Beulich, xen-devel, nd

Hello Vijay,

On 03/03/17 12:39, Vijay Kilari wrote:
> On Thu, Mar 2, 2017 at 10:51 PM, Julien Grall <julien.grall@arm.com> wrote:
>>> diff --git a/xen/drivers/acpi/numa.c b/xen/drivers/acpi/numa.c
>>> index 50bf9f8..ce22e88 100644
>>> --- a/xen/drivers/acpi/numa.c
>>> +++ b/xen/drivers/acpi/numa.c
>>> @@ -25,9 +25,11 @@
>>>  #include <xen/init.h>
>>>  #include <xen/types.h>
>>>  #include <xen/errno.h>
>>> +#include <xen/mm.h>
>>
>>
>> Why do you need to include xen/mm.h and ...
>>
>>>  #include <xen/acpi.h>
>>>  #include <xen/numa.h>
>>>  #include <acpi/acmacros.h>
>>> +#include <asm/mm.h>
>>
>>
>> asm/mm.h?
>
> I remember when CONFIG_ACPI +CONFIG_NUMA is enabled
> there is compilation error.

Regarding asm/mm.h, it is already included by xen/mm.h, so there is no point
in adding it.

Now regarding xen/mm.h, just saying there is a compilation error is not
helpful. Can you explain why it is needed?

[...]

>> This is quite disgusting. We should avoid any #ifdef CONFIG_{X86,ARM} in
>> common header.
>>
>> Also, x2apic and gicc are respectively x86-specific and arm-specific. So I
>> think we should move the parsing in a separate arch-depend function to avoid
>> those ifdefery.
>>
>> By that I mean having a function acpi_table_arch_parse_srat that will parse
>> x2apic on x86 and gicc on ARM. Jan, what do you think?
>
> Linux also follows similar approach
> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/include/linux/acpi.h?id=refs/tags/v4.10#n265

So? What are you trying to prove?

The Linux version is much more readable than yours. Anyway, we should limit
CONFIG_{X86,ARM} ifdefery in common code.

Currently, I see no point in having that ifdefery when it is possible to
add an arch-dependent function.

Regards,

-- 
Julien Grall

^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [RFC PATCH v1 18/21] ARM: NUMA: update node_distance with ACPI support
  2017-03-03 12:43     ` Vijay Kilari
@ 2017-03-03 13:46       ` Julien Grall
  0 siblings, 0 replies; 91+ messages in thread
From: Julien Grall @ 2017-03-03 13:46 UTC (permalink / raw)
  To: Vijay Kilari
  Cc: Stefano Stabellini, Andre Przywara, Dario Faggioli,
	Vijaya Kumar K, xen-devel, nd

Hello Vijay,

On 03/03/17 12:43, Vijay Kilari wrote:
> On Thu, Mar 2, 2017 at 10:54 PM, Julien Grall <julien.grall@arm.com> wrote:
>> Hello Vijay,
>>
>>
>> On 09/02/17 15:57, vijay.kilari@gmail.com wrote:
>>>
>>> From: Vijaya Kumar K <Vijaya.Kumar@cavium.com>
>>>
>>> Update node_distance() function to handle
>>> ACPI SLIT table information.
>>>
>>> Signed-off-by: Vijaya Kumar <Vijaya.Kumar@cavium.com>
>>> ---
>>>  xen/arch/arm/numa.c | 20 +++++++++++++++++++-
>>>  1 file changed, 19 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/xen/arch/arm/numa.c b/xen/arch/arm/numa.c
>>> index 5c49347..50c3dea 100644
>>> --- a/xen/arch/arm/numa.c
>>> +++ b/xen/arch/arm/numa.c
>>> @@ -23,6 +23,7 @@
>>>  #include <xen/acpi.h>
>>>  #include <asm/mm.h>
>>>  #include <xen/numa.h>
>>> +#include <xen/srat.h>
>>>  #include <asm/acpi.h>
>>>  #include <xen/errno.h>
>>>  #include <xen/cpumask.h>
>>> @@ -35,6 +36,7 @@ extern struct node nodes[MAX_NUMNODES] __initdata;
>>>  extern int num_node_memblks;
>>>  extern struct node node_memblk_range[NR_NODE_MEMBLKS];
>>>  extern nodeid_t memblk_nodeid[NR_NODE_MEMBLKS];
>>> +extern struct acpi_table_slit *__read_mostly acpi_slit;
>>>
>>>  void __init numa_set_cpu_node(int cpu, unsigned long hwid)
>>>  {
>>> @@ -50,9 +52,24 @@ void __init numa_set_cpu_node(int cpu, unsigned long
>>> hwid)
>>>
>>>  u8 __node_distance(nodeid_t a, nodeid_t b)
>>>  {
>>> -    if ( !node_distance )
>>> +    unsigned index;
>>> +    u8 slit_val;
>>> +
>>> +    if ( !node_distance && !acpi_slit )
>>>          return a == b ? 10 : 20;
>>>
>>> +    if ( acpi_slit )
>>> +    {
>>> +        index = acpi_slit->locality_count * node_to_pxm(a);
>>> +        slit_val = acpi_slit->entry[index + node_to_pxm(b)];
>>> +
>>> +        /* ACPI defines 0xff as an unreachable node and 0-9 are undefined
>>> */
>>> +        if ( (slit_val == 0xff) || (slit_val <= 9) )
>>> +            return NUMA_NO_DISTANCE;
>>> +        else
>>> +            return slit_val;
>>> +    }
>>> +
>>
>>
>> arm/numa.c is the generic code and should not contain any ACPI specific
>> code.
> Agreed.
>>
>> But as I said, the way to get the distance on ACPI is the same on x86 and
>> ARM.
>>
>> So I would introduce __node_distance callback that will be setup at
>> boot-time to either point to the ACPI version or DT version.
>
> Instead of callback, Just call acpi's node_distance function if acpi is enabled
> or else dt based.

Why? I don't see any reason to check whether ACPI is enabled every time
__node_distance is called. The value will never change at runtime.
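
A minimal sketch of what picking the distance function once at boot could look
like. Every name below (slit_sketch, select_node_distance, node_distance_sketch,
NO_DISTANCE_SKETCH) is hypothetical and not taken from the Xen tree; node IDs
are assumed to equal proximity domains, so node_to_pxm() is left out, and the
fallback function merely stands in for the DT-based version.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for NUMA_NO_DISTANCE; the value is an assumption of this sketch. */
#define NO_DISTANCE_SKETCH 0xFF

/* Simplified stand-in for the SLIT: a square, row-major distance matrix. */
struct slit_sketch {
    unsigned int locality_count;
    const uint8_t *entry;
};

typedef uint8_t (*node_distance_fn)(unsigned int a, unsigned int b);

static const struct slit_sketch *slit_sketch_table;

/* Fallback: local distance 10, remote distance 20. */
static uint8_t default_node_distance(unsigned int a, unsigned int b)
{
    return a == b ? 10 : 20;
}

/* SLIT lookup: 0xff means unreachable, 0-9 are undefined. */
static uint8_t acpi_node_distance_sketch(unsigned int a, unsigned int b)
{
    const struct slit_sketch *s = slit_sketch_table;
    uint8_t val;

    assert(s != NULL && a < s->locality_count && b < s->locality_count);
    val = s->entry[s->locality_count * a + b];

    return (val == 0xff || val <= 9) ? NO_DISTANCE_SKETCH : val;
}

/* Chosen once during boot; never re-evaluated on the lookup path. */
static node_distance_fn node_distance_cb = default_node_distance;

void select_node_distance(const struct slit_sketch *s)
{
    slit_sketch_table = s;
    node_distance_cb = s ? acpi_node_distance_sketch : default_node_distance;
}

uint8_t node_distance_sketch(unsigned int a, unsigned int b)
{
    return node_distance_cb(a, b);
}
```

With this shape, the check for a usable SLIT happens only in
select_node_distance(); each later lookup is a single indirect call.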

Regards,

-- 
Julien Grall

^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [RFC PATCH v1 16/21] ARM: NUMA: Extract proximity from SRAT table
  2017-03-03 13:44       ` Julien Grall
@ 2017-03-03 13:50         ` Vijay Kilari
  2017-03-03 13:52           ` Julien Grall
  0 siblings, 1 reply; 91+ messages in thread
From: Vijay Kilari @ 2017-03-03 13:50 UTC (permalink / raw)
  To: Julien Grall
  Cc: Stefano Stabellini, Andre Przywara, Dario Faggioli,
	Vijaya Kumar K, Jan Beulich, xen-devel, nd

On Fri, Mar 3, 2017 at 7:14 PM, Julien Grall <julien.grall@arm.com> wrote:
> Hello Vijay,
>
> On 03/03/17 12:39, Vijay Kilari wrote:
>>
>> On Thu, Mar 2, 2017 at 10:51 PM, Julien Grall <julien.grall@arm.com>
>> wrote:
>>>>
>>>> diff --git a/xen/drivers/acpi/numa.c b/xen/drivers/acpi/numa.c
>>>> index 50bf9f8..ce22e88 100644
>>>> --- a/xen/drivers/acpi/numa.c
>>>> +++ b/xen/drivers/acpi/numa.c
>>>> @@ -25,9 +25,11 @@
>>>>  #include <xen/init.h>
>>>>  #include <xen/types.h>
>>>>  #include <xen/errno.h>
>>>> +#include <xen/mm.h>
>>>
>>>
>>>
>>> Why do you need to include xen/mm.h and ...
>>>
>>>>  #include <xen/acpi.h>
>>>>  #include <xen/numa.h>
>>>>  #include <acpi/acmacros.h>
>>>> +#include <asm/mm.h>
>>>
>>>
>>>
>>> asm/mm.h?
>>
>>
>> I remember when CONFIG_ACPI +CONFIG_NUMA is enabled
>> there is compilation error.
>
>
> Regarding asm/mm.h, it is already included by xen/mm.h. So no point to add
> it.
>
> Now regarding xen/mm.h, just saying there is a compilation error is not
> helpful. Can you expand why it is needed?

As I remember, just adding xen/mm.h did not solve the problem. Anyway, I
will check this when I work on the next revision.

>
> [...]
>
>>> This is quite disgusting. We should avoid any #ifdef CONFIG_{X86,ARM} in
>>> common header.
>>>
>>> Also, x2apic and gicc are respectively x86-specific and arm-specific. So I
>>> think we should move the parsing in a separate arch-depend function to avoid
>>> those ifdefery.
>>>
>>> By that I mean having a function acpi_table_arch_parse_srat that will parse
>>> x2apic on x86 and gicc on ARM. Jan, what do you think?
>>
>>
>> Linux also follows similar approach
>>
>> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/include/linux/acpi.h?id=refs/tags/v4.10#n265
>
>
> So? What are you trying to prove?
>
> The linux version is much readable than yours. Anyway, we should limit
> CONFIG_{X86,ARM} ifdefery in common code.
>
> Currently, I see no point to have those ifdefery when it is possible to add
> an arch-depend function.

This is because in Xen we have another level of config, CONFIG_ACPI_NUMA.
I plan to reuse the CPU and memory parts in the next revision.

Regards
Vijay

^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [RFC PATCH v1 16/21] ARM: NUMA: Extract proximity from SRAT table
  2017-03-03 13:50         ` Vijay Kilari
@ 2017-03-03 13:52           ` Julien Grall
  2017-03-03 14:45             ` Vijay Kilari
  0 siblings, 1 reply; 91+ messages in thread
From: Julien Grall @ 2017-03-03 13:52 UTC (permalink / raw)
  To: Vijay Kilari
  Cc: Stefano Stabellini, Andre Przywara, Dario Faggioli,
	Vijaya Kumar K, Jan Beulich, xen-devel, nd



On 03/03/17 13:50, Vijay Kilari wrote:
> On Fri, Mar 3, 2017 at 7:14 PM, Julien Grall <julien.grall@arm.com> wrote:
>>>> This is quite disgusting. We should avoid any #ifdef CONFIG_{X86,ARM} in
>>>> common header.
>>>>
>>>> Also, x2apic and gicc are respectively x86-specific and arm-specific. So I
>>>> think we should move the parsing in a separate arch-depend function to avoid
>>>> those ifdefery.
>>>>
>>>> By that I mean having a function acpi_table_arch_parse_srat that will parse
>>>> x2apic on x86 and gicc on ARM. Jan, what do you think?
>>>
>>>
>>> Linux also follows similar approach
>>>
>>> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/include/linux/acpi.h?id=refs/tags/v4.10#n265
>>
>>
>> So? What are you trying to prove?
>>
>> The linux version is much readable than yours. Anyway, we should limit
>> CONFIG_{X86,ARM} ifdefery in common code.
>>
>> Currently, I see no point to have those ifdefery when it is possible to add
>> an arch-depend function.
>
> This is because in xen we have another level of config CONFIG_ACPI_NUMA.
> I have plans to reuse cpu and memory part next revision.

This does not explain why you want to do it like Linux.

-- 
Julien Grall

^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [RFC PATCH v1 16/21] ARM: NUMA: Extract proximity from SRAT table
  2017-03-03 13:52           ` Julien Grall
@ 2017-03-03 14:45             ` Vijay Kilari
  2017-03-03 14:52               ` Julien Grall
  0 siblings, 1 reply; 91+ messages in thread
From: Vijay Kilari @ 2017-03-03 14:45 UTC (permalink / raw)
  To: Julien Grall
  Cc: Stefano Stabellini, Andre Przywara, Dario Faggioli,
	Vijaya Kumar K, Jan Beulich, xen-devel, nd

On Fri, Mar 3, 2017 at 7:22 PM, Julien Grall <julien.grall@arm.com> wrote:
>
>
> On 03/03/17 13:50, Vijay Kilari wrote:
>>
>> On Fri, Mar 3, 2017 at 7:14 PM, Julien Grall <julien.grall@arm.com> wrote:
>>>>>
>>>>> This is quite disgusting. We should avoid any #ifdef CONFIG_{X86,ARM} in
>>>>> common header.
>>>>>
>>>>> Also, x2apic and gicc are respectively x86-specific and arm-specific. So I
>>>>> think we should move the parsing in a separate arch-depend function to avoid
>>>>> those ifdefery.
>>>>>
>>>>> By that I mean having a function acpi_table_arch_parse_srat that will parse
>>>>> x2apic on x86 and gicc on ARM. Jan, what do you think?
>>>>
>>>>
>>>>
>>>> Linux also follows similar approach
>>>>
>>>>
>>>> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/include/linux/acpi.h?id=refs/tags/v4.10#n265
>>>
>>>
>>>
>>> So? What are you trying to prove?
>>>
>>> The linux version is much readable than yours. Anyway, we should limit
>>> CONFIG_{X86,ARM} ifdefery in common code.
>>>
>>> Currently, I see no point to have those ifdefery when it is possible to
>>> add
>>> an arch-depend function.
>>
>>
>> This is because in xen we have another level of config CONFIG_ACPI_NUMA.
>> I have plans to reuse cpu and memory part next revision.
>
>
> This does not explain why you want to do like Linux.

Basically, I want to reuse xen/drivers/acpi/numa.c, which is common to
both x86 and ARM.
If not, then we have to move some arch-specific code, as you mentioned.

I have another idea: if all the NUMA ACPI code is placed under
CONFIG_NUMA and only the initialization is kept under CONFIG_ACPI_NUMA,
similar to x86,
then we don't need to pollute this header much, and the changes stay limited.

I will try to implement this, see how simple it can be, and let you know. OK?

>
> --
> Julien Grall

^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [RFC PATCH v1 16/21] ARM: NUMA: Extract proximity from SRAT table
  2017-03-03 14:45             ` Vijay Kilari
@ 2017-03-03 14:52               ` Julien Grall
  2017-03-03 15:16                 ` Vijay Kilari
  0 siblings, 1 reply; 91+ messages in thread
From: Julien Grall @ 2017-03-03 14:52 UTC (permalink / raw)
  To: Vijay Kilari
  Cc: Stefano Stabellini, Andre Przywara, Dario Faggioli,
	Vijaya Kumar K, Jan Beulich, xen-devel, nd



On 03/03/17 14:45, Vijay Kilari wrote:
> On Fri, Mar 3, 2017 at 7:22 PM, Julien Grall <julien.grall@arm.com> wrote:
>>
>>
>> On 03/03/17 13:50, Vijay Kilari wrote:
>>>
>>> On Fri, Mar 3, 2017 at 7:14 PM, Julien Grall <julien.grall@arm.com> wrote:
>>>>>>
>>>>>> This is quite disgusting. We should avoid any #ifdef CONFIG_{X86,ARM} in
>>>>>> common header.
>>>>>>
>>>>>> Also, x2apic and gicc are respectively x86-specific and arm-specific. So I
>>>>>> think we should move the parsing in a separate arch-depend function to avoid
>>>>>> those ifdefery.
>>>>>>
>>>>>> By that I mean having a function acpi_table_arch_parse_srat that will parse
>>>>>> x2apic on x86 and gicc on ARM. Jan, what do you think?
>>>>>
>>>>>
>>>>>
>>>>> Linux also follows similar approach
>>>>>
>>>>>
>>>>> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/include/linux/acpi.h?id=refs/tags/v4.10#n265
>>>>
>>>>
>>>>
>>>> So? What are you trying to prove?
>>>>
>>>> The linux version is much readable than yours. Anyway, we should limit
>>>> CONFIG_{X86,ARM} ifdefery in common code.
>>>>
>>>> Currently, I see no point to have those ifdefery when it is possible to
>>>> add
>>>> an arch-depend function.
>>>
>>>
>>> This is because in xen we have another level of config CONFIG_ACPI_NUMA.
>>> I have plans to reuse cpu and memory part next revision.
>>
>>
>> This does not explain why you want to do like Linux.
>
> Basically want to reuse xen/drivers/acpi/numa.c which is common for
> both x86 and ARM.
> If not, then we have move some arch specific as you mentioned.

I think you misunderstood what I suggested. I only asked to do something 
like:

int __init acpi_numa_init(void)
{
	if (!acpi_parse_table(....)) {
		acpi_table_parse_srat(TYPE_CPU_AFFINITY);
		acpi_table_parse_srat(TYPE_MEMORY_AFFINITY);
		acpi_table_arch_parse_srat();
	}
}

And then for x86

void acpi_table_arch_parse_srat(void)
{
	acpi_table_parse_srat(TYPE_X2APIC_CPU_AFFINITY);
}

And for ARM

void acpi_table_arch_parse_srat(void)
{
	acpi_table_parse_srat(TYPE_GICC_AFFINITY);
}


The code stays just as common, but a function is called for the arch-specific
setup. This does not require any ifdefery.

>
> I have another idea where in, if all the NUMA ACPI code is programmed under
> CONFIG_NUMA and only initialization is kept under CONFIG_ACPI_NUMA
> similar to x86
> then we don't need to pollute this header much and limit the changes.
>
> I will try to implement this and see how simple it can go and let you know. OK?

I don't want to see the common header polluted with #ifdef CONFIG_X86 
and #ifdef CONFIG_ARM. This is just not right.

Cheers,

-- 
Julien Grall

^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [RFC PATCH v1 16/21] ARM: NUMA: Extract proximity from SRAT table
  2017-03-03 14:52               ` Julien Grall
@ 2017-03-03 15:16                 ` Vijay Kilari
  2017-03-03 15:22                   ` Jan Beulich
  0 siblings, 1 reply; 91+ messages in thread
From: Vijay Kilari @ 2017-03-03 15:16 UTC (permalink / raw)
  To: Julien Grall
  Cc: Stefano Stabellini, Andre Przywara, Dario Faggioli,
	Vijaya Kumar K, Jan Beulich, xen-devel, nd

On Fri, Mar 3, 2017 at 8:22 PM, Julien Grall <julien.grall@arm.com> wrote:
>
>
> On 03/03/17 14:45, Vijay Kilari wrote:
>>
>> On Fri, Mar 3, 2017 at 7:22 PM, Julien Grall <julien.grall@arm.com> wrote:
>>>
>>>
>>>
>>> On 03/03/17 13:50, Vijay Kilari wrote:
>>>>
>>>>
>>>> On Fri, Mar 3, 2017 at 7:14 PM, Julien Grall <julien.grall@arm.com>
>>>> wrote:
>>>>>>>
>>>>>>>
>>>>>>> This is quite disgusting. We should avoid any #ifdef CONFIG_{X86,ARM} in
>>>>>>> common header.
>>>>>>>
>>>>>>> Also, x2apic and gicc are respectively x86-specific and arm-specific. So I
>>>>>>> think we should move the parsing in a separate arch-depend function to avoid
>>>>>>> those ifdefery.
>>>>>>>
>>>>>>> By that I mean having a function acpi_table_arch_parse_srat that will parse
>>>>>>> x2apic on x86 and gicc on ARM. Jan, what do you think?
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> Linux also follows similar approach
>>>>>>
>>>>>>
>>>>>>
>>>>>> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/include/linux/acpi.h?id=refs/tags/v4.10#n265
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> So? What are you trying to prove?
>>>>>
>>>>> The linux version is much readable than yours. Anyway, we should limit
>>>>> CONFIG_{X86,ARM} ifdefery in common code.
>>>>>
>>>>> Currently, I see no point to have those ifdefery when it is possible to
>>>>> add
>>>>> an arch-depend function.
>>>>
>>>>
>>>>
>>>> This is because in xen we have another level of config CONFIG_ACPI_NUMA.
>>>> I have plans to reuse cpu and memory part next revision.
>>>
>>>
>>>
>>> This does not explain why you want to do like Linux.
>>
>>
>> Basically want to reuse xen/drivers/acpi/numa.c which is common for
>> both x86 and ARM.
>> If not, then we have move some arch specific as you mentioned.
>
>
> I think you misunderstood what I suggested. I only asked to do something
> like:

Got it.
>
> int __init acpi_numa_init(void)
> {
>         if (!acpi_parse_table(....)) {
>                 acpi_table_parse_srat(TYPE_CPU_AFFINITY);

This is not defined for ARM, so we have to make this arch-specific as well.
All arch-specific code in xen/drivers/acpi/numa.c should then be moved
to the arch-specific xen/arch/x86/srat.c.

>                 acpi_table_parse_srat(TYPE_MEMORY_AFFINITY);
>                 acpi_table_arch_parse_srat();
>         }
> }
>
> And then for x86
>
> void acpi_table_arch_parse_srat(void)
> {
>         acpi_table_parse_srat(TYPE_X2APIC_CPU_AFFINITY);
> }
>
> And for ARM
>
> void acpi_table_arch_parse_srat(void)
> {
>         acpi_table_parse_srat(TYPE_GICC_AFFINITY);
> }
>
>
> The code is still as common but a function is called for arch specific
> setup. This does not require any ifdefery.
>
>>
>> I have another idea where in, if all the NUMA ACPI code is programmed
>> under
>> CONFIG_NUMA and only initialization is kept under CONFIG_ACPI_NUMA
>> similar to x86
>> then we don't need to pollute this header much and limit the changes.
>>
>> I will try to implement this and see how simple it can go and let you
>> know. OK?
>
>
> I don't want to see the common header polluted with #ifdef CONFIG_X86 and
> #ifdef CONFIG_ARM. This is just not right.
>
> Cheers,
>
> --
> Julien Grall

^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [RFC PATCH v1 16/21] ARM: NUMA: Extract proximity from SRAT table
  2017-03-03 15:16                 ` Vijay Kilari
@ 2017-03-03 15:22                   ` Jan Beulich
  2017-03-10 10:53                     ` Vijay Kilari
  0 siblings, 1 reply; 91+ messages in thread
From: Jan Beulich @ 2017-03-03 15:22 UTC (permalink / raw)
  To: Julien Grall, Vijay Kilari
  Cc: Stefano Stabellini, Andre Przywara, Dario Faggioli,
	Vijaya Kumar K, xen-devel, nd

>>> On 03.03.17 at 16:16, <vijay.kilari@gmail.com> wrote:
> On Fri, Mar 3, 2017 at 8:22 PM, Julien Grall <julien.grall@arm.com> wrote:
>> int __init acpi_numa_init(void)
>> {
>>         if (!acpi_parse_table(....)) {
>>                 acpi_table_parse_srat(TYPE_CPU_AFFINITY);
> 
> This is not defined for ARM. We have to make this also arch specific.
> So all arch specific code from xen/drivers/acpi/numa.c should be moved
> to arch specific to xen/arch/x86/srat.c

There surely is a way to specify processor affinity on ARM?

Jan


^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [RFC PATCH v1 16/21] ARM: NUMA: Extract proximity from SRAT table
  2017-03-03 15:22                   ` Jan Beulich
@ 2017-03-10 10:53                     ` Vijay Kilari
  0 siblings, 0 replies; 91+ messages in thread
From: Vijay Kilari @ 2017-03-10 10:53 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Stefano Stabellini, Andre Przywara, Dario Faggioli,
	Vijaya Kumar K, Julien Grall, xen-devel, nd

On Fri, Mar 3, 2017 at 8:52 PM, Jan Beulich <JBeulich@suse.com> wrote:
>>>> On 03.03.17 at 16:16, <vijay.kilari@gmail.com> wrote:
>> On Fri, Mar 3, 2017 at 8:22 PM, Julien Grall <julien.grall@arm.com> wrote:
>>> int __init acpi_numa_init(void)
>>> {
>>>         if (!acpi_parse_table(....)) {
>>>                 acpi_table_parse_srat(TYPE_CPU_AFFINITY);
>>
>> This is not defined for ARM. We have to make this also arch specific.
>> So all arch specific code from xen/drivers/acpi/numa.c should be moved
>> to arch specific to xen/arch/x86/srat.c
>
> There surely is a way to specify processor affinity on ARM?

On ARM, we use the ACPI_SRAT_TYPE_GICC_AFFINITY entry type in the SRAT
to extract the CPU-to-proximity mapping.
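
As a rough illustration of that mapping, the sketch below records the proximity
domain of each enabled GICC affinity entry. The names gicc_affinity_sketch,
gicc_affinity_record, cpu_pxm_lookup and the fixed-size table are hypothetical;
the struct only mirrors the fields of struct acpi_srat_gicc_affinity without
the subtable header. It also assumes the ACPI processor UID can index the table
directly, whereas the series goes through the UID to MPIDR mapping built from
the MADT.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define GICC_ENABLED_SKETCH 1u          /* mirrors ACPI_SRAT_GICC_ENABLED */
#define MAX_CPUS_SKETCH     4           /* arbitrary size for this sketch */
#define PXM_UNSET           UINT32_MAX  /* marks a CPU with no SRAT entry */

/* Payload fields of a GICC affinity entry (header omitted). */
struct gicc_affinity_sketch {
    uint32_t proximity_domain;
    uint32_t acpi_processor_uid;
    uint32_t flags;
    uint32_t clock_domain;
};

static uint32_t cpu_to_pxm[MAX_CPUS_SKETCH];

/* Forget all recorded mappings. */
void cpu_pxm_reset(void)
{
    for (size_t i = 0; i < MAX_CPUS_SKETCH; i++)
        cpu_to_pxm[i] = PXM_UNSET;
}

/* Returns 1 if the entry was recorded, 0 if it was skipped. */
int gicc_affinity_record(const struct gicc_affinity_sketch *e)
{
    assert(e != NULL);

    /* Entries without the enabled flag must be ignored. */
    if (!(e->flags & GICC_ENABLED_SKETCH))
        return 0;

    /* Assumption: the UID is used directly as the table index. */
    if (e->acpi_processor_uid >= MAX_CPUS_SKETCH)
        return 0;

    cpu_to_pxm[e->acpi_processor_uid] = e->proximity_domain;
    return 1;
}

/* Proximity domain recorded for a UID, or PXM_UNSET if none. */
uint32_t cpu_pxm_lookup(uint32_t uid)
{
    return uid < MAX_CPUS_SKETCH ? cpu_to_pxm[uid] : PXM_UNSET;
}
```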

^ permalink raw reply	[flat|nested] 91+ messages in thread

end of thread, other threads:[~2017-03-10 10:53 UTC | newest]

Thread overview: 91+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-02-09 15:56 [RFC PATCH v1 00/21] ARM: Add Xen NUMA support vijay.kilari
2017-02-09 15:56 ` [RFC PATCH v1 01/21] ARM: NUMA: Add existing ARM numa code under CONFIG_NUMA vijay.kilari
2017-02-20 11:39   ` Julien Grall
2017-02-22  9:18     ` Vijay Kilari
2017-02-22 10:49       ` Julien Grall
2017-02-09 15:56 ` [RFC PATCH v1 02/21] x86: NUMA: Refactor NUMA code vijay.kilari
2017-02-09 16:11   ` Jan Beulich
2017-02-20 11:41     ` Julien Grall
2017-02-27 11:43     ` Vijay Kilari
2017-02-27 14:58       ` Jan Beulich
2017-02-20 12:37   ` Julien Grall
2017-02-22 10:04     ` Vijay Kilari
2017-02-22 10:55       ` Julien Grall
2017-02-09 15:56 ` [RFC PATCH v1 03/21] NUMA: Move arch specific NUMA code as common vijay.kilari
2017-02-09 16:15   ` Jan Beulich
2017-02-20 12:47   ` Julien Grall
2017-02-22 10:08     ` Vijay Kilari
2017-02-22 11:07       ` Julien Grall
2017-02-09 15:56 ` [RFC PATCH v1 04/21] NUMA: Refactor generic and arch specific code of numa_setup vijay.kilari
2017-02-20 13:39   ` Julien Grall
2017-02-22 10:27     ` Vijay Kilari
2017-02-22 11:09       ` Julien Grall
2017-02-09 15:56 ` [RFC PATCH v1 05/21] ARM: efi: Do not delete memory node from fdt vijay.kilari
2017-02-20 13:42   ` Julien Grall
2017-02-09 15:56 ` [RFC PATCH v1 06/21] ARM: NUMA: Parse CPU NUMA information vijay.kilari
2017-02-20 17:32   ` Julien Grall
2017-02-22 10:46     ` Vijay Kilari
2017-02-22 11:10       ` Julien Grall
2017-02-20 17:36   ` Julien Grall
2017-02-09 15:56 ` [RFC PATCH v1 07/21] ARM: NUMA: Parse memory " vijay.kilari
2017-02-20 18:05   ` Julien Grall
2017-03-02 12:25     ` Vijay Kilari
2017-03-02 14:48       ` Julien Grall
2017-03-02 15:08         ` Vijay Kilari
2017-03-02 15:19           ` Julien Grall
2017-02-09 15:57 ` [RFC PATCH v1 08/21] ARM: NUMA: Parse NUMA distance information vijay.kilari
2017-02-20 18:28   ` Julien Grall
2017-02-22 11:38     ` Vijay Kilari
2017-02-22 11:44       ` Julien Grall
2017-03-02 12:10         ` Vijay Kilari
2017-03-02 12:17           ` Julien Grall
2017-02-09 15:57 ` [RFC PATCH v1 09/21] ARM: NUMA: Add CPU NUMA support vijay.kilari
2017-02-20 18:32   ` Julien Grall
2017-02-09 15:57 ` [RFC PATCH v1 10/21] ARM: NUMA: Add memory " vijay.kilari
2017-03-02 16:05   ` Julien Grall
2017-03-02 16:23     ` Vijay Kilari
2017-02-09 15:57 ` [RFC PATCH v1 11/21] ARM: NUMA: Add fallback on NUMA failure vijay.kilari
2017-03-02 16:09   ` Julien Grall
2017-03-02 16:25     ` Vijay Kilari
2017-02-09 15:57 ` [RFC PATCH v1 12/21] ARM: NUMA: Do not expose numa info to DOM0 vijay.kilari
2017-02-20 18:36   ` Julien Grall
2017-03-02 12:30     ` Vijay Kilari
2017-02-09 15:57 ` [RFC PATCH v1 13/21] ACPI: Refactor acpi SRAT and SLIT table handling code vijay.kilari
2017-03-02 15:30   ` Julien Grall
2017-03-02 16:31     ` Vijay Kilari
2017-03-02 16:32       ` Julien Grall
2017-02-09 15:57 ` [RFC PATCH v1 14/21] ACPI: Move srat_disabled to common code vijay.kilari
2017-02-09 15:57 ` [RFC PATCH v1 15/21] ARM: NUMA: Extract MPIDR from MADT table vijay.kilari
2017-03-02 16:28   ` Julien Grall
2017-03-02 16:41     ` Vijay Kilari
2017-03-02 16:49       ` Julien Grall
2017-02-09 15:57 ` [RFC PATCH v1 16/21] ARM: NUMA: Extract proximity from SRAT table vijay.kilari
2017-03-02 17:21   ` Julien Grall
2017-03-03 12:39     ` Vijay Kilari
2017-03-03 13:44       ` Julien Grall
2017-03-03 13:50         ` Vijay Kilari
2017-03-03 13:52           ` Julien Grall
2017-03-03 14:45             ` Vijay Kilari
2017-03-03 14:52               ` Julien Grall
2017-03-03 15:16                 ` Vijay Kilari
2017-03-03 15:22                   ` Jan Beulich
2017-03-10 10:53                     ` Vijay Kilari
2017-02-09 15:57 ` [RFC PATCH v1 17/21] ARM: NUMA: Extract memory proximity from SRAT table vijay.kilari
2017-02-10 17:33   ` Konrad Rzeszutek Wilk
2017-02-10 17:35     ` Konrad Rzeszutek Wilk
2017-03-02 14:41       ` Vijay Kilari
2017-02-09 15:57 ` [RFC PATCH v1 18/21] ARM: NUMA: update node_distance with ACPI support vijay.kilari
2017-03-02 17:24   ` Julien Grall
2017-03-03 12:43     ` Vijay Kilari
2017-03-03 13:46       ` Julien Grall
2017-02-09 15:57 ` [RFC PATCH v1 19/21] ARM: NUMA: Initialize ACPI NUMA vijay.kilari
2017-03-02 17:25   ` Julien Grall
2017-03-03 12:44     ` Vijay Kilari
2017-02-09 15:57 ` [RFC PATCH v1 20/21] ARM: NUMA: Enable CONFIG_NUMA config vijay.kilari
2017-03-02 17:27   ` Julien Grall
2017-02-09 15:57 ` [RFC PATCH v1 21/21] ARM: NUMA: Enable CONFIG_ACPI_NUMA config vijay.kilari
2017-03-02 17:31   ` Julien Grall
2017-02-09 16:31 ` [RFC PATCH v1 00/21] ARM: Add Xen NUMA support Julien Grall
2017-02-09 16:59   ` Vijay Kilari
2017-02-10 17:30 ` Konrad Rzeszutek Wilk
2017-03-02 14:49   ` Vijay Kilari
